TOP ACTIVATIONS
MAX = 0.191

Finn . F inn asked , " Do you
spots ! F inn gasped and got to his
it . F inn said to his friend ,
promise ." F inn felt cozy and safe .
had to wait until d inn ert ime .
line ! F inn shouted , " Dad ,
Emma could hear a wh inn ying sound . She stepped
in !" F inn re eled in the line
The horse wh inn ied and said , "
and enjoy it until d inn ert ime .
shore . F inn was so happy he gave
's help . <|endoftext|> F inn sat under the apple tree
one ." F inn was excited . He put
Suddenly , Brown ie wh inn ied .
ran out and started sp inn in ', round and round
asked . F inn pointed to the insect and
pond . F inn said , " Dad ,
smiled and the horse wh inn ied . Todd was so
shark ." F inn and Carl sw am as
screaming ?" F inn said sadly , " I

MIN ACTIVATIONS
MIN = -0.232

be cow boys and cow girls . They put on hats
forgot the lion and his mercy . He always tried to
play cow boys and cow girls , and pretend to ride
your fault and ask for mercy . I will spare you
my cow boys and cow girls . I love you ."
Tom cried and begged for mercy . They wished they had
very scared and begged for mercy . He said , '
at each other . They nod . S ara
They smile and nod . They take turns to
and Ben say . They nod and smile . They learn
sorry for fighting . They nod their heads .
They are kind . They nod and smile .
polite . They smile and nod . " Sure
Tom and Mia nod and say yes . They
S ara and Ben nod . They have some scratches
The cater pillar seemed to nod . He was j olly
says . They nod . They smile . They
S ara and Ben nod and smile . They help
S ara and Ben nod . They say , "
other a wink and a nod . Bob and

INTERVAL 0.086 - 0.191
CONTAINS 1.333%

It is mine . I found it first ." They both
a little pig named Pink y . She lived in a
whenever he looked at the picture , he felt so much
set and asked , " What is that ?" Lily replied
careless and bumped into the shelf , causing it to break

INTERVAL -0.020 - 0.086
CONTAINS 72.039%

says . " It is pretty !" Anna reaches
said , " Okay , Anna , that sounds fun .
He was playing in the garden when he saw something funny
She hugged her mom my , who was waiting for her
. Jack was only three years old and had been

INTERVAL -0.126 - -0.020
CONTAINS 26.175%

shield that he would use to protect himself from the wind
and scared . They knew they did something bad . They
couldn 't believe it - the island was even more beautiful
just the right amount of dessert . The end . <|endoftext|>
. As she watched , something strange happened .

INTERVAL -0.232 - -0.126
CONTAINS 0.452%

ie and Fl utter became friends and flew together in the
to have victory . I will provide everything you need to
plan worked perfectly ! The friends reunited with huge smiles on
the toy store with her friends . She couldn ’
the v ase , it will break ," said her mom
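
The four INTERVAL / CONTAINS blocks above appear to partition the activation range between MIN and MAX. A minimal sketch of a sanity check on those figures follows, assuming the intervals are meant to be contiguous and the shares are meant to sum to roughly 100%. The variable and function names are hypothetical, and the numbers are copied from the lines above as displayed (rounded to three decimals).

```python
# Interval bounds (low, high) and share of activations in percent,
# copied from the INTERVAL / CONTAINS lines of this dump.
intervals = [
    (0.086, 0.191, 1.333),
    (-0.020, 0.086, 72.039),
    (-0.126, -0.020, 26.175),
    (-0.232, -0.126, 0.452),
]


def total_share(rows):
    """Sum the percentage shares across all intervals."""
    return sum(share for _, _, share in rows)


def are_contiguous(rows):
    """Check that each interval starts where the previous one ends."""
    ordered = sorted(rows)  # sort by lower bound
    return all(a[1] == b[0] for a, b in zip(ordered, ordered[1:]))


total = total_share(intervals)
contiguous = are_contiguous(intervals)
print(f"total share: {total:.3f}%")
print(f"contiguous: {contiguous}")
```

With the displayed values the shares add up to just under 100%, which is consistent with rounding in the dump rather than a missing interval.
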
Token Finn
Token ID15368
Feature activation+1.84e-02

Pos logit contributions
run+0.136
ang+0.116
udge+0.105
aser+0.100
web+0.098
Neg logit contributions
 towns-0.115
girls-0.112
sets-0.112
 conversations-0.111
 amen-0.111
Token.
Token ID13
Feature activation+1.60e-02

Pos logit contributions
run+0.036
ang+0.031
udge+0.028
aser+0.027
ours+0.027
Neg logit contributions
sets-0.028
 towns-0.028
 amen-0.027
 conversations-0.026
 narrative-0.026
Token
Token ID198
Feature activation-7.02e-03

Pos logit contributions
run+0.036
ang+0.031
udge+0.028
ours+0.028
inn+0.027
Neg logit contributions
 towns-0.020
sets-0.020
girls-0.019
 amen-0.019
 conversations-0.018
Token
Token ID198
Feature activation-8.02e-03

Pos logit contributions
girls+0.006
 towns+0.006
 nod+0.005
sets+0.005
 conversations+0.005
Neg logit contributions
run-0.018
ang-0.016
aser-0.015
web-0.015
udge-0.015
TokenF
Token ID37
Feature activation+2.50e-02

Pos logit contributions
girls+0.012
 towns+0.012
sets+0.011
 nod+0.011
 conversations+0.011
Neg logit contributions
run-0.016
ang-0.014
aser-0.013
udge-0.013
web-0.012
Tokeninn
Token ID3732
Feature activation+1.87e-01

Pos logit contributions
run+0.029
web+0.026
aser+0.025
ang+0.024
umes+0.024
Neg logit contributions
girls-0.047
 nod-0.047
 voy-0.047
bit-0.045
 towns-0.044
Token asked
Token ID1965
Feature activation-1.48e-02

Pos logit contributions
run+0.363
ang+0.316
udge+0.286
aser+0.282
web+0.280
Neg logit contributions
girls-0.284
 towns-0.280
sets-0.275
 amen-0.267
 nod-0.262
Token,
Token ID11
Feature activation-8.84e-03

Pos logit contributions
girls+0.023
 towns+0.023
sets+0.022
 amen+0.021
 nod+0.021
Neg logit contributions
run-0.028
ang-0.025
udge-0.022
aser-0.022
web-0.022
Token "
Token ID366
Feature activation+4.09e-02

Pos logit contributions
girls+0.014
 towns+0.014
sets+0.013
 amen+0.013
 nod+0.013
Neg logit contributions
run-0.017
ang-0.014
udge-0.013
web-0.013
aser-0.013
TokenDo
Token ID5211
Feature activation+2.62e-02

Pos logit contributions
run+0.068
ang+0.059
udge+0.053
it+0.052
web+0.052
Neg logit contributions
 towns-0.069
girls-0.068
sets-0.068
 amen-0.067
 conversations-0.066
Token you
Token ID345
Feature activation+2.00e-02

Pos logit contributions
run+0.031
ang+0.025
aser+0.022
earing+0.021
it+0.021
Neg logit contributions
sets-0.056
 towns-0.056
girls-0.055
 distance-0.053
 amen-0.052
Token spots
Token ID10222
Feature activation+2.99e-02

Pos logit contributions
run+0.104
ang+0.089
web+0.082
aser+0.081
udge+0.080
Neg logit contributions
girls-0.084
 towns-0.079
 amen-0.079
sets-0.076
Arthur-0.074
Token!
Token ID0
Feature activation+4.02e-02

Pos logit contributions
run+0.067
ang+0.060
udge+0.055
aser+0.054
web+0.054
Neg logit contributions
 towns-0.036
girls-0.035
sets-0.035
 nod-0.033
 amen-0.033
Token
Token ID198
Feature activation-1.04e-02

Pos logit contributions
run+0.085
ang+0.075
aser+0.071
web+0.070
udge+0.070
Neg logit contributions
girls-0.053
 towns-0.052
sets-0.049
 amen-0.049
 nod-0.048
Token
Token ID198
Feature activation-1.15e-02

Pos logit contributions
girls+0.011
 towns+0.011
 amen+0.010
sets+0.010
 nod+0.010
Neg logit contributions
run-0.025
ang-0.022
aser-0.021
udge-0.021
web-0.021
TokenF
Token ID37
Feature activation+2.36e-02

Pos logit contributions
girls+0.016
 towns+0.016
sets+0.015
 amen+0.015
 nod+0.015
Neg logit contributions
run-0.024
ang-0.021
udge-0.019
aser-0.019
web-0.019
Tokeninn
Token ID3732
Feature activation+1.87e-01

Pos logit contributions
run+0.034
aser+0.031
web+0.030
umes+0.030
 unlike+0.029
Neg logit contributions
 voy-0.034
 nod-0.034
bit-0.033
 towns-0.033
 travel-0.031
Token gasped
Token ID45236
Feature activation+1.55e-03

Pos logit contributions
run+0.381
ang+0.326
udge+0.306
aser+0.292
web+0.289
Neg logit contributions
 towns-0.267
girls-0.260
sets-0.249
 amen-0.240
 nod-0.238
Token and
Token ID290
Feature activation+1.65e-02

Pos logit contributions
run+0.003
ang+0.003
aser+0.003
udge+0.003
ours+0.003
Neg logit contributions
 towns-0.002
girls-0.002
sets-0.002
 amen-0.002
 nod-0.002
Token got
Token ID1392
Feature activation+6.89e-02

Pos logit contributions
run+0.037
ang+0.033
udge+0.032
aser+0.031
web+0.030
Neg logit contributions
 towns-0.020
 nod-0.020
girls-0.019
 amen-0.019
sets-0.019
Token to
Token ID284
Feature activation-3.56e-02

Pos logit contributions
run+0.141
ang+0.126
udge+0.121
web+0.120
aser+0.117
Neg logit contributions
 towns-0.101
 nod-0.099
girls-0.097
 overcome-0.097
 amen-0.090
Token his
Token ID465
Feature activation+3.08e-02

Pos logit contributions
girls+0.053
sets+0.051
 towns+0.049
 amen+0.048
 narrative+0.047
Neg logit contributions
run-0.069
ang-0.059
udge-0.055
web-0.053
ours-0.052
Token it
Token ID340
Feature activation+6.13e-02

Pos logit contributions
run+0.166
ang+0.153
web+0.146
udge+0.144
aser+0.138
Neg logit contributions
 towns-0.091
 hearts-0.087
girls-0.086
 overcome-0.083
 islands-0.079
Token.
Token ID13
Feature activation+9.94e-03

Pos logit contributions
run+0.125
ang+0.114
udge+0.104
ours+0.100
aser+0.100
Neg logit contributions
girls-0.084
 towns-0.083
sets-0.081
 nod-0.077
 amen-0.077
Token
Token ID198
Feature activation-9.22e-03

Pos logit contributions
run+0.023
ang+0.021
udge+0.019
aser+0.019
web+0.019
Neg logit contributions
girls-0.011
 towns-0.011
sets-0.010
 amen-0.010
 nod-0.010
Token
Token ID198
Feature activation-1.03e-02

Pos logit contributions
girls+0.009
 towns+0.009
sets+0.009
 nod+0.009
 amen+0.009
Neg logit contributions
run-0.022
ang-0.020
aser-0.019
web-0.018
udge-0.018
TokenF
Token ID37
Feature activation+2.40e-02

Pos logit contributions
girls+0.015
 towns+0.014
sets+0.014
 amen+0.014
 nod+0.013
Neg logit contributions
run-0.021
ang-0.018
udge-0.017
aser-0.017
web-0.017
Tokeninn
Token ID3732
Feature activation+1.86e-01

Pos logit contributions
run+0.031
aser+0.028
web+0.027
umes+0.027
she+0.025
Neg logit contributions
 nod-0.043
girls-0.043
 voy-0.043
bit-0.042
zo-0.041
Token said
Token ID531
Feature activation-2.49e-03

Pos logit contributions
run+0.392
ang+0.342
udge+0.314
aser+0.308
web+0.307
Neg logit contributions
girls-0.252
 towns-0.251
sets-0.241
 amen-0.234
 nod-0.230
Token to
Token ID284
Feature activation-3.46e-02

Pos logit contributions
girls+0.003
 towns+0.003
sets+0.003
 amen+0.003
 conversations+0.003
Neg logit contributions
run-0.005
ang-0.005
udge-0.004
aser-0.004
web-0.004
Token his
Token ID465
Feature activation+3.21e-02

Pos logit contributions
girls+0.047
 towns+0.046
sets+0.046
 amen+0.044
 conversations+0.043
Neg logit contributions
run-0.073
ang-0.064
udge-0.058
web-0.057
aser-0.057
Token friend
Token ID1545
Feature activation-3.99e-02

Pos logit contributions
run+0.066
ang+0.058
udge+0.053
web+0.052
aser+0.051
Neg logit contributions
girls-0.046
 towns-0.045
sets-0.043
 amen-0.042
 nod-0.042
Token,
Token ID11
Feature activation-9.36e-03

Pos logit contributions
 towns+0.046
girls+0.046
sets+0.045
 amen+0.042
 conversations+0.041
Neg logit contributions
run-0.092
ang-0.080
udge-0.075
aser-0.074
ours-0.072
Token promise
Token ID6991
Feature activation-4.38e-02

Pos logit contributions
run+0.072
ang+0.064
udge+0.060
aser+0.059
web+0.059
Neg logit contributions
girls-0.035
 towns-0.034
 nod-0.033
sets-0.033
 amen-0.033
Token."
Token ID526
Feature activation-4.98e-02

Pos logit contributions
girls+0.069
 towns+0.069
sets+0.067
 amen+0.065
 conversations+0.064
Neg logit contributions
run-0.082
ang-0.071
udge-0.064
aser-0.063
web-0.063
Token
Token ID198
Feature activation-1.98e-02

Pos logit contributions
girls+0.059
 towns+0.059
sets+0.057
 amen+0.055
 conversations+0.054
Neg logit contributions
run-0.114
ang-0.100
udge-0.093
aser-0.090
web-0.089
Token
Token ID198
Feature activation-2.04e-02

Pos logit contributions
girls+0.022
 towns+0.021
sets+0.020
 amen+0.020
 nod+0.019
Neg logit contributions
run-0.046
ang-0.041
aser-0.039
udge-0.038
web-0.038
TokenF
Token ID37
Feature activation+1.95e-02

Pos logit contributions
girls+0.021
 towns+0.020
 amen+0.019
sets+0.019
 nod+0.018
Neg logit contributions
run-0.050
ang-0.044
aser-0.042
ours-0.042
udge-0.041
Tokeninn
Token ID3732
Feature activation+1.86e-01

Pos logit contributions
run+0.026
ang+0.022
web+0.022
aser+0.022
ours+0.020
Neg logit contributions
 nod-0.034
 amen-0.033
 towns-0.032
girls-0.032
Arthur-0.031
Token felt
Token ID2936
Feature activation-2.94e-02

Pos logit contributions
run+0.398
ang+0.349
udge+0.321
aser+0.316
web+0.315
Neg logit contributions
girls-0.245
 towns-0.242
sets-0.235
 amen-0.227
 nod-0.222
Token cozy
Token ID37438
Feature activation-1.43e-02

Pos logit contributions
 towns+0.043
 nod+0.042
 amen+0.041
girls+0.040
 conversations+0.040
Neg logit contributions
run-0.058
ang-0.052
web-0.047
udge-0.047
aser-0.047
Token and
Token ID290
Feature activation+1.20e-02

Pos logit contributions
girls+0.029
 towns+0.027
 narrative+0.026
sets+0.026
 conversations+0.025
Neg logit contributions
run-0.022
ang-0.017
web-0.017
 underneath-0.016
it-0.015
Token safe
Token ID3338
Feature activation-4.08e-02

Pos logit contributions
run+0.024
ang+0.021
udge+0.019
aser+0.019
web+0.019
Neg logit contributions
girls-0.017
 towns-0.017
sets-0.017
 amen-0.016
 nod-0.016
Token.
Token ID13
Feature activation-1.23e-03

Pos logit contributions
girls+0.072
 towns+0.068
 amen+0.066
sets+0.065
Arthur+0.064
Neg logit contributions
run-0.072
ang-0.058
web-0.054
udge-0.052
it-0.052
Token had
Token ID550
Feature activation+3.42e-02

Pos logit contributions
run+0.041
ang+0.038
udge+0.036
web+0.034
inn+0.033
Neg logit contributions
girls-0.017
 nod-0.016
 towns-0.016
sets-0.015
 Claus-0.015
Token to
Token ID284
Feature activation-3.50e-02

Pos logit contributions
run+0.066
ang+0.057
udge+0.052
web+0.051
aser+0.051
Neg logit contributions
girls-0.052
 towns-0.052
sets-0.051
 amen-0.049
 conversations-0.049
Token wait
Token ID4043
Feature activation-7.39e-02

Pos logit contributions
girls+0.052
sets+0.051
 towns+0.049
 amen+0.046
 conversations+0.045
Neg logit contributions
run-0.069
ang-0.058
udge-0.057
earing-0.054
aser-0.053
Token until
Token ID1566
Feature activation+3.30e-02

Pos logit contributions
 towns+0.102
girls+0.098
 amen+0.097
sets+0.096
 nod+0.096
Neg logit contributions
run-0.150
ang-0.136
udge-0.123
aser-0.121
web-0.120
Token d
Token ID288
Feature activation+3.45e-02

Pos logit contributions
run+0.063
ang+0.055
udge+0.050
aser+0.049
web+0.048
Neg logit contributions
 towns-0.051
girls-0.051
sets-0.050
 amen-0.049
 nod-0.048
Tokeninn
Token ID3732
Feature activation+1.86e-01

Pos logit contributions
run+0.050
web+0.047
ang+0.044
 possession+0.041
 oh+0.039
Neg logit contributions
girls-0.057
 towns-0.057
 nod-0.055
sets-0.055
 overcome-0.054
Tokenert
Token ID861
Feature activation+1.29e-02

Pos logit contributions
run+0.281
ang+0.236
udge+0.208
web+0.200
aser+0.195
Neg logit contributions
girls-0.356
 towns-0.355
sets-0.348
 amen-0.339
 nod-0.332
Tokenime
Token ID524
Feature activation+2.93e-02

Pos logit contributions
run+0.022
ang+0.020
udge+0.018
ours+0.018
web+0.017
Neg logit contributions
 towns-0.020
girls-0.020
sets-0.020
 nod-0.019
 amen-0.019
Token.
Token ID13
Feature activation+4.14e-03

Pos logit contributions
run+0.062
ang+0.054
udge+0.049
aser+0.049
web+0.048
Neg logit contributions
girls-0.040
 towns-0.039
sets-0.038
 amen-0.037
 nod-0.036
Token
Token ID198
Feature activation-1.63e-02

Pos logit contributions
run+0.010
ang+0.009
udge+0.008
aser+0.008
web+0.008
Neg logit contributions
girls-0.005
 towns-0.005
sets-0.004
 amen-0.004
 nod-0.004
Token
Token ID198
Feature activation-1.68e-02

Pos logit contributions
girls+0.015
 towns+0.015
 amen+0.014
 nod+0.014
 conversations+0.014
Neg logit contributions
run-0.040
ang-0.036
aser-0.035
udge-0.034
web-0.034
Token line
Token ID1627
Feature activation+4.56e-02

Pos logit contributions
girls+0.067
 towns+0.066
sets+0.065
 amen+0.063
 conversations+0.061
Neg logit contributions
run-0.086
ang-0.075
udge-0.068
aser-0.067
web-0.066
Token!
Token ID0
Feature activation+3.76e-02

Pos logit contributions
run+0.101
ang+0.088
udge+0.082
aser+0.080
web+0.080
Neg logit contributions
girls-0.057
 towns-0.056
sets-0.055
 amen-0.053
 nod-0.052
Token
Token ID198
Feature activation-1.78e-02

Pos logit contributions
run+0.090
ang+0.082
web+0.078
aser+0.077
udge+0.076
Neg logit contributions
 towns-0.038
girls-0.037
 amen-0.035
sets-0.035
 nod-0.035
Token
Token ID198
Feature activation-1.83e-02

Pos logit contributions
 towns+0.015
girls+0.014
 amen+0.014
 nod+0.014
 overcome+0.013
Neg logit contributions
run-0.045
ang-0.041
aser-0.040
web-0.039
udge-0.038
TokenF
Token ID37
Feature activation+2.28e-02

Pos logit contributions
 towns+0.020
girls+0.019
 nod+0.018
 amen+0.018
Arthur+0.018
Neg logit contributions
run-0.043
ang-0.037
aser-0.036
web-0.036
udge-0.035
Tokeninn
Token ID3732
Feature activation+1.86e-01

Pos logit contributions
run+0.027
web+0.025
ang+0.023
umes+0.022
aser+0.022
Neg logit contributions
 voy-0.040
 nod-0.040
 towns-0.039
bit-0.037
 overcome-0.036
Token shouted
Token ID17293
Feature activation-2.46e-02

Pos logit contributions
run+0.366
ang+0.318
udge+0.290
aser+0.284
web+0.283
Neg logit contributions
girls-0.275
 towns-0.272
sets-0.265
 amen-0.258
 nod-0.253
Token,
Token ID11
Feature activation-1.34e-02

Pos logit contributions
 towns+0.025
 nod+0.024
girls+0.024
sets+0.023
 amen+0.023
Neg logit contributions
run-0.060
ang-0.055
web-0.050
udge-0.050
aser-0.049
Token "
Token ID366
Feature activation+3.94e-02

Pos logit contributions
girls+0.023
 towns+0.023
sets+0.022
 amen+0.022
 conversations+0.021
Neg logit contributions
run-0.023
ang-0.020
udge-0.018
aser-0.017
web-0.017
TokenDad
Token ID46270
Feature activation-2.42e-02

Pos logit contributions
run+0.074
ang+0.065
udge+0.059
it+0.057
web+0.056
Neg logit contributions
girls-0.063
 towns-0.061
sets-0.060
 amen-0.059
 conversations-0.057
Token,
Token ID11
Feature activation-1.27e-02

Pos logit contributions
girls+0.032
 towns+0.032
sets+0.031
 amen+0.030
 conversations+0.030
Neg logit contributions
run-0.052
ang-0.045
aser-0.041
udge-0.041
ours-0.040
Token Emma
Token ID18966
Feature activation+2.74e-03

Pos logit contributions
girls+0.023
 towns+0.023
sets+0.022
 amen+0.022
 conversations+0.021
Neg logit contributions
run-0.032
ang-0.027
udge-0.025
web-0.025
aser-0.025
Token could
Token ID714
Feature activation-4.43e-02

Pos logit contributions
run+0.006
ang+0.005
udge+0.004
web+0.004
aser+0.004
Neg logit contributions
girls-0.004
 towns-0.004
sets-0.004
 amen-0.004
 conversations-0.004
Token hear
Token ID3285
Feature activation-8.22e-02

Pos logit contributions
 towns+0.046
girls+0.045
sets+0.044
 nod+0.043
 amen+0.043
Neg logit contributions
run-0.107
ang-0.096
udge-0.089
aser-0.088
web-0.087
Token a
Token ID257
Feature activation-1.07e-02

Pos logit contributions
 towns+0.122
girls+0.122
sets+0.119
 amen+0.115
 conversations+0.114
Neg logit contributions
run-0.162
ang-0.141
udge-0.128
aser-0.126
web-0.125
Token wh
Token ID348
Feature activation-2.51e-02

Pos logit contributions
girls+0.016
sets+0.016
 towns+0.015
 amen+0.015
 conversations+0.014
Neg logit contributions
run-0.021
ang-0.018
web-0.016
udge-0.016
aser-0.016
Tokeninn
Token ID3732
Feature activation+1.86e-01

Pos logit contributions
 towns+0.044
sets+0.042
 narrative+0.042
 overcome+0.042
girls+0.042
Neg logit contributions
run-0.041
ang-0.031
ours-0.028
web-0.028
udge-0.028
Tokenying
Token ID1112
Feature activation+3.84e-02

Pos logit contributions
run+0.318
ang+0.277
udge+0.242
ours+0.236
web+0.231
Neg logit contributions
 towns-0.314
sets-0.309
girls-0.308
 nod-0.296
 amen-0.294
Token sound
Token ID2128
Feature activation-6.53e-02

Pos logit contributions
run+0.065
ang+0.055
udge+0.049
aser+0.048
web+0.047
Neg logit contributions
 towns-0.068
girls-0.067
sets-0.066
 amen-0.064
 nod-0.063
Token.
Token ID13
Feature activation+4.69e-03

Pos logit contributions
 towns+0.090
girls+0.090
sets+0.087
 amen+0.084
 nod+0.083
Neg logit contributions
run-0.135
ang-0.119
udge-0.108
aser-0.107
web-0.106
Token She
Token ID1375
Feature activation-1.43e-02

Pos logit contributions
run+0.010
ang+0.009
udge+0.008
web+0.008
aser+0.008
Neg logit contributions
 towns-0.006
girls-0.006
sets-0.006
 amen-0.006
 nod-0.006
Token stepped
Token ID10764
Feature activation+1.02e-02

Pos logit contributions
 towns+0.018
girls+0.016
 nod+0.016
sets+0.016
 amen+0.016
Neg logit contributions
run-0.032
ang-0.029
udge-0.027
aser-0.027
ours-0.026
Token in
Token ID287
Feature activation-1.41e-03

Pos logit contributions
run+0.133
ang+0.118
udge+0.110
aser+0.106
web+0.105
Neg logit contributions
girls-0.073
 towns-0.072
sets-0.071
 amen-0.068
 nod-0.067
Token!"
Token ID2474
Feature activation-3.82e-02

Pos logit contributions
 towns+0.002
girls+0.002
sets+0.002
 conversations+0.002
 amen+0.002
Neg logit contributions
run-0.003
ang-0.003
aser-0.003
udge-0.002
web-0.002
Token
Token ID198
Feature activation-1.95e-02

Pos logit contributions
girls+0.048
 towns+0.047
sets+0.046
 amen+0.044
 conversations+0.043
Neg logit contributions
run-0.086
ang-0.075
udge-0.069
aser-0.067
web-0.066
Token
Token ID198
Feature activation-1.96e-02

Pos logit contributions
girls+0.020
 towns+0.019
 amen+0.018
 nod+0.018
sets+0.017
Neg logit contributions
run-0.047
ang-0.041
aser-0.040
web-0.039
udge-0.039
TokenF
Token ID37
Feature activation+2.23e-02

Pos logit contributions
girls+0.025
 towns+0.024
 amen+0.023
 nod+0.022
sets+0.022
Neg logit contributions
run-0.043
ang-0.037
aser-0.036
udge-0.035
web-0.035
Tokeninn
Token ID3732
Feature activation+1.86e-01

Pos logit contributions
run+0.023
web+0.021
ang+0.020
umes+0.019
aser+0.019
Neg logit contributions
 voy-0.040
 nod-0.040
 towns-0.040
bit-0.039
 rep-0.038
Token re
Token ID302
Feature activation+3.38e-02

Pos logit contributions
run+0.387
ang+0.339
udge+0.310
aser+0.305
web+0.304
Neg logit contributions
girls-0.253
 towns-0.251
sets-0.244
 amen-0.236
 nod-0.232
Tokeneled
Token ID18449
Feature activation+1.19e-01

Pos logit contributions
run+0.057
ang+0.045
web+0.043
aser+0.042
ula+0.042
Neg logit contributions
 towns-0.055
girls-0.054
 voy-0.051
zo-0.050
 conversations-0.050
Token in
Token ID287
Feature activation-2.19e-03

Pos logit contributions
run+0.201
ang+0.169
udge+0.150
aser+0.148
web+0.147
Neg logit contributions
girls-0.214
 towns-0.210
sets-0.206
 amen-0.202
 conversations-0.199
Token the
Token ID262
Feature activation-4.63e-02

Pos logit contributions
 towns+0.003
girls+0.002
 nod+0.002
sets+0.002
 conversations+0.002
Neg logit contributions
run-0.005
ang-0.004
aser-0.004
web-0.004
udge-0.004
Token line
Token ID1627
Feature activation+4.69e-02

Pos logit contributions
 towns+0.055
girls+0.055
sets+0.053
 amen+0.051
 nod+0.051
Neg logit contributions
run-0.105
ang-0.093
udge-0.086
web-0.085
aser-0.085
Token
Token ID198
Feature activation-1.78e-02

Pos logit contributions
girls+0.036
 towns+0.035
sets+0.035
 nod+0.032
 overcome+0.032
Neg logit contributions
run-0.112
ang-0.106
web-0.100
aser-0.099
udge-0.098
Token
Token ID198
Feature activation-1.87e-02

Pos logit contributions
 towns+0.017
girls+0.017
sets+0.016
 nod+0.016
 conversations+0.015
Neg logit contributions
run-0.044
ang-0.039
aser-0.037
ours-0.037
udge-0.037
TokenThe
Token ID464
Feature activation-5.74e-02

Pos logit contributions
 towns+0.021
 overcome+0.019
 conversations+0.018
girls+0.018
sets+0.018
Neg logit contributions
run-0.044
ang-0.040
ours-0.039
aser-0.038
udge-0.037
Token horse
Token ID8223
Feature activation+2.67e-02

Pos logit contributions
girls+0.065
 towns+0.065
sets+0.062
 amen+0.060
 nod+0.059
Neg logit contributions
run-0.136
ang-0.119
udge-0.112
aser-0.110
web-0.109
Token wh
Token ID348
Feature activation-2.81e-02

Pos logit contributions
run+0.056
ang+0.050
web+0.045
udge+0.045
aser+0.044
Neg logit contributions
girls-0.036
 towns-0.035
sets-0.034
 amen-0.033
 conversations-0.032
Tokeninn
Token ID3732
Feature activation+1.86e-01

Pos logit contributions
 narrative+0.049
 towns+0.048
sets+0.045
girls+0.045
 overcome+0.045
Neg logit contributions
run-0.043
 oh-0.032
ang-0.030
web-0.029
aser-0.029
Tokenied
Token ID798
Feature activation+3.90e-02

Pos logit contributions
run+0.205
ang+0.159
udge+0.133
web+0.124
aser+0.120
Neg logit contributions
 towns-0.431
girls-0.430
sets-0.422
 amen-0.412
 nod-0.407
Token and
Token ID290
Feature activation+1.49e-02

Pos logit contributions
run+0.065
ang+0.056
udge+0.050
aser+0.049
ours+0.049
Neg logit contributions
 towns-0.069
girls-0.068
sets-0.067
 nod-0.065
 amen-0.065
Token said
Token ID531
Feature activation-7.16e-03

Pos logit contributions
run+0.032
ang+0.029
udge+0.028
aser+0.027
ours+0.027
Neg logit contributions
 towns-0.019
sets-0.019
girls-0.018
 nod-0.017
 amen-0.017
Token,
Token ID11
Feature activation-1.33e-02

Pos logit contributions
 towns+0.008
girls+0.008
 amen+0.008
sets+0.008
 nod+0.008
Neg logit contributions
run-0.016
ang-0.015
udge-0.014
web-0.013
aser-0.013
Token "
Token ID366
Feature activation+3.86e-02

Pos logit contributions
girls+0.021
 towns+0.021
sets+0.021
 amen+0.020
 nod+0.020
Neg logit contributions
run-0.024
ang-0.021
udge-0.019
aser-0.019
web-0.019
Token and
Token ID290
Feature activation+2.03e-02

Pos logit contributions
run+0.062
ang+0.053
udge+0.049
aser+0.048
web+0.048
Neg logit contributions
girls-0.043
 towns-0.043
sets-0.042
 amen-0.041
 nod-0.040
Token enjoy
Token ID2883
Feature activation-4.49e-02

Pos logit contributions
run+0.042
ang+0.037
udge+0.034
aser+0.034
web+0.033
Neg logit contributions
girls-0.028
 towns-0.027
sets-0.027
 amen-0.026
 nod-0.025
Token it
Token ID340
Feature activation+6.00e-02

Pos logit contributions
girls+0.067
 towns+0.066
sets+0.065
 amen+0.063
 nod+0.062
Neg logit contributions
run-0.088
ang-0.077
udge-0.070
web-0.068
aser-0.068
Token until
Token ID1566
Feature activation+3.59e-02

Pos logit contributions
run+0.130
ang+0.114
udge+0.105
aser+0.103
web+0.103
Neg logit contributions
girls-0.078
 towns-0.077
sets-0.075
 amen-0.072
 nod-0.071
Token d
Token ID288
Feature activation+3.58e-02

Pos logit contributions
run+0.061
ang+0.050
web+0.045
ours+0.045
aser+0.045
Neg logit contributions
girls-0.065
 towns-0.060
sets-0.060
 amen-0.057
 nod-0.057
Tokeninn
Token ID3732
Feature activation+1.85e-01

Pos logit contributions
run+0.056
ang+0.049
web+0.049
aser+0.042
udge+0.041
Neg logit contributions
 towns-0.061
girls-0.060
 nod-0.059
sets-0.058
 overcome-0.058
Tokenert
Token ID861
Feature activation+1.26e-02

Pos logit contributions
run+0.295
ang+0.248
udge+0.219
web+0.213
aser+0.212
Neg logit contributions
girls-0.343
 towns-0.342
sets-0.335
 amen-0.326
 nod-0.321
Tokenime
Token ID524
Feature activation+3.34e-02

Pos logit contributions
run+0.021
ang+0.018
udge+0.016
aser+0.016
web+0.016
Neg logit contributions
 towns-0.022
girls-0.022
sets-0.021
 amen-0.021
 nod-0.020
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
run+0.068
ang+0.058
udge+0.053
aser+0.053
web+0.052
Neg logit contributions
girls-0.047
sets-0.047
 towns-0.046
 amen-0.045
 nod-0.045
Token
Token ID220
Feature activation-3.52e-02

Pos logit contributions
run+0.023
ang+0.020
udge+0.019
aser+0.018
ours+0.018
Neg logit contributions
girls-0.013
 towns-0.012
sets-0.012
 amen-0.012
 conversations-0.011
Token
Token ID198
Feature activation-1.01e-02

Pos logit contributions
girls+0.031
 towns+0.031
sets+0.029
 nod+0.028
 amen+0.027
Neg logit contributions
run-0.087
ang-0.081
web-0.076
aser-0.075
udge-0.074
Token shore
Token ID15191
Feature activation+6.82e-03

Pos logit contributions
girls+0.063
sets+0.062
 towns+0.061
 amen+0.059
 conversations+0.057
Neg logit contributions
run-0.094
ang-0.084
udge-0.076
aser-0.076
web-0.075
Token.
Token ID13
Feature activation+2.99e-03

Pos logit contributions
run+0.015
ang+0.013
udge+0.012
aser+0.012
web+0.012
Neg logit contributions
girls-0.008
 towns-0.008
sets-0.008
 amen-0.008
 conversations-0.008
Token
Token ID198
Feature activation-1.89e-02

Pos logit contributions
run+0.007
ang+0.006
web+0.006
aser+0.006
udge+0.006
Neg logit contributions
 towns-0.003
girls-0.003
 amen-0.003
sets-0.003
 nod-0.003
Token
Token ID198
Feature activation-1.92e-02

Pos logit contributions
girls+0.014
 towns+0.013
 nod+0.013
 amen+0.012
 overcome+0.012
Neg logit contributions
run-0.049
aser-0.045
web-0.044
ang-0.044
udge-0.042
TokenF
Token ID37
Feature activation+2.31e-02

Pos logit contributions
 towns+0.021
girls+0.020
 amen+0.019
 nod+0.019
 overcome+0.018
Neg logit contributions
run-0.045
aser-0.039
ang-0.038
web-0.038
udge-0.038
Tokeninn
Token ID3732
Feature activation+1.85e-01

Pos logit contributions
run+0.025
web+0.025
ang+0.023
umes+0.022
aser+0.021
Neg logit contributions
 voy-0.041
 nod-0.040
 towns-0.039
bit-0.038
 narrative-0.037
Token was
Token ID373
Feature activation+1.06e-02

Pos logit contributions
run+0.368
ang+0.319
udge+0.291
aser+0.286
web+0.285
Neg logit contributions
girls-0.272
 towns-0.270
sets-0.263
 amen-0.255
 nod-0.251
Token so
Token ID523
Feature activation+2.50e-03

Pos logit contributions
run+0.021
ang+0.019
aser+0.019
udge+0.018
web+0.018
Neg logit contributions
 towns-0.014
girls-0.014
 nod-0.014
 overcome-0.014
sets-0.013
Token happy
Token ID3772
Feature activation+4.27e-03

Pos logit contributions
run+0.005
web+0.004
aser+0.004
ang+0.004
udge+0.004
Neg logit contributions
 nod-0.004
 towns-0.004
 amen-0.003
 overcome-0.003
 thanked-0.003
Token he
Token ID339
Feature activation-7.05e-03

Pos logit contributions
run+0.008
ang+0.007
udge+0.006
web+0.006
aser+0.006
Neg logit contributions
girls-0.007
 towns-0.007
sets-0.006
 amen-0.006
 nod-0.006
Token gave
Token ID2921
Feature activation+7.20e-03

Pos logit contributions
girls+0.009
 towns+0.008
sets+0.008
 amen+0.008
 nod+0.008
Neg logit contributions
run-0.016
ang-0.014
udge-0.013
ours-0.013
aser-0.013
Token's
Token ID338
Feature activation-3.81e-02

Pos logit contributions
run+0.096
ang+0.084
udge+0.078
aser+0.076
web+0.075
Neg logit contributions
girls-0.061
 towns-0.060
sets-0.058
 amen-0.056
 nod-0.055
Token help
Token ID1037
Feature activation-1.25e-01

Pos logit contributions
 towns+0.048
girls+0.047
sets+0.045
 nod+0.045
 conversations+0.045
Neg logit contributions
run-0.083
ang-0.075
udge-0.069
aser-0.067
ours-0.067
Token.
Token ID13
Feature activation+6.83e-04

Pos logit contributions
girls+0.180
 towns+0.174
sets+0.172
 amen+0.170
 conversations+0.159
Neg logit contributions
run-0.259
ang-0.216
udge-0.205
web-0.199
aser-0.197
Token<|endoftext|>
Token ID50256
Feature activation+4.65e-03

Pos logit contributions
run+0.002
ang+0.001
web+0.001
udge+0.001
aser+0.001
Neg logit contributions
girls-0.001
 towns-0.001
sets-0.001
 amen-0.001
 nod-0.001
TokenF
Token ID37
Feature activation+2.56e-02

Pos logit contributions
run+0.009
ang+0.008
udge+0.007
web+0.007
it+0.007
Neg logit contributions
girls-0.007
 towns-0.007
 amen-0.007
sets-0.007
 nod-0.006
Tokeninn
Token ID3732
Feature activation+1.85e-01

Pos logit contributions
run+0.040
ang+0.033
aser+0.033
web+0.032
ours+0.031
Neg logit contributions
 nod-0.044
 towns-0.042
girls-0.041
 amen-0.041
 distance-0.040
Token sat
Token ID3332
Feature activation+2.16e-02

Pos logit contributions
run+0.363
ang+0.316
udge+0.285
aser+0.282
web+0.281
Neg logit contributions
girls-0.276
 towns-0.271
sets-0.268
 amen-0.260
 nod-0.256
Token under
Token ID739
Feature activation+6.54e-02

Pos logit contributions
run+0.035
ang+0.028
udge+0.027
aser+0.026
ours+0.024
Neg logit contributions
 towns-0.041
sets-0.040
girls-0.038
 amen-0.038
 nod-0.038
Token the
Token ID262
Feature activation-4.04e-02

Pos logit contributions
run+0.134
ang+0.116
udge+0.106
web+0.105
aser+0.104
Neg logit contributions
girls-0.092
 towns-0.090
sets-0.088
 amen-0.085
 nod-0.083
Token apple
Token ID17180
Feature activation+4.15e-02

Pos logit contributions
girls+0.047
 towns+0.047
sets+0.045
 amen+0.044
 nod+0.043
Neg logit contributions
run-0.093
ang-0.082
udge-0.076
aser-0.074
web-0.074
Token tree
Token ID5509
Feature activation+1.50e-02

Pos logit contributions
run+0.089
ang+0.079
udge+0.072
web+0.071
aser+0.071
Neg logit contributions
girls-0.054
 towns-0.053
sets-0.051
 amen-0.050
 nod-0.048
Token one
Token ID530
Feature activation+3.26e-02

Pos logit contributions
run+0.206
ang+0.181
udge+0.166
web+0.164
aser+0.162
Neg logit contributions
girls-0.144
 towns-0.144
sets-0.139
 amen-0.136
 nod-0.133
Token."
Token ID526
Feature activation-4.51e-02

Pos logit contributions
run+0.082
ang+0.071
udge+0.069
web+0.068
aser+0.067
Neg logit contributions
girls-0.029
 towns-0.026
sets-0.026
 nod-0.025
 amen-0.025
Token
Token ID198
Feature activation-1.76e-02

Pos logit contributions
girls+0.058
 towns+0.057
sets+0.057
 amen+0.054
 conversations+0.053
Neg logit contributions
run-0.100
ang-0.088
udge-0.080
ours-0.076
aser-0.076
Token
Token ID198
Feature activation-1.76e-02

Pos logit contributions
girls+0.022
 towns+0.022
sets+0.021
 amen+0.020
 conversations+0.020
Neg logit contributions
run-0.039
ang-0.035
udge-0.032
web-0.031
aser-0.031
TokenF
Token ID37
Feature activation+2.16e-02

Pos logit contributions
girls+0.020
 towns+0.020
sets+0.020
 amen+0.019
 conversations+0.018
Neg logit contributions
run-0.041
ang-0.037
udge-0.034
web-0.033
aser-0.033
Tokeninn
Token ID3732
Feature activation+1.85e-01

Pos logit contributions
run+0.024
web+0.022
ang+0.021
aser+0.020
umes+0.020
Neg logit contributions
 nod-0.039
 voy-0.039
 towns-0.038
bit-0.037
 rep-0.036
Token was
Token ID373
Feature activation+1.26e-02

Pos logit contributions
run+0.370
ang+0.322
udge+0.293
aser+0.288
web+0.287
Neg logit contributions
girls-0.270
 towns-0.267
sets-0.261
 amen-0.253
 nod-0.248
Token excited
Token ID6568
Feature activation-5.43e-03

Pos logit contributions
run+0.025
aser+0.023
ang+0.023
udge+0.022
web+0.021
Neg logit contributions
 nod-0.017
 towns-0.016
 overcome-0.016
 amen-0.016
girls-0.016
Token.
Token ID13
Feature activation+3.76e-03

Pos logit contributions
girls+0.010
 towns+0.009
 amen+0.009
 narrative+0.009
sets+0.009
Neg logit contributions
run-0.010
ang-0.008
udge-0.007
earing-0.007
it-0.007
Token He
Token ID679
Feature activation-4.03e-02

Pos logit contributions
run+0.009
ang+0.008
udge+0.008
aser+0.008
web+0.008
Neg logit contributions
girls-0.004
 towns-0.004
sets-0.004
 amen-0.003
 nod-0.003
Token put
Token ID1234
Feature activation+4.14e-02

Pos logit contributions
girls+0.048
 nod+0.047
 towns+0.047
 voy+0.044
 thanked+0.044
Neg logit contributions
run-0.088
ang-0.079
udge-0.078
aser-0.077
web-0.076
TokenSuddenly
Token ID38582
Feature activation+1.88e-02

Pos logit contributions
 towns+0.017
girls+0.016
 overcome+0.015
 nod+0.014
Arthur+0.014
Neg logit contributions
run-0.040
ang-0.035
aser-0.034
ours-0.034
udge-0.033
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
run+0.042
ang+0.037
web+0.034
aser+0.034
udge+0.034
Neg logit contributions
girls-0.023
sets-0.022
 towns-0.021
 amen-0.020
 conversations-0.020
Token Brown
Token ID4373
Feature activation+3.22e-02

Pos logit contributions
girls+0.021
 towns+0.020
sets+0.020
 amen+0.019
 nod+0.019
Neg logit contributions
run-0.026
ang-0.023
udge-0.021
aser-0.020
web-0.020
Tokenie
Token ID494
Feature activation+6.41e-02

Pos logit contributions
run+0.053
ang+0.044
web+0.040
aser+0.040
udge+0.039
Neg logit contributions
girls-0.058
 amen-0.057
 towns-0.057
sets-0.056
 nod-0.056
Token wh
Token ID348
Feature activation-3.25e-02

Pos logit contributions
run+0.138
ang+0.121
udge+0.111
aser+0.110
web+0.110
Neg logit contributions
girls-0.083
 towns-0.082
sets-0.080
 amen-0.078
 nod-0.076
Tokeninn
Token ID3732
Feature activation+1.85e-01

Pos logit contributions
 towns+0.057
girls+0.055
 narrative+0.055
 overcome+0.053
sets+0.053
Neg logit contributions
run-0.053
ang-0.040
web-0.037
aser-0.036
udge-0.036
Tokenied
Token ID798
Feature activation+3.94e-02

Pos logit contributions
run+0.231
ang+0.195
udge+0.167
web+0.158
ours+0.150
Neg logit contributions
 towns-0.392
girls-0.387
sets-0.381
 amen-0.369
 nod-0.365
Token.
Token ID13
Feature activation+2.82e-03

Pos logit contributions
run+0.070
ang+0.063
udge+0.055
ours+0.055
aser+0.054
Neg logit contributions
 towns-0.065
girls-0.063
sets-0.063
 nod-0.061
 conversations-0.060
Token
Token ID6184
Feature activation+1.04e-02

Pos logit contributions
run+0.006
ang+0.006
udge+0.005
web+0.005
aser+0.005
Neg logit contributions
 towns-0.003
girls-0.003
sets-0.003
 amen-0.003
 nod-0.003
Token
Token ID95
Feature activation-7.57e-03

Pos logit contributions
run+0.020
ang+0.017
udge+0.016
aser+0.015
web+0.015
Neg logit contributions
girls-0.016
 towns-0.016
sets-0.015
 amen-0.015
 nod-0.015
Token
Token ID26391
Feature activation-4.03e-02

Pos logit contributions
girls+0.012
sets+0.012
 towns+0.012
 amen+0.012
 nod+0.011
Neg logit contributions
run-0.014
ang-0.012
udge-0.011
web-0.011
aser-0.011
Token ran
Token ID4966
Feature activation-1.80e-02

Pos logit contributions
girls+0.061
sets+0.060
 towns+0.059
 amen+0.058
 conversations+0.057
Neg logit contributions
run-0.096
ang-0.084
aser-0.077
web-0.075
udge-0.074
Token out
Token ID503
Feature activation+4.05e-02

Pos logit contributions
girls+0.032
 towns+0.031
sets+0.030
 conversations+0.030
 nod+0.029
Neg logit contributions
run-0.031
ang-0.027
udge-0.024
aser-0.023
web-0.023
Token and
Token ID290
Feature activation+2.16e-02

Pos logit contributions
run+0.054
ang+0.045
udge+0.038
aser+0.038
web+0.038
Neg logit contributions
 towns-0.085
girls-0.084
sets-0.082
 nod-0.080
 conversations-0.080
Token started
Token ID2067
Feature activation-6.79e-03

Pos logit contributions
run+0.047
ang+0.041
udge+0.038
aser+0.037
web+0.037
Neg logit contributions
girls-0.028
 towns-0.028
sets-0.027
 amen-0.026
 nod-0.026
Token sp
Token ID599
Feature activation+3.37e-02

Pos logit contributions
 towns+0.012
girls+0.012
 amen+0.011
sets+0.011
 narrative+0.011
Neg logit contributions
run-0.012
ang-0.010
udge-0.009
web-0.008
ours-0.008
Tokeninn
Token ID3732
Feature activation+1.85e-01

Pos logit contributions
run+0.055
ang+0.045
web+0.040
udge+0.040
aser+0.040
Neg logit contributions
 towns-0.059
girls-0.058
sets-0.058
 conversations-0.056
 amen-0.056
Tokenin
Token ID259
Feature activation+7.77e-02

Pos logit contributions
run+0.232
ang+0.188
udge+0.160
web+0.154
aser+0.150
Neg logit contributions
girls-0.400
 towns-0.399
sets-0.392
 amen-0.380
 conversations-0.376
Token',
Token ID3256
Feature activation-8.23e-02

Pos logit contributions
run+0.148
ang+0.127
udge+0.115
aser+0.114
web+0.113
Neg logit contributions
girls-0.123
 towns-0.121
sets-0.117
 amen-0.112
 conversations-0.112
Token round
Token ID2835
Feature activation+5.95e-02

Pos logit contributions
girls+0.129
 towns+0.127
sets+0.124
 amen+0.123
 narrative+0.120
Neg logit contributions
run-0.157
ang-0.131
udge-0.117
aser-0.114
web-0.114
Token and
Token ID290
Feature activation+2.06e-02

Pos logit contributions
run+0.124
ang+0.109
udge+0.100
web+0.099
aser+0.099
Neg logit contributions
girls-0.081
 towns-0.081
sets-0.078
 conversations-0.074
 nod-0.074
Token round
Token ID2835
Feature activation+5.97e-02

Pos logit contributions
run+0.041
ang+0.036
udge+0.036
aser+0.034
web+0.033
Neg logit contributions
girls-0.030
 towns-0.029
 nod-0.028
sets-0.027
 hearts-0.027
Token asked
Token ID1965
Feature activation-1.14e-02

Pos logit contributions
run+0.035
ang+0.031
udge+0.029
aser+0.028
web+0.028
Neg logit contributions
girls-0.018
 towns-0.018
sets-0.017
 amen-0.017
 nod-0.016
Token.
Token ID13
Feature activation+5.58e-03

Pos logit contributions
girls+0.013
 towns+0.013
sets+0.013
 amen+0.012
 conversations+0.012
Neg logit contributions
run-0.026
ang-0.023
udge-0.021
aser-0.021
web-0.021
Token
Token ID198
Feature activation-1.91e-02

Pos logit contributions
run+0.013
ang+0.011
web+0.011
aser+0.011
udge+0.011
Neg logit contributions
 nod-0.006
girls-0.006
 towns-0.006
 amen-0.006
 rep-0.006
Token
Token ID198
Feature activation-1.91e-02

Pos logit contributions
girls+0.018
 towns+0.017
 amen+0.016
 nod+0.016
sets+0.015
Neg logit contributions
run-0.047
ang-0.042
aser-0.041
web-0.040
udge-0.039
TokenF
Token ID37
Feature activation+2.17e-02

Pos logit contributions
girls+0.026
 towns+0.026
 nod+0.025
 overcome+0.024
 amen+0.024
Neg logit contributions
run-0.040
ang-0.033
aser-0.033
udge-0.032
web-0.031
Tokeninn
Token ID3732
Feature activation+1.84e-01

Pos logit contributions
run+0.022
umes+0.021
aser+0.021
web+0.020
 unlike+0.020
Neg logit contributions
bit-0.038
 nod-0.038
 voy-0.037
girls-0.036
 towns-0.035
Token pointed
Token ID6235
Feature activation+8.40e-03

Pos logit contributions
run+0.383
ang+0.336
udge+0.311
web+0.297
aser+0.297
Neg logit contributions
 towns-0.249
girls-0.248
sets-0.238
 amen-0.233
 nod-0.233
Token to
Token ID284
Feature activation-3.72e-02

Pos logit contributions
run+0.017
ang+0.017
web+0.016
aser+0.015
udge+0.014
Neg logit contributions
 nod-0.012
 towns-0.011
girls-0.010
 hearts-0.010
 conversations-0.010
Token the
Token ID262
Feature activation-4.66e-02

Pos logit contributions
 towns+0.045
girls+0.044
 nod+0.042
sets+0.041
 amen+0.041
Neg logit contributions
run-0.084
ang-0.076
aser-0.070
udge-0.069
web-0.069
Token insect
Token ID12833
Feature activation+9.25e-02

Pos logit contributions
 towns+0.079
 nod+0.077
 conversations+0.076
girls+0.076
sets+0.075
Neg logit contributions
run-0.088
ang-0.073
udge-0.064
ours-0.061
inn-0.061
Token and
Token ID290
Feature activation+1.36e-02

Pos logit contributions
run+0.212
ang+0.192
udge+0.179
aser+0.173
ours+0.171
Neg logit contributions
girls-0.104
 towns-0.103
sets-0.102
 nod-0.100
 amen-0.097
Token pond
Token ID16723
Feature activation+1.08e-02

Pos logit contributions
girls+0.052
 towns+0.051
sets+0.049
 amen+0.048
 nod+0.047
Neg logit contributions
run-0.097
ang-0.086
udge-0.079
aser-0.078
web-0.078
Token.
Token ID13
Feature activation+1.11e-02

Pos logit contributions
run+0.024
ang+0.021
udge+0.020
web+0.019
aser+0.019
Neg logit contributions
girls-0.013
 towns-0.013
sets-0.012
 amen-0.012
 conversations-0.012
Token
Token ID198
Feature activation-8.70e-03

Pos logit contributions
run+0.026
ang+0.024
web+0.023
aser+0.023
udge+0.022
Neg logit contributions
 towns-0.011
girls-0.011
sets-0.011
 amen-0.010
 nod-0.010
Token
Token ID198
Feature activation-1.06e-02

Pos logit contributions
 towns+0.009
girls+0.009
 amen+0.008
sets+0.008
 nod+0.008
Neg logit contributions
run-0.021
ang-0.019
aser-0.018
web-0.018
udge-0.018
TokenF
Token ID37
Feature activation+2.08e-02

Pos logit contributions
 towns+0.013
girls+0.013
sets+0.012
 nod+0.012
 amen+0.012
Neg logit contributions
run-0.024
ang-0.020
aser-0.020
udge-0.019
web-0.019
Tokeninn
Token ID3732
Feature activation+1.84e-01

Pos logit contributions
run+0.026
web+0.023
aser+0.023
ang+0.021
umes+0.021
Neg logit contributions
 nod-0.036
 voy-0.036
 towns-0.035
bit-0.034
girls-0.034
Token said
Token ID531
Feature activation-5.38e-03

Pos logit contributions
run+0.358
ang+0.308
udge+0.283
web+0.276
aser+0.275
Neg logit contributions
 towns-0.276
girls-0.276
sets-0.266
 amen-0.259
 nod-0.255
Token,
Token ID11
Feature activation-1.18e-02

Pos logit contributions
girls+0.007
 towns+0.007
sets+0.007
 amen+0.007
 nod+0.007
Neg logit contributions
run-0.012
ang-0.010
udge-0.009
web-0.009
aser-0.009
Token "
Token ID366
Feature activation+3.97e-02

Pos logit contributions
 towns+0.018
girls+0.018
sets+0.017
 amen+0.017
 nod+0.017
Neg logit contributions
run-0.022
ang-0.020
udge-0.018
web-0.018
aser-0.018
TokenDad
Token ID46270
Feature activation-2.33e-02

Pos logit contributions
run+0.073
ang+0.066
udge+0.059
it+0.059
inn+0.054
Neg logit contributions
girls-0.064
 towns-0.061
 amen-0.060
sets-0.059
 conversations-0.058
Token,
Token ID11
Feature activation-1.11e-02

Pos logit contributions
girls+0.030
 towns+0.030
sets+0.028
 amen+0.027
 narrative+0.027
Neg logit contributions
run-0.052
ang-0.044
aser-0.041
udge-0.040
ours-0.040
Token smiled
Token ID13541
Feature activation-9.91e-02

Pos logit contributions
run+0.012
ang+0.010
udge+0.009
web+0.009
aser+0.009
Neg logit contributions
girls-0.009
 towns-0.009
sets-0.009
 amen-0.008
 conversations-0.008
Token and
Token ID290
Feature activation+1.40e-02

Pos logit contributions
girls+0.189
 towns+0.173
 narrative+0.172
 conversations+0.167
sets+0.166
Neg logit contributions
run-0.162
ang-0.136
it-0.124
udge-0.120
web-0.116
Token the
Token ID262
Feature activation-4.58e-02

Pos logit contributions
run+0.031
ang+0.028
udge+0.025
web+0.025
aser+0.025
Neg logit contributions
girls-0.017
 towns-0.017
sets-0.017
 amen-0.016
 conversations-0.016
Token horse
Token ID8223
Feature activation+2.23e-02

Pos logit contributions
girls+0.075
sets+0.073
 overcome+0.072
 achieve+0.071
 narrative+0.070
Neg logit contributions
run-0.075
ang-0.064
web-0.055
aser-0.054
 here-0.051
Token wh
Token ID348
Feature activation-2.23e-02

Pos logit contributions
run+0.043
ang+0.037
web+0.033
aser+0.032
it+0.031
Neg logit contributions
girls-0.033
sets-0.033
 towns-0.033
 narrative-0.032
 amen-0.031
Tokeninn
Token ID3732
Feature activation+1.84e-01

Pos logit contributions
 narrative+0.041
 towns+0.040
girls+0.038
sets+0.038
 amen+0.038
Neg logit contributions
run-0.034
ang-0.024
web-0.023
aser-0.022
 oh-0.022
Tokenied
Token ID798
Feature activation+3.76e-02

Pos logit contributions
run+0.199
ang+0.152
udge+0.124
web+0.118
aser+0.117
Neg logit contributions
girls-0.436
 towns-0.433
sets-0.427
 amen-0.419
 nod-0.414
Token.
Token ID13
Feature activation+3.61e-03

Pos logit contributions
run+0.068
ang+0.058
udge+0.053
web+0.051
aser+0.051
Neg logit contributions
girls-0.062
 towns-0.061
sets-0.060
 amen-0.058
 nod-0.057
Token Todd
Token ID14377
Feature activation+5.40e-03

Pos logit contributions
run+0.007
ang+0.006
ours+0.006
earing+0.006
udge+0.006
Neg logit contributions
girls-0.005
sets-0.005
 narrative-0.005
 towns-0.005
 conversations-0.005
Token was
Token ID373
Feature activation+1.36e-02

Pos logit contributions
run+0.011
ang+0.009
web+0.008
udge+0.008
aser+0.008
Neg logit contributions
girls-0.008
sets-0.008
 towns-0.008
 conversations-0.008
 amen-0.008
Token so
Token ID523
Feature activation+3.39e-03

Pos logit contributions
run+0.026
ang+0.023
udge+0.021
aser+0.021
ours+0.021
Neg logit contributions
 towns-0.021
girls-0.020
sets-0.020
 nod-0.020
 amen-0.019
Token shark
Token ID21027
Feature activation+6.82e-03

Pos logit contributions
girls+0.059
 towns+0.058
sets+0.057
 amen+0.055
 nod+0.054
Neg logit contributions
run-0.100
ang-0.088
udge-0.081
aser-0.080
web-0.080
Token."
Token ID526
Feature activation-5.13e-02

Pos logit contributions
run+0.015
ang+0.014
udge+0.013
aser+0.012
web+0.012
Neg logit contributions
girls-0.008
 towns-0.008
sets-0.008
 amen-0.008
 nod-0.007
Token
Token ID198
Feature activation-1.86e-02

Pos logit contributions
 towns+0.067
girls+0.066
sets+0.065
 amen+0.063
 conversations+0.061
Neg logit contributions
run-0.111
ang-0.097
udge-0.089
aser-0.087
web-0.086
Token
Token ID198
Feature activation-1.94e-02

Pos logit contributions
girls+0.020
 towns+0.019
sets+0.018
 amen+0.018
 nod+0.017
Neg logit contributions
run-0.044
ang-0.039
aser-0.037
udge-0.037
web-0.036
TokenF
Token ID37
Feature activation+1.78e-02

Pos logit contributions
girls+0.024
 towns+0.023
sets+0.021
 nod+0.021
 amen+0.021
Neg logit contributions
run-0.044
ang-0.038
aser-0.036
udge-0.036
ours-0.036
Tokeninn
Token ID3732
Feature activation+1.84e-01

Pos logit contributions
run+0.024
ang+0.021
aser+0.020
web+0.020
umes+0.020
Neg logit contributions
girls-0.030
 nod-0.030
bit-0.027
 towns-0.027
 amen-0.027
Token and
Token ID290
Feature activation+1.24e-02

Pos logit contributions
run+0.364
ang+0.318
aser+0.286
udge+0.285
web+0.283
Neg logit contributions
girls-0.270
sets-0.268
 towns-0.264
 amen-0.256
 conversations-0.253
Token Carl
Token ID8124
Feature activation+2.57e-02

Pos logit contributions
run+0.024
ang+0.020
udge+0.019
aser+0.019
web+0.019
Neg logit contributions
girls-0.020
 towns-0.019
sets-0.018
 nod-0.018
 amen-0.018
Token sw
Token ID1509
Feature activation+4.29e-02

Pos logit contributions
run+0.050
ang+0.043
udge+0.038
earing+0.037
aser+0.037
Neg logit contributions
sets-0.040
girls-0.037
 amen-0.036
 conversations-0.034
Arthur-0.034
Tokenam
Token ID321
Feature activation-5.70e-02

Pos logit contributions
run+0.093
ang+0.084
web+0.077
aser+0.077
udge+0.077
Neg logit contributions
girls-0.055
 towns-0.052
sets-0.049
 nod-0.048
 conversations-0.048
Token as
Token ID355
Feature activation+1.39e-02

Pos logit contributions
girls+0.109
sets+0.108
 towns+0.107
 amen+0.105
 conversations+0.101
Neg logit contributions
run-0.090
ang-0.074
udge-0.065
web-0.062
aser-0.062
Token screaming
Token ID14788
Feature activation-5.65e-02

Pos logit contributions
run+0.033
ang+0.030
udge+0.027
web+0.027
aser+0.026
Neg logit contributions
 towns-0.021
 amen-0.021
 nod-0.021
girls-0.021
 overcome-0.021
Token?"
Token ID1701
Feature activation-2.21e-02

Pos logit contributions
 towns+0.081
girls+0.079
sets+0.079
 amen+0.077
 nod+0.077
Neg logit contributions
run-0.111
ang-0.101
udge-0.090
web-0.085
aser-0.085
Token
Token ID198
Feature activation-1.91e-02

Pos logit contributions
girls+0.024
 towns+0.022
 nod+0.022
 amen+0.021
sets+0.020
Neg logit contributions
run-0.052
ang-0.046
aser-0.045
web-0.045
udge-0.044
Token
Token ID198
Feature activation-1.91e-02

Pos logit contributions
girls+0.018
 towns+0.016
 amen+0.015
 nod+0.015
 overcome+0.015
Neg logit contributions
run-0.048
aser-0.042
ang-0.042
web-0.041
udge-0.040
TokenF
Token ID37
Feature activation+2.07e-02

Pos logit contributions
girls+0.026
 towns+0.025
 nod+0.023
 overcome+0.023
 amen+0.023
Neg logit contributions
run-0.040
aser-0.034
ang-0.033
udge-0.032
web-0.032
Tokeninn
Token ID3732
Feature activation+1.84e-01

Pos logit contributions
run+0.020
umes+0.020
aser+0.019
 unlike+0.019
web+0.019
Neg logit contributions
 nod-0.035
bit-0.035
 voy-0.035
 towns-0.033
girls-0.033
Token said
Token ID531
Feature activation-5.67e-03

Pos logit contributions
run+0.383
ang+0.336
udge+0.308
aser+0.301
web+0.300
Neg logit contributions
girls-0.249
 towns-0.247
sets-0.240
 amen-0.233
 nod-0.230
Token sadly
Token ID21098
Feature activation+1.58e-02

Pos logit contributions
girls+0.007
 towns+0.007
sets+0.006
 amen+0.006
 conversations+0.006
Neg logit contributions
run-0.013
ang-0.011
aser-0.011
udge-0.011
web-0.010
Token,
Token ID11
Feature activation-1.54e-02

Pos logit contributions
run+0.035
ang+0.030
udge+0.028
aser+0.027
web+0.027
Neg logit contributions
girls-0.020
 towns-0.020
sets-0.020
 conversations-0.018
 amen-0.018
Token "
Token ID366
Feature activation+3.75e-02

Pos logit contributions
girls+0.029
 towns+0.028
sets+0.027
 conversations+0.026
 amen+0.026
Neg logit contributions
run-0.025
ang-0.021
udge-0.018
aser-0.018
web-0.018
TokenI
Token ID40
Feature activation+4.14e-02

Pos logit contributions
run+0.080
ang+0.067
ours+0.065
aser+0.064
udge+0.062
Neg logit contributions
 towns-0.048
girls-0.048
sets-0.048
 nod-0.045
Arthur-0.044
Token be
Token ID307
Feature activation-1.86e-02

Pos logit contributions
girls+0.049
sets+0.048
 towns+0.048
 amen+0.045
 conversations+0.043
Neg logit contributions
run-0.082
ang-0.071
udge-0.067
aser-0.065
web-0.065
Token cow
Token ID9875
Feature activation+1.10e-02

Pos logit contributions
girls+0.026
 towns+0.026
sets+0.025
 conversations+0.024
 amen+0.024
Neg logit contributions
run-0.039
ang-0.034
udge-0.031
aser-0.030
ours-0.030
Tokenboys
Token ID13202
Feature activation-1.43e-01

Pos logit contributions
run+0.026
ang+0.025
web+0.023
aser+0.022
udge+0.022
Neg logit contributions
 towns-0.010
 amen-0.010
 narrative-0.009
Arthur-0.009
 nod-0.009
Token and
Token ID290
Feature activation+1.67e-02

Pos logit contributions
 towns+0.244
 narrative+0.236
 amen+0.232
sets+0.227
 hearts+0.223
Neg logit contributions
run-0.241
ang-0.210
udge-0.187
web-0.185
aser-0.182
Token cow
Token ID9875
Feature activation+1.18e-02

Pos logit contributions
run+0.037
ang+0.032
aser+0.029
web+0.028
ours+0.028
Neg logit contributions
 towns-0.019
sets-0.019
 amen-0.018
 narrative-0.018
 nod-0.018
Tokengirls
Token ID36960
Feature activation-2.32e-01

Pos logit contributions
run+0.033
ang+0.031
web+0.028
aser+0.028
udge+0.028
Neg logit contributions
 amen-0.006
 towns-0.006
 narrative-0.006
 nod-0.006
Arthur-0.005
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
 towns+0.342
sets+0.335
 narrative+0.323
 amen+0.322
 distance+0.318
Neg logit contributions
run-0.439
ang-0.370
udge-0.338
web-0.338
aser-0.329
Token They
Token ID1119
Feature activation-1.11e-01

Pos logit contributions
run+0.031
ang+0.027
aser+0.027
web+0.026
udge+0.025
Neg logit contributions
girls-0.016
 towns-0.016
 nod-0.015
sets-0.014
 conversations-0.014
Token put
Token ID1234
Feature activation+4.30e-02

Pos logit contributions
sets+0.167
 towns+0.155
girls+0.153
 conversations+0.153
 amen+0.149
Neg logit contributions
run-0.217
ang-0.189
udge-0.169
aser-0.167
ours-0.164
Token on
Token ID319
Feature activation+2.49e-02

Pos logit contributions
run+0.089
ang+0.077
udge+0.072
web+0.070
aser+0.069
Neg logit contributions
girls-0.063
 towns-0.060
sets-0.059
 hearts-0.056
 conversations-0.056
Token hats
Token ID23910
Feature activation-6.30e-02

Pos logit contributions
run+0.048
ang+0.041
udge+0.038
ours+0.038
web+0.037
Neg logit contributions
 amen-0.035
girls-0.034
sets-0.034
 towns-0.034
 nod-0.033
Token forgot
Token ID16453
Feature activation-4.07e-02

Pos logit contributions
run+0.034
ang+0.025
earing+0.023
aser+0.023
udge+0.023
Neg logit contributions
sets-0.035
girls-0.035
 towns-0.033
 amen-0.032
Arthur-0.031
Token the
Token ID262
Feature activation-4.79e-02

Pos logit contributions
girls+0.071
 towns+0.069
sets+0.069
 amen+0.067
 nod+0.066
Neg logit contributions
run-0.071
ang-0.057
udge-0.052
aser-0.051
web-0.051
Token lion
Token ID18744
Feature activation+9.85e-03

Pos logit contributions
girls+0.072
 towns+0.069
sets+0.069
 amen+0.066
 conversations+0.064
Neg logit contributions
run-0.093
ang-0.080
aser-0.073
udge-0.073
web-0.073
Token and
Token ID290
Feature activation+1.45e-02

Pos logit contributions
run+0.020
ang+0.017
web+0.015
aser+0.015
udge+0.015
Neg logit contributions
 towns-0.014
girls-0.014
sets-0.014
 amen-0.014
 conversations-0.013
Token his
Token ID465
Feature activation+2.80e-02

Pos logit contributions
run+0.030
ang+0.026
udge+0.024
web+0.024
aser+0.024
Neg logit contributions
girls-0.020
 towns-0.019
sets-0.019
 amen-0.019
 nod-0.018
Token mercy
Token ID17703
Feature activation-2.30e-01

Pos logit contributions
run+0.056
ang+0.049
udge+0.045
aser+0.044
web+0.044
Neg logit contributions
girls-0.040
 towns-0.040
sets-0.039
 amen-0.038
 nod-0.037
Token.
Token ID13
Feature activation+6.78e-03

Pos logit contributions
girls+0.328
sets+0.324
 towns+0.322
 amen+0.304
 conversations+0.294
Neg logit contributions
run-0.477
ang-0.417
udge-0.383
web-0.370
aser-0.369
Token He
Token ID679
Feature activation-4.28e-02

Pos logit contributions
run+0.016
ang+0.014
udge+0.013
web+0.012
aser+0.012
Neg logit contributions
girls-0.008
sets-0.008
 towns-0.008
 amen-0.008
 conversations-0.007
Token always
Token ID1464
Feature activation-5.43e-02

Pos logit contributions
sets+0.062
 towns+0.059
girls+0.058
 conversations+0.055
 amen+0.055
Neg logit contributions
run-0.085
ang-0.075
web-0.069
udge-0.066
aser-0.066
Token tried
Token ID3088
Feature activation-9.42e-04

Pos logit contributions
girls+0.081
sets+0.079
 towns+0.079
 conversations+0.074
 amen+0.072
Neg logit contributions
run-0.107
ang-0.090
web-0.082
ours-0.080
aser-0.079
Token to
Token ID284
Feature activation-3.89e-02

Pos logit contributions
girls+0.002
 towns+0.001
sets+0.001
 amen+0.001
 nod+0.001
Neg logit contributions
run-0.002
ang-0.001
udge-0.001
aser-0.001
web-0.001
Token play
Token ID711
Feature activation-1.01e-02

Pos logit contributions
sets+0.099
girls+0.094
 towns+0.094
 amen+0.092
 conversations+0.091
Neg logit contributions
run-0.137
ang-0.120
udge-0.107
aser-0.105
ours-0.104
Token cow
Token ID9875
Feature activation+8.72e-03

Pos logit contributions
 towns+0.018
girls+0.018
sets+0.017
 conversations+0.017
 amen+0.017
Neg logit contributions
run-0.018
ang-0.015
aser-0.013
udge-0.013
ours-0.013
Tokenboys
Token ID13202
Feature activation-1.44e-01

Pos logit contributions
run+0.020
ang+0.018
udge+0.016
web+0.016
aser+0.016
Neg logit contributions
 towns-0.010
girls-0.010
sets-0.010
 amen-0.009
 nod-0.009
Token and
Token ID290
Feature activation+1.60e-02

Pos logit contributions
 towns+0.226
 narrative+0.218
sets+0.216
 amen+0.214
girls+0.212
Neg logit contributions
run-0.266
ang-0.229
web-0.201
udge-0.200
aser-0.199
Token cow
Token ID9875
Feature activation+8.93e-03

Pos logit contributions
run+0.035
ang+0.030
web+0.026
aser+0.026
ours+0.026
Neg logit contributions
sets-0.018
 narrative-0.018
 amen-0.018
 towns-0.017
 nod-0.017
Tokengirls
Token ID36960
Feature activation-2.27e-01

Pos logit contributions
run+0.022
ang+0.020
web+0.018
aser+0.018
udge+0.018
Neg logit contributions
 amen-0.008
 narrative-0.008
 towns-0.008
 nod-0.007
sets-0.007
Token,
Token ID11
Feature activation-7.43e-03

Pos logit contributions
sets+0.342
 towns+0.341
 narrative+0.333
andra+0.325
 amen+0.324
Neg logit contributions
run-0.416
ang-0.347
udge-0.311
web-0.309
it-0.302
Token and
Token ID290
Feature activation+1.59e-02

Pos logit contributions
sets+0.011
girls+0.011
 towns+0.011
 amen+0.011
 conversations+0.010
Neg logit contributions
run-0.014
ang-0.012
udge-0.011
web-0.010
aser-0.010
Token pretend
Token ID16614
Feature activation+4.28e-02

Pos logit contributions
run+0.034
ang+0.029
web+0.026
aser+0.026
udge+0.026
Neg logit contributions
girls-0.020
sets-0.020
 towns-0.019
 amen-0.019
 nod-0.018
Token to
Token ID284
Feature activation-3.73e-02

Pos logit contributions
run+0.087
ang+0.075
udge+0.068
web+0.068
aser+0.068
Neg logit contributions
girls-0.060
 towns-0.060
sets-0.058
 amen-0.057
 nod-0.056
Token ride
Token ID6594
Feature activation+5.76e-03

Pos logit contributions
sets+0.053
 amen+0.047
 towns+0.047
 narrative+0.046
girls+0.045
Neg logit contributions
run-0.078
ang-0.067
udge-0.061
web-0.059
ours-0.055
Token your
Token ID534
Feature activation+3.96e-02

Pos logit contributions
girls+0.083
 towns+0.082
sets+0.080
 conversations+0.077
 amen+0.077
Neg logit contributions
run-0.089
ang-0.075
aser-0.067
udge-0.067
web-0.066
Token fault
Token ID8046
Feature activation-2.33e-02

Pos logit contributions
run+0.087
ang+0.077
udge+0.071
web+0.069
aser+0.069
Neg logit contributions
girls-0.050
 towns-0.050
sets-0.048
 amen-0.047
 nod-0.047
Token and
Token ID290
Feature activation+1.72e-02

Pos logit contributions
 towns+0.031
girls+0.030
sets+0.029
 amen+0.028
 conversations+0.028
Neg logit contributions
run-0.050
ang-0.044
aser-0.040
web-0.040
udge-0.040
Token ask
Token ID1265
Feature activation-2.47e-02

Pos logit contributions
run+0.032
ang+0.028
udge+0.026
aser+0.024
web+0.024
Neg logit contributions
girls-0.028
 towns-0.028
sets-0.027
 amen-0.027
 nod-0.026
Token for
Token ID329
Feature activation+4.73e-02

Pos logit contributions
girls+0.040
 towns+0.040
sets+0.039
 amen+0.038
 nod+0.037
Neg logit contributions
run-0.045
ang-0.039
udge-0.035
aser-0.034
web-0.034
Token mercy
Token ID17703
Feature activation-2.26e-01

Pos logit contributions
run+0.111
ang+0.097
udge+0.090
web+0.090
aser+0.089
Neg logit contributions
girls-0.053
sets-0.052
 towns-0.051
 amen-0.047
 nod-0.046
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
sets+0.303
girls+0.301
 towns+0.296
 amen+0.279
 nod+0.269
Neg logit contributions
run-0.489
ang-0.422
udge-0.387
web-0.374
aser-0.373
Token I
Token ID314
Feature activation+2.73e-02

Pos logit contributions
run+0.029
ang+0.025
udge+0.023
aser+0.022
web+0.022
Neg logit contributions
girls-0.025
 towns-0.025
 amen-0.024
sets-0.024
 nod-0.023
Token will
Token ID481
Feature activation-1.38e-01

Pos logit contributions
run+0.063
ang+0.056
udge+0.051
aser+0.051
web+0.051
Neg logit contributions
girls-0.032
 towns-0.031
sets-0.030
 amen-0.029
 nod-0.029
Token spare
Token ID13952
Feature activation-4.04e-02

Pos logit contributions
sets+0.212
 towns+0.208
girls+0.201
 conversations+0.187
 narrative+0.185
Neg logit contributions
run-0.257
ang-0.213
web-0.199
udge-0.198
 here-0.198
Token you
Token ID345
Feature activation+1.49e-02

Pos logit contributions
 towns+0.063
girls+0.063
sets+0.061
 amen+0.061
 conversations+0.059
Neg logit contributions
run-0.076
ang-0.066
udge-0.061
ours-0.058
aser-0.058
Token my
Token ID616
Feature activation+7.24e-02

Pos logit contributions
girls+0.018
 towns+0.017
sets+0.017
 amen+0.017
 nod+0.016
Neg logit contributions
run-0.020
ang-0.017
udge-0.015
aser-0.015
web-0.015
Token cow
Token ID9875
Feature activation+9.78e-03

Pos logit contributions
run+0.139
ang+0.119
udge+0.108
aser+0.108
ours+0.107
Neg logit contributions
girls-0.112
 towns-0.111
sets-0.109
 conversations-0.101
 amen-0.099
Tokenboys
Token ID13202
Feature activation-1.43e-01

Pos logit contributions
run+0.020
ang+0.017
web+0.016
udge+0.016
aser+0.016
Neg logit contributions
 towns-0.014
girls-0.013
sets-0.013
 amen-0.013
 conversations-0.012
Token and
Token ID290
Feature activation+1.63e-02

Pos logit contributions
 towns+0.215
sets+0.203
girls+0.200
 conversations+0.197
 amen+0.196
Neg logit contributions
run-0.279
ang-0.240
udge-0.217
aser-0.216
web-0.215
Token cow
Token ID9875
Feature activation+1.01e-02

Pos logit contributions
run+0.033
ang+0.028
aser+0.025
ours+0.025
web+0.025
Neg logit contributions
 towns-0.022
girls-0.021
sets-0.021
 nod-0.021
 amen-0.021
Tokengirls
Token ID36960
Feature activation-2.25e-01

Pos logit contributions
run+0.021
ang+0.018
web+0.016
aser+0.016
udge+0.016
Neg logit contributions
 towns-0.014
sets-0.013
 amen-0.013
girls-0.013
 narrative-0.013
Token.
Token ID13
Feature activation+2.18e-02

Pos logit contributions
 towns+0.328
sets+0.323
girls+0.306
 conversations+0.298
 narrative+0.298
Neg logit contributions
run-0.436
ang-0.375
udge-0.342
web-0.337
it-0.336
Token I
Token ID314
Feature activation+3.05e-02

Pos logit contributions
run+0.038
ang+0.031
udge+0.028
ours+0.027
it+0.026
Neg logit contributions
 towns-0.037
sets-0.037
girls-0.036
 conversations-0.036
 overcome-0.035
Token love
Token ID1842
Feature activation-1.65e-02

Pos logit contributions
run+0.065
ang+0.057
udge+0.052
web+0.051
aser+0.050
Neg logit contributions
 towns-0.040
sets-0.040
girls-0.039
 amen-0.036
 conversations-0.036
Token you
Token ID345
Feature activation+1.56e-02

Pos logit contributions
girls+0.028
 towns+0.028
sets+0.027
 amen+0.027
 nod+0.027
Neg logit contributions
run-0.029
ang-0.024
udge-0.022
aser-0.021
web-0.021
Token."
Token ID526
Feature activation-4.21e-02

Pos logit contributions
run+0.024
ang+0.021
it+0.019
aser+0.019
web+0.018
Neg logit contributions
 towns-0.027
sets-0.026
girls-0.026
 rep-0.025
 narrative-0.025
Token Tom
Token ID4186
Feature activation-8.47e-03

Pos logit contributions
run+0.019
ang+0.015
ours+0.014
aser+0.013
 oh+0.013
Neg logit contributions
sets-0.033
 towns-0.032
 conversations-0.032
 Amen-0.032
girls-0.031
Token cried
Token ID16896
Feature activation-1.49e-02

Pos logit contributions
sets+0.016
 avail+0.015
girls+0.015
 Amen+0.014
 conversations+0.014
Neg logit contributions
run-0.014
ang-0.011
 else-0.011
it-0.011
a-0.010
Token and
Token ID290
Feature activation+1.73e-02

Pos logit contributions
sets+0.028
girls+0.027
andra+0.027
Andrew+0.027
 towns+0.026
Neg logit contributions
run-0.025
udge-0.018
ang-0.018
 at-0.017
--0.017
Token begged
Token ID32946
Feature activation-2.98e-02

Pos logit contributions
run+0.032
ang+0.025
udge+0.022
aser+0.022
ours+0.022
Neg logit contributions
sets-0.031
 towns-0.029
girls-0.029
 conversations-0.028
 amen-0.026
Token for
Token ID329
Feature activation+5.00e-02

Pos logit contributions
sets+0.053
girls+0.049
 towns+0.048
 amen+0.044
Arthur+0.044
Neg logit contributions
run-0.055
udge-0.042
ang-0.041
ours-0.039
ula-0.038
Token mercy
Token ID17703
Feature activation-2.24e-01

Pos logit contributions
run+0.099
ang+0.074
web+0.072
udge+0.072
aser+0.071
Neg logit contributions
sets-0.080
 towns-0.071
girls-0.071
bit-0.068
 conversations-0.068
Token.
Token ID13
Feature activation+1.27e-02

Pos logit contributions
sets+0.321
girls+0.309
 towns+0.307
 amen+0.288
Arthur+0.275
Neg logit contributions
run-0.484
ang-0.409
udge-0.383
web-0.357
aser-0.355
Token They
Token ID1119
Feature activation-1.10e-01

Pos logit contributions
run+0.024
ang+0.021
udge+0.019
ours+0.019
earing+0.018
Neg logit contributions
sets-0.019
girls-0.018
 feast-0.018
 towns-0.018
 voy-0.017
Token wished
Token ID16555
Feature activation+5.21e-02

Pos logit contributions
sets+0.204
 towns+0.184
girls+0.181
 conversations+0.180
 amen+0.170
Neg logit contributions
run-0.201
ang-0.161
aser-0.142
earing-0.139
udge-0.138
Token they
Token ID484
Feature activation-7.16e-02

Pos logit contributions
run+0.105
ang+0.088
ours+0.080
aser+0.079
udge+0.079
Neg logit contributions
girls-0.076
sets-0.074
 towns-0.073
 conversations-0.069
 nod-0.067
Token had
Token ID550
Feature activation+3.15e-02

Pos logit contributions
sets+0.117
 towns+0.112
girls+0.110
 conversations+0.109
 amen+0.108
Neg logit contributions
run-0.136
ang-0.104
web-0.099
aser-0.098
it-0.095
Token very
Token ID845
Feature activation+4.76e-02

Pos logit contributions
run+0.031
ang+0.029
aser+0.027
udge+0.027
ours+0.027
Neg logit contributions
girls-0.024
 towns-0.023
 nod-0.023
 amen-0.022
 overcome-0.022
Token scared
Token ID12008
Feature activation-2.25e-02

Pos logit contributions
run+0.094
ang+0.082
aser+0.082
udge+0.080
web+0.079
Neg logit contributions
 nod-0.065
 amen-0.062
girls-0.062
 towns-0.061
 chatter-0.060
Token and
Token ID290
Feature activation+1.67e-02

Pos logit contributions
girls+0.031
 towns+0.031
sets+0.030
 amen+0.029
 conversations+0.028
Neg logit contributions
run-0.048
ang-0.041
udge-0.038
aser-0.037
web-0.037
Token begged
Token ID32946
Feature activation-3.14e-02

Pos logit contributions
run+0.035
ang+0.032
udge+0.031
aser+0.029
web+0.028
Neg logit contributions
girls-0.022
 amen-0.022
 towns-0.021
 nod-0.021
 chatter-0.020
Token for
Token ID329
Feature activation+4.82e-02

Pos logit contributions
sets+0.053
girls+0.049
 towns+0.048
 narrative+0.046
 conversations+0.046
Neg logit contributions
run-0.059
udge-0.045
ang-0.044
aser-0.042
inn-0.041
Token mercy
Token ID17703
Feature activation-2.24e-01

Pos logit contributions
run+0.103
ang+0.081
web+0.080
udge+0.077
aser+0.077
Neg logit contributions
sets-0.068
 towns-0.063
girls-0.063
 conversations-0.058
 narrative-0.057
Token.
Token ID13
Feature activation+6.80e-03

Pos logit contributions
sets+0.331
 towns+0.319
girls+0.313
 amen+0.294
Arthur+0.285
Neg logit contributions
run-0.473
ang-0.401
udge-0.372
web-0.348
aser-0.344
Token He
Token ID679
Feature activation-3.65e-02

Pos logit contributions
run+0.014
ang+0.012
udge+0.011
ours+0.011
aser+0.011
Neg logit contributions
sets-0.009
 towns-0.009
girls-0.009
 conversations-0.009
 amen-0.009
Token said
Token ID531
Feature activation-4.04e-04

Pos logit contributions
girls+0.043
 towns+0.040
 amen+0.040
 nod+0.040
 overcome+0.036
Neg logit contributions
run-0.083
ang-0.075
udge-0.071
aser-0.069
ours-0.068
Token,
Token ID11
Feature activation-8.76e-03

Pos logit contributions
girls+0.001
sets+0.001
 towns+0.001
 conversations+0.000
 chatter+0.000
Neg logit contributions
run-0.001
ang-0.001
aser-0.001
web-0.001
ours-0.001
Token '
Token ID705
Feature activation+7.53e-02

Pos logit contributions
girls+0.016
 towns+0.016
sets+0.016
 conversations+0.015
 amen+0.015
Neg logit contributions
run-0.015
ang-0.012
aser-0.010
udge-0.010
web-0.010
Token at
Token ID379
Feature activation+2.69e-02

Pos logit contributions
run+0.021
ang+0.019
web+0.018
udge+0.017
aser+0.015
Neg logit contributions
 nod-0.024
 towns-0.024
 hearts-0.023
 amen-0.023
 conversations-0.023
Token each
Token ID1123
Feature activation-5.87e-02

Pos logit contributions
run+0.062
ang+0.056
udge+0.051
aser+0.051
web+0.051
Neg logit contributions
girls-0.031
 towns-0.031
sets-0.029
 amen-0.028
 nod-0.028
Token other
Token ID584
Feature activation-7.68e-02

Pos logit contributions
 towns+0.052
girls+0.051
sets+0.048
 nod+0.046
 amen+0.046
Neg logit contributions
run-0.152
ang-0.136
udge-0.127
web-0.126
aser-0.125
Token.
Token ID13
Feature activation+5.52e-03

Pos logit contributions
girls+0.109
 towns+0.107
sets+0.106
 amen+0.103
 conversations+0.099
Neg logit contributions
run-0.155
ang-0.132
aser-0.121
udge-0.119
web-0.118
Token They
Token ID1119
Feature activation-1.18e-01

Pos logit contributions
run+0.014
ang+0.012
udge+0.011
aser+0.011
web+0.011
Neg logit contributions
girls-0.005
 towns-0.005
sets-0.005
 amen-0.005
 conversations-0.005
Token nod
Token ID18666
Feature activation-2.03e-01

Pos logit contributions
sets+0.173
girls+0.160
 towns+0.153
 conversations+0.152
 amen+0.147
Neg logit contributions
run-0.226
ang-0.197
udge-0.180
aser-0.177
ours-0.176
Token.
Token ID13
Feature activation-5.99e-04

Pos logit contributions
girls+0.365
 towns+0.361
sets+0.356
Arthur+0.335
 feast+0.333
Neg logit contributions
run-0.354
it-0.253
ang-0.251
udge-0.247
--0.238
Token
Token ID198
Feature activation-2.02e-02

Pos logit contributions
 towns+0.001
girls+0.001
sets+0.001
 amen+0.001
 feast+0.001
Neg logit contributions
run-0.001
ang-0.001
udge-0.001
ours-0.001
it-0.001
Token
Token ID198
Feature activation-2.05e-02

Pos logit contributions
girls+0.023
 towns+0.022
sets+0.021
 amen+0.021
 nod+0.021
Neg logit contributions
run-0.047
ang-0.042
aser-0.039
udge-0.038
web-0.038
TokenS
Token ID50
Feature activation+1.98e-02

Pos logit contributions
 towns+0.023
girls+0.023
 amen+0.021
 nod+0.021
sets+0.020
Neg logit contributions
run-0.048
ang-0.041
aser-0.040
udge-0.039
web-0.039
Tokenara
Token ID3301
Feature activation+6.95e-02

Pos logit contributions
run+0.025
ang+0.022
ours+0.022
aser+0.021
out+0.019
Neg logit contributions
 Amen-0.042
 towns-0.042
girls-0.041
 chatter-0.041
sets-0.040
Token
Token ID198
Feature activation-1.05e-02

Pos logit contributions
run+0.031
ang+0.025
udge+0.023
it+0.022
ours+0.022
Neg logit contributions
girls-0.032
 towns-0.032
sets-0.032
 conversations-0.030
 feast-0.030
Token
Token ID198
Feature activation-1.15e-02

Pos logit contributions
sets+0.016
girls+0.016
 towns+0.016
 conversations+0.015
 amen+0.015
Neg logit contributions
run-0.021
ang-0.018
udge-0.016
web-0.016
it-0.016
TokenThey
Token ID2990
Feature activation-1.28e-01

Pos logit contributions
sets+0.017
girls+0.016
 feast+0.016
 amen+0.016
 conversations+0.015
Neg logit contributions
run-0.023
ang-0.022
it-0.019
udge-0.019
web-0.019
Token smile
Token ID8212
Feature activation-1.15e-01

Pos logit contributions
sets+0.201
 conversations+0.177
girls+0.174
 towns+0.163
rob+0.158
Neg logit contributions
run-0.236
ang-0.199
aser-0.190
udge-0.188
ours-0.186
Token and
Token ID290
Feature activation+1.10e-02

Pos logit contributions
 towns+0.222
girls+0.214
Andrew+0.211
sets+0.210
andra+0.209
Neg logit contributions
run-0.164
 at-0.132
ang-0.121
it-0.118
 inside-0.118
Token nod
Token ID18666
Feature activation-2.01e-01

Pos logit contributions
run+0.017
bug+0.014
ang+0.014
 else+0.014
 inside+0.013
Neg logit contributions
sets-0.017
Andrew-0.017
Sir-0.016
Jose-0.016
 towns-0.016
Token.
Token ID13
Feature activation+1.40e-02

Pos logit contributions
sets+0.415
 towns+0.406
girls+0.396
rob+0.395
andra+0.383
Neg logit contributions
run-0.304
 at-0.213
--0.208
udge-0.200
it-0.200
Token They
Token ID1119
Feature activation-1.23e-01

Pos logit contributions
run+0.026
ang+0.022
ours+0.021
udge+0.021
elly+0.019
Neg logit contributions
sets-0.021
 voy-0.020
 feast-0.020
 towns-0.020
girls-0.020
Token take
Token ID1011
Feature activation+2.38e-02

Pos logit contributions
sets+0.216
 conversations+0.190
Andrew+0.189
girls+0.188
 towns+0.185
Neg logit contributions
run-0.167
 else-0.154
ang-0.150
bug-0.147
earing-0.146
Token turns
Token ID4962
Feature activation+2.00e-03

Pos logit contributions
run+0.037
ang+0.028
udge+0.028
ours+0.027
out+0.027
Neg logit contributions
sets-0.044
 amen-0.043
girls-0.043
 Amen-0.041
 rep-0.041
Token to
Token ID284
Feature activation-4.05e-02

Pos logit contributions
run+0.003
ang+0.003
udge+0.002
web+0.002
aser+0.002
Neg logit contributions
 towns-0.004
girls-0.004
sets-0.004
 amen-0.003
 narrative-0.003
Token and
Token ID290
Feature activation+1.70e-02

Pos logit contributions
run+0.090
ang+0.077
ours+0.072
udge+0.068
aser+0.065
Neg logit contributions
 towns-0.095
girls-0.094
sets-0.094
 conversations-0.090
 distance-0.089
Token Ben
Token ID3932
Feature activation-2.81e-03

Pos logit contributions
run+0.035
ang+0.031
udge+0.028
web+0.028
aser+0.028
Neg logit contributions
girls-0.024
 towns-0.023
sets-0.023
 amen-0.022
 nod-0.022
Token say
Token ID910
Feature activation-2.34e-02

Pos logit contributions
sets+0.004
 towns+0.004
girls+0.004
 conversations+0.004
 narrative+0.004
Neg logit contributions
run-0.005
ang-0.004
ours-0.004
udge-0.004
aser-0.004
Token.
Token ID13
Feature activation+1.59e-02

Pos logit contributions
sets+0.040
 towns+0.039
girls+0.039
 distance+0.037
 islands+0.036
Neg logit contributions
run-0.045
ang-0.034
aser-0.032
it-0.031
earing-0.031
Token They
Token ID1119
Feature activation-1.16e-01

Pos logit contributions
run+0.031
ang+0.026
udge+0.024
ours+0.023
it+0.023
Neg logit contributions
girls-0.024
 towns-0.024
sets-0.024
 feast-0.023
 conversations-0.023
Token nod
Token ID18666
Feature activation-2.00e-01

Pos logit contributions
sets+0.172
girls+0.160
 conversations+0.156
 towns+0.153
 amen+0.149
Neg logit contributions
run-0.231
ang-0.201
udge-0.185
aser-0.176
ours-0.175
Token and
Token ID290
Feature activation+1.11e-02

Pos logit contributions
girls+0.397
sets+0.388
 towns+0.380
rob+0.366
 feast+0.360
Neg logit contributions
run-0.319
udge-0.217
it-0.216
--0.215
ang-0.211
Token smile
Token ID8212
Feature activation-1.17e-01

Pos logit contributions
run+0.021
ang+0.017
 else+0.016
udge+0.015
 here+0.015
Neg logit contributions
sets-0.015
andra-0.014
 voy-0.014
Andrew-0.014
 towns-0.014
Token.
Token ID13
Feature activation+1.11e-02

Pos logit contributions
 towns+0.213
girls+0.205
 narrative+0.203
 distance+0.199
andra+0.198
Neg logit contributions
run-0.179
ang-0.143
 at-0.141
it-0.139
--0.131
Token They
Token ID1119
Feature activation-1.22e-01

Pos logit contributions
run+0.026
ang+0.023
udge+0.021
ours+0.020
it+0.020
Neg logit contributions
girls-0.013
 towns-0.013
sets-0.013
 amen-0.012
 conversations-0.012
Token learn
Token ID2193
Feature activation-8.78e-02

Pos logit contributions
sets+0.198
girls+0.176
 conversations+0.176
 towns+0.173
 amen+0.166
Neg logit contributions
run-0.206
ang-0.187
udge-0.171
earing-0.163
aser-0.162
Token sorry
Token ID7926
Feature activation-1.15e-01

Pos logit contributions
girls+0.044
sets+0.043
 towns+0.043
 amen+0.041
 conversations+0.040
Neg logit contributions
run-0.049
ang-0.042
udge-0.038
aser-0.037
web-0.037
Token for
Token ID329
Feature activation+4.33e-02

Pos logit contributions
girls+0.177
 towns+0.175
sets+0.170
 amen+0.162
 conversations+0.162
Neg logit contributions
run-0.227
ang-0.192
aser-0.177
udge-0.176
web-0.173
Token fighting
Token ID4330
Feature activation-4.29e-02

Pos logit contributions
run+0.074
ang+0.060
web+0.055
udge+0.055
ours+0.053
Neg logit contributions
 towns-0.073
sets-0.073
girls-0.073
 conversations-0.070
Arthur-0.068
Token.
Token ID13
Feature activation+3.80e-03

Pos logit contributions
girls+0.059
 towns+0.058
 amen+0.056
sets+0.056
 conversations+0.053
Neg logit contributions
run-0.090
ang-0.075
udge-0.073
aser-0.071
web-0.071
Token They
Token ID1119
Feature activation-1.22e-01

Pos logit contributions
run+0.008
ang+0.007
udge+0.007
ours+0.007
it+0.006
Neg logit contributions
sets-0.005
girls-0.005
 towns-0.005
 feast-0.005
 overcome-0.004
Token nod
Token ID18666
Feature activation-2.00e-01

Pos logit contributions
sets+0.203
girls+0.185
 towns+0.177
 conversations+0.176
 islands+0.160
Neg logit contributions
run-0.208
ang-0.175
earing-0.172
aser-0.172
 else-0.171
Token their
Token ID511
Feature activation-4.88e-02

Pos logit contributions
girls+0.389
 towns+0.388
sets+0.383
 feast+0.356
Arthur+0.355
Neg logit contributions
run-0.322
it-0.228
udge-0.223
--0.221
 at-0.216
Token heads
Token ID6665
Feature activation-1.07e-01

Pos logit contributions
sets+0.064
girls+0.060
 towns+0.059
 amen+0.057
 conversations+0.056
Neg logit contributions
run-0.101
ang-0.089
aser-0.084
ours-0.081
udge-0.081
Token.
Token ID13
Feature activation+3.01e-03

Pos logit contributions
girls+0.188
 towns+0.180
sets+0.176
Arthur+0.169
 narrative+0.166
Neg logit contributions
run-0.193
ang-0.151
it-0.141
aser-0.139
udge-0.137
Token
Token ID198
Feature activation-1.91e-02

Pos logit contributions
run+0.006
ang+0.005
udge+0.004
ours+0.004
it+0.004
Neg logit contributions
 towns-0.005
 feast-0.005
sets-0.005
girls-0.005
 overcome-0.004
Token
Token ID198
Feature activation-1.95e-02

Pos logit contributions
girls+0.024
 towns+0.024
sets+0.024
 amen+0.022
 conversations+0.022
Neg logit contributions
run-0.042
ang-0.037
udge-0.034
web-0.033
aser-0.033
Token They
Token ID1119
Feature activation-1.18e-01

Pos logit contributions
run+0.015
ang+0.014
udge+0.013
web+0.013
ours+0.013
Neg logit contributions
girls-0.005
 towns-0.005
sets-0.005
 amen-0.005
 conversations-0.005
Token are
Token ID389
Feature activation-6.23e-02

Pos logit contributions
sets+0.191
girls+0.185
 towns+0.180
 conversations+0.172
 amen+0.171
Neg logit contributions
run-0.216
ang-0.184
udge-0.168
aser-0.163
earing-0.159
Token kind
Token ID1611
Feature activation-8.72e-02

Pos logit contributions
girls+0.087
 towns+0.084
 amen+0.082
 nod+0.078
 hearts+0.077
Neg logit contributions
run-0.135
ang-0.118
udge-0.111
aser-0.108
it-0.106
Token.
Token ID13
Feature activation+6.94e-03

Pos logit contributions
Arthur+0.174
 achieve+0.170
sets+0.163
 amen+0.163
 overcome+0.161
Neg logit contributions
run-0.135
--0.100
ang-0.097
udge-0.095
elly-0.092
Token They
Token ID1119
Feature activation-1.20e-01

Pos logit contributions
run+0.017
ang+0.015
udge+0.014
ours+0.013
inn+0.013
Neg logit contributions
sets-0.007
girls-0.007
 towns-0.007
 amen-0.007
 overcome-0.007
Token nod
Token ID18666
Feature activation-2.00e-01

Pos logit contributions
sets+0.208
girls+0.196
 towns+0.189
 conversations+0.184
 amen+0.178
Neg logit contributions
run-0.203
ang-0.173
udge-0.157
aser-0.156
ours-0.152
Token and
Token ID290
Feature activation+9.52e-03

Pos logit contributions
girls+0.356
sets+0.352
 towns+0.347
 voy+0.326
Arthur+0.325
Neg logit contributions
run-0.348
ang-0.256
udge-0.252
it-0.245
web-0.236
Token smile
Token ID8212
Feature activation-1.17e-01

Pos logit contributions
run+0.018
ang+0.015
 else+0.015
ours+0.014
bug+0.014
Neg logit contributions
sets-0.013
rob-0.012
 voy-0.012
Andrew-0.012
girls-0.011
Token.
Token ID13
Feature activation+2.39e-03

Pos logit contributions
 towns+0.207
girls+0.202
 narrative+0.195
sets+0.195
andra+0.195
Neg logit contributions
run-0.185
ang-0.149
 at-0.144
it-0.143
--0.131
Token
Token ID198
Feature activation-1.91e-02

Pos logit contributions
run+0.005
ang+0.005
udge+0.004
ours+0.004
inn+0.004
Neg logit contributions
sets-0.003
girls-0.003
 towns-0.003
 feast-0.003
 amen-0.003
Token
Token ID198
Feature activation-1.93e-02

Pos logit contributions
girls+0.023
 towns+0.022
sets+0.022
 amen+0.021
 nod+0.021
Neg logit contributions
run-0.043
ang-0.038
udge-0.035
aser-0.035
web-0.035
Token polite
Token ID23507
Feature activation-9.02e-02

Pos logit contributions
girls+0.081
 towns+0.078
 amen+0.076
 nod+0.075
sets+0.072
Neg logit contributions
run-0.130
ang-0.115
udge-0.108
aser-0.106
web-0.103
Token.
Token ID13
Feature activation+8.44e-03

Pos logit contributions
 towns+0.151
sets+0.150
Arthur+0.150
 amen+0.149
girls+0.147
Neg logit contributions
run-0.164
ang-0.126
udge-0.121
it-0.115
aser-0.114
Token They
Token ID1119
Feature activation-1.18e-01

Pos logit contributions
run+0.022
ang+0.019
udge+0.018
ours+0.018
aser+0.017
Neg logit contributions
girls-0.008
 towns-0.008
sets-0.008
 amen-0.007
 conversations-0.007
Token smile
Token ID8212
Feature activation-1.14e-01

Pos logit contributions
sets+0.175
girls+0.161
 towns+0.158
 conversations+0.156
 amen+0.149
Neg logit contributions
run-0.228
ang-0.198
ours-0.180
udge-0.179
aser-0.178
Token and
Token ID290
Feature activation+1.31e-02

Pos logit contributions
 towns+0.211
girls+0.207
 narrative+0.199
sets+0.198
Arthur+0.194
Neg logit contributions
run-0.181
ang-0.142
it-0.130
ours-0.126
aser-0.121
Token nod
Token ID18666
Feature activation-2.00e-01

Pos logit contributions
run+0.025
ang+0.021
ours+0.020
 else+0.019
udge+0.019
Neg logit contributions
sets-0.018
 towns-0.017
 voy-0.016
girls-0.015
Andrew-0.015
Token.
Token ID13
Feature activation+4.65e-03

Pos logit contributions
sets+0.335
 towns+0.335
girls+0.332
 voy+0.311
Arthur+0.306
Neg logit contributions
run-0.370
ang-0.277
udge-0.271
it-0.264
--0.252
Token
Token ID198
Feature activation-1.71e-02

Pos logit contributions
run+0.011
ang+0.010
udge+0.009
ours+0.009
it+0.009
Neg logit contributions
girls-0.005
sets-0.005
 towns-0.005
 conversations-0.004
 amen-0.004
Token
Token ID198
Feature activation-1.77e-02

Pos logit contributions
girls+0.019
 towns+0.019
sets+0.018
 amen+0.018
 nod+0.017
Neg logit contributions
run-0.040
ang-0.035
udge-0.033
aser-0.032
web-0.032
Token"
Token ID1
Feature activation+4.68e-02

Pos logit contributions
 towns+0.025
girls+0.025
 amen+0.023
sets+0.023
 nod+0.023
Neg logit contributions
run-0.037
ang-0.031
aser-0.029
udge-0.029
web-0.029
TokenSure
Token ID19457
Feature activation+1.47e-03

Pos logit contributions
run+0.085
ang+0.078
udge+0.070
it+0.069
web+0.067
Neg logit contributions
girls-0.073
 towns-0.072
sets-0.071
 conversations-0.068
 voy-0.067
Token
Token ID198
Feature activation-1.58e-02

Pos logit contributions
girls+0.041
sets+0.040
 towns+0.040
 conversations+0.037
 amen+0.036
Neg logit contributions
run-0.055
ang-0.048
udge-0.044
ours-0.041
aser-0.041
Token
Token ID198
Feature activation-1.64e-02

Pos logit contributions
girls+0.020
 towns+0.020
sets+0.020
 amen+0.018
 conversations+0.018
Neg logit contributions
run-0.035
ang-0.031
udge-0.028
web-0.028
aser-0.027
TokenTom
Token ID13787
Feature activation-1.04e-02

Pos logit contributions
girls+0.024
sets+0.024
 towns+0.024
 amen+0.023
 conversations+0.023
Neg logit contributions
run-0.032
ang-0.029
udge-0.026
web-0.025
aser-0.025
Token and
Token ID290
Feature activation+1.27e-02

Pos logit contributions
girls+0.016
 towns+0.016
sets+0.016
 conversations+0.016
 amen+0.015
Neg logit contributions
run-0.019
ang-0.017
ours-0.015
aser-0.015
it-0.015
Token Mia
Token ID32189
Feature activation+2.76e-02

Pos logit contributions
run+0.020
ang+0.018
udge+0.016
aser+0.015
web+0.015
Neg logit contributions
girls-0.023
 towns-0.023
sets-0.022
 amen-0.022
 nod-0.022
Token nod
Token ID18666
Feature activation-1.99e-01

Pos logit contributions
run+0.050
 at+0.047
it+0.046
 else+0.046
earing+0.045
Neg logit contributions
girls-0.040
sets-0.037
 conversations-0.033
 battles-0.033
rob-0.031
Token and
Token ID290
Feature activation+9.90e-03

Pos logit contributions
girls+0.421
 towns+0.414
sets+0.410
rob+0.394
 feast+0.383
Neg logit contributions
run-0.278
 at-0.206
--0.204
it-0.198
udge-0.186
Token say
Token ID910
Feature activation-3.08e-02

Pos logit contributions
run+0.013
 else+0.013
 at+0.011
bug+0.011
 how+0.011
Neg logit contributions
sets-0.018
rob-0.016
 North-0.016
girls-0.016
 conversations-0.016
Token yes
Token ID3763
Feature activation-1.28e-01

Pos logit contributions
girls+0.058
sets+0.055
 distance+0.055
 towns+0.054
andra+0.051
Neg logit contributions
run-0.052
aser-0.043
it-0.039
ang-0.038
ours-0.038
Token.
Token ID13
Feature activation+3.00e-03

Pos logit contributions
girls+0.279
 towns+0.273
sets+0.266
Arthur+0.265
rob+0.259
Neg logit contributions
run-0.168
--0.141
it-0.137
earing-0.130
out-0.126
Token They
Token ID1119
Feature activation-1.22e-01

Pos logit contributions
run+0.006
ang+0.005
ours+0.005
udge+0.005
it+0.004
Neg logit contributions
sets-0.004
 feast-0.004
girls-0.004
 overcome-0.004
 voy-0.004
Token
Token ID198
Feature activation-1.33e-02

Pos logit contributions
girls+0.015
 towns+0.015
sets+0.014
 amen+0.014
 conversations+0.013
Neg logit contributions
run-0.028
ang-0.025
udge-0.023
web-0.022
aser-0.022
TokenS
Token ID50
Feature activation+2.07e-02

Pos logit contributions
girls+0.020
 towns+0.019
sets+0.019
 amen+0.018
 nod+0.018
Neg logit contributions
run-0.027
ang-0.023
udge-0.021
aser-0.021
web-0.021
Tokenara
Token ID3301
Feature activation+7.00e-02

Pos logit contributions
run+0.029
ours+0.025
ang+0.025
aser+0.023
udge+0.023
Neg logit contributions
 towns-0.042
girls-0.041
sets-0.040
 conversations-0.039
 chatter-0.039
Token and
Token ID290
Feature activation+1.47e-02

Pos logit contributions
run+0.148
ang+0.129
ours+0.121
udge+0.117
aser+0.117
Neg logit contributions
girls-0.094
sets-0.092
 towns-0.091
 amen-0.085
 conversations-0.084
Token Ben
Token ID3932
Feature activation-4.20e-03

Pos logit contributions
run+0.028
ang+0.024
udge+0.022
ours+0.022
aser+0.022
Neg logit contributions
girls-0.022
sets-0.022
 towns-0.022
 conversations-0.021
 amen-0.021
Token nod
Token ID18666
Feature activation-1.99e-01

Pos logit contributions
sets+0.006
girls+0.006
 towns+0.005
 conversations+0.005
 islands+0.005
Neg logit contributions
run-0.008
ours-0.007
 at-0.006
 else-0.006
ang-0.006
Token.
Token ID13
Feature activation+1.18e-02

Pos logit contributions
girls+0.372
 towns+0.369
sets+0.365
 feast+0.335
Arthur+0.333
Neg logit contributions
run-0.339
udge-0.249
it-0.240
--0.236
ang-0.234
Token They
Token ID1119
Feature activation-1.20e-01

Pos logit contributions
run+0.027
ang+0.024
udge+0.022
ours+0.022
it+0.021
Neg logit contributions
sets-0.014
 towns-0.014
girls-0.013
 overcome-0.013
 feast-0.013
Token have
Token ID423
Feature activation+3.21e-02

Pos logit contributions
sets+0.193
girls+0.173
 towns+0.167
 conversations+0.164
 islands+0.153
Neg logit contributions
run-0.219
ang-0.180
earing-0.173
udge-0.171
 else-0.167
Token some
Token ID617
Feature activation-3.00e-02

Pos logit contributions
run+0.043
out+0.027
 else+0.026
 here+0.025
ours+0.025
Neg logit contributions
sets-0.072
girls-0.066
 amen-0.066
 towns-0.064
 Claus-0.062
Token scratches
Token ID37627
Feature activation-5.12e-02

Pos logit contributions
girls+0.041
sets+0.040
 towns+0.040
 amen+0.038
 conversations+0.036
Neg logit contributions
run-0.063
ang-0.055
udge-0.050
aser-0.050
web-0.049
TokenThe
Token ID464
Feature activation-6.28e-02

Pos logit contributions
girls+0.025
 towns+0.025
sets+0.024
 amen+0.023
 nod+0.023
Neg logit contributions
run-0.047
ang-0.041
aser-0.039
udge-0.038
ours-0.038
Token cater
Token ID20825
Feature activation+7.41e-02

Pos logit contributions
girls+0.085
 towns+0.084
sets+0.081
 amen+0.079
 nod+0.077
Neg logit contributions
run-0.132
ang-0.116
udge-0.106
aser-0.104
web-0.104
Tokenpillar
Token ID41643
Feature activation+9.97e-02

Pos logit contributions
run+0.143
ours+0.109
udge+0.103
ang+0.100
inn+0.098
Neg logit contributions
girls-0.105
 towns-0.102
 nod-0.101
 narrative-0.100
sets-0.097
Token seemed
Token ID3947
Feature activation-1.92e-02

Pos logit contributions
run+0.231
ang+0.206
web+0.189
udge+0.188
aser+0.188
Neg logit contributions
girls-0.112
 towns-0.110
sets-0.107
 amen-0.100
 conversations-0.098
Token to
Token ID284
Feature activation-4.58e-02

Pos logit contributions
 towns+0.029
girls+0.028
 nod+0.028
sets+0.027
 amen+0.027
Neg logit contributions
run-0.037
ang-0.033
udge-0.030
aser-0.029
web-0.029
Token nod
Token ID18666
Feature activation-1.99e-01

Pos logit contributions
girls+0.072
sets+0.069
 towns+0.067
 hearts+0.059
 islands+0.058
Neg logit contributions
run-0.087
ang-0.071
udge-0.070
 here-0.067
earing-0.067
Token.
Token ID13
Feature activation-3.20e-03

Pos logit contributions
girls+0.320
 towns+0.316
sets+0.308
 amen+0.291
 conversations+0.286
Neg logit contributions
run-0.382
ang-0.308
udge-0.292
web-0.287
it-0.283
Token He
Token ID679
Feature activation-4.89e-02

Pos logit contributions
 towns+0.004
girls+0.004
sets+0.003
 amen+0.003
 nod+0.003
Neg logit contributions
run-0.007
ang-0.007
web-0.006
udge-0.006
aser-0.006
Token was
Token ID373
Feature activation+9.89e-03

Pos logit contributions
girls+0.058
 towns+0.056
 amen+0.054
 nod+0.054
sets+0.053
Neg logit contributions
run-0.112
ang-0.099
udge-0.093
aser-0.091
web-0.090
Token j
Token ID474
Feature activation+3.66e-02

Pos logit contributions
run+0.019
ang+0.017
udge+0.016
aser+0.016
ours+0.015
Neg logit contributions
 towns-0.014
girls-0.014
 nod-0.014
 amen-0.014
 overcome-0.014
Tokenolly
Token ID5098
Feature activation+2.71e-02

Pos logit contributions
run+0.057
web+0.049
ang+0.046
udge+0.043
aser+0.042
Neg logit contributions
 towns-0.063
 nod-0.062
girls-0.061
 conversations-0.059
sets-0.058
Token says
Token ID1139
Feature activation-6.57e-03

Pos logit contributions
run+0.038
ang+0.033
ours+0.029
udge+0.029
web+0.028
Neg logit contributions
girls-0.027
 towns-0.026
sets-0.026
 conversations-0.025
 amen-0.025
Token.
Token ID13
Feature activation-3.75e-04

Pos logit contributions
girls+0.008
 towns+0.008
sets+0.008
 conversations+0.007
 amen+0.007
Neg logit contributions
run-0.015
ang-0.013
aser-0.012
udge-0.012
web-0.011
Token
Token ID198
Feature activation-1.75e-02

Pos logit contributions
 towns+0.001
girls+0.001
sets+0.000
 amen+0.000
 conversations+0.000
Neg logit contributions
run-0.001
ang-0.001
udge-0.001
aser-0.001
web-0.001
Token
Token ID198
Feature activation-1.83e-02

Pos logit contributions
girls+0.021
 towns+0.021
sets+0.021
 amen+0.020
 nod+0.019
Neg logit contributions
run-0.039
ang-0.035
udge-0.032
web-0.031
aser-0.031
TokenThey
Token ID2990
Feature activation-1.27e-01

Pos logit contributions
girls+0.028
 towns+0.028
sets+0.027
 amen+0.026
 conversations+0.026
Neg logit contributions
run-0.036
ang-0.031
udge-0.028
web-0.027
aser-0.027
Token nod
Token ID18666
Feature activation-1.99e-01

Pos logit contributions
sets+0.181
girls+0.167
 conversations+0.160
 amen+0.157
 towns+0.155
Neg logit contributions
run-0.253
ang-0.225
udge-0.201
aser-0.196
ours-0.195
Token.
Token ID13
Feature activation+1.22e-03

Pos logit contributions
sets+0.371
girls+0.369
 towns+0.363
rob+0.351
Arthur+0.350
Neg logit contributions
run-0.331
udge-0.224
ang-0.221
--0.220
 at-0.218
Token They
Token ID1119
Feature activation-1.23e-01

Pos logit contributions
run+0.003
ang+0.002
udge+0.002
ours+0.002
it+0.002
Neg logit contributions
sets-0.001
 towns-0.001
girls-0.001
 overcome-0.001
 amen-0.001
Token smile
Token ID8212
Feature activation-1.16e-01

Pos logit contributions
sets+0.203
girls+0.178
 conversations+0.174
 towns+0.172
 amen+0.169
Neg logit contributions
run-0.209
ang-0.186
udge-0.166
earing-0.164
 else-0.161
Token.
Token ID13
Feature activation+6.38e-04

Pos logit contributions
 towns+0.209
girls+0.205
 narrative+0.201
andra+0.198
Andrew+0.198
Neg logit contributions
run-0.182
ang-0.142
 at-0.139
it-0.135
 here-0.127
Token They
Token ID1119
Feature activation-1.25e-01

Pos logit contributions
run+0.001
ang+0.001
udge+0.001
ours+0.001
it+0.001
Neg logit contributions
sets-0.001
 towns-0.001
girls-0.001
 amen-0.001
 overcome-0.001
Token
Token ID198
Feature activation-2.28e-02

Pos logit contributions
girls+0.028
 towns+0.028
sets+0.027
 amen+0.026
 nod+0.026
Neg logit contributions
run-0.049
ang-0.043
udge-0.040
aser-0.039
web-0.039
TokenS
Token ID50
Feature activation+2.39e-02

Pos logit contributions
girls+0.029
 towns+0.029
sets+0.029
 amen+0.027
 conversations+0.027
Neg logit contributions
run-0.049
ang-0.044
udge-0.040
web-0.040
aser-0.039
Tokenara
Token ID3301
Feature activation+7.04e-02

Pos logit contributions
run+0.039
ang+0.034
aser+0.031
ours+0.030
it+0.030
Neg logit contributions
 towns-0.044
girls-0.044
sets-0.042
 amen-0.041
 conversations-0.041
Token and
Token ID290
Feature activation+1.11e-02

Pos logit contributions
run+0.135
ang+0.116
web+0.108
aser+0.107
udge+0.106
Neg logit contributions
girls-0.109
sets-0.104
 towns-0.104
 conversations-0.099
 amen-0.097
Token Ben
Token ID3932
Feature activation-8.08e-03

Pos logit contributions
run+0.020
ang+0.017
udge+0.015
ours+0.014
aser+0.014
Neg logit contributions
 towns-0.018
sets-0.018
 conversations-0.018
girls-0.018
 amen-0.017
Token nod
Token ID18666
Feature activation-1.99e-01

Pos logit contributions
girls+0.013
sets+0.012
 conversations+0.012
 towns+0.011
 battles+0.011
Neg logit contributions
run-0.013
aser-0.012
earing-0.012
it-0.012
bug-0.012
Token and
Token ID290
Feature activation+9.42e-03

Pos logit contributions
girls+0.387
 towns+0.381
sets+0.375
rob+0.352
Arthur+0.347
Neg logit contributions
run-0.330
--0.231
it-0.230
udge-0.229
 at-0.218
Token smile
Token ID8212
Feature activation-1.14e-01

Pos logit contributions
run+0.017
ang+0.013
 else+0.013
bug+0.013
 inside+0.012
Neg logit contributions
sets-0.014
 conversations-0.013
Sir-0.013
 towns-0.013
Arthur-0.013
Token.
Token ID13
Feature activation-2.17e-03

Pos logit contributions
 towns+0.209
girls+0.203
 narrative+0.196
Andrew+0.193
 voy+0.191
Neg logit contributions
run-0.183
ang-0.143
it-0.142
 at-0.139
--0.132
Token They
Token ID1119
Feature activation-1.24e-01

Pos logit contributions
girls+0.002
 towns+0.002
sets+0.002
 overcome+0.002
 conversations+0.002
Neg logit contributions
run-0.005
ang-0.005
ours-0.004
udge-0.004
inn-0.004
Token help
Token ID1037
Feature activation-1.28e-01

Pos logit contributions
sets+0.192
 towns+0.171
girls+0.171
 conversations+0.170
 islands+0.159
Neg logit contributions
run-0.215
earing-0.187
ang-0.186
aser-0.184
udge-0.174
Token
Token ID198
Feature activation-1.48e-02

Pos logit contributions
girls+0.018
 towns+0.018
sets+0.018
 amen+0.017
 conversations+0.016
Neg logit contributions
run-0.032
ang-0.028
udge-0.026
web-0.025
aser-0.025
TokenS
Token ID50
Feature activation+2.46e-02

Pos logit contributions
girls+0.022
 towns+0.022
sets+0.022
 amen+0.021
 conversations+0.021
Neg logit contributions
run-0.029
ang-0.025
udge-0.023
web-0.022
aser-0.022
Tokenara
Token ID3301
Feature activation+7.43e-02

Pos logit contributions
run+0.041
ang+0.036
udge+0.032
ours+0.032
aser+0.032
Neg logit contributions
 towns-0.044
girls-0.043
sets-0.042
 amen-0.041
 conversations-0.041
Token and
Token ID290
Feature activation+1.52e-02

Pos logit contributions
run+0.164
ang+0.144
udge+0.133
ours+0.133
aser+0.131
Neg logit contributions
girls-0.094
 towns-0.090
sets-0.089
 conversations-0.084
 amen-0.082
Token Ben
Token ID3932
Feature activation-3.56e-03

Pos logit contributions
run+0.029
ang+0.025
udge+0.023
aser+0.022
ours+0.022
Neg logit contributions
girls-0.023
 towns-0.023
sets-0.023
 conversations-0.022
 amen-0.021
Token nod
Token ID18666
Feature activation-1.99e-01

Pos logit contributions
girls+0.005
sets+0.004
 conversations+0.004
 battles+0.004
 towns+0.004
Neg logit contributions
run-0.007
ours-0.006
it-0.006
 at-0.006
aser-0.006
Token.
Token ID13
Feature activation+2.96e-03

Pos logit contributions
girls+0.384
 towns+0.382
sets+0.371
rob+0.347
Arthur+0.343
Neg logit contributions
run-0.319
--0.231
it-0.231
udge-0.230
 at-0.221
Token They
Token ID1119
Feature activation-1.20e-01

Pos logit contributions
run+0.007
ang+0.006
udge+0.006
ours+0.006
it+0.005
Neg logit contributions
sets-0.003
girls-0.003
 towns-0.003
 overcome-0.003
 conversations-0.003
Token say
Token ID910
Feature activation-3.09e-02

Pos logit contributions
sets+0.172
girls+0.167
 towns+0.148
 conversations+0.148
 voy+0.136
Neg logit contributions
run-0.229
ang-0.196
earing-0.193
udge-0.191
aser-0.189
Token,
Token ID11
Feature activation-1.35e-02

Pos logit contributions
girls+0.054
 distance+0.052
sets+0.052
 towns+0.050
andra+0.047
Neg logit contributions
run-0.053
aser-0.043
ours-0.043
ang-0.041
it-0.040
Token "
Token ID366
Feature activation+3.48e-02

Pos logit contributions
 towns+0.027
girls+0.027
sets+0.026
 conversations+0.026
 voy+0.025
Neg logit contributions
run-0.020
ang-0.016
it-0.013
ours-0.013
udge-0.013
Token other
Token ID584
Feature activation-7.72e-02

Pos logit contributions
girls+0.059
 towns+0.057
sets+0.057
 amen+0.054
 conversations+0.051
Neg logit contributions
run-0.131
ang-0.116
udge-0.108
aser-0.107
web-0.106
Token a
Token ID257
Feature activation-1.32e-02

Pos logit contributions
girls+0.110
sets+0.107
 towns+0.107
 amen+0.104
 nod+0.099
Neg logit contributions
run-0.156
ang-0.131
udge-0.120
aser-0.120
web-0.118
Token wink
Token ID41955
Feature activation-1.10e-02

Pos logit contributions
girls+0.015
 towns+0.015
sets+0.015
 amen+0.015
 nod+0.015
Neg logit contributions
run-0.030
ang-0.027
udge-0.024
web-0.024
aser-0.024
Token and
Token ID290
Feature activation+1.71e-02

Pos logit contributions
girls+0.015
 towns+0.015
sets+0.015
 amen+0.015
 nod+0.014
Neg logit contributions
run-0.022
ang-0.020
udge-0.018
aser-0.017
web-0.017
Token a
Token ID257
Feature activation-1.24e-02

Pos logit contributions
run+0.034
ang+0.028
web+0.026
udge+0.025
ours+0.024
Neg logit contributions
 towns-0.024
girls-0.024
sets-0.022
 narrative-0.022
 conversations-0.021
Token nod
Token ID18666
Feature activation-1.99e-01

Pos logit contributions
girls+0.018
 towns+0.017
sets+0.016
 amen+0.015
 conversations+0.015
Neg logit contributions
run-0.026
ang-0.022
web-0.020
aser-0.020
udge-0.020
Token.
Token ID13
Feature activation+2.07e-03

Pos logit contributions
girls+0.288
 towns+0.282
sets+0.267
 amen+0.263
 conversations+0.254
Neg logit contributions
run-0.417
ang-0.349
udge-0.322
web-0.320
aser-0.316
Token
Token ID198
Feature activation-1.76e-02

Pos logit contributions
run+0.005
ang+0.004
udge+0.004
aser+0.004
web+0.004
Neg logit contributions
girls-0.002
 towns-0.002
sets-0.002
 amen-0.002
 conversations-0.002
Token
Token ID198
Feature activation-1.85e-02

Pos logit contributions
girls+0.022
 towns+0.021
 amen+0.020
sets+0.020
 nod+0.019
Neg logit contributions
run-0.039
ang-0.034
aser-0.033
udge-0.032
web-0.032
TokenBob
Token ID18861
Feature activation-1.80e-02

Pos logit contributions
girls+0.018
 towns+0.018
 amen+0.016
sets+0.016
 nod+0.016
Neg logit contributions
run-0.046
ang-0.040
aser-0.039
ours-0.038
udge-0.038
Token and
Token ID290
Feature activation+9.98e-03

Pos logit contributions
sets+0.027
girls+0.027
 towns+0.026
 conversations+0.025
 amen+0.025
Neg logit contributions
run-0.035
ang-0.031
web-0.027
udge-0.026
aser-0.026
Token It
Token ID632
Feature activation+5.18e-02

Pos logit contributions
run+0.052
ang+0.045
web+0.044
udge+0.041
aser+0.040
Neg logit contributions
girls-0.041
 towns-0.037
 nod-0.037
 amen-0.036
sets-0.036
Token is
Token ID318
Feature activation+7.40e-03

Pos logit contributions
run+0.122
udge+0.108
ang+0.108
inn+0.097
year+0.097
Neg logit contributions
girls-0.060
sets-0.053
 towns-0.051
 Claus-0.051
 amen-0.049
Token mine
Token ID6164
Feature activation+9.25e-02

Pos logit contributions
run+0.013
ang+0.013
udge+0.012
op+0.011
ore+0.011
Neg logit contributions
girls-0.011
 hearts-0.011
 towns-0.010
 rep-0.010
 Arthur-0.010
Token.
Token ID13
Feature activation+2.80e-02

Pos logit contributions
run+0.238
ang+0.217
udge+0.204
aser+0.194
inn+0.193
Neg logit contributions
girls-0.077
 towns-0.069
sets-0.068
 amen-0.063
 overcome-0.057
Token I
Token ID314
Feature activation+3.67e-02

Pos logit contributions
run+0.051
ang+0.045
web+0.044
udge+0.041
aser+0.040
Neg logit contributions
girls-0.044
 towns-0.041
 amen-0.040
 nod-0.040
sets-0.038
Token found
Token ID1043
Feature activation+1.18e-01

Pos logit contributions
run+0.083
web+0.076
ang+0.075
aser+0.073
ula+0.072
Neg logit contributions
 nod-0.041
girls-0.037
 overcome-0.035
 travel-0.034
 hear-0.034
Token it
Token ID340
Feature activation+6.46e-02

Pos logit contributions
run+0.257
ang+0.249
udge+0.238
inn+0.223
op+0.213
Neg logit contributions
 hearts-0.141
girls-0.118
 Hero-0.114
 towns-0.114
 amen-0.113
Token first
Token ID717
Feature activation-1.29e-02

Pos logit contributions
run+0.157
ang+0.143
udge+0.139
ours+0.128
web+0.124
Neg logit contributions
girls-0.063
 amen-0.056
sets-0.054
 towns-0.054
 overcome-0.053
Token."
Token ID526
Feature activation-3.54e-02

Pos logit contributions
girls+0.012
 towns+0.011
sets+0.011
 amen+0.011
 nod+0.010
Neg logit contributions
run-0.032
ang-0.029
udge-0.027
web-0.027
aser-0.027
Token They
Token ID1119
Feature activation-1.09e-01

Pos logit contributions
girls+0.053
 towns+0.053
sets+0.052
 amen+0.050
 conversations+0.049
Neg logit contributions
run-0.069
ang-0.060
udge-0.055
ours-0.052
aser-0.052
Token both
Token ID1111
Feature activation-2.29e-02

Pos logit contributions
sets+0.182
girls+0.173
 towns+0.171
 amen+0.167
 conversations+0.166
Neg logit contributions
run-0.203
ang-0.169
udge-0.153
aser-0.153
ours-0.150
Token a
Token ID257
Feature activation-3.53e-03

Pos logit contributions
run+0.045
ang+0.039
aser+0.036
udge+0.036
web+0.035
Neg logit contributions
 towns-0.028
girls-0.028
sets-0.027
 amen-0.027
 conversations-0.026
Token little
Token ID1310
Feature activation+5.31e-02

Pos logit contributions
girls+0.006
 towns+0.006
sets+0.005
 amen+0.005
 nod+0.005
Neg logit contributions
run-0.007
ang-0.006
udge-0.005
aser-0.005
web-0.005
Token pig
Token ID12967
Feature activation+2.40e-02

Pos logit contributions
run+0.116
ang+0.097
udge+0.092
web+0.089
aser+0.089
Neg logit contributions
 towns-0.069
girls-0.068
sets-0.066
 nod-0.065
 amen-0.065
Token named
Token ID3706
Feature activation+7.63e-02

Pos logit contributions
run+0.040
ang+0.036
udge+0.031
web+0.031
aser+0.029
Neg logit contributions
 towns-0.042
girls-0.042
 amen-0.041
sets-0.041
 nod-0.040
Token Pink
Token ID14657
Feature activation+3.65e-02

Pos logit contributions
run+0.167
ang+0.148
udge+0.137
aser+0.135
web+0.135
Neg logit contributions
girls-0.099
 towns-0.095
sets-0.093
 amen-0.092
 nod-0.087
Tokeny
Token ID88
Feature activation+8.91e-02

Pos logit contributions
run+0.046
ang+0.038
udge+0.033
web+0.031
aser+0.030
Neg logit contributions
 towns-0.079
girls-0.079
sets-0.078
 amen-0.077
 conversations-0.076
Token.
Token ID13
Feature activation+1.86e-02

Pos logit contributions
run+0.192
ang+0.172
udge+0.156
web+0.154
aser+0.153
Neg logit contributions
girls-0.115
 towns-0.114
sets-0.112
 amen-0.107
 conversations-0.106
Token She
Token ID1375
Feature activation-2.91e-03

Pos logit contributions
run+0.038
ang+0.033
inn+0.030
udge+0.030
ours+0.030
Neg logit contributions
girls-0.025
 towns-0.025
sets-0.023
 conversations-0.023
 amen-0.023
Token lived
Token ID5615
Feature activation+2.99e-02

Pos logit contributions
girls+0.004
 towns+0.004
sets+0.004
 amen+0.004
 nod+0.004
Neg logit contributions
run-0.006
ang-0.005
udge-0.004
web-0.004
aser-0.004
Token in
Token ID287
Feature activation+5.95e-03

Pos logit contributions
run+0.054
ang+0.042
aser+0.040
ours+0.040
udge+0.039
Neg logit contributions
 towns-0.051
 overcome-0.049
girls-0.049
 amen-0.047
sets-0.047
Token a
Token ID257
Feature activation-3.72e-03

Pos logit contributions
run+0.012
ang+0.010
udge+0.009
aser+0.009
web+0.009
Neg logit contributions
girls-0.009
 towns-0.009
sets-0.009
 amen-0.008
 nod-0.008
Token whenever
Token ID8797
Feature activation-6.96e-03

Pos logit contributions
run+0.273
ang+0.259
udge+0.252
ours+0.246
aser+0.238
Neg logit contributions
 towns-0.109
sets-0.099
 amen-0.090
girls-0.089
 conversations-0.089
Token he
Token ID339
Feature activation-2.72e-03

Pos logit contributions
girls+0.010
 towns+0.010
sets+0.010
 nod+0.009
 amen+0.009
Neg logit contributions
run-0.014
ang-0.012
web-0.011
udge-0.010
aser-0.010
Token looked
Token ID3114
Feature activation-6.84e-04

Pos logit contributions
 towns+0.003
girls+0.003
sets+0.003
 nod+0.003
 conversations+0.003
Neg logit contributions
run-0.006
ang-0.005
udge-0.005
aser-0.005
ours-0.005
Token at
Token ID379
Feature activation+2.93e-02

Pos logit contributions
 towns+0.001
sets+0.001
girls+0.001
 nod+0.001
 conversations+0.001
Neg logit contributions
run-0.001
ang-0.001
udge-0.001
aser-0.001
web-0.001
Token the
Token ID262
Feature activation-4.65e-02

Pos logit contributions
run+0.062
ang+0.056
udge+0.051
aser+0.050
ours+0.050
Neg logit contributions
 towns-0.040
girls-0.038
sets-0.038
 conversations-0.036
 hearts-0.035
Token picture
Token ID4286
Feature activation+9.03e-02

Pos logit contributions
 towns+0.060
girls+0.060
sets+0.058
 amen+0.056
 nod+0.055
Neg logit contributions
run-0.101
ang-0.089
udge-0.081
aser-0.079
web-0.079
Token,
Token ID11
Feature activation-1.37e-02

Pos logit contributions
run+0.198
ang+0.178
udge+0.163
aser+0.158
ours+0.157
Neg logit contributions
 towns-0.111
girls-0.111
sets-0.109
 amen-0.103
 conversations-0.102
Token he
Token ID339
Feature activation-1.99e-03

Pos logit contributions
 towns+0.018
girls+0.018
sets+0.018
 amen+0.017
 nod+0.017
Neg logit contributions
run-0.029
ang-0.026
udge-0.024
aser-0.023
web-0.023
Token felt
Token ID2936
Feature activation-2.75e-02

Pos logit contributions
 towns+0.003
girls+0.002
sets+0.002
 nod+0.002
 amen+0.002
Neg logit contributions
run-0.004
ang-0.004
udge-0.004
aser-0.004
ours-0.003
Token so
Token ID523
Feature activation-3.28e-03

Pos logit contributions
 towns+0.042
girls+0.040
 nod+0.039
 amen+0.039
 conversations+0.038
Neg logit contributions
run-0.054
ang-0.048
udge-0.043
aser-0.043
ours-0.042
Token much
Token ID881
Feature activation+4.11e-02

Pos logit contributions
 towns+0.005
 nod+0.005
 amen+0.005
girls+0.005
 overcome+0.005
Neg logit contributions
run-0.006
ang-0.005
udge-0.005
aser-0.005
web-0.005
Token set
Token ID900
Feature activation-4.52e-02

Pos logit contributions
run+0.010
ang+0.008
udge+0.008
web+0.008
aser+0.008
Neg logit contributions
girls-0.007
 towns-0.007
sets-0.007
 amen-0.007
 nod-0.007
Token and
Token ID290
Feature activation+1.76e-02

Pos logit contributions
 towns+0.063
girls+0.063
sets+0.062
 nod+0.059
 conversations+0.058
Neg logit contributions
run-0.092
ang-0.081
udge-0.073
aser-0.072
web-0.072
Token asked
Token ID1965
Feature activation-1.60e-02

Pos logit contributions
run+0.042
ang+0.039
udge+0.038
aser+0.035
web+0.034
Neg logit contributions
 towns-0.019
girls-0.018
 amen-0.018
sets-0.017
 nod-0.017
Token,
Token ID11
Feature activation-9.34e-03

Pos logit contributions
 towns+0.024
girls+0.024
 amen+0.023
sets+0.023
 nod+0.023
Neg logit contributions
run-0.031
ang-0.028
udge-0.025
web-0.025
aser-0.024
Token "
Token ID366
Feature activation+3.84e-02

Pos logit contributions
 nod+0.012
 towns+0.012
 amen+0.012
girls+0.012
 rep+0.012
Neg logit contributions
run-0.019
ang-0.018
web-0.016
udge-0.016
aser-0.016
TokenWhat
Token ID2061
Feature activation+9.70e-02

Pos logit contributions
run+0.070
ang+0.060
udge+0.054
web+0.053
aser+0.053
Neg logit contributions
girls-0.063
 towns-0.062
sets-0.061
 amen-0.060
 nod-0.059
Token is
Token ID318
Feature activation+2.12e-05

Pos logit contributions
run+0.203
ang+0.175
aser+0.163
udge+0.163
web+0.162
Neg logit contributions
girls-0.133
 towns-0.129
sets-0.124
 amen-0.122
 conversations-0.116
Token that
Token ID326
Feature activation+6.70e-02

Pos logit contributions
run+0.000
ang+0.000
udge+0.000
it+0.000
aser+0.000
Neg logit contributions
 parties-0.000
 hearts-0.000
sets-0.000
 conversations-0.000
 towns-0.000
Token?"
Token ID1701
Feature activation-1.31e-02

Pos logit contributions
run+0.142
ang+0.127
udge+0.114
web+0.108
aser+0.107
Neg logit contributions
sets-0.094
 towns-0.089
girls-0.089
 nod-0.088
 amen-0.086
Token Lily
Token ID20037
Feature activation+5.83e-02

Pos logit contributions
girls+0.026
 towns+0.026
sets+0.025
 amen+0.025
 conversations+0.024
Neg logit contributions
run-0.020
ang-0.015
udge-0.014
aser-0.013
web-0.013
Token replied
Token ID8712
Feature activation+3.54e-03

Pos logit contributions
run+0.103
ang+0.088
it+0.079
earing+0.079
ours+0.078
Neg logit contributions
girls-0.096
 towns-0.094
sets-0.088
 conversations-0.086
 amen-0.085
Token careless
Token ID36138
Feature activation+2.66e-03

Pos logit contributions
run+0.105
aser+0.089
web+0.088
ion+0.087
udge+0.087
Neg logit contributions
 towns-0.060
 nod-0.060
 overcome-0.060
 amen-0.057
 smile-0.057
Token and
Token ID290
Feature activation+2.49e-02

Pos logit contributions
run+0.006
ang+0.005
udge+0.005
aser+0.005
ours+0.005
Neg logit contributions
 towns-0.003
girls-0.003
sets-0.003
 conversations-0.003
 amen-0.003
Token bumped
Token ID38034
Feature activation+2.87e-02

Pos logit contributions
run+0.055
ang+0.051
udge+0.049
aser+0.046
ours+0.045
Neg logit contributions
 towns-0.029
girls-0.029
 amen-0.027
sets-0.026
 nod-0.026
Token into
Token ID656
Feature activation+1.19e-02

Pos logit contributions
run+0.054
ang+0.048
ours+0.046
aser+0.044
ool+0.044
Neg logit contributions
 towns-0.041
sets-0.039
girls-0.038
 hearts-0.037
 overcome-0.036
Token the
Token ID262
Feature activation-3.99e-02

Pos logit contributions
run+0.027
ang+0.027
ours+0.025
aser+0.024
earing+0.023
Neg logit contributions
 towns-0.013
 hearts-0.011
girls-0.011
 conversations-0.011
 chatter-0.010
Token shelf
Token ID18316
Feature activation+9.76e-02

Pos logit contributions
 towns+0.051
girls+0.047
 conversations+0.047
 hearts+0.046
sets+0.046
Neg logit contributions
run-0.090
ang-0.080
udge-0.073
ours-0.071
inn-0.069
Token,
Token ID11
Feature activation-1.50e-04

Pos logit contributions
run+0.226
ang+0.204
udge+0.187
ours+0.184
aser+0.183
Neg logit contributions
 towns-0.107
girls-0.104
sets-0.099
 nod-0.097
 amen-0.094
Token causing
Token ID6666
Feature activation+3.94e-03

Pos logit contributions
 towns+0.000
sets+0.000
girls+0.000
 nod+0.000
 amen+0.000
Neg logit contributions
run-0.000
ang-0.000
udge-0.000
aser-0.000
ours-0.000
Token it
Token ID340
Feature activation+6.35e-02

Pos logit contributions
run+0.008
ang+0.008
udge+0.007
aser+0.007
web+0.007
Neg logit contributions
 towns-0.005
girls-0.005
sets-0.005
 conversations-0.005
 hearts-0.005
Token to
Token ID284
Feature activation-3.14e-02

Pos logit contributions
ang+0.142
run+0.132
udge+0.124
ours+0.124
aser+0.116
Neg logit contributions
 overcome-0.068
 towns-0.065
 conversations-0.064
girls-0.063
 chatter-0.063
Token break
Token ID2270
Feature activation+3.28e-02

Pos logit contributions
girls+0.041
 towns+0.040
sets+0.039
 amen+0.038
 nod+0.038
Neg logit contributions
run-0.068
ang-0.060
udge-0.055
aser-0.054
web-0.053
Token says
Token ID1139
Feature activation+8.64e-03

Pos logit contributions
girls+0.014
 towns+0.014
sets+0.014
 amen+0.013
 conversations+0.013
Neg logit contributions
run-0.022
ang-0.019
udge-0.018
aser-0.017
web-0.017
Token.
Token ID13
Feature activation+1.74e-02

Pos logit contributions
run+0.019
ang+0.016
udge+0.015
aser+0.014
web+0.014
Neg logit contributions
girls-0.012
sets-0.012
 towns-0.012
 amen-0.011
 conversations-0.011
Token "
Token ID366
Feature activation+4.22e-02

Pos logit contributions
run+0.041
ang+0.036
aser+0.034
udge+0.034
web+0.034
Neg logit contributions
girls-0.019
 towns-0.019
 nod-0.018
sets-0.018
 amen-0.018
TokenIt
Token ID1026
Feature activation+6.75e-02

Pos logit contributions
run+0.087
ang+0.074
aser+0.068
ours+0.068
web+0.067
Neg logit contributions
 towns-0.060
girls-0.058
sets-0.056
 nod-0.056
 amen-0.054
Token is
Token ID318
Feature activation+2.47e-03

Pos logit contributions
run+0.170
udge+0.157
ang+0.151
year+0.141
inn+0.140
Neg logit contributions
girls-0.059
sets-0.058
 towns-0.057
 overcome-0.053
 Claus-0.050
Token pretty
Token ID2495
Feature activation+3.85e-02

Pos logit contributions
run+0.005
ang+0.005
udge+0.005
ore+0.005
web+0.005
Neg logit contributions
girls-0.003
sets-0.003
 hearts-0.003
 overcome-0.003
 towns-0.003
Token!"
Token ID2474
Feature activation-2.94e-02

Pos logit contributions
run+0.081
ang+0.070
udge+0.065
aser+0.063
web+0.063
Neg logit contributions
girls-0.053
 towns-0.052
sets-0.052
 amen-0.049
 nod-0.048
Token
Token ID198
Feature activation-1.13e-02

Pos logit contributions
girls+0.035
 towns+0.035
sets+0.034
 amen+0.033
 conversations+0.032
Neg logit contributions
run-0.067
ang-0.059
udge-0.055
aser-0.054
web-0.054
Token
Token ID198
Feature activation-1.16e-02

Pos logit contributions
girls+0.012
 towns+0.012
 amen+0.011
sets+0.011
 nod+0.011
Neg logit contributions
run-0.027
ang-0.024
aser-0.022
udge-0.022
web-0.022
TokenAnna
Token ID31160
Feature activation+1.27e-02

Pos logit contributions
sets+0.019
girls+0.019
 towns+0.018
 amen+0.018
 conversations+0.018
Neg logit contributions
run-0.022
ang-0.019
web-0.017
udge-0.017
it-0.016
Token reaches
Token ID12229
Feature activation+3.94e-02

Pos logit contributions
run+0.026
ang+0.023
udge+0.021
aser+0.020
web+0.020
Neg logit contributions
girls-0.018
 towns-0.018
sets-0.017
 amen-0.017
 conversations-0.016
Token said
Token ID531
Feature activation+4.37e-04

Pos logit contributions
run+0.014
ang+0.012
udge+0.011
aser+0.011
web+0.011
Neg logit contributions
girls-0.009
 towns-0.009
sets-0.008
 amen-0.008
 conversations-0.008
Token,
Token ID11
Feature activation-9.04e-03

Pos logit contributions
run+0.001
ang+0.001
aser+0.001
udge+0.001
ours+0.001
Neg logit contributions
girls-0.001
sets-0.001
 towns-0.001
 conversations-0.000
 amen-0.000
Token "
Token ID366
Feature activation+4.20e-02

Pos logit contributions
girls+0.017
 towns+0.017
sets+0.017
 conversations+0.017
 narrative+0.016
Neg logit contributions
run-0.014
ang-0.011
aser-0.010
udge-0.010
it-0.009
TokenOkay
Token ID16454
Feature activation-1.14e-01

Pos logit contributions
run+0.100
ang+0.088
udge+0.082
aser+0.082
web+0.081
Neg logit contributions
girls-0.046
 towns-0.046
sets-0.043
 nod-0.042
 amen-0.042
Token,
Token ID11
Feature activation-8.25e-03

Pos logit contributions
girls+0.162
 towns+0.162
sets+0.158
 conversations+0.149
 narrative+0.148
Neg logit contributions
run-0.232
ang-0.199
aser-0.181
udge-0.180
it-0.179
Token Anna
Token ID11735
Feature activation+1.93e-02

Pos logit contributions
girls+0.013
 amen+0.012
 towns+0.012
 nod+0.012
sets+0.012
Neg logit contributions
run-0.016
ang-0.014
udge-0.013
web-0.012
aser-0.012
Token,
Token ID11
Feature activation-7.63e-03

Pos logit contributions
run+0.034
ang+0.029
udge+0.025
web+0.025
aser+0.025
Neg logit contributions
 towns-0.033
sets-0.032
 amen-0.031
 conversations-0.031
 narrative-0.030
Token that
Token ID326
Feature activation+6.65e-02

Pos logit contributions
girls+0.010
 nod+0.010
 amen+0.010
 towns+0.010
sets+0.009
Neg logit contributions
run-0.016
ang-0.015
udge-0.014
web-0.013
aser-0.013
Token sounds
Token ID5238
Feature activation-1.02e-01

Pos logit contributions
run+0.140
ang+0.122
web+0.113
aser+0.111
udge+0.109
Neg logit contributions
girls-0.089
 towns-0.089
sets-0.084
 amen-0.083
 conversations-0.080
Token fun
Token ID1257
Feature activation-6.77e-02

Pos logit contributions
girls+0.148
 towns+0.147
sets+0.145
 amen+0.138
 conversations+0.135
Neg logit contributions
run-0.205
ang-0.177
udge-0.162
web-0.159
aser-0.159
Token.
Token ID13
Feature activation+2.00e-02

Pos logit contributions
girls+0.073
 towns+0.070
sets+0.067
 amen+0.064
 conversations+0.064
Neg logit contributions
run-0.162
ang-0.146
udge-0.135
aser-0.133
web-0.132
Token He
Token ID679
Feature activation-3.72e-02

Pos logit contributions
run+0.025
ang+0.022
udge+0.021
aser+0.021
web+0.021
Neg logit contributions
girls-0.010
 towns-0.010
sets-0.010
 amen-0.009
 nod-0.009
Token was
Token ID373
Feature activation+2.07e-02

Pos logit contributions
 towns+0.050
girls+0.048
 amen+0.046
sets+0.046
 nod+0.046
Neg logit contributions
run-0.079
ang-0.070
udge-0.066
aser-0.064
ours-0.062
Token playing
Token ID2712
Feature activation+6.93e-03

Pos logit contributions
run+0.046
ang+0.040
udge+0.040
aser+0.039
web+0.038
Neg logit contributions
 towns-0.028
girls-0.024
 conversations-0.024
 overcome-0.023
 amen-0.023
Token in
Token ID287
Feature activation+3.43e-03

Pos logit contributions
run+0.012
ang+0.009
ours+0.008
aser+0.008
udge+0.008
Neg logit contributions
 towns-0.013
girls-0.013
sets-0.012
 conversations-0.012
 nod-0.012
Token the
Token ID262
Feature activation-4.20e-02

Pos logit contributions
run+0.008
ang+0.007
aser+0.007
udge+0.006
ours+0.006
Neg logit contributions
 towns-0.004
girls-0.004
sets-0.004
 conversations-0.004
 amen-0.004
Token garden
Token ID11376
Feature activation+2.18e-02

Pos logit contributions
 towns+0.045
girls+0.043
sets+0.042
 conversations+0.040
 nod+0.040
Neg logit contributions
run-0.103
ang-0.090
udge-0.084
aser-0.083
web-0.082
Token when
Token ID618
Feature activation+9.85e-03

Pos logit contributions
run+0.037
ang+0.033
udge+0.029
web+0.029
it+0.028
Neg logit contributions
girls-0.038
 amen-0.036
sets-0.035
 towns-0.035
Arthur-0.034
Token he
Token ID339
Feature activation+3.14e-03

Pos logit contributions
run+0.020
ang+0.017
web+0.016
udge+0.015
aser+0.015
Neg logit contributions
girls-0.014
 towns-0.014
sets-0.014
 amen-0.013
 nod-0.013
Token saw
Token ID2497
Feature activation+6.46e-02

Pos logit contributions
run+0.007
ang+0.006
udge+0.005
web+0.005
aser+0.005
Neg logit contributions
girls-0.004
 towns-0.004
sets-0.004
 amen-0.004
 nod-0.004
Token something
Token ID1223
Feature activation+5.93e-03

Pos logit contributions
run+0.129
ang+0.113
udge+0.102
aser+0.099
web+0.097
Neg logit contributions
 towns-0.097
girls-0.094
sets-0.093
 amen-0.089
 conversations-0.089
Token funny
Token ID8258
Feature activation-7.85e-03

Pos logit contributions
run+0.012
ang+0.011
udge+0.010
earing+0.010
ours+0.010
Neg logit contributions
 rep-0.008
 nod-0.008
 conversations-0.007
sets-0.007
 chatter-0.007
Token She
Token ID1375
Feature activation-1.25e-02

Pos logit contributions
run+0.000
ang+0.000
web+0.000
udge+0.000
aser+0.000
Neg logit contributions
 towns-0.000
 amen-0.000
 nod-0.000
 hearts-0.000
sets-0.000
Token hugged
Token ID41130
Feature activation-5.72e-02

Pos logit contributions
 towns+0.015
girls+0.015
sets+0.014
 amen+0.014
 nod+0.014
Neg logit contributions
run-0.029
ang-0.025
udge-0.024
aser-0.023
web-0.023
Token her
Token ID607
Feature activation+4.45e-02

Pos logit contributions
girls+0.099
 towns+0.097
sets+0.095
 amen+0.093
 nod+0.091
Neg logit contributions
run-0.100
ang-0.085
udge-0.076
aser-0.075
web-0.075
Token mom
Token ID1995
Feature activation-5.10e-02

Pos logit contributions
run+0.094
ang+0.082
udge+0.076
web+0.074
aser+0.074
Neg logit contributions
girls-0.061
 towns-0.060
sets-0.059
 amen-0.057
 nod-0.056
Tokenmy
Token ID1820
Feature activation+1.07e-04

Pos logit contributions
girls+0.076
 towns+0.073
 amen+0.072
 conversations+0.070
sets+0.068
Neg logit contributions
run-0.102
ang-0.088
aser-0.081
udge-0.079
web-0.078
Token,
Token ID11
Feature activation-1.37e-02

Pos logit contributions
run+0.000
ang+0.000
aser+0.000
udge+0.000
web+0.000
Neg logit contributions
girls-0.000
 towns-0.000
 amen-0.000
sets-0.000
 conversations-0.000
Token who
Token ID508
Feature activation+7.56e-02

Pos logit contributions
girls+0.020
 towns+0.020
sets+0.019
 amen+0.018
 conversations+0.018
Neg logit contributions
run-0.028
ang-0.024
udge-0.022
aser-0.022
web-0.021
Token was
Token ID373
Feature activation+7.56e-03

Pos logit contributions
run+0.179
ang+0.156
udge+0.146
aser+0.143
ours+0.141
Neg logit contributions
 towns-0.085
sets-0.082
 nod-0.080
girls-0.079
 amen-0.078
Token waiting
Token ID4953
Feature activation+3.92e-02

Pos logit contributions
run+0.015
ang+0.014
udge+0.013
aser+0.012
ours+0.012
Neg logit contributions
 towns-0.011
sets-0.010
 overcome-0.010
 nod-0.010
girls-0.010
Token for
Token ID329
Feature activation+4.86e-02

Pos logit contributions
run+0.064
ang+0.058
udge+0.052
aser+0.048
web+0.048
Neg logit contributions
 towns-0.069
sets-0.066
 nod-0.066
 amen-0.065
girls-0.065
Token her
Token ID607
Feature activation+4.47e-02

Pos logit contributions
run+0.084
ang+0.070
udge+0.063
web+0.063
aser+0.061
Neg logit contributions
girls-0.083
sets-0.082
 towns-0.082
 amen-0.078
 nod-0.078
Token.
Token ID13
Feature activation+1.31e-02

Pos logit contributions
run+0.137
ang+0.121
udge+0.111
aser+0.109
web+0.109
Neg logit contributions
girls-0.082
 towns-0.081
sets-0.079
 amen-0.076
 nod-0.074
Token
Token ID198
Feature activation-7.70e-03

Pos logit contributions
run+0.029
ang+0.025
udge+0.024
aser+0.023
ours+0.023
Neg logit contributions
girls-0.016
 towns-0.016
sets-0.015
 amen-0.015
 conversations-0.014
TokenJack
Token ID14295
Feature activation+1.57e-02

Pos logit contributions
 towns+0.008
girls+0.008
sets+0.008
 amen+0.007
 nod+0.007
Neg logit contributions
run-0.018
ang-0.016
aser-0.015
web-0.015
udge-0.015
Token was
Token ID373
Feature activation+1.78e-02

Pos logit contributions
run+0.034
ang+0.029
udge+0.027
aser+0.027
web+0.027
Neg logit contributions
girls-0.021
 towns-0.020
sets-0.020
 amen-0.019
 nod-0.019
Token only
Token ID691
Feature activation+8.41e-03

Pos logit contributions
run+0.034
ang+0.031
aser+0.030
udge+0.029
ours+0.028
Neg logit contributions
 towns-0.026
 nod-0.024
 conversations-0.024
 overcome-0.024
girls-0.024
Token three
Token ID1115
Feature activation+5.51e-03

Pos logit contributions
run+0.017
ang+0.016
web+0.016
ool+0.016
ours+0.016
Neg logit contributions
 towns-0.010
 nod-0.009
 hearts-0.009
 overcome-0.008
 conversations-0.008
Token years
Token ID812
Feature activation-2.70e-02

Pos logit contributions
run+0.013
ang+0.011
udge+0.010
aser+0.010
web+0.010
Neg logit contributions
girls-0.006
 towns-0.006
sets-0.006
 amen-0.006
 nod-0.006
Token old
Token ID1468
Feature activation+4.13e-02

Pos logit contributions
 towns+0.049
girls+0.047
 amen+0.046
sets+0.046
 nod+0.045
Neg logit contributions
run-0.045
ang-0.038
udge-0.034
web-0.033
aser-0.033
Token and
Token ID290
Feature activation+1.90e-02

Pos logit contributions
run+0.075
ang+0.062
udge+0.056
aser+0.055
web+0.054
Neg logit contributions
girls-0.066
 amen-0.064
 narrative-0.063
sets-0.063
 towns-0.062
Token had
Token ID550
Feature activation+3.88e-02

Pos logit contributions
run+0.045
ang+0.041
udge+0.038
aser+0.038
web+0.037
Neg logit contributions
 towns-0.020
 nod-0.019
 amen-0.019
girls-0.018
sets-0.018
Token been
Token ID587
Feature activation+8.93e-03

Pos logit contributions
run+0.070
ang+0.053
web+0.050
udge+0.048
aser+0.048
Neg logit contributions
girls-0.067
sets-0.064
 amen-0.064
 towns-0.062
 nod-0.061
Token shield
Token ID7614
Feature activation-4.08e-02

Pos logit contributions
run+0.243
ang+0.208
udge+0.194
web+0.186
ours+0.184
Neg logit contributions
girls-0.165
 towns-0.160
sets-0.155
 amen-0.151
 nod-0.148
Token that
Token ID326
Feature activation+6.77e-02

Pos logit contributions
girls+0.070
 towns+0.069
sets+0.068
 amen+0.066
 nod+0.065
Neg logit contributions
run-0.071
ang-0.061
udge-0.054
aser-0.053
web-0.053
Token he
Token ID339
Feature activation+6.10e-04

Pos logit contributions
run+0.168
ang+0.153
udge+0.143
aser+0.137
ours+0.136
Neg logit contributions
girls-0.067
 towns-0.067
sets-0.065
 nod-0.061
 amen-0.060
Token would
Token ID561
Feature activation-9.97e-02

Pos logit contributions
run+0.001
ang+0.001
udge+0.001
ours+0.001
ares+0.001
Neg logit contributions
girls-0.001
 voy-0.001
 towns-0.001
zo-0.001
 overcome-0.001
Token use
Token ID779
Feature activation+2.50e-02

Pos logit contributions
girls+0.120
 towns+0.119
sets+0.116
 amen+0.111
 conversations+0.108
Neg logit contributions
run-0.224
ang-0.198
udge-0.183
web-0.180
aser-0.180
Token to
Token ID284
Feature activation-3.41e-02

Pos logit contributions
run+0.052
ang+0.050
udge+0.045
ours+0.044
aser+0.043
Neg logit contributions
girls-0.032
 towns-0.031
 conversations-0.030
sets-0.029
 hearts-0.028
Token protect
Token ID1805
Feature activation-5.71e-02

Pos logit contributions
sets+0.046
girls+0.045
 towns+0.045
 amen+0.042
 conversations+0.042
Neg logit contributions
run-0.072
ang-0.062
udge-0.057
web-0.057
aser-0.056
Token himself
Token ID2241
Feature activation+3.47e-02

Pos logit contributions
 towns+0.057
girls+0.057
sets+0.051
 conversations+0.048
 hearts+0.047
Neg logit contributions
run-0.140
ang-0.131
ours-0.120
aser-0.117
udge-0.117
Token from
Token ID422
Feature activation+2.94e-02

Pos logit contributions
run+0.075
ang+0.066
udge+0.061
aser+0.060
web+0.060
Neg logit contributions
girls-0.044
 towns-0.044
sets-0.043
 amen-0.041
 nod-0.040
Token the
Token ID262
Feature activation-4.16e-02

Pos logit contributions
run+0.065
ang+0.060
udge+0.055
aser+0.054
it+0.053
Neg logit contributions
 towns-0.037
girls-0.036
 conversations-0.033
sets-0.033
 hearts-0.033
Token wind
Token ID2344
Feature activation-9.48e-02

Pos logit contributions
girls+0.057
sets+0.055
 amen+0.054
 nod+0.053
 towns+0.052
Neg logit contributions
run-0.079
ang-0.070
web-0.067
aser-0.067
udge-0.063
Token and
Token ID290
Feature activation+1.52e-02

Pos logit contributions
girls+0.124
 towns+0.114
sets+0.114
 narrative+0.114
Arthur+0.114
Neg logit contributions
run-0.113
ang-0.089
web-0.085
udge-0.084
it-0.079
Token scared
Token ID12008
Feature activation-2.20e-02

Pos logit contributions
run+0.028
ang+0.024
udge+0.022
aser+0.021
web+0.021
Neg logit contributions
girls-0.025
 towns-0.024
sets-0.024
 amen-0.023
 nod-0.023
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
girls+0.035
sets+0.034
 towns+0.033
 voy+0.033
 amen+0.033
Neg logit contributions
run-0.042
ang-0.034
udge-0.032
aser-0.030
it-0.030
Token They
Token ID1119
Feature activation-1.16e-01

Pos logit contributions
run+0.032
ang+0.029
udge+0.027
aser+0.027
web+0.026
Neg logit contributions
girls-0.011
 towns-0.011
sets-0.010
 amen-0.010
 nod-0.010
Token knew
Token ID2993
Feature activation+7.72e-02

Pos logit contributions
sets+0.172
girls+0.170
 towns+0.165
 conversations+0.160
 amen+0.155
Neg logit contributions
run-0.231
ang-0.198
udge-0.180
web-0.177
aser-0.177
Token they
Token ID484
Feature activation-7.31e-02

Pos logit contributions
run+0.174
ang+0.155
udge+0.143
aser+0.140
web+0.139
Neg logit contributions
girls-0.094
 towns-0.093
sets-0.090
 amen-0.088
 nod-0.085
Token did
Token ID750
Feature activation-5.27e-02

Pos logit contributions
 towns+0.091
girls+0.090
sets+0.089
 amen+0.086
 conversations+0.084
Neg logit contributions
run-0.162
ang-0.138
web-0.129
udge-0.128
aser-0.128
Token something
Token ID1223
Feature activation+3.79e-03

Pos logit contributions
girls+0.099
sets+0.098
 towns+0.097
 amen+0.093
 nod+0.090
Neg logit contributions
run-0.080
web-0.062
ang-0.060
udge-0.060
aser-0.055
Token bad
Token ID2089
Feature activation-8.47e-03

Pos logit contributions
run+0.008
ang+0.007
udge+0.006
aser+0.006
web+0.006
Neg logit contributions
girls-0.005
 towns-0.005
 amen-0.005
sets-0.005
 nod-0.005
Token.
Token ID13
Feature activation+1.21e-02

Pos logit contributions
 towns+0.012
girls+0.012
sets+0.011
 amen+0.011
 conversations+0.011
Neg logit contributions
run-0.017
ang-0.015
aser-0.014
web-0.014
udge-0.014
Token They
Token ID1119
Feature activation-1.15e-01

Pos logit contributions
run+0.030
ang+0.026
udge+0.025
aser+0.024
web+0.024
Neg logit contributions
girls-0.012
 towns-0.012
sets-0.012
 amen-0.011
 nod-0.011
Token couldn
Token ID3521
Feature activation+6.82e-02

Pos logit contributions
 towns+0.013
girls+0.013
sets+0.012
 amen+0.012
 nod+0.012
Neg logit contributions
run-0.022
ang-0.019
udge-0.018
aser-0.018
web-0.017
Token't
Token ID470
Feature activation+1.66e-02

Pos logit contributions
run+0.138
ang+0.121
udge+0.110
aser+0.108
web+0.108
Neg logit contributions
girls-0.097
 towns-0.096
sets-0.094
 amen-0.091
 nod-0.090
Token believe
Token ID1975
Feature activation+1.11e-03

Pos logit contributions
run+0.030
ang+0.025
udge+0.024
aser+0.023
web+0.022
Neg logit contributions
sets-0.028
girls-0.028
 towns-0.026
 conversations-0.025
 amen-0.025
Token it
Token ID340
Feature activation+5.93e-02

Pos logit contributions
run+0.002
ang+0.002
udge+0.002
aser+0.001
ours+0.001
Neg logit contributions
 towns-0.002
sets-0.002
 amen-0.002
 nod-0.002
girls-0.002
Token -
Token ID532
Feature activation+4.14e-02

Pos logit contributions
run+0.128
ang+0.114
udge+0.104
aser+0.102
ours+0.102
Neg logit contributions
 towns-0.076
girls-0.075
sets-0.075
 conversations-0.070
 nod-0.070
Token the
Token ID262
Feature activation-4.72e-02

Pos logit contributions
run+0.075
ang+0.067
udge+0.061
aser+0.059
web+0.057
Neg logit contributions
 towns-0.070
sets-0.068
 nod-0.067
 amen-0.065
girls-0.065
Token island
Token ID7022
Feature activation+2.09e-02

Pos logit contributions
girls+0.062
sets+0.056
 towns+0.054
 amen+0.053
 nod+0.052
Neg logit contributions
run-0.099
ang-0.089
web-0.082
aser-0.081
udge-0.080
Token was
Token ID373
Feature activation+1.15e-02

Pos logit contributions
run+0.043
ang+0.038
udge+0.035
aser+0.034
ours+0.033
Neg logit contributions
 towns-0.029
sets-0.028
girls-0.028
 nod-0.027
 conversations-0.026
Token even
Token ID772
Feature activation+5.96e-02

Pos logit contributions
run+0.023
ang+0.021
udge+0.019
aser+0.019
ours+0.019
Neg logit contributions
 towns-0.016
sets-0.016
girls-0.015
 conversations-0.015
 distance-0.015
Token more
Token ID517
Feature activation-9.55e-02

Pos logit contributions
run+0.139
ang+0.127
udge+0.121
aser+0.117
ours+0.114
Neg logit contributions
 towns-0.066
sets-0.061
girls-0.060
 distance-0.059
 amen-0.058
Token beautiful
Token ID4950
Feature activation+2.05e-02

Pos logit contributions
 towns+0.157
 distance+0.148
 conversations+0.148
sets+0.147
 nod+0.145
Neg logit contributions
run-0.178
ang-0.148
aser-0.139
ours-0.138
udge-0.135
Token just
Token ID655
Feature activation+2.87e-02

Pos logit contributions
run+0.067
ang+0.054
udge+0.051
ours+0.050
web+0.050
Neg logit contributions
girls-0.055
 towns-0.053
sets-0.052
 amen-0.050
 nod-0.049
Token the
Token ID262
Feature activation-4.55e-02

Pos logit contributions
run+0.057
ang+0.050
udge+0.045
aser+0.044
web+0.044
Neg logit contributions
 towns-0.042
girls-0.042
sets-0.041
 amen-0.040
 nod-0.040
Token right
Token ID826
Feature activation-4.67e-03

Pos logit contributions
girls+0.070
sets+0.066
 towns+0.065
 amen+0.063
 conversations+0.060
Neg logit contributions
run-0.086
ang-0.074
aser-0.071
udge-0.069
ours-0.067
Token amount
Token ID2033
Feature activation+1.27e-02

Pos logit contributions
 towns+0.007
girls+0.007
sets+0.006
 conversations+0.006
 nod+0.006
Neg logit contributions
run-0.009
ang-0.008
udge-0.007
aser-0.007
web-0.007
Token of
Token ID286
Feature activation+1.34e-02

Pos logit contributions
run+0.019
ang+0.017
web+0.015
udge+0.015
ours+0.015
Neg logit contributions
 towns-0.023
girls-0.022
sets-0.022
 hearts-0.021
 conversations-0.021
Token dessert
Token ID23084
Feature activation-3.78e-02

Pos logit contributions
run+0.027
ang+0.024
udge+0.022
aser+0.022
web+0.021
Neg logit contributions
girls-0.019
 towns-0.019
sets-0.018
 amen-0.018
 nod-0.017
Token.
Token ID13
Feature activation+6.55e-03

Pos logit contributions
girls+0.047
 towns+0.045
sets+0.045
 conversations+0.042
 amen+0.042
Neg logit contributions
run-0.083
ang-0.075
udge-0.068
web-0.067
aser-0.066
Token The
Token ID383
Feature activation-5.68e-02

Pos logit contributions
run+0.016
ang+0.015
web+0.014
aser+0.014
udge+0.014
Neg logit contributions
girls-0.006
 towns-0.006
sets-0.006
 nod-0.006
 amen-0.006
Token end
Token ID886
Feature activation-5.59e-02

Pos logit contributions
girls+0.061
sets+0.056
 towns+0.054
 amen+0.049
 conversations+0.047
Neg logit contributions
run-0.134
ang-0.120
aser-0.117
web-0.113
udge-0.112
Token.
Token ID13
Feature activation+5.55e-03

Pos logit contributions
girls+0.048
sets+0.044
 towns+0.044
 nod+0.044
 conversations+0.042
Neg logit contributions
run-0.140
ang-0.132
aser-0.122
udge-0.119
ours-0.119
Token<|endoftext|>
Token ID50256
Feature activation+4.04e-03

Pos logit contributions
run+0.013
ang+0.011
udge+0.010
aser+0.010
web+0.010
Neg logit contributions
girls-0.007
 towns-0.006
sets-0.006
 amen-0.006
 nod-0.006
Token.
Token ID13
Feature activation+2.69e-03

Pos logit contributions
girls+0.118
 towns+0.113
sets+0.110
 amen+0.108
 conversations+0.104
Neg logit contributions
run-0.182
ang-0.152
udge-0.141
web-0.140
aser-0.139
Token
Token ID198
Feature activation-1.92e-02

Pos logit contributions
run+0.006
ang+0.006
aser+0.005
udge+0.005
web+0.005
Neg logit contributions
 towns-0.003
girls-0.003
sets-0.003
 nod-0.003
 amen-0.003
Token
Token ID198
Feature activation-1.95e-02

Pos logit contributions
girls+0.012
 towns+0.012
 nod+0.011
 amen+0.011
 conversations+0.011
Neg logit contributions
run-0.052
ang-0.047
aser-0.047
udge-0.044
ours-0.044
TokenAs
Token ID1722
Feature activation-2.20e-03

Pos logit contributions
 towns+0.020
 overcome+0.018
sets+0.017
Arthur+0.017
girls+0.017
Neg logit contributions
run-0.049
ang-0.042
aser-0.041
ours-0.040
udge-0.040
Token she
Token ID673
Feature activation+1.38e-02

Pos logit contributions
 towns+0.003
girls+0.003
sets+0.003
 amen+0.003
 hearts+0.003
Neg logit contributions
run-0.005
ang-0.004
aser-0.004
udge-0.004
ours-0.004
Token watched
Token ID7342
Feature activation-8.82e-02

Pos logit contributions
run+0.028
ang+0.024
web+0.022
udge+0.021
aser+0.021
Neg logit contributions
 towns-0.020
girls-0.020
sets-0.020
 amen-0.019
 conversations-0.019
Token,
Token ID11
Feature activation-1.54e-02

Pos logit contributions
 towns+0.105
girls+0.102
 amen+0.100
sets+0.099
 nod+0.098
Neg logit contributions
run-0.201
ang-0.180
udge-0.166
aser-0.164
web-0.161
Token something
Token ID1223
Feature activation+2.58e-07

Pos logit contributions
 towns+0.022
sets+0.022
girls+0.021
 amen+0.021
 nod+0.021
Neg logit contributions
run-0.032
ang-0.028
udge-0.025
aser-0.025
web-0.024
Token strange
Token ID6283
Feature activation+3.98e-02

Feature not active on previous token; no predictions were affected.
Token happened
Token ID3022
Feature activation-1.29e-02

Pos logit contributions
run+0.073
ang+0.062
udge+0.057
web+0.057
aser+0.056
Neg logit contributions
girls-0.064
 towns-0.064
sets-0.061
 amen-0.060
 nod-0.059
Token.
Token ID13
Feature activation+1.47e-03

Pos logit contributions
girls+0.016
 towns+0.015
sets+0.015
 amen+0.014
 nod+0.014
Neg logit contributions
run-0.029
ang-0.026
udge-0.024
web-0.023
aser-0.023
Tokenie
Token ID494
Feature activation+6.25e-02

Pos logit contributions
girls+0.022
 towns+0.022
sets+0.021
 amen+0.021
 conversations+0.021
Neg logit contributions
run-0.025
ang-0.022
web-0.019
udge-0.018
aser-0.018
Token and
Token ID290
Feature activation+1.47e-02

Pos logit contributions
run+0.131
ang+0.115
udge+0.105
aser+0.103
web+0.103
Neg logit contributions
girls-0.084
 towns-0.083
sets-0.081
 amen-0.079
 nod-0.077
Token Fl
Token ID1610
Feature activation+3.92e-02

Pos logit contributions
run+0.031
ang+0.027
udge+0.025
aser+0.025
web+0.024
Neg logit contributions
girls-0.020
 towns-0.020
sets-0.019
 amen-0.019
 nod-0.018
Tokenutter
Token ID10381
Feature activation-2.05e-02

Pos logit contributions
run+0.057
ang+0.047
udge+0.041
aser+0.040
web+0.040
Neg logit contributions
girls-0.078
 towns-0.078
sets-0.076
 amen-0.075
 nod-0.074
Token became
Token ID2627
Feature activation-2.70e-02

Pos logit contributions
girls+0.025
 towns+0.025
sets+0.024
 amen+0.023
 conversations+0.022
Neg logit contributions
run-0.046
ang-0.041
udge-0.037
aser-0.037
web-0.037
Token friends
Token ID2460
Feature activation-1.28e-01

Pos logit contributions
girls+0.040
 towns+0.039
sets+0.039
 amen+0.039
 nod+0.037
Neg logit contributions
run-0.052
ang-0.046
udge-0.041
web-0.040
aser-0.039
Token and
Token ID290
Feature activation+1.75e-02

Pos logit contributions
girls+0.212
sets+0.211
 amen+0.211
 towns+0.206
 narrative+0.205
Neg logit contributions
run-0.234
ang-0.202
udge-0.170
it-0.170
web-0.166
Token flew
Token ID13112
Feature activation-1.01e-02

Pos logit contributions
run+0.035
ang+0.031
web+0.027
aser+0.026
ours+0.026
Neg logit contributions
 towns-0.023
 narrative-0.022
 amen-0.022
Arthur-0.022
sets-0.021
Token together
Token ID1978
Feature activation-8.06e-02

Pos logit contributions
 towns+0.021
 amen+0.021
girls+0.021
sets+0.021
 nod+0.020
Neg logit contributions
run-0.013
ang-0.011
udge-0.009
web-0.008
inn-0.008
Token in
Token ID287
Feature activation-2.26e-03

Pos logit contributions
girls+0.145
 towns+0.144
sets+0.141
 amen+0.139
 nod+0.134
Neg logit contributions
run-0.134
ang-0.113
udge-0.099
aser-0.096
web-0.096
Token the
Token ID262
Feature activation-5.12e-02

Pos logit contributions
 amen+0.003
sets+0.003
girls+0.003
 towns+0.003
 nod+0.003
Neg logit contributions
run-0.004
ang-0.004
udge-0.003
web-0.003
inn-0.003
Token to
Token ID284
Feature activation-3.41e-02

Pos logit contributions
run+0.038
ang+0.034
udge+0.031
web+0.031
aser+0.029
Neg logit contributions
girls-0.027
 towns-0.027
sets-0.026
 nod-0.026
 amen-0.026
Token have
Token ID423
Feature activation+4.15e-02

Pos logit contributions
girls+0.057
sets+0.056
 narrative+0.053
rob+0.052
 amen+0.051
Neg logit contributions
run-0.052
 here-0.044
udge-0.042
 nail-0.041
 else-0.040
Token victory
Token ID5373
Feature activation-5.75e-02

Pos logit contributions
run+0.069
ang+0.053
aser+0.048
web+0.048
ours+0.047
Neg logit contributions
girls-0.075
 towns-0.072
sets-0.071
 nod-0.070
 amen-0.068
Token.
Token ID13
Feature activation+1.92e-02

Pos logit contributions
girls+0.085
 towns+0.084
sets+0.082
 amen+0.079
 conversations+0.078
Neg logit contributions
run-0.113
ang-0.099
udge-0.091
aser-0.088
web-0.088
Token I
Token ID314
Feature activation+3.66e-02

Pos logit contributions
run+0.034
ang+0.030
udge+0.028
web+0.028
aser+0.027
Neg logit contributions
girls-0.031
 towns-0.031
 nod-0.030
 amen-0.030
sets-0.029
Token will
Token ID481
Feature activation-1.39e-01

Pos logit contributions
run+0.083
ang+0.075
web+0.070
aser+0.069
udge+0.069
Neg logit contributions
 nod-0.042
girls-0.042
 towns-0.040
 overcome-0.037
 amen-0.037
Token provide
Token ID2148
Feature activation-3.99e-02

Pos logit contributions
girls+0.202
sets+0.196
 towns+0.188
 amen+0.178
 conversations+0.177
Neg logit contributions
run-0.271
ang-0.228
udge-0.213
aser-0.212
earing-0.206
Token everything
Token ID2279
Feature activation-2.06e-02

Pos logit contributions
 towns+0.060
girls+0.059
 amen+0.058
 conversations+0.056
sets+0.056
Neg logit contributions
run-0.077
ang-0.070
udge-0.064
web-0.062
aser-0.060
Token you
Token ID345
Feature activation+1.84e-02

Pos logit contributions
 towns+0.032
sets+0.032
 amen+0.031
girls+0.031
 nod+0.030
Neg logit contributions
run-0.038
ang-0.033
udge-0.032
aser-0.028
web-0.028
Token need
Token ID761
Feature activation+3.97e-02

Pos logit contributions
run+0.035
ang+0.030
aser+0.027
udge+0.026
web+0.026
Neg logit contributions
girls-0.029
 towns-0.028
sets-0.028
 amen-0.027
 conversations-0.026
Token to
Token ID284
Feature activation-3.48e-02

Pos logit contributions
run+0.083
ang+0.074
udge+0.068
web+0.066
aser+0.066
Neg logit contributions
girls-0.053
 towns-0.053
sets-0.051
 amen-0.050
 conversations-0.049
Token plan
Token ID1410
Feature activation-4.04e-02

Pos logit contributions
girls+0.083
sets+0.080
 amen+0.077
 towns+0.076
 robes+0.073
Neg logit contributions
run-0.106
ang-0.095
aser-0.090
web-0.088
udge-0.084
Token worked
Token ID3111
Feature activation-3.61e-02

Pos logit contributions
girls+0.052
 towns+0.052
sets+0.050
 amen+0.049
 nod+0.048
Neg logit contributions
run-0.087
ang-0.077
udge-0.070
aser-0.068
web-0.068
Token perfectly
Token ID7138
Feature activation+9.27e-02

Pos logit contributions
girls+0.061
 towns+0.060
sets+0.059
 amen+0.058
 nod+0.057
Neg logit contributions
run-0.065
ang-0.054
udge-0.049
aser-0.048
web-0.048
Token!
Token ID0
Feature activation+3.32e-02

Pos logit contributions
run+0.197
ang+0.172
udge+0.158
aser+0.156
web+0.155
Neg logit contributions
girls-0.124
 towns-0.122
sets-0.119
 amen-0.116
 nod-0.113
Token The
Token ID383
Feature activation-6.13e-02

Pos logit contributions
run+0.077
ang+0.067
udge+0.063
ours+0.061
aser+0.061
Neg logit contributions
girls-0.039
 towns-0.038
sets-0.037
 amen-0.036
 nod-0.035
Token friends
Token ID2460
Feature activation-1.37e-01

Pos logit contributions
girls+0.095
sets+0.090
 amen+0.089
 towns+0.087
 conversations+0.082
Neg logit contributions
run-0.114
ang-0.101
aser-0.096
web-0.094
udge-0.092
Token reunited
Token ID35440
Feature activation-1.84e-02

Pos logit contributions
girls+0.178
 towns+0.172
sets+0.171
 amen+0.165
 conversations+0.161
Neg logit contributions
run-0.295
ang-0.258
udge-0.237
aser-0.236
web-0.234
Token with
Token ID351
Feature activation+7.41e-02

Pos logit contributions
girls+0.030
sets+0.030
 towns+0.030
 amen+0.029
Arthur+0.028
Neg logit contributions
run-0.033
ang-0.027
udge-0.025
web-0.025
aser-0.024
Token huge
Token ID3236
Feature activation+5.67e-02

Pos logit contributions
run+0.134
ang+0.103
udge+0.096
ours+0.089
aser+0.087
Neg logit contributions
sets-0.121
 amen-0.119
girls-0.118
 towns-0.113
 narrative-0.111
Token smiles
Token ID21845
Feature activation-1.45e-01

Pos logit contributions
run+0.105
ang+0.087
aser+0.085
ours+0.080
udge+0.080
Neg logit contributions
girls-0.089
sets-0.087
 amen-0.086
 Amen-0.081
 towns-0.080
Token on
Token ID319
Feature activation+2.00e-02

Pos logit contributions
girls+0.287
 towns+0.276
 amen+0.271
 conversations+0.262
sets+0.259
Neg logit contributions
run-0.219
ang-0.179
aser-0.160
web-0.155
it-0.154
Token the
Token ID262
Feature activation-4.19e-02

Pos logit contributions
run+0.028
ang+0.025
aser+0.023
udge+0.023
web+0.023
Neg logit contributions
 towns-0.018
girls-0.017
sets-0.017
 conversations-0.016
 hearts-0.016
Token toy
Token ID13373
Feature activation+1.20e-01

Pos logit contributions
 towns+0.063
 conversations+0.061
 hearts+0.059
girls+0.059
 feast+0.058
Neg logit contributions
run-0.087
ang-0.073
udge-0.070
ours-0.068
web-0.066
Token store
Token ID3650
Feature activation+3.25e-02

Pos logit contributions
run+0.268
ang+0.235
udge+0.219
aser+0.213
web+0.213
Neg logit contributions
girls-0.148
 towns-0.146
sets-0.142
 amen-0.135
 conversations-0.132
Token with
Token ID351
Feature activation+7.84e-02

Pos logit contributions
run+0.068
ang+0.059
udge+0.054
aser+0.054
web+0.053
Neg logit contributions
girls-0.044
 towns-0.043
sets-0.042
 amen-0.042
 nod-0.040
Token her
Token ID607
Feature activation+4.92e-02

Pos logit contributions
run+0.147
ang+0.128
udge+0.115
web+0.114
aser+0.114
Neg logit contributions
girls-0.123
 towns-0.122
sets-0.119
 amen-0.116
 conversations-0.115
Token friends
Token ID2460
Feature activation-1.28e-01

Pos logit contributions
run+0.108
ang+0.097
udge+0.091
web+0.089
ours+0.088
Neg logit contributions
girls-0.061
 towns-0.060
sets-0.057
 conversations-0.055
 nod-0.055
Token.
Token ID13
Feature activation+7.05e-03

Pos logit contributions
girls+0.160
 towns+0.158
sets+0.154
 amen+0.149
 conversations+0.145
Neg logit contributions
run-0.282
ang-0.247
udge-0.226
aser-0.224
web-0.222
Token She
Token ID1375
Feature activation-1.21e-02

Pos logit contributions
run+0.016
ang+0.014
web+0.013
aser+0.013
udge+0.013
Neg logit contributions
girls-0.008
 towns-0.008
sets-0.008
 nod-0.008
 amen-0.008
Token couldn
Token ID3521
Feature activation+6.80e-02

Pos logit contributions
 towns+0.014
girls+0.013
 nod+0.013
 amen+0.012
 overcome+0.012
Neg logit contributions
run-0.027
ang-0.025
udge-0.024
web-0.023
aser-0.023
Tokenâ
Token ID22940
Feature activation-9.33e-03

Pos logit contributions
run+0.130
ang+0.112
udge+0.102
aser+0.100
web+0.100
Neg logit contributions
girls-0.105
 towns-0.104
sets-0.101
 amen-0.098
 nod-0.097
Token
Token ID26391
Feature activation-3.80e-02

Pos logit contributions
girls+0.013
 towns+0.013
sets+0.013
 amen+0.012
 conversations+0.012
Neg logit contributions
run-0.019
ang-0.017
udge-0.016
earing-0.016
it-0.015
Token the
Token ID262
Feature activation-4.01e-02

Pos logit contributions
run+0.078
ang+0.067
udge+0.066
inn+0.063
earing+0.061
Neg logit contributions
 towns-0.052
sets-0.051
 hearts-0.050
girls-0.048
 shops-0.045
Token v
Token ID410
Feature activation+1.23e-01

Pos logit contributions
 towns+0.081
girls+0.080
sets+0.079
 hearts+0.078
 conversations+0.076
Neg logit contributions
run-0.060
ang-0.047
udge-0.043
aser-0.039
web-0.038
Tokenase
Token ID589
Feature activation+1.02e-01

Pos logit contributions
run+0.192
ang+0.148
ours+0.128
udge+0.126
earing+0.124
Neg logit contributions
girls-0.233
sets-0.224
 towns-0.221
 conversations-0.209
 overcome-0.205
Token,
Token ID11
Feature activation-4.00e-03

Pos logit contributions
run+0.228
ang+0.202
udge+0.187
aser+0.182
web+0.181
Neg logit contributions
girls-0.124
 towns-0.123
sets-0.121
 amen-0.115
 nod-0.113
Token it
Token ID340
Feature activation+6.01e-02

Pos logit contributions
 nod+0.006
 towns+0.006
sets+0.006
 hearts+0.006
 amen+0.006
Neg logit contributions
run-0.008
ang-0.007
udge-0.006
inn-0.006
idd-0.006
Token will
Token ID481
Feature activation-1.35e-01

Pos logit contributions
run+0.150
ang+0.134
udge+0.131
inn+0.124
ula+0.119
Neg logit contributions
girls-0.056
sets-0.054
 towns-0.052
 overcome-0.051
 nod-0.051
Token break
Token ID2270
Feature activation+3.22e-02

Pos logit contributions
girls+0.159
 towns+0.157
sets+0.152
 amen+0.147
 nod+0.145
Neg logit contributions
run-0.309
ang-0.274
udge-0.253
aser-0.248
web-0.247
Token,"
Token ID553
Feature activation-5.05e-02

Pos logit contributions
run+0.073
ang+0.066
udge+0.060
ours+0.060
aser+0.058
Neg logit contributions
 towns-0.037
girls-0.036
sets-0.036
 overcome-0.033
 hearts-0.033
Token said
Token ID531
Feature activation+2.79e-03

Pos logit contributions
 nod+0.068
sets+0.067
 amen+0.066
 towns+0.066
girls+0.065
Neg logit contributions
run-0.107
ang-0.094
udge-0.088
aser-0.084
web-0.081
Token her
Token ID607
Feature activation+5.10e-02

Pos logit contributions
run+0.005
ang+0.004
udge+0.004
aser+0.004
web+0.004
Neg logit contributions
 towns-0.005
girls-0.005
sets-0.005
 nod-0.004
 amen-0.004
Token mom
Token ID1995
Feature activation-4.75e-02

Pos logit contributions
run+0.117
ang+0.104
udge+0.097
web+0.093
ours+0.092
Neg logit contributions
 towns-0.059
girls-0.059
sets-0.058
 nod-0.057
 amen-0.056