TOP ACTIVATIONS
MAX = 4.470

She hoped it would not die . " Mom
was very sick and might die soon .
club called the Brave Adventure rs . The brave
fossils are made when animals die and get buried under the
he threw away did not die . They landed on his
lab door opened and a sweet little girl came in .
family or friends when they die . We have to put
sometimes animals get sick and die , just like people .
reached in and picked a sweet b erry . It tasted
beautiful flower that would never die . Tim knew this was
a shop and saw a sweet little heart shaped toy in
suffer , scream , and die . He pulled
opened it and found a sweet surprise inside .
that the turkey had to die for their dinner , but
, but it had a sweet heart of an engine .
girl was too young to die . <|endoftext|> Once upon a
like us . When they die , they don 't come
bird died because sometimes things die . It is part of
and cry and fight and die . They felt bored and
does not want them to die . She buzz es at

MIN ACTIVATIONS
MIN = -6.824

that it produced more av oc ados than ever before .
on , Lily loved av oc ados and would ask for
was an eager 3 - year - old . He liked
they could grow more av oc ados and share them with
there was a three - year - old who found a
Sammy would often share av oc ados and other y ummy
was an adventurous three - year - old and loved to
Every day two 3 - year - old children enjoyed playing
my was a three - year - old boy who loved
One day , three - year - old Sam was innoc
a very curious 3 - year - old boy . He
're the best rh in oc eros ever !" Tim my
Jack was a 3 - year old who liked to explore
was an eager 3 - year - old boy who loved
a very curious three - year - old girl . She
store to get more av oc ados . Gary looked at
my !" exclaimed three - year - old Jenny . She
Ben was a 3 - year - old boy who owned
was a curious 3 - year - old , eager to
a very curious three - year - old boy . He

INTERVAL 1.647 - 4.470
CONTAINS 2.236%

his mum had lots of fun playing with the new stuff
for her blocks and saw them in the corner of the
's favorite toy and it broke . Lily was so sad
He figured out they must be his friends . The young
They want to weigh themselves . Lily goes first .

INTERVAL -1.177 - 1.647
CONTAINS 90.400%

new place to explore . The moral of the
” asked Simon . “ It
forget how nice it was to have water when she was
The girl was filled with happy feelings as she
. But then Amanda started to fuss and demanding

INTERVAL -4.000 - -1.177
CONTAINS 7.345%

Can we play with our cars again ?" Their mom says
" Can we please see the baby ze br as
cried . Mom heard the noise and came to
. You are the best soldier ever ," he said .
the grass . He laughs and gets up .

INTERVAL -6.824 - -4.000
CONTAINS 0.019%

Lily were known as the brav est animals in the park
Sam was a three - year - old on a mission
that little island was always known as Lucy 's Island because
there was a three - year - old named Nancy .
and fast . The merry - go - round spins .
Token She
Token ID1375
Feature activation+9.81e-01

Pos logit contributions
 heads+0.003
zie+0.003
 myself+0.003
 herself+0.003
keeping+0.003
Neg logit contributions
itt-0.004
 Some-0.004
year-0.004
 of-0.004
op-0.004
Token hoped
Token ID10719
Feature activation-2.26e-01

Pos logit contributions
 myself+0.032
zie+0.032
 heads+0.031
 herself+0.031
keeping+0.031
Neg logit contributions
itt-0.043
 Some-0.041
 of-0.041
year-0.041
op-0.040
Token it
Token ID340
Feature activation+3.47e-02

Pos logit contributions
year+0.012
itt+0.011
op+0.011
ades+0.011
imsy+0.011
Neg logit contributions
 herself-0.007
 myself-0.006
 heads-0.006
zie-0.006
keeping-0.005
Token would
Token ID561
Feature activation-1.39e+00

Pos logit contributions
 heads+0.001
 herself+0.001
zie+0.001
 forth+0.001
 forget+0.001
Neg logit contributions
itt-0.001
 Some-0.001
year-0.001
op-0.001
 of-0.001
Token not
Token ID407
Feature activation+6.38e-01

Pos logit contributions
 of+0.048
 Some+0.043
itt+0.043
 fires+0.042
year+0.039
Neg logit contributions
zie-0.059
 heads-0.057
keeping-0.056
 forth-0.056
 myself-0.054
Token die
Token ID4656
Feature activation+4.47e+00

Pos logit contributions
zie+0.026
keeping+0.025
 heads+0.024
 myself+0.024
ishment+0.023
Neg logit contributions
 of-0.024
 fires-0.020
 Some-0.020
op-0.020
itt-0.020
Token.
Token ID13
Feature activation+4.27e-01

Pos logit contributions
 heads+0.168
zie+0.167
 myself+0.166
 herself+0.162
keeping+0.160
Neg logit contributions
itt-0.171
 of-0.164
year-0.160
 Some-0.159
 fires-0.157
Token \n
Token ID198
Feature activation+1.32e-01

Pos logit contributions
zie+0.018
 forth+0.018
 heads+0.018
keeping+0.017
 fuss+0.017
Neg logit contributions
 Some-0.016
itt-0.015
 of-0.014
op-0.014
year-0.013
Token \n
Token ID198
Feature activation+3.71e-01

Pos logit contributions
 heads+0.006
zie+0.005
 myself+0.005
 forth+0.005
keeping+0.005
Neg logit contributions
itt-0.005
 of-0.005
 Some-0.004
year-0.004
op-0.004
Token"
Token ID1
Feature activation+8.20e-01

Pos logit contributions
 heads+0.015
 myself+0.014
zie+0.014
 herself+0.014
keeping+0.014
Neg logit contributions
itt-0.014
 Some-0.013
year-0.013
 of-0.013
op-0.013
TokenMom
Token ID29252
Feature activation-8.05e-01

Pos logit contributions
 heads+0.029
 myself+0.029
 herself+0.029
zie+0.028
keeping+0.028
Neg logit contributions
itt-0.034
year-0.032
 of-0.032
 Some-0.032
op-0.031
Token was
Token ID373
Feature activation-2.94e-01

Pos logit contributions
itt+0.016
year+0.015
 Some+0.015
 fires+0.014
op+0.014
Neg logit contributions
 herself-0.017
 heads-0.017
zie-0.016
 myself-0.016
 forth-0.016
Token very
Token ID845
Feature activation+1.54e-01

Pos logit contributions
itt+0.014
op+0.014
ades+0.013
osh+0.013
 Some+0.013
Neg logit contributions
 herself-0.010
 myself-0.008
 forgetting-0.008
 mess-0.008
 forget-0.008
Token sick
Token ID6639
Feature activation+2.45e-01

Pos logit contributions
 fuss+0.005
 herself+0.005
 myself+0.005
 heads+0.005
 forget+0.005
Neg logit contributions
itt-0.006
 Some-0.006
 fires-0.006
op-0.006
glass-0.006
Token and
Token ID290
Feature activation+1.13e+00

Pos logit contributions
 herself+0.008
 heads+0.008
 myself+0.008
zie+0.008
keeping+0.008
Neg logit contributions
itt-0.011
year-0.010
 Some-0.010
op-0.010
 fires-0.009
Token might
Token ID1244
Feature activation+1.16e+00

Pos logit contributions
 heads+0.038
 herself+0.037
 myself+0.037
zie+0.037
 forth+0.036
Neg logit contributions
itt-0.050
op-0.046
year-0.046
 Some-0.046
 fires-0.043
Token die
Token ID4656
Feature activation+4.37e+00

Pos logit contributions
 heads+0.040
 myself+0.039
zie+0.039
 herself+0.038
keeping+0.038
Neg logit contributions
itt-0.048
 of-0.046
 Some-0.046
year-0.045
op-0.045
Token soon
Token ID2582
Feature activation+3.52e-01

Pos logit contributions
 heads+0.158
 myself+0.156
zie+0.155
 herself+0.154
keeping+0.150
Neg logit contributions
itt-0.178
 Some-0.166
year-0.166
 of-0.164
op-0.162
Token.
Token ID13
Feature activation+2.92e-01

Pos logit contributions
 heads+0.013
 myself+0.013
zie+0.013
 herself+0.012
keeping+0.012
Neg logit contributions
itt-0.014
 of-0.013
 Some-0.013
year-0.013
op-0.013
Token (space)
Token ID220
Feature activation+5.93e-01

Pos logit contributions
 myself+0.012
keeping+0.012
 heads+0.012
 forth+0.012
zie+0.011
Neg logit contributions
 Some-0.012
 of-0.011
itt-0.010
op-0.010
time-0.009
Token \n
Token ID198
Feature activation-3.33e-01

Pos logit contributions
 heads+0.027
 forth+0.026
 myself+0.026
zie+0.025
keeping+0.025
Neg logit contributions
itt-0.020
 of-0.020
 Some-0.020
op-0.018
year-0.018
Token \n
Token ID198
Feature activation+1.18e-02

Pos logit contributions
 of+0.011
itt+0.011
 Some+0.011
op+0.010
year+0.010
Neg logit contributions
 heads-0.015
 myself-0.014
zie-0.014
 forth-0.014
keeping-0.014
Token club
Token ID3430
Feature activation+1.12e-01

Pos logit contributions
itt+0.015
year+0.014
 Some+0.014
op+0.014
 of+0.014
Neg logit contributions
 heads-0.012
 myself-0.012
 herself-0.012
keeping-0.012
zie-0.011
Token called
Token ID1444
Feature activation-7.84e-01

Pos logit contributions
 myself+0.004
 herself+0.004
 heads+0.004
zie+0.004
keeping+0.004
Neg logit contributions
itt-0.005
 Some-0.004
year-0.004
op-0.004
 fires-0.004
Token the
Token ID262
Feature activation+6.11e-01

Pos logit contributions
itt+0.033
op+0.030
year+0.030
 Some+0.030
 of+0.027
Neg logit contributions
 herself-0.029
 myself-0.028
 heads-0.028
keeping-0.027
zie-0.027
Token Brave
Token ID22813
Feature activation+5.01e-01

Pos logit contributions
 heads+0.023
 myself+0.022
 herself+0.022
zie+0.022
keeping+0.021
Neg logit contributions
itt-0.025
year-0.023
op-0.023
 Some-0.023
 of-0.022
Token Adventure
Token ID9553
Feature activation+1.19e+00

Pos logit contributions
 heads+0.018
zie+0.017
 myself+0.017
 herself+0.017
keeping+0.016
Neg logit contributions
itt-0.021
 Some-0.020
year-0.019
op-0.019
 of-0.019
Tokenrs
Token ID3808
Feature activation+4.33e+00

Pos logit contributions
 heads+0.036
 myself+0.036
keeping+0.036
zie+0.035
 herself+0.035
Neg logit contributions
itt-0.055
 Some-0.053
op-0.050
year-0.050
 fires-0.049
Token.
Token ID13
Feature activation+2.14e-01

Pos logit contributions
 heads+0.158
 myself+0.157
 herself+0.155
zie+0.152
keeping+0.151
Neg logit contributions
itt-0.179
 Some-0.166
year-0.164
op-0.159
 of-0.158
Token \n
Token ID198
Feature activation-2.09e-01

Pos logit contributions
 heads+0.008
 myself+0.008
zie+0.008
 herself+0.008
keeping+0.008
Neg logit contributions
itt-0.008
 Some-0.008
 of-0.008
year-0.008
op-0.007
Token \n
Token ID198
Feature activation+1.73e-01

Pos logit contributions
itt+0.009
year+0.008
 Some+0.008
 fires+0.008
op+0.008
Neg logit contributions
 herself-0.007
 myself-0.007
zie-0.007
 heads-0.007
keeping-0.006
TokenThe
Token ID464
Feature activation-2.63e-01

Pos logit contributions
 myself+0.006
 herself+0.006
zie+0.006
 forth+0.006
 heads+0.005
Neg logit contributions
itt-0.007
op-0.006
 Some-0.006
oses-0.006
year-0.006
Token brave
Token ID14802
Feature activation-4.38e-01

Pos logit contributions
itt+0.011
year+0.010
 Some+0.010
 of+0.010
op+0.010
Neg logit contributions
 heads-0.010
 myself-0.010
zie-0.009
 herself-0.009
keeping-0.009
Token fossils
Token ID34066
Feature activation-4.50e-01

Pos logit contributions
itt+0.020
year+0.020
op+0.019
imsy+0.018
 Some+0.017
Neg logit contributions
 herself-0.015
 myself-0.014
 heads-0.014
zie-0.013
keeping-0.013
Token are
Token ID389
Feature activation+8.39e-01

Pos logit contributions
itt+0.018
year+0.017
 Some+0.017
op+0.017
 fires+0.016
Neg logit contributions
 heads-0.016
 herself-0.016
 myself-0.016
zie-0.015
keeping-0.015
Token made
Token ID925
Feature activation-1.32e+00

Pos logit contributions
 heads+0.032
 herself+0.031
 myself+0.031
zie+0.029
 fuss+0.028
Neg logit contributions
itt-0.034
op-0.032
year-0.031
 Some-0.031
 of-0.028
Token when
Token ID618
Feature activation+1.18e+00

Pos logit contributions
itt+0.036
op+0.031
year+0.031
 Some+0.031
 paramed+0.027
Neg logit contributions
 herself-0.069
 myself-0.068
 heads-0.066
zie-0.062
 mess-0.060
Token animals
Token ID4695
Feature activation+9.54e-01

Pos logit contributions
 heads+0.044
 myself+0.044
 herself+0.044
keeping+0.041
zie+0.041
Neg logit contributions
itt-0.047
year-0.046
op-0.043
 Some-0.043
 of-0.041
Token die
Token ID4656
Feature activation+4.17e+00

Pos logit contributions
 heads+0.032
 herself+0.031
 myself+0.030
 forth+0.029
zie+0.029
Neg logit contributions
itt-0.042
year-0.040
 Some-0.040
op-0.037
 fires-0.037
Token and
Token ID290
Feature activation+6.59e-01

Pos logit contributions
zie+0.158
 myself+0.158
 heads+0.157
 herself+0.155
keeping+0.151
Neg logit contributions
itt-0.158
 of-0.149
 Some-0.146
year-0.146
op-0.144
Token get
Token ID651
Feature activation+1.35e+00

Pos logit contributions
 heads+0.025
 myself+0.024
 herself+0.024
 forth+0.023
 fuss+0.023
Neg logit contributions
itt-0.027
op-0.025
year-0.025
 Some-0.024
 of-0.023
Token buried
Token ID11694
Feature activation+7.15e-01

Pos logit contributions
 heads+0.049
 herself+0.047
 fuss+0.044
 forth+0.044
 myself+0.043
Neg logit contributions
itt-0.059
op-0.057
year-0.052
 Some-0.052
ades-0.049
Token under
Token ID739
Feature activation-7.81e-01

Pos logit contributions
 heads+0.028
 herself+0.027
 myself+0.027
keeping+0.027
zie+0.026
Neg logit contributions
itt-0.028
 Some-0.026
year-0.026
op-0.026
 of-0.024
Token the
Token ID262
Feature activation+3.31e-01

Pos logit contributions
itt+0.033
year+0.032
op+0.031
 Some+0.029
imsy+0.029
Neg logit contributions
 herself-0.028
 heads-0.028
 myself-0.028
keeping-0.027
 fuss-0.025
Token he
Token ID339
Feature activation-7.94e-01

Pos logit contributions
 heads+0.023
 myself+0.022
zie+0.022
 herself+0.022
keeping+0.021
Neg logit contributions
itt-0.030
year-0.028
 Some-0.028
op-0.027
 of-0.027
Token threw
Token ID9617
Feature activation+7.69e-01

Pos logit contributions
itt+0.035
year+0.033
 Some+0.033
 of+0.032
op+0.032
Neg logit contributions
 heads-0.026
 myself-0.026
zie-0.025
 herself-0.025
keeping-0.024
Token away
Token ID1497
Feature activation+1.47e+00

Pos logit contributions
 heads+0.030
 herself+0.029
 myself+0.029
zie+0.028
keeping+0.028
Neg logit contributions
itt-0.030
op-0.028
year-0.028
 Some-0.027
 of-0.026
Token did
Token ID750
Feature activation-2.41e-01

Pos logit contributions
 herself+0.052
 heads+0.052
 myself+0.051
keeping+0.048
zie+0.047
Neg logit contributions
itt-0.065
 Some-0.060
op-0.060
year-0.059
imsy-0.054
Token not
Token ID407
Feature activation+2.18e-01

Pos logit contributions
itt+0.011
year+0.010
 Some+0.010
op+0.010
 fires+0.009
Neg logit contributions
 herself-0.008
 myself-0.008
 heads-0.008
keeping-0.008
zie-0.007
Token die
Token ID4656
Feature activation+4.13e+00

Pos logit contributions
 heads+0.009
zie+0.009
 myself+0.009
 forth+0.009
keeping+0.009
Neg logit contributions
 of-0.007
itt-0.007
 Some-0.007
year-0.007
 fires-0.007
Token.
Token ID13
Feature activation+2.78e-01

Pos logit contributions
 heads+0.154
 myself+0.152
zie+0.152
 herself+0.149
keeping+0.146
Neg logit contributions
itt-0.162
 of-0.151
year-0.150
 Some-0.150
op-0.147
Token They
Token ID1119
Feature activation+1.06e+00

Pos logit contributions
 heads+0.010
zie+0.010
 myself+0.010
 forth+0.010
 herself+0.010
Neg logit contributions
itt-0.011
 Some-0.011
 of-0.010
year-0.010
op-0.010
Token landed
Token ID11406
Feature activation+1.35e+00

Pos logit contributions
 myself+0.037
keeping+0.037
zie+0.037
 herself+0.035
 heads+0.034
Neg logit contributions
itt-0.043
 of-0.043
 Some-0.041
 fires-0.041
year-0.040
Token on
Token ID319
Feature activation-3.27e-01

Pos logit contributions
 herself+0.058
 forth+0.055
 myself+0.055
 heads+0.054
keeping+0.051
Neg logit contributions
 Some-0.049
op-0.043
itt-0.042
year-0.041
imsy-0.040
Token his
Token ID465
Feature activation-2.49e-01

Pos logit contributions
itt+0.014
year+0.014
op+0.014
imsy+0.013
 of+0.013
Neg logit contributions
 herself-0.012
 heads-0.011
 myself-0.011
keeping-0.011
zie-0.011
Token lab
Token ID2248
Feature activation-4.13e-01

Pos logit contributions
zie+0.003
 myself+0.003
keeping+0.003
 heads+0.003
 forth+0.003
Neg logit contributions
itt-0.003
 of-0.003
 Some-0.002
year-0.002
op-0.002
Token door
Token ID3420
Feature activation-1.43e-01

Pos logit contributions
itt+0.018
year+0.017
 Some+0.017
 fires+0.017
op+0.016
Neg logit contributions
 heads-0.014
zie-0.013
keeping-0.013
 herself-0.013
 forth-0.013
Token opened
Token ID4721
Feature activation+1.56e+00

Pos logit contributions
itt+0.008
 Some+0.007
year+0.007
 fires+0.007
op+0.007
Neg logit contributions
 heads-0.004
 forth-0.004
 herself-0.004
zie-0.004
 forget-0.003
Token and
Token ID290
Feature activation+1.09e+00

Pos logit contributions
 herself+0.049
 heads+0.049
 myself+0.049
keeping+0.046
zie+0.045
Neg logit contributions
itt-0.077
op-0.071
 Some-0.070
year-0.069
 of-0.063
Token a
Token ID257
Feature activation+1.18e+00

Pos logit contributions
 heads+0.046
 myself+0.046
zie+0.046
 herself+0.046
keeping+0.044
Neg logit contributions
itt-0.037
 Some-0.034
year-0.034
 of-0.034
op-0.034
Token sweet
Token ID6029
Feature activation+4.11e+00

Pos logit contributions
 fuss+0.046
 heads+0.045
 forget+0.043
 forth+0.043
 mess+0.042
Neg logit contributions
itt-0.044
year-0.041
 Some-0.040
op-0.039
 fires-0.038
Token little
Token ID1310
Feature activation+2.62e+00

Pos logit contributions
 heads+0.139
zie+0.135
 myself+0.134
 herself+0.133
 forth+0.130
Neg logit contributions
itt-0.178
year-0.166
 Some-0.166
op-0.162
 of-0.159
Token girl
Token ID2576
Feature activation+1.75e+00

Pos logit contributions
 heads+0.086
 myself+0.081
 forth+0.080
zie+0.079
 fuss+0.079
Neg logit contributions
itt-0.117
year-0.110
 Some-0.107
op-0.106
 fires-0.103
Token came
Token ID1625
Feature activation+2.32e-01

Pos logit contributions
 herself+0.073
 heads+0.073
zie+0.072
keeping+0.072
 forth+0.071
Neg logit contributions
itt-0.059
 Some-0.056
year-0.055
 fires-0.054
op-0.052
Token in
Token ID287
Feature activation+1.81e+00

Pos logit contributions
 herself+0.010
 myself+0.010
 heads+0.009
zie+0.009
keeping+0.009
Neg logit contributions
itt-0.009
 Some-0.008
year-0.008
op-0.008
 fires-0.007
Token.
Token ID13
Feature activation+3.03e-01

Pos logit contributions
 heads+0.064
 myself+0.063
 herself+0.063
zie+0.062
keeping+0.061
Neg logit contributions
itt-0.075
 Some-0.070
year-0.070
op-0.069
 of-0.068
Token family
Token ID1641
Feature activation+9.71e-01

Pos logit contributions
zie+0.044
 myself+0.044
 herself+0.043
 heads+0.043
keeping+0.042
Neg logit contributions
itt-0.046
 of-0.043
 Some-0.043
year-0.042
 fires-0.042
Token or
Token ID393
Feature activation+4.04e-01

Pos logit contributions
 heads+0.031
zie+0.031
 myself+0.031
 herself+0.030
keeping+0.029
Neg logit contributions
itt-0.042
 of-0.041
 Some-0.040
year-0.040
op-0.039
Token friends
Token ID2460
Feature activation+2.54e+00

Pos logit contributions
zie+0.015
 heads+0.015
 myself+0.015
 herself+0.014
keeping+0.014
Neg logit contributions
itt-0.016
 of-0.015
 Some-0.015
year-0.015
 fires-0.014
Token when
Token ID618
Feature activation+1.19e+00

Pos logit contributions
 heads+0.072
 myself+0.071
zie+0.070
 herself+0.070
keeping+0.068
Neg logit contributions
itt-0.123
year-0.116
 Some-0.116
 of-0.114
op-0.113
Token they
Token ID484
Feature activation+9.31e-01

Pos logit contributions
 heads+0.042
 myself+0.042
 herself+0.041
keeping+0.039
zie+0.039
Neg logit contributions
itt-0.050
year-0.048
op-0.047
 Some-0.046
 of-0.045
Token die
Token ID4656
Feature activation+4.10e+00

Pos logit contributions
 heads+0.031
 myself+0.031
zie+0.031
 herself+0.030
keeping+0.029
Neg logit contributions
itt-0.040
 Some-0.038
 of-0.038
year-0.038
op-0.037
Token.
Token ID13
Feature activation+4.17e-01

Pos logit contributions
zie+0.154
 myself+0.145
 herself+0.143
 heads+0.140
keeping+0.136
Neg logit contributions
 of-0.159
 fires-0.152
itt-0.151
op-0.144
 Some-0.142
Token We
Token ID775
Feature activation+1.61e-01

Pos logit contributions
 myself+0.014
 heads+0.014
 herself+0.014
zie+0.013
keeping+0.013
Neg logit contributions
itt-0.018
year-0.017
op-0.017
 of-0.017
 Some-0.016
Token have
Token ID423
Feature activation+1.27e+00

Pos logit contributions
 heads+0.005
 myself+0.005
 herself+0.005
zie+0.005
keeping+0.005
Neg logit contributions
itt-0.007
year-0.007
 Some-0.007
op-0.006
 of-0.006
Token to
Token ID284
Feature activation+8.61e-01

Pos logit contributions
 heads+0.064
 herself+0.062
 myself+0.062
keeping+0.061
 fuss+0.059
Neg logit contributions
itt-0.037
year-0.035
op-0.034
imsy-0.028
 Some-0.028
Token put
Token ID1234
Feature activation+7.23e-01

Pos logit contributions
zie+0.031
 heads+0.030
 myself+0.029
keeping+0.029
 forth+0.028
Neg logit contributions
itt-0.034
 of-0.033
 Some-0.031
 fires-0.031
year-0.031
Token sometimes
Token ID3360
Feature activation+9.17e-01

Pos logit contributions
 herself+0.010
 myself+0.009
zie+0.009
 heads+0.009
 forth+0.008
Neg logit contributions
itt-0.013
op-0.012
year-0.012
 Some-0.012
ades-0.011
Token animals
Token ID4695
Feature activation+8.83e-01

Pos logit contributions
 heads+0.033
 myself+0.033
 herself+0.033
zie+0.032
 forth+0.031
Neg logit contributions
itt-0.038
year-0.036
 Some-0.035
op-0.035
 of-0.034
Token get
Token ID651
Feature activation+1.61e+00

Pos logit contributions
 heads+0.031
 herself+0.030
 myself+0.030
zie+0.029
 forth+0.029
Neg logit contributions
itt-0.037
year-0.036
 Some-0.035
op-0.034
 fires-0.033
Token sick
Token ID6639
Feature activation+1.25e-01

Pos logit contributions
 heads+0.061
 herself+0.060
 myself+0.058
 forth+0.057
 fuss+0.056
Neg logit contributions
itt-0.068
op-0.065
year-0.061
 Some-0.059
ades-0.056
Token and
Token ID290
Feature activation+8.49e-01

Pos logit contributions
 heads+0.005
 herself+0.005
 myself+0.005
keeping+0.005
zie+0.005
Neg logit contributions
itt-0.005
year-0.005
 Some-0.005
op-0.004
 fires-0.004
Token die
Token ID4656
Feature activation+4.08e+00

Pos logit contributions
 heads+0.032
 myself+0.032
 herself+0.031
zie+0.030
 forth+0.030
Neg logit contributions
itt-0.034
year-0.032
op-0.032
 Some-0.031
 of-0.029
Token,
Token ID11
Feature activation+7.78e-01

Pos logit contributions
zie+0.159
 myself+0.158
 heads+0.156
 herself+0.153
keeping+0.152
Neg logit contributions
itt-0.148
 of-0.145
 Some-0.137
 fires-0.137
year-0.137
Token just
Token ID655
Feature activation+1.41e-01

Pos logit contributions
 heads+0.027
 myself+0.027
zie+0.027
 herself+0.027
keeping+0.026
Neg logit contributions
itt-0.032
year-0.030
 Some-0.030
 of-0.030
op-0.030
Token like
Token ID588
Feature activation+5.29e-01

Pos logit contributions
 heads+0.005
 myself+0.005
 herself+0.005
zie+0.005
 forth+0.005
Neg logit contributions
itt-0.006
year-0.005
op-0.005
 Some-0.005
 of-0.005
Token people
Token ID661
Feature activation+2.61e-01

Pos logit contributions
 herself+0.019
 myself+0.019
 heads+0.019
zie+0.018
keeping+0.018
Neg logit contributions
itt-0.022
op-0.020
year-0.020
 Some-0.020
 of-0.019
Token.
Token ID13
Feature activation+5.43e-01

Pos logit contributions
 heads+0.010
 herself+0.010
 myself+0.010
zie+0.010
keeping+0.009
Neg logit contributions
itt-0.010
year-0.010
 Some-0.009
op-0.009
 fires-0.009
Token reached
Token ID4251
Feature activation-2.13e-01

Pos logit contributions
 heads+0.017
 herself+0.017
zie+0.017
 myself+0.017
 forth+0.016
Neg logit contributions
itt-0.022
 Some-0.021
 fires-0.021
year-0.021
op-0.020
Token in
Token ID287
Feature activation+1.46e+00

Pos logit contributions
itt+0.007
year+0.006
 Some+0.006
op+0.006
 of+0.006
Neg logit contributions
 herself-0.009
 heads-0.009
zie-0.009
 myself-0.009
 forth-0.009
Token and
Token ID290
Feature activation+6.69e-01

Pos logit contributions
 heads+0.058
 herself+0.058
 myself+0.057
zie+0.057
keeping+0.055
Neg logit contributions
itt-0.055
 Some-0.050
year-0.050
op-0.050
 fires-0.048
Token picked
Token ID6497
Feature activation+1.67e+00

Pos logit contributions
 herself+0.024
 heads+0.024
 myself+0.023
zie+0.023
 forth+0.023
Neg logit contributions
itt-0.027
 Some-0.026
year-0.025
 fires-0.025
op-0.025
Token a
Token ID257
Feature activation+7.11e-01

Pos logit contributions
 herself+0.070
 heads+0.067
 myself+0.067
zie+0.066
keeping+0.064
Neg logit contributions
itt-0.060
year-0.055
op-0.055
 Some-0.054
 of-0.053
Token sweet
Token ID6029
Feature activation+4.04e+00

Pos logit contributions
 heads+0.026
 myself+0.026
 herself+0.026
zie+0.025
 fuss+0.025
Neg logit contributions
itt-0.028
year-0.026
 Some-0.026
 fires-0.025
op-0.025
Token b
Token ID275
Feature activation+1.32e+00

Pos logit contributions
 heads+0.134
 myself+0.131
 herself+0.130
zie+0.130
 forth+0.124
Neg logit contributions
itt-0.179
year-0.167
 Some-0.166
op-0.162
 fires-0.161
Tokenerry
Token ID6996
Feature activation+5.47e-01

Pos logit contributions
 heads+0.048
 myself+0.047
zie+0.047
 herself+0.047
keeping+0.045
Neg logit contributions
itt-0.053
 Some-0.050
year-0.050
 of-0.049
op-0.048
Token.
Token ID13
Feature activation+1.81e-01

Pos logit contributions
 heads+0.022
 myself+0.022
zie+0.022
 herself+0.021
keeping+0.021
Neg logit contributions
itt-0.020
year-0.018
 of-0.018
 Some-0.018
op-0.018
Token It
Token ID632
Feature activation+8.60e-02

Pos logit contributions
 heads+0.007
 myself+0.007
zie+0.007
 forth+0.007
keeping+0.007
Neg logit contributions
itt-0.007
 Some-0.007
 of-0.007
year-0.006
op-0.006
Token tasted
Token ID29187
Feature activation+6.74e-01

Pos logit contributions
keeping+0.004
 myself+0.004
 heads+0.004
zie+0.004
 forth+0.003
Neg logit contributions
itt-0.003
 of-0.003
year-0.003
 Some-0.003
op-0.003
Token beautiful
Token ID4950
Feature activation+1.56e+00

Pos logit contributions
 fuss+0.003
 heads+0.003
 mess+0.003
 myself+0.003
 forget+0.003
Neg logit contributions
itt-0.004
year-0.004
 Some-0.004
 fires-0.003
ug-0.003
Token flower
Token ID15061
Feature activation+5.14e-01

Pos logit contributions
 heads+0.054
 myself+0.054
 herself+0.053
zie+0.052
 forth+0.052
Neg logit contributions
itt-0.067
year-0.063
 Some-0.061
op-0.061
 fires-0.057
Token that
Token ID326
Feature activation+2.07e-01

Pos logit contributions
 heads+0.018
 myself+0.018
 herself+0.018
zie+0.018
keeping+0.017
Neg logit contributions
itt-0.021
year-0.020
 Some-0.020
op-0.019
 fires-0.019
Token would
Token ID561
Feature activation-1.69e+00

Pos logit contributions
 heads+0.008
 myself+0.008
zie+0.008
 herself+0.008
keeping+0.008
Neg logit contributions
itt-0.008
 Some-0.007
year-0.007
op-0.007
 of-0.007
Token never
Token ID1239
Feature activation-1.84e+00

Pos logit contributions
itt+0.072
 of+0.069
 Some+0.068
year+0.068
op+0.066
Neg logit contributions
 heads-0.057
zie-0.056
 myself-0.056
 herself-0.054
keeping-0.054
Token die
Token ID4656
Feature activation+4.03e+00

Pos logit contributions
 of+0.085
itt+0.083
 Some+0.081
year+0.080
 fires+0.077
Neg logit contributions
 heads-0.055
zie-0.055
 myself-0.054
keeping-0.052
 herself-0.052
Token.
Token ID13
Feature activation+1.15e-01

Pos logit contributions
 herself+0.132
 myself+0.132
 heads+0.129
keeping+0.126
zie+0.123
Neg logit contributions
itt-0.189
 Some-0.177
op-0.171
year-0.170
 fires-0.159
Token Tim
Token ID5045
Feature activation+9.75e-01

Pos logit contributions
 myself+0.004
 heads+0.003
 herself+0.003
zie+0.003
keeping+0.003
Neg logit contributions
itt-0.005
year-0.005
 fires-0.005
op-0.005
 of-0.005
Token knew
Token ID2993
Feature activation-1.20e+00

Pos logit contributions
 heads+0.037
 myself+0.037
zie+0.036
 herself+0.036
keeping+0.035
Neg logit contributions
itt-0.038
year-0.035
 Some-0.035
 of-0.035
op-0.034
Token this
Token ID428
Feature activation-9.81e-02

Pos logit contributions
itt+0.051
 Some+0.048
op+0.048
year+0.047
imsy+0.044
Neg logit contributions
 herself-0.047
 myself-0.043
 forth-0.041
zie-0.041
 heads-0.039
Token was
Token ID373
Feature activation-6.26e-01

Pos logit contributions
itt+0.004
year+0.004
 Some+0.004
op+0.004
 of+0.004
Neg logit contributions
 heads-0.003
 myself-0.003
 herself-0.003
zie-0.003
keeping-0.003
Token a
Token ID257
Feature activation+3.63e-01

Pos logit contributions
 herself+0.009
 heads+0.009
 myself+0.009
keeping+0.009
zie+0.009
Neg logit contributions
itt-0.008
year-0.007
op-0.007
 Some-0.007
 of-0.006
Token shop
Token ID6128
Feature activation+6.64e-01

Pos logit contributions
 fuss+0.013
 heads+0.013
 forget+0.013
 mess+0.013
 forth+0.013
Neg logit contributions
itt-0.015
year-0.014
 Some-0.014
op-0.013
 fires-0.013
Token and
Token ID290
Feature activation+3.37e-01

Pos logit contributions
 heads+0.025
keeping+0.024
 herself+0.023
 forget+0.022
 forth+0.022
Neg logit contributions
itt-0.028
 Some-0.026
 fires-0.025
br-0.024
ades-0.024
Token saw
Token ID2497
Feature activation-1.42e+00

Pos logit contributions
 herself+0.011
 heads+0.011
 myself+0.011
keeping+0.011
 forth+0.011
Neg logit contributions
itt-0.015
op-0.014
 Some-0.014
year-0.013
ades-0.013
Token a
Token ID257
Feature activation+1.05e+00

Pos logit contributions
itt+0.054
op+0.053
imsy+0.048
year+0.048
ug+0.046
Neg logit contributions
 herself-0.059
 myself-0.053
 heads-0.052
keeping-0.051
zie-0.048
Token sweet
Token ID6029
Feature activation+4.01e+00

Pos logit contributions
 fuss+0.039
 heads+0.039
 forget+0.037
 forth+0.037
 mess+0.036
Neg logit contributions
itt-0.041
year-0.039
 Some-0.039
 fires-0.037
op-0.036
Token little
Token ID1310
Feature activation+2.56e+00

Pos logit contributions
 heads+0.140
 myself+0.135
 herself+0.134
zie+0.134
 fuss+0.133
Neg logit contributions
itt-0.171
year-0.160
 Some-0.159
op-0.153
 fires-0.151
Token heart
Token ID2612
Feature activation+9.23e-01

Pos logit contributions
 heads+0.069
 forth+0.064
 fuss+0.064
 myself+0.062
 herself+0.060
Neg logit contributions
itt-0.132
year-0.126
 Some-0.121
op-0.119
 fires-0.117
Token shaped
Token ID14292
Feature activation-4.18e-01

Pos logit contributions
 herself+0.030
 heads+0.028
 forth+0.028
 myself+0.027
keeping+0.027
Neg logit contributions
itt-0.042
 Some-0.042
year-0.040
 fires-0.038
op-0.036
Token toy
Token ID13373
Feature activation+3.77e-01

Pos logit contributions
itt+0.018
op+0.016
 Some+0.015
year+0.015
oses+0.014
Neg logit contributions
 herself-0.016
 heads-0.016
 myself-0.015
 forth-0.015
 fuss-0.014
Token in
Token ID287
Feature activation+1.05e+00

Pos logit contributions
 heads+0.013
 herself+0.012
 myself+0.012
 forth+0.012
keeping+0.012
Neg logit contributions
itt-0.017
year-0.016
 Some-0.016
 fires-0.015
op-0.014
Token suffer
Token ID8659
Feature activation+2.01e+00

Pos logit contributions
 heads+0.008
 myself+0.008
 herself+0.008
zie+0.008
keeping+0.008
Neg logit contributions
itt-0.011
year-0.010
 Some-0.010
 of-0.010
op-0.010
Token,
Token ID11
Feature activation+2.59e-01

Pos logit contributions
 heads+0.072
 myself+0.072
 herself+0.071
zie+0.070
keeping+0.068
Neg logit contributions
itt-0.084
 Some-0.078
year-0.078
op-0.076
 of-0.075
Token scream
Token ID8196
Feature activation+7.79e-01

Pos logit contributions
 heads+0.007
 myself+0.007
 herself+0.007
zie+0.007
 forth+0.007
Neg logit contributions
itt-0.013
year-0.012
 Some-0.012
op-0.012
 of-0.012
Token,
Token ID11
Feature activation-3.87e-02

Pos logit contributions
 heads+0.029
 herself+0.028
 myself+0.028
zie+0.028
 forth+0.027
Neg logit contributions
itt-0.032
year-0.030
 Some-0.029
op-0.028
 of-0.028
Token and
Token ID290
Feature activation+1.63e-01

Pos logit contributions
itt+0.002
year+0.002
 Some+0.002
op+0.002
 of+0.002
Neg logit contributions
 heads-0.001
 herself-0.001
 myself-0.001
 forth-0.001
zie-0.001
Token die
Token ID4656
Feature activation+3.99e+00

Pos logit contributions
 heads+0.007
 myself+0.007
zie+0.007
 herself+0.006
keeping+0.006
Neg logit contributions
itt-0.006
 Some-0.005
year-0.005
 of-0.005
op-0.005
Token.
Token ID13
Feature activation-2.27e-02

Pos logit contributions
zie+0.166
 myself+0.163
 herself+0.157
 heads+0.157
keeping+0.157
Neg logit contributions
 of-0.135
itt-0.129
 fires-0.126
op-0.123
year-0.121
Token \n
Token ID198
Feature activation+2.27e-01

Pos logit contributions
itt+0.001
 Some+0.001
 of+0.001
year+0.001
op+0.001
Neg logit contributions
 heads-0.001
zie-0.001
 forth-0.001
 myself-0.001
keeping-0.001
Token \n
Token ID198
Feature activation+5.24e-01

Pos logit contributions
 myself+0.008
 heads+0.008
zie+0.008
 herself+0.008
keeping+0.007
Neg logit contributions
itt-0.009
year-0.009
 Some-0.009
 of-0.009
op-0.009
TokenHe
Token ID1544
Feature activation-1.31e-01

Pos logit contributions
 myself+0.019
 herself+0.018
zie+0.018
 heads+0.018
keeping+0.017
Neg logit contributions
itt-0.021
year-0.020
 Some-0.020
 of-0.020
op-0.019
Token pulled
Token ID5954
Feature activation+1.20e+00

Pos logit contributions
itt+0.006
 Some+0.006
op+0.006
year+0.006
 of+0.006
Neg logit contributions
 heads-0.004
zie-0.004
 herself-0.004
 forth-0.004
 myself-0.004
Token opened
Token ID4721
Feature activation+8.83e-01

Pos logit contributions
 herself+0.006
 heads+0.006
 myself+0.006
zie+0.006
 forth+0.006
Neg logit contributions
itt-0.009
year-0.008
op-0.008
 Some-0.008
 fires-0.008
Token it
Token ID340
Feature activation-4.62e-01

Pos logit contributions
 herself+0.031
 myself+0.030
 heads+0.030
zie+0.028
 forth+0.027
Neg logit contributions
itt-0.040
year-0.037
 Some-0.037
op-0.036
 of-0.035
Token and
Token ID290
Feature activation+7.47e-01

Pos logit contributions
itt+0.020
year+0.019
 Some+0.018
 fires+0.018
op+0.017
Neg logit contributions
 herself-0.017
 heads-0.017
 myself-0.016
zie-0.015
 forth-0.015
Token found
Token ID1043
Feature activation-2.69e-01

Pos logit contributions
 heads+0.028
 herself+0.028
 myself+0.027
zie+0.027
 forth+0.027
Neg logit contributions
itt-0.030
 Some-0.028
year-0.027
op-0.027
 of-0.027
Token a
Token ID257
Feature activation+8.30e-01

Pos logit contributions
itt+0.009
op+0.009
year+0.008
 Some+0.008
 fires+0.008
Neg logit contributions
 herself-0.012
 myself-0.011
 heads-0.011
zie-0.011
keeping-0.011
Token sweet
Token ID6029
Feature activation+3.97e+00

Pos logit contributions
 heads+0.035
 myself+0.034
zie+0.034
 herself+0.034
keeping+0.033
Neg logit contributions
itt-0.029
year-0.026
 Some-0.026
 of-0.026
op-0.026
Token surprise
Token ID5975
Feature activation+1.00e+00

Pos logit contributions
 myself+0.153
 heads+0.153
zie+0.152
 herself+0.148
keeping+0.147
Neg logit contributions
itt-0.148
 of-0.145
 Some-0.138
op-0.137
year-0.137
Token inside
Token ID2641
Feature activation+1.71e+00

Pos logit contributions
 heads+0.036
 herself+0.035
 myself+0.035
zie+0.035
keeping+0.034
Neg logit contributions
itt-0.041
 Some-0.038
year-0.038
op-0.037
 fires-0.037
Token.
Token ID13
Feature activation-2.54e-01

Pos logit contributions
 herself+0.048
 myself+0.046
keeping+0.045
zie+0.043
 heads+0.043
Neg logit contributions
itt-0.087
op-0.081
 Some-0.081
year-0.077
imsy-0.077
Token\n
Token ID198
Feature activation-6.59e-01

Pos logit contributions
itt+0.009
 Some+0.009
 of+0.008
op+0.008
year+0.008
Neg logit contributions
 heads-0.011
 myself-0.011
keeping-0.011
zie-0.011
 herself-0.010
Token\n
Token ID198
Feature activation-2.69e-01

Pos logit contributions
itt+0.026
 Some+0.024
 of+0.024
year+0.024
op+0.024
Neg logit contributions
 heads-0.025
 myself-0.024
zie-0.024
 herself-0.024
keeping-0.023
Token that
Token ID326
Feature activation-2.70e-02

Pos logit contributions
 of+0.018
itt+0.015
 Some+0.015
op+0.014
year+0.013
Neg logit contributions
keeping-0.031
 heads-0.030
 myself-0.030
zie-0.030
 forth-0.028
Token the
Token ID262
Feature activation+9.53e-02

Pos logit contributions
itt+0.001
year+0.001
 Some+0.001
 of+0.001
op+0.001
Neg logit contributions
 heads-0.001
 myself-0.001
zie-0.001
 herself-0.001
keeping-0.001
Token turkey
Token ID26876
Feature activation-8.90e-01

Pos logit contributions
 heads+0.005
zie+0.005
 myself+0.005
 herself+0.005
keeping+0.005
Neg logit contributions
itt-0.003
 of-0.002
 Some-0.002
year-0.002
 fires-0.002
Token had
Token ID550
Feature activation+3.61e-02

Pos logit contributions
itt+0.035
year+0.032
 Some+0.032
 of+0.032
op+0.031
Neg logit contributions
 heads-0.034
 myself-0.033
zie-0.033
 herself-0.033
keeping-0.032
Token to
Token ID284
Feature activation+1.32e+00

Pos logit contributions
 heads+0.001
 herself+0.001
 myself+0.001
zie+0.001
keeping+0.001
Neg logit contributions
itt-0.001
year-0.001
 Some-0.001
op-0.001
 of-0.001
Token die
Token ID4656
Feature activation+3.95e+00

Pos logit contributions
 heads+0.051
zie+0.050
keeping+0.050
 forth+0.049
 myself+0.048
Neg logit contributions
 of-0.047
itt-0.043
 fires-0.041
 Some-0.039
aster-0.039
Token for
Token ID329
Feature activation-2.10e-01

Pos logit contributions
 heads+0.160
zie+0.159
 myself+0.159
 herself+0.154
keeping+0.153
Neg logit contributions
itt-0.138
 of-0.132
year-0.128
 Some-0.127
op-0.126
Token their
Token ID511
Feature activation+6.41e-01

Pos logit contributions
itt+0.008
year+0.007
 Some+0.007
 of+0.007
op+0.007
Neg logit contributions
 heads-0.009
 myself-0.009
 herself-0.008
zie-0.008
keeping-0.008
Token dinner
Token ID8073
Feature activation+1.43e+00

Pos logit contributions
 heads+0.022
 herself+0.019
 myself+0.019
 fuss+0.019
keeping+0.019
Neg logit contributions
year-0.030
itt-0.030
 Some-0.029
op-0.028
ades-0.027
Token,
Token ID11
Feature activation+4.81e-01

Pos logit contributions
 heads+0.054
 myself+0.053
zie+0.053
 herself+0.052
keeping+0.051
Neg logit contributions
itt-0.055
year-0.052
 Some-0.052
 of-0.051
op-0.050
Token but
Token ID475
Feature activation+7.47e-01

Pos logit contributions
 heads+0.015
zie+0.014
 myself+0.014
 herself+0.014
keeping+0.014
Neg logit contributions
itt-0.022
 of-0.021
 Some-0.021
year-0.021
op-0.020
Token,
Token ID11
Feature activation+8.93e-01

Pos logit contributions
itt+0.020
year+0.018
op+0.017
 Some+0.016
ades+0.016
Neg logit contributions
 heads-0.012
 herself-0.010
 forth-0.010
 myself-0.010
 fuss-0.010
Token but
Token ID475
Feature activation+1.08e+00

Pos logit contributions
 herself+0.025
 myself+0.024
 heads+0.024
 forth+0.024
zie+0.024
Neg logit contributions
itt-0.046
year-0.042
op-0.041
 Some-0.040
ades-0.040
Token it
Token ID340
Feature activation+1.11e+00

Pos logit contributions
 herself+0.042
 myself+0.042
 heads+0.042
zie+0.041
 forth+0.041
Neg logit contributions
itt-0.043
year-0.038
op-0.038
 Some-0.038
 of-0.037
Token had
Token ID550
Feature activation+1.83e-01

Pos logit contributions
 heads+0.032
zie+0.032
 forth+0.030
 herself+0.030
 forget+0.029
Neg logit contributions
itt-0.055
 Some-0.051
year-0.050
op-0.049
ades-0.049
Token a
Token ID257
Feature activation+6.89e-01

Pos logit contributions
 herself+0.007
 heads+0.007
 myself+0.007
keeping+0.006
 forget+0.006
Neg logit contributions
itt-0.008
op-0.008
year-0.007
icker-0.007
ades-0.007
Token sweet
Token ID6029
Feature activation+3.93e+00

Pos logit contributions
 heads+0.026
 fuss+0.026
 mess+0.025
 forget+0.025
 forth+0.024
Neg logit contributions
itt-0.028
year-0.028
 Some-0.024
op-0.024
 of-0.024
Tokenheart
Token ID11499
Feature activation+1.58e+00

Pos logit contributions
 heads+0.122
 fuss+0.116
 herself+0.114
 forth+0.112
zie+0.110
Neg logit contributions
itt-0.191
year-0.184
 Some-0.173
op-0.166
ades-0.161
Token of
Token ID286
Feature activation-4.22e-01

Pos logit contributions
 herself+0.048
keeping+0.046
 myself+0.041
 heads+0.040
 fuss+0.040
Neg logit contributions
itt-0.076
 Some-0.073
ades-0.071
year-0.069
op-0.068
Token an
Token ID281
Feature activation-4.51e-02

Pos logit contributions
itt+0.018
year+0.016
op+0.016
ades+0.015
imsy+0.015
Neg logit contributions
 herself-0.016
 myself-0.016
 heads-0.015
 fuss-0.015
zie-0.014
Token engine
Token ID3113
Feature activation-7.96e-01

Pos logit contributions
itt+0.002
op+0.002
 Some+0.002
year+0.002
ades+0.002
Neg logit contributions
 myself-0.002
 heads-0.002
 herself-0.002
zie-0.001
 forth-0.001
Token.
Token ID13
Feature activation-9.65e-02

Pos logit contributions
itt+0.042
year+0.041
 Some+0.040
ades+0.036
op+0.035
Neg logit contributions
 herself-0.020
keeping-0.020
 heads-0.019
 forth-0.019
 myself-0.018
Token girl
Token ID2576
Feature activation+1.47e+00

Pos logit contributions
 heads+0.047
 forth+0.043
 myself+0.042
zie+0.042
 herself+0.041
Neg logit contributions
itt-0.108
year-0.100
op-0.100
 Some-0.099
 of-0.094
Token was
Token ID373
Feature activation-4.51e-01

Pos logit contributions
 heads+0.048
 herself+0.047
zie+0.045
 forth+0.045
 myself+0.045
Neg logit contributions
itt-0.067
year-0.062
 Some-0.062
op-0.059
 fires-0.059
Token too
Token ID1165
Feature activation+2.62e+00

Pos logit contributions
itt+0.020
op+0.019
 Some+0.019
year+0.018
ades+0.018
Neg logit contributions
 herself-0.016
 myself-0.015
zie-0.014
 heads-0.014
 fuss-0.013
Token young
Token ID1862
Feature activation+4.35e-01

Pos logit contributions
 heads+0.091
 myself+0.090
 herself+0.089
 fuss+0.086
zie+0.084
Neg logit contributions
itt-0.109
 Some-0.102
year-0.102
 fires-0.097
op-0.096
Token to
Token ID284
Feature activation+6.54e-01

Pos logit contributions
 heads+0.018
 myself+0.017
 herself+0.017
zie+0.017
keeping+0.017
Neg logit contributions
itt-0.016
year-0.015
 Some-0.015
 of-0.014
op-0.014
Token die
Token ID4656
Feature activation+3.92e+00

Pos logit contributions
 heads+0.024
 myself+0.024
zie+0.023
 herself+0.023
keeping+0.023
Neg logit contributions
itt-0.026
 of-0.024
 Some-0.024
year-0.024
op-0.024
Token.
Token ID13
Feature activation+3.15e-01

Pos logit contributions
zie+0.154
 heads+0.152
 myself+0.150
 herself+0.146
keeping+0.145
Neg logit contributions
itt-0.139
 of-0.137
 fires-0.131
year-0.131
op-0.128
Token<|endoftext|>
Token ID50256
Feature activation+1.37e+00

Pos logit contributions
 heads+0.014
 forth+0.014
 myself+0.013
zie+0.013
 fuss+0.013
Neg logit contributions
 Some-0.011
itt-0.011
 of-0.010
year-0.009
op-0.009
TokenOnce
Token ID7454
Feature activation-7.85e-02

Pos logit contributions
 heads+0.057
 myself+0.055
zie+0.054
 forth+0.054
 herself+0.052
Neg logit contributions
itt-0.051
 of-0.049
 Some-0.047
year-0.046
op-0.046
Token upon
Token ID2402
Feature activation-3.73e-01

Pos logit contributions
itt+0.003
 of+0.003
 Some+0.003
year+0.003
op+0.003
Neg logit contributions
 heads-0.003
 myself-0.003
keeping-0.003
zie-0.003
 herself-0.003
Token a
Token ID257
Feature activation-4.81e-01

Pos logit contributions
itt+0.012
op+0.011
year+0.011
 Some+0.011
 fires+0.010
Neg logit contributions
 herself-0.017
 myself-0.017
 heads-0.016
keeping-0.016
zie-0.015
Token like
Token ID588
Feature activation-3.46e-01

Pos logit contributions
itt+0.005
year+0.005
op+0.004
 Some+0.004
ades+0.004
Neg logit contributions
 heads-0.003
 herself-0.003
 myself-0.003
 forth-0.003
zie-0.003
Token us
Token ID514
Feature activation+3.33e-01

Pos logit contributions
itt+0.015
year+0.014
op+0.014
 Some+0.013
imsy+0.012
Neg logit contributions
 herself-0.013
 myself-0.012
 heads-0.012
keeping-0.011
zie-0.011
Token.
Token ID13
Feature activation-3.77e-01

Pos logit contributions
 heads+0.013
zie+0.012
 myself+0.012
 herself+0.012
keeping+0.012
Neg logit contributions
itt-0.013
 of-0.012
 Some-0.012
year-0.012
op-0.012
Token When
Token ID1649
Feature activation+1.19e+00

Pos logit contributions
itt+0.015
year+0.014
 Some+0.014
 of+0.014
op+0.014
Neg logit contributions
 heads-0.014
 myself-0.013
zie-0.013
 herself-0.013
keeping-0.013
Token they
Token ID484
Feature activation+7.88e-01

Pos logit contributions
 heads+0.046
 myself+0.045
 herself+0.045
zie+0.045
keeping+0.043
Neg logit contributions
itt-0.047
year-0.043
 Some-0.043
op-0.042
 of-0.042
Token die
Token ID4656
Feature activation+3.89e+00

Pos logit contributions
 myself+0.029
zie+0.029
 heads+0.029
keeping+0.029
 herself+0.028
Neg logit contributions
itt-0.031
 of-0.030
op-0.028
 Some-0.028
year-0.028
Token,
Token ID11
Feature activation+8.80e-01

Pos logit contributions
zie+0.162
 myself+0.160
 heads+0.160
 herself+0.157
keeping+0.155
Neg logit contributions
itt-0.131
 of-0.128
op-0.121
 fires-0.120
year-0.119
Token they
Token ID484
Feature activation+3.09e-01

Pos logit contributions
 herself+0.028
 myself+0.027
 heads+0.027
keeping+0.026
 forth+0.026
Neg logit contributions
itt-0.041
year-0.039
op-0.038
 Some-0.037
 of-0.036
Token don
Token ID836
Feature activation+1.42e+00

Pos logit contributions
 myself+0.011
zie+0.011
keeping+0.011
 heads+0.010
 herself+0.010
Neg logit contributions
itt-0.012
 of-0.012
 fires-0.012
 Some-0.012
op-0.011
Token't
Token ID470
Feature activation+8.89e-01

Pos logit contributions
 heads+0.063
zie+0.062
 myself+0.062
 herself+0.061
keeping+0.060
Neg logit contributions
itt-0.046
 of-0.043
 Some-0.042
year-0.042
op-0.041
Token come
Token ID1282
Feature activation-5.04e-02

Pos logit contributions
zie+0.037
 heads+0.035
 myself+0.035
keeping+0.035
 herself+0.034
Neg logit contributions
 of-0.035
 fires-0.028
itt-0.027
 Some-0.027
op-0.027
Token bird
Token ID6512
Feature activation+3.23e-01

Pos logit contributions
 heads+0.016
 myself+0.016
 herself+0.016
zie+0.015
 forth+0.015
Neg logit contributions
itt-0.014
year-0.014
op-0.013
 Some-0.013
 of-0.013
Token died
Token ID3724
Feature activation+2.35e+00

Pos logit contributions
 heads+0.011
 myself+0.011
 forth+0.011
zie+0.010
 herself+0.010
Neg logit contributions
itt-0.014
year-0.014
 Some-0.013
op-0.013
 fires-0.012
Token because
Token ID780
Feature activation+1.70e+00

Pos logit contributions
zie+0.101
 heads+0.099
 myself+0.097
 fuss+0.096
keeping+0.095
Neg logit contributions
 of-0.075
itt-0.072
 fires-0.069
year-0.066
 Some-0.065
Token sometimes
Token ID3360
Feature activation+1.31e+00

Pos logit contributions
 myself+0.056
 herself+0.055
 forth+0.046
 heads+0.045
keeping+0.045
Neg logit contributions
op-0.076
itt-0.076
ades-0.073
 Some-0.072
year-0.071
Token things
Token ID1243
Feature activation+1.49e+00

Pos logit contributions
 heads+0.050
zie+0.050
 myself+0.050
 herself+0.049
keeping+0.049
Neg logit contributions
itt-0.049
 of-0.046
 Some-0.046
year-0.045
 fires-0.045
Token die
Token ID4656
Feature activation+3.88e+00

Pos logit contributions
 heads+0.055
 myself+0.055
zie+0.055
 herself+0.054
keeping+0.053
Neg logit contributions
itt-0.058
 of-0.055
 Some-0.054
year-0.054
op-0.053
Token.
Token ID13
Feature activation+3.36e-01

Pos logit contributions
zie+0.156
 myself+0.153
 heads+0.149
 herself+0.148
keeping+0.146
Neg logit contributions
itt-0.137
 of-0.137
 fires-0.128
 Some-0.126
op-0.125
Token It
Token ID632
Feature activation+5.57e-01

Pos logit contributions
 myself+0.012
 herself+0.011
 heads+0.011
zie+0.011
keeping+0.011
Neg logit contributions
itt-0.014
year-0.013
op-0.013
 of-0.013
ades-0.013
Token is
Token ID318
Feature activation+5.05e-01

Pos logit contributions
 heads+0.020
 myself+0.020
 herself+0.019
zie+0.019
 forth+0.018
Neg logit contributions
itt-0.023
year-0.022
op-0.022
 Some-0.021
ades-0.020
Token part
Token ID636
Feature activation+5.98e-01

Pos logit contributions
 herself+0.018
 myself+0.018
 heads+0.018
zie+0.017
 forth+0.017
Neg logit contributions
itt-0.022
op-0.021
year-0.019
 Some-0.019
ades-0.018
Token of
Token ID286
Feature activation-1.46e+00

Pos logit contributions
 myself+0.040
 heads+0.040
 herself+0.040
zie+0.039
keeping+0.039
Neg logit contributions
itt-0.005
year-0.005
 Some-0.004
op-0.004
 fires-0.002
Token and
Token ID290
Feature activation-2.36e-01

Pos logit contributions
 herself+0.009
 heads+0.009
 forth+0.008
 myself+0.007
zie+0.007
Neg logit contributions
itt-0.016
op-0.014
 Some-0.014
year-0.014
imsy-0.013
Token cry
Token ID3960
Feature activation+2.33e+00

Pos logit contributions
itt+0.010
 of+0.010
 Some+0.009
year+0.009
 fires+0.009
Neg logit contributions
zie-0.008
 myself-0.008
 heads-0.008
 herself-0.008
keeping-0.007
Token and
Token ID290
Feature activation+7.07e-02

Pos logit contributions
 heads+0.074
 herself+0.073
 myself+0.068
 forth+0.066
zie+0.065
Neg logit contributions
itt-0.111
year-0.101
 Some-0.100
op-0.096
 fires-0.094
Token fight
Token ID1907
Feature activation+1.65e+00

Pos logit contributions
 myself+0.002
 heads+0.002
zie+0.002
 herself+0.002
keeping+0.002
Neg logit contributions
itt-0.003
 of-0.003
 Some-0.003
year-0.003
op-0.003
Token and
Token ID290
Feature activation+3.95e-01

Pos logit contributions
 heads+0.057
 herself+0.055
 myself+0.055
zie+0.054
 forth+0.052
Neg logit contributions
itt-0.073
year-0.067
 Some-0.066
op-0.066
 of-0.064
Token die
Token ID4656
Feature activation+3.86e+00

Pos logit contributions
 heads+0.014
 myself+0.014
zie+0.014
 herself+0.014
keeping+0.013
Neg logit contributions
itt-0.016
year-0.015
 Some-0.015
 of-0.015
op-0.015
Token.
Token ID13
Feature activation+2.60e-01

Pos logit contributions
 myself+0.141
 heads+0.141
zie+0.141
 herself+0.138
keeping+0.135
Neg logit contributions
itt-0.153
 of-0.144
year-0.143
 Some-0.143
op-0.140
Token They
Token ID1119
Feature activation+1.16e+00

Pos logit contributions
 heads+0.009
 myself+0.009
zie+0.009
 herself+0.009
keeping+0.009
Neg logit contributions
itt-0.011
 Some-0.010
 of-0.010
year-0.010
op-0.010
Token felt
Token ID2936
Feature activation-1.63e-01

Pos logit contributions
 myself+0.047
zie+0.045
keeping+0.044
 herself+0.042
 heads+0.042
Neg logit contributions
itt-0.044
 of-0.043
year-0.042
 Some-0.042
 fires-0.039
Token bored
Token ID17533
Feature activation-1.37e+00

Pos logit contributions
itt+0.007
op+0.006
year+0.006
 Some+0.006
 of+0.006
Neg logit contributions
 herself-0.006
 heads-0.006
 myself-0.006
 fuss-0.005
zie-0.005
Token and
Token ID290
Feature activation-4.28e-01

Pos logit contributions
itt+0.061
year+0.055
op+0.054
 Some+0.054
imsy+0.052
Neg logit contributions
 herself-0.047
 heads-0.046
 forth-0.044
 myself-0.044
 fuss-0.043
Token does
Token ID857
Feature activation-9.99e-02

Pos logit contributions
 Some+0.017
 of+0.017
itt+0.016
 fires+0.016
year+0.015
Neg logit contributions
zie-0.014
keeping-0.014
 myself-0.014
 forth-0.013
 heads-0.013
Token not
Token ID407
Feature activation+2.53e-01

Pos logit contributions
itt+0.004
year+0.004
op+0.004
 Some+0.004
 of+0.004
Neg logit contributions
 herself-0.003
 myself-0.003
 heads-0.003
 fuss-0.003
zie-0.003
Token want
Token ID765
Feature activation+3.50e-02

Pos logit contributions
zie+0.012
keeping+0.011
 heads+0.011
 forth+0.011
 myself+0.011
Neg logit contributions
 of-0.008
 Some-0.007
 fires-0.007
itt-0.007
year-0.007
Token them
Token ID606
Feature activation+1.24e+00

Pos logit contributions
 herself+0.002
 heads+0.002
 myself+0.002
zie+0.002
keeping+0.002
Neg logit contributions
itt-0.001
year-0.001
op-0.001
 Some-0.001
imsy-0.001
Token to
Token ID284
Feature activation+1.02e+00

Pos logit contributions
 herself+0.065
 myself+0.064
 heads+0.063
keeping+0.061
zie+0.060
Neg logit contributions
itt-0.035
year-0.031
 Some-0.029
op-0.028
 fires-0.026
Token die
Token ID4656
Feature activation+3.86e+00

Pos logit contributions
keeping+0.041
zie+0.041
 heads+0.039
 forth+0.038
ishment+0.038
Neg logit contributions
 of-0.039
 Some-0.033
 fires-0.031
itt-0.031
aster-0.030
Token.
Token ID13
Feature activation+8.94e-02

Pos logit contributions
zie+0.162
 myself+0.155
 heads+0.155
keeping+0.149
 forth+0.148
Neg logit contributions
 of-0.130
itt-0.127
 fires-0.122
year-0.119
 Some-0.118
Token She
Token ID1375
Feature activation+7.83e-01

Pos logit contributions
 heads+0.003
zie+0.003
 myself+0.003
 herself+0.003
keeping+0.003
Neg logit contributions
itt-0.004
 Some-0.004
 of-0.004
year-0.004
op-0.004
Token buzz
Token ID14713
Feature activation+5.03e-01

Pos logit contributions
keeping+0.024
zie+0.023
 myself+0.022
 herself+0.022
 forth+0.021
Neg logit contributions
itt-0.037
 Some-0.036
 of-0.035
 fires-0.034
year-0.034
Tokenes
Token ID274
Feature activation-2.91e-01

Pos logit contributions
 heads+0.020
 herself+0.019
 myself+0.019
zie+0.017
 forth+0.017
Neg logit contributions
itt-0.019
year-0.018
op-0.018
 fires-0.017
 Some-0.017
Token at
Token ID379
Feature activation+4.41e-01

Pos logit contributions
itt+0.015
year+0.014
op+0.014
imsy+0.013
ades+0.013
Neg logit contributions
 herself-0.009
 heads-0.008
 myself-0.008
zie-0.007
 Holly-0.007
Token that
Token ID326
Feature activation-7.13e-01

Pos logit contributions
itt+0.024
 Some+0.022
 of+0.021
year+0.021
op+0.021
Neg logit contributions
 heads-0.053
 myself-0.052
zie-0.052
 herself-0.052
keeping-0.051
Token it
Token ID340
Feature activation-8.52e-01

Pos logit contributions
itt+0.031
op+0.029
year+0.029
 Some+0.029
ades+0.027
Neg logit contributions
 herself-0.025
 myself-0.025
 forth-0.024
 heads-0.024
zie-0.023
Token produced
Token ID4635
Feature activation+5.14e-01

Pos logit contributions
itt+0.032
 of+0.030
year+0.030
 Some+0.029
op+0.028
Neg logit contributions
 heads-0.034
 myself-0.033
zie-0.033
 herself-0.033
keeping-0.032
Token more
Token ID517
Feature activation-5.95e-01

Pos logit contributions
 herself+0.022
 myself+0.021
 heads+0.020
keeping+0.019
zie+0.019
Neg logit contributions
itt-0.019
op-0.018
year-0.017
 Some-0.017
 fires-0.016
Token av
Token ID1196
Feature activation-2.55e+00

Pos logit contributions
itt+0.021
op+0.019
year+0.019
 Some+0.019
 fires+0.018
Neg logit contributions
 myself-0.025
 herself-0.025
 heads-0.025
zie-0.024
keeping-0.024
Tokenoc
Token ID420
Feature activation-6.71e+00

Pos logit contributions
itt+0.050
 Some+0.047
year+0.046
op+0.042
 of+0.042
Neg logit contributions
 myself-0.142
 heads-0.141
 herself-0.139
keeping-0.139
zie-0.138
Tokenados
Token ID22484
Feature activation-3.06e-01

Pos logit contributions
year+0.383
 Some+0.381
itt+0.377
 of+0.376
op+0.373
Neg logit contributions
 myself-0.127
 heads-0.122
 herself-0.115
keeping-0.113
zie-0.104
Token than
Token ID621
Feature activation+1.41e-01

Pos logit contributions
itt+0.011
 of+0.010
year+0.010
 Some+0.010
op+0.010
Neg logit contributions
 heads-0.012
 myself-0.012
zie-0.012
 herself-0.012
keeping-0.012
Token ever
Token ID1683
Feature activation-3.55e-01

Pos logit contributions
 herself+0.005
 myself+0.005
keeping+0.005
 heads+0.005
 forget+0.005
Neg logit contributions
op-0.006
itt-0.006
 Some-0.006
year-0.006
ades-0.005
Token before
Token ID878
Feature activation+1.57e+00

Pos logit contributions
itt+0.014
 of+0.014
year+0.013
 Some+0.013
 fires+0.013
Neg logit contributions
 heads-0.013
zie-0.013
 myself-0.013
 herself-0.012
keeping-0.012
Token.
Token ID13
Feature activation+1.49e-01

Pos logit contributions
 heads+0.058
zie+0.057
 myself+0.056
 herself+0.055
keeping+0.054
Neg logit contributions
itt-0.063
 of-0.059
 Some-0.059
year-0.058
op-0.057
Token on
Token ID319
Feature activation-1.80e+00

Pos logit contributions
itt+0.040
 Some+0.039
year+0.039
ades+0.039
 fires+0.037
Neg logit contributions
 forth-0.063
 herself-0.061
 myself-0.058
zie-0.057
 heads-0.056
Token,
Token ID11
Feature activation-8.40e-01

Pos logit contributions
 of+0.046
itt+0.042
op+0.038
time+0.036
 Some+0.034
Neg logit contributions
keeping-0.100
 heads-0.098
 myself-0.095
zie-0.092
ishment-0.091
Token Lily
Token ID20037
Feature activation+6.02e-01

Pos logit contributions
itt+0.045
 of+0.044
 Some+0.044
year+0.043
op+0.042
Neg logit contributions
 heads-0.020
 myself-0.019
zie-0.019
keeping-0.018
 herself-0.018
Token loved
Token ID6151
Feature activation-1.12e+00

Pos logit contributions
 heads+0.025
 myself+0.025
zie+0.025
 herself+0.025
keeping+0.024
Neg logit contributions
itt-0.021
 of-0.019
 Some-0.019
year-0.019
op-0.019
Token av
Token ID1196
Feature activation-2.49e+00

Pos logit contributions
itt+0.037
year+0.036
op+0.034
imsy+0.033
 Some+0.033
Neg logit contributions
 herself-0.052
 myself-0.049
 forth-0.048
 heads-0.048
 fuss-0.046
Tokenoc
Token ID420
Feature activation-6.61e+00

Pos logit contributions
itt+0.035
 of+0.028
op+0.027
 Some+0.026
year+0.026
Neg logit contributions
 heads-0.158
zie-0.156
 myself-0.156
 herself-0.155
keeping-0.153
Tokenados
Token ID22484
Feature activation-7.94e-01

Pos logit contributions
itt+0.359
year+0.348
 Some+0.346
 of+0.345
op+0.339
Neg logit contributions
 heads-0.146
 myself-0.145
 herself-0.140
zie-0.139
keeping-0.134
Token and
Token ID290
Feature activation-3.09e-01

Pos logit contributions
itt+0.028
 of+0.026
year+0.026
 Some+0.026
op+0.026
Neg logit contributions
 heads-0.033
 myself-0.032
zie-0.032
 herself-0.031
keeping-0.031
Token would
Token ID561
Feature activation-2.17e+00

Pos logit contributions
itt+0.011
year+0.010
 Some+0.010
 of+0.010
op+0.010
Neg logit contributions
 heads-0.012
 myself-0.012
 herself-0.012
zie-0.012
keeping-0.012
Token ask
Token ID1265
Feature activation-8.94e-02

Pos logit contributions
itt+0.056
 of+0.054
 Some+0.053
year+0.049
 fires+0.048
Neg logit contributions
keeping-0.109
 heads-0.108
zie-0.106
 myself-0.106
 forth-0.103
Token for
Token ID329
Feature activation-6.33e-01

Pos logit contributions
itt+0.004
op+0.004
 Some+0.004
year+0.004
ades+0.004
Neg logit contributions
 herself-0.003
 myself-0.003
 heads-0.003
 forth-0.002
zie-0.002
Token was
Token ID373
Feature activation-1.26e+00

Pos logit contributions
 of+0.079
itt+0.075
 fires+0.073
op+0.071
 Some+0.071
Neg logit contributions
 myself-0.091
keeping-0.088
 heads-0.088
 fuss-0.085
 herself-0.084
Token an
Token ID281
Feature activation-1.39e+00

Pos logit contributions
ades+0.057
op+0.056
itt+0.056
ork+0.052
ves+0.051
Neg logit contributions
 herself-0.044
 myself-0.039
 forth-0.038
zie-0.035
 Daisy-0.034
Token eager
Token ID11069
Feature activation-2.05e+00

Pos logit contributions
itt+0.060
op+0.055
 Some+0.053
ades+0.053
year+0.050
Neg logit contributions
 myself-0.050
 heads-0.050
zie-0.050
 herself-0.049
 forth-0.048
Token 3
Token ID513
Feature activation-3.66e+00

Pos logit contributions
itt+0.087
year+0.080
ades+0.078
 Some+0.077
 fires+0.075
Neg logit contributions
 heads-0.074
zie-0.069
 herself-0.069
 forth-0.068
 myself-0.068
Token-
Token ID12
Feature activation-3.97e+00

Pos logit contributions
itt+0.098
 Some+0.094
imsy+0.087
op+0.084
year+0.083
Neg logit contributions
 heads-0.179
keeping-0.176
zie-0.174
 herself-0.174
 myself-0.173
Tokenyear
Token ID1941
Feature activation-6.61e+00

Pos logit contributions
 Some+0.091
ifying+0.086
 fires+0.080
ades+0.077
 Fire+0.072
Neg logit contributions
 herself-0.213
 myself-0.208
keeping-0.208
zie-0.191
riet-0.189
Token-
Token ID12
Feature activation-3.68e+00

Pos logit contributions
 Some+0.165
itt+0.153
Will+0.139
 fires+0.132
ifying+0.131
Neg logit contributions
 heads-0.350
 myself-0.324
keeping-0.323
 herself-0.317
zie-0.307
Tokenold
Token ID727
Feature activation-2.20e+00

Pos logit contributions
ifying+0.121
 Some+0.121
 fires+0.110
imsy+0.107
op+0.103
Neg logit contributions
 herself-0.163
 myself-0.159
keeping-0.159
zie-0.145
riet-0.137
Token.
Token ID13
Feature activation-6.37e-01

Pos logit contributions
itt+0.092
 Some+0.085
year+0.084
op+0.084
 of+0.082
Neg logit contributions
 heads-0.077
 myself-0.076
 herself-0.075
zie-0.075
keeping-0.074
Token He
Token ID679
Feature activation-9.47e-01

Pos logit contributions
itt+0.023
 Some+0.022
year+0.022
 of+0.022
op+0.021
Neg logit contributions
 heads-0.026
 myself-0.025
zie-0.025
 herself-0.025
keeping-0.024
Token liked
Token ID8288
Feature activation-1.21e+00

Pos logit contributions
itt+0.040
 Some+0.036
year+0.036
op+0.035
 fires+0.035
Neg logit contributions
 heads-0.033
 herself-0.033
zie-0.033
 myself-0.032
 forth-0.031
Token they
Token ID484
Feature activation+3.10e-01

Pos logit contributions
 herself+0.011
 myself+0.010
 forth+0.009
 forget+0.009
 heads+0.009
Neg logit contributions
itt-0.021
op-0.020
year-0.020
ades-0.019
imsy-0.018
Token could
Token ID714
Feature activation-5.69e-01

Pos logit contributions
 heads+0.013
 myself+0.013
zie+0.013
 herself+0.013
keeping+0.013
Neg logit contributions
itt-0.011
 Some-0.010
year-0.010
 of-0.009
op-0.009
Token grow
Token ID1663
Feature activation+2.66e+00

Pos logit contributions
itt+0.023
 of+0.023
 Some+0.022
op+0.021
year+0.021
Neg logit contributions
zie-0.020
 heads-0.020
keeping-0.020
 myself-0.019
 forth-0.019
Token more
Token ID517
Feature activation-1.91e-01

Pos logit contributions
 herself+0.105
 myself+0.101
 heads+0.094
zie+0.088
 forth+0.088
Neg logit contributions
itt-0.114
op-0.108
 Some-0.099
year-0.097
imsy-0.093
Token av
Token ID1196
Feature activation-2.12e+00

Pos logit contributions
itt+0.008
op+0.008
year+0.008
 Some+0.008
imsy+0.007
Neg logit contributions
 myself-0.007
 herself-0.007
 heads-0.007
 forth-0.006
keeping-0.006
Tokenoc
Token ID420
Feature activation-6.61e+00

Pos logit contributions
itt+0.037
year+0.032
 Some+0.031
 of+0.030
op+0.029
Neg logit contributions
 heads-0.126
 myself-0.125
zie-0.124
 herself-0.123
keeping-0.122
Tokenados
Token ID22484
Feature activation+1.49e-01

Pos logit contributions
year+0.379
 of+0.377
 Some+0.376
itt+0.371
op+0.365
Neg logit contributions
 myself-0.131
 heads-0.125
 herself-0.118
keeping-0.114
zie-0.104
Token and
Token ID290
Feature activation-5.21e-01

Pos logit contributions
 heads+0.006
 myself+0.006
 herself+0.006
zie+0.006
keeping+0.006
Neg logit contributions
itt-0.005
year-0.005
 Some-0.005
 of-0.005
op-0.005
Token share
Token ID2648
Feature activation+9.95e-01

Pos logit contributions
itt+0.022
year+0.021
 Some+0.021
op+0.021
 of+0.021
Neg logit contributions
 heads-0.018
 myself-0.017
 herself-0.017
zie-0.017
keeping-0.017
Token them
Token ID606
Feature activation-7.73e-01

Pos logit contributions
 herself+0.036
 myself+0.036
 heads+0.035
keeping+0.033
 fuss+0.032
Neg logit contributions
itt-0.042
year-0.040
op-0.040
 Some-0.038
imsy-0.036
Token with
Token ID351
Feature activation-1.60e-01

Pos logit contributions
itt+0.022
 of+0.021
year+0.020
 Some+0.020
op+0.019
Neg logit contributions
zie-0.038
 heads-0.037
keeping-0.036
 myself-0.035
 forth-0.035
Token there
Token ID612
Feature activation-4.65e-01

Pos logit contributions
itt+0.047
 of+0.044
 Some+0.044
year+0.044
op+0.043
Neg logit contributions
 heads-0.047
 myself-0.046
zie-0.045
keeping-0.045
 herself-0.045
Token was
Token ID373
Feature activation-1.45e+00

Pos logit contributions
itt+0.020
year+0.019
 Some+0.018
 fires+0.018
op+0.018
Neg logit contributions
zie-0.016
 herself-0.016
 heads-0.015
 myself-0.015
 forth-0.015
Token a
Token ID257
Feature activation-4.09e-01

Pos logit contributions
op+0.056
year+0.055
ades+0.054
itt+0.053
oses+0.052
Neg logit contributions
 herself-0.060
 myself-0.054
 Daisy-0.048
zie-0.047
 heads-0.046
Token three
Token ID1115
Feature activation-2.45e+00

Pos logit contributions
itt+0.016
year+0.016
 Some+0.015
 of+0.015
 fires+0.015
Neg logit contributions
 heads-0.015
 forth-0.014
 fuss-0.014
zie-0.014
 forget-0.014
Token-
Token ID12
Feature activation-3.40e+00

Pos logit contributions
itt+0.060
 Some+0.058
year+0.056
imsy+0.054
 fires+0.053
Neg logit contributions
 heads-0.124
zie-0.119
 herself-0.116
 myself-0.116
 fuss-0.115
Tokenyear
Token ID1941
Feature activation-6.60e+00

Pos logit contributions
ifying+0.071
op+0.062
imsy+0.058
 Our+0.058
oses+0.056
Neg logit contributions
 herself-0.194
 myself-0.177
zie-0.165
keeping-0.164
riet-0.148
Token-
Token ID12
Feature activation-2.86e+00

Pos logit contributions
ifying+0.160
 Some+0.155
Will+0.154
 fires+0.135
imsy+0.134
Neg logit contributions
 heads-0.341
 myself-0.325
 herself-0.321
keeping-0.318
zie-0.315
Tokenold
Token ID727
Feature activation-2.20e+00

Pos logit contributions
ifying+0.093
imsy+0.078
op+0.074
 Some+0.074
 fires+0.072
Neg logit contributions
 herself-0.139
 myself-0.130
keeping-0.126
zie-0.119
 heads-0.105
Token who
Token ID508
Feature activation-1.26e+00

Pos logit contributions
itt+0.089
year+0.084
 Some+0.084
imsy+0.082
op+0.081
Neg logit contributions
 heads-0.079
 herself-0.078
keeping-0.077
 myself-0.076
zie-0.075
Token found
Token ID1043
Feature activation+9.56e-02

Pos logit contributions
itt+0.054
 Some+0.053
 fires+0.050
op+0.050
 of+0.049
Neg logit contributions
 herself-0.042
 heads-0.041
keeping-0.039
 myself-0.038
zie-0.038
Token a
Token ID257
Feature activation+1.12e+00

Pos logit contributions
 herself+0.004
 myself+0.004
zie+0.004
keeping+0.004
 heads+0.003
Neg logit contributions
itt-0.003
op-0.003
year-0.003
imsy-0.003
 Some-0.003
Token Sammy
Token ID44067
Feature activation-1.37e-01

Pos logit contributions
 heads+0.001
 myself+0.001
keeping+0.001
zie+0.001
 fuss+0.001
Neg logit contributions
 of-0.001
 Some-0.001
 fires-0.001
itt-0.001
year-0.001
Token would
Token ID561
Feature activation-2.04e+00

Pos logit contributions
itt+0.005
 of+0.005
 Some+0.004
year+0.004
op+0.004
Neg logit contributions
 myself-0.006
 heads-0.006
zie-0.006
keeping-0.005
 herself-0.005
Token often
Token ID1690
Feature activation-2.51e-01

Pos logit contributions
itt+0.066
 of+0.063
 Some+0.062
 fires+0.059
year+0.058
Neg logit contributions
 heads-0.087
keeping-0.087
zie-0.086
 myself-0.086
 herself-0.083
Token share
Token ID2648
Feature activation+8.07e-01

Pos logit contributions
itt+0.010
 of+0.010
 Some+0.010
year+0.009
op+0.009
Neg logit contributions
zie-0.009
 myself-0.009
 heads-0.009
 herself-0.008
keeping-0.008
Token av
Token ID1196
Feature activation-2.69e+00

Pos logit contributions
 herself+0.029
 heads+0.029
 myself+0.029
 forth+0.027
keeping+0.027
Neg logit contributions
itt-0.034
year-0.032
op-0.032
 Some-0.030
imsy-0.029
Tokenoc
Token ID420
Feature activation-6.59e+00

Pos logit contributions
itt+0.048
year+0.042
 Some+0.042
 of+0.040
op+0.039
Neg logit contributions
 heads-0.157
 myself-0.157
zie-0.155
 herself-0.154
keeping-0.153
Tokenados
Token ID22484
Feature activation-8.11e-01

Pos logit contributions
year+0.368
itt+0.367
 of+0.366
 Some+0.363
op+0.355
Neg logit contributions
 myself-0.133
 heads-0.131
 herself-0.125
keeping-0.119
zie-0.116
Token and
Token ID290
Feature activation-4.75e-01

Pos logit contributions
itt+0.025
 of+0.023
year+0.022
 Some+0.022
op+0.022
Neg logit contributions
 heads-0.038
zie-0.038
 myself-0.037
 herself-0.036
keeping-0.036
Token other
Token ID584
Feature activation-1.11e+00

Pos logit contributions
itt+0.020
 of+0.019
 Some+0.019
year+0.019
op+0.018
Neg logit contributions
 heads-0.016
 myself-0.016
zie-0.016
 herself-0.016
keeping-0.015
Token y
Token ID331
Feature activation+4.42e-01

Pos logit contributions
itt+0.047
year+0.044
 Some+0.044
op+0.043
 of+0.043
Neg logit contributions
 heads-0.039
 myself-0.038
 herself-0.037
zie-0.037
 forth-0.036
Tokenummy
Token ID13513
Feature activation+6.74e-01

Pos logit contributions
 heads+0.015
 myself+0.015
 herself+0.015
zie+0.015
keeping+0.014
Neg logit contributions
itt-0.018
year-0.018
 Some-0.018
 fires-0.017
 of-0.017
Token was
Token ID373
Feature activation-1.37e+00

Pos logit contributions
itt+0.061
 Some+0.057
year+0.057
 of+0.057
op+0.056
Neg logit contributions
 heads-0.060
 myself-0.059
zie-0.059
 herself-0.058
keeping-0.057
Token an
Token ID281
Feature activation-1.54e+00

Pos logit contributions
ades+0.060
itt+0.057
op+0.057
 Some+0.053
 fires+0.053
Neg logit contributions
 herself-0.051
 myself-0.043
 forth-0.043
zie-0.043
 heads-0.042
Token adventurous
Token ID39770
Feature activation-1.60e+00

Pos logit contributions
itt+0.062
op+0.056
 Some+0.056
ades+0.054
year+0.054
Neg logit contributions
 heads-0.059
zie-0.059
 myself-0.059
 herself-0.058
 forth-0.056
Token three
Token ID1115
Feature activation-3.64e+00

Pos logit contributions
itt+0.063
year+0.059
 Some+0.058
op+0.056
 of+0.056
Neg logit contributions
 heads-0.061
zie-0.059
 myself-0.058
 herself-0.057
 forth-0.057
Token-
Token ID12
Feature activation-4.11e+00

Pos logit contributions
itt+0.124
 Some+0.118
year+0.112
imsy+0.112
op+0.108
Neg logit contributions
 heads-0.156
zie-0.147
 herself-0.145
 forth-0.145
 myself-0.143
Tokenyear
Token ID1941
Feature activation-6.59e+00

Pos logit contributions
 Some+0.076
 fires+0.070
ifying+0.063
ades+0.062
time+0.058
Neg logit contributions
 herself-0.239
keeping-0.232
 myself-0.226
zie-0.223
riet-0.210
Token-
Token ID12
Feature activation-3.77e+00

Pos logit contributions
 Some+0.155
itt+0.146
Will+0.130
 fires+0.126
imsy+0.125
Neg logit contributions
 heads-0.356
keeping-0.330
 herself-0.325
 myself-0.324
zie-0.322
Tokenold
Token ID727
Feature activation-2.52e+00

Pos logit contributions
 Some+0.110
ifying+0.106
 fires+0.104
imsy+0.098
op+0.094
Neg logit contributions
 herself-0.180
keeping-0.172
 myself-0.169
zie-0.168
riet-0.150
Token and
Token ID290
Feature activation-1.16e+00

Pos logit contributions
itt+0.094
 of+0.087
 Some+0.087
year+0.087
op+0.085
Neg logit contributions
 heads-0.099
 myself-0.098
zie-0.097
 herself-0.096
keeping-0.094
Token loved
Token ID6151
Feature activation-8.30e-01

Pos logit contributions
itt+0.048
 Some+0.044
op+0.044
 of+0.043
 fires+0.043
Neg logit contributions
 herself-0.044
 myself-0.042
zie-0.041
 heads-0.041
 forth-0.040
Token to
Token ID284
Feature activation-6.56e-01

Pos logit contributions
itt+0.029
year+0.025
 Some+0.024
imsy+0.024
op+0.024
Neg logit contributions
 herself-0.038
keeping-0.037
 heads-0.036
 forth-0.036
 myself-0.035
Token Every
Token ID3887
Feature activation+7.98e-01

Pos logit contributions
itt+0.015
op+0.014
year+0.014
ades+0.013
 fires+0.013
Neg logit contributions
 myself-0.018
 herself-0.018
zie-0.017
 heads-0.017
keeping-0.017
Token day
Token ID1110
Feature activation-3.33e-01

Pos logit contributions
zie+0.018
 heads+0.017
 myself+0.017
 herself+0.016
 forget+0.016
Neg logit contributions
itt-0.043
 Some-0.041
oses-0.040
year-0.040
 fires-0.040
Token two
Token ID734
Feature activation-1.37e+00

Pos logit contributions
itt+0.019
year+0.018
ades+0.018
imsy+0.018
oses+0.017
Neg logit contributions
 herself-0.007
 forth-0.006
 Daisy-0.006
 myself-0.005
 forget-0.005
Token 3
Token ID513
Feature activation-3.73e+00

Pos logit contributions
itt+0.063
year+0.060
 Some+0.058
op+0.058
 fires+0.057
Neg logit contributions
 heads-0.044
 herself-0.041
 myself-0.040
zie-0.040
 forth-0.040
Token-
Token ID12
Feature activation-3.87e+00

Pos logit contributions
itt+0.094
 Some+0.087
imsy+0.085
year+0.083
op+0.082
Neg logit contributions
 heads-0.189
 myself-0.184
zie-0.184
 herself-0.184
keeping-0.180
Tokenyear
Token ID1941
Feature activation-6.58e+00

Pos logit contributions
imsy+0.046
op+0.046
 fires+0.042
itt+0.039
 Some+0.037
Neg logit contributions
 herself-0.239
keeping-0.230
zie-0.228
 myself-0.226
 heads-0.207
Token-
Token ID12
Feature activation-3.40e+00

Pos logit contributions
ifying+0.145
itt+0.144
 Some+0.144
op+0.136
Will+0.136
Neg logit contributions
 heads-0.352
 myself-0.330
 herself-0.327
keeping-0.319
zie-0.313
Tokenold
Token ID727
Feature activation-2.00e+00

Pos logit contributions
imsy+0.084
op+0.083
ifying+0.082
 fires+0.078
itt+0.077
Neg logit contributions
 herself-0.171
keeping-0.162
 myself-0.160
zie-0.159
 heads-0.142
Token children
Token ID1751
Feature activation-1.66e-01

Pos logit contributions
itt+0.088
year+0.081
 Some+0.081
op+0.081
 of+0.079
Neg logit contributions
 heads-0.066
 herself-0.065
zie-0.064
 myself-0.064
keeping-0.063
Token enjoyed
Token ID8359
Feature activation-7.83e-01

Pos logit contributions
itt+0.006
 Some+0.006
year+0.006
op+0.005
ended+0.005
Neg logit contributions
 heads-0.006
zie-0.006
 herself-0.006
 forth-0.006
 forget-0.006
Token playing
Token ID2712
Feature activation-1.52e+00

Pos logit contributions
itt+0.033
 Some+0.032
year+0.032
op+0.032
 of+0.030
Neg logit contributions
 herself-0.028
 heads-0.027
keeping-0.026
 myself-0.026
 forth-0.026
Tokenmy
Token ID1820
Feature activation-1.52e+00

Pos logit contributions
 of+0.090
itt+0.087
 Some+0.084
op+0.084
 fires+0.083
Neg logit contributions
 myself-0.067
keeping-0.066
 heads-0.065
 fuss-0.062
 herself-0.062
Token was
Token ID373
Feature activation-1.38e+00

Pos logit contributions
itt+0.052
 of+0.049
 Some+0.048
year+0.047
op+0.047
Neg logit contributions
 heads-0.064
 myself-0.063
zie-0.063
 herself-0.062
keeping-0.061
Token a
Token ID257
Feature activation-1.01e+00

Pos logit contributions
ades+0.066
op+0.063
itt+0.063
ves+0.061
ork+0.060
Neg logit contributions
 herself-0.046
 myself-0.039
 forth-0.039
zie-0.037
 Holly-0.035
Token three
Token ID1115
Feature activation-3.30e+00

Pos logit contributions
itt+0.044
ades+0.041
 Some+0.041
year+0.040
 of+0.039
Neg logit contributions
 forget-0.034
 heads-0.033
 forth-0.033
zie-0.031
 fuss-0.030
Token-
Token ID12
Feature activation-3.93e+00

Pos logit contributions
itt+0.098
 Some+0.095
imsy+0.091
year+0.087
op+0.083
Neg logit contributions
 heads-0.150
 forth-0.144
 fuss-0.143
 myself-0.140
zie-0.139
Tokenyear
Token ID1941
Feature activation-6.56e+00

Pos logit contributions
ifying+0.090
 Fire+0.086
 Some+0.085
ork+0.085
ades+0.084
Neg logit contributions
 herself-0.190
 myself-0.184
keeping-0.179
riet-0.169
zie-0.164
Token-
Token ID12
Feature activation-3.68e+00

Pos logit contributions
 Some+0.167
itt+0.151
Will+0.143
imsy+0.137
ifying+0.137
Neg logit contributions
 heads-0.345
keeping-0.314
 myself-0.306
 fuss-0.303
 herself-0.301
Tokenold
Token ID727
Feature activation-2.39e+00

Pos logit contributions
ifying+0.145
 Some+0.127
imsy+0.122
 Fire+0.119
 fires+0.118
Neg logit contributions
 herself-0.138
keeping-0.132
 myself-0.132
zie-0.119
riet-0.115
Token boy
Token ID2933
Feature activation-1.18e+00

Pos logit contributions
itt+0.111
year+0.102
 Some+0.102
op+0.101
 of+0.098
Neg logit contributions
 heads-0.075
 myself-0.072
 herself-0.072
keeping-0.071
zie-0.070
Token who
Token ID508
Feature activation-1.47e+00

Pos logit contributions
itt+0.054
ades+0.053
year+0.052
 Some+0.051
imsy+0.050
Neg logit contributions
 herself-0.035
 forth-0.034
 myself-0.033
 heads-0.032
keeping-0.031
Token loved
Token ID6151
Feature activation-8.94e-01

Pos logit contributions
itt+0.055
 Some+0.050
year+0.049
op+0.049
 of+0.048
Neg logit contributions
 herself-0.059
 heads-0.058
zie-0.057
 myself-0.057
keeping-0.054
TokenOne
Token ID3198
Feature activation-7.24e-01

Pos logit contributions
 heads+0.054
 myself+0.052
zie+0.051
 forth+0.051
 fuss+0.050
Neg logit contributions
itt-0.051
 of-0.049
 Some-0.048
year-0.046
op-0.046
Token day
Token ID1110
Feature activation-7.94e-01

Pos logit contributions
itt+0.036
 Some+0.033
year+0.033
op+0.032
 fires+0.031
Neg logit contributions
zie-0.022
 myself-0.021
 herself-0.020
 forget-0.020
 forth-0.019
Token,
Token ID11
Feature activation+2.52e-02

Pos logit contributions
itt+0.030
year+0.028
 Some+0.028
 of+0.028
op+0.027
Neg logit contributions
 heads-0.031
 myself-0.030
zie-0.030
 herself-0.030
keeping-0.029
Token three
Token ID1115
Feature activation-3.10e+00

Pos logit contributions
 herself+0.001
 myself+0.001
 heads+0.001
zie+0.001
keeping+0.001
Neg logit contributions
itt-0.001
year-0.001
 Some-0.001
op-0.001
 fires-0.001
Token-
Token ID12
Feature activation-4.07e+00

Pos logit contributions
itt+0.108
year+0.105
 Some+0.104
imsy+0.099
op+0.098
Neg logit contributions
 heads-0.127
zie-0.122
 herself-0.122
 myself-0.121
 forth-0.117
Tokenyear
Token ID1941
Feature activation-6.55e+00

Pos logit contributions
op+0.060
 fires+0.058
itt+0.057
 Some+0.055
ades+0.055
Neg logit contributions
 herself-0.255
keeping-0.242
zie-0.238
 myself-0.237
riet-0.216
Token-
Token ID12
Feature activation-3.83e+00

Pos logit contributions
itt+0.142
 Some+0.141
op+0.128
 fires+0.125
Will+0.120
Neg logit contributions
 herself-0.347
 heads-0.347
 myself-0.342
keeping-0.330
zie-0.328
Tokenold
Token ID727
Feature activation-2.40e+00

Pos logit contributions
op+0.092
itt+0.090
 fires+0.090
 Some+0.089
imsy+0.082
Neg logit contributions
 herself-0.205
keeping-0.192
 myself-0.190
zie-0.190
riet-0.168
Token Sam
Token ID3409
Feature activation-1.20e+00

Pos logit contributions
itt+0.102
year+0.095
 Some+0.095
op+0.094
 of+0.093
Neg logit contributions
 heads-0.082
 herself-0.081
 myself-0.081
zie-0.081
keeping-0.079
Token was
Token ID373
Feature activation-1.18e+00

Pos logit contributions
itt+0.041
 of+0.040
 Some+0.038
year+0.038
 fires+0.037
Neg logit contributions
 myself-0.050
 heads-0.050
zie-0.049
 herself-0.048
keeping-0.047
Token innoc
Token ID7909
Feature activation+4.09e-01

Pos logit contributions
ades+0.054
itt+0.051
 Some+0.050
op+0.050
year+0.050
Neg logit contributions
 herself-0.041
 forth-0.034
 myself-0.033
 Holly-0.032
keeping-0.032
Token a
Token ID257
Feature activation-9.25e-01

Pos logit contributions
ades+0.054
op+0.051
itt+0.051
ves+0.049
 Some+0.047
Neg logit contributions
 herself-0.042
 forth-0.037
 myself-0.035
zie-0.035
 heads-0.032
Token very
Token ID845
Feature activation-5.52e-01

Pos logit contributions
itt+0.039
 Some+0.036
ades+0.036
year+0.036
 of+0.035
Neg logit contributions
 heads-0.032
 forget-0.032
 forth-0.031
zie-0.029
 fuss-0.029
Token curious
Token ID11040
Feature activation-2.24e+00

Pos logit contributions
itt+0.026
ades+0.025
 Some+0.025
op+0.025
glass+0.025
Neg logit contributions
 forget-0.015
 heads-0.014
 myself-0.014
 forth-0.014
 fuss-0.014
Token 3
Token ID513
Feature activation-3.82e+00

Pos logit contributions
itt+0.098
year+0.087
 Some+0.086
op+0.085
ades+0.085
Neg logit contributions
 heads-0.079
zie-0.076
 forth-0.073
 fuss-0.072
 herself-0.071
Token-
Token ID12
Feature activation-4.01e+00

Pos logit contributions
itt+0.105
 Some+0.098
op+0.090
year+0.090
imsy+0.089
Neg logit contributions
 heads-0.187
zie-0.182
 herself-0.182
 myself-0.182
keeping-0.181
Tokenyear
Token ID1941
Feature activation-6.54e+00

Pos logit contributions
ifying+0.096
 Some+0.092
 fires+0.087
ades+0.084
op+0.083
Neg logit contributions
 herself-0.212
 myself-0.201
keeping-0.199
zie-0.197
riet-0.190
Token-
Token ID12
Feature activation-3.57e+00

Pos logit contributions
 Some+0.152
itt+0.142
Will+0.130
ifying+0.127
 fires+0.126
Neg logit contributions
 heads-0.351
keeping-0.328
 herself-0.327
 myself-0.327
zie-0.320
Tokenold
Token ID727
Feature activation-2.33e+00

Pos logit contributions
ifying+0.124
 Some+0.115
 fires+0.108
imsy+0.104
op+0.103
Neg logit contributions
 herself-0.159
 myself-0.150
keeping-0.149
zie-0.146
riet-0.134
Token boy
Token ID2933
Feature activation-1.00e+00

Pos logit contributions
itt+0.092
year+0.086
 Some+0.086
 of+0.085
op+0.084
Neg logit contributions
 heads-0.086
 myself-0.085
zie-0.085
 herself-0.084
keeping-0.082
Token.
Token ID13
Feature activation-5.39e-01

Pos logit contributions
itt+0.046
ades+0.043
year+0.043
 Some+0.042
imsy+0.041
Neg logit contributions
 herself-0.031
 heads-0.029
 forth-0.029
zie-0.028
keeping-0.028
Token He
Token ID679
Feature activation-7.37e-01

Pos logit contributions
itt+0.019
 Some+0.018
 of+0.018
year+0.018
 fires+0.017
Neg logit contributions
 heads-0.022
 myself-0.022
zie-0.022
keeping-0.022
 herself-0.022
Token're
Token ID821
Feature activation+1.06e+00

Pos logit contributions
 of+0.021
itt+0.021
 Some+0.019
op+0.019
osh+0.019
Neg logit contributions
keeping-0.024
zie-0.023
 heads-0.023
 herself-0.022
 myself-0.022
Token the
Token ID262
Feature activation-1.78e-01

Pos logit contributions
 heads+0.052
keeping+0.051
 fuss+0.050
zie+0.049
 forth+0.046
Neg logit contributions
 of-0.030
itt-0.027
 fires-0.027
year-0.026
 Some-0.025
Token best
Token ID1266
Feature activation-2.14e+00

Pos logit contributions
itt+0.007
year+0.006
 Some+0.006
 of+0.006
op+0.006
Neg logit contributions
 heads-0.007
 myself-0.007
zie-0.007
 herself-0.007
keeping-0.007
Token rh
Token ID9529
Feature activation-1.73e+00

Pos logit contributions
itt+0.104
year+0.099
 Some+0.099
op+0.095
 fires+0.094
Neg logit contributions
 myself-0.061
 heads-0.060
zie-0.058
 herself-0.057
keeping-0.056
Tokenin
Token ID259
Feature activation-1.71e+00

Pos logit contributions
 Some+0.054
imsy+0.046
year+0.045
 paramed+0.044
 fires+0.044
Neg logit contributions
 myself-0.081
 forth-0.072
keeping-0.070
 herself-0.069
 heads-0.069
Tokenoc
Token ID420
Feature activation-6.54e+00

Pos logit contributions
 Some+0.067
itt+0.065
year+0.062
imsy+0.060
ades+0.059
Neg logit contributions
 myself-0.067
keeping-0.061
 heads-0.060
 herself-0.060
 forth-0.059
Tokeneros
Token ID27498
Feature activation-1.85e+00

Pos logit contributions
 Some+0.292
year+0.282
 of+0.274
itt+0.269
 fires+0.263
Neg logit contributions
 myself-0.232
 heads-0.225
keeping-0.214
 herself-0.201
 forth-0.196
Token ever
Token ID1683
Feature activation-1.99e+00

Pos logit contributions
itt+0.076
 Some+0.074
year+0.072
op+0.070
 fires+0.069
Neg logit contributions
 myself-0.066
 heads-0.063
 herself-0.063
keeping-0.062
zie-0.061
Token!"
Token ID2474
Feature activation+1.15e-01

Pos logit contributions
 of+0.091
-+0.075
year+0.074
 to+0.071
itt+0.069
Neg logit contributions
 heads-0.078
zie-0.072
ishment-0.069
 herself-0.068
 fuss-0.064
Token Tim
Token ID5045
Feature activation-7.15e-01

Pos logit contributions
 heads+0.005
zie+0.005
keeping+0.005
 myself+0.005
 forth+0.005
Neg logit contributions
 of-0.004
itt-0.004
 Some-0.004
op-0.004
year-0.003
Tokenmy
Token ID1820
Feature activation-1.10e+00

Pos logit contributions
 of+0.026
op+0.020
-+0.020
 to+0.019
year+0.019
Neg logit contributions
keeping-0.030
 heads-0.029
 Holly-0.029
 Ivy-0.029
ishment-0.028
Token Jack
Token ID3619
Feature activation-1.11e-01

Pos logit contributions
 Some+0.014
 of+0.011
itt+0.011
year+0.010
 fires+0.009
Neg logit contributions
 myself-0.020
 forth-0.020
keeping-0.019
 heads-0.019
 fuss-0.018
Token was
Token ID373
Feature activation-1.00e+00

Pos logit contributions
itt+0.004
year+0.004
 Some+0.004
 of+0.004
op+0.004
Neg logit contributions
 heads-0.004
 myself-0.004
zie-0.004
 herself-0.004
keeping-0.004
Token a
Token ID257
Feature activation-8.54e-01

Pos logit contributions
op+0.049
ades+0.048
itt+0.048
ork+0.044
rain+0.044
Neg logit contributions
 herself-0.033
 myself-0.028
 Polly-0.027
 forget-0.026
 Holly-0.026
Token 3
Token ID513
Feature activation-3.62e+00

Pos logit contributions
itt+0.036
year+0.035
 Some+0.033
ades+0.033
 of+0.033
Neg logit contributions
 forget-0.028
 heads-0.026
 seek-0.026
 fuss-0.025
 forth-0.025
Token-
Token ID12
Feature activation-4.01e+00

Pos logit contributions
itt+0.068
 Some+0.065
year+0.059
imsy+0.058
 fires+0.057
Neg logit contributions
 heads-0.201
keeping-0.197
zie-0.196
 herself-0.195
 myself-0.194
Tokenyear
Token ID1941
Feature activation-6.53e+00

Pos logit contributions
ifying+0.067
time+0.064
op+0.055
 fires+0.054
 Some+0.054
Neg logit contributions
 herself-0.221
 myself-0.220
zie-0.215
keeping-0.213
 forth-0.198
Token old
Token ID1468
Feature activation-1.02e+00

Pos logit contributions
 Some+0.150
ifying+0.135
Will+0.132
itt+0.131
 fires+0.124
Neg logit contributions
 heads-0.352
keeping-0.324
 myself-0.317
 fuss-0.314
 herself-0.311
Token who
Token ID508
Feature activation-1.84e+00

Pos logit contributions
itt+0.045
 of+0.043
 Some+0.042
op+0.041
 fires+0.041
Neg logit contributions
 myself-0.035
zie-0.035
 herself-0.033
keeping-0.033
 heads-0.033
Token liked
Token ID8288
Feature activation-6.72e-01

Pos logit contributions
itt+0.069
 Some+0.066
 fires+0.064
 of+0.062
year+0.062
Neg logit contributions
 heads-0.070
 herself-0.068
zie-0.067
 myself-0.067
keeping-0.065
Token to
Token ID284
Feature activation+1.00e-01

Pos logit contributions
itt+0.021
year+0.019
imsy+0.018
op+0.018
 Some+0.017
Neg logit contributions
keeping-0.032
 heads-0.031
 herself-0.031
 forth-0.030
 myself-0.030
Token explore
Token ID7301
Feature activation-1.22e-01

Pos logit contributions
 heads+0.003
 myself+0.003
 herself+0.003
zie+0.003
keeping+0.003
Neg logit contributions
itt-0.004
year-0.004
 Some-0.004
 of-0.004
op-0.004
Token was
Token ID373
Feature activation-1.23e+00

Pos logit contributions
itt+0.087
year+0.082
 Some+0.081
op+0.077
 of+0.077
Neg logit contributions
 heads-0.080
zie-0.078
 herself-0.077
 myself-0.077
 forth-0.075
Token an
Token ID281
Feature activation-1.38e+00

Pos logit contributions
ades+0.055
op+0.053
itt+0.053
 Some+0.050
ves+0.049
Neg logit contributions
 herself-0.043
 forth-0.039
 myself-0.039
 heads-0.034
zie-0.034
Token eager
Token ID11069
Feature activation-2.00e+00

Pos logit contributions
itt+0.055
op+0.051
 Some+0.051
year+0.049
 of+0.048
Neg logit contributions
 heads-0.052
 myself-0.052
zie-0.051
 herself-0.051
 forth-0.050
Token 3
Token ID513
Feature activation-3.56e+00

Pos logit contributions
itt+0.086
year+0.081
 Some+0.078
 fires+0.077
ades+0.077
Neg logit contributions
 heads-0.069
 herself-0.065
zie-0.065
 myself-0.064
 forth-0.064
Token-
Token ID12
Feature activation-3.92e+00

Pos logit contributions
itt+0.090
 Some+0.085
imsy+0.077
op+0.076
year+0.076
Neg logit contributions
 heads-0.180
keeping-0.177
 myself-0.176
 herself-0.176
zie-0.175
Tokenyear
Token ID1941
Feature activation-6.52e+00

Pos logit contributions
ifying+0.098
 Some+0.098
 fires+0.091
ades+0.088
imsy+0.084
Neg logit contributions
 herself-0.202
 myself-0.194
keeping-0.193
zie-0.181
riet-0.180
Token-
Token ID12
Feature activation-3.63e+00

Pos logit contributions
 Some+0.154
itt+0.144
imsy+0.124
 fires+0.122
Will+0.120
Neg logit contributions
 heads-0.353
keeping-0.332
 myself-0.330
 herself-0.326
zie-0.315
Tokenold
Token ID727
Feature activation-2.15e+00

Pos logit contributions
ifying+0.117
 Some+0.114
 fires+0.107
imsy+0.104
itt+0.099
Neg logit contributions
 herself-0.166
 myself-0.160
keeping-0.158
zie-0.148
riet-0.143
Token boy
Token ID2933
Feature activation-7.64e-01

Pos logit contributions
itt+0.089
 Some+0.083
year+0.083
 of+0.081
op+0.081
Neg logit contributions
 heads-0.076
 myself-0.075
zie-0.075
 herself-0.074
keeping-0.073
Token who
Token ID508
Feature activation-1.27e+00

Pos logit contributions
itt+0.037
ades+0.036
year+0.035
 Some+0.035
imsy+0.034
Neg logit contributions
 herself-0.021
keeping-0.019
 myself-0.019
 forth-0.019
 heads-0.019
Token loved
Token ID6151
Feature activation-8.86e-01

Pos logit contributions
itt+0.045
 Some+0.042
year+0.041
op+0.040
 of+0.040
Neg logit contributions
 heads-0.052
 herself-0.052
zie-0.051
 myself-0.051
keeping-0.050
Token a
Token ID257
Feature activation-9.92e-01

Pos logit contributions
ades+0.058
itt+0.056
op+0.055
 fires+0.053
 Some+0.052
Neg logit contributions
 herself-0.054
 myself-0.049
 forth-0.046
zie-0.045
 heads-0.044
Token very
Token ID845
Feature activation-5.81e-01

Pos logit contributions
itt+0.040
 Some+0.037
ades+0.037
year+0.037
op+0.036
Neg logit contributions
 heads-0.035
 forget-0.033
 forth-0.033
 myself-0.033
zie-0.033
Token curious
Token ID11040
Feature activation-2.26e+00

Pos logit contributions
itt+0.024
 Some+0.023
op+0.023
 fires+0.023
ades+0.022
Neg logit contributions
 heads-0.019
 myself-0.019
zie-0.019
 forget-0.018
 herself-0.018
Token three
Token ID1115
Feature activation-3.39e+00

Pos logit contributions
itt+0.110
year+0.102
 Some+0.101
op+0.100
 fires+0.098
Neg logit contributions
 heads-0.065
zie-0.063
 myself-0.061
 herself-0.060
 forth-0.058
Token-
Token ID12
Feature activation-3.80e+00

Pos logit contributions
itt+0.090
 Some+0.081
year+0.080
 of+0.077
op+0.077
Neg logit contributions
 heads-0.170
 myself-0.168
zie-0.167
 herself-0.166
keeping-0.163
Tokenyear
Token ID1941
Feature activation-6.48e+00

Pos logit contributions
 Some+0.070
 fires+0.069
ades+0.062
itt+0.062
op+0.062
Neg logit contributions
 herself-0.217
 myself-0.210
keeping-0.208
zie-0.204
riet-0.198
Token-
Token ID12
Feature activation-3.55e+00

Pos logit contributions
 Some+0.133
itt+0.130
 fires+0.109
op+0.106
year+0.104
Neg logit contributions
 heads-0.366
 myself-0.348
keeping-0.348
 herself-0.346
zie-0.341
Tokenold
Token ID727
Feature activation-2.43e+00

Pos logit contributions
 Some+0.088
 fires+0.086
itt+0.082
op+0.079
ifying+0.078
Neg logit contributions
 herself-0.181
 myself-0.176
keeping-0.173
zie-0.171
riet-0.159
Token girl
Token ID2576
Feature activation-8.25e-01

Pos logit contributions
 of+0.095
itt+0.091
year+0.089
 Some+0.089
op+0.085
Neg logit contributions
zie-0.091
 myself-0.090
 heads-0.090
 fuss-0.085
 herself-0.084
Token.
Token ID13
Feature activation-6.96e-01

Pos logit contributions
itt+0.034
year+0.032
 Some+0.031
 fires+0.031
ades+0.030
Neg logit contributions
 heads-0.029
 herself-0.029
zie-0.028
 myself-0.028
keeping-0.028
Token She
Token ID1375
Feature activation+9.10e-02

Pos logit contributions
 Some+0.028
 of+0.024
year+0.023
itt+0.021
time+0.020
Neg logit contributions
keeping-0.031
 forth-0.030
 fuss-0.029
 heads-0.029
 myself-0.027
Token store
Token ID3650
Feature activation-2.38e-01

Pos logit contributions
 myself+0.010
 heads+0.010
 herself+0.010
zie+0.009
 forth+0.009
Neg logit contributions
itt-0.014
year-0.013
op-0.013
 Some-0.013
 of-0.012
Token to
Token ID284
Feature activation+6.82e-01

Pos logit contributions
itt+0.009
year+0.008
 Some+0.008
 fires+0.008
op+0.008
Neg logit contributions
 heads-0.009
 herself-0.009
keeping-0.009
 myself-0.009
zie-0.009
Token get
Token ID651
Feature activation+1.10e+00

Pos logit contributions
 heads+0.025
zie+0.024
 myself+0.024
 herself+0.024
keeping+0.023
Neg logit contributions
itt-0.028
 Some-0.026
 of-0.026
year-0.026
op-0.025
Token more
Token ID517
Feature activation-3.98e-01

Pos logit contributions
 herself+0.041
 myself+0.038
 forth+0.037
 heads+0.036
keeping+0.035
Neg logit contributions
itt-0.047
op-0.046
oses-0.041
year-0.041
imsy-0.040
Token av
Token ID1196
Feature activation-2.18e+00

Pos logit contributions
itt+0.016
year+0.015
op+0.015
 Some+0.014
 fires+0.014
Neg logit contributions
 myself-0.015
 herself-0.015
 heads-0.015
 fuss-0.014
 forth-0.014
Tokenoc
Token ID420
Feature activation-6.47e+00

Pos logit contributions
itt+0.039
 Some+0.035
year+0.035
 of+0.033
op+0.031
Neg logit contributions
 heads-0.126
 myself-0.126
zie-0.124
 herself-0.124
keeping-0.123
Tokenados
Token ID22484
Feature activation-5.18e-02

Pos logit contributions
year+0.381
 of+0.378
 Some+0.377
itt+0.369
 fires+0.359
Neg logit contributions
 myself-0.122
 heads-0.121
keeping-0.109
 herself-0.108
zie-0.097
Token.
Token ID13
Feature activation+2.55e-01

Pos logit contributions
 of+0.002
itt+0.002
op+0.002
 Some+0.001
year+0.001
Neg logit contributions
 heads-0.002
zie-0.002
 myself-0.002
 herself-0.002
keeping-0.002
Token Gary
Token ID10936
Feature activation+1.94e-01

Pos logit contributions
 heads+0.009
 myself+0.009
zie+0.009
keeping+0.009
 herself+0.009
Neg logit contributions
itt-0.011
 Some-0.010
 of-0.010
year-0.010
op-0.010
Token looked
Token ID3114
Feature activation-3.15e-01

Pos logit contributions
 heads+0.007
 myself+0.007
zie+0.007
 herself+0.007
keeping+0.007
Neg logit contributions
itt-0.007
year-0.007
 Some-0.007
 of-0.007
op-0.007
Token at
Token ID379
Feature activation-2.74e-01

Pos logit contributions
itt+0.012
op+0.011
 Some+0.010
year+0.009
oses+0.009
Neg logit contributions
 heads-0.013
 forth-0.013
 herself-0.013
 myself-0.013
zie-0.012
Tokenmy
Token ID1820
Feature activation-6.13e-01

Pos logit contributions
 of+0.040
itt+0.039
op+0.037
 fires+0.037
 Some+0.036
Neg logit contributions
 heads-0.029
keeping-0.028
 myself-0.027
 herself-0.027
 fuss-0.026
Token!"
Token ID2474
Feature activation-2.98e-01

Pos logit contributions
itt+0.026
 of+0.025
op+0.024
 fires+0.023
year+0.023
Neg logit contributions
 heads-0.021
 myself-0.021
 herself-0.020
zie-0.020
keeping-0.020
Token exclaimed
Token ID33552
Feature activation+4.56e-01

Pos logit contributions
 of+0.013
itt+0.013
 Some+0.013
op+0.012
year+0.012
Neg logit contributions
zie-0.010
 heads-0.010
 myself-0.010
keeping-0.010
 forth-0.009
Token three
Token ID1115
Feature activation-3.04e+00

Pos logit contributions
 herself+0.011
 heads+0.011
zie+0.011
 myself+0.011
keeping+0.010
Neg logit contributions
itt-0.024
year-0.023
 Some-0.022
 fires-0.022
op-0.022
Token-
Token ID12
Feature activation-3.77e+00

Pos logit contributions
itt+0.128
 Some+0.121
year+0.120
 fires+0.116
op+0.116
Neg logit contributions
 heads-0.106
 myself-0.103
 herself-0.103
zie-0.101
 forth-0.097
Tokenyear
Token ID1941
Feature activation-6.47e+00

Pos logit contributions
 fires+0.056
ades+0.055
op+0.053
imsy+0.051
itt+0.050
Neg logit contributions
 herself-0.229
 myself-0.213
keeping-0.213
zie-0.207
riet-0.198
Token-
Token ID12
Feature activation-3.52e+00

Pos logit contributions
itt+0.141
 Some+0.140
op+0.126
 fires+0.123
ifying+0.120
Neg logit contributions
 heads-0.348
 herself-0.339
 myself-0.333
zie-0.324
keeping-0.323
Tokenold
Token ID727
Feature activation-2.47e+00

Pos logit contributions
ifying+0.087
 fires+0.086
imsy+0.081
op+0.080
 Some+0.079
Neg logit contributions
 herself-0.184
 myself-0.169
keeping-0.168
zie-0.163
riet-0.155
Token Jenny
Token ID28437
Feature activation-1.35e+00

Pos logit contributions
itt+0.115
 Some+0.108
year+0.107
 of+0.106
op+0.105
Neg logit contributions
 heads-0.075
 myself-0.073
 herself-0.073
zie-0.073
keeping-0.071
Token.
Token ID13
Feature activation-8.07e-01

Pos logit contributions
itt+0.052
 of+0.049
op+0.048
year+0.048
 Some+0.047
Neg logit contributions
 myself-0.051
 heads-0.051
zie-0.051
 herself-0.049
keeping-0.049
Token She
Token ID1375
Feature activation-1.78e-02

Pos logit contributions
 Some+0.031
 of+0.031
itt+0.030
year+0.029
op+0.028
Neg logit contributions
 heads-0.031
 myself-0.030
keeping-0.030
zie-0.030
 forth-0.030
TokenBen
Token ID11696
Feature activation-1.43e+00

Pos logit contributions
 heads+0.057
 myself+0.055
zie+0.054
 forth+0.054
 herself+0.053
Neg logit contributions
itt-0.051
 of-0.049
 Some-0.047
year-0.046
op-0.046
Token was
Token ID373
Feature activation-1.13e+00

Pos logit contributions
 of+0.045
itt+0.044
year+0.042
 Some+0.041
op+0.041
Neg logit contributions
 myself-0.063
 heads-0.062
keeping-0.061
 herself-0.061
zie-0.060
Token a
Token ID257
Feature activation-8.11e-01

Pos logit contributions
itt+0.050
ades+0.049
op+0.048
 Some+0.044
ves+0.044
Neg logit contributions
 herself-0.039
 forth-0.035
 myself-0.034
zie-0.033
 Holly-0.033
Token 3
Token ID513
Feature activation-3.41e+00

Pos logit contributions
itt+0.035
ades+0.034
 Some+0.033
year+0.033
 of+0.032
Neg logit contributions
 forget-0.027
 heads-0.026
 forth-0.026
 fuss-0.024
zie-0.023
Token-
Token ID12
Feature activation-3.86e+00

Pos logit contributions
 Some+0.095
itt+0.094
imsy+0.089
op+0.083
year+0.082
Neg logit contributions
 heads-0.162
keeping-0.155
 fuss-0.155
 forth-0.155
 herself-0.154
Tokenyear
Token ID1941
Feature activation-6.45e+00

Pos logit contributions
oses+0.090
 construction+0.087
ifying+0.087
 Some+0.086
ork+0.085
Neg logit contributions
 herself-0.187
 myself-0.180
keeping-0.172
riet-0.163
 forth-0.162
Token-
Token ID12
Feature activation-3.47e+00

Pos logit contributions
 Some+0.163
itt+0.143
Will+0.135
ifying+0.131
ades+0.128
Neg logit contributions
 heads-0.344
keeping-0.318
 myself-0.315
 herself-0.308
zie-0.306
Tokenold
Token ID727
Feature activation-2.05e+00

Pos logit contributions
ifying+0.137
 Some+0.124
 construction+0.117
 fires+0.117
imsy+0.117
Neg logit contributions
 herself-0.130
 myself-0.123
keeping-0.123
riet-0.108
zie-0.108
Token boy
Token ID2933
Feature activation-6.99e-01

Pos logit contributions
itt+0.101
 Some+0.094
year+0.093
op+0.093
 of+0.090
Neg logit contributions
 heads-0.058
 myself-0.056
 herself-0.056
keeping-0.055
zie-0.054
Token who
Token ID508
Feature activation-1.23e+00

Pos logit contributions
itt+0.032
ades+0.031
year+0.030
 Some+0.030
imsy+0.029
Neg logit contributions
 forth-0.021
 herself-0.020
 heads-0.019
keeping-0.019
 myself-0.019
Token owned
Token ID6898
Feature activation+1.03e+00

Pos logit contributions
itt+0.047
 Some+0.043
year+0.042
op+0.042
 of+0.041
Neg logit contributions
 heads-0.048
 herself-0.047
zie-0.047
 myself-0.046
keeping-0.044
Token was
Token ID373
Feature activation-1.21e+00

Pos logit contributions
itt+0.090
year+0.087
 Some+0.085
op+0.081
 of+0.081
Neg logit contributions
 heads-0.083
zie-0.082
 herself-0.080
 myself-0.079
keeping-0.078
Token a
Token ID257
Feature activation-7.24e-01

Pos logit contributions
itt+0.048
op+0.048
ades+0.046
 Some+0.043
oses+0.042
Neg logit contributions
 herself-0.048
 myself-0.044
zie-0.042
 forth-0.040
 Holly-0.039
Token curious
Token ID11040
Feature activation-1.88e+00

Pos logit contributions
itt+0.030
 Some+0.028
year+0.028
ades+0.027
op+0.027
Neg logit contributions
 heads-0.026
 forget-0.025
 forth-0.025
zie-0.024
 myself-0.023
Token 3
Token ID513
Feature activation-3.67e+00

Pos logit contributions
itt+0.082
year+0.072
op+0.071
 Some+0.071
ades+0.070
Neg logit contributions
 heads-0.067
zie-0.065
 forth-0.062
 fuss-0.061
 herself-0.060
Token-
Token ID12
Feature activation-3.84e+00

Pos logit contributions
itt+0.093
 Some+0.088
year+0.079
op+0.079
imsy+0.078
Neg logit contributions
 heads-0.187
zie-0.184
 herself-0.182
keeping-0.182
 myself-0.181
Tokenyear
Token ID1941
Feature activation-6.45e+00

Pos logit contributions
 Some+0.082
ifying+0.079
 fires+0.073
op+0.068
ades+0.068
Neg logit contributions
 herself-0.207
keeping-0.198
 myself-0.197
zie-0.191
riet-0.186
Token-
Token ID12
Feature activation-3.42e+00

Pos logit contributions
 Some+0.165
itt+0.154
Will+0.143
ifying+0.139
imsy+0.133
Neg logit contributions
 heads-0.338
keeping-0.313
 myself-0.309
 herself-0.308
zie-0.299
Tokenold
Token ID727
Feature activation-2.13e+00

Pos logit contributions
ifying+0.114
 Some+0.112
 fires+0.102
imsy+0.099
op+0.098
Neg logit contributions
 herself-0.150
keeping-0.144
 myself-0.142
zie-0.136
riet-0.127
Token,
Token ID11
Feature activation-6.18e-01

Pos logit contributions
itt+0.091
 Some+0.083
year+0.083
op+0.083
 fires+0.080
Neg logit contributions
 heads-0.074
 herself-0.072
 myself-0.071
zie-0.071
keeping-0.071
Token eager
Token ID11069
Feature activation-2.34e+00

Pos logit contributions
itt+0.023
 Some+0.021
year+0.021
op+0.021
 of+0.020
Neg logit contributions
 herself-0.024
zie-0.024
 heads-0.024
 myself-0.024
 forth-0.023
Token to
Token ID284
Feature activation-7.76e-02

Pos logit contributions
itt+0.058
year+0.051
 Some+0.047
op+0.047
ades+0.043
Neg logit contributions
 heads-0.128
 forth-0.123
 herself-0.122
keeping-0.122
 myself-0.121
Token a
Token ID257
Feature activation-8.78e-01

Pos logit contributions
ades+0.053
itt+0.052
op+0.051
 Some+0.048
ves+0.048
Neg logit contributions
 herself-0.041
 myself-0.035
 forth-0.035
zie-0.032
 heads-0.031
Token very
Token ID845
Feature activation-5.47e-01

Pos logit contributions
itt+0.037
 Some+0.034
ades+0.034
year+0.034
 of+0.033
Neg logit contributions
 heads-0.031
 forget-0.030
 forth-0.030
zie-0.028
 fuss-0.028
Token curious
Token ID11040
Feature activation-2.23e+00

Pos logit contributions
itt+0.025
 Some+0.024
op+0.024
ades+0.024
glass+0.023
Neg logit contributions
 forget-0.016
 heads-0.016
 fuss-0.015
 myself-0.015
zie-0.015
Token three
Token ID1115
Feature activation-3.32e+00

Pos logit contributions
itt+0.096
year+0.086
 Some+0.085
op+0.084
ades+0.082
Neg logit contributions
 heads-0.079
zie-0.076
 forth-0.073
 herself-0.073
 fuss-0.073
Token-
Token ID12
Feature activation-3.57e+00

Pos logit contributions
itt+0.110
 Some+0.104
year+0.098
imsy+0.096
op+0.096
Neg logit contributions
 heads-0.143
zie-0.137
 herself-0.136
 myself-0.136
 forth-0.136
Tokenyear
Token ID1941
Feature activation-6.43e+00

Pos logit contributions
 Some+0.069
ifying+0.064
 fires+0.061
ades+0.057
op+0.056
Neg logit contributions
 herself-0.205
 myself-0.199
keeping-0.196
zie-0.188
riet-0.182
Token-
Token ID12
Feature activation-3.36e+00

Pos logit contributions
 Some+0.152
itt+0.144
Will+0.122
 fires+0.122
imsy+0.118
Neg logit contributions
 heads-0.348
keeping-0.328
 myself-0.324
 herself-0.321
zie-0.316
Tokenold
Token ID727
Feature activation-2.29e+00

Pos logit contributions
 Some+0.105
ifying+0.102
 fires+0.095
itt+0.089
op+0.088
Neg logit contributions
 herself-0.156
 myself-0.151
keeping-0.149
zie-0.142
riet-0.132
Token boy
Token ID2933
Feature activation-1.17e+00

Pos logit contributions
itt+0.096
 Some+0.088
year+0.088
op+0.086
 of+0.085
Neg logit contributions
 heads-0.082
 myself-0.080
 herself-0.079
zie-0.079
keeping-0.078
Token.
Token ID13
Feature activation-7.24e-01

Pos logit contributions
itt+0.060
ades+0.056
year+0.055
 Some+0.055
 fires+0.054
Neg logit contributions
 herself-0.030
 forth-0.028
 heads-0.028
 myself-0.027
keeping-0.027
Token He
Token ID679
Feature activation-7.72e-01

Pos logit contributions
itt+0.025
 Some+0.024
 of+0.024
year+0.024
 fires+0.022
Neg logit contributions
 heads-0.030
 myself-0.030
zie-0.029
keeping-0.029
 herself-0.029
Token his
Token ID465
Feature activation-1.32e+00

Pos logit contributions
 heads+0.004
keeping+0.004
 myself+0.004
ishment+0.004
idents+0.004
Neg logit contributions
 of-0.004
year-0.003
itt-0.003
 Some-0.003
 fires-0.003
Token mum
Token ID25682
Feature activation+4.70e-01

Pos logit contributions
itt+0.053
 of+0.050
 Some+0.049
year+0.049
op+0.048
Neg logit contributions
 myself-0.047
 heads-0.047
zie-0.046
keeping-0.046
 herself-0.046
Token had
Token ID550
Feature activation+1.86e-01

Pos logit contributions
 heads+0.017
zie+0.017
 herself+0.017
 forth+0.017
 myself+0.016
Neg logit contributions
itt-0.019
 Some-0.018
year-0.017
 fires-0.017
op-0.017
Token lots
Token ID6041
Feature activation+7.34e-01

Pos logit contributions
 herself+0.008
 myself+0.007
 heads+0.007
keeping+0.007
zie+0.007
Neg logit contributions
itt-0.007
op-0.007
year-0.006
 Some-0.006
imsy-0.006
Token of
Token ID286
Feature activation-1.17e+00

Pos logit contributions
 forget+0.039
 herself+0.039
 fuss+0.038
 Holly+0.036
 forth+0.036
Neg logit contributions
itt-0.015
imsy-0.014
op-0.013
 Some-0.013
year-0.013
Token fun
Token ID1257
Feature activation+1.72e+00

Pos logit contributions
itt+0.057
year+0.055
op+0.055
 Some+0.053
 of+0.052
Neg logit contributions
 heads-0.033
 herself-0.032
 myself-0.031
 fuss-0.031
 forth-0.030
Token playing
Token ID2712
Feature activation-8.41e-01

Pos logit contributions
 herself+0.065
 heads+0.065
 forth+0.065
 myself+0.063
keeping+0.062
Neg logit contributions
itt-0.065
year-0.064
 Some-0.063
op-0.058
imsy-0.058
Token with
Token ID351
Feature activation+1.92e-01

Pos logit contributions
itt+0.032
year+0.030
 Some+0.030
op+0.028
 of+0.028
Neg logit contributions
 herself-0.033
 heads-0.033
 myself-0.032
 forth-0.032
keeping-0.032
Token the
Token ID262
Feature activation-3.40e-01

Pos logit contributions
 herself+0.006
 heads+0.006
keeping+0.006
 fuss+0.005
 myself+0.005
Neg logit contributions
itt-0.009
year-0.008
imsy-0.008
op-0.008
 Some-0.008
Token new
Token ID649
Feature activation-5.09e-02

Pos logit contributions
itt+0.014
year+0.013
 Some+0.013
op+0.013
 of+0.012
Neg logit contributions
 heads-0.013
zie-0.012
 herself-0.012
 myself-0.012
 forth-0.012
Token stuff
Token ID3404
Feature activation-3.13e-01

Pos logit contributions
itt+0.002
 of+0.002
 Some+0.002
year+0.002
op+0.002
Neg logit contributions
 heads-0.002
 myself-0.002
zie-0.002
 herself-0.002
keeping-0.002
Token for
Token ID329
Feature activation-8.24e-01

Pos logit contributions
itt+0.035
year+0.032
 Some+0.032
 of+0.032
op+0.031
Neg logit contributions
 heads-0.041
 myself-0.040
zie-0.040
 herself-0.040
keeping-0.039
Token her
Token ID607
Feature activation+2.50e-01

Pos logit contributions
itt+0.027
op+0.025
year+0.025
 Some+0.024
 of+0.024
Neg logit contributions
 herself-0.037
 heads-0.037
 myself-0.037
zie-0.036
keeping-0.035
Token blocks
Token ID7021
Feature activation-1.28e+00

Pos logit contributions
 heads+0.009
 herself+0.009
 myself+0.009
 forth+0.009
 fuss+0.008
Neg logit contributions
year-0.011
itt-0.010
 Some-0.010
op-0.009
 fires-0.009
Token and
Token ID290
Feature activation+1.75e-01

Pos logit contributions
itt+0.051
 Some+0.047
year+0.047
op+0.046
 fires+0.045
Neg logit contributions
 heads-0.048
 herself-0.048
 myself-0.047
zie-0.047
keeping-0.046
Token saw
Token ID2497
Feature activation-1.58e+00

Pos logit contributions
 myself+0.007
 heads+0.006
zie+0.006
keeping+0.006
 herself+0.006
Neg logit contributions
itt-0.007
 of-0.006
year-0.006
 Some-0.006
op-0.006
Token them
Token ID606
Feature activation+1.80e+00

Pos logit contributions
itt+0.058
op+0.054
year+0.054
 Some+0.051
 of+0.050
Neg logit contributions
 herself-0.067
 heads-0.063
 myself-0.063
zie-0.061
keeping-0.059
Token in
Token ID287
Feature activation+1.22e+00

Pos logit contributions
 herself+0.065
 heads+0.065
 myself+0.062
 forth+0.062
zie+0.060
Neg logit contributions
itt-0.077
year-0.072
 Some-0.071
op-0.068
 fires-0.066
Token the
Token ID262
Feature activation+6.06e-01

Pos logit contributions
zie+0.058
 myself+0.056
 heads+0.056
 forth+0.056
keeping+0.053
Neg logit contributions
 of-0.037
 Some-0.036
itt-0.034
 fires-0.033
year-0.032
Token corner
Token ID5228
Feature activation-1.57e+00

Pos logit contributions
 heads+0.020
zie+0.019
 myself+0.019
 herself+0.019
 forth+0.018
Neg logit contributions
itt-0.027
year-0.027
op-0.026
 Some-0.025
 of-0.025
Token of
Token ID286
Feature activation-1.47e+00

Pos logit contributions
itt+0.072
op+0.066
year+0.065
 Some+0.064
imsy+0.064
Neg logit contributions
 herself-0.050
 heads-0.047
 myself-0.044
zie-0.044
 fuss-0.042
Token the
Token ID262
Feature activation+1.34e-01

Pos logit contributions
 of+0.045
itt+0.043
 Some+0.043
 fires+0.040
year+0.040
Neg logit contributions
 heads-0.067
zie-0.066
 myself-0.065
 forth-0.064
 forget-0.063
Token's
Token ID338
Feature activation+6.00e-01

Pos logit contributions
 herself+0.108
 heads+0.106
 myself+0.104
 forth+0.102
zie+0.099
Neg logit contributions
itt-0.108
year-0.105
 Some-0.102
 fires-0.095
ades-0.092
Token favorite
Token ID4004
Feature activation-9.94e-02

Pos logit contributions
 heads+0.020
 myself+0.019
 herself+0.019
zie+0.019
keeping+0.018
Neg logit contributions
itt-0.027
year-0.026
 Some-0.025
 of-0.025
op-0.025
Token toy
Token ID13373
Feature activation+5.89e-01

Pos logit contributions
itt+0.004
year+0.004
 Some+0.004
op+0.004
 of+0.004
Neg logit contributions
 heads-0.004
 myself-0.004
 herself-0.004
 forth-0.004
zie-0.003
Token and
Token ID290
Feature activation+8.36e-01

Pos logit contributions
 heads+0.024
 herself+0.024
 myself+0.023
 forth+0.023
zie+0.022
Neg logit contributions
itt-0.022
year-0.022
 Some-0.021
 fires-0.019
op-0.019
Token it
Token ID340
Feature activation+2.26e-01

Pos logit contributions
 heads+0.031
 myself+0.031
 herself+0.030
zie+0.030
 forth+0.029
Neg logit contributions
itt-0.034
year-0.031
 Some-0.031
 of-0.031
op-0.030
Token broke
Token ID6265
Feature activation+1.95e+00

Pos logit contributions
 heads+0.009
 myself+0.009
zie+0.009
 herself+0.009
keeping+0.009
Neg logit contributions
itt-0.008
 of-0.007
 Some-0.007
year-0.007
op-0.007
Token.
Token ID13
Feature activation+2.93e-01

Pos logit contributions
zie+0.075
 heads+0.074
 myself+0.072
keeping+0.070
 forth+0.070
Neg logit contributions
itt-0.073
 of-0.071
year-0.068
 Some-0.068
op-0.067
Token Lily
Token ID20037
Feature activation+1.62e+00

Pos logit contributions
 heads+0.010
 forth+0.010
 myself+0.010
zie+0.010
keeping+0.010
Neg logit contributions
 Some-0.013
itt-0.013
 of-0.012
op-0.011
year-0.011
Token was
Token ID373
Feature activation-3.44e-01

Pos logit contributions
zie+0.065
 myself+0.065
keeping+0.063
 heads+0.063
 herself+0.062
Neg logit contributions
itt-0.060
 of-0.056
op-0.056
 Some-0.054
year-0.053
Token so
Token ID523
Feature activation+6.69e-01

Pos logit contributions
itt+0.013
 of+0.012
year+0.012
 Some+0.012
op+0.011
Neg logit contributions
 heads-0.014
zie-0.014
 myself-0.014
 herself-0.013
keeping-0.013
Token sad
Token ID6507
Feature activation-5.13e-01

Pos logit contributions
 herself+0.023
 heads+0.023
 myself+0.022
 forth+0.021
 fuss+0.021
Neg logit contributions
itt-0.028
 Some-0.027
 fires-0.027
year-0.026
op-0.026
Token He
Token ID679
Feature activation-1.37e-01

Pos logit contributions
 heads+0.008
 myself+0.008
 forth+0.007
zie+0.007
 herself+0.007
Neg logit contributions
itt-0.007
 Some-0.007
 of-0.006
year-0.006
op-0.006
Token figured
Token ID11638
Feature activation-4.64e-01

Pos logit contributions
itt+0.006
year+0.005
 Some+0.005
 of+0.005
op+0.005
Neg logit contributions
 heads-0.005
 myself-0.005
zie-0.005
 herself-0.005
keeping-0.005
Token out
Token ID503
Feature activation+2.97e-01

Pos logit contributions
itt+0.017
op+0.016
year+0.015
 Some+0.015
 fires+0.015
Neg logit contributions
 herself-0.020
zie-0.018
 myself-0.018
 heads-0.017
keeping-0.017
Token they
Token ID484
Feature activation+1.14e+00

Pos logit contributions
 herself+0.008
 heads+0.007
 myself+0.007
zie+0.006
keeping+0.006
Neg logit contributions
itt-0.017
op-0.016
year-0.015
imsy-0.014
 Some-0.014
Token must
Token ID1276
Feature activation+5.18e-01

Pos logit contributions
 heads+0.040
 herself+0.039
 myself+0.039
zie+0.039
keeping+0.037
Neg logit contributions
itt-0.049
 Some-0.045
year-0.045
op-0.044
 of-0.044
Token be
Token ID307
Feature activation+2.08e+00

Pos logit contributions
 heads+0.017
zie+0.016
 myself+0.016
 forth+0.016
keeping+0.016
Neg logit contributions
itt-0.022
 of-0.022
 Some-0.021
year-0.020
op-0.020
Token his
Token ID465
Feature activation-1.97e-01

Pos logit contributions
 heads+0.082
 herself+0.082
 myself+0.082
zie+0.080
keeping+0.079
Neg logit contributions
itt-0.080
 Some-0.074
op-0.073
year-0.072
 of-0.070
Token friends
Token ID2460
Feature activation+2.15e+00

Pos logit contributions
itt+0.008
year+0.007
 Some+0.007
 of+0.007
op+0.007
Neg logit contributions
 heads-0.008
 myself-0.008
zie-0.007
 herself-0.007
keeping-0.007
Token.
Token ID13
Feature activation+1.09e-01

Pos logit contributions
 herself+0.074
 myself+0.074
keeping+0.073
 heads+0.072
zie+0.072
Neg logit contributions
itt-0.093
 Some-0.088
year-0.086
op-0.083
 fires-0.081
Token The
Token ID383
Feature activation-5.47e-02

Pos logit contributions
 heads+0.005
 forth+0.004
 myself+0.004
zie+0.004
 herself+0.004
Neg logit contributions
 Some-0.004
itt-0.004
 of-0.004
year-0.004
op-0.004
Token young
Token ID1862
Feature activation+2.82e-01

Pos logit contributions
itt+0.002
 of+0.002
 Some+0.002
year+0.002
op+0.002
Neg logit contributions
 myself-0.002
zie-0.002
 heads-0.002
keeping-0.002
 herself-0.002
Token
Token ID198
Feature activation-1.00e-01

Pos logit contributions
itt+0.021
year+0.020
 Some+0.020
 of+0.020
op+0.019
Neg logit contributions
 heads-0.019
 myself-0.019
zie-0.019
 herself-0.019
keeping-0.018
TokenThey
Token ID2990
Feature activation+2.31e-01

Pos logit contributions
itt+0.004
 Some+0.004
year+0.004
 of+0.004
 fires+0.004
Neg logit contributions
 myself-0.003
 heads-0.003
 herself-0.003
zie-0.003
keeping-0.003
Token want
Token ID765
Feature activation-4.50e-01

Pos logit contributions
 heads+0.008
zie+0.007
 herself+0.007
 myself+0.007
 forth+0.007
Neg logit contributions
itt-0.010
year-0.010
 Some-0.010
op-0.009
 of-0.009
Token to
Token ID284
Feature activation+3.24e-01

Pos logit contributions
year+0.016
itt+0.016
imsy+0.015
ug+0.014
op+0.014
Neg logit contributions
 herself-0.020
 heads-0.019
 fuss-0.019
keeping-0.018
 myself-0.018
Token weigh
Token ID10164
Feature activation-6.51e-01

Pos logit contributions
 heads+0.012
 myself+0.012
 herself+0.012
zie+0.012
keeping+0.011
Neg logit contributions
itt-0.013
year-0.012
 Some-0.012
op-0.012
 of-0.012
Token themselves
Token ID2405
Feature activation+1.94e+00

Pos logit contributions
itt+0.025
year+0.024
 Some+0.024
op+0.023
 of+0.023
Neg logit contributions
 heads-0.025
 myself-0.025
 herself-0.025
zie-0.024
 forth-0.023
Token.
Token ID13
Feature activation-1.99e-01

Pos logit contributions
 herself+0.070
 myself+0.070
 heads+0.068
 forth+0.065
zie+0.063
Neg logit contributions
itt-0.085
year-0.079
 Some-0.079
op-0.073
 fires-0.071
Token Lily
Token ID20037
Feature activation+1.54e+00

Pos logit contributions
itt+0.009
year+0.008
 Some+0.008
 of+0.008
op+0.008
Neg logit contributions
 heads-0.007
 myself-0.006
zie-0.006
 herself-0.006
keeping-0.006
Token goes
Token ID2925
Feature activation+1.52e-01

Pos logit contributions
 myself+0.045
keeping+0.044
zie+0.044
 herself+0.044
 heads+0.043
Neg logit contributions
itt-0.073
 Some-0.069
 of-0.069
op-0.068
year-0.067
Token first
Token ID717
Feature activation-2.94e-01

Pos logit contributions
 herself+0.006
 myself+0.005
 forth+0.005
 heads+0.005
zie+0.005
Neg logit contributions
itt-0.007
 Some-0.006
op-0.006
ended-0.006
year-0.006
Token.
Token ID13
Feature activation-2.20e-01

Pos logit contributions
itt+0.012
year+0.011
 Some+0.011
 of+0.010
op+0.010
Neg logit contributions
 heads-0.011
 myself-0.011
 herself-0.011
zie-0.011
keeping-0.011
Token new
Token ID649
Feature activation+9.62e-01

Pos logit contributions
 heads+0.013
 fuss+0.013
 herself+0.013
 forget+0.013
 myself+0.013
Neg logit contributions
itt-0.014
year-0.013
op-0.013
 Some-0.013
 fires-0.012
Token place
Token ID1295
Feature activation+1.51e+00

Pos logit contributions
 heads+0.042
 myself+0.041
zie+0.041
 herself+0.041
keeping+0.040
Neg logit contributions
itt-0.032
year-0.030
 Some-0.029
 of-0.029
op-0.029
Token to
Token ID284
Feature activation+8.83e-01

Pos logit contributions
 herself+0.054
 myself+0.054
 heads+0.052
keeping+0.052
zie+0.050
Neg logit contributions
itt-0.065
 Some-0.059
op-0.058
year-0.057
 fires-0.053
Token explore
Token ID7301
Feature activation+8.96e-01

Pos logit contributions
 myself+0.030
 herself+0.030
 heads+0.029
zie+0.029
 forth+0.028
Neg logit contributions
itt-0.039
year-0.036
 Some-0.036
op-0.036
 of-0.034
Token.
Token ID13
Feature activation-2.01e-01

Pos logit contributions
 herself+0.029
 myself+0.028
 heads+0.027
zie+0.026
keeping+0.026
Neg logit contributions
itt-0.042
op-0.039
year-0.038
 Some-0.038
imsy-0.036
Token
Token ID198
Feature activation-1.03e-01

Pos logit contributions
itt+0.007
 Some+0.007
 of+0.007
year+0.007
op+0.007
Neg logit contributions
 heads-0.008
zie-0.008
 myself-0.008
 forth-0.008
keeping-0.008
Token
Token ID198
Feature activation+2.53e-01

Pos logit contributions
itt+0.004
year+0.003
 Some+0.003
 of+0.003
op+0.003
Neg logit contributions
 heads-0.004
 myself-0.004
zie-0.004
 herself-0.004
keeping-0.004
TokenThe
Token ID464
Feature activation-2.68e-01

Pos logit contributions
 myself+0.008
 herself+0.008
zie+0.008
 heads+0.007
 fuss+0.007
Neg logit contributions
itt-0.011
 Some-0.010
op-0.010
year-0.010
oses-0.010
Token moral
Token ID6573
Feature activation-1.31e-01

Pos logit contributions
itt+0.012
year+0.011
op+0.011
 Some+0.011
 of+0.010
Neg logit contributions
 heads-0.009
 myself-0.009
 herself-0.009
zie-0.008
 forth-0.008
Token of
Token ID286
Feature activation-2.71e+00

Pos logit contributions
itt+0.003
op+0.002
oses+0.002
imsy+0.002
 fires+0.002
Neg logit contributions
 herself-0.007
 myself-0.007
 heads-0.007
keeping-0.007
 thinks-0.007
Token the
Token ID262
Feature activation-1.27e+00

Pos logit contributions
itt+0.104
 Some+0.096
year+0.096
 of+0.095
op+0.094
Neg logit contributions
 heads-0.104
 myself-0.104
zie-0.103
 herself-0.102
keeping-0.100
Token€
Token ID26391
Feature activation+1.01e+00

Pos logit contributions
 heads+0.039
 herself+0.037
keeping+0.036
 myself+0.036
zie+0.034
Neg logit contributions
itt-0.089
year-0.085
op-0.085
 of-0.084
 Some-0.082
Token asked
Token ID1965
Feature activation+6.67e-01

Pos logit contributions
 heads+0.036
zie+0.035
 myself+0.035
 herself+0.035
keeping+0.033
Neg logit contributions
itt-0.043
op-0.040
 Some-0.039
 of-0.039
year-0.038
Token Simon
Token ID11288
Feature activation-5.16e-03

Pos logit contributions
 herself+0.022
 myself+0.022
 heads+0.022
zie+0.021
 forth+0.020
Neg logit contributions
itt-0.030
year-0.028
 Some-0.028
op-0.027
 of-0.027
Token.
Token ID13
Feature activation+7.43e-01

Pos logit contributions
itt+0.000
year+0.000
 Some+0.000
op+0.000
 of+0.000
Neg logit contributions
 myself-0.000
 herself-0.000
 heads-0.000
zie-0.000
 forth-0.000
Token
Token ID198
Feature activation+1.82e-01

Pos logit contributions
 heads+0.024
 myself+0.024
zie+0.024
 herself+0.023
keeping+0.023
Neg logit contributions
itt-0.033
year-0.031
 Some-0.030
 of-0.030
op-0.030
Token
Token ID198
Feature activation+4.43e-01

Pos logit contributions
 herself+0.005
 myself+0.005
zie+0.005
 Holly+0.005
 heads+0.004
Neg logit contributions
itt-0.008
 fires-0.008
year-0.008
 Some-0.008
 of-0.008
Tokenâ
Token ID22940
Feature activation+1.49e+00

Pos logit contributions
 myself+0.014
 herself+0.013
 Holly+0.013
zie+0.012
 fuss+0.012
Neg logit contributions
itt-0.019
 fires-0.018
 Some-0.018
year-0.018
 of-0.017
Token€
Token ID26391
Feature activation+8.30e-01

Pos logit contributions
 heads+0.034
 herself+0.033
keeping+0.033
 myself+0.032
 fuss+0.031
Neg logit contributions
itt-0.077
year-0.074
op-0.074
 of-0.071
 Some-0.071
Token<0xC5>
Token ID129
Feature activation-5.57e-02

Pos logit contributions
 herself+0.032
 heads+0.032
 myself+0.032
zie+0.032
 forth+0.031
Neg logit contributions
itt-0.032
op-0.030
 Some-0.029
 of-0.028
 fires-0.028
Token<0x93>
Token ID241
Feature activation+4.00e-01

Pos logit contributions
itt+0.002
year+0.002
 Some+0.002
 of+0.002
op+0.002
Neg logit contributions
 heads-0.002
zie-0.002
 myself-0.002
 herself-0.002
keeping-0.002
TokenIt
Token ID1026
Feature activation+6.40e-01

Pos logit contributions
 herself+0.014
 myself+0.014
 heads+0.013
keeping+0.013
zie+0.013
Neg logit contributions
itt-0.016
 fires-0.015
 of-0.015
year-0.015
 Some-0.015
Token forget
Token ID6044
Feature activation+6.23e-01

Pos logit contributions
zie+0.043
 heads+0.042
 myself+0.042
keeping+0.040
 forth+0.040
Neg logit contributions
 of-0.045
 Some-0.039
itt-0.037
year-0.037
 fires-0.036
Token how
Token ID703
Feature activation+3.82e-01

Pos logit contributions
 herself+0.026
 myself+0.024
zie+0.023
 heads+0.022
 fuss+0.022
Neg logit contributions
itt-0.027
year-0.024
op-0.023
 Some-0.023
imsy-0.022
Token nice
Token ID3621
Feature activation+1.85e+00

Pos logit contributions
 herself+0.017
 myself+0.017
 heads+0.016
zie+0.015
keeping+0.015
Neg logit contributions
itt-0.013
 Some-0.012
op-0.012
year-0.011
glass-0.011
Token it
Token ID340
Feature activation-6.84e-01

Pos logit contributions
 heads+0.076
zie+0.074
 myself+0.072
keeping+0.072
 fuss+0.069
Neg logit contributions
 of-0.068
itt-0.063
 fires-0.060
 Some-0.059
op-0.058
Token was
Token ID373
Feature activation-6.45e-01

Pos logit contributions
itt+0.022
 of+0.022
year+0.021
op+0.021
 Some+0.020
Neg logit contributions
 myself-0.029
keeping-0.028
 heads-0.028
zie-0.028
 herself-0.028
Token to
Token ID284
Feature activation+4.29e-01

Pos logit contributions
itt+0.026
year+0.024
 Some+0.024
op+0.023
 fires+0.022
Neg logit contributions
 heads-0.024
 herself-0.024
 myself-0.024
zie-0.024
keeping-0.023
Token have
Token ID423
Feature activation+1.04e+00

Pos logit contributions
 heads+0.016
keeping+0.016
zie+0.015
 myself+0.015
 forth+0.015
Neg logit contributions
 of-0.016
itt-0.014
 fires-0.014
 Some-0.014
year-0.013
Token water
Token ID1660
Feature activation+1.93e+00

Pos logit contributions
 heads+0.041
 myself+0.041
zie+0.040
 herself+0.040
keeping+0.039
Neg logit contributions
itt-0.039
 Some-0.036
 of-0.036
year-0.036
op-0.035
Token when
Token ID618
Feature activation+1.42e+00

Pos logit contributions
 herself+0.075
 heads+0.074
 myself+0.073
keeping+0.072
zie+0.072
Neg logit contributions
itt-0.076
year-0.072
 Some-0.071
op-0.067
 fires-0.063
Token she
Token ID673
Feature activation+6.54e-01

Pos logit contributions
 heads+0.048
zie+0.047
 myself+0.047
keeping+0.045
 herself+0.045
Neg logit contributions
itt-0.061
 of-0.058
 Some-0.057
year-0.056
op-0.056
Token was
Token ID373
Feature activation+8.16e-01

Pos logit contributions
 heads+0.022
zie+0.021
 herself+0.020
 myself+0.020
 forth+0.020
Neg logit contributions
itt-0.029
 Some-0.027
year-0.027
 fires-0.026
op-0.026
Token
Token ID198
Feature activation+1.51e-02

Pos logit contributions
 heads+0.014
 myself+0.014
zie+0.014
 herself+0.014
 forth+0.013
Neg logit contributions
itt-0.015
op-0.014
 of-0.013
 Some-0.013
year-0.013
Token
Token ID198
Feature activation+5.09e-01

Pos logit contributions
 myself+0.001
 heads+0.001
zie+0.001
 herself+0.001
keeping+0.001
Neg logit contributions
itt-0.001
year-0.001
 Some-0.001
 of-0.001
op-0.001
TokenThe
Token ID464
Feature activation-1.43e-01

Pos logit contributions
 myself+0.020
 herself+0.020
zie+0.019
keeping+0.019
 heads+0.018
Neg logit contributions
itt-0.019
 Some-0.017
 fires-0.017
year-0.017
op-0.017
Token girl
Token ID2576
Feature activation+1.16e+00

Pos logit contributions
itt+0.006
year+0.006
 Some+0.006
op+0.006
 of+0.006
Neg logit contributions
 heads-0.005
 myself-0.005
 herself-0.005
zie-0.005
 forth-0.005
Token was
Token ID373
Feature activation-7.53e-01

Pos logit contributions
 heads+0.037
zie+0.036
 herself+0.036
 myself+0.035
 forth+0.035
Neg logit contributions
itt-0.052
year-0.050
 Some-0.049
 fires-0.048
op-0.047
Token filled
Token ID5901
Feature activation+1.04e-01

Pos logit contributions
itt+0.034
 Some+0.031
op+0.031
year+0.030
 fires+0.030
Neg logit contributions
 herself-0.026
 myself-0.025
zie-0.024
 heads-0.024
 forth-0.023
Token with
Token ID351
Feature activation+6.43e-01

Pos logit contributions
 myself+0.005
 herself+0.005
 heads+0.005
keeping+0.005
zie+0.004
Neg logit contributions
itt-0.003
 Some-0.003
year-0.003
op-0.003
 fires-0.003
Token happy
Token ID3772
Feature activation-6.62e-01

Pos logit contributions
 herself+0.018
 heads+0.017
 myself+0.017
zie+0.016
keeping+0.016
Neg logit contributions
itt-0.032
op-0.030
year-0.030
 Some-0.030
 fires-0.029
Token feelings
Token ID7666
Feature activation+1.05e+00

Pos logit contributions
itt+0.029
 of+0.028
 Some+0.027
year+0.027
op+0.027
Neg logit contributions
 heads-0.021
 myself-0.021
zie-0.021
 herself-0.021
keeping-0.020
Token as
Token ID355
Feature activation+1.37e+00

Pos logit contributions
 heads+0.042
 myself+0.042
zie+0.041
 herself+0.040
keeping+0.040
Neg logit contributions
itt-0.038
 of-0.037
 Some-0.035
year-0.035
op-0.035
Token she
Token ID673
Feature activation-7.67e-01

Pos logit contributions
 herself+0.042
 myself+0.041
zie+0.039
 heads+0.038
keeping+0.037
Neg logit contributions
itt-0.065
 Some-0.062
year-0.061
op-0.061
 fires-0.059
Token.
Token ID13
Feature activation-1.35e-01

Pos logit contributions
 heads+0.008
 herself+0.008
 myself+0.008
zie+0.008
keeping+0.008
Neg logit contributions
itt-0.008
 Some-0.008
year-0.008
op-0.008
 fires-0.007
Token
Token ID198
Feature activation-5.79e-01

Pos logit contributions
 Some+0.004
itt+0.004
 of+0.004
year+0.004
op+0.004
Neg logit contributions
 heads-0.007
 forth-0.007
 fuss-0.006
 forget-0.006
 mess-0.006
Token
Token ID198
Feature activation-1.46e-01

Pos logit contributions
itt+0.021
year+0.019
 Some+0.019
 of+0.019
op+0.019
Neg logit contributions
 heads-0.024
 myself-0.023
zie-0.023
 herself-0.023
keeping-0.022
TokenBut
Token ID1537
Feature activation-6.58e-01

Pos logit contributions
itt+0.006
 Some+0.005
op+0.005
year+0.005
 fires+0.005
Neg logit contributions
 herself-0.005
 myself-0.005
zie-0.005
keeping-0.005
 heads-0.005
Token then
Token ID788
Feature activation-1.15e-01

Pos logit contributions
itt+0.029
 Some+0.027
op+0.026
year+0.026
ades+0.025
Neg logit contributions
 herself-0.023
zie-0.022
 myself-0.022
 heads-0.022
 forth-0.022
Token Amanda
Token ID23040
Feature activation-8.36e-02

Pos logit contributions
itt+0.004
year+0.004
 Some+0.004
op+0.004
 of+0.004
Neg logit contributions
 heads-0.004
 myself-0.004
 herself-0.004
zie-0.004
 forth-0.004
Token started
Token ID2067
Feature activation+2.43e-01

Pos logit contributions
itt+0.004
year+0.004
 Some+0.004
 fires+0.003
op+0.003
Neg logit contributions
 heads-0.003
zie-0.003
 forth-0.003
 herself-0.003
 forget-0.003
Token to
Token ID284
Feature activation+2.10e-01

Pos logit contributions
 herself+0.013
 heads+0.013
 fuss+0.012
keeping+0.012
zie+0.012
Neg logit contributions
itt-0.007
year-0.006
 Some-0.005
op-0.005
 fires-0.005
Token fuss
Token ID34297
Feature activation+1.77e+00

Pos logit contributions
 myself+0.006
keeping+0.006
zie+0.006
 heads+0.006
 herself+0.005
Neg logit contributions
 of-0.009
itt-0.009
 Some-0.009
op-0.009
 fires-0.009
Token and
Token ID290
Feature activation-2.27e-01

Pos logit contributions
 herself+0.068
 myself+0.061
 heads+0.061
zie+0.059
 forth+0.056
Neg logit contributions
itt-0.076
 Some-0.070
year-0.068
 fires-0.065
op-0.063
Token demanding
Token ID11334
Feature activation-7.67e-01

Pos logit contributions
itt+0.010
year+0.009
 Some+0.009
 of+0.009
op+0.009
Neg logit contributions
 heads-0.008
 myself-0.008
zie-0.008
 herself-0.008
keeping-0.008
Token Can
Token ID1680
Feature activation+2.04e-01

Pos logit contributions
 myself+0.020
 heads+0.019
 herself+0.019
zie+0.018
keeping+0.018
Neg logit contributions
itt-0.025
year-0.024
op-0.024
 of-0.023
ades-0.023
Token we
Token ID356
Feature activation+1.23e+00

Pos logit contributions
 myself+0.006
 herself+0.005
 heads+0.005
zie+0.005
 forth+0.005
Neg logit contributions
year-0.010
itt-0.010
op-0.010
 fires-0.010
 Some-0.009
Token play
Token ID711
Feature activation+6.26e-01

Pos logit contributions
 heads+0.040
 myself+0.039
zie+0.039
 herself+0.039
keeping+0.037
Neg logit contributions
itt-0.055
year-0.052
 Some-0.051
 of-0.050
op-0.050
Token with
Token ID351
Feature activation+6.97e-01

Pos logit contributions
 myself+0.024
 herself+0.024
 forth+0.022
 heads+0.021
 fuss+0.021
Neg logit contributions
itt-0.026
year-0.024
op-0.023
 Some-0.023
imsy-0.022
Token our
Token ID674
Feature activation-1.80e-01

Pos logit contributions
 myself+0.024
 herself+0.023
 heads+0.022
 forth+0.021
 fuss+0.020
Neg logit contributions
year-0.032
itt-0.031
op-0.029
 Some-0.029
imsy-0.029
Token cars
Token ID5006
Feature activation-1.90e+00

Pos logit contributions
itt+0.007
year+0.006
 Some+0.006
 of+0.006
op+0.006
Neg logit contributions
 heads-0.007
 myself-0.007
 herself-0.007
zie-0.007
 forth-0.007
Token again
Token ID757
Feature activation-3.35e-02

Pos logit contributions
itt+0.085
year+0.080
 Some+0.079
op+0.077
 of+0.075
Neg logit contributions
 myself-0.062
 heads-0.061
 herself-0.060
zie-0.060
 forth-0.059
Token?"
Token ID1701
Feature activation+3.35e-01

Pos logit contributions
itt+0.001
year+0.001
 Some+0.001
 of+0.001
op+0.001
Neg logit contributions
 heads-0.001
 myself-0.001
zie-0.001
 herself-0.001
keeping-0.001
Token Their
Token ID5334
Feature activation-7.14e-01

Pos logit contributions
 heads+0.015
zie+0.015
 myself+0.015
 forth+0.014
keeping+0.014
Neg logit contributions
 Some-0.012
itt-0.011
 of-0.011
op-0.010
year-0.010
Token mom
Token ID1995
Feature activation-7.31e-01

Pos logit contributions
itt+0.026
 Some+0.024
 of+0.024
op+0.024
 fires+0.023
Neg logit contributions
 myself-0.029
zie-0.029
 herself-0.028
keeping-0.028
 heads-0.027
Token says
Token ID1139
Feature activation+6.88e-01

Pos logit contributions
 of+0.035
itt+0.035
 Some+0.034
 fires+0.032
op+0.032
Neg logit contributions
keeping-0.024
 myself-0.020
 herself-0.019
zie-0.019
 heads-0.018
Token
Token ID198
Feature activation-1.44e-01

Pos logit contributions
 of+0.019
itt+0.019
 Some+0.018
op+0.017
year+0.016
Neg logit contributions
 heads-0.030
 forth-0.029
zie-0.029
keeping-0.028
 myself-0.028
Token"
Token ID1
Feature activation+2.64e-01

Pos logit contributions
itt+0.005
 of+0.004
 Some+0.004
op+0.004
year+0.004
Neg logit contributions
 heads-0.007
zie-0.007
keeping-0.006
 myself-0.006
 forth-0.006
TokenCan
Token ID6090
Feature activation-6.86e-01

Pos logit contributions
 heads+0.008
 myself+0.008
 herself+0.008
zie+0.007
keeping+0.007
Neg logit contributions
itt-0.012
 of-0.011
op-0.011
year-0.011
 fires-0.011
Token we
Token ID356
Feature activation+9.46e-01

Pos logit contributions
itt+0.034
year+0.034
op+0.032
 fires+0.031
 Some+0.031
Neg logit contributions
 myself-0.020
 heads-0.019
 herself-0.019
zie-0.018
 forth-0.017
Token please
Token ID3387
Feature activation+2.76e-01

Pos logit contributions
 heads+0.034
 myself+0.034
zie+0.034
keeping+0.033
 herself+0.033
Neg logit contributions
itt-0.038
 of-0.036
 Some-0.035
op-0.035
year-0.034
Token see
Token ID766
Feature activation-1.28e+00

Pos logit contributions
 heads+0.009
 myself+0.009
zie+0.009
 herself+0.009
keeping+0.009
Neg logit contributions
itt-0.012
year-0.011
 Some-0.011
 of-0.011
op-0.011
Token the
Token ID262
Feature activation+2.84e-01

Pos logit contributions
itt+0.061
year+0.058
op+0.055
imsy+0.054
 Some+0.053
Neg logit contributions
 myself-0.043
 herself-0.041
 forth-0.039
 heads-0.039
zie-0.037
Token baby
Token ID5156
Feature activation+9.46e-02

Pos logit contributions
keeping+0.013
zie+0.013
 heads+0.013
 herself+0.012
 myself+0.012
Neg logit contributions
itt-0.009
 of-0.008
 Some-0.008
 fires-0.008
op-0.008
Token ze
Token ID41271
Feature activation-2.30e+00

Pos logit contributions
 heads+0.003
keeping+0.003
zie+0.003
 herself+0.003
 myself+0.003
Neg logit contributions
 of-0.004
itt-0.004
 fires-0.004
 Some-0.004
op-0.004
Tokenbr
Token ID1671
Feature activation-1.89e+00

Pos logit contributions
year+0.044
itt+0.042
 Some+0.041
 fires+0.038
imsy+0.036
Neg logit contributions
 myself-0.128
zie-0.123
 herself-0.122
 heads-0.120
 forth-0.119
Tokenas
Token ID292
Feature activation-9.73e-01

Pos logit contributions
itt+0.055
 Some+0.052
year+0.050
 fires+0.048
 of+0.047
Neg logit contributions
 myself-0.088
 heads-0.087
zie-0.086
 herself-0.085
keeping-0.082
Token cried
Token ID16896
Feature activation+4.22e-01

Pos logit contributions
 of+0.048
itt+0.047
op+0.046
 Some+0.044
year+0.044
Neg logit contributions
 myself-0.029
zie-0.029
keeping-0.028
 heads-0.026
 herself-0.025
Token.
Token ID13
Feature activation-3.15e-01

Pos logit contributions
 heads+0.014
 herself+0.014
 myself+0.013
zie+0.012
 forth+0.012
Neg logit contributions
itt-0.020
year-0.018
 Some-0.018
op-0.017
 fires-0.017
Token\n
Token ID198
Feature activation-2.48e-01

Pos logit contributions
itt+0.012
 Some+0.011
 of+0.011
year+0.011
op+0.011
Neg logit contributions
 myself-0.012
zie-0.012
 heads-0.012
 herself-0.012
 forth-0.012
Token\n
Token ID198
Feature activation+7.61e-02

Pos logit contributions
itt+0.011
year+0.010
 Some+0.010
 fires+0.010
 of+0.010
Neg logit contributions
 heads-0.008
 myself-0.008
 herself-0.008
zie-0.008
keeping-0.008
TokenMom
Token ID29252
Feature activation-5.88e-01

Pos logit contributions
 myself+0.003
 heads+0.003
 herself+0.002
 forth+0.002
keeping+0.002
Neg logit contributions
itt-0.003
year-0.003
 Some-0.003
 fires-0.003
 of-0.003
Token heard
Token ID2982
Feature activation-2.02e+00

Pos logit contributions
itt+0.023
 Some+0.022
year+0.022
 fires+0.021
op+0.021
Neg logit contributions
 heads-0.022
zie-0.022
 herself-0.021
 forth-0.021
 fuss-0.020
Token the
Token ID262
Feature activation+2.02e-01

Pos logit contributions
itt+0.072
 Some+0.066
year+0.066
 of+0.066
op+0.064
Neg logit contributions
 heads-0.084
 myself-0.083
zie-0.083
 herself-0.081
keeping-0.080
Token noise
Token ID7838
Feature activation+1.88e+00

Pos logit contributions
 heads+0.004
 myself+0.004
zie+0.004
 herself+0.004
keeping+0.004
Neg logit contributions
itt-0.011
year-0.011
 Some-0.011
 of-0.011
op-0.011
Token and
Token ID290
Feature activation+5.27e-01

Pos logit contributions
 heads+0.072
 herself+0.071
 myself+0.070
zie+0.069
 forth+0.068
Neg logit contributions
itt-0.074
year-0.068
 Some-0.068
op-0.065
 fires-0.064
Token came
Token ID1625
Feature activation-2.35e-01

Pos logit contributions
 heads+0.023
zie+0.023
 myself+0.023
 herself+0.023
keeping+0.022
Neg logit contributions
itt-0.017
 Some-0.016
year-0.016
 of-0.016
op-0.016
Token to
Token ID284
Feature activation+9.88e-01

Pos logit contributions
itt+0.005
 of+0.005
year+0.005
 Some+0.004
op+0.004
Neg logit contributions
zie-0.013
 myself-0.013
 heads-0.013
keeping-0.012
 herself-0.012
Token.
Token ID13
Feature activation+2.64e-03

Pos logit contributions
 of+0.036
itt+0.032
op+0.031
 fires+0.030
-+0.029
Neg logit contributions
zie-0.045
 heads-0.045
 herself-0.043
keeping-0.043
 forth-0.042
Token You
Token ID921
Feature activation-9.33e-02

Pos logit contributions
 myself+0.000
 herself+0.000
 heads+0.000
zie+0.000
keeping+0.000
Neg logit contributions
itt-0.000
year-0.000
op-0.000
ades-0.000
 of-0.000
Token are
Token ID389
Feature activation+1.20e+00

Pos logit contributions
 of+0.004
itt+0.004
 Some+0.003
 fires+0.003
op+0.003
Neg logit contributions
zie-0.004
keeping-0.003
 heads-0.003
 myself-0.003
 herself-0.003
Token the
Token ID262
Feature activation-1.70e-01

Pos logit contributions
zie+0.062
 heads+0.062
keeping+0.061
 fuss+0.059
 forth+0.058
Neg logit contributions
 of-0.030
 fires-0.030
itt-0.027
year-0.027
 Some-0.027
Token best
Token ID1266
Feature activation-1.91e+00

Pos logit contributions
itt+0.006
year+0.006
 Some+0.006
op+0.006
 of+0.006
Neg logit contributions
 heads-0.007
 myself-0.007
 herself-0.007
zie-0.007
 forth-0.007
Token soldier
Token ID10686
Feature activation-1.55e+00

Pos logit contributions
itt+0.074
year+0.069
 Some+0.068
op+0.067
 fires+0.063
Neg logit contributions
 heads-0.073
 myself-0.072
 herself-0.071
keeping-0.070
zie-0.069
Token ever
Token ID1683
Feature activation-1.65e+00

Pos logit contributions
itt+0.060
 Some+0.057
year+0.056
op+0.055
 fires+0.053
Neg logit contributions
 myself-0.059
 heads-0.058
 herself-0.057
keeping-0.057
zie-0.055
Token,"
Token ID553
Feature activation+9.52e-01

Pos logit contributions
 of+0.055
-+0.041
year+0.040
 fires+0.039
 to+0.038
Neg logit contributions
zie-0.086
 heads-0.082
ishment-0.079
 Pixie-0.078
 Holly-0.076
Token he
Token ID339
Feature activation-2.03e+00

Pos logit contributions
 heads+0.042
 myself+0.042
 herself+0.042
zie+0.041
 forth+0.040
Neg logit contributions
itt-0.032
year-0.029
op-0.029
 Some-0.028
 of-0.028
Token said
Token ID531
Feature activation+2.30e-02

Pos logit contributions
itt+0.103
 of+0.101
year+0.100
 Some+0.099
op+0.097
Neg logit contributions
zie-0.052
 myself-0.052
 heads-0.050
keeping-0.050
 herself-0.049
Token.
Token ID13
Feature activation-4.47e-01

Pos logit contributions
 herself+0.001
 heads+0.001
 myself+0.001
keeping+0.000
 forth+0.000
Neg logit contributions
itt-0.001
 Some-0.001
 fires-0.001
year-0.001
op-0.001
Token the
Token ID262
Feature activation+3.40e-01

Pos logit contributions
itt+0.030
year+0.028
 Some+0.028
op+0.027
 of+0.027
Neg logit contributions
 heads-0.029
 myself-0.029
 herself-0.029
zie-0.029
keeping-0.028
Token grass
Token ID8701
Feature activation+7.78e-01

Pos logit contributions
keeping+0.011
zie+0.011
 myself+0.011
 heads+0.010
 herself+0.010
Neg logit contributions
itt-0.015
 of-0.015
 Some-0.014
op-0.014
 fires-0.013
Token.
Token ID13
Feature activation-4.61e-01

Pos logit contributions
 herself+0.024
 heads+0.024
 myself+0.024
zie+0.023
keeping+0.022
Neg logit contributions
itt-0.037
year-0.035
 Some-0.034
op-0.033
 fires-0.033
Token He
Token ID679
Feature activation-5.78e-01

Pos logit contributions
itt+0.021
 Some+0.020
 of+0.020
year+0.020
op+0.019
Neg logit contributions
 heads-0.015
 myself-0.014
zie-0.014
 forth-0.014
keeping-0.014
Token laughs
Token ID22051
Feature activation+2.81e-01

Pos logit contributions
itt+0.028
 Some+0.027
 of+0.026
year+0.026
op+0.026
Neg logit contributions
keeping-0.016
zie-0.016
 myself-0.016
 heads-0.016
 herself-0.016
Token and
Token ID290
Feature activation-1.35e+00

Pos logit contributions
 heads+0.011
 myself+0.011
 herself+0.011
zie+0.011
keeping+0.011
Neg logit contributions
itt-0.011
year-0.010
 Some-0.010
 of-0.010
op-0.009
Token gets
Token ID3011
Feature activation+4.96e-01

Pos logit contributions
 of+0.041
 Some+0.039
osh+0.037
 to+0.036
ch+0.035
Neg logit contributions
keeping-0.067
zie-0.059
oph-0.054
abby-0.053
ishment-0.052
Token up
Token ID510
Feature activation-8.41e-01

Pos logit contributions
 herself+0.018
 heads+0.017
 forth+0.017
 myself+0.017
 fuss+0.016
Neg logit contributions
itt-0.022
op-0.020
year-0.020
 Some-0.020
 of-0.019
Token.
Token ID13
Feature activation-5.21e-01

Pos logit contributions
itt+0.038
year+0.035
 Some+0.035
op+0.034
 fires+0.033
Neg logit contributions
 herself-0.029
 heads-0.028
 myself-0.028
 forth-0.027
zie-0.027
Token\n
Token ID198
Feature activation-7.22e-01

Pos logit contributions
itt+0.020
 Some+0.020
 of+0.019
year+0.019
op+0.018
Neg logit contributions
 heads-0.020
 myself-0.020
zie-0.020
keeping-0.019
 forth-0.019
Token\n
Token ID198
Feature activation-3.80e-01

Pos logit contributions
itt+0.029
year+0.028
 Some+0.027
 of+0.027
op+0.027
Neg logit contributions
 heads-0.025
 myself-0.025
zie-0.025
 herself-0.025
keeping-0.024
Token Lily
Token ID20037
Feature activation+1.25e+00

Pos logit contributions
 of+0.003
 Some+0.002
 fires+0.002
year+0.002
itt+0.002
Neg logit contributions
 heads-0.001
 myself-0.001
keeping-0.001
zie-0.001
 herself-0.001
Token were
Token ID547
Feature activation-3.17e-01

Pos logit contributions
 myself+0.053
 heads+0.052
zie+0.051
keeping+0.051
 herself+0.050
Neg logit contributions
itt-0.043
 of-0.040
 Some-0.039
op-0.039
year-0.039
Token known
Token ID1900
Feature activation-3.95e+00

Pos logit contributions
itt+0.014
year+0.013
op+0.013
 Some+0.012
imsy+0.012
Neg logit contributions
 herself-0.012
 myself-0.011
zie-0.011
 forth-0.011
 forget-0.010
Token as
Token ID355
Feature activation+8.09e-01

Pos logit contributions
itt+0.177
 Some+0.166
op+0.164
year+0.163
 fires+0.154
Neg logit contributions
 myself-0.133
 herself-0.132
 heads-0.129
 forth-0.122
zie-0.122
Token the
Token ID262
Feature activation-8.30e-01

Pos logit contributions
 herself+0.027
 myself+0.027
 heads+0.026
zie+0.026
keeping+0.026
Neg logit contributions
itt-0.035
op-0.033
year-0.033
 Some-0.032
 of-0.031
Token brav
Token ID49025
Feature activation-4.79e+00

Pos logit contributions
itt+0.034
op+0.034
year+0.033
 Some+0.031
 of+0.031
Neg logit contributions
 heads-0.031
 forth-0.029
 myself-0.029
zie-0.029
 forget-0.029
Tokenest
Token ID395
Feature activation-1.24e+00

Pos logit contributions
itt+0.225
op+0.208
year+0.205
 Some+0.199
 of+0.197
Neg logit contributions
 heads-0.154
zie-0.147
 myself-0.143
 herself-0.140
keeping-0.137
Token animals
Token ID4695
Feature activation-9.73e-01

Pos logit contributions
itt+0.047
year+0.046
op+0.043
 Some+0.042
imsy+0.042
Neg logit contributions
 fuss-0.046
 heads-0.046
 herself-0.045
 forth-0.044
zie-0.043
Token in
Token ID287
Feature activation-6.43e-01

Pos logit contributions
 Some+0.034
itt+0.033
op+0.031
year+0.030
imsy+0.030
Neg logit contributions
 herself-0.040
 forth-0.039
keeping-0.039
 myself-0.038
 forget-0.038
Token the
Token ID262
Feature activation-4.55e-03

Pos logit contributions
itt+0.025
year+0.024
op+0.024
 Some+0.023
 of+0.023
Neg logit contributions
 heads-0.024
 myself-0.024
 herself-0.024
zie-0.024
keeping-0.023
Token park
Token ID3952
Feature activation+2.33e-01

Pos logit contributions
itt+0.000
year+0.000
op+0.000
 Some+0.000
 of+0.000
Neg logit contributions
 heads-0.000
 forth-0.000
zie-0.000
 fuss-0.000
 myself-0.000
TokenSam
Token ID16305
Feature activation-2.56e+00

Pos logit contributions
 heads+0.050
 myself+0.049
zie+0.048
 forth+0.047
 herself+0.047
Neg logit contributions
itt-0.046
 of-0.044
 Some-0.043
year-0.042
op-0.041
Token was
Token ID373
Feature activation-1.18e+00

Pos logit contributions
 of+0.082
itt+0.079
op+0.074
 Some+0.074
 fires+0.073
Neg logit contributions
 myself-0.110
 heads-0.108
keeping-0.108
 herself-0.105
 fuss-0.103
Token a
Token ID257
Feature activation-7.13e-01

Pos logit contributions
ades+0.049
op+0.048
itt+0.048
 Some+0.046
year+0.044
Neg logit contributions
 herself-0.043
 myself-0.039
 forth-0.038
zie-0.037
 heads-0.037
Token three
Token ID1115
Feature activation-2.64e+00

Pos logit contributions
itt+0.030
year+0.029
 Some+0.028
ades+0.028
 of+0.028
Neg logit contributions
 heads-0.025
 forget-0.024
 forth-0.024
zie-0.022
 fuss-0.022
Token-
Token ID12
Feature activation-2.96e+00

Pos logit contributions
itt+0.073
 Some+0.071
year+0.068
imsy+0.066
op+0.064
Neg logit contributions
 heads-0.127
zie-0.119
 forth-0.119
 fuss-0.119
 myself-0.118
Tokenyear
Token ID1941
Feature activation-5.95e+00

Pos logit contributions
oses+0.065
ifying+0.061
 Fire+0.058
 Some+0.058
ork+0.057
Neg logit contributions
 myself-0.149
 herself-0.147
keeping-0.133
 forth-0.129
zie-0.127
Token-
Token ID12
Feature activation-2.73e+00

Pos logit contributions
 Some+0.145
itt+0.129
Will+0.122
 fires+0.120
year+0.117
Neg logit contributions
 heads-0.317
 myself-0.300
keeping-0.293
zie-0.289
 herself-0.287
Tokenold
Token ID727
Feature activation-1.86e+00

Pos logit contributions
ifying+0.102
 Some+0.091
 Fire+0.085
 fires+0.084
oses+0.084
Neg logit contributions
 myself-0.109
 herself-0.108
keeping-0.101
zie-0.091
riet-0.086
Token on
Token ID319
Feature activation-1.31e+00

Pos logit contributions
itt+0.090
year+0.084
 Some+0.084
op+0.082
 of+0.081
Neg logit contributions
 heads-0.054
 myself-0.052
 herself-0.052
zie-0.051
keeping-0.050
Token a
Token ID257
Feature activation+7.35e-02

Pos logit contributions
year+0.054
itt+0.053
op+0.051
ades+0.048
ended+0.047
Neg logit contributions
 herself-0.052
 myself-0.050
 heads-0.048
zie-0.045
 forth-0.042
Token mission
Token ID4365
Feature activation-2.15e-01

Pos logit contributions
 heads+0.003
 myself+0.003
zie+0.003
 herself+0.003
 forth+0.003
Neg logit contributions
itt-0.003
year-0.003
 Some-0.003
 of-0.003
 fires-0.003
Token that
Token ID326
Feature activation-9.02e-01

Pos logit contributions
itt+0.035
 of+0.034
 Some+0.033
year+0.032
op+0.032
Neg logit contributions
 heads-0.024
zie-0.022
keeping-0.022
 myself-0.022
 fuss-0.021
Token little
Token ID1310
Feature activation+6.37e-01

Pos logit contributions
itt+0.032
 of+0.032
year+0.030
 Some+0.030
op+0.029
Neg logit contributions
 heads-0.037
keeping-0.036
 myself-0.036
zie-0.035
 herself-0.035
Token island
Token ID7022
Feature activation-1.03e+00

Pos logit contributions
zie+0.024
keeping+0.023
 myself+0.023
 heads+0.023
 herself+0.021
Neg logit contributions
 of-0.026
itt-0.023
 Some-0.023
 fires-0.021
op-0.021
Token was
Token ID373
Feature activation-1.53e+00

Pos logit contributions
itt+0.042
 Some+0.039
year+0.039
op+0.038
 fires+0.037
Neg logit contributions
 heads-0.038
zie-0.038
 herself-0.038
 myself-0.037
keeping-0.036
Token always
Token ID1464
Feature activation-7.32e-01

Pos logit contributions
itt+0.065
op+0.061
 Some+0.058
year+0.055
ades+0.054
Neg logit contributions
 herself-0.059
zie-0.056
 myself-0.055
 heads-0.052
 forth-0.051
Token known
Token ID1900
Feature activation-4.12e+00

Pos logit contributions
itt+0.029
 Some+0.027
op+0.026
year+0.025
 of+0.025
Neg logit contributions
 herself-0.029
 myself-0.029
 heads-0.028
zie-0.028
 forth-0.028
Token as
Token ID355
Feature activation+7.21e-01

Pos logit contributions
itt+0.173
 Some+0.162
year+0.160
op+0.159
 of+0.155
Neg logit contributions
 myself-0.146
 heads-0.145
 herself-0.145
zie-0.141
keeping-0.137
Token Lucy
Token ID22162
Feature activation+1.14e+00

Pos logit contributions
 myself+0.027
 herself+0.027
 heads+0.027
zie+0.027
keeping+0.026
Neg logit contributions
itt-0.028
 Some-0.026
op-0.026
year-0.026
 of-0.025
Token's
Token ID338
Feature activation-1.13e+00

Pos logit contributions
 herself+0.041
 heads+0.041
zie+0.041
 myself+0.040
keeping+0.039
Neg logit contributions
itt-0.046
 Some-0.045
year-0.043
 fires-0.042
ades-0.040
Token Island
Token ID5451
Feature activation+9.05e-02

Pos logit contributions
itt+0.048
year+0.045
 Some+0.045
op+0.044
imsy+0.041
Neg logit contributions
 heads-0.042
 herself-0.039
 myself-0.038
zie-0.038
 fuss-0.037
Token because
Token ID780
Feature activation+9.76e-01

Pos logit contributions
keeping+0.003
 myself+0.003
 herself+0.003
zie+0.003
 heads+0.003
Neg logit contributions
itt-0.004
 Some-0.004
year-0.004
op-0.004
imsy-0.004
Token there
Token ID612
Feature activation+5.17e-01

Pos logit contributions
 heads+0.004
 myself+0.004
keeping+0.004
zie+0.004
 herself+0.004
Neg logit contributions
itt-0.004
 of-0.004
 Some-0.003
year-0.003
op-0.003
Token was
Token ID373
Feature activation-4.61e-01

Pos logit contributions
 heads+0.023
 myself+0.023
keeping+0.022
zie+0.022
 herself+0.022
Neg logit contributions
itt-0.017
 of-0.016
 Some-0.015
year-0.015
op-0.015
Token a
Token ID257
Feature activation+1.24e-01

Pos logit contributions
op+0.017
itt+0.017
ades+0.016
oses+0.016
year+0.016
Neg logit contributions
 herself-0.020
 myself-0.017
zie-0.016
 Daisy-0.015
 Holly-0.015
Token three
Token ID1115
Feature activation-1.84e+00

Pos logit contributions
 heads+0.004
 forget+0.004
 fuss+0.004
 forth+0.004
zie+0.004
Neg logit contributions
itt-0.005
year-0.005
 Some-0.005
 fires-0.005
 of-0.005
Token-
Token ID12
Feature activation-3.05e+00

Pos logit contributions
itt+0.042
 Some+0.041
year+0.039
imsy+0.038
 fires+0.036
Neg logit contributions
 heads-0.095
zie-0.092
 myself-0.090
 fuss-0.090
 herself-0.089
Tokenyear
Token ID1941
Feature activation-5.75e+00

Pos logit contributions
ifying+0.052
op+0.049
oses+0.049
 Some+0.045
imsy+0.044
Neg logit contributions
 herself-0.178
 myself-0.170
zie-0.163
keeping-0.160
riet-0.145
Token-
Token ID12
Feature activation-2.93e+00

Pos logit contributions
 Some+0.138
itt+0.126
ifying+0.124
Will+0.121
 fires+0.118
Neg logit contributions
 heads-0.301
 myself-0.284
 herself-0.279
keeping-0.276
zie-0.275
Tokenold
Token ID727
Feature activation-1.42e+00

Pos logit contributions
ifying+0.079
op+0.069
 Some+0.069
imsy+0.068
oses+0.066
Neg logit contributions
 herself-0.148
 myself-0.141
zie-0.134
keeping-0.133
riet-0.117
Token named
Token ID3706
Feature activation-2.24e+00

Pos logit contributions
itt+0.053
 Some+0.049
year+0.049
op+0.048
 of+0.047
Neg logit contributions
 heads-0.056
 myself-0.055
 herself-0.055
zie-0.054
keeping-0.054
Token Nancy
Token ID18496
Feature activation+2.08e+00

Pos logit contributions
itt+0.105
 Some+0.100
 of+0.100
year+0.099
op+0.096
Neg logit contributions
 heads-0.070
keeping-0.066
 forth-0.065
zie-0.064
 myself-0.064
Token.
Token ID13
Feature activation-1.60e-01

Pos logit contributions
 herself+0.061
 heads+0.060
keeping+0.059
zie+0.056
 myself+0.056
Neg logit contributions
 Some-0.097
itt-0.096
year-0.091
ades-0.091
 fires-0.089
Token and
Token ID290
Feature activation-1.87e-01

Pos logit contributions
itt+0.076
 Some+0.070
year+0.070
 of+0.069
op+0.068
Neg logit contributions
 heads-0.086
 myself-0.085
zie-0.084
 herself-0.084
keeping-0.082
Token fast
Token ID3049
Feature activation-6.16e-01

Pos logit contributions
itt+0.007
year+0.006
 Some+0.006
 of+0.006
op+0.006
Neg logit contributions
 heads-0.008
 myself-0.008
zie-0.008
 herself-0.007
 forth-0.007
Token.
Token ID13
Feature activation+2.48e-02

Pos logit contributions
itt+0.024
 of+0.022
 Some+0.022
year+0.022
op+0.022
Neg logit contributions
zie-0.023
 heads-0.023
 myself-0.023
 herself-0.023
keeping-0.022
Token The
Token ID383
Feature activation-1.37e-01

Pos logit contributions
 herself+0.001
 myself+0.001
 heads+0.001
zie+0.001
keeping+0.001
Neg logit contributions
itt-0.001
year-0.001
op-0.001
 fires-0.001
 of-0.001
Token merry
Token ID44570
Feature activation-1.64e+00

Pos logit contributions
itt+0.005
year+0.005
 Some+0.005
 of+0.005
op+0.005
Neg logit contributions
 heads-0.005
 myself-0.005
zie-0.005
 herself-0.005
keeping-0.005
Token-
Token ID12
Feature activation-4.29e+00

Pos logit contributions
itt+0.038
 of+0.036
 Some+0.033
op+0.033
 fires+0.033
Neg logit contributions
 myself-0.086
zie-0.086
 herself-0.085
keeping-0.085
 heads-0.085
Tokengo
Token ID2188
Feature activation-2.29e+00

Pos logit contributions
 Some+0.168
imsy+0.159
 hamm+0.158
 fires+0.156
ifying+0.151
Neg logit contributions
 myself-0.141
 herself-0.124
lla-0.120
amboo-0.120
zie-0.116
Token-
Token ID12
Feature activation-3.28e+00

Pos logit contributions
itt+0.062
 Some+0.060
year+0.058
imsy+0.052
 paramed+0.044
Neg logit contributions
 heads-0.120
 forth-0.117
 myself-0.116
 herself-0.111
zie-0.108
Tokenround
Token ID744
Feature activation-1.28e+00

Pos logit contributions
 Some+0.166
itt+0.163
imsy+0.158
 fires+0.158
year+0.154
Neg logit contributions
 myself-0.081
 herself-0.072
keeping-0.069
 heads-0.065
zie-0.064
Token spins
Token ID37621
Feature activation-5.73e-01

Pos logit contributions
itt+0.059
year+0.057
 Some+0.054
op+0.052
 of+0.051
Neg logit contributions
 heads-0.044
 herself-0.041
 myself-0.040
 forth-0.039
zie-0.039
Token.
Token ID13
Feature activation-5.64e-02

Pos logit contributions
itt+0.024
 Some+0.022
year+0.022
op+0.022
 of+0.021
Neg logit contributions
 heads-0.021
 herself-0.021
 myself-0.021
zie-0.020
 forth-0.020