TOP ACTIVATIONS
MAX = 0.145

V icky ?" V icky said in a shaky
until it got darker . Rh onda was scared but Alex
two friends named Alex and Rh onda . Alex wanted to
two friends named Happy and Gr umpy . They were very
, go !" V icky thought about it and
a gr umpy bear called Gr umbles . He used to
was a little duck named Qu ack . Qu ack was
bud ge . Fr ust rated , the little
day , an explorer named Che e wanted to study a
duck named Qu ackers . Qu ackers lived on a farm
. She was three - years - old and loved to
beautiful colors . Alex and Rh onda were so excited to
will try !" V icky took a deep breath
was a little duck named Qu ackers . Qu ackers lived
was a little bunny named Fl ops . Fl ops was
ipping by . V icky called out , "
lived a nice bear named Che wy . Che wy had
duck named Qu ack . Qu ack was black and loved
there were two friends , L ola and Ab ig ail
bunny named Fl ops . Fl ops was an attractive bunny

MIN ACTIVATIONS
MIN = -0.149

some buttons or rib bons or patches . Would you like
very happy and wondered how it would taste .
it 's unknown how strong it is !"
may want more than what we have , but it 's
! We can buy anything we want !" "
big stick to break the vine . Lily looked around and
okay , Anna ? Does it hurt a lot ?"
we can see some lights or tunnels ," mom said .
wanted to find out what it was . What are you
few moments , wondering what it was . John
be free to fly wherever it wants ." The
! How much harder can it be to wrap a book
, it could do anything it wanted . <|endoftext|> Once upon
we can sew some buttons or rib bons or patches .
game or read a book or draw a picture . Maybe
. That 's how heavy it was ! "
will put it on my paper when I finish . It
I 'll show you how it 's done ". So the
, imagination can include anything we want ." The
friends all the exciting things it had found during its journey

INTERVAL 0.071 - 0.145
CONTAINS 1.502%

us with his candy ." " Stop it ,
She decided to tell P ao how she felt .
The lady smiled and praised them . " That 's
Once upon a time there was a girl called Mary .
was a little fish named Bub bles who lived in the

INTERVAL -0.002 - 0.071
CONTAINS 49.375%

. They wanted to go home and wash the paint off
puppy very much , but the mother said no because they
stones . They were not friends anymore . In the end
the next day , it was raining outside and John was
!" she said . So they all packed their

INTERVAL -0.076 - -0.002
CONTAINS 48.637%

saw the cat picking flowers . The mean man
and they all had lots of fun . On the way
to rock back and forth on his four legs while relaxing
! This is much better and more comfortable ". His dad
. She looked all over her house but couldn 't find

INTERVAL -0.149 - -0.076
CONTAINS 0.487%

Dear God , bless us and keep us safe ," they
to do more than anything - she loved to roll !
their smiles and their voices and their dreams . We have
's cook some regular food and have a picnic . And
proud of themselves and continued to play on the shore .

PER-TOKEN DETAILS

Token V
Token ID569
Feature activation+7.10e-02

Pos logit contributions
 heal+0.068
chief+0.067
 drown+0.066
 down+0.064
 rid+0.062
Neg logit contributions
elly-0.107
Every-0.095
umpy-0.092
urious-0.087
Two-0.087
Tokenicky
Token ID17479
Feature activation-6.42e-03

Pos logit contributions
elly+0.111
Every+0.108
umpy+0.100
Children+0.095
Two+0.094
Neg logit contributions
 heal-0.103
chief-0.101
 drown-0.099
 down-0.098
ridge-0.098
Token?"
Token ID1701
Feature activation+1.28e-02

Pos logit contributions
 heal+0.009
 drown+0.008
chief+0.008
 down+0.008
 rid+0.008
Neg logit contributions
elly-0.012
Every-0.011
umpy-0.010
urious-0.010
Two-0.010
Token
Token ID198
Feature activation+7.45e-02

Pos logit contributions
elly+0.021
Every+0.019
oma+0.017
umpy+0.017
Children+0.017
Neg logit contributions
 drown-0.020
 heal-0.020
chief-0.020
 rid-0.019
 soar-0.019
Token
Token ID198
Feature activation+6.96e-02

Pos logit contributions
elly+0.147
umpy+0.126
Every+0.125
ubby+0.124
urious+0.122
Neg logit contributions
 heal-0.096
 down-0.089
chief-0.089
 drown-0.087
 rid-0.083
TokenV
Token ID53
Feature activation+1.45e-01

Pos logit contributions
elly+0.101
ubby+0.082
umpy+0.074
oph+0.073
grown+0.072
Neg logit contributions
 heal-0.132
 down-0.120
 breathe-0.119
 build-0.119
 we-0.118
Tokenicky
Token ID17479
Feature activation+3.61e-02

Pos logit contributions
Every+0.220
elly+0.220
Two+0.192
 elderly+0.191
Children+0.191
Neg logit contributions
 down-0.203
 heal-0.202
 elbow-0.197
 drown-0.195
 rid-0.188
Token said
Token ID531
Feature activation+3.70e-02

Pos logit contributions
elly+0.053
Every+0.046
umpy+0.044
urious+0.041
Two+0.041
Neg logit contributions
 heal-0.062
chief-0.061
 drown-0.061
 down-0.059
 rid-0.058
Token in
Token ID287
Feature activation+2.58e-02

Pos logit contributions
elly+0.054
Every+0.049
umpy+0.046
 little+0.045
 year+0.044
Neg logit contributions
 drown-0.062
chief-0.059
 heal-0.057
 rid-0.056
 elbow-0.056
Token a
Token ID257
Feature activation+1.24e-02

Pos logit contributions
elly+0.064
umpy+0.059
urious+0.056
illing+0.056
itting+0.053
Neg logit contributions
 heal-0.022
 bed-0.021
 down-0.020
 anything-0.019
chief-0.019
Token shaky
Token ID35335
Feature activation+2.94e-02

Pos logit contributions
elly+0.017
Every+0.014
umpy+0.014
urious+0.012
oma+0.012
Neg logit contributions
 heal-0.023
chief-0.023
 drown-0.023
 down-0.022
 rid-0.022
Token until
Token ID1566
Feature activation+2.79e-04

Pos logit contributions
elly+0.040
Every+0.037
umpy+0.034
 year+0.033
oma+0.032
Neg logit contributions
 drown-0.037
chief-0.037
 heal-0.036
 elbow-0.035
ridge-0.034
Token it
Token ID340
Feature activation-1.53e-02

Pos logit contributions
elly+0.000
Every+0.000
 year+0.000
 little+0.000
grown+0.000
Neg logit contributions
 heal-0.000
 bleed-0.000
chief-0.000
 elbow-0.000
 drown-0.000
Token got
Token ID1392
Feature activation+3.42e-02

Pos logit contributions
 down+0.023
 heal+0.023
 drown+0.022
chief+0.022
 soar+0.021
Neg logit contributions
elly-0.027
Every-0.022
umpy-0.022
ubby-0.022
illing-0.021
Token darker
Token ID18646
Feature activation+1.74e-02

Pos logit contributions
elly+0.055
Every+0.048
umpy+0.046
urious+0.043
oma+0.043
Neg logit contributions
 heal-0.055
chief-0.054
 drown-0.053
 down-0.053
 rid-0.051
Token.
Token ID13
Feature activation-1.49e-02

Pos logit contributions
elly+0.029
Every+0.025
umpy+0.024
urious+0.023
oma+0.022
Neg logit contributions
chief-0.027
 heal-0.027
 drown-0.027
 rid-0.025
 elbow-0.025
Token Rh
Token ID10323
Feature activation+1.44e-01

Pos logit contributions
 heal+0.020
chief+0.020
 drown+0.019
 down+0.019
 rid+0.019
Neg logit contributions
elly-0.028
Every-0.025
umpy-0.024
urious-0.023
Two-0.023
Tokenonda
Token ID13533
Feature activation+3.59e-02

Pos logit contributions
 blonde+0.208
 blond+0.204
elly+0.191
Every+0.181
grown+0.179
Neg logit contributions
 we-0.186
 ourselves-0.184
 heal-0.181
 back-0.172
 you-0.171
Token was
Token ID373
Feature activation+5.44e-02

Pos logit contributions
elly+0.052
Every+0.046
umpy+0.044
urious+0.041
Two+0.040
Neg logit contributions
 heal-0.063
chief-0.062
 drown-0.060
 down-0.059
 rid-0.059
Token scared
Token ID12008
Feature activation+5.17e-02

Pos logit contributions
elly+0.078
Every+0.074
umpy+0.070
 young+0.068
 elderly+0.065
Neg logit contributions
chief-0.092
 heal-0.089
 elbow-0.085
 drown-0.084
ridge-0.083
Token but
Token ID475
Feature activation+2.25e-03

Pos logit contributions
elly+0.091
Every+0.080
umpy+0.077
urious+0.072
Two+0.072
Neg logit contributions
 heal-0.076
chief-0.075
 drown-0.074
 down-0.072
 rid-0.070
Token Alex
Token ID4422
Feature activation+5.14e-02

Pos logit contributions
elly+0.004
Every+0.004
umpy+0.004
oma+0.004
Two+0.004
Neg logit contributions
 heal-0.003
chief-0.003
 drown-0.003
 down-0.003
 rid-0.003
Token two
Token ID734
Feature activation+7.98e-02

Pos logit contributions
elly+0.091
umpy+0.074
oma+0.071
Every+0.071
urious+0.070
Neg logit contributions
 down-0.141
 we-0.137
 drown-0.136
 back-0.136
 heal-0.136
Token friends
Token ID2460
Feature activation+6.46e-02

Pos logit contributions
elly+0.147
Every+0.127
umpy+0.124
If+0.121
year+0.120
Neg logit contributions
 heal-0.103
 drown-0.101
 down-0.101
 back-0.100
chief-0.097
Token named
Token ID3706
Feature activation+4.73e-02

Pos logit contributions
elly+0.109
Every+0.095
umpy+0.092
urious+0.086
Two+0.086
Neg logit contributions
 heal-0.100
chief-0.097
 drown-0.097
 down-0.095
 rid-0.092
Token Alex
Token ID4422
Feature activation+7.81e-02

Pos logit contributions
elly+0.104
Every+0.094
umpy+0.090
Two+0.087
urious+0.087
Neg logit contributions
 heal-0.049
chief-0.048
 drown-0.047
 down-0.044
 rid-0.044
Token and
Token ID290
Feature activation-9.55e-03

Pos logit contributions
elly+0.136
Every+0.120
umpy+0.115
Two+0.110
urious+0.110
Neg logit contributions
 heal-0.115
chief-0.112
 down-0.111
 drown-0.109
 rid-0.105
Token Rh
Token ID10323
Feature activation+1.43e-01

Pos logit contributions
chief+0.014
 heal+0.013
 drown+0.013
ridge+0.012
 rid+0.012
Neg logit contributions
Every-0.016
elly-0.016
ida-0.014
Children-0.014
Two-0.014
Tokenonda
Token ID13533
Feature activation+4.83e-02

Pos logit contributions
elly+0.205
Every+0.173
Wow+0.162
umpy+0.159
 importance+0.156
Neg logit contributions
 heal-0.229
 him-0.219
 drown-0.216
 rid-0.215
 build-0.209
Token.
Token ID13
Feature activation-8.20e-03

Pos logit contributions
elly+0.099
Every+0.089
umpy+0.086
Two+0.082
illing+0.082
Neg logit contributions
 heal-0.056
chief-0.056
 drown-0.054
 down-0.052
 rid-0.051
Token Alex
Token ID4422
Feature activation+6.08e-02

Pos logit contributions
 heal+0.009
chief+0.009
 down+0.009
 drown+0.009
 build+0.008
Neg logit contributions
elly-0.017
umpy-0.015
Every-0.015
ubby-0.014
urious-0.014
Token wanted
Token ID2227
Feature activation+2.20e-02

Pos logit contributions
elly+0.080
Every+0.067
umpy+0.063
urious+0.059
Two+0.058
Neg logit contributions
 heal-0.116
chief-0.114
 drown-0.112
 down-0.111
 rid-0.108
Token to
Token ID284
Feature activation-4.59e-02

Pos logit contributions
elly+0.046
Every+0.043
Children+0.040
Two+0.039
 elderly+0.038
Neg logit contributions
chief-0.025
 heal-0.024
 drown-0.024
 bleed-0.021
 down-0.021
Token two
Token ID734
Feature activation+8.74e-02

Pos logit contributions
elly+0.102
Every+0.084
umpy+0.084
oma+0.080
iet+0.078
Neg logit contributions
 heal-0.132
 down-0.132
 drown-0.131
chief-0.129
 rid-0.126
Token friends
Token ID2460
Feature activation+7.28e-02

Pos logit contributions
elly+0.155
umpy+0.132
Every+0.131
oma+0.124
year+0.123
Neg logit contributions
 heal-0.122
 down-0.119
chief-0.118
 drown-0.117
 back-0.115
Token named
Token ID3706
Feature activation+5.72e-02

Pos logit contributions
elly+0.122
Every+0.105
umpy+0.102
urious+0.096
Two+0.096
Neg logit contributions
 heal-0.113
chief-0.111
 drown-0.109
 down-0.109
 rid-0.104
Token Happy
Token ID14628
Feature activation+8.97e-02

Pos logit contributions
elly+0.121
Every+0.111
umpy+0.105
oma+0.102
Two+0.102
Neg logit contributions
 heal-0.064
chief-0.062
 drown-0.060
 rid-0.056
 down-0.055
Token and
Token ID290
Feature activation-3.45e-03

Pos logit contributions
elly+0.198
Every+0.183
umpy+0.182
urious+0.170
ubby+0.170
Neg logit contributions
 we-0.085
 heal-0.080
 back-0.080
 out-0.076
 down-0.075
Token Gr
Token ID1902
Feature activation+1.42e-01

Pos logit contributions
 heal+0.004
chief+0.004
 drown+0.004
 rid+0.003
 down+0.003
Neg logit contributions
elly-0.007
Every-0.007
Children-0.006
oma-0.006
Two-0.006
Tokenumpy
Token ID32152
Feature activation+6.86e-02

Pos logit contributions
Every+0.210
elly+0.209
Sometimes+0.183
That+0.183
Two+0.183
Neg logit contributions
 heal-0.197
 rid-0.193
 drown-0.186
 him-0.174
 breathe-0.173
Token.
Token ID13
Feature activation+4.12e-03

Pos logit contributions
elly+0.142
Every+0.130
umpy+0.125
urious+0.122
Two+0.119
Neg logit contributions
 heal-0.078
chief-0.076
 down-0.075
 drown-0.071
 rid-0.069
Token They
Token ID1119
Feature activation+4.15e-03

Pos logit contributions
elly+0.009
umpy+0.008
Every+0.008
urious+0.008
iet+0.007
Neg logit contributions
 heal-0.005
 down-0.004
chief-0.004
 drown-0.004
 rid-0.004
Token were
Token ID547
Feature activation+7.01e-02

Pos logit contributions
elly+0.007
umpy+0.006
oma+0.006
Every+0.006
urious+0.006
Neg logit contributions
 down-0.007
 heal-0.006
 build-0.006
 rid-0.006
 him-0.006
Token very
Token ID845
Feature activation+5.49e-02

Pos logit contributions
elly+0.138
Every+0.122
umpy+0.118
urious+0.113
Two+0.112
Neg logit contributions
 heal-0.089
chief-0.086
 drown-0.086
 down-0.085
 rid-0.081
Token,
Token ID11
Feature activation-1.10e-02

Pos logit contributions
elly+0.028
Every+0.027
umpy+0.026
 year+0.024
 little+0.024
Neg logit contributions
chief-0.026
 heal-0.025
 elbow-0.025
 drown-0.024
ridge-0.023
Token go
Token ID467
Feature activation+1.97e-02

Pos logit contributions
chief+0.018
 elbow+0.017
 heal+0.017
 rid+0.016
 drown+0.016
Neg logit contributions
elly-0.016
Every-0.016
 little-0.015
 young-0.015
umpy-0.015
Token!"
Token ID2474
Feature activation+2.71e-03

Pos logit contributions
elly+0.031
Every+0.030
umpy+0.028
 year+0.027
 little+0.026
Neg logit contributions
chief-0.030
 heal-0.029
 elbow-0.028
 drown-0.028
ridge-0.027
Token
Token ID198
Feature activation+7.47e-02

Pos logit contributions
elly+0.003
Every+0.003
 little+0.003
L+0.002
So+0.002
Neg logit contributions
 drown-0.006
 rid-0.005
 soar-0.005
pan-0.005
 mast-0.005
Token
Token ID198
Feature activation+6.93e-02

Pos logit contributions
elly+0.133
Every+0.115
umpy+0.112
urious+0.106
ubby+0.106
Neg logit contributions
 heal-0.109
chief-0.105
 drown-0.104
 down-0.103
 rid-0.099
TokenV
Token ID53
Feature activation+1.40e-01

Pos logit contributions
elly+0.092
ubby+0.071
umpy+0.069
grown+0.065
oma+0.065
Neg logit contributions
 heal-0.140
 soar-0.129
 breathe-0.128
 drown-0.127
 build-0.126
Tokenicky
Token ID17479
Feature activation+3.26e-02

Pos logit contributions
elly+0.212
Every+0.203
grown+0.178
oma+0.178
Children+0.177
Neg logit contributions
 heal-0.211
 drown-0.210
 down-0.203
 elbow-0.199
 rid-0.197
Token thought
Token ID1807
Feature activation+4.81e-02

Pos logit contributions
elly+0.048
Every+0.042
umpy+0.040
urious+0.038
Two+0.037
Neg logit contributions
 heal-0.056
chief-0.055
 drown-0.054
 down-0.053
 rid-0.052
Token about
Token ID546
Feature activation+1.25e-02

Pos logit contributions
elly+0.068
Every+0.065
 little+0.061
 year+0.057
umpy+0.056
Neg logit contributions
chief-0.082
 drown-0.082
 rid-0.079
 hyd-0.078
stack-0.077
Token it
Token ID340
Feature activation-2.66e-02

Pos logit contributions
elly+0.026
Every+0.025
oma+0.023
Children+0.023
 year+0.022
Neg logit contributions
chief-0.013
 drown-0.012
 heal-0.012
 rid-0.012
 down-0.011
Token and
Token ID290
Feature activation-2.30e-02

Pos logit contributions
 heal+0.038
chief+0.038
 drown+0.038
 rid+0.036
 down+0.035
Neg logit contributions
elly-0.047
Every-0.042
umpy-0.040
urious-0.038
oma-0.038
Token a
Token ID257
Feature activation+6.71e-02

Pos logit contributions
elly+0.174
umpy+0.159
urious+0.150
Every+0.148
ida+0.148
Neg logit contributions
 heal-0.091
 down-0.091
 soar-0.085
 out-0.084
 him-0.084
Token gr
Token ID1036
Feature activation+1.11e-01

Pos logit contributions
elly+0.077
Every+0.062
urious+0.059
umpy+0.058
If+0.055
Neg logit contributions
 heal-0.140
 down-0.135
 float-0.133
 we-0.133
 drown-0.131
Tokenumpy
Token ID32152
Feature activation+7.21e-02

Pos logit contributions
Every+0.079
urious+0.060
Children+0.059
Next+0.056
Today+0.054
Neg logit contributions
 heal-0.246
 drown-0.236
 rid-0.230
 down-0.228
 hyd-0.226
Token bear
Token ID6842
Feature activation+5.56e-02

Pos logit contributions
elly+0.109
Every+0.102
urious+0.092
umpy+0.090
Two+0.088
Neg logit contributions
 heal-0.120
chief-0.116
 down-0.115
 drown-0.113
 rid-0.107
Token called
Token ID1444
Feature activation+3.45e-02

Pos logit contributions
elly+0.083
Every+0.072
umpy+0.068
urious+0.064
Two+0.064
Neg logit contributions
 heal-0.095
chief-0.094
 drown-0.093
 down-0.091
 rid-0.089
Token Gr
Token ID1902
Feature activation+1.39e-01

Pos logit contributions
elly+0.065
Every+0.058
umpy+0.055
oma+0.055
ubby+0.054
Neg logit contributions
 heal-0.044
chief-0.043
 drown-0.042
 rid-0.041
 down-0.040
Tokenumbles
Token ID25329
Feature activation+6.19e-02

Pos logit contributions
elly+0.176
Every+0.170
Two+0.147
Children+0.145
oma+0.142
Neg logit contributions
 heal-0.246
 drown-0.237
 rid-0.231
 down-0.230
 build-0.221
Token.
Token ID13
Feature activation+3.40e-03

Pos logit contributions
elly+0.120
Every+0.107
umpy+0.103
urious+0.097
Two+0.097
Neg logit contributions
 heal-0.079
chief-0.078
 drown-0.077
 down-0.074
 rid-0.072
Token He
Token ID679
Feature activation+1.38e-02

Pos logit contributions
elly+0.007
Every+0.006
umpy+0.006
Two+0.005
oma+0.005
Neg logit contributions
 heal-0.004
 drown-0.004
chief-0.004
 down-0.004
 rid-0.004
Token used
Token ID973
Feature activation+3.58e-02

Pos logit contributions
elly+0.020
Every+0.016
umpy+0.015
oma+0.014
urious+0.014
Neg logit contributions
 heal-0.025
chief-0.025
 drown-0.025
 down-0.024
 rid-0.024
Token to
Token ID284
Feature activation-2.99e-02

Pos logit contributions
elly+0.069
Every+0.069
 little+0.061
grown+0.059
 young+0.058
Neg logit contributions
 drown-0.041
 heal-0.039
chief-0.038
stack-0.036
 hyd-0.034
Token was
Token ID373
Feature activation+8.67e-02

Pos logit contributions
elly+0.079
umpy+0.069
iet+0.065
Every+0.065
ubby+0.065
Neg logit contributions
 heal-0.061
 down-0.061
 him-0.058
chief-0.058
 drown-0.057
Token a
Token ID257
Feature activation+6.94e-02

Pos logit contributions
elly+0.181
umpy+0.166
ida+0.157
urious+0.156
oma+0.154
Neg logit contributions
 down-0.094
 heal-0.093
 out-0.087
 soar-0.086
 him-0.086
Token little
Token ID1310
Feature activation+6.40e-02

Pos logit contributions
elly+0.080
Every+0.065
urious+0.061
umpy+0.060
If+0.058
Neg logit contributions
 heal-0.144
 down-0.138
 we-0.137
 float-0.137
 drown-0.135
Token duck
Token ID22045
Feature activation+7.07e-02

Pos logit contributions
elly+0.077
Every+0.065
umpy+0.060
urious+0.055
Two+0.054
Neg logit contributions
 heal-0.129
chief-0.126
 drown-0.125
 down-0.124
 rid-0.121
Token named
Token ID3706
Feature activation+4.67e-02

Pos logit contributions
elly+0.109
Every+0.094
umpy+0.090
urious+0.084
Two+0.084
Neg logit contributions
 heal-0.119
chief-0.117
 drown-0.116
 down-0.113
 rid-0.111
Token Qu
Token ID2264
Feature activation+1.39e-01

Pos logit contributions
elly+0.095
Every+0.087
umpy+0.081
ubby+0.079
Two+0.078
Neg logit contributions
 heal-0.054
chief-0.053
 drown-0.051
 down-0.050
 elbow-0.049
Tokenack
Token ID441
Feature activation+8.08e-02

Pos logit contributions
elly+0.193
Every+0.186
Children+0.167
oma+0.166
resa+0.163
Neg logit contributions
 drown-0.228
 heal-0.224
 down-0.211
 rid-0.209
 breathe-0.209
Token.
Token ID13
Feature activation+1.13e-02

Pos logit contributions
elly+0.177
Every+0.159
umpy+0.153
Two+0.150
Children+0.148
Neg logit contributions
 heal-0.083
 down-0.078
chief-0.077
 drown-0.077
 rid-0.074
Token Qu
Token ID2264
Feature activation+1.37e-01

Pos logit contributions
elly+0.028
umpy+0.025
Every+0.025
urious+0.025
iet+0.025
Neg logit contributions
 down-0.008
 drown-0.008
 heal-0.008
chief-0.007
 float-0.007
Tokenack
Token ID441
Feature activation+6.76e-02

Pos logit contributions
elly+0.184
Every+0.181
oma+0.160
Children+0.160
resa+0.158
Neg logit contributions
 drown-0.230
 heal-0.223
 breathe-0.212
 down-0.211
 build-0.211
Token was
Token ID373
Feature activation+7.86e-02

Pos logit contributions
elly+0.110
Every+0.088
umpy+0.087
oph+0.086
oma+0.085
Neg logit contributions
 heal-0.108
 down-0.105
 soar-0.101
chief-0.099
 we-0.099
Token bud
Token ID3600
Feature activation-1.28e-02

Pos logit contributions
chief+0.007
 drown+0.006
 heal+0.006
 rid+0.006
 down+0.006
Neg logit contributions
elly-0.010
Every-0.009
umpy-0.009
Two-0.008
urious-0.008
Tokenge
Token ID469
Feature activation+2.35e-02

Pos logit contributions
 heal+0.021
chief+0.021
 drown+0.020
 down+0.020
 rid+0.020
Neg logit contributions
elly-0.020
Every-0.018
umpy-0.017
urious-0.016
oma-0.015
Token.
Token ID13
Feature activation-2.11e-02

Pos logit contributions
elly+0.046
Every+0.041
umpy+0.040
oma+0.038
urious+0.037
Neg logit contributions
chief-0.029
 heal-0.029
 drown-0.029
 rid-0.027
 soar-0.026
Token
Token ID198
Feature activation+6.85e-02

Pos logit contributions
 drown+0.036
 heal+0.035
chief+0.035
 rid+0.033
 down+0.033
Neg logit contributions
elly-0.033
Every-0.029
umpy-0.026
Two-0.026
Children-0.025
Token
Token ID198
Feature activation+6.49e-02

Pos logit contributions
elly+0.141
umpy+0.121
ubby+0.120
iet+0.119
Every+0.117
Neg logit contributions
 heal-0.078
 down-0.072
chief-0.072
 drown-0.071
 rid-0.067
TokenFr
Token ID6732
Feature activation+1.39e-01

Pos logit contributions
elly+0.105
ubby+0.087
oph+0.083
grown+0.082
 amusement+0.078
Neg logit contributions
 heal-0.104
 soar-0.095
 you-0.092
 breathe-0.091
chief-0.090
Tokenust
Token ID436
Feature activation+5.41e-02

Pos logit contributions
elly+0.158
Wow+0.154
grown+0.151
Every+0.147
Children+0.145
Neg logit contributions
 drown-0.220
 heal-0.219
 soar-0.215
 build-0.208
 breathe-0.204
Tokenrated
Token ID4111
Feature activation+4.56e-02

Pos logit contributions
elly+0.080
Every+0.070
umpy+0.065
Two+0.062
Children+0.062
Neg logit contributions
 heal-0.094
chief-0.091
 down-0.091
 drown-0.091
 rid-0.088
Token,
Token ID11
Feature activation+2.76e-02

Pos logit contributions
elly+0.088
Every+0.078
umpy+0.075
urious+0.071
Two+0.071
Neg logit contributions
 heal-0.060
chief-0.059
 drown-0.058
 down-0.055
 rid-0.055
Token the
Token ID262
Feature activation+5.77e-02

Pos logit contributions
elly+0.059
Every+0.056
Two+0.051
umpy+0.051
Children+0.051
Neg logit contributions
 drown-0.028
chief-0.028
 heal-0.028
 rid-0.024
 down-0.024
Token little
Token ID1310
Feature activation+5.26e-02

Pos logit contributions
elly+0.110
oph+0.098
year+0.096
lic+0.096
oma+0.095
Neg logit contributions
 down-0.079
 back-0.074
 soar-0.072
 heal-0.072
 drown-0.070
Token day
Token ID1110
Feature activation+4.09e-02

Pos logit contributions
elly+0.158
Every+0.137
umpy+0.134
Children+0.128
oma+0.127
Neg logit contributions
 heal-0.121
 down-0.117
 drown-0.116
chief-0.116
 rid-0.110
Token,
Token ID11
Feature activation+6.38e-02

Pos logit contributions
elly+0.081
Every+0.073
umpy+0.070
urious+0.067
Two+0.066
Neg logit contributions
 heal-0.051
chief-0.050
 drown-0.049
 down-0.048
 rid-0.046
Token an
Token ID281
Feature activation+9.85e-02

Pos logit contributions
elly+0.136
umpy+0.121
urious+0.116
iet+0.116
oma+0.115
Neg logit contributions
 down-0.075
 he-0.070
 heal-0.068
 him-0.066
 out-0.066
Token explorer
Token ID39349
Feature activation+6.81e-02

Pos logit contributions
elly+0.142
Girls+0.130
Two+0.122
Every+0.120
grown+0.118
Neg logit contributions
 back-0.154
 heal-0.154
 down-0.151
 float-0.150
 sail-0.146
Token named
Token ID3706
Feature activation+4.68e-02

Pos logit contributions
elly+0.106
Every+0.092
umpy+0.087
urious+0.081
Two+0.081
Neg logit contributions
 heal-0.114
chief-0.112
 drown-0.111
 down-0.108
 rid-0.106
Token Che
Token ID2580
Feature activation+1.39e-01

Pos logit contributions
elly+0.105
Every+0.095
umpy+0.092
urious+0.088
Two+0.088
Neg logit contributions
 heal-0.046
chief-0.045
 drown-0.044
 down-0.043
 rid-0.041
Tokene
Token ID68
Feature activation+7.14e-02

Pos logit contributions
elly+0.257
 importantly+0.217
Every+0.213
umpy+0.208
Two+0.205
Neg logit contributions
 heal-0.194
chief-0.183
 him-0.176
 down-0.174
 build-0.171
Token wanted
Token ID2227
Feature activation+5.26e-02

Pos logit contributions
elly+0.126
Every+0.114
ubby+0.107
 importantly+0.104
urious+0.104
Neg logit contributions
chief-0.103
 heal-0.103
 down-0.095
 he-0.092
 him-0.092
Token to
Token ID284
Feature activation-2.99e-02

Pos logit contributions
elly+0.127
Every+0.116
umpy+0.113
urious+0.108
Two+0.108
Neg logit contributions
 heal-0.043
chief-0.042
 drown-0.041
 down-0.039
 rid-0.037
Token study
Token ID2050
Feature activation+1.79e-02

Pos logit contributions
chief+0.039
 drown+0.035
 elbow+0.034
 down+0.034
ridge+0.034
Neg logit contributions
Every-0.057
Two-0.051
elly-0.050
 little-0.049
 young-0.048
Token a
Token ID257
Feature activation-6.50e-04

Pos logit contributions
elly+0.032
Every+0.030
umpy+0.027
 year+0.026
oma+0.026
Neg logit contributions
 drown-0.026
chief-0.025
 heal-0.024
 rid-0.023
 bleed-0.023
Token duck
Token ID22045
Feature activation+6.34e-02

Pos logit contributions
elly+0.063
Every+0.052
umpy+0.048
urious+0.044
Two+0.044
Neg logit contributions
 heal-0.113
chief-0.111
 drown-0.110
 down-0.108
 rid-0.107
Token named
Token ID3706
Feature activation+4.04e-02

Pos logit contributions
elly+0.095
Every+0.082
umpy+0.077
urious+0.072
illing+0.071
Neg logit contributions
chief-0.110
 heal-0.110
 drown-0.108
 down-0.105
 rid-0.103
Token Qu
Token ID2264
Feature activation+1.37e-01

Pos logit contributions
elly+0.079
Every+0.072
umpy+0.067
ubby+0.065
Two+0.065
Neg logit contributions
 heal-0.049
chief-0.049
 drown-0.047
 elbow-0.046
 down-0.045
Tokenackers
Token ID28874
Feature activation+7.24e-02

Pos logit contributions
elly+0.201
Every+0.193
Children+0.177
oma+0.175
resa+0.174
Neg logit contributions
 drown-0.212
 heal-0.208
 down-0.194
 breathe-0.194
 float-0.191
Token.
Token ID13
Feature activation+3.78e-03

Pos logit contributions
elly+0.144
Every+0.130
umpy+0.125
urious+0.118
illing+0.118
Neg logit contributions
chief-0.092
 heal-0.089
 drown-0.087
 down-0.083
 rid-0.082
Token Qu
Token ID2264
Feature activation+1.39e-01

Pos logit contributions
elly+0.009
Every+0.008
umpy+0.008
urious+0.008
iet+0.008
Neg logit contributions
 heal-0.003
chief-0.003
 down-0.003
 drown-0.003
 rid-0.003
Tokenackers
Token ID28874
Feature activation+6.11e-02

Pos logit contributions
Every+0.215
elly+0.215
oma+0.202
Children+0.200
resa+0.198
Neg logit contributions
 drown-0.191
 heal-0.183
 down-0.178
 breathe-0.177
 float-0.172
Token lived
Token ID5615
Feature activation+8.68e-02

Pos logit contributions
elly+0.071
Every+0.058
umpy+0.054
urious+0.049
Two+0.049
Neg logit contributions
 heal-0.126
chief-0.124
 drown-0.123
 down-0.121
 rid-0.119
Token on
Token ID319
Feature activation+2.49e-03

Pos logit contributions
elly+0.196
Every+0.175
umpy+0.172
oma+0.168
urious+0.167
Neg logit contributions
 down-0.085
 heal-0.083
 drown-0.078
 rid-0.076
chief-0.074
Token a
Token ID257
Feature activation+1.87e-02

Pos logit contributions
elly+0.004
Every+0.004
umpy+0.003
Two+0.003
urious+0.003
Neg logit contributions
 heal-0.004
chief-0.004
 drown-0.004
 down-0.004
 rid-0.004
Token farm
Token ID5318
Feature activation+9.33e-03

Pos logit contributions
elly+0.029
Every+0.026
umpy+0.024
urious+0.023
Two+0.023
Neg logit contributions
 heal-0.030
chief-0.030
 drown-0.030
 down-0.029
 rid-0.029
Token.
Token ID13
Feature activation+9.46e-03

Pos logit contributions
elly+0.131
Every+0.119
umpy+0.114
urious+0.109
oma+0.108
Neg logit contributions
 heal-0.075
chief-0.075
 drown-0.074
 down-0.070
 rid-0.070
Token She
Token ID1375
Feature activation+2.17e-02

Pos logit contributions
elly+0.018
Every+0.016
umpy+0.016
oma+0.015
ubby+0.015
Neg logit contributions
 heal-0.012
 down-0.012
chief-0.011
 drown-0.011
 rid-0.011
Token was
Token ID373
Feature activation+7.28e-02

Pos logit contributions
elly+0.027
Every+0.022
umpy+0.021
urious+0.020
ubby+0.020
Neg logit contributions
 heal-0.044
chief-0.042
 down-0.042
 drown-0.041
 soar-0.040
Token three
Token ID1115
Feature activation+9.52e-02

Pos logit contributions
elly+0.134
Every+0.118
umpy+0.114
urious+0.109
Children+0.107
Neg logit contributions
 heal-0.102
chief-0.098
 drown-0.098
 down-0.097
 rid-0.093
Token-
Token ID12
Feature activation+6.32e-02

Pos logit contributions
elly+0.121
Every+0.111
Two+0.110
oph+0.109
Very+0.108
Neg logit contributions
 down-0.165
 back-0.155
 heal-0.153
 we-0.149
 he-0.147
Tokenyears
Token ID19002
Feature activation+1.39e-01

Pos logit contributions
elly+0.062
Every+0.048
umpy+0.045
urious+0.041
oma+0.041
Neg logit contributions
 heal-0.141
chief-0.137
 rid-0.137
 down-0.136
 drown-0.136
Token-
Token ID12
Feature activation+5.83e-02

Pos logit contributions
elly+0.211
Every+0.185
umpy+0.174
Two+0.164
urious+0.164
Neg logit contributions
 heal-0.232
chief-0.227
 drown-0.224
 down-0.221
 rid-0.215
Tokenold
Token ID727
Feature activation+4.72e-02

Pos logit contributions
elly+0.063
Every+0.051
umpy+0.048
urious+0.043
Two+0.043
Neg logit contributions
 heal-0.125
chief-0.123
 drown-0.122
 down-0.120
 rid-0.118
Token and
Token ID290
Feature activation-7.64e-03

Pos logit contributions
elly+0.072
Every+0.060
umpy+0.058
oma+0.054
urious+0.053
Neg logit contributions
 drown-0.083
 heal-0.082
chief-0.081
 rid-0.077
 soar-0.075
Token loved
Token ID6151
Feature activation+4.97e-02

Pos logit contributions
 heal+0.015
 down+0.015
chief+0.015
 drown+0.014
 rid+0.014
Neg logit contributions
elly-0.010
umpy-0.008
Every-0.008
urious-0.007
iet-0.007
Token to
Token ID284
Feature activation-2.00e-02

Pos logit contributions
elly+0.093
Every+0.087
 elderly+0.082
grown+0.080
 year+0.079
Neg logit contributions
chief-0.067
 drown-0.065
 heal-0.063
 rid-0.059
 mast-0.059
Token beautiful
Token ID4950
Feature activation+7.75e-03

Pos logit contributions
elly+0.011
Every+0.009
umpy+0.009
oma+0.009
urious+0.009
Neg logit contributions
 heal-0.009
 drown-0.008
 down-0.008
chief-0.008
 rid-0.008
Token colors
Token ID7577
Feature activation+2.00e-02

Pos logit contributions
elly+0.014
Every+0.012
umpy+0.012
urious+0.011
Two+0.011
Neg logit contributions
 heal-0.011
chief-0.011
 drown-0.011
 down-0.010
 rid-0.010
Token.
Token ID13
Feature activation-1.68e-02

Pos logit contributions
elly+0.040
Every+0.036
umpy+0.035
urious+0.033
Two+0.033
Neg logit contributions
 heal-0.024
chief-0.024
 drown-0.023
 down-0.023
 rid-0.022
Token Alex
Token ID4422
Feature activation+5.64e-02

Pos logit contributions
 heal+0.022
chief+0.021
 drown+0.021
 rid+0.020
 down+0.020
Neg logit contributions
elly-0.033
Every-0.029
umpy-0.028
urious-0.027
Two-0.027
Token and
Token ID290
Feature activation+5.51e-03

Pos logit contributions
elly+0.091
Every+0.077
umpy+0.074
urious+0.069
ubby+0.069
Neg logit contributions
 heal-0.091
chief-0.090
 down-0.088
 drown-0.087
 rid-0.084
Token Rh
Token ID10323
Feature activation+1.38e-01

Pos logit contributions
elly+0.010
Every+0.009
oma+0.008
umpy+0.008
Children+0.008
Neg logit contributions
 heal-0.008
chief-0.008
 drown-0.008
 down-0.007
 rid-0.007
Tokenonda
Token ID13533
Feature activation+3.35e-02

Pos logit contributions
elly+0.183
Every+0.152
 blonde+0.148
 blond+0.142
grown+0.142
Neg logit contributions
 heal-0.221
 rid-0.206
 back-0.203
 we-0.202
 ourselves-0.199
Token were
Token ID547
Feature activation+5.60e-02

Pos logit contributions
elly+0.051
Every+0.045
umpy+0.043
Two+0.040
urious+0.040
Neg logit contributions
 heal-0.056
chief-0.056
 drown-0.054
 down-0.053
 rid-0.052
Token so
Token ID523
Feature activation+3.15e-02

Pos logit contributions
elly+0.083
Every+0.080
umpy+0.073
oma+0.069
illing+0.069
Neg logit contributions
 heal-0.093
chief-0.093
 drown-0.089
 rid-0.085
 elbow-0.085
Token excited
Token ID6568
Feature activation+3.64e-02

Pos logit contributions
elly+0.056
Every+0.047
oma+0.045
umpy+0.045
oph+0.045
Neg logit contributions
 down-0.046
 heal-0.046
 drown-0.045
 soar-0.043
 rid-0.043
Token to
Token ID284
Feature activation-1.22e-02

Pos logit contributions
elly+0.075
Every+0.066
umpy+0.062
illing+0.061
 year+0.061
Neg logit contributions
 heal-0.046
 drown-0.045
chief-0.044
 rid-0.042
 elbow-0.041
Token will
Token ID481
Feature activation-1.91e-02

Pos logit contributions
 heal+0.016
chief+0.016
 drown+0.015
 down+0.015
 rid+0.014
Neg logit contributions
elly-0.029
Every-0.026
umpy-0.025
Two-0.024
urious-0.024
Token try
Token ID1949
Feature activation-3.29e-02

Pos logit contributions
chief+0.024
 drown+0.023
ridge+0.023
 rid+0.023
 elbow+0.022
Neg logit contributions
Every-0.034
elly-0.032
umpy-0.030
 little-0.030
 young-0.029
Token!"
Token ID2474
Feature activation+3.52e-03

Pos logit contributions
 drown+0.044
stack+0.042
 heal+0.042
chief+0.041
 rid+0.040
Neg logit contributions
elly-0.060
Every-0.057
aste-0.051
oma-0.051
umpy-0.051
Token
Token ID198
Feature activation+7.32e-02

Pos logit contributions
elly+0.004
Every+0.003
So+0.003
 little+0.003
That+0.003
Neg logit contributions
 drown-0.008
 rid-0.007
 anything-0.007
 heal-0.007
 soar-0.007
Token
Token ID198
Feature activation+6.91e-02

Pos logit contributions
elly+0.137
umpy+0.117
Every+0.116
ubby+0.112
urious+0.111
Neg logit contributions
 heal-0.101
chief-0.094
 down-0.094
 drown-0.094
 rid-0.089
TokenV
Token ID53
Feature activation+1.38e-01

Pos logit contributions
elly+0.091
ubby+0.072
umpy+0.068
grown+0.064
oma+0.062
Neg logit contributions
 heal-0.143
 soar-0.131
 build-0.130
 breathe-0.130
 we-0.126
Tokenicky
Token ID17479
Feature activation+3.15e-02

Pos logit contributions
elly+0.211
Every+0.197
umpy+0.175
oma+0.175
grown+0.173
Neg logit contributions
 heal-0.213
 drown-0.211
 down-0.204
 elbow-0.201
 rid-0.198
Token took
Token ID1718
Feature activation+5.57e-02

Pos logit contributions
elly+0.049
Every+0.044
umpy+0.041
Two+0.039
urious+0.039
Neg logit contributions
 heal-0.052
chief-0.051
 drown-0.050
 down-0.048
 rid-0.048
Token a
Token ID257
Feature activation+7.33e-04

Pos logit contributions
elly+0.081
 little+0.078
Every+0.077
 young+0.071
E+0.071
Neg logit contributions
 drown-0.089
stack-0.088
chief-0.086
 hyd-0.084
 mast-0.082
Token deep
Token ID2769
Feature activation+1.32e-02

Pos logit contributions
elly+0.001
Every+0.001
umpy+0.001
urious+0.001
Two+0.001
Neg logit contributions
 heal-0.001
chief-0.001
 drown-0.001
 down-0.001
 rid-0.001
Token breath
Token ID8033
Feature activation+2.91e-02

Pos logit contributions
elly+0.031
Every+0.029
umpy+0.028
ubby+0.028
iet+0.027
Neg logit contributions
 heal-0.011
 drown-0.010
chief-0.010
 rid-0.010
ridge-0.009
Token was
Token ID373
Feature activation+7.88e-02

Pos logit contributions
elly+0.070
umpy+0.060
Every+0.057
ubby+0.057
illing+0.056
Neg logit contributions
 heal-0.058
 down-0.058
chief-0.055
 drown-0.055
 him-0.054
Token a
Token ID257
Feature activation+6.28e-02

Pos logit contributions
elly+0.168
umpy+0.156
ida+0.148
urious+0.146
oma+0.145
Neg logit contributions
 down-0.082
 heal-0.078
 out-0.076
 him-0.074
 soar-0.074
Token little
Token ID1310
Feature activation+5.47e-02

Pos logit contributions
elly+0.073
Every+0.059
urious+0.055
umpy+0.055
If+0.053
Neg logit contributions
 heal-0.130
 down-0.126
 float-0.123
 we-0.122
 drown-0.122
Token duck
Token ID22045
Feature activation+6.34e-02

Pos logit contributions
elly+0.063
Every+0.052
umpy+0.048
urious+0.044
Two+0.044
Neg logit contributions
 heal-0.113
chief-0.111
 drown-0.110
 down-0.108
 rid-0.107
Token named
Token ID3706
Feature activation+4.04e-02

Pos logit contributions
elly+0.095
Every+0.082
umpy+0.077
urious+0.072
illing+0.071
Neg logit contributions
chief-0.110
 heal-0.110
 drown-0.108
 down-0.105
 rid-0.103
Token Qu
Token ID2264
Feature activation+1.37e-01

Pos logit contributions
elly+0.079
Every+0.072
umpy+0.067
ubby+0.065
Two+0.065
Neg logit contributions
 heal-0.049
chief-0.049
 drown-0.047
 elbow-0.046
 down-0.045
Tokenackers
Token ID28874
Feature activation+7.24e-02

Pos logit contributions
elly+0.201
Every+0.193
Children+0.177
oma+0.175
resa+0.174
Neg logit contributions
 drown-0.212
 heal-0.208
 down-0.194
 breathe-0.194
 float-0.191
Token.
Token ID13
Feature activation+3.78e-03

Pos logit contributions
elly+0.144
Every+0.130
umpy+0.125
urious+0.118
illing+0.118
Neg logit contributions
chief-0.092
 heal-0.089
 drown-0.087
 down-0.083
 rid-0.082
Token Qu
Token ID2264
Feature activation+1.39e-01

Pos logit contributions
elly+0.009
Every+0.008
umpy+0.008
urious+0.008
iet+0.008
Neg logit contributions
 heal-0.003
chief-0.003
 down-0.003
 drown-0.003
 rid-0.003
Tokenackers
Token ID28874
Feature activation+6.11e-02

Pos logit contributions
Every+0.215
elly+0.215
oma+0.202
Children+0.200
resa+0.198
Neg logit contributions
 drown-0.191
 heal-0.183
 down-0.178
 breathe-0.177
 float-0.172
Token lived
Token ID5615
Feature activation+8.68e-02

Pos logit contributions
elly+0.071
Every+0.058
umpy+0.054
urious+0.049
Two+0.049
Neg logit contributions
 heal-0.126
chief-0.124
 drown-0.123
 down-0.121
 rid-0.119
Token was
Token ID373
Feature activation+8.36e-02

Pos logit contributions
elly+0.079
umpy+0.068
iet+0.065
Every+0.065
ubby+0.065
Neg logit contributions
 heal-0.061
 down-0.060
 him-0.058
chief-0.058
 he-0.058
Token a
Token ID257
Feature activation+6.82e-02

Pos logit contributions
elly+0.168
umpy+0.151
ida+0.143
Every+0.142
oma+0.142
Neg logit contributions
 heal-0.097
 down-0.097
 soar-0.090
 drown-0.090
chief-0.089
Token little
Token ID1310
Feature activation+6.21e-02

Pos logit contributions
elly+0.079
Every+0.064
urious+0.060
umpy+0.059
If+0.058
Neg logit contributions
 heal-0.141
 down-0.136
 float-0.134
 we-0.134
 drown-0.133
Token bunny
Token ID44915
Feature activation+6.64e-02

Pos logit contributions
elly+0.074
Every+0.061
umpy+0.057
urious+0.052
Two+0.052
Neg logit contributions
 heal-0.126
chief-0.125
 drown-0.123
 down-0.121
 rid-0.119
Token named
Token ID3706
Feature activation+5.43e-02

Pos logit contributions
elly+0.097
Every+0.084
umpy+0.079
urious+0.073
illing+0.073
Neg logit contributions
chief-0.117
 heal-0.117
 drown-0.115
 down-0.111
 rid-0.111
Token Fl
Token ID1610
Feature activation+1.37e-01

Pos logit contributions
elly+0.120
Every+0.111
umpy+0.103
oma+0.099
Two+0.099
Neg logit contributions
 heal-0.055
 drown-0.054
chief-0.053
 rid-0.051
 down-0.049
Tokenops
Token ID2840
Feature activation+6.66e-02

Pos logit contributions
elly+0.273
Every+0.251
umpy+0.240
Children+0.231
Two+0.229
Neg logit contributions
 heal-0.165
 drown-0.160
chief-0.159
 down-0.152
 rid-0.145
Token.
Token ID13
Feature activation+1.23e-02

Pos logit contributions
elly+0.134
Every+0.122
umpy+0.116
illing+0.111
Two+0.110
Neg logit contributions
chief-0.081
 heal-0.080
 drown-0.078
 down-0.074
 rid-0.074
Token Fl
Token ID1610
Feature activation+1.36e-01

Pos logit contributions
elly+0.030
Every+0.028
umpy+0.027
Two+0.026
urious+0.026
Neg logit contributions
 heal-0.010
chief-0.009
 drown-0.009
 down-0.009
 rid-0.008
Tokenops
Token ID2840
Feature activation+6.30e-02

Pos logit contributions
elly+0.266
Every+0.253
umpy+0.235
Children+0.233
Two+0.229
Neg logit contributions
 heal-0.158
 drown-0.157
chief-0.154
 down-0.147
 rid-0.139
Token was
Token ID373
Feature activation+7.93e-02

Pos logit contributions
elly+0.092
Every+0.080
umpy+0.075
urious+0.070
Two+0.070
Neg logit contributions
 heal-0.111
chief-0.109
 drown-0.108
 down-0.105
 rid-0.104
Tokenipping
Token ID4501
Feature activation+2.18e-02

Pos logit contributions
elly+0.144
Every+0.126
umpy+0.124
ubby+0.121
Two+0.117
Neg logit contributions
 heal-0.096
 drown-0.091
chief-0.091
 rid-0.086
 down-0.086
Token by
Token ID416
Feature activation-2.32e-02

Pos logit contributions
elly+0.047
Every+0.043
umpy+0.042
oma+0.039
urious+0.039
Neg logit contributions
chief-0.023
 drown-0.023
 heal-0.023
 rid-0.020
ridge-0.020
Token.
Token ID13
Feature activation-2.10e-02

Pos logit contributions
 heal+0.025
chief+0.024
 drown+0.023
 down+0.023
 rid+0.022
Neg logit contributions
elly-0.051
Every-0.046
umpy-0.044
Two-0.043
urious-0.042
Token
Token ID198
Feature activation+7.09e-02

Pos logit contributions
 drown+0.037
 heal+0.037
 rid+0.036
chief+0.035
 bleed+0.034
Neg logit contributions
elly-0.032
Every-0.029
Two-0.025
umpy-0.025
Children-0.025
Token
Token ID198
Feature activation+6.26e-02

Pos logit contributions
elly+0.148
umpy+0.130
ubby+0.128
urious+0.125
Every+0.124
Neg logit contributions
 heal-0.076
 down-0.072
 drown-0.070
 him-0.068
 he-0.068
TokenV
Token ID53
Feature activation+1.37e-01

Pos logit contributions
elly+0.103
 importance+0.087
ubby+0.086
oma+0.086
 elderly+0.085
Neg logit contributions
 he-0.090
 heal-0.089
 you-0.086
 we-0.085
 breathe-0.084
Tokenicky
Token ID17479
Feature activation+3.47e-02

Pos logit contributions
elly+0.212
Every+0.207
Children+0.181
oma+0.180
grown+0.180
Neg logit contributions
 drown-0.197
 heal-0.193
 down-0.192
 elbow-0.184
 build-0.183
Token called
Token ID1444
Feature activation+3.53e-02

Pos logit contributions
elly+0.053
Every+0.045
umpy+0.043
oma+0.040
ubby+0.040
Neg logit contributions
 heal-0.058
 down-0.057
chief-0.057
 drown-0.056
 soar-0.054
Token out
Token ID503
Feature activation+3.08e-02

Pos logit contributions
elly+0.059
Every+0.054
 little+0.053
aste+0.050
 young+0.049
Neg logit contributions
chief-0.053
 drown-0.050
ridge-0.047
 soar-0.046
 glide-0.046
Token,
Token ID11
Feature activation+1.49e-02

Pos logit contributions
elly+0.064
Every+0.058
umpy+0.056
urious+0.054
Two+0.053
Neg logit contributions
 heal-0.035
chief-0.034
 drown-0.034
 down-0.033
 rid-0.032
Token "
Token ID366
Feature activation+3.54e-02

Pos logit contributions
elly+0.024
Every+0.021
umpy+0.019
Children+0.019
Two+0.018
Neg logit contributions
 heal-0.024
chief-0.024
 drown-0.024
 rid-0.023
 elbow-0.022
Token lived
Token ID5615
Feature activation+8.92e-02

Pos logit contributions
elly+0.071
umpy+0.061
ubby+0.058
iet+0.058
illing+0.058
Neg logit contributions
 heal-0.052
 down-0.051
 him-0.050
 he-0.050
chief-0.049
Token a
Token ID257
Feature activation+6.46e-02

Pos logit contributions
elly+0.169
umpy+0.147
Every+0.144
oma+0.141
urious+0.138
Neg logit contributions
 heal-0.118
 down-0.117
 drown-0.111
 rid-0.110
chief-0.107
Token nice
Token ID3621
Feature activation+7.16e-02

Pos logit contributions
elly+0.076
Every+0.060
urious+0.057
umpy+0.057
illing+0.054
Neg logit contributions
 heal-0.133
 down-0.127
 we-0.126
 drown-0.126
 soar-0.125
Token bear
Token ID6842
Feature activation+5.52e-02

Pos logit contributions
elly+0.090
Every+0.080
umpy+0.072
urious+0.071
Children+0.068
Neg logit contributions
 heal-0.136
 down-0.133
chief-0.131
 drown-0.130
 soar-0.126
Token named
Token ID3706
Feature activation+4.52e-02

Pos logit contributions
elly+0.095
Every+0.084
umpy+0.080
urious+0.076
Two+0.076
Neg logit contributions
 heal-0.082
chief-0.080
 drown-0.079
 down-0.078
 rid-0.076
Token Che
Token ID2580
Feature activation+1.37e-01

Pos logit contributions
elly+0.093
Every+0.085
umpy+0.079
Two+0.077
oma+0.077
Neg logit contributions
 heal-0.051
 drown-0.051
chief-0.051
 rid-0.048
 elbow-0.046
Tokenwy
Token ID21768
Feature activation+6.23e-02

Pos logit contributions
elly+0.246
Every+0.212
umpy+0.204
 importantly+0.204
Two+0.199
Neg logit contributions
 heal-0.198
chief-0.187
 rid-0.174
 him-0.173
ridge-0.172
Token.
Token ID13
Feature activation+4.51e-03

Pos logit contributions
elly+0.140
Every+0.127
umpy+0.125
urious+0.121
Children+0.119
Neg logit contributions
 heal-0.061
 down-0.053
chief-0.053
 drown-0.050
 build-0.050
Token Che
Token ID2580
Feature activation+1.33e-01

Pos logit contributions
elly+0.010
iet+0.009
ida+0.009
urious+0.009
umpy+0.009
Neg logit contributions
chief-0.005
 down-0.004
 heal-0.004
 drown-0.004
 build-0.004
Tokenwy
Token ID21768
Feature activation+5.36e-02

Pos logit contributions
elly+0.249
Every+0.225
 importantly+0.216
Two+0.213
 Spanish+0.211
Neg logit contributions
 heal-0.177
chief-0.165
atmeal-0.152
 down-0.152
ridge-0.151
Token had
Token ID550
Feature activation+3.60e-02

Pos logit contributions
elly+0.071
Every+0.056
umpy+0.055
ubby+0.052
year+0.051
Neg logit contributions
 heal-0.103
chief-0.099
 down-0.098
 build-0.096
 drown-0.095
Token duck
Token ID22045
Feature activation+7.07e-02

Pos logit contributions
elly+0.077
Every+0.065
umpy+0.060
urious+0.055
Two+0.054
Neg logit contributions
 heal-0.129
chief-0.126
 drown-0.125
 down-0.124
 rid-0.121
Token named
Token ID3706
Feature activation+4.67e-02

Pos logit contributions
elly+0.109
Every+0.094
umpy+0.090
urious+0.084
Two+0.084
Neg logit contributions
 heal-0.119
chief-0.117
 drown-0.116
 down-0.113
 rid-0.111
Token Qu
Token ID2264
Feature activation+1.39e-01

Pos logit contributions
elly+0.095
Every+0.087
umpy+0.081
ubby+0.079
Two+0.078
Neg logit contributions
 heal-0.054
chief-0.053
 drown-0.051
 down-0.050
 elbow-0.049
Tokenack
Token ID441
Feature activation+8.08e-02

Pos logit contributions
elly+0.193
Every+0.186
Children+0.167
oma+0.166
resa+0.163
Neg logit contributions
 drown-0.228
 heal-0.224
 down-0.211
 rid-0.209
 breathe-0.209
Token.
Token ID13
Feature activation+1.13e-02

Pos logit contributions
elly+0.177
Every+0.159
umpy+0.153
Two+0.150
Children+0.148
Neg logit contributions
 heal-0.083
 down-0.078
chief-0.077
 drown-0.077
 rid-0.074
Token Qu
Token ID2264
Feature activation+1.37e-01

Pos logit contributions
elly+0.028
umpy+0.025
Every+0.025
urious+0.025
iet+0.025
Neg logit contributions
 down-0.008
 drown-0.008
 heal-0.008
chief-0.007
 float-0.007
Tokenack
Token ID441
Feature activation+6.76e-02

Pos logit contributions
elly+0.184
Every+0.181
oma+0.160
Children+0.160
resa+0.158
Neg logit contributions
 drown-0.230
 heal-0.223
 breathe-0.212
 down-0.211
 build-0.211
Token was
Token ID373
Feature activation+7.86e-02

Pos logit contributions
elly+0.110
Every+0.088
umpy+0.087
oph+0.086
oma+0.085
Neg logit contributions
 heal-0.108
 down-0.105
 soar-0.101
chief-0.099
 we-0.099
Token black
Token ID2042
Feature activation+6.53e-02

Pos logit contributions
elly+0.146
Every+0.129
umpy+0.125
urious+0.119
Two+0.118
Neg logit contributions
 heal-0.108
chief-0.105
 drown-0.104
 down-0.103
 rid-0.099
Token and
Token ID290
Feature activation-1.03e-02

Pos logit contributions
elly+0.122
ida+0.119
Every+0.119
Children+0.116
Girls+0.115
Neg logit contributions
 down-0.079
 back-0.073
 out-0.070
 heal-0.068
 him-0.066
Token loved
Token ID6151
Feature activation+5.43e-02

Pos logit contributions
 heal+0.019
chief+0.019
 drown+0.019
 down+0.018
 rid+0.018
Neg logit contributions
elly-0.013
Every-0.012
umpy-0.011
ubby-0.010
Two-0.010
Token there
Token ID612
Feature activation+3.38e-02

Pos logit contributions
elly+0.100
Every+0.085
umpy+0.080
urious+0.073
Two+0.073
Neg logit contributions
 heal-0.146
chief-0.144
 drown-0.143
 down-0.140
 rid-0.138
Token were
Token ID547
Feature activation+7.52e-02

Pos logit contributions
elly+0.047
Every+0.040
umpy+0.038
urious+0.035
Two+0.035
Neg logit contributions
 heal-0.062
chief-0.061
 drown-0.060
 down-0.059
 rid-0.058
Token two
Token ID734
Feature activation+8.45e-02

Pos logit contributions
elly+0.113
umpy+0.095
Every+0.094
oma+0.092
iet+0.089
Neg logit contributions
 down-0.123
 heal-0.122
 drown-0.120
chief-0.118
 rid-0.116
Token friends
Token ID2460
Feature activation+6.96e-02

Pos logit contributions
elly+0.153
umpy+0.130
Every+0.130
If+0.123
oma+0.123
Neg logit contributions
 heal-0.114
 down-0.111
chief-0.110
 drown-0.109
 back-0.108
Token,
Token ID11
Feature activation+4.16e-02

Pos logit contributions
elly+0.120
Every+0.105
umpy+0.101
urious+0.096
Two+0.095
Neg logit contributions
 heal-0.104
chief-0.102
 drown-0.101
 down-0.099
 rid-0.096
Token L
Token ID406
Feature activation+1.36e-01

Pos logit contributions
elly+0.088
umpy+0.078
Every+0.077
urious+0.074
illing+0.072
Neg logit contributions
 heal-0.045
 down-0.044
chief-0.044
 drown-0.043
 rid-0.041
Tokenola
Token ID5708
Feature activation+7.40e-02

Pos logit contributions
elly+0.193
Every+0.185
Two+0.172
 importance+0.169
umpy+0.165
Neg logit contributions
 drown-0.204
chief-0.199
 heal-0.198
 down-0.197
 we-0.194
Token and
Token ID290
Feature activation-2.39e-03

Pos logit contributions
elly+0.104
Every+0.088
umpy+0.084
oma+0.078
illing+0.078
Neg logit contributions
 heal-0.136
 drown-0.134
chief-0.133
 rid-0.129
 down-0.128
Token Ab
Token ID2275
Feature activation+1.11e-01

Pos logit contributions
chief+0.003
 heal+0.003
 drown+0.003
ridge+0.003
 rid+0.003
Neg logit contributions
elly-0.004
Every-0.004
Two-0.004
Children-0.004
ida-0.004
Tokenig
Token ID328
Feature activation+6.45e-02

Pos logit contributions
elly+0.204
Every+0.182
umpy+0.174
urious+0.165
Two+0.165
Neg logit contributions
 heal-0.152
chief-0.149
 drown-0.147
 down-0.144
 rid-0.140
Tokenail
Token ID603
Feature activation+6.44e-02

Pos logit contributions
elly+0.122
Every+0.111
umpy+0.106
ubby+0.102
iet+0.102
Neg logit contributions
 heal-0.084
 drown-0.081
chief-0.080
 down-0.076
 rid-0.076
Token bunny
Token ID44915
Feature activation+6.64e-02

Pos logit contributions
elly+0.074
Every+0.061
umpy+0.057
urious+0.052
Two+0.052
Neg logit contributions
 heal-0.126
chief-0.125
 drown-0.123
 down-0.121
 rid-0.119
Token named
Token ID3706
Feature activation+5.43e-02

Pos logit contributions
elly+0.097
Every+0.084
umpy+0.079
urious+0.073
illing+0.073
Neg logit contributions
chief-0.117
 heal-0.117
 drown-0.115
 down-0.111
 rid-0.111
Token Fl
Token ID1610
Feature activation+1.37e-01

Pos logit contributions
elly+0.120
Every+0.111
umpy+0.103
oma+0.099
Two+0.099
Neg logit contributions
 heal-0.055
 drown-0.054
chief-0.053
 rid-0.051
 down-0.049
Tokenops
Token ID2840
Feature activation+6.66e-02

Pos logit contributions
elly+0.273
Every+0.251
umpy+0.240
Children+0.231
Two+0.229
Neg logit contributions
 heal-0.165
 drown-0.160
chief-0.159
 down-0.152
 rid-0.145
Token.
Token ID13
Feature activation+1.23e-02

Pos logit contributions
elly+0.134
Every+0.122
umpy+0.116
illing+0.111
Two+0.110
Neg logit contributions
chief-0.081
 heal-0.080
 drown-0.078
 down-0.074
 rid-0.074
Token Fl
Token ID1610
Feature activation+1.36e-01

Pos logit contributions
elly+0.030
Every+0.028
umpy+0.027
Two+0.026
urious+0.026
Neg logit contributions
 heal-0.010
chief-0.009
 drown-0.009
 down-0.009
 rid-0.008
Tokenops
Token ID2840
Feature activation+6.30e-02

Pos logit contributions
elly+0.266
Every+0.253
umpy+0.235
Children+0.233
Two+0.229
Neg logit contributions
 heal-0.158
 drown-0.157
chief-0.154
 down-0.147
 rid-0.139
Token was
Token ID373
Feature activation+7.93e-02

Pos logit contributions
elly+0.092
Every+0.080
umpy+0.075
urious+0.070
Two+0.070
Neg logit contributions
 heal-0.111
chief-0.109
 drown-0.108
 down-0.105
 rid-0.104
Token an
Token ID281
Feature activation+8.99e-02

Pos logit contributions
elly+0.137
Every+0.122
umpy+0.116
ubby+0.109
Two+0.108
Neg logit contributions
 heal-0.118
chief-0.117
 drown-0.115
 down-0.111
 rid-0.109
Token attractive
Token ID10966
Feature activation+5.01e-02

Pos logit contributions
elly+0.113
Every+0.098
Two+0.088
Children+0.087
umpy+0.086
Neg logit contributions
 heal-0.172
 down-0.167
chief-0.166
 drown-0.166
 rid-0.160
Token bunny
Token ID44915
Feature activation+3.77e-02

Pos logit contributions
elly+0.092
Every+0.086
umpy+0.081
urious+0.081
Children+0.080
Neg logit contributions
 heal-0.067
 down-0.065
chief-0.065
 drown-0.060
 float-0.059
Token some
Token ID617
Feature activation-4.60e-02

Pos logit contributions
 drown+0.039
chief+0.036
 heal+0.034
 elbow+0.032
 bleed+0.032
Neg logit contributions
elly-0.058
Every-0.052
oma-0.049
umpy-0.049
Children-0.048
Token buttons
Token ID12163
Feature activation-5.49e-02

Pos logit contributions
 drown+0.075
chief+0.075
 heal+0.069
 down+0.067
 rid+0.065
Neg logit contributions
elly-0.074
Every-0.066
umpy-0.059
 year-0.057
Children-0.057
Token or
Token ID393
Feature activation-1.17e-01

Pos logit contributions
 drown+0.102
chief+0.095
 elbow+0.089
 heal+0.089
 rid+0.088
Neg logit contributions
elly-0.082
Every-0.067
umpy-0.065
ubby-0.059
aste-0.057
Token rib
Token ID12183
Feature activation-4.21e-02

Pos logit contributions
chief+0.172
 drown+0.166
stack+0.161
ridge+0.151
 heal+0.148
Neg logit contributions
elly-0.179
Every-0.175
 little-0.166
Some-0.162
 young-0.158
Tokenbons
Token ID23461
Feature activation-5.33e-02

Pos logit contributions
 drown+0.058
 heal+0.057
chief+0.055
 down+0.054
 rid+0.053
Neg logit contributions
elly-0.077
Every-0.070
umpy-0.066
oma-0.063
Two-0.063
Token or
Token ID393
Feature activation-1.24e-01

Pos logit contributions
 drown+0.086
chief+0.081
 heal+0.076
 elbow+0.076
 bleed+0.075
Neg logit contributions
elly-0.090
Every-0.078
umpy-0.075
ubby-0.069
aste-0.068
Token patches
Token ID16082
Feature activation-5.75e-02

Pos logit contributions
 drown+0.194
chief+0.192
stack+0.187
 heal+0.175
 rid+0.172
Neg logit contributions
elly-0.180
Every-0.171
 little-0.162
Some-0.156
Two-0.153
Token.
Token ID13
Feature activation-7.37e-02

Pos logit contributions
 drown+0.093
chief+0.090
 elbow+0.087
 bleed+0.081
 heal+0.080
Neg logit contributions
elly-0.098
Every-0.086
umpy-0.085
ubby-0.075
aste-0.075
Token Would
Token ID10928
Feature activation+6.81e-03

Pos logit contributions
chief+0.094
 heal+0.094
 drown+0.092
 down+0.088
 rid+0.088
Neg logit contributions
elly-0.144
Every-0.130
umpy-0.122
Two-0.120
Children-0.117
Token you
Token ID345
Feature activation-3.95e-02

Pos logit contributions
Every+0.015
elly+0.014
Children+0.014
A+0.014
One+0.014
Neg logit contributions
 drown-0.007
chief-0.006
 bleed-0.006
 rid-0.006
 elbow-0.006
Token like
Token ID588
Feature activation-3.34e-02

Pos logit contributions
 heal+0.063
chief+0.063
 drown+0.062
 down+0.060
 rid+0.059
Neg logit contributions
elly-0.064
Every-0.057
umpy-0.054
Two-0.050
urious-0.050
Token very
Token ID845
Feature activation-3.06e-02

Pos logit contributions
elly+0.008
Every+0.007
umpy+0.006
 little+0.006
 young+0.006
Neg logit contributions
 drown-0.006
chief-0.006
 heal-0.006
ridge-0.006
 rid-0.006
Token happy
Token ID3772
Feature activation-3.51e-02

Pos logit contributions
 heal+0.055
chief+0.054
 drown+0.054
 down+0.053
 rid+0.052
Neg logit contributions
elly-0.043
Every-0.037
umpy-0.035
Two-0.032
urious-0.032
Token and
Token ID290
Feature activation-7.12e-02

Pos logit contributions
 drown+0.061
chief+0.061
 heal+0.060
 rid+0.057
 bleed+0.056
Neg logit contributions
elly-0.055
 year-0.046
Every-0.045
 little-0.043
umpy-0.042
Token wondered
Token ID14028
Feature activation-2.91e-02

Pos logit contributions
 drown+0.120
chief+0.119
ridge+0.115
imet+0.114
 heal+0.112
Neg logit contributions
Every-0.092
elly-0.087
 little-0.087
 elderly-0.087
 kind-0.085
Token how
Token ID703
Feature activation-3.30e-02

Pos logit contributions
 drown+0.053
chief+0.052
 heal+0.049
 bleed+0.049
 rid+0.048
Neg logit contributions
elly-0.038
Every-0.036
 little-0.032
 young-0.031
grown-0.030
Token it
Token ID340
Feature activation-1.22e-01

Pos logit contributions
chief+0.056
 drown+0.055
 elbow+0.053
stack+0.052
 bleed+0.051
Neg logit contributions
elly-0.048
Every-0.048
 kind-0.044
 young-0.044
 little-0.044
Token would
Token ID561
Feature activation-8.91e-02

Pos logit contributions
chief+0.258
 drown+0.246
 elbow+0.239
 heal+0.228
ridge+0.227
Neg logit contributions
elly-0.119
Every-0.118
 meant-0.104
 liked-0.102
umpy-0.101
Token taste
Token ID6938
Feature activation-6.00e-02

Pos logit contributions
chief+0.118
 elbow+0.108
ridge+0.104
 down+0.102
 drown+0.099
Neg logit contributions
Every-0.163
elly-0.158
umpy-0.146
 year-0.142
 importance-0.142
Token.
Token ID13
Feature activation-6.21e-02

Pos logit contributions
chief+0.102
 elbow+0.097
 drown+0.097
stack+0.092
ridge+0.089
Neg logit contributions
elly-0.098
Every-0.087
 little-0.079
umpy-0.079
 year-0.078
Token
Token ID220
Feature activation+2.08e-02

Pos logit contributions
 heal+0.105
 drown+0.102
 rid+0.100
chief+0.099
 bleed+0.096
Neg logit contributions
elly-0.100
Every-0.092
umpy-0.082
Two-0.081
Children-0.079
Token
Token ID198
Feature activation+2.23e-02

Pos logit contributions
elly+0.031
Every+0.028
umpy+0.026
oma+0.024
Children+0.024
Neg logit contributions
 heal-0.036
 drown-0.036
chief-0.036
 rid-0.034
 down-0.033
Token it
Token ID340
Feature activation-6.60e-02

Pos logit contributions
chief+0.034
 heal+0.033
 drown+0.032
 rid+0.031
 down+0.031
Neg logit contributions
elly-0.055
Every-0.049
umpy-0.047
 little-0.046
 young-0.045
Token's
Token ID338
Feature activation-3.29e-02

Pos logit contributions
chief+0.101
 elbow+0.096
 heal+0.095
 rid+0.095
 down+0.094
Neg logit contributions
elly-0.104
umpy-0.093
Every-0.089
urious-0.089
 little-0.084
Token unknown
Token ID6439
Feature activation-3.41e-02

Pos logit contributions
chief+0.051
 drown+0.047
 heal+0.047
 elbow+0.046
ridge+0.045
Neg logit contributions
elly-0.051
Every-0.047
umpy-0.045
 little-0.043
 young-0.043
Token how
Token ID703
Feature activation-3.12e-02

Pos logit contributions
chief+0.048
 drown+0.045
 heal+0.042
 elbow+0.042
 bleed+0.041
Neg logit contributions
elly-0.065
umpy-0.054
Every-0.054
 little-0.053
aste-0.052
Token strong
Token ID1913
Feature activation-6.47e-02

Pos logit contributions
chief+0.037
 heal+0.034
 drown+0.032
ridge+0.032
 elbow+0.032
Neg logit contributions
elly-0.066
Every-0.059
umpy-0.058
grown-0.054
ubby-0.054
Token it
Token ID340
Feature activation-1.22e-01

Pos logit contributions
chief+0.080
 drown+0.075
ridge+0.073
 heal+0.071
 rid+0.070
Neg logit contributions
elly-0.131
Every-0.117
umpy-0.113
able-0.108
ubby-0.108
Token is
Token ID318
Feature activation-6.22e-02

Pos logit contributions
chief+0.222
 elbow+0.212
stack+0.200
 drown+0.192
ridge+0.191
Neg logit contributions
elly-0.160
umpy-0.147
Every-0.141
 little-0.138
 meant-0.136
Token!"
Token ID2474
Feature activation-3.52e-02

Pos logit contributions
chief+0.083
 elbow+0.077
 drown+0.076
 heal+0.074
ridge+0.072
Neg logit contributions
elly-0.115
Every-0.106
umpy-0.103
 importance-0.096
 young-0.096
Token
Token ID220
Feature activation+2.72e-02

Pos logit contributions
 drown+0.070
 heal+0.068
 rid+0.065
 soar+0.065
 bleed+0.065
Neg logit contributions
elly-0.042
Every-0.040
Children-0.036
Tom-0.036
S-0.035
Token
Token ID198
Feature activation+3.11e-02

Pos logit contributions
elly+0.037
Every+0.032
umpy+0.030
oma+0.028
Children+0.027
Neg logit contributions
chief-0.051
 heal-0.051
 drown-0.050
 elbow-0.049
 rid-0.049
Token
Token ID198
Feature activation+2.63e-02

Pos logit contributions
elly+0.053
Every+0.046
umpy+0.044
urious+0.042
Two+0.042
Neg logit contributions
 heal-0.048
chief-0.047
 drown-0.046
 down-0.045
 rid-0.044
Token may
Token ID743
Feature activation-8.15e-02

Pos logit contributions
chief+0.100
 heal+0.091
 elbow+0.090
 drown+0.090
 down+0.090
Neg logit contributions
elly-0.076
Every-0.072
umpy-0.067
Two-0.064
ubby-0.061
Token want
Token ID765
Feature activation-5.54e-02

Pos logit contributions
chief+0.127
 elbow+0.126
ridge+0.118
 rid+0.117
 down+0.115
Neg logit contributions
Every-0.131
 young-0.115
One-0.114
 little-0.113
elly-0.109
Token more
Token ID517
Feature activation-6.55e-02

Pos logit contributions
chief+0.080
 drown+0.071
 elbow+0.069
stack+0.067
ridge+0.066
Neg logit contributions
elly-0.099
Every-0.098
 little-0.089
Children-0.088
 young-0.087
Token than
Token ID621
Feature activation-7.22e-02

Pos logit contributions
chief+0.107
 heal+0.104
 drown+0.103
 rid+0.099
 down+0.099
Neg logit contributions
elly-0.106
Every-0.094
umpy-0.088
urious-0.082
illing-0.081
Token what
Token ID644
Feature activation-7.11e-02

Pos logit contributions
chief+0.094
 drown+0.081
 heal+0.080
ridge+0.078
 elbow+0.077
Neg logit contributions
elly-0.133
Every-0.125
 young-0.121
 little-0.119
 elderly-0.117
Token we
Token ID356
Feature activation-1.21e-01

Pos logit contributions
chief+0.093
 drown+0.078
ridge+0.076
 rid+0.076
 elbow+0.076
Neg logit contributions
elly-0.144
Every-0.140
 little-0.126
 young-0.125
umpy-0.123
Token have
Token ID423
Feature activation-9.33e-02

Pos logit contributions
chief+0.182
 drown+0.165
 heal+0.165
 elbow+0.164
 down+0.159
Neg logit contributions
elly-0.202
Every-0.191
umpy-0.178
Two-0.172
urious-0.171
Token,
Token ID11
Feature activation-4.79e-02

Pos logit contributions
chief+0.114
 heal+0.111
 drown+0.110
 down+0.102
 rid+0.100
Neg logit contributions
elly-0.189
Every-0.169
umpy-0.163
Children-0.154
urious-0.153
Token but
Token ID475
Feature activation-5.59e-02

Pos logit contributions
chief+0.082
 drown+0.076
ridge+0.074
 rid+0.074
 heal+0.073
Neg logit contributions
elly-0.072
Every-0.066
 young-0.061
umpy-0.060
 little-0.060
Token it
Token ID340
Feature activation-6.70e-02

Pos logit contributions
chief+0.054
 drown+0.050
 heal+0.047
 elbow+0.047
 bleed+0.046
Neg logit contributions
elly-0.116
Every-0.112
 young-0.108
 little-0.106
 elderly-0.105
Token's
Token ID338
Feature activation-3.11e-02

Pos logit contributions
chief+0.118
 elbow+0.115
 drown+0.110
 rid+0.107
 down+0.105
Neg logit contributions
elly-0.092
Every-0.084
umpy-0.081
 importance-0.080
urious-0.079
Token!
Token ID0
Feature activation-6.55e-02

Pos logit contributions
elly+0.020
Every+0.017
umpy+0.017
oma+0.016
ubby+0.016
Neg logit contributions
 heal-0.015
chief-0.015
 drown-0.014
 down-0.013
 rid-0.013
Token We
Token ID775
Feature activation-4.19e-02

Pos logit contributions
 heal+0.077
chief+0.077
 drown+0.075
 rid+0.071
 down+0.070
Neg logit contributions
elly-0.138
Every-0.122
umpy-0.118
Two-0.115
oma-0.112
Token can
Token ID460
Feature activation-5.19e-02

Pos logit contributions
chief+0.065
 heal+0.063
 drown+0.061
 down+0.060
 elbow+0.059
Neg logit contributions
elly-0.069
Every-0.062
umpy-0.060
Two-0.057
ubby-0.055
Token buy
Token ID2822
Feature activation-3.12e-02

Pos logit contributions
chief+0.064
 drown+0.060
 down+0.059
 elbow+0.058
 heal+0.056
Neg logit contributions
elly-0.097
Every-0.097
Two-0.089
umpy-0.086
Children-0.085
Token anything
Token ID1997
Feature activation-7.14e-02

Pos logit contributions
 drown+0.045
 heal+0.043
chief+0.043
 bleed+0.040
 elbow+0.040
Neg logit contributions
elly-0.055
Every-0.050
Children-0.047
 elderly-0.045
oma-0.045
Token we
Token ID356
Feature activation-1.21e-01

Pos logit contributions
chief+0.104
 elbow+0.092
 drown+0.088
ridge+0.085
 bleed+0.083
Neg logit contributions
elly-0.141
Every-0.125
umpy-0.118
 little-0.117
 young-0.113
Token want
Token ID765
Feature activation-5.98e-02

Pos logit contributions
chief+0.179
 heal+0.172
 drown+0.170
 down+0.162
 elbow+0.161
Neg logit contributions
elly-0.206
Every-0.186
umpy-0.179
Two-0.170
Children-0.165
Token!"
Token ID2474
Feature activation-3.21e-02

Pos logit contributions
chief+0.084
 drown+0.082
 heal+0.079
 bleed+0.075
 elbow+0.075
Neg logit contributions
elly-0.116
Every-0.101
umpy-0.098
aste-0.094
 year-0.093
Token
Token ID198
Feature activation+4.51e-02

Pos logit contributions
 drown+0.072
 rid+0.071
 heal+0.071
ridge+0.068
chief+0.067
Neg logit contributions
elly-0.033
Every-0.029
Children-0.024
Tom-0.023
oma-0.023
Token
Token ID198
Feature activation+3.94e-02

Pos logit contributions
elly+0.076
Every+0.067
umpy+0.064
urious+0.060
Two+0.060
Neg logit contributions
 heal-0.069
chief-0.068
 drown-0.067
 down-0.065
 rid-0.064
Token"
Token ID1
Feature activation+3.93e-03

Pos logit contributions
elly+0.052
umpy+0.041
Every+0.040
ubby+0.038
urious+0.037
Neg logit contributions
 heal-0.076
chief-0.073
 drown-0.073
 down-0.071
 rid-0.070
Token big
Token ID1263
Feature activation-4.65e-02

Pos logit contributions
chief+0.072
 heal+0.070
 drown+0.070
 down+0.068
 rid+0.067
Neg logit contributions
elly-0.056
Every-0.049
umpy-0.045
Two-0.043
 year-0.042
Token stick
Token ID4859
Feature activation-7.61e-02

Pos logit contributions
 heal+0.064
chief+0.063
 drown+0.062
 down+0.060
 rid+0.060
Neg logit contributions
elly-0.085
Every-0.075
umpy-0.072
Two-0.069
urious-0.068
Token to
Token ID284
Feature activation-8.44e-02

Pos logit contributions
chief+0.113
 drown+0.109
 heal+0.107
 rid+0.103
 elbow+0.102
Neg logit contributions
elly-0.132
Every-0.116
umpy-0.110
 year-0.104
illing-0.104
Token break
Token ID2270
Feature activation-1.74e-02

Pos logit contributions
chief+0.105
ridge+0.097
 drown+0.092
 down+0.090
imet+0.089
Neg logit contributions
Every-0.152
elly-0.142
 little-0.137
 young-0.132
 elderly-0.129
Token the
Token ID262
Feature activation-6.43e-02

Pos logit contributions
 drown+0.021
chief+0.020
ridge+0.018
atmeal+0.018
 heal+0.018
Neg logit contributions
elly-0.034
Every-0.031
umpy-0.029
aste-0.029
 little-0.029
Token vine
Token ID17793
Feature activation-1.19e-01

Pos logit contributions
chief+0.096
 drown+0.092
 heal+0.091
 rid+0.087
 down+0.086
Neg logit contributions
elly-0.106
Every-0.093
umpy-0.091
 little-0.089
Two-0.087
Token.
Token ID13
Feature activation-5.69e-02

Pos logit contributions
chief+0.205
 elbow+0.197
 drown+0.196
 rid+0.191
ridge+0.183
Neg logit contributions
elly-0.172
Every-0.155
 year-0.143
umpy-0.142
 little-0.133
Token Lily
Token ID20037
Feature activation+2.12e-02

Pos logit contributions
 heal+0.092
 drown+0.090
 rid+0.087
chief+0.087
 bleed+0.085
Neg logit contributions
elly-0.095
Every-0.090
Two-0.080
umpy-0.079
Children-0.077
Token looked
Token ID3114
Feature activation-5.87e-03

Pos logit contributions
elly+0.033
Every+0.028
umpy+0.027
ubby+0.025
Two+0.025
Neg logit contributions
 heal-0.036
chief-0.035
 down-0.034
 drown-0.034
 rid-0.033
Token around
Token ID1088
Feature activation-1.37e-02

Pos logit contributions
 drown+0.008
 heal+0.008
chief+0.008
ridge+0.007
 elbow+0.007
Neg logit contributions
elly-0.011
Every-0.010
umpy-0.009
ubby-0.009
 little-0.009
Token and
Token ID290
Feature activation-4.63e-02

Pos logit contributions
 heal+0.022
chief+0.022
 drown+0.022
 rid+0.021
 down+0.020
Neg logit contributions
elly-0.022
Every-0.019
umpy-0.017
urious-0.017
Children-0.017
Token okay
Token ID8788
Feature activation-1.58e-02

Pos logit contributions
chief+0.069
 heal+0.069
 drown+0.064
 down+0.061
 rid+0.061
Neg logit contributions
elly-0.081
Every-0.072
umpy-0.070
ubby-0.068
Two-0.066
Token,
Token ID11
Feature activation-2.01e-02

Pos logit contributions
chief+0.025
 heal+0.024
 rid+0.024
 drown+0.024
 elbow+0.023
Neg logit contributions
elly-0.026
Every-0.022
umpy-0.022
ubby-0.021
 little-0.020
Token Anna
Token ID11735
Feature activation-3.19e-03

Pos logit contributions
chief+0.028
 heal+0.028
 drown+0.027
 down+0.027
 rid+0.026
Neg logit contributions
elly-0.036
Every-0.032
umpy-0.031
Two-0.029
Children-0.029
Token?
Token ID30
Feature activation-7.50e-02

Pos logit contributions
chief+0.005
 heal+0.004
 drown+0.004
 rid+0.004
 elbow+0.004
Neg logit contributions
elly-0.006
umpy-0.005
Every-0.005
oma-0.005
ubby-0.005
Token Does
Token ID8314
Feature activation-5.28e-02

Pos logit contributions
 heal+0.118
 drown+0.116
 rid+0.115
chief+0.114
 down+0.109
Neg logit contributions
elly-0.127
Every-0.104
Two-0.101
umpy-0.100
Some-0.093
Token it
Token ID340
Feature activation-1.18e-01

Pos logit contributions
 rid+0.061
chief+0.060
 hyd+0.060
stack+0.060
 drown+0.060
Neg logit contributions
elly-0.105
Every-0.096
 little-0.090
umpy-0.087
 year-0.086
Token hurt
Token ID5938
Feature activation-3.70e-02

Pos logit contributions
stack+0.196
chief+0.188
 mast+0.163
ridge+0.160
 elbow+0.159
Neg logit contributions
elly-0.180
umpy-0.177
 little-0.172
Every-0.172
 meant-0.166
Token a
Token ID257
Feature activation-4.19e-02

Pos logit contributions
chief+0.060
 drown+0.054
ridge+0.052
 elbow+0.050
stack+0.050
Neg logit contributions
elly-0.064
Every-0.056
umpy-0.055
ubby-0.051
 little-0.050
Token lot
Token ID1256
Feature activation-5.16e-02

Pos logit contributions
 heal+0.070
chief+0.069
 drown+0.068
 down+0.067
 rid+0.065
Neg logit contributions
elly-0.065
Every-0.056
umpy-0.053
Two-0.050
urious-0.050
Token?"
Token ID1701
Feature activation-3.10e-02

Pos logit contributions
 heal+0.060
 down+0.054
 drown+0.052
 rid+0.051
chief+0.051
Neg logit contributions
elly-0.107
Every-0.097
urious-0.094
umpy-0.093
Two-0.093
Token
Token ID198
Feature activation+4.07e-02

Pos logit contributions
 drown+0.067
 rid+0.064
 heal+0.064
chief+0.063
 bleed+0.062
Neg logit contributions
elly-0.034
Every-0.032
Children-0.028
oma-0.027
ara-0.026
Token we
Token ID356
Feature activation-6.58e-02

Pos logit contributions
 drown+0.037
chief+0.035
 bleed+0.033
 heal+0.032
 elbow+0.032
Neg logit contributions
elly-0.048
Every-0.046
 little-0.045
 young-0.042
A-0.041
Token can
Token ID460
Feature activation-7.34e-02

Pos logit contributions
chief+0.109
 elbow+0.100
 heal+0.100
 drown+0.099
 rid+0.094
Neg logit contributions
elly-0.102
Every-0.094
umpy-0.088
Two-0.085
ubby-0.081
Token see
Token ID766
Feature activation-5.66e-02

Pos logit contributions
chief+0.084
 drown+0.080
 elbow+0.080
 down+0.078
 rid+0.074
Neg logit contributions
Every-0.139
elly-0.138
Two-0.126
One-0.124
umpy-0.122
Token some
Token ID617
Feature activation-3.72e-02

Pos logit contributions
 drown+0.081
chief+0.081
 heal+0.075
stack+0.073
 elbow+0.073
Neg logit contributions
elly-0.101
Every-0.090
 little-0.088
 young-0.083
Children-0.083
Token lights
Token ID7588
Feature activation-5.78e-02

Pos logit contributions
chief+0.056
 drown+0.054
 heal+0.053
 down+0.050
 rid+0.050
Neg logit contributions
elly-0.065
Every-0.058
umpy-0.054
Children-0.051
ubby-0.051
Token or
Token ID393
Feature activation-1.18e-01

Pos logit contributions
 drown+0.104
chief+0.102
 elbow+0.095
 rid+0.095
 heal+0.094
Neg logit contributions
elly-0.088
Every-0.073
umpy-0.069
aste-0.065
 year-0.065
Token tunnels
Token ID22642
Feature activation-5.25e-02

Pos logit contributions
chief+0.192
 drown+0.176
 heal+0.174
 rid+0.167
stack+0.166
Neg logit contributions
elly-0.177
Every-0.163
 little-0.151
Some-0.150
Two-0.149
Token,"
Token ID553
Feature activation-2.58e-02

Pos logit contributions
chief+0.080
 drown+0.076
 heal+0.073
 elbow+0.071
 rid+0.070
Neg logit contributions
elly-0.094
Every-0.081
umpy-0.080
 year-0.074
ubby-0.072
Token mom
Token ID1995
Feature activation+1.64e-02

Pos logit contributions
 drown+0.039
chief+0.039
 heal+0.039
ridge+0.038
 soar+0.035
Neg logit contributions
elly-0.041
Every-0.040
Children-0.038
 little-0.037
Two-0.037
Token said
Token ID531
Feature activation+1.36e-02

Pos logit contributions
elly+0.030
Every+0.026
umpy+0.026
ubby+0.025
Two+0.024
Neg logit contributions
 down-0.023
 heal-0.022
chief-0.022
 build-0.020
 drown-0.020
Token.
Token ID13
Feature activation-2.98e-02

Pos logit contributions
elly+0.018
 little+0.016
 young+0.015
Every+0.015
illed+0.014
Neg logit contributions
 drown-0.025
 soar-0.022
 glide-0.022
chief-0.022
 mast-0.022
Token wanted
Token ID2227
Feature activation-2.77e-02

Pos logit contributions
 drown+0.120
chief+0.117
ridge+0.112
stack+0.110
 heal+0.108
Neg logit contributions
elly-0.156
Every-0.155
 little-0.146
 young-0.140
 English-0.139
Token to
Token ID284
Feature activation-8.63e-02

Pos logit contributions
 drown+0.038
chief+0.035
stack+0.034
 bleed+0.034
 heal+0.032
Neg logit contributions
elly-0.049
Every-0.048
 English-0.043
 little-0.043
 Spanish-0.042
Token find
Token ID1064
Feature activation-3.92e-02

Pos logit contributions
chief+0.111
ridge+0.104
 drown+0.103
 down+0.101
 elbow+0.096
Neg logit contributions
Every-0.156
elly-0.144
 little-0.136
Two-0.136
oma-0.134
Token out
Token ID503
Feature activation-6.38e-02

Pos logit contributions
 drown+0.052
chief+0.048
 heal+0.046
stack+0.045
 bleed+0.044
Neg logit contributions
elly-0.074
Every-0.069
 little-0.063
Children-0.063
aste-0.062
Token what
Token ID644
Feature activation-6.16e-02

Pos logit contributions
chief+0.088
 drown+0.085
 heal+0.083
 elbow+0.078
 rid+0.077
Neg logit contributions
elly-0.118
Every-0.109
umpy-0.102
 year-0.097
grown-0.096
Token it
Token ID340
Feature activation-1.17e-01

Pos logit contributions
chief+0.111
 drown+0.107
 heal+0.101
 rid+0.100
 bleed+0.099
Neg logit contributions
elly-0.089
Every-0.087
 year-0.075
umpy-0.071
Two-0.071
Token was
Token ID373
Feature activation-5.56e-02

Pos logit contributions
chief+0.238
 drown+0.234
 heal+0.228
 elbow+0.222
 rid+0.222
Neg logit contributions
elly-0.137
Every-0.120
umpy-0.111
urious-0.097
oma-0.095
Token.
Token ID13
Feature activation-6.73e-02

Pos logit contributions
chief+0.074
 drown+0.072
 heal+0.069
 elbow+0.067
 rid+0.065
Neg logit contributions
elly-0.106
Every-0.095
umpy-0.094
ubby-0.086
illing-0.085
Token What
Token ID1867
Feature activation-9.66e-03

Pos logit contributions
 heal+0.109
 drown+0.107
chief+0.103
 rid+0.103
 elbow+0.101
Neg logit contributions
elly-0.113
Every-0.104
umpy-0.094
Two-0.093
Children-0.089
Token are
Token ID389
Feature activation-2.62e-04

Pos logit contributions
 heal+0.015
 down+0.014
 drown+0.014
chief+0.014
 soar+0.014
Neg logit contributions
elly-0.016
umpy-0.014
Every-0.014
illing-0.013
ubby-0.013
Token you
Token ID345
Feature activation-3.46e-02

Pos logit contributions
 drown+0.000
 heal+0.000
chief+0.000
 bleed+0.000
ridge+0.000
Neg logit contributions
elly-0.001
Every-0.001
ida-0.000
Children-0.000
 elderly-0.000
Token few
Token ID1178
Feature activation-2.39e-02

Pos logit contributions
chief+0.074
 heal+0.070
 down+0.069
 drown+0.069
 rid+0.069
Neg logit contributions
elly-0.069
Every-0.062
umpy-0.057
Two-0.057
 year-0.057
Token moments
Token ID7188
Feature activation-5.26e-02

Pos logit contributions
 heal+0.043
chief+0.043
 drown+0.042
 down+0.041
 rid+0.041
Neg logit contributions
elly-0.033
Every-0.029
umpy-0.027
Two-0.025
urious-0.025
Token,
Token ID11
Feature activation-4.30e-02

Pos logit contributions
 drown+0.094
 bleed+0.089
chief+0.089
 rid+0.086
 elbow+0.084
Neg logit contributions
elly-0.075
Every-0.069
 year-0.063
grown-0.059
 young-0.059
Token wondering
Token ID11263
Feature activation-4.04e-02

Pos logit contributions
 drown+0.071
chief+0.069
 heal+0.069
 rid+0.067
 bleed+0.065
Neg logit contributions
elly-0.060
Every-0.059
 little-0.055
 young-0.054
 year-0.051
Token what
Token ID644
Feature activation-5.39e-02

Pos logit contributions
 drown+0.067
chief+0.066
 heal+0.065
 rid+0.062
 bleed+0.061
Neg logit contributions
elly-0.061
Every-0.056
umpy-0.050
urious-0.048
iet-0.048
Token it
Token ID340
Feature activation-1.17e-01

Pos logit contributions
chief+0.068
 drown+0.067
 heal+0.066
 rid+0.061
 bleed+0.061
Neg logit contributions
elly-0.105
Every-0.101
umpy-0.089
Two-0.089
 year-0.089
Token was
Token ID373
Feature activation-5.22e-02

Pos logit contributions
chief+0.221
 drown+0.219
 heal+0.218
 rid+0.208
 down+0.207
Neg logit contributions
elly-0.154
Every-0.136
umpy-0.124
urious-0.116
oma-0.113
Token.
Token ID13
Feature activation-6.11e-02

Pos logit contributions
chief+0.072
 drown+0.071
 heal+0.068
 elbow+0.067
 rid+0.064
Neg logit contributions
elly-0.095
Every-0.087
umpy-0.083
illing-0.078
 importance-0.078
Token
Token ID198
Feature activation+2.35e-02

Pos logit contributions
 heal+0.098
 drown+0.096
chief+0.092
 rid+0.092
 bleed+0.090
Neg logit contributions
elly-0.101
Every-0.094
Two-0.085
umpy-0.084
Children-0.082
Token
Token ID198
Feature activation+1.83e-02

Pos logit contributions
elly+0.046
umpy+0.039
ubby+0.038
Every+0.037
urious+0.037
Neg logit contributions
 heal-0.029
chief-0.028
 drown-0.027
 down-0.026
 rid-0.025
TokenJohn
Token ID7554
Feature activation+3.07e-02

Pos logit contributions
elly+0.023
ubby+0.017
umpy+0.017
oma+0.015
urious+0.015
Neg logit contributions
 heal-0.035
 drown-0.034
chief-0.033
 soar-0.033
 down-0.032
Token be
Token ID307
Feature activation-3.48e-02

Pos logit contributions
elly+0.048
ubby+0.039
umpy+0.039
ida+0.036
Every+0.036
Neg logit contributions
 heal-0.056
chief-0.054
 rid-0.053
ridge-0.052
 down-0.051
Token free
Token ID1479
Feature activation-5.48e-02

Pos logit contributions
chief+0.061
 heal+0.058
 drown+0.056
ridge+0.056
 down+0.053
Neg logit contributions
elly-0.050
Every-0.046
umpy-0.043
ubby-0.042
 young-0.040
Token to
Token ID284
Feature activation-9.92e-02

Pos logit contributions
chief+0.080
 heal+0.074
 drown+0.073
 elbow+0.070
ridge+0.070
Neg logit contributions
elly-0.104
umpy-0.090
Every-0.090
 year-0.082
ubby-0.081
Token fly
Token ID6129
Feature activation-6.18e-02

Pos logit contributions
chief+0.135
ridge+0.120
 elbow+0.114
atmeal+0.111
imet+0.109
Neg logit contributions
Every-0.175
 little-0.163
elly-0.161
 young-0.161
umpy-0.153
Token wherever
Token ID14530
Feature activation-7.59e-02

Pos logit contributions
chief+0.087
ridge+0.084
 drown+0.081
 elbow+0.079
atmeal+0.079
Neg logit contributions
elly-0.112
Every-0.104
umpy-0.101
 year-0.095
iet-0.094
Token it
Token ID340
Feature activation-1.17e-01

Pos logit contributions
chief+0.108
ridge+0.103
 elbow+0.099
 bleed+0.096
 rid+0.096
Neg logit contributions
Every-0.129
 little-0.119
elly-0.118
 year-0.116
 young-0.114
Token wants
Token ID3382
Feature activation-4.65e-02

Pos logit contributions
chief+0.202
 elbow+0.189
ridge+0.183
 rid+0.183
 drown+0.182
Neg logit contributions
elly-0.165
umpy-0.154
Every-0.148
 little-0.143
urious-0.140
Token."
Token ID526
Feature activation-3.89e-02

Pos logit contributions
chief+0.073
 drown+0.071
 elbow+0.068
 bleed+0.065
ridge+0.064
Neg logit contributions
elly-0.080
Every-0.072
 year-0.068
umpy-0.068
 little-0.066
Token
Token ID198
Feature activation+3.15e-02

Pos logit contributions
 heal+0.079
 drown+0.078
 rid+0.077
chief+0.076
 bleed+0.074
Neg logit contributions
Every-0.045
elly-0.045
Children-0.037
oma-0.036
Two-0.035
Token
Token ID198
Feature activation+2.68e-02

Pos logit contributions
elly+0.051
Every+0.045
umpy+0.043
urious+0.040
oma+0.040
Neg logit contributions
 heal-0.051
chief-0.050
 drown-0.049
 down-0.048
 rid-0.047
TokenThe
Token ID464
Feature activation+1.27e-02

Pos logit contributions
elly+0.028
Every+0.021
umpy+0.021
ubby+0.018
urious+0.018
Neg logit contributions
 heal-0.058
chief-0.057
 drown-0.056
 down-0.055
 rid-0.055
Token!
Token ID0
Feature activation-7.43e-02

Pos logit contributions
 heal+0.005
chief+0.005
 down+0.005
 drown+0.005
 rid+0.004
Neg logit contributions
elly-0.006
Every-0.005
umpy-0.005
Two-0.005
urious-0.005
Token How
Token ID1374
Feature activation+6.25e-03

Pos logit contributions
 heal+0.096
 drown+0.095
chief+0.093
 down+0.090
 rid+0.088
Neg logit contributions
elly-0.146
Every-0.131
umpy-0.125
Two-0.120
Children-0.117
Token much
Token ID881
Feature activation-4.72e-02

Pos logit contributions
elly+0.010
Every+0.008
umpy+0.008
oma+0.008
Children+0.008
Neg logit contributions
 heal-0.010
 drown-0.010
chief-0.010
 down-0.010
 rid-0.009
Token harder
Token ID7069
Feature activation-5.58e-02

Pos logit contributions
chief+0.071
 heal+0.070
 drown+0.069
 down+0.067
 rid+0.066
Neg logit contributions
elly-0.081
Every-0.072
umpy-0.068
urious-0.064
Two-0.064
Token can
Token ID460
Feature activation-5.94e-02

Pos logit contributions
 heal+0.077
chief+0.076
 drown+0.076
 rid+0.073
 bleed+0.070
Neg logit contributions
elly-0.101
Every-0.091
umpy-0.085
ubby-0.080
illing-0.080
Token it
Token ID340
Feature activation-1.17e-01

Pos logit contributions
chief+0.061
 drown+0.059
 Grad+0.058
 down+0.058
stack+0.057
Neg logit contributions
Every-0.109
 little-0.108
elly-0.107
 young-0.104
 elderly-0.099
Token be
Token ID307
Feature activation-4.21e-02

Pos logit contributions
chief+0.162
 elbow+0.151
stack+0.140
 drown+0.138
 rid+0.135
Neg logit contributions
elly-0.201
Every-0.198
umpy-0.194
 little-0.183
 young-0.181
Token to
Token ID284
Feature activation-8.22e-02

Pos logit contributions
chief+0.069
 drown+0.064
 elbow+0.062
 heal+0.060
stack+0.060
Neg logit contributions
elly-0.067
Every-0.062
umpy-0.058
 young-0.056
 little-0.054
Token wrap
Token ID14441
Feature activation-4.16e-02

Pos logit contributions
chief+0.104
 down+0.092
 drown+0.092
 elbow+0.089
ridge+0.088
Neg logit contributions
Every-0.148
elly-0.147
 young-0.134
 little-0.134
 elderly-0.133
Token a
Token ID257
Feature activation-5.05e-02

Pos logit contributions
 drown+0.059
chief+0.050
 bleed+0.050
 elbow+0.049
 heal+0.048
Neg logit contributions
elly-0.077
Every-0.070
aste-0.065
iet-0.064
umpy-0.064
Token book
Token ID1492
Feature activation-7.29e-02

Pos logit contributions
 drown+0.079
chief+0.078
 heal+0.075
 down+0.073
 rid+0.071
Neg logit contributions
elly-0.079
Every-0.071
umpy-0.066
 little-0.065
 year-0.065
Token,
Token ID11
Feature activation-3.33e-02

Pos logit contributions
 drown+0.053
chief+0.050
ridge+0.047
 elbow+0.047
 heal+0.046
Neg logit contributions
elly-0.057
Every-0.053
umpy-0.047
 importance-0.047
grown-0.046
Token it
Token ID340
Feature activation-5.88e-02

Pos logit contributions
 drown+0.040
chief+0.039
 heal+0.038
ridge+0.035
 elbow+0.035
Neg logit contributions
elly-0.063
Every-0.060
 little-0.056
 young-0.055
 elderly-0.054
Token could
Token ID714
Feature activation-6.28e-02

Pos logit contributions
 drown+0.105
chief+0.105
 heal+0.100
 elbow+0.099
 down+0.099
Neg logit contributions
elly-0.075
Every-0.071
umpy-0.064
urious-0.062
 importance-0.061
Token do
Token ID466
Feature activation-6.41e-02

Pos logit contributions
chief+0.084
 down+0.079
ridge+0.077
 elbow+0.077
imet+0.075
Neg logit contributions
Every-0.112
 little-0.095
 young-0.094
One-0.092
elly-0.091
Token anything
Token ID1997
Feature activation-5.65e-02

Pos logit contributions
chief+0.091
 drown+0.088
 down+0.083
stack+0.083
 elbow+0.083
Neg logit contributions
Every-0.104
elly-0.103
 importance-0.094
 little-0.092
 year-0.092
Token it
Token ID340
Feature activation-1.17e-01

Pos logit contributions
chief+0.099
 elbow+0.095
ridge+0.090
 drown+0.090
stack+0.089
Neg logit contributions
elly-0.084
Every-0.081
 little-0.074
umpy-0.073
 young-0.071
Token wanted
Token ID2227
Feature activation-3.07e-02

Pos logit contributions
chief+0.205
 drown+0.195
 elbow+0.190
ridge+0.181
 mast+0.181
Neg logit contributions
elly-0.158
Every-0.152
umpy-0.143
 importance-0.133
 meant-0.131
Token.
Token ID13
Feature activation-5.42e-02

Pos logit contributions
 drown+0.045
chief+0.045
 elbow+0.041
 bleed+0.041
 heal+0.040
Neg logit contributions
elly-0.055
Every-0.049
umpy-0.046
 year-0.046
aste-0.044
Token<|endoftext|>
Token ID50256
Feature activation+4.25e-02

Pos logit contributions
 drown+0.096
 heal+0.096
chief+0.091
 bleed+0.091
 rid+0.090
Neg logit contributions
elly-0.083
Every-0.078
Two-0.067
umpy-0.067
Children-0.066
TokenOnce
Token ID7454
Feature activation+6.49e-02

Pos logit contributions
elly+0.077
Every+0.068
umpy+0.066
urious+0.064
iet+0.063
Neg logit contributions
 heal-0.058
 down-0.057
chief-0.056
 drown-0.056
 rid-0.053
Token upon
Token ID2402
Feature activation+3.38e-02

Pos logit contributions
elly+0.085
Every+0.071
umpy+0.067
urious+0.062
Two+0.062
Neg logit contributions
 heal-0.123
chief-0.122
 drown-0.120
 down-0.119
 rid-0.116
Token we
Token ID356
Feature activation-6.80e-02

Pos logit contributions
 drown+0.028
chief+0.027
 bleed+0.025
 heal+0.025
 rid+0.023
Neg logit contributions
elly-0.061
Every-0.059
 little-0.055
 young-0.054
 elderly-0.053
Token can
Token ID460
Feature activation-6.98e-02

Pos logit contributions
chief+0.117
 elbow+0.107
 heal+0.107
 drown+0.106
ridge+0.102
Neg logit contributions
elly-0.100
Every-0.092
umpy-0.088
Two-0.083
ubby-0.081
Token sew
Token ID34249
Feature activation-3.01e-02

Pos logit contributions
chief+0.073
 drown+0.071
 down+0.067
 elbow+0.066
 heal+0.064
Neg logit contributions
elly-0.143
Every-0.139
umpy-0.128
Two-0.127
ubby-0.123
Token some
Token ID617
Feature activation-4.60e-02

Pos logit contributions
 drown+0.039
chief+0.036
 heal+0.034
 elbow+0.032
 bleed+0.032
Neg logit contributions
elly-0.058
Every-0.052
oma-0.049
umpy-0.049
Children-0.048
Token buttons
Token ID12163
Feature activation-5.49e-02

Pos logit contributions
 drown+0.075
chief+0.075
 heal+0.069
 down+0.067
 rid+0.065
Neg logit contributions
elly-0.074
Every-0.066
umpy-0.059
 year-0.057
Children-0.057
Token or
Token ID393
Feature activation-1.17e-01

Pos logit contributions
 drown+0.102
chief+0.095
 elbow+0.089
 heal+0.089
 rid+0.088
Neg logit contributions
elly-0.082
Every-0.067
umpy-0.065
ubby-0.059
aste-0.057
Token rib
Token ID12183
Feature activation-4.21e-02

Pos logit contributions
chief+0.172
 drown+0.166
stack+0.161
ridge+0.151
 heal+0.148
Neg logit contributions
elly-0.179
Every-0.175
 little-0.166
Some-0.162
 young-0.158
Tokenbons
Token ID23461
Feature activation-5.33e-02

Pos logit contributions
 drown+0.058
 heal+0.057
chief+0.055
 down+0.054
 rid+0.053
Neg logit contributions
elly-0.077
Every-0.070
umpy-0.066
oma-0.063
Two-0.063
Token or
Token ID393
Feature activation-1.24e-01

Pos logit contributions
 drown+0.086
chief+0.081
 heal+0.076
 elbow+0.076
 bleed+0.075
Neg logit contributions
elly-0.090
Every-0.078
umpy-0.075
ubby-0.069
aste-0.068
Token patches
Token ID16082
Feature activation-5.75e-02

Pos logit contributions
 drown+0.194
chief+0.192
stack+0.187
 heal+0.175
 rid+0.172
Neg logit contributions
elly-0.180
Every-0.171
 little-0.162
Some-0.156
Two-0.153
Token.
Token ID13
Feature activation-7.37e-02

Pos logit contributions
 drown+0.093
chief+0.090
 elbow+0.087
 bleed+0.081
 heal+0.080
Neg logit contributions
elly-0.098
Every-0.086
umpy-0.085
ubby-0.075
aste-0.075
Token game
Token ID983
Feature activation-3.81e-02

Pos logit contributions
 heal+0.082
 drown+0.078
 down+0.075
chief+0.074
 rid+0.074
Neg logit contributions
elly-0.085
Every-0.072
umpy-0.071
urious-0.068
illing-0.068
Token or
Token ID393
Feature activation-1.08e-01

Pos logit contributions
chief+0.053
 heal+0.051
 drown+0.051
 down+0.047
 rid+0.047
Neg logit contributions
elly-0.072
Every-0.064
umpy-0.061
Two-0.057
illing-0.057
Token read
Token ID1100
Feature activation-3.42e-02

Pos logit contributions
chief+0.166
ridge+0.144
stack+0.139
 heal+0.138
 drown+0.137
Neg logit contributions
elly-0.170
Every-0.167
 little-0.159
Some-0.156
 young-0.155
Token a
Token ID257
Feature activation-6.00e-02

Pos logit contributions
chief+0.044
 drown+0.043
 heal+0.042
 rid+0.040
 elbow+0.038
Neg logit contributions
elly-0.066
Every-0.060
umpy-0.056
Children-0.054
oma-0.054
Token book
Token ID1492
Feature activation-6.79e-02

Pos logit contributions
 heal+0.092
chief+0.088
 drown+0.087
 down+0.087
 rid+0.084
Neg logit contributions
elly-0.102
Every-0.089
umpy-0.086
urious-0.082
illing-0.081
Token or
Token ID393
Feature activation-1.16e-01

Pos logit contributions
chief+0.086
 heal+0.085
 drown+0.084
 rid+0.079
 down+0.078
Neg logit contributions
elly-0.134
Every-0.120
umpy-0.116
Two-0.108
oma-0.108
Token draw
Token ID3197
Feature activation-4.10e-02

Pos logit contributions
chief+0.201
 drown+0.181
 heal+0.177
stack+0.177
ridge+0.175
Neg logit contributions
elly-0.153
Every-0.148
 little-0.142
Some-0.138
 young-0.136
Token a
Token ID257
Feature activation-5.34e-02

Pos logit contributions
 drown+0.067
chief+0.066
 heal+0.061
 rid+0.061
 elbow+0.058
Neg logit contributions
elly-0.067
Every-0.059
umpy-0.054
oma-0.053
Children-0.052
Token picture
Token ID4286
Feature activation-8.22e-02

Pos logit contributions
chief+0.085
 drown+0.083
 heal+0.083
 rid+0.080
 down+0.079
Neg logit contributions
elly-0.086
Every-0.075
umpy-0.072
Two-0.069
oma-0.067
Token.
Token ID13
Feature activation-7.26e-02

Pos logit contributions
chief+0.143
 drown+0.139
 rid+0.124
 elbow+0.123
 heal+0.122
Neg logit contributions
elly-0.140
Every-0.119
umpy-0.114
 year-0.109
aste-0.103
Token Maybe
Token ID6674
Feature activation-2.40e-02

Pos logit contributions
 heal+0.085
chief+0.084
 drown+0.082
 down+0.079
 rid+0.078
Neg logit contributions
elly-0.149
Every-0.135
umpy-0.129
Two-0.124
Children-0.123
Token.
Token ID13
Feature activation-6.08e-02

Pos logit contributions
 heal+0.019
chief+0.019
 down+0.018
 drown+0.018
 rid+0.017
Neg logit contributions
elly-0.032
Every-0.028
umpy-0.027
iet-0.026
oma-0.026
Token That
Token ID1320
Feature activation-2.62e-02

Pos logit contributions
 heal+0.097
 drown+0.095
 rid+0.095
chief+0.093
 bleed+0.089
Neg logit contributions
elly-0.102
Every-0.095
umpy-0.086
Two-0.085
Children-0.081
Token's
Token ID338
Feature activation-1.54e-02

Pos logit contributions
chief+0.041
 drown+0.040
 heal+0.040
 rid+0.039
 down+0.038
Neg logit contributions
elly-0.042
Every-0.038
umpy-0.035
urious-0.034
Two-0.034
Token how
Token ID703
Feature activation-1.75e-02

Pos logit contributions
 heal+0.023
chief+0.022
 drown+0.022
 down+0.022
 rid+0.021
Neg logit contributions
elly-0.027
Every-0.024
umpy-0.023
urious-0.022
Two-0.021
Token heavy
Token ID4334
Feature activation-6.41e-02

Pos logit contributions
chief+0.020
stack+0.019
ridge+0.018
 drown+0.018
 heal+0.018
Neg logit contributions
elly-0.036
Every-0.034
 young-0.032
 little-0.032
umpy-0.032
Token it
Token ID340
Feature activation-1.15e-01

Pos logit contributions
 drown+0.083
ridge+0.076
chief+0.075
 rid+0.074
 bleed+0.074
Neg logit contributions
elly-0.109
Every-0.104
 little-0.102
 meant-0.095
 young-0.094
Token was
Token ID373
Feature activation-4.14e-02

Pos logit contributions
chief+0.217
 elbow+0.201
stack+0.200
 drown+0.198
 rid+0.192
Neg logit contributions
elly-0.132
umpy-0.120
Every-0.119
 meant-0.119
 little-0.117
Token!
Token ID0
Feature activation-6.62e-02

Pos logit contributions
chief+0.064
 elbow+0.063
 drown+0.061
ridge+0.058
stack+0.056
Neg logit contributions
elly-0.065
Every-0.060
umpy-0.057
 little-0.055
 young-0.055
Token
Token ID198
Feature activation+2.79e-02

Pos logit contributions
 heal+0.106
 drown+0.105
chief+0.102
 rid+0.100
 down+0.098
Neg logit contributions
elly-0.111
Every-0.102
umpy-0.093
Two-0.091
Children-0.088
Token
Token ID198
Feature activation+2.42e-02

Pos logit contributions
elly+0.052
umpy+0.044
Every+0.044
ubby+0.042
urious+0.042
Neg logit contributions
 heal-0.037
chief-0.036
 drown-0.036
 down-0.034
 rid-0.034
Token"
Token ID1
Feature activation-7.43e-03

Pos logit contributions
elly+0.030
umpy+0.022
ubby+0.022
oma+0.020
urious+0.020
Neg logit contributions
 heal-0.046
 drown-0.045
chief-0.044
 soar-0.044
 rid-0.043
Token will
Token ID481
Feature activation-1.13e-01

Pos logit contributions
 heal+0.036
chief+0.036
 drown+0.035
 down+0.035
 rid+0.034
Neg logit contributions
elly-0.045
Every-0.040
umpy-0.038
urious-0.036
oma-0.036
Token put
Token ID1234
Feature activation-5.87e-02

Pos logit contributions
chief+0.168
ridge+0.158
atmeal+0.142
 elbow+0.142
imet+0.142
Neg logit contributions
Every-0.180
 little-0.177
 young-0.169
 year-0.160
 kind-0.157
Token it
Token ID340
Feature activation-9.66e-02

Pos logit contributions
 drown+0.092
stack+0.082
 soar+0.080
 spine+0.076
 heal+0.076
Neg logit contributions
elly-0.087
able-0.082
 little-0.081
 two-0.078
grown-0.075
Token on
Token ID319
Feature activation-7.47e-02

Pos logit contributions
chief+0.134
 drown+0.123
ridge+0.112
stack+0.111
 Grad+0.109
Neg logit contributions
elly-0.164
 little-0.154
Every-0.152
umpy-0.144
able-0.141
Token my
Token ID616
Feature activation-7.42e-02

Pos logit contributions
 drown+0.103
chief+0.102
 heal+0.093
 bleed+0.092
ridge+0.092
Neg logit contributions
elly-0.131
Every-0.126
 little-0.115
 year-0.114
oma-0.111
Token paper
Token ID3348
Feature activation-1.15e-01

Pos logit contributions
 rid+0.104
 drown+0.103
chief+0.100
 Grad+0.097
ridge+0.096
Neg logit contributions
 little-0.118
elly-0.107
 child-0.105
 elderly-0.104
 young-0.100
Token when
Token ID618
Feature activation-8.02e-02

Pos logit contributions
 drown+0.199
chief+0.186
 rid+0.180
 heal+0.179
 bleed+0.179
Neg logit contributions
elly-0.174
Every-0.140
 year-0.134
umpy-0.134
iet-0.132
Token I
Token ID314
Feature activation-7.73e-02

Pos logit contributions
chief+0.093
 bleed+0.088
 rid+0.087
 drown+0.086
 heal+0.084
Neg logit contributions
elly-0.142
 little-0.142
Every-0.138
 young-0.134
 year-0.132
Token finish
Token ID5461
Feature activation-7.02e-02

Pos logit contributions
chief+0.119
 down+0.111
 heal+0.109
 drown+0.108
 elbow+0.107
Neg logit contributions
elly-0.125
Every-0.112
umpy-0.109
Two-0.104
 year-0.098
Token.
Token ID13
Feature activation-7.55e-02

Pos logit contributions
 drown+0.091
 heal+0.084
 bleed+0.084
chief+0.082
stack+0.081
Neg logit contributions
elly-0.126
Every-0.116
grown-0.111
 little-0.110
 year-0.109
Token It
Token ID632
Feature activation-7.80e-02

Pos logit contributions
 drown+0.099
 heal+0.099
chief+0.098
 down+0.096
 rid+0.095
Neg logit contributions
elly-0.143
Every-0.132
umpy-0.121
Two-0.121
Children-0.115
Token I
Token ID314
Feature activation-6.31e-02

Pos logit contributions
chief+0.104
 drown+0.094
 heal+0.093
 elbow+0.092
ridge+0.089
Neg logit contributions
Every-0.151
elly-0.142
Two-0.136
 young-0.134
 little-0.134
Token'll
Token ID1183
Feature activation-6.73e-02

Pos logit contributions
chief+0.082
 heal+0.081
 drown+0.079
 down+0.077
 rid+0.074
Neg logit contributions
elly-0.120
Every-0.109
umpy-0.105
Two-0.099
urious-0.098
Token show
Token ID905
Feature activation-4.73e-02

Pos logit contributions
chief+0.086
 drown+0.081
 down+0.078
ridge+0.077
 elbow+0.076
Neg logit contributions
Every-0.119
elly-0.113
umpy-0.105
 little-0.103
ida-0.103
Token you
Token ID345
Feature activation-5.25e-02

Pos logit contributions
 drown+0.051
chief+0.045
 bleed+0.043
 heal+0.042
 hyd+0.040
Neg logit contributions
elly-0.096
Every-0.095
 little-0.088
 young-0.087
Children-0.087
Token how
Token ID703
Feature activation-3.55e-02

Pos logit contributions
chief+0.084
 drown+0.083
 bleed+0.078
 heal+0.078
 elbow+0.077
Neg logit contributions
Every-0.082
elly-0.082
 little-0.071
 young-0.070
umpy-0.069
Token it
Token ID340
Feature activation-1.14e-01

Pos logit contributions
chief+0.048
 drown+0.044
 heal+0.044
stack+0.044
 elbow+0.042
Neg logit contributions
elly-0.070
Every-0.065
umpy-0.062
 young-0.059
grown-0.058
Token's
Token ID338
Feature activation-5.65e-02

Pos logit contributions
chief+0.211
 drown+0.195
 elbow+0.195
stack+0.193
 mast+0.185
Neg logit contributions
elly-0.154
Every-0.141
umpy-0.138
 importance-0.130
urious-0.128
Token done
Token ID1760
Feature activation-6.75e-02

Pos logit contributions
chief+0.089
 drown+0.083
 elbow+0.079
ridge+0.076
 heal+0.075
Neg logit contributions
elly-0.091
Every-0.085
umpy-0.078
 importance-0.077
 elderly-0.077
Token".
Token ID1911
Feature activation-3.34e-02

Pos logit contributions
 drown+0.087
chief+0.087
 heal+0.080
 elbow+0.078
 bleed+0.077
Neg logit contributions
elly-0.133
Every-0.121
umpy-0.116
 year-0.109
aste-0.108
Token So
Token ID1406
Feature activation+2.00e-03

Pos logit contributions
 drown+0.062
 heal+0.061
 rid+0.059
 bleed+0.059
chief+0.057
Neg logit contributions
elly-0.044
Every-0.042
oma-0.037
Two-0.037
Children-0.037
Token the
Token ID262
Feature activation+1.81e-02

Pos logit contributions
 little+0.003
Every+0.003
elly+0.003
 young+0.003
oph+0.003
Neg logit contributions
 drown-0.003
 bleed-0.003
 heal-0.003
chief-0.003
 elbow-0.003
Token,
Token ID11
Feature activation-3.46e-02

Pos logit contributions
chief+0.035
 drown+0.034
 heal+0.033
 rid+0.033
 bleed+0.032
Neg logit contributions
elly-0.037
Every-0.033
 year-0.030
 little-0.030
 young-0.030
Token imagination
Token ID13843
Feature activation+6.10e-03

Pos logit contributions
chief+0.042
 elbow+0.038
 drown+0.037
 heal+0.037
 bleed+0.036
Neg logit contributions
elly-0.066
Every-0.064
 little-0.062
 young-0.062
 elderly-0.058
Token can
Token ID460
Feature activation-6.97e-02

Pos logit contributions
elly+0.011
Every+0.010
umpy+0.009
ubby+0.008
oma+0.008
Neg logit contributions
 drown-0.009
chief-0.009
 rid-0.009
 heal-0.008
 elbow-0.008
Token include
Token ID2291
Feature activation-2.63e-02

Pos logit contributions
chief+0.101
 elbow+0.096
 down+0.090
ridge+0.087
 drown+0.086
Neg logit contributions
Every-0.121
One-0.108
Two-0.102
 little-0.102
elly-0.101
Token anything
Token ID1997
Feature activation-6.28e-02

Pos logit contributions
chief+0.037
 drown+0.035
 heal+0.033
 bleed+0.032
ridge+0.032
Neg logit contributions
elly-0.047
Every-0.043
Children-0.040
 little-0.040
 young-0.039
Token we
Token ID356
Feature activation-1.14e-01

Pos logit contributions
chief+0.101
 elbow+0.090
stack+0.089
ridge+0.088
 drown+0.088
Neg logit contributions
elly-0.109
Every-0.098
 little-0.093
umpy-0.092
 importantly-0.092
Token want
Token ID765
Feature activation-5.98e-02

Pos logit contributions
chief+0.182
 elbow+0.162
 drown+0.158
 heal+0.156
 down+0.152
Neg logit contributions
elly-0.176
Every-0.170
umpy-0.159
ubby-0.152
Two-0.152
Token."
Token ID526
Feature activation-3.24e-02

Pos logit contributions
chief+0.091
 drown+0.085
 elbow+0.081
 bleed+0.078
 heal+0.078
Neg logit contributions
elly-0.110
Every-0.098
umpy-0.092
 year-0.090
aste-0.087
Token\n
Token ID198
Feature activation+4.04e-02

Pos logit contributions
chief+0.070
 heal+0.069
 drown+0.068
 rid+0.067
ridge+0.067
Neg logit contributions
elly-0.034
Every-0.032
Children-0.028
Two-0.026
 young-0.025
Token\n
Token ID198
Feature activation+3.47e-02

Pos logit contributions
elly+0.070
Every+0.061
umpy+0.059
urious+0.056
Two+0.056
Neg logit contributions
 heal-0.060
chief-0.058
 drown-0.058
 down-0.056
 rid-0.055
TokenThe
Token ID464
Feature activation+1.54e-02

Pos logit contributions
elly+0.037
umpy+0.027
Every+0.026
ubby+0.025
illing+0.024
Neg logit contributions
 heal-0.074
 drown-0.072
chief-0.071
 rid-0.070
 down-0.069
Token friends
Token ID2460
Feature activation-4.01e-02

Pos logit contributions
 drown+0.037
chief+0.036
 heal+0.035
 elbow+0.034
 bleed+0.033
Neg logit contributions
elly-0.045
Every-0.040
umpy-0.038
grown-0.037
aste-0.037
Token all
Token ID477
Feature activation-5.35e-02

Pos logit contributions
chief+0.064
 drown+0.063
 heal+0.061
 bleed+0.058
 rid+0.058
Neg logit contributions
elly-0.063
Every-0.058
umpy-0.053
 year-0.051
grown-0.050
Token the
Token ID262
Feature activation-3.99e-02

Pos logit contributions
chief+0.079
 drown+0.077
 heal+0.077
 elbow+0.072
ridge+0.070
Neg logit contributions
elly-0.094
Every-0.086
umpy-0.081
urious-0.076
 year-0.075
Token exciting
Token ID7895
Feature activation-3.32e-02

Pos logit contributions
 heal+0.069
 drown+0.066
 down+0.065
chief+0.065
 rid+0.064
Neg logit contributions
elly-0.061
Every-0.052
umpy-0.050
oma-0.047
illing-0.046
Token things
Token ID1243
Feature activation-6.66e-02

Pos logit contributions
chief+0.055
 drown+0.055
 heal+0.055
 rid+0.051
 bleed+0.050
Neg logit contributions
elly-0.052
Every-0.045
umpy-0.043
 year-0.041
urious-0.041
Token it
Token ID340
Feature activation-1.14e-01

Pos logit contributions
chief+0.107
 elbow+0.106
 drown+0.105
ridge+0.097
 heal+0.095
Neg logit contributions
elly-0.101
Every-0.092
umpy-0.086
 meant-0.084
 year-0.083
Token had
Token ID550
Feature activation-6.93e-02

Pos logit contributions
chief+0.230
 drown+0.224
 elbow+0.221
 heal+0.219
 bleed+0.211
Neg logit contributions
elly-0.133
Every-0.121
umpy-0.111
 importance-0.098
 year-0.097
Token found
Token ID1043
Feature activation-3.09e-02

Pos logit contributions
 heal+0.097
chief+0.096
 drown+0.094
 down+0.091
 rid+0.089
Neg logit contributions
elly-0.126
Every-0.112
umpy-0.107
urious-0.101
Two-0.101
Token during
Token ID1141
Feature activation-6.15e-02

Pos logit contributions
 heal+0.039
chief+0.039
 drown+0.039
 down+0.036
 rid+0.035
Neg logit contributions
elly-0.060
Every-0.054
umpy-0.052
urious-0.049
oma-0.049
Token its
Token ID663
Feature activation-5.51e-02

Pos logit contributions
chief+0.079
 heal+0.077
 drown+0.074
 down+0.072
ridge+0.069
Neg logit contributions
elly-0.112
Every-0.107
grown-0.097
 year-0.097
 little-0.095
Token journey
Token ID7002
Feature activation-3.72e-02

Pos logit contributions
chief+0.087
 down+0.084
 drown+0.083
 heal+0.083
 elbow+0.081
Neg logit contributions
elly-0.084
Every-0.080
 year-0.073
umpy-0.072
urious-0.071
Token us
Token ID514
Feature activation-1.65e-02

Pos logit contributions
elly+0.012
umpy+0.010
Every+0.010
Children+0.009
ubby+0.009
Neg logit contributions
chief-0.009
 drown-0.009
 heal-0.009
 elbow-0.008
ridge-0.008
Token with
Token ID351
Feature activation-1.17e-02

Pos logit contributions
chief+0.022
 drown+0.020
 elbow+0.019
 heal+0.019
 rid+0.018
Neg logit contributions
elly-0.032
Every-0.029
umpy-0.029
 year-0.026
aste-0.026
Token his
Token ID465
Feature activation-3.75e-03

Pos logit contributions
chief+0.012
 drown+0.012
 heal+0.012
 down+0.012
 soar+0.011
Neg logit contributions
elly-0.025
Every-0.023
umpy-0.021
Children-0.021
ubby-0.021
Token candy
Token ID18550
Feature activation-9.69e-03

Pos logit contributions
 drown+0.006
 heal+0.006
chief+0.005
 rid+0.005
 down+0.005
Neg logit contributions
elly-0.006
 little-0.005
Every-0.005
umpy-0.005
 young-0.005
Token."
Token ID526
Feature activation-3.82e-03

Pos logit contributions
 drown+0.012
chief+0.012
 heal+0.011
 rid+0.011
 down+0.010
Neg logit contributions
elly-0.020
Every-0.017
umpy-0.017
 year-0.016
oma-0.016
Token\n
Token ID198
Feature activation+7.31e-02

Pos logit contributions
 heal+0.007
chief+0.007
 drown+0.007
 rid+0.007
ridge+0.007
Neg logit contributions
elly-0.005
Every-0.005
umpy-0.004
Two-0.004
Children-0.004
Token\n
Token ID198
Feature activation+6.47e-02

Pos logit contributions
elly+0.129
Every+0.113
umpy+0.109
urious+0.103
Two+0.103
Neg logit contributions
 heal-0.107
chief-0.104
 drown-0.103
 down-0.101
 rid-0.097
Token"
Token ID1
Feature activation+2.25e-02

Pos logit contributions
elly+0.087
ubby+0.073
umpy+0.069
illing+0.066
oph+0.066
Neg logit contributions
 heal-0.116
 down-0.113
 he-0.110
 we-0.109
chief-0.109
TokenStop
Token ID19485
Feature activation+3.15e-02

Pos logit contributions
elly+0.037
ubby+0.034
illing+0.031
umpy+0.031
grown+0.031
Neg logit contributions
 heal-0.035
 he-0.033
 rid-0.032
 soar-0.032
 we-0.031
Token it
Token ID340
Feature activation-1.50e-02

Pos logit contributions
elly+0.054
 little+0.049
oma+0.048
Every+0.046
 young+0.046
Neg logit contributions
 heal-0.040
 drown-0.039
 rid-0.039
chief-0.037
 Grad-0.037
Token,
Token ID11
Feature activation+2.35e-02

Pos logit contributions
 heal+0.016
chief+0.016
 drown+0.016
 down+0.015
 rid+0.015
Neg logit contributions
elly-0.032
Every-0.029
umpy-0.028
urious-0.027
Two-0.027
Token\n
Token ID198
Feature activation+5.06e-02

Pos logit contributions
elly+0.122
umpy+0.104
oph+0.102
ubby+0.102
urious+0.101
Neg logit contributions
 heal-0.065
 drown-0.063
chief-0.059
 down-0.058
 breathe-0.058
TokenShe
Token ID3347
Feature activation-1.64e-02

Pos logit contributions
elly+0.078
oph+0.066
 importance+0.065
 elderly+0.065
oma+0.064
Neg logit contributions
 breathe-0.080
 heal-0.075
 soar-0.075
 drown-0.074
 he-0.073
Token decided
Token ID3066
Feature activation+7.56e-03

Pos logit contributions
 heal+0.022
 down+0.021
chief+0.021
 back+0.020
 drown+0.020
Neg logit contributions
elly-0.032
Every-0.028
umpy-0.028
ubby-0.026
illing-0.026
Token to
Token ID284
Feature activation-3.08e-02

Pos logit contributions
elly+0.013
Every+0.011
 little+0.011
Some+0.011
Sometimes+0.011
Neg logit contributions
 drown-0.011
chief-0.010
ridge-0.010
 heal-0.010
 glide-0.010
Token tell
Token ID1560
Feature activation+1.53e-02

Pos logit contributions
chief+0.038
 down+0.036
ridge+0.034
 elbow+0.034
imet+0.034
Neg logit contributions
Every-0.053
elly-0.053
 little-0.050
 year-0.049
Two-0.048
Token P
Token ID350
Feature activation+8.41e-02

Pos logit contributions
elly+0.026
Every+0.024
 little+0.022
oma+0.022
iet+0.022
Neg logit contributions
 drown-0.020
chief-0.020
 heal-0.020
 down-0.019
ridge-0.019
Tokenao
Token ID5488
Feature activation+4.49e-02

Pos logit contributions
elly+0.112
Every+0.104
Two+0.092
umpy+0.090
Children+0.089
Neg logit contributions
 heal-0.148
chief-0.145
 down-0.142
 drown-0.142
 rid-0.137
Token how
Token ID703
Feature activation+2.01e-02

Pos logit contributions
elly+0.089
Every+0.080
umpy+0.077
oma+0.073
ubby+0.073
Neg logit contributions
 heal-0.055
 drown-0.054
chief-0.053
 down-0.050
 rid-0.050
Token she
Token ID673
Feature activation-4.04e-02

Pos logit contributions
elly+0.035
Every+0.031
umpy+0.030
illing+0.027
Two+0.027
Neg logit contributions
chief-0.031
 heal-0.030
 drown-0.030
 down-0.029
 elbow-0.028
Token felt
Token ID2936
Feature activation+2.19e-02

Pos logit contributions
 heal+0.063
 down+0.059
chief+0.058
 drown+0.058
 rid+0.057
Neg logit contributions
elly-0.068
Every-0.060
umpy-0.057
ubby-0.055
Two-0.054
Token.
Token ID13
Feature activation-3.90e-02

Pos logit contributions
elly+0.037
umpy+0.031
Every+0.031
 little+0.029
 young+0.028
Neg logit contributions
chief-0.037
ridge-0.033
 drown-0.033
 elbow-0.032
 heal-0.032
Token\n
Token ID198
Feature activation+5.78e-02

Pos logit contributions
elly+0.135
umpy+0.115
Every+0.114
urious+0.111
illing+0.111
Neg logit contributions
 heal-0.088
 drown-0.081
 down-0.080
chief-0.080
 rid-0.077
TokenThe
Token ID464
Feature activation+3.80e-02

Pos logit contributions
elly+0.107
ubby+0.090
oph+0.087
umpy+0.087
ited+0.085
Neg logit contributions
 we-0.083
 heal-0.083
 breathe-0.077
 build-0.076
 back-0.076
Token lady
Token ID10846
Feature activation+3.60e-02

Pos logit contributions
elly+0.080
urious+0.068
umpy+0.068
oph+0.067
illed+0.066
Neg logit contributions
 heal-0.048
 back-0.047
 float-0.044
 soar-0.044
 drown-0.043
Token smiled
Token ID13541
Feature activation+6.36e-02

Pos logit contributions
elly+0.064
Every+0.057
umpy+0.055
ubby+0.052
illing+0.051
Neg logit contributions
 heal-0.054
 down-0.052
chief-0.052
 back-0.049
 drown-0.048
Token and
Token ID290
Feature activation-1.71e-02

Pos logit contributions
elly+0.118
Every+0.106
Two+0.102
umpy+0.101
ida+0.101
Neg logit contributions
 down-0.088
 heal-0.085
 back-0.081
chief-0.079
 drown-0.075
Token praised
Token ID15342
Feature activation+7.21e-02

Pos logit contributions
 heal+0.028
chief+0.027
 down+0.027
 drown+0.027
 rid+0.026
Neg logit contributions
elly-0.028
Every-0.024
umpy-0.023
urious-0.022
Two-0.021
Token them
Token ID606
Feature activation+1.86e-02

Pos logit contributions
elly+0.160
Every+0.145
umpy+0.139
Children+0.135
oma+0.134
Neg logit contributions
 heal-0.072
chief-0.071
 drown-0.071
 down-0.064
 rid-0.063
Token.
Token ID13
Feature activation-1.53e-02

Pos logit contributions
elly+0.035
Every+0.031
umpy+0.030
oma+0.029
illing+0.028
Neg logit contributions
chief-0.025
 drown-0.025
 heal-0.024
 rid-0.022
 soar-0.022
Token "
Token ID366
Feature activation+3.96e-02

Pos logit contributions
 down+0.015
 he+0.013
 back+0.012
chief+0.012
 heal+0.012
Neg logit contributions
elly-0.034
urious-0.031
ited-0.031
ubby-0.031
iet-0.031
TokenThat
Token ID2504
Feature activation+4.62e-03

Pos logit contributions
elly+0.060
ubby+0.050
iet+0.049
urious+0.049
umpy+0.048
Neg logit contributions
 heal-0.066
chief-0.062
 he-0.061
 we-0.061
 soar-0.061
Token's
Token ID338
Feature activation+2.58e-02

Pos logit contributions
elly+0.009
Every+0.008
umpy+0.007
urious+0.007
Two+0.007
Neg logit contributions
 heal-0.006
chief-0.006
 drown-0.006
 down-0.006
 rid-0.006
TokenOnce
Token ID7454
Feature activation+7.04e-02

Pos logit contributions
elly+0.103
Every+0.090
umpy+0.088
urious+0.086
iet+0.084
Neg logit contributions
 down-0.074
 heal-0.073
chief-0.071
 drown-0.070
 rid-0.067
Token upon
Token ID2402
Feature activation+4.58e-02

Pos logit contributions
elly+0.083
Every+0.070
umpy+0.063
ubby+0.058
urious+0.058
Neg logit contributions
 heal-0.146
chief-0.144
 drown-0.142
 rid-0.138
ridge-0.135
Token a
Token ID257
Feature activation+6.41e-02

Pos logit contributions
elly+0.087
umpy+0.074
Every+0.073
ubby+0.071
urious+0.070
Neg logit contributions
 heal-0.063
 drown-0.059
 down-0.059
chief-0.059
 rid-0.056
Token time
Token ID640
Feature activation+4.41e-02

Pos logit contributions
elly+0.095
year+0.083
Children+0.082
iring+0.081
umpy+0.081
Neg logit contributions
 heal-0.096
 sleep-0.094
 drown-0.094
 float-0.091
 soar-0.091
Token there
Token ID612
Feature activation+2.95e-02

Pos logit contributions
elly+0.081
Every+0.073
umpy+0.068
Two+0.065
ubby+0.065
Neg logit contributions
 heal-0.062
chief-0.060
 drown-0.059
 rid-0.058
 down-0.056
Token was
Token ID373
Feature activation+7.57e-02

Pos logit contributions
elly+0.049
umpy+0.041
Every+0.040
ubby+0.038
iet+0.038
Neg logit contributions
 heal-0.047
 down-0.046
chief-0.046
 drown-0.045
 build-0.043
Token a
Token ID257
Feature activation+6.06e-02

Pos logit contributions
elly+0.153
umpy+0.137
urious+0.130
Every+0.129
ida+0.128
Neg logit contributions
 heal-0.088
 down-0.087
 soar-0.082
 drown-0.081
chief-0.081
Token girl
Token ID2576
Feature activation+4.92e-02

Pos logit contributions
elly+0.066
Every+0.052
umpy+0.048
urious+0.048
If+0.045
Neg logit contributions
 heal-0.130
 down-0.126
 float-0.124
 drown-0.123
 we-0.123
Token called
Token ID1444
Feature activation+3.88e-02

Pos logit contributions
elly+0.080
Every+0.070
umpy+0.068
urious+0.065
Children+0.064
Neg logit contributions
 heal-0.078
 down-0.074
chief-0.074
 drown-0.073
 rid-0.071
Token Mary
Token ID5335
Feature activation+5.73e-02

Pos logit contributions
elly+0.078
Every+0.070
umpy+0.068
urious+0.065
Two+0.064
Neg logit contributions
 heal-0.047
chief-0.045
 drown-0.045
 down-0.044
 rid-0.042
Token.
Token ID13
Feature activation-7.95e-03

Pos logit contributions
elly+0.122
Every+0.110
umpy+0.107
Two+0.102
urious+0.102
Neg logit contributions
 heal-0.063
chief-0.061
 drown-0.059
 down-0.058
 rid-0.056
Token was
Token ID373
Feature activation+5.68e-02

Pos logit contributions
elly+0.036
umpy+0.031
ubby+0.030
iet+0.029
illing+0.029
Neg logit contributions
 heal-0.027
 down-0.026
 him-0.026
 he-0.025
 drown-0.025
Token a
Token ID257
Feature activation+4.25e-02

Pos logit contributions
elly+0.118
umpy+0.107
urious+0.103
ida+0.101
oma+0.101
Neg logit contributions
 down-0.062
 heal-0.060
 out-0.058
 soar-0.058
 drown-0.057
Token little
Token ID1310
Feature activation+3.00e-02

Pos logit contributions
elly+0.047
Every+0.039
urious+0.036
If+0.035
umpy+0.035
Neg logit contributions
 heal-0.089
 down-0.086
 we-0.085
 float-0.085
 drown-0.084
Token fish
Token ID5916
Feature activation+2.59e-02

Pos logit contributions
elly+0.035
Every+0.029
umpy+0.027
urious+0.025
oma+0.024
Neg logit contributions
 heal-0.061
chief-0.060
 drown-0.060
 down-0.059
 rid-0.058
Token named
Token ID3706
Feature activation+1.88e-02

Pos logit contributions
elly+0.041
Every+0.035
umpy+0.034
urious+0.032
oma+0.032
Neg logit contributions
 heal-0.042
 drown-0.041
chief-0.041
 down-0.041
 rid-0.039
Token Bub
Token ID19645
Feature activation+1.03e-01

Pos logit contributions
elly+0.036
Every+0.034
 little+0.032
ubby+0.032
aste+0.032
Neg logit contributions
 heal-0.022
 elbow-0.022
chief-0.022
 bleed-0.021
 drown-0.021
Tokenbles
Token ID7689
Feature activation+3.83e-02

Pos logit contributions
elly+0.194
Every+0.180
umpy+0.165
Two+0.164
Children+0.161
Neg logit contributions
 heal-0.129
 drown-0.124
 down-0.121
chief-0.120
 rid-0.120
Token who
Token ID508
Feature activation-1.07e-03

Pos logit contributions
elly+0.082
Every+0.074
umpy+0.071
urious+0.068
Two+0.068
Neg logit contributions
 heal-0.041
chief-0.040
 drown-0.039
 down-0.039
 rid-0.037
Token lived
Token ID5615
Feature activation+5.67e-02

Pos logit contributions
 heal+0.002
chief+0.002
 drown+0.002
 down+0.002
 rid+0.002
Neg logit contributions
elly-0.001
Every-0.001
umpy-0.001
urious-0.001
Two-0.001
Token in
Token ID287
Feature activation-1.72e-02

Pos logit contributions
elly+0.109
Every+0.099
umpy+0.095
ubby+0.091
illing+0.089
Neg logit contributions
chief-0.075
 heal-0.073
 drown-0.071
 rid-0.064
 down-0.063
Token the
Token ID262
Feature activation-2.79e-02

Pos logit contributions
chief+0.027
 heal+0.027
 drown+0.026
 down+0.025
 elbow+0.025
Neg logit contributions
elly-0.027
Every-0.026
umpy-0.022
Two-0.022
ubby-0.022
Token.
Token ID13
Feature activation-4.73e-02

Pos logit contributions
chief+0.080
 heal+0.080
 rid+0.079
 drown+0.079
 elbow+0.076
Neg logit contributions
elly-0.064
Every-0.052
umpy-0.052
 year-0.049
illing-0.049
Token They
Token ID1119
Feature activation-3.17e-02

Pos logit contributions
 heal+0.073
chief+0.071
 drown+0.069
 rid+0.068
 down+0.068
Neg logit contributions
elly-0.081
Every-0.072
umpy-0.068
oma-0.065
Two-0.065
Token wanted
Token ID2227
Feature activation-1.27e-03

Pos logit contributions
 heal+0.053
chief+0.052
 drown+0.052
 down+0.051
 rid+0.050
Neg logit contributions
elly-0.048
Every-0.042
umpy-0.040
Two-0.037
urious-0.037
Token to
Token ID284
Feature activation-6.00e-02

Pos logit contributions
chief+0.002
 drown+0.001
 heal+0.001
ridge+0.001
 soar+0.001
Neg logit contributions
elly-0.002
Every-0.002
Children-0.002
 little-0.002
grown-0.002
Token go
Token ID467
Feature activation-1.85e-02

Pos logit contributions
chief+0.084
ridge+0.081
imet+0.075
 down+0.075
 elbow+0.071
Neg logit contributions
Every-0.100
 little-0.095
Tom-0.090
 young-0.090
elly-0.090
Token home
Token ID1363
Feature activation-1.23e-03

Pos logit contributions
chief+0.022
ridge+0.022
 elbow+0.020
 heal+0.019
imet+0.018
Neg logit contributions
elly-0.038
umpy-0.035
Every-0.034
 year-0.033
S-0.032
Token and
Token ID290
Feature activation-6.78e-02

Pos logit contributions
chief+0.002
 heal+0.002
 drown+0.002
ridge+0.002
 rid+0.002
Neg logit contributions
elly-0.002
Every-0.002
umpy-0.002
 year-0.002
oma-0.002
Token wash
Token ID13502
Feature activation+3.87e-03

Pos logit contributions
ridge+0.108
chief+0.106
imet+0.105
pan+0.091
atmeal+0.091
Neg logit contributions
 little-0.098
 kind-0.096
 friendly-0.092
Every-0.091
 young-0.091
Token the
Token ID262
Feature activation-3.47e-02

Pos logit contributions
elly+0.007
Every+0.006
able+0.006
umpy+0.006
 little+0.006
Neg logit contributions
chief-0.005
 drown-0.005
ridge-0.005
 soar-0.005
 heal-0.005
Token paint
Token ID7521
Feature activation-6.77e-02

Pos logit contributions
 heal+0.050
chief+0.050
 drown+0.049
 down+0.048
 rid+0.046
Neg logit contributions
elly-0.061
Every-0.053
umpy-0.052
urious-0.049
Two-0.049
Token off
Token ID572
Feature activation-3.04e-02

Pos logit contributions
 drown+0.119
chief+0.113
 rid+0.104
ridge+0.103
 heal+0.099
Neg logit contributions
elly-0.104
Every-0.086
 year-0.084
umpy-0.083
able-0.079
Token puppy
Token ID26188
Feature activation-2.49e-02

Pos logit contributions
elly+0.010
year+0.009
illing+0.008
urious+0.008
ud+0.008
Neg logit contributions
 down-0.008
 back-0.008
 heal-0.008
chief-0.007
 out-0.007
Token very
Token ID845
Feature activation+2.09e-02

Pos logit contributions
chief+0.038
 drown+0.038
 elbow+0.036
ridge+0.036
 heal+0.035
Neg logit contributions
elly-0.040
Every-0.038
 little-0.034
umpy-0.034
 young-0.033
Token much
Token ID881
Feature activation+1.14e-02

Pos logit contributions
elly+0.034
Children+0.029
iet+0.029
Every+0.029
oma+0.028
Neg logit contributions
 heal-0.032
 drown-0.031
 down-0.030
chief-0.030
 rid-0.029
Token,
Token ID11
Feature activation-1.30e-02

Pos logit contributions
elly+0.023
Every+0.021
umpy+0.020
Two+0.020
urious+0.020
Neg logit contributions
 heal-0.014
chief-0.013
 down-0.013
 drown-0.013
 rid-0.012
Token but
Token ID475
Feature activation-5.38e-04

Pos logit contributions
 drown+0.022
 heal+0.022
chief+0.022
 rid+0.022
ridge+0.021
Neg logit contributions
elly-0.018
Every-0.017
 little-0.015
umpy-0.015
 young-0.015
Token the
Token ID262
Feature activation+2.82e-02

Pos logit contributions
 drown+0.001
 heal+0.001
 rid+0.001
 bleed+0.001
chief+0.001
Neg logit contributions
Every-0.001
elly-0.001
 little-0.001
 young-0.001
 child-0.001
Token mother
Token ID2802
Feature activation+3.10e-02

Pos logit contributions
elly+0.059
umpy+0.050
illing+0.049
lic+0.048
year+0.048
Neg logit contributions
 back-0.038
 down-0.038
 heal-0.038
chief-0.036
 he-0.034
Token said
Token ID531
Feature activation+3.34e-02

Pos logit contributions
elly+0.058
Every+0.049
umpy+0.047
Two+0.047
ubby+0.046
Neg logit contributions
 heal-0.044
chief-0.043
 down-0.042
 he-0.040
 we-0.038
Token no
Token ID645
Feature activation+4.24e-02

Pos logit contributions
elly+0.066
Every+0.062
umpy+0.057
 little+0.057
 year+0.056
Neg logit contributions
 drown-0.037
 heal-0.036
chief-0.035
 rid-0.035
 down-0.034
Token because
Token ID780
Feature activation-3.53e-03

Pos logit contributions
elly+0.091
Every+0.085
umpy+0.081
Two+0.080
iet+0.080
Neg logit contributions
 heal-0.044
 down-0.042
chief-0.041
 drown-0.039
 rid-0.037
Token they
Token ID484
Feature activation-2.33e-02

Pos logit contributions
 heal+0.003
chief+0.003
 drown+0.003
 down+0.003
 rid+0.002
Neg logit contributions
elly-0.008
Every-0.008
umpy-0.008
Two-0.007
Children-0.007
Token stones
Token ID14966
Feature activation-3.95e-02

Pos logit contributions
 heal+0.009
chief+0.009
 drown+0.009
 rid+0.009
 down+0.009
Neg logit contributions
elly-0.012
Every-0.011
umpy-0.010
Two-0.010
urious-0.010
Token.
Token ID13
Feature activation-2.47e-02

Pos logit contributions
chief+0.071
ridge+0.068
 drown+0.064
 heal+0.064
 elbow+0.063
Neg logit contributions
elly-0.057
Every-0.051
umpy-0.047
 year-0.044
y-0.044
Token They
Token ID1119
Feature activation-1.00e-02

Pos logit contributions
 down+0.026
chief+0.024
 heal+0.024
 drown+0.024
 build+0.023
Neg logit contributions
elly-0.053
Every-0.048
umpy-0.047
ubby-0.046
iet-0.046
Token were
Token ID547
Feature activation+4.57e-02

Pos logit contributions
 heal+0.014
 down+0.013
 drown+0.013
 build+0.013
 rid+0.013
Neg logit contributions
elly-0.020
umpy-0.017
Every-0.016
ubby-0.016
urious-0.016
Token not
Token ID407
Feature activation+1.61e-02

Pos logit contributions
elly+0.084
Every+0.075
umpy+0.072
urious+0.068
Two+0.068
Neg logit contributions
 heal-0.063
chief-0.062
 drown-0.061
 down-0.060
 rid-0.058
Token friends
Token ID2460
Feature activation+1.31e-02

Pos logit contributions
elly+0.026
Every+0.022
umpy+0.021
urious+0.020
Two+0.020
Neg logit contributions
 heal-0.026
chief-0.026
 drown-0.026
 down-0.025
 rid-0.024
Token anymore
Token ID7471
Feature activation+4.15e-03

Pos logit contributions
elly+0.026
Every+0.023
umpy+0.023
urious+0.021
oma+0.021
Neg logit contributions
 heal-0.016
chief-0.016
 drown-0.016
 down-0.015
 rid-0.015
Token.
Token ID13
Feature activation-2.57e-02

Pos logit contributions
elly+0.009
Every+0.008
umpy+0.008
urious+0.008
Two+0.008
Neg logit contributions
 heal-0.004
 drown-0.004
 down-0.004
chief-0.004
 rid-0.004
Token In
Token ID554
Feature activation+6.52e-03

Pos logit contributions
 down+0.022
 build+0.019
chief+0.019
 drown+0.019
 heal+0.018
Neg logit contributions
elly-0.059
ubby-0.054
Every-0.053
iet-0.053
umpy-0.052
Token the
Token ID262
Feature activation+5.76e-03

Pos logit contributions
elly+0.017
umpy+0.015
Every+0.015
urious+0.014
illing+0.014
Neg logit contributions
 heal-0.005
 drown-0.005
chief-0.005
 down-0.005
 rid-0.004
Token end
Token ID886
Feature activation+3.74e-02

Pos logit contributions
elly+0.011
umpy+0.009
year+0.009
illing+0.009
ubby+0.009
Neg logit contributions
 heal-0.008
 down-0.008
 drown-0.008
 back-0.008
 build-0.008
Token the
Token ID262
Feature activation+1.71e-02

Pos logit contributions
 heal+0.001
 drown+0.001
chief+0.001
 rid+0.001
 bleed+0.001
Neg logit contributions
elly-0.001
Every-0.001
 year-0.001
oma-0.001
umpy-0.001
Token next
Token ID1306
Feature activation+4.15e-02

Pos logit contributions
elly+0.029
Every+0.026
umpy+0.024
Two+0.023
urious+0.023
Neg logit contributions
 heal-0.026
chief-0.026
 drown-0.026
 down-0.025
 rid-0.025
Token day
Token ID1110
Feature activation+1.96e-02

Pos logit contributions
elly+0.067
Every+0.058
umpy+0.055
urious+0.052
oma+0.052
Neg logit contributions
 heal-0.067
chief-0.066
 drown-0.065
 down-0.065
 rid-0.063
Token,
Token ID11
Feature activation+1.20e-02

Pos logit contributions
elly+0.038
Every+0.035
umpy+0.033
Two+0.032
urious+0.031
Neg logit contributions
 heal-0.025
chief-0.024
 drown-0.024
 rid-0.023
 down-0.023
Token it
Token ID340
Feature activation-1.76e-02

Pos logit contributions
Every+0.025
elly+0.024
 little+0.023
 year+0.023
 young+0.022
Neg logit contributions
 drown-0.013
chief-0.012
 heal-0.012
 bleed-0.012
 elbow-0.011
Token was
Token ID373
Feature activation+2.54e-02

Pos logit contributions
chief+0.032
 heal+0.032
 drown+0.031
 rid+0.031
 elbow+0.030
Neg logit contributions
elly-0.023
Every-0.020
umpy-0.019
urious-0.018
oma-0.017
Token raining
Token ID43079
Feature activation+2.03e-02

Pos logit contributions
elly+0.035
Every+0.034
umpy+0.030
 year+0.029
Sometimes+0.028
Neg logit contributions
chief-0.044
 heal-0.043
 drown-0.042
 rid-0.040
ridge-0.040
Token outside
Token ID2354
Feature activation-2.73e-03

Pos logit contributions
elly+0.036
Every+0.032
umpy+0.031
 year+0.029
Two+0.029
Neg logit contributions
chief-0.029
 heal-0.029
 drown-0.028
 rid-0.027
 elbow-0.026
Token and
Token ID290
Feature activation-4.00e-02

Pos logit contributions
chief+0.005
 drown+0.004
 heal+0.004
ridge+0.004
 rid+0.004
Neg logit contributions
elly-0.004
Every-0.004
 year-0.004
umpy-0.004
 little-0.003
Token John
Token ID1757
Feature activation+2.54e-02

Pos logit contributions
chief+0.048
 heal+0.047
 drown+0.045
 rid+0.044
ridge+0.044
Neg logit contributions
Every-0.069
elly-0.066
 little-0.066
 year-0.064
grown-0.063
Token was
Token ID373
Feature activation+2.74e-02

Pos logit contributions
elly+0.039
Every+0.034
umpy+0.032
urious+0.030
Two+0.030
Neg logit contributions
 heal-0.043
chief-0.042
 drown-0.042
 down-0.041
 rid-0.040
Token!"
Token ID2474
Feature activation+1.44e-02

Pos logit contributions
elly+0.009
Every+0.008
umpy+0.008
 year+0.008
 young+0.008
Neg logit contributions
chief-0.007
 elbow-0.007
 bleed-0.007
 drown-0.007
 heal-0.007
Token she
Token ID673
Feature activation+6.36e-03

Pos logit contributions
elly+0.022
Every+0.022
 little+0.020
So+0.018
Two+0.018
Neg logit contributions
 drown-0.024
 heal-0.023
 elbow-0.022
chief-0.022
ridge-0.022
Token said
Token ID531
Feature activation+4.12e-02

Pos logit contributions
elly+0.010
Every+0.008
umpy+0.008
urious+0.008
Two+0.008
Neg logit contributions
 heal-0.011
chief-0.011
 drown-0.010
 down-0.010
 rid-0.010
Token.
Token ID13
Feature activation-6.02e-03

Pos logit contributions
elly+0.057
Every+0.055
 little+0.052
 year+0.051
year+0.051
Neg logit contributions
 drown-0.069
ridge-0.067
chief-0.064
atmeal-0.061
 mast-0.060
Token\n
Token ID198
Feature activation+7.03e-02

Pos logit contributions
 heal+0.009
 drown+0.009
chief+0.009
 rid+0.009
 elbow+0.009
Neg logit contributions
elly-0.010
Every-0.009
umpy-0.008
Two-0.008
Children-0.008
Token\n
Token ID198
Feature activation+6.50e-02

Pos logit contributions
elly+0.134
umpy+0.114
Every+0.112
ubby+0.110
urious+0.109
Neg logit contributions
 heal-0.095
 drown-0.089
chief-0.088
 down-0.088
 rid-0.083
TokenSo
Token ID2396
Feature activation+2.72e-02

Pos logit contributions
elly+0.095
ubby+0.077
oma+0.077
umpy+0.075
grown+0.070
Neg logit contributions
 heal-0.119
 we-0.111
 soar-0.110
 down-0.107
 breathe-0.107
Token they
Token ID484
Feature activation-1.92e-03

Pos logit contributions
elly+0.047
Every+0.045
 year+0.039
 little+0.039
iet+0.039
Neg logit contributions
 drown-0.039
 heal-0.038
chief-0.038
 elbow-0.036
 rid-0.035
Token all
Token ID477
Feature activation+5.86e-03

Pos logit contributions
 down+0.003
 heal+0.002
chief+0.002
 back+0.002
 build+0.002
Neg logit contributions
elly-0.004
umpy-0.003
Every-0.003
urious-0.003
oma-0.003
Token packed
Token ID11856
Feature activation+3.13e-02

Pos logit contributions
elly+0.012
Every+0.010
umpy+0.010
oma+0.009
Two+0.009
Neg logit contributions
 down-0.007
 heal-0.007
chief-0.007
 drown-0.007
 rid-0.007
Token their
Token ID511
Feature activation-8.86e-03

Pos logit contributions
elly+0.069
Every+0.063
umpy+0.059
oma+0.058
Children+0.058
Neg logit contributions
 heal-0.031
 drown-0.031
chief-0.031
 rid-0.027
 soar-0.027
Token saw
Token ID2497
Feature activation+3.88e-02

Pos logit contributions
elly+0.001
umpy+0.001
Every+0.001
ubby+0.001
illing+0.001
Neg logit contributions
 heal-0.001
 down-0.001
chief-0.001
 drown-0.001
 build-0.001
Token the
Token ID262
Feature activation+7.96e-03

Pos logit contributions
elly+0.089
Every+0.081
umpy+0.081
urious+0.077
Two+0.076
Neg logit contributions
 heal-0.037
 down-0.035
chief-0.035
 drown-0.034
 rid-0.033
Token cat
Token ID3797
Feature activation-1.66e-02

Pos logit contributions
elly+0.014
Every+0.013
year+0.012
umpy+0.012
udging+0.012
Neg logit contributions
 heal-0.012
 drown-0.011
 down-0.011
 rid-0.011
 back-0.010
Token picking
Token ID10868
Feature activation+1.51e-02

Pos logit contributions
 heal+0.025
chief+0.025
 drown+0.025
 rid+0.024
 down+0.023
Neg logit contributions
elly-0.028
Every-0.024
umpy-0.023
urious-0.022
oma-0.022
Token flowers
Token ID12734
Feature activation-4.16e-02

Pos logit contributions
elly+0.030
Every+0.026
umpy+0.025
oma+0.023
aste+0.023
Neg logit contributions
 heal-0.019
 drown-0.019
chief-0.019
 elbow-0.017
 build-0.017
Token.
Token ID13
Feature activation-1.89e-02

Pos logit contributions
chief+0.073
 drown+0.072
 elbow+0.072
 heal+0.068
ridge+0.066
Neg logit contributions
elly-0.063
Every-0.055
umpy-0.049
illing-0.047
aste-0.046
Token\n
Token ID198
Feature activation+6.43e-02

Pos logit contributions
 drown+0.028
 heal+0.027
chief+0.027
 rid+0.026
 elbow+0.026
Neg logit contributions
elly-0.033
Every-0.030
umpy-0.029
Two-0.028
 little-0.028
Token\n
Token ID198
Feature activation+5.64e-02

Pos logit contributions
elly+0.131
iet+0.112
urious+0.112
ubby+0.112
oma+0.110
Neg logit contributions
 heal-0.073
 down-0.072
 he-0.066
 rid-0.066
 drown-0.065
TokenThe
Token ID464
Feature activation+4.48e-02

Pos logit contributions
elly+0.087
oma+0.070
oph+0.070
ubby+0.068
 importance+0.067
Neg logit contributions
 heal-0.093
 down-0.090
 he-0.088
 build-0.085
 rid-0.083
Token mean
Token ID1612
Feature activation+7.02e-02

Pos logit contributions
elly+0.091
oph+0.082
Every+0.080
udging+0.076
ud+0.074
Neg logit contributions
 heal-0.060
 he-0.057
 down-0.056
 back-0.055
 drown-0.053
Token man
Token ID582
Feature activation+2.03e-02

Pos logit contributions
Every+0.141
elly+0.140
 importantly+0.128
urious+0.128
Girls+0.125
Neg logit contributions
 down-0.080
 he-0.080
 heal-0.079
 out-0.069
 back-0.069
Token and
Token ID290
Feature activation-3.77e-02

Pos logit contributions
elly+0.011
Every+0.010
umpy+0.009
illing+0.009
 year+0.009
Neg logit contributions
 drown-0.008
 heal-0.008
chief-0.008
 rid-0.007
 bleed-0.007
Token they
Token ID484
Feature activation-1.95e-02

Pos logit contributions
 drown+0.062
chief+0.062
 heal+0.058
 bleed+0.058
ridge+0.057
Neg logit contributions
Every-0.051
 elderly-0.049
 little-0.049
elly-0.048
 young-0.047
Token all
Token ID477
Feature activation-9.98e-03

Pos logit contributions
 heal+0.030
chief+0.030
 drown+0.029
 down+0.028
 rid+0.028
Neg logit contributions
elly-0.033
Every-0.029
umpy-0.027
Two-0.026
urious-0.026
Token had
Token ID550
Feature activation-5.65e-03

Pos logit contributions
chief+0.018
 heal+0.018
 drown+0.017
 rid+0.016
ridge+0.016
Neg logit contributions
elly-0.014
Every-0.013
umpy-0.012
illing-0.011
 year-0.011
Token lots
Token ID6041
Feature activation+4.88e-02

Pos logit contributions
 heal+0.008
chief+0.008
 drown+0.008
 down+0.008
 rid+0.008
Neg logit contributions
elly-0.010
Every-0.009
umpy-0.008
urious-0.008
Two-0.008
Token of
Token ID286
Feature activation-1.67e-02

Pos logit contributions
elly+0.065
oma+0.057
urious+0.056
Every+0.056
iet+0.056
Neg logit contributions
 heal-0.087
 down-0.084
chief-0.080
 rid-0.080
 he-0.080
Token fun
Token ID1257
Feature activation+2.49e-02

Pos logit contributions
 heal+0.023
 down+0.022
 rid+0.022
chief+0.021
 him+0.021
Neg logit contributions
elly-0.032
umpy-0.028
iet-0.027
urious-0.026
Every-0.026
Token.
Token ID13
Feature activation-1.46e-02

Pos logit contributions
elly+0.052
Every+0.046
umpy+0.045
urious+0.042
Two+0.042
Neg logit contributions
 heal-0.029
chief-0.029
 drown-0.028
 down-0.027
 rid-0.026
Token On
Token ID1550
Feature activation+2.07e-03

Pos logit contributions
 heal+0.022
 drown+0.022
chief+0.021
 rid+0.021
 bleed+0.020
Neg logit contributions
elly-0.026
Every-0.024
umpy-0.022
Two-0.021
oma-0.021
Token the
Token ID262
Feature activation-2.91e-03

Pos logit contributions
elly+0.005
Every+0.004
umpy+0.004
urious+0.004
Two+0.004
Neg logit contributions
 heal-0.002
chief-0.002
 drown-0.002
 down-0.002
 rid-0.002
Token way
Token ID835
Feature activation+1.48e-02

Pos logit contributions
 down+0.003
chief+0.003
 heal+0.003
 back+0.003
 build+0.003
Neg logit contributions
elly-0.007
umpy-0.006
year-0.006
ubby-0.006
grown-0.006
Token to
Token ID284
Feature activation-2.42e-02

Pos logit contributions
elly+0.071
Every+0.068
 elderly+0.066
 little+0.066
 young+0.063
Neg logit contributions
chief-0.063
 drown-0.060
 heal-0.059
stack-0.056
 rid-0.055
Token rock
Token ID3881
Feature activation+8.49e-03

Pos logit contributions
chief+0.032
 down+0.030
 rid+0.029
imet+0.029
 drown+0.028
Neg logit contributions
Every-0.041
 little-0.038
elly-0.036
 year-0.035
 young-0.035
Token back
Token ID736
Feature activation-5.81e-04

Pos logit contributions
elly+0.012
Every+0.011
 little+0.010
 year+0.010
able+0.010
Neg logit contributions
 drown-0.014
chief-0.014
 mast-0.013
 elbow-0.013
imet-0.013
Token and
Token ID290
Feature activation-3.81e-02

Pos logit contributions
chief+0.001
 drown+0.001
 heal+0.001
 elbow+0.001
 bleed+0.001
Neg logit contributions
elly-0.001
Every-0.001
umpy-0.001
 year-0.001
urious-0.001
Token forth
Token ID6071
Feature activation+4.21e-03

Pos logit contributions
 drown+0.052
imet+0.049
chief+0.049
pan+0.048
ridge+0.045
Neg logit contributions
Every-0.060
 little-0.057
 liked-0.056
 kind-0.054
 young-0.052
Token on
Token ID319
Feature activation-1.64e-02

Pos logit contributions
Every+0.005
elly+0.005
 little+0.004
is+0.004
 year+0.004
Neg logit contributions
 drown-0.009
 bleed-0.008
chief-0.008
ridge-0.007
 rid-0.007
Token his
Token ID465
Feature activation-7.19e-03

Pos logit contributions
 drown+0.020
chief+0.020
 heal+0.020
 rid+0.018
 bleed+0.018
Neg logit contributions
elly-0.030
Every-0.030
 little-0.027
 year-0.027
grown-0.026
Token four
Token ID1440
Feature activation+2.27e-02

Pos logit contributions
 drown+0.011
 rid+0.011
chief+0.010
 heal+0.010
pan+0.010
Neg logit contributions
 little-0.011
Every-0.010
urious-0.010
 elderly-0.009
 year-0.009
Token legs
Token ID7405
Feature activation+1.55e-02

Pos logit contributions
elly+0.037
Every+0.035
umpy+0.032
 year+0.031
urious+0.031
Neg logit contributions
 drown-0.036
 heal-0.035
chief-0.035
 rid-0.034
stack-0.032
Token while
Token ID981
Feature activation+1.93e-02

Pos logit contributions
elly+0.030
Every+0.027
umpy+0.026
illing+0.025
Two+0.024
Neg logit contributions
 heal-0.019
 drown-0.019
chief-0.019
 rid-0.018
 down-0.017
Token relaxing
Token ID28175
Feature activation+1.32e-02

Pos logit contributions
elly+0.036
Every+0.036
 little+0.033
grown+0.032
 young+0.032
Neg logit contributions
chief-0.023
 heal-0.023
 drown-0.022
 bleed-0.021
 rid-0.021
Token!
Token ID0
Feature activation-4.87e-02

Pos logit contributions
 drown+0.046
chief+0.046
 heal+0.043
 rid+0.042
 elbow+0.042
Neg logit contributions
elly-0.045
Every-0.041
umpy-0.038
 little-0.036
 year-0.036
Token This
Token ID770
Feature activation+1.32e-02

Pos logit contributions
 drown+0.068
 heal+0.067
chief+0.067
 rid+0.064
 down+0.063
Neg logit contributions
elly-0.089
Every-0.081
umpy-0.076
Two-0.074
illing-0.071
Token is
Token ID318
Feature activation+2.28e-02

Pos logit contributions
elly+0.020
Every+0.018
umpy+0.016
Two+0.016
oma+0.016
Neg logit contributions
chief-0.023
 drown-0.022
 heal-0.022
 rid-0.022
 bleed-0.021
Token much
Token ID881
Feature activation+1.18e-02

Pos logit contributions
elly+0.036
Every+0.035
umpy+0.030
oma+0.030
 elderly+0.029
Neg logit contributions
 drown-0.036
chief-0.035
 heal-0.034
 elbow-0.033
 rid-0.032
Token better
Token ID1365
Feature activation+1.02e-02

Pos logit contributions
elly+0.020
Every+0.018
umpy+0.017
Two+0.016
urious+0.016
Neg logit contributions
chief-0.018
 drown-0.018
 heal-0.018
 down-0.017
 rid-0.016
Token and
Token ID290
Feature activation-3.80e-02

Pos logit contributions
elly+0.016
Every+0.013
umpy+0.013
 year+0.013
 young+0.012
Neg logit contributions
chief-0.018
 heal-0.017
 drown-0.017
 bleed-0.016
 elbow-0.016
Token more
Token ID517
Feature activation+6.77e-03

Pos logit contributions
chief+0.048
 drown+0.048
 elbow+0.043
ridge+0.042
pan+0.042
Neg logit contributions
Every-0.068
elly-0.066
 young-0.062
 elderly-0.061
 little-0.060
Token comfortable
Token ID6792
Feature activation-8.56e-03

Pos logit contributions
elly+0.009
Every+0.008
umpy+0.008
 importantly+0.008
 young+0.008
Neg logit contributions
chief-0.013
 drown-0.012
 heal-0.012
 rid-0.012
stack-0.011
Token".
Token ID1911
Feature activation-7.06e-03

Pos logit contributions
chief+0.014
 drown+0.013
 heal+0.013
 rid+0.012
 bleed+0.012
Neg logit contributions
elly-0.014
 year-0.012
umpy-0.012
Every-0.012
 elderly-0.011
Token His
Token ID2399
Feature activation+4.22e-02

Pos logit contributions
 heal+0.011
 drown+0.010
chief+0.010
ridge+0.010
 rid+0.010
Neg logit contributions
elly-0.012
Every-0.011
umpy-0.010
Two-0.010
illing-0.010
Token dad
Token ID9955
Feature activation+4.55e-02

Pos logit contributions
elly+0.102
Every+0.089
umpy+0.088
If+0.086
Two+0.084
Neg logit contributions
 down-0.039
 elbow-0.038
chief-0.037
 back-0.037
 heal-0.035
Token.
Token ID13
Feature activation-5.40e-02

Pos logit contributions
 drown+0.057
chief+0.056
 heal+0.053
ridge+0.052
 rid+0.051
Neg logit contributions
elly-0.064
Every-0.058
umpy-0.053
 year-0.051
oma-0.049
Token She
Token ID1375
Feature activation-3.82e-02

Pos logit contributions
 drown+0.094
 heal+0.093
chief+0.089
 rid+0.088
 elbow+0.087
Neg logit contributions
elly-0.083
Every-0.075
umpy-0.068
Two-0.066
urious-0.065
Token looked
Token ID3114
Feature activation-1.60e-02

Pos logit contributions
 heal+0.059
chief+0.058
 down+0.057
 rid+0.055
 drown+0.055
Neg logit contributions
elly-0.065
Every-0.055
umpy-0.055
ubby-0.051
oma-0.050
Token all
Token ID477
Feature activation-3.10e-02

Pos logit contributions
 drown+0.022
 heal+0.021
chief+0.021
 soar+0.019
 elbow+0.019
Neg logit contributions
elly-0.030
Every-0.028
umpy-0.026
urious-0.025
grown-0.025
Token over
Token ID625
Feature activation-1.74e-02

Pos logit contributions
 heal+0.031
chief+0.031
 drown+0.031
 rid+0.027
 down+0.027
Neg logit contributions
elly-0.069
Every-0.063
umpy-0.061
urious-0.058
Children-0.057
Token her
Token ID607
Feature activation-2.20e-02

Pos logit contributions
 heal+0.022
chief+0.021
 drown+0.021
 down+0.021
 rid+0.020
Neg logit contributions
elly-0.034
Every-0.031
umpy-0.030
urious-0.028
Two-0.028
Token house
Token ID2156
Feature activation-4.17e-02

Pos logit contributions
 heal+0.034
 drown+0.034
chief+0.034
 down+0.032
 rid+0.032
Neg logit contributions
elly-0.036
Every-0.032
umpy-0.030
urious-0.029
Two-0.028
Token but
Token ID475
Feature activation-2.25e-02

Pos logit contributions
 drown+0.071
 heal+0.069
chief+0.068
 rid+0.066
 down+0.064
Neg logit contributions
elly-0.062
Every-0.055
umpy-0.051
oma-0.048
urious-0.048
Token couldn
Token ID3521
Feature activation-1.94e-02

Pos logit contributions
 drown+0.026
 heal+0.025
chief+0.024
 rid+0.023
 down+0.022
Neg logit contributions
elly-0.045
Every-0.042
umpy-0.039
oma-0.038
urious-0.038
Token't
Token ID470
Feature activation-2.15e-02

Pos logit contributions
 rid+0.029
 mast+0.029
 elbow+0.029
 drown+0.028
 heal+0.028
Neg logit contributions
Every-0.031
elly-0.030
Two-0.028
umpy-0.028
oma-0.027
Token find
Token ID1064
Feature activation-1.39e-02

Pos logit contributions
chief+0.027
ridge+0.024
 elbow+0.023
 down+0.023
 drown+0.023
Neg logit contributions
Every-0.040
elly-0.039
 little-0.037
Two-0.037
 young-0.036
TokenDear
Token ID20266
Feature activation-3.62e-03

Pos logit contributions
 heal+0.005
 rid+0.005
 soar+0.005
 drown+0.005
 we+0.005
Neg logit contributions
elly-0.004
umpy-0.004
ubby-0.004
urious-0.003
grown-0.003
Token God
Token ID1793
Feature activation-3.11e-03

Pos logit contributions
chief+0.005
 heal+0.005
 down+0.005
 drown+0.005
 bleed+0.005
Neg logit contributions
elly-0.006
Every-0.006
Children-0.005
 little-0.005
Two-0.005
Token,
Token ID11
Feature activation-1.43e-02

Pos logit contributions
chief+0.004
 heal+0.004
 drown+0.004
 rid+0.004
 down+0.004
Neg logit contributions
elly-0.006
Every-0.005
umpy-0.005
ubby-0.005
oma-0.005
Token bless
Token ID12012
Feature activation+7.47e-03

Pos logit contributions
chief+0.023
 elbow+0.021
 bleed+0.020
 drown+0.020
 rid+0.019
Neg logit contributions
elly-0.023
 little-0.022
 young-0.022
Every-0.022
 year-0.020
Token us
Token ID514
Feature activation-4.58e-02

Pos logit contributions
elly+0.016
Every+0.015
Children+0.014
 year+0.014
ubby+0.014
Neg logit contributions
chief-0.008
 drown-0.007
 heal-0.007
 bleed-0.006
 rid-0.006
Token and
Token ID290
Feature activation-7.64e-02

Pos logit contributions
chief+0.071
 drown+0.070
 heal+0.067
 rid+0.065
 elbow+0.064
Neg logit contributions
elly-0.077
Every-0.069
umpy-0.066
 year-0.063
ubby-0.063
Token keep
Token ID1394
Feature activation-2.96e-02

Pos logit contributions
chief+0.113
imet+0.100
 drown+0.099
ridge+0.097
atmeal+0.094
Neg logit contributions
 little-0.119
 young-0.118
 kind-0.117
 elderly-0.117
Every-0.116
Token us
Token ID514
Feature activation-5.98e-02

Pos logit contributions
 drown+0.042
chief+0.040
 elbow+0.040
stack+0.039
 bleed+0.039
Neg logit contributions
elly-0.045
 young-0.045
Every-0.044
 little-0.044
Children-0.043
Token safe
Token ID3338
Feature activation-4.73e-02

Pos logit contributions
chief+0.101
 drown+0.093
imet+0.093
stack+0.092
 elbow+0.091
Neg logit contributions
Every-0.082
 little-0.081
 young-0.080
 year-0.080
elly-0.080
Token,"
Token ID553
Feature activation-1.52e-02

Pos logit contributions
chief+0.078
 drown+0.073
 elbow+0.070
 heal+0.068
ridge+0.068
Neg logit contributions
elly-0.077
Every-0.066
umpy-0.066
 year-0.064
 young-0.061
Token they
Token ID484
Feature activation-1.53e-02

Pos logit contributions
 drown+0.024
chief+0.023
 rid+0.022
 heal+0.022
 unst+0.021
Neg logit contributions
Every-0.023
elly-0.022
Children-0.022
 little-0.021
 young-0.021
Token to
Token ID284
Feature activation-5.92e-02

Pos logit contributions
elly+0.005
Every+0.004
 year+0.004
grown+0.004
umpy+0.004
Neg logit contributions
 drown-0.004
chief-0.004
 heal-0.004
 rid-0.003
 bleed-0.003
Token do
Token ID466
Feature activation-5.61e-02

Pos logit contributions
chief+0.074
 drown+0.072
 rid+0.070
 down+0.070
 elbow+0.065
Neg logit contributions
Every-0.107
elly-0.107
Two-0.094
umpy-0.094
 year-0.093
Token more
Token ID517
Feature activation-3.55e-02

Pos logit contributions
chief+0.078
 drown+0.078
 heal+0.075
 rid+0.071
 down+0.071
Neg logit contributions
elly-0.100
Every-0.092
umpy-0.087
 year-0.083
illing-0.081
Token than
Token ID621
Feature activation-3.79e-02

Pos logit contributions
chief+0.054
 drown+0.053
 heal+0.052
 rid+0.049
 down+0.048
Neg logit contributions
elly-0.063
Every-0.055
umpy-0.052
illing-0.049
 year-0.049
Token anything
Token ID1997
Feature activation-5.11e-02

Pos logit contributions
chief+0.048
 drown+0.046
 heal+0.042
 hyd+0.041
ridge+0.040
Neg logit contributions
elly-0.072
Every-0.066
grown-0.062
 little-0.062
 year-0.061
Token-
Token ID12
Feature activation-7.99e-02

Pos logit contributions
chief+0.077
 drown+0.074
 elbow+0.072
ridge+0.069
 mast+0.068
Neg logit contributions
elly-0.091
Every-0.083
umpy-0.078
 year-0.074
able-0.074
Token she
Token ID673
Feature activation-1.02e-01

Pos logit contributions
 drown+0.120
chief+0.119
 heal+0.119
 down+0.111
 elbow+0.109
Neg logit contributions
elly-0.136
Every-0.120
umpy-0.115
urious-0.108
Children-0.107
Token loved
Token ID6151
Feature activation-1.07e-02

Pos logit contributions
 heal+0.207
chief+0.202
 drown+0.200
 down+0.199
 rid+0.195
Neg logit contributions
elly-0.123
Every-0.101
umpy-0.094
ubby-0.087
Two-0.086
Token to
Token ID284
Feature activation-7.05e-02

Pos logit contributions
 drown+0.016
chief+0.016
 heal+0.015
 rid+0.014
 bleed+0.014
Neg logit contributions
elly-0.019
Every-0.017
umpy-0.016
 year-0.015
grown-0.015
Token roll
Token ID4836
Feature activation-1.06e-02

Pos logit contributions
chief+0.097
 drown+0.095
 down+0.093
 heal+0.091
 rid+0.091
Neg logit contributions
elly-0.124
Every-0.117
umpy-0.108
Two-0.103
oma-0.101
Token!
Token ID0
Feature activation-5.71e-02

Pos logit contributions
 drown+0.015
chief+0.015
 heal+0.014
ridge+0.013
 elbow+0.013
Neg logit contributions
elly-0.021
Every-0.018
umpy-0.018
able-0.017
 year-0.016
Token their
Token ID511
Feature activation-4.00e-02

Pos logit contributions
chief+0.050
 drown+0.049
 heal+0.047
 bleed+0.047
 elbow+0.045
Neg logit contributions
elly-0.066
Every-0.061
Children-0.055
 year-0.055
 little-0.055
Token smiles
Token ID21845
Feature activation-2.16e-02

Pos logit contributions
chief+0.066
 heal+0.065
 down+0.063
 drown+0.063
 rid+0.060
Neg logit contributions
elly-0.061
Every-0.053
umpy-0.050
urious-0.048
 importance-0.048
Token and
Token ID290
Feature activation-8.56e-02

Pos logit contributions
chief+0.032
 heal+0.032
 drown+0.032
 rid+0.030
 down+0.029
Neg logit contributions
elly-0.038
Every-0.033
umpy-0.032
ubby-0.030
oma-0.030
Token their
Token ID511
Feature activation-3.54e-02

Pos logit contributions
chief+0.128
 drown+0.122
ridge+0.119
 heal+0.118
 rid+0.114
Neg logit contributions
elly-0.131
Every-0.129
 young-0.117
 elderly-0.115
 little-0.115
Token voices
Token ID10839
Feature activation-1.15e-02

Pos logit contributions
chief+0.060
 heal+0.058
 drown+0.057
 down+0.056
 rid+0.055
Neg logit contributions
elly-0.052
Every-0.046
umpy-0.042
 importance-0.042
urious-0.042
Token and
Token ID290
Feature activation-8.70e-02

Pos logit contributions
chief+0.017
 drown+0.017
 elbow+0.016
ridge+0.015
 rid+0.015
Neg logit contributions
elly-0.019
Every-0.018
umpy-0.018
 year-0.016
 little-0.016
Token their
Token ID511
Feature activation-4.18e-02

Pos logit contributions
chief+0.125
ridge+0.118
atmeal+0.114
 drown+0.111
imet+0.109
Neg logit contributions
Every-0.137
elly-0.136
 English-0.129
 little-0.129
 Spanish-0.129
Token dreams
Token ID10625
Feature activation-2.23e-02

Pos logit contributions
chief+0.072
 heal+0.070
 drown+0.068
 down+0.067
ridge+0.066
Neg logit contributions
elly-0.061
Every-0.052
umpy-0.050
ubby-0.048
urious-0.048
Token.
Token ID13
Feature activation-6.30e-02

Pos logit contributions
chief+0.032
 drown+0.030
 elbow+0.029
 rid+0.028
ridge+0.028
Neg logit contributions
elly-0.040
Every-0.037
umpy-0.036
 year-0.034
oma-0.032
Token We
Token ID775
Feature activation-4.67e-02

Pos logit contributions
 heal+0.070
chief+0.069
 drown+0.068
 down+0.065
 rid+0.065
Neg logit contributions
elly-0.134
Every-0.122
umpy-0.117
Two-0.112
oma-0.111
Token have
Token ID423
Feature activation-3.40e-02

Pos logit contributions
chief+0.067
 heal+0.063
 drown+0.063
 down+0.063
 elbow+0.059
Neg logit contributions
elly-0.081
Every-0.075
umpy-0.071
Two-0.068
iet-0.067
Token's
Token ID338
Feature activation-5.90e-02

Pos logit contributions
 drown+0.072
chief+0.071
 heal+0.070
 elbow+0.067
 rid+0.066
Neg logit contributions
elly-0.105
Every-0.098
umpy-0.091
oph-0.090
Children-0.089
Token cook
Token ID4255
Feature activation-3.35e-02

Pos logit contributions
chief+0.076
 drown+0.067
 elbow+0.066
 down+0.065
 rid+0.065
Neg logit contributions
Every-0.104
elly-0.098
 little-0.095
Two-0.093
 young-0.093
Token some
Token ID617
Feature activation-3.91e-02

Pos logit contributions
 drown+0.050
chief+0.048
 bleed+0.045
 heal+0.045
 elbow+0.043
Neg logit contributions
elly-0.056
Every-0.052
aste-0.048
 little-0.047
 year-0.047
Token regular
Token ID3218
Feature activation-3.85e-02

Pos logit contributions
chief+0.058
 drown+0.056
 heal+0.054
 down+0.051
 bleed+0.051
Neg logit contributions
elly-0.071
Every-0.062
umpy-0.058
 year-0.057
illing-0.057
Token food
Token ID2057
Feature activation-5.92e-02

Pos logit contributions
chief+0.060
 drown+0.060
 heal+0.058
 bleed+0.056
stack+0.056
Neg logit contributions
elly-0.065
Every-0.053
 year-0.053
illing-0.053
umpy-0.051
Token and
Token ID290
Feature activation-8.85e-02

Pos logit contributions
chief+0.096
 drown+0.093
 heal+0.088
 elbow+0.087
 bleed+0.087
Neg logit contributions
elly-0.102
Every-0.084
umpy-0.081
illing-0.079
 year-0.078
Token have
Token ID423
Feature activation-5.76e-02

Pos logit contributions
chief+0.117
 drown+0.105
 down+0.102
 elbow+0.101
 heal+0.100
Neg logit contributions
elly-0.152
Every-0.148
 little-0.143
 young-0.141
 elderly-0.138
Token a
Token ID257
Feature activation-5.24e-02

Pos logit contributions
chief+0.096
 drown+0.095
 heal+0.094
 down+0.090
stack+0.089
Neg logit contributions
elly-0.085
Every-0.080
Children-0.069
 year-0.069
Two-0.069
Token picnic
Token ID35715
Feature activation-3.86e-02

Pos logit contributions
chief+0.093
 heal+0.090
 drown+0.089
 down+0.088
 elbow+0.085
Neg logit contributions
elly-0.073
Every-0.064
 year-0.059
 little-0.058
umpy-0.058
Token.
Token ID13
Feature activation-6.88e-02

Pos logit contributions
chief+0.057
 drown+0.055
 heal+0.054
 bleed+0.052
 elbow+0.049
Neg logit contributions
elly-0.070
Every-0.060
 year-0.058
umpy-0.057
ubby-0.055
Token And
Token ID843
Feature activation-3.63e-02

Pos logit contributions
chief+0.090
 heal+0.090
 drown+0.089
 rid+0.086
 down+0.085
Neg logit contributions
elly-0.130
Every-0.119
umpy-0.110
Two-0.110
Children-0.106
Token proud
Token ID6613
Feature activation-3.91e-02

Pos logit contributions
chief+0.061
 heal+0.057
ridge+0.056
 drown+0.055
stack+0.054
Neg logit contributions
elly-0.044
Every-0.041
umpy-0.038
 little-0.035
 young-0.035
Token of
Token ID286
Feature activation-4.03e-02

Pos logit contributions
chief+0.072
 heal+0.072
 drown+0.071
 down+0.068
 rid+0.067
Neg logit contributions
elly-0.056
Every-0.047
umpy-0.044
illing-0.040
Children-0.040
Token themselves
Token ID2405
Feature activation-3.90e-02

Pos logit contributions
chief+0.049
 drown+0.048
 heal+0.047
 down+0.044
ridge+0.043
Neg logit contributions
elly-0.078
Every-0.073
Children-0.068
umpy-0.067
oma-0.067
Token and
Token ID290
Feature activation-7.79e-02

Pos logit contributions
chief+0.055
 drown+0.053
 heal+0.049
 bleed+0.048
 elbow+0.048
Neg logit contributions
elly-0.070
Every-0.066
umpy-0.060
 year-0.059
 little-0.059
Token continued
Token ID3767
Feature activation-4.33e-02

Pos logit contributions
chief+0.144
 drown+0.140
ridge+0.137
 heal+0.135
imet+0.132
Neg logit contributions
elly-0.092
Every-0.088
 elderly-0.084
 young-0.081
 little-0.081
Token to
Token ID284
Feature activation-8.51e-02

Pos logit contributions
 drown+0.072
idents+0.071
chief+0.071
 heal+0.070
 bleed+0.069
Neg logit contributions
Every-0.051
grown-0.050
elly-0.050
 little-0.045
 kind-0.045
Token play
Token ID711
Feature activation-1.17e-02

Pos logit contributions
chief+0.113
imet+0.105
 rid+0.104
balls+0.102
 down+0.100
Neg logit contributions
Every-0.142
 kind-0.124
 little-0.124
 year-0.123
 young-0.122
Token on
Token ID319
Feature activation-5.71e-02

Pos logit contributions
chief+0.020
 drown+0.018
 bleed+0.017
ridge+0.017
 heal+0.017
Neg logit contributions
elly-0.018
Every-0.017
 year-0.016
 little-0.015
umpy-0.015
Token the
Token ID262
Feature activation-6.44e-02

Pos logit contributions
chief+0.094
 heal+0.086
ridge+0.085
 drown+0.084
 bleed+0.084
Neg logit contributions
Every-0.083
elly-0.081
 little-0.078
 year-0.076
 young-0.073
Token shore
Token ID15191
Feature activation-5.35e-02

Pos logit contributions
chief+0.109
 drown+0.103
 heal+0.103
 down+0.101
 rid+0.100
Neg logit contributions
elly-0.093
Every-0.088
umpy-0.079
 elderly-0.079
iet-0.078
Token.
Token ID13
Feature activation-5.96e-02

Pos logit contributions
chief+0.091
 drown+0.089
 bleed+0.087
 heal+0.087
 elbow+0.083
Neg logit contributions
elly-0.086
Every-0.076
umpy-0.073
 year-0.072
illing-0.069