TOP ACTIVATIONS
MAX = 0.105

golf with his dad . They would go to the golf
ooters . They always wore their gear : helmets , pads
on , Lily and her neighbor became good friends and played
!" So , Kitty and Dog gy took turns sliding with
on , he wore his new sunglasses everywhere he went ,
." R ib bit said , " Would
the park to play with his jets . He threw them
golf with his dad . They would go to the big
day on , Kitty and Dog gy loved to play the
" Yes !" So they skipped together along the path
with his dad . They would go to the golf course
her friends . They ran , jumped , and slid on
a bell . They ride their bikes on the road near
, too . They skipped and sang and laughed .
with his dad . They would go to the big green
a cloth . They want their bikes to look nice .
skip just like Tom . They both laughed and had lots
her friends . They ran , jumped , and slid on
and boy swung together . They laughed and smiled and it
play with the pigeon . They had lots of fun together

MIN ACTIVATIONS
MIN = -0.105

. " Who 's there ?" she said in a loud
me pick out a costume ?" she asked .
you doing here ?" a voice said . It was a
What are you searching for ?" An nie smiled
voice coming from the diamond . Annie was startled . "
. Daddy put an arm around Jake and said
had to go to the bar ber . She said his
he asks in a loud voice . S ara
Hello ? Is someone there ?" Lu cy was
" Hello !" said Pen elope in a
" What are you doing ?" It was her mum .
that said ' No entry '. But Annie was eager to
you like it ?" the bar ber asked .
did you find that handle ?" Clara looked up and saw
Who are you ?" a voice says . The voice is
Who goes there ?" a voice said from inside .
's mom had filled the chamber with balloons , stream ers
What are you looking for ?" the fairy asked .
have to destroy my map ?" Jamie cried .
suddenly she heard a voice . It was a

INTERVAL 0.053 - 0.105
CONTAINS 0.553%

the sky , and Tom always helped the bird by snapping
hats , boots and m ittens . Lily had thick ,
, John and Jane . They both liked to play outside
on , Lily and her brother would go to their secret
on the big wheel ?" Her friend replied , " Yes

INTERVAL 0.000 - 0.053
CONTAINS 47.901%

miserable . <|endoftext|> Once upon a time , there was a
huge tree . The tree had big eyes that could see
try new things . One day , they decided
, the screw driver tightened screws , and the hammer hit
to their yard . They decided to play with the swing

INTERVAL -0.052 - 0.000
CONTAINS 51.190%

and made a beautiful tree grow . <|endoftext|> Once upon a
working on the machine , until it was fixed .
and fl aps its wings . It runs away from them
a long time . When they finally returned
to go home , Lily was happy because she made a

INTERVAL -0.105 - -0.052
CONTAINS 0.357%

the lecture . But , when he flew inside his feathers
daddy wanted to watch a movie about a superhero .
careful next time , the dish washer can be dangerous !"
box and saw a j igsaw puzzle inside . He paused
. " Look !" Ben says to Anna .
Token golf
Token ID13126
Feature activation+2.04e-02

Pos logit contributions
year+0.038
ore+0.038
 gen+0.035
 goblin+0.033
 x+0.032
Neg logit contributions
 pads-0.053
 tails-0.044
coat-0.044
 sunglasses-0.044
 Indians-0.043
Token with
Token ID351
Feature activation+5.50e-02

Pos logit contributions
 pads+0.061
ety+0.054
burn+0.052
 tails+0.052
 Indians+0.051
Neg logit contributions
ore-0.024
 x-0.019
year-0.019
 gen-0.018
 goblin-0.017
Token his
Token ID465
Feature activation+6.99e-02

Pos logit contributions
 pads+0.171
burn+0.164
coat+0.162
ety+0.161
drops+0.152
Neg logit contributions
ore-0.039
 x-0.037
 of-0.034
year-0.026
looking-0.024
Token dad
Token ID9955
Feature activation+1.84e-02

Pos logit contributions
 pads+0.181
burn+0.169
coat+0.162
ety+0.160
drops+0.158
Neg logit contributions
 x-0.074
ore-0.068
 old-0.058
--0.057
 repair-0.057
Token.
Token ID13
Feature activation+3.48e-02

Pos logit contributions
 pads+0.051
ety+0.044
 tails+0.043
 Indians+0.043
coat+0.042
Neg logit contributions
ore-0.025
 x-0.022
year-0.021
 gen-0.020
 healer-0.019
Token They
Token ID1119
Feature activation+1.05e-01

Pos logit contributions
 pads+0.102
ety+0.087
 tails+0.087
burn+0.086
 Indians+0.084
Neg logit contributions
ore-0.044
 x-0.038
 gen-0.033
 operating-0.032
year-0.032
Token would
Token ID561
Feature activation+1.01e-01

Pos logit contributions
 pads+0.246
burn+0.246
 tails+0.230
coat+0.228
 fault+0.227
Neg logit contributions
 of-0.104
year-0.100
 year-0.083
 vill-0.080
 was-0.079
Token go
Token ID467
Feature activation+4.30e-02

Pos logit contributions
 pads+0.243
ety+0.217
burn+0.209
ked+0.203
ibi+0.202
Neg logit contributions
 x-0.119
 deliver-0.109
 of-0.104
 cook-0.098
 repair-0.097
Token to
Token ID284
Feature activation+3.54e-02

Pos logit contributions
 pads+0.114
 tails+0.098
ety+0.096
burn+0.095
 pad+0.092
Neg logit contributions
ore-0.065
 x-0.051
year-0.050
 healer-0.046
 of-0.045
Token the
Token ID262
Feature activation+2.98e-02

Pos logit contributions
 pads+0.107
ety+0.089
coat+0.089
 tails+0.088
burn+0.088
Neg logit contributions
 x-0.041
ore-0.039
 gen-0.032
 operating-0.031
 healer-0.031
Token golf
Token ID13126
Feature activation+4.11e-02

Pos logit contributions
 pads+0.067
ety+0.055
 tails+0.054
burn+0.052
coat+0.052
Neg logit contributions
ore-0.057
 x-0.054
 gen-0.051
 healer-0.049
 operating-0.049
Tokenooters
Token ID48316
Feature activation+5.49e-02

Pos logit contributions
 pads+0.065
 tails+0.053
burn+0.051
ety+0.051
 Indians+0.050
Neg logit contributions
ore-0.101
year-0.095
old-0.094
 x-0.089
 operating-0.085
Token.
Token ID13
Feature activation+2.19e-02

Pos logit contributions
 pads+0.158
 Indians+0.145
 tails+0.141
ety+0.140
 pad+0.136
Neg logit contributions
ore-0.064
year-0.050
 x-0.045
le-0.043
 of-0.043
Token They
Token ID1119
Feature activation+6.73e-02

Pos logit contributions
 pads+0.057
burn+0.048
ety+0.048
 tails+0.047
coat+0.047
Neg logit contributions
ore-0.035
 x-0.029
year-0.029
 operating-0.027
 gen-0.027
Token always
Token ID1464
Feature activation+7.23e-02

Pos logit contributions
 pads+0.163
burn+0.148
 tails+0.147
bit+0.138
aters+0.136
Neg logit contributions
year-0.087
ore-0.078
 gen-0.077
old-0.076
 of-0.075
Token wore
Token ID12408
Feature activation+6.91e-02

Pos logit contributions
 pads+0.167
 tails+0.149
burn+0.145
 hoops+0.142
ety+0.138
Neg logit contributions
ore-0.092
year-0.092
 x-0.091
 gen-0.089
old-0.086
Token their
Token ID511
Feature activation+1.04e-01

Pos logit contributions
ety+0.145
 pads+0.145
 pad+0.133
burn+0.123
flower+0.123
Neg logit contributions
 x-0.109
year-0.093
 news-0.093
 operating-0.092
ore-0.092
Token gear
Token ID7733
Feature activation+3.53e-02

Pos logit contributions
ety+0.189
 pads+0.177
burn+0.174
inas+0.172
 Indians+0.164
Neg logit contributions
 x-0.186
 news-0.163
 year-0.159
place-0.155
 gen-0.152
Token:
Token ID25
Feature activation+7.87e-02

Pos logit contributions
 pads+0.085
ety+0.075
 Indians+0.072
burn+0.070
 tails+0.069
Neg logit contributions
ore-0.058
 x-0.052
 gen-0.049
year-0.048
 healer-0.046
Token helmets
Token ID28359
Feature activation+7.10e-02

Pos logit contributions
 pads+0.178
ety+0.161
burn+0.161
bit+0.153
ibi+0.153
Neg logit contributions
ore-0.101
year-0.101
 x-0.100
 operating-0.089
 healer-0.085
Token,
Token ID11
Feature activation+6.03e-02

Pos logit contributions
 pads+0.198
ety+0.180
 Indians+0.177
 tails+0.173
 sunglasses+0.162
Neg logit contributions
ore-0.091
 x-0.070
 gen-0.066
 operating-0.063
 of-0.062
Token pads
Token ID21226
Feature activation+7.41e-02

Pos logit contributions
 pads+0.100
ety+0.093
burn+0.085
ibi+0.083
 pad+0.083
Neg logit contributions
 x-0.128
 operating-0.108
 gen-0.106
ore-0.105
 of-0.104
Token on
Token ID319
Feature activation+3.16e-03

Pos logit contributions
 pads+0.071
ety+0.062
 tails+0.059
 Indians+0.057
coat+0.057
Neg logit contributions
ore-0.044
year-0.039
 x-0.038
 gen-0.035
 operating-0.034
Token,
Token ID11
Feature activation+1.97e-02

Pos logit contributions
 pads+0.008
ety+0.006
 tails+0.006
burn+0.006
 Indians+0.006
Neg logit contributions
ore-0.006
year-0.005
 x-0.005
 gen-0.005
 goblin-0.005
Token Lily
Token ID20037
Feature activation+7.16e-02

Pos logit contributions
 pads+0.049
ety+0.039
 tails+0.038
 pad+0.038
coat+0.038
Neg logit contributions
ore-0.034
 x-0.030
year-0.030
 gen-0.030
 healer-0.028
Token and
Token ID290
Feature activation+1.22e-02

Pos logit contributions
 pads+0.178
 tails+0.166
ety+0.162
 Indians+0.160
 sunscreen+0.153
Neg logit contributions
ore-0.089
old-0.074
abel-0.074
year-0.073
le-0.072
Token her
Token ID607
Feature activation+6.38e-02

Pos logit contributions
 pads+0.029
ety+0.024
 tails+0.023
coat+0.023
burn+0.022
Neg logit contributions
ore-0.022
 x-0.021
year-0.020
 gen-0.019
 goblin-0.019
Token neighbor
Token ID4780
Feature activation+1.04e-01

Pos logit contributions
 pads+0.150
ety+0.131
burn+0.125
coat+0.125
 tails+0.125
Neg logit contributions
 x-0.101
 goblin-0.084
 news-0.081
 Doctor-0.075
 gen-0.074
Token became
Token ID2627
Feature activation+4.98e-02

Pos logit contributions
 pads+0.196
 Indians+0.194
ety+0.178
 tails+0.177
bur+0.177
Neg logit contributions
 of-0.167
old-0.156
ore-0.143
abel-0.142
year-0.137
Token good
Token ID922
Feature activation+1.76e-02

Pos logit contributions
 pads+0.121
ety+0.100
 pad+0.098
coat+0.097
ibi+0.094
Neg logit contributions
 x-0.077
ore-0.076
year-0.076
 healer-0.075
 gen-0.072
Token friends
Token ID2460
Feature activation-1.66e-03

Pos logit contributions
 pads+0.039
ety+0.032
 tails+0.029
 Indians+0.029
 pad+0.029
Neg logit contributions
ore-0.035
 x-0.032
year-0.032
 gen-0.030
 healer-0.030
Token and
Token ID290
Feature activation+7.04e-02

Pos logit contributions
ore+0.003
 x+0.003
year+0.003
 gen+0.003
 goblin+0.002
Neg logit contributions
 pads-0.004
ety-0.003
 tails-0.003
burn-0.003
coat-0.003
Token played
Token ID2826
Feature activation+1.47e-02

Pos logit contributions
 pads+0.142
ety+0.118
 pad+0.115
coat+0.114
bur+0.114
Neg logit contributions
year-0.120
ore-0.116
 of-0.114
old-0.108
 operating-0.106
Token!"
Token ID2474
Feature activation+1.02e-02

Pos logit contributions
 pads+0.141
ety+0.125
 Indians+0.123
 tails+0.122
 sunglasses+0.120
Neg logit contributions
ore-0.073
 of-0.052
 x-0.051
year-0.048
 ingredient-0.048
Token So
Token ID1406
Feature activation+3.71e-02

Pos logit contributions
 pads+0.023
ety+0.019
 tails+0.018
 Indians+0.018
coat+0.018
Neg logit contributions
ore-0.020
 x-0.018
year-0.017
 gen-0.017
 operating-0.016
Token,
Token ID11
Feature activation+2.14e-02

Pos logit contributions
 pads+0.106
ety+0.089
 pad+0.088
coat+0.087
 Indians+0.086
Neg logit contributions
ore-0.048
 gen-0.041
 x-0.041
year-0.038
 operating-0.037
Token Kitty
Token ID21168
Feature activation+5.24e-02

Pos logit contributions
 pads+0.048
ety+0.039
 pad+0.037
 tails+0.037
coat+0.036
Neg logit contributions
ore-0.041
 x-0.038
 gen-0.037
year-0.037
 operating-0.035
Token and
Token ID290
Feature activation+5.33e-02

Pos logit contributions
 pads+0.131
 tails+0.116
ety+0.114
burn+0.111
 Indians+0.109
Neg logit contributions
ore-0.073
 x-0.062
old-0.061
year-0.060
 gen-0.058
Token Dog
Token ID8532
Feature activation+1.04e-01

Pos logit contributions
 pads+0.111
coat+0.086
 pad+0.086
ety+0.086
 tails+0.079
Neg logit contributions
 x-0.101
ore-0.095
 operating-0.094
 healer-0.091
 gen-0.090
Tokengy
Token ID1360
Feature activation+7.54e-02

Pos logit contributions
 pads+0.218
 sunscreen+0.208
 sunglasses+0.203
 Indians+0.198
 hoops+0.194
Neg logit contributions
ore-0.140
oc-0.132
ray-0.131
le-0.130
fire-0.123
Token took
Token ID1718
Feature activation+2.10e-02

Pos logit contributions
 pads+0.164
 Indians+0.143
ety+0.139
burn+0.137
 tails+0.130
Neg logit contributions
old-0.117
ore-0.117
year-0.113
 of-0.113
abel-0.107
Token turns
Token ID4962
Feature activation+3.82e-02

Pos logit contributions
 pads+0.056
ety+0.047
 Indians+0.045
 pad+0.045
coat+0.044
Neg logit contributions
ore-0.032
 x-0.029
year-0.028
 gen-0.026
 healer-0.025
Token sliding
Token ID22292
Feature activation+5.78e-02

Pos logit contributions
 pads+0.074
ety+0.062
 Indians+0.059
 tails+0.057
bit+0.055
Neg logit contributions
ore-0.086
year-0.076
 x-0.074
 of-0.072
 operating-0.071
Token with
Token ID351
Feature activation+5.80e-02

Pos logit contributions
 pads+0.150
 Indians+0.138
ety+0.135
burn+0.128
 tails+0.127
Neg logit contributions
ore-0.086
 x-0.060
 of-0.059
year-0.057
--0.055
Token on
Token ID319
Feature activation+9.25e-03

Pos logit contributions
 pads+0.086
ety+0.074
 tails+0.072
coat+0.069
 Indians+0.068
Neg logit contributions
ore-0.053
year-0.046
 x-0.044
 gen-0.041
 operating-0.039
Token,
Token ID11
Feature activation+2.48e-02

Pos logit contributions
 pads+0.024
ety+0.020
 tails+0.019
burn+0.018
coat+0.018
Neg logit contributions
ore-0.016
 x-0.014
 gen-0.014
year-0.014
 goblin-0.013
Token he
Token ID339
Feature activation+2.19e-02

Pos logit contributions
 pads+0.063
coat+0.051
 pad+0.050
 tails+0.050
ety+0.049
Neg logit contributions
ore-0.042
year-0.038
 x-0.037
 gen-0.036
 operating-0.034
Token wore
Token ID12408
Feature activation+4.91e-02

Pos logit contributions
 pads+0.047
ety+0.038
 tails+0.037
coat+0.036
burn+0.035
Neg logit contributions
ore-0.044
year-0.040
 gen-0.040
 x-0.040
 healer-0.039
Token his
Token ID465
Feature activation+9.69e-02

Pos logit contributions
 pads+0.156
ety+0.145
burn+0.136
 pad+0.134
coat+0.132
Neg logit contributions
 x-0.036
ore-0.034
year-0.031
 gen-0.030
 healer-0.028
Token new
Token ID649
Feature activation+1.03e-01

Pos logit contributions
 pads+0.166
burn+0.159
ety+0.145
lets+0.138
inas+0.137
Neg logit contributions
 x-0.179
 news-0.171
 goblin-0.170
 magical-0.168
 old-0.166
Token sunglasses
Token ID36251
Feature activation+5.01e-02

Pos logit contributions
 pads+0.153
burn+0.126
ety+0.126
 Indians+0.120
ked+0.113
Neg logit contributions
 ingredient-0.214
 x-0.213
 news-0.213
 year-0.205
--0.204
Token everywhere
Token ID8347
Feature activation+4.58e-03

Pos logit contributions
 pads+0.124
 tails+0.112
ety+0.111
 Indians+0.108
burn+0.104
Neg logit contributions
ore-0.078
year-0.067
 x-0.062
 gen-0.061
 healer-0.059
Token he
Token ID339
Feature activation+5.57e-03

Pos logit contributions
 pads+0.013
ety+0.011
 tails+0.011
burn+0.010
 Indians+0.010
Neg logit contributions
ore-0.007
 x-0.006
year-0.006
 gen-0.006
 goblin-0.005
Token went
Token ID1816
Feature activation+7.45e-03

Pos logit contributions
 pads+0.011
ety+0.009
 tails+0.009
burn+0.008
 Indians+0.008
Neg logit contributions
ore-0.012
 x-0.011
year-0.011
 gen-0.011
 goblin-0.011
Token,
Token ID11
Feature activation+4.59e-02

Pos logit contributions
 pads+0.020
ety+0.017
 tails+0.016
burn+0.016
 Indians+0.016
Neg logit contributions
ore-0.012
 x-0.010
year-0.010
 gen-0.010
 goblin-0.009
Token."
Token ID526
Feature activation+2.28e-02

Pos logit contributions
 pads+0.032
ety+0.027
 tails+0.025
burn+0.025
 Indians+0.025
Neg logit contributions
ore-0.022
 x-0.020
year-0.019
 gen-0.019
 goblin-0.018
Token
Token ID220
Feature activation+1.57e-02

Pos logit contributions
 pads+0.047
ety+0.037
burn+0.036
coat+0.036
 Indians+0.035
Neg logit contributions
ore-0.049
 gen-0.043
 x-0.043
year-0.042
 healer-0.041
Token
Token ID198
Feature activation+1.14e-02

Pos logit contributions
 pads+0.043
ety+0.036
 tails+0.035
burn+0.035
 Indians+0.035
Neg logit contributions
ore-0.024
 x-0.020
year-0.020
 gen-0.020
 goblin-0.018
Token
Token ID198
Feature activation+1.84e-02

Pos logit contributions
 pads+0.031
ety+0.026
 tails+0.025
burn+0.025
coat+0.025
Neg logit contributions
ore-0.018
year-0.015
 x-0.015
 gen-0.015
 goblin-0.014
TokenR
Token ID49
Feature activation+4.44e-03

Pos logit contributions
 pads+0.037
ety+0.029
burn+0.028
coat+0.028
 tails+0.027
Neg logit contributions
ore-0.040
 x-0.036
 gen-0.036
year-0.036
 healer-0.035
Tokenib
Token ID571
Feature activation+1.03e-01

Pos logit contributions
 pads+0.009
ety+0.007
 tails+0.006
 Indians+0.006
burn+0.006
Neg logit contributions
ore-0.010
 x-0.009
year-0.009
 gen-0.009
 goblin-0.009
Tokenbit
Token ID2545
Feature activation+4.36e-02

Pos logit contributions
coat+0.145
 Indians+0.143
 pads+0.140
burn+0.133
 sunscreen+0.126
Neg logit contributions
ore-0.217
le-0.209
oc-0.195
 of-0.194
--0.193
Token said
Token ID531
Feature activation+1.89e-02

Pos logit contributions
 pads+0.121
coat+0.104
burn+0.104
 Indians+0.100
ety+0.100
Neg logit contributions
ore-0.055
 gen-0.049
 goblin-0.045
le-0.044
 healer-0.043
Token,
Token ID11
Feature activation+4.98e-02

Pos logit contributions
 pads+0.051
ety+0.043
coat+0.041
 tails+0.040
burn+0.040
Neg logit contributions
ore-0.030
 gen-0.027
year-0.026
 x-0.025
 goblin-0.024
Token "
Token ID366
Feature activation+4.31e-02

Pos logit contributions
 pads+0.133
ety+0.114
burn+0.109
coat+0.108
ibi+0.106
Neg logit contributions
ore-0.076
year-0.071
 gen-0.067
 healer-0.062
 x-0.061
TokenWould
Token ID17353
Feature activation-5.78e-03

Pos logit contributions
 pads+0.106
 sunglasses+0.088
 Indians+0.088
 tails+0.088
coat+0.088
Neg logit contributions
ore-0.071
year-0.064
 gen-0.059
 healer-0.056
oc-0.054
Token the
Token ID262
Feature activation+2.93e-02

Pos logit contributions
ore+0.006
year+0.005
 gen+0.005
 x+0.005
 goblin+0.005
Neg logit contributions
 pads-0.008
ety-0.006
 tails-0.006
 Indians-0.006
burn-0.006
Token park
Token ID3952
Feature activation-1.58e-02

Pos logit contributions
 pads+0.073
ety+0.060
 tails+0.059
coat+0.056
lets+0.055
Neg logit contributions
 x-0.049
ore-0.049
 gen-0.045
 operating-0.044
 healer-0.044
Token to
Token ID284
Feature activation+7.51e-03

Pos logit contributions
ore+0.029
 x+0.028
 gen+0.027
year+0.027
 goblin+0.026
Neg logit contributions
 pads-0.037
ety-0.031
coat-0.029
 tails-0.028
burn-0.028
Token play
Token ID711
Feature activation-1.05e-02

Pos logit contributions
 pads+0.014
ety+0.011
 tails+0.011
burn+0.010
coat+0.010
Neg logit contributions
ore-0.017
 x-0.016
year-0.016
 gen-0.016
 goblin-0.015
Token with
Token ID351
Feature activation+1.58e-02

Pos logit contributions
ore+0.016
 gen+0.014
year+0.014
 x+0.014
 goblin+0.013
Neg logit contributions
 pads-0.028
ety-0.023
 tails-0.023
coat-0.023
 sunglasses-0.022
Token his
Token ID465
Feature activation+1.03e-01

Pos logit contributions
 pads+0.048
ety+0.041
 pad+0.040
coat+0.039
burn+0.039
Neg logit contributions
ore-0.019
 x-0.017
year-0.016
 gen-0.016
 goblin-0.014
Token jets
Token ID20792
Feature activation+4.70e-02

Pos logit contributions
 pads+0.239
coat+0.222
ety+0.216
 pad+0.206
bit+0.200
Neg logit contributions
 old-0.117
 of-0.113
 magical-0.107
 news-0.107
 special-0.105
Token.
Token ID13
Feature activation+2.36e-02

Pos logit contributions
 pads+0.144
ety+0.131
ibi+0.126
 tails+0.126
burn+0.125
Neg logit contributions
ore-0.045
 x-0.032
 of-0.031
 operating-0.027
 gen-0.027
Token He
Token ID679
Feature activation+4.83e-02

Pos logit contributions
 pads+0.069
 tails+0.060
ety+0.059
burn+0.059
 sunglasses+0.057
Neg logit contributions
ore-0.030
 x-0.024
year-0.023
 gen-0.022
 operating-0.021
Token threw
Token ID9617
Feature activation+5.78e-02

Pos logit contributions
 pads+0.121
burn+0.106
 tails+0.105
bit+0.102
ety+0.100
Neg logit contributions
ore-0.064
 x-0.058
year-0.056
 gen-0.056
 goblin-0.053
Token them
Token ID606
Feature activation+6.05e-02

Pos logit contributions
 pads+0.170
ety+0.154
coat+0.154
burn+0.153
ibi+0.150
Neg logit contributions
ore-0.053
 x-0.052
 of-0.048
year-0.044
 gen-0.042
Token golf
Token ID13126
Feature activation+1.70e-02

Pos logit contributions
year+0.050
ore+0.048
 gen+0.045
abel+0.042
 goblin+0.042
Neg logit contributions
 pads-0.064
 tails-0.054
 sunglasses-0.054
coat-0.053
 Indians-0.052
Token with
Token ID351
Feature activation+4.95e-02

Pos logit contributions
 pads+0.050
ety+0.044
burn+0.043
 tails+0.043
 Indians+0.042
Neg logit contributions
ore-0.020
 x-0.017
year-0.017
 gen-0.016
 goblin-0.015
Token his
Token ID465
Feature activation+7.53e-02

Pos logit contributions
 pads+0.162
burn+0.153
ety+0.151
coat+0.151
 pad+0.145
Neg logit contributions
ore-0.030
 x-0.029
 of-0.023
year-0.019
looking-0.017
Token dad
Token ID9955
Feature activation+1.96e-02

Pos logit contributions
 pads+0.189
burn+0.183
coat+0.176
ety+0.172
drops+0.170
Neg logit contributions
 x-0.073
ore-0.071
 of-0.064
 old-0.063
--0.060
Token.
Token ID13
Feature activation+3.24e-02

Pos logit contributions
 pads+0.055
ety+0.048
 tails+0.046
 Indians+0.046
coat+0.046
Neg logit contributions
ore-0.026
 x-0.023
year-0.022
 gen-0.021
 healer-0.020
Token They
Token ID1119
Feature activation+1.02e-01

Pos logit contributions
 pads+0.094
 tails+0.079
ety+0.079
burn+0.079
 Indians+0.077
Neg logit contributions
ore-0.043
 x-0.037
 gen-0.032
year-0.032
 operating-0.031
Token would
Token ID561
Feature activation+1.01e-01

Pos logit contributions
burn+0.247
 pads+0.243
 fault+0.225
aters+0.223
coat+0.221
Neg logit contributions
 of-0.100
year-0.098
 vill-0.081
 year-0.077
looking-0.076
Token go
Token ID467
Feature activation+4.30e-02

Pos logit contributions
 pads+0.244
ety+0.217
burn+0.213
ibi+0.202
ked+0.200
Neg logit contributions
 x-0.117
 deliver-0.108
 of-0.103
 repair-0.096
 show-0.096
Token to
Token ID284
Feature activation+3.65e-02

Pos logit contributions
 pads+0.113
 tails+0.098
burn+0.096
ety+0.095
 pad+0.092
Neg logit contributions
ore-0.065
 x-0.051
year-0.050
 healer-0.046
 of-0.046
Token the
Token ID262
Feature activation+3.13e-02

Pos logit contributions
 pads+0.111
ety+0.092
coat+0.092
burn+0.091
 tails+0.091
Neg logit contributions
 x-0.041
ore-0.039
 gen-0.032
 operating-0.032
 healer-0.031
Token big
Token ID1263
Feature activation+3.12e-02

Pos logit contributions
 pads+0.069
ety+0.057
 tails+0.055
burn+0.054
coat+0.054
Neg logit contributions
ore-0.061
 x-0.058
 gen-0.054
 healer-0.052
 goblin-0.052
Token day
Token ID1110
Feature activation+4.10e-02

Pos logit contributions
 pads+0.036
ety+0.030
 tails+0.029
coat+0.028
 Indians+0.027
Neg logit contributions
ore-0.029
 x-0.026
year-0.025
 gen-0.025
 healer-0.024
Token on
Token ID319
Feature activation+1.51e-02

Pos logit contributions
 pads+0.114
ety+0.103
 tails+0.097
coat+0.095
 Indians+0.092
Neg logit contributions
ore-0.060
year-0.051
 x-0.049
 operating-0.043
 gen-0.043
Token,
Token ID11
Feature activation+4.57e-02

Pos logit contributions
 pads+0.040
ety+0.034
 tails+0.032
 Indians+0.031
 pad+0.031
Neg logit contributions
ore-0.024
 x-0.021
 gen-0.021
year-0.020
 operating-0.020
Token Kitty
Token ID21168
Feature activation+4.91e-02

Pos logit contributions
 pads+0.107
 pad+0.092
ety+0.088
coat+0.088
 sunglasses+0.086
Neg logit contributions
ore-0.078
 x-0.066
year-0.066
 operating-0.065
 of-0.065
Token and
Token ID290
Feature activation+7.28e-02

Pos logit contributions
 pads+0.130
 tails+0.113
ety+0.113
 Indians+0.110
 sunglasses+0.108
Neg logit contributions
ore-0.067
 x-0.053
 gen-0.053
 healer-0.052
 goblin-0.051
Token Dog
Token ID8532
Feature activation+1.02e-01

Pos logit contributions
 pads+0.159
coat+0.132
 pad+0.129
ety+0.125
 wrists+0.118
Neg logit contributions
 x-0.128
 operating-0.118
 God-0.114
 healer-0.113
 Doctor-0.113
Tokengy
Token ID1360
Feature activation+7.98e-02

Pos logit contributions
 pads+0.199
 sunscreen+0.186
 sunglasses+0.184
 Indians+0.179
 hoops+0.177
Neg logit contributions
ore-0.170
le-0.147
 of-0.145
year-0.144
oc-0.144
Token loved
Token ID6151
Feature activation+2.66e-02

Pos logit contributions
 pads+0.165
ety+0.140
 Indians+0.138
burn+0.130
 sunglasses+0.129
Neg logit contributions
ore-0.135
abel-0.131
old-0.128
year-0.124
 operating-0.124
Token to
Token ID284
Feature activation+2.19e-02

Pos logit contributions
 pads+0.065
ety+0.056
coat+0.051
 pad+0.051
ibi+0.050
Neg logit contributions
ore-0.047
 x-0.043
year-0.041
 operating-0.039
 gen-0.038
Token play
Token ID711
Feature activation-1.19e-02

Pos logit contributions
 pads+0.047
ety+0.039
 tails+0.035
 pad+0.035
 Indians+0.035
Neg logit contributions
ore-0.045
 x-0.042
 gen-0.040
year-0.039
 operating-0.039
Token the
Token ID262
Feature activation+2.35e-02

Pos logit contributions
ore+0.023
year+0.021
 gen+0.021
 x+0.020
 goblin+0.020
Neg logit contributions
 pads-0.027
 tails-0.022
ety-0.021
coat-0.021
 Indians-0.020
Token "
Token ID366
Feature activation+5.06e-02

Pos logit contributions
 pads+0.112
ety+0.098
burn+0.091
 tails+0.089
 Indians+0.087
Neg logit contributions
ore-0.073
year-0.066
 gen-0.063
 x-0.062
 goblin-0.059
TokenYes
Token ID5297
Feature activation+3.92e-02

Pos logit contributions
 pads+0.127
 tails+0.110
 Indians+0.108
burn+0.107
ety+0.104
Neg logit contributions
ore-0.076
year-0.072
looking-0.064
le-0.059
 gen-0.059
Token!"
Token ID2474
Feature activation+1.24e-02

Pos logit contributions
 pads+0.120
ety+0.099
 Indians+0.096
 tails+0.095
 pad+0.094
Neg logit contributions
ore-0.050
 x-0.043
 gen-0.041
 of-0.037
year-0.036
Token
Token ID198
Feature activation+9.68e-03

Pos logit contributions
 pads+0.031
ety+0.026
 tails+0.025
 Indians+0.025
burn+0.024
Neg logit contributions
ore-0.022
year-0.019
 x-0.019
 gen-0.019
 goblin-0.017
TokenSo
Token ID2396
Feature activation+2.75e-02

Pos logit contributions
 pads+0.023
ety+0.019
 tails+0.018
 Indians+0.018
burn+0.018
Neg logit contributions
ore-0.018
year-0.016
 x-0.016
 gen-0.016
 goblin-0.015
Token they
Token ID484
Feature activation+1.02e-01

Pos logit contributions
 pads+0.074
ety+0.062
coat+0.061
burn+0.060
 Indians+0.059
Neg logit contributions
ore-0.042
year-0.036
 gen-0.035
 x-0.034
 goblin-0.032
Token skipped
Token ID26684
Feature activation+4.49e-02

Pos logit contributions
 pads+0.215
bur+0.195
burn+0.189
bit+0.186
 fault+0.186
Neg logit contributions
year-0.147
ore-0.144
 of-0.143
le-0.139
abel-0.127
Token together
Token ID1978
Feature activation+4.37e-02

Pos logit contributions
 pads+0.124
ety+0.112
 Indians+0.108
burn+0.106
 tails+0.105
Neg logit contributions
ore-0.059
year-0.048
 of-0.042
 operating-0.039
 late-0.039
Token along
Token ID1863
Feature activation+7.96e-03

Pos logit contributions
 pads+0.121
ety+0.112
 Indians+0.109
burn+0.107
bur+0.106
Neg logit contributions
ore-0.057
 x-0.043
year-0.043
 of-0.041
 operating-0.040
Token the
Token ID262
Feature activation+5.27e-02

Pos logit contributions
 pads+0.021
ety+0.018
 tails+0.017
burn+0.017
 Indians+0.017
Neg logit contributions
ore-0.013
 x-0.011
year-0.011
 gen-0.011
 goblin-0.010
Token path
Token ID3108
Feature activation+1.71e-02

Pos logit contributions
 pads+0.110
ety+0.098
burn+0.091
bur+0.091
 Indians+0.089
Neg logit contributions
ore-0.095
 x-0.092
 operating-0.085
looking-0.082
 healer-0.080
Token with
Token ID351
Feature activation+5.50e-02

Pos logit contributions
 pads+0.061
ety+0.054
burn+0.052
 tails+0.052
 Indians+0.051
Neg logit contributions
ore-0.024
 x-0.019
year-0.019
 gen-0.018
 goblin-0.017
Token his
Token ID465
Feature activation+6.99e-02

Pos logit contributions
 pads+0.171
burn+0.164
coat+0.162
ety+0.161
drops+0.152
Neg logit contributions
ore-0.039
 x-0.037
 of-0.034
year-0.026
looking-0.024
Token dad
Token ID9955
Feature activation+1.84e-02

Pos logit contributions
 pads+0.181
burn+0.169
coat+0.162
ety+0.160
drops+0.158
Neg logit contributions
 x-0.074
ore-0.068
 old-0.058
--0.057
 repair-0.057
Token.
Token ID13
Feature activation+3.48e-02

Pos logit contributions
 pads+0.051
ety+0.044
 tails+0.043
 Indians+0.043
coat+0.042
Neg logit contributions
ore-0.025
 x-0.022
year-0.021
 gen-0.020
 healer-0.019
Token They
Token ID1119
Feature activation+1.05e-01

Pos logit contributions
 pads+0.102
ety+0.087
 tails+0.087
burn+0.086
 Indians+0.084
Neg logit contributions
ore-0.044
 x-0.038
 gen-0.033
 operating-0.032
year-0.032
Token would
Token ID561
Feature activation+1.01e-01

Pos logit contributions
 pads+0.246
burn+0.246
 tails+0.230
coat+0.228
 fault+0.227
Neg logit contributions
 of-0.104
year-0.100
 year-0.083
 vill-0.080
 was-0.079
Token go
Token ID467
Feature activation+4.30e-02

Pos logit contributions
 pads+0.243
ety+0.217
burn+0.209
ked+0.203
ibi+0.202
Neg logit contributions
 x-0.119
 deliver-0.109
 of-0.104
 cook-0.098
 repair-0.097
Token to
Token ID284
Feature activation+3.54e-02

Pos logit contributions
 pads+0.114
 tails+0.098
ety+0.096
burn+0.095
 pad+0.092
Neg logit contributions
ore-0.065
 x-0.051
year-0.050
 healer-0.046
 of-0.045
Token the
Token ID262
Feature activation+2.98e-02

Pos logit contributions
 pads+0.107
ety+0.089
coat+0.089
 tails+0.088
burn+0.088
Neg logit contributions
 x-0.041
ore-0.039
 gen-0.032
 operating-0.031
 healer-0.031
Token golf
Token ID13126
Feature activation+4.11e-02

Pos logit contributions
 pads+0.067
ety+0.055
 tails+0.054
burn+0.052
coat+0.052
Neg logit contributions
ore-0.057
 x-0.054
 gen-0.051
 healer-0.049
 operating-0.049
Token course
Token ID1781
Feature activation+2.95e-02

Pos logit contributions
 pads+0.090
burn+0.079
ety+0.078
 tails+0.075
ibi+0.073
Neg logit contributions
ore-0.078
 x-0.067
 gen-0.065
 goblin-0.064
 healer-0.063
Token her
Token ID607
Feature activation+5.04e-02

Pos logit contributions
 pads+0.075
 pad+0.061
ety+0.060
 Indians+0.060
coat+0.059
Neg logit contributions
ore-0.039
 x-0.038
 gen-0.036
year-0.035
 ingredient-0.032
Token friends
Token ID2460
Feature activation+3.36e-02

Pos logit contributions
 pads+0.133
ety+0.114
burn+0.110
bit+0.107
 pad+0.106
Neg logit contributions
ore-0.066
 x-0.063
 gen-0.061
 goblin-0.059
 chamber-0.053
Token.
Token ID13
Feature activation+2.57e-02

Pos logit contributions
 pads+0.104
ety+0.093
 Indians+0.091
burn+0.090
ibi+0.090
Neg logit contributions
ore-0.034
 x-0.026
year-0.025
 operating-0.023
 healer-0.022
Token They
Token ID1119
Feature activation+7.49e-02

Pos logit contributions
 pads+0.065
burn+0.055
ety+0.054
 Indians+0.053
 tails+0.053
Neg logit contributions
ore-0.040
 x-0.036
year-0.035
 operating-0.033
 gen-0.032
Token ran
Token ID4966
Feature activation+3.73e-02

Pos logit contributions
 pads+0.189
burn+0.178
 fault+0.168
bit+0.165
 tails+0.163
Neg logit contributions
year-0.096
 of-0.076
ore-0.074
looking-0.069
 gen-0.065
Token,
Token ID11
Feature activation+1.01e-01

Pos logit contributions
 pads+0.104
burn+0.092
ety+0.091
 Indians+0.087
bur+0.086
Neg logit contributions
ore-0.049
year-0.040
 x-0.036
looking-0.036
 healer-0.034
Token jumped
Token ID11687
Feature activation+2.93e-02

Pos logit contributions
 fault+0.224
burn+0.220
aters+0.219
bit+0.213
lets+0.202
Neg logit contributions
 of-0.133
looking-0.123
ore-0.107
year-0.101
 else-0.092
Token,
Token ID11
Feature activation+7.59e-02

Pos logit contributions
 pads+0.080
ety+0.072
burn+0.068
 Indians+0.066
 tails+0.065
Neg logit contributions
ore-0.042
 x-0.034
year-0.033
 gen-0.032
 goblin-0.031
Token and
Token ID290
Feature activation+6.55e-02

Pos logit contributions
aters+0.189
 pads+0.189
 fault+0.187
burn+0.186
ibi+0.185
Neg logit contributions
ore-0.065
 of-0.065
looking-0.058
year-0.051
ce-0.044
Token slid
Token ID27803
Feature activation+7.46e-02

Pos logit contributions
 pads+0.131
aters+0.121
burn+0.117
ety+0.115
bit+0.114
Neg logit contributions
ore-0.108
year-0.106
looking-0.095
 of-0.092
fire-0.090
Token on
Token ID319
Feature activation+3.24e-02

Pos logit contributions
burn+0.214
 Indians+0.209
 pads+0.209
ety+0.208
ige+0.197
Neg logit contributions
ore-0.051
 of-0.041
 with-0.039
--0.036
le-0.029
Token a
Token ID257
Feature activation+7.32e-02

Pos logit contributions
 pads+0.116
ety+0.113
burn+0.110
bur+0.105
coat+0.103
Neg logit contributions
 x-0.076
year-0.071
 gen-0.066
old-0.065
 operating-0.063
Token bell
Token ID8966
Feature activation+4.84e-02

Pos logit contributions
 pads+0.159
ety+0.156
bur+0.138
burn+0.138
 Indians+0.130
Neg logit contributions
 gen-0.118
 x-0.117
ore-0.107
 operating-0.103
 goblin-0.101
Token.
Token ID13
Feature activation+2.12e-02

Pos logit contributions
 pads+0.136
 Indians+0.121
ety+0.120
 tails+0.115
 pad+0.112
Neg logit contributions
ore-0.062
 x-0.053
 gen-0.050
year-0.047
 operating-0.045
Token They
Token ID1119
Feature activation+6.03e-02

Pos logit contributions
 pads+0.051
burn+0.043
ety+0.043
coat+0.041
 Indians+0.041
Neg logit contributions
ore-0.035
 x-0.032
year-0.031
 operating-0.030
 gen-0.029
Token ride
Token ID6594
Feature activation+3.37e-02

Pos logit contributions
 pads+0.140
burn+0.128
ety+0.118
 tails+0.117
 Indians+0.115
Neg logit contributions
 x-0.084
year-0.083
ore-0.080
 operating-0.079
old-0.079
Token their
Token ID511
Feature activation+1.01e-01

Pos logit contributions
 pads+0.086
ety+0.073
 Indians+0.071
burn+0.069
coat+0.069
Neg logit contributions
ore-0.055
year-0.049
 x-0.047
 gen-0.043
 healer-0.041
Token bikes
Token ID16715
Feature activation+5.39e-02

Pos logit contributions
ety+0.177
burn+0.169
 pads+0.168
inas+0.160
 Indians+0.159
Neg logit contributions
 x-0.169
 site-0.161
 year-0.157
 old-0.151
 news-0.150
Token on
Token ID319
Feature activation+2.60e-02

Pos logit contributions
 pads+0.148
 Indians+0.132
 tails+0.130
ety+0.130
burn+0.128
Neg logit contributions
ore-0.072
year-0.058
 x-0.055
 of-0.050
 operating-0.048
Token the
Token ID262
Feature activation+5.62e-02

Pos logit contributions
 pads+0.074
ety+0.065
burn+0.061
 Indians+0.061
coat+0.060
Neg logit contributions
ore-0.034
 x-0.033
 gen-0.029
year-0.029
 operating-0.028
Token road
Token ID2975
Feature activation+1.27e-02

Pos logit contributions
 pads+0.123
ety+0.120
burn+0.108
lets+0.108
ked+0.105
Neg logit contributions
 x-0.098
ore-0.085
 site-0.084
 operating-0.083
 gen-0.081
Token near
Token ID1474
Feature activation+2.22e-02

Pos logit contributions
 pads+0.033
ety+0.028
 tails+0.027
 Indians+0.027
burn+0.027
Neg logit contributions
ore-0.020
 x-0.018
year-0.018
 gen-0.017
 operating-0.016
Token,
Token ID11
Feature activation+5.75e-02

Pos logit contributions
 pads+0.178
ety+0.170
 Indians+0.161
 tails+0.160
bit+0.160
Neg logit contributions
ore-0.068
 of-0.049
year-0.047
 x-0.046
 operating-0.046
Token too
Token ID1165
Feature activation+3.35e-02

Pos logit contributions
 pads+0.133
ety+0.114
burn+0.112
 tails+0.112
bit+0.108
Neg logit contributions
 x-0.082
ore-0.082
 of-0.076
 gen-0.075
 goblin-0.074
Token.
Token ID13
Feature activation+3.61e-02

Pos logit contributions
 pads+0.100
ety+0.088
 Indians+0.086
 tails+0.085
bur+0.085
Neg logit contributions
ore-0.041
year-0.030
 x-0.028
 goblin-0.027
looking-0.026
Token They
Token ID1119
Feature activation+7.80e-02

Pos logit contributions
 pads+0.096
burn+0.086
 tails+0.082
ety+0.081
 Indians+0.081
Neg logit contributions
ore-0.051
year-0.043
 x-0.042
 operating-0.040
 gen-0.037
Token skipped
Token ID26684
Feature activation+5.36e-02

Pos logit contributions
 pads+0.186
burn+0.159
 tails+0.159
bit+0.159
ety+0.151
Neg logit contributions
year-0.111
ore-0.108
abel-0.105
 of-0.097
 year-0.091
Token and
Token ID290
Feature activation+1.01e-01

Pos logit contributions
 pads+0.152
ety+0.140
burn+0.137
bur+0.133
 Indians+0.133
Neg logit contributions
ore-0.062
year-0.049
 of-0.047
 goblin-0.040
 year-0.038
Token sang
Token ID25889
Feature activation+3.95e-02

Pos logit contributions
bit+0.205
bur+0.205
 fault+0.203
 pads+0.197
burn+0.196
Neg logit contributions
 of-0.127
ore-0.126
year-0.120
 year-0.107
looking-0.099
Token and
Token ID290
Feature activation+8.75e-02

Pos logit contributions
 pads+0.100
ety+0.093
burn+0.085
bur+0.084
 Indians+0.082
Neg logit contributions
ore-0.061
year-0.057
 x-0.052
 gen-0.048
 operating-0.048
Token laughed
Token ID13818
Feature activation+4.03e-02

Pos logit contributions
 pads+0.181
bur+0.181
ety+0.176
burn+0.169
 pad+0.156
Neg logit contributions
ore-0.119
year-0.118
 of-0.108
 year-0.099
fire-0.097
Token.
Token ID13
Feature activation+3.01e-02

Pos logit contributions
 pads+0.112
ety+0.103
 Indians+0.096
bur+0.095
burn+0.095
Neg logit contributions
ore-0.052
year-0.047
 x-0.042
 gen-0.041
 operating-0.040
Token
Token ID198
Feature activation-6.46e-03

Pos logit contributions
 pads+0.078
burn+0.067
ety+0.067
 Indians+0.065
 tails+0.065
Neg logit contributions
ore-0.047
year-0.040
 x-0.039
 operating-0.036
 gen-0.036
Token with
Token ID351
Feature activation+4.95e-02

Pos logit contributions
 pads+0.050
ety+0.044
burn+0.043
 tails+0.043
 Indians+0.042
Neg logit contributions
ore-0.020
 x-0.017
year-0.017
 gen-0.016
 goblin-0.015
Token his
Token ID465
Feature activation+7.53e-02

Pos logit contributions
 pads+0.162
burn+0.153
ety+0.151
coat+0.151
 pad+0.145
Neg logit contributions
ore-0.030
 x-0.029
 of-0.023
year-0.019
looking-0.017
Token dad
Token ID9955
Feature activation+1.96e-02

Pos logit contributions
 pads+0.189
burn+0.183
coat+0.176
ety+0.172
drops+0.170
Neg logit contributions
 x-0.073
ore-0.071
 of-0.064
 old-0.063
--0.060
Token.
Token ID13
Feature activation+3.24e-02

Pos logit contributions
 pads+0.055
ety+0.048
 tails+0.046
 Indians+0.046
coat+0.046
Neg logit contributions
ore-0.026
 x-0.023
year-0.022
 gen-0.021
 healer-0.020
Token They
Token ID1119
Feature activation+1.02e-01

Pos logit contributions
 pads+0.094
 tails+0.079
ety+0.079
burn+0.079
 Indians+0.077
Neg logit contributions
ore-0.043
 x-0.037
 gen-0.032
year-0.032
 operating-0.031
Token would
Token ID561
Feature activation+1.01e-01

Pos logit contributions
burn+0.247
 pads+0.243
 fault+0.225
aters+0.223
coat+0.221
Neg logit contributions
 of-0.100
year-0.098
 vill-0.081
 year-0.077
looking-0.076
Token go
Token ID467
Feature activation+4.30e-02

Pos logit contributions
 pads+0.244
ety+0.217
burn+0.213
ibi+0.202
ked+0.200
Neg logit contributions
 x-0.117
 deliver-0.108
 of-0.103
 repair-0.096
 show-0.096
Token to
Token ID284
Feature activation+3.65e-02

Pos logit contributions
 pads+0.113
 tails+0.098
burn+0.096
ety+0.095
 pad+0.092
Neg logit contributions
ore-0.065
 x-0.051
year-0.050
 healer-0.046
 of-0.046
Token the
Token ID262
Feature activation+3.13e-02

Pos logit contributions
 pads+0.111
ety+0.092
coat+0.092
burn+0.091
 tails+0.091
Neg logit contributions
 x-0.041
ore-0.039
 gen-0.032
 operating-0.032
 healer-0.031
Token big
Token ID1263
Feature activation+3.12e-02

Pos logit contributions
 pads+0.069
ety+0.057
 tails+0.055
burn+0.054
coat+0.054
Neg logit contributions
ore-0.061
 x-0.058
 gen-0.054
 healer-0.052
 goblin-0.052
Token green
Token ID4077
Feature activation+2.76e-02

Pos logit contributions
 pads+0.070
ety+0.058
burn+0.057
 tails+0.056
coat+0.055
Neg logit contributions
ore-0.059
 x-0.054
 goblin-0.051
 gen-0.051
 healer-0.049
Token a
Token ID257
Feature activation+6.40e-02

Pos logit contributions
 pads+0.131
ety+0.114
 Indians+0.111
 pad+0.109
burn+0.107
Neg logit contributions
 x-0.070
ore-0.066
year-0.062
 gen-0.058
 of-0.057
Token cloth
Token ID16270
Feature activation+5.62e-02

Pos logit contributions
 pads+0.137
ety+0.128
bur+0.111
 tails+0.109
 Indians+0.107
Neg logit contributions
ore-0.114
 gen-0.111
 x-0.110
 goblin-0.099
 operating-0.098
Token.
Token ID13
Feature activation+1.39e-02

Pos logit contributions
 pads+0.163
 Indians+0.147
ety+0.145
 tails+0.139
ibi+0.134
Neg logit contributions
ore-0.067
 x-0.050
 gen-0.050
 of-0.045
 operating-0.043
Token They
Token ID1119
Feature activation+3.80e-02

Pos logit contributions
 pads+0.035
ety+0.029
 Indians+0.028
 tails+0.028
burn+0.028
Neg logit contributions
ore-0.024
 x-0.021
year-0.020
 gen-0.020
 operating-0.019
Token want
Token ID765
Feature activation+2.11e-02

Pos logit contributions
 pads+0.091
ety+0.078
 tails+0.074
burn+0.074
 Indians+0.073
Neg logit contributions
ore-0.061
 x-0.060
 gen-0.056
year-0.055
 operating-0.054
Token their
Token ID511
Feature activation+1.00e-01

Pos logit contributions
 pads+0.054
ety+0.045
 tails+0.043
 Indians+0.043
 pad+0.042
Neg logit contributions
ore-0.034
 x-0.033
 gen-0.030
year-0.030
 operating-0.029
Token bikes
Token ID16715
Feature activation+5.16e-02

Pos logit contributions
ety+0.187
 pads+0.182
 Indians+0.158
pper+0.158
pp+0.156
Neg logit contributions
 x-0.171
 news-0.167
 old-0.153
 wishes-0.148
 voice-0.148
Token to
Token ID284
Feature activation+4.51e-02

Pos logit contributions
 pads+0.143
 tails+0.128
 Indians+0.124
 pad+0.120
ety+0.117
Neg logit contributions
ore-0.064
 x-0.056
 of-0.053
year-0.051
 gen-0.050
Token look
Token ID804
Feature activation+1.22e-03

Pos logit contributions
 pads+0.111
ety+0.094
 pad+0.088
 tails+0.086
 Indians+0.084
Neg logit contributions
 x-0.079
ore-0.073
 gen-0.068
 operating-0.067
year-0.063
Token nice
Token ID3621
Feature activation+1.28e-02

Pos logit contributions
 pads+0.003
ety+0.002
 tails+0.002
 Indians+0.002
burn+0.002
Neg logit contributions
ore-0.002
 x-0.002
year-0.002
 gen-0.002
 goblin-0.002
Token.
Token ID13
Feature activation+1.83e-02

Pos logit contributions
 pads+0.034
ety+0.029
 tails+0.028
 Indians+0.028
burn+0.028
Neg logit contributions
ore-0.019
year-0.017
 x-0.017
 gen-0.016
 goblin-0.015
Token skip
Token ID14267
Feature activation+3.55e-02

Pos logit contributions
 pads+0.126
ety+0.107
 pad+0.102
 Indians+0.098
bur+0.098
Neg logit contributions
ore-0.089
 x-0.085
 of-0.079
 operating-0.077
 healer-0.075
Token just
Token ID655
Feature activation+3.03e-02

Pos logit contributions
 pads+0.096
ety+0.085
burn+0.083
 tails+0.083
 Indians+0.082
Neg logit contributions
ore-0.050
year-0.040
 x-0.039
 goblin-0.036
 gen-0.036
Token like
Token ID588
Feature activation+4.87e-02

Pos logit contributions
 pads+0.076
ety+0.066
burn+0.063
 Indians+0.063
coat+0.062
Neg logit contributions
ore-0.048
 x-0.042
year-0.041
 operating-0.038
 goblin-0.038
Token Tom
Token ID4186
Feature activation+3.83e-02

Pos logit contributions
 pads+0.122
ety+0.107
burn+0.103
coat+0.098
 pad+0.097
Neg logit contributions
 x-0.073
ore-0.073
year-0.063
 gen-0.063
old-0.062
Token.
Token ID13
Feature activation+4.27e-02

Pos logit contributions
 pads+0.105
ety+0.093
 Indians+0.092
burn+0.091
 tails+0.091
Neg logit contributions
ore-0.046
 x-0.040
 gen-0.035
year-0.035
 goblin-0.035
Token They
Token ID1119
Feature activation+9.96e-02

Pos logit contributions
 pads+0.111
burn+0.101
 Indians+0.097
 tails+0.097
ety+0.096
Neg logit contributions
ore-0.063
old-0.048
year-0.048
 x-0.045
looking-0.044
Token both
Token ID1111
Feature activation+8.83e-02

Pos logit contributions
 pads+0.206
burn+0.193
bur+0.185
 Indians+0.178
 fault+0.175
Neg logit contributions
year-0.133
le-0.130
 of-0.128
ore-0.127
abel-0.127
Token laughed
Token ID13818
Feature activation+4.27e-02

Pos logit contributions
 pads+0.191
burn+0.185
bur+0.183
 Indians+0.179
ety+0.159
Neg logit contributions
year-0.131
ore-0.127
abel-0.116
le-0.115
 of-0.114
Token and
Token ID290
Feature activation+8.93e-02

Pos logit contributions
 pads+0.126
ety+0.119
bur+0.113
 Indians+0.112
burn+0.111
Neg logit contributions
ore-0.042
year-0.038
 of-0.032
 x-0.032
 deeply-0.031
Token had
Token ID550
Feature activation+5.64e-02

Pos logit contributions
 pads+0.178
bur+0.168
ety+0.166
burn+0.161
 pad+0.159
Neg logit contributions
ore-0.111
year-0.108
 of-0.099
le-0.095
 deeply-0.092
Token lots
Token ID6041
Feature activation-4.14e-03

Pos logit contributions
 pads+0.147
ety+0.142
burn+0.141
bur+0.135
flower+0.130
Neg logit contributions
year-0.052
 x-0.052
ore-0.049
abel-0.048
 news-0.047
Token her
Token ID607
Feature activation+4.64e-02

Pos logit contributions
 pads+0.057
ety+0.045
 pad+0.045
 Indians+0.044
coat+0.044
Neg logit contributions
ore-0.035
 x-0.033
 gen-0.031
year-0.031
 goblin-0.028
Token friends
Token ID2460
Feature activation+3.47e-02

Pos logit contributions
 pads+0.121
ety+0.102
burn+0.098
 pad+0.097
coat+0.095
Neg logit contributions
ore-0.067
 gen-0.062
 x-0.062
 goblin-0.059
 healer-0.053
Token.
Token ID13
Feature activation+2.70e-02

Pos logit contributions
 pads+0.110
ety+0.097
 Indians+0.096
ibi+0.094
 tails+0.094
Neg logit contributions
ore-0.035
 x-0.025
year-0.025
 healer-0.023
 operating-0.022
Token They
Token ID1119
Feature activation+7.00e-02

Pos logit contributions
 pads+0.069
 Indians+0.056
ety+0.056
 tails+0.056
burn+0.056
Neg logit contributions
ore-0.044
year-0.038
 x-0.038
 operating-0.035
 gen-0.034
Token ran
Token ID4966
Feature activation+3.47e-02

Pos logit contributions
 pads+0.177
burn+0.167
 tails+0.151
bit+0.150
 fault+0.150
Neg logit contributions
year-0.090
ore-0.077
 of-0.073
looking-0.068
 gen-0.067
Token,
Token ID11
Feature activation+9.94e-02

Pos logit contributions
 pads+0.095
burn+0.084
ety+0.083
 Indians+0.079
 tails+0.078
Neg logit contributions
ore-0.049
year-0.040
 x-0.036
looking-0.036
 healer-0.034
Token jumped
Token ID11687
Feature activation+2.77e-02

Pos logit contributions
burn+0.221
aters+0.217
 fault+0.213
bit+0.208
lets+0.199
Neg logit contributions
 of-0.134
looking-0.121
ore-0.113
year-0.100
 else-0.095
Token,
Token ID11
Feature activation+7.53e-02

Pos logit contributions
 pads+0.075
ety+0.067
burn+0.063
 Indians+0.062
 tails+0.061
Neg logit contributions
ore-0.041
 x-0.033
year-0.033
 gen-0.031
 goblin-0.031
Token and
Token ID290
Feature activation+6.12e-02

Pos logit contributions
aters+0.175
 pads+0.175
burn+0.174
ibi+0.171
 fault+0.167
Neg logit contributions
ore-0.084
 of-0.081
looking-0.070
year-0.067
 lad-0.061
Token slid
Token ID27803
Feature activation+7.36e-02

Pos logit contributions
 pads+0.122
burn+0.108
aters+0.107
ety+0.106
bur+0.104
Neg logit contributions
ore-0.109
year-0.103
looking-0.093
 goblin-0.091
 of-0.091
Token on
Token ID319
Feature activation+3.13e-02

Pos logit contributions
burn+0.218
 pads+0.211
 Indians+0.210
ety+0.209
bur+0.196
Neg logit contributions
ore-0.052
 of-0.040
 with-0.034
--0.031
le-0.027
Token and
Token ID290
Feature activation+5.16e-02

Pos logit contributions
 pads+0.130
 Indians+0.119
 tails+0.116
bit+0.114
burn+0.110
Neg logit contributions
ore-0.123
year-0.111
abel-0.110
old-0.110
 operating-0.105
Token boy
Token ID2933
Feature activation+5.87e-02

Pos logit contributions
 pads+0.134
 tails+0.111
bit+0.111
aters+0.108
burn+0.106
Neg logit contributions
ore-0.065
year-0.065
 gen-0.063
old-0.060
 healer-0.059
Token swung
Token ID28507
Feature activation+5.67e-02

Pos logit contributions
 pads+0.121
 Indians+0.108
 tails+0.102
bit+0.096
ety+0.095
Neg logit contributions
ore-0.113
old-0.101
year-0.099
abel-0.098
 operating-0.096
Token together
Token ID1978
Feature activation+3.68e-02

Pos logit contributions
 pads+0.157
 Indians+0.152
ety+0.146
burn+0.143
coat+0.134
Neg logit contributions
ore-0.056
year-0.055
 of-0.050
 x-0.040
 gen-0.040
Token.
Token ID13
Feature activation+3.70e-02

Pos logit contributions
 pads+0.108
ety+0.099
 Indians+0.098
burn+0.094
 tails+0.091
Neg logit contributions
ore-0.046
year-0.035
 x-0.032
 gen-0.030
 ingredient-0.030
Token They
Token ID1119
Feature activation+9.91e-02

Pos logit contributions
 pads+0.098
burn+0.084
 tails+0.083
ety+0.083
 sunglasses+0.083
Neg logit contributions
ore-0.055
 x-0.043
year-0.043
 operating-0.042
 healer-0.039
Token laughed
Token ID13818
Feature activation+4.08e-02

Pos logit contributions
 pads+0.227
aters+0.210
burn+0.208
bit+0.207
 Indians+0.207
Neg logit contributions
 of-0.121
year-0.117
abel-0.117
le-0.104
 else-0.104
Token and
Token ID290
Feature activation+9.64e-02

Pos logit contributions
 pads+0.120
ety+0.113
 Indians+0.104
burn+0.104
bur+0.102
Neg logit contributions
ore-0.043
year-0.039
 x-0.035
 gen-0.034
 deeply-0.033
Token smiled
Token ID13541
Feature activation+3.42e-02

Pos logit contributions
 pads+0.227
ety+0.203
aters+0.199
burn+0.198
 fault+0.197
Neg logit contributions
year-0.103
ore-0.091
 of-0.087
le-0.081
 Annie-0.081
Token and
Token ID290
Feature activation+7.79e-02

Pos logit contributions
 pads+0.091
ety+0.080
 Indians+0.078
coat+0.075
burn+0.074
Neg logit contributions
ore-0.049
year-0.045
 gen-0.041
 x-0.040
 operating-0.039
Token it
Token ID340
Feature activation+2.29e-02

Pos logit contributions
 pads+0.184
ety+0.161
 pad+0.156
 tails+0.152
coat+0.152
Neg logit contributions
year-0.102
ore-0.092
 gen-0.085
 of-0.082
abel-0.078
Token play
Token ID711
Feature activation+3.32e-03

Pos logit contributions
 pads+0.143
ety+0.125
ige+0.119
 pad+0.119
burn+0.117
Neg logit contributions
ore-0.083
 x-0.073
 healer-0.072
 gen-0.071
 of-0.070
Token with
Token ID351
Feature activation+1.55e-02

Pos logit contributions
 pads+0.009
ety+0.008
 tails+0.008
burn+0.007
 Indians+0.007
Neg logit contributions
ore-0.005
 x-0.004
year-0.004
 gen-0.004
 goblin-0.004
Token the
Token ID262
Feature activation+7.39e-02

Pos logit contributions
 pads+0.044
ety+0.036
burn+0.036
coat+0.036
 pad+0.035
Neg logit contributions
ore-0.022
 x-0.019
year-0.019
 gen-0.018
 healer-0.017
Token pigeon
Token ID46387
Feature activation+1.75e-02

Pos logit contributions
 pads+0.181
burn+0.158
ety+0.158
coat+0.155
ked+0.154
Neg logit contributions
 news-0.095
 villagers-0.087
 of-0.086
 year-0.085
 noise-0.084
Token.
Token ID13
Feature activation+3.46e-02

Pos logit contributions
 pads+0.047
ety+0.041
burn+0.040
coat+0.040
 tails+0.039
Neg logit contributions
ore-0.025
year-0.021
 x-0.021
 gen-0.020
 healer-0.019
Token They
Token ID1119
Feature activation+9.88e-02

Pos logit contributions
 pads+0.103
burn+0.091
 tails+0.089
 sunglasses+0.088
ety+0.086
Neg logit contributions
ore-0.043
 x-0.032
year-0.031
 gen-0.029
 operating-0.026
Token had
Token ID550
Feature activation+5.78e-02

Pos logit contributions
 pads+0.214
burn+0.200
 fault+0.192
aters+0.183
bur+0.180
Neg logit contributions
 of-0.126
year-0.123
le-0.118
ore-0.117
abel-0.114
Token lots
Token ID6041
Feature activation-1.17e-03

Pos logit contributions
 pads+0.174
burn+0.172
ety+0.168
inas+0.166
ibi+0.164
Neg logit contributions
 x-0.027
 news-0.024
 of-0.023
ore-0.023
 year-0.022
Token of
Token ID286
Feature activation+3.07e-02

Pos logit contributions
ore+0.001
 x+0.001
year+0.001
 gen+0.001
 goblin+0.001
Neg logit contributions
 pads-0.004
ety-0.004
 tails-0.004
coat-0.003
 Indians-0.003
Token fun
Token ID1257
Feature activation+2.62e-02

Pos logit contributions
 pads+0.069
burn+0.061
ety+0.061
 Indians+0.057
coat+0.056
Neg logit contributions
ore-0.054
 x-0.051
year-0.049
old-0.046
 gen-0.045
Token together
Token ID1978
Feature activation+3.70e-02

Pos logit contributions
 pads+0.068
ety+0.059
burn+0.057
 Indians+0.057
coat+0.055
Neg logit contributions
ore-0.042
year-0.035
 x-0.034
 gen-0.032
 operating-0.031
Token.
Token ID13
Feature activation-6.74e-02

Pos logit contributions
ore+0.120
year+0.119
 healer+0.119
 gen+0.113
 x+0.112
Neg logit contributions
 pads-0.098
burn-0.078
coat-0.077
 sunglasses-0.077
ety-0.073
Token "
Token ID366
Feature activation+3.31e-04

Pos logit contributions
 x+0.152
year+0.150
ore+0.150
 gen+0.149
le+0.148
Neg logit contributions
 pads-0.093
 pad-0.072
ety-0.065
 tails-0.062
 tail-0.058
TokenWho
Token ID8241
Feature activation-2.17e-02

Pos logit contributions
 pads+0.001
ety+0.001
 tails+0.001
burn+0.001
 Indians+0.001
Neg logit contributions
ore-0.001
 x-0.001
year-0.001
 gen-0.001
 goblin-0.001
Token's
Token ID338
Feature activation-2.39e-02

Pos logit contributions
ore+0.048
year+0.045
 gen+0.044
 x+0.043
 goblin+0.043
Neg logit contributions
 pads-0.043
ety-0.034
 tails-0.031
 pad-0.030
 Indians-0.030
Token there
Token ID612
Feature activation-1.80e-02

Pos logit contributions
ore+0.044
year+0.042
 x+0.038
 goblin+0.037
 gen+0.037
Neg logit contributions
 pads-0.052
ety-0.045
 Indians-0.044
 sunglasses-0.044
burn-0.043
Token?"
Token ID1701
Feature activation-1.02e-01

Pos logit contributions
ore+0.028
year+0.027
 x+0.025
 gen+0.025
 goblin+0.024
Neg logit contributions
 pads-0.045
ety-0.039
burn-0.036
 Indians-0.036
 pad-0.036
Token she
Token ID673
Feature activation-2.65e-02

Pos logit contributions
year+0.251
 x+0.236
ore+0.235
auc+0.235
place+0.229
Neg logit contributions
 pads-0.104
 pad-0.068
 Cle-0.068
 sunglasses-0.066
 legs-0.063
Token said
Token ID531
Feature activation-4.43e-02

Pos logit contributions
ore+0.056
year+0.050
 x+0.047
 healer+0.047
 operating+0.046
Neg logit contributions
 pads-0.056
ety-0.046
 tails-0.043
 sunglasses-0.042
 Indians-0.042
Token in
Token ID287
Feature activation-2.86e-02

Pos logit contributions
ore+0.087
 x+0.084
year+0.083
 healer+0.081
 goblin+0.080
Neg logit contributions
 pads-0.081
ety-0.063
ked-0.063
coat-0.063
 tails-0.062
Token a
Token ID257
Feature activation-2.25e-02

Pos logit contributions
ore+0.059
year+0.050
 gen+0.049
 healer+0.048
 x+0.046
Neg logit contributions
 pads-0.058
 tails-0.050
 sunglasses-0.049
 tail-0.045
ety-0.044
Token loud
Token ID7812
Feature activation-3.44e-02

Pos logit contributions
ore+0.037
year+0.033
 x+0.031
 gen+0.029
 healer+0.029
Neg logit contributions
 pads-0.056
 tails-0.048
 pad-0.046
ety-0.045
 sunglasses-0.045
Token me
Token ID502
Feature activation-4.97e-02

Pos logit contributions
year+0.066
ore+0.065
 gen+0.059
 goblin+0.057
 x+0.055
Neg logit contributions
 pads-0.051
ety-0.045
 sunglasses-0.040
burn-0.040
 Indians-0.040
Token pick
Token ID2298
Feature activation-2.62e-02

Pos logit contributions
year+0.094
ore+0.091
 gen+0.083
 goblin+0.082
looking+0.076
Neg logit contributions
 pads-0.105
ety-0.091
burn-0.090
 sunglasses-0.086
 Indians-0.085
Token out
Token ID503
Feature activation-3.62e-02

Pos logit contributions
ore+0.047
year+0.045
 gen+0.043
 x+0.042
 operating+0.041
Neg logit contributions
 pads-0.062
ety-0.050
 sunglasses-0.049
burn-0.047
 tails-0.047
Token a
Token ID257
Feature activation-1.90e-02

Pos logit contributions
ore+0.070
year+0.063
 gen+0.062
 x+0.060
 goblin+0.060
Neg logit contributions
 pads-0.080
 sunglasses-0.065
ety-0.065
 tails-0.064
 sunscreen-0.062
Token costume
Token ID16569
Feature activation-5.92e-02

Pos logit contributions
ore+0.036
year+0.034
 x+0.031
 gen+0.030
 operating+0.030
Neg logit contributions
 pads-0.043
ety-0.035
 Indians-0.034
coat-0.034
 tails-0.034
Token?"
Token ID1701
Feature activation-9.98e-02

Pos logit contributions
year+0.103
ore+0.102
 gen+0.091
 x+0.088
 goblin+0.088
Neg logit contributions
 pads-0.135
coat-0.111
ety-0.110
 sunglasses-0.109
 pad-0.108
Token she
Token ID673
Feature activation-3.18e-02

Pos logit contributions
year+0.240
 passage+0.210
 site+0.203
ore+0.202
place+0.202
Neg logit contributions
 pads-0.104
 Cle-0.098
ety-0.091
 Indians-0.086
burn-0.086
Token asked
Token ID1965
Feature activation-4.20e-02

Pos logit contributions
ore+0.071
year+0.061
 x+0.057
 gen+0.057
 operating+0.056
Neg logit contributions
 pads-0.065
ety-0.054
 tails-0.050
 sunglasses-0.050
 pad-0.049
Token.
Token ID13
Feature activation-4.84e-02

Pos logit contributions
ore+0.083
 gen+0.076
year+0.075
 x+0.072
 goblin+0.071
Neg logit contributions
 pads-0.084
ety-0.070
 tails-0.065
coat-0.064
 pad-0.064
Token
Token ID198
Feature activation-3.09e-02

Pos logit contributions
year+0.102
ore+0.099
 x+0.096
 gen+0.095
le+0.091
Neg logit contributions
 pads-0.083
ety-0.067
 Indians-0.064
 pad-0.062
burn-0.061
Token
Token ID198
Feature activation-3.64e-02

Pos logit contributions
year+0.060
ore+0.060
 gen+0.057
 goblin+0.057
 x+0.056
Neg logit contributions
 pads-0.059
ety-0.049
 Indians-0.046
coat-0.046
burn-0.045
Token you
Token ID345
Feature activation+9.08e-03

Pos logit contributions
ore+0.023
year+0.021
 gen+0.020
 x+0.020
 goblin+0.019
Neg logit contributions
 pads-0.026
ety-0.022
 tails-0.021
 Indians-0.021
 sunglasses-0.021
Token doing
Token ID1804
Feature activation-3.77e-02

Pos logit contributions
 pads+0.020
ety+0.016
 tails+0.015
coat+0.014
burn+0.014
Neg logit contributions
ore-0.019
 x-0.017
 gen-0.017
year-0.017
 goblin-0.016
Token here
Token ID994
Feature activation-3.17e-02

Pos logit contributions
year+0.068
ore+0.066
 gen+0.058
 goblin+0.056
 x+0.055
Neg logit contributions
 pads-0.084
 sunglasses-0.069
ety-0.068
coat-0.067
burn-0.067
Token?"
Token ID1701
Feature activation-6.50e-02

Pos logit contributions
ore+0.053
year+0.053
 gen+0.047
 x+0.045
 goblin+0.045
Neg logit contributions
 pads-0.076
ety-0.064
coat-0.061
 sunglasses-0.060
burn-0.060
Token a
Token ID257
Feature activation-8.32e-02

Pos logit contributions
year+0.153
ore+0.151
fire+0.139
 ingredient+0.138
le+0.133
Neg logit contributions
 pads-0.089
ety-0.083
 Cle-0.074
 sunglasses-0.072
 tails-0.070
Token voice
Token ID3809
Feature activation-9.83e-02

Pos logit contributions
ore+0.148
year+0.142
 ingredient+0.126
 operating+0.124
 x+0.121
Neg logit contributions
 dads-0.140
ety-0.126
 pads-0.122
 Indians-0.119
 sunglasses-0.118
Token said
Token ID531
Feature activation-4.78e-02

Pos logit contributions
ore+0.217
 healer+0.211
year+0.201
 goblin+0.201
handed+0.200
Neg logit contributions
 pads-0.111
ked-0.101
 sunglasses-0.100
ety-0.098
 swung-0.096
Token.
Token ID13
Feature activation-4.48e-02

Pos logit contributions
ore+0.093
year+0.089
 healer+0.086
 x+0.085
 operating+0.083
Neg logit contributions
 pads-0.087
ety-0.074
ked-0.074
 sunglasses-0.069
 tails-0.068
Token It
Token ID632
Feature activation-3.36e-02

Pos logit contributions
ore+0.096
year+0.093
 ingredient+0.089
 healer+0.089
 x+0.087
Neg logit contributions
 pads-0.079
ety-0.059
 tails-0.058
 pad-0.057
 sunglasses-0.056
Token was
Token ID373
Feature activation-1.75e-02

Pos logit contributions
ore+0.068
year+0.063
 healer+0.057
 x+0.057
old+0.055
Neg logit contributions
 pads-0.072
ety-0.058
 tails-0.057
 swung-0.055
 sunglasses-0.055
Token a
Token ID257
Feature activation-1.19e-02

Pos logit contributions
ore+0.035
year+0.031
 healer+0.029
 gen+0.029
 x+0.029
Neg logit contributions
 pads-0.038
ety-0.032
 tails-0.031
 sunglasses-0.030
 Indians-0.029
Token What
Token ID1867
Feature activation-2.36e-02

Pos logit contributions
ore+0.049
 x+0.046
 gen+0.045
year+0.044
 goblin+0.043
Neg logit contributions
 pads-0.057
ety-0.046
 Indians-0.045
 pad-0.043
 tails-0.043
Token are
Token ID389
Feature activation-1.11e-02

Pos logit contributions
ore+0.060
year+0.056
 gen+0.054
 x+0.054
 goblin+0.053
Neg logit contributions
 pads-0.039
ety-0.029
 Indians-0.027
 tails-0.026
 sunglasses-0.026
Token you
Token ID345
Feature activation-1.34e-03

Pos logit contributions
ore+0.021
year+0.019
 x+0.018
 gen+0.018
 goblin+0.018
Neg logit contributions
 pads-0.026
ety-0.021
 tails-0.020
 Indians-0.020
 sunglasses-0.020
Token searching
Token ID10342
Feature activation-2.37e-02

Pos logit contributions
ore+0.003
year+0.003
 x+0.003
 gen+0.003
 goblin+0.003
Neg logit contributions
 pads-0.003
ety-0.002
 tails-0.002
 Indians-0.002
burn-0.002
Token for
Token ID329
Feature activation-4.91e-02

Pos logit contributions
ore+0.045
 gen+0.043
year+0.042
 x+0.042
 goblin+0.040
Neg logit contributions
 pads-0.052
ety-0.043
 sunglasses-0.040
 pad-0.040
 tails-0.040
Token?"
Token ID1701
Feature activation-9.69e-02

Pos logit contributions
year+0.090
ore+0.088
 gen+0.084
 goblin+0.082
 x+0.079
Neg logit contributions
 pads-0.103
 sunglasses-0.084
ety-0.083
 tails-0.082
 Indians-0.079
Token
Token ID198
Feature activation-7.37e-02

Pos logit contributions
 x+0.248
year+0.240
ore+0.233
 potion+0.231
place+0.231
Neg logit contributions
 pads-0.079
 pad-0.054
ety-0.048
 -0.043
 wherever-0.033
Token
Token ID198
Feature activation-7.00e-02

Pos logit contributions
 x+0.177
ore+0.173
 goblin+0.173
year+0.172
 ingredient+0.170
Neg logit contributions
 pads-0.069
ety-0.049
 pad-0.043
 Indians-0.042
ibi-0.039
TokenAn
Token ID2025
Feature activation-6.40e-02

Pos logit contributions
 x+0.131
ore+0.130
 potion+0.113
 goblin+0.112
 ingredient+0.111
Neg logit contributions
 pads-0.111
 Indians-0.093
ety-0.092
 pad-0.087
 tails-0.087
Tokennie
Token ID11952
Feature activation-4.56e-02

Pos logit contributions
ore+0.146
 gen+0.133
 operating+0.124
fire+0.123
 x+0.122
Neg logit contributions
 pads-0.106
 pad-0.088
ety-0.083
 tails-0.081
lets-0.074
Token smiled
Token ID13541
Feature activation-7.67e-03

Pos logit contributions
ore+0.099
 x+0.092
year+0.091
 operating+0.085
 gen+0.084
Neg logit contributions
 pads-0.088
ety-0.071
 pad-0.069
 tails-0.068
lets-0.065
Token voice
Token ID3809
Feature activation-4.83e-02

Pos logit contributions
ore+0.126
year+0.124
handed+0.109
 ingredient+0.107
oons+0.106
Neg logit contributions
 pads-0.090
ety-0.083
 pad-0.079
 sunglasses-0.077
 tails-0.076
Token coming
Token ID2406
Feature activation-2.28e-02

Pos logit contributions
year+0.098
ore+0.094
 goblin+0.089
 healer+0.089
 ingredient+0.087
Neg logit contributions
 pads-0.086
ety-0.072
burn-0.071
coat-0.068
 sunglasses-0.067
Token from
Token ID422
Feature activation-3.52e-02

Pos logit contributions
ore+0.032
year+0.029
 x+0.029
 gen+0.028
 goblin+0.027
Neg logit contributions
 pads-0.058
ety-0.052
burn-0.050
 tails-0.049
 pad-0.048
Token the
Token ID262
Feature activation-1.58e-02

Pos logit contributions
ore+0.067
year+0.064
 gen+0.063
 x+0.059
abel+0.058
Neg logit contributions
 pads-0.068
ety-0.061
 sunglasses-0.059
 tails-0.058
bur-0.056
Token diamond
Token ID15291
Feature activation-9.34e-03

Pos logit contributions
ore+0.031
year+0.029
 x+0.027
 goblin+0.025
old+0.025
Neg logit contributions
 pads-0.034
ety-0.029
 tails-0.027
 sunglasses-0.027
 pad-0.027
Token.
Token ID13
Feature activation-9.65e-02

Pos logit contributions
ore+0.016
year+0.015
 x+0.014
 gen+0.014
 goblin+0.013
Neg logit contributions
 pads-0.023
ety-0.019
 tails-0.019
 pad-0.018
burn-0.018
Token Annie
Token ID25860
Feature activation-4.54e-02

Pos logit contributions
 x+0.200
ore+0.190
handed+0.189
 healer+0.182
 potion+0.179
Neg logit contributions
 pads-0.135
 tails-0.109
 wo-0.108
 pad-0.107
ety-0.102
Token was
Token ID373
Feature activation-1.84e-02

Pos logit contributions
ore+0.101
year+0.097
 x+0.095
 operating+0.089
 gen+0.089
Neg logit contributions
 pads-0.082
ety-0.068
 pad-0.063
 tails-0.062
 swung-0.062
Token startled
Token ID36249
Feature activation-2.29e-02

Pos logit contributions
ore+0.032
year+0.029
 gen+0.027
 x+0.027
 goblin+0.027
Neg logit contributions
 pads-0.044
ety-0.037
 tails-0.036
burn-0.035
 Indians-0.034
Token.
Token ID13
Feature activation-5.99e-02

Pos logit contributions
ore+0.048
 gen+0.045
year+0.045
 x+0.044
 healer+0.042
Neg logit contributions
 pads-0.045
ety-0.040
 tails-0.036
 sunglasses-0.035
 pad-0.034
Token "
Token ID366
Feature activation+3.91e-03

Pos logit contributions
ore+0.138
year+0.137
 x+0.136
 gen+0.136
 healer+0.130
Neg logit contributions
 pads-0.085
ety-0.067
 pad-0.063
 tails-0.061
coat-0.056
Token.
Token ID13
Feature activation-2.43e-02

Pos logit contributions
ore+0.053
year+0.051
 gen+0.047
 x+0.046
 healer+0.043
Neg logit contributions
 pads-0.061
 tails-0.048
ety-0.047
 pad-0.047
coat-0.046
Token
Token ID198
Feature activation-3.45e-02

Pos logit contributions
ore+0.048
 gen+0.045
year+0.044
 healer+0.044
 x+0.044
Neg logit contributions
 pads-0.052
ety-0.041
 pad-0.039
 tails-0.039
 Indians-0.039
Token
Token ID198
Feature activation-2.28e-02

Pos logit contributions
 gen+0.068
ore+0.068
year+0.066
 goblin+0.066
 x+0.066
Neg logit contributions
 pads-0.067
ety-0.053
ibi-0.051
 pad-0.048
 Indians-0.048
TokenDaddy
Token ID48280
Feature activation-2.74e-02

Pos logit contributions
ore+0.049
 gen+0.045
 x+0.044
 healer+0.043
year+0.043
Neg logit contributions
 pads-0.046
ety-0.034
 tails-0.034
 Indians-0.034
 pad-0.033
Token put
Token ID1234
Feature activation-2.67e-02

Pos logit contributions
ore+0.061
year+0.053
 x+0.052
 gen+0.051
 ingredient+0.050
Neg logit contributions
 pads-0.054
ety-0.046
 pad-0.041
 Indians-0.041
burn-0.041
Token an
Token ID281
Feature activation-9.63e-02

Pos logit contributions
ore+0.051
 gen+0.047
 healer+0.046
year+0.045
 operating+0.045
Neg logit contributions
 pads-0.058
ety-0.046
 tails-0.045
burn-0.044
 sunglasses-0.044
Token arm
Token ID3211
Feature activation-3.09e-02

Pos logit contributions
ore+0.130
fire+0.105
year+0.096
arium+0.089
ground+0.089
Neg logit contributions
 pads-0.238
 sunglasses-0.227
coat-0.213
 pad-0.210
 tail-0.209
Token around
Token ID1088
Feature activation-2.64e-02

Pos logit contributions
ore+0.060
 gen+0.056
year+0.056
 goblin+0.055
 healer+0.054
Neg logit contributions
 pads-0.070
coat-0.054
ety-0.052
 pad-0.051
burn-0.051
Token Jake
Token ID14757
Feature activation-2.54e-03

Pos logit contributions
ore+0.057
 healer+0.052
 gen+0.052
year+0.052
 x+0.050
Neg logit contributions
 pads-0.049
 tails-0.040
ety-0.038
 Indians-0.036
 pad-0.035
Token and
Token ID290
Feature activation-3.52e-02

Pos logit contributions
ore+0.004
year+0.004
 x+0.004
 gen+0.004
 goblin+0.004
Neg logit contributions
 pads-0.006
ety-0.005
 tails-0.005
coat-0.005
 pad-0.005
Token said
Token ID531
Feature activation-1.71e-02

Pos logit contributions
ore+0.080
year+0.070
 x+0.066
 healer+0.064
 gen+0.064
Neg logit contributions
 pads-0.070
ety-0.055
 tails-0.054
 Indians-0.054
 sunglasses-0.053
Token had
Token ID550
Feature activation+1.08e-02

Pos logit contributions
ore+0.018
year+0.017
 x+0.016
 gen+0.016
 goblin+0.016
Neg logit contributions
 pads-0.016
ety-0.013
 tails-0.011
coat-0.011
 pad-0.011
Token to
Token ID284
Feature activation-2.29e-02

Pos logit contributions
 pads+0.026
ety+0.021
 tails+0.020
burn+0.020
 Indians+0.020
Neg logit contributions
ore-0.019
 x-0.018
year-0.017
 gen-0.017
 goblin-0.016
Token go
Token ID467
Feature activation-3.66e-02

Pos logit contributions
ore+0.051
year+0.049
 gen+0.044
 goblin+0.044
 x+0.043
Neg logit contributions
 pads-0.044
ety-0.035
burn-0.034
 tails-0.034
 sunglasses-0.034
Token to
Token ID284
Feature activation-4.51e-02

Pos logit contributions
year+0.089
 gen+0.087
ore+0.086
 goblin+0.083
 ingredient+0.080
Neg logit contributions
 pads-0.060
ety-0.047
coat-0.045
burn-0.045
bur-0.040
Token the
Token ID262
Feature activation-4.95e-02

Pos logit contributions
year+0.104
ore+0.099
 gen+0.090
 goblin+0.086
le+0.085
Neg logit contributions
 pads-0.077
ety-0.066
 Indians-0.065
 sunscreen-0.064
burn-0.063
Token bar
Token ID2318
Feature activation-9.54e-02

Pos logit contributions
ore+0.100
year+0.099
le+0.088
 gen+0.085
ment+0.082
Neg logit contributions
 pads-0.096
coat-0.078
 sunglasses-0.077
 sunscreen-0.076
 Indians-0.076
Tokenber
Token ID527
Feature activation-4.88e-02

Pos logit contributions
 potion+0.199
 x+0.192
 gen+0.191
 goblin+0.183
year+0.179
Neg logit contributions
 pads-0.148
ety-0.119
 pad-0.117
 dads-0.115
coat-0.104
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
ore+0.088
year+0.087
 gen+0.086
 goblin+0.079
 x+0.078
Neg logit contributions
 pads-0.114
 pad-0.091
ety-0.090
coat-0.086
burn-0.081
Token She
Token ID1375
Feature activation-4.16e-02

Pos logit contributions
ore+0.039
year+0.038
 gen+0.036
 x+0.035
 goblin+0.035
Neg logit contributions
 pads-0.031
ety-0.025
 tails-0.023
coat-0.023
ibi-0.022
Token said
Token ID531
Feature activation-1.62e-02

Pos logit contributions
ore+0.103
year+0.086
 goblin+0.083
fire+0.083
 x+0.083
Neg logit contributions
 pads-0.075
ety-0.063
 swung-0.059
coat-0.056
 sunglasses-0.055
Token his
Token ID465
Feature activation-1.32e-02

Pos logit contributions
ore+0.030
year+0.027
 x+0.026
 gen+0.025
 goblin+0.025
Neg logit contributions
 pads-0.037
ety-0.031
 tails-0.030
 Indians-0.029
 sunglasses-0.029
Token he
Token ID339
Feature activation-3.45e-02

Pos logit contributions
ore+0.176
year+0.176
fire+0.167
 ingredient+0.162
le+0.161
Neg logit contributions
 pads-0.076
 dads-0.066
ety-0.066
 sunscreen-0.065
 sunglasses-0.064
Token asks
Token ID7893
Feature activation-5.40e-02

Pos logit contributions
ore+0.072
year+0.069
fire+0.061
 gen+0.060
 ingredient+0.059
Neg logit contributions
 pads-0.075
 tails-0.061
ety-0.059
 sunglasses-0.058
 Indians-0.056
Token in
Token ID287
Feature activation-4.23e-02

Pos logit contributions
ore+0.118
year+0.111
 healer+0.110
fire+0.110
 gen+0.109
Neg logit contributions
 pads-0.079
ety-0.073
 dads-0.064
 tails-0.060
 sunglasses-0.060
Token a
Token ID257
Feature activation-5.49e-02

Pos logit contributions
ore+0.091
year+0.087
 gen+0.081
 healer+0.080
le+0.080
Neg logit contributions
 pads-0.065
 tails-0.057
 sunglasses-0.056
 dads-0.055
 sunscreen-0.053
Token loud
Token ID7812
Feature activation-5.59e-02

Pos logit contributions
ore+0.099
year+0.099
le+0.084
abel+0.082
 healer+0.080
Neg logit contributions
 pads-0.109
 tails-0.097
 sunglasses-0.092
 dads-0.092
 pad-0.090
Token voice
Token ID3809
Feature activation-9.43e-02

Pos logit contributions
year+0.091
ore+0.083
 healer+0.075
 operating+0.068
 ingredient+0.066
Neg logit contributions
 pads-0.123
 tails-0.115
 sunglasses-0.107
coat-0.107
 tail-0.106
Token.
Token ID13
Feature activation-5.42e-02

Pos logit contributions
 healer+0.223
 goblin+0.220
year+0.219
body+0.212
ore+0.210
Neg logit contributions
 pads-0.103
ety-0.082
 sunglasses-0.069
 dads-0.067
burn-0.066
Token\n
Token ID198
Feature activation-4.32e-02

Pos logit contributions
le+0.119
year+0.118
ore+0.117
 x+0.116
 ingredient+0.115
Neg logit contributions
 pads-0.078
ety-0.059
 Indians-0.056
 pad-0.054
ibi-0.052
Token\n
Token ID198
Feature activation-4.14e-02

Pos logit contributions
year+0.092
 goblin+0.089
ore+0.089
 healer+0.087
 ingredient+0.086
Neg logit contributions
 pads-0.071
ety-0.060
ibi-0.054
 Indians-0.051
 dads-0.050
TokenS
Token ID50
Feature activation-2.35e-02

Pos logit contributions
ore+0.097
 x+0.094
year+0.091
 ingredient+0.090
 goblin+0.087
Neg logit contributions
 pads-0.060
ety-0.051
 Indians-0.045
 dads-0.045
 sunscreen-0.042
Tokenara
Token ID3301
Feature activation-1.86e-02

Pos logit contributions
year+0.057
ore+0.054
 gen+0.054
 goblin+0.052
 x+0.052
Neg logit contributions
 pads-0.039
ety-0.034
burn-0.029
coat-0.029
 tails-0.028
TokenHello
Token ID15496
Feature activation-1.06e-02

Pos logit contributions
 pads+0.009
ety+0.007
 tails+0.007
burn+0.007
 Indians+0.007
Neg logit contributions
ore-0.008
 x-0.008
year-0.008
 gen-0.007
 goblin-0.007
Token?
Token ID30
Feature activation-3.35e-02

Pos logit contributions
ore+0.019
year+0.017
 x+0.016
 gen+0.016
 goblin+0.015
Neg logit contributions
 pads-0.025
ety-0.022
 tails-0.020
burn-0.020
coat-0.020
Token Is
Token ID1148
Feature activation+4.85e-03

Pos logit contributions
ore+0.057
 x+0.052
year+0.052
 gen+0.051
 healer+0.050
Neg logit contributions
 pads-0.077
ety-0.066
 Indians-0.062
coat-0.061
 pad-0.061
Token someone
Token ID2130
Feature activation-1.85e-02

Pos logit contributions
 pads+0.012
ety+0.010
 tails+0.010
burn+0.010
 Indians+0.010
Neg logit contributions
ore-0.008
 x-0.007
year-0.007
 gen-0.007
 goblin-0.006
Token there
Token ID612
Feature activation-2.07e-03

Pos logit contributions
ore+0.034
year+0.031
 gen+0.031
 x+0.030
 goblin+0.029
Neg logit contributions
 pads-0.043
ety-0.037
coat-0.033
 Indians-0.033
burn-0.033
Token?"
Token ID1701
Feature activation-9.39e-02

Pos logit contributions
ore+0.003
year+0.003
 x+0.003
 gen+0.003
 goblin+0.002
Neg logit contributions
 pads-0.006
ety-0.005
 tails-0.005
burn-0.005
 Indians-0.005
Token\n
Token ID198
Feature activation-5.27e-02

Pos logit contributions
year+0.224
ore+0.215
 goblin+0.209
 ingredient+0.203
auc+0.202
Neg logit contributions
 pads-0.099
 Cle-0.085
ety-0.083
 swung-0.081
 sunglasses-0.075
Token\n
Token ID198
Feature activation-4.76e-02

Pos logit contributions
ore+0.117
 goblin+0.114
year+0.113
 ingredient+0.109
 x+0.109
Neg logit contributions
 pads-0.080
ety-0.065
coat-0.060
burn-0.056
 Indians-0.055
TokenLu
Token ID25596
Feature activation-2.20e-02

Pos logit contributions
ore+0.100
 goblin+0.091
 x+0.090
year+0.089
 ingredient+0.087
Neg logit contributions
 pads-0.084
ety-0.071
coat-0.069
 Indians-0.064
 swung-0.063
Tokency
Token ID948
Feature activation-3.21e-02

Pos logit contributions
ore+0.053
 x+0.052
 gen+0.051
year+0.050
 goblin+0.049
Neg logit contributions
 pads-0.037
ety-0.032
coat-0.027
burn-0.027
 pad-0.026
Token was
Token ID373
Feature activation-2.74e-02

Pos logit contributions
ore+0.067
year+0.061
 x+0.059
 healer+0.057
 gen+0.056
Neg logit contributions
 pads-0.068
ety-0.055
 tails-0.053
 pad-0.052
 sunglasses-0.049
Token
Token ID220
Feature activation-2.31e-03

Pos logit contributions
year+0.090
 x+0.090
ore+0.088
 gen+0.085
 goblin+0.085
Neg logit contributions
 pads-0.086
 tails-0.066
ety-0.065
 pad-0.062
coat-0.059
Token\n
Token ID198
Feature activation-3.62e-02

Pos logit contributions
ore+0.004
 x+0.004
year+0.003
 gen+0.003
 goblin+0.003
Neg logit contributions
 pads-0.006
ety-0.005
 tails-0.005
coat-0.004
burn-0.004
Token\n
Token ID198
Feature activation-3.76e-02

Pos logit contributions
 x+0.073
ore+0.073
year+0.071
 gen+0.071
 goblin+0.070
Neg logit contributions
 pads-0.072
ety-0.057
 tails-0.053
 pad-0.052
coat-0.052
Token"
Token ID1
Feature activation+3.53e-02

Pos logit contributions
ore+0.084
 x+0.080
 gen+0.077
year+0.077
 goblin+0.077
Neg logit contributions
 pads-0.070
 tails-0.055
ety-0.052
coat-0.050
 Indians-0.050
TokenHello
Token ID15496
Feature activation-1.46e-02

Pos logit contributions
 pads+0.082
 tails+0.064
ety+0.063
coat+0.060
 Indians+0.060
Neg logit contributions
ore-0.065
 gen-0.065
year-0.064
 x-0.059
fire-0.056
Token!"
Token ID2474
Feature activation-9.18e-02

Pos logit contributions
ore+0.028
year+0.027
 x+0.025
 gen+0.025
 goblin+0.024
Neg logit contributions
 pads-0.032
ety-0.027
 tails-0.025
burn-0.025
 pad-0.024
Token said
Token ID531
Feature activation-5.69e-02

Pos logit contributions
year+0.213
ore+0.200
 x+0.194
auc+0.189
 ingredient+0.187
Neg logit contributions
 pads-0.135
 tails-0.121
 swung-0.106
 ho-0.100
 tail-0.098
Token Pen
Token ID7507
Feature activation-1.36e-02

Pos logit contributions
ore+0.121
year+0.120
 x+0.111
 goblin+0.108
auc+0.107
Neg logit contributions
 pads-0.095
 tails-0.084
ety-0.079
burn-0.079
 tail-0.073
Tokenelope
Token ID47329
Feature activation-2.03e-02

Pos logit contributions
ore+0.021
year+0.019
 x+0.019
 gen+0.018
 operating+0.017
Neg logit contributions
 pads-0.037
ety-0.031
 tails-0.029
 pad-0.029
burn-0.029
Token in
Token ID287
Feature activation-1.99e-02

Pos logit contributions
ore+0.036
year+0.033
 x+0.033
 gen+0.032
 goblin+0.031
Neg logit contributions
 pads-0.049
ety-0.040
 tails-0.039
 pad-0.038
burn-0.037
Token a
Token ID257
Feature activation-8.06e-03

Pos logit contributions
ore+0.038
year+0.033
 x+0.031
 gen+0.031
 operating+0.030
Neg logit contributions
 pads-0.045
 tails-0.038
 sunglasses-0.036
ety-0.035
 Indians-0.035
Token "
Token ID366
Feature activation-1.79e-02

Pos logit contributions
ore+0.159
 healer+0.158
handed+0.151
 x+0.151
body+0.146
Neg logit contributions
 pads-0.083
ety-0.058
 tails-0.058
 pad-0.057
coat-0.056
TokenWhat
Token ID2061
Feature activation-2.74e-03

Pos logit contributions
ore+0.036
 x+0.033
 gen+0.031
 healer+0.031
year+0.031
Neg logit contributions
 pads-0.038
ety-0.032
burn-0.029
 tails-0.029
coat-0.029
Token are
Token ID389
Feature activation+3.12e-03

Pos logit contributions
ore+0.006
year+0.005
 x+0.005
 gen+0.005
 goblin+0.005
Neg logit contributions
 pads-0.006
ety-0.005
 tails-0.004
 Indians-0.004
burn-0.004
Token you
Token ID345
Feature activation+1.51e-02

Pos logit contributions
 pads+0.008
ety+0.007
 tails+0.006
 Indians+0.006
burn+0.006
Neg logit contributions
ore-0.005
 x-0.005
year-0.005
 gen-0.005
 goblin-0.004
Token doing
Token ID1804
Feature activation-3.57e-02

Pos logit contributions
 pads+0.034
ety+0.027
 tails+0.026
coat+0.025
burn+0.024
Neg logit contributions
ore-0.031
 x-0.029
 gen-0.028
year-0.028
 goblin-0.027
Token?"
Token ID1701
Feature activation-9.17e-02

Pos logit contributions
year+0.066
ore+0.063
 gen+0.057
old+0.055
 goblin+0.054
Neg logit contributions
 pads-0.074
 sunglasses-0.064
burn-0.062
ety-0.062
coat-0.062
Token It
Token ID632
Feature activation-1.46e-02

Pos logit contributions
 healer+0.207
ore+0.203
 x+0.201
year+0.201
handed+0.198
Neg logit contributions
 pads-0.107
 swung-0.080
ety-0.078
 played-0.076
 Sam-0.074
Token was
Token ID373
Feature activation-2.21e-02

Pos logit contributions
ore+0.025
year+0.023
 x+0.022
 gen+0.021
 healer+0.020
Neg logit contributions
 pads-0.036
ety-0.030
 tails-0.029
burn-0.028
 Indians-0.028
Token her
Token ID607
Feature activation+1.80e-02

Pos logit contributions
ore+0.044
year+0.039
 healer+0.036
 gen+0.036
 x+0.035
Neg logit contributions
 pads-0.048
ety-0.040
 tails-0.039
 sunglasses-0.038
burn-0.038
Token mum
Token ID25682
Feature activation-1.05e-02

Pos logit contributions
 pads+0.039
ety+0.032
 Indians+0.030
 tails+0.029
burn+0.029
Neg logit contributions
ore-0.037
 x-0.035
 gen-0.034
year-0.032
 goblin-0.032
Token.
Token ID13
Feature activation-3.68e-02

Pos logit contributions
ore+0.019
year+0.017
 x+0.017
 gen+0.016
 goblin+0.016
Neg logit contributions
 pads-0.025
ety-0.021
 tails-0.020
burn-0.020
coat-0.020
Token that
Token ID326
Feature activation-2.57e-02

Pos logit contributions
ore+0.047
year+0.044
 x+0.042
 goblin+0.040
 gen+0.040
Neg logit contributions
 pads-0.076
ety-0.061
coat-0.060
 sunglasses-0.060
 pad-0.060
Token said
Token ID531
Feature activation-6.89e-03

Pos logit contributions
ore+0.051
 x+0.047
year+0.047
 gen+0.045
 goblin+0.044
Neg logit contributions
 pads-0.056
ety-0.044
 tails-0.042
 sunglasses-0.041
coat-0.041
Token '
Token ID705
Feature activation-1.23e-02

Pos logit contributions
ore+0.015
 x+0.014
year+0.013
 gen+0.013
 goblin+0.013
Neg logit contributions
 pads-0.014
ety-0.011
 tails-0.011
coat-0.010
 Indians-0.010
TokenNo
Token ID2949
Feature activation-1.52e-02

Pos logit contributions
ore+0.023
 x+0.021
 gen+0.021
year+0.021
 goblin+0.020
Neg logit contributions
 pads-0.028
ety-0.024
burn-0.022
coat-0.022
 tails-0.021
Token entry
Token ID5726
Feature activation-4.02e-02

Pos logit contributions
ore+0.027
year+0.025
 gen+0.024
 x+0.024
 goblin+0.022
Neg logit contributions
 pads-0.036
 tails-0.029
ety-0.029
 pad-0.028
coat-0.028
Token'.
Token ID4458
Feature activation-9.17e-02

Pos logit contributions
ore+0.063
 gen+0.059
 x+0.059
year+0.057
 goblin+0.057
Neg logit contributions
 pads-0.102
ety-0.082
 pad-0.081
 tails-0.079
coat-0.078
Token But
Token ID887
Feature activation-6.04e-02

Pos logit contributions
ore+0.170
year+0.161
 x+0.160
 potion+0.155
 healer+0.153
Neg logit contributions
 pads-0.139
 tails-0.126
 wrists-0.120
 pad-0.116
're-0.111
Token Annie
Token ID25860
Feature activation-3.21e-02

Pos logit contributions
ore+0.131
year+0.121
 x+0.120
looking+0.117
handed+0.115
Neg logit contributions
 pads-0.098
 tails-0.084
 pad-0.077
ety-0.076
 sunglasses-0.076
Token was
Token ID373
Feature activation-2.99e-02

Pos logit contributions
ore+0.069
year+0.065
 x+0.064
 gen+0.062
 goblin+0.060
Neg logit contributions
 pads-0.065
ety-0.052
 pad-0.049
 tails-0.049
coat-0.046
Token eager
Token ID11069
Feature activation+5.07e-03

Pos logit contributions
ore+0.064
 gen+0.055
 x+0.054
oc+0.053
year+0.053
Neg logit contributions
 pads-0.061
 pad-0.048
ety-0.047
 tails-0.046
 sunglasses-0.046
Token to
Token ID284
Feature activation-1.59e-02

Pos logit contributions
 pads+0.012
ety+0.010
 tails+0.009
 Indians+0.009
burn+0.009
Neg logit contributions
ore-0.010
 x-0.009
year-0.009
 gen-0.009
 goblin-0.008
Token you
Token ID345
Feature activation-3.52e-03

Pos logit contributions
 pads+0.062
ety+0.053
coat+0.052
burn+0.051
 tails+0.051
Neg logit contributions
ore-0.028
 x-0.028
 gen-0.025
year-0.023
 operating-0.022
Token like
Token ID588
Feature activation+1.50e-02

Pos logit contributions
ore+0.008
year+0.007
 x+0.007
 gen+0.007
 goblin+0.007
Neg logit contributions
 pads-0.007
ety-0.006
 tails-0.006
coat-0.005
 pad-0.005
Token it
Token ID340
Feature activation-6.76e-03

Pos logit contributions
 pads+0.040
ety+0.034
 tails+0.032
 Indians+0.032
coat+0.032
Neg logit contributions
ore-0.024
 x-0.021
 gen-0.020
year-0.020
 operating-0.018
Token?"
Token ID1701
Feature activation-5.04e-02

Pos logit contributions
ore+0.010
year+0.009
 x+0.009
 gen+0.009
 goblin+0.008
Neg logit contributions
 pads-0.018
ety-0.015
burn-0.015
 tails-0.015
coat-0.014
Token the
Token ID262
Feature activation+6.15e-03

Pos logit contributions
year+0.133
ore+0.130
fire+0.123
 ingredient+0.122
 potion+0.120
Neg logit contributions
 pads-0.066
ety-0.053
 tails-0.048
 sunglasses-0.042
 dads-0.041
Token bar
Token ID2318
Feature activation-9.16e-02

Pos logit contributions
 pads+0.015
ety+0.013
 tails+0.012
burn+0.012
 Indians+0.012
Neg logit contributions
ore-0.011
 x-0.010
year-0.009
 gen-0.009
 goblin-0.009
Tokenber
Token ID527
Feature activation-6.13e-02

Pos logit contributions
 potion+0.186
 x+0.182
 cellar+0.175
 operating+0.171
ore+0.171
Neg logit contributions
 pads-0.143
ety-0.128
 dads-0.117
ked-0.107
 pad-0.104
Token asked
Token ID1965
Feature activation-3.50e-02

Pos logit contributions
ore+0.137
year+0.131
body+0.123
 healer+0.121
fire+0.120
Neg logit contributions
 pads-0.113
ety-0.092
 swung-0.083
 pad-0.083
coat-0.077
Token.
Token ID13
Feature activation-4.72e-02

Pos logit contributions
ore+0.069
year+0.063
 gen+0.063
 x+0.062
 healer+0.061
Neg logit contributions
 pads-0.074
ety-0.061
 tails-0.058
 pad-0.054
coat-0.053
Token\n
Token ID198
Feature activation-3.70e-02

Pos logit contributions
ore+0.107
year+0.104
fire+0.098
 ingredient+0.097
 healer+0.096
Neg logit contributions
 pads-0.083
ety-0.058
 pad-0.058
 tails-0.056
 Indians-0.055
Token\n
Token ID198
Feature activation-2.18e-02

Pos logit contributions
year+0.078
ore+0.078
 gen+0.075
 x+0.074
 healer+0.073
Neg logit contributions
 pads-0.068
ety-0.054
 tails-0.048
ibi-0.048
 Indians-0.047
Token did
Token ID750
Feature activation-3.42e-02

Pos logit contributions
ore+0.030
year+0.028
 gen+0.028
 x+0.027
 goblin+0.026
Neg logit contributions
 pads-0.025
ety-0.020
 tails-0.018
 sunglasses-0.018
 Indians-0.018
Token you
Token ID345
Feature activation-2.49e-02

Pos logit contributions
ore+0.072
year+0.069
 gen+0.062
 goblin+0.062
 x+0.061
Neg logit contributions
 pads-0.069
ety-0.056
 tails-0.056
 sunglasses-0.055
 Indians-0.055
Token find
Token ID1064
Feature activation-4.20e-02

Pos logit contributions
ore+0.052
year+0.051
 gen+0.047
 goblin+0.046
 healer+0.045
Neg logit contributions
 pads-0.048
ety-0.041
 sunglasses-0.040
 Indians-0.038
burn-0.038
Token that
Token ID326
Feature activation-1.57e-02

Pos logit contributions
year+0.086
ore+0.081
 goblin+0.074
 healer+0.073
fire+0.072
Neg logit contributions
 pads-0.079
 sunglasses-0.072
ety-0.065
 sunscreen-0.064
bur-0.063
Token handle
Token ID5412
Feature activation+3.21e-04

Pos logit contributions
ore+0.026
year+0.025
 gen+0.022
 x+0.021
 goblin+0.021
Neg logit contributions
 pads-0.038
ety-0.032
burn-0.031
 sunglasses-0.031
coat-0.031
Token?"
Token ID1701
Feature activation-9.16e-02

Pos logit contributions
 pads+0.001
ety+0.001
 tails+0.001
burn+0.001
 Indians+0.001
Neg logit contributions
ore-0.001
year-0.000
 x-0.000
 gen-0.000
 goblin-0.000
Token Clara
Token ID27443
Feature activation-4.06e-02

Pos logit contributions
 potion+0.222
auc+0.221
year+0.219
ore+0.218
 x+0.213
Neg logit contributions
 pads-0.079
 wrists-0.077
 swung-0.067
 sunglasses-0.065
 neck-0.063
Token looked
Token ID3114
Feature activation-3.24e-02

Pos logit contributions
ore+0.084
year+0.080
 x+0.078
 gen+0.074
 operating+0.073
Neg logit contributions
 pads-0.081
ety-0.070
 pad-0.062
 tails-0.061
lets-0.060
Token up
Token ID510
Feature activation-3.02e-02

Pos logit contributions
ore+0.062
 gen+0.061
 x+0.060
year+0.057
fire+0.056
Neg logit contributions
 pads-0.070
ety-0.059
 sunglasses-0.057
 pad-0.056
 tails-0.055
Token and
Token ID290
Feature activation+1.84e-02

Pos logit contributions
 x+0.067
ore+0.066
 gen+0.065
year+0.063
 healer+0.061
Neg logit contributions
 pads-0.055
ety-0.046
 sunglasses-0.043
 tails-0.042
 pad-0.041
Token saw
Token ID2497
Feature activation-2.77e-02

Pos logit contributions
 pads+0.044
ety+0.036
burn+0.036
 tails+0.036
coat+0.035
Neg logit contributions
ore-0.033
year-0.030
 gen-0.029
 x-0.029
 goblin-0.028
Token Who
Token ID5338
Feature activation-1.78e-02

Pos logit contributions
ore+0.055
year+0.053
 x+0.052
 healer+0.050
 goblin+0.050
Neg logit contributions
 pads-0.056
ety-0.046
 tails-0.044
 Indians-0.044
 pad-0.041
Token are
Token ID389
Feature activation-2.89e-02

Pos logit contributions
ore+0.041
year+0.039
 gen+0.037
 x+0.037
 goblin+0.037
Neg logit contributions
 pads-0.033
ety-0.026
 tails-0.024
burn-0.023
 Indians-0.023
Token you
Token ID345
Feature activation+2.59e-03

Pos logit contributions
ore+0.062
year+0.059
 goblin+0.056
 gen+0.056
 x+0.055
Neg logit contributions
 pads-0.054
ety-0.045
 Indians-0.043
 tails-0.043
 sunglasses-0.042
Token?"
Token ID1701
Feature activation-4.58e-02

Pos logit contributions
 pads+0.007
ety+0.006
 tails+0.006
burn+0.006
 Indians+0.006
Neg logit contributions
ore-0.004
year-0.003
 x-0.003
 gen-0.003
 goblin-0.003
Token a
Token ID257
Feature activation-4.42e-02

Pos logit contributions
ore+0.138
year+0.133
 goblin+0.125
 ingredient+0.123
 potion+0.123
Neg logit contributions
 pads-0.040
 tails-0.035
 dads-0.033
ety-0.031
 sunglasses-0.029
Token voice
Token ID3809
Feature activation-9.16e-02

Pos logit contributions
ore+0.074
year+0.066
 x+0.059
abel+0.059
 operating+0.058
Neg logit contributions
 pads-0.095
 dads-0.086
 tails-0.085
ety-0.085
 Indians-0.085
Token says
Token ID1139
Feature activation-2.58e-02

Pos logit contributions
ore+0.187
year+0.179
 healer+0.179
 goblin+0.173
 ingredient+0.169
Neg logit contributions
 pads-0.142
 sunglasses-0.113
 swung-0.112
ety-0.112
 dads-0.112
Token.
Token ID13
Feature activation-2.66e-02

Pos logit contributions
ore+0.048
year+0.045
 healer+0.044
 gen+0.044
 x+0.044
Neg logit contributions
 pads-0.058
ety-0.046
 tails-0.046
coat-0.043
burn-0.043
Token The
Token ID383
Feature activation-4.39e-02

Pos logit contributions
ore+0.055
year+0.052
 x+0.049
 healer+0.048
 goblin+0.048
Neg logit contributions
 pads-0.054
ety-0.042
 tails-0.042
 pad-0.039
 Indians-0.039
Token voice
Token ID3809
Feature activation-6.83e-02

Pos logit contributions
year+0.079
ore+0.078
le+0.069
old+0.068
abel+0.066
Neg logit contributions
 pads-0.082
 tails-0.078
 sunglasses-0.073
 tail-0.073
burn-0.072
Token is
Token ID318
Feature activation+5.92e-03

Pos logit contributions
ore+0.146
year+0.134
 healer+0.133
handed+0.132
fire+0.130
Neg logit contributions
 pads-0.122
 tails-0.101
 swung-0.098
 sunglasses-0.096
 hoops-0.095
TokenWho
Token ID8241
Feature activation-2.40e-02

Pos logit contributions
ore+0.014
year+0.013
 x+0.013
 gen+0.013
 goblin+0.012
Neg logit contributions
 pads-0.012
ety-0.009
 tails-0.009
 Indians-0.009
burn-0.008
Token goes
Token ID2925
Feature activation-2.46e-02

Pos logit contributions
ore+0.049
year+0.045
 x+0.044
 gen+0.043
 goblin+0.042
Neg logit contributions
 pads-0.052
ety-0.042
 tails-0.040
 Indians-0.038
 sunglasses-0.038
Token there
Token ID612
Feature activation-8.68e-03

Pos logit contributions
ore+0.043
year+0.043
 gen+0.041
 x+0.040
 goblin+0.038
Neg logit contributions
 pads-0.056
ety-0.047
 sunglasses-0.045
burn-0.044
 Indians-0.044
Token?"
Token ID1701
Feature activation-2.39e-02

Pos logit contributions
ore+0.014
year+0.013
 x+0.012
 gen+0.012
 goblin+0.011
Neg logit contributions
 pads-0.022
ety-0.019
 tails-0.018
 Indians-0.018
coat-0.018
Token a
Token ID257
Feature activation-6.32e-02

Pos logit contributions
ore+0.047
 x+0.045
year+0.044
 gen+0.043
 goblin+0.041
Neg logit contributions
 pads-0.051
ety-0.042
 tails-0.040
 pad-0.038
 sunglasses-0.038
Token voice
Token ID3809
Feature activation-9.15e-02

Pos logit contributions
ore+0.112
year+0.101
 x+0.092
 operating+0.092
 ingredient+0.089
Neg logit contributions
 pads-0.116
 dads-0.112
 tails-0.107
ety-0.105
 Indians-0.104
Token said
Token ID531
Feature activation-3.51e-02

Pos logit contributions
ore+0.181
handed+0.171
 healer+0.171
 goblin+0.164
 x+0.163
Neg logit contributions
 pads-0.120
ked-0.114
 swung-0.110
 sunglasses-0.109
ety-0.105
Token from
Token ID422
Feature activation-4.90e-02

Pos logit contributions
ore+0.067
 x+0.064
year+0.063
 healer+0.060
 operating+0.060
Neg logit contributions
 pads-0.069
ety-0.056
 tails-0.054
 pad-0.054
coat-0.053
Token inside
Token ID2641
Feature activation-4.07e-02

Pos logit contributions
ore+0.099
year+0.087
 gen+0.082
 ingredient+0.081
washer+0.081
Neg logit contributions
 pads-0.087
 tails-0.080
 sunglasses-0.078
ety-0.076
 tail-0.071
Token.
Token ID13
Feature activation-2.73e-02

Pos logit contributions
ore+0.080
year+0.077
 x+0.077
 healer+0.073
 gen+0.073
Neg logit contributions
 pads-0.075
ety-0.063
 tails-0.062
coat-0.060
 sunglasses-0.059
Token\n
Token ID198
Feature activation-4.51e-02

Pos logit contributions
ore+0.054
year+0.050
 x+0.049
 gen+0.047
 healer+0.046
Neg logit contributions
 pads-0.058
ety-0.048
 tails-0.045
 pad-0.044
 Indians-0.044
Token's
Token ID338
Feature activation-1.34e-02

Pos logit contributions
ore+0.072
 x+0.066
year+0.063
 gen+0.062
 goblin+0.061
Neg logit contributions
 pads-0.073
ety-0.058
 pad-0.056
 tails-0.055
lets-0.054
Token mom
Token ID1995
Feature activation-1.49e-02

Pos logit contributions
ore+0.032
year+0.030
 x+0.028
 gen+0.028
 healer+0.028
Neg logit contributions
 pads-0.024
 tails-0.018
ety-0.018
 sunglasses-0.017
 pad-0.017
Token had
Token ID550
Feature activation-2.63e-04

Pos logit contributions
ore+0.031
 x+0.028
year+0.027
 gen+0.027
 goblin+0.026
Neg logit contributions
 pads-0.032
ety-0.026
 tails-0.024
coat-0.024
 pad-0.024
Token filled
Token ID5901
Feature activation-5.43e-02

Pos logit contributions
ore+0.001
 x+0.000
year+0.000
 gen+0.000
 goblin+0.000
Neg logit contributions
 pads-0.001
ety-0.000
 tails-0.000
 Indians-0.000
coat-0.000
Token the
Token ID262
Feature activation-6.57e-02

Pos logit contributions
ore+0.119
 x+0.112
 operating+0.111
year+0.109
 gen+0.109
Neg logit contributions
 pads-0.086
 sunglasses-0.074
ety-0.072
 tails-0.071
 sunscreen-0.071
Token chamber
Token ID11847
Feature activation-9.11e-02

Pos logit contributions
ore+0.102
year+0.099
ment+0.085
old+0.084
 gen+0.084
Neg logit contributions
 pads-0.149
 pad-0.131
 tails-0.130
 tail-0.124
 sunglasses-0.121
Token with
Token ID351
Feature activation-1.58e-02

Pos logit contributions
 x+0.174
ore+0.172
year+0.158
 gen+0.155
 goblin+0.152
Neg logit contributions
 pads-0.190
 pad-0.146
coat-0.140
 tails-0.138
lets-0.138
Token balloons
Token ID35485
Feature activation+1.56e-02

Pos logit contributions
ore+0.032
 x+0.028
year+0.028
 gen+0.027
 healer+0.026
Neg logit contributions
 pads-0.035
 tails-0.028
ety-0.027
 sunglasses-0.027
coat-0.026
Token,
Token ID11
Feature activation+1.64e-02

Pos logit contributions
 pads+0.041
ety+0.035
 tails+0.034
 Indians+0.033
burn+0.033
Neg logit contributions
ore-0.024
year-0.021
 x-0.021
 gen-0.021
 goblin-0.019
Token stream
Token ID4269
Feature activation+9.52e-03

Pos logit contributions
 pads+0.039
ety+0.033
burn+0.031
 tails+0.031
 Indians+0.030
Neg logit contributions
ore-0.029
year-0.027
 x-0.026
 gen-0.026
 goblin-0.024
Tokeners
Token ID364
Feature activation+3.30e-02

Pos logit contributions
 pads+0.028
ety+0.024
 tails+0.023
burn+0.023
 Indians+0.023
Neg logit contributions
ore-0.012
year-0.010
 x-0.010
 gen-0.010
 goblin-0.009
TokenWhat
Token ID2061
Feature activation-1.91e-02

Pos logit contributions
ore+0.005
 x+0.004
year+0.004
 gen+0.004
 goblin+0.004
Neg logit contributions
 pads-0.005
ety-0.004
 tails-0.004
burn-0.004
 Indians-0.004
Token are
Token ID389
Feature activation-6.00e-03

Pos logit contributions
ore+0.044
year+0.041
 gen+0.039
 x+0.039
 healer+0.038
Neg logit contributions
 pads-0.037
ety-0.028
 tails-0.026
 Indians-0.026
 sunglasses-0.026
Token you
Token ID345
Feature activation+3.93e-04

Pos logit contributions
ore+0.011
year+0.010
 x+0.010
 gen+0.010
 goblin+0.009
Neg logit contributions
 pads-0.014
ety-0.012
 tails-0.011
 Indians-0.011
 sunglasses-0.011
Token looking
Token ID2045
Feature activation-1.96e-02

Pos logit contributions
 pads+0.001
ety+0.001
 tails+0.001
 Indians+0.001
burn+0.001
Neg logit contributions
ore-0.001
year-0.001
 x-0.001
 gen-0.001
 goblin-0.001
Token for
Token ID329
Feature activation-4.97e-02

Pos logit contributions
ore+0.035
 gen+0.033
year+0.033
 x+0.033
 healer+0.030
Neg logit contributions
 pads-0.044
ety-0.037
 pad-0.035
 tails-0.035
 sunglasses-0.035
Token?"
Token ID1701
Feature activation-9.11e-02

Pos logit contributions
year+0.097
ore+0.091
 gen+0.085
 goblin+0.085
abel+0.082
Neg logit contributions
 pads-0.095
 sunglasses-0.082
ety-0.079
 Indians-0.079
 tails-0.078
Token the
Token ID262
Feature activation-2.71e-02

Pos logit contributions
ore+0.215
year+0.211
 ingredient+0.194
auc+0.192
 potion+0.192
Neg logit contributions
 pads-0.124
ety-0.091
 wrists-0.086
 tails-0.085
 Indians-0.084
Token fairy
Token ID25607
Feature activation-4.10e-02

Pos logit contributions
ore+0.047
year+0.041
 x+0.035
 operating+0.035
old+0.035
Neg logit contributions
 pads-0.065
ety-0.057
 sunglasses-0.055
 tails-0.054
 Indians-0.054
Token asked
Token ID1965
Feature activation-4.28e-02

Pos logit contributions
ore+0.087
year+0.078
 x+0.078
 operating+0.072
 healer+0.072
Neg logit contributions
 pads-0.083
ety-0.071
 pad-0.064
lets-0.063
 tails-0.062
Token.
Token ID13
Feature activation-4.28e-02

Pos logit contributions
ore+0.087
year+0.076
 healer+0.075
 gen+0.075
 x+0.075
Neg logit contributions
 pads-0.085
ety-0.070
 tails-0.066
 pad-0.066
coat-0.064
Token\n
Token ID198
Feature activation-3.55e-02

Pos logit contributions
ore+0.099
year+0.091
 x+0.087
 goblin+0.086
 gen+0.086
Neg logit contributions
 pads-0.073
ety-0.059
 pad-0.055
 Indians-0.054
coat-0.053
Token have
Token ID423
Feature activation+4.78e-03

Pos logit contributions
 pads+0.015
ety+0.013
 tails+0.012
 Indians+0.012
burn+0.012
Neg logit contributions
ore-0.013
 x-0.011
year-0.011
 gen-0.011
 goblin-0.011
Token to
Token ID284
Feature activation-1.31e-02

Pos logit contributions
 pads+0.011
ety+0.009
 tails+0.009
burn+0.009
 Indians+0.009
Neg logit contributions
ore-0.009
 x-0.008
year-0.008
 gen-0.008
 goblin-0.007
Token destroy
Token ID4117
Feature activation-2.26e-02

Pos logit contributions
ore+0.027
year+0.026
 goblin+0.023
 x+0.023
 gen+0.023
Neg logit contributions
 pads-0.027
ety-0.022
burn-0.022
 tails-0.022
 Indians-0.021
Token my
Token ID616
Feature activation+1.36e-02

Pos logit contributions
ore+0.041
year+0.041
 gen+0.037
 healer+0.036
 goblin+0.036
Neg logit contributions
 pads-0.052
 tails-0.042
ety-0.042
 sunglasses-0.040
burn-0.039
Token map
Token ID3975
Feature activation-1.39e-02

Pos logit contributions
 pads+0.031
ety+0.026
burn+0.025
coat+0.025
 Indians+0.024
Neg logit contributions
ore-0.026
 x-0.024
 gen-0.023
year-0.023
 goblin-0.022
Token?"
Token ID1701
Feature activation-9.11e-02

Pos logit contributions
ore+0.021
year+0.020
 x+0.018
 gen+0.018
 goblin+0.017
Neg logit contributions
 pads-0.037
ety-0.031
coat-0.030
burn-0.030
 tails-0.030
Token Jamie
Token ID17826
Feature activation-3.39e-02

Pos logit contributions
ore+0.211
 potion+0.198
fire+0.198
year+0.198
 healer+0.192
Neg logit contributions
 pads-0.095
 swung-0.088
 sunglasses-0.077
 tails-0.074
 toes-0.074
Token cried
Token ID16896
Feature activation-3.20e-02

Pos logit contributions
ore+0.071
year+0.066
 x+0.061
 gen+0.061
fire+0.058
Neg logit contributions
 pads-0.071
ety-0.059
 tails-0.055
 pad-0.054
burn-0.052
Token.
Token ID13
Feature activation-2.32e-02

Pos logit contributions
ore+0.063
year+0.057
 gen+0.056
 ingredient+0.055
 healer+0.054
Neg logit contributions
 pads-0.066
ety-0.058
burn-0.055
coat-0.052
 tails-0.052
Token\n
Token ID198
Feature activation-2.62e-02

Pos logit contributions
ore+0.045
year+0.043
 x+0.042
 gen+0.041
 healer+0.040
Neg logit contributions
 pads-0.048
ety-0.039
coat-0.036
 pad-0.036
 tails-0.036
Token\n
Token ID198
Feature activation-3.35e-02

Pos logit contributions
ore+0.050
year+0.048
 gen+0.048
 x+0.047
 goblin+0.047
Neg logit contributions
 pads-0.054
ety-0.045
ibi-0.041
 Indians-0.040
coat-0.039
Token suddenly
Token ID6451
Feature activation-8.47e-03

Pos logit contributions
ore+0.060
 x+0.055
year+0.054
 gen+0.052
 healer+0.052
Neg logit contributions
 pads-0.058
ety-0.046
 tails-0.044
 pad-0.044
 sunglasses-0.043
Token she
Token ID673
Feature activation+1.38e-02

Pos logit contributions
ore+0.016
year+0.015
 x+0.015
 gen+0.014
 healer+0.014
Neg logit contributions
 pads-0.019
ety-0.016
 tails-0.015
coat-0.014
burn-0.014
Token heard
Token ID2982
Feature activation-4.45e-02

Pos logit contributions
 pads+0.030
ety+0.025
 tails+0.024
burn+0.024
 Indians+0.024
Neg logit contributions
ore-0.027
 x-0.025
 gen-0.024
year-0.024
 goblin-0.023
Token a
Token ID257
Feature activation-4.85e-02

Pos logit contributions
ore+0.094
 healer+0.085
year+0.080
 gen+0.080
 goblin+0.080
Neg logit contributions
 pads-0.088
 sunglasses-0.071
 tails-0.068
ety-0.066
burn-0.064
Token voice
Token ID3809
Feature activation-6.28e-02

Pos logit contributions
ore+0.092
year+0.087
old+0.079
handed+0.077
 ingredient+0.077
Neg logit contributions
 pads-0.093
 tails-0.080
 pad-0.078
 Indians-0.077
 sunglasses-0.077
Token.
Token ID13
Feature activation-9.11e-02

Pos logit contributions
ore+0.124
year+0.123
 healer+0.122
 goblin+0.119
 ingredient+0.113
Neg logit contributions
 pads-0.110
ety-0.087
 sunglasses-0.087
burn-0.086
 pad-0.083
Token
Token ID198
Feature activation-6.55e-02

Pos logit contributions
 x+0.203
handed+0.196
ore+0.193
year+0.190
 healer+0.190
Neg logit contributions
 pads-0.117
 pad-0.089
 wo-0.086
 tails-0.084
ety-0.082
Token
Token ID198
Feature activation-7.39e-02

Pos logit contributions
year+0.152
 x+0.148
 goblin+0.147
 gen+0.146
 healer+0.144
Neg logit contributions
 pads-0.089
ety-0.060
ibi-0.059
 pad-0.057
coat-0.057
TokenIt
Token ID1026
Feature activation-2.68e-02

Pos logit contributions
 x+0.178
ore+0.177
year+0.167
 goblin+0.167
 gen+0.167
Neg logit contributions
 pads-0.082
 tails-0.057
 Indians-0.052
ety-0.052
coat-0.050
Token was
Token ID373
Feature activation-1.90e-02

Pos logit contributions
ore+0.049
year+0.045
 x+0.043
 healer+0.041
 ingredient+0.040
Neg logit contributions
 pads-0.062
ety-0.052
 tails-0.049
burn-0.048
 Indians-0.048
Token a
Token ID257
Feature activation-3.53e-02

Pos logit contributions
ore+0.039
year+0.035
 gen+0.033
 healer+0.032
 goblin+0.032
Neg logit contributions
 pads-0.039
ety-0.033
 tails-0.031
burn-0.031
 sunglasses-0.031
Token the
Token ID262
Feature activation+3.44e-03

Pos logit contributions
 pads+0.020
ety+0.017
 tails+0.016
coat+0.016
burn+0.016
Neg logit contributions
ore-0.012
 x-0.011
year-0.011
 gen-0.010
 goblin-0.010
Token sky
Token ID6766
Feature activation+3.47e-02

Pos logit contributions
 pads+0.008
ety+0.006
 tails+0.006
burn+0.006
 Indians+0.006
Neg logit contributions
ore-0.007
 x-0.006
year-0.006
 gen-0.006
 goblin-0.006
Token,
Token ID11
Feature activation+5.03e-02

Pos logit contributions
 pads+0.090
ety+0.078
 Indians+0.077
burn+0.074
 sunglasses+0.074
Neg logit contributions
ore-0.055
year-0.045
 gen-0.045
 x-0.042
 healer-0.041
Token and
Token ID290
Feature activation+3.32e-02

Pos logit contributions
 pads+0.130
ety+0.110
ibi+0.109
 Indians+0.106
burn+0.105
Neg logit contributions
ore-0.070
year-0.064
 gen-0.063
 x-0.061
 healer-0.055
Token Tom
Token ID4186
Feature activation+4.58e-02

Pos logit contributions
 pads+0.076
ety+0.064
coat+0.063
burn+0.062
 tails+0.060
Neg logit contributions
ore-0.058
year-0.054
 x-0.052
 gen-0.052
 operating-0.048
Token always
Token ID1464
Feature activation+5.45e-02

Pos logit contributions
 pads+0.107
ety+0.092
burn+0.090
 Indians+0.090
ibi+0.089
Neg logit contributions
ore-0.078
 gen-0.070
year-0.065
 x-0.063
 healer-0.063
Token helped
Token ID4193
Feature activation-1.49e-02

Pos logit contributions
 pads+0.127
ibi+0.104
burn+0.101
ety+0.101
coat+0.100
Neg logit contributions
ore-0.085
 gen-0.084
year-0.082
 x-0.075
 goblin-0.075
Token the
Token ID262
Feature activation+1.26e-02

Pos logit contributions
ore+0.031
year+0.030
 gen+0.028
 goblin+0.028
 x+0.027
Neg logit contributions
 pads-0.031
ety-0.025
 tails-0.024
burn-0.023
 Indians-0.023
Token bird
Token ID6512
Feature activation-1.57e-02

Pos logit contributions
 pads+0.033
ety+0.027
coat+0.026
 tails+0.026
 Indians+0.026
Neg logit contributions
ore-0.020
 x-0.019
 gen-0.019
year-0.018
 healer-0.017
Token by
Token ID416
Feature activation+2.61e-02

Pos logit contributions
ore+0.029
year+0.028
 x+0.027
 gen+0.027
 goblin+0.026
Neg logit contributions
 pads-0.037
ety-0.030
 tails-0.029
burn-0.028
 pad-0.027
Token snapping
Token ID35733
Feature activation+5.85e-03

Pos logit contributions
 pads+0.067
coat+0.056
ety+0.055
 Indians+0.052
ibi+0.052
Neg logit contributions
ore-0.043
 x-0.040
 gen-0.040
year-0.038
 operating-0.035
Token hats
Token ID23910
Feature activation+7.60e-02

Pos logit contributions
 pads+0.061
ety+0.049
burn+0.047
ibi+0.046
 Indians+0.045
Neg logit contributions
 x-0.090
ore-0.088
 gen-0.086
year-0.084
 goblin-0.080
Token,
Token ID11
Feature activation+4.88e-02

Pos logit contributions
 pads+0.237
 Indians+0.227
ety+0.224
 tails+0.215
ibi+0.210
Neg logit contributions
ore-0.059
 of-0.053
 from-0.030
 x-0.029
 with-0.028
Token boots
Token ID14412
Feature activation+7.84e-02

Pos logit contributions
 pads+0.102
ibi+0.090
ety+0.090
burn+0.085
 Indians+0.085
Neg logit contributions
 x-0.080
ore-0.078
 gen-0.076
year-0.072
le-0.068
Token and
Token ID290
Feature activation+3.80e-02

Pos logit contributions
 pads+0.237
 Indians+0.228
ety+0.224
 tails+0.213
burn+0.209
Neg logit contributions
ore-0.067
 of-0.065
 x-0.038
 gen-0.035
fire-0.035
Token m
Token ID285
Feature activation+4.63e-02

Pos logit contributions
 pads+0.060
ety+0.051
burn+0.045
ibi+0.044
 tails+0.043
Neg logit contributions
 x-0.087
ore-0.085
year-0.084
 gen-0.084
 goblin-0.078
Tokenittens
Token ID34978
Feature activation+5.42e-02

Pos logit contributions
 pads+0.093
ety+0.079
 Indians+0.078
 pad+0.075
 tails+0.072
Neg logit contributions
ore-0.100
 x-0.085
year-0.082
old-0.080
 gen-0.079
Token.
Token ID13
Feature activation+2.87e-02

Pos logit contributions
 pads+0.166
ety+0.158
 Indians+0.153
ibi+0.144
 tails+0.144
Neg logit contributions
ore-0.057
 x-0.033
fire-0.033
 of-0.029
year-0.029
Token Lily
Token ID20037
Feature activation+2.23e-02

Pos logit contributions
 pads+0.074
burn+0.063
ety+0.062
 Indians+0.061
bur+0.060
Neg logit contributions
ore-0.046
 x-0.036
year-0.035
 operating-0.035
 gen-0.033
Token had
Token ID550
Feature activation+4.59e-02

Pos logit contributions
 pads+0.048
ety+0.041
burn+0.040
 tails+0.039
 Indians+0.038
Neg logit contributions
ore-0.042
 x-0.038
 gen-0.038
year-0.037
 goblin-0.036
Token thick
Token ID6546
Feature activation+4.58e-02

Pos logit contributions
 pads+0.144
ety+0.137
burn+0.133
inas+0.132
 Indians+0.131
Neg logit contributions
 x-0.026
ore-0.020
 gen-0.019
year-0.018
 operating-0.018
Token,
Token ID11
Feature activation+5.07e-02

Pos logit contributions
 pads+0.065
ety+0.060
 Indians+0.052
burn+0.048
lets+0.045
Neg logit contributions
ore-0.112
 x-0.112
 gen-0.105
 operating-0.100
 goblin-0.097
Token,
Token ID11
Feature activation+3.94e-02

Pos logit contributions
 pads+0.026
ety+0.021
 tails+0.020
 Indians+0.020
coat+0.020
Neg logit contributions
ore-0.017
 x-0.015
year-0.015
 gen-0.015
 goblin-0.014
Token John
Token ID1757
Feature activation+1.64e-02

Pos logit contributions
 pads+0.103
ety+0.083
burn+0.081
coat+0.080
 tails+0.079
Neg logit contributions
ore-0.057
 x-0.056
year-0.055
 gen-0.054
 goblin-0.051
Token and
Token ID290
Feature activation+4.51e-02

Pos logit contributions
 pads+0.043
ety+0.036
 Indians+0.035
 tails+0.035
coat+0.035
Neg logit contributions
ore-0.025
 x-0.022
year-0.022
 gen-0.022
 healer-0.020
Token Jane
Token ID12091
Feature activation+3.61e-02

Pos logit contributions
 pads+0.113
 pad+0.091
 tails+0.090
ety+0.088
coat+0.087
Neg logit contributions
 gen-0.067
ore-0.066
year-0.066
 x-0.066
 goblin-0.065
Token.
Token ID13
Feature activation+2.86e-02

Pos logit contributions
 pads+0.113
 Indians+0.101
coat+0.097
ety+0.095
burn+0.095
Neg logit contributions
ore-0.038
year-0.030
 x-0.027
 gen-0.027
old-0.027
Token They
Token ID1119
Feature activation+5.80e-02

Pos logit contributions
 pads+0.070
ety+0.059
burn+0.057
 Indians+0.057
coat+0.056
Neg logit contributions
ore-0.047
year-0.043
 x-0.043
 gen-0.041
 operating-0.040
Token both
Token ID1111
Feature activation+3.79e-02

Pos logit contributions
 pads+0.146
burn+0.130
 Indians+0.117
bur+0.116
coat+0.116
Neg logit contributions
year-0.086
ore-0.083
 of-0.071
le-0.071
old-0.067
Token liked
Token ID8288
Feature activation+3.36e-02

Pos logit contributions
 pads+0.085
burn+0.071
 Indians+0.070
ety+0.069
 tails+0.066
Neg logit contributions
ore-0.071
year-0.068
 gen-0.060
 x-0.059
old-0.057
Token to
Token ID284
Feature activation+3.06e-02

Pos logit contributions
 pads+0.092
ety+0.077
coat+0.074
 pad+0.073
ibi+0.073
Neg logit contributions
ore-0.049
 x-0.044
year-0.041
 healer-0.039
 operating-0.039
Token play
Token ID711
Feature activation-1.12e-03

Pos logit contributions
 pads+0.066
ety+0.053
 pad+0.049
burn+0.049
 tails+0.048
Neg logit contributions
ore-0.062
 x-0.058
 gen-0.055
year-0.053
 operating-0.053
Token outside
Token ID2354
Feature activation+1.06e-02

Pos logit contributions
ore+0.002
year+0.002
 x+0.002
 gen+0.002
 goblin+0.002
Neg logit contributions
 pads-0.003
ety-0.002
 tails-0.002
 Indians-0.002
coat-0.002
Token on
Token ID319
Feature activation-7.25e-04

Pos logit contributions
 pads+0.042
ety+0.036
 tails+0.034
coat+0.033
 Indians+0.033
Neg logit contributions
ore-0.029
year-0.026
 x-0.025
 gen-0.024
 operating-0.023
Token,
Token ID11
Feature activation+1.18e-02

Pos logit contributions
ore+0.001
year+0.001
 x+0.001
 gen+0.001
 goblin+0.001
Neg logit contributions
 pads-0.002
ety-0.001
 tails-0.001
burn-0.001
 Indians-0.001
Token Lily
Token ID20037
Feature activation+4.20e-02

Pos logit contributions
 pads+0.028
ety+0.022
 tails+0.021
burn+0.021
coat+0.021
Neg logit contributions
ore-0.022
 x-0.020
year-0.020
 gen-0.020
 goblin-0.019
Token and
Token ID290
Feature activation+1.46e-02

Pos logit contributions
 pads+0.100
ety+0.087
 tails+0.085
coat+0.083
 Indians+0.082
Neg logit contributions
ore-0.068
 gen-0.061
year-0.059
 x-0.058
 healer-0.058
Token her
Token ID607
Feature activation+5.26e-03

Pos logit contributions
 pads+0.034
ety+0.027
coat+0.026
 tails+0.026
burn+0.026
Neg logit contributions
ore-0.027
 x-0.026
year-0.024
 gen-0.024
 goblin-0.024
Token brother
Token ID3956
Feature activation+7.82e-02

Pos logit contributions
 pads+0.011
ety+0.008
 tails+0.008
burn+0.008
 Indians+0.008
Neg logit contributions
ore-0.012
 x-0.011
year-0.011
 gen-0.010
 goblin-0.010
Token would
Token ID561
Feature activation+3.93e-02

Pos logit contributions
 pads+0.161
ety+0.140
coat+0.138
burn+0.134
ibi+0.131
Neg logit contributions
year-0.126
abel-0.126
ore-0.125
old-0.120
 gen-0.117
Token go
Token ID467
Feature activation+3.32e-03

Pos logit contributions
 pads+0.090
ety+0.075
coat+0.070
ibi+0.068
burn+0.068
Neg logit contributions
ore-0.072
 gen-0.067
 x-0.067
 operating-0.064
 healer-0.063
Token to
Token ID284
Feature activation-1.52e-02

Pos logit contributions
 pads+0.008
ety+0.006
 tails+0.006
burn+0.006
 Indians+0.006
Neg logit contributions
ore-0.006
 x-0.006
year-0.006
 gen-0.006
 goblin-0.005
Token their
Token ID511
Feature activation-1.53e-02

Pos logit contributions
ore+0.029
year+0.028
 gen+0.026
 x+0.025
 goblin+0.024
Neg logit contributions
 pads-0.034
ety-0.028
 tails-0.027
 Indians-0.027
burn-0.027
Token secret
Token ID3200
Feature activation-3.34e-02

Pos logit contributions
ore+0.030
year+0.029
 gen+0.026
 x+0.026
 goblin+0.025
Neg logit contributions
 pads-0.034
ety-0.028
 tails-0.028
 Indians-0.026
burn-0.026
Token on
Token ID319
Feature activation+1.71e-02

Pos logit contributions
ore+0.021
year+0.020
 gen+0.018
 x+0.018
 goblin+0.018
Neg logit contributions
 pads-0.032
ety-0.026
 tails-0.026
coat-0.026
 Indians-0.026
Token the
Token ID262
Feature activation+6.51e-02

Pos logit contributions
 pads+0.049
ety+0.041
 tails+0.040
coat+0.039
 pad+0.039
Neg logit contributions
ore-0.024
 x-0.022
 gen-0.020
year-0.019
 operating-0.019
Token big
Token ID1263
Feature activation+5.59e-02

Pos logit contributions
 pads+0.146
ety+0.134
lets+0.124
 tails+0.120
burn+0.115
Neg logit contributions
 x-0.118
 site-0.109
ore-0.106
 gen-0.097
 operating-0.096
Token wheel
Token ID7825
Feature activation+1.37e-02

Pos logit contributions
 pads+0.123
ety+0.105
 tails+0.099
ked+0.097
 Indians+0.097
Neg logit contributions
 x-0.104
ore-0.104
 site-0.097
 gen-0.093
 year-0.087
Token?"
Token ID1701
Feature activation+2.27e-02

Pos logit contributions
 pads+0.039
ety+0.033
 tails+0.032
 Indians+0.031
burn+0.031
Neg logit contributions
ore-0.019
 x-0.017
year-0.016
 gen-0.016
 goblin-0.015
Token Her
Token ID2332
Feature activation+5.54e-02

Pos logit contributions
 pads+0.056
ety+0.045
burn+0.044
 Indians+0.044
 tails+0.044
Neg logit contributions
ore-0.040
 x-0.036
 gen-0.034
year-0.034
 operating-0.032
Token friend
Token ID1545
Feature activation+1.22e-02

Pos logit contributions
 pads+0.141
coat+0.112
 Indians+0.110
 pad+0.110
ety+0.109
Neg logit contributions
 x-0.086
 gen-0.078
ore-0.075
 goblin-0.075
 healer-0.068
Token replied
Token ID8712
Feature activation+1.58e-02

Pos logit contributions
 pads+0.033
ety+0.027
 tails+0.026
 Indians+0.026
burn+0.026
Neg logit contributions
ore-0.019
 x-0.017
year-0.017
 gen-0.016
 goblin-0.015
Token,
Token ID11
Feature activation+4.18e-02

Pos logit contributions
 pads+0.041
ety+0.035
 Indians+0.033
 tails+0.033
burn+0.032
Neg logit contributions
ore-0.026
year-0.023
 x-0.023
 gen-0.022
 goblin-0.021
Token "
Token ID366
Feature activation+2.70e-02

Pos logit contributions
 pads+0.111
ety+0.090
burn+0.086
 tails+0.086
 Indians+0.085
Neg logit contributions
ore-0.066
year-0.060
 gen-0.058
 x-0.058
 goblin-0.053
TokenYes
Token ID5297
Feature activation+3.61e-02

Pos logit contributions
 pads+0.066
 tails+0.055
ety+0.053
burn+0.053
 Indians+0.053
Neg logit contributions
ore-0.047
year-0.043
 gen-0.040
 x-0.040
looking-0.037
Token miserable
Token ID22444
Feature activation+2.27e-03

Pos logit contributions
ore+0.002
 x+0.001
year+0.001
 gen+0.001
 goblin+0.001
Neg logit contributions
 pads-0.003
ety-0.002
 tails-0.002
burn-0.002
 Indians-0.002
Token.
Token ID13
Feature activation+2.25e-02

Pos logit contributions
 pads+0.006
ety+0.005
 tails+0.005
burn+0.005
 Indians+0.005
Neg logit contributions
ore-0.004
 x-0.003
year-0.003
 gen-0.003
 goblin-0.003
Token<|endoftext|>
Token ID50256
Feature activation-1.58e-02

Pos logit contributions
 pads+0.063
 tails+0.054
ety+0.053
coat+0.053
 sunglasses+0.052
Neg logit contributions
ore-0.032
year-0.027
 gen-0.025
 x-0.025
 healer-0.024
TokenOnce
Token ID7454
Feature activation+1.34e-02

Pos logit contributions
ore+0.028
 x+0.027
year+0.027
 gen+0.026
 healer+0.025
Neg logit contributions
 pads-0.037
ety-0.030
 tails-0.028
coat-0.028
 pad-0.028
Token upon
Token ID2402
Feature activation-1.86e-02

Pos logit contributions
 pads+0.029
ety+0.023
 tails+0.022
 Indians+0.022
burn+0.022
Neg logit contributions
ore-0.028
year-0.025
 x-0.025
 gen-0.025
 goblin-0.023
Token a
Token ID257
Feature activation+1.36e-02

Pos logit contributions
ore+0.034
 gen+0.029
 x+0.029
year+0.029
 healer+0.028
Neg logit contributions
 pads-0.045
ety-0.037
 tails-0.036
burn-0.035
 Indians-0.035
Token time
Token ID640
Feature activation+2.55e-02

Pos logit contributions
 pads+0.033
ety+0.027
 Indians+0.026
 tails+0.026
burn+0.026
Neg logit contributions
ore-0.024
 x-0.022
year-0.022
 gen-0.021
 goblin-0.020
Token,
Token ID11
Feature activation-7.54e-03

Pos logit contributions
 pads+0.070
ety+0.058
 tails+0.057
burn+0.057
 Indians+0.056
Neg logit contributions
ore-0.040
 x-0.033
 gen-0.033
year-0.032
 goblin-0.030
Token there
Token ID612
Feature activation+6.91e-04

Pos logit contributions
ore+0.013
 x+0.012
 gen+0.012
year+0.012
 goblin+0.012
Neg logit contributions
 pads-0.018
ety-0.015
 tails-0.014
coat-0.014
burn-0.014
Token was
Token ID373
Feature activation-4.56e-03

Pos logit contributions
 pads+0.002
ety+0.002
 tails+0.002
burn+0.002
coat+0.002
Neg logit contributions
ore-0.001
year-0.001
 x-0.001
 gen-0.001
 goblin-0.001
Token a
Token ID257
Feature activation-1.32e-02

Pos logit contributions
ore+0.008
 x+0.006
 gen+0.006
year+0.006
 goblin+0.006
Neg logit contributions
 pads-0.012
ety-0.010
 tails-0.010
burn-0.009
 Indians-0.009
Token huge
Token ID3236
Feature activation-5.47e-03

Pos logit contributions
ore+0.012
year+0.011
 x+0.011
 gen+0.010
 goblin+0.010
Neg logit contributions
 pads-0.016
ety-0.013
 tails-0.013
 Indians-0.012
burn-0.012
Token tree
Token ID5509
Feature activation-3.62e-02

Pos logit contributions
ore+0.011
year+0.010
 x+0.010
 gen+0.010
 goblin+0.009
Neg logit contributions
 pads-0.012
ety-0.010
 tails-0.009
 Indians-0.009
coat-0.009
Token.
Token ID13
Feature activation-1.46e-02

Pos logit contributions
 x+0.063
ore+0.063
year+0.062
 gen+0.062
 goblin+0.062
Neg logit contributions
 pads-0.084
ety-0.069
 tails-0.065
coat-0.064
 pad-0.062
Token The
Token ID383
Feature activation+9.61e-03

Pos logit contributions
ore+0.026
 gen+0.023
 x+0.023
year+0.023
 goblin+0.022
Neg logit contributions
 pads-0.035
ety-0.029
 tails-0.028
coat-0.027
 Indians-0.027
Token tree
Token ID5509
Feature activation-2.37e-02

Pos logit contributions
 pads+0.023
ety+0.019
 tails+0.018
burn+0.018
 Indians+0.018
Neg logit contributions
ore-0.018
 x-0.016
 gen-0.016
year-0.016
 goblin-0.015
Token had
Token ID550
Feature activation+1.92e-02

Pos logit contributions
ore+0.052
 x+0.050
year+0.048
 goblin+0.048
 gen+0.047
Neg logit contributions
 pads-0.048
ety-0.039
 tails-0.034
 pad-0.033
coat-0.033
Token big
Token ID1263
Feature activation+2.46e-02

Pos logit contributions
 pads+0.053
ety+0.046
burn+0.044
coat+0.043
 Indians+0.043
Neg logit contributions
ore-0.026
 x-0.024
year-0.024
 gen-0.023
 operating-0.021
Token eyes
Token ID2951
Feature activation+2.47e-02

Pos logit contributions
 pads+0.053
ety+0.046
burn+0.044
 Indians+0.042
coat+0.042
Neg logit contributions
ore-0.047
 x-0.045
 gen-0.043
year-0.041
 goblin-0.041
Token that
Token ID326
Feature activation-6.41e-05

Pos logit contributions
 pads+0.067
ety+0.058
 Indians+0.057
 tails+0.056
burn+0.054
Neg logit contributions
ore-0.037
year-0.030
 x-0.030
 gen-0.030
 operating-0.027
Token could
Token ID714
Feature activation-2.47e-02

Pos logit contributions
ore+0.000
 x+0.000
year+0.000
 gen+0.000
 goblin+0.000
Neg logit contributions
 pads-0.000
ety-0.000
 tails-0.000
 Indians-0.000
coat-0.000
Token see
Token ID766
Feature activation-2.28e-03

Pos logit contributions
ore+0.053
year+0.049
 goblin+0.046
 x+0.045
 gen+0.045
Neg logit contributions
 pads-0.049
ety-0.040
coat-0.039
 tails-0.038
burn-0.037
Token try
Token ID1949
Feature activation+2.47e-02

Pos logit contributions
 pads+0.069
ety+0.059
burn+0.052
 pad+0.052
coat+0.052
Neg logit contributions
ore-0.056
 x-0.052
 gen-0.048
 operating-0.048
 healer-0.046
Token new
Token ID649
Feature activation+1.84e-02

Pos logit contributions
 pads+0.064
ety+0.057
coat+0.054
burn+0.053
 tails+0.052
Neg logit contributions
ore-0.038
 x-0.034
year-0.031
 operating-0.030
 gen-0.030
Token things
Token ID1243
Feature activation+9.63e-03

Pos logit contributions
 pads+0.049
ety+0.043
coat+0.041
burn+0.041
 Indians+0.040
Neg logit contributions
ore-0.027
 x-0.024
 gen-0.022
year-0.022
 goblin-0.022
Token.
Token ID13
Feature activation+3.06e-02

Pos logit contributions
 pads+0.026
ety+0.022
 tails+0.022
 Indians+0.021
burn+0.021
Neg logit contributions
ore-0.014
 x-0.012
year-0.012
 gen-0.012
 goblin-0.011
Token
Token ID198
Feature activation-3.71e-03

Pos logit contributions
 pads+0.082
burn+0.074
coat+0.072
ety+0.070
 tails+0.068
Neg logit contributions
ore-0.043
 x-0.036
year-0.033
 goblin-0.032
 operating-0.032
Token
Token ID198
Feature activation+2.68e-02

Pos logit contributions
ore+0.006
 x+0.006
year+0.006
 gen+0.006
 goblin+0.005
Neg logit contributions
 pads-0.009
ety-0.008
 tails-0.007
 Indians-0.007
coat-0.007
TokenOne
Token ID3198
Feature activation-2.34e-02

Pos logit contributions
 pads+0.057
burn+0.049
ety+0.047
coat+0.046
 tails+0.045
Neg logit contributions
ore-0.051
year-0.049
 x-0.047
 goblin-0.044
 operating-0.044
Token day
Token ID1110
Feature activation+2.43e-02

Pos logit contributions
ore+0.060
 gen+0.059
 x+0.058
 goblin+0.056
 operating+0.056
Neg logit contributions
 pads-0.035
ety-0.029
ibi-0.024
 tails-0.024
burn-0.023
Token,
Token ID11
Feature activation+2.62e-02

Pos logit contributions
 pads+0.068
ety+0.059
burn+0.056
coat+0.056
 Indians+0.055
Neg logit contributions
ore-0.036
 x-0.029
year-0.029
 gen-0.027
 goblin-0.027
Token they
Token ID484
Feature activation+2.96e-02

Pos logit contributions
 pads+0.058
ety+0.046
burn+0.045
coat+0.044
 pad+0.043
Neg logit contributions
ore-0.053
year-0.048
 x-0.048
 goblin-0.046
 gen-0.045
Token decided
Token ID3066
Feature activation+2.36e-02

Pos logit contributions
 pads+0.073
 tails+0.059
coat+0.058
burn+0.057
ety+0.057
Neg logit contributions
ore-0.050
year-0.047
 gen-0.043
 goblin-0.043
 x-0.042
Token,
Token ID11
Feature activation+9.19e-03

Pos logit contributions
 pads+0.030
ety+0.025
 tails+0.024
 Indians+0.024
burn+0.023
Neg logit contributions
ore-0.021
year-0.019
 x-0.019
 gen-0.018
 goblin-0.017
Token the
Token ID262
Feature activation-9.02e-03

Pos logit contributions
 pads+0.022
ety+0.018
 tails+0.017
burn+0.017
 Indians+0.017
Neg logit contributions
ore-0.017
 x-0.015
year-0.015
 gen-0.015
 goblin-0.014
Token screw
Token ID9580
Feature activation+9.88e-03

Pos logit contributions
ore+0.017
year+0.015
 gen+0.015
 x+0.014
 healer+0.014
Neg logit contributions
 pads-0.021
ety-0.017
 tails-0.017
 Indians-0.016
coat-0.016
Tokendriver
Token ID26230
Feature activation+4.31e-03

Pos logit contributions
 pads+0.023
ety+0.019
 tails+0.018
 Indians+0.017
burn+0.017
Neg logit contributions
ore-0.019
 x-0.017
year-0.017
 gen-0.017
 goblin-0.016
Token tightened
Token ID37257
Feature activation+2.39e-03

Pos logit contributions
 pads+0.010
ety+0.009
 tails+0.008
burn+0.008
 Indians+0.008
Neg logit contributions
ore-0.008
 x-0.007
year-0.007
 gen-0.007
 goblin-0.006
Token screws
Token ID23742
Feature activation+5.03e-03

Pos logit contributions
 pads+0.006
ety+0.005
 tails+0.005
burn+0.005
 Indians+0.005
Neg logit contributions
ore-0.004
 x-0.004
year-0.004
 gen-0.004
 goblin-0.003
Token,
Token ID11
Feature activation+2.11e-02

Pos logit contributions
 pads+0.012
ety+0.010
 tails+0.010
 Indians+0.010
burn+0.010
Neg logit contributions
ore-0.009
 x-0.008
year-0.008
 gen-0.008
 goblin-0.007
Token and
Token ID290
Feature activation-9.04e-03

Pos logit contributions
 pads+0.053
ety+0.046
burn+0.044
 Indians+0.043
 tails+0.043
Neg logit contributions
ore-0.033
 x-0.031
year-0.030
 gen-0.030
 operating-0.028
Token the
Token ID262
Feature activation-2.12e-03

Pos logit contributions
ore+0.017
 x+0.015
year+0.015
 gen+0.015
 goblin+0.014
Neg logit contributions
 pads-0.021
ety-0.017
 tails-0.017
 Indians-0.016
burn-0.016
Token hammer
Token ID15554
Feature activation+1.05e-02

Pos logit contributions
ore+0.004
year+0.004
 x+0.004
 gen+0.003
 goblin+0.003
Neg logit contributions
 pads-0.005
ety-0.004
 tails-0.004
 Indians-0.004
coat-0.004
Token hit
Token ID2277
Feature activation+1.08e-02

Pos logit contributions
 pads+0.025
ety+0.021
 tails+0.020
 Indians+0.020
burn+0.020
Neg logit contributions
ore-0.019
 x-0.017
year-0.017
 gen-0.016
 goblin-0.015
Token to
Token ID284
Feature activation+1.58e-02

Pos logit contributions
ore+0.029
 gen+0.028
year+0.028
 x+0.027
 goblin+0.027
Neg logit contributions
 pads-0.027
ety-0.022
 tails-0.021
burn-0.020
coat-0.019
Token their
Token ID511
Feature activation+2.48e-02

Pos logit contributions
 pads+0.038
ety+0.032
coat+0.030
 tails+0.029
 Indians+0.029
Neg logit contributions
ore-0.029
 x-0.026
 gen-0.025
year-0.024
 operating-0.024
Token yard
Token ID12699
Feature activation+7.02e-03

Pos logit contributions
 pads+0.057
ety+0.047
coat+0.044
burn+0.044
 Indians+0.044
Neg logit contributions
ore-0.045
 x-0.043
 gen-0.041
 operating-0.039
 healer-0.039
Token.
Token ID13
Feature activation+4.81e-03

Pos logit contributions
 pads+0.019
ety+0.016
 tails+0.015
 Indians+0.015
burn+0.015
Neg logit contributions
ore-0.011
 x-0.009
year-0.009
 gen-0.009
 goblin-0.008
Token They
Token ID1119
Feature activation+4.27e-02

Pos logit contributions
 pads+0.011
ety+0.009
 tails+0.008
burn+0.008
 Indians+0.008
Neg logit contributions
ore-0.010
 x-0.009
year-0.009
 gen-0.009
 goblin-0.008
Token decided
Token ID3066
Feature activation+2.21e-02

Pos logit contributions
 pads+0.097
 tails+0.078
ety+0.078
burn+0.078
 Indians+0.078
Neg logit contributions
year-0.069
 gen-0.069
ore-0.069
abel-0.066
 x-0.064
Token to
Token ID284
Feature activation+3.06e-02

Pos logit contributions
 pads+0.057
ety+0.048
 tails+0.046
coat+0.045
 pad+0.045
Neg logit contributions
ore-0.035
 x-0.031
 gen-0.030
year-0.030
 operating-0.030
Token play
Token ID711
Feature activation-2.73e-02

Pos logit contributions
 pads+0.070
ety+0.056
 tails+0.053
 Indians+0.053
 pad+0.052
Neg logit contributions
ore-0.058
 x-0.055
 gen-0.054
 operating-0.051
 healer-0.050
Token with
Token ID351
Feature activation+1.83e-03

Pos logit contributions
ore+0.051
year+0.049
 gen+0.047
 x+0.045
 goblin+0.044
Neg logit contributions
 pads-0.062
 tails-0.052
coat-0.049
 sunglasses-0.048
ety-0.048
Token the
Token ID262
Feature activation+4.87e-02

Pos logit contributions
 pads+0.004
ety+0.003
 tails+0.003
burn+0.003
 Indians+0.003
Neg logit contributions
ore-0.004
year-0.003
 x-0.003
 gen-0.003
 goblin-0.003
Token swing
Token ID9628
Feature activation+3.81e-02

Pos logit contributions
 pads+0.109
ety+0.091
lets+0.084
burn+0.083
 Indians+0.082
Neg logit contributions
 x-0.089
 gen-0.082
ore-0.081
 healer-0.077
 goblin-0.075
Token and
Token ID290
Feature activation-4.40e-03

Pos logit contributions
ore+0.022
 x+0.020
year+0.020
 gen+0.020
 goblin+0.019
Neg logit contributions
 pads-0.031
ety-0.026
 tails-0.025
coat-0.024
 pad-0.024
Token made
Token ID925
Feature activation-3.54e-02

Pos logit contributions
ore+0.008
 x+0.007
year+0.007
 gen+0.007
 goblin+0.007
Neg logit contributions
 pads-0.010
ety-0.009
 tails-0.008
 Indians-0.008
burn-0.008
Token a
Token ID257
Feature activation-2.23e-02

Pos logit contributions
ore+0.068
year+0.062
 gen+0.059
 x+0.059
 goblin+0.059
Neg logit contributions
 pads-0.078
 tails-0.062
ety-0.059
 sunglasses-0.058
burn-0.057
Token beautiful
Token ID4950
Feature activation-2.04e-02

Pos logit contributions
ore+0.042
year+0.039
 x+0.035
 gen+0.034
 goblin+0.033
Neg logit contributions
 pads-0.052
ety-0.043
 tails-0.042
 pad-0.041
 Indians-0.040
Token tree
Token ID5509
Feature activation-4.09e-02

Pos logit contributions
ore+0.042
year+0.040
 x+0.037
 gen+0.035
 goblin+0.034
Neg logit contributions
 pads-0.044
 tails-0.035
ety-0.035
 pad-0.034
coat-0.033
Token grow
Token ID1663
Feature activation-2.69e-02

Pos logit contributions
ore+0.076
year+0.072
 x+0.068
 gen+0.065
 goblin+0.065
Neg logit contributions
 pads-0.097
ety-0.076
 tails-0.073
 pad-0.073
coat-0.072
Token.
Token ID13
Feature activation-4.71e-03

Pos logit contributions
ore+0.046
 x+0.045
 gen+0.042
year+0.041
 goblin+0.040
Neg logit contributions
 pads-0.066
ety-0.053
 tails-0.051
 pad-0.049
coat-0.049
Token<|endoftext|>
Token ID50256
Feature activation-1.64e-02

Pos logit contributions
ore+0.008
 x+0.007
year+0.007
 gen+0.007
 goblin+0.007
Neg logit contributions
 pads-0.012
ety-0.010
 tails-0.009
 Indians-0.009
 pad-0.009
TokenOnce
Token ID7454
Feature activation+1.22e-02

Pos logit contributions
ore+0.026
 x+0.025
year+0.025
 gen+0.024
 healer+0.023
Neg logit contributions
 pads-0.041
ety-0.034
 pad-0.032
 tails-0.032
coat-0.032
Token upon
Token ID2402
Feature activation-3.01e-02

Pos logit contributions
 pads+0.026
ety+0.020
 tails+0.020
burn+0.019
 Indians+0.019
Neg logit contributions
ore-0.026
year-0.023
 x-0.023
 gen-0.023
 goblin-0.022
Token a
Token ID257
Feature activation+6.93e-03

Pos logit contributions
ore+0.060
 gen+0.052
 healer+0.051
year+0.051
 x+0.051
Neg logit contributions
 pads-0.068
ety-0.054
 tails-0.053
 sunglasses-0.050
 pad-0.050
Token working
Token ID1762
Feature activation-1.63e-02

Pos logit contributions
 pads+0.004
ety+0.004
 tails+0.004
 Indians+0.003
burn+0.003
Neg logit contributions
ore-0.003
 x-0.003
year-0.003
 gen-0.003
 goblin-0.002
Token on
Token ID319
Feature activation-2.86e-03

Pos logit contributions
ore+0.032
year+0.030
 x+0.029
 gen+0.029
 goblin+0.028
Neg logit contributions
 pads-0.036
ety-0.028
 tails-0.028
coat-0.027
burn-0.026
Token the
Token ID262
Feature activation-3.69e-02

Pos logit contributions
ore+0.005
year+0.004
 x+0.004
 gen+0.004
 goblin+0.004
Neg logit contributions
 pads-0.007
ety-0.006
 tails-0.006
burn-0.006
 Indians-0.006
Token machine
Token ID4572
Feature activation-1.13e-02

Pos logit contributions
year+0.073
ore+0.065
abel+0.060
old+0.059
 gen+0.058
Neg logit contributions
 pads-0.080
 sunglasses-0.069
 tails-0.068
 Indians-0.065
 pad-0.062
Token,
Token ID11
Feature activation+1.14e-02

Pos logit contributions
ore+0.020
year+0.019
 x+0.018
 gen+0.018
 goblin+0.017
Neg logit contributions
 pads-0.027
ety-0.022
 tails-0.021
coat-0.021
burn-0.021
Token until
Token ID1566
Feature activation-2.46e-02

Pos logit contributions
 pads+0.026
ety+0.022
 tails+0.020
burn+0.020
coat+0.020
Neg logit contributions
ore-0.021
 x-0.019
year-0.019
 gen-0.019
 goblin-0.018
Token it
Token ID340
Feature activation-3.50e-02

Pos logit contributions
ore+0.052
 x+0.048
year+0.048
old+0.045
 goblin+0.045
Neg logit contributions
 pads-0.046
 tails-0.038
ety-0.037
 Indians-0.037
burn-0.036
Token was
Token ID373
Feature activation+1.61e-02

Pos logit contributions
ore+0.070
year+0.067
 x+0.064
 goblin+0.060
 healer+0.060
Neg logit contributions
 pads-0.074
ety-0.060
burn-0.056
 swung-0.055
coat-0.055
Token fixed
Token ID5969
Feature activation+3.47e-03

Pos logit contributions
 pads+0.040
ety+0.034
 tails+0.032
 Indians+0.032
burn+0.031
Neg logit contributions
ore-0.027
 x-0.025
year-0.025
 gen-0.024
 operating-0.022
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
 pads+0.009
ety+0.008
 tails+0.007
burn+0.007
 Indians+0.007
Neg logit contributions
ore-0.006
 x-0.005
year-0.005
 gen-0.005
 goblin-0.004
Token
Token ID198
Feature activation-2.41e-02

Pos logit contributions
ore+0.033
 x+0.033
 gen+0.031
year+0.030
 goblin+0.030
Neg logit contributions
 pads-0.042
ety-0.034
 tails-0.032
 Indians-0.032
 pad-0.031
Token and
Token ID290
Feature activation+1.78e-02

Pos logit contributions
 pads+0.012
ety+0.010
 tails+0.010
 Indians+0.010
burn+0.010
Neg logit contributions
ore-0.009
 x-0.008
year-0.008
 gen-0.008
 goblin-0.007
Token fl
Token ID781
Feature activation+1.10e-02

Pos logit contributions
 pads+0.041
ety+0.035
coat+0.033
burn+0.032
 Indians+0.032
Neg logit contributions
ore-0.033
 x-0.030
 gen-0.029
year-0.028
 goblin-0.028
Tokenaps
Token ID1686
Feature activation+1.06e-02

Pos logit contributions
 pads+0.033
ety+0.028
 tails+0.027
 Indians+0.027
burn+0.027
Neg logit contributions
ore-0.014
year-0.012
 x-0.012
 gen-0.011
 goblin-0.010
Token its
Token ID663
Feature activation+3.29e-02

Pos logit contributions
 pads+0.026
ety+0.022
 Indians+0.021
 tails+0.021
burn+0.021
Neg logit contributions
ore-0.018
year-0.016
 x-0.016
 gen-0.016
 goblin-0.015
Token wings
Token ID12098
Feature activation+1.38e-02

Pos logit contributions
 pads+0.061
ety+0.052
 Indians+0.052
burn+0.051
coat+0.050
Neg logit contributions
ore-0.070
 x-0.066
 gen-0.064
year-0.059
 goblin-0.058
Token.
Token ID13
Feature activation-1.82e-03

Pos logit contributions
 pads+0.036
ety+0.031
 Indians+0.030
 tails+0.029
burn+0.029
Neg logit contributions
ore-0.022
 x-0.019
year-0.019
 gen-0.018
 goblin-0.017
Token It
Token ID632
Feature activation-2.30e-02

Pos logit contributions
ore+0.003
year+0.003
 x+0.003
 gen+0.003
 goblin+0.003
Neg logit contributions
 pads-0.004
ety-0.004
 tails-0.003
 Indians-0.003
coat-0.003
Token runs
Token ID4539
Feature activation+9.68e-03

Pos logit contributions
ore+0.048
year+0.045
 x+0.043
 gen+0.042
 goblin+0.040
Neg logit contributions
 pads-0.051
 tails-0.041
ety-0.039
 pad-0.036
 sunglasses-0.036
Token away
Token ID1497
Feature activation-1.76e-02

Pos logit contributions
 pads+0.023
ety+0.019
 tails+0.018
 Indians+0.018
burn+0.018
Neg logit contributions
ore-0.018
year-0.016
 x-0.016
 gen-0.016
 goblin-0.015
Token from
Token ID422
Feature activation+2.38e-02

Pos logit contributions
ore+0.031
 x+0.031
year+0.030
 gen+0.029
 goblin+0.028
Neg logit contributions
 pads-0.040
ety-0.034
 tails-0.033
burn-0.031
coat-0.031
Token them
Token ID606
Feature activation+3.39e-03

Pos logit contributions
 pads+0.062
coat+0.049
ety+0.048
burn+0.047
 tails+0.046
Neg logit contributions
ore-0.040
 x-0.038
 gen-0.037
year-0.035
 goblin-0.035
Token a
Token ID257
Feature activation+3.19e-02

Pos logit contributions
ore+0.016
year+0.014
 gen+0.014
 x+0.014
 goblin+0.013
Neg logit contributions
 pads-0.022
 tails-0.018
ety-0.018
 Indians-0.017
burn-0.017
Token long
Token ID890
Feature activation+4.92e-02

Pos logit contributions
 pads+0.073
 tails+0.058
ety+0.058
bit+0.055
lets+0.054
Neg logit contributions
ore-0.061
 x-0.057
 gen-0.056
 healer-0.052
 operating-0.052
Token time
Token ID640
Feature activation+6.71e-03

Pos logit contributions
 pads+0.131
ety+0.112
inas+0.107
 Indians+0.106
lets+0.105
Neg logit contributions
ore-0.069
 x-0.069
 gen-0.061
 year-0.058
 healer-0.056
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
 pads+0.018
ety+0.015
 tails+0.015
 Indians+0.014
burn+0.014
Neg logit contributions
ore-0.010
 x-0.009
year-0.009
 gen-0.009
 goblin-0.008
Token
Token ID220
Feature activation+9.84e-03

Pos logit contributions
ore+0.023
 x+0.022
 gen+0.021
year+0.021
 goblin+0.019
Neg logit contributions
 pads-0.028
ety-0.023
 tails-0.022
 pad-0.021
 Indians-0.021
Token\n
Token ID198
Feature activation-8.46e-03

Pos logit contributions
 pads+0.026
ety+0.022
 tails+0.021
burn+0.021
 Indians+0.021
Neg logit contributions
ore-0.016
 x-0.013
year-0.013
 gen-0.013
 goblin-0.012
Token\n
Token ID198
Feature activation-1.07e-02

Pos logit contributions
ore+0.015
 x+0.014
 gen+0.014
year+0.014
 goblin+0.013
Neg logit contributions
 pads-0.020
ety-0.017
 tails-0.016
 Indians-0.016
 pad-0.015
TokenWhen
Token ID2215
Feature activation-1.05e-02

Pos logit contributions
ore+0.022
 x+0.020
 gen+0.020
year+0.019
 goblin+0.019
Neg logit contributions
 pads-0.023
ety-0.019
 tails-0.018
 Indians-0.018
burn-0.017
Token they
Token ID484
Feature activation+7.02e-03

Pos logit contributions
ore+0.023
year+0.022
 x+0.022
 gen+0.021
 goblin+0.020
Neg logit contributions
 pads-0.020
ety-0.016
 tails-0.016
burn-0.016
 Indians-0.015
Token finally
Token ID3443
Feature activation+3.45e-03

Pos logit contributions
 pads+0.015
ety+0.012
 tails+0.011
burn+0.011
 Indians+0.011
Neg logit contributions
ore-0.015
year-0.014
 x-0.014
 gen-0.013
 goblin-0.013
Token returned
Token ID4504
Feature activation+7.77e-03

Pos logit contributions
 pads+0.007
ety+0.006
 tails+0.005
burn+0.005
 Indians+0.005
Neg logit contributions
ore-0.007
 x-0.007
year-0.007
 gen-0.007
 goblin-0.006
Token to
Token ID284
Feature activation+1.24e-02

Pos logit contributions
 pads+0.027
ety+0.022
 tails+0.022
 Indians+0.021
burn+0.021
Neg logit contributions
ore-0.020
year-0.018
 x-0.018
 gen-0.017
 goblin-0.016
Token go
Token ID467
Feature activation-8.29e-03

Pos logit contributions
 pads+0.026
ety+0.020
 tails+0.019
coat+0.019
 Indians+0.019
Neg logit contributions
ore-0.027
 x-0.025
 gen-0.024
year-0.024
 goblin-0.023
Token home
Token ID1363
Feature activation-3.07e-02

Pos logit contributions
ore+0.019
year+0.019
 x+0.018
 gen+0.018
 goblin+0.017
Neg logit contributions
 pads-0.015
ety-0.012
burn-0.011
coat-0.011
 tails-0.011
Token,
Token ID11
Feature activation+2.07e-02

Pos logit contributions
year+0.069
ore+0.069
 x+0.064
 gen+0.062
 healer+0.060
Neg logit contributions
 pads-0.056
ety-0.045
coat-0.042
burn-0.041
 tails-0.041
Token Lily
Token ID20037
Feature activation-4.50e-03

Pos logit contributions
 pads+0.048
ety+0.039
coat+0.037
 tails+0.037
burn+0.037
Neg logit contributions
ore-0.038
 x-0.035
 gen-0.034
year-0.034
 goblin-0.033
Token was
Token ID373
Feature activation-5.93e-03

Pos logit contributions
ore+0.009
year+0.008
 x+0.008
 gen+0.008
 goblin+0.007
Neg logit contributions
 pads-0.010
ety-0.009
 tails-0.008
coat-0.008
burn-0.008
Token happy
Token ID3772
Feature activation-3.96e-03

Pos logit contributions
ore+0.012
 x+0.011
year+0.011
 gen+0.010
 goblin+0.010
Neg logit contributions
 pads-0.013
ety-0.011
burn-0.010
 tails-0.010
coat-0.010
Token because
Token ID780
Feature activation-2.69e-02

Pos logit contributions
ore+0.007
year+0.007
 x+0.007
 gen+0.006
 goblin+0.006
Neg logit contributions
 pads-0.009
ety-0.008
 tails-0.007
burn-0.007
coat-0.007
Token she
Token ID673
Feature activation-1.11e-02

Pos logit contributions
ore+0.061
 x+0.056
 gen+0.053
year+0.052
 goblin+0.052
Neg logit contributions
 pads-0.051
ety-0.042
 sunglasses-0.038
burn-0.037
 tails-0.037
Token made
Token ID925
Feature activation-1.94e-02

Pos logit contributions
ore+0.027
 x+0.025
year+0.024
 gen+0.024
 operating+0.023
Neg logit contributions
 pads-0.020
ety-0.016
coat-0.014
 Indians-0.014
 tails-0.014
Token a
Token ID257
Feature activation-8.75e-03

Pos logit contributions
ore+0.037
year+0.033
 gen+0.032
 x+0.030
 goblin+0.030
Neg logit contributions
 pads-0.045
ety-0.036
 tails-0.036
burn-0.036
 sunglasses-0.036
Token the
Token ID262
Feature activation-1.40e-02

Pos logit contributions
ore+0.030
year+0.027
 gen+0.027
 x+0.026
 goblin+0.025
Neg logit contributions
 pads-0.040
 tails-0.032
ety-0.031
 sunglasses-0.031
coat-0.031
Token lecture
Token ID19143
Feature activation-3.20e-02

Pos logit contributions
ore+0.027
year+0.025
 gen+0.024
 x+0.024
 goblin+0.023
Neg logit contributions
 pads-0.031
ety-0.025
 tails-0.025
 Indians-0.024
 sunglasses-0.024
Token.
Token ID13
Feature activation-6.05e-02

Pos logit contributions
ore+0.055
year+0.050
 gen+0.050
 x+0.049
 goblin+0.047
Neg logit contributions
 pads-0.079
coat-0.061
ety-0.061
 tails-0.061
 pad-0.060
Token But
Token ID887
Feature activation-4.40e-02

Pos logit contributions
 gen+0.126
 x+0.122
year+0.117
ore+0.117
 potion+0.106
Neg logit contributions
 pads-0.119
 pad-0.093
 tails-0.089
 Indians-0.084
 tail-0.079
Token,
Token ID11
Feature activation-5.55e-02

Pos logit contributions
ore+0.090
year+0.086
 x+0.086
 gen+0.082
looking+0.077
Neg logit contributions
 pads-0.087
 tails-0.071
 pad-0.067
ety-0.066
 Indians-0.066
Token when
Token ID618
Feature activation-5.42e-02

Pos logit contributions
ore+0.115
 x+0.110
year+0.108
 gen+0.104
auc+0.100
Neg logit contributions
 pads-0.105
 tails-0.085
 Indians-0.079
 pad-0.078
 sunglasses-0.078
Token he
Token ID339
Feature activation-3.57e-02

Pos logit contributions
year+0.110
ore+0.108
 x+0.107
old+0.102
 gen+0.096
Neg logit contributions
 pads-0.097
 tails-0.082
 Indians-0.082
 sunscreen-0.077
 sunglasses-0.076
Token flew
Token ID13112
Feature activation-2.84e-03

Pos logit contributions
ore+0.085
year+0.080
 gen+0.079
 x+0.078
 healer+0.076
Neg logit contributions
 pads-0.059
ety-0.047
 swung-0.046
coat-0.044
 pad-0.044
Token inside
Token ID2641
Feature activation-5.07e-02

Pos logit contributions
ore+0.005
year+0.005
 x+0.005
 gen+0.005
 goblin+0.005
Neg logit contributions
 pads-0.006
ety-0.005
 tails-0.005
coat-0.005
 Indians-0.005
Token his
Token ID465
Feature activation-1.80e-02

Pos logit contributions
year+0.106
ore+0.103
 x+0.102
 gen+0.099
 healer+0.097
Neg logit contributions
 pads-0.089
ety-0.069
coat-0.068
 sunglasses-0.067
 pad-0.065
Token feathers
Token ID27400
Feature activation+2.88e-02

Pos logit contributions
ore+0.034
year+0.034
 gen+0.031
 x+0.029
 healer+0.028
Neg logit contributions
 pads-0.042
 tails-0.033
ety-0.033
 sunglasses-0.032
 pad-0.032
Token daddy
Token ID35822
Feature activation+2.50e-02

Pos logit contributions
 pads+0.077
ety+0.062
 tails+0.062
coat+0.062
burn+0.059
Neg logit contributions
ore-0.044
 gen-0.043
 x-0.043
year-0.042
 goblin-0.041
Token wanted
Token ID2227
Feature activation-9.40e-03

Pos logit contributions
 pads+0.063
 tails+0.051
ety+0.051
 Indians+0.050
burn+0.048
Neg logit contributions
ore-0.042
year-0.038
 x-0.037
 gen-0.036
 operating-0.034
Token to
Token ID284
Feature activation-5.58e-03

Pos logit contributions
ore+0.020
year+0.018
 gen+0.018
 x+0.018
 goblin+0.017
Neg logit contributions
 pads-0.020
ety-0.016
 tails-0.015
burn-0.015
 sunglasses-0.015
Token watch
Token ID2342
Feature activation-3.35e-02

Pos logit contributions
ore+0.012
year+0.011
 x+0.011
 gen+0.010
 goblin+0.010
Neg logit contributions
 pads-0.012
ety-0.009
 tails-0.009
burn-0.009
coat-0.009
Token a
Token ID257
Feature activation-6.05e-02

Pos logit contributions
ore+0.067
year+0.064
 gen+0.060
 goblin+0.059
 healer+0.059
Neg logit contributions
 pads-0.070
 tails-0.057
burn-0.054
 sunglasses-0.054
ety-0.054
Token movie
Token ID3807
Feature activation-5.59e-02

Pos logit contributions
year+0.097
ore+0.096
old+0.083
handed+0.080
ment+0.080
Neg logit contributions
 pads-0.133
burn-0.114
 tails-0.114
 Indians-0.112
 pad-0.109
Token about
Token ID546
Feature activation-1.11e-02

Pos logit contributions
ore+0.105
year+0.104
 gen+0.096
 goblin+0.096
 x+0.093
Neg logit contributions
 pads-0.116
coat-0.091
burn-0.088
 tails-0.087
 pad-0.085
Token a
Token ID257
Feature activation-3.13e-02

Pos logit contributions
ore+0.020
year+0.018
 x+0.017
 gen+0.017
 goblin+0.016
Neg logit contributions
 pads-0.027
ety-0.022
 tails-0.022
burn-0.021
 Indians-0.021
Token superhero
Token ID17343
Feature activation-2.36e-02

Pos logit contributions
ore+0.059
year+0.055
 x+0.050
 gen+0.048
old+0.048
Neg logit contributions
 pads-0.070
 tails-0.057
 Indians-0.057
ety-0.056
 pad-0.056
Token.
Token ID13
Feature activation+2.82e-02

Pos logit contributions
ore+0.043
year+0.040
 x+0.037
 gen+0.037
 goblin+0.034
Neg logit contributions
 pads-0.057
 tails-0.045
ety-0.045
coat-0.044
burn-0.044
Token\n
Token ID198
Feature activation-1.72e-03

Pos logit contributions
 pads+0.075
burn+0.063
 tails+0.063
 Indians+0.061
ety+0.061
Neg logit contributions
ore-0.041
 x-0.034
year-0.032
 operating-0.032
 gen-0.032
Token careful
Token ID8161
Feature activation+1.00e-02

Pos logit contributions
ore+0.022
 gen+0.019
 x+0.019
year+0.019
 goblin+0.018
Neg logit contributions
 pads-0.020
 tails-0.016
ety-0.015
 Indians-0.015
coat-0.015
Token next
Token ID1306
Feature activation-2.37e-02

Pos logit contributions
 pads+0.024
ety+0.020
 tails+0.019
 Indians+0.019
burn+0.019
Neg logit contributions
ore-0.018
 x-0.016
year-0.016
 gen-0.016
 goblin-0.015
Token time
Token ID640
Feature activation+2.59e-02

Pos logit contributions
ore+0.053
year+0.051
 gen+0.050
 x+0.047
 healer+0.046
Neg logit contributions
 pads-0.045
ety-0.034
coat-0.033
burn-0.033
 tails-0.033
Token,
Token ID11
Feature activation+1.35e-02

Pos logit contributions
 pads+0.065
ety+0.055
 tails+0.053
burn+0.052
 Indians+0.052
Neg logit contributions
ore-0.044
 x-0.038
year-0.037
 gen-0.036
 operating-0.035
Token the
Token ID262
Feature activation-4.15e-02

Pos logit contributions
 pads+0.031
ety+0.026
 tails+0.025
burn+0.024
coat+0.024
Neg logit contributions
ore-0.025
 x-0.023
year-0.022
 gen-0.022
 goblin-0.021
Token dish
Token ID9433
Feature activation-6.29e-02

Pos logit contributions
ore+0.076
year+0.071
old+0.067
le+0.061
ment+0.060
Neg logit contributions
 pads-0.084
 sunglasses-0.077
 Indians-0.072
 sunscreen-0.071
burn-0.071
Tokenwasher
Token ID45146
Feature activation-3.41e-02

Pos logit contributions
year+0.071
 x+0.070
ore+0.067
 gen+0.063
 healer+0.058
Neg logit contributions
 pads-0.176
ety-0.146
coat-0.142
 pad-0.140
 Indians-0.139
Token can
Token ID460
Feature activation-1.66e-02

Pos logit contributions
ore+0.066
year+0.064
 x+0.060
 gen+0.060
 goblin+0.057
Neg logit contributions
 pads-0.075
ety-0.059
 tails-0.057
coat-0.057
 Indians-0.055
Token be
Token ID307
Feature activation+7.48e-04

Pos logit contributions
ore+0.037
year+0.036
 goblin+0.032
 x+0.032
 gen+0.032
Neg logit contributions
 pads-0.032
ety-0.026
burn-0.025
 tails-0.024
 Indians-0.024
Token dangerous
Token ID4923
Feature activation-1.60e-02

Pos logit contributions
 pads+0.002
ety+0.002
 tails+0.002
 Indians+0.001
burn+0.001
Neg logit contributions
ore-0.001
year-0.001
 x-0.001
 gen-0.001
 goblin-0.001
Token!"
Token ID2474
Feature activation+1.49e-02

Pos logit contributions
ore+0.030
year+0.028
 gen+0.028
 x+0.027
 goblin+0.026
Neg logit contributions
 pads-0.037
ety-0.030
 tails-0.029
 Indians-0.028
burn-0.028
Token box
Token ID3091
Feature activation-3.15e-02

Pos logit contributions
ore+0.006
year+0.006
 x+0.006
 gen+0.005
 healer+0.005
Neg logit contributions
 pads-0.009
ety-0.007
 tails-0.007
 pad-0.007
burn-0.007
Token and
Token ID290
Feature activation-2.55e-02

Pos logit contributions
ore+0.065
 x+0.063
year+0.060
 healer+0.060
 gen+0.060
Neg logit contributions
 pads-0.066
 pad-0.051
ety-0.050
 tails-0.048
coat-0.047
Token saw
Token ID2497
Feature activation-2.34e-02

Pos logit contributions
ore+0.056
 x+0.052
 healer+0.049
year+0.049
 gen+0.049
Neg logit contributions
 pads-0.051
ety-0.041
 sunglasses-0.038
 swung-0.038
 pad-0.037
Token a
Token ID257
Feature activation+1.97e-03

Pos logit contributions
ore+0.046
 x+0.040
year+0.040
 healer+0.039
 operating+0.038
Neg logit contributions
 pads-0.052
 tails-0.042
 sunglasses-0.041
ety-0.040
 pad-0.038
Token j
Token ID474
Feature activation-3.59e-02

Pos logit contributions
 pads+0.005
ety+0.004
 tails+0.004
 Indians+0.004
burn+0.004
Neg logit contributions
ore-0.004
year-0.003
 x-0.003
 gen-0.003
 goblin-0.003
Tokenigsaw
Token ID45636
Feature activation-5.42e-02

Pos logit contributions
 gen+0.053
 x+0.051
year+0.050
ore+0.049
 operating+0.047
Neg logit contributions
 pads-0.089
ety-0.076
coat-0.074
ibi-0.073
 pad-0.072
Token puzzle
Token ID15027
Feature activation-3.48e-02

Pos logit contributions
ore+0.109
year+0.105
 x+0.098
 operating+0.091
old+0.088
Neg logit contributions
 pads-0.113
coat-0.087
 pad-0.087
 tails-0.085
ety-0.084
Token inside
Token ID2641
Feature activation-3.36e-02

Pos logit contributions
ore+0.063
year+0.058
 x+0.057
 gen+0.053
 healer+0.052
Neg logit contributions
 pads-0.083
 pad-0.065
 tails-0.064
ety-0.063
coat-0.063
Token.
Token ID13
Feature activation-2.05e-02

Pos logit contributions
ore+0.065
 x+0.063
 gen+0.060
year+0.060
 healer+0.060
Neg logit contributions
 pads-0.071
 tails-0.055
ety-0.054
 pad-0.053
 sunglasses-0.051
Token He
Token ID679
Feature activation+9.60e-04

Pos logit contributions
ore+0.037
 x+0.035
 gen+0.034
year+0.033
le+0.033
Neg logit contributions
 pads-0.047
ety-0.038
 Indians-0.037
 pad-0.036
 tails-0.036
Token paused
Token ID24487
Feature activation-2.20e-02

Pos logit contributions
 pads+0.002
ety+0.002
 tails+0.002
 Indians+0.002
coat+0.002
Neg logit contributions
ore-0.002
 x-0.002
year-0.002
 gen-0.002
 goblin-0.002
Token.
Token ID13
Feature activation-1.73e-02

Pos logit contributions
 pads+0.018
ety+0.015
 tails+0.015
 Indians+0.014
burn+0.014
Neg logit contributions
ore-0.010
 x-0.009
year-0.009
 gen-0.008
 goblin-0.008
Token\n
Token ID198
Feature activation-4.02e-02

Pos logit contributions
ore+0.031
 x+0.030
year+0.030
 gen+0.029
 goblin+0.027
Neg logit contributions
 pads-0.041
ety-0.033
 tails-0.031
 Indians-0.030
 pad-0.030
Token\n
Token ID198
Feature activation-3.60e-02

Pos logit contributions
 gen+0.084
 goblin+0.084
 healer+0.083
 x+0.083
year+0.083
Neg logit contributions
 pads-0.072
ety-0.057
ibi-0.053
 tails-0.049
 pad-0.049
Token"
Token ID1
Feature activation+5.42e-03

Pos logit contributions
 x+0.087
ore+0.087
 gen+0.084
 goblin+0.080
year+0.079
Neg logit contributions
 pads-0.060
ety-0.041
 tails-0.039
 pad-0.036
 Indians-0.036
TokenLook
Token ID8567
Feature activation+1.29e-02

Pos logit contributions
 pads+0.011
ety+0.009
 tails+0.008
burn+0.008
 Indians+0.008
Neg logit contributions
ore-0.012
 x-0.011
year-0.011
 gen-0.011
 goblin-0.010
Token!"
Token ID2474
Feature activation-6.44e-02

Pos logit contributions
 pads+0.033
ety+0.027
 tails+0.026
coat+0.026
burn+0.026
Neg logit contributions
ore-0.022
 x-0.020
year-0.019
 gen-0.019
 goblin-0.018
Token Ben
Token ID3932
Feature activation-3.15e-02

Pos logit contributions
year+0.163
ore+0.160
 x+0.147
oc+0.138
 operating+0.137
Neg logit contributions
 pads-0.087
 tails-0.072
 dads-0.064
 sunscreen-0.064
 sunglasses-0.063
Token says
Token ID1139
Feature activation-3.21e-02

Pos logit contributions
year+0.064
ore+0.063
 gen+0.054
 healer+0.054
 x+0.053
Neg logit contributions
 pads-0.073
 tails-0.056
ety-0.055
 pad-0.053
 sunglasses-0.053
Token to
Token ID284
Feature activation-1.73e-02

Pos logit contributions
ore+0.063
 gen+0.060
year+0.060
 healer+0.060
 x+0.059
Neg logit contributions
 pads-0.064
ety-0.052
 tails-0.051
coat-0.049
ked-0.048
Token Anna
Token ID11735
Feature activation-2.08e-02

Pos logit contributions
ore+0.035
year+0.032
 gen+0.030
 x+0.030
 operating+0.029
Neg logit contributions
 pads-0.036
ety-0.031
 tails-0.030
 Indians-0.029
burn-0.028
Token.
Token ID13
Feature activation-1.16e-02

Pos logit contributions
ore+0.037
year+0.035
 x+0.034
 gen+0.034
 goblin+0.032
Neg logit contributions
 pads-0.049
ety-0.041
 tails-0.039
 pad-0.038
burn-0.037