TOP ACTIVATIONS
MAX = 0.117

no ! We destroyed the apple !" Lily cried .
go back to the pool side , Mia . Maybe we
going to eat you for dinner !" Tim my was so
want to go in the cave . It 's too ugly
not pull them off the stems , and we could not
not to fall into the pond . It 's deep and
want to play in the hut anymore ," Lily said .
We want to eat the banana , but we can 't
won 't swim in the pond again . I will be
want to play in the barrel anymore . It is too
't want to eat the pit . I only want to
you to clean up your room ! Now , I have
's not play in the forest again ," L ila said
Let 's go down the pit ," Ben said . "
sit in the old arm chair . It is wet and
or hurt anyone in the village . So , the villagers
children ! You ruined the cake ! You have to say
No , we cannot go outside ," Mia says .
to throw it over the fence ." Lily says . She
should not fight over the banana . You should share and

MIN ACTIVATIONS
MIN = -0.125

it would give her a special power . One
parents gave him a pe b ble . He had never
He gave him a beautiful golden sword . The
and always wore a spark ly dress . One day ,
Pixie gave her a spark ly wand , which she could
bedroom and put on the pretty buckle .
she liked to wear her special dress . It was purple
green dress with a spark ly belt . She couldn 't
yellow dress with a spark ly cape . One
feeling very proud wearing her special suit . She will never
to her and put a comforting hand on her shoulder .
unique girl with a spark ly smile and bright eyes that
had just given him a shiny new pistol . Joey had
mom my gave her a cool mask to wear . It
she put on her spark ly shoes . She
. He put on a blue shirt , blue pants ,
she found a shiny pe b ble and picked it up
She picked up a shiny golden sword and went out into
fairy gave Lily a spark ly wand and said to make
her mum put on her special clothes and said to her

INTERVAL 0.057 - 0.117
CONTAINS 0.325%

watch out for the crocod ile ," Lily said .
on an adventure . First , they found a trumpet .
dog ! You ate our candy ! You are a filthy
for help . And don 't let anyone tell you that
walls , the ceiling , and the people .

INTERVAL -0.004 - 0.057
CONTAINS 57.560%

was ready . She and her mom my crawled inside and
Spot again ! She painted a beautiful picture of him ,
. But they do not see the rock . Lily trips
he was able to leap up into the sky . When
." Their mom is still angry . She tells

INTERVAL -0.064 - -0.004
CONTAINS 41.790%

, the large hammer was always ready to spray and protect
king made a beautiful gray dress for her . The little
thanked her mom for letting her go .
and safe at home . <|endoftext|> Once upon a time ,
said to the busy friend , " If you take a

INTERVAL -0.125 - -0.064
CONTAINS 0.325%

she wasn 't wearing her fancy dress anymore . She sn
gave her a y ummy l oll ipop as a treat
little girl and gave her a big smile before getting back
me adow . In the me adow , there was a
any honey . Then he went to a small pond .
Token no
Token ID645
Feature activation+3.92e-02

Pos logit contributions
 mess+0.041
 wasteful+0.039
 overflow+0.038
 roughly+0.036
 neither+0.036
Neg logit contributions
 name-0.046
ka-0.044
ance-0.042
dot-0.041
 ti-0.040
Token!
Token ID0
Feature activation+2.36e-02

Pos logit contributions
 overflow+0.079
 neither+0.079
 wasteful+0.079
 unfortunately+0.071
 ruining+0.070
Neg logit contributions
ka-0.042
 name-0.041
 powers-0.037
dot-0.035
 aboard-0.034
Token We
Token ID775
Feature activation+3.25e-02

Pos logit contributions
 mess+0.046
 neither+0.043
 wasteful+0.042
 overflow+0.041
 ruining+0.039
Neg logit contributions
 name-0.032
ka-0.029
ance-0.028
dot-0.028
tie-0.027
Token destroyed
Token ID6572
Feature activation+7.10e-02

Pos logit contributions
 neither+0.058
 mess+0.057
 structures+0.056
 wasteful+0.055
 overflow+0.055
Neg logit contributions
 name-0.043
tie-0.040
ka-0.039
ocket-0.036
dot-0.036
Token the
Token ID262
Feature activation+5.22e-02

Pos logit contributions
 overflow+0.137
 forgot+0.129
 neither+0.128
 wasteful+0.127
ivery+0.122
Neg logit contributions
 aboard-0.088
 name-0.081
ka-0.080
 Magic-0.070
ance-0.070
Token apple
Token ID17180
Feature activation+1.17e-01

Pos logit contributions
 neither+0.099
 overflow+0.090
 wasteful+0.087
 roughly+0.082
 ruining+0.081
Neg logit contributions
 name-0.075
orn-0.069
tie-0.066
 amulet-0.062
 Crown-0.061
Token!"
Token ID2474
Feature activation+2.19e-02

Pos logit contributions
 wasteful+0.258
 overflow+0.238
 sudden+0.237
 neither+0.235
Sadly+0.233
Neg logit contributions
orn-0.091
tie-0.082
 aboard-0.082
ed-0.066
anger-0.066
Token Lily
Token ID20037
Feature activation+7.10e-02

Pos logit contributions
 mess+0.030
 neither+0.030
 wasteful+0.029
 overflow+0.027
 ruining+0.026
Neg logit contributions
 name-0.041
ka-0.038
dot-0.037
leaf-0.037
tie-0.037
Token cried
Token ID16896
Feature activation+6.83e-03

Pos logit contributions
 overflow+0.163
 structures+0.161
 flood+0.159
 wasteful+0.158
 earthquake+0.153
Neg logit contributions
ed-0.056
 name-0.056
 hugs-0.049
u-0.048
 gown-0.047
Token.
Token ID13
Feature activation+1.43e-02

Pos logit contributions
 neither+0.012
 mess+0.012
 wasteful+0.012
 overflow+0.011
 ruining+0.011
Neg logit contributions
 name-0.010
ka-0.010
tie-0.009
ance-0.009
dot-0.009
Token\n
Token ID198
Feature activation+7.78e-03

Pos logit contributions
 mess+0.028
 neither+0.027
 wasteful+0.026
 overflow+0.025
 ruining+0.024
Neg logit contributions
 name-0.020
tie-0.018
dot-0.018
ka-0.017
ance-0.017
Token go
Token ID467
Feature activation-2.51e-02

Pos logit contributions
 neither+0.062
 wasteful+0.062
 mess+0.058
 roughly+0.054
 mosquitoes+0.054
Neg logit contributions
 name-0.040
tie-0.031
ance-0.030
 aboard-0.030
 Crown-0.030
Token back
Token ID736
Feature activation+2.59e-02

Pos logit contributions
ka+0.041
tie+0.040
ance+0.039
anced+0.039
dot+0.039
Neg logit contributions
 mess-0.041
 neither-0.039
 overflow-0.035
 wasteful-0.034
 fighting-0.034
Token to
Token ID284
Feature activation+3.17e-02

Pos logit contributions
 neither+0.053
 wasteful+0.052
 mess+0.049
 overflow+0.046
 ruining+0.045
Neg logit contributions
 aboard-0.034
 name-0.033
ka-0.031
ance-0.030
tie-0.030
Token the
Token ID262
Feature activation+2.68e-02

Pos logit contributions
 neither+0.060
 wasteful+0.057
 roughly+0.052
 mess+0.052
 overflow+0.050
Neg logit contributions
 name-0.045
 aboard-0.044
ance-0.040
 Magic-0.039
 outer-0.039
Token pool
Token ID5933
Feature activation+8.02e-02

Pos logit contributions
 neither+0.047
 wasteful+0.041
 mess+0.038
 roughly+0.037
 overflow+0.036
Neg logit contributions
 name-0.045
 aboard-0.040
ka-0.040
tie-0.040
 Crown-0.040
Tokenside
Token ID1589
Feature activation+1.06e-01

Pos logit contributions
 wasteful+0.155
 neither+0.148
 ruining+0.147
 mosquitoes+0.144
 spitting+0.138
Neg logit contributions
 aboard-0.096
 name-0.090
tie-0.086
ka-0.082
 suit-0.080
Token,
Token ID11
Feature activation+5.40e-02

Pos logit contributions
 wasteful+0.238
 neither+0.219
 ruining+0.209
 mosquitoes+0.204
 roughly+0.195
Neg logit contributions
 name-0.131
 aboard-0.116
 powers-0.099
 outfit-0.099
 costume-0.098
Token Mia
Token ID32189
Feature activation+3.08e-02

Pos logit contributions
 wasteful+0.078
 neither+0.073
 roughly+0.071
oving+0.070
 mess+0.068
Neg logit contributions
 name-0.104
 aboard-0.086
 hello-0.080
ance-0.078
 hi-0.078
Token.
Token ID13
Feature activation+9.52e-03

Pos logit contributions
 neither+0.053
 wasteful+0.051
 mess+0.048
 overflow+0.048
 ruining+0.047
Neg logit contributions
 name-0.045
 aboard-0.043
ka-0.043
tie-0.042
ance-0.042
Token Maybe
Token ID6674
Feature activation+3.11e-02

Pos logit contributions
 mess+0.015
 neither+0.015
 wasteful+0.014
 overflow+0.013
 ruining+0.013
Neg logit contributions
 name-0.016
tie-0.015
ance-0.015
ka-0.015
dot-0.015
Token we
Token ID356
Feature activation+6.67e-03

Pos logit contributions
 mess+0.041
 wasteful+0.039
 neither+0.037
 overflow+0.036
 roughly+0.035
Neg logit contributions
 name-0.059
 aboard-0.059
ance-0.055
tie-0.053
dot-0.053
Token going
Token ID1016
Feature activation-1.71e-02

Pos logit contributions
 neither+0.035
 mess+0.035
 wasteful+0.033
 structures+0.032
 overflow+0.031
Neg logit contributions
 name-0.038
dot-0.035
leaf-0.035
 aboard-0.035
tie-0.034
Token to
Token ID284
Feature activation+3.29e-02

Pos logit contributions
ka+0.026
tie+0.025
ance+0.025
anced+0.025
shirt+0.025
Neg logit contributions
 mess-0.030
 neither-0.030
 wasteful-0.027
 fighting-0.026
 ruining-0.026
Token eat
Token ID4483
Feature activation+3.61e-02

Pos logit contributions
 neither+0.063
 wasteful+0.059
 mess+0.055
 roughly+0.053
 structures+0.051
Neg logit contributions
 name-0.049
 aboard-0.042
 Magic-0.041
 inspect-0.040
leaf-0.040
Token you
Token ID345
Feature activation+4.95e-02

Pos logit contributions
 neither+0.063
 mess+0.058
 wasteful+0.058
 overflow+0.055
 forgot+0.051
Neg logit contributions
 name-0.053
 aboard-0.052
tie-0.051
shirt-0.051
ance-0.051
Token for
Token ID329
Feature activation+2.83e-02

Pos logit contributions
 neither+0.091
 wasteful+0.087
 mess+0.081
 overflow+0.079
Uh+0.077
Neg logit contributions
 name-0.070
 aboard-0.068
tie-0.067
ance-0.066
ka-0.065
Token dinner
Token ID8073
Feature activation+1.04e-01

Pos logit contributions
 neither+0.055
 wasteful+0.051
 mess+0.051
 overflow+0.050
 roughly+0.046
Neg logit contributions
 powers-0.035
 name-0.035
dot-0.035
ance-0.034
 aboard-0.034
Token!"
Token ID2474
Feature activation+2.85e-02

Pos logit contributions
 wasteful+0.210
Uh+0.198
 overflow+0.197
 neither+0.196
 structures+0.189
Neg logit contributions
tie-0.109
 ti-0.105
 aboard-0.102
 name-0.099
ka-0.098
Token Tim
Token ID5045
Feature activation+1.67e-02

Pos logit contributions
 mess+0.041
 wasteful+0.040
 neither+0.040
 structures+0.036
 ruining+0.035
Neg logit contributions
 name-0.051
tie-0.048
shirt-0.047
Aid-0.046
dot-0.046
Tokenmy
Token ID1820
Feature activation+3.42e-02

Pos logit contributions
 neither+0.034
 wasteful+0.032
 mess+0.032
 overflow+0.030
 ruining+0.029
Neg logit contributions
ka-0.021
 name-0.021
tie-0.020
shirt-0.019
ance-0.019
Token was
Token ID373
Feature activation+3.21e-02

Pos logit contributions
 wasteful+0.066
 neither+0.066
 mess+0.064
 overflow+0.061
 structures+0.060
Neg logit contributions
 name-0.044
tie-0.040
ka-0.040
shirt-0.039
dot-0.038
Token so
Token ID523
Feature activation+3.21e-02

Pos logit contributions
 mess+0.048
 neither+0.048
 wasteful+0.047
 overflow+0.045
 structures+0.045
Neg logit contributions
 name-0.055
dot-0.047
tie-0.046
 aboard-0.045
leaf-0.045
Token want
Token ID765
Feature activation+5.56e-03

Pos logit contributions
 neither+0.003
 mess+0.003
 wasteful+0.003
 overflow+0.003
 fighting+0.003
Neg logit contributions
 name-0.003
tie-0.003
ka-0.003
ance-0.002
dot-0.002
Token to
Token ID284
Feature activation+2.24e-02

Pos logit contributions
 neither+0.011
 mess+0.011
 wasteful+0.011
 overflow+0.010
 roughly+0.010
Neg logit contributions
 name-0.007
tie-0.007
ka-0.007
ance-0.006
dot-0.006
Token go
Token ID467
Feature activation-1.67e-02

Pos logit contributions
 wasteful+0.045
 neither+0.044
 mess+0.043
 mosquitoes+0.039
 roughly+0.038
Neg logit contributions
 name-0.029
tie-0.026
ance-0.026
 aboard-0.025
dot-0.025
Token in
Token ID287
Feature activation+2.43e-02

Pos logit contributions
ka+0.024
leaf+0.024
tie+0.024
ance+0.023
shirt+0.023
Neg logit contributions
 mess-0.030
 neither-0.030
 wasteful-0.027
 fighting-0.026
 overflow-0.026
Token the
Token ID262
Feature activation+1.53e-03

Pos logit contributions
 neither+0.043
 wasteful+0.043
 mess+0.039
 overflow+0.037
 roughly+0.036
Neg logit contributions
 aboard-0.035
 name-0.034
ka-0.032
tie-0.032
ance-0.032
Token cave
Token ID11527
Feature activation+1.03e-01

Pos logit contributions
 neither+0.003
 mess+0.002
 wasteful+0.002
 overflow+0.002
 fighting+0.002
Neg logit contributions
ka-0.002
 name-0.002
tie-0.002
ance-0.002
dot-0.002
Token.
Token ID13
Feature activation+7.88e-03

Pos logit contributions
 wasteful+0.193
 mosquitoes+0.187
 spitting+0.184
 neither+0.184
 ruining+0.181
Neg logit contributions
 aboard-0.117
tie-0.116
 name-0.110
ka-0.109
 ti-0.105
Token It
Token ID632
Feature activation+2.28e-02

Pos logit contributions
 mess+0.015
 neither+0.015
 wasteful+0.014
 overflow+0.013
 ruining+0.013
Neg logit contributions
 name-0.011
tie-0.011
ka-0.011
dot-0.010
ance-0.010
Token's
Token ID338
Feature activation+2.83e-02

Pos logit contributions
 neither+0.039
 wasteful+0.038
 mess+0.036
 roughly+0.034
 overflow+0.033
Neg logit contributions
 name-0.034
ka-0.033
 aboard-0.032
tie-0.032
ance-0.032
Token too
Token ID1165
Feature activation+2.76e-02

Pos logit contributions
 neither+0.042
 wasteful+0.038
 mess+0.037
 structures+0.035
 overflow+0.035
Neg logit contributions
 name-0.048
 aboard-0.044
tie-0.043
 powers-0.043
ka-0.043
Token ugly
Token ID13400
Feature activation+2.15e-02

Pos logit contributions
 neither+0.047
 wasteful+0.042
 mess+0.041
 ruining+0.040
 fighting+0.039
Neg logit contributions
 name-0.042
ka-0.042
 aboard-0.041
tie-0.039
dot-0.037
Token not
Token ID407
Feature activation+1.55e-02

Pos logit contributions
 wasteful+0.064
 structures+0.062
 neither+0.061
 rats+0.060
 mess+0.058
Neg logit contributions
 name-0.075
tie-0.059
ad-0.056
 polish-0.055
 Magic-0.055
Token pull
Token ID2834
Feature activation+2.56e-02

Pos logit contributions
 neither+0.028
 mess+0.027
 wasteful+0.027
 rats+0.024
 ruining+0.024
Neg logit contributions
 name-0.022
tie-0.022
ka-0.021
shirt-0.020
dot-0.020
Token them
Token ID606
Feature activation+6.31e-02

Pos logit contributions
 mess+0.043
 neither+0.043
 wasteful+0.041
 overflow+0.041
 roughly+0.038
Neg logit contributions
ka-0.038
tie-0.036
 name-0.035
 aboard-0.035
ance-0.035
Token off
Token ID572
Feature activation+6.71e-02

Pos logit contributions
 wasteful+0.135
 neither+0.128
 mosquitoes+0.126
Uh+0.126
 spitting+0.123
Neg logit contributions
 aboard-0.064
tie-0.058
 name-0.058
ance-0.054
ka-0.050
Token the
Token ID262
Feature activation+1.95e-02

Pos logit contributions
 neither+0.123
 wasteful+0.115
 mess+0.111
 overflow+0.109
 ruining+0.106
Neg logit contributions
 aboard-0.085
ka-0.082
tie-0.079
 name-0.079
 outer-0.076
Token stems
Token ID21552
Feature activation+1.02e-01

Pos logit contributions
 neither+0.032
 wasteful+0.029
 mess+0.028
 overflow+0.026
 fighting+0.026
Neg logit contributions
 name-0.031
tie-0.031
shirt-0.030
ka-0.030
dot-0.030
Token,
Token ID11
Feature activation+5.72e-02

Pos logit contributions
 wasteful+0.203
 neither+0.196
 ruining+0.189
 mosquitoes+0.186
 rats+0.185
Neg logit contributions
tie-0.110
 aboard-0.099
 name-0.098
ance-0.094
ka-0.089
Token and
Token ID290
Feature activation+5.56e-02

Pos logit contributions
 mess+0.076
 neither+0.075
 anth+0.074
 wasteful+0.073
 roughly+0.072
Neg logit contributions
 name-0.097
tie-0.083
 aboard-0.082
shirt-0.080
 rib-0.079
Token we
Token ID356
Feature activation+2.27e-02

Pos logit contributions
 neither+0.083
 mess+0.077
 overflow+0.075
 wasteful+0.074
 roughly+0.073
Neg logit contributions
 name-0.089
 aboard-0.080
tie-0.079
 powers-0.079
leaf-0.079
Token could
Token ID714
Feature activation+4.34e-02

Pos logit contributions
 neither+0.042
 mess+0.041
 wasteful+0.040
 structures+0.038
 overflow+0.037
Neg logit contributions
 name-0.031
tie-0.029
ka-0.029
ance-0.027
leaf-0.027
Token not
Token ID407
Feature activation+1.82e-02

Pos logit contributions
 wasteful+0.073
 structures+0.068
 neither+0.068
 rats+0.066
 erupted+0.064
Neg logit contributions
 name-0.070
tie-0.055
 polish-0.051
 aboard-0.050
ad-0.050
Token not
Token ID407
Feature activation+3.04e-02

Pos logit contributions
 wasteful+0.066
 mess+0.065
 neither+0.064
 ruining+0.058
 overflow+0.058
Neg logit contributions
 name-0.055
tie-0.053
ance-0.050
ka-0.050
shirt-0.050
Token to
Token ID284
Feature activation+4.69e-02

Pos logit contributions
 neither+0.064
 mess+0.062
 wasteful+0.061
 overflow+0.057
 roughly+0.057
Neg logit contributions
 name-0.035
tie-0.031
 aboard-0.029
ance-0.028
ka-0.028
Token fall
Token ID2121
Feature activation+3.74e-02

Pos logit contributions
 neither+0.092
 wasteful+0.092
 mess+0.090
 structures+0.087
 unfortunately+0.086
Neg logit contributions
 name-0.059
tie-0.047
 Crown-0.043
dot-0.043
ka-0.042
Token into
Token ID656
Feature activation+3.12e-02

Pos logit contributions
 wasteful+0.062
 neither+0.061
 mess+0.061
 roughly+0.054
 ruining+0.053
Neg logit contributions
 aboard-0.055
ka-0.055
tie-0.054
 name-0.054
ance-0.052
Token the
Token ID262
Feature activation+3.89e-03

Pos logit contributions
 neither+0.052
 mess+0.050
 wasteful+0.049
 overflow+0.046
 roughly+0.045
Neg logit contributions
 name-0.045
 aboard-0.043
ka-0.043
ance-0.043
 powers-0.042
Token pond
Token ID16723
Feature activation+1.02e-01

Pos logit contributions
 neither+0.006
 mess+0.005
 wasteful+0.005
 overflow+0.005
 roughly+0.005
Neg logit contributions
ka-0.007
 name-0.007
tie-0.007
ance-0.007
dot-0.007
Token.
Token ID13
Feature activation+2.80e-02

Pos logit contributions
Uh+0.201
 wasteful+0.201
 neither+0.199
 ruining+0.192
 mess+0.188
Neg logit contributions
 name-0.093
tie-0.092
 aboard-0.089
ance-0.078
 gown-0.078
Token It
Token ID632
Feature activation+4.45e-02

Pos logit contributions
 mess+0.056
 neither+0.054
 wasteful+0.051
 overflow+0.048
 ruining+0.047
Neg logit contributions
 name-0.038
dot-0.033
ance-0.033
ka-0.033
tie-0.033
Token's
Token ID338
Feature activation+2.52e-02

Pos logit contributions
 mess+0.091
 wasteful+0.090
 neither+0.090
 roughly+0.088
 structures+0.086
Neg logit contributions
 name-0.046
 aboard-0.039
ance-0.038
ka-0.038
 shone-0.037
Token deep
Token ID2769
Feature activation+3.27e-02

Pos logit contributions
 neither+0.029
 mess+0.028
 structures+0.025
 wasteful+0.025
 overflow+0.024
Neg logit contributions
 name-0.051
tie-0.048
ka-0.047
 powers-0.046
 aboard-0.046
Token and
Token ID290
Feature activation+3.68e-02

Pos logit contributions
 neither+0.059
 wasteful+0.056
 mess+0.055
 overflow+0.051
 ruining+0.050
Neg logit contributions
ka-0.047
 name-0.045
 aboard-0.044
tie-0.044
 powers-0.042
Token want
Token ID765
Feature activation+1.16e-02

Pos logit contributions
 neither+0.019
 mess+0.018
 wasteful+0.018
 structures+0.016
 overflow+0.016
Neg logit contributions
 name-0.015
tie-0.015
ka-0.014
dot-0.014
ance-0.014
Token to
Token ID284
Feature activation+2.70e-02

Pos logit contributions
 neither+0.022
 mess+0.022
 wasteful+0.021
 overflow+0.020
 roughly+0.019
Neg logit contributions
 name-0.015
tie-0.015
ka-0.015
ance-0.014
dot-0.014
Token play
Token ID711
Feature activation+2.65e-02

Pos logit contributions
 neither+0.049
 wasteful+0.049
 mess+0.046
 structures+0.044
 roughly+0.042
Neg logit contributions
 name-0.040
tie-0.036
ka-0.035
dot-0.034
ance-0.033
Token in
Token ID287
Feature activation+4.92e-02

Pos logit contributions
 neither+0.048
 wasteful+0.045
 mess+0.044
 overflow+0.042
 erupted+0.040
Neg logit contributions
 name-0.038
tie-0.036
 aboard-0.035
ka-0.035
 powers-0.034
Token the
Token ID262
Feature activation+3.02e-02

Pos logit contributions
 neither+0.098
 wasteful+0.098
 forgot+0.089
 overflow+0.088
 erupted+0.087
Neg logit contributions
 outer-0.054
 aboard-0.053
 powers-0.051
ance-0.048
 Magic-0.047
Token hut
Token ID40812
Feature activation+1.01e-01

Pos logit contributions
 neither+0.057
 wasteful+0.048
 fighting+0.046
 overflow+0.046
 mess+0.044
Neg logit contributions
 name-0.044
orn-0.041
 amulet-0.041
 cloak-0.040
tie-0.040
Token anymore
Token ID7471
Feature activation+7.12e-02

Pos logit contributions
 wasteful+0.195
 neither+0.194
 overflow+0.181
 mosquitoes+0.180
 spitting+0.180
Neg logit contributions
 aboard-0.107
tie-0.102
 name-0.101
 ti-0.093
 amulet-0.090
Token,"
Token ID553
Feature activation+2.88e-02

Pos logit contributions
 wasteful+0.150
 overflow+0.145
 neither+0.140
 ruining+0.130
 mess+0.129
Neg logit contributions
 name-0.077
 aboard-0.071
 powers-0.068
ance-0.067
 ti-0.066
Token Lily
Token ID20037
Feature activation+4.28e-02

Pos logit contributions
 mess+0.049
 neither+0.047
 overflow+0.047
 wasteful+0.046
 ruining+0.044
Neg logit contributions
 name-0.042
 Magic-0.036
shirt-0.036
dot-0.036
 Crown-0.036
Token said
Token ID531
Feature activation-3.53e-04

Pos logit contributions
 overflow+0.084
 wasteful+0.084
 structures+0.081
 ruining+0.080
 flood+0.080
Neg logit contributions
 name-0.052
 gown-0.045
tie-0.044
shirt-0.044
ka-0.044
Token.
Token ID13
Feature activation+9.82e-03

Pos logit contributions
ka+0.001
tie+0.001
 name+0.001
ance+0.001
dot+0.001
Neg logit contributions
 neither-0.001
 mess-0.001
 wasteful-0.001
 overflow-0.001
 fighting-0.001
TokenWe
Token ID1135
Feature activation-5.86e-03

Pos logit contributions
 mess+0.006
 neither+0.006
 wasteful+0.006
 overflow+0.005
 ruining+0.005
Neg logit contributions
ka-0.007
tie-0.007
 name-0.007
leaf-0.007
dot-0.007
Token want
Token ID765
Feature activation+3.31e-02

Pos logit contributions
ka+0.010
tie+0.010
 name+0.009
ance+0.009
dot+0.009
Neg logit contributions
 neither-0.009
 mess-0.009
 wasteful-0.009
 fighting-0.008
 overflow-0.008
Token to
Token ID284
Feature activation+2.14e-02

Pos logit contributions
 neither+0.073
 wasteful+0.070
 mess+0.070
 overflow+0.068
 roughly+0.066
Neg logit contributions
 name-0.034
leaf-0.031
 powers-0.030
 aboard-0.030
tie-0.030
Token eat
Token ID4483
Feature activation+5.08e-02

Pos logit contributions
 neither+0.042
 wasteful+0.041
 mess+0.040
 fighting+0.036
 mosquitoes+0.036
Neg logit contributions
 name-0.031
tie-0.027
ka-0.026
dot-0.026
ance-0.025
Token the
Token ID262
Feature activation+1.53e-02

Pos logit contributions
 neither+0.094
 overflow+0.091
 wasteful+0.089
 forgot+0.088
 mess+0.087
Neg logit contributions
 name-0.066
 aboard-0.065
tie-0.059
ka-0.058
shirt-0.056
Token banana
Token ID25996
Feature activation+1.00e-01

Pos logit contributions
 neither+0.025
 wasteful+0.023
 mess+0.022
 overflow+0.021
 roughly+0.020
Neg logit contributions
 name-0.026
ka-0.025
tie-0.025
 aboard-0.023
 Crown-0.023
Token,
Token ID11
Feature activation+4.46e-02

Pos logit contributions
 wasteful+0.189
 ruining+0.182
 overflow+0.181
 sudden+0.179
 neither+0.177
Neg logit contributions
tie-0.111
 name-0.111
 aboard-0.107
orn-0.097
 powers-0.096
Token but
Token ID475
Feature activation+3.58e-02

Pos logit contributions
 wasteful+0.065
 neither+0.065
 mess+0.065
 roughly+0.061
 overflow+0.060
Neg logit contributions
 name-0.079
tie-0.067
uit-0.063
 aboard-0.063
shirt-0.063
Token we
Token ID356
Feature activation+3.17e-02

Pos logit contributions
 mess+0.052
 overflow+0.049
 roughly+0.048
 neither+0.048
 wasteful+0.047
Neg logit contributions
 name-0.062
 aboard-0.055
 Crown-0.053
tie-0.053
 Magic-0.052
Token can
Token ID460
Feature activation+5.00e-02

Pos logit contributions
 neither+0.059
 mess+0.056
 wasteful+0.056
 structures+0.054
 mosquitoes+0.052
Neg logit contributions
 name-0.043
tie-0.040
ka-0.037
leaf-0.036
ocket-0.036
Token't
Token ID470
Feature activation+4.67e-02

Pos logit contributions
 wasteful+0.095
 mess+0.088
 neither+0.087
 rats+0.085
 mosquitoes+0.083
Neg logit contributions
 name-0.080
tie-0.065
ad-0.061
 Crown-0.057
 Magic-0.057
Token won
Token ID1839
Feature activation+1.89e-02

Pos logit contributions
ka+0.006
tie+0.006
 name+0.006
ance+0.006
dot+0.006
Neg logit contributions
 neither-0.006
 mess-0.006
 wasteful-0.005
 overflow-0.005
 fighting-0.005
Token't
Token ID470
Feature activation+3.31e-02

Pos logit contributions
 neither+0.028
 mess+0.028
 wasteful+0.027
 overflow+0.025
 rats+0.024
Neg logit contributions
 name-0.032
ka-0.031
tie-0.030
ance-0.030
 aboard-0.030
Token swim
Token ID9422
Feature activation+4.52e-02

Pos logit contributions
 neither+0.059
 wasteful+0.056
 rats+0.053
 structures+0.052
 mess+0.052
Neg logit contributions
 name-0.050
tie-0.046
ka-0.043
yo-0.041
leaf-0.041
Token in
Token ID287
Feature activation+4.76e-02

Pos logit contributions
 neither+0.070
 mess+0.070
 wasteful+0.070
 ruining+0.065
 mosquitoes+0.064
Neg logit contributions
tie-0.070
ka-0.069
 aboard-0.067
 name-0.067
ance-0.061
Token the
Token ID262
Feature activation+2.51e-05

Pos logit contributions
 neither+0.090
 wasteful+0.086
 overflow+0.079
keys+0.079
 forgot+0.078
Neg logit contributions
 outer-0.056
 aboard-0.056
 powers-0.055
ka-0.054
 name-0.053
Token pond
Token ID16723
Feature activation+9.97e-02

Pos logit contributions
 neither+0.000
 mess+0.000
 wasteful+0.000
 overflow+0.000
 fighting+0.000
Neg logit contributions
ka-0.000
tie-0.000
 name-0.000
ance-0.000
dot-0.000
Token again
Token ID757
Feature activation+6.88e-02

Pos logit contributions
 wasteful+0.200
 neither+0.195
 ruining+0.191
Uh+0.189
 overflow+0.186
Neg logit contributions
tie-0.091
 aboard-0.087
 name-0.083
 ti-0.080
ka-0.079
Token.
Token ID13
Feature activation-5.64e-03

Pos logit contributions
 neither+0.140
 wasteful+0.139
 overflow+0.135
 ruining+0.129
 mess+0.127
Neg logit contributions
 aboard-0.067
 name-0.065
tie-0.063
ka-0.061
 powers-0.058
Token I
Token ID314
Feature activation-1.27e-03

Pos logit contributions
ka+0.010
tie+0.010
ance+0.009
 name+0.009
dot+0.009
Neg logit contributions
 neither-0.009
 mess-0.008
 wasteful-0.008
 overflow-0.007
 fighting-0.007
Token will
Token ID481
Feature activation+3.58e-03

Pos logit contributions
ka+0.002
tie+0.002
 name+0.002
ance+0.002
dot+0.002
Neg logit contributions
 neither-0.002
 mess-0.002
 wasteful-0.002
 overflow-0.002
 fighting-0.002
Token be
Token ID307
Feature activation+1.67e-02

Pos logit contributions
 neither+0.006
 mess+0.006
 wasteful+0.005
 overflow+0.005
 roughly+0.005
Neg logit contributions
 name-0.006
tie-0.006
ka-0.006
ance-0.005
dot-0.005
Token want
Token ID765
Feature activation+2.24e-02

Pos logit contributions
 neither+0.029
 mess+0.028
 wasteful+0.027
 overflow+0.025
 structures+0.025
Neg logit contributions
 name-0.021
tie-0.020
ka-0.020
dot-0.019
leaf-0.019
Token to
Token ID284
Feature activation+3.28e-02

Pos logit contributions
 mess+0.048
 neither+0.048
 wasteful+0.047
 overflow+0.045
 roughly+0.043
Neg logit contributions
 name-0.023
tie-0.022
ka-0.022
 aboard-0.021
leaf-0.021
Token play
Token ID711
Feature activation+3.86e-02

Pos logit contributions
 wasteful+0.065
 neither+0.064
 mess+0.061
 rats+0.059
 mosquitoes+0.058
Neg logit contributions
 name-0.045
tie-0.036
ka-0.036
 aboard-0.035
ad-0.034
Token in
Token ID287
Feature activation+5.17e-02

Pos logit contributions
 neither+0.074
 wasteful+0.070
 mess+0.066
 overflow+0.065
 forgot+0.064
Neg logit contributions
 name-0.051
tie-0.047
 aboard-0.046
ka-0.044
 ti-0.044
Token the
Token ID262
Feature activation+6.56e-03

Pos logit contributions
 neither+0.106
 wasteful+0.106
 forgot+0.099
 overflow+0.097
oving+0.094
Neg logit contributions
 outer-0.055
 aboard-0.055
 powers-0.047
 Magic-0.047
 name-0.046
Token barrel
Token ID9036
Feature activation+9.87e-02

Pos logit contributions
 neither+0.012
 mess+0.011
 wasteful+0.011
 overflow+0.010
 roughly+0.010
Neg logit contributions
 name-0.010
tie-0.010
ka-0.010
dot-0.009
 aboard-0.009
Token anymore
Token ID7471
Feature activation+7.03e-02

Pos logit contributions
 wasteful+0.227
 sudden+0.215
Uh+0.210
 neither+0.209
 mosquitoes+0.209
Neg logit contributions
tie-0.075
 aboard-0.072
ane-0.061
 name-0.060
ance-0.059
Token.
Token ID13
Feature activation+7.87e-03

Pos logit contributions
 wasteful+0.158
 overflow+0.154
 neither+0.146
 mess+0.146
 ruining+0.140
Neg logit contributions
 name-0.065
 aboard-0.061
 powers-0.055
ance-0.055
tie-0.054
Token It
Token ID632
Feature activation+3.92e-02

Pos logit contributions
 mess+0.014
 neither+0.014
 wasteful+0.013
 overflow+0.012
 ruining+0.012
Neg logit contributions
 name-0.012
tie-0.011
ka-0.011
dot-0.011
ance-0.011
Token is
Token ID318
Feature activation+3.13e-02

Pos logit contributions
 neither+0.090
 wasteful+0.088
 roughly+0.084
 overflow+0.084
 mess+0.083
Neg logit contributions
 name-0.033
 aboard-0.029
ka-0.028
ance-0.026
 shone-0.025
Token too
Token ID1165
Feature activation+3.10e-02

Pos logit contributions
 neither+0.045
 mess+0.039
 wasteful+0.038
 overflow+0.038
 structures+0.036
Neg logit contributions
 name-0.056
tie-0.051
 aboard-0.050
ka-0.050
 powers-0.048
Token't
Token ID470
Feature activation-5.56e-03

Pos logit contributions
 mess+0.027
 neither+0.027
 wasteful+0.026
 overflow+0.023
 roughly+0.022
Neg logit contributions
 name-0.031
tie-0.031
ance-0.031
ka-0.031
 aboard-0.030
Token want
Token ID765
Feature activation+9.90e-03

Pos logit contributions
ka+0.009
tie+0.009
ance+0.008
 name+0.008
dot+0.008
Neg logit contributions
 neither-0.009
 mess-0.009
 wasteful-0.009
 overflow-0.008
 fighting-0.008
Token to
Token ID284
Feature activation+3.20e-02

Pos logit contributions
 neither+0.020
 mess+0.019
 wasteful+0.019
 overflow+0.018
 roughly+0.017
Neg logit contributions
 name-0.012
tie-0.012
ka-0.012
dot-0.011
ance-0.011
Token eat
Token ID4483
Feature activation+4.44e-02

Pos logit contributions
 wasteful+0.060
 neither+0.059
 mess+0.055
 rats+0.053
 structures+0.052
Neg logit contributions
 name-0.046
tie-0.040
ka-0.038
dot-0.037
 Crown-0.036
Token the
Token ID262
Feature activation+2.07e-02

Pos logit contributions
 neither+0.079
 wasteful+0.075
 overflow+0.074
 forgot+0.071
 mess+0.071
Neg logit contributions
 name-0.059
 aboard-0.057
ka-0.056
tie-0.055
ance-0.054
Token pit
Token ID6028
Feature activation+9.85e-02

Pos logit contributions
 neither+0.032
 wasteful+0.029
 mess+0.028
 roughly+0.027
 overflow+0.026
Neg logit contributions
 name-0.037
ka-0.034
tie-0.034
 aboard-0.033
 Crown-0.033
Token.
Token ID13
Feature activation-3.50e-03

Pos logit contributions
 wasteful+0.187
Sadly+0.183
 ruining+0.179
 sudden+0.176
 overflow+0.175
Neg logit contributions
 aboard-0.110
tie-0.109
ka-0.105
iest-0.096
 name-0.091
Token I
Token ID314
Feature activation-2.90e-03

Pos logit contributions
ka+0.004
tie+0.004
 name+0.004
ance+0.004
dot+0.004
Neg logit contributions
 neither-0.007
 mess-0.007
 wasteful-0.007
 overflow-0.006
 fighting-0.006
Token only
Token ID691
Feature activation+6.23e-04

Pos logit contributions
ka+0.005
tie+0.005
 name+0.005
ance+0.005
dot+0.005
Neg logit contributions
 neither-0.005
 mess-0.004
 wasteful-0.004
 overflow-0.004
 fighting-0.004
Token want
Token ID765
Feature activation+2.03e-02

Pos logit contributions
 neither+0.001
 mess+0.001
 wasteful+0.001
 overflow+0.001
 fighting+0.001
Neg logit contributions
ka-0.001
tie-0.001
 name-0.001
ance-0.001
dot-0.001
Token to
Token ID284
Feature activation+3.40e-02

Pos logit contributions
 neither+0.041
 mess+0.040
 wasteful+0.039
 overflow+0.038
 roughly+0.037
Neg logit contributions
 name-0.023
tie-0.023
 aboard-0.022
ka-0.022
 powers-0.022
Token you
Token ID345
Feature activation+2.85e-02

Pos logit contributions
 wasteful+0.066
 mess+0.066
 neither+0.065
 overflow+0.063
 roughly+0.058
Neg logit contributions
ance-0.079
 name-0.077
ka-0.075
 Magic-0.073
 aboard-0.073
Token to
Token ID284
Feature activation+3.08e-02

Pos logit contributions
 mess+0.042
 neither+0.041
 wasteful+0.039
 overflow+0.037
 ruining+0.035
Neg logit contributions
 name-0.048
tie-0.046
ance-0.046
ka-0.044
dot-0.044
Token clean
Token ID3424
Feature activation+2.97e-02

Pos logit contributions
 neither+0.055
 wasteful+0.052
 mess+0.051
 structures+0.048
 rats+0.047
Neg logit contributions
 name-0.047
tie-0.041
ance-0.041
ka-0.040
shirt-0.039
Token up
Token ID510
Feature activation+4.76e-02

Pos logit contributions
 neither+0.057
 wasteful+0.054
 mess+0.051
 overflow+0.051
 erupted+0.049
Neg logit contributions
ka-0.038
 name-0.038
 aboard-0.037
tie-0.036
shirt-0.035
Token your
Token ID534
Feature activation+3.10e-02

Pos logit contributions
 neither+0.082
 wasteful+0.076
 mess+0.075
 overflow+0.073
 quarrel+0.069
Neg logit contributions
 aboard-0.066
ance-0.065
ka-0.064
 name-0.064
shirt-0.062
Token room
Token ID2119
Feature activation+9.80e-02

Pos logit contributions
 neither+0.051
 wasteful+0.043
 unfortunately+0.038
 overflow+0.038
 mosquitoes+0.037
Neg logit contributions
 name-0.057
 powers-0.051
ka-0.051
 gown-0.048
 outfit-0.047
Token!
Token ID0
Feature activation+7.28e-03

Pos logit contributions
 mosquitoes+0.172
 neither+0.171
Uh+0.168
 wasteful+0.168
 snakes+0.163
Neg logit contributions
tie-0.111
 name-0.110
ka-0.110
 powers-0.110
ance-0.108
Token Now
Token ID2735
Feature activation+2.84e-02

Pos logit contributions
 mess+0.013
 neither+0.013
 wasteful+0.012
 overflow+0.011
 ruining+0.011
Neg logit contributions
 name-0.011
tie-0.010
ka-0.010
ance-0.010
dot-0.010
Token,
Token ID11
Feature activation+2.04e-02

Pos logit contributions
 wasteful+0.042
 neither+0.041
 mess+0.040
 overflow+0.037
 roughly+0.036
Neg logit contributions
 name-0.049
ance-0.046
 aboard-0.045
tie-0.045
ka-0.044
Token I
Token ID314
Feature activation-1.99e-03

Pos logit contributions
 neither+0.030
 mess+0.030
 wasteful+0.028
 overflow+0.026
 roughly+0.026
Neg logit contributions
 name-0.036
tie-0.033
ance-0.033
 aboard-0.033
ka-0.032
Token have
Token ID423
Feature activation+1.52e-02

Pos logit contributions
ka+0.003
tie+0.003
 name+0.003
ance+0.003
dot+0.003
Neg logit contributions
 neither-0.004
 mess-0.004
 wasteful-0.003
 overflow-0.003
 fighting-0.003
Token's
Token ID338
Feature activation+1.40e-02

Pos logit contributions
 wasteful+0.035
 neither+0.034
 mess+0.033
 roughly+0.032
 overflow+0.031
Neg logit contributions
 name-0.036
ance-0.033
ka-0.032
 aboard-0.032
tie-0.032
Token not
Token ID407
Feature activation+3.08e-02

Pos logit contributions
 neither+0.027
 wasteful+0.026
 mess+0.026
 overflow+0.023
 rats+0.023
Neg logit contributions
 name-0.019
tie-0.018
ance-0.017
ka-0.017
 aboard-0.017
Token play
Token ID711
Feature activation+3.57e-02

Pos logit contributions
 wasteful+0.062
 neither+0.059
 mess+0.056
 floor+0.053
 structures+0.053
Neg logit contributions
 name-0.041
 aboard-0.034
tie-0.034
ance-0.032
dot-0.031
Token in
Token ID287
Feature activation+5.16e-02

Pos logit contributions
 neither+0.069
 wasteful+0.067
 mess+0.063
 overflow+0.060
 anth+0.059
Neg logit contributions
 name-0.045
 aboard-0.043
 powers-0.040
tie-0.040
ka-0.038
Token the
Token ID262
Feature activation+2.75e-02

Pos logit contributions
 wasteful+0.105
 neither+0.104
 forgot+0.095
 overflow+0.093
 mess+0.091
Neg logit contributions
 aboard-0.057
 outer-0.053
 powers-0.048
 name-0.047
ance-0.045
Token forest
Token ID8222
Feature activation+9.77e-02

Pos logit contributions
 neither+0.046
 wasteful+0.039
 mess+0.035
 fighting+0.034
 roughly+0.034
Neg logit contributions
 name-0.049
orn-0.043
 cloak-0.042
tie-0.041
 amulet-0.041
Token again
Token ID757
Feature activation+7.40e-02

Pos logit contributions
 wasteful+0.188
 neither+0.184
 overflow+0.175
 spitting+0.174
 ruining+0.174
Neg logit contributions
 aboard-0.104
tie-0.088
 powers-0.086
ka-0.085
 ti-0.084
Token,"
Token ID553
Feature activation-1.41e-02

Pos logit contributions
 wasteful+0.147
 neither+0.143
 overflow+0.142
 ruining+0.135
 mess+0.132
Neg logit contributions
 aboard-0.088
 name-0.080
ka-0.075
 powers-0.070
 outer-0.068
Token L
Token ID406
Feature activation-2.25e-02

Pos logit contributions
tie+0.027
ka+0.026
ance+0.025
shirt+0.025
dot+0.025
Neg logit contributions
 neither-0.020
 mess-0.019
 wasteful-0.018
 fighting-0.016
 overflow-0.016
Tokenila
Token ID10102
Feature activation+2.78e-02

Pos logit contributions
tie+0.037
 name+0.036
 amulet+0.035
shirt+0.035
ka+0.035
Neg logit contributions
 neither-0.037
 mess-0.035
 wasteful-0.033
 fighting-0.031
 mosquitoes-0.030
Token said
Token ID531
Feature activation-1.66e-02

Pos logit contributions
 mess+0.047
 overflow+0.046
 wasteful+0.046
 neither+0.045
 ruining+0.044
Neg logit contributions
tie-0.041
 name-0.040
ka-0.039
leaf-0.038
shirt-0.038
TokenLet
Token ID5756
Feature activation+2.23e-02

Pos logit contributions
 neither+0.010
 mess+0.010
 wasteful+0.009
 overflow+0.008
 ruining+0.008
Neg logit contributions
tie-0.009
 name-0.009
ka-0.008
dot-0.008
shirt-0.008
Token's
Token ID338
Feature activation+2.17e-02

Pos logit contributions
 wasteful+0.037
 neither+0.037
 mess+0.035
 roughly+0.033
 overflow+0.033
Neg logit contributions
 name-0.035
ance-0.032
ka-0.032
 aboard-0.032
tie-0.032
Token go
Token ID467
Feature activation-3.96e-02

Pos logit contributions
 wasteful+0.043
 neither+0.043
 mess+0.042
 mosquitoes+0.038
 rats+0.038
Neg logit contributions
 name-0.030
tie-0.025
 aboard-0.025
ka-0.024
ance-0.024
Token down
Token ID866
Feature activation+3.38e-02

Pos logit contributions
ka+0.064
anced+0.063
ance+0.063
tie+0.063
ocket+0.062
Neg logit contributions
 mess-0.063
 neither-0.061
 fighting-0.055
 overflow-0.054
 wasteful-0.051
Token the
Token ID262
Feature activation+4.42e-02

Pos logit contributions
 wasteful+0.063
 neither+0.062
 mess+0.060
 overflow+0.055
 ruining+0.053
Neg logit contributions
 aboard-0.049
ka-0.043
 name-0.043
ance-0.043
tie-0.041
Token pit
Token ID6028
Feature activation+9.74e-02

Pos logit contributions
 neither+0.078
 wasteful+0.075
 roughly+0.063
 unfortunately+0.062
 mosquitoes+0.062
Neg logit contributions
 name-0.071
 aboard-0.065
orn-0.060
anger-0.058
 amulet-0.058
Token,"
Token ID553
Feature activation+3.67e-02

Pos logit contributions
 wasteful+0.203
 mosquitoes+0.191
 ruining+0.183
 neither+0.182
 spitting+0.182
Neg logit contributions
 aboard-0.121
tie-0.099
 name-0.096
ka-0.093
anger-0.089
Token Ben
Token ID3932
Feature activation+2.27e-02

Pos logit contributions
 mess+0.055
 wasteful+0.053
 neither+0.051
 overflow+0.050
 ruining+0.050
Neg logit contributions
 name-0.062
ka-0.059
shirt-0.056
 Magic-0.054
dot-0.054
Token said
Token ID531
Feature activation-2.71e-03

Pos logit contributions
 neither+0.036
 mess+0.036
 wasteful+0.036
 overflow+0.034
 ruining+0.033
Neg logit contributions
 name-0.036
ka-0.035
tie-0.035
shirt-0.035
dot-0.033
Token.
Token ID13
Feature activation+7.28e-03

Pos logit contributions
ka+0.004
tie+0.004
 name+0.004
ance+0.004
dot+0.004
Neg logit contributions
 neither-0.005
 mess-0.004
 wasteful-0.004
 overflow-0.004
 fighting-0.004
Token "
Token ID366
Feature activation+5.39e-03

Pos logit contributions
 mess+0.013
 neither+0.013
 wasteful+0.012
 ruining+0.011
 overflow+0.011
Neg logit contributions
 name-0.011
ka-0.010
tie-0.010
dot-0.010
shirt-0.010
Token sit
Token ID1650
Feature activation+2.31e-02

Pos logit contributions
 wasteful+0.055
 neither+0.055
 mess+0.051
 roughly+0.049
 structures+0.048
Neg logit contributions
 name-0.035
dot-0.029
ance-0.028
tie-0.028
ka-0.028
Token in
Token ID287
Feature activation+4.49e-02

Pos logit contributions
 neither+0.045
 mess+0.045
 wasteful+0.044
 overflow+0.042
 structures+0.042
Neg logit contributions
ka-0.027
 name-0.027
tie-0.027
 aboard-0.026
ance-0.024
Token the
Token ID262
Feature activation-4.82e-03

Pos logit contributions
 wasteful+0.089
 neither+0.088
 overflow+0.080
 forgot+0.080
keys+0.080
Neg logit contributions
 aboard-0.052
 outer-0.048
ka-0.046
 name-0.045
ance-0.043
Token old
Token ID1468
Feature activation+3.50e-02

Pos logit contributions
ka+0.008
tie+0.008
 name+0.008
ance+0.008
dot+0.008
Neg logit contributions
 neither-0.008
 mess-0.008
 wasteful-0.007
 overflow-0.006
 fighting-0.006
Token arm
Token ID3211
Feature activation+6.06e-02

Pos logit contributions
 neither+0.062
 wasteful+0.059
 overflow+0.054
 ruining+0.052
 mess+0.050
Neg logit contributions
 name-0.051
 gown-0.050
ka-0.049
tie-0.048
shirt-0.048
Tokenchair
Token ID16337
Feature activation+9.70e-02

Pos logit contributions
 neither+0.126
 wasteful+0.122
 overflow+0.115
 ruining+0.114
 mess+0.112
Neg logit contributions
tie-0.066
shirt-0.066
ance-0.065
 gown-0.062
anger-0.062
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
 wasteful+0.198
 structures+0.188
 anth+0.188
 neither+0.187
 overflow+0.186
Neg logit contributions
tie-0.094
 aboard-0.087
ance-0.083
 gown-0.080
ka-0.078
Token It
Token ID632
Feature activation+2.42e-02

Pos logit contributions
 mess+0.021
 neither+0.020
 wasteful+0.019
 overflow+0.018
 ruining+0.018
Neg logit contributions
 name-0.023
tie-0.022
ka-0.022
dot-0.021
ance-0.021
Token is
Token ID318
Feature activation+3.42e-02

Pos logit contributions
 neither+0.048
 wasteful+0.046
 mess+0.045
 roughly+0.043
 overflow+0.043
Neg logit contributions
 name-0.029
ka-0.028
tie-0.026
 aboard-0.026
ance-0.026
Token wet
Token ID9583
Feature activation+5.28e-02

Pos logit contributions
 neither+0.048
 mess+0.042
 structures+0.042
 overflow+0.042
 wasteful+0.042
Neg logit contributions
 name-0.060
ka-0.055
tie-0.055
 aboard-0.054
dot-0.052
Token and
Token ID290
Feature activation+2.93e-02

Pos logit contributions
 neither+0.110
 wasteful+0.108
 anth+0.098
 overflow+0.098
 erupted+0.098
Neg logit contributions
tie-0.056
ka-0.054
 aboard-0.052
 name-0.048
ance-0.047
Token or
Token ID393
Feature activation+4.69e-02

Pos logit contributions
 wasteful+0.108
 neither+0.106
 mess+0.102
 overflow+0.095
 mosquitoes+0.093
Neg logit contributions
 name-0.099
 aboard-0.098
tie-0.093
ance-0.092
 powers-0.085
Token hurt
Token ID5938
Feature activation+5.96e-02

Pos logit contributions
 mess+0.070
 neither+0.067
 wasteful+0.064
 roughly+0.060
 structures+0.058
Neg logit contributions
 name-0.080
 aboard-0.074
tie-0.073
ance-0.070
leaf-0.069
Token anyone
Token ID2687
Feature activation+5.81e-02

Pos logit contributions
 mess+0.097
 wasteful+0.093
 neither+0.088
 overflow+0.085
 roughly+0.083
Neg logit contributions
 aboard-0.089
 name-0.089
ance-0.087
ka-0.084
tie-0.082
Token in
Token ID287
Feature activation+3.84e-02

Pos logit contributions
 neither+0.094
 wasteful+0.093
 mess+0.088
 overflow+0.087
 anth+0.084
Neg logit contributions
 aboard-0.087
tie-0.085
 name-0.085
 powers-0.078
ance-0.076
Token the
Token ID262
Feature activation+2.51e-02

Pos logit contributions
 neither+0.063
 wasteful+0.063
 mess+0.061
 overflow+0.056
 pies+0.055
Neg logit contributions
 aboard-0.057
 name-0.056
 outer-0.054
 Magic-0.052
ka-0.052
Token village
Token ID7404
Feature activation+9.65e-02

Pos logit contributions
 neither+0.044
 mess+0.038
 wasteful+0.038
 overflow+0.036
 ruining+0.035
Neg logit contributions
 name-0.040
ka-0.036
tie-0.036
 Magic-0.036
 Crown-0.036
Token.
Token ID13
Feature activation+3.63e-02

Pos logit contributions
 neither+0.168
 wasteful+0.164
 overflow+0.160
 mosquitoes+0.156
 mess+0.152
Neg logit contributions
 aboard-0.123
 name-0.115
 powers-0.111
tie-0.110
ka-0.109
Token So
Token ID1406
Feature activation+3.13e-02

Pos logit contributions
 mess+0.071
 neither+0.065
 wasteful+0.064
 overflow+0.061
 mosquitoes+0.060
Neg logit contributions
 name-0.054
 Magic-0.047
 aboard-0.047
tie-0.046
ance-0.046
Token,
Token ID11
Feature activation+4.13e-02

Pos logit contributions
 mess+0.049
 wasteful+0.048
 neither+0.046
 mosquitoes+0.043
 overflow+0.043
Neg logit contributions
 name-0.053
 aboard-0.051
ance-0.050
ka-0.048
 Magic-0.048
Token the
Token ID262
Feature activation+2.46e-02

Pos logit contributions
 mess+0.056
 wasteful+0.052
 neither+0.045
 roughly+0.044
 overflow+0.044
Neg logit contributions
 name-0.084
 aboard-0.082
ance-0.077
 Magic-0.075
tie-0.074
Token villagers
Token ID25263
Feature activation+5.22e-03

Pos logit contributions
 neither+0.040
 wasteful+0.036
 mess+0.035
 roughly+0.033
 overflow+0.033
Neg logit contributions
 name-0.043
ka-0.039
 Crown-0.038
 aboard-0.038
tie-0.037
Token children
Token ID1751
Feature activation+4.81e-02

Pos logit contributions
 neither+0.038
 wasteful+0.038
 overflow+0.037
 mess+0.036
 roughly+0.035
Neg logit contributions
 name-0.053
ka-0.053
ance-0.051
 aboard-0.050
 powers-0.049
Token!
Token ID0
Feature activation+7.61e-03

Pos logit contributions
 overflow+0.093
 neither+0.088
 wasteful+0.087
 structures+0.084
 anth+0.083
Neg logit contributions
 name-0.053
ka-0.053
tie-0.052
 aboard-0.050
ocket-0.048
Token You
Token ID921
Feature activation+3.07e-02

Pos logit contributions
 neither+0.014
 mess+0.014
 wasteful+0.013
 overflow+0.012
 roughly+0.012
Neg logit contributions
 name-0.011
ka-0.011
ance-0.010
tie-0.010
dot-0.010
Token ruined
Token ID20484
Feature activation+6.86e-02

Pos logit contributions
 neither+0.052
 mess+0.051
 wasteful+0.048
 overflow+0.048
 structures+0.047
Neg logit contributions
 name-0.044
leaf-0.042
ka-0.042
ance-0.041
tie-0.041
Token the
Token ID262
Feature activation+3.08e-02

Pos logit contributions
 overflow+0.128
 neither+0.127
 forgot+0.122
 erupted+0.122
Uh+0.121
Neg logit contributions
 name-0.075
ka-0.072
 aboard-0.071
ance-0.066
 Magic-0.065
Token cake
Token ID12187
Feature activation+9.64e-02

Pos logit contributions
 neither+0.045
 wasteful+0.039
 roughly+0.037
 overflow+0.036
 ruining+0.034
Neg logit contributions
 name-0.059
ka-0.054
tie-0.052
 aboard-0.052
 Crown-0.051
Token!
Token ID0
Feature activation+5.33e-03

Pos logit contributions
 wasteful+0.212
 neither+0.209
Uh+0.198
 overflow+0.196
 mosquitoes+0.195
Neg logit contributions
 name-0.070
tie-0.064
 aboard-0.063
ance-0.058
anger-0.057
Token You
Token ID921
Feature activation+3.36e-02

Pos logit contributions
 neither+0.010
 mess+0.010
 wasteful+0.009
 overflow+0.009
 roughly+0.008
Neg logit contributions
 name-0.007
ka-0.007
tie-0.007
ance-0.007
dot-0.007
Token have
Token ID423
Feature activation+2.68e-02

Pos logit contributions
 neither+0.057
 mess+0.055
 wasteful+0.053
 structures+0.052
 overflow+0.051
Neg logit contributions
 name-0.048
leaf-0.046
ka-0.046
tie-0.046
ance-0.045
Token to
Token ID284
Feature activation+2.68e-02

Pos logit contributions
 neither+0.052
 mess+0.052
 overflow+0.051
 wasteful+0.049
 roughly+0.049
Neg logit contributions
 name-0.029
tie-0.029
ka-0.029
ance-0.028
leaf-0.028
Token say
Token ID910
Feature activation+2.40e-02

Pos logit contributions
 neither+0.048
 wasteful+0.042
 mess+0.042
 rats+0.041
 snakes+0.041
Neg logit contributions
 name-0.041
ance-0.037
ka-0.036
ad-0.035
tie-0.034
TokenNo
Token ID2949
Feature activation+2.60e-02

Pos logit contributions
 mess+0.039
 neither+0.036
 wasteful+0.036
 structures+0.034
 ruining+0.033
Neg logit contributions
leaf-0.033
tie-0.033
 name-0.032
shirt-0.032
ka-0.031
Token,
Token ID11
Feature activation+3.92e-02

Pos logit contributions
 wasteful+0.051
 neither+0.049
 overflow+0.048
 mess+0.047
 roughly+0.047
Neg logit contributions
 name-0.032
 powers-0.029
ance-0.029
ka-0.029
 aboard-0.028
Token we
Token ID356
Feature activation+1.88e-02

Pos logit contributions
 mess+0.052
 wasteful+0.051
 roughly+0.051
oving+0.049
 structures+0.048
Neg logit contributions
 name-0.068
 aboard-0.061
ocket-0.061
ance-0.060
 hello-0.060
Token cannot
Token ID2314
Feature activation-2.08e-03

Pos logit contributions
 neither+0.040
 mess+0.040
 wasteful+0.039
 structures+0.037
 overflow+0.037
Neg logit contributions
 name-0.021
tie-0.018
ka-0.017
dot-0.016
ance-0.016
Token go
Token ID467
Feature activation+2.09e-02

Pos logit contributions
ka+0.003
tie+0.003
 name+0.003
ance+0.003
dot+0.003
Neg logit contributions
 neither-0.003
 mess-0.003
 wasteful-0.003
 overflow-0.003
 fighting-0.003
Token outside
Token ID2354
Feature activation+9.63e-02

Pos logit contributions
 wasteful+0.044
 neither+0.044
 mess+0.043
 roughly+0.039
 mosquitoes+0.039
Neg logit contributions
 aboard-0.023
 name-0.023
ka-0.022
tie-0.021
dot-0.020
Token,"
Token ID553
Feature activation+2.80e-02

Pos logit contributions
 wasteful+0.205
 neither+0.187
Uh+0.178
 mess+0.175
 spitting+0.175
Neg logit contributions
 aboard-0.104
 name-0.089
 ti-0.080
 gown-0.078
ka-0.078
Token Mia
Token ID32189
Feature activation+2.14e-02

Pos logit contributions
 mess+0.037
 wasteful+0.035
 neither+0.035
 ruining+0.033
oving+0.031
Neg logit contributions
 name-0.055
ka-0.048
 aboard-0.048
 Magic-0.047
dot-0.047
Token says
Token ID1139
Feature activation-8.57e-03

Pos logit contributions
 neither+0.036
 wasteful+0.035
 mess+0.034
 overflow+0.033
 ruining+0.032
Neg logit contributions
 name-0.034
ka-0.031
tie-0.031
ance-0.030
 aboard-0.029
Token.
Token ID13
Feature activation+1.39e-02

Pos logit contributions
ka+0.014
tie+0.014
 name+0.014
ance+0.014
dot+0.014
Neg logit contributions
 neither-0.014
 mess-0.013
 wasteful-0.012
 fighting-0.012
 overflow-0.011
Token
Token ID198
Feature activation+9.76e-03

Pos logit contributions
 mess+0.027
 neither+0.026
 wasteful+0.026
 overflow+0.024
 ruining+0.023
Neg logit contributions
 name-0.019
tie-0.018
 aboard-0.017
dot-0.017
ance-0.017
Token to
Token ID284
Feature activation+4.44e-02

Pos logit contributions
 overflow+0.085
 wasteful+0.084
 mess+0.084
 neither+0.083
 timer+0.082
Neg logit contributions
 name-0.025
 aboard-0.025
tie-0.023
ance-0.023
ka-0.020
Token throw
Token ID3714
Feature activation+2.45e-02

Pos logit contributions
 wasteful+0.072
 neither+0.071
 unfortunately+0.069
 timer+0.066
 mess+0.066
Neg logit contributions
 name-0.067
ka-0.058
yo-0.057
tie-0.057
ad-0.056
Token it
Token ID340
Feature activation+5.90e-02

Pos logit contributions
 neither+0.041
 wasteful+0.039
 mess+0.038
 overflow+0.038
 ruining+0.035
Neg logit contributions
ka-0.037
 name-0.036
tie-0.036
 aboard-0.035
ance-0.034
Token over
Token ID625
Feature activation+3.78e-02

Pos logit contributions
 wasteful+0.117
 neither+0.113
 overflow+0.108
 mess+0.106
 wasting+0.106
Neg logit contributions
tie-0.068
ka-0.062
 aboard-0.062
 name-0.059
ance-0.057
Token the
Token ID262
Feature activation+5.49e-02

Pos logit contributions
 mess+0.067
 neither+0.067
 wasteful+0.066
 overflow+0.066
 ruining+0.058
Neg logit contributions
ka-0.049
 aboard-0.049
tie-0.047
ance-0.046
 name-0.045
Token fence
Token ID13990
Feature activation+9.62e-02

Pos logit contributions
 neither+0.097
 wasteful+0.095
oving+0.084
 ruining+0.084
 overflow+0.084
Neg logit contributions
 name-0.082
tie-0.070
 aboard-0.066
yo-0.065
orn-0.063
Token."
Token ID526
Feature activation+2.22e-02

Pos logit contributions
 wasteful+0.222
 neither+0.212
 overflow+0.209
 ruining+0.204
 mess+0.203
Neg logit contributions
tie-0.075
 aboard-0.066
ka-0.059
shirt-0.056
ance-0.056
Token Lily
Token ID20037
Feature activation+3.40e-02

Pos logit contributions
 neither+0.029
 mess+0.028
 overflow+0.027
 wasteful+0.026
 ruining+0.026
Neg logit contributions
 name-0.044
dot-0.039
ka-0.039
tie-0.038
 Magic-0.038
Token says
Token ID1139
Feature activation-1.36e-02

Pos logit contributions
 neither+0.064
 overflow+0.063
 wasteful+0.063
 roughly+0.059
 mess+0.058
Neg logit contributions
 name-0.046
ka-0.040
leaf-0.038
 hugs-0.038
ance-0.037
Token.
Token ID13
Feature activation+8.13e-03

Pos logit contributions
ka+0.024
tie+0.023
 name+0.023
ance+0.023
shirt+0.023
Neg logit contributions
 neither-0.021
 mess-0.020
 wasteful-0.019
 fighting-0.018
 roughly-0.017
Token She
Token ID1375
Feature activation-3.53e-02

Pos logit contributions
 mess+0.015
 neither+0.015
 wasteful+0.014
 overflow+0.013
 ruining+0.013
Neg logit contributions
 name-0.012
tie-0.011
ka-0.011
dot-0.011
ance-0.011
Token should
Token ID815
Feature activation+2.65e-02

Pos logit contributions
 mess+0.043
 neither+0.043
 wasteful+0.041
 overflow+0.039
 structures+0.039
Neg logit contributions
 name-0.027
tie-0.025
ance-0.024
ka-0.024
leaf-0.023
Token not
Token ID407
Feature activation+4.92e-02

Pos logit contributions
 neither+0.044
 mess+0.041
 wasteful+0.041
 rats+0.038
 structures+0.038
Neg logit contributions
 name-0.044
tie-0.038
ance-0.037
ka-0.037
dot-0.036
Token fight
Token ID1907
Feature activation+4.61e-02

Pos logit contributions
 neither+0.094
 wasteful+0.091
 mess+0.085
 mosquitoes+0.084
 structures+0.083
Neg logit contributions
 name-0.072
tie-0.059
ad-0.055
orn-0.054
shirt-0.053
Token over
Token ID625
Feature activation+5.59e-02

Pos logit contributions
 overflow+0.084
 neither+0.084
 wasteful+0.082
 mess+0.082
 flood+0.080
Neg logit contributions
 name-0.058
 aboard-0.053
ka-0.051
tie-0.051
ance-0.050
Token the
Token ID262
Feature activation+4.38e-02

Pos logit contributions
 overflow+0.101
 neither+0.099
 mess+0.098
 wasteful+0.096
 erupted+0.092
Neg logit contributions
 name-0.065
 aboard-0.063
tie-0.060
 Crown-0.059
 powers-0.059
Token banana
Token ID25996
Feature activation+9.61e-02

Pos logit contributions
 neither+0.081
 wasteful+0.072
 overflow+0.071
 roughly+0.068
oving+0.065
Neg logit contributions
 name-0.067
 Crown-0.060
tie-0.057
orn-0.057
ka-0.056
Token.
Token ID13
Feature activation+4.26e-03

Pos logit contributions
 wasteful+0.190
 sudden+0.188
 overflow+0.187
 neither+0.182
Uh+0.176
Neg logit contributions
anger-0.087
orn-0.086
 name-0.085
ance-0.085
tie-0.084
Token You
Token ID921
Feature activation+2.91e-02

Pos logit contributions
 mess+0.008
 neither+0.008
 wasteful+0.008
 overflow+0.007
 ruining+0.007
Neg logit contributions
 name-0.005
ka-0.005
tie-0.005
ance-0.005
dot-0.005
Token should
Token ID815
Feature activation+2.80e-02

Pos logit contributions
 mess+0.058
 neither+0.056
 wasteful+0.055
 overflow+0.053
 structures+0.052
Neg logit contributions
 name-0.037
tie-0.035
ance-0.033
ka-0.032
leaf-0.032
Token share
Token ID2648
Feature activation+4.90e-02

Pos logit contributions
 neither+0.048
 wasteful+0.045
 mess+0.044
 rats+0.042
 overflow+0.041
Neg logit contributions
 name-0.045
tie-0.039
ance-0.038
ka-0.038
dot-0.037
Token and
Token ID290
Feature activation+4.95e-02

Pos logit contributions
 neither+0.101
 overflow+0.099
 wasteful+0.094
 flood+0.094
 erupted+0.090
Neg logit contributions
 name-0.058
tie-0.051
 aboard-0.048
 Crown-0.047
ocket-0.046
Token it
Token ID340
Feature activation-4.02e-02

Pos logit contributions
ka+0.069
leaf+0.065
tie+0.065
anger+0.065
ance+0.065
Neg logit contributions
 mess-0.071
 neither-0.071
 wasteful-0.060
 overflow-0.059
 floor-0.059
Token would
Token ID561
Feature activation-3.59e-02

Pos logit contributions
leaf+0.074
ka+0.072
tie+0.069
dot+0.068
 aboard+0.068
Neg logit contributions
 neither-0.062
 mess-0.061
 floor-0.052
 wasteful-0.050
 overflow-0.050
Token give
Token ID1577
Feature activation-5.06e-03

Pos logit contributions
ka+0.055
ance+0.053
leaf+0.053
dot+0.052
 aboard+0.051
Neg logit contributions
 mess-0.065
 neither-0.061
 overflow-0.057
 wasteful-0.055
 not-0.054
Token her
Token ID607
Feature activation-8.14e-02

Pos logit contributions
ka+0.007
tie+0.007
 name+0.007
ance+0.007
dot+0.007
Neg logit contributions
 neither-0.009
 mess-0.009
 wasteful-0.008
 overflow-0.008
 fighting-0.008
Token a
Token ID257
Feature activation-9.91e-02

Pos logit contributions
orn+0.159
leaf+0.158
anced+0.156
reetings+0.153
lynn+0.151
Neg logit contributions
 mess-0.109
 neither-0.098
 not-0.091
 both-0.088
 messy-0.077
Token special
Token ID2041
Feature activation-1.25e-01

Pos logit contributions
ance+0.241
anced+0.240
ully+0.207
lynn+0.206
umper+0.204
Neg logit contributions
 mess-0.134
 messy-0.088
 not-0.086
 bit-0.086
 fuss-0.082
Token power
Token ID1176
Feature activation-5.96e-02

Pos logit contributions
ance+0.249
anced+0.245
umper+0.242
igh+0.225
aved+0.224
Neg logit contributions
 mess-0.203
 floor-0.151
 mistake-0.146
 game-0.144
 noise-0.143
Token.
Token ID13
Feature activation-4.35e-03

Pos logit contributions
ka+0.119
anced+0.114
leaf+0.113
dot+0.112
ance+0.110
Neg logit contributions
 mess-0.081
 neither-0.074
 floor-0.064
 wasteful-0.061
 overflow-0.060
Token
Token ID198
Feature activation-9.43e-03

Pos logit contributions
ka+0.007
tie+0.007
 name+0.007
ance+0.007
dot+0.007
Neg logit contributions
 neither-0.007
 mess-0.007
 wasteful-0.007
 overflow-0.006
 fighting-0.006
Token
Token ID198
Feature activation+1.48e-02

Pos logit contributions
ka+0.016
tie+0.015
ance+0.015
 name+0.015
dot+0.015
Neg logit contributions
 neither-0.015
 mess-0.014
 wasteful-0.013
 overflow-0.012
 fighting-0.012
TokenOne
Token ID3198
Feature activation-1.82e-02

Pos logit contributions
 neither+0.024
 mess+0.024
 wasteful+0.023
 ruining+0.021
 overflow+0.021
Neg logit contributions
tie-0.025
 name-0.024
dot-0.023
ka-0.023
shirt-0.023
Token parents
Token ID3397
Feature activation-3.50e-02

Pos logit contributions
tie+0.082
ance+0.081
orn+0.081
ka+0.080
anced+0.079
Neg logit contributions
 neither-0.034
 mess-0.033
 fighting-0.027
 quarrel-0.025
 wasteful-0.024
Token gave
Token ID2921
Feature activation-3.27e-02

Pos logit contributions
ka+0.059
dot+0.058
tie+0.058
 powers+0.056
ance+0.056
Neg logit contributions
 mess-0.053
 neither-0.053
 fighting-0.048
 wasteful-0.046
 overflow-0.044
Token him
Token ID683
Feature activation-5.81e-02

Pos logit contributions
tie+0.054
anger+0.053
dot+0.053
ka+0.052
 amulet+0.052
Neg logit contributions
 neither-0.050
 mess-0.049
 fighting-0.045
 wasteful-0.043
 mice-0.041
Token a
Token ID257
Feature activation-6.88e-02

Pos logit contributions
 amulet+0.104
ka+0.103
ocket+0.103
tie+0.102
anced+0.102
Neg logit contributions
 neither-0.081
 mess-0.071
 wasteful-0.068
 both-0.062
 mice-0.061
Token pe
Token ID613
Feature activation-9.53e-02

Pos logit contributions
ance+0.146
anced+0.141
ka+0.128
dot+0.124
tie+0.123
Neg logit contributions
 mess-0.099
 neither-0.078
 quarrel-0.076
 flood-0.072
 wasteful-0.070
Tokenb
Token ID65
Feature activation-1.24e-01

Pos logit contributions
anced+0.155
 Magic+0.139
icol+0.139
ance+0.139
leaf+0.136
Neg logit contributions
 mess-0.140
 neither-0.134
 wasteful-0.123
 not-0.123
 pies-0.113
Tokenble
Token ID903
Feature activation-4.35e-02

Pos logit contributions
leaf+0.281
 outer+0.271
Ta+0.265
shirt+0.258
chery+0.256
Neg logit contributions
 neither-0.106
 not-0.091
 both-0.061
 mess-0.057
 pies-0.057
Token.
Token ID13
Feature activation-3.72e-02

Pos logit contributions
ka+0.079
dot+0.076
anced+0.075
leaf+0.075
ance+0.075
Neg logit contributions
 neither-0.062
 mess-0.059
 wasteful-0.053
 fighting-0.050
 roughly-0.050
Token He
Token ID679
Feature activation-3.16e-02

Pos logit contributions
ka+0.070
ocket+0.067
 powers+0.067
orn+0.066
ance+0.065
Neg logit contributions
 neither-0.053
 mess-0.047
 wasteful-0.044
 fighting-0.043
 roughly-0.042
Token had
Token ID550
Feature activation-2.97e-02

Pos logit contributions
ance+0.054
ka+0.054
tie+0.054
dot+0.053
 aboard+0.052
Neg logit contributions
 mess-0.047
 neither-0.046
 wasteful-0.041
 fighting-0.038
 roughly-0.038
Token never
Token ID1239
Feature activation-1.72e-02

Pos logit contributions
ka+0.050
tie+0.048
ance+0.048
anger+0.047
 name+0.047
Neg logit contributions
 mess-0.045
 neither-0.045
 wasteful-0.039
 fighting-0.038
 floor-0.038
Token He
Token ID679
Feature activation-4.39e-03

Pos logit contributions
 mess+0.025
 neither+0.025
 wasteful+0.024
 overflow+0.022
 ruining+0.022
Neg logit contributions
 name-0.020
tie-0.019
ka-0.019
ance-0.019
shirt-0.019
Token gave
Token ID2921
Feature activation-1.16e-02

Pos logit contributions
ka+0.007
tie+0.007
 name+0.007
ance+0.007
dot+0.007
Neg logit contributions
 neither-0.007
 mess-0.007
 wasteful-0.006
 overflow-0.006
 fighting-0.006
Token him
Token ID683
Feature activation-5.09e-02

Pos logit contributions
tie+0.020
ka+0.020
 name+0.020
dot+0.020
ance+0.020
Neg logit contributions
 neither-0.017
 mess-0.016
 wasteful-0.015
 fighting-0.014
 rats-0.014
Token a
Token ID257
Feature activation-7.56e-02

Pos logit contributions
ka+0.095
anger+0.094
anced+0.092
dot+0.092
tie+0.091
Neg logit contributions
 neither-0.070
 mess-0.062
 wasteful-0.060
 mice-0.057
 rats-0.057
Token beautiful
Token ID4950
Feature activation-9.76e-02

Pos logit contributions
ance+0.163
anced+0.155
ka+0.143
dot+0.140
ancing+0.132
Neg logit contributions
 mess-0.111
 wasteful-0.082
 fuss-0.079
 neither-0.077
 overflow-0.075
Token golden
Token ID10861
Feature activation-1.22e-01

Pos logit contributions
ance+0.178
anced+0.165
aved+0.153
leaf+0.149
dot+0.147
Neg logit contributions
 mess-0.176
 floor-0.137
 mistake-0.131
 table-0.131
 pie-0.128
Token sword
Token ID8429
Feature activation-4.89e-02

Pos logit contributions
ance+0.220
anced+0.210
leaf+0.196
dot+0.189
aved+0.188
Neg logit contributions
 mess-0.217
 floor-0.176
 mistake-0.161
 table-0.159
 pie-0.158
Token.
Token ID13
Feature activation-3.13e-03

Pos logit contributions
ka+0.096
leaf+0.092
dot+0.091
ance+0.090
anced+0.089
Neg logit contributions
 neither-0.063
 mess-0.061
 fighting-0.051
 roughly-0.050
 wasteful-0.049
Token
Token ID198
Feature activation-4.82e-03

Pos logit contributions
ka+0.005
tie+0.005
 name+0.005
ance+0.005
dot+0.005
Neg logit contributions
 neither-0.005
 mess-0.005
 wasteful-0.005
 overflow-0.004
 fighting-0.004
Token
Token ID198
Feature activation+2.10e-02

Pos logit contributions
ka+0.008
tie+0.008
 name+0.008
ance+0.007
dot+0.007
Neg logit contributions
 neither-0.008
 mess-0.008
 wasteful-0.007
 overflow-0.007
 fighting-0.007
TokenThe
Token ID464
Feature activation-2.72e-02

Pos logit contributions
 neither+0.033
 mess+0.033
 wasteful+0.030
 ruining+0.028
 fighting+0.028
Neg logit contributions
shirt-0.035
tie-0.035
leaf-0.035
ka-0.034
yo-0.034
Token and
Token ID290
Feature activation-3.75e-02

Pos logit contributions
anced+0.070
tie+0.069
leaf+0.068
ka+0.068
ance+0.068
Neg logit contributions
 mess-0.051
 neither-0.042
 structures-0.041
 floor-0.040
 snakes-0.040
Token always
Token ID1464
Feature activation-4.58e-02

Pos logit contributions
ka+0.074
 aboard+0.074
anger+0.073
tie+0.072
ance+0.071
Neg logit contributions
 neither-0.048
 mess-0.047
 not-0.042
 wasteful-0.041
 floor-0.039
Token wore
Token ID12408
Feature activation-6.75e-02

Pos logit contributions
anger+0.090
leaf+0.085
tie+0.084
ance+0.084
 aboard+0.082
Neg logit contributions
 neither-0.064
 mess-0.061
 fighting-0.056
 forgot-0.055
 ruining-0.055
Token a
Token ID257
Feature activation-7.09e-02

Pos logit contributions
anced+0.150
ane+0.146
ance+0.144
anger+0.142
oured+0.142
Neg logit contributions
 mess-0.074
 not-0.063
 both-0.062
 mice-0.057
 structures-0.053
Token spark
Token ID9009
Feature activation-5.60e-02

Pos logit contributions
ance+0.180
anced+0.171
leaf+0.149
 aboard+0.146
ancing+0.142
Neg logit contributions
 mess-0.102
 floor-0.068
 messy-0.068
 not-0.064
 roughly-0.061
Tokenly
Token ID306
Feature activation-1.22e-01

Pos logit contributions
anced+0.103
ance+0.096
leaf+0.094
dot+0.092
 aboard+0.091
Neg logit contributions
 mess-0.088
 neither-0.079
 floor-0.070
 anymore-0.069
 wasteful-0.068
Token dress
Token ID6576
Feature activation-5.69e-02

Pos logit contributions
ance+0.239
leaf+0.226
anced+0.225
iddled+0.204
ancing+0.193
Neg logit contributions
 mess-0.205
 floor-0.191
 bit-0.159
 table-0.154
 while-0.153
Token.
Token ID13
Feature activation-1.66e-02

Pos logit contributions
leaf+0.121
anced+0.120
ka+0.115
dot+0.115
 amulet+0.114
Neg logit contributions
 neither-0.064
 mess-0.060
 floor-0.052
 not-0.052
 anymore-0.051
Token One
Token ID1881
Feature activation-3.63e-02

Pos logit contributions
ka+0.033
tie+0.032
leaf+0.032
ance+0.032
 aboard+0.031
Neg logit contributions
 neither-0.021
 mess-0.020
 wasteful-0.018
 fighting-0.017
 roughly-0.017
Token day
Token ID1110
Feature activation-1.50e-02

Pos logit contributions
anced+0.082
ance+0.081
ka+0.079
tie+0.079
leaf+0.078
Neg logit contributions
 mess-0.037
 neither-0.035
 wasteful-0.028
 floor-0.028
 roughly-0.027
Token,
Token ID11
Feature activation+5.44e-03

Pos logit contributions
tie+0.030
leaf+0.029
ka+0.029
ance+0.029
shirt+0.029
Neg logit contributions
 neither-0.020
 mess-0.018
 wasteful-0.016
 fighting-0.016
 roughly-0.015
Token Pixie
Token ID46687
Feature activation-3.44e-02

Pos logit contributions
 neither+0.005
 mess+0.005
 wasteful+0.004
 overflow+0.004
 ruining+0.004
Neg logit contributions
 name-0.004
ka-0.004
tie-0.004
ance-0.004
dot-0.004
Token gave
Token ID2921
Feature activation-3.87e-02

Pos logit contributions
ka+0.059
tie+0.059
dot+0.058
shirt+0.056
leaf+0.056
Neg logit contributions
 mess-0.051
 neither-0.051
 wasteful-0.045
 roughly-0.041
 fighting-0.041
Token her
Token ID607
Feature activation-6.25e-02

Pos logit contributions
dot+0.073
ka+0.072
tie+0.072
anger+0.070
shirt+0.070
Neg logit contributions
 neither-0.053
 mess-0.046
 Amy-0.043
 fighting-0.041
 mice-0.040
Token a
Token ID257
Feature activation-8.17e-02

Pos logit contributions
anced+0.119
tie+0.118
dot+0.116
orn+0.114
ka+0.113
Neg logit contributions
 mess-0.084
 neither-0.081
 not-0.070
 both-0.066
 floor-0.063
Token spark
Token ID9009
Feature activation-7.18e-02

Pos logit contributions
ance+0.192
anced+0.190
dot+0.167
ka+0.165
ancing+0.163
Neg logit contributions
 mess-0.105
 bit-0.070
 not-0.070
 neither-0.068
 million-0.066
Tokenly
Token ID306
Feature activation-1.22e-01

Pos logit contributions
dot+0.131
anced+0.130
lynn+0.125
 Crown+0.124
leaf+0.124
Neg logit contributions
 mess-0.098
 neither-0.090
 worse-0.071
 anymore-0.071
 floor-0.071
Token wand
Token ID11569
Feature activation-5.98e-02

Pos logit contributions
ance+0.206
anced+0.200
leaf+0.187
lynn+0.186
ully+0.181
Neg logit contributions
 mess-0.207
 floor-0.170
 bit-0.165
 longer-0.153
 noise-0.152
Token,
Token ID11
Feature activation-3.18e-02

Pos logit contributions
ka+0.121
dot+0.118
ance+0.116
yo+0.116
leaf+0.115
Neg logit contributions
 neither-0.074
 mess-0.068
 fighting-0.056
 roughly-0.056
 wasteful-0.054
Token which
Token ID543
Feature activation-2.34e-02

Pos logit contributions
ka+0.062
ance+0.060
dot+0.059
anger+0.058
tie+0.058
Neg logit contributions
 neither-0.042
 mess-0.042
 wasteful-0.036
 fighting-0.034
 ruining-0.033
Token she
Token ID673
Feature activation-4.53e-02

Pos logit contributions
ka+0.041
dot+0.040
tie+0.040
shirt+0.039
ance+0.038
Neg logit contributions
 neither-0.037
 mess-0.035
 wasteful-0.032
 fighting-0.029
 overflow-0.029
Token could
Token ID714
Feature activation-3.44e-02

Pos logit contributions
dot+0.080
ance+0.079
ka+0.079
anger+0.079
 aboard+0.078
Neg logit contributions
 mess-0.062
 neither-0.061
 forgot-0.055
 wasteful-0.053
 roughly-0.051
Token bedroom
Token ID14043
Feature activation+2.73e-02

Pos logit contributions
leaf+0.173
ance+0.157
reetings+0.155
tie+0.154
 aboard+0.150
Neg logit contributions
 mess-0.110
 floor-0.101
 messy-0.086
 beds-0.084
 mouths-0.080
Token and
Token ID290
Feature activation-2.57e-02

Pos logit contributions
 neither+0.049
 wasteful+0.048
 mess+0.047
 overflow+0.044
 mosquitoes+0.043
Neg logit contributions
 name-0.038
tie-0.036
ance-0.035
 gown-0.035
ka-0.035
Token put
Token ID1234
Feature activation-8.18e-02

Pos logit contributions
ka+0.040
ance+0.038
tie+0.038
anced+0.037
dot+0.037
Neg logit contributions
 neither-0.044
 mess-0.044
 wasteful-0.039
 fighting-0.037
 roughly-0.037
Token on
Token ID319
Feature activation-6.26e-02

Pos logit contributions
reetings+0.207
leaf+0.201
chery+0.196
anced+0.196
found+0.196
Neg logit contributions
 both-0.065
 mess-0.055
 neither-0.053
 Lily-0.052
 they-0.046
Token the
Token ID262
Feature activation-1.07e-01

Pos logit contributions
reetings+0.150
leaf+0.144
orn+0.143
ane+0.143
tie+0.140
Neg logit contributions
 mess-0.063
 both-0.059
 not-0.052
 floor-0.049
 neither-0.046
Token pretty
Token ID2495
Feature activation-1.21e-01

Pos logit contributions
anced+0.216
ully+0.215
enna+0.214
igh+0.211
ance+0.207
Neg logit contributions
 mess-0.157
 floor-0.156
 noise-0.120
 ground-0.119
 floors-0.117
Token buckle
Token ID44983
Feature activation-2.58e-02

Pos logit contributions
leaf+0.247
year+0.243
ance+0.231
anced+0.229
uttered+0.225
Neg logit contributions
 mess-0.186
 floor-0.174
 floors-0.131
 table-0.131
 windows-0.123
Token.
Token ID13
Feature activation-2.50e-02

Pos logit contributions
ka+0.043
leaf+0.041
dot+0.041
 name+0.041
ance+0.041
Neg logit contributions
 neither-0.041
 mess-0.039
 wasteful-0.036
 roughly-0.034
 fighting-0.034
Token
Token ID220
Feature activation-1.93e-02

Pos logit contributions
ka+0.048
tie+0.046
leaf+0.046
ance+0.046
 aboard+0.045
Neg logit contributions
 neither-0.033
 mess-0.032
 wasteful-0.028
 fighting-0.027
 roughly-0.027
Token
Token ID198
Feature activation-7.38e-03

Pos logit contributions
ka+0.035
tie+0.034
dot+0.033
ance+0.033
leaf+0.033
Neg logit contributions
 neither-0.028
 mess-0.027
 wasteful-0.025
 fighting-0.023
 roughly-0.023
Token
Token ID198
Feature activation+8.17e-03

Pos logit contributions
ka+0.012
tie+0.012
 name+0.012
ance+0.012
dot+0.011
Neg logit contributions
 neither-0.012
 mess-0.011
 wasteful-0.011
 overflow-0.010
 fighting-0.010
Token she
Token ID673
Feature activation-2.77e-02

Pos logit contributions
 neither+0.005
 mess+0.005
 wasteful+0.005
 overflow+0.005
 fighting+0.004
Neg logit contributions
ka-0.006
 name-0.006
tie-0.006
ance-0.006
dot-0.006
Token liked
Token ID8288
Feature activation+4.75e-04

Pos logit contributions
ka+0.041
tie+0.041
ance+0.040
 name+0.040
 aboard+0.040
Neg logit contributions
 mess-0.047
 neither-0.046
 wasteful-0.043
 roughly-0.040
 overflow-0.039
Token to
Token ID284
Feature activation-3.28e-02

Pos logit contributions
 neither+0.001
 mess+0.001
 wasteful+0.001
 overflow+0.001
 fighting+0.001
Neg logit contributions
ka-0.001
tie-0.001
 name-0.001
ance-0.001
dot-0.001
Token wear
Token ID5806
Feature activation-7.33e-02

Pos logit contributions
ka+0.056
tie+0.053
 aboard+0.052
ance+0.052
leaf+0.051
Neg logit contributions
 mess-0.053
 neither-0.050
 overflow-0.045
 wasteful-0.045
 quarrel-0.045
Token her
Token ID607
Feature activation-8.63e-02

Pos logit contributions
anced+0.167
leaf+0.162
ance+0.160
reetings+0.158
int+0.156
Neg logit contributions
 mess-0.078
 both-0.064
 they-0.059
 not-0.058
 floor-0.056
Token special
Token ID2041
Feature activation-1.21e-01

Pos logit contributions
ance+0.189
year+0.183
leaf+0.180
anced+0.179
reetings+0.178
Neg logit contributions
 mess-0.104
 not-0.088
 floor-0.084
 neither-0.082
 messy-0.078
Token dress
Token ID6576
Feature activation-4.75e-02

Pos logit contributions
ance+0.223
anced+0.214
year+0.213
leaf+0.212
aved+0.206
Neg logit contributions
 mess-0.196
 floor-0.167
 quarrel-0.146
 table-0.145
 fight-0.136
Token.
Token ID13
Feature activation-2.49e-02

Pos logit contributions
leaf+0.098
dot+0.093
ka+0.092
anced+0.089
 aboard+0.089
Neg logit contributions
 neither-0.060
 mess-0.058
 floor-0.048
 roughly-0.046
 wasteful-0.045
Token It
Token ID632
Feature activation-1.75e-02

Pos logit contributions
ka+0.048
tie+0.046
leaf+0.046
ance+0.045
 aboard+0.045
Neg logit contributions
 neither-0.033
 mess-0.030
 wasteful-0.028
 fighting-0.026
 roughly-0.026
Token was
Token ID373
Feature activation-4.35e-02

Pos logit contributions
ka+0.029
tie+0.029
leaf+0.028
dot+0.028
ance+0.028
Neg logit contributions
 neither-0.028
 mess-0.027
 wasteful-0.025
 overflow-0.023
 fighting-0.022
Token purple
Token ID14032
Feature activation-3.22e-02

Pos logit contributions
ance+0.081
leaf+0.080
anger+0.079
ka+0.079
tie+0.079
Neg logit contributions
 mess-0.061
 neither-0.060
 wasteful-0.054
 not-0.053
 ruining-0.050
Token green
Token ID4077
Feature activation-4.90e-02

Pos logit contributions
ance+0.068
anced+0.065
ka+0.063
tie+0.061
 aboard+0.061
Neg logit contributions
 mess-0.061
 neither-0.054
 wasteful-0.052
 floor-0.050
 ruining-0.049
Token dress
Token ID6576
Feature activation-2.73e-02

Pos logit contributions
leaf+0.083
anced+0.081
ance+0.081
dot+0.079
tie+0.077
Neg logit contributions
 mess-0.081
 floor-0.073
 neither-0.068
 table-0.061
 overflow-0.061
Token with
Token ID351
Feature activation-5.72e-02

Pos logit contributions
ka+0.052
leaf+0.052
dot+0.051
anced+0.051
tie+0.050
Neg logit contributions
 neither-0.035
 mess-0.035
 wasteful-0.029
 floor-0.028
 overflow-0.028
Token a
Token ID257
Feature activation-4.86e-02

Pos logit contributions
int+0.097
orn+0.096
tie+0.095
anger+0.095
 aboard+0.095
Neg logit contributions
 mess-0.082
 neither-0.078
 floor-0.072
 mice-0.071
 rats-0.070
Token spark
Token ID9009
Feature activation-6.07e-02

Pos logit contributions
ance+0.085
anced+0.079
 aboard+0.074
leaf+0.074
ka+0.072
Neg logit contributions
 mess-0.088
 floor-0.074
 neither-0.072
 wasteful-0.067
 fighting-0.067
Tokenly
Token ID306
Feature activation-1.20e-01

Pos logit contributions
anced+0.111
ance+0.109
leaf+0.107
lynn+0.104
reetings+0.103
Neg logit contributions
 mess-0.092
 neither-0.075
 floor-0.075
 anymore-0.067
 fighting-0.064
Token belt
Token ID10999
Feature activation-6.40e-02

Pos logit contributions
ance+0.217
leaf+0.204
anced+0.197
year+0.187
found+0.184
Neg logit contributions
 mess-0.211
 floor-0.198
 table-0.160
 noise-0.152
 bit-0.143
Token.
Token ID13
Feature activation-3.61e-02

Pos logit contributions
ka+0.124
leaf+0.120
anced+0.120
Aid+0.119
ance+0.116
Neg logit contributions
 mess-0.081
 neither-0.081
 floor-0.073
 fighting-0.070
 not-0.067
Token She
Token ID1375
Feature activation-3.50e-02

Pos logit contributions
ka+0.072
leaf+0.072
tie+0.071
 powers+0.070
 amulet+0.069
Neg logit contributions
 mess-0.045
 neither-0.044
 floor-0.037
 wasteful-0.036
 roughly-0.036
Token couldn
Token ID3521
Feature activation-2.85e-02

Pos logit contributions
ance+0.062
 aboard+0.062
leaf+0.061
tie+0.061
anger+0.060
Neg logit contributions
 neither-0.050
 mess-0.049
 forgot-0.043
 floor-0.041
 wasteful-0.041
Token't
Token ID470
Feature activation-3.07e-02

Pos logit contributions
ka+0.067
tie+0.066
shirt+0.064
leaf+0.064
 name+0.064
Neg logit contributions
 neither-0.027
 mess-0.024
 wasteful-0.020
 overflow-0.019
 fighting-0.019
Token yellow
Token ID7872
Feature activation-7.16e-02

Pos logit contributions
ance+0.069
anced+0.067
ka+0.061
tie+0.061
 aboard+0.061
Neg logit contributions
 mess-0.054
 neither-0.047
 wasteful-0.045
 floor-0.042
 fighting-0.042
Token dress
Token ID6576
Feature activation-2.17e-02

Pos logit contributions
anced+0.114
leaf+0.112
ance+0.111
iddled+0.103
aved+0.101
Neg logit contributions
 mess-0.131
 floor-0.117
 neither-0.112
 table-0.105
 longer-0.101
Token with
Token ID351
Feature activation-6.14e-02

Pos logit contributions
ka+0.042
leaf+0.041
anced+0.041
dot+0.041
ance+0.040
Neg logit contributions
 neither-0.028
 mess-0.028
 wasteful-0.024
 overflow-0.022
 floor-0.022
Token a
Token ID257
Feature activation-5.43e-02

Pos logit contributions
int+0.113
anger+0.109
lynn+0.109
Aid+0.109
 aboard+0.108
Neg logit contributions
 mess-0.080
 neither-0.076
 mice-0.072
 floor-0.072
 both-0.070
Token spark
Token ID9009
Feature activation-4.80e-02

Pos logit contributions
ance+0.097
anced+0.092
leaf+0.086
 aboard+0.085
dot+0.082
Neg logit contributions
 mess-0.097
 floor-0.081
 neither-0.078
 wasteful-0.074
 fighting-0.073
Tokenly
Token ID306
Feature activation-1.20e-01

Pos logit contributions
anced+0.086
ance+0.083
leaf+0.082
dot+0.080
lynn+0.077
Neg logit contributions
 mess-0.075
 neither-0.067
 floor-0.062
 fighting-0.057
 wasteful-0.056
Token cape
Token ID36023
Feature activation-5.01e-02

Pos logit contributions
ance+0.214
leaf+0.208
anced+0.201
year+0.188
iddled+0.185
Neg logit contributions
 mess-0.212
 floor-0.203
 table-0.164
 noise-0.157
 ground-0.144
Token.
Token ID13
Feature activation+7.40e-03

Pos logit contributions
ka+0.099
leaf+0.096
Aid+0.096
dot+0.094
shirt+0.093
Neg logit contributions
 neither-0.065
 mess-0.062
 fighting-0.056
 floor-0.053
 ruining-0.052
Token
Token ID198
Feature activation-4.65e-03

Pos logit contributions
 neither+0.013
 mess+0.012
 wasteful+0.012
 overflow+0.011
 ruining+0.011
Neg logit contributions
 name-0.012
tie-0.011
ka-0.011
dot-0.011
shirt-0.011
Token
Token ID198
Feature activation+6.47e-03

Pos logit contributions
ka+0.007
tie+0.007
 name+0.007
ance+0.007
dot+0.007
Neg logit contributions
 neither-0.008
 mess-0.007
 wasteful-0.007
 overflow-0.006
 fighting-0.006
TokenOne
Token ID3198
Feature activation-2.24e-02

Pos logit contributions
 neither+0.010
 mess+0.009
 wasteful+0.009
 overflow+0.008
 ruining+0.008
Neg logit contributions
ka-0.011
tie-0.011
 name-0.011
dot-0.011
shirt-0.011
Token feeling
Token ID4203
Feature activation-3.17e-02

Pos logit contributions
 neither+0.013
 mess+0.013
 wasteful+0.012
 overflow+0.011
 mosquitoes+0.011
Neg logit contributions
 name-0.012
ka-0.012
tie-0.012
ance-0.012
dot-0.012
Token very
Token ID845
Feature activation-3.75e-03

Pos logit contributions
ka+0.069
anger+0.069
tie+0.069
anced+0.069
ance+0.068
Neg logit contributions
 neither-0.034
 mess-0.033
 wasteful-0.031
 roughly-0.026
 overflow-0.025
Token proud
Token ID6613
Feature activation-1.63e-02

Pos logit contributions
ka+0.007
tie+0.007
 name+0.007
ance+0.007
dot+0.006
Neg logit contributions
 neither-0.005
 mess-0.005
 wasteful-0.005
 overflow-0.004
 fighting-0.004
Token wearing
Token ID5762
Feature activation-5.93e-02

Pos logit contributions
ka+0.028
tie+0.028
 name+0.028
dot+0.027
ance+0.027
Neg logit contributions
 neither-0.025
 mess-0.024
 wasteful-0.022
 fighting-0.021
 roughly-0.020
Token her
Token ID607
Feature activation-8.40e-02

Pos logit contributions
anced+0.125
ka+0.122
leaf+0.121
dot+0.121
ance+0.121
Neg logit contributions
 mess-0.073
 not-0.059
 roughly-0.057
 neither-0.057
 floor-0.056
Token special
Token ID2041
Feature activation-1.20e-01

Pos logit contributions
ance+0.174
anced+0.165
leaf+0.156
dot+0.151
iddled+0.150
Neg logit contributions
 mess-0.125
 not-0.099
 floor-0.091
 quarrel-0.089
 neither-0.089
Token suit
Token ID6050
Feature activation-4.11e-02

Pos logit contributions
ance+0.211
anced+0.191
leaf+0.181
aved+0.171
tie+0.170
Neg logit contributions
 mess-0.234
 floor-0.185
 mistake-0.184
 fuss-0.176
 quarrel-0.173
Token.
Token ID13
Feature activation-3.26e-02

Pos logit contributions
ka+0.077
leaf+0.077
dot+0.076
 name+0.073
ance+0.073
Neg logit contributions
 neither-0.056
 mess-0.055
 wasteful-0.046
 roughly-0.045
 fighting-0.045
Token She
Token ID1375
Feature activation-3.15e-02

Pos logit contributions
ka+0.068
tie+0.065
 powers+0.063
leaf+0.063
shirt+0.062
Neg logit contributions
 neither-0.039
 mess-0.035
 wasteful-0.033
 roughly-0.032
 fighting-0.030
Token will
Token ID481
Feature activation-3.92e-02

Pos logit contributions
ka+0.059
tie+0.058
dot+0.057
leaf+0.057
anger+0.057
Neg logit contributions
 mess-0.044
 neither-0.042
 wasteful-0.037
 forgot-0.036
 roughly-0.034
Token never
Token ID1239
Feature activation+1.35e-02

Pos logit contributions
ka+0.092
shirt+0.089
dot+0.088
tie+0.088
leaf+0.088
Neg logit contributions
 neither-0.038
 mess-0.035
 not-0.030
 overflow-0.027
 wasteful-0.027
Token to
Token ID284
Feature activation-3.66e-02

Pos logit contributions
 neither+0.003
 mess+0.003
 wasteful+0.003
 overflow+0.003
 fighting+0.003
Neg logit contributions
ka-0.002
tie-0.002
 name-0.002
dot-0.002
ance-0.002
Token her
Token ID607
Feature activation-8.65e-03

Pos logit contributions
tie+0.073
ka+0.072
ance+0.070
shirt+0.069
anger+0.069
Neg logit contributions
 mess-0.049
 neither-0.048
 fighting-0.046
 wasteful-0.043
 ruining-0.041
Token and
Token ID290
Feature activation-3.55e-05

Pos logit contributions
ka+0.015
tie+0.015
ance+0.015
dot+0.015
leaf+0.015
Neg logit contributions
 mess-0.013
 neither-0.013
 wasteful-0.011
 fighting-0.011
 overflow-0.010
Token put
Token ID1234
Feature activation-5.35e-02

Pos logit contributions
ka+0.000
tie+0.000
 name+0.000
ance+0.000
dot+0.000
Neg logit contributions
 neither-0.000
 mess-0.000
 wasteful-0.000
 overflow-0.000
 fighting-0.000
Token a
Token ID257
Feature activation-5.62e-02

Pos logit contributions
dot+0.115
 amulet+0.112
leaf+0.112
tie+0.111
chery+0.108
Neg logit contributions
 neither-0.063
 mess-0.055
 wasteful-0.050
 fighting-0.048
 both-0.046
Token comforting
Token ID34228
Feature activation-1.20e-01

Pos logit contributions
ance+0.112
anced+0.108
tie+0.097
dot+0.095
 aboard+0.092
Neg logit contributions
 mess-0.099
 wasteful-0.077
 neither-0.074
 floor-0.073
 overflow-0.073
Token hand
Token ID1021
Feature activation-5.98e-02

Pos logit contributions
ance+0.225
anced+0.212
ane+0.200
amb+0.190
tie+0.189
Neg logit contributions
 mess-0.203
 floor-0.169
 noise-0.160
 table-0.150
 fuss-0.147
Token on
Token ID319
Feature activation-1.54e-02

Pos logit contributions
ka+0.113
 Crown+0.111
leaf+0.109
 amulet+0.109
ance+0.108
Neg logit contributions
 neither-0.081
 mess-0.072
 fighting-0.065
 roughly-0.065
 wasteful-0.063
Token her
Token ID607
Feature activation-5.06e-02

Pos logit contributions
ka+0.027
tie+0.027
ance+0.026
dot+0.026
shirt+0.026
Neg logit contributions
 neither-0.023
 mess-0.022
 wasteful-0.021
 fighting-0.020
 overflow-0.019
Token shoulder
Token ID8163
Feature activation-7.67e-03

Pos logit contributions
orn+0.106
anced+0.105
leaf+0.105
ance+0.104
ka+0.104
Neg logit contributions
 mess-0.067
 floor-0.060
 neither-0.050
 mistake-0.047
 not-0.046
Token.
Token ID13
Feature activation+4.42e-03

Pos logit contributions
ka+0.012
tie+0.012
 name+0.012
ance+0.012
dot+0.012
Neg logit contributions
 neither-0.013
 mess-0.012
 wasteful-0.011
 overflow-0.011
 fighting-0.010
Token unique
Token ID3748
Feature activation-3.23e-02

Pos logit contributions
ance+0.033
tie+0.033
anced+0.032
ka+0.032
 aboard+0.032
Neg logit contributions
 mess-0.024
 neither-0.024
 wasteful-0.023
 overflow-0.020
 fighting-0.020
Token girl
Token ID2576
Feature activation-2.30e-02

Pos logit contributions
anced+0.064
ance+0.064
ka+0.062
tie+0.061
dot+0.061
Neg logit contributions
 mess-0.045
 neither-0.039
 wasteful-0.036
 floor-0.034
 rats-0.034
Token with
Token ID351
Feature activation-3.89e-02

Pos logit contributions
ka+0.045
anced+0.045
ance+0.044
tie+0.044
 name+0.043
Neg logit contributions
 neither-0.030
 mess-0.028
 wasteful-0.026
 fighting-0.025
 roughly-0.024
Token a
Token ID257
Feature activation-2.39e-02

Pos logit contributions
 aboard+0.076
ka+0.072
tie+0.071
anger+0.070
int+0.069
Neg logit contributions
 neither-0.052
 mess-0.051
 wasteful-0.047
 mice-0.045
 fighting-0.045
Token spark
Token ID9009
Feature activation-5.21e-02

Pos logit contributions
ance+0.044
ka+0.042
anced+0.041
 aboard+0.041
tie+0.041
Neg logit contributions
 mess-0.038
 neither-0.035
 wasteful-0.032
 fighting-0.030
 overflow-0.029
Tokenly
Token ID306
Feature activation-1.20e-01

Pos logit contributions
anced+0.089
ance+0.087
dot+0.087
leaf+0.086
lynn+0.085
Neg logit contributions
 mess-0.082
 neither-0.076
 floor-0.065
 wasteful-0.063
 fighting-0.063
Token smile
Token ID8212
Feature activation-6.90e-02

Pos logit contributions
ance+0.216
leaf+0.216
anced+0.197
iddled+0.192
ancing+0.190
Neg logit contributions
 mess-0.208
 floor-0.178
 noise-0.148
 hole-0.146
 table-0.143
Token and
Token ID290
Feature activation-4.09e-02

Pos logit contributions
leaf+0.138
lynn+0.121
ance+0.121
 amulet+0.120
dot+0.120
Neg logit contributions
 neither-0.089
 mess-0.087
 fighting-0.078
 floor-0.074
 anymore-0.070
Token bright
Token ID6016
Feature activation-7.57e-02

Pos logit contributions
anger+0.070
ka+0.070
tie+0.069
 aboard+0.069
orn+0.068
Neg logit contributions
 neither-0.061
 mess-0.060
 not-0.051
 wasteful-0.051
 fighting-0.051
Token eyes
Token ID2951
Feature activation-7.44e-02

Pos logit contributions
ance+0.160
leaf+0.155
cher+0.154
chery+0.154
lynn+0.152
Neg logit contributions
 mess-0.102
 floor-0.080
 noise-0.068
 fighting-0.067
 faces-0.067
Token that
Token ID326
Feature activation-4.18e-02

Pos logit contributions
lynn+0.181
leaf+0.178
 amulet+0.177
yo+0.176
ance+0.175
Neg logit contributions
 mess-0.060
 neither-0.053
 fighting-0.052
 anymore-0.046
 floor-0.041
Token had
Token ID550
Feature activation-3.20e-02

Pos logit contributions
dot+0.077
ance+0.077
ka+0.076
tie+0.075
leaf+0.074
Neg logit contributions
 mess-0.057
 neither-0.056
 fighting-0.050
 wasteful-0.048
 forgot-0.044
Token just
Token ID655
Feature activation-5.51e-02

Pos logit contributions
ka+0.052
tie+0.052
anger+0.051
ance+0.051
dot+0.050
Neg logit contributions
 mess-0.052
 neither-0.051
 wasteful-0.046
 fighting-0.045
 ruining-0.043
Token given
Token ID1813
Feature activation-3.64e-02

Pos logit contributions
anger+0.100
lor+0.092
ance+0.092
orn+0.091
 name+0.090
Neg logit contributions
 mess-0.084
 neither-0.083
 fighting-0.079
 wasteful-0.075
 ruining-0.071
Token him
Token ID683
Feature activation-5.88e-02

Pos logit contributions
tie+0.061
anger+0.060
dot+0.060
ka+0.060
anced+0.060
Neg logit contributions
 neither-0.055
 mess-0.055
 wasteful-0.049
 fighting-0.048
 pies-0.044
Token a
Token ID257
Feature activation-8.06e-02

Pos logit contributions
ka+0.107
anced+0.106
orn+0.103
anger+0.103
dot+0.103
Neg logit contributions
 neither-0.085
 mess-0.075
 wasteful-0.074
 not-0.067
 roughly-0.065
Token shiny
Token ID22441
Feature activation-1.19e-01

Pos logit contributions
ance+0.179
anced+0.175
ka+0.153
ed+0.149
dot+0.148
Neg logit contributions
 mess-0.119
 quarrel-0.090
 fight-0.086
 neither-0.086
 mistake-0.084
Token new
Token ID649
Feature activation-1.00e-01

Pos logit contributions
ance+0.232
anced+0.228
leaf+0.218
dot+0.205
lor+0.203
Neg logit contributions
 mess-0.203
 floor-0.153
 mistake-0.151
 longer-0.138
 disaster-0.137
Token pistol
Token ID16790
Feature activation-4.90e-02

Pos logit contributions
ance+0.212
anced+0.200
leaf+0.182
umper+0.181
year+0.181
Neg logit contributions
 mess-0.155
 floor-0.120
 mistake-0.120
 quarrel-0.114
 fight-0.111
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
ka+0.096
dot+0.091
anced+0.090
leaf+0.090
ance+0.088
Neg logit contributions
 neither-0.067
 mess-0.061
 wasteful-0.054
 fighting-0.053
 roughly-0.053
Token Joey
Token ID26154
Feature activation-2.54e-02

Pos logit contributions
ka+0.017
tie+0.017
ance+0.016
dot+0.016
leaf+0.016
Neg logit contributions
 neither-0.019
 mess-0.018
 wasteful-0.017
 fighting-0.016
 overflow-0.016
Token had
Token ID550
Feature activation-5.36e-02

Pos logit contributions
ance+0.045
tie+0.044
ka+0.044
leaf+0.044
dot+0.044
Neg logit contributions
 mess-0.037
 neither-0.036
 wasteful-0.032
 fighting-0.031
 roughly-0.030
Token mom
Token ID1995
Feature activation+1.28e-02

Pos logit contributions
leaf+0.103
tie+0.098
 aboard+0.098
ka+0.097
 amulet+0.097
Neg logit contributions
 neither-0.033
 mess-0.032
 fighting-0.021
 floor-0.020
 not-0.020
Tokenmy
Token ID1820
Feature activation+3.41e-03

Pos logit contributions
 neither+0.023
 wasteful+0.022
 mess+0.022
 overflow+0.020
 roughly+0.020
Neg logit contributions
 name-0.019
ka-0.018
tie-0.018
 aboard-0.017
ance-0.017
Token gave
Token ID2921
Feature activation-4.09e-03

Pos logit contributions
 neither+0.006
 mess+0.005
 wasteful+0.005
 overflow+0.005
 ruining+0.005
Neg logit contributions
ka-0.005
tie-0.005
 name-0.005
ance-0.005
dot-0.005
Token her
Token ID607
Feature activation-6.09e-02

Pos logit contributions
tie+0.006
ka+0.006
 name+0.006
ance+0.006
dot+0.005
Neg logit contributions
 neither-0.008
 mess-0.007
 wasteful-0.007
 fighting-0.006
 overflow-0.006
Token a
Token ID257
Feature activation-9.76e-02

Pos logit contributions
anced+0.127
reetings+0.126
orn+0.122
 amulet+0.122
ance+0.121
Neg logit contributions
 neither-0.070
 mess-0.067
 wasteful-0.056
 both-0.054
 not-0.054
Token cool
Token ID3608
Feature activation-1.19e-01

Pos logit contributions
ance+0.248
anced+0.244
ancing+0.208
ed+0.204
reetings+0.203
Neg logit contributions
 mess-0.128
 messy-0.088
 table-0.083
 quarrel-0.080
 bit-0.078
Token mask
Token ID9335
Feature activation-5.34e-02

Pos logit contributions
ance+0.234
reetings+0.233
anced+0.230
lor+0.226
year+0.225
Neg logit contributions
 mess-0.195
 table-0.142
 floor-0.142
 bit-0.128
 messy-0.122
Token to
Token ID284
Feature activation-5.27e-02

Pos logit contributions
ka+0.101
dot+0.095
ance+0.094
 amulet+0.094
anced+0.093
Neg logit contributions
 neither-0.073
 mess-0.069
 roughly-0.061
 fighting-0.058
 wasteful-0.057
Token wear
Token ID5806
Feature activation-6.41e-02

Pos logit contributions
ka+0.084
 aboard+0.084
anced+0.083
dot+0.081
ance+0.081
Neg logit contributions
 mess-0.087
 neither-0.079
 wasteful-0.075
 not-0.074
 overflow-0.074
Token.
Token ID13
Feature activation+7.80e-03

Pos logit contributions
leaf+0.138
anced+0.138
dot+0.135
 amulet+0.134
orn+0.132
Neg logit contributions
 mess-0.067
 not-0.058
 neither-0.057
 both-0.053
 floor-0.053
Token It
Token ID632
Feature activation-1.53e-02

Pos logit contributions
 neither+0.012
 mess+0.011
 wasteful+0.011
 ruining+0.010
 overflow+0.010
Neg logit contributions
 name-0.014
ka-0.013
tie-0.013
shirt-0.013
dot-0.013
Token she
Token ID673
Feature activation-2.14e-02

Pos logit contributions
leaf+0.055
tie+0.054
ka+0.054
dot+0.053
 amulet+0.052
Neg logit contributions
 neither-0.048
 mess-0.046
 wasteful-0.039
 fighting-0.038
 roughly-0.037
Token put
Token ID1234
Feature activation-8.18e-02

Pos logit contributions
leaf+0.036
tie+0.036
ance+0.035
 name+0.035
ka+0.035
Neg logit contributions
 mess-0.034
 neither-0.033
 wasteful-0.029
 fighting-0.028
 floor-0.027
Token on
Token ID319
Feature activation-4.62e-02

Pos logit contributions
reetings+0.215
leaf+0.214
found+0.206
chery+0.202
int+0.197
Neg logit contributions
 both-0.065
 neither-0.055
 Lily-0.048
 they-0.047
 mess-0.045
Token her
Token ID607
Feature activation-8.90e-02

Pos logit contributions
orn+0.099
reetings+0.099
leaf+0.098
anced+0.095
tie+0.095
Neg logit contributions
 mess-0.058
 neither-0.049
 both-0.048
 floor-0.047
 not-0.046
Token spark
Token ID9009
Feature activation-6.90e-02

Pos logit contributions
leaf+0.177
ance+0.172
orn+0.170
year+0.167
anced+0.166
Neg logit contributions
 mess-0.130
 floor-0.115
 not-0.107
 messy-0.098
 both-0.097
Tokenly
Token ID306
Feature activation-1.18e-01

Pos logit contributions
leaf+0.131
anced+0.119
 Magic+0.119
dot+0.118
 name+0.117
Neg logit contributions
 mess-0.101
 floor-0.085
 neither-0.081
 anymore-0.071
 pies-0.068
Token shoes
Token ID10012
Feature activation-4.20e-02

Pos logit contributions
leaf+0.256
ance+0.233
year+0.227
anced+0.222
 aboard+0.220
Neg logit contributions
 mess-0.170
 floor-0.159
 table-0.121
 messy-0.112
 floors-0.109
Token.
Token ID13
Feature activation-4.69e-02

Pos logit contributions
leaf+0.083
ka+0.081
 name+0.081
anced+0.080
dot+0.080
Neg logit contributions
 neither-0.055
 mess-0.051
 roughly-0.043
 fighting-0.042
 floor-0.041
Token
Token ID198
Feature activation-1.50e-02

Pos logit contributions
ka+0.099
leaf+0.099
 powers+0.098
tie+0.098
 amulet+0.096
Neg logit contributions
 mess-0.051
 neither-0.050
 floor-0.044
 roughly-0.042
 fighting-0.041
Token
Token ID198
Feature activation-5.05e-03

Pos logit contributions
ka+0.026
tie+0.026
ance+0.025
dot+0.025
 name+0.025
Neg logit contributions
 neither-0.022
 mess-0.022
 wasteful-0.020
 roughly-0.018
 fighting-0.018
TokenShe
Token ID3347
Feature activation-3.13e-02

Pos logit contributions
ka+0.010
tie+0.010
 name+0.010
ance+0.010
dot+0.010
Neg logit contributions
 neither-0.006
 mess-0.006
 wasteful-0.006
 overflow-0.005
 roughly-0.005
Token.
Token ID13
Feature activation-1.23e-02

Pos logit contributions
ka+0.169
anced+0.157
leaf+0.155
dot+0.155
 amulet+0.155
Neg logit contributions
 neither-0.084
 mess-0.071
 not-0.067
 they-0.061
 floor-0.060
Token He
Token ID679
Feature activation-5.41e-02

Pos logit contributions
ka+0.020
tie+0.020
ance+0.019
leaf+0.019
dot+0.019
Neg logit contributions
 neither-0.020
 mess-0.019
 wasteful-0.018
 overflow-0.017
 fighting-0.017
Token put
Token ID1234
Feature activation-8.23e-02

Pos logit contributions
ance+0.105
leaf+0.104
tie+0.102
anger+0.101
dot+0.101
Neg logit contributions
 mess-0.070
 neither-0.062
 forgot-0.060
 floor-0.055
 stomp-0.051
Token on
Token ID319
Feature activation-6.14e-02

Pos logit contributions
reetings+0.176
found+0.175
leaf+0.174
lor+0.170
dot+0.166
Neg logit contributions
 both-0.086
 neither-0.086
 mess-0.073
 they-0.073
 fighting-0.064
Token a
Token ID257
Feature activation-9.40e-02

Pos logit contributions
reetings+0.141
leaf+0.138
orn+0.138
ocket+0.137
anced+0.133
Neg logit contributions
 mess-0.066
 floor-0.055
 both-0.054
 neither-0.052
 not-0.051
Token blue
Token ID4171
Feature activation-1.17e-01

Pos logit contributions
ance+0.206
anced+0.199
 aboard+0.165
reetings+0.162
ancing+0.159
Neg logit contributions
 mess-0.155
 floor-0.129
 messy-0.116
 table-0.113
 timer-0.109
Token shirt
Token ID10147
Feature activation-6.09e-02

Pos logit contributions
reetings+0.247
leaf+0.240
ance+0.239
anced+0.235
year+0.233
Neg logit contributions
 floor-0.139
 mess-0.136
 table-0.127
 longer-0.106
 both-0.103
Token,
Token ID11
Feature activation-6.75e-02

Pos logit contributions
leaf+0.143
ka+0.139
anced+0.135
dot+0.134
 amulet+0.132
Neg logit contributions
 neither-0.053
 mess-0.048
 floor-0.046
 not-0.040
 roughly-0.037
Token blue
Token ID4171
Feature activation-9.34e-02

Pos logit contributions
orn+0.130
ance+0.128
anger+0.128
ane+0.125
reetings+0.122
Neg logit contributions
 mess-0.082
 neither-0.082
 floor-0.077
 not-0.075
 they-0.074
Token pants
Token ID12581
Feature activation-5.95e-02

Pos logit contributions
leaf+0.193
year+0.191
reetings+0.188
ocket+0.180
 amulet+0.176
Neg logit contributions
 floor-0.102
 mess-0.097
 table-0.087
 mouths-0.086
 cups-0.085
Token,
Token ID11
Feature activation-6.18e-02

Pos logit contributions
leaf+0.136
 amulet+0.132
 name+0.131
ka+0.131
orn+0.129
Neg logit contributions
 neither-0.057
 mess-0.052
 floor-0.045
 fighting-0.044
 they-0.039
Token she
Token ID673
Feature activation-7.20e-03

Pos logit contributions
 mess+0.029
 neither+0.028
 wasteful+0.028
 overflow+0.025
 roughly+0.025
Neg logit contributions
 name-0.033
ka-0.032
tie-0.032
dot-0.031
shirt-0.031
Token found
Token ID1043
Feature activation-2.75e-02

Pos logit contributions
ka+0.012
tie+0.012
 name+0.012
ance+0.012
leaf+0.012
Neg logit contributions
 neither-0.011
 mess-0.011
 wasteful-0.010
 fighting-0.009
 overflow-0.009
Token a
Token ID257
Feature activation-7.71e-03

Pos logit contributions
ka+0.045
tie+0.043
 aboard+0.043
anced+0.043
dot+0.043
Neg logit contributions
 neither-0.044
 mess-0.041
 wasteful-0.040
 rats-0.039
 mice-0.037
Token shiny
Token ID22441
Feature activation-7.80e-02

Pos logit contributions
ka+0.012
ance+0.012
tie+0.012
dot+0.012
 name+0.012
Neg logit contributions
 mess-0.013
 neither-0.012
 wasteful-0.012
 overflow-0.011
 fighting-0.010
Token pe
Token ID613
Feature activation-9.41e-02

Pos logit contributions
ance+0.139
anced+0.137
leaf+0.133
dot+0.124
found+0.124
Neg logit contributions
 mess-0.146
 floor-0.116
 neither-0.109
 wasteful-0.105
 fighting-0.103
Tokenb
Token ID65
Feature activation-1.17e-01

Pos logit contributions
anced+0.163
leaf+0.148
ance+0.143
 Music+0.135
 farewell+0.134
Neg logit contributions
 mess-0.150
 neither-0.126
 not-0.124
 wasteful-0.121
 floor-0.116
Tokenble
Token ID903
Feature activation-2.77e-02

Pos logit contributions
leaf+0.285
 outer+0.262
shirt+0.261
chery+0.259
 farewell+0.259
Neg logit contributions
 neither-0.100
 not-0.086
 mess-0.053
 fighting-0.048
 both-0.045
Token and
Token ID290
Feature activation-2.43e-02

Pos logit contributions
ka+0.046
anced+0.045
ance+0.044
 name+0.044
leaf+0.043
Neg logit contributions
 neither-0.045
 mess-0.044
 wasteful-0.039
 fighting-0.039
 roughly-0.038
Token picked
Token ID6497
Feature activation-7.70e-02

Pos logit contributions
ance+0.043
ka+0.040
tie+0.040
 aboard+0.040
anced+0.040
Neg logit contributions
 mess-0.038
 neither-0.038
 wasteful-0.036
 roughly-0.031
 fighting-0.031
Token it
Token ID340
Feature activation-2.71e-02

Pos logit contributions
anced+0.197
chery+0.193
ka+0.190
uster+0.189
leaf+0.189
Neg logit contributions
 neither-0.048
 mess-0.045
 not-0.034
 fighting-0.029
 both-0.027
Token up
Token ID510
Feature activation-1.00e-02

Pos logit contributions
leaf+0.046
ance+0.046
ka+0.046
anced+0.045
dot+0.044
Neg logit contributions
 neither-0.043
 mess-0.043
 fighting-0.035
 roughly-0.035
 wasteful-0.035
Token She
Token ID1375
Feature activation-4.02e-02

Pos logit contributions
ka+0.056
tie+0.054
orn+0.052
ocket+0.052
 aboard+0.051
Neg logit contributions
 neither-0.035
 mess-0.032
 wasteful-0.030
 roughly-0.029
 floor-0.028
Token picked
Token ID6497
Feature activation-6.28e-02

Pos logit contributions
leaf+0.073
tie+0.073
ka+0.072
dot+0.071
ocket+0.070
Neg logit contributions
 mess-0.055
 neither-0.054
 forgot-0.047
 quarrel-0.044
 fighting-0.044
Token up
Token ID510
Feature activation-3.58e-02

Pos logit contributions
och+0.121
ka+0.120
ocket+0.120
aved+0.119
anced+0.119
Neg logit contributions
 neither-0.069
 mess-0.065
 mosquitoes-0.064
 snakes-0.058
 mice-0.058
Token a
Token ID257
Feature activation-4.35e-02

Pos logit contributions
tie+0.062
ka+0.060
leaf+0.059
dot+0.059
anced+0.059
Neg logit contributions
 neither-0.051
 mess-0.051
 rats-0.048
 mosquitoes-0.045
 mice-0.045
Token shiny
Token ID22441
Feature activation-1.01e-01

Pos logit contributions
ance+0.085
anced+0.084
dot+0.076
ka+0.074
leaf+0.073
Neg logit contributions
 mess-0.071
 neither-0.055
 floor-0.054
 wasteful-0.053
 quarrel-0.051
Token golden
Token ID10861
Feature activation-1.17e-01

Pos logit contributions
anced+0.184
ance+0.182
leaf+0.180
dot+0.164
aved+0.164
Neg logit contributions
 mess-0.182
 floor-0.146
 mistake-0.126
 table-0.124
 tower-0.121
Token sword
Token ID8429
Feature activation-3.51e-02

Pos logit contributions
ance+0.220
anced+0.214
leaf+0.202
aved+0.193
iddled+0.189
Neg logit contributions
 mess-0.204
 floor-0.173
 table-0.150
 tower-0.145
 pie-0.144
Token and
Token ID290
Feature activation-2.84e-02

Pos logit contributions
ka+0.073
leaf+0.072
dot+0.070
shirt+0.068
Aid+0.068
Neg logit contributions
 neither-0.043
 mess-0.040
 fighting-0.034
 floor-0.033
 roughly-0.033
Token went
Token ID1816
Feature activation-8.25e-02

Pos logit contributions
ka+0.049
tie+0.047
dot+0.047
anger+0.046
ance+0.046
Neg logit contributions
 neither-0.043
 mess-0.042
 wasteful-0.037
 roughly-0.035
 fighting-0.034
Token out
Token ID503
Feature activation-5.13e-02

Pos logit contributions
orn+0.189
ocket+0.187
anced+0.185
igh+0.184
ka+0.181
Neg logit contributions
 not-0.078
 mess-0.072
 after-0.068
 they-0.065
 ruining-0.059
Token into
Token ID656
Feature activation-2.63e-02

Pos logit contributions
tie+0.107
ka+0.106
yo+0.102
orn+0.102
 name+0.101
Neg logit contributions
 mess-0.058
 neither-0.057
 fighting-0.054
 snakes-0.050
 roughly-0.049
Token fairy
Token ID25607
Feature activation-4.88e-03

Pos logit contributions
tie+0.054
shirt+0.052
anced+0.050
ance+0.049
dot+0.049
Neg logit contributions
 mess-0.066
 neither-0.059
 wasteful-0.057
 fighting-0.056
 floor-0.054
Token gave
Token ID2921
Feature activation-9.54e-03

Pos logit contributions
ka+0.008
tie+0.007
 name+0.007
ance+0.007
dot+0.007
Neg logit contributions
 neither-0.008
 mess-0.008
 wasteful-0.007
 fighting-0.007
 overflow-0.007
Token Lily
Token ID20037
Feature activation-5.19e-02

Pos logit contributions
tie+0.019
ka+0.019
dot+0.019
 name+0.019
leaf+0.019
Neg logit contributions
 neither-0.012
 mess-0.011
 wasteful-0.010
 fighting-0.009
 overflow-0.009
Token a
Token ID257
Feature activation-8.27e-02

Pos logit contributions
anger+0.096
ka+0.094
anced+0.091
leaf+0.090
tie+0.088
Neg logit contributions
 neither-0.070
 mess-0.065
 both-0.057
 mice-0.057
 not-0.057
Token spark
Token ID9009
Feature activation-7.07e-02

Pos logit contributions
ance+0.207
anced+0.200
dot+0.177
ka+0.176
lynn+0.175
Neg logit contributions
 mess-0.097
 not-0.058
 neither-0.057
 overflow-0.057
 bit-0.057
Tokenly
Token ID306
Feature activation-1.17e-01

Pos logit contributions
anced+0.143
dot+0.136
ance+0.134
lynn+0.134
aved+0.133
Neg logit contributions
 mess-0.088
 neither-0.074
 floor-0.062
 worse-0.060
 longer-0.060
Token wand
Token ID11569
Feature activation-6.30e-02

Pos logit contributions
ance+0.195
anced+0.182
leaf+0.175
lynn+0.175
aden+0.167
Neg logit contributions
 mess-0.205
 floor-0.171
 bit-0.160
 noise-0.150
 longer-0.149
Token and
Token ID290
Feature activation-1.03e-02

Pos logit contributions
ka+0.144
dot+0.136
ance+0.136
yo+0.136
leaf+0.136
Neg logit contributions
 neither-0.064
 mess-0.057
 not-0.047
 fighting-0.047
 roughly-0.046
Token said
Token ID531
Feature activation-3.43e-02

Pos logit contributions
ka+0.019
tie+0.018
 name+0.018
dot+0.018
ance+0.018
Neg logit contributions
 neither-0.015
 mess-0.014
 wasteful-0.013
 overflow-0.012
 fighting-0.012
Token to
Token ID284
Feature activation-3.92e-03

Pos logit contributions
leaf+0.076
tie+0.076
shirt+0.075
ka+0.075
Aid+0.074
Neg logit contributions
 neither-0.038
 mess-0.030
 wasteful-0.027
 fighting-0.026
 not-0.025
Token make
Token ID787
Feature activation-7.31e-03

Pos logit contributions
tie+0.008
ka+0.007
 name+0.007
dot+0.007
ance+0.007
Neg logit contributions
 neither-0.005
 mess-0.005
 wasteful-0.005
 overflow-0.004
 fighting-0.004
Token her
Token ID607
Feature activation-3.01e-02

Pos logit contributions
tie+0.018
ka+0.018
ance+0.018
leaf+0.018
 name+0.017
Neg logit contributions
 neither-0.013
 mess-0.012
 wasteful-0.011
 fighting-0.010
 roughly-0.010
Token mum
Token ID25682
Feature activation-2.60e-02

Pos logit contributions
tie+0.066
ka+0.065
ance+0.065
 aboard+0.064
dot+0.064
Neg logit contributions
 mess-0.033
 neither-0.031
 fighting-0.025
 wasteful-0.025
 mosquitoes-0.024
Token put
Token ID1234
Feature activation-3.78e-02

Pos logit contributions
ka+0.041
tie+0.040
dot+0.040
ance+0.040
anced+0.039
Neg logit contributions
 neither-0.043
 mess-0.042
 wasteful-0.039
 fighting-0.036
 overflow-0.035
Token on
Token ID319
Feature activation-6.81e-03

Pos logit contributions
dot+0.059
ance+0.059
leaf+0.059
anced+0.058
 amulet+0.058
Neg logit contributions
 neither-0.061
 mess-0.057
 wasteful-0.054
 raw-0.051
 fighting-0.050
Token her
Token ID607
Feature activation-7.54e-02

Pos logit contributions
tie+0.010
ka+0.010
 name+0.010
ance+0.010
dot+0.010
Neg logit contributions
 neither-0.012
 mess-0.012
 wasteful-0.011
 overflow-0.010
 fighting-0.010
Token special
Token ID2041
Feature activation-1.16e-01

Pos logit contributions
ance+0.146
anced+0.140
ka+0.134
tie+0.134
 aboard+0.131
Neg logit contributions
 mess-0.104
 neither-0.092
 floor-0.091
 raw-0.086
 quarrel-0.085
Token clothes
Token ID8242
Feature activation-2.87e-02

Pos logit contributions
ance+0.218
anced+0.208
aved+0.198
tie+0.198
leaf+0.194
Neg logit contributions
 mess-0.183
 floor-0.160
 quarrel-0.149
 table-0.146
 fight-0.144
Token and
Token ID290
Feature activation-1.85e-02

Pos logit contributions
ka+0.055
leaf+0.054
dot+0.054
 name+0.053
ance+0.052
Neg logit contributions
 neither-0.040
 mess-0.037
 wasteful-0.032
 fighting-0.031
 roughly-0.031
Token said
Token ID531
Feature activation-1.99e-02

Pos logit contributions
ka+0.032
tie+0.031
ance+0.030
dot+0.030
leaf+0.030
Neg logit contributions
 neither-0.028
 mess-0.028
 wasteful-0.025
 overflow-0.024
 fighting-0.023
Token to
Token ID284
Feature activation+3.66e-03

Pos logit contributions
tie+0.040
ka+0.039
leaf+0.038
 name+0.038
shirt+0.038
Neg logit contributions
 neither-0.026
 mess-0.024
 wasteful-0.022
 fighting-0.020
 overflow-0.020
Token her
Token ID607
Feature activation-4.58e-02

Pos logit contributions
 neither+0.007
 mess+0.007
 wasteful+0.007
 overflow+0.006
 fighting+0.006
Neg logit contributions
 name-0.005
ka-0.005
tie-0.005
ance-0.005
dot-0.005
Token watch
Token ID2342
Feature activation+3.53e-02

Pos logit contributions
 neither+0.035
 wasteful+0.034
 mess+0.033
 roughly+0.030
 overflow+0.029
Neg logit contributions
 name-0.036
tie-0.035
 aboard-0.035
ka-0.034
ance-0.034
Token out
Token ID503
Feature activation+3.75e-02

Pos logit contributions
 neither+0.054
 wasteful+0.053
 mess+0.051
 roughly+0.046
 overflow+0.046
Neg logit contributions
 name-0.060
 aboard-0.059
ance-0.057
ka-0.056
tie-0.054
Token for
Token ID329
Feature activation+3.42e-02

Pos logit contributions
 neither+0.061
 wasteful+0.057
 mess+0.055
 overflow+0.050
 roughly+0.050
Neg logit contributions
 aboard-0.061
 name-0.060
ka-0.059
ance-0.057
tie-0.055
Token the
Token ID262
Feature activation-4.33e-03

Pos logit contributions
 neither+0.054
 wasteful+0.050
 mess+0.049
 overflow+0.046
 roughly+0.046
Neg logit contributions
 name-0.054
 aboard-0.054
ance-0.053
 powers-0.052
ka-0.052
Token crocod
Token ID37565
Feature activation+2.58e-03

Pos logit contributions
ka+0.007
tie+0.007
 name+0.007
ance+0.007
dot+0.007
Neg logit contributions
 neither-0.007
 mess-0.007
 wasteful-0.006
 overflow-0.006
 fighting-0.006
Tokenile
Token ID576
Feature activation+5.80e-02

Pos logit contributions
 neither+0.004
 mess+0.004
 wasteful+0.004
 overflow+0.003
 fighting+0.003
Neg logit contributions
ka-0.004
tie-0.004
 name-0.004
ance-0.004
dot-0.004
Token,"
Token ID553
Feature activation+2.30e-02

Pos logit contributions
 wasteful+0.098
 neither+0.098
 overflow+0.089
 mess+0.089
 ruining+0.086
Neg logit contributions
 aboard-0.085
 name-0.083
tie-0.082
 powers-0.078
ka-0.077
Token Lily
Token ID20037
Feature activation+3.07e-02

Pos logit contributions
 neither+0.029
 mess+0.028
 wasteful+0.027
 overflow+0.026
 ruining+0.026
Neg logit contributions
 name-0.045
ka-0.042
tie-0.040
yo-0.040
 aboard-0.040
Token said
Token ID531
Feature activation-1.71e-02

Pos logit contributions
 wasteful+0.054
 neither+0.053
 mess+0.052
 overflow+0.051
 ruining+0.048
Neg logit contributions
 name-0.046
tie-0.045
ka-0.044
shirt-0.042
leaf-0.042
Token.
Token ID13
Feature activation+8.86e-03

Pos logit contributions
ka+0.030
tie+0.030
leaf+0.030
dot+0.029
ance+0.029
Neg logit contributions
 neither-0.026
 mess-0.024
 wasteful-0.021
 fighting-0.021
 roughly-0.020
Token\n
Token ID198
Feature activation+9.23e-03

Pos logit contributions
 mess+0.016
 neither+0.016
 wasteful+0.016
 overflow+0.014
 ruining+0.014
Neg logit contributions
 name-0.012
tie-0.012
ka-0.012
dot-0.011
ance-0.011
Token on
Token ID319
Feature activation-1.50e-03

Pos logit contributions
ka+0.106
ocket+0.102
 amulet+0.100
ian+0.099
 Crown+0.098
Neg logit contributions
 neither-0.069
 mess-0.066
 fighting-0.062
 they-0.060
 pies-0.058
Token an
Token ID281
Feature activation-5.33e-03

Pos logit contributions
ka+0.002
tie+0.002
 name+0.002
ance+0.002
dot+0.002
Neg logit contributions
 neither-0.003
 mess-0.003
 wasteful-0.003
 overflow-0.003
 fighting-0.003
Token adventure
Token ID8855
Feature activation+1.13e-02

Pos logit contributions
ka+0.008
tie+0.008
 name+0.008
ance+0.007
dot+0.007
Neg logit contributions
 neither-0.009
 mess-0.009
 wasteful-0.009
 overflow-0.008
 fighting-0.008
Token.
Token ID13
Feature activation+4.78e-02

Pos logit contributions
 mess+0.021
 neither+0.021
 wasteful+0.020
 overflow+0.019
 ruining+0.018
Neg logit contributions
tie-0.015
ka-0.015
 name-0.015
 aboard-0.015
ance-0.015
Token First
Token ID3274
Feature activation+2.88e-02

Pos logit contributions
 mess+0.083
 wasteful+0.081
 neither+0.076
 overflow+0.072
 mosquitoes+0.070
Neg logit contributions
 name-0.082
 aboard-0.074
anced-0.071
dot-0.071
tie-0.071
Token,
Token ID11
Feature activation+6.44e-02

Pos logit contributions
 wasteful+0.055
 mess+0.055
 overflow+0.050
 ruining+0.050
 neither+0.050
Neg logit contributions
 aboard-0.038
 name-0.038
ance-0.037
tie-0.036
ka-0.035
Token they
Token ID484
Feature activation-2.70e-03

Pos logit contributions
 mess+0.107
 wasteful+0.105
ivering+0.098
 overflow+0.096
ivery+0.093
Neg logit contributions
 aboard-0.112
 name-0.106
tie-0.097
leaf-0.096
 onboard-0.091
Token found
Token ID1043
Feature activation-7.62e-03

Pos logit contributions
ka+0.004
tie+0.004
 name+0.004
ance+0.004
dot+0.004
Neg logit contributions
 neither-0.005
 mess-0.005
 wasteful-0.005
 overflow-0.004
 fighting-0.004
Token a
Token ID257
Feature activation-1.73e-02

Pos logit contributions
ka+0.011
tie+0.010
 name+0.010
dot+0.010
shirt+0.010
Neg logit contributions
 neither-0.014
 mess-0.014
 wasteful-0.013
 fighting-0.012
 overflow-0.012
Token trumpet
Token ID40334
Feature activation-2.02e-02

Pos logit contributions
ance+0.028
ka+0.028
tie+0.027
anced+0.027
dot+0.027
Neg logit contributions
 mess-0.029
 neither-0.028
 wasteful-0.026
 overflow-0.024
 fighting-0.024
Token.
Token ID13
Feature activation+3.78e-02

Pos logit contributions
ka+0.031
 name+0.030
dot+0.030
ance+0.029
leaf+0.029
Neg logit contributions
 neither-0.035
 mess-0.033
 wasteful-0.031
 fighting-0.030
 roughly-0.029
Token dog
Token ID3290
Feature activation+2.67e-02

Pos logit contributions
 wasteful+0.039
 neither+0.038
 overflow+0.036
 mess+0.036
 roughly+0.035
Neg logit contributions
 name-0.035
tie-0.032
ka-0.032
 aboard-0.031
ance-0.031
Token!
Token ID0
Feature activation+2.28e-02

Pos logit contributions
 neither+0.049
 wasteful+0.048
 mess+0.048
 overflow+0.045
 structures+0.043
Neg logit contributions
 name-0.034
ka-0.034
tie-0.034
 aboard-0.033
ance-0.032
Token You
Token ID921
Feature activation+2.05e-02

Pos logit contributions
 mess+0.042
 neither+0.040
 wasteful+0.040
 overflow+0.037
 roughly+0.036
Neg logit contributions
 name-0.032
ka-0.030
tie-0.030
 aboard-0.029
dot-0.029
Token ate
Token ID15063
Feature activation+6.32e-02

Pos logit contributions
 neither+0.031
 mess+0.030
 wasteful+0.029
 structures+0.027
 overflow+0.027
Neg logit contributions
 name-0.034
ka-0.033
tie-0.033
leaf-0.032
ance-0.032
Token our
Token ID674
Feature activation+5.41e-03

Pos logit contributions
 overflow+0.101
 neither+0.101
 wasteful+0.096
 erupted+0.096
 forgot+0.095
Neg logit contributions
 name-0.084
 aboard-0.084
tie-0.081
ka-0.079
ance-0.073
Token candy
Token ID18550
Feature activation+7.96e-02

Pos logit contributions
 neither+0.009
 mess+0.009
 wasteful+0.009
 overflow+0.008
 roughly+0.008
Neg logit contributions
 name-0.008
ka-0.008
tie-0.008
dot-0.008
shirt-0.008
Token!
Token ID0
Feature activation+2.09e-02

Pos logit contributions
 wasteful+0.156
 neither+0.150
 overflow+0.143
 fighting+0.139
 spitting+0.138
Neg logit contributions
tie-0.096
ka-0.085
 aboard-0.085
 name-0.081
ed-0.081
Token You
Token ID921
Feature activation+3.37e-02

Pos logit contributions
 mess+0.040
 neither+0.038
 wasteful+0.037
 overflow+0.035
 structures+0.034
Neg logit contributions
 name-0.029
tie-0.027
ka-0.027
dot-0.026
ance-0.026
Token are
Token ID389
Feature activation+4.31e-02

Pos logit contributions
 neither+0.055
 mess+0.053
 wasteful+0.053
 structures+0.051
 overflow+0.049
Neg logit contributions
 name-0.051
tie-0.050
ka-0.049
leaf-0.048
ance-0.047
Token a
Token ID257
Feature activation+3.74e-02

Pos logit contributions
 neither+0.060
 overflow+0.058
 mess+0.057
 roughly+0.056
 fuss+0.053
Neg logit contributions
 name-0.070
ka-0.068
leaf-0.067
ocket-0.067
tie-0.066
Token filthy
Token ID35677
Feature activation+4.99e-03

Pos logit contributions
 neither+0.055
 wasteful+0.048
 overflow+0.046
 structures+0.045
 mess+0.043
Neg logit contributions
 name-0.066
tie-0.063
ka-0.063
leaf-0.061
yo-0.059
Token for
Token ID329
Feature activation+2.12e-02

Pos logit contributions
ka+0.017
tie+0.017
dot+0.017
 name+0.016
ance+0.016
Neg logit contributions
 neither-0.013
 mess-0.013
 wasteful-0.012
 fighting-0.011
 overflow-0.011
Token help
Token ID1037
Feature activation+4.84e-02

Pos logit contributions
 neither+0.037
 mess+0.034
 wasteful+0.033
 overflow+0.031
 roughly+0.031
Neg logit contributions
 name-0.031
ance-0.030
 aboard-0.030
leaf-0.030
 powers-0.029
Token.
Token ID13
Feature activation-3.20e-02

Pos logit contributions
 neither+0.104
 wasteful+0.104
 mess+0.098
 structures+0.096
 overflow+0.094
Neg logit contributions
 name-0.047
ance-0.044
tie-0.040
 aboard-0.039
shirt-0.039
Token And
Token ID843
Feature activation-3.54e-03

Pos logit contributions
ka+0.059
tie+0.058
shirt+0.055
dot+0.055
ocket+0.055
Neg logit contributions
 neither-0.046
 wasteful-0.041
 mess-0.037
 fighting-0.036
 roughly-0.036
Token don
Token ID836
Feature activation+2.43e-02

Pos logit contributions
ka+0.008
tie+0.008
 name+0.007
dot+0.007
ance+0.007
Neg logit contributions
 neither-0.004
 mess-0.004
 wasteful-0.003
 overflow-0.003
 fighting-0.003
Token't
Token ID470
Feature activation+6.50e-02

Pos logit contributions
 mess+0.038
 wasteful+0.037
 neither+0.036
 rats+0.033
 roughly+0.032
Neg logit contributions
 name-0.040
ance-0.039
 aboard-0.039
tie-0.038
ka-0.037
Token let
Token ID1309
Feature activation+1.02e-02

Pos logit contributions
 structures+0.091
 wasteful+0.090
 neither+0.087
 roughly+0.079
iders+0.077
Neg logit contributions
 name-0.118
tie-0.097
 Crown-0.093
 Magic-0.092
 bow-0.091
Token anyone
Token ID2687
Feature activation+1.73e-02

Pos logit contributions
 neither+0.013
 mess+0.013
 wasteful+0.012
 overflow+0.011
 roughly+0.011
Neg logit contributions
 name-0.020
ance-0.019
ka-0.019
tie-0.019
 aboard-0.019
Token tell
Token ID1560
Feature activation+1.35e-02

Pos logit contributions
 neither+0.029
 mess+0.028
 wasteful+0.027
 overflow+0.025
 roughly+0.024
Neg logit contributions
 name-0.027
tie-0.026
leaf-0.025
ka-0.025
 aboard-0.025
Token you
Token ID345
Feature activation+1.65e-02

Pos logit contributions
 neither+0.016
 mess+0.015
 wasteful+0.014
 overflow+0.013
 roughly+0.012
Neg logit contributions
 name-0.027
ka-0.027
ance-0.027
tie-0.027
 aboard-0.026
Token that
Token ID326
Feature activation+3.17e-02

Pos logit contributions
 neither+0.028
 mess+0.028
 wasteful+0.027
 overflow+0.025
 ruining+0.025
Neg logit contributions
 name-0.024
tie-0.024
ance-0.024
ka-0.023
 aboard-0.023
Token walls
Token ID7714
Feature activation+2.19e-02

Pos logit contributions
 neither+0.050
 wasteful+0.044
 roughly+0.043
 overflow+0.040
 mess+0.039
Neg logit contributions
 name-0.059
ka-0.056
leaf-0.054
tie-0.053
shirt-0.053
Token,
Token ID11
Feature activation+3.44e-02

Pos logit contributions
 neither+0.038
 wasteful+0.036
 mess+0.036
 ruining+0.032
 overflow+0.032
Neg logit contributions
tie-0.032
ka-0.031
 name-0.031
ance-0.030
dot-0.030
Token the
Token ID262
Feature activation+3.00e-02

Pos logit contributions
 neither+0.061
 wasteful+0.059
 mess+0.058
 anth+0.056
 roughly+0.055
Neg logit contributions
 name-0.044
tie-0.043
ka-0.043
 shone-0.041
 aboard-0.041
Token ceiling
Token ID13387
Feature activation+4.58e-02

Pos logit contributions
 neither+0.044
 wasteful+0.039
 roughly+0.038
 mess+0.036
 overflow+0.036
Neg logit contributions
 name-0.054
ka-0.051
tie-0.049
leaf-0.049
shirt-0.048
Token,
Token ID11
Feature activation+2.98e-02

Pos logit contributions
 neither+0.086
 wasteful+0.086
 mess+0.079
 mosquitoes+0.075
 ruining+0.075
Neg logit contributions
tie-0.053
 shone-0.053
ka-0.052
 name-0.051
 powers-0.050
Token and
Token ID290
Feature activation+6.12e-02

Pos logit contributions
 neither+0.053
 wasteful+0.051
 mess+0.050
 anth+0.047
 roughly+0.047
Neg logit contributions
 name-0.040
tie-0.039
ka-0.038
 aboard-0.037
 shone-0.037
Token the
Token ID262
Feature activation+3.06e-02

Pos logit contributions
oving+0.101
 neither+0.099
 roughly+0.097
 mess+0.097
 overflow+0.095
Neg logit contributions
 rib-0.078
 name-0.075
 shone-0.075
ka-0.075
tie-0.073
Token people
Token ID661
Feature activation+2.29e-02

Pos logit contributions
 neither+0.045
 wasteful+0.040
 roughly+0.038
 mess+0.037
 overflow+0.036
Neg logit contributions
 name-0.055
ka-0.052
tie-0.050
leaf-0.050
 Crown-0.049
Token.
Token ID13
Feature activation+2.05e-02

Pos logit contributions
 neither+0.043
 mess+0.042
 wasteful+0.041
 overflow+0.037
 structures+0.037
Neg logit contributions
tie-0.030
 name-0.029
ka-0.029
dot-0.028
leaf-0.028
Token\n
Token ID198
Feature activation+2.24e-02

Pos logit contributions
 mess+0.038
 neither+0.036
 wasteful+0.035
 ruining+0.033
 overflow+0.033
Neg logit contributions
 name-0.030
ance-0.027
dot-0.027
ka-0.026
 aboard-0.026
Token\n
Token ID198
Feature activation+4.03e-02

Pos logit contributions
 neither+0.045
 mess+0.044
 wasteful+0.043
 ruining+0.041
 overflow+0.040
Neg logit contributions
 name-0.028
dot-0.025
tie-0.024
 aboard-0.024
ance-0.024
Token was
Token ID373
Feature activation+1.60e-03

Pos logit contributions
ka+0.022
tie+0.021
 name+0.021
dot+0.021
leaf+0.021
Neg logit contributions
 neither-0.022
 mess-0.022
 wasteful-0.020
 overflow-0.019
 fighting-0.018
Token ready
Token ID3492
Feature activation-1.53e-02

Pos logit contributions
 neither+0.003
 mess+0.002
 wasteful+0.002
 overflow+0.002
 fighting+0.002
Neg logit contributions
ka-0.003
tie-0.003
 name-0.003
ance-0.003
dot-0.003
Token.
Token ID13
Feature activation-6.68e-03

Pos logit contributions
ka+0.026
tie+0.025
leaf+0.024
ance+0.024
dot+0.024
Neg logit contributions
 mess-0.025
 neither-0.025
 wasteful-0.022
 overflow-0.021
 fighting-0.021
Token She
Token ID1375
Feature activation+2.34e-03

Pos logit contributions
ka+0.011
tie+0.011
ance+0.011
dot+0.011
 name+0.011
Neg logit contributions
 neither-0.011
 mess-0.010
 wasteful-0.010
 overflow-0.009
 fighting-0.009
Token and
Token ID290
Feature activation+2.02e-02

Pos logit contributions
 neither+0.004
 mess+0.004
 wasteful+0.004
 overflow+0.004
 fighting+0.003
Neg logit contributions
ka-0.003
tie-0.003
 name-0.003
ance-0.003
dot-0.003
Token her
Token ID607
Feature activation+1.01e-02

Pos logit contributions
 neither+0.038
 wasteful+0.037
 mess+0.036
 overflow+0.034
 ruining+0.033
Neg logit contributions
 name-0.028
ka-0.027
tie-0.026
shirt-0.025
ance-0.024
Token mom
Token ID1995
Feature activation-2.55e-03

Pos logit contributions
 neither+0.013
 wasteful+0.012
 mess+0.012
 overflow+0.011
 roughly+0.011
Neg logit contributions
 name-0.020
ka-0.019
tie-0.019
dot-0.018
 powers-0.018
Tokenmy
Token ID1820
Feature activation-5.82e-03

Pos logit contributions
ka+0.004
tie+0.004
 name+0.004
ance+0.004
dot+0.004
Neg logit contributions
 neither-0.004
 mess-0.004
 wasteful-0.004
 overflow-0.004
 fighting-0.004
Token crawled
Token ID45668
Feature activation-3.29e-02

Pos logit contributions
ka+0.009
tie+0.009
 name+0.009
ance+0.009
dot+0.009
Neg logit contributions
 neither-0.010
 mess-0.010
 wasteful-0.009
 overflow-0.008
 fighting-0.008
Token inside
Token ID2641
Feature activation+2.23e-02

Pos logit contributions
 Crown+0.050
orn+0.050
ance+0.050
yo+0.049
tie+0.048
Neg logit contributions
 mess-0.055
 neither-0.052
 overflow-0.049
 fighting-0.047
 roughly-0.046
Token and
Token ID290
Feature activation+1.69e-02

Pos logit contributions
 neither+0.041
 wasteful+0.041
 mess+0.038
 rats+0.036
 mosquitoes+0.036
Neg logit contributions
 name-0.031
 aboard-0.029
ka-0.028
tie-0.028
leaf-0.028
Token Spot
Token ID15899
Feature activation-1.92e-02

Pos logit contributions
ka+0.038
tie+0.037
dot+0.037
anced+0.036
ance+0.036
Neg logit contributions
 neither-0.027
 mess-0.027
 wasteful-0.024
 fighting-0.023
 ruining-0.022
Token again
Token ID757
Feature activation+3.27e-03

Pos logit contributions
ka+0.036
tie+0.034
leaf+0.034
dot+0.034
ance+0.034
Neg logit contributions
 mess-0.028
 neither-0.027
 fighting-0.024
 wasteful-0.024
 ruining-0.023
Token!
Token ID0
Feature activation+2.62e-03

Pos logit contributions
 neither+0.006
 mess+0.006
 wasteful+0.005
 overflow+0.005
 ruining+0.005
Neg logit contributions
tie-0.005
 name-0.005
ka-0.005
ance-0.005
dot-0.005
Token She
Token ID1375
Feature activation-9.70e-03

Pos logit contributions
 neither+0.004
 mess+0.004
 wasteful+0.004
 overflow+0.004
 ruining+0.004
Neg logit contributions
 name-0.004
tie-0.004
ka-0.004
ance-0.004
dot-0.004
Token painted
Token ID13055
Feature activation-3.75e-02

Pos logit contributions
ka+0.015
tie+0.015
 name+0.015
dot+0.015
ance+0.015
Neg logit contributions
 neither-0.016
 mess-0.016
 wasteful-0.015
 fighting-0.014
 overflow-0.014
Token a
Token ID257
Feature activation+2.10e-03

Pos logit contributions
leaf+0.075
dot+0.075
ka+0.073
tie+0.073
int+0.072
Neg logit contributions
 neither-0.043
 mess-0.043
 fighting-0.037
 structures-0.036
 mice-0.036
Token beautiful
Token ID4950
Feature activation-4.68e-02

Pos logit contributions
 neither+0.004
 mess+0.004
 wasteful+0.003
 overflow+0.003
 fighting+0.003
Neg logit contributions
ka-0.003
 name-0.003
tie-0.003
dot-0.003
ance-0.003
Token picture
Token ID4286
Feature activation-1.72e-02

Pos logit contributions
ance+0.089
anced+0.085
leaf+0.084
dot+0.080
tie+0.079
Neg logit contributions
 mess-0.080
 floor-0.064
 structures-0.057
 neither-0.055
 fighting-0.054
Token of
Token ID286
Feature activation-8.76e-03

Pos logit contributions
ka+0.028
 name+0.028
dot+0.027
tie+0.027
ance+0.027
Neg logit contributions
 neither-0.027
 mess-0.027
 wasteful-0.024
 roughly-0.023
 fighting-0.023
Token him
Token ID683
Feature activation-8.60e-03

Pos logit contributions
ka+0.015
tie+0.015
 name+0.015
dot+0.015
ance+0.014
Neg logit contributions
 neither-0.013
 mess-0.013
 wasteful-0.012
 fighting-0.011
 ruining-0.011
Token,
Token ID11
Feature activation-2.99e-03

Pos logit contributions
ka+0.015
tie+0.015
dot+0.015
 name+0.015
ance+0.014
Neg logit contributions
 neither-0.013
 mess-0.013
 wasteful-0.012
 fighting-0.011
 roughly-0.011
Token.
Token ID13
Feature activation+3.77e-02

Pos logit contributions
 wasteful+0.066
 mess+0.064
 neither+0.064
 rats+0.058
 roughly+0.058
Neg logit contributions
 name-0.044
tie-0.043
 aboard-0.043
ance-0.041
ka-0.039
Token But
Token ID887
Feature activation+3.81e-02

Pos logit contributions
 mess+0.070
 wasteful+0.065
 ruining+0.062
 wasting+0.061
 mosquitoes+0.060
Neg logit contributions
 name-0.064
anced-0.050
shirt-0.050
anger-0.049
dot-0.048
Token they
Token ID484
Feature activation+3.61e-02

Pos logit contributions
 mess+0.068
 wasteful+0.060
 overflow+0.056
 fuss+0.056
 structures+0.055
Neg logit contributions
 name-0.056
 aboard-0.055
ance-0.051
tie-0.050
 shone-0.049
Token do
Token ID466
Feature activation+1.41e-02

Pos logit contributions
 neither+0.063
 mess+0.063
 wasteful+0.062
 structures+0.060
 overflow+0.058
Neg logit contributions
 name-0.055
tie-0.044
ocket-0.042
leaf-0.041
ance-0.041
Token not
Token ID407
Feature activation+9.43e-03

Pos logit contributions
 mess+0.010
 neither+0.009
 wasteful+0.009
 overflow+0.008
 structures+0.007
Neg logit contributions
 name-0.035
tie-0.033
ance-0.033
 aboard-0.033
ka-0.033
Token see
Token ID766
Feature activation+1.11e-02

Pos logit contributions
 neither+0.017
 mess+0.016
 wasteful+0.016
 overflow+0.014
 ruining+0.014
Neg logit contributions
 name-0.014
tie-0.013
ka-0.013
 aboard-0.013
dot-0.013
Token the
Token ID262
Feature activation+1.48e-02

Pos logit contributions
 mess+0.017
 neither+0.016
 wasteful+0.016
 overflow+0.015
 roughly+0.015
Neg logit contributions
 name-0.018
tie-0.018
ka-0.018
ance-0.018
 aboard-0.017
Token rock
Token ID3881
Feature activation+8.08e-03

Pos logit contributions
 neither+0.022
 wasteful+0.021
 mess+0.020
 roughly+0.019
 ruining+0.019
Neg logit contributions
 name-0.026
tie-0.025
ka-0.024
 aboard-0.024
ance-0.023
Token.
Token ID13
Feature activation+3.28e-02

Pos logit contributions
 neither+0.014
 mess+0.014
 wasteful+0.014
 overflow+0.012
 roughly+0.012
Neg logit contributions
tie-0.011
 name-0.011
ka-0.011
ance-0.011
 aboard-0.011
Token Lily
Token ID20037
Feature activation+2.57e-02

Pos logit contributions
 mess+0.064
 wasteful+0.057
 ruining+0.057
 neither+0.056
 overflow+0.053
Neg logit contributions
 name-0.050
anced-0.041
anger-0.039
tie-0.039
ance-0.039
Token trips
Token ID13229
Feature activation-1.35e-02

Pos logit contributions
 neither+0.046
 wasteful+0.045
 mess+0.043
 overflow+0.043
 ruining+0.040
Neg logit contributions
 name-0.037
tie-0.034
ance-0.033
ka-0.033
 aboard-0.032
Token he
Token ID339
Feature activation-5.68e-03

Pos logit contributions
ka+0.020
 name+0.020
tie+0.020
leaf+0.019
ance+0.019
Neg logit contributions
 neither-0.016
 mess-0.014
 wasteful-0.013
 fighting-0.012
 overflow-0.012
Token was
Token ID373
Feature activation+8.43e-04

Pos logit contributions
ka+0.009
tie+0.009
 name+0.009
ance+0.009
dot+0.009
Neg logit contributions
 neither-0.010
 mess-0.009
 wasteful-0.009
 overflow-0.008
 fighting-0.008
Token able
Token ID1498
Feature activation+2.52e-02

Pos logit contributions
 neither+0.001
 mess+0.001
 wasteful+0.001
 overflow+0.001
 fighting+0.001
Neg logit contributions
ka-0.002
tie-0.002
 name-0.002
ance-0.001
dot-0.001
Token to
Token ID284
Feature activation-1.04e-02

Pos logit contributions
 wasteful+0.057
 mess+0.056
 neither+0.056
 overflow+0.053
 roughly+0.051
Neg logit contributions
tie-0.022
 name-0.022
 aboard-0.022
ka-0.020
dot-0.020
Token leap
Token ID16470
Feature activation-1.37e-02

Pos logit contributions
ka+0.018
tie+0.017
dot+0.017
ance+0.016
shirt+0.016
Neg logit contributions
 neither-0.017
 mess-0.016
 wasteful-0.015
 overflow-0.014
 fighting-0.014
Token up
Token ID510
Feature activation+6.31e-04

Pos logit contributions
ka+0.022
 name+0.022
tie+0.022
dot+0.022
leaf+0.022
Neg logit contributions
 neither-0.022
 mess-0.021
 wasteful-0.020
 overflow-0.018
 roughly-0.018
Token into
Token ID656
Feature activation+1.31e-02

Pos logit contributions
 neither+0.001
 mess+0.001
 wasteful+0.001
 overflow+0.001
 fighting+0.001
Neg logit contributions
ka-0.001
tie-0.001
 name-0.001
ance-0.001
dot-0.001
Token the
Token ID262
Feature activation+6.99e-03

Pos logit contributions
 neither+0.021
 mess+0.020
 wasteful+0.019
 overflow+0.017
 roughly+0.017
Neg logit contributions
ka-0.021
 name-0.021
tie-0.021
 aboard-0.020
ance-0.020
Token sky
Token ID6766
Feature activation+1.14e-02

Pos logit contributions
 neither+0.012
 mess+0.011
 wasteful+0.011
 fighting+0.010
 overflow+0.010
Neg logit contributions
ka-0.011
tie-0.011
 name-0.011
ance-0.010
dot-0.010
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
 neither+0.021
 mess+0.020
 wasteful+0.020
 overflow+0.018
 ruining+0.018
Neg logit contributions
tie-0.016
 name-0.016
ka-0.016
ance-0.015
 aboard-0.015
Token When
Token ID1649
Feature activation+9.31e-03

Pos logit contributions
ka+0.020
tie+0.019
ance+0.019
dot+0.019
leaf+0.019
Neg logit contributions
 neither-0.020
 mess-0.019
 wasteful-0.018
 fighting-0.016
 roughly-0.016
Token."
Token ID526
Feature activation+6.32e-03

Pos logit contributions
 neither+0.067
 mess+0.065
 wasteful+0.065
 overflow+0.064
 roughly+0.059
Neg logit contributions
 name-0.045
tie-0.044
 aboard-0.041
ka-0.041
ance-0.041
Token\n
Token ID198
Feature activation-9.29e-03

Pos logit contributions
 neither+0.011
 mess+0.011
 wasteful+0.011
 overflow+0.010
 ruining+0.010
Neg logit contributions
 name-0.009
tie-0.009
ka-0.009
dot-0.008
ance-0.008
Token\n
Token ID198
Feature activation+7.84e-03

Pos logit contributions
ka+0.016
tie+0.016
ance+0.015
 name+0.015
dot+0.015
Neg logit contributions
 neither-0.014
 mess-0.014
 wasteful-0.013
 fighting-0.012
 roughly-0.012
TokenTheir
Token ID14574
Feature activation+1.26e-02

Pos logit contributions
 neither+0.011
 mess+0.011
 wasteful+0.010
 overflow+0.009
 ruining+0.009
Neg logit contributions
ka-0.014
tie-0.014
 name-0.014
dot-0.014
shirt-0.014
Token mom
Token ID1995
Feature activation-5.71e-03

Pos logit contributions
 neither+0.017
 wasteful+0.016
 mess+0.015
 overflow+0.014
 roughly+0.014
Neg logit contributions
 name-0.024
ka-0.023
tie-0.023
ance-0.022
dot-0.022
Token is
Token ID318
Feature activation+1.86e-02

Pos logit contributions
tie+0.008
ka+0.008
 name+0.007
ance+0.007
dot+0.007
Neg logit contributions
 mess-0.011
 neither-0.011
 wasteful-0.010
 fighting-0.010
 overflow-0.009
Token still
Token ID991
Feature activation+3.04e-05

Pos logit contributions
 neither+0.021
 mess+0.019
 wasteful+0.018
 overflow+0.017
 snakes+0.017
Neg logit contributions
 name-0.038
tie-0.037
ka-0.036
leaf-0.036
dot-0.036
Token angry
Token ID7954
Feature activation+1.46e-02

Pos logit contributions
 neither+0.000
 mess+0.000
 wasteful+0.000
 overflow+0.000
 fighting+0.000
Neg logit contributions
ka-0.000
tie-0.000
 name-0.000
ance-0.000
dot-0.000
Token.
Token ID13
Feature activation+1.29e-02

Pos logit contributions
 neither+0.027
 mess+0.026
 wasteful+0.026
 overflow+0.024
 ruining+0.023
Neg logit contributions
 name-0.019
ka-0.019
tie-0.019
ance-0.018
dot-0.018
Token She
Token ID1375
Feature activation-2.81e-02

Pos logit contributions
 neither+0.025
 mess+0.025
 wasteful+0.023
 overflow+0.022
 ruining+0.021
Neg logit contributions
 name-0.017
tie-0.015
ance-0.015
dot-0.015
ka-0.015
Token tells
Token ID4952
Feature activation+2.31e-02

Pos logit contributions
tie+0.051
ka+0.049
shirt+0.048
ance+0.047
dot+0.047
Neg logit contributions
 mess-0.048
 wasteful-0.042
 neither-0.041
 fighting-0.039
 ruining-0.038
Token,
Token ID11
Feature activation+1.96e-02

Pos logit contributions
 mess+0.045
 wasteful+0.043
 neither+0.041
 overflow+0.040
 ruining+0.038
Neg logit contributions
 name-0.031
ance-0.031
tie-0.030
ka-0.030
 aboard-0.030
Token the
Token ID262
Feature activation-1.15e-03

Pos logit contributions
 mess+0.033
 wasteful+0.032
 neither+0.031
 overflow+0.029
 ruining+0.028
Neg logit contributions
 name-0.030
 aboard-0.030
ance-0.029
tie-0.028
 Magic-0.028
Token large
Token ID1588
Feature activation-7.22e-03

Pos logit contributions
ka+0.002
tie+0.002
 name+0.002
ance+0.002
dot+0.002
Neg logit contributions
 neither-0.002
 mess-0.002
 wasteful-0.002
 overflow-0.002
 fighting-0.002
Token hammer
Token ID15554
Feature activation-1.94e-02

Pos logit contributions
tie+0.012
ka+0.012
ance+0.012
dot+0.011
 name+0.011
Neg logit contributions
 mess-0.012
 neither-0.012
 wasteful-0.011
 fighting-0.010
 overflow-0.010
Token was
Token ID373
Feature activation-1.74e-02

Pos logit contributions
ka+0.031
 name+0.031
tie+0.030
dot+0.030
leaf+0.030
Neg logit contributions
 neither-0.032
 mess-0.030
 wasteful-0.028
 fighting-0.026
 overflow-0.026
Token always
Token ID1464
Feature activation-8.16e-03

Pos logit contributions
ka+0.034
tie+0.033
 name+0.033
ance+0.033
dot+0.033
Neg logit contributions
 neither-0.023
 mess-0.021
 wasteful-0.020
 fighting-0.018
 ruining-0.018
Token ready
Token ID3492
Feature activation-3.89e-02

Pos logit contributions
ka+0.014
tie+0.014
 name+0.014
ance+0.014
dot+0.014
Neg logit contributions
 neither-0.012
 mess-0.012
 wasteful-0.011
 fighting-0.010
 overflow-0.010
Token to
Token ID284
Feature activation-1.92e-02

Pos logit contributions
ka+0.072
dot+0.067
leaf+0.066
yo+0.066
tie+0.066
Neg logit contributions
 neither-0.059
 mess-0.058
 wasteful-0.051
 overflow-0.049
 ruining-0.049
Token spray
Token ID11662
Feature activation-2.29e-02

Pos logit contributions
ka+0.031
tie+0.030
shirt+0.030
dot+0.029
leaf+0.028
Neg logit contributions
 mess-0.032
 neither-0.032
 wasteful-0.029
 overflow-0.028
 fighting-0.028
Token and
Token ID290
Feature activation+1.13e-02

Pos logit contributions
ka+0.041
tie+0.041
dot+0.041
 name+0.040
leaf+0.039
Neg logit contributions
 neither-0.033
 mess-0.033
 wasteful-0.030
 overflow-0.028
 fighting-0.027
Token protect
Token ID1805
Feature activation+4.54e-03

Pos logit contributions
 neither+0.020
 mess+0.019
 wasteful+0.018
 overflow+0.017
 roughly+0.017
Neg logit contributions
 name-0.017
tie-0.016
ka-0.016
ance-0.016
 aboard-0.016
Token king
Token ID5822
Feature activation-2.18e-02

Pos logit contributions
tie+0.065
dot+0.064
anced+0.064
ance+0.063
ka+0.063
Neg logit contributions
 mess-0.038
 neither-0.035
 wasteful-0.032
 floor-0.028
 rats-0.027
Token made
Token ID925
Feature activation-1.60e-02

Pos logit contributions
ka+0.036
dot+0.036
tie+0.036
 name+0.036
ance+0.035
Neg logit contributions
 neither-0.034
 mess-0.033
 wasteful-0.030
 fighting-0.028
 roughly-0.027
Token a
Token ID257
Feature activation-5.58e-03

Pos logit contributions
tie+0.026
ka+0.025
dot+0.024
ance+0.024
 name+0.024
Neg logit contributions
 neither-0.026
 mess-0.026
 wasteful-0.024
 fighting-0.022
 overflow-0.022
Token beautiful
Token ID4950
Feature activation-5.74e-02

Pos logit contributions
ka+0.008
tie+0.008
ance+0.008
 name+0.008
dot+0.008
Neg logit contributions
 mess-0.010
 neither-0.010
 wasteful-0.009
 overflow-0.008
 fighting-0.008
Token gray
Token ID12768
Feature activation-5.52e-02

Pos logit contributions
ance+0.096
anced+0.093
leaf+0.088
tie+0.087
aved+0.085
Neg logit contributions
 mess-0.114
 floor-0.091
 mistake-0.084
 fuss-0.083
 neither-0.082
Token dress
Token ID6576
Feature activation-3.78e-02

Pos logit contributions
ance+0.086
anced+0.082
leaf+0.080
 aboard+0.079
tie+0.079
Neg logit contributions
 mess-0.103
 floor-0.087
 neither-0.086
 fighting-0.079
 quarrel-0.078
Token for
Token ID329
Feature activation+2.56e-03

Pos logit contributions
ka+0.073
leaf+0.072
dot+0.070
anced+0.069
ance+0.069
Neg logit contributions
 neither-0.050
 mess-0.050
 overflow-0.040
 floor-0.040
 wasteful-0.039
Token her
Token ID607
Feature activation-2.55e-02

Pos logit contributions
 neither+0.004
 mess+0.004
 wasteful+0.004
 overflow+0.003
 fighting+0.003
Neg logit contributions
ka-0.004
tie-0.004
 name-0.004
ance-0.004
dot-0.004
Token.
Token ID13
Feature activation-5.98e-03

Pos logit contributions
ka+0.046
ance+0.044
tie+0.044
leaf+0.044
dot+0.044
Neg logit contributions
 mess-0.038
 neither-0.037
 wasteful-0.032
 fighting-0.031
 ruining-0.030
Token The
Token ID383
Feature activation-2.09e-02

Pos logit contributions
ka+0.010
tie+0.010
 name+0.009
ance+0.009
dot+0.009
Neg logit contributions
 neither-0.010
 mess-0.009
 wasteful-0.009
 overflow-0.008
 fighting-0.008
Token little
Token ID1310
Feature activation-1.36e-02

Pos logit contributions
tie+0.039
ance+0.039
anced+0.038
ka+0.038
dot+0.038
Neg logit contributions
 mess-0.030
 neither-0.028
 wasteful-0.026
 fighting-0.024
 floor-0.023
Token thanked
Token ID26280
Feature activation-2.03e-02

Pos logit contributions
ka+0.010
tie+0.010
 name+0.010
ance+0.009
dot+0.009
Neg logit contributions
 neither-0.010
 mess-0.010
 wasteful-0.009
 overflow-0.008
 fighting-0.008
Token her
Token ID607
Feature activation-8.00e-03

Pos logit contributions
tie+0.039
ka+0.038
shirt+0.038
dot+0.037
orn+0.036
Neg logit contributions
 neither-0.029
 mess-0.026
 fighting-0.024
 ruining-0.023
 wasteful-0.022
Token mom
Token ID1995
Feature activation-4.38e-03

Pos logit contributions
tie+0.017
ka+0.017
ance+0.017
dot+0.017
shirt+0.017
Neg logit contributions
 mess-0.009
 neither-0.009
 wasteful-0.008
 fighting-0.007
 overflow-0.006
Token for
Token ID329
Feature activation-2.10e-02

Pos logit contributions
ka+0.007
tie+0.007
 name+0.007
dot+0.007
ance+0.007
Neg logit contributions
 neither-0.007
 mess-0.007
 wasteful-0.007
 fighting-0.006
 overflow-0.006
Token letting
Token ID9616
Feature activation-2.44e-02

Pos logit contributions
ka+0.043
tie+0.042
 name+0.041
dot+0.040
shirt+0.040
Neg logit contributions
 neither-0.027
 mess-0.027
 wasteful-0.024
 ruining-0.024
 fighting-0.024
Token her
Token ID607
Feature activation-2.50e-02

Pos logit contributions
ka+0.043
tie+0.042
dot+0.040
shirt+0.039
leaf+0.039
Neg logit contributions
 neither-0.038
 mess-0.036
 wasteful-0.032
 overflow-0.031
 fighting-0.031
Token go
Token ID467
Feature activation-4.33e-02

Pos logit contributions
ka+0.043
tie+0.040
orn+0.039
dot+0.039
anced+0.039
Neg logit contributions
 neither-0.040
 mess-0.038
 wasteful-0.034
 fighting-0.034
 overflow-0.033
Token.
Token ID13
Feature activation-1.93e-02

Pos logit contributions
ka+0.083
 amulet+0.082
tie+0.081
ocket+0.081
leaf+0.080
Neg logit contributions
 mess-0.058
 neither-0.057
 fighting-0.052
 overflow-0.045
 ruining-0.045
Token
Token ID220
Feature activation-9.31e-03

Pos logit contributions
ka+0.034
tie+0.034
leaf+0.032
shirt+0.031
dot+0.031
Neg logit contributions
 neither-0.029
 mess-0.027
 wasteful-0.024
 fighting-0.024
 roughly-0.023
Token
Token ID198
Feature activation-1.43e-03

Pos logit contributions
ka+0.015
tie+0.015
 name+0.015
dot+0.015
ance+0.015
Neg logit contributions
 neither-0.015
 mess-0.014
 wasteful-0.013
 fighting-0.012
 overflow-0.012
Token
Token ID198
Feature activation+1.77e-02

Pos logit contributions
ka+0.002
tie+0.002
 name+0.002
ance+0.002
dot+0.002
Neg logit contributions
 neither-0.002
 mess-0.002
 wasteful-0.002
 overflow-0.002
 fighting-0.002
Token and
Token ID290
Feature activation-1.15e-02

Pos logit contributions
 wasteful+0.022
 neither+0.022
 mess+0.022
 overflow+0.020
 roughly+0.019
Neg logit contributions
 name-0.019
tie-0.018
 aboard-0.018
ka-0.017
ance-0.017
Token safe
Token ID3338
Feature activation+1.12e-02

Pos logit contributions
ka+0.021
tie+0.020
shirt+0.020
dot+0.020
ance+0.020
Neg logit contributions
 neither-0.017
 mess-0.016
 wasteful-0.015
 fighting-0.014
 overflow-0.014
Token at
Token ID379
Feature activation-2.22e-02

Pos logit contributions
 neither+0.019
 mess+0.019
 wasteful+0.019
 overflow+0.017
 mosquitoes+0.016
Neg logit contributions
tie-0.017
 name-0.016
ka-0.016
 aboard-0.016
ance-0.016
Token home
Token ID1363
Feature activation+3.46e-02

Pos logit contributions
ka+0.046
tie+0.046
shirt+0.045
 name+0.044
dot+0.043
Neg logit contributions
 mess-0.028
 neither-0.028
 fighting-0.025
 wasteful-0.024
 ruining-0.024
Token.
Token ID13
Feature activation-3.13e-03

Pos logit contributions
 wasteful+0.069
 neither+0.068
 mess+0.068
 mosquitoes+0.064
 overflow+0.064
Neg logit contributions
 name-0.038
tie-0.036
 aboard-0.036
ance-0.036
ka-0.034
Token<|endoftext|>
Token ID50256
Feature activation-2.76e-02

Pos logit contributions
ka+0.005
tie+0.005
 name+0.005
ance+0.005
dot+0.005
Neg logit contributions
 neither-0.005
 mess-0.005
 wasteful-0.005
 overflow-0.004
 fighting-0.004
TokenOnce
Token ID7454
Feature activation+3.21e-03

Pos logit contributions
ka+0.053
tie+0.052
ance+0.052
shirt+0.051
dot+0.051
Neg logit contributions
 neither-0.035
 mess-0.035
 fighting-0.030
 wasteful-0.029
 roughly-0.029
Token upon
Token ID2402
Feature activation-1.22e-02

Pos logit contributions
 neither+0.006
 mess+0.006
 wasteful+0.006
 overflow+0.006
 ruining+0.005
Neg logit contributions
ka-0.004
 name-0.004
tie-0.004
ance-0.004
dot-0.004
Token a
Token ID257
Feature activation+4.46e-02

Pos logit contributions
ka+0.017
tie+0.017
ance+0.016
anced+0.016
 name+0.016
Neg logit contributions
 neither-0.022
 mess-0.022
 wasteful-0.021
 fighting-0.019
 overflow-0.019
Token time
Token ID640
Feature activation+1.00e-02

Pos logit contributions
 neither+0.090
 wasteful+0.087
 mess+0.080
 ruining+0.080
 overflow+0.080
Neg logit contributions
 name-0.057
tie-0.050
dot-0.050
ka-0.050
shirt-0.049
Token,
Token ID11
Feature activation+2.57e-03

Pos logit contributions
 mess+0.016
 neither+0.016
 wasteful+0.016
 overflow+0.014
 ruining+0.014
Neg logit contributions
 name-0.016
ka-0.015
tie-0.015
 aboard-0.015
ance-0.015
Token said
Token ID531
Feature activation-1.22e-02

Pos logit contributions
ka+0.008
tie+0.008
 name+0.007
dot+0.007
ance+0.007
Neg logit contributions
 neither-0.007
 mess-0.007
 wasteful-0.007
 overflow-0.006
 fighting-0.006
Token to
Token ID284
Feature activation-6.27e-03

Pos logit contributions
ka+0.022
tie+0.022
 name+0.022
dot+0.022
leaf+0.021
Neg logit contributions
 neither-0.018
 mess-0.017
 wasteful-0.016
 fighting-0.014
 roughly-0.014
Token the
Token ID262
Feature activation+1.18e-02

Pos logit contributions
ka+0.011
tie+0.011
 name+0.011
dot+0.010
ance+0.010
Neg logit contributions
 neither-0.010
 mess-0.009
 wasteful-0.009
 fighting-0.008
 overflow-0.008
Token busy
Token ID8179
Feature activation+2.56e-02

Pos logit contributions
 neither+0.018
 mess+0.017
 wasteful+0.017
 overflow+0.015
 roughly+0.015
Neg logit contributions
 name-0.021
tie-0.019
ka-0.019
ance-0.019
leaf-0.019
Token friend
Token ID1545
Feature activation-9.50e-03

Pos logit contributions
 neither+0.042
 wasteful+0.041
 mess+0.039
 ruining+0.037
 roughly+0.036
Neg logit contributions
 name-0.042
ka-0.038
tie-0.037
 aboard-0.037
 powers-0.037
Token,
Token ID11
Feature activation-8.56e-03

Pos logit contributions
ka+0.017
tie+0.017
 name+0.016
dot+0.016
ance+0.016
Neg logit contributions
 neither-0.014
 mess-0.013
 wasteful-0.013
 fighting-0.011
 overflow-0.011
Token "
Token ID366
Feature activation-1.90e-02

Pos logit contributions
ka+0.014
tie+0.014
 name+0.013
ance+0.013
dot+0.013
Neg logit contributions
 neither-0.014
 mess-0.013
 wasteful-0.013
 fighting-0.012
 overflow-0.012
TokenIf
Token ID1532
Feature activation-6.56e-03

Pos logit contributions
ka+0.033
 name+0.033
dot+0.033
ance+0.033
tie+0.032
Neg logit contributions
 neither-0.028
 mess-0.025
 wasteful-0.025
 fighting-0.022
 roughly-0.022
Token you
Token ID345
Feature activation-2.61e-02

Pos logit contributions
ka+0.015
tie+0.015
 name+0.015
dot+0.015
ance+0.015
Neg logit contributions
 neither-0.006
 mess-0.006
 wasteful-0.005
 fighting-0.004
 overflow-0.004
Token take
Token ID1011
Feature activation-3.47e-02

Pos logit contributions
ka+0.044
dot+0.043
shirt+0.043
tie+0.042
yo+0.041
Neg logit contributions
 neither-0.041
 mess-0.040
 wasteful-0.037
 overflow-0.035
 fighting-0.034
Token a
Token ID257
Feature activation-4.34e-02

Pos logit contributions
tie+0.061
dot+0.060
ka+0.060
anger+0.059
anced+0.059
Neg logit contributions
 neither-0.050
 mess-0.050
 wasteful-0.045
 fighting-0.043
 ruining-0.041
Token she
Token ID673
Feature activation-1.68e-02

Pos logit contributions
ka+0.003
tie+0.003
 name+0.003
ance+0.003
dot+0.003
Neg logit contributions
 neither-0.003
 mess-0.003
 wasteful-0.003
 overflow-0.003
 fighting-0.003
Token wasn
Token ID2492
Feature activation+1.28e-02

Pos logit contributions
ka+0.029
tie+0.029
leaf+0.028
ance+0.028
dot+0.028
Neg logit contributions
 mess-0.026
 neither-0.025
 wasteful-0.023
 roughly-0.021
 fighting-0.021
Token't
Token ID470
Feature activation+1.82e-02

Pos logit contributions
 mess+0.018
 neither+0.018
 wasteful+0.018
 structures+0.015
 ruining+0.015
Neg logit contributions
tie-0.023
 name-0.023
ka-0.022
ance-0.022
dot-0.022
Token wearing
Token ID5762
Feature activation-4.77e-02

Pos logit contributions
 neither+0.029
 wasteful+0.027
 mess+0.026
 structures+0.025
 overflow+0.024
Neg logit contributions
ka-0.030
 name-0.029
tie-0.029
dot-0.028
 aboard-0.028
Token her
Token ID607
Feature activation-5.13e-02

Pos logit contributions
leaf+0.089
ance+0.088
anced+0.087
ka+0.084
found+0.083
Neg logit contributions
 mess-0.069
 neither-0.062
 wasteful-0.056
 ruining-0.056
 roughly-0.055
Token fancy
Token ID14996
Feature activation-7.40e-02

Pos logit contributions
leaf+0.092
ance+0.092
anced+0.086
dot+0.084
 aboard+0.083
Neg logit contributions
 mess-0.084
 neither-0.076
 floor-0.066
 not-0.066
 fighting-0.064
Token dress
Token ID6576
Feature activation-3.96e-02

Pos logit contributions
leaf+0.131
ance+0.124
year+0.115
tie+0.115
anced+0.111
Neg logit contributions
 mess-0.139
 floor-0.111
 neither-0.103
 quarrel-0.101
 fuss-0.101
Token anymore
Token ID7471
Feature activation-3.51e-03

Pos logit contributions
leaf+0.078
dot+0.074
ka+0.074
 name+0.071
 aboard+0.071
Neg logit contributions
 neither-0.055
 mess-0.053
 wasteful-0.044
 roughly-0.042
 overflow-0.042
Token.
Token ID13
Feature activation-8.94e-03

Pos logit contributions
ka+0.005
tie+0.005
 name+0.005
ance+0.005
dot+0.005
Neg logit contributions
 neither-0.006
 mess-0.006
 wasteful-0.005
 overflow-0.005
 fighting-0.005
Token She
Token ID1375
Feature activation-1.01e-02

Pos logit contributions
ka+0.015
tie+0.015
ance+0.014
leaf+0.014
dot+0.014
Neg logit contributions
 neither-0.014
 mess-0.014
 wasteful-0.013
 fighting-0.012
 roughly-0.012
Token sn
Token ID3013
Feature activation-5.35e-02

Pos logit contributions
ka+0.017
tie+0.017
 name+0.017
dot+0.017
ance+0.017
Neg logit contributions
 neither-0.015
 mess-0.015
 wasteful-0.014
 overflow-0.013
 roughly-0.013
Token gave
Token ID2921
Feature activation-7.02e-04

Pos logit contributions
tie+0.032
ka+0.032
dot+0.032
 name+0.031
ance+0.031
Neg logit contributions
 neither-0.027
 mess-0.027
 wasteful-0.024
 fighting-0.023
 overflow-0.022
Token her
Token ID607
Feature activation-5.64e-02

Pos logit contributions
ka+0.001
tie+0.001
 name+0.001
ance+0.001
dot+0.001
Neg logit contributions
 neither-0.001
 mess-0.001
 wasteful-0.001
 overflow-0.001
 fighting-0.001
Token a
Token ID257
Feature activation-7.43e-02

Pos logit contributions
ka+0.108
anced+0.107
orn+0.107
anger+0.105
tie+0.104
Neg logit contributions
 neither-0.075
 mess-0.068
 wasteful-0.060
 fighting-0.056
 both-0.055
Token y
Token ID331
Feature activation-1.00e-03

Pos logit contributions
ance+0.165
anced+0.156
ka+0.145
tie+0.141
dot+0.141
Neg logit contributions
 mess-0.099
 neither-0.074
 wasteful-0.072
 overflow-0.067
 fuss-0.066
Tokenummy
Token ID13513
Feature activation-4.43e-02

Pos logit contributions
ka+0.002
tie+0.002
 name+0.002
ance+0.002
dot+0.002
Neg logit contributions
 neither-0.001
 mess-0.001
 wasteful-0.001
 overflow-0.001
 fighting-0.001
Token l
Token ID300
Feature activation-7.07e-02

Pos logit contributions
ance+0.101
anced+0.097
tie+0.093
ka+0.092
lynn+0.091
Neg logit contributions
 mess-0.052
 neither-0.043
 wasteful-0.036
 pies-0.036
 raw-0.035
Tokenoll
Token ID692
Feature activation-6.55e-02

Pos logit contributions
 outer+0.117
 aboard+0.110
 inspect+0.108
ka+0.106
 shone+0.104
Neg logit contributions
 neither-0.105
 wasteful-0.092
 roughly-0.090
 raw-0.089
 rats-0.089
Tokenipop
Token ID42800
Feature activation-1.77e-02

Pos logit contributions
 outer+0.100
leaf+0.096
 Magic+0.093
anced+0.092
dot+0.091
Neg logit contributions
 neither-0.114
 mess-0.104
 wasteful-0.095
 rats-0.095
 raw-0.094
Token as
Token ID355
Feature activation-5.42e-02

Pos logit contributions
ka+0.031
dot+0.030
 name+0.030
ance+0.029
leaf+0.029
Neg logit contributions
 neither-0.027
 mess-0.026
 wasteful-0.023
 roughly-0.022
 fighting-0.021
Token a
Token ID257
Feature activation-5.53e-02

Pos logit contributions
ka+0.093
tie+0.093
shirt+0.091
yo+0.091
orn+0.090
Neg logit contributions
 neither-0.080
 mess-0.071
 mosquitoes-0.067
 wasteful-0.067
 rats-0.065
Token treat
Token ID2190
Feature activation-2.76e-02

Pos logit contributions
ance+0.107
anced+0.100
ka+0.096
tie+0.091
dot+0.090
Neg logit contributions
 mess-0.088
 neither-0.074
 wasteful-0.069
 timer-0.068
 fuss-0.067
Token little
Token ID1310
Feature activation-1.61e-02

Pos logit contributions
ka+0.009
tie+0.009
ance+0.009
 name+0.009
dot+0.009
Neg logit contributions
 neither-0.007
 mess-0.007
 wasteful-0.006
 fighting-0.006
 overflow-0.006
Token girl
Token ID2576
Feature activation-2.39e-02

Pos logit contributions
ance+0.031
tie+0.031
dot+0.031
ka+0.031
leaf+0.031
Neg logit contributions
 neither-0.021
 mess-0.020
 wasteful-0.018
 fighting-0.016
 overflow-0.016
Token and
Token ID290
Feature activation-3.74e-03

Pos logit contributions
ka+0.043
tie+0.042
dot+0.042
leaf+0.041
ance+0.041
Neg logit contributions
 neither-0.035
 mess-0.033
 wasteful-0.030
 fighting-0.028
 roughly-0.028
Token gave
Token ID2921
Feature activation-2.40e-02

Pos logit contributions
ka+0.007
tie+0.007
 name+0.007
dot+0.006
ance+0.006
Neg logit contributions
 neither-0.005
 mess-0.005
 wasteful-0.005
 overflow-0.004
 fighting-0.004
Token her
Token ID607
Feature activation-5.39e-02

Pos logit contributions
tie+0.041
dot+0.040
ka+0.040
 name+0.039
 amulet+0.038
Neg logit contributions
 neither-0.038
 mess-0.035
 wasteful-0.032
 fighting-0.030
 rats-0.029
Token a
Token ID257
Feature activation-6.70e-02

Pos logit contributions
ka+0.103
anger+0.102
anced+0.100
orn+0.100
ocket+0.100
Neg logit contributions
 neither-0.071
 mess-0.063
 wasteful-0.055
 mice-0.054
 Amy-0.054
Token big
Token ID1263
Feature activation-6.14e-02

Pos logit contributions
ance+0.137
anced+0.133
ka+0.125
dot+0.121
lynn+0.120
Neg logit contributions
 mess-0.095
 neither-0.075
 overflow-0.070
 wasteful-0.068
 fuss-0.066
Token smile
Token ID8212
Feature activation-6.45e-02

Pos logit contributions
ance+0.098
anced+0.088
dot+0.087
tie+0.086
ocket+0.086
Neg logit contributions
 mess-0.114
 neither-0.099
 fighting-0.091
 wasteful-0.090
 quarrel-0.090
Token before
Token ID878
Feature activation-1.58e-03

Pos logit contributions
leaf+0.126
ka+0.124
 amulet+0.124
dot+0.120
anced+0.120
Neg logit contributions
 neither-0.076
 mess-0.064
 they-0.059
 fighting-0.058
 not-0.053
Token getting
Token ID1972
Feature activation-4.06e-02

Pos logit contributions
ka+0.003
tie+0.003
 name+0.003
ance+0.003
dot+0.003
Neg logit contributions
 neither-0.002
 mess-0.002
 wasteful-0.002
 overflow-0.002
 fighting-0.002
Token back
Token ID736
Feature activation-3.45e-02

Pos logit contributions
anger+0.073
tie+0.071
dot+0.070
 amulet+0.070
ocket+0.069
Neg logit contributions
 neither-0.062
 mess-0.057
 wasteful-0.052
 fighting-0.052
 worse-0.049
Token me
Token ID502
Feature activation-5.54e-02

Pos logit contributions
ka+0.019
tie+0.019
ance+0.019
dot+0.019
 name+0.019
Neg logit contributions
 mess-0.017
 neither-0.017
 wasteful-0.016
 fighting-0.014
 overflow-0.014
Tokenadow
Token ID4584
Feature activation-1.17e-02

Pos logit contributions
leaf+0.088
yo+0.088
 aboard+0.087
 Magic+0.085
ka+0.084
Neg logit contributions
 neither-0.086
 mess-0.081
 roughly-0.077
 wasteful-0.076
 not-0.073
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
ka+0.020
tie+0.020
ance+0.020
 name+0.020
leaf+0.019
Neg logit contributions
 neither-0.018
 mess-0.017
 wasteful-0.016
 overflow-0.015
 fighting-0.015
Token In
Token ID554
Feature activation-2.13e-02

Pos logit contributions
ka+0.027
tie+0.026
ance+0.026
 aboard+0.025
leaf+0.025
Neg logit contributions
 neither-0.032
 mess-0.030
 wasteful-0.029
 roughly-0.028
 fighting-0.027
Token the
Token ID262
Feature activation-2.29e-02

Pos logit contributions
tie+0.044
ka+0.044
leaf+0.044
shirt+0.043
dot+0.042
Neg logit contributions
 neither-0.025
 mess-0.024
 wasteful-0.021
 fighting-0.020
 ruining-0.020
Token me
Token ID502
Feature activation-6.90e-02

Pos logit contributions
ka+0.046
tie+0.045
leaf+0.045
ance+0.044
dot+0.044
Neg logit contributions
 mess-0.029
 neither-0.026
 wasteful-0.024
 overflow-0.022
 floor-0.022
Tokenadow
Token ID4584
Feature activation-7.91e-03

Pos logit contributions
ka+0.106
leaf+0.105
tie+0.104
 aboard+0.103
yo+0.100
Neg logit contributions
 mess-0.105
 neither-0.102
 roughly-0.096
 not-0.094
 wasteful-0.093
Token,
Token ID11
Feature activation-6.54e-03

Pos logit contributions
ka+0.012
tie+0.012
 name+0.012
leaf+0.012
dot+0.012
Neg logit contributions
 neither-0.013
 mess-0.013
 wasteful-0.012
 overflow-0.011
 fighting-0.011
Token there
Token ID612
Feature activation-2.19e-02

Pos logit contributions
ka+0.011
tie+0.011
 name+0.011
dot+0.011
ance+0.011
Neg logit contributions
 neither-0.010
 mess-0.010
 wasteful-0.009
 fighting-0.008
 overflow-0.008
Token was
Token ID373
Feature activation-8.56e-03

Pos logit contributions
ka+0.038
leaf+0.038
tie+0.038
dot+0.036
shirt+0.036
Neg logit contributions
 neither-0.034
 mess-0.032
 fighting-0.029
 wasteful-0.028
 roughly-0.027
Token a
Token ID257
Feature activation+3.92e-03

Pos logit contributions
ka+0.012
tie+0.012
ance+0.011
 name+0.011
dot+0.011
Neg logit contributions
 neither-0.016
 mess-0.016
 wasteful-0.015
 overflow-0.014
 rats-0.014
Token any
Token ID597
Feature activation+2.05e-02

Pos logit contributions
leaf+0.067
ka+0.067
tie+0.065
shirt+0.065
dot+0.064
Neg logit contributions
 neither-0.045
 rats-0.038
 mess-0.038
 snakes-0.037
 fighting-0.033
Token honey
Token ID12498
Feature activation-2.05e-03

Pos logit contributions
 neither+0.034
 wasteful+0.032
 mess+0.030
 overflow+0.029
 ruining+0.027
Neg logit contributions
 name-0.033
tie-0.032
 powers-0.032
 amulet-0.031
 aboard-0.031
Token.
Token ID13
Feature activation-9.02e-03

Pos logit contributions
ka+0.003
tie+0.003
 name+0.003
ance+0.003
dot+0.003
Neg logit contributions
 neither-0.003
 mess-0.003
 wasteful-0.003
 overflow-0.003
 fighting-0.003
Token Then
Token ID3244
Feature activation-4.71e-03

Pos logit contributions
ka+0.014
tie+0.014
ance+0.014
dot+0.014
 name+0.014
Neg logit contributions
 neither-0.015
 mess-0.014
 wasteful-0.014
 fighting-0.013
 roughly-0.013
Token he
Token ID339
Feature activation-2.92e-02

Pos logit contributions
ka+0.008
tie+0.008
 name+0.008
dot+0.008
leaf+0.008
Neg logit contributions
 neither-0.007
 mess-0.007
 wasteful-0.006
 fighting-0.006
 overflow-0.006
Token went
Token ID1816
Feature activation-6.89e-02

Pos logit contributions
leaf+0.055
ka+0.054
ance+0.053
tie+0.052
dot+0.052
Neg logit contributions
 mess-0.040
 neither-0.039
 wasteful-0.033
 forgot-0.033
 roughly-0.033
Token to
Token ID284
Feature activation-1.65e-02

Pos logit contributions
ka+0.151
yo+0.143
anced+0.143
leaf+0.139
shirt+0.136
Neg logit contributions
 mess-0.077
 not-0.075
 soon-0.069
 after-0.065
 floor-0.061
Token a
Token ID257
Feature activation+3.35e-03

Pos logit contributions
ka+0.032
tie+0.031
shirt+0.030
dot+0.029
ance+0.029
Neg logit contributions
 mess-0.024
 neither-0.023
 wasteful-0.020
 fighting-0.020
 overflow-0.019
Token small
Token ID1402
Feature activation+1.58e-02

Pos logit contributions
 neither+0.006
 mess+0.005
 wasteful+0.005
 overflow+0.005
 fighting+0.005
Neg logit contributions
 name-0.005
tie-0.005
ka-0.005
dot-0.005
shirt-0.005
Token pond
Token ID16723
Feature activation-1.46e-02

Pos logit contributions
 neither+0.030
 wasteful+0.028
 mess+0.027
 ruining+0.025
 overflow+0.025
Neg logit contributions
 name-0.022
ka-0.022
tie-0.021
 aboard-0.021
 amulet-0.021
Token.
Token ID13
Feature activation-1.91e-02

Pos logit contributions
ka+0.026
tie+0.025
ance+0.024
dot+0.024
 name+0.024
Neg logit contributions
 neither-0.022
 mess-0.022
 wasteful-0.020
 overflow-0.019
 fighting-0.019