TOP ACTIVATIONS
MAX = 0.255

grandma 's house for safe keeping . When
it to me for safe keeping . He told me to
They picked up their nap kins and wiped their mouths and
mouths and hands . Nap kins are soft and white and
told them to use nap kins to wipe their mouths and
the leaves and the pump kins . They went outside to
have used all the nap kins in the house ! You
start to paint the pump kins . They paint
load his truck with pump kins . The farmer helped him
. They make the pump kins look very colorful and bright
gave them forks and nap kins . " Thank
listen . They thought nap kins were boring and silly .
helped him load the pump kins . The
big field with many pump kins . The farmer said they
Come and see our pump kins ! We painted them !
will stay on the pump kins ." They run
to be with other pump kins in the garden .
. They see the pump kins . They see the painting
. She said that pump kins are rare and valuable .
They quickly put their nap kins over their salad . Then

MIN ACTIVATIONS
MIN = -0.183

like he was the best pian ist in the whole world
asleep , dreaming of icy pian os and big piles of
practice and become a great pian ist so that one day
eyes and yelled , " Whe ee e !" as the
so much fun . " Whe e !" he shouted .
and slides down . " Whe e !" she says .
little girl playing a toy sax ophone . She was playing
" Welcome home , B oun cy !", they said ,
swing . " Whe ee !" said Jill .
laugh and shout , " Whe e !" They
window . " Whe e !" The little bird
noise . " Whe e !" Anna and Ben
“ Whe els . I
laughs and shouts , " Whe e ! This is fun
fast that he started to blur . He was running so
loved the sound of the sax ophone and the drums .
Everything around them was a blur , it was like the
its top speed ! " Whe e , this is fun
“ Whe ee ! ” His
laughed and shouted , " Whe e !" But when she

INTERVAL 0.146 - 0.255
CONTAINS 0.003%

as , and gir aff es . Anna and Ben were
you want to trade nap kins with me ?" she offered
Don 't hurt the pump kins ." Tim
to help fold the nap kins for dinner . Lily was
honey off . She dri es them with a towel and

INTERVAL 0.036 - 0.146
CONTAINS 7.508%

a time , in a big forest , there lived a
rang the bell again . Then she popped back through the
started to w ag with excitement ! The mom called the
and explore , so he decided to brave ly step inside
with the pencil and praised him for his master pieces .

INTERVAL -0.073 - 0.036
CONTAINS 91.729%

that we should always listen to our parents and not play
upon a time , two friends called Sally and Harry were
then the tight ness will return ." They waited
on the fox 's offer . From then on , the
and Lily agreed . They went to the table and ate

INTERVAL -0.183 - -0.073
CONTAINS 0.760%

it on their wrists and pressed the buttons . The watch
big lion . He looked at Lily with his yellow eyes
man said , " You 're welcome , little ones .
, knowing that they were both winners . <|endoftext|> Once upon
One morning he noticed a lot of traffic . There were
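
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The sketch below is only a consistency check under that assumption (the dump does not state how the bins were chosen). The helper name `interval_edges` is hypothetical, and the inputs are the rounded MIN and MAX values printed above, so the computed edges can differ from the printed ones in the last digit.

```python
# Assumption: the dashboard splits [MIN, MAX] into equal-width intervals.
# interval_edges is a hypothetical helper, not part of any dashboard tool.

def interval_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as printed in the dump (already rounded to three decimals).
edges = interval_edges(-0.183, 0.255, 4)

# Interval edges as printed in the dump, lowest first.
printed_edges = [-0.183, -0.073, 0.036, 0.146, 0.255]

# "CONTAINS" percentages as printed, ordered from lowest to highest interval.
shares = [0.760, 91.729, 7.508, 0.003]

for computed, printed in zip(edges, printed_edges):
    # Differences below 0.001 are explained by rounding in the dump.
    print(f"computed {computed:+.4f}  printed {printed:+.3f}")

print(f"total share: {sum(shares):.3f}%")
```

The four percentages add up to 100%, so the intervals appear to partition all sampled activations, with nearly all of the mass in the bin around zero.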
Token grandma
Token ID49890
Feature activation-1.75e-02

Pos logit contributions
 lear+0.005
 str+0.005
ah+0.005
itting+0.005
 forward+0.005
Neg logit contributions
keeping-0.007
kins-0.006
 snakes-0.006
 traps-0.005
bes-0.005
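
Each per-token record in this dump, such as the one for ` grandma` just above, has its labels fused to its values (`Token ID49890`, `Feature activation-1.75e-02`, ` lear+0.005`). The sketch below shows one way to split such a record back into fields. It assumes the layout seen here: three header lines, then a `Pos logit contributions` list and a `Neg logit contributions` list whose entries end in a signed decimal. The function name `parse_record` and the dictionary keys are illustrative choices, not part of any tool.

```python
import re

# One record copied from the dump; leading spaces belong to the token text.
RECORD = """Token grandma
Token ID49890
Feature activation-1.75e-02

Pos logit contributions
 lear+0.005
 str+0.005
ah+0.005
itting+0.005
 forward+0.005
Neg logit contributions
keeping-0.007
kins-0.006
 snakes-0.006
 traps-0.005
bes-0.005"""

# A contribution line is "<token text><signed decimal>", with no separator.
CONTRIB = re.compile(r"^(.*)([+-]\d+\.\d+)$")


def parse_record(text):
    """Split one fused record into token, id, activation and contributions."""
    lines = text.split("\n")
    record = {
        # Strip only the fused label; keep the token's own leading space.
        "token": lines[0][len("Token"):],
        "token_id": int(lines[1][len("Token ID"):]),
        "activation": float(lines[2][len("Feature activation"):]),
        "pos": [],
        "neg": [],
    }
    bucket = None
    for line in lines[3:]:
        if line.startswith("Pos logit"):
            bucket = "pos"
        elif line.startswith("Neg logit"):
            bucket = "neg"
        elif line and bucket:
            match = CONTRIB.match(line)
            if match:
                record[bucket].append((match.group(1), float(match.group(2))))
    return record


rec = parse_record(RECORD)
print(rec["token_id"], rec["activation"], rec["pos"][0], rec["neg"][0])
```

Keeping the leading space in the token text matters, because ` lear` and `lear` would be different vocabulary entries.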
Token's
Token ID338
Feature activation-3.56e-02

Pos logit contributions
 str+0.028
 lear+0.027
ah+0.027
 forward+0.026
 spend+0.026
Neg logit contributions
keeping-0.030
kins-0.028
 snakes-0.025
 skeletons-0.023
 iv-0.023
Token house
Token ID2156
Feature activation+2.20e-02

Pos logit contributions
 lear+0.064
 str+0.063
ah+0.062
 forward+0.059
 switched+0.058
Neg logit contributions
keeping-0.053
kins-0.050
 snakes-0.042
bes-0.038
 skeletons-0.038
Token for
Token ID329
Feature activation+1.01e-02

Pos logit contributions
keeping+0.032
kins+0.030
 snakes+0.025
bes+0.023
 traps+0.023
Neg logit contributions
 lear-0.040
 str-0.040
ah-0.039
 forward-0.037
itting-0.037
Token safe
Token ID3338
Feature activation+2.21e-02

Pos logit contributions
keeping+0.018
kins+0.018
 snakes+0.016
ls+0.015
bes+0.015
Neg logit contributions
 str-0.014
 lear-0.014
ah-0.013
 spend-0.013
 switched-0.013
Tokenkeeping
Token ID19934
Feature activation+2.48e-01

Pos logit contributions
kins+0.025
 snakes+0.020
ls+0.020
iders+0.019
keeping+0.019
Neg logit contributions
 lear-0.044
 str-0.044
 forward-0.042
 switched-0.041
ah-0.040
Token.
Token ID13
Feature activation+4.44e-03

Pos logit contributions
keeping+0.291
kins+0.289
 snakes+0.248
 traps+0.221
 skeletons+0.220
Neg logit contributions
 lear-0.501
 str-0.500
ah-0.486
 forward-0.467
 spend-0.457
Token
Token ID220
Feature activation-6.59e-03

Pos logit contributions
keeping+0.007
kins+0.006
 snakes+0.005
bes+0.005
 skeletons+0.005
Neg logit contributions
 lear-0.008
 str-0.008
ah-0.008
itting-0.007
 forward-0.007
Token
Token ID198
Feature activation+1.12e-02

Pos logit contributions
 str+0.010
itting+0.010
 forward+0.010
ah+0.010
 switched+0.010
Neg logit contributions
keeping-0.011
kins-0.010
 snakes-0.009
 rats-0.009
 traps-0.009
Token
Token ID198
Feature activation+9.12e-03

Pos logit contributions
kins+0.018
keeping+0.018
 snakes+0.016
 rats+0.015
 traps+0.015
Neg logit contributions
 str-0.018
ah-0.017
 lear-0.017
Sure-0.017
 switched-0.017
TokenWhen
Token ID2215
Feature activation-1.83e-02

Pos logit contributions
keeping+0.011
kins+0.010
bes+0.008
uts+0.008
 snakes+0.008
Neg logit contributions
ah-0.018
 lear-0.018
 str-0.017
oured-0.017
 forward-0.017
Token it
Token ID340
Feature activation+1.48e-03

Pos logit contributions
 lear+0.003
 str+0.003
ah+0.003
itting+0.003
 forward+0.003
Neg logit contributions
keeping-0.004
kins-0.004
 snakes-0.003
 traps-0.003
 skeletons-0.003
Token to
Token ID284
Feature activation-1.16e-02

Pos logit contributions
keeping+0.002
 snakes+0.002
kins+0.002
 traps+0.002
 mice+0.002
Neg logit contributions
 lear-0.003
itting-0.003
ah-0.003
 str-0.003
 blur-0.002
Token me
Token ID502
Feature activation-5.60e-02

Pos logit contributions
 lear+0.017
 str+0.017
ah+0.017
 forward+0.016
 switched+0.015
Neg logit contributions
keeping-0.021
kins-0.020
 snakes-0.017
bes-0.016
 traps-0.016
Token for
Token ID329
Feature activation+1.55e-03

Pos logit contributions
 lear+0.106
 str+0.106
ah+0.104
 forward+0.098
itting+0.097
Neg logit contributions
keeping-0.078
kins-0.072
 snakes-0.060
bes-0.055
 traps-0.054
Token safe
Token ID3338
Feature activation+1.87e-02

Pos logit contributions
keeping+0.003
kins+0.003
 snakes+0.002
ls+0.002
bes+0.002
Neg logit contributions
 str-0.002
 lear-0.002
ah-0.002
 spend-0.002
 switched-0.002
Tokenkeeping
Token ID19934
Feature activation+2.40e-01

Pos logit contributions
kins+0.018
iders+0.017
ls+0.015
uts+0.014
rob+0.014
Neg logit contributions
 lear-0.038
 str-0.036
 forward-0.035
by-0.034
 farewell-0.034
Token.
Token ID13
Feature activation-7.39e-03

Pos logit contributions
keeping+0.282
kins+0.277
 snakes+0.237
 traps+0.216
 skeletons+0.215
Neg logit contributions
 lear-0.484
 str-0.483
ah-0.465
 forward-0.442
 spend-0.434
Token He
Token ID679
Feature activation-1.12e-02

Pos logit contributions
 str+0.012
 lear+0.012
 forward+0.011
 switched+0.011
ah+0.011
Neg logit contributions
kins-0.013
keeping-0.012
 snakes-0.010
ls-0.010
bes-0.010
Token told
Token ID1297
Feature activation+9.95e-03

Pos logit contributions
 lear+0.020
ah+0.020
 str+0.020
itting+0.020
 forward+0.020
Neg logit contributions
keeping-0.016
kins-0.015
 snakes-0.013
 traps-0.012
bes-0.012
Token me
Token ID502
Feature activation-5.73e-02

Pos logit contributions
keeping+0.018
kins+0.017
 snakes+0.015
bes+0.014
 traps+0.014
Neg logit contributions
 lear-0.015
 str-0.015
ah-0.015
 forward-0.014
itting-0.013
Token to
Token ID284
Feature activation-1.36e-02

Pos logit contributions
 lear+0.089
 str+0.087
ah+0.086
itting+0.080
 forward+0.079
Neg logit contributions
keeping-0.099
kins-0.094
 snakes-0.083
 traps-0.076
 skeletons-0.076
Token They
Token ID1119
Feature activation+4.59e-03

Pos logit contributions
kins+0.050
keeping+0.048
 snakes+0.042
 traps+0.041
bes+0.040
Neg logit contributions
 lear-0.032
 str-0.029
 forward-0.028
itting-0.027
 switched-0.026
Token picked
Token ID6497
Feature activation+1.22e-02

Pos logit contributions
kins+0.006
keeping+0.006
 snakes+0.005
 cans+0.005
 mice+0.005
Neg logit contributions
itting-0.008
 lear-0.008
ah-0.008
 forward-0.008
 str-0.008
Token up
Token ID510
Feature activation+2.65e-02

Pos logit contributions
keeping+0.019
kins+0.018
 snakes+0.015
 traps+0.014
 skeletons+0.014
Neg logit contributions
 lear-0.021
 str-0.021
ah-0.021
itting-0.020
 switched-0.019
Token their
Token ID511
Feature activation+2.69e-02

Pos logit contributions
keeping+0.049
kins+0.046
 snakes+0.040
bes+0.038
 traps+0.038
Neg logit contributions
 lear-0.038
 str-0.038
ah-0.037
 forward-0.035
 switched-0.034
Token nap
Token ID25422
Feature activation+1.90e-02

Pos logit contributions
keeping+0.040
kins+0.036
 snakes+0.030
bes+0.028
 traps+0.027
Neg logit contributions
 str-0.049
 lear-0.049
ah-0.047
 forward-0.045
 switched-0.044
Tokenkins
Token ID5331
Feature activation+2.35e-01

Pos logit contributions
keeping+0.022
 snakes+0.018
 skeletons+0.016
 traps+0.015
 rats+0.015
Neg logit contributions
 str-0.038
 lear-0.037
 forward-0.036
ashed-0.035
itting-0.034
Token and
Token ID290
Feature activation+1.63e-02

Pos logit contributions
keeping+0.419
kins+0.378
 snakes+0.351
 skeletons+0.324
bes+0.320
Neg logit contributions
 str-0.363
 lear-0.353
 forward-0.341
ah-0.337
itting-0.319
Token wiped
Token ID21122
Feature activation-1.71e-02

Pos logit contributions
keeping+0.034
kins+0.032
 snakes+0.029
 skeletons+0.028
bes+0.027
Neg logit contributions
 str-0.020
 lear-0.019
ah-0.018
 switched-0.017
 spend-0.016
Token their
Token ID511
Feature activation+2.63e-02

Pos logit contributions
 lear+0.022
 str+0.022
ah+0.021
itting+0.020
 switched+0.019
Neg logit contributions
keeping-0.034
kins-0.033
 snakes-0.029
 traps-0.027
bes-0.027
Token mouths
Token ID28552
Feature activation+9.11e-03

Pos logit contributions
keeping+0.041
kins+0.036
 snakes+0.032
uts+0.031
iders+0.030
Neg logit contributions
 str-0.046
 lear-0.044
 farewell-0.041
 switched-0.041
ah-0.040
Token and
Token ID290
Feature activation+1.54e-02

Pos logit contributions
keeping+0.014
kins+0.012
 snakes+0.011
 skeletons+0.011
 rats+0.010
Neg logit contributions
 str-0.017
 lear-0.016
 forward-0.016
ah-0.015
 switched-0.015
Token mouths
Token ID28552
Feature activation+9.81e-03

Pos logit contributions
keeping+0.050
kins+0.045
uts+0.041
 snakes+0.041
iders+0.040
Neg logit contributions
 str-0.035
 lear-0.033
 switched-0.031
ah-0.031
by-0.031
Token and
Token ID290
Feature activation+1.39e-02

Pos logit contributions
keeping+0.016
kins+0.014
 snakes+0.013
 skeletons+0.013
 traps+0.012
Neg logit contributions
 str-0.017
 forward-0.015
 lear-0.015
ah-0.015
 switched-0.015
Token hands
Token ID2832
Feature activation-2.25e-02

Pos logit contributions
keeping+0.027
kins+0.026
 snakes+0.023
 skeletons+0.023
ls+0.022
Neg logit contributions
 str-0.018
 lear-0.017
 spend-0.016
 switched-0.016
ah-0.016
Token.
Token ID13
Feature activation+1.89e-02

Pos logit contributions
 lear+0.047
 str+0.046
ah+0.046
itting+0.043
 forward+0.043
Neg logit contributions
keeping-0.027
kins-0.025
 snakes-0.020
bes-0.018
 traps-0.018
Token Nap
Token ID14332
Feature activation-4.68e-02

Pos logit contributions
keeping+0.030
kins+0.027
 snakes+0.023
 traps+0.022
 skeletons+0.021
Neg logit contributions
 str-0.034
 lear-0.033
ah-0.032
 forward-0.032
 switched-0.031
Tokenkins
Token ID5331
Feature activation+2.35e-01

Pos logit contributions
 str+0.080
ah+0.079
 lear+0.078
 switched+0.073
itting+0.073
Neg logit contributions
keeping-0.075
kins-0.061
 snakes-0.057
 traps-0.053
 skeletons-0.053
Token are
Token ID389
Feature activation+9.97e-03

Pos logit contributions
keeping+0.419
kins+0.419
 snakes+0.352
bes+0.332
 iv+0.326
Neg logit contributions
 lear-0.346
 str-0.332
ah-0.331
 forward-0.308
itting-0.305
Token soft
Token ID2705
Feature activation+2.92e-02

Pos logit contributions
keeping+0.017
kins+0.016
 snakes+0.014
bes+0.013
 skeletons+0.013
Neg logit contributions
 str-0.016
 lear-0.015
ah-0.015
 forward-0.014
 switched-0.014
Token and
Token ID290
Feature activation+1.53e-02

Pos logit contributions
keeping+0.058
kins+0.055
 snakes+0.049
iders+0.048
 skeletons+0.046
Neg logit contributions
 str-0.034
 lear-0.033
ah-0.030
 forward-0.030
 pract-0.028
Token white
Token ID2330
Feature activation+3.19e-02

Pos logit contributions
keeping+0.029
kins+0.024
 snakes+0.022
orns+0.021
 situations+0.021
Neg logit contributions
 str-0.023
 switched-0.021
ah-0.021
 spend-0.021
 lear-0.021
Token and
Token ID290
Feature activation+1.37e-02

Pos logit contributions
keeping+0.053
kins+0.050
 snakes+0.042
 traps+0.041
ls+0.041
Neg logit contributions
 lear-0.048
 str-0.046
ah-0.046
 wait-0.042
 spend-0.041
Token told
Token ID1297
Feature activation+2.69e-02

Pos logit contributions
kins+0.017
keeping+0.014
,"+0.012
 bush+0.012
 snakes+0.012
Neg logit contributions
 lear-0.021
 forward-0.020
itting-0.020
 farewell-0.020
ah-0.019
Token them
Token ID606
Feature activation+4.94e-02

Pos logit contributions
keeping+0.046
kins+0.043
 snakes+0.039
 mice+0.036
 traps+0.036
Neg logit contributions
 lear-0.042
ah-0.041
 str-0.041
itting-0.039
 forward-0.037
Token to
Token ID284
Feature activation+7.40e-03

Pos logit contributions
keeping+0.081
kins+0.078
 snakes+0.071
 skeletons+0.064
uts+0.064
Neg logit contributions
 lear-0.081
ah-0.079
 str-0.078
itting-0.073
 forward-0.073
Token use
Token ID779
Feature activation+1.87e-02

Pos logit contributions
keeping+0.015
kins+0.015
 snakes+0.013
bes+0.012
 iv+0.012
Neg logit contributions
 lear-0.009
ah-0.009
 str-0.008
itting-0.008
 forward-0.008
Token nap
Token ID25422
Feature activation+1.91e-02

Pos logit contributions
keeping+0.032
kins+0.031
 snakes+0.027
 traps+0.025
bes+0.025
Neg logit contributions
 lear-0.029
 str-0.029
ah-0.028
itting-0.026
 forward-0.026
Tokenkins
Token ID5331
Feature activation+2.34e-01

Pos logit contributions
keeping+0.020
 snakes+0.014
 skeletons+0.012
orns+0.012
handed+0.011
Neg logit contributions
 str-0.042
 forward-0.042
 lear-0.040
 wait-0.039
itting-0.038
Token to
Token ID284
Feature activation+6.23e-03

Pos logit contributions
keeping+0.416
kins+0.387
 snakes+0.345
bes+0.320
 skeletons+0.320
Neg logit contributions
 str-0.350
 lear-0.348
ah-0.342
 forward-0.322
itting-0.316
Token wipe
Token ID19916
Feature activation-3.60e-02

Pos logit contributions
keeping+0.014
kins+0.013
 snakes+0.011
 skeletons+0.011
uts+0.011
Neg logit contributions
 str-0.007
 spend-0.007
 lear-0.007
ah-0.007
 wait-0.006
Token their
Token ID511
Feature activation+2.68e-02

Pos logit contributions
 str+0.041
 forward+0.040
ah+0.040
 lear+0.038
 switched+0.036
Neg logit contributions
keeping-0.076
kins-0.070
 snakes-0.064
 skeletons-0.062
uts-0.062
Token mouths
Token ID28552
Feature activation+9.81e-03

Pos logit contributions
keeping+0.050
kins+0.045
uts+0.041
 snakes+0.041
iders+0.040
Neg logit contributions
 str-0.035
 lear-0.033
 switched-0.031
ah-0.031
by-0.031
Token and
Token ID290
Feature activation+1.39e-02

Pos logit contributions
keeping+0.016
kins+0.014
 snakes+0.013
 skeletons+0.013
 traps+0.012
Neg logit contributions
 str-0.017
 forward-0.015
 lear-0.015
ah-0.015
 switched-0.015
Token the
Token ID262
Feature activation+2.99e-02

Pos logit contributions
keeping+0.054
kins+0.051
 snakes+0.044
bes+0.041
 traps+0.041
Neg logit contributions
 str-0.046
 lear-0.046
ah-0.045
 forward-0.041
 switched-0.041
Token leaves
Token ID5667
Feature activation+4.99e-02

Pos logit contributions
keeping+0.046
kins+0.043
 snakes+0.035
bes+0.034
uts+0.033
Neg logit contributions
 lear-0.053
 str-0.052
ah-0.051
 forward-0.048
 switched-0.048
Token and
Token ID290
Feature activation+1.38e-02

Pos logit contributions
keeping+0.088
kins+0.077
 snakes+0.069
 skeletons+0.069
 traps+0.066
Neg logit contributions
 str-0.076
ah-0.075
 lear-0.072
 forward-0.071
 switched-0.068
Token the
Token ID262
Feature activation+2.94e-02

Pos logit contributions
keeping+0.025
kins+0.024
 snakes+0.020
bes+0.020
uts+0.019
Neg logit contributions
 str-0.021
 lear-0.020
ah-0.019
 switched-0.019
 spend-0.018
Token pump
Token ID8901
Feature activation+1.81e-02

Pos logit contributions
keeping+0.042
kins+0.039
 snakes+0.031
bes+0.030
uts+0.030
Neg logit contributions
 lear-0.055
 str-0.054
ah-0.053
 forward-0.050
 switched-0.050
Tokenkins
Token ID5331
Feature activation+2.33e-01

Pos logit contributions
keeping+0.015
 skeletons+0.010
 traps+0.009
 iv+0.009
 situations+0.008
Neg logit contributions
ah-0.043
ee-0.039
 str-0.039
by-0.039
 spent-0.037
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
keeping+0.295
kins+0.277
 snakes+0.222
bes+0.200
 traps+0.197
Neg logit contributions
 lear-0.470
 str-0.465
ah-0.461
 forward-0.434
itting-0.433
Token They
Token ID1119
Feature activation+5.08e-03

Pos logit contributions
keeping+0.018
kins+0.017
 snakes+0.013
bes+0.012
 iv+0.012
Neg logit contributions
 lear-0.030
 str-0.030
ah-0.030
itting-0.027
 switched-0.027
Token went
Token ID1816
Feature activation+9.23e-03

Pos logit contributions
kins+0.007
keeping+0.006
 cans+0.006
 floor+0.006
 pieces+0.005
Neg logit contributions
 farewell-0.009
itting-0.009
 lear-0.009
ahoo-0.009
 forward-0.009
Token outside
Token ID2354
Feature activation-1.29e-02

Pos logit contributions
keeping+0.016
kins+0.015
 snakes+0.013
ls+0.013
 skeletons+0.012
Neg logit contributions
 lear-0.014
ah-0.014
 str-0.014
itting-0.013
 switched-0.012
Token to
Token ID284
Feature activation+4.73e-03

Pos logit contributions
 lear+0.021
 str+0.021
ah+0.020
 forward+0.019
itting+0.019
Neg logit contributions
keeping-0.022
kins-0.020
 snakes-0.018
bes-0.016
 traps-0.016
Token have
Token ID423
Feature activation-4.15e-03

Pos logit contributions
 lear+0.088
ah+0.086
 str+0.084
itting+0.082
 forward+0.082
Neg logit contributions
keeping-0.089
kins-0.084
 snakes-0.073
 skeletons-0.068
 traps-0.067
Token used
Token ID973
Feature activation+4.34e-03

Pos logit contributions
ah+0.007
 lear+0.007
itting+0.007
 forward+0.006
 str+0.006
Neg logit contributions
keeping-0.007
kins-0.006
 snakes-0.006
 trouble-0.005
 rats-0.005
Token all
Token ID477
Feature activation-3.89e-02

Pos logit contributions
keeping+0.007
kins+0.007
 snakes+0.006
 traps+0.005
 skeletons+0.005
Neg logit contributions
 lear-0.007
ah-0.007
 str-0.007
itting-0.007
 forward-0.007
Token the
Token ID262
Feature activation+2.44e-02

Pos logit contributions
 lear+0.066
ah+0.064
 str+0.064
itting+0.060
 forward+0.060
Neg logit contributions
keeping-0.063
kins-0.059
 snakes-0.053
 traps-0.048
 skeletons-0.047
Token nap
Token ID25422
Feature activation+1.98e-02

Pos logit contributions
keeping+0.038
kins+0.035
 snakes+0.029
bes+0.028
uts+0.027
Neg logit contributions
 str-0.042
 lear-0.042
ah-0.041
 forward-0.039
 switched-0.038
Tokenkins
Token ID5331
Feature activation+2.33e-01

Pos logit contributions
keeping+0.018
 snakes+0.014
 skeletons+0.012
 traps+0.012
 rats+0.012
Neg logit contributions
 str-0.042
ee-0.041
 forward-0.041
 lear-0.041
ah-0.040
Token in
Token ID287
Feature activation+2.46e-02

Pos logit contributions
keeping+0.364
kins+0.335
 snakes+0.298
bes+0.274
 skeletons+0.272
Neg logit contributions
 str-0.394
 lear-0.388
ah-0.379
 forward-0.374
itting-0.360
Token the
Token ID262
Feature activation+2.27e-02

Pos logit contributions
keeping+0.048
kins+0.046
 snakes+0.041
iders+0.040
uts+0.040
Neg logit contributions
 str-0.032
 lear-0.028
 spend-0.027
ah-0.026
 practice-0.026
Token house
Token ID2156
Feature activation+3.49e-02

Pos logit contributions
keeping+0.031
kins+0.029
 snakes+0.024
bes+0.022
 traps+0.022
Neg logit contributions
 lear-0.043
 str-0.043
ah-0.042
 forward-0.040
 switched-0.040
Token!
Token ID0
Feature activation-3.55e-02

Pos logit contributions
keeping+0.044
kins+0.042
 snakes+0.036
 skeletons+0.033
 traps+0.032
Neg logit contributions
 lear-0.067
 str-0.067
ah-0.065
 forward-0.065
 switched-0.061
Token You
Token ID921
Feature activation-5.80e-02

Pos logit contributions
 lear+0.059
 str+0.058
ah+0.056
 forward+0.053
 switched+0.053
Neg logit contributions
keeping-0.057
kins-0.057
 snakes-0.047
ls-0.044
 traps-0.044
Token start
Token ID923
Feature activation-2.96e-02

Pos logit contributions
kins+0.001
 cans+0.001
 snakes+0.001
 floor+0.001
keeping+0.001
Neg logit contributions
itting-0.002
ah-0.001
 farewell-0.001
ahoo-0.001
OO-0.001
Token to
Token ID284
Feature activation+4.31e-03

Pos logit contributions
ah+0.047
 lear+0.046
 forward+0.043
 str+0.043
itting+0.042
Neg logit contributions
keeping-0.048
kins-0.046
 snakes-0.042
 floor-0.038
bes-0.038
Token paint
Token ID7521
Feature activation-4.24e-03

Pos logit contributions
keeping+0.007
kins+0.007
 snakes+0.006
 floor+0.006
bes+0.006
Neg logit contributions
 forward-0.006
ah-0.006
 lear-0.006
itting-0.006
 farewell-0.006
Token the
Token ID262
Feature activation+1.77e-02

Pos logit contributions
 str+0.006
ah+0.006
 lear+0.006
 forward+0.006
 switched+0.006
Neg logit contributions
keeping-0.008
kins-0.007
 snakes-0.006
bes-0.006
 traps-0.006
Token pump
Token ID8901
Feature activation+1.40e-02

Pos logit contributions
keeping+0.026
kins+0.025
 snakes+0.021
bes+0.019
 traps+0.019
Neg logit contributions
 lear-0.032
 str-0.032
ah-0.031
 forward-0.029
itting-0.029
Tokenkins
Token ID5331
Feature activation+2.32e-01

Pos logit contributions
keeping+0.011
 iv+0.006
 skeletons+0.006
 rats+0.005
 snakes+0.005
Neg logit contributions
ah-0.035
 str-0.033
 forward-0.033
ee-0.032
 lear-0.032
Token.
Token ID13
Feature activation+1.85e-02

Pos logit contributions
keeping+0.356
kins+0.327
 snakes+0.279
bes+0.257
 skeletons+0.257
Neg logit contributions
 str-0.409
 lear-0.408
ah-0.400
 forward-0.381
itting-0.375
Token
Token ID198
Feature activation+2.81e-02

Pos logit contributions
keeping+0.031
kins+0.028
 snakes+0.024
 traps+0.022
bes+0.022
Neg logit contributions
 lear-0.031
 str-0.031
ah-0.030
 forward-0.029
itting-0.029
Token
Token ID198
Feature activation+2.59e-02

Pos logit contributions
keeping+0.055
kins+0.049
 snakes+0.044
 situations+0.043
 skeletons+0.042
Neg logit contributions
 str-0.040
 switched-0.036
ee-0.036
ah-0.036
 lear-0.036
TokenThey
Token ID2990
Feature activation-5.77e-03

Pos logit contributions
kins+0.035
keeping+0.032
 snakes+0.026
bes+0.025
 cans+0.024
Neg logit contributions
ah-0.052
 lear-0.051
 str-0.049
oured-0.047
 farewell-0.046
Token paint
Token ID7521
Feature activation-4.45e-03

Pos logit contributions
itting+0.010
ah+0.010
 farewell+0.009
ahoo+0.009
oured+0.009
Neg logit contributions
kins-0.008
 floor-0.007
 snakes-0.006
 cans-0.006
keeping-0.006
Token load
Token ID3440
Feature activation+1.06e-02

Pos logit contributions
 lear+0.001
 str+0.001
ah+0.001
 forward+0.001
 switched+0.001
Neg logit contributions
keeping-0.003
kins-0.003
 snakes-0.002
bes-0.002
 traps-0.002
Token his
Token ID465
Feature activation+6.49e-03

Pos logit contributions
keeping+0.021
kins+0.019
bes+0.018
 snakes+0.017
iders+0.017
Neg logit contributions
 forward-0.014
 str-0.013
ah-0.013
 lear-0.012
 switched-0.011
Token truck
Token ID7779
Feature activation-2.32e-02

Pos logit contributions
keeping+0.011
kins+0.010
 snakes+0.008
bes+0.008
 skeletons+0.008
Neg logit contributions
 lear-0.010
 str-0.010
ah-0.010
 forward-0.010
 switched-0.010
Token with
Token ID351
Feature activation+2.68e-02

Pos logit contributions
 str+0.041
 lear+0.040
ah+0.040
 forward+0.038
 switched+0.037
Neg logit contributions
keeping-0.035
kins-0.033
 snakes-0.028
bes-0.025
 skeletons-0.025
Token pump
Token ID8901
Feature activation+1.05e-02

Pos logit contributions
keeping+0.041
kins+0.038
 snakes+0.033
 traps+0.030
 skeletons+0.030
Neg logit contributions
 lear-0.048
 str-0.047
ah-0.047
itting-0.044
 forward-0.043
Tokenkins
Token ID5331
Feature activation+2.31e-01

Pos logit contributions
keeping+0.012
 snakes+0.011
iders+0.010
 iv+0.010
 spiders+0.010
Neg logit contributions
ah-0.020
 forward-0.019
by-0.019
 str-0.019
ee-0.018
Token.
Token ID13
Feature activation+1.31e-02

Pos logit contributions
keeping+0.263
kins+0.241
 snakes+0.189
bes+0.166
 traps+0.165
Neg logit contributions
 lear-0.497
 str-0.494
ah-0.489
 forward-0.463
itting-0.462
Token The
Token ID383
Feature activation+4.99e-03

Pos logit contributions
keeping+0.019
kins+0.018
 snakes+0.016
 traps+0.014
 skeletons+0.014
Neg logit contributions
 lear-0.024
 str-0.023
ah-0.023
itting-0.022
 forward-0.022
Token farmer
Token ID18739
Feature activation-2.91e-03

Pos logit contributions
keeping+0.008
kins+0.007
bes+0.006
 snakes+0.006
uts+0.006
Neg logit contributions
 lear-0.009
 str-0.009
ah-0.009
 forward-0.008
 switched-0.008
Token helped
Token ID4193
Feature activation+2.07e-02

Pos logit contributions
 lear+0.005
 str+0.005
ah+0.005
itting+0.005
 forward+0.005
Neg logit contributions
kins-0.004
keeping-0.004
 snakes-0.004
 skeletons-0.003
 traps-0.003
Token him
Token ID683
Feature activation+5.41e-02

Pos logit contributions
keeping+0.041
kins+0.039
bes+0.034
 snakes+0.033
uts+0.032
Neg logit contributions
 str-0.028
 lear-0.027
ah-0.026
 forward-0.025
 spend-0.024
Token.
Token ID13
Feature activation+1.12e-02

Pos logit contributions
keeping+0.039
kins+0.035
 snakes+0.030
 traps+0.028
 skeletons+0.028
Neg logit contributions
 str-0.045
 lear-0.044
ah-0.044
 forward-0.042
itting-0.041
Token They
Token ID1119
Feature activation-3.51e-03

Pos logit contributions
keeping+0.019
kins+0.018
 snakes+0.015
 traps+0.014
bes+0.014
Neg logit contributions
 lear-0.019
 str-0.018
ah-0.017
 forward-0.017
itting-0.017
Token make
Token ID787
Feature activation+2.53e-02

Pos logit contributions
ah+0.006
itting+0.006
 lear+0.006
 farewell+0.006
 forward+0.006
Neg logit contributions
kins-0.005
keeping-0.004
 snakes-0.004
 floor-0.004
 cans-0.004
Token the
Token ID262
Feature activation+1.61e-02

Pos logit contributions
keeping+0.038
kins+0.036
 snakes+0.031
 traps+0.028
bes+0.028
Neg logit contributions
 lear-0.045
ah-0.044
 str-0.044
itting-0.041
 forward-0.041
Token pump
Token ID8901
Feature activation+1.41e-02

Pos logit contributions
keeping+0.023
kins+0.022
 snakes+0.019
 traps+0.017
 skeletons+0.017
Neg logit contributions
 lear-0.029
 str-0.029
ah-0.029
itting-0.027
 forward-0.027
Tokenkins
Token ID5331
Feature activation+2.31e-01

Pos logit contributions
keeping+0.011
 skeletons+0.006
 iv+0.006
 snakes+0.005
 rats+0.005
Neg logit contributions
ah-0.035
 str-0.034
 forward-0.034
 lear-0.033
 switched-0.033
Token look
Token ID804
Feature activation+1.41e-02

Pos logit contributions
kins+0.406
keeping+0.395
 snakes+0.339
 traps+0.313
ls+0.312
Neg logit contributions
 lear-0.356
ah-0.349
 str-0.334
itting-0.313
 forward-0.299
Token very
Token ID845
Feature activation-1.59e-03

Pos logit contributions
kins+0.022
keeping+0.022
 snakes+0.019
ls+0.017
 traps+0.017
Neg logit contributions
 lear-0.024
ah-0.023
 str-0.023
itting-0.022
 switched-0.021
Token colorful
Token ID20239
Feature activation+2.91e-02

Pos logit contributions
 lear+0.003
 str+0.003
ah+0.003
 forward+0.002
itting+0.002
Neg logit contributions
keeping-0.003
kins-0.002
 snakes-0.002
bes-0.002
 traps-0.002
Token and
Token ID290
Feature activation+1.19e-02

Pos logit contributions
keeping+0.050
kins+0.041
 snakes+0.038
bes+0.038
 traps+0.037
Neg logit contributions
 str-0.044
 lear-0.043
 forward-0.043
ah-0.040
 switched-0.039
Token bright
Token ID6016
Feature activation-6.04e-03

Pos logit contributions
keeping+0.022
kins+0.019
 snakes+0.018
orns+0.017
 iv+0.017
Neg logit contributions
 str-0.018
 lear-0.017
 switched-0.016
 forward-0.016
 spend-0.016
Token gave
Token ID2921
Feature activation+2.53e-03

Pos logit contributions
 lear+0.006
 str+0.006
ah+0.006
 forward+0.006
itting+0.006
Neg logit contributions
keeping-0.010
kins-0.009
 snakes-0.008
bes-0.008
 traps-0.008
Token them
Token ID606
Feature activation+4.15e-02

Pos logit contributions
keeping+0.005
kins+0.004
 snakes+0.004
 skeletons+0.004
 traps+0.004
Neg logit contributions
 lear-0.004
 str-0.004
ah-0.004
itting-0.003
 switched-0.003
Token forks
Token ID43378
Feature activation+3.72e-02

Pos logit contributions
kins+0.050
 snakes+0.047
keeping+0.046
 cans+0.045
 mice+0.045
Neg logit contributions
 lear-0.082
itting-0.080
ah-0.079
 str-0.075
 blur-0.074
Token and
Token ID290
Feature activation+5.09e-03

Pos logit contributions
keeping+0.060
kins+0.055
 snakes+0.048
bes+0.044
 skeletons+0.044
Neg logit contributions
 str-0.062
 lear-0.062
ah-0.061
 forward-0.058
 switched-0.056
Token nap
Token ID25422
Feature activation+1.25e-02

Pos logit contributions
keeping+0.007
kins+0.007
 snakes+0.005
bes+0.005
uts+0.005
Neg logit contributions
 str-0.010
 lear-0.009
ah-0.009
 switched-0.009
 spend-0.009
Tokenkins
Token ID5331
Feature activation+2.31e-01

Pos logit contributions
keeping+0.014
 skeletons+0.011
 snakes+0.011
 situations+0.011
orns+0.010
Neg logit contributions
 str-0.026
 lear-0.024
 forward-0.024
 wait-0.024
 spent-0.023
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
keeping+0.265
kins+0.241
 snakes+0.193
bes+0.169
 traps+0.169
Neg logit contributions
 lear-0.491
 str-0.490
ah-0.483
 forward-0.459
itting-0.456
Token
Token ID198
Feature activation+2.04e-02

Pos logit contributions
keeping+0.020
kins+0.019
 snakes+0.015
 traps+0.015
 situations+0.015
Neg logit contributions
 lear-0.015
 str-0.015
 forward-0.014
itting-0.014
Sure-0.013
Token
Token ID198
Feature activation+1.88e-02

Pos logit contributions
keeping+0.045
kins+0.042
 situations+0.037
 snakes+0.036
 traps+0.035
Neg logit contributions
 str-0.025
Sure-0.023
ee-0.022
ah-0.021
 lear-0.021
Token"
Token ID1
Feature activation-2.62e-02

Pos logit contributions
keeping+0.029
kins+0.029
 snakes+0.024
bes+0.022
 traps+0.022
Neg logit contributions
 lear-0.032
ah-0.032
 str-0.031
 forward-0.029
 switched-0.029
TokenThank
Token ID10449
Feature activation-1.55e-02

Pos logit contributions
 lear+0.039
 str+0.039
ah+0.038
 forward+0.036
itting+0.036
Neg logit contributions
keeping-0.046
kins-0.044
 snakes-0.038
bes-0.036
 iv-0.035
Token listen
Token ID6004
Feature activation+1.50e-02

Pos logit contributions
keeping+0.105
kins+0.101
 snakes+0.091
 iv+0.086
 truth+0.084
Neg logit contributions
 lear-0.072
oured-0.071
ah-0.070
 str-0.068
itting-0.068
Token.
Token ID13
Feature activation+1.57e-02

Pos logit contributions
keeping+0.025
kins+0.023
 snakes+0.020
bes+0.018
 traps+0.018
Neg logit contributions
 str-0.025
 lear-0.025
ah-0.024
 forward-0.023
itting-0.023
Token They
Token ID1119
Feature activation+9.61e-04

Pos logit contributions
keeping+0.024
kins+0.022
 snakes+0.019
bes+0.017
 traps+0.017
Neg logit contributions
 lear-0.028
 str-0.028
ah-0.027
 forward-0.026
 switched-0.026
Token thought
Token ID1807
Feature activation+3.23e-02

Pos logit contributions
kins+0.002
keeping+0.002
 snakes+0.001
 bush+0.001
 rats+0.001
Neg logit contributions
 lear-0.001
 forward-0.001
itting-0.001
ah-0.001
 str-0.001
Token nap
Token ID25422
Feature activation+1.46e-02

Pos logit contributions
keeping+0.057
kins+0.054
 snakes+0.048
bes+0.044
 traps+0.044
Neg logit contributions
 lear-0.049
 str-0.048
ah-0.048
 forward-0.044
itting-0.044
Tokenkins
Token ID5331
Feature activation+2.30e-01

Pos logit contributions
keeping+0.017
 snakes+0.010
handed+0.010
orns+0.009
 skeletons+0.009
Neg logit contributions
 str-0.030
 forward-0.030
 lear-0.029
ah-0.029
 switched-0.029
Token were
Token ID547
Feature activation+2.47e-02

Pos logit contributions
keeping+0.438
kins+0.400
 snakes+0.360
 skeletons+0.341
bes+0.341
Neg logit contributions
 str-0.322
ah-0.311
 lear-0.309
 forward-0.293
 switched-0.290
Token boring
Token ID14262
Feature activation+6.43e-03

Pos logit contributions
keeping+0.041
kins+0.038
 snakes+0.032
bes+0.031
 skeletons+0.030
Neg logit contributions
 str-0.040
 lear-0.039
ah-0.039
 switched-0.037
 forward-0.037
Token and
Token ID290
Feature activation+7.92e-03

Pos logit contributions
keeping+0.011
kins+0.010
 snakes+0.009
bes+0.008
 traps+0.008
Neg logit contributions
 lear-0.010
 str-0.010
 forward-0.009
ah-0.009
itting-0.009
Token silly
Token ID14397
Feature activation+1.32e-02

Pos logit contributions
keeping+0.013
kins+0.013
 snakes+0.011
 traps+0.010
 skeletons+0.010
Neg logit contributions
 lear-0.013
 str-0.012
ah-0.012
itting-0.012
 forward-0.012
Token.
Token ID13
Feature activation+1.29e-02

Pos logit contributions
keeping+0.017
kins+0.015
 snakes+0.012
bes+0.012
 traps+0.011
Neg logit contributions
 lear-0.027
 str-0.026
ah-0.025
 forward-0.025
 switched-0.024
Token helped
Token ID4193
Feature activation+2.07e-02

Pos logit contributions
 lear+0.005
 str+0.005
ah+0.005
itting+0.005
 forward+0.005
Neg logit contributions
kins-0.004
keeping-0.004
 snakes-0.004
 skeletons-0.003
 traps-0.003
Token him
Token ID683
Feature activation+5.41e-02

Pos logit contributions
keeping+0.041
kins+0.039
bes+0.034
 snakes+0.033
uts+0.032
Neg logit contributions
 str-0.028
 lear-0.027
ah-0.026
 forward-0.025
 spend-0.024
Token load
Token ID3440
Feature activation+1.03e-02

Pos logit contributions
keeping+0.099
kins+0.095
 snakes+0.082
 traps+0.077
bes+0.076
Neg logit contributions
 lear-0.080
ah-0.078
 str-0.076
itting-0.073
 forward-0.070
Token the
Token ID262
Feature activation+1.73e-02

Pos logit contributions
keeping+0.019
kins+0.018
 snakes+0.016
bes+0.016
 iv+0.015
Neg logit contributions
 str-0.014
ah-0.014
 lear-0.014
 forward-0.014
 switched-0.013
Token pump
Token ID8901
Feature activation+1.10e-02

Pos logit contributions
keeping+0.029
kins+0.027
 snakes+0.023
bes+0.022
 skeletons+0.022
Neg logit contributions
 str-0.028
 lear-0.028
ah-0.027
 forward-0.026
 switched-0.025
Tokenkins
Token ID5331
Feature activation+2.30e-01

Pos logit contributions
keeping+0.012
 iv+0.010
 snakes+0.010
 skeletons+0.010
 spiders+0.009
Neg logit contributions
 forward-0.022
ah-0.022
 str-0.020
ee-0.020
 lear-0.020
Token.
Token ID13
Feature activation+6.89e-03

Pos logit contributions
keeping+0.397
kins+0.374
 snakes+0.326
bes+0.303
 traps+0.301
Neg logit contributions
 lear-0.356
 str-0.356
ah-0.349
 forward-0.327
itting-0.322
Token
Token ID220
Feature activation-2.19e-03

Pos logit contributions
keeping+0.011
kins+0.010
 snakes+0.008
bes+0.008
 traps+0.008
Neg logit contributions
 lear-0.012
 str-0.012
ah-0.012
itting-0.011
 forward-0.011
Token\n
Token ID198
Feature activation+1.62e-02

Pos logit contributions
 forward+0.003
itting+0.003
 str+0.003
ipp+0.002
ah+0.002
Neg logit contributions
keeping-0.004
kins-0.004
 snakes-0.004
 traps-0.004
 iv-0.004
Token\n
Token ID198
Feature activation+1.44e-02

Pos logit contributions
keeping+0.030
kins+0.028
 snakes+0.025
 rats+0.025
 traps+0.024
Neg logit contributions
 str-0.025
ah-0.023
 lear-0.022
Sure-0.022
ee-0.022
TokenThe
Token ID464
Feature activation+1.41e-02

Pos logit contributions
kins+0.021
keeping+0.021
 snakes+0.018
bes+0.017
 mice+0.017
Neg logit contributions
ah-0.024
 lear-0.024
 str-0.023
itting-0.022
oured-0.022
Token big
Token ID1263
Feature activation+4.76e-02

Pos logit contributions
 lear+0.010
 str+0.010
ah+0.010
 forward+0.009
 switched+0.009
Neg logit contributions
keeping-0.008
kins-0.008
 snakes-0.006
bes-0.006
 traps-0.006
Token field
Token ID2214
Feature activation+3.69e-02

Pos logit contributions
keeping+0.072
kins+0.067
 snakes+0.055
bes+0.052
 traps+0.051
Neg logit contributions
 lear-0.085
 str-0.084
ah-0.084
 forward-0.078
 switched-0.077
Token with
Token ID351
Feature activation+3.27e-02

Pos logit contributions
keeping+0.062
kins+0.057
 snakes+0.050
bes+0.046
 traps+0.046
Neg logit contributions
 str-0.059
 lear-0.059
ah-0.058
 forward-0.055
itting-0.055
Token many
Token ID867
Feature activation+2.04e-02

Pos logit contributions
keeping+0.056
kins+0.053
 snakes+0.043
bes+0.043
iders+0.041
Neg logit contributions
 lear-0.051
 str-0.051
ah-0.049
 switched-0.046
 forward-0.046
Token pump
Token ID8901
Feature activation+1.25e-02

Pos logit contributions
keeping+0.034
kins+0.026
iders+0.023
bes+0.023
handed+0.023
Neg logit contributions
 str-0.035
 lear-0.034
ah-0.033
 practice-0.032
 padd-0.032
Tokenkins
Token ID5331
Feature activation+2.30e-01

Pos logit contributions
keeping+0.010
iders+0.006
 iv+0.006
 skeletons+0.006
 situations+0.006
Neg logit contributions
ah-0.029
by-0.027
ee-0.027
 padd-0.026
 str-0.026
Token.
Token ID13
Feature activation+2.66e-02

Pos logit contributions
keeping+0.377
kins+0.327
 snakes+0.308
bes+0.287
 situations+0.287
Neg logit contributions
 str-0.371
 forward-0.361
 lear-0.354
itting-0.348
ah-0.347
Token The
Token ID383
Feature activation+9.92e-03

Pos logit contributions
keeping+0.058
kins+0.054
bes+0.046
 situations+0.045
 happen+0.044
Neg logit contributions
 lear-0.030
 forward-0.027
 str-0.027
 While-0.027
Sure-0.026
Token farmer
Token ID18739
Feature activation-1.71e-04

Pos logit contributions
keeping+0.014
kins+0.013
 snakes+0.011
bes+0.010
uts+0.010
Neg logit contributions
 lear-0.019
 str-0.019
ah-0.018
 forward-0.017
itting-0.017
Token said
Token ID531
Feature activation+1.20e-02

Pos logit contributions
 lear+0.000
 str+0.000
ah+0.000
 forward+0.000
itting+0.000
Neg logit contributions
keeping-0.000
kins-0.000
 snakes-0.000
 skeletons-0.000
bes-0.000
Token they
Token ID484
Feature activation+2.21e-02

Pos logit contributions
keeping+0.022
kins+0.020
 snakes+0.017
bes+0.017
 traps+0.017
Neg logit contributions
 lear-0.018
 str-0.018
ah-0.017
 forward-0.016
 switched-0.016
Token Come
Token ID7911
Feature activation+9.09e-03

Pos logit contributions
 lear+0.050
 str+0.050
ah+0.049
 forward+0.046
itting+0.046
Neg logit contributions
keeping-0.049
kins-0.045
 snakes-0.039
 skeletons-0.036
bes-0.036
Token and
Token ID290
Feature activation+1.71e-02

Pos logit contributions
keeping+0.011
kins+0.009
 bush+0.009
 snakes+0.007
 ranger+0.007
Neg logit contributions
itting-0.018
 lear-0.018
ah-0.018
 str-0.017
 switched-0.016
Token see
Token ID766
Feature activation+3.25e-02

Pos logit contributions
keeping+0.029
kins+0.025
 ranger+0.023
 bush+0.022
 snakes+0.022
Neg logit contributions
ah-0.029
itting-0.029
 lear-0.026
 str-0.024
ahoo-0.024
Token our
Token ID674
Feature activation-5.41e-02

Pos logit contributions
keeping+0.044
kins+0.041
 snakes+0.034
bes+0.030
 traps+0.030
Neg logit contributions
 lear-0.063
 str-0.063
ah-0.062
 forward-0.058
itting-0.058
Token pump
Token ID8901
Feature activation+1.09e-02

Pos logit contributions
 lear+0.095
 str+0.094
ah+0.094
 forward+0.088
itting+0.087
Neg logit contributions
keeping-0.082
kins-0.078
 snakes-0.066
 traps-0.060
 skeletons-0.060
Tokenkins
Token ID5331
Feature activation+2.30e-01

Pos logit contributions
keeping+0.017
 iv+0.014
 skeletons+0.013
 snakes+0.013
 mice+0.013
Neg logit contributions
ah-0.019
 str-0.018
ee-0.017
 forward-0.017
 lear-0.016
Token!
Token ID0
Feature activation-2.89e-02

Pos logit contributions
keeping+0.327
kins+0.302
 snakes+0.256
bes+0.233
 skeletons+0.231
Neg logit contributions
 str-0.429
 lear-0.425
ah-0.420
 forward-0.398
 switched-0.392
Token We
Token ID775
Feature activation-5.08e-02

Pos logit contributions
 str+0.053
 lear+0.052
ah+0.052
itting+0.049
 forward+0.048
Neg logit contributions
keeping-0.042
kins-0.038
 snakes-0.033
 skeletons-0.030
 traps-0.029
Token painted
Token ID13055
Feature activation-3.23e-02

Pos logit contributions
itting+0.095
 farewell+0.092
 lear+0.088
 str+0.087
oured+0.087
Neg logit contributions
kins-0.060
keeping-0.058
 bush-0.049
 snakes-0.048
,"-0.046
Token them
Token ID606
Feature activation+4.29e-02

Pos logit contributions
 str+0.043
ah+0.042
 lear+0.042
 forward+0.039
 switched+0.037
Neg logit contributions
keeping-0.063
kins-0.059
 snakes-0.051
bes-0.050
 traps-0.049
Token!
Token ID0
Feature activation-3.11e-02

Pos logit contributions
keeping+0.063
kins+0.063
 snakes+0.053
bes+0.048
ls+0.048
Neg logit contributions
 lear-0.076
 str-0.073
ah-0.072
itting-0.070
 switched-0.068
Token will
Token ID481
Feature activation+1.11e-02

Pos logit contributions
 lear+0.007
 str+0.007
ah+0.007
 forward+0.007
itting+0.007
Neg logit contributions
keeping-0.008
kins-0.008
 snakes-0.007
bes-0.006
 traps-0.006
Token stay
Token ID2652
Feature activation+4.00e-02

Pos logit contributions
keeping+0.022
kins+0.021
 snakes+0.018
 skeletons+0.017
bes+0.017
Neg logit contributions
 str-0.015
 lear-0.015
ah-0.014
 forward-0.013
 switched-0.013
Token on
Token ID319
Feature activation+1.37e-02

Pos logit contributions
keeping+0.072
kins+0.068
 snakes+0.059
 skeletons+0.055
bes+0.055
Neg logit contributions
 str-0.059
 lear-0.059
ah-0.058
 forward-0.054
itting-0.053
Token the
Token ID262
Feature activation+1.65e-02

Pos logit contributions
keeping+0.025
kins+0.024
 snakes+0.020
bes+0.020
uts+0.019
Neg logit contributions
 str-0.020
 lear-0.020
ah-0.019
 forward-0.018
 switched-0.017
Token pump
Token ID8901
Feature activation+9.60e-03

Pos logit contributions
keeping+0.022
kins+0.021
 snakes+0.017
bes+0.015
 skeletons+0.015
Neg logit contributions
 lear-0.032
 str-0.032
ah-0.031
 forward-0.030
 switched-0.029
Tokenkins
Token ID5331
Feature activation+2.30e-01

Pos logit contributions
keeping+0.009
 skeletons+0.007
 iv+0.007
 snakes+0.007
 rats+0.007
Neg logit contributions
ah-0.021
 str-0.020
 forward-0.020
ee-0.019
 lear-0.019
Token."
Token ID526
Feature activation+4.22e-02

Pos logit contributions
keeping+0.347
kins+0.304
 snakes+0.277
 skeletons+0.257
bes+0.255
Neg logit contributions
 str-0.412
 forward-0.398
 lear-0.394
ah-0.390
 spend-0.367
Token\n
Token ID198
Feature activation+3.13e-02

Pos logit contributions
keeping+0.078
kins+0.071
 snakes+0.061
 iv+0.061
 situations+0.061
Neg logit contributions
 str-0.063
 lear-0.062
ah-0.058
 forward-0.057
Sure-0.056
Token\n
Token ID198
Feature activation+2.83e-02

Pos logit contributions
keeping+0.069
kins+0.061
 situations+0.057
 snakes+0.055
 skeletons+0.055
Neg logit contributions
 str-0.037
ee-0.034
Sure-0.033
ah-0.032
by-0.031
TokenThey
Token ID2990
Feature activation-4.67e-03

Pos logit contributions
keeping+0.047
kins+0.044
 snakes+0.038
 traps+0.035
bes+0.035
Neg logit contributions
 lear-0.047
 str-0.046
ah-0.046
 forward-0.043
itting-0.042
Token run
Token ID1057
Feature activation+3.03e-02

Pos logit contributions
itting+0.008
ah+0.007
ahoo+0.007
 forward+0.006
rav+0.006
Neg logit contributions
kins-0.007
 cans-0.006
 tree-0.006
 floor-0.006
 bush-0.006
Token to
Token ID284
Feature activation+2.54e-03

Pos logit contributions
keeping+0.027
kins+0.025
 snakes+0.023
 skeletons+0.021
 traps+0.021
Neg logit contributions
 lear-0.020
ah-0.020
 str-0.020
 forward-0.018
itting-0.018
Token be
Token ID307
Feature activation-2.88e-02

Pos logit contributions
keeping+0.005
kins+0.005
 snakes+0.004
bes+0.004
 traps+0.004
Neg logit contributions
 lear-0.003
ah-0.003
 str-0.003
 forward-0.003
itting-0.003
Token with
Token ID351
Feature activation+3.02e-02

Pos logit contributions
 lear+0.047
 str+0.046
ah+0.046
 forward+0.042
itting+0.042
Neg logit contributions
keeping-0.048
kins-0.046
 snakes-0.039
 traps-0.036
bes-0.036
Token other
Token ID584
Feature activation+5.83e-02

Pos logit contributions
keeping+0.049
kins+0.046
 snakes+0.041
 traps+0.037
 skeletons+0.037
Neg logit contributions
 lear-0.051
ah-0.050
 str-0.050
itting-0.047
 forward-0.046
Token pump
Token ID8901
Feature activation+1.03e-02

Pos logit contributions
keeping+0.070
kins+0.063
 snakes+0.050
bes+0.047
 traps+0.044
Neg logit contributions
 lear-0.121
 str-0.121
ah-0.119
 forward-0.113
itting-0.113
Tokenkins
Token ID5331
Feature activation+2.30e-01

Pos logit contributions
keeping+0.009
 iv+0.006
iders+0.005
bes+0.005
 snakes+0.005
Neg logit contributions
 forward-0.023
ah-0.022
by-0.022
 str-0.022
ee-0.022
Token in
Token ID287
Feature activation+2.11e-02

Pos logit contributions
keeping+0.323
kins+0.286
 snakes+0.250
bes+0.231
 skeletons+0.224
Neg logit contributions
 str-0.439
 lear-0.430
ah-0.419
 forward-0.415
itting-0.402
Token the
Token ID262
Feature activation+2.41e-02

Pos logit contributions
keeping+0.045
kins+0.040
bes+0.038
iders+0.038
uts+0.038
Neg logit contributions
 str-0.025
 forward-0.023
 spend-0.022
 lear-0.022
ah-0.022
Token garden
Token ID11376
Feature activation+4.58e-02

Pos logit contributions
keeping+0.036
kins+0.034
 snakes+0.029
 traps+0.026
bes+0.026
Neg logit contributions
 lear-0.043
 str-0.043
ah-0.042
 forward-0.039
itting-0.039
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
keeping+0.060
kins+0.052
 snakes+0.044
bes+0.041
 skeletons+0.040
Neg logit contributions
 str-0.091
 lear-0.090
ah-0.086
 forward-0.086
itting-0.085
Token\n
Token ID198
Feature activation+2.80e-02

Pos logit contributions
keeping+0.025
kins+0.023
 traps+0.019
 snakes+0.018
bes+0.018
Neg logit contributions
 str-0.024
ah-0.023
 lear-0.023
itting-0.022
 forward-0.021
Token.
Token ID13
Feature activation+1.80e-02

Pos logit contributions
keeping+0.046
kins+0.042
bes+0.034
 snakes+0.034
 skeletons+0.033
Neg logit contributions
 str-0.051
 lear-0.051
ah-0.049
 forward-0.048
 switched-0.046
Token They
Token ID1119
Feature activation+1.53e-03

Pos logit contributions
kins+0.032
keeping+0.032
 snakes+0.026
bes+0.026
 traps+0.025
Neg logit contributions
 lear-0.027
 str-0.026
 forward-0.025
 switched-0.025
ah-0.024
Token see
Token ID766
Feature activation+3.03e-02

Pos logit contributions
kins+0.003
keeping+0.002
 snakes+0.002
 bush+0.002
 cans+0.002
Neg logit contributions
ah-0.002
itting-0.002
 str-0.002
 lear-0.002
 forward-0.002
Token the
Token ID262
Feature activation+1.86e-02

Pos logit contributions
keeping+0.064
kins+0.059
bes+0.051
iders+0.049
uts+0.048
Neg logit contributions
 lear-0.038
 str-0.038
ah-0.035
 forward-0.035
 switched-0.034
Token pump
Token ID8901
Feature activation+1.22e-02

Pos logit contributions
keeping+0.029
kins+0.027
 snakes+0.023
bes+0.021
 skeletons+0.021
Neg logit contributions
 lear-0.033
 str-0.032
ah-0.032
 forward-0.030
itting-0.030
Tokenkins
Token ID5331
Feature activation+2.30e-01

Pos logit contributions
keeping+0.009
 skeletons+0.005
 iv+0.005
 snakes+0.005
bes+0.004
Neg logit contributions
ah-0.031
 str-0.030
 forward-0.030
ee-0.029
 lear-0.028
Token.
Token ID13
Feature activation+1.95e-02

Pos logit contributions
keeping+0.396
kins+0.353
 snakes+0.318
 skeletons+0.299
bes+0.297
Neg logit contributions
 str-0.367
 lear-0.357
ah-0.351
 forward-0.344
 switched-0.332
Token They
Token ID1119
Feature activation+3.20e-03

Pos logit contributions
kins+0.034
keeping+0.034
bes+0.028
 snakes+0.028
 traps+0.027
Neg logit contributions
 lear-0.029
 str-0.028
 forward-0.027
ah-0.026
 switched-0.026
Token see
Token ID766
Feature activation+3.04e-02

Pos logit contributions
kins+0.006
keeping+0.005
 snakes+0.005
 skeletons+0.004
 bush+0.004
Neg logit contributions
ah-0.005
 lear-0.005
 str-0.005
itting-0.004
 forward-0.004
Token the
Token ID262
Feature activation+2.06e-02

Pos logit contributions
keeping+0.062
kins+0.057
bes+0.049
iders+0.047
 snakes+0.047
Neg logit contributions
 lear-0.040
 str-0.040
ah-0.038
 forward-0.036
 switched-0.036
Token painting
Token ID12036
Feature activation-1.97e-02

Pos logit contributions
keeping+0.032
kins+0.029
 snakes+0.025
bes+0.023
 traps+0.023
Neg logit contributions
 lear-0.036
 str-0.036
ah-0.035
 forward-0.033
itting-0.033
Token.
Token ID13
Feature activation+1.37e-02

Pos logit contributions
keeping+0.002
rob+0.001
bes+0.001
kins+0.001
 yourselves+0.001
Neg logit contributions
 forward-0.001
 wait-0.001
ah-0.001
 str-0.001
 lear-0.001
Token She
Token ID1375
Feature activation+6.01e-04

Pos logit contributions
kins+0.030
keeping+0.028
rob+0.024
 snakes+0.024
ers+0.023
Neg logit contributions
 lear-0.016
 str-0.016
 While-0.015
 forward-0.015
Sure-0.014
Token said
Token ID531
Feature activation+5.81e-03

Pos logit contributions
keeping+0.001
kins+0.001
 snakes+0.001
 traps+0.001
 skeletons+0.001
Neg logit contributions
 lear-0.001
 str-0.001
ah-0.001
 forward-0.001
itting-0.001
Token that
Token ID326
Feature activation-7.27e-03

Pos logit contributions
keeping+0.010
kins+0.010
 snakes+0.008
bes+0.008
 traps+0.008
Neg logit contributions
 lear-0.009
 str-0.009
ah-0.009
 forward-0.008
 switched-0.008
Token pump
Token ID8901
Feature activation+1.14e-02

Pos logit contributions
 str+0.010
 lear+0.009
 switched+0.008
ah+0.008
 pract+0.008
Neg logit contributions
keeping-0.015
kins-0.013
bes-0.012
handed-0.011
 situations-0.011
Tokenkins
Token ID5331
Feature activation+2.29e-01

Pos logit contributions
keeping+0.009
bes+0.005
 snakes+0.005
 skeletons+0.005
 traps+0.005
Neg logit contributions
ah-0.028
ee-0.027
 str-0.026
 forward-0.026
by-0.026
Token are
Token ID389
Feature activation+4.75e-03

Pos logit contributions
keeping+0.443
kins+0.403
 snakes+0.365
bes+0.345
 skeletons+0.344
Neg logit contributions
 str-0.312
 lear-0.301
ah-0.300
 switched-0.283
 forward-0.282
Token rare
Token ID4071
Feature activation+1.23e-02

Pos logit contributions
keeping+0.008
kins+0.007
 snakes+0.006
bes+0.006
 skeletons+0.006
Neg logit contributions
 str-0.008
 lear-0.008
ah-0.007
 forward-0.007
 switched-0.007
Token and
Token ID290
Feature activation+8.93e-03

Pos logit contributions
keeping+0.025
kins+0.023
bes+0.022
rob+0.021
iders+0.021
Neg logit contributions
 lear-0.013
 str-0.012
 forward-0.011
 spent-0.011
 wait-0.011
Token valuable
Token ID8119
Feature activation+2.91e-02

Pos logit contributions
keeping+0.017
kins+0.015
bes+0.013
orns+0.013
 snakes+0.013
Neg logit contributions
 str-0.012
 switched-0.012
 spend-0.011
 lear-0.011
 forward-0.011
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
keeping+0.044
kins+0.039
bes+0.037
iders+0.034
 snakes+0.034
Neg logit contributions
 lear-0.047
 str-0.044
ah-0.044
 wait-0.043
by-0.042
Token They
Token ID1119
Feature activation-6.57e-03

Pos logit contributions
keeping+0.025
kins+0.023
 snakes+0.019
bes+0.018
 traps+0.018
Neg logit contributions
 lear-0.026
 str-0.025
 forward-0.025
itting-0.024
ah-0.024
Token quickly
Token ID2952
Feature activation+4.42e-02

Pos logit contributions
itting+0.010
 forward+0.010
ah+0.010
 lear+0.010
 farewell+0.009
Neg logit contributions
kins-0.011
keeping-0.009
 bush-0.008
 cans-0.008
 iv-0.008
Token put
Token ID1234
Feature activation+8.06e-03

Pos logit contributions
kins+0.076
keeping+0.075
 cans+0.065
 snakes+0.065
 iv+0.064
Neg logit contributions
 forward-0.065
itting-0.064
ah-0.063
 lear-0.059
ee-0.056
Token their
Token ID511
Feature activation+1.81e-02

Pos logit contributions
keeping+0.014
kins+0.013
 snakes+0.011
 traps+0.010
bes+0.010
Neg logit contributions
 lear-0.013
 str-0.013
ah-0.013
itting-0.012
 switched-0.012
Token nap
Token ID25422
Feature activation+1.37e-02

Pos logit contributions
keeping+0.028
kins+0.022
uts+0.021
iders+0.020
 snakes+0.019
Neg logit contributions
 str-0.032
 lear-0.031
 forward-0.030
ah-0.030
 switched-0.029
Tokenkins
Token ID5331
Feature activation+2.29e-01

Pos logit contributions
keeping+0.014
 snakes+0.010
 skeletons+0.009
 situations+0.009
 rats+0.008
Neg logit contributions
 str-0.031
 lear-0.029
 forward-0.029
 switched-0.028
ah-0.028
Token over
Token ID625
Feature activation-6.72e-03

Pos logit contributions
keeping+0.370
kins+0.363
 snakes+0.286
bes+0.273
ls+0.270
Neg logit contributions
 lear-0.397
ah-0.384
 str-0.379
itting-0.353
 switched-0.346
Token their
Token ID511
Feature activation+1.68e-02

Pos logit contributions
 lear+0.010
 str+0.010
ah+0.010
 forward+0.009
 switched+0.009
Neg logit contributions
keeping-0.012
kins-0.011
 snakes-0.010
bes-0.009
 traps-0.009
Token salad
Token ID20698
Feature activation+5.15e-02

Pos logit contributions
keeping+0.023
kins+0.020
 snakes+0.017
uts+0.017
 skeletons+0.016
Neg logit contributions
 str-0.032
 lear-0.031
ah-0.030
 forward-0.029
 switched-0.029
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
keeping+0.087
 snakes+0.078
 situations+0.073
 yourselves+0.073
 skeletons+0.072
Neg logit contributions
 str-0.075
 lear-0.072
 forward-0.071
ah-0.068
 bought-0.064
Token Then
Token ID3244
Feature activation+6.07e-02

Pos logit contributions
keeping+0.018
kins+0.017
 snakes+0.014
 traps+0.013
bes+0.013
Neg logit contributions
 lear-0.023
 str-0.023
ah-0.022
 forward-0.021
itting-0.021
Token like
Token ID588
Feature activation-6.98e-02

Pos logit contributions
ah+0.004
 lear+0.004
itting+0.003
 str+0.003
packing+0.003
Neg logit contributions
kins-0.002
keeping-0.002
 trouble-0.002
 himself-0.002
 traps-0.002
Token he
Token ID339
Feature activation-9.40e-04

Pos logit contributions
ah+0.132
 lear+0.130
 str+0.128
itting+0.124
 forward+0.120
Neg logit contributions
keeping-0.095
kins-0.087
 snakes-0.077
 skeletons-0.070
 traps-0.069
Token was
Token ID373
Feature activation-1.03e-02

Pos logit contributions
 lear+0.002
 str+0.002
ah+0.002
itting+0.001
 forward+0.001
Neg logit contributions
keeping-0.001
kins-0.001
 snakes-0.001
 traps-0.001
 skeletons-0.001
Token the
Token ID262
Feature activation-7.39e-03

Pos logit contributions
 lear+0.018
ah+0.017
itting+0.016
 str+0.016
ee+0.016
Neg logit contributions
keeping-0.016
kins-0.015
 snakes-0.013
 traps-0.012
 skeletons-0.012
Token best
Token ID1266
Feature activation+4.75e-03

Pos logit contributions
ah+0.012
ur+0.012
 lear+0.012
anged+0.012
ashed+0.012
Neg logit contributions
keeping-0.009
kins-0.009
 snakes-0.008
 truth-0.008
 noise-0.008
Token pian
Token ID43923
Feature activation-1.69e-01

Pos logit contributions
keeping+0.006
kins+0.006
 snakes+0.005
uts+0.005
 floor+0.004
Neg logit contributions
ah-0.009
 str-0.009
 lear-0.008
 switched-0.008
oured-0.008
Tokenist
Token ID396
Feature activation-7.09e-02

Pos logit contributions
 str+0.256
 lear+0.254
 spend+0.236
 switched+0.235
 farewell+0.233
Neg logit contributions
keeping-0.303
kins-0.284
uts-0.230
 snakes-0.218
ers-0.217
Token in
Token ID287
Feature activation-6.86e-03

Pos logit contributions
 lear+0.123
 str+0.121
itting+0.117
ah+0.116
 switched+0.114
Neg logit contributions
keeping-0.104
kins-0.100
 snakes-0.084
ls-0.075
 skeletons-0.074
Token the
Token ID262
Feature activation-9.70e-03

Pos logit contributions
 str+0.010
 lear+0.010
ah+0.010
 forward+0.009
 spend+0.009
Neg logit contributions
keeping-0.012
kins-0.012
 snakes-0.010
bes-0.010
ls-0.010
Token whole
Token ID2187
Feature activation-4.13e-02

Pos logit contributions
 lear+0.020
ah+0.020
 str+0.019
itting+0.018
 switched+0.018
Neg logit contributions
keeping-0.012
kins-0.011
 snakes-0.009
 bush-0.008
 traps-0.008
Token world
Token ID995
Feature activation-7.35e-03

Pos logit contributions
 lear+0.074
 str+0.073
ah+0.073
 forward+0.068
itting+0.068
Neg logit contributions
keeping-0.062
kins-0.057
 snakes-0.048
bes-0.044
 skeletons-0.044
Token asleep
Token ID16039
Feature activation-2.91e-03

Pos logit contributions
keeping+0.039
kins+0.037
 snakes+0.032
bes+0.030
 traps+0.030
Neg logit contributions
 lear-0.034
 str-0.033
ah-0.033
 forward-0.030
itting-0.030
Token,
Token ID11
Feature activation-1.78e-02

Pos logit contributions
 lear+0.005
 str+0.005
ah+0.005
 forward+0.005
 switched+0.004
Neg logit contributions
keeping-0.005
kins-0.004
 snakes-0.004
 skeletons-0.004
 traps-0.003
Token dreaming
Token ID28368
Feature activation-4.36e-02

Pos logit contributions
ah+0.033
 lear+0.031
itting+0.030
ee+0.030
ipp+0.028
Neg logit contributions
keeping-0.026
kins-0.023
bes-0.021
 floor-0.020
 bush-0.020
Token of
Token ID286
Feature activation+1.58e-02

Pos logit contributions
 str+0.078
 lear+0.077
ah+0.076
 switched+0.071
itting+0.071
Neg logit contributions
keeping-0.064
kins-0.060
 snakes-0.052
 traps-0.047
 iv-0.047
Token icy
Token ID30284
Feature activation-7.71e-03

Pos logit contributions
keeping+0.030
kins+0.029
 snakes+0.025
bes+0.024
uts+0.024
Neg logit contributions
 str-0.022
 lear-0.021
ah-0.020
 forward-0.020
 switched-0.019
Token pian
Token ID43923
Feature activation-1.67e-01

Pos logit contributions
 str+0.012
 lear+0.012
 forward+0.012
itting+0.012
ah+0.012
Neg logit contributions
kins-0.011
keeping-0.011
bes-0.009
uts-0.009
iders-0.009
Tokenos
Token ID418
Feature activation+1.84e-02

Pos logit contributions
 lear+0.299
 str+0.295
ah+0.274
 forward+0.272
 switched+0.271
Neg logit contributions
keeping-0.259
kins-0.236
 snakes-0.192
uts-0.186
bes-0.184
Token and
Token ID290
Feature activation-4.59e-03

Pos logit contributions
keeping+0.031
kins+0.029
 snakes+0.024
 skeletons+0.023
bes+0.023
Neg logit contributions
 str-0.030
ah-0.029
 lear-0.029
 forward-0.028
itting-0.027
Token big
Token ID1263
Feature activation+3.27e-02

Pos logit contributions
 str+0.007
 lear+0.007
ah+0.007
 switched+0.006
 forward+0.006
Neg logit contributions
keeping-0.008
kins-0.008
 snakes-0.007
ls-0.006
 skeletons-0.006
Token piles
Token ID30935
Feature activation+3.60e-02

Pos logit contributions
keeping+0.048
kins+0.047
 snakes+0.037
bes+0.036
uts+0.035
Neg logit contributions
 lear-0.059
 str-0.058
ah-0.056
 forward-0.054
 switched-0.053
Token of
Token ID286
Feature activation+1.61e-02

Pos logit contributions
keeping+0.051
kins+0.048
 snakes+0.041
bes+0.038
 traps+0.037
Neg logit contributions
 lear-0.065
 str-0.065
ah-0.064
 forward-0.063
itting-0.061
Token practice
Token ID3357
Feature activation-7.36e-02

Pos logit contributions
 lear+0.020
ah+0.020
itting+0.020
 forward+0.019
 str+0.019
Neg logit contributions
keeping-0.033
kins-0.031
 snakes-0.027
 truth-0.026
bes-0.026
Token and
Token ID290
Feature activation-1.11e-02

Pos logit contributions
 lear+0.115
ah+0.115
 str+0.112
itting+0.105
 switched+0.105
Neg logit contributions
keeping-0.124
kins-0.118
 snakes-0.102
bes-0.097
 traps-0.095
Token become
Token ID1716
Feature activation-4.41e-02

Pos logit contributions
 lear+0.013
ah+0.013
 str+0.013
itting+0.012
 forward+0.012
Neg logit contributions
keeping-0.023
kins-0.022
 snakes-0.019
 traps-0.018
bes-0.018
Token a
Token ID257
Feature activation-3.05e-02

Pos logit contributions
ah+0.080
 lear+0.078
itting+0.075
 str+0.074
ee+0.072
Neg logit contributions
keeping-0.064
kins-0.062
 snakes-0.052
 traps-0.051
 mice-0.050
Token great
Token ID1049
Feature activation-6.54e-03

Pos logit contributions
itting+0.055
ah+0.055
asted+0.055
oured+0.054
anged+0.054
Neg logit contributions
kins-0.033
keeping-0.032
 old-0.030
 bush-0.030
 tree-0.030
Token pian
Token ID43923
Feature activation-1.65e-01

Pos logit contributions
ah+0.011
oured+0.010
sels+0.010
asted+0.010
ur+0.010
Neg logit contributions
keeping-0.009
kins-0.008
 old-0.008
 truth-0.008
 trouble-0.008
Tokenist
Token ID396
Feature activation-6.56e-02

Pos logit contributions
 str+0.226
 lear+0.225
 switched+0.215
 padd+0.210
 farewell+0.207
Neg logit contributions
keeping-0.309
kins-0.294
uts-0.246
bes-0.229
iders-0.229
Token so
Token ID523
Feature activation+2.68e-02

Pos logit contributions
 lear+0.106
 str+0.106
ah+0.104
itting+0.103
 switched+0.102
Neg logit contributions
keeping-0.098
kins-0.096
 snakes-0.080
ls-0.073
bes-0.073
Token that
Token ID326
Feature activation-2.52e-02

Pos logit contributions
keeping+0.037
kins+0.036
 snakes+0.031
ls+0.029
 skeletons+0.028
Neg logit contributions
 lear-0.049
ah-0.049
 str-0.048
 switched-0.044
itting-0.043
Token one
Token ID530
Feature activation-8.10e-02

Pos logit contributions
 lear+0.047
ah+0.047
 str+0.046
itting+0.044
oured+0.043
Neg logit contributions
keeping-0.034
kins-0.032
 snakes-0.027
 skeletons-0.025
 traps-0.024
Token day
Token ID1110
Feature activation-2.70e-02

Pos logit contributions
 lear+0.126
 str+0.121
ah+0.119
itting+0.112
 farewell+0.112
Neg logit contributions
kins-0.127
keeping-0.126
 snakes-0.114
 skeletons-0.102
 bush-0.101
Token eyes
Token ID2951
Feature activation+6.40e-03

Pos logit contributions
 str+0.006
 lear+0.006
 forward+0.005
ah+0.005
 switched+0.005
Neg logit contributions
keeping-0.007
kins-0.006
 snakes-0.005
uts-0.005
orns-0.005
Token and
Token ID290
Feature activation-6.57e-03

Pos logit contributions
keeping+0.011
kins+0.010
 snakes+0.009
 skeletons+0.008
 traps+0.008
Neg logit contributions
 str-0.010
 lear-0.010
 forward-0.010
ah-0.010
 switched-0.009
Token yelled
Token ID22760
Feature activation-3.33e-02

Pos logit contributions
ah+0.009
 lear+0.009
 str+0.009
itting+0.008
 forward+0.008
Neg logit contributions
keeping-0.012
kins-0.012
 snakes-0.010
bes-0.010
 traps-0.010
Token,
Token ID11
Feature activation-1.71e-02

Pos logit contributions
ah+0.060
 str+0.058
 lear+0.057
itting+0.054
 switched+0.053
Neg logit contributions
keeping-0.049
kins-0.047
 snakes-0.040
uts-0.038
ls-0.037
Token "
Token ID366
Feature activation-5.64e-02

Pos logit contributions
ah+0.033
 lear+0.032
itting+0.031
 str+0.031
ee+0.031
Neg logit contributions
keeping-0.022
kins-0.020
 snakes-0.018
bes-0.016
uts-0.016
TokenWhe
Token ID10842
Feature activation-1.64e-01

Pos logit contributions
 lear+0.098
 forward+0.098
ah+0.097
 str+0.095
 spend+0.094
Neg logit contributions
keeping-0.085
kins-0.068
 snakes-0.061
 yourselves-0.061
bes-0.060
Tokenee
Token ID1453
Feature activation-1.40e-01

Pos logit contributions
 lear+0.248
 str+0.232
 switched+0.220
 farewell+0.215
 spend+0.214
Neg logit contributions
kins-0.287
keeping-0.264
 snakes-0.235
bes-0.212
 bush-0.211
Tokene
Token ID68
Feature activation-3.34e-02

Pos logit contributions
 lear+0.219
 str+0.218
itting+0.214
 switched+0.207
Of+0.204
Neg logit contributions
kins-0.231
keeping-0.227
 snakes-0.189
bes-0.182
ers-0.177
Token!"
Token ID2474
Feature activation-3.34e-03

Pos logit contributions
 lear+0.065
 str+0.064
ah+0.063
itting+0.062
 forward+0.061
Neg logit contributions
keeping-0.043
kins-0.040
 snakes-0.034
bes-0.032
uts-0.030
Token as
Token ID355
Feature activation-4.70e-02

Pos logit contributions
 lear+0.006
 str+0.006
ah+0.006
itting+0.006
 forward+0.006
Neg logit contributions
keeping-0.005
kins-0.004
 snakes-0.004
bes-0.003
 skeletons-0.003
Token the
Token ID262
Feature activation-5.29e-03

Pos logit contributions
 lear+0.086
 str+0.086
ah+0.084
itting+0.079
 forward+0.079
Neg logit contributions
keeping-0.068
kins-0.064
 snakes-0.054
bes-0.048
 traps-0.048
Token so
Token ID523
Feature activation+3.10e-02

Pos logit contributions
 lear+0.014
itting+0.013
ah+0.013
 forward+0.012
 str+0.012
Neg logit contributions
keeping-0.013
kins-0.010
 snakes-0.010
 skeletons-0.010
 traps-0.010
Token much
Token ID881
Feature activation-3.18e-02

Pos logit contributions
keeping+0.062
kins+0.059
 snakes+0.053
bes+0.050
 traps+0.049
Neg logit contributions
 lear-0.040
 str-0.039
ah-0.039
 forward-0.035
itting-0.035
Token fun
Token ID1257
Feature activation-1.34e-03

Pos logit contributions
 str+0.047
 lear+0.046
ah+0.044
 switched+0.043
 forward+0.042
Neg logit contributions
keeping-0.059
kins-0.056
 snakes-0.047
bes-0.045
 iv-0.044
Token.
Token ID13
Feature activation-1.57e-03

Pos logit contributions
 lear+0.003
 str+0.003
ah+0.003
 forward+0.003
itting+0.003
Neg logit contributions
keeping-0.002
kins-0.002
 snakes-0.001
bes-0.001
 traps-0.001
Token "
Token ID366
Feature activation-4.47e-02

Pos logit contributions
 lear+0.003
 str+0.003
ah+0.003
 forward+0.003
itting+0.003
Neg logit contributions
keeping-0.002
kins-0.002
 snakes-0.002
bes-0.002
 skeletons-0.002
TokenWhe
Token ID10842
Feature activation-1.64e-01

Pos logit contributions
 lear+0.078
oured+0.078
 str+0.076
itting+0.076
 farewell+0.075
Neg logit contributions
keeping-0.060
kins-0.041
orns-0.038
 yourselves-0.036
bes-0.036
Tokene
Token ID68
Feature activation-3.05e-02

Pos logit contributions
 lear+0.245
 str+0.211
 farewell+0.204
 spend+0.203
 switched+0.202
Neg logit contributions
kins-0.296
keeping-0.258
 snakes-0.229
ers-0.228
es-0.225
Token!"
Token ID2474
Feature activation+4.66e-03

Pos logit contributions
itting+0.066
 lear+0.066
 str+0.065
 forward+0.065
 switched+0.064
Neg logit contributions
keeping-0.032
kins-0.028
bes-0.023
 snakes-0.021
uts-0.020
Token he
Token ID339
Feature activation+2.03e-03

Pos logit contributions
keeping+0.006
kins+0.006
 snakes+0.005
bes+0.004
uts+0.004
Neg logit contributions
 lear-0.009
ah-0.009
 str-0.009
itting-0.009
 forward-0.008
Token shouted
Token ID17293
Feature activation-1.70e-04

Pos logit contributions
keeping+0.002
kins+0.002
 snakes+0.002
bes+0.002
 noise+0.002
Neg logit contributions
itting-0.004
 lear-0.004
 forward-0.004
ah-0.004
packing-0.004
Token.
Token ID13
Feature activation-2.57e-03

Pos logit contributions
ah+0.000
itting+0.000
 str+0.000
 lear+0.000
packing+0.000
Neg logit contributions
keeping-0.000
kins-0.000
uts-0.000
 snakes-0.000
bes-0.000
Token and
Token ID290
Feature activation+1.87e-03

Pos logit contributions
keeping+0.083
kins+0.082
 snakes+0.070
 skeletons+0.068
bes+0.066
Neg logit contributions
ah-0.076
 str-0.076
 lear-0.076
 forward-0.071
 switched-0.068
Token slides
Token ID19392
Feature activation-4.88e-02

Pos logit contributions
keeping+0.004
kins+0.003
 snakes+0.003
 traps+0.003
bes+0.003
Neg logit contributions
 lear-0.002
ah-0.002
 str-0.002
itting-0.002
 forward-0.002
Token down
Token ID866
Feature activation-4.30e-02

Pos logit contributions
 lear+0.079
 str+0.077
ah+0.076
itting+0.073
 switched+0.071
Neg logit contributions
keeping-0.082
kins-0.076
 snakes-0.066
bes-0.061
ls-0.060
Token.
Token ID13
Feature activation+9.78e-03

Pos logit contributions
 lear+0.067
 str+0.066
ah+0.065
 forward+0.062
 switched+0.059
Neg logit contributions
keeping-0.074
kins-0.071
 snakes-0.060
bes-0.058
 skeletons-0.057
Token "
Token ID366
Feature activation-5.00e-02

Pos logit contributions
keeping+0.015
kins+0.014
 snakes+0.011
 traps+0.011
bes+0.011
Neg logit contributions
 lear-0.018
 str-0.017
ah-0.017
 forward-0.016
 switched-0.016
TokenWhe
Token ID10842
Feature activation-1.63e-01

Pos logit contributions
 str+0.081
oured+0.080
 lear+0.078
itting+0.077
by+0.076
Neg logit contributions
keeping-0.074
kins-0.061
 snakes-0.057
bes-0.054
 yourselves-0.052
Tokene
Token ID68
Feature activation-3.15e-02

Pos logit contributions
 lear+0.237
 str+0.222
 switched+0.207
 forward+0.199
 spend+0.198
Neg logit contributions
kins-0.293
keeping-0.279
 snakes-0.249
bes-0.228
 bush-0.226
Token!"
Token ID2474
Feature activation+5.27e-03

Pos logit contributions
 lear+0.066
 str+0.066
ah+0.064
itting+0.064
 forward+0.063
Neg logit contributions
keeping-0.035
kins-0.033
 snakes-0.027
bes-0.025
uts-0.023
Token she
Token ID673
Feature activation+5.33e-03

Pos logit contributions
keeping+0.007
kins+0.007
 snakes+0.006
bes+0.005
 traps+0.005
Neg logit contributions
 lear-0.010
 str-0.010
ah-0.010
itting-0.009
 forward-0.009
Token says
Token ID1139
Feature activation-3.11e-04

Pos logit contributions
keeping+0.007
kins+0.007
 snakes+0.006
bes+0.005
ls+0.005
Neg logit contributions
 lear-0.010
itting-0.010
ah-0.010
 forward-0.009
 str-0.009
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
 lear+0.001
ah+0.001
 str+0.001
itting+0.001
 switched+0.001
Neg logit contributions
keeping-0.000
kins-0.000
 snakes-0.000
bes-0.000
uts-0.000
Token little
Token ID1310
Feature activation-1.37e-02

Pos logit contributions
 lear+0.065
 str+0.063
ah+0.063
itting+0.062
 forward+0.061
Neg logit contributions
keeping-0.039
kins-0.038
 snakes-0.033
 bush-0.032
 skeletons-0.029
Token girl
Token ID2576
Feature activation-7.11e-03

Pos logit contributions
 str+0.027
 lear+0.027
ah+0.027
itting+0.026
 forward+0.025
Neg logit contributions
keeping-0.017
kins-0.015
 snakes-0.014
 bush-0.012
 mice-0.012
Token playing
Token ID2712
Feature activation-2.28e-02

Pos logit contributions
 lear+0.012
 str+0.011
ah+0.011
itting+0.010
 forward+0.010
Neg logit contributions
keeping-0.012
kins-0.011
 snakes-0.009
bes-0.009
 traps-0.009
Token a
Token ID257
Feature activation-3.42e-02

Pos logit contributions
 lear+0.039
 str+0.039
ah+0.038
 forward+0.036
itting+0.036
Neg logit contributions
keeping-0.036
kins-0.034
 snakes-0.029
bes-0.026
 traps-0.026
Token toy
Token ID13373
Feature activation-5.28e-02

Pos logit contributions
 lear+0.062
ah+0.062
itting+0.062
 forward+0.061
 str+0.061
Neg logit contributions
kins-0.040
keeping-0.039
 snakes-0.036
 bush-0.034
 floor-0.034
Token sax
Token ID46909
Feature activation-1.60e-01

Pos logit contributions
 str+0.091
 lear+0.091
ah+0.090
itting+0.085
 switched+0.085
Neg logit contributions
keeping-0.079
kins-0.074
 snakes-0.065
 skeletons-0.059
 traps-0.058
Tokenophone
Token ID38656
Feature activation-1.28e-01

Pos logit contributions
 lear+0.169
 str+0.155
 spend+0.147
 switched+0.145
 farewell+0.140
Neg logit contributions
keeping-0.340
kins-0.323
uts-0.287
bes-0.279
ers-0.277
Token.
Token ID13
Feature activation-1.05e-02

Pos logit contributions
 str+0.273
 lear+0.272
ah+0.267
 switched+0.259
itting+0.256
Neg logit contributions
keeping-0.137
kins-0.120
 snakes-0.101
 skeletons-0.089
uts-0.086
Token She
Token ID1375
Feature activation-1.89e-02

Pos logit contributions
 lear+0.019
 str+0.019
ah+0.019
 forward+0.018
itting+0.018
Neg logit contributions
keeping-0.015
kins-0.014
 snakes-0.012
bes-0.011
 skeletons-0.011
Token was
Token ID373
Feature activation-4.97e-03

Pos logit contributions
 forward+0.038
 lear+0.037
itting+0.037
wn+0.035
ah+0.034
Neg logit contributions
kins-0.020
keeping-0.018
ls-0.016
 snakes-0.016
 ranger-0.015
Token playing
Token ID2712
Feature activation-2.30e-02

Pos logit contributions
 lear+0.010
ah+0.009
 str+0.009
itting+0.009
 farewell+0.009
Neg logit contributions
keeping-0.006
kins-0.006
 snakes-0.005
 skeletons-0.005
 mice-0.005
Token "
Token ID366
Feature activation-6.63e-02

Pos logit contributions
 lear+0.007
 str+0.007
ah+0.007
 forward+0.006
itting+0.006
Neg logit contributions
keeping-0.005
kins-0.005
 snakes-0.004
 traps-0.004
bes-0.004
TokenWelcome
Token ID14618
Feature activation-3.61e-02

Pos logit contributions
 lear+0.112
 str+0.110
ah+0.106
 spend+0.106
oured+0.105
Neg logit contributions
keeping-0.097
kins-0.084
 snakes-0.073
bes-0.071
 mice-0.066
Token home
Token ID1363
Feature activation-4.44e-02

Pos logit contributions
 lear+0.059
 str+0.058
ah+0.057
 switched+0.054
itting+0.054
Neg logit contributions
keeping-0.060
kins-0.055
 snakes-0.048
 skeletons-0.044
 traps-0.044
Token,
Token ID11
Feature activation-2.78e-02

Pos logit contributions
 lear+0.072
 str+0.071
ah+0.069
 forward+0.066
 spend+0.066
Neg logit contributions
keeping-0.071
kins-0.068
 snakes-0.060
bes-0.056
 traps-0.055
Token B
Token ID347
Feature activation-5.69e-02

Pos logit contributions
 lear+0.047
 str+0.046
ah+0.046
itting+0.043
 forward+0.043
Neg logit contributions
keeping-0.044
kins-0.041
 snakes-0.036
 skeletons-0.033
 traps-0.033
Tokenoun
Token ID977
Feature activation-1.60e-01

Pos logit contributions
 str+0.079
 lear+0.078
 switched+0.072
 forward+0.069
 pract+0.069
Neg logit contributions
keeping-0.108
kins-0.098
uts-0.085
bes-0.083
 snakes-0.082
Tokency
Token ID948
Feature activation-6.27e-02

Pos logit contributions
 str+0.265
 lear+0.263
itting+0.247
 switched+0.239
 blur+0.237
Neg logit contributions
keeping-0.274
kins-0.255
ers-0.198
 snakes-0.183
 villagers-0.183
Token!",
Token ID40754
Feature activation+1.48e-02

Pos logit contributions
 str+0.130
 lear+0.130
ah+0.126
itting+0.124
 switched+0.122
Neg logit contributions
keeping-0.075
kins-0.068
 snakes-0.053
 traps-0.047
bes-0.047
Token they
Token ID484
Feature activation-2.43e-02

Pos logit contributions
keeping+0.023
kins+0.020
 snakes+0.017
 mice+0.016
bes+0.016
Neg logit contributions
 lear-0.026
ah-0.026
itting-0.026
 str-0.025
 forward-0.024
Token said
Token ID531
Feature activation-2.45e-02

Pos logit contributions
itting+0.037
 lear+0.035
ah+0.035
 str+0.033
packing+0.031
Neg logit contributions
keeping-0.042
kins-0.039
 snakes-0.033
 bush-0.032
ls-0.031
Token,
Token ID11
Feature activation-2.42e-02

Pos logit contributions
ah+0.051
 lear+0.049
 str+0.048
itting+0.048
ee+0.046
Neg logit contributions
keeping-0.029
kins-0.028
 snakes-0.019
uts-0.019
bes-0.019
Token swing
Token ID9628
Feature activation-1.84e-02

Pos logit contributions
 str+0.001
 lear+0.001
ah+0.001
 forward+0.001
 switched+0.001
Neg logit contributions
keeping-0.001
kins-0.001
 snakes-0.001
bes-0.001
uts-0.001
Token.
Token ID13
Feature activation-1.42e-02

Pos logit contributions
 lear+0.039
 str+0.039
ah+0.038
 switched+0.037
itting+0.037
Neg logit contributions
keeping-0.020
kins-0.018
 snakes-0.015
bes-0.013
 bush-0.012
Token
Token ID198
Feature activation+4.24e-03

Pos logit contributions
 lear+0.026
 str+0.026
ah+0.026
 forward+0.024
itting+0.024
Neg logit contributions
keeping-0.021
kins-0.019
 snakes-0.016
bes-0.015
 traps-0.015
Token
Token ID198
Feature activation+2.88e-03

Pos logit contributions
keeping+0.008
kins+0.007
 snakes+0.007
 rats+0.006
 situations+0.006
Neg logit contributions
 str-0.006
 lear-0.006
ah-0.006
 farewell-0.006
 switched-0.006
Token"
Token ID1
Feature activation-4.19e-02

Pos logit contributions
keeping+0.005
kins+0.005
bes+0.004
uts+0.004
 snakes+0.004
Neg logit contributions
ah-0.005
 lear-0.005
 str-0.004
 forward-0.004
oured-0.004
TokenWhe
Token ID10842
Feature activation-1.59e-01

Pos logit contributions
ah+0.068
 lear+0.067
 forward+0.065
 str+0.065
 spend+0.064
Neg logit contributions
keeping-0.065
kins-0.060
 snakes-0.051
 iv-0.048
 mice-0.047
Tokenee
Token ID1453
Feature activation-1.31e-01

Pos logit contributions
 lear+0.252
 str+0.229
 switched+0.218
 spend+0.212
 forward+0.211
Neg logit contributions
kins-0.266
keeping-0.237
 snakes-0.211
 bush-0.204
ers-0.193
Token!"
Token ID2474
Feature activation+2.45e-02

Pos logit contributions
 lear+0.261
 str+0.260
ah+0.250
itting+0.247
 forward+0.244
Neg logit contributions
keeping-0.169
kins-0.161
 snakes-0.131
bes-0.119
 traps-0.114
Token said
Token ID531
Feature activation-1.52e-02

Pos logit contributions
keeping+0.032
kins+0.031
 snakes+0.025
ls+0.022
bes+0.022
Neg logit contributions
ah-0.047
 lear-0.047
 str-0.046
itting-0.044
 forward-0.044
Token Jill
Token ID22772
Feature activation+3.52e-03

Pos logit contributions
ah+0.026
 lear+0.026
 str+0.026
itting+0.025
 forward+0.024
Neg logit contributions
keeping-0.023
kins-0.022
 snakes-0.018
 skeletons-0.016
 traps-0.016
Token.
Token ID13
Feature activation-1.04e-02

Pos logit contributions
keeping+0.003
kins+0.003
 snakes+0.002
bes+0.002
uts+0.002
Neg logit contributions
 lear-0.008
ah-0.008
itting-0.008
 str-0.008
 forward-0.007
Token laugh
Token ID6487
Feature activation-1.80e-02

Pos logit contributions
itting+0.025
 lear+0.023
ah+0.022
oured+0.022
 forward+0.021
Neg logit contributions
kins-0.019
keeping-0.018
 snakes-0.017
 bush-0.016
 tree-0.015
Token and
Token ID290
Feature activation-5.41e-03

Pos logit contributions
ah+0.037
 lear+0.037
 str+0.036
itting+0.036
 switched+0.035
Neg logit contributions
keeping-0.019
kins-0.019
 snakes-0.016
 mice-0.014
 skeletons-0.014
Token shout
Token ID9191
Feature activation-2.14e-02

Pos logit contributions
 lear+0.007
 str+0.007
ah+0.007
itting+0.006
 forward+0.006
Neg logit contributions
keeping-0.011
kins-0.010
 snakes-0.009
bes-0.008
 traps-0.008
Token,
Token ID11
Feature activation-1.23e-02

Pos logit contributions
ah+0.049
 str+0.047
 lear+0.046
packing+0.045
ee+0.045
Neg logit contributions
kins-0.019
keeping-0.017
 snakes-0.016
 mice-0.014
uts-0.013
Token "
Token ID366
Feature activation-3.49e-02

Pos logit contributions
ah+0.022
 lear+0.022
 str+0.022
itting+0.021
 switched+0.021
Neg logit contributions
keeping-0.017
kins-0.016
 snakes-0.014
bes-0.013
 traps-0.012
TokenWhe
Token ID10842
Feature activation-1.58e-01

Pos logit contributions
 lear+0.043
 str+0.041
 forward+0.039
ah+0.039
itting+0.038
Neg logit contributions
keeping-0.068
kins-0.064
 snakes-0.059
bes-0.055
uts-0.054
Tokene
Token ID68
Feature activation-3.27e-02

Pos logit contributions
 lear+0.216
 str+0.191
 switched+0.185
 forward+0.178
 farewell+0.177
Neg logit contributions
kins-0.304
keeping-0.283
 snakes-0.258
 bush-0.242
orns-0.240
Token!"
Token ID2474
Feature activation+5.49e-03

Pos logit contributions
 lear+0.074
 str+0.073
itting+0.072
 switched+0.072
 forward+0.072
Neg logit contributions
keeping-0.031
kins-0.028
 snakes-0.023
bes-0.021
uts-0.021
Token
Token ID198
Feature activation+1.19e-02

Pos logit contributions
keeping+0.007
kins+0.007
 snakes+0.006
 skeletons+0.005
 traps+0.005
Neg logit contributions
 lear-0.011
 str-0.011
ah-0.010
 forward-0.010
itting-0.010
Token
Token ID198
Feature activation+9.58e-03

Pos logit contributions
keeping+0.023
kins+0.021
 snakes+0.018
 situations+0.018
 skeletons+0.018
Neg logit contributions
 str-0.017
ah-0.016
 lear-0.016
 farewell-0.016
ee-0.016
TokenThey
Token ID2990
Feature activation-2.15e-02

Pos logit contributions
kins+0.012
keeping+0.011
 yourselves+0.010
bes+0.009
 snakes+0.009
Neg logit contributions
 lear-0.019
ah-0.019
oured-0.018
 str-0.018
 forward-0.018
Token window
Token ID4324
Feature activation-4.24e-02

Pos logit contributions
keeping+0.015
kins+0.014
 snakes+0.012
bes+0.011
 traps+0.011
Neg logit contributions
 lear-0.021
 str-0.021
ah-0.021
 forward-0.019
itting-0.019
Token.
Token ID13
Feature activation+1.87e-03

Pos logit contributions
 lear+0.077
 str+0.077
ah+0.076
 forward+0.071
itting+0.071
Neg logit contributions
keeping-0.062
kins-0.058
 snakes-0.048
bes-0.044
 traps-0.044
Token
Token ID198
Feature activation+1.51e-02

Pos logit contributions
keeping+0.003
kins+0.003
 snakes+0.002
 traps+0.002
bes+0.002
Neg logit contributions
 str-0.003
 lear-0.003
ah-0.003
 forward-0.003
itting-0.003
Token
Token ID198
Feature activation+1.48e-02

Pos logit contributions
keeping+0.029
kins+0.029
 snakes+0.024
 traps+0.024
 skeletons+0.023
Neg logit contributions
 str-0.022
ah-0.021
ee-0.020
Sure-0.020
itting-0.019
Token"
Token ID1
Feature activation-2.97e-02

Pos logit contributions
keeping+0.022
kins+0.021
 snakes+0.019
bes+0.018
 bush+0.018
Neg logit contributions
 lear-0.025
ah-0.024
 str-0.024
 forward-0.024
oured-0.024
TokenWhe
Token ID10842
Feature activation-1.57e-01

Pos logit contributions
 lear+0.050
 str+0.049
ah+0.049
 forward+0.048
itting+0.048
Neg logit contributions
keeping-0.046
kins-0.041
 snakes-0.037
 iv-0.035
 mice-0.033
Tokene
Token ID68
Feature activation-3.30e-02

Pos logit contributions
 lear+0.264
 str+0.243
 switched+0.233
 spend+0.224
 forward+0.223
Neg logit contributions
kins-0.242
keeping-0.217
 snakes-0.196
 bush-0.187
orns-0.181
Token!"
Token ID2474
Feature activation+3.73e-02

Pos logit contributions
 lear+0.071
 str+0.070
itting+0.070
 forward+0.069
 switched+0.069
Neg logit contributions
keeping-0.036
kins-0.031
 snakes-0.026
bes-0.025
uts-0.023
Token The
Token ID383
Feature activation-4.55e-03

Pos logit contributions
keeping+0.056
kins+0.053
 snakes+0.045
bes+0.041
 traps+0.041
Neg logit contributions
 lear-0.066
 str-0.065
ah-0.065
 forward-0.061
itting-0.061
Token little
Token ID1310
Feature activation-3.41e-03

Pos logit contributions
 str+0.007
ah+0.006
 lear+0.006
 switched+0.006
 farewell+0.006
Neg logit contributions
keeping-0.009
kins-0.008
bes-0.007
 snakes-0.007
uts-0.007
Token bird
Token ID6512
Feature activation-1.45e-02

Pos logit contributions
ah+0.006
 str+0.006
 lear+0.006
 forward+0.005
 switched+0.005
Neg logit contributions
keeping-0.005
kins-0.005
 snakes-0.004
bes-0.004
 skeletons-0.004
Token noise
Token ID7838
Feature activation+1.25e-02

Pos logit contributions
itting+0.032
 lear+0.032
 str+0.031
ah+0.031
oured+0.031
Neg logit contributions
kins-0.018
 snakes-0.018
keeping-0.018
 noise-0.016
 traps-0.016
Token.
Token ID13
Feature activation+7.29e-03

Pos logit contributions
keeping+0.014
kins+0.013
 snakes+0.010
bes+0.009
 traps+0.009
Neg logit contributions
 lear-0.027
 str-0.027
ah-0.026
 forward-0.025
itting-0.025
Token
Token ID198
Feature activation+1.99e-02

Pos logit contributions
keeping+0.012
kins+0.012
 snakes+0.009
 traps+0.009
bes+0.009
Neg logit contributions
 str-0.012
 lear-0.012
ah-0.012
 forward-0.011
 switched-0.011
Token
Token ID198
Feature activation+1.58e-02

Pos logit contributions
keeping+0.034
kins+0.033
 snakes+0.027
 skeletons+0.025
 traps+0.025
Neg logit contributions
 str-0.032
ah-0.031
 lear-0.030
 forward-0.029
itting-0.029
Token"
Token ID1
Feature activation-3.66e-02

Pos logit contributions
keeping+0.022
kins+0.021
bes+0.017
 snakes+0.017
 yourselves+0.016
Neg logit contributions
 lear-0.029
ah-0.029
 str-0.029
oured-0.028
 switched-0.027
TokenWhe
Token ID10842
Feature activation-1.56e-01

Pos logit contributions
 lear+0.059
ah+0.058
 str+0.058
 forward+0.056
itting+0.055
Neg logit contributions
keeping-0.060
kins-0.057
 snakes-0.050
bes-0.046
 iv-0.046
Tokene
Token ID68
Feature activation-3.24e-02

Pos logit contributions
 lear+0.226
 str+0.213
 switched+0.202
 forward+0.192
 spend+0.192
Neg logit contributions
kins-0.285
keeping-0.259
 snakes-0.239
 bush-0.217
 rats-0.214
Token!"
Token ID2474
Feature activation+3.21e-02

Pos logit contributions
 lear+0.064
 str+0.063
itting+0.062
ah+0.062
 forward+0.061
Neg logit contributions
keeping-0.040
kins-0.039
 snakes-0.032
bes-0.030
ls-0.028
Token Anna
Token ID11735
Feature activation-1.75e-02

Pos logit contributions
keeping+0.048
kins+0.046
 snakes+0.039
bes+0.035
 traps+0.035
Neg logit contributions
 lear-0.057
 str-0.056
ah-0.056
itting-0.052
 forward-0.052
Token and
Token ID290
Feature activation+7.93e-03

Pos logit contributions
 lear+0.036
 str+0.036
ah+0.036
itting+0.034
 forward+0.034
Neg logit contributions
keeping-0.021
kins-0.019
 snakes-0.016
bes-0.014
 skeletons-0.014
Token Ben
Token ID3932
Feature activation-1.54e-02

Pos logit contributions
keeping+0.015
kins+0.014
 snakes+0.011
bes+0.011
 traps+0.011
Neg logit contributions
 str-0.012
 lear-0.012
ah-0.012
 forward-0.011
 switched-0.011
Token �
Token ID6184
Feature activation-4.71e-02

Pos logit contributions
ah+0.029
 lear+0.028
 str+0.028
itting+0.027
ee+0.026
Neg logit contributions
keeping-0.021
kins-0.019
 snakes-0.016
uts-0.015
 bush-0.015
Token�
Token ID95
Feature activation-6.41e-02

Pos logit contributions
 lear+0.071
 str+0.069
ah+0.066
 switched+0.066
 forward+0.064
Neg logit contributions
keeping-0.073
kins-0.072
bes-0.064
uts-0.063
 yourselves-0.061
Token€
Token ID26391
Feature activation+1.17e-02

Pos logit contributions
 lear+0.103
 str+0.101
ah+0.097
 forward+0.094
itting+0.093
Neg logit contributions
keeping-0.108
kins-0.096
bes-0.085
 snakes-0.083
 iv-0.079
Token�
Token ID129
Feature activation+2.22e-02

Pos logit contributions
keeping+0.019
kins+0.018
 snakes+0.015
bes+0.014
 traps+0.014
Neg logit contributions
 lear-0.019
 str-0.019
ah-0.019
 forward-0.018
itting-0.018
Token�
Token ID241
Feature activation-4.50e-02

Pos logit contributions
keeping+0.036
kins+0.032
 snakes+0.028
bes+0.028
 traps+0.026
Neg logit contributions
 lear-0.037
 str-0.036
ah-0.035
itting-0.034
 switched-0.034
TokenWhe
Token ID10842
Feature activation-1.56e-01

Pos logit contributions
 lear+0.076
 str+0.076
ah+0.073
 forward+0.070
 switched+0.069
Neg logit contributions
keeping-0.071
kins-0.065
 snakes-0.056
bes-0.052
uts-0.051
Tokenels
Token ID1424
Feature activation+3.73e-02

Pos logit contributions
 lear+0.208
 str+0.185
 spend+0.173
 switched+0.173
 farewell+0.171
Neg logit contributions
kins-0.308
keeping-0.276
 snakes-0.243
bes-0.236
ers-0.235
Token.
Token ID13
Feature activation-3.10e-03

Pos logit contributions
keeping+0.059
kins+0.056
 snakes+0.048
bes+0.044
 traps+0.044
Neg logit contributions
 lear-0.063
 str-0.062
ah-0.061
 forward-0.057
itting-0.057
Token I
Token ID314
Feature activation-3.87e-02

Pos logit contributions
 lear+0.005
 str+0.005
ah+0.005
 forward+0.005
itting+0.005
Neg logit contributions
keeping-0.005
kins-0.004
 snakes-0.004
bes-0.003
 traps-0.003
Tokenâ
Token ID22940
Feature activation-5.47e-02

Pos logit contributions
 lear+0.057
 str+0.055
 forward+0.054
itting+0.053
 farewell+0.053
Neg logit contributions
kins-0.061
keeping-0.061
 snakes-0.050
bes-0.049
 iv-0.048
Token€
Token ID26391
Feature activation+1.30e-02

Pos logit contributions
 lear+0.096
 str+0.095
itting+0.088
 switched+0.087
 forward+0.087
Neg logit contributions
keeping-0.082
kins-0.071
bes-0.058
ers-0.057
 situations-0.055
Token laughs
Token ID22051
Feature activation-5.70e-03

Pos logit contributions
 lear+0.019
ah+0.019
itting+0.018
 forward+0.018
 str+0.018
Neg logit contributions
kins-0.012
keeping-0.012
 snakes-0.010
ls-0.009
bes-0.009
Token and
Token ID290
Feature activation+6.73e-03

Pos logit contributions
 lear+0.011
ah+0.011
 str+0.011
itting+0.010
 switched+0.010
Neg logit contributions
keeping-0.008
kins-0.007
 snakes-0.006
 skeletons-0.006
 traps-0.006
Token shouts
Token ID34757
Feature activation-1.49e-02

Pos logit contributions
keeping+0.009
kins+0.009
 snakes+0.007
bes+0.007
 traps+0.007
Neg logit contributions
 lear-0.013
ah-0.013
 str-0.013
itting-0.012
 forward-0.012
Token,
Token ID11
Feature activation-5.06e-03

Pos logit contributions
ah+0.032
 str+0.031
 lear+0.030
 switched+0.030
itting+0.030
Neg logit contributions
kins-0.014
keeping-0.013
 snakes-0.012
uts-0.011
ls-0.011
Token "
Token ID366
Feature activation-3.91e-02

Pos logit contributions
 lear+0.009
ah+0.009
 str+0.009
itting+0.008
 forward+0.008
Neg logit contributions
keeping-0.007
kins-0.007
 snakes-0.006
bes-0.006
 traps-0.005
TokenWhe
Token ID10842
Feature activation-1.56e-01

Pos logit contributions
 lear+0.054
 str+0.053
 forward+0.053
ah+0.051
itting+0.051
Neg logit contributions
keeping-0.071
kins-0.063
 snakes-0.057
bes-0.054
orns-0.053
Tokene
Token ID68
Feature activation-2.77e-02

Pos logit contributions
 lear+0.237
 str+0.216
 switched+0.207
 farewell+0.201
 spend+0.201
Neg logit contributions
kins-0.275
keeping-0.252
 snakes-0.226
 bush-0.208
bes-0.206
Token!
Token ID0
Feature activation-4.59e-02

Pos logit contributions
 lear+0.055
 str+0.055
itting+0.054
 forward+0.053
 switched+0.053
Neg logit contributions
keeping-0.033
kins-0.031
 snakes-0.026
bes-0.025
uts-0.024
Token This
Token ID770
Feature activation-2.15e-02

Pos logit contributions
 str+0.092
ah+0.090
 lear+0.089
itting+0.086
 forward+0.086
Neg logit contributions
keeping-0.054
kins-0.042
 snakes-0.042
uts-0.039
 yourselves-0.036
Token is
Token ID318
Feature activation-8.47e-03

Pos logit contributions
 lear+0.049
 str+0.048
 forward+0.046
ah+0.046
itting+0.046
Neg logit contributions
keeping-0.019
kins-0.018
 snakes-0.015
 bush-0.013
 floor-0.012
Token fun
Token ID1257
Feature activation+8.29e-03

Pos logit contributions
 lear+0.016
 str+0.015
ah+0.015
itting+0.015
 forward+0.014
Neg logit contributions
keeping-0.011
kins-0.011
 snakes-0.011
 skeletons-0.009
 traps-0.009
Token fast
Token ID3049
Feature activation-7.17e-03

Pos logit contributions
keeping+0.020
kins+0.019
 snakes+0.016
ls+0.015
 traps+0.015
Neg logit contributions
 lear-0.023
ah-0.023
 str-0.023
itting-0.021
 spend-0.020
Token that
Token ID326
Feature activation-4.18e-02

Pos logit contributions
 lear+0.012
 str+0.012
ah+0.011
 forward+0.011
itting+0.010
Neg logit contributions
keeping-0.012
kins-0.011
 snakes-0.010
bes-0.009
 traps-0.009
Token he
Token ID339
Feature activation-1.93e-02

Pos logit contributions
 lear+0.078
 str+0.077
ah+0.077
itting+0.074
 switched+0.071
Neg logit contributions
keeping-0.058
kins-0.054
 snakes-0.046
 traps-0.042
 skeletons-0.042
Token started
Token ID2067
Feature activation-5.82e-02

Pos logit contributions
itting+0.028
 lear+0.027
ah+0.026
 forward+0.025
 str+0.024
Neg logit contributions
keeping-0.034
kins-0.033
 snakes-0.029
 traps-0.027
bes-0.027
Token to
Token ID284
Feature activation-2.69e-02

Pos logit contributions
ah+0.098
 lear+0.093
itting+0.091
 str+0.088
packing+0.087
Neg logit contributions
keeping-0.093
kins-0.078
 snakes-0.076
 traps-0.069
uts-0.069
Token blur
Token ID23671
Feature activation-1.56e-01

Pos logit contributions
ah+0.040
 lear+0.040
itting+0.040
 forward+0.039
 str+0.038
Neg logit contributions
keeping-0.046
kins-0.041
 snakes-0.039
 traps-0.036
 bush-0.035
Token.
Token ID13
Feature activation-2.14e-02

Pos logit contributions
 lear+0.286
 str+0.285
ah+0.280
itting+0.265
 switched+0.264
Neg logit contributions
keeping-0.225
kins-0.205
 snakes-0.175
 traps-0.157
 skeletons-0.157
Token He
Token ID679
Feature activation-2.48e-02

Pos logit contributions
 str+0.042
 lear+0.042
ah+0.041
itting+0.039
 forward+0.039
Neg logit contributions
keeping-0.028
kins-0.025
 snakes-0.021
 skeletons-0.019
bes-0.019
Token was
Token ID373
Feature activation-1.18e-02

Pos logit contributions
itting+0.042
 lear+0.040
 forward+0.038
 Enjoy+0.038
ah+0.037
Neg logit contributions
keeping-0.034
kins-0.033
 snakes-0.031
 floor-0.029
ls-0.028
Token running
Token ID2491
Feature activation-3.28e-02

Pos logit contributions
itting+0.024
 lear+0.024
ah+0.024
by+0.023
packing+0.023
Neg logit contributions
keeping-0.012
 snakes-0.010
kins-0.010
 noise-0.009
 trouble-0.009
Token so
Token ID523
Feature activation+2.26e-02

Pos logit contributions
itting+0.060
 str+0.060
ah+0.058
 lear+0.058
 spend+0.058
Neg logit contributions
keeping-0.041
 snakes-0.035
kins-0.034
ls-0.032
 bush-0.032
Token loved
Token ID6151
Feature activation-2.91e-02

Pos logit contributions
 lear+0.018
ah+0.017
 forward+0.017
itting+0.017
 Enjoy+0.016
Neg logit contributions
kins-0.014
keeping-0.013
bes-0.011
 floor-0.011
 snakes-0.011
Token the
Token ID262
Feature activation+9.16e-03

Pos logit contributions
ah+0.045
 lear+0.044
 str+0.043
itting+0.041
 forward+0.041
Neg logit contributions
keeping-0.050
kins-0.046
 snakes-0.042
 skeletons-0.039
 traps-0.039
Token sound
Token ID2128
Feature activation+1.94e-03

Pos logit contributions
keeping+0.011
kins+0.011
 noise+0.009
 snakes+0.009
 floor+0.009
Neg logit contributions
ah-0.017
 lear-0.017
 str-0.016
itting-0.016
 forward-0.016
Token of
Token ID286
Feature activation+1.41e-02

Pos logit contributions
keeping+0.003
kins+0.003
 snakes+0.002
 traps+0.002
 skeletons+0.002
Neg logit contributions
 lear-0.004
 str-0.004
ah-0.003
itting-0.003
 switched-0.003
Token the
Token ID262
Feature activation+7.67e-03

Pos logit contributions
keeping+0.023
kins+0.021
 snakes+0.019
 traps+0.017
 skeletons+0.017
Neg logit contributions
ah-0.024
 lear-0.023
 str-0.023
itting-0.022
 forward-0.021
Token sax
Token ID46909
Feature activation-1.56e-01

Pos logit contributions
keeping+0.010
kins+0.009
 snakes+0.008
 floor+0.007
 noise+0.007
Neg logit contributions
 lear-0.015
ah-0.015
 str-0.014
itting-0.013
 forward-0.013
Tokenophone
Token ID38656
Feature activation-1.20e-01

Pos logit contributions
 lear+0.172
ah+0.151
 str+0.148
 spend+0.144
 farewell+0.144
Neg logit contributions
kins-0.322
keeping-0.306
uts-0.280
bes-0.272
es-0.270
Token and
Token ID290
Feature activation-9.40e-03

Pos logit contributions
ah+0.237
 str+0.236
 lear+0.236
itting+0.224
 switched+0.221
Neg logit contributions
keeping-0.146
kins-0.134
 snakes-0.115
 skeletons-0.102
 traps-0.101
Token the
Token ID262
Feature activation+7.03e-03

Pos logit contributions
ah+0.013
 lear+0.013
 str+0.013
itting+0.012
 forward+0.012
Neg logit contributions
keeping-0.018
kins-0.017
 snakes-0.015
 traps-0.014
 skeletons-0.014
Token drums
Token ID24730
Feature activation-5.57e-02

Pos logit contributions
keeping+0.010
kins+0.009
 snakes+0.008
 floor+0.007
 traps+0.007
Neg logit contributions
ah-0.013
 lear-0.013
 str-0.012
itting-0.012
 forward-0.012
Token.
Token ID13
Feature activation-9.80e-04

Pos logit contributions
 lear+0.111
ah+0.109
 str+0.105
itting+0.103
ee+0.100
Neg logit contributions
keeping-0.065
kins-0.062
 snakes-0.050
uts-0.047
 floor-0.046
Token Everything
Token ID11391
Feature activation-8.11e-02

Pos logit contributions
 lear+0.035
 str+0.035
ah+0.035
itting+0.032
 forward+0.032
Neg logit contributions
keeping-0.026
kins-0.023
 snakes-0.020
bes-0.018
 skeletons-0.018
Token around
Token ID1088
Feature activation-6.39e-02

Pos logit contributions
 lear+0.144
ah+0.137
 str+0.133
itting+0.132
 spend+0.126
Neg logit contributions
kins-0.115
keeping-0.113
 snakes-0.102
 iv-0.094
ls-0.093
Token them
Token ID606
Feature activation+1.27e-02

Pos logit contributions
 lear+0.100
 str+0.099
ah+0.097
itting+0.091
 forward+0.090
Neg logit contributions
keeping-0.109
kins-0.104
 snakes-0.090
 traps-0.083
 skeletons-0.083
Token was
Token ID373
Feature activation-1.79e-02

Pos logit contributions
keeping+0.022
kins+0.021
 snakes+0.019
 traps+0.017
 iv+0.017
Neg logit contributions
 lear-0.019
ah-0.019
 str-0.018
itting-0.017
 forward-0.017
Token a
Token ID257
Feature activation-3.61e-02

Pos logit contributions
 lear+0.030
ah+0.029
 str+0.028
itting+0.027
ee+0.026
Neg logit contributions
keeping-0.028
kins-0.027
 snakes-0.024
 traps-0.022
 iv-0.022
Token blur
Token ID23671
Feature activation-1.54e-01

Pos logit contributions
itting+0.063
oured+0.061
eal+0.060
ipp+0.059
ah+0.059
Neg logit contributions
 bush-0.043
kins-0.043
 snakes-0.043
keeping-0.042
 floor-0.040
Token,
Token ID11
Feature activation-3.09e-02

Pos logit contributions
 str+0.296
 lear+0.292
ah+0.287
itting+0.279
 switched+0.277
Neg logit contributions
keeping-0.196
kins-0.182
 snakes-0.159
 traps-0.138
ls-0.137
Token it
Token ID340
Feature activation-1.78e-02

Pos logit contributions
ah+0.056
 lear+0.056
 str+0.054
itting+0.052
 forward+0.050
Neg logit contributions
keeping-0.044
kins-0.040
 snakes-0.036
 traps-0.033
bes-0.032
Token was
Token ID373
Feature activation-1.36e-02

Pos logit contributions
 lear+0.031
ah+0.029
 str+0.028
itting+0.028
 forward+0.028
Neg logit contributions
kins-0.026
keeping-0.025
 snakes-0.023
 iv-0.021
 traps-0.020
Token like
Token ID588
Feature activation-6.51e-02

Pos logit contributions
 lear+0.023
ah+0.022
itting+0.021
 str+0.021
ee+0.021
Neg logit contributions
keeping-0.019
kins-0.019
 snakes-0.018
 traps-0.016
 iv-0.016
Token the
Token ID262
Feature activation-1.28e-02

Pos logit contributions
 lear+0.105
ah+0.105
 str+0.102
itting+0.098
 forward+0.095
Neg logit contributions
keeping-0.106
kins-0.099
 snakes-0.092
 traps-0.084
 skeletons-0.083
Token its
Token ID663
Feature activation-6.08e-03

Pos logit contributions
 lear+0.023
ah+0.023
 str+0.022
itting+0.022
 switched+0.021
Neg logit contributions
keeping-0.020
kins-0.018
 snakes-0.016
 traps-0.015
 skeletons-0.015
Token top
Token ID1353
Feature activation+6.11e-03

Pos logit contributions
 lear+0.008
 str+0.008
ah+0.008
itting+0.008
 forward+0.007
Neg logit contributions
keeping-0.011
kins-0.011
 snakes-0.010
 traps-0.009
 skeletons-0.009
Token speed
Token ID2866
Feature activation-5.28e-02

Pos logit contributions
keeping+0.008
kins+0.007
 snakes+0.006
 traps+0.005
bes+0.005
Neg logit contributions
 lear-0.012
 str-0.012
ah-0.012
itting-0.011
 forward-0.011
Token!
Token ID0
Feature activation-5.44e-02

Pos logit contributions
 lear+0.102
 str+0.101
ah+0.098
 switched+0.094
itting+0.094
Neg logit contributions
keeping-0.071
kins-0.063
 snakes-0.052
uts-0.047
ls-0.047
Token "
Token ID366
Feature activation-4.58e-02

Pos logit contributions
 lear+0.092
 str+0.092
ah+0.090
 forward+0.084
itting+0.084
Neg logit contributions
keeping-0.087
kins-0.082
 snakes-0.070
bes-0.064
 traps-0.064
TokenWhe
Token ID10842
Feature activation-1.54e-01

Pos logit contributions
by+0.080
ur+0.080
 spend+0.078
 lear+0.078
oured+0.076
Neg logit contributions
keeping-0.056
 yourselves-0.037
kins-0.037
 myself-0.036
 yourself-0.036
Tokene
Token ID68
Feature activation-3.01e-02

Pos logit contributions
 lear+0.209
 str+0.176
 switched+0.166
 forward+0.164
 farewell+0.162
Neg logit contributions
kins-0.301
keeping-0.270
 snakes-0.243
oon-0.242
ers-0.240
Token,
Token ID11
Feature activation-2.11e-02

Pos logit contributions
itting+0.069
 lear+0.068
 str+0.068
 switched+0.067
ah+0.066
Neg logit contributions
keeping-0.027
kins-0.024
bes-0.019
uts-0.018
 snakes-0.017
Token this
Token ID428
Feature activation+1.22e-02

Pos logit contributions
itting+0.037
ah+0.036
 lear+0.035
ashed+0.035
oured+0.034
Neg logit contributions
keeping-0.027
kins-0.025
 snakes-0.024
 bush-0.024
 noise-0.023
Token is
Token ID318
Feature activation-1.34e-02

Pos logit contributions
keeping+0.016
kins+0.016
 snakes+0.013
 floor+0.013
 hurts+0.013
Neg logit contributions
 lear-0.022
 str-0.021
itting-0.020
 forward-0.019
 switched-0.019
Token fun
Token ID1257
Feature activation+3.50e-03

Pos logit contributions
 lear+0.029
itting+0.026
ah+0.026
 str+0.025
ashed+0.025
Neg logit contributions
keeping-0.013
kins-0.012
 snakes-0.011
 happen-0.011
 noise-0.011
Token �
Token ID6184
Feature activation-3.89e-02

Pos logit contributions
ah+0.034
 lear+0.031
itting+0.031
 str+0.031
ee+0.031
Neg logit contributions
keeping-0.023
kins-0.022
bes-0.017
uts-0.016
 snakes-0.016
Token�
Token ID95
Feature activation-5.62e-02

Pos logit contributions
by+0.057
 lear+0.057
 str+0.057
ah+0.057
 forward+0.054
Neg logit contributions
kins-0.057
bes-0.055
keeping-0.054
uts-0.053
iders-0.052
Token€
Token ID26391
Feature activation+1.67e-02

Pos logit contributions
 lear+0.087
 str+0.087
by+0.082
ah+0.082
 forward+0.080
Neg logit contributions
keeping-0.095
kins-0.082
bes-0.080
ers-0.074
 snakes-0.072
Token�
Token ID129
Feature activation+2.86e-02

Pos logit contributions
keeping+0.027
kins+0.025
 snakes+0.022
bes+0.020
 traps+0.020
Neg logit contributions
 lear-0.028
 str-0.027
ah-0.027
 forward-0.025
 switched-0.025
Token�
Token ID241
Feature activation-4.15e-02

Pos logit contributions
keeping+0.044
kins+0.039
bes+0.036
 snakes+0.035
uts+0.033
Neg logit contributions
 lear-0.049
ah-0.047
 str-0.046
itting-0.046
 switched-0.044
TokenWhe
Token ID10842
Feature activation-1.53e-01

Pos logit contributions
 lear+0.062
 str+0.062
ah+0.059
 forward+0.058
itting+0.058
Neg logit contributions
keeping-0.071
kins-0.060
uts-0.053
 snakes-0.052
bes-0.050
Tokenee
Token ID1453
Feature activation-1.28e-01

Pos logit contributions
 lear+0.250
 str+0.215
 spend+0.212
 switched+0.209
 farewell+0.207
Neg logit contributions
kins-0.261
keeping-0.210
ers-0.194
es-0.193
 snakes-0.188
Token!
Token ID0
Feature activation-5.36e-02

Pos logit contributions
 lear+0.230
 str+0.225
itting+0.216
ah+0.210
 switched+0.210
Neg logit contributions
kins-0.184
keeping-0.183
bes-0.144
 snakes-0.139
ers-0.137
Tokenâ
Token ID22940
Feature activation-5.05e-02

Pos logit contributions
 lear+0.099
 str+0.099
ah+0.099
itting+0.092
 switched+0.091
Neg logit contributions
keeping-0.075
kins-0.067
 snakes-0.057
 skeletons-0.052
bes-0.051
Token€
Token ID26391
Feature activation+1.52e-02

Pos logit contributions
 lear+0.086
 str+0.083
 forward+0.077
itting+0.076
ah+0.076
Neg logit contributions
keeping-0.081
kins-0.073
bes-0.062
ers-0.059
 snakes-0.058
Token His
Token ID2399
Feature activation+3.69e-03

Pos logit contributions
keeping+0.023
kins+0.021
 snakes+0.018
bes+0.017
 traps+0.016
Neg logit contributions
 lear-0.028
 str-0.027
ah-0.027
 forward-0.025
itting-0.025
Token laughed
Token ID13818
Feature activation-6.60e-03

Pos logit contributions
itting+0.014
 lear+0.014
 forward+0.013
ah+0.013
 farewell+0.013
Neg logit contributions
kins-0.011
keeping-0.011
 snakes-0.010
 floor-0.009
ls-0.009
Token and
Token ID290
Feature activation-6.30e-03

Pos logit contributions
ah+0.015
 lear+0.015
 switched+0.014
itting+0.014
ee+0.014
Neg logit contributions
kins-0.005
keeping-0.005
 snakes-0.004
uts-0.004
 noise-0.004
Token shouted
Token ID17293
Feature activation+1.20e-02

Pos logit contributions
 lear+0.010
itting+0.010
ah+0.009
 forward+0.009
 str+0.009
Neg logit contributions
keeping-0.011
kins-0.010
 snakes-0.009
bes-0.008
 skeletons-0.008
Token,
Token ID11
Feature activation-7.59e-03

Pos logit contributions
kins+0.013
keeping+0.012
 snakes+0.011
uts+0.011
 mice+0.010
Neg logit contributions
ah-0.025
 str-0.024
 switched-0.023
 lear-0.023
packing-0.023
Token "
Token ID366
Feature activation-3.25e-02

Pos logit contributions
ah+0.014
 lear+0.014
 str+0.014
itting+0.013
 switched+0.013
Neg logit contributions
keeping-0.010
kins-0.010
 snakes-0.009
bes-0.008
 traps-0.007
TokenWhe
Token ID10842
Feature activation-1.53e-01

Pos logit contributions
 lear+0.038
 str+0.038
itting+0.037
 forward+0.036
ah+0.036
Neg logit contributions
keeping-0.063
kins-0.056
 snakes-0.053
bes-0.051
orns-0.050
Tokene
Token ID68
Feature activation-2.79e-02

Pos logit contributions
 lear+0.232
 str+0.207
 switched+0.202
 spend+0.194
 farewell+0.193
Neg logit contributions
kins-0.265
keeping-0.241
 snakes-0.218
orns-0.209
 bush-0.207
Token!"
Token ID2474
Feature activation+8.77e-03

Pos logit contributions
 lear+0.061
itting+0.060
 str+0.060
 switched+0.059
 forward+0.059
Neg logit contributions
keeping-0.027
kins-0.024
 snakes-0.020
bes-0.020
uts-0.020
Token But
Token ID887
Feature activation-1.58e-02

Pos logit contributions
keeping+0.014
kins+0.014
 snakes+0.012
 traps+0.011
 skeletons+0.011
Neg logit contributions
 lear-0.015
 str-0.014
ah-0.014
 forward-0.013
 switched-0.013
Token when
Token ID618
Feature activation+9.23e-03

Pos logit contributions
 lear+0.024
 str+0.024
ah+0.024
itting+0.023
 switched+0.022
Neg logit contributions
keeping-0.026
kins-0.025
 snakes-0.023
 rats-0.021
 skeletons-0.021
Token she
Token ID673
Feature activation+1.80e-02

Pos logit contributions
keeping+0.015
kins+0.014
 snakes+0.012
bes+0.011
 skeletons+0.011
Neg logit contributions
 str-0.016
 lear-0.015
ah-0.015
 forward-0.014
 switched-0.014
Tokenas
Token ID292
Feature activation-2.45e-03

Pos logit contributions
 lear+0.038
 str+0.038
ah+0.037
 forward+0.034
 switched+0.034
Neg logit contributions
keeping-0.052
kins-0.050
 snakes-0.044
bes-0.041
 traps-0.041
Token,
Token ID11
Feature activation+1.39e-02

Pos logit contributions
 lear+0.004
 str+0.004
ah+0.004
 forward+0.004
itting+0.004
Neg logit contributions
keeping-0.004
kins-0.004
 snakes-0.003
bes-0.003
 traps-0.003
Token and
Token ID290
Feature activation+2.25e-02

Pos logit contributions
keeping+0.027
kins+0.025
 situations+0.020
bes+0.020
uts+0.020
Neg logit contributions
ah-0.020
 lear-0.020
 str-0.020
 spend-0.017
 switched-0.017
Token gir
Token ID37370
Feature activation+4.24e-02

Pos logit contributions
keeping+0.039
kins+0.037
bes+0.029
iders+0.028
uts+0.028
Neg logit contributions
 lear-0.038
 str-0.037
ah-0.036
 spend-0.034
 switched-0.033
Tokenaff
Token ID2001
Feature activation+1.58e-02

Pos logit contributions
keeping+0.076
kins+0.072
 snakes+0.057
bes+0.057
 iv+0.055
Neg logit contributions
ah-0.066
 lear-0.066
 str-0.064
itting-0.058
 forward-0.058
Tokenes
Token ID274
Feature activation+1.55e-01

Pos logit contributions
keeping+0.016
kins+0.012
 situations+0.009
 floor+0.009
bes+0.008
Neg logit contributions
ah-0.038
 lear-0.036
 str-0.036
itting-0.035
ee-0.034
Token.
Token ID13
Feature activation+2.70e-02

Pos logit contributions
keeping+0.231
kins+0.211
bes+0.163
 snakes+0.161
 situations+0.160
Neg logit contributions
 lear-0.282
 str-0.278
ah-0.278
 forward-0.261
 spend-0.245
Token Anna
Token ID11735
Feature activation+3.25e-04

Pos logit contributions
keeping+0.049
kins+0.044
 snakes+0.036
 traps+0.035
bes+0.035
Neg logit contributions
 str-0.043
 lear-0.042
 forward-0.041
ah-0.040
 spend-0.039
Token and
Token ID290
Feature activation+2.17e-02

Pos logit contributions
keeping+0.001
kins+0.001
 snakes+0.000
bes+0.000
 skeletons+0.000
Neg logit contributions
 lear-0.001
 str-0.001
ah-0.001
 forward-0.000
itting-0.000
Token Ben
Token ID3932
Feature activation+5.56e-03

Pos logit contributions
keeping+0.032
kins+0.031
 snakes+0.026
 skeletons+0.024
 traps+0.024
Neg logit contributions
 lear-0.038
 str-0.038
ah-0.038
itting-0.036
 switched-0.035
Token were
Token ID547
Feature activation+3.39e-02

Pos logit contributions
kins+0.008
keeping+0.008
 snakes+0.007
 bush+0.007
 skeletons+0.006
Neg logit contributions
 lear-0.009
itting-0.009
 str-0.009
 forward-0.009
ah-0.009
Token you
Token ID345
Feature activation-1.31e-02

Pos logit contributions
keeping+0.127
kins+0.112
 snakes+0.096
 iv+0.091
 traps+0.089
Neg logit contributions
 str-0.131
 lear-0.127
ah-0.126
 forward-0.118
 spend-0.117
Token want
Token ID765
Feature activation-2.32e-02

Pos logit contributions
 lear+0.019
itting+0.018
 str+0.017
ah+0.017
 farewell+0.017
Neg logit contributions
kins-0.022
keeping-0.022
 snakes-0.018
 hurts-0.017
 trouble-0.017
Token to
Token ID284
Feature activation-3.04e-03

Pos logit contributions
 lear+0.031
 str+0.031
ah+0.031
 forward+0.028
itting+0.028
Neg logit contributions
keeping-0.045
kins-0.042
 snakes-0.038
 traps-0.035
 skeletons-0.035
Token trade
Token ID3292
Feature activation+1.58e-02

Pos logit contributions
 lear+0.004
ah+0.004
itting+0.004
 str+0.004
 forward+0.003
Neg logit contributions
keeping-0.006
kins-0.006
 snakes-0.005
bes-0.005
 traps-0.005
Token nap
Token ID25422
Feature activation+1.08e-02

Pos logit contributions
keeping+0.031
kins+0.028
 snakes+0.024
bes+0.024
 situations+0.023
Neg logit contributions
 str-0.022
ah-0.021
 lear-0.021
 forward-0.020
ee-0.018
Tokenkins
Token ID5331
Feature activation+2.25e-01

Pos logit contributions
keeping+0.021
 snakes+0.017
 traps+0.016
 rats+0.015
uts+0.015
Neg logit contributions
 str-0.015
 forward-0.014
ah-0.013
ee-0.013
 lear-0.013
Token with
Token ID351
Feature activation+2.43e-02

Pos logit contributions
keeping+0.337
kins+0.302
 snakes+0.263
bes+0.243
 traps+0.240
Neg logit contributions
 str-0.415
ah-0.408
 lear-0.404
 forward-0.388
itting-0.373
Token me
Token ID502
Feature activation-4.97e-02

Pos logit contributions
keeping+0.044
kins+0.042
 snakes+0.036
bes+0.034
 traps+0.034
Neg logit contributions
 str-0.036
 lear-0.036
ah-0.035
 forward-0.032
 switched-0.032
Token?"
Token ID1701
Feature activation+5.07e-02

Pos logit contributions
 str+0.101
ah+0.100
 lear+0.098
 forward+0.094
itting+0.091
Neg logit contributions
keeping-0.066
kins-0.058
 snakes-0.047
bes-0.044
 traps-0.043
Token she
Token ID673
Feature activation+2.73e-02

Pos logit contributions
keeping+0.121
kins+0.110
bes+0.094
iders+0.092
orns+0.090
Neg logit contributions
 str-0.061
 lear-0.060
Sure-0.053
 forward-0.053
 spend-0.053
Token offered
Token ID4438
Feature activation+3.24e-02

Pos logit contributions
keeping+0.064
kins+0.056
orns+0.051
 snakes+0.050
iders+0.049
Neg logit contributions
 str-0.036
ah-0.030
 lear-0.030
 switched-0.029
 pract-0.027
Token Don
Token ID2094
Feature activation-5.39e-04

Pos logit contributions
keeping+0.012
kins+0.011
 snakes+0.010
 skeletons+0.009
 traps+0.009
Neg logit contributions
 lear-0.014
 str-0.014
ah-0.013
 forward-0.013
itting-0.012
Token't
Token ID470
Feature activation+7.47e-02

Pos logit contributions
 str+0.001
ah+0.001
 spend+0.001
 lear+0.001
 switched+0.001
Neg logit contributions
keeping-0.001
kins-0.001
bes-0.001
 snakes-0.001
 skeletons-0.001
Token hurt
Token ID5938
Feature activation+3.22e-02

Pos logit contributions
keeping+0.155
kins+0.147
 snakes+0.133
 traps+0.124
 skeletons+0.124
Neg logit contributions
 lear-0.091
 str-0.088
ah-0.087
 forward-0.080
itting-0.080
Token the
Token ID262
Feature activation+1.83e-02

Pos logit contributions
keeping+0.059
kins+0.057
bes+0.049
 snakes+0.045
 iv+0.045
Neg logit contributions
 str-0.045
 lear-0.043
ah-0.043
 forward-0.042
 switched-0.039
Token pump
Token ID8901
Feature activation+6.22e-03

Pos logit contributions
keeping+0.028
kins+0.026
 snakes+0.022
bes+0.021
 traps+0.020
Neg logit contributions
 lear-0.032
 str-0.032
ah-0.031
 forward-0.029
itting-0.029
Tokenkins
Token ID5331
Feature activation+2.26e-01

Pos logit contributions
keeping+0.005
 iv+0.004
 skeletons+0.004
bes+0.004
 snakes+0.003
Neg logit contributions
ah-0.014
 forward-0.014
 str-0.014
 wait-0.013
ee-0.013
Token."
Token ID526
Feature activation+2.21e-02

Pos logit contributions
keeping+0.343
kins+0.291
 snakes+0.263
 skeletons+0.255
bes+0.253
Neg logit contributions
 str-0.409
 forward-0.390
ah-0.377
 wait-0.374
 lear-0.370
Token
Token ID220
Feature activation+3.15e-03

Pos logit contributions
kins+0.045
keeping+0.044
 snakes+0.037
 traps+0.037
 skeletons+0.036
Neg logit contributions
 lear-0.027
 str-0.026
ah-0.025
 switched-0.025
 spend-0.024
Token
Token ID198
Feature activation+2.13e-02

Pos logit contributions
keeping+0.006
kins+0.006
 snakes+0.005
 skeletons+0.005
 traps+0.005
Neg logit contributions
itting-0.004
ipp-0.004
 forward-0.004
 str-0.004
ah-0.004
Token
Token ID198
Feature activation+1.89e-02

Pos logit contributions
keeping+0.039
kins+0.038
 skeletons+0.033
 snakes+0.033
 traps+0.033
Neg logit contributions
 str-0.031
Sure-0.030
ah-0.029
 lear-0.028
 switched-0.028
TokenTim
Token ID14967
Feature activation+2.25e-02

Pos logit contributions
keeping+0.023
kins+0.022
 snakes+0.018
bes+0.016
 traps+0.015
Neg logit contributions
 lear-0.039
 str-0.038
ah-0.038
 forward-0.036
itting-0.036
Token to
Token ID284
Feature activation-1.30e-02

Pos logit contributions
keeping+0.017
kins+0.016
 snakes+0.014
 traps+0.013
 skeletons+0.013
Neg logit contributions
 lear-0.012
 str-0.012
ah-0.011
 forward-0.010
itting-0.010
Token help
Token ID1037
Feature activation+4.15e-02

Pos logit contributions
ah+0.018
 lear+0.018
 str+0.016
itting+0.016
 forward+0.016
Neg logit contributions
keeping-0.025
kins-0.024
 snakes-0.020
bes-0.019
 bush-0.019
Token fold
Token ID5591
Feature activation-4.38e-02

Pos logit contributions
keeping+0.074
kins+0.069
 snakes+0.062
 skeletons+0.059
 traps+0.057
Neg logit contributions
 str-0.064
 lear-0.061
ah-0.058
 switched-0.057
 spend-0.056
Token the
Token ID262
Feature activation-1.23e-03

Pos logit contributions
 str+0.065
 lear+0.063
ah+0.061
 forward+0.059
 switched+0.057
Neg logit contributions
keeping-0.080
kins-0.073
 snakes-0.065
 skeletons-0.061
 traps-0.061
Token nap
Token ID25422
Feature activation-4.24e-03

Pos logit contributions
 lear+0.002
ah+0.002
 str+0.002
itting+0.002
 forward+0.002
Neg logit contributions
keeping-0.002
kins-0.002
 snakes-0.001
 traps-0.001
 skeletons-0.001
Tokenkins
Token ID5331
Feature activation+2.14e-01

Pos logit contributions
 str+0.008
 forward+0.008
ah+0.007
ee+0.007
 lear+0.007
Neg logit contributions
keeping-0.006
 snakes-0.004
 skeletons-0.004
 traps-0.004
orns-0.004
Token for
Token ID329
Feature activation+9.01e-03

Pos logit contributions
keeping+0.357
kins+0.316
 snakes+0.298
 skeletons+0.274
 traps+0.272
Neg logit contributions
 str-0.347
 lear-0.332
 forward-0.332
ah-0.326
itting-0.315
Token dinner
Token ID8073
Feature activation+4.16e-02

Pos logit contributions
keeping+0.017
kins+0.016
 snakes+0.014
 skeletons+0.013
bes+0.013
Neg logit contributions
 str-0.013
 lear-0.013
ah-0.012
 switched-0.011
 forward-0.011
Token.
Token ID13
Feature activation+9.36e-03

Pos logit contributions
keeping+0.060
kins+0.052
 snakes+0.051
 skeletons+0.046
 traps+0.046
Neg logit contributions
 str-0.074
 lear-0.072
 forward-0.071
ah-0.070
 wait-0.067
Token Lily
Token ID20037
Feature activation+1.39e-03

Pos logit contributions
keeping+0.014
kins+0.012
 snakes+0.010
bes+0.010
 iv+0.009
Neg logit contributions
ah-0.017
 lear-0.017
 str-0.017
 forward-0.016
itting-0.016
Token was
Token ID373
Feature activation+1.95e-03

Pos logit contributions
kins+0.002
keeping+0.002
bes+0.002
 floor+0.002
 bush+0.002
Neg logit contributions
 forward-0.002
 lear-0.002
ah-0.002
itting-0.002
 farewell-0.002
Token honey
Token ID12498
Feature activation+3.34e-02

Pos logit contributions
keeping+0.024
kins+0.022
 snakes+0.019
 skeletons+0.018
 traps+0.017
Neg logit contributions
 str-0.026
 lear-0.025
ah-0.024
 switched-0.023
 forward-0.023
Token off
Token ID572
Feature activation-2.96e-02

Pos logit contributions
keeping+0.048
kins+0.042
 skeletons+0.038
 snakes+0.038
 rats+0.035
Neg logit contributions
 str-0.062
 lear-0.060
 forward-0.058
ah-0.057
 wait-0.055
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
 str+0.051
 lear+0.050
ah+0.048
 forward+0.047
 switched+0.045
Neg logit contributions
keeping-0.047
kins-0.043
 snakes-0.037
 skeletons-0.035
bes-0.035
Token She
Token ID1375
Feature activation-5.33e-03

Pos logit contributions
keeping+0.023
kins+0.022
 snakes+0.018
 traps+0.017
 skeletons+0.017
Neg logit contributions
 lear-0.025
 str-0.025
ah-0.023
 forward-0.023
itting-0.023
Token dri
Token ID1454
Feature activation-2.28e-02

Pos logit contributions
 lear+0.010
ah+0.010
itting+0.010
 forward+0.010
 str+0.010
Neg logit contributions
kins-0.007
keeping-0.006
 snakes-0.006
bes-0.005
ls-0.005
Tokenes
Token ID274
Feature activation+1.49e-01

Pos logit contributions
 str+0.048
ah+0.048
 lear+0.047
 switched+0.045
ee+0.045
Neg logit contributions
keeping-0.027
kins-0.022
 snakes-0.020
 skeletons-0.019
 traps-0.019
Token them
Token ID606
Feature activation+4.19e-02

Pos logit contributions
keeping+0.234
kins+0.224
 snakes+0.196
bes+0.178
 traps+0.176
Neg logit contributions
 lear-0.249
ah-0.247
 str-0.241
itting-0.237
 switched-0.224
Token with
Token ID351
Feature activation+2.91e-02

Pos logit contributions
keeping+0.053
kins+0.051
 snakes+0.040
bes+0.039
 floor+0.037
Neg logit contributions
ah-0.086
 lear-0.085
itting-0.080
 str-0.079
 spend-0.076
Token a
Token ID257
Feature activation-9.18e-03

Pos logit contributions
keeping+0.046
kins+0.045
 snakes+0.039
 traps+0.036
 mice+0.035
Neg logit contributions
 lear-0.048
ah-0.048
 str-0.047
itting-0.045
 forward-0.044
Token towel
Token ID24808
Feature activation-1.43e-02

Pos logit contributions
 lear+0.016
ah+0.016
 str+0.015
itting+0.015
 forward+0.015
Neg logit contributions
kins-0.013
keeping-0.012
 bush-0.010
 snakes-0.010
ls-0.010
Token and
Token ID290
Feature activation+1.08e-02

Pos logit contributions
 lear+0.026
 str+0.026
ah+0.026
 forward+0.024
itting+0.024
Neg logit contributions
keeping-0.021
kins-0.020
 snakes-0.016
bes-0.015
 skeletons-0.015
Token a
Token ID257
Feature activation-9.86e-03

Pos logit contributions
 str+0.010
ah+0.008
 spend+0.008
 lear+0.008
 switched+0.007
Neg logit contributions
keeping-0.024
kins-0.023
 snakes-0.020
 situations-0.020
ls-0.020
Token time
Token ID640
Feature activation-1.28e-02

Pos logit contributions
 lear+0.014
 str+0.013
ah+0.013
 forward+0.012
 switched+0.012
Neg logit contributions
keeping-0.018
kins-0.018
 snakes-0.016
bes-0.015
 traps-0.015
Token,
Token ID11
Feature activation-7.78e-03

Pos logit contributions
ah+0.016
 str+0.016
 lear+0.016
itting+0.015
 forward+0.015
Neg logit contributions
keeping-0.027
kins-0.024
 snakes-0.021
 traps-0.021
bes-0.021
Token in
Token ID287
Feature activation+1.38e-02

Pos logit contributions
 str+0.012
ah+0.012
 lear+0.012
 forward+0.011
itting+0.011
Neg logit contributions
keeping-0.014
kins-0.012
 snakes-0.011
bes-0.010
uts-0.010
Token a
Token ID257
Feature activation-1.10e-02

Pos logit contributions
keeping+0.021
kins+0.021
 snakes+0.018
bes+0.017
 rats+0.017
Neg logit contributions
 lear-0.023
ah-0.022
 str-0.022
itting-0.021
 forward-0.021
Token big
Token ID1263
Feature activation+3.93e-02

Pos logit contributions
 lear+0.022
ah+0.021
 str+0.021
itting+0.020
 forward+0.020
Neg logit contributions
keeping-0.014
kins-0.013
 snakes-0.011
 traps-0.010
 bush-0.010
Token forest
Token ID8222
Feature activation+4.89e-02

Pos logit contributions
keeping+0.043
kins+0.041
 snakes+0.037
 bush+0.034
 mice+0.032
Neg logit contributions
 lear-0.082
 str-0.080
ah-0.080
itting-0.079
 switched-0.076
Token,
Token ID11
Feature activation-7.24e-03

Pos logit contributions
keeping+0.090
kins+0.081
 snakes+0.070
bes+0.068
 skeletons+0.067
Neg logit contributions
 str-0.072
 lear-0.071
ah-0.071
itting-0.066
 forward-0.065
Token there
Token ID612
Feature activation+3.11e-02

Pos logit contributions
 str+0.010
ah+0.010
 forward+0.010
 lear+0.009
itting+0.009
Neg logit contributions
keeping-0.014
kins-0.012
bes-0.011
uts-0.011
 snakes-0.010
Token lived
Token ID5615
Feature activation+3.60e-02

Pos logit contributions
keeping+0.050
kins+0.046
 snakes+0.039
 traps+0.036
bes+0.036
Neg logit contributions
 str-0.053
 lear-0.053
ah-0.052
itting-0.048
 switched-0.048
Token a
Token ID257
Feature activation-1.29e-02

Pos logit contributions
keeping+0.073
kins+0.069
 snakes+0.061
bes+0.058
 traps+0.057
Neg logit contributions
 str-0.046
 lear-0.045
ah-0.043
 forward-0.041
itting-0.040
Token rang
Token ID28077
Feature activation-1.63e-02

Pos logit contributions
keeping+0.008
kins+0.008
 snakes+0.007
bes+0.007
 traps+0.007
Neg logit contributions
itting-0.008
 lear-0.007
ah-0.007
 str-0.007
 forward-0.007
Token the
Token ID262
Feature activation+1.31e-03

Pos logit contributions
 lear+0.027
ah+0.026
 str+0.026
itting+0.025
 switched+0.024
Neg logit contributions
keeping-0.026
kins-0.024
 snakes-0.022
 skeletons-0.020
 traps-0.020
Token bell
Token ID8966
Feature activation-6.17e-02

Pos logit contributions
keeping+0.002
kins+0.002
 snakes+0.002
bes+0.001
 traps+0.001
Neg logit contributions
 lear-0.002
 str-0.002
ah-0.002
 forward-0.002
 switched-0.002
Token again
Token ID757
Feature activation+5.21e-02

Pos logit contributions
 lear+0.107
 str+0.102
ah+0.102
itting+0.099
 switched+0.096
Neg logit contributions
keeping-0.089
kins-0.085
 snakes-0.073
uts-0.067
bes-0.066
Token.
Token ID13
Feature activation-2.33e-03

Pos logit contributions
keeping+0.076
kins+0.070
 snakes+0.059
bes+0.054
 traps+0.054
Neg logit contributions
 lear-0.095
 str-0.095
ah-0.094
 forward-0.088
itting-0.088
Token Then
Token ID3244
Feature activation+3.78e-02

Pos logit contributions
 lear+0.004
 str+0.004
ah+0.004
 forward+0.004
itting+0.004
Neg logit contributions
keeping-0.004
kins-0.003
 snakes-0.003
 traps-0.003
bes-0.003
Token she
Token ID673
Feature activation+6.46e-03

Pos logit contributions
keeping+0.045
kins+0.042
 snakes+0.033
 traps+0.030
 skeletons+0.030
Neg logit contributions
ah-0.077
itting-0.077
 lear-0.076
 str-0.076
ee-0.071
Token popped
Token ID22928
Feature activation+1.02e-02

Pos logit contributions
keeping+0.011
kins+0.011
 snakes+0.009
ls+0.009
 traps+0.009
Neg logit contributions
itting-0.009
 lear-0.009
 forward-0.009
ah-0.009
 str-0.008
Token back
Token ID736
Feature activation-2.57e-02

Pos logit contributions
keeping+0.016
kins+0.014
 snakes+0.012
bes+0.011
 bush+0.011
Neg logit contributions
 lear-0.018
ah-0.017
itting-0.017
 str-0.017
 spend-0.016
Token through
Token ID832
Feature activation-4.48e-02

Pos logit contributions
 lear+0.043
 str+0.043
ah+0.043
itting+0.040
 switched+0.039
Neg logit contributions
keeping-0.041
kins-0.038
 snakes-0.033
 traps-0.030
 skeletons-0.030
Token the
Token ID262
Feature activation+4.65e-04

Pos logit contributions
 str+0.064
 lear+0.064
ah+0.058
 forward+0.057
 switched+0.055
Neg logit contributions
keeping-0.083
kins-0.081
 snakes-0.069
bes-0.066
uts-0.065
Token started
Token ID2067
Feature activation-3.73e-02

Pos logit contributions
 lear+0.011
ah+0.011
 str+0.011
 forward+0.010
itting+0.010
Neg logit contributions
keeping-0.008
kins-0.008
 snakes-0.007
bes-0.006
 bush-0.006
Token to
Token ID284
Feature activation-1.36e-02

Pos logit contributions
ah+0.061
 lear+0.059
itting+0.056
ee+0.055
 str+0.055
Neg logit contributions
keeping-0.060
kins-0.053
 snakes-0.050
uts-0.047
 traps-0.046
Token w
Token ID266
Feature activation+1.87e-04

Pos logit contributions
ah+0.020
 lear+0.019
itting+0.019
 forward+0.019
 str+0.018
Neg logit contributions
keeping-0.023
kins-0.022
 snakes-0.020
bes-0.019
 bush-0.019
Tokenag
Token ID363
Feature activation-6.06e-02

Pos logit contributions
kins+0.000
keeping+0.000
 snakes+0.000
bes+0.000
ls+0.000
Neg logit contributions
 lear-0.000
ah-0.000
 str-0.000
 switched-0.000
 forward-0.000
Token with
Token ID351
Feature activation+1.31e-02

Pos logit contributions
 str+0.093
ah+0.092
 lear+0.091
 forward+0.084
itting+0.083
Neg logit contributions
keeping-0.107
kins-0.096
 snakes-0.085
 traps-0.081
bes-0.080
Token excitement
Token ID14067
Feature activation+3.74e-02

Pos logit contributions
keeping+0.023
kins+0.021
 snakes+0.020
 skeletons+0.018
 mice+0.018
Neg logit contributions
ah-0.020
 lear-0.020
 str-0.020
itting-0.019
 forward-0.018
Token!
Token ID0
Feature activation-4.73e-02

Pos logit contributions
keeping+0.042
kins+0.039
 snakes+0.030
uts+0.026
 skeletons+0.026
Neg logit contributions
 lear-0.081
ah-0.080
 str-0.080
itting-0.075
 switched-0.075
Token The
Token ID383
Feature activation-1.24e-02

Pos logit contributions
 lear+0.081
 str+0.080
ah+0.079
 forward+0.074
itting+0.074
Neg logit contributions
keeping-0.075
kins-0.071
 snakes-0.060
bes-0.056
 traps-0.055
Token mom
Token ID1995
Feature activation+6.92e-04

Pos logit contributions
 lear+0.019
 str+0.019
ah+0.018
 forward+0.017
itting+0.017
Neg logit contributions
keeping-0.022
kins-0.021
 snakes-0.018
bes-0.017
 traps-0.017
Token called
Token ID1444
Feature activation+1.11e-03

Pos logit contributions
kins+0.001
keeping+0.001
 snakes+0.001
 bush+0.001
 mice+0.001
Neg logit contributions
 forward-0.001
 lear-0.001
itting-0.001
 farewell-0.001
 str-0.001
Token the
Token ID262
Feature activation+4.21e-03

Pos logit contributions
keeping+0.001
 snakes+0.001
 mice+0.001
 bush+0.001
kins+0.001
Neg logit contributions
ah-0.002
itting-0.002
 lear-0.002
 str-0.002
 switched-0.002
Token and
Token ID290
Feature activation+7.79e-03

Pos logit contributions
kins+0.038
keeping+0.036
iders+0.033
 snakes+0.033
uts+0.033
Neg logit contributions
 str-0.023
 forward-0.022
 lear-0.022
ah-0.021
 spend-0.020
Token explore
Token ID7301
Feature activation+3.77e-02

Pos logit contributions
keeping+0.015
kins+0.013
 snakes+0.012
 traps+0.011
 bush+0.011
Neg logit contributions
 lear-0.011
ah-0.011
itting-0.011
 str-0.011
 forward-0.010
Token,
Token ID11
Feature activation-2.70e-04

Pos logit contributions
keeping+0.078
kins+0.071
uts+0.063
bes+0.063
iders+0.062
Neg logit contributions
 str-0.048
 forward-0.047
ah-0.043
 padd-0.042
 lear-0.042
Token so
Token ID523
Feature activation+4.97e-02

Pos logit contributions
 str+0.000
 lear+0.000
ah+0.000
 forward+0.000
 switched+0.000
Neg logit contributions
keeping-0.000
kins-0.000
 snakes-0.000
bes-0.000
 traps-0.000
Token he
Token ID339
Feature activation+1.93e-02

Pos logit contributions
keeping+0.080
kins+0.074
 snakes+0.062
bes+0.058
 iv+0.057
Neg logit contributions
 str-0.087
 lear-0.085
ah-0.084
 forward-0.080
 switched-0.078
Token decided
Token ID3066
Feature activation+4.77e-02

Pos logit contributions
kins+0.031
keeping+0.030
 snakes+0.026
 traps+0.026
ls+0.026
Neg logit contributions
 lear-0.032
itting-0.031
 forward-0.028
ah-0.027
 str-0.027
Token to
Token ID284
Feature activation-4.73e-03

Pos logit contributions
keeping+0.091
kins+0.085
 snakes+0.076
 traps+0.071
 skeletons+0.071
Neg logit contributions
 lear-0.067
ah-0.065
 str-0.065
itting-0.060
 forward-0.059
Token brave
Token ID14802
Feature activation+6.85e-02

Pos logit contributions
 lear+0.006
ah+0.006
itting+0.006
 str+0.006
 forward+0.005
Neg logit contributions
keeping-0.010
kins-0.009
 snakes-0.008
 traps-0.008
 bush-0.008
Tokenly
Token ID306
Feature activation+3.64e-02

Pos logit contributions
keeping+0.128
kins+0.119
 snakes+0.107
 traps+0.100
 skeletons+0.100
Neg logit contributions
 lear-0.097
 str-0.097
ah-0.096
itting-0.089
 switched-0.088
Token step
Token ID2239
Feature activation-4.52e-03

Pos logit contributions
keeping+0.059
kins+0.052
 snakes+0.044
 traps+0.043
bes+0.042
Neg logit contributions
 lear-0.064
ah-0.063
 str-0.062
itting-0.061
 forward-0.057
Token inside
Token ID2641
Feature activation+8.12e-03

Pos logit contributions
 lear+0.008
ah+0.008
 str+0.008
itting+0.008
 spend+0.007
Neg logit contributions
keeping-0.007
kins-0.006
 snakes-0.005
ls-0.005
 traps-0.005
Token with
Token ID351
Feature activation+1.46e-02

Pos logit contributions
 lear+0.082
 str+0.081
ah+0.080
 forward+0.075
itting+0.074
Neg logit contributions
keeping-0.083
kins-0.078
 snakes-0.067
bes-0.063
 traps-0.062
Token the
Token ID262
Feature activation+5.12e-03

Pos logit contributions
keeping+0.025
kins+0.023
 snakes+0.021
 skeletons+0.019
 traps+0.019
Neg logit contributions
 lear-0.023
 str-0.022
ah-0.022
itting-0.021
 forward-0.021
Token pencil
Token ID21613
Feature activation-3.13e-02

Pos logit contributions
keeping+0.008
kins+0.008
 snakes+0.007
bes+0.006
 traps+0.006
Neg logit contributions
 lear-0.008
 str-0.008
ah-0.008
 forward-0.008
 switched-0.008
Token and
Token ID290
Feature activation-5.35e-03

Pos logit contributions
 lear+0.058
 str+0.058
ah+0.057
 forward+0.054
itting+0.054
Neg logit contributions
keeping-0.044
kins-0.041
 snakes-0.034
bes-0.031
 traps-0.031
Token praised
Token ID15342
Feature activation-3.17e-02

Pos logit contributions
 str+0.009
 lear+0.009
ah+0.008
 switched+0.008
 spend+0.008
Neg logit contributions
keeping-0.009
kins-0.008
 snakes-0.007
 iv-0.007
bes-0.007
Token him
Token ID683
Feature activation+4.29e-02

Pos logit contributions
 str+0.040
 lear+0.040
 forward+0.038
ah+0.038
itting+0.035
Neg logit contributions
keeping-0.063
kins-0.061
 snakes-0.053
bes-0.053
 iv-0.051
Token for
Token ID329
Feature activation-3.66e-04

Pos logit contributions
keeping+0.053
kins+0.047
 snakes+0.040
 skeletons+0.036
 traps+0.036
Neg logit contributions
 str-0.086
ah-0.086
 lear-0.086
itting-0.082
 switched-0.082
Token his
Token ID465
Feature activation-7.45e-03

Pos logit contributions
 lear+0.001
 str+0.001
ah+0.001
 forward+0.001
itting+0.001
Neg logit contributions
keeping-0.001
kins-0.001
 snakes-0.000
bes-0.000
 traps-0.000
Token master
Token ID4958
Feature activation-3.49e-02

Pos logit contributions
 str+0.011
ah+0.011
 lear+0.011
itting+0.010
 forward+0.010
Neg logit contributions
keeping-0.013
kins-0.012
 snakes-0.011
 skeletons-0.010
 traps-0.010
Tokenpieces
Token ID34154
Feature activation-3.12e-02

Pos logit contributions
ah+0.077
 str+0.077
 lear+0.077
itting+0.075
 switched+0.073
Neg logit contributions
keeping-0.036
kins-0.031
 snakes-0.025
 skeletons-0.022
 traps-0.021
Token.
Token ID13
Feature activation-2.90e-04

Pos logit contributions
 lear+0.067
 str+0.066
ah+0.065
 forward+0.063
itting+0.061
Neg logit contributions
keeping-0.036
kins-0.033
 snakes-0.026
bes-0.023
 traps-0.023
Token that
Token ID326
Feature activation-2.05e-02

Pos logit contributions
ah+0.015
 lear+0.015
 str+0.015
 switched+0.014
 forward+0.013
Neg logit contributions
keeping-0.012
kins-0.011
 snakes-0.010
 traps-0.009
 skeletons-0.009
Token we
Token ID356
Feature activation-1.41e-02

Pos logit contributions
 str+0.030
 lear+0.030
ah+0.028
 forward+0.027
itting+0.026
Neg logit contributions
keeping-0.038
kins-0.036
 snakes-0.030
bes-0.030
ls-0.028
Token should
Token ID815
Feature activation+2.86e-02

Pos logit contributions
ah+0.027
 lear+0.027
 str+0.026
itting+0.025
 switched+0.024
Neg logit contributions
keeping-0.019
kins-0.018
 snakes-0.014
 traps-0.013
 truth-0.013
Token always
Token ID1464
Feature activation-6.44e-02

Pos logit contributions
keeping+0.056
kins+0.052
 snakes+0.046
 truth+0.044
bes+0.043
Neg logit contributions
 lear-0.038
ah-0.037
 str-0.035
 forward-0.034
itting-0.034
Token listen
Token ID6004
Feature activation+3.19e-03

Pos logit contributions
 lear+0.083
ah+0.081
 str+0.079
itting+0.075
 forward+0.074
Neg logit contributions
keeping-0.127
kins-0.121
 snakes-0.105
 traps-0.101
bes-0.100
Token to
Token ID284
Feature activation-8.78e-03

Pos logit contributions
keeping+0.006
kins+0.005
 snakes+0.005
 traps+0.004
 skeletons+0.004
Neg logit contributions
ah-0.005
 lear-0.005
 str-0.005
 switched-0.004
itting-0.004
Token our
Token ID674
Feature activation-6.24e-02

Pos logit contributions
 str+0.011
 lear+0.011
ah+0.011
 forward+0.010
 switched+0.010
Neg logit contributions
keeping-0.018
kins-0.017
 snakes-0.015
bes-0.014
 skeletons-0.014
Token parents
Token ID3397
Feature activation-2.16e-03

Pos logit contributions
ah+0.114
 lear+0.113
 str+0.111
itting+0.105
 switched+0.105
Neg logit contributions
keeping-0.091
kins-0.086
 snakes-0.074
 traps-0.069
 skeletons-0.069
Token and
Token ID290
Feature activation-2.72e-03

Pos logit contributions
 str+0.004
 lear+0.004
ah+0.003
 forward+0.003
itting+0.003
Neg logit contributions
keeping-0.003
kins-0.003
 snakes-0.003
bes-0.003
 skeletons-0.003
Token not
Token ID407
Feature activation+4.85e-02

Pos logit contributions
 str+0.004
 lear+0.003
ah+0.003
 forward+0.003
 switched+0.003
Neg logit contributions
keeping-0.005
kins-0.005
 snakes-0.005
bes-0.004
 skeletons-0.004
Token play
Token ID711
Feature activation-2.40e-02

Pos logit contributions
keeping+0.090
kins+0.079
 snakes+0.076
 traps+0.071
 mice+0.070
Neg logit contributions
 lear-0.068
ah-0.068
itting-0.063
oured-0.063
 str-0.062
Token upon
Token ID2402
Feature activation-1.06e-02

Pos logit contributions
 str+0.031
ah+0.031
 lear+0.029
 forward+0.028
itting+0.027
Neg logit contributions
keeping-0.098
kins-0.090
 snakes-0.083
bes-0.082
 traps-0.080
Token a
Token ID257
Feature activation-1.22e-02

Pos logit contributions
 str+0.012
 lear+0.010
ah+0.010
 spend+0.009
 switched+0.009
Neg logit contributions
keeping-0.024
kins-0.024
 snakes-0.021
ls-0.020
 iv-0.020
Token time
Token ID640
Feature activation-1.26e-02

Pos logit contributions
 lear+0.018
 str+0.018
ah+0.017
 switched+0.017
 forward+0.017
Neg logit contributions
keeping-0.020
kins-0.020
 snakes-0.017
bes-0.017
 traps-0.016
Token,
Token ID11
Feature activation-8.83e-03

Pos logit contributions
 lear+0.018
 str+0.018
ah+0.018
itting+0.017
 forward+0.017
Neg logit contributions
keeping-0.024
kins-0.022
 snakes-0.019
bes-0.018
 traps-0.018
Token two
Token ID734
Feature activation-1.70e-02

Pos logit contributions
 str+0.015
ah+0.015
 lear+0.015
 forward+0.014
itting+0.014
Neg logit contributions
keeping-0.014
kins-0.013
 snakes-0.011
bes-0.010
 traps-0.010
Token friends
Token ID2460
Feature activation+1.17e-02

Pos logit contributions
 lear+0.034
 str+0.034
ah+0.033
 forward+0.032
itting+0.032
Neg logit contributions
kins-0.020
keeping-0.019
 snakes-0.018
 mice-0.016
 traps-0.016
Token called
Token ID1444
Feature activation+3.97e-03

Pos logit contributions
kins+0.011
keeping+0.011
 snakes+0.010
 mice+0.009
 rats+0.008
Neg logit contributions
 str-0.026
 lear-0.025
ah-0.024
itting-0.024
oured-0.023
Token Sally
Token ID25737
Feature activation+1.24e-02

Pos logit contributions
keeping+0.006
kins+0.005
 snakes+0.005
 skeletons+0.004
 mice+0.004
Neg logit contributions
 str-0.007
 lear-0.007
ah-0.007
itting-0.007
 switched-0.007
Token and
Token ID290
Feature activation-4.94e-03

Pos logit contributions
keeping+0.017
kins+0.016
 snakes+0.013
bes+0.012
 traps+0.012
Neg logit contributions
 lear-0.023
 str-0.023
ah-0.023
 forward-0.022
itting-0.021
Token Harry
Token ID5850
Feature activation+2.29e-02

Pos logit contributions
 str+0.008
itting+0.008
 lear+0.008
ah+0.008
 forward+0.008
Neg logit contributions
keeping-0.008
kins-0.007
 snakes-0.006
 skeletons-0.006
 mice-0.006
Token were
Token ID547
Feature activation+1.87e-02

Pos logit contributions
keeping+0.020
kins+0.018
 snakes+0.013
 mice+0.011
 skeletons+0.010
Neg logit contributions
 str-0.054
 lear-0.053
ah-0.052
itting-0.051
 forward-0.051
Token then
Token ID788
Feature activation+1.10e-02

Pos logit contributions
 lear+0.004
 str+0.004
ah+0.004
 forward+0.004
itting+0.004
Neg logit contributions
keeping-0.007
kins-0.007
 snakes-0.006
bes-0.005
 traps-0.005
Token the
Token ID262
Feature activation+1.08e-02

Pos logit contributions
keeping+0.015
kins+0.014
 snakes+0.012
 traps+0.011
 skeletons+0.011
Neg logit contributions
ah-0.021
 lear-0.021
itting-0.020
 str-0.020
 forward-0.019
Token tight
Token ID5381
Feature activation-2.13e-02

Pos logit contributions
keeping+0.020
kins+0.018
 snakes+0.015
bes+0.015
 traps+0.014
Neg logit contributions
 str-0.017
 lear-0.017
ah-0.016
 forward-0.015
 switched-0.015
Tokenness
Token ID1108
Feature activation+6.17e-03

Pos logit contributions
 lear+0.037
 str+0.036
ah+0.036
itting+0.035
 forward+0.034
Neg logit contributions
keeping-0.033
kins-0.031
 snakes-0.027
bes-0.024
 traps-0.024
Token will
Token ID481
Feature activation-1.53e-03

Pos logit contributions
keeping+0.010
kins+0.009
 snakes+0.008
 traps+0.007
ls+0.007
Neg logit contributions
 lear-0.010
 str-0.010
ah-0.010
itting-0.009
 forward-0.009
Token return
Token ID1441
Feature activation+1.29e-02

Pos logit contributions
 lear+0.002
ah+0.002
itting+0.002
 str+0.002
 forward+0.002
Neg logit contributions
keeping-0.003
kins-0.003
 snakes-0.002
 iv-0.002
 traps-0.002
Token."
Token ID526
Feature activation+1.17e-02

Pos logit contributions
keeping+0.018
kins+0.017
 snakes+0.014
bes+0.013
 skeletons+0.013
Neg logit contributions
 str-0.024
 lear-0.024
ah-0.023
 forward-0.022
 switched-0.022
Token
Token ID198
Feature activation+1.36e-02

Pos logit contributions
keeping+0.018
kins+0.017
 snakes+0.014
bes+0.013
 traps+0.013
Neg logit contributions
 lear-0.020
 str-0.020
ah-0.020
 forward-0.019
itting-0.019
Token
Token ID198
Feature activation+1.18e-02

Pos logit contributions
keeping+0.020
kins+0.018
 snakes+0.016
 traps+0.014
 skeletons+0.014
Neg logit contributions
 str-0.025
ah-0.024
 lear-0.024
ee-0.023
itting-0.023
TokenThey
Token ID2990
Feature activation-2.01e-02

Pos logit contributions
keeping+0.015
kins+0.015
 snakes+0.012
bes+0.011
iders+0.011
Neg logit contributions
 lear-0.023
ah-0.022
oured-0.022
 str-0.022
 forward-0.022
Token waited
Token ID13488
Feature activation+1.43e-02

Pos logit contributions
itting+0.036
 lear+0.032
rav+0.032
 pleasure+0.031
 wanna+0.030
Neg logit contributions
kins-0.028
keeping-0.024
 cans-0.023
 bush-0.023
 mice-0.022
Token on
Token ID319
Feature activation+6.54e-03

Pos logit contributions
 str+0.004
 lear+0.004
ah+0.004
 forward+0.004
 farewell+0.004
Neg logit contributions
keeping-0.004
kins-0.004
 snakes-0.003
 rats-0.003
 traps-0.003
Token the
Token ID262
Feature activation-4.56e-03

Pos logit contributions
keeping+0.012
kins+0.012
 snakes+0.010
uts+0.010
ls+0.010
Neg logit contributions
 str-0.009
 lear-0.009
ah-0.008
 forward-0.008
 spend-0.008
Token fox
Token ID21831
Feature activation-3.79e-02

Pos logit contributions
 lear+0.009
 str+0.009
ah+0.009
 forward+0.008
 switched+0.008
Neg logit contributions
keeping-0.006
kins-0.006
 snakes-0.005
bes-0.004
 skeletons-0.004
Token's
Token ID338
Feature activation-5.73e-02

Pos logit contributions
 str+0.065
 lear+0.065
ah+0.064
 forward+0.060
 switched+0.060
Neg logit contributions
keeping-0.058
kins-0.055
 snakes-0.046
bes-0.043
 skeletons-0.043
Token offer
Token ID2897
Feature activation+5.09e-02

Pos logit contributions
 lear+0.089
 str+0.089
ah+0.086
 forward+0.081
 switched+0.079
Neg logit contributions
keeping-0.096
kins-0.094
 snakes-0.079
ls-0.074
bes-0.074
Token.
Token ID13
Feature activation-1.27e-02

Pos logit contributions
keeping+0.077
kins+0.073
 snakes+0.062
bes+0.058
 skeletons+0.057
Neg logit contributions
 str-0.087
 lear-0.087
ah-0.086
 forward-0.082
 switched-0.079
Token From
Token ID3574
Feature activation-1.04e-02

Pos logit contributions
 lear+0.023
 str+0.023
ah+0.023
 forward+0.021
itting+0.021
Neg logit contributions
keeping-0.019
kins-0.017
 snakes-0.015
bes-0.013
 traps-0.013
Token then
Token ID788
Feature activation-1.21e-02

Pos logit contributions
 lear+0.013
 str+0.013
ah+0.013
 forward+0.012
itting+0.012
Neg logit contributions
keeping-0.021
kins-0.020
 snakes-0.018
bes-0.017
 traps-0.016
Token on
Token ID319
Feature activation-1.25e-02

Pos logit contributions
 str+0.023
 lear+0.023
ah+0.022
 switched+0.022
oured+0.021
Neg logit contributions
kins-0.015
keeping-0.015
 snakes-0.012
 traps-0.011
ls-0.011
Token,
Token ID11
Feature activation-2.17e-02

Pos logit contributions
 lear+0.019
 str+0.019
ah+0.019
 forward+0.018
itting+0.017
Neg logit contributions
keeping-0.022
kins-0.020
 snakes-0.018
bes-0.016
 skeletons-0.016
Token the
Token ID262
Feature activation-6.86e-03

Pos logit contributions
 lear+0.035
 str+0.035
ah+0.035
 forward+0.032
itting+0.032
Neg logit contributions
keeping-0.036
kins-0.034
 snakes-0.029
 traps-0.027
 skeletons-0.027
Token and
Token ID290
Feature activation+6.12e-03

Pos logit contributions
keeping+0.011
kins+0.011
 snakes+0.009
bes+0.008
 iv+0.008
Neg logit contributions
 lear-0.015
ah-0.015
 str-0.015
 forward-0.015
itting-0.014
Token Lily
Token ID20037
Feature activation+1.25e-02

Pos logit contributions
keeping+0.009
kins+0.008
 snakes+0.007
bes+0.006
 traps+0.006
Neg logit contributions
 lear-0.011
 str-0.011
ah-0.011
 forward-0.011
itting-0.011
Token agreed
Token ID4987
Feature activation+3.06e-02

Pos logit contributions
kins+0.021
keeping+0.019
 bush+0.017
 cans+0.017
 floor+0.016
Neg logit contributions
ah-0.019
 forward-0.019
itting-0.019
 lear-0.018
 farewell-0.017
Token.
Token ID13
Feature activation+1.14e-02

Pos logit contributions
keeping+0.031
kins+0.028
 snakes+0.020
bes+0.018
uts+0.017
Neg logit contributions
ah-0.070
 lear-0.070
 str-0.069
itting-0.066
 forward-0.065
Token They
Token ID1119
Feature activation-5.67e-03

Pos logit contributions
keeping+0.021
kins+0.020
 snakes+0.017
bes+0.016
 iv+0.016
Neg logit contributions
 lear-0.017
 str-0.016
 forward-0.016
 switched-0.015
itting-0.015
Token went
Token ID1816
Feature activation+2.34e-03

Pos logit contributions
itting+0.008
 forward+0.008
ah+0.008
 lear+0.008
 farewell+0.007
Neg logit contributions
kins-0.010
keeping-0.009
 cans-0.008
 bush-0.008
 mice-0.008
Token to
Token ID284
Feature activation-2.54e-03

Pos logit contributions
keeping+0.005
kins+0.004
 snakes+0.004
bes+0.004
 traps+0.004
Neg logit contributions
 lear-0.003
 str-0.003
ah-0.003
 forward-0.003
itting-0.003
Token the
Token ID262
Feature activation+1.57e-02

Pos logit contributions
 str+0.003
 lear+0.003
 spend+0.003
ah+0.003
 switched+0.003
Neg logit contributions
keeping-0.005
kins-0.005
 snakes-0.004
bes-0.004
uts-0.004
Token table
Token ID3084
Feature activation+7.07e-02

Pos logit contributions
keeping+0.020
kins+0.018
 snakes+0.014
uts+0.014
bes+0.013
Neg logit contributions
 str-0.032
 lear-0.032
ah-0.030
 switched-0.030
 forward-0.030
Token and
Token ID290
Feature activation-1.37e-03

Pos logit contributions
keeping+0.133
kins+0.121
 snakes+0.115
 skeletons+0.108
ls+0.108
Neg logit contributions
 lear-0.096
 str-0.096
 forward-0.091
ah-0.083
 bought-0.081
Token ate
Token ID15063
Feature activation+1.40e-02

Pos logit contributions
 lear+0.002
ah+0.002
itting+0.002
 str+0.002
 forward+0.002
Neg logit contributions
keeping-0.003
kins-0.003
 snakes-0.002
 bush-0.002
 traps-0.002
Token it
Token ID340
Feature activation+5.02e-03

Pos logit contributions
kins+0.004
keeping+0.004
 snakes+0.003
 cans+0.003
 traps+0.003
Neg logit contributions
 lear-0.005
ah-0.005
itting-0.005
 str-0.005
 spend-0.005
Token on
Token ID319
Feature activation+5.56e-03

Pos logit contributions
kins+0.007
keeping+0.006
 snakes+0.005
ls+0.005
 traps+0.005
Neg logit contributions
 lear-0.010
ah-0.010
 str-0.009
itting-0.009
 spend-0.009
Token their
Token ID511
Feature activation+2.07e-02

Pos logit contributions
keeping+0.010
kins+0.009
 snakes+0.008
bes+0.008
 skeletons+0.008
Neg logit contributions
 lear-0.008
 str-0.008
ah-0.008
 forward-0.008
 switched-0.008
Token wrists
Token ID38163
Feature activation-1.45e-03

Pos logit contributions
keeping+0.034
kins+0.032
 snakes+0.027
 skeletons+0.025
 traps+0.025
Neg logit contributions
 str-0.035
 lear-0.034
ah-0.033
 forward-0.031
 switched-0.031
Token and
Token ID290
Feature activation-2.70e-03

Pos logit contributions
 lear+0.003
 str+0.003
ah+0.003
itting+0.002
 forward+0.002
Neg logit contributions
keeping-0.002
kins-0.002
 snakes-0.002
bes-0.002
 traps-0.002
Token pressed
Token ID12070
Feature activation-8.52e-02

Pos logit contributions
 lear+0.003
ah+0.003
 str+0.003
itting+0.003
 forward+0.003
Neg logit contributions
keeping-0.006
kins-0.005
 snakes-0.005
bes-0.004
 traps-0.004
Token the
Token ID262
Feature activation+1.56e-02

Pos logit contributions
ah+0.162
itting+0.161
 lear+0.161
 spend+0.151
packing+0.151
Neg logit contributions
keeping-0.102
kins-0.101
 snakes-0.093
 cans-0.081
ls-0.080
Token buttons
Token ID12163
Feature activation-2.66e-02

Pos logit contributions
keeping+0.023
kins+0.022
 snakes+0.019
 traps+0.017
bes+0.017
Neg logit contributions
 lear-0.027
 str-0.027
ah-0.027
itting-0.025
 forward-0.025
Token.
Token ID13
Feature activation-4.10e-03

Pos logit contributions
 lear+0.058
ah+0.058
 str+0.057
itting+0.055
 spend+0.054
Neg logit contributions
keeping-0.027
kins-0.024
 snakes-0.019
uts-0.017
ls-0.017
Token The
Token ID383
Feature activation-4.56e-03

Pos logit contributions
ah+0.008
 lear+0.008
 str+0.008
itting+0.007
 forward+0.007
Neg logit contributions
keeping-0.006
kins-0.005
 snakes-0.004
bes-0.004
 mice-0.004
Token watch
Token ID2342
Feature activation-1.61e-02

Pos logit contributions
 lear+0.006
 str+0.006
ah+0.006
itting+0.006
 forward+0.006
Neg logit contributions
kins-0.008
keeping-0.008
 snakes-0.007
 floor-0.007
 bush-0.007
Token big
Token ID1263
Feature activation+2.34e-02

Pos logit contributions
keeping+0.000
kins+0.000
bes+0.000
uts+0.000
 snakes+0.000
Neg logit contributions
 lear-0.000
 str-0.000
ah-0.000
 forward-0.000
 switched-0.000
Token lion
Token ID18744
Feature activation-2.36e-02

Pos logit contributions
keeping+0.040
kins+0.038
bes+0.034
uts+0.033
iders+0.032
Neg logit contributions
 lear-0.038
ah-0.037
 str-0.035
 spend-0.032
 forward-0.032
Token.
Token ID13
Feature activation-3.02e-03

Pos logit contributions
 str+0.040
 lear+0.040
ah+0.040
 forward+0.037
 switched+0.037
Neg logit contributions
keeping-0.037
kins-0.035
 snakes-0.029
bes-0.027
 traps-0.027
Token He
Token ID679
Feature activation-1.38e-02

Pos logit contributions
 lear+0.004
 farewell+0.003
ah+0.003
 str+0.003
 While+0.003
Neg logit contributions
keeping-0.007
kins-0.006
rob-0.005
bes-0.005
 situations-0.005
Token looked
Token ID3114
Feature activation-1.21e-02

Pos logit contributions
 lear+0.022
 str+0.021
ah+0.021
 forward+0.020
itting+0.020
Neg logit contributions
keeping-0.023
kins-0.022
 snakes-0.019
 skeletons-0.018
 traps-0.018
Token at
Token ID379
Feature activation-7.73e-02

Pos logit contributions
 lear+0.025
ah+0.025
 str+0.024
itting+0.024
 spend+0.022
Neg logit contributions
kins-0.013
 snakes-0.013
keeping-0.013
ls-0.012
 traps-0.012
Token Lily
Token ID20037
Feature activation+1.80e-03

Pos logit contributions
ah+0.090
 str+0.084
 lear+0.079
 farewell+0.077
 forward+0.075
Neg logit contributions
keeping-0.169
kins-0.163
uts-0.146
bes-0.140
 situations-0.136
Token with
Token ID351
Feature activation+1.05e-02

Pos logit contributions
keeping+0.003
kins+0.003
 snakes+0.002
bes+0.002
 traps+0.002
Neg logit contributions
 str-0.003
 lear-0.003
ah-0.003
 forward-0.003
itting-0.003
Token his
Token ID465
Feature activation-1.18e-02

Pos logit contributions
keeping+0.019
kins+0.018
 snakes+0.015
bes+0.014
 skeletons+0.014
Neg logit contributions
 str-0.016
 lear-0.016
ah-0.015
 forward-0.014
 switched-0.014
Token yellow
Token ID7872
Feature activation+6.39e-03

Pos logit contributions
 str+0.018
ah+0.018
 lear+0.018
 forward+0.017
 switched+0.016
Neg logit contributions
keeping-0.021
kins-0.019
 snakes-0.016
 iv-0.016
bes-0.016
Token eyes
Token ID2951
Feature activation+1.35e-02

Pos logit contributions
keeping+0.010
kins+0.009
 snakes+0.008
bes+0.007
 skeletons+0.007
Neg logit contributions
 lear-0.011
 str-0.011
ah-0.011
 forward-0.011
itting-0.011
Token man
Token ID582
Feature activation+1.15e-02

Pos logit contributions
kins+0.008
keeping+0.008
 snakes+0.006
 traps+0.005
 skeletons+0.005
Neg logit contributions
 lear-0.017
 str-0.017
ah-0.017
itting-0.016
 forward-0.016
Token said
Token ID531
Feature activation-3.78e-03

Pos logit contributions
kins+0.018
keeping+0.018
 snakes+0.016
ls+0.015
 traps+0.014
Neg logit contributions
 lear-0.018
itting-0.018
ah-0.018
 str-0.017
 forward-0.017
Token,
Token ID11
Feature activation-1.29e-02

Pos logit contributions
 lear+0.006
 str+0.006
ah+0.006
 forward+0.006
itting+0.006
Neg logit contributions
keeping-0.006
kins-0.006
 snakes-0.005
bes-0.005
 traps-0.005
Token "
Token ID366
Feature activation-3.36e-02

Pos logit contributions
 lear+0.021
 str+0.020
ah+0.020
 forward+0.019
 switched+0.018
Neg logit contributions
keeping-0.023
kins-0.021
 snakes-0.018
 traps-0.017
bes-0.017
TokenYou
Token ID1639
Feature activation-5.93e-02

Pos logit contributions
 lear+0.058
 str+0.057
ah+0.056
 forward+0.053
itting+0.053
Neg logit contributions
keeping-0.052
kins-0.048
 snakes-0.041
bes-0.038
 skeletons-0.037
Token're
Token ID821
Feature activation-7.55e-02

Pos logit contributions
 lear+0.116
itting+0.112
 str+0.110
ah+0.110
 forward+0.103
Neg logit contributions
kins-0.076
keeping-0.070
 snakes-0.057
 skeletons-0.056
 mice-0.055
Token welcome
Token ID7062
Feature activation+1.62e-02

Pos logit contributions
 lear+0.132
 str+0.131
ah+0.129
itting+0.120
 switched+0.118
Neg logit contributions
keeping-0.117
kins-0.111
 snakes-0.095
 skeletons-0.086
 traps-0.086
Token,
Token ID11
Feature activation-1.42e-02

Pos logit contributions
keeping+0.031
kins+0.029
 snakes+0.025
bes+0.025
 traps+0.024
Neg logit contributions
 forward-0.023
ah-0.022
 lear-0.022
 str-0.022
 farewell-0.020
Token little
Token ID1310
Feature activation-8.38e-03

Pos logit contributions
 lear+0.024
 str+0.024
ah+0.023
 spend+0.021
 forward+0.021
Neg logit contributions
keeping-0.025
kins-0.022
bes-0.019
uts-0.018
iders-0.018
Token ones
Token ID3392
Feature activation+1.60e-03

Pos logit contributions
 lear+0.014
 str+0.014
ah+0.014
 forward+0.013
itting+0.013
Neg logit contributions
keeping-0.014
kins-0.013
 snakes-0.011
bes-0.011
uts-0.010
Token.
Token ID13
Feature activation-6.12e-03

Pos logit contributions
keeping+0.002
kins+0.002
 snakes+0.001
bes+0.001
 traps+0.001
Neg logit contributions
 lear-0.003
 str-0.003
ah-0.003
 forward-0.003
itting-0.003
Token,
Token ID11
Feature activation-7.60e-03

Pos logit contributions
ah+0.011
 switched+0.010
 str+0.010
 lear+0.010
ee+0.010
Neg logit contributions
kins-0.006
keeping-0.005
 snakes-0.004
 mice-0.004
uts-0.004
Token knowing
Token ID6970
Feature activation-1.10e-02

Pos logit contributions
ah+0.013
 lear+0.013
itting+0.013
 str+0.013
ee+0.012
Neg logit contributions
keeping-0.011
kins-0.010
 snakes-0.009
bes-0.008
uts-0.008
Token that
Token ID326
Feature activation-1.36e-02

Pos logit contributions
ah+0.019
 str+0.019
 lear+0.018
itting+0.018
ee+0.018
Neg logit contributions
keeping-0.017
kins-0.016
 snakes-0.014
 skeletons-0.013
 traps-0.012
Token they
Token ID484
Feature activation+3.48e-03

Pos logit contributions
 lear+0.021
ah+0.020
 str+0.020
itting+0.019
 forward+0.018
Neg logit contributions
keeping-0.024
kins-0.023
 snakes-0.020
 traps-0.019
 skeletons-0.019
Token were
Token ID547
Feature activation+1.42e-02

Pos logit contributions
kins+0.005
keeping+0.005
 snakes+0.004
 skeletons+0.004
 iv+0.004
Neg logit contributions
 lear-0.006
ah-0.006
 str-0.006
itting-0.006
 farewell-0.006
Token both
Token ID1111
Feature activation-8.02e-02

Pos logit contributions
kins+0.022
keeping+0.019
 snakes+0.017
 traps+0.016
 mice+0.015
Neg logit contributions
 lear-0.026
ah-0.026
 str-0.024
itting-0.024
ee-0.023
Token winners
Token ID14591
Feature activation-5.01e-02

Pos logit contributions
ah+0.148
 str+0.146
 lear+0.145
itting+0.139
ahoo+0.131
Neg logit contributions
kins-0.109
keeping-0.107
 snakes-0.090
 traps-0.080
 mice-0.079
Token.
Token ID13
Feature activation+2.86e-03

Pos logit contributions
 lear+0.096
 str+0.096
ah+0.095
 forward+0.089
itting+0.089
Neg logit contributions
keeping-0.068
kins-0.063
 snakes-0.052
 traps-0.047
bes-0.047
Token<|endoftext|>
Token ID50256
Feature activation-2.43e-02

Pos logit contributions
keeping+0.004
kins+0.004
 snakes+0.003
 traps+0.003
bes+0.003
Neg logit contributions
 lear-0.005
 str-0.005
ah-0.005
 forward-0.005
 switched-0.005
TokenOnce
Token ID7454
Feature activation-3.53e-02

Pos logit contributions
ee+0.024
 forward+0.023
ah+0.022
 str+0.021
 ahead+0.021
Neg logit contributions
kins-0.053
keeping-0.052
 wounds-0.050
 snakes-0.049
 traps-0.048
Token upon
Token ID2402
Feature activation-3.88e-03

Pos logit contributions
 str+0.032
ah+0.032
 lear+0.032
 forward+0.029
itting+0.029
Neg logit contributions
keeping-0.085
kins-0.079
 snakes-0.072
bes-0.070
 traps-0.069
TokenOne
Token ID3198
Feature activation-4.22e-03

Pos logit contributions
keeping+0.028
kins+0.025
 snakes+0.023
 mice+0.021
bes+0.021
Neg logit contributions
 lear-0.036
 str-0.035
oured-0.035
ah-0.034
 switched-0.033
Token morning
Token ID3329
Feature activation-5.09e-02

Pos logit contributions
 lear+0.007
 str+0.007
ah+0.006
 switched+0.006
oured+0.006
Neg logit contributions
kins-0.007
keeping-0.007
 snakes-0.006
 bush-0.005
 mice-0.005
Token he
Token ID339
Feature activation+1.33e-02

Pos logit contributions
 lear+0.099
 str+0.098
ah+0.098
itting+0.096
 forward+0.091
Neg logit contributions
keeping-0.062
kins-0.061
 snakes-0.054
 mice-0.047
 skeletons-0.047
Token noticed
Token ID6810
Feature activation+7.87e-03

Pos logit contributions
keeping+0.025
kins+0.024
 snakes+0.020
 traps+0.019
 iv+0.019
Neg logit contributions
 lear-0.018
ah-0.018
 str-0.018
itting-0.017
 forward-0.016
Token a
Token ID257
Feature activation-1.89e-02

Pos logit contributions
keeping+0.015
kins+0.014
 snakes+0.012
bes+0.012
 traps+0.011
Neg logit contributions
 str-0.011
 lear-0.011
ah-0.011
 forward-0.010
 switched-0.010
Token lot
Token ID1256
Feature activation-7.50e-02

Pos logit contributions
 lear+0.036
 str+0.035
ah+0.035
itting+0.034
 forward+0.033
Neg logit contributions
keeping-0.025
kins-0.024
 snakes-0.021
 bush-0.019
 skeletons-0.019
Token of
Token ID286
Feature activation+2.10e-02

Pos logit contributions
 lear+0.102
 str+0.102
 forward+0.099
ah+0.099
itting+0.095
Neg logit contributions
keeping-0.146
kins-0.135
bes-0.115
 snakes-0.113
 traps-0.108
Token traffic
Token ID4979
Feature activation-7.38e-02

Pos logit contributions
keeping+0.031
kins+0.029
 snakes+0.023
bes+0.022
uts+0.022
Neg logit contributions
 lear-0.039
 str-0.039
ah-0.037
 forward-0.036
 switched-0.035
Token.
Token ID13
Feature activation+1.00e-02

Pos logit contributions
 lear+0.130
 str+0.130
ah+0.128
 forward+0.120
itting+0.119
Neg logit contributions
keeping-0.111
kins-0.104
 snakes-0.088
bes-0.081
 traps-0.080
Token There
Token ID1318
Feature activation+3.60e-02

Pos logit contributions
keeping+0.014
kins+0.013
 snakes+0.011
bes+0.010
 traps+0.010
Neg logit contributions
 lear-0.019
 str-0.019
ah-0.018
 forward-0.017
itting-0.017
Token were
Token ID547
Feature activation+1.75e-02

Pos logit contributions
keeping+0.067
kins+0.061
 snakes+0.053
bes+0.051
 traps+0.050
Neg logit contributions
 str-0.053
ah-0.052
 lear-0.052
 forward-0.048
 switched-0.048