TOP ACTIVATIONS
MAX = 0.139

his mind . He w agged his tail and bark ed
Jack was over joy ed . He had conquered
her friend . She w agged her tail and bark ed
The old man his sed ,
L illy was over joy ed and felt so lucky
raised its hood and his sed , ready to strike .
was so happy he w agged his tail and l icked
ball suddenly appeared out of nowhere ! It had been hiding
The butterfly was over joy ed by the bird 's
it . The dog w agged its tail and became friendly
ggy win ned and w agged his tail . Debbie smiled
up to Jack and w agged their tails . Jack was
Ab ig ail was over joy ed ! She thanked the
so friendly . Max w agged his tail and felt happy
in a flash and w agged his tail happily . From
. The princess was over joy ed ! She was so
fought back . They his sed and grow led , sw
so excited , he w agged his tail and w igg
was very happy and w agged his tail with joy .
okay ". The puppy w agged his tail and l icked

MIN ACTIVATIONS
MIN = -0.116

. " Bl ub ! You are very long
poking out from the shr ub bery . Jack
. " Bl ub ! I am the light
. " Bl ub , bl ub , bl
" Bl ub , bl ub , bl ub ," the
, bl ub , bl ub ," the fish replies .
. " Bl ub , bl ub , bl
" Bl ub , bl ub , bl ub ," they
, bl ub , bl ub ," they say back .
." B ub bles learned that it 's
drowned . B ub bles was very sad .
She started to " compl ain " that it wasn 't
of the cold por cel ain bowl !
and become smaller and f ain ter ." "
quickly . B ub bles wanted to perform a
away . B ub bles warned his friends about
you !" B ub bles replied , " Thanks
. B ub ba and his other friends
. B ub bles asked his friend ,
happy ! B ub ba and his squirrel friend

INTERVAL 0.075 to 0.139
CONTAINS 0.057%

strange . I have never seen a boat like that before
furry . The dog w agged its tail and the boy
toys and for a few minutes she did , but then
the dog . It w agged its tail and l icked
Joe . He was three years old and always very hungry

INTERVAL 0.011 to 0.075
CONTAINS 22.340%

The big animal smiled and said , " It 's ok
He was very happy and he knew he could never forget
Maybe I can learn a new recipe next time ."
me ?" Max : " Sure , that sounds
's very difficult for me to find enough food . Can

INTERVAL -0.052 to 0.011
CONTAINS 77.134%

. She wants to play in the water .
caul iflower and jumped back on the zip - line ,
improve my report . You spoiled it . You made a
go to the kitchen to get himself a bowl full of
. One day , Tim my 's mom baked some cookies

INTERVAL -0.116 to -0.052
CONTAINS 0.468%

adventure and was ready to go home . But before he
eer . They did not want to scare it . The
animals , and anything they want . One day
shouted out : " I want that hat ! Give me
and came up with a plan . He decided to take
Token his
Token ID465
Feature activation+2.69e-02

Pos logit contributions
 injuries+0.042
joy+0.039
awed+0.034
ksh+0.033
ishing+0.033
Neg logit contributions
 ring-0.021
 rob-0.021
 also-0.020
 priest-0.020
 pilot-0.020
Token mind
Token ID2000
Feature activation+3.08e-02

Pos logit contributions
 injuries+0.056
joy+0.052
sed+0.047
awed+0.046
irmed+0.044
Neg logit contributions
 also-0.039
 ring-0.038
 priest-0.036
 pilot-0.035
Queen-0.035
Token.
Token ID13
Feature activation+9.73e-03

Pos logit contributions
 injuries+0.060
joy+0.056
less+0.050
awed+0.050
ishing+0.049
Neg logit contributions
Queen-0.043
 ring-0.043
 rob-0.042
 priest-0.041
 pilot-0.040
Token He
Token ID679
Feature activation-7.11e-03

Pos logit contributions
joy+0.020
 injuries+0.020
irmed+0.017
ksh+0.017
awed+0.017
Neg logit contributions
 ring-0.015
 pilot-0.013
 priest-0.013
 rob-0.013
 sign-0.013
Token w
Token ID266
Feature activation+1.78e-02

Pos logit contributions
 ring+0.011
Queen+0.011
 achieve+0.010
 sign+0.010
 priest+0.010
Neg logit contributions
 injuries-0.012
joy-0.012
awed-0.011
sed-0.011
irmed-0.011
Tokenagged
Token ID14655
Feature activation+1.23e-01

Pos logit contributions
 injuries+0.018
joy+0.015
awed+0.011
sed+0.011
ksh+0.010
Neg logit contributions
 ring-0.045
 also-0.045
 understand-0.043
 sign-0.043
 learn-0.043
Token his
Token ID465
Feature activation+2.61e-02

Pos logit contributions
joy+0.253
 injuries+0.252
awed+0.227
sed+0.219
irmed+0.216
Neg logit contributions
 achieve-0.161
Queen-0.157
 ring-0.156
 also-0.152
 priest-0.150
Token tail
Token ID7894
Feature activation+3.53e-02

Pos logit contributions
 injuries+0.049
joy+0.044
awed+0.040
irmed+0.038
ksh+0.038
Neg logit contributions
 ring-0.044
 also-0.042
 guard-0.041
 sign-0.041
 cross-0.040
Token and
Token ID290
Feature activation+3.80e-03

Pos logit contributions
 injuries+0.093
joy+0.092
awed+0.084
sed+0.083
irmed+0.080
Neg logit contributions
 ring-0.027
 achieve-0.026
 priest-0.025
 know-0.025
 learn-0.024
Token bark
Token ID21405
Feature activation+4.26e-02

Pos logit contributions
 injuries+0.007
joy+0.007
awed+0.006
sed+0.006
irmed+0.006
Neg logit contributions
 ring-0.006
 also-0.005
 achieve-0.005
 priest-0.005
 sign-0.005
Tokened
Token ID276
Feature activation+1.66e-02

Pos logit contributions
 injuries+0.049
joy+0.045
awed+0.038
sed+0.035
irmed+0.035
Neg logit contributions
 ring-0.100
 also-0.098
 sign-0.095
 learn-0.094
 priest-0.094
Token\n
Token ID198
Feature activation-5.01e-03

Pos logit contributions
 injuries+0.037
joy+0.036
aj+0.034
ksh+0.033
awed+0.033
Neg logit contributions
 ring-0.021
 pilot-0.018
 priest-0.017
 guard-0.017
 sign-0.017
Token\n
Token ID198
Feature activation-3.82e-03

Pos logit contributions
 ring+0.007
 pilot+0.006
 guard+0.006
 sign+0.005
 priest+0.005
Neg logit contributions
 injuries-0.011
joy-0.011
awed-0.010
aj-0.010
irmed-0.010
TokenJack
Token ID14295
Feature activation+2.47e-05

Pos logit contributions
 ring+0.006
 guard+0.005
 sign+0.005
 pilot+0.005
 look+0.005
Neg logit contributions
 injuries-0.008
joy-0.008
awed-0.007
irmed-0.007
ksh-0.007
Token was
Token ID373
Feature activation+1.37e-02

Pos logit contributions
 injuries+0.000
joy+0.000
awed+0.000
irmed+0.000
aj+0.000
Neg logit contributions
 ring-0.000
 priest-0.000
Queen-0.000
 achieve-0.000
 sign-0.000
Token over
Token ID625
Feature activation-1.84e-02

Pos logit contributions
joy+0.030
 injuries+0.030
 nowhere+0.025
irmed+0.025
awed+0.025
Neg logit contributions
 ring-0.017
Queen-0.016
 sign-0.015
 pilot-0.015
 achieve-0.015
Tokenjoy
Token ID2633
Feature activation+1.21e-01

Pos logit contributions
 ring+0.024
 also+0.023
 sign+0.022
 guard+0.021
 understand+0.021
Neg logit contributions
 injuries-0.041
joy-0.038
awed-0.035
sed-0.034
irmed-0.034
Tokened
Token ID276
Feature activation+1.81e-02

Pos logit contributions
 injuries+0.160
joy+0.139
awed+0.121
sed+0.115
irmed+0.113
Neg logit contributions
 ring-0.270
 also-0.265
 sign-0.255
 guard-0.254
 understand-0.252
Token.
Token ID13
Feature activation+7.77e-03

Pos logit contributions
 injuries+0.040
joy+0.038
ishing+0.033
irmed+0.032
awed+0.032
Neg logit contributions
 ring-0.023
Queen-0.022
 rob-0.022
 priest-0.022
 pilot-0.022
Token He
Token ID679
Feature activation-8.46e-03

Pos logit contributions
joy+0.017
 injuries+0.016
irmed+0.014
ksh+0.014
awed+0.014
Neg logit contributions
 ring-0.011
 sign-0.010
 priest-0.010
 pilot-0.010
 also-0.009
Token had
Token ID550
Feature activation+4.09e-02

Pos logit contributions
 ring+0.014
Queen+0.013
 priest+0.013
 achieve+0.013
 rob+0.013
Neg logit contributions
 injuries-0.014
joy-0.013
awed-0.013
sed-0.013
irmed-0.013
Token conquered
Token ID29346
Feature activation+3.34e-02

Pos logit contributions
 injuries+0.084
joy+0.079
awed+0.070
irmed+0.069
sed+0.068
Neg logit contributions
 ring-0.060
 also-0.054
 guard-0.054
 pilot-0.054
 sign-0.054
Token her
Token ID607
Feature activation+1.15e-02

Pos logit contributions
 injuries+0.030
joy+0.027
awed+0.023
sed+0.023
irmed+0.023
Neg logit contributions
 ring-0.025
 also-0.024
 rob-0.023
 pilot-0.023
 understand-0.023
Token friend
Token ID1545
Feature activation-1.51e-02

Pos logit contributions
 injuries+0.026
joy+0.024
sed+0.022
awed+0.021
irmed+0.021
Neg logit contributions
 ring-0.015
 also-0.014
 understand-0.013
 sign-0.013
 priest-0.013
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
 ring+0.018
 also+0.016
 priest+0.016
 sign+0.016
 rob+0.016
Neg logit contributions
 injuries-0.035
joy-0.033
awed-0.029
sed-0.029
irmed-0.029
Token She
Token ID1375
Feature activation-9.96e-03

Pos logit contributions
 injuries+0.023
joy+0.023
awed+0.020
ksh+0.020
irmed+0.020
Neg logit contributions
 ring-0.014
 rob-0.012
 sign-0.012
 also-0.012
 priest-0.011
Token w
Token ID266
Feature activation+1.77e-02

Pos logit contributions
 ring+0.015
 achieve+0.015
Queen+0.014
 sign+0.014
 rob+0.014
Neg logit contributions
 injuries-0.018
joy-0.017
sed-0.016
awed-0.016
irmed-0.015
Tokenagged
Token ID14655
Feature activation+1.21e-01

Pos logit contributions
 injuries+0.020
joy+0.017
awed+0.013
sed+0.013
ksh+0.012
Neg logit contributions
 ring-0.043
 also-0.042
 understand-0.041
 sign-0.041
 guard-0.040
Token her
Token ID607
Feature activation+9.65e-03

Pos logit contributions
 injuries+0.254
joy+0.250
awed+0.231
sed+0.225
irmed+0.219
Neg logit contributions
 ring-0.156
 achieve-0.155
 also-0.148
 priest-0.147
Queen-0.145
Token tail
Token ID7894
Feature activation+3.45e-02

Pos logit contributions
 injuries+0.017
joy+0.015
awed+0.013
sed+0.013
irmed+0.013
Neg logit contributions
 ring-0.018
 also-0.017
 sign-0.016
 understand-0.016
 learn-0.016
Token and
Token ID290
Feature activation+4.80e-03

Pos logit contributions
 injuries+0.096
joy+0.093
awed+0.086
sed+0.085
irmed+0.083
Neg logit contributions
 ring-0.024
 priest-0.021
 achieve-0.021
 also-0.020
 know-0.020
Token bark
Token ID21405
Feature activation+4.01e-02

Pos logit contributions
 injuries+0.009
joy+0.009
sed+0.008
awed+0.008
irmed+0.008
Neg logit contributions
 ring-0.007
 achieve-0.007
 also-0.007
 priest-0.007
 pilot-0.007
Tokened
Token ID276
Feature activation+1.35e-02

Pos logit contributions
 injuries+0.053
joy+0.045
awed+0.039
sed+0.038
irmed+0.037
Neg logit contributions
 ring-0.089
 also-0.087
 sign-0.084
 guard-0.084
 understand-0.083
Token\n
Token ID198
Feature activation-7.30e-03

Pos logit contributions
 ring+0.011
 pilot+0.010
 priest+0.010
 rob+0.010
 guard+0.009
Neg logit contributions
 injuries-0.017
joy-0.016
awed-0.015
 nowhere-0.014
aj-0.014
TokenThe
Token ID464
Feature activation-1.38e-02

Pos logit contributions
 ring+0.010
 priest+0.009
 rob+0.009
 also+0.009
 pilot+0.009
Neg logit contributions
 injuries-0.015
joy-0.014
awed-0.013
irmed-0.012
 nowhere-0.012
Token old
Token ID1468
Feature activation+4.05e-03

Pos logit contributions
 ring+0.012
 also+0.011
 sign+0.011
 guard+0.010
 learn+0.010
Neg logit contributions
 injuries-0.038
joy-0.035
awed-0.033
sed-0.033
irmed-0.032
Token man
Token ID582
Feature activation+5.45e-04

Pos logit contributions
 injuries+0.011
joy+0.011
awed+0.010
sed+0.010
irmed+0.010
Neg logit contributions
 also-0.002
 achieve-0.002
 ring-0.002
 understand-0.002
 rob-0.002
Token his
Token ID465
Feature activation+2.55e-02

Pos logit contributions
 injuries+0.001
awed+0.001
joy+0.001
sed+0.001
uddled+0.001
Neg logit contributions
 achieve-0.001
 name-0.001
 ring-0.001
 learn-0.001
 priest-0.001
Tokensed
Token ID36622
Feature activation+1.20e-01

Pos logit contributions
 injuries+0.061
joy+0.056
awed+0.053
irmed+0.051
ksh+0.051
Neg logit contributions
 ring-0.030
 also-0.029
 sign-0.027
 guard-0.027
 understand-0.027
Token,
Token ID11
Feature activation-1.91e-03

Pos logit contributions
 injuries+0.379
joy+0.352
sed+0.335
awed+0.335
irmed+0.328
Neg logit contributions
 ring-0.046
 also-0.041
 understand-0.035
 sign-0.033
 guard-0.032
Token
Token ID6184
Feature activation-3.85e-02

Pos logit contributions
 ring+0.002
 also+0.002
 priest+0.002
 pilot+0.002
 learn+0.002
Neg logit contributions
 injuries-0.004
joy-0.004
awed-0.004
sed-0.004
irmed-0.004
Token
Token ID95
Feature activation+4.08e-03

Pos logit contributions
 ring+0.061
 also+0.060
 sign+0.057
 understand+0.056
 guard+0.056
Neg logit contributions
 injuries-0.075
joy-0.068
awed-0.062
sed-0.061
irmed-0.060
Token
Token ID26391
Feature activation+1.08e-02

Pos logit contributions
 injuries+0.008
joy+0.007
awed+0.007
sed+0.007
irmed+0.007
Neg logit contributions
 ring-0.006
 also-0.006
 sign-0.006
 guard-0.006
 learn-0.006
Token
Token ID129
Feature activation+3.45e-02

Pos logit contributions
 injuries+0.024
joy+0.022
awed+0.020
sed+0.020
irmed+0.019
Neg logit contributions
 ring-0.015
 also-0.014
 sign-0.013
 guard-0.013
 understand-0.013
Token\n
Token ID198
Feature activation-5.44e-03

Pos logit contributions
 injuries+0.038
joy+0.038
awed+0.035
aj+0.035
irmed+0.034
Neg logit contributions
 ring-0.020
 priest-0.017
 sign-0.017
 guard-0.017
 bell-0.016
TokenL
Token ID43
Feature activation+1.54e-02

Pos logit contributions
 ring+0.009
 guard+0.008
 cross+0.008
 sign+0.008
 look+0.008
Neg logit contributions
 injuries-0.010
joy-0.010
awed-0.008
ksh-0.008
 nowhere-0.008
Tokenilly
Token ID6548
Feature activation-1.35e-02

Pos logit contributions
 injuries+0.027
joy+0.024
awed+0.022
sed+0.021
irmed+0.021
Neg logit contributions
 ring-0.028
 also-0.027
 sign-0.026
 guard-0.026
 understand-0.026
Token was
Token ID373
Feature activation+1.32e-02

Pos logit contributions
 ring+0.019
 priest+0.017
 understand+0.017
 sign+0.016
Queen+0.016
Neg logit contributions
 injuries-0.025
joy-0.024
sed-0.024
awed-0.023
uddled-0.023
Token over
Token ID625
Feature activation-2.03e-02

Pos logit contributions
joy+0.030
 injuries+0.029
 nowhere+0.026
irmed+0.024
awed+0.024
Neg logit contributions
 ring-0.016
 sign-0.014
 slide-0.014
Queen-0.014
 achieve-0.013
Tokenjoy
Token ID2633
Feature activation+1.20e-01

Pos logit contributions
 ring+0.055
 also+0.055
 sign+0.054
 guard+0.053
 understand+0.053
Neg logit contributions
 injuries-0.016
joy-0.010
awed-0.010
sed-0.009
ksh-0.009
Tokened
Token ID276
Feature activation+1.67e-02

Pos logit contributions
 injuries+0.176
joy+0.144
sed+0.129
awed+0.125
irmed+0.120
Neg logit contributions
 ring-0.250
 also-0.246
 understand-0.239
 guard-0.238
 sign-0.237
Token and
Token ID290
Feature activation+6.21e-03

Pos logit contributions
 injuries+0.042
joy+0.041
awed+0.036
ishing+0.035
irmed+0.035
Neg logit contributions
 ring-0.017
 priest-0.016
 rob-0.015
 also-0.015
 pilot-0.015
Token felt
Token ID2936
Feature activation+1.15e-02

Pos logit contributions
 injuries+0.011
joy+0.011
awed+0.010
sed+0.010
irmed+0.010
Neg logit contributions
 ring-0.010
Queen-0.009
 priest-0.009
 sign-0.009
 pilot-0.009
Token so
Token ID523
Feature activation-8.56e-03

Pos logit contributions
joy+0.026
 injuries+0.026
irmed+0.023
 nowhere+0.021
awed+0.021
Neg logit contributions
 ring-0.014
 slide-0.012
 sign-0.012
 priest-0.012
 pilot-0.012
Token lucky
Token ID9670
Feature activation+1.28e-02

Pos logit contributions
 ring+0.009
 pilot+0.008
 sign+0.008
 rob+0.008
 priest+0.007
Neg logit contributions
 injuries-0.021
joy-0.020
irmed-0.018
awed-0.018
sed-0.017
Token raised
Token ID4376
Feature activation+1.70e-02

Pos logit contributions
 injuries+0.035
awed+0.033
joy+0.033
sed+0.031
irmed+0.030
Neg logit contributions
 learn-0.022
 achieve-0.022
 understand-0.022
 ring-0.022
 priest-0.022
Token its
Token ID663
Feature activation+3.66e-02

Pos logit contributions
 injuries+0.038
joy+0.037
awed+0.032
sed+0.031
irmed+0.031
Neg logit contributions
 ring-0.021
 also-0.020
 achieve-0.020
 priest-0.020
 pilot-0.020
Token hood
Token ID14263
Feature activation+5.19e-02

Pos logit contributions
 injuries+0.074
joy+0.065
sed+0.063
irmed+0.058
awed+0.058
Neg logit contributions
 also-0.055
Queen-0.052
 achieve-0.052
 ring-0.051
 priest-0.051
Token and
Token ID290
Feature activation+4.53e-03

Pos logit contributions
 injuries+0.118
joy+0.110
awed+0.109
ishing+0.106
less+0.101
Neg logit contributions
Queen-0.060
 learn-0.057
 rob-0.056
 pilot-0.056
 priest-0.056
Token his
Token ID465
Feature activation+2.58e-02

Pos logit contributions
 injuries+0.010
joy+0.009
awed+0.008
sed+0.008
irmed+0.008
Neg logit contributions
 ring-0.006
 also-0.006
 learn-0.006
 understand-0.006
 achieve-0.006
Tokensed
Token ID36622
Feature activation+1.20e-01

Pos logit contributions
 injuries+0.052
joy+0.047
sed+0.043
awed+0.042
irmed+0.041
Neg logit contributions
 ring-0.039
 also-0.039
 understand-0.036
 sign-0.036
 learn-0.036
Token,
Token ID11
Feature activation-5.38e-03

Pos logit contributions
 injuries+0.298
joy+0.274
awed+0.257
sed+0.254
irmed+0.250
Neg logit contributions
 ring-0.127
 also-0.121
 sign-0.112
 guard-0.112
 understand-0.111
Token ready
Token ID3492
Feature activation-2.78e-02

Pos logit contributions
Queen+0.007
 ring+0.007
 learn+0.007
 pilot+0.007
 priest+0.007
Neg logit contributions
 injuries-0.011
joy-0.011
sed-0.010
awed-0.010
irmed-0.009
Token to
Token ID284
Feature activation+1.49e-02

Pos logit contributions
 ring+0.040
 also+0.039
 sign+0.037
 understand+0.037
 guard+0.036
Neg logit contributions
 injuries-0.058
joy-0.053
awed-0.048
sed-0.048
irmed-0.047
Token strike
Token ID5587
Feature activation+6.66e-03

Pos logit contributions
 injuries+0.036
joy+0.027
 nowhere+0.025
sed+0.025
 breaths+0.024
Neg logit contributions
a-0.017
Queen-0.015
ele-0.015
"-0.015
 priest-0.014
Token.
Token ID13
Feature activation+3.60e-03

Pos logit contributions
 injuries+0.015
joy+0.014
awed+0.013
ishing+0.012
irmed+0.012
Neg logit contributions
 ring-0.008
 also-0.008
 pilot-0.008
 priest-0.007
 achieve-0.007
Token was
Token ID373
Feature activation+6.21e-03

Pos logit contributions
 injuries+0.002
awed+0.002
sed+0.002
joy+0.002
irmed+0.002
Neg logit contributions
 priest-0.002
 ring-0.002
 achieve-0.002
 understand-0.002
 pilot-0.002
Token so
Token ID523
Feature activation-1.56e-02

Pos logit contributions
 injuries+0.015
joy+0.014
sed+0.013
awed+0.013
 nowhere+0.013
Neg logit contributions
 ring-0.007
 achieve-0.006
 priest-0.006
Queen-0.006
 pilot-0.006
Token happy
Token ID3772
Feature activation+5.59e-03

Pos logit contributions
 ring+0.019
 pilot+0.018
 priest+0.016
Queen+0.016
 sign+0.016
Neg logit contributions
joy-0.033
awed-0.032
 injuries-0.032
irmed-0.031
sed-0.030
Token he
Token ID339
Feature activation+2.13e-02

Pos logit contributions
 injuries+0.013
joy+0.012
awed+0.012
sed+0.011
irmed+0.011
Neg logit contributions
 ring-0.006
 also-0.006
 sign-0.006
 understand-0.006
 guard-0.006
Token w
Token ID266
Feature activation+1.20e-02

Pos logit contributions
 injuries+0.038
joy+0.037
awed+0.035
sed+0.034
irmed+0.034
Neg logit contributions
 ring-0.033
 priest-0.031
 understand-0.031
 sign-0.031
 learn-0.031
Tokenagged
Token ID14655
Feature activation+1.20e-01

Pos logit contributions
 injuries+0.010
joy+0.008
sed+0.005
ksh+0.005
awed+0.005
Neg logit contributions
 ring-0.032
 also-0.032
 guard-0.031
 cross-0.031
 understand-0.031
Token his
Token ID465
Feature activation+2.10e-02

Pos logit contributions
 injuries+0.281
joy+0.267
awed+0.246
sed+0.244
irmed+0.239
Neg logit contributions
 ring-0.134
 also-0.128
 priest-0.126
 pilot-0.123
 achieve-0.121
Token tail
Token ID7894
Feature activation+3.17e-02

Pos logit contributions
 injuries+0.037
joy+0.034
awed+0.030
sed+0.030
irmed+0.029
Neg logit contributions
 ring-0.037
 also-0.036
 sign-0.035
 guard-0.034
 understand-0.034
Token and
Token ID290
Feature activation+3.18e-03

Pos logit contributions
 injuries+0.074
joy+0.070
awed+0.065
sed+0.064
irmed+0.062
Neg logit contributions
 ring-0.036
 also-0.034
 priest-0.033
 achieve-0.033
 sign-0.032
Token l
Token ID300
Feature activation+2.00e-02

Pos logit contributions
 injuries+0.007
joy+0.006
awed+0.006
sed+0.006
irmed+0.005
Neg logit contributions
 ring-0.004
 also-0.004
 sign-0.004
 understand-0.004
 priest-0.004
Tokenicked
Token ID9484
Feature activation+3.18e-02

Pos logit contributions
 injuries+0.029
joy+0.024
ksh+0.021
sed+0.020
ishing+0.020
Neg logit contributions
 also-0.042
 ring-0.041
 guard-0.040
 understand-0.040
a-0.040
Token ball
Token ID2613
Feature activation-1.52e-02

Pos logit contributions
 injuries+0.022
joy+0.020
sed+0.020
awed+0.018
ses+0.017
Neg logit contributions
 achieve-0.012
 understand-0.012
 also-0.012
 know-0.012
 learn-0.011
Token suddenly
Token ID6451
Feature activation+1.38e-02

Pos logit contributions
Queen+0.017
 understand+0.017
 priest+0.017
 ring+0.016
 pilot+0.016
Neg logit contributions
 injuries-0.033
sed-0.031
awed-0.031
joy-0.030
irmed-0.029
Token appeared
Token ID4120
Feature activation+1.53e-02

Pos logit contributions
 injuries+0.030
joy+0.027
awed+0.025
sed+0.025
irmed+0.024
Neg logit contributions
 ring-0.018
 also-0.018
 priest-0.017
 guard-0.017
 pilot-0.017
Token out
Token ID503
Feature activation-2.61e-02

Pos logit contributions
 injuries+0.038
joy+0.035
awed+0.033
sed+0.032
irmed+0.031
Neg logit contributions
 ring-0.016
 also-0.016
 sign-0.014
 guard-0.014
 understand-0.014
Token of
Token ID286
Feature activation-5.50e-04

Pos logit contributions
 ring+0.026
 also+0.024
 understand+0.022
 guard+0.022
 slide+0.022
Neg logit contributions
 injuries-0.067
joy-0.063
awed-0.058
sed-0.056
irmed-0.056
Token nowhere
Token ID12062
Feature activation+1.20e-01

Pos logit contributions
 ring+0.001
 also+0.001
 sign+0.001
 guard+0.001
 cross+0.001
Neg logit contributions
 injuries-0.001
joy-0.001
awed-0.001
ksh-0.001
uddled-0.001
Token!
Token ID0
Feature activation-5.18e-03

Pos logit contributions
 injuries+0.294
joy+0.272
awed+0.254
sed+0.250
irmed+0.247
Neg logit contributions
 ring-0.131
 also-0.126
 sign-0.117
 guard-0.116
 understand-0.114
Token It
Token ID632
Feature activation+1.22e-02

Pos logit contributions
 ring+0.008
 sign+0.007
 priest+0.007
 also+0.007
 pilot+0.007
Neg logit contributions
joy-0.011
 injuries-0.010
awed-0.009
ksh-0.009
sed-0.009
Token had
Token ID550
Feature activation+3.99e-02

Pos logit contributions
 injuries+0.026
joy+0.024
awed+0.023
sed+0.023
irmed+0.023
Neg logit contributions
 ring-0.015
 priest-0.014
 achieve-0.014
 learn-0.014
 understand-0.014
Token been
Token ID587
Feature activation+5.45e-02

Pos logit contributions
 injuries+0.091
joy+0.084
awed+0.078
sed+0.076
irmed+0.075
Neg logit contributions
 ring-0.050
 also-0.048
 sign-0.045
 guard-0.045
 understand-0.044
Token hiding
Token ID11816
Feature activation+3.48e-02

Pos logit contributions
 injuries+0.108
joy+0.097
awed+0.090
sed+0.088
irmed+0.088
Neg logit contributions
 ring-0.084
 also-0.083
 sign-0.078
 guard-0.078
 cross-0.076
Token\n
Token ID198
Feature activation-5.85e-03

Pos logit contributions
 ring+0.010
 pilot+0.009
 bell+0.009
 priest+0.008
 sign+0.008
Neg logit contributions
joy-0.013
 injuries-0.013
awed-0.012
 nowhere-0.011
irmed-0.011
TokenThe
Token ID464
Feature activation-7.01e-03

Pos logit contributions
 ring+0.009
 look+0.008
 name+0.008
 switch+0.008
 priest+0.008
Neg logit contributions
 injuries-0.011
joy-0.011
awed-0.010
 nowhere-0.009
irmed-0.009
Token butterfly
Token ID35113
Feature activation-1.05e-02

Pos logit contributions
 ring+0.006
 also+0.006
 learn+0.006
 understand+0.006
 achieve+0.006
Neg logit contributions
 injuries-0.018
joy-0.017
sed-0.016
awed-0.016
irmed-0.015
Token was
Token ID373
Feature activation+1.53e-02

Pos logit contributions
 priest+0.015
 ring+0.015
 achieve+0.015
 understand+0.015
 sign+0.014
Neg logit contributions
 injuries-0.018
awed-0.017
joy-0.017
sed-0.016
irmed-0.016
Token over
Token ID625
Feature activation-2.06e-02

Pos logit contributions
joy+0.034
 injuries+0.033
awed+0.030
 nowhere+0.030
sed+0.029
Neg logit contributions
 ring-0.018
 slide-0.017
Queen-0.016
 achieve-0.016
 pilot-0.016
Tokenjoy
Token ID2633
Feature activation+1.19e-01

Pos logit contributions
 ring+0.054
 also+0.054
 sign+0.052
 guard+0.052
 cross+0.052
Neg logit contributions
 injuries-0.018
joy-0.013
awed-0.012
sed-0.011
ksh-0.011
Tokened
Token ID276
Feature activation+1.73e-02

Pos logit contributions
 injuries+0.171
joy+0.141
sed+0.124
awed+0.122
irmed+0.117
Neg logit contributions
 ring-0.254
 also-0.249
 understand-0.240
 guard-0.240
 sign-0.240
Token by
Token ID416
Feature activation-1.48e-02

Pos logit contributions
 injuries+0.043
joy+0.043
awed+0.038
ishing+0.037
irmed+0.036
Neg logit contributions
 ring-0.017
 priest-0.017
 pilot-0.016
Queen-0.016
 slide-0.015
Token the
Token ID262
Feature activation-2.97e-03

Pos logit contributions
 ring+0.019
 also+0.018
 priest+0.018
 pilot+0.018
 understand+0.018
Neg logit contributions
 injuries-0.033
joy-0.031
awed-0.027
sed-0.027
irmed-0.026
Token bird
Token ID6512
Feature activation+2.05e-02

Pos logit contributions
 ring+0.003
 also+0.003
 sign+0.003
 guard+0.002
 understand+0.002
Neg logit contributions
 injuries-0.008
joy-0.007
awed-0.007
sed-0.007
irmed-0.006
Token's
Token ID338
Feature activation-1.17e-02

Pos logit contributions
 injuries+0.053
joy+0.049
awed+0.045
sed+0.044
irmed+0.043
Neg logit contributions
 ring-0.018
 priest-0.017
 also-0.017
 guard-0.017
 sign-0.016
Token it
Token ID340
Feature activation+1.49e-03

Pos logit contributions
 injuries+0.099
joy+0.089
awed+0.081
sed+0.080
ted+0.079
Neg logit contributions
 also-0.049
 ring-0.048
 priest-0.048
 achieve-0.047
 pilot-0.045
Token.
Token ID13
Feature activation+4.72e-03

Pos logit contributions
 injuries+0.003
joy+0.003
awed+0.003
sed+0.003
irmed+0.003
Neg logit contributions
 ring-0.002
 also-0.002
 priest-0.002
 rob-0.002
 pilot-0.002
Token The
Token ID383
Feature activation+6.90e-03

Pos logit contributions
 injuries+0.011
joy+0.010
awed+0.009
irmed+0.009
sed+0.009
Neg logit contributions
 ring-0.006
 also-0.005
 sign-0.005
 pilot-0.005
 priest-0.005
Token dog
Token ID3290
Feature activation+1.62e-02

Pos logit contributions
 injuries+0.017
joy+0.016
awed+0.015
sed+0.015
irmed+0.014
Neg logit contributions
 ring-0.007
 also-0.007
 sign-0.007
 understand-0.007
 guard-0.006
Token w
Token ID266
Feature activation+1.53e-02

Pos logit contributions
 injuries+0.031
awed+0.028
joy+0.028
sed+0.028
irmed+0.026
Neg logit contributions
 ring-0.023
 priest-0.023
 achieve-0.023
 pilot-0.022
 learn-0.022
Tokenagged
Token ID14655
Feature activation+1.19e-01

Pos logit contributions
 injuries+0.013
joy+0.011
ksh+0.008
sed+0.007
aj+0.006
Neg logit contributions
 also-0.039
 ring-0.039
 understand-0.039
 cross-0.038
 sign-0.038
Token its
Token ID663
Feature activation+3.43e-02

Pos logit contributions
 injuries+0.239
joy+0.233
awed+0.217
sed+0.215
irmed+0.210
Neg logit contributions
 achieve-0.166
 ring-0.163
 priest-0.162
 pilot-0.156
 also-0.155
Token tail
Token ID7894
Feature activation+3.15e-02

Pos logit contributions
 injuries+0.061
joy+0.054
awed+0.049
sed+0.048
irmed+0.047
Neg logit contributions
 ring-0.061
 also-0.060
 sign-0.057
 guard-0.057
 understand-0.056
Token and
Token ID290
Feature activation+3.99e-03

Pos logit contributions
 injuries+0.088
joy+0.085
awed+0.080
sed+0.079
irmed+0.077
Neg logit contributions
 ring-0.021
 priest-0.019
 achieve-0.018
 also-0.018
 pilot-0.017
Token became
Token ID2627
Feature activation+1.12e-02

Pos logit contributions
 injuries+0.009
joy+0.008
awed+0.007
sed+0.007
irmed+0.007
Neg logit contributions
 ring-0.005
 also-0.005
 sign-0.005
 guard-0.005
 understand-0.005
Token friendly
Token ID8030
Feature activation+2.62e-02

Pos logit contributions
 injuries+0.025
joy+0.025
irmed+0.022
awed+0.022
sed+0.021
Neg logit contributions
 ring-0.014
 priest-0.012
 achieve-0.012
 sign-0.012
 slide-0.012
Tokenggy
Token ID19970
Feature activation+3.66e-02

Pos logit contributions
 ring+0.047
 also+0.042
 rob+0.042
 learn+0.041
 achieve+0.041
Neg logit contributions
 injuries-0.069
joy-0.064
awed-0.060
sed-0.056
irmed-0.056
Token win
Token ID1592
Feature activation-3.18e-02

Pos logit contributions
 injuries+0.071
joy+0.065
awed+0.063
sed+0.062
irmed+0.059
Neg logit contributions
 ring-0.053
 priest-0.050
 rob-0.049
Queen-0.049
 sign-0.048
Tokenned
Token ID2817
Feature activation+2.47e-03

Pos logit contributions
 ring+0.038
 priest+0.035
 pilot+0.034
 also+0.034
 rob+0.033
Neg logit contributions
 injuries-0.074
joy-0.070
awed-0.065
irmed-0.062
sed-0.062
Token and
Token ID290
Feature activation+6.71e-03

Pos logit contributions
 injuries+0.005
joy+0.005
sed+0.004
awed+0.004
 breaths+0.004
Neg logit contributions
 also-0.003
Queen-0.003
 pilot-0.003
 priest-0.003
 exc-0.003
Token w
Token ID266
Feature activation+1.78e-02

Pos logit contributions
 injuries+0.014
joy+0.013
irmed+0.012
sed+0.012
awed+0.012
Neg logit contributions
 ring-0.009
Queen-0.008
 pilot-0.008
 priest-0.008
 also-0.008
Tokenagged
Token ID14655
Feature activation+1.19e-01

Pos logit contributions
 injuries+0.023
joy+0.020
ksh+0.016
sed+0.016
awed+0.016
Neg logit contributions
 ring-0.039
 also-0.039
 understand-0.038
 cross-0.038
 learn-0.037
Token his
Token ID465
Feature activation+2.59e-02

Pos logit contributions
 injuries+0.252
joy+0.244
sed+0.220
awed+0.219
irmed+0.212
Neg logit contributions
 ring-0.150
 priest-0.148
Queen-0.148
 also-0.147
 achieve-0.147
Token tail
Token ID7894
Feature activation+3.34e-02

Pos logit contributions
 injuries+0.047
joy+0.043
awed+0.039
irmed+0.037
sed+0.037
Neg logit contributions
 ring-0.044
 also-0.043
 guard-0.042
 sign-0.042
 understand-0.041
Token.
Token ID13
Feature activation+5.84e-03

Pos logit contributions
 injuries+0.074
joy+0.069
awed+0.063
sed+0.063
irmed+0.061
Neg logit contributions
 ring-0.042
 also-0.040
 priest-0.039
 rob-0.038
 sign-0.038
Token Debbie
Token ID31311
Feature activation-3.92e-03

Pos logit contributions
joy+0.012
 injuries+0.011
sed+0.010
irmed+0.010
awed+0.010
Neg logit contributions
 ring-0.009
 pilot-0.008
 switch-0.008
 rob-0.008
 sign-0.008
Token smiled
Token ID13541
Feature activation+3.57e-02

Pos logit contributions
 ring+0.006
 sign+0.005
 priest+0.005
 achieve+0.005
Queen+0.005
Neg logit contributions
 injuries-0.007
awed-0.007
joy-0.007
irmed-0.006
sed-0.006
Token up
Token ID510
Feature activation-3.28e-02

Pos logit contributions
 injuries+0.004
joy+0.004
irmed+0.003
awed+0.003
sed+0.003
Neg logit contributions
 ring-0.003
 priest-0.003
Queen-0.003
 achieve-0.003
 rob-0.003
Token to
Token ID284
Feature activation+1.56e-02

Pos logit contributions
 ring+0.044
 also+0.041
 understand+0.040
 pilot+0.040
 priest+0.039
Neg logit contributions
 injuries-0.072
joy-0.067
awed-0.061
sed-0.060
irmed-0.059
Token Jack
Token ID3619
Feature activation-2.16e-02

Pos logit contributions
 injuries+0.032
joy+0.029
awed+0.026
sed+0.026
irmed+0.025
Neg logit contributions
 ring-0.024
 also-0.023
 sign-0.022
 guard-0.022
 understand-0.021
Token and
Token ID290
Feature activation+7.75e-03

Pos logit contributions
 priest+0.018
Queen+0.018
 ring+0.017
 guard+0.016
 pilot+0.016
Neg logit contributions
 injuries-0.055
joy-0.052
ishing-0.049
awed-0.048
ksh-0.047
Token w
Token ID266
Feature activation+1.64e-02

Pos logit contributions
 injuries+0.015
joy+0.014
sed+0.013
awed+0.013
irmed+0.013
Neg logit contributions
 ring-0.011
 achieve-0.011
 sign-0.011
 also-0.011
 learn-0.010
Tokenagged
Token ID14655
Feature activation+1.19e-01

Pos logit contributions
 injuries+0.023
joy+0.021
ksh+0.018
sed+0.016
awed+0.015
Neg logit contributions
 also-0.034
 understand-0.034
 ring-0.033
 cross-0.033
 know-0.032
Token their
Token ID511
Feature activation+4.06e-02

Pos logit contributions
 injuries+0.290
joy+0.270
awed+0.252
sed+0.248
irmed+0.246
Neg logit contributions
 ring-0.129
 also-0.123
 sign-0.114
 guard-0.113
 priest-0.113
Token tails
Token ID30514
Feature activation+5.77e-02

Pos logit contributions
 injuries+0.058
joy+0.051
awed+0.045
sed+0.042
irmed+0.042
Neg logit contributions
 ring-0.086
 also-0.084
 sign-0.081
 guard-0.081
 understand-0.080
Token.
Token ID13
Feature activation+5.34e-03

Pos logit contributions
 injuries+0.137
joy+0.127
awed+0.118
sed+0.116
irmed+0.114
Neg logit contributions
 ring-0.066
 also-0.062
 sign-0.058
 guard-0.058
 priest-0.058
Token Jack
Token ID3619
Feature activation-2.37e-02

Pos logit contributions
 injuries+0.012
joy+0.011
awed+0.010
sed+0.010
irmed+0.009
Neg logit contributions
 ring-0.007
 also-0.007
 sign-0.007
 guard-0.007
 pilot-0.007
Token was
Token ID373
Feature activation+1.03e-02

Pos logit contributions
 ring+0.036
 priest+0.035
 achieve+0.034
Queen+0.033
 sign+0.033
Neg logit contributions
 injuries-0.041
joy-0.040
awed-0.040
sed-0.039
irmed-0.038
TokenAb
Token ID4826
Feature activation+5.65e-03

Pos logit contributions
 ring+0.008
 pilot+0.006
 guard+0.006
 name+0.006
 look+0.006
Neg logit contributions
 injuries-0.009
joy-0.009
awed-0.008
irmed-0.008
ksh-0.008
Tokenig
Token ID328
Feature activation+1.54e-02

Pos logit contributions
 injuries+0.011
joy+0.010
awed+0.009
sed+0.009
irmed+0.008
Neg logit contributions
 ring-0.009
 also-0.009
 sign-0.009
 guard-0.009
 learn-0.009
Tokenail
Token ID603
Feature activation-6.60e-03

Pos logit contributions
 injuries+0.020
joy+0.018
awed+0.017
sed+0.016
aj+0.015
Neg logit contributions
 ring-0.033
 understand-0.032
 also-0.032
 priest-0.032
 sign-0.031
Token was
Token ID373
Feature activation+1.40e-02

Pos logit contributions
 ring+0.009
 priest+0.009
 achieve+0.008
Queen+0.008
 understand+0.008
Neg logit contributions
 injuries-0.012
joy-0.012
awed-0.012
irmed-0.012
sed-0.012
Token over
Token ID625
Feature activation-1.91e-02

Pos logit contributions
 injuries+0.032
joy+0.032
irmed+0.028
awed+0.028
 nowhere+0.028
Neg logit contributions
 ring-0.016
 slide-0.015
 pilot-0.014
 sign-0.014
 achieve-0.014
Tokenjoy
Token ID2633
Feature activation+1.19e-01

Pos logit contributions
 ring+0.021
 also+0.020
 sign+0.019
 guard+0.019
 learn+0.018
Neg logit contributions
 injuries-0.046
joy-0.042
awed-0.040
sed-0.040
irmed-0.039
Tokened
Token ID276
Feature activation+1.80e-02

Pos logit contributions
 injuries+0.157
joy+0.135
awed+0.117
sed+0.113
irmed+0.109
Neg logit contributions
 ring-0.264
 also-0.259
 sign-0.250
 guard-0.249
 understand-0.247
Token!
Token ID0
Feature activation+1.00e-03

Pos logit contributions
 injuries+0.040
joy+0.038
ishing+0.033
awed+0.033
irmed+0.033
Neg logit contributions
 ring-0.022
 priest-0.022
 pilot-0.021
 rob-0.021
Queen-0.020
Token She
Token ID1375
Feature activation-8.93e-03

Pos logit contributions
joy+0.002
 injuries+0.002
irmed+0.002
awed+0.002
ksh+0.002
Neg logit contributions
 ring-0.002
 priest-0.001
 sign-0.001
 also-0.001
aches-0.001
Token thanked
Token ID26280
Feature activation+5.33e-03

Pos logit contributions
 ring+0.014
 priest+0.014
Queen+0.013
 achieve+0.013
 sign+0.013
Neg logit contributions
 injuries-0.015
awed-0.014
sed-0.014
joy-0.014
irmed-0.014
Token the
Token ID262
Feature activation-1.59e-03

Pos logit contributions
 injuries+0.012
joy+0.011
ted+0.009
awed+0.009
 breaths+0.009
Neg logit contributions
 ring-0.006
 also-0.006
 guard-0.006
 priest-0.006
a-0.006
Token so
Token ID523
Feature activation-1.38e-02

Pos logit contributions
 injuries+0.058
joy+0.056
awed+0.052
irmed+0.050
sed+0.049
Neg logit contributions
 ring-0.024
 also-0.021
 slide-0.021
 sign-0.021
 achieve-0.021
Token friendly
Token ID8030
Feature activation+2.66e-02

Pos logit contributions
 ring+0.013
 also+0.012
 sign+0.011
 guard+0.011
 understand+0.011
Neg logit contributions
 injuries-0.036
joy-0.033
awed-0.031
sed-0.031
irmed-0.030
Token.
Token ID13
Feature activation+6.72e-03

Pos logit contributions
 injuries+0.065
joy+0.060
awed+0.056
sed+0.055
irmed+0.054
Neg logit contributions
 ring-0.029
 also-0.028
 sign-0.026
 understand-0.026
 guard-0.026
Token Max
Token ID5436
Feature activation+2.24e-02

Pos logit contributions
 injuries+0.014
joy+0.013
awed+0.012
sed+0.012
irmed+0.012
Neg logit contributions
 ring-0.010
 also-0.009
 sign-0.009
 priest-0.009
 guard-0.009
Token w
Token ID266
Feature activation+1.39e-02

Pos logit contributions
 injuries+0.036
joy+0.036
awed+0.035
sed+0.034
irmed+0.033
Neg logit contributions
 ring-0.035
 priest-0.034
 switch-0.033
 achieve-0.033
 sign-0.033
Tokenagged
Token ID14655
Feature activation+1.19e-01

Pos logit contributions
 injuries+0.011
joy+0.009
ksh+0.006
awed+0.006
sed+0.005
Neg logit contributions
 ring-0.038
 also-0.037
 understand-0.037
 guard-0.036
 cross-0.036
Token his
Token ID465
Feature activation+2.31e-02

Pos logit contributions
 injuries+0.268
joy+0.256
awed+0.233
sed+0.233
irmed+0.227
Neg logit contributions
 ring-0.139
 achieve-0.135
 also-0.135
 priest-0.133
Queen-0.128
Token tail
Token ID7894
Feature activation+3.18e-02

Pos logit contributions
 injuries+0.041
joy+0.037
awed+0.034
sed+0.033
irmed+0.032
Neg logit contributions
 ring-0.041
 also-0.040
 sign-0.038
 guard-0.038
 understand-0.037
Token and
Token ID290
Feature activation+2.92e-03

Pos logit contributions
 injuries+0.091
joy+0.088
awed+0.082
sed+0.081
irmed+0.079
Neg logit contributions
 ring-0.020
 priest-0.017
 also-0.017
 achieve-0.017
 sign-0.016
Token felt
Token ID2936
Feature activation+6.78e-03

Pos logit contributions
 injuries+0.006
joy+0.006
awed+0.005
sed+0.005
irmed+0.005
Neg logit contributions
 ring-0.004
 also-0.004
 sign-0.004
 understand-0.004
 priest-0.004
Token happy
Token ID3772
Feature activation+7.17e-03

Pos logit contributions
joy+0.015
 injuries+0.015
irmed+0.013
awed+0.013
 nowhere+0.013
Neg logit contributions
 ring-0.008
 slide-0.007
 sign-0.007
 priest-0.007
 pilot-0.007
Token in
Token ID287
Feature activation-7.80e-03

Pos logit contributions
 ring+0.023
 understand+0.023
 rob+0.022
 know+0.022
Queen+0.022
Neg logit contributions
 injuries-0.063
joy-0.060
awed-0.055
ishing-0.053
irmed-0.053
Token a
Token ID257
Feature activation-3.72e-04

Pos logit contributions
 ring+0.010
 pilot+0.009
 also+0.009
 priest+0.009
 achieve+0.008
Neg logit contributions
 injuries-0.018
joy-0.017
awed-0.015
irmed-0.015
 nowhere-0.014
Token flash
Token ID7644
Feature activation+2.54e-02

Pos logit contributions
 ring+0.000
 also+0.000
 understand+0.000
 rob+0.000
 know+0.000
Neg logit contributions
 injuries-0.001
joy-0.001
awed-0.001
irmed-0.001
ksh-0.001
Token and
Token ID290
Feature activation+3.00e-03

Pos logit contributions
 injuries+0.061
joy+0.057
awed+0.053
ishing+0.052
irmed+0.050
Neg logit contributions
 ring-0.026
 rob-0.026
 learn-0.026
 also-0.025
 know-0.025
Token w
Token ID266
Feature activation+1.64e-02

Pos logit contributions
 injuries+0.006
joy+0.006
awed+0.005
sed+0.005
irmed+0.005
Neg logit contributions
 ring-0.004
 also-0.004
 sign-0.004
 pilot-0.004
 guard-0.004
Tokenagged
Token ID14655
Feature activation+1.19e-01

Pos logit contributions
 injuries+0.021
joy+0.018
sed+0.015
awed+0.015
ksh+0.015
Neg logit contributions
 ring-0.036
 also-0.036
 understand-0.035
 cross-0.035
 sign-0.035
Token his
Token ID465
Feature activation+2.57e-02

Pos logit contributions
 injuries+0.272
joy+0.266
awed+0.239
sed+0.231
irmed+0.229
Neg logit contributions
 ring-0.136
 also-0.133
Queen-0.131
 achieve-0.130
 pilot-0.128
Token tail
Token ID7894
Feature activation+3.08e-02

Pos logit contributions
 injuries+0.049
joy+0.044
awed+0.040
irmed+0.038
ksh+0.038
Neg logit contributions
 ring-0.042
 also-0.041
 guard-0.040
 sign-0.040
 cross-0.039
Token happily
Token ID18177
Feature activation-6.62e-03

Pos logit contributions
 injuries+0.071
joy+0.066
awed+0.060
sed+0.059
irmed+0.058
Neg logit contributions
 ring-0.037
 also-0.036
 priest-0.034
 sign-0.034
 learn-0.034
Token.
Token ID13
Feature activation+6.35e-03

Pos logit contributions
 ring+0.008
 also+0.008
 sign+0.007
 guard+0.007
 understand+0.007
Neg logit contributions
 injuries-0.016
joy-0.014
awed-0.013
sed-0.013
irmed-0.013
Token From
Token ID3574
Feature activation-8.12e-03

Pos logit contributions
 injuries+0.015
joy+0.015
awed+0.013
irmed+0.013
sed+0.013
Neg logit contributions
 ring-0.008
 pilot-0.007
 priest-0.007
 sign-0.007
 also-0.007
Token.
Token ID13
Feature activation+6.63e-03

Pos logit contributions
 ring+0.036
 also+0.035
 rob+0.033
 cross+0.033
Queen+0.033
Neg logit contributions
 injuries-0.049
joy-0.045
awed-0.040
sed-0.039
ishing-0.038
Token The
Token ID383
Feature activation+6.04e-03

Pos logit contributions
joy+0.013
 injuries+0.012
ksh+0.011
awed+0.010
irmed+0.010
Neg logit contributions
 ring-0.011
 sign-0.009
 pilot-0.009
 priest-0.009
aches-0.009
Token princess
Token ID21752
Feature activation-2.48e-02

Pos logit contributions
 injuries+0.014
joy+0.013
sed+0.012
 nowhere+0.011
irmed+0.011
Neg logit contributions
 achieve-0.007
 understand-0.006
 ring-0.006
 also-0.006
 know-0.006
Token was
Token ID373
Feature activation+1.22e-02

Pos logit contributions
 ring+0.033
 cross+0.033
 slide+0.032
 achieve+0.032
 sign+0.032
Neg logit contributions
 injuries-0.045
awed-0.043
joy-0.043
uddled-0.041
sed-0.040
Token over
Token ID625
Feature activation-2.29e-02

Pos logit contributions
joy+0.026
 injuries+0.025
 nowhere+0.023
sed+0.021
awed+0.021
Neg logit contributions
 slide-0.015
 ring-0.015
Queen-0.014
 achieve-0.013
 pilot-0.013
Tokenjoy
Token ID2633
Feature activation+1.19e-01

Pos logit contributions
 ring+0.061
 also+0.060
 sign+0.058
 guard+0.058
 understand+0.058
Neg logit contributions
 injuries-0.020
joy-0.016
awed-0.012
sed-0.011
irmed-0.011
Tokened
Token ID276
Feature activation+1.50e-02

Pos logit contributions
 injuries+0.142
joy+0.128
awed+0.110
sed+0.100
irmed+0.100
Neg logit contributions
 ring-0.276
 also-0.271
 sign-0.261
 guard-0.259
 learn-0.258
Token!
Token ID0
Feature activation+1.27e-03

Pos logit contributions
 injuries+0.036
joy+0.036
awed+0.031
ishing+0.031
irmed+0.030
Neg logit contributions
 ring-0.015
 priest-0.014
 pilot-0.014
Queen-0.014
 rob-0.014
Token She
Token ID1375
Feature activation-1.25e-02

Pos logit contributions
joy+0.003
 injuries+0.002
awed+0.002
irmed+0.002
ksh+0.002
Neg logit contributions
 ring-0.002
 sign-0.002
 priest-0.002
 pilot-0.002
aches-0.002
Token was
Token ID373
Feature activation+1.19e-02

Pos logit contributions
 ring+0.019
Queen+0.019
 priest+0.018
 achieve+0.018
 pilot+0.018
Neg logit contributions
 injuries-0.020
joy-0.019
awed-0.019
sed-0.018
uddled-0.018
Token so
Token ID523
Feature activation-1.13e-02

Pos logit contributions
joy+0.024
 injuries+0.022
 nowhere+0.021
years+0.019
less+0.019
Neg logit contributions
Queen-0.016
 ring-0.016
 slide-0.015
 achieve-0.014
 crane-0.014
Token fought
Token ID8350
Feature activation+2.95e-02

Pos logit contributions
 injuries+0.016
joy+0.014
awed+0.013
sed+0.013
irmed+0.013
Neg logit contributions
 ring-0.008
 also-0.008
 sign-0.008
 priest-0.007
 pilot-0.007
Token back
Token ID736
Feature activation-5.65e-03

Pos logit contributions
 injuries+0.069
joy+0.063
awed+0.058
sed+0.057
irmed+0.056
Neg logit contributions
 ring-0.035
 also-0.034
 priest-0.033
 pilot-0.032
 rob-0.032
Token.
Token ID13
Feature activation+6.90e-03

Pos logit contributions
 priest+0.008
 ring+0.008
 also+0.008
 pilot+0.008
 rob+0.008
Neg logit contributions
 injuries-0.011
joy-0.010
ishing-0.009
awed-0.009
irmed-0.009
Token They
Token ID1119
Feature activation-2.91e-03

Pos logit contributions
 injuries+0.014
joy+0.013
awed+0.012
irmed+0.012
sed+0.012
Neg logit contributions
 ring-0.011
 priest-0.010
 also-0.010
 sign-0.010
 pilot-0.009
Token his
Token ID465
Feature activation+2.52e-02

Pos logit contributions
 ring+0.004
 priest+0.004
 sign+0.004
 also+0.004
 rob+0.004
Neg logit contributions
 injuries-0.006
joy-0.005
awed-0.005
sed-0.005
irmed-0.005
Tokensed
Token ID36622
Feature activation+1.18e-01

Pos logit contributions
 injuries+0.061
joy+0.058
awed+0.055
irmed+0.054
ksh+0.053
Neg logit contributions
 ring-0.027
 also-0.026
 sign-0.025
 guard-0.025
 cross-0.024
Token and
Token ID290
Feature activation+4.00e-03

Pos logit contributions
 injuries+0.369
joy+0.346
awed+0.330
sed+0.326
irmed+0.320
Neg logit contributions
 ring-0.045
 also-0.044
 understand-0.037
 sign-0.035
 learn-0.034
Token grow
Token ID1663
Feature activation-5.06e-03

Pos logit contributions
 injuries+0.009
joy+0.008
awed+0.007
sed+0.007
irmed+0.007
Neg logit contributions
 ring-0.006
 also-0.006
 sign-0.005
 guard-0.005
 understand-0.005
Tokenled
Token ID992
Feature activation+4.88e-02

Pos logit contributions
 ring+0.014
 also+0.014
 sign+0.013
 guard+0.013
 understand+0.013
Neg logit contributions
 injuries-0.004
joy-0.003
awed-0.002
sed-0.002
irmed-0.002
Token,
Token ID11
Feature activation-5.60e-03

Pos logit contributions
 injuries+0.117
joy+0.109
awed+0.097
irmed+0.096
sed+0.096
Neg logit contributions
 ring-0.056
 also-0.053
 priest-0.053
Queen-0.051
 rob-0.051
Token sw
Token ID1509
Feature activation+1.35e-02

Pos logit contributions
 priest+0.006
 ring+0.006
Queen+0.006
a+0.005
 pilot+0.005
Neg logit contributions
 injuries-0.014
joy-0.013
sed-0.012
awed-0.012
irmed-0.011
Token so
Token ID523
Feature activation-1.26e-02

Pos logit contributions
 injuries+0.018
joy+0.017
awed+0.016
sed+0.016
 nowhere+0.015
Neg logit contributions
 ring-0.009
 achieve-0.009
 pilot-0.008
 learn-0.008
 priest-0.008
Token excited
Token ID6568
Feature activation-1.74e-02

Pos logit contributions
 ring+0.016
 pilot+0.016
 achieve+0.015
 priest+0.014
Queen+0.014
Neg logit contributions
joy-0.025
awed-0.025
 injuries-0.024
irmed-0.024
sed-0.023
Token,
Token ID11
Feature activation-1.47e-03

Pos logit contributions
 ring+0.018
 also+0.017
 sign+0.016
 guard+0.015
 understand+0.015
Neg logit contributions
 injuries-0.044
joy-0.041
awed-0.038
sed-0.038
irmed-0.037
Token he
Token ID339
Feature activation+2.36e-02

Pos logit contributions
 ring+0.002
 priest+0.002
Queen+0.002
 achieve+0.002
 pilot+0.002
Neg logit contributions
joy-0.003
 injuries-0.003
awed-0.003
sed-0.003
irmed-0.003
Token w
Token ID266
Feature activation+1.29e-02

Pos logit contributions
 injuries+0.036
sed+0.035
joy+0.035
awed+0.035
ounced+0.033
Neg logit contributions
Queen-0.039
 ring-0.038
 priest-0.038
 achieve-0.037
 understand-0.037
Tokenagged
Token ID14655
Feature activation+1.18e-01

Pos logit contributions
 injuries+0.012
joy+0.011
ksh+0.008
sed+0.007
awed+0.007
Neg logit contributions
 ring-0.032
 also-0.032
 cross-0.032
 understand-0.031
 sign-0.031
Token his
Token ID465
Feature activation+2.25e-02

Pos logit contributions
 injuries+0.278
joy+0.266
awed+0.246
sed+0.243
irmed+0.239
Neg logit contributions
 ring-0.131
 also-0.124
 priest-0.122
 pilot-0.119
 achieve-0.119
Token tail
Token ID7894
Feature activation+3.37e-02

Pos logit contributions
 injuries+0.041
joy+0.037
awed+0.033
sed+0.032
irmed+0.032
Neg logit contributions
 ring-0.039
 also-0.038
 sign-0.037
 guard-0.036
 understand-0.036
Token and
Token ID290
Feature activation+5.03e-03

Pos logit contributions
 injuries+0.090
joy+0.086
awed+0.081
sed+0.080
irmed+0.077
Neg logit contributions
 ring-0.027
 also-0.024
 achieve-0.024
 priest-0.024
 learn-0.022
Token w
Token ID266
Feature activation+1.19e-02

Pos logit contributions
 injuries+0.010
joy+0.010
awed+0.009
sed+0.009
irmed+0.009
Neg logit contributions
 ring-0.007
 also-0.007
 sign-0.007
 understand-0.007
 priest-0.007
Tokenigg
Token ID6950
Feature activation-8.65e-03

Pos logit contributions
 injuries+0.011
joy+0.009
ksh+0.007
sed+0.006
aj+0.005
Neg logit contributions
 ring-0.030
 understand-0.029
 also-0.029
 cross-0.029
 know-0.028
Token was
Token ID373
Feature activation+9.69e-03

Pos logit contributions
 priest+0.001
 understand+0.001
 achieve+0.001
 ring+0.001
Queen+0.001
Neg logit contributions
 injuries-0.001
sed-0.001
awed-0.001
joy-0.001
irmed-0.001
Token very
Token ID845
Feature activation-1.15e-02

Pos logit contributions
 injuries+0.025
joy+0.023
sed+0.022
awed+0.022
irmed+0.021
Neg logit contributions
 ring-0.009
 achieve-0.008
 learn-0.008
 sign-0.008
 priest-0.008
Token happy
Token ID3772
Feature activation+7.68e-03

Pos logit contributions
 ring+0.013
 achieve+0.012
 pilot+0.012
Queen+0.012
 slide+0.012
Neg logit contributions
 injuries-0.026
joy-0.024
sed-0.024
awed-0.023
irmed-0.023
Token and
Token ID290
Feature activation+7.26e-03

Pos logit contributions
 injuries+0.020
joy+0.019
awed+0.017
sed+0.017
irmed+0.017
Neg logit contributions
 ring-0.007
 also-0.007
 sign-0.006
 guard-0.006
 understand-0.006
Token w
Token ID266
Feature activation+1.20e-02

Pos logit contributions
 injuries+0.016
joy+0.015
awed+0.014
sed+0.014
irmed+0.013
Neg logit contributions
 ring-0.009
 also-0.009
 sign-0.009
 understand-0.008
 priest-0.008
Tokenagged
Token ID14655
Feature activation+1.18e-01

Pos logit contributions
 injuries+0.011
joy+0.009
ksh+0.006
sed+0.006
awed+0.005
Neg logit contributions
 also-0.032
 ring-0.031
 understand-0.031
 cross-0.030
 guard-0.030
Token his
Token ID465
Feature activation+2.21e-02

Pos logit contributions
 injuries+0.284
joy+0.264
awed+0.246
sed+0.242
irmed+0.238
Neg logit contributions
 ring-0.133
 also-0.128
 sign-0.119
 understand-0.118
 guard-0.117
Token tail
Token ID7894
Feature activation+3.14e-02

Pos logit contributions
 injuries+0.040
joy+0.036
awed+0.033
sed+0.032
irmed+0.031
Neg logit contributions
 ring-0.038
 also-0.037
 guard-0.036
 sign-0.036
 understand-0.035
Token with
Token ID351
Feature activation+8.86e-03

Pos logit contributions
 injuries+0.072
joy+0.066
awed+0.062
sed+0.061
irmed+0.059
Neg logit contributions
 ring-0.038
 also-0.037
 sign-0.035
 priest-0.034
 understand-0.034
Token joy
Token ID8716
Feature activation+2.77e-03

Pos logit contributions
 injuries+0.016
joy+0.015
awed+0.013
sed+0.013
irmed+0.013
Neg logit contributions
 ring-0.015
 also-0.014
 priest-0.014
 achieve-0.014
 pilot-0.014
Token.
Token ID13
Feature activation+8.94e-03

Pos logit contributions
 injuries+0.007
joy+0.006
awed+0.006
sed+0.006
irmed+0.006
Neg logit contributions
 ring-0.003
 also-0.003
 guard-0.003
 sign-0.003
 understand-0.003
Token okay
Token ID8788
Feature activation+2.31e-02

Pos logit contributions
 ring+0.022
 also+0.021
 sign+0.021
 guard+0.021
 understand+0.020
Neg logit contributions
 injuries-0.026
joy-0.023
awed-0.021
sed-0.021
irmed-0.021
Token".
Token ID1911
Feature activation-1.09e-02

Pos logit contributions
 injuries+0.060
joy+0.055
sed+0.051
awed+0.050
irmed+0.049
Neg logit contributions
 ring-0.022
 also-0.021
 sign-0.019
 priest-0.019
 rob-0.019
Token The
Token ID383
Feature activation+7.03e-03

Pos logit contributions
 ring+0.016
 also+0.015
 sign+0.015
 guard+0.015
 priest+0.014
Neg logit contributions
 injuries-0.023
joy-0.021
awed-0.019
sed-0.019
irmed-0.018
Token puppy
Token ID26188
Feature activation+1.33e-02

Pos logit contributions
 injuries+0.016
joy+0.015
awed+0.014
ishing+0.013
irmed+0.013
Neg logit contributions
 ring-0.009
 also-0.009
 guard-0.009
 sign-0.009
 pilot-0.008
Token w
Token ID266
Feature activation+1.80e-02

Pos logit contributions
 injuries+0.026
joy+0.023
sed+0.023
awed+0.022
irmed+0.022
Neg logit contributions
 achieve-0.019
 priest-0.019
 ring-0.018
Queen-0.018
 learn-0.018
Tokenagged
Token ID14655
Feature activation+1.18e-01

Pos logit contributions
 injuries+0.017
joy+0.015
ksh+0.011
sed+0.009
aj+0.008
Neg logit contributions
 also-0.045
 ring-0.045
 understand-0.044
 cross-0.043
 know-0.043
Token his
Token ID465
Feature activation+2.73e-02

Pos logit contributions
 injuries+0.253
joy+0.240
awed+0.222
sed+0.221
irmed+0.218
Neg logit contributions
 ring-0.150
 achieve-0.150
 priest-0.149
 also-0.144
Queen-0.142
Token tail
Token ID7894
Feature activation+3.34e-02

Pos logit contributions
 injuries+0.056
joy+0.052
awed+0.048
ksh+0.046
anged+0.046
Neg logit contributions
 ring-0.040
 also-0.038
 guard-0.038
 sign-0.037
 cross-0.036
Token and
Token ID290
Feature activation+6.49e-03

Pos logit contributions
 injuries+0.096
joy+0.091
awed+0.085
sed+0.084
irmed+0.083
Neg logit contributions
 ring-0.021
 priest-0.018
 also-0.018
 achieve-0.017
 sign-0.017
Token l
Token ID300
Feature activation+2.83e-02

Pos logit contributions
 injuries+0.014
joy+0.013
awed+0.012
sed+0.012
irmed+0.012
Neg logit contributions
 ring-0.009
 also-0.008
 sign-0.008
 guard-0.008
 understand-0.008
Tokenicked
Token ID9484
Feature activation+3.60e-02

Pos logit contributions
 injuries+0.045
joy+0.041
ksh+0.037
aj+0.034
ishing+0.032
Neg logit contributions
 also-0.054
 understand-0.053
a-0.052
 ring-0.052
 guard-0.050
Token.
Token ID13
Feature activation+2.84e-03

Pos logit contributions
 ring+0.007
 also+0.006
 achieve+0.006
Queen+0.006
 sign+0.006
Neg logit contributions
 injuries-0.012
joy-0.011
sed-0.010
awed-0.010
irmed-0.010
Token\n
Token ID198
Feature activation-8.24e-03

Pos logit contributions
joy+0.007
 injuries+0.006
awed+0.006
sed+0.006
ksh+0.005
Neg logit contributions
 ring-0.003
 pilot-0.003
 priest-0.003
 sign-0.003
 learn-0.003
Token\n
Token ID198
Feature activation-6.65e-03

Pos logit contributions
 ring+0.010
 pilot+0.009
 priest+0.008
 guard+0.008
 sign+0.008
Neg logit contributions
 injuries-0.019
joy-0.018
awed-0.017
aj-0.016
irmed-0.015
Token"
Token ID1
Feature activation+2.91e-03

Pos logit contributions
 ring+0.010
 look+0.009
 feel+0.009
 go+0.008
 name+0.008
Neg logit contributions
 injuries-0.013
joy-0.012
awed-0.011
irmed-0.011
 nowhere-0.011
TokenBl
Token ID3629
Feature activation-1.39e-02

Pos logit contributions
 injuries+0.005
joy+0.005
awed+0.004
ksh+0.004
sed+0.004
Neg logit contributions
 know-0.005
 ring-0.005
 learn-0.005
 go-0.004
 achieve-0.004
Tokenub
Token ID549
Feature activation-1.16e-01

Pos logit contributions
 pilot+0.024
 ring+0.024
 also+0.023
 priest+0.023
 even+0.023
Neg logit contributions
 injuries-0.024
joy-0.023
awed-0.022
aj-0.022
anged-0.021
Token!
Token ID0
Feature activation-1.55e-02

Pos logit contributions
 understand+0.150
Queen+0.146
 also+0.144
 rob+0.144
 priest+0.141
Neg logit contributions
 injuries-0.228
joy-0.228
awed-0.216
sed-0.207
ishing-0.206
Token You
Token ID921
Feature activation-2.61e-02

Pos logit contributions
 also+0.027
 ring+0.026
 sign+0.026
 priest+0.025
aches+0.025
Neg logit contributions
joy-0.025
 injuries-0.024
sed-0.022
ksh-0.021
aj-0.020
Token are
Token ID389
Feature activation-2.42e-02

Pos logit contributions
 priest+0.034
 pastor+0.031
 pilot+0.028
Queen+0.027
ister+0.027
Neg logit contributions
 injuries-0.050
sed-0.045
irmed-0.045
 nowhere-0.041
 breaths-0.041
Token very
Token ID845
Feature activation-1.63e-02

Pos logit contributions
 ring+0.025
 pilot+0.024
 slide+0.024
 priest+0.024
 understand+0.023
Neg logit contributions
 injuries-0.056
joy-0.051
sed-0.049
 nowhere-0.047
awed-0.045
Token long
Token ID890
Feature activation+3.47e-02

Pos logit contributions
 pilot+0.022
 ring+0.021
 slide+0.020
 priest+0.019
 switch+0.019
Neg logit contributions
 injuries-0.030
joy-0.029
sed-0.027
irmed-0.027
awed-0.027
Token poking
Token ID39048
Feature activation+1.18e-02

Pos logit contributions
 injuries+0.055
joy+0.050
awed+0.047
sed+0.046
irmed+0.044
Neg logit contributions
 ring-0.034
 also-0.033
 understand-0.032
 learn-0.032
 priest-0.031
Token out
Token ID503
Feature activation-3.36e-02

Pos logit contributions
 injuries+0.026
joy+0.024
awed+0.023
sed+0.023
irmed+0.022
Neg logit contributions
 also-0.014
 ring-0.014
 achieve-0.014
 priest-0.014
 pilot-0.014
Token from
Token ID422
Feature activation-1.77e-02

Pos logit contributions
 ring+0.038
 also+0.037
 priest+0.036
Queen+0.036
 achieve+0.035
Neg logit contributions
 injuries-0.077
joy-0.072
awed-0.067
irmed-0.063
ishing-0.062
Token the
Token ID262
Feature activation-1.41e-02

Pos logit contributions
 ring+0.017
 also+0.016
 sign+0.015
 understand+0.015
 guard+0.015
Neg logit contributions
 injuries-0.045
joy-0.042
awed-0.039
sed-0.039
irmed-0.038
Token shr
Token ID10157
Feature activation+3.03e-03

Pos logit contributions
 ring+0.018
 also+0.018
 understand+0.017
 know+0.017
 achieve+0.017
Neg logit contributions
 injuries-0.031
joy-0.028
sed-0.026
awed-0.025
irmed-0.024
Tokenub
Token ID549
Feature activation-1.14e-01

Pos logit contributions
 injuries+0.006
joy+0.005
awed+0.005
ishing+0.004
irmed+0.004
Neg logit contributions
 ring-0.005
 also-0.005
 sign-0.005
 guard-0.005
 pilot-0.005
Tokenbery
Token ID13001
Feature activation-4.46e-02

Pos logit contributions
 also+0.174
Queen+0.170
 learn+0.169
 understand+0.169
 rob+0.168
Neg logit contributions
 injuries-0.207
joy-0.186
awed-0.176
ishing-0.174
sed-0.172
Token.
Token ID13
Feature activation+1.71e-03

Pos logit contributions
 also+0.064
 achieve+0.061
 priest+0.059
 understand+0.059
 know+0.057
Neg logit contributions
 injuries-0.089
joy-0.079
sed-0.076
ksh-0.071
awed-0.071
Token\n
Token ID198
Feature activation-6.04e-03

Pos logit contributions
joy+0.004
 injuries+0.003
awed+0.003
irmed+0.003
sed+0.003
Neg logit contributions
 ring-0.003
 sign-0.002
 also-0.002
 priest-0.002
 understand-0.002
Token\n
Token ID198
Feature activation-5.33e-03

Pos logit contributions
 ring+0.009
 pilot+0.008
 sign+0.007
 bell+0.007
 priest+0.007
Neg logit contributions
joy-0.013
 injuries-0.012
awed-0.011
aj-0.011
 nowhere-0.011
TokenJack
Token ID14295
Feature activation-5.99e-04

Pos logit contributions
 ring+0.008
 look+0.007
 sign+0.006
 pilot+0.006
 bell+0.006
Neg logit contributions
joy-0.011
 injuries-0.011
awed-0.010
irmed-0.009
aj-0.009
Token.
Token ID13
Feature activation+3.58e-03

Pos logit contributions
 ring+0.005
 achieve+0.005
 also+0.005
Queen+0.005
 slide+0.005
Neg logit contributions
 injuries-0.009
joy-0.008
sed-0.007
awed-0.007
irmed-0.007
Token\n
Token ID198
Feature activation-1.12e-02

Pos logit contributions
joy+0.008
 injuries+0.008
awed+0.008
sed+0.007
irmed+0.007
Neg logit contributions
 ring-0.004
 pilot-0.004
 priest-0.004
 sign-0.004
 also-0.004
Token\n
Token ID198
Feature activation-9.96e-03

Pos logit contributions
 ring+0.014
 pilot+0.012
 priest+0.012
 guard+0.012
 sign+0.011
Neg logit contributions
 injuries-0.026
joy-0.025
awed-0.023
aj-0.021
irmed-0.021
Token"
Token ID1
Feature activation+2.92e-03

Pos logit contributions
 ring+0.016
 look+0.014
 go+0.013
 feel+0.013
 name+0.013
Neg logit contributions
 injuries-0.020
joy-0.018
awed-0.016
irmed-0.016
 nowhere-0.015
TokenBl
Token ID3629
Feature activation-1.35e-02

Pos logit contributions
 injuries+0.005
joy+0.005
awed+0.005
sed+0.004
irmed+0.004
Neg logit contributions
 ring-0.005
 know-0.004
 go-0.004
 learn-0.004
 cross-0.004
Tokenub
Token ID549
Feature activation-1.14e-01

Pos logit contributions
 pilot+0.024
 ring+0.024
 also+0.023
 even+0.023
 priest+0.023
Neg logit contributions
 injuries-0.023
joy-0.021
aj-0.021
awed-0.021
anged-0.021
Token!
Token ID0
Feature activation-1.52e-02

Pos logit contributions
 understand+0.153
 rob+0.145
 know+0.145
 also+0.145
 priest+0.144
Neg logit contributions
 injuries-0.219
joy-0.217
awed-0.207
ishing-0.197
sed-0.195
Token I
Token ID314
Feature activation+6.05e-03

Pos logit contributions
 ring+0.025
 also+0.025
 sign+0.024
aches+0.023
 understand+0.023
Neg logit contributions
joy-0.027
 injuries-0.026
sed-0.024
ksh-0.023
irmed-0.022
Token am
Token ID716
Feature activation-1.02e-02

Pos logit contributions
 injuries+0.013
irmed+0.012
joy+0.011
sed+0.011
ashed+0.010
Neg logit contributions
 priest-0.007
 pilot-0.007
 bell-0.006
 ring-0.006
 rob-0.006
Token the
Token ID262
Feature activation-1.21e-02

Pos logit contributions
 ring+0.012
 slide+0.012
Queen+0.012
 achieve+0.011
 pilot+0.011
Neg logit contributions
 injuries-0.022
joy-0.021
sed-0.019
 nowhere-0.018
 breaths-0.017
Token light
Token ID1657
Feature activation+6.23e-03

Pos logit contributions
 achieve+0.013
 know+0.013
 ring+0.013
aches+0.013
 understand+0.013
Neg logit contributions
 injuries-0.026
joy-0.024
sed-0.023
 nowhere-0.020
ses-0.020
Token.
Token ID13
Feature activation+3.25e-03

Pos logit contributions
Queen+0.025
 also+0.024
 achieve+0.024
 rob+0.023
 priest+0.023
Neg logit contributions
 injuries-0.023
joy-0.020
awed-0.018
less-0.017
sed-0.016
Token\n
Token ID198
Feature activation-1.17e-02

Pos logit contributions
 injuries+0.008
joy+0.007
awed+0.007
sed+0.007
irmed+0.007
Neg logit contributions
 ring-0.004
 guard-0.003
 pilot-0.003
 also-0.003
 priest-0.003
Token\n
Token ID198
Feature activation-1.06e-02

Pos logit contributions
 ring+0.015
 pilot+0.013
 guard+0.012
 rob+0.012
 priest+0.012
Neg logit contributions
 injuries-0.027
joy-0.025
awed-0.023
aj-0.022
irmed-0.022
Token"
Token ID1
Feature activation+3.58e-03

Pos logit contributions
 ring+0.009
 understand+0.007
 guard+0.006
 pilot+0.006
 rob+0.006
Neg logit contributions
 injuries-0.028
joy-0.027
awed-0.026
irmed-0.025
aj-0.024
TokenBl
Token ID3629
Feature activation-7.93e-03

Pos logit contributions
 injuries+0.007
joy+0.006
awed+0.006
ksh+0.005
irmed+0.005
Neg logit contributions
 ring-0.006
 learn-0.005
 also-0.005
 guard-0.005
 cross-0.005
Tokenub
Token ID549
Feature activation-1.12e-01

Pos logit contributions
 pilot+0.013
 also+0.013
 ring+0.013
 priest+0.013
 even+0.012
Neg logit contributions
 injuries-0.014
joy-0.013
anged-0.013
aj-0.013
awed-0.013
Token,
Token ID11
Feature activation-8.17e-03

Pos logit contributions
Queen+0.143
 also+0.139
 rob+0.138
 understand+0.137
 pilot+0.133
Neg logit contributions
 injuries-0.218
joy-0.214
awed-0.205
sed-0.202
ishing-0.200
Token bl
Token ID698
Feature activation-3.41e-02

Pos logit contributions
 achieve+0.011
 understand+0.011
Queen+0.011
 learn+0.011
 know+0.010
Neg logit contributions
 injuries-0.015
joy-0.014
sed-0.014
 breaths-0.012
ksh-0.011
Tokenub
Token ID549
Feature activation-1.12e-01

Pos logit contributions
 also+0.058
 ring+0.056
 pilot+0.055
 rob+0.055
 slide+0.055
Neg logit contributions
awed-0.054
 injuries-0.053
sed-0.053
joy-0.053
aj-0.051
Token,
Token ID11
Feature activation-8.07e-03

Pos logit contributions
 also+0.142
 understand+0.139
Queen+0.139
 rob+0.132
 learn+0.131
Neg logit contributions
 injuries-0.227
joy-0.219
sed-0.212
awed-0.208
ishing-0.204
Token bl
Token ID698
Feature activation-3.44e-02

Pos logit contributions
 achieve+0.011
Queen+0.011
 understand+0.011
 learn+0.010
 priest+0.010
Neg logit contributions
 injuries-0.015
sed-0.014
joy-0.014
ksh-0.012
ted-0.011
Token"
Token ID1
Feature activation+3.58e-03

Pos logit contributions
 ring+0.009
 understand+0.007
 guard+0.006
 pilot+0.006
 rob+0.006
Neg logit contributions
 injuries-0.028
joy-0.027
awed-0.026
irmed-0.025
aj-0.024
TokenBl
Token ID3629
Feature activation-7.93e-03

Pos logit contributions
 injuries+0.007
joy+0.006
awed+0.006
ksh+0.005
irmed+0.005
Neg logit contributions
 ring-0.006
 learn-0.005
 also-0.005
 guard-0.005
 cross-0.005
Tokenub
Token ID549
Feature activation-1.12e-01

Pos logit contributions
 pilot+0.013
 also+0.013
 ring+0.013
 priest+0.013
 even+0.012
Neg logit contributions
 injuries-0.014
joy-0.013
anged-0.013
aj-0.013
awed-0.013
Token,
Token ID11
Feature activation-8.17e-03

Pos logit contributions
Queen+0.143
 also+0.139
 rob+0.138
 understand+0.137
 pilot+0.133
Neg logit contributions
 injuries-0.218
joy-0.214
awed-0.205
sed-0.202
ishing-0.200
Token bl
Token ID698
Feature activation-3.41e-02

Pos logit contributions
 achieve+0.011
 understand+0.011
Queen+0.011
 learn+0.011
 know+0.010
Neg logit contributions
 injuries-0.015
joy-0.014
sed-0.014
 breaths-0.012
ksh-0.011
Tokenub
Token ID549
Feature activation-1.12e-01

Pos logit contributions
 also+0.058
 ring+0.056
 pilot+0.055
 rob+0.055
 slide+0.055
Neg logit contributions
awed-0.054
 injuries-0.053
sed-0.053
joy-0.053
aj-0.051
Token,
Token ID11
Feature activation-8.07e-03

Pos logit contributions
 also+0.142
 understand+0.139
Queen+0.139
 rob+0.132
 learn+0.131
Neg logit contributions
 injuries-0.227
joy-0.219
sed-0.212
awed-0.208
ishing-0.204
Token bl
Token ID698
Feature activation-3.44e-02

Pos logit contributions
 achieve+0.011
Queen+0.011
 understand+0.011
 learn+0.010
 priest+0.010
Neg logit contributions
 injuries-0.015
sed-0.014
joy-0.014
ksh-0.012
ted-0.011
Tokenub
Token ID549
Feature activation-1.12e-01

Pos logit contributions
 also+0.057
 ring+0.055
 slide+0.055
 rob+0.054
 pilot+0.053
Neg logit contributions
awed-0.055
 injuries-0.055
joy-0.055
sed-0.054
aj-0.053
Token,"
Token ID553
Feature activation+1.62e-02

Pos logit contributions
 also+0.151
 understand+0.146
Queen+0.146
 learn+0.141
 rob+0.139
Neg logit contributions
 injuries-0.220
joy-0.208
sed-0.207
awed-0.201
ishing-0.196
Token the
Token ID262
Feature activation-5.85e-03

Pos logit contributions
 injuries+0.024
joy+0.023
sed+0.019
awed+0.018
irmed+0.017
Neg logit contributions
 ring-0.030
 achieve-0.029
 understand-0.029
 guard-0.028
a-0.028
Token,
Token ID11
Feature activation-8.17e-03

Pos logit contributions
Queen+0.143
 also+0.139
 rob+0.138
 understand+0.137
 pilot+0.133
Neg logit contributions
 injuries-0.218
joy-0.214
awed-0.205
sed-0.202
ishing-0.200
Token bl
Token ID698
Feature activation-3.41e-02

Pos logit contributions
 achieve+0.011
 understand+0.011
Queen+0.011
 learn+0.011
 know+0.010
Neg logit contributions
 injuries-0.015
joy-0.014
sed-0.014
 breaths-0.012
ksh-0.011
Tokenub
Token ID549
Feature activation-1.12e-01

Pos logit contributions
 also+0.058
 ring+0.056
 pilot+0.055
 rob+0.055
 slide+0.055
Neg logit contributions
awed-0.054
 injuries-0.053
sed-0.053
joy-0.053
aj-0.051
Token,
Token ID11
Feature activation-8.07e-03

Pos logit contributions
 also+0.142
 understand+0.139
Queen+0.139
 rob+0.132
 learn+0.131
Neg logit contributions
 injuries-0.227
joy-0.219
sed-0.212
awed-0.208
ishing-0.204
Token bl
Token ID698
Feature activation-3.44e-02

Pos logit contributions
 achieve+0.011
Queen+0.011
 understand+0.011
 learn+0.010
 priest+0.010
Neg logit contributions
 injuries-0.015
sed-0.014
joy-0.014
ksh-0.012
ted-0.011
Tokenub
Token ID549
Feature activation-1.12e-01

Pos logit contributions
 also+0.057
 ring+0.055
 slide+0.055
 rob+0.054
 pilot+0.053
Neg logit contributions
awed-0.055
 injuries-0.055
joy-0.055
sed-0.054
aj-0.053
Token,"
Token ID553
Feature activation+1.62e-02

Pos logit contributions
 also+0.151
 understand+0.146
Queen+0.146
 learn+0.141
 rob+0.139
Neg logit contributions
 injuries-0.220
joy-0.208
sed-0.207
awed-0.201
ishing-0.196
Token the
Token ID262
Feature activation-5.85e-03

Pos logit contributions
 injuries+0.024
joy+0.023
sed+0.019
awed+0.018
irmed+0.017
Neg logit contributions
 ring-0.030
 achieve-0.029
 understand-0.029
 guard-0.028
a-0.028
Token fish
Token ID5916
Feature activation-1.05e-02

Pos logit contributions
 understand+0.007
 achieve+0.006
 ring+0.006
 also+0.006
 learn+0.006
Neg logit contributions
 injuries-0.013
joy-0.012
sed-0.012
awed-0.011
irmed-0.011
Token replies
Token ID20616
Feature activation-1.38e-02

Pos logit contributions
Queen+0.017
 pilot+0.016
 achieve+0.016
 rob+0.016
 priest+0.016
Neg logit contributions
awed-0.013
irmed-0.013
 injuries-0.013
sed-0.013
led-0.013
Token.
Token ID13
Feature activation+1.19e-03

Pos logit contributions
 ring+0.016
 also+0.015
 sign+0.014
 guard+0.014
 understand+0.014
Neg logit contributions
 injuries-0.033
joy-0.030
awed-0.029
sed-0.028
irmed-0.028
Token.
Token ID13
Feature activation+2.57e-03

Pos logit contributions
 ring+0.002
 also+0.002
 achieve+0.002
 rob+0.002
 pilot+0.002
Neg logit contributions
 injuries-0.003
joy-0.003
awed-0.002
sed-0.002
irmed-0.002
Token
Token ID198
Feature activation-1.41e-02

Pos logit contributions
 injuries+0.006
joy+0.006
awed+0.005
sed+0.005
ksh+0.005
Neg logit contributions
 ring-0.003
 also-0.002
 guard-0.002
 sign-0.002
 pilot-0.002
Token
Token ID198
Feature activation-1.32e-02

Pos logit contributions
 ring+0.015
 pilot+0.013
 guard+0.012
 priest+0.012
 rob+0.012
Neg logit contributions
 injuries-0.035
joy-0.032
awed-0.030
aj-0.029
irmed-0.029
Token"
Token ID1
Feature activation+2.84e-03

Pos logit contributions
 ring+0.011
 guard+0.008
 understand+0.008
 pilot+0.008
 rob+0.008
Neg logit contributions
 injuries-0.035
joy-0.033
awed-0.031
irmed-0.031
ksh-0.030
TokenBl
Token ID3629
Feature activation-6.61e-03

Pos logit contributions
 injuries+0.005
joy+0.005
awed+0.004
ksh+0.004
irmed+0.004
Neg logit contributions
 ring-0.004
 guard-0.004
 learn-0.004
 also-0.004
 cross-0.004
Tokenub
Token ID549
Feature activation-1.10e-01

Pos logit contributions
 pilot+0.011
 also+0.011
 ring+0.011
 priest+0.010
 rob+0.010
Neg logit contributions
 injuries-0.012
joy-0.011
aj-0.011
anged-0.011
awed-0.011
Token,
Token ID11
Feature activation-9.14e-03

Pos logit contributions
Queen+0.149
 also+0.147
 understand+0.147
 rob+0.146
 priest+0.140
Neg logit contributions
 injuries-0.203
joy-0.201
awed-0.189
ishing-0.188
sed-0.185
Token bl
Token ID698
Feature activation-3.38e-02

Pos logit contributions
 understand+0.013
Queen+0.013
 achieve+0.013
 learn+0.012
 know+0.012
Neg logit contributions
 injuries-0.016
joy-0.015
sed-0.015
ksh-0.013
 thirst-0.013
Tokenub
Token ID549
Feature activation-1.09e-01

Pos logit contributions
 also+0.057
 ring+0.055
 rob+0.055
 slide+0.054
 pilot+0.054
Neg logit contributions
joy-0.053
awed-0.052
aj-0.052
sed-0.052
 injuries-0.052
Token,
Token ID11
Feature activation-9.00e-03

Pos logit contributions
 also+0.146
 understand+0.143
Queen+0.141
 rob+0.137
 learn+0.135
Neg logit contributions
 injuries-0.216
joy-0.209
sed-0.200
awed-0.195
ishing-0.194
Token bl
Token ID698
Feature activation-3.40e-02

Pos logit contributions
Queen+0.013
 achieve+0.012
 understand+0.012
 priest+0.012
 learn+0.012
Neg logit contributions
 injuries-0.016
joy-0.015
sed-0.015
ksh-0.013
ted-0.012
Token"
Token ID1
Feature activation+2.84e-03

Pos logit contributions
 ring+0.011
 guard+0.008
 understand+0.008
 pilot+0.008
 rob+0.008
Neg logit contributions
 injuries-0.035
joy-0.033
awed-0.031
irmed-0.031
ksh-0.030
TokenBl
Token ID3629
Feature activation-6.61e-03

Pos logit contributions
 injuries+0.005
joy+0.005
awed+0.004
ksh+0.004
irmed+0.004
Neg logit contributions
 ring-0.004
 guard-0.004
 learn-0.004
 also-0.004
 cross-0.004
Tokenub
Token ID549
Feature activation-1.10e-01

Pos logit contributions
 pilot+0.011
 also+0.011
 ring+0.011
 priest+0.010
 rob+0.010
Neg logit contributions
 injuries-0.012
joy-0.011
aj-0.011
anged-0.011
awed-0.011
Token,
Token ID11
Feature activation-9.14e-03

Pos logit contributions
Queen+0.149
 also+0.147
 understand+0.147
 rob+0.146
 priest+0.140
Neg logit contributions
 injuries-0.203
joy-0.201
awed-0.189
ishing-0.188
sed-0.185
Token bl
Token ID698
Feature activation-3.38e-02

Pos logit contributions
 understand+0.013
Queen+0.013
 achieve+0.013
 learn+0.012
 know+0.012
Neg logit contributions
 injuries-0.016
joy-0.015
sed-0.015
ksh-0.013
 thirst-0.013
Tokenub
Token ID549
Feature activation-1.09e-01

Pos logit contributions
 also+0.057
 ring+0.055
 rob+0.055
 slide+0.054
 pilot+0.054
Neg logit contributions
joy-0.053
awed-0.052
aj-0.052
sed-0.052
 injuries-0.052
Token,
Token ID11
Feature activation-9.00e-03

Pos logit contributions
 also+0.146
 understand+0.143
Queen+0.141
 rob+0.137
 learn+0.135
Neg logit contributions
 injuries-0.216
joy-0.209
sed-0.200
awed-0.195
ishing-0.194
Token bl
Token ID698
Feature activation-3.40e-02

Pos logit contributions
Queen+0.013
 achieve+0.012
 understand+0.012
 priest+0.012
 learn+0.012
Neg logit contributions
 injuries-0.016
joy-0.015
sed-0.015
ksh-0.013
ted-0.012
Tokenub
Token ID549
Feature activation-1.09e-01

Pos logit contributions
 also+0.056
 slide+0.054
 ring+0.054
 rob+0.054
 pilot+0.052
Neg logit contributions
joy-0.054
awed-0.054
aj-0.053
 injuries-0.053
sed-0.053
Token,"
Token ID553
Feature activation+1.49e-02

Pos logit contributions
 also+0.155
 understand+0.150
Queen+0.149
 learn+0.145
 rob+0.143
Neg logit contributions
 injuries-0.209
joy-0.199
sed-0.195
awed-0.189
ishing-0.187
Token they
Token ID484
Feature activation+8.13e-03

Pos logit contributions
 injuries+0.024
joy+0.024
sed+0.020
irmed+0.019
awed+0.019
Neg logit contributions
 ring-0.025
 achieve-0.024
 understand-0.024
 guard-0.024
aches-0.023
Token,
Token ID11
Feature activation-9.14e-03

Pos logit contributions
Queen+0.149
 also+0.147
 understand+0.147
 rob+0.146
 priest+0.140
Neg logit contributions
 injuries-0.203
joy-0.201
awed-0.189
ishing-0.188
sed-0.185
Token bl
Token ID698
Feature activation-3.38e-02

Pos logit contributions
 understand+0.013
Queen+0.013
 achieve+0.013
 learn+0.012
 know+0.012
Neg logit contributions
 injuries-0.016
joy-0.015
sed-0.015
ksh-0.013
 thirst-0.013
Tokenub
Token ID549
Feature activation-1.09e-01

Pos logit contributions
 also+0.057
 ring+0.055
 rob+0.055
 slide+0.054
 pilot+0.054
Neg logit contributions
joy-0.053
awed-0.052
aj-0.052
sed-0.052
 injuries-0.052
Token,
Token ID11
Feature activation-9.00e-03

Pos logit contributions
 also+0.146
 understand+0.143
Queen+0.141
 rob+0.137
 learn+0.135
Neg logit contributions
 injuries-0.216
joy-0.209
sed-0.200
awed-0.195
ishing-0.194
Token bl
Token ID698
Feature activation-3.40e-02

Pos logit contributions
Queen+0.013
 achieve+0.012
 understand+0.012
 priest+0.012
 learn+0.012
Neg logit contributions
 injuries-0.016
joy-0.015
sed-0.015
ksh-0.013
ted-0.012
Tokenub
Token ID549
Feature activation-1.09e-01

Pos logit contributions
 also+0.056
 slide+0.054
 ring+0.054
 rob+0.054
 pilot+0.052
Neg logit contributions
joy-0.054
awed-0.054
aj-0.053
 injuries-0.053
sed-0.053
Token,"
Token ID553
Feature activation+1.49e-02

Pos logit contributions
 also+0.155
 understand+0.150
Queen+0.149
 learn+0.145
 rob+0.143
Neg logit contributions
 injuries-0.209
joy-0.199
sed-0.195
awed-0.189
ishing-0.187
Token they
Token ID484
Feature activation+8.13e-03

Pos logit contributions
 injuries+0.024
joy+0.024
sed+0.020
irmed+0.019
awed+0.019
Neg logit contributions
 ring-0.025
 achieve-0.024
 understand-0.024
 guard-0.024
aches-0.023
Token say
Token ID910
Feature activation-3.44e-02

Pos logit contributions
 injuries+0.013
sed+0.013
irms+0.013
 breaths+0.013
irmed+0.013
Neg logit contributions
umber-0.010
 priest-0.010
 rob-0.010
Queen-0.010
 achieve-0.010
Token back
Token ID736
Feature activation-9.39e-03

Pos logit contributions
 ring+0.044
 also+0.043
 guard+0.040
 sign+0.040
 rob+0.040
Neg logit contributions
 injuries-0.076
joy-0.071
awed-0.064
sed-0.064
irmed-0.063
Token.
Token ID13
Feature activation+4.32e-04

Pos logit contributions
 ring+0.012
 also+0.012
 understand+0.011
 achieve+0.011
 sign+0.011
Neg logit contributions
 injuries-0.021
joy-0.019
awed-0.018
sed-0.017
irmed-0.017
Token."
Token ID526
Feature activation-3.39e-03

Pos logit contributions
 injuries+0.001
joy+0.000
awed+0.000
sed+0.000
irmed+0.000
Neg logit contributions
 ring-0.000
 also-0.000
 sign-0.000
 guard-0.000
 learn-0.000
Token
Token ID220
Feature activation+2.13e-02

Pos logit contributions
 ring+0.005
 sign+0.004
 pilot+0.004
 priest+0.004
 also+0.004
Neg logit contributions
 injuries-0.007
joy-0.007
awed-0.006
sed-0.006
aj-0.006
Token
Token ID198
Feature activation-1.42e-03

Pos logit contributions
 injuries+0.052
joy+0.050
awed+0.048
aj+0.045
sed+0.045
Neg logit contributions
 ring-0.023
 pilot-0.020
 guard-0.020
 sign-0.020
 priest-0.019
Token
Token ID198
Feature activation-1.65e-03

Pos logit contributions
 ring+0.002
 pilot+0.001
 sign+0.001
 priest+0.001
 guard+0.001
Neg logit contributions
 injuries-0.003
joy-0.003
awed-0.003
irmed-0.003
aj-0.003
TokenB
Token ID33
Feature activation+4.42e-02

Pos logit contributions
 ring+0.003
 look+0.002
 sign+0.002
 pilot+0.002
 priest+0.002
Neg logit contributions
 injuries-0.003
joy-0.003
awed-0.003
irmed-0.002
aj-0.002
Tokenub
Token ID549
Feature activation-1.09e-01

Pos logit contributions
 injuries+0.070
joy+0.062
awed+0.055
sed+0.054
irmed+0.053
Neg logit contributions
 ring-0.087
 also-0.085
 sign-0.081
 guard-0.081
 understand-0.080
Tokenbles
Token ID7689
Feature activation-6.45e-02

Pos logit contributions
 ring+0.177
 sign+0.174
 learn+0.170
 understand+0.170
 also+0.170
Neg logit contributions
 injuries-0.194
joy-0.184
awed-0.174
sed-0.168
ishing-0.161
Token learned
Token ID4499
Feature activation+1.59e-03

Pos logit contributions
 ring+0.099
 priest+0.095
Queen+0.094
 pilot+0.094
 guard+0.093
Neg logit contributions
 injuries-0.113
awed-0.106
joy-0.106
irmed-0.099
uddled-0.098
Token that
Token ID326
Feature activation+1.32e-03

Pos logit contributions
 injuries+0.004
joy+0.003
awed+0.003
sed+0.003
irmed+0.003
Neg logit contributions
 ring-0.002
 also-0.002
 priest-0.002
 pilot-0.002
 rob-0.002
Token it
Token ID340
Feature activation-1.38e-03

Pos logit contributions
 injuries+0.003
joy+0.003
sed+0.003
awed+0.003
irmed+0.003
Neg logit contributions
 ring-0.002
 also-0.001
 sign-0.001
 guard-0.001
 priest-0.001
Token's
Token ID338
Feature activation-1.60e-02

Pos logit contributions
 ring+0.001
 also+0.001
 sign+0.001
 priest+0.001
 guard+0.001
Neg logit contributions
 injuries-0.004
joy-0.003
awed-0.003
sed-0.003
irmed-0.003
Token drowned
Token ID30859
Feature activation+5.30e-02

Pos logit contributions
 injuries+0.024
awed+0.021
joy+0.021
sed+0.020
irmed+0.020
Neg logit contributions
 ring-0.018
 sign-0.017
 priest-0.017
 pilot-0.017
 achieve-0.017
Token.
Token ID13
Feature activation+3.31e-03

Pos logit contributions
 injuries+0.135
joy+0.127
awed+0.118
sed+0.117
irmed+0.116
Neg logit contributions
 ring-0.052
 also-0.050
 guard-0.045
 even-0.045
 understand-0.044
Token
Token ID198
Feature activation-6.28e-03

Pos logit contributions
 injuries+0.009
joy+0.009
awed+0.008
ishing+0.008
sed+0.008
Neg logit contributions
 also-0.002
 ring-0.002
 learn-0.002
 understand-0.002
 guard-0.002
Token
Token ID198
Feature activation-5.71e-03

Pos logit contributions
 ring+0.005
 also+0.004
 sign+0.004
 guard+0.004
 pilot+0.004
Neg logit contributions
 injuries-0.017
joy-0.016
awed-0.015
sed-0.015
irmed-0.015
TokenB
Token ID33
Feature activation+3.66e-02

Pos logit contributions
 ring+0.009
 priest+0.008
 sign+0.008
 also+0.008
 guard+0.008
Neg logit contributions
 injuries-0.012
joy-0.011
awed-0.010
irmed-0.010
sed-0.009
Tokenub
Token ID549
Feature activation-1.08e-01

Pos logit contributions
 injuries+0.068
joy+0.061
awed+0.055
sed+0.054
irmed+0.053
Neg logit contributions
 ring-0.062
 also-0.061
 sign-0.058
 guard-0.058
 understand-0.057
Tokenbles
Token ID7689
Feature activation-6.37e-02

Pos logit contributions
 ring+0.135
 also+0.128
 sign+0.128
 understand+0.126
 learn+0.124
Neg logit contributions
 injuries-0.240
joy-0.222
awed-0.211
sed-0.205
irmed-0.201
Token was
Token ID373
Feature activation+9.08e-03

Pos logit contributions
 ring+0.096
 name+0.092
 priest+0.091
 sign+0.089
 rob+0.088
Neg logit contributions
 injuries-0.125
joy-0.107
awed-0.106
irmed-0.103
 nowhere-0.100
Token very
Token ID845
Feature activation-1.02e-02

Pos logit contributions
 injuries+0.027
joy+0.025
awed+0.024
sed+0.023
ksh+0.023
Neg logit contributions
 also-0.006
 ring-0.005
 guard-0.004
 understand-0.004
 cross-0.004
Token sad
Token ID6507
Feature activation-1.64e-02

Pos logit contributions
 also+0.008
 ring+0.008
 guard+0.007
 understand+0.007
 sign+0.007
Neg logit contributions
 injuries-0.028
joy-0.026
awed-0.025
sed-0.024
ksh-0.024
Token.
Token ID13
Feature activation+7.73e-03

Pos logit contributions
 ring+0.009
 also+0.009
 sign+0.007
 understand+0.007
 guard+0.007
Neg logit contributions
 injuries-0.049
joy-0.046
awed-0.044
sed-0.043
irmed-0.043
Token She
Token ID1375
Feature activation-1.76e-02

Pos logit contributions
 injuries+0.014
joy+0.014
awed+0.013
sed+0.012
irmed+0.012
Neg logit contributions
 ring-0.007
 also-0.006
 sign-0.006
 priest-0.006
 pilot-0.006
Token started
Token ID2067
Feature activation-5.77e-03

Pos logit contributions
 ring+0.024
 sign+0.023
 priest+0.022
 achieve+0.022
 rob+0.022
Neg logit contributions
 injuries-0.037
joy-0.032
awed-0.032
irmed-0.031
sed-0.030
Token to
Token ID284
Feature activation+1.32e-02

Pos logit contributions
 ring+0.009
 also+0.009
 pilot+0.009
 guard+0.008
 priest+0.008
Neg logit contributions
 injuries-0.011
joy-0.010
awed-0.009
sed-0.009
irmed-0.009
Token "
Token ID366
Feature activation+1.43e-02

Pos logit contributions
 injuries+0.028
joy+0.024
irmed+0.023
 breaths+0.022
ishing+0.022
Neg logit contributions
Queen-0.017
 priest-0.017
 pilot-0.016
a-0.016
"-0.015
Tokencompl
Token ID23855
Feature activation+2.67e-02

Pos logit contributions
 injuries+0.029
joy+0.026
awed+0.024
irmed+0.024
ishing+0.023
Neg logit contributions
 ring-0.022
 also-0.021
 guard-0.020
 sign-0.020
 priest-0.020
Tokenain
Token ID391
Feature activation-1.08e-01

Pos logit contributions
 injuries+0.042
awed+0.039
joy+0.038
anged+0.037
ishing+0.037
Neg logit contributions
 ring-0.051
 priest-0.050
 guard-0.050
 sign-0.049
 also-0.049
Token"
Token ID1
Feature activation+6.17e-03

Pos logit contributions
 also+0.154
 priest+0.150
 understand+0.149
Queen+0.145
 pilot+0.142
Neg logit contributions
 injuries-0.202
ishing-0.176
joy-0.175
less-0.165
awed-0.164
Token that
Token ID326
Feature activation-6.33e-03

Pos logit contributions
 injuries+0.014
joy+0.012
awed+0.012
irmed+0.011
uddled+0.011
Neg logit contributions
 ring-0.008
 rob-0.007
 priest-0.007
 guard-0.007
 pilot-0.007
Token it
Token ID340
Feature activation-4.37e-03

Pos logit contributions
 priest+0.008
 pilot+0.008
Queen+0.008
 achieve+0.007
 guard+0.007
Neg logit contributions
 injuries-0.013
joy-0.011
sed-0.010
irmed-0.010
less-0.010
Token wasn
Token ID2492
Feature activation-1.32e-02

Pos logit contributions
 priest+0.005
 ring+0.005
 pilot+0.005
Queen+0.005
 guard+0.005
Neg logit contributions
 injuries-0.010
joy-0.009
awed-0.008
sed-0.008
irmed-0.008
Token't
Token ID470
Feature activation-2.99e-02

Pos logit contributions
 priest+0.019
 pilot+0.019
 ring+0.019
 guard+0.018
Queen+0.018
Neg logit contributions
 injuries-0.026
 nowhere-0.020
awed-0.020
sed-0.020
less-0.019
Token of
Token ID286
Feature activation-5.20e-03

Pos logit contributions
 injuries+0.005
joy+0.004
awed+0.004
irmed+0.004
sed+0.004
Neg logit contributions
 ring-0.002
 also-0.002
 rob-0.002
 priest-0.002
 sign-0.002
Token the
Token ID262
Feature activation-8.38e-03

Pos logit contributions
 ring+0.006
 also+0.005
 understand+0.005
 pilot+0.005
 guard+0.005
Neg logit contributions
 injuries-0.013
joy-0.012
awed-0.011
sed-0.010
irmed-0.010
Token cold
Token ID4692
Feature activation+2.76e-02

Pos logit contributions
 also+0.010
 understand+0.010
 ring+0.010
 know+0.010
 achieve+0.010
Neg logit contributions
 injuries-0.019
joy-0.017
sed-0.015
awed-0.015
irmed-0.015
Token por
Token ID16964
Feature activation-3.38e-02

Pos logit contributions
 injuries+0.063
joy+0.058
awed+0.053
sed+0.052
irmed+0.051
Neg logit contributions
 also-0.034
 ring-0.034
 understand-0.032
 learn-0.032
 achieve-0.032
Tokencel
Token ID5276
Feature activation+6.95e-02

Pos logit contributions
 also+0.075
 ring+0.074
 understand+0.072
 guard+0.071
 sign+0.071
Neg logit contributions
 injuries-0.046
joy-0.038
awed-0.034
sed-0.033
ksh-0.031
Tokenain
Token ID391
Feature activation-1.08e-01

Pos logit contributions
 injuries+0.136
awed+0.124
joy+0.122
ishing+0.115
irmed+0.112
Neg logit contributions
 ring-0.108
 also-0.103
 understand-0.102
 guard-0.101
 sign-0.100
Token bowl
Token ID9396
Feature activation-2.01e-03

Pos logit contributions
 understand+0.149
 know+0.148
 also+0.146
 ring+0.145
 learn+0.145
Neg logit contributions
 injuries-0.219
joy-0.191
awed-0.176
ishing-0.176
sed-0.174
Token!
Token ID0
Feature activation-3.83e-03

Pos logit contributions
 ring+0.002
Queen+0.002
 pilot+0.002
 priest+0.002
 rob+0.002
Neg logit contributions
 injuries-0.005
joy-0.004
awed-0.004
sed-0.004
irmed-0.004
Token
Token ID220
Feature activation+1.51e-02

Pos logit contributions
 ring+0.005
 pilot+0.004
 sign+0.004
 priest+0.004
 also+0.004
Neg logit contributions
 injuries-0.008
joy-0.008
awed-0.008
sed-0.007
irmed-0.007
Token
Token ID198
Feature activation-5.27e-03

Pos logit contributions
 injuries+0.035
joy+0.033
awed+0.032
aj+0.031
uddled+0.030
Neg logit contributions
 ring-0.019
 pilot-0.017
 guard-0.016
 priest-0.016
 sign-0.016
Token
Token ID198
Feature activation-4.26e-03

Pos logit contributions
 ring+0.008
 pilot+0.007
 sign+0.006
 priest+0.006
 crane+0.006
Neg logit contributions
 injuries-0.011
joy-0.010
awed-0.010
aj-0.009
irmed-0.009
Token and
Token ID290
Feature activation-6.08e-03

Pos logit contributions
 injuries+0.002
joy+0.001
sed+0.001
awed+0.001
ishing+0.001
Neg logit contributions
 understand-0.001
 also-0.001
 rob-0.001
Queen-0.001
 learn-0.001
Token become
Token ID1716
Feature activation+1.81e-03

Pos logit contributions
 ring+0.007
 also+0.007
 guard+0.007
 pilot+0.007
 sign+0.007
Neg logit contributions
 injuries-0.014
joy-0.012
sed-0.012
awed-0.011
irmed-0.011
Token smaller
Token ID4833
Feature activation-1.26e-02

Pos logit contributions
 injuries+0.004
joy+0.004
awed+0.003
irmed+0.003
sed+0.003
Neg logit contributions
 ring-0.002
 also-0.002
 slide-0.002
 guard-0.002
 sign-0.002
Token and
Token ID290
Feature activation-6.64e-03

Pos logit contributions
 ring+0.011
 also+0.011
 understand+0.010
 sign+0.010
 guard+0.010
Neg logit contributions
 injuries-0.033
joy-0.031
awed-0.029
sed-0.029
irmed-0.028
Token f
Token ID277
Feature activation-2.73e-02

Pos logit contributions
 ring+0.008
 also+0.008
 sign+0.007
 guard+0.007
 learn+0.007
Neg logit contributions
 injuries-0.015
joy-0.015
awed-0.014
sed-0.013
irmed-0.013
Tokenain
Token ID391
Feature activation-1.05e-01

Pos logit contributions
 ring+0.047
 also+0.046
 sign+0.045
 guard+0.044
 learn+0.044
Neg logit contributions
 injuries-0.049
joy-0.045
awed-0.042
irmed-0.040
sed-0.039
Tokenter
Token ID353
Feature activation-1.34e-02

Pos logit contributions
 also+0.145
 understand+0.144
 know+0.138
 ring+0.138
 achieve+0.138
Neg logit contributions
 injuries-0.216
joy-0.192
sed-0.178
awed-0.177
ishing-0.177
Token."
Token ID526
Feature activation-1.01e-02

Pos logit contributions
 also+0.017
 understand+0.017
 achieve+0.016
 ring+0.016
Queen+0.016
Neg logit contributions
 injuries-0.028
joy-0.024
awed-0.024
sed-0.024
ishing-0.023
Token
Token ID198
Feature activation-1.87e-02

Pos logit contributions
 ring+0.012
 pilot+0.011
 guard+0.011
 priest+0.011
 sign+0.011
Neg logit contributions
 injuries-0.022
joy-0.022
awed-0.021
sed-0.020
irmed-0.020
Token
Token ID198
Feature activation-1.79e-02

Pos logit contributions
 ring+0.018
 pilot+0.015
 priest+0.015
 guard+0.015
 also+0.015
Neg logit contributions
 injuries-0.048
joy-0.045
awed-0.043
sed-0.041
irmed-0.041
Token"
Token ID1
Feature activation-4.73e-04

Pos logit contributions
 ring+0.023
 pilot+0.020
 priest+0.020
 understand+0.020
 guard+0.019
Neg logit contributions
 injuries-0.041
joy-0.037
awed-0.036
irmed-0.035
aj-0.034
Token quickly
Token ID2952
Feature activation-2.69e-02

Pos logit contributions
 ring+0.046
 also+0.043
Queen+0.043
 pilot+0.043
 priest+0.042
Neg logit contributions
 injuries-0.082
joy-0.076
sed-0.067
awed-0.067
irmed-0.066
Token.
Token ID13
Feature activation+4.81e-03

Pos logit contributions
 ring+0.032
 also+0.032
Queen+0.030
 priest+0.030
 pilot+0.030
Neg logit contributions
 injuries-0.063
joy-0.057
sed-0.051
awed-0.051
irmed-0.050
Token
Token ID198
Feature activation-3.97e-03

Pos logit contributions
 injuries+0.011
joy+0.011
awed+0.010
sed+0.010
irmed+0.010
Neg logit contributions
 ring-0.006
 sign-0.005
 pilot-0.005
 priest-0.005
 also-0.005
Token
Token ID198
Feature activation-3.52e-03

Pos logit contributions
 ring+0.006
 pilot+0.005
 priest+0.004
 sign+0.004
 bell+0.004
Neg logit contributions
 injuries-0.009
joy-0.008
awed-0.008
 nowhere-0.007
irmed-0.007
TokenB
Token ID33
Feature activation+4.07e-02

Pos logit contributions
 ring+0.005
 look+0.005
 sign+0.004
 bell+0.004
 pilot+0.004
Neg logit contributions
 injuries-0.007
joy-0.007
 nowhere-0.006
irmed-0.006
awed-0.006
Tokenub
Token ID549
Feature activation-1.04e-01

Pos logit contributions
 injuries+0.070
joy+0.063
awed+0.056
sed+0.055
irmed+0.054
Neg logit contributions
 ring-0.075
 also-0.073
 sign-0.070
 guard-0.069
 learn-0.069
Tokenbles
Token ID7689
Feature activation-6.12e-02

Pos logit contributions
 ring+0.143
 sign+0.141
 understand+0.138
 learn+0.138
 rob+0.135
Neg logit contributions
 injuries-0.206
joy-0.195
awed-0.185
sed-0.177
ishing-0.176
Token wanted
Token ID2227
Feature activation-2.47e-03

Pos logit contributions
 ring+0.090
 priest+0.086
 name+0.086
Queen+0.084
 sign+0.082
Neg logit contributions
 injuries-0.117
joy-0.104
awed-0.104
irmed-0.101
uddled-0.098
Token to
Token ID284
Feature activation+1.64e-02

Pos logit contributions
 ring+0.004
 also+0.003
 sign+0.003
 guard+0.003
 pilot+0.003
Neg logit contributions
 injuries-0.005
joy-0.005
awed-0.004
sed-0.004
irmed-0.004
Token perform
Token ID1620
Feature activation-5.29e-02

Pos logit contributions
 injuries+0.034
 thirst+0.026
sed+0.025
 bruises+0.023
irmed+0.022
Neg logit contributions
a-0.022
 priest-0.021
Queen-0.020
ele-0.020
 pastor-0.019
Token a
Token ID257
Feature activation-3.70e-04

Pos logit contributions
 rob+0.074
 ring+0.072
 priest+0.071
 also+0.071
Queen+0.071
Neg logit contributions
 injuries-0.110
joy-0.100
sed-0.088
irmed-0.086
ishing-0.085
Token away
Token ID1497
Feature activation-3.52e-02

Pos logit contributions
 rob+0.000
Queen+0.000
 ring+0.000
 priest+0.000
 pilot+0.000
Neg logit contributions
 injuries-0.000
joy-0.000
awed-0.000
irmed-0.000
ishing-0.000
Token.
Token ID13
Feature activation+5.53e-03

Pos logit contributions
 ring+0.044
 also+0.042
 pilot+0.041
Queen+0.041
 priest+0.040
Neg logit contributions
 injuries-0.080
joy-0.074
awed-0.066
sed-0.066
irmed-0.064
Token
Token ID198
Feature activation-4.52e-03

Pos logit contributions
 injuries+0.014
joy+0.013
awed+0.012
sed+0.012
irmed+0.012
Neg logit contributions
 ring-0.006
 also-0.005
 sign-0.005
 guard-0.005
 understand-0.005
Token
Token ID198
Feature activation-3.67e-03

Pos logit contributions
 ring+0.005
 priest+0.005
 pilot+0.005
 sign+0.005
 bell+0.004
Neg logit contributions
 injuries-0.010
joy-0.010
awed-0.009
irmed-0.009
 nowhere-0.009
TokenB
Token ID33
Feature activation+4.14e-02

Pos logit contributions
 ring+0.007
 priest+0.007
 sign+0.007
 pilot+0.007
 bell+0.007
Neg logit contributions
 injuries-0.005
joy-0.005
irmed-0.004
awed-0.004
uddled-0.004
Tokenub
Token ID549
Feature activation-1.03e-01

Pos logit contributions
 injuries+0.096
joy+0.089
awed+0.083
sed+0.081
irmed+0.080
Neg logit contributions
 ring-0.051
 also-0.049
 sign-0.046
 priest-0.045
 understand-0.045
Tokenbles
Token ID7689
Feature activation-6.09e-02

Pos logit contributions
 ring+0.131
 sign+0.126
 understand+0.125
 also+0.124
 learn+0.123
Neg logit contributions
 injuries-0.222
joy-0.208
awed-0.197
sed-0.192
ishing-0.187
Token warned
Token ID7728
Feature activation+1.85e-02

Pos logit contributions
 ring+0.087
 priest+0.085
 name+0.084
Queen+0.083
 sign+0.080
Neg logit contributions
 injuries-0.116
awed-0.104
joy-0.103
irmed-0.101
uddled-0.099
Token his
Token ID465
Feature activation+2.52e-02

Pos logit contributions
 injuries+0.046
joy+0.042
awed+0.039
sed+0.039
irmed+0.038
Neg logit contributions
 ring-0.020
 also-0.020
 sign-0.018
 guard-0.018
 understand-0.018
Token friends
Token ID2460
Feature activation-1.65e-02

Pos logit contributions
 injuries+0.055
joy+0.049
sed+0.046
irmed+0.042
anged+0.042
Neg logit contributions
 ring-0.033
Queen-0.033
 achieve-0.033
 also-0.032
 slide-0.031
Token about
Token ID546
Feature activation+4.12e-02

Pos logit contributions
Queen+0.023
 ring+0.022
 priest+0.022
 rob+0.021
 pilot+0.021
Neg logit contributions
 injuries-0.035
joy-0.031
sed-0.027
irmed-0.027
anged-0.026
Token you
Token ID345
Feature activation-4.77e-03

Pos logit contributions
 ring+0.040
 also+0.039
 understand+0.038
 guard+0.037
 sign+0.036
Neg logit contributions
 injuries-0.073
joy-0.067
awed-0.061
sed-0.060
irmed-0.057
Token!"
Token ID2474
Feature activation-1.32e-02

Pos logit contributions
 rob+0.006
 priest+0.006
 ring+0.006
Queen+0.006
 pilot+0.006
Neg logit contributions
 injuries-0.010
joy-0.009
sed-0.008
ksh-0.008
aj-0.008
Token
Token ID198
Feature activation-1.01e-02

Pos logit contributions
 ring+0.017
 also+0.016
 pilot+0.016
 priest+0.016
 guard+0.016
Neg logit contributions
joy-0.028
 injuries-0.028
awed-0.026
sed-0.024
irmed-0.024
Token
Token ID198
Feature activation-8.62e-03

Pos logit contributions
 ring+0.013
 pilot+0.012
 sign+0.011
 guard+0.011
 priest+0.011
Neg logit contributions
 injuries-0.022
joy-0.021
awed-0.020
uddled-0.019
aj-0.018
TokenB
Token ID33
Feature activation+3.98e-02

Pos logit contributions
 ring+0.013
 look+0.012
 name+0.012
 weigh+0.011
 go+0.011
Neg logit contributions
 injuries-0.017
awed-0.015
joy-0.015
uddled-0.014
 nowhere-0.013
Tokenub
Token ID549
Feature activation-1.03e-01

Pos logit contributions
 injuries+0.061
joy+0.057
awed+0.052
sed+0.049
irmed+0.049
Neg logit contributions
 ring-0.080
 also-0.078
 sign-0.074
 priest-0.074
 pilot-0.073
Tokenbles
Token ID7689
Feature activation-6.23e-02

Pos logit contributions
 understand+0.152
 learn+0.149
 sign+0.148
 ring+0.148
 know+0.143
Neg logit contributions
 injuries-0.184
joy-0.180
awed-0.176
sed-0.168
uddled-0.166
Token replied
Token ID8712
Feature activation+1.56e-02

Pos logit contributions
Queen+0.093
 priest+0.093
 ring+0.092
 pilot+0.090
 achieve+0.089
Neg logit contributions
awed-0.098
uddled-0.095
ashed-0.094
joy-0.093
 injuries-0.092
Token,
Token ID11
Feature activation-3.52e-03

Pos logit contributions
joy+0.034
 injuries+0.034
sed+0.030
 breaths+0.029
awed+0.029
Neg logit contributions
 also-0.018
 ring-0.017
Queen-0.017
 achieve-0.017
 priest-0.017
Token "
Token ID366
Feature activation+1.39e-02

Pos logit contributions
 achieve+0.006
aches+0.006
 know+0.006
 ring+0.006
 learn+0.006
Neg logit contributions
joy-0.006
 injuries-0.005
sed-0.005
awed-0.004
ksh-0.004
TokenThanks
Token ID9690
Feature activation+1.41e-04

Pos logit contributions
 injuries+0.026
joy+0.026
awed+0.024
ksh+0.023
irmed+0.023
Neg logit contributions
 ring-0.021
 sign-0.019
 cross-0.019
 also-0.019
 learn-0.019
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
 ring+0.009
 also+0.008
 understand+0.008
 guard+0.008
 pilot+0.008
Neg logit contributions
 injuries-0.011
joy-0.010
awed-0.009
sed-0.009
irmed-0.009
Token
Token ID220
Feature activation+1.75e-02

Pos logit contributions
 injuries+0.029
joy+0.027
awed+0.025
sed+0.025
irmed+0.025
Neg logit contributions
 ring-0.009
 also-0.009
 sign-0.008
 guard-0.008
 understand-0.008
Token
Token ID198
Feature activation-3.28e-03

Pos logit contributions
 injuries+0.041
joy+0.038
awed+0.038
sed+0.035
irmed+0.035
Neg logit contributions
 ring-0.020
 priest-0.018
 pilot-0.018
 sign-0.017
 guard-0.017
Token
Token ID198
Feature activation-2.42e-03

Pos logit contributions
 ring+0.003
 priest+0.003
 pilot+0.003
 sign+0.003
 guard+0.003
Neg logit contributions
 injuries-0.008
joy-0.008
awed-0.007
irmed-0.007
sed-0.007
TokenB
Token ID33
Feature activation+4.26e-02

Pos logit contributions
 ring+0.005
 priest+0.005
 also+0.005
 sign+0.005
 guard+0.005
Neg logit contributions
 injuries-0.003
joy-0.003
awed-0.003
irmed-0.002
sed-0.002
Tokenub
Token ID549
Feature activation-1.01e-01

Pos logit contributions
 injuries+0.099
joy+0.094
awed+0.087
sed+0.085
irmed+0.085
Neg logit contributions
 ring-0.051
 also-0.049
 sign-0.047
 learn-0.045
 understand-0.045
Tokenba
Token ID7012
Feature activation-6.26e-02

Pos logit contributions
 ring+0.125
 sign+0.122
 understand+0.122
 learn+0.121
 also+0.120
Neg logit contributions
 injuries-0.220
joy-0.207
awed-0.199
sed-0.190
ishing-0.187
Token and
Token ID290
Feature activation+6.70e-03

Pos logit contributions
 priest+0.100
 ring+0.098
 name+0.091
 sign+0.089
 pilot+0.088
Neg logit contributions
awed-0.112
 injuries-0.099
uddled-0.098
ounced-0.097
anged-0.096
Token his
Token ID465
Feature activation+2.66e-02

Pos logit contributions
 injuries+0.010
joy+0.009
awed+0.008
sed+0.008
irmed+0.008
Neg logit contributions
 ring-0.013
 also-0.013
 priest-0.013
 sign-0.012
 achieve-0.012
Token other
Token ID584
Feature activation-2.42e-03

Pos logit contributions
 injuries+0.063
joy+0.058
awed+0.054
irmed+0.052
sed+0.052
Neg logit contributions
 ring-0.032
 also-0.031
 guard-0.029
 sign-0.029
 understand-0.028
Token friends
Token ID2460
Feature activation-1.88e-02

Pos logit contributions
 ring+0.003
 also+0.003
 sign+0.003
 achieve+0.003
 priest+0.003
Neg logit contributions
 injuries-0.005
joy-0.005
sed-0.004
awed-0.004
irmed-0.004
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
 injuries+0.008
joy+0.007
awed+0.007
sed+0.007
irmed+0.007
Neg logit contributions
 ring-0.004
 also-0.004
 priest-0.004
 guard-0.004
 rob-0.004
Token
Token ID220
Feature activation+2.01e-02

Pos logit contributions
 injuries+0.029
joy+0.027
awed+0.026
sed+0.026
irmed+0.025
Neg logit contributions
 ring-0.012
 also-0.012
 priest-0.011
 understand-0.011
 sign-0.011
Token\n
Token ID198
Feature activation-1.44e-03

Pos logit contributions
 injuries+0.047
joy+0.045
awed+0.043
irmed+0.040
sed+0.040
Neg logit contributions
 ring-0.023
 pilot-0.020
 priest-0.020
 rob-0.019
 sign-0.019
Token\n
Token ID198
Feature activation-6.51e-04

Pos logit contributions
 ring+0.002
 priest+0.002
 pilot+0.002
 rob+0.002
 bell+0.001
Neg logit contributions
 injuries-0.003
joy-0.003
awed-0.003
irmed-0.003
 nowhere-0.003
TokenB
Token ID33
Feature activation+4.36e-02

Pos logit contributions
 ring+0.001
 priest+0.001
 understand+0.001
 also+0.001
 slide+0.001
Neg logit contributions
 injuries-0.001
joy-0.001
awed-0.001
irmed-0.001
uddled-0.001
Tokenub
Token ID549
Feature activation-1.01e-01

Pos logit contributions
 injuries+0.100
joy+0.094
awed+0.088
sed+0.086
irmed+0.085
Neg logit contributions
 ring-0.054
 also-0.052
 sign-0.049
 priest-0.048
 understand-0.048
Tokenbles
Token ID7689
Feature activation-6.08e-02

Pos logit contributions
 ring+0.125
 understand+0.119
 sign+0.119
 also+0.118
 learn+0.116
Neg logit contributions
 injuries-0.220
joy-0.207
awed-0.197
sed-0.191
ishing-0.186
Token asked
Token ID1965
Feature activation+2.74e-03

Pos logit contributions
 ring+0.091
 priest+0.090
 name+0.090
Queen+0.089
 achieve+0.085
Neg logit contributions
awed-0.098
 injuries-0.097
joy-0.093
anged-0.090
irmed-0.090
Token his
Token ID465
Feature activation+2.84e-02

Pos logit contributions
 injuries+0.007
joy+0.007
awed+0.006
sed+0.006
irmed+0.006
Neg logit contributions
 ring-0.003
 also-0.003
 sign-0.002
 understand-0.002
 priest-0.002
Token friend
Token ID1545
Feature activation-1.54e-02

Pos logit contributions
 injuries+0.057
joy+0.053
sed+0.051
aj+0.044
anged+0.043
Neg logit contributions
 achieve-0.040
 ring-0.039
 also-0.038
Queen-0.038
 understand-0.037
Token,
Token ID11
Feature activation+1.67e-03

Pos logit contributions
 ring+0.013
 also+0.012
 achieve+0.012
 sign+0.012
 priest+0.011
Neg logit contributions
 injuries-0.040
joy-0.038
sed-0.035
awed-0.035
anged-0.034
Token happy
Token ID3772
Feature activation+8.07e-03

Pos logit contributions
 ring+0.013
 also+0.011
 sign+0.011
 pilot+0.011
 guard+0.011
Neg logit contributions
 injuries-0.029
joy-0.027
awed-0.026
irmed-0.025
sed-0.025
Token!
Token ID0
Feature activation+5.64e-04

Pos logit contributions
 injuries+0.021
joy+0.020
awed+0.019
sed+0.019
irmed+0.018
Neg logit contributions
 ring-0.007
 also-0.006
 understand-0.006
 priest-0.006
 sign-0.006
Token\n
Token ID198
Feature activation-4.22e-03

Pos logit contributions
joy+0.001
 injuries+0.001
awed+0.001
sed+0.001
irmed+0.001
Neg logit contributions
 ring-0.001
 priest-0.001
 pilot-0.001
 sign-0.001
 guard-0.001
Token\n
Token ID198
Feature activation-3.82e-03

Pos logit contributions
 ring+0.006
 pilot+0.006
 priest+0.005
 bell+0.005
 guard+0.005
Neg logit contributions
 injuries-0.009
joy-0.008
awed-0.008
 nowhere-0.007
uddled-0.007
TokenB
Token ID33
Feature activation+3.97e-02

Pos logit contributions
 ring+0.006
 look+0.005
 name+0.005
 bell+0.005
 guard+0.005
Neg logit contributions
 injuries-0.008
joy-0.007
awed-0.007
 nowhere-0.006
irmed-0.006
Tokenub
Token ID549
Feature activation-1.01e-01

Pos logit contributions
 injuries+0.067
joy+0.064
awed+0.059
sed+0.056
irmed+0.056
Neg logit contributions
 ring-0.073
 also-0.070
 priest-0.068
 sign-0.067
 pilot-0.066
Tokenba
Token ID7012
Feature activation-6.24e-02

Pos logit contributions
 understand+0.142
 ring+0.140
 sign+0.139
 learn+0.139
 know+0.135
Neg logit contributions
 injuries-0.186
joy-0.186
awed-0.179
uddled-0.168
ishing-0.167
Token and
Token ID290
Feature activation+1.02e-02

Pos logit contributions
 ring+0.086
 priest+0.083
 switch+0.083
 sign+0.080
Queen+0.078
Neg logit contributions
awed-0.117
joy-0.111
 injuries-0.109
uddled-0.106
led-0.104
Token his
Token ID465
Feature activation+2.53e-02

Pos logit contributions
joy+0.020
 injuries+0.018
sed+0.017
anged+0.016
irmed+0.016
Neg logit contributions
 achieve-0.014
 ring-0.014
 priest-0.013
 sign-0.013
 also-0.013
Token squirrel
Token ID33039
Feature activation+1.70e-02

Pos logit contributions
 injuries+0.060
joy+0.055
awed+0.052
irmed+0.050
sed+0.050
Neg logit contributions
 ring-0.030
 also-0.029
 sign-0.027
 guard-0.027
 understand-0.026
Token friend
Token ID1545
Feature activation-1.93e-02

Pos logit contributions
 injuries+0.033
joy+0.030
awed+0.029
sed+0.028
irmed+0.028
Neg logit contributions
 ring-0.025
 priest-0.024
 sign-0.024
 also-0.024
 achieve-0.023
Token strange
Token ID6283
Feature activation-1.66e-03

Pos logit contributions
 ring+0.015
 priest+0.015
 pilot+0.014
 understand+0.014
Queen+0.014
Neg logit contributions
 injuries-0.029
joy-0.025
sed-0.024
 nowhere-0.023
ksh-0.023
Token.
Token ID13
Feature activation+2.99e-03

Pos logit contributions
Queen+0.002
 also+0.002
 achieve+0.002
 ring+0.002
Pear+0.002
Neg logit contributions
 injuries-0.003
joy-0.003
aj-0.003
ishing-0.003
 breaths-0.003
Token I
Token ID314
Feature activation+1.50e-02

Pos logit contributions
 injuries+0.005
joy+0.005
sed+0.004
ksh+0.004
irmed+0.004
Neg logit contributions
 ring-0.005
 priest-0.005
aches-0.005
 sign-0.005
 also-0.005
Token have
Token ID423
Feature activation+1.70e-02

Pos logit contributions
 injuries+0.031
irmed+0.029
joy+0.029
ashed+0.027
sed+0.027
Neg logit contributions
 pilot-0.017
 priest-0.016
Queen-0.016
 ring-0.016
 bell-0.016
Token never
Token ID1239
Feature activation+2.85e-03

Pos logit contributions
 injuries+0.038
joy+0.035
awed+0.032
sed+0.032
irmed+0.031
Neg logit contributions
 ring-0.022
 also-0.021
 guard-0.020
 sign-0.020
 understand-0.019
Token seen
Token ID1775
Feature activation+7.84e-02

Pos logit contributions
 injuries+0.004
joy+0.003
awed+0.003
sed+0.003
irmed+0.003
Neg logit contributions
 ring-0.006
 also-0.006
 understand-0.006
 learn-0.006
 know-0.006
Token a
Token ID257
Feature activation+3.71e-04

Pos logit contributions
 injuries+0.172
joy+0.162
awed+0.144
sed+0.140
ishing+0.139
Neg logit contributions
 ring-0.099
 also-0.096
 understand-0.093
 sign-0.093
 learn-0.092
Token boat
Token ID8848
Feature activation-1.52e-02

Pos logit contributions
 injuries+0.001
joy+0.001
sed+0.001
ishing+0.001
ksh+0.001
Neg logit contributions
 understand-0.000
 achieve-0.000
 learn-0.000
 also-0.000
 know-0.000
Token like
Token ID588
Feature activation-2.91e-02

Pos logit contributions
 understand+0.029
 rob+0.028
 know+0.028
Queen+0.028
 also+0.028
Neg logit contributions
 injuries-0.022
joy-0.019
less-0.016
ider-0.015
ishing-0.015
Token that
Token ID326
Feature activation+1.04e-03

Pos logit contributions
 ring+0.035
 understand+0.033
 also+0.033
 learn+0.031
 achieve+0.031
Neg logit contributions
 injuries-0.066
joy-0.062
awed-0.056
sed-0.055
 nowhere-0.053
Token before
Token ID878
Feature activation+5.69e-04

Pos logit contributions
 injuries+0.002
joy+0.002
ksh+0.002
sed+0.002
 nowhere+0.002
Neg logit contributions
 understand-0.001
 achieve-0.001
 also-0.001
 ring-0.001
Queen-0.001
Token furry
Token ID40857
Feature activation+4.46e-02

Pos logit contributions
 injuries+0.003
joy+0.002
awed+0.002
anged+0.002
ishing+0.002
Neg logit contributions
 also-0.001
 ring-0.001
 sign-0.001
 guard-0.001
 slide-0.001
Token.
Token ID13
Feature activation+8.86e-03

Pos logit contributions
 injuries+0.126
joy+0.111
aj+0.106
ksh+0.105
irmed+0.105
Neg logit contributions
 ring-0.037
 guard-0.035
 and-0.034
 also-0.034
 sign-0.033
Token The
Token ID383
Feature activation+1.66e-03

Pos logit contributions
 injuries+0.021
joy+0.020
awed+0.019
sed+0.019
irmed+0.018
Neg logit contributions
 ring-0.010
 also-0.009
 sign-0.009
 priest-0.008
 understand-0.008
Token dog
Token ID3290
Feature activation+2.10e-02

Pos logit contributions
 injuries+0.005
joy+0.004
awed+0.004
ishing+0.004
irmed+0.004
Neg logit contributions
 ring-0.001
 also-0.001
 sign-0.001
 guard-0.001
 slide-0.001
Token w
Token ID266
Feature activation+1.34e-02

Pos logit contributions
 injuries+0.043
awed+0.040
joy+0.039
sed+0.039
irmed+0.037
Neg logit contributions
 ring-0.028
 priest-0.027
 achieve-0.027
 understand-0.026
 learn-0.026
Tokenagged
Token ID14655
Feature activation+1.15e-01

Pos logit contributions
 injuries+0.013
joy+0.011
ksh+0.009
sed+0.007
aj+0.006
Neg logit contributions
 also-0.033
 ring-0.033
 cross-0.033
 understand-0.033
 sign-0.032
Token its
Token ID663
Feature activation+3.40e-02

Pos logit contributions
 injuries+0.255
joy+0.246
awed+0.231
sed+0.229
irmed+0.224
Neg logit contributions
 ring-0.135
 achieve-0.135
 priest-0.132
 also-0.128
Queen-0.126
Token tail
Token ID7894
Feature activation+3.05e-02

Pos logit contributions
 injuries+0.066
joy+0.060
awed+0.055
irmed+0.053
ksh+0.052
Neg logit contributions
 ring-0.055
 also-0.053
 guard-0.052
 sign-0.052
 cross-0.050
Token and
Token ID290
Feature activation+2.72e-04

Pos logit contributions
 injuries+0.092
joy+0.087
awed+0.082
sed+0.081
irmed+0.080
Neg logit contributions
 ring-0.015
 also-0.014
 sign-0.012
 guard-0.011
 understand-0.011
Token the
Token ID262
Feature activation-7.56e-03

Pos logit contributions
 injuries+0.001
joy+0.001
awed+0.001
ishing+0.001
ksh+0.001
Neg logit contributions
 ring-0.000
 also-0.000
 guard-0.000
 sign-0.000
 slide-0.000
Token boy
Token ID2933
Feature activation-7.94e-03

Pos logit contributions
 ring+0.006
 also+0.006
 sign+0.005
 guard+0.005
 slide+0.005
Neg logit contributions
 injuries-0.021
joy-0.019
awed-0.018
sed-0.018
irmed-0.018
Token toys
Token ID14958
Feature activation-1.18e-02

Pos logit contributions
 injuries+0.017
joy+0.016
awed+0.014
sed+0.014
irmed+0.014
Neg logit contributions
 ring-0.012
 also-0.011
 sign-0.011
 guard-0.011
 understand-0.011
Token and
Token ID290
Feature activation+4.47e-03

Pos logit contributions
 rob+0.014
 pilot+0.013
 ring+0.013
Queen+0.013
 priest+0.013
Neg logit contributions
 injuries-0.028
joy-0.024
ishing-0.023
 nowhere-0.022
awed-0.022
Token for
Token ID329
Feature activation+8.45e-03

Pos logit contributions
 injuries+0.010
joy+0.009
sed+0.009
irmed+0.008
awed+0.008
Neg logit contributions
 ring-0.005
Queen-0.005
 pilot-0.005
 rob-0.005
 guard-0.005
Token a
Token ID257
Feature activation-2.48e-03

Pos logit contributions
 injuries+0.021
joy+0.021
awed+0.020
sed+0.020
irmed+0.019
Neg logit contributions
 ring-0.008
 also-0.008
 learn-0.007
 sign-0.007
 understand-0.007
Token few
Token ID1178
Feature activation+5.44e-03

Pos logit contributions
 ring+0.003
 also+0.003
 guard+0.003
 sign+0.003
 understand+0.003
Neg logit contributions
 injuries-0.005
joy-0.005
awed-0.004
sed-0.004
irmed-0.004
Token minutes
Token ID2431
Feature activation+9.82e-02

Pos logit contributions
 injuries+0.006
sed+0.006
joy+0.006
ksh+0.006
awed+0.006
Neg logit contributions
 also-0.012
 even-0.012
 ring-0.012
 sign-0.012
 cross-0.012
Token she
Token ID673
Feature activation+1.79e-02

Pos logit contributions
 injuries+0.251
joy+0.231
sed+0.224
awed+0.221
irmed+0.218
Neg logit contributions
 also-0.096
 ring-0.091
 go-0.086
 learn-0.083
 even-0.083
Token did
Token ID750
Feature activation-7.11e-03

Pos logit contributions
 injuries+0.039
joy+0.035
awed+0.033
sed+0.032
irmed+0.032
Neg logit contributions
 ring-0.024
 sign-0.022
 rob-0.022
 also-0.022
 learn-0.022
Token,
Token ID11
Feature activation-5.91e-03

Pos logit contributions
 pilot+0.009
 rob+0.009
 ring+0.009
Queen+0.009
 priest+0.009
Neg logit contributions
 injuries-0.016
joy-0.014
awed-0.012
irmed-0.012
ishing-0.012
Token but
Token ID475
Feature activation+1.56e-02

Pos logit contributions
Queen+0.008
 ring+0.007
 priest+0.007
 pilot+0.007
 guard+0.007
Neg logit contributions
 injuries-0.013
joy-0.012
sed-0.011
awed-0.010
 nowhere-0.010
Token then
Token ID788
Feature activation+2.85e-02

Pos logit contributions
 injuries+0.032
joy+0.029
sed+0.026
awed+0.026
irmed+0.026
Neg logit contributions
 ring-0.021
 sign-0.021
 achieve-0.021
 understand-0.021
 learn-0.021
Token the
Token ID262
Feature activation-5.02e-03

Pos logit contributions
 injuries+0.041
joy+0.039
sed+0.033
awed+0.032
 nowhere+0.032
Neg logit contributions
 ring-0.017
a-0.017
 also-0.017
 priest-0.016
 pilot-0.016
Token dog
Token ID3290
Feature activation+2.19e-02

Pos logit contributions
 ring+0.005
 also+0.005
 sign+0.005
 guard+0.005
 slide+0.005
Neg logit contributions
 injuries-0.013
joy-0.012
awed-0.011
sed-0.011
irmed-0.011
Token.
Token ID13
Feature activation+2.32e-03

Pos logit contributions
 injuries+0.052
joy+0.048
awed+0.044
sed+0.043
irmed+0.042
Neg logit contributions
 ring-0.024
 also-0.024
 priest-0.023
 rob-0.022
 pilot-0.022
Token It
Token ID632
Feature activation+1.89e-02

Pos logit contributions
 injuries+0.005
joy+0.005
awed+0.004
sed+0.004
irmed+0.004
Neg logit contributions
 ring-0.004
 also-0.003
 pilot-0.003
 sign-0.003
 priest-0.003
Token w
Token ID266
Feature activation+1.57e-02

Pos logit contributions
 injuries+0.043
joy+0.040
awed+0.039
sed+0.038
irmed+0.037
Neg logit contributions
 ring-0.022
 priest-0.020
 learn-0.020
 achieve-0.020
 sign-0.020
Tokenagged
Token ID14655
Feature activation+1.16e-01

Pos logit contributions
 injuries+0.016
joy+0.014
ksh+0.011
sed+0.010
aj+0.009
Neg logit contributions
 also-0.037
 understand-0.037
 ring-0.037
 cross-0.037
 sign-0.036
Token its
Token ID663
Feature activation+3.71e-02

Pos logit contributions
joy+0.212
 injuries+0.212
sed+0.188
awed+0.187
irmed+0.180
Neg logit contributions
 achieve-0.176
Queen-0.172
 priest-0.170
 ring-0.166
 learn-0.162
Token tail
Token ID7894
Feature activation+3.14e-02

Pos logit contributions
 injuries+0.070
joy+0.063
awed+0.058
irmed+0.055
sed+0.055
Neg logit contributions
 ring-0.062
 also-0.060
 guard-0.057
 sign-0.057
 understand-0.057
Token and
Token ID290
Feature activation+3.25e-03

Pos logit contributions
 injuries+0.087
joy+0.083
awed+0.077
sed+0.076
irmed+0.075
Neg logit contributions
 ring-0.022
 priest-0.020
 also-0.020
 achieve-0.019
 learn-0.019
Token l
Token ID300
Feature activation+2.74e-02

Pos logit contributions
 injuries+0.007
joy+0.007
awed+0.006
sed+0.006
irmed+0.006
Neg logit contributions
 ring-0.004
 also-0.004
 sign-0.004
 learn-0.004
 understand-0.004
Tokenicked
Token ID9484
Feature activation+3.32e-02

Pos logit contributions
 injuries+0.043
joy+0.037
ksh+0.035
aj+0.033
ishing+0.032
Neg logit contributions
 also-0.053
 understand-0.051
 ring-0.051
a-0.050
 cross-0.049
Token Joe
Token ID5689
Feature activation-2.93e-02

Pos logit contributions
 injuries+0.134
joy+0.121
awed+0.116
irmed+0.110
ishing+0.109
Neg logit contributions
 also-0.091
 ring-0.086
 slide-0.085
 guard-0.084
 go-0.083
Token.
Token ID13
Feature activation+8.60e-03

Pos logit contributions
 ring+0.041
 sign+0.039
 priest+0.038
 also+0.038
Queen+0.037
Neg logit contributions
 injuries-0.060
joy-0.055
awed-0.051
ishing-0.049
sed-0.048
Token He
Token ID679
Feature activation-1.07e-02

Pos logit contributions
 injuries+0.018
joy+0.017
awed+0.015
sed+0.015
irmed+0.015
Neg logit contributions
 ring-0.012
 also-0.012
 pilot-0.011
 sign-0.011
 guard-0.011
Token was
Token ID373
Feature activation+6.12e-03

Pos logit contributions
 ring+0.013
 priest+0.012
 sign+0.012
 understand+0.012
 also+0.012
Neg logit contributions
 injuries-0.024
joy-0.022
awed-0.021
sed-0.021
irmed-0.020
Token three
Token ID1115
Feature activation-1.32e-02

Pos logit contributions
 injuries+0.015
joy+0.014
sed+0.013
irmed+0.013
awed+0.013
Neg logit contributions
 ring-0.006
 achieve-0.005
Queen-0.005
 priest-0.005
 understand-0.005
Token years
Token ID812
Feature activation+7.63e-02

Pos logit contributions
 ring+0.027
 achieve+0.025
 also+0.025
 understand+0.025
 rob+0.025
Neg logit contributions
 injuries-0.020
joy-0.018
awed-0.016
ishing-0.015
irmed-0.015
Token old
Token ID1468
Feature activation+2.05e-04

Pos logit contributions
 injuries+0.161
joy+0.148
awed+0.133
sed+0.132
irmed+0.129
Neg logit contributions
 ring-0.108
 also-0.104
 understand-0.100
 learn-0.098
 sign-0.098
Token and
Token ID290
Feature activation-1.71e-04

Pos logit contributions
 injuries+0.001
joy+0.001
awed+0.000
sed+0.000
irmed+0.000
Neg logit contributions
 ring-0.000
 also-0.000
 achieve-0.000
 understand-0.000
 learn-0.000
Token always
Token ID1464
Feature activation+1.51e-02

Pos logit contributions
 ring+0.000
 sign+0.000
 rob+0.000
 priest+0.000
 pilot+0.000
Neg logit contributions
 injuries-0.000
joy-0.000
sed-0.000
awed-0.000
irmed-0.000
Token very
Token ID845
Feature activation-8.81e-03

Pos logit contributions
 injuries+0.039
joy+0.035
awed+0.033
sed+0.032
irmed+0.031
Neg logit contributions
 ring-0.016
 also-0.016
 know-0.015
 understand-0.015
 guard-0.014
Token hungry
Token ID14720
Feature activation+6.78e-03

Pos logit contributions
 ring+0.010
 also+0.009
 sign+0.009
 pilot+0.009
 rob+0.009
Neg logit contributions
 injuries-0.021
joy-0.019
sed-0.018
irmed-0.018
awed-0.018
TokenThe
Token ID464
Feature activation-1.12e-02

Pos logit contributions
 ring+0.003
 priest+0.003
 pilot+0.003
 look+0.003
 rob+0.003
Neg logit contributions
 injuries-0.004
awed-0.004
joy-0.004
irmed-0.004
ksh-0.003
Token big
Token ID1263
Feature activation-1.40e-02

Pos logit contributions
 ring+0.011
 also+0.010
 sign+0.010
 guard+0.009
 understand+0.009
Neg logit contributions
 injuries-0.029
joy-0.027
awed-0.026
irmed-0.025
sed-0.025
Token animal
Token ID5044
Feature activation+1.03e-02

Pos logit contributions
 ring+0.017
 also+0.016
 understand+0.015
 achieve+0.015
 learn+0.015
Neg logit contributions
 injuries-0.032
joy-0.030
sed-0.028
awed-0.028
irmed-0.027
Token smiled
Token ID13541
Feature activation+3.66e-02

Pos logit contributions
 injuries+0.020
joy+0.018
awed+0.017
sed+0.017
irmed+0.017
Neg logit contributions
 ring-0.015
 achieve-0.015
 priest-0.014
 learn-0.014
 sign-0.014
Token and
Token ID290
Feature activation+3.93e-03

Pos logit contributions
 injuries+0.110
joy+0.104
awed+0.098
sed+0.097
irmed+0.096
Neg logit contributions
 ring-0.019
 also-0.017
 sign-0.015
 guard-0.014
 priest-0.014
Token said
Token ID531
Feature activation+1.29e-02

Pos logit contributions
 injuries+0.010
joy+0.009
awed+0.008
sed+0.008
irmed+0.008
Neg logit contributions
 ring-0.005
 also-0.005
 sign-0.004
 guard-0.004
 cross-0.004
Token,
Token ID11
Feature activation-2.52e-04

Pos logit contributions
 injuries+0.034
joy+0.033
sed+0.031
awed+0.030
irmed+0.029
Neg logit contributions
 also-0.010
 ring-0.010
 priest-0.009
 sign-0.009
 pilot-0.009
Token "
Token ID366
Feature activation+1.57e-02

Pos logit contributions
 ring+0.000
 also+0.000
 priest+0.000
 guard+0.000
 learn+0.000
Neg logit contributions
 injuries-0.001
joy-0.001
sed-0.000
awed-0.000
ksh-0.000
TokenIt
Token ID1026
Feature activation+1.35e-02

Pos logit contributions
 injuries+0.029
joy+0.026
awed+0.025
irmed+0.024
ksh+0.023
Neg logit contributions
 ring-0.026
 also-0.024
 sign-0.024
 learn-0.024
 guard-0.024
Token's
Token ID338
Feature activation-1.51e-02

Pos logit contributions
 injuries+0.029
sed+0.028
awed+0.028
irmed+0.028
joy+0.027
Neg logit contributions
 priest-0.015
 pilot-0.014
 ring-0.013
 guard-0.013
 achieve-0.013
Token ok
Token ID12876
Feature activation+1.11e-02

Pos logit contributions
 ring+0.020
 also+0.020
 sign+0.018
 understand+0.018
 guard+0.018
Neg logit contributions
 injuries-0.034
joy-0.031
awed-0.029
sed-0.028
irmed-0.028
Token He
Token ID679
Feature activation-1.07e-02

Pos logit contributions
joy+0.005
 injuries+0.004
awed+0.004
irmed+0.004
ksh+0.004
Neg logit contributions
 ring-0.003
 sign-0.003
 priest-0.003
 pilot-0.003
 also-0.003
Token was
Token ID373
Feature activation+8.95e-03

Pos logit contributions
 ring+0.016
Queen+0.015
 priest+0.015
 understand+0.014
 achieve+0.014
Neg logit contributions
 injuries-0.019
joy-0.018
awed-0.018
sed-0.017
irmed-0.017
Token very
Token ID845
Feature activation-1.25e-02

Pos logit contributions
joy+0.019
 injuries+0.018
 nowhere+0.017
sed+0.016
awed+0.016
Neg logit contributions
 ring-0.011
Queen-0.011
 slide-0.010
 achieve-0.010
 sign-0.010
Token happy
Token ID3772
Feature activation+4.93e-03

Pos logit contributions
 ring+0.017
Queen+0.017
 slide+0.016
 pilot+0.016
Pear+0.016
Neg logit contributions
joy-0.024
 injuries-0.022
sed-0.020
irmed-0.019
less-0.019
Token and
Token ID290
Feature activation+2.61e-03

Pos logit contributions
 injuries+0.013
joy+0.012
awed+0.011
sed+0.011
irmed+0.011
Neg logit contributions
 ring-0.005
 also-0.004
 understand-0.004
 sign-0.004
 learn-0.004
Token he
Token ID339
Feature activation+1.98e-02

Pos logit contributions
 injuries+0.005
joy+0.005
awed+0.005
sed+0.005
irmed+0.005
Neg logit contributions
 ring-0.003
 also-0.003
 sign-0.003
Queen-0.003
 pilot-0.003
Token knew
Token ID2993
Feature activation-2.52e-02

Pos logit contributions
 injuries+0.032
joy+0.032
awed+0.031
sed+0.030
irmed+0.029
Neg logit contributions
 ring-0.031
 priest-0.029
 understand-0.029
 sign-0.029
Queen-0.029
Token he
Token ID339
Feature activation+1.94e-02

Pos logit contributions
 ring+0.030
Queen+0.028
 rob+0.028
 sign+0.027
 pilot+0.027
Neg logit contributions
 injuries-0.058
joy-0.054
awed-0.047
sed-0.047
irmed-0.046
Token could
Token ID714
Feature activation-1.94e-02

Pos logit contributions
 injuries+0.036
joy+0.034
awed+0.031
sed+0.031
irmed+0.031
Neg logit contributions
 ring-0.029
 priest-0.027
 rob-0.027
 sign-0.027
 cross-0.027
Token never
Token ID1239
Feature activation-6.74e-04

Pos logit contributions
Queen+0.024
a+0.023
 priest+0.023
 pilot+0.021
Pear+0.021
Neg logit contributions
 injuries-0.040
joy-0.029
 intended-0.028
sed-0.028
 hours-0.028
Token forget
Token ID6044
Feature activation-1.29e-02

Pos logit contributions
 ring+0.000
 also+0.000
 pilot+0.000
 priest+0.000
 sign+0.000
Neg logit contributions
 injuries-0.002
joy-0.002
sed-0.002
awed-0.002
irmed-0.002
Token Maybe
Token ID6674
Feature activation-2.76e-02

Pos logit contributions
 ring+0.006
 also+0.006
 sign+0.005
 guard+0.005
 priest+0.005
Neg logit contributions
 injuries-0.010
joy-0.010
sed-0.009
awed-0.009
irmed-0.009
Token I
Token ID314
Feature activation+4.93e-03

Pos logit contributions
 ring+0.028
 also+0.028
 sign+0.025
 guard+0.025
 learn+0.024
Neg logit contributions
 injuries-0.070
joy-0.065
awed-0.062
sed-0.060
irmed-0.060
Token can
Token ID460
Feature activation-3.81e-02

Pos logit contributions
 injuries+0.011
joy+0.010
sed+0.010
irmed+0.010
anged+0.009
Neg logit contributions
 priest-0.005
 ring-0.005
 pilot-0.005
Queen-0.005
 achieve-0.005
Token learn
Token ID2193
Feature activation-4.20e-02

Pos logit contributions
 pilot+0.046
 priest+0.046
a+0.043
Queen+0.043
 rob+0.041
Neg logit contributions
 injuries-0.085
sed-0.064
ishing-0.059
 thirst-0.058
irmed-0.058
Token a
Token ID257
Feature activation-5.96e-03

Pos logit contributions
 ring+0.058
 pilot+0.056
 guard+0.055
 priest+0.055
 also+0.054
Neg logit contributions
 injuries-0.088
joy-0.076
awed-0.071
sed-0.069
aj-0.067
Token new
Token ID649
Feature activation+2.19e-02

Pos logit contributions
 also+0.009
 pilot+0.009
 priest+0.009
 achieve+0.008
 ring+0.008
Neg logit contributions
 injuries-0.010
joy-0.010
sed-0.009
ksh-0.008
less-0.008
Token recipe
Token ID8364
Feature activation-3.01e-02

Pos logit contributions
 injuries+0.049
joy+0.046
sed+0.042
awed+0.040
irmed+0.039
Neg logit contributions
 also-0.027
 ring-0.027
 pilot-0.025
 achieve-0.025
 priest-0.024
Token next
Token ID1306
Feature activation-1.56e-02

Pos logit contributions
 rob+0.050
 pilot+0.048
 ring+0.047
 cross+0.046
 also+0.046
Neg logit contributions
 injuries-0.052
joy-0.044
ishing-0.040
aj-0.040
less-0.039
Token time
Token ID640
Feature activation+2.53e-02

Pos logit contributions
 ring+0.021
 also+0.020
 sign+0.019
 guard+0.019
 understand+0.018
Neg logit contributions
 injuries-0.035
joy-0.032
awed-0.029
sed-0.029
irmed-0.028
Token."
Token ID526
Feature activation-1.10e-02

Pos logit contributions
 injuries+0.066
joy+0.061
awed+0.058
sed+0.057
irmed+0.056
Neg logit contributions
 ring-0.024
 also-0.023
 sign-0.021
 guard-0.020
 learn-0.020
Token\n
Token ID198
Feature activation-1.92e-02

Pos logit contributions
 ring+0.011
 pilot+0.010
 priest+0.010
 also+0.010
Queen+0.009
Neg logit contributions
 injuries-0.027
joy-0.027
awed-0.024
sed-0.024
irmed-0.024
Token me
Token ID502
Feature activation-1.98e-02

Pos logit contributions
 injuries+0.048
ishing+0.038
 nowhere+0.035
awed+0.034
joy+0.034
Neg logit contributions
 priest-0.055
 pilot-0.051
 ring-0.051
 also-0.051
 guard-0.051
Token?"
Token ID1701
Feature activation+6.27e-04

Pos logit contributions
 priest+0.024
 pastor+0.024
Queen+0.022
 pilot+0.022
 guard+0.020
Neg logit contributions
 injuries-0.042
ishing-0.039
aj-0.034
joy-0.032
uc-0.032
Token\n
Token ID198
Feature activation-5.10e-03

Pos logit contributions
 injuries+0.002
joy+0.002
awed+0.002
sed+0.002
irmed+0.002
Neg logit contributions
 ring-0.000
 also-0.000
 sign-0.000
 guard-0.000
 priest-0.000
Token\n
Token ID198
Feature activation-4.61e-03

Pos logit contributions
 ring+0.004
 also+0.003
 sign+0.003
 guard+0.003
 priest+0.003
Neg logit contributions
 injuries-0.014
joy-0.013
awed-0.013
sed-0.012
irmed-0.012
TokenMax
Token ID11518
Feature activation+3.21e-02

Pos logit contributions
 ring+0.008
 priest+0.008
 also+0.008
 rob+0.008
 understand+0.008
Neg logit contributions
 injuries-0.008
joy-0.007
awed-0.006
irmed-0.006
sed-0.006
Token:
Token ID25
Feature activation+1.59e-02

Pos logit contributions
joy+0.062
 injuries+0.059
sed+0.059
awed+0.057
irmed+0.054
Neg logit contributions
 priest-0.043
 pilot-0.042
 ring-0.042
 achieve-0.040
Queen-0.040
Token "
Token ID366
Feature activation+1.42e-02

Pos logit contributions
joy+0.032
 injuries+0.029
sed+0.027
awed+0.027
 nowhere+0.024
Neg logit contributions
 ring-0.022
 priest-0.022
 understand-0.021
 achieve-0.021
 learn-0.021
TokenSure
Token ID19457
Feature activation-3.41e-02

Pos logit contributions
 injuries+0.028
joy+0.026
awed+0.024
ksh+0.024
irmed+0.024
Neg logit contributions
 ring-0.020
 also-0.019
 cross-0.018
 sign-0.018
 guard-0.018
Token,
Token ID11
Feature activation-1.91e-03

Pos logit contributions
 ring+0.020
 guard+0.016
 sign+0.016
 also+0.016
 slide+0.014
Neg logit contributions
 injuries-0.103
joy-0.096
awed-0.093
irmed-0.090
ksh-0.090
Token that
Token ID326
Feature activation+8.34e-04

Pos logit contributions
 achieve+0.002
 priest+0.002
 ring+0.002
Queen+0.002
 also+0.002
Neg logit contributions
 injuries-0.004
joy-0.004
sed-0.003
less-0.003
ksh-0.003
Token sounds
Token ID5238
Feature activation+9.84e-03

Pos logit contributions
 injuries+0.002
joy+0.002
awed+0.002
sed+0.002
irmed+0.002
Neg logit contributions
 ring-0.001
 also-0.001
 priest-0.001
 guard-0.001
 pilot-0.001
Token's
Token ID338
Feature activation-1.80e-02

Pos logit contributions
 injuries+0.024
irmed+0.022
joy+0.022
sed+0.022
awed+0.022
Neg logit contributions
 priest-0.012
 ring-0.012
 guard-0.011
 slide-0.011
 sign-0.011
Token very
Token ID845
Feature activation-9.36e-03

Pos logit contributions
 ring+0.023
 also+0.021
 priest+0.021
 guard+0.021
 pilot+0.021
Neg logit contributions
 injuries-0.041
joy-0.037
sed-0.034
awed-0.033
irmed-0.033
Token difficult
Token ID2408
Feature activation+1.23e-02

Pos logit contributions
 ring+0.011
 also+0.010
 sign+0.010
 guard+0.010
 pilot+0.010
Neg logit contributions
 injuries-0.022
joy-0.020
awed-0.019
sed-0.019
irmed-0.019
Token for
Token ID329
Feature activation+7.06e-03

Pos logit contributions
 injuries+0.028
joy+0.026
awed+0.025
sed+0.024
irmed+0.024
Neg logit contributions
 ring-0.015
 also-0.015
 sign-0.014
 learn-0.014
 understand-0.014
Token me
Token ID502
Feature activation-2.39e-02

Pos logit contributions
 injuries+0.017
joy+0.015
awed+0.014
sed+0.014
irmed+0.014
Neg logit contributions
 ring-0.008
 also-0.008
 sign-0.007
 guard-0.007
 understand-0.007
Token to
Token ID284
Feature activation+1.22e-02

Pos logit contributions
 ring+0.033
Queen+0.032
 rob+0.032
 priest+0.031
 also+0.031
Neg logit contributions
 injuries-0.051
joy-0.044
sed-0.039
aj-0.038
ishing-0.038
Token find
Token ID1064
Feature activation-2.53e-02

Pos logit contributions
 injuries+0.029
joy+0.020
 breaths+0.019
 thirst+0.019
sed+0.019
Neg logit contributions
 priest-0.014
a-0.014
Queen-0.014
 pilot-0.013
ele-0.013
Token enough
Token ID1576
Feature activation-3.31e-03

Pos logit contributions
 ring+0.029
 also+0.029
 priest+0.027
 guard+0.026
 sign+0.026
Neg logit contributions
 injuries-0.060
joy-0.054
awed-0.050
sed-0.049
irmed-0.049
Token food
Token ID2057
Feature activation+1.49e-02

Pos logit contributions
 ring+0.005
 also+0.005
 priest+0.005
 guard+0.005
 understand+0.005
Neg logit contributions
 injuries-0.007
joy-0.006
sed-0.005
awed-0.005
irmed-0.005
Token.
Token ID13
Feature activation+1.37e-03

Pos logit contributions
 injuries+0.031
joy+0.027
sed+0.025
awed+0.025
ishing+0.024
Neg logit contributions
 ring-0.020
 rob-0.019
 pilot-0.019
 also-0.019
Queen-0.019
Token Can
Token ID1680
Feature activation-8.35e-03

Pos logit contributions
 injuries+0.002
joy+0.002
ksh+0.002
sed+0.002
irmed+0.002
Neg logit contributions
 ring-0.002
 priest-0.002
 slide-0.002
aches-0.002
 sign-0.002
Token.
Token ID13
Feature activation+4.92e-04

Pos logit contributions
 also+0.024
 ring+0.024
 go+0.023
 and+0.022
 understand+0.022
Neg logit contributions
 injuries-0.088
joy-0.084
awed-0.083
sed-0.081
irmed-0.081
Token She
Token ID1375
Feature activation-2.60e-02

Pos logit contributions
 injuries+0.001
joy+0.001
sed+0.001
awed+0.001
irmed+0.001
Neg logit contributions
 ring-0.000
 achieve-0.000
 pilot-0.000
 guard-0.000
 rob-0.000
Token wants
Token ID3382
Feature activation-2.36e-02

Pos logit contributions
 ring+0.034
 understand+0.032
 guard+0.031
 pilot+0.031
 rob+0.031
Neg logit contributions
 injuries-0.056
joy-0.053
awed-0.049
sed-0.049
irmed-0.048
Token to
Token ID284
Feature activation+1.03e-02

Pos logit contributions
 also+0.026
 ring+0.026
 go+0.023
 sign+0.023
 learn+0.023
Neg logit contributions
 injuries-0.058
joy-0.055
awed-0.052
irmed-0.050
ishing-0.050
Token play
Token ID711
Feature activation-1.69e-02

Pos logit contributions
 injuries+0.022
joy+0.018
sed+0.018
 thirst+0.017
irmed+0.017
Neg logit contributions
 pilot-0.012
 priest-0.012
Queen-0.012
a-0.011
 ring-0.011
Token in
Token ID287
Feature activation-1.30e-02

Pos logit contributions
 ring+0.014
 also+0.013
 sign+0.012
 guard+0.012
 understand+0.012
Neg logit contributions
 injuries-0.046
joy-0.043
awed-0.040
sed-0.040
irmed-0.039
Token the
Token ID262
Feature activation-6.86e-03

Pos logit contributions
 ring+0.014
 pilot+0.013
 understand+0.013
 guard+0.012
 priest+0.012
Neg logit contributions
 injuries-0.032
joy-0.030
awed-0.027
irmed-0.025
 nowhere-0.025
Token water
Token ID1660
Feature activation-2.42e-03

Pos logit contributions
 ring+0.007
 also+0.007
 understand+0.007
 guard+0.007
 priest+0.007
Neg logit contributions
 injuries-0.017
joy-0.015
awed-0.014
sed-0.014
irmed-0.014
Token.
Token ID13
Feature activation+1.03e-03

Pos logit contributions
 pilot+0.004
 understand+0.004
 ring+0.004
 know+0.003
 rob+0.003
Neg logit contributions
 injuries-0.005
joy-0.004
awed-0.004
ishing-0.004
sed-0.004
Token\n
Token ID198
Feature activation-1.37e-02

Pos logit contributions
 injuries+0.003
joy+0.003
sed+0.002
awed+0.002
irmed+0.002
Neg logit contributions
 ring-0.001
 also-0.001
 pilot-0.001
 guard-0.001
 rob-0.001
Token\n
Token ID198
Feature activation-1.24e-02

Pos logit contributions
 ring+0.012
 pilot+0.010
 also+0.010
 guard+0.010
 priest+0.010
Neg logit contributions
 injuries-0.036
joy-0.033
awed-0.032
sed-0.031
irmed-0.030
Token caul
Token ID46087
Feature activation-2.39e-03

Pos logit contributions
 also+0.009
 achieve+0.009
 understand+0.009
 slide+0.009
 learn+0.009
Neg logit contributions
 injuries-0.009
joy-0.009
sed-0.007
 nowhere-0.007
ishing-0.006
Tokeniflower
Token ID42642
Feature activation+1.05e-02

Pos logit contributions
 ring+0.005
 slide+0.005
 guard+0.005
 go+0.005
 understand+0.005
Neg logit contributions
 injuries-0.004
joy-0.003
awed-0.003
ishing-0.003
irmed-0.003
Token and
Token ID290
Feature activation+2.59e-03

Pos logit contributions
 injuries+0.024
joy+0.023
ishing+0.022
awed+0.021
 nowhere+0.021
Neg logit contributions
Queen-0.010
 pilot-0.010
 priest-0.010
 understand-0.010
 know-0.010
Token jumped
Token ID11687
Feature activation-1.14e-03

Pos logit contributions
 injuries+0.005
joy+0.005
awed+0.004
sed+0.004
irmed+0.004
Neg logit contributions
 ring-0.004
 priest-0.004
 also-0.004
Queen-0.004
 understand-0.003
Token back
Token ID736
Feature activation-6.07e-03

Pos logit contributions
 pilot+0.002
 priest+0.002
 rob+0.002
 also+0.002
Queen+0.002
Neg logit contributions
joy-0.002
 injuries-0.002
awed-0.002
 breaths-0.002
irmed-0.002
Token on
Token ID319
Feature activation-1.62e-02

Pos logit contributions
 ring+0.009
 also+0.009
 understand+0.009
Queen+0.009
 priest+0.009
Neg logit contributions
 injuries-0.012
joy-0.012
awed-0.010
ishing-0.010
sed-0.010
Token the
Token ID262
Feature activation-6.40e-03

Pos logit contributions
 ring+0.017
 also+0.017
 pilot+0.016
 understand+0.016
 priest+0.016
Neg logit contributions
 injuries-0.040
joy-0.038
awed-0.033
sed-0.032
 breaths-0.032
Token zip
Token ID19974
Feature activation-1.80e-02

Pos logit contributions
 understand+0.009
 also+0.009
 know+0.009
Queen+0.009
a+0.009
Neg logit contributions
 injuries-0.013
joy-0.011
sed-0.010
 breaths-0.010
awed-0.009
Token-
Token ID12
Feature activation+8.07e-03

Pos logit contributions
 understand+0.023
 know+0.023
 also+0.023
Queen+0.022
 rob+0.022
Neg logit contributions
 injuries-0.038
joy-0.035
awed-0.032
ishing-0.031
sed-0.030
Tokenline
Token ID1370
Feature activation-7.93e-02

Pos logit contributions
 injuries+0.012
joy+0.012
awed+0.011
ishing+0.011
ted+0.010
Neg logit contributions
 ring-0.015
 also-0.014
 pilot-0.014
 guard-0.014
 exc-0.014
Token,
Token ID11
Feature activation-6.46e-03

Pos logit contributions
Queen+0.126
 pilot+0.114
 rob+0.113
 priest+0.111
 ring+0.108
Neg logit contributions
 injuries-0.142
joy-0.139
less-0.118
awed-0.115
sed-0.110
Token improve
Token ID2987
Feature activation-1.96e-02

Pos logit contributions
 priest+0.008
 pilot+0.008
 ring+0.008
 also+0.008
 crane+0.008
Neg logit contributions
 injuries-0.022
joy-0.018
anged-0.017
less-0.017
ishing-0.017
Token my
Token ID616
Feature activation-6.73e-03

Pos logit contributions
 ring+0.024
 pilot+0.023
 priest+0.023
 also+0.023
 rob+0.022
Neg logit contributions
 injuries-0.045
joy-0.041
ishing-0.036
awed-0.036
 nowhere-0.035
Token report
Token ID989
Feature activation-1.01e-02

Pos logit contributions
 ring+0.010
 sign+0.009
 learn+0.009
 also+0.009
 understand+0.009
Neg logit contributions
 injuries-0.014
joy-0.013
awed-0.012
irmed-0.012
ksh-0.012
Token.
Token ID13
Feature activation-5.67e-03

Pos logit contributions
 rob+0.015
 ring+0.015
 also+0.014
 pilot+0.014
 priest+0.014
Neg logit contributions
 injuries-0.020
joy-0.018
awed-0.016
ishing-0.016
aj-0.015
Token You
Token ID921
Feature activation-2.75e-02

Pos logit contributions
 ring+0.011
 pilot+0.010
 also+0.010
 cross+0.010
 priest+0.010
Neg logit contributions
 injuries-0.009
joy-0.008
ksh-0.007
sed-0.007
irmed-0.007
Token spoiled
Token ID34781
Feature activation-2.98e-02

Pos logit contributions
 priest+0.037
 ring+0.036
 rob+0.036
Queen+0.035
 pilot+0.034
Neg logit contributions
 injuries-0.054
sed-0.044
awed-0.044
irmed-0.044
anged-0.043
Token it
Token ID340
Feature activation-5.65e-03

Pos logit contributions
 also+0.040
 achieve+0.040
 ring+0.040
 pilot+0.040
 understand+0.039
Neg logit contributions
 injuries-0.061
joy-0.055
ishing-0.051
awed-0.051
ted-0.048
Token.
Token ID13
Feature activation-6.06e-03

Pos logit contributions
 rob+0.008
 ring+0.008
 priest+0.008
 pilot+0.008
 also+0.007
Neg logit contributions
 injuries-0.012
joy-0.010
awed-0.010
sed-0.010
ishing-0.010
Token You
Token ID921
Feature activation-2.77e-02

Pos logit contributions
 ring+0.011
 also+0.010
 guard+0.010
 pilot+0.010
 priest+0.010
Neg logit contributions
 injuries-0.011
joy-0.010
sed-0.009
awed-0.009
irmed-0.009
Token made
Token ID925
Feature activation-1.94e-02

Pos logit contributions
 priest+0.039
 ring+0.037
 rob+0.036
 pilot+0.036
Queen+0.035
Neg logit contributions
 injuries-0.054
sed-0.045
awed-0.044
irmed-0.044
anged-0.043
Token a
Token ID257
Feature activation-6.68e-03

Pos logit contributions
 ring+0.027
 pilot+0.025
 priest+0.025
 guard+0.025
 rob+0.024
Neg logit contributions
 injuries-0.041
joy-0.036
awed-0.034
sed-0.032
 nowhere-0.032
Token go
Token ID467
Feature activation-5.07e-02

Pos logit contributions
 injuries+0.010
joy+0.009
irmed+0.009
sed+0.008
ksh+0.008
Neg logit contributions
 ring-0.005
Queen-0.005
 guard-0.005
 rob-0.005
 pilot-0.005
Token to
Token ID284
Feature activation+1.65e-02

Pos logit contributions
 ring+0.074
 also+0.072
 rob+0.069
 guard+0.068
 understand+0.068
Neg logit contributions
 injuries-0.104
joy-0.095
awed-0.086
sed-0.084
irmed-0.083
Token the
Token ID262
Feature activation-3.47e-03

Pos logit contributions
 injuries+0.039
joy+0.035
awed+0.033
sed+0.032
irmed+0.032
Neg logit contributions
 ring-0.020
 also-0.019
 sign-0.018
 guard-0.018
 understand-0.018
Token kitchen
Token ID9592
Feature activation+1.62e-03

Pos logit contributions
 ring+0.005
 understand+0.005
 also+0.005
 know+0.005
 rob+0.004
Neg logit contributions
 injuries-0.007
joy-0.007
ksh-0.006
awed-0.006
sed-0.006
Token to
Token ID284
Feature activation+1.54e-02

Pos logit contributions
 injuries+0.004
joy+0.003
awed+0.003
sed+0.003
irmed+0.003
Neg logit contributions
 ring-0.002
 also-0.002
 guard-0.002
 sign-0.002
 understand-0.002
Token get
Token ID651
Feature activation-2.59e-02

Pos logit contributions
 injuries+0.037
joy+0.032
ishing+0.029
irmed+0.029
sed+0.029
Neg logit contributions
 pilot-0.015
a-0.015
 ring-0.015
 priest-0.014
 rob-0.014
Token himself
Token ID2241
Feature activation-1.41e-02

Pos logit contributions
 also+0.030
 ring+0.029
 go+0.027
 even+0.027
 learn+0.026
Neg logit contributions
 injuries-0.063
joy-0.059
sed-0.056
awed-0.056
irmed-0.053
Token a
Token ID257
Feature activation-7.10e-04

Pos logit contributions
 ring+0.019
Queen+0.019
 pilot+0.018
 rob+0.018
 also+0.018
Neg logit contributions
 injuries-0.029
joy-0.027
awed-0.025
ishing-0.025
aj-0.024
Token bowl
Token ID9396
Feature activation+2.95e-04

Pos logit contributions
 ring+0.001
 also+0.001
 rob+0.001
 guard+0.001
 pilot+0.001
Neg logit contributions
 injuries-0.002
joy-0.001
sed-0.001
awed-0.001
irmed-0.001
Token full
Token ID1336
Feature activation+1.86e-02

Pos logit contributions
 injuries+0.001
joy+0.001
awed+0.001
sed+0.001
ishing+0.001
Neg logit contributions
 ring-0.000
 also-0.000
 rob-0.000
Queen-0.000
 pilot-0.000
Token of
Token ID286
Feature activation-4.83e-03

Pos logit contributions
 injuries+0.051
joy+0.047
awed+0.044
sed+0.044
irmed+0.043
Neg logit contributions
 ring-0.015
 also-0.014
 sign-0.013
 guard-0.013
 understand-0.013
Token.
Token ID13
Feature activation+9.72e-03

Pos logit contributions
 pilot+0.024
 ring+0.023
 guard+0.023
 rob+0.023
 also+0.023
Neg logit contributions
 injuries-0.030
joy-0.026
awed-0.026
ishing-0.024
less-0.024
Token One
Token ID1881
Feature activation+4.62e-03

Pos logit contributions
 injuries+0.020
joy+0.019
awed+0.017
sed+0.017
irmed+0.017
Neg logit contributions
 ring-0.014
 also-0.013
 guard-0.013
 sign-0.013
 priest-0.013
Token day
Token ID1110
Feature activation+2.93e-02

Pos logit contributions
 injuries+0.010
joy+0.009
awed+0.009
sed+0.008
irmed+0.008
Neg logit contributions
 ring-0.006
 also-0.006
 sign-0.006
 guard-0.006
 understand-0.006
Token,
Token ID11
Feature activation-6.71e-04

Pos logit contributions
 injuries+0.102
joy+0.095
ksh+0.093
awed+0.090
sed+0.090
Neg logit contributions
 ring-0.003
 go-0.003
"-0.003
 also-0.002
 sign-0.001
Token Tim
Token ID5045
Feature activation+7.57e-03

Pos logit contributions
 ring+0.001
 also+0.001
 sign+0.001
 slide+0.001
 go+0.001
Neg logit contributions
 injuries-0.001
joy-0.001
awed-0.001
irmed-0.001
sed-0.001
Tokenmy
Token ID1820
Feature activation-1.30e-02

Pos logit contributions
 injuries+0.016
joy+0.015
awed+0.014
irmed+0.014
ishing+0.013
Neg logit contributions
 ring-0.010
 sign-0.009
 also-0.009
 priest-0.009
 guard-0.009
Token's
Token ID338
Feature activation-1.53e-02

Pos logit contributions
 ring+0.014
 priest+0.012
 rob+0.012
 sign+0.012
 guard+0.011
Neg logit contributions
 injuries-0.032
joy-0.029
awed-0.028
irmed-0.027
sed-0.027
Token mom
Token ID1995
Feature activation-2.39e-03

Pos logit contributions
 ring+0.014
 also+0.013
 sign+0.012
 guard+0.012
 understand+0.012
Neg logit contributions
 injuries-0.040
joy-0.038
awed-0.035
sed-0.035
irmed-0.034
Token baked
Token ID22979
Feature activation+5.48e-03

Pos logit contributions
 ring+0.004
 achieve+0.004
 guard+0.004
 priest+0.004
 rob+0.004
Neg logit contributions
 injuries-0.004
joy-0.004
awed-0.004
uddled-0.003
ishing-0.003
Token some
Token ID617
Feature activation+2.60e-02

Pos logit contributions
 injuries+0.010
joy+0.009
awed+0.009
ksh+0.008
ishing+0.008
Neg logit contributions
 ring-0.009
 pilot-0.009
 guard-0.008
 priest-0.008
 achieve-0.008
Token cookies
Token ID14746
Feature activation+2.10e-02

Pos logit contributions
 injuries+0.038
joy+0.032
awed+0.030
sed+0.027
ishing+0.027
Neg logit contributions
 ring-0.052
 also-0.051
 guard-0.050
 pilot-0.050
 achieve-0.049
Token adventure
Token ID8855
Feature activation+2.33e-02

Pos logit contributions
 injuries+0.053
joy+0.050
awed+0.047
irmed+0.045
ksh+0.044
Neg logit contributions
 ring-0.033
 also-0.030
 sign-0.030
 slide-0.030
 go-0.029
Token and
Token ID290
Feature activation+1.96e-03

Pos logit contributions
 injuries+0.067
joy+0.063
awed+0.058
sed+0.057
irmed+0.057
Neg logit contributions
 ring-0.015
 also-0.014
 rob-0.013
 understand-0.013
 guard-0.012
Token was
Token ID373
Feature activation+8.14e-03

Pos logit contributions
 injuries+0.005
joy+0.004
awed+0.004
sed+0.004
ksh+0.004
Neg logit contributions
 also-0.002
 ring-0.002
 guard-0.002
 sign-0.002
 learn-0.002
Token ready
Token ID3492
Feature activation-3.14e-02

Pos logit contributions
 injuries+0.020
joy+0.019
awed+0.017
sed+0.017
irmed+0.017
Neg logit contributions
 ring-0.008
 sign-0.008
Queen-0.007
 pilot-0.007
 achieve-0.007
Token to
Token ID284
Feature activation+1.30e-02

Pos logit contributions
 ring+0.037
 go+0.036
 also+0.036
 even+0.034
 learn+0.034
Neg logit contributions
 injuries-0.074
awed-0.068
joy-0.067
sed-0.066
irmed-0.065
Token go
Token ID467
Feature activation-5.40e-02

Pos logit contributions
 injuries+0.030
joy+0.023
 thirst+0.023
night+0.022
 breaths+0.022
Neg logit contributions
a-0.014
ele-0.013
Queen-0.013
 priest-0.013
 rob-0.012
Token home
Token ID1363
Feature activation-3.06e-03

Pos logit contributions
 ring+0.070
 also+0.068
 sign+0.064
 guard+0.064
 understand+0.063
Neg logit contributions
 injuries-0.121
joy-0.111
awed-0.103
sed-0.101
irmed-0.099
Token.
Token ID13
Feature activation+5.15e-03

Pos logit contributions
 ring+0.004
 also+0.004
 sign+0.004
 rob+0.004
 pilot+0.004
Neg logit contributions
 injuries-0.007
joy-0.006
awed-0.006
sed-0.006
irmed-0.005
Token But
Token ID887
Feature activation+1.35e-02

Pos logit contributions
 injuries+0.013
joy+0.012
ishing+0.011
awed+0.011
sed+0.011
Neg logit contributions
 also-0.005
 ring-0.005
 learn-0.005
 sign-0.005
 go-0.005
Token before
Token ID878
Feature activation-2.35e-03

Pos logit contributions
 injuries+0.032
joy+0.030
awed+0.027
sed+0.027
irmed+0.026
Neg logit contributions
 ring-0.016
 also-0.015
 sign-0.014
 understand-0.014
 guard-0.014
Token he
Token ID339
Feature activation+1.95e-02

Pos logit contributions
 also+0.002
 ring+0.002
 sign+0.002
 go+0.002
 cross+0.002
Neg logit contributions
 injuries-0.006
joy-0.006
ishing-0.006
sed-0.006
ksh-0.006
Tokeneer
Token ID28153
Feature activation+6.15e-03

Pos logit contributions
 achieve+0.063
 also+0.062
 ring+0.061
 learn+0.060
 understand+0.060
Neg logit contributions
 injuries-0.068
joy-0.059
awed-0.056
sed-0.056
ishing-0.055
Token.
Token ID13
Feature activation+8.21e-03

Pos logit contributions
 injuries+0.013
joy+0.011
awed+0.010
anged+0.010
sed+0.010
Neg logit contributions
 ring-0.009
Queen-0.008
 rob-0.008
 also-0.008
 priest-0.008
Token They
Token ID1119
Feature activation-3.59e-03

Pos logit contributions
 injuries+0.018
joy+0.017
sed+0.016
awed+0.016
irmed+0.016
Neg logit contributions
 ring-0.010
 also-0.010
 priest-0.009
 pilot-0.009
 guard-0.009
Token did
Token ID750
Feature activation-8.26e-03

Pos logit contributions
 ring+0.006
 achieve+0.005
Queen+0.005
 rob+0.005
 sign+0.005
Neg logit contributions
 injuries-0.007
joy-0.006
sed-0.006
awed-0.006
irmed-0.005
Token not
Token ID407
Feature activation-2.85e-03

Pos logit contributions
Queen+0.009
 rob+0.008
 pilot+0.008
 priest+0.008
umber+0.008
Neg logit contributions
 injuries-0.020
joy-0.018
sed-0.017
awed-0.016
irmed-0.015
Token want
Token ID765
Feature activation-5.31e-02

Pos logit contributions
 achieve+0.003
chini+0.003
 crane+0.003
umber+0.003
ute+0.003
Neg logit contributions
 injuries-0.006
joy-0.005
 seen-0.005
 minutes-0.005
sed-0.005
Token to
Token ID284
Feature activation+1.47e-02

Pos logit contributions
 ring+0.071
 also+0.068
 sign+0.064
 guard+0.064
 understand+0.063
Neg logit contributions
 injuries-0.118
joy-0.108
awed-0.100
sed-0.098
irmed-0.097
Token scare
Token ID19437
Feature activation-1.32e-03

Pos logit contributions
 injuries+0.034
joy+0.025
sed+0.024
anged+0.024
 breaths+0.022
Neg logit contributions
umber-0.017
Queen-0.016
a-0.016
chini-0.015
 also-0.015
Token it
Token ID340
Feature activation+7.38e-04

Pos logit contributions
 achieve+0.002
 also+0.002
 ring+0.002
 priest+0.002
 pilot+0.002
Neg logit contributions
 injuries-0.003
joy-0.002
ted-0.002
anged-0.002
 breaths-0.002
Token.
Token ID13
Feature activation+5.91e-03

Pos logit contributions
 injuries+0.002
joy+0.001
sed+0.001
awed+0.001
ishing+0.001
Neg logit contributions
 priest-0.001
Queen-0.001
 also-0.001
 rob-0.001
 achieve-0.001
Token The
Token ID383
Feature activation+2.70e-03

Pos logit contributions
 injuries+0.012
joy+0.012
sed+0.011
awed+0.011
irmed+0.011
Neg logit contributions
 ring-0.008
 priest-0.008
 pilot-0.007
 also-0.007
 guard-0.007
Token animals
Token ID4695
Feature activation-1.84e-02

Pos logit contributions
 priest+0.015
a+0.015
Queen+0.014
 understand+0.014
 name+0.014
Neg logit contributions
 injuries-0.014
joy-0.011
 bruises-0.010
sed-0.009
 nowhere-0.009
Token,
Token ID11
Feature activation-8.35e-03

Pos logit contributions
 priest+0.021
 understand+0.020
Queen+0.019
 ring+0.019
 pilot+0.019
Neg logit contributions
 injuries-0.041
joy-0.037
sed-0.034
ishing-0.033
awed-0.033
Token and
Token ID290
Feature activation-4.28e-03

Pos logit contributions
 priest+0.008
a+0.008
Queen+0.008
 rob+0.008
Tab+0.007
Neg logit contributions
 injuries-0.020
joy-0.018
sed-0.016
 nowhere-0.016
ishing-0.016
Token anything
Token ID1997
Feature activation-1.20e-02

Pos logit contributions
 ring+0.006
Queen+0.006
 priest+0.005
a+0.005
 also+0.005
Neg logit contributions
 injuries-0.009
joy-0.008
sed-0.007
irmed-0.007
 bruises-0.006
Token they
Token ID484
Feature activation+7.64e-03

Pos logit contributions
 ring+0.016
 understand+0.015
 pilot+0.015
 rob+0.015
 learn+0.015
Neg logit contributions
 injuries-0.025
joy-0.022
sed-0.021
awed-0.021
irmed-0.020
Token want
Token ID765
Feature activation-5.98e-02

Pos logit contributions
 injuries+0.014
joy+0.013
irmed+0.012
sed+0.012
ishing+0.012
Neg logit contributions
 priest-0.010
 rob-0.010
 ring-0.010
 pilot-0.010
Queen-0.009
Token.
Token ID13
Feature activation+4.76e-03

Pos logit contributions
 ring+0.081
 also+0.077
 pilot+0.072
 understand+0.072
 guard+0.072
Neg logit contributions
 injuries-0.131
joy-0.119
awed-0.110
sed-0.108
irmed-0.105
Token
Token ID198
Feature activation-8.28e-03

Pos logit contributions
 injuries+0.013
joy+0.012
awed+0.011
sed+0.011
irmed+0.011
Neg logit contributions
 ring-0.004
 also-0.004
 sign-0.003
 guard-0.003
 understand-0.003
Token
Token ID198
Feature activation-7.12e-03

Pos logit contributions
 ring+0.008
 pilot+0.007
 guard+0.007
 rob+0.006
 priest+0.006
Neg logit contributions
 injuries-0.021
joy-0.020
awed-0.019
irmed-0.018
sed-0.018
TokenOne
Token ID3198
Feature activation+2.01e-02

Pos logit contributions
 ring+0.010
 guard+0.009
 also+0.009
 pilot+0.009
 sign+0.009
Neg logit contributions
 injuries-0.015
joy-0.014
awed-0.013
irmed-0.013
sed-0.013
Token day
Token ID1110
Feature activation+2.22e-02

Pos logit contributions
 injuries+0.036
joy+0.032
sed+0.029
irmed+0.029
ishing+0.028
Neg logit contributions
 ring-0.033
 achieve-0.031
 priest-0.031
Queen-0.030
 pilot-0.030
Token shouted
Token ID17293
Feature activation+1.06e-02

Pos logit contributions
 injuries+0.011
joy+0.010
awed+0.009
sed+0.009
irmed+0.009
Neg logit contributions
 ring-0.006
 also-0.006
 sign-0.006
 guard-0.006
 understand-0.006
Token out
Token ID503
Feature activation-2.54e-02

Pos logit contributions
 injuries+0.026
joy+0.025
awed+0.023
sed+0.023
irmed+0.022
Neg logit contributions
 also-0.011
 ring-0.011
 achieve-0.010
 learn-0.010
 understand-0.009
Token:
Token ID25
Feature activation+1.19e-02

Pos logit contributions
 ring+0.020
 also+0.019
 sign+0.017
 guard+0.017
 understand+0.017
Neg logit contributions
 injuries-0.070
joy-0.065
awed-0.061
sed-0.060
irmed-0.060
Token "
Token ID366
Feature activation+1.04e-02

Pos logit contributions
joy+0.025
 injuries+0.024
sed+0.021
awed+0.021
 nowhere+0.019
Neg logit contributions
 ring-0.015
 pilot-0.015
 learn-0.015
 achieve-0.015
 understand-0.014
TokenI
Token ID40
Feature activation-1.57e-02

Pos logit contributions
 injuries+0.021
joy+0.020
awed+0.018
irmed+0.017
ksh+0.017
Neg logit contributions
 ring-0.015
 also-0.015
 sign-0.014
 learn-0.014
 cross-0.014
Token want
Token ID765
Feature activation-5.50e-02

Pos logit contributions
 ring+0.018
 priest+0.017
 pilot+0.017
 rob+0.016
 also+0.016
Neg logit contributions
 injuries-0.034
joy-0.033
sed-0.030
irmed-0.030
awed-0.029
Token that
Token ID326
Feature activation-3.61e-03

Pos logit contributions
 ring+0.068
 also+0.066
 sign+0.062
 guard+0.062
 understand+0.061
Neg logit contributions
 injuries-0.126
joy-0.116
awed-0.107
sed-0.106
irmed-0.103
Token hat
Token ID6877
Feature activation-3.23e-03

Pos logit contributions
 also+0.005
 achieve+0.005
 rob+0.005
 priest+0.005
 understand+0.005
Neg logit contributions
 injuries-0.007
joy-0.007
sed-0.006
 nowhere-0.005
aj-0.005
Token!
Token ID0
Feature activation-6.12e-03

Pos logit contributions
Queen+0.004
 priest+0.004
 rob+0.004
 also+0.004
 understand+0.004
Neg logit contributions
 injuries-0.006
sed-0.005
less-0.005
awed-0.005
joy-0.005
Token Give
Token ID13786
Feature activation-9.85e-03

Pos logit contributions
 ring+0.011
 achieve+0.010
aches+0.010
 also+0.010
 priest+0.010
Neg logit contributions
 injuries-0.009
joy-0.009
sed-0.009
ksh-0.009
run-0.008
Token me
Token ID502
Feature activation-2.38e-02

Pos logit contributions
 achieve+0.014
 also+0.014
 ring+0.014
 exc+0.014
 understand+0.013
Neg logit contributions
 injuries-0.019
joy-0.019
awed-0.018
ted-0.017
sed-0.016
Token and
Token ID290
Feature activation+6.09e-03

Pos logit contributions
 injuries+0.080
joy+0.073
awed+0.068
sed+0.067
irmed+0.066
Neg logit contributions
 ring-0.036
 also-0.035
 sign-0.033
 guard-0.032
 understand-0.032
Token came
Token ID1625
Feature activation+6.79e-03

Pos logit contributions
 injuries+0.014
joy+0.013
awed+0.012
sed+0.012
ksh+0.012
Neg logit contributions
 also-0.008
 ring-0.008
 sign-0.007
 learn-0.007
 even-0.007
Token up
Token ID510
Feature activation-3.49e-02

Pos logit contributions
 injuries+0.014
joy+0.013
awed+0.012
sed+0.012
irmed+0.012
Neg logit contributions
 ring-0.010
 also-0.009
 sign-0.009
 priest-0.009
 pilot-0.009
Token with
Token ID351
Feature activation+9.37e-03

Pos logit contributions
 ring+0.036
 also+0.034
 pilot+0.033
 priest+0.032
 guard+0.031
Neg logit contributions
 injuries-0.087
joy-0.083
awed-0.074
sed-0.072
irmed-0.071
Token a
Token ID257
Feature activation-6.03e-04

Pos logit contributions
 injuries+0.025
joy+0.024
awed+0.023
sed+0.023
uddled+0.022
Neg logit contributions
 ring-0.008
 also-0.008
 sign-0.007
 understand-0.007
 guard-0.007
Token plan
Token ID1410
Feature activation-6.26e-02

Pos logit contributions
 also+0.001
 achieve+0.001
 priest+0.001
 learn+0.001
Queen+0.001
Neg logit contributions
 injuries-0.001
joy-0.001
sed-0.001
less-0.001
awed-0.001
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
 ring+0.054
 also+0.050
 sign+0.050
 guard+0.049
 go+0.047
Neg logit contributions
 injuries-0.169
joy-0.159
awed-0.152
ksh-0.151
irmed-0.150
Token He
Token ID679
Feature activation-7.24e-03

Pos logit contributions
 injuries+0.025
joy+0.024
sed+0.021
irmed+0.021
awed+0.021
Neg logit contributions
 ring-0.012
 also-0.011
 slide-0.011
 sign-0.011
 rob-0.011
Token decided
Token ID3066
Feature activation-3.97e-02

Pos logit contributions
 ring+0.011
 priest+0.010
 sign+0.010
 achieve+0.010
 rob+0.010
Neg logit contributions
 injuries-0.014
joy-0.013
awed-0.012
sed-0.012
irmed-0.012
Token to
Token ID284
Feature activation+1.49e-02

Pos logit contributions
 ring+0.066
 rob+0.065
Queen+0.065
 also+0.064
 priest+0.064
Neg logit contributions
 injuries-0.071
joy-0.066
awed-0.055
sed-0.055
 nowhere-0.051
Token take
Token ID1011
Feature activation-1.35e-02

Pos logit contributions
 injuries+0.027
ishing+0.021
 breaths+0.020
anged+0.020
ened+0.019
Neg logit contributions
 priest-0.021
a-0.019
 pastor-0.019
Queen-0.018
Tab-0.018