TOP ACTIVATIONS
MAX = 4.366

! She had solved the hunter 's problem so quickly .
there were two friends , Shar in and Maria . They
. They were called ost ric hes . Lily and Ben
, a little girl called Pet a was very troubled .
saw them change from run ny to firm . He said
lots of big juicy pine con es , so Penny started
" H oor ay for Blue and his mighty friends !"
f enn el and an ise and car away and c
ostr ich pen . Ost ric hes are big birds with
mice away from the or chard . The moral
They added some cheese , nuts , and dressing . They
us . We have many nuts in our hole . We
looked around the park for nuts and seeds . But a
g um ice cream , mango ice cream , even birthday
thanked him for returning the nut . From that day on
cookies got all s og gy and Jerry was in a
that he had so expert ly cut the little boy 's
wished they had never taken Wally out of the pool .
had wings and legs and antenna e and eyes . I
ostr ich pen . Ost ric hes are big birds with

MIN ACTIVATIONS
MIN = -5.580

to cry and threw a tant rum . She didn 't
kept crying and throwing a tant rum . Finally
She cried and threw a tant rum . Her mom tried
her feet and threw a tant rum . She grabbed the
and started to throw a tant rum . Suddenly , she
and so she threw a tant rum when her parents said
listen . She threw a tant rum in the store .
. He kept throwing a tant rum . His mom was
sad . She threw a tant rum and cried loudly .
surprise . He threw a tant rum . He kicked and
and they played make - bel ieve . Finally , they
stubborn . She throws a tant rum . She kicks and
red cape and make - bel ieve he had super powers
a world of make - bel ieve . She
he was playing make - bel ieve at home , too
his feet and threw a tant rum . "
temper and often threw temper tant rums . One day ,
are you ?" their n anny shouted . She saw them
and helped Tim make the aer opl ane look better .
liked to play make - bel ieve . One

INTERVAL 1.880 - 4.366
CONTAINS 0.887%

she was walking in the woods when she saw someone in
hopped and skipped through the jungle , looking for the special
bird who lived in the forest . Every day it liked
While swimming , the mole met a kind duck .
who lived in a big forest . Every day , the

INTERVAL -0.607 - 1.880
CONTAINS 80.031%

she jumped up and down ! The mom told
adventurous girl . She loved to explore and discover new things
happy and went to tell his mom . He said ,
" Fine . You can have it . But do not
time , there was a little girl named Lily . One

INTERVAL -3.093 - -0.607
CONTAINS 19.038%

, hold on tight !" Ben said . He rubbed the
They looked around the room and saw a hole in
the park and had a fun time . L ila learned
. One day , her mom asked her to go to
Emma ?" Em ma replied , " I love

INTERVAL -5.580 - -3.093
CONTAINS 0.043%

't find her yellow cr ay on , which made her
tried to make his toy car vanish . He closed his
was it ?" Sara 's mom says . "
friend , a colorful cr ay on named Color ful .
she found a big pair of scissors and decided to cut
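
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The sketch below checks that reading. It assumes equal-width binning, which the dump itself does not state; the function name `equal_width_edges` is hypothetical, and the only inputs are the MIN, MAX, and CONTAINS values printed above.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 boundaries of equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as printed in the dump (assumed rounded to three decimals).
edges = equal_width_edges(-5.580, 4.366, 4)

# Interval boundaries as printed in the dump, lowest to highest.
listed = [-5.580, -3.093, -0.607, 1.880, 4.366]

# CONTAINS percentages as printed, ordered from the lowest interval up.
shares = [0.043, 19.038, 80.031, 0.887]

for (lo, hi), share in zip(zip(edges, edges[1:]), shares):
    print(f"{lo:+.4f} .. {hi:+.4f}: {share:.3f}% of activations")

# Largest disagreement between computed and listed boundaries.
max_gap = max(abs(a - b) for a, b in zip(edges, listed))
total_share = sum(shares)
print(f"max boundary gap: {max_gap:.4f}, total share: {total_share:.3f}%")
```

Under this assumption the computed boundaries agree with the listed ones to within rounding, and the four shares sum to just under 100%, so nearly all activations fall in the two middle intervals.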

PER-TOKEN DETAILS

Token!
Token ID0
Feature activation-5.61e-01

Pos logit contributions
bel+0.003
xc+0.003
 stained+0.003
pan+0.002
 ripped+0.002
Neg logit contributions
uth-0.002
uf-0.002
 example-0.002
emy-0.002
2-0.002
Token She
Token ID1375
Feature activation+4.01e-01

Pos logit contributions
bel+0.016
xc+0.015
pan+0.014
 down+0.013
 ripped+0.013
Neg logit contributions
uth-0.016
uf-0.016
emy-0.015
 example-0.014
ith-0.014
Token had
Token ID550
Feature activation+1.70e-01

Pos logit contributions
uth+0.011
uf+0.011
 example+0.010
ith+0.010
2+0.010
Neg logit contributions
xc-0.012
bel-0.012
pan-0.012
 stained-0.011
nam-0.011
Token solved
Token ID16019
Feature activation+1.76e+00

Pos logit contributions
uth+0.004
 example+0.003
uf+0.003
2+0.003
ith+0.003
Neg logit contributions
bel-0.007
xc-0.006
pan-0.006
lights-0.006
 tant-0.006
Token the
Token ID262
Feature activation+1.79e+00

Pos logit contributions
uth+0.053
 example+0.050
uf+0.049
emy+0.047
2+0.046
Neg logit contributions
bel-0.052
xc-0.048
 vacuum-0.045
 stained-0.044
pan-0.044
Token hunter
Token ID19177
Feature activation+4.37e+00

Pos logit contributions
uth+0.051
uf+0.050
emy+0.046
2+0.045
ith+0.044
Neg logit contributions
bel-0.052
 stained-0.049
xc-0.049
 ripped-0.048
 vacuum-0.046
Token's
Token ID338
Feature activation+1.78e+00

Pos logit contributions
uth+0.106
uf+0.105
 example+0.098
2+0.095
emy+0.093
Neg logit contributions
bel-0.147
xc-0.144
 stained-0.139
pan-0.134
 ripped-0.131
Token problem
Token ID1917
Feature activation+1.09e+00

Pos logit contributions
 example+0.041
uth+0.040
uf+0.038
2+0.035
emy+0.035
Neg logit contributions
bel-0.063
pan-0.061
xc-0.060
 stained-0.059
mine-0.057
Token so
Token ID523
Feature activation+2.15e-01

Pos logit contributions
uth+0.024
 example+0.023
uf+0.022
2+0.022
emy+0.020
Neg logit contributions
xc-0.039
bel-0.039
 stained-0.037
 ripped-0.035
pan-0.035
Token quickly
Token ID2952
Feature activation+1.33e+00

Pos logit contributions
uth+0.006
 example+0.005
uf+0.005
emy+0.005
2+0.005
Neg logit contributions
xc-0.007
bel-0.007
pan-0.007
 stained-0.006
 vacuum-0.006
Token.
Token ID13
Feature activation-3.33e-01

Pos logit contributions
 example+0.029
uth+0.029
uf+0.027
ith+0.026
 By+0.026
Neg logit contributions
xc-0.050
bel-0.047
nam-0.045
 stained-0.042
 tant-0.042
Token there
Token ID612
Feature activation-4.09e-01

Pos logit contributions
uth+0.037
Suddenly+0.035
Then+0.035
emy+0.034
 But+0.034
Neg logit contributions
bel-0.033
xc-0.029
pan-0.029
bat-0.026
 vacuum-0.026
Token were
Token ID547
Feature activation-1.17e+00

Pos logit contributions
bel+0.015
pan+0.014
nam+0.013
xc+0.013
 pans+0.013
Neg logit contributions
uth-0.010
 Even-0.009
uf-0.009
 But-0.008
 And-0.008
Token two
Token ID734
Feature activation-1.24e+00

Pos logit contributions
bel+0.044
pan+0.040
 stained+0.039
xc+0.039
 ripped+0.038
Neg logit contributions
uth-0.026
uf-0.025
 example-0.023
emy-0.023
2-0.022
Token friends
Token ID2460
Feature activation+4.55e-01

Pos logit contributions
bel+0.043
xc+0.040
 stained+0.039
pan+0.039
 ripped+0.038
Neg logit contributions
uth-0.031
uf-0.030
emy-0.028
 example-0.027
2-0.027
Token,
Token ID11
Feature activation+1.20e+00

Pos logit contributions
uth+0.011
uf+0.011
emy+0.010
2+0.010
 example+0.010
Neg logit contributions
bel-0.015
xc-0.014
 stained-0.014
 ripped-0.014
pan-0.013
Token Shar
Token ID15465
Feature activation+4.24e+00

Pos logit contributions
uth+0.035
uf+0.033
emy+0.032
2+0.032
 example+0.031
Neg logit contributions
bel-0.037
xc-0.033
pan-0.033
 stained-0.033
 ripped-0.030
Tokenin
Token ID259
Feature activation+1.60e+00

Pos logit contributions
 example+0.131
 Even+0.123
uth+0.122
uf+0.119
 By+0.117
Neg logit contributions
xc-0.109
bel-0.104
bat-0.094
 ripped-0.094
mine-0.092
Token and
Token ID290
Feature activation+5.02e-01

Pos logit contributions
uf+0.050
uth+0.050
nine+0.047
2+0.047
 heights+0.047
Neg logit contributions
bel-0.041
 ripped-0.031
 stained-0.031
igator-0.031
 cr-0.030
Token Maria
Token ID14200
Feature activation+3.92e-02

Pos logit contributions
uth+0.016
uf+0.015
emy+0.014
 example+0.014
2+0.014
Neg logit contributions
bel-0.014
 stained-0.013
xc-0.013
pan-0.012
 ripped-0.012
Token.
Token ID13
Feature activation+1.93e-01

Pos logit contributions
uth+0.001
uf+0.001
emy+0.001
 example+0.001
ith+0.001
Neg logit contributions
bel-0.001
xc-0.001
 stained-0.001
pan-0.001
 ripped-0.001
Token They
Token ID1119
Feature activation-3.13e-01

Pos logit contributions
uf+0.008
is+0.007
abi+0.007
ith+0.007
iley+0.007
Neg logit contributions
 Sarah-0.003
 down-0.003
packs-0.003
bat-0.003
 ripped-0.003
Token.
Token ID13
Feature activation+4.20e-02

Pos logit contributions
uth+0.014
der+0.014
2+0.013
uf+0.012
ith+0.012
Neg logit contributions
xc-0.036
bel-0.035
 stained-0.032
 vacuum-0.031
nam-0.031
Token They
Token ID1119
Feature activation-7.21e-02

Pos logit contributions
uth+0.001
uf+0.001
emy+0.001
 example+0.001
ith+0.001
Neg logit contributions
bel-0.001
 stained-0.001
xc-0.001
 ripped-0.001
pan-0.001
Token were
Token ID547
Feature activation-4.09e-01

Pos logit contributions
xc+0.002
bel+0.002
pan+0.002
 stained+0.002
 unst+0.002
Neg logit contributions
uth-0.002
uf-0.002
2-0.002
emy-0.002
ith-0.002
Token called
Token ID1444
Feature activation+9.19e-01

Pos logit contributions
bel+0.013
xc+0.012
 stained+0.012
pan+0.012
 ripped+0.011
Neg logit contributions
uth-0.011
uf-0.011
 example-0.010
emy-0.010
2-0.010
Token ost
Token ID23619
Feature activation+1.15e+00

Pos logit contributions
uth+0.022
uf+0.021
 example+0.020
2+0.019
emy+0.018
Neg logit contributions
bel-0.033
 stained-0.031
xc-0.031
 vacuum-0.030
 ripped-0.029
Tokenric
Token ID1173
Feature activation+4.23e+00

Pos logit contributions
uf+0.029
uth+0.028
 example+0.028
emy+0.026
Then+0.025
Neg logit contributions
bel-0.036
xc-0.034
 stained-0.033
bat-0.033
 ripped-0.032
Tokenhes
Token ID956
Feature activation-5.26e-04

Pos logit contributions
uth+0.130
uf+0.126
2+0.119
emy+0.119
 example+0.118
Neg logit contributions
bel-0.114
xc-0.108
 stained-0.105
 ripped-0.100
pan-0.099
Token.
Token ID13
Feature activation+3.89e-01

Pos logit contributions
xc+0.000
nam+0.000
bel+0.000
 swollen+0.000
 stained+0.000
Neg logit contributions
uth-0.000
 example-0.000
uf-0.000
der-0.000
:-0.000
Token Lily
Token ID20037
Feature activation-1.00e-01

Pos logit contributions
uth+0.009
uf+0.009
 example+0.009
emy+0.009
ith+0.008
Neg logit contributions
bel-0.012
xc-0.011
 stained-0.011
 ripped-0.011
pan-0.011
Token and
Token ID290
Feature activation-4.74e-01

Pos logit contributions
bel+0.003
xc+0.003
 stained+0.003
pan+0.003
 vacuum+0.003
Neg logit contributions
uth-0.003
uf-0.003
emy-0.002
2-0.002
 example-0.002
Token Ben
Token ID3932
Feature activation-3.51e-01

Pos logit contributions
bel+0.016
xc+0.015
 stained+0.015
pan+0.014
 ripped+0.014
Neg logit contributions
uth-0.012
uf-0.012
emy-0.011
 example-0.010
2-0.010
Token,
Token ID11
Feature activation+1.21e+00

Pos logit contributions
bel+0.017
 stained+0.016
pan+0.015
xc+0.015
 vacuum+0.015
Neg logit contributions
uth-0.011
uf-0.010
emy-0.010
 example-0.009
2-0.009
Token a
Token ID257
Feature activation-2.63e-01

Pos logit contributions
uth+0.041
emy+0.039
Suddenly+0.038
Then+0.038
uf+0.038
Neg logit contributions
bel-0.030
xc-0.027
pan-0.027
 vacuum-0.024
 stained-0.023
Token little
Token ID1310
Feature activation+5.71e-01

Pos logit contributions
 ripped+0.004
bel+0.004
 messy+0.004
 stained+0.004
 dirty+0.004
Neg logit contributions
der-0.010
 example-0.010
 habitats-0.010
uth-0.010
 heights-0.010
Token girl
Token ID2576
Feature activation+6.89e-01

Pos logit contributions
uth+0.020
uf+0.019
 example+0.018
emy+0.018
2+0.018
Neg logit contributions
bel-0.013
 stained-0.012
xc-0.012
 ripped-0.012
 towel-0.011
Token called
Token ID1444
Feature activation+1.21e-01

Pos logit contributions
uth+0.016
uf+0.016
emy+0.014
 example+0.014
2+0.014
Neg logit contributions
bel-0.024
 stained-0.022
xc-0.022
 ripped-0.022
 vacuum-0.021
Token Pet
Token ID4767
Feature activation+4.08e+00

Pos logit contributions
uth+0.004
uf+0.004
 example+0.003
emy+0.003
2+0.003
Neg logit contributions
bel-0.004
xc-0.003
 stained-0.003
pan-0.003
 swollen-0.003
Tokena
Token ID64
Feature activation+9.45e-02

Pos logit contributions
 example+0.121
der+0.114
uth+0.113
uf+0.113
 By+0.113
Neg logit contributions
bel-0.105
 down-0.105
led-0.105
xc-0.104
 ripped-0.101
Token was
Token ID373
Feature activation-2.53e-01

Pos logit contributions
uth+0.003
uf+0.003
 example+0.003
2+0.003
emy+0.003
Neg logit contributions
bel-0.002
 stained-0.002
 ripped-0.002
xc-0.002
 vacuum-0.002
Token very
Token ID845
Feature activation-7.69e-01

Pos logit contributions
bel+0.007
xc+0.006
 stained+0.006
 ripped+0.006
pan+0.006
Neg logit contributions
uth-0.008
uf-0.008
emy-0.007
 example-0.007
2-0.007
Token troubled
Token ID17840
Feature activation-3.31e-01

Pos logit contributions
bel+0.020
 stained+0.018
xc+0.018
 ripped+0.017
 vacuum+0.017
Neg logit contributions
uth-0.025
uf-0.024
 example-0.023
emy-0.023
2-0.022
Token.
Token ID13
Feature activation-2.46e-01

Pos logit contributions
bel+0.010
xc+0.009
 stained+0.009
 ripped+0.009
pan+0.009
Neg logit contributions
uth-0.009
uf-0.009
 example-0.009
emy-0.009
2-0.008
Token saw
Token ID2497
Feature activation-7.37e-01

Pos logit contributions
uth+0.031
uf+0.031
2+0.029
emy+0.029
 example+0.029
Neg logit contributions
bel-0.025
xc-0.024
 stained-0.022
pan-0.022
 ripped-0.021
Token them
Token ID606
Feature activation+8.67e-02

Pos logit contributions
xc+0.026
bel+0.025
days+0.023
 unst+0.022
 stained+0.022
Neg logit contributions
uf-0.016
uth-0.016
 example-0.016
emy-0.016
der-0.016
Token change
Token ID1487
Feature activation-3.66e-03

Pos logit contributions
uth+0.002
2+0.002
 example+0.002
ith+0.002
 violet+0.002
Neg logit contributions
xc-0.003
bel-0.003
 tant-0.003
robe-0.003
pan-0.003
Token from
Token ID422
Feature activation-1.76e-01

Pos logit contributions
xc+0.000
cher+0.000
bel+0.000
 Mechan+0.000
pan+0.000
Neg logit contributions
uth-0.000
 example-0.000
ith-0.000
abi-0.000
alph-0.000
Token run
Token ID1057
Feature activation+3.96e-01

Pos logit contributions
bel+0.006
xc+0.005
robe+0.005
cher+0.004
pan+0.004
Neg logit contributions
 example-0.005
uth-0.005
uf-0.005
abi-0.004
emy-0.004
Tokenny
Token ID3281
Feature activation+4.06e+00

Pos logit contributions
 example+0.008
2+0.007
ith+0.007
 But+0.006
aff+0.006
Neg logit contributions
xc-0.015
 ah-0.014
 swollen-0.014
nam-0.013
bel-0.013
Token to
Token ID284
Feature activation+7.15e-01

Pos logit contributions
uth+0.078
 example+0.073
uf+0.072
ith+0.068
osa+0.065
Neg logit contributions
xc-0.162
bel-0.149
 swollen-0.142
igator-0.141
 stained-0.140
Token firm
Token ID4081
Feature activation+1.39e+00

Pos logit contributions
uth+0.022
uf+0.022
 example+0.021
2+0.020
emy+0.020
Neg logit contributions
bel-0.020
xc-0.019
 stained-0.018
 ripped-0.017
pan-0.017
Token.
Token ID13
Feature activation+7.12e-02

Pos logit contributions
 example+0.030
uth+0.026
uf+0.023
ith+0.023
osa+0.022
Neg logit contributions
xc-0.054
 swollen-0.051
igator-0.050
bel-0.050
 stained-0.049
Token He
Token ID679
Feature activation+6.26e-01

Pos logit contributions
uth+0.002
uf+0.002
 example+0.001
emy+0.001
2+0.001
Neg logit contributions
bel-0.003
 stained-0.002
xc-0.002
 ripped-0.002
pan-0.002
Token said
Token ID531
Feature activation+6.10e-01

Pos logit contributions
uth+0.021
uf+0.020
emy+0.019
 example+0.019
2+0.019
Neg logit contributions
bel-0.016
xc-0.015
 stained-0.014
pan-0.014
 ripped-0.013
Token lots
Token ID6041
Feature activation-5.17e-01

Pos logit contributions
uth+0.005
uf+0.005
emy+0.005
 example+0.005
2+0.005
Neg logit contributions
bel-0.006
xc-0.006
 stained-0.006
pan-0.005
 ripped-0.005
Token of
Token ID286
Feature activation-1.26e+00

Pos logit contributions
bel+0.010
 stained+0.009
 ripped+0.008
 vacuum+0.008
 towel+0.008
Neg logit contributions
uth-0.020
uf-0.020
 example-0.019
2-0.019
emy-0.019
Token big
Token ID1263
Feature activation-3.69e-01

Pos logit contributions
 stained+0.030
bel+0.029
 ripped+0.028
 muddy+0.027
 cloth+0.027
Neg logit contributions
uf-0.042
uth-0.041
2-0.039
 example-0.039
avi-0.038
Token juicy
Token ID35075
Feature activation+2.13e+00

Pos logit contributions
 stained+0.009
 ripped+0.009
bel+0.008
 swollen+0.008
 muddy+0.008
Neg logit contributions
uf-0.012
uth-0.012
 example-0.011
2-0.011
avi-0.010
Token pine
Token ID20161
Feature activation+9.62e-01

Pos logit contributions
uf+0.074
uth+0.072
2+0.069
 example+0.069
avi+0.067
Neg logit contributions
 stained-0.046
 down-0.044
bel-0.044
 ripped-0.043
 cloth-0.043
Tokencon
Token ID1102
Feature activation+4.03e+00

Pos logit contributions
uth+0.030
 example+0.030
uf+0.028
alph+0.027
osa+0.026
Neg logit contributions
bel-0.023
 stained-0.022
 ripped-0.022
bat-0.022
 down-0.021
Tokenes
Token ID274
Feature activation+7.54e-02

Pos logit contributions
uth+0.150
uf+0.147
emy+0.139
 example+0.139
2+0.137
Neg logit contributions
bel-0.085
xc-0.077
 stained-0.075
 ripped-0.071
pan-0.071
Token,
Token ID11
Feature activation+3.45e-01

Pos logit contributions
uth+0.001
uf+0.001
 example+0.001
emy+0.001
2+0.001
Neg logit contributions
bel-0.003
xc-0.003
 stained-0.003
 vacuum-0.003
pan-0.003
Token so
Token ID523
Feature activation-3.40e-01

Pos logit contributions
uth+0.008
uf+0.008
emy+0.008
 example+0.008
2+0.007
Neg logit contributions
bel-0.012
xc-0.011
 stained-0.011
pan-0.010
 ripped-0.010
Token Penny
Token ID25965
Feature activation+1.14e+00

Pos logit contributions
bel+0.009
xc+0.008
 stained+0.008
pan+0.008
 vacuum+0.007
Neg logit contributions
uth-0.011
uf-0.011
emy-0.010
 example-0.010
2-0.010
Token started
Token ID2067
Feature activation+1.38e+00

Pos logit contributions
uth+0.034
uf+0.033
emy+0.031
2+0.031
ith+0.031
Neg logit contributions
bel-0.033
xc-0.031
pan-0.030
 stained-0.029
lights-0.028
Token "
Token ID366
Feature activation+1.93e-01

Pos logit contributions
uth+0.005
uf+0.005
emy+0.005
 example+0.005
2+0.005
Neg logit contributions
bel-0.004
xc-0.004
 ripped-0.004
 stained-0.004
pan-0.004
TokenH
Token ID39
Feature activation+7.39e-01

Pos logit contributions
uth+0.005
uf+0.005
emy+0.005
 example+0.005
2+0.005
Neg logit contributions
bel-0.006
xc-0.006
 stained-0.006
 ripped-0.005
pan-0.005
Tokenoor
Token ID2675
Feature activation+3.02e-01

Pos logit contributions
uth+0.019
 example+0.018
 violet+0.018
uf+0.018
abi+0.018
Neg logit contributions
xc-0.020
bel-0.019
 unst-0.019
boat-0.019
igator-0.019
Tokenay
Token ID323
Feature activation-1.90e+00

Pos logit contributions
uth+0.012
uf+0.012
emy+0.012
 example+0.011
2+0.011
Neg logit contributions
bel-0.005
xc-0.005
 stained-0.004
 ripped-0.004
pan-0.004
Token for
Token ID329
Feature activation+4.96e-01

Pos logit contributions
bel+0.074
nam+0.072
 stained+0.068
lights+0.066
 tant+0.065
Neg logit contributions
uth-0.040
uf-0.040
 example-0.038
emy-0.037
ith-0.036
Token Blue
Token ID4518
Feature activation+4.01e+00

Pos logit contributions
uth+0.014
uf+0.014
 example+0.014
emy+0.013
2+0.013
Neg logit contributions
bel-0.015
xc-0.014
 stained-0.013
pan-0.013
 ripped-0.012
Token and
Token ID290
Feature activation-2.86e-01

Pos logit contributions
uth+0.092
uf+0.087
emy+0.084
 example+0.075
osa+0.075
Neg logit contributions
 stained-0.147
bel-0.145
nam-0.144
 vacuum-0.142
xc-0.141
Token his
Token ID465
Feature activation+2.36e-01

Pos logit contributions
bel+0.010
xc+0.010
pan+0.009
 stained+0.009
lights+0.009
Neg logit contributions
 example-0.006
 Even-0.006
 But-0.006
emy-0.006
uth-0.005
Token mighty
Token ID18680
Feature activation-4.80e-01

Pos logit contributions
uth+0.006
uf+0.006
emy+0.006
 example+0.006
2+0.006
Neg logit contributions
bel-0.008
xc-0.007
 stained-0.007
pan-0.007
 vacuum-0.007
Token friends
Token ID2460
Feature activation-3.87e-02

Pos logit contributions
bel+0.010
xc+0.009
 ripped+0.009
 stained+0.009
 unst+0.008
Neg logit contributions
uth-0.018
uf-0.017
2-0.016
osa-0.016
ith-0.016
Token!"
Token ID2474
Feature activation+7.31e-01

Pos logit contributions
bel+0.001
xc+0.001
lights+0.001
 stained+0.001
 pans+0.001
Neg logit contributions
uth-0.001
uf-0.001
 knows-0.001
 example-0.001
 Even-0.001
Token f
Token ID277
Feature activation+2.05e-01

Pos logit contributions
uf+0.019
uth+0.019
 example+0.018
2+0.018
ith+0.017
Neg logit contributions
bel-0.023
xc-0.023
lights-0.022
 stained-0.022
pan-0.021
Tokenenn
Token ID1697
Feature activation+5.36e-02

Pos logit contributions
uth+0.006
uf+0.006
2+0.006
ith+0.006
 example+0.006
Neg logit contributions
bel-0.005
 stained-0.005
 ripped-0.005
 vacuum-0.005
nam-0.005
Tokenel
Token ID417
Feature activation-6.54e-01

Pos logit contributions
uth+0.002
uf+0.002
 example+0.002
emy+0.001
ith+0.001
Neg logit contributions
bel-0.002
 stained-0.001
nam-0.001
 vacuum-0.001
 ripped-0.001
Token and
Token ID290
Feature activation+5.79e-01

Pos logit contributions
bel+0.022
nam+0.022
 stained+0.021
xc+0.020
 vacuum+0.020
Neg logit contributions
uth-0.016
 example-0.015
uf-0.015
ith-0.015
2-0.014
Token an
Token ID281
Feature activation-7.86e-01

Pos logit contributions
uf+0.014
 example+0.014
uth+0.014
2+0.013
 knows+0.013
Neg logit contributions
bel-0.018
xc-0.018
lights-0.017
days-0.016
 stained-0.016
Tokenise
Token ID786
Feature activation+4.00e+00

Pos logit contributions
bel+0.027
 stained+0.024
xc+0.024
pan+0.023
nam+0.023
Neg logit contributions
uth-0.021
uf-0.020
 example-0.020
emy-0.019
2-0.018
Token and
Token ID290
Feature activation+3.78e-01

Pos logit contributions
uth+0.092
uf+0.086
 example+0.085
ith+0.084
2+0.079
Neg logit contributions
bel-0.141
nam-0.137
 stained-0.137
xc-0.130
 vacuum-0.128
Token car
Token ID1097
Feature activation-1.10e+00

Pos logit contributions
uf+0.009
uth+0.009
 example+0.009
2+0.009
 knows+0.008
Neg logit contributions
bel-0.012
xc-0.012
lights-0.011
days-0.011
 stained-0.010
Tokenaway
Token ID8272
Feature activation-8.47e-01

Pos logit contributions
nam+0.041
 stained+0.039
xc+0.038
 muddy+0.037
bel+0.036
Neg logit contributions
 example-0.024
uth-0.023
uf-0.021
der-0.021
 And-0.020
Token and
Token ID290
Feature activation+2.81e-01

Pos logit contributions
bel+0.025
 stained+0.025
xc+0.024
 vacuum+0.024
nam+0.024
Neg logit contributions
uth-0.024
 example-0.022
uf-0.022
emy-0.021
2-0.020
Token c
Token ID269
Feature activation+1.12e+00

Pos logit contributions
uf+0.007
uth+0.007
 example+0.007
2+0.006
 knows+0.006
Neg logit contributions
bel-0.009
xc-0.009
lights-0.008
days-0.008
ivery-0.008
Token ostr
Token ID42387
Feature activation+7.13e-01

Pos logit contributions
uth+0.007
uf+0.007
emy+0.006
 example+0.006
2+0.006
Neg logit contributions
bel-0.018
xc-0.017
 stained-0.017
 ripped-0.016
pan-0.016
Tokenich
Token ID488
Feature activation+9.07e-01

Pos logit contributions
 example+0.018
uf+0.017
uth+0.017
 By+0.017
 Even+0.016
Neg logit contributions
bel-0.024
xc-0.021
bat-0.020
 ripped-0.019
 stained-0.018
Token pen
Token ID3112
Feature activation+6.90e-01

Pos logit contributions
uf+0.022
uth+0.022
 example+0.020
2+0.020
emy+0.019
Neg logit contributions
bel-0.031
xc-0.029
 stained-0.028
 ripped-0.027
pan-0.027
Token.
Token ID13
Feature activation+1.28e-01

Pos logit contributions
uth+0.010
uf+0.010
der+0.010
 example+0.010
2+0.009
Neg logit contributions
bel-0.029
xc-0.028
 unst-0.027
 swollen-0.026
 stained-0.026
Token Ost
Token ID38919
Feature activation+9.05e-01

Pos logit contributions
uth+0.003
uf+0.003
emy+0.002
 example+0.002
2+0.002
Neg logit contributions
bel-0.005
xc-0.005
 stained-0.004
pan-0.004
 ripped-0.004
Tokenric
Token ID1173
Feature activation+3.95e+00

Pos logit contributions
uth+0.025
emy+0.024
uf+0.023
 example+0.022
2+0.022
Neg logit contributions
xc-0.026
bel-0.026
 stained-0.025
mine-0.025
 ripped-0.024
Tokenhes
Token ID956
Feature activation-4.26e-01

Pos logit contributions
uth+0.130
uf+0.128
 example+0.120
emy+0.120
2+0.117
Neg logit contributions
bel-0.099
xc-0.091
 stained-0.090
 ripped-0.086
pan-0.086
Token are
Token ID389
Feature activation-2.02e-01

Pos logit contributions
xc+0.017
bel+0.017
nam+0.017
 swollen+0.015
 pans+0.015
Neg logit contributions
:-0.007
 learns-0.007
 knows-0.007
uth-0.007
uf-0.007
Token big
Token ID1263
Feature activation+4.61e-03

Pos logit contributions
xc+0.007
bel+0.007
pan+0.006
mine+0.006
days+0.006
Neg logit contributions
uth-0.005
uf-0.005
emy-0.004
 example-0.004
2-0.004
Token birds
Token ID10087
Feature activation+1.25e+00

Pos logit contributions
uth+0.000
uf+0.000
emy+0.000
 example+0.000
2+0.000
Neg logit contributions
bel-0.000
xc-0.000
 stained-0.000
pan-0.000
 vacuum-0.000
Token with
Token ID351
Feature activation+3.01e-01

Pos logit contributions
uth+0.027
uf+0.026
emy+0.023
ith+0.023
2+0.023
Neg logit contributions
bel-0.047
xc-0.046
 stained-0.042
pan-0.040
 ripped-0.040
Token mice
Token ID10693
Feature activation+2.60e+00

Pos logit contributions
bel+0.005
xc+0.005
 stained+0.004
 ripped+0.004
pan+0.004
Neg logit contributions
uth-0.005
uf-0.005
 example-0.005
emy-0.005
2-0.004
Token away
Token ID1497
Feature activation+1.12e+00

Pos logit contributions
uf+0.049
 example+0.048
uth+0.048
 By+0.044
alph+0.042
Neg logit contributions
xc-0.100
bel-0.097
 stained-0.095
lights-0.091
 swollen-0.090
Token from
Token ID422
Feature activation+4.60e-01

Pos logit contributions
uth+0.020
 example+0.019
uf+0.019
der+0.017
 By+0.017
Neg logit contributions
xc-0.047
bel-0.042
 stained-0.042
 vacuum-0.040
 ripped-0.040
Token the
Token ID262
Feature activation+1.18e+00

Pos logit contributions
 example+0.013
uth+0.013
uf+0.013
emy+0.012
alph+0.011
Neg logit contributions
bel-0.014
xc-0.013
 stained-0.012
 ripped-0.012
 vacuum-0.012
Token or
Token ID393
Feature activation+6.05e-01

Pos logit contributions
uf+0.026
uth+0.026
 example+0.025
emy+0.023
2+0.022
Neg logit contributions
bel-0.043
xc-0.043
 stained-0.040
pan-0.039
 ripped-0.038
Tokenchard
Token ID30215
Feature activation+3.92e+00

Pos logit contributions
 example+0.014
uf+0.013
uth+0.013
 vet+0.013
 Chuck+0.012
Neg logit contributions
xc-0.021
iddled-0.018
bel-0.018
 stained-0.017
pan-0.017
Token.
Token ID13
Feature activation-5.24e-01

Pos logit contributions
uf+0.097
uth+0.097
 example+0.087
emy+0.085
2+0.083
Neg logit contributions
xc-0.135
bel-0.131
 stained-0.126
pan-0.121
 vacuum-0.119
Token
Token ID198
Feature activation-3.10e-01

Pos logit contributions
bel+0.016
xc+0.015
pan+0.014
 unst+0.014
 stained+0.014
Neg logit contributions
uth-0.013
uf-0.012
 example-0.012
emy-0.012
ith-0.012
Token
Token ID198
Feature activation-2.25e-01

Pos logit contributions
bel+0.011
xc+0.009
 stained+0.009
pan+0.009
 towel+0.009
Neg logit contributions
uth-0.007
uf-0.007
 example-0.007
2-0.006
ith-0.006
TokenThe
Token ID464
Feature activation+8.03e-01

Pos logit contributions
bel+0.008
xc+0.007
 stained+0.007
 ripped+0.007
pan+0.007
Neg logit contributions
uth-0.005
uf-0.005
emy-0.005
 example-0.005
2-0.005
Token moral
Token ID6573
Feature activation+1.21e+00

Pos logit contributions
uth+0.017
uf+0.016
emy+0.014
2+0.014
aja+0.014
Neg logit contributions
bel-0.029
 stained-0.027
xc-0.027
 ripped-0.027
 vacuum-0.027
Token They
Token ID1119
Feature activation+5.77e-01

Pos logit contributions
uth+0.015
uf+0.015
emy+0.013
 example+0.013
2+0.013
Neg logit contributions
bel-0.020
 stained-0.019
 ripped-0.018
pan-0.018
xc-0.018
Token added
Token ID2087
Feature activation+3.74e-01

Pos logit contributions
uf+0.015
uth+0.014
2+0.013
ith+0.013
 also+0.013
Neg logit contributions
xc-0.018
bel-0.018
adoes-0.016
pan-0.016
lights-0.016
Token some
Token ID617
Feature activation-4.93e-02

Pos logit contributions
 example+0.007
uth+0.006
ith+0.006
uf+0.006
2+0.006
Neg logit contributions
xc-0.014
bel-0.014
days-0.013
spe-0.013
nam-0.013
Token cheese
Token ID9891
Feature activation+2.81e+00

Pos logit contributions
xc+0.002
bel+0.002
pan+0.002
mine+0.002
bat+0.002
Neg logit contributions
ith-0.001
uth-0.001
 example-0.001
uf-0.001
emy-0.001
Token,
Token ID11
Feature activation+9.63e-01

Pos logit contributions
uth+0.039
 example+0.038
uf+0.036
emy+0.036
ith+0.034
Neg logit contributions
xc-0.123
nam-0.117
bel-0.116
 stained-0.110
 swollen-0.109
Token nuts
Token ID14380
Feature activation+3.92e+00

Pos logit contributions
uth+0.029
uf+0.029
emy+0.027
 example+0.027
2+0.026
Neg logit contributions
bel-0.027
 stained-0.025
xc-0.024
 ripped-0.024
pan-0.024
Token,
Token ID11
Feature activation+9.66e-01

Pos logit contributions
uth+0.059
uf+0.055
 example+0.053
emy+0.052
2+0.051
Neg logit contributions
xc-0.169
bel-0.161
nam-0.154
 stained-0.148
 vacuum-0.145
Token and
Token ID290
Feature activation+8.33e-02

Pos logit contributions
uth+0.026
uf+0.025
emy+0.023
 example+0.023
2+0.023
Neg logit contributions
bel-0.031
xc-0.029
 stained-0.028
 ripped-0.027
pan-0.027
Token dressing
Token ID18544
Feature activation-3.84e-01

Pos logit contributions
 example+0.002
uth+0.002
2+0.002
emy+0.002
nine+0.002
Neg logit contributions
xc-0.003
bel-0.003
robe-0.003
mine-0.002
bat-0.002
Token.
Token ID13
Feature activation+4.23e-01

Pos logit contributions
xc+0.013
bel+0.013
 swollen+0.013
 vacuum+0.012
 stained+0.012
Neg logit contributions
uth-0.008
abi-0.008
der-0.008
 example-0.008
2-0.007
Token They
Token ID1119
Feature activation+5.96e-01

Pos logit contributions
uf+0.011
uth+0.011
ith+0.011
emy+0.011
 example+0.010
Neg logit contributions
bel-0.013
 stained-0.012
 ripped-0.012
 sponge-0.011
pan-0.011
Token us
Token ID514
Feature activation-1.15e+00

Pos logit contributions
uth+0.020
uf+0.020
 example+0.019
emy+0.019
2+0.017
Neg logit contributions
bel-0.030
xc-0.028
 vacuum-0.027
 stained-0.026
pan-0.025
Token.
Token ID13
Feature activation+4.94e-01

Pos logit contributions
bel+0.046
xc+0.045
 stained+0.044
 vacuum+0.043
 swollen+0.043
Neg logit contributions
uth-0.021
 example-0.020
uf-0.019
osa-0.017
emy-0.016
Token We
Token ID775
Feature activation+5.97e-01

Pos logit contributions
uth+0.013
uf+0.013
ith+0.012
 example+0.012
emy+0.012
Neg logit contributions
bel-0.014
 ripped-0.013
xc-0.013
 stained-0.013
 vacuum-0.013
Token have
Token ID423
Feature activation+2.00e-01

Pos logit contributions
uth+0.017
uf+0.017
 example+0.015
emy+0.015
2+0.015
Neg logit contributions
bel-0.018
xc-0.017
 stained-0.016
pan-0.016
 ripped-0.015
Token many
Token ID867
Feature activation-8.45e-01

Pos logit contributions
uth+0.005
uf+0.005
 example+0.005
emy+0.004
ith+0.004
Neg logit contributions
bel-0.007
xc-0.007
pan-0.006
 vacuum-0.006
robe-0.006
Token nuts
Token ID14380
Feature activation+3.90e+00

Pos logit contributions
bel+0.029
xc+0.028
pan+0.027
 stained+0.026
 vacuum+0.026
Neg logit contributions
uth-0.021
uf-0.020
emy-0.019
 example-0.019
ith-0.018
Token in
Token ID287
Feature activation+1.07e+00

Pos logit contributions
uth+0.074
uf+0.071
 example+0.069
emy+0.066
2+0.063
Neg logit contributions
xc-0.153
bel-0.145
 vacuum-0.138
 stained-0.138
 swollen-0.134
Token our
Token ID674
Feature activation-1.10e+00

Pos logit contributions
uth+0.030
 example+0.029
uf+0.028
emy+0.027
alph+0.027
Neg logit contributions
bel-0.035
xc-0.032
 vacuum-0.029
pan-0.029
robe-0.029
Token hole
Token ID7604
Feature activation+5.98e-01

Pos logit contributions
bel+0.036
xc+0.035
pan+0.034
 stained+0.033
 vacuum+0.032
Neg logit contributions
uth-0.029
uf-0.029
emy-0.027
 example-0.027
ith-0.025
Token.
Token ID13
Feature activation+6.61e-01

Pos logit contributions
uth+0.010
 example+0.010
der+0.010
2+0.008
ith+0.008
Neg logit contributions
xc-0.024
bel-0.023
 vacuum-0.022
nam-0.022
 swollen-0.022
Token We
Token ID775
Feature activation+9.26e-01

Pos logit contributions
uth+0.020
uf+0.019
ith+0.019
emy+0.019
iley+0.018
Neg logit contributions
bel-0.016
 ripped-0.015
 sponge-0.015
 stained-0.015
 vacuum-0.014
Token looked
Token ID3114
Feature activation+9.41e-01

Pos logit contributions
bel+0.015
xc+0.014
pan+0.013
 stained+0.013
lights+0.013
Neg logit contributions
uf-0.015
uth-0.015
2-0.014
emy-0.014
ith-0.014
Token around
Token ID1088
Feature activation-8.60e-01

Pos logit contributions
uth+0.025
uf+0.024
emy+0.024
 example+0.023
ith+0.023
Neg logit contributions
bel-0.029
xc-0.028
pan-0.026
 stained-0.026
 ripped-0.025
Token the
Token ID262
Feature activation-1.09e-01

Pos logit contributions
bel+0.029
xc+0.029
 stained+0.028
pan+0.027
 ripped+0.027
Neg logit contributions
uth-0.020
uf-0.020
 example-0.019
emy-0.018
2-0.018
Token park
Token ID3952
Feature activation+6.29e-01

Pos logit contributions
bel+0.003
 stained+0.003
 ripped+0.003
xc+0.003
 towel+0.003
Neg logit contributions
uth-0.003
uf-0.003
 example-0.003
emy-0.002
2-0.002
Token for
Token ID329
Feature activation+3.81e-01

Pos logit contributions
uth+0.014
uf+0.013
 example+0.013
2+0.013
emy+0.012
Neg logit contributions
bel-0.022
xc-0.022
 stained-0.021
pan-0.020
 vacuum-0.020
Token nuts
Token ID14380
Feature activation+3.89e+00

Pos logit contributions
uth+0.013
uf+0.013
ith+0.012
emy+0.012
2+0.011
Neg logit contributions
bel-0.009
 stained-0.008
 ripped-0.008
xc-0.008
 vacuum-0.008
Token and
Token ID290
Feature activation+8.15e-01

Pos logit contributions
uth+0.087
uf+0.086
emy+0.081
 example+0.078
2+0.078
Neg logit contributions
bel-0.137
xc-0.134
 stained-0.126
 vacuum-0.121
 ripped-0.119
Token seeds
Token ID11904
Feature activation+2.04e+00

Pos logit contributions
uth+0.024
uf+0.024
 example+0.022
emy+0.022
2+0.022
Neg logit contributions
bel-0.023
xc-0.022
 stained-0.021
pan-0.020
 ripped-0.020
Token.
Token ID13
Feature activation-4.30e-03

Pos logit contributions
uth+0.031
uf+0.030
emy+0.030
ith+0.028
der+0.028
Neg logit contributions
xc-0.085
bel-0.080
 stained-0.076
 vacuum-0.075
nam-0.073
Token But
Token ID887
Feature activation+8.30e-01

Pos logit contributions
bel+0.000
 ripped+0.000
 unst+0.000
 stained+0.000
xc+0.000
Neg logit contributions
uf-0.000
uth-0.000
iley-0.000
ith-0.000
osa-0.000
Token a
Token ID257
Feature activation-4.67e-01

Pos logit contributions
uth+0.021
 example+0.020
uf+0.020
emy+0.020
2+0.020
Neg logit contributions
bel-0.027
xc-0.025
 stained-0.025
pan-0.024
 vacuum-0.023
Tokeng
Token ID70
Feature activation+9.23e-01

Pos logit contributions
nam+0.008
xc+0.007
 stained+0.007
 ripped+0.007
 messed+0.006
Neg logit contributions
uth-0.005
 example-0.005
ith-0.005
is-0.005
2-0.005
Tokenum
Token ID388
Feature activation+8.27e-01

Pos logit contributions
uth+0.024
uf+0.024
 example+0.022
emy+0.022
2+0.022
Neg logit contributions
bel-0.029
xc-0.028
 stained-0.027
pan-0.026
 ripped-0.026
Token ice
Token ID4771
Feature activation+9.03e-01

Pos logit contributions
uth+0.018
uf+0.017
ith+0.017
2+0.016
 example+0.016
Neg logit contributions
bel-0.029
xc-0.029
 stained-0.028
pan-0.028
nam-0.028
Token cream
Token ID8566
Feature activation+1.48e+00

Pos logit contributions
uth+0.028
uf+0.027
 example+0.027
2+0.026
ith+0.026
Neg logit contributions
xc-0.023
bel-0.022
 stained-0.022
nam-0.021
 ripped-0.021
Token,
Token ID11
Feature activation+3.69e-01

Pos logit contributions
 example+0.023
uth+0.023
2+0.023
uf+0.022
der+0.020
Neg logit contributions
xc-0.060
bel-0.058
 stained-0.057
nam-0.056
 swollen-0.055
Token mango
Token ID49364
Feature activation+3.88e+00

Pos logit contributions
uth+0.009
uf+0.008
2+0.008
 example+0.008
emy+0.007
Neg logit contributions
bel-0.013
xc-0.013
 stained-0.012
 ripped-0.011
pan-0.011
Token ice
Token ID4771
Feature activation+3.87e-01

Pos logit contributions
uth+0.095
uf+0.089
2+0.088
emy+0.083
ith+0.080
Neg logit contributions
xc-0.134
bel-0.130
 stained-0.126
nam-0.124
 vacuum-0.121
Token cream
Token ID8566
Feature activation+1.07e+00

Pos logit contributions
uth+0.014
uf+0.013
 example+0.013
2+0.013
ith+0.013
Neg logit contributions
xc-0.008
bel-0.008
 stained-0.008
 ripped-0.007
pan-0.007
Token,
Token ID11
Feature activation+4.92e-01

Pos logit contributions
uth+0.019
uf+0.019
2+0.019
 example+0.018
der+0.017
Neg logit contributions
xc-0.042
bel-0.040
 stained-0.039
 swollen-0.038
nam-0.038
Token even
Token ID772
Feature activation+9.62e-01

Pos logit contributions
uth+0.009
uf+0.009
2+0.009
 example+0.009
Suddenly+0.008
Neg logit contributions
xc-0.019
bel-0.019
 stained-0.017
 ripped-0.017
pan-0.017
Token birthday
Token ID10955
Feature activation-1.47e+00

Pos logit contributions
uth+0.021
 example+0.020
2+0.020
uf+0.019
nine+0.019
Neg logit contributions
bel-0.034
xc-0.033
pan-0.031
 stained-0.031
 ripped-0.030
Token thanked
Token ID26280
Feature activation-5.58e-01

Pos logit contributions
uth+0.002
 example+0.002
 Even+0.002
emy+0.002
uf+0.002
Neg logit contributions
xc-0.002
bel-0.002
 stained-0.002
pan-0.002
lights-0.002
Token him
Token ID683
Feature activation-2.18e-01

Pos logit contributions
bel+0.011
xc+0.010
 stained+0.010
pan+0.009
 ripped+0.009
Neg logit contributions
uth-0.021
uf-0.021
emy-0.020
 example-0.020
2-0.020
Token for
Token ID329
Feature activation-7.55e-02

Pos logit contributions
bel+0.007
 stained+0.007
xc+0.007
 ripped+0.007
pan+0.007
Neg logit contributions
uth-0.005
uf-0.005
 example-0.005
emy-0.005
2-0.005
Token returning
Token ID8024
Feature activation+9.09e-01

Pos logit contributions
bel+0.002
xc+0.002
 stained+0.002
lights+0.002
 ripped+0.002
Neg logit contributions
 example-0.002
uf-0.002
uth-0.002
emy-0.002
2-0.002
Token the
Token ID262
Feature activation+1.29e+00

Pos logit contributions
uth+0.030
uf+0.030
 example+0.028
emy+0.028
2+0.027
Neg logit contributions
bel-0.023
xc-0.021
 stained-0.021
pan-0.020
 vacuum-0.020
Token nut
Token ID6701
Feature activation+3.87e+00

Pos logit contributions
uth+0.037
uf+0.036
2+0.033
emy+0.033
 example+0.033
Neg logit contributions
bel-0.037
 stained-0.035
xc-0.035
 ripped-0.035
 towel-0.033
Token.
Token ID13
Feature activation-1.86e-01

Pos logit contributions
uth+0.070
uf+0.068
 example+0.065
emy+0.064
ith+0.061
Neg logit contributions
xc-0.150
bel-0.149
 stained-0.143
 ah-0.138
 vacuum-0.138
Token From
Token ID3574
Feature activation+1.88e+00

Pos logit contributions
bel+0.006
xc+0.005
 ripped+0.005
 down+0.005
 stained+0.005
Neg logit contributions
uth-0.005
 example-0.004
uf-0.004
ith-0.004
2-0.004
Token that
Token ID326
Feature activation+2.82e-01

Pos logit contributions
uth+0.044
uf+0.042
2+0.039
 example+0.038
emy+0.038
Neg logit contributions
bel-0.065
 stained-0.061
xc-0.061
 ripped-0.059
 towel-0.057
Token day
Token ID1110
Feature activation-3.77e-01

Pos logit contributions
uth+0.007
uf+0.007
 example+0.006
emy+0.006
2+0.006
Neg logit contributions
bel-0.010
xc-0.009
 stained-0.009
pan-0.009
 ripped-0.009
Token on
Token ID319
Feature activation-1.51e-01

Pos logit contributions
bel+0.007
xc+0.006
 stained+0.006
 ripped+0.005
pan+0.005
Neg logit contributions
uth-0.015
uf-0.015
emy-0.014
 example-0.014
2-0.014
Token cookies
Token ID14746
Feature activation+3.40e-01

Pos logit contributions
 example+0.015
emy+0.014
uth+0.014
 zoo+0.014
 jungle+0.014
Neg logit contributions
bel-0.024
nam-0.023
xc-0.022
pan-0.020
 stained-0.020
Token got
Token ID1392
Feature activation+9.61e-02

Pos logit contributions
 also+0.007
der+0.006
 always+0.006
 Even+0.006
 never+0.006
Neg logit contributions
xc-0.013
nam-0.012
bel-0.011
 pans-0.010
bat-0.010
Token all
Token ID477
Feature activation+1.25e+00

Pos logit contributions
Suddenly+0.002
 le+0.002
 more+0.002
 also+0.002
 until+0.002
Neg logit contributions
xc-0.003
robe-0.003
ci-0.003
days-0.003
bel-0.003
Token s
Token ID264
Feature activation-9.02e-01

Pos logit contributions
 By+0.027
 example+0.025
 Even+0.023
 le+0.022
 But+0.022
Neg logit contributions
nam-0.045
bel-0.043
kins-0.041
xc-0.041
cher-0.041
Tokenog
Token ID519
Feature activation+1.70e-01

Pos logit contributions
bel+0.018
xc+0.017
 stained+0.017
 ripped+0.016
pan+0.015
Neg logit contributions
uth-0.034
uf-0.033
 example-0.032
emy-0.031
osa-0.030
Tokengy
Token ID1360
Feature activation+3.87e+00

Pos logit contributions
uth+0.004
 example+0.004
2+0.004
uf+0.004
emy+0.004
Neg logit contributions
nam-0.005
 stained-0.005
 tant-0.005
bel-0.005
xc-0.004
Token and
Token ID290
Feature activation-4.19e-01

Pos logit contributions
uth+0.061
der+0.055
 example+0.055
uf+0.051
 By+0.050
Neg logit contributions
xc-0.156
nam-0.142
bel-0.140
 swollen-0.137
 stained-0.134
Token Jerry
Token ID13075
Feature activation+1.89e+00

Pos logit contributions
xc+0.017
bel+0.014
mine+0.013
robe+0.013
days+0.013
Neg logit contributions
 Even-0.008
 But-0.008
 le-0.007
 then-0.007
 never-0.007
Token was
Token ID373
Feature activation+6.32e-01

Pos logit contributions
uth+0.047
uf+0.044
Suddenly+0.042
ith+0.041
Then+0.040
Neg logit contributions
xc-0.064
nam-0.062
bel-0.061
lights-0.060
 pans-0.057
Token in
Token ID287
Feature activation-3.49e-02

Pos logit contributions
uth+0.013
 example+0.012
Suddenly+0.012
uf+0.011
 By+0.010
Neg logit contributions
xc-0.025
bel-0.023
robe-0.022
nam-0.021
mine-0.021
Token a
Token ID257
Feature activation-3.31e-01

Pos logit contributions
xc+0.001
bel+0.001
led+0.001
cher+0.001
nam+0.001
Neg logit contributions
 example-0.001
uf-0.001
uth-0.001
 By-0.001
 Even-0.001
Token that
Token ID326
Feature activation-2.81e-01

Pos logit contributions
uth+0.006
uf+0.006
 example+0.006
emy+0.006
2+0.006
Neg logit contributions
bel-0.006
xc-0.006
 stained-0.006
pan-0.006
 ripped-0.006
Token he
Token ID339
Feature activation-7.38e-02

Pos logit contributions
bel+0.011
xc+0.010
pan+0.010
 stained+0.010
iddled+0.009
Neg logit contributions
 example-0.006
uth-0.006
 By-0.006
 Even-0.006
 But-0.006
Token had
Token ID550
Feature activation+1.09e+00

Pos logit contributions
bel+0.003
xc+0.002
pan+0.002
nam+0.002
lights+0.002
Neg logit contributions
ith-0.002
uth-0.002
2-0.002
uf-0.001
 example-0.001
Token so
Token ID523
Feature activation+3.81e-01

Pos logit contributions
 example+0.026
uth+0.024
uf+0.024
2+0.023
ith+0.022
Neg logit contributions
bel-0.039
xc-0.037
lights-0.036
pan-0.036
igator-0.035
Token expert
Token ID5887
Feature activation+6.46e-01

Pos logit contributions
uth+0.007
uf+0.007
 example+0.006
2+0.006
emy+0.006
Neg logit contributions
xc-0.015
bel-0.015
pan-0.014
 stained-0.014
 ripped-0.013
Tokenly
Token ID306
Feature activation+3.85e+00

Pos logit contributions
 example+0.013
2+0.012
uth+0.012
uf+0.012
der+0.012
Neg logit contributions
bel-0.024
xc-0.024
 stained-0.023
 vacuum-0.022
 sponge-0.022
Token cut
Token ID2005
Feature activation+5.66e-01

Pos logit contributions
 example+0.083
uth+0.082
uf+0.081
ith+0.072
 By+0.072
Neg logit contributions
xc-0.147
bel-0.131
nam-0.129
igator-0.126
 stained-0.125
Token the
Token ID262
Feature activation+8.08e-01

Pos logit contributions
 example+0.014
2+0.014
uth+0.014
uf+0.013
ith+0.013
Neg logit contributions
bel-0.018
xc-0.017
 stained-0.017
pan-0.016
days-0.016
Token little
Token ID1310
Feature activation+1.69e+00

Pos logit contributions
uth+0.023
uf+0.023
 example+0.022
2+0.021
ith+0.021
Neg logit contributions
bel-0.024
xc-0.023
pan-0.021
 stained-0.021
 ripped-0.020
Token boy
Token ID2933
Feature activation+4.77e-01

Pos logit contributions
 example+0.042
uth+0.041
uf+0.040
2+0.039
ith+0.038
Neg logit contributions
bel-0.056
xc-0.054
pan-0.050
 stained-0.050
lights-0.048
Token's
Token ID338
Feature activation+9.56e-01

Pos logit contributions
uth+0.011
uf+0.010
 example+0.010
2+0.010
ith+0.010
Neg logit contributions
xc-0.017
bel-0.016
 stained-0.016
pan-0.016
lights-0.015
Token wished
Token ID16555
Feature activation+2.76e-01

Pos logit contributions
uth+0.009
uf+0.009
2+0.008
emy+0.008
 example+0.008
Neg logit contributions
bel-0.008
xc-0.007
pan-0.007
 stained-0.007
 vacuum-0.006
Token they
Token ID484
Feature activation+3.69e-01

Pos logit contributions
uth+0.007
 example+0.007
uf+0.007
emy+0.007
2+0.007
Neg logit contributions
xc-0.008
bel-0.008
 stained-0.008
pan-0.007
 ripped-0.007
Token had
Token ID550
Feature activation+1.04e+00

Pos logit contributions
uf+0.007
uth+0.007
 also+0.006
Then+0.006
ith+0.006
Neg logit contributions
xc-0.013
bel-0.013
pan-0.012
 ah-0.012
bat-0.012
Token never
Token ID1239
Feature activation+7.03e-01

Pos logit contributions
uth+0.023
uf+0.022
 example+0.021
emy+0.020
ith+0.019
Neg logit contributions
bel-0.038
xc-0.035
pan-0.035
lights-0.034
igator-0.034
Token taken
Token ID2077
Feature activation+4.32e-01

Pos logit contributions
uf+0.020
uth+0.019
emy+0.019
 example+0.019
ith+0.019
Neg logit contributions
bel-0.020
pan-0.019
xc-0.019
lights-0.017
 towel-0.017
Token Wally
Token ID50117
Feature activation+3.84e+00

Pos logit contributions
uth+0.014
uf+0.014
 example+0.013
emy+0.013
2+0.012
Neg logit contributions
bel-0.012
xc-0.011
 stained-0.010
pan-0.010
 vacuum-0.009
Token out
Token ID503
Feature activation-2.45e-01

Pos logit contributions
uf+0.084
 example+0.070
uth+0.070
osa+0.066
ith+0.064
Neg logit contributions
bel-0.146
xc-0.141
apy-0.139
lights-0.138
 ah-0.137
Token of
Token ID286
Feature activation-8.48e-01

Pos logit contributions
xc+0.011
bel+0.010
bat+0.010
nam+0.009
 stained+0.009
Neg logit contributions
 example-0.004
uf-0.003
 By-0.003
 And-0.003
ith-0.003
Token the
Token ID262
Feature activation+4.47e-02

Pos logit contributions
bel+0.029
xc+0.027
bat+0.023
pan+0.023
 vacuum+0.022
Neg logit contributions
 example-0.023
uf-0.022
uth-0.021
emy-0.020
 zoo-0.020
Token pool
Token ID5933
Feature activation+7.65e-01

Pos logit contributions
 zoo+0.001
 example+0.001
 jungle+0.001
 vet+0.001
uf+0.001
Neg logit contributions
bel-0.002
xc-0.001
mine-0.001
led-0.001
iddled-0.001
Token.
Token ID13
Feature activation+1.02e+00

Pos logit contributions
 example+0.014
uf+0.013
der+0.012
2+0.012
 vet+0.011
Neg logit contributions
xc-0.032
bel-0.030
 swollen-0.029
 stained-0.028
apy-0.028
Token had
Token ID550
Feature activation+1.65e-01

Pos logit contributions
uth+0.022
uf+0.021
ith+0.019
 example+0.019
 Even+0.019
Neg logit contributions
xc-0.033
nam-0.033
bel-0.032
lights-0.031
pan-0.030
Token wings
Token ID12098
Feature activation+1.89e+00

Pos logit contributions
uth+0.004
 example+0.003
uf+0.003
emy+0.003
osa+0.003
Neg logit contributions
bel-0.006
xc-0.006
pan-0.006
lights-0.006
mine-0.005
Token and
Token ID290
Feature activation+2.69e-01

Pos logit contributions
uth+0.020
der+0.019
ith+0.018
Suddenly+0.017
 Even+0.017
Neg logit contributions
xc-0.088
 vacuum-0.081
bel-0.080
nam-0.080
 pans-0.079
Token legs
Token ID7405
Feature activation+1.26e+00

Pos logit contributions
uth+0.007
 example+0.006
emy+0.006
uf+0.006
 Even+0.006
Neg logit contributions
xc-0.009
bel-0.009
pan-0.008
 stained-0.008
 vacuum-0.008
Token and
Token ID290
Feature activation+1.94e-01

Pos logit contributions
der+0.015
 Even+0.014
uth+0.014
Suddenly+0.014
ith+0.013
Neg logit contributions
xc-0.057
bel-0.053
 stained-0.051
 vacuum-0.050
 pans-0.050
Token antenna
Token ID20509
Feature activation+3.84e+00

Pos logit contributions
uth+0.004
 example+0.004
emy+0.004
uf+0.004
 Even+0.004
Neg logit contributions
xc-0.007
bel-0.006
pan-0.006
mine-0.006
 vacuum-0.006
Tokene
Token ID68
Feature activation-1.00e+00

Pos logit contributions
uth+0.094
uf+0.089
emy+0.086
alph+0.085
ith+0.083
Neg logit contributions
bel-0.136
xc-0.125
 vacuum-0.121
 stained-0.118
pan-0.117
Token and
Token ID290
Feature activation-7.64e-01

Pos logit contributions
xc+0.034
bel+0.033
 stained+0.033
 vacuum+0.033
nam+0.032
Neg logit contributions
uth-0.025
uf-0.024
 example-0.021
aff-0.021
ith-0.021
Token eyes
Token ID2951
Feature activation+1.42e+00

Pos logit contributions
xc+0.027
bel+0.024
pan+0.023
mine+0.022
lights+0.022
Neg logit contributions
uth-0.019
 example-0.018
emy-0.017
 But-0.016
Suddenly-0.016
Token.
Token ID13
Feature activation-2.69e-01

Pos logit contributions
der+0.021
uth+0.018
 knowing+0.017
ith+0.017
 And+0.017
Neg logit contributions
xc-0.064
bel-0.056
nam-0.055
 pans-0.054
 sponge-0.054
Token I
Token ID314
Feature activation+1.10e+00

Pos logit contributions
xc+0.006
bel+0.005
nam+0.005
 ah+0.005
 vacuum+0.005
Neg logit contributions
uth-0.009
uf-0.008
 example-0.008
ith-0.008
emy-0.008
Token ostr
Token ID42387
Feature activation+4.95e-01

Pos logit contributions
uth+0.004
uf+0.004
emy+0.003
 example+0.003
2+0.003
Neg logit contributions
bel-0.011
 stained-0.010
xc-0.010
 ripped-0.010
pan-0.010
Tokenich
Token ID488
Feature activation+7.95e-01

Pos logit contributions
 example+0.012
uf+0.012
uth+0.012
 By+0.011
osa+0.011
Neg logit contributions
bel-0.017
xc-0.015
bat-0.014
 ripped-0.013
 swollen-0.013
Token pen
Token ID3112
Feature activation+5.97e-01

Pos logit contributions
uth+0.019
uf+0.018
 example+0.017
emy+0.016
2+0.016
Neg logit contributions
bel-0.027
xc-0.026
 stained-0.026
 ripped-0.025
pan-0.025
Token.
Token ID13
Feature activation+6.68e-01

Pos logit contributions
uth+0.009
uf+0.008
 example+0.008
der+0.008
emy+0.008
Neg logit contributions
bel-0.025
xc-0.025
 stained-0.023
 unst-0.023
 swollen-0.022
Token Ost
Token ID38919
Feature activation+7.35e-01

Pos logit contributions
uth+0.013
uf+0.013
emy+0.011
 example+0.011
2+0.011
Neg logit contributions
bel-0.026
xc-0.025
 stained-0.024
pan-0.024
 ripped-0.024
Tokenric
Token ID1173
Feature activation+3.83e+00

Pos logit contributions
uth+0.020
emy+0.019
uf+0.019
 example+0.018
2+0.018
Neg logit contributions
xc-0.021
bel-0.021
 stained-0.020
mine-0.020
bat-0.019
Tokenhes
Token ID956
Feature activation-4.62e-01

Pos logit contributions
uth+0.131
uf+0.130
 example+0.122
emy+0.121
2+0.118
Neg logit contributions
bel-0.093
 stained-0.083
xc-0.082
pan-0.078
 ripped-0.078
Token are
Token ID389
Feature activation-2.16e-01

Pos logit contributions
xc+0.017
bel+0.016
nam+0.016
 stained+0.014
 vacuum+0.014
Neg logit contributions
uth-0.010
 example-0.009
uf-0.009
 learns-0.009
 And-0.009
Token big
Token ID1263
Feature activation+1.72e-02

Pos logit contributions
xc+0.008
bel+0.007
pan+0.007
mine+0.007
robe+0.006
Neg logit contributions
uth-0.005
emy-0.005
uf-0.005
 example-0.005
2-0.005
Token birds
Token ID10087
Feature activation+1.18e+00

Pos logit contributions
uth+0.000
uf+0.000
emy+0.000
 example+0.000
2+0.000
Neg logit contributions
bel-0.001
xc-0.001
 stained-0.001
pan-0.001
 ripped-0.001
Token with
Token ID351
Feature activation+2.63e-01

Pos logit contributions
uth+0.028
uf+0.027
emy+0.024
ith+0.024
2+0.024
Neg logit contributions
bel-0.042
xc-0.042
 stained-0.038
 ripped-0.037
 vacuum-0.036
Token to
Token ID284
Feature activation+4.59e-01

Pos logit contributions
uth+0.017
uf+0.017
emy+0.016
2+0.015
 example+0.015
Neg logit contributions
bel-0.013
 stained-0.012
xc-0.012
 ripped-0.012
 vacuum-0.011
Token cry
Token ID3960
Feature activation-9.28e-02

Pos logit contributions
uf+0.011
 example+0.011
uth+0.011
 But+0.010
 Even+0.010
Neg logit contributions
bel-0.015
lights-0.015
led-0.014
pan-0.014
xc-0.013
Token and
Token ID290
Feature activation-7.25e-01

Pos logit contributions
bel+0.003
 stained+0.003
 ripped+0.003
 vacuum+0.003
xc+0.003
Neg logit contributions
uth-0.003
uf-0.002
emy-0.002
2-0.002
 example-0.002
Token threw
Token ID9617
Feature activation+8.64e-02

Pos logit contributions
bel+0.023
xc+0.022
pan+0.022
lights+0.021
boat+0.020
Neg logit contributions
 But-0.018
 example-0.017
uth-0.017
uf-0.017
Suddenly-0.017
Token a
Token ID257
Feature activation+1.36e-01

Pos logit contributions
 example+0.003
uth+0.003
uf+0.002
2+0.002
ith+0.002
Neg logit contributions
bel-0.002
xc-0.002
pan-0.002
spe-0.002
days-0.002
Token tant
Token ID24246
Feature activation-5.58e+00

Pos logit contributions
2+0.004
uth+0.004
ith+0.004
uf+0.004
Suddenly+0.004
Neg logit contributions
xc-0.003
bel-0.003
bat-0.003
ics-0.003
lights-0.003
Tokenrum
Token ID6582
Feature activation-6.45e-01

Pos logit contributions
bel+0.092
xc+0.084
 stained+0.075
pan+0.075
igator+0.068
Neg logit contributions
uth-0.233
uf-0.231
2-0.221
emy-0.214
 example-0.213
Token.
Token ID13
Feature activation-2.38e-01

Pos logit contributions
bel+0.022
xc+0.021
 stained+0.020
 vacuum+0.019
pan+0.019
Neg logit contributions
uth-0.015
uf-0.014
 example-0.014
der-0.014
2-0.013
Token She
Token ID1375
Feature activation-2.19e-01

Pos logit contributions
bel+0.007
 stained+0.007
 ripped+0.006
xc+0.006
 vacuum+0.006
Neg logit contributions
uth-0.007
uf-0.007
emy-0.006
 example-0.006
2-0.006
Token didn
Token ID1422
Feature activation-2.45e-01

Pos logit contributions
bel+0.005
xc+0.005
pan+0.005
 stained+0.005
 ripped+0.004
Neg logit contributions
uth-0.007
uf-0.007
emy-0.007
 example-0.007
2-0.007
Token't
Token ID470
Feature activation+8.91e-01

Pos logit contributions
bel+0.006
xc+0.006
 stained+0.006
 ripped+0.005
pan+0.005
Neg logit contributions
uth-0.008
uf-0.008
emy-0.008
 example-0.007
2-0.007
Token kept
Token ID4030
Feature activation+7.63e-01

Pos logit contributions
bel+0.003
xc+0.003
 stained+0.003
 ripped+0.003
pan+0.003
Neg logit contributions
uth-0.005
uf-0.005
emy-0.004
 example-0.004
2-0.004
Token crying
Token ID13774
Feature activation-3.23e-01

Pos logit contributions
uf+0.020
uth+0.019
 example+0.019
emy+0.018
2+0.018
Neg logit contributions
bel-0.025
xc-0.023
igator-0.022
 stained-0.021
pan-0.021
Token and
Token ID290
Feature activation-2.45e-01

Pos logit contributions
bel+0.010
 stained+0.010
 ripped+0.009
pan+0.009
xc+0.009
Neg logit contributions
uth-0.009
uf-0.008
emy-0.008
2-0.007
 example-0.007
Token throwing
Token ID9644
Feature activation+1.31e-02

Pos logit contributions
bel+0.008
xc+0.008
igator+0.007
pan+0.007
lights+0.007
Neg logit contributions
uth-0.006
 example-0.006
uf-0.006
Suddenly-0.006
2-0.006
Token a
Token ID257
Feature activation+2.23e-01

Pos logit contributions
 example+0.000
uth+0.000
uf+0.000
ith+0.000
 But+0.000
Neg logit contributions
bel-0.000
xc-0.000
spe-0.000
days-0.000
washer-0.000
Token tant
Token ID24246
Feature activation-5.52e+00

Pos logit contributions
uf+0.007
ith+0.007
uth+0.006
Suddenly+0.006
emy+0.006
Neg logit contributions
xc-0.006
bel-0.005
bat-0.005
nam-0.005
ics-0.005
Tokenrum
Token ID6582
Feature activation-5.67e-01

Pos logit contributions
bel+0.093
xc+0.085
 stained+0.077
pan+0.076
 ripped+0.070
Neg logit contributions
uth-0.228
uf-0.227
2-0.216
 example-0.210
emy-0.209
Token.
Token ID13
Feature activation-3.08e-01

Pos logit contributions
bel+0.021
xc+0.020
igator+0.019
 stained+0.019
 pans+0.019
Neg logit contributions
der-0.012
uth-0.012
 example-0.011
 Even-0.010
 Then-0.010
Token\n
Token ID198
Feature activation-2.56e-01

Pos logit contributions
bel+0.010
 ripped+0.010
 stained+0.010
xc+0.009
 vacuum+0.009
Neg logit contributions
uth-0.008
uf-0.008
emy-0.007
 example-0.007
aja-0.007
Token\n
Token ID198
Feature activation-1.21e-01

Pos logit contributions
bel+0.008
 stained+0.007
xc+0.007
 ripped+0.007
 messed+0.007
Neg logit contributions
uf-0.007
uth-0.007
 example-0.007
2-0.006
ith-0.006
TokenFinally
Token ID11158
Feature activation-6.56e-01

Pos logit contributions
bel+0.005
 stained+0.004
 ripped+0.004
xc+0.004
 vacuum+0.004
Neg logit contributions
uf-0.002
uth-0.002
 example-0.002
emy-0.002
2-0.002
Token She
Token ID1375
Feature activation-4.18e-01

Pos logit contributions
bel+0.004
 ripped+0.004
 stained+0.004
 down+0.004
xc+0.004
Neg logit contributions
uf-0.006
uth-0.006
alph-0.006
emy-0.006
 example-0.006
Token cried
Token ID16896
Feature activation-8.31e-02

Pos logit contributions
bel+0.010
xc+0.010
 stained+0.009
 ripped+0.009
pan+0.009
Neg logit contributions
uth-0.014
uf-0.014
emy-0.013
 example-0.013
2-0.013
Token and
Token ID290
Feature activation-6.32e-01

Pos logit contributions
bel+0.003
xc+0.003
 stained+0.003
pan+0.003
 ripped+0.003
Neg logit contributions
uth-0.002
uf-0.002
 example-0.002
emy-0.002
2-0.002
Token threw
Token ID9617
Feature activation+4.30e-01

Pos logit contributions
bel+0.020
xc+0.019
igator+0.018
pan+0.018
days+0.017
Neg logit contributions
uth-0.016
uf-0.016
 example-0.015
emy-0.015
 But-0.015
Token a
Token ID257
Feature activation+3.29e-01

Pos logit contributions
 example+0.012
 But+0.011
uth+0.011
ith+0.011
2+0.011
Neg logit contributions
bel-0.012
xc-0.012
washer-0.011
days-0.011
boat-0.011
Token tant
Token ID24246
Feature activation-5.46e+00

Pos logit contributions
2+0.011
uf+0.010
uth+0.010
emy+0.010
ith+0.010
Neg logit contributions
xc-0.008
bel-0.007
bat-0.007
ics-0.007
lights-0.006
Tokenrum
Token ID6582
Feature activation-5.28e-01

Pos logit contributions
bel+0.093
xc+0.084
 stained+0.076
pan+0.074
igator+0.069
Neg logit contributions
uth-0.225
uf-0.223
2-0.211
emy-0.206
 example-0.206
Token.
Token ID13
Feature activation-1.13e-01

Pos logit contributions
bel+0.020
xc+0.020
igator+0.019
 stained+0.018
 sponge+0.018
Neg logit contributions
der-0.010
uth-0.010
 example-0.009
 And-0.009
 Then-0.009
Token Her
Token ID2332
Feature activation+9.20e-01

Pos logit contributions
bel+0.003
 ripped+0.003
 stained+0.003
 vacuum+0.003
xc+0.003
Neg logit contributions
uth-0.004
uf-0.004
emy-0.003
alph-0.003
 example-0.003
Token mom
Token ID1995
Feature activation-1.11e+00

Pos logit contributions
uth+0.018
emy+0.018
 example+0.017
uf+0.017
 wife+0.016
Neg logit contributions
xc-0.034
pan-0.034
iddled-0.033
bel-0.032
mine-0.032
Token tried
Token ID3088
Feature activation+1.44e+00

Pos logit contributions
bel+0.030
xc+0.028
 stained+0.026
pan+0.026
nam+0.025
Neg logit contributions
uth-0.034
uf-0.034
2-0.032
ith-0.032
 example-0.032
Token her
Token ID607
Feature activation+6.32e-01

Pos logit contributions
 example+0.018
uth+0.017
 But+0.016
uf+0.016
ith+0.016
Neg logit contributions
xc-0.016
boat-0.016
bel-0.016
pan-0.015
robe-0.015
Token feet
Token ID3625
Feature activation-3.14e-01

Pos logit contributions
uth+0.018
uf+0.017
 example+0.016
osa+0.015
ith+0.015
Neg logit contributions
xc-0.019
bel-0.018
pan-0.017
igator-0.017
bly-0.016
Token and
Token ID290
Feature activation-4.49e-02

Pos logit contributions
xc+0.015
bel+0.014
 pans+0.014
 stained+0.014
 ah+0.013
Neg logit contributions
2-0.003
Suddenly-0.003
 example-0.003
uth-0.003
 Even-0.002
Token threw
Token ID9617
Feature activation+4.60e-01

Pos logit contributions
xc+0.002
days+0.001
bel+0.001
lights+0.001
robe+0.001
Neg logit contributions
 But-0.001
2-0.001
uth-0.001
Suddenly-0.001
uf-0.001
Token a
Token ID257
Feature activation+3.76e-01

Pos logit contributions
 example+0.012
uth+0.011
nine+0.011
2+0.011
uf+0.011
Neg logit contributions
bel-0.015
xc-0.014
days-0.013
washer-0.013
 stained-0.012
Token tant
Token ID24246
Feature activation-5.41e+00

Pos logit contributions
2+0.013
uth+0.013
ith+0.012
uf+0.012
emy+0.012
Neg logit contributions
xc-0.008
bel-0.007
bat-0.007
ics-0.007
days-0.007
Tokenrum
Token ID6582
Feature activation-4.67e-01

Pos logit contributions
bel+0.090
xc+0.081
 stained+0.072
pan+0.072
igator+0.065
Neg logit contributions
uth-0.223
uf-0.223
2-0.213
emy-0.208
 example-0.207
Token.
Token ID13
Feature activation-3.02e-02

Pos logit contributions
bel+0.018
xc+0.018
igator+0.017
 stained+0.017
 vacuum+0.017
Neg logit contributions
der-0.009
uth-0.008
 Then-0.007
 example-0.007
 And-0.007
Token She
Token ID1375
Feature activation+2.48e-01

Pos logit contributions
bel+0.001
 stained+0.001
 ripped+0.001
xc+0.001
 vacuum+0.001
Neg logit contributions
uth-0.001
uf-0.001
emy-0.001
 example-0.001
alph-0.001
Token grabbed
Token ID13176
Feature activation+5.09e-01

Pos logit contributions
uth+0.007
uf+0.007
emy+0.007
2+0.007
ith+0.007
Neg logit contributions
bel-0.007
xc-0.007
pan-0.007
days-0.006
robe-0.006
Token the
Token ID262
Feature activation+3.26e-01

Pos logit contributions
uth+0.015
uf+0.015
 example+0.015
2+0.015
nine+0.014
Neg logit contributions
bel-0.014
xc-0.013
 stained-0.012
pan-0.012
days-0.011
Token and
Token ID290
Feature activation-4.38e-01

Pos logit contributions
uth+0.004
uf+0.004
emy+0.004
alph+0.004
2+0.004
Neg logit contributions
bel-0.004
 stained-0.004
 ripped-0.004
pan-0.004
 down-0.004
Token started
Token ID2067
Feature activation+4.63e-01

Pos logit contributions
bel+0.014
xc+0.014
days+0.013
lights+0.013
boat+0.013
Neg logit contributions
 But-0.010
uf-0.010
ith-0.010
Suddenly-0.010
2-0.010
Token to
Token ID284
Feature activation+3.13e-01

Pos logit contributions
uth+0.013
uf+0.013
 example+0.012
emy+0.012
2+0.011
Neg logit contributions
bel-0.014
xc-0.013
 stained-0.013
pan-0.012
 ripped-0.012
Token throw
Token ID3714
Feature activation+3.52e-01

Pos logit contributions
uf+0.011
uth+0.010
emy+0.010
 example+0.010
 But+0.009
Neg logit contributions
bel-0.008
lights-0.007
xc-0.007
pan-0.007
boat-0.006
Token a
Token ID257
Feature activation+3.47e-01

Pos logit contributions
uf+0.010
uth+0.010
 example+0.010
ith+0.010
emy+0.009
Neg logit contributions
bel-0.010
xc-0.010
spe-0.009
days-0.009
pan-0.008
Token tant
Token ID24246
Feature activation-5.34e+00

Pos logit contributions
uf+0.010
emy+0.010
ith+0.010
uth+0.010
2+0.010
Neg logit contributions
xc-0.009
bel-0.008
bat-0.008
nam-0.008
adoes-0.008
Tokenrum
Token ID6582
Feature activation-3.93e-01

Pos logit contributions
bel+0.083
xc+0.077
pan+0.065
 stained+0.064
igator+0.061
Neg logit contributions
uth-0.227
uf-0.225
2-0.215
 example-0.208
 By-0.207
Token.
Token ID13
Feature activation-2.04e-01

Pos logit contributions
bel+0.014
xc+0.013
 stained+0.013
igator+0.013
 vacuum+0.012
Neg logit contributions
uth-0.009
uf-0.008
 example-0.008
der-0.008
2-0.008
Token Suddenly
Token ID24975
Feature activation-3.37e-01

Pos logit contributions
bel+0.005
 stained+0.005
 ripped+0.005
 vacuum+0.005
xc+0.005
Neg logit contributions
uth-0.007
uf-0.007
emy-0.006
alph-0.006
 example-0.006
Token,
Token ID11
Feature activation-5.98e-01

Pos logit contributions
bel+0.014
xc+0.014
igator+0.013
boat+0.012
 stained+0.012
Neg logit contributions
uth-0.006
uf-0.005
emy-0.005
 example-0.005
Suddenly-0.005
Token she
Token ID673
Feature activation-1.02e+00

Pos logit contributions
xc+0.021
bel+0.020
boat+0.019
spe+0.018
igator+0.018
Neg logit contributions
uth-0.012
Suddenly-0.012
 example-0.012
Then-0.012
 vet-0.012
Token and
Token ID290
Feature activation+1.57e-01

Pos logit contributions
uth+0.017
uf+0.017
 example+0.015
emy+0.015
2+0.015
Neg logit contributions
bel-0.028
xc-0.027
 stained-0.026
pan-0.025
 ripped-0.025
Token so
Token ID523
Feature activation-8.93e-01

Pos logit contributions
uth+0.004
 But+0.004
uf+0.004
ith+0.003
 example+0.003
Neg logit contributions
xc-0.005
bel-0.005
pan-0.005
lights-0.005
 sponge-0.005
Token she
Token ID673
Feature activation-6.91e-01

Pos logit contributions
xc+0.034
pan+0.033
bel+0.030
 sponge+0.029
 towel+0.028
Neg logit contributions
uth-0.021
Suddenly-0.019
Then-0.018
 Even-0.018
 But-0.018
Token threw
Token ID9617
Feature activation+4.44e-01

Pos logit contributions
bel+0.019
xc+0.017
pan+0.016
 stained+0.016
 swollen+0.015
Neg logit contributions
uth-0.022
uf-0.022
emy-0.020
 example-0.020
ith-0.020
Token a
Token ID257
Feature activation+3.89e-01

Pos logit contributions
uth+0.014
 example+0.014
uf+0.014
2+0.014
ith+0.013
Neg logit contributions
bel-0.011
pan-0.010
xc-0.010
 stained-0.009
 vacuum-0.009
Token tant
Token ID24246
Feature activation-5.27e+00

Pos logit contributions
2+0.011
uf+0.011
uth+0.011
emy+0.011
 example+0.010
Neg logit contributions
xc-0.010
bel-0.010
days-0.009
pan-0.009
bat-0.009
Tokenrum
Token ID6582
Feature activation-4.76e-01

Pos logit contributions
bel+0.086
xc+0.078
pan+0.068
 stained+0.067
igator+0.063
Neg logit contributions
uf-0.220
uth-0.220
2-0.211
emy-0.203
 example-0.202
Token when
Token ID618
Feature activation+7.52e-02

Pos logit contributions
bel+0.016
xc+0.016
 stained+0.016
 vacuum+0.015
igator+0.015
Neg logit contributions
uth-0.011
der-0.010
 example-0.010
uf-0.010
Then-0.009
Token her
Token ID607
Feature activation-8.81e-01

Pos logit contributions
uth+0.002
Then+0.002
Suddenly+0.002
 example+0.001
 Then+0.001
Neg logit contributions
xc-0.003
bel-0.003
pan-0.003
igator-0.002
 tant-0.002
Token parents
Token ID3397
Feature activation-9.41e-01

Pos logit contributions
xc+0.031
bel+0.030
mine+0.030
pan+0.030
igator+0.029
Neg logit contributions
uf-0.019
uth-0.019
 wife-0.018
 example-0.017
emy-0.017
Token said
Token ID531
Feature activation+1.09e+00

Pos logit contributions
bel+0.030
xc+0.029
lights+0.028
 swollen+0.027
pan+0.026
Neg logit contributions
uf-0.026
2-0.024
ith-0.023
uth-0.022
 Even-0.020
Token listen
Token ID6004
Feature activation+1.08e+00

Pos logit contributions
uth+0.017
uf+0.016
emy+0.015
2+0.015
 example+0.015
Neg logit contributions
bel-0.019
 stained-0.018
 ripped-0.018
xc-0.018
 vacuum-0.017
Token.
Token ID13
Feature activation+3.32e-01

Pos logit contributions
uth+0.025
uf+0.023
 example+0.023
ith+0.022
2+0.022
Neg logit contributions
xc-0.037
bel-0.036
 stained-0.035
pan-0.035
 vacuum-0.034
Token She
Token ID1375
Feature activation+4.29e-01

Pos logit contributions
uth+0.013
uf+0.012
alph+0.012
emy+0.011
 example+0.011
Neg logit contributions
 stained-0.006
 ripped-0.006
bel-0.006
robe-0.006
 down-0.006
Token threw
Token ID9617
Feature activation+9.02e-01

Pos logit contributions
uth+0.014
uf+0.014
 example+0.013
emy+0.013
2+0.013
Neg logit contributions
bel-0.011
 stained-0.010
 ripped-0.010
xc-0.010
 vacuum-0.009
Token a
Token ID257
Feature activation+8.38e-01

Pos logit contributions
 example+0.023
nine+0.021
2+0.020
uth+0.020
 But+0.020
Neg logit contributions
bel-0.029
xc-0.028
days-0.027
washer-0.025
bat-0.025
Token tant
Token ID24246
Feature activation-5.24e+00

Pos logit contributions
2+0.027
emy+0.027
 example+0.027
uth+0.027
uf+0.027
Neg logit contributions
xc-0.020
bel-0.017
bat-0.017
ics-0.016
pan-0.015
Tokenrum
Token ID6582
Feature activation-1.59e-01

Pos logit contributions
bel+0.094
xc+0.084
 stained+0.078
pan+0.074
 ripped+0.071
Neg logit contributions
uth-0.212
uf-0.211
2-0.199
emy-0.194
 example-0.194
Token in
Token ID287
Feature activation+1.17e-01

Pos logit contributions
bel+0.006
xc+0.006
 stained+0.005
 vacuum+0.005
igator+0.005
Neg logit contributions
uth-0.003
 example-0.003
uf-0.003
der-0.003
2-0.003
Token the
Token ID262
Feature activation-3.35e-01

Pos logit contributions
 example+0.003
uf+0.003
uth+0.003
2+0.003
 Even+0.003
Neg logit contributions
bel-0.004
xc-0.004
bat-0.004
pan-0.003
washer-0.003
Token store
Token ID3650
Feature activation+7.60e-01

Pos logit contributions
bel+0.010
xc+0.009
pan+0.009
 stained+0.008
nam+0.008
Neg logit contributions
uf-0.010
 example-0.010
uth-0.009
emy-0.009
ith-0.009
Token.
Token ID13
Feature activation+3.25e-01

Pos logit contributions
 example+0.012
2+0.012
uf+0.012
uth+0.011
ith+0.010
Neg logit contributions
xc-0.033
bel-0.030
 stained-0.029
 swollen-0.028
igator-0.028
Token.
Token ID13
Feature activation-1.47e-01

Pos logit contributions
uth+0.021
uf+0.021
 example+0.020
der+0.018
2+0.018
Neg logit contributions
xc-0.039
bel-0.038
 stained-0.037
 vacuum-0.036
pan-0.035
Token He
Token ID679
Feature activation+5.23e-01

Pos logit contributions
bel+0.003
 ripped+0.003
 stained+0.003
 down+0.003
cloth+0.002
Neg logit contributions
uth-0.006
uf-0.005
emy-0.005
alph-0.005
 example-0.005
Token kept
Token ID4030
Feature activation+4.56e-01

Pos logit contributions
uth+0.018
uf+0.018
emy+0.017
 example+0.017
2+0.016
Neg logit contributions
bel-0.012
xc-0.011
 stained-0.011
 ripped-0.011
pan-0.011
Token throwing
Token ID9644
Feature activation+3.88e-01

Pos logit contributions
uf+0.012
uth+0.011
 example+0.011
2+0.011
emy+0.010
Neg logit contributions
bel-0.015
xc-0.014
 stained-0.013
pan-0.013
lights-0.013
Token a
Token ID257
Feature activation+8.27e-01

Pos logit contributions
 example+0.009
2+0.008
uf+0.008
ith+0.008
nine+0.008
Neg logit contributions
bel-0.014
xc-0.014
days-0.013
pan-0.012
ics-0.011
Token tant
Token ID24246
Feature activation-5.19e+00

Pos logit contributions
uth+0.026
uf+0.026
2+0.025
emy+0.025
 example+0.024
Neg logit contributions
xc-0.021
bel-0.021
pan-0.018
 stained-0.018
bat-0.017
Tokenrum
Token ID6582
Feature activation-3.78e-01

Pos logit contributions
bel+0.089
xc+0.079
 stained+0.070
pan+0.069
 ripped+0.065
Neg logit contributions
uth-0.214
uf-0.212
2-0.200
 example-0.197
emy-0.197
Token.
Token ID13
Feature activation-9.57e-02

Pos logit contributions
bel+0.014
xc+0.014
 stained+0.013
 vacuum+0.013
 towel+0.013
Neg logit contributions
der-0.007
uth-0.007
 example-0.007
uf-0.007
2-0.007
Token His
Token ID2399
Feature activation+6.72e-01

Pos logit contributions
bel+0.002
 ripped+0.002
 stained+0.002
 messed+0.002
 vacuum+0.002
Neg logit contributions
uth-0.003
uf-0.003
emy-0.003
alph-0.003
ith-0.003
Token mom
Token ID1995
Feature activation-5.37e-01

Pos logit contributions
uth+0.013
uf+0.013
 example+0.013
 wife+0.012
 vet+0.012
Neg logit contributions
xc-0.025
mine-0.023
iddled-0.023
bel-0.023
pan-0.023
Token was
Token ID373
Feature activation+3.62e-01

Pos logit contributions
bel+0.014
xc+0.013
 stained+0.012
pan+0.012
nam+0.011
Neg logit contributions
uf-0.017
uth-0.017
2-0.016
ith-0.016
 example-0.016
Token sad
Token ID6507
Feature activation-3.97e-01

Pos logit contributions
xc+0.019
bel+0.018
pan+0.018
lights+0.017
days+0.017
Neg logit contributions
 But-0.014
Suddenly-0.014
 Then-0.013
 Even-0.013
 example-0.013
Token.
Token ID13
Feature activation-3.74e-01

Pos logit contributions
bel+0.013
xc+0.013
 stained+0.012
pan+0.012
 ripped+0.012
Neg logit contributions
uth-0.010
uf-0.009
 example-0.009
emy-0.009
2-0.008
Token She
Token ID1375
Feature activation-4.03e-02

Pos logit contributions
 ripped+0.009
bel+0.009
 down+0.009
 stained+0.008
xc+0.008
Neg logit contributions
uf-0.012
uth-0.012
alph-0.011
 example-0.011
emy-0.011
Token threw
Token ID9617
Feature activation+6.58e-01

Pos logit contributions
bel+0.001
 stained+0.001
 ripped+0.001
xc+0.001
 vacuum+0.001
Neg logit contributions
uth-0.001
uf-0.001
emy-0.001
 example-0.001
2-0.001
Token a
Token ID257
Feature activation+9.03e-01

Pos logit contributions
 example+0.017
2+0.017
uth+0.016
uf+0.016
ith+0.015
Neg logit contributions
bel-0.021
xc-0.019
days-0.018
pan-0.018
 stained-0.018
Token tant
Token ID24246
Feature activation-5.16e+00

Pos logit contributions
2+0.028
uth+0.026
ith+0.026
uf+0.026
emy+0.026
Neg logit contributions
xc-0.023
bel-0.021
bat-0.020
ics-0.019
days-0.018
Tokenrum
Token ID6582
Feature activation-3.58e-01

Pos logit contributions
bel+0.085
xc+0.078
 stained+0.069
pan+0.068
igator+0.062
Neg logit contributions
uf-0.214
uth-0.213
2-0.204
emy-0.198
 example-0.197
Token and
Token ID290
Feature activation-1.74e-01

Pos logit contributions
bel+0.014
xc+0.013
 stained+0.013
igator+0.013
 vacuum+0.013
Neg logit contributions
uth-0.007
der-0.006
 example-0.006
uf-0.006
Suddenly-0.006
Token cried
Token ID16896
Feature activation-7.85e-02

Pos logit contributions
xc+0.006
bel+0.006
igator+0.005
lights+0.005
days+0.005
Neg logit contributions
Suddenly-0.004
2-0.004
uth-0.004
uf-0.004
 example-0.004
Token loudly
Token ID23112
Feature activation+1.57e-01

Pos logit contributions
bel+0.003
xc+0.003
 stained+0.002
pan+0.002
 ripped+0.002
Neg logit contributions
uth-0.002
uf-0.002
 example-0.002
2-0.002
emy-0.002
Token.
Token ID13
Feature activation-4.07e-01

Pos logit contributions
 example+0.003
der+0.003
uth+0.003
uf+0.003
 But+0.003
Neg logit contributions
xc-0.006
bel-0.006
 stained-0.006
nam-0.006
igator-0.006
Token surprise
Token ID5975
Feature activation-1.09e+00

Pos logit contributions
 example+0.003
uf+0.003
uth+0.003
 zoo+0.003
emy+0.003
Neg logit contributions
bel-0.004
xc-0.003
pan-0.003
 stained-0.003
iddled-0.003
Token.
Token ID13
Feature activation+2.61e-01

Pos logit contributions
xc+0.045
bel+0.044
 swollen+0.040
 stained+0.039
nam+0.039
Neg logit contributions
 example-0.019
uf-0.018
uth-0.018
2-0.017
der-0.017
Token He
Token ID679
Feature activation+6.22e-01

Pos logit contributions
uth+0.009
uf+0.008
emy+0.008
alph+0.008
 example+0.008
Neg logit contributions
bel-0.006
 ripped-0.006
 stained-0.006
xc-0.005
 messed-0.005
Token threw
Token ID9617
Feature activation+6.60e-01

Pos logit contributions
uth+0.021
uf+0.021
 example+0.020
emy+0.020
2+0.019
Neg logit contributions
bel-0.015
 stained-0.014
xc-0.014
 ripped-0.013
pan-0.013
Token a
Token ID257
Feature activation+6.16e-01

Pos logit contributions
 example+0.015
nine+0.013
 But+0.013
2+0.013
uf+0.013
Neg logit contributions
bel-0.024
days-0.023
xc-0.023
ivery-0.021
ci-0.020
Token tant
Token ID24246
Feature activation-4.94e+00

Pos logit contributions
2+0.017
uf+0.016
ith+0.016
uth+0.016
emy+0.016
Neg logit contributions
xc-0.019
bel-0.017
days-0.016
ics-0.016
bat-0.016
Tokenrum
Token ID6582
Feature activation-4.47e-01

Pos logit contributions
bel+0.083
xc+0.074
 stained+0.064
pan+0.063
igator+0.060
Neg logit contributions
uth-0.205
uf-0.204
2-0.193
 example-0.189
emy-0.187
Token.
Token ID13
Feature activation+2.24e-01

Pos logit contributions
bel+0.017
xc+0.017
 vacuum+0.016
 stained+0.016
nam+0.016
Neg logit contributions
der-0.009
 example-0.007
uth-0.007
 Then-0.007
 And-0.007
Token He
Token ID679
Feature activation+6.56e-01

Pos logit contributions
uth+0.007
uf+0.006
emy+0.006
 example+0.006
alph+0.006
Neg logit contributions
bel-0.007
 stained-0.006
 ripped-0.006
xc-0.006
pan-0.006
Token kicked
Token ID12165
Feature activation+7.34e-01

Pos logit contributions
uf+0.020
uth+0.020
emy+0.019
ith+0.019
2+0.018
Neg logit contributions
bel-0.018
xc-0.017
pan-0.016
days-0.015
lights-0.015
Token and
Token ID290
Feature activation-2.39e-01

Pos logit contributions
uf+0.020
 example+0.020
uth+0.020
2+0.019
emy+0.018
Neg logit contributions
bel-0.023
xc-0.021
 stained-0.020
pan-0.019
days-0.019
Token and
Token ID290
Feature activation+1.33e-01

Pos logit contributions
xc+0.007
bel+0.007
 tant+0.006
 swollen+0.006
 stained+0.006
Neg logit contributions
der-0.002
 And-0.002
:-0.002
 example-0.002
uf-0.002
Token they
Token ID484
Feature activation-1.24e+00

Pos logit contributions
uth+0.003
 example+0.003
emy+0.003
 Even+0.003
uf+0.003
Neg logit contributions
xc-0.004
bel-0.004
pan-0.004
mine-0.004
 stained-0.004
Token played
Token ID2826
Feature activation-1.86e+00

Pos logit contributions
xc+0.045
bel+0.042
pan+0.040
 unst+0.038
 towel+0.038
Neg logit contributions
uth-0.026
 Even-0.025
 knows-0.025
uf-0.024
ith-0.024
Token make
Token ID787
Feature activation-1.66e+00

Pos logit contributions
xc+0.061
 swollen+0.060
 stained+0.058
pan+0.056
igator+0.055
Neg logit contributions
 example-0.047
 And-0.044
 But-0.042
2-0.041
uth-0.041
Token-
Token ID12
Feature activation-1.93e+00

Pos logit contributions
pan+0.053
xc+0.051
 stained+0.051
bel+0.050
 swollen+0.049
Neg logit contributions
 example-0.040
uth-0.039
2-0.039
emy-0.038
 And-0.038
Tokenbel
Token ID6667
Feature activation-4.82e+00

Pos logit contributions
 towel+0.024
pan+0.022
xc+0.021
 stained+0.020
 tub+0.019
Neg logit contributions
2-0.082
uth-0.082
Suddenly-0.082
ith-0.080
aff-0.079
Tokenieve
Token ID12311
Feature activation-7.99e-02

Pos logit contributions
bel+0.161
xc+0.156
 stained+0.153
pan+0.149
 ripped+0.149
Neg logit contributions
uth-0.112
uf-0.108
 example-0.104
2-0.101
emy-0.097
Token.
Token ID13
Feature activation+8.69e-02

Pos logit contributions
xc+0.003
 swollen+0.003
pan+0.003
 sponge+0.003
 stained+0.003
Neg logit contributions
 example-0.002
 And-0.001
uth-0.001
2-0.001
der-0.001
Token Finally
Token ID9461
Feature activation+3.31e-01

Pos logit contributions
uth+0.003
 example+0.003
2+0.003
nine+0.003
uf+0.003
Neg logit contributions
 stained-0.002
robe-0.002
 ripped-0.002
 towel-0.002
cloth-0.002
Token,
Token ID11
Feature activation+8.96e-01

Pos logit contributions
uth+0.009
uf+0.009
 example+0.008
emy+0.008
2+0.008
Neg logit contributions
bel-0.011
xc-0.010
 stained-0.010
pan-0.009
 ripped-0.009
Token they
Token ID484
Feature activation-2.00e-01

Pos logit contributions
uth+0.025
uf+0.025
emy+0.023
 example+0.023
2+0.022
Neg logit contributions
bel-0.027
xc-0.025
 stained-0.025
pan-0.024
 ripped-0.024
Token stubborn
Token ID24444
Feature activation+9.33e-01

Pos logit contributions
uth+0.011
uf+0.011
 example+0.010
emy+0.010
2+0.010
Neg logit contributions
bel-0.010
xc-0.009
 stained-0.009
pan-0.009
 ripped-0.008
Token.
Token ID13
Feature activation+5.05e-01

Pos logit contributions
uth+0.019
uf+0.018
 example+0.018
der+0.017
2+0.017
Neg logit contributions
bel-0.036
xc-0.034
 stained-0.032
nam-0.031
 vacuum-0.031
Token She
Token ID1375
Feature activation+9.40e-01

Pos logit contributions
uth+0.018
uf+0.017
emy+0.017
 example+0.017
alph+0.017
Neg logit contributions
 stained-0.010
bel-0.010
 ripped-0.009
 vacuum-0.009
cloth-0.009
Token throws
Token ID12542
Feature activation+9.41e-01

Pos logit contributions
uth+0.028
uf+0.027
emy+0.026
 example+0.026
alph+0.025
Neg logit contributions
 stained-0.026
 ripped-0.026
bel-0.025
 messed-0.024
 vacuum-0.023
Token a
Token ID257
Feature activation+6.91e-01

Pos logit contributions
 example+0.023
nine+0.022
2+0.021
uf+0.021
uth+0.021
Neg logit contributions
bel-0.031
xc-0.030
days-0.029
spe-0.027
washer-0.027
Token tant
Token ID24246
Feature activation-4.74e+00

Pos logit contributions
2+0.023
uth+0.022
ith+0.022
uf+0.022
emy+0.022
Neg logit contributions
xc-0.017
bel-0.015
bat-0.014
days-0.013
lights-0.013
Tokenrum
Token ID6582
Feature activation-1.14e-01

Pos logit contributions
bel+0.081
xc+0.073
 stained+0.064
pan+0.063
igator+0.059
Neg logit contributions
uf-0.193
uth-0.193
2-0.183
 example-0.180
emy-0.178
Token.
Token ID13
Feature activation+7.90e-01

Pos logit contributions
bel+0.004
xc+0.004
 stained+0.004
 vacuum+0.004
igator+0.004
Neg logit contributions
der-0.002
uth-0.002
 example-0.002
Then-0.002
uf-0.002
Token She
Token ID1375
Feature activation+7.05e-01

Pos logit contributions
uth+0.025
uf+0.024
emy+0.024
 example+0.023
alph+0.023
Neg logit contributions
bel-0.020
 stained-0.019
 ripped-0.019
pan-0.018
 vacuum-0.017
Token kicks
Token ID17364
Feature activation+8.04e-01

Pos logit contributions
uth+0.020
uf+0.019
 example+0.018
emy+0.018
2+0.017
Neg logit contributions
bel-0.021
 stained-0.020
xc-0.020
 ripped-0.019
pan-0.019
Token and
Token ID290
Feature activation-2.65e-01

Pos logit contributions
uth+0.023
uf+0.022
2+0.022
 example+0.022
emy+0.021
Neg logit contributions
bel-0.024
xc-0.022
 stained-0.021
pan-0.020
 vacuum-0.020
Token red
Token ID2266
Feature activation-3.57e-01

Pos logit contributions
 example+0.003
uf+0.003
uth+0.003
emy+0.003
 But+0.003
Neg logit contributions
xc-0.005
pan-0.005
bel-0.005
mine-0.004
spe-0.004
Token cape
Token ID36023
Feature activation-1.69e+00

Pos logit contributions
xc+0.009
bel+0.009
spe+0.009
 unst+0.009
igator+0.009
Neg logit contributions
 example-0.010
uf-0.009
 zoo-0.009
emy-0.009
uth-0.009
Token and
Token ID290
Feature activation+2.39e-01

Pos logit contributions
xc+0.073
 swollen+0.068
 stained+0.067
 ah+0.067
 sponge+0.066
Neg logit contributions
 example-0.021
 But-0.021
uth-0.019
 And-0.019
Suddenly-0.019
Token make
Token ID787
Feature activation-1.80e+00

Pos logit contributions
 example+0.006
uth+0.006
emy+0.006
 But+0.006
uf+0.006
Neg logit contributions
xc-0.007
pan-0.006
bel-0.006
mine-0.006
igator-0.006
Token-
Token ID12
Feature activation-1.88e+00

Pos logit contributions
xc+0.064
igator+0.063
pan+0.063
bel+0.063
 stained+0.062
Neg logit contributions
 example-0.035
uth-0.035
emy-0.032
2-0.032
uf-0.031
Tokenbel
Token ID6667
Feature activation-4.65e+00

Pos logit contributions
 towel+0.018
xc+0.017
 stained+0.017
pan+0.017
 sponge+0.015
Neg logit contributions
Suddenly-0.084
2-0.084
uth-0.083
Then-0.082
ith-0.082
Tokenieve
Token ID12311
Feature activation-1.46e-01

Pos logit contributions
bel+0.154
xc+0.144
 stained+0.143
 ripped+0.138
pan+0.138
Neg logit contributions
uth-0.117
uf-0.114
emy-0.105
 example-0.103
2-0.102
Token he
Token ID339
Feature activation-1.01e+00

Pos logit contributions
xc+0.006
 swollen+0.005
 sponge+0.005
 ripped+0.005
 vacuum+0.005
Neg logit contributions
 example-0.003
der-0.003
uth-0.003
emy-0.003
 And-0.003
Token had
Token ID550
Feature activation+3.11e-01

Pos logit contributions
xc+0.032
pan+0.031
bel+0.030
 towel+0.030
 sponge+0.029
Neg logit contributions
 example-0.024
ith-0.024
uth-0.024
uf-0.023
2-0.022
Token super
Token ID2208
Feature activation-6.07e-02

Pos logit contributions
uth+0.007
 example+0.007
uf+0.006
emy+0.006
2+0.006
Neg logit contributions
xc-0.011
bel-0.010
pan-0.010
 sponge-0.010
 swollen-0.010
Tokenpowers
Token ID30132
Feature activation-7.79e-01

Pos logit contributions
xc+0.002
pan+0.002
bel+0.002
 stained+0.002
 ripped+0.002
Neg logit contributions
uth-0.001
uf-0.001
 example-0.001
emy-0.001
ith-0.001
Token a
Token ID257
Feature activation-6.53e-01

Pos logit contributions
xc+0.014
bel+0.014
pan+0.013
igator+0.012
 stained+0.012
Neg logit contributions
uth-0.010
 example-0.010
uf-0.010
osa-0.009
2-0.009
Token world
Token ID995
Feature activation+2.19e-02

Pos logit contributions
bel+0.019
xc+0.018
 stained+0.017
pan+0.017
 ripped+0.017
Neg logit contributions
uth-0.019
uf-0.018
 example-0.017
emy-0.017
2-0.017
Token of
Token ID286
Feature activation-2.13e+00

Pos logit contributions
uth+0.000
2+0.000
uf+0.000
 example+0.000
ith+0.000
Neg logit contributions
xc-0.001
bel-0.001
 stained-0.001
 sponge-0.001
 swollen-0.001
Token make
Token ID787
Feature activation-1.70e+00

Pos logit contributions
bel+0.070
xc+0.070
pan+0.065
 stained+0.063
 ripped+0.061
Neg logit contributions
uth-0.054
uf-0.052
 example-0.051
2-0.048
emy-0.048
Token-
Token ID12
Feature activation-1.76e+00

Pos logit contributions
xc+0.059
 sponge+0.057
pan+0.056
bel+0.056
 stained+0.054
Neg logit contributions
uth-0.040
 example-0.038
ith-0.034
2-0.033
uf-0.033
Tokenbel
Token ID6667
Feature activation-4.65e+00

Pos logit contributions
xc+0.025
 towel+0.025
pan+0.024
 stained+0.024
 sponge+0.023
Neg logit contributions
uth-0.074
2-0.071
ith-0.070
 example-0.070
uf-0.069
Tokenieve
Token ID12311
Feature activation-2.14e-02

Pos logit contributions
bel+0.153
xc+0.141
 stained+0.140
 ripped+0.135
pan+0.135
Neg logit contributions
uth-0.120
uf-0.118
emy-0.109
 example-0.105
2-0.103
Token.
Token ID13
Feature activation-6.18e-01

Pos logit contributions
xc+0.001
 swollen+0.001
 sponge+0.001
pan+0.001
 unst+0.001
Neg logit contributions
 example-0.000
uth-0.000
ith-0.000
emy-0.000
osa-0.000
Token\n
Token ID198
Feature activation-3.40e-01

Pos logit contributions
bel+0.016
 stained+0.015
xc+0.015
 ripped+0.015
 vacuum+0.014
Neg logit contributions
uf-0.019
uth-0.019
emy-0.018
 example-0.017
aja-0.017
Token\n
Token ID198
Feature activation-2.71e-01

Pos logit contributions
bel+0.010
 stained+0.009
xc+0.009
 towel+0.009
 ripped+0.009
Neg logit contributions
uf-0.009
 example-0.009
uth-0.009
2-0.008
der-0.008
TokenShe
Token ID3347
Feature activation-5.94e-01

Pos logit contributions
bel+0.010
 stained+0.009
xc+0.009
 ripped+0.009
 vacuum+0.009
Neg logit contributions
uf-0.006
uth-0.006
emy-0.006
 example-0.005
2-0.005
Token he
Token ID339
Feature activation-2.92e-01

Pos logit contributions
bel+0.000
xc+0.000
 stained+0.000
pan+0.000
 ripped+0.000
Neg logit contributions
uth-0.000
uf-0.000
emy-0.000
 example-0.000
alph-0.000
Token was
Token ID373
Feature activation+4.06e-01

Pos logit contributions
bel+0.008
pan+0.008
xc+0.008
nam+0.008
 stained+0.008
Neg logit contributions
uth-0.008
uf-0.008
2-0.008
ith-0.008
emy-0.008
Token playing
Token ID2712
Feature activation-1.98e+00

Pos logit contributions
uth+0.010
uf+0.009
 example+0.009
2+0.009
ith+0.008
Neg logit contributions
bel-0.014
xc-0.013
pan-0.013
 stained-0.012
lights-0.012
Token make
Token ID787
Feature activation-1.57e+00

Pos logit contributions
xc+0.077
 swollen+0.074
bel+0.072
 stained+0.071
nam+0.069
Neg logit contributions
 example-0.040
 And-0.036
 But-0.035
But-0.035
 zoo-0.033
Token-
Token ID12
Feature activation-1.77e+00

Pos logit contributions
bel+0.056
pan+0.056
 stained+0.056
xc+0.054
 swollen+0.053
Neg logit contributions
 example-0.033
uth-0.032
 And-0.031
 Even-0.030
uf-0.030
Tokenbel
Token ID6667
Feature activation-4.55e+00

Pos logit contributions
 towel+0.024
pan+0.024
 stained+0.023
xc+0.022
 sponge+0.022
Neg logit contributions
uth-0.073
2-0.072
Suddenly-0.071
ith-0.070
Then-0.069
Tokenieve
Token ID12311
Feature activation+1.23e-01

Pos logit contributions
bel+0.163
xc+0.158
 stained+0.156
pan+0.153
 ripped+0.152
Neg logit contributions
uth-0.092
uf-0.090
 example-0.088
2-0.085
ith-0.080
Token at
Token ID379
Feature activation-7.45e-01

Pos logit contributions
 example+0.002
uth+0.002
 And+0.002
der+0.002
 than+0.002
Neg logit contributions
xc-0.005
 swollen-0.005
 stained-0.005
pan-0.005
ivery-0.005
Token home
Token ID1363
Feature activation+5.34e-01

Pos logit contributions
bel+0.022
xc+0.020
 stained+0.019
pan+0.019
 ripped+0.018
Neg logit contributions
 example-0.021
uth-0.021
uf-0.021
2-0.020
emy-0.019
Token,
Token ID11
Feature activation-1.19e-01

Pos logit contributions
 example+0.011
uth+0.010
uf+0.010
2+0.009
 And+0.009
Neg logit contributions
xc-0.021
 stained-0.020
pan-0.019
 ripped-0.019
 swollen-0.019
Token too
Token ID1165
Feature activation+8.29e-01

Pos logit contributions
xc+0.005
bel+0.005
 stained+0.005
pan+0.005
 ripped+0.004
Neg logit contributions
uth-0.002
 example-0.002
uf-0.002
 Even-0.002
emy-0.002
Token his
Token ID465
Feature activation+6.85e-01

Pos logit contributions
bel+0.007
xc+0.007
 stained+0.007
bat+0.007
 swollen+0.007
Neg logit contributions
 example-0.010
uth-0.009
ith-0.009
uf-0.009
2-0.009
Token feet
Token ID3625
Feature activation-5.31e-01

Pos logit contributions
uth+0.018
uf+0.017
emy+0.016
 example+0.016
osa+0.016
Neg logit contributions
bel-0.022
xc-0.022
 stained-0.021
iddled-0.021
 messed-0.020
Token and
Token ID290
Feature activation+8.16e-02

Pos logit contributions
xc+0.024
bel+0.024
 stained+0.023
 ah+0.022
nam+0.022
Neg logit contributions
2-0.005
 example-0.005
Suddenly-0.005
uth-0.005
uf-0.005
Token threw
Token ID9617
Feature activation+3.07e-01

Pos logit contributions
emy+0.002
uf+0.002
 Then+0.002
uth+0.002
Suddenly+0.002
Neg logit contributions
xc-0.003
days-0.003
bel-0.002
igator-0.002
mine-0.002
Token a
Token ID257
Feature activation+5.75e-01

Pos logit contributions
 example+0.007
nine+0.007
 But+0.006
uf+0.006
 Then+0.006
Neg logit contributions
bel-0.011
days-0.011
xc-0.010
washer-0.010
ci-0.009
Token tant
Token ID24246
Feature activation-4.54e+00

Pos logit contributions
2+0.014
emy+0.014
ith+0.013
uth+0.013
affe+0.013
Neg logit contributions
xc-0.019
days-0.017
bel-0.017
ci-0.017
ics-0.016
Tokenrum
Token ID6582
Feature activation-4.88e-01

Pos logit contributions
bel+0.078
xc+0.074
pan+0.063
 stained+0.063
 ripped+0.058
Neg logit contributions
uth-0.183
uf-0.181
2-0.173
 example-0.170
emy-0.168
Token.
Token ID13
Feature activation+1.61e-01

Pos logit contributions
bel+0.020
xc+0.019
 vacuum+0.019
 stained+0.019
nam+0.019
Neg logit contributions
der-0.009
uth-0.007
 Then-0.007
 And-0.007
:-0.007
Token\n
Token ID198
Feature activation+2.19e-02

Pos logit contributions
uth+0.004
uf+0.003
 example+0.003
emy+0.003
2+0.003
Neg logit contributions
bel-0.006
 stained-0.005
xc-0.005
 ripped-0.005
pan-0.005
Token\n
Token ID198
Feature activation+1.55e-01

Pos logit contributions
uth+0.000
uf+0.000
 example+0.000
ith+0.000
2+0.000
Neg logit contributions
bel-0.001
xc-0.001
 stained-0.001
 ripped-0.001
 towel-0.001
Token"
Token ID1
Feature activation-4.88e-01

Pos logit contributions
uth+0.004
uf+0.004
 example+0.004
emy+0.004
2+0.004
Neg logit contributions
bel-0.005
xc-0.004
 stained-0.004
 ripped-0.004
pan-0.004
Token temper
Token ID4124
Feature activation+8.48e-01

Pos logit contributions
uth+0.032
 example+0.031
uf+0.031
ith+0.029
emy+0.029
Neg logit contributions
xc-0.039
bel-0.038
 stained-0.037
nam-0.036
 swollen-0.035
Token and
Token ID290
Feature activation-5.60e-03

Pos logit contributions
 example+0.014
der+0.013
ith+0.013
uf+0.013
uth+0.012
Neg logit contributions
xc-0.036
nam-0.031
bel-0.031
 swollen-0.030
 stained-0.030
Token often
Token ID1690
Feature activation-1.32e-01

Pos logit contributions
xc+0.000
bel+0.000
pan+0.000
bat+0.000
 unst+0.000
Neg logit contributions
uth-0.000
 example-0.000
 Even-0.000
emy-0.000
uf-0.000
Token threw
Token ID9617
Feature activation+4.44e-01

Pos logit contributions
xc+0.005
bel+0.004
pan+0.004
lights+0.004
 swollen+0.004
Neg logit contributions
uth-0.003
uf-0.003
ith-0.003
 example-0.003
 knows-0.003
Token temper
Token ID4124
Feature activation+4.30e-01

Pos logit contributions
 example+0.011
2+0.010
ith+0.010
uth+0.010
emy+0.010
Neg logit contributions
bel-0.014
xc-0.014
pan-0.013
days-0.013
 stained-0.012
Token tant
Token ID24246
Feature activation-4.53e+00

Pos logit contributions
 example+0.011
ith+0.011
2+0.011
uth+0.011
emy+0.011
Neg logit contributions
xc-0.013
 stained-0.012
pan-0.011
 swollen-0.011
bel-0.011
Tokenrums
Token ID45241
Feature activation+3.52e-01

Pos logit contributions
bel+0.082
xc+0.078
 stained+0.066
pan+0.065
 swollen+0.061
Neg logit contributions
uf-0.177
uth-0.176
2-0.168
 example-0.165
emy-0.163
Token.
Token ID13
Feature activation-4.98e-01

Pos logit contributions
der+0.005
2+0.005
uf+0.005
uth+0.005
 example+0.005
Neg logit contributions
xc-0.014
bel-0.014
 stained-0.014
 swollen-0.013
 towel-0.013
Token One
Token ID1881
Feature activation+1.07e+00

Pos logit contributions
bel+0.018
xc+0.017
 stained+0.017
 ripped+0.016
pan+0.016
Neg logit contributions
uth-0.011
uf-0.010
emy-0.010
 example-0.009
ith-0.009
Token day
Token ID1110
Feature activation-6.53e-01

Pos logit contributions
 example+0.016
uth+0.016
uf+0.015
emy+0.015
ith+0.014
Neg logit contributions
bel-0.046
xc-0.045
pan-0.044
nam-0.042
 stained-0.042
Token,
Token ID11
Feature activation+5.35e-03

Pos logit contributions
xc+0.024
bel+0.024
 swollen+0.023
 stained+0.023
pan+0.022
Neg logit contributions
uth-0.014
uf-0.013
 example-0.013
Suddenly-0.012
 By-0.012
Token are
Token ID389
Feature activation-5.63e-01

Pos logit contributions
bel+0.015
 swollen+0.014
lights+0.014
nam+0.014
bat+0.014
Neg logit contributions
 knows-0.007
uf-0.006
 also-0.006
 And-0.006
orry-0.006
Token you
Token ID345
Feature activation-4.51e-01

Pos logit contributions
bel+0.022
pan+0.019
 tant+0.018
iddled+0.018
 vacuum+0.018
Neg logit contributions
uf-0.013
uth-0.012
 example-0.011
emy-0.011
alph-0.010
Token?"
Token ID1701
Feature activation+1.35e+00

Pos logit contributions
bel+0.015
xc+0.013
nam+0.012
 tant+0.012
lights+0.012
Neg logit contributions
uf-0.013
uth-0.012
 example-0.012
emy-0.011
2-0.011
Token their
Token ID511
Feature activation-3.15e+00

Pos logit contributions
uth+0.038
uf+0.038
emy+0.036
 But+0.036
 example+0.034
Neg logit contributions
xc-0.041
bel-0.041
pan-0.038
 stained-0.036
 vacuum-0.035
Token n
Token ID299
Feature activation-1.25e+00

Pos logit contributions
xc+0.104
bel+0.100
 unst+0.098
pan+0.098
boat+0.097
Neg logit contributions
uth-0.076
 example-0.075
uf-0.073
emy-0.069
ith-0.066
Tokenanny
Token ID7737
Feature activation-4.48e+00

Pos logit contributions
bel+0.041
xc+0.039
 stained+0.038
 ripped+0.037
pan+0.037
Neg logit contributions
uth-0.032
uf-0.031
emy-0.028
 example-0.028
2-0.027
Token shouted
Token ID17293
Feature activation-7.32e-01

Pos logit contributions
bel+0.152
xc+0.133
 unst+0.131
pan+0.125
nam+0.124
Neg logit contributions
uf-0.118
uth-0.115
ith-0.115
2-0.112
avi-0.104
Token.
Token ID13
Feature activation-4.35e-01

Pos logit contributions
bel+0.027
xc+0.026
 stained+0.025
igator+0.023
pan+0.023
Neg logit contributions
uf-0.016
uth-0.015
 example-0.015
2-0.014
ith-0.014
Token She
Token ID1375
Feature activation-5.18e-02

Pos logit contributions
bel+0.019
xc+0.018
 stained+0.018
 ripped+0.017
pan+0.017
Neg logit contributions
uth-0.006
uf-0.006
emy-0.005
2-0.005
 example-0.005
Token saw
Token ID2497
Feature activation-6.11e-01

Pos logit contributions
bel+0.001
xc+0.001
days+0.001
pan+0.001
nam+0.001
Neg logit contributions
uth-0.001
uf-0.001
ith-0.001
emy-0.001
2-0.001
Token them
Token ID606
Feature activation-8.78e-02

Pos logit contributions
xc+0.020
bel+0.019
days+0.017
 unst+0.017
bat+0.017
Neg logit contributions
uf-0.015
 example-0.014
uth-0.014
der-0.014
emy-0.014
Token and
Token ID290
Feature activation+2.46e-01

Pos logit contributions
xc+0.017
bel+0.015
 stained+0.015
igator+0.015
pan+0.015
Neg logit contributions
 example-0.007
2-0.007
 And-0.007
 Then-0.006
 But-0.006
Token helped
Token ID4193
Feature activation-5.47e-01

Pos logit contributions
 example+0.006
 Even+0.006
uth+0.006
2+0.005
 Then+0.005
Neg logit contributions
xc-0.008
bel-0.007
pan-0.007
igator-0.007
 stained-0.007
Token Tim
Token ID5045
Feature activation-2.04e-01

Pos logit contributions
 swollen+0.016
bel+0.016
 stained+0.016
pan+0.015
robe+0.015
Neg logit contributions
uf-0.014
2-0.014
uth-0.014
 But-0.014
 example-0.014
Token make
Token ID787
Feature activation-1.45e+00

Pos logit contributions
xc+0.006
 swollen+0.005
 stained+0.005
bel+0.005
 ah+0.005
Neg logit contributions
2-0.005
 example-0.005
ith-0.005
Suddenly-0.005
 But-0.005
Token the
Token ID262
Feature activation-3.02e-01

Pos logit contributions
bel+0.046
igator+0.045
xc+0.044
 stained+0.043
pan+0.043
Neg logit contributions
 example-0.036
2-0.034
uth-0.032
uf-0.031
ith-0.030
Token aer
Token ID9551
Feature activation-4.43e+00

Pos logit contributions
bel+0.009
xc+0.008
pan+0.008
 stained+0.008
igator+0.008
Neg logit contributions
 example-0.008
uth-0.008
2-0.008
uf-0.008
 zoo-0.008
Tokenopl
Token ID20106
Feature activation-2.43e+00

Pos logit contributions
bel+0.153
xc+0.138
 stained+0.134
pan+0.129
 ripped+0.129
Neg logit contributions
uth-0.111
uf-0.108
emy-0.100
alph-0.098
 example-0.097
Tokenane
Token ID1531
Feature activation-1.93e+00

Pos logit contributions
bel+0.056
xc+0.050
 stained+0.048
pan+0.046
 ripped+0.045
Neg logit contributions
uth-0.088
uf-0.086
emy-0.082
 example-0.081
2-0.078
Token look
Token ID804
Feature activation+4.04e-01

Pos logit contributions
xc+0.067
 swollen+0.059
 stained+0.059
 messed+0.059
 streamed+0.058
Neg logit contributions
 example-0.041
2-0.038
uth-0.037
ith-0.036
Suddenly-0.035
Token better
Token ID1365
Feature activation+6.85e-02

Pos logit contributions
 example+0.006
Suddenly+0.006
ith+0.005
2+0.005
 But+0.005
Neg logit contributions
xc-0.017
pan-0.016
 swollen-0.015
 elbows-0.015
bel-0.015
Token.
Token ID13
Feature activation-2.79e-01

Pos logit contributions
uth+0.002
uf+0.002
 example+0.002
emy+0.002
2+0.002
Neg logit contributions
bel-0.002
xc-0.002
 stained-0.002
pan-0.002
 ripped-0.002
Token liked
Token ID8288
Feature activation+3.93e-01

Pos logit contributions
uth+0.001
uf+0.000
 example+0.000
2+0.000
emy+0.000
Neg logit contributions
bel-0.001
xc-0.001
pan-0.001
 stained-0.001
 towel-0.001
Token to
Token ID284
Feature activation-1.06e-01

Pos logit contributions
uth+0.010
uf+0.009
 example+0.009
2+0.009
emy+0.009
Neg logit contributions
bel-0.013
xc-0.013
 stained-0.012
pan-0.012
 ripped-0.012
Token play
Token ID711
Feature activation-2.33e+00

Pos logit contributions
bel+0.003
xc+0.003
 swollen+0.003
pan+0.003
 stained+0.003
Neg logit contributions
 example-0.003
uf-0.002
uth-0.002
2-0.002
 But-0.002
Token make
Token ID787
Feature activation-2.11e+00

Pos logit contributions
bel+0.061
xc+0.058
 stained+0.055
pan+0.053
 ripped+0.053
Neg logit contributions
uth-0.072
uf-0.071
 example-0.069
emy-0.068
2-0.067
Token-
Token ID12
Feature activation-1.54e+00

Pos logit contributions
bel+0.063
xc+0.059
pan+0.058
 stained+0.058
 ripped+0.057
Neg logit contributions
uth-0.058
 example-0.056
uf-0.056
emy-0.055
2-0.054
Tokenbel
Token ID6667
Feature activation-4.43e+00

Pos logit contributions
xc+0.016
pan+0.015
 stained+0.015
 towel+0.014
 ripped+0.013
Neg logit contributions
uth-0.068
2-0.067
 example-0.066
ith-0.065
emy-0.065
Tokenieve
Token ID12311
Feature activation-5.23e-02

Pos logit contributions
bel+0.137
xc+0.124
 stained+0.123
 ripped+0.118
pan+0.117
Neg logit contributions
uth-0.127
uf-0.124
emy-0.116
 example-0.109
alph-0.108
Token.
Token ID13
Feature activation-1.20e-01

Pos logit contributions
xc+0.002
 swollen+0.002
pan+0.002
 sponge+0.002
 unst+0.002
Neg logit contributions
 example-0.001
uth-0.001
ith-0.001
2-0.001
 And-0.001
Token\n
Token ID198
Feature activation-6.15e-01

Pos logit contributions
 Emma+0.003
 down+0.003
 stained+0.003
mine+0.003
 asleep+0.003
Neg logit contributions
emy-0.003
uth-0.003
uf-0.003
aja-0.003
2-0.003
Token\n
Token ID198
Feature activation-3.04e-01

Pos logit contributions
bel+0.017
xc+0.015
days+0.015
 stained+0.014
mine+0.014
Neg logit contributions
uf-0.019
 example-0.018
uth-0.017
der-0.017
emy-0.016
TokenOne
Token ID3198
Feature activation+7.77e-01

Pos logit contributions
bel+0.010
 stained+0.010
xc+0.010
 ripped+0.010
mine+0.009
Neg logit contributions
uf-0.007
uth-0.006
emy-0.006
 example-0.006
aja-0.006
Token she
Token ID673
Feature activation-1.63e-01

Pos logit contributions
bel+0.041
xc+0.039
 stained+0.038
 swollen+0.038
pan+0.037
Neg logit contributions
uth-0.021
uf-0.020
 example-0.018
 But-0.018
2-0.018
Token was
Token ID373
Feature activation+2.07e-01

Pos logit contributions
pan+0.005
bel+0.005
nam+0.005
 swollen+0.005
xc+0.005
Neg logit contributions
uth-0.004
ith-0.004
uf-0.004
Suddenly-0.003
alph-0.003
Token walking
Token ID6155
Feature activation+1.53e-01

Pos logit contributions
uth+0.006
uf+0.005
 example+0.005
2+0.005
ith+0.005
Neg logit contributions
bel-0.006
xc-0.006
 stained-0.006
pan-0.006
lights-0.006
Token in
Token ID287
Feature activation+4.06e-01

Pos logit contributions
 example+0.004
uth+0.004
2+0.004
Suddenly+0.004
Then+0.004
Neg logit contributions
 stained-0.005
 swollen-0.004
 ripped-0.004
xc-0.004
bel-0.004
Token the
Token ID262
Feature activation-1.37e-01

Pos logit contributions
uth+0.012
 example+0.012
uf+0.012
2+0.011
emy+0.011
Neg logit contributions
bel-0.011
xc-0.011
 stained-0.010
pan-0.010
 ripped-0.010
Token woods
Token ID16479
Feature activation+2.28e+00

Pos logit contributions
bel+0.005
 stained+0.005
xc+0.005
 ripped+0.005
 vacuum+0.005
Neg logit contributions
uth-0.003
uf-0.003
emy-0.002
 example-0.002
2-0.002
Token when
Token ID618
Feature activation-1.46e-01

Pos logit contributions
uth+0.056
uf+0.054
 example+0.052
2+0.050
emy+0.050
Neg logit contributions
bel-0.074
xc-0.072
 stained-0.070
 ripped-0.068
pan-0.067
Token she
Token ID673
Feature activation-1.11e+00

Pos logit contributions
bel+0.004
xc+0.004
 stained+0.004
pan+0.004
 vacuum+0.004
Neg logit contributions
uth-0.005
uf-0.004
 example-0.004
2-0.004
emy-0.004
Token saw
Token ID2497
Feature activation-5.31e-01

Pos logit contributions
bel+0.028
xc+0.027
 stained+0.026
pan+0.025
 ripped+0.024
Neg logit contributions
uth-0.036
uf-0.035
2-0.033
emy-0.033
 example-0.032
Token someone
Token ID2130
Feature activation-6.63e-01

Pos logit contributions
bel+0.018
xc+0.017
 stained+0.016
pan+0.016
 vacuum+0.016
Neg logit contributions
uth-0.013
uf-0.012
 example-0.012
emy-0.012
2-0.011
Token in
Token ID287
Feature activation+2.44e-01

Pos logit contributions
bel+0.022
xc+0.022
pan+0.021
 stained+0.020
 sponge+0.020
Neg logit contributions
uf-0.015
uth-0.015
2-0.014
ith-0.014
emy-0.013
Token hopped
Token ID45230
Feature activation+1.13e+00

Pos logit contributions
bel+0.000
xc+0.000
pan+0.000
 stained+0.000
 swollen+0.000
Neg logit contributions
uth-0.000
uf-0.000
emy-0.000
 example-0.000
2-0.000
Token and
Token ID290
Feature activation-6.90e-01

Pos logit contributions
uth+0.037
uf+0.036
 example+0.035
emy+0.034
2+0.034
Neg logit contributions
bel-0.029
xc-0.027
 stained-0.026
 ripped-0.025
pan-0.025
Token skipped
Token ID26684
Feature activation+1.13e-01

Pos logit contributions
bel+0.022
pan+0.022
xc+0.021
 stained+0.021
lights+0.020
Neg logit contributions
uth-0.016
emy-0.016
2-0.015
Suddenly-0.015
Then-0.015
Token through
Token ID832
Feature activation-4.07e-01

Pos logit contributions
uf+0.004
uth+0.004
emy+0.004
2+0.003
 example+0.003
Neg logit contributions
bel-0.003
xc-0.002
 stained-0.002
pan-0.002
 ripped-0.002
Token the
Token ID262
Feature activation-7.53e-01

Pos logit contributions
bel+0.011
xc+0.010
pan+0.010
 stained+0.009
 ripped+0.009
Neg logit contributions
uth-0.013
uf-0.013
 example-0.012
2-0.012
emy-0.012
Token jungle
Token ID20712
Feature activation+2.02e+00

Pos logit contributions
 stained+0.022
bel+0.022
 muddy+0.022
 ripped+0.021
xc+0.021
Neg logit contributions
uth-0.020
uf-0.020
 example-0.018
emy-0.018
der-0.018
Token,
Token ID11
Feature activation+6.19e-01

Pos logit contributions
uf+0.072
uth+0.071
emy+0.067
alph+0.063
 example+0.062
Neg logit contributions
bel-0.050
 stained-0.040
 ripped-0.040
 vacuum-0.038
pan-0.037
Token looking
Token ID2045
Feature activation+9.10e-01

Pos logit contributions
uf+0.019
uth+0.019
emy+0.017
 example+0.016
aja+0.016
Neg logit contributions
bel-0.018
 ripped-0.016
 stained-0.016
xc-0.016
 down-0.015
Token for
Token ID329
Feature activation-5.46e-01

Pos logit contributions
uth+0.021
uf+0.020
 example+0.019
emy+0.019
2+0.019
Neg logit contributions
bel-0.031
xc-0.030
 stained-0.030
pan-0.029
 ripped-0.029
Token the
Token ID262
Feature activation-4.24e-01

Pos logit contributions
bel+0.014
 stained+0.013
xc+0.013
 ripped+0.013
 towel+0.012
Neg logit contributions
uth-0.017
uf-0.017
emy-0.016
2-0.016
 example-0.016
Token special
Token ID2041
Feature activation-4.95e-01

Pos logit contributions
bel+0.014
xc+0.013
 stained+0.013
 ripped+0.012
pan+0.012
Neg logit contributions
uth-0.011
uf-0.011
emy-0.010
 example-0.010
2-0.010
Token bird
Token ID6512
Feature activation+6.23e-01

Pos logit contributions
uth+0.001
uf+0.001
emy+0.001
 example+0.001
2+0.001
Neg logit contributions
bel-0.001
 stained-0.001
xc-0.001
 ripped-0.001
pan-0.001
Token who
Token ID508
Feature activation+1.30e+00

Pos logit contributions
uth+0.015
uf+0.014
emy+0.013
 example+0.013
2+0.013
Neg logit contributions
bel-0.022
xc-0.020
 stained-0.020
pan-0.019
 ripped-0.019
Token lived
Token ID5615
Feature activation-3.39e-01

Pos logit contributions
uth+0.032
uf+0.030
ith+0.030
emy+0.029
 Even+0.028
Neg logit contributions
xc-0.043
bel-0.041
pan-0.041
 towel-0.038
 sponge-0.037
Token in
Token ID287
Feature activation+5.22e-01

Pos logit contributions
bel+0.009
xc+0.008
 stained+0.008
pan+0.008
 vacuum+0.008
Neg logit contributions
uth-0.010
uf-0.010
emy-0.010
 example-0.010
2-0.009
Token the
Token ID262
Feature activation-4.30e-01

Pos logit contributions
uth+0.014
uf+0.014
 example+0.013
emy+0.013
2+0.013
Neg logit contributions
bel-0.016
xc-0.015
 stained-0.014
pan-0.014
 ripped-0.014
Token forest
Token ID8222
Feature activation+2.76e+00

Pos logit contributions
bel+0.016
xc+0.015
 stained+0.015
pan+0.015
 ripped+0.015
Neg logit contributions
uth-0.009
uf-0.009
emy-0.008
 example-0.008
2-0.007
Token.
Token ID13
Feature activation-3.92e-04

Pos logit contributions
uth+0.065
uf+0.059
emy+0.057
 example+0.057
2+0.056
Neg logit contributions
xc-0.096
bel-0.092
pan-0.088
 stained-0.087
 vacuum-0.086
Token Every
Token ID3887
Feature activation-1.91e-01

Pos logit contributions
bel+0.000
 stained+0.000
xc+0.000
 ripped+0.000
igator+0.000
Neg logit contributions
uth-0.000
uf-0.000
 example-0.000
emy-0.000
ith-0.000
Token day
Token ID1110
Feature activation-1.05e+00

Pos logit contributions
bel+0.006
 stained+0.005
xc+0.005
pan+0.005
 ripped+0.005
Neg logit contributions
uth-0.006
uf-0.005
2-0.005
emy-0.005
 example-0.005
Token it
Token ID340
Feature activation+4.46e-02

Pos logit contributions
bel+0.038
xc+0.037
pan+0.034
 stained+0.034
 vacuum+0.033
Neg logit contributions
uth-0.024
uf-0.023
2-0.022
 example-0.021
 But-0.021
Token liked
Token ID8288
Feature activation+8.95e-01

Pos logit contributions
uth+0.001
emy+0.001
ith+0.001
Then+0.001
Suddenly+0.001
Neg logit contributions
xc-0.001
nam-0.001
bel-0.001
cloth-0.001
pan-0.001
Token\n
Token ID198
Feature activation+8.80e-02

Pos logit contributions
uth+0.001
uf+0.001
 example+0.001
2+0.001
emy+0.001
Neg logit contributions
bel-0.003
xc-0.003
 stained-0.002
 ripped-0.002
pan-0.002
TokenWhile
Token ID3633
Feature activation-3.73e-02

Pos logit contributions
uth+0.002
uf+0.002
emy+0.002
 example+0.002
2+0.002
Neg logit contributions
bel-0.003
xc-0.003
 stained-0.003
pan-0.003
 ripped-0.003
Token swimming
Token ID14899
Feature activation-2.73e-01

Pos logit contributions
bel+0.001
xc+0.001
pan+0.001
 stained+0.001
 unst+0.001
Neg logit contributions
uth-0.001
 example-0.001
 By-0.001
 zoo-0.001
emy-0.001
Token,
Token ID11
Feature activation+2.92e-01

Pos logit contributions
nam+0.011
 swollen+0.011
xc+0.011
 stained+0.011
bat+0.010
Neg logit contributions
 But-0.005
 example-0.004
 And-0.004
:-0.004
 Even-0.004
Token the
Token ID262
Feature activation-3.18e-02

Pos logit contributions
uth+0.009
uf+0.008
emy+0.008
 example+0.008
2+0.007
Neg logit contributions
bel-0.009
xc-0.008
 stained-0.008
pan-0.008
 ripped-0.007
Token mole
Token ID9411
Feature activation+2.24e+00

Pos logit contributions
bel+0.001
xc+0.001
pan+0.001
 stained+0.001
 unst+0.001
Neg logit contributions
uth-0.001
uf-0.001
emy-0.001
 example-0.001
2-0.001
Token met
Token ID1138
Feature activation+8.12e-01

Pos logit contributions
uf+0.058
uth+0.058
emy+0.057
ith+0.057
2+0.052
Neg logit contributions
bel-0.071
xc-0.068
nam-0.067
pan-0.063
 unst-0.060
Token a
Token ID257
Feature activation+2.46e-01

Pos logit contributions
uth+0.022
uf+0.021
 example+0.020
emy+0.020
2+0.019
Neg logit contributions
bel-0.027
 stained-0.024
xc-0.024
pan-0.024
 vacuum-0.023
Token kind
Token ID1611
Feature activation+1.64e+00

Pos logit contributions
uth+0.007
uf+0.007
 example+0.006
emy+0.006
2+0.006
Neg logit contributions
bel-0.007
 stained-0.007
xc-0.006
 ripped-0.006
pan-0.006
Token duck
Token ID22045
Feature activation+1.05e+00

Pos logit contributions
uth+0.050
uf+0.047
Then+0.045
emy+0.044
 example+0.044
Neg logit contributions
bel-0.043
 stained-0.040
 towel-0.040
pan-0.040
igator-0.040
Token.
Token ID13
Feature activation-4.24e-01

Pos logit contributions
uth+0.025
uf+0.024
emy+0.022
 example+0.022
2+0.021
Neg logit contributions
bel-0.036
xc-0.034
 stained-0.034
 ripped-0.033
pan-0.033
Token who
Token ID508
Feature activation+1.78e+00

Pos logit contributions
uth+0.065
uf+0.062
 example+0.060
2+0.059
emy+0.057
Neg logit contributions
bel-0.070
 stained-0.064
 ripped-0.062
xc-0.061
pan-0.060
Token lived
Token ID5615
Feature activation-3.82e-01

Pos logit contributions
uth+0.045
uf+0.044
emy+0.042
ith+0.040
 example+0.040
Neg logit contributions
bel-0.057
xc-0.057
pan-0.055
 stained-0.052
 towel-0.051
Token in
Token ID287
Feature activation+4.42e-01

Pos logit contributions
bel+0.009
xc+0.009
 stained+0.008
pan+0.008
 ripped+0.008
Neg logit contributions
uth-0.013
uf-0.013
emy-0.012
 example-0.012
2-0.012
Token a
Token ID257
Feature activation-2.81e-01

Pos logit contributions
uth+0.014
uf+0.014
emy+0.013
2+0.013
 example+0.013
Neg logit contributions
bel-0.012
 stained-0.011
xc-0.011
 ripped-0.010
pan-0.010
Token big
Token ID1263
Feature activation-9.81e-01

Pos logit contributions
bel+0.006
 stained+0.006
 ripped+0.006
xc+0.006
 towel+0.006
Neg logit contributions
uth-0.010
uf-0.009
 example-0.009
2-0.009
nine-0.009
Token forest
Token ID8222
Feature activation+2.49e+00

Pos logit contributions
bel+0.033
 stained+0.031
xc+0.031
 ripped+0.030
pan+0.030
Neg logit contributions
uth-0.023
uf-0.023
 example-0.021
emy-0.020
2-0.020
Token.
Token ID13
Feature activation-2.77e-01

Pos logit contributions
uth+0.071
uf+0.069
 example+0.064
emy+0.064
2+0.063
Neg logit contributions
bel-0.073
xc-0.070
 stained-0.068
pan-0.066
 ripped-0.065
Token Every
Token ID3887
Feature activation-4.53e-02

Pos logit contributions
bel+0.008
 stained+0.008
 ripped+0.007
xc+0.007
igator+0.007
Neg logit contributions
uth-0.008
 example-0.007
uf-0.007
emy-0.007
ith-0.007
Token day
Token ID1110
Feature activation-1.04e+00

Pos logit contributions
bel+0.001
 stained+0.001
igator+0.001
 towel+0.001
xc+0.001
Neg logit contributions
uth-0.002
uf-0.002
2-0.002
nine-0.001
aff-0.001
Token,
Token ID11
Feature activation+3.47e-01

Pos logit contributions
bel+0.035
xc+0.033
 stained+0.031
pan+0.031
 vacuum+0.030
Neg logit contributions
uth-0.026
uf-0.026
emy-0.024
 example-0.024
2-0.023
Token the
Token ID262
Feature activation+1.21e-01

Pos logit contributions
uth+0.009
 But+0.008
uf+0.008
2+0.008
emy+0.008
Neg logit contributions
xc-0.011
bel-0.011
pan-0.010
 stained-0.010
 vacuum-0.010
Token she
Token ID673
Feature activation-3.23e-01

Pos logit contributions
xc+0.032
bel+0.031
pan+0.030
 tant+0.028
 stained+0.028
Neg logit contributions
 example-0.017
 But-0.017
uth-0.016
 Even-0.016
 By-0.016
Token jumped
Token ID11687
Feature activation+4.80e-01

Pos logit contributions
bel+0.011
xc+0.011
pan+0.010
lights+0.010
packs+0.010
Neg logit contributions
uth-0.008
ith-0.007
uf-0.007
Suddenly-0.007
Just-0.007
Token up
Token ID510
Feature activation+5.40e-01

Pos logit contributions
 example+0.010
 But+0.009
:+0.009
Suddenly+0.009
2+0.008
Neg logit contributions
nam-0.017
bel-0.017
 swollen-0.016
xc-0.016
days-0.016
Token and
Token ID290
Feature activation-6.97e-01

Pos logit contributions
 example+0.005
uth+0.004
 But+0.003
2+0.003
 By+0.003
Neg logit contributions
xc-0.026
bel-0.025
igator-0.024
 stained-0.024
cher-0.024
Token down
Token ID866
Feature activation-1.25e+00

Pos logit contributions
xc+0.018
bel+0.018
pan+0.018
lights+0.017
ivery+0.017
Neg logit contributions
 But-0.020
 Then-0.019
Suddenly-0.018
2-0.018
 Just-0.018
Token!
Token ID0
Feature activation-4.88e-01

Pos logit contributions
xc+0.051
igator+0.045
nam+0.045
 stained+0.045
bel+0.045
Neg logit contributions
 example-0.025
 By-0.021
 But-0.020
2-0.020
uth-0.019
Token\n
Token ID198
Feature activation-2.45e-01

Pos logit contributions
bel+0.014
xc+0.013
 stained+0.013
 ripped+0.013
 vacuum+0.013
Neg logit contributions
uth-0.013
emy-0.013
uf-0.013
 example-0.012
ith-0.012
Token\n
Token ID198
Feature activation-1.29e-01

Pos logit contributions
bel+0.009
xc+0.008
 stained+0.008
 ripped+0.008
 towel+0.008
Neg logit contributions
uth-0.005
uf-0.005
 example-0.005
emy-0.005
2-0.004
TokenThe
Token ID464
Feature activation+7.14e-01

Pos logit contributions
bel+0.005
xc+0.004
 stained+0.004
 ripped+0.004
pan+0.004
Neg logit contributions
uth-0.003
uf-0.003
 example-0.003
emy-0.003
2-0.003
Token mom
Token ID1995
Feature activation-1.08e+00

Pos logit contributions
 example+0.019
uth+0.019
uf+0.018
2+0.017
ith+0.016
Neg logit contributions
bel-0.023
xc-0.022
pan-0.020
 stained-0.020
igator-0.019
Token told
Token ID1297
Feature activation+8.41e-01

Pos logit contributions
bel+0.033
nam+0.031
pan+0.031
xc+0.030
 stained+0.029
Neg logit contributions
uth-0.030
2-0.029
ith-0.029
uf-0.028
Suddenly-0.027
Token adventurous
Token ID39770
Feature activation+5.50e-01

Pos logit contributions
igator+0.018
 vacuum+0.017
 stained+0.016
 unst+0.016
xc+0.016
Neg logit contributions
uth-0.025
emy-0.023
uf-0.022
Then-0.021
aja-0.021
Token girl
Token ID2576
Feature activation+6.80e-02

Pos logit contributions
uth+0.022
uf+0.022
emy+0.020
Then+0.020
Eventually+0.020
Neg logit contributions
 vacuum-0.009
 towel-0.009
 stained-0.009
 ripped-0.009
bel-0.009
Token.
Token ID13
Feature activation-1.25e-01

Pos logit contributions
uth+0.002
uf+0.002
emy+0.002
nine+0.002
alph+0.002
Neg logit contributions
bel-0.001
 vacuum-0.001
 stained-0.001
 down-0.001
 ripped-0.001
Token She
Token ID1375
Feature activation+1.26e-01

Pos logit contributions
bel+0.003
 stained+0.003
 ripped+0.003
 vacuum+0.003
 down+0.003
Neg logit contributions
uf-0.004
uth-0.004
emy-0.004
alph-0.004
aja-0.004
Token loved
Token ID6151
Feature activation+1.18e+00

Pos logit contributions
uth+0.004
uf+0.003
emy+0.003
 example+0.003
2+0.003
Neg logit contributions
bel-0.004
xc-0.004
 stained-0.003
pan-0.003
 ripped-0.003
Token to
Token ID284
Feature activation+6.99e-01

Pos logit contributions
uth+0.042
uf+0.042
emy+0.038
2+0.037
 example+0.037
Neg logit contributions
bel-0.028
 stained-0.025
xc-0.024
 ripped-0.023
 vacuum-0.023
Token explore
Token ID7301
Feature activation+2.40e-01

Pos logit contributions
uth+0.021
uf+0.020
 example+0.020
2+0.019
emy+0.019
Neg logit contributions
bel-0.019
xc-0.018
 stained-0.017
pan-0.017
 ripped-0.017
Token and
Token ID290
Feature activation-3.17e-01

Pos logit contributions
uth+0.008
uf+0.007
 example+0.007
emy+0.007
2+0.007
Neg logit contributions
bel-0.006
xc-0.006
 stained-0.006
pan-0.006
 ripped-0.006
Token discover
Token ID7073
Feature activation-5.12e-01

Pos logit contributions
bel+0.008
 stained+0.007
nam+0.007
 ripped+0.007
 vacuum+0.007
Neg logit contributions
uf-0.011
uth-0.011
emy-0.010
2-0.009
alph-0.009
Token new
Token ID649
Feature activation-1.18e+00

Pos logit contributions
bel+0.016
xc+0.015
 stained+0.015
 ripped+0.015
pan+0.015
Neg logit contributions
uth-0.013
uf-0.013
emy-0.012
 example-0.012
2-0.012
Token things
Token ID1243
Feature activation+5.57e-01

Pos logit contributions
bel+0.028
 stained+0.024
xc+0.023
 towel+0.022
 ripped+0.022
Neg logit contributions
uf-0.044
uth-0.042
emy-0.038
2-0.038
abi-0.037
Token happy
Token ID3772
Feature activation+2.14e-01

Pos logit contributions
uth+0.001
uf+0.001
 example+0.001
2+0.001
Suddenly+0.001
Neg logit contributions
xc-0.001
pan-0.001
bel-0.001
lights-0.001
 stained-0.001
Token and
Token ID290
Feature activation-1.59e-01

Pos logit contributions
uth+0.004
uf+0.004
 example+0.004
emy+0.004
2+0.004
Neg logit contributions
bel-0.008
xc-0.008
 stained-0.008
pan-0.007
 ripped-0.007
Token went
Token ID1816
Feature activation-1.92e-01

Pos logit contributions
xc+0.005
bel+0.004
pan+0.004
 stained+0.004
 towel+0.004
Neg logit contributions
uth-0.005
uf-0.004
 example-0.004
2-0.004
emy-0.004
Token to
Token ID284
Feature activation+3.29e-02

Pos logit contributions
bel+0.005
xc+0.005
 stained+0.004
 ripped+0.004
pan+0.004
Neg logit contributions
uf-0.006
uth-0.006
 example-0.006
2-0.006
ith-0.006
Token tell
Token ID1560
Feature activation-6.26e-01

Pos logit contributions
uth+0.001
uf+0.001
 example+0.001
emy+0.001
2+0.001
Neg logit contributions
bel-0.001
xc-0.001
 stained-0.001
pan-0.001
 ripped-0.001
Token his
Token ID465
Feature activation+6.55e-01

Pos logit contributions
bel+0.016
xc+0.015
 stained+0.014
 ripped+0.014
pan+0.013
Neg logit contributions
 example-0.021
uf-0.020
uth-0.020
2-0.019
emy-0.019
Token mom
Token ID1995
Feature activation-3.59e-01

Pos logit contributions
uth+0.018
uf+0.018
 example+0.018
emy+0.016
2+0.016
Neg logit contributions
bel-0.020
xc-0.019
pan-0.019
 stained-0.019
mine-0.018
Token.
Token ID13
Feature activation-1.27e-01

Pos logit contributions
bel+0.017
xc+0.017
 stained+0.016
pan+0.016
 swollen+0.015
Neg logit contributions
uf-0.004
uth-0.004
2-0.003
 example-0.003
 Even-0.003
Token He
Token ID679
Feature activation+2.41e-01

Pos logit contributions
bel+0.004
xc+0.004
 stained+0.004
 ripped+0.004
pan+0.004
Neg logit contributions
uth-0.003
uf-0.003
emy-0.003
 example-0.003
2-0.003
Token said
Token ID531
Feature activation+5.54e-01

Pos logit contributions
uth+0.007
uf+0.007
 example+0.006
emy+0.006
2+0.006
Neg logit contributions
bel-0.007
xc-0.007
pan-0.007
 stained-0.007
nam-0.006
Token,
Token ID11
Feature activation-3.36e-01

Pos logit contributions
 example+0.010
der+0.009
:+0.009
Suddenly+0.009
uth+0.008
Neg logit contributions
xc-0.022
bel-0.022
 stained-0.022
packs-0.020
bat-0.020
Token "
Token ID366
Feature activation+2.56e-01

Pos logit contributions
bel+0.016
 stained+0.015
pan+0.015
 ripped+0.015
xc+0.015
Neg logit contributions
uth-0.014
uf-0.013
emy-0.012
 example-0.012
ith-0.012
TokenFine
Token ID34389
Feature activation+1.27e+00

Pos logit contributions
uf+0.009
uth+0.009
emy+0.009
 example+0.009
2+0.008
Neg logit contributions
bel-0.006
xc-0.005
 stained-0.005
 ripped-0.005
 towel-0.004
Token.
Token ID13
Feature activation+5.22e-02

Pos logit contributions
uth+0.039
uf+0.038
emy+0.035
 example+0.035
2+0.035
Neg logit contributions
bel-0.035
xc-0.033
 stained-0.032
 ripped-0.031
pan-0.031
Token You
Token ID921
Feature activation+6.27e-01

Pos logit contributions
uth+0.001
emy+0.001
uf+0.001
ith+0.001
aff+0.001
Neg logit contributions
xc-0.001
 ripped-0.001
bel-0.001
 stained-0.001
 sponge-0.001
Token can
Token ID460
Feature activation-6.26e-01

Pos logit contributions
uf+0.014
uth+0.013
2+0.013
 example+0.012
 Even+0.012
Neg logit contributions
bel-0.023
xc-0.022
lights-0.020
pan-0.020
 stained-0.020
Token have
Token ID423
Feature activation-4.55e-01

Pos logit contributions
bel+0.020
xc+0.019
 stained+0.018
pan+0.017
nam+0.017
Neg logit contributions
uth-0.017
uf-0.017
 example-0.015
2-0.015
emy-0.014
Token it
Token ID340
Feature activation-3.96e-01

Pos logit contributions
bel+0.018
xc+0.015
robe+0.015
spe+0.014
boat+0.014
Neg logit contributions
uth-0.011
 example-0.011
uf-0.010
emy-0.009
Suddenly-0.009
Token.
Token ID13
Feature activation+3.50e-01

Pos logit contributions
bel+0.019
xc+0.018
spe+0.017
 ah+0.017
 streamed+0.017
Neg logit contributions
 And-0.005
 example-0.005
:-0.004
 But-0.004
der-0.004
Token But
Token ID887
Feature activation+9.77e-01

Pos logit contributions
uth+0.010
emy+0.010
uf+0.009
aff+0.009
ith+0.009
Neg logit contributions
 ripped-0.009
xc-0.008
 sponge-0.008
 Sorry-0.008
cloth-0.008
Token do
Token ID466
Feature activation-1.30e+00

Pos logit contributions
uth+0.032
uf+0.032
emy+0.030
 example+0.029
ith+0.029
Neg logit contributions
bel-0.023
xc-0.022
 stained-0.022
 ripped-0.022
pan-0.021
Token not
Token ID407
Feature activation-3.10e-01

Pos logit contributions
bel+0.023
 stained+0.023
 ripped+0.023
nam+0.023
xc+0.023
Neg logit contributions
uth-0.047
uf-0.046
aff-0.044
ith-0.044
emy-0.043
Token time
Token ID640
Feature activation-4.72e-01

Pos logit contributions
 example+0.030
uth+0.030
nine+0.030
2+0.029
uf+0.029
Neg logit contributions
 stained-0.026
 ripped-0.025
xc-0.025
bel-0.025
 towel-0.025
Token,
Token ID11
Feature activation+1.23e+00

Pos logit contributions
bel+0.015
 stained+0.014
xc+0.014
pan+0.014
 vacuum+0.013
Neg logit contributions
uth-0.013
uf-0.012
emy-0.011
 example-0.011
2-0.011
Token there
Token ID612
Feature activation-3.74e-01

Pos logit contributions
uth+0.046
uf+0.044
emy+0.044
 example+0.042
2+0.042
Neg logit contributions
bel-0.025
xc-0.022
pan-0.022
 stained-0.021
 vacuum-0.020
Token was
Token ID373
Feature activation+1.16e-01

Pos logit contributions
bel+0.010
xc+0.009
pan+0.009
 stained+0.009
nam+0.009
Neg logit contributions
uth-0.012
uf-0.011
emy-0.011
 example-0.011
2-0.010
Token a
Token ID257
Feature activation-3.46e-01

Pos logit contributions
uth+0.004
uf+0.004
2+0.004
 example+0.004
emy+0.004
Neg logit contributions
bel-0.003
 stained-0.003
xc-0.002
 ripped-0.002
 towel-0.002
Token little
Token ID1310
Feature activation+4.68e-01

Pos logit contributions
bel+0.007
 stained+0.007
 ripped+0.007
 vacuum+0.007
 towel+0.007
Neg logit contributions
uth-0.012
 example-0.012
 By-0.012
uf-0.012
 habitats-0.011
Token girl
Token ID2576
Feature activation+3.36e-01

Pos logit contributions
uth+0.015
uf+0.015
 example+0.014
emy+0.014
2+0.014
Neg logit contributions
bel-0.012
xc-0.011
 stained-0.011
 ripped-0.010
pan-0.010
Token named
Token ID3706
Feature activation+6.20e-02

Pos logit contributions
uth+0.010
uf+0.009
2+0.009
 example+0.009
emy+0.009
Neg logit contributions
bel-0.010
 stained-0.009
 ripped-0.009
 vacuum-0.009
xc-0.009
Token Lily
Token ID20037
Feature activation-6.65e-02

Pos logit contributions
uth+0.002
uf+0.002
emy+0.002
 example+0.002
2+0.002
Neg logit contributions
bel-0.001
xc-0.001
 stained-0.001
pan-0.001
 ripped-0.001
Token.
Token ID13
Feature activation+2.19e-01

Pos logit contributions
bel+0.001
 stained+0.001
 ripped+0.001
 vacuum+0.001
xc+0.001
Neg logit contributions
uth-0.003
uf-0.002
 example-0.002
2-0.002
emy-0.002
Token One
Token ID1881
Feature activation+7.08e-01

Pos logit contributions
uth+0.006
uf+0.006
 example+0.006
emy+0.006
2+0.006
Neg logit contributions
bel-0.006
 stained-0.006
xc-0.006
 ripped-0.006
 vacuum-0.006
Token,
Token ID11
Feature activation+7.25e-01

Pos logit contributions
uth+0.003
uf+0.003
 example+0.003
 But+0.003
osa+0.002
Neg logit contributions
bel-0.007
 stained-0.006
xc-0.006
 vacuum-0.006
pan-0.006
Token hold
Token ID1745
Feature activation+8.63e-01

Pos logit contributions
uth+0.020
uf+0.020
 example+0.018
emy+0.018
2+0.018
Neg logit contributions
bel-0.022
xc-0.021
 stained-0.020
pan-0.020
 ripped-0.020
Token on
Token ID319
Feature activation-1.31e+00

Pos logit contributions
uth+0.029
uf+0.028
 example+0.027
emy+0.026
2+0.026
Neg logit contributions
bel-0.023
 stained-0.020
xc-0.020
pan-0.019
 ripped-0.019
Token tight
Token ID5381
Feature activation-1.26e-01

Pos logit contributions
bel+0.044
xc+0.043
 stained+0.042
pan+0.040
 vacuum+0.039
Neg logit contributions
uth-0.034
uf-0.032
 example-0.031
emy-0.029
2-0.029
Token!"
Token ID2474
Feature activation+9.60e-01

Pos logit contributions
bel+0.004
xc+0.004
 stained+0.004
 vacuum+0.004
 ripped+0.004
Neg logit contributions
uth-0.003
 example-0.003
uf-0.003
2-0.003
nine-0.003
Token Ben
Token ID3932
Feature activation-2.42e+00

Pos logit contributions
uf+0.020
uth+0.020
emy+0.018
2+0.017
 example+0.017
Neg logit contributions
bel-0.037
xc-0.035
 stained-0.034
pan-0.034
 vacuum-0.032
Token said
Token ID531
Feature activation-1.38e+00

Pos logit contributions
bel+0.072
xc+0.069
 stained+0.067
pan+0.066
 swollen+0.062
Neg logit contributions
uth-0.069
uf-0.068
2-0.064
 example-0.063
emy-0.062
Token.
Token ID13
Feature activation-6.13e-01

Pos logit contributions
 stained+0.052
xc+0.052
bel+0.052
 messed+0.048
bat+0.047
Neg logit contributions
 example-0.024
uth-0.023
 And-0.022
 But-0.022
uf-0.022
Token He
Token ID679
Feature activation-2.32e-02

Pos logit contributions
bel+0.022
 stained+0.020
pan+0.020
 ripped+0.020
xc+0.020
Neg logit contributions
uth-0.014
uf-0.014
 example-0.012
emy-0.012
2-0.011
Token rubbed
Token ID31862
Feature activation+9.35e-01

Pos logit contributions
bel+0.001
xc+0.001
 stained+0.001
pan+0.000
 ripped+0.000
Neg logit contributions
uth-0.001
uf-0.001
emy-0.001
 example-0.001
2-0.001
Token the
Token ID262
Feature activation-9.09e-01

Pos logit contributions
 Then+0.029
uth+0.028
uf+0.028
2+0.028
 But+0.028
Neg logit contributions
bel-0.022
xc-0.021
days-0.021
 stained-0.021
 messed-0.020
Token\n
Token ID198
Feature activation-2.84e-01

Pos logit contributions
bel+0.015
 swollen+0.014
xc+0.013
 stained+0.013
 towel+0.013
Neg logit contributions
 example-0.012
uth-0.012
uf-0.011
ith-0.011
2-0.011
TokenThey
Token ID2990
Feature activation-1.60e+00

Pos logit contributions
bel+0.008
 swollen+0.007
xc+0.007
 stained+0.007
 ripped+0.006
Neg logit contributions
uf-0.008
 example-0.008
uth-0.008
aff-0.008
ith-0.008
Token looked
Token ID3114
Feature activation+6.69e-01

Pos logit contributions
bel+0.039
xc+0.035
 stained+0.033
pan+0.032
 vacuum+0.030
Neg logit contributions
uf-0.055
uth-0.055
2-0.051
emy-0.050
ith-0.050
Token around
Token ID1088
Feature activation-1.26e+00

Pos logit contributions
uth+0.018
uf+0.017
emy+0.016
ith+0.016
 example+0.016
Neg logit contributions
bel-0.021
xc-0.020
pan-0.018
 stained-0.018
days-0.018
Token the
Token ID262
Feature activation-2.73e-01

Pos logit contributions
bel+0.041
xc+0.040
 stained+0.038
 ripped+0.037
pan+0.037
Neg logit contributions
uth-0.032
uf-0.031
 example-0.029
emy-0.029
2-0.028
Token room
Token ID2119
Feature activation-1.34e+00

Pos logit contributions
bel+0.007
 stained+0.007
 ripped+0.006
 vacuum+0.006
 towel+0.006
Neg logit contributions
uth-0.009
uf-0.009
emy-0.008
 example-0.008
2-0.008
Token and
Token ID290
Feature activation-1.95e-01

Pos logit contributions
xc+0.052
bel+0.050
 stained+0.046
 swollen+0.046
 vacuum+0.045
Neg logit contributions
 example-0.026
uth-0.026
uf-0.024
2-0.023
der-0.022
Token saw
Token ID2497
Feature activation-9.50e-01

Pos logit contributions
bel+0.005
xc+0.004
 stained+0.004
pan+0.004
 ripped+0.004
Neg logit contributions
uth-0.007
uf-0.007
emy-0.006
 example-0.006
2-0.006
Token a
Token ID257
Feature activation-2.37e-02

Pos logit contributions
bel+0.028
xc+0.026
 stained+0.026
 ripped+0.025
pan+0.025
Neg logit contributions
uth-0.027
uf-0.027
emy-0.025
 example-0.025
2-0.024
Token hole
Token ID7604
Feature activation+1.71e-01

Pos logit contributions
bel+0.001
xc+0.001
 stained+0.001
pan+0.001
 ripped+0.001
Neg logit contributions
uth-0.001
uf-0.001
emy-0.001
2-0.001
 example-0.001
Token in
Token ID287
Feature activation+2.61e-01

Pos logit contributions
:+0.003
der+0.003
ith+0.003
 example+0.003
 also+0.003
Neg logit contributions
xc-0.007
nam-0.006
 tant-0.006
bel-0.006
 unst-0.006
Token the
Token ID262
Feature activation+3.08e-01

Pos logit contributions
uth+0.004
uf+0.004
 example+0.003
emy+0.003
2+0.003
Neg logit contributions
bel-0.003
xc-0.003
 stained-0.003
pan-0.002
 ripped-0.002
Token park
Token ID3952
Feature activation+1.31e+00

Pos logit contributions
uf+0.006
uth+0.006
 example+0.006
 zoo+0.006
2+0.005
Neg logit contributions
bel-0.012
xc-0.011
pan-0.011
 stained-0.010
mine-0.010
Token and
Token ID290
Feature activation+7.48e-02

Pos logit contributions
uth+0.031
uf+0.030
 example+0.029
2+0.027
emy+0.027
Neg logit contributions
bel-0.044
xc-0.044
 stained-0.043
pan-0.041
 ripped-0.041
Token had
Token ID550
Feature activation+4.84e-01

Pos logit contributions
uth+0.002
uf+0.002
 But+0.002
2+0.002
 example+0.002
Neg logit contributions
xc-0.002
bel-0.002
pan-0.002
 swollen-0.002
 stained-0.002
Token a
Token ID257
Feature activation+9.11e-01

Pos logit contributions
uth+0.010
 example+0.009
uf+0.008
Suddenly+0.008
emy+0.008
Neg logit contributions
bel-0.020
xc-0.018
pan-0.018
 vacuum-0.017
 swollen-0.017
Token fun
Token ID1257
Feature activation-6.20e-01

Pos logit contributions
uth+0.023
uf+0.022
emy+0.021
 example+0.021
2+0.020
Neg logit contributions
bel-0.031
xc-0.029
 stained-0.028
pan-0.027
 ripped-0.026
Token time
Token ID640
Feature activation+1.26e-01

Pos logit contributions
bel+0.024
xc+0.023
 stained+0.022
 swollen+0.022
pan+0.022
Neg logit contributions
uth-0.012
 example-0.011
uf-0.011
emy-0.011
2-0.010
Token.
Token ID13
Feature activation+3.31e-01

Pos logit contributions
uth+0.004
uf+0.004
emy+0.003
 example+0.003
2+0.003
Neg logit contributions
bel-0.004
xc-0.003
 stained-0.003
 ripped-0.003
pan-0.003
Token L
Token ID406
Feature activation+3.52e-01

Pos logit contributions
uth+0.009
ith+0.008
uf+0.008
emy+0.008
 example+0.008
Neg logit contributions
bel-0.010
 stained-0.009
 sponge-0.009
 ripped-0.009
 vacuum-0.009
Tokenila
Token ID10102
Feature activation+2.05e-02

Pos logit contributions
uth+0.008
uf+0.008
 example+0.007
emy+0.007
2+0.007
Neg logit contributions
bel-0.013
xc-0.012
 stained-0.012
 ripped-0.011
pan-0.011
Token learned
Token ID4499
Feature activation+1.08e+00

Pos logit contributions
uth+0.001
uf+0.001
2+0.000
ith+0.000
emy+0.000
Neg logit contributions
bel-0.001
xc-0.001
pan-0.001
 stained-0.001
nam-0.001
Token.
Token ID13
Feature activation-3.07e-01

Pos logit contributions
xc+0.061
bel+0.061
 tant+0.056
 stained+0.055
pan+0.054
Neg logit contributions
uf-0.020
:-0.019
uth-0.019
 And-0.018
ith-0.017
Token One
Token ID1881
Feature activation+7.91e-01

Pos logit contributions
bel+0.011
xc+0.010
 stained+0.010
pan+0.010
 ripped+0.010
Neg logit contributions
uth-0.007
uf-0.006
 example-0.006
emy-0.006
2-0.006
Token day
Token ID1110
Feature activation-9.23e-01

Pos logit contributions
uth+0.015
uf+0.014
emy+0.013
 example+0.013
ith+0.012
Neg logit contributions
bel-0.031
xc-0.031
pan-0.030
 stained-0.029
 ripped-0.029
Token,
Token ID11
Feature activation+4.41e-02

Pos logit contributions
bel+0.039
xc+0.039
lights+0.036
 swollen+0.036
 stained+0.035
Neg logit contributions
uth-0.016
 But-0.014
uf-0.014
 By-0.013
Suddenly-0.013
Token her
Token ID607
Feature activation-1.75e-01

Pos logit contributions
uth+0.001
 But+0.001
Suddenly+0.001
uf+0.001
 Mint+0.001
Neg logit contributions
xc-0.001
pan-0.001
 stained-0.001
bel-0.001
lights-0.001
Token mom
Token ID1995
Feature activation-9.68e-01

Pos logit contributions
xc+0.006
bel+0.006
pan+0.006
mine+0.006
boat+0.006
Neg logit contributions
uth-0.004
uf-0.004
 example-0.004
emy-0.003
 wife-0.003
Token asked
Token ID1965
Feature activation+1.83e+00

Pos logit contributions
bel+0.034
nam+0.032
lights+0.030
pan+0.030
xc+0.030
Neg logit contributions
uf-0.023
uth-0.023
ith-0.023
 Even-0.022
2-0.021
Token her
Token ID607
Feature activation+3.08e-01

Pos logit contributions
uf+0.062
uth+0.061
 example+0.060
 And+0.054
osa+0.054
Neg logit contributions
bel-0.047
 stained-0.045
xc-0.045
boat-0.044
lights-0.043
Token to
Token ID284
Feature activation-5.42e-01

Pos logit contributions
uth+0.007
 example+0.007
uf+0.007
2+0.006
emy+0.006
Neg logit contributions
xc-0.011
bel-0.010
 stained-0.010
 swollen-0.010
pan-0.010
Token go
Token ID467
Feature activation-1.09e+00

Pos logit contributions
led+0.021
lights+0.021
bel+0.021
xc+0.020
 swollen+0.020
Neg logit contributions
uf-0.011
 example-0.010
 vet-0.009
 But-0.009
uth-0.009
Token to
Token ID284
Feature activation-4.17e-01

Pos logit contributions
bel+0.027
xc+0.025
 stained+0.024
pan+0.023
 ripped+0.023
Neg logit contributions
uth-0.037
uf-0.036
emy-0.034
2-0.033
 example-0.033
Token Emma
Token ID18966
Feature activation-3.06e-01

Pos logit contributions
bel+0.027
xc+0.024
 stained+0.024
pan+0.023
 ripped+0.022
Neg logit contributions
uth-0.031
uf-0.030
 example-0.028
2-0.028
emy-0.028
Token?"
Token ID1701
Feature activation+8.83e-01

Pos logit contributions
bel+0.009
xc+0.009
 stained+0.008
pan+0.008
apy+0.008
Neg logit contributions
uf-0.009
uth-0.008
 example-0.008
emy-0.008
ith-0.008
Token
Token ID198
Feature activation-7.09e-01

Pos logit contributions
uth+0.023
uf+0.022
emy+0.021
 example+0.020
2+0.020
Neg logit contributions
bel-0.029
xc-0.027
 stained-0.026
 ripped-0.026
 vacuum-0.025
Token
Token ID198
Feature activation-4.98e-01

Pos logit contributions
bel+0.026
xc+0.024
 stained+0.024
 ripped+0.023
 vacuum+0.023
Neg logit contributions
uth-0.015
uf-0.015
emy-0.014
 example-0.013
2-0.013
TokenEm
Token ID10161
Feature activation-1.46e+00

Pos logit contributions
bel+0.016
xc+0.015
 stained+0.015
pan+0.014
 ripped+0.014
Neg logit contributions
uth-0.013
uf-0.013
 example-0.012
emy-0.012
2-0.012
Tokenma
Token ID2611
Feature activation-2.23e+00

Pos logit contributions
bel+0.022
 stained+0.018
xc+0.017
pan+0.017
 ripped+0.016
Neg logit contributions
uth-0.064
uf-0.063
 example-0.060
emy-0.059
2-0.059
Token replied
Token ID8712
Feature activation+2.79e-01

Pos logit contributions
bel+0.058
pan+0.051
 stained+0.051
xc+0.049
 unst+0.046
Neg logit contributions
uth-0.075
uf-0.074
2-0.069
 example-0.068
ith-0.067
Token,
Token ID11
Feature activation-1.15e+00

Pos logit contributions
 example+0.005
uth+0.005
uf+0.005
2+0.005
 Even+0.005
Neg logit contributions
bel-0.011
 stained-0.011
xc-0.010
 messed-0.010
 swollen-0.010
Token "
Token ID366
Feature activation-5.31e-01

Pos logit contributions
bel+0.042
xc+0.040
 stained+0.040
pan+0.038
 ripped+0.038
Neg logit contributions
uth-0.024
uf-0.024
 example-0.022
2-0.021
emy-0.021
TokenI
Token ID40
Feature activation+6.53e-01

Pos logit contributions
bel+0.016
xc+0.015
 stained+0.014
 ripped+0.014
 vacuum+0.013
Neg logit contributions
uf-0.016
uth-0.015
emy-0.014
 example-0.014
ith-0.013
Token love
Token ID1842
Feature activation-4.57e-01

Pos logit contributions
uth+0.015
uf+0.015
 example+0.014
2+0.014
emy+0.013
Neg logit contributions
bel-0.022
xc-0.021
pan-0.021
nam-0.020
 stained-0.020
Token't
Token ID470
Feature activation+5.56e-01

Pos logit contributions
bel+0.012
xc+0.011
 stained+0.011
nam+0.011
pan+0.010
Neg logit contributions
uf-0.008
uth-0.008
 Even-0.007
2-0.007
 By-0.007
Token find
Token ID1064
Feature activation-5.69e-01

Pos logit contributions
uth+0.019
emy+0.019
uf+0.018
alph+0.017
osa+0.017
Neg logit contributions
bel-0.013
 vacuum-0.013
 ripped-0.013
 stained-0.013
mine-0.013
Token her
Token ID607
Feature activation+4.41e-01

Pos logit contributions
bel+0.014
xc+0.013
 stained+0.013
 ripped+0.013
pan+0.012
Neg logit contributions
uth-0.019
uf-0.018
emy-0.017
 example-0.017
2-0.017
Token yellow
Token ID7872
Feature activation-1.07e-03

Pos logit contributions
uth+0.012
 example+0.011
uf+0.011
2+0.011
emy+0.010
Neg logit contributions
bel-0.014
xc-0.014
pan-0.013
 stained-0.012
 vacuum-0.012
Token cr
Token ID1067
Feature activation-2.59e+00

Pos logit contributions
bel+0.000
xc+0.000
 stained+0.000
pan+0.000
 vacuum+0.000
Neg logit contributions
uth-0.000
uf-0.000
 example-0.000
emy-0.000
2-0.000
Tokenay
Token ID323
Feature activation-3.26e+00

Pos logit contributions
bel+0.055
xc+0.050
 stained+0.049
pan+0.046
 ripped+0.046
Neg logit contributions
uth-0.095
uf-0.093
emy-0.088
 example-0.088
2-0.088
Tokenon
Token ID261
Feature activation-7.86e-01

Pos logit contributions
bel+0.066
xc+0.059
 stained+0.057
pan+0.055
 vacuum+0.053
Neg logit contributions
uth-0.122
uf-0.119
2-0.117
emy-0.116
 example-0.115
Token,
Token ID11
Feature activation+1.51e-01

Pos logit contributions
xc+0.034
bel+0.033
 ah+0.031
apy+0.031
 swollen+0.031
Neg logit contributions
:-0.009
 example-0.009
der-0.009
uf-0.009
2-0.009
Token which
Token ID543
Feature activation-1.08e+00

Pos logit contributions
 But+0.002
uth+0.002
 Even+0.002
 example+0.002
Suddenly+0.002
Neg logit contributions
xc-0.007
bel-0.007
pan-0.006
days-0.006
boat-0.006
Token made
Token ID925
Feature activation-1.30e+00

Pos logit contributions
bel+0.040
xc+0.040
ivery+0.033
 unst+0.033
lights+0.033
Neg logit contributions
 violet-0.022
 But-0.021
uth-0.021
 Mint-0.021
 cent-0.020
Token her
Token ID607
Feature activation-4.67e-01

Pos logit contributions
xc+0.044
bel+0.041
spe+0.039
days+0.038
nam+0.037
Neg logit contributions
uth-0.029
 example-0.029
una-0.028
 Not-0.027
 Even-0.027
Token tried
Token ID3088
Feature activation+1.04e+00

Pos logit contributions
xc+0.017
bel+0.016
pan+0.016
igator+0.015
lights+0.015
Neg logit contributions
 example-0.011
 Even-0.010
 Then-0.010
 But-0.010
Suddenly-0.010
Token to
Token ID284
Feature activation+1.64e-01

Pos logit contributions
uth+0.031
uf+0.031
 example+0.029
emy+0.029
2+0.028
Neg logit contributions
bel-0.029
xc-0.027
 stained-0.026
pan-0.025
 ripped-0.025
Token make
Token ID787
Feature activation-1.68e+00

Pos logit contributions
 example+0.004
uf+0.004
uth+0.004
2+0.004
Suddenly+0.004
Neg logit contributions
bel-0.005
xc-0.005
 stained-0.005
pan-0.005
lights-0.004
Token his
Token ID465
Feature activation-4.78e-01

Pos logit contributions
bel+0.048
xc+0.044
 stained+0.043
pan+0.042
 ripped+0.042
Neg logit contributions
uth-0.050
uf-0.050
 example-0.048
emy-0.046
2-0.046
Token toy
Token ID13373
Feature activation-2.42e+00

Pos logit contributions
bel+0.016
xc+0.015
pan+0.015
 stained+0.015
 ripped+0.014
Neg logit contributions
uth-0.011
 example-0.011
uf-0.011
emy-0.010
2-0.010
Token car
Token ID1097
Feature activation-3.11e+00

Pos logit contributions
xc+0.086
 stained+0.082
 messed+0.081
 swollen+0.080
iddled+0.080
Neg logit contributions
 example-0.052
 And-0.044
 But-0.041
der-0.040
 zoo-0.038
Token vanish
Token ID39572
Feature activation+2.46e-01

Pos logit contributions
xc+0.122
 streamed+0.110
nam+0.108
 swollen+0.107
 stained+0.107
Neg logit contributions
 example-0.054
 more-0.049
 until-0.045
 And-0.045
 learn-0.044
Token.
Token ID13
Feature activation-4.58e-01

Pos logit contributions
der+0.004
 example+0.003
 until+0.003
 But+0.003
uth+0.003
Neg logit contributions
xc-0.011
bel-0.010
 ah-0.010
 swollen-0.010
apy-0.010
Token He
Token ID679
Feature activation+2.65e-01

Pos logit contributions
bel+0.016
 stained+0.014
xc+0.014
 ripped+0.014
 vacuum+0.014
Neg logit contributions
uth-0.011
uf-0.011
emy-0.010
 example-0.009
alph-0.009
Token closed
Token ID4838
Feature activation+5.65e-01

Pos logit contributions
uth+0.007
 example+0.006
uf+0.006
ith+0.006
2+0.006
Neg logit contributions
bel-0.008
xc-0.008
pan-0.008
nam-0.007
bat-0.007
Token his
Token ID465
Feature activation+5.96e-01

Pos logit contributions
 example+0.016
 But+0.015
uth+0.014
 Then+0.014
 Even+0.014
Neg logit contributions
bel-0.017
days-0.016
xc-0.016
igator-0.015
ivery-0.015
Token was
Token ID373
Feature activation-2.16e-01

Pos logit contributions
bel+0.001
 stained+0.001
xc+0.001
 ripped+0.001
pan+0.001
Neg logit contributions
uth-0.001
uf-0.001
emy-0.001
 example-0.001
2-0.001
Token it
Token ID340
Feature activation-5.25e-01

Pos logit contributions
bel+0.006
xc+0.006
 stained+0.006
pan+0.006
 ripped+0.005
Neg logit contributions
uth-0.006
uf-0.006
 example-0.006
2-0.006
emy-0.006
Token?"
Token ID1701
Feature activation+1.38e+00

Pos logit contributions
nam+0.019
bel+0.018
 tant+0.018
cloth+0.018
pan+0.018
Neg logit contributions
:-0.009
 also-0.009
2-0.009
 example-0.009
Suddenly-0.009
Token Sara
Token ID24799
Feature activation-1.86e+00

Pos logit contributions
uth+0.035
uf+0.034
 example+0.032
emy+0.031
2+0.031
Neg logit contributions
bel-0.046
 stained-0.043
xc-0.041
 ripped-0.041
pan-0.040
Token's
Token ID338
Feature activation-1.62e+00

Pos logit contributions
xc+0.039
 ripped+0.038
 down+0.037
 stained+0.037
bat+0.037
Neg logit contributions
uth-0.063
uf-0.062
 example-0.061
emy-0.058
Days-0.056
Token mom
Token ID1995
Feature activation-3.15e+00

Pos logit contributions
bel+0.040
 stained+0.036
xc+0.036
 ripped+0.035
pan+0.034
Neg logit contributions
uth-0.054
uf-0.054
emy-0.050
 example-0.050
2-0.049
Token says
Token ID1139
Feature activation-4.06e-01

Pos logit contributions
bel+0.095
 stained+0.092
xc+0.091
 ripped+0.089
 vacuum+0.086
Neg logit contributions
uth-0.087
uf-0.084
 example-0.079
emy-0.079
alph-0.073
Token.
Token ID13
Feature activation-6.75e-02

Pos logit contributions
bel+0.012
xc+0.011
 stained+0.011
pan+0.010
 ripped+0.010
Neg logit contributions
uth-0.012
uf-0.012
emy-0.011
 example-0.011
2-0.011
Token
Token ID198
Feature activation+6.57e-01

Pos logit contributions
bel+0.002
 ripped+0.002
 towel+0.002
pan+0.002
 stained+0.002
Neg logit contributions
uth-0.002
 example-0.002
uf-0.002
reck-0.002
agon-0.002
Token
Token ID198
Feature activation+7.15e-01

Pos logit contributions
 example+0.018
uf+0.017
uth+0.017
2+0.016
reck+0.015
Neg logit contributions
bel-0.019
 towel-0.018
 stained-0.018
 ripped-0.017
xc-0.017
Token"
Token ID1
Feature activation-4.33e-01

Pos logit contributions
uf+0.022
uth+0.021
 example+0.021
emy+0.020
2+0.020
Neg logit contributions
bel-0.019
 stained-0.017
xc-0.017
 ripped-0.017
 towel-0.017
Token friend
Token ID1545
Feature activation+3.23e-01

Pos logit contributions
uth+0.003
uf+0.003
 example+0.003
emy+0.003
2+0.003
Neg logit contributions
bel-0.005
xc-0.005
pan-0.004
 stained-0.004
cloth-0.004
Token,
Token ID11
Feature activation+1.83e-01

Pos logit contributions
uf+0.008
uth+0.008
emy+0.007
2+0.006
 And+0.006
Neg logit contributions
bel-0.012
pan-0.011
xc-0.011
 stained-0.011
 tant-0.010
Token a
Token ID257
Feature activation-6.23e-01

Pos logit contributions
uf+0.004
uth+0.004
emy+0.004
2+0.004
 And+0.003
Neg logit contributions
bel-0.007
xc-0.006
 stained-0.006
pan-0.006
lights-0.006
Token colorful
Token ID20239
Feature activation-6.88e-01

Pos logit contributions
xc+0.025
bel+0.024
led+0.023
lights+0.022
pan+0.022
Neg logit contributions
uf-0.012
uth-0.011
2-0.011
emy-0.011
 vet-0.010
Token cr
Token ID1067
Feature activation-2.24e+00

Pos logit contributions
bel+0.023
xc+0.022
led+0.020
 unst+0.019
 stained+0.019
Neg logit contributions
 zoo-0.018
 vet-0.017
 example-0.017
 jungle-0.017
uf-0.017
Tokenay
Token ID323
Feature activation-3.21e+00

Pos logit contributions
bel+0.043
xc+0.038
 stained+0.038
pan+0.036
 ripped+0.035
Neg logit contributions
uth-0.088
uf-0.086
 example-0.082
emy-0.082
2-0.081
Tokenon
Token ID261
Feature activation-4.41e-01

Pos logit contributions
bel+0.091
xc+0.080
nam+0.076
 stained+0.074
 pans+0.069
Neg logit contributions
uf-0.092
2-0.090
 example-0.090
uth-0.088
ith-0.087
Token named
Token ID3706
Feature activation+5.72e-01

Pos logit contributions
bel+0.014
xc+0.013
 stained+0.013
 ripped+0.012
pan+0.012
Neg logit contributions
uf-0.011
uth-0.011
emy-0.010
 example-0.010
2-0.010
Token Color
Token ID5315
Feature activation-9.16e-02

Pos logit contributions
uth+0.020
uf+0.020
 example+0.019
emy+0.019
ith+0.018
Neg logit contributions
bel-0.012
xc-0.012
 stained-0.011
 ripped-0.011
 vacuum-0.010
Tokenful
Token ID913
Feature activation-8.41e-01

Pos logit contributions
bel+0.003
 stained+0.002
xc+0.002
 ripped+0.002
pan+0.002
Neg logit contributions
uth-0.003
uf-0.003
 example-0.002
emy-0.002
2-0.002
Token.
Token ID13
Feature activation-2.92e-01

Pos logit contributions
bel+0.024
xc+0.023
 stained+0.023
 ripped+0.022
pan+0.022
Neg logit contributions
uth-0.025
uf-0.024
emy-0.022
 example-0.022
2-0.022
Token she
Token ID673
Feature activation-7.54e-02

Pos logit contributions
uth+0.011
 But+0.011
uf+0.010
Suddenly+0.010
Then+0.010
Neg logit contributions
xc-0.013
bel-0.012
pan-0.011
spe-0.011
 stained-0.011
Token found
Token ID1043
Feature activation-2.65e-02

Pos logit contributions
bel+0.002
pan+0.002
xc+0.002
nam+0.002
lights+0.002
Neg logit contributions
uth-0.002
ith-0.002
emy-0.002
uf-0.002
Then-0.002
Token a
Token ID257
Feature activation-2.26e-01

Pos logit contributions
bel+0.001
 stained+0.001
 ripped+0.001
xc+0.001
pan+0.001
Neg logit contributions
uth-0.001
uf-0.001
2-0.001
emy-0.001
 example-0.001
Token big
Token ID1263
Feature activation-6.35e-01

Pos logit contributions
bel+0.006
xc+0.006
 stained+0.005
pan+0.005
 ripped+0.005
Neg logit contributions
uth-0.007
uf-0.007
emy-0.007
 example-0.007
2-0.006
Token pair
Token ID5166
Feature activation-2.46e-01

Pos logit contributions
bel+0.019
xc+0.017
 stained+0.016
pan+0.016
 ripped+0.015
Neg logit contributions
uth-0.019
uf-0.018
 example-0.017
emy-0.017
2-0.017
Token of
Token ID286
Feature activation-3.20e+00

Pos logit contributions
bel+0.006
xc+0.006
 stained+0.006
 ripped+0.005
pan+0.005
Neg logit contributions
uth-0.008
uf-0.008
emy-0.008
 example-0.008
2-0.007
Token scissors
Token ID41786
Feature activation+2.94e-01

Pos logit contributions
bel+0.100
pan+0.091
xc+0.090
lights+0.089
days+0.086
Neg logit contributions
uth-0.085
2-0.085
emy-0.085
uf-0.084
ith-0.080
Token and
Token ID290
Feature activation+3.46e-01

Pos logit contributions
uth+0.007
uf+0.007
der+0.007
2+0.007
ith+0.007
Neg logit contributions
bel-0.009
xc-0.009
 stained-0.008
lights-0.008
 pans-0.008
Token decided
Token ID3066
Feature activation+8.16e-01

Pos logit contributions
uth+0.010
2+0.010
Suddenly+0.009
emy+0.009
uf+0.009
Neg logit contributions
xc-0.010
bel-0.009
pan-0.009
 stained-0.008
days-0.008
Token to
Token ID284
Feature activation-2.53e-01

Pos logit contributions
uth+0.019
uf+0.018
 example+0.017
emy+0.017
Suddenly+0.017
Neg logit contributions
bel-0.027
xc-0.027
 stained-0.026
pan-0.026
 vacuum-0.025
Token cut
Token ID2005
Feature activation-6.38e-02

Pos logit contributions
bel+0.008
xc+0.008
pan+0.007
 swollen+0.007
lights+0.007
Neg logit contributions
uth-0.006
uf-0.006
 example-0.006
2-0.006
 Even-0.006