TOP ACTIVATIONS
MAX = 2.063

truth . " Sir , you need to get
to be friends with the other human . They hugged and
work together and fix your rockets . You can use some
" No , thank you ," Lily said .
him . " Sir , sir , can you
ever tasted . <|endoftext|> " Z oom !" went the twins
and then smiled . " Ok , Mom my . I
said not to go too far ," Ben said .
girl finally understood . " Ok ," she said , "
do . She knew that staying out late was stupid and
cl apped her hands . Big brother smiled and held her
shape started to weep . Big , round tears rolled down
" Yes , thank you ," the doll said
says . " OK ," Ben says .
He enjoyed writing z ig z ags . He wrote z
a town with many other engines , big and small .
The park had lots of high slides . It was fun
" I 'm okay , thank you ," the lady said
says to Mia : " Look , Mia . That dog
No thank you !" said the other human . " I

MIN ACTIVATIONS
MIN = -2.584

. A lic ia noticed that the cord
it . A lic ia smiled while she looked
the necklace and the magn ifying glass and told him that
the necklace and the magn ifying glass and started to examine
?" D ott ie per ked up and
and then she had a scr um pt ious snack .
After a while , D ott ie was no longer scared
ball ?" A lic ia nodded and said ,
Sure , Lily . What kind of pizza do you want
find new videos . What kind of videos do you want
L ily replied , " There 's a scary pit over
and asked , " What kind of game ?".
to the lamp . " There is a gir affe .
at the shelf . " There , on the shelf .
!" she said . " There 's a butterfly on that
make bra ids and pony tails and curls . You can
, they threw the spark back and forth and laughed .
a flashlight and a magn ifying glass to examine their finds
dictionary reply ed , " There are many ways to make
like lettuce , tomatoes and cuc umbers . He smiled as

INTERVAL 0.901 - 2.063
CONTAINS 1.685%

. Have fun ." <|endoftext|> Ben and Lily were playing in
library . They see many people and cars on the road
bird thanked the veil for its help and flew away happily
duck was clean , the navy boat took the duck back
The end . <|endoftext|> Once upon a time , there was

INTERVAL -0.260 - 0.901
CONTAINS 72.883%

They also send them some jokes and r iddles .
. " You just need to make a wish !"
found something disgusting living there . It was a big spider
day . She realized that it 's better to be honest
a heart , and many others . When

INTERVAL -1.422 - -0.260
CONTAINS 25.376%

accept the answer . The next day ,
table . It was their mom 's tea , but she
Anna are friends . They like to play and have fun
mix , and bake a cake . He also let them
had fun and laughed . They became friends .

INTERVAL -2.584 - -1.422
CONTAINS 0.056%

But , the wheel fell off and she didn 't know
. He felt a strange feeling , almost like a reward
asked , " What happened to your t eddy bear ?"
When he does , she spr ays him with water .
That made him smile and feel better . Then they went
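
NOTE ON THE INTERVALS
The four interval boundaries listed above look consistent with splitting the range from MIN (-2.584) to MAX (2.063) into four equal-width bins, and the four CONTAINS percentages add up to 100%. The dump itself does not state how the bins were built, so this is an inference from the numbers. A minimal sketch of the check follows; the function name and the bin count of 4 are assumptions made for illustration.

```python
def equal_width_bins(lo, hi, n_bins):
    """Return (lower, upper) pairs for n_bins equal-width intervals over [lo, hi]."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    edges[-1] = hi  # pin the last edge so float error cannot move it
    return list(zip(edges[:-1], edges[1:]))


# Values copied from the MIN / MAX lines of the dump.
MIN_ACT, MAX_ACT = -2.584, 2.063

# Assumption: four bins, because the dump lists four INTERVAL blocks.
bins = equal_width_bins(MIN_ACT, MAX_ACT, 4)

# Boundaries as displayed in the dump, lowest interval first.
displayed = [(-2.584, -1.422), (-1.422, -0.260), (-0.260, 0.901), (0.901, 2.063)]

# The dump shows three decimals, so compare with a matching tolerance.
TOLERANCE = 1e-3
bins_match = all(
    abs(lower - shown_lower) < TOLERANCE and abs(upper - shown_upper) < TOLERANCE
    for (lower, upper), (shown_lower, shown_upper) in zip(bins, displayed)
)

# CONTAINS percentages from the dump, lowest interval first.
shares = [0.056, 25.376, 72.883, 1.685]
total_share = sum(shares)
```

If the reading is right, bins_match is true and total_share is 100 up to floating-point error, which would mean roughly 98% of tokens fall in the two middle intervals and the TOP and MIN example lists are drawn from thin tails.
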
Token truth
Token ID3872
Feature activation-5.08e-01

Pos logit contributions
 next+0.002
 first+0.002
ued+0.001
 two+0.001
 one+0.001
Neg logit contributions
 mir-0.002
 Fro-0.002
 polar-0.002
 Stri-0.002
 ig-0.002
Token.
Token ID13
Feature activation+7.59e-01

Pos logit contributions
 Fro+0.025
 mir+0.024
 Stri+0.024
 polar+0.024
 ig+0.023
Neg logit contributions
 next-0.018
 first-0.018
ued-0.014
 two-0.014
 one-0.013
Token\n
Token ID198
Feature activation+2.15e-01

Pos logit contributions
 next+0.027
 first+0.027
 two+0.019
 one+0.019
ued+0.018
Neg logit contributions
 Fro-0.040
 mir-0.039
 Stri-0.038
 polar-0.038
 spe-0.037
Token\n
Token ID198
Feature activation+3.75e-01

Pos logit contributions
 next+0.008
 first+0.008
 one+0.006
ued+0.006
 two+0.006
Neg logit contributions
 Fro-0.011
 Stri-0.010
 mir-0.010
 polar-0.010
 ig-0.010
Token"
Token ID1
Feature activation+6.78e-01

Pos logit contributions
 first+0.014
 next+0.013
 one+0.011
ued+0.011
 two+0.010
Neg logit contributions
 Fro-0.018
 Stri-0.018
 mir-0.018
 polar-0.017
 ig-0.016
TokenSir
Token ID22788
Feature activation+2.06e+00

Pos logit contributions
 first+0.032
 next+0.030
ued+0.028
 one+0.025
 up+0.025
Neg logit contributions
 Stri-0.025
 mir-0.025
 Fro-0.025
 spe-0.024
 polar-0.023
Token,
Token ID11
Feature activation+8.02e-01

Pos logit contributions
 next+0.084
 first+0.079
ued+0.068
 moral+0.062
oning+0.061
Neg logit contributions
 Fro-0.092
 Stri-0.086
 mir-0.083
 polar-0.082
 spe-0.082
Token you
Token ID345
Feature activation-2.58e-01

Pos logit contributions
 next+0.035
 first+0.034
 two+0.027
 one+0.026
ued+0.024
Neg logit contributions
 Fro-0.037
 mir-0.035
 Stri-0.034
 spe-0.034
 polar-0.034
Token need
Token ID761
Feature activation+4.64e-01

Pos logit contributions
 Fro+0.013
 mir+0.012
 Stri+0.012
 polar+0.012
 spe+0.012
Neg logit contributions
 next-0.010
 first-0.010
 two-0.008
 one-0.007
ued-0.007
Token to
Token ID284
Feature activation+6.48e-01

Pos logit contributions
 next+0.021
 first+0.020
ued+0.016
 two+0.015
 one+0.015
Neg logit contributions
 Fro-0.020
 mir-0.020
 Stri-0.019
 polar-0.018
 spe-0.018
Token get
Token ID651
Feature activation+8.29e-01

Pos logit contributions
 next+0.028
 first+0.027
 two+0.021
 one+0.021
ued+0.020
Neg logit contributions
 Fro-0.030
 mir-0.029
 Stri-0.028
 polar-0.028
 spe-0.028
Token to
Token ID284
Feature activation-3.09e-02

Pos logit contributions
 Fro+0.008
 mir+0.007
 Stri+0.007
 cho+0.007
 som+0.007
Neg logit contributions
 next-0.010
 first-0.010
ued-0.008
 two-0.008
oning-0.008
Token be
Token ID307
Feature activation+8.71e-01

Pos logit contributions
 Fro+0.001
 mir+0.001
 Stri+0.001
 polar+0.001
 dam+0.001
Neg logit contributions
 next-0.001
 first-0.001
 two-0.001
 one-0.001
ued-0.001
Token friends
Token ID2460
Feature activation+2.13e-01

Pos logit contributions
 next+0.034
 first+0.033
 two+0.026
 one+0.025
ued+0.022
Neg logit contributions
 Fro-0.044
 mir-0.043
 Stri-0.042
 spe-0.041
 polar-0.041
Token with
Token ID351
Feature activation+1.01e-01

Pos logit contributions
 next+0.007
 first+0.007
 two+0.005
ued+0.005
 one+0.005
Neg logit contributions
 Fro-0.011
 mir-0.011
 Stri-0.011
 polar-0.011
 spe-0.010
Token the
Token ID262
Feature activation+1.46e+00

Pos logit contributions
 next+0.004
 first+0.004
ued+0.004
 two+0.003
 moment+0.003
Neg logit contributions
 Fro-0.004
 Stri-0.004
 polar-0.004
 mir-0.004
 dam-0.004
Token other
Token ID584
Feature activation+2.04e+00

Pos logit contributions
ued+0.053
 next+0.049
oned+0.046
perate+0.044
ged+0.044
Neg logit contributions
 Fro-0.065
 mir-0.064
 por-0.059
 Stri-0.058
 polar-0.058
Token human
Token ID1692
Feature activation+6.32e-02

Pos logit contributions
ued+0.098
 next+0.096
 first+0.094
 aside+0.082
oned+0.081
Neg logit contributions
 Fro-0.066
 mir-0.065
 Stri-0.061
 polar-0.061
 por-0.060
Token.
Token ID13
Feature activation-5.60e-02

Pos logit contributions
 next+0.002
 first+0.002
ued+0.002
 two+0.002
oning+0.002
Neg logit contributions
 Fro-0.003
 polar-0.003
 mir-0.003
 Stri-0.003
 ig-0.003
Token They
Token ID1119
Feature activation+2.90e-01

Pos logit contributions
 Fro+0.002
 Stri+0.002
 mir+0.002
 dam+0.002
 polar+0.002
Neg logit contributions
 next-0.003
 first-0.003
ued-0.002
 two-0.002
 one-0.002
Token hugged
Token ID41130
Feature activation-2.21e-01

Pos logit contributions
 next+0.013
 first+0.012
 two+0.009
ued+0.009
 one+0.009
Neg logit contributions
 Fro-0.013
 Stri-0.012
 mir-0.012
 polar-0.012
 spe-0.012
Token and
Token ID290
Feature activation+2.11e-01

Pos logit contributions
 Fro+0.011
 Stri+0.010
 polar+0.009
 spe+0.009
 coc+0.009
Neg logit contributions
 next-0.009
 first-0.008
ued-0.007
 two-0.006
 moral-0.006
Token work
Token ID670
Feature activation+6.16e-01

Pos logit contributions
 next+0.025
 first+0.024
ued+0.019
 two+0.019
 one+0.019
Neg logit contributions
 Fro-0.024
 mir-0.024
 polar-0.023
 Stri-0.023
 ig-0.023
Token together
Token ID1978
Feature activation+6.79e-01

Pos logit contributions
 next+0.021
 first+0.020
ued+0.018
oning+0.017
 moral+0.017
Neg logit contributions
 Stri-0.025
 spe-0.025
 cho-0.025
 scar-0.025
 mir-0.025
Token and
Token ID290
Feature activation+4.83e-01

Pos logit contributions
 next+0.026
 first+0.025
ued+0.023
 moral+0.021
 two+0.020
Neg logit contributions
 Fro-0.027
 mir-0.027
 spe-0.027
 ig-0.027
 Stri-0.026
Token fix
Token ID4259
Feature activation+1.09e-01

Pos logit contributions
 next+0.020
 first+0.020
ued+0.016
 two+0.016
 one+0.015
Neg logit contributions
 Fro-0.020
 mir-0.020
 spe-0.019
 Stri-0.019
 dam-0.019
Token your
Token ID534
Feature activation-4.98e-02

Pos logit contributions
 next+0.005
 first+0.005
ued+0.004
 two+0.004
 one+0.003
Neg logit contributions
 Fro-0.005
 Stri-0.004
 mir-0.004
 spe-0.004
 ig-0.004
Token rockets
Token ID20293
Feature activation+2.00e+00

Pos logit contributions
 ig+0.002
 mir+0.002
 coc+0.002
 Fro+0.002
 spe+0.002
Neg logit contributions
 first-0.002
 next-0.002
ued-0.002
 two-0.002
oning-0.002
Token.
Token ID13
Feature activation+1.90e-01

Pos logit contributions
 next+0.076
 first+0.071
ued+0.067
 moral+0.064
oning+0.062
Neg logit contributions
 spe-0.075
 ig-0.075
light-0.073
 coc-0.072
 dam-0.072
Token You
Token ID921
Feature activation-5.38e-01

Pos logit contributions
 next+0.008
 first+0.008
 two+0.006
 one+0.006
ued+0.006
Neg logit contributions
 Fro-0.009
 mir-0.008
 Stri-0.008
 polar-0.008
 spe-0.008
Token can
Token ID460
Feature activation+6.88e-01

Pos logit contributions
 spe+0.018
 mir+0.018
 Fro+0.018
 Stri+0.018
 dam+0.017
Neg logit contributions
 next-0.028
 first-0.026
ued-0.021
 one-0.021
 two-0.021
Token use
Token ID779
Feature activation-7.74e-02

Pos logit contributions
 next+0.030
 first+0.029
 two+0.023
ued+0.023
 moment+0.022
Neg logit contributions
 Fro-0.028
 Stri-0.028
 mir-0.027
 ig-0.027
 spe-0.027
Token some
Token ID617
Feature activation-5.74e-01

Pos logit contributions
 spe+0.003
 Stri+0.002
 mir+0.002
 ig+0.002
 scar+0.002
Neg logit contributions
 next-0.004
 first-0.004
ued-0.003
 moment-0.003
t-0.003
Token\n
Token ID198
Feature activation+2.22e-01

Pos logit contributions
 next+0.024
 first+0.023
ued+0.018
 one+0.016
 two+0.016
Neg logit contributions
 Fro-0.032
 mir-0.030
 Stri-0.029
 polar-0.028
 spe-0.028
Token\n
Token ID198
Feature activation+4.60e-01

Pos logit contributions
 next+0.007
 first+0.007
 two+0.005
 one+0.004
ued+0.004
Neg logit contributions
 Fro-0.013
 mir-0.013
 spe-0.012
 Stri-0.012
 polar-0.012
Token"
Token ID1
Feature activation+3.73e-01

Pos logit contributions
 first+0.020
 next+0.020
ued+0.015
 one+0.015
 two+0.015
Neg logit contributions
 Fro-0.021
 mir-0.020
 Stri-0.020
 polar-0.019
 spe-0.019
TokenNo
Token ID2949
Feature activation+6.82e-01

Pos logit contributions
 first+0.017
 next+0.016
ued+0.014
 one+0.013
 up+0.013
Neg logit contributions
 Fro-0.014
 mir-0.014
 Stri-0.014
 spe-0.014
 polar-0.013
Token,
Token ID11
Feature activation+7.79e-01

Pos logit contributions
 first+0.021
 next+0.021
 one+0.015
 two+0.015
 moment+0.012
Neg logit contributions
 Fro-0.040
 mir-0.039
 spe-0.038
 Stri-0.038
 polar-0.038
Token thank
Token ID5875
Feature activation+1.99e+00

Pos logit contributions
 next+0.033
 first+0.032
ued+0.025
 two+0.025
 one+0.024
Neg logit contributions
 Fro-0.035
 mir-0.034
 polar-0.033
 spe-0.033
 Stri-0.033
Token you
Token ID345
Feature activation+1.36e-01

Pos logit contributions
 next+0.083
 first+0.082
ued+0.068
 two+0.066
 moment+0.065
Neg logit contributions
 Fro-0.087
 Stri-0.080
 mir-0.078
 spe-0.077
 polar-0.076
Token,"
Token ID553
Feature activation-8.22e-02

Pos logit contributions
 next+0.005
 first+0.005
ued+0.003
 two+0.003
 one+0.003
Neg logit contributions
 Fro-0.007
 mir-0.007
 polar-0.007
 Stri-0.007
 ig-0.007
Token Lily
Token ID20037
Feature activation+9.39e-02

Pos logit contributions
 mir+0.004
 Fro+0.004
rim+0.004
ric+0.004
 dam+0.004
Neg logit contributions
 next-0.003
 first-0.003
 two-0.003
 one-0.003
 moral-0.002
Token said
Token ID531
Feature activation+6.13e-02

Pos logit contributions
 next+0.004
 first+0.003
ued+0.003
 moment+0.003
 one+0.003
Neg logit contributions
 Fro-0.005
 Stri-0.004
 mir-0.004
 polar-0.004
 scar-0.004
Token.
Token ID13
Feature activation+1.03e-01

Pos logit contributions
 next+0.002
 first+0.002
ued+0.002
 moral+0.002
 moment+0.002
Neg logit contributions
 Fro-0.003
 Stri-0.002
 spe-0.002
 mir-0.002
 cho-0.002
Token him
Token ID683
Feature activation+5.84e-01

Pos logit contributions
 Fro+0.000
 Stri+0.000
 mir+0.000
 polar+0.000
 spe+0.000
Neg logit contributions
 next-0.000
 first-0.000
ued-0.000
 two-0.000
 one-0.000
Token.
Token ID13
Feature activation+4.70e-01

Pos logit contributions
 next+0.022
 first+0.021
ued+0.020
 two+0.017
working+0.017
Neg logit contributions
 Fro-0.028
 Stri-0.026
 polar-0.025
 Chick-0.023
 mir-0.023
Token\n
Token ID198
Feature activation+8.59e-02

Pos logit contributions
 next+0.015
 first+0.015
 two+0.011
 one+0.010
ued+0.008
Neg logit contributions
 mir-0.026
 Fro-0.026
 spe-0.026
 dam-0.025
 ig-0.025
Token\n
Token ID198
Feature activation+2.88e-01

Pos logit contributions
 next+0.003
 first+0.003
 one+0.002
 two+0.002
ued+0.002
Neg logit contributions
 Fro-0.005
 Stri-0.004
 mir-0.004
 polar-0.004
 ig-0.004
Token"
Token ID1
Feature activation+3.99e-01

Pos logit contributions
 first+0.010
 next+0.010
 one+0.008
ued+0.008
 two+0.008
Neg logit contributions
 Fro-0.015
 mir-0.014
 Stri-0.014
 polar-0.014
 spe-0.013
TokenSir
Token ID22788
Feature activation+1.94e+00

Pos logit contributions
 first+0.019
 next+0.018
ued+0.015
 one+0.015
 two+0.014
Neg logit contributions
 Fro-0.016
 mir-0.016
 Stri-0.015
 polar-0.015
 spe-0.015
Token,
Token ID11
Feature activation+7.81e-01

Pos logit contributions
 next+0.073
 first+0.070
ued+0.054
 two+0.052
 one+0.051
Neg logit contributions
 Fro-0.098
 mir-0.092
 Stri-0.092
 polar-0.091
 spe-0.089
Token sir
Token ID15967
Feature activation+3.93e-01

Pos logit contributions
 next+0.035
 first+0.034
 two+0.027
 one+0.025
 moral+0.022
Neg logit contributions
 Fro-0.037
 mir-0.035
 spe-0.034
icol-0.034
 dam-0.034
Token,
Token ID11
Feature activation+6.21e-01

Pos logit contributions
 next+0.014
 first+0.013
 two+0.010
 one+0.010
ued+0.009
Neg logit contributions
 Fro-0.021
 mir-0.020
 Stri-0.020
 polar-0.020
 spe-0.020
Token can
Token ID460
Feature activation+1.01e+00

Pos logit contributions
 next+0.027
 first+0.026
 two+0.020
 one+0.020
ued+0.017
Neg logit contributions
 Fro-0.030
 mir-0.028
 spe-0.028
 Stri-0.028
 dam-0.027
Token you
Token ID345
Feature activation-1.24e-01

Pos logit contributions
 next+0.043
 first+0.042
ued+0.035
 moment+0.034
 two+0.033
Neg logit contributions
 Fro-0.044
 Stri-0.041
 polar-0.039
 mir-0.039
 spe-0.038
Token ever
Token ID1683
Feature activation-1.18e-01

Pos logit contributions
 next+0.001
 first+0.001
 two+0.001
ued+0.001
 one+0.001
Neg logit contributions
 Fro-0.001
 mir-0.001
 spe-0.001
 Stri-0.001
 polar-0.001
Token tasted
Token ID29187
Feature activation-9.68e-01

Pos logit contributions
 Fro+0.005
 mir+0.005
 spe+0.005
 polar+0.005
 Stri+0.005
Neg logit contributions
 next-0.005
 first-0.005
ued-0.004
 two-0.004
 one-0.003
Token.
Token ID13
Feature activation-5.01e-02

Pos logit contributions
 Fro+0.046
 mir+0.045
 spe+0.043
 polar+0.043
 Stri+0.042
Neg logit contributions
 next-0.037
 first-0.036
ued-0.028
 two-0.028
 one-0.026
Token<|endoftext|>
Token ID50256
Feature activation+2.62e-01

Pos logit contributions
 Fro+0.003
 mir+0.002
 Stri+0.002
 polar+0.002
 spe+0.002
Neg logit contributions
 next-0.002
 first-0.002
 two-0.001
 one-0.001
ued-0.001
Token"
Token ID1
Feature activation+1.22e+00

Pos logit contributions
 next+0.010
 first+0.010
 two+0.007
 one+0.007
ued+0.007
Neg logit contributions
 Fro-0.013
 mir-0.013
 Stri-0.013
 polar-0.012
 spe-0.012
TokenZ
Token ID57
Feature activation+1.94e+00

Pos logit contributions
 first+0.049
 next+0.048
ued+0.042
 closely+0.035
 behind+0.035
Neg logit contributions
 Fro-0.055
 Stri-0.052
 polar-0.051
 mir-0.050
avis-0.049
Tokenoom
Token ID4207
Feature activation+1.83e+00

Pos logit contributions
 next+0.095
 first+0.091
 two+0.075
 one+0.073
ued+0.070
Neg logit contributions
 Fro-0.079
 mir-0.072
 Stri-0.070
 polar-0.069
 ig-0.069
Token!"
Token ID2474
Feature activation-1.02e+00

Pos logit contributions
 next+0.070
 first+0.068
 two+0.051
 one+0.050
ued+0.047
Neg logit contributions
 Fro-0.093
 mir-0.090
 Stri-0.087
 polar-0.087
 spe-0.087
Token went
Token ID1816
Feature activation+3.01e-01

Pos logit contributions
 Fro+0.039
 mir+0.038
 Stri+0.036
 spe+0.036
 polar+0.036
Neg logit contributions
 next-0.051
 first-0.050
 two-0.041
 one-0.040
ued-0.039
Token the
Token ID262
Feature activation-1.83e-01

Pos logit contributions
 next+0.014
 first+0.013
 two+0.012
 moral+0.012
ued+0.011
Neg logit contributions
 Fro-0.011
 Stri-0.010
 polar-0.010
 mir-0.009
 cho-0.009
Token twins
Token ID20345
Feature activation-1.10e-01

Pos logit contributions
 Fro+0.008
 polar+0.007
 mir+0.007
 Stri+0.007
 ig+0.007
Neg logit contributions
 next-0.008
 first-0.007
ued-0.007
 one-0.006
 two-0.006
Token and
Token ID290
Feature activation+1.51e-01

Pos logit contributions
 next+0.010
 first+0.010
ued+0.008
 two+0.008
working+0.007
Neg logit contributions
 Fro-0.012
 Stri-0.011
 mir-0.011
 polar-0.011
 ig-0.011
Token then
Token ID788
Feature activation+4.66e-01

Pos logit contributions
 next+0.005
 first+0.005
ued+0.004
oning+0.004
 two+0.004
Neg logit contributions
 Fro-0.007
 mir-0.007
 polar-0.007
 ig-0.007
 Stri-0.007
Token smiled
Token ID13541
Feature activation+4.87e-01

Pos logit contributions
 next+0.017
 first+0.017
ued+0.013
oning+0.012
 two+0.012
Neg logit contributions
 Fro-0.023
 mir-0.022
 polar-0.022
 Stri-0.021
 ig-0.021
Token.
Token ID13
Feature activation-5.10e-02

Pos logit contributions
 next+0.016
 first+0.015
 two+0.011
 one+0.011
ued+0.010
Neg logit contributions
 Fro-0.027
 mir-0.027
 Stri-0.026
 polar-0.026
 spe-0.026
Token "
Token ID366
Feature activation+8.28e-01

Pos logit contributions
 Fro+0.002
 mir+0.002
 Stri+0.002
 polar+0.002
 ig+0.002
Neg logit contributions
 next-0.002
 first-0.002
ued-0.002
 one-0.002
 two-0.002
TokenOk
Token ID18690
Feature activation+1.93e+00

Pos logit contributions
 first+0.036
 next+0.034
ued+0.028
 two+0.027
 one+0.026
Neg logit contributions
 Fro-0.035
 mir-0.035
 Stri-0.034
 spe-0.033
avis-0.033
Token,
Token ID11
Feature activation+8.54e-01

Pos logit contributions
 next+0.073
 first+0.070
ued+0.054
 two+0.053
 one+0.051
Neg logit contributions
 Fro-0.094
 mir-0.088
 polar-0.088
 Stri-0.087
 spe-0.086
Token Mom
Token ID11254
Feature activation-3.23e-01

Pos logit contributions
 next+0.039
 first+0.038
 two+0.030
 one+0.029
ued+0.028
Neg logit contributions
 Fro-0.038
 mir-0.036
 Stri-0.035
 polar-0.035
 spe-0.035
Tokenmy
Token ID1820
Feature activation-1.72e-03

Pos logit contributions
 Fro+0.018
 mir+0.017
 Stri+0.017
 polar+0.017
 spe+0.017
Neg logit contributions
 next-0.011
 first-0.010
 two-0.007
 one-0.007
ued-0.007
Token.
Token ID13
Feature activation-2.86e-01

Pos logit contributions
 Fro+0.000
 mir+0.000
 Stri+0.000
 polar+0.000
 spe+0.000
Neg logit contributions
 next-0.000
 first-0.000
 two-0.000
ued-0.000
 one-0.000
Token I
Token ID314
Feature activation-3.94e-01

Pos logit contributions
 Fro+0.014
 mir+0.013
 Stri+0.013
 polar+0.013
 spe+0.013
Neg logit contributions
 next-0.012
 first-0.011
 two-0.009
 one-0.009
ued-0.008
Token said
Token ID531
Feature activation+4.65e-01

Pos logit contributions
 Stri+0.035
 Fro+0.034
 spe+0.032
 ig+0.032
 polar+0.032
Neg logit contributions
 next-0.034
 first-0.032
ued-0.029
 moral-0.028
oning-0.027
Token not
Token ID407
Feature activation-1.71e-01

Pos logit contributions
 next+0.019
 first+0.018
ued+0.014
 two+0.014
 one+0.013
Neg logit contributions
 Fro-0.022
 mir-0.022
 spe-0.021
 Stri-0.021
 polar-0.020
Token to
Token ID284
Feature activation+8.05e-02

Pos logit contributions
 Fro+0.006
 mir+0.006
 Stri+0.006
 ig+0.006
 coc+0.006
Neg logit contributions
 next-0.008
 first-0.008
ued-0.007
 two-0.007
 one-0.006
Token go
Token ID467
Feature activation+1.02e+00

Pos logit contributions
 next+0.004
 first+0.003
 two+0.003
 one+0.003
ued+0.003
Neg logit contributions
 Fro-0.004
 mir-0.004
 Stri-0.003
 polar-0.003
 spe-0.003
Token too
Token ID1165
Feature activation-1.14e-01

Pos logit contributions
 next+0.030
 first+0.028
ued+0.026
 moral+0.025
 two+0.025
Neg logit contributions
 Fro-0.051
 mir-0.048
 dam-0.048
 polar-0.048
 Stri-0.047
Token far
Token ID1290
Feature activation+1.92e+00

Pos logit contributions
 mir+0.005
 spe+0.005
 Fro+0.005
 ig+0.004
 coc+0.004
Neg logit contributions
 next-0.004
oned-0.004
ued-0.004
oning-0.004
ving-0.003
Token,"
Token ID553
Feature activation-7.92e-01

Pos logit contributions
 next+0.073
 first+0.068
ued+0.067
 two+0.064
 moral+0.064
Neg logit contributions
 Fro-0.075
 mir-0.071
 Stri-0.071
 polar-0.069
 scar-0.069
Token Ben
Token ID3932
Feature activation+3.61e-01

Pos logit contributions
 Fro+0.031
 Stri+0.028
 mir+0.027
 polar+0.027
 spe+0.026
Neg logit contributions
 next-0.040
 first-0.039
ued-0.034
 two-0.031
 one-0.030
Token said
Token ID531
Feature activation+2.61e-01

Pos logit contributions
 next+0.016
 first+0.015
ued+0.013
 moment+0.012
oning+0.012
Neg logit contributions
 Fro-0.016
 Stri-0.014
 mir-0.013
 polar-0.013
 scar-0.013
Token.
Token ID13
Feature activation+1.27e-01

Pos logit contributions
 next+0.011
 first+0.011
 moral+0.010
ued+0.009
 two+0.009
Neg logit contributions
 Fro-0.010
 Stri-0.009
 spe-0.008
 numb-0.008
 cho-0.008
Token\n
Token ID198
Feature activation-2.37e-02

Pos logit contributions
 next+0.005
 first+0.004
ued+0.003
 two+0.003
 one+0.003
Neg logit contributions
 Fro-0.007
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token girl
Token ID2576
Feature activation-3.20e-01

Pos logit contributions
 next+0.006
 first+0.005
 two+0.004
 one+0.004
 moment+0.004
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token finally
Token ID3443
Feature activation+1.94e-02

Pos logit contributions
 Fro+0.014
 mir+0.013
 Stri+0.013
 spe+0.013
 polar+0.013
Neg logit contributions
 next-0.014
 first-0.013
 two-0.010
ued-0.010
 one-0.010
Token understood
Token ID7247
Feature activation-2.14e-01

Pos logit contributions
 next+0.001
 first+0.001
 two+0.001
 one+0.001
ued+0.001
Neg logit contributions
 Fro-0.001
 mir-0.001
 Stri-0.001
 polar-0.001
 spe-0.001
Token.
Token ID13
Feature activation+1.16e-01

Pos logit contributions
 Fro+0.011
 mir+0.010
 Stri+0.010
 polar+0.010
 ig+0.010
Neg logit contributions
 next-0.008
 first-0.008
 two-0.006
ued-0.006
 one-0.006
Token "
Token ID366
Feature activation+1.03e+00

Pos logit contributions
 next+0.005
 first+0.005
 two+0.004
 one+0.004
ued+0.004
Neg logit contributions
 Fro-0.005
 mir-0.005
 Stri-0.004
 spe-0.004
 polar-0.004
TokenOk
Token ID18690
Feature activation+1.91e+00

Pos logit contributions
 first+0.046
 next+0.043
ued+0.037
 behind+0.036
 two+0.035
Neg logit contributions
 Fro-0.040
avis-0.039
 spe-0.039
 mir-0.039
 Stri-0.038
Token,"
Token ID553
Feature activation+1.14e-01

Pos logit contributions
 next+0.068
 first+0.065
 two+0.048
 one+0.046
ued+0.046
Neg logit contributions
 Fro-0.101
 mir-0.096
 Stri-0.094
 polar-0.094
 spe-0.094
Token she
Token ID673
Feature activation-5.30e-02

Pos logit contributions
 next+0.004
 first+0.004
 two+0.003
 one+0.002
ued+0.002
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 spe-0.006
 polar-0.006
Token said
Token ID531
Feature activation-3.49e-01

Pos logit contributions
 Fro+0.003
 mir+0.003
 Stri+0.003
 polar+0.002
 spe+0.002
Neg logit contributions
 next-0.002
 first-0.002
 two-0.001
 one-0.001
ued-0.001
Token,
Token ID11
Feature activation+3.20e-01

Pos logit contributions
 Fro+0.014
 spe+0.014
 Stri+0.013
 por+0.013
 mir+0.013
Neg logit contributions
 next-0.013
 first-0.013
ued-0.011
 moral-0.011
 two-0.011
Token "
Token ID366
Feature activation+6.68e-02

Pos logit contributions
 first+0.013
 next+0.013
ued+0.012
working+0.011
 moral+0.010
Neg logit contributions
 Fro-0.013
 spe-0.012
 Stri-0.012
 mir-0.011
 por-0.011
Token do
Token ID466
Feature activation-5.35e-01

Pos logit contributions
 Fro+0.030
 mir+0.028
 Stri+0.027
 polar+0.027
 ig+0.026
Neg logit contributions
 next-0.044
 first-0.043
 two-0.036
 one-0.035
ued-0.034
Token.
Token ID13
Feature activation-1.51e-01

Pos logit contributions
 Fro+0.018
 mir+0.017
 Stri+0.016
avis+0.016
 such+0.016
Neg logit contributions
 first-0.020
 next-0.020
ued-0.019
 two-0.018
five-0.018
Token She
Token ID1375
Feature activation+1.07e-01

Pos logit contributions
 Fro+0.006
 Stri+0.005
 mir+0.005
icol+0.005
 spe+0.005
Neg logit contributions
 next-0.008
 first-0.008
 two-0.006
ued-0.006
 one-0.006
Token knew
Token ID2993
Feature activation-3.31e-02

Pos logit contributions
 next+0.004
 first+0.004
 two+0.003
 one+0.003
ued+0.003
Neg logit contributions
 Fro-0.005
 mir-0.005
 Stri-0.005
 polar-0.005
 spe-0.005
Token that
Token ID326
Feature activation+1.33e-01

Pos logit contributions
 Fro+0.001
 Stri+0.001
 mir+0.001
 ig+0.001
avis+0.001
Neg logit contributions
 next-0.001
 first-0.001
ued-0.001
 two-0.001
 moment-0.001
Token staying
Token ID10589
Feature activation+1.91e+00

Pos logit contributions
 next+0.005
 first+0.005
ued+0.004
 two+0.004
 one+0.003
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 spe-0.006
 ig-0.006
Token out
Token ID503
Feature activation+1.45e-01

Pos logit contributions
 first+0.065
 next+0.062
ued+0.055
t+0.051
five+0.050
Neg logit contributions
 mir-0.086
 polar-0.084
 dam-0.084
 Fro-0.083
 ig-0.082
Token late
Token ID2739
Feature activation-3.81e-02

Pos logit contributions
 next+0.006
 first+0.006
ued+0.005
 two+0.005
 one+0.004
Neg logit contributions
 Fro-0.006
 mir-0.006
 polar-0.006
 ig-0.006
 spe-0.006
Token was
Token ID373
Feature activation+3.36e-01

Pos logit contributions
 Fro+0.002
 Stri+0.002
 mir+0.002
 spe+0.002
 polar+0.002
Neg logit contributions
 next-0.001
 first-0.001
ued-0.001
 two-0.001
 moral-0.001
Token stupid
Token ID8531
Feature activation+4.06e-01

Pos logit contributions
 next+0.013
 first+0.013
ued+0.011
 two+0.010
five+0.010
Neg logit contributions
 mir-0.014
 Fro-0.014
 Stri-0.014
 polar-0.014
 ig-0.014
Token and
Token ID290
Feature activation-1.06e-02

Pos logit contributions
 next+0.014
 first+0.014
 two+0.010
ued+0.010
 one+0.010
Neg logit contributions
 Fro-0.021
 mir-0.020
 Stri-0.020
 polar-0.020
 spe-0.020
Token cl
Token ID537
Feature activation+1.69e-01

Pos logit contributions
 Fro+0.021
 mir+0.020
 Stri+0.019
 polar+0.019
 spe+0.019
Neg logit contributions
 next-0.017
 first-0.016
 two-0.012
 one-0.012
ued-0.011
Tokenapped
Token ID6320
Feature activation+3.88e-01

Pos logit contributions
 next+0.005
 first+0.005
 two+0.004
 one+0.004
ued+0.003
Neg logit contributions
 mir-0.010
 spe-0.010
 Stri-0.009
 Fro-0.009
 scar-0.009
Token her
Token ID607
Feature activation-8.04e-01

Pos logit contributions
 next+0.015
 first+0.015
ued+0.012
 two+0.012
 one+0.011
Neg logit contributions
 Fro-0.018
 mir-0.017
 Stri-0.017
 polar-0.017
 spe-0.017
Token hands
Token ID2832
Feature activation-9.55e-02

Pos logit contributions
 Fro+0.027
 spe+0.026
 coc+0.025
 scar+0.025
 mir+0.023
Neg logit contributions
 next-0.040
 first-0.039
ued-0.031
working-0.031
 moral-0.031
Token.
Token ID13
Feature activation+3.76e-02

Pos logit contributions
 Fro+0.004
 spe+0.004
light+0.003
eter+0.003
 Stri+0.003
Neg logit contributions
 next-0.004
 moral-0.003
 first-0.003
ued-0.003
ruits-0.003
Token Big
Token ID4403
Feature activation+1.91e+00

Pos logit contributions
 next+0.002
 first+0.002
ued+0.001
 two+0.001
 one+0.001
Neg logit contributions
 Fro-0.002
 Stri-0.002
 mir-0.002
 polar-0.002
 spe-0.002
Token brother
Token ID3956
Feature activation+1.11e-01

Pos logit contributions
 next+0.097
 first+0.095
 two+0.077
ued+0.076
 one+0.076
Neg logit contributions
 Fro-0.071
 mir-0.066
 Stri-0.066
 polar-0.065
 spe-0.064
Token smiled
Token ID13541
Feature activation+5.32e-01

Pos logit contributions
 next+0.005
 first+0.004
ued+0.003
 two+0.003
 moral+0.003
Neg logit contributions
 Fro-0.005
 Stri-0.005
 polar-0.005
 mir-0.005
 spe-0.005
Token and
Token ID290
Feature activation+4.40e-01

Pos logit contributions
 next+0.016
 first+0.015
 two+0.010
 one+0.010
ued+0.009
Neg logit contributions
 Fro-0.032
 mir-0.031
 Stri-0.030
 polar-0.030
 spe-0.030
Token held
Token ID2714
Feature activation+4.50e-01

Pos logit contributions
 next+0.016
 first+0.016
 two+0.012
 one+0.012
ued+0.010
Neg logit contributions
 mir-0.023
 Fro-0.023
 ig-0.022
 polar-0.022
 dam-0.022
Token her
Token ID607
Feature activation-4.37e-01

Pos logit contributions
 next+0.018
 first+0.017
 moral+0.013
ued+0.013
 moment+0.012
Neg logit contributions
 Fro-0.021
 Stri-0.020
 polar-0.019
 por-0.019
 spe-0.019
Token shape
Token ID5485
Feature activation+3.79e-01

Pos logit contributions
ued+0.025
 first+0.025
 next+0.024
oned+0.023
cases+0.023
Neg logit contributions
 ig-0.019
 cob-0.018
 polar-0.018
 coc-0.018
 Fro-0.017
Token started
Token ID2067
Feature activation-1.17e-01

Pos logit contributions
 first+0.016
 next+0.016
ued+0.012
 two+0.012
oning+0.012
Neg logit contributions
 Fro-0.016
 mir-0.016
 polar-0.015
 ig-0.015
 spe-0.015
Token to
Token ID284
Feature activation-2.93e-02

Pos logit contributions
 Fro+0.006
 mir+0.005
 Stri+0.005
 polar+0.005
 spe+0.005
Neg logit contributions
 next-0.005
 first-0.005
 two-0.004
 one-0.004
ued-0.003
Token weep
Token ID49671
Feature activation-3.13e-01

Pos logit contributions
 Fro+0.001
 mir+0.001
 Stri+0.001
 polar+0.001
 spe+0.001
Neg logit contributions
 next-0.001
 first-0.001
 two-0.001
 one-0.001
ued-0.001
Token.
Token ID13
Feature activation+1.37e-01

Pos logit contributions
 Fro+0.018
 mir+0.018
 Stri+0.017
 spe+0.017
 polar+0.017
Neg logit contributions
 next-0.010
 first-0.009
 one-0.007
 two-0.007
ued-0.005
Token Big
Token ID4403
Feature activation+1.91e+00

Pos logit contributions
 next+0.006
 first+0.006
 two+0.005
 one+0.005
ued+0.004
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token,
Token ID11
Feature activation+8.58e-01

Pos logit contributions
 next+0.089
 first+0.088
ued+0.072
 two+0.070
 one+0.069
Neg logit contributions
 Fro-0.074
 spe-0.070
 mir-0.069
 Stri-0.068
 polar-0.068
Token round
Token ID2835
Feature activation-5.18e-03

Pos logit contributions
 next+0.037
 first+0.036
ued+0.028
 two+0.027
 one+0.027
Neg logit contributions
 Fro-0.039
 mir-0.037
 polar-0.037
 Stri-0.037
 spe-0.036
Token tears
Token ID10953
Feature activation-5.24e-01

Pos logit contributions
 Fro+0.000
 mir+0.000
 Stri+0.000
 polar+0.000
 spe+0.000
Neg logit contributions
 next-0.000
 first-0.000
 two-0.000
 one-0.000
ued-0.000
Token rolled
Token ID11686
Feature activation+5.47e-01

Pos logit contributions
 Fro+0.021
 spe+0.020
 mir+0.020
 polar+0.020
 Stri+0.020
Neg logit contributions
 next-0.024
 first-0.024
 two-0.018
 one-0.018
ued-0.018
Token down
Token ID866
Feature activation-1.50e-01

Pos logit contributions
 next+0.018
 first+0.018
 two+0.013
 one+0.012
ued+0.012
Neg logit contributions
 Fro-0.030
 mir-0.029
 Stri-0.028
 polar-0.028
 spe-0.028
Token\n
Token ID198
Feature activation-8.97e-02

Pos logit contributions
 next+0.008
 first+0.007
ued+0.005
 two+0.005
 one+0.005
Neg logit contributions
 Fro-0.010
 mir-0.010
 Stri-0.010
 spe-0.009
 polar-0.009
Token\n
Token ID198
Feature activation+1.57e-01

Pos logit contributions
 Fro+0.005
 mir+0.005
 spe+0.005
 polar+0.005
 Stri+0.005
Neg logit contributions
 next-0.003
 first-0.003
 two-0.002
 one-0.002
ued-0.002
Token"
Token ID1
Feature activation+5.50e-01

Pos logit contributions
 next+0.006
 first+0.006
 two+0.005
ued+0.004
 one+0.004
Neg logit contributions
 Fro-0.008
 mir-0.007
 Stri-0.007
 polar-0.007
 spe-0.007
TokenYes
Token ID5297
Feature activation+7.29e-01

Pos logit contributions
 first+0.023
 next+0.022
ued+0.018
 one+0.017
 two+0.017
Neg logit contributions
 Fro-0.025
 spe-0.024
 mir-0.023
 Stri-0.023
 polar-0.022
Token,
Token ID11
Feature activation+3.66e-01

Pos logit contributions
 next+0.026
 first+0.025
 two+0.018
 one+0.018
ued+0.017
Neg logit contributions
 Fro-0.039
 mir-0.038
 Stri-0.037
 polar-0.036
 spe-0.036
Token thank
Token ID5875
Feature activation+1.90e+00

Pos logit contributions
 next+0.016
 first+0.016
 two+0.012
 one+0.012
ued+0.011
Neg logit contributions
 Fro-0.017
 mir-0.016
 polar-0.016
 Stri-0.016
 dam-0.015
Token you
Token ID345
Feature activation+2.24e-01

Pos logit contributions
 next+0.077
 first+0.075
 two+0.060
ued+0.059
 one+0.058
Neg logit contributions
 Fro-0.088
 Stri-0.081
 mir-0.080
 spe-0.080
 polar-0.077
Token,"
Token ID553
Feature activation-2.76e-01

Pos logit contributions
 next+0.008
 first+0.008
ued+0.006
 two+0.006
 one+0.006
Neg logit contributions
 Fro-0.012
 mir-0.011
 polar-0.011
 Stri-0.011
 ig-0.010
Token the
Token ID262
Feature activation+1.42e-01

Pos logit contributions
 Fro+0.012
 mir+0.011
 Stri+0.011
 polar+0.011
 spe+0.011
Neg logit contributions
 next-0.013
 first-0.013
 two-0.010
 one-0.010
ued-0.009
Token doll
Token ID3654
Feature activation+1.19e-01

Pos logit contributions
 next+0.007
 first+0.007
ued+0.006
 one+0.006
 two+0.006
Neg logit contributions
 Fro-0.005
 polar-0.005
 mir-0.005
 Stri-0.005
 spe-0.005
Token said
Token ID531
Feature activation-5.83e-02

Pos logit contributions
 next+0.004
 first+0.004
 two+0.003
ued+0.003
 one+0.003
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token says
Token ID1139
Feature activation-2.87e-01

Pos logit contributions
 next+0.018
 first+0.017
ued+0.015
 moment+0.015
ging+0.015
Neg logit contributions
 Fro-0.014
 Stri-0.013
 scar-0.012
 mir-0.012
 por-0.011
Token.
Token ID13
Feature activation-2.01e-01

Pos logit contributions
 Fro+0.013
 Stri+0.012
 spe+0.011
 mir+0.011
 dam+0.011
Neg logit contributions
 next-0.011
 first-0.011
ued-0.009
 moral-0.009
 two-0.008
Token
Token ID198
Feature activation-2.81e-01

Pos logit contributions
 Fro+0.009
 Stri+0.008
 mir+0.008
 spe+0.008
 polar+0.008
Neg logit contributions
 next-0.009
 first-0.008
ued-0.007
 two-0.007
 one-0.006
Token
Token ID198
Feature activation+9.40e-03

Pos logit contributions
 Fro+0.016
 mir+0.015
 Stri+0.015
 polar+0.015
 spe+0.015
Neg logit contributions
 next-0.009
 first-0.009
 two-0.006
 one-0.006
ued-0.006
Token"
Token ID1
Feature activation+6.86e-01

Pos logit contributions
 next+0.000
 first+0.000
 two+0.000
 one+0.000
ued+0.000
Neg logit contributions
 Fro-0.001
 mir-0.001
 Stri-0.001
 polar-0.001
 spe-0.001
TokenOK
Token ID11380
Feature activation+1.89e+00

Pos logit contributions
 next+0.025
 first+0.025
ued+0.020
 two+0.019
 one+0.019
Neg logit contributions
 mir-0.032
 Fro-0.031
 spe-0.031
 Stri-0.031
 scar-0.029
Token,"
Token ID553
Feature activation-7.74e-01

Pos logit contributions
 next+0.071
 first+0.067
ued+0.056
 two+0.051
 moral+0.050
Neg logit contributions
 Fro-0.089
 Stri-0.084
 spe-0.084
 mir-0.084
 polar-0.083
Token Ben
Token ID3932
Feature activation+7.99e-01

Pos logit contributions
 mir+0.037
 Fro+0.035
 spe+0.035
 polar+0.034
 dam+0.034
Neg logit contributions
 next-0.034
 first-0.033
 two-0.027
 one-0.026
ued-0.022
Token says
Token ID1139
Feature activation+1.00e-01

Pos logit contributions
 next+0.036
 first+0.034
ued+0.030
ging+0.030
 moment+0.029
Neg logit contributions
 Fro-0.031
 Stri-0.028
 scar-0.027
 mir-0.025
 por-0.025
Token.
Token ID13
Feature activation+1.88e-02

Pos logit contributions
 next+0.004
 first+0.004
ued+0.004
 moral+0.003
 two+0.003
Neg logit contributions
 Fro-0.004
 spe-0.004
 Stri-0.003
 polar-0.003
 dam-0.003
Token
Token ID198
Feature activation+3.03e-02

Pos logit contributions
 next+0.001
 first+0.001
ued+0.001
 two+0.001
 one+0.001
Neg logit contributions
 Fro-0.001
 Stri-0.001
 mir-0.001
 polar-0.001
 spe-0.001
Token He
Token ID679
Feature activation+5.75e-01

Pos logit contributions
 Fro+0.004
 mir+0.003
 ig+0.003
 polar+0.003
 Stri+0.003
Neg logit contributions
 next-0.006
 first-0.006
ued-0.005
 two-0.005
ging-0.004
Token enjoyed
Token ID8359
Feature activation-1.64e-01

Pos logit contributions
 next+0.021
 first+0.020
 two+0.015
ued+0.014
 one+0.014
Neg logit contributions
 Fro-0.030
 mir-0.029
 Stri-0.028
 polar-0.028
 spe-0.028
Token writing
Token ID3597
Feature activation+1.39e-01

Pos logit contributions
 Fro+0.008
 mir+0.008
 Stri+0.008
 polar+0.008
 spe+0.008
Neg logit contributions
 next-0.006
 first-0.006
 two-0.005
 one-0.005
ued-0.005
Token z
Token ID1976
Feature activation+1.64e+00

Pos logit contributions
 next+0.005
 first+0.005
ued+0.004
 two+0.004
 one+0.004
Neg logit contributions
 Fro-0.006
 Stri-0.006
 mir-0.006
 polar-0.006
 scar-0.006
Tokenig
Token ID328
Feature activation-5.35e-02

Pos logit contributions
 next+0.079
 first+0.077
 two+0.061
 one+0.060
ued+0.058
Neg logit contributions
 Fro-0.068
 mir-0.067
 polar-0.064
 spe-0.064
 Stri-0.063
Tokenz
Token ID89
Feature activation+1.89e+00

Pos logit contributions
 Fro+0.003
 Stri+0.002
 mir+0.002
 polar+0.002
 ig+0.002
Neg logit contributions
 next-0.002
 first-0.002
 two-0.002
 one-0.002
ued-0.002
Tokenags
Token ID3775
Feature activation-4.12e-01

Pos logit contributions
 next+0.075
 first+0.072
 two+0.055
 one+0.055
 moment+0.053
Neg logit contributions
 Fro-0.089
 mir-0.084
 Stri-0.083
avis-0.081
 ig-0.081
Token.
Token ID13
Feature activation+4.33e-01

Pos logit contributions
 Fro+0.021
 Stri+0.021
 mir+0.020
 ig+0.020
 polar+0.020
Neg logit contributions
 next-0.014
 first-0.014
ued-0.010
 two-0.010
 moral-0.009
Token He
Token ID679
Feature activation+2.25e-02

Pos logit contributions
 next+0.020
 first+0.020
ued+0.017
ging+0.016
ve+0.015
Neg logit contributions
 Fro-0.017
 Stri-0.016
 ig-0.016
 mir-0.016
 polar-0.015
Token wrote
Token ID2630
Feature activation-3.11e-01

Pos logit contributions
 next+0.001
 first+0.001
ued+0.001
 two+0.001
oning+0.001
Neg logit contributions
 Fro-0.001
 Stri-0.001
 mir-0.001
 ig-0.001
 polar-0.001
Token z
Token ID1976
Feature activation+1.10e+00

Pos logit contributions
 Fro+0.013
 Stri+0.013
 mir+0.012
 polar+0.012
lic+0.012
Neg logit contributions
 next-0.013
 first-0.013
ued-0.010
 two-0.009
 moment-0.009
Token a
Token ID257
Feature activation-3.05e-01

Pos logit contributions
 Fro+0.004
 Stri+0.004
 mir+0.004
 polar+0.004
 spe+0.004
Neg logit contributions
 next-0.003
 first-0.003
ued-0.002
 two-0.002
 one-0.002
Token town
Token ID3240
Feature activation+7.33e-01

Pos logit contributions
 Fro+0.019
rim+0.018
urg+0.017
 mir+0.017
 Stri+0.017
Neg logit contributions
 next-0.010
 first-0.009
 two-0.007
 one-0.007
 moment-0.005
Token with
Token ID351
Feature activation-5.74e-03

Pos logit contributions
ued+0.030
 first+0.028
oning+0.028
 moral+0.027
oned+0.027
Neg logit contributions
 ig-0.026
 spe-0.024
 dam-0.023
 cho-0.021
 mir-0.021
Token many
Token ID867
Feature activation+9.90e-01

Pos logit contributions
 Fro+0.000
 Stri+0.000
 polar+0.000
 mir+0.000
 ig+0.000
Neg logit contributions
 next-0.000
 first-0.000
ued-0.000
 moment-0.000
 one-0.000
Token other
Token ID584
Feature activation+1.08e+00

Pos logit contributions
 next+0.036
 first+0.036
ued+0.026
 one+0.025
 two+0.025
Neg logit contributions
 mir-0.049
 Fro-0.048
 ig-0.048
 polar-0.048
 spe-0.047
Token engines
Token ID11874
Feature activation+1.89e+00

Pos logit contributions
 next+0.052
 first+0.051
ued+0.042
 one+0.039
 two+0.039
Neg logit contributions
 polar-0.040
 ig-0.040
 Stri-0.040
 Fro-0.040
 mir-0.040
Token,
Token ID11
Feature activation-1.48e-01

Pos logit contributions
 first+0.076
 next+0.076
ued+0.062
 moral+0.060
 moment+0.059
Neg logit contributions
 Fro-0.080
 ig-0.079
 Stri-0.078
 dam-0.077
 polar-0.076
Token big
Token ID1263
Feature activation+4.05e-01

Pos logit contributions
 spe+0.006
 Fro+0.006
 Stri+0.006
 polar+0.005
 scar+0.005
Neg logit contributions
 next-0.006
ued-0.006
 first-0.006
 moral-0.005
oning-0.005
Token and
Token ID290
Feature activation+5.00e-02

Pos logit contributions
 next+0.017
 first+0.017
 two+0.014
 one+0.013
ued+0.012
Neg logit contributions
 Fro-0.019
 mir-0.018
 Stri-0.017
 polar-0.017
 spe-0.017
Token small
Token ID1402
Feature activation+1.56e-01

Pos logit contributions
 next+0.002
 first+0.002
ued+0.002
oning+0.002
oned+0.001
Neg logit contributions
 spe-0.002
 Stri-0.002
 polar-0.002
 Fro-0.002
 mir-0.002
Token.
Token ID13
Feature activation+7.73e-02

Pos logit contributions
 next+0.006
 first+0.006
 two+0.005
 one+0.005
ued+0.004
Neg logit contributions
 Fro-0.008
 mir-0.007
 Stri-0.007
avis-0.007
rim-0.007
Token The
Token ID383
Feature activation-2.70e-01

Pos logit contributions
 next+0.012
 first+0.012
 two+0.009
 one+0.009
ued+0.009
Neg logit contributions
 Fro-0.013
 mir-0.013
 Stri-0.012
 polar-0.012
 spe-0.012
Token park
Token ID3952
Feature activation+1.06e+00

Pos logit contributions
 Fro+0.011
 mir+0.010
 polar+0.010
 ig+0.010
 Stri+0.010
Neg logit contributions
 next-0.012
 first-0.012
ued-0.010
 one-0.009
 two-0.009
Token had
Token ID550
Feature activation+2.12e-01

Pos logit contributions
 first+0.047
 next+0.046
ued+0.044
oning+0.044
oned+0.040
Neg logit contributions
 Fro-0.038
 mir-0.036
 cho-0.035
 ig-0.035
 spe-0.035
Token lots
Token ID6041
Feature activation+8.19e-01

Pos logit contributions
 next+0.007
 first+0.007
 two+0.005
 one+0.005
ued+0.004
Neg logit contributions
 Fro-0.012
 mir-0.012
 Stri-0.011
 polar-0.011
 spe-0.011
Token of
Token ID286
Feature activation+6.49e-01

Pos logit contributions
 next+0.040
 first+0.040
 two+0.032
 one+0.032
ued+0.031
Neg logit contributions
 Fro-0.032
 mir-0.030
 polar-0.029
 spe-0.029
 Stri-0.029
Token high
Token ID1029
Feature activation+1.89e+00

Pos logit contributions
 next+0.031
 first+0.031
ued+0.024
 one+0.023
 two+0.023
Neg logit contributions
 Fro-0.026
 mir-0.025
 ig-0.024
 polar-0.024
 Stri-0.023
Token slides
Token ID19392
Feature activation+3.26e-01

Pos logit contributions
 next+0.083
 first+0.081
 two+0.063
 one+0.061
ued+0.061
Neg logit contributions
 Fro-0.084
 mir-0.081
 spe-0.078
 polar-0.078
 Stri-0.078
Token.
Token ID13
Feature activation+6.05e-01

Pos logit contributions
 next+0.011
 first+0.011
 two+0.008
ued+0.008
 moral+0.007
Neg logit contributions
 Fro-0.017
 mir-0.017
 Stri-0.016
 spe-0.016
 polar-0.016
Token It
Token ID632
Feature activation+4.65e-01

Pos logit contributions
 next+0.027
 first+0.026
 two+0.020
 one+0.019
ued+0.019
Neg logit contributions
 Fro-0.028
 mir-0.026
 Stri-0.026
 polar-0.025
 spe-0.025
Token was
Token ID373
Feature activation+2.09e-01

Pos logit contributions
 next+0.020
 first+0.020
ued+0.016
 two+0.016
oning+0.015
Neg logit contributions
 Fro-0.020
 mir-0.019
 Stri-0.019
 polar-0.019
 spe-0.019
Token fun
Token ID1257
Feature activation+5.09e-01

Pos logit contributions
 next+0.009
 first+0.009
 two+0.007
 one+0.007
 moment+0.006
Neg logit contributions
 Fro-0.010
 mir-0.010
 Stri-0.009
 polar-0.009
 spe-0.009
Token"
Token ID1
Feature activation+5.43e-01

Pos logit contributions
 first+0.011
 next+0.011
ued+0.009
 one+0.009
 moment+0.008
Neg logit contributions
 Fro-0.013
 mir-0.013
 Stri-0.013
 spe-0.012
Su-0.012
TokenI
Token ID40
Feature activation-2.59e-01

Pos logit contributions
 first+0.025
 next+0.024
ued+0.020
 one+0.019
 two+0.019
Neg logit contributions
 Fro-0.022
 spe-0.022
 mir-0.022
 Stri-0.021
 polar-0.020
Token'm
Token ID1101
Feature activation+3.46e-01

Pos logit contributions
 Fro+0.010
 mir+0.010
 Stri+0.010
 polar+0.009
 spe+0.009
Neg logit contributions
 next-0.013
 first-0.013
 two-0.010
 one-0.010
ued-0.009
Token okay
Token ID8788
Feature activation+9.03e-02

Pos logit contributions
 next+0.013
 first+0.013
 two+0.010
 one+0.009
ued+0.008
Neg logit contributions
 Fro-0.018
 mir-0.018
 Stri-0.017
 spe-0.017
icol-0.017
Token,
Token ID11
Feature activation+9.72e-01

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
 one+0.002
ued+0.002
Neg logit contributions
 Fro-0.005
 mir-0.005
 Stri-0.005
 polar-0.005
 spe-0.005
Token thank
Token ID5875
Feature activation+1.88e+00

Pos logit contributions
 next+0.044
 first+0.042
ued+0.034
 two+0.033
 one+0.033
Neg logit contributions
 Fro-0.042
 mir-0.040
 spe-0.039
 Stri-0.039
 polar-0.038
Token you
Token ID345
Feature activation+3.11e-01

Pos logit contributions
 next+0.077
 first+0.076
ued+0.061
 two+0.060
 one+0.058
Neg logit contributions
 Fro-0.085
 Stri-0.078
 mir-0.077
 spe-0.077
avis-0.073
Token,"
Token ID553
Feature activation+9.90e-02

Pos logit contributions
 next+0.010
 first+0.010
 two+0.007
 one+0.007
ued+0.006
Neg logit contributions
 Fro-0.018
 mir-0.017
 Stri-0.017
 polar-0.017
 spe-0.017
Token the
Token ID262
Feature activation+6.33e-01

Pos logit contributions
 next+0.004
 first+0.004
 two+0.003
 one+0.003
 moral+0.002
Neg logit contributions
 mir-0.005
 Fro-0.005
 dam-0.005
rim-0.005
 polar-0.005
Token lady
Token ID10846
Feature activation-3.50e-01

Pos logit contributions
 next+0.028
 first+0.028
ued+0.025
 closely+0.023
 moment+0.023
Neg logit contributions
 Fro-0.022
 mir-0.022
 spe-0.020
 cob-0.020
 paramed-0.020
Token said
Token ID531
Feature activation+1.59e-01

Pos logit contributions
 Fro+0.017
 Stri+0.016
 mir+0.016
 polar+0.016
 spe+0.016
Neg logit contributions
 next-0.013
 first-0.013
 two-0.010
ued-0.010
 one-0.009
Token says
Token ID1139
Feature activation+7.02e-01

Pos logit contributions
 next+0.015
 first+0.014
 two+0.010
 one+0.010
ued+0.009
Neg logit contributions
 Fro-0.023
 mir-0.022
 Stri-0.022
 spe-0.022
 polar-0.022
Token to
Token ID284
Feature activation+2.26e-01

Pos logit contributions
 next+0.021
 first+0.020
 two+0.013
 one+0.013
ued+0.011
Neg logit contributions
 Fro-0.042
 Stri-0.041
 mir-0.041
 spe-0.041
 polar-0.041
Token Mia
Token ID32189
Feature activation+7.06e-02

Pos logit contributions
 next+0.011
 first+0.011
 two+0.009
 one+0.009
ued+0.008
Neg logit contributions
 Fro-0.009
 mir-0.009
 spe-0.009
 polar-0.009
 Stri-0.009
Token:
Token ID25
Feature activation-2.18e-01

Pos logit contributions
 next+0.003
 first+0.003
ued+0.003
oning+0.002
working+0.002
Neg logit contributions
 Fro-0.003
 Stri-0.003
 polar-0.003
 mir-0.003
 ig-0.003
Token "
Token ID366
Feature activation+1.68e-01

Pos logit contributions
 Fro+0.010
 Stri+0.009
 mir+0.008
 spe+0.008
 polar+0.008
Neg logit contributions
 first-0.009
 next-0.009
ued-0.008
 two-0.007
 one-0.007
TokenLook
Token ID8567
Feature activation+1.87e+00

Pos logit contributions
 first+0.007
 next+0.007
ued+0.006
 two+0.006
 one+0.006
Neg logit contributions
 Fro-0.008
 mir-0.007
 Stri-0.007
 polar-0.007
 spe-0.007
Token,
Token ID11
Feature activation+1.07e+00

Pos logit contributions
 next+0.065
 first+0.063
 two+0.046
 one+0.045
ued+0.043
Neg logit contributions
 Fro-0.101
 mir-0.097
 Stri-0.095
 polar-0.094
 spe-0.093
Token Mia
Token ID32189
Feature activation-1.13e-01

Pos logit contributions
 next+0.051
 first+0.051
 two+0.041
 one+0.041
 moment+0.034
Neg logit contributions
 mir-0.045
icol-0.045
 Fro-0.044
rim-0.044
 spe-0.043
Token.
Token ID13
Feature activation-2.07e-01

Pos logit contributions
 Fro+0.007
 mir+0.007
 spe+0.006
 dam+0.006
 Stri+0.006
Neg logit contributions
 next-0.004
 first-0.003
 two-0.002
 one-0.002
 moment-0.002
Token That
Token ID1320
Feature activation+1.10e-01

Pos logit contributions
 Fro+0.010
 mir+0.010
 Stri+0.009
 spe+0.009
 dam+0.009
Neg logit contributions
 next-0.008
 first-0.008
 two-0.006
 one-0.006
ued-0.006
Token dog
Token ID3290
Feature activation-2.80e-01

Pos logit contributions
 next+0.004
 first+0.004
 one+0.003
 two+0.003
 moral+0.002
Neg logit contributions
 mir-0.006
 Fro-0.006
urg-0.006
 dam-0.006
 spe-0.006
TokenNo
Token ID2949
Feature activation+6.01e-01

Pos logit contributions
 next+0.029
 first+0.029
 two+0.023
 one+0.023
ued+0.022
Neg logit contributions
 Fro-0.027
 mir-0.026
 Stri-0.025
 spe-0.025
 polar-0.024
Token thank
Token ID5875
Feature activation+1.27e+00

Pos logit contributions
 next+0.022
 first+0.021
 two+0.016
 one+0.015
ued+0.014
Neg logit contributions
 Fro-0.032
 mir-0.031
 Stri-0.030
 polar-0.030
 spe-0.030
Token you
Token ID345
Feature activation+3.75e-02

Pos logit contributions
 next+0.052
 first+0.050
 two+0.039
ued+0.038
 one+0.038
Neg logit contributions
 Fro-0.061
 Stri-0.057
 mir-0.055
 spe-0.055
 polar-0.055
Token!"
Token ID2474
Feature activation-2.08e-01

Pos logit contributions
 next+0.001
 first+0.001
 two+0.001
ued+0.001
 one+0.001
Neg logit contributions
 Fro-0.002
 mir-0.002
 polar-0.002
 Stri-0.002
 ig-0.002
Token said
Token ID531
Feature activation+1.07e-01

Pos logit contributions
 Fro+0.010
 Stri+0.009
 polar+0.008
 mir+0.008
 dam+0.008
Neg logit contributions
 next-0.009
 first-0.008
ued-0.006
 two-0.006
 moral-0.006
Token the
Token ID262
Feature activation+1.87e+00

Pos logit contributions
 next+0.005
 first+0.004
 two+0.003
 one+0.003
ued+0.003
Neg logit contributions
 Fro-0.005
 mir-0.005
 Stri-0.005
 polar-0.005
 spe-0.005
Token other
Token ID584
Feature activation+1.74e+00

Pos logit contributions
 next+0.075
ued+0.069
 first+0.068
oned+0.059
 behind+0.057
Neg logit contributions
 Fro-0.077
 mir-0.073
 Stri-0.071
 por-0.067
 polar-0.067
Token human
Token ID1692
Feature activation-6.68e-01

Pos logit contributions
 next+0.084
 first+0.080
ued+0.072
 moment+0.062
 two+0.062
Neg logit contributions
 Fro-0.067
 mir-0.062
 Stri-0.061
 polar-0.060
 ig-0.058
Token.
Token ID13
Feature activation-1.00e-01

Pos logit contributions
 Fro+0.032
 polar+0.030
 ig+0.028
 dam+0.028
 Stri+0.028
Neg logit contributions
 next-0.025
 first-0.024
ued-0.018
 two-0.018
 moral-0.018
Token "
Token ID366
Feature activation-4.02e-01

Pos logit contributions
 Fro+0.004
 Stri+0.004
 dam+0.004
lic+0.004
 polar+0.004
Neg logit contributions
 next-0.005
 first-0.004
ued-0.004
working-0.004
 moral-0.003
TokenI
Token ID40
Feature activation-3.27e-01

Pos logit contributions
 Fro+0.016
 Stri+0.015
 mir+0.015
avis+0.015
Su+0.015
Neg logit contributions
 first-0.018
 next-0.017
ued-0.014
 one-0.014
 two-0.014
Token.
Token ID13
Feature activation-3.89e-01

Pos logit contributions
 mir+0.017
 por+0.017
 ig+0.016
 cho+0.016
 Fro+0.016
Neg logit contributions
 first-0.029
Robert-0.029
 Robert-0.027
Whoever-0.027
 next-0.026
Token
Token ID220
Feature activation-1.03e-01

Pos logit contributions
 Fro+0.019
 Stri+0.018
 mir+0.017
icol+0.016
 coc+0.016
Neg logit contributions
 next-0.016
 first-0.015
ued-0.011
 two-0.011
 one-0.011
Token
Token ID198
Feature activation-3.63e-01

Pos logit contributions
 Fro+0.005
 Stri+0.004
lic+0.004
 mir+0.004
avis+0.004
Neg logit contributions
 next-0.004
 first-0.004
ued-0.003
 one-0.003
 two-0.003
Token
Token ID198
Feature activation-6.60e-01

Pos logit contributions
 Fro+0.016
 Stri+0.015
lic+0.013
 mir+0.013
 coc+0.013
Neg logit contributions
 next-0.016
 first-0.015
ued-0.012
working-0.012
 two-0.012
TokenA
Token ID32
Feature activation-6.89e-01

Pos logit contributions
 Fro+0.033
 mir+0.031
 Stri+0.031
 polar+0.029
 coc+0.029
Neg logit contributions
 next-0.025
 first-0.024
 two-0.018
 one-0.018
ued-0.017
Tokenlic
Token ID677
Feature activation-2.58e+00

Pos logit contributions
 Fro+0.037
ric+0.036
urg+0.036
 Stri+0.036
 mir+0.035
Neg logit contributions
 next-0.025
 first-0.025
 two-0.018
 one-0.018
 moment-0.017
Tokenia
Token ID544
Feature activation+1.13e-01

Pos logit contributions
 Fro+0.108
 mir+0.107
 Stri+0.103
 spe+0.103
 polar+0.102
Neg logit contributions
 next-0.122
 first-0.119
 two-0.096
 one-0.093
ued-0.089
Token noticed
Token ID6810
Feature activation-8.53e-01

Pos logit contributions
 next+0.005
 first+0.005
 two+0.003
 one+0.003
ued+0.003
Neg logit contributions
 Fro-0.005
 mir-0.005
 spe-0.005
 polar-0.005
 Stri-0.005
Token that
Token ID326
Feature activation-2.06e-01

Pos logit contributions
 mir+0.048
 spe+0.046
 Fro+0.046
 polar+0.046
 dam+0.046
Neg logit contributions
 next-0.031
 first-0.029
 two-0.023
 one-0.022
 behind-0.018
Token the
Token ID262
Feature activation+2.12e-01

Pos logit contributions
 Fro+0.010
 mir+0.009
 Stri+0.009
 spe+0.009
 polar+0.009
Neg logit contributions
 next-0.009
 first-0.008
ued-0.006
 two-0.006
 one-0.006
Token cord
Token ID15050
Feature activation+2.20e-01

Pos logit contributions
 next+0.011
 first+0.011
ued+0.009
 one+0.009
 two+0.008
Neg logit contributions
 cob-0.006
 Stri-0.006
 ig-0.006
 Fro-0.006
 por-0.006
Token it
Token ID340
Feature activation+1.89e-01

Pos logit contributions
 Fro+0.012
 spe+0.012
 Stri+0.011
 mir+0.011
 som+0.010
Neg logit contributions
 next-0.017
 first-0.017
ued-0.015
 moment-0.014
 moral-0.014
Token.
Token ID13
Feature activation-5.18e-01

Pos logit contributions
 next+0.007
 first+0.007
ued+0.007
 two+0.006
 moral+0.006
Neg logit contributions
 Fro-0.007
 mir-0.007
 por-0.007
 spe-0.007
 coc-0.007
Token
Token ID198
Feature activation-6.21e-01

Pos logit contributions
 Fro+0.024
 mir+0.023
 Stri+0.022
 polar+0.021
 dam+0.021
Neg logit contributions
 next-0.022
 first-0.021
 two-0.016
 one-0.016
ued-0.016
Token
Token ID198
Feature activation-6.54e-01

Pos logit contributions
 Fro+0.033
 Stri+0.030
 mir+0.030
avis+0.028
 polar+0.028
Neg logit contributions
 next-0.023
 first-0.022
 one-0.016
 two-0.016
ued-0.016
TokenA
Token ID32
Feature activation-6.00e-01

Pos logit contributions
 Fro+0.032
 mir+0.031
 Stri+0.030
 spe+0.029
 dam+0.029
Neg logit contributions
 next-0.025
 first-0.025
 two-0.018
 one-0.018
ued-0.018
Tokenlic
Token ID677
Feature activation-2.49e+00

Pos logit contributions
 ig+0.030
 Fro+0.030
 polar+0.029
 mir+0.029
ric+0.029
Neg logit contributions
 next-0.024
 first-0.023
 one-0.018
 two-0.018
 moment-0.016
Tokenia
Token ID544
Feature activation-2.51e-01

Pos logit contributions
 Fro+0.105
 mir+0.104
 spe+0.101
 Stri+0.101
 polar+0.101
Neg logit contributions
 next-0.117
 first-0.115
 two-0.092
 one-0.089
ued-0.086
Token smiled
Token ID13541
Feature activation+9.56e-02

Pos logit contributions
 Fro+0.012
 mir+0.012
 spe+0.011
 polar+0.011
 Stri+0.011
Neg logit contributions
 next-0.010
 first-0.010
 two-0.007
 one-0.007
ued-0.007
Token while
Token ID981
Feature activation-8.84e-01

Pos logit contributions
 next+0.003
 first+0.003
 one+0.002
 two+0.002
ued+0.001
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token she
Token ID673
Feature activation-4.27e-01

Pos logit contributions
 Fro+0.043
 Stri+0.039
 mir+0.039
 spe+0.037
 polar+0.036
Neg logit contributions
 next-0.035
 first-0.034
ued-0.028
 two-0.026
 one-0.025
Token looked
Token ID3114
Feature activation-5.13e-01

Pos logit contributions
 Fro+0.022
 mir+0.022
 Stri+0.021
 polar+0.020
 dam+0.020
Neg logit contributions
 next-0.016
 first-0.016
 two-0.012
 one-0.012
ued-0.010
Token the
Token ID262
Feature activation-7.45e-01

Pos logit contributions
 Fro+0.015
 mir+0.014
 Stri+0.014
 spe+0.014
 polar+0.014
Neg logit contributions
 next-0.013
 first-0.013
 two-0.010
 one-0.010
ued-0.009
Token necklace
Token ID38987
Feature activation-2.10e-01

Pos logit contributions
 Fro+0.022
 spe+0.022
 mir+0.022
 coc+0.021
 ig+0.020
Neg logit contributions
 next-0.040
 first-0.040
ued-0.035
 two-0.032
 one-0.032
Token and
Token ID290
Feature activation+3.68e-01

Pos logit contributions
 Fro+0.010
 mir+0.010
 spe+0.010
 ig+0.010
 Stri+0.010
Neg logit contributions
 next-0.007
 first-0.007
ued-0.006
 two-0.005
 moral-0.005
Token the
Token ID262
Feature activation-5.31e-01

Pos logit contributions
 next+0.014
 first+0.013
 two+0.010
 one+0.010
ued+0.009
Neg logit contributions
 Fro-0.019
 mir-0.019
 Stri-0.018
 polar-0.018
 spe-0.018
Token magn
Token ID7842
Feature activation-1.29e+00

Pos logit contributions
 Fro+0.018
 mir+0.017
 spe+0.017
 polar+0.017
 Stri+0.016
Neg logit contributions
 next-0.028
 first-0.027
ued-0.023
 two-0.022
 one-0.022
Tokenifying
Token ID4035
Feature activation-2.34e+00

Pos logit contributions
 Fro+0.027
 spe+0.026
 mir+0.025
 Stri+0.024
 polar+0.024
Neg logit contributions
 next-0.086
 first-0.085
 two-0.072
 one-0.072
ued-0.069
Token glass
Token ID5405
Feature activation-8.76e-01

Pos logit contributions
 Stri+0.100
 Fro+0.099
rim+0.098
avis+0.096
 mir+0.095
Neg logit contributions
 next-0.112
 first-0.106
 one-0.101
 two-0.100
 top-0.082
Token and
Token ID290
Feature activation+2.83e-01

Pos logit contributions
 Fro+0.048
 mir+0.046
 Stri+0.045
 polar+0.045
 spe+0.044
Neg logit contributions
 next-0.031
 first-0.030
 two-0.022
 one-0.021
ued-0.019
Token told
Token ID1297
Feature activation+1.49e-01

Pos logit contributions
 next+0.011
 first+0.011
 two+0.008
 one+0.008
ued+0.008
Neg logit contributions
 Fro-0.014
 mir-0.014
 Stri-0.013
 polar-0.013
 spe-0.013
Token him
Token ID683
Feature activation-6.61e-01

Pos logit contributions
 next+0.007
 first+0.006
 two+0.005
 one+0.005
ued+0.005
Neg logit contributions
 Fro-0.007
 mir-0.007
 Stri-0.006
 spe-0.006
 polar-0.006
Token that
Token ID326
Feature activation+7.50e-02

Pos logit contributions
 Fro+0.033
 mir+0.033
 spe+0.032
 polar+0.032
 Stri+0.032
Neg logit contributions
 next-0.025
 first-0.025
 two-0.019
 one-0.019
ued-0.016
Token the
Token ID262
Feature activation-4.70e-01

Pos logit contributions
 Fro+0.024
 mir+0.023
 Stri+0.022
 dam+0.022
 polar+0.022
Neg logit contributions
 next-0.016
 first-0.015
 two-0.012
 one-0.012
ued-0.010
Token necklace
Token ID38987
Feature activation-1.43e-01

Pos logit contributions
 Fro+0.016
 mir+0.015
 spe+0.014
 polar+0.014
 Stri+0.014
Neg logit contributions
 next-0.026
 first-0.025
 two-0.021
 one-0.021
ued-0.020
Token and
Token ID290
Feature activation+2.03e-01

Pos logit contributions
 Fro+0.007
 Stri+0.006
 ig+0.006
 mir+0.006
 spe+0.006
Neg logit contributions
 first-0.005
 next-0.005
ued-0.004
 moral-0.004
 two-0.004
Token the
Token ID262
Feature activation-5.96e-01

Pos logit contributions
 next+0.008
 first+0.008
 two+0.006
 one+0.006
 behind+0.005
Neg logit contributions
 Fro-0.010
 mir-0.010
 polar-0.009
 Stri-0.009
 spe-0.009
Token magn
Token ID7842
Feature activation-1.45e+00

Pos logit contributions
 Fro+0.021
 mir+0.020
 Stri+0.020
 polar+0.019
 spe+0.019
Neg logit contributions
 next-0.032
 first-0.031
 two-0.025
 one-0.025
ued-0.024
Tokenifying
Token ID4035
Feature activation-2.33e+00

Pos logit contributions
 Fro+0.036
 mir+0.033
 spe+0.031
 Stri+0.031
 polar+0.031
Neg logit contributions
 next-0.094
 first-0.092
 two-0.078
 one-0.077
ued-0.075
Token glass
Token ID5405
Feature activation-6.49e-01

Pos logit contributions
 Stri+0.101
rim+0.101
avis+0.099
 Fro+0.096
 mir+0.096
Neg logit contributions
 next-0.109
 first-0.104
 one-0.097
 two-0.097
 top-0.083
Token and
Token ID290
Feature activation+3.22e-01

Pos logit contributions
 Fro+0.036
 mir+0.035
 Stri+0.034
 polar+0.034
 spe+0.033
Neg logit contributions
 next-0.022
 first-0.021
 two-0.015
 one-0.015
ued-0.014
Token started
Token ID2067
Feature activation-3.95e-01

Pos logit contributions
 next+0.013
 first+0.013
 two+0.010
 one+0.010
ued+0.009
Neg logit contributions
 Fro-0.016
 mir-0.015
 polar-0.015
 Stri-0.015
 spe-0.014
Token to
Token ID284
Feature activation-6.49e-01

Pos logit contributions
 mir+0.021
 Fro+0.021
rim+0.021
 polar+0.021
avis+0.020
Neg logit contributions
 next-0.015
 first-0.013
 one-0.011
 two-0.010
 behind-0.008
Token examine
Token ID10716
Feature activation-2.18e-02

Pos logit contributions
 Fro+0.031
 mir+0.031
lic+0.030
 Stri+0.030
rim+0.029
Neg logit contributions
 next-0.027
 first-0.026
 one-0.021
 two-0.021
 moment-0.017
Token?"
Token ID1701
Feature activation-4.55e-02

Pos logit contributions
 next+0.013
 first+0.012
ued+0.011
 two+0.010
 moral+0.010
Neg logit contributions
 mir-0.012
 spe-0.011
 Fro-0.011
 Stri-0.011
 coc-0.011
Token
Token ID220
Feature activation-6.60e-02

Pos logit contributions
 mir+0.003
 spe+0.003
 ig+0.003
urg+0.003
 polar+0.003
Neg logit contributions
 next-0.001
 first-0.001
 one-0.001
 two-0.001
ued-0.001
Token
Token ID198
Feature activation+7.09e-02

Pos logit contributions
 Fro+0.004
 Stri+0.003
 mir+0.003
 polar+0.003
 dam+0.003
Neg logit contributions
 next-0.002
 first-0.002
 one-0.002
 two-0.002
ued-0.002
Token
Token ID198
Feature activation+1.60e-01

Pos logit contributions
 next+0.002
 first+0.002
 two+0.001
 one+0.001
ued+0.001
Neg logit contributions
 mir-0.004
 Fro-0.004
 ig-0.004
 spe-0.004
 polar-0.004
TokenD
Token ID35
Feature activation-1.03e-01

Pos logit contributions
 next+0.007
 first+0.007
 two+0.006
 one+0.005
ued+0.005
Neg logit contributions
 mir-0.007
 ig-0.007
 Fro-0.007
 polar-0.007
urg-0.007
Tokenott
Token ID1252
Feature activation-2.31e+00

Pos logit contributions
 Fro+0.002
 mir+0.002
 spe+0.002
 polar+0.002
 Stri+0.002
Neg logit contributions
 next-0.007
 first-0.007
 two-0.006
 one-0.006
ued-0.006
Tokenie
Token ID494
Feature activation+5.37e-01

Pos logit contributions
 Fro+0.068
 mir+0.063
 Stri+0.061
 polar+0.061
 ig+0.059
Neg logit contributions
 next-0.135
 first-0.133
 two-0.112
 one-0.111
ued-0.108
Token per
Token ID583
Feature activation+5.73e-02

Pos logit contributions
 next+0.022
 first+0.022
 two+0.016
 one+0.016
ued+0.016
Neg logit contributions
 Fro-0.024
 Stri-0.022
 mir-0.022
 polar-0.022
 spe-0.021
Tokenked
Token ID9091
Feature activation+7.92e-01

Pos logit contributions
 next+0.003
 first+0.002
 two+0.002
 one+0.002
ued+0.002
Neg logit contributions
 mir-0.003
 Fro-0.003
 ig-0.002
 polar-0.002
rim-0.002
Token up
Token ID510
Feature activation+5.37e-01

Pos logit contributions
 next+0.029
 first+0.028
 two+0.021
ued+0.020
 one+0.019
Neg logit contributions
 Fro-0.041
 mir-0.038
 Stri-0.038
 polar-0.038
 spe-0.038
Token and
Token ID290
Feature activation+6.20e-01

Pos logit contributions
 next+0.018
 first+0.018
ued+0.013
 two+0.013
 one+0.012
Neg logit contributions
 Fro-0.029
 spe-0.027
 mir-0.027
 polar-0.027
 Stri-0.026
Token and
Token ID290
Feature activation-1.55e-02

Pos logit contributions
 next+0.007
 first+0.006
 moral+0.006
ued+0.006
 two+0.005
Neg logit contributions
 Fro-0.007
 spe-0.006
 Stri-0.006
 mir-0.006
 ig-0.006
Token then
Token ID788
Feature activation+8.67e-02

Pos logit contributions
 Fro+0.001
 Stri+0.001
 fro+0.001
 spe+0.001
lic+0.001
Neg logit contributions
 next-0.001
 first-0.001
 moral-0.001
ued-0.001
 two-0.000
Token she
Token ID673
Feature activation-4.32e-01

Pos logit contributions
 next+0.003
 first+0.003
ued+0.002
 two+0.002
 moral+0.002
Neg logit contributions
 Fro-0.004
 mir-0.004
 Stri-0.004
 spe-0.004
 polar-0.004
Token had
Token ID550
Feature activation-1.86e-01

Pos logit contributions
 Fro+0.021
 mir+0.020
 Stri+0.020
 polar+0.019
 spe+0.019
Neg logit contributions
 next-0.018
 first-0.017
 two-0.013
 one-0.013
ued-0.012
Token a
Token ID257
Feature activation+6.31e-01

Pos logit contributions
 Fro+0.010
 mir+0.010
 polar+0.010
 Stri+0.010
 spe+0.010
Neg logit contributions
 next-0.006
 first-0.006
 two-0.005
 one-0.005
ued-0.004
Token scr
Token ID6040
Feature activation-2.31e+00

Pos logit contributions
 first+0.026
 next+0.025
 one+0.020
 two+0.020
 moment+0.018
Neg logit contributions
 Fro-0.030
 Stri-0.030
rim-0.029
 mir-0.029
avis-0.028
Tokenum
Token ID388
Feature activation-1.32e-01

Pos logit contributions
 Fro+0.094
 Stri+0.093
 polar+0.089
 mir+0.087
 spe+0.087
Neg logit contributions
 next-0.106
 first-0.105
ued-0.083
 two-0.082
 one-0.080
Tokenpt
Token ID457
Feature activation-8.03e-01

Pos logit contributions
 Fro+0.006
 mir+0.006
 Stri+0.005
 spe+0.005
 dam+0.005
Neg logit contributions
 next-0.006
 first-0.006
 two-0.005
 one-0.005
ued-0.004
Tokenious
Token ID699
Feature activation+3.10e-02

Pos logit contributions
 Fro+0.034
 mir+0.032
 Stri+0.032
ric+0.031
rim+0.031
Neg logit contributions
 next-0.040
 first-0.038
 one-0.030
 two-0.030
 behind-0.027
Token snack
Token ID26906
Feature activation-3.09e-01

Pos logit contributions
 next+0.002
 first+0.002
 two+0.001
 one+0.001
ued+0.001
Neg logit contributions
 Fro-0.001
 mir-0.001
 polar-0.001
 spe-0.001
 ig-0.001
Token.
Token ID13
Feature activation+1.46e-01

Pos logit contributions
 Fro+0.016
 mir+0.016
 Stri+0.015
 polar+0.015
 spe+0.015
Neg logit contributions
 next-0.011
 first-0.011
 two-0.008
 one-0.008
ued-0.007
Token After
Token ID2293
Feature activation+2.87e-01

Pos logit contributions
 Fro+0.006
 mir+0.006
 Stri+0.005
 polar+0.005
 spe+0.005
Neg logit contributions
 next-0.005
 first-0.005
 two-0.003
 one-0.003
ued-0.003
Token a
Token ID257
Feature activation-7.12e-01

Pos logit contributions
 next+0.010
 first+0.010
 two+0.007
ued+0.007
 one+0.007
Neg logit contributions
 Fro-0.015
 mir-0.015
 Stri-0.015
 polar-0.014
 spe-0.014
Token while
Token ID981
Feature activation-7.21e-01

Pos logit contributions
 Fro+0.036
 mir+0.035
 spe+0.034
 polar+0.034
 Stri+0.033
Neg logit contributions
 next-0.027
 first-0.026
 two-0.020
 one-0.018
ued-0.018
Token,
Token ID11
Feature activation+1.43e-01

Pos logit contributions
 Fro+0.038
 mir+0.037
 Stri+0.036
 polar+0.035
 spe+0.035
Neg logit contributions
 next-0.026
 first-0.025
 two-0.019
 one-0.018
ued-0.018
Token D
Token ID360
Feature activation+2.43e-01

Pos logit contributions
 next+0.005
 first+0.005
 one+0.004
 two+0.004
ued+0.003
Neg logit contributions
 mir-0.008
 Fro-0.007
 ig-0.007
 polar-0.007
rim-0.007
Tokenott
Token ID1252
Feature activation-2.28e+00

Pos logit contributions
 next+0.017
 first+0.016
 two+0.014
 one+0.014
 moment+0.013
Neg logit contributions
 mir-0.005
 spe-0.005
 Fro-0.005
 polar-0.005
 Stri-0.005
Tokenie
Token ID494
Feature activation+8.01e-02

Pos logit contributions
 Fro+0.073
 mir+0.069
 Stri+0.066
 polar+0.066
 ig+0.064
Neg logit contributions
 next-0.128
 first-0.126
 two-0.105
 one-0.104
ued-0.101
Token was
Token ID373
Feature activation-2.21e-01

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
 one+0.002
ued+0.002
Neg logit contributions
 Fro-0.004
 mir-0.004
 Stri-0.004
 polar-0.004
 spe-0.004
Token no
Token ID645
Feature activation+4.09e-01

Pos logit contributions
 Fro+0.010
 mir+0.010
 polar+0.010
 dam+0.009
 spe+0.009
Neg logit contributions
 next-0.009
 first-0.009
ued-0.007
 one-0.006
 two-0.006
Token longer
Token ID2392
Feature activation+4.53e-01

Pos logit contributions
 next+0.021
 first+0.020
 two+0.016
ued+0.016
 one+0.016
Neg logit contributions
 Fro-0.016
 mir-0.015
 polar-0.014
 spe-0.014
 Stri-0.014
Token scared
Token ID12008
Feature activation-2.19e-01

Pos logit contributions
 next+0.019
 first+0.018
 two+0.014
 one+0.014
ued+0.013
Neg logit contributions
 Fro-0.022
 mir-0.021
 Stri-0.020
 polar-0.020
 spe-0.020
Token ball
Token ID2613
Feature activation-1.39e-01

Pos logit contributions
 next+0.010
 first+0.010
 two+0.007
 one+0.007
ued+0.007
Neg logit contributions
 Fro-0.011
 mir-0.011
 Stri-0.011
 polar-0.010
 spe-0.010
Token?"
Token ID1701
Feature activation-2.38e-02

Pos logit contributions
 Fro+0.007
 mir+0.006
 Stri+0.006
 polar+0.006
 spe+0.006
Neg logit contributions
 next-0.006
 first-0.005
 two-0.004
ued-0.004
 one-0.004
Token\n
Token ID198
Feature activation+2.84e-01

Pos logit contributions
 Fro+0.001
 mir+0.001
 Stri+0.001
 polar+0.001
 spe+0.001
Neg logit contributions
 next-0.001
 first-0.001
 two-0.001
 one-0.001
ued-0.000
Token\n
Token ID198
Feature activation+3.47e-01

Pos logit contributions
 next+0.009
 first+0.009
 two+0.006
 one+0.006
ued+0.005
Neg logit contributions
 Fro-0.016
 mir-0.016
 Stri-0.015
 polar-0.015
 spe-0.015
TokenA
Token ID32
Feature activation-4.08e-01

Pos logit contributions
 next+0.014
 first+0.013
 two+0.010
 one+0.010
ued+0.009
Neg logit contributions
 Fro-0.017
 mir-0.017
 Stri-0.016
 polar-0.016
 spe-0.016
Tokenlic
Token ID677
Feature activation-2.25e+00

Pos logit contributions
 Fro+0.020
 mir+0.020
 Stri+0.019
 spe+0.019
 dam+0.019
Neg logit contributions
 next-0.016
 first-0.016
 two-0.012
 one-0.012
 moment-0.011
Tokenia
Token ID544
Feature activation-3.04e-01

Pos logit contributions
 Fro+0.098
 mir+0.094
 Stri+0.091
 polar+0.090
 spe+0.090
Neg logit contributions
 next-0.102
 first-0.099
 two-0.079
 one-0.078
ued-0.074
Token nodded
Token ID14464
Feature activation-7.07e-01

Pos logit contributions
 Fro+0.014
 polar+0.013
 mir+0.013
 Stri+0.013
 spe+0.013
Neg logit contributions
 next-0.013
 first-0.013
 two-0.010
ued-0.009
 one-0.009
Token and
Token ID290
Feature activation+2.91e-01

Pos logit contributions
 Fro+0.039
 mir+0.038
 polar+0.037
 spe+0.036
 Stri+0.036
Neg logit contributions
 next-0.023
 first-0.023
 two-0.016
 one-0.016
ued-0.015
Token said
Token ID531
Feature activation-5.54e-01

Pos logit contributions
 next+0.011
 first+0.011
 two+0.008
ued+0.008
 one+0.008
Neg logit contributions
 Fro-0.015
 Stri-0.014
 mir-0.014
 polar-0.014
 spe-0.014
Token,
Token ID11
Feature activation-7.19e-02

Pos logit contributions
 Fro+0.029
 mir+0.028
 Stri+0.027
 spe+0.027
 polar+0.027
Neg logit contributions
 next-0.020
 first-0.020
 two-0.014
 one-0.014
ued-0.013
TokenSure
Token ID19457
Feature activation+8.81e-01

Pos logit contributions
 Fro+0.025
 mir+0.024
 polar+0.023
 Stri+0.023
 spe+0.023
Neg logit contributions
 next-0.020
 first-0.019
 two-0.015
 one-0.015
ued-0.014
Token,
Token ID11
Feature activation+4.98e-02

Pos logit contributions
 first+0.025
 next+0.025
 one+0.018
 two+0.015
 minute+0.012
Neg logit contributions
 mir-0.055
 Fro-0.055
 Stri-0.055
 polar-0.054
ric-0.054
Token Lily
Token ID20037
Feature activation-8.11e-01

Pos logit contributions
 next+0.002
 first+0.002
 two+0.001
 one+0.001
 moment+0.001
Neg logit contributions
 Fro-0.003
 mir-0.003
 polar-0.002
 Stri-0.002
 spe-0.002
Token.
Token ID13
Feature activation-7.18e-01

Pos logit contributions
 Fro+0.044
 mir+0.043
 polar+0.041
 Stri+0.041
 spe+0.041
Neg logit contributions
 next-0.029
 first-0.028
 two-0.020
 one-0.020
ued-0.018
Token What
Token ID1867
Feature activation+2.69e-01

Pos logit contributions
 Fro+0.036
 mir+0.035
 polar+0.034
 spe+0.034
 Stri+0.033
Neg logit contributions
 next-0.028
 first-0.027
 two-0.021
 one-0.020
ued-0.018
Token kind
Token ID1611
Feature activation-2.24e+00

Pos logit contributions
 next+0.012
 first+0.011
 two+0.009
 one+0.009
ued+0.008
Neg logit contributions
 Fro-0.012
 mir-0.012
 Stri-0.011
 polar-0.011
 spe-0.011
Token of
Token ID286
Feature activation-1.12e+00

Pos logit contributions
 Fro+0.097
 mir+0.093
 Stri+0.090
 polar+0.089
 spe+0.089
Neg logit contributions
 next-0.104
 first-0.101
 two-0.081
 one-0.080
ued-0.075
Token pizza
Token ID14256
Feature activation+5.66e-02

Pos logit contributions
 Fro+0.041
 mir+0.039
 polar+0.039
 ig+0.039
 spe+0.038
Neg logit contributions
 next-0.054
 first-0.053
ued-0.046
 two-0.042
 one-0.041
Token do
Token ID466
Feature activation-3.50e-01

Pos logit contributions
 next+0.003
 first+0.002
ued+0.002
 two+0.002
 one+0.002
Neg logit contributions
 Fro-0.002
 mir-0.002
 Stri-0.002
 dam-0.002
 polar-0.002
Token you
Token ID345
Feature activation+4.20e-01

Pos logit contributions
 Fro+0.015
 mir+0.014
 Stri+0.014
 dam+0.013
rim+0.013
Neg logit contributions
 next-0.014
 first-0.014
ued-0.012
 two-0.012
 one-0.011
Token want
Token ID765
Feature activation+3.17e-01

Pos logit contributions
 next+0.019
 first+0.018
ued+0.016
 two+0.015
 moment+0.015
Neg logit contributions
 Fro-0.016
 mir-0.016
 dam-0.015
 polar-0.015
 Stri-0.015
Token find
Token ID1064
Feature activation-2.19e-01

Pos logit contributions
 Fro+0.034
 mir+0.033
 spe+0.032
 polar+0.032
 Stri+0.032
Neg logit contributions
 next-0.029
 first-0.029
 two-0.022
 one-0.022
ued-0.019
Token new
Token ID649
Feature activation+8.72e-01

Pos logit contributions
 Fro+0.011
 mir+0.011
 Stri+0.010
 polar+0.010
 spe+0.010
Neg logit contributions
 next-0.008
 first-0.008
 two-0.006
ued-0.006
 one-0.006
Token videos
Token ID5861
Feature activation-7.14e-02

Pos logit contributions
 next+0.039
 first+0.038
 two+0.030
 one+0.029
ued+0.028
Neg logit contributions
 Fro-0.039
 mir-0.038
 Stri-0.036
 polar-0.036
 spe-0.036
Token.
Token ID13
Feature activation-4.90e-01

Pos logit contributions
 Fro+0.003
 Stri+0.003
 spe+0.003
 pic+0.003
lic+0.003
Neg logit contributions
 next-0.003
 first-0.003
 moral-0.002
ued-0.002
 moment-0.002
Token What
Token ID1867
Feature activation+1.38e-01

Pos logit contributions
 Fro+0.025
 mir+0.025
rim+0.024
 polar+0.024
 spe+0.024
Neg logit contributions
 next-0.019
 first-0.018
 two-0.014
 one-0.014
ued-0.011
Token kind
Token ID1611
Feature activation-2.24e+00

Pos logit contributions
 next+0.006
 first+0.006
ued+0.005
 two+0.004
 one+0.004
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 spe-0.006
 ig-0.005
Token of
Token ID286
Feature activation-1.11e+00

Pos logit contributions
 Fro+0.098
 mir+0.095
 Stri+0.091
 polar+0.091
 spe+0.091
Neg logit contributions
 next-0.103
 first-0.100
 two-0.079
 one-0.078
ued-0.073
Token videos
Token ID5861
Feature activation-4.55e-01

Pos logit contributions
 Fro+0.042
 mir+0.040
 Stri+0.039
 polar+0.038
 spe+0.038
Neg logit contributions
 next-0.057
 first-0.056
 two-0.045
 one-0.045
ued-0.043
Token do
Token ID466
Feature activation-4.57e-01

Pos logit contributions
 Fro+0.015
 mir+0.015
lic+0.015
 Stri+0.015
 spe+0.015
Neg logit contributions
 next-0.021
 first-0.020
 two-0.017
ued-0.017
 one-0.017
Token you
Token ID345
Feature activation+7.98e-01

Pos logit contributions
 Fro+0.019
 mir+0.019
 Stri+0.018
 pic+0.017
 dam+0.017
Neg logit contributions
 next-0.018
 first-0.018
ued-0.016
 two-0.015
ging-0.014
Token want
Token ID765
Feature activation+3.60e-01

Pos logit contributions
 next+0.034
 first+0.031
ued+0.027
 two+0.025
 one+0.025
Neg logit contributions
 Fro-0.035
 mir-0.034
 Stri-0.033
 polar-0.032
 spe-0.032
TokenL
Token ID43
Feature activation-8.87e-01

Pos logit contributions
 next+0.001
 first+0.001
 two+0.001
 one+0.001
ued+0.001
Neg logit contributions
 Fro-0.001
 mir-0.001
 Stri-0.001
 polar-0.001
 spe-0.001
Tokenily
Token ID813
Feature activation-1.95e-01

Pos logit contributions
 Fro+0.047
 mir+0.045
 Stri+0.044
 polar+0.043
 ig+0.043
Neg logit contributions
 next-0.031
 first-0.030
 one-0.021
 two-0.021
ued-0.021
Token replied
Token ID8712
Feature activation-3.73e-01

Pos logit contributions
 Fro+0.008
 Stri+0.007
avis+0.006
 por+0.006
 mir+0.006
Neg logit contributions
 next-0.009
 first-0.009
ued-0.007
oning-0.007
 moment-0.007
Token,
Token ID11
Feature activation-3.64e-01

Pos logit contributions
 Fro+0.015
 cho+0.013
 spe+0.013
 por+0.012
 mir+0.012
Neg logit contributions
 first-0.016
 next-0.016
ued-0.013
Whoever-0.013
 two-0.013
Token "
Token ID366
Feature activation-6.04e-01

Pos logit contributions
 Fro+0.013
 Stri+0.012
 Chick+0.011
 spe+0.011
 cho+0.011
Neg logit contributions
 first-0.017
 next-0.016
ued-0.015
 two-0.014
working-0.014
TokenThere
Token ID1858
Feature activation-2.21e+00

Pos logit contributions
 Fro+0.034
 mir+0.034
 polar+0.034
 dam+0.033
urg+0.033
Neg logit contributions
 next-0.022
 first-0.019
 two-0.014
 one-0.014
 moral-0.012
Token's
Token ID338
Feature activation-2.05e-02

Pos logit contributions
 Fro+0.103
 mir+0.098
 ig+0.095
 spe+0.095
 Stri+0.094
Neg logit contributions
 next-0.086
 first-0.083
ued-0.065
 two-0.063
 one-0.062
Token a
Token ID257
Feature activation+7.39e-02

Pos logit contributions
 Fro+0.001
 mir+0.001
 Stri+0.001
 polar+0.001
 spe+0.001
Neg logit contributions
 next-0.001
 first-0.001
 two-0.001
 one-0.000
ued-0.000
Token scary
Token ID14343
Feature activation+3.06e-01

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
 one+0.002
 moment+0.002
Neg logit contributions
 Fro-0.004
 Stri-0.004
 mir-0.004
rim-0.004
avis-0.004
Token pit
Token ID6028
Feature activation-3.50e-01

Pos logit contributions
 next+0.017
 first+0.016
 two+0.013
 one+0.013
ued+0.013
Neg logit contributions
 Fro-0.011
 mir-0.010
 polar-0.010
 spe-0.009
 ig-0.009
Token over
Token ID625
Feature activation-6.01e-01

Pos logit contributions
 Fro+0.018
 mir+0.018
 polar+0.018
 ig+0.018
 spe+0.018
Neg logit contributions
 next-0.012
 first-0.011
 two-0.009
 one-0.008
ued-0.008
Token and
Token ID290
Feature activation-8.74e-02

Pos logit contributions
 next+0.001
 first+0.001
ued+0.001
 two+0.001
 moral+0.001
Neg logit contributions
 Fro-0.002
 mir-0.001
 Stri-0.001
 polar-0.001
 ig-0.001
Token asked
Token ID1965
Feature activation-4.74e-01

Pos logit contributions
 Fro+0.005
 mir+0.004
 Stri+0.004
 polar+0.004
 spe+0.004
Neg logit contributions
 next-0.003
 first-0.003
 two-0.002
 one-0.002
ued-0.002
Token,
Token ID11
Feature activation-8.52e-01

Pos logit contributions
 Fro+0.020
 Stri+0.019
 ig+0.018
 polar+0.018
 mir+0.017
Neg logit contributions
 next-0.020
 first-0.019
ued-0.016
 moral-0.015
 two-0.015
Token "
Token ID366
Feature activation+1.48e-02

Pos logit contributions
 Fro+0.034
 Stri+0.032
lic+0.028
 polar+0.028
 Chick+0.027
Neg logit contributions
 next-0.039
 first-0.037
ued-0.035
 moral-0.031
working-0.030
TokenWhat
Token ID2061
Feature activation+1.88e-01

Pos logit contributions
 next+0.001
 first+0.001
ued+0.001
 two+0.000
 one+0.000
Neg logit contributions
 Fro-0.001
 Stri-0.001
 mir-0.001
 spe-0.001
 polar-0.001
Token kind
Token ID1611
Feature activation-2.19e+00

Pos logit contributions
 next+0.009
 first+0.008
ued+0.007
 two+0.006
 one+0.006
Neg logit contributions
 Fro-0.008
 mir-0.007
 Stri-0.007
 spe-0.007
 ig-0.007
Token of
Token ID286
Feature activation-1.39e+00

Pos logit contributions
 Fro+0.100
 mir+0.097
 Stri+0.093
 polar+0.092
 spe+0.092
Neg logit contributions
 next-0.099
 first-0.096
 two-0.076
 one-0.075
ued-0.069
Token game
Token ID983
Feature activation+6.55e-02

Pos logit contributions
 Fro+0.055
 mir+0.052
 Stri+0.051
 spe+0.051
 polar+0.050
Neg logit contributions
 next-0.067
 first-0.065
 two-0.052
ued-0.051
 one-0.050
Token?".
Token ID43634
Feature activation-8.04e-01

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
 one+0.002
ued+0.002
Neg logit contributions
 Fro-0.003
 mir-0.003
 Stri-0.003
 polar-0.003
 spe-0.003
Token
Token ID220
Feature activation-2.52e-01

Pos logit contributions
 Fro+0.029
 Stri+0.027
 mir+0.027
 polar+0.026
 spe+0.026
Neg logit contributions
 next-0.042
 first-0.041
 two-0.033
 one-0.033
ued-0.032
Token\n
Token ID198
Feature activation-4.08e-01

Pos logit contributions
 Fro+0.012
 Stri+0.011
 Chick+0.010
avis+0.010
lic+0.010
Neg logit contributions
 next-0.010
 first-0.010
ued-0.008
 one-0.008
 two-0.007
Token to
Token ID284
Feature activation-7.11e-01

Pos logit contributions
 next+0.021
 first+0.020
ued+0.015
 two+0.015
 one+0.014
Neg logit contributions
 Fro-0.027
 mir-0.026
 Stri-0.025
 spe-0.025
 polar-0.025
Token the
Token ID262
Feature activation-3.48e-01

Pos logit contributions
 Fro+0.036
 mir+0.035
 Stri+0.034
 polar+0.034
 spe+0.034
Neg logit contributions
 next-0.027
 first-0.027
 two-0.020
 one-0.020
ued-0.018
Token lamp
Token ID20450
Feature activation-5.47e-01

Pos logit contributions
 Fro+0.014
 mir+0.013
 Stri+0.013
 polar+0.013
 spe+0.013
Neg logit contributions
 next-0.017
 first-0.017
 two-0.014
 one-0.013
ued-0.013
Token.
Token ID13
Feature activation-7.82e-02

Pos logit contributions
 Fro+0.029
 mir+0.028
 spe+0.028
 Stri+0.028
 polar+0.027
Neg logit contributions
 next-0.018
 first-0.018
 two-0.013
ued-0.013
 one-0.013
Token "
Token ID366
Feature activation-8.95e-01

Pos logit contributions
 Fro+0.004
 mir+0.004
 Stri+0.004
 spe+0.004
 polar+0.004
Neg logit contributions
 next-0.003
 first-0.003
 two-0.002
ued-0.002
 one-0.002
TokenThere
Token ID1858
Feature activation-2.18e+00

Pos logit contributions
 Fro+0.037
 Stri+0.036
 mir+0.035
 spe+0.035
 polar+0.034
Neg logit contributions
 first-0.041
 next-0.040
ued-0.033
 one-0.031
 two-0.030
Token is
Token ID318
Feature activation-1.35e-01

Pos logit contributions
 Fro+0.093
 mir+0.088
 spe+0.085
 ig+0.084
 Stri+0.083
Neg logit contributions
 next-0.092
 first-0.087
ued-0.073
 moral-0.069
 two-0.068
Token a
Token ID257
Feature activation-7.66e-01

Pos logit contributions
 Fro+0.007
 mir+0.006
 Stri+0.006
 ig+0.006
 spe+0.006
Neg logit contributions
 next-0.005
 first-0.005
ued-0.004
 moment-0.004
 two-0.003
Token gir
Token ID37370
Feature activation+1.15e-01

Pos logit contributions
 Fro+0.032
 mir+0.030
 Stri+0.030
rim+0.029
urg+0.028
Neg logit contributions
 next-0.037
 first-0.036
 two-0.029
 one-0.029
 moment-0.026
Tokenaffe
Token ID21223
Feature activation-5.67e-01

Pos logit contributions
 next+0.006
 first+0.005
 two+0.004
 one+0.004
ued+0.004
Neg logit contributions
 Fro-0.005
 mir-0.005
 Stri-0.004
 polar-0.004
 spe-0.004
Token.
Token ID13
Feature activation-5.20e-01

Pos logit contributions
 Fro+0.027
 mir+0.027
 spe+0.026
 polar+0.026
 Stri+0.026
Neg logit contributions
 next-0.022
 first-0.022
 two-0.016
ued-0.016
 one-0.016
Token at
Token ID379
Feature activation-1.64e-01

Pos logit contributions
 next+0.030
 first+0.029
ued+0.024
 two+0.023
 moral+0.022
Neg logit contributions
 Fro-0.029
 mir-0.027
 Stri-0.027
 spe-0.026
 polar-0.026
Token the
Token ID262
Feature activation-2.07e-01

Pos logit contributions
 Fro+0.008
 mir+0.008
 spe+0.007
 polar+0.007
 dam+0.007
Neg logit contributions
 next-0.007
 first-0.007
 two-0.005
 one-0.005
ued-0.004
Token shelf
Token ID18316
Feature activation-1.19e+00

Pos logit contributions
 Fro+0.008
 mir+0.007
 spe+0.007
 Stri+0.007
 polar+0.007
Neg logit contributions
 next-0.011
 first-0.010
 two-0.008
 one-0.008
ued-0.008
Token.
Token ID13
Feature activation+1.51e-01

Pos logit contributions
 Fro+0.046
 ig+0.045
 mir+0.045
 spe+0.043
pour+0.043
Neg logit contributions
 next-0.042
ued-0.041
 first-0.040
oning-0.038
 moment-0.037
Token "
Token ID366
Feature activation-1.08e-01

Pos logit contributions
 next+0.006
 first+0.006
 two+0.004
 one+0.004
ued+0.004
Neg logit contributions
 Fro-0.008
 mir-0.008
 polar-0.007
 spe-0.007
 Stri-0.007
TokenThere
Token ID1858
Feature activation-2.18e+00

Pos logit contributions
 Fro+0.005
 mir+0.005
 Stri+0.004
 spe+0.004
 polar+0.004
Neg logit contributions
 next-0.005
 first-0.005
 two-0.004
 one-0.004
ued-0.004
Token,
Token ID11
Feature activation-1.23e-01

Pos logit contributions
 Fro+0.097
 mir+0.092
 Stri+0.089
 ig+0.088
 spe+0.087
Neg logit contributions
 next-0.090
 first-0.086
ued-0.070
 two-0.067
 one-0.066
Token on
Token ID319
Feature activation-9.87e-02

Pos logit contributions
 Fro+0.006
 mir+0.006
 Stri+0.006
 ig+0.006
 polar+0.006
Neg logit contributions
 next-0.004
 first-0.004
ued-0.003
 one-0.003
 two-0.003
Token the
Token ID262
Feature activation-2.26e-02

Pos logit contributions
 Fro+0.005
 mir+0.004
 Stri+0.004
 polar+0.004
 spe+0.004
Neg logit contributions
 next-0.004
 first-0.004
 two-0.003
 one-0.003
ued-0.003
Token shelf
Token ID18316
Feature activation-1.41e+00

Pos logit contributions
 Fro+0.001
 mir+0.001
 Stri+0.001
 polar+0.001
 spe+0.001
Neg logit contributions
 next-0.001
 first-0.001
 two-0.001
ued-0.001
 one-0.001
Token.
Token ID13
Feature activation-1.33e-01

Pos logit contributions
 Fro+0.053
 mir+0.053
 ig+0.053
 Stri+0.051
 spe+0.051
Neg logit contributions
 next-0.058
 first-0.054
ued-0.050
 two-0.047
 moment-0.047
Token!"
Token ID2474
Feature activation-7.20e-01

Pos logit contributions
 mir+0.012
 Fro+0.011
 Stri+0.011
rim+0.011
 dam+0.011
Neg logit contributions
 next-0.008
 first-0.007
 two-0.006
 one-0.006
 moment-0.004
Token she
Token ID673
Feature activation-2.05e-01

Pos logit contributions
 mir+0.036
ric+0.034
 Fro+0.034
 spe+0.033
 dam+0.033
Neg logit contributions
 next-0.030
 first-0.029
 one-0.024
 two-0.024
 moment-0.020
Token said
Token ID531
Feature activation-3.65e-01

Pos logit contributions
 mir+0.013
ric+0.013
 ig+0.013
 spe+0.013
 dam+0.013
Neg logit contributions
 first-0.005
 next-0.005
 two-0.004
 one-0.003
 eager-0.002
Token.
Token ID13
Feature activation-5.86e-01

Pos logit contributions
 Fro+0.020
 mir+0.019
 Stri+0.019
 polar+0.019
 spe+0.019
Neg logit contributions
 next-0.013
 first-0.012
 two-0.009
 one-0.009
ued-0.008
Token "
Token ID366
Feature activation-8.32e-01

Pos logit contributions
 Fro+0.030
 mir+0.028
 Stri+0.028
 polar+0.027
 spe+0.027
Neg logit contributions
 next-0.023
 first-0.022
 two-0.017
 one-0.016
ued-0.016
TokenThere
Token ID1858
Feature activation-2.17e+00

Pos logit contributions
 Fro+0.035
 mir+0.033
 Stri+0.033
 spe+0.032
 polar+0.032
Neg logit contributions
 first-0.037
 next-0.036
ued-0.030
 two-0.029
 one-0.028
Token's
Token ID338
Feature activation-4.67e-01

Pos logit contributions
 Fro+0.106
 mir+0.102
 spe+0.099
 Stri+0.098
 ig+0.097
Neg logit contributions
 next-0.082
 first-0.079
ued-0.060
 two-0.059
 one-0.057
Token a
Token ID257
Feature activation-4.00e-01

Pos logit contributions
 Fro+0.026
 mir+0.025
 Stri+0.025
 polar+0.025
 spe+0.024
Neg logit contributions
 next-0.016
 first-0.015
 two-0.011
 one-0.011
ued-0.009
Token butterfly
Token ID35113
Feature activation-6.75e-01

Pos logit contributions
 Fro+0.020
 Stri+0.019
 mir+0.019
rim+0.018
avis+0.018
Neg logit contributions
 next-0.016
 first-0.016
 two-0.012
 one-0.011
 moment-0.010
Token on
Token ID319
Feature activation-1.06e+00

Pos logit contributions
 Fro+0.034
 mir+0.033
 Stri+0.032
 polar+0.032
 spe+0.032
Neg logit contributions
 next-0.027
 first-0.026
 one-0.019
 two-0.019
ued-0.018
Token that
Token ID326
Feature activation-6.63e-01

Pos logit contributions
 mir+0.058
 Fro+0.058
 polar+0.055
rim+0.055
avis+0.055
Neg logit contributions
 next-0.039
 first-0.038
 two-0.029
 one-0.029
 behind-0.022
Token make
Token ID787
Feature activation+1.93e-01

Pos logit contributions
 next+0.013
 first+0.012
 two+0.009
 one+0.009
ued+0.009
Neg logit contributions
 Fro-0.016
 mir-0.016
 Stri-0.015
 polar-0.015
 spe-0.015
Token bra
Token ID8290
Feature activation+2.54e-01

Pos logit contributions
 next+0.007
 first+0.007
 one+0.006
 two+0.006
 up+0.004
Neg logit contributions
 mir-0.010
lic-0.010
 spe-0.010
 Fro-0.010
rim-0.010
Tokenids
Token ID2340
Feature activation-2.76e-01

Pos logit contributions
 next+0.014
 first+0.013
 two+0.011
 one+0.011
 moment+0.010
Neg logit contributions
 Fro-0.009
 Stri-0.008
 mir-0.008
 spe-0.008
 dam-0.008
Token and
Token ID290
Feature activation-4.00e-02

Pos logit contributions
 Fro+0.014
 mir+0.014
 Stri+0.013
 spe+0.013
 polar+0.013
Neg logit contributions
 next-0.010
 first-0.009
ued-0.007
 two-0.007
 one-0.007
Token pony
Token ID26902
Feature activation-4.52e-01

Pos logit contributions
 Fro+0.002
 mir+0.002
 polar+0.002
rim+0.002
 Stri+0.002
Neg logit contributions
 next-0.002
 first-0.002
 two-0.002
 one-0.002
 moral-0.001
Tokentails
Token ID26404
Feature activation-2.15e+00

Pos logit contributions
 Fro+0.024
 mir+0.024
 polar+0.023
icol+0.023
 spe+0.023
Neg logit contributions
 next-0.017
 first-0.016
 one-0.012
 two-0.012
ued-0.010
Token and
Token ID290
Feature activation-1.42e-01

Pos logit contributions
 Fro+0.108
 mir+0.106
 spe+0.105
 Stri+0.102
 polar+0.102
Neg logit contributions
 next-0.074
 first-0.073
ued-0.055
 two-0.053
 one-0.051
Token curls
Token ID49507
Feature activation-5.70e-01

Pos logit contributions
 Fro+0.006
 mir+0.006
 polar+0.006
 Stri+0.006
 dam+0.006
Neg logit contributions
 next-0.006
 first-0.006
 two-0.005
 one-0.005
ued-0.004
Token.
Token ID13
Feature activation-1.44e-01

Pos logit contributions
 Fro+0.030
 mir+0.029
 Stri+0.028
 spe+0.028
 polar+0.027
Neg logit contributions
 next-0.021
 first-0.020
 two-0.015
 one-0.014
ued-0.014
Token You
Token ID921
Feature activation-8.47e-01

Pos logit contributions
 Fro+0.007
 mir+0.007
rim+0.007
 polar+0.007
 dam+0.007
Neg logit contributions
 next-0.005
 first-0.005
 two-0.004
 one-0.004
 moral-0.003
Token can
Token ID460
Feature activation+2.93e-01

Pos logit contributions
 Fro+0.038
 mir+0.037
 Stri+0.035
 polar+0.035
 spe+0.035
Neg logit contributions
 next-0.038
 first-0.037
 two-0.029
 one-0.028
ued-0.027
Token,
Token ID11
Feature activation+5.25e-02

Pos logit contributions
 Fro+0.019
 mir+0.017
 Stri+0.017
 spe+0.017
 polar+0.017
Neg logit contributions
 next-0.014
 first-0.014
ued-0.011
 two-0.010
 moral-0.010
Token they
Token ID484
Feature activation-7.53e-02

Pos logit contributions
 next+0.002
 first+0.002
 two+0.001
 one+0.001
ued+0.001
Neg logit contributions
 Fro-0.003
 mir-0.003
 Stri-0.002
 polar-0.002
 spe-0.002
Token threw
Token ID9617
Feature activation+5.53e-01

Pos logit contributions
 Fro+0.004
 mir+0.004
 polar+0.004
 Stri+0.004
 spe+0.004
Neg logit contributions
 next-0.003
 first-0.003
 two-0.002
 one-0.002
ued-0.002
Token the
Token ID262
Feature activation-3.33e-01

Pos logit contributions
 next+0.023
 first+0.023
 two+0.017
 one+0.017
ued+0.017
Neg logit contributions
 Fro-0.026
 mir-0.025
 Stri-0.024
 spe-0.024
 polar-0.024
Token spark
Token ID9009
Feature activation-5.79e-01

Pos logit contributions
 spe+0.010
 Fro+0.010
 mir+0.010
 polar+0.010
 ig+0.010
Neg logit contributions
 next-0.018
 first-0.018
ued-0.015
 one-0.014
oning-0.014
Token back
Token ID736
Feature activation-2.15e+00

Pos logit contributions
 Fro+0.027
 mir+0.026
 spe+0.025
 Stri+0.025
 polar+0.024
Neg logit contributions
 next-0.024
 first-0.024
 two-0.018
 one-0.018
ued-0.017
Token and
Token ID290
Feature activation-1.60e-01

Pos logit contributions
 Fro+0.099
 ig+0.094
 mir+0.092
 dam+0.091
 spe+0.090
Neg logit contributions
 next-0.084
 first-0.081
ued-0.065
 two-0.062
 moral-0.061
Token forth
Token ID6071
Feature activation-7.93e-01

Pos logit contributions
 Fro+0.008
 mir+0.007
 Stri+0.007
 polar+0.007
 spe+0.007
Neg logit contributions
 next-0.007
 first-0.006
 two-0.005
 one-0.005
ued-0.005
Token and
Token ID290
Feature activation-3.02e-01

Pos logit contributions
 Fro+0.036
 cho+0.035
 spe+0.035
 Stri+0.034
 som+0.033
Neg logit contributions
 next-0.028
 first-0.027
ued-0.025
 moral-0.024
 two-0.022
Token laughed
Token ID13818
Feature activation-2.18e-01

Pos logit contributions
 Fro+0.014
 mir+0.013
 Stri+0.013
 spe+0.013
 polar+0.013
Neg logit contributions
 next-0.013
 first-0.012
 two-0.010
 one-0.009
ued-0.009
Token.
Token ID13
Feature activation-2.65e-01

Pos logit contributions
 Fro+0.012
 mir+0.011
 Stri+0.011
 spe+0.011
 polar+0.011
Neg logit contributions
 next-0.007
 first-0.007
 two-0.005
ued-0.005
 one-0.005
Token a
Token ID257
Feature activation+6.41e-02

Pos logit contributions
 Fro+0.021
 mir+0.021
rim+0.020
 polar+0.020
avis+0.020
Neg logit contributions
 next-0.011
 first-0.010
 two-0.008
 one-0.007
 moment-0.005
Token flashlight
Token ID38371
Feature activation-2.86e-01

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
 one+0.002
ued+0.002
Neg logit contributions
 Fro-0.003
 mir-0.003
 Stri-0.003
 polar-0.003
 spe-0.003
Token and
Token ID290
Feature activation-3.46e-01

Pos logit contributions
 Fro+0.014
 spe+0.013
 mir+0.013
 polar+0.013
 ig+0.013
Neg logit contributions
 next-0.010
 first-0.010
 two-0.008
ued-0.008
 moral-0.008
Token a
Token ID257
Feature activation-2.46e-01

Pos logit contributions
 Fro+0.020
 mir+0.019
 Stri+0.018
 polar+0.018
 spe+0.018
Neg logit contributions
 next-0.011
 first-0.011
 two-0.008
 one-0.008
ued-0.007
Token magn
Token ID7842
Feature activation-1.17e+00

Pos logit contributions
 Fro+0.010
 mir+0.010
 Stri+0.010
 spe+0.009
 polar+0.009
Neg logit contributions
 next-0.012
 first-0.011
 two-0.009
 one-0.009
ued-0.008
Tokenifying
Token ID4035
Feature activation-2.15e+00

Pos logit contributions
 Fro+0.026
 mir+0.024
 Stri+0.023
 spe+0.023
 polar+0.023
Neg logit contributions
 next-0.078
 first-0.076
 two-0.065
 one-0.065
ued-0.063
Token glass
Token ID5405
Feature activation-8.28e-01

Pos logit contributions
avis+0.102
 Stri+0.101
rim+0.101
urg+0.098
lee+0.095
Neg logit contributions
 first-0.095
 next-0.093
 one-0.087
 two-0.080
 top-0.077
Token to
Token ID284
Feature activation-6.24e-01

Pos logit contributions
 Fro+0.044
 mir+0.043
 Stri+0.042
 polar+0.041
 spe+0.041
Neg logit contributions
 next-0.030
 first-0.029
 two-0.021
 one-0.021
ued-0.019
Token examine
Token ID10716
Feature activation-4.63e-01

Pos logit contributions
 Fro+0.028
 mir+0.026
 Stri+0.026
 polar+0.025
 spe+0.025
Neg logit contributions
 next-0.028
 first-0.027
 two-0.021
 one-0.021
ued-0.020
Token their
Token ID511
Feature activation-2.89e-01

Pos logit contributions
 Fro+0.025
 mir+0.024
 Stri+0.023
 polar+0.023
 spe+0.023
Neg logit contributions
 next-0.016
 first-0.016
 two-0.012
 one-0.011
ued-0.011
Token finds
Token ID7228
Feature activation+1.20e-01

Pos logit contributions
 Fro+0.013
 mir+0.013
 Stri+0.012
 spe+0.012
 polar+0.012
Neg logit contributions
 next-0.013
 first-0.013
 two-0.010
 one-0.010
 moment-0.009
Token dictionary
Token ID22155
Feature activation-1.12e+00

Pos logit contributions
 coc+0.001
 cob+0.001
 por+0.001
 Fro+0.001
 scar+0.001
Neg logit contributions
ued-0.002
 first-0.002
 next-0.002
t-0.002
ging-0.002
Token reply
Token ID10971
Feature activation+3.76e-01

Pos logit contributions
 spe+0.041
 Fro+0.041
 ig+0.038
 Stri+0.038
 mir+0.038
Neg logit contributions
 next-0.049
 first-0.048
oning-0.041
ging-0.038
ued-0.038
Tokened
Token ID276
Feature activation-2.54e-01

Pos logit contributions
 next+0.014
 first+0.014
 two+0.010
ued+0.010
 one+0.010
Neg logit contributions
 Fro-0.018
 mir-0.017
 spe-0.017
 Stri-0.017
 polar-0.017
Token,
Token ID11
Feature activation+3.40e-01

Pos logit contributions
 Fro+0.013
 mir+0.012
 Stri+0.012
 polar+0.012
 spe+0.012
Neg logit contributions
 next-0.010
 first-0.009
 two-0.007
ued-0.007
 one-0.007
Token "
Token ID366
Feature activation-8.42e-02

Pos logit contributions
 first+0.014
 next+0.014
ued+0.012
 two+0.011
working+0.011
Neg logit contributions
 Fro-0.015
 Stri-0.014
 mir-0.014
 spe-0.014
 polar-0.013
TokenThere
Token ID1858
Feature activation-2.15e+00

Pos logit contributions
 Fro+0.003
 Stri+0.003
 mir+0.003
 spe+0.003
avis+0.003
Neg logit contributions
 first-0.004
 next-0.004
ued-0.003
 behind-0.003
 two-0.003
Token are
Token ID389
Feature activation+4.11e-01

Pos logit contributions
 Fro+0.086
 mir+0.082
 spe+0.079
 ig+0.078
 Stri+0.078
Neg logit contributions
 next-0.090
 first-0.086
ued-0.072
 moral-0.069
 two-0.068
Token many
Token ID867
Feature activation+9.58e-01

Pos logit contributions
 next+0.015
 first+0.014
 moment+0.013
ued+0.013
ging+0.012
Neg logit contributions
 mir-0.018
 Fro-0.017
 spe-0.017
 ig-0.017
 cho-0.017
Token ways
Token ID2842
Feature activation-2.17e-01

Pos logit contributions
 first+0.045
 next+0.043
ued+0.038
 moment+0.034
oning+0.034
Neg logit contributions
 spe-0.036
 coc-0.035
 cob-0.034
 mir-0.033
 por-0.032
Token to
Token ID284
Feature activation+5.55e-02

Pos logit contributions
 mir+0.006
 spe+0.006
 Fro+0.006
 ig+0.006
eter+0.006
Neg logit contributions
 next-0.010
 first-0.010
ued-0.009
 moral-0.009
 two-0.008
Token make
Token ID787
Feature activation+1.42e-01

Pos logit contributions
 next+0.002
 first+0.002
ued+0.002
 two+0.002
ging+0.002
Neg logit contributions
 Fro-0.002
 coc-0.002
 mir-0.002
 spe-0.002
 ig-0.002
Token like
Token ID588
Feature activation-2.82e-01

Pos logit contributions
 next+0.022
 first+0.021
ued+0.018
 moral+0.017
 two+0.017
Neg logit contributions
 mir-0.023
 spe-0.023
 Fro-0.023
 ig-0.022
 cho-0.022
Token lettuce
Token ID39406
Feature activation-8.98e-02

Pos logit contributions
 Fro+0.013
 mir+0.013
 Stri+0.012
 polar+0.012
 spe+0.012
Neg logit contributions
 next-0.012
 first-0.012
 two-0.009
 one-0.009
ued-0.009
Token,
Token ID11
Feature activation-3.56e-01

Pos logit contributions
 Fro+0.005
 mir+0.004
 spe+0.004
 polar+0.004
 Stri+0.004
Neg logit contributions
 next-0.003
 first-0.003
ued-0.003
 two-0.002
 one-0.002
Token tomatoes
Token ID23972
Feature activation-4.79e-01

Pos logit contributions
 Fro+0.014
 mir+0.014
 Stri+0.013
 polar+0.013
 spe+0.013
Neg logit contributions
 next-0.018
 first-0.017
 two-0.014
 one-0.014
ued-0.013
Token and
Token ID290
Feature activation-3.78e-01

Pos logit contributions
 Fro+0.023
 mir+0.022
 spe+0.021
 Stri+0.021
 por+0.021
Neg logit contributions
 next-0.018
 first-0.017
ued-0.014
 two-0.014
 moral-0.013
Token cuc
Token ID38421
Feature activation-2.15e+00

Pos logit contributions
 Fro+0.014
 mir+0.014
 Stri+0.013
 polar+0.013
 spe+0.013
Neg logit contributions
 next-0.019
 first-0.019
 two-0.015
 one-0.015
ued-0.015
Tokenumbers
Token ID17024
Feature activation-1.17e-02

Pos logit contributions
 Fro+0.075
 mir+0.070
 Stri+0.068
 spe+0.066
 polar+0.065
Neg logit contributions
 next-0.115
 first-0.112
 two-0.092
 one-0.091
ued-0.088
Token.
Token ID13
Feature activation-5.15e-02

Pos logit contributions
 Fro+0.001
 mir+0.001
 Stri+0.001
 spe+0.001
 polar+0.001
Neg logit contributions
 next-0.000
 first-0.000
ued-0.000
 two-0.000
 moral-0.000
Token He
Token ID679
Feature activation-1.41e-01

Pos logit contributions
 Fro+0.002
 Stri+0.002
 mir+0.002
 polar+0.002
 spe+0.002
Neg logit contributions
 next-0.002
 first-0.002
ued-0.002
 two-0.002
 one-0.002
Token smiled
Token ID13541
Feature activation+4.03e-01

Pos logit contributions
 Fro+0.007
 mir+0.006
 spe+0.006
 polar+0.006
 Stri+0.006
Neg logit contributions
 next-0.006
 first-0.005
ued-0.004
 two-0.004
 moment-0.004
Token as
Token ID355
Feature activation-1.98e-01

Pos logit contributions
 next+0.012
 first+0.011
 two+0.008
 one+0.008
ued+0.007
Neg logit contributions
 Fro-0.024
 mir-0.024
 Stri-0.023
 spe-0.023
 polar-0.023
Token.
Token ID13
Feature activation-1.14e-01

Pos logit contributions
 Fro+0.003
 mir+0.003
 Stri+0.003
 polar+0.003
 spe+0.003
Neg logit contributions
 next-0.002
 first-0.002
 two-0.001
 one-0.001
ued-0.001
Token Have
Token ID8192
Feature activation+7.31e-01

Pos logit contributions
 Fro+0.006
 mir+0.006
 polar+0.006
 dam+0.006
 Stri+0.006
Neg logit contributions
 next-0.004
 first-0.004
 two-0.003
 one-0.003
ued-0.002
Token fun
Token ID1257
Feature activation+3.09e-01

Pos logit contributions
 next+0.029
 first+0.028
 two+0.022
 one+0.022
ued+0.018
Neg logit contributions
 Fro-0.038
 mir-0.037
 polar-0.037
 dam-0.036
 Stri-0.036
Token."
Token ID526
Feature activation-9.99e-02

Pos logit contributions
 next+0.010
 first+0.010
 two+0.007
 one+0.007
ued+0.006
Neg logit contributions
 Fro-0.018
 mir-0.017
 Stri-0.017
 polar-0.016
 spe-0.016
Token<|endoftext|>
Token ID50256
Feature activation+2.19e-02

Pos logit contributions
 Fro+0.006
 mir+0.006
 polar+0.006
 spe+0.006
 Stri+0.005
Neg logit contributions
 next-0.003
 first-0.003
 two-0.002
 one-0.002
ued-0.002
TokenBen
Token ID11696
Feature activation+1.49e+00

Pos logit contributions
 next+0.001
 first+0.001
 two+0.001
 one+0.000
 moment+0.000
Neg logit contributions
 mir-0.001
 Fro-0.001
 spe-0.001
 polar-0.001
 dam-0.001
Token and
Token ID290
Feature activation-2.28e-01

Pos logit contributions
 next+0.053
 first+0.053
ued+0.043
 two+0.038
 moral+0.036
Neg logit contributions
 Fro-0.077
 Stri-0.071
 mir-0.071
 polar-0.071
 ig-0.068
Token Lily
Token ID20037
Feature activation+5.27e-01

Pos logit contributions
 mir+0.011
 Fro+0.011
 dam+0.011
 spe+0.011
rim+0.011
Neg logit contributions
 next-0.009
 first-0.009
 two-0.007
 one-0.007
 moment-0.006
Token were
Token ID547
Feature activation+2.61e-01

Pos logit contributions
 next+0.025
 first+0.025
ued+0.020
 two+0.019
 one+0.018
Neg logit contributions
 Fro-0.021
 Stri-0.020
 polar-0.019
 mir-0.019
 spe-0.019
Token playing
Token ID2712
Feature activation-2.60e-02

Pos logit contributions
 next+0.010
 first+0.010
ued+0.008
 moment+0.008
 one+0.007
Neg logit contributions
 Fro-0.012
 ig-0.011
 polar-0.011
 mir-0.011
 Stri-0.011
Token in
Token ID287
Feature activation+9.03e-02

Pos logit contributions
 mir+0.002
 Fro+0.002
 spe+0.002
 polar+0.002
 Stri+0.002
Neg logit contributions
 next-0.001
 first-0.001
 two-0.000
 one-0.000
ued-0.000
Token library
Token ID5888
Feature activation-5.24e-01

Pos logit contributions
 Fro+0.002
 mir+0.002
 Stri+0.002
 polar+0.002
 spe+0.002
Neg logit contributions
 next-0.002
 first-0.002
 two-0.002
 one-0.002
ued-0.002
Token.
Token ID13
Feature activation+9.69e-02

Pos logit contributions
 ig+0.019
 Fro+0.019
 spe+0.018
 Stri+0.018
light+0.018
Neg logit contributions
ued-0.019
 first-0.019
 next-0.019
 moral-0.017
ging-0.017
Token They
Token ID1119
Feature activation+2.30e-01

Pos logit contributions
 next+0.004
 first+0.004
 two+0.003
 one+0.003
ued+0.003
Neg logit contributions
 Fro-0.004
 mir-0.004
 Stri-0.004
 polar-0.004
 spe-0.004
Token see
Token ID766
Feature activation+1.82e-01

Pos logit contributions
 next+0.011
 first+0.010
ued+0.009
 two+0.008
 moment+0.008
Neg logit contributions
 Fro-0.009
 spe-0.009
 Stri-0.008
 mir-0.008
 polar-0.008
Token many
Token ID867
Feature activation+7.35e-01

Pos logit contributions
 next+0.006
 first+0.006
 two+0.004
 one+0.004
ued+0.004
Neg logit contributions
 Fro-0.010
 mir-0.010
 Stri-0.010
 polar-0.010
 spe-0.010
Token people
Token ID661
Feature activation+1.16e+00

Pos logit contributions
 next+0.032
 first+0.031
 two+0.024
 one+0.024
ued+0.022
Neg logit contributions
 Fro-0.034
 mir-0.033
 Stri-0.032
 polar-0.032
 spe-0.031
Token and
Token ID290
Feature activation+2.39e-02

Pos logit contributions
 first+0.041
 next+0.041
ued+0.037
 moral+0.034
 two+0.033
Neg logit contributions
 Fro-0.052
 ig-0.050
 spe-0.050
 polar-0.050
 mir-0.049
Token cars
Token ID5006
Feature activation+1.10e+00

Pos logit contributions
 next+0.001
 first+0.001
ued+0.001
 two+0.001
 one+0.001
Neg logit contributions
 Fro-0.001
 mir-0.001
 spe-0.001
 polar-0.001
 Stri-0.001
Token on
Token ID319
Feature activation-4.36e-01

Pos logit contributions
 next+0.040
 first+0.040
ued+0.034
 moral+0.032
 two+0.031
Neg logit contributions
 Fro-0.050
 Stri-0.050
 ig-0.048
 mir-0.047
 polar-0.046
Token the
Token ID262
Feature activation-2.99e-01

Pos logit contributions
 Fro+0.021
 mir+0.020
 Stri+0.019
 spe+0.019
 polar+0.019
Neg logit contributions
 next-0.018
 first-0.017
 two-0.013
 one-0.013
ued-0.013
Token road
Token ID2975
Feature activation+6.52e-01

Pos logit contributions
 scar+0.010
 mir+0.010
 Stri+0.010
 Fro+0.010
 por+0.010
Neg logit contributions
 next-0.014
 first-0.013
ued-0.012
 two-0.012
 one-0.011
Token bird
Token ID6512
Feature activation+7.48e-02

Pos logit contributions
 first+0.001
 next+0.001
ued+0.001
oning+0.001
eyed+0.001
Neg logit contributions
 spe-0.001
 coc-0.001
 Fro-0.001
 polar-0.001
 mir-0.001
Token thanked
Token ID26280
Feature activation+1.90e-01

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
ued+0.002
 one+0.002
Neg logit contributions
 Fro-0.004
 mir-0.003
 Stri-0.003
 polar-0.003
 spe-0.003
Token the
Token ID262
Feature activation+2.56e-01

Pos logit contributions
 next+0.007
 first+0.007
 one+0.005
 two+0.005
 moral+0.004
Neg logit contributions
 mir-0.010
 Fro-0.010
 dam-0.010
rim-0.010
 polar-0.010
Token veil
Token ID30615
Feature activation-6.63e-01

Pos logit contributions
 first+0.013
 next+0.013
ued+0.012
oned+0.011
oning+0.011
Neg logit contributions
 coc-0.008
 spe-0.008
 mir-0.008
 Fro-0.007
 cob-0.007
Token for
Token ID329
Feature activation-1.57e-01

Pos logit contributions
 Fro+0.030
 ig+0.026
 Stri+0.026
 coc+0.026
 por+0.026
Neg logit contributions
 next-0.024
 first-0.023
ued-0.022
ging-0.019
Near-0.019
Token its
Token ID663
Feature activation+9.61e-01

Pos logit contributions
 Fro+0.007
 mir+0.007
 Stri+0.007
 polar+0.007
 spe+0.007
Neg logit contributions
 next-0.007
 first-0.006
ued-0.005
 two-0.005
 one-0.005
Token help
Token ID1037
Feature activation+6.02e-01

Pos logit contributions
 next+0.039
 first+0.039
ued+0.038
cases+0.033
ging+0.033
Neg logit contributions
 Fro-0.037
 mir-0.036
 coc-0.035
 spe-0.035
 ig-0.034
Token and
Token ID290
Feature activation+7.66e-02

Pos logit contributions
 next+0.021
 first+0.021
ued+0.016
 two+0.016
 moral+0.015
Neg logit contributions
 Fro-0.031
 Stri-0.029
 mir-0.028
 spe-0.028
 polar-0.028
Token flew
Token ID13112
Feature activation+1.15e-01

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
 one+0.002
ued+0.002
Neg logit contributions
 Fro-0.004
 mir-0.004
 Stri-0.004
 polar-0.004
 spe-0.004
Token away
Token ID1497
Feature activation-1.85e-01

Pos logit contributions
 next+0.004
 first+0.004
 two+0.003
 one+0.003
ued+0.003
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token happily
Token ID18177
Feature activation-2.96e-01

Pos logit contributions
 Fro+0.009
 spe+0.008
 mir+0.008
 dam+0.008
 polar+0.008
Neg logit contributions
 next-0.007
 first-0.007
ued-0.005
 two-0.005
 moral-0.005
Token duck
Token ID22045
Feature activation+3.81e-01

Pos logit contributions
 Fro+0.003
 mir+0.003
 polar+0.003
 Stri+0.003
 spe+0.003
Neg logit contributions
 next-0.004
 first-0.004
 two-0.003
 one-0.003
ued-0.003
Token was
Token ID373
Feature activation-2.01e-01

Pos logit contributions
 next+0.015
 first+0.015
 two+0.011
 one+0.011
ued+0.010
Neg logit contributions
 Fro-0.019
 mir-0.018
 Stri-0.018
 polar-0.018
 spe-0.017
Token clean
Token ID3424
Feature activation-9.93e-01

Pos logit contributions
 Fro+0.010
 mir+0.010
 Stri+0.010
 spe+0.010
 polar+0.010
Neg logit contributions
 next-0.008
 first-0.007
 two-0.006
 one-0.005
ued-0.005
Token,
Token ID11
Feature activation+2.71e-01

Pos logit contributions
 Fro+0.062
 mir+0.061
 Stri+0.059
 spe+0.058
lic+0.058
Neg logit contributions
 next-0.030
 first-0.029
 one-0.019
 two-0.019
 up-0.014
Token the
Token ID262
Feature activation+9.90e-02

Pos logit contributions
 next+0.011
 first+0.010
 two+0.008
 one+0.008
ued+0.007
Neg logit contributions
 Fro-0.014
 mir-0.013
 polar-0.013
 Stri-0.013
 spe-0.013
Token navy
Token ID23956
Feature activation+1.05e+00

Pos logit contributions
 next+0.005
 first+0.005
 two+0.004
 one+0.004
ued+0.004
Neg logit contributions
 Fro-0.004
 mir-0.004
 polar-0.003
 Stri-0.003
 spe-0.003
Token boat
Token ID8848
Feature activation+5.03e-01

Pos logit contributions
 next+0.053
 first+0.051
 two+0.042
 one+0.041
ued+0.040
Neg logit contributions
 Fro-0.041
 mir-0.039
 Stri-0.037
 polar-0.037
 spe-0.037
Token took
Token ID1718
Feature activation-9.44e-02

Pos logit contributions
 first+0.021
 next+0.021
ued+0.017
 two+0.016
 one+0.016
Neg logit contributions
 Fro-0.022
 mir-0.021
 polar-0.021
 Stri-0.021
 spe-0.021
Token the
Token ID262
Feature activation-2.89e-01

Pos logit contributions
 Fro+0.005
 mir+0.005
ric+0.005
rim+0.005
 Stri+0.005
Neg logit contributions
 next-0.003
 first-0.003
 one-0.002
 two-0.002
 up-0.002
Token duck
Token ID22045
Feature activation+1.03e-01

Pos logit contributions
 Fro+0.009
 mir+0.009
 polar+0.009
 spe+0.009
 ig+0.009
Neg logit contributions
 next-0.015
 first-0.015
ued-0.013
 two-0.013
 one-0.013
Token back
Token ID736
Feature activation-7.99e-01

Pos logit contributions
 next+0.004
 first+0.004
ued+0.003
 two+0.003
 moral+0.003
Neg logit contributions
 Fro-0.005
 mir-0.005
 polar-0.005
 ig-0.005
 dam-0.004
Token The
Token ID383
Feature activation+4.48e-01

Pos logit contributions
 next+0.006
 first+0.005
ued+0.004
 two+0.004
 one+0.004
Neg logit contributions
 Fro-0.005
 Stri-0.005
 mir-0.005
 polar-0.004
 ig-0.004
Token end
Token ID886
Feature activation+9.46e-02

Pos logit contributions
ued+0.022
 first+0.020
oned+0.020
ving+0.019
 aside+0.019
Neg logit contributions
 ig-0.015
 polar-0.015
 mir-0.013
 spe-0.013
 por-0.013
Token.
Token ID13
Feature activation+4.70e-02

Pos logit contributions
 next+0.003
 first+0.002
 one+0.002
 two+0.001
 moment+0.001
Neg logit contributions
 Fro-0.006
 mir-0.006
 spe-0.006
 dam-0.006
 Stri-0.006
Token<|endoftext|>
Token ID50256
Feature activation+2.40e-01

Pos logit contributions
 next+0.001
 first+0.001
 two+0.001
 one+0.001
ued+0.001
Neg logit contributions
 Fro-0.003
 mir-0.003
 spe-0.003
 Stri-0.003
 polar-0.003
TokenOnce
Token ID7454
Feature activation+1.25e+00

Pos logit contributions
 first+0.009
 next+0.009
ued+0.007
 one+0.007
 two+0.006
Neg logit contributions
 Fro-0.012
 Stri-0.011
 polar-0.011
 mir-0.011
 ig-0.010
Token upon
Token ID2402
Feature activation+9.91e-01

Pos logit contributions
 next+0.051
 first+0.047
ued+0.041
 two+0.039
oning+0.038
Neg logit contributions
 Fro-0.057
 ig-0.052
 mir-0.050
 Stri-0.050
 polar-0.050
Token a
Token ID257
Feature activation+2.63e-02

Pos logit contributions
 next+0.033
 first+0.032
 two+0.023
 one+0.022
ued+0.021
Neg logit contributions
 Fro-0.055
 mir-0.053
 Stri-0.052
 polar-0.051
 spe-0.051
Token time
Token ID640
Feature activation+2.26e-01

Pos logit contributions
 next+0.001
 first+0.001
 two+0.001
ued+0.001
 one+0.001
Neg logit contributions
 Fro-0.001
 mir-0.001
 pic-0.001
 spe-0.001
 ig-0.001
Token,
Token ID11
Feature activation-5.08e-03

Pos logit contributions
 next+0.010
 first+0.010
ued+0.009
 moral+0.009
Whoever+0.008
Neg logit contributions
 Fro-0.009
 ig-0.008
 polar-0.007
 spe-0.007
 Stri-0.007
Token there
Token ID612
Feature activation-1.50e-01

Pos logit contributions
 Fro+0.000
 ig+0.000
 spe+0.000
 polar+0.000
 cob+0.000
Neg logit contributions
 next-0.000
 first-0.000
five-0.000
oning-0.000
ued-0.000
Token was
Token ID373
Feature activation+7.30e-01

Pos logit contributions
 mir+0.008
 Fro+0.008
 Stri+0.007
 spe+0.007
 polar+0.007
Neg logit contributions
 next-0.006
 first-0.005
 two-0.004
 one-0.004
 moment-0.003
Token They
Token ID1119
Feature activation+1.43e-01

Pos logit contributions
 next+0.011
 first+0.011
 two+0.008
 one+0.008
ued+0.007
Neg logit contributions
 Fro-0.016
 mir-0.015
 Stri-0.015
 polar-0.015
 spe-0.015
Token also
Token ID635
Feature activation+4.06e-02

Pos logit contributions
 next+0.006
 first+0.006
 two+0.005
 one+0.004
ued+0.004
Neg logit contributions
 Fro-0.007
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token send
Token ID3758
Feature activation+2.47e-01

Pos logit contributions
 next+0.002
 first+0.002
 two+0.001
ued+0.001
 one+0.001
Neg logit contributions
 Fro-0.002
 mir-0.002
 spe-0.002
 Stri-0.002
 polar-0.002
Token them
Token ID606
Feature activation-1.88e-01

Pos logit contributions
 next+0.011
 first+0.010
ued+0.008
 two+0.008
 one+0.007
Neg logit contributions
 Fro-0.011
 mir-0.010
 Stri-0.010
 polar-0.010
 spe-0.010
Token some
Token ID617
Feature activation-3.72e-01

Pos logit contributions
 Fro+0.007
 Stri+0.007
 scar+0.007
 polar+0.007
 mir+0.007
Neg logit contributions
 next-0.008
 first-0.008
ued-0.007
oning-0.006
 moral-0.006
Token jokes
Token ID14532
Feature activation+1.10e-01

Pos logit contributions
 mir+0.015
 spe+0.015
 Fro+0.015
 ig+0.015
 polar+0.015
Neg logit contributions
 next-0.016
 first-0.016
ued-0.013
 two-0.012
 one-0.012
Token and
Token ID290
Feature activation+1.05e-01

Pos logit contributions
 first+0.004
 next+0.004
ued+0.003
 two+0.003
 moral+0.003
Neg logit contributions
 Fro-0.005
 Stri-0.005
 mir-0.005
 spe-0.005
 polar-0.005
Token r
Token ID374
Feature activation-7.43e-01

Pos logit contributions
 next+0.005
 first+0.005
 two+0.004
ued+0.004
 one+0.003
Neg logit contributions
 Fro-0.005
 mir-0.004
 Stri-0.004
 spe-0.004
 polar-0.004
Tokeniddles
Token ID29319
Feature activation-9.16e-01

Pos logit contributions
 Fro+0.034
 mir+0.033
 spe+0.031
 polar+0.031
 Stri+0.031
Neg logit contributions
 next-0.033
 first-0.032
 two-0.026
 one-0.025
ued-0.023
Token.
Token ID13
Feature activation+3.11e-01

Pos logit contributions
 Fro+0.041
 Stri+0.040
 mir+0.040
 spe+0.039
 cho+0.038
Neg logit contributions
 first-0.035
 next-0.035
ued-0.027
 two-0.026
 moral-0.026
Token
Token ID198
Feature activation+1.56e-02

Pos logit contributions
 next+0.012
 first+0.012
 two+0.009
 one+0.009
ued+0.008
Neg logit contributions
 Fro-0.016
 mir-0.015
 Stri-0.015
 polar-0.015
 spe-0.015
Token.
Token ID13
Feature activation-1.52e-01

Pos logit contributions
 next+0.025
 first+0.023
ued+0.018
 moral+0.018
 two+0.018
Neg logit contributions
 Fro-0.032
 Stri-0.029
 polar-0.029
 por-0.028
 spe-0.028
Token "
Token ID366
Feature activation-2.81e-01

Pos logit contributions
 Fro+0.007
 Stri+0.007
 mir+0.007
 polar+0.007
 spe+0.006
Neg logit contributions
 next-0.006
 first-0.006
ued-0.005
 two-0.004
 one-0.004
TokenYou
Token ID1639
Feature activation-3.89e-01

Pos logit contributions
 Fro+0.011
 Stri+0.010
avis+0.010
 mir+0.010
Su+0.009
Neg logit contributions
 first-0.013
 next-0.012
ued-0.011
 one-0.010
 two-0.010
Token just
Token ID655
Feature activation-6.92e-01

Pos logit contributions
 Fro+0.016
 mir+0.015
 Stri+0.015
 spe+0.015
 polar+0.015
Neg logit contributions
 next-0.019
 first-0.018
 two-0.014
 one-0.014
ued-0.014
Token need
Token ID761
Feature activation+1.57e-01

Pos logit contributions
 Fro+0.028
 spe+0.027
 mir+0.026
 polar+0.026
 Stri+0.026
Neg logit contributions
 next-0.033
 first-0.031
ued-0.024
 moment-0.023
 two-0.023
Token to
Token ID284
Feature activation+4.07e-01

Pos logit contributions
 next+0.007
 first+0.007
ued+0.006
 two+0.005
 moment+0.005
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 spe-0.006
 coc-0.006
Token make
Token ID787
Feature activation+5.02e-01

Pos logit contributions
 next+0.018
 first+0.017
 two+0.013
 one+0.013
ued+0.013
Neg logit contributions
 Fro-0.019
 mir-0.018
 Stri-0.017
 polar-0.017
 spe-0.017
Token a
Token ID257
Feature activation+2.01e-01

Pos logit contributions
 next+0.019
 first+0.018
 two+0.014
 one+0.014
ued+0.011
Neg logit contributions
 Fro-0.027
 mir-0.026
 spe-0.025
 polar-0.025
 Stri-0.025
Token wish
Token ID4601
Feature activation-8.29e-01

Pos logit contributions
 next+0.009
 first+0.008
 one+0.007
 two+0.007
 moment+0.006
Neg logit contributions
 Fro-0.010
 Stri-0.009
 mir-0.009
 polar-0.009
 spe-0.009
Token!"
Token ID2474
Feature activation-1.70e-01

Pos logit contributions
 mir+0.035
 spe+0.033
 por+0.033
 Fro+0.033
 Stri+0.033
Neg logit contributions
 next-0.031
 first-0.031
ued-0.028
 two-0.025
 moral-0.024
Token
Token ID198
Feature activation+4.05e-02

Pos logit contributions
 Fro+0.009
 Stri+0.008
 mir+0.008
 polar+0.008
avis+0.008
Neg logit contributions
 next-0.006
 first-0.006
ued-0.005
 two-0.004
 one-0.004
Token found
Token ID1043
Feature activation-2.85e-01

Pos logit contributions
 next+0.015
 first+0.015
ued+0.012
 moral+0.012
 two+0.011
Neg logit contributions
 Fro-0.014
 Stri-0.014
 mir-0.013
 polar-0.013
 spe-0.013
Token something
Token ID1223
Feature activation+8.12e-01

Pos logit contributions
 Fro+0.014
 Stri+0.013
 mir+0.013
 polar+0.013
 ig+0.013
Neg logit contributions
 next-0.011
 first-0.011
ued-0.008
 two-0.008
 one-0.008
Token disgusting
Token ID23374
Feature activation-3.65e-02

Pos logit contributions
 next+0.033
 first+0.032
 two+0.025
ued+0.025
 one+0.024
Neg logit contributions
 Fro-0.037
 mir-0.036
 ig-0.035
 polar-0.035
 spe-0.035
Token living
Token ID2877
Feature activation-8.98e-01

Pos logit contributions
 Fro+0.002
 mir+0.002
 polar+0.002
 Stri+0.002
 spe+0.002
Neg logit contributions
 next-0.001
 first-0.001
 two-0.001
ued-0.001
 one-0.001
Token there
Token ID612
Feature activation-4.20e-02

Pos logit contributions
 Fro+0.054
 mir+0.053
 Stri+0.052
 spe+0.051
 polar+0.051
Neg logit contributions
 next-0.028
 first-0.027
 two-0.018
 one-0.018
ued-0.015
Token.
Token ID13
Feature activation-1.50e-01

Pos logit contributions
 Fro+0.002
 mir+0.002
 Stri+0.002
 ig+0.002
 polar+0.002
Neg logit contributions
 next-0.002
 first-0.002
ued-0.001
 two-0.001
 moral-0.001
Token It
Token ID632
Feature activation+4.06e-01

Pos logit contributions
 Fro+0.005
 Stri+0.005
 mir+0.005
avis+0.005
icol+0.005
Neg logit contributions
 next-0.008
 first-0.008
ued-0.006
 behind-0.006
 one-0.006
Token was
Token ID373
Feature activation+2.36e-02

Pos logit contributions
 next+0.017
 first+0.016
 two+0.013
 one+0.012
ued+0.012
Neg logit contributions
 Fro-0.019
 mir-0.019
 Stri-0.018
 polar-0.018
 spe-0.018
Token a
Token ID257
Feature activation-3.88e-02

Pos logit contributions
 next+0.001
 first+0.001
 two+0.001
 one+0.001
 missing+0.000
Neg logit contributions
 Fro-0.001
 mir-0.001
 Stri-0.001
 spe-0.001
 polar-0.001
Token big
Token ID1263
Feature activation-2.84e-01

Pos logit contributions
 Fro+0.002
 Stri+0.002
 mir+0.002
urg+0.002
rim+0.002
Neg logit contributions
 next-0.002
 first-0.002
 two-0.001
 one-0.001
 moment-0.001
Token spider
Token ID19230
Feature activation-1.94e-01

Pos logit contributions
 Fro+0.012
 mir+0.012
 Stri+0.012
 polar+0.011
 spe+0.011
Neg logit contributions
 next-0.013
 first-0.013
 one-0.010
 two-0.010
 moment-0.009
Token day
Token ID1110
Feature activation-2.96e-01

Pos logit contributions
ued+0.005
atively+0.005
ider+0.005
cases+0.005
 first+0.005
Neg logit contributions
 mir-0.006
 Fro-0.006
 cel-0.005
 por-0.005
 coc-0.005
Token.
Token ID13
Feature activation+1.19e-01

Pos logit contributions
 Fro+0.008
 such+0.008
 Stri+0.008
 coc+0.008
 Mitt+0.008
Neg logit contributions
 first-0.013
ued-0.012
 next-0.012
Whoever-0.012
 moment-0.012
Token She
Token ID1375
Feature activation-1.41e-01

Pos logit contributions
 next+0.004
 first+0.004
 two+0.003
 one+0.003
ued+0.003
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token realized
Token ID6939
Feature activation-3.92e-01

Pos logit contributions
 Fro+0.008
 mir+0.007
 spe+0.007
 Stri+0.007
 polar+0.007
Neg logit contributions
 next-0.005
 first-0.005
 two-0.004
 one-0.003
ued-0.003
Token that
Token ID326
Feature activation+2.95e-01

Pos logit contributions
 Fro+0.018
 mir+0.018
 Stri+0.017
 polar+0.017
 spe+0.017
Neg logit contributions
 next-0.017
 first-0.016
 two-0.012
 one-0.012
ued-0.012
Token it
Token ID340
Feature activation+3.51e-01

Pos logit contributions
 first+0.012
ued+0.011
 next+0.011
five+0.009
oning+0.009
Neg logit contributions
 Fro-0.013
 mir-0.012
 Stri-0.012
 coc-0.012
 spe-0.012
Token's
Token ID338
Feature activation+1.15e-01

Pos logit contributions
 next+0.014
 first+0.014
ued+0.012
 one+0.011
five+0.010
Neg logit contributions
 Fro-0.015
 Stri-0.014
 mir-0.014
 por-0.013
 cel-0.013
Token better
Token ID1365
Feature activation-2.83e-01

Pos logit contributions
 next+0.004
 first+0.004
ued+0.004
five+0.004
oning+0.004
Neg logit contributions
 Fro-0.005
 mir-0.004
 Stri-0.004
 cel-0.004
 dam-0.004
Token to
Token ID284
Feature activation+8.13e-01

Pos logit contributions
 Fro+0.010
 mir+0.010
 Stri+0.009
 dam+0.009
 coc+0.009
Neg logit contributions
 first-0.013
 next-0.013
ued-0.012
 two-0.011
oning-0.011
Token be
Token ID307
Feature activation+9.13e-01

Pos logit contributions
 next+0.035
 first+0.033
ued+0.027
 two+0.026
 one+0.026
Neg logit contributions
 Fro-0.036
 mir-0.035
 polar-0.033
 Stri-0.033
 dam-0.033
Token honest
Token ID5508
Feature activation+1.00e-01

Pos logit contributions
 first+0.038
 next+0.037
ued+0.031
 moment+0.027
 two+0.027
Neg logit contributions
 Fro-0.040
 mir-0.039
 dam-0.037
 Stri-0.037
 polar-0.037
Token a
Token ID257
Feature activation-7.63e-01

Pos logit contributions
 Fro+0.011
 mir+0.011
 Stri+0.011
 spe+0.011
 polar+0.011
Neg logit contributions
 next-0.008
 first-0.008
ued-0.006
 two-0.006
 one-0.006
Token heart
Token ID2612
Feature activation-6.21e-01

Pos logit contributions
 Fro+0.034
 mir+0.032
 Stri+0.032
rim+0.030
avis+0.030
Neg logit contributions
 next-0.034
 first-0.034
 two-0.027
 one-0.027
 moment-0.024
Token,
Token ID11
Feature activation-1.19e-01

Pos logit contributions
 Fro+0.034
 mir+0.033
 Stri+0.032
 polar+0.031
 spe+0.031
Neg logit contributions
 next-0.022
 first-0.021
 two-0.016
 one-0.015
ued-0.014
Token and
Token ID290
Feature activation-3.48e-01

Pos logit contributions
 Fro+0.006
 mir+0.006
 Stri+0.006
 spe+0.006
 polar+0.006
Neg logit contributions
 next-0.004
 first-0.004
ued-0.003
 two-0.003
 one-0.003
Token many
Token ID867
Feature activation+1.48e-01

Pos logit contributions
 Fro+0.017
 mir+0.016
 spe+0.016
 Stri+0.016
 polar+0.016
Neg logit contributions
 next-0.013
 first-0.013
ued-0.010
 two-0.009
 one-0.009
Token others
Token ID1854
Feature activation+3.62e-02

Pos logit contributions
 next+0.006
 first+0.006
 two+0.005
 one+0.004
ued+0.004
Neg logit contributions
 Fro-0.007
 mir-0.007
 spe-0.006
 polar-0.006
 Stri-0.006
Token.
Token ID13
Feature activation-4.02e-02

Pos logit contributions
 next+0.002
 first+0.002
 two+0.001
ued+0.001
 one+0.001
Neg logit contributions
 Fro-0.002
 mir-0.002
 spe-0.001
 Stri-0.001
 polar-0.001
Token
Token ID220
Feature activation-2.53e-02

Pos logit contributions
 Fro+0.002
 Stri+0.002
 polar+0.002
 dam+0.002
 mir+0.002
Neg logit contributions
 next-0.002
 first-0.002
 one-0.001
ued-0.001
 two-0.001
Token
Token ID198
Feature activation-2.96e-01

Pos logit contributions
 Fro+0.001
 Stri+0.001
 dam+0.001
ric+0.001
lic+0.001
Neg logit contributions
 next-0.001
 first-0.001
ued-0.001
 one-0.001
 two-0.001
Token
Token ID198
Feature activation-2.77e-01

Pos logit contributions
 Fro+0.011
 Stri+0.011
lic+0.011
Su+0.010
 Chick+0.010
Neg logit contributions
 next-0.014
 first-0.014
working-0.011
 behind-0.011
 missing-0.011
TokenWhen
Token ID2215
Feature activation-9.31e-03

Pos logit contributions
 Fro+0.011
 Stri+0.010
Su+0.010
lic+0.010
 mir+0.010
Neg logit contributions
 first-0.013
 next-0.012
 one-0.010
 behind-0.010
 missing-0.010
Token accept
Token ID2453
Feature activation+1.44e-01

Pos logit contributions
 next+0.010
 first+0.010
 two+0.007
 one+0.007
ued+0.007
Neg logit contributions
 Fro-0.012
 mir-0.011
 Stri-0.011
 polar-0.011
 spe-0.011
Token the
Token ID262
Feature activation+5.95e-01

Pos logit contributions
 next+0.006
 first+0.006
 two+0.005
ued+0.004
 one+0.004
Neg logit contributions
 Fro-0.007
 mir-0.006
 Stri-0.006
 polar-0.006
 spe-0.006
Token answer
Token ID3280
Feature activation-2.01e-01

Pos logit contributions
 next+0.024
 first+0.023
ued+0.019
 behind+0.018
oned+0.017
Neg logit contributions
 mir-0.026
 ig-0.025
 polar-0.024
 por-0.024
 Fro-0.024
Token.
Token ID13
Feature activation-3.79e-01

Pos logit contributions
 Fro+0.010
 mir+0.010
 Stri+0.010
 polar+0.009
 spe+0.009
Neg logit contributions
 next-0.007
 first-0.007
 two-0.005
ued-0.005
 one-0.005
Token
Token ID220
Feature activation-1.90e-01

Pos logit contributions
 Fro+0.020
 mir+0.019
 Stri+0.019
 polar+0.019
 spe+0.019
Neg logit contributions
 next-0.014
 first-0.013
 two-0.010
 one-0.010
ued-0.009
Token
Token ID198
Feature activation-6.12e-01

Pos logit contributions
 Fro+0.010
 Stri+0.009
 mir+0.009
 polar+0.008
 dam+0.008
Neg logit contributions
 next-0.007
 first-0.007
 one-0.005
ued-0.005
 two-0.005
Token
Token ID198
Feature activation-4.11e-01

Pos logit contributions
 Fro+0.028
 Stri+0.027
lic+0.025
Su+0.025
 mir+0.025
Neg logit contributions
 next-0.026
 first-0.026
 one-0.021
 behind-0.019
 two-0.019
TokenThe
Token ID464
Feature activation+4.82e-01

Pos logit contributions
 Fro+0.017
 Stri+0.017
 mir+0.016
 polar+0.016
Su+0.016
Neg logit contributions
 first-0.019
 next-0.018
 one-0.014
 two-0.014
 behind-0.014
Token next
Token ID1306
Feature activation-1.13e-01

Pos logit contributions
 first+0.008
 next+0.008
ued+0.007
 behind+0.006
five+0.006
Neg logit contributions
 polar-0.030
 ig-0.029
 mir-0.029
 por-0.028
 Fro-0.028
Token day
Token ID1110
Feature activation-3.97e-01

Pos logit contributions
 Fro+0.007
 mir+0.007
 polar+0.006
 ig+0.006
 spe+0.006
Neg logit contributions
 next-0.003
 first-0.003
 two-0.002
ued-0.002
 behind-0.002
Token,
Token ID11
Feature activation-2.08e-01

Pos logit contributions
 Fro+0.021
 mir+0.020
 Stri+0.020
 polar+0.020
 spe+0.019
Neg logit contributions
 next-0.014
 first-0.014
 two-0.010
 one-0.010
ued-0.009
Token table
Token ID3084
Feature activation-5.67e-01

Pos logit contributions
 Fro+0.025
 mir+0.024
 Stri+0.023
 polar+0.023
 spe+0.023
Neg logit contributions
 next-0.023
 first-0.022
 two-0.018
 one-0.017
ued-0.017
Token.
Token ID13
Feature activation+4.25e-01

Pos logit contributions
 Fro+0.024
 mir+0.024
 ig+0.024
 spe+0.023
 polar+0.023
Neg logit contributions
 first-0.021
 next-0.021
ued-0.018
 two-0.017
 moral-0.017
Token It
Token ID632
Feature activation+1.80e-01

Pos logit contributions
 next+0.017
 first+0.016
 two+0.013
 one+0.013
 moral+0.011
Neg logit contributions
 mir-0.020
 Fro-0.020
 spe-0.020
rim-0.020
 polar-0.019
Token was
Token ID373
Feature activation+1.77e-03

Pos logit contributions
 next+0.007
 first+0.007
 two+0.005
 one+0.005
ued+0.004
Neg logit contributions
 Fro-0.010
 mir-0.009
 Stri-0.009
 polar-0.009
 dam-0.009
Token their
Token ID511
Feature activation+1.35e-01

Pos logit contributions
 next+0.000
 first+0.000
 two+0.000
 one+0.000
 behind+0.000
Neg logit contributions
 Fro-0.000
 mir-0.000
 Stri-0.000
 dam-0.000
 polar-0.000
Token mom
Token ID1995
Feature activation-4.89e-01

Pos logit contributions
 next+0.006
 first+0.006
 two+0.004
 one+0.004
 moral+0.004
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 spe-0.006
 polar-0.006
Token's
Token ID338
Feature activation-9.18e-02

Pos logit contributions
 Fro+0.027
 mir+0.027
 Stri+0.026
 polar+0.026
 spe+0.026
Neg logit contributions
 next-0.016
 first-0.016
 two-0.011
 one-0.011
ued-0.010
Token tea
Token ID8887
Feature activation-5.22e-01

Pos logit contributions
 Fro+0.004
 mir+0.004
 Stri+0.004
 spe+0.003
 polar+0.003
Neg logit contributions
 next-0.004
 first-0.004
 two-0.004
 one-0.003
 moment-0.003
Token,
Token ID11
Feature activation+1.61e-02

Pos logit contributions
 Fro+0.022
 mir+0.021
 Stri+0.020
 polar+0.020
 spe+0.020
Neg logit contributions
 next-0.025
 first-0.024
 two-0.020
 one-0.019
ued-0.018
Token but
Token ID475
Feature activation+6.95e-02

Pos logit contributions
 next+0.001
 first+0.001
 two+0.000
 one+0.000
ued+0.000
Neg logit contributions
 Fro-0.001
 mir-0.001
 Stri-0.001
 spe-0.001
 polar-0.001
Token she
Token ID673
Feature activation-8.52e-01

Pos logit contributions
 next+0.002
 first+0.002
ued+0.002
oning+0.002
 moment+0.002
Neg logit contributions
 Fro-0.003
 mir-0.003
 Stri-0.003
 spe-0.003
 ig-0.003
Token Anna
Token ID11735
Feature activation+7.23e-02

Pos logit contributions
 Fro+0.020
 mir+0.019
 Stri+0.019
 polar+0.019
 spe+0.018
Neg logit contributions
 next-0.021
 first-0.021
ued-0.016
 two-0.016
 one-0.015
Token are
Token ID389
Feature activation+3.82e-01

Pos logit contributions
 next+0.003
 first+0.003
ued+0.002
 two+0.002
 one+0.002
Neg logit contributions
 Fro-0.003
 spe-0.003
 mir-0.003
 polar-0.003
 Stri-0.003
Token friends
Token ID2460
Feature activation+3.58e-03

Pos logit contributions
 next+0.015
 first+0.014
ued+0.013
 moment+0.012
 one+0.011
Neg logit contributions
 Fro-0.017
 mir-0.016
 ig-0.016
 polar-0.015
 spe-0.015
Token.
Token ID13
Feature activation+4.37e-02

Pos logit contributions
 first+0.000
 next+0.000
ued+0.000
 two+0.000
 moment+0.000
Neg logit contributions
 Fro-0.000
 spe-0.000
 polar-0.000
 mir-0.000
 Stri-0.000
Token They
Token ID1119
Feature activation+7.60e-01

Pos logit contributions
 next+0.002
 first+0.002
ued+0.002
 two+0.002
 one+0.001
Neg logit contributions
 Fro-0.002
 mir-0.002
 spe-0.002
 polar-0.002
 Stri-0.002
Token like
Token ID588
Feature activation-5.07e-01

Pos logit contributions
 next+0.035
 first+0.033
ued+0.027
 two+0.026
 moment+0.025
Neg logit contributions
 Fro-0.031
 mir-0.029
 spe-0.029
 polar-0.029
 Stri-0.029
Token to
Token ID284
Feature activation-1.58e-01

Pos logit contributions
 Fro+0.024
 mir+0.024
 polar+0.023
 Stri+0.023
 spe+0.023
Neg logit contributions
 next-0.021
 first-0.020
 two-0.016
 one-0.016
ued-0.014
Token play
Token ID711
Feature activation-1.35e-01

Pos logit contributions
 Fro+0.006
 mir+0.006
 polar+0.006
 Stri+0.006
 spe+0.006
Neg logit contributions
 next-0.008
 first-0.008
 two-0.006
 one-0.006
ued-0.006
Token and
Token ID290
Feature activation-6.67e-01

Pos logit contributions
 Fro+0.008
 mir+0.008
 Stri+0.008
 polar+0.008
 spe+0.008
Neg logit contributions
 next-0.004
 first-0.004
 two-0.003
 one-0.003
ued-0.002
Token have
Token ID423
Feature activation-4.68e-01

Pos logit contributions
 Fro+0.031
 mir+0.029
 spe+0.028
 Stri+0.028
 polar+0.028
Neg logit contributions
 next-0.028
 first-0.027
 two-0.021
ued-0.021
 one-0.020
Token fun
Token ID1257
Feature activation-6.79e-01

Pos logit contributions
 Fro+0.025
 mir+0.025
rim+0.025
 polar+0.025
avis+0.024
Neg logit contributions
 next-0.017
 first-0.016
 two-0.014
 one-0.013
 moment-0.010
Token mix
Token ID5022
Feature activation+2.71e-01

Pos logit contributions
 next+0.004
 first+0.004
ued+0.003
 two+0.003
 one+0.003
Neg logit contributions
 Fro-0.006
 mir-0.006
 Stri-0.006
 spe-0.006
 polar-0.006
Token,
Token ID11
Feature activation+2.03e-01

Pos logit contributions
 next+0.010
 first+0.010
ued+0.008
 two+0.007
 one+0.007
Neg logit contributions
 Fro-0.013
 mir-0.013
 Stri-0.013
 spe-0.012
 ig-0.012
Token and
Token ID290
Feature activation+3.85e-01

Pos logit contributions
 first+0.007
 next+0.007
ued+0.006
 two+0.005
 moment+0.005
Neg logit contributions
 Fro-0.010
 mir-0.010
 spe-0.010
 Stri-0.010
 ig-0.010
Token bake
Token ID28450
Feature activation-4.86e-01

Pos logit contributions
 next+0.016
 first+0.016
ued+0.012
 two+0.012
 one+0.012
Neg logit contributions
 Fro-0.017
 mir-0.017
 spe-0.016
 Stri-0.016
 polar-0.016
Token a
Token ID257
Feature activation-1.82e-01

Pos logit contributions
 Fro+0.024
 mir+0.023
 Stri+0.023
 spe+0.023
 polar+0.023
Neg logit contributions
 next-0.018
 first-0.017
ued-0.013
 two-0.013
 one-0.013
Token cake
Token ID12187
Feature activation-2.80e-01

Pos logit contributions
 Fro+0.008
 mir+0.008
 Stri+0.008
 polar+0.007
 spe+0.007
Neg logit contributions
 next-0.008
 first-0.008
 two-0.006
 one-0.006
ued-0.006
Token.
Token ID13
Feature activation+5.86e-01

Pos logit contributions
 Fro+0.016
 mir+0.015
 Stri+0.015
 polar+0.015
 spe+0.015
Neg logit contributions
 next-0.009
 first-0.009
 two-0.006
 one-0.006
ued-0.006
Token He
Token ID679
Feature activation+1.08e-01

Pos logit contributions
 next+0.025
 first+0.024
ued+0.018
 two+0.018
 one+0.018
Neg logit contributions
 Fro-0.028
 Stri-0.026
 mir-0.026
 polar-0.025
 spe-0.025
Token also
Token ID635
Feature activation+1.32e-01

Pos logit contributions
 next+0.005
 first+0.004
ued+0.004
 one+0.003
 two+0.003
Neg logit contributions
 Fro-0.005
 mir-0.005
 Stri-0.005
 spe-0.005
 polar-0.005
Token let
Token ID1309
Feature activation-6.04e-02

Pos logit contributions
 next+0.006
 first+0.006
ued+0.005
oning+0.005
 moment+0.005
Neg logit contributions
 spe-0.005
 mir-0.005
 Fro-0.005
 Stri-0.005
 ig-0.005
Token them
Token ID606
Feature activation+3.74e-01

Pos logit contributions
 Fro+0.003
 mir+0.003
 spe+0.003
 Stri+0.003
 polar+0.003
Neg logit contributions
 next-0.002
 first-0.002
 two-0.002
 one-0.002
ued-0.002
Token had
Token ID550
Feature activation-4.93e-01

Pos logit contributions
 Fro+0.019
 mir+0.018
 Stri+0.018
 spe+0.017
 polar+0.017
Neg logit contributions
 next-0.015
 first-0.014
ued-0.011
 two-0.010
 one-0.010
Token fun
Token ID1257
Feature activation-2.89e-01

Pos logit contributions
rim+0.028
 mir+0.028
avis+0.027
 Fro+0.027
 polar+0.027
Neg logit contributions
 first-0.017
 next-0.017
 two-0.014
 one-0.013
 right-0.009
Token and
Token ID290
Feature activation-4.37e-01

Pos logit contributions
 Fro+0.014
 Stri+0.013
 spe+0.013
 dam+0.013
 ig+0.013
Neg logit contributions
 first-0.010
 next-0.010
ued-0.008
 two-0.007
 moral-0.007
Token laughed
Token ID13818
Feature activation-1.35e-01

Pos logit contributions
 mir+0.022
 Fro+0.022
 polar+0.022
 Stri+0.021
 ig+0.021
Neg logit contributions
 next-0.017
 first-0.017
 two-0.012
 one-0.012
 moment-0.011
Token.
Token ID13
Feature activation-3.35e-01

Pos logit contributions
 Fro+0.007
 Stri+0.007
 mir+0.007
 polar+0.007
 spe+0.006
Neg logit contributions
 first-0.005
 next-0.005
ued-0.004
 two-0.003
 moment-0.003
Token They
Token ID1119
Feature activation-4.60e-01

Pos logit contributions
 Fro+0.016
 mir+0.015
 Stri+0.015
 polar+0.015
 spe+0.015
Neg logit contributions
 next-0.014
 first-0.013
 two-0.010
ued-0.010
 one-0.010
Token became
Token ID2627
Feature activation-6.93e-01

Pos logit contributions
 Fro+0.022
 mir+0.021
 Stri+0.021
 polar+0.020
 spe+0.020
Neg logit contributions
 next-0.019
 first-0.018
 two-0.014
 one-0.014
ued-0.013
Token friends
Token ID2460
Feature activation-2.36e-01

Pos logit contributions
 mir+0.040
icol+0.039
rim+0.039
ric+0.038
avis+0.038
Neg logit contributions
 next-0.023
 first-0.022
 two-0.018
 one-0.017
 other-0.012
Token.
Token ID13
Feature activation-3.19e-01

Pos logit contributions
 Fro+0.014
 mir+0.013
 Stri+0.013
 polar+0.013
 spe+0.013
Neg logit contributions
 next-0.007
 first-0.007
 two-0.005
 one-0.005
ued-0.004
Token\n
Token ID198
Feature activation-8.86e-01

Pos logit contributions
 Fro+0.017
 mir+0.017
 Stri+0.016
 polar+0.016
 spe+0.016
Neg logit contributions
 next-0.011
 first-0.011
 two-0.008
 one-0.008
ued-0.007
Token\n
Token ID198
Feature activation-6.08e-01

Pos logit contributions
 Fro+0.039
 mir+0.036
 Stri+0.036
 polar+0.035
 ig+0.035
Neg logit contributions
 next-0.040
 first-0.039
 two-0.031
 one-0.030
ued-0.030
Token But
Token ID887
Feature activation-7.58e-02

Pos logit contributions
 next+0.014
 first+0.014
 one+0.011
 two+0.011
 moment+0.009
Neg logit contributions
 mir-0.019
 Fro-0.018
 spe-0.018
rim-0.018
 polar-0.018
Token,
Token ID11
Feature activation+8.35e-02

Pos logit contributions
 Fro+0.004
 Stri+0.003
 ig+0.003
 Chick+0.003
 polar+0.003
Neg logit contributions
 next-0.003
 first-0.003
ued-0.002
 two-0.002
oning-0.002
Token the
Token ID262
Feature activation-6.94e-01

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
 one+0.002
ued+0.002
Neg logit contributions
 Fro-0.004
 mir-0.004
 Stri-0.004
 polar-0.004
 spe-0.004
Token wheel
Token ID7825
Feature activation-3.88e-02

Pos logit contributions
 polar+0.029
 Fro+0.028
 ig+0.028
 mir+0.028
 Stri+0.027
Neg logit contributions
 next-0.030
 first-0.030
ued-0.025
 one-0.023
 two-0.023
Token fell
Token ID3214
Feature activation-6.47e-01

Pos logit contributions
 Fro+0.002
 Stri+0.002
 mir+0.002
 polar+0.002
 ig+0.002
Neg logit contributions
 next-0.002
 first-0.002
 two-0.001
ued-0.001
 one-0.001
Token off
Token ID572
Feature activation-1.53e+00

Pos logit contributions
 Fro+0.037
 mir+0.036
 polar+0.035
 Stri+0.035
 spe+0.035
Neg logit contributions
 next-0.021
 first-0.020
 two-0.014
 one-0.014
ued-0.012
Token and
Token ID290
Feature activation+1.23e-01

Pos logit contributions
 Fro+0.091
 mir+0.089
 polar+0.086
 Stri+0.086
 spe+0.084
Neg logit contributions
 next-0.048
 first-0.046
 two-0.032
 one-0.032
ued-0.026
Token she
Token ID673
Feature activation-1.05e+00

Pos logit contributions
 next+0.005
 first+0.005
 two+0.004
 one+0.004
ued+0.003
Neg logit contributions
 Fro-0.006
 mir-0.006
 polar-0.006
 Stri-0.006
 spe-0.006
Token didn
Token ID1422
Feature activation-4.46e-01

Pos logit contributions
 mir+0.056
 Fro+0.055
rim+0.053
icol+0.053
 spe+0.052
Neg logit contributions
 next-0.039
 first-0.039
 two-0.030
 one-0.029
ued-0.024
Token't
Token ID470
Feature activation+1.32e-01

Pos logit contributions
 Fro+0.029
 mir+0.028
 Stri+0.027
 polar+0.027
 spe+0.027
Neg logit contributions
 next-0.011
 first-0.011
 two-0.007
 one-0.006
ued-0.006
Token know
Token ID760
Feature activation-6.33e-01

Pos logit contributions
 next+0.005
 first+0.005
 one+0.004
 two+0.003
 moment+0.003
Neg logit contributions
 Fro-0.007
 mir-0.007
icol-0.007
 Stri-0.007
 spe-0.007
Token.
Token ID13
Feature activation-6.29e-02

Pos logit contributions
 next+0.003
 first+0.003
 two+0.002
 one+0.002
ued+0.002
Neg logit contributions
 Fro-0.004
 mir-0.004
 Stri-0.004
 polar-0.004
 spe-0.004
Token He
Token ID679
Feature activation+2.29e-01

Pos logit contributions
 Fro+0.002
 Stri+0.002
 dam+0.002
 mir+0.002
icol+0.002
Neg logit contributions
 next-0.003
 first-0.003
ued-0.003
 two-0.003
 one-0.003
Token felt
Token ID2936
Feature activation-5.77e-01

Pos logit contributions
 next+0.009
 first+0.009
 two+0.007
 one+0.007
ued+0.006
Neg logit contributions
 Fro-0.011
 mir-0.011
 Stri-0.011
 polar-0.010
 spe-0.010
Token a
Token ID257
Feature activation-1.87e-01

Pos logit contributions
 Fro+0.029
 mir+0.028
 Stri+0.028
 polar+0.027
 spe+0.027
Neg logit contributions
 next-0.023
 first-0.022
 two-0.017
 one-0.017
ued-0.015
Token strange
Token ID6283
Feature activation+1.86e-01

Pos logit contributions
 Fro+0.009
 Stri+0.009
 mir+0.009
 polar+0.008
 dam+0.008
Neg logit contributions
 next-0.008
 first-0.007
 two-0.006
 one-0.006
 moment-0.005
Token feeling
Token ID4203
Feature activation-1.44e+00

Pos logit contributions
 next+0.008
 first+0.008
 two+0.006
 one+0.006
ued+0.005
Neg logit contributions
 Fro-0.009
 mir-0.009
 Stri-0.008
 polar-0.008
 spe-0.008
Token,
Token ID11
Feature activation+3.28e-01

Pos logit contributions
 mir+0.065
 Fro+0.065
 spe+0.064
 polar+0.061
 Stri+0.060
Neg logit contributions
 first-0.056
 next-0.055
 two-0.044
 moral-0.044
ued-0.041
Token almost
Token ID2048
Feature activation-6.30e-02

Pos logit contributions
 first+0.012
 next+0.012
ued+0.010
 two+0.009
 one+0.009
Neg logit contributions
 Fro-0.016
 mir-0.015
 spe-0.015
 Stri-0.015
 polar-0.015
Token like
Token ID588
Feature activation-3.32e-01

Pos logit contributions
 Fro+0.003
 mir+0.003
 spe+0.003
 Stri+0.003
 polar+0.003
Neg logit contributions
 next-0.002
 first-0.002
ued-0.002
 two-0.002
 moral-0.002
Token a
Token ID257
Feature activation+2.46e-01

Pos logit contributions
 Fro+0.017
 mir+0.016
 Stri+0.016
 polar+0.015
 spe+0.015
Neg logit contributions
 next-0.013
 first-0.012
 two-0.009
ued-0.009
 one-0.009
Token reward
Token ID6721
Feature activation-5.34e-01

Pos logit contributions
 next+0.011
 first+0.011
 one+0.009
 two+0.009
 moment+0.008
Neg logit contributions
 Fro-0.011
 Stri-0.010
 mir-0.010
rim-0.010
avis-0.010
Token asked
Token ID1965
Feature activation-1.33e-01

Pos logit contributions
 next+0.013
 first+0.012
 two+0.009
 one+0.009
ued+0.008
Neg logit contributions
 Fro-0.018
 mir-0.017
 Stri-0.017
 polar-0.016
 spe-0.016
Token,
Token ID11
Feature activation-1.91e-01

Pos logit contributions
 Fro+0.006
 Stri+0.006
 polar+0.006
 mir+0.006
 spe+0.005
Neg logit contributions
 next-0.006
 first-0.005
ued-0.004
 two-0.004
 moment-0.004
Token "
Token ID366
Feature activation-2.03e-01

Pos logit contributions
 Fro+0.007
 Stri+0.007
 polar+0.006
 spe+0.006
 Chick+0.006
Neg logit contributions
ued-0.008
 next-0.008
 first-0.008
working-0.007
w-0.007
TokenWhat
Token ID2061
Feature activation+3.11e-01

Pos logit contributions
 Fro+0.009
 Stri+0.008
 mir+0.008
 polar+0.008
 spe+0.008
Neg logit contributions
 next-0.009
 first-0.009
 two-0.007
ued-0.007
 one-0.007
Token happened
Token ID3022
Feature activation+2.33e-01

Pos logit contributions
 next+0.012
 first+0.012
 two+0.009
 one+0.009
ued+0.008
Neg logit contributions
 Fro-0.016
 mir-0.015
 Stri-0.015
 spe-0.014
 polar-0.014
Token to
Token ID284
Feature activation-1.67e+00

Pos logit contributions
 next+0.009
 first+0.009
 two+0.007
ued+0.007
 one+0.007
Neg logit contributions
 Fro-0.011
 mir-0.011
 Stri-0.010
 polar-0.010
 spe-0.010
Token your
Token ID534
Feature activation+2.18e-01

Pos logit contributions
 mir+0.080
 spe+0.080
 Fro+0.079
 dam+0.076
icol+0.075
Neg logit contributions
 first-0.071
 next-0.070
 one-0.055
 two-0.054
 moral-0.043
Token t
Token ID256
Feature activation-4.79e-01

Pos logit contributions
 next+0.011
 first+0.010
 two+0.008
 one+0.008
ued+0.007
Neg logit contributions
 Fro-0.009
 mir-0.009
 Stri-0.008
 spe-0.008
 dam-0.008
Tokeneddy
Token ID21874
Feature activation+2.96e-02

Pos logit contributions
 Fro+0.014
 mir+0.014
 spe+0.014
 dam+0.013
 Stri+0.013
Neg logit contributions
 next-0.030
 first-0.029
 one-0.024
 two-0.023
ued-0.022
Token bear
Token ID6842
Feature activation-3.37e-02

Pos logit contributions
 next+0.002
 first+0.002
 one+0.001
 two+0.001
ued+0.001
Neg logit contributions
 Fro-0.001
 mir-0.001
 spe-0.001
 Stri-0.001
 dam-0.001
Token?"
Token ID1701
Feature activation-4.69e-01

Pos logit contributions
 Fro+0.002
 mir+0.001
 Stri+0.001
 polar+0.001
 ig+0.001
Neg logit contributions
 next-0.001
 first-0.001
 two-0.001
ued-0.001
 one-0.001
Token When
Token ID1649
Feature activation-5.75e-02

Pos logit contributions
 Fro+0.006
 Stri+0.005
 mir+0.005
icol+0.005
 spe+0.005
Neg logit contributions
 next-0.006
 first-0.006
ued-0.005
 two-0.005
 moment-0.005
Token he
Token ID339
Feature activation-8.40e-02

Pos logit contributions
 Fro+0.003
 mir+0.003
 Stri+0.003
 polar+0.003
 spe+0.003
Neg logit contributions
 next-0.002
 first-0.002
 two-0.001
ued-0.001
 one-0.001
Token does
Token ID857
Feature activation+9.29e-02

Pos logit contributions
 mir+0.004
 Fro+0.004
 polar+0.004
 spe+0.004
 dam+0.004
Neg logit contributions
 next-0.003
 first-0.003
 two-0.002
 one-0.002
ued-0.002
Token,
Token ID11
Feature activation+1.66e-01

Pos logit contributions
 next+0.004
 first+0.004
 moral+0.003
 two+0.003
ued+0.003
Neg logit contributions
 Fro-0.004
 Stri-0.004
 som-0.003
 mir-0.003
 cho-0.003
Token she
Token ID673
Feature activation-6.70e-01

Pos logit contributions
 next+0.006
 first+0.006
ued+0.005
 moral+0.005
 two+0.005
Neg logit contributions
 Fro-0.008
 Stri-0.008
 mir-0.007
 spe-0.007
 polar-0.007
Token spr
Token ID7500
Feature activation-1.60e+00

Pos logit contributions
 Fro+0.033
 mir+0.033
 polar+0.031
 Stri+0.031
 spe+0.031
Neg logit contributions
 next-0.026
 first-0.026
 two-0.019
 one-0.019
ued-0.018
Tokenays
Token ID592
Feature activation+7.55e-01

Pos logit contributions
 mir+0.075
 spe+0.074
 Fro+0.074
 polar+0.072
 Stri+0.070
Neg logit contributions
 next-0.069
 first-0.067
 two-0.052
 one-0.052
ued-0.048
Token him
Token ID683
Feature activation-3.42e-01

Pos logit contributions
 next+0.032
 first+0.031
 two+0.024
 one+0.024
ued+0.022
Neg logit contributions
 Fro-0.035
 mir-0.034
 Stri-0.033
 polar-0.033
 spe-0.033
Token with
Token ID351
Feature activation+4.01e-01

Pos logit contributions
 Fro+0.017
 Stri+0.015
 spe+0.015
 scar+0.015
 mir+0.015
Neg logit contributions
 next-0.011
 first-0.011
oning-0.009
 moral-0.009
ued-0.009
Token water
Token ID1660
Feature activation+7.71e-02

Pos logit contributions
 next+0.014
 first+0.014
 two+0.011
 one+0.011
ued+0.008
Neg logit contributions
 mir-0.022
 Fro-0.022
rim-0.021
 polar-0.021
 dam-0.020
Token.
Token ID13
Feature activation+7.45e-02

Pos logit contributions
 next+0.002
 first+0.002
 two+0.002
 one+0.002
ued+0.001
Neg logit contributions
 Fro-0.004
 mir-0.004
 polar-0.004
 Stri-0.004
 spe-0.004
Token That
Token ID1320
Feature activation-2.50e-01

Pos logit contributions
 Fro+0.011
 Stri+0.010
 mir+0.010
 polar+0.010
 spe+0.009
Neg logit contributions
 next-0.009
 first-0.009
 two-0.007
ued-0.007
 one-0.007
Token made
Token ID925
Feature activation+2.71e-01

Pos logit contributions
 Fro+0.013
 mir+0.012
 Stri+0.012
 ig+0.012
 polar+0.012
Neg logit contributions
 next-0.009
 first-0.009
ued-0.006
 two-0.006
 one-0.006
Token him
Token ID683
Feature activation-4.66e-01

Pos logit contributions
 next+0.011
 first+0.011
 two+0.009
 one+0.009
ued+0.007
Neg logit contributions
 mir-0.013
 Fro-0.013
 spe-0.012
 polar-0.012
 Stri-0.012
Token smile
Token ID8212
Feature activation-8.90e-01

Pos logit contributions
 Fro+0.020
 mir+0.019
 Stri+0.018
 polar+0.018
 spe+0.018
Neg logit contributions
 next-0.021
 first-0.021
 two-0.017
 one-0.016
ued-0.016
Token and
Token ID290
Feature activation-4.41e-02

Pos logit contributions
 Fro+0.047
 mir+0.045
 Stri+0.044
 polar+0.044
 spe+0.043
Neg logit contributions
 next-0.030
 first-0.029
ued-0.021
 two-0.021
 one-0.020
Token feel
Token ID1254
Feature activation-1.48e+00

Pos logit contributions
 Fro+0.002
 mir+0.002
 Stri+0.002
 polar+0.002
 spe+0.002
Neg logit contributions
 next-0.002
 first-0.002
 two-0.002
 one-0.002
ued-0.001
Token better
Token ID1365
Feature activation-4.66e-01

Pos logit contributions
 Fro+0.081
 Stri+0.079
 mir+0.079
 ig+0.076
 spe+0.076
Neg logit contributions
 next-0.053
 first-0.050
 two-0.039
 one-0.039
ued-0.032
Token.
Token ID13
Feature activation-1.38e-01

Pos logit contributions
 Fro+0.018
 mir+0.018
 spe+0.018
 polar+0.017
 Stri+0.017
Neg logit contributions
 first-0.018
 next-0.017
ued-0.016
oning-0.015
 two-0.015
Token Then
Token ID3244
Feature activation-3.50e-01

Pos logit contributions
 Fro+0.007
 mir+0.007
 Stri+0.007
 polar+0.007
 spe+0.006
Neg logit contributions
 next-0.005
 first-0.005
 two-0.004
 one-0.004
ued-0.004
Token they
Token ID484
Feature activation-4.01e-01

Pos logit contributions
 Fro+0.019
 mir+0.018
 Stri+0.018
 polar+0.018
 spe+0.018
Neg logit contributions
 next-0.012
 first-0.012
 two-0.008
ued-0.008
 one-0.008
Token went
Token ID1816
Feature activation-5.28e-01

Pos logit contributions
 Fro+0.021
 mir+0.021
 spe+0.020
rim+0.020
 Stri+0.020
Neg logit contributions
 next-0.015
 first-0.015
 two-0.011
 one-0.011
ued-0.010