TOP ACTIVATIONS
MAX = 0.064

. He kept throwing a tant rum . His mom was
listen . She threw a tant rum in the store .
She cried and threw a tant rum . Her mom tried
her feet and threw a tant rum . She grabbed the
to cry and threw a tant rum . She didn 't
. She hugs the re ind eer . "
surprise . He threw a tant rum . He kicked and
, she saw a re ind eer in her backyard .
sad . She threw a tant rum and cried loudly .
tries to take the re ind eer . "
my saw a cool re ind eer in his backyard .
? I want a re ind eer like this one ."
kept crying and throwing a tant rum . Finally
to play with the re ind eer , you have to
balls and built a re ind eer house .
and started to throw a tant rum . Suddenly , she
. She sees the re ind eer . She sees the
a picture of a re ind eer . She wanted a
snow and mentioned a re ind eer . She was so
want to keep the re ind eer , you have to

MIN ACTIVATIONS
MIN = -0.083

garden and admire his sun fl owers . They also see
let them remove his sun fl owers . The
shaved it because the sun fl owers needed more sunlight to
<|endoftext|> L ily loved sun fl owers . She liked to
for some funny little sun fl owers . Her m ama
amazed by all the wild fl owers and the bright blue
en vious of your sun fl owers . I wanted to
could even turn flip - fl ops ! Everyone was very
wealthy prince and his sun fl owers lived happily ever after
me adow full of wild fl owers . He spun around
back with lots of sun fl owers to fill the prince
Soon , all the sun fl owers were happily chatting and
ol anda and her sun fl owers started to dance together
said no because the sun fl owers were too big to
the soil for the sun fl owers . From that day
happy with all the sun fl owers and he ordered his
a field full of wild fl owers . They offered them
after that . Whenever the chip m unk needed help ,
From then on the sun fl owers were Y ol anda
the biggest and brightest sun fl owers she could find .

INTERVAL 0.027 - 0.064
CONTAINS 0.260%

special bag . She kept all her favourite things inside it
all . It was a tre asured stuffed animal that she
felt full . <|endoftext|> L ola and her mum were walking
copies . They printed a lot of copies and handed them
he had been spoiled and left his stuff around .

INTERVAL -0.010 - 0.027
CONTAINS 85.892%

all her spaghetti . Later that night , Lily
day , Will and his mom were walking at the park
. Now the farmer always carried the penny with him .
so thoughtful that he even brought Lucy presents to show how
the pot from the stove . He carried it to the

INTERVAL -0.046 - -0.010
CONTAINS 13.843%

whistle very well ," Lily said . " How does it
We 'll help you !" So Spike jumped on the tr
. She said , " No , you can 't play
not listen to their mom , who told them to be
'm so sorry . That bridge was old and dangerous .

INTERVAL -0.083 - -0.046
CONTAINS 0.006%

She saw lots of snow fl akes , glitter ing in
the tree and eat the pr unes . They say the
didn 't forget her sun flower . <|endoftext|> Once upon a
hello to the other sun fl owers around it . Soon
doesn 't taste like the grape juice we have at home
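
NOTE ON THE INTERVALS
The four INTERVAL ranges above appear to be equal-width bins spanning MIN = -0.083 to MAX = 0.064. The sketch below is one way such edges could be computed; it is not taken from the tool that produced this dump. The helper names (equal_width_edges, bin_index) and the choice to count the top edge in the last bin are assumptions made for illustration.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi (hypothetical helper)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_index(value, edges):
    """Return the index of the bin containing value.

    Assumption: a value equal to the top edge is counted in the last bin.
    """
    n_bins = len(edges) - 1
    if not edges[0] <= value <= edges[-1]:
        raise ValueError("value outside the binned range")
    width = (edges[-1] - edges[0]) / n_bins
    return min(int((value - edges[0]) / width), n_bins - 1)


# MIN and MAX as printed in the dump (rounded to three decimals there).
edges = equal_width_edges(-0.083, 0.064, 4)
for low, high in zip(edges[:-1], edges[1:]):
    print(f"INTERVAL {low:.4f} - {high:.4f}")
```

The computed edges agree with the printed interval boundaries to within the three-decimal rounding used in the dump. Because MIN and MAX are themselves rounded, a raw activation such as +6.42e-02 can fall slightly outside the printed range, which is why bin_index rejects out-of-range values rather than clamping them silently.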
Token.
Token ID13
Feature activation+3.66e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ination+0.001
Neg logit contributions
ipping-0.001
-0.001
 exercising-0.001
pill-0.001
wards-0.001
Token He
Token ID679
Feature activation-2.49e-03

Pos logit contributions
away+0.006
ff+0.006
rated+0.006
rost+0.005
igator+0.005
Neg logit contributions
-0.006
ipping-0.006
pill-0.006
wards-0.006
 strawberries-0.005
Token kept
Token ID4030
Feature activation+1.65e-03

Pos logit contributions
ipping+0.005
+0.005
pill+0.005
 strawberries+0.004
 exercising+0.004
Neg logit contributions
ff-0.003
away-0.003
rost-0.003
rated-0.003
ace-0.003
Token throwing
Token ID9644
Feature activation-6.18e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
 know+0.003
 its+0.002
Neg logit contributions
pill-0.003
-0.003
ipping-0.003
avored-0.002
wards-0.002
Token a
Token ID257
Feature activation+1.72e-02

Pos logit contributions
ipping+0.009
 strawberries+0.009
+0.009
pill+0.009
 exercising+0.008
Neg logit contributions
ff-0.012
rated-0.011
rost-0.010
away-0.010
 Amen-0.010
Token tant
Token ID24246
Feature activation+6.42e-02

Pos logit contributions
ff+0.031
rated+0.028
ace+0.028
rost+0.027
away+0.027
Neg logit contributions
ipping-0.027
-0.026
pill-0.025
 strawberries-0.025
ula-0.024
Tokenrum
Token ID6582
Feature activation+3.25e-03

Pos logit contributions
 lived+0.071
rost+0.067
ff+0.066
 woke+0.065
rated+0.064
Neg logit contributions
ipping-0.135
ula-0.124
 practicing-0.119
pill-0.118
balls-0.113
Token.
Token ID13
Feature activation+5.07e-03

Pos logit contributions
ff+0.005
away+0.005
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.006
-0.006
pill-0.006
 exercising-0.006
wards-0.006
Token His
Token ID2399
Feature activation+7.06e-03

Pos logit contributions
away+0.008
ff+0.008
rated+0.008
igator+0.007
ace+0.007
Neg logit contributions
-0.010
wards-0.009
ipping-0.009
pill-0.008
 exercising-0.008
Token mom
Token ID1995
Feature activation-3.14e-03

Pos logit contributions
ff+0.011
away+0.011
rated+0.010
rost+0.009
ination+0.009
Neg logit contributions
ipping-0.013
-0.013
 strawberries-0.012
pill-0.012
wards-0.011
Token was
Token ID373
Feature activation-6.56e-03

Pos logit contributions
ipping+0.007
+0.007
pill+0.007
 exercising+0.006
wards+0.006
Neg logit contributions
ff-0.004
away-0.004
rost-0.003
rated-0.003
 woke-0.003
Token listen
Token ID6004
Feature activation+9.71e-04

Pos logit contributions
ipping+0.024
 exercising+0.023
 practicing+0.022
+0.022
wards+0.022
Neg logit contributions
ff-0.013
away-0.011
ney-0.011
rost-0.010
ination-0.010
Token.
Token ID13
Feature activation+7.97e-04

Pos logit contributions
ff+0.002
away+0.001
rost+0.001
rated+0.001
ination+0.001
Neg logit contributions
ipping-0.002
-0.002
pill-0.002
 exercising-0.001
 practicing-0.001
Token She
Token ID1375
Feature activation-6.15e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
 woke+0.001
Neg logit contributions
ipping-0.002
-0.002
pill-0.001
 exercising-0.001
 strawberries-0.001
Token threw
Token ID9617
Feature activation-9.15e-03

Pos logit contributions
ipping+0.014
+0.013
pill+0.013
 exercising+0.012
wards+0.012
Neg logit contributions
ff-0.007
away-0.007
rost-0.006
rated-0.006
 woke-0.006
Token a
Token ID257
Feature activation+1.33e-02

Pos logit contributions
ipping+0.017
pill+0.016
+0.016
 strawberries+0.016
 exercising+0.015
Neg logit contributions
ff-0.014
away-0.013
rated-0.012
rost-0.012
ney-0.012
Token tant
Token ID24246
Feature activation+6.35e-02

Pos logit contributions
ff+0.017
away+0.014
rated+0.014
rost+0.014
ace+0.014
Neg logit contributions
ipping-0.029
-0.028
pill-0.027
 strawberries-0.026
ula-0.026
Tokenrum
Token ID6582
Feature activation+8.79e-04

Pos logit contributions
ff+0.066
rost+0.064
 lived+0.063
rated+0.063
ination+0.061
Neg logit contributions
ipping-0.141
ula-0.129
pill-0.125
 practicing-0.122
 exercising-0.117
Token in
Token ID287
Feature activation-1.77e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.002
pill-0.002
-0.002
 exercising-0.002
 strawberries-0.002
Token the
Token ID262
Feature activation-6.69e-03

Pos logit contributions
ipping+0.003
+0.003
pill+0.003
 strawberries+0.003
 exercising+0.003
Neg logit contributions
ff-0.003
away-0.003
rated-0.003
rost-0.003
ination-0.002
Token store
Token ID3650
Feature activation+2.45e-03

Pos logit contributions
pill+0.012
ipping+0.012
+0.012
 exercising+0.011
 practicing+0.011
Neg logit contributions
ff-0.010
away-0.010
rost-0.009
 woke-0.009
rated-0.009
Token.
Token ID13
Feature activation+2.47e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
 woke+0.003
rated+0.002
Neg logit contributions
ipping-0.005
pill-0.005
-0.005
 exercising-0.005
wards-0.005
Token She
Token ID1375
Feature activation-2.09e-03

Pos logit contributions
ff+0.006
away+0.006
rost+0.006
rated+0.005
igator+0.005
Neg logit contributions
ipping-0.007
-0.007
pill-0.007
wards-0.007
 exercising-0.007
Token cried
Token ID16896
Feature activation+7.22e-03

Pos logit contributions
ipping+0.004
+0.004
pill+0.004
 exercising+0.004
 strawberries+0.004
Neg logit contributions
ff-0.003
away-0.003
rost-0.003
rated-0.002
ace-0.002
Token and
Token ID290
Feature activation-1.34e-03

Pos logit contributions
ff+0.013
rost+0.012
away+0.012
rated+0.011
grand+0.011
Neg logit contributions
ipping-0.012
-0.011
 exercising-0.011
wards-0.011
 practicing-0.011
Token threw
Token ID9617
Feature activation-6.82e-03

Pos logit contributions
ipping+0.002
+0.002
pill+0.002
 exercising+0.002
wards+0.002
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
rated-0.002
ination-0.002
Token a
Token ID257
Feature activation+1.58e-02

Pos logit contributions
ipping+0.012
 strawberries+0.012
pill+0.011
+0.011
wards+0.011
Neg logit contributions
ff-0.011
rated-0.010
rost-0.010
away-0.010
ney-0.009
Token tant
Token ID24246
Feature activation+6.13e-02

Pos logit contributions
ff+0.021
rated+0.019
rost+0.018
ace+0.018
away+0.018
Neg logit contributions
ipping-0.032
-0.031
pill-0.031
 strawberries-0.030
ula-0.029
Tokenrum
Token ID6582
Feature activation+1.56e-03

Pos logit contributions
ff+0.061
rost+0.060
 lived+0.058
rated+0.056
away+0.056
Neg logit contributions
ipping-0.139
ula-0.128
pill-0.125
 practicing-0.123
 exercising-0.119
Token.
Token ID13
Feature activation+5.60e-03

Pos logit contributions
ff+0.002
away+0.002
rost+0.002
rated+0.002
ace+0.002
Neg logit contributions
ipping-0.003
-0.003
pill-0.003
 exercising-0.003
 strawberries-0.003
Token Her
Token ID2332
Feature activation+3.90e-03

Pos logit contributions
ff+0.010
away+0.010
rated+0.009
rost+0.009
igator+0.009
Neg logit contributions
-0.010
ipping-0.009
pill-0.009
wards-0.009
ula-0.008
Token mom
Token ID1995
Feature activation+4.59e-04

Pos logit contributions
ff+0.005
away+0.005
rost+0.005
rated+0.005
ace+0.004
Neg logit contributions
ipping-0.008
-0.008
pill-0.008
 exercising-0.007
wards-0.007
Token tried
Token ID3088
Feature activation-5.20e-04

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.000
 woke+0.000
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
 exercising-0.001
wards-0.001
Token her
Token ID607
Feature activation-1.60e-03

Pos logit contributions
ff+0.008
away+0.007
rost+0.007
rated+0.007
ney+0.006
Neg logit contributions
ipping-0.011
pill-0.010
-0.010
 strawberries-0.010
 exercising-0.010
Token feet
Token ID3625
Feature activation+7.91e-03

Pos logit contributions
ipping+0.003
pill+0.003
+0.003
 exercising+0.003
wards+0.003
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
ace-0.002
 woke-0.002
Token and
Token ID290
Feature activation-3.51e-03

Pos logit contributions
ff+0.011
away+0.010
rost+0.010
rated+0.009
ace+0.009
Neg logit contributions
ipping-0.016
-0.015
pill-0.015
 exercising-0.014
wards-0.014
Token threw
Token ID9617
Feature activation-1.01e-02

Pos logit contributions
ipping+0.007
+0.007
pill+0.007
 exercising+0.006
wards+0.006
Neg logit contributions
ff-0.005
away-0.005
rost-0.004
rated-0.004
ace-0.004
Token a
Token ID257
Feature activation+1.36e-02

Pos logit contributions
ipping+0.019
pill+0.019
+0.018
 strawberries+0.018
wards+0.018
Neg logit contributions
ff-0.015
away-0.014
rated-0.013
rost-0.013
ination-0.012
Token tant
Token ID24246
Feature activation+6.09e-02

Pos logit contributions
ff+0.014
away+0.011
rated+0.011
rost+0.011
ination+0.010
Neg logit contributions
ipping-0.033
-0.031
pill-0.031
 strawberries-0.030
wards-0.029
Tokenrum
Token ID6582
Feature activation+8.10e-04

Pos logit contributions
rost+0.063
ff+0.063
 lived+0.063
ination+0.062
rated+0.060
Neg logit contributions
ipping-0.131
ula-0.121
 practicing-0.114
pill-0.113
 exercising-0.108
Token.
Token ID13
Feature activation+1.77e-04

Pos logit contributions
ff+0.001
away+0.001
ace+0.001
rost+0.001
rated+0.001
Neg logit contributions
ipping-0.002
pill-0.002
-0.002
 exercising-0.002
 strawberries-0.002
Token She
Token ID1375
Feature activation-4.98e-03

Pos logit contributions
ff+0.000
away+0.000
rost+0.000
rated+0.000
ace+0.000
Neg logit contributions
ipping-0.000
-0.000
pill-0.000
wards-0.000
 exercising-0.000
Token grabbed
Token ID13176
Feature activation-4.56e-03

Pos logit contributions
ipping+0.010
+0.009
pill+0.009
 exercising+0.009
 strawberries+0.009
Neg logit contributions
ff-0.007
away-0.007
rost-0.006
rated-0.006
ace-0.006
Token the
Token ID262
Feature activation+3.06e-03

Pos logit contributions
ipping+0.010
+0.009
pill+0.009
 exercising+0.009
wards+0.008
Neg logit contributions
ff-0.006
away-0.006
rost-0.005
rated-0.005
ace-0.005
Token to
Token ID284
Feature activation-8.60e-03

Pos logit contributions
+0.048
ipping+0.048
pill+0.048
avored+0.044
 strawberries+0.042
Neg logit contributions
ff-0.018
away-0.018
 woke-0.017
 know-0.016
ot-0.015
Token cry
Token ID3960
Feature activation-5.73e-03

Pos logit contributions
ipping+0.018
+0.017
 exercising+0.017
 practicing+0.016
wards+0.016
Neg logit contributions
ff-0.012
away-0.011
rost-0.010
rated-0.009
ination-0.009
Token and
Token ID290
Feature activation+3.56e-03

Pos logit contributions
ipping+0.011
+0.010
pill+0.010
 exercising+0.010
wards+0.010
Neg logit contributions
ff-0.009
away-0.008
rost-0.008
rated-0.007
ace-0.007
Token threw
Token ID9617
Feature activation-1.31e-02

Pos logit contributions
ff+0.005
away+0.005
rost+0.005
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.007
-0.007
pill-0.006
 exercising-0.006
wards-0.006
Token a
Token ID257
Feature activation+1.51e-02

Pos logit contributions
ipping+0.024
 strawberries+0.023
pill+0.022
+0.022
 exercising+0.022
Neg logit contributions
ff-0.021
rost-0.018
rated-0.018
away-0.018
ney-0.017
Token tant
Token ID24246
Feature activation+5.95e-02

Pos logit contributions
ff+0.022
rost+0.019
away+0.019
rated+0.019
ace+0.018
Neg logit contributions
ipping-0.030
-0.029
pill-0.028
 strawberries-0.027
ula-0.027
Tokenrum
Token ID6582
Feature activation+1.19e-03

Pos logit contributions
ff+0.060
rost+0.060
 lived+0.057
ination+0.055
rated+0.055
Neg logit contributions
ipping-0.133
ula-0.124
 practicing-0.118
pill-0.117
 exercising-0.113
Token.
Token ID13
Feature activation-1.73e-03

Pos logit contributions
ff+0.001
away+0.001
ace+0.001
 woke+0.001
rated+0.001
Neg logit contributions
pill-0.003
ipping-0.003
-0.003
 strawberries-0.002
 exercising-0.002
Token She
Token ID1375
Feature activation-3.90e-03

Pos logit contributions
+0.003
ipping+0.003
pill+0.003
wards+0.003
 exercising+0.003
Neg logit contributions
ff-0.003
away-0.003
rost-0.002
rated-0.002
igator-0.002
Token didn
Token ID1422
Feature activation+1.33e-03

Pos logit contributions
ipping+0.008
+0.007
pill+0.007
 exercising+0.007
 strawberries+0.007
Neg logit contributions
ff-0.006
away-0.005
rost-0.005
rated-0.005
ace-0.005
Token't
Token ID470
Feature activation+4.95e-03

Pos logit contributions
ff+0.002
away+0.002
rost+0.002
rated+0.001
 woke+0.001
Neg logit contributions
ipping-0.003
pill-0.003
-0.003
 strawberries-0.003
wards-0.002
Token.
Token ID13
Feature activation+4.26e-03

Pos logit contributions
pill+0.013
ipping+0.013
+0.013
 strawberries+0.012
 exercising+0.012
Neg logit contributions
ff-0.007
away-0.007
rost-0.006
ace-0.006
ot-0.006
Token She
Token ID1375
Feature activation+4.20e-03

Pos logit contributions
ff+0.005
away+0.005
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.010
-0.009
pill-0.009
wards-0.009
 exercising-0.009
Token hugs
Token ID40657
Feature activation-8.38e-03

Pos logit contributions
ff+0.005
 woke+0.004
 lived+0.004
 know+0.004
away+0.004
Neg logit contributions
ipping-0.009
wards-0.009
-0.009
ula-0.009
 exercising-0.009
Token the
Token ID262
Feature activation+1.01e-02

Pos logit contributions
ipping+0.016
+0.016
pill+0.016
 exercising+0.015
wards+0.015
Neg logit contributions
ff-0.012
away-0.012
rost-0.011
rated-0.010
ace-0.010
Token re
Token ID302
Feature activation+3.48e-02

Pos logit contributions
ff+0.019
away+0.017
rated+0.016
rost+0.016
ination+0.016
Neg logit contributions
ipping-0.016
 strawberries-0.015
pill-0.015
-0.014
 exercising-0.014
Tokenind
Token ID521
Feature activation+5.86e-02

Pos logit contributions
rated+0.061
 know+0.056
 woke+0.054
ff+0.054
reed+0.054
Neg logit contributions
ipping-0.056
ip-0.046
pill-0.046
 skating-0.044
ooming-0.044
Tokeneer
Token ID28153
Feature activation+1.23e-02

Pos logit contributions
reed+0.074
ination+0.073
rated+0.071
agree+0.070
 disgust+0.068
Neg logit contributions
 strawberries-0.096
ipping-0.090
ip-0.089
pill-0.089
 cones-0.088
Token.
Token ID13
Feature activation+3.68e-03

Pos logit contributions
ff+0.016
away+0.015
rost+0.013
ace+0.012
 woke+0.012
Neg logit contributions
ipping-0.027
-0.027
pill-0.026
wards-0.024
 exercising-0.024
Token
Token ID198
Feature activation+7.31e-03

Pos logit contributions
ff+0.005
away+0.005
rated+0.004
rost+0.004
ace+0.004
Neg logit contributions
ipping-0.008
-0.008
pill-0.008
wards-0.007
 strawberries-0.007
Token
Token ID198
Feature activation+9.59e-03

Pos logit contributions
rated+0.013
ff+0.013
away+0.012
igator+0.011
ace+0.011
Neg logit contributions
-0.012
pill-0.012
ipping-0.012
wards-0.011
ula-0.011
Token"
Token ID1
Feature activation-6.55e-03

Pos logit contributions
ff+0.010
away+0.009
rost+0.008
rated+0.007
ot+0.007
Neg logit contributions
ipping-0.023
-0.023
pill-0.022
wards-0.021
 strawberries-0.021
Token surprise
Token ID5975
Feature activation-4.27e-03

Pos logit contributions
ipping+0.005
pill+0.004
+0.004
wards+0.004
 exercising+0.004
Neg logit contributions
ff-0.004
away-0.003
rost-0.003
rated-0.003
 woke-0.003
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
ipping+0.010
pill+0.010
+0.010
wards+0.009
 exercising+0.009
Neg logit contributions
away-0.004
ff-0.004
rost-0.004
 woke-0.004
 lived-0.004
Token He
Token ID679
Feature activation-2.29e-03

Pos logit contributions
rated+0.021
away+0.020
ff+0.019
igator+0.018
aret+0.017
Neg logit contributions
-0.018
wards-0.016
pill-0.015
ula-0.015
ipping-0.015
Token threw
Token ID9617
Feature activation-4.72e-03

Pos logit contributions
ipping+0.005
pill+0.005
+0.005
 strawberries+0.004
 exercising+0.004
Neg logit contributions
ff-0.003
away-0.003
rost-0.003
rated-0.003
ace-0.002
Token a
Token ID257
Feature activation+1.65e-02

Pos logit contributions
 strawberries+0.007
ipping+0.007
pill+0.006
+0.006
wards+0.006
Neg logit contributions
ff-0.008
rated-0.008
grand-0.008
away-0.008
ney-0.007
Token tant
Token ID24246
Feature activation+5.84e-02

Pos logit contributions
ff+0.029
rated+0.027
ace+0.026
rost+0.025
ination+0.025
Neg logit contributions
ipping-0.026
-0.026
 strawberries-0.025
pill-0.025
 pitcher-0.024
Tokenrum
Token ID6582
Feature activation+8.32e-03

Pos logit contributions
 lived+0.062
ff+0.062
 woke+0.060
rated+0.059
ination+0.059
Neg logit contributions
ipping-0.125
ula-0.113
pill-0.110
 practicing-0.109
balls-0.105
Token.
Token ID13
Feature activation+9.61e-03

Pos logit contributions
ff+0.012
away+0.011
rost+0.011
rated+0.010
ace+0.010
Neg logit contributions
ipping-0.016
-0.016
pill-0.016
 exercising-0.015
wards-0.015
Token He
Token ID679
Feature activation-2.43e-03

Pos logit contributions
rated+0.018
away+0.017
ff+0.016
igator+0.015
ace+0.014
Neg logit contributions
-0.018
wards-0.015
ula-0.015
ipping-0.015
pill-0.014
Token kicked
Token ID12165
Feature activation-3.92e-03

Pos logit contributions
ipping+0.005
pill+0.004
+0.004
 exercising+0.004
 strawberries+0.004
Neg logit contributions
ff-0.004
away-0.003
rost-0.003
rated-0.003
ace-0.003
Token and
Token ID290
Feature activation+2.37e-03

Pos logit contributions
ipping+0.006
 strawberries+0.006
+0.005
pill+0.005
 exercising+0.005
Neg logit contributions
ff-0.007
rated-0.007
grand-0.007
away-0.006
 Amen-0.006
Token,
Token ID11
Feature activation+2.04e-02

Pos logit contributions
ff+0.006
away+0.005
rost+0.005
rated+0.005
 woke+0.005
Neg logit contributions
ipping-0.008
-0.007
pill-0.007
 exercising-0.007
wards-0.007
Token she
Token ID673
Feature activation+9.36e-03

Pos logit contributions
ff+0.033
away+0.032
rost+0.030
 woke+0.029
 lived+0.028
Neg logit contributions
ipping-0.037
pill-0.035
-0.034
 exercising-0.032
wards-0.031
Token saw
Token ID2497
Feature activation-7.39e-04

Pos logit contributions
ff+0.014
away+0.013
rost+0.013
ination+0.012
ace+0.012
Neg logit contributions
ipping-0.017
pill-0.016
 exercising-0.016
-0.016
 practicing-0.016
Token a
Token ID257
Feature activation+1.22e-02

Pos logit contributions
 strawberries+0.001
ipping+0.001
wards+0.001
pill+0.001
+0.001
Neg logit contributions
ff-0.001
away-0.001
rated-0.001
rost-0.001
ney-0.001
Token re
Token ID302
Feature activation+3.36e-02

Pos logit contributions
ff+0.017
away+0.016
rost+0.015
rated+0.015
ace+0.014
Neg logit contributions
ipping-0.025
-0.024
pill-0.023
 exercising-0.022
 strawberries-0.022
Tokenind
Token ID521
Feature activation+5.68e-02

Pos logit contributions
rated+0.046
ff+0.045
 know+0.042
 lived+0.041
 woke+0.040
Neg logit contributions
ipping-0.069
-0.060
pill-0.057
 practicing-0.056
ooming-0.055
Tokeneer
Token ID28153
Feature activation+1.25e-02

Pos logit contributions
ff+0.062
ination+0.056
reed+0.056
rated+0.053
 know+0.051
Neg logit contributions
ipping-0.107
pill-0.104
 strawberries-0.101
ip-0.097
 practicing-0.096
Token in
Token ID287
Feature activation+1.56e-02

Pos logit contributions
away+0.014
 woke+0.014
ff+0.013
 lived+0.013
 alone+0.013
Neg logit contributions
pill-0.029
-0.028
 strawberries-0.027
ula-0.026
ipping-0.025
Token her
Token ID607
Feature activation+2.04e-03

Pos logit contributions
ff+0.024
away+0.022
rated+0.021
rost+0.021
ace+0.019
Neg logit contributions
ipping-0.029
-0.029
pill-0.028
 strawberries-0.027
wards-0.027
Token backyard
Token ID24296
Feature activation+2.16e-02

Pos logit contributions
ff+0.003
away+0.003
rated+0.003
rost+0.003
ination+0.003
Neg logit contributions
ipping-0.004
-0.003
pill-0.003
 strawberries-0.003
wards-0.003
Token.
Token ID13
Feature activation+4.34e-03

Pos logit contributions
ff+0.030
away+0.027
rost+0.025
 woke+0.024
rated+0.024
Neg logit contributions
ipping-0.045
pill-0.043
-0.043
 strawberries-0.040
 exercising-0.040
Token sad
Token ID6507
Feature activation+2.96e-03

Pos logit contributions
ff+0.005
away+0.005
rated+0.005
rost+0.005
 woke+0.004
Neg logit contributions
ipping-0.010
-0.009
pill-0.009
wards-0.009
 strawberries-0.009
Token.
Token ID13
Feature activation-3.61e-03

Pos logit contributions
ff+0.004
away+0.004
rost+0.003
 woke+0.003
rated+0.003
Neg logit contributions
ipping-0.006
pill-0.006
-0.006
 strawberries-0.006
 exercising-0.006
Token She
Token ID1375
Feature activation-1.80e-03

Pos logit contributions
ipping+0.007
+0.007
pill+0.007
 exercising+0.007
wards+0.007
Neg logit contributions
ff-0.005
away-0.005
rost-0.004
rated-0.004
ace-0.004
Token threw
Token ID9617
Feature activation-7.99e-03

Pos logit contributions
ipping+0.004
pill+0.004
+0.004
 exercising+0.003
 practicing+0.003
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
rated-0.002
ace-0.002
Token a
Token ID257
Feature activation+1.39e-02

Pos logit contributions
ipping+0.014
 strawberries+0.013
+0.013
pill+0.013
 exercising+0.013
Neg logit contributions
ff-0.013
away-0.012
rated-0.012
rost-0.012
ney-0.011
Token tant
Token ID24246
Feature activation+5.62e-02

Pos logit contributions
ff+0.018
rated+0.016
rost+0.016
away+0.016
ace+0.015
Neg logit contributions
ipping-0.029
-0.028
pill-0.027
 strawberries-0.026
ula-0.026
Tokenrum
Token ID6582
Feature activation+1.65e-03

Pos logit contributions
rost+0.060
ff+0.058
 lived+0.057
away+0.056
ination+0.055
Neg logit contributions
ipping-0.122
ula-0.112
pill-0.108
 practicing-0.108
 exercising-0.102
Token and
Token ID290
Feature activation+1.46e-04

Pos logit contributions
ff+0.002
away+0.002
rost+0.002
ace+0.002
rated+0.002
Neg logit contributions
ipping-0.004
pill-0.003
-0.003
 exercising-0.003
 strawberries-0.003
Token cried
Token ID16896
Feature activation+6.06e-03

Pos logit contributions
ff+0.000
away+0.000
rost+0.000
rated+0.000
ace+0.000
Neg logit contributions
ipping-0.000
-0.000
pill-0.000
 exercising-0.000
wards-0.000
Token loudly
Token ID23112
Feature activation+5.52e-04

Pos logit contributions
ff+0.010
away+0.009
rost+0.009
rated+0.009
ination+0.008
Neg logit contributions
ipping-0.011
-0.010
pill-0.010
wards-0.010
 exercising-0.010
Token.
Token ID13
Feature activation-6.50e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
 woke+0.001
Neg logit contributions
ipping-0.001
pill-0.001
-0.001
 exercising-0.001
 strawberries-0.001
Token tries
Token ID8404
Feature activation-1.31e-03

Pos logit contributions
ff+0.007
 woke+0.006
away+0.006
rost+0.006
 lived+0.006
Neg logit contributions
-0.014
ipping-0.014
wards-0.014
ula-0.013
 exercising-0.013
Token to
Token ID284
Feature activation+1.57e-02

Pos logit contributions
ipping+0.002
+0.002
pill+0.002
 exercising+0.002
wards+0.002
Neg logit contributions
ff-0.002
away-0.002
rated-0.002
rost-0.002
ace-0.002
Token take
Token ID1011
Feature activation+1.34e-02

Pos logit contributions
ff+0.027
ney+0.025
rated+0.024
away+0.023
 lived+0.023
Neg logit contributions
ipping-0.025
-0.024
 exercising-0.023
 strawberries-0.022
 bouncing-0.022
Token the
Token ID262
Feature activation+1.43e-02

Pos logit contributions
rated+0.022
ff+0.022
ney+0.021
away+0.020
rost+0.019
Neg logit contributions
ipping-0.022
 strawberries-0.021
wards-0.021
pill-0.020
-0.020
Token re
Token ID302
Feature activation+3.19e-02

Pos logit contributions
ff+0.026
away+0.024
rost+0.023
rated+0.023
ination+0.022
Neg logit contributions
ipping-0.022
 strawberries-0.021
pill-0.020
-0.020
 practicing-0.019
Tokenind
Token ID521
Feature activation+5.61e-02

Pos logit contributions
rated+0.055
 lived+0.050
 know+0.050
 woke+0.050
ff+0.049
Neg logit contributions
ipping-0.052
 skating-0.041
ooming-0.041
pill-0.040
ip-0.040
Tokeneer
Token ID28153
Feature activation+7.28e-03

Pos logit contributions
reed+0.075
ination+0.074
rated+0.070
 disgust+0.069
 lived+0.068
Neg logit contributions
 strawberries-0.087
pill-0.083
ipping-0.081
 cones-0.081
ip-0.080
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
ff+0.009
away+0.008
rost+0.007
rated+0.007
ination+0.007
Neg logit contributions
ipping-0.016
-0.015
pill-0.015
 exercising-0.015
 strawberries-0.014
Token
Token ID198
Feature activation+1.09e-02

Pos logit contributions
ff+0.020
rated+0.019
away+0.018
ace+0.016
rost+0.016
Neg logit contributions
-0.028
ipping-0.026
pill-0.025
ula-0.025
wards-0.025
Token
Token ID198
Feature activation+1.36e-02

Pos logit contributions
rated+0.021
ff+0.020
away+0.020
ace+0.018
igator+0.018
Neg logit contributions
-0.016
ula-0.015
ipping-0.015
pill-0.015
wards-0.014
Token"
Token ID1
Feature activation-2.29e-03

Pos logit contributions
ff+0.016
away+0.015
rost+0.013
rated+0.013
ace+0.012
Neg logit contributions
ipping-0.031
-0.030
pill-0.029
 exercising-0.028
wards-0.028
Tokenmy
Token ID1820
Feature activation+1.73e-02

Pos logit contributions
ff+0.018
away+0.018
rost+0.016
ination+0.015
ot+0.015
Neg logit contributions
ipping-0.024
pill-0.022
-0.022
 exercising-0.022
 strawberries-0.021
Token saw
Token ID2497
Feature activation+2.19e-03

Pos logit contributions
ff+0.027
away+0.025
rost+0.024
ination+0.023
ace+0.022
Neg logit contributions
ipping-0.031
pill-0.030
-0.029
 exercising-0.028
 strawberries-0.028
Token a
Token ID257
Feature activation+1.15e-02

Pos logit contributions
ff+0.004
away+0.003
rated+0.003
rost+0.003
ace+0.003
Neg logit contributions
 strawberries-0.004
ipping-0.004
wards-0.004
pill-0.004
-0.004
Token cool
Token ID3608
Feature activation+1.94e-02

Pos logit contributions
ff+0.017
away+0.015
rost+0.015
rated+0.014
ace+0.014
Neg logit contributions
ipping-0.022
-0.022
pill-0.021
 strawberries-0.020
 exercising-0.020
Token re
Token ID302
Feature activation+3.32e-02

Pos logit contributions
ff+0.033
away+0.031
rost+0.029
rated+0.028
ace+0.028
Neg logit contributions
ipping-0.034
-0.032
pill-0.031
 exercising-0.029
 strawberries-0.029
Tokenind
Token ID521
Feature activation+5.57e-02

Pos logit contributions
ff+0.045
rated+0.044
 know+0.041
 lived+0.040
ace+0.040
Neg logit contributions
ipping-0.069
-0.061
pill-0.057
 practicing-0.056
ooming-0.055
Tokeneer
Token ID28153
Feature activation+9.88e-03

Pos logit contributions
ff+0.060
ination+0.052
reed+0.050
 know+0.049
rated+0.048
Neg logit contributions
ipping-0.112
pill-0.107
 strawberries-0.104
ip-0.101
 practicing-0.100
Token in
Token ID287
Feature activation+1.64e-02

Pos logit contributions
away+0.010
 woke+0.009
 alone+0.009
 lived+0.008
ff+0.008
Neg logit contributions
pill-0.024
-0.022
 strawberries-0.022
ula-0.021
ipping-0.020
Token his
Token ID465
Feature activation+1.89e-03

Pos logit contributions
ff+0.022
away+0.021
rated+0.020
rost+0.019
ace+0.018
Neg logit contributions
ipping-0.034
-0.033
pill-0.032
 strawberries-0.031
 exercising-0.031
Token backyard
Token ID24296
Feature activation+1.98e-02

Pos logit contributions
ff+0.003
away+0.003
rated+0.003
rost+0.003
ace+0.003
Neg logit contributions
ipping-0.003
-0.003
pill-0.003
 strawberries-0.003
 exercising-0.003
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
ff+0.025
away+0.023
rost+0.020
 woke+0.020
 lived+0.020
Neg logit contributions
ipping-0.043
pill-0.043
-0.042
 strawberries-0.039
wards-0.039
Token?
Token ID30
Feature activation-9.82e-04

Pos logit contributions
ff+0.008
away+0.007
rost+0.007
rated+0.007
ace+0.006
Neg logit contributions
ipping-0.017
-0.017
pill-0.016
 exercising-0.016
 practicing-0.016
Token I
Token ID314
Feature activation+1.69e-02

Pos logit contributions
ipping+0.002
+0.002
pill+0.002
wards+0.002
 exercising+0.002
Neg logit contributions
ff-0.001
away-0.001
rated-0.001
rost-0.001
ination-0.001
Token want
Token ID765
Feature activation+1.34e-02

Pos logit contributions
away+0.023
ff+0.023
rated+0.022
rost+0.022
ination+0.022
Neg logit contributions
-0.031
ipping-0.030
pill-0.028
 practicing-0.028
 exercising-0.027
Token a
Token ID257
Feature activation+1.87e-02

Pos logit contributions
ff+0.022
away+0.021
rated+0.021
rost+0.020
ney+0.019
Neg logit contributions
ipping-0.023
pill-0.022
 strawberries-0.022
-0.021
 exercising-0.021
Token re
Token ID302
Feature activation+3.45e-02

Pos logit contributions
ff+0.032
away+0.029
rated+0.029
rost+0.029
ace+0.028
Neg logit contributions
ipping-0.031
-0.030
pill-0.028
 strawberries-0.028
wards-0.027
Tokenind
Token ID521
Feature activation+5.51e-02

Pos logit contributions
rated+0.053
ff+0.048
 lived+0.048
 woke+0.047
 know+0.046
Neg logit contributions
ipping-0.065
-0.054
 practicing-0.053
pill-0.052
ooming-0.051
Tokeneer
Token ID28153
Feature activation+1.28e-02

Pos logit contributions
ff+0.063
ination+0.061
reed+0.061
rated+0.059
 disgust+0.057
Neg logit contributions
 strawberries-0.091
ipping-0.090
pill-0.090
ip-0.088
rob-0.086
Token like
Token ID588
Feature activation+1.23e-02

Pos logit contributions
ff+0.017
away+0.016
rost+0.014
 woke+0.013
ace+0.013
Neg logit contributions
ipping-0.028
-0.027
pill-0.027
wards-0.024
 exercising-0.024
Token this
Token ID428
Feature activation+1.18e-02

Pos logit contributions
ff+0.019
away+0.017
rost+0.016
rated+0.016
ace+0.015
Neg logit contributions
ipping-0.024
-0.022
pill-0.022
 exercising-0.021
wards-0.021
Token one
Token ID530
Feature activation+8.59e-03

Pos logit contributions
ff+0.016
away+0.014
rost+0.014
rated+0.013
ace+0.012
Neg logit contributions
ipping-0.024
-0.023
pill-0.023
 strawberries-0.022
 exercising-0.022
Token."
Token ID526
Feature activation+1.17e-02

Pos logit contributions
ff+0.011
away+0.010
rost+0.010
rated+0.009
ination+0.009
Neg logit contributions
ipping-0.018
-0.017
pill-0.017
 exercising-0.016
 strawberries-0.016
Token kept
Token ID4030
Feature activation+1.31e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.000
ination+0.000
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
 exercising-0.001
 strawberries-0.001
Token crying
Token ID13774
Feature activation-2.26e-03

Pos logit contributions
ff+0.002
away+0.002
 woke+0.002
 know+0.002
ination+0.002
Neg logit contributions
pill-0.002
ipping-0.002
-0.002
wards-0.002
avored-0.002
Token and
Token ID290
Feature activation+7.58e-03

Pos logit contributions
ipping+0.004
+0.004
pill+0.004
 exercising+0.004
wards+0.004
Neg logit contributions
ff-0.004
away-0.003
rost-0.003
rated-0.003
ination-0.003
Token throwing
Token ID9644
Feature activation-6.84e-03

Pos logit contributions
ff+0.015
away+0.014
rost+0.014
rated+0.013
ace+0.013
Neg logit contributions
ipping-0.011
-0.011
 exercising-0.010
pill-0.010
wards-0.010
Token a
Token ID257
Feature activation+1.65e-02

Pos logit contributions
ipping+0.013
+0.012
 strawberries+0.012
pill+0.012
 exercising+0.011
Neg logit contributions
ff-0.011
away-0.010
rated-0.010
rost-0.010
ney-0.009
Token tant
Token ID24246
Feature activation+5.48e-02

Pos logit contributions
ff+0.025
rost+0.021
away+0.021
rated+0.021
ination+0.021
Neg logit contributions
ipping-0.032
-0.031
pill-0.030
 strawberries-0.030
ula-0.029
Tokenrum
Token ID6582
Feature activation-8.03e-04

Pos logit contributions
ff+0.058
rost+0.057
 lived+0.055
ination+0.055
away+0.054
Neg logit contributions
ipping-0.119
ula-0.111
pill-0.104
 practicing-0.104
 exercising-0.099
Token.
Token ID13
Feature activation+3.32e-03

Pos logit contributions
ipping+0.002
pill+0.002
+0.002
 exercising+0.001
 strawberries+0.001
Neg logit contributions
ff-0.001
away-0.001
rost-0.001
rated-0.001
ace-0.001
Token\n
Token ID198
Feature activation+3.73e-03

Pos logit contributions
ff+0.005
away+0.005
rated+0.005
rost+0.005
igator+0.004
Neg logit contributions
-0.007
ipping-0.006
pill-0.006
wards-0.006
ula-0.006
Token\n
Token ID198
Feature activation+5.32e-03

Pos logit contributions
ff+0.007
rated+0.007
away+0.007
igator+0.006
eer+0.006
Neg logit contributions
-0.006
ula-0.005
ipping-0.005
wards-0.005
pill-0.005
TokenFinally
Token ID11158
Feature activation-9.70e-03

Pos logit contributions
ff+0.011
away+0.010
rost+0.010
rated+0.009
eer+0.009
Neg logit contributions
ipping-0.008
-0.007
pill-0.007
 exercising-0.007
ula-0.007
Token to
Token ID284
Feature activation+9.50e-03

Pos logit contributions
ff+0.033
rated+0.031
rost+0.029
away+0.029
ace+0.028
Neg logit contributions
ipping-0.028
 strawberries-0.027
-0.027
pill-0.026
 exercising-0.026
Token play
Token ID711
Feature activation+5.51e-03

Pos logit contributions
ff+0.014
away+0.013
rated+0.012
rost+0.012
ace+0.011
Neg logit contributions
ipping-0.018
-0.018
 exercising-0.017
wards-0.017
 practicing-0.016
Token with
Token ID351
Feature activation-5.59e-04

Pos logit contributions
ff+0.010
rated+0.009
rost+0.009
away+0.008
ace+0.008
Neg logit contributions
ipping-0.009
-0.009
 exercising-0.008
wards-0.008
 practicing-0.008
Token the
Token ID262
Feature activation+1.10e-02

Pos logit contributions
ipping+0.001
 strawberries+0.001
pill+0.001
+0.001
 exercising+0.001
Neg logit contributions
ff-0.001
rated-0.001
away-0.001
rost-0.001
ney-0.001
Token re
Token ID302
Feature activation+2.74e-02

Pos logit contributions
ff+0.019
away+0.018
rost+0.017
rated+0.016
ination+0.016
Neg logit contributions
ipping-0.019
pill-0.018
-0.018
 strawberries-0.017
 exercising-0.017
Tokenind
Token ID521
Feature activation+5.46e-02

Pos logit contributions
rated+0.045
 woke+0.043
ff+0.043
 know+0.043
reed+0.042
Neg logit contributions
ipping-0.047
pill-0.038
 skating-0.037
 practicing-0.036
ooming-0.036
Tokeneer
Token ID28153
Feature activation+9.63e-04

Pos logit contributions
reed+0.071
ination+0.067
agree+0.064
rated+0.062
 disgust+0.061
Neg logit contributions
 strawberries-0.093
pill-0.090
ipping-0.084
 players-0.083
 cones-0.082
Token,
Token ID11
Feature activation-7.93e-03

Pos logit contributions
ff+0.002
away+0.001
rost+0.001
rated+0.001
ination+0.001
Neg logit contributions
ipping-0.002
-0.002
pill-0.002
 exercising-0.002
 strawberries-0.002
Token you
Token ID345
Feature activation+1.17e-02

Pos logit contributions
ipping+0.014
pill+0.013
+0.013
 exercising+0.013
 strawberries+0.013
Neg logit contributions
ff-0.013
away-0.012
rost-0.011
rated-0.011
ace-0.011
Token have
Token ID423
Feature activation+2.19e-03

Pos logit contributions
ff+0.017
away+0.017
rated+0.016
rost+0.015
ination+0.014
Neg logit contributions
-0.021
ipping-0.020
pill-0.019
 practicing-0.019
 exercising-0.019
Token to
Token ID284
Feature activation+1.84e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
rated+0.003
ace+0.003
Neg logit contributions
ipping-0.004
pill-0.004
-0.004
 strawberries-0.004
 exercising-0.004
Tokenballs
Token ID21591
Feature activation-1.28e-03

Pos logit contributions
pill+0.012
ipping+0.012
+0.012
wards+0.011
 exercising+0.011
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
 woke-0.007
ney-0.007
Token and
Token ID290
Feature activation-4.60e-03

Pos logit contributions
pill+0.003
ipping+0.003
+0.003
pieces+0.003
 exercising+0.003
Neg logit contributions
ff-0.001
away-0.001
 woke-0.001
 lived-0.001
ace-0.001
Token built
Token ID3170
Feature activation-8.11e-03

Pos logit contributions
ipping+0.009
+0.008
pill+0.008
 exercising+0.007
wards+0.007
Neg logit contributions
ff-0.007
away-0.007
rost-0.006
rated-0.006
 woke-0.006
Token a
Token ID257
Feature activation+1.31e-02

Pos logit contributions
ipping+0.017
+0.017
pill+0.016
 exercising+0.015
 practicing+0.015
Neg logit contributions
ff-0.011
away-0.010
rost-0.009
 woke-0.009
 lived-0.009
Token re
Token ID302
Feature activation+3.06e-02

Pos logit contributions
ff+0.017
away+0.017
rost+0.015
 woke+0.014
rated+0.014
Neg logit contributions
ipping-0.028
pill-0.027
-0.027
 exercising-0.025
wards-0.024
Tokenind
Token ID521
Feature activation+5.41e-02

Pos logit contributions
ff+0.039
rated+0.038
 woke+0.036
 lived+0.036
 know+0.036
Neg logit contributions
ipping-0.063
-0.055
pill-0.053
 skating-0.051
ooming-0.050
Tokeneer
Token ID28153
Feature activation+8.40e-03

Pos logit contributions
ff+0.055
ination+0.048
reed+0.045
 know+0.043
away+0.041
Neg logit contributions
ipping-0.112
pill-0.107
 strawberries-0.105
 skating-0.100
balls-0.099
Token house
Token ID2156
Feature activation+1.25e-02

Pos logit contributions
away+0.010
ff+0.009
 woke+0.008
 know+0.008
ney+0.008
Neg logit contributions
-0.019
pill-0.019
ipping-0.018
 strawberries-0.017
ula-0.017
Token.
Token ID13
Feature activation-6.63e-03

Pos logit contributions
ff+0.014
away+0.014
rost+0.012
ney+0.011
 woke+0.011
Neg logit contributions
ipping-0.029
-0.028
pill-0.028
 strawberries-0.026
wards-0.026
Token\n
Token ID198
Feature activation-2.76e-03

Pos logit contributions
ipping+0.013
+0.013
pill+0.012
wards+0.012
 strawberries+0.012
Neg logit contributions
ff-0.010
away-0.009
rost-0.009
rated-0.008
ace-0.008
Token\n
Token ID198
Feature activation-4.30e-03

Pos logit contributions
+0.005
wards+0.005
ipping+0.005
pill+0.004
ula+0.004
Neg logit contributions
ff-0.005
away-0.004
rated-0.004
rost-0.004
eer-0.004
Token and
Token ID290
Feature activation+6.34e-03

Pos logit contributions
+0.045
pill+0.045
ipping+0.042
wards+0.041
avored+0.040
Neg logit contributions
away-0.010
ff-0.009
 woke-0.008
 know-0.007
 alone-0.007
Token started
Token ID2067
Feature activation-1.32e-02

Pos logit contributions
ff+0.008
away+0.007
 woke+0.007
rated+0.007
 lived+0.007
Neg logit contributions
ipping-0.014
-0.013
pill-0.013
wards-0.012
ula-0.012
Token to
Token ID284
Feature activation-6.76e-03

Pos logit contributions
ipping+0.032
pill+0.031
+0.031
avored+0.029
pieces+0.029
Neg logit contributions
ff-0.012
away-0.011
 woke-0.011
 know-0.010
rost-0.010
Token throw
Token ID3714
Feature activation-1.33e-02

Pos logit contributions
ipping+0.017
+0.016
pill+0.016
 exercising+0.015
wards+0.015
Neg logit contributions
ff-0.006
away-0.006
rost-0.005
rated-0.005
ace-0.004
Token a
Token ID257
Feature activation+1.25e-02

Pos logit contributions
ipping+0.025
 strawberries+0.024
pill+0.024
+0.023
 exercising+0.022
Neg logit contributions
ff-0.021
away-0.019
rost-0.018
rated-0.018
ney-0.017
Token tant
Token ID24246
Feature activation+5.40e-02

Pos logit contributions
ff+0.021
away+0.019
ace+0.019
rated+0.019
rost+0.019
Neg logit contributions
ipping-0.021
pill-0.020
-0.020
 strawberries-0.019
ula-0.018
Tokenrum
Token ID6582
Feature activation-1.39e-03

Pos logit contributions
ff+0.055
 lived+0.053
rost+0.053
rated+0.050
ination+0.050
Neg logit contributions
ipping-0.122
ula-0.112
pill-0.108
 practicing-0.108
 exercising-0.104
Token.
Token ID13
Feature activation+4.53e-03

Pos logit contributions
pill+0.003
ipping+0.003
+0.003
 strawberries+0.003
 exercising+0.003
Neg logit contributions
ff-0.001
away-0.001
ace-0.001
rated-0.001
 woke-0.001
Token Suddenly
Token ID24975
Feature activation-1.34e-03

Pos logit contributions
ff+0.008
away+0.008
rost+0.007
rated+0.007
igator+0.007
Neg logit contributions
-0.008
ipping-0.008
pill-0.007
wards-0.007
 exercising-0.007
Token,
Token ID11
Feature activation+2.35e-02

Pos logit contributions
ipping+0.003
pill+0.003
+0.003
 exercising+0.003
wards+0.003
Neg logit contributions
ff-0.002
away-0.002
 woke-0.001
rost-0.001
 lived-0.001
Token she
Token ID673
Feature activation-6.65e-03

Pos logit contributions
away+0.028
ff+0.028
 woke+0.028
 lived+0.027
rost+0.026
Neg logit contributions
ipping-0.049
pill-0.049
-0.045
 exercising-0.042
oured-0.041
Token.
Token ID13
Feature activation-2.14e-03

Pos logit contributions
ipping+0.008
pill+0.008
+0.008
 strawberries+0.007
wards+0.007
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
rated-0.007
 woke-0.007
Token She
Token ID1375
Feature activation-2.02e-03

Pos logit contributions
+0.004
ipping+0.004
pill+0.004
ula+0.004
 strawberries+0.004
Neg logit contributions
ff-0.003
away-0.003
rated-0.003
rost-0.003
ace-0.003
Token sees
Token ID7224
Feature activation-1.09e-02

Pos logit contributions
ipping+0.005
+0.005
pill+0.004
wards+0.004
 exercising+0.004
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
 woke-0.002
 know-0.002
Token the
Token ID262
Feature activation+1.31e-02

Pos logit contributions
ipping+0.025
+0.025
pill+0.023
 exercising+0.022
wards+0.022
Neg logit contributions
ff-0.012
away-0.011
rost-0.010
 woke-0.009
ot-0.009
Token re
Token ID302
Feature activation+2.30e-02

Pos logit contributions
ff+0.023
away+0.021
rost+0.020
rated+0.020
ination+0.019
Neg logit contributions
ipping-0.022
pill-0.021
-0.020
 strawberries-0.020
 exercising-0.019
Tokenind
Token ID521
Feature activation+5.39e-02

Pos logit contributions
rated+0.036
ff+0.035
 know+0.033
reed+0.033
 woke+0.032
Neg logit contributions
ipping-0.042
 practicing-0.034
pill-0.034
ooming-0.034
 skating-0.033
Tokeneer
Token ID28153
Feature activation-2.91e-03

Pos logit contributions
reed+0.062
ff+0.058
rated+0.056
ination+0.056
agree+0.052
Neg logit contributions
ipping-0.101
pill-0.097
 strawberries-0.096
balls-0.091
 juggling-0.090
Token.
Token ID13
Feature activation+3.02e-03

Pos logit contributions
pill+0.007
+0.007
ipping+0.006
 strawberries+0.006
wards+0.006
Neg logit contributions
ff-0.003
away-0.003
 woke-0.003
 know-0.003
rost-0.003
Token She
Token ID1375
Feature activation+1.62e-03

Pos logit contributions
rated+0.005
away+0.005
ff+0.005
rost+0.005
ace+0.004
Neg logit contributions
-0.005
pill-0.005
ipping-0.005
ula-0.005
 strawberries-0.005
Token sees
Token ID7224
Feature activation-7.14e-03

Pos logit contributions
ff+0.002
away+0.002
rost+0.001
 woke+0.001
 know+0.001
Neg logit contributions
ipping-0.004
-0.004
pill-0.004
wards-0.003
 exercising-0.003
Token the
Token ID262
Feature activation+1.66e-02

Pos logit contributions
ipping+0.016
+0.016
pill+0.015
 exercising+0.014
wards+0.014
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
 woke-0.006
ot-0.006
Token a
Token ID257
Feature activation+2.02e-02

Pos logit contributions
ipping+0.015
pill+0.014
+0.014
wards+0.013
 strawberries+0.013
Neg logit contributions
ff-0.012
away-0.011
rost-0.010
rated-0.010
ace-0.009
Token picture
Token ID4286
Feature activation-4.00e-03

Pos logit contributions
ff+0.027
away+0.026
rost+0.023
 woke+0.023
ot+0.022
Neg logit contributions
ipping-0.043
-0.040
pill-0.040
 exercising-0.038
wards-0.037
Token of
Token ID286
Feature activation-5.10e-03

Pos logit contributions
ipping+0.009
+0.008
pill+0.008
 exercising+0.007
 strawberries+0.007
Neg logit contributions
ff-0.005
away-0.005
rost-0.005
 woke-0.004
ace-0.004
Token a
Token ID257
Feature activation+1.54e-02

Pos logit contributions
ipping+0.011
+0.010
pill+0.010
 exercising+0.010
wards+0.010
Neg logit contributions
ff-0.007
away-0.006
rost-0.006
ace-0.005
 woke-0.005
Token re
Token ID302
Feature activation+3.45e-02

Pos logit contributions
away+0.021
ff+0.021
 woke+0.019
rost+0.019
ot+0.018
Neg logit contributions
ipping-0.032
pill-0.029
-0.029
 exercising-0.027
wards-0.026
Tokenind
Token ID521
Feature activation+5.29e-02

Pos logit contributions
rated+0.044
ff+0.044
ace+0.040
 woke+0.040
 know+0.040
Neg logit contributions
ipping-0.071
pill-0.064
-0.063
 practicing-0.060
ip-0.059
Tokeneer
Token ID28153
Feature activation+1.16e-02

Pos logit contributions
ff+0.049
rated+0.043
reed+0.041
ination+0.041
 know+0.040
Neg logit contributions
pill-0.115
ipping-0.111
 strawberries-0.109
ip-0.104
 practicing-0.102
Token.
Token ID13
Feature activation+5.50e-03

Pos logit contributions
ff+0.012
away+0.011
 woke+0.010
 lived+0.010
ace+0.009
Neg logit contributions
ipping-0.028
-0.027
pill-0.027
 strawberries-0.025
ula-0.024
Token She
Token ID1375
Feature activation+1.14e-03

Pos logit contributions
ff+0.007
away+0.007
rost+0.006
rated+0.006
ace+0.006
Neg logit contributions
ipping-0.012
-0.011
pill-0.011
wards-0.011
 exercising-0.010
Token wanted
Token ID2227
Feature activation+9.07e-03

Pos logit contributions
ff+0.002
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.002
pill-0.002
-0.002
 exercising-0.002
 strawberries-0.002
Token a
Token ID257
Feature activation+1.75e-02

Pos logit contributions
ff+0.014
away+0.013
rated+0.012
rost+0.012
ace+0.011
Neg logit contributions
ipping-0.017
pill-0.017
-0.016
 strawberries-0.016
wards-0.016
Token snow
Token ID6729
Feature activation-2.66e-03

Pos logit contributions
ipping+0.009
+0.009
pill+0.009
ula+0.008
oured+0.008
Neg logit contributions
ff-0.008
away-0.008
 woke-0.007
 lived-0.007
rost-0.007
Token and
Token ID290
Feature activation+1.82e-03

Pos logit contributions
pill+0.006
ipping+0.006
+0.006
wards+0.005
 strawberries+0.005
Neg logit contributions
ff-0.003
away-0.003
rost-0.003
 woke-0.003
rated-0.003
Token mentioned
Token ID4750
Feature activation-3.09e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
ace+0.003
rated+0.003
Neg logit contributions
ipping-0.003
 exercising-0.003
pill-0.003
 strawberries-0.003
wards-0.003
Token a
Token ID257
Feature activation+4.84e-03

Pos logit contributions
ipping+0.006
pill+0.005
+0.005
wards+0.005
 strawberries+0.005
Neg logit contributions
ff-0.005
away-0.004
rost-0.004
rated-0.004
ace-0.004
Token re
Token ID302
Feature activation+3.37e-02

Pos logit contributions
ff+0.007
away+0.007
rost+0.006
rated+0.006
 woke+0.006
Neg logit contributions
ipping-0.009
-0.009
pill-0.009
 exercising-0.008
wards-0.008
Tokenind
Token ID521
Feature activation+5.29e-02

Pos logit contributions
rated+0.048
ff+0.045
 woke+0.045
 lived+0.044
 know+0.044
Neg logit contributions
ipping-0.063
pill-0.053
ip-0.053
-0.052
 skating-0.051
Tokeneer
Token ID28153
Feature activation+1.67e-02

Pos logit contributions
ff+0.051
ination+0.048
rated+0.043
reed+0.041
away+0.039
Neg logit contributions
pill-0.107
 strawberries-0.105
ipping-0.104
ip-0.102
ula-0.102
Token.
Token ID13
Feature activation-1.02e-03

Pos logit contributions
ff+0.014
 lived+0.013
away+0.013
 woke+0.012
ot+0.011
Neg logit contributions
-0.044
ipping-0.044
pill-0.043
 strawberries-0.040
 sidelines-0.040
Token She
Token ID1375
Feature activation-1.97e-04

Pos logit contributions
+0.002
ipping+0.002
pill+0.002
wards+0.002
 exercising+0.002
Neg logit contributions
away-0.002
ff-0.002
rated-0.001
rost-0.001
ace-0.001
Token was
Token ID373
Feature activation-4.54e-03

Pos logit contributions
ipping+0.000
pill+0.000
 exercising+0.000
 practicing+0.000
+0.000
Neg logit contributions
ff-0.000
away-0.000
rost-0.000
rated-0.000
ace-0.000
Token so
Token ID523
Feature activation+3.83e-03

Pos logit contributions
ipping+0.009
+0.009
pill+0.009
 strawberries+0.008
wards+0.008
Neg logit contributions
ff-0.006
away-0.005
 know-0.005
ace-0.005
rost-0.005
Token want
Token ID765
Feature activation+2.00e-02

Pos logit contributions
ff+0.012
away+0.011
rost+0.010
ination+0.010
rated+0.010
Neg logit contributions
-0.016
ipping-0.015
 exercising-0.014
pill-0.014
 practicing-0.013
Token to
Token ID284
Feature activation+9.52e-03

Pos logit contributions
ff+0.036
rated+0.034
away+0.032
rost+0.031
ace+0.031
Neg logit contributions
ipping-0.032
 strawberries-0.031
-0.030
pill-0.030
 exercising-0.029
Token keep
Token ID1394
Feature activation+1.18e-02

Pos logit contributions
ff+0.013
away+0.012
rated+0.011
rost+0.011
ination+0.010
Neg logit contributions
ipping-0.019
-0.019
 exercising-0.018
pill-0.018
wards-0.018
Token the
Token ID262
Feature activation+1.12e-02

Pos logit contributions
ff+0.016
away+0.015
rost+0.014
rated+0.013
ace+0.013
Neg logit contributions
ipping-0.024
-0.023
pill-0.023
wards-0.022
 exercising-0.022
Token re
Token ID302
Feature activation+2.42e-02

Pos logit contributions
ff+0.021
away+0.019
rost+0.019
rated+0.018
ot+0.018
Neg logit contributions
ipping-0.017
 strawberries-0.016
-0.015
pill-0.015
 exercising-0.015
Tokenind
Token ID521
Feature activation+5.28e-02

Pos logit contributions
rated+0.037
ff+0.037
 woke+0.037
 lived+0.036
reed+0.036
Neg logit contributions
ipping-0.043
 practicing-0.034
 skating-0.034
ooming-0.034
-0.033
Tokeneer
Token ID28153
Feature activation+4.17e-03

Pos logit contributions
reed+0.073
ination+0.069
agree+0.064
 disgust+0.064
ff+0.063
Neg logit contributions
 strawberries-0.081
ipping-0.078
pill-0.075
ip-0.072
 cones-0.072
Token,
Token ID11
Feature activation-2.76e-03

Pos logit contributions
ff+0.007
rost+0.006
ination+0.006
away+0.006
rated+0.006
Neg logit contributions
ipping-0.007
 exercising-0.007
-0.007
 strawberries-0.007
 practicing-0.007
Token you
Token ID345
Feature activation+1.26e-02

Pos logit contributions
ipping+0.005
+0.004
pill+0.004
 strawberries+0.004
 exercising+0.004
Neg logit contributions
ff-0.005
away-0.005
rated-0.004
rost-0.004
ace-0.004
Token have
Token ID423
Feature activation+3.99e-03

Pos logit contributions
ff+0.019
away+0.018
rost+0.017
rated+0.017
ination+0.016
Neg logit contributions
-0.023
ipping-0.022
pill-0.021
 practicing-0.021
 exercising-0.021
Token to
Token ID284
Feature activation+3.46e-03

Pos logit contributions
ff+0.006
away+0.006
rated+0.006
rost+0.006
ace+0.005
Neg logit contributions
ipping-0.007
-0.007
pill-0.007
 strawberries-0.007
 exercising-0.006
Token garden
Token ID11376
Feature activation-1.73e-02

Pos logit contributions
ipping+0.015
pill+0.014
+0.014
 exercising+0.013
wards+0.013
Neg logit contributions
ff-0.011
away-0.010
rost-0.010
ace-0.009
rated-0.009
Token and
Token ID290
Feature activation+2.65e-04

Pos logit contributions
ipping+0.035
+0.033
pill+0.033
wards+0.031
 exercising+0.031
Neg logit contributions
ff-0.025
away-0.023
rost-0.021
rated-0.020
 woke-0.020
Token admire
Token ID21099
Feature activation-7.85e-03

Pos logit contributions
ff+0.000
away+0.000
rost+0.000
rated+0.000
ace+0.000
Neg logit contributions
ipping-0.000
-0.000
pill-0.000
 exercising-0.000
wards-0.000
Token his
Token ID465
Feature activation-1.77e-03

Pos logit contributions
ipping+0.016
+0.016
pill+0.015
 strawberries+0.015
 exercising+0.015
Neg logit contributions
ff-0.011
away-0.010
rost-0.009
rated-0.009
ace-0.008
Token sun
Token ID4252
Feature activation-1.88e-02

Pos logit contributions
ipping+0.003
pill+0.003
+0.003
 exercising+0.003
wards+0.003
Neg logit contributions
ff-0.003
away-0.003
rost-0.002
rated-0.002
ace-0.002
Tokenfl
Token ID2704
Feature activation-7.41e-02

Pos logit contributions
ipping+0.037
pill+0.036
+0.036
wards+0.033
ula+0.033
Neg logit contributions
ff-0.027
away-0.025
rated-0.023
rost-0.023
 woke-0.022
Tokenowers
Token ID3618
Feature activation-1.62e-02

Pos logit contributions
ipping+0.103
ula+0.098
pill+0.098
wards+0.097
 exercising+0.097
Neg logit contributions
ff-0.142
away-0.126
 lived-0.121
ney-0.121
rated-0.121
Token.
Token ID13
Feature activation+5.16e-03

Pos logit contributions
ipping+0.034
pill+0.033
+0.033
wards+0.030
ula+0.030
Neg logit contributions
ff-0.021
away-0.020
rated-0.017
 lived-0.017
rost-0.017
Token They
Token ID1119
Feature activation+2.19e-03

Pos logit contributions
ff+0.006
away+0.006
rated+0.005
rost+0.005
ace+0.005
Neg logit contributions
ipping-0.011
-0.011
pill-0.011
wards-0.010
 exercising-0.010
Token also
Token ID635
Feature activation+3.79e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
rated+0.003
ination+0.003
Neg logit contributions
ipping-0.004
-0.004
pill-0.004
 strawberries-0.004
 exercising-0.004
Token see
Token ID766
Feature activation-5.63e-03

Pos logit contributions
ff+0.005
away+0.005
rost+0.005
rated+0.005
ace+0.005
Neg logit contributions
ipping-0.007
pill-0.007
-0.007
 exercising-0.007
wards-0.007
Token let
Token ID1309
Feature activation-1.51e-02

Pos logit contributions
ipping+0.010
+0.010
pill+0.009
 exercising+0.009
 strawberries+0.009
Neg logit contributions
ff-0.007
away-0.007
rost-0.006
rated-0.006
 woke-0.006
Token them
Token ID606
Feature activation-1.26e-02

Pos logit contributions
ipping+0.032
+0.031
pill+0.030
 exercising+0.029
wards+0.029
Neg logit contributions
ff-0.020
away-0.018
rost-0.017
rated-0.016
ace-0.016
Token remove
Token ID4781
Feature activation-1.18e-02

Pos logit contributions
ipping+0.027
pill+0.026
+0.025
 exercising+0.024
 strawberries+0.024
Neg logit contributions
ff-0.017
away-0.016
rost-0.014
rated-0.014
ace-0.013
Token his
Token ID465
Feature activation-3.67e-03

Pos logit contributions
ipping+0.024
+0.023
pill+0.023
 exercising+0.022
wards+0.021
Neg logit contributions
ff-0.017
away-0.016
rost-0.014
rated-0.014
ace-0.013
Token sun
Token ID4252
Feature activation-3.00e-02

Pos logit contributions
ipping+0.006
+0.006
pill+0.006
wards+0.006
 exercising+0.006
Neg logit contributions
ff-0.006
away-0.006
rost-0.005
rated-0.005
ace-0.005
Tokenfl
Token ID2704
Feature activation-7.35e-02

Pos logit contributions
ipping+0.055
pill+0.055
+0.052
wards+0.050
 exercising+0.050
Neg logit contributions
ff-0.047
away-0.044
rated-0.041
rost-0.040
ace-0.039
Tokenowers
Token ID3618
Feature activation-2.32e-02

Pos logit contributions
ipping+0.104
pill+0.100
 exercising+0.100
wards+0.097
ula+0.096
Neg logit contributions
ff-0.139
away-0.121
ney-0.120
ace-0.117
rost-0.117
Token.
Token ID13
Feature activation-4.21e-03

Pos logit contributions
ipping+0.052
pill+0.051
+0.049
wards+0.047
 strawberries+0.046
Neg logit contributions
ff-0.027
away-0.026
rated-0.021
rost-0.020
 know-0.020
Token\n
Token ID198
Feature activation-4.71e-03

Pos logit contributions
ipping+0.010
pill+0.009
+0.009
 strawberries+0.009
 exercising+0.009
Neg logit contributions
ff-0.005
away-0.004
rost-0.004
ace-0.003
ination-0.003
Token\n
Token ID198
Feature activation-2.59e-03

Pos logit contributions
ipping+0.010
+0.010
pill+0.010
wards+0.010
 exercising+0.009
Neg logit contributions
ff-0.006
away-0.006
rated-0.005
rost-0.005
ace-0.005
TokenThe
Token ID464
Feature activation-4.75e-03

Pos logit contributions
ipping+0.006
+0.005
pill+0.005
 strawberries+0.005
 exercising+0.005
Neg logit contributions
ff-0.003
away-0.003
rost-0.003
rated-0.003
ination-0.003
Token shaved
Token ID39273
Feature activation+2.01e-03

Pos logit contributions
away+0.008
ff+0.007
rost+0.007
rated+0.006
ination+0.006
Neg logit contributions
ipping-0.012
-0.012
pill-0.012
wards-0.011
 exercising-0.011
Token it
Token ID340
Feature activation-6.41e-03

Pos logit contributions
ff+0.003
away+0.003
rated+0.003
rost+0.003
ney+0.003
Neg logit contributions
ipping-0.004
pill-0.003
 strawberries-0.003
 exercising-0.003
ula-0.003
Token because
Token ID780
Feature activation+4.05e-03

Pos logit contributions
ipping+0.011
+0.010
 exercising+0.010
 practicing+0.010
pill+0.010
Neg logit contributions
ff-0.011
rost-0.010
away-0.010
ination-0.009
rated-0.009
Token the
Token ID262
Feature activation-5.10e-03

Pos logit contributions
ff+0.007
away+0.006
rated+0.006
rost+0.006
ace+0.006
Neg logit contributions
ipping-0.007
 exercising-0.006
pill-0.006
 strawberries-0.006
-0.006
Token sun
Token ID4252
Feature activation-2.06e-02

Pos logit contributions
ipping+0.010
+0.009
pill+0.009
 exercising+0.008
wards+0.008
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
rated-0.007
 woke-0.007
Tokenfl
Token ID2704
Feature activation-7.34e-02

Pos logit contributions
ipping+0.045
+0.043
pill+0.043
 exercising+0.040
 strawberries+0.040
Neg logit contributions
ff-0.026
away-0.024
rost-0.022
rated-0.021
ination-0.020
Tokenowers
Token ID3618
Feature activation-1.58e-02

Pos logit contributions
ipping+0.100
pill+0.092
+0.091
 exercising+0.087
wards+0.086
Neg logit contributions
ff-0.151
away-0.140
rost-0.133
rated-0.132
ace-0.130
Token needed
Token ID2622
Feature activation+6.36e-03

Pos logit contributions
ipping+0.033
+0.032
pill+0.032
 exercising+0.030
 strawberries+0.030
Neg logit contributions
ff-0.020
away-0.019
rost-0.018
rated-0.017
ination-0.016
Token more
Token ID517
Feature activation-5.44e-03

Pos logit contributions
ff+0.011
away+0.010
rated+0.010
rost+0.010
ination+0.010
Neg logit contributions
ipping-0.010
pill-0.010
 strawberries-0.010
-0.010
wards-0.009
Token sunlight
Token ID19606
Feature activation-6.37e-03

Pos logit contributions
ipping+0.010
pill+0.010
+0.009
 strawberries+0.009
 exercising+0.009
Neg logit contributions
ff-0.008
away-0.008
rost-0.008
rated-0.007
ination-0.007
Token to
Token ID284
Feature activation+7.12e-03

Pos logit contributions
ipping+0.012
+0.012
pill+0.011
 strawberries+0.011
 exercising+0.011
Neg logit contributions
ff-0.009
away-0.009
rost-0.009
rated-0.008
ination-0.008
Token<|endoftext|>
Token ID50256
Feature activation-7.38e-03

Pos logit contributions
ff+0.005
away+0.005
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.008
-0.008
pill-0.008
wards-0.008
 exercising-0.008
TokenL
Token ID43
Feature activation+6.85e-03

Pos logit contributions
+0.013
ipping+0.013
pill+0.012
wards+0.012
 exercising+0.011
Neg logit contributions
ff-0.012
away-0.012
rated-0.012
rost-0.011
ace-0.011
Tokenily
Token ID813
Feature activation+1.34e-02

Pos logit contributions
ff+0.011
away+0.011
rated+0.011
ination+0.010
rost+0.010
Neg logit contributions
ipping-0.012
pill-0.011
ula-0.010
-0.010
 strawberries-0.010
Token loved
Token ID6151
Feature activation-1.78e-02

Pos logit contributions
ff+0.022
away+0.021
rost+0.021
rated+0.019
ination+0.019
Neg logit contributions
ipping-0.024
pill-0.022
-0.022
 exercising-0.021
 practicing-0.021
Token sun
Token ID4252
Feature activation-1.49e-02

Pos logit contributions
ipping+0.040
+0.039
pill+0.036
ula+0.034
wards+0.034
Neg logit contributions
away-0.020
ff-0.020
 woke-0.019
 God-0.018
ot-0.018
Tokenfl
Token ID2704
Feature activation-7.23e-02

Pos logit contributions
pill+0.031
+0.029
wards+0.028
ipping+0.028
 sidelines+0.028
Neg logit contributions
ff-0.019
ace-0.018
 woke-0.017
ot-0.017
away-0.017
Tokenowers
Token ID3618
Feature activation-1.31e-02

Pos logit contributions
ipping+0.099
 exercising+0.099
pill+0.099
ula+0.096
wards+0.094
Neg logit contributions
ff-0.139
away-0.119
 lived-0.119
ace-0.118
ney-0.116
Token.
Token ID13
Feature activation-9.25e-03

Pos logit contributions
ipping+0.030
pill+0.029
+0.029
wards+0.027
ula+0.027
Neg logit contributions
ff-0.015
away-0.013
 lived-0.012
ace-0.011
 woke-0.011
Token She
Token ID1375
Feature activation-1.38e-03

Pos logit contributions
ipping+0.018
+0.018
pill+0.018
wards+0.017
 exercising+0.017
Neg logit contributions
ff-0.012
away-0.012
rated-0.011
rost-0.011
 woke-0.010
Token liked
Token ID8288
Feature activation-2.30e-02

Pos logit contributions
ipping+0.003
pill+0.003
+0.003
 strawberries+0.003
 exercising+0.003
Neg logit contributions
ff-0.002
away-0.001
rost-0.001
rated-0.001
ination-0.001
Token to
Token ID284
Feature activation-2.55e-03

Pos logit contributions
ipping+0.049
+0.047
pill+0.045
wards+0.043
 exercising+0.043
Neg logit contributions
ff-0.031
away-0.029
rost-0.026
 woke-0.024
rated-0.024
Token for
Token ID329
Feature activation-8.48e-03

Pos logit contributions
ipping+0.043
+0.041
pill+0.041
 exercising+0.038
 strawberries+0.037
Neg logit contributions
ff-0.031
away-0.028
rost-0.027
 woke-0.025
rated-0.025
Token some
Token ID617
Feature activation-1.21e-02

Pos logit contributions
ipping+0.018
pill+0.017
+0.016
 exercising+0.015
 practicing+0.015
Neg logit contributions
ff-0.012
away-0.011
rost-0.010
 know-0.009
ace-0.009
Token funny
Token ID8258
Feature activation-1.64e-02

Pos logit contributions
ipping+0.021
+0.020
pill+0.019
wards+0.018
ula+0.017
Neg logit contributions
ff-0.020
away-0.019
rost-0.018
 know-0.017
 woke-0.017
Token little
Token ID1310
Feature activation-1.37e-02

Pos logit contributions
ipping+0.030
+0.030
pill+0.028
wards+0.026
 exercising+0.026
Neg logit contributions
ff-0.027
away-0.025
rost-0.023
 woke-0.023
rated-0.022
Token sun
Token ID4252
Feature activation-2.69e-02

Pos logit contributions
+0.028
ipping+0.027
ula+0.025
pill+0.024
oming+0.023
Neg logit contributions
ff-0.021
away-0.018
 know-0.017
 woke-0.017
rated-0.016
Tokenfl
Token ID2704
Feature activation-7.14e-02

Pos logit contributions
ipping+0.056
pill+0.054
+0.053
 exercising+0.050
wards+0.050
Neg logit contributions
ff-0.036
away-0.032
rost-0.030
rated-0.029
 woke-0.029
Tokenowers
Token ID3618
Feature activation-2.24e-02

Pos logit contributions
ipping+0.103
 exercising+0.099
ula+0.098
wards+0.098
pill+0.097
Neg logit contributions
ff-0.133
ace-0.112
away-0.111
ney-0.110
rost-0.107
Token.
Token ID13
Feature activation+7.46e-04

Pos logit contributions
ipping+0.046
pill+0.044
+0.044
 exercising+0.041
wards+0.041
Neg logit contributions
ff-0.031
away-0.028
rost-0.026
rated-0.025
 woke-0.024
Token Her
Token ID2332
Feature activation-3.28e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.002
-0.002
pill-0.002
 exercising-0.001
wards-0.001
Token m
Token ID285
Feature activation-1.13e-02

Pos logit contributions
ipping+0.007
pill+0.006
+0.006
 exercising+0.006
wards+0.006
Neg logit contributions
ff-0.005
away-0.004
rost-0.004
ace-0.004
 know-0.004
Tokenama
Token ID1689
Feature activation-9.65e-03

Pos logit contributions
ipping+0.016
+0.015
pill+0.015
wards+0.013
 exercising+0.013
Neg logit contributions
ff-0.023
away-0.023
rost-0.021
rated-0.021
 woke-0.020
Token amazed
Token ID24562
Feature activation+2.94e-03

Pos logit contributions
ipping+0.008
+0.008
pill+0.007
 exercising+0.007
wards+0.007
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
rated-0.007
ace-0.007
Token by
Token ID416
Feature activation-5.82e-03

Pos logit contributions
ff+0.004
away+0.004
rost+0.004
rated+0.003
 woke+0.003
Neg logit contributions
ipping-0.006
-0.006
pill-0.006
 exercising-0.005
wards-0.005
Token all
Token ID477
Feature activation+2.72e-02

Pos logit contributions
ipping+0.013
+0.013
pill+0.012
 exercising+0.011
ula+0.011
Neg logit contributions
ff-0.007
away-0.007
rost-0.006
 lived-0.006
rated-0.006
Token the
Token ID262
Feature activation+9.40e-04

Pos logit contributions
ff+0.033
away+0.031
rost+0.027
rated+0.026
 lived+0.026
Neg logit contributions
ipping-0.061
-0.061
pill-0.058
wards-0.054
ula-0.054
Token wild
Token ID4295
Feature activation-5.71e-03

Pos logit contributions
ff+0.001
away+0.001
rated+0.001
rost+0.001
 know+0.001
Neg logit contributions
-0.002
ipping-0.002
pill-0.002
wards-0.002
ula-0.002
Tokenfl
Token ID2704
Feature activation-6.80e-02

Pos logit contributions
+0.009
ipping+0.009
pill+0.008
wards+0.008
ula+0.008
Neg logit contributions
ff-0.010
away-0.010
rated-0.009
rost-0.009
 woke-0.009
Tokenowers
Token ID3618
Feature activation-7.36e-03

Pos logit contributions
ula+0.112
wards+0.111
 exercising+0.110
 juggling+0.104
ipping+0.104
Neg logit contributions
ff-0.109
ney-0.090
ace-0.087
 lived-0.087
away-0.085
Token and
Token ID290
Feature activation+7.96e-04

Pos logit contributions
ipping+0.016
+0.015
pill+0.015
ula+0.014
wards+0.014
Neg logit contributions
ff-0.009
away-0.009
 lived-0.008
 woke-0.008
 know-0.008
Token the
Token ID262
Feature activation-2.21e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.001
pill-0.001
-0.001
 exercising-0.001
 strawberries-0.001
Token bright
Token ID6016
Feature activation+2.46e-03

Pos logit contributions
ipping+0.004
+0.004
pill+0.004
 exercising+0.003
wards+0.003
Neg logit contributions
ff-0.004
away-0.003
rost-0.003
rated-0.003
ace-0.003
Token blue
Token ID4171
Feature activation-1.16e-02

Pos logit contributions
ff+0.004
away+0.004
rost+0.004
rated+0.004
ace+0.003
Neg logit contributions
ipping-0.004
-0.004
pill-0.004
 exercising-0.004
wards-0.004
Token en
Token ID551
Feature activation-3.75e-03

Pos logit contributions
ipping+0.028
+0.027
pill+0.026
wards+0.023
 strawberries+0.023
Neg logit contributions
ff-0.022
 know-0.019
away-0.019
rost-0.019
ace-0.018
Tokenvious
Token ID1442
Feature activation-9.30e-03

Pos logit contributions
ipping+0.006
+0.006
pill+0.006
wards+0.006
 exercising+0.006
Neg logit contributions
ff-0.007
away-0.006
rost-0.006
rated-0.006
 know-0.005
Token of
Token ID286
Feature activation-1.31e-02

Pos logit contributions
ipping+0.020
+0.019
pill+0.018
wards+0.018
 exercising+0.017
Neg logit contributions
ff-0.012
away-0.012
rost-0.010
rated-0.010
 woke-0.010
Token your
Token ID534
Feature activation+1.57e-03

Pos logit contributions
ipping+0.024
+0.023
pill+0.023
 exercising+0.022
 strawberries+0.021
Neg logit contributions
ff-0.021
away-0.019
rost-0.018
rated-0.018
ace-0.017
Token sun
Token ID4252
Feature activation-2.43e-02

Pos logit contributions
ff+0.003
away+0.003
rost+0.002
ace+0.002
 woke+0.002
Neg logit contributions
ipping-0.003
-0.003
pill-0.003
 exercising-0.002
wards-0.002
Tokenfl
Token ID2704
Feature activation-6.78e-02

Pos logit contributions
ipping+0.048
pill+0.046
+0.046
wards+0.043
 exercising+0.043
Neg logit contributions
ff-0.035
away-0.032
rated-0.030
ace-0.029
rost-0.029
Tokenowers
Token ID3618
Feature activation-1.61e-02

Pos logit contributions
ipping+0.098
wards+0.094
 exercising+0.093
pill+0.093
ula+0.090
Neg logit contributions
ff-0.127
away-0.110
ney-0.108
ace-0.107
rost-0.105
Token.
Token ID13
Feature activation+4.85e-03

Pos logit contributions
ipping+0.035
+0.033
pill+0.033
wards+0.031
 exercising+0.031
Neg logit contributions
ff-0.021
away-0.019
rost-0.017
rated-0.017
ace-0.016
Token I
Token ID314
Feature activation+1.01e-02

Pos logit contributions
ff+0.007
away+0.007
rost+0.006
rated+0.006
ace+0.006
Neg logit contributions
ipping-0.009
-0.009
pill-0.009
 exercising-0.008
wards-0.008
Token wanted
Token ID2227
Feature activation-3.32e-04

Pos logit contributions
ff+0.013
away+0.013
rost+0.012
rated+0.011
ace+0.011
Neg logit contributions
-0.021
ipping-0.020
pill-0.020
 exercising-0.018
wards-0.018
Token to
Token ID284
Feature activation+2.49e-03

Pos logit contributions
ipping+0.001
pill+0.001
+0.001
 strawberries+0.001
 exercising+0.001
Neg logit contributions
ff-0.001
away-0.000
rost-0.000
rated-0.000
ace-0.000
Token could
Token ID714
Feature activation+5.44e-03

Pos logit contributions
ipping+0.013
pill+0.012
+0.012
 practicing+0.012
 exercising+0.012
Neg logit contributions
ff-0.010
away-0.009
rost-0.009
rated-0.008
ination-0.008
Token even
Token ID772
Feature activation+1.26e-02

Pos logit contributions
ff+0.007
away+0.007
rost+0.006
rated+0.006
ination+0.006
Neg logit contributions
ipping-0.011
-0.011
pill-0.010
 exercising-0.010
 practicing-0.010
Token turn
Token ID1210
Feature activation+3.54e-03

Pos logit contributions
ff+0.019
rost+0.016
away+0.016
rated+0.016
ination+0.015
Neg logit contributions
ipping-0.024
 exercising-0.022
pill-0.022
-0.022
 practicing-0.021
Token flip
Token ID14283
Feature activation-7.60e-03

Pos logit contributions
ff+0.005
away+0.005
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.007
pill-0.007
-0.007
 exercising-0.006
 strawberries-0.006
Token-
Token ID12
Feature activation+1.01e-02

Pos logit contributions
ipping+0.014
pill+0.014
+0.013
 strawberries+0.013
wards+0.012
Neg logit contributions
ff-0.011
away-0.011
rost-0.010
rated-0.010
ination-0.009
Tokenfl
Token ID2704
Feature activation-6.77e-02

Pos logit contributions
ff+0.016
away+0.015
rost+0.014
rated+0.014
ace+0.013
Neg logit contributions
ipping-0.018
pill-0.018
-0.017
wards-0.016
 exercising-0.016
Tokenops
Token ID2840
Feature activation-2.22e-02

Pos logit contributions
ipping+0.113
 exercising+0.111
wards+0.111
 strawberries+0.105
 practicing+0.104
Neg logit contributions
ff-0.111
ace-0.096
away-0.093
ney-0.089
rost-0.087
Token!
Token ID0
Feature activation-5.78e-03

Pos logit contributions
ipping+0.048
+0.047
pill+0.047
wards+0.044
 strawberries+0.043
Neg logit contributions
ff-0.027
away-0.026
ace-0.022
rost-0.022
rated-0.021
Token Everyone
Token ID11075
Feature activation+3.82e-03

Pos logit contributions
ipping+0.013
+0.013
pill+0.013
 exercising+0.012
wards+0.012
Neg logit contributions
ff-0.007
away-0.006
rost-0.005
rated-0.005
ace-0.005
Token was
Token ID373
Feature activation-5.54e-04

Pos logit contributions
ff+0.005
away+0.004
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.008
pill-0.008
-0.008
 exercising-0.008
wards-0.008
Token very
Token ID845
Feature activation+5.21e-03

Pos logit contributions
ipping+0.001
pill+0.001
+0.001
 strawberries+0.001
wards+0.001
Neg logit contributions
ff-0.001
away-0.001
 know-0.001
ace-0.001
rost-0.001
Token wealthy
Token ID11574
Feature activation-4.34e-03

Pos logit contributions
ipping+0.006
+0.006
pill+0.006
 exercising+0.005
wards+0.005
Neg logit contributions
ff-0.006
away-0.005
rost-0.005
rated-0.005
ace-0.005
Token prince
Token ID19716
Feature activation-1.75e-03

Pos logit contributions
ipping+0.007
pill+0.007
+0.006
 strawberries+0.006
 exercising+0.006
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.007
ination-0.007
Token and
Token ID290
Feature activation-1.80e-02

Pos logit contributions
ipping+0.004
pill+0.003
+0.003
 exercising+0.003
wards+0.003
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
rated-0.002
ination-0.002
Token his
Token ID465
Feature activation-1.73e-02

Pos logit contributions
ipping+0.036
pill+0.036
 strawberries+0.035
+0.034
wards+0.034
Neg logit contributions
ff-0.024
away-0.022
rost-0.021
rated-0.021
ney-0.019
Token sun
Token ID4252
Feature activation-2.81e-02

Pos logit contributions
ipping+0.030
+0.028
pill+0.028
 exercising+0.026
wards+0.026
Neg logit contributions
ff-0.029
away-0.028
rost-0.026
rated-0.025
ace-0.025
Tokenfl
Token ID2704
Feature activation-6.75e-02

Pos logit contributions
ipping+0.041
pill+0.039
+0.039
 exercising+0.036
wards+0.035
Neg logit contributions
ff-0.055
away-0.052
rost-0.049
rated-0.048
ace-0.047
Tokenowers
Token ID3618
Feature activation-1.90e-02

Pos logit contributions
ipping+0.085
pill+0.080
+0.078
 exercising+0.074
wards+0.073
Neg logit contributions
ff-0.145
away-0.136
rost-0.130
rated-0.126
ace-0.125
Token lived
Token ID5615
Feature activation+2.13e-03

Pos logit contributions
ipping+0.039
+0.038
pill+0.037
 exercising+0.035
wards+0.035
Neg logit contributions
ff-0.026
away-0.024
rost-0.023
rated-0.022
ace-0.021
Token happily
Token ID18177
Feature activation+3.84e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
rated+0.003
ace+0.003
Neg logit contributions
ipping-0.004
-0.004
pill-0.004
 exercising-0.004
wards-0.003
Token ever
Token ID1683
Feature activation+1.67e-02

Pos logit contributions
ff+0.004
away+0.004
 lived+0.003
ace+0.003
 woke+0.003
Neg logit contributions
ipping-0.009
pill-0.009
-0.008
ula-0.008
 strawberries-0.008
Token after
Token ID706
Feature activation+5.33e-03

Pos logit contributions
ff+0.026
away+0.026
rated+0.022
rost+0.022
 woke+0.021
Neg logit contributions
ipping-0.033
-0.031
pill-0.031
ula-0.028
 exercising-0.028
Token me
Token ID502
Feature activation+9.96e-03

Pos logit contributions
ff+0.014
away+0.013
rost+0.012
rated+0.012
 woke+0.011
Neg logit contributions
ipping-0.018
pill-0.017
-0.017
 exercising-0.016
wards-0.016
Tokenadow
Token ID4584
Feature activation+1.96e-02

Pos logit contributions
ff+0.011
away+0.011
rost+0.010
rated+0.010
ination+0.010
Neg logit contributions
ipping-0.023
-0.021
pill-0.021
 exercising-0.020
wards-0.020
Token full
Token ID1336
Feature activation-9.25e-03

Pos logit contributions
ff+0.029
away+0.027
rost+0.025
rated+0.024
ace+0.023
Neg logit contributions
ipping-0.039
-0.037
pill-0.037
 exercising-0.034
wards-0.034
Token of
Token ID286
Feature activation-7.54e-03

Pos logit contributions
ipping+0.020
+0.019
pill+0.019
 exercising+0.018
 practicing+0.018
Neg logit contributions
ff-0.012
away-0.011
rost-0.010
 woke-0.010
ot-0.009
Token wild
Token ID4295
Feature activation-4.99e-03

Pos logit contributions
ipping+0.013
+0.013
pill+0.012
ula+0.011
wards+0.011
Neg logit contributions
ff-0.013
away-0.012
rost-0.011
rated-0.011
 know-0.011
Tokenfl
Token ID2704
Feature activation-6.71e-02

Pos logit contributions
+0.009
ipping+0.008
ula+0.008
pill+0.007
wards+0.007
Neg logit contributions
ff-0.009
away-0.008
rated-0.008
 woke-0.007
ace-0.007
Tokenowers
Token ID3618
Feature activation-9.62e-03

Pos logit contributions
wards+0.114
ula+0.113
 exercising+0.112
 practicing+0.109
 pro+0.109
Neg logit contributions
ff-0.100
ney-0.082
ace-0.080
 lived-0.074
away-0.073
Token.
Token ID13
Feature activation+2.50e-03

Pos logit contributions
ipping+0.024
+0.023
pill+0.022
wards+0.021
 sidelines+0.021
Neg logit contributions
ff-0.009
away-0.008
 know-0.006
 lived-0.006
igator-0.006
Token He
Token ID679
Feature activation-3.28e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.002
rated+0.002
 woke+0.002
Neg logit contributions
ipping-0.006
pill-0.005
-0.005
 exercising-0.005
 strawberries-0.005
Token spun
Token ID26843
Feature activation-6.02e-03

Pos logit contributions
ipping+0.006
pill+0.006
+0.006
 exercising+0.006
 strawberries+0.006
Neg logit contributions
ff-0.005
away-0.005
rost-0.004
rated-0.004
ace-0.004
Token around
Token ID1088
Feature activation+1.13e-02

Pos logit contributions
ipping+0.013
+0.012
pill+0.012
 exercising+0.012
wards+0.012
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.006
ace-0.006
Token back
Token ID736
Feature activation+1.08e-02

Pos logit contributions
ipping+0.043
+0.042
pill+0.042
 exercising+0.039
wards+0.039
Neg logit contributions
ff-0.027
away-0.025
rost-0.023
rated-0.022
ace-0.021
Token with
Token ID351
Feature activation+3.94e-04

Pos logit contributions
ff+0.016
away+0.015
rost+0.014
rated+0.014
ination+0.013
Neg logit contributions
ipping-0.021
-0.020
pill-0.020
 exercising-0.019
wards-0.019
Token lots
Token ID6041
Feature activation+1.46e-02

Pos logit contributions
ff+0.001
away+0.001
rated+0.001
rost+0.001
ination+0.001
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
wards-0.001
 strawberries-0.001
Token of
Token ID286
Feature activation-7.63e-03

Pos logit contributions
ff+0.021
away+0.020
rost+0.018
rated+0.016
 woke+0.016
Neg logit contributions
ipping-0.029
-0.029
pill-0.028
 exercising-0.026
wards-0.026
Token sun
Token ID4252
Feature activation-1.07e-02

Pos logit contributions
ipping+0.013
+0.013
pill+0.012
 exercising+0.011
wards+0.011
Neg logit contributions
ff-0.013
away-0.012
rost-0.012
rated-0.011
 know-0.011
Tokenfl
Token ID2704
Feature activation-6.67e-02

Pos logit contributions
ipping+0.019
pill+0.019
+0.018
 exercising+0.017
wards+0.017
Neg logit contributions
ff-0.017
away-0.015
rated-0.014
 woke-0.014
ace-0.014
Tokenowers
Token ID3618
Feature activation-1.15e-02

Pos logit contributions
ipping+0.092
 exercising+0.088
pill+0.086
+0.086
ula+0.086
Neg logit contributions
ff-0.131
away-0.115
ace-0.111
ney-0.108
rost-0.108
Token to
Token ID284
Feature activation+1.74e-03

Pos logit contributions
ipping+0.024
+0.023
pill+0.023
 exercising+0.021
wards+0.021
Neg logit contributions
ff-0.016
away-0.014
rated-0.013
rost-0.013
 woke-0.013
Token fill
Token ID6070
Feature activation+5.53e-04

Pos logit contributions
ff+0.003
away+0.003
rated+0.003
rost+0.002
ney+0.002
Neg logit contributions
-0.003
ipping-0.003
wards-0.003
 strawberries-0.003
 exercising-0.003
Token the
Token ID262
Feature activation-6.33e-03

Pos logit contributions
ff+0.001
rost+0.001
away+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.001
 strawberries-0.001
pill-0.001
-0.001
wards-0.001
Token prince
Token ID19716
Feature activation-3.92e-03

Pos logit contributions
ipping+0.010
+0.010
pill+0.009
 strawberries+0.009
 exercising+0.009
Neg logit contributions
ff-0.012
away-0.011
rost-0.010
rated-0.010
ace-0.010
Token Soon
Token ID15894
Feature activation-1.76e-02

Pos logit contributions
ipping+0.016
pill+0.015
+0.015
 exercising+0.015
 strawberries+0.015
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.006
ination-0.006
Token,
Token ID11
Feature activation+1.41e-02

Pos logit contributions
ipping+0.040
+0.040
pill+0.039
ula+0.036
 exercising+0.035
Neg logit contributions
ff-0.020
away-0.020
 woke-0.019
 lived-0.018
 know-0.017
Token all
Token ID477
Feature activation+1.79e-02

Pos logit contributions
ff+0.019
away+0.018
rost+0.017
rated+0.016
ace+0.015
Neg logit contributions
ipping-0.029
-0.028
pill-0.028
 exercising-0.026
wards-0.026
Token the
Token ID262
Feature activation-6.80e-03

Pos logit contributions
ff+0.026
away+0.025
rost+0.024
rated+0.022
ination+0.021
Neg logit contributions
ipping-0.035
pill-0.033
-0.033
 exercising-0.031
wards-0.031
Token sun
Token ID4252
Feature activation-1.82e-02

Pos logit contributions
ipping+0.011
+0.011
pill+0.010
 exercising+0.009
wards+0.009
Neg logit contributions
ff-0.012
away-0.012
rost-0.011
rated-0.011
ace-0.011
Tokenfl
Token ID2704
Feature activation-6.67e-02

Pos logit contributions
ipping+0.027
+0.025
pill+0.025
 exercising+0.023
wards+0.022
Neg logit contributions
ff-0.036
away-0.034
rost-0.032
rated-0.032
ace-0.031
Tokenowers
Token ID3618
Feature activation-6.76e-03

Pos logit contributions
ipping+0.087
pill+0.082
+0.080
 exercising+0.080
ula+0.077
Neg logit contributions
ff-0.139
away-0.127
rost-0.119
rated-0.119
ace-0.119
Token were
Token ID547
Feature activation-4.54e-03

Pos logit contributions
ipping+0.014
+0.013
pill+0.013
 exercising+0.012
wards+0.012
Neg logit contributions
ff-0.010
away-0.009
rost-0.009
rated-0.008
ace-0.008
Token happily
Token ID18177
Feature activation+1.89e-03

Pos logit contributions
ipping+0.008
+0.007
pill+0.007
 exercising+0.007
wards+0.007
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
rated-0.007
ace-0.007
Token chatting
Token ID32046
Feature activation+1.04e-02

Pos logit contributions
ff+0.002
away+0.002
 woke+0.002
rated+0.002
ace+0.002
Neg logit contributions
pill-0.004
ipping-0.004
-0.004
ula-0.004
 strawberries-0.004
Token and
Token ID290
Feature activation-1.24e-02

Pos logit contributions
ff+0.017
rost+0.016
away+0.015
rated+0.014
ney+0.014
Neg logit contributions
ipping-0.019
-0.017
pill-0.017
wards-0.017
 strawberries-0.016
Tokenol
Token ID349
Feature activation-3.01e-02

Pos logit contributions
 exercising+0.023
ipping+0.023
+0.022
 practicing+0.022
pill+0.022
Neg logit contributions
ff-0.018
away-0.015
ace-0.014
ot-0.014
rost-0.014
Tokenanda
Token ID5282
Feature activation-1.85e-02

Pos logit contributions
ipping+0.060
+0.058
pill+0.055
 exercising+0.055
wards+0.054
Neg logit contributions
ff-0.044
away-0.040
rost-0.038
rated-0.035
ace-0.035
Token and
Token ID290
Feature activation-1.11e-02

Pos logit contributions
ipping+0.039
+0.038
pill+0.037
 exercising+0.034
ula+0.034
Neg logit contributions
ff-0.025
away-0.022
 lived-0.022
 woke-0.022
ot-0.021
Token her
Token ID607
Feature activation-7.93e-03

Pos logit contributions
ipping+0.022
+0.021
pill+0.020
 exercising+0.019
wards+0.019
Neg logit contributions
ff-0.017
away-0.015
rost-0.014
 woke-0.014
 lived-0.014
Token sun
Token ID4252
Feature activation-2.85e-02

Pos logit contributions
ipping+0.014
pill+0.013
+0.013
 exercising+0.013
wards+0.012
Neg logit contributions
ff-0.013
away-0.012
rost-0.011
 woke-0.011
ace-0.011
Tokenfl
Token ID2704
Feature activation-6.65e-02

Pos logit contributions
ipping+0.035
pill+0.033
+0.033
 exercising+0.029
wards+0.029
Neg logit contributions
ff-0.062
away-0.059
rost-0.057
rated-0.056
ace-0.055
Tokenowers
Token ID3618
Feature activation-1.61e-02

Pos logit contributions
ipping+0.091
pill+0.087
 exercising+0.086
wards+0.083
 practicing+0.082
Neg logit contributions
ff-0.131
away-0.118
rost-0.114
ace-0.112
 lived-0.112
Token started
Token ID2067
Feature activation-1.40e-02

Pos logit contributions
ipping+0.034
pill+0.032
+0.032
 exercising+0.030
wards+0.030
Neg logit contributions
ff-0.022
away-0.020
rost-0.018
rated-0.018
 woke-0.017
Token to
Token ID284
Feature activation+3.68e-03

Pos logit contributions
ipping+0.031
+0.031
pill+0.030
ula+0.028
 exercising+0.028
Neg logit contributions
ff-0.016
away-0.015
ot-0.013
 woke-0.013
 its-0.012
Token dance
Token ID9280
Feature activation+3.30e-03

Pos logit contributions
ff+0.006
away+0.005
rost+0.005
rated+0.005
ace+0.005
Neg logit contributions
ipping-0.007
-0.007
pill-0.006
 exercising-0.006
 strawberries-0.006
Token together
Token ID1978
Feature activation-1.34e-04

Pos logit contributions
ff+0.005
away+0.005
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.006
-0.006
pill-0.006
 exercising-0.006
wards-0.006
Token said
Token ID531
Feature activation+7.23e-03

Pos logit contributions
ff+0.002
away+0.002
 woke+0.002
 lived+0.002
rost+0.002
Neg logit contributions
ipping-0.004
-0.004
pill-0.004
wards-0.004
 exercising-0.004
Token no
Token ID645
Feature activation+7.05e-03

Pos logit contributions
ff+0.013
away+0.013
rost+0.012
rated+0.012
 woke+0.012
Neg logit contributions
ipping-0.011
pill-0.011
-0.011
 exercising-0.010
wards-0.010
Token because
Token ID780
Feature activation+7.48e-04

Pos logit contributions
away+0.007
ff+0.006
 know+0.006
ot+0.005
 woke+0.005
Neg logit contributions
-0.018
ipping-0.017
pill-0.017
wards-0.016
 strawberries-0.016
Token the
Token ID262
Feature activation-6.09e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
 woke+0.001
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
wards-0.001
 exercising-0.001
Token sun
Token ID4252
Feature activation-2.58e-02

Pos logit contributions
ipping+0.009
+0.009
pill+0.008
 strawberries+0.008
wards+0.008
Neg logit contributions
ff-0.012
away-0.011
rost-0.011
rated-0.010
ace-0.010
Tokenfl
Token ID2704
Feature activation-6.65e-02

Pos logit contributions
ipping+0.054
+0.051
pill+0.051
 exercising+0.048
wards+0.048
Neg logit contributions
ff-0.035
away-0.033
rost-0.030
rated-0.029
ace-0.028
Tokenowers
Token ID3618
Feature activation-1.45e-02

Pos logit contributions
ipping+0.083
pill+0.076
+0.075
 exercising+0.069
wards+0.068
Neg logit contributions
ff-0.146
away-0.138
rost-0.133
rated-0.129
ace-0.126
Token were
Token ID547
Feature activation-4.50e-03

Pos logit contributions
ipping+0.030
+0.029
pill+0.028
 exercising+0.027
 strawberries+0.026
Neg logit contributions
ff-0.020
away-0.019
rost-0.018
rated-0.016
ace-0.016
Token too
Token ID1165
Feature activation+1.92e-02

Pos logit contributions
ipping+0.008
+0.007
pill+0.007
 exercising+0.007
 strawberries+0.007
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.007
ace-0.006
Token big
Token ID1263
Feature activation-3.54e-03

Pos logit contributions
ff+0.033
ace+0.032
rost+0.031
away+0.030
rated+0.030
Neg logit contributions
 strawberries-0.030
-0.030
ipping-0.029
 exercising-0.028
pill-0.027
Token to
Token ID284
Feature activation+4.23e-03

Pos logit contributions
ipping+0.007
+0.007
pill+0.007
wards+0.007
 exercising+0.007
Neg logit contributions
ff-0.005
away-0.004
rost-0.004
 woke-0.004
rated-0.004
Token the
Token ID262
Feature activation-4.95e-03

Pos logit contributions
ff+0.002
away+0.002
rost+0.002
rated+0.002
ace+0.002
Neg logit contributions
ipping-0.003
-0.002
pill-0.002
 strawberries-0.002
 exercising-0.002
Token soil
Token ID9260
Feature activation-5.32e-03

Pos logit contributions
ipping+0.009
+0.008
pill+0.008
 exercising+0.007
wards+0.007
Neg logit contributions
ff-0.009
away-0.008
rost-0.008
rated-0.007
ace-0.007
Token for
Token ID329
Feature activation+1.28e-02

Pos logit contributions
ipping+0.010
+0.010
pill+0.010
 exercising+0.009
wards+0.009
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.007
ace-0.006
Token the
Token ID262
Feature activation-8.84e-04

Pos logit contributions
ff+0.015
away+0.014
rost+0.013
rated+0.012
ace+0.012
Neg logit contributions
ipping-0.029
-0.028
pill-0.027
 exercising-0.026
wards-0.026
Token sun
Token ID4252
Feature activation-2.38e-02

Pos logit contributions
ipping+0.002
pill+0.002
+0.002
wards+0.001
ula+0.001
Neg logit contributions
ff-0.001
away-0.001
rost-0.001
rated-0.001
 woke-0.001
Tokenfl
Token ID2704
Feature activation-6.62e-02

Pos logit contributions
ipping+0.044
pill+0.042
+0.042
 exercising+0.039
wards+0.039
Neg logit contributions
ff-0.038
away-0.035
rost-0.033
rated-0.032
 woke-0.031
Tokenowers
Token ID3618
Feature activation-1.41e-02

Pos logit contributions
ipping+0.088
pill+0.082
 exercising+0.078
wards+0.078
+0.078
Neg logit contributions
ff-0.136
away-0.126
rost-0.119
rated-0.117
ace-0.116
Token.
Token ID13
Feature activation+2.66e-03

Pos logit contributions
ipping+0.029
+0.028
pill+0.028
 exercising+0.026
wards+0.026
Neg logit contributions
ff-0.019
away-0.018
rost-0.016
rated-0.016
 woke-0.015
Token From
Token ID3574
Feature activation+9.38e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.002
rated+0.002
ace+0.002
Neg logit contributions
ipping-0.006
-0.006
pill-0.006
 exercising-0.006
wards-0.006
Token that
Token ID326
Feature activation+1.23e-02

Pos logit contributions
ff+0.018
away+0.017
rost+0.016
 lived+0.015
rated+0.015
Neg logit contributions
ipping-0.014
pill-0.014
-0.014
 exercising-0.012
ula-0.012
Token day
Token ID1110
Feature activation-1.12e-02

Pos logit contributions
ff+0.022
away+0.021
rost+0.019
ination+0.018
ace+0.018
Neg logit contributions
ipping-0.020
-0.019
 exercising-0.019
 practicing-0.018
pill-0.018
Token happy
Token ID3772
Feature activation-4.42e-03

Pos logit contributions
away+0.009
rost+0.008
ff+0.008
ination+0.008
rated+0.008
Neg logit contributions
wards-0.007
ipping-0.007
-0.007
 exercising-0.007
pill-0.007
Token with
Token ID351
Feature activation-5.02e-03

Pos logit contributions
+0.010
ipping+0.010
pill+0.010
pieces+0.009
avored+0.009
Neg logit contributions
away-0.004
ff-0.004
 woke-0.004
 its-0.004
 lived-0.004
Token all
Token ID477
Feature activation+2.48e-02

Pos logit contributions
ipping+0.012
+0.011
pill+0.011
wards+0.011
 exercising+0.011
Neg logit contributions
ff-0.006
away-0.005
rost-0.005
rated-0.004
 woke-0.004
Token the
Token ID262
Feature activation+1.62e-03

Pos logit contributions
ff+0.031
away+0.029
rost+0.026
rated+0.025
ace+0.024
Neg logit contributions
ipping-0.055
-0.054
pill-0.052
 exercising-0.049
wards-0.048
Token sun
Token ID4252
Feature activation-1.74e-02

Pos logit contributions
ff+0.003
away+0.003
rost+0.002
rated+0.002
ace+0.002
Neg logit contributions
ipping-0.003
-0.003
pill-0.003
 exercising-0.002
wards-0.002
Tokenfl
Token ID2704
Feature activation-6.61e-02

Pos logit contributions
ipping+0.028
pill+0.027
+0.026
 exercising+0.024
ula+0.024
Neg logit contributions
ff-0.031
away-0.029
 woke-0.028
rated-0.028
rost-0.027
Tokenowers
Token ID3618
Feature activation-1.05e-02

Pos logit contributions
ipping+0.089
+0.083
 exercising+0.082
pill+0.082
ula+0.081
Neg logit contributions
ff-0.134
away-0.122
rost-0.114
ace-0.113
rated-0.112
Token and
Token ID290
Feature activation-3.05e-03

Pos logit contributions
ipping+0.022
+0.021
pill+0.021
ula+0.019
 exercising+0.019
Neg logit contributions
ff-0.014
away-0.013
 lived-0.012
 woke-0.012
rated-0.012
Token he
Token ID339
Feature activation-1.41e-02

Pos logit contributions
ipping+0.006
+0.006
pill+0.006
 exercising+0.005
wards+0.005
Neg logit contributions
ff-0.005
away-0.004
rost-0.004
rated-0.004
 woke-0.004
Token ordered
Token ID6149
Feature activation-7.52e-05

Pos logit contributions
ipping+0.029
pill+0.028
+0.028
 exercising+0.026
wards+0.026
Neg logit contributions
ff-0.020
away-0.018
rost-0.017
rated-0.016
ace-0.015
Token his
Token ID465
Feature activation-6.60e-03

Pos logit contributions
ipping+0.000
+0.000
pill+0.000
wards+0.000
 exercising+0.000
Neg logit contributions
ff-0.000
away-0.000
rost-0.000
rated-0.000
ace-0.000
Token a
Token ID257
Feature activation+2.94e-03

Pos logit contributions
ipping+0.019
+0.018
pill+0.018
 exercising+0.017
wards+0.017
Neg logit contributions
ff-0.014
away-0.013
rost-0.012
rated-0.012
 woke-0.012
Token field
Token ID2214
Feature activation+2.02e-02

Pos logit contributions
ff+0.004
away+0.004
 woke+0.003
rost+0.003
 know+0.003
Neg logit contributions
ipping-0.006
pill-0.006
-0.006
 exercising-0.005
wards-0.005
Token full
Token ID1336
Feature activation-1.57e-02

Pos logit contributions
ff+0.025
away+0.024
ney+0.021
rost+0.021
 lived+0.020
Neg logit contributions
ipping-0.044
pill-0.043
-0.043
 exercising-0.039
 practicing-0.039
Token of
Token ID286
Feature activation-1.19e-02

Pos logit contributions
ipping+0.035
+0.033
pill+0.033
 practicing+0.031
 exercising+0.031
Neg logit contributions
ff-0.020
away-0.018
rost-0.016
 woke-0.016
ot-0.015
Token wild
Token ID4295
Feature activation-7.38e-03

Pos logit contributions
ipping+0.020
+0.019
pill+0.018
 exercising+0.016
wards+0.016
Neg logit contributions
ff-0.022
away-0.021
rost-0.019
rated-0.019
 woke-0.018
Tokenfl
Token ID2704
Feature activation-6.61e-02

Pos logit contributions
ipping+0.012
+0.012
ula+0.011
wards+0.011
pill+0.011
Neg logit contributions
ff-0.013
away-0.013
rated-0.012
 woke-0.011
 lived-0.011
Tokenowers
Token ID3618
Feature activation-8.50e-03

Pos logit contributions
wards+0.121
 exercising+0.118
 sidelines+0.116
 juggling+0.116
 practicing+0.116
Neg logit contributions
ff-0.089
ney-0.071
ace-0.071
away-0.067
 lived-0.066
Token.
Token ID13
Feature activation-7.90e-03

Pos logit contributions
ipping+0.023
+0.022
 sidelines+0.021
pill+0.020
ula+0.020
Neg logit contributions
ff-0.006
away-0.006
 lived-0.005
 know-0.005
 woke-0.004
Token They
Token ID1119
Feature activation-5.03e-03

Pos logit contributions
ipping+0.020
+0.018
pill+0.018
 exercising+0.018
 strawberries+0.017
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
ination-0.006
 lived-0.006
Token offered
Token ID4438
Feature activation+2.88e-03

Pos logit contributions
ipping+0.010
pill+0.010
+0.010
 exercising+0.009
 strawberries+0.009
Neg logit contributions
ff-0.007
away-0.007
rost-0.006
rated-0.006
ace-0.006
Token them
Token ID606
Feature activation+4.39e-03

Pos logit contributions
ff+0.005
away+0.004
rated+0.004
rost+0.004
ace+0.004
Neg logit contributions
ipping-0.005
pill-0.005
 strawberries-0.005
-0.005
wards-0.005
Token after
Token ID706
Feature activation-1.07e-02

Pos logit contributions
ipping+0.023
pill+0.023
+0.023
 strawberries+0.023
 exercising+0.022
Neg logit contributions
ff-0.009
away-0.009
 lived-0.008
 woke-0.007
ace-0.007
Token that
Token ID326
Feature activation+2.99e-03

Pos logit contributions
pill+0.021
ipping+0.020
+0.019
 pitcher+0.018
oured+0.018
Neg logit contributions
 lived-0.015
 its-0.014
away-0.014
ff-0.013
 know-0.013
Token.
Token ID13
Feature activation-1.62e-02

Pos logit contributions
away+0.003
ff+0.003
rated+0.003
 lived+0.003
 woke+0.003
Neg logit contributions
pill-0.007
ipping-0.007
-0.007
 strawberries-0.006
oured-0.006
Token Whenever
Token ID21326
Feature activation-8.86e-03

Pos logit contributions
ipping+0.040
+0.036
pill+0.036
 strawberries+0.036
 exercising+0.035
Neg logit contributions
ff-0.017
away-0.014
rost-0.014
 know-0.012
ace-0.012
Token the
Token ID262
Feature activation-1.15e-02

Pos logit contributions
ipping+0.019
+0.018
pill+0.018
oured+0.017
ula+0.017
Neg logit contributions
ff-0.011
away-0.010
 lived-0.010
 woke-0.010
 God-0.010
Token chip
Token ID11594
Feature activation-6.60e-02

Pos logit contributions
ipping+0.020
pill+0.018
oured+0.017
+0.017
pieces+0.017
Neg logit contributions
ff-0.017
rost-0.017
 value-0.016
 know-0.016
ney-0.016
Tokenm
Token ID76
Feature activation-1.61e-02

Pos logit contributions
ipping+0.093
pill+0.087
wards+0.086
ula+0.083
+0.083
Neg logit contributions
ff-0.119
away-0.110
 lived-0.110
rost-0.109
igator-0.109
Tokenunk
Token ID2954
Feature activation-2.66e-02

Pos logit contributions
ipping+0.027
+0.025
pill+0.025
wards+0.024
 exercising+0.023
Neg logit contributions
ff-0.029
away-0.026
rost-0.025
rated-0.024
ace-0.024
Token needed
Token ID2622
Feature activation+9.19e-03

Pos logit contributions
ipping+0.053
+0.051
pill+0.051
 exercising+0.048
wards+0.047
Neg logit contributions
ff-0.038
away-0.036
rost-0.034
rated-0.032
ination-0.030
Token help
Token ID1037
Feature activation-1.20e-03

Pos logit contributions
ff+0.013
away+0.013
rost+0.012
rated+0.011
ace+0.011
Neg logit contributions
ipping-0.018
-0.017
pill-0.017
 exercising-0.016
wards-0.016
Token,
Token ID11
Feature activation-2.90e-03

Pos logit contributions
ipping+0.002
+0.002
pill+0.002
 exercising+0.002
wards+0.002
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
rated-0.002
ination-0.002
Token From
Token ID3574
Feature activation+9.03e-03

Pos logit contributions
ipping+0.015
pill+0.014
+0.014
 exercising+0.013
 strawberries+0.013
Neg logit contributions
ff-0.007
away-0.006
rost-0.006
rated-0.005
ace-0.005
Token then
Token ID788
Feature activation-2.99e-04

Pos logit contributions
ff+0.017
away+0.017
rost+0.015
 lived+0.015
rated+0.015
Neg logit contributions
ipping-0.014
pill-0.014
-0.014
 exercising-0.012
ula-0.011
Token on
Token ID319
Feature activation-1.59e-02

Pos logit contributions
ipping+0.001
pill+0.000
+0.000
 strawberries+0.000
wards+0.000
Neg logit contributions
ff-0.001
away-0.000
rost-0.000
rated-0.000
 woke-0.000
Token the
Token ID262
Feature activation-9.16e-03

Pos logit contributions
ipping+0.035
pill+0.035
+0.033
 exercising+0.032
oured+0.031
Neg logit contributions
ff-0.018
away-0.017
 lived-0.016
 woke-0.015
ot-0.015
Token sun
Token ID4252
Feature activation-2.45e-02

Pos logit contributions
ipping+0.019
pill+0.018
+0.016
ula+0.016
 exercising+0.016
Neg logit contributions
ff-0.012
ace-0.012
 lived-0.011
rated-0.011
 woke-0.011
Tokenfl
Token ID2704
Feature activation-6.55e-02

Pos logit contributions
ipping+0.034
+0.032
pill+0.032
 exercising+0.029
wards+0.029
Neg logit contributions
ff-0.050
away-0.047
rost-0.045
rated-0.044
 woke-0.043
Tokenowers
Token ID3618
Feature activation-1.23e-02

Pos logit contributions
ipping+0.089
pill+0.084
 exercising+0.084
wards+0.081
ula+0.081
Neg logit contributions
ff-0.130
away-0.116
rost-0.111
ace-0.111
 lived-0.110
Token were
Token ID547
Feature activation+2.65e-03

Pos logit contributions
ipping+0.025
+0.024
pill+0.024
 exercising+0.022
wards+0.022
Neg logit contributions
ff-0.017
away-0.016
rost-0.015
rated-0.014
ace-0.014
Token Y
Token ID575
Feature activation-1.25e-02

Pos logit contributions
ff+0.004
away+0.004
rost+0.003
rated+0.003
ace+0.003
Neg logit contributions
ipping-0.005
-0.005
pill-0.005
 exercising-0.004
wards-0.004
Tokenol
Token ID349
Feature activation-2.67e-02

Pos logit contributions
ipping+0.022
 exercising+0.022
pill+0.022
+0.021
 practicing+0.021
Neg logit contributions
ff-0.021
away-0.018
ace-0.018
rost-0.017
ot-0.017
Tokenanda
Token ID5282
Feature activation-1.87e-02

Pos logit contributions
ipping+0.050
+0.047
 exercising+0.046
wards+0.045
 practicing+0.044
Neg logit contributions
ff-0.043
away-0.039
rost-0.036
ace-0.035
ot-0.035
Token the
Token ID262
Feature activation+2.69e-03

Pos logit contributions
ipping+0.001
+0.001
pill+0.001
 exercising+0.001
wards+0.001
Neg logit contributions
ff-0.001
away-0.001
rost-0.000
rated-0.000
ace-0.000
Token biggest
Token ID4094
Feature activation+1.42e-02

Pos logit contributions
ff+0.005
away+0.004
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.005
-0.004
pill-0.004
 exercising-0.004
wards-0.004
Token and
Token ID290
Feature activation-2.42e-03

Pos logit contributions
ff+0.022
away+0.020
 woke+0.018
 know+0.018
rost+0.017
Neg logit contributions
ipping-0.028
pill-0.026
-0.026
 exercising-0.024
ula-0.024
Token brightest
Token ID33871
Feature activation+1.15e-02

Pos logit contributions
ipping+0.004
+0.004
pill+0.004
 exercising+0.003
wards+0.003
Neg logit contributions
ff-0.005
away-0.004
rost-0.004
rated-0.004
ace-0.004
Token sun
Token ID4252
Feature activation-1.08e-02

Pos logit contributions
ff+0.019
away+0.018
rost+0.016
 woke+0.016
rated+0.015
Neg logit contributions
ipping-0.021
-0.020
pill-0.019
 exercising-0.019
 practicing-0.018
Tokenfl
Token ID2704
Feature activation-6.50e-02

Pos logit contributions
ipping+0.016
pill+0.015
+0.015
 exercising+0.014
wards+0.014
Neg logit contributions
ff-0.021
away-0.019
rated-0.019
rost-0.018
ace-0.018
Tokenowers
Token ID3618
Feature activation-8.24e-03

Pos logit contributions
ipping+0.095
 exercising+0.092
wards+0.090
ula+0.090
 practicing+0.089
Neg logit contributions
ff-0.120
away-0.103
ace-0.101
ney-0.100
rated-0.098
Token she
Token ID673
Feature activation-1.24e-02

Pos logit contributions
ipping+0.019
+0.018
pill+0.017
ula+0.016
wards+0.016
Neg logit contributions
ff-0.009
away-0.009
 lived-0.008
rated-0.007
 woke-0.007
Token could
Token ID714
Feature activation+6.41e-04

Pos logit contributions
+0.025
ipping+0.025
pill+0.025
 strawberries+0.023
 exercising+0.023
Neg logit contributions
ff-0.017
away-0.016
rost-0.015
rated-0.014
ace-0.014
Token find
Token ID1064
Feature activation+2.26e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
 exercising-0.001
wards-0.001
Token.
Token ID13
Feature activation+2.10e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
rated+0.003
ace+0.003
Neg logit contributions
ipping-0.004
-0.004
pill-0.004
 exercising-0.004
wards-0.004
Token special
Token ID2041
Feature activation-3.85e-03

Pos logit contributions
ipping+0.059
pill+0.055
wards+0.052
+0.051
 exercising+0.050
Neg logit contributions
ff-0.025
away-0.024
 woke-0.024
ot-0.023
 lived-0.023
Token bag
Token ID6131
Feature activation+2.37e-02

Pos logit contributions
ipping+0.007
pill+0.007
+0.007
 exercising+0.007
wards+0.006
Neg logit contributions
ff-0.006
away-0.005
rost-0.005
 woke-0.005
 lived-0.005
Token.
Token ID13
Feature activation-1.10e-02

Pos logit contributions
ff+0.033
away+0.032
rost+0.029
rated+0.027
 woke+0.027
Neg logit contributions
ipping-0.048
pill-0.046
-0.046
 exercising-0.043
 strawberries-0.043
Token She
Token ID1375
Feature activation-2.16e-03

Pos logit contributions
+0.018
pill+0.016
ipping+0.015
 strawberries+0.015
wards+0.014
Neg logit contributions
away-0.020
rated-0.020
ff-0.019
rost-0.018
eer-0.017
Token kept
Token ID4030
Feature activation+4.10e-03

Pos logit contributions
pill+0.004
+0.004
ipping+0.004
 strawberries+0.004
 exercising+0.004
Neg logit contributions
ff-0.003
away-0.003
rated-0.003
rost-0.002
ination-0.002
Token all
Token ID477
Feature activation+2.99e-02

Pos logit contributions
ff+0.006
away+0.006
rost+0.005
 woke+0.005
ination+0.005
Neg logit contributions
ipping-0.008
-0.008
pill-0.008
wards-0.007
 exercising-0.007
Token her
Token ID607
Feature activation+2.52e-03

Pos logit contributions
ff+0.045
rated+0.042
away+0.041
rost+0.040
ination+0.038
Neg logit contributions
ipping-0.058
pill-0.053
 strawberries-0.053
 exercising-0.052
 practicing-0.051
Token favourite
Token ID12507
Feature activation+1.43e-02

Pos logit contributions
ff+0.005
away+0.004
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.004
-0.004
pill-0.004
 exercising-0.004
wards-0.004
Token things
Token ID1243
Feature activation+1.40e-02

Pos logit contributions
ff+0.027
away+0.025
rost+0.024
rated+0.024
ination+0.023
Neg logit contributions
ipping-0.022
pill-0.021
-0.020
 strawberries-0.020
 exercising-0.019
Token inside
Token ID2641
Feature activation+5.22e-03

Pos logit contributions
ff+0.023
away+0.021
rost+0.020
rated+0.020
ination+0.019
Neg logit contributions
ipping-0.026
-0.024
pill-0.024
 exercising-0.023
 strawberries-0.023
Token it
Token ID340
Feature activation+3.99e-03

Pos logit contributions
ff+0.008
away+0.008
rost+0.008
rated+0.007
ination+0.007
Neg logit contributions
ipping-0.010
-0.009
pill-0.009
 exercising-0.009
 strawberries-0.009
Token all
Token ID477
Feature activation+2.50e-02

Pos logit contributions
ipping+0.031
+0.029
pill+0.029
ula+0.026
wards+0.026
Neg logit contributions
ff-0.017
away-0.017
 lived-0.015
rost-0.015
 know-0.015
Token.
Token ID13
Feature activation-1.16e-03

Pos logit contributions
ff+0.031
away+0.031
 lived+0.026
rated+0.026
igator+0.025
Neg logit contributions
ipping-0.055
-0.055
pill-0.052
ula-0.049
wards-0.048
Token It
Token ID632
Feature activation+3.96e-03

Pos logit contributions
ipping+0.003
+0.002
pill+0.002
 exercising+0.002
wards+0.002
Neg logit contributions
ff-0.001
away-0.001
rost-0.001
rated-0.001
ace-0.001
Token was
Token ID373
Feature activation-1.94e-04

Pos logit contributions
ff+0.005
away+0.004
rost+0.004
rated+0.004
 woke+0.003
Neg logit contributions
ipping-0.009
-0.009
pill-0.009
 exercising-0.008
wards-0.008
Token a
Token ID257
Feature activation+1.25e-03

Pos logit contributions
ipping+0.000
+0.000
pill+0.000
 exercising+0.000
wards+0.000
Neg logit contributions
ff-0.000
away-0.000
rost-0.000
rated-0.000
ace-0.000
Token tre
Token ID2054
Feature activation+3.94e-02

Pos logit contributions
ff+0.002
away+0.002
rost+0.001
 woke+0.001
ot+0.001
Neg logit contributions
ipping-0.003
pill-0.003
-0.002
 exercising-0.002
 strawberries-0.002
Tokenasured
Token ID34006
Feature activation+1.03e-02

Pos logit contributions
ff+0.060
away+0.055
rost+0.053
 woke+0.052
rated+0.050
Neg logit contributions
ipping-0.076
-0.069
pill-0.068
 exercising-0.067
 practicing-0.066
Token stuffed
Token ID22259
Feature activation+1.53e-02

Pos logit contributions
ff+0.014
away+0.013
rost+0.012
rated+0.011
ace+0.011
Neg logit contributions
ipping-0.021
-0.021
pill-0.021
 exercising-0.019
wards-0.019
Token animal
Token ID5044
Feature activation+4.21e-03

Pos logit contributions
ff+0.024
away+0.023
rost+0.021
rated+0.020
ace+0.020
Neg logit contributions
ipping-0.028
-0.027
pill-0.027
 exercising-0.025
wards-0.025
Token that
Token ID326
Feature activation-5.25e-03

Pos logit contributions
away+0.005
ff+0.005
 lived+0.004
 woke+0.004
ace+0.004
Neg logit contributions
ipping-0.009
-0.009
pill-0.009
wards-0.008
 strawberries-0.008
Token she
Token ID673
Feature activation-9.23e-03

Pos logit contributions
ipping+0.010
+0.010
pill+0.010
wards+0.009
 exercising+0.009
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.007
ination-0.006
Token felt
Token ID2936
Feature activation+1.89e-03

Pos logit contributions
ff+0.008
away+0.008
rost+0.007
rated+0.007
ace+0.007
Neg logit contributions
ipping-0.011
-0.011
pill-0.011
 exercising-0.010
wards-0.010
Token full
Token ID1336
Feature activation-1.92e-02

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
ace+0.003
ney+0.003
Neg logit contributions
ipping-0.003
-0.003
wards-0.003
 exercising-0.003
 strawberries-0.003
Token.
Token ID13
Feature activation+3.11e-03

Pos logit contributions
ipping+0.040
pill+0.039
+0.038
 exercising+0.036
wards+0.035
Neg logit contributions
ff-0.026
away-0.024
rost-0.022
 woke-0.021
rated-0.021
Token<|endoftext|>
Token ID50256
Feature activation+4.97e-04

Pos logit contributions
away+0.005
ff+0.005
rost+0.005
ot+0.005
ace+0.005
Neg logit contributions
-0.005
wards-0.005
ipping-0.005
pill-0.005
 exercising-0.005
TokenL
Token ID43
Feature activation+3.64e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
wards-0.001
 exercising-0.001
Tokenola
Token ID5708
Feature activation+3.24e-02

Pos logit contributions
ff+0.006
away+0.006
rated+0.005
ination+0.005
rost+0.005
Neg logit contributions
ipping-0.006
 strawberries-0.006
pill-0.006
-0.006
wards-0.005
Token and
Token ID290
Feature activation-5.01e-03

Pos logit contributions
ff+0.045
away+0.043
rost+0.041
rated+0.039
ace+0.038
Neg logit contributions
ipping-0.065
-0.062
pill-0.062
 exercising-0.058
 strawberries-0.057
Token her
Token ID607
Feature activation-9.31e-03

Pos logit contributions
ipping+0.009
+0.009
pill+0.009
wards+0.009
 exercising+0.009
Neg logit contributions
ff-0.008
away-0.007
rated-0.007
rost-0.007
ney-0.006
Token mum
Token ID25682
Feature activation+2.85e-03

Pos logit contributions
ipping+0.025
pill+0.024
 exercising+0.023
+0.022
oming+0.021
Neg logit contributions
ff-0.007
 lived-0.006
 woke-0.006
 know-0.006
away-0.006
Token were
Token ID547
Feature activation-9.60e-03

Pos logit contributions
ff+0.004
away+0.003
rost+0.003
rated+0.003
ace+0.003
Neg logit contributions
ipping-0.006
-0.006
pill-0.006
 exercising-0.006
wards-0.006
Token walking
Token ID6155
Feature activation-9.86e-03

Pos logit contributions
pill+0.018
+0.016
ipping+0.015
 strawberries+0.014
con+0.014
Neg logit contributions
ff-0.016
away-0.014
 alone-0.014
 woke-0.013
 lived-0.013
Token copies
Token ID9088
Feature activation-2.74e-03

Pos logit contributions
ff+0.003
away+0.003
 know+0.003
ot+0.003
rost+0.002
Neg logit contributions
ipping-0.006
-0.006
pill-0.006
 exercising-0.005
 strawberries-0.005
Token.
Token ID13
Feature activation-5.59e-03

Pos logit contributions
ipping+0.006
+0.006
pill+0.006
 exercising+0.005
 strawberries+0.005
Neg logit contributions
ff-0.004
away-0.003
 woke-0.003
rost-0.003
ot-0.003
Token They
Token ID1119
Feature activation+5.24e-03

Pos logit contributions
ipping+0.013
pill+0.012
+0.012
 exercising+0.012
 strawberries+0.012
Neg logit contributions
ff-0.006
away-0.006
rost-0.005
ination-0.005
rated-0.005
Token printed
Token ID10398
Feature activation-4.96e-03

Pos logit contributions
ff+0.008
away+0.008
rost+0.008
rated+0.007
ace+0.007
Neg logit contributions
ipping-0.009
-0.009
pill-0.009
wards-0.008
 practicing-0.008
Token a
Token ID257
Feature activation+9.07e-03

Pos logit contributions
ipping+0.009
pill+0.009
+0.008
wards+0.008
 strawberries+0.008
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
rated-0.007
ace-0.007
Token lot
Token ID1256
Feature activation+3.54e-02

Pos logit contributions
ff+0.013
away+0.013
rost+0.011
rated+0.011
 woke+0.011
Neg logit contributions
ipping-0.018
-0.017
pill-0.017
 exercising-0.016
wards-0.016
Token of
Token ID286
Feature activation+1.46e-03

Pos logit contributions
ff+0.054
away+0.051
rost+0.047
rated+0.045
ace+0.044
Neg logit contributions
ipping-0.068
-0.065
pill-0.064
 exercising-0.060
wards-0.059
Token copies
Token ID9088
Feature activation-4.18e-05

Pos logit contributions
ff+0.003
away+0.002
rost+0.002
rated+0.002
ace+0.002
Neg logit contributions
ipping-0.002
pill-0.002
-0.002
 strawberries-0.002
wards-0.002
Token and
Token ID290
Feature activation-8.54e-03

Pos logit contributions
ipping+0.000
+0.000
pill+0.000
 strawberries+0.000
 exercising+0.000
Neg logit contributions
ff-0.000
away-0.000
 woke-0.000
ot-0.000
 lived-0.000
Token handed
Token ID10158
Feature activation+5.80e-03

Pos logit contributions
ipping+0.016
+0.016
pill+0.015
 exercising+0.014
 strawberries+0.014
Neg logit contributions
ff-0.013
away-0.012
rost-0.011
rated-0.011
 woke-0.011
Token them
Token ID606
Feature activation+1.08e-02

Pos logit contributions
ff+0.009
away+0.008
rost+0.008
rated+0.008
ace+0.007
Neg logit contributions
ipping-0.011
pill-0.010
-0.010
 strawberries-0.010
wards-0.010
Token he
Token ID339
Feature activation-2.65e-03

Pos logit contributions
ff+0.019
away+0.018
rost+0.018
rated+0.017
ace+0.017
Neg logit contributions
ipping-0.015
-0.014
pill-0.014
 exercising-0.013
wards-0.012
Token had
Token ID550
Feature activation-1.23e-03

Pos logit contributions
ipping+0.005
+0.005
pill+0.005
 exercising+0.005
wards+0.005
Neg logit contributions
ff-0.004
away-0.003
rost-0.003
rated-0.003
ace-0.003
Token been
Token ID587
Feature activation-6.87e-03

Pos logit contributions
ipping+0.003
+0.003
pill+0.003
 exercising+0.002
wards+0.002
Neg logit contributions
ff-0.002
away-0.001
rost-0.001
 woke-0.001
rated-0.001
Token spoiled
Token ID34781
Feature activation+1.21e-03

Pos logit contributions
+0.013
pill+0.013
ipping+0.013
 strawberries+0.011
avored+0.011
Neg logit contributions
ff-0.010
away-0.009
 alone-0.009
 know-0.009
ace-0.009
Token and
Token ID290
Feature activation+7.01e-03

Pos logit contributions
ff+0.002
away+0.002
rost+0.002
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.002
pill-0.002
-0.002
wards-0.002
 exercising-0.002
Token left
Token ID1364
Feature activation+2.84e-02

Pos logit contributions
ff+0.011
away+0.010
rost+0.009
rated+0.009
ace+0.008
Neg logit contributions
ipping-0.014
-0.013
pill-0.013
 exercising-0.013
wards-0.012
Token his
Token ID465
Feature activation+5.87e-03

Pos logit contributions
ff+0.042
away+0.039
rost+0.037
rated+0.036
ination+0.034
Neg logit contributions
ipping-0.056
-0.053
pill-0.052
wards-0.050
 exercising-0.050
Token stuff
Token ID3404
Feature activation-4.94e-04

Pos logit contributions
ff+0.009
away+0.008
rost+0.008
rated+0.008
ace+0.007
Neg logit contributions
ipping-0.011
pill-0.011
-0.011
wards-0.010
 exercising-0.010
Token around
Token ID1088
Feature activation+1.48e-02

Pos logit contributions
ipping+0.001
pill+0.001
+0.001
wards+0.001
 strawberries+0.001
Neg logit contributions
ff-0.001
away-0.001
rost-0.001
rated-0.001
ace-0.001
Token.
Token ID13
Feature activation+6.28e-03

Pos logit contributions
ff+0.022
away+0.020
rost+0.019
rated+0.018
ace+0.018
Neg logit contributions
ipping-0.029
-0.028
pill-0.027
 exercising-0.026
wards-0.026
Token\n
Token ID198
Feature activation+1.80e-03

Pos logit contributions
ff+0.008
away+0.008
rated+0.008
rost+0.007
ace+0.007
Neg logit contributions
-0.013
ipping-0.013
pill-0.013
wards-0.012
 exercising-0.012
Token all
Token ID477
Feature activation+2.49e-02

Pos logit contributions
ipping+0.012
pill+0.012
+0.011
wards+0.011
 exercising+0.011
Neg logit contributions
ff-0.006
away-0.006
rost-0.005
 woke-0.005
rated-0.005
Token her
Token ID607
Feature activation+4.94e-04

Pos logit contributions
ff+0.031
away+0.029
rost+0.026
rated+0.025
ace+0.024
Neg logit contributions
ipping-0.055
-0.053
pill-0.052
 exercising-0.049
wards-0.049
Token spaghetti
Token ID48241
Feature activation-3.92e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
 woke+0.001
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
wards-0.001
 exercising-0.001
Token.
Token ID13
Feature activation+6.70e-04

Pos logit contributions
ipping+0.009
pill+0.009
+0.008
wards+0.008
 exercising+0.008
Neg logit contributions
away-0.004
ff-0.004
 woke-0.004
rost-0.004
 lived-0.004
Token\n
Token ID198
Feature activation-1.67e-03

Pos logit contributions
ff+0.001
away+0.001
rated+0.001
rost+0.001
ace+0.001
Neg logit contributions
-0.001
ipping-0.001
pill-0.001
wards-0.001
 exercising-0.001
Token\n
Token ID198
Feature activation-3.24e-03

Pos logit contributions
+0.003
pill+0.002
ipping+0.002
ula+0.002
wards+0.002
Neg logit contributions
ff-0.003
rated-0.003
away-0.003
ace-0.003
rost-0.003
TokenLater
Token ID18602
Feature activation-8.43e-04

Pos logit contributions
ipping+0.006
+0.006
pill+0.005
 exercising+0.005
wards+0.005
Neg logit contributions
ff-0.006
away-0.005
rost-0.005
rated-0.005
ace-0.004
Token that
Token ID326
Feature activation+1.13e-02

Pos logit contributions
pill+0.002
ipping+0.002
+0.002
pieces+0.001
 exercising+0.001
Neg logit contributions
away-0.001
 woke-0.001
ff-0.001
 lived-0.001
 God-0.001
Token night
Token ID1755
Feature activation+3.89e-03

Pos logit contributions
ff+0.019
away+0.018
rost+0.017
ace+0.016
ination+0.015
Neg logit contributions
ipping-0.020
-0.019
 exercising-0.019
 practicing-0.018
pill-0.018
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
ff+0.005
away+0.005
 woke+0.004
rost+0.004
 lived+0.004
Neg logit contributions
ipping-0.008
pill-0.008
-0.008
 strawberries-0.007
 exercising-0.007
Token Lily
Token ID20037
Feature activation+5.30e-03

Pos logit contributions
 lived+0.009
 woke+0.009
away+0.009
rost+0.008
ot+0.008
Neg logit contributions
pill-0.026
ipping-0.026
-0.023
 strawberries-0.022
 exercising-0.022
Token day
Token ID1110
Feature activation+8.23e-03

Pos logit contributions
ipping+0.007
+0.007
 exercising+0.007
pill+0.007
 strawberries+0.007
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
ination-0.006
ace-0.006
Token,
Token ID11
Feature activation+2.27e-02

Pos logit contributions
ff+0.011
away+0.010
 woke+0.010
rost+0.010
 lived+0.009
Neg logit contributions
ipping-0.017
pill-0.017
-0.016
 strawberries-0.015
 exercising-0.015
Token Will
Token ID2561
Feature activation+1.87e-02

Pos logit contributions
 woke+0.026
away+0.025
ff+0.025
rost+0.024
 lived+0.024
Neg logit contributions
ipping-0.052
pill-0.049
-0.048
 strawberries-0.044
 exercising-0.043
Token and
Token ID290
Feature activation+1.49e-02

Pos logit contributions
ff+0.026
away+0.024
rost+0.023
rated+0.022
ace+0.021
Neg logit contributions
ipping-0.038
pill-0.036
-0.036
 exercising-0.034
wards-0.034
Token his
Token ID465
Feature activation+2.28e-03

Pos logit contributions
ff+0.017
away+0.015
rost+0.013
rated+0.013
ney+0.012
Neg logit contributions
ipping-0.034
-0.032
 exercising-0.032
wards-0.032
pill-0.032
Token mom
Token ID1995
Feature activation+2.51e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.002
 woke+0.002
ace+0.002
Neg logit contributions
ipping-0.005
pill-0.005
-0.005
 exercising-0.005
wards-0.004
Token were
Token ID547
Feature activation-6.99e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
rated+0.003
ination+0.003
Neg logit contributions
ipping-0.005
-0.005
pill-0.005
 exercising-0.005
wards-0.005
Token walking
Token ID6155
Feature activation-7.15e-03

Pos logit contributions
pill+0.012
ipping+0.011
+0.011
 strawberries+0.010
ula+0.009
Neg logit contributions
ff-0.012
away-0.011
 woke-0.011
 alone-0.011
 lived-0.010
Token at
Token ID379
Feature activation+1.20e-03

Pos logit contributions
pill+0.017
ipping+0.016
+0.016
 strawberries+0.015
wards+0.015
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
rated-0.006
ace-0.006
Token the
Token ID262
Feature activation-1.46e-03

Pos logit contributions
ff+0.002
away+0.002
rated+0.002
rost+0.002
ace+0.002
Neg logit contributions
ipping-0.002
-0.002
pill-0.002
wards-0.002
 exercising-0.002
Token park
Token ID3952
Feature activation-9.49e-04

Pos logit contributions
ipping+0.002
pill+0.002
+0.002
 exercising+0.002
wards+0.002
Neg logit contributions
ff-0.003
away-0.003
rost-0.002
 woke-0.002
 lived-0.002
Token.
Token ID13
Feature activation+1.06e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ination+0.001
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
wards-0.001
 exercising-0.001
Token Now
Token ID2735
Feature activation+6.64e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.002
-0.002
pill-0.002
wards-0.002
 exercising-0.002
Token the
Token ID262
Feature activation+3.64e-03

Pos logit contributions
ff+0.010
away+0.010
rost+0.009
rated+0.009
ace+0.009
Neg logit contributions
ipping-0.012
-0.012
pill-0.012
 exercising-0.011
wards-0.011
Token farmer
Token ID18739
Feature activation-2.25e-03

Pos logit contributions
ff+0.007
away+0.006
rost+0.006
rated+0.006
ination+0.006
Neg logit contributions
ipping-0.006
-0.005
pill-0.005
 strawberries-0.005
 exercising-0.005
Token always
Token ID1464
Feature activation+1.98e-02

Pos logit contributions
ipping+0.005
pill+0.004
+0.004
 exercising+0.004
wards+0.004
Neg logit contributions
ff-0.003
away-0.003
rost-0.003
rated-0.003
ination-0.002
Token carried
Token ID5281
Feature activation+5.39e-03

Pos logit contributions
rost+0.036
ff+0.035
away+0.035
ination+0.031
ney+0.031
Neg logit contributions
ipping-0.031
 practicing-0.030
 exercising-0.028
 bouncing-0.026
-0.025
Token the
Token ID262
Feature activation+4.22e-03

Pos logit contributions
rated+0.010
ff+0.010
rost+0.009
ney+0.009
 Amen+0.009
Neg logit contributions
ipping-0.008
 strawberries-0.007
wards-0.007
 practicing-0.007
-0.007
Token penny
Token ID26369
Feature activation-1.87e-03

Pos logit contributions
ff+0.009
away+0.008
rost+0.008
rated+0.007
ace+0.007
Neg logit contributions
ipping-0.006
-0.006
pill-0.006
 exercising-0.005
wards-0.005
Token with
Token ID351
Feature activation-3.72e-03

Pos logit contributions
ipping+0.004
+0.003
pill+0.003
 exercising+0.003
 strawberries+0.003
Neg logit contributions
ff-0.003
rated-0.003
rost-0.003
away-0.002
ination-0.002
Token him
Token ID683
Feature activation-8.56e-03

Pos logit contributions
ipping+0.006
 strawberries+0.006
wards+0.006
 practicing+0.006
pill+0.006
Neg logit contributions
rated-0.006
ff-0.006
away-0.005
ney-0.005
rost-0.005
Token.
Token ID13
Feature activation+2.20e-05

Pos logit contributions
ipping+0.015
+0.014
pill+0.014
 exercising+0.013
wards+0.013
Neg logit contributions
ff-0.014
away-0.013
rost-0.013
rated-0.012
ination-0.012
Token so
Token ID523
Feature activation+6.90e-03

Pos logit contributions
ipping+0.004
+0.004
pill+0.004
wards+0.003
 exercising+0.003
Neg logit contributions
ff-0.003
away-0.003
rost-0.003
rated-0.003
ination-0.003
Token thoughtful
Token ID22677
Feature activation-7.62e-04

Pos logit contributions
away+0.012
rost+0.012
ff+0.011
rated+0.011
ination+0.011
Neg logit contributions
wards-0.011
pill-0.010
ipping-0.010
 practicing-0.010
 exercising-0.010
Token that
Token ID326
Feature activation+7.23e-03

Pos logit contributions
ipping+0.002
+0.002
pill+0.002
 sidelines+0.002
wards+0.002
Neg logit contributions
away-0.001
ff-0.001
 woke-0.001
 lived-0.001
rost-0.001
Token he
Token ID339
Feature activation+9.04e-04

Pos logit contributions
ff+0.013
away+0.012
rost+0.011
rated+0.011
ination+0.011
Neg logit contributions
ipping-0.012
-0.011
pill-0.011
 exercising-0.011
wards-0.011
Token even
Token ID772
Feature activation+7.64e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.002
pill-0.002
-0.002
 exercising-0.002
 strawberries-0.002
Token brought
Token ID3181
Feature activation+1.21e-02

Pos logit contributions
ff+0.012
away+0.011
rost+0.010
rated+0.010
ination+0.010
Neg logit contributions
ipping-0.015
pill-0.014
-0.013
 exercising-0.013
 practicing-0.013
Token Lucy
Token ID22162
Feature activation-7.97e-03

Pos logit contributions
ff+0.018
away+0.017
rated+0.017
rost+0.016
ination+0.016
Neg logit contributions
ipping-0.022
pill-0.021
wards-0.021
 strawberries-0.021
-0.021
Token presents
Token ID10969
Feature activation+1.47e-02

Pos logit contributions
ipping+0.017
+0.016
pill+0.016
 exercising+0.015
wards+0.015
Neg logit contributions
ff-0.011
away-0.010
rost-0.009
ace-0.009
rated-0.009
Token to
Token ID284
Feature activation+4.81e-03

Pos logit contributions
ff+0.018
away+0.018
rost+0.016
 woke+0.015
ace+0.015
Neg logit contributions
ipping-0.032
pill-0.031
-0.030
 exercising-0.028
wards-0.028
Token show
Token ID905
Feature activation+1.39e-02

Pos logit contributions
ff+0.008
rated+0.007
away+0.007
rost+0.007
ney+0.007
Neg logit contributions
-0.008
ipping-0.008
wards-0.008
 exercising-0.008
 practicing-0.007
Token how
Token ID703
Feature activation+9.67e-03

Pos logit contributions
ff+0.020
rost+0.018
away+0.018
rated+0.017
ney+0.017
Neg logit contributions
ipping-0.027
pill-0.026
wards-0.026
-0.025
 strawberries-0.025
Token the
Token ID262
Feature activation+1.23e-02

Pos logit contributions
ipping+0.009
+0.009
pill+0.009
wards+0.009
 strawberries+0.008
Neg logit contributions
ff-0.006
away-0.006
rated-0.006
rost-0.005
ination-0.005
Token pot
Token ID1787
Feature activation+1.21e-02

Pos logit contributions
ff+0.021
away+0.020
rost+0.019
rated+0.018
 woke+0.018
Neg logit contributions
ipping-0.022
pill-0.021
-0.021
 exercising-0.019
wards-0.019
Token from
Token ID422
Feature activation+1.94e-02

Pos logit contributions
ff+0.018
rost+0.017
away+0.017
ination+0.017
rated+0.017
Neg logit contributions
ipping-0.022
-0.020
pill-0.019
wards-0.019
 strawberries-0.019
Token the
Token ID262
Feature activation+1.12e-02

Pos logit contributions
ff+0.026
away+0.022
rated+0.022
rost+0.021
ace+0.020
Neg logit contributions
ipping-0.042
-0.040
wards-0.038
pill-0.038
 exercising-0.038
Token stove
Token ID27635
Feature activation+2.25e-02

Pos logit contributions
ff+0.018
away+0.017
rost+0.016
 woke+0.015
rated+0.015
Neg logit contributions
ipping-0.021
pill-0.020
-0.019
 exercising-0.018
wards-0.018
Token.
Token ID13
Feature activation+1.43e-03

Pos logit contributions
ff+0.033
away+0.031
rost+0.029
rated+0.028
ace+0.027
Neg logit contributions
ipping-0.044
-0.042
pill-0.041
 exercising-0.039
wards-0.039
Token He
Token ID679
Feature activation-1.51e-03

Pos logit contributions
ff+0.002
away+0.002
rated+0.002
rost+0.001
ace+0.001
Neg logit contributions
ipping-0.003
-0.003
pill-0.003
wards-0.003
 exercising-0.003
Token carried
Token ID5281
Feature activation+1.08e-02

Pos logit contributions
ipping+0.003
pill+0.003
+0.003
 strawberries+0.003
 exercising+0.002
Neg logit contributions
ff-0.002
away-0.002
rated-0.002
rost-0.002
ace-0.002
Token it
Token ID340
Feature activation+1.09e-02

Pos logit contributions
ff+0.020
rated+0.019
away+0.018
ney+0.018
grand+0.017
Neg logit contributions
ipping-0.016
 strawberries-0.016
wards-0.015
-0.015
ula-0.015
Token to
Token ID284
Feature activation+1.18e-02

Pos logit contributions
ff+0.019
rated+0.017
rost+0.016
 Amen+0.016
away+0.016
Neg logit contributions
ipping-0.018
 exercising-0.017
 practicing-0.016
-0.016
 strawberries-0.016
Token the
Token ID262
Feature activation+1.50e-02

Pos logit contributions
ff+0.019
rated+0.018
away+0.016
ney+0.016
ace+0.015
Neg logit contributions
ipping-0.021
wards-0.021
 exercising-0.021
-0.021
 strawberries-0.020
Token whistle
Token ID16121
Feature activation+8.51e-04

Pos logit contributions
ff+0.009
away+0.008
rost+0.008
rated+0.008
ination+0.007
Neg logit contributions
ipping-0.009
-0.009
pill-0.008
 exercising-0.008
wards-0.008
Token very
Token ID845
Feature activation+6.99e-03

Pos logit contributions
ff+0.001
rost+0.001
away+0.001
ace+0.001
ination+0.001
Neg logit contributions
ipping-0.001
pill-0.001
 exercising-0.001
-0.001
 practicing-0.001
Token well
Token ID880
Feature activation+5.63e-03

Pos logit contributions
ff+0.011
rost+0.010
ace+0.010
grand+0.010
gent+0.009
Neg logit contributions
ipping-0.012
-0.011
pill-0.011
oming-0.010
 exercising-0.010
Token,"
Token ID553
Feature activation-1.51e-03

Pos logit contributions
ff+0.007
away+0.006
rost+0.006
rated+0.006
ace+0.005
Neg logit contributions
ipping-0.012
-0.012
pill-0.012
 exercising-0.011
 strawberries-0.011
Token Lily
Token ID20037
Feature activation+1.99e-03

Pos logit contributions
ipping+0.003
+0.003
pill+0.003
 exercising+0.003
wards+0.003
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
rated-0.002
 woke-0.002
Token said
Token ID531
Feature activation-1.33e-02

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
rated+0.003
ace+0.003
Neg logit contributions
ipping-0.003
pill-0.003
-0.003
 exercising-0.003
 strawberries-0.003
Token.
Token ID13
Feature activation-3.90e-03

Pos logit contributions
ipping+0.027
pill+0.026
+0.026
 strawberries+0.024
 exercising+0.024
Neg logit contributions
ff-0.019
away-0.017
rost-0.016
rated-0.015
ace-0.015
Token "
Token ID366
Feature activation+1.55e-02

Pos logit contributions
ipping+0.008
+0.007
pill+0.007
 exercising+0.007
wards+0.007
Neg logit contributions
ff-0.006
away-0.005
rost-0.005
rated-0.005
ace-0.005
TokenHow
Token ID2437
Feature activation-7.31e-03

Pos logit contributions
ff+0.028
rated+0.025
away+0.025
rost+0.025
ace+0.024
Neg logit contributions
-0.025
ipping-0.024
pill-0.024
ula-0.022
wards-0.021
Token does
Token ID857
Feature activation+1.92e-02

Pos logit contributions
 surfing+0.011
ipping+0.011
+0.011
pill+0.011
 exercising+0.011
Neg logit contributions
away-0.012
ff-0.011
ace-0.011
rated-0.011
eyed-0.011
Token it
Token ID340
Feature activation+1.97e-03

Pos logit contributions
ff+0.029
away+0.027
rost+0.026
rated+0.025
ace+0.024
Neg logit contributions
ipping-0.036
-0.035
pill-0.035
 exercising-0.033
wards-0.033
Token We
Token ID775
Feature activation+3.30e-04

Pos logit contributions
ipping+0.014
+0.014
pill+0.014
wards+0.013
 exercising+0.013
Neg logit contributions
ff-0.011
away-0.011
rost-0.010
rated-0.009
 woke-0.009
Token'll
Token ID1183
Feature activation+2.33e-02

Pos logit contributions
ff+0.000
away+0.000
rost+0.000
rated+0.000
ination+0.000
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
 exercising-0.001
 practicing-0.001
Token help
Token ID1037
Feature activation+6.13e-03

Pos logit contributions
ff+0.033
away+0.030
rost+0.028
rated+0.028
ination+0.025
Neg logit contributions
ipping-0.047
-0.046
pill-0.043
 exercising-0.043
wards-0.042
Token you
Token ID345
Feature activation+1.35e-02

Pos logit contributions
ff+0.009
away+0.009
rost+0.008
rated+0.008
 know+0.008
Neg logit contributions
ipping-0.012
pill-0.011
-0.011
wards-0.010
 strawberries-0.010
Token!"
Token ID2474
Feature activation-1.19e-03

Pos logit contributions
ff+0.019
away+0.017
rost+0.016
rated+0.015
 woke+0.015
Neg logit contributions
ipping-0.028
-0.026
pill-0.025
 exercising-0.025
wards-0.025
Token So
Token ID1406
Feature activation-1.92e-02

Pos logit contributions
ipping+0.003
 strawberries+0.003
 exercising+0.003
+0.003
pill+0.003
Neg logit contributions
ff-0.001
away-0.001
ot-0.001
 know-0.001
 lived-0.001
Token Spike
Token ID26309
Feature activation-3.23e-03

Pos logit contributions
ipping+0.042
pill+0.042
+0.039
 strawberries+0.038
 exercising+0.037
Neg logit contributions
ot-0.022
away-0.022
 lived-0.021
ff-0.020
 its-0.020
Token jumped
Token ID11687
Feature activation-1.55e-02

Pos logit contributions
ipping+0.007
+0.007
pill+0.007
 exercising+0.006
 strawberries+0.006
Neg logit contributions
ff-0.004
away-0.004
 woke-0.003
 lived-0.003
rost-0.003
Token on
Token ID319
Feature activation-4.52e-03

Pos logit contributions
pill+0.035
ipping+0.035
+0.034
 exercising+0.032
 strawberries+0.032
Neg logit contributions
ff-0.018
away-0.018
rost-0.014
ot-0.014
ace-0.014
Token the
Token ID262
Feature activation-6.24e-04

Pos logit contributions
ipping+0.010
pill+0.010
+0.009
 strawberries+0.009
 exercising+0.009
Neg logit contributions
away-0.005
ff-0.004
 its-0.004
rost-0.004
ot-0.004
Token tr
Token ID491
Feature activation-1.37e-02

Pos logit contributions
ipping+0.001
+0.001
pill+0.001
 exercising+0.001
wards+0.001
Neg logit contributions
away-0.001
ff-0.001
igator-0.001
rost-0.001
 woke-0.001
Token.
Token ID13
Feature activation+1.66e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
rated+0.003
ace+0.003
Neg logit contributions
ipping-0.005
pill-0.005
-0.005
wards-0.005
 exercising-0.005
Token She
Token ID1375
Feature activation-3.12e-03

Pos logit contributions
away+0.003
ff+0.003
rated+0.002
rost+0.002
ney+0.002
Neg logit contributions
-0.003
ipping-0.003
pill-0.003
wards-0.003
 strawberries-0.003
Token said
Token ID531
Feature activation+6.64e-03

Pos logit contributions
ipping+0.007
pill+0.006
+0.006
 exercising+0.006
 strawberries+0.006
Neg logit contributions
ff-0.004
away-0.004
rost-0.003
rated-0.003
ace-0.003
Token,
Token ID11
Feature activation-2.80e-03

Pos logit contributions
ff+0.011
away+0.010
rated+0.010
rost+0.010
ace+0.009
Neg logit contributions
ipping-0.012
pill-0.011
-0.011
 strawberries-0.011
 exercising-0.011
Token "
Token ID366
Feature activation+3.11e-03

Pos logit contributions
ipping+0.005
+0.005
pill+0.004
 exercising+0.004
wards+0.004
Neg logit contributions
ff-0.005
away-0.005
rated-0.004
rost-0.004
ace-0.004
TokenNo
Token ID2949
Feature activation-1.82e-02

Pos logit contributions
ff+0.005
away+0.005
rost+0.004
rated+0.004
ace+0.004
Neg logit contributions
ipping-0.006
pill-0.006
-0.006
 exercising-0.005
 strawberries-0.005
Token,
Token ID11
Feature activation-2.04e-02

Pos logit contributions
pill+0.049
ipping+0.048
+0.048
oured+0.046
oming+0.045
Neg logit contributions
 know-0.012
ff-0.011
 woke-0.010
ot-0.009
away-0.009
Token you
Token ID345
Feature activation-1.89e-03

Pos logit contributions
ipping+0.039
pill+0.037
+0.037
wards+0.034
 exercising+0.034
Neg logit contributions
ff-0.031
away-0.029
rost-0.028
 woke-0.026
rated-0.026
Token can
Token ID460
Feature activation-5.45e-03

Pos logit contributions
ipping+0.004
+0.004
pill+0.004
 exercising+0.004
 strawberries+0.004
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
rated-0.002
ace-0.002
Token't
Token ID470
Feature activation-5.24e-03

Pos logit contributions
ipping+0.011
+0.010
pill+0.010
 exercising+0.009
wards+0.009
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
rated-0.007
ace-0.007
Token play
Token ID711
Feature activation-3.90e-04

Pos logit contributions
ipping+0.011
+0.010
pill+0.010
 exercising+0.009
wards+0.009
Neg logit contributions
ff-0.007
away-0.007
rost-0.006
rated-0.006
ace-0.006
Token not
Token ID407
Feature activation-2.53e-03

Pos logit contributions
pill+0.003
ipping+0.003
+0.002
 strawberries+0.002
avored+0.002
Neg logit contributions
away-0.000
 know-0.000
 its-0.000
ination-0.000
rost-0.000
Token listen
Token ID6004
Feature activation+4.38e-03

Pos logit contributions
ipping+0.006
 exercising+0.006
+0.006
pill+0.005
 practicing+0.005
Neg logit contributions
ff-0.003
away-0.002
rost-0.002
rated-0.002
ney-0.002
Token to
Token ID284
Feature activation-6.84e-03

Pos logit contributions
ff+0.006
away+0.006
rost+0.005
rated+0.005
ace+0.005
Neg logit contributions
ipping-0.009
-0.008
pill-0.008
 exercising-0.008
wards-0.008
Token their
Token ID511
Feature activation-5.16e-03

Pos logit contributions
ipping+0.013
+0.012
pill+0.012
 exercising+0.011
wards+0.011
Neg logit contributions
ff-0.011
away-0.010
rost-0.010
rated-0.009
ace-0.009
Token mom
Token ID1995
Feature activation-1.19e-02

Pos logit contributions
pill+0.014
ipping+0.013
+0.013
wards+0.012
 exercising+0.012
Neg logit contributions
away-0.004
ff-0.004
rost-0.004
ot-0.004
 know-0.003
Token,
Token ID11
Feature activation-1.77e-02

Pos logit contributions
pill+0.028
ipping+0.028
+0.028
 strawberries+0.026
wards+0.026
Neg logit contributions
ff-0.012
away-0.011
rost-0.010
 woke-0.010
 know-0.010
Token who
Token ID508
Feature activation-1.35e-02

Pos logit contributions
ipping+0.039
+0.038
pill+0.037
wards+0.034
 strawberries+0.033
Neg logit contributions
ff-0.021
away-0.020
rost-0.019
 woke-0.018
 know-0.018
Token told
Token ID1297
Feature activation-2.36e-03

Pos logit contributions
ipping+0.028
+0.027
pill+0.027
 exercising+0.026
 strawberries+0.025
Neg logit contributions
ff-0.018
away-0.017
rost-0.015
rated-0.015
ace-0.015
Token them
Token ID606
Feature activation+4.89e-03

Pos logit contributions
ipping+0.005
+0.004
pill+0.004
 strawberries+0.004
wards+0.004
Neg logit contributions
ff-0.004
away-0.003
rated-0.003
rost-0.003
ney-0.003
Token to
Token ID284
Feature activation-8.91e-03

Pos logit contributions
ff+0.007
away+0.006
rost+0.006
 know+0.006
 woke+0.006
Neg logit contributions
ipping-0.010
-0.010
pill-0.010
 exercising-0.009
wards-0.009
Token be
Token ID307
Feature activation-1.87e-03

Pos logit contributions
ipping+0.017
+0.017
 exercising+0.015
wards+0.014
 practicing+0.014
Neg logit contributions
ff-0.014
away-0.012
rated-0.012
rost-0.012
ney-0.011
Token'm
Token ID1101
Feature activation+8.36e-04

Pos logit contributions
away+0.008
ff+0.008
ace+0.007
rost+0.007
ination+0.007
Neg logit contributions
-0.008
ipping-0.008
pill-0.008
 exercising-0.007
wards-0.007
Token so
Token ID523
Feature activation+5.89e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
ace+0.001
rated+0.001
Neg logit contributions
ipping-0.002
pill-0.002
-0.002
 strawberries-0.002
 exercising-0.002
Token sorry
Token ID7926
Feature activation-3.49e-04

Pos logit contributions
ff+0.011
rated+0.011
away+0.011
rost+0.011
ace+0.011
Neg logit contributions
ipping-0.008
-0.007
 exercising-0.007
 practicing-0.007
pill-0.007
Token.
Token ID13
Feature activation+3.80e-03

Pos logit contributions
ipping+0.001
+0.001
pill+0.001
wards+0.001
 exercising+0.001
Neg logit contributions
ff-0.001
away-0.000
rost-0.000
rated-0.000
ace-0.000
Token That
Token ID1320
Feature activation-1.44e-02

Pos logit contributions
away+0.008
ff+0.008
rated+0.008
rost+0.007
ination+0.007
Neg logit contributions
pill-0.004
-0.004
wards-0.004
ipping-0.004
 exercising-0.004
Token bridge
Token ID7696
Feature activation-1.45e-02

Pos logit contributions
ipping+0.033
+0.032
pill+0.031
wards+0.029
 exercising+0.029
Neg logit contributions
ff-0.017
away-0.015
 woke-0.014
rost-0.014
rated-0.014
Token was
Token ID373
Feature activation-1.16e-02

Pos logit contributions
ipping+0.032
+0.030
pill+0.030
 exercising+0.029
wards+0.028
Neg logit contributions
ff-0.018
away-0.017
rost-0.015
rated-0.014
ace-0.014
Token old
Token ID1468
Feature activation+4.57e-03

Pos logit contributions
ipping+0.022
pill+0.021
+0.021
 exercising+0.020
 strawberries+0.020
Neg logit contributions
ff-0.018
away-0.017
rost-0.015
rated-0.015
ace-0.014
Token and
Token ID290
Feature activation+1.88e-03

Pos logit contributions
ff+0.006
away+0.005
rost+0.005
rated+0.004
 woke+0.004
Neg logit contributions
ipping-0.010
-0.010
pill-0.010
wards-0.009
 exercising-0.009
Token dangerous
Token ID4923
Feature activation-3.80e-03

Pos logit contributions
ff+0.003
away+0.003
rost+0.003
ace+0.003
rated+0.003
Neg logit contributions
ipping-0.003
 exercising-0.003
-0.003
pill-0.003
 practicing-0.003
Token.
Token ID13
Feature activation+2.15e-03

Pos logit contributions
ipping+0.008
+0.007
pill+0.007
 exercising+0.007
wards+0.007
Neg logit contributions
ff-0.005
away-0.005
rost-0.005
rated-0.004
ace-0.004
Token She
Token ID1375
Feature activation-4.78e-03

Pos logit contributions
+0.010
ipping+0.009
pill+0.009
wards+0.009
 exercising+0.008
Neg logit contributions
ff-0.008
away-0.008
rost-0.007
rated-0.007
igator-0.006
Token saw
Token ID2497
Feature activation-4.61e-03

Pos logit contributions
ipping+0.009
pill+0.008
 exercising+0.008
+0.008
 practicing+0.008
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.007
ination-0.006
Token lots
Token ID6041
Feature activation+1.37e-02

Pos logit contributions
ipping+0.008
pill+0.008
 strawberries+0.008
+0.007
wards+0.007
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.007
ination-0.006
Token of
Token ID286
Feature activation-4.45e-03

Pos logit contributions
away+0.013
ff+0.013
igator+0.011
ot+0.010
 know+0.010
Neg logit contributions
-0.034
ipping-0.032
wards-0.030
pill-0.029
 practicing-0.029
Token snow
Token ID6729
Feature activation-1.07e-02

Pos logit contributions
ipping+0.008
+0.007
pill+0.007
 exercising+0.007
wards+0.007
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.006
ace-0.006
Tokenfl
Token ID2704
Feature activation-5.83e-02

Pos logit contributions
pill+0.023
pieces+0.022
+0.022
ipping+0.022
wards+0.021
Neg logit contributions
ff-0.012
away-0.011
 lived-0.011
 woke-0.011
ney-0.010
Tokenakes
Token ID1124
Feature activation+2.77e-03

Pos logit contributions
wards+0.111
 exercising+0.111
 strawberries+0.110
ipping+0.109
pill+0.109
Neg logit contributions
ff-0.079
ace-0.063
ney-0.062
away-0.061
 lived-0.059
Token,
Token ID11
Feature activation-9.91e-03

Pos logit contributions
ff+0.003
away+0.003
 woke+0.003
 lived+0.002
rost+0.002
Neg logit contributions
pill-0.006
-0.006
ipping-0.006
pieces-0.006
wards-0.006
Token glitter
Token ID31133
Feature activation+8.12e-03

Pos logit contributions
ipping+0.020
+0.019
pill+0.018
wards+0.017
 exercising+0.017
Neg logit contributions
ff-0.014
away-0.013
rost-0.013
rated-0.012
 woke-0.012
Tokening
Token ID278
Feature activation+5.12e-03

Pos logit contributions
ff+0.010
away+0.009
ace+0.008
 woke+0.007
ot+0.007
Neg logit contributions
-0.018
ipping-0.018
pill-0.018
wards-0.017
 exercising-0.017
Token in
Token ID287
Feature activation+1.31e-02

Pos logit contributions
ff+0.007
away+0.007
rost+0.006
rated+0.006
 woke+0.006
Neg logit contributions
ipping-0.010
-0.010
pill-0.010
 exercising-0.009
wards-0.009
Token the
Token ID262
Feature activation+1.66e-03

Pos logit contributions
ipping+0.043
+0.040
pill+0.040
wards+0.038
 exercising+0.037
Neg logit contributions
ff-0.031
away-0.029
rost-0.027
rated-0.025
ace-0.025
Token tree
Token ID5509
Feature activation-2.66e-02

Pos logit contributions
ff+0.002
away+0.002
rost+0.002
rated+0.002
ace+0.002
Neg logit contributions
ipping-0.003
-0.003
pill-0.003
 exercising-0.003
wards-0.003
Token and
Token ID290
Feature activation+1.96e-03

Pos logit contributions
ipping+0.049
+0.047
pill+0.047
 exercising+0.044
 strawberries+0.044
Neg logit contributions
ff-0.042
away-0.038
rost-0.037
rated-0.035
ination-0.034
Token eat
Token ID4483
Feature activation-1.12e-02

Pos logit contributions
ff+0.003
away+0.003
rated+0.003
ination+0.003
rost+0.003
Neg logit contributions
 strawberries-0.003
-0.003
 exercising-0.003
 swinging-0.003
ipping-0.003
Token the
Token ID262
Feature activation-5.92e-03

Pos logit contributions
ipping+0.020
+0.019
pill+0.019
 strawberries+0.018
 exercising+0.017
Neg logit contributions
ff-0.019
away-0.017
rost-0.016
rated-0.016
ace-0.016
Token pr
Token ID778
Feature activation-4.71e-02

Pos logit contributions
ipping+0.010
+0.009
pill+0.009
wards+0.009
 exercising+0.008
Neg logit contributions
ff-0.011
away-0.010
rost-0.010
rated-0.009
 woke-0.009
Tokenunes
Token ID4015
Feature activation-3.89e-02

Pos logit contributions
ipping+0.065
+0.059
pill+0.059
 exercising+0.058
wards+0.058
Neg logit contributions
ff-0.096
away-0.089
rost-0.087
ace-0.083
rated-0.082
Token.
Token ID13
Feature activation+6.13e-03

Pos logit contributions
ipping+0.076
+0.072
pill+0.072
 exercising+0.068
 strawberries+0.067
Neg logit contributions
ff-0.058
away-0.054
rost-0.050
rated-0.049
ace-0.047
Token They
Token ID1119
Feature activation+4.92e-04

Pos logit contributions
away+0.010
ff+0.010
rated+0.010
ace+0.009
igator+0.009
Neg logit contributions
-0.010
pill-0.010
ipping-0.009
wards-0.009
 strawberries-0.009
Token say
Token ID910
Feature activation+5.34e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ination+0.001
Neg logit contributions
ipping-0.001
pill-0.001
-0.001
 strawberries-0.001
 exercising-0.001
Token the
Token ID262
Feature activation+2.43e-03

Pos logit contributions
ff+0.011
away+0.011
rost+0.010
rated+0.010
 woke+0.010
Neg logit contributions
ipping-0.007
-0.007
pill-0.007
 exercising-0.006
wards-0.006
Token didn
Token ID1422
Feature activation+5.67e-03

Pos logit contributions
ipping+0.004
+0.004
pill+0.004
 exercising+0.003
wards+0.003
Neg logit contributions
ff-0.002
away-0.002
rost-0.002
rated-0.002
ace-0.002
Token't
Token ID470
Feature activation+7.79e-03

Pos logit contributions
ff+0.008
away+0.008
rost+0.007
rated+0.007
ace+0.007
Neg logit contributions
ipping-0.011
pill-0.011
-0.011
 exercising-0.010
 strawberries-0.010
Token forget
Token ID6044
Feature activation+7.92e-04

Pos logit contributions
ff+0.010
away+0.009
rost+0.009
rated+0.008
ination+0.008
Neg logit contributions
ipping-0.017
-0.016
pill-0.016
 exercising-0.015
wards-0.015
Token her
Token ID607
Feature activation-8.46e-03

Pos logit contributions
ff+0.001
away+0.001
rost+0.001
rated+0.001
ace+0.001
Neg logit contributions
ipping-0.001
-0.001
pill-0.001
 exercising-0.001
 strawberries-0.001
Token sun
Token ID4252
Feature activation-3.92e-02

Pos logit contributions
ipping+0.015
pill+0.015
+0.014
wards+0.013
 exercising+0.013
Neg logit contributions
ff-0.013
away-0.013
rost-0.012
rated-0.012
 woke-0.011
Tokenflower
Token ID25547
Feature activation-5.19e-02

Pos logit contributions
ipping+0.056
pill+0.055
+0.052
wards+0.049
 exercising+0.048
Neg logit contributions
ff-0.077
away-0.073
rated-0.069
rost-0.069
ace-0.068
Token.
Token ID13
Feature activation+4.13e-03

Pos logit contributions
ipping+0.113
pill+0.111
+0.106
ula+0.102
wards+0.101
Neg logit contributions
ff-0.062
away-0.059
 woke-0.056
rated-0.055
 lived-0.054
Token<|endoftext|>
Token ID50256
Feature activation+2.50e-03

Pos logit contributions
ff+0.006
away+0.006
rated+0.005
rost+0.005
igator+0.005
Neg logit contributions
-0.008
pill-0.008
ipping-0.008
wards-0.007
 exercising-0.007
TokenOnce
Token ID7454
Feature activation+1.27e-03

Pos logit contributions
ff+0.004
away+0.004
rated+0.004
rost+0.003
ace+0.003
Neg logit contributions
-0.005
ipping-0.005
pill-0.004
wards-0.004
 exercising-0.004
Token upon
Token ID2402
Feature activation+1.49e-02

Pos logit contributions
ff+0.002
away+0.002
rated+0.002
ination+0.002
rost+0.002
Neg logit contributions
ipping-0.002
-0.002
pill-0.002
 exercising-0.002
 strawberries-0.002
Token a
Token ID257
Feature activation-2.08e-02

Pos logit contributions
ff+0.025
away+0.023
rost+0.022
rated+0.022
ace+0.020
Neg logit contributions
ipping-0.026
-0.025
pill-0.025
 exercising-0.023
wards-0.023
Token hello
Token ID23748
Feature activation+3.19e-03

Pos logit contributions
ff+0.017
away+0.016
rost+0.015
rated+0.015
ney+0.014
Neg logit contributions
ipping-0.018
-0.017
pill-0.016
wards-0.016
 strawberries-0.016
Token to
Token ID284
Feature activation-5.62e-03

Pos logit contributions
ff+0.005
away+0.005
rost+0.004
rated+0.004
ination+0.004
Neg logit contributions
ipping-0.006
-0.006
pill-0.006
wards-0.005
 exercising-0.005
Token the
Token ID262
Feature activation+1.07e-03

Pos logit contributions
ipping+0.010
wards+0.009
+0.009
 strawberries+0.009
pill+0.009
Neg logit contributions
ff-0.009
away-0.009
rost-0.008
rated-0.008
ney-0.008
Token other
Token ID584
Feature activation-2.84e-03

Pos logit contributions
ff+0.002
away+0.002
rost+0.002
rated+0.002
ace+0.002
Neg logit contributions
ipping-0.002
-0.002
pill-0.002
 exercising-0.002
wards-0.002
Token sun
Token ID4252
Feature activation-2.39e-02

Pos logit contributions
ipping+0.005
+0.005
pill+0.005
ula+0.004
 exercising+0.004
Neg logit contributions
ff-0.005
away-0.005
rated-0.004
 lived-0.004
rost-0.004
Tokenfl
Token ID2704
Feature activation-5.90e-02

Pos logit contributions
ipping+0.038
pill+0.037
+0.036
 exercising+0.033
ula+0.033
Neg logit contributions
ff-0.044
away-0.041
rated-0.039
rost-0.038
 woke-0.038
Tokenowers
Token ID3618
Feature activation-1.17e-02

Pos logit contributions
ipping+0.082
 exercising+0.078
pill+0.078
ula+0.077
wards+0.075
Neg logit contributions
ff-0.116
away-0.103
ney-0.098
rost-0.097
ace-0.096
Token around
Token ID1088
Feature activation+5.53e-03

Pos logit contributions
ipping+0.025
+0.024
pill+0.024
ula+0.022
 exercising+0.022
Neg logit contributions
ff-0.015
away-0.014
 lived-0.012
 woke-0.012
rost-0.012
Token it
Token ID340
Feature activation+2.52e-04

Pos logit contributions
ff+0.008
away+0.007
rost+0.007
rated+0.007
ace+0.007
Neg logit contributions
ipping-0.011
-0.010
pill-0.010
wards-0.010
 exercising-0.010
Token.
Token ID13
Feature activation-7.00e-03

Pos logit contributions
ff+0.000
away+0.000
rost+0.000
rated+0.000
ace+0.000
Neg logit contributions
ipping-0.000
-0.000
pill-0.000
 exercising-0.000
wards-0.000
Token Soon
Token ID15894
Feature activation-1.76e-02

Pos logit contributions
ipping+0.016
pill+0.015
+0.015
 exercising+0.015
 strawberries+0.015
Neg logit contributions
ff-0.008
away-0.007
rost-0.007
rated-0.006
ination-0.006
Token doesn
Token ID1595
Feature activation+1.05e-02

Pos logit contributions
ipping+0.058
wards+0.052
pill+0.052
+0.051
ula+0.051
Neg logit contributions
 woke-0.017
ff-0.016
 lived-0.016
 know-0.014
rated-0.014
Token't
Token ID470
Feature activation+1.50e-02

Pos logit contributions
ff+0.017
away+0.016
rost+0.015
rated+0.015
ace+0.014
Neg logit contributions
ipping-0.019
-0.019
pill-0.018
 exercising-0.017
wards-0.017
Token taste
Token ID6938
Feature activation-6.82e-03

Pos logit contributions
ff+0.028
away+0.025
ney+0.024
rost+0.024
ace+0.023
Neg logit contributions
ipping-0.023
 strawberries-0.022
 exercising-0.022
-0.022
pill-0.021
Token like
Token ID588
Feature activation-9.62e-04

Pos logit contributions
ipping+0.013
+0.012
pill+0.012
 exercising+0.012
 strawberries+0.012
Neg logit contributions
ff-0.010
away-0.010
rost-0.009
rated-0.009
ace-0.008
Token the
Token ID262
Feature activation-8.51e-03

Pos logit contributions
ipping+0.002
+0.002
pill+0.002
wards+0.002
 exercising+0.001
Neg logit contributions
ff-0.001
away-0.001
rost-0.001
 woke-0.001
ot-0.001
Token grape
Token ID30777
Feature activation-5.20e-02

Pos logit contributions
ipping+0.016
pill+0.014
+0.014
wards+0.014
oured+0.013
Neg logit contributions
rated-0.012
ff-0.012
 value-0.011
 know-0.011
igator-0.011
Token juice
Token ID13135
Feature activation-2.40e-02

Pos logit contributions
ipping+0.100
+0.093
pill+0.092
wards+0.088
 exercising+0.086
Neg logit contributions
ff-0.078
away-0.074
rost-0.068
rated-0.068
 woke-0.066
Token we
Token ID356
Feature activation-1.10e-02

Pos logit contributions
ipping+0.050
+0.047
pill+0.047
 exercising+0.044
 strawberries+0.044
Neg logit contributions
ff-0.033
away-0.031
rost-0.028
rated-0.027
ace-0.026
Token have
Token ID423
Feature activation-1.10e-02

Pos logit contributions
ipping+0.022
+0.022
pill+0.022
 strawberries+0.021
 exercising+0.021
Neg logit contributions
ff-0.014
away-0.014
rost-0.013
rated-0.012
ace-0.012
Token at
Token ID379
Feature activation-1.90e-04

Pos logit contributions
ipping+0.023
+0.022
pill+0.022
wards+0.021
 exercising+0.020
Neg logit contributions
ff-0.015
away-0.014
rost-0.013
rated-0.012
 woke-0.012
Token home
Token ID1363
Feature activation-1.89e-03

Pos logit contributions
ipping+0.000
+0.000
 exercising+0.000
 strawberries+0.000
pill+0.000
Neg logit contributions
ff-0.000
away-0.000
rost-0.000
rated-0.000
ney-0.000