TOP ACTIVATIONS
MAX = 11.590

would take out a j igsaw puzzle and try to solve
It was a big j igsaw puzzle and she wanted to
The town spe ople had to find a
box and saw a j igsaw puzzle inside . He paused
Lily loved the big Fer ris wheel the most . She
she bowed , the town spe ople stopped to admire her
He went on the fer ris wheel , the roller coaster
's go on the fer ris wheel and see the whole
they went on the fer ris wheel together , and Lily
saw the great big Fer ris wheel and all the other
day ever ! <|endoftext|> Mor ris was very excited . Today
out for a walk with Sil via wearing her new coat
loved to play make - bel ieve . Today she was
! The town spe ople were sad and scared
happy and the nation pro spe red . <|endoftext|> Once upon
. ” Ell ie happily stayed and watched
itty wanted to introduce Dog gy to her ball . She
went on with his mis chie vous ways , causing problems
He loved playing make - bel ieve with his friends .
He would race make - bel ieve cars around the track

MIN ACTIVATIONS
MIN = -5.449

a modest bench under a shady tree and sat down .
not too sunny or too shady , and not too crowded
a wooden seat under a shady tree . She climbed onto
the two friends saw a shady looking man lurking near the
said , pointing to a shady spot . They
break ." They found a shady spot and sat on the
take a nap under a shady tree . While
to move him to a shady spot , but he was
they could react , the shady man reached out and grabbed
biting . He finds a shady spot under a tree and
She settled down under a shady rock and watched the fish
He spotted a shady spot , next to some
and went to find a shady spot under a big tree
She pointed to a shady looking man at the end
excited and quickly found a shady spot to relax . He
over to a cool and shady spot . There were big
It was very tall and shady . Lily and Tom wanted
felt tired , so he lay on the floor , and
arrived , they found a shady spot to sit in and
the ostr ich to a shady tree and kept him safe

INTERVAL 7.330 - 11.590
CONTAINS 0.000%

would take out a j igsaw puzzle and try to solve
she bowed , the town spe ople stopped to admire her
It was a big j igsaw puzzle and she wanted to
Lily loved the big Fer ris wheel the most . She
box and saw a j igsaw puzzle inside . He paused

INTERVAL 3.071 - 7.330
CONTAINS 0.211%

her monkey the name " B AN ANA " and her
N ora was a fast lear ner . Soon enough she
Ben say they were making art with the ink . Mom
to help ! We ’ re all in this
On this ranch lived two brothers ; Joe and Jim .

INTERVAL -1.189 - 3.071
CONTAINS 89.488%

. The bug said " hello ". But the girl
said , " Look at our tower ! It 's so
the soup ?" asked Lily . " Of course
it was okay , and that accidents happen . She took
the string , he put the top back together . Sam

INTERVAL -5.449 - -1.189
CONTAINS 10.301%

game on her phone when she noticed that her eyes started
, fluffy pillow in her hands . The bear
work hard , success will come . It was very happy
the center of the me adow . She ran to the
cookies for you . Do you want some milk too ?"
Token would
Token ID561
Feature activation-1.04e+00

Pos logit contributions
 Sara+0.015
 ice+0.014
 o+0.014
 sne+0.013
 Ben+0.013
Neg logit contributions
lights-0.032
adoes-0.029
opes-0.029
igsaw-0.028
arians-0.028
Token take
Token ID1011
Feature activation-1.17e+00

Pos logit contributions
 Sara+0.020
 ice+0.019
 o+0.018
 Ben+0.018
 sit+0.018
Neg logit contributions
lights-0.050
opes-0.046
adoes-0.045
gy-0.044
bel-0.044
Token out
Token ID503
Feature activation+1.10e+00

Pos logit contributions
 Sara+0.022
 ice+0.019
 Ben+0.019
 o+0.016
 better+0.016
Neg logit contributions
lights-0.057
opes-0.055
zag-0.054
bel-0.053
spe-0.053
Token a
Token ID257
Feature activation+3.72e-01

Pos logit contributions
arians+0.059
zag+0.058
lights+0.057
gy+0.057
opes+0.056
Neg logit contributions
 ice-0.016
 Sara-0.016
 Ben-0.015
 o-0.014
 right-0.012
Token j
Token ID474
Feature activation+1.56e+00

Pos logit contributions
lights+0.015
opes+0.015
adoes+0.014
gy+0.014
arians+0.014
Neg logit contributions
 Sara-0.010
 ice-0.009
 strawberry-0.009
 Ben-0.009
 o-0.009
Tokenigsaw
Token ID45636
Feature activation+9.45e+00

Pos logit contributions
opes+0.059
lights+0.057
adoes+0.057
gy+0.056
bel+0.056
Neg logit contributions
 Sara-0.043
 ice-0.039
 Ben-0.039
 o-0.038
 strawberry-0.038
Token puzzle
Token ID15027
Feature activation+2.46e+00

Pos logit contributions
opes+0.325
adoes+0.309
gy+0.304
arians+0.280
phones+0.276
Neg logit contributions
 ice-0.292
 better-0.284
 right-0.273
 o-0.269
 warm-0.264
Token and
Token ID290
Feature activation-2.53e-01

Pos logit contributions
opes+0.127
lights+0.122
adoes+0.121
gy+0.119
arians+0.117
Neg logit contributions
 ice-0.034
 Sara-0.033
 Ben-0.032
 o-0.030
 right-0.029
Token try
Token ID1949
Feature activation-1.60e+00

Pos logit contributions
 Sara+0.005
 ice+0.005
 Ben+0.005
 strawberry+0.004
 o+0.004
Neg logit contributions
lights-0.012
opes-0.011
adoes-0.011
bel-0.011
gy-0.011
Token to
Token ID284
Feature activation+5.43e-01

Pos logit contributions
 Sara+0.033
 Ben+0.028
 ice+0.027
 strawberry+0.027
 o+0.026
Neg logit contributions
lights-0.071
opes-0.069
bel-0.068
gy-0.067
adoes-0.067
Token solve
Token ID8494
Feature activation+3.30e+00

Pos logit contributions
lights+0.020
opes+0.019
bel+0.018
gy+0.018
adoes+0.018
Neg logit contributions
 Sara-0.016
 Ben-0.014
 ice-0.014
 strawberry-0.014
 o-0.014
Token It
Token ID632
Feature activation+3.42e-01

Pos logit contributions
 ice+0.017
 o+0.017
 Sara+0.016
 sne+0.016
 ge+0.016
Neg logit contributions
lights-0.017
opes-0.016
igsaw-0.016
 puzzles-0.015
adoes-0.015
Token was
Token ID373
Feature activation+3.81e-01

Pos logit contributions
lights+0.014
opes+0.013
bel+0.013
gy+0.013
adoes+0.013
Neg logit contributions
 Sara-0.008
 Ben-0.007
 ice-0.007
 strawberry-0.007
 o-0.007
Token a
Token ID257
Feature activation-2.50e-01

Pos logit contributions
lights+0.016
gy+0.015
opes+0.015
 puzzles+0.015
bel+0.015
Neg logit contributions
 Sara-0.008
Come-0.007
 Ben-0.007
 ice-0.007
 strawberry-0.007
Token big
Token ID1263
Feature activation-6.68e-01

Pos logit contributions
 Sara+0.006
 Ben+0.005
Come+0.005
 ice+0.005
 o+0.005
Neg logit contributions
lights-0.010
bel-0.009
gy-0.009
zag-0.009
 puzzles-0.009
Token j
Token ID474
Feature activation+7.89e-01

Pos logit contributions
 Sara+0.018
 Ben+0.016
 ice+0.015
 strawberry+0.015
 o+0.015
Neg logit contributions
lights-0.025
opes-0.024
gy-0.024
bel-0.024
adoes-0.023
Tokenigsaw
Token ID45636
Feature activation+8.99e+00

Pos logit contributions
lights+0.027
opes+0.026
bel+0.025
gy+0.025
adoes+0.025
Neg logit contributions
 Sara-0.025
 Ben-0.022
 ice-0.022
 strawberry-0.022
 o-0.021
Token puzzle
Token ID15027
Feature activation+2.77e+00

Pos logit contributions
opes+0.315
adoes+0.296
gy+0.287
bel+0.272
phones+0.261
Neg logit contributions
 ice-0.284
 strawberry-0.265
 better-0.264
 o-0.264
 right-0.253
Token and
Token ID290
Feature activation-5.34e-01

Pos logit contributions
opes+0.130
lights+0.126
adoes+0.126
gy+0.124
bel+0.122
Neg logit contributions
 Sara-0.051
 ice-0.046
 Ben-0.045
 o-0.043
 strawberry-0.042
Token she
Token ID673
Feature activation-1.21e+00

Pos logit contributions
 Sara+0.011
 ice+0.009
 Ben+0.009
 strawberry+0.009
 o+0.009
Neg logit contributions
lights-0.024
opes-0.023
bel-0.023
gy-0.022
adoes-0.022
Token wanted
Token ID2227
Feature activation-4.82e-01

Pos logit contributions
 Sara+0.025
 Ben+0.021
 ice+0.021
 o+0.020
 sne+0.020
Neg logit contributions
lights-0.054
opes-0.053
adoes-0.051
gy-0.050
bel-0.050
Token to
Token ID284
Feature activation+1.14e-01

Pos logit contributions
 Sara+0.011
 Ben+0.009
 strawberry+0.009
 o+0.009
 ice+0.009
Neg logit contributions
lights-0.020
opes-0.019
bel-0.019
 puzzles-0.019
gy-0.019
Token
Token ID220
Feature activation+1.40e+00

Pos logit contributions
lights+0.008
opes+0.007
bel+0.007
gy+0.007
adoes+0.007
Neg logit contributions
 Sara-0.011
 Ben-0.010
 ice-0.010
 strawberry-0.010
 o-0.010
Token
Token ID198
Feature activation+3.37e-01

Pos logit contributions
lights+0.066
opes+0.064
bel+0.064
gy+0.063
zag+0.062
Neg logit contributions
 Sara-0.023
 ice-0.020
 strawberry-0.019
 o-0.019
 Ben-0.018
Token
Token ID198
Feature activation+4.08e-01

Pos logit contributions
lights+0.016
opes+0.015
bel+0.015
gy+0.015
zag+0.015
Neg logit contributions
 Sara-0.006
 ice-0.005
 strawberry-0.005
 Ben-0.005
 o-0.004
TokenThe
Token ID464
Feature activation+5.84e-01

Pos logit contributions
lights+0.016
opes+0.015
bel+0.015
gy+0.015
adoes+0.015
Neg logit contributions
 Sara-0.011
 Ben-0.010
 ice-0.009
 strawberry-0.009
 o-0.009
Token town
Token ID3240
Feature activation+3.01e+00

Pos logit contributions
lights+0.024
opes+0.023
bel+0.023
gy+0.023
adoes+0.022
Neg logit contributions
 Sara-0.014
 Ben-0.012
 ice-0.012
 strawberry-0.012
 o-0.011
Tokenspe
Token ID4125
Feature activation+8.73e+00

Pos logit contributions
opes+0.110
lights+0.109
adoes+0.104
gy+0.103
bel+0.101
Neg logit contributions
 Sara-0.085
 ice-0.076
 Ben-0.075
 o-0.074
 strawberry-0.073
Tokenople
Token ID643
Feature activation+5.48e-01

Pos logit contributions
lights+0.245
opes+0.222
adoes+0.217
bel+0.212
zag+0.211
Neg logit contributions
 ice-0.302
 Sara-0.301
 right-0.281
 strawberry-0.279
 Ben-0.274
Token had
Token ID550
Feature activation+2.20e+00

Pos logit contributions
lights+0.024
opes+0.022
adoes+0.022
bel+0.022
gy+0.021
Neg logit contributions
 Sara-0.012
 Ben-0.011
 ice-0.011
 o-0.011
 sne-0.010
Token to
Token ID284
Feature activation+1.13e+00

Pos logit contributions
lights+0.094
opes+0.091
bel+0.090
gy+0.089
adoes+0.088
Neg logit contributions
 Sara-0.049
 Ben-0.042
 ice-0.042
 strawberry-0.041
 o-0.040
Token find
Token ID1064
Feature activation-5.51e-01

Pos logit contributions
lights+0.050
opes+0.048
bel+0.047
gy+0.047
adoes+0.046
Neg logit contributions
 Sara-0.024
 Ben-0.021
 ice-0.021
 strawberry-0.020
 o-0.020
Token a
Token ID257
Feature activation+7.15e-02

Pos logit contributions
 Sara+0.011
 Ben+0.009
 ice+0.009
 o+0.008
 strawberry+0.008
Neg logit contributions
lights-0.025
opes-0.025
bel-0.024
gy-0.024
zag-0.024
Token box
Token ID3091
Feature activation+1.03e+00

Pos logit contributions
 Sara+0.002
 Ben+0.002
 ice+0.002
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.002
opes-0.002
bel-0.002
gy-0.002
adoes-0.002
Token and
Token ID290
Feature activation-4.72e-02

Pos logit contributions
lights+0.042
bel+0.039
opes+0.039
gy+0.039
zag+0.038
Neg logit contributions
 Sara-0.025
 Ben-0.021
 strawberry-0.021
 ice-0.021
 o-0.020
Token saw
Token ID2497
Feature activation-1.69e-01

Pos logit contributions
 Sara+0.001
 strawberry+0.001
 ice+0.001
 Ben+0.001
 o+0.001
Neg logit contributions
lights-0.002
bel-0.002
gy-0.002
opes-0.002
adoes-0.002
Token a
Token ID257
Feature activation-1.15e+00

Pos logit contributions
 Sara+0.004
 strawberry+0.003
Come+0.003
 Ben+0.003
 ice+0.003
Neg logit contributions
lights-0.007
opes-0.007
bel-0.007
adoes-0.007
gy-0.007
Token j
Token ID474
Feature activation+4.35e-01

Pos logit contributions
 Sara+0.033
 Ben+0.029
Come+0.028
 o+0.027
 ice+0.026
Neg logit contributions
lights-0.040
bel-0.038
gy-0.038
 puzzles-0.038
zag-0.038
Tokenigsaw
Token ID45636
Feature activation+8.26e+00

Pos logit contributions
lights+0.014
opes+0.013
bel+0.013
gy+0.013
adoes+0.013
Neg logit contributions
 Sara-0.014
 Ben-0.013
 ice-0.013
 strawberry-0.013
 o-0.013
Token puzzle
Token ID15027
Feature activation+1.88e+00

Pos logit contributions
opes+0.283
adoes+0.270
bel+0.262
gy+0.260
lights+0.249
Neg logit contributions
 ice-0.254
 better-0.239
 Sara-0.233
 strawberry-0.231
 o-0.231
Token inside
Token ID2641
Feature activation-4.10e-01

Pos logit contributions
lights+0.078
opes+0.077
bel+0.075
gy+0.075
adoes+0.074
Neg logit contributions
 Sara-0.044
 ice-0.038
 Ben-0.038
 strawberry-0.037
 o-0.037
Token.
Token ID13
Feature activation+3.07e-01

Pos logit contributions
 Sara+0.011
 strawberry+0.009
 Ben+0.009
Come+0.009
 ice+0.009
Neg logit contributions
lights-0.015
bel-0.014
opes-0.014
spe-0.014
 puzzles-0.014
Token He
Token ID679
Feature activation+1.48e+00

Pos logit contributions
lights+0.010
bel+0.010
opes+0.009
adoes+0.009
cases+0.009
Neg logit contributions
 Sara-0.008
 sne-0.008
 o-0.008
 ice-0.008
 yawn-0.008
Token paused
Token ID24487
Feature activation+1.33e-01

Pos logit contributions
lights+0.063
opes+0.060
bel+0.059
gy+0.059
adoes+0.059
Neg logit contributions
 Sara-0.034
 Ben-0.029
 ice-0.029
 strawberry-0.028
 o-0.028
Token Lily
Token ID20037
Feature activation-8.25e-01

Pos logit contributions
lights+0.005
opes+0.005
gy+0.005
bel+0.005
adoes+0.005
Neg logit contributions
 Sara-0.003
 ice-0.003
 strawberry-0.003
 o-0.003
 Ben-0.003
Token loved
Token ID6151
Feature activation-5.78e-01

Pos logit contributions
 Sara+0.021
 Ben+0.018
 ice+0.018
 strawberry+0.018
 o+0.017
Neg logit contributions
lights-0.033
opes-0.032
bel-0.031
gy-0.031
adoes-0.030
Token the
Token ID262
Feature activation+9.57e-01

Pos logit contributions
 Sara+0.013
 Ben+0.011
 ice+0.011
 strawberry+0.011
 o+0.011
Neg logit contributions
lights-0.024
opes-0.023
bel-0.023
gy-0.023
adoes-0.023
Token big
Token ID1263
Feature activation+3.72e-01

Pos logit contributions
lights+0.036
opes+0.035
bel+0.034
gy+0.034
adoes+0.033
Neg logit contributions
 Sara-0.027
 Ben-0.024
 ice-0.023
 strawberry-0.023
 o-0.023
Token Fer
Token ID12880
Feature activation+4.69e+00

Pos logit contributions
lights+0.014
opes+0.014
bel+0.013
gy+0.013
adoes+0.013
Neg logit contributions
 Sara-0.010
 Ben-0.009
 ice-0.009
 strawberry-0.009
 o-0.008
Tokenris
Token ID2442
Feature activation+7.81e+00

Pos logit contributions
adoes+0.190
opes+0.184
lights+0.167
phones+0.159
arians+0.158
Neg logit contributions
 Sara-0.111
 ice-0.107
 strawberry-0.100
ax-0.099
Tom-0.097
Token wheel
Token ID7825
Feature activation+1.10e+00

Pos logit contributions
lights+0.306
adoes+0.297
opes+0.296
bel+0.294
gy+0.287
Neg logit contributions
 Sara-0.198
 ice-0.192
 strawberry-0.182
 o-0.178
 Ben-0.169
Token the
Token ID262
Feature activation+2.94e-02

Pos logit contributions
opes+0.050
lights+0.050
adoes+0.047
bel+0.047
gy+0.046
Neg logit contributions
 Sara-0.022
 ice-0.019
 Ben-0.019
 o-0.017
 strawberry-0.017
Token most
Token ID749
Feature activation+3.53e-01

Pos logit contributions
lights+0.001
opes+0.001
bel+0.001
gy+0.001
adoes+0.001
Neg logit contributions
 Sara-0.001
 Ben-0.001
 ice-0.001
 strawberry-0.001
 o-0.001
Token.
Token ID13
Feature activation-4.92e-01

Pos logit contributions
lights+0.015
opes+0.015
gy+0.014
bel+0.014
adoes+0.014
Neg logit contributions
 Sara-0.008
 ice-0.007
 Ben-0.007
 strawberry-0.007
 o-0.007
Token She
Token ID1375
Feature activation+4.03e-01

Pos logit contributions
 Sara+0.011
 ice+0.010
 o+0.010
 strawberry+0.010
 Ben+0.009
Neg logit contributions
lights-0.020
opes-0.019
gy-0.019
adoes-0.019
bel-0.019
Token she
Token ID673
Feature activation-2.92e-01

Pos logit contributions
lights+0.032
opes+0.031
bel+0.030
gy+0.030
adoes+0.030
Neg logit contributions
 Sara-0.017
 ice-0.015
 strawberry-0.015
 Ben-0.014
 o-0.014
Token bowed
Token ID35737
Feature activation-6.31e-01

Pos logit contributions
 Sara+0.008
 Ben+0.007
 ice+0.007
 strawberry+0.007
 o+0.006
Neg logit contributions
lights-0.011
opes-0.011
bel-0.011
gy-0.011
adoes-0.010
Token,
Token ID11
Feature activation+4.78e-01

Pos logit contributions
 Sara+0.016
 Ben+0.014
 ice+0.014
 strawberry+0.013
 o+0.013
Neg logit contributions
lights-0.025
opes-0.024
bel-0.024
gy-0.024
adoes-0.023
Token the
Token ID262
Feature activation-2.05e-01

Pos logit contributions
lights+0.021
opes+0.020
bel+0.020
gy+0.020
adoes+0.020
Neg logit contributions
 Sara-0.010
 Ben-0.009
 ice-0.008
 strawberry-0.008
 o-0.008
Token town
Token ID3240
Feature activation+2.17e+00

Pos logit contributions
 Sara+0.006
 Ben+0.005
 strawberry+0.005
 ice+0.005
 o+0.005
Neg logit contributions
lights-0.008
opes-0.007
bel-0.007
adoes-0.007
gy-0.007
Tokenspe
Token ID4125
Feature activation+7.39e+00

Pos logit contributions
opes+0.087
lights+0.087
gy+0.082
adoes+0.082
bel+0.081
Neg logit contributions
 Sara-0.052
 ice-0.046
 Ben-0.046
 o-0.044
 strawberry-0.043
Tokenople
Token ID643
Feature activation-3.86e-01

Pos logit contributions
lights+0.201
opes+0.186
bel+0.179
zag+0.178
bes+0.174
Neg logit contributions
 ice-0.259
 Sara-0.258
 right-0.238
 o-0.237
 Ben-0.235
Token stopped
Token ID5025
Feature activation-5.54e-01

Pos logit contributions
 Sara+0.009
 Ben+0.008
 ice+0.008
 o+0.007
 strawberry+0.007
Neg logit contributions
lights-0.016
opes-0.016
bel-0.015
gy-0.015
adoes-0.015
Token to
Token ID284
Feature activation-1.98e-01

Pos logit contributions
 Sara+0.014
 strawberry+0.012
 ice+0.012
 Ben+0.012
 o+0.011
Neg logit contributions
lights-0.022
bel-0.022
opes-0.022
gy-0.021
adoes-0.021
Token admire
Token ID21099
Feature activation-3.76e-02

Pos logit contributions
 Sara+0.005
 strawberry+0.004
 Ben+0.004
 ice+0.004
Come+0.004
Neg logit contributions
bel-0.008
lights-0.007
gy-0.007
opes-0.007
adoes-0.007
Token her
Token ID607
Feature activation-4.08e-01

Pos logit contributions
 Sara+0.001
 Ben+0.001
 ice+0.000
 strawberry+0.000
 o+0.000
Neg logit contributions
lights-0.002
opes-0.002
bel-0.002
gy-0.002
zag-0.002
Token He
Token ID679
Feature activation+1.35e+00

Pos logit contributions
lights+0.011
opes+0.011
bel+0.010
gy+0.010
adoes+0.010
Neg logit contributions
 Sara-0.007
 Ben-0.006
 ice-0.006
 strawberry-0.006
 o-0.006
Token went
Token ID1816
Feature activation-5.59e-01

Pos logit contributions
lights+0.049
opes+0.047
bel+0.047
gy+0.046
adoes+0.046
Neg logit contributions
 Sara-0.039
 Ben-0.035
 ice-0.034
 strawberry-0.034
 o-0.033
Token on
Token ID319
Feature activation+1.47e+00

Pos logit contributions
 Sara+0.012
 ice+0.010
 Ben+0.010
 o+0.010
 strawberry+0.010
Neg logit contributions
lights-0.025
opes-0.024
adoes-0.024
bel-0.023
gy-0.023
Token the
Token ID262
Feature activation+1.50e+00

Pos logit contributions
lights+0.068
opes+0.065
bel+0.064
gy+0.064
adoes+0.063
Neg logit contributions
 Sara-0.029
 Ben-0.025
 ice-0.025
 strawberry-0.023
 o-0.023
Token fer
Token ID11354
Feature activation+4.25e+00

Pos logit contributions
lights+0.057
opes+0.056
bel+0.055
gy+0.054
adoes+0.053
Neg logit contributions
 Sara-0.040
 Ben-0.035
 strawberry-0.034
 ice-0.034
 o-0.033
Tokenris
Token ID2442
Feature activation+7.25e+00

Pos logit contributions
lights+0.137
adoes+0.135
opes+0.133
bel+0.120
arians+0.117
Neg logit contributions
 ice-0.125
 Sara-0.125
 o-0.117
 strawberry-0.111
 dove-0.111
Token wheel
Token ID7825
Feature activation+7.77e-01

Pos logit contributions
lights+0.271
opes+0.264
bel+0.262
adoes+0.260
iders+0.250
Neg logit contributions
 ice-0.189
 Sara-0.189
 o-0.170
 Ben-0.168
 strawberry-0.162
Token,
Token ID11
Feature activation+5.98e-01

Pos logit contributions
lights+0.036
opes+0.036
adoes+0.035
bel+0.034
gy+0.033
Neg logit contributions
 Sara-0.015
 ice-0.013
 Ben-0.013
 o-0.012
 strawberry-0.011
Token the
Token ID262
Feature activation-5.72e-01

Pos logit contributions
lights+0.023
bel+0.022
opes+0.022
gy+0.022
zag+0.021
Neg logit contributions
 Sara-0.016
 Ben-0.014
 strawberry-0.014
 ice-0.013
 o-0.013
Token roller
Token ID24471
Feature activation-1.18e+00

Pos logit contributions
 Sara+0.015
 Ben+0.013
 strawberry+0.013
 ice+0.012
 o+0.012
Neg logit contributions
lights-0.022
bel-0.021
opes-0.021
 robots-0.020
spe-0.020
Token coaster
Token ID42450
Feature activation+7.36e-01

Pos logit contributions
 Sara+0.047
 Ben+0.043
 ice+0.042
 strawberry+0.042
 o+0.042
Neg logit contributions
lights-0.030
opes-0.029
bel-0.028
gy-0.027
adoes-0.027
Token's
Token ID338
Feature activation-7.19e-01

Pos logit contributions
 Sara+0.076
 strawberry+0.068
 yawn+0.067
Come+0.066
 ice+0.065
Neg logit contributions
gy-0.094
adoes-0.092
opes-0.092
ations-0.088
ors-0.087
Token go
Token ID467
Feature activation-2.42e+00

Pos logit contributions
 Sara+0.016
 strawberry+0.014
Come+0.014
 ice+0.013
 Ben+0.013
Neg logit contributions
gy-0.029
lights-0.028
opes-0.028
adoes-0.027
bel-0.027
Token on
Token ID319
Feature activation+1.67e+00

Pos logit contributions
 Sara+0.053
 Ben+0.046
 strawberry+0.045
 ice+0.044
 o+0.042
Neg logit contributions
lights-0.101
gy-0.099
bel-0.098
opes-0.098
zag-0.097
Token the
Token ID262
Feature activation+1.48e+00

Pos logit contributions
lights+0.076
opes+0.074
bel+0.073
gy+0.072
adoes+0.071
Neg logit contributions
 Sara-0.034
 Ben-0.028
 ice-0.028
 strawberry-0.027
 o-0.026
Token fer
Token ID11354
Feature activation+4.12e+00

Pos logit contributions
lights+0.051
opes+0.050
gy+0.048
bel+0.048
adoes+0.047
Neg logit contributions
 Sara-0.045
 Ben-0.039
 strawberry-0.038
Come-0.037
 ice-0.037
Tokenris
Token ID2442
Feature activation+7.21e+00

Pos logit contributions
lights+0.130
opes+0.126
adoes+0.123
bel+0.113
gy+0.112
Neg logit contributions
 Sara-0.136
 ice-0.129
 o-0.124
 strawberry-0.124
 Ben-0.122
Token wheel
Token ID7825
Feature activation+5.94e-01

Pos logit contributions
lights+0.313
opes+0.300
bel+0.299
adoes+0.295
gy+0.282
Neg logit contributions
 Sara-0.152
 ice-0.151
 o-0.135
 strawberry-0.133
 Ben-0.132
Token and
Token ID290
Feature activation-5.75e-02

Pos logit contributions
lights+0.026
opes+0.025
bel+0.024
adoes+0.024
gy+0.023
Neg logit contributions
 Sara-0.013
 Ben-0.012
 ice-0.011
 o-0.011
 strawberry-0.010
Token see
Token ID766
Feature activation-4.35e-01

Pos logit contributions
 Sara+0.001
 strawberry+0.001
 o+0.001
 ice+0.001
 Ben+0.001
Neg logit contributions
lights-0.003
bel-0.003
gy-0.003
opes-0.003
adoes-0.002
Token the
Token ID262
Feature activation-7.90e-01

Pos logit contributions
 Sara+0.014
 yawn+0.013
Come+0.013
 strawberry+0.013
 sne+0.013
Neg logit contributions
opes-0.013
lights-0.013
adoes-0.012
 puzzles-0.012
 robots-0.012
Token whole
Token ID2187
Feature activation-9.25e-01

Pos logit contributions
 Sara+0.023
 Ben+0.020
 strawberry+0.019
 ice+0.018
 o+0.018
Neg logit contributions
lights-0.028
opes-0.027
bel-0.027
adoes-0.026
 puzzles-0.026
Token they
Token ID484
Feature activation-4.41e-01

Pos logit contributions
opes+0.053
lights+0.051
bel+0.050
zag+0.049
arians+0.049
Neg logit contributions
 Sara-0.029
 Ben-0.025
 Tom-0.021
 ice-0.020
 strawberry-0.020
Token went
Token ID1816
Feature activation-9.07e-01

Pos logit contributions
 Sara+0.011
 Ben+0.010
 ice+0.010
 strawberry+0.010
 o+0.009
Neg logit contributions
lights-0.017
opes-0.017
bel-0.017
gy-0.016
adoes-0.016
Token on
Token ID319
Feature activation+1.44e+00

Pos logit contributions
 Sara+0.019
 Ben+0.016
 ice+0.016
 strawberry+0.015
 o+0.015
Neg logit contributions
lights-0.041
opes-0.040
bel-0.039
adoes-0.039
gy-0.038
Token the
Token ID262
Feature activation+1.15e+00

Pos logit contributions
lights+0.068
opes+0.066
bel+0.066
gy+0.065
adoes+0.064
Neg logit contributions
 Sara-0.027
 Ben-0.022
 ice-0.022
 strawberry-0.021
 o-0.020
Token fer
Token ID11354
Feature activation+3.94e+00

Pos logit contributions
lights+0.045
opes+0.043
bel+0.042
adoes+0.042
gy+0.042
Neg logit contributions
 Sara-0.031
 ice-0.028
 Ben-0.027
 strawberry-0.027
 o-0.027
Tokenris
Token ID2442
Feature activation+7.04e+00

Pos logit contributions
lights+0.129
opes+0.124
adoes+0.123
bel+0.110
gy+0.110
Neg logit contributions
 Sara-0.122
 ice-0.119
 o-0.113
 strawberry-0.111
 Ben-0.108
Token wheel
Token ID7825
Feature activation+3.84e-01

Pos logit contributions
lights+0.305
bel+0.297
adoes+0.295
opes+0.294
iders+0.284
Neg logit contributions
 ice-0.143
 Sara-0.142
 o-0.126
 strawberry-0.123
 Ben-0.122
Token together
Token ID1978
Feature activation+2.16e+00

Pos logit contributions
opes+0.018
lights+0.018
adoes+0.017
bel+0.017
gy+0.016
Neg logit contributions
 Sara-0.007
 ice-0.006
 Ben-0.006
 o-0.006
 strawberry-0.005
Token,
Token ID11
Feature activation-1.35e-01

Pos logit contributions
lights+0.075
gy+0.073
opes+0.072
zag+0.072
bel+0.072
Neg logit contributions
 Sara-0.061
Come-0.052
 strawberry-0.052
 Ben-0.051
 ice-0.051
Token and
Token ID290
Feature activation-5.10e-01

Pos logit contributions
 Sara+0.003
 Ben+0.003
 strawberry+0.003
 ice+0.003
 o+0.003
Neg logit contributions
lights-0.005
bel-0.005
gy-0.005
opes-0.005
zag-0.005
Token Lily
Token ID20037
Feature activation-1.51e+00

Pos logit contributions
 Sara+0.012
 strawberry+0.010
 ice+0.010
 o+0.010
 Ben+0.010
Neg logit contributions
lights-0.021
bel-0.020
gy-0.020
opes-0.019
spe-0.019
Token saw
Token ID2497
Feature activation-6.34e-02

Pos logit contributions
 Sara+0.038
 warmer+0.032
 strawberry+0.032
 ice+0.031
 Ben+0.030
Neg logit contributions
gy-0.064
opes-0.063
bel-0.063
lights-0.061
cases-0.061
Token the
Token ID262
Feature activation-5.03e-01

Pos logit contributions
 Sara+0.001
Come+0.001
 yawn+0.001
 warmer+0.001
 sne+0.001
Neg logit contributions
lights-0.002
 puzzles-0.002
opes-0.002
adoes-0.002
bel-0.002
Token great
Token ID1049
Feature activation+2.20e-01

Pos logit contributions
 Sara+0.014
 Ben+0.013
 o+0.012
 strawberry+0.012
 ice+0.012
Neg logit contributions
lights-0.018
opes-0.017
bel-0.017
adoes-0.017
zag-0.016
Token big
Token ID1263
Feature activation-3.74e-02

Pos logit contributions
lights+0.008
bel+0.007
 puzzles+0.007
opes+0.007
adoes+0.007
Neg logit contributions
 Sara-0.006
 Ben-0.006
Come-0.005
 strawberry-0.005
 o-0.005
Token Fer
Token ID12880
Feature activation+3.70e+00

Pos logit contributions
 Sara+0.001
Come+0.001
 Ben+0.001
 o+0.001
 warmer+0.001
Neg logit contributions
lights-0.001
 puzzles-0.001
opes-0.001
 robots-0.001
adoes-0.001
Tokenris
Token ID2442
Feature activation+6.97e+00

Pos logit contributions
adoes+0.126
opes+0.122
lights+0.116
bel+0.108
gy+0.106
Neg logit contributions
 Sara-0.114
 ice-0.112
 strawberry-0.109
 o-0.102
 Ben-0.100
Token wheel
Token ID7825
Feature activation+5.45e-01

Pos logit contributions
lights+0.283
opes+0.277
adoes+0.271
bel+0.265
gy+0.264
Neg logit contributions
 Sara-0.173
 ice-0.167
 strawberry-0.158
 o-0.150
 Ben-0.147
Token and
Token ID290
Feature activation-6.33e-02

Pos logit contributions
lights+0.024
opes+0.023
bel+0.023
adoes+0.023
gy+0.022
Neg logit contributions
 Sara-0.012
 ice-0.010
 Ben-0.010
 strawberry-0.010
 o-0.009
Token all
Token ID477
Feature activation+5.60e-01

Pos logit contributions
 Sara+0.001
 Ben+0.001
 ice+0.001
 strawberry+0.001
 o+0.001
Neg logit contributions
lights-0.003
opes-0.003
bel-0.003
gy-0.003
adoes-0.003
Token the
Token ID262
Feature activation+2.86e-01

Pos logit contributions
lights+0.024
bel+0.023
spe+0.023
opes+0.022
gy+0.022
Neg logit contributions
 Sara-0.012
 Ben-0.010
 ice-0.010
 strawberry-0.010
 o-0.009
Token other
Token ID584
Feature activation-3.60e-01

Pos logit contributions
lights+0.011
 puzzles+0.010
opes+0.010
bel+0.010
spe+0.010
Neg logit contributions
 Sara-0.007
 Ben-0.006
Come-0.006
 strawberry-0.006
 o-0.006
Token day
Token ID1110
Feature activation+1.12e+00

Pos logit contributions
 Sara+0.024
 ice+0.021
 Ben+0.021
 strawberry+0.021
 o+0.020
Neg logit contributions
lights-0.033
opes-0.032
bel-0.031
gy-0.030
adoes-0.030
Token ever
Token ID1683
Feature activation-6.00e-01

Pos logit contributions
lights+0.034
bel+0.032
spe+0.032
gy+0.031
opes+0.030
Neg logit contributions
 Sara-0.035
 strawberry-0.032
 ice-0.031
 Ben-0.031
 o-0.031
Token!
Token ID0
Feature activation-6.05e-01

Pos logit contributions
 Sara+0.019
 Ben+0.016
 strawberry+0.016
 ice+0.016
Come+0.016
Neg logit contributions
lights-0.020
gy-0.019
opes-0.019
bel-0.018
zag-0.018
Token<|endoftext|>
Token ID50256
Feature activation+4.20e-01

Pos logit contributions
 Sara+0.010
 ice+0.009
 o+0.008
 strawberry+0.008
 Ben+0.008
Neg logit contributions
lights-0.028
gy-0.027
opes-0.027
bel-0.027
adoes-0.027
TokenMor
Token ID20044
Feature activation+2.88e+00

Pos logit contributions
gy+0.015
lights+0.014
zag+0.014
opes+0.014
bel+0.013
Neg logit contributions
 Sara-0.010
 strawberry-0.010
 ice-0.010
 o-0.010
 vanilla-0.010
Tokenris
Token ID2442
Feature activation+6.85e+00

Pos logit contributions
lights+0.086
opes+0.082
bel+0.080
adoes+0.079
gy+0.078
Neg logit contributions
 Sara-0.103
 Ben-0.094
 ice-0.092
 strawberry-0.091
 o-0.090
Token was
Token ID373
Feature activation-8.96e-01

Pos logit contributions
lights+0.296
opes+0.283
bel+0.282
adoes+0.276
gy+0.270
Neg logit contributions
 Sara-0.156
 Ben-0.134
 ice-0.133
 strawberry-0.130
 o-0.127
Token very
Token ID845
Feature activation-1.16e+00

Pos logit contributions
 Sara+0.020
 ice+0.018
 strawberry+0.018
 o+0.017
 Ben+0.016
Neg logit contributions
gy-0.035
lights-0.035
opes-0.035
bel-0.035
zag-0.034
Token excited
Token ID6568
Feature activation+1.10e+00

Pos logit contributions
 Sara+0.024
 Ben+0.020
 ice+0.020
 strawberry+0.019
 o+0.019
Neg logit contributions
lights-0.052
opes-0.050
bel-0.050
gy-0.049
adoes-0.048
Token.
Token ID13
Feature activation+3.06e-01

Pos logit contributions
lights+0.053
bel+0.050
opes+0.050
adoes+0.049
gy+0.049
Neg logit contributions
 Sara-0.020
 Ben-0.018
 ice-0.018
 o-0.016
 strawberry-0.016
Token Today
Token ID6288
Feature activation+1.17e+00

Pos logit contributions
lights+0.011
gy+0.011
opes+0.011
bel+0.011
adoes+0.010
Neg logit contributions
 Sara-0.008
 ice-0.007
 o-0.007
 strawberry-0.007
 Ben-0.007
Token out
Token ID503
Feature activation+4.63e-01

Pos logit contributions
 Sara+0.019
 Ben+0.016
 ice+0.016
 strawberry+0.016
 o+0.015
Neg logit contributions
lights-0.033
opes-0.031
bel-0.031
gy-0.031
adoes-0.030
Token for
Token ID329
Feature activation-1.14e+00

Pos logit contributions
opes+0.022
lights+0.022
adoes+0.021
gy+0.021
arians+0.021
Neg logit contributions
 Sara-0.009
 Ben-0.007
 ice-0.007
 o-0.007
 sne-0.006
Token a
Token ID257
Feature activation+2.19e-01

Pos logit contributions
 Sara+0.023
 Ben+0.019
 strawberry+0.019
Come+0.018
 o+0.018
Neg logit contributions
lights-0.049
opes-0.048
bel-0.047
 puzzles-0.047
gy-0.047
Token walk
Token ID2513
Feature activation+1.72e-01

Pos logit contributions
lights+0.008
opes+0.008
bel+0.008
gy+0.007
zag+0.007
Neg logit contributions
 Sara-0.006
 Ben-0.006
 ice-0.005
 strawberry-0.005
 o-0.005
Token with
Token ID351
Feature activation+2.24e-01

Pos logit contributions
opes+0.007
lights+0.007
adoes+0.007
bel+0.006
arians+0.006
Neg logit contributions
 Sara-0.004
 Ben-0.004
 ice-0.004
 o-0.003
 strawberry-0.003
Token Sil
Token ID4243
Feature activation+6.83e+00

Pos logit contributions
lights+0.010
opes+0.009
bel+0.009
gy+0.009
adoes+0.009
Neg logit contributions
 Sara-0.005
 Ben-0.004
 ice-0.004
 strawberry-0.004
 o-0.004
Tokenvia
Token ID8869
Feature activation+1.81e+00

Pos logit contributions
lights+0.200
adoes+0.190
opes+0.186
bel+0.179
gy+0.179
Neg logit contributions
 Sara-0.235
 o-0.218
 ice-0.210
 Ben-0.208
 strawberry-0.208
Token wearing
Token ID5762
Feature activation+9.68e-01

Pos logit contributions
lights+0.073
bel+0.071
opes+0.071
gy+0.070
adoes+0.068
Neg logit contributions
 Sara-0.043
 Ben-0.037
 ice-0.037
 strawberry-0.036
 o-0.035
Token her
Token ID607
Feature activation+3.61e-02

Pos logit contributions
lights+0.043
opes+0.041
bel+0.041
gy+0.040
adoes+0.040
Neg logit contributions
 Sara-0.020
 Ben-0.017
 ice-0.017
 strawberry-0.017
 o-0.016
Token new
Token ID649
Feature activation+7.77e-01

Pos logit contributions
lights+0.001
opes+0.001
bel+0.001
gy+0.001
adoes+0.001
Neg logit contributions
 Sara-0.001
 Ben-0.001
 ice-0.001
 strawberry-0.001
Come-0.001
Token coat
Token ID13209
Feature activation-1.22e+00

Pos logit contributions
lights+0.032
opes+0.032
gy+0.031
bel+0.031
zag+0.030
Neg logit contributions
 Sara-0.018
 Ben-0.016
 ice-0.015
 strawberry-0.015
Come-0.014
Token loved
Token ID6151
Feature activation-3.12e-01

Pos logit contributions
 Sara+0.006
 Ben+0.005
 ice+0.005
 strawberry+0.005
 o+0.005
Neg logit contributions
lights-0.009
opes-0.008
bel-0.008
gy-0.008
adoes-0.008
Token to
Token ID284
Feature activation+5.94e-01

Pos logit contributions
 Sara+0.008
 Ben+0.007
 ice+0.006
 strawberry+0.006
 o+0.006
Neg logit contributions
lights-0.012
opes-0.012
bel-0.012
adoes-0.012
gy-0.012
Token play
Token ID711
Feature activation+7.89e-01

Pos logit contributions
lights+0.024
opes+0.024
gy+0.023
bel+0.023
adoes+0.023
Neg logit contributions
 Sara-0.014
 Ben-0.012
 strawberry-0.012
 ice-0.012
 o-0.011
Token make
Token ID787
Feature activation-2.18e-01

Pos logit contributions
lights+0.033
opes+0.032
bel+0.031
gy+0.031
spe+0.031
Neg logit contributions
 Sara-0.019
 Ben-0.017
 ice-0.016
 o-0.016
 strawberry-0.015
Token-
Token ID12
Feature activation+1.62e+00

Pos logit contributions
 Sara+0.007
 Ben+0.006
 strawberry+0.006
 ice+0.006
 o+0.006
Neg logit contributions
lights-0.007
opes-0.007
bel-0.007
adoes-0.007
gy-0.007
Tokenbel
Token ID6667
Feature activation+6.76e+00

Pos logit contributions
lights+0.043
opes+0.042
adoes+0.039
arians+0.037
gy+0.036
Neg logit contributions
 Sara-0.062
 ice-0.060
 strawberry-0.057
 o-0.057
 sne-0.056
Tokenieve
Token ID12311
Feature activation+2.25e+00

Pos logit contributions
gy+0.232
lights+0.223
zag+0.204
opes+0.204
adoes+0.203
Neg logit contributions
 Sara-0.210
 ice-0.203
 Ben-0.194
 better-0.191
 say-0.190
Token.
Token ID13
Feature activation-1.14e-01

Pos logit contributions
gy+0.093
opes+0.092
lights+0.088
zag+0.087
phones+0.086
Neg logit contributions
 Sara-0.051
 o-0.048
 Ben-0.046
 ice-0.045
 it-0.044
Token Today
Token ID6288
Feature activation+1.43e+00

Pos logit contributions
 Sara+0.003
 Ben+0.003
 ice+0.003
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.004
opes-0.004
bel-0.004
adoes-0.004
gy-0.004
Token she
Token ID673
Feature activation-5.80e-01

Pos logit contributions
lights+0.057
opes+0.056
bel+0.055
gy+0.054
adoes+0.054
Neg logit contributions
 Sara-0.037
 Ben-0.032
 ice-0.031
 strawberry-0.030
 o-0.030
Token was
Token ID373
Feature activation+2.89e-01

Pos logit contributions
 Sara+0.014
 Ben+0.012
 ice+0.012
 strawberry+0.012
 o+0.011
Neg logit contributions
lights-0.024
opes-0.023
bel-0.023
gy-0.022
adoes-0.022
Token!
Token ID0
Feature activation-8.78e-01

Pos logit contributions
 Sara+0.014
 Ben+0.012
 ice+0.012
 strawberry+0.012
 o+0.012
Neg logit contributions
lights-0.028
opes-0.027
bel-0.027
gy-0.026
adoes-0.026
Token
Token ID198
Feature activation-3.94e-01

Pos logit contributions
 ice+0.019
 o+0.019
 Sara+0.018
 strawberry+0.018
 sne+0.017
Neg logit contributions
lights-0.035
gy-0.033
bel-0.033
opes-0.033
adoes-0.032
Token
Token ID198
Feature activation-2.56e-01

Pos logit contributions
 ice+0.008
 strawberry+0.008
 Sara+0.008
 o+0.008
 sne+0.008
Neg logit contributions
bel-0.015
lights-0.015
zag-0.015
gy-0.015
opes-0.015
TokenThe
Token ID464
Feature activation-2.14e-01

Pos logit contributions
 Sara+0.007
 ice+0.007
 Ben+0.006
 o+0.006
 strawberry+0.006
Neg logit contributions
lights-0.009
bel-0.009
opes-0.009
gy-0.008
zag-0.008
Token town
Token ID3240
Feature activation+2.36e+00

Pos logit contributions
 Sara+0.005
 Ben+0.004
 strawberry+0.004
Come+0.004
 ice+0.004
Neg logit contributions
lights-0.009
opes-0.008
bel-0.008
 puzzles-0.008
adoes-0.008
Tokenspe
Token ID4125
Feature activation+6.75e+00

Pos logit contributions
lights+0.091
opes+0.088
bel+0.087
gy+0.086
adoes+0.085
Neg logit contributions
 Sara-0.062
 Ben-0.055
 ice-0.054
 strawberry-0.053
 o-0.052
Tokenople
Token ID643
Feature activation+1.04e-01

Pos logit contributions
lights+0.170
bel+0.157
opes+0.155
zag+0.148
adoes+0.146
Neg logit contributions
 Sara-0.268
 ice-0.258
 Ben-0.246
 strawberry-0.245
 o-0.241
Token were
Token ID547
Feature activation+3.44e-01

Pos logit contributions
lights+0.004
bel+0.003
opes+0.003
gy+0.003
spe+0.003
Neg logit contributions
 Sara-0.003
 strawberry-0.003
 Ben-0.003
 ice-0.003
 o-0.003
Token sad
Token ID6507
Feature activation+8.17e-01

Pos logit contributions
lights+0.013
gy+0.012
bel+0.012
spe+0.012
opes+0.012
Neg logit contributions
 Sara-0.009
 Ben-0.008
 strawberry-0.008
Come-0.008
 ice-0.008
Token and
Token ID290
Feature activation+8.27e-01

Pos logit contributions
lights+0.035
opes+0.035
bel+0.034
gy+0.034
adoes+0.033
Neg logit contributions
 Sara-0.018
 Ben-0.016
 ice-0.015
 o-0.015
 strawberry-0.014
Token scared
Token ID12008
Feature activation+3.18e-01

Pos logit contributions
lights+0.036
opes+0.035
bel+0.035
gy+0.034
adoes+0.034
Neg logit contributions
 Sara-0.018
 Ben-0.015
 ice-0.015
 strawberry-0.015
 o-0.014
Token happy
Token ID3772
Feature activation+9.87e-02

Pos logit contributions
lights+0.014
bel+0.014
gy+0.014
opes+0.014
spe+0.014
Neg logit contributions
 Sara-0.009
 Ben-0.008
 ice-0.007
 strawberry-0.007
 o-0.007
Token and
Token ID290
Feature activation+1.44e+00

Pos logit contributions
lights+0.004
opes+0.004
bel+0.004
gy+0.004
adoes+0.004
Neg logit contributions
 Sara-0.002
 Ben-0.002
 ice-0.002
 strawberry-0.002
 o-0.002
Token the
Token ID262
Feature activation+1.69e-01

Pos logit contributions
lights+0.056
bel+0.054
opes+0.053
gy+0.053
adoes+0.052
Neg logit contributions
 Sara-0.038
 Ben-0.033
 ice-0.033
 strawberry-0.033
 o-0.032
Token nation
Token ID3277
Feature activation+8.64e-01

Pos logit contributions
lights+0.006
opes+0.006
gy+0.006
bel+0.006
zag+0.006
Neg logit contributions
 Sara-0.005
 Ben-0.004
 ice-0.004
 right-0.004
 o-0.004
Token pro
Token ID386
Feature activation-4.76e-01

Pos logit contributions
opes+0.037
lights+0.037
gy+0.036
adoes+0.036
bel+0.035
Neg logit contributions
 Sara-0.018
 ice-0.016
 Ben-0.016
 o-0.016
 strawberry-0.015
Tokenspe
Token ID4125
Feature activation+6.72e+00

Pos logit contributions
 Sara+0.017
 Ben+0.015
 ice+0.015
 strawberry+0.015
 o+0.014
Neg logit contributions
lights-0.015
opes-0.014
bel-0.014
gy-0.014
adoes-0.013
Tokenred
Token ID445
Feature activation+5.94e-01

Pos logit contributions
lights+0.188
opes+0.171
bel+0.168
zag+0.161
adoes+0.157
Neg logit contributions
 Sara-0.240
 ice-0.230
 Ben-0.222
 right-0.222
 strawberry-0.212
Token.
Token ID13
Feature activation+1.69e-01

Pos logit contributions
lights+0.024
opes+0.024
bel+0.023
gy+0.023
adoes+0.023
Neg logit contributions
 Sara-0.014
 Ben-0.012
 ice-0.012
 o-0.011
 strawberry-0.011
Token<|endoftext|>
Token ID50256
Feature activation+7.54e-01

Pos logit contributions
lights+0.007
opes+0.007
bel+0.007
adoes+0.007
gy+0.007
Neg logit contributions
 Sara-0.004
 Ben-0.003
 ice-0.003
 strawberry-0.003
 o-0.003
TokenOnce
Token ID7454
Feature activation+1.71e+00

Pos logit contributions
lights+0.027
gy+0.027
zag+0.026
opes+0.026
bel+0.025
Neg logit contributions
 Sara-0.019
 strawberry-0.018
 ice-0.018
 o-0.018
 sne-0.017
Token upon
Token ID2402
Feature activation-7.46e-01

Pos logit contributions
opes+0.061
lights+0.060
adoes+0.058
bel+0.058
arians+0.057
Neg logit contributions
 Sara-0.053
 Ben-0.048
 ice-0.044
 o-0.043
 strawberry-0.043
Token.
Token ID13
Feature activation+2.06e-01

Pos logit contributions
 Sara+0.020
 Ben+0.018
 ice+0.017
 strawberry+0.017
 o+0.017
Neg logit contributions
lights-0.034
opes-0.033
bel-0.032
gy-0.032
adoes-0.032
Tokenâ
Token ID22940
Feature activation+1.60e+00

Pos logit contributions
lights+0.008
opes+0.008
gy+0.008
bel+0.008
adoes+0.008
Neg logit contributions
 Sara-0.005
 ice-0.004
 strawberry-0.004
 o-0.004
 Ben-0.004
Token
Token ID26391
Feature activation+2.78e+00

Pos logit contributions
lights+0.052
opes+0.050
gy+0.049
bel+0.049
adoes+0.048
Neg logit contributions
 Sara-0.053
 Ben-0.048
 ice-0.047
 strawberry-0.047
 o-0.045
Token
Token ID198
Feature activation+3.50e-01

Pos logit contributions
lights+0.126
opes+0.123
bel+0.121
gy+0.120
adoes+0.119
Neg logit contributions
 Sara-0.054
 Ben-0.045
 ice-0.044
 strawberry-0.043
 o-0.042
Token
Token ID198
Feature activation+2.53e-01

Pos logit contributions
lights+0.014
opes+0.014
gy+0.013
zag+0.013
bel+0.013
Neg logit contributions
 ice-0.007
 Sara-0.007
 strawberry-0.007
 o-0.007
 sne-0.007
TokenEll
Token ID30639
Feature activation+6.65e+00

Pos logit contributions
bel+0.009
lights+0.009
opes+0.009
gy+0.009
spe+0.009
Neg logit contributions
 Sara-0.007
 ice-0.006
 Ben-0.006
 strawberry-0.006
 o-0.006
Tokenie
Token ID494
Feature activation+1.83e-01

Pos logit contributions
lights+0.189
bel+0.179
adoes+0.174
opes+0.167
cases+0.162
Neg logit contributions
 Sara-0.239
 o-0.224
 ice-0.221
 Ben-0.216
 sne-0.215
Token happily
Token ID18177
Feature activation+6.81e-01

Pos logit contributions
lights+0.008
opes+0.007
bel+0.007
gy+0.007
adoes+0.007
Neg logit contributions
 Sara-0.004
 Ben-0.004
 ice-0.004
 strawberry-0.004
 o-0.003
Token stayed
Token ID9658
Feature activation-8.73e-01

Pos logit contributions
lights+0.026
opes+0.025
bel+0.025
gy+0.025
zag+0.024
Neg logit contributions
 Sara-0.018
 Ben-0.016
 ice-0.015
 strawberry-0.015
 o-0.015
Token and
Token ID290
Feature activation+8.45e-01

Pos logit contributions
 Sara+0.021
 Ben+0.018
 ice+0.018
 strawberry+0.017
 o+0.017
Neg logit contributions
lights-0.036
bel-0.035
opes-0.035
gy-0.034
adoes-0.034
Token watched
Token ID7342
Feature activation-9.93e-01

Pos logit contributions
gy+0.030
lights+0.030
bel+0.029
zag+0.028
spe+0.028
Neg logit contributions
 Sara-0.023
 strawberry-0.021
 ice-0.020
 Ben-0.020
 vanilla-0.019
Tokenitty
Token ID9760
Feature activation+1.87e+00

Pos logit contributions
lights+0.121
opes+0.114
bel+0.113
adoes+0.112
gy+0.106
Neg logit contributions
 Sara-0.107
 Ben-0.095
 ice-0.094
 o-0.091
 strawberry-0.091
Token wanted
Token ID2227
Feature activation+4.64e-03

Pos logit contributions
lights+0.079
opes+0.076
bel+0.075
adoes+0.073
gy+0.073
Neg logit contributions
 Sara-0.044
 Ben-0.038
 ice-0.037
 strawberry-0.036
 o-0.036
Token to
Token ID284
Feature activation+1.20e+00

Pos logit contributions
gy+0.000
lights+0.000
opes+0.000
bel+0.000
adoes+0.000
Neg logit contributions
 Sara-0.000
 Ben-0.000
 strawberry-0.000
Come-0.000
 ice-0.000
Token introduce
Token ID10400
Feature activation-2.06e-01

Pos logit contributions
gy+0.049
bel+0.048
lights+0.047
opes+0.047
adoes+0.047
Neg logit contributions
 Sara-0.025
 strawberry-0.022
 Ben-0.022
Come-0.021
 ice-0.021
Token Dog
Token ID8532
Feature activation+3.17e+00

Pos logit contributions
 Sara+0.005
 Ben+0.004
 ice+0.004
 strawberry+0.004
 o+0.004
Neg logit contributions
lights-0.009
opes-0.009
bel-0.009
gy-0.008
adoes-0.008
Tokengy
Token ID1360
Feature activation+6.65e+00

Pos logit contributions
lights+0.103
opes+0.096
bel+0.092
adoes+0.090
spe+0.087
Neg logit contributions
 Sara-0.109
 Ben-0.099
 ice-0.097
 strawberry-0.095
 o-0.094
Token to
Token ID284
Feature activation+4.09e-01

Pos logit contributions
lights+0.331
opes+0.311
adoes+0.297
arians+0.296
bel+0.290
Neg logit contributions
 Sara-0.115
 ice-0.108
 o-0.097
 strawberry-0.095
 Ben-0.094
Token her
Token ID607
Feature activation-2.89e-01

Pos logit contributions
gy+0.016
opes+0.016
bel+0.016
lights+0.016
adoes+0.016
Neg logit contributions
 Sara-0.009
 strawberry-0.008
 ice-0.007
 Ben-0.007
 vanilla-0.007
Token ball
Token ID2613
Feature activation-4.10e-01

Pos logit contributions
 Sara+0.006
 Ben+0.005
 ice+0.005
 o+0.005
 strawberry+0.005
Neg logit contributions
lights-0.013
opes-0.012
bel-0.012
adoes-0.012
gy-0.012
Token.
Token ID13
Feature activation-5.60e-01

Pos logit contributions
 Sara+0.009
 Ben+0.008
 ice+0.008
 strawberry+0.007
 o+0.007
Neg logit contributions
lights-0.018
opes-0.017
bel-0.017
adoes-0.017
gy-0.016
Token She
Token ID1375
Feature activation+7.18e-01

Pos logit contributions
 Sara+0.014
 Ben+0.012
 ice+0.012
 strawberry+0.012
 o+0.012
Neg logit contributions
lights-0.022
opes-0.022
bel-0.021
gy-0.021
adoes-0.021
Token went
Token ID1816
Feature activation-4.05e-01

Pos logit contributions
lights+0.065
opes+0.063
bel+0.062
gy+0.062
adoes+0.061
Neg logit contributions
 Sara-0.037
 Ben-0.032
 ice-0.031
 strawberry-0.031
 o-0.030
Token on
Token ID319
Feature activation+1.61e+00

Pos logit contributions
 Sara+0.008
 ice+0.006
 Ben+0.006
 o+0.006
 strawberry+0.005
Neg logit contributions
lights-0.020
opes-0.020
adoes-0.019
spe-0.018
bel-0.018
Token with
Token ID351
Feature activation+3.83e-01

Pos logit contributions
lights+0.075
opes+0.072
bel+0.071
gy+0.070
adoes+0.070
Neg logit contributions
 Sara-0.032
 ice-0.027
 Ben-0.027
 strawberry-0.025
 o-0.025
Token his
Token ID465
Feature activation+9.49e-01

Pos logit contributions
lights+0.018
opes+0.017
bel+0.017
gy+0.017
adoes+0.017
Neg logit contributions
 Sara-0.007
 Ben-0.006
 ice-0.006
 strawberry-0.006
 o-0.006
Token mis
Token ID2984
Feature activation+3.58e+00

Pos logit contributions
lights+0.036
opes+0.034
bel+0.034
adoes+0.033
gy+0.033
Neg logit contributions
 Sara-0.027
 ice-0.024
 Ben-0.024
 strawberry-0.023
 o-0.023
Tokenchie
Token ID3043
Feature activation+6.61e+00

Pos logit contributions
lights+0.079
opes+0.072
bel+0.067
gy+0.067
adoes+0.067
Neg logit contributions
 Sara-0.155
 ice-0.144
 strawberry-0.143
 Ben-0.142
 o-0.142
Tokenvous
Token ID31222
Feature activation-3.10e-02

Pos logit contributions
lights+0.256
opes+0.245
adoes+0.236
gy+0.232
bel+0.230
Neg logit contributions
 Sara-0.178
 ice-0.160
 o-0.155
 Ben-0.155
 strawberry-0.155
Token ways
Token ID2842
Feature activation-2.75e-01

Pos logit contributions
 Sara+0.001
 Ben+0.001
 strawberry+0.001
 ice+0.001
Come+0.001
Neg logit contributions
lights-0.001
opes-0.001
gy-0.001
bel-0.001
adoes-0.001
Token,
Token ID11
Feature activation+2.85e-01

Pos logit contributions
 Sara+0.006
 Ben+0.005
 ice+0.005
 strawberry+0.005
 o+0.005
Neg logit contributions
lights-0.012
opes-0.012
bel-0.011
adoes-0.011
gy-0.011
Token causing
Token ID6666
Feature activation+4.16e-01

Pos logit contributions
lights+0.010
bel+0.010
gy+0.010
opes+0.010
adoes+0.010
Neg logit contributions
 Sara-0.008
 strawberry-0.007
 Ben-0.007
 o-0.006
 ice-0.006
Token problems
Token ID2761
Feature activation+2.87e+00

Pos logit contributions
opes+0.016
 robots+0.016
 puzzles+0.016
lights+0.016
adoes+0.016
Neg logit contributions
 Sara-0.008
 strawberry-0.007
 o-0.007
Come-0.007
 Ben-0.007
Token He
Token ID679
Feature activation+1.41e+00

Pos logit contributions
lights+0.003
opes+0.002
gy+0.002
bel+0.002
adoes+0.002
Neg logit contributions
 Sara-0.002
 Ben-0.002
 strawberry-0.002
 ice-0.002
 o-0.002
Token loved
Token ID6151
Feature activation-4.75e-01

Pos logit contributions
lights+0.053
opes+0.052
gy+0.051
bel+0.051
adoes+0.049
Neg logit contributions
 Sara-0.038
 Ben-0.033
 strawberry-0.033
 ice-0.033
 o-0.031
Token playing
Token ID2712
Feature activation+1.11e+00

Pos logit contributions
 Sara+0.010
 ice+0.009
 Ben+0.009
 strawberry+0.009
 o+0.008
Neg logit contributions
lights-0.021
opes-0.020
bel-0.020
gy-0.020
adoes-0.019
Token make
Token ID787
Feature activation-3.97e-01

Pos logit contributions
lights+0.045
opes+0.043
gy+0.042
bel+0.042
zag+0.042
Neg logit contributions
 Sara-0.027
 Ben-0.025
 ice-0.025
 o-0.024
 strawberry-0.023
Token-
Token ID12
Feature activation+1.41e+00

Pos logit contributions
 Sara+0.012
 strawberry+0.011
 Ben+0.011
 ice+0.010
 o+0.010
Neg logit contributions
lights-0.013
bel-0.013
opes-0.013
adoes-0.013
 puzzles-0.012
Tokenbel
Token ID6667
Feature activation+6.54e+00

Pos logit contributions
opes+0.025
lights+0.025
adoes+0.021
arians+0.020
gy+0.019
Neg logit contributions
 Sara-0.066
 ice-0.063
 o-0.060
 strawberry-0.060
Come-0.060
Tokenieve
Token ID12311
Feature activation+2.35e+00

Pos logit contributions
gy+0.226
lights+0.218
opes+0.203
zag+0.203
adoes+0.198
Neg logit contributions
 Sara-0.198
 ice-0.194
 better-0.187
 say-0.181
 Ben-0.181
Token with
Token ID351
Feature activation+5.66e-01

Pos logit contributions
gy+0.094
opes+0.094
lights+0.091
zag+0.090
phones+0.088
Neg logit contributions
 Sara-0.055
 ice-0.051
 o-0.051
 Ben-0.050
 it-0.045
Token his
Token ID465
Feature activation+6.30e-01

Pos logit contributions
lights+0.025
gy+0.024
opes+0.024
bel+0.024
zag+0.024
Neg logit contributions
 Sara-0.012
 Ben-0.011
 ice-0.010
 o-0.010
 strawberry-0.010
Token friends
Token ID2460
Feature activation+1.75e+00

Pos logit contributions
lights+0.023
opes+0.022
gy+0.021
adoes+0.021
bel+0.021
Neg logit contributions
 Sara-0.018
 ice-0.017
 Ben-0.017
 o-0.017
 strawberry-0.016
Token.
Token ID13
Feature activation-3.55e-01

Pos logit contributions
lights+0.083
opes+0.080
bel+0.078
adoes+0.078
gy+0.077
Neg logit contributions
 Sara-0.031
 Ben-0.027
 ice-0.027
 o-0.026
 strawberry-0.024
Token He
Token ID679
Feature activation+1.71e+00

Pos logit contributions
 Sara+0.007
 ice+0.006
 strawberry+0.006
 o+0.006
 Ben+0.006
Neg logit contributions
lights-0.012
opes-0.011
gy-0.011
bel-0.011
adoes-0.011
Token would
Token ID561
Feature activation-7.07e-01

Pos logit contributions
lights+0.074
opes+0.071
bel+0.070
adoes+0.070
gy+0.069
Neg logit contributions
 Sara-0.038
 Ben-0.034
 ice-0.033
 o-0.032
 sne-0.031
Token race
Token ID3234
Feature activation+1.95e+00

Pos logit contributions
 Sara+0.015
 Ben+0.013
 ice+0.013
 o+0.012
 strawberry+0.012
Neg logit contributions
lights-0.031
opes-0.030
bel-0.029
adoes-0.029
gy-0.029
Token make
Token ID787
Feature activation-5.73e-01

Pos logit contributions
lights+0.092
opes+0.092
gy+0.088
arians+0.087
bel+0.086
Neg logit contributions
 Sara-0.036
 Ben-0.031
 ice-0.030
 o-0.026
 Tom-0.025
Token-
Token ID12
Feature activation+1.40e+00

Pos logit contributions
 Sara+0.014
 strawberry+0.012
 Ben+0.012
 ice+0.012
 o+0.011
Neg logit contributions
lights-0.023
opes-0.022
bel-0.022
adoes-0.022
gy-0.022
Tokenbel
Token ID6667
Feature activation+6.52e+00

Pos logit contributions
lights+0.036
opes+0.035
adoes+0.031
arians+0.031
gy+0.031
Neg logit contributions
 Sara-0.055
 ice-0.052
Come-0.050
 o-0.050
 strawberry-0.049
Tokenieve
Token ID12311
Feature activation+2.07e+00

Pos logit contributions
gy+0.207
lights+0.203
opes+0.189
zag+0.186
adoes+0.183
Neg logit contributions
 Sara-0.219
 ice-0.205
 Ben-0.200
 o-0.193
 better-0.192
Token cars
Token ID5006
Feature activation+1.90e+00

Pos logit contributions
lights+0.085
opes+0.084
gy+0.084
zag+0.079
bel+0.079
Neg logit contributions
 Sara-0.046
 Ben-0.042
 ice-0.041
 o-0.041
 it-0.037
Token around
Token ID1088
Feature activation+1.13e+00

Pos logit contributions
lights+0.090
opes+0.088
adoes+0.084
gy+0.084
bel+0.083
Neg logit contributions
 Sara-0.033
 ice-0.030
 Ben-0.029
 o-0.028
 say-0.026
Token the
Token ID262
Feature activation+5.25e-01

Pos logit contributions
lights+0.057
opes+0.055
bel+0.055
gy+0.054
zag+0.054
Neg logit contributions
 Sara-0.017
 Ben-0.015
 ice-0.014
 o-0.013
 strawberry-0.011
Token track
Token ID2610
Feature activation+2.72e+00

Pos logit contributions
lights+0.017
opes+0.017
gy+0.016
bel+0.016
adoes+0.016
Neg logit contributions
 Sara-0.017
 Ben-0.015
 strawberry-0.015
 ice-0.014
Come-0.014
Token a
Token ID257
Feature activation-5.00e-01

Pos logit contributions
 Sara+0.006
 strawberry+0.006
 ice+0.005
 Ben+0.005
 o+0.005
Neg logit contributions
lights-0.014
bel-0.013
opes-0.013
gy-0.013
adoes-0.013
Token modest
Token ID12949
Feature activation-5.28e-01

Pos logit contributions
 Sara+0.015
 Ben+0.013
Come+0.013
 Danny+0.012
 o+0.012
Neg logit contributions
zag-0.017
bel-0.017
 puzzles-0.016
gy-0.016
lights-0.016
Token bench
Token ID7624
Feature activation-1.85e+00

Pos logit contributions
 Sara+0.016
 Ben+0.014
Come+0.013
 ice+0.013
 o+0.013
Neg logit contributions
lights-0.018
gy-0.017
opes-0.017
bel-0.017
zag-0.017
Token under
Token ID739
Feature activation+1.81e-01

Pos logit contributions
 Sara+0.042
 Ben+0.036
 ice+0.036
 strawberry+0.035
 o+0.034
Neg logit contributions
lights-0.078
opes-0.077
bel-0.075
adoes-0.074
gy-0.074
Token a
Token ID257
Feature activation-5.53e-01

Pos logit contributions
lights+0.008
opes+0.008
gy+0.008
bel+0.008
adoes+0.008
Neg logit contributions
 Sara-0.003
 ice-0.003
 strawberry-0.003
 o-0.003
 Ben-0.003
Token shady
Token ID36135
Feature activation-5.44e+00

Pos logit contributions
 Sara+0.016
Come+0.015
 Ben+0.014
 OK+0.014
 Danny+0.013
Neg logit contributions
lights-0.018
zag-0.018
 puzzles-0.018
bel-0.018
gy-0.018
Token tree
Token ID5509
Feature activation-1.29e+00

Pos logit contributions
 Sara+0.111
 Ben+0.093
 ice+0.091
 strawberry+0.090
 o+0.087
Neg logit contributions
lights-0.246
opes-0.239
bel-0.235
adoes-0.230
gy-0.230
Token and
Token ID290
Feature activation-3.08e-01

Pos logit contributions
 Sara+0.028
 Ben+0.024
 ice+0.024
 strawberry+0.023
 o+0.023
Neg logit contributions
lights-0.056
opes-0.054
bel-0.053
adoes-0.053
gy-0.052
Token sat
Token ID3332
Feature activation-3.20e+00

Pos logit contributions
 Sara+0.007
 Ben+0.006
 strawberry+0.006
 o+0.006
 Goose+0.006
Neg logit contributions
bel-0.012
gy-0.012
lights-0.012
 puzzles-0.011
opes-0.011
Token down
Token ID866
Feature activation-5.23e-01

Pos logit contributions
 Sara+0.093
 Ben+0.085
 strawberry+0.083
 ice+0.083
 sne+0.081
Neg logit contributions
lights-0.113
bel-0.109
gy-0.107
opes-0.106
adoes-0.106
Token.
Token ID13
Feature activation+5.51e-01

Pos logit contributions
 Sara+0.012
 Ben+0.011
 strawberry+0.011
 ice+0.011
 o+0.010
Neg logit contributions
lights-0.021
bel-0.021
opes-0.020
gy-0.020
adoes-0.020
Token not
Token ID407
Feature activation-9.40e-01

Pos logit contributions
lights+0.008
adoes+0.008
opes+0.008
 puzzles+0.007
bel+0.007
Neg logit contributions
 Sara-0.005
Come-0.005
 strawberry-0.005
 ice-0.005
 o-0.005
Token too
Token ID1165
Feature activation-9.00e-01

Pos logit contributions
 Sara+0.023
 Ben+0.020
Come+0.020
 strawberry+0.019
 ice+0.019
Neg logit contributions
lights-0.036
opes-0.036
gy-0.035
adoes-0.035
bel-0.034
Token sunny
Token ID27737
Feature activation-1.82e+00

Pos logit contributions
 Sara+0.022
 Ben+0.019
 ice+0.019
 strawberry+0.019
 o+0.019
Neg logit contributions
lights-0.035
opes-0.034
bel-0.033
adoes-0.033
gy-0.033
Token or
Token ID393
Feature activation-3.88e-01

Pos logit contributions
 Sara+0.041
 Ben+0.035
 ice+0.035
 strawberry+0.034
 o+0.033
Neg logit contributions
lights-0.077
opes-0.075
bel-0.074
gy-0.073
adoes-0.073
Token too
Token ID1165
Feature activation-9.15e-01

Pos logit contributions
 Sara+0.009
 Ben+0.007
 strawberry+0.007
Come+0.007
 ice+0.007
Neg logit contributions
lights-0.016
opes-0.015
bel-0.015
adoes-0.015
gy-0.015
Token shady
Token ID36135
Feature activation-5.40e+00

Pos logit contributions
 Sara+0.023
 Ben+0.020
 ice+0.020
Come+0.020
 strawberry+0.019
Neg logit contributions
lights-0.035
opes-0.034
bel-0.033
adoes-0.033
gy-0.033
Token,
Token ID11
Feature activation+2.06e-01

Pos logit contributions
 Sara+0.122
 Ben+0.105
 ice+0.103
 strawberry+0.102
 o+0.099
Neg logit contributions
lights-0.228
opes-0.221
bel-0.218
gy-0.217
adoes-0.215
Token and
Token ID290
Feature activation-5.17e-01

Pos logit contributions
lights+0.008
adoes+0.007
opes+0.007
bel+0.007
gy+0.007
Neg logit contributions
 Sara-0.005
 strawberry-0.004
Come-0.004
 ice-0.004
 o-0.004
Token not
Token ID407
Feature activation-6.27e-01

Pos logit contributions
 Sara+0.011
 strawberry+0.010
 Ben+0.010
 ice+0.009
 o+0.009
Neg logit contributions
lights-0.021
bel-0.020
opes-0.020
adoes-0.020
 puzzles-0.020
Token too
Token ID1165
Feature activation-1.04e+00

Pos logit contributions
 Sara+0.014
 Ben+0.011
 strawberry+0.011
Come+0.011
 ice+0.011
Neg logit contributions
lights-0.025
gy-0.025
opes-0.025
bel-0.025
adoes-0.025
Token crowded
Token ID18012
Feature activation+1.15e+00

Pos logit contributions
 Sara+0.025
 ice+0.022
 Ben+0.022
Come+0.021
 o+0.021
Neg logit contributions
lights-0.040
bel-0.039
opes-0.038
gy-0.038
adoes-0.038
Token a
Token ID257
Feature activation-2.91e-01

Pos logit contributions
 Sara+0.008
 ice+0.007
 strawberry+0.006
 Ben+0.006
 o+0.006
Neg logit contributions
lights-0.015
bel-0.015
opes-0.015
gy-0.015
adoes-0.015
Token wooden
Token ID13510
Feature activation+4.26e-01

Pos logit contributions
 Sara+0.008
 Ben+0.007
Come+0.007
 o+0.006
 Danny+0.006
Neg logit contributions
bel-0.010
lights-0.010
gy-0.010
zag-0.010
 puzzles-0.010
Token seat
Token ID5852
Feature activation-9.87e-01

Pos logit contributions
lights+0.016
opes+0.015
bel+0.015
gy+0.014
adoes+0.014
Neg logit contributions
 Sara-0.012
 ice-0.011
 Ben-0.011
 strawberry-0.011
 o-0.011
Token under
Token ID739
Feature activation+3.64e-02

Pos logit contributions
 Sara+0.023
 ice+0.020
 Ben+0.019
 strawberry+0.019
 o+0.019
Neg logit contributions
opes-0.041
lights-0.041
adoes-0.039
gy-0.039
bel-0.039
Token a
Token ID257
Feature activation+2.18e-02

Pos logit contributions
lights+0.002
opes+0.002
bel+0.002
gy+0.002
adoes+0.002
Neg logit contributions
 Sara-0.001
 Ben-0.000
 ice-0.000
 strawberry-0.000
 o-0.000
Token shady
Token ID36135
Feature activation-5.33e+00

Pos logit contributions
lights+0.001
zag+0.001
 puzzles+0.001
gy+0.001
bel+0.001
Neg logit contributions
 Sara-0.001
Come-0.001
 Ben-0.001
 OK-0.001
 Goose-0.000
Token tree
Token ID5509
Feature activation-9.67e-01

Pos logit contributions
 Sara+0.116
 Ben+0.100
 ice+0.096
 strawberry+0.095
 o+0.093
Neg logit contributions
lights-0.228
opes-0.219
gy-0.219
bel-0.217
adoes-0.214
Token.
Token ID13
Feature activation-2.05e-01

Pos logit contributions
 Sara+0.023
 Ben+0.021
 strawberry+0.019
 ice+0.019
 o+0.019
Neg logit contributions
lights-0.039
gy-0.038
bel-0.038
opes-0.037
zag-0.037
Token She
Token ID1375
Feature activation+6.87e-01

Pos logit contributions
 ice+0.005
 o+0.005
 sne+0.005
 Sara+0.004
 warmer+0.004
Neg logit contributions
lights-0.007
gy-0.007
opes-0.007
adoes-0.007
bel-0.007
Token climbed
Token ID19952
Feature activation-6.94e-01

Pos logit contributions
lights+0.026
opes+0.026
bel+0.026
gy+0.025
spe+0.024
Neg logit contributions
 Sara-0.017
 Ben-0.015
 ice-0.015
 strawberry-0.014
 warmer-0.014
Token onto
Token ID4291
Feature activation-1.50e-02

Pos logit contributions
 Sara+0.018
 Ben+0.017
 ice+0.016
 strawberry+0.016
 sne+0.016
Neg logit contributions
lights-0.025
gy-0.024
bel-0.024
opes-0.024
zag-0.024
Token the
Token ID262
Feature activation-1.01e+00

Pos logit contributions
 Sara+0.006
 Ben+0.006
 ice+0.005
 strawberry+0.005
 o+0.005
Neg logit contributions
lights-0.013
opes-0.013
bel-0.012
gy-0.012
zag-0.012
Token two
Token ID734
Feature activation+1.14e-01

Pos logit contributions
 Sara+0.026
 Ben+0.023
 strawberry+0.022
Come+0.021
 vanilla+0.021
Neg logit contributions
lights-0.037
 robots-0.036
bel-0.036
 puzzles-0.036
adoes-0.035
Token friends
Token ID2460
Feature activation+1.96e-01

Pos logit contributions
lights+0.005
opes+0.004
bel+0.004
gy+0.004
adoes+0.004
Neg logit contributions
 Sara-0.003
 Ben-0.002
 strawberry-0.002
 ice-0.002
 o-0.002
Token saw
Token ID2497
Feature activation+1.39e-01

Pos logit contributions
lights+0.007
gy+0.007
opes+0.007
bel+0.007
spe+0.007
Neg logit contributions
 Sara-0.005
 strawberry-0.005
 ice-0.004
 Ben-0.004
 vanilla-0.004
Token a
Token ID257
Feature activation-7.16e-01

Pos logit contributions
lights+0.006
opes+0.006
bel+0.006
adoes+0.006
gy+0.005
Neg logit contributions
 Sara-0.003
 strawberry-0.002
 o-0.002
 yawn-0.002
 Ben-0.002
Token shady
Token ID36135
Feature activation-5.15e+00

Pos logit contributions
 Sara+0.018
 Ben+0.016
Come+0.015
 ice+0.014
 o+0.014
Neg logit contributions
bel-0.026
lights-0.026
spe-0.026
gy-0.026
zag-0.026
Token looking
Token ID2045
Feature activation-8.40e-02

Pos logit contributions
 Sara+0.115
 Ben+0.100
 ice+0.094
Come+0.094
 strawberry+0.093
Neg logit contributions
lights-0.215
gy-0.209
bel-0.205
opes-0.204
zag-0.203
Token man
Token ID582
Feature activation+1.21e+00

Pos logit contributions
 Sara+0.002
 Ben+0.002
Come+0.002
It+0.002
 warmer+0.002
Neg logit contributions
lights-0.003
gy-0.003
zag-0.003
bel-0.003
opes-0.003
Token lurking
Token ID36105
Feature activation+3.45e-01

Pos logit contributions
gy+0.043
lights+0.042
bel+0.041
spe+0.040
opes+0.040
Neg logit contributions
 Sara-0.034
 Ben-0.030
 strawberry-0.029
Come-0.028
 ice-0.028
Token near
Token ID1474
Feature activation-7.63e-01

Pos logit contributions
lights+0.013
bel+0.012
opes+0.012
gy+0.012
adoes+0.012
Neg logit contributions
 Sara-0.010
 Ben-0.008
 strawberry-0.008
 ice-0.008
 o-0.008
Token the
Token ID262
Feature activation-6.04e-01

Pos logit contributions
 Sara+0.014
 strawberry+0.012
 o+0.011
 ice+0.011
 Ben+0.011
Neg logit contributions
lights-0.033
gy-0.033
opes-0.033
bel-0.032
adoes-0.032
Token said
Token ID531
Feature activation-3.75e-01

Pos logit contributions
 Sara+0.021
 ice+0.017
 strawberry+0.017
 Ben+0.016
 o+0.016
Neg logit contributions
lights-0.056
opes-0.055
gy-0.055
bel-0.054
zag-0.054
Token,
Token ID11
Feature activation-8.81e-02

Pos logit contributions
 Sara+0.011
 strawberry+0.010
 ice+0.010
 Ben+0.009
 o+0.009
Neg logit contributions
lights-0.013
gy-0.013
bel-0.013
opes-0.012
adoes-0.012
Token pointing
Token ID10609
Feature activation+3.32e-01

Pos logit contributions
 Sara+0.002
 Ben+0.002
 ice+0.002
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.003
opes-0.003
gy-0.003
bel-0.003
adoes-0.003
Token to
Token ID284
Feature activation+2.50e-01

Pos logit contributions
lights+0.011
bel+0.011
adoes+0.011
gy+0.011
opes+0.010
Neg logit contributions
 Sara-0.009
 ice-0.008
 strawberry-0.008
 warmer-0.008
Come-0.008
Token a
Token ID257
Feature activation-8.84e-01

Pos logit contributions
gy+0.009
adoes+0.009
opes+0.009
 puzzles+0.009
 robots+0.009
Neg logit contributions
 Sara-0.005
 strawberry-0.005
Come-0.005
iza-0.005
 o-0.005
Token shady
Token ID36135
Feature activation-5.13e+00

Pos logit contributions
 Sara+0.026
Come+0.023
 Ben+0.023
iza+0.021
ax+0.021
Neg logit contributions
zag-0.027
bel-0.027
spe-0.027
lights-0.027
gy-0.027
Token spot
Token ID4136
Feature activation-3.61e+00

Pos logit contributions
 Sara+0.109
 Ben+0.095
Come+0.091
 ice+0.090
 strawberry+0.089
Neg logit contributions
lights-0.218
gy-0.212
bel-0.206
adoes-0.206
opes-0.205
Token.
Token ID13
Feature activation-8.61e-02

Pos logit contributions
 Sara+0.095
 Ben+0.083
 ice+0.082
 strawberry+0.081
 o+0.079
Neg logit contributions
lights-0.140
opes-0.135
bel-0.133
gy-0.131
adoes-0.130
Token
Token ID198
Feature activation-5.16e-01

Pos logit contributions
 o+0.002
 ice+0.002
 sne+0.002
 strawberry+0.002
 yawn+0.002
Neg logit contributions
lights-0.003
bel-0.003
opes-0.003
adoes-0.003
gy-0.003
Token
Token ID198
Feature activation-6.01e-01

Pos logit contributions
 strawberry+0.011
 Sara+0.011
 o+0.010
 sne+0.010
 yawn+0.010
Neg logit contributions
lights-0.021
opes-0.020
gy-0.019
bel-0.019
zag-0.019
TokenThey
Token ID2990
Feature activation+8.61e-01

Pos logit contributions
 Sara+0.019
 strawberry+0.016
 ice+0.016
 Ben+0.016
 o+0.016
Neg logit contributions
lights-0.020
opes-0.020
bel-0.019
gy-0.019
adoes-0.019
Token break
Token ID2270
Feature activation-8.61e-02

Pos logit contributions
zag+0.030
bel+0.030
lights+0.029
gy+0.028
cases+0.027
Neg logit contributions
 Sara-0.018
 Ben-0.017
Come-0.015
 ge-0.014
 strawberry-0.014
Token."
Token ID526
Feature activation-1.02e+00

Pos logit contributions
 Sara+0.002
 Ben+0.002
 ice+0.002
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.004
opes-0.003
bel-0.003
gy-0.003
adoes-0.003
Token They
Token ID1119
Feature activation+1.11e+00

Pos logit contributions
 Sara+0.025
 Ben+0.021
 ice+0.021
 strawberry+0.020
 o+0.020
Neg logit contributions
lights-0.042
opes-0.041
bel-0.040
gy-0.040
adoes-0.040
Token found
Token ID1043
Feature activation-3.48e-01

Pos logit contributions
lights+0.039
gy+0.039
bel+0.038
opes+0.037
cases+0.036
Neg logit contributions
 Sara-0.030
 Ben-0.026
 strawberry-0.026
 ice-0.025
Come-0.025
Token a
Token ID257
Feature activation-4.67e-01

Pos logit contributions
 Sara+0.007
 yawn+0.006
 Ben+0.006
 strawberry+0.006
 o+0.006
Neg logit contributions
lights-0.014
gy-0.014
bel-0.014
adoes-0.014
opes-0.014
Token shady
Token ID36135
Feature activation-5.12e+00

Pos logit contributions
 Sara+0.014
 Ben+0.013
Come+0.012
ax+0.011
ily+0.011
Neg logit contributions
zag-0.016
bel-0.015
 puzzles-0.015
gy-0.015
spe-0.014
Token spot
Token ID4136
Feature activation-3.58e+00

Pos logit contributions
 Sara+0.110
 Ben+0.098
Come+0.094
 strawberry+0.087
 warmer+0.087
Neg logit contributions
gy-0.209
lights-0.209
zag-0.204
bel-0.203
adoes-0.199
Token and
Token ID290
Feature activation-4.62e-03

Pos logit contributions
 Sara+0.087
 Ben+0.075
 ice+0.074
 strawberry+0.072
 o+0.071
Neg logit contributions
lights-0.146
opes-0.146
adoes-0.139
bel-0.139
gy-0.138
Token sat
Token ID3332
Feature activation-2.78e+00

Pos logit contributions
 Sara+0.000
 Goose+0.000
Come+0.000
 strawberry+0.000
 welcome+0.000
Neg logit contributions
bel-0.000
lights-0.000
gy-0.000
cases-0.000
 puzzles-0.000
Token on
Token ID319
Feature activation+8.31e-01

Pos logit contributions
 Sara+0.082
 Ben+0.075
 strawberry+0.074
 ice+0.073
 sne+0.071
Neg logit contributions
lights-0.098
bel-0.094
gy-0.092
adoes-0.090
opes-0.090
Token the
Token ID262
Feature activation+8.03e-01

Pos logit contributions
lights+0.038
gy+0.037
opes+0.037
bel+0.036
adoes+0.036
Neg logit contributions
 Sara-0.014
 Ben-0.012
 strawberry-0.011
 o-0.011
Come-0.011
Token take
Token ID1011
Feature activation-7.88e-01

Pos logit contributions
gy+0.016
bel+0.015
lights+0.014
 puzzles+0.014
adoes+0.014
Neg logit contributions
 Sara-0.009
 strawberry-0.008
Come-0.008
 vanilla-0.007
 Ben-0.007
Token a
Token ID257
Feature activation+8.66e-01

Pos logit contributions
 Sara+0.017
 strawberry+0.014
Come+0.014
 Ben+0.014
 o+0.013
Neg logit contributions
opes-0.032
lights-0.032
gy-0.032
 puzzles-0.031
adoes-0.031
Token nap
Token ID25422
Feature activation-4.20e-01

Pos logit contributions
lights+0.038
bel+0.037
zag+0.037
gy+0.036
 puzzles+0.036
Neg logit contributions
 Sara-0.017
 Ben-0.014
Come-0.013
 strawberry-0.012
 Danny-0.012
Token under
Token ID739
Feature activation+3.44e-01

Pos logit contributions
 Sara+0.011
 Ben+0.009
 ice+0.009
 strawberry+0.009
 o+0.009
Neg logit contributions
lights-0.017
opes-0.016
bel-0.016
gy-0.016
adoes-0.015
Token a
Token ID257
Feature activation+8.57e-02

Pos logit contributions
lights+0.016
opes+0.016
bel+0.016
gy+0.016
adoes+0.015
Neg logit contributions
 Sara-0.006
 Ben-0.005
 ice-0.005
 strawberry-0.005
 o-0.005
Token shady
Token ID36135
Feature activation-5.06e+00

Pos logit contributions
lights+0.003
gy+0.003
zag+0.003
 puzzles+0.003
bel+0.003
Neg logit contributions
 Sara-0.002
Come-0.002
 Ben-0.002
 Danny-0.002
 vanilla-0.002
Token tree
Token ID5509
Feature activation-1.09e+00

Pos logit contributions
 Sara+0.111
 Ben+0.095
 ice+0.093
 strawberry+0.091
 o+0.089
Neg logit contributions
lights-0.216
opes-0.209
gy-0.207
bel-0.206
adoes-0.203
Token.
Token ID13
Feature activation-3.78e-01

Pos logit contributions
 Sara+0.027
 Ben+0.023
 strawberry+0.023
Come+0.022
 ice+0.022
Neg logit contributions
gy-0.042
lights-0.042
bel-0.041
zag-0.041
opes-0.041
Token
Token ID198
Feature activation-2.04e-01

Pos logit contributions
iza+0.008
 sne+0.008
 warmer+0.008
 Sara+0.008
 o+0.008
Neg logit contributions
gy-0.014
opes-0.013
zag-0.013
lights-0.013
bel-0.013
Token
Token ID198
Feature activation+5.05e-02

Pos logit contributions
 Sara+0.004
 strawberry+0.004
 ice+0.004
 o+0.004
 vanilla+0.003
Neg logit contributions
lights-0.008
bel-0.008
zag-0.008
gy-0.008
opes-0.008
TokenWhile
Token ID3633
Feature activation-1.62e-02

Pos logit contributions
lights+0.002
bel+0.002
gy+0.002
zag+0.002
opes+0.002
Neg logit contributions
 Sara-0.001
 Ben-0.001
 strawberry-0.001
 ice-0.001
 o-0.001
Token to
Token ID284
Feature activation+1.25e+00

Pos logit contributions
lights+0.002
gy+0.001
bel+0.001
adoes+0.001
opes+0.001
Neg logit contributions
 Sara-0.001
 Ben-0.001
 o-0.001
 strawberry-0.001
 ice-0.001
Token move
Token ID1445
Feature activation+2.75e-02

Pos logit contributions
gy+0.051
lights+0.050
bel+0.049
opes+0.049
adoes+0.048
Neg logit contributions
 Sara-0.029
 strawberry-0.026
 Ben-0.025
Come-0.024
 ice-0.022
Token him
Token ID683
Feature activation-1.31e-01

Pos logit contributions
lights+0.001
bel+0.001
gy+0.001
adoes+0.001
opes+0.001
Neg logit contributions
 Sara-0.001
 strawberry-0.001
 ice-0.001
 Ben-0.001
 o-0.001
Token to
Token ID284
Feature activation+4.52e-01

Pos logit contributions
 Sara+0.003
 Ben+0.003
 ice+0.003
 o+0.003
 strawberry+0.002
Neg logit contributions
lights-0.006
opes-0.005
bel-0.005
adoes-0.005
gy-0.005
Token a
Token ID257
Feature activation-5.52e-01

Pos logit contributions
gy+0.017
lights+0.016
bel+0.016
 puzzles+0.016
opes+0.016
Neg logit contributions
Come-0.010
 strawberry-0.010
 Sara-0.009
ax-0.009
 Goose-0.009
Token shady
Token ID36135
Feature activation-5.06e+00

Pos logit contributions
Come+0.016
 Sara+0.016
ax+0.016
 Ben+0.015
 Danny+0.015
Neg logit contributions
 puzzles-0.016
gy-0.016
zag-0.016
bel-0.016
lights-0.016
Token spot
Token ID4136
Feature activation-3.63e+00

Pos logit contributions
 Sara+0.106
 Ben+0.091
 ice+0.087
 strawberry+0.087
 o+0.083
Neg logit contributions
lights-0.221
gy-0.212
opes-0.211
bel-0.211
adoes-0.207
Token,
Token ID11
Feature activation-3.69e-02

Pos logit contributions
 Sara+0.088
 Ben+0.076
 ice+0.074
 o+0.071
 strawberry+0.071
Neg logit contributions
lights-0.149
opes-0.148
bel-0.142
adoes-0.141
gy-0.141
Token but
Token ID475
Feature activation-6.56e-01

Pos logit contributions
 Sara+0.001
 strawberry+0.001
 ice+0.001
 o+0.001
 Ben+0.001
Neg logit contributions
lights-0.001
bel-0.001
gy-0.001
opes-0.001
adoes-0.001
Token he
Token ID339
Feature activation-7.73e-02

Pos logit contributions
 Sara+0.016
 strawberry+0.015
 o+0.015
 ice+0.015
Come+0.014
Neg logit contributions
lights-0.024
bel-0.022
 puzzles-0.022
adoes-0.022
gy-0.022
Token was
Token ID373
Feature activation+1.72e-01

Pos logit contributions
 Sara+0.002
 strawberry+0.002
 ice+0.002
 Ben+0.002
 warmer+0.002
Neg logit contributions
lights-0.003
gy-0.003
bel-0.003
opes-0.003
zag-0.003
Token they
Token ID484
Feature activation-4.83e-01

Pos logit contributions
 Sara+0.054
 Ben+0.047
 ice+0.046
 strawberry+0.046
 o+0.044
Neg logit contributions
lights-0.092
opes-0.089
bel-0.088
gy-0.087
adoes-0.086
Token could
Token ID714
Feature activation-9.93e-01

Pos logit contributions
 Sara+0.013
 strawberry+0.012
 ice+0.011
 vanilla+0.011
 Ben+0.011
Neg logit contributions
lights-0.017
gy-0.017
bel-0.017
opes-0.016
spe-0.016
Token react
Token ID6324
Feature activation-1.49e-01

Pos logit contributions
 Sara+0.022
 strawberry+0.021
 ice+0.019
 vanilla+0.019
 Ben+0.019
Neg logit contributions
gy-0.040
zag-0.039
bel-0.038
lights-0.038
opes-0.037
Token,
Token ID11
Feature activation+9.07e-02

Pos logit contributions
 Sara+0.004
 strawberry+0.003
 Ben+0.003
 ice+0.003
 o+0.003
Neg logit contributions
lights-0.006
gy-0.005
bel-0.005
opes-0.005
adoes-0.005
Token the
Token ID262
Feature activation-7.10e-01

Pos logit contributions
lights+0.003
bel+0.003
opes+0.003
adoes+0.003
gy+0.003
Neg logit contributions
 Sara-0.002
 strawberry-0.002
 ice-0.002
 Ben-0.002
 o-0.002
Token shady
Token ID36135
Feature activation-5.05e+00

Pos logit contributions
 Sara+0.018
 Ben+0.016
 strawberry+0.016
Come+0.015
 vanilla+0.015
Neg logit contributions
lights-0.025
 robots-0.024
bel-0.024
adoes-0.024
 puzzles-0.024
Token man
Token ID582
Feature activation-1.36e-01

Pos logit contributions
 Sara+0.119
 Ben+0.106
 strawberry+0.100
 ice+0.099
Come+0.098
Neg logit contributions
lights-0.207
opes-0.196
gy-0.196
adoes-0.194
bel-0.193
Token reached
Token ID4251
Feature activation-1.79e+00

Pos logit contributions
 Sara+0.003
 strawberry+0.003
 Ben+0.003
 ice+0.003
 o+0.003
Neg logit contributions
lights-0.005
gy-0.005
bel-0.005
opes-0.005
spe-0.005
Token out
Token ID503
Feature activation-3.03e-01

Pos logit contributions
 Sara+0.041
 strawberry+0.038
Come+0.036
 ice+0.035
 vanilla+0.034
Neg logit contributions
gy-0.070
lights-0.069
bel-0.066
adoes-0.065
opes-0.065
Token and
Token ID290
Feature activation-7.20e-01

Pos logit contributions
 Sara+0.006
 Ben+0.005
 ice+0.005
 strawberry+0.005
 o+0.005
Neg logit contributions
lights-0.013
opes-0.013
bel-0.013
gy-0.013
adoes-0.013
Token grabbed
Token ID13176
Feature activation-5.80e-01

Pos logit contributions
 Sara+0.017
 strawberry+0.014
 Ben+0.014
 ice+0.014
 o+0.013
Neg logit contributions
bel-0.028
gy-0.028
lights-0.028
opes-0.027
adoes-0.027
Token biting
Token ID30791
Feature activation-1.18e+00

Pos logit contributions
lights+0.004
bel+0.004
gy+0.004
opes+0.004
adoes+0.004
Neg logit contributions
 Sara-0.003
 strawberry-0.002
 Ben-0.002
 ice-0.002
 o-0.002
Token.
Token ID13
Feature activation+4.36e-02

Pos logit contributions
 Sara+0.024
 strawberry+0.024
 yawn+0.022
 warmer+0.022
Come+0.022
Neg logit contributions
adoes-0.046
lights-0.045
opes-0.045
 puzzles-0.044
 robots-0.043
Token He
Token ID679
Feature activation+1.24e+00

Pos logit contributions
lights+0.001
opes+0.001
gy+0.001
 robots+0.001
bel+0.001
Neg logit contributions
 strawberry-0.001
 o-0.001
 ice-0.001
 warmer-0.001
 Sara-0.001
Token finds
Token ID7228
Feature activation-5.58e-01

Pos logit contributions
lights+0.046
opes+0.045
bel+0.044
gy+0.043
adoes+0.043
Neg logit contributions
 Sara-0.033
 strawberry-0.029
 Ben-0.029
 ice-0.028
 o-0.027
Token a
Token ID257
Feature activation-3.99e-01

Pos logit contributions
 strawberry+0.012
 Sara+0.012
Come+0.012
 yawn+0.012
 mir+0.012
Neg logit contributions
lights-0.020
 robots-0.019
 puzzles-0.019
bel-0.019
adoes-0.018
Token shady
Token ID36135
Feature activation-5.02e+00

Pos logit contributions
 Sara+0.011
iza+0.010
 Ben+0.010
Come+0.010
 Danny+0.010
Neg logit contributions
zag-0.013
bel-0.013
gy-0.013
 puzzles-0.013
spe-0.012
Token spot
Token ID4136
Feature activation-3.45e+00

Pos logit contributions
 Sara+0.095
 Ben+0.080
 ice+0.077
 strawberry+0.076
 o+0.074
Neg logit contributions
lights-0.228
opes-0.220
bel-0.219
gy-0.219
adoes-0.215
Token under
Token ID739
Feature activation+7.20e-02

Pos logit contributions
 Sara+0.092
 Ben+0.080
 ice+0.079
 strawberry+0.077
 o+0.076
Neg logit contributions
lights-0.134
opes-0.131
bel-0.126
gy-0.125
adoes-0.125
Token a
Token ID257
Feature activation-4.17e-01

Pos logit contributions
lights+0.003
opes+0.003
gy+0.003
 puzzles+0.003
adoes+0.003
Neg logit contributions
 Sara-0.001
Come-0.001
 strawberry-0.001
 o-0.001
 say-0.001
Token tree
Token ID5509
Feature activation-1.46e+00

Pos logit contributions
 Sara+0.012
Come+0.011
 Ben+0.011
 OK+0.010
 Danny+0.010
Neg logit contributions
lights-0.013
zag-0.013
gy-0.013
 puzzles-0.013
bel-0.013
Token and
Token ID290
Feature activation-9.78e-01

Pos logit contributions
 Sara+0.032
 Ben+0.028
 ice+0.027
 strawberry+0.027
 o+0.026
Neg logit contributions
lights-0.062
opes-0.060
bel-0.060
gy-0.059
adoes-0.058
Token She
Token ID1375
Feature activation+2.13e-01

Pos logit contributions
lights+0.002
opes+0.002
bel+0.002
gy+0.002
adoes+0.002
Neg logit contributions
 Sara-0.001
 ice-0.001
 o-0.001
 strawberry-0.001
 sne-0.001
Token settled
Token ID10282
Feature activation-2.40e+00

Pos logit contributions
lights+0.008
gy+0.007
bel+0.007
opes+0.007
spe+0.007
Neg logit contributions
 Sara-0.006
 Ben-0.005
 strawberry-0.005
 ice-0.005
 vanilla-0.005
Token down
Token ID866
Feature activation-5.20e-01

Pos logit contributions
 Sara+0.067
 Ben+0.060
 ice+0.059
 strawberry+0.059
 o+0.057
Neg logit contributions
lights-0.088
bel-0.084
opes-0.084
gy-0.083
adoes-0.082
Token under
Token ID739
Feature activation+7.53e-02

Pos logit contributions
 Sara+0.012
 Ben+0.011
 ice+0.011
 strawberry+0.011
 o+0.010
Neg logit contributions
lights-0.021
bel-0.020
opes-0.020
gy-0.020
adoes-0.020
Token a
Token ID257
Feature activation-5.53e-02

Pos logit contributions
lights+0.003
opes+0.003
 puzzles+0.003
gy+0.003
adoes+0.003
Neg logit contributions
 Sara-0.001
Come-0.001
 ice-0.001
 strawberry-0.001
 warmer-0.001
Token shady
Token ID36135
Feature activation-4.95e+00

Pos logit contributions
 Sara+0.002
Come+0.001
 Ben+0.001
 Tom+0.001
 Danny+0.001
Neg logit contributions
lights-0.002
zag-0.002
bel-0.002
gy-0.002
igsaw-0.002
Token rock
Token ID3881
Feature activation-1.83e+00

Pos logit contributions
 Sara+0.108
 Ben+0.093
 ice+0.090
 strawberry+0.088
 o+0.086
Neg logit contributions
lights-0.212
opes-0.204
gy-0.203
bel-0.203
adoes-0.199
Token and
Token ID290
Feature activation-3.89e-01

Pos logit contributions
 Sara+0.037
 Ben+0.032
 ice+0.031
 strawberry+0.030
 o+0.030
Neg logit contributions
lights-0.082
opes-0.080
bel-0.078
adoes-0.077
gy-0.077
Token watched
Token ID7342
Feature activation-1.29e+00

Pos logit contributions
 Sara+0.009
 strawberry+0.008
 Ben+0.008
 vanilla+0.008
Come+0.008
Neg logit contributions
bel-0.015
gy-0.014
lights-0.014
cases-0.013
zag-0.013
Token the
Token ID262
Feature activation-4.99e-01

Pos logit contributions
 strawberry+0.030
 Sara+0.029
Come+0.029
 ice+0.028
 mir+0.027
Neg logit contributions
lights-0.047
bel-0.045
adoes-0.044
opes-0.043
gy-0.042
Token fish
Token ID5916
Feature activation-2.37e+00

Pos logit contributions
 Sara+0.014
 Ben+0.012
 Gus+0.012
 OK+0.012
 Tom+0.012
Neg logit contributions
 puzzles-0.015
lights-0.015
opes-0.015
zag-0.015
bel-0.015
Token
Token ID198
Feature activation-1.33e-01

Pos logit contributions
 Sara+0.015
 ice+0.014
 o+0.014
 sne+0.014
 strawberry+0.014
Neg logit contributions
lights-0.021
gy-0.020
opes-0.020
bel-0.020
spe-0.019
Token
Token ID198
Feature activation+1.02e-02

Pos logit contributions
 Sara+0.003
 ice+0.003
 strawberry+0.003
 o+0.002
 sne+0.002
Neg logit contributions
bel-0.005
lights-0.005
zag-0.005
gy-0.005
opes-0.005
TokenHe
Token ID1544
Feature activation+1.46e+00

Pos logit contributions
lights+0.000
bel+0.000
opes+0.000
spe+0.000
zag+0.000
Neg logit contributions
 Sara-0.000
 Ben-0.000
 ice-0.000
 strawberry-0.000
 o-0.000
Token spotted
Token ID13489
Feature activation-1.00e+00

Pos logit contributions
lights+0.052
gy+0.052
opes+0.052
bel+0.050
spe+0.049
Neg logit contributions
 Sara-0.038
 Ben-0.033
 strawberry-0.032
 ice-0.032
 vanilla-0.031
Token a
Token ID257
Feature activation-6.72e-01

Pos logit contributions
 Sara+0.022
Come+0.018
 strawberry+0.018
 Ben+0.018
 o+0.017
Neg logit contributions
lights-0.040
opes-0.039
gy-0.039
bel-0.039
adoes-0.038
Token shady
Token ID36135
Feature activation-4.94e+00

Pos logit contributions
 Sara+0.019
 Ben+0.016
Come+0.015
 Tom+0.014
 Danny+0.014
Neg logit contributions
bel-0.023
lights-0.023
zag-0.023
gy-0.023
spe-0.023
Token spot
Token ID4136
Feature activation-3.56e+00

Pos logit contributions
 Sara+0.113
 Ben+0.098
 ice+0.094
 strawberry+0.092
Come+0.090
Neg logit contributions
lights-0.205
gy-0.198
opes-0.196
bel-0.195
zag-0.192
Token,
Token ID11
Feature activation+3.96e-01

Pos logit contributions
 Sara+0.090
 Ben+0.079
 ice+0.077
 strawberry+0.076
 o+0.074
Neg logit contributions
lights-0.141
opes-0.136
bel-0.135
gy-0.133
adoes-0.132
Token next
Token ID1306
Feature activation-9.55e-01

Pos logit contributions
lights+0.015
bel+0.014
opes+0.014
 puzzles+0.014
adoes+0.014
Neg logit contributions
 Sara-0.009
 Ben-0.008
 ice-0.007
 o-0.007
 strawberry-0.007
Token to
Token ID284
Feature activation-1.21e+00

Pos logit contributions
 Sara+0.017
 ice+0.015
 Ben+0.014
 strawberry+0.014
 o+0.014
Neg logit contributions
lights-0.045
opes-0.045
adoes-0.043
gy-0.043
bel-0.043
Token some
Token ID617
Feature activation-1.87e+00

Pos logit contributions
 Sara+0.024
 warmer+0.023
 vanilla+0.022
 strawberry+0.022
 sne+0.021
Neg logit contributions
bel-0.043
gy-0.043
lights-0.042
 puzzles-0.042
opes-0.041
Token and
Token ID290
Feature activation+3.90e-01

Pos logit contributions
lights+0.025
gy+0.024
opes+0.024
bel+0.024
zag+0.023
Neg logit contributions
 Sara-0.018
 ice-0.016
 strawberry-0.016
 o-0.016
 Ben-0.016
Token went
Token ID1816
Feature activation-1.15e+00

Pos logit contributions
gy+0.015
bel+0.014
lights+0.014
adoes+0.013
 puzzles+0.013
Neg logit contributions
 Sara-0.008
 strawberry-0.008
Come-0.008
It-0.007
 vanilla-0.007
Token to
Token ID284
Feature activation+2.55e-01

Pos logit contributions
 Sara+0.025
 strawberry+0.023
 Ben+0.022
 ice+0.021
Come+0.021
Neg logit contributions
lights-0.047
gy-0.046
bel-0.046
zag-0.045
opes-0.044
Token find
Token ID1064
Feature activation-1.04e+00

Pos logit contributions
gy+0.008
 puzzles+0.007
 robots+0.006
bel+0.006
opes+0.006
Neg logit contributions
Come-0.007
ax-0.007
aw-0.007
 strawberry-0.007
 vanilla-0.006
Token a
Token ID257
Feature activation-2.48e-01

Pos logit contributions
ax+0.030
Come+0.029
It+0.028
Che+0.027
 sur+0.027
Neg logit contributions
lights-0.028
 puzzles-0.026
gy-0.024
adoes-0.024
 robots-0.023
Token shady
Token ID36135
Feature activation-4.87e+00

Pos logit contributions
 Sara+0.007
Come+0.006
 Ben+0.006
ax+0.006
ily+0.006
Neg logit contributions
zag-0.008
gy-0.007
bel-0.007
 puzzles-0.007
lights-0.007
Token spot
Token ID4136
Feature activation-3.69e+00

Pos logit contributions
 Sara+0.100
 Ben+0.087
Come+0.082
 ice+0.079
 strawberry+0.079
Neg logit contributions
lights-0.210
gy-0.205
bel-0.200
zag-0.199
opes-0.197
Token under
Token ID739
Feature activation-5.20e-03

Pos logit contributions
 Sara+0.091
 Ben+0.079
 strawberry+0.078
 ice+0.078
 o+0.075
Neg logit contributions
lights-0.148
bel-0.142
gy-0.140
opes-0.139
adoes-0.136
Token a
Token ID257
Feature activation-2.92e-01

Pos logit contributions
 Sara+0.000
Come+0.000
 strawberry+0.000
 o+0.000
 say+0.000
Neg logit contributions
lights-0.000
gy-0.000
opes-0.000
 puzzles-0.000
adoes-0.000
Token big
Token ID1263
Feature activation-1.04e+00

Pos logit contributions
 Sara+0.008
Come+0.008
 Ben+0.008
 OK+0.007
 Danny+0.007
Neg logit contributions
lights-0.010
zag-0.009
 puzzles-0.009
gy-0.009
igsaw-0.009
Token tree
Token ID5509
Feature activation-1.63e+00

Pos logit contributions
 Sara+0.028
Come+0.027
 Ben+0.025
 OK+0.024
 warmer+0.023
Neg logit contributions
lights-0.036
gy-0.035
 puzzles-0.034
opes-0.034
zag-0.033
Token
Token ID198
Feature activation-1.45e-01

Pos logit contributions
 Sara+0.006
 ice+0.005
 strawberry+0.005
 Ben+0.005
 o+0.005
Neg logit contributions
lights-0.015
opes-0.015
bel-0.015
gy-0.014
zag-0.014
TokenShe
Token ID3347
Feature activation+8.60e-01

Pos logit contributions
 Sara+0.004
 Ben+0.003
 ice+0.003
 strawberry+0.003
 o+0.003
Neg logit contributions
lights-0.006
opes-0.006
bel-0.005
gy-0.005
adoes-0.005
Token pointed
Token ID6235
Feature activation-2.44e-01

Pos logit contributions
lights+0.035
opes+0.034
bel+0.034
gy+0.034
adoes+0.033
Neg logit contributions
 Sara-0.020
 Ben-0.017
 ice-0.017
 strawberry-0.017
 o-0.016
Token to
Token ID284
Feature activation+7.84e-01

Pos logit contributions
 Sara+0.006
 ice+0.005
Come+0.005
 strawberry+0.005
 warmer+0.005
Neg logit contributions
lights-0.009
bel-0.009
adoes-0.008
opes-0.008
gy-0.008
Token a
Token ID257
Feature activation-4.33e-01

Pos logit contributions
lights+0.035
opes+0.034
bel+0.034
gy+0.034
adoes+0.033
Neg logit contributions
 Sara-0.014
 Ben-0.012
 strawberry-0.012
 ice-0.012
 o-0.011
Token shady
Token ID36135
Feature activation-4.84e+00

Pos logit contributions
 Sara+0.012
 Ben+0.010
Come+0.010
 o+0.009
 ice+0.009
Neg logit contributions
bel-0.016
spe-0.015
lights-0.015
gy-0.015
 puzzles-0.015
Token looking
Token ID2045
Feature activation+2.06e-01

Pos logit contributions
 Sara+0.110
 Ben+0.096
 ice+0.091
 strawberry+0.090
Come+0.089
Neg logit contributions
lights-0.200
gy-0.194
bel-0.193
opes-0.191
adoes-0.188
Token man
Token ID582
Feature activation+1.68e+00

Pos logit contributions
lights+0.007
bel+0.007
gy+0.007
opes+0.007
adoes+0.007
Neg logit contributions
 Sara-0.006
 Ben-0.005
Come-0.005
 ice-0.005
 strawberry-0.005
Token at
Token ID379
Feature activation-1.19e-01

Pos logit contributions
gy+0.058
lights+0.057
bel+0.056
opes+0.055
spe+0.054
Neg logit contributions
 Sara-0.048
 Ben-0.041
 strawberry-0.041
 ice-0.039
 o-0.039
Token the
Token ID262
Feature activation-1.52e-01

Pos logit contributions
 Sara+0.002
 Ben+0.002
 ice+0.002
 strawberry+0.001
 o+0.001
Neg logit contributions
lights-0.006
opes-0.006
bel-0.006
gy-0.006
adoes-0.005
Token end
Token ID886
Feature activation-1.03e+00

Pos logit contributions
 Sara+0.004
Come+0.004
 Ben+0.004
 warmer+0.004
 Danny+0.004
Neg logit contributions
bel-0.005
gy-0.005
spe-0.005
 puzzles-0.005
 robots-0.005
Token excited
Token ID6568
Feature activation+1.42e+00

Pos logit contributions
gy+0.007
lights+0.007
bel+0.007
zag+0.007
 puzzles+0.006
Neg logit contributions
 Sara-0.005
Come-0.005
 Ben-0.004
 ice-0.004
 strawberry-0.004
Token and
Token ID290
Feature activation+5.17e-01

Pos logit contributions
lights+0.062
opes+0.060
bel+0.059
gy+0.059
adoes+0.058
Neg logit contributions
 Sara-0.031
 Ben-0.026
 ice-0.026
 strawberry-0.025
 o-0.025
Token quickly
Token ID2952
Feature activation-9.08e-02

Pos logit contributions
lights+0.021
bel+0.020
opes+0.020
gy+0.020
adoes+0.019
Neg logit contributions
 Sara-0.013
 Ben-0.011
 ice-0.011
 strawberry-0.011
 o-0.010
Token found
Token ID1043
Feature activation-7.26e-01

Pos logit contributions
 Sara+0.002
 Ben+0.002
 ice+0.002
 strawberry+0.002
Come+0.002
Neg logit contributions
lights-0.004
opes-0.003
gy-0.003
bel-0.003
adoes-0.003
Token a
Token ID257
Feature activation-4.30e-01

Pos logit contributions
 Sara+0.016
 strawberry+0.014
 ice+0.014
 yawn+0.014
 Ben+0.013
Neg logit contributions
lights-0.029
gy-0.028
bel-0.028
opes-0.028
adoes-0.028
Token shady
Token ID36135
Feature activation-4.83e+00

Pos logit contributions
 Sara+0.011
 Ben+0.011
Come+0.010
 Danny+0.009
 Tom+0.009
Neg logit contributions
zag-0.015
gy-0.015
bel-0.015
spe-0.014
lights-0.014
Token spot
Token ID4136
Feature activation-3.60e+00

Pos logit contributions
 Sara+0.098
 Ben+0.085
 ice+0.079
 strawberry+0.078
Come+0.076
Neg logit contributions
lights-0.213
gy-0.206
bel-0.204
opes-0.203
zag-0.201
Token to
Token ID284
Feature activation-7.57e-01

Pos logit contributions
 Sara+0.088
 Ben+0.077
 ice+0.076
 strawberry+0.075
 o+0.073
Neg logit contributions
lights-0.145
opes-0.140
bel-0.139
gy-0.137
adoes-0.136
Token relax
Token ID8960
Feature activation-1.72e+00

Pos logit contributions
 Sara+0.018
Come+0.017
 Ben+0.017
 Danny+0.015
ax+0.015
Neg logit contributions
gy-0.028
zag-0.025
bel-0.025
bes-0.025
 puzzles-0.024
Token.
Token ID13
Feature activation-1.45e-02

Pos logit contributions
 Sara+0.041
 Ben+0.035
 ice+0.035
 strawberry+0.035
 o+0.033
Neg logit contributions
lights-0.070
opes-0.067
bel-0.067
gy-0.067
adoes-0.066
Token He
Token ID679
Feature activation+1.40e+00

Pos logit contributions
 Sara+0.000
 Ben+0.000
 ice+0.000
 strawberry+0.000
 o+0.000
Neg logit contributions
lights-0.001
opes-0.001
bel-0.001
gy-0.001
adoes-0.001
Token over
Token ID625
Feature activation+5.29e-02

Pos logit contributions
 Sara+0.003
 ice+0.003
 o+0.003
 Ben+0.003
 right+0.003
Neg logit contributions
lights-0.011
opes-0.010
arians-0.010
adoes-0.010
spe-0.009
Token to
Token ID284
Feature activation-4.77e-01

Pos logit contributions
lights+0.002
opes+0.002
bel+0.002
gy+0.002
adoes+0.002
Neg logit contributions
 Sara-0.001
 Ben-0.001
 ice-0.001
 strawberry-0.001
 o-0.001
Token a
Token ID257
Feature activation-3.51e-01

Pos logit contributions
 Sara+0.010
 strawberry+0.009
 ice+0.009
 vanilla+0.008
Come+0.008
Neg logit contributions
gy-0.019
opes-0.019
lights-0.019
bel-0.019
adoes-0.019
Token cool
Token ID3608
Feature activation-1.96e+00

Pos logit contributions
 Sara+0.010
 Ben+0.009
Come+0.008
 Danny+0.008
 vanilla+0.008
Neg logit contributions
gy-0.012
 puzzles-0.012
bel-0.012
zag-0.012
lights-0.012
Token and
Token ID290
Feature activation-4.54e-01

Pos logit contributions
 Sara+0.050
Come+0.045
 Ben+0.043
 vanilla+0.040
 ice+0.039
Neg logit contributions
gy-0.074
lights-0.073
zag-0.070
adoes-0.070
opes-0.069
Token shady
Token ID36135
Feature activation-4.83e+00

Pos logit contributions
 Sara+0.011
 Ben+0.009
 strawberry+0.009
 ice+0.009
 o+0.008
Neg logit contributions
lights-0.018
bel-0.018
gy-0.018
opes-0.017
adoes-0.017
Token spot
Token ID4136
Feature activation-3.73e+00

Pos logit contributions
 Sara+0.108
 Ben+0.095
Come+0.090
 ice+0.090
 strawberry+0.089
Neg logit contributions
lights-0.199
gy-0.196
bel-0.190
adoes-0.189
opes-0.189
Token.
Token ID13
Feature activation-7.26e-02

Pos logit contributions
 Sara+0.092
 Ben+0.080
 ice+0.078
 strawberry+0.077
 o+0.075
Neg logit contributions
lights-0.151
opes-0.145
bel-0.144
gy-0.142
adoes-0.141
Token There
Token ID1318
Feature activation+1.55e+00

Pos logit contributions
 Sara+0.002
 Ben+0.002
 ice+0.002
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.003
opes-0.003
bel-0.003
gy-0.003
adoes-0.003
Token were
Token ID547
Feature activation-5.55e-01

Pos logit contributions
lights+0.052
opes+0.050
gy+0.050
bel+0.050
adoes+0.048
Neg logit contributions
 Sara-0.047
 strawberry-0.042
 ice-0.041
 Ben-0.040
 o-0.040
Token big
Token ID1263
Feature activation+2.87e-01

Pos logit contributions
 Sara+0.017
 Ben+0.015
 strawberry+0.015
 say+0.015
 ice+0.015
Neg logit contributions
lights-0.018
opes-0.017
bel-0.017
 puzzles-0.016
gy-0.016
Token It
Token ID632
Feature activation+2.23e-02

Pos logit contributions
lights+0.023
opes+0.021
adoes+0.021
bel+0.021
 puzzles+0.021
Neg logit contributions
 o-0.013
 strawberry-0.013
 ice-0.013
 yawn-0.013
 Sara-0.012
Token was
Token ID373
Feature activation-4.30e-02

Pos logit contributions
lights+0.001
opes+0.001
bel+0.001
gy+0.001
adoes+0.001
Neg logit contributions
 Sara-0.001
 Ben-0.000
 ice-0.000
 strawberry-0.000
 o-0.000
Token very
Token ID845
Feature activation-4.32e-02

Pos logit contributions
 Sara+0.001
Come+0.001
 Goose+0.001
 strawberry+0.001
 Ben+0.001
Neg logit contributions
lights-0.002
opes-0.002
 puzzles-0.002
bel-0.002
gy-0.002
Token tall
Token ID7331
Feature activation+1.45e-01

Pos logit contributions
Come+0.001
 Sara+0.001
 Goose+0.001
 Ben+0.001
 strawberry+0.001
Neg logit contributions
lights-0.001
bel-0.001
spe-0.001
zag-0.001
ways-0.001
Token and
Token ID290
Feature activation-2.83e-01

Pos logit contributions
lights+0.006
opes+0.006
bel+0.006
gy+0.006
adoes+0.006
Neg logit contributions
 Sara-0.003
 Ben-0.003
 ice-0.003
 strawberry-0.003
 o-0.002
Token shady
Token ID36135
Feature activation-4.82e+00

Pos logit contributions
 Sara+0.007
 strawberry+0.006
 Ben+0.006
 ice+0.006
 Goose+0.006
Neg logit contributions
lights-0.011
bel-0.010
gy-0.010
 puzzles-0.010
opes-0.010
Token.
Token ID13
Feature activation+5.90e-01

Pos logit contributions
 Sara+0.112
 Ben+0.096
 ice+0.095
 strawberry+0.091
 o+0.091
Neg logit contributions
lights-0.203
opes-0.197
bel-0.194
gy-0.190
adoes-0.190
Token Lily
Token ID20037
Feature activation-1.29e+00

Pos logit contributions
lights+0.025
opes+0.023
adoes+0.023
 robots+0.023
bel+0.022
Neg logit contributions
 o-0.012
 strawberry-0.012
 ice-0.012
 yawn-0.011
 sne-0.011
Token and
Token ID290
Feature activation-1.59e+00

Pos logit contributions
 Sara+0.033
 strawberry+0.028
 ice+0.028
 o+0.028
 warmer+0.028
Neg logit contributions
opes-0.048
gy-0.048
bel-0.048
lights-0.047
zag-0.047
Token Tom
Token ID4186
Feature activation+3.88e-01

Pos logit contributions
 o+0.021
 strawberry+0.021
 warmer+0.021
 mir+0.020
 Sara+0.019
Neg logit contributions
 puzzles-0.071
opes-0.070
 robots-0.070
adoes-0.070
gy-0.069
Token wanted
Token ID2227
Feature activation-5.31e-01

Pos logit contributions
opes+0.013
lights+0.013
bel+0.013
gy+0.013
spe+0.012
Neg logit contributions
 Sara-0.011
 strawberry-0.010
 Ben-0.010
 ice-0.010
 o-0.009
Token felt
Token ID2936
Feature activation+3.61e-01

Pos logit contributions
gy+0.004
lights+0.004
bel+0.004
opes+0.004
zag+0.004
Neg logit contributions
 Sara-0.003
 strawberry-0.003
 Ben-0.003
 ice-0.003
 vanilla-0.003
Token tired
Token ID10032
Feature activation-1.22e+00

Pos logit contributions
lights+0.013
bel+0.013
 puzzles+0.012
gy+0.012
ways+0.012
Neg logit contributions
 Sara-0.008
 strawberry-0.008
 Ben-0.007
Come-0.007
 Goose-0.007
Token,
Token ID11
Feature activation-3.93e-02

Pos logit contributions
 Sara+0.035
 strawberry+0.033
Come+0.032
 ge+0.030
 Ben+0.030
Neg logit contributions
lights-0.042
gy-0.038
bel-0.038
adoes-0.037
 robots-0.036
Token so
Token ID523
Feature activation-1.12e+00

Pos logit contributions
 Sara+0.001
 strawberry+0.001
 ice+0.001
 Ben+0.001
 o+0.001
Neg logit contributions
lights-0.001
opes-0.001
bel-0.001
adoes-0.001
gy-0.001
Token he
Token ID339
Feature activation-1.05e+00

Pos logit contributions
 Sara+0.029
 strawberry+0.026
 ice+0.025
 Ben+0.025
 o+0.024
Neg logit contributions
lights-0.042
bel-0.041
gy-0.040
opes-0.039
spe-0.039
Token lay
Token ID3830
Feature activation-4.82e+00

Pos logit contributions
 Sara+0.027
 strawberry+0.023
 Ben+0.023
 vanilla+0.022
 ice+0.022
Neg logit contributions
lights-0.037
bel-0.036
gy-0.036
opes-0.034
spe-0.034
Token on
Token ID319
Feature activation+5.37e-01

Pos logit contributions
 Sara+0.139
 Ben+0.125
 ice+0.124
 strawberry+0.123
 o+0.119
Neg logit contributions
lights-0.173
bel-0.166
gy-0.163
opes-0.162
adoes-0.160
Token the
Token ID262
Feature activation+6.77e-01

Pos logit contributions
lights+0.024
opes+0.023
gy+0.023
bel+0.023
adoes+0.022
Neg logit contributions
 Sara-0.010
 Ben-0.009
Come-0.008
 strawberry-0.008
 ice-0.008
Token floor
Token ID4314
Feature activation-3.66e-01

Pos logit contributions
lights+0.026
opes+0.025
gy+0.025
bel+0.024
adoes+0.024
Neg logit contributions
 Sara-0.018
 Ben-0.017
 strawberry-0.015
Come-0.015
 ice-0.015
Token,
Token ID11
Feature activation-1.56e-01

Pos logit contributions
 Sara+0.008
 Ben+0.007
 ice+0.007
 strawberry+0.007
 o+0.006
Neg logit contributions
lights-0.016
opes-0.015
bel-0.015
gy-0.015
adoes-0.015
Token and
Token ID290
Feature activation-8.72e-01

Pos logit contributions
 Sara+0.004
 Ben+0.003
 strawberry+0.003
 ice+0.003
 Goose+0.003
Neg logit contributions
lights-0.006
bel-0.006
gy-0.005
opes-0.005
adoes-0.005
Token arrived
Token ID5284
Feature activation+8.06e-01

Pos logit contributions
 Sara+0.019
 strawberry+0.017
 vanilla+0.016
 Ben+0.016
 ice+0.016
Neg logit contributions
bel-0.019
gy-0.019
lights-0.018
spe-0.017
cases-0.017
Token,
Token ID11
Feature activation-4.03e-01

Pos logit contributions
lights+0.033
opes+0.032
bel+0.031
gy+0.031
adoes+0.030
Neg logit contributions
 Sara-0.021
 Ben-0.018
 ice-0.018
 strawberry-0.017
 o-0.017
Token they
Token ID484
Feature activation-5.59e-01

Pos logit contributions
 Sara+0.012
 Ben+0.011
 ice+0.011
 strawberry+0.010
 o+0.010
Neg logit contributions
lights-0.014
opes-0.014
bel-0.013
gy-0.013
adoes-0.013
Token found
Token ID1043
Feature activation-4.07e-01

Pos logit contributions
 Sara+0.013
 strawberry+0.012
 ice+0.011
 vanilla+0.011
Come+0.011
Neg logit contributions
bel-0.021
gy-0.021
spe-0.020
cases-0.020
lights-0.020
Token a
Token ID257
Feature activation-2.77e-01

Pos logit contributions
 yawn+0.009
 Sara+0.009
 sne+0.009
 Ben+0.008
 strawberry+0.008
Neg logit contributions
lights-0.015
gy-0.015
adoes-0.015
opes-0.014
bel-0.014
Token shady
Token ID36135
Feature activation-4.81e+00

Pos logit contributions
 Sara+0.008
 Ben+0.007
Come+0.007
 Danny+0.006
ax+0.006
Neg logit contributions
zag-0.010
bel-0.010
gy-0.010
lights-0.010
 puzzles-0.009
Token spot
Token ID4136
Feature activation-3.40e+00

Pos logit contributions
 Sara+0.110
 Ben+0.096
 ice+0.090
 strawberry+0.090
Come+0.089
Neg logit contributions
lights-0.199
gy-0.193
bel-0.190
opes-0.189
zag-0.187
Token to
Token ID284
Feature activation-5.49e-01

Pos logit contributions
 Sara+0.085
 Ben+0.073
 ice+0.073
 strawberry+0.071
 o+0.070
Neg logit contributions
opes-0.136
lights-0.136
adoes-0.130
bel-0.129
gy-0.128
Token sit
Token ID1650
Feature activation-2.42e+00

Pos logit contributions
Come+0.013
ax+0.012
 Ben+0.012
 Sara+0.012
 Goose+0.011
Neg logit contributions
gy-0.018
bel-0.016
 puzzles-0.016
zag-0.016
bes-0.016
Token in
Token ID287
Feature activation+1.72e+00

Pos logit contributions
 Sara+0.066
 Ben+0.059
 strawberry+0.057
 ice+0.057
 o+0.055
Neg logit contributions
lights-0.092
gy-0.087
bel-0.087
opes-0.087
adoes-0.085
Token and
Token ID290
Feature activation+6.92e-01

Pos logit contributions
lights+0.075
gy+0.072
opes+0.070
adoes+0.069
bel+0.069
Neg logit contributions
 Sara-0.034
 Ben-0.030
 strawberry-0.029
Come-0.028
 sne-0.027
Token the
Token ID262
Feature activation+7.76e-01

Pos logit contributions
 Sara+0.032
Come+0.029
 strawberry+0.028
 Ben+0.027
iza+0.027
Neg logit contributions
gy-0.056
adoes-0.053
bel-0.052
lights-0.052
opes-0.052
Token ostr
Token ID42387
Feature activation+1.91e+00

Pos logit contributions
gy+0.028
zag+0.025
 creations+0.024
bel+0.023
 puzzles+0.023
Neg logit contributions
 Danny-0.021
 Sara-0.021
 Ben-0.021
Come-0.020
 Mara-0.018
Tokenich
Token ID488
Feature activation+8.97e-02

Pos logit contributions
lights+0.063
opes+0.057
 puzzles+0.055
adoes+0.055
bel+0.055
Neg logit contributions
 Sara-0.061
 ice-0.057
 o-0.056
 strawberry-0.055
 Ben-0.054
Token to
Token ID284
Feature activation-8.30e-02

Pos logit contributions
lights+0.004
opes+0.004
arians+0.004
bel+0.004
adoes+0.004
Neg logit contributions
 Sara-0.002
 ice-0.001
 o-0.001
 strawberry-0.001
 Ben-0.001
Token a
Token ID257
Feature activation-4.75e-01

Pos logit contributions
Come+0.002
 strawberry+0.002
 Sara+0.002
 vanilla+0.002
ax+0.002
Neg logit contributions
gy-0.003
opes-0.003
bel-0.003
 puzzles-0.003
 robots-0.003
Token shady
Token ID36135
Feature activation-4.81e+00

Pos logit contributions
 Sara+0.013
 Ben+0.012
Come+0.012
 Danny+0.011
ax+0.011
Neg logit contributions
gy-0.016
bel-0.016
 puzzles-0.016
zag-0.015
spe-0.015
Token tree
Token ID5509
Feature activation-1.15e+00

Pos logit contributions
 Sara+0.103
 Ben+0.088
 ice+0.084
 strawberry+0.084
Come+0.082
Neg logit contributions
lights-0.205
gy-0.200
bel-0.197
opes-0.196
adoes-0.194
Token and
Token ID290
Feature activation-6.70e-01

Pos logit contributions
 Sara+0.026
 Ben+0.023
 ice+0.022
 strawberry+0.022
 o+0.021
Neg logit contributions
lights-0.048
opes-0.047
bel-0.046
gy-0.046
adoes-0.045
Token kept
Token ID4030
Feature activation-1.57e+00

Pos logit contributions
 Sara+0.015
 Ben+0.012
 ice+0.012
 strawberry+0.012
Come+0.012
Neg logit contributions
gy-0.027
bel-0.027
lights-0.026
opes-0.026
adoes-0.025
Token him
Token ID683
Feature activation-4.21e-01

Pos logit contributions
 Sara+0.048
Ben+0.042
 Ben+0.041
ax+0.040
iza+0.039
Neg logit contributions
gy-0.049
opes-0.045
ths-0.043
bel-0.043
adoes-0.042
Token safe
Token ID3338
Feature activation-1.36e+00

Pos logit contributions
 Sara+0.008
 Ben+0.007
 strawberry+0.006
 ice+0.006
 o+0.006
Neg logit contributions
lights-0.019
opes-0.019
bel-0.019
gy-0.019
adoes-0.018
Token would
Token ID561
Feature activation-1.04e+00

Pos logit contributions
 Sara+0.015
 ice+0.014
 o+0.014
 sne+0.013
 Ben+0.013
Neg logit contributions
lights-0.032
adoes-0.029
opes-0.029
igsaw-0.028
arians-0.028
Token take
Token ID1011
Feature activation-1.17e+00

Pos logit contributions
 Sara+0.020
 ice+0.019
 o+0.018
 Ben+0.018
 sit+0.018
Neg logit contributions
lights-0.050
opes-0.046
adoes-0.045
gy-0.044
bel-0.044
Token out
Token ID503
Feature activation+1.10e+00

Pos logit contributions
 Sara+0.022
 ice+0.019
 Ben+0.019
 o+0.016
 better+0.016
Neg logit contributions
lights-0.057
opes-0.055
zag-0.054
bel-0.053
spe-0.053
Token a
Token ID257
Feature activation+3.72e-01

Pos logit contributions
arians+0.059
zag+0.058
lights+0.057
gy+0.057
opes+0.056
Neg logit contributions
 ice-0.016
 Sara-0.016
 Ben-0.015
 o-0.014
 right-0.012
Token j
Token ID474
Feature activation+1.56e+00

Pos logit contributions
lights+0.015
opes+0.015
adoes+0.014
gy+0.014
arians+0.014
Neg logit contributions
 Sara-0.010
 ice-0.009
 strawberry-0.009
 Ben-0.009
 o-0.009
Tokenigsaw
Token ID45636
Feature activation+9.45e+00

Pos logit contributions
opes+0.059
lights+0.057
adoes+0.057
gy+0.056
bel+0.056
Neg logit contributions
 Sara-0.043
 ice-0.039
 Ben-0.039
 o-0.038
 strawberry-0.038
Token puzzle
Token ID15027
Feature activation+2.46e+00

Pos logit contributions
opes+0.325
adoes+0.309
gy+0.304
arians+0.280
phones+0.276
Neg logit contributions
 ice-0.292
 better-0.284
 right-0.273
 o-0.269
 warm-0.264
Token and
Token ID290
Feature activation-2.53e-01

Pos logit contributions
opes+0.127
lights+0.122
adoes+0.121
gy+0.119
arians+0.117
Neg logit contributions
 ice-0.034
 Sara-0.033
 Ben-0.032
 o-0.030
 right-0.029
Token try
Token ID1949
Feature activation-1.60e+00

Pos logit contributions
 Sara+0.005
 ice+0.005
 Ben+0.005
 strawberry+0.004
 o+0.004
Neg logit contributions
lights-0.012
opes-0.011
adoes-0.011
bel-0.011
gy-0.011
Token to
Token ID284
Feature activation+5.43e-01

Pos logit contributions
 Sara+0.033
 Ben+0.028
 ice+0.027
 strawberry+0.027
 o+0.026
Neg logit contributions
lights-0.071
opes-0.069
bel-0.068
gy-0.067
adoes-0.067
Token solve
Token ID8494
Feature activation+3.30e+00

Pos logit contributions
lights+0.020
opes+0.019
bel+0.018
gy+0.018
adoes+0.018
Neg logit contributions
 Sara-0.016
 Ben-0.014
 ice-0.014
 strawberry-0.014
 o-0.014
Token she
Token ID673
Feature activation-2.92e-01

Pos logit contributions
lights+0.032
opes+0.031
bel+0.030
gy+0.030
adoes+0.030
Neg logit contributions
 Sara-0.017
 ice-0.015
 strawberry-0.015
 Ben-0.014
 o-0.014
Token bowed
Token ID35737
Feature activation-6.31e-01

Pos logit contributions
 Sara+0.008
 Ben+0.007
 ice+0.007
 strawberry+0.007
 o+0.006
Neg logit contributions
lights-0.011
opes-0.011
bel-0.011
gy-0.011
adoes-0.010
Token,
Token ID11
Feature activation+4.78e-01

Pos logit contributions
 Sara+0.016
 Ben+0.014
 ice+0.014
 strawberry+0.013
 o+0.013
Neg logit contributions
lights-0.025
opes-0.024
bel-0.024
gy-0.024
adoes-0.023
Token the
Token ID262
Feature activation-2.05e-01

Pos logit contributions
lights+0.021
opes+0.020
bel+0.020
gy+0.020
adoes+0.020
Neg logit contributions
 Sara-0.010
 Ben-0.009
 ice-0.008
 strawberry-0.008
 o-0.008
Token town
Token ID3240
Feature activation+2.17e+00

Pos logit contributions
 Sara+0.006
 Ben+0.005
 strawberry+0.005
 ice+0.005
 o+0.005
Neg logit contributions
lights-0.008
opes-0.007
bel-0.007
adoes-0.007
gy-0.007
Tokenspe
Token ID4125
Feature activation+7.39e+00

Pos logit contributions
opes+0.087
lights+0.087
gy+0.082
adoes+0.082
bel+0.081
Neg logit contributions
 Sara-0.052
 ice-0.046
 Ben-0.046
 o-0.044
 strawberry-0.043
Tokenople
Token ID643
Feature activation-3.86e-01

Pos logit contributions
lights+0.201
opes+0.186
bel+0.179
zag+0.178
bes+0.174
Neg logit contributions
 ice-0.259
 Sara-0.258
 right-0.238
 o-0.237
 Ben-0.235
Token stopped
Token ID5025
Feature activation-5.54e-01

Pos logit contributions
 Sara+0.009
 Ben+0.008
 ice+0.008
 o+0.007
 strawberry+0.007
Neg logit contributions
lights-0.016
opes-0.016
bel-0.015
gy-0.015
adoes-0.015
Token to
Token ID284
Feature activation-1.98e-01

Pos logit contributions
 Sara+0.014
 strawberry+0.012
 ice+0.012
 Ben+0.012
 o+0.011
Neg logit contributions
lights-0.022
bel-0.022
opes-0.022
gy-0.021
adoes-0.021
Token admire
Token ID21099
Feature activation-3.76e-02

Pos logit contributions
 Sara+0.005
 strawberry+0.004
 Ben+0.004
 ice+0.004
Come+0.004
Neg logit contributions
bel-0.008
lights-0.007
gy-0.007
opes-0.007
adoes-0.007
Token her
Token ID607
Feature activation-4.08e-01

Pos logit contributions
 Sara+0.001
 Ben+0.001
 ice+0.000
 strawberry+0.000
 o+0.000
Neg logit contributions
lights-0.002
opes-0.002
bel-0.002
gy-0.002
zag-0.002
Token It
Token ID632
Feature activation+3.42e-01

Pos logit contributions
 ice+0.017
 o+0.017
 Sara+0.016
 sne+0.016
 ge+0.016
Neg logit contributions
lights-0.017
opes-0.016
igsaw-0.016
 puzzles-0.015
adoes-0.015
Token was
Token ID373
Feature activation+3.81e-01

Pos logit contributions
lights+0.014
opes+0.013
bel+0.013
gy+0.013
adoes+0.013
Neg logit contributions
 Sara-0.008
 Ben-0.007
 ice-0.007
 strawberry-0.007
 o-0.007
Token a
Token ID257
Feature activation-2.50e-01

Pos logit contributions
lights+0.016
gy+0.015
opes+0.015
 puzzles+0.015
bel+0.015
Neg logit contributions
 Sara-0.008
Come-0.007
 Ben-0.007
 ice-0.007
 strawberry-0.007
Token big
Token ID1263
Feature activation-6.68e-01

Pos logit contributions
 Sara+0.006
 Ben+0.005
Come+0.005
 ice+0.005
 o+0.005
Neg logit contributions
lights-0.010
bel-0.009
gy-0.009
zag-0.009
 puzzles-0.009
Token j
Token ID474
Feature activation+7.89e-01

Pos logit contributions
 Sara+0.018
 Ben+0.016
 ice+0.015
 strawberry+0.015
 o+0.015
Neg logit contributions
lights-0.025
opes-0.024
gy-0.024
bel-0.024
adoes-0.023
Tokenigsaw
Token ID45636
Feature activation+8.99e+00

Pos logit contributions
lights+0.027
opes+0.026
bel+0.025
gy+0.025
adoes+0.025
Neg logit contributions
 Sara-0.025
 Ben-0.022
 ice-0.022
 strawberry-0.022
 o-0.021
Token puzzle
Token ID15027
Feature activation+2.77e+00

Pos logit contributions
opes+0.315
adoes+0.296
gy+0.287
bel+0.272
phones+0.261
Neg logit contributions
 ice-0.284
 strawberry-0.265
 better-0.264
 o-0.264
 right-0.253
Token and
Token ID290
Feature activation-5.34e-01

Pos logit contributions
opes+0.130
lights+0.126
adoes+0.126
gy+0.124
bel+0.122
Neg logit contributions
 Sara-0.051
 ice-0.046
 Ben-0.045
 o-0.043
 strawberry-0.042
Token she
Token ID673
Feature activation-1.21e+00

Pos logit contributions
 Sara+0.011
 ice+0.009
 Ben+0.009
 strawberry+0.009
 o+0.009
Neg logit contributions
lights-0.024
opes-0.023
bel-0.023
gy-0.022
adoes-0.022
Token wanted
Token ID2227
Feature activation-4.82e-01

Pos logit contributions
 Sara+0.025
 Ben+0.021
 ice+0.021
 o+0.020
 sne+0.020
Neg logit contributions
lights-0.054
opes-0.053
adoes-0.051
gy-0.050
bel-0.050
Token to
Token ID284
Feature activation+1.14e-01

Pos logit contributions
 Sara+0.011
 Ben+0.009
 strawberry+0.009
 o+0.009
 ice+0.009
Neg logit contributions
lights-0.020
opes-0.019
bel-0.019
 puzzles-0.019
gy-0.019
Token Lily
Token ID20037
Feature activation-8.25e-01

Pos logit contributions
lights+0.005
opes+0.005
gy+0.005
bel+0.005
adoes+0.005
Neg logit contributions
 Sara-0.003
 ice-0.003
 strawberry-0.003
 o-0.003
 Ben-0.003
Token loved
Token ID6151
Feature activation-5.78e-01

Pos logit contributions
 Sara+0.021
 Ben+0.018
 ice+0.018
 strawberry+0.018
 o+0.017
Neg logit contributions
lights-0.033
opes-0.032
bel-0.031
gy-0.031
adoes-0.030
Token the
Token ID262
Feature activation+9.57e-01

Pos logit contributions
 Sara+0.013
 Ben+0.011
 ice+0.011
 strawberry+0.011
 o+0.011
Neg logit contributions
lights-0.024
opes-0.023
bel-0.023
gy-0.023
adoes-0.023
Token big
Token ID1263
Feature activation+3.72e-01

Pos logit contributions
lights+0.036
opes+0.035
bel+0.034
gy+0.034
adoes+0.033
Neg logit contributions
 Sara-0.027
 Ben-0.024
 ice-0.023
 strawberry-0.023
 o-0.023
Token Fer
Token ID12880
Feature activation+4.69e+00

Pos logit contributions
lights+0.014
opes+0.014
bel+0.013
gy+0.013
adoes+0.013
Neg logit contributions
 Sara-0.010
 Ben-0.009
 ice-0.009
 strawberry-0.009
 o-0.008
Tokenris
Token ID2442
Feature activation+7.81e+00

Pos logit contributions
adoes+0.190
opes+0.184
lights+0.167
phones+0.159
arians+0.158
Neg logit contributions
 Sara-0.111
 ice-0.107
 strawberry-0.100
ax-0.099
Tom-0.097
Token wheel
Token ID7825
Feature activation+1.10e+00

Pos logit contributions
lights+0.306
adoes+0.297
opes+0.296
bel+0.294
gy+0.287
Neg logit contributions
 Sara-0.198
 ice-0.192
 strawberry-0.182
 o-0.178
 Ben-0.169
Token the
Token ID262
Feature activation+2.94e-02

Pos logit contributions
opes+0.050
lights+0.050
adoes+0.047
bel+0.047
gy+0.046
Neg logit contributions
 Sara-0.022
 ice-0.019
 Ben-0.019
 o-0.017
 strawberry-0.017
Token most
Token ID749
Feature activation+3.53e-01

Pos logit contributions
lights+0.001
opes+0.001
bel+0.001
gy+0.001
adoes+0.001
Neg logit contributions
 Sara-0.001
 Ben-0.001
 ice-0.001
 strawberry-0.001
 o-0.001
Token.
Token ID13
Feature activation-4.92e-01

Pos logit contributions
lights+0.015
opes+0.015
gy+0.014
bel+0.014
adoes+0.014
Neg logit contributions
 Sara-0.008
 ice-0.007
 Ben-0.007
 strawberry-0.007
 o-0.007
Token She
Token ID1375
Feature activation+4.03e-01

Pos logit contributions
 Sara+0.011
 ice+0.010
 o+0.010
 strawberry+0.010
 Ben+0.009
Neg logit contributions
lights-0.020
opes-0.019
gy-0.019
adoes-0.019
bel-0.019
Token box
Token ID3091
Feature activation+1.03e+00

Pos logit contributions
 Sara+0.002
 Ben+0.002
 ice+0.002
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.002
opes-0.002
bel-0.002
gy-0.002
adoes-0.002
Token and
Token ID290
Feature activation-4.72e-02

Pos logit contributions
lights+0.042
bel+0.039
opes+0.039
gy+0.039
zag+0.038
Neg logit contributions
 Sara-0.025
 Ben-0.021
 strawberry-0.021
 ice-0.021
 o-0.020
Token saw
Token ID2497
Feature activation-1.69e-01

Pos logit contributions
 Sara+0.001
 strawberry+0.001
 ice+0.001
 Ben+0.001
 o+0.001
Neg logit contributions
lights-0.002
bel-0.002
gy-0.002
opes-0.002
adoes-0.002
Token a
Token ID257
Feature activation-1.15e+00

Pos logit contributions
 Sara+0.004
 strawberry+0.003
Come+0.003
 Ben+0.003
 ice+0.003
Neg logit contributions
lights-0.007
opes-0.007
bel-0.007
adoes-0.007
gy-0.007
Token j
Token ID474
Feature activation+4.35e-01

Pos logit contributions
 Sara+0.033
 Ben+0.029
Come+0.028
 o+0.027
 ice+0.026
Neg logit contributions
lights-0.040
bel-0.038
gy-0.038
 puzzles-0.038
zag-0.038
Tokenigsaw
Token ID45636
Feature activation+8.26e+00

Pos logit contributions
lights+0.014
opes+0.013
bel+0.013
gy+0.013
adoes+0.013
Neg logit contributions
 Sara-0.014
 Ben-0.013
 ice-0.013
 strawberry-0.013
 o-0.013
Token puzzle
Token ID15027
Feature activation+1.88e+00

Pos logit contributions
opes+0.283
adoes+0.270
bel+0.262
gy+0.260
lights+0.249
Neg logit contributions
 ice-0.254
 better-0.239
 Sara-0.233
 strawberry-0.231
 o-0.231
Token inside
Token ID2641
Feature activation-4.10e-01

Pos logit contributions
lights+0.078
opes+0.077
bel+0.075
gy+0.075
adoes+0.074
Neg logit contributions
 Sara-0.044
 ice-0.038
 Ben-0.038
 strawberry-0.037
 o-0.037
Token.
Token ID13
Feature activation+3.07e-01

Pos logit contributions
 Sara+0.011
 strawberry+0.009
 Ben+0.009
Come+0.009
 ice+0.009
Neg logit contributions
lights-0.015
bel-0.014
opes-0.014
spe-0.014
 puzzles-0.014
Token He
Token ID679
Feature activation+1.48e+00

Pos logit contributions
lights+0.010
bel+0.010
opes+0.009
adoes+0.009
cases+0.009
Neg logit contributions
 Sara-0.008
 sne-0.008
 o-0.008
 ice-0.008
 yawn-0.008
Token paused
Token ID24487
Feature activation+1.33e-01

Pos logit contributions
lights+0.063
opes+0.060
bel+0.059
gy+0.059
adoes+0.059
Neg logit contributions
 Sara-0.034
 Ben-0.029
 ice-0.029
 strawberry-0.028
 o-0.028
Token her
Token ID607
Feature activation-5.82e-01

Pos logit contributions
 Sara+0.030
 Ben+0.025
 ice+0.025
 strawberry+0.025
 o+0.024
Neg logit contributions
lights-0.069
opes-0.067
gy-0.066
bel-0.066
adoes-0.065
Token monkey
Token ID21657
Feature activation+5.28e-01

Pos logit contributions
 Sara+0.016
 Ben+0.015
 ice+0.015
 o+0.014
 strawberry+0.014
Neg logit contributions
lights-0.022
bel-0.021
opes-0.021
adoes-0.020
gy-0.020
Token the
Token ID262
Feature activation+3.55e-01

Pos logit contributions
lights+0.028
opes+0.026
arians+0.026
adoes+0.025
bel+0.025
Neg logit contributions
 Sara-0.009
 ice-0.008
 Ben-0.008
 strawberry-0.007
 o-0.007
Token name
Token ID1438
Feature activation-6.14e-01

Pos logit contributions
lights+0.013
opes+0.013
bel+0.013
gy+0.012
adoes+0.012
Neg logit contributions
 Sara-0.010
 Ben-0.009
 ice-0.009
 strawberry-0.008
 o-0.008
Token "
Token ID366
Feature activation+2.95e-01

Pos logit contributions
 Sara+0.015
 ice+0.013
 o+0.013
 strawberry+0.013
 Ben+0.013
Neg logit contributions
lights-0.023
opes-0.022
adoes-0.021
gy-0.021
bel-0.021
TokenB
Token ID33
Feature activation+3.57e+00

Pos logit contributions
lights+0.010
opes+0.010
bel+0.010
gy+0.010
adoes+0.010
Neg logit contributions
 Sara-0.009
 Ben-0.008
 ice-0.008
 strawberry-0.008
 o-0.008
TokenAN
Token ID1565
Feature activation-4.16e-01

Pos logit contributions
lights+0.123
opes+0.111
adoes+0.111
bel+0.109
 puzzles+0.106
Neg logit contributions
 Sara-0.116
 Ben-0.105
 ice-0.100
 strawberry-0.100
 o-0.096
TokenANA
Token ID31574
Feature activation+8.95e-02

Pos logit contributions
 Sara+0.014
 Ben+0.013
 ice+0.013
 strawberry+0.012
 o+0.012
Neg logit contributions
lights-0.013
opes-0.012
bel-0.012
gy-0.012
adoes-0.012
Token"
Token ID1
Feature activation+4.46e-01

Pos logit contributions
lights+0.004
opes+0.003
bel+0.003
gy+0.003
adoes+0.003
Neg logit contributions
 Sara-0.002
 Ben-0.002
 strawberry-0.002
 ice-0.002
 o-0.002
Token and
Token ID290
Feature activation-7.00e-01

Pos logit contributions
lights+0.018
bel+0.018
opes+0.017
gy+0.017
adoes+0.017
Neg logit contributions
 Sara-0.011
 strawberry-0.009
 ice-0.009
 Ben-0.009
 o-0.009
Token her
Token ID607
Feature activation-6.45e-01

Pos logit contributions
 Sara+0.016
 Ben+0.014
 ice+0.013
 strawberry+0.013
 o+0.013
Neg logit contributions
lights-0.030
opes-0.029
bel-0.029
gy-0.028
adoes-0.028
TokenN
Token ID45
Feature activation-3.58e-01

Pos logit contributions
 Sara+0.001
 ice+0.001
 Ben+0.001
 yawn+0.001
 Tom+0.001
Neg logit contributions
bel-0.002
lights-0.002
spe-0.002
zag-0.002
opes-0.002
Tokenora
Token ID5799
Feature activation-5.75e-01

Pos logit contributions
 Sara+0.012
 ice+0.011
 strawberry+0.011
 Ben+0.011
 vanilla+0.010
Neg logit contributions
lights-0.011
opes-0.010
bel-0.010
gy-0.010
zag-0.010
Token was
Token ID373
Feature activation-2.00e-01

Pos logit contributions
 Sara+0.013
 Ben+0.011
 ice+0.011
 strawberry+0.011
 o+0.011
Neg logit contributions
lights-0.025
opes-0.024
bel-0.023
adoes-0.023
gy-0.023
Token a
Token ID257
Feature activation-4.10e-01

Pos logit contributions
 Sara+0.005
 Ben+0.005
 strawberry+0.005
 ice+0.005
 o+0.005
Neg logit contributions
gy-0.007
bel-0.006
zag-0.006
spe-0.006
lights-0.006
Token fast
Token ID3049
Feature activation-1.43e+00

Pos logit contributions
 Sara+0.011
 ice+0.009
 strawberry+0.009
 Ben+0.009
 o+0.009
Neg logit contributions
lights-0.017
opes-0.016
adoes-0.016
bel-0.015
gy-0.015
Token lear
Token ID22454
Feature activation+4.06e+00

Pos logit contributions
 Sara+0.044
 Ben+0.039
 ice+0.039
 strawberry+0.039
 o+0.038
Neg logit contributions
lights-0.048
bel-0.045
gy-0.045
opes-0.045
adoes-0.044
Tokenner
Token ID1008
Feature activation+1.08e-01

Pos logit contributions
lights+0.108
opes+0.103
bel+0.103
gy+0.102
zag+0.100
Neg logit contributions
 Sara-0.150
 ice-0.138
 Ben-0.137
 strawberry-0.137
 o-0.134
Token.
Token ID13
Feature activation-3.62e-04

Pos logit contributions
lights+0.005
opes+0.005
adoes+0.005
bel+0.005
gy+0.004
Neg logit contributions
 Sara-0.002
 Ben-0.002
 ice-0.002
 strawberry-0.002
 o-0.002
Token Soon
Token ID15894
Feature activation+6.47e-01

Pos logit contributions
 Sara+0.000
 ice+0.000
 o+0.000
 sne+0.000
 strawberry+0.000
Neg logit contributions
lights-0.000
opes-0.000
bel-0.000
gy-0.000
adoes-0.000
Token enough
Token ID1576
Feature activation-1.43e+00

Pos logit contributions
lights+0.022
opes+0.021
bel+0.021
gy+0.021
adoes+0.021
Neg logit contributions
 Sara-0.017
 ice-0.016
 strawberry-0.015
 o-0.015
 Ben-0.015
Token she
Token ID673
Feature activation-8.77e-01

Pos logit contributions
 Sara+0.042
 ice+0.038
 strawberry+0.037
 o+0.037
 Ben+0.037
Neg logit contributions
lights-0.048
opes-0.045
adoes-0.044
 puzzles-0.044
bel-0.044
Token Ben
Token ID3932
Feature activation-4.46e-01

Pos logit contributions
 warmer+0.025
 ice+0.020
 better+0.019
 strawberry+0.019
 welcome+0.019
Neg logit contributions
opes-0.076
lights-0.073
 robots-0.072
 puzzles-0.072
bel-0.072
Token say
Token ID910
Feature activation-1.21e+00

Pos logit contributions
 Sara+0.008
 strawberry+0.008
 ice+0.007
 vanilla+0.006
Come+0.006
Neg logit contributions
opes-0.019
gy-0.018
bel-0.018
lights-0.018
cases-0.018
Token they
Token ID484
Feature activation+1.26e-01

Pos logit contributions
 Sara+0.031
 Ben+0.027
 ice+0.027
 strawberry+0.027
 o+0.026
Neg logit contributions
lights-0.048
opes-0.046
bel-0.046
gy-0.045
adoes-0.045
Token were
Token ID547
Feature activation+4.54e-01

Pos logit contributions
lights+0.005
opes+0.005
bel+0.005
gy+0.005
adoes+0.005
Neg logit contributions
 Sara-0.003
 strawberry-0.003
 Ben-0.002
 ice-0.002
 o-0.002
Token making
Token ID1642
Feature activation-3.24e-01

Pos logit contributions
opes+0.018
lights+0.018
bel+0.017
gy+0.017
adoes+0.017
Neg logit contributions
 Sara-0.010
 strawberry-0.010
 ice-0.009
Come-0.009
 Ben-0.009
Token art
Token ID1242
Feature activation+3.35e+00

Pos logit contributions
 Sara+0.007
Come+0.007
 strawberry+0.007
 Goose+0.007
 dove+0.007
Neg logit contributions
 puzzles-0.012
lights-0.012
opes-0.012
adoes-0.012
 robots-0.011
Token with
Token ID351
Feature activation+4.90e-01

Pos logit contributions
opes+0.144
gy+0.144
lights+0.137
zag+0.134
adoes+0.134
Neg logit contributions
 Sara-0.071
 ice-0.066
 o-0.063
 better-0.061
 Ben-0.061
Token the
Token ID262
Feature activation+1.85e-01

Pos logit contributions
lights+0.020
opes+0.019
bel+0.019
adoes+0.018
gy+0.018
Neg logit contributions
 Sara-0.012
 Ben-0.011
 strawberry-0.010
 ice-0.010
 o-0.010
Token ink
Token ID16882
Feature activation+5.84e-01

Pos logit contributions
lights+0.006
opes+0.006
gy+0.006
bel+0.006
adoes+0.006
Neg logit contributions
 Sara-0.006
 Ben-0.005
 ice-0.005
 strawberry-0.005
 o-0.005
Token.
Token ID13
Feature activation+2.64e-02

Pos logit contributions
lights+0.026
opes+0.025
bel+0.025
gy+0.025
adoes+0.024
Neg logit contributions
 Sara-0.012
 Ben-0.010
 ice-0.010
 o-0.010
 strawberry-0.010
Token Mom
Token ID11254
Feature activation-5.49e-01

Pos logit contributions
lights+0.001
bel+0.001
opes+0.001
adoes+0.001
gy+0.001
Neg logit contributions
 ice-0.001
 strawberry-0.001
 o-0.001
 yawn-0.001
 warmer-0.001
Token to
Token ID284
Feature activation+5.51e-01

Pos logit contributions
lights+0.017
opes+0.017
gy+0.016
bel+0.016
adoes+0.016
Neg logit contributions
 Sara-0.008
 Ben-0.007
 ice-0.007
 strawberry-0.006
 o-0.006
Token help
Token ID1037
Feature activation+1.62e-01

Pos logit contributions
lights+0.025
opes+0.024
bel+0.024
gy+0.024
adoes+0.024
Neg logit contributions
 Sara-0.011
 Ben-0.009
 ice-0.009
 strawberry-0.009
 o-0.008
Token!
Token ID0
Feature activation+3.65e-01

Pos logit contributions
lights+0.007
opes+0.007
bel+0.006
gy+0.006
adoes+0.006
Neg logit contributions
 Sara-0.004
 Ben-0.003
 ice-0.003
 strawberry-0.003
 o-0.003
Token We
Token ID775
Feature activation-1.57e+00

Pos logit contributions
lights+0.015
opes+0.015
bel+0.015
gy+0.014
adoes+0.014
Neg logit contributions
 Sara-0.009
 Ben-0.007
 ice-0.007
 strawberry-0.007
 o-0.007
Tokenâ
Token ID22940
Feature activation+2.38e+00

Pos logit contributions
 Sara+0.040
 Ben+0.035
 ice+0.035
 strawberry+0.034
 o+0.034
Neg logit contributions
lights-0.063
opes-0.060
bel-0.060
adoes-0.058
gy-0.058
Token€
Token ID26391
Feature activation+3.11e+00

Pos logit contributions
lights+0.083
opes+0.080
bel+0.079
gy+0.078
adoes+0.077
Neg logit contributions
 Sara-0.072
 Ben-0.064
 ice-0.064
 strawberry-0.063
 o-0.062
Token™
Token ID8151
Feature activation-4.96e-01

Pos logit contributions
lights+0.122
opes+0.111
bel+0.107
adoes+0.106
zag+0.105
Neg logit contributions
 Sara-0.087
 Ben-0.081
 Tom-0.075
 ice-0.074
 see-0.072
Tokenre
Token ID260
Feature activation+6.69e-01

Pos logit contributions
 Sara+0.015
 Ben+0.013
 ice+0.013
 strawberry+0.012
 o+0.012
Neg logit contributions
lights-0.018
opes-0.018
adoes-0.017
bel-0.017
gy-0.017
Token all
Token ID477
Feature activation+9.92e-01

Pos logit contributions
lights+0.031
opes+0.030
bel+0.029
gy+0.029
adoes+0.029
Neg logit contributions
 Sara-0.013
 Ben-0.011
 ice-0.011
 strawberry-0.011
 o-0.011
Token in
Token ID287
Feature activation+1.76e+00

Pos logit contributions
lights+0.045
opes+0.044
bel+0.043
gy+0.043
adoes+0.043
Neg logit contributions
 Sara-0.020
 Ben-0.017
 ice-0.016
 strawberry-0.016
 o-0.016
Token this
Token ID428
Feature activation+8.14e-01

Pos logit contributions
lights+0.081
opes+0.079
bel+0.077
gy+0.077
adoes+0.076
Neg logit contributions
 Sara-0.036
 Ben-0.030
 ice-0.029
 strawberry-0.028
 o-0.028
Token On
Token ID1550
Feature activation+1.78e+00

Pos logit contributions
 Sara+0.010
 o+0.010
 strawberry+0.009
 ice+0.009
 sne+0.009
Neg logit contributions
lights-0.016
opes-0.016
gy-0.016
adoes-0.015
 puzzles-0.015
Token this
Token ID428
Feature activation+1.06e-01

Pos logit contributions
lights+0.087
bel+0.086
opes+0.085
adoes+0.083
gy+0.082
Neg logit contributions
 Sara-0.031
 Ben-0.025
 ice-0.025
 strawberry-0.024
 o-0.023
Token ranch
Token ID28752
Feature activation+2.96e+00

Pos logit contributions
opes+0.004
lights+0.004
bel+0.003
adoes+0.003
arians+0.003
Neg logit contributions
 Sara-0.003
 ice-0.003
 o-0.003
 strawberry-0.003
 warm-0.003
Token lived
Token ID5615
Feature activation+2.30e-01

Pos logit contributions
lights+0.112
opes+0.108
bel+0.105
adoes+0.103
gy+0.102
Neg logit contributions
 Sara-0.084
 Ben-0.074
 ice-0.074
 strawberry-0.071
 o-0.071
Token two
Token ID734
Feature activation+1.75e+00

Pos logit contributions
bel+0.009
gy+0.009
opes+0.009
lights+0.008
 puzzles+0.008
Neg logit contributions
 Sara-0.005
 strawberry-0.005
 yawn-0.004
 sne-0.004
 ice-0.004
Token brothers
Token ID9397
Feature activation+3.20e+00

Pos logit contributions
lights+0.068
opes+0.066
bel+0.065
gy+0.064
adoes+0.063
Neg logit contributions
 Sara-0.046
 Ben-0.041
 ice-0.040
 strawberry-0.039
 o-0.039
Token;
Token ID26
Feature activation+1.48e+00

Pos logit contributions
lights+0.135
bel+0.132
adoes+0.126
opes+0.125
zag+0.121
Neg logit contributions
 Sara-0.083
 Ben-0.075
 ice-0.070
 o-0.067
 strawberry-0.065
Token Joe
Token ID5689
Feature activation+8.12e-01

Pos logit contributions
lights+0.062
opes+0.059
gy+0.059
bel+0.058
 puzzles+0.058
Neg logit contributions
 Sara-0.032
 strawberry-0.027
 o-0.027
 Ben-0.026
 ice-0.026
Token and
Token ID290
Feature activation-1.22e+00

Pos logit contributions
gy+0.032
opes+0.032
lights+0.032
bel+0.031
zag+0.030
Neg logit contributions
 Sara-0.019
 strawberry-0.017
 ice-0.016
 Ben-0.016
Come-0.016
Token Jim
Token ID5395
Feature activation+2.57e-01

Pos logit contributions
 Sara+0.023
 Ben+0.020
 ice+0.018
 strawberry+0.017
 o+0.017
Neg logit contributions
lights-0.058
opes-0.057
bel-0.056
adoes-0.055
gy-0.055
Token.
Token ID13
Feature activation-6.15e-01

Pos logit contributions
lights+0.013
adoes+0.012
bel+0.012
opes+0.012
zag+0.012
Neg logit contributions
 Sara-0.004
 Ben-0.004
 ice-0.004
 o-0.003
 say-0.003
Token.
Token ID13
Feature activation+2.17e-01

Pos logit contributions
lights+0.039
opes+0.039
bel+0.038
gy+0.037
adoes+0.037
Neg logit contributions
 Sara-0.017
 Ben-0.015
 ice-0.014
 strawberry-0.014
 o-0.013
Token\n
Token ID198
Feature activation+1.85e-01

Pos logit contributions
lights+0.008
gy+0.008
bel+0.008
opes+0.008
adoes+0.008
Neg logit contributions
 Sara-0.005
 ice-0.005
 strawberry-0.005
 o-0.005
 Ben-0.005
TokenThe
Token ID464
Feature activation-2.97e-01

Pos logit contributions
bel+0.007
lights+0.007
spe+0.006
gy+0.006
zag+0.006
Neg logit contributions
 Sara-0.005
 ice-0.005
 Ben-0.005
 strawberry-0.005
 o-0.005
Token bug
Token ID5434
Feature activation-9.82e-01

Pos logit contributions
 Sara+0.007
 Ben+0.006
 ice+0.006
 strawberry+0.006
 o+0.005
Neg logit contributions
lights-0.013
opes-0.012
bel-0.012
gy-0.012
adoes-0.012
Token said
Token ID531
Feature activation+2.65e-01

Pos logit contributions
 Sara+0.020
 Ben+0.016
 ice+0.016
 strawberry+0.016
 o+0.015
Neg logit contributions
lights-0.044
opes-0.043
bel-0.042
gy-0.042
adoes-0.041
Token "
Token ID366
Feature activation+1.36e+00

Pos logit contributions
lights+0.010
bel+0.009
opes+0.009
gy+0.009
adoes+0.009
Neg logit contributions
 Sara-0.007
 Ben-0.006
 ice-0.006
 strawberry-0.006
 o-0.006
Tokenhello
Token ID31373
Feature activation+5.51e-01

Pos logit contributions
bel+0.056
lights+0.054
gy+0.054
spe+0.053
adoes+0.052
Neg logit contributions
 Sara-0.031
 ice-0.028
 Ben-0.028
 strawberry-0.027
 o-0.027
Token".
Token ID1911
Feature activation-1.74e-02

Pos logit contributions
lights+0.020
opes+0.019
gy+0.019
bel+0.019
adoes+0.019
Neg logit contributions
 Sara-0.015
 ice-0.013
 strawberry-0.013
 Ben-0.013
 o-0.013
Token But
Token ID887
Feature activation+6.33e-01

Pos logit contributions
 Sara+0.000
 Ben+0.000
 ice+0.000
 strawberry+0.000
 o+0.000
Neg logit contributions
lights-0.001
opes-0.001
bel-0.001
gy-0.001
adoes-0.001
Token the
Token ID262
Feature activation-3.04e-01

Pos logit contributions
lights+0.026
bel+0.025
gy+0.025
adoes+0.024
 puzzles+0.024
Neg logit contributions
 Sara-0.013
 strawberry-0.011
 ice-0.011
Come-0.011
 o-0.011
Token girl
Token ID2576
Feature activation-6.64e-01

Pos logit contributions
 Sara+0.007
 Ben+0.006
 ice+0.006
 o+0.006
 strawberry+0.006
Neg logit contributions
lights-0.012
bel-0.012
opes-0.011
gy-0.011
adoes-0.011
Token said
Token ID531
Feature activation-8.34e-02

Pos logit contributions
 Sara+0.017
 Ben+0.016
 ice+0.015
 o+0.014
 strawberry+0.014
Neg logit contributions
lights-0.051
bel-0.048
adoes-0.047
opes-0.047
gy-0.045
Token,
Token ID11
Feature activation+1.16e-01

Pos logit contributions
 Sara+0.002
 Ben+0.002
 ice+0.002
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.003
opes-0.003
bel-0.003
gy-0.003
adoes-0.003
Token "
Token ID366
Feature activation+1.25e+00

Pos logit contributions
opes+0.004
lights+0.004
gy+0.004
bel+0.004
zag+0.004
Neg logit contributions
 Sara-0.003
 ice-0.003
 o-0.003
 strawberry-0.003
 warmer-0.003
TokenLook
Token ID8567
Feature activation+6.82e-01

Pos logit contributions
lights+0.048
bel+0.048
opes+0.047
gy+0.047
adoes+0.046
Neg logit contributions
 Sara-0.032
 ice-0.027
 o-0.027
 Ben-0.027
 strawberry-0.026
Token at
Token ID379
Feature activation+6.81e-01

Pos logit contributions
lights+0.027
opes+0.026
bel+0.025
gy+0.025
adoes+0.025
Neg logit contributions
 Sara-0.018
 Ben-0.015
 ice-0.015
 strawberry-0.015
 o-0.015
Token our
Token ID674
Feature activation-4.89e-01

Pos logit contributions
lights+0.034
bel+0.034
spe+0.032
gy+0.032
zag+0.032
Neg logit contributions
 Sara-0.013
 Ben-0.012
 ice-0.011
 o-0.010
 strawberry-0.010
Token tower
Token ID10580
Feature activation+5.99e-01

Pos logit contributions
 Sara+0.016
 ice+0.016
 Ben+0.015
 o+0.015
 strawberry+0.014
Neg logit contributions
lights-0.015
gy-0.015
opes-0.015
bel-0.015
arians-0.014
Token!
Token ID0
Feature activation+3.30e-01

Pos logit contributions
lights+0.027
opes+0.027
bel+0.026
adoes+0.026
arians+0.025
Neg logit contributions
 Sara-0.013
 ice-0.011
 Ben-0.011
 o-0.010
 Tom-0.010
Token It
Token ID632
Feature activation+5.25e-01

Pos logit contributions
lights+0.014
opes+0.014
gy+0.014
bel+0.014
adoes+0.014
Neg logit contributions
 Sara-0.007
 ice-0.006
 o-0.006
 strawberry-0.006
 Ben-0.006
Token's
Token ID338
Feature activation-9.37e-01

Pos logit contributions
lights+0.025
bel+0.025
opes+0.024
adoes+0.024
gy+0.023
Neg logit contributions
 Sara-0.009
 ice-0.008
 Ben-0.008
 o-0.007
 better-0.007
Token so
Token ID523
Feature activation+6.48e-01

Pos logit contributions
 Sara+0.023
 Ben+0.020
 strawberry+0.020
Come+0.020
 ice+0.019
Neg logit contributions
lights-0.037
opes-0.036
gy-0.035
adoes-0.035
bel-0.035
Token the
Token ID262
Feature activation+6.07e-01

Pos logit contributions
Come+0.060
 mir+0.050
 Come+0.049
 wake+0.049
 Promise+0.048
Neg logit contributions
 puzzles-0.068
lights-0.063
opes-0.059
adoes-0.057
 robots-0.056
Token soup
Token ID17141
Feature activation-1.19e+00

Pos logit contributions
lights+0.023
 creations+0.022
opes+0.021
 puzzles+0.021
adoes+0.020
Neg logit contributions
 Sara-0.017
Come-0.014
 Ben-0.013
 Tom-0.012
ax-0.012
Token?"
Token ID1701
Feature activation-1.57e+00

Pos logit contributions
 Sara+0.034
Come+0.030
 Ben+0.029
 strawberry+0.028
 ice+0.028
Neg logit contributions
lights-0.042
bel-0.040
gy-0.038
zag-0.038
opes-0.038
Token asked
Token ID1965
Feature activation-6.59e-01

Pos logit contributions
 ice+0.035
 o+0.034
 strawberry+0.034
 warmer+0.033
 Goose+0.032
Neg logit contributions
bel-0.057
lights-0.056
 puzzles-0.052
 robots-0.051
gy-0.051
Token Lily
Token ID20037
Feature activation-1.47e+00

Pos logit contributions
 Sara+0.010
 Ben+0.008
 ice+0.007
 strawberry+0.007
 o+0.007
Neg logit contributions
lights-0.033
opes-0.032
bel-0.032
gy-0.032
adoes-0.032
Token.
Token ID13
Feature activation-3.30e-01

Pos logit contributions
 Sara+0.040
 strawberry+0.036
 ice+0.036
Come+0.035
 ge+0.034
Neg logit contributions
gy-0.054
lights-0.051
bel-0.050
zag-0.049
opes-0.049
Token\n
Token ID198
Feature activation-5.75e-01

Pos logit contributions
 ice+0.007
 o+0.007
 strawberry+0.007
 Sara+0.006
 warmer+0.006
Neg logit contributions
lights-0.013
bel-0.013
gy-0.012
opes-0.012
adoes-0.012
Token\n
Token ID198
Feature activation-6.07e-01

Pos logit contributions
 strawberry+0.013
 ice+0.013
 o+0.012
 warmer+0.012
 yawn+0.012
Neg logit contributions
lights-0.021
zag-0.021
bel-0.021
gy-0.020
opes-0.020
Token"
Token ID1
Feature activation+6.45e-01

Pos logit contributions
 Sara+0.019
 ice+0.018
 strawberry+0.018
 o+0.017
 Ben+0.017
Neg logit contributions
lights-0.019
bel-0.019
gy-0.018
opes-0.018
zag-0.018
TokenOf
Token ID5189
Feature activation-3.16e-01

Pos logit contributions
bel+0.022
lights+0.022
opes+0.021
 puzzles+0.020
zag+0.020
Neg logit contributions
 Sara-0.018
 Ben-0.016
 o-0.016
 ice-0.016
 yawn-0.016
Token course
Token ID1781
Feature activation+1.99e+00

Pos logit contributions
 Sara+0.013
 Ben+0.012
 ice+0.012
 strawberry+0.012
 o+0.012
Neg logit contributions
lights-0.007
opes-0.007
gy-0.007
bel-0.006
adoes-0.006
Token it
Token ID340
Feature activation-7.36e-01

Pos logit contributions
lights+0.001
bel+0.001
adoes+0.001
opes+0.001
gy+0.001
Neg logit contributions
 Sara-0.001
 strawberry-0.001
 ice-0.001
 Ben-0.001
Come-0.001
Token was
Token ID373
Feature activation+4.57e-01

Pos logit contributions
 Sara+0.017
 Ben+0.015
 ice+0.014
 strawberry+0.014
 o+0.014
Neg logit contributions
lights-0.031
opes-0.030
bel-0.030
gy-0.029
adoes-0.029
Token okay
Token ID8788
Feature activation-1.29e+00

Pos logit contributions
gy+0.017
lights+0.017
opes+0.017
 puzzles+0.017
adoes+0.017
Neg logit contributions
 Sara-0.010
Come-0.009
 strawberry-0.009
 ice-0.009
 ge-0.009
Token,
Token ID11
Feature activation-2.18e-01

Pos logit contributions
 Sara+0.029
 Ben+0.025
 ice+0.024
 strawberry+0.023
 o+0.023
Neg logit contributions
lights-0.056
opes-0.054
bel-0.053
gy-0.052
adoes-0.052
Token and
Token ID290
Feature activation-6.61e-01

Pos logit contributions
 Sara+0.006
 strawberry+0.005
 ice+0.005
 o+0.005
 Ben+0.005
Neg logit contributions
lights-0.008
adoes-0.007
opes-0.007
bel-0.007
gy-0.007
Token that
Token ID326
Feature activation+2.10e-01

Pos logit contributions
 Sara+0.016
 ice+0.014
 Ben+0.014
 strawberry+0.014
 o+0.013
Neg logit contributions
lights-0.026
bel-0.025
gy-0.025
opes-0.025
adoes-0.024
Token accidents
Token ID17390
Feature activation+1.62e-01

Pos logit contributions
lights+0.008
opes+0.008
adoes+0.008
 puzzles+0.008
bel+0.008
Neg logit contributions
 Sara-0.005
 ice-0.004
 strawberry-0.004
 Ben-0.004
 o-0.004
Token happen
Token ID1645
Feature activation-5.30e-02

Pos logit contributions
lights+0.006
opes+0.006
bel+0.006
gy+0.006
adoes+0.006
Neg logit contributions
 Sara-0.004
 Ben-0.004
 ice-0.004
 strawberry-0.004
 o-0.004
Token.
Token ID13
Feature activation+6.80e-02

Pos logit contributions
 Sara+0.001
 Ben+0.001
 ice+0.001
 strawberry+0.001
 o+0.001
Neg logit contributions
lights-0.002
opes-0.002
bel-0.002
gy-0.002
adoes-0.002
Token She
Token ID1375
Feature activation+8.09e-01

Pos logit contributions
lights+0.003
opes+0.003
gy+0.003
adoes+0.003
bel+0.002
Neg logit contributions
 ice-0.001
 Sara-0.001
 o-0.001
 strawberry-0.001
 sne-0.001
Token took
Token ID1718
Feature activation-7.95e-01

Pos logit contributions
lights+0.034
bel+0.032
opes+0.032
gy+0.032
spe+0.031
Neg logit contributions
 Sara-0.017
 Ben-0.014
 strawberry-0.014
 ice-0.014
Come-0.013
Token the
Token ID262
Feature activation+6.71e-01

Pos logit contributions
lights+0.031
opes+0.030
bel+0.030
gy+0.030
zag+0.029
Neg logit contributions
 Sara-0.011
 Ben-0.009
 ice-0.008
 o-0.008
 strawberry-0.008
Token string
Token ID4731
Feature activation-4.33e-02

Pos logit contributions
lights+0.025
bel+0.024
opes+0.024
adoes+0.023
zag+0.023
Neg logit contributions
 Sara-0.019
 Ben-0.017
 strawberry-0.016
 o-0.015
 ice-0.015
Token,
Token ID11
Feature activation+6.37e-01

Pos logit contributions
 Sara+0.001
 Ben+0.001
 ice+0.001
 strawberry+0.001
 o+0.001
Neg logit contributions
lights-0.002
opes-0.002
bel-0.002
gy-0.002
adoes-0.002
Token he
Token ID339
Feature activation-3.21e-01

Pos logit contributions
lights+0.026
bel+0.025
opes+0.025
adoes+0.024
gy+0.024
Neg logit contributions
 Sara-0.015
 strawberry-0.012
 ice-0.012
 Ben-0.012
 o-0.012
Token put
Token ID1234
Feature activation-1.91e+00

Pos logit contributions
 Sara+0.008
 Ben+0.007
 ice+0.007
 strawberry+0.007
 o+0.007
Neg logit contributions
lights-0.013
opes-0.012
bel-0.012
gy-0.012
adoes-0.012
Token the
Token ID262
Feature activation+3.14e-01

Pos logit contributions
 Sara+0.032
 strawberry+0.026
Come+0.025
 Ben+0.024
 yawn+0.024
Neg logit contributions
lights-0.089
opes-0.084
bel-0.084
adoes-0.084
gy-0.083
Token top
Token ID1353
Feature activation+2.93e-02

Pos logit contributions
lights+0.011
bel+0.011
opes+0.011
adoes+0.010
gy+0.010
Neg logit contributions
 Sara-0.009
 Ben-0.008
 o-0.008
 strawberry-0.008
 ice-0.008
Token back
Token ID736
Feature activation-1.04e+00

Pos logit contributions
lights+0.001
bel+0.001
opes+0.001
gy+0.001
adoes+0.001
Neg logit contributions
 Sara-0.001
 Ben-0.001
 ice-0.001
 strawberry-0.001
 o-0.001
Token together
Token ID1978
Feature activation+2.27e+00

Pos logit contributions
 Sara+0.023
 ice+0.021
 Ben+0.020
 o+0.019
 strawberry+0.019
Neg logit contributions
lights-0.044
opes-0.044
bel-0.042
adoes-0.042
arians-0.041
Token.
Token ID13
Feature activation-8.53e-02

Pos logit contributions
lights+0.090
bel+0.086
opes+0.086
gy+0.085
zag+0.083
Neg logit contributions
 Sara-0.057
 Ben-0.049
 strawberry-0.048
 ice-0.047
 o-0.046
Token Sam
Token ID3409
Feature activation+3.49e-01

Pos logit contributions
 Sara+0.002
 ice+0.002
 Ben+0.002
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.003
opes-0.003
bel-0.003
gy-0.003
adoes-0.003
Token game
Token ID983
Feature activation+1.06e+00

Pos logit contributions
lights+0.014
opes+0.013
adoes+0.013
gy+0.013
bel+0.013
Neg logit contributions
 Sara-0.012
 ice-0.012
 o-0.011
 Ben-0.011
 strawberry-0.011
Token on
Token ID319
Feature activation+9.17e-01

Pos logit contributions
lights+0.049
opes+0.049
gy+0.047
adoes+0.047
spe+0.046
Neg logit contributions
 Sara-0.021
 Ben-0.019
 ice-0.017
 o-0.016
 strawberry-0.015
Token her
Token ID607
Feature activation+2.69e-01

Pos logit contributions
lights+0.053
bel+0.052
opes+0.051
spe+0.050
cases+0.050
Neg logit contributions
 Sara-0.011
 Ben-0.008
 ice-0.006
 Tom-0.005
 strawberry-0.005
Token phone
Token ID3072
Feature activation+1.64e+00

Pos logit contributions
lights+0.010
bel+0.010
spe+0.010
adoes+0.009
opes+0.009
Neg logit contributions
 ice-0.007
 Sara-0.007
 Ben-0.007
 o-0.007
 strawberry-0.007
Token when
Token ID618
Feature activation-1.15e+00

Pos logit contributions
lights+0.081
opes+0.077
spe+0.076
bel+0.076
igsaw+0.074
Neg logit contributions
 Sara-0.027
 Ben-0.025
 ice-0.023
 o-0.021
 say-0.020
Token she
Token ID673
Feature activation-1.24e+00

Pos logit contributions
 Sara+0.025
 ice+0.021
 o+0.021
 strawberry+0.021
 Ben+0.021
Neg logit contributions
lights-0.048
opes-0.046
bel-0.046
gy-0.046
adoes-0.045
Token noticed
Token ID6810
Feature activation-6.07e-01

Pos logit contributions
 Sara+0.028
 ice+0.024
 strawberry+0.023
 Ben+0.023
 o+0.022
Neg logit contributions
lights-0.051
bel-0.050
opes-0.050
gy-0.050
zag-0.048
Token that
Token ID326
Feature activation-1.57e-01

Pos logit contributions
 Sara+0.013
 Ben+0.011
 ice+0.011
 strawberry+0.011
 o+0.010
Neg logit contributions
lights-0.027
opes-0.026
bel-0.025
gy-0.025
adoes-0.025
Token her
Token ID607
Feature activation-1.69e+00

Pos logit contributions
 Sara+0.003
 Ben+0.002
 ice+0.002
 strawberry+0.002
 o+0.002
Neg logit contributions
lights-0.007
opes-0.007
bel-0.007
gy-0.007
adoes-0.007
Token eyes
Token ID2951
Feature activation-1.38e+00

Pos logit contributions
 Sara+0.049
 Ben+0.045
 ice+0.044
 o+0.042
 strawberry+0.042
Neg logit contributions
lights-0.062
opes-0.058
bel-0.058
gy-0.057
spe-0.057
Token started
Token ID2067
Feature activation-1.15e+00

Pos logit contributions
 Sara+0.035
 Ben+0.030
 ice+0.030
 strawberry+0.029
 o+0.029
Neg logit contributions
lights-0.055
opes-0.053
bel-0.052
gy-0.052
adoes-0.051
Token,
Token ID11
Feature activation-5.81e-01

Pos logit contributions
 Sara+0.008
Come+0.007
 Ben+0.007
 ice+0.007
 strawberry+0.007
Neg logit contributions
lights-0.010
opes-0.010
gy-0.010
adoes-0.010
zag-0.010
Token fluffy
Token ID39145
Feature activation-2.72e+00

Pos logit contributions
 Sara+0.016
Come+0.013
 Danny+0.013
 Ben+0.013
iza+0.013
Neg logit contributions
 puzzles-0.020
lights-0.019
 robots-0.019
gy-0.019
bel-0.018
Token pillow
Token ID28774
Feature activation-8.05e-01

Pos logit contributions
 Sara+0.082
 Ben+0.071
 ice+0.071
 strawberry+0.070
Come+0.070
Neg logit contributions
lights-0.093
gy-0.092
opes-0.092
adoes-0.088
zag-0.086
Token in
Token ID287
Feature activation+6.29e-01

Pos logit contributions
 Sara+0.019
 Ben+0.017
 ice+0.016
 strawberry+0.016
 o+0.016
Neg logit contributions
lights-0.033
opes-0.032
bel-0.032
gy-0.031
adoes-0.031
Token her
Token ID607
Feature activation-1.14e+00

Pos logit contributions
lights+0.028
opes+0.027
gy+0.027
bel+0.027
adoes+0.026
Neg logit contributions
 Sara-0.013
 ice-0.011
 Ben-0.011
 strawberry-0.010
 o-0.010
Token hands
Token ID2832
Feature activation-1.69e+00

Pos logit contributions
 Sara+0.031
 Ben+0.027
 ice+0.026
 strawberry+0.026
 o+0.025
Neg logit contributions
lights-0.044
opes-0.042
gy-0.042
bel-0.041
adoes-0.041
Token.
Token ID13
Feature activation-1.72e-02

Pos logit contributions
 Sara+0.046
Come+0.041
 Ben+0.039
 ice+0.039
 strawberry+0.039
Neg logit contributions
lights-0.059
gy-0.059
opes-0.058
zag-0.057
bel-0.054
Token\n
Token ID198
Feature activation-2.92e-01

Pos logit contributions
 ice+0.000
 sne+0.000
 o+0.000
 Sara+0.000
 yawn+0.000
Neg logit contributions
lights-0.001
opes-0.001
adoes-0.001
gy-0.001
bel-0.001
Token\n
Token ID198
Feature activation-2.20e-01

Pos logit contributions
 ice+0.008
 sne+0.007
 warmer+0.007
 strawberry+0.007
 o+0.007
Neg logit contributions
bel-0.009
zag-0.009
lights-0.009
gy-0.009
bes-0.009
TokenThe
Token ID464
Feature activation-3.35e-01

Pos logit contributions
 Sara+0.006
 ice+0.006
 Ben+0.006
 o+0.006
 strawberry+0.006
Neg logit contributions
bel-0.007
lights-0.007
spe-0.007
opes-0.007
gy-0.007
Token bear
Token ID6842
Feature activation+1.61e-01

Pos logit contributions
 Sara+0.011
 Ben+0.009
 strawberry+0.009
 ice+0.009
Come+0.009
Neg logit contributions
lights-0.011
opes-0.010
gy-0.010
adoes-0.010
bel-0.010
Token work
Token ID670
Feature activation+2.58e+00

Pos logit contributions
lights+0.045
opes+0.044
bel+0.043
gy+0.043
adoes+0.042
Neg logit contributions
 Sara-0.025
 Ben-0.022
 ice-0.021
 strawberry-0.021
 o-0.020
Token hard
Token ID1327
Feature activation+1.47e+00

Pos logit contributions
lights+0.097
opes+0.096
gy+0.096
bel+0.094
zag+0.092
Neg logit contributions
 Sara-0.068
 Ben-0.063
 better-0.061
 ice-0.060
 right-0.058
Token,
Token ID11
Feature activation+6.37e-01

Pos logit contributions
lights+0.062
gy+0.060
bel+0.059
opes+0.058
zag+0.058
Neg logit contributions
 Sara-0.032
 Ben-0.030
 ice-0.029
 o-0.028
 strawberry-0.026
Token success
Token ID1943
Feature activation+9.63e-01

Pos logit contributions
lights+0.027
opes+0.026
bel+0.026
adoes+0.025
gy+0.025
Neg logit contributions
 Sara-0.014
 ice-0.012
 Ben-0.012
 strawberry-0.012
 o-0.011
Token will
Token ID481
Feature activation-2.13e+00

Pos logit contributions
lights+0.040
bel+0.039
opes+0.039
gy+0.039
adoes+0.038
Neg logit contributions
 Sara-0.021
 Ben-0.020
 ice-0.018
 o-0.018
 strawberry-0.017
Token come
Token ID1282
Feature activation-2.57e+00

Pos logit contributions
 Sara+0.043
 Ben+0.036
 strawberry+0.035
 ice+0.035
 o+0.033
Neg logit contributions
lights-0.095
opes-0.092
bel-0.091
gy-0.090
adoes-0.090
Token.
Token ID13
Feature activation+3.22e-01

Pos logit contributions
 Sara+0.059
 Ben+0.051
 ice+0.051
 o+0.049
 strawberry+0.048
Neg logit contributions
lights-0.107
opes-0.105
bel-0.101
adoes-0.099
gy-0.099
Token It
Token ID632
Feature activation+1.39e-02

Pos logit contributions
lights+0.013
opes+0.012
bel+0.012
gy+0.012
adoes+0.012
Neg logit contributions
 Sara-0.008
 Ben-0.007
 ice-0.007
 strawberry-0.007
 o-0.007
Token was
Token ID373
Feature activation+6.01e-02

Pos logit contributions
lights+0.001
opes+0.001
bel+0.001
gy+0.001
adoes+0.001
Neg logit contributions
 Sara-0.000
 Ben-0.000
 ice-0.000
 o-0.000
 strawberry-0.000
Token very
Token ID845
Feature activation-1.49e-01

Pos logit contributions
lights+0.002
bel+0.002
opes+0.002
gy+0.002
adoes+0.002
Neg logit contributions
 Sara-0.001
 Ben-0.001
 ice-0.001
 strawberry-0.001
Come-0.001
Token happy
Token ID3772
Feature activation-1.20e-01

Pos logit contributions
 Sara+0.004
 Ben+0.003
 strawberry+0.003
Come+0.003
 ice+0.003
Neg logit contributions
lights-0.006
bel-0.006
spe-0.006
opes-0.005
gy-0.005
Token the
Token ID262
Feature activation-1.04e+00

Pos logit contributions
 strawberry+0.034
 Sara+0.033
 ice+0.033
 vanilla+0.032
 o+0.032
Neg logit contributions
lights-0.057
gy-0.057
adoes-0.055
bel-0.054
 puzzles-0.053
Token center
Token ID3641
Feature activation-1.70e+00

Pos logit contributions
 Sara+0.030
 Ben+0.027
Come+0.027
 sne+0.026
 Danny+0.025
Neg logit contributions
gy-0.033
lights-0.033
 puzzles-0.032
opes-0.032
 robots-0.031
Token of
Token ID286
Feature activation-1.43e+00

Pos logit contributions
 Sara+0.042
 ice+0.037
 Ben+0.037
 strawberry+0.036
Come+0.036
Neg logit contributions
lights-0.068
opes-0.062
bel-0.062
zag-0.062
gy-0.061
Token the
Token ID262
Feature activation-7.57e-01

Pos logit contributions
 Sara+0.026
Come+0.023
 sne+0.023
 yawn+0.023
 warmer+0.023
Neg logit contributions
lights-0.060
 puzzles-0.058
 robots-0.056
adoes-0.056
opes-0.055
Token me
Token ID502
Feature activation-4.24e+00

Pos logit contributions
 Sara+0.022
 Ben+0.019
Come+0.018
 Gus+0.018
 sne+0.018
Neg logit contributions
lights-0.025
 puzzles-0.025
gy-0.025
bel-0.024
opes-0.024
Tokenadow
Token ID4584
Feature activation-1.41e+00

Pos logit contributions
 Sara+0.144
 strawberry+0.132
 ice+0.130
 Ben+0.130
Come+0.125
Neg logit contributions
lights-0.126
gy-0.122
opes-0.121
zag-0.121
bel-0.120
Token.
Token ID13
Feature activation-2.30e-01

Pos logit contributions
 Sara+0.036
 strawberry+0.033
 ice+0.033
 Ben+0.032
Come+0.030
Neg logit contributions
lights-0.054
gy-0.051
bel-0.051
zag-0.050
opes-0.050
Token She
Token ID1375
Feature activation+5.56e-01

Pos logit contributions
 ice+0.006
 Sara+0.005
 o+0.005
 strawberry+0.005
 sne+0.005
Neg logit contributions
lights-0.008
bel-0.008
gy-0.008
adoes-0.008
opes-0.008
Token ran
Token ID4966
Feature activation-7.80e-01

Pos logit contributions
gy+0.020
lights+0.020
bel+0.020
opes+0.020
zag+0.019
Neg logit contributions
 Sara-0.014
 Ben-0.013
 ice-0.013
 strawberry-0.012
 warmer-0.012
Token to
Token ID284
Feature activation+2.48e-01

Pos logit contributions
 Sara+0.018
 Ben+0.015
 strawberry+0.015
 ice+0.015
 o+0.015
Neg logit contributions
lights-0.032
bel-0.031
gy-0.031
opes-0.031
adoes-0.030
Token the
Token ID262
Feature activation-2.11e-01

Pos logit contributions
gy+0.009
bel+0.009
adoes+0.008
 puzzles+0.008
opes+0.008
Neg logit contributions
 strawberry-0.005
 vanilla-0.005
 Sara-0.005
 ice-0.005
 warmer-0.005
Token cookies
Token ID14746
Feature activation-8.40e-01

Pos logit contributions
 Sara+0.013
 Ben+0.011
Come+0.011
 strawberry+0.011
 Goose+0.010
Neg logit contributions
lights-0.021
opes-0.020
bel-0.020
 puzzles-0.020
adoes-0.019
Token for
Token ID329
Feature activation-4.52e-01

Pos logit contributions
 Sara+0.021
 strawberry+0.018
 Ben+0.018
Come+0.018
 ice+0.018
Neg logit contributions
lights-0.032
bel-0.031
opes-0.030
gy-0.030
zag-0.030
Token you
Token ID345
Feature activation-4.05e-01

Pos logit contributions
 Sara+0.010
Come+0.009
 strawberry+0.009
 o+0.008
 ice+0.008
Neg logit contributions
opes-0.018
lights-0.018
 puzzles-0.017
bel-0.017
adoes-0.017
Token.
Token ID13
Feature activation+1.30e+00

Pos logit contributions
 Sara+0.011
 strawberry+0.009
Come+0.009
 Ben+0.009
 ice+0.009
Neg logit contributions
lights-0.015
opes-0.015
gy-0.014
bel-0.014
zag-0.014
Token Do
Token ID2141
Feature activation-1.77e+00

Pos logit contributions
lights+0.040
adoes+0.040
gy+0.039
opes+0.039
phones+0.039
Neg logit contributions
 o-0.036
 ice-0.035
 strawberry-0.034
 sne-0.033
 sh-0.033
Token you
Token ID345
Feature activation-1.81e+00

Pos logit contributions
 Sara+0.033
 Ben+0.028
 ice+0.026
 strawberry+0.025
 o+0.024
Neg logit contributions
lights-0.086
bel-0.082
opes-0.082
gy-0.081
adoes-0.080
Token want
Token ID765
Feature activation-1.71e+00

Pos logit contributions
 Sara+0.047
 strawberry+0.045
 ice+0.040
 vanilla+0.039
 Goose+0.039
Neg logit contributions
gy-0.063
bel-0.060
opes-0.060
zag-0.060
lights-0.059
Token some
Token ID617
Feature activation-7.00e-01

Pos logit contributions
 Sara+0.044
Come+0.039
 Ben+0.038
 o+0.036
 strawberry+0.036
Neg logit contributions
lights-0.061
 puzzles-0.060
opes-0.059
bel-0.058
 robots-0.057
Token milk
Token ID7545
Feature activation-1.95e+00

Pos logit contributions
 Sara+0.018
 Ben+0.015
Come+0.015
 strawberry+0.015
 Goose+0.014
Neg logit contributions
lights-0.026
opes-0.025
 puzzles-0.025
bel-0.025
adoes-0.024
Token too
Token ID1165
Feature activation+2.63e-01

Pos logit contributions
 Sara+0.051
 Ben+0.045
Come+0.044
 strawberry+0.044
 ice+0.044
Neg logit contributions
lights-0.072
bel-0.070
gy-0.069
opes-0.068
adoes-0.067
Token?"
Token ID1701
Feature activation-1.65e+00

Pos logit contributions
lights+0.008
zag+0.007
bel+0.007
gy+0.007
adoes+0.007
Neg logit contributions
 Sara-0.008
Come-0.008
 Ben-0.008
 ge-0.008
 strawberry-0.007