TOP ACTIVATIONS
MAX = 0.048

As the aer opl ane took off , Lucy
. The aer opl ane took off and Tim
Suddenly , the aer opl ane started to shake .
One day , the aer opl ane saw a little girl
The aer opl ane soared high in the
When she saw the aer opl ane , she felt anxious
Then the aer opl ane tipped , and began
no , but the aer opl ane was very patient ,
." The aer opl ane flew around the park
was sad . The aer opl ane wanted to make her
tried to make the aer opl ane stop shaking , but
opl ane . The aer opl ane was very generous .
They asked the aer opl ane to fly away and
come back . The aer opl ane was scared , and
Then the aer opl ane started to get smaller
slow . Inside the aer opl ane , there were many
thank you to the aer opl ane . The
the sky . The aer opl ane liked to move fast
and thin . The aer opl ane flew calmly and then
and right . The aer opl ane was very happy when

MIN ACTIVATIONS
MIN = -0.054

rail , Mia 's foot slipped off the rail and onto
ripped the rocket apart and spat out the pieces . The
using his problem - s olving skills . From that day
the ground , its feet slipped on the oily surface .
its er aser when it accidentally dropped it into a deep
with her doll when it accidentally fell out of her hand
scared the bunny and it accidentally dropped the camera . The
with his friends when he accidentally fell and hurt his paw
it away a long time ago . It is not good
was playing and his weapon slipped out of his hands .
day , Lily 's ball accidentally went over Mrs . Smith
ran , her slippery feet slipped on the ground and the
to walk , her foot slipped on the wet ground .
he climbed , the leaf slipped and he fell down to
and Mia is the h ider . Tom counts to ten
someone a very long time ago . The little girl thought
Suddenly , the crystal slipped from their hands and fell
But then Jack 's foot slipped , and he started to
Cole 's room when he accidentally disturbed him . Cole said
was sitting , his ball slipped out of his hands .

INTERVAL 0.023 - 0.048
CONTAINS 0.071%

They see photos of Grand ma and Grand pa in
and not strong like the other animals ." The owl told
The pupil and the attractive person became good friends .
. Tom and his dad and mum wanted to
's a problem with the pump . I need to open

INTERVAL -0.003 - 0.023
CONTAINS 68.573%

up . He used his shield to protect himself from the
stretched up into the sky . When she
They were happy in the open . <|endoftext|> Tom was a
the sky . " It 's so noisy and
morning , the shrimp filled the pond with cheer .

INTERVAL -0.028 - -0.003
CONTAINS 31.347%

cry . They feel dizz y from running so much .
curiosity was a good thing but it 's important to always
's just try both ways and then we can decide
what is this ?" He takes it out and reads it
Eventually the sun started to set and Lucy knew

INTERVAL -0.054 - -0.028
CONTAINS 0.009%

with her toys when she accidentally knocked over a lamp .
baking a cake and she accidentally forgot to put in the
But then , Lily slipped and fell into the fountain
day , her little brother accidentally broke it . Lily was
. But then , Max accidentally marked Lily 's favorite shirt
Token
Token ID198
Feature activation+1.18e-03

Pos logit contributions
 L+0.018
 me+0.018
 Lily+0.017
 swe+0.016
 my+0.016
Neg logit contributions
ver-0.023
five-0.021
seek-0.020
ival-0.020
tted-0.020
Token
Token ID198
Feature activation-4.00e-04

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
ival+0.002
tted+0.002
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.002
 swe-0.001
 my-0.001
TokenAs
Token ID1722
Feature activation-3.29e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 my+0.000
 swe+0.000
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
ival-0.001
glass-0.001
Token the
Token ID262
Feature activation+1.65e-02

Pos logit contributions
 L+0.004
 me+0.004
 Lily+0.003
 swe+0.003
 my+0.003
Neg logit contributions
ver-0.007
five-0.006
ival-0.006
seek-0.006
zed-0.006
Token aer
Token ID9551
Feature activation+3.30e-02

Pos logit contributions
ver+0.036
seek+0.034
five+0.034
ival+0.033
ative+0.033
Neg logit contributions
 L-0.018
 me-0.017
 Lily-0.017
 my-0.014
 p-0.013
Tokenopl
Token ID20106
Feature activation+4.79e-02

Pos logit contributions
ver+0.041
 voyage+0.037
 warrior+0.035
 commander+0.034
ative+0.033
Neg logit contributions
 me-0.063
 L-0.059
 Lily-0.055
 my-0.054
 you-0.053
Tokenane
Token ID1531
Feature activation+1.93e-02

Pos logit contributions
 voyage+0.047
 quest+0.046
zed+0.044
seek+0.042
 believe+0.042
Neg logit contributions
 L-0.099
 me-0.092
 you-0.092
 Lily-0.090
ries-0.089
Token took
Token ID1718
Feature activation-2.06e-03

Pos logit contributions
ver+0.041
five+0.040
seek+0.040
ative+0.039
asts+0.039
Neg logit contributions
 me-0.021
 L-0.020
 Lily-0.019
 p-0.017
 swe-0.015
Token off
Token ID572
Feature activation-6.76e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.002
 Mitt+0.001
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
ative-0.004
ival-0.004
Token,
Token ID11
Feature activation-4.35e-03

Pos logit contributions
 me+0.008
 L+0.007
 Lily+0.007
 swe+0.006
 mess+0.006
Neg logit contributions
ver-0.013
five-0.012
seek-0.012
ival-0.012
ative-0.012
Token Lucy
Token ID22162
Feature activation+9.75e-04

Pos logit contributions
 me+0.005
 L+0.005
 swe+0.004
 scar+0.004
lasses+0.004
Neg logit contributions
ver-0.008
five-0.008
ival-0.008
zed-0.008
he-0.007
Token.
Token ID13
Feature activation-7.99e-03

Pos logit contributions
ver+0.021
five+0.021
ival+0.021
seek+0.021
glass+0.020
Neg logit contributions
 L-0.006
 Lily-0.006
 me-0.005
 my-0.004
 Sara-0.004
Token
Token ID198
Feature activation-6.09e-04

Pos logit contributions
 L+0.010
 me+0.010
 swe+0.009
 scar+0.009
 my+0.009
Neg logit contributions
ver-0.014
five-0.013
seek-0.013
ival-0.013
ugg-0.013
Token
Token ID198
Feature activation-4.28e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 swe+0.001
 my+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
ival-0.001
glass-0.001
TokenThe
Token ID464
Feature activation+5.76e-04

Pos logit contributions
 L+0.005
 me+0.005
 Lily+0.005
 my+0.004
 swe+0.004
Neg logit contributions
ver-0.009
five-0.009
seek-0.008
ival-0.008
glass-0.008
Token aer
Token ID9551
Feature activation+3.22e-02

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ival+0.001
glass+0.001
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 my-0.001
 swe-0.001
Tokenopl
Token ID20106
Feature activation+4.79e-02

Pos logit contributions
ver+0.042
 voyage+0.037
 warrior+0.034
 commander+0.033
growth+0.033
Neg logit contributions
 me-0.063
 L-0.058
 Lily-0.056
 my-0.054
 mess-0.053
Tokenane
Token ID1531
Feature activation+9.03e-03

Pos logit contributions
 quest+0.054
 voyage+0.052
zed+0.050
 believe+0.049
adows+0.045
Neg logit contributions
 L-0.099
 you-0.090
 Lily-0.090
 me-0.088
ries-0.087
Token took
Token ID1718
Feature activation-8.27e-03

Pos logit contributions
ver+0.018
five+0.017
asts+0.017
seek+0.017
ative+0.017
Neg logit contributions
 me-0.011
 L-0.011
 Lily-0.010
 p-0.009
 swe-0.008
Token off
Token ID572
Feature activation-7.07e-03

Pos logit contributions
 L+0.008
 me+0.008
 Lily+0.007
 swe+0.006
 my+0.006
Neg logit contributions
five-0.017
ver-0.017
seek-0.017
ival-0.016
glass-0.016
Token and
Token ID290
Feature activation-6.80e-03

Pos logit contributions
 L+0.008
 me+0.008
 Lily+0.008
 swe+0.006
 my+0.006
Neg logit contributions
ver-0.014
five-0.014
seek-0.014
glass-0.013
ival-0.013
Token Tim
Token ID5045
Feature activation+4.09e-03

Pos logit contributions
 L+0.010
 me+0.010
 Lily+0.009
 my+0.008
 swe+0.008
Neg logit contributions
ver-0.012
five-0.011
seek-0.011
ival-0.011
glass-0.011
Token
Token ID198
Feature activation+5.35e-03

Pos logit contributions
ver+0.016
five+0.016
seek+0.016
ival+0.015
glass+0.015
Neg logit contributions
 L-0.010
 Lily-0.010
 me-0.010
 my-0.008
 Sara-0.008
TokenSuddenly
Token ID38582
Feature activation-3.12e-03

Pos logit contributions
ver+0.010
five+0.009
ative+0.009
seek+0.009
ival+0.009
Neg logit contributions
 Lily-0.007
 L-0.007
 me-0.007
 my-0.006
 Mitt-0.006
Token,
Token ID11
Feature activation+1.31e-03

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 my+0.003
 Sara+0.003
Neg logit contributions
five-0.006
ver-0.006
seek-0.006
glass-0.006
ival-0.006
Token the
Token ID262
Feature activation+2.84e-03

Pos logit contributions
ver+0.003
five+0.003
ival+0.002
glass+0.002
asts+0.002
Neg logit contributions
 me-0.001
 L-0.001
 swe-0.001
 Lily-0.001
 my-0.001
Token aer
Token ID9551
Feature activation+2.99e-02

Pos logit contributions
ver+0.005
five+0.005
seek+0.005
ival+0.005
glass+0.005
Neg logit contributions
 L-0.004
 Lily-0.003
 me-0.003
 my-0.003
 Mitt-0.003
Tokenopl
Token ID20106
Feature activation+4.78e-02

Pos logit contributions
ver+0.031
five+0.027
 voyage+0.025
ative+0.025
ival+0.024
Neg logit contributions
 me-0.066
 L-0.064
 Lily-0.061
 my-0.057
 you-0.056
Tokenane
Token ID1531
Feature activation+3.39e-03

Pos logit contributions
 voyage+0.039
 quest+0.039
seek+0.038
zed+0.038
five+0.038
Neg logit contributions
 L-0.106
 me-0.101
 Lily-0.097
 you-0.097
 my-0.091
Token started
Token ID2067
Feature activation-2.63e-03

Pos logit contributions
ver+0.006
five+0.006
seek+0.006
ival+0.006
glass+0.006
Neg logit contributions
 L-0.004
 me-0.004
 Lily-0.004
 my-0.004
 swe-0.003
Token to
Token ID284
Feature activation+9.59e-03

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 my+0.002
 Sara+0.002
Neg logit contributions
ver-0.005
seek-0.005
five-0.005
asts-0.005
glass-0.005
Token shake
Token ID13279
Feature activation-1.43e-03

Pos logit contributions
ver+0.015
five+0.015
seek+0.015
asts+0.014
 voyage+0.014
Neg logit contributions
 L-0.014
 Lily-0.014
 me-0.013
 my-0.012
 Sara-0.012
Token.
Token ID13
Feature activation-1.71e-03

Pos logit contributions
 L+0.002
 Mitt+0.002
 Lily+0.002
 me+0.002
lasses+0.001
Neg logit contributions
ver-0.002
ative-0.002
five-0.002
adows-0.002
seek-0.002
TokenOne
Token ID3198
Feature activation+2.91e-03

Pos logit contributions
ver+0.010
five+0.010
seek+0.010
ival+0.010
glass+0.009
Neg logit contributions
 L-0.009
 Lily-0.009
 me-0.009
 my-0.007
 swe-0.007
Token day
Token ID1110
Feature activation-1.35e-03

Pos logit contributions
seek+0.006
ver+0.006
ival+0.006
ives+0.006
tted+0.006
Neg logit contributions
 L-0.004
 Lily-0.004
 me-0.003
 my-0.003
 Sara-0.003
Token,
Token ID11
Feature activation+3.81e-03

Pos logit contributions
 L+0.002
 Lily+0.001
 me+0.001
 my+0.001
 Sara+0.001
Neg logit contributions
ver-0.003
seek-0.003
five-0.003
glass-0.003
ival-0.003
Token the
Token ID262
Feature activation+7.23e-03

Pos logit contributions
ver+0.008
five+0.007
ival+0.007
ative+0.007
asts+0.007
Neg logit contributions
 me-0.004
 swe-0.004
 L-0.004
 mess-0.004
 mistake-0.003
Token aer
Token ID9551
Feature activation+3.21e-02

Pos logit contributions
ver+0.014
five+0.014
glass+0.013
ival+0.013
seek+0.013
Neg logit contributions
 L-0.009
 Lily-0.008
 me-0.008
 my-0.007
 Mitt-0.007
Tokenopl
Token ID20106
Feature activation+4.75e-02

Pos logit contributions
ver+0.037
five+0.031
 voyage+0.031
ative+0.029
 warrior+0.029
Neg logit contributions
 me-0.068
 L-0.064
 Lily-0.062
 my-0.059
 us-0.058
Tokenane
Token ID1531
Feature activation+9.15e-03

Pos logit contributions
 quest+0.040
 voyage+0.040
zed+0.040
seek+0.038
five+0.037
Neg logit contributions
 L-0.104
 me-0.099
 you-0.097
 Lily-0.097
 my-0.090
Token saw
Token ID2497
Feature activation-4.42e-05

Pos logit contributions
ver+0.018
five+0.017
seek+0.017
ival+0.017
ative+0.017
Neg logit contributions
 me-0.011
 L-0.011
 Lily-0.011
 p-0.009
 my-0.009
Token a
Token ID257
Feature activation+3.60e-03

Pos logit contributions
 L+0.000
 me+0.000
 swe+0.000
 Lily+0.000
 my+0.000
Neg logit contributions
ver-0.000
five-0.000
glass-0.000
ative-0.000
seek-0.000
Token little
Token ID1310
Feature activation-5.32e-03

Pos logit contributions
 warrior+0.006
 voyage+0.006
ival+0.006
 rider+0.006
 sailor+0.005
Neg logit contributions
 L-0.005
 Lily-0.005
ries-0.004
 me-0.004
lasses-0.004
Token girl
Token ID2576
Feature activation-4.53e-03

Pos logit contributions
 L+0.007
 Lily+0.006
 me+0.006
lasses+0.006
 Mitt+0.006
Neg logit contributions
 warrior-0.009
 rider-0.008
five-0.008
 sailor-0.008
 astronaut-0.008
Token
Token ID220
Feature activation-1.06e-02

Pos logit contributions
 Lily+0.004
 L+0.004
 me+0.003
 Mitt+0.003
 my+0.003
Neg logit contributions
ver-0.004
seek-0.004
glass-0.004
five-0.004
ative-0.004
Token
Token ID198
Feature activation+3.47e-03

Pos logit contributions
 L+0.013
 Lily+0.012
 me+0.012
 my+0.010
 Sara+0.010
Neg logit contributions
ver-0.021
five-0.020
seek-0.020
glass-0.020
ival-0.020
Token
Token ID198
Feature activation+2.18e-05

Pos logit contributions
ver+0.007
five+0.007
seek+0.006
glass+0.006
ival+0.006
Neg logit contributions
 L-0.004
 me-0.004
 Lily-0.004
 my-0.003
 Sara-0.003
TokenThe
Token ID464
Feature activation-1.66e-03

Pos logit contributions
ver+0.000
five+0.000
seek+0.000
glass+0.000
ugg+0.000
Neg logit contributions
 me-0.000
 Lily-0.000
 L-0.000
 my-0.000
 Mitt-0.000
Token aer
Token ID9551
Feature activation+3.13e-02

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.001
 p+0.001
Neg logit contributions
ver-0.004
seek-0.003
five-0.003
ival-0.003
ative-0.003
Tokenopl
Token ID20106
Feature activation+4.75e-02

Pos logit contributions
ver+0.040
 voyage+0.034
 commander+0.033
 warrior+0.033
five+0.033
Neg logit contributions
 me-0.062
 L-0.056
 Lily-0.053
 my-0.053
 us-0.051
Tokenane
Token ID1531
Feature activation+7.92e-03

Pos logit contributions
 quest+0.051
 voyage+0.049
zed+0.049
seek+0.047
 believe+0.045
Neg logit contributions
 L-0.093
 you-0.088
 me-0.088
 Lily-0.083
 my-0.082
Token soared
Token ID35292
Feature activation+1.15e-02

Pos logit contributions
seek+0.016
five+0.016
asts+0.016
ibr+0.016
ver+0.016
Neg logit contributions
 me-0.009
 L-0.008
 Lily-0.008
 p-0.007
 too-0.007
Token high
Token ID1029
Feature activation+2.09e-02

Pos logit contributions
ver+0.022
five+0.021
seek+0.021
ival+0.021
glass+0.021
Neg logit contributions
 L-0.014
 me-0.014
 Lily-0.014
 my-0.011
 swe-0.011
Token in
Token ID287
Feature activation-4.28e-03

Pos logit contributions
ver+0.043
five+0.042
seek+0.041
ival+0.040
glass+0.040
Neg logit contributions
 L-0.023
 me-0.023
 Lily-0.022
 my-0.018
 swe-0.018
Token the
Token ID262
Feature activation+4.87e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 swe+0.002
 my+0.002
Neg logit contributions
ver-0.011
five-0.010
seek-0.010
glass-0.010
ival-0.010
TokenWhen
Token ID2215
Feature activation+3.97e-03

Pos logit contributions
ver+0.006
glass+0.005
seek+0.005
five+0.005
asts+0.005
Neg logit contributions
 me-0.005
 L-0.005
 Lily-0.005
 my-0.004
 Sara-0.004
Token she
Token ID673
Feature activation+9.81e-03

Pos logit contributions
ver+0.008
five+0.008
seek+0.008
ival+0.008
zed+0.008
Neg logit contributions
 L-0.004
 me-0.004
 Lily-0.004
 swe-0.003
 Mitt-0.003
Token saw
Token ID2497
Feature activation+2.76e-03

Pos logit contributions
ver+0.019
seek+0.018
ative+0.018
five+0.018
glass+0.018
Neg logit contributions
 L-0.013
 me-0.013
 Lily-0.012
 p-0.011
 my-0.010
Token the
Token ID262
Feature activation+6.42e-03

Pos logit contributions
ver+0.006
five+0.006
ival+0.006
seek+0.005
glass+0.005
Neg logit contributions
 L-0.003
 me-0.002
 Lily-0.002
 swe-0.002
 Mitt-0.002
Token aer
Token ID9551
Feature activation+3.02e-02

Pos logit contributions
ival+0.009
 voyage+0.009
glass+0.009
ane+0.009
five+0.009
Neg logit contributions
 Lily-0.011
 L-0.010
 Mitt-0.009
 Mia-0.009
 Sara-0.009
Tokenopl
Token ID20106
Feature activation+4.70e-02

Pos logit contributions
ver+0.031
five+0.027
glass+0.025
ative+0.025
 voyage+0.024
Neg logit contributions
 me-0.067
 L-0.063
 Lily-0.060
 my-0.058
 you-0.056
Tokenane
Token ID1531
Feature activation+7.20e-03

Pos logit contributions
 voyage+0.040
 quest+0.039
five+0.037
seek+0.037
zed+0.036
Neg logit contributions
 L-0.102
 me-0.098
 you-0.095
 Lily-0.093
ries-0.091
Token,
Token ID11
Feature activation-2.47e-03

Pos logit contributions
ver+0.013
five+0.012
seek+0.012
glass+0.012
ival+0.012
Neg logit contributions
 L-0.010
 Lily-0.009
 me-0.009
 Mitt-0.008
 Sara-0.008
Token she
Token ID673
Feature activation-7.08e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.002
 swe+0.002
 Mitt+0.002
Neg logit contributions
ver-0.005
five-0.005
ival-0.005
zed-0.004
ative-0.004
Token felt
Token ID2936
Feature activation+2.83e-03

Pos logit contributions
 L+0.011
 Lily+0.011
 swe+0.010
 Mitt+0.010
 my+0.010
Neg logit contributions
ver-0.011
 exited-0.010
glass-0.010
ival-0.010
five-0.010
Token anxious
Token ID18116
Feature activation+3.04e-03

Pos logit contributions
ver+0.006
five+0.006
seek+0.006
glass+0.006
ival+0.006
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 my-0.002
 Sara-0.002
Token
Token ID198
Feature activation+1.07e-03

Pos logit contributions
ver+0.002
five+0.002
glass+0.002
ival+0.002
seek+0.002
Neg logit contributions
 me-0.002
 L-0.002
 Lily-0.002
 swe-0.002
 my-0.002
Token
Token ID198
Feature activation+3.85e-03

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
ival+0.002
ative+0.002
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 my-0.001
 Sara-0.001
TokenThen
Token ID6423
Feature activation-1.57e-03

Pos logit contributions
ver+0.007
five+0.007
seek+0.006
ugg+0.006
asts+0.006
Neg logit contributions
 me-0.005
 L-0.005
 Lily-0.005
 my-0.005
 p-0.004
Token the
Token ID262
Feature activation+1.82e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.002
 my+0.002
Neg logit contributions
ver-0.003
five-0.003
glass-0.003
seek-0.003
ival-0.003
Token aer
Token ID9551
Feature activation+2.90e-02

Pos logit contributions
glass+0.003
 voyage+0.003
five+0.003
ival+0.003
asts+0.003
Neg logit contributions
 L-0.003
 Lily-0.003
 me-0.002
 Sara-0.002
 Mitt-0.002
Tokenopl
Token ID20106
Feature activation+4.70e-02

Pos logit contributions
ver+0.028
five+0.024
 voyage+0.022
ative+0.022
ival+0.022
Neg logit contributions
 me-0.065
 L-0.064
 Lily-0.061
 my-0.057
 Mitt-0.055
Tokenane
Token ID1531
Feature activation+8.14e-03

Pos logit contributions
 voyage+0.036
zed+0.036
five+0.036
seek+0.035
 quest+0.035
Neg logit contributions
 L-0.107
 me-0.100
 Lily-0.097
 you-0.096
 my-0.091
Token tipped
Token ID28395
Feature activation-1.34e-03

Pos logit contributions
ver+0.014
five+0.014
glass+0.014
seek+0.014
ival+0.013
Neg logit contributions
 L-0.011
 Lily-0.011
 me-0.010
 my-0.009
 Mitt-0.009
Token,
Token ID11
Feature activation-4.58e-04

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
 Mitt+0.002
 swe+0.002
Neg logit contributions
ver-0.002
seek-0.002
ative-0.002
five-0.002
ane-0.002
Token and
Token ID290
Feature activation+8.25e-03

Pos logit contributions
 L+0.001
 me+0.000
 Lily+0.000
 swe+0.000
 Mitt+0.000
Neg logit contributions
ver-0.001
glass-0.001
five-0.001
ival-0.001
asts-0.001
Token began
Token ID2540
Feature activation+7.06e-03

Pos logit contributions
ver+0.015
five+0.015
seek+0.014
glass+0.014
ival+0.014
Neg logit contributions
 L-0.011
 me-0.010
 Lily-0.010
 my-0.009
 swe-0.009
Token no
Token ID645
Feature activation-5.54e-03

Pos logit contributions
 L+0.005
 Lily+0.005
 me+0.005
 my+0.004
 Sara+0.004
Neg logit contributions
ver-0.009
seek-0.009
five-0.009
glass-0.009
ival-0.009
Token,
Token ID11
Feature activation+2.83e-03

Pos logit contributions
 L+0.006
 Lily+0.006
 me+0.006
 swe+0.005
 Mitt+0.005
Neg logit contributions
ver-0.010
five-0.010
seek-0.010
asts-0.010
glass-0.010
Token but
Token ID475
Feature activation-5.92e-04

Pos logit contributions
ver+0.005
five+0.005
glass+0.005
asts+0.005
 voyage+0.005
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 swe-0.003
 my-0.003
Token the
Token ID262
Feature activation+1.60e-04

Pos logit contributions
 L+0.001
 Lily+0.000
 me+0.000
 my+0.000
 Sara+0.000
Neg logit contributions
five-0.001
ver-0.001
seek-0.001
ival-0.001
glass-0.001
Token aer
Token ID9551
Feature activation+3.04e-02

Pos logit contributions
ver+0.000
five+0.000
glass+0.000
seek+0.000
asts+0.000
Neg logit contributions
 L-0.000
 Lily-0.000
 me-0.000
 Mitt-0.000
 my-0.000
Tokenopl
Token ID20106
Feature activation+4.64e-02

Pos logit contributions
ver+0.035
five+0.031
 voyage+0.030
ative+0.029
 warrior+0.029
Neg logit contributions
 me-0.062
 L-0.058
 Lily-0.055
 my-0.054
 you-0.052
Tokenane
Token ID1531
Feature activation+7.47e-03

Pos logit contributions
 quest+0.046
 voyage+0.045
zed+0.044
seek+0.043
 believe+0.043
Neg logit contributions
 L-0.094
 me-0.090
 you-0.089
 Lily-0.086
ries-0.084
Token was
Token ID373
Feature activation+1.55e-03

Pos logit contributions
five+0.015
ver+0.015
seek+0.015
asts+0.015
ival+0.015
Neg logit contributions
 me-0.008
 L-0.008
 Lily-0.008
 p-0.006
 my-0.006
Token very
Token ID845
Feature activation+6.77e-05

Pos logit contributions
ver+0.004
five+0.003
seek+0.003
ative+0.003
glass+0.003
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 my-0.001
 Sara-0.001
Token patient
Token ID5827
Feature activation-5.65e-04

Pos logit contributions
ver+0.000
five+0.000
seek+0.000
glass+0.000
ival+0.000
Neg logit contributions
 L-0.000
 me-0.000
 Lily-0.000
 my-0.000
 swe-0.000
Token,
Token ID11
Feature activation-9.93e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.000
 my+0.000
 swe+0.000
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
glass-0.001
ative-0.001
Token."
Token ID526
Feature activation+4.87e-03

Pos logit contributions
ver+0.002
ane+0.002
tted+0.002
wreck+0.002
 voyage+0.002
Neg logit contributions
 L-0.002
 swollen-0.002
 scar-0.002
 honey-0.002
 swe-0.002
Token
Token ID198
Feature activation-4.30e-03

Pos logit contributions
ver+0.008
five+0.008
seek+0.008
glass+0.008
ival+0.008
Neg logit contributions
 L-0.007
 me-0.007
 Lily-0.007
 swe-0.006
 my-0.006
Token
Token ID198
Feature activation-3.65e-04

Pos logit contributions
 L+0.005
 me+0.005
 Lily+0.005
 my+0.004
 Sara+0.003
Neg logit contributions
ver-0.009
five-0.009
seek-0.009
ative-0.008
asts-0.008
TokenThe
Token ID464
Feature activation+6.60e-04

Pos logit contributions
 me+0.001
 L+0.001
 Lily+0.001
 my+0.001
 p+0.001
Neg logit contributions
ver-0.000
five-0.000
ugg-0.000
seek-0.000
asts-0.000
Token aer
Token ID9551
Feature activation+2.88e-02

Pos logit contributions
glass+0.001
ival+0.001
five+0.001
 voyage+0.001
 rider+0.001
Neg logit contributions
 Lily-0.001
 L-0.001
 me-0.001
 Mia-0.001
 my-0.001
Tokenopl
Token ID20106
Feature activation+4.63e-02

Pos logit contributions
ver+0.027
five+0.023
 voyage+0.022
ival+0.021
ative+0.021
Neg logit contributions
 me-0.066
 L-0.064
 Lily-0.062
 my-0.058
 Mitt-0.056
Tokenane
Token ID1531
Feature activation+5.35e-03

Pos logit contributions
 voyage+0.034
zed+0.033
 quest+0.031
five+0.031
seek+0.030
Neg logit contributions
 L-0.110
 me-0.104
 Lily-0.101
 you-0.099
 my-0.095
Token flew
Token ID13112
Feature activation-6.38e-04

Pos logit contributions
ver+0.010
five+0.010
seek+0.010
ival+0.009
glass+0.009
Neg logit contributions
 L-0.007
 Lily-0.007
 me-0.007
 my-0.006
 Mitt-0.005
Token around
Token ID1088
Feature activation+1.48e-02

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.001
 Mitt+0.001
 Sara+0.001
Neg logit contributions
ver-0.001
ative-0.001
five-0.001
seek-0.001
tted-0.001
Token the
Token ID262
Feature activation+8.38e-03

Pos logit contributions
ver+0.033
five+0.032
seek+0.032
glass+0.031
ival+0.031
Neg logit contributions
 L-0.013
 me-0.013
 Lily-0.012
 swe-0.010
 my-0.010
Token park
Token ID3952
Feature activation+6.38e-03

Pos logit contributions
five+0.012
seek+0.012
glass+0.011
ival+0.011
asts+0.011
Neg logit contributions
 L-0.013
 Lily-0.013
 me-0.012
 Mitt-0.011
 swe-0.011
Token was
Token ID373
Feature activation+9.43e-03

Pos logit contributions
 L+0.000
 Lily+0.000
 me+0.000
 my+0.000
 Sara+0.000
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
glass-0.000
ival-0.000
Token sad
Token ID6507
Feature activation+1.79e-04

Pos logit contributions
ver+0.014
seek+0.013
 inspired+0.013
 exited+0.013
oved+0.013
Neg logit contributions
 L-0.014
 me-0.013
 swe-0.013
 Lily-0.013
 powder-0.012
Token.
Token ID13
Feature activation-2.10e-03

Pos logit contributions
ver+0.000
ative+0.000
five+0.000
seek+0.000
asts+0.000
Neg logit contributions
 L-0.000
 me-0.000
 Lily-0.000
 swe-0.000
 Mitt-0.000
Token The
Token ID383
Feature activation-4.03e-04

Pos logit contributions
 L+0.003
 me+0.003
 swe+0.003
 scar+0.003
 mess+0.003
Neg logit contributions
five-0.003
seek-0.003
ver-0.003
ative-0.003
 exited-0.003
Token aer
Token ID9551
Feature activation+3.01e-02

Pos logit contributions
 L+0.000
 Mitt+0.000
 Lily+0.000
 me+0.000
 Mia+0.000
Neg logit contributions
five-0.001
 astronaut-0.001
 rider-0.001
ival-0.001
glass-0.001
Tokenopl
Token ID20106
Feature activation+4.63e-02

Pos logit contributions
ver+0.034
 voyage+0.029
five+0.028
ative+0.027
 warrior+0.027
Neg logit contributions
 me-0.064
 L-0.061
 Lily-0.059
 my-0.056
 us-0.054
Tokenane
Token ID1531
Feature activation+5.86e-03

Pos logit contributions
 voyage+0.040
 quest+0.040
zed+0.038
seek+0.037
five+0.035
Neg logit contributions
 L-0.103
 me-0.097
 Lily-0.096
 you-0.095
ries-0.090
Token wanted
Token ID2227
Feature activation-4.67e-04

Pos logit contributions
ver+0.011
five+0.011
seek+0.010
glass+0.010
ival+0.010
Neg logit contributions
 L-0.008
 Lily-0.007
 me-0.007
 my-0.006
 Mitt-0.006
Token to
Token ID284
Feature activation+1.55e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 mess+0.001
 swe+0.001
Neg logit contributions
five-0.001
seek-0.001
ver-0.001
glass-0.001
asts-0.001
Token make
Token ID787
Feature activation-4.93e-04

Pos logit contributions
five+0.002
ver+0.002
seek+0.002
asts+0.002
glass+0.002
Neg logit contributions
 L-0.003
 Lily-0.002
 me-0.002
 swe-0.002
 Mitt-0.002
Token her
Token ID607
Feature activation+6.85e-03

Pos logit contributions
Oh+0.001
lasses+0.001
ounced+0.001
ety+0.001
 swollen+0.001
Neg logit contributions
five-0.001
 believe-0.001
asts-0.001
handed-0.001
seek-0.001
Token tried
Token ID3088
Feature activation-1.63e-03

Pos logit contributions
ver+0.003
five+0.003
seek+0.003
ival+0.003
glass+0.003
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.002
 my-0.002
 swe-0.001
Token to
Token ID284
Feature activation+2.66e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.001
 swe+0.001
Neg logit contributions
ver-0.003
five-0.003
seek-0.003
ival-0.003
glass-0.003
Token make
Token ID787
Feature activation-2.91e-03

Pos logit contributions
ver+0.004
five+0.004
seek+0.004
asts+0.004
glass+0.004
Neg logit contributions
 L-0.004
 Lily-0.004
 me-0.004
 swe-0.003
 Sara-0.003
Token the
Token ID262
Feature activation+3.71e-03

Pos logit contributions
 L+0.003
 me+0.002
 Lily+0.002
 swe+0.002
lasses+0.002
Neg logit contributions
five-0.006
ver-0.006
asts-0.006
ival-0.006
glass-0.006
Token aer
Token ID9551
Feature activation+3.12e-02

Pos logit contributions
five+0.007
ver+0.006
glass+0.006
seek+0.006
asts+0.006
Neg logit contributions
 L-0.005
 Lily-0.005
 me-0.005
 Mitt-0.004
 Sara-0.004
Tokenopl
Token ID20106
Feature activation+4.60e-02

Pos logit contributions
ver+0.033
five+0.030
 warrior+0.028
ival+0.027
ative+0.027
Neg logit contributions
 me-0.067
 L-0.062
 Lily-0.058
 my-0.058
 you-0.057
Tokenane
Token ID1531
Feature activation+2.45e-03

Pos logit contributions
 quest+0.044
five+0.044
 voyage+0.044
seek+0.042
zed+0.042
Neg logit contributions
 L-0.091
 you-0.090
 me-0.088
 Lily-0.082
 too-0.080
Token stop
Token ID2245
Feature activation-2.12e-03

Pos logit contributions
ver+0.005
five+0.005
seek+0.005
ival+0.004
glass+0.004
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 my-0.002
 swe-0.002
Token shaking
Token ID17275
Feature activation+7.00e-04

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.002
 Mitt+0.002
Neg logit contributions
ver-0.004
asts-0.004
glass-0.004
five-0.004
seek-0.004
Token,
Token ID11
Feature activation+7.99e-04

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ative+0.001
glass+0.001
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 my-0.001
 Mitt-0.001
Token but
Token ID475
Feature activation-4.75e-03

Pos logit contributions
ver+0.002
five+0.002
glass+0.001
seek+0.001
zed+0.001
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 swe-0.001
 my-0.001
Tokenopl
Token ID20106
Feature activation+3.34e-02

Pos logit contributions
ver+0.020
five+0.017
glass+0.015
ival+0.015
ative+0.015
Neg logit contributions
 me-0.052
 L-0.050
 Lily-0.049
 my-0.046
 Mitt-0.045
Tokenane
Token ID1531
Feature activation+4.64e-03

Pos logit contributions
 voyage+0.023
five+0.021
zed+0.021
 quest+0.021
seek+0.020
Neg logit contributions
 L-0.081
 me-0.076
 Lily-0.076
 you-0.072
 Sara-0.070
Token.
Token ID13
Feature activation+8.17e-03

Pos logit contributions
ver+0.009
five+0.009
seek+0.009
glass+0.009
ative+0.009
Neg logit contributions
 L-0.005
 Lily-0.005
 me-0.005
 swe-0.004
 Sara-0.004
Token The
Token ID383
Feature activation+6.65e-03

Pos logit contributions
ver+0.015
seek+0.015
ival+0.015
glass+0.015
five+0.014
Neg logit contributions
 Lily-0.012
 L-0.012
 me-0.010
 my-0.010
 Sara-0.010
Token aer
Token ID9551
Feature activation+2.91e-02

Pos logit contributions
ver+0.013
five+0.013
seek+0.012
ival+0.012
glass+0.012
Neg logit contributions
 L-0.008
 Lily-0.008
 me-0.008
 my-0.006
 swe-0.006
Tokenopl
Token ID20106
Feature activation+4.59e-02

Pos logit contributions
ver+0.032
five+0.027
 voyage+0.026
ative+0.025
ival+0.025
Neg logit contributions
 me-0.064
 L-0.060
 Lily-0.059
 my-0.056
 you-0.053
Tokenane
Token ID1531
Feature activation+1.04e-02

Pos logit contributions
 voyage+0.040
 quest+0.039
seek+0.039
zed+0.038
five+0.036
Neg logit contributions
 L-0.101
 me-0.096
 Lily-0.094
 you-0.093
 my-0.088
Token was
Token ID373
Feature activation+5.70e-03

Pos logit contributions
ver+0.024
five+0.024
seek+0.024
asts+0.023
ival+0.023
Neg logit contributions
 me-0.009
 L-0.009
 Lily-0.008
 p-0.007
 my-0.006
Token very
Token ID845
Feature activation-2.55e-03

Pos logit contributions
ver+0.011
five+0.011
glass+0.011
seek+0.011
ative+0.011
Neg logit contributions
 L-0.006
 Lily-0.006
 me-0.006
 swe-0.005
 Mitt-0.005
Token generous
Token ID14431
Feature activation-1.74e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.002
 Sara+0.002
Neg logit contributions
ver-0.005
five-0.005
seek-0.005
glass-0.005
ival-0.005
Token.
Token ID13
Feature activation+3.22e-03

Pos logit contributions
 L+0.002
 Lily+0.001
 me+0.001
 my+0.001
 Sara+0.001
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
ival-0.004
asts-0.004
Token
Token ID198
Feature activation-1.22e-03

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ival+0.001
glass+0.001
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 swe-0.001
 my-0.001
TokenThey
Token ID2990
Feature activation+3.15e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.002
 Mitt+0.001
Neg logit contributions
ver-0.002
five-0.002
seek-0.002
ival-0.002
glass-0.002
Token asked
Token ID1965
Feature activation-3.03e-03

Pos logit contributions
ver+0.005
five+0.005
glass+0.005
seek+0.005
ival+0.005
Neg logit contributions
 L-0.005
 Lily-0.004
 me-0.004
 my-0.004
 swe-0.004
Token the
Token ID262
Feature activation-1.44e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.002
 swe+0.002
Neg logit contributions
ver-0.007
five-0.007
seek-0.007
ival-0.007
glass-0.006
Token aer
Token ID9551
Feature activation+2.63e-02

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
 Mitt+0.002
 my+0.002
Neg logit contributions
ver-0.002
five-0.002
glass-0.002
ival-0.002
asts-0.002
Tokenopl
Token ID20106
Feature activation+4.56e-02

Pos logit contributions
ver+0.025
five+0.023
ative+0.020
ival+0.020
seek+0.020
Neg logit contributions
 me-0.059
 L-0.057
 Lily-0.054
 my-0.051
 you-0.050
Tokenane
Token ID1531
Feature activation+2.20e-03

Pos logit contributions
zed+0.041
five+0.040
 quest+0.040
seek+0.040
 voyage+0.040
Neg logit contributions
 L-0.096
 you-0.091
 me-0.091
 Lily-0.085
ries-0.083
Token to
Token ID284
Feature activation-4.90e-03

Pos logit contributions
ver+0.005
five+0.004
seek+0.004
glass+0.004
ival+0.004
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.002
 my-0.002
 swe-0.002
Token fly
Token ID6129
Feature activation-3.99e-03

Pos logit contributions
 L+0.008
 Lily+0.008
 swe+0.007
 me+0.007
 Mitt+0.007
Neg logit contributions
five-0.007
ver-0.007
asts-0.006
 believe-0.006
seek-0.006
Token away
Token ID1497
Feature activation+9.33e-04

Pos logit contributions
 L+0.005
 Lily+0.005
 me+0.005
 Mitt+0.005
 swe+0.005
Neg logit contributions
ver-0.007
five-0.006
glass-0.006
seek-0.006
ative-0.006
Token and
Token ID290
Feature activation-3.65e-03

Pos logit contributions
ver+0.002
five+0.002
ative+0.002
seek+0.002
glass+0.002
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 Mitt-0.001
 swe-0.001
Token come
Token ID1282
Feature activation+2.37e-03

Pos logit contributions
ver+0.003
five+0.003
seek+0.003
ival+0.003
glass+0.003
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.002
 my-0.002
 Mitt-0.002
Token back
Token ID736
Feature activation-3.34e-03

Pos logit contributions
ver+0.004
five+0.003
seek+0.003
glass+0.003
ative+0.003
Neg logit contributions
 L-0.004
 Lily-0.004
 me-0.004
 Mitt-0.003
 swe-0.003
Token.
Token ID13
Feature activation-4.17e-03

Pos logit contributions
 L+0.004
 me+0.004
 Lily+0.003
 Mitt+0.003
 swe+0.003
Neg logit contributions
ver-0.007
five-0.006
seek-0.006
glass-0.006
asts-0.006
Token The
Token ID383
Feature activation-8.49e-04

Pos logit contributions
 L+0.007
 me+0.007
 Lily+0.007
 my+0.006
 swe+0.006
Neg logit contributions
ver-0.006
five-0.006
seek-0.006
ival-0.006
glass-0.006
Token aer
Token ID9551
Feature activation+2.98e-02

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 my+0.001
 swe+0.001
Neg logit contributions
ver-0.002
five-0.002
seek-0.002
ival-0.002
glass-0.002
Tokenopl
Token ID20106
Feature activation+4.56e-02

Pos logit contributions
ver+0.034
five+0.029
 voyage+0.029
ative+0.027
ival+0.027
Neg logit contributions
 me-0.063
 L-0.058
 Lily-0.055
 my-0.054
 us-0.052
Tokenane
Token ID1531
Feature activation+4.60e-03

Pos logit contributions
seek+0.041
 quest+0.041
five+0.041
zed+0.041
 voyage+0.041
Neg logit contributions
 L-0.094
 me-0.091
 you-0.089
 Lily-0.085
 my-0.083
Token was
Token ID373
Feature activation-2.00e-03

Pos logit contributions
ver+0.009
five+0.009
seek+0.009
ival+0.008
asts+0.008
Neg logit contributions
 L-0.006
 me-0.005
 Lily-0.005
 my-0.004
 p-0.004
Token scared
Token ID12008
Feature activation-3.80e-03

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
 Sara+0.002
 Mitt+0.002
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
ative-0.004
 exited-0.004
Token,
Token ID11
Feature activation-4.26e-03

Pos logit contributions
 L+0.005
 me+0.005
 Lily+0.005
 swe+0.005
 Mitt+0.004
Neg logit contributions
ver-0.006
five-0.006
ative-0.006
 voyage-0.006
asts-0.006
Token and
Token ID290
Feature activation+3.48e-03

Pos logit contributions
 L+0.005
 me+0.005
 Lily+0.004
 swe+0.004
 my+0.004
Neg logit contributions
ver-0.008
glass-0.007
five-0.007
 voyage-0.007
zed-0.007
Token
Token ID198
Feature activation+1.02e-03

Pos logit contributions
 me+0.008
 L+0.008
 swe+0.008
 Lily+0.008
 my+0.008
Neg logit contributions
ver-0.007
five-0.006
ival-0.006
seek-0.006
glass-0.006
Token
Token ID198
Feature activation-2.64e-04

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
ival+0.002
glass+0.002
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 swe-0.001
 my-0.001
TokenThen
Token ID6423
Feature activation-1.32e-02

Pos logit contributions
 L+0.000
 me+0.000
 Lily+0.000
 my+0.000
 swe+0.000
Neg logit contributions
ver-0.000
five-0.000
seek-0.000
ival-0.000
glass-0.000
Token the
Token ID262
Feature activation+3.88e-04

Pos logit contributions
 me+0.015
 L+0.015
 swe+0.014
 Lily+0.014
 mess+0.013
Neg logit contributions
ver-0.025
glass-0.023
 voyage-0.023
five-0.022
ival-0.022
Token aer
Token ID9551
Feature activation+2.62e-02

Pos logit contributions
ver+0.001
five+0.001
glass+0.001
ival+0.001
seek+0.001
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 Mitt-0.000
 my-0.000
Tokenopl
Token ID20106
Feature activation+4.56e-02

Pos logit contributions
ver+0.027
five+0.023
 voyage+0.022
ative+0.021
ival+0.021
Neg logit contributions
 me-0.058
 L-0.055
 Lily-0.053
 my-0.050
 us-0.049
Tokenane
Token ID1531
Feature activation+5.49e-03

Pos logit contributions
 voyage+0.040
 quest+0.039
five+0.038
seek+0.038
zed+0.038
Neg logit contributions
 L-0.098
 me-0.092
 you-0.090
 Lily-0.089
ries-0.084
Token started
Token ID2067
Feature activation-1.81e-03

Pos logit contributions
ver+0.010
five+0.009
seek+0.009
glass+0.009
ival+0.009
Neg logit contributions
 L-0.008
 Lily-0.007
 me-0.007
 my-0.006
 Mitt-0.006
Token to
Token ID284
Feature activation+1.10e-02

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
 my+0.002
 Sara+0.002
Neg logit contributions
ver-0.003
seek-0.003
five-0.003
asts-0.003
glass-0.003
Token get
Token ID651
Feature activation-5.25e-03

Pos logit contributions
ver+0.016
five+0.015
asts+0.015
seek+0.015
 voyage+0.015
Neg logit contributions
 L-0.018
 Lily-0.017
 me-0.017
 Mitt-0.015
 my-0.015
Token smaller
Token ID4833
Feature activation+4.00e-03

Pos logit contributions
 L+0.006
 me+0.006
 Lily+0.006
 my+0.005
 Sara+0.005
Neg logit contributions
ver-0.010
 exited-0.009
 voyage-0.009
asts-0.009
ative-0.009
Token slow
Token ID3105
Feature activation+1.02e-03

Pos logit contributions
seek+0.007
five+0.007
ival+0.007
asts+0.007
growth+0.007
Neg logit contributions
 me-0.005
 L-0.004
 Lily-0.004
 my-0.004
 you-0.004
Token.
Token ID13
Feature activation+8.46e-04

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
ative+0.002
ival+0.002
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 my-0.001
 swe-0.001
Token Inside
Token ID14384
Feature activation-1.72e-05

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ival+0.001
glass+0.001
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 swe-0.001
 my-0.001
Token the
Token ID262
Feature activation+6.12e-03

Pos logit contributions
 me+0.000
 Lily+0.000
 L+0.000
 my+0.000
 Sara+0.000
Neg logit contributions
five-0.000
seek-0.000
glass-0.000
tted-0.000
ative-0.000
Token aer
Token ID9551
Feature activation+2.83e-02

Pos logit contributions
ver+0.014
seek+0.014
five+0.014
zed+0.013
ugg+0.013
Neg logit contributions
 me-0.006
 L-0.006
 Lily-0.005
 p-0.005
 my-0.005
Tokenopl
Token ID20106
Feature activation+4.55e-02

Pos logit contributions
ver+0.026
five+0.023
ative+0.021
glass+0.020
seek+0.020
Neg logit contributions
 me-0.066
 L-0.065
 Lily-0.063
 my-0.059
 p-0.056
Tokenane
Token ID1531
Feature activation+1.09e-02

Pos logit contributions
five+0.039
 voyage+0.038
zed+0.038
seek+0.038
 quest+0.037
Neg logit contributions
 L-0.101
 me-0.096
 Lily-0.092
 you-0.092
 my-0.087
Token,
Token ID11
Feature activation-1.35e-03

Pos logit contributions
five+0.025
seek+0.025
ative+0.024
asts+0.024
ver+0.024
Neg logit contributions
 me-0.011
 L-0.010
 Lily-0.010
 my-0.008
 you-0.007
Token there
Token ID612
Feature activation+2.84e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 swe+0.001
 Mitt+0.001
Neg logit contributions
ver-0.003
five-0.003
ival-0.003
glass-0.003
ative-0.002
Token were
Token ID547
Feature activation+5.23e-03

Pos logit contributions
seek+0.008
five+0.008
ival+0.008
ector+0.008
ugg+0.008
Neg logit contributions
 Lily-0.002
 L-0.002
 me-0.001
 p-0.001
 Sara-0.001
Token many
Token ID867
Feature activation+1.42e-02

Pos logit contributions
ver+0.008
five+0.007
seek+0.007
ival+0.007
glass+0.007
Neg logit contributions
 L-0.008
 Lily-0.008
 me-0.008
 Mitt-0.007
 swe-0.007
Token thank
Token ID5875
Feature activation-9.77e-03

Pos logit contributions
 L+0.006
 me+0.006
 Lily+0.006
 my+0.005
 swe+0.005
Neg logit contributions
ver-0.007
five-0.007
seek-0.007
ival-0.007
glass-0.007
Token you
Token ID345
Feature activation-9.02e-03

Pos logit contributions
 L+0.006
 Lily+0.005
 swe+0.004
 mess+0.004
 me+0.004
Neg logit contributions
five-0.023
ver-0.023
asts-0.023
seek-0.022
ival-0.022
Token to
Token ID284
Feature activation-1.02e-02

Pos logit contributions
 L+0.010
 Lily+0.009
 me+0.009
 Mitt+0.008
Oh+0.008
Neg logit contributions
ver-0.018
five-0.017
seek-0.017
ative-0.017
glass-0.017
Token the
Token ID262
Feature activation+6.69e-04

Pos logit contributions
 L+0.009
 swe+0.008
 powder+0.008
 bottles+0.008
 Lily+0.008
Neg logit contributions
five-0.020
ver-0.020
asts-0.019
seek-0.018
zed-0.018
Token aer
Token ID9551
Feature activation+3.02e-02

Pos logit contributions
five+0.001
ver+0.001
ival+0.001
ane+0.001
glass+0.001
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 Mitt-0.001
 Mia-0.001
Tokenopl
Token ID20106
Feature activation+4.52e-02

Pos logit contributions
ver+0.031
five+0.027
 warrior+0.025
growth+0.025
ative+0.025
Neg logit contributions
 me-0.066
 L-0.064
 Lily-0.061
 my-0.058
 you-0.056
Tokenane
Token ID1531
Feature activation+5.05e-05

Pos logit contributions
 voyage+0.035
five+0.033
 quest+0.032
zed+0.032
seek+0.031
Neg logit contributions
 L-0.104
 me-0.099
 you-0.097
 Lily-0.097
 my-0.091
Token.
Token ID13
Feature activation-2.09e-03

Pos logit contributions
ver+0.000
five+0.000
seek+0.000
ative+0.000
glass+0.000
Neg logit contributions
 L-0.000
 me-0.000
 Lily-0.000
 Mitt-0.000
 swe-0.000
Token
Token ID198
Feature activation+4.70e-03

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 my+0.002
 Sara+0.002
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
glass-0.004
ival-0.004
Token
Token ID198
Feature activation+2.31e-03

Pos logit contributions
ver+0.009
five+0.009
seek+0.009
glass+0.009
ival+0.009
Neg logit contributions
 L-0.005
 me-0.005
 Lily-0.005
 my-0.004
 Sara-0.004
TokenThe
Token ID464
Feature activation-3.25e-03

Pos logit contributions
ver+0.004
five+0.004
glass+0.004
seek+0.004
asts+0.004
Neg logit contributions
 me-0.003
 L-0.003
 Lily-0.003
 my-0.003
 Mitt-0.003
Token the
Token ID262
Feature activation-5.35e-04

Pos logit contributions
 L+0.009
 me+0.009
 Lily+0.009
 my+0.007
 Sara+0.006
Neg logit contributions
ver-0.025
five-0.024
seek-0.024
ival-0.023
glass-0.023
Token sky
Token ID6766
Feature activation+2.29e-03

Pos logit contributions
 me+0.001
 L+0.001
 Lily+0.001
 my+0.001
 p+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
ival-0.001
glass-0.001
Token.
Token ID13
Feature activation+6.92e-03

Pos logit contributions
ver+0.006
five+0.006
ival+0.006
seek+0.005
glass+0.005
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.001
 my-0.001
 p-0.001
Token The
Token ID383
Feature activation+2.37e-03

Pos logit contributions
ver+0.014
seek+0.013
ival+0.013
five+0.013
ibr+0.013
Neg logit contributions
 Lily-0.010
 L-0.009
 me-0.008
 Sara-0.007
 my-0.007
Token aer
Token ID9551
Feature activation+2.80e-02

Pos logit contributions
ver+0.004
five+0.004
seek+0.004
glass+0.004
ival+0.004
Neg logit contributions
 L-0.003
 Lily-0.003
 me-0.003
 my-0.002
 Mitt-0.002
Tokenopl
Token ID20106
Feature activation+4.52e-02

Pos logit contributions
ver+0.031
five+0.027
 voyage+0.025
ative+0.025
ival+0.025
Neg logit contributions
 me-0.061
 L-0.057
 Lily-0.055
 my-0.053
 us-0.051
Tokenane
Token ID1531
Feature activation+6.61e-03

Pos logit contributions
 quest+0.040
seek+0.039
five+0.039
 voyage+0.039
zed+0.038
Neg logit contributions
 L-0.097
 me-0.092
 you-0.090
 Lily-0.090
 my-0.084
Token liked
Token ID8288
Feature activation-7.09e-03

Pos logit contributions
five+0.016
ver+0.015
seek+0.015
asts+0.015
ival+0.015
Neg logit contributions
 me-0.006
 L-0.005
 Lily-0.005
 p-0.004
 my-0.004
Token to
Token ID284
Feature activation+1.51e-03

Pos logit contributions
 L+0.007
 me+0.007
 Lily+0.007
 my+0.005
 swe+0.005
Neg logit contributions
ver-0.015
five-0.015
seek-0.014
ival-0.014
glass-0.014
Token move
Token ID1445
Feature activation+5.43e-03

Pos logit contributions
ver+0.003
ival+0.003
seek+0.002
five+0.002
ative+0.002
Neg logit contributions
 me-0.002
 L-0.002
 Lily-0.002
 my-0.002
 p-0.002
Token fast
Token ID3049
Feature activation+6.26e-04

Pos logit contributions
ver+0.010
five+0.010
ative+0.010
seek+0.010
glass+0.010
Neg logit contributions
 L-0.006
 Lily-0.006
 me-0.006
 Mitt-0.005
 swe-0.005
Token and
Token ID290
Feature activation-9.94e-04

Pos logit contributions
ver+0.014
five+0.013
seek+0.013
ival+0.013
glass+0.013
Neg logit contributions
 L-0.008
 me-0.008
 Lily-0.008
 my-0.006
 swe-0.006
Token thin
Token ID7888
Feature activation+1.01e-02

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 swe+0.001
 my+0.001
Neg logit contributions
ver-0.002
seek-0.002
glass-0.002
five-0.002
ative-0.002
Token.
Token ID13
Feature activation-1.01e-03

Pos logit contributions
ver+0.022
five+0.022
seek+0.022
glass+0.021
ative+0.021
Neg logit contributions
 L-0.010
 me-0.009
 Lily-0.009
 my-0.007
 Mitt-0.007
Token The
Token ID383
Feature activation+3.19e-03

Pos logit contributions
 me+0.002
 L+0.002
 swe+0.002
 mess+0.002
 my+0.002
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
ival-0.001
glass-0.001
Token aer
Token ID9551
Feature activation+3.01e-02

Pos logit contributions
glass+0.005
ver+0.005
five+0.005
 voyage+0.005
asts+0.005
Neg logit contributions
 L-0.005
 Lily-0.005
 me-0.004
 Mitt-0.004
 Sara-0.004
Tokenopl
Token ID20106
Feature activation+4.52e-02

Pos logit contributions
ver+0.028
five+0.025
ival+0.023
 voyage+0.023
ative+0.023
Neg logit contributions
 me-0.069
 L-0.067
 Lily-0.064
 my-0.060
 p-0.058
Tokenane
Token ID1531
Feature activation+7.49e-03

Pos logit contributions
 voyage+0.036
five+0.035
seek+0.035
 quest+0.035
zed+0.035
Neg logit contributions
 L-0.101
 me-0.096
 Lily-0.093
 you-0.093
ries-0.088
Token flew
Token ID13112
Feature activation+7.61e-04

Pos logit contributions
ver+0.015
five+0.015
seek+0.015
ival+0.014
asts+0.014
Neg logit contributions
 me-0.009
 L-0.008
 Lily-0.008
 p-0.007
 my-0.007
Token calmly
Token ID30180
Feature activation-2.87e-03

Pos logit contributions
ver+0.001
ative+0.001
seek+0.001
five+0.001
ival+0.001
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 Mitt-0.001
 Sara-0.001
Token and
Token ID290
Feature activation-2.37e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.003
 swe+0.003
Neg logit contributions
ver-0.006
five-0.006
seek-0.005
ival-0.005
glass-0.005
Token then
Token ID788
Feature activation-3.92e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.002
 swe+0.002
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
glass-0.004
ival-0.004
Token and
Token ID290
Feature activation-1.41e-03

Pos logit contributions
ver+0.015
five+0.014
seek+0.014
ival+0.014
glass+0.014
Neg logit contributions
 L-0.004
 Lily-0.004
 me-0.004
 my-0.003
 Sara-0.003
Token right
Token ID826
Feature activation+6.61e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.001
 swe+0.001
Neg logit contributions
ver-0.003
five-0.003
seek-0.003
ival-0.003
glass-0.003
Token.
Token ID13
Feature activation+1.50e-03

Pos logit contributions
ver+0.011
ative+0.011
seek+0.011
five+0.011
glass+0.010
Neg logit contributions
 L-0.009
 Mitt-0.008
 Lily-0.008
 me-0.008
 swe-0.007
Token The
Token ID383
Feature activation-2.85e-04

Pos logit contributions
ver+0.003
five+0.003
seek+0.003
ival+0.003
glass+0.003
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.002
 my-0.002
 Mitt-0.001
Token aer
Token ID9551
Feature activation+2.87e-02

Pos logit contributions
 L+0.000
 Lily+0.000
 me+0.000
 Mitt+0.000
 Sara+0.000
Neg logit contributions
ver-0.001
five-0.001
glass-0.001
asts-0.001
ival-0.001
Tokenopl
Token ID20106
Feature activation+4.51e-02

Pos logit contributions
ver+0.032
five+0.027
 voyage+0.027
ival+0.026
 warrior+0.025
Neg logit contributions
 me-0.061
 L-0.058
 Lily-0.056
 my-0.054
 us-0.052
Tokenane
Token ID1531
Feature activation+6.88e-03

Pos logit contributions
 quest+0.037
seek+0.037
 voyage+0.037
zed+0.037
five+0.036
Neg logit contributions
 L-0.099
 me-0.094
 you-0.092
 Lily-0.092
 my-0.087
Token was
Token ID373
Feature activation+2.89e-03

Pos logit contributions
asts+0.017
five+0.016
ver+0.016
ector+0.016
ival+0.016
Neg logit contributions
 me-0.006
 L-0.005
 Lily-0.005
 p-0.004
 too-0.004
Token very
Token ID845
Feature activation-3.05e-03

Pos logit contributions
ver+0.006
five+0.006
seek+0.005
ival+0.005
glass+0.005
Neg logit contributions
 L-0.004
 me-0.003
 Lily-0.003
 my-0.003
 swe-0.003
Token happy
Token ID3772
Feature activation+2.18e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.003
 p+0.003
Neg logit contributions
seek-0.007
glass-0.006
ver-0.006
five-0.006
growth-0.006
Token when
Token ID618
Feature activation-5.60e-03

Pos logit contributions
ver+0.005
five+0.005
ival+0.005
asts+0.005
tted+0.005
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.001
 my-0.001
 p-0.001
Token rail
Token ID6787
Feature activation+1.66e-03

Pos logit contributions
 L+0.011
 Lily+0.010
 me+0.010
 Sara+0.009
 Mitt+0.009
Neg logit contributions
five-0.013
ver-0.013
tted-0.013
glass-0.012
ival-0.012
Token,
Token ID11
Feature activation-4.16e-03

Pos logit contributions
ver+0.003
seek+0.003
five+0.002
ival+0.002
ative+0.002
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.002
 Mitt-0.002
Oh-0.002
Token Mia
Token ID32189
Feature activation-1.01e-02

Pos logit contributions
 me+0.005
 mess+0.005
 swe+0.005
ety+0.005
 us+0.005
Neg logit contributions
ver-0.006
zed-0.006
 exited-0.005
ival-0.005
five-0.005
Token's
Token ID338
Feature activation-6.20e-03

Pos logit contributions
 L+0.014
 Lily+0.013
 my+0.012
 me+0.012
 swe+0.012
Neg logit contributions
ver-0.017
glass-0.015
five-0.015
 exited-0.015
ival-0.015
Token foot
Token ID2366
Feature activation-1.75e-02

Pos logit contributions
 L+0.008
 Lily+0.008
 me+0.007
 my+0.007
 Mitt+0.007
Neg logit contributions
glass-0.010
 voyage-0.010
five-0.010
ver-0.010
ival-0.010
Token slipped
Token ID18859
Feature activation-4.04e-02

Pos logit contributions
 L+0.025
 Lily+0.024
 me+0.023
 my+0.021
 swe+0.021
Neg logit contributions
ver-0.030
seek-0.027
tted-0.027
five-0.027
glass-0.027
Token off
Token ID572
Feature activation-1.21e-02

Pos logit contributions
 L+0.059
 Lily+0.054
 me+0.051
ries+0.051
 swe+0.050
Neg logit contributions
ver-0.057
he-0.057
ative-0.056
tted-0.055
hing-0.055
Token the
Token ID262
Feature activation+4.26e-03

Pos logit contributions
 me+0.013
ety+0.013
lasses+0.013
 swe+0.013
ounced+0.012
Neg logit contributions
he-0.021
ver-0.019
zed-0.019
five-0.019
growth-0.019
Token rail
Token ID6787
Feature activation+1.31e-03

Pos logit contributions
five+0.007
tted+0.006
glass+0.006
seek+0.006
 voyage+0.006
Neg logit contributions
 L-0.006
 Lily-0.006
 Sara-0.006
ries-0.005
 me-0.005
Token and
Token ID290
Feature activation-1.01e-02

Pos logit contributions
five+0.002
ver+0.002
seek+0.002
tted+0.002
he+0.002
Neg logit contributions
 L-0.002
 Lily-0.001
 me-0.001
 my-0.001
 swe-0.001
Token onto
Token ID4291
Feature activation-8.25e-03

Pos logit contributions
 swe+0.015
 coll+0.015
 L+0.015
lasses+0.014
 me+0.014
Neg logit contributions
zed-0.013
 exited-0.011
ver-0.011
five-0.011
seek-0.010
Token ripped
Token ID19551
Feature activation-1.86e-02

Pos logit contributions
 L+0.007
 Lily+0.007
 me+0.007
 my+0.006
 swe+0.006
Neg logit contributions
ver-0.010
five-0.010
seek-0.010
ival-0.009
glass-0.009
Token the
Token ID262
Feature activation+1.09e-03

Pos logit contributions
 L+0.024
 me+0.023
 swe+0.022
 Lily+0.021
 you+0.020
Neg logit contributions
ver-0.027
five-0.026
ative-0.026
glass-0.026
growth-0.026
Token rocket
Token ID10701
Feature activation+4.69e-03

Pos logit contributions
ver+0.002
five+0.002
glass+0.002
ival+0.002
seek+0.002
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.002
 Mitt-0.001
 Sara-0.001
Token apart
Token ID5475
Feature activation-1.75e-03

Pos logit contributions
ver+0.009
five+0.009
seek+0.009
glass+0.008
ival+0.008
Neg logit contributions
 L-0.006
 Lily-0.005
 me-0.005
 my-0.005
 swe-0.005
Token and
Token ID290
Feature activation-1.41e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.002
 Mitt+0.002
Neg logit contributions
ver-0.003
five-0.003
glass-0.003
ative-0.003
asts-0.003
Token spat
Token ID15246
Feature activation-3.90e-02

Pos logit contributions
 L+0.002
 me+0.002
 swe+0.002
 Lily+0.002
 my+0.002
Neg logit contributions
ver-0.002
five-0.002
glass-0.002
zed-0.002
 exited-0.002
Token out
Token ID503
Feature activation-8.09e-03

Pos logit contributions
 L+0.044
 me+0.043
 swe+0.040
 Lily+0.039
 p+0.037
Neg logit contributions
ver-0.075
five-0.074
zed-0.071
ival-0.068
glass-0.068
Token the
Token ID262
Feature activation-8.95e-03

Pos logit contributions
 me+0.008
 L+0.008
 swe+0.007
 Lily+0.007
 you+0.006
Neg logit contributions
ver-0.016
five-0.015
ative-0.015
glass-0.014
 warrior-0.014
Token pieces
Token ID5207
Feature activation-3.85e-03

Pos logit contributions
 L+0.016
 Lily+0.014
 me+0.014
 Mitt+0.014
 Sara+0.013
Neg logit contributions
five-0.013
glass-0.013
 warrior-0.012
 voyage-0.012
 astronaut-0.011
Token.
Token ID13
Feature activation+9.42e-04

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 my+0.002
 swe+0.002
Neg logit contributions
ver-0.009
five-0.009
seek-0.009
ival-0.008
glass-0.008
Token The
Token ID383
Feature activation-5.41e-03

Pos logit contributions
ver+0.002
five+0.002
ival+0.002
glass+0.002
tted+0.002
Neg logit contributions
 me-0.001
 L-0.001
 swe-0.001
 umb-0.001
 p-0.001
Token using
Token ID1262
Feature activation-4.28e-03

Pos logit contributions
 L+0.006
 me+0.005
 Lily+0.005
 my+0.005
 swe+0.004
Neg logit contributions
ver-0.007
five-0.007
seek-0.007
ival-0.007
glass-0.007
Token his
Token ID465
Feature activation+8.55e-04

Pos logit contributions
 L+0.007
 me+0.006
 Lily+0.006
 swe+0.005
 Mitt+0.005
Neg logit contributions
ver-0.007
five-0.006
asts-0.006
seek-0.006
ative-0.006
Token problem
Token ID1917
Feature activation-2.15e-03

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ugg+0.001
ival+0.001
Neg logit contributions
 me-0.001
 L-0.001
 Lily-0.001
 my-0.001
 swe-0.001
Token-
Token ID12
Feature activation-5.28e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.001
 Mitt+0.001
Neg logit contributions
ver-0.005
five-0.005
seek-0.005
glass-0.004
ative-0.004
Tokens
Token ID82
Feature activation-5.35e-03

Pos logit contributions
 L+0.009
 me+0.009
 Lily+0.009
 my+0.008
 Mitt+0.008
Neg logit contributions
ver-0.007
five-0.007
glass-0.007
seek-0.007
asts-0.007
Tokenolving
Token ID10890
Feature activation-3.85e-02

Pos logit contributions
 L+0.008
 me+0.008
 Lily+0.008
 my+0.007
 swe+0.006
Neg logit contributions
ver-0.009
five-0.009
seek-0.009
ival-0.009
glass-0.008
Token skills
Token ID4678
Feature activation-5.81e-03

Pos logit contributions
 L+0.069
 Lily+0.065
 me+0.063
 Mitt+0.058
 Sara+0.058
Neg logit contributions
ver-0.056
five-0.052
seek-0.052
glass-0.051
ival-0.050
Token.
Token ID13
Feature activation-2.20e-03

Pos logit contributions
 L+0.005
 me+0.004
 Lily+0.004
 my+0.003
 swe+0.003
Neg logit contributions
ver-0.014
five-0.013
seek-0.013
ival-0.013
glass-0.013
Token From
Token ID3574
Feature activation-3.21e-04

Pos logit contributions
 L+0.004
 me+0.004
 Lily+0.004
 my+0.004
 swe+0.004
Neg logit contributions
ver-0.003
five-0.003
seek-0.003
ival-0.002
glass-0.002
Token that
Token ID326
Feature activation+1.28e-03

Pos logit contributions
 L+0.000
 Lily+0.000
 me+0.000
 swe+0.000
 Mitt+0.000
Neg logit contributions
ver-0.000
five-0.000
ival-0.000
ative-0.000
seek-0.000
Token day
Token ID1110
Feature activation-7.47e-03

Pos logit contributions
seek+0.002
ver+0.002
five+0.002
ival+0.002
glass+0.002
Neg logit contributions
 me-0.002
 Lily-0.002
 L-0.002
 my-0.001
 Mitt-0.001
Token the
Token ID262
Feature activation+7.33e-03

Pos logit contributions
 L+0.010
 swe+0.009
 me+0.009
ies+0.009
 mess+0.009
Neg logit contributions
growth-0.017
five-0.017
ver-0.016
zed-0.016
asts-0.016
Token ground
Token ID2323
Feature activation+5.80e-03

Pos logit contributions
glass+0.012
tted+0.012
five+0.012
ver+0.012
asts+0.012
Neg logit contributions
 L-0.010
 Lily-0.009
 me-0.009
 Sara-0.008
 Mia-0.008
Token,
Token ID11
Feature activation-3.57e-03

Pos logit contributions
ver+0.010
seek+0.009
five+0.009
ival+0.009
glass+0.008
Neg logit contributions
 L-0.008
 me-0.008
 Lily-0.008
 swe-0.007
 my-0.007
Token its
Token ID663
Feature activation+1.80e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 swe+0.003
 my+0.002
Neg logit contributions
ver-0.008
five-0.007
glass-0.007
ival-0.007
ative-0.007
Token feet
Token ID3625
Feature activation-9.09e-03

Pos logit contributions
 protector+0.002
glass+0.002
asts+0.002
 voyage+0.002
working+0.002
Neg logit contributions
 L-0.003
 Lily-0.003
Oh-0.003
 you-0.003
 Sara-0.003
Token slipped
Token ID18859
Feature activation-3.76e-02

Pos logit contributions
 L+0.013
 Lily+0.013
 me+0.013
 my+0.011
 swe+0.011
Neg logit contributions
ver-0.015
five-0.015
seek-0.014
glass-0.014
ival-0.014
Token on
Token ID319
Feature activation-9.51e-03

Pos logit contributions
 L+0.053
 Lily+0.049
 me+0.048
 swe+0.046
 Sara+0.044
Neg logit contributions
ver-0.059
ative-0.056
five-0.056
tted-0.054
he-0.053
Token the
Token ID262
Feature activation+7.29e-03

Pos logit contributions
 L+0.009
 me+0.009
 swe+0.009
 Lily+0.008
lasses+0.008
Neg logit contributions
growth-0.018
five-0.017
ver-0.017
asts-0.017
zed-0.016
Token oily
Token ID44560
Feature activation-8.61e-04

Pos logit contributions
tted+0.011
glass+0.011
five+0.011
asts+0.011
seek+0.011
Neg logit contributions
 L-0.011
 Lily-0.010
 me-0.010
 Mia-0.009
 Sara-0.009
Token surface
Token ID4417
Feature activation+2.78e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.001
 my+0.001
 swe+0.001
Neg logit contributions
five-0.002
seek-0.002
ver-0.001
glass-0.001
ative-0.001
Token.
Token ID13
Feature activation-1.63e-03

Pos logit contributions
ver+0.006
five+0.005
seek+0.005
ative+0.005
glass+0.005
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 swe-0.002
 my-0.002
Token its
Token ID663
Feature activation+6.40e-03

Pos logit contributions
 me+0.003
 L+0.003
 swe+0.003
 mess+0.003
lasses+0.003
Neg logit contributions
seek-0.002
five-0.002
glass-0.002
growth-0.002
asts-0.002
Token er
Token ID1931
Feature activation+1.08e-02

Pos logit contributions
seek+0.010
glass+0.009
 voyage+0.009
ative+0.009
 protector+0.009
Neg logit contributions
 L-0.009
 Lily-0.009
 Sara-0.008
ries-0.008
Oh-0.008
Tokenaser
Token ID6005
Feature activation-1.36e-02

Pos logit contributions
ver+0.015
five+0.014
seek+0.013
ival+0.013
glass+0.013
Neg logit contributions
 L-0.020
 Lily-0.019
 me-0.019
 swe-0.017
 Sara-0.017
Token when
Token ID618
Feature activation-1.07e-02

Pos logit contributions
 L+0.016
 me+0.015
 Lily+0.015
 my+0.012
 swe+0.012
Neg logit contributions
ver-0.027
five-0.026
seek-0.026
glass-0.025
ival-0.025
Token it
Token ID340
Feature activation-1.59e-03

Pos logit contributions
 me+0.012
 swe+0.012
 L+0.011
 mess+0.011
 scar+0.010
Neg logit contributions
ver-0.019
seek-0.018
zed-0.018
asts-0.018
glass-0.018
Token accidentally
Token ID14716
Feature activation-3.68e-02

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
 my+0.002
 Sara+0.002
Neg logit contributions
ver-0.002
seek-0.002
zed-0.002
glass-0.002
tted-0.002
Token dropped
Token ID5710
Feature activation-1.94e-02

Pos logit contributions
 L+0.055
 me+0.053
 Lily+0.052
 swe+0.051
 my+0.048
Neg logit contributions
ver-0.055
zed-0.053
seek-0.051
five-0.051
glass-0.051
Token it
Token ID340
Feature activation-6.99e-03

Pos logit contributions
imi+0.029
 swe+0.027
 MacDonald+0.025
ue+0.025
 scar+0.025
Neg logit contributions
growth-0.021
hed-0.020
ative-0.020
five-0.020
 control-0.019
Token into
Token ID656
Feature activation-1.18e-02

Pos logit contributions
 L+0.009
 Lily+0.008
 me+0.008
 Sara+0.007
 swe+0.007
Neg logit contributions
ver-0.013
tted-0.012
seek-0.012
he-0.012
ative-0.012
Token a
Token ID257
Feature activation-7.21e-03

Pos logit contributions
 swe+0.015
lasses+0.015
ies+0.014
 me+0.014
ety+0.013
Neg logit contributions
growth-0.018
glass-0.016
asts-0.016
handed-0.015
seek-0.015
Token deep
Token ID2769
Feature activation+3.08e-03

Pos logit contributions
 L+0.011
ries+0.010
 Jen+0.009
 Lily+0.009
 Sara+0.009
Neg logit contributions
 warrior-0.011
ival-0.010
 quest-0.010
 voyage-0.010
five-0.010
Token with
Token ID351
Feature activation+1.67e-04

Pos logit contributions
 L+0.008
 me+0.007
 Lily+0.007
 swe+0.006
 my+0.006
Neg logit contributions
ver-0.011
five-0.011
seek-0.010
glass-0.010
ative-0.010
Token her
Token ID607
Feature activation+4.67e-03

Pos logit contributions
asts+0.000
seek+0.000
ver+0.000
zed+0.000
growth+0.000
Neg logit contributions
 me-0.000
 L-0.000
lasses-0.000
 swe-0.000
Oh-0.000
Token doll
Token ID3654
Feature activation-1.15e-02

Pos logit contributions
ver+0.009
five+0.008
seek+0.008
glass+0.008
ative+0.008
Neg logit contributions
 L-0.006
 me-0.006
 Lily-0.005
 my-0.005
 Sara-0.005
Token when
Token ID618
Feature activation-7.69e-03

Pos logit contributions
 L+0.015
 me+0.014
 Lily+0.013
 Sara+0.012
 my+0.012
Neg logit contributions
ver-0.021
seek-0.021
glass-0.020
five-0.020
ative-0.019
Token it
Token ID340
Feature activation+1.55e-03

Pos logit contributions
 L+0.009
 me+0.009
 swe+0.008
 Lily+0.008
 my+0.007
Neg logit contributions
ver-0.014
seek-0.013
zed-0.013
 exited-0.013
five-0.013
Token accidentally
Token ID14716
Feature activation-3.65e-02

Pos logit contributions
ver+0.002
seek+0.002
zed+0.002
 exited+0.002
tted+0.002
Neg logit contributions
 L-0.002
 my-0.002
 Lily-0.002
 me-0.002
 Sara-0.002
Token fell
Token ID3214
Feature activation-2.52e-02

Pos logit contributions
 L+0.055
 me+0.053
 Lily+0.051
 swe+0.050
 us+0.048
Neg logit contributions
zed-0.053
seek-0.051
ver-0.051
five-0.049
glass-0.048
Token out
Token ID503
Feature activation-6.05e-03

Pos logit contributions
 L+0.037
 Sara+0.033
 Lily+0.033
 me+0.032
 Lia+0.032
Neg logit contributions
ative-0.034
ver-0.034
hing-0.034
he-0.033
tted-0.032
Token of
Token ID286
Feature activation-3.22e-03

Pos logit contributions
 L+0.011
 me+0.010
 Lily+0.010
 swe+0.010
 Mitt+0.010
Neg logit contributions
he-0.007
ver-0.007
ative-0.007
five-0.007
hing-0.006
Token her
Token ID607
Feature activation-1.18e-02

Pos logit contributions
 L+0.004
ies+0.003
 swe+0.003
ety+0.003
lasses+0.003
Neg logit contributions
growth-0.005
he-0.005
five-0.005
 patrol-0.005
seek-0.005
Token hand
Token ID1021
Feature activation-1.57e-02

Pos logit contributions
 L+0.017
 Lily+0.016
 me+0.015
 Sara+0.014
 Mitt+0.014
Neg logit contributions
ver-0.018
seek-0.018
five-0.018
glass-0.018
case-0.017
Token scared
Token ID12008
Feature activation-2.01e-03

Pos logit contributions
ver+0.031
seek+0.030
five+0.030
ival+0.029
glass+0.028
Neg logit contributions
 L-0.019
 Lily-0.018
 me-0.018
 p-0.015
 Sara-0.014
Token the
Token ID262
Feature activation+2.79e-03

Pos logit contributions
 swe+0.002
 L+0.002
 mess+0.002
 me+0.002
 swollen+0.002
Neg logit contributions
ative-0.003
five-0.003
ver-0.003
hing-0.003
asts-0.003
Token bunny
Token ID44915
Feature activation-6.65e-03

Pos logit contributions
ver+0.004
five+0.004
ative+0.004
seek+0.004
ival+0.004
Neg logit contributions
 L-0.005
 Lily-0.004
 me-0.004
 Sara-0.004
 swe-0.004
Token and
Token ID290
Feature activation-7.03e-04

Pos logit contributions
 L+0.008
 Lily+0.008
 me+0.007
 swe+0.007
 Sara+0.007
Neg logit contributions
ver-0.012
five-0.012
seek-0.012
glass-0.012
tted-0.011
Token it
Token ID340
Feature activation-6.00e-03

Pos logit contributions
 L+0.001
 swe+0.001
 me+0.001
 Lily+0.001
 sugar+0.001
Neg logit contributions
ver-0.001
 exited-0.001
five-0.001
glass-0.001
zed-0.001
Token accidentally
Token ID14716
Feature activation-3.62e-02

Pos logit contributions
 L+0.009
 Lily+0.008
 me+0.008
 Sara+0.007
 my+0.007
Neg logit contributions
ver-0.010
seek-0.009
glass-0.009
five-0.009
zed-0.009
Token dropped
Token ID5710
Feature activation-1.90e-02

Pos logit contributions
 L+0.052
 me+0.051
 Lily+0.050
 swe+0.046
 Sara+0.044
Neg logit contributions
ver-0.059
five-0.056
seek-0.056
glass-0.055
zed-0.055
Token the
Token ID262
Feature activation+9.47e-03

Pos logit contributions
 swe+0.022
 L+0.022
imi+0.022
 me+0.021
 scar+0.020
Neg logit contributions
five-0.029
growth-0.029
ver-0.027
ative-0.027
hing-0.025
Token camera
Token ID4676
Feature activation-2.14e-03

Pos logit contributions
five+0.014
glass+0.014
ver+0.014
ival+0.013
 voyage+0.013
Neg logit contributions
 L-0.016
 Lily-0.015
 me-0.014
 Sara-0.013
 swe-0.013
Token.
Token ID13
Feature activation-3.15e-03

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
 Sara+0.002
 my+0.002
Neg logit contributions
ver-0.005
five-0.004
seek-0.004
glass-0.004
ative-0.004
Token The
Token ID383
Feature activation-5.20e-03

Pos logit contributions
 L+0.005
 Lily+0.005
 me+0.005
 my+0.004
 Mitt+0.004
Neg logit contributions
ver-0.005
five-0.005
seek-0.005
glass-0.005
ival-0.005
Token with
Token ID351
Feature activation-2.12e-03

Pos logit contributions
 L+0.013
 Lily+0.013
 me+0.012
 my+0.010
 p+0.010
Neg logit contributions
ver-0.025
five-0.024
asts-0.024
ival-0.023
seek-0.023
Token his
Token ID465
Feature activation+7.73e-03

Pos logit contributions
 me+0.004
 L+0.004
 swe+0.004
lasses+0.004
 mistake+0.004
Neg logit contributions
seek-0.002
growth-0.002
glass-0.002
zed-0.002
five-0.002
Token friends
Token ID2460
Feature activation-3.73e-03

Pos logit contributions
seek+0.011
ver+0.011
glass+0.011
ibr+0.011
 protector+0.011
Neg logit contributions
 L-0.011
 Lily-0.010
 me-0.009
lasses-0.009
Oh-0.009
Token when
Token ID618
Feature activation-6.68e-03

Pos logit contributions
 L+0.005
 me+0.004
 Lily+0.004
 my+0.004
 swe+0.004
Neg logit contributions
ver-0.007
five-0.007
seek-0.007
ival-0.007
glass-0.007
Token he
Token ID339
Feature activation-3.54e-03

Pos logit contributions
 me+0.012
 L+0.011
 swe+0.011
 Lily+0.011
 my+0.010
Neg logit contributions
ver-0.009
seek-0.008
five-0.008
asts-0.008
glass-0.008
Token accidentally
Token ID14716
Feature activation-3.60e-02

Pos logit contributions
 L+0.006
 Lily+0.005
 swe+0.005
 me+0.005
 my+0.005
Neg logit contributions
ver-0.005
glass-0.005
seek-0.005
ival-0.005
zed-0.005
Token fell
Token ID3214
Feature activation-2.56e-02

Pos logit contributions
 L+0.053
 me+0.051
 Lily+0.049
 swe+0.047
 my+0.045
Neg logit contributions
zed-0.055
ver-0.054
seek-0.054
five-0.053
glass-0.052
Token and
Token ID290
Feature activation-8.09e-03

Pos logit contributions
 L+0.037
 me+0.033
 Lily+0.033
 Sara+0.031
 swe+0.031
Neg logit contributions
ative-0.035
ver-0.035
hing-0.034
seek-0.034
he-0.033
Token hurt
Token ID5938
Feature activation-8.26e-03

Pos logit contributions
 L+0.011
 swe+0.011
 me+0.011
 umb+0.010
 Lily+0.010
Neg logit contributions
zed-0.012
ver-0.012
seek-0.011
 exited-0.011
he-0.010
Token his
Token ID465
Feature activation-1.20e-03

Pos logit contributions
 swe+0.016
 me+0.016
 mess+0.016
 scar+0.015
 L+0.015
Neg logit contributions
ative-0.006
seek-0.006
His-0.006
ver-0.006
ayer-0.006
Token paw
Token ID41827
Feature activation-2.46e-02

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
lasses+0.002
 my+0.002
Neg logit contributions
seek-0.001
oved-0.001
glass-0.001
ver-0.001
zed-0.001
Token it
Token ID340
Feature activation-1.99e-03

Pos logit contributions
imi+0.013
 swe+0.012
Su+0.011
 swarm+0.011
 L+0.011
Neg logit contributions
five-0.014
glass-0.012
hed-0.011
he-0.011
ative-0.011
Token away
Token ID1497
Feature activation-2.74e-03

Pos logit contributions
 L+0.003
 Lily+0.002
 me+0.002
 Mitt+0.002
 Sara+0.002
Neg logit contributions
ver-0.003
five-0.003
seek-0.003
ative-0.003
tted-0.003
Token a
Token ID257
Feature activation-1.34e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 swe+0.003
 scar+0.003
Neg logit contributions
ver-0.005
ative-0.005
five-0.005
seek-0.005
glass-0.004
Token long
Token ID890
Feature activation-1.30e-02

Pos logit contributions
ries+0.002
 L+0.002
 Jen+0.002
lasses+0.002
 me+0.002
Neg logit contributions
five-0.002
 voyage-0.002
ival-0.002
 warrior-0.002
 believe-0.002
Token time
Token ID640
Feature activation-1.02e-02

Pos logit contributions
 L+0.021
 me+0.019
ries+0.019
 my+0.018
 swe+0.018
Neg logit contributions
five-0.018
 voyage-0.017
ver-0.017
glass-0.017
seek-0.016
Token ago
Token ID2084
Feature activation-3.60e-02

Pos logit contributions
 L+0.014
 me+0.013
 Lily+0.013
 swe+0.013
 Mitt+0.013
Neg logit contributions
ver-0.017
five-0.017
seek-0.016
glass-0.016
asts-0.016
Token.
Token ID13
Feature activation+1.18e-03

Pos logit contributions
 L+0.044
 me+0.041
 Lily+0.041
 swe+0.040
 Mitt+0.039
Neg logit contributions
ver-0.066
five-0.065
seek-0.064
glass-0.064
asts-0.062
Token It
Token ID632
Feature activation+4.55e-03

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
glass+0.002
ative+0.002
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.001
 swe-0.001
 my-0.001
Token is
Token ID318
Feature activation-6.86e-03

Pos logit contributions
ver+0.008
tted+0.007
five+0.007
case+0.007
ives+0.007
Neg logit contributions
 L-0.006
 Lily-0.005
 me-0.005
 Sara-0.005
 Mitt-0.005
Token not
Token ID407
Feature activation-3.83e-03

Pos logit contributions
 L+0.009
ries+0.008
heart+0.008
ies+0.008
Su+0.008
Neg logit contributions
ver-0.009
zed-0.008
 accomplished-0.008
 protector-0.008
 inspired-0.008
Token good
Token ID922
Feature activation+1.58e-03

Pos logit contributions
 L+0.005
 me+0.005
Oh+0.004
lasses+0.004
 Mitt+0.004
Neg logit contributions
ver-0.006
five-0.006
ative-0.006
 astronaut-0.005
 believe-0.005
Token was
Token ID373
Feature activation+6.46e-03

Pos logit contributions
five+0.001
ver+0.001
seek+0.001
ative+0.001
ival+0.001
Neg logit contributions
 me-0.001
 L-0.001
 Lily-0.001
 p-0.001
 my-0.001
Token playing
Token ID2712
Feature activation-6.55e-03

Pos logit contributions
five+0.013
ver+0.013
ival+0.013
glass+0.013
seek+0.013
Neg logit contributions
 L-0.007
 me-0.007
 Lily-0.007
 p-0.006
 my-0.006
Token and
Token ID290
Feature activation-6.76e-03

Pos logit contributions
 Lily+0.006
 me+0.006
 L+0.006
 p+0.005
 my+0.004
Neg logit contributions
ver-0.016
ugg-0.015
asts-0.015
five-0.015
ane-0.015
Token his
Token ID465
Feature activation+8.77e-03

Pos logit contributions
 L+0.010
 me+0.009
 Lily+0.009
 swe+0.008
 Sara+0.008
Neg logit contributions
ver-0.011
seek-0.011
five-0.010
glass-0.010
ival-0.010
Token weapon
Token ID4282
Feature activation-1.21e-02

Pos logit contributions
ver+0.017
five+0.016
seek+0.016
ival+0.015
asts+0.015
Neg logit contributions
 me-0.012
 L-0.011
 Lily-0.011
 my-0.009
 p-0.009
Token slipped
Token ID18859
Feature activation-3.58e-02

Pos logit contributions
 me+0.013
 L+0.013
 Lily+0.013
 p+0.010
 my+0.009
Neg logit contributions
five-0.025
ver-0.025
ival-0.024
asts-0.024
ugg-0.024
Token out
Token ID503
Feature activation-5.05e-03

Pos logit contributions
 L+0.046
 Lily+0.043
 me+0.041
 Sara+0.037
 swe+0.037
Neg logit contributions
ver-0.063
five-0.060
ative-0.060
seek-0.060
zed-0.057
Token of
Token ID286
Feature activation-1.94e-03

Pos logit contributions
 L+0.008
 me+0.008
 Lily+0.008
 swe+0.007
 Mitt+0.007
Neg logit contributions
ver-0.008
five-0.007
seek-0.007
glass-0.007
ative-0.007
Token his
Token ID465
Feature activation+1.06e-05

Pos logit contributions
 L+0.002
 me+0.002
 swe+0.002
 Lily+0.002
lasses+0.002
Neg logit contributions
five-0.003
growth-0.003
asts-0.003
ver-0.003
glass-0.003
Token hands
Token ID2832
Feature activation-9.37e-03

Pos logit contributions
ver+0.000
five+0.000
seek+0.000
ival+0.000
ative+0.000
Neg logit contributions
 L-0.000
 me-0.000
 Lily-0.000
 my-0.000
 p-0.000
Token.
Token ID13
Feature activation-3.00e-03

Pos logit contributions
 L+0.007
 me+0.006
 Lily+0.006
 my+0.004
 swe+0.004
Neg logit contributions
ver-0.023
five-0.022
seek-0.022
ival-0.022
glass-0.022
Token day
Token ID1110
Feature activation-6.01e-04

Pos logit contributions
 Lily+0.000
 L+0.000
 me+0.000
 my+0.000
 Sara+0.000
Neg logit contributions
ver-0.000
seek-0.000
ival-0.000
five-0.000
glass-0.000
Token,
Token ID11
Feature activation+7.13e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 my+0.001
 swe+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
ival-0.001
glass-0.001
Token Lily
Token ID20037
Feature activation+4.27e-04

Pos logit contributions
ver+0.017
five+0.016
working+0.015
ival+0.015
he+0.015
Neg logit contributions
 me-0.004
 umb-0.004
 L-0.004
 yells-0.003
 us-0.003
Token's
Token ID338
Feature activation+4.56e-03

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ival+0.001
glass+0.001
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 my-0.000
 swe-0.000
Token ball
Token ID2613
Feature activation-6.58e-03

Pos logit contributions
ver+0.009
five+0.009
ival+0.009
ative+0.009
seek+0.009
Neg logit contributions
 L-0.005
 me-0.005
 Lily-0.005
 my-0.004
 swe-0.004
Token accidentally
Token ID14716
Feature activation-3.58e-02

Pos logit contributions
 L+0.010
 Lily+0.009
 me+0.009
 swe+0.009
 my+0.009
Neg logit contributions
ver-0.011
seek-0.010
five-0.010
glass-0.010
tted-0.010
Token went
Token ID1816
Feature activation-8.78e-03

Pos logit contributions
 L+0.053
 me+0.050
 Lily+0.049
 swe+0.046
 Sara+0.044
Neg logit contributions
ver-0.055
zed-0.054
seek-0.053
five-0.053
glass-0.052
Token over
Token ID625
Feature activation+8.18e-03

Pos logit contributions
 L+0.012
 scar+0.011
 swe+0.011
 Sara+0.010
 Lia+0.010
Neg logit contributions
tted-0.013
seek-0.013
case-0.012
aling-0.012
he-0.012
Token Mrs
Token ID9074
Feature activation+2.01e-02

Pos logit contributions
seek+0.013
five+0.013
ver+0.013
he+0.013
growth+0.012
Neg logit contributions
 swe-0.009
 me-0.009
 L-0.009
lasses-0.009
 gross-0.008
Token.
Token ID13
Feature activation+1.29e-02

Pos logit contributions
ver+0.042
five+0.040
seek+0.040
ative+0.039
glass+0.039
Neg logit contributions
 L-0.020
 me-0.020
 Lily-0.017
 my-0.016
 swe-0.016
Token Smith
Token ID4176
Feature activation-7.02e-03

Pos logit contributions
ver+0.018
five+0.017
seek+0.017
glass+0.016
ival+0.016
Neg logit contributions
 L-0.022
 me-0.022
 Lily-0.021
 swe-0.019
 my-0.019
Token ran
Token ID4966
Feature activation-1.14e-03

Pos logit contributions
ver+0.013
five+0.013
seek+0.013
ative+0.013
glass+0.012
Neg logit contributions
 L-0.007
 me-0.007
 Lily-0.007
 p-0.006
 Sara-0.005
Token,
Token ID11
Feature activation-2.06e-03

Pos logit contributions
 L+0.002
 me+0.001
 Lily+0.001
 swe+0.001
 Mitt+0.001
Neg logit contributions
five-0.002
ver-0.002
seek-0.002
ival-0.002
ative-0.002
Token her
Token ID607
Feature activation+1.16e-05

Pos logit contributions
 me+0.003
 swe+0.003
 us+0.003
 mess+0.003
 my+0.003
Neg logit contributions
ver-0.003
zed-0.003
five-0.003
seek-0.003
 exited-0.003
Token slippery
Token ID32911
Feature activation-9.93e-03

Pos logit contributions
ver+0.000
five+0.000
 exited+0.000
glass+0.000
ival+0.000
Neg logit contributions
 L-0.000
 Lily-0.000
 me-0.000
 my-0.000
 us-0.000
Token feet
Token ID3625
Feature activation-1.04e-02

Pos logit contributions
 L+0.016
 Lily+0.014
 me+0.013
 my+0.012
 Sara+0.012
Neg logit contributions
five-0.015
zed-0.015
seek-0.015
asts-0.014
 voyage-0.014
Token slipped
Token ID18859
Feature activation-3.54e-02

Pos logit contributions
 L+0.015
 Lily+0.014
 me+0.014
 my+0.013
 swe+0.013
Neg logit contributions
ver-0.018
five-0.016
seek-0.016
tted-0.016
ival-0.016
Token on
Token ID319
Feature activation-9.90e-03

Pos logit contributions
 L+0.054
 Lily+0.049
 swe+0.048
ries+0.047
 me+0.047
Neg logit contributions
ver-0.048
he-0.047
ative-0.046
hing-0.044
tted-0.044
Token the
Token ID262
Feature activation+1.17e-02

Pos logit contributions
 swe+0.011
 me+0.010
 L+0.010
ies+0.010
 scar+0.010
Neg logit contributions
growth-0.017
five-0.017
ver-0.016
zed-0.016
asts-0.016
Token ground
Token ID2323
Feature activation+3.37e-03

Pos logit contributions
five+0.019
glass+0.017
seek+0.017
asts+0.017
zed+0.016
Neg logit contributions
 L-0.019
 Lily-0.017
 me-0.016
 Sara-0.015
 my-0.015
Token and
Token ID290
Feature activation-9.24e-03

Pos logit contributions
ver+0.006
seek+0.006
five+0.005
he+0.005
tted+0.005
Neg logit contributions
 L-0.004
 Lily-0.004
 me-0.004
 swe-0.004
 my-0.004
Token the
Token ID262
Feature activation+5.23e-03

Pos logit contributions
 swe+0.015
 coll+0.014
 me+0.013
 L+0.013
 scar+0.013
Neg logit contributions
zed-0.012
ver-0.012
five-0.011
 exited-0.011
seek-0.011
Token to
Token ID284
Feature activation+1.28e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 swe+0.001
 us+0.001
Neg logit contributions
ver-0.001
seek-0.001
five-0.001
zed-0.001
ative-0.001
Token walk
Token ID2513
Feature activation+8.16e-03

Pos logit contributions
ver+0.002
five+0.002
 believe+0.002
seek+0.002
asts+0.002
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.002
 swe-0.002
 my-0.002
Token,
Token ID11
Feature activation-7.63e-03

Pos logit contributions
five+0.010
ver+0.010
seek+0.010
ative+0.010
tted+0.010
Neg logit contributions
 L-0.012
 yells-0.012
 mess-0.012
 swe-0.011
 me-0.011
Token her
Token ID607
Feature activation-7.77e-04

Pos logit contributions
 me+0.011
 yells+0.011
 us+0.010
lasses+0.010
 swe+0.010
Neg logit contributions
ver-0.011
zed-0.010
 Higher-0.009
ival-0.009
seek-0.009
Token foot
Token ID2366
Feature activation-1.47e-02

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.001
 my+0.001
 Sara+0.001
Neg logit contributions
ver-0.001
glass-0.001
 voyage-0.001
 exited-0.001
ative-0.001
Token slipped
Token ID18859
Feature activation-3.53e-02

Pos logit contributions
 L+0.020
 me+0.019
 Lily+0.019
 my+0.018
 swe+0.017
Neg logit contributions
ver-0.026
seek-0.024
five-0.023
tted-0.023
glass-0.023
Token on
Token ID319
Feature activation-1.01e-02

Pos logit contributions
 L+0.052
 Lily+0.048
ries+0.047
 me+0.046
 swe+0.046
Neg logit contributions
ver-0.048
ative-0.047
he-0.046
hing-0.045
tted-0.045
Token the
Token ID262
Feature activation+6.64e-04

Pos logit contributions
ies+0.012
 swe+0.012
 scar+0.011
 yells+0.011
ounced+0.011
Neg logit contributions
growth-0.016
he-0.014
five-0.014
 Higher-0.014
zed-0.014
Token wet
Token ID9583
Feature activation-2.72e-03

Pos logit contributions
five+0.001
seek+0.001
tted+0.001
ative+0.001
ugg+0.001
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 Mitt-0.001
 swe-0.001
Token ground
Token ID2323
Feature activation+7.84e-04

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 swe+0.003
 scar+0.003
Neg logit contributions
five-0.005
seek-0.005
ugg-0.004
asts-0.004
growth-0.004
Token.
Token ID13
Feature activation-5.54e-03

Pos logit contributions
ver+0.001
seek+0.001
five+0.001
tted+0.001
he+0.001
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 swe-0.001
 my-0.001
Token he
Token ID339
Feature activation+9.64e-03

Pos logit contributions
 L+0.007
 Lily+0.007
 me+0.006
 my+0.006
 Mitt+0.006
Neg logit contributions
five-0.006
seek-0.006
ver-0.006
ival-0.006
glass-0.006
Token climbed
Token ID19952
Feature activation-4.25e-03

Pos logit contributions
ative+0.019
glass+0.019
five+0.019
asts+0.019
seek+0.019
Neg logit contributions
 L-0.012
 me-0.011
 Lily-0.011
 p-0.010
 pr-0.009
Token,
Token ID11
Feature activation-3.21e-03

Pos logit contributions
 L+0.006
 me+0.005
 Lily+0.005
 swe+0.005
 perfume+0.005
Neg logit contributions
ver-0.007
five-0.007
seek-0.006
ative-0.006
tted-0.006
Token the
Token ID262
Feature activation-4.90e-03

Pos logit contributions
 L+0.006
 me+0.005
 Lily+0.005
 my+0.005
 swe+0.005
Neg logit contributions
ver-0.005
five-0.004
seek-0.004
ival-0.004
glass-0.004
Token leaf
Token ID12835
Feature activation-6.91e-03

Pos logit contributions
 L+0.008
 Lily+0.007
 me+0.006
 Sara+0.006
 Liz+0.006
Neg logit contributions
five-0.008
ver-0.008
 warrior-0.007
ative-0.007
seek-0.007
Token slipped
Token ID18859
Feature activation-3.49e-02

Pos logit contributions
 L+0.009
 Lily+0.009
 me+0.009
 my+0.008
 Sara+0.008
Neg logit contributions
ver-0.013
five-0.012
seek-0.012
glass-0.011
ival-0.011
Token and
Token ID290
Feature activation-5.40e-03

Pos logit contributions
 L+0.048
 Lily+0.045
 me+0.044
 swe+0.040
 Sara+0.038
Neg logit contributions
ver-0.060
five-0.056
ative-0.055
tted-0.054
seek-0.054
Token he
Token ID339
Feature activation-5.54e-03

Pos logit contributions
 L+0.009
 me+0.009
 Lily+0.008
 swe+0.008
 my+0.007
Neg logit contributions
ver-0.008
five-0.007
zed-0.007
seek-0.007
tted-0.007
Token fell
Token ID3214
Feature activation-2.32e-02

Pos logit contributions
 L+0.009
 Lily+0.009
 me+0.008
 swe+0.008
 my+0.008
Neg logit contributions
ver-0.009
five-0.008
seek-0.008
glass-0.008
ival-0.008
Token down
Token ID866
Feature activation+2.56e-03

Pos logit contributions
 L+0.028
 Lily+0.026
 me+0.026
 us+0.023
 Sara+0.023
Neg logit contributions
ver-0.043
seek-0.041
five-0.040
ative-0.040
tted-0.039
Token to
Token ID284
Feature activation-1.01e-02

Pos logit contributions
ver+0.005
five+0.005
seek+0.005
he+0.005
ative+0.005
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 swe-0.002
 scar-0.002
Token and
Token ID290
Feature activation+4.44e-03

Pos logit contributions
 L+0.005
 me+0.004
 swe+0.004
 Lily+0.004
 my+0.004
Neg logit contributions
ver-0.006
glass-0.005
seek-0.005
zed-0.005
five-0.005
Token Mia
Token ID32189
Feature activation-1.89e-03

Pos logit contributions
ver+0.011
five+0.011
seek+0.010
glass+0.010
ival+0.010
Neg logit contributions
 me-0.003
 L-0.003
 Lily-0.003
 swe-0.002
 my-0.002
Token is
Token ID318
Feature activation-3.58e-04

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.002
 p+0.002
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
ival-0.004
glass-0.004
Token the
Token ID262
Feature activation-2.55e-03

Pos logit contributions
 me+0.000
 L+0.000
 Lily+0.000
 swe+0.000
 umb+0.000
Neg logit contributions
ver-0.001
seek-0.001
five-0.001
ative-0.001
zed-0.001
Token h
Token ID289
Feature activation+6.09e-03

Pos logit contributions
ries+0.004
lasses+0.003
ucker+0.003
 them+0.003
 L+0.003
Neg logit contributions
seek-0.004
 villains-0.004
 protector-0.004
ative-0.004
ibr-0.004
Tokenider
Token ID1304
Feature activation-3.49e-02

Pos logit contributions
 voyage+0.009
five+0.009
ver+0.009
asts+0.008
glass+0.008
Neg logit contributions
 L-0.010
 Lily-0.009
 me-0.009
 Sara-0.008
 p-0.008
Token.
Token ID13
Feature activation-4.18e-03

Pos logit contributions
 L+0.037
 Lily+0.035
 me+0.035
 Mitt+0.032
 swe+0.031
Neg logit contributions
ver-0.068
seek-0.067
five-0.066
adows-0.065
glass-0.064
Token Tom
Token ID4186
Feature activation-7.49e-03

Pos logit contributions
 swe+0.004
 me+0.004
 scar+0.004
 swollen+0.004
 L+0.004
Neg logit contributions
seek-0.008
five-0.008
ver-0.008
ival-0.007
zed-0.007
Token counts
Token ID9853
Feature activation+3.36e-03

Pos logit contributions
 L+0.010
 Lily+0.009
 me+0.009
 Mitt+0.008
 swe+0.008
Neg logit contributions
ver-0.013
five-0.013
seek-0.013
ival-0.012
glass-0.012
Token to
Token ID284
Feature activation-1.27e-02

Pos logit contributions
ver+0.007
seek+0.006
five+0.006
glass+0.006
ative+0.006
Neg logit contributions
 L-0.004
 me-0.004
 Lily-0.004
 swe-0.003
 Mitt-0.003
Token ten
Token ID3478
Feature activation-3.68e-03

Pos logit contributions
 L+0.017
 me+0.017
 Lily+0.016
 swe+0.016
 Mitt+0.015
Neg logit contributions
five-0.020
seek-0.020
ver-0.020
glass-0.020
asts-0.019
Token someone
Token ID2130
Feature activation-5.92e-03

Pos logit contributions
 swe+0.011
 L+0.010
 me+0.010
 scar+0.010
lasses+0.009
Neg logit contributions
ver-0.014
asts-0.014
zed-0.014
growth-0.014
glass-0.014
Token a
Token ID257
Feature activation-1.66e-03

Pos logit contributions
 L+0.009
 Lily+0.008
 me+0.008
 Mitt+0.007
 swe+0.007
Neg logit contributions
ver-0.010
five-0.009
ival-0.009
glass-0.009
asts-0.009
Token very
Token ID845
Feature activation-1.26e-02

Pos logit contributions
 L+0.003
 Lily+0.002
ries+0.002
lasses+0.002
 Jen+0.002
Neg logit contributions
 warrior-0.002
ival-0.002
 voyage-0.002
five-0.002
glass-0.002
Token long
Token ID890
Feature activation-6.76e-03

Pos logit contributions
 L+0.021
 Mitt+0.019
 Lily+0.019
ries+0.018
lasses+0.018
Neg logit contributions
five-0.018
 voyage-0.018
 quest-0.017
working-0.017
glass-0.017
Token time
Token ID640
Feature activation-3.72e-03

Pos logit contributions
 L+0.011
 Lily+0.010
 me+0.010
 swe+0.010
 Mitt+0.009
Neg logit contributions
 voyage-0.010
five-0.010
glass-0.010
 warrior-0.009
ver-0.009
Token ago
Token ID2084
Feature activation-3.49e-02

Pos logit contributions
 L+0.007
 Lily+0.007
 me+0.007
 Mitt+0.006
 swe+0.006
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
glass-0.004
asts-0.004
Token.
Token ID13
Feature activation-4.13e-03

Pos logit contributions
 L+0.044
 Lily+0.041
 me+0.040
 Mitt+0.040
 swe+0.040
Neg logit contributions
ver-0.064
glass-0.061
seek-0.060
five-0.060
asts-0.060
Token The
Token ID383
Feature activation-1.17e-02

Pos logit contributions
 me+0.006
 L+0.006
 swe+0.006
 Lily+0.006
 scar+0.006
Neg logit contributions
ver-0.006
five-0.006
ival-0.006
seek-0.006
ugg-0.006
Token little
Token ID1310
Feature activation-1.08e-02

Pos logit contributions
 L+0.013
 me+0.012
 Lily+0.011
 Mitt+0.010
 my+0.010
Neg logit contributions
ver-0.023
five-0.023
ival-0.023
glass-0.022
ative-0.022
Token girl
Token ID2576
Feature activation+2.37e-03

Pos logit contributions
 me+0.012
 L+0.012
 Mitt+0.011
 us+0.011
 cho+0.010
Neg logit contributions
five-0.019
 warrior-0.019
ver-0.019
 astronaut-0.018
ative-0.018
Token thought
Token ID1807
Feature activation-3.61e-03

Pos logit contributions
ver+0.005
five+0.005
seek+0.005
ative+0.005
asts+0.005
Neg logit contributions
 me-0.003
 L-0.002
 Lily-0.002
 p-0.002
 my-0.002
Token\n
Token ID198
Feature activation+7.12e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 swe+0.001
 my+0.001
Neg logit contributions
ver-0.002
five-0.001
seek-0.001
glass-0.001
ival-0.001
TokenSuddenly
Token ID38582
Feature activation-5.25e-03

Pos logit contributions
ver+0.013
five+0.013
seek+0.013
ival+0.012
asts+0.012
Neg logit contributions
 L-0.009
 me-0.009
 Lily-0.009
 my-0.008
 Sara-0.007
Token,
Token ID11
Feature activation-1.93e-05

Pos logit contributions
 L+0.007
 me+0.007
 Lily+0.007
 swe+0.006
 my+0.006
Neg logit contributions
ver-0.010
five-0.009
ival-0.009
seek-0.009
glass-0.009
Token the
Token ID262
Feature activation+1.15e-03

Pos logit contributions
 me+0.000
 swe+0.000
 L+0.000
ety+0.000
 swollen+0.000
Neg logit contributions
ver-0.000
five-0.000
ival-0.000
 exited-0.000
glass-0.000
Token crystal
Token ID15121
Feature activation+5.54e-03

Pos logit contributions
 warrior+0.001
 voyage+0.001
 miner+0.001
glass+0.001
 astronaut+0.001
Neg logit contributions
 L-0.002
 Lily-0.002
 Sara-0.002
 Mitt-0.002
 Jen-0.002
Token slipped
Token ID18859
Feature activation-3.49e-02

Pos logit contributions
ver+0.009
glass+0.009
five+0.008
zed+0.008
ival+0.008
Neg logit contributions
 L-0.009
 Lily-0.008
 me-0.008
 my-0.007
 Mitt-0.007
Token from
Token ID422
Feature activation+6.09e-03

Pos logit contributions
 L+0.047
 me+0.044
 Lily+0.043
 swe+0.039
 my+0.038
Neg logit contributions
ver-0.060
five-0.058
ative-0.056
tted-0.055
glass-0.055
Token their
Token ID511
Feature activation+1.36e-03

Pos logit contributions
growth+0.012
five+0.012
asts+0.011
 astronaut+0.011
 patrol+0.011
Neg logit contributions
 swe-0.006
 L-0.006
 me-0.005
 umb-0.005
 pr-0.005
Token hands
Token ID2832
Feature activation-1.13e-02

Pos logit contributions
five+0.002
seek+0.002
glass+0.002
 voyage+0.002
 patrol+0.002
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.002
 swe-0.002
 Mitt-0.002
Token and
Token ID290
Feature activation-4.57e-03

Pos logit contributions
 L+0.011
 me+0.011
 Lily+0.011
 my+0.008
 swe+0.008
Neg logit contributions
ver-0.025
five-0.024
seek-0.023
ival-0.023
glass-0.023
Token fell
Token ID3214
Feature activation-2.66e-02

Pos logit contributions
 L+0.007
 swe+0.007
 me+0.007
 Lily+0.007
lasses+0.006
Neg logit contributions
zed-0.006
five-0.006
 exited-0.005
growth-0.005
glass-0.005
TokenBut
Token ID1537
Feature activation+3.36e-03

Pos logit contributions
ver+0.016
adows+0.015
ane+0.015
igh+0.015
ugg+0.015
Neg logit contributions
 me-0.015
 Lily-0.013
 L-0.012
 my-0.011
L-0.011
Token then
Token ID788
Feature activation-2.58e-03

Pos logit contributions
ver+0.006
five+0.006
seek+0.006
ival+0.006
glass+0.006
Neg logit contributions
 L-0.004
 me-0.004
 Lily-0.004
 my-0.003
 swe-0.003
Token Jack
Token ID3619
Feature activation-1.02e-03

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 my+0.003
 Mitt+0.003
Neg logit contributions
ver-0.005
five-0.005
seek-0.005
ival-0.004
glass-0.004
Token's
Token ID338
Feature activation+8.21e-03

Pos logit contributions
 L+0.002
 Lily+0.002
 my+0.002
 swe+0.001
 Sara+0.001
Neg logit contributions
ver-0.002
glass-0.001
seek-0.001
 exited-0.001
zed-0.001
Token foot
Token ID2366
Feature activation-1.77e-02

Pos logit contributions
ver+0.017
five+0.017
seek+0.016
ival+0.016
glass+0.016
Neg logit contributions
 L-0.009
 me-0.009
 Lily-0.009
 my-0.007
 swe-0.007
Token slipped
Token ID18859
Feature activation-3.48e-02

Pos logit contributions
 L+0.027
 Lily+0.026
 me+0.025
 my+0.024
 swe+0.023
Neg logit contributions
ver-0.029
seek-0.027
five-0.026
tted-0.026
zed-0.026
Token,
Token ID11
Feature activation-3.80e-03

Pos logit contributions
 L+0.050
 Lily+0.047
 me+0.045
 swe+0.042
 Sara+0.041
Neg logit contributions
ver-0.056
five-0.053
ative-0.053
seek-0.051
tted-0.050
Token and
Token ID290
Feature activation-9.67e-04

Pos logit contributions
 L+0.005
 Lily+0.005
 me+0.005
 swe+0.005
 my+0.004
Neg logit contributions
ver-0.006
five-0.006
zed-0.006
glass-0.006
 exited-0.006
Token he
Token ID339
Feature activation-5.15e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.002
 my+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
zed-0.001
glass-0.001
Token started
Token ID2067
Feature activation+1.11e-03

Pos logit contributions
 L+0.009
 Lily+0.008
 swe+0.008
 me+0.008
 my+0.008
Neg logit contributions
ver-0.007
zed-0.007
seek-0.007
glass-0.007
five-0.006
Token to
Token ID284
Feature activation+1.32e-02

Pos logit contributions
seek+0.002
ver+0.002
five+0.002
zed+0.002
anging+0.002
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 my-0.001
 swe-0.001
Token Cole
Token ID11768
Feature activation-5.56e-03

Pos logit contributions
 L+0.008
 me+0.008
 Lily+0.008
 swe+0.007
 Mitt+0.006
Neg logit contributions
ver-0.020
five-0.020
seek-0.020
glass-0.019
asts-0.019
Token's
Token ID338
Feature activation-3.38e-03

Pos logit contributions
 L+0.008
 Lily+0.007
 me+0.007
 my+0.006
 swe+0.006
Neg logit contributions
ver-0.010
five-0.009
seek-0.009
glass-0.009
asts-0.009
Token room
Token ID2119
Feature activation+1.24e-03

Pos logit contributions
 L+0.005
 Lily+0.005
 me+0.005
 Sara+0.004
 swe+0.004
Neg logit contributions
five-0.005
ver-0.005
glass-0.005
seek-0.005
 voyage-0.005
Token when
Token ID618
Feature activation-7.25e-03

Pos logit contributions
ver+0.003
five+0.003
ival+0.003
seek+0.003
glass+0.003
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 my-0.001
 p-0.001
Token he
Token ID339
Feature activation-3.20e-03

Pos logit contributions
 L+0.013
 me+0.012
 Lily+0.012
 swe+0.011
 my+0.011
Neg logit contributions
ver-0.010
five-0.009
seek-0.009
ival-0.009
glass-0.009
Token accidentally
Token ID14716
Feature activation-3.48e-02

Pos logit contributions
 L+0.005
 Lily+0.005
 swe+0.005
 me+0.005
 my+0.005
Neg logit contributions
ver-0.005
glass-0.004
ival-0.004
seek-0.004
 exited-0.004
Token disturbed
Token ID24069
Feature activation-8.29e-03

Pos logit contributions
 L+0.051
 me+0.049
 Lily+0.048
 swe+0.045
 my+0.042
Neg logit contributions
ver-0.055
five-0.053
zed-0.053
seek-0.052
glass-0.052
Token him
Token ID683
Feature activation-1.57e-02

Pos logit contributions
 swe+0.012
 umb+0.011
 ther+0.010
ies+0.010
ety+0.010
Neg logit contributions
seek-0.011
growth-0.011
five-0.011
asts-0.011
ative-0.011
Token.
Token ID13
Feature activation+1.56e-03

Pos logit contributions
 L+0.018
 Lily+0.017
 me+0.016
 Sara+0.014
 swe+0.014
Neg logit contributions
ver-0.031
seek-0.030
glass-0.029
five-0.029
ative-0.029
Token Cole
Token ID11768
Feature activation+3.76e-03

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
ival+0.002
glass+0.002
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.002
 my-0.002
 swe-0.002
Token said
Token ID531
Feature activation+3.17e-04

Pos logit contributions
ver+0.006
five+0.006
seek+0.006
ival+0.006
zed+0.006
Neg logit contributions
 L-0.005
 Lily-0.005
 me-0.005
 swe-0.004
 my-0.004
Token was
Token ID373
Feature activation+7.07e-03

Pos logit contributions
ver+0.016
five+0.016
seek+0.016
ival+0.015
glass+0.015
Neg logit contributions
 L-0.011
 me-0.011
 Lily-0.011
 my-0.009
 swe-0.009
Token sitting
Token ID5586
Feature activation+6.74e-03

Pos logit contributions
ver+0.010
five+0.010
seek+0.010
ival+0.009
asts+0.009
Neg logit contributions
 L-0.012
 Lily-0.011
 me-0.011
 swe-0.010
 my-0.010
Token,
Token ID11
Feature activation+1.29e-04

Pos logit contributions
ver+0.014
five+0.014
seek+0.013
ival+0.013
glass+0.013
Neg logit contributions
 L-0.008
 me-0.007
 Lily-0.007
 my-0.006
 swe-0.006
Token his
Token ID465
Feature activation-3.51e-04

Pos logit contributions
ver+0.000
five+0.000
seek+0.000
ival+0.000
glass+0.000
Neg logit contributions
 L-0.000
 me-0.000
 Lily-0.000
 my-0.000
 swe-0.000
Token ball
Token ID2613
Feature activation-1.02e-02

Pos logit contributions
 L+0.000
 Lily+0.000
 me+0.000
 my+0.000
 Mitt+0.000
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
ival-0.001
glass-0.001
Token slipped
Token ID18859
Feature activation-3.47e-02

Pos logit contributions
 L+0.016
 Lily+0.015
 me+0.014
 my+0.014
 Sara+0.013
Neg logit contributions
ver-0.016
glass-0.015
seek-0.015
tted-0.015
five-0.015
Token out
Token ID503
Feature activation-3.59e-03

Pos logit contributions
 L+0.050
 Lily+0.046
 me+0.044
 swe+0.042
 Sara+0.041
Neg logit contributions
ver-0.055
ative-0.052
five-0.051
tted-0.051
hing-0.051
Token of
Token ID286
Feature activation-3.53e-03

Pos logit contributions
 L+0.006
 me+0.006
 Lily+0.006
 swe+0.006
 Mitt+0.006
Neg logit contributions
ver-0.005
five-0.004
ative-0.004
glass-0.004
asts-0.004
Token his
Token ID465
Feature activation-3.86e-03

Pos logit contributions
 L+0.004
 swe+0.004
 me+0.004
 Lily+0.004
ies+0.004
Neg logit contributions
growth-0.005
five-0.005
asts-0.005
he-0.005
glass-0.005
Token hands
Token ID2832
Feature activation-8.33e-03

Pos logit contributions
 L+0.005
 Lily+0.005
 me+0.005
 Sara+0.004
 Mitt+0.004
Neg logit contributions
five-0.006
glass-0.006
seek-0.006
ver-0.006
asts-0.006
Token.
Token ID13
Feature activation+7.11e-04

Pos logit contributions
 L+0.010
 me+0.009
 Lily+0.009
 my+0.008
 swe+0.007
Neg logit contributions
ver-0.017
five-0.016
seek-0.016
glass-0.015
ival-0.015
Token\n
Token ID198
Feature activation+1.41e-03

Pos logit contributions
 L+0.007
 me+0.006
 Lily+0.006
 my+0.005
 swe+0.005
Neg logit contributions
ver-0.010
five-0.009
seek-0.009
ival-0.009
glass-0.009
TokenThey
Token ID2990
Feature activation+3.64e-03

Pos logit contributions
ver+0.003
seek+0.003
five+0.003
ative+0.003
asts+0.003
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.002
 my-0.001
 Sara-0.001
Token see
Token ID766
Feature activation+4.25e-03

Pos logit contributions
ver+0.007
seek+0.007
ative+0.006
ival+0.006
five+0.006
Neg logit contributions
 L-0.005
 me-0.005
 Lily-0.005
 p-0.004
 Sara-0.004
Token photos
Token ID5205
Feature activation+6.16e-03

Pos logit contributions
ver+0.009
five+0.008
glass+0.008
asts+0.008
ival+0.008
Neg logit contributions
 L-0.004
 me-0.004
 Lily-0.004
 my-0.003
 swe-0.003
Token of
Token ID286
Feature activation+3.02e-03

Pos logit contributions
ver+0.012
five+0.011
seek+0.011
ival+0.011
glass+0.011
Neg logit contributions
 L-0.008
 me-0.007
 Lily-0.007
 my-0.006
 Mitt-0.006
Token Grand
Token ID5675
Feature activation+2.52e-02

Pos logit contributions
glass+0.006
asts+0.006
five+0.006
ver+0.006
 voyage+0.005
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 Mitt-0.003
 swe-0.003
Tokenma
Token ID2611
Feature activation-8.95e-04

Pos logit contributions
ver+0.048
five+0.046
seek+0.045
ival+0.044
ative+0.044
Neg logit contributions
 L-0.033
 me-0.032
 Lily-0.031
 my-0.027
 swe-0.026
Token and
Token ID290
Feature activation+8.58e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 my+0.001
 Mitt+0.001
Neg logit contributions
ver-0.002
five-0.002
glass-0.002
seek-0.002
asts-0.002
Token Grand
Token ID5675
Feature activation+3.10e-02

Pos logit contributions
five+0.015
ver+0.015
glass+0.014
asts+0.014
seek+0.014
Neg logit contributions
 L-0.011
 me-0.011
 Lily-0.010
 swe-0.009
 my-0.009
Tokenpa
Token ID8957
Feature activation+2.51e-03

Pos logit contributions
ver+0.049
five+0.045
seek+0.045
ival+0.044
tted+0.043
Neg logit contributions
 L-0.051
 me-0.051
 Lily-0.049
 my-0.043
 swe-0.043
Token in
Token ID287
Feature activation-1.51e-02

Pos logit contributions
ver+0.005
five+0.005
seek+0.005
glass+0.005
ival+0.005
Neg logit contributions
 L-0.003
 me-0.002
 Lily-0.002
 my-0.002
 Mitt-0.002
Token and
Token ID290
Feature activation+9.95e-04

Pos logit contributions
ver+0.010
five+0.010
seek+0.010
ative+0.010
glass+0.009
Neg logit contributions
 L-0.007
 me-0.006
 Lily-0.006
 Mitt-0.005
 swe-0.005
Token not
Token ID407
Feature activation-7.26e-03

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
ival+0.002
ative+0.002
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 swe-0.001
 Sara-0.001
Token strong
Token ID1913
Feature activation+1.09e-02

Pos logit contributions
 L+0.009
 me+0.008
 umb+0.008
lasses+0.008
 swe+0.008
Neg logit contributions
ver-0.012
five-0.012
 believe-0.012
 rider-0.011
seek-0.011
Token like
Token ID588
Feature activation-6.86e-03

Pos logit contributions
ver+0.023
five+0.022
seek+0.022
ative+0.022
 rider+0.021
Neg logit contributions
 L-0.011
 me-0.010
 Lily-0.010
 Mitt-0.009
 swe-0.008
Token the
Token ID262
Feature activation+1.13e-02

Pos logit contributions
 L+0.006
 swe+0.004
 Lily+0.004
 umb+0.004
Oh+0.004
Neg logit contributions
ver-0.014
five-0.014
seek-0.014
ival-0.014
asts-0.014
Token other
Token ID584
Feature activation+2.35e-02

Pos logit contributions
 rider+0.018
ative+0.017
 warrior+0.017
 protector+0.016
five+0.016
Neg logit contributions
 L-0.016
 Lily-0.016
 Mia-0.015
 Liz-0.014
 Jen-0.014
Token animals
Token ID4695
Feature activation-3.64e-03

Pos logit contributions
 rider+0.040
five+0.037
seek+0.037
ver+0.036
ane+0.036
Neg logit contributions
 L-0.029
 Lily-0.029
lasses-0.025
 Mitt-0.025
 Sara-0.025
Token."
Token ID526
Feature activation+5.55e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.002
 swe+0.002
Neg logit contributions
ver-0.009
five-0.008
seek-0.008
ival-0.008
glass-0.008
Token The
Token ID383
Feature activation+3.76e-03

Pos logit contributions
ver+0.010
five+0.009
seek+0.009
ival+0.009
glass+0.009
Neg logit contributions
 L-0.008
 me-0.007
 Lily-0.007
 swe-0.006
 my-0.006
Token owl
Token ID39610
Feature activation-8.95e-03

Pos logit contributions
ver+0.006
five+0.006
ative+0.006
seek+0.006
glass+0.006
Neg logit contributions
 L-0.006
 Lily-0.005
 me-0.005
 Sara-0.005
 my-0.005
Token told
Token ID1297
Feature activation-2.40e-03

Pos logit contributions
 L+0.009
 Lily+0.008
 me+0.008
 my+0.007
 swe+0.006
Neg logit contributions
ver-0.020
five-0.019
seek-0.019
ival-0.018
glass-0.018
Token
Token ID198
Feature activation+2.02e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.002
 swe+0.002
Neg logit contributions
ver-0.003
five-0.003
seek-0.003
ival-0.003
glass-0.003
TokenThe
Token ID464
Feature activation-5.63e-03

Pos logit contributions
ver+0.003
five+0.003
seek+0.003
glass+0.003
ival+0.003
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 my-0.003
 Mitt-0.003
Token pupil
Token ID34047
Feature activation+9.10e-03

Pos logit contributions
 L+0.009
 me+0.009
 Lily+0.009
 my+0.007
 swe+0.007
Neg logit contributions
ver-0.009
five-0.009
seek-0.008
ative-0.008
ival-0.008
Token and
Token ID290
Feature activation+1.22e-02

Pos logit contributions
ver+0.017
five+0.016
seek+0.016
asts+0.016
ival+0.016
Neg logit contributions
 L-0.012
 me-0.012
 Lily-0.011
 my-0.010
 p-0.009
Token the
Token ID262
Feature activation+7.01e-03

Pos logit contributions
ver+0.025
five+0.025
seek+0.024
ative+0.024
glass+0.024
Neg logit contributions
 L-0.012
 me-0.012
 Lily-0.011
 swe-0.010
 Mitt-0.009
Token attractive
Token ID10966
Feature activation+2.39e-02

Pos logit contributions
ver+0.009
five+0.008
seek+0.008
ative+0.008
ival+0.008
Neg logit contributions
 L-0.013
 me-0.013
 Lily-0.013
 my-0.012
 swe-0.011
Token person
Token ID1048
Feature activation+2.92e-03

Pos logit contributions
ver+0.035
asts+0.033
five+0.033
seek+0.032
ival+0.032
Neg logit contributions
 L-0.042
 me-0.041
 Lily-0.041
 my-0.036
 p-0.035
Token became
Token ID2627
Feature activation-6.39e-03

Pos logit contributions
seek+0.005
five+0.005
ver+0.005
glass+0.005
asts+0.005
Neg logit contributions
 me-0.004
 L-0.004
 Lily-0.004
 p-0.003
 you-0.003
Token good
Token ID922
Feature activation-4.01e-03

Pos logit contributions
 me+0.008
 L+0.008
 Lily+0.008
 my+0.007
 swe+0.007
Neg logit contributions
ver-0.012
ative-0.011
five-0.011
ival-0.010
asts-0.010
Token friends
Token ID2460
Feature activation+2.79e-03

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 my+0.002
 p+0.002
Neg logit contributions
ver-0.009
five-0.009
seek-0.009
ival-0.009
glass-0.009
Token.
Token ID13
Feature activation-6.87e-04

Pos logit contributions
ver+0.006
five+0.006
seek+0.006
asts+0.006
ival+0.006
Neg logit contributions
 L-0.003
 Lily-0.002
 me-0.002
 my-0.002
 Sara-0.002
Token.
Token ID13
Feature activation-3.61e-03

Pos logit contributions
ver+0.003
five+0.003
seek+0.003
ival+0.003
glass+0.003
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 my-0.001
 swe-0.001
Token
Token ID198
Feature activation+2.98e-03

Pos logit contributions
 L+0.004
 Lily+0.004
 me+0.004
 my+0.004
 Sara+0.003
Neg logit contributions
ver-0.007
five-0.007
seek-0.007
ival-0.007
glass-0.007
Token
Token ID198
Feature activation+2.65e-03

Pos logit contributions
ver+0.006
five+0.006
seek+0.006
ative+0.006
ival+0.006
Neg logit contributions
 L-0.003
 Lily-0.003
 me-0.003
 Sara-0.003
 my-0.003
TokenTom
Token ID13787
Feature activation+3.65e-03

Pos logit contributions
ver+0.005
five+0.005
ative+0.005
seek+0.005
asts+0.005
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 my-0.003
 Sara-0.003
Token and
Token ID290
Feature activation+1.00e-02

Pos logit contributions
ver+0.007
five+0.006
seek+0.006
ival+0.006
glass+0.006
Neg logit contributions
 L-0.005
 me-0.005
 Lily-0.005
 my-0.004
 swe-0.004
Token his
Token ID465
Feature activation+2.28e-02

Pos logit contributions
five+0.013
ver+0.013
seek+0.012
glass+0.012
asts+0.011
Neg logit contributions
 L-0.018
 me-0.018
 Lily-0.017
 swe-0.016
 my-0.015
Token dad
Token ID9955
Feature activation+1.56e-02

Pos logit contributions
ver+0.039
five+0.037
seek+0.036
ival+0.036
glass+0.036
Neg logit contributions
 L-0.034
 me-0.033
 Lily-0.033
 my-0.028
 swe-0.028
Token and
Token ID290
Feature activation+1.10e-02

Pos logit contributions
seek+0.030
five+0.030
ver+0.029
ative+0.029
ival+0.028
Neg logit contributions
 me-0.020
 Lily-0.019
 L-0.019
 p-0.016
 Mitt-0.015
Token mum
Token ID25682
Feature activation+8.65e-03

Pos logit contributions
ver+0.016
five+0.016
glass+0.015
seek+0.015
ative+0.015
Neg logit contributions
 L-0.018
 me-0.017
 Lily-0.016
 my-0.015
 swe-0.015
Token wanted
Token ID2227
Feature activation+3.51e-03

Pos logit contributions
seek+0.017
five+0.017
ver+0.017
ival+0.016
ative+0.016
Neg logit contributions
 L-0.010
 me-0.010
 Lily-0.010
 p-0.008
 Mitt-0.008
Token to
Token ID284
Feature activation+3.80e-03

Pos logit contributions
ver+0.008
five+0.008
seek+0.008
ival+0.008
ative+0.008
Neg logit contributions
 L-0.003
 Lily-0.003
 me-0.003
 my-0.002
 Sara-0.002
Token's
Token ID338
Feature activation-4.42e-04

Pos logit contributions
ver+0.019
five+0.018
seek+0.018
glass+0.018
ival+0.018
Neg logit contributions
 L-0.008
 Lily-0.008
 me-0.008
 Mitt-0.006
 swe-0.006
Token a
Token ID257
Feature activation-4.98e-03

Pos logit contributions
 L+0.001
 Lily+0.000
 me+0.000
 swe+0.000
 Wendy+0.000
Neg logit contributions
asts-0.001
glass-0.001
ver-0.001
adows-0.001
 voyage-0.001
Token problem
Token ID1917
Feature activation-2.22e-03

Pos logit contributions
 L+0.008
 Lily+0.007
 me+0.007
ries+0.007
 Jen+0.007
Neg logit contributions
 voyage-0.007
glass-0.007
 warrior-0.007
asts-0.007
ival-0.007
Token with
Token ID351
Feature activation-4.04e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 Mitt+0.001
 swe+0.001
Neg logit contributions
ver-0.005
five-0.005
seek-0.005
glass-0.005
ival-0.005
Token the
Token ID262
Feature activation+1.16e-03

Pos logit contributions
 L+0.004
 Lily+0.004
 me+0.004
 swe+0.004
 Mitt+0.003
Neg logit contributions
asts-0.007
glass-0.007
ver-0.007
five-0.007
ibr-0.007
Token pump
Token ID8901
Feature activation+2.51e-02

Pos logit contributions
asts+0.001
glass+0.001
 miner+0.001
five+0.001
 voyage+0.001
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.002
 Mia-0.002
 Mitt-0.002
Token.
Token ID13
Feature activation+5.76e-03

Pos logit contributions
ver+0.035
glass+0.033
five+0.032
asts+0.032
ative+0.031
Neg logit contributions
 L-0.045
 Lily-0.043
 me-0.043
 Mitt-0.039
 swe-0.038
Token I
Token ID314
Feature activation-1.11e-03

Pos logit contributions
ver+0.011
five+0.011
seek+0.011
ival+0.010
glass+0.010
Neg logit contributions
 L-0.007
 me-0.007
 Lily-0.007
 my-0.006
 Sara-0.005
Token need
Token ID761
Feature activation+1.82e-03

Pos logit contributions
 L+0.002
 Lily+0.001
 me+0.001
 my+0.001
 p+0.001
Neg logit contributions
ver-0.002
five-0.002
seek-0.002
ival-0.002
glass-0.002
Token to
Token ID284
Feature activation-5.81e-04

Pos logit contributions
five+0.003
ver+0.003
asts+0.003
glass+0.003
seek+0.003
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.002
 swe-0.002
 Mitt-0.002
Token open
Token ID1280
Feature activation-3.34e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.001
 swe+0.001
 Sara+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
asts-0.001
glass-0.001
Token up
Token ID510
Feature activation+6.95e-03

Pos logit contributions
 L+0.017
 umb+0.015
 shampoo+0.015
Su+0.015
 p+0.014
Neg logit contributions
ver-0.019
five-0.018
zed-0.017
 rider-0.017
seek-0.017
Token.
Token ID13
Feature activation-9.86e-04

Pos logit contributions
ver+0.013
ative+0.012
glass+0.012
five+0.012
seek+0.012
Neg logit contributions
 L-0.008
 me-0.008
 Lily-0.007
lasses-0.007
 my-0.006
Token He
Token ID679
Feature activation-6.88e-04

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.002
 my+0.002
Neg logit contributions
ver-0.001
seek-0.001
five-0.001
ival-0.001
glass-0.001
Token used
Token ID973
Feature activation-1.28e-02

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.001
 my+0.001
 swe+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
glass-0.001
ival-0.001
Token his
Token ID465
Feature activation-7.00e-04

Pos logit contributions
 L+0.020
 me+0.019
 Lily+0.019
 my+0.017
 swe+0.017
Neg logit contributions
ver-0.020
five-0.019
seek-0.019
glass-0.018
ival-0.018
Token shield
Token ID7614
Feature activation-1.94e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 my+0.001
 swe+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
ival-0.001
glass-0.001
Token to
Token ID284
Feature activation+5.64e-04

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 too+0.001
 my+0.001
Neg logit contributions
asts-0.005
five-0.005
ival-0.005
ver-0.005
ugg-0.004
Token protect
Token ID1805
Feature activation-7.18e-03

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
glass+0.001
asts+0.001
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 my-0.001
 swe-0.001
Token himself
Token ID2241
Feature activation-2.94e-03

Pos logit contributions
 swe+0.011
 L+0.011
 umb+0.011
 mess+0.010
 p+0.010
Neg logit contributions
ver-0.009
glass-0.009
five-0.009
asts-0.008
growth-0.008
Token from
Token ID422
Feature activation-4.47e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.002
 swe+0.002
Neg logit contributions
ver-0.006
five-0.006
seek-0.006
glass-0.006
ival-0.006
Token the
Token ID262
Feature activation-4.76e-03

Pos logit contributions
 L+0.006
 Lily+0.005
 swe+0.005
 me+0.005
lasses+0.005
Neg logit contributions
glass-0.007
asts-0.007
ver-0.007
zed-0.007
 patrol-0.007
Token stretched
Token ID19110
Feature activation+3.81e-03

Pos logit contributions
 L+0.005
 me+0.005
 Lily+0.005
 Mitt+0.004
 my+0.004
Neg logit contributions
five-0.007
ver-0.007
seek-0.007
ival-0.007
 exited-0.007
Token up
Token ID510
Feature activation+1.22e-02

Pos logit contributions
ver+0.007
ative+0.006
five+0.006
adows+0.006
glass+0.005
Neg logit contributions
 me-0.005
 L-0.005
 Lily-0.005
 my-0.004
 mess-0.004
Token into
Token ID656
Feature activation-8.30e-03

Pos logit contributions
ver+0.019
five+0.018
seek+0.018
ative+0.018
zed+0.018
Neg logit contributions
 me-0.016
 L-0.016
 Lily-0.015
 my-0.014
 Mitt-0.014
Token the
Token ID262
Feature activation-3.59e-04

Pos logit contributions
 me+0.009
ety+0.009
lasses+0.009
ounced+0.008
 my+0.008
Neg logit contributions
growth-0.014
five-0.013
asts-0.012
 believe-0.012
glass-0.012
Token sky
Token ID6766
Feature activation+4.74e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.000
 Mitt+0.000
 Sara+0.000
Neg logit contributions
five-0.001
 voyage-0.001
glass-0.001
asts-0.001
seek-0.001
Token.
Token ID13
Feature activation+8.31e-03

Pos logit contributions
ver+0.009
five+0.009
seek+0.009
asts+0.009
ative+0.009
Neg logit contributions
 L-0.005
 Lily-0.005
 me-0.005
 Mitt-0.004
 my-0.004
Token
Token ID220
Feature activation-1.03e-02

Pos logit contributions
ver+0.014
five+0.014
seek+0.013
ival+0.013
glass+0.013
Neg logit contributions
 L-0.012
 me-0.012
 Lily-0.011
 my-0.010
 swe-0.010
Token
Token ID198
Feature activation+8.13e-03

Pos logit contributions
 me+0.015
 L+0.014
 Lily+0.014
 swe+0.013
 my+0.013
Neg logit contributions
ver-0.018
five-0.016
seek-0.016
ival-0.016
tted-0.016
Token
Token ID198
Feature activation+1.11e-02

Pos logit contributions
ver+0.015
five+0.014
seek+0.014
ival+0.014
glass+0.013
Neg logit contributions
 L-0.011
 me-0.010
 Lily-0.010
 swe-0.009
 my-0.009
TokenWhen
Token ID2215
Feature activation+7.97e-03

Pos logit contributions
ver+0.017
five+0.016
seek+0.016
ival+0.016
glass+0.016
Neg logit contributions
 L-0.018
 Lily-0.018
 me-0.017
 my-0.015
 swe-0.015
Token she
Token ID673
Feature activation+7.76e-03

Pos logit contributions
ver+0.016
five+0.015
seek+0.015
ative+0.015
ival+0.015
Neg logit contributions
 me-0.008
 Lily-0.008
 L-0.008
 swe-0.007
 my-0.007
Token They
Token ID1119
Feature activation-3.25e-03

Pos logit contributions
ver+0.005
five+0.005
seek+0.005
ival+0.004
glass+0.004
Neg logit contributions
 me-0.004
 L-0.004
 Lily-0.004
 swe-0.003
 my-0.003
Token were
Token ID547
Feature activation+2.11e-03

Pos logit contributions
 L+0.005
 me+0.005
 Lily+0.005
 my+0.004
 swe+0.004
Neg logit contributions
ver-0.005
five-0.005
seek-0.005
ival-0.005
glass-0.005
Token happy
Token ID3772
Feature activation+6.86e-03

Pos logit contributions
ver+0.004
five+0.003
seek+0.003
ival+0.003
glass+0.003
Neg logit contributions
 L-0.003
 me-0.003
 Lily-0.003
 my-0.003
 swe-0.003
Token in
Token ID287
Feature activation-1.80e-03

Pos logit contributions
five+0.013
ver+0.013
seek+0.013
ative+0.013
glass+0.013
Neg logit contributions
 me-0.007
 L-0.007
 Lily-0.006
 swe-0.006
 my-0.006
Token the
Token ID262
Feature activation-4.33e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.001
 my+0.001
Neg logit contributions
five-0.004
ver-0.004
seek-0.004
asts-0.004
glass-0.004
Token open
Token ID1280
Feature activation-1.10e-03

Pos logit contributions
 L+0.007
 Lily+0.007
 me+0.006
 my+0.006
 Sara+0.006
Neg logit contributions
five-0.006
ative-0.006
ival-0.006
tted-0.005
glass-0.005
Token.
Token ID13
Feature activation+8.34e-04

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 Mitt+0.001
 my+0.001
Neg logit contributions
five-0.002
glass-0.002
seek-0.002
ver-0.001
 voyage-0.001
Token<|endoftext|>
Token ID50256
Feature activation+8.28e-03

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
glass+0.001
ival+0.001
Neg logit contributions
 me-0.001
 L-0.001
 Lily-0.001
 swe-0.001
 my-0.001
TokenTom
Token ID13787
Feature activation-3.51e-03

Pos logit contributions
ver+0.016
five+0.015
seek+0.015
ival+0.015
ative+0.015
Neg logit contributions
 L-0.010
 me-0.010
 Lily-0.009
 swe-0.008
 my-0.008
Token was
Token ID373
Feature activation+3.42e-03

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 Sara+0.002
 my+0.002
Neg logit contributions
ival-0.008
ver-0.008
seek-0.008
five-0.008
glass-0.008
Token a
Token ID257
Feature activation-3.96e-03

Pos logit contributions
ver+0.007
five+0.007
seek+0.006
ival+0.006
glass+0.006
Neg logit contributions
 L-0.004
 me-0.004
 Lily-0.004
 my-0.003
 Sara-0.003
Token the
Token ID262
Feature activation-2.88e-04

Pos logit contributions
 me+0.004
 swe+0.004
 L+0.003
 Lily+0.003
 mess+0.003
Neg logit contributions
ver-0.007
five-0.007
growth-0.007
asts-0.007
glass-0.007
Token sky
Token ID6766
Feature activation+3.85e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 Mitt+0.000
 Sara+0.000
 swe+0.000
Neg logit contributions
asts-0.000
ane-0.000
glass-0.000
ver-0.000
 voyage-0.000
Token.
Token ID13
Feature activation+2.83e-03

Pos logit contributions
ver+0.008
five+0.008
seek+0.007
glass+0.007
asts+0.007
Neg logit contributions
 L-0.004
 me-0.004
 Lily-0.004
 Mitt-0.003
 my-0.003
Token
Token ID198
Feature activation+3.37e-03

Pos logit contributions
ver+0.005
five+0.004
glass+0.004
ival+0.004
seek+0.004
Neg logit contributions
 me-0.004
 L-0.004
 swe-0.004
 my-0.004
 Lily-0.004
Token
Token ID198
Feature activation+5.53e-03

Pos logit contributions
ver+0.005
five+0.005
ival+0.005
asts+0.005
glass+0.005
Neg logit contributions
 L-0.005
 me-0.005
 swe-0.005
 Lily-0.005
 honey-0.004
Token"
Token ID1
Feature activation-1.39e-03

Pos logit contributions
ver+0.010
five+0.010
seek+0.009
ival+0.009
glass+0.009
Neg logit contributions
 L-0.008
 me-0.007
 Lily-0.007
 my-0.006
 swe-0.006
TokenIt
Token ID1026
Feature activation+4.71e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.001
 swe+0.001
Neg logit contributions
ver-0.003
five-0.003
seek-0.003
ival-0.003
glass-0.003
Token's
Token ID338
Feature activation-6.41e-04

Pos logit contributions
ver+0.008
five+0.007
ative+0.007
glass+0.007
ane+0.007
Neg logit contributions
 L-0.007
 Lily-0.006
 Mitt-0.006
 Sara-0.006
 me-0.005
Token so
Token ID523
Feature activation+1.18e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 Mitt+0.001
 pins+0.001
 swe+0.001
Neg logit contributions
asts-0.001
ver-0.001
 voyage-0.001
 astronaut-0.001
adows-0.001
Token noisy
Token ID31210
Feature activation+7.52e-03

Pos logit contributions
ver+0.002
five+0.002
asts+0.002
 voyage+0.002
ative+0.002
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 Mitt-0.001
 swe-0.001
Token and
Token ID290
Feature activation-1.20e-03

Pos logit contributions
five+0.015
ver+0.014
seek+0.014
glass+0.014
ative+0.014
Neg logit contributions
 L-0.008
 Lily-0.007
 swe-0.007
 Mitt-0.007
 me-0.007
Token morning
Token ID3329
Feature activation+2.21e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 my+0.002
 Sara+0.002
Neg logit contributions
ver-0.002
five-0.002
seek-0.002
ival-0.002
glass-0.002
Token,
Token ID11
Feature activation+1.49e-03

Pos logit contributions
ver+0.005
seek+0.005
five+0.005
ival+0.005
glass+0.004
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.002
 my-0.002
 Sara-0.002
Token the
Token ID262
Feature activation-2.86e-04

Pos logit contributions
ver+0.003
five+0.003
asts+0.003
glass+0.003
ival+0.003
Neg logit contributions
 L-0.001
 me-0.001
 swe-0.001
 Lily-0.001
 mess-0.001
Token shrimp
Token ID30060
Feature activation+1.59e-03

Pos logit contributions
 L+0.000
 Lily+0.000
 me+0.000
 my+0.000
 swe+0.000
Neg logit contributions
five-0.000
ver-0.000
ival-0.000
ative-0.000
 voyage-0.000
Token filled
Token ID5901
Feature activation+4.14e-03

Pos logit contributions
five+0.004
seek+0.004
person+0.004
ival+0.004
ector+0.004
Neg logit contributions
 me-0.001
 p-0.001
 Lily-0.001
 L-0.001
 pr-0.001
Token the
Token ID262
Feature activation+6.59e-04

Pos logit contributions
five+0.007
zed+0.007
seek+0.007
ver+0.007
glass+0.007
Neg logit contributions
 L-0.005
 Lily-0.004
 me-0.004
 swe-0.004
 umb-0.004
Token pond
Token ID16723
Feature activation-5.46e-04

Pos logit contributions
five+0.001
ver+0.001
seek+0.001
ival+0.001
ative+0.001
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 Mitt-0.001
 swe-0.001
Token with
Token ID351
Feature activation-3.76e-03

Pos logit contributions
 me+0.000
 Lily+0.000
 L+0.000
 p+0.000
 my+0.000
Neg logit contributions
ver-0.001
ival-0.001
five-0.001
seek-0.001
glass-0.001
Token cheer
Token ID14042
Feature activation+2.56e-04

Pos logit contributions
 L+0.004
 Lily+0.004
 me+0.004
 Mitt+0.003
 swe+0.003
Neg logit contributions
five-0.007
asts-0.007
 voyage-0.007
glass-0.007
ver-0.007
Token.
Token ID13
Feature activation-1.39e-03

Pos logit contributions
ver+0.000
five+0.000
seek+0.000
ative+0.000
glass+0.000
Neg logit contributions
 L-0.000
 me-0.000
 Lily-0.000
 swe-0.000
 my-0.000
Token
Token ID198
Feature activation+4.63e-03

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
 my+0.001
 Sara+0.001
Neg logit contributions
ver-0.003
five-0.003
seek-0.002
ival-0.002
glass-0.002
Token cry
Token ID3960
Feature activation-3.94e-03

Pos logit contributions
 L+0.005
 me+0.004
 Lily+0.004
 swe+0.004
 my+0.004
Neg logit contributions
five-0.006
ver-0.006
seek-0.005
glass-0.005
asts-0.005
Token.
Token ID13
Feature activation-9.40e-04

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.002
 swe+0.002
Neg logit contributions
ver-0.009
five-0.009
seek-0.009
ative-0.009
glass-0.009
Token They
Token ID1119
Feature activation-5.16e-03

Pos logit contributions
 me+0.001
 L+0.001
 Lily+0.001
 swe+0.001
 my+0.001
Neg logit contributions
ver-0.002
five-0.002
seek-0.001
glass-0.001
ative-0.001
Token feel
Token ID1254
Feature activation+3.00e-03

Pos logit contributions
 L+0.008
 Lily+0.008
 me+0.007
 my+0.007
 swe+0.007
Neg logit contributions
ver-0.008
five-0.008
seek-0.008
glass-0.007
ival-0.007
Token dizz
Token ID29527
Feature activation-2.52e-03

Pos logit contributions
ver+0.007
seek+0.007
five+0.007
ival+0.007
glass+0.006
Neg logit contributions
 Lily-0.003
 L-0.003
 me-0.003
 my-0.002
 Sara-0.002
Tokeny
Token ID88
Feature activation-3.21e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 swe+0.003
 my+0.003
Neg logit contributions
ver-0.005
five-0.004
ative-0.004
seek-0.004
glass-0.004
Token from
Token ID422
Feature activation-8.34e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.002
 swe+0.002
Neg logit contributions
ver-0.007
five-0.007
seek-0.006
glass-0.006
ative-0.006
Token running
Token ID2491
Feature activation-2.86e-03

Pos logit contributions
 L+0.010
 me+0.009
 swe+0.009
 Lily+0.008
L+0.008
Neg logit contributions
five-0.014
 voyage-0.013
 patrol-0.013
zed-0.013
asts-0.013
Token so
Token ID523
Feature activation+9.17e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.003
 my+0.003
 Mitt+0.003
Neg logit contributions
ver-0.006
five-0.005
seek-0.005
ative-0.005
tted-0.005
Token much
Token ID881
Feature activation+8.33e-04

Pos logit contributions
ver+0.019
five+0.018
seek+0.018
ival+0.018
glass+0.018
Neg logit contributions
 L-0.010
 Lily-0.010
 me-0.010
 Sara-0.008
 my-0.008
Token.
Token ID13
Feature activation-2.54e-03

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
glass+0.002
asts+0.002
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 my-0.001
 swe-0.000
Token curiosity
Token ID20136
Feature activation-2.12e-04

Pos logit contributions
ver+0.010
five+0.009
seek+0.009
ival+0.009
glass+0.009
Neg logit contributions
 L-0.007
 me-0.006
 Lily-0.006
 my-0.005
 swe-0.005
Token was
Token ID373
Feature activation-1.12e-02

Pos logit contributions
 L+0.000
 me+0.000
 Lily+0.000
 my+0.000
 swe+0.000
Neg logit contributions
ver-0.000
five-0.000
seek-0.000
ival-0.000
glass-0.000
Token a
Token ID257
Feature activation-2.80e-03

Pos logit contributions
 L+0.016
 Lily+0.014
 me+0.014
 Sara+0.012
 Mitt+0.012
Neg logit contributions
ver-0.020
ative-0.019
zed-0.018
five-0.018
seek-0.018
Token good
Token ID922
Feature activation-4.34e-03

Pos logit contributions
 L+0.004
 Lily+0.004
 me+0.003
 Sara+0.003
 my+0.003
Neg logit contributions
ver-0.005
five-0.005
ival-0.005
ative-0.005
seek-0.005
Token thing
Token ID1517
Feature activation-2.06e-03

Pos logit contributions
 L+0.007
 Lily+0.006
 me+0.006
 my+0.005
 swe+0.005
Neg logit contributions
ver-0.007
five-0.007
seek-0.007
ival-0.007
glass-0.007
Token but
Token ID475
Feature activation-3.70e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.002
 my+0.002
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
ative-0.004
glass-0.004
Token it
Token ID340
Feature activation+4.48e-04

Pos logit contributions
 L+0.004
 Lily+0.004
 me+0.004
 my+0.003
 you+0.003
Neg logit contributions
five-0.008
seek-0.008
ver-0.008
ival-0.008
glass-0.008
Token's
Token ID338
Feature activation-6.99e-03

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ival+0.001
glass+0.001
Neg logit contributions
 L-0.000
 me-0.000
 Lily-0.000
 my-0.000
 swe-0.000
Token important
Token ID1593
Feature activation+5.18e-04

Pos logit contributions
 L+0.011
 Lily+0.010
 me+0.010
 Sara+0.009
 Mitt+0.008
Neg logit contributions
ver-0.012
ative-0.011
five-0.011
seek-0.011
zed-0.010
Token to
Token ID284
Feature activation-7.58e-03

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ative+0.001
glass+0.001
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 my-0.000
 swe-0.000
Token always
Token ID1464
Feature activation+4.68e-03

Pos logit contributions
 L+0.010
 me+0.010
 Lily+0.010
 my+0.008
 swe+0.008
Neg logit contributions
ver-0.014
five-0.014
seek-0.013
ival-0.013
glass-0.013
Token™
Token ID8151
Feature activation+3.43e-03

Pos logit contributions
ver+0.001
five+0.001
seek+0.001
ival+0.001
glass+0.001
Neg logit contributions
 L-0.001
 me-0.001
 Lily-0.001
 swe-0.001
 my-0.001
Tokens
Token ID82
Feature activation-5.83e-03

Pos logit contributions
ver+0.005
ival+0.004
five+0.004
he+0.004
ane+0.004
Neg logit contributions
 me-0.006
 scar-0.006
 L-0.006
 swe-0.006
 honey-0.006
Token just
Token ID655
Feature activation-8.54e-03

Pos logit contributions
 L+0.009
 Lily+0.008
 me+0.008
 swe+0.008
 Mitt+0.008
Neg logit contributions
ver-0.010
five-0.009
 believe-0.008
asts-0.008
ival-0.008
Token try
Token ID1949
Feature activation-3.36e-03

Pos logit contributions
 L+0.014
 Lily+0.014
 me+0.013
 swe+0.013
 Mitt+0.013
Neg logit contributions
five-0.012
ver-0.012
 believe-0.011
 voyage-0.011
asts-0.011
Token both
Token ID1111
Feature activation-1.71e-03

Pos logit contributions
 L+0.004
 Lily+0.004
Oh+0.004
 me+0.004
 trash+0.003
Neg logit contributions
ver-0.006
five-0.006
seek-0.005
zed-0.005
ector-0.005
Token ways
Token ID2842
Feature activation-7.80e-03

Pos logit contributions
 L+0.002
 Lily+0.002
 Mitt+0.002
 me+0.002
 swe+0.002
Neg logit contributions
five-0.003
ver-0.003
seek-0.003
ival-0.003
ative-0.003
Token and
Token ID290
Feature activation-4.87e-04

Pos logit contributions
 L+0.010
 Lily+0.010
 Mitt+0.009
 me+0.009
 swe+0.009
Neg logit contributions
ver-0.014
five-0.014
seek-0.013
 believe-0.013
ative-0.013
Token then
Token ID788
Feature activation-5.94e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.001
 swe+0.001
 honey+0.001
Neg logit contributions
ver-0.001
five-0.001
 believe-0.001
seek-0.001
 voyage-0.001
Token we
Token ID356
Feature activation-3.64e-03

Pos logit contributions
 swollen+0.006
 mess+0.006
 powder+0.006
 L+0.006
ries+0.006
Neg logit contributions
ver-0.011
five-0.010
 voyage-0.010
ival-0.010
 believe-0.010
Token can
Token ID460
Feature activation+5.60e-03

Pos logit contributions
 L+0.005
 Lily+0.005
 me+0.004
 swe+0.004
 my+0.004
Neg logit contributions
ver-0.007
five-0.006
ival-0.006
tted-0.006
 voyage-0.006
Token decide
Token ID5409
Feature activation-5.75e-03

Pos logit contributions
ver+0.009
five+0.009
seek+0.009
ival+0.009
asts+0.009
Neg logit contributions
 L-0.008
 Lily-0.008
 me-0.008
 swe-0.007
 my-0.007
Token what
Token ID644
Feature activation+2.11e-03

Pos logit contributions
 swe+0.007
 L+0.006
 me+0.006
 umb+0.006
 swollen+0.006
Neg logit contributions
five-0.013
ival-0.012
ver-0.012
 believe-0.012
he-0.011
Token is
Token ID318
Feature activation+5.85e-03

Pos logit contributions
ver+0.004
five+0.004
seek+0.004
glass+0.004
ival+0.004
Neg logit contributions
 L-0.003
 Lily-0.002
 me-0.002
 my-0.002
 Sara-0.002
Token this
Token ID428
Feature activation+1.06e-02

Pos logit contributions
five+0.012
ver+0.012
glass+0.012
ival+0.012
seek+0.012
Neg logit contributions
 L-0.005
 Lily-0.005
 swe-0.005
 me-0.005
 Mitt-0.004
Token?"
Token ID1701
Feature activation+6.76e-03

Pos logit contributions
ver+0.018
five+0.018
glass+0.017
ative+0.017
ane+0.017
Neg logit contributions
 L-0.015
 umb-0.013
 Mitt-0.013
 Lily-0.013
 me-0.013
Token He
Token ID679
Feature activation+1.07e-03

Pos logit contributions
five+0.010
seek+0.009
ver+0.009
glass+0.009
zed+0.009
Neg logit contributions
 umb-0.011
 swe-0.011
 me-0.010
 scar-0.010
ries-0.009
Token takes
Token ID2753
Feature activation-1.20e-02

Pos logit contributions
five+0.002
ver+0.002
ival+0.001
 exited+0.001
 warrior+0.001
Neg logit contributions
 L-0.001
 Mitt-0.001
 Lily-0.001
Su-0.001
 Su-0.001
Token it
Token ID340
Feature activation+1.91e-03

Pos logit contributions
 L+0.012
 me+0.011
 umb+0.010
 Mitt+0.010
 swe+0.010
Neg logit contributions
five-0.025
ver-0.023
seek-0.022
ival-0.021
zed-0.021
Token out
Token ID503
Feature activation-3.66e-03

Pos logit contributions
ver+0.004
five+0.004
seek+0.004
ival+0.004
asts+0.004
Neg logit contributions
 L-0.002
 Lily-0.002
 me-0.002
 my-0.002
 swe-0.001
Token and
Token ID290
Feature activation-2.20e-03

Pos logit contributions
 me+0.007
 L+0.006
 Lily+0.006
 Mitt+0.006
 swe+0.006
Neg logit contributions
ver-0.005
five-0.004
ative-0.004
glass-0.004
asts-0.004
Token reads
Token ID9743
Feature activation-3.66e-03

Pos logit contributions
 L+0.003
Su+0.003
imi+0.003
 me+0.003
 Mitt+0.003
Neg logit contributions
five-0.004
 exited-0.003
glass-0.003
ver-0.003
 patrol-0.003
Token it
Token ID340
Feature activation+1.25e-03

Pos logit contributions
 L+0.003
 me+0.003
 Mitt+0.003
 Lily+0.003
 swe+0.003
Neg logit contributions
five-0.008
ver-0.007
glass-0.007
ative-0.007
ival-0.007
Token
Token ID198
Feature activation-8.74e-04

Pos logit contributions
 me+0.017
 L+0.017
 Lily+0.016
 swe+0.016
 my+0.015
Neg logit contributions
ver-0.022
five-0.020
seek-0.020
ival-0.020
tted-0.019
Token
Token ID198
Feature activation-2.79e-03

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 swe+0.001
 my+0.001
Neg logit contributions
ver-0.002
five-0.001
seek-0.001
ival-0.001
glass-0.001
TokenEventually
Token ID28724
Feature activation-1.10e-02

Pos logit contributions
 L+0.005
 me+0.004
 Lily+0.004
 my+0.004
 swe+0.004
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
glass-0.004
ival-0.004
Token the
Token ID262
Feature activation+3.37e-03

Pos logit contributions
 L+0.014
 me+0.014
 Lily+0.013
 swe+0.012
 my+0.012
Neg logit contributions
ver-0.019
five-0.018
zed-0.018
ival-0.018
seek-0.017
Token sun
Token ID4252
Feature activation-7.28e-04

Pos logit contributions
five+0.005
ver+0.004
seek+0.004
ival+0.004
ative+0.004
Neg logit contributions
 L-0.006
 Lily-0.006
 me-0.006
 swe-0.005
 Mitt-0.005
Token started
Token ID2067
Feature activation-4.25e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.001
 my+0.001
 Mitt+0.001
Neg logit contributions
ver-0.001
five-0.001
glass-0.001
seek-0.001
ival-0.001
Token to
Token ID284
Feature activation+9.12e-03

Pos logit contributions
 L+0.006
 Lily+0.006
 me+0.005
 Sara+0.005
 my+0.005
Neg logit contributions
ver-0.007
ative-0.007
seek-0.006
 voyage-0.006
asts-0.006
Token set
Token ID900
Feature activation+6.81e-04

Pos logit contributions
ver+0.015
five+0.014
asts+0.014
 voyage+0.013
seek+0.013
Neg logit contributions
 L-0.014
 Lily-0.013
 me-0.012
 my-0.011
 Mitt-0.011
Token and
Token ID290
Feature activation-6.26e-03

Pos logit contributions
ver+0.001
ative+0.001
he+0.001
tted+0.001
 voyage+0.001
Neg logit contributions
 L-0.001
 Lily-0.001
 me-0.001
 my-0.001
Oh-0.001
Token Lucy
Token ID22162
Feature activation-2.35e-03

Pos logit contributions
 L+0.008
 me+0.007
imi+0.007
 poop+0.007
 pins+0.007
Neg logit contributions
ver-0.010
 exited-0.009
five-0.009
zed-0.009
ival-0.009
Token knew
Token ID2993
Feature activation-5.39e-03

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 my+0.003
 swe+0.003
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
zed-0.004
 exited-0.004
Token with
Token ID351
Feature activation-7.68e-04

Pos logit contributions
 L+0.010
 Lily+0.010
 me+0.010
 my+0.008
 Mitt+0.007
Neg logit contributions
ver-0.018
five-0.017
seek-0.017
ival-0.016
asts-0.016
Token her
Token ID607
Feature activation+5.51e-03

Pos logit contributions
 L+0.001
 me+0.001
 swe+0.001
 Lily+0.001
 my+0.001
Neg logit contributions
seek-0.001
ver-0.001
five-0.001
glass-0.001
ative-0.001
Token toys
Token ID14958
Feature activation-2.39e-03

Pos logit contributions
ver+0.010
glass+0.009
five+0.009
seek+0.009
ival+0.009
Neg logit contributions
 L-0.008
 me-0.007
 Lily-0.007
 Sara-0.006
 my-0.006
Token when
Token ID618
Feature activation-6.45e-03

Pos logit contributions
 L+0.003
 me+0.003
 Lily+0.002
 p+0.002
 Mitt+0.002
Neg logit contributions
five-0.005
ver-0.005
ival-0.005
seek-0.005
asts-0.005
Token she
Token ID673
Feature activation-2.33e-03

Pos logit contributions
 L+0.008
 me+0.008
 swe+0.007
 mess+0.007
 my+0.007
Neg logit contributions
ver-0.011
seek-0.011
zed-0.010
 patrol-0.010
 exited-0.010
Token accidentally
Token ID14716
Feature activation-3.19e-02

Pos logit contributions
 L+0.003
 Lily+0.003
 me+0.003
 swe+0.003
 my+0.003
Neg logit contributions
ver-0.004
glass-0.003
ival-0.003
seek-0.003
asts-0.003
Token knocked
Token ID13642
Feature activation-1.50e-02

Pos logit contributions
 L+0.049
 me+0.048
 swe+0.046
 Lily+0.045
 my+0.043
Neg logit contributions
zed-0.045
ver-0.044
seek-0.044
five-0.043
glass-0.041
Token over
Token ID625
Feature activation+1.05e-02

Pos logit contributions
 swe+0.027
 MacDonald+0.026
 scar+0.025
 coll+0.024
 tul+0.024
Neg logit contributions
he-0.017
five-0.015
hed-0.015
on-0.013
ane-0.012
Token a
Token ID257
Feature activation-1.09e-02

Pos logit contributions
five+0.016
ver+0.015
growth+0.015
he+0.015
seek+0.015
Neg logit contributions
 L-0.012
 me-0.012
 mess-0.012
 yells-0.012
 swe-0.012
Token lamp
Token ID20450
Feature activation-3.26e-03

Pos logit contributions
 L+0.016
 Lily+0.013
ries+0.013
 me+0.013
 Jen+0.013
Neg logit contributions
ival-0.017
 warrior-0.016
five-0.016
case-0.015
 rider-0.015
Token.
Token ID13
Feature activation+2.68e-04

Pos logit contributions
 L+0.004
 Lily+0.004
 me+0.004
 swe+0.003
 Mitt+0.003
Neg logit contributions
ver-0.006
five-0.006
he-0.006
seek-0.006
glass-0.006
Token baking
Token ID16871
Feature activation-5.30e-03

Pos logit contributions
ver+0.014
seek+0.013
five+0.013
asts+0.012
zed+0.012
Neg logit contributions
 L-0.011
 Lily-0.010
 me-0.010
 swe-0.009
 my-0.009
Token a
Token ID257
Feature activation-1.58e-03

Pos logit contributions
 L+0.007
 Lily+0.006
 me+0.006
 yells+0.005
 umb+0.005
Neg logit contributions
ver-0.009
seek-0.009
asts-0.009
five-0.009
ative-0.009
Token cake
Token ID12187
Feature activation-9.23e-04

Pos logit contributions
 L+0.002
 Lily+0.002
lasses+0.002
 yourselves+0.002
 Sara+0.002
Neg logit contributions
ival-0.002
asts-0.002
five-0.002
 quest-0.002
case-0.002
Token and
Token ID290
Feature activation-5.28e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 me+0.001
 my+0.001
 swe+0.001
Neg logit contributions
ver-0.002
five-0.002
seek-0.002
glass-0.002
asts-0.002
Token she
Token ID673
Feature activation-7.34e-03

Pos logit contributions
 L+0.006
 swe+0.006
 me+0.006
 Lily+0.006
 umb+0.005
Neg logit contributions
ver-0.009
zed-0.009
five-0.009
glass-0.008
asts-0.008
Token accidentally
Token ID14716
Feature activation-2.85e-02

Pos logit contributions
 L+0.011
 Lily+0.010
 me+0.010
 swe+0.009
 my+0.009
Neg logit contributions
ver-0.012
five-0.011
zed-0.011
glass-0.011
seek-0.011
Token forgot
Token ID16453
Feature activation-1.50e-02

Pos logit contributions
 L+0.045
 Lily+0.042
 me+0.042
 swe+0.040
 my+0.039
Neg logit contributions
zed-0.041
ver-0.040
five-0.040
asts-0.038
seek-0.037
Token to
Token ID284
Feature activation-4.35e-03

Pos logit contributions
 L+0.021
 me+0.018
 Lily+0.018
 scar+0.018
 Lia+0.017
Neg logit contributions
ver-0.023
zed-0.022
ative-0.021
five-0.021
asts-0.021
Token put
Token ID1234
Feature activation-2.43e-03

Pos logit contributions
 L+0.007
 Lily+0.007
 me+0.006
 swe+0.006
 Mitt+0.006
Neg logit contributions
five-0.006
asts-0.006
ver-0.006
zed-0.006
 believe-0.006
Token in
Token ID287
Feature activation-4.42e-03

Pos logit contributions
 L+0.003
imi+0.003
 swe+0.002
 Mitt+0.002
 mistake+0.002
Neg logit contributions
five-0.004
he-0.004
ver-0.004
zed-0.003
asts-0.003
Token the
Token ID262
Feature activation-1.02e-02

Pos logit contributions
ety+0.006
ies+0.006
lasses+0.006
 L+0.006
ounced+0.006
Neg logit contributions
he-0.006
zed-0.006
five-0.005
handed-0.005
asts-0.005
Token\n
Token ID198
Feature activation+1.25e-03

Pos logit contributions
 L+0.002
 me+0.002
 Lily+0.002
 swe+0.002
 my+0.002
Neg logit contributions
ver-0.002
five-0.002
asts-0.002
seek-0.002
ival-0.002
TokenBut
Token ID1537
Feature activation+1.97e-03

Pos logit contributions
ver+0.002
five+0.002
seek+0.002
ival+0.002
glass+0.002
Neg logit contributions
 L-0.002
 me-0.002
 Lily-0.002
 swe-0.001
 my-0.001
Token then
Token ID788
Feature activation-1.39e-03

Pos logit contributions
ver+0.003
five+0.003
glass+0.003
zed+0.003
 exited+0.003
Neg logit contributions
 me-0.003
 swe-0.003
 L-0.003
 mess-0.003
 Lily-0.002
Token,
Token ID11
Feature activation+2.57e-04

Pos logit contributions
 me+0.002
 L+0.002
 swe+0.002
 mess+0.002
 Lily+0.002
Neg logit contributions
ver-0.002
five-0.002
glass-0.002
ival-0.002
asts-0.002
Token Lily
Token ID20037
Feature activation-1.14e-02

Pos logit contributions
ver+0.000
glass+0.000
five+0.000
ival+0.000
 exited+0.000
Neg logit contributions
 me-0.000
 swe-0.000
 swollen-0.000
 mess-0.000
imi-0.000
Token slipped
Token ID18859
Feature activation-2.93e-02

Pos logit contributions
 L+0.017
 Lily+0.016
 me+0.015
 my+0.015
 swe+0.015
Neg logit contributions
ver-0.018
glass-0.017
seek-0.016
five-0.016
 exited-0.016
Token and
Token ID290
Feature activation-5.16e-03

Pos logit contributions
 L+0.041
 Lily+0.039
 me+0.038
 swe+0.035
 my+0.034
Neg logit contributions
ver-0.045
five-0.045
ative-0.043
seek-0.043
zed-0.042
Token fell
Token ID3214
Feature activation-2.03e-02

Pos logit contributions
 me+0.009
 L+0.009
 swe+0.009
 Lily+0.008
lasses+0.008
Neg logit contributions
five-0.006
ver-0.006
zed-0.006
 exited-0.005
growth-0.005
Token into
Token ID656
Feature activation-7.75e-03

Pos logit contributions
 L+0.024
 me+0.023
 Lily+0.022
 my+0.020
 us+0.019
Neg logit contributions
ver-0.036
ative-0.035
seek-0.034
five-0.034
tted-0.033
Token the
Token ID262
Feature activation-6.62e-04

Pos logit contributions
 me+0.008
 swe+0.008
lasses+0.007
ety+0.007
 cho+0.007
Neg logit contributions
growth-0.014
he-0.013
five-0.012
seek-0.012
zed-0.012
Token fountain
Token ID29420
Feature activation+1.69e-03

Pos logit contributions
 L+0.001
 Lily+0.001
 S+0.001
 Liz+0.001
 Jen+0.001
Neg logit contributions
five-0.001
tted-0.001
case-0.001
he-0.001
ugg-0.001
Token day
Token ID1110
Feature activation-1.82e-03

Pos logit contributions
ival+0.013
ver+0.013
seek+0.012
tted+0.012
ibr+0.012
Neg logit contributions
 Lily-0.007
 L-0.006
 me-0.006
 my-0.005
 Sara-0.004
Token,
Token ID11
Feature activation+4.82e-03

Pos logit contributions
 L+0.002
 Lily+0.002
 me+0.002
 my+0.002
 Sara+0.002
Neg logit contributions
ver-0.004
five-0.004
seek-0.004
glass-0.004
ival-0.004
Token her
Token ID607
Feature activation+3.99e-03

Pos logit contributions
ver+0.009
five+0.009
he+0.009
zed+0.009
 exited+0.008
Neg logit contributions
 umb-0.004
 me-0.004
 yells-0.004
 swe-0.004
 us-0.004
Token little
Token ID1310
Feature activation-1.37e-02

Pos logit contributions
ver+0.009
five+0.009
seek+0.009
ival+0.009
glass+0.009
Neg logit contributions
 L-0.004
 Lily-0.003
 me-0.003
 my-0.003
 swe-0.002
Token brother
Token ID3956
Feature activation-8.39e-04

Pos logit contributions
 L+0.018
 me+0.017
 Lily+0.016
lasses+0.016
Su+0.016
Neg logit contributions
 astronaut-0.020
 rider-0.020
 warrior-0.019
 protector-0.019
five-0.019
Token accidentally
Token ID14716
Feature activation-2.87e-02

Pos logit contributions
 L+0.001
 me+0.001
 swe+0.001
 scar+0.001
 my+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
glass-0.001
ugg-0.001
Token broke
Token ID6265
Feature activation-1.26e-02

Pos logit contributions
 L+0.042
 me+0.040
 Lily+0.039
 swe+0.035
 my+0.034
Neg logit contributions
ver-0.047
five-0.046
seek-0.045
zed-0.044
glass-0.044
Token it
Token ID340
Feature activation-6.52e-03

Pos logit contributions
 swe+0.014
 MacDonald+0.014
 yells+0.013
 spat+0.013
lasses+0.013
Neg logit contributions
growth-0.016
ver-0.016
ative-0.016
he-0.016
 astronaut-0.015
Token.
Token ID13
Feature activation-1.01e-03

Pos logit contributions
 L+0.008
 Lily+0.007
 me+0.007
 Sara+0.006
 swe+0.006
Neg logit contributions
ver-0.013
seek-0.012
five-0.012
ative-0.012
ival-0.012
Token Lily
Token ID20037
Feature activation+3.80e-04

Pos logit contributions
 me+0.001
 L+0.001
 umb+0.001
 scar+0.001
 swe+0.001
Neg logit contributions
five-0.002
ver-0.002
seek-0.002
tted-0.002
he-0.002
Token was
Token ID373
Feature activation+2.79e-03

Pos logit contributions
ver+0.001
seek+0.001
five+0.001
zed+0.001
glass+0.001
Neg logit contributions
 L-0.000
 Lily-0.000
 me-0.000
 swe-0.000
 Sara-0.000
Token.
Token ID13
Feature activation-9.94e-04

Pos logit contributions
 L+0.001
 me+0.001
 Lily+0.001
 my+0.001
 Sara+0.001
Neg logit contributions
ver-0.001
five-0.001
seek-0.001
ative-0.001
ival-0.001
Token But
Token ID887
Feature activation-3.46e-03

Pos logit contributions
 me+0.002
 L+0.001
ries+0.001
 us+0.001
 umb+0.001
Neg logit contributions
seek-0.001
five-0.001
zed-0.001
ver-0.001
ugg-0.001
Token then
Token ID788
Feature activation-7.43e-03

Pos logit contributions
 L+0.005
 me+0.005
 swe+0.004
 umb+0.004
 p+0.004
Neg logit contributions
ver-0.005
five-0.005
zed-0.005
seek-0.005
glass-0.005
Token,
Token ID11
Feature activation+1.17e-03

Pos logit contributions
 L+0.009
 me+0.009
 Lily+0.008
 swe+0.008
 my+0.008
Neg logit contributions
ver-0.013
seek-0.012
five-0.012
zed-0.012
glass-0.012
Token Max
Token ID5436
Feature activation-8.56e-03

Pos logit contributions
ver+0.002
glass+0.002
zed+0.002
five+0.002
seek+0.002
Neg logit contributions
 me-0.001
 L-0.001
 swe-0.001
 umb-0.001
 us-0.001
Token accidentally
Token ID14716
Feature activation-2.84e-02

Pos logit contributions
 L+0.014
 Lily+0.013
 my+0.012
 me+0.012
 Sara+0.012
Neg logit contributions
ver-0.012
seek-0.012
glass-0.012
zed-0.011
 exited-0.011
Token marked
Token ID7498
Feature activation+1.36e-03

Pos logit contributions
 L+0.043
 me+0.041
 Lily+0.040
 swe+0.038
 my+0.037
Neg logit contributions
zed-0.041
ver-0.041
seek-0.041
five-0.040
glass-0.039
Token Lily
Token ID20037
Feature activation-1.16e-02

Pos logit contributions
ver+0.002
he+0.002
ative+0.002
seek+0.002
zed+0.002
Neg logit contributions
 umb-0.002
 swe-0.002
 me-0.002
 L-0.002
 scar-0.002
Token's
Token ID338
Feature activation-3.21e-03

Pos logit contributions
 L+0.018
 Lily+0.017
 me+0.016
 Sara+0.015
 swe+0.015
Neg logit contributions
ver-0.018
seek-0.018
five-0.017
zed-0.017
oved-0.017
Token favorite
Token ID4004
Feature activation-2.48e-03

Pos logit contributions
 L+0.005
 Lily+0.004
 me+0.004
 cho+0.004
 us+0.004
Neg logit contributions
five-0.005
seek-0.005
asts-0.004
handed-0.004
oved-0.004
Token shirt
Token ID10147
Feature activation+9.21e-04

Pos logit contributions
 L+0.004
 cho+0.003
 umb+0.003
 Mitt+0.003
 Dot+0.003
Neg logit contributions
five-0.004
seek-0.003
zed-0.003
asts-0.003
oved-0.003