TOP ACTIVATIONS
MAX = 0.079

he could . When at last he opened it all the
she raced around . At last , her Mom said it
Fin and Carl . At last , they found Fin 's
fit and fast . At last , it won a race
he had forgotten . At last , one mechanic had an
, nothing seemed to come up . She was frustrated and
destination all day . At last , evening came and m
log for Dad . At last , Ben found the perfect
could stop her . At last , Ruby took the fifth
find her ball . At last , they found the ball
Then , at last , Ros ie was able
something scared him . At last , the tiger saw a
spot . At last , Tom found Lily .
again and again . At last , the duck pushed the
came in through . At last , they found the right
by the pond . At last , they found many carrots
thicker . Until , at last , she saw a clearing
the best one . At last he chose one with big
, no matter what you do ! Doing something with a
her new powers . At last , she found the answers

MIN ACTIVATIONS
MIN = -0.119

who had a pet cat named Mitt ens . One day
he had a sausage dog named Max . Max was Tim
and had a toy pony named Spark les . One day
a favorite t eddy bear named Mark . She loved Mark
She had a pet dog named Spot who she loved very
day , a chocolate - col oured whip named C heddar
She had a brown dog named Max . One day ,
She had a rag doll named Daisy . Lucy loved Daisy
She had a pet cat named Mitt ens . Mitt ens
She had a gray cat named Mitt ens . One day
She had a big dog named Max who was very hairy
She had a pet dog named Max . One day ,
She had a pet dog named Max . Max was her
had a t eddy bear named Teddy , who was her
She had a pet cat named Mitt ens . One day
had a t eddy bear named Teddy . One day ,
. She had a kitten named Mitt ens . Mitt ens
She had a pet dog named Max . One day ,
a favorite t eddy bear named Brown ie . One day
She had a pet dog named Max . Max loved to

INTERVAL 0.030 - 0.079
CONTAINS 5.689%

sneak around when a human stopped him .
not many leaves left on the tree . Most of them
“ You ’re welcome ! ”
They look at each other and nod again ,
deal and he soon settled into his new home ! Now

INTERVAL -0.020 - 0.030
CONTAINS 78.792%

ground . The man was safe .
cute and colorful . The baby talked to his mom my
hatched . Days passed and the egg finally
Then , they heard a loud noise . It was
was happy . She loved to pretend . <|endoftext|> Ben and

INTERVAL -0.069 - -0.020
CONTAINS 15.347%

, changed into their clothes and grabbed their telephone .
, fluffy dog . He w agged his tail and l
The end . <|endoftext|> Once upon a time , there was
to Sue . " They are spark les !"
a time , there was a little boy named Tim my

INTERVAL -0.119 - -0.069
CONTAINS 0.172%

friend , a small dog named Max , came over to
there was a little boy named Jack who wanted to go
ud met a little girl named Lily .
many cars of different colors and sizes . He likes to
play with her dolls and t eddy bears . One day
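
A quick consistency check on the four INTERVAL / CONTAINS headers above, as a minimal sketch. It assumes that CONTAINS is the percentage of sampled token activations falling inside each interval, which the export does not state. The names `bins`, `bins_are_contiguous`, and `total_share` are illustrative and not part of the export.

```python
import math

# Interval bounds and CONTAINS percentages copied from the headers above,
# ordered from the lowest interval to the highest.
bins = [
    (-0.119, -0.069, 0.172),
    (-0.069, -0.020, 15.347),
    (-0.020, 0.030, 78.792),
    (0.030, 0.079, 5.689),
]


def bins_are_contiguous(bins):
    """Return True if every bin starts where the previous one ends."""
    return all(math.isclose(a[1], b[0]) for a, b in zip(bins, bins[1:]))


def total_share(bins):
    """Sum the CONTAINS percentages over all bins."""
    return sum(share for _, _, share in bins)


contiguous = bins_are_contiguous(bins)
total = total_share(bins)
span = (bins[0][0], bins[-1][1])  # should match (MIN, MAX) reported above

print(contiguous, round(total, 3), span)
```

If the assumption holds, the bins should tile the range from MIN = -0.119 to MAX = 0.079 with no gaps, and the percentages should add up to 100.
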
Token he
Token ID339
Feature activation-3.23e-03

Pos logit contributions
endant+0.010
aj+0.009
cream+0.009
ifying+0.009
ksh+0.009
Neg logit contributions
,-0.004
 first-0.004
 right-0.004
 up-0.004
 hard-0.004
Token could
Token ID714
Feature activation+1.10e-02

Pos logit contributions
endant+0.008
ifying+0.007
aj+0.007
cream+0.007
ksh+0.007
Neg logit contributions
,-0.004
 first-0.004
 up-0.004
 hard-0.004
 right-0.004
Token.
Token ID13
Feature activation-8.34e-03

Pos logit contributions
,+0.010
 first+0.009
 up+0.009
 hard+0.008
 right+0.008
Neg logit contributions
endant-0.030
cream-0.029
ifying-0.028
aj-0.028
run-0.027
Token When
Token ID1649
Feature activation+3.72e-02

Pos logit contributions
endant+0.020
aj+0.019
ifying+0.018
cream+0.018
lets+0.018
Neg logit contributions
,-0.012
 up-0.010
 right-0.010
 hard-0.009
 first-0.009
Token at
Token ID379
Feature activation+3.44e-02

Pos logit contributions
 first+0.050
 right+0.049
 up+0.047
,+0.047
 hard+0.045
Neg logit contributions
cream-0.077
endant-0.074
ifying-0.072
aj-0.069
ittens-0.067
Token last
Token ID938
Feature activation+7.93e-02

Pos logit contributions
:+0.026
 hard+0.025
,+0.025
 right+0.024
 overcome+0.024
Neg logit contributions
endant-0.081
cream-0.079
eddy-0.078
ifying-0.076
aj-0.074
Token he
Token ID339
Feature activation+2.91e-02

Pos logit contributions
 up+0.051
 opposite+0.048
 first+0.048
 overcome+0.047
 right+0.046
Neg logit contributions
run-0.185
endant-0.183
ittens-0.180
ifying-0.174
paste-0.171
Token opened
Token ID4721
Feature activation+1.03e-02

Pos logit contributions
,+0.037
 first+0.033
 right+0.032
 up+0.032
 hard+0.031
Neg logit contributions
endant-0.071
cream-0.069
aj-0.067
ifying-0.067
ksh-0.062
Token it
Token ID340
Feature activation+1.83e-02

Pos logit contributions
 first+0.007
 hard+0.007
 best+0.007
 right+0.007
 overcome+0.007
Neg logit contributions
endant-0.027
cream-0.026
ifying-0.026
aj-0.025
run-0.024
Token all
Token ID477
Feature activation+2.81e-02

Pos logit contributions
 opposite+0.005
 hard+0.003
 overcome+0.003
 first+0.003
 bottom+0.003
Neg logit contributions
endant-0.055
cream-0.052
aj-0.052
run-0.051
ifying-0.051
Token the
Token ID262
Feature activation+2.05e-02

Pos logit contributions
 hard+0.020
 top+0.020
 first+0.019
 bottom+0.019
,+0.018
Neg logit contributions
endant-0.073
cream-0.070
aj-0.068
run-0.067
ifying-0.067
Token she
Token ID673
Feature activation-2.87e-03

Pos logit contributions
,+0.029
 first+0.026
 up+0.025
 right+0.025
 hard+0.024
Neg logit contributions
endant-0.082
cream-0.079
aj-0.078
ifying-0.077
run-0.073
Token raced
Token ID32897
Feature activation+3.98e-02

Pos logit contributions
endant+0.007
cream+0.007
ifying+0.007
aj+0.007
eddy+0.007
Neg logit contributions
,-0.004
 first-0.003
 right-0.003
 the-0.003
 up-0.003
Token around
Token ID1088
Feature activation+4.58e-02

Pos logit contributions
 first+0.031
,+0.031
 hard+0.029
 right+0.028
 bottom+0.027
Neg logit contributions
endant-0.101
cream-0.101
ifying-0.098
aj-0.096
run-0.096
Token.
Token ID13
Feature activation+1.71e-04

Pos logit contributions
 first+0.031
 best+0.030
 hard+0.028
 right+0.027
 overcome+0.027
Neg logit contributions
endant-0.106
ifying-0.106
run-0.104
cream-0.101
aj-0.100
Token At
Token ID1629
Feature activation+2.86e-02

Pos logit contributions
,+0.000
 first+0.000
 right+0.000
:+0.000
 the+0.000
Neg logit contributions
endant-0.000
cream-0.000
ifying-0.000
aj-0.000
ppers-0.000
Token last
Token ID938
Feature activation+7.83e-02

Pos logit contributions
,+0.017
 overcome+0.016
 best+0.015
 right+0.015
 hard+0.015
Neg logit contributions
endant-0.070
aj-0.067
cream-0.066
ifying-0.066
run-0.063
Token,
Token ID11
Feature activation+2.48e-02

Pos logit contributions
 first+0.040
 up+0.038
 right+0.038
 overcome+0.032
 bottom+0.030
Neg logit contributions
endant-0.198
run-0.197
ifying-0.195
aj-0.193
ittens-0.187
Token her
Token ID607
Feature activation-3.57e-02

Pos logit contributions
,+0.022
 first+0.020
 right+0.019
 up+0.019
 hard+0.019
Neg logit contributions
endant-0.066
cream-0.065
ifying-0.063
aj-0.063
run-0.060
Token Mom
Token ID11254
Feature activation+1.22e-02

Pos logit contributions
endant+0.086
cream+0.085
ifying+0.081
aj+0.080
ksh+0.077
Neg logit contributions
,-0.046
 first-0.042
 up-0.040
 hard-0.040
 right-0.040
Token said
Token ID531
Feature activation-1.65e-02

Pos logit contributions
,+0.018
 first+0.016
 up+0.016
 right+0.015
 hard+0.015
Neg logit contributions
endant-0.028
ifying-0.027
cream-0.026
aj-0.026
ittens-0.024
Token it
Token ID340
Feature activation-5.24e-03

Pos logit contributions
endant+0.044
cream+0.042
ifying+0.042
aj+0.042
run+0.042
Neg logit contributions
 first-0.010
 opposite-0.010
 up-0.009
 bottom-0.009
 hard-0.009
Token Fin
Token ID4463
Feature activation+2.18e-02

Pos logit contributions
 hard+0.009
 first+0.009
 right+0.008
 top+0.008
 bottom+0.008
Neg logit contributions
cream-0.010
run-0.010
lets-0.009
endant-0.009
ittens-0.009
Token and
Token ID290
Feature activation-3.86e-02

Pos logit contributions
 opposite+0.020
 hard+0.019
 bottom+0.019
 top+0.019
 first+0.018
Neg logit contributions
cream-0.051
endant-0.050
lets-0.048
run-0.046
aj-0.045
Token Carl
Token ID8124
Feature activation+3.11e-02

Pos logit contributions
endant+0.093
cream+0.093
ifying+0.088
aj+0.088
run+0.086
Neg logit contributions
,-0.040
 first-0.037
 up-0.037
 right-0.036
 hard-0.036
Token.
Token ID13
Feature activation+8.91e-03

Pos logit contributions
 first+0.026
 right+0.024
 hard+0.024
 top+0.024
,+0.024
Neg logit contributions
endant-0.078
cream-0.075
lets-0.072
aj-0.071
ifying-0.071
Token At
Token ID1629
Feature activation+1.29e-02

Pos logit contributions
,+0.013
 first+0.013
 right+0.012
 hard+0.012
 up+0.012
Neg logit contributions
cream-0.018
endant-0.018
ifying-0.016
run-0.016
aj-0.016
Token last
Token ID938
Feature activation+7.75e-02

Pos logit contributions
 right+0.008
 overcome+0.008
ered+0.008
 best+0.008
,+0.007
Neg logit contributions
endant-0.030
cream-0.029
ifying-0.028
run-0.026
ittens-0.026
Token,
Token ID11
Feature activation+2.88e-02

Pos logit contributions
 up+0.066
 right+0.060
 overcome+0.052
 bottom+0.048
 edge+0.048
Neg logit contributions
ittens-0.164
endant-0.160
run-0.159
ifying-0.156
fy-0.147
Token they
Token ID484
Feature activation+1.99e-02

Pos logit contributions
 right+0.041
:+0.039
 bottom+0.039
,+0.039
 up+0.038
Neg logit contributions
cream-0.047
ifying-0.044
run-0.043
endant-0.042
lets-0.040
Token found
Token ID1043
Feature activation-1.08e-02

Pos logit contributions
,+0.026
 first+0.025
 hard+0.025
 up+0.025
 right+0.025
Neg logit contributions
endant-0.044
cream-0.042
ifying-0.041
aj-0.040
lets-0.038
Token Fin
Token ID4463
Feature activation+1.04e-02

Pos logit contributions
cream+0.022
endant+0.022
ifying+0.020
run+0.019
lets+0.019
Neg logit contributions
,-0.015
 first-0.015
 hard-0.014
 right-0.014
 up-0.014
Token's
Token ID338
Feature activation-1.63e-02

Pos logit contributions
,+0.009
 first+0.008
 hard+0.008
 right+0.008
 top+0.008
Neg logit contributions
endant-0.027
cream-0.027
aj-0.025
ifying-0.025
run-0.024
Token fit
Token ID4197
Feature activation+1.94e-02

Pos logit contributions
,+0.023
 first+0.021
 up+0.020
 right+0.019
:+0.018
Neg logit contributions
endant-0.077
cream-0.076
aj-0.075
ifying-0.074
run-0.072
Token and
Token ID290
Feature activation-1.31e-02

Pos logit contributions
,+0.018
 first+0.017
 hard+0.017
 opposite+0.016
 right+0.016
Neg logit contributions
endant-0.046
aj-0.046
cream-0.045
ifying-0.045
run-0.044
Token fast
Token ID3049
Feature activation+2.01e-02

Pos logit contributions
endant+0.032
cream+0.031
aj+0.030
ifying+0.030
run+0.029
Neg logit contributions
,-0.015
 first-0.013
 right-0.013
 up-0.013
 hard-0.013
Token.
Token ID13
Feature activation+8.77e-03

Pos logit contributions
,+0.022
 first+0.021
 right+0.020
 up+0.020
 hard+0.020
Neg logit contributions
endant-0.049
cream-0.048
aj-0.047
ifying-0.047
run-0.044
Token At
Token ID1629
Feature activation+1.81e-02

Pos logit contributions
,+0.010
 right+0.008
 first+0.008
 hard+0.008
 up+0.008
Neg logit contributions
endant-0.023
cream-0.022
aj-0.022
ifying-0.022
ittens-0.020
Token last
Token ID938
Feature activation+7.70e-02

Pos logit contributions
,+0.011
:+0.010
 right+0.009
 opposite+0.008
 up+0.008
Neg logit contributions
endant-0.044
aj-0.042
cream-0.042
ifying-0.042
run-0.042
Token,
Token ID11
Feature activation+3.42e-02

Pos logit contributions
 up+0.053
 first+0.050
 right+0.048
 overcome+0.046
 outskirts+0.042
Neg logit contributions
run-0.175
ifying-0.170
endant-0.169
ittens-0.168
fy-0.164
Token it
Token ID340
Feature activation+2.46e-02

Pos logit contributions
,+0.027
:+0.026
 first+0.026
 right+0.023
 hard+0.023
Neg logit contributions
endant-0.083
ifying-0.082
cream-0.081
run-0.074
lets-0.074
Token won
Token ID1839
Feature activation+2.97e-02

Pos logit contributions
,+0.029
 first+0.029
 hard+0.028
 opposite+0.028
 right+0.027
Neg logit contributions
endant-0.050
aj-0.050
cream-0.049
run-0.046
ifying-0.046
Token a
Token ID257
Feature activation+1.29e-02

Pos logit contributions
 first+0.017
,+0.017
 hard+0.016
 right+0.015
 up+0.015
Neg logit contributions
endant-0.079
cream-0.078
ifying-0.076
aj-0.075
run-0.075
Token race
Token ID3234
Feature activation+3.14e-02

Pos logit contributions
,+0.014
 first+0.012
 up+0.012
 right+0.012
 hard+0.012
Neg logit contributions
endant-0.032
cream-0.031
ifying-0.031
aj-0.030
run-0.029
Token he
Token ID339
Feature activation-4.31e-03

Pos logit contributions
endant+0.007
cream+0.007
ifying+0.006
aj+0.006
run+0.006
Neg logit contributions
,-0.003
 first-0.003
 up-0.003
 right-0.003
 hard-0.003
Token had
Token ID550
Feature activation+4.39e-03

Pos logit contributions
endant+0.009
cream+0.009
aj+0.009
ifying+0.009
run+0.008
Neg logit contributions
,-0.006
 first-0.006
 right-0.006
 up-0.006
 hard-0.006
Token forgotten
Token ID11564
Feature activation+2.36e-03

Pos logit contributions
,+0.006
 first+0.005
 right+0.005
 up+0.005
 hard+0.005
Neg logit contributions
endant-0.010
cream-0.010
aj-0.009
ifying-0.009
run-0.009
Token.
Token ID13
Feature activation+3.90e-03

Pos logit contributions
,+0.002
 first+0.002
 right+0.002
 up+0.002
 hard+0.002
Neg logit contributions
endant-0.006
cream-0.006
aj-0.006
ifying-0.006
run-0.006
Token At
Token ID1629
Feature activation+2.25e-02

Pos logit contributions
,+0.004
 first+0.004
 right+0.004
 up+0.003
:+0.003
Neg logit contributions
endant-0.010
cream-0.010
ifying-0.010
aj-0.010
ittens-0.009
Token last
Token ID938
Feature activation+7.53e-02

Pos logit contributions
,+0.013
 best+0.011
 hard+0.011
 right+0.010
 overcome+0.010
Neg logit contributions
endant-0.059
cream-0.057
ifying-0.056
aj-0.055
run-0.053
Token,
Token ID11
Feature activation+2.68e-02

Pos logit contributions
 up+0.037
 first+0.034
 best+0.031
 right+0.030
 opposite+0.028
Neg logit contributions
run-0.189
endant-0.188
ifying-0.184
aj-0.173
ittens-0.172
Token one
Token ID530
Feature activation+1.98e-02

Pos logit contributions
,+0.026
 first+0.024
 hard+0.023
 up+0.023
 right+0.023
Neg logit contributions
endant-0.068
cream-0.068
ifying-0.066
aj-0.064
run-0.062
Token mechanic
Token ID21600
Feature activation+1.03e-02

Pos logit contributions
,+0.024
 first+0.022
 right+0.021
 up+0.021
 hard+0.021
Neg logit contributions
endant-0.049
cream-0.047
aj-0.046
ifying-0.046
ksh-0.043
Token had
Token ID550
Feature activation+6.48e-03

Pos logit contributions
,+0.013
 first+0.012
 up+0.012
 right+0.012
 hard+0.011
Neg logit contributions
endant-0.024
cream-0.023
ifying-0.023
aj-0.023
run-0.021
Token an
Token ID281
Feature activation-6.03e-03

Pos logit contributions
,+0.006
 first+0.005
 up+0.005
 right+0.005
 hard+0.005
Neg logit contributions
endant-0.018
cream-0.017
aj-0.017
ifying-0.017
run-0.016
Token,
Token ID11
Feature activation+5.39e-03

Pos logit contributions
,+0.003
 up+0.003
 first+0.003
 right+0.002
 hard+0.002
Neg logit contributions
endant-0.018
cream-0.017
aj-0.017
ifying-0.017
run-0.017
Token nothing
Token ID2147
Feature activation+1.31e-02

Pos logit contributions
,+0.005
 first+0.004
 up+0.004
 right+0.004
 hard+0.004
Neg logit contributions
endant-0.015
cream-0.014
aj-0.014
ifying-0.014
run-0.013
Token seemed
Token ID3947
Feature activation+2.26e-02

Pos logit contributions
,+0.017
 first+0.016
 right+0.016
 up+0.015
 hard+0.015
Neg logit contributions
endant-0.030
cream-0.029
aj-0.029
ifying-0.029
run-0.027
Token to
Token ID284
Feature activation+4.67e-03

Pos logit contributions
,+0.022
 first+0.020
 up+0.019
 right+0.019
 hard+0.018
Neg logit contributions
endant-0.058
cream-0.056
ifying-0.055
aj-0.054
run-0.053
Token come
Token ID1282
Feature activation+8.20e-03

Pos logit contributions
,+0.004
 up+0.004
 first+0.004
 hard+0.004
 right+0.004
Neg logit contributions
endant-0.012
cream-0.011
ifying-0.011
aj-0.011
run-0.011
Token up
Token ID510
Feature activation+7.49e-02

Pos logit contributions
,+0.007
 first+0.007
 hard+0.006
 right+0.006
 up+0.006
Neg logit contributions
endant-0.021
ifying-0.021
cream-0.020
aj-0.020
run-0.019
Token.
Token ID13
Feature activation+7.72e-03

Pos logit contributions
 first+0.066
,+0.064
 up+0.064
 hard+0.061
 best+0.060
Neg logit contributions
ifying-0.176
endant-0.174
cream-0.168
run-0.167
aj-0.167
Token She
Token ID1375
Feature activation+3.44e-03

Pos logit contributions
,+0.008
 first+0.007
 right+0.007
 up+0.007
 hard+0.007
Neg logit contributions
endant-0.020
cream-0.020
aj-0.019
ifying-0.019
run-0.018
Token was
Token ID373
Feature activation+2.39e-02

Pos logit contributions
,+0.004
 first+0.004
 right+0.004
 up+0.004
 hard+0.004
Neg logit contributions
endant-0.008
cream-0.008
aj-0.008
ifying-0.008
run-0.007
Token frustrated
Token ID14718
Feature activation+1.51e-02

Pos logit contributions
,+0.024
 first+0.021
 up+0.020
 hard+0.019
 right+0.019
Neg logit contributions
endant-0.059
cream-0.057
aj-0.056
ifying-0.056
run-0.056
Token and
Token ID290
Feature activation-2.47e-03

Pos logit contributions
 first+0.008
 up+0.007
 right+0.006
 top+0.006
 bottom+0.006
Neg logit contributions
endant-0.043
cream-0.042
ifying-0.041
run-0.041
aj-0.041
Token destination
Token ID10965
Feature activation-9.06e-03

Pos logit contributions
endant+0.013
cream+0.013
aj+0.013
ifying+0.013
run+0.012
Neg logit contributions
,-0.007
 first-0.007
 right-0.006
 up-0.006
 hard-0.006
Token all
Token ID477
Feature activation+1.61e-02

Pos logit contributions
endant+0.025
cream+0.024
aj+0.024
ifying+0.024
run+0.022
Neg logit contributions
,-0.008
 first-0.007
 right-0.007
 up-0.007
 hard-0.007
Token day
Token ID1110
Feature activation+3.33e-02

Pos logit contributions
,+0.013
 first+0.012
 right+0.011
 hard+0.011
 top+0.011
Neg logit contributions
cream-0.042
endant-0.042
ifying-0.041
aj-0.040
run-0.039
Token.
Token ID13
Feature activation+4.40e-03

Pos logit contributions
,+0.029
 first+0.028
 up+0.027
 right+0.026
 hard+0.026
Neg logit contributions
endant-0.087
cream-0.085
aj-0.083
ifying-0.083
run-0.079
Token At
Token ID1629
Feature activation+1.60e-02

Pos logit contributions
,+0.006
 first+0.005
 up+0.005
 right+0.005
 hard+0.005
Neg logit contributions
endant-0.010
cream-0.010
aj-0.009
ifying-0.009
run-0.009
Token last
Token ID938
Feature activation+7.44e-02

Pos logit contributions
,+0.011
 overcome+0.010
 best+0.010
 hard+0.010
 right+0.010
Neg logit contributions
endant-0.039
cream-0.039
ifying-0.038
aj-0.038
pron-0.037
Token,
Token ID11
Feature activation+2.55e-02

Pos logit contributions
 up+0.047
 right+0.039
 first+0.038
 overcome+0.036
 best+0.031
Neg logit contributions
ifying-0.183
run-0.179
endant-0.178
ittens-0.177
pron-0.174
Token evening
Token ID6180
Feature activation+1.77e-02

Pos logit contributions
 hard+0.025
,+0.025
 right+0.024
:+0.024
 first+0.024
Neg logit contributions
cream-0.052
ifying-0.051
endant-0.044
pron-0.044
run-0.044
Token came
Token ID1625
Feature activation-3.53e-03

Pos logit contributions
 hard+0.006
 first+0.006
 up+0.005
 best+0.004
 right+0.004
Neg logit contributions
cream-0.051
ifying-0.050
endant-0.049
aj-0.048
pron-0.047
Token and
Token ID290
Feature activation-1.48e-03

Pos logit contributions
ifying+0.009
endant+0.009
aj+0.009
cream+0.008
eddy+0.008
Neg logit contributions
 first-0.003
 hard-0.003
 opposite-0.003
,-0.002
 right-0.002
Token m
Token ID285
Feature activation-3.35e-02

Pos logit contributions
cream+0.004
endant+0.004
ifying+0.004
aj+0.003
run+0.003
Neg logit contributions
,-0.002
 first-0.001
 hard-0.001
 right-0.001
 up-0.001
Token log
Token ID2604
Feature activation+4.44e-03

Pos logit contributions
endant+0.001
cream+0.001
ifying+0.001
aj+0.001
run+0.001
Neg logit contributions
,-0.000
 first-0.000
 up-0.000
 hard-0.000
 right-0.000
Token for
Token ID329
Feature activation+3.05e-02

Pos logit contributions
,+0.004
 first+0.004
 hard+0.004
 right+0.004
 up+0.004
Neg logit contributions
endant-0.011
cream-0.011
ifying-0.011
aj-0.010
run-0.010
Token Dad
Token ID17415
Feature activation+2.86e-02

Pos logit contributions
,+0.022
 first+0.019
 right+0.018
 up+0.018
 hard+0.017
Neg logit contributions
endant-0.089
cream-0.087
aj-0.085
ifying-0.084
run-0.080
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
,+0.026
 first+0.021
 right+0.020
 up+0.020
 hard+0.018
Neg logit contributions
endant-0.083
cream-0.079
ifying-0.078
aj-0.078
ittens-0.074
Token At
Token ID1629
Feature activation+2.43e-02

Pos logit contributions
,+0.014
 first+0.013
 right+0.013
 up+0.013
 hard+0.012
Neg logit contributions
endant-0.030
cream-0.029
ifying-0.027
aj-0.027
lets-0.026
Token last
Token ID938
Feature activation+7.41e-02

Pos logit contributions
 overcome+0.015
 best+0.015
,+0.013
 right+0.012
 hard+0.012
Neg logit contributions
endant-0.059
cream-0.059
ifying-0.058
aj-0.057
pron-0.055
Token,
Token ID11
Feature activation+2.68e-02

Pos logit contributions
 up+0.050
 right+0.044
 overcome+0.041
 best+0.041
 outskirts+0.039
Neg logit contributions
ifying-0.169
run-0.169
ittens-0.164
endant-0.160
paste-0.155
Token Ben
Token ID3932
Feature activation+4.55e-02

Pos logit contributions
,+0.027
 hard+0.025
 first+0.025
 right+0.024
 up+0.024
Neg logit contributions
cream-0.063
ifying-0.063
endant-0.062
aj-0.061
run-0.057
Token found
Token ID1043
Feature activation-5.71e-03

Pos logit contributions
,+0.056
 first+0.054
 hard+0.053
 right+0.053
 up+0.052
Neg logit contributions
endant-0.103
cream-0.101
aj-0.099
ifying-0.096
run-0.091
Token the
Token ID262
Feature activation+6.55e-03

Pos logit contributions
endant+0.016
cream+0.016
ifying+0.016
aj+0.015
run+0.015
Neg logit contributions
,-0.004
 first-0.003
 hard-0.003
 right-0.003
 up-0.003
Token perfect
Token ID2818
Feature activation+1.16e-02

Pos logit contributions
,+0.009
 up+0.008
 first+0.008
 right+0.008
 hard+0.008
Neg logit contributions
endant-0.014
cream-0.013
aj-0.013
ifying-0.013
run-0.012
Token could
Token ID714
Feature activation+9.83e-03

Pos logit contributions
,+0.011
 first+0.011
 up+0.010
 right+0.010
 top+0.010
Neg logit contributions
endant-0.020
cream-0.019
ifying-0.018
aj-0.018
run-0.018
Token stop
Token ID2245
Feature activation+4.23e-02

Pos logit contributions
,+0.010
 up+0.009
 hard+0.009
 opposite+0.008
 right+0.008
Neg logit contributions
endant-0.023
cream-0.022
run-0.022
ifying-0.021
lets-0.020
Token her
Token ID607
Feature activation-1.20e-02

Pos logit contributions
 first+0.040
 opposite+0.039
,+0.039
 hard+0.038
 top+0.037
Neg logit contributions
endant-0.097
ifying-0.094
aj-0.093
run-0.090
cream-0.089
Token.
Token ID13
Feature activation+1.22e-03

Pos logit contributions
endant+0.032
cream+0.031
aj+0.030
ifying+0.030
run+0.029
Neg logit contributions
,-0.012
 first-0.010
 right-0.010
 up-0.010
 hard-0.010
Token At
Token ID1629
Feature activation+1.54e-02

Pos logit contributions
,+0.001
 first+0.001
 up+0.001
 right+0.001
 hard+0.001
Neg logit contributions
endant-0.003
cream-0.003
aj-0.003
ifying-0.003
run-0.003
Token last
Token ID938
Feature activation+7.35e-02

Pos logit contributions
,+0.010
:+0.009
 right+0.008
 overcome+0.008
 best+0.008
Neg logit contributions
endant-0.037
ifying-0.035
cream-0.034
aj-0.033
jar-0.033
Token,
Token ID11
Feature activation+2.60e-02

Pos logit contributions
 up+0.057
 first+0.047
 right+0.045
 over+0.044
 overcome+0.044
Neg logit contributions
run-0.157
ifying-0.156
endant-0.153
ittens-0.153
paste-0.153
Token Ruby
Token ID10888
Feature activation+2.60e-02

Pos logit contributions
,+0.029
:+0.029
 first+0.029
 right+0.028
 hard+0.027
Neg logit contributions
ifying-0.053
endant-0.051
cream-0.049
paste-0.049
aj-0.048
Token took
Token ID1718
Feature activation+6.09e-03

Pos logit contributions
 first+0.032
 hard+0.031
,+0.031
 up+0.031
 right+0.031
Neg logit contributions
endant-0.055
cream-0.054
aj-0.053
ifying-0.051
lets-0.050
Token the
Token ID262
Feature activation+1.47e-02

Pos logit contributions
 first+0.004
:+0.004
,+0.004
 bottom+0.003
 right+0.003
Neg logit contributions
cream-0.015
endant-0.015
ifying-0.014
run-0.014
paste-0.014
Token fifth
Token ID8150
Feature activation+4.27e-02

Pos logit contributions
,+0.025
 up+0.023
 first+0.023
 right+0.023
 hard+0.022
Neg logit contributions
endant-0.027
cream-0.026
ifying-0.025
aj-0.024
run-0.023
Token find
Token ID1064
Feature activation+7.32e-03

Pos logit contributions
,+0.017
 first+0.015
 right+0.014
 up+0.014
 hard+0.014
Neg logit contributions
endant-0.061
cream-0.060
aj-0.058
ifying-0.058
run-0.055
Token her
Token ID607
Feature activation-2.06e-02

Pos logit contributions
,+0.003
 first+0.003
 right+0.003
 hard+0.003
 up+0.002
Neg logit contributions
endant-0.023
cream-0.022
ifying-0.021
aj-0.021
run-0.021
Token ball
Token ID2613
Feature activation+5.19e-04

Pos logit contributions
cream+0.045
endant+0.045
ifying+0.043
aj+0.043
ksh+0.041
Neg logit contributions
,-0.033
 first-0.030
 up-0.030
 hard-0.029
 right-0.029
Token.
Token ID13
Feature activation+1.73e-02

Pos logit contributions
,+0.000
 first+0.000
 right+0.000
 up+0.000
 hard+0.000
Neg logit contributions
endant-0.001
cream-0.001
aj-0.001
ifying-0.001
run-0.001
Token At
Token ID1629
Feature activation+2.00e-02

Pos logit contributions
,+0.016
 right+0.013
 up+0.013
 the+0.013
 first+0.013
Neg logit contributions
endant-0.049
cream-0.046
aj-0.046
ifying-0.046
pron-0.044
Token last
Token ID938
Feature activation+7.33e-02

Pos logit contributions
 best+0.011
,+0.011
 overcome+0.011
 right+0.011
:+0.010
Neg logit contributions
endant-0.049
cream-0.047
ifying-0.046
aj-0.045
jar-0.045
Token,
Token ID11
Feature activation+2.66e-02

Pos logit contributions
 up+0.057
 right+0.052
 first+0.047
 overcome+0.043
 over+0.043
Neg logit contributions
ittens-0.168
endant-0.161
run-0.161
ifying-0.159
paste-0.151
Token they
Token ID484
Feature activation+2.00e-02

Pos logit contributions
,+0.029
 first+0.027
 right+0.026
 hard+0.026
 up+0.026
Neg logit contributions
endant-0.064
cream-0.063
ifying-0.061
aj-0.060
run-0.057
Token found
Token ID1043
Feature activation-5.30e-03

Pos logit contributions
,+0.026
 first+0.024
 right+0.023
 up+0.023
 hard+0.023
Neg logit contributions
endant-0.047
cream-0.045
aj-0.044
ifying-0.044
run-0.041
Token the
Token ID262
Feature activation+8.99e-03

Pos logit contributions
endant+0.015
cream+0.015
ifying+0.014
aj+0.014
run+0.014
Neg logit contributions
,-0.004
 first-0.003
 right-0.003
 hard-0.003
 up-0.003
Token ball
Token ID2613
Feature activation-1.15e-03

Pos logit contributions
,+0.014
 first+0.014
 right+0.014
 hard+0.013
 up+0.013
Neg logit contributions
cream-0.019
endant-0.019
aj-0.019
ifying-0.018
run-0.017
Token
Token ID198
Feature activation-7.74e-03

Pos logit contributions
,+0.015
 first+0.012
 right+0.012
 up+0.012
 hard+0.011
Neg logit contributions
endant-0.043
cream-0.042
aj-0.041
ifying-0.040
ksh-0.038
Token
Token ID198
Feature activation-1.06e-02

Pos logit contributions
endant+0.021
cream+0.021
ifying+0.020
aj+0.020
ksh+0.019
Neg logit contributions
,-0.007
 first-0.006
 up-0.006
 hard-0.006
 right-0.006
TokenThen
Token ID6423
Feature activation+3.46e-02

Pos logit contributions
endant+0.023
cream+0.022
aj+0.022
ifying+0.021
ksh+0.020
Neg logit contributions
,-0.016
 first-0.015
 hard-0.015
 right-0.015
 up-0.014
Token,
Token ID11
Feature activation+1.35e-02

Pos logit contributions
,+0.030
 first+0.030
 up+0.029
 right+0.029
 hard+0.028
Neg logit contributions
endant-0.088
cream-0.085
ifying-0.084
aj-0.084
lets-0.080
Token at
Token ID379
Feature activation+2.91e-02

Pos logit contributions
,+0.017
 first+0.015
 right+0.015
 up+0.014
 hard+0.014
Neg logit contributions
endant-0.034
cream-0.033
aj-0.033
ifying-0.032
ksh-0.030
Token last
Token ID938
Feature activation+7.33e-02

Pos logit contributions
,+0.021
 up+0.019
 hard+0.019
 right+0.018
 overcome+0.018
Neg logit contributions
endant-0.069
ifying-0.068
cream-0.068
aj-0.065
eddy-0.065
Token,
Token ID11
Feature activation+2.43e-02

Pos logit contributions
 up+0.046
 first+0.043
 right+0.038
 best+0.034
 overcome+0.033
Neg logit contributions
ifying-0.183
endant-0.180
run-0.180
ittens-0.176
pron-0.174
Token Ros
Token ID10018
Feature activation-1.48e-02

Pos logit contributions
,+0.035
 first+0.033
 up+0.032
 right+0.032
 hard+0.032
Neg logit contributions
endant-0.052
cream-0.050
ifying-0.048
aj-0.048
run-0.045
Tokenie
Token ID494
Feature activation+1.95e-02

Pos logit contributions
endant+0.032
cream+0.031
ifying+0.030
aj+0.029
ksh+0.028
Neg logit contributions
,-0.022
 first-0.021
 up-0.020
 right-0.020
 hard-0.019
Token was
Token ID373
Feature activation+2.56e-02

Pos logit contributions
,+0.024
 first+0.022
 right+0.022
 up+0.022
 hard+0.022
Neg logit contributions
endant-0.046
cream-0.045
aj-0.043
ifying-0.043
run-0.041
Token able
Token ID1498
Feature activation+2.93e-02

Pos logit contributions
,+0.033
 first+0.032
 hard+0.031
 up+0.031
 opposite+0.030
Neg logit contributions
endant-0.053
cream-0.052
aj-0.051
ifying-0.051
run-0.050
Token something
Token ID1223
Feature activation-3.23e-03

Pos logit contributions
,+0.018
 up+0.016
 right+0.016
 first+0.016
 hard+0.016
Neg logit contributions
cream-0.031
endant-0.031
ifying-0.030
aj-0.029
run-0.029
Token scared
Token ID12008
Feature activation-4.53e-03

Pos logit contributions
endant+0.007
cream+0.007
run+0.006
ifying+0.006
aj+0.006
Neg logit contributions
 first-0.004
,-0.004
 top-0.004
 up-0.004
 right-0.004
Token him
Token ID683
Feature activation+3.78e-03

Pos logit contributions
endant+0.012
cream+0.011
ifying+0.011
aj+0.011
run+0.011
Neg logit contributions
,-0.004
 first-0.004
 up-0.004
 right-0.004
 hard-0.004
Token.
Token ID13
Feature activation+5.02e-03

Pos logit contributions
,+0.004
 first+0.003
 right+0.003
 up+0.003
 hard+0.003
Neg logit contributions
endant-0.010
cream-0.010
aj-0.010
ifying-0.010
run-0.009
Token At
Token ID1629
Feature activation+1.44e-02

Pos logit contributions
,+0.006
 first+0.005
 right+0.005
 up+0.005
 hard+0.005
Neg logit contributions
endant-0.013
cream-0.012
aj-0.012
ifying-0.012
lets-0.011
Token last
Token ID938
Feature activation+7.32e-02

Pos logit contributions
 best+0.009
,+0.009
 overcome+0.008
ap+0.008
:+0.008
Neg logit contributions
endant-0.033
ifying-0.032
cream-0.032
jar-0.030
ents-0.029
Token,
Token ID11
Feature activation+2.40e-02

Pos logit contributions
 up+0.059
 best+0.055
 right+0.054
 overcome+0.051
 opposite+0.050
Neg logit contributions
run-0.155
ifying-0.146
endant-0.144
ittens-0.136
fy-0.134
Token the
Token ID262
Feature activation+1.10e-02

Pos logit contributions
,+0.027
:+0.025
 first+0.025
 right+0.025
 hard+0.025
Neg logit contributions
cream-0.054
ifying-0.052
endant-0.051
run-0.048
aj-0.048
Token tiger
Token ID26241
Feature activation+1.62e-02

Pos logit contributions
,+0.019
 first+0.018
 right+0.017
 up+0.017
 hard+0.017
Neg logit contributions
endant-0.021
cream-0.020
aj-0.020
ifying-0.020
run-0.018
Token saw
Token ID2497
Feature activation-9.09e-03

Pos logit contributions
,+0.020
 first+0.019
 right+0.018
 hard+0.018
 up+0.018
Neg logit contributions
endant-0.037
cream-0.036
aj-0.034
ifying-0.034
run-0.033
Token a
Token ID257
Feature activation+1.03e-03

Pos logit contributions
endant+0.025
cream+0.025
ifying+0.024
aj+0.024
run+0.023
Neg logit contributions
,-0.007
 first-0.006
 right-0.006
 up-0.006
 hard-0.006
Token spot
Token ID4136
Feature activation-3.96e-03

Pos logit contributions
,+0.005
 first+0.004
 right+0.004
 up+0.004
 hard+0.004
Neg logit contributions
endant-0.007
cream-0.007
aj-0.007
ifying-0.007
run-0.006
Token.
Token ID13
Feature activation+1.79e-02

Pos logit contributions
endant+0.011
cream+0.011
aj+0.011
ifying+0.011
ksh+0.010
Neg logit contributions
,-0.004
 first-0.003
 up-0.003
 right-0.003
 hard-0.003
Token
Token ID198
Feature activation-2.99e-04

Pos logit contributions
,+0.017
 first+0.015
 right+0.015
 up+0.014
 hard+0.014
Neg logit contributions
endant-0.049
cream-0.048
ifying-0.047
aj-0.046
run-0.043
Token
Token ID198
Feature activation-3.63e-04

Pos logit contributions
endant+0.001
cream+0.001
aj+0.001
ifying+0.001
run+0.001
Neg logit contributions
,-0.000
 first-0.000
 right-0.000
 up-0.000
 hard-0.000
TokenAt
Token ID2953
Feature activation+1.70e-02

Pos logit contributions
endant+0.001
cream+0.001
aj+0.001
ifying+0.001
run+0.001
Neg logit contributions
,-0.001
 first-0.001
 right-0.001
 up-0.001
 hard-0.001
Token last
Token ID938
Feature activation+7.29e-02

Pos logit contributions
 best+0.012
,+0.012
 overcome+0.011
 right+0.011
ered+0.010
Neg logit contributions
endant-0.041
cream-0.039
aj-0.038
ifying-0.038
jar-0.038
Token,
Token ID11
Feature activation+2.29e-02

Pos logit contributions
 up+0.047
 right+0.041
 overcome+0.038
 first+0.036
 best+0.035
Neg logit contributions
endant-0.184
ittens-0.178
ifying-0.174
run-0.172
paste-0.170
Token Tom
Token ID4186
Feature activation+4.14e-02

Pos logit contributions
,+0.027
 first+0.025
 hard+0.025
 up+0.025
 right+0.024
Neg logit contributions
endant-0.049
cream-0.048
ifying-0.047
aj-0.045
run-0.043
Token found
Token ID1043
Feature activation-7.54e-03

Pos logit contributions
,+0.050
 first+0.047
 hard+0.045
 up+0.045
 right+0.045
Neg logit contributions
endant-0.097
cream-0.094
aj-0.092
ifying-0.090
run-0.086
Token Lily
Token ID20037
Feature activation+1.14e-02

Pos logit contributions
endant+0.019
cream+0.019
ifying+0.018
aj+0.018
run+0.018
Neg logit contributions
,-0.007
 first-0.006
 hard-0.006
 right-0.006
 up-0.006
Token.
Token ID13
Feature activation+1.72e-02

Pos logit contributions
,+0.010
 first+0.009
 up+0.009
 right+0.009
 hard+0.008
Neg logit contributions
endant-0.032
cream-0.031
ifying-0.030
aj-0.030
ksh-0.029
Token again
Token ID757
Feature activation+4.21e-02

Pos logit contributions
 Fre+0.031
 first+0.030
 bottom+0.030
 best+0.030
ered+0.030
Neg logit contributions
run-0.051
cream-0.047
endant-0.047
aj-0.046
ifying-0.042
Token and
Token ID290
Feature activation-1.20e-02

Pos logit contributions
 up+0.028
 first+0.028
,+0.027
 right+0.026
 hard+0.026
Neg logit contributions
endant-0.111
cream-0.108
run-0.107
aj-0.106
ifying-0.105
Token again
Token ID757
Feature activation+4.03e-02

Pos logit contributions
cream+0.026
endant+0.026
run+0.024
aj+0.023
ifying+0.023
Neg logit contributions
,-0.013
 best-0.013
 first-0.012
 hard-0.012
 top-0.012
Token.
Token ID13
Feature activation+1.29e-02

Pos logit contributions
 up+0.022
 first+0.022
,+0.020
 hard+0.020
 right+0.020
Neg logit contributions
endant-0.112
cream-0.109
run-0.107
aj-0.106
ifying-0.106
Token At
Token ID1629
Feature activation+1.79e-02

Pos logit contributions
,+0.017
 first+0.016
 up+0.015
 right+0.015
 hard+0.015
Neg logit contributions
endant-0.030
cream-0.029
aj-0.028
ifying-0.028
run-0.026
Token last
Token ID938
Feature activation+7.27e-02

Pos logit contributions
 best+0.013
:+0.013
,+0.013
 overcome+0.013
ered+0.011
Neg logit contributions
endant-0.039
cream-0.037
ifying-0.035
jar-0.034
ents-0.033
Token,
Token ID11
Feature activation+2.76e-02

Pos logit contributions
 up+0.056
 right+0.044
 overcome+0.044
 outskirts+0.043
 first+0.043
Neg logit contributions
run-0.164
ittens-0.160
ifying-0.159
endant-0.158
paste-0.151
Token the
Token ID262
Feature activation+2.05e-02

Pos logit contributions
,+0.022
:+0.021
 first+0.020
 right+0.019
 hard+0.019
Neg logit contributions
cream-0.071
endant-0.069
ifying-0.068
aj-0.065
run-0.064
Token duck
Token ID22045
Feature activation+1.08e-02

Pos logit contributions
,+0.029
 first+0.028
 right+0.027
 hard+0.026
 up+0.025
Neg logit contributions
endant-0.049
cream-0.048
ifying-0.046
aj-0.046
run-0.045
Token pushed
Token ID7121
Feature activation+1.50e-02

Pos logit contributions
,+0.013
 first+0.012
 hard+0.012
 right+0.012
 up+0.012
Neg logit contributions
endant-0.024
cream-0.024
aj-0.023
ifying-0.022
run-0.022
Token the
Token ID262
Feature activation+1.70e-02

Pos logit contributions
:+0.008
 first+0.008
 best+0.008
 overcome+0.007
,+0.007
Neg logit contributions
cream-0.038
run-0.037
ifying-0.036
endant-0.035
ksh-0.035
Token came
Token ID1625
Feature activation-1.07e-03

Pos logit contributions
endant+0.067
cream+0.066
aj+0.066
eddy+0.063
ifying+0.062
Neg logit contributions
,-0.036
 first-0.034
 right-0.032
 hard-0.031
 very-0.030
Token in
Token ID287
Feature activation+3.92e-02

Pos logit contributions
endant+0.003
ifying+0.003
cream+0.003
aj+0.002
run+0.002
Neg logit contributions
,-0.001
 first-0.001
 hard-0.001
 right-0.001
 opposite-0.001
Token through
Token ID832
Feature activation+3.49e-02

Pos logit contributions
,+0.036
 up+0.035
 first+0.034
 right+0.032
 hard+0.031
Neg logit contributions
endant-0.097
ifying-0.094
aj-0.093
cream-0.092
run-0.089
Token.
Token ID13
Feature activation+2.54e-03

Pos logit contributions
,+0.031
 first+0.028
 right+0.027
 up+0.027
 hard+0.026
Neg logit contributions
endant-0.094
cream-0.092
ifying-0.090
aj-0.089
run-0.085
Token At
Token ID1629
Feature activation+1.06e-02

Pos logit contributions
,+0.003
 right+0.003
 first+0.003
 up+0.003
 hard+0.003
Neg logit contributions
endant-0.006
cream-0.006
ifying-0.006
aj-0.006
pron-0.006
Token last
Token ID938
Feature activation+7.27e-02

Pos logit contributions
 overcome+0.008
 best+0.007
 right+0.007
:+0.006
ered+0.006
Neg logit contributions
endant-0.024
cream-0.021
aj-0.021
jar-0.021
ifying-0.021
Token,
Token ID11
Feature activation+2.19e-02

Pos logit contributions
 up+0.054
 right+0.050
 overcome+0.046
 first+0.041
 over+0.041
Neg logit contributions
ittens-0.168
endant-0.164
run-0.161
ifying-0.157
aj-0.152
Token they
Token ID484
Feature activation+1.48e-02

Pos logit contributions
,+0.023
 first+0.022
 right+0.022
 hard+0.021
 up+0.021
Neg logit contributions
endant-0.051
cream-0.051
ifying-0.050
aj-0.048
run-0.046
Token found
Token ID1043
Feature activation-1.23e-02

Pos logit contributions
,+0.019
 first+0.018
 up+0.018
 hard+0.018
 right+0.018
Neg logit contributions
endant-0.033
cream-0.032
ifying-0.031
aj-0.031
run-0.029
Token the
Token ID262
Feature activation+1.87e-03

Pos logit contributions
endant+0.037
cream+0.036
ifying+0.035
aj+0.035
run+0.034
Neg logit contributions
,-0.007
 first-0.006
 right-0.005
 hard-0.005
 up-0.005
Token right
Token ID826
Feature activation+2.83e-02

Pos logit contributions
,+0.002
 first+0.002
 right+0.002
 hard+0.002
 up+0.002
Neg logit contributions
endant-0.005
cream-0.005
aj-0.004
ifying-0.004
run-0.004
Token by
Token ID416
Feature activation+2.49e-02

Pos logit contributions
endant+0.070
cream+0.069
ifying+0.067
aj+0.065
run+0.063
Neg logit contributions
,-0.028
 first-0.025
 hard-0.024
 right-0.024
 up-0.024
Token the
Token ID262
Feature activation+1.18e-02

Pos logit contributions
,+0.020
 up+0.018
 first+0.018
 right+0.017
 hard+0.017
Neg logit contributions
endant-0.066
cream-0.064
ifying-0.063
aj-0.062
run-0.061
Token pond
Token ID16723
Feature activation+1.21e-02

Pos logit contributions
,+0.016
 first+0.015
 right+0.014
 up+0.014
 hard+0.014
Neg logit contributions
endant-0.027
cream-0.026
aj-0.025
ifying-0.025
run-0.024
Token.
Token ID13
Feature activation+1.32e-02

Pos logit contributions
 hard+0.004
 first+0.004
 up+0.004
,+0.004
 right+0.004
Neg logit contributions
endant-0.035
cream-0.034
ifying-0.034
aj-0.034
run-0.033
Token At
Token ID1629
Feature activation+1.71e-02

Pos logit contributions
,+0.017
 first+0.015
 right+0.015
 up+0.015
 hard+0.015
Neg logit contributions
endant-0.032
cream-0.031
aj-0.030
ifying-0.030
run-0.028
Token last
Token ID938
Feature activation+7.25e-02

Pos logit contributions
 best+0.011
 overcome+0.011
 right+0.010
,+0.010
ered+0.010
Neg logit contributions
endant-0.039
aj-0.036
jar-0.035
ifying-0.035
cream-0.035
Token,
Token ID11
Feature activation+2.84e-02

Pos logit contributions
 up+0.054
 right+0.054
 overcome+0.051
 outskirts+0.045
 over+0.042
Neg logit contributions
run-0.154
ittens-0.154
endant-0.153
ifying-0.153
fy-0.140
Token they
Token ID484
Feature activation+2.06e-02

Pos logit contributions
,+0.029
 right+0.028
 first+0.028
 hard+0.028
:+0.027
Neg logit contributions
cream-0.063
ifying-0.062
endant-0.061
aj-0.058
run-0.056
Token found
Token ID1043
Feature activation-9.12e-03

Pos logit contributions
,+0.027
 first+0.025
 up+0.025
 hard+0.025
 right+0.025
Neg logit contributions
endant-0.046
cream-0.045
ifying-0.044
aj-0.043
run-0.041
Token many
Token ID867
Feature activation+8.97e-03

Pos logit contributions
endant+0.026
cream+0.026
ifying+0.025
aj+0.025
run+0.024
Neg logit contributions
,-0.006
 first-0.005
 right-0.005
 hard-0.005
 up-0.004
Token carrots
Token ID34397
Feature activation-2.20e-02

Pos logit contributions
,+0.011
 first+0.011
 right+0.010
 up+0.010
 hard+0.010
Neg logit contributions
endant-0.021
cream-0.020
aj-0.020
ifying-0.020
run-0.018
Token thicker
Token ID29175
Feature activation-7.86e-03

Pos logit contributions
endant+0.065
cream+0.064
ifying+0.061
aj+0.061
run+0.058
Neg logit contributions
,-0.030
 first-0.026
 right-0.026
 up-0.025
:-0.025
Token.
Token ID13
Feature activation+3.18e-04

Pos logit contributions
endant+0.019
cream+0.019
ifying+0.018
aj+0.018
run+0.017
Neg logit contributions
 first-0.007
,-0.007
 right-0.007
 up-0.007
 hard-0.007
Token Until
Token ID14303
Feature activation+3.45e-02

Pos logit contributions
,+0.000
 first+0.000
 right+0.000
 up+0.000
 hard+0.000
Neg logit contributions
endant-0.001
cream-0.001
aj-0.001
ifying-0.001
run-0.001
Token,
Token ID11
Feature activation+2.14e-02

Pos logit contributions
 overcome+0.024
 up+0.023
 bottom+0.021
 opposite+0.021
 hard+0.021
Neg logit contributions
ifying-0.083
cream-0.079
endant-0.078
jar-0.074
run-0.072
Token at
Token ID379
Feature activation+2.72e-02

Pos logit contributions
,+0.020
:+0.020
 hard+0.019
 first+0.019
 up+0.019
Neg logit contributions
cream-0.049
endant-0.048
ifying-0.048
aj-0.044
jar-0.044
Token last
Token ID938
Feature activation+7.24e-02

Pos logit contributions
:+0.022
,+0.022
 best+0.018
 right+0.017
 hard+0.017
Neg logit contributions
endant-0.059
jar-0.057
cream-0.057
ifying-0.055
aj-0.053
Token,
Token ID11
Feature activation+2.30e-02

Pos logit contributions
 up+0.061
 best+0.053
 over+0.051
 right+0.051
 overcome+0.051
Neg logit contributions
ittens-0.150
run-0.144
paste-0.142
ifying-0.141
endant-0.139
Token she
Token ID673
Feature activation+5.40e-03

Pos logit contributions
,+0.016
:+0.016
 first+0.016
 up+0.015
 hard+0.015
Neg logit contributions
endant-0.058
cream-0.058
ifying-0.057
aj-0.053
paste-0.053
Token saw
Token ID2497
Feature activation-1.41e-02

Pos logit contributions
,+0.007
 first+0.006
 right+0.006
 up+0.006
 hard+0.006
Neg logit contributions
endant-0.013
cream-0.012
aj-0.012
ifying-0.012
run-0.011
Token a
Token ID257
Feature activation-2.00e-03

Pos logit contributions
endant+0.038
cream+0.037
ifying+0.035
aj+0.035
run+0.033
Neg logit contributions
,-0.011
 first-0.011
 right-0.010
 up-0.010
 hard-0.009
Token clearing
Token ID17304
Feature activation-2.75e-02

Pos logit contributions
endant+0.005
cream+0.005
aj+0.005
ifying+0.005
run+0.005
Neg logit contributions
,-0.002
 first-0.002
 up-0.002
 right-0.002
 hard-0.002
Token the
Token ID262
Feature activation+6.37e-03

Pos logit contributions
endant+0.008
cream+0.008
ifying+0.008
aj+0.008
eddy+0.007
Neg logit contributions
,-0.004
 first-0.003
 hard-0.003
 right-0.003
 up-0.003
Token best
Token ID1266
Feature activation+1.86e-02

Pos logit contributions
,+0.005
 first+0.004
 right+0.004
 up+0.004
 hard+0.004
Neg logit contributions
endant-0.019
cream-0.018
aj-0.018
ifying-0.018
run-0.017
Token one
Token ID530
Feature activation-1.59e-03

Pos logit contributions
,+0.030
 first+0.028
 up+0.027
 right+0.027
 hard+0.027
Neg logit contributions
endant-0.039
cream-0.037
ifying-0.036
aj-0.036
run-0.033
Token.
Token ID13
Feature activation+5.28e-03

Pos logit contributions
endant+0.004
cream+0.004
aj+0.004
ifying+0.004
ittens+0.004
Neg logit contributions
,-0.002
 first-0.001
 right-0.001
 up-0.001
 hard-0.001
Token At
Token ID1629
Feature activation+1.07e-02

Pos logit contributions
,+0.008
 first+0.007
 hard+0.007
 up+0.007
 right+0.007
Neg logit contributions
endant-0.011
cream-0.011
aj-0.011
ifying-0.010
run-0.010
Token last
Token ID938
Feature activation+7.22e-02

Pos logit contributions
,+0.005
 overcome+0.004
 best+0.004
:+0.004
 right+0.004
Neg logit contributions
endant-0.029
cream-0.028
ifying-0.027
aj-0.027
pron-0.026
Token he
Token ID339
Feature activation+1.56e-02

Pos logit contributions
 up+0.050
 overcome+0.040
 right+0.040
 outskirts+0.038
 first+0.036
Neg logit contributions
ifying-0.161
run-0.160
endant-0.157
ittens-0.154
pron-0.154
Token chose
Token ID7690
Feature activation-2.12e-02

Pos logit contributions
,+0.020
 first+0.018
 right+0.018
 up+0.018
 hard+0.018
Neg logit contributions
endant-0.036
cream-0.035
aj-0.034
ifying-0.034
run-0.032
Token one
Token ID530
Feature activation-1.34e-03

Pos logit contributions
endant+0.061
cream+0.060
aj+0.058
ifying+0.058
run+0.056
Neg logit contributions
,-0.015
 first-0.013
 right-0.012
 up-0.012
 hard-0.012
Token with
Token ID351
Feature activation-9.04e-03

Pos logit contributions
endant+0.004
cream+0.004
aj+0.004
ifying+0.004
run+0.004
Neg logit contributions
,-0.001
 first-0.001
 right-0.001
 up-0.001
 hard-0.001
Token big
Token ID1263
Feature activation+2.78e-03

Pos logit contributions
endant+0.029
aj+0.028
cream+0.028
ifying+0.027
eddy+0.026
Neg logit contributions
,-0.005
 first-0.004
 hard-0.004
 right-0.004
 up-0.004
Token,
Token ID11
Feature activation-8.40e-03

Pos logit contributions
endant+0.024
cream+0.024
aj+0.023
ifying+0.023
run+0.022
Neg logit contributions
,-0.008
 first-0.007
 up-0.007
 right-0.007
 hard-0.006
Token no
Token ID645
Feature activation-8.26e-03

Pos logit contributions
endant+0.022
cream+0.022
aj+0.021
ifying+0.021
ksh+0.020
Neg logit contributions
,-0.009
 right-0.008
 first-0.008
 hard-0.008
 up-0.008
Token matter
Token ID2300
Feature activation+3.44e-03

Pos logit contributions
endant+0.017
cream+0.016
aj+0.016
ifying+0.015
ksh+0.014
Neg logit contributions
,-0.015
 first-0.014
 hard-0.013
 right-0.013
 up-0.013
Token what
Token ID644
Feature activation+8.13e-03

Pos logit contributions
,+0.006
 first+0.005
 right+0.005
 up+0.005
 hard+0.005
Neg logit contributions
endant-0.007
cream-0.006
aj-0.006
ifying-0.006
run-0.006
Token you
Token ID345
Feature activation+1.66e-02

Pos logit contributions
,+0.008
 first+0.008
 up+0.008
 right+0.008
 hard+0.007
Neg logit contributions
endant-0.020
cream-0.019
ifying-0.019
aj-0.019
run-0.018
Token do
Token ID466
Feature activation+7.19e-02

Pos logit contributions
,+0.018
 first+0.016
 up+0.016
 hard+0.016
 right+0.016
Neg logit contributions
endant-0.041
cream-0.040
ifying-0.039
aj-0.038
run-0.036
Token!
Token ID0
Feature activation+1.32e-02

Pos logit contributions
 opposite+0.064
 up+0.057
 hard+0.056
 bottom+0.055
 first+0.054
Neg logit contributions
ifying-0.161
aj-0.160
endant-0.159
cream-0.158
pron-0.151
Token Doing
Token ID25327
Feature activation+1.94e-02

Pos logit contributions
,+0.012
 first+0.010
 right+0.010
 hard+0.009
 up+0.009
Neg logit contributions
endant-0.037
cream-0.036
aj-0.035
ifying-0.035
ittens-0.033
Token something
Token ID1223
Feature activation+4.07e-03

Pos logit contributions
 first+0.017
,+0.017
 up+0.015
 right+0.015
 bottom+0.015
Neg logit contributions
endant-0.046
cream-0.046
aj-0.045
ifying-0.045
run-0.043
Token with
Token ID351
Feature activation-1.82e-03

Pos logit contributions
 first+0.005
 bottom+0.005
 top+0.004
 opposite+0.004
,+0.004
Neg logit contributions
endant-0.009
cream-0.009
ifying-0.009
aj-0.008
run-0.008
Token a
Token ID257
Feature activation+6.83e-03

Pos logit contributions
endant+0.005
cream+0.005
aj+0.005
ifying+0.005
run+0.004
Neg logit contributions
,-0.002
 first-0.001
 up-0.001
 right-0.001
 hard-0.001
Token her
Token ID607
Feature activation-2.23e-02

Pos logit contributions
,+0.010
 first+0.009
 right+0.009
 up+0.009
 hard+0.008
Neg logit contributions
endant-0.045
cream-0.044
ifying-0.043
aj-0.043
run-0.041
Token new
Token ID649
Feature activation-7.80e-04

Pos logit contributions
endant+0.053
cream+0.052
aj+0.050
ifying+0.050
ksh+0.047
Neg logit contributions
,-0.029
 first-0.027
 hard-0.026
 right-0.025
 up-0.025
Token powers
Token ID5635
Feature activation-1.62e-02

Pos logit contributions
endant+0.002
cream+0.002
aj+0.002
ifying+0.002
run+0.001
Neg logit contributions
,-0.001
 first-0.001
 up-0.001
 right-0.001
 hard-0.001
Token.
Token ID13
Feature activation+1.14e-02

Pos logit contributions
endant+0.043
cream+0.041
aj+0.040
ifying+0.040
run+0.039
Neg logit contributions
,-0.015
 first-0.014
 up-0.013
 right-0.013
 hard-0.013
Token At
Token ID1629
Feature activation+1.33e-02

Pos logit contributions
,+0.015
 first+0.014
 right+0.014
 up+0.014
 hard+0.014
Neg logit contributions
endant-0.026
cream-0.026
aj-0.025
ifying-0.025
ittens-0.023
Token last
Token ID938
Feature activation+7.19e-02

Pos logit contributions
,+0.008
:+0.008
 right+0.007
 best+0.007
 hard+0.007
Neg logit contributions
endant-0.033
ifying-0.031
aj-0.031
cream-0.029
jar-0.029
Token,
Token ID11
Feature activation+2.86e-02

Pos logit contributions
 up+0.045
 first+0.037
 right+0.036
 overcome+0.032
 best+0.031
Neg logit contributions
run-0.175
endant-0.173
ifying-0.169
ittens-0.163
paste-0.163
Token she
Token ID673
Feature activation+1.16e-02

Pos logit contributions
,+0.026
 first+0.024
:+0.024
 hard+0.024
 right+0.023
Neg logit contributions
endant-0.068
ifying-0.068
cream-0.067
aj-0.064
run-0.063
Token found
Token ID1043
Feature activation-9.81e-03

Pos logit contributions
,+0.015
 first+0.014
 right+0.014
 up+0.014
 hard+0.014
Neg logit contributions
endant-0.027
cream-0.026
aj-0.025
ifying-0.025
run-0.023
Token the
Token ID262
Feature activation+3.12e-03

Pos logit contributions
endant+0.028
cream+0.027
ifying+0.026
aj+0.026
run+0.026
Neg logit contributions
,-0.007
 first-0.006
 right-0.005
 hard-0.005
 up-0.005
Token answers
Token ID7429
Feature activation+1.04e-03

Pos logit contributions
,+0.004
 first+0.004
 right+0.004
 hard+0.004
 up+0.003
Neg logit contributions
endant-0.007
cream-0.007
aj-0.007
ifying-0.007
run-0.007
Token who
Token ID508
Feature activation-5.26e-02

Pos logit contributions
endant+0.021
cream+0.021
aj+0.020
ifying+0.020
run+0.019
Neg logit contributions
,-0.007
 first-0.005
 up-0.005
 right-0.005
 hard-0.005
Token had
Token ID550
Feature activation-4.09e-02

Pos logit contributions
endant+0.132
aj+0.126
ifying+0.125
cream+0.124
run+0.118
Neg logit contributions
,-0.066
 hard-0.059
 right-0.057
 first-0.056
 up-0.056
Token a
Token ID257
Feature activation-2.43e-02

Pos logit contributions
endant+0.133
cream+0.130
aj+0.127
ifying+0.126
run+0.124
Neg logit contributions
,-0.019
 right-0.015
 first-0.015
 up-0.014
 hard-0.014
Token pet
Token ID4273
Feature activation-6.86e-02

Pos logit contributions
endant+0.057
cream+0.055
aj+0.053
ifying+0.053
run+0.050
Neg logit contributions
,-0.031
 first-0.028
 up-0.028
 right-0.027
 hard-0.027
Token cat
Token ID3797
Feature activation-3.56e-02

Pos logit contributions
endant+0.153
aj+0.149
cream+0.143
ifying+0.138
run+0.136
Neg logit contributions
,-0.111
 first-0.095
 right-0.093
 up-0.091
 hard-0.089
Token named
Token ID3706
Feature activation-1.06e-01

Pos logit contributions
endant+0.087
cream+0.083
ifying+0.082
aj+0.082
ksh+0.078
Neg logit contributions
,-0.052
 first-0.042
 right-0.041
:-0.041
 up-0.040
Token Mitt
Token ID16627
Feature activation-5.45e-02

Pos logit contributions
endant+0.230
cream+0.215
aj+0.210
ifying+0.208
run+0.192
Neg logit contributions
,-0.178
 first-0.165
 the-0.161
 right-0.157
 up-0.157
Tokenens
Token ID641
Feature activation-2.40e-02

Pos logit contributions
endant+0.122
ifying+0.114
cream+0.109
aj+0.106
run+0.103
Neg logit contributions
,-0.095
 first-0.082
 right-0.081
 hard-0.078
 the-0.077
Token.
Token ID13
Feature activation-2.01e-02

Pos logit contributions
endant+0.071
cream+0.068
aj+0.067
ifying+0.067
run+0.064
Neg logit contributions
,-0.021
 first-0.017
 right-0.016
 up-0.015
 hard-0.015
Token One
Token ID1881
Feature activation-2.11e-02

Pos logit contributions
endant+0.056
cream+0.053
ifying+0.051
aj+0.048
paste+0.047
Neg logit contributions
,-0.025
 the-0.020
 first-0.020
 right-0.019
 hard-0.018
Token day
Token ID1110
Feature activation+2.06e-02

Pos logit contributions
endant+0.061
cream+0.059
aj+0.058
ifying+0.058
run+0.055
Neg logit contributions
,-0.017
 first-0.015
 up-0.014
 hard-0.014
 right-0.014
Token he
Token ID339
Feature activation-8.72e-03

Pos logit contributions
,+0.006
 first+0.005
 hard+0.005
 the+0.005
 up+0.005
Neg logit contributions
endant-0.017
aj-0.016
cream-0.016
ifying-0.016
ksh-0.015
Token had
Token ID550
Feature activation-2.58e-02

Pos logit contributions
endant+0.021
aj+0.020
cream+0.020
ifying+0.020
run+0.019
Neg logit contributions
,-0.011
 first-0.010
 up-0.010
 hard-0.010
 right-0.010
Token a
Token ID257
Feature activation-2.15e-02

Pos logit contributions
endant+0.080
cream+0.078
aj+0.077
ifying+0.077
run+0.074
Neg logit contributions
,-0.014
 first-0.012
 right-0.011
 hard-0.011
 up-0.010
Token sausage
Token ID31889
Feature activation-4.44e-02

Pos logit contributions
endant+0.057
cream+0.055
aj+0.054
ifying+0.054
run+0.051
Neg logit contributions
,-0.022
 first-0.020
 hard-0.019
 right-0.019
 up-0.019
Token dog
Token ID3290
Feature activation-2.53e-02

Pos logit contributions
endant+0.137
aj+0.133
ifying+0.128
ksh+0.126
cream+0.125
Neg logit contributions
,-0.039
 first-0.030
 right-0.028
 up-0.026
 very-0.026
Token named
Token ID3706
Feature activation-1.06e-01

Pos logit contributions
endant+0.068
aj+0.062
ifying+0.062
cream+0.062
ksh+0.061
Neg logit contributions
,-0.036
 first-0.026
 right-0.026
 the-0.026
 very-0.026
Token Max
Token ID5436
Feature activation-1.28e-02

Pos logit contributions
endant+0.272
aj+0.246
ifying+0.238
cream+0.235
lets+0.222
Neg logit contributions
,-0.146
 first-0.137
 the-0.136
 right-0.127
 up-0.126
Token.
Token ID13
Feature activation-1.60e-02

Pos logit contributions
endant+0.039
cream+0.037
aj+0.036
ifying+0.036
ittens+0.034
Neg logit contributions
,-0.011
 first-0.009
 right-0.008
 up-0.008
 hard-0.008
Token Max
Token ID5436
Feature activation+9.48e-03

Pos logit contributions
endant+0.045
cream+0.043
ifying+0.042
aj+0.040
paste+0.040
Neg logit contributions
,-0.018
 first-0.015
 right-0.014
 up-0.014
 the-0.014
Token was
Token ID373
Feature activation+4.40e-03

Pos logit contributions
 first+0.013
,+0.013
 right+0.013
 up+0.012
 top+0.012
Neg logit contributions
cream-0.019
endant-0.019
aj-0.018
ifying-0.017
lets-0.017
Token Tim
Token ID5045
Feature activation-3.45e-03

Pos logit contributions
,+0.004
 first+0.003
 bottom+0.003
 top+0.003
 up+0.003
Neg logit contributions
endant-0.011
cream-0.011
aj-0.011
ifying-0.011
run-0.010
Token and
Token ID290
Feature activation-4.53e-02

Pos logit contributions
endant+0.084
cream+0.083
ifying+0.079
aj+0.079
lets+0.076
Neg logit contributions
,-0.017
 first-0.017
 up-0.015
 right-0.015
 hard-0.015
Token had
Token ID550
Feature activation-8.51e-03

Pos logit contributions
endant+0.112
cream+0.107
ifying+0.106
aj+0.105
run+0.100
Neg logit contributions
,-0.057
 first-0.051
 hard-0.048
 up-0.048
 right-0.048
Token a
Token ID257
Feature activation-1.61e-02

Pos logit contributions
endant+0.024
cream+0.024
aj+0.023
ifying+0.023
run+0.022
Neg logit contributions
,-0.006
 first-0.005
 up-0.005
 right-0.005
 hard-0.005
Token toy
Token ID13373
Feature activation-3.25e-02

Pos logit contributions
endant+0.039
cream+0.037
aj+0.036
ifying+0.036
run+0.034
Neg logit contributions
,-0.020
 first-0.018
 up-0.018
 right-0.018
 hard-0.017
Token pony
Token ID26902
Feature activation-4.81e-02

Pos logit contributions
endant+0.062
cream+0.060
aj+0.058
ifying+0.057
run+0.053
Neg logit contributions
,-0.055
 first-0.052
 right-0.051
 up-0.051
 hard-0.050
Token named
Token ID3706
Feature activation-1.05e-01

Pos logit contributions
endant+0.147
aj+0.145
ifying+0.142
ksh+0.140
cream+0.139
Neg logit contributions
,-0.045
 first-0.031
 very-0.031
 up-0.029
 right-0.028
Token Spark
Token ID17732
Feature activation-4.65e-02

Pos logit contributions
endant+0.223
cream+0.206
aj+0.205
ifying+0.202
run+0.191
Neg logit contributions
,-0.176
 first-0.167
 up-0.160
 right-0.159
 the-0.157
Tokenles
Token ID829
Feature activation-3.22e-02

Pos logit contributions
endant+0.104
cream+0.102
aj+0.099
ifying+0.098
ittens+0.096
Neg logit contributions
,-0.072
 up-0.065
 first-0.063
 right-0.059
 hard-0.059
Token.
Token ID13
Feature activation-1.53e-02

Pos logit contributions
endant+0.088
cream+0.085
aj+0.083
ifying+0.083
run+0.079
Neg logit contributions
,-0.030
 first-0.026
 right-0.025
 up-0.025
 hard-0.024
Token One
Token ID1881
Feature activation-1.72e-02

Pos logit contributions
endant+0.039
cream+0.038
ifying+0.037
aj+0.037
run+0.035
Neg logit contributions
,-0.018
 first-0.016
 right-0.015
 up-0.015
 hard-0.015
Token day
Token ID1110
Feature activation+2.61e-02

Pos logit contributions
endant+0.051
cream+0.049
aj+0.048
ifying+0.048
run+0.046
Neg logit contributions
,-0.014
 first-0.012
 up-0.011
 hard-0.011
 right-0.011
Token a
Token ID257
Feature activation-2.23e-02

Pos logit contributions
endant+0.109
cream+0.106
aj+0.104
ifying+0.103
run+0.100
Neg logit contributions
,-0.017
 first-0.013
 right-0.013
 hard-0.012
 up-0.012
Token favorite
Token ID4004
Feature activation-3.71e-02

Pos logit contributions
endant+0.052
cream+0.051
aj+0.049
ifying+0.049
run+0.047
Neg logit contributions
,-0.028
 first-0.026
 right-0.025
 up-0.025
 hard-0.025
Token t
Token ID256
Feature activation-9.20e-02

Pos logit contributions
endant+0.078
cream+0.076
ifying+0.074
aj+0.073
run+0.069
Neg logit contributions
,-0.060
 first-0.055
 hard-0.053
 up-0.053
 right-0.053
Tokeneddy
Token ID21874
Feature activation-4.55e-02

Pos logit contributions
endant+0.076
aj+0.075
ksh+0.074
ifying+0.070
ccoli+0.062
Neg logit contributions
,-0.250
 up-0.237
 first-0.233
 very-0.229
 hard-0.221
Token bear
Token ID6842
Feature activation-3.98e-02

Pos logit contributions
endant+0.116
ifying+0.112
aj+0.110
cream+0.109
run+0.106
Neg logit contributions
,-0.065
 first-0.052
 up-0.050
 right-0.050
 the-0.049
Token named
Token ID3706
Feature activation-1.05e-01

Pos logit contributions
aj+0.105
endant+0.104
ksh+0.103
ifying+0.101
cream+0.098
Neg logit contributions
,-0.055
 very-0.040
:-0.040
 first-0.037
 right-0.036
Token Mark
Token ID2940
Feature activation-1.33e-02

Pos logit contributions
endant+0.225
ifying+0.202
aj+0.200
cream+0.198
run+0.184
Neg logit contributions
,-0.193
 the-0.177
 first-0.171
 up-0.163
 right-0.162
Token.
Token ID13
Feature activation-2.01e-02

Pos logit contributions
endant+0.038
cream+0.036
ifying+0.036
aj+0.036
ksh+0.034
Neg logit contributions
,-0.014
 first-0.010
:-0.010
 up-0.010
 the-0.009
Token She
Token ID1375
Feature activation-1.90e-02

Pos logit contributions
endant+0.049
cream+0.047
aj+0.046
ifying+0.046
run+0.043
Neg logit contributions
,-0.025
 first-0.022
 right-0.022
 up-0.022
 hard-0.021
Token loved
Token ID6151
Feature activation-3.06e-02

Pos logit contributions
endant+0.047
ifying+0.046
cream+0.045
aj+0.044
run+0.043
Neg logit contributions
,-0.024
 first-0.020
 hard-0.020
 up-0.020
 right-0.019
Token Mark
Token ID2940
Feature activation+2.57e-03

Pos logit contributions
endant+0.089
ifying+0.085
cream+0.085
aj+0.084
run+0.080
Neg logit contributions
,-0.024
 first-0.021
 hard-0.020
 up-0.020
 right-0.020
Token She
Token ID1375
Feature activation-1.64e-02

Pos logit contributions
endant+0.048
aj+0.046
cream+0.045
ifying+0.045
run+0.042
Neg logit contributions
,-0.024
 first-0.023
 up-0.023
 right-0.022
 hard-0.022
Token had
Token ID550
Feature activation-3.43e-02

Pos logit contributions
endant+0.033
cream+0.032
aj+0.031
ifying+0.030
ittens+0.028
Neg logit contributions
 first-0.024
,-0.024
 up-0.023
 right-0.023
 hard-0.022
Token a
Token ID257
Feature activation-2.28e-02

Pos logit contributions
endant+0.101
cream+0.098
ifying+0.096
aj+0.096
run+0.091
Neg logit contributions
,-0.022
 first-0.019
 up-0.018
 right-0.018
 hard-0.017
Token pet
Token ID4273
Feature activation-6.33e-02

Pos logit contributions
endant+0.054
cream+0.052
aj+0.051
ifying+0.051
run+0.047
Neg logit contributions
,-0.028
 first-0.026
 up-0.025
 right-0.025
 hard-0.025
Token dog
Token ID3290
Feature activation-2.63e-02

Pos logit contributions
endant+0.134
cream+0.129
aj+0.128
ifying+0.122
run+0.117
Neg logit contributions
,-0.105
 first-0.093
 right-0.090
 up-0.089
 hard-0.088
Token named
Token ID3706
Feature activation-1.05e-01

Pos logit contributions
endant+0.059
cream+0.057
aj+0.055
ifying+0.055
ksh+0.052
Neg logit contributions
,-0.042
 first-0.035
 right-0.035
 up-0.034
 hard-0.034
Token Spot
Token ID15899
Feature activation-2.55e-02

Pos logit contributions
endant+0.232
cream+0.220
aj+0.211
ifying+0.211
run+0.196
Neg logit contributions
,-0.166
 first-0.154
 up-0.148
 right-0.148
 the-0.145
Token who
Token ID508
Feature activation-5.81e-02

Pos logit contributions
endant+0.073
cream+0.070
aj+0.068
ifying+0.068
run+0.064
Neg logit contributions
,-0.025
 first-0.020
 right-0.019
 up-0.019
 hard-0.018
Token she
Token ID673
Feature activation-4.22e-02

Pos logit contributions
endant+0.148
ifying+0.142
aj+0.142
cream+0.138
ksh+0.132
Neg logit contributions
,-0.071
 first-0.061
 hard-0.060
 up-0.059
 right-0.059
Token loved
Token ID6151
Feature activation-2.89e-02

Pos logit contributions
endant+0.111
ifying+0.105
aj+0.105
cream+0.104
ksh+0.098
Neg logit contributions
,-0.048
 first-0.042
 up-0.041
 right-0.041
 hard-0.041
Token very
Token ID845
Feature activation-7.37e-03

Pos logit contributions
endant+0.087
aj+0.083
ifying+0.082
cream+0.082
run+0.077
Neg logit contributions
,-0.022
 first-0.019
 hard-0.018
 up-0.018
 right-0.018
Token day
Token ID1110
Feature activation+1.26e-02

Pos logit contributions
endant+0.059
cream+0.058
aj+0.056
ifying+0.056
ittens+0.053
Neg logit contributions
 right-0.029
 first-0.028
 hard-0.028
,-0.028
 up-0.027
Token,
Token ID11
Feature activation-1.02e-02

Pos logit contributions
 first+0.004
 right+0.004
 overcome+0.004
 up+0.003
 opposite+0.003
Neg logit contributions
cream-0.037
endant-0.037
ittens-0.037
ifying-0.037
aj-0.036
Token a
Token ID257
Feature activation-2.07e-02

Pos logit contributions
endant+0.022
cream+0.021
aj+0.020
ifying+0.020
run+0.019
Neg logit contributions
,-0.016
 first-0.015
 up-0.014
 hard-0.014
 right-0.014
Token chocolate
Token ID11311
Feature activation-4.86e-02

Pos logit contributions
endant+0.049
cream+0.048
aj+0.047
ifying+0.046
run+0.044
Neg logit contributions
,-0.027
 first-0.025
 hard-0.024
 right-0.024
 up-0.023
Token-
Token ID12
Feature activation-5.00e-02

Pos logit contributions
endant+0.106
aj+0.102
cream+0.099
ifying+0.098
run+0.097
Neg logit contributions
,-0.077
 first-0.068
 up-0.066
 hard-0.066
 right-0.066
Tokencol
Token ID4033
Feature activation-1.05e-01

Pos logit contributions
endant+0.058
aj+0.056
ifying+0.050
run+0.044
aters+0.041
Neg logit contributions
,-0.125
 up-0.123
 first-0.120
 hard-0.120
 right-0.118
Tokenoured
Token ID8167
Feature activation-3.40e-02

Pos logit contributions
jar+0.126
endant+0.122
pron+0.111
 importantly+0.108
aj+0.105
Neg logit contributions
 up-0.242
,-0.230
 first-0.217
 top-0.213
 right-0.212
Token whip
Token ID22972
Feature activation-3.07e-02

Pos logit contributions
endant+0.068
aj+0.065
cream+0.063
ifying+0.063
run+0.061
Neg logit contributions
,-0.062
 first-0.054
 right-0.054
 up-0.053
 hard-0.052
Token named
Token ID3706
Feature activation-9.42e-02

Pos logit contributions
endant+0.088
cream+0.084
aj+0.084
ifying+0.083
run+0.079
Neg logit contributions
,-0.025
 first-0.021
 up-0.021
 right-0.020
 hard-0.020
Token C
Token ID327
Feature activation-5.77e-02

Pos logit contributions
endant+0.196
cream+0.183
aj+0.181
ifying+0.177
run+0.166
Neg logit contributions
,-0.154
 first-0.145
 right-0.143
 up-0.142
 hard-0.137
Tokenheddar
Token ID44937
Feature activation-5.23e-02

Pos logit contributions
endant+0.082
cream+0.078
ifying+0.078
aj+0.071
run+0.070
Neg logit contributions
,-0.128
 up-0.121
 first-0.120
 right-0.120
:-0.117
Token She
Token ID1375
Feature activation-1.77e-02

Pos logit contributions
endant+0.054
cream+0.054
ifying+0.052
aj+0.051
run+0.049
Neg logit contributions
,-0.018
 first-0.015
 right-0.015
 hard-0.015
 up-0.014
Token had
Token ID550
Feature activation-3.48e-02

Pos logit contributions
endant+0.041
cream+0.040
aj+0.039
ifying+0.039
run+0.036
Neg logit contributions
,-0.023
 first-0.022
 right-0.021
 up-0.021
 hard-0.021
Token a
Token ID257
Feature activation-2.17e-02

Pos logit contributions
endant+0.112
cream+0.109
aj+0.107
ifying+0.106
run+0.103
Neg logit contributions
,-0.017
 first-0.013
 right-0.013
 hard-0.013
 up-0.012
Token brown
Token ID7586
Feature activation-5.01e-02

Pos logit contributions
endant+0.052
cream+0.051
ifying+0.049
aj+0.049
run+0.046
Neg logit contributions
,-0.027
 first-0.025
 hard-0.024
 right-0.024
 up-0.024
Token dog
Token ID3290
Feature activation-2.47e-02

Pos logit contributions
endant+0.091
cream+0.087
aj+0.084
ifying+0.083
run+0.077
Neg logit contributions
,-0.088
 first-0.085
 right-0.083
 up-0.083
 hard-0.082
Token named
Token ID3706
Feature activation-1.04e-01

Pos logit contributions
endant+0.057
cream+0.054
aj+0.054
ifying+0.054
ksh+0.051
Neg logit contributions
,-0.039
 first-0.031
 hard-0.031
 right-0.031
 up-0.031
Token Max
Token ID5436
Feature activation-1.22e-02

Pos logit contributions
endant+0.285
cream+0.261
aj+0.254
ifying+0.253
run+0.235
Neg logit contributions
,-0.132
 the-0.118
 first-0.117
 right-0.107
 hard-0.105
Token.
Token ID13
Feature activation-2.08e-02

Pos logit contributions
endant+0.034
cream+0.033
aj+0.032
ifying+0.032
run+0.030
Neg logit contributions
,-0.012
 first-0.010
 right-0.010
 up-0.010
 hard-0.010
Token One
Token ID1881
Feature activation-2.00e-02

Pos logit contributions
endant+0.054
cream+0.052
ifying+0.050
aj+0.049
run+0.047
Neg logit contributions
,-0.027
 first-0.022
 the-0.022
 right-0.021
 hard-0.021
Token day
Token ID1110
Feature activation+1.94e-02

Pos logit contributions
endant+0.058
cream+0.056
aj+0.055
ifying+0.055
run+0.052
Neg logit contributions
,-0.016
 first-0.014
 up-0.013
 hard-0.013
 right-0.013
Token,
Token ID11
Feature activation-1.06e-02

Pos logit contributions
 first+0.003
 right+0.003
 up+0.002
 opposite+0.001
 hard+0.001
Neg logit contributions
endant-0.063
cream-0.062
aj-0.061
ifying-0.061
ittens-0.059
Token She
Token ID1375
Feature activation-1.89e-02

Pos logit contributions
endant+0.043
cream+0.042
aj+0.041
ifying+0.041
run+0.039
Neg logit contributions
,-0.019
 first-0.018
 up-0.017
 right-0.017
 hard-0.017
Token had
Token ID550
Feature activation-3.48e-02

Pos logit contributions
endant+0.043
cream+0.042
aj+0.040
ifying+0.040
run+0.037
Neg logit contributions
,-0.026
 first-0.024
 up-0.023
 right-0.023
 hard-0.023
Token a
Token ID257
Feature activation-2.27e-02

Pos logit contributions
endant+0.107
cream+0.104
aj+0.102
ifying+0.101
run+0.097
Neg logit contributions
,-0.020
 first-0.017
 right-0.015
 up-0.015
 hard-0.015
Token rag
Token ID34232
Feature activation-5.11e-02

Pos logit contributions
endant+0.056
cream+0.054
aj+0.052
ifying+0.052
run+0.049
Neg logit contributions
,-0.026
 first-0.024
 up-0.023
 right-0.023
 hard-0.023
Token doll
Token ID3654
Feature activation-4.65e-02

Pos logit contributions
endant+0.125
cream+0.121
aj+0.120
ifying+0.117
ksh+0.113
Neg logit contributions
,-0.069
 first-0.058
 right-0.056
 up-0.056
 hard-0.054
Token named
Token ID3706
Feature activation-1.04e-01

Pos logit contributions
endant+0.123
cream+0.121
ksh+0.121
aj+0.121
ifying+0.119
Neg logit contributions
,-0.058
 first-0.043
 very-0.042
:-0.041
 up-0.041
Token Daisy
Token ID40355
Feature activation-3.34e-02

Pos logit contributions
endant+0.241
cream+0.227
aj+0.224
ifying+0.223
run+0.207
Neg logit contributions
,-0.160
 the-0.144
 first-0.144
 up-0.139
 right-0.135
Token.
Token ID13
Feature activation-1.46e-02

Pos logit contributions
endant+0.095
cream+0.093
aj+0.091
ifying+0.090
run+0.086
Neg logit contributions
,-0.031
 first-0.025
 right-0.023
 up-0.023
 hard-0.023
Token Lucy
Token ID22162
Feature activation+9.92e-04

Pos logit contributions
endant+0.036
cream+0.035
aj+0.034
ifying+0.034
run+0.032
Neg logit contributions
,-0.017
 first-0.016
 up-0.015
 right-0.015
 hard-0.015
Token loved
Token ID6151
Feature activation-2.75e-02

Pos logit contributions
,+0.001
 first+0.001
 right+0.001
 up+0.001
 hard+0.001
Neg logit contributions
endant-0.002
cream-0.002
aj-0.002
ifying-0.002
lets-0.002
Token Daisy
Token ID40355
Feature activation-1.43e-02

Pos logit contributions
endant+0.088
ifying+0.086
aj+0.085
cream+0.083
ksh+0.081
Neg logit contributions
,-0.017
 the-0.013
 hard-0.013
 first-0.012
 up-0.012
Token She
Token ID1375
Feature activation-1.88e-02

Pos logit contributions
endant+0.053
cream+0.052
aj+0.050
ifying+0.050
run+0.047
Neg logit contributions
,-0.019
 first-0.017
 right-0.017
 up-0.016
 hard-0.016
Token had
Token ID550
Feature activation-3.59e-02

Pos logit contributions
endant+0.042
cream+0.041
aj+0.040
ifying+0.040
run+0.037
Neg logit contributions
,-0.025
 first-0.024
 right-0.023
 up-0.023
 hard-0.023
Token a
Token ID257
Feature activation-2.27e-02

Pos logit contributions
endant+0.114
cream+0.112
aj+0.110
ifying+0.109
run+0.105
Neg logit contributions
,-0.018
 first-0.014
 right-0.014
 hard-0.013
 up-0.013
Token pet
Token ID4273
Feature activation-6.24e-02

Pos logit contributions
endant+0.053
cream+0.052
aj+0.050
ifying+0.050
run+0.047
Neg logit contributions
,-0.030
 first-0.027
 right-0.027
 hard-0.027
 up-0.026
Token cat
Token ID3797
Feature activation-3.34e-02

Pos logit contributions
endant+0.134
aj+0.130
cream+0.127
ifying+0.123
run+0.118
Neg logit contributions
,-0.104
 first-0.090
 right-0.088
 up-0.086
 hard-0.085
Token named
Token ID3706
Feature activation-1.03e-01

Pos logit contributions
endant+0.074
cream+0.072
ifying+0.071
aj+0.070
ksh+0.066
Neg logit contributions
,-0.053
 first-0.044
 right-0.043
 up-0.043
 hard-0.043
Token Mitt
Token ID16627
Feature activation-5.22e-02

Pos logit contributions
endant+0.236
cream+0.219
aj+0.215
ifying+0.213
run+0.197
Neg logit contributions
,-0.167
 the-0.154
 first-0.152
 up-0.144
 right-0.142
Tokenens
Token ID641
Feature activation-2.07e-02

Pos logit contributions
endant+0.113
ifying+0.107
cream+0.101
aj+0.098
run+0.096
Neg logit contributions
,-0.093
 first-0.081
 right-0.081
 hard-0.077
 up-0.077
Token.
Token ID13
Feature activation-1.93e-02

Pos logit contributions
endant+0.061
cream+0.058
aj+0.057
ifying+0.057
run+0.055
Neg logit contributions
,-0.019
 first-0.015
 right-0.015
 up-0.014
 hard-0.013
Token Mitt
Token ID16627
Feature activation-3.41e-02

Pos logit contributions
endant+0.056
cream+0.054
ifying+0.051
aj+0.049
lets+0.048
Neg logit contributions
,-0.022
 the-0.016
 first-0.016
 right-0.015
 hard-0.014
Tokenens
Token ID641
Feature activation-1.91e-03

Pos logit contributions
endant+0.067
ifying+0.064
cream+0.061
aj+0.059
run+0.057
Neg logit contributions
,-0.064
 first-0.057
 right-0.057
 hard-0.056
 up-0.056
Token She
Token ID1375
Feature activation-1.60e-02

Pos logit contributions
endant+0.043
cream+0.043
aj+0.041
ifying+0.041
run+0.039
Neg logit contributions
,-0.014
 first-0.012
 right-0.012
 hard-0.011
 up-0.011
Token had
Token ID550
Feature activation-3.37e-02

Pos logit contributions
endant+0.037
cream+0.036
aj+0.035
ifying+0.035
run+0.033
Neg logit contributions
,-0.021
 first-0.020
 right-0.019
 up-0.019
 hard-0.019
Token a
Token ID257
Feature activation-2.26e-02

Pos logit contributions
endant+0.109
cream+0.106
aj+0.105
ifying+0.103
run+0.101
Neg logit contributions
,-0.015
 right-0.012
 first-0.012
 hard-0.011
 up-0.011
Token gray
Token ID12768
Feature activation-4.80e-02

Pos logit contributions
endant+0.053
cream+0.052
ifying+0.050
aj+0.050
run+0.047
Neg logit contributions
,-0.029
 first-0.027
 hard-0.026
 right-0.026
 up-0.026
Token cat
Token ID3797
Feature activation-3.16e-02

Pos logit contributions
endant+0.089
cream+0.086
aj+0.083
ifying+0.082
run+0.076
Neg logit contributions
,-0.083
 first-0.079
 right-0.077
 up-0.077
 hard-0.077
Token named
Token ID3706
Feature activation-1.03e-01

Pos logit contributions
endant+0.071
cream+0.069
ifying+0.067
aj+0.067
ksh+0.064
Neg logit contributions
,-0.050
 first-0.041
 up-0.040
 right-0.040
 hard-0.040
Token Mitt
Token ID16627
Feature activation-4.85e-02

Pos logit contributions
endant+0.248
cream+0.227
aj+0.224
ifying+0.223
run+0.204
Neg logit contributions
,-0.160
 the-0.148
 first-0.142
 right-0.133
 hard-0.132
Tokenens
Token ID641
Feature activation-1.98e-02

Pos logit contributions
endant+0.106
ifying+0.101
cream+0.094
aj+0.093
run+0.090
Neg logit contributions
,-0.086
 right-0.075
 first-0.075
 hard-0.072
 up-0.071
Token.
Token ID13
Feature activation-1.59e-02

Pos logit contributions
endant+0.057
cream+0.055
aj+0.054
ifying+0.054
run+0.051
Neg logit contributions
,-0.018
 first-0.015
 right-0.015
 up-0.014
 hard-0.014
Token One
Token ID1881
Feature activation-1.76e-02

Pos logit contributions
endant+0.045
cream+0.044
ifying+0.041
aj+0.040
lets+0.039
Neg logit contributions
,-0.019
 the-0.014
 first-0.014
 right-0.013
 hard-0.013
Token day
Token ID1110
Feature activation+2.37e-02

Pos logit contributions
endant+0.053
cream+0.051
aj+0.050
ifying+0.050
run+0.048
Neg logit contributions
,-0.014
 first-0.011
 hard-0.011
 up-0.011
 right-0.010
Token She
Token ID1375
Feature activation-1.76e-02

Pos logit contributions
endant+0.057
cream+0.055
aj+0.054
ifying+0.054
run+0.051
Neg logit contributions
,-0.021
 first-0.018
 right-0.018
 up-0.018
 hard-0.017
Token had
Token ID550
Feature activation-3.37e-02

Pos logit contributions
endant+0.039
cream+0.038
aj+0.037
ifying+0.037
run+0.034
Neg logit contributions
,-0.024
 first-0.023
 right-0.022
 up-0.022
 hard-0.021
Token a
Token ID257
Feature activation-2.22e-02

Pos logit contributions
endant+0.109
cream+0.107
aj+0.105
ifying+0.103
run+0.101
Neg logit contributions
,-0.015
 right-0.012
 first-0.012
 hard-0.011
 up-0.011
Token big
Token ID1263
Feature activation+2.75e-03

Pos logit contributions
endant+0.051
cream+0.050
aj+0.048
ifying+0.048
run+0.045
Neg logit contributions
,-0.030
 first-0.028
 hard-0.027
 right-0.027
 up-0.027
Token dog
Token ID3290
Feature activation-2.39e-02

Pos logit contributions
 overcome+0.004
ered+0.004
 up+0.004
 first+0.004
 over+0.004
Neg logit contributions
endant-0.005
aj-0.005
pron-0.005
ittens-0.004
eddy-0.004
Token named
Token ID3706
Feature activation-1.03e-01

Pos logit contributions
endant+0.054
cream+0.052
aj+0.051
ifying+0.051
ksh+0.049
Neg logit contributions
,-0.037
 first-0.031
 hard-0.030
 right-0.030
 up-0.030
Token Max
Token ID5436
Feature activation-1.27e-02

Pos logit contributions
endant+0.284
cream+0.260
ifying+0.254
aj+0.253
lets+0.237
Neg logit contributions
,-0.128
 first-0.115
 the-0.115
 right-0.106
 hard-0.102
Token who
Token ID508
Feature activation-5.74e-02

Pos logit contributions
endant+0.034
cream+0.032
ifying+0.032
aj+0.032
ksh+0.030
Neg logit contributions
,-0.015
 first-0.012
 right-0.012
 up-0.012
 hard-0.012
Token was
Token ID373
Feature activation-3.17e-03

Pos logit contributions
endant+0.153
ifying+0.147
aj+0.146
cream+0.138
ksh+0.137
Neg logit contributions
,-0.068
 hard-0.058
 up-0.055
 first-0.055
 the-0.054
Token very
Token ID845
Feature activation-7.03e-03

Pos logit contributions
endant+0.008
cream+0.008
aj+0.008
ifying+0.008
run+0.008
Neg logit contributions
,-0.003
 first-0.002
 up-0.002
 right-0.002
 top-0.002
Token hairy
Token ID43873
Feature activation-3.21e-02

Pos logit contributions
endant+0.017
cream+0.017
aj+0.016
ifying+0.016
run+0.015
Neg logit contributions
,-0.008
 first-0.008
 right-0.007
 up-0.007
 hard-0.007
Token She
Token ID1375
Feature activation-1.78e-02

Pos logit contributions
endant+0.048
cream+0.047
aj+0.045
ifying+0.045
run+0.043
Neg logit contributions
,-0.016
 first-0.014
 right-0.013
 hard-0.013
 up-0.013
Token had
Token ID550
Feature activation-3.55e-02

Pos logit contributions
endant+0.041
cream+0.039
aj+0.038
ifying+0.038
run+0.036
Neg logit contributions
,-0.024
 first-0.022
 right-0.021
 up-0.021
 hard-0.021
Token a
Token ID257
Feature activation-2.35e-02

Pos logit contributions
endant+0.114
cream+0.112
aj+0.110
ifying+0.108
run+0.105
Neg logit contributions
,-0.017
 right-0.013
 first-0.013
 hard-0.012
 up-0.012
Token pet
Token ID4273
Feature activation-6.34e-02

Pos logit contributions
endant+0.056
cream+0.055
ifying+0.053
aj+0.053
run+0.050
Neg logit contributions
,-0.030
 first-0.028
 hard-0.027
 right-0.027
 up-0.027
Token dog
Token ID3290
Feature activation-2.59e-02

Pos logit contributions
endant+0.137
aj+0.133
cream+0.129
ifying+0.124
run+0.121
Neg logit contributions
,-0.106
 first-0.092
 right-0.090
 up-0.088
 hard-0.086
Token named
Token ID3706
Feature activation-1.03e-01

Pos logit contributions
endant+0.059
cream+0.056
ifying+0.055
aj+0.055
ksh+0.052
Neg logit contributions
,-0.041
 first-0.034
 right-0.033
 up-0.033
 hard-0.033
Token Max
Token ID5436
Feature activation-1.13e-02

Pos logit contributions
endant+0.260
cream+0.240
aj+0.236
ifying+0.235
run+0.220
Neg logit contributions
,-0.141
 first-0.127
 the-0.127
 up-0.122
 right-0.120
Token.
Token ID13
Feature activation-2.08e-02

Pos logit contributions
endant+0.033
cream+0.032
aj+0.031
ifying+0.031
run+0.029
Neg logit contributions
,-0.011
 first-0.008
 right-0.008
 up-0.008
 hard-0.008
Token One
Token ID1881
Feature activation-1.94e-02

Pos logit contributions
endant+0.052
cream+0.051
ifying+0.048
aj+0.047
run+0.045
Neg logit contributions
,-0.028
 first-0.024
 the-0.023
 up-0.023
 hard-0.022
Token day
Token ID1110
Feature activation+2.14e-02

Pos logit contributions
endant+0.057
cream+0.055
aj+0.054
ifying+0.054
run+0.051
Neg logit contributions
,-0.016
 first-0.013
 up-0.013
 hard-0.013
 right-0.012
Token,
Token ID11
Feature activation-9.73e-03

Pos logit contributions
 first+0.002
 right+0.002
 up+0.001
 hard+0.000
,+0.000
Neg logit contributions
endant-0.071
cream-0.069
aj-0.069
ifying-0.069
ittens-0.066
Token She
Token ID1375
Feature activation-1.77e-02

Pos logit contributions
endant+0.053
cream+0.053
aj+0.050
ifying+0.050
run+0.048
Neg logit contributions
,-0.018
 first-0.015
 right-0.014
 hard-0.014
 up-0.014
Token had
Token ID550
Feature activation-3.40e-02

Pos logit contributions
endant+0.039
cream+0.038
aj+0.037
ifying+0.036
run+0.034
Neg logit contributions
,-0.024
 first-0.023
 up-0.022
 right-0.022
 hard-0.022
Token a
Token ID257
Feature activation-2.19e-02

Pos logit contributions
endant+0.111
cream+0.108
aj+0.106
ifying+0.104
run+0.102
Neg logit contributions
,-0.015
 right-0.012
 first-0.012
 hard-0.011
 up-0.010
Token pet
Token ID4273
Feature activation-6.20e-02

Pos logit contributions
endant+0.051
cream+0.050
aj+0.049
ifying+0.048
run+0.046
Neg logit contributions
,-0.028
 first-0.026
 hard-0.025
 right-0.025
 up-0.025
Token dog
Token ID3290
Feature activation-2.47e-02

Pos logit contributions
endant+0.132
aj+0.127
cream+0.125
ifying+0.119
run+0.115
Neg logit contributions
,-0.105
 first-0.092
 right-0.089
 up-0.088
 hard-0.086
Token named
Token ID3706
Feature activation-1.02e-01

Pos logit contributions
endant+0.057
cream+0.054
ifying+0.053
aj+0.053
ksh+0.050
Neg logit contributions
,-0.039
 first-0.032
 right-0.031
 up-0.031
 hard-0.031
Token Max
Token ID5436
Feature activation-1.00e-02

Pos logit contributions
endant+0.259
cream+0.239
aj+0.233
ifying+0.233
run+0.217
Neg logit contributions
,-0.143
 first-0.129
 the-0.128
 up-0.123
 right-0.122
Token.
Token ID13
Feature activation-2.15e-02

Pos logit contributions
endant+0.029
cream+0.028
aj+0.028
ifying+0.028
run+0.026
Neg logit contributions
,-0.010
 first-0.007
 right-0.007
 up-0.007
 hard-0.007
Token Max
Token ID5436
Feature activation+9.46e-03

Pos logit contributions
endant+0.056
cream+0.054
ifying+0.051
aj+0.050
run+0.048
Neg logit contributions
,-0.029
 first-0.024
 the-0.023
 up-0.023
 right-0.022
Token was
Token ID373
Feature activation-2.04e-03

Pos logit contributions
 first+0.012
 right+0.012
,+0.012
 top+0.012
 bottom+0.011
Neg logit contributions
endant-0.020
cream-0.020
aj-0.019
ifying-0.018
lets-0.018
Token her
Token ID607
Feature activation-4.37e-02

Pos logit contributions
endant+0.005
cream+0.005
aj+0.005
ifying+0.005
run+0.005
Neg logit contributions
,-0.002
 first-0.001
 up-0.001
 top-0.001
 bottom-0.001
Token had
Token ID550
Feature activation-3.31e-02

Pos logit contributions
endant+0.038
cream+0.037
aj+0.035
ifying+0.035
run+0.033
Neg logit contributions
,-0.023
 first-0.022
 right-0.022
 up-0.022
 hard-0.021
Token a
Token ID257
Feature activation-2.23e-02

Pos logit contributions
endant+0.106
cream+0.104
aj+0.102
ifying+0.101
run+0.098
Neg logit contributions
,-0.016
 first-0.012
 right-0.012
 hard-0.012
 up-0.011
Token t
Token ID256
Feature activation-8.76e-02

Pos logit contributions
endant+0.052
cream+0.051
aj+0.049
ifying+0.049
run+0.047
Neg logit contributions
,-0.029
 first-0.026
 hard-0.026
 right-0.025
 up-0.025
Tokeneddy
Token ID21874
Feature activation-4.46e-02

Pos logit contributions
endant+0.064
aj+0.063
ksh+0.061
ifying+0.057
cream+0.056
Neg logit contributions
,-0.241
 up-0.233
 first-0.230
 very-0.223
 hard-0.219
Token bear
Token ID6842
Feature activation-3.76e-02

Pos logit contributions
endant+0.106
cream+0.102
ifying+0.101
aj+0.100
run+0.095
Neg logit contributions
,-0.066
 first-0.056
 up-0.055
 right-0.054
 hard-0.054
Token named
Token ID3706
Feature activation-1.02e-01

Pos logit contributions
endant+0.094
aj+0.093
ksh+0.090
ifying+0.088
cream+0.088
Neg logit contributions
,-0.055
 very-0.043
 first-0.041
 right-0.040
 hard-0.040
Token Teddy
Token ID29345
Feature activation-3.01e-02

Pos logit contributions
endant+0.225
ifying+0.204
cream+0.203
aj+0.203
run+0.187
Neg logit contributions
,-0.177
 the-0.164
 first-0.161
 up-0.152
 right-0.151
Token,
Token ID11
Feature activation-5.19e-02

Pos logit contributions
endant+0.090
ifying+0.085
aj+0.085
cream+0.085
run+0.081
Neg logit contributions
,-0.028
 first-0.022
 right-0.020
 up-0.020
 hard-0.019
Token who
Token ID508
Feature activation-4.84e-02

Pos logit contributions
endant+0.126
ifying+0.122
cream+0.121
aj+0.118
ksh+0.114
Neg logit contributions
,-0.074
 the-0.065
 right-0.061
 first-0.061
 hard-0.060
Token was
Token ID373
Feature activation-4.78e-03

Pos logit contributions
endant+0.131
ifying+0.129
aj+0.128
ksh+0.122
run+0.121
Neg logit contributions
,-0.055
 hard-0.046
 very-0.044
 the-0.044
 up-0.043
Token her
Token ID607
Feature activation-5.05e-02

Pos logit contributions
endant+0.015
cream+0.014
aj+0.014
ifying+0.014
run+0.013
Neg logit contributions
,-0.003
 first-0.002
 right-0.002
 up-0.002
 hard-0.002
Token She
Token ID1375
Feature activation-1.66e-02

Pos logit contributions
endant+0.049
cream+0.048
aj+0.046
ifying+0.046
run+0.044
Neg logit contributions
,-0.017
 first-0.015
 right-0.015
 hard-0.014
 up-0.014
Token had
Token ID550
Feature activation-3.45e-02

Pos logit contributions
endant+0.037
cream+0.036
aj+0.034
ifying+0.034
run+0.032
Neg logit contributions
,-0.023
 first-0.022
 right-0.021
 up-0.021
 hard-0.020
Token a
Token ID257
Feature activation-2.25e-02

Pos logit contributions
endant+0.109
cream+0.106
aj+0.104
ifying+0.103
run+0.100
Neg logit contributions
,-0.017
 first-0.014
 right-0.014
 hard-0.013
 up-0.013
Token pet
Token ID4273
Feature activation-6.29e-02

Pos logit contributions
endant+0.054
cream+0.052
aj+0.051
ifying+0.050
run+0.048
Neg logit contributions
,-0.028
 first-0.026
 right-0.025
 hard-0.025
 up-0.025
Token cat
Token ID3797
Feature activation-3.25e-02

Pos logit contributions
endant+0.133
aj+0.128
cream+0.126
ifying+0.121
run+0.117
Neg logit contributions
,-0.105
 first-0.093
 right-0.090
 up-0.089
 hard-0.088
Token named
Token ID3706
Feature activation-1.02e-01

Pos logit contributions
endant+0.071
cream+0.069
ifying+0.067
aj+0.067
ksh+0.063
Neg logit contributions
,-0.052
 first-0.044
 right-0.043
 up-0.043
 hard-0.043
Token Mitt
Token ID16627
Feature activation-4.99e-02

Pos logit contributions
endant+0.227
cream+0.212
aj+0.206
ifying+0.205
run+0.189
Neg logit contributions
,-0.171
 the-0.156
 first-0.156
 right-0.148
 up-0.148
Tokenens
Token ID641
Feature activation-1.92e-02

Pos logit contributions
endant+0.106
ifying+0.100
cream+0.096
aj+0.093
run+0.090
Neg logit contributions
,-0.090
 first-0.079
 right-0.079
 up-0.075
 hard-0.075
Token.
Token ID13
Feature activation-1.88e-02

Pos logit contributions
endant+0.055
cream+0.053
aj+0.052
ifying+0.052
run+0.050
Neg logit contributions
,-0.018
 first-0.015
 right-0.014
 up-0.014
 hard-0.013
Token One
Token ID1881
Feature activation-1.83e-02

Pos logit contributions
endant+0.054
cream+0.052
ifying+0.049
aj+0.047
lets+0.046
Neg logit contributions
,-0.022
 the-0.017
 first-0.016
 right-0.015
 hard-0.015
Token day
Token ID1110
Feature activation+2.24e-02

Pos logit contributions
endant+0.053
cream+0.051
aj+0.050
ifying+0.050
run+0.048
Neg logit contributions
,-0.015
 first-0.013
 hard-0.012
 up-0.012
 right-0.012
Token had
Token ID550
Feature activation-3.60e-02

Pos logit contributions
endant+0.039
cream+0.038
aj+0.037
ifying+0.036
run+0.034
Neg logit contributions
,-0.023
 first-0.022
 up-0.021
 right-0.021
 hard-0.021
Token a
Token ID257
Feature activation-2.33e-02

Pos logit contributions
endant+0.115
cream+0.113
aj+0.111
ifying+0.109
run+0.106
Neg logit contributions
,-0.017
 first-0.014
 right-0.014
 hard-0.013
 up-0.012
Token t
Token ID256
Feature activation-8.78e-02

Pos logit contributions
endant+0.054
cream+0.053
aj+0.051
ifying+0.051
run+0.048
Neg logit contributions
,-0.031
 first-0.028
 hard-0.027
 right-0.027
 up-0.027
Tokeneddy
Token ID21874
Feature activation-4.48e-02

Pos logit contributions
endant+0.068
aj+0.067
ksh+0.064
ifying+0.060
cream+0.059
Neg logit contributions
,-0.238
 up-0.228
 first-0.226
 very-0.220
 hard-0.215
Token bear
Token ID6842
Feature activation-3.79e-02

Pos logit contributions
endant+0.107
cream+0.103
ifying+0.102
aj+0.101
run+0.096
Neg logit contributions
,-0.066
 first-0.055
 up-0.054
 right-0.054
 hard-0.054
Token named
Token ID3706
Feature activation-1.02e-01

Pos logit contributions
endant+0.095
aj+0.093
ksh+0.090
ifying+0.089
cream+0.088
Neg logit contributions
,-0.055
 very-0.044
 first-0.042
 right-0.041
:-0.040
Token Teddy
Token ID29345
Feature activation-2.93e-02

Pos logit contributions
endant+0.228
ifying+0.207
cream+0.205
aj+0.203
run+0.192
Neg logit contributions
,-0.176
 the-0.166
 first-0.162
 up-0.151
 right-0.150
Token.
Token ID13
Feature activation-1.87e-02

Pos logit contributions
endant+0.087
aj+0.083
ifying+0.083
cream+0.083
run+0.079
Neg logit contributions
,-0.027
 first-0.021
 right-0.020
 up-0.019
 hard-0.018
Token One
Token ID1881
Feature activation-2.03e-02

Pos logit contributions
endant+0.049
cream+0.049
ifying+0.044
aj+0.043
ppers+0.043
Neg logit contributions
,-0.025
 the-0.020
 first-0.019
 right-0.019
 up-0.019
Token day
Token ID1110
Feature activation+2.19e-02

Pos logit contributions
endant+0.061
cream+0.058
aj+0.058
ifying+0.058
run+0.055
Neg logit contributions
,-0.016
 first-0.013
 hard-0.012
 up-0.012
 right-0.012
Token,
Token ID11
Feature activation-8.25e-03

Pos logit contributions
 first+0.001
 right+0.000
,+0.000
 up-0.000
 hard-0.001
Neg logit contributions
endant-0.075
cream-0.073
aj-0.073
ifying-0.072
ittens-0.070
Token.
Token ID13
Feature activation-1.71e-02

Pos logit contributions
endant+0.014
cream+0.013
ifying+0.013
aj+0.013
run+0.012
Neg logit contributions
,-0.004
 first-0.003
 up-0.003
 right-0.003
 hard-0.003
Token She
Token ID1375
Feature activation-1.75e-02

Pos logit contributions
endant+0.047
cream+0.046
ifying+0.044
aj+0.044
run+0.042
Neg logit contributions
,-0.017
 first-0.014
 right-0.014
 hard-0.014
 up-0.014
Token had
Token ID550
Feature activation-3.37e-02

Pos logit contributions
endant+0.038
cream+0.037
aj+0.036
ifying+0.035
run+0.033
Neg logit contributions
,-0.024
 first-0.023
 up-0.022
 right-0.022
 hard-0.022
Token a
Token ID257
Feature activation-2.21e-02

Pos logit contributions
endant+0.108
cream+0.105
aj+0.103
ifying+0.102
run+0.099
Neg logit contributions
,-0.016
 first-0.013
 right-0.013
 hard-0.012
 up-0.012
Token kitten
Token ID42143
Feature activation-4.88e-02

Pos logit contributions
endant+0.053
cream+0.052
ifying+0.050
aj+0.050
run+0.047
Neg logit contributions
,-0.028
 first-0.025
 hard-0.025
 right-0.025
 up-0.024
Token named
Token ID3706
Feature activation-1.02e-01

Pos logit contributions
endant+0.108
cream+0.105
aj+0.103
ifying+0.101
ksh+0.096
Neg logit contributions
,-0.077
 first-0.066
 right-0.064
 up-0.064
 hard-0.064
Token Mitt
Token ID16627
Feature activation-4.99e-02

Pos logit contributions
endant+0.228
cream+0.216
aj+0.208
ifying+0.200
run+0.187
Neg logit contributions
,-0.171
 the-0.158
 first-0.158
 right-0.151
 up-0.150
Tokenens
Token ID641
Feature activation-2.04e-02

Pos logit contributions
endant+0.108
ifying+0.103
cream+0.097
aj+0.094
run+0.092
Neg logit contributions
,-0.089
 first-0.078
 right-0.078
 hard-0.074
 up-0.074
Token.
Token ID13
Feature activation-1.71e-02

Pos logit contributions
endant+0.057
cream+0.055
aj+0.054
ifying+0.054
run+0.051
Neg logit contributions
,-0.021
 first-0.017
 right-0.017
 up-0.016
 hard-0.016
Token Mitt
Token ID16627
Feature activation-3.28e-02

Pos logit contributions
endant+0.049
cream+0.048
ifying+0.044
aj+0.043
lets+0.042
Neg logit contributions
,-0.020
 the-0.015
 first-0.015
 right-0.014
 hard-0.014
Tokenens
Token ID641
Feature activation-8.19e-04

Pos logit contributions
endant+0.064
ifying+0.061
cream+0.059
aj+0.056
run+0.054
Neg logit contributions
,-0.062
 first-0.055
 right-0.055
 hard-0.054
 up-0.054
Token She
Token ID1375
Feature activation-1.85e-02

Pos logit contributions
endant+0.048
cream+0.048
aj+0.045
ifying+0.045
run+0.043
Neg logit contributions
,-0.017
 first-0.014
 right-0.014
 hard-0.014
 up-0.014
Token had
Token ID550
Feature activation-3.62e-02

Pos logit contributions
endant+0.042
cream+0.041
aj+0.040
ifying+0.040
run+0.037
Neg logit contributions
,-0.025
 first-0.023
 right-0.022
 up-0.022
 hard-0.022
Token a
Token ID257
Feature activation-2.32e-02

Pos logit contributions
endant+0.118
cream+0.115
aj+0.113
ifying+0.111
run+0.109
Neg logit contributions
,-0.016
 right-0.013
 first-0.013
 hard-0.012
 up-0.011
Token pet
Token ID4273
Feature activation-6.31e-02

Pos logit contributions
endant+0.055
cream+0.054
aj+0.052
ifying+0.052
run+0.049
Neg logit contributions
,-0.029
 first-0.027
 right-0.026
 up-0.026
 hard-0.026
Token dog
Token ID3290
Feature activation-2.57e-02

Pos logit contributions
endant+0.134
aj+0.129
cream+0.127
ifying+0.122
run+0.117
Neg logit contributions
,-0.106
 first-0.093
 right-0.091
 up-0.089
 hard-0.088
Token named
Token ID3706
Feature activation-1.02e-01

Pos logit contributions
endant+0.058
cream+0.055
aj+0.055
ifying+0.054
ksh+0.051
Neg logit contributions
,-0.041
 first-0.034
 hard-0.033
 right-0.033
 up-0.033
Token Max
Token ID5436
Feature activation-1.24e-02

Pos logit contributions
endant+0.253
cream+0.235
aj+0.231
ifying+0.228
run+0.213
Neg logit contributions
,-0.142
 first-0.128
 the-0.125
 up-0.122
 right-0.121
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
endant+0.036
cream+0.035
aj+0.034
ifying+0.034
run+0.032
Neg logit contributions
,-0.012
 first-0.009
 right-0.009
 up-0.008
 hard-0.008
Token One
Token ID1881
Feature activation-1.99e-02

Pos logit contributions
endant+0.051
cream+0.049
ifying+0.047
aj+0.046
run+0.043
Neg logit contributions
,-0.027
 first-0.022
 the-0.022
 hard-0.021
 up-0.021
Token day
Token ID1110
Feature activation+2.13e-02

Pos logit contributions
endant+0.059
cream+0.056
aj+0.056
ifying+0.055
run+0.052
Neg logit contributions
,-0.016
 first-0.014
 hard-0.013
 up-0.013
 right-0.013
Token,
Token ID11
Feature activation-1.10e-02

Pos logit contributions
 first+0.004
 right+0.004
 up+0.003
 opposite+0.001
 hard+0.001
Neg logit contributions
endant-0.068
cream-0.067
aj-0.067
ifying-0.066
ittens-0.065
Token a
Token ID257
Feature activation-2.34e-02

Pos logit contributions
endant+0.110
cream+0.107
aj+0.106
ifying+0.104
run+0.102
Neg logit contributions
,-0.016
 right-0.012
 first-0.012
 hard-0.012
 up-0.011
Token favorite
Token ID4004
Feature activation-3.69e-02

Pos logit contributions
endant+0.055
cream+0.054
ifying+0.051
aj+0.051
run+0.049
Neg logit contributions
,-0.031
 first-0.028
 hard-0.028
 right-0.027
 up-0.027
Token t
Token ID256
Feature activation-9.04e-02

Pos logit contributions
endant+0.078
cream+0.076
ifying+0.074
aj+0.073
run+0.069
Neg logit contributions
,-0.059
 first-0.054
 hard-0.052
 up-0.052
 right-0.052
Tokeneddy
Token ID21874
Feature activation-4.38e-02

Pos logit contributions
endant+0.072
aj+0.072
ksh+0.070
ifying+0.065
cream+0.060
Neg logit contributions
,-0.249
 up-0.236
 first-0.230
 very-0.226
:-0.220
Token bear
Token ID6842
Feature activation-3.93e-02

Pos logit contributions
endant+0.108
ifying+0.103
cream+0.103
aj+0.102
run+0.098
Neg logit contributions
,-0.064
 first-0.052
 up-0.051
 right-0.050
 hard-0.050
Token named
Token ID3706
Feature activation-1.02e-01

Pos logit contributions
aj+0.104
endant+0.103
ksh+0.101
ifying+0.100
cream+0.096
Neg logit contributions
,-0.053
 very-0.039
:-0.039
 first-0.036
 hard-0.035
Token Brown
Token ID4373
Feature activation-5.37e-02

Pos logit contributions
endant+0.216
ifying+0.192
aj+0.191
cream+0.189
run+0.179
Neg logit contributions
,-0.191
 the-0.175
 first-0.169
 up-0.161
 right-0.161
Tokenie
Token ID494
Feature activation-3.15e-02

Pos logit contributions
endant+0.135
ifying+0.128
aj+0.126
cream+0.123
run+0.123
Neg logit contributions
,-0.080
 first-0.068
 up-0.064
 the-0.062
 very-0.062
Token.
Token ID13
Feature activation-2.20e-02

Pos logit contributions
endant+0.096
aj+0.092
cream+0.091
ifying+0.090
run+0.087
Neg logit contributions
,-0.027
 first-0.020
 right-0.018
:-0.017
 hard-0.017
Token One
Token ID1881
Feature activation-1.92e-02

Pos logit contributions
endant+0.055
ifying+0.053
aj+0.052
cream+0.052
run+0.050
Neg logit contributions
,-0.030
 the-0.024
 first-0.024
 right-0.024
 up-0.023
Token day
Token ID1110
Feature activation+2.24e-02

Pos logit contributions
endant+0.059
ifying+0.056
aj+0.056
cream+0.056
run+0.053
Neg logit contributions
,-0.015
 first-0.011
 hard-0.011
 up-0.011
 right-0.010
Token She
Token ID1375
Feature activation-1.84e-02

Pos logit contributions
endant+0.046
cream+0.046
ifying+0.044
aj+0.044
run+0.042
Neg logit contributions
,-0.016
 first-0.013
 right-0.013
 hard-0.012
 up-0.012
Token had
Token ID550
Feature activation-3.42e-02

Pos logit contributions
endant+0.042
cream+0.041
aj+0.040
ifying+0.039
run+0.037
Neg logit contributions
,-0.025
 first-0.023
 right-0.022
 up-0.022
 hard-0.022
Token a
Token ID257
Feature activation-2.14e-02

Pos logit contributions
endant+0.111
cream+0.108
aj+0.107
ifying+0.105
run+0.103
Neg logit contributions
,-0.015
 first-0.012
 right-0.012
 hard-0.011
 up-0.011
Token pet
Token ID4273
Feature activation-6.24e-02

Pos logit contributions
endant+0.050
cream+0.050
ifying+0.048
aj+0.047
run+0.045
Neg logit contributions
,-0.028
 first-0.026
 hard-0.025
 right-0.025
 up-0.024
Token dog
Token ID3290
Feature activation-2.60e-02

Pos logit contributions
endant+0.135
aj+0.132
cream+0.128
ifying+0.123
run+0.119
Neg logit contributions
,-0.104
 first-0.090
 right-0.088
 up-0.086
 hard-0.084
Token named
Token ID3706
Feature activation-1.02e-01

Pos logit contributions
endant+0.061
cream+0.058
ifying+0.058
aj+0.057
ksh+0.055
Neg logit contributions
,-0.041
 first-0.033
 right-0.032
:-0.032
 hard-0.032
Token Max
Token ID5436
Feature activation-1.00e-02

Pos logit contributions
endant+0.259
cream+0.240
aj+0.234
ifying+0.234
run+0.217
Neg logit contributions
,-0.140
 first-0.127
 the-0.125
 up-0.120
 right-0.119
Token.
Token ID13
Feature activation-2.06e-02

Pos logit contributions
endant+0.030
cream+0.028
ifying+0.028
aj+0.028
run+0.026
Neg logit contributions
,-0.010
 first-0.007
 right-0.007
 up-0.007
 the-0.007
Token Max
Token ID5436
Feature activation+8.91e-03

Pos logit contributions
endant+0.054
cream+0.052
ifying+0.049
aj+0.048
paste+0.046
Neg logit contributions
,-0.027
 first-0.022
 the-0.022
 up-0.021
 right-0.021
Token loved
Token ID6151
Feature activation-2.45e-02

Pos logit contributions
 first+0.012
 right+0.011
,+0.011
 up+0.011
 top+0.011
Neg logit contributions
endant-0.019
cream-0.019
aj-0.018
ifying-0.017
lets-0.017
Token to
Token ID284
Feature activation-2.11e-02

Pos logit contributions
endant+0.070
cream+0.067
ifying+0.067
aj+0.067
run+0.063
Neg logit contributions
,-0.020
 first-0.018
 up-0.017
 hard-0.017
 right-0.017
Token sneak
Token ID20528
Feature activation+3.82e-03

Pos logit contributions
endant+0.028
cream+0.027
ifying+0.026
aj+0.026
run+0.024
Neg logit contributions
 hard-0.012
,-0.012
 up-0.012
 best-0.012
 first-0.011
Token around
Token ID1088
Feature activation+2.81e-02

Pos logit contributions
 best+0.003
,+0.003
 first+0.003
 bottom+0.003
 end+0.003
Neg logit contributions
cream-0.010
ifying-0.009
endant-0.009
aj-0.009
run-0.009
Token when
Token ID618
Feature activation+3.96e-02

Pos logit contributions
 first+0.025
,+0.024
 hard+0.024
 up+0.023
 right+0.023
Neg logit contributions
cream-0.066
endant-0.066
ifying-0.065
aj-0.064
run-0.064
Token a
Token ID257
Feature activation-8.20e-03

Pos logit contributions
 right+0.060
 overcome+0.057
 up+0.056
 best+0.056
ap+0.056
Neg logit contributions
cream-0.063
ifying-0.063
endant-0.059
run-0.054
jar-0.052
Token human
Token ID1692
Feature activation+6.39e-03

Pos logit contributions
endant+0.022
cream+0.021
aj+0.021
ifying+0.020
ksh+0.020
Neg logit contributions
,-0.009
 hard-0.008
 first-0.008
 right-0.007
 up-0.007
Token stopped
Token ID5025
Feature activation+3.31e-02

Pos logit contributions
,+0.008
 first+0.008
 up+0.008
 right+0.007
 hard+0.007
Neg logit contributions
endant-0.015
cream-0.014
ifying-0.014
aj-0.014
run-0.013
Token him
Token ID683
Feature activation+6.44e-03

Pos logit contributions
 first+0.035
 hard+0.034
,+0.034
 best+0.033
 right+0.032
Neg logit contributions
endant-0.072
ifying-0.072
cream-0.071
run-0.069
aj-0.069
Token.
Token ID13
Feature activation-1.23e-03

Pos logit contributions
,+0.006
 first+0.005
 right+0.005
 up+0.005
 hard+0.005
Neg logit contributions
endant-0.018
cream-0.017
aj-0.017
ifying-0.017
run-0.016
Token
Token ID220
Feature activation-1.99e-02

Pos logit contributions
cream+0.003
endant+0.002
ifying+0.002
run+0.002
aj+0.002
Neg logit contributions
 best-0.002
 first-0.002
 hard-0.002
 up-0.002
 right-0.002
Token
Token ID198
Feature activation-2.10e-02

Pos logit contributions
endant+0.051
cream+0.049
aj+0.048
ifying+0.048
run+0.046
Neg logit contributions
,-0.021
 first-0.019
 right-0.018
 up-0.018
 hard-0.018
Token
Token ID198
Feature activation-2.03e-02

Pos logit contributions
cream+0.042
endant+0.039
run+0.039
aj+0.039
ittens+0.037
Neg logit contributions
 best-0.028
 right-0.027
 hard-0.027
 up-0.027
 first-0.027
Token not
Token ID407
Feature activation+9.79e-03

Pos logit contributions
,+0.006
 up+0.005
 first+0.005
 right+0.005
 hard+0.004
Neg logit contributions
endant-0.016
cream-0.015
ifying-0.015
aj-0.015
run-0.014
Token many
Token ID867
Feature activation+5.73e-03

Pos logit contributions
,+0.009
 first+0.008
 up+0.008
 right+0.008
 hard+0.008
Neg logit contributions
endant-0.026
cream-0.025
aj-0.024
ifying-0.024
run-0.023
Token leaves
Token ID5667
Feature activation-1.64e-02

Pos logit contributions
,+0.008
 first+0.008
 up+0.008
 right+0.007
 hard+0.007
Neg logit contributions
endant-0.012
cream-0.012
aj-0.012
ifying-0.011
run-0.011
Token left
Token ID1364
Feature activation+3.43e-03

Pos logit contributions
endant+0.046
cream+0.045
aj+0.044
ifying+0.043
run+0.041
Neg logit contributions
,-0.015
 first-0.013
 right-0.013
 up-0.012
 hard-0.012
Token on
Token ID319
Feature activation+2.95e-02

Pos logit contributions
,+0.003
 first+0.003
 right+0.003
 up+0.003
 hard+0.003
Neg logit contributions
endant-0.009
cream-0.009
aj-0.009
ifying-0.009
run-0.008
Token the
Token ID262
Feature activation+3.62e-02

Pos logit contributions
,+0.013
 first+0.011
 right+0.011
 up+0.010
 hard+0.009
Neg logit contributions
endant-0.090
cream-0.087
ifying-0.087
aj-0.083
run-0.081
Token tree
Token ID5509
Feature activation+7.47e-03

Pos logit contributions
,+0.050
 up+0.047
ered+0.047
 first+0.046
:+0.046
Neg logit contributions
endant-0.073
ifying-0.068
cream-0.068
aj-0.063
lets-0.059
Token.
Token ID13
Feature activation+7.09e-03

Pos logit contributions
,+0.006
 first+0.005
 right+0.005
 up+0.005
 hard+0.004
Neg logit contributions
endant-0.022
cream-0.022
aj-0.021
ifying-0.021
ittens-0.020
Token Most
Token ID4042
Feature activation-4.10e-03

Pos logit contributions
,+0.006
 right+0.005
 the+0.005
 first+0.004
:+0.004
Neg logit contributions
endant-0.021
cream-0.020
aj-0.019
pron-0.019
ifying-0.019
Token of
Token ID286
Feature activation+9.31e-03

Pos logit contributions
endant+0.009
cream+0.009
aj+0.008
ifying+0.008
run+0.008
Neg logit contributions
,-0.006
 first-0.006
 right-0.005
 up-0.005
 hard-0.005
Token them
Token ID606
Feature activation+1.14e-02

Pos logit contributions
,+0.005
 first+0.004
 up+0.004
 right+0.004
 hard+0.004
Neg logit contributions
endant-0.028
cream-0.028
aj-0.027
ifying-0.027
run-0.026
Token€
Token ID26391
Feature activation+3.65e-02

Pos logit contributions
,+0.029
 first+0.027
 right+0.027
 up+0.027
 hard+0.026
Neg logit contributions
endant-0.033
cream-0.032
aj-0.031
ifying-0.030
run-0.028
Token�
Token ID129
Feature activation+1.57e-03

Pos logit contributions
,+0.050
 first+0.045
 right+0.044
 up+0.044
 hard+0.044
Neg logit contributions
endant-0.083
cream-0.081
ifying-0.078
aj-0.078
run-0.073
Token�
Token ID241
Feature activation+8.98e-03

Pos logit contributions
,+0.003
 first+0.002
 hard+0.002
 up+0.002
 best+0.002
Neg logit contributions
endant-0.003
cream-0.003
aj-0.003
ifying-0.003
ksh-0.002
TokenYou
Token ID1639
Feature activation-1.19e-02

Pos logit contributions
,+0.012
 first+0.012
 right+0.012
 hard+0.011
 the+0.011
Neg logit contributions
endant-0.021
aj-0.020
cream-0.020
ifying-0.019
lets-0.018
Tokenâ
Token ID22940
Feature activation+1.13e-02

Pos logit contributions
endant+0.026
cream+0.025
ifying+0.024
aj+0.024
run+0.022
Neg logit contributions
,-0.017
 first-0.016
 up-0.015
 hard-0.015
 right-0.015
Token€
Token ID26391
Feature activation+3.15e-02

Pos logit contributions
,+0.016
 first+0.015
 up+0.015
 hard+0.015
 right+0.014
Neg logit contributions
endant-0.024
cream-0.024
ifying-0.023
aj-0.023
run-0.021
Token™
Token ID8151
Feature activation+3.74e-02

Pos logit contributions
,+0.042
 first+0.035
 right+0.034
 up+0.033
 hard+0.033
Neg logit contributions
endant-0.076
cream-0.073
ifying-0.071
aj-0.070
ksh-0.067
Tokenre
Token ID260
Feature activation+1.61e-03

Pos logit contributions
 up+0.058
,+0.057
 hard+0.057
 right+0.056
 first+0.056
Neg logit contributions
endant-0.069
cream-0.068
ifying-0.068
aj-0.067
run-0.065
Token welcome
Token ID7062
Feature activation-6.67e-03

Pos logit contributions
,+0.001
 up+0.001
 first+0.001
 opposite+0.001
 bottom+0.001
Neg logit contributions
endant-0.004
cream-0.004
ifying-0.004
run-0.004
aj-0.004
Token!
Token ID0
Feature activation+1.39e-02

Pos logit contributions
endant+0.016
cream+0.016
ifying+0.016
aj+0.015
run+0.015
Neg logit contributions
,-0.007
 first-0.007
 hard-0.007
 up-0.007
 right-0.006
Tokenâ
Token ID22940
Feature activation+1.53e-02

Pos logit contributions
,+0.018
 right+0.016
 first+0.016
 up+0.014
 the+0.014
Neg logit contributions
endant-0.035
cream-0.033
aj-0.033
ifying-0.032
pron-0.030
Token
Token ID198
Feature activation-5.70e-03

Pos logit contributions
,+0.013
 first+0.013
 up+0.012
 hard+0.012
 right+0.012
Neg logit contributions
endant-0.031
cream-0.030
aj-0.029
ifying-0.029
run-0.028
Token
Token ID198
Feature activation-4.49e-03

Pos logit contributions
endant+0.014
cream+0.013
aj+0.013
ifying+0.013
run+0.012
Neg logit contributions
,-0.006
 first-0.006
 right-0.006
 up-0.006
 hard-0.006
TokenThey
Token ID2990
Feature activation+1.50e-02

Pos logit contributions
endant+0.008
cream+0.007
aj+0.007
ifying+0.007
ittens+0.007
Neg logit contributions
 up-0.008
,-0.008
 first-0.008
 right-0.008
 hard-0.007
Token look
Token ID804
Feature activation+2.99e-02

Pos logit contributions
,+0.016
 up+0.015
 hard+0.015
 best+0.015
 first+0.015
Neg logit contributions
endant-0.033
ifying-0.032
cream-0.031
aj-0.030
lets-0.029
Token at
Token ID379
Feature activation-1.14e-02

Pos logit contributions
 first+0.034
,+0.033
 bottom+0.032
 top+0.031
 hard+0.030
Neg logit contributions
endant-0.069
cream-0.066
aj-0.066
ifying-0.065
run-0.061
Token each
Token ID1123
Feature activation+4.24e-02

Pos logit contributions
endant+0.035
cream+0.034
aj+0.033
ifying+0.033
run+0.031
Neg logit contributions
,-0.008
 first-0.006
 up-0.006
 hard-0.006
 right-0.006
Token other
Token ID584
Feature activation+1.74e-02

Pos logit contributions
,+0.026
 up+0.022
 right+0.022
 first+0.022
 hard+0.021
Neg logit contributions
endant-0.118
cream-0.117
ifying-0.116
aj-0.112
run-0.111
Token and
Token ID290
Feature activation-2.51e-02

Pos logit contributions
,+0.015
 first+0.013
 right+0.012
 up+0.012
 hard+0.012
Neg logit contributions
endant-0.050
cream-0.048
aj-0.047
ifying-0.047
run-0.045
Token nod
Token ID18666
Feature activation+8.61e-03

Pos logit contributions
endant+0.059
cream+0.055
ifying+0.055
aj+0.054
lets+0.052
Neg logit contributions
,-0.026
 first-0.026
 up-0.025
 hard-0.024
 right-0.024
Token again
Token ID757
Feature activation+4.26e-02

Pos logit contributions
 first+0.009
 up+0.008
,+0.008
 top+0.008
 hard+0.008
Neg logit contributions
endant-0.019
ifying-0.019
aj-0.019
cream-0.019
run-0.018
Token,
Token ID11
Feature activation-4.28e-03

Pos logit contributions
,+0.039
 first+0.033
 right+0.031
 hard+0.030
 up+0.030
Neg logit contributions
endant-0.121
cream-0.118
ifying-0.114
aj-0.113
run-0.107
Token deal
Token ID1730
Feature activation+1.16e-02

Pos logit contributions
 first+0.008
,+0.008
 up+0.008
 right+0.007
:+0.007
Neg logit contributions
endant-0.018
cream-0.017
aj-0.017
ifying-0.016
run-0.016
Token and
Token ID290
Feature activation-1.95e-02

Pos logit contributions
,+0.010
 first+0.009
 right+0.008
 up+0.008
 hard+0.008
Neg logit contributions
endant-0.033
cream-0.032
aj-0.031
ifying-0.031
run-0.029
Token he
Token ID339
Feature activation+1.10e-02

Pos logit contributions
endant+0.047
cream+0.045
aj+0.044
ifying+0.044
ittens+0.041
Neg logit contributions
,-0.025
 first-0.023
 right-0.022
 up-0.022
 hard-0.022
Token soon
Token ID2582
Feature activation+1.90e-02

Pos logit contributions
,+0.015
 first+0.014
 right+0.013
 up+0.013
 hard+0.013
Neg logit contributions
endant-0.025
cream-0.024
ifying-0.024
aj-0.024
run-0.022
Token settled
Token ID10282
Feature activation-7.39e-03

Pos logit contributions
 first+0.027
,+0.027
 up+0.027
 hard+0.027
 right+0.026
Neg logit contributions
endant-0.038
cream-0.037
aj-0.034
ifying-0.034
lets-0.033
Token into
Token ID656
Feature activation+3.61e-02

Pos logit contributions
endant+0.021
cream+0.020
aj+0.020
ifying+0.020
run+0.019
Neg logit contributions
,-0.005
 first-0.005
 hard-0.004
 right-0.004
 up-0.004
Token his
Token ID465
Feature activation-1.13e-02

Pos logit contributions
,+0.018
 first+0.014
 right+0.013
 up+0.013
 hard+0.012
Neg logit contributions
endant-0.114
cream-0.111
aj-0.109
ifying-0.108
run-0.104
Token new
Token ID649
Feature activation+3.28e-03

Pos logit contributions
endant+0.024
cream+0.023
ifying+0.022
aj+0.022
run+0.021
Neg logit contributions
,-0.017
 first-0.016
 up-0.016
 hard-0.016
 right-0.016
Token home
Token ID1363
Feature activation-1.37e-02

Pos logit contributions
,+0.005
 first+0.004
 right+0.004
 up+0.004
 hard+0.004
Neg logit contributions
endant-0.007
cream-0.007
aj-0.007
ifying-0.007
run-0.006
Token!
Token ID0
Feature activation+8.83e-03

Pos logit contributions
endant+0.038
cream+0.037
aj+0.036
ifying+0.036
ksh+0.034
Neg logit contributions
,-0.012
 first-0.011
 right-0.010
 up-0.010
 hard-0.010
Token Now
Token ID2735
Feature activation+2.14e-02

Pos logit contributions
,+0.008
 the+0.007
 right+0.007
 first+0.006
:+0.006
Neg logit contributions
endant-0.025
cream-0.024
aj-0.023
ifying-0.023
ppers-0.022
Token ground
Token ID2323
Feature activation+1.43e-02

Pos logit contributions
 first+0.036
,+0.036
 hard+0.035
 right+0.034
 bottom+0.033
Neg logit contributions
endant-0.083
cream-0.082
ifying-0.080
aj-0.080
run-0.076
Token.
Token ID13
Feature activation+1.81e-02

Pos logit contributions
,+0.014
 first+0.013
 hard+0.013
 right+0.012
 up+0.012
Neg logit contributions
endant-0.035
cream-0.034
ifying-0.033
aj-0.033
run-0.032
Token
Token ID220
Feature activation+1.60e-04

Pos logit contributions
,+0.018
 first+0.015
 right+0.015
 up+0.015
 hard+0.014
Neg logit contributions
endant-0.049
cream-0.047
ifying-0.047
aj-0.047
ksh-0.044
Token
Token ID198
Feature activation-1.40e-03

Pos logit contributions
,+0.000
 first+0.000
 up+0.000
 right+0.000
 hard+0.000
Neg logit contributions
endant-0.000
cream-0.000
ifying-0.000
aj-0.000
run-0.000
Token
Token ID198
Feature activation-7.08e-03

Pos logit contributions
endant+0.003
cream+0.003
aj+0.003
ifying+0.003
run+0.003
Neg logit contributions
,-0.002
 first-0.001
 up-0.001
 right-0.001
 hard-0.001
TokenThe
Token ID464
Feature activation+1.64e-02

Pos logit contributions
endant+0.017
cream+0.016
aj+0.016
ifying+0.016
ksh+0.015
Neg logit contributions
,-0.009
 first-0.008
 right-0.008
 hard-0.008
 up-0.008
Token man
Token ID582
Feature activation+2.88e-02

Pos logit contributions
,+0.019
 up+0.017
 first+0.017
 right+0.016
 hard+0.016
Neg logit contributions
endant-0.038
cream-0.038
aj-0.036
ifying-0.035
pron-0.034
Token was
Token ID373
Feature activation+2.22e-02

Pos logit contributions
,+0.035
 first+0.034
 hard+0.033
 up+0.033
 right+0.033
Neg logit contributions
endant-0.065
cream-0.065
aj-0.061
ifying-0.060
run-0.058
Token safe
Token ID3338
Feature activation+2.52e-03

Pos logit contributions
,+0.025
 first+0.023
 hard+0.022
 top+0.022
 opposite+0.022
Neg logit contributions
endant-0.049
cream-0.048
ifying-0.048
run-0.047
aj-0.046
Token.
Token ID13
Feature activation+1.60e-02

Pos logit contributions
 first+0.002
 hard+0.002
 opposite+0.002
 top+0.002
,+0.002
Neg logit contributions
endant-0.006
cream-0.006
ifying-0.006
aj-0.006
run-0.006
Token
Token ID220
Feature activation+3.08e-03

Pos logit contributions
,+0.018
 right+0.016
 first+0.016
 up+0.015
 hard+0.015
Neg logit contributions
endant-0.041
cream-0.039
aj-0.039
ifying-0.038
ittens-0.036
Token cute
Token ID13779
Feature activation-2.80e-02

Pos logit contributions
,+0.030
 first+0.027
 right+0.027
 up+0.027
 hard+0.026
Neg logit contributions
endant-0.063
cream-0.061
aj-0.059
ifying-0.059
run-0.056
Token and
Token ID290
Feature activation-2.81e-02

Pos logit contributions
endant+0.083
cream+0.081
aj+0.080
ifying+0.079
run+0.075
Neg logit contributions
,-0.021
 first-0.017
 right-0.016
 up-0.016
 hard-0.015
Token colorful
Token ID20239
Feature activation-3.51e-02

Pos logit contributions
endant+0.073
aj+0.070
cream+0.069
ifying+0.069
ittens+0.065
Neg logit contributions
,-0.033
 hard-0.028
 first-0.028
 up-0.028
 the-0.028
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
endant+0.103
cream+0.100
aj+0.099
ifying+0.097
run+0.093
Neg logit contributions
,-0.031
 right-0.022
 first-0.022
 up-0.022
 hard-0.021
Token The
Token ID383
Feature activation+5.75e-03

Pos logit contributions
,+0.010
 the+0.008
 first+0.007
 right+0.007
:+0.007
Neg logit contributions
endant-0.031
cream-0.030
aj-0.030
ifying-0.029
ppers-0.028
Token baby
Token ID5156
Feature activation+3.39e-03

Pos logit contributions
,+0.008
 first+0.008
 right+0.007
 hard+0.007
 up+0.007
Neg logit contributions
endant-0.013
cream-0.013
aj-0.013
ifying-0.012
run-0.012
Token talked
Token ID6619
Feature activation-1.91e-02

Pos logit contributions
,+0.004
 first+0.003
 up+0.003
 right+0.003
 hard+0.003
Neg logit contributions
endant-0.009
cream-0.008
ifying-0.008
aj-0.008
run-0.008
Token to
Token ID284
Feature activation-9.32e-03

Pos logit contributions
cream+0.037
endant+0.035
run+0.035
ifying+0.034
lets+0.034
Neg logit contributions
 bottom-0.023
 first-0.023
 top-0.021
 opposite-0.021
 overcome-0.021
Token his
Token ID465
Feature activation-2.72e-02

Pos logit contributions
endant+0.026
cream+0.026
ifying+0.024
aj+0.024
run+0.024
Neg logit contributions
,-0.006
 first-0.005
 right-0.005
 up-0.005
 hard-0.005
Token mom
Token ID1995
Feature activation+1.55e-02

Pos logit contributions
endant+0.064
cream+0.062
ifying+0.060
aj+0.060
run+0.056
Neg logit contributions
,-0.036
 first-0.034
 up-0.032
 right-0.032
 hard-0.032
Tokenmy
Token ID1820
Feature activation+1.90e-02

Pos logit contributions
,+0.013
 first+0.010
 up+0.010
 right+0.009
 hard+0.008
Neg logit contributions
endant-0.047
cream-0.045
ifying-0.044
aj-0.044
ittens-0.041
Token hatched
Token ID49353
Feature activation-2.40e-02

Pos logit contributions
,+0.009
 first+0.008
 up+0.007
 right+0.007
 hard+0.007
Neg logit contributions
endant-0.023
cream-0.022
aj-0.022
ifying-0.022
ittens-0.021
Token.
Token ID13
Feature activation+4.49e-03

Pos logit contributions
endant+0.065
cream+0.064
aj+0.062
ifying+0.062
run+0.059
Neg logit contributions
,-0.021
 first-0.019
 right-0.018
 up-0.018
 hard-0.018
Token
Token ID220
Feature activation-1.32e-02

Pos logit contributions
,+0.004
 the+0.003
 first+0.003
 right+0.003
 up+0.003
Neg logit contributions
endant-0.013
cream-0.013
ifying-0.013
aj-0.012
pron-0.012
Token
Token ID198
Feature activation-1.03e-02

Pos logit contributions
endant+0.037
cream+0.036
ifying+0.036
aj+0.035
ksh+0.034
Neg logit contributions
,-0.011
 first-0.009
 up-0.009
 right-0.009
 hard-0.009
Token
Token ID198
Feature activation-1.00e-02

Pos logit contributions
endant+0.026
cream+0.025
aj+0.025
ifying+0.025
run+0.024
Neg logit contributions
,-0.011
 first-0.010
 up-0.010
 right-0.010
 hard-0.010
TokenDays
Token ID38770
Feature activation+7.68e-03

Pos logit contributions
endant+0.025
cream+0.023
aj+0.023
ifying+0.023
run+0.021
Neg logit contributions
,-0.012
 first-0.012
 right-0.011
 hard-0.011
 up-0.011
Token passed
Token ID3804
Feature activation+1.10e-02

Pos logit contributions
,+0.009
 first+0.008
 right+0.008
 up+0.008
 hard+0.008
Neg logit contributions
endant-0.019
cream-0.018
aj-0.017
ifying-0.017
run-0.016
Token and
Token ID290
Feature activation-8.82e-03

Pos logit contributions
 first+0.007
,+0.007
 right+0.007
 hard+0.007
 up+0.006
Neg logit contributions
endant-0.030
cream-0.029
ifying-0.028
aj-0.028
run-0.027
Token the
Token ID262
Feature activation+1.07e-02

Pos logit contributions
endant+0.028
cream+0.027
aj+0.027
ifying+0.027
run+0.026
Neg logit contributions
,-0.004
 first-0.003
 right-0.003
 up-0.003
 hard-0.003
Token egg
Token ID5935
Feature activation-1.84e-02

Pos logit contributions
,+0.015
 first+0.014
 hard+0.013
 right+0.013
 best+0.013
Neg logit contributions
ifying-0.026
endant-0.026
cream-0.025
run-0.025
aj-0.025
Token finally
Token ID3443
Feature activation+8.02e-03

Pos logit contributions
endant+0.048
cream+0.046
aj+0.045
ifying+0.045
run+0.043
Neg logit contributions
,-0.019
 first-0.017
 up-0.016
 right-0.016
 hard-0.016
Token
Token ID198
Feature activation-2.32e-02

Pos logit contributions
endant+0.059
cream+0.057
aj+0.056
ifying+0.056
run+0.053
Neg logit contributions
,-0.023
 first-0.021
 right-0.020
 up-0.020
 hard-0.020
TokenThen
Token ID6423
Feature activation+3.84e-02

Pos logit contributions
endant+0.051
cream+0.048
aj+0.047
ifying+0.047
run+0.044
Neg logit contributions
,-0.034
 first-0.032
 right-0.031
 up-0.031
 hard-0.030
Token,
Token ID11
Feature activation+1.10e-02

Pos logit contributions
 first+0.015
 hard+0.015
 up+0.015
 right+0.014
 bottom+0.013
Neg logit contributions
cream-0.112
endant-0.109
ifying-0.107
aj-0.106
lets-0.102
Token they
Token ID484
Feature activation+9.70e-03

Pos logit contributions
,+0.013
 first+0.012
 right+0.011
 up+0.011
 hard+0.011
Neg logit contributions
endant-0.027
cream-0.026
aj-0.026
ifying-0.026
run-0.024
Token heard
Token ID2982
Feature activation-1.51e-02

Pos logit contributions
,+0.012
 first+0.011
 up+0.011
 right+0.011
 hard+0.011
Neg logit contributions
endant-0.023
cream-0.022
ifying-0.022
aj-0.022
run-0.020
Token a
Token ID257
Feature activation+1.51e-02

Pos logit contributions
endant+0.045
cream+0.044
aj+0.043
ifying+0.043
run+0.041
Neg logit contributions
,-0.010
 first-0.008
 up-0.008
 right-0.007
 hard-0.007
Token loud
Token ID7812
Feature activation+3.40e-02

Pos logit contributions
,+0.015
 up+0.014
 first+0.014
 right+0.014
 hard+0.013
Neg logit contributions
endant-0.037
cream-0.036
ifying-0.035
aj-0.035
run-0.033
Token noise
Token ID7838
Feature activation+1.84e-02

Pos logit contributions
 best+0.050
 first+0.048
 overcome+0.044
 bottom+0.043
 top+0.041
Neg logit contributions
cream-0.064
endant-0.061
run-0.058
ksh-0.058
eddy-0.057
Token.
Token ID13
Feature activation+4.18e-06

Pos logit contributions
 first+0.019
 best+0.018
 hard+0.018
,+0.018
 right+0.018
Neg logit contributions
cream-0.043
endant-0.042
ifying-0.041
run-0.040
aj-0.039
Token It
Token ID632
Feature activation-7.36e-03

Feature not active on prev token; no predictions were affected.
Token was
Token ID373
Feature activation+2.16e-04

Pos logit contributions
endant+0.017
cream+0.017
ifying+0.016
aj+0.016
run+0.016
Neg logit contributions
,-0.008
 first-0.007
 right-0.007
 hard-0.007
 up-0.007
Token was
Token ID373
Feature activation+2.20e-02

Pos logit contributions
endant+0.010
cream+0.009
ifying+0.009
aj+0.009
run+0.009
Neg logit contributions
,-0.005
 first-0.005
 right-0.004
 up-0.004
 hard-0.004
Token happy
Token ID3772
Feature activation-1.31e-03

Pos logit contributions
,+0.023
 first+0.020
 hard+0.019
 up+0.019
 right+0.019
Neg logit contributions
endant-0.055
cream-0.054
aj-0.053
ifying-0.052
run-0.050
Token.
Token ID13
Feature activation+1.97e-02

Pos logit contributions
endant+0.004
cream+0.003
aj+0.003
ifying+0.003
run+0.003
Neg logit contributions
,-0.001
 first-0.001
 right-0.001
 up-0.001
 hard-0.001
Token She
Token ID1375
Feature activation+1.36e-03

Pos logit contributions
,+0.016
 right+0.012
 first+0.012
 hard+0.012
 up+0.011
Neg logit contributions
endant-0.058
cream-0.058
aj-0.055
ifying-0.055
ksh-0.053
Token loved
Token ID6151
Feature activation-8.30e-03

Pos logit contributions
,+0.002
 first+0.002
 right+0.002
 hard+0.002
 up+0.002
Neg logit contributions
endant-0.003
cream-0.003
ifying-0.003
aj-0.003
run-0.003
Token to
Token ID284
Feature activation-2.52e-03

Pos logit contributions
endant+0.026
ifying+0.026
cream+0.026
aj+0.025
ksh+0.024
Neg logit contributions
,-0.005
 first-0.004
 hard-0.004
 the-0.004
 up-0.004
Token pretend
Token ID16614
Feature activation-6.21e-03

Pos logit contributions
endant+0.006
cream+0.005
aj+0.005
ifying+0.005
ittens+0.005
Neg logit contributions
 up-0.003
 first-0.003
,-0.003
 right-0.003
 hard-0.003
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
endant+0.016
cream+0.016
aj+0.015
ifying+0.015
run+0.015
Neg logit contributions
,-0.006
 first-0.006
 right-0.005
 up-0.005
 hard-0.005
Token<|endoftext|>
Token ID50256
Feature activation-3.25e-03

Pos logit contributions
,+0.012
 right+0.009
 first+0.008
 hard+0.008
 the+0.008
Neg logit contributions
endant-0.047
cream-0.046
aj-0.044
ifying-0.044
ksh-0.042
TokenBen
Token ID11696
Feature activation+3.96e-02

Pos logit contributions
endant+0.008
cream+0.007
ifying+0.007
aj+0.007
run+0.007
Neg logit contributions
,-0.004
 first-0.004
 up-0.004
 right-0.004
 hard-0.003
Token and
Token ID290
Feature activation-3.48e-02

Pos logit contributions
 first+0.042
 hard+0.040
 right+0.040
 top+0.040
,+0.039
Neg logit contributions
endant-0.093
cream-0.092
aj-0.089
ifying-0.084
eddy-0.083
Token,
Token ID11
Feature activation+3.32e-04

Pos logit contributions
,+0.021
 first+0.021
 right+0.019
 up+0.018
 hard+0.018
Neg logit contributions
endant-0.075
cream-0.075
aj-0.074
ifying-0.073
run-0.071
Token changed
Token ID3421
Feature activation+6.14e-03

Pos logit contributions
,+0.000
 first+0.000
 up+0.000
 right+0.000
 hard+0.000
Neg logit contributions
endant-0.001
cream-0.001
aj-0.001
ifying-0.001
run-0.001
Token into
Token ID656
Feature activation+3.75e-02

Pos logit contributions
,+0.004
 first+0.004
 right+0.004
 hard+0.003
 up+0.003
Neg logit contributions
endant-0.017
cream-0.017
aj-0.017
ifying-0.017
run-0.016
Token their
Token ID511
Feature activation-2.85e-03

Pos logit contributions
,+0.028
 first+0.024
 right+0.023
 up+0.023
 hard+0.023
Neg logit contributions
endant-0.108
cream-0.105
aj-0.103
ifying-0.102
run-0.097
Token clothes
Token ID8242
Feature activation-3.02e-03

Pos logit contributions
endant+0.006
cream+0.006
aj+0.006
ifying+0.006
run+0.005
Neg logit contributions
,-0.004
 first-0.004
 up-0.004
 right-0.004
 hard-0.004
Token and
Token ID290
Feature activation-3.71e-02

Pos logit contributions
endant+0.008
cream+0.008
aj+0.008
ifying+0.008
run+0.007
Neg logit contributions
,-0.003
 first-0.002
 right-0.002
 up-0.002
 hard-0.002
Token grabbed
Token ID13176
Feature activation+2.09e-02

Pos logit contributions
endant+0.085
cream+0.083
ifying+0.080
aj+0.079
run+0.075
Neg logit contributions
,-0.051
 first-0.046
 up-0.045
 right-0.045
 hard-0.045
Token their
Token ID511
Feature activation-1.11e-02

Pos logit contributions
,+0.013
 first+0.011
 up+0.011
 right+0.011
 hard+0.010
Neg logit contributions
endant-0.062
cream-0.060
aj-0.059
ifying-0.059
run-0.056
Token telephone
Token ID11426
Feature activation-1.05e-02

Pos logit contributions
endant+0.022
cream+0.021
ifying+0.020
aj+0.020
run+0.019
Neg logit contributions
,-0.019
 first-0.018
 hard-0.017
 up-0.017
 right-0.017
Token.
Token ID13
Feature activation+9.19e-03

Pos logit contributions
endant+0.029
cream+0.028
aj+0.027
ifying+0.027
run+0.026
Neg logit contributions
,-0.010
 first-0.009
 up-0.008
 right-0.008
 hard-0.008
Token
Token ID198
Feature activation-2.09e-03

Pos logit contributions
,+0.009
 first+0.008
 right+0.007
:+0.007
 the+0.007
Neg logit contributions
endant-0.025
cream-0.024
ifying-0.023
aj-0.023
lets-0.022
Token,
Token ID11
Feature activation-8.67e-03

Pos logit contributions
,+0.038
 first+0.036
 up+0.035
 right+0.035
 hard+0.034
Neg logit contributions
endant-0.062
cream-0.059
aj-0.058
ifying-0.058
run-0.054
Token fluffy
Token ID39145
Feature activation-3.24e-02

Pos logit contributions
cream+0.022
endant+0.021
aj+0.020
ittens+0.020
ifying+0.020
Neg logit contributions
,-0.010
 right-0.010
 hard-0.010
 up-0.009
 first-0.009
Token dog
Token ID3290
Feature activation-5.71e-03

Pos logit contributions
cream+0.068
endant+0.067
aj+0.064
ifying+0.062
run+0.060
Neg logit contributions
,-0.056
 up-0.047
 right-0.047
 first-0.047
 hard-0.046
Token.
Token ID13
Feature activation+1.42e-02

Pos logit contributions
endant+0.017
cream+0.017
aj+0.017
ifying+0.016
ksh+0.016
Neg logit contributions
,-0.005
 right-0.003
 up-0.003
 first-0.003
:-0.003
Token He
Token ID679
Feature activation-1.55e-03

Pos logit contributions
,+0.014
 the+0.012
 hard+0.010
:+0.010
 right+0.010
Neg logit contributions
endant-0.041
cream-0.040
aj-0.038
lets-0.038
ifying-0.038
Token w
Token ID266
Feature activation-3.81e-02

Pos logit contributions
endant+0.004
cream+0.004
ifying+0.004
aj+0.004
eddy+0.004
Neg logit contributions
,-0.002
 first-0.002
 hard-0.001
 up-0.001
 right-0.001
Tokenagged
Token ID14655
Feature activation+1.37e-03

Pos logit contributions
endant+0.057
cream+0.054
aj+0.052
ifying+0.050
run+0.047
Neg logit contributions
,-0.082
 first-0.080
 up-0.078
 right-0.078
 hard-0.076
Token his
Token ID465
Feature activation-1.69e-02

Pos logit contributions
 first+0.001
 bottom+0.001
 right+0.001
 best+0.001
,+0.001
Neg logit contributions
cream-0.003
endant-0.003
run-0.003
aj-0.003
lets-0.003
Token tail
Token ID7894
Feature activation-2.02e-03

Pos logit contributions
endant+0.029
cream+0.028
ifying+0.027
aj+0.027
run+0.025
Neg logit contributions
,-0.033
 first-0.031
 right-0.030
 up-0.030
 hard-0.030
Token and
Token ID290
Feature activation-3.31e-02

Pos logit contributions
endant+0.005
cream+0.005
ifying+0.005
aj+0.005
run+0.005
Neg logit contributions
,-0.002
 first-0.002
 up-0.002
 hard-0.002
 right-0.002
Token l
Token ID300
Feature activation-3.69e-02

Pos logit contributions
endant+0.075
ifying+0.072
cream+0.071
aj+0.071
eddy+0.067
Neg logit contributions
,-0.048
 first-0.043
 up-0.042
 right-0.042
 hard-0.042
Token The
Token ID383
Feature activation+1.23e-02

Pos logit contributions
,+0.007
 right+0.006
 first+0.006
:+0.006
 up+0.006
Neg logit contributions
endant-0.025
cream-0.024
ifying-0.024
aj-0.023
paste-0.023
Token end
Token ID886
Feature activation+1.90e-02

Pos logit contributions
,+0.013
 first+0.012
 hard+0.012
 right+0.012
 up+0.011
Neg logit contributions
endant-0.033
cream-0.032
ifying-0.031
aj-0.031
run-0.030
Token.
Token ID13
Feature activation+8.95e-03

Pos logit contributions
,+0.018
 first+0.016
 right+0.015
 up+0.015
 hard+0.015
Neg logit contributions
endant-0.051
cream-0.050
aj-0.049
ifying-0.048
run-0.046
Token<|endoftext|>
Token ID50256
Feature activation-3.99e-03

Pos logit contributions
,+0.003
 the+0.002
 right+0.002
:+0.001
 their+0.001
Neg logit contributions
endant-0.032
cream-0.030
aj-0.030
isher-0.029
ppers-0.029
TokenOnce
Token ID7454
Feature activation-2.02e-02

Pos logit contributions
endant+0.010
cream+0.010
aj+0.010
ifying+0.010
run+0.009
Neg logit contributions
,-0.004
 first-0.004
 up-0.004
 right-0.004
 hard-0.004
Token upon
Token ID2402
Feature activation-2.37e-02

Pos logit contributions
endant+0.033
cream+0.031
aj+0.031
ifying+0.030
run+0.028
Neg logit contributions
,-0.039
 first-0.037
 right-0.037
 up-0.037
 hard-0.036
Token a
Token ID257
Feature activation+6.63e-03

Pos logit contributions
endant+0.083
aj+0.081
cream+0.080
ifying+0.080
run+0.077
Neg logit contributions
,-0.007
 the-0.004
 right-0.004
 first-0.004
 hard-0.004
Token time
Token ID640
Feature activation+2.12e-02

Pos logit contributions
,+0.006
 first+0.006
 up+0.006
 right+0.006
 hard+0.005
Neg logit contributions
endant-0.017
cream-0.017
aj-0.016
ifying-0.016
run-0.015
Token,
Token ID11
Feature activation-1.84e-02

Pos logit contributions
 opposite+0.014
 right+0.010
 up+0.010
 bottom+0.009
 overcome+0.009
Neg logit contributions
endant-0.058
aj-0.058
run-0.055
ittens-0.055
ifying-0.053
Token there
Token ID612
Feature activation-1.51e-02

Pos logit contributions
endant+0.043
cream+0.042
aj+0.041
ifying+0.041
run+0.038
Neg logit contributions
,-0.023
 first-0.021
 right-0.021
 up-0.021
 hard-0.020
Token was
Token ID373
Feature activation-1.86e-02

Pos logit contributions
endant+0.036
cream+0.035
ifying+0.034
aj+0.034
run+0.032
Neg logit contributions
 first-0.015
 right-0.015
,-0.015
 hard-0.014
 up-0.014
Token to
Token ID284
Feature activation-7.68e-03

Pos logit contributions
,+0.005
 first+0.005
 bottom+0.005
 best+0.004
 hard+0.004
Neg logit contributions
endant-0.012
cream-0.011
run-0.011
lets-0.010
ifying-0.010
Token Sue
Token ID26089
Feature activation+1.55e-02

Pos logit contributions
endant+0.016
cream+0.016
run+0.014
ifying+0.014
aj+0.014
Neg logit contributions
,-0.010
 hard-0.010
 first-0.010
 right-0.009
 up-0.009
Token.
Token ID13
Feature activation+1.39e-02

Pos logit contributions
,+0.014
 first+0.012
 up+0.012
 right+0.012
 hard+0.011
Neg logit contributions
endant-0.043
cream-0.041
ifying-0.040
aj-0.040
run-0.038
Token
Token ID198
Feature activation-9.58e-03

Pos logit contributions
,+0.016
 first+0.015
 up+0.015
 right+0.015
 hard+0.014
Neg logit contributions
endant-0.033
cream-0.032
aj-0.031
ifying-0.030
run-0.030
Token
Token ID198
Feature activation-6.44e-03

Pos logit contributions
endant+0.021
cream+0.020
aj+0.020
run+0.019
eddy+0.019
Neg logit contributions
 first-0.012
,-0.012
 right-0.012
 best-0.012
 up-0.011
Token"
Token ID1
Feature activation-2.67e-02

Pos logit contributions
cream+0.013
endant+0.013
ifying+0.012
aj+0.012
run+0.012
Neg logit contributions
,-0.009
 up-0.008
 first-0.008
 hard-0.008
 best-0.008
TokenThey
Token ID2990
Feature activation-4.20e-04

Pos logit contributions
endant+0.053
cream+0.051
aj+0.049
ifying+0.049
run+0.045
Neg logit contributions
,-0.044
 first-0.042
 up-0.041
 right-0.041
 hard-0.040
Token are
Token ID389
Feature activation+7.20e-03

Pos logit contributions
endant+0.001
cream+0.001
ifying+0.001
aj+0.001
lets+0.001
Neg logit contributions
,-0.000
 first-0.000
 up-0.000
 hard-0.000
 right-0.000
Token spark
Token ID9009
Feature activation-3.02e-02

Pos logit contributions
,+0.006
 first+0.006
 up+0.005
 hard+0.005
 right+0.005
Neg logit contributions
endant-0.019
cream-0.019
ifying-0.018
aj-0.018
run-0.017
Tokenles
Token ID829
Feature activation-5.60e-03

Pos logit contributions
endant+0.049
cream+0.047
aj+0.045
ifying+0.045
run+0.041
Neg logit contributions
,-0.060
 first-0.057
 up-0.056
 right-0.056
 hard-0.056
Token!"
Token ID2474
Feature activation+3.09e-03

Pos logit contributions
endant+0.015
cream+0.014
ifying+0.014
aj+0.014
lets+0.013
Neg logit contributions
 first-0.005
,-0.004
 up-0.004
 right-0.004
 hard-0.004
Token a
Token ID257
Feature activation+6.88e-03

Pos logit contributions
endant+0.089
aj+0.086
cream+0.086
ifying+0.085
run+0.082
Neg logit contributions
,-0.009
 right-0.005
 first-0.005
 hard-0.005
 up-0.005
Token time
Token ID640
Feature activation+2.04e-02

Pos logit contributions
,+0.007
 first+0.006
 up+0.006
 right+0.006
 hard+0.006
Neg logit contributions
endant-0.018
cream-0.017
aj-0.017
ifying-0.017
run-0.016
Token,
Token ID11
Feature activation-1.67e-02

Pos logit contributions
 opposite+0.013
 right+0.010
 overcome+0.009
 up+0.009
 bottom+0.009
Neg logit contributions
endant-0.055
aj-0.054
run-0.053
ittens-0.053
ifying-0.050
Token there
Token ID612
Feature activation-1.63e-02

Pos logit contributions
endant+0.035
ifying+0.033
cream+0.033
aj+0.032
run+0.031
Neg logit contributions
,-0.022
 first-0.021
 right-0.021
 up-0.020
 hard-0.020
Token was
Token ID373
Feature activation-1.91e-02

Pos logit contributions
endant+0.037
cream+0.036
ifying+0.034
aj+0.034
run+0.033
Neg logit contributions
 first-0.018
 right-0.017
 hard-0.017
,-0.016
 up-0.016
Token a
Token ID257
Feature activation-2.46e-02

Pos logit contributions
endant+0.060
cream+0.058
aj+0.057
ifying+0.057
run+0.055
Neg logit contributions
,-0.010
 first-0.008
 right-0.008
 up-0.008
 hard-0.007
Token little
Token ID1310
Feature activation-1.97e-02

Pos logit contributions
endant+0.058
cream+0.057
ifying+0.055
aj+0.054
run+0.052
Neg logit contributions
,-0.033
 first-0.030
 hard-0.030
 right-0.029
 up-0.029
Token boy
Token ID2933
Feature activation+1.33e-02

Pos logit contributions
endant+0.041
cream+0.039
aj+0.038
ifying+0.038
run+0.036
Neg logit contributions
,-0.031
 first-0.029
 right-0.028
 up-0.028
 hard-0.028
Token named
Token ID3706
Feature activation-7.32e-02

Pos logit contributions
,+0.022
 first+0.020
 up+0.019
 right+0.019
 hard+0.019
Neg logit contributions
endant-0.027
cream-0.026
aj-0.025
ifying-0.025
run-0.023
Token Tim
Token ID5045
Feature activation-4.65e-03

Pos logit contributions
endant+0.181
ifying+0.166
cream+0.166
aj+0.165
lets+0.157
Neg logit contributions
,-0.104
 first-0.097
 right-0.090
 up-0.089
 hard-0.087
Tokenmy
Token ID1820
Feature activation-1.15e-02

Pos logit contributions
endant+0.010
cream+0.009
ifying+0.009
aj+0.009
run+0.008
Neg logit contributions
,-0.007
 first-0.007
 right-0.007
 up-0.007
 hard-0.007
Token friend
Token ID1545
Feature activation-8.62e-03

Pos logit contributions
endant+0.101
cream+0.100
ifying+0.095
aj+0.093
run+0.091
Neg logit contributions
,-0.056
 first-0.052
 right-0.049
 up-0.049
 hard-0.049
Token,
Token ID11
Feature activation-2.76e-02

Pos logit contributions
cream+0.022
aj+0.021
endant+0.021
lets+0.020
run+0.020
Neg logit contributions
 top-0.006
 hard-0.006
 bottom-0.006
 opposite-0.006
 first-0.006
Token a
Token ID257
Feature activation-2.28e-02

Pos logit contributions
cream+0.063
endant+0.062
ifying+0.058
aj+0.058
run+0.057
Neg logit contributions
,-0.030
 hard-0.029
 first-0.029
 bottom-0.029
 top-0.028
Token small
Token ID1402
Feature activation-3.10e-02

Pos logit contributions
endant+0.052
cream+0.050
aj+0.048
ifying+0.048
run+0.045
Neg logit contributions
,-0.032
 first-0.029
 right-0.028
 hard-0.028
 up-0.028
Token dog
Token ID3290
Feature activation-1.86e-02

Pos logit contributions
endant+0.055
cream+0.053
aj+0.051
ifying+0.051
run+0.047
Neg logit contributions
,-0.054
 first-0.052
 up-0.052
 right-0.051
 hard-0.050
Token named
Token ID3706
Feature activation-8.29e-02

Pos logit contributions
endant+0.038
cream+0.036
aj+0.035
ifying+0.035
run+0.032
Neg logit contributions
,-0.030
 first-0.028
 right-0.028
 up-0.028
 hard-0.027
Token Max
Token ID5436
Feature activation-3.35e-03

Pos logit contributions
endant+0.173
cream+0.167
aj+0.162
ifying+0.161
run+0.150
Neg logit contributions
,-0.127
 first-0.119
 right-0.116
 up-0.116
 hard-0.114
Token,
Token ID11
Feature activation-3.26e-02

Pos logit contributions
endant+0.010
cream+0.009
ifying+0.009
aj+0.009
run+0.009
Neg logit contributions
 first-0.002
 hard-0.002
 top-0.001
 right-0.001
 up-0.001
Token came
Token ID1625
Feature activation-1.28e-02

Pos logit contributions
endant+0.074
cream+0.072
ifying+0.069
aj+0.069
run+0.066
Neg logit contributions
,-0.041
 first-0.040
 up-0.038
 right-0.038
 hard-0.038
Token over
Token ID625
Feature activation+4.38e-02

Pos logit contributions
run+0.026
eddy+0.024
cream+0.023
aj+0.023
lets+0.023
Neg logit contributions
 bottom-0.017
 top-0.016
,-0.015
 opposite-0.014
 hard-0.014
Token to
Token ID284
Feature activation-1.34e-02

Pos logit contributions
,+0.039
 first+0.037
 up+0.037
 right+0.036
 hard+0.035
Neg logit contributions
endant-0.106
cream-0.105
aj-0.104
ifying-0.103
run-0.101
Token there
Token ID612
Feature activation-8.52e-03

Pos logit contributions
endant+0.027
aj+0.025
cream+0.024
ifying+0.024
run+0.023
Neg logit contributions
 right-0.036
,-0.036
 first-0.035
 up-0.035
 hard-0.035
Token was
Token ID373
Feature activation-1.91e-02

Pos logit contributions
endant+0.021
cream+0.020
aj+0.019
ifying+0.019
run+0.019
Neg logit contributions
 first-0.008
 hard-0.008
 right-0.008
,-0.008
 up-0.008
Token a
Token ID257
Feature activation-2.44e-02

Pos logit contributions
endant+0.060
cream+0.058
aj+0.058
ifying+0.057
run+0.055
Neg logit contributions
,-0.010
 first-0.008
 up-0.007
 right-0.007
 hard-0.007
Token little
Token ID1310
Feature activation-1.65e-02

Pos logit contributions
endant+0.058
cream+0.057
ifying+0.055
aj+0.054
run+0.052
Neg logit contributions
,-0.034
 first-0.030
 hard-0.030
 right-0.029
 up-0.028
Token boy
Token ID2933
Feature activation+1.41e-02

Pos logit contributions
endant+0.035
cream+0.033
aj+0.032
ifying+0.032
run+0.030
Neg logit contributions
,-0.026
 first-0.024
 hard-0.023
 up-0.023
 right-0.023
Token named
Token ID3706
Feature activation-7.35e-02

Pos logit contributions
,+0.023
 first+0.021
 up+0.021
 right+0.021
 hard+0.020
Neg logit contributions
endant-0.029
cream-0.027
aj-0.027
ifying-0.027
run-0.025
Token Jack
Token ID3619
Feature activation+1.39e-02

Pos logit contributions
endant+0.170
cream+0.159
ifying+0.159
aj+0.157
lets+0.148
Neg logit contributions
,-0.109
 first-0.102
 up-0.098
 right-0.097
 hard-0.094
Token who
Token ID508
Feature activation-4.70e-02

Pos logit contributions
,+0.012
 first+0.009
 up+0.009
 right+0.009
 hard+0.009
Neg logit contributions
endant-0.040
cream-0.039
aj-0.039
ifying-0.039
ksh-0.037
Token wanted
Token ID2227
Feature activation-1.25e-02

Pos logit contributions
endant+0.110
cream+0.106
aj+0.104
ifying+0.104
run+0.097
Neg logit contributions
,-0.062
 first-0.057
 right-0.055
 up-0.055
 hard-0.055
Token to
Token ID284
Feature activation-2.22e-02

Pos logit contributions
cream+0.032
endant+0.032
aj+0.030
ifying+0.030
run+0.029
Neg logit contributions
,-0.012
 first-0.011
 up-0.010
 right-0.010
 hard-0.010
Token go
Token ID467
Feature activation+3.22e-02

Pos logit contributions
endant+0.052
cream+0.052
ifying+0.049
aj+0.049
run+0.047
Neg logit contributions
,-0.024
 first-0.023
 hard-0.023
 up-0.023
 right-0.023
Tokenud
Token ID463
Feature activation+1.96e-02

Pos logit contributions
endant+0.004
cream+0.004
ifying+0.004
aj+0.003
lets+0.003
Neg logit contributions
,-0.004
 first-0.004
 hard-0.004
 up-0.003
 right-0.003
Token met
Token ID1138
Feature activation-3.34e-02

Pos logit contributions
,+0.029
 first+0.025
 up+0.024
 hard+0.023
 right+0.023
Neg logit contributions
endant-0.046
cream-0.043
ifying-0.043
aj-0.042
ksh-0.040
Token a
Token ID257
Feature activation-1.67e-02

Pos logit contributions
endant+0.093
cream+0.093
aj+0.090
ifying+0.088
run+0.086
Neg logit contributions
,-0.021
 first-0.021
 right-0.019
 up-0.018
 hard-0.018
Token little
Token ID1310
Feature activation-1.62e-02

Pos logit contributions
endant+0.041
cream+0.040
ifying+0.039
aj+0.038
run+0.036
Neg logit contributions
,-0.022
 first-0.020
 hard-0.019
 right-0.019
 up-0.018
Token girl
Token ID2576
Feature activation-3.90e-03

Pos logit contributions
endant+0.030
cream+0.029
aj+0.028
ifying+0.028
run+0.026
Neg logit contributions
,-0.027
 first-0.026
 up-0.025
 right-0.025
 hard-0.025
Token named
Token ID3706
Feature activation-7.65e-02

Pos logit contributions
endant+0.008
cream+0.008
aj+0.007
ifying+0.007
run+0.007
Neg logit contributions
,-0.006
 first-0.006
 up-0.006
 right-0.006
 hard-0.006
Token Lily
Token ID20037
Feature activation-1.21e-03

Pos logit contributions
endant+0.150
cream+0.144
aj+0.140
ifying+0.138
run+0.131
Neg logit contributions
,-0.123
 first-0.117
 up-0.114
 hard-0.114
 right-0.113
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
endant+0.003
cream+0.003
aj+0.003
ifying+0.003
run+0.003
Neg logit contributions
,-0.001
 first-0.001
 up-0.001
 right-0.001
 hard-0.001
Token␣
Token ID220
Feature activation-3.93e-02

Pos logit contributions
endant+0.031
cream+0.031
ifying+0.029
aj+0.028
ppers+0.027
Neg logit contributions
,-0.010
 the-0.007
 first-0.007
 right-0.007
:-0.006
Token\n
Token ID198
Feature activation-2.64e-02

Pos logit contributions
endant+0.114
cream+0.111
ifying+0.109
aj+0.107
ksh+0.102
Neg logit contributions
,-0.032
 first-0.027
 up-0.027
 right-0.026
 hard-0.025
Token\n
Token ID198
Feature activation-2.50e-02

Pos logit contributions
endant+0.074
cream+0.072
ifying+0.070
aj+0.069
ksh+0.065
Neg logit contributions
,-0.024
 first-0.021
 up-0.020
 right-0.020
 hard-0.020
Token many
Token ID867
Feature activation-1.37e-02

Pos logit contributions
endant+0.066
cream+0.065
ifying+0.063
aj+0.063
run+0.061
Neg logit contributions
,-0.016
 first-0.014
 up-0.013
 right-0.013
 hard-0.012
Token cars
Token ID5006
Feature activation-5.78e-03

Pos logit contributions
endant+0.030
cream+0.029
ifying+0.029
aj+0.028
ksh+0.027
Neg logit contributions
,-0.021
 first-0.019
 hard-0.019
 up-0.018
 right-0.018
Token of
Token ID286
Feature activation-1.59e-02

Pos logit contributions
endant+0.015
cream+0.015
aj+0.014
ifying+0.014
run+0.013
Neg logit contributions
,-0.006
 first-0.005
 right-0.005
 up-0.005
 hard-0.005
Token different
Token ID1180
Feature activation-2.91e-02

Pos logit contributions
endant+0.048
cream+0.046
ifying+0.045
aj+0.045
ksh+0.043
Neg logit contributions
,-0.012
 first-0.010
 hard-0.010
 right-0.010
 up-0.009
Token colors
Token ID7577
Feature activation-3.17e-02

Pos logit contributions
endant+0.055
cream+0.053
ifying+0.051
aj+0.051
run+0.047
Neg logit contributions
,-0.051
 first-0.048
 up-0.047
 right-0.047
 hard-0.046
Token and
Token ID290
Feature activation-6.96e-02

Pos logit contributions
endant+0.086
cream+0.084
aj+0.081
ifying+0.081
run+0.077
Neg logit contributions
,-0.028
 first-0.025
 right-0.024
 up-0.024
 hard-0.024
Token sizes
Token ID10620
Feature activation-2.22e-02

Pos logit contributions
endant+0.106
cream+0.099
ifying+0.097
aj+0.097
eddy+0.085
Neg logit contributions
,-0.156
 hard-0.145
 first-0.145
 right-0.142
 up-0.139
Token.
Token ID13
Feature activation+1.59e-05

Pos logit contributions
endant+0.061
cream+0.060
aj+0.059
ifying+0.058
run+0.055
Neg logit contributions
,-0.020
 first-0.017
 right-0.017
 up-0.016
 hard-0.016
Token He
Token ID679
Feature activation-7.31e-03

Pos logit contributions
,+0.000
 first+0.000
 right+0.000
 up+0.000
 hard+0.000
Neg logit contributions
endant-0.000
cream-0.000
ifying-0.000
aj-0.000
run-0.000
Token likes
Token ID7832
Feature activation-1.72e-02

Pos logit contributions
endant+0.015
cream+0.014
aj+0.014
ifying+0.014
lets+0.013
Neg logit contributions
,-0.010
 up-0.010
 first-0.010
 right-0.009
 bottom-0.009
Token to
Token ID284
Feature activation-1.39e-02

Pos logit contributions
endant+0.048
cream+0.046
aj+0.045
ifying+0.045
run+0.043
Neg logit contributions
,-0.015
 first-0.013
 up-0.013
 right-0.013
 hard-0.012
Token play
Token ID711
Feature activation+1.84e-02

Pos logit contributions
endant+0.046
cream+0.045
aj+0.043
ifying+0.043
run+0.040
Neg logit contributions
,-0.027
 first-0.025
 up-0.025
 right-0.025
 hard-0.024
Token with
Token ID351
Feature activation-1.41e-02

Pos logit contributions
,+0.018
 first+0.016
 right+0.015
 up+0.015
 hard+0.015
Neg logit contributions
endant-0.049
cream-0.048
aj-0.047
ifying-0.046
run-0.044
Token her
Token ID607
Feature activation-5.32e-02

Pos logit contributions
endant+0.048
ifying+0.047
cream+0.046
aj+0.046
ksh+0.045
Neg logit contributions
,-0.006
 up-0.003
 the-0.003
 hard-0.003
 first-0.003
Token dolls
Token ID36062
Feature activation-2.06e-02

Pos logit contributions
ifying+0.124
cream+0.124
endant+0.124
ksh+0.120
run+0.118
Neg logit contributions
,-0.082
 first-0.074
 hard-0.073
 best-0.073
 up-0.070
Token and
Token ID290
Feature activation-5.30e-02

Pos logit contributions
endant+0.064
cream+0.062
aj+0.061
ifying+0.061
run+0.059
Neg logit contributions
,-0.016
 first-0.013
 right-0.012
 up-0.011
 hard-0.010
Token t
Token ID256
Feature activation-7.72e-02

Pos logit contributions
endant+0.131
cream+0.128
aj+0.124
ifying+0.124
run+0.117
Neg logit contributions
,-0.060
 first-0.055
 right-0.054
 up-0.053
 hard-0.053
Tokeneddy
Token ID21874
Feature activation-1.77e-02

Pos logit contributions
endant+0.073
ifying+0.069
ksh+0.064
ccoli+0.063
 bisc+0.061
Neg logit contributions
 up-0.196
,-0.196
 first-0.187
:-0.175
 her-0.175
Token bears
Token ID13062
Feature activation-1.71e-02

Pos logit contributions
endant+0.039
cream+0.037
ifying+0.036
aj+0.036
run+0.034
Neg logit contributions
,-0.028
 first-0.025
 up-0.025
 right-0.024
 hard-0.024
Token.
Token ID13
Feature activation-1.17e-02

Pos logit contributions
endant+0.051
aj+0.049
cream+0.049
ifying+0.048
run+0.046
Neg logit contributions
,-0.015
 first-0.012
 right-0.011
 up-0.011
 hard-0.010
Token One
Token ID1881
Feature activation-1.55e-02

Pos logit contributions
endant+0.033
cream+0.032
ifying+0.031
aj+0.031
run+0.030
Neg logit contributions
,-0.012
 first-0.010
 right-0.010
 up-0.009
 hard-0.009
Token day
Token ID1110
Feature activation+2.61e-02

Pos logit contributions
endant+0.047
cream+0.044
ifying+0.044
aj+0.044
run+0.042
Neg logit contributions
,-0.012
 first-0.010
 hard-0.010
 up-0.009
 right-0.009