TOP ACTIVATIONS
MAX = 0.049

to the tree . On three , they ran to the
having fun . After a few minutes , Tim my fell
monkey bars . After a few minutes , Tim was tired
After a few hours , the vet was
. After a few minutes they reached the gym
to them . After a few days , Fl uffy was
the r iddles . After 3 tricky questions , Lily had
the slide . After a few turns , they went to
. After a few minutes , Rachel won the
. After a few times of sliding , the
to hide . After a few minutes , John called out
After a few minutes , the bird flew
. After a few minutes , Joe saw the
and gentle . After a few moments , the dove flew
best . But after a few minutes , she felt tired
After a few minutes , the owner realized
." After a few seconds , she sees Tom
: Don 't jump to conclusions and accuse your friends .
After a few moments , he came up
to meet him . The three snakes began to play together

MIN ACTIVATIONS
MIN = -0.052

he said it was only $ 10 .
with lots of pretty or nam ents . One day ,
. Mom says they are av oc ados . Sara wants
the price - it was $ 5 . Sally was so
box full of beautiful or nam ents .
the shiny things were or nam ents and they were very
The le opard toy costs $ 10 ."
hill and put her or nam ents all around . She
Let 's go find a y ummy surprise !" So they
her a box of ch ocol ates . Lily opened the
Once upon a time , Mr . Frog and a naughty
could not find his famous las agna . The
it made Bob 's face scr unch up . He didn
wallet had coins , gum wra ppers , buttons , and
a box of shiny or nam ents that spark led in
a tree with shiny or nam ents hanging from its branches
the ants spotted a delicious mor sel of food . E
a big box of ch ocol ates !
a big box of ch ocol ates to share with the
big tree with many or nam ents on it . The

INTERVAL 0.024 - 0.049
CONTAINS 0.178%

." They walked around the museum and looked at
empty ig loo . After playing , they went inside the
the little girl had so many new adventures . She learned
!" So they j ogged around the park one more time
Tom says . L ily agrees . She has

INTERVAL -0.001 - 0.024
CONTAINS 58.361%

. But Lily was so upset about losing her penny and
But there is no one in the park . They are
. He walked along the path , slowly and carefully .
we need to go closer to the rainbow ."
every day . She reads about animals , and cars ,

INTERVAL -0.027 - -0.001
CONTAINS 41.397%

someone who understand how he felt . Together , they had
paper , on the wall , on the floor . She
attractive barn . As the sun went down ,
around the barrel and began to change it . The barrel
she wanted to keep playing . But her mom was very

INTERVAL -0.052 - -0.027
CONTAINS 0.064%

noticed that the ground was getting very wet . Jack realized
cy was so proud that her family was so tidy .
had delayed its smoke from coming out of it . He
saw a stream of light coming from the sky . It
. Lily was proud of her mom my and daddy .
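The four INTERVAL rows above appear to be equal-width bins spanning MIN to MAX, with each CONTAINS value giving the share of token activations that fall in that bin. The following is a minimal sketch of that binning, assuming equal-width bins (the dashboard's actual method is not stated here). The helper names `interval_edges` and `interval_shares` are hypothetical and used only for illustration.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def interval_shares(activations, lo, hi, n_bins=4):
    """Percentage of activations falling in each equal-width bin."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        # Clamp the index so a value equal to hi lands in the last bin.
        idx = min(max(int((a - lo) / width), 0), n_bins - 1)
        counts[idx] += 1
    total = len(activations)
    return [100.0 * c / total for c in counts]


# MIN and MAX as reported above.
edges = interval_edges(-0.052, 0.049)
print([round(e, 3) for e in edges])

# The four CONTAINS values reported above, lowest interval first.
reported = [0.064, 41.397, 58.361, 0.178]
print(round(sum(reported), 3))
```

Under this assumption the computed edges agree with the listed interval boundaries to within rounding, and the four reported shares add up to 100%, so the intervals together cover every token.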
Token to
Token ID284
Feature activation+9.01e-03

Pos logit contributions
rob+0.004
 maybe+0.004
 Skip+0.004
ock+0.004
 inspection+0.004
Neg logit contributions
attered-0.004
aming-0.003
ney-0.003
opened-0.003
amed-0.003
Token the
Token ID262
Feature activation+4.94e-03

Pos logit contributions
rob+0.016
ock+0.014
aro+0.014
 villain+0.014
 Skip+0.013
Neg logit contributions
 mailbox-0.013
-0.012
opened-0.012
ney-0.012
attered-0.012
Token tree
Token ID5509
Feature activation+8.46e-03

Pos logit contributions
 Skip+0.007
 maybe+0.007
 Zoe+0.007
 but+0.007
 he+0.006
Neg logit contributions
attered-0.010
aming-0.009
ifying-0.009
opened-0.008
amed-0.008
Token.
Token ID13
Feature activation+3.64e-03

Pos logit contributions
 Skip+0.013
 maybe+0.013
rob+0.012
 Zoe+0.012
 inspection+0.012
Neg logit contributions
attered-0.015
aming-0.013
ney-0.013
opened-0.013
ifying-0.013
Token On
Token ID1550
Feature activation+9.25e-03

Pos logit contributions
rob+0.006
 Skip+0.006
ock+0.006
 maybe+0.006
 villain+0.005
Neg logit contributions
attered-0.006
ney-0.005
aming-0.005
opened-0.005
ifying-0.005
Token three
Token ID1115
Feature activation+4.92e-02

Pos logit contributions
 Skip+0.011
 maybe+0.010
 Zoe+0.010
rob+0.010
 inspection+0.010
Neg logit contributions
attered-0.020
aming-0.018
ney-0.018
opened-0.018
ifying-0.017
Token,
Token ID11
Feature activation-1.24e-04

Pos logit contributions
rob+0.089
 villain+0.089
ock+0.085
 Skip+0.083
 Bub+0.082
Neg logit contributions
-0.065
aming-0.061
ney-0.061
attered-0.060
 cracked-0.058
Token they
Token ID484
Feature activation+3.52e-03

Pos logit contributions
attered+0.000
aming+0.000
ney+0.000
ifying+0.000
opened+0.000
Neg logit contributions
rob-0.000
ock-0.000
 Skip-0.000
 maybe-0.000
 inspection-0.000
Token ran
Token ID4966
Feature activation+3.20e-04

Pos logit contributions
 maybe+0.006
 Skip+0.006
 Zoe+0.006
 inspection+0.006
 he+0.006
Neg logit contributions
attered-0.006
aming-0.005
ney-0.005
opened-0.005
ifying-0.005
Token to
Token ID284
Feature activation+8.36e-03

Pos logit contributions
rob+0.001
 maybe+0.001
ock+0.001
 Skip+0.001
 villain+0.001
Neg logit contributions
attered-0.000
aming-0.000
amed-0.000
ney-0.000
opened-0.000
Token the
Token ID262
Feature activation+7.88e-03

Pos logit contributions
rob+0.015
ock+0.013
 villain+0.013
aro+0.012
 Skip+0.012
Neg logit contributions
 mailbox-0.012
-0.012
opened-0.012
attered-0.012
aming-0.011
Token having
Token ID1719
Feature activation+1.74e-03

Pos logit contributions
 Skip+0.004
 maybe+0.004
 Zoe+0.004
rob+0.004
 inspection+0.004
Neg logit contributions
attered-0.007
aming-0.006
ney-0.006
opened-0.006
ifying-0.006
Token fun
Token ID1257
Feature activation+2.64e-02

Pos logit contributions
rob+0.003
aro+0.003
ock+0.003
 inspection+0.003
 maybe+0.003
Neg logit contributions
attered-0.002
-0.002
aming-0.002
amed-0.002
 painted-0.002
Token.
Token ID13
Feature activation+3.95e-03

Pos logit contributions
 Skip+0.059
 inspection+0.058
ock+0.057
 maybe+0.057
 Bub+0.056
Neg logit contributions
attered-0.019
aming-0.017
 painted-0.017
-0.016
ney-0.016
Token After
Token ID2293
Feature activation+6.81e-03

Pos logit contributions
rob+0.006
ock+0.006
 Skip+0.006
aro+0.006
 villain+0.006
Neg logit contributions
attered-0.007
tr-0.006
ney-0.006
opened-0.006
aming-0.006
Token a
Token ID257
Feature activation+1.22e-02

Pos logit contributions
ock+0.012
rob+0.012
 Skip+0.011
 inspection+0.011
 maybe+0.011
Neg logit contributions
attered-0.009
aming-0.009
 painted-0.008
 mailbox-0.008
 cracked-0.008
Token few
Token ID1178
Feature activation+4.91e-02

Pos logit contributions
rob+0.015
aro+0.014
 maybe+0.013
izards+0.013
 Skip+0.013
Neg logit contributions
attered-0.023
tr-0.022
 printed-0.022
-0.022
opened-0.021
Token minutes
Token ID2431
Feature activation+1.66e-02

Pos logit contributions
ock+0.079
 Bub+0.078
 Zig+0.078
 maybe+0.078
rob+0.077
Neg logit contributions
aming-0.064
 cracked-0.061
attered-0.060
 painted-0.059
amed-0.059
Token,
Token ID11
Feature activation-1.11e-02

Pos logit contributions
rob+0.026
 Skip+0.026
ock+0.025
 maybe+0.025
 villain+0.024
Neg logit contributions
attered-0.027
aming-0.024
opened-0.024
 cracked-0.023
 painted-0.023
Token Tim
Token ID5045
Feature activation-6.13e-04

Pos logit contributions
attered+0.023
ifying+0.021
 mailbox+0.021
 cracked+0.021
tr+0.020
Neg logit contributions
rob-0.013
ock-0.013
 Skip-0.012
 inspection-0.011
aro-0.011
Tokenmy
Token ID1820
Feature activation-5.45e-03

Pos logit contributions
ifying+0.001
attered+0.001
 mailbox+0.001
aming+0.001
olk+0.001
Neg logit contributions
ock-0.001
 but-0.001
 he-0.001
 maybe-0.001
 inspection-0.001
Token fell
Token ID3214
Feature activation-9.71e-03

Pos logit contributions
attered+0.010
ifying+0.009
ney+0.009
aming+0.008
tr+0.008
Neg logit contributions
 maybe-0.008
 Skip-0.008
 Zoe-0.008
 he-0.008
 but-0.008
Token monkey
Token ID21657
Feature activation+5.98e-03

Pos logit contributions
 Skip+0.004
 maybe+0.004
 Zoe+0.004
rob+0.004
ock+0.004
Neg logit contributions
attered-0.005
aming-0.005
ney-0.005
opened-0.004
amed-0.004
Token bars
Token ID9210
Feature activation-5.39e-03

Pos logit contributions
 Skip+0.008
 maybe+0.008
 Zoe+0.008
rob+0.008
ock+0.007
Neg logit contributions
attered-0.011
aming-0.011
ney-0.010
amed-0.010
opened-0.010
Token.
Token ID13
Feature activation+1.20e-04

Pos logit contributions
attered+0.009
aming+0.008
ney+0.008
opened+0.008
ifying+0.008
Neg logit contributions
 maybe-0.008
 Skip-0.008
 Zoe-0.008
rob-0.008
ock-0.008
Token After
Token ID2293
Feature activation+1.17e-02

Pos logit contributions
rob+0.000
ock+0.000
 villain+0.000
 Skip+0.000
 inspection+0.000
Neg logit contributions
attered-0.000
-0.000
opened-0.000
ney-0.000
tr-0.000
Token a
Token ID257
Feature activation+1.33e-02

Pos logit contributions
rob+0.017
ock+0.017
 Skip+0.016
 maybe+0.015
 inspection+0.015
Neg logit contributions
attered-0.019
aming-0.018
 cracked-0.018
-0.017
 mailbox-0.017
Token few
Token ID1178
Feature activation+4.81e-02

Pos logit contributions
rob+0.019
aro+0.017
 maybe+0.017
 Skip+0.016
ock+0.016
Neg logit contributions
attered-0.023
-0.022
tr-0.022
 cracked-0.021
 mailbox-0.021
Token minutes
Token ID2431
Feature activation+1.64e-02

Pos logit contributions
rob+0.075
 Skip+0.074
ock+0.073
 maybe+0.071
 villain+0.071
Neg logit contributions
 cracked-0.070
aming-0.069
attered-0.068
amed-0.066
-0.064
Token,
Token ID11
Feature activation-4.12e-03

Pos logit contributions
 Skip+0.025
rob+0.024
 maybe+0.023
 villain+0.023
ock+0.023
Neg logit contributions
attered-0.028
aming-0.025
opened-0.025
 cracked-0.025
ney-0.025
Token Tim
Token ID5045
Feature activation-1.69e-03

Pos logit contributions
attered+0.008
ifying+0.007
 mailbox+0.007
 cracked+0.007
aming+0.007
Neg logit contributions
ock-0.005
rob-0.005
 Skip-0.005
 inspection-0.004
 maybe-0.004
Token was
Token ID373
Feature activation+1.25e-02

Pos logit contributions
attered+0.003
ifying+0.003
aming+0.002
 mailbox+0.002
lette+0.002
Neg logit contributions
 he-0.003
 maybe-0.003
ock-0.003
 but-0.003
 Skip-0.003
Token tired
Token ID10032
Feature activation-5.91e-03

Pos logit contributions
 maybe+0.020
 Skip+0.020
 Zoe+0.019
rob+0.019
 inspection+0.018
Neg logit contributions
attered-0.021
ney-0.019
aming-0.018
ifying-0.018
opened-0.018
Token " "
Token ID220
Feature activation+5.61e-03

Pos logit contributions
rob+0.001
ock+0.001
aro+0.001
 inspection+0.001
 villain+0.001
Neg logit contributions
attered-0.001
tr-0.001
opened-0.001
-0.001
 mailbox-0.001
Token "\n"
Token ID198
Feature activation+2.24e-04

Pos logit contributions
rob+0.010
 villain+0.010
 Skip+0.009
 inspection+0.009
ock+0.009
Neg logit contributions
attered-0.008
ney-0.008
opened-0.008
amed-0.008
aming-0.007
Token "\n"
Token ID198
Feature activation+4.36e-03

Pos logit contributions
 villain+0.000
aro+0.000
rob+0.000
ock+0.000
 villains+0.000
Neg logit contributions
-0.000
opened-0.000
ney-0.000
amed-0.000
Mom-0.000
TokenAfter
Token ID3260
Feature activation+9.11e-03

Pos logit contributions
 Skip+0.006
 maybe+0.005
 Zoe+0.005
ock+0.005
rob+0.005
Neg logit contributions
attered-0.009
ney-0.008
aming-0.008
ige-0.007
lette-0.007
Token a
Token ID257
Feature activation+1.24e-02

Pos logit contributions
rob+0.016
ock+0.016
 inspection+0.015
 Skip+0.015
 maybe+0.015
Neg logit contributions
attered-0.013
aming-0.012
 mailbox-0.011
 cracked-0.011
ifying-0.011
Token few
Token ID1178
Feature activation+4.80e-02

Pos logit contributions
rob+0.015
aro+0.014
 maybe+0.014
 Zoe+0.014
 Skip+0.014
Neg logit contributions
tr-0.024
attered-0.023
 mailbox-0.023
opened-0.022
 cracked-0.022
Token hours
Token ID2250
Feature activation+2.37e-02

Pos logit contributions
 Zoe+0.081
rob+0.080
ock+0.080
 inspection+0.080
 Zig+0.079
Neg logit contributions
aming-0.063
attered-0.062
 cracked-0.060
amed-0.058
 mailbox-0.057
Token,
Token ID11
Feature activation-7.26e-03

Pos logit contributions
 Skip+0.035
rob+0.035
 maybe+0.034
ock+0.033
 Zoe+0.032
Neg logit contributions
attered-0.041
aming-0.038
opened-0.036
 cracked-0.036
ney-0.036
Token the
Token ID262
Feature activation+8.44e-03

Pos logit contributions
attered+0.014
ifying+0.013
 mailbox+0.013
 cracked+0.013
 printed+0.013
Neg logit contributions
ock-0.010
rob-0.010
 inspection-0.009
aro-0.008
 Skip-0.008
Token vet
Token ID20202
Feature activation+5.23e-03

Pos logit contributions
 Skip+0.010
 maybe+0.010
 villain+0.009
 Zoe+0.009
rob+0.009
Neg logit contributions
attered-0.018
aming-0.016
ney-0.016
ifying-0.016
opened-0.016
Token was
Token ID373
Feature activation+9.20e-03

Pos logit contributions
 maybe+0.008
 he+0.007
 Skip+0.007
 Zoe+0.007
 but+0.007
Neg logit contributions
attered-0.009
ifying-0.009
aming-0.008
ney-0.008
ige-0.008
Token.
Token ID13
Feature activation+1.25e-03

Pos logit contributions
 Skip+0.014
 maybe+0.014
 Zoe+0.013
rob+0.013
 inspection+0.013
Neg logit contributions
attered-0.015
aming-0.013
ney-0.013
opened-0.013
ifying-0.012
Token "\n"
Token ID198
Feature activation+1.16e-04

Pos logit contributions
rob+0.002
ock+0.002
 villain+0.002
aro+0.002
 villains+0.002
Neg logit contributions
attered-0.002
-0.002
ney-0.002
opened-0.002
ifying-0.002
Token "\n"
Token ID198
Feature activation+2.63e-03

Pos logit contributions
 villain+0.000
rob+0.000
ock+0.000
 Skip+0.000
aro+0.000
Neg logit contributions
-0.000
aming-0.000
ney-0.000
opened-0.000
amed-0.000
TokenAfter
Token ID3260
Feature activation+5.57e-03

Pos logit contributions
 maybe+0.004
 Skip+0.004
rob+0.004
 Zoe+0.004
 inspection+0.004
Neg logit contributions
attered-0.004
aming-0.004
ney-0.004
opened-0.004
amed-0.004
Token a
Token ID257
Feature activation+1.28e-02

Pos logit contributions
ock+0.011
rob+0.010
 villain+0.010
 Skip+0.010
 inspection+0.009
Neg logit contributions
aming-0.007
attered-0.006
-0.006
 painted-0.006
 mailbox-0.006
Token few
Token ID1178
Feature activation+4.76e-02

Pos logit contributions
rob+0.016
 maybe+0.015
aro+0.014
ock+0.014
 Skip+0.013
Neg logit contributions
attered-0.023
tr-0.023
-0.023
 mailbox-0.023
 printed-0.023
Token minutes
Token ID2431
Feature activation+1.46e-02

Pos logit contributions
rob+0.075
ock+0.074
 Bub+0.072
 Zig+0.071
 maybe+0.071
Neg logit contributions
aming-0.069
 cracked-0.065
attered-0.064
amed-0.064
-0.063
Token they
Token ID484
Feature activation+1.18e-02

Pos logit contributions
rob+0.020
 Skip+0.020
 maybe+0.019
ock+0.018
 villain+0.018
Neg logit contributions
attered-0.027
aming-0.025
opened-0.024
ney-0.024
amed-0.024
Token reached
Token ID4251
Feature activation-3.89e-03

Pos logit contributions
 Skip+0.020
 maybe+0.020
rob+0.020
 Zoe+0.020
 inspection+0.019
Neg logit contributions
attered-0.018
aming-0.016
ney-0.016
opened-0.015
ifying-0.015
Token the
Token ID262
Feature activation+1.43e-03

Pos logit contributions
attered+0.007
aming+0.007
ney+0.007
 mailbox+0.007
opened+0.007
Neg logit contributions
rob-0.005
 Skip-0.005
 maybe-0.005
ock-0.005
 inspection-0.005
Token gym
Token ID11550
Feature activation-2.72e-03

Pos logit contributions
 Skip+0.002
 maybe+0.002
 Zoe+0.002
rob+0.002
ock+0.002
Neg logit contributions
attered-0.003
aming-0.003
ney-0.002
opened-0.002
amed-0.002
Token to
Token ID284
Feature activation+8.69e-03

Pos logit contributions
attered+0.017
aming+0.015
ney+0.015
opened+0.014
ifying+0.014
Neg logit contributions
 Skip-0.021
 maybe-0.021
 Zoe-0.020
rob-0.020
 inspection-0.020
Token them
Token ID606
Feature activation+8.67e-04

Pos logit contributions
rob+0.018
aro+0.017
ner+0.017
ock+0.016
izards+0.016
Neg logit contributions
 mailbox-0.010
 rightful-0.009
 hyd-0.009
 mail-0.009
 open-0.008
Token.
Token ID13
Feature activation-9.04e-04

Pos logit contributions
rob+0.001
 Skip+0.001
 maybe+0.001
 inspection+0.001
 Zoe+0.001
Neg logit contributions
attered-0.001
ney-0.001
aming-0.001
tr-0.001
opened-0.001
Token After
Token ID2293
Feature activation+4.35e-03

Pos logit contributions
attered+0.001
ifying+0.001
tr+0.001
ney+0.001
+0.001
Neg logit contributions
ock-0.002
rob-0.002
aro-0.002
 inspection-0.002
 villain-0.002
Token a
Token ID257
Feature activation+1.24e-02

Pos logit contributions
ock+0.008
rob+0.008
 inspection+0.008
 Skip+0.008
 Nice+0.007
Neg logit contributions
 cracked-0.005
ifying-0.005
aming-0.005
 painted-0.005
 printed-0.005
Token few
Token ID1178
Feature activation+4.75e-02

Pos logit contributions
rob+0.017
aro+0.016
 maybe+0.016
 Skip+0.015
 Zoe+0.015
Neg logit contributions
tr-0.022
 mailbox-0.021
 cracked-0.021
attered-0.021
 printed-0.020
Token days
Token ID1528
Feature activation+2.03e-02

Pos logit contributions
rob+0.074
 Zig+0.074
ock+0.074
 Bub+0.074
 Skip+0.073
Neg logit contributions
 cracked-0.067
aming-0.067
attered-0.063
amed-0.061
ra-0.060
Token,
Token ID11
Feature activation-1.20e-02

Pos logit contributions
rob+0.032
 Skip+0.032
ock+0.030
 maybe+0.030
 inspection+0.030
Neg logit contributions
attered-0.032
aming-0.030
 cracked-0.029
opened-0.029
um-0.028
Token Fl
Token ID1610
Feature activation-1.56e-02

Pos logit contributions
attered+0.023
ifying+0.022
aming+0.021
 cracked+0.021
ra+0.020
Neg logit contributions
ock-0.015
rob-0.015
 inspection-0.014
 Skip-0.013
aro-0.013
Tokenuffy
Token ID15352
Feature activation-9.67e-03

Pos logit contributions
attered+0.026
aming+0.025
amed+0.024
opened+0.024
ifying+0.023
Neg logit contributions
 Skip-0.023
 maybe-0.023
 Zoe-0.023
 Boo-0.022
rob-0.022
Token was
Token ID373
Feature activation+1.10e-02

Pos logit contributions
attered+0.017
aming+0.016
ney+0.015
ifying+0.015
opened+0.015
Neg logit contributions
 Skip-0.014
 maybe-0.014
 Zoe-0.014
rob-0.013
 but-0.013
Token the
Token ID262
Feature activation+1.55e-03

Pos logit contributions
attered+0.001
aming+0.001
ney+0.001
opened+0.001
 mailbox+0.001
Neg logit contributions
 Skip-0.001
 maybe-0.001
rob-0.001
ock-0.001
 villain-0.001
Token r
Token ID374
Feature activation-6.95e-03

Pos logit contributions
 maybe+0.003
 Skip+0.002
 inspection+0.002
 but+0.002
 he+0.002
Neg logit contributions
attered-0.003
aming-0.002
amed-0.002
lette-0.002
ige-0.002
Tokeniddles
Token ID29319
Feature activation+4.99e-03

Pos logit contributions
attered+0.010
ifying+0.010
 mailbox+0.009
tr+0.009
aming+0.008
Neg logit contributions
ock-0.012
rob-0.012
 maybe-0.012
 Zoe-0.011
 Skip-0.011
Token.
Token ID13
Feature activation-1.56e-03

Pos logit contributions
 maybe+0.007
 Skip+0.007
 inspection+0.007
 Zoe+0.007
 but+0.007
Neg logit contributions
attered-0.009
aming-0.008
ney-0.008
ifying-0.008
opened-0.008
Token After
Token ID2293
Feature activation+3.60e-03

Pos logit contributions
attered+0.003
opened+0.002
ney+0.002
aming+0.002
tr+0.002
Neg logit contributions
rob-0.002
 Skip-0.002
ock-0.002
 villain-0.002
 maybe-0.002
Token 3
Token ID513
Feature activation+4.74e-02

Pos logit contributions
rob+0.005
 Skip+0.005
ock+0.005
 maybe+0.005
 villain+0.005
Neg logit contributions
attered-0.006
aming-0.006
 cracked-0.005
 mailbox-0.005
opened-0.005
Token tricky
Token ID17198
Feature activation+2.06e-02

Pos logit contributions
 Skip+0.084
 villain+0.081
 Whe+0.081
 Nice+0.081
ock+0.078
Neg logit contributions
aming-0.061
-0.060
attered-0.057
ney-0.055
 painted-0.055
Token questions
Token ID2683
Feature activation+1.01e-02

Pos logit contributions
rob+0.030
 Skip+0.030
ock+0.029
 Zoe+0.029
 maybe+0.029
Neg logit contributions
attered-0.035
aming-0.032
ney-0.031
 mailbox-0.031
opened-0.031
Token,
Token ID11
Feature activation-4.05e-03

Pos logit contributions
rob+0.015
 Skip+0.015
 maybe+0.014
ock+0.014
 villain+0.014
Neg logit contributions
attered-0.018
aming-0.016
ney-0.016
opened-0.016
 cracked-0.016
Token Lily
Token ID20037
Feature activation-8.19e-03

Pos logit contributions
attered+0.008
 mailbox+0.007
 cracked+0.007
 painted+0.007
aming+0.007
Neg logit contributions
rob-0.005
ock-0.005
 Skip-0.005
 inspection-0.004
aro-0.004
Token had
Token ID550
Feature activation-6.75e-03

Pos logit contributions
attered+0.015
ifying+0.014
olk+0.013
ige+0.013
aming+0.013
Neg logit contributions
 he-0.012
 maybe-0.012
 but-0.011
 Skip-0.011
 Zoe-0.011
Token the
Token ID262
Feature activation+1.09e-02

Pos logit contributions
 villain+0.008
ock+0.008
 Skip+0.008
rob+0.008
 maybe+0.007
Neg logit contributions
attered-0.009
aming-0.008
 mailbox-0.008
-0.008
 cracked-0.008
Token slide
Token ID10649
Feature activation+4.06e-03

Pos logit contributions
 Skip+0.016
 maybe+0.016
rob+0.016
 villain+0.015
 inspection+0.015
Neg logit contributions
attered-0.019
aming-0.017
ney-0.017
opened-0.016
 cracked-0.016
Token.
Token ID13
Feature activation-3.11e-04

Pos logit contributions
 Skip+0.007
 maybe+0.007
rob+0.007
 villain+0.007
 inspection+0.007
Neg logit contributions
attered-0.006
ney-0.006
tr-0.005
aming-0.005
opened-0.005
Token After
Token ID2293
Feature activation+6.65e-03

Pos logit contributions
attered+0.001
ney+0.000
aming+0.000
opened+0.000
tr+0.000
Neg logit contributions
 Skip-0.000
rob-0.000
 maybe-0.000
ock-0.000
 inspection-0.000
Token a
Token ID257
Feature activation+1.29e-02

Pos logit contributions
ock+0.010
rob+0.010
 Skip+0.009
 villain+0.009
 inspection+0.009
Neg logit contributions
attered-0.011
aming-0.011
-0.010
 painted-0.010
 cracked-0.010
Token few
Token ID1178
Feature activation+4.73e-02

Pos logit contributions
rob+0.016
aro+0.014
 maybe+0.014
 Skip+0.014
 Zig+0.013
Neg logit contributions
attered-0.025
-0.024
tr-0.024
opened-0.023
 printed-0.023
Token turns
Token ID4962
Feature activation-6.27e-03

Pos logit contributions
 Skip+0.072
rob+0.071
 Zig+0.070
 villain+0.069
ock+0.068
Neg logit contributions
aming-0.072
attered-0.072
 cracked-0.069
amed-0.067
 painted-0.066
Token,
Token ID11
Feature activation-9.61e-03

Pos logit contributions
attered+0.014
aming+0.013
opened+0.013
ney+0.013
 cracked+0.012
Neg logit contributions
rob-0.006
 Skip-0.006
 maybe-0.006
 Zoe-0.006
ock-0.005
Token they
Token ID484
Feature activation+1.81e-03

Pos logit contributions
attered+0.021
aming+0.019
ifying+0.019
 mailbox+0.019
opened+0.019
Neg logit contributions
rob-0.010
 Skip-0.010
ock-0.009
 maybe-0.009
 inspection-0.009
Token went
Token ID1816
Feature activation+7.29e-03

Pos logit contributions
 he+0.003
 maybe+0.003
 Zoe+0.003
 but+0.003
ock+0.003
Neg logit contributions
attered-0.003
ney-0.003
lette-0.003
 mailbox-0.003
tr-0.003
Token to
Token ID284
Feature activation+7.68e-03

Pos logit contributions
 Skip+0.014
 maybe+0.014
 Zoe+0.014
rob+0.013
 inspection+0.013
Neg logit contributions
attered-0.010
ney-0.009
aming-0.009
ifying-0.008
opened-0.008
Token.
Token ID13
Feature activation+1.04e-03

Pos logit contributions
attered+0.003
ney+0.003
aming+0.003
tr+0.003
opened+0.003
Neg logit contributions
 Skip-0.004
rob-0.004
 maybe-0.004
 inspection-0.004
 villain-0.004
Token "\n"
Token ID198
Feature activation+1.73e-03

Pos logit contributions
rob+0.002
ock+0.002
 maybe+0.002
 Skip+0.002
 inspection+0.002
Neg logit contributions
attered-0.002
ney-0.002
aming-0.002
opened-0.002
ifying-0.002
Token "\n"
Token ID198
Feature activation+4.07e-03

Pos logit contributions
rob+0.003
 villain+0.003
 inspection+0.003
 maybe+0.003
 Skip+0.003
Neg logit contributions
aming-0.003
attered-0.003
-0.003
ney-0.003
opened-0.003
TokenAfter
Token ID3260
Feature activation+9.34e-03

Pos logit contributions
 Skip+0.006
 maybe+0.006
 Zoe+0.006
rob+0.006
ock+0.006
Neg logit contributions
attered-0.007
ney-0.007
aming-0.006
opened-0.006
tr-0.006
Token a
Token ID257
Feature activation+8.77e-03

Pos logit contributions
rob+0.014
ock+0.014
 inspection+0.013
 Skip+0.013
 maybe+0.013
Neg logit contributions
attered-0.015
aming-0.014
-0.014
 cracked-0.014
 painted-0.014
Token few
Token ID1178
Feature activation+4.72e-02

Pos logit contributions
rob+0.011
aro+0.010
 maybe+0.010
 Skip+0.009
 Zig+0.009
Neg logit contributions
attered-0.017
tr-0.017
-0.016
 mailbox-0.016
ifying-0.016
Token minutes
Token ID2431
Feature activation+1.74e-02

Pos logit contributions
rob+0.073
 Skip+0.072
 Zig+0.072
 Bub+0.071
ock+0.071
Neg logit contributions
aming-0.070
 cracked-0.068
attered-0.066
amed-0.065
 painted-0.064
Token,
Token ID11
Feature activation-5.74e-03

Pos logit contributions
rob+0.027
 Skip+0.027
 maybe+0.025
 villain+0.025
ock+0.025
Neg logit contributions
attered-0.029
aming-0.027
 cracked-0.026
opened-0.026
 painted-0.025
Token Rachel
Token ID15984
Feature activation-6.80e-03

Pos logit contributions
attered+0.012
ifying+0.011
aming+0.011
 cracked+0.010
 mailbox+0.010
Neg logit contributions
rob-0.007
ock-0.007
 inspection-0.006
 Skip-0.006
 maybe-0.006
Token won
Token ID1839
Feature activation-1.21e-03

Pos logit contributions
attered+0.012
ney+0.011
aming+0.011
ifying+0.011
opened+0.011
Neg logit contributions
 Skip-0.010
 maybe-0.010
 Zoe-0.010
 but-0.009
 he-0.009
Token the
Token ID262
Feature activation+4.69e-03

Pos logit contributions
attered+0.002
aming+0.002
+0.002
ney+0.002
opened+0.002
Neg logit contributions
rob-0.002
 Skip-0.002
 maybe-0.002
 inspection-0.002
ock-0.002
Token.
Token ID13
Feature activation-6.06e-03

Pos logit contributions
attered+0.001
aming+0.001
ney+0.001
opened+0.001
tr+0.001
Neg logit contributions
rob-0.002
 Skip-0.002
 villain-0.002
 maybe-0.001
ock-0.001
Token "\n"
Token ID198
Feature activation-4.73e-03

Pos logit contributions
attered+0.011
ney+0.010
opened+0.010
tr+0.010
aming+0.010
Neg logit contributions
rob-0.009
ock-0.008
 villain-0.008
 Skip-0.008
aro-0.008
Token "\n"
Token ID198
Feature activation-1.70e-03

Pos logit contributions
ney+0.006
+0.006
opened+0.005
um+0.005
aming+0.005
Neg logit contributions
 villain-0.010
rob-0.009
aro-0.008
 Skip-0.008
ock-0.008
TokenAfter
Token ID3260
Feature activation+4.99e-03

Pos logit contributions
attered+0.003
aming+0.003
ney+0.003
opened+0.003
amed+0.003
Neg logit contributions
 maybe-0.003
 Skip-0.003
rob-0.002
 Zoe-0.002
 inspection-0.002
Token a
Token ID257
Feature activation+9.74e-03

Pos logit contributions
rob+0.007
ock+0.007
 Skip+0.007
 inspection+0.007
 maybe+0.006
Neg logit contributions
attered-0.008
aming-0.008
-0.008
 mailbox-0.008
 painted-0.008
Token few
Token ID1178
Feature activation+4.71e-02

Pos logit contributions
rob+0.012
aro+0.011
 maybe+0.011
 Skip+0.010
ock+0.010
Neg logit contributions
attered-0.019
tr-0.018
-0.018
 mailbox-0.018
opened-0.018
Token times
Token ID1661
Feature activation+2.29e-02

Pos logit contributions
 Zig+0.070
 Skip+0.070
rob+0.070
 Bub+0.069
ock+0.067
Neg logit contributions
aming-0.071
attered-0.070
 cracked-0.068
-0.067
amed-0.066
Token of
Token ID286
Feature activation+1.65e-02

Pos logit contributions
rob+0.038
 Skip+0.035
 maybe+0.034
ock+0.034
 inspection+0.033
Neg logit contributions
attered-0.035
-0.032
aming-0.032
um-0.031
 cracked-0.031
Token sliding
Token ID22292
Feature activation+3.31e-03

Pos logit contributions
rob+0.026
 Skip+0.025
 maybe+0.025
 villain+0.024
ock+0.024
Neg logit contributions
attered-0.026
aming-0.025
opened-0.024
amed-0.023
 mailbox-0.023
Token,
Token ID11
Feature activation-1.07e-02

Pos logit contributions
rob+0.007
 Skip+0.006
 maybe+0.006
ock+0.006
 villain+0.006
Neg logit contributions
attered-0.004
ney-0.003
aming-0.003
 mailbox-0.003
amed-0.003
Token the
Token ID262
Feature activation+5.48e-04

Pos logit contributions
attered+0.018
ifying+0.017
 mailbox+0.017
 cracked+0.017
aming+0.016
Neg logit contributions
ock-0.015
rob-0.015
 inspection-0.014
aro-0.014
 Skip-0.013
Token to
Token ID284
Feature activation+1.34e-02

Pos logit contributions
 Skip+0.037
rob+0.037
 maybe+0.037
ock+0.035
 Zoe+0.035
Neg logit contributions
attered-0.039
aming-0.035
ney-0.035
opened-0.033
ifying-0.033
Token hide
Token ID7808
Feature activation+5.53e-04

Pos logit contributions
rob+0.023
ock+0.021
 Skip+0.020
 inspection+0.020
 maybe+0.020
Neg logit contributions
attered-0.018
 mailbox-0.018
ney-0.018
-0.018
opened-0.018
Token.
Token ID13
Feature activation+1.60e-03

Pos logit contributions
rob+0.001
 Skip+0.001
 maybe+0.001
ock+0.001
 inspection+0.001
Neg logit contributions
attered-0.001
aming-0.001
ney-0.001
ifying-0.001
opened-0.001
Token After
Token ID2293
Feature activation+7.64e-03

Pos logit contributions
rob+0.002
ock+0.002
 Skip+0.002
 villain+0.002
 inspection+0.002
Neg logit contributions
attered-0.003
ney-0.003
ifying-0.003
opened-0.003
aming-0.003
Token a
Token ID257
Feature activation+1.39e-02

Pos logit contributions
ock+0.013
rob+0.013
 Skip+0.012
 inspection+0.012
 villain+0.012
Neg logit contributions
attered-0.011
aming-0.010
-0.010
 cracked-0.010
ifying-0.010
Token few
Token ID1178
Feature activation+4.67e-02

Pos logit contributions
rob+0.020
aro+0.018
 maybe+0.018
ock+0.017
 Skip+0.017
Neg logit contributions
attered-0.024
tr-0.023
-0.023
ifying-0.023
opened-0.022
Token minutes
Token ID2431
Feature activation+1.65e-02

Pos logit contributions
rob+0.074
ock+0.073
 Zig+0.069
 Skip+0.069
 maybe+0.067
Neg logit contributions
aming-0.069
attered-0.067
 cracked-0.067
-0.065
amed-0.065
Token,
Token ID11
Feature activation-1.21e-02

Pos logit contributions
rob+0.025
 Skip+0.024
 villain+0.023
ock+0.023
 maybe+0.023
Neg logit contributions
attered-0.028
aming-0.026
opened-0.025
 cracked-0.025
ney-0.025
Token John
Token ID1757
Feature activation-8.11e-03

Pos logit contributions
attered+0.024
aming+0.022
 mailbox+0.022
ifying+0.022
ney+0.022
Neg logit contributions
rob-0.014
 Skip-0.014
ock-0.014
 maybe-0.013
 inspection-0.013
Token called
Token ID1444
Feature activation+2.37e-03

Pos logit contributions
attered+0.014
aming+0.012
ney+0.012
ifying+0.012
opened+0.012
Neg logit contributions
 Skip-0.012
 maybe-0.012
 Zoe-0.012
rob-0.012
 inspection-0.012
Token out
Token ID503
Feature activation+5.36e-03

Pos logit contributions
rob+0.004
 Skip+0.004
 maybe+0.004
 inspection+0.004
ock+0.004
Neg logit contributions
attered-0.004
aming-0.003
ney-0.003
opened-0.003
 mailbox-0.003
Token " "
Token ID220
Feature activation-6.75e-04

Pos logit contributions
attered+0.022
aming+0.020
ney+0.020
opened+0.019
ifying+0.019
Neg logit contributions
 Skip-0.015
 maybe-0.015
rob-0.014
 Zoe-0.014
 inspection-0.014
Token "\n"
Token ID198
Feature activation-7.19e-03

Pos logit contributions
attered+0.001
aming+0.001
ney+0.001
amed+0.001
opened+0.001
Neg logit contributions
rob-0.001
 Skip-0.001
 inspection-0.001
 villain-0.001
 maybe-0.001
Token "\n"
Token ID198
Feature activation-4.81e-03

Pos logit contributions
aming+0.010
amed+0.010
ney+0.010
attered+0.009
+0.009
Neg logit contributions
 villain-0.013
rob-0.012
aro-0.012
 Skip-0.012
ock-0.012
TokenAfter
Token ID3260
Feature activation+2.93e-03

Pos logit contributions
attered+0.009
ney+0.008
aming+0.008
opened+0.008
ifying+0.008
Neg logit contributions
 Skip-0.007
 maybe-0.007
 Zoe-0.006
rob-0.006
ock-0.006
Token a
Token ID257
Feature activation+1.26e-02

Pos logit contributions
rob+0.004
ock+0.004
 Skip+0.004
 inspection+0.004
 maybe+0.004
Neg logit contributions
attered-0.005
aming-0.005
 mailbox-0.005
ifying-0.005
opened-0.005
Token few
Token ID1178
Feature activation+4.67e-02

Pos logit contributions
rob+0.015
aro+0.014
 Skip+0.013
 maybe+0.013
ock+0.012
Neg logit contributions
attered-0.025
tr-0.024
 printed-0.024
 mailbox-0.024
opened-0.023
Token minutes
Token ID2431
Feature activation+1.57e-02

Pos logit contributions
 Skip+0.070
ock+0.069
 Zig+0.068
 maybe+0.068
rob+0.067
Neg logit contributions
aming-0.071
 cracked-0.068
attered-0.068
amed-0.066
 painted-0.064
Token,
Token ID11
Feature activation-1.26e-02

Pos logit contributions
 Skip+0.022
rob+0.022
 maybe+0.022
ock+0.021
 villain+0.021
Neg logit contributions
attered-0.028
aming-0.026
ney-0.025
opened-0.025
 cracked-0.025
Token the
Token ID262
Feature activation-2.61e-03

Pos logit contributions
attered+0.026
 cracked+0.024
aming+0.024
ifying+0.024
 mailbox+0.024
Neg logit contributions
rob-0.015
ock-0.014
 Skip-0.014
 inspection-0.013
aro-0.013
Token bird
Token ID6512
Feature activation+1.38e-04

Pos logit contributions
attered+0.006
ifying+0.005
ney+0.005
opened+0.005
amed+0.005
Neg logit contributions
 maybe-0.003
ock-0.003
 Skip-0.003
 Zoe-0.003
 but-0.003
Token flew
Token ID13112
Feature activation-5.23e-03

Pos logit contributions
 Skip+0.000
 maybe+0.000
rob+0.000
 Zoe+0.000
 inspection+0.000
Neg logit contributions
attered-0.000
aming-0.000
ney-0.000
opened-0.000
ifying-0.000
Token.
Token ID13
Feature activation-7.84e-03

Pos logit contributions
rob+0.016
 Skip+0.016
 maybe+0.015
ock+0.015
 villain+0.015
Neg logit contributions
attered-0.011
ney-0.010
aming-0.010
opened-0.010
amed-0.009
Token
Token ID198
Feature activation-6.06e-03

Pos logit contributions
attered+0.016
ney+0.014
aming+0.014
opened+0.014
aut+0.014
Neg logit contributions
rob-0.010
 Skip-0.010
 maybe-0.010
ock-0.009
 villain-0.009
Token
Token ID198
Feature activation-4.35e-03

Pos logit contributions
ney+0.008
+0.007
um+0.007
aming+0.007
opened+0.007
Neg logit contributions
 villain-0.012
rob-0.012
 Skip-0.011
 maybe-0.011
ock-0.011
TokenAfter
Token ID3260
Feature activation+5.35e-03

Pos logit contributions
attered+0.009
ige+0.008
ifying+0.008
 mailbox+0.008
tr+0.008
Neg logit contributions
 Skip-0.005
 he-0.005
 maybe-0.005
 inspection-0.005
 Zoe-0.005
Token a
Token ID257
Feature activation+9.02e-03

Pos logit contributions
rob+0.009
ock+0.009
 Skip+0.009
 inspection+0.009
 maybe+0.009
Neg logit contributions
attered-0.008
aming-0.007
 mailbox-0.007
opened-0.007
amed-0.007
Token few
Token ID1178
Feature activation+4.66e-02

Pos logit contributions
rob+0.011
 maybe+0.010
aro+0.009
izards+0.009
ock+0.009
Neg logit contributions
attered-0.017
 mailbox-0.017
tr-0.017
 printed-0.016
-0.016
Token minutes
Token ID2431
Feature activation+1.61e-02

Pos logit contributions
rob+0.075
ock+0.073
 Skip+0.073
 Bub+0.072
 Zig+0.072
Neg logit contributions
aming-0.065
attered-0.064
 cracked-0.062
amed-0.060
-0.058
Token,
Token ID11
Feature activation-1.39e-02

Pos logit contributions
 Skip+0.023
rob+0.022
 maybe+0.022
ock+0.021
 Zoe+0.021
Neg logit contributions
attered-0.029
aming-0.027
ney-0.026
opened-0.026
 cracked-0.025
Token Joe
Token ID5689
Feature activation-9.04e-03

Pos logit contributions
attered+0.028
aming+0.026
ney+0.026
opened+0.025
ifying+0.025
Neg logit contributions
rob-0.017
 Skip-0.017
 maybe-0.016
ock-0.016
 Zoe-0.016
Token saw
Token ID2497
Feature activation+1.43e-03

Pos logit contributions
attered+0.014
ney+0.013
aming+0.012
ifying+0.012
tr+0.012
Neg logit contributions
 maybe-0.015
 Skip-0.015
 Zoe-0.015
 he-0.014
 but-0.014
Token the
Token ID262
Feature activation-3.19e-03

Pos logit contributions
rob+0.003
 Skip+0.003
 inspection+0.003
ock+0.003
 villain+0.003
Neg logit contributions
attered-0.002
aming-0.002
 mailbox-0.002
 cracked-0.002
amed-0.002
Token and
Token ID290
Feature activation+4.83e-04

Pos logit contributions
 maybe+0.004
 Skip+0.004
 Zoe+0.004
 inspection+0.004
rob+0.004
Neg logit contributions
attered-0.005
aming-0.005
ney-0.005
opened-0.004
ifying-0.004
Token gentle
Token ID10296
Feature activation+1.40e-02

Pos logit contributions
rob+0.001
 Skip+0.001
ock+0.001
 inspection+0.001
 Zoe+0.001
Neg logit contributions
attered-0.001
aming-0.001
 cracked-0.001
ney-0.001
 painted-0.001
Token.
Token ID13
Feature activation-5.88e-03

Pos logit contributions
rob+0.023
 Skip+0.023
 maybe+0.023
 inspection+0.022
ock+0.022
Neg logit contributions
attered-0.022
aming-0.020
ney-0.020
tr-0.019
opened-0.019
Token After
Token ID2293
Feature activation+2.87e-03

Pos logit contributions
attered+0.010
ney+0.009
aming+0.009
opened+0.009
ifying+0.009
Neg logit contributions
rob-0.009
 Skip-0.009
 maybe-0.009
ock-0.008
 inspection-0.008
Token a
Token ID257
Feature activation+1.10e-02

Pos logit contributions
rob+0.005
ock+0.005
 Skip+0.005
 inspection+0.005
 villain+0.005
Neg logit contributions
aming-0.004
attered-0.004
 mailbox-0.003
 painted-0.003
 cracked-0.003
Token few
Token ID1178
Feature activation+4.65e-02

Pos logit contributions
rob+0.016
 maybe+0.014
 Skip+0.013
aro+0.013
ock+0.013
Neg logit contributions
attered-0.019
tr-0.018
 mailbox-0.018
-0.018
 cracked-0.018
Token moments
Token ID7188
Feature activation+2.22e-02

Pos logit contributions
rob+0.072
 Bub+0.070
 Zig+0.069
 Skip+0.069
ock+0.069
Neg logit contributions
aming-0.069
 cracked-0.067
attered-0.067
-0.064
amed-0.064
Token,
Token ID11
Feature activation-1.07e-02

Pos logit contributions
rob+0.043
 Skip+0.039
 villain+0.038
 villains+0.038
ock+0.036
Neg logit contributions
attered-0.028
 painted-0.028
-0.028
aming-0.027
 cracked-0.027
Token the
Token ID262
Feature activation+2.13e-03

Pos logit contributions
attered+0.021
ifying+0.021
 cracked+0.020
+0.020
aming+0.020
Neg logit contributions
rob-0.013
ock-0.012
aro-0.011
 inspection-0.011
 Skip-0.011
Token dove
Token ID38862
Feature activation+8.69e-03

Pos logit contributions
 Skip+0.003
 maybe+0.003
 Zoe+0.003
rob+0.003
 inspection+0.003
Neg logit contributions
attered-0.004
ney-0.004
aming-0.004
opened-0.004
ifying-0.004
Token flew
Token ID13112
Feature activation-4.63e-03

Pos logit contributions
 Skip+0.016
 maybe+0.016
rob+0.016
 Zoe+0.015
 inspection+0.015
Neg logit contributions
attered-0.012
aming-0.011
ney-0.011
opened-0.010
amed-0.010
Token best
Token ID1266
Feature activation-5.61e-03

Pos logit contributions
attered+0.008
aming+0.007
ney+0.007
opened+0.007
ifying+0.007
Neg logit contributions
 Skip-0.015
 maybe-0.014
 Zoe-0.014
rob-0.014
 Boo-0.014
Token.
Token ID13
Feature activation-9.05e-03

Pos logit contributions
attered+0.007
tr+0.006
ney+0.006
um+0.006
 painted+0.006
Neg logit contributions
rob-0.012
 Skip-0.011
aro-0.011
 Zig-0.011
 inspection-0.010
Token But
Token ID887
Feature activation+9.77e-03

Pos logit contributions
attered+0.017
aming+0.016
ney+0.016
ifying+0.015
amed+0.015
Neg logit contributions
 maybe-0.012
 Skip-0.012
 Zoe-0.012
 inspection-0.011
 but-0.011
Token after
Token ID706
Feature activation-3.59e-03

Pos logit contributions
 Skip+0.013
rob+0.013
 maybe+0.012
ock+0.012
 inspection+0.012
Neg logit contributions
attered-0.019
aming-0.017
ney-0.017
opened-0.017
 mailbox-0.017
Token a
Token ID257
Feature activation+1.19e-02

Pos logit contributions
attered+0.004
aming+0.004
 painted+0.004
 mailbox+0.004
+0.004
Neg logit contributions
rob-0.007
ock-0.007
 Skip-0.007
 inspection-0.006
aro-0.006
Token few
Token ID1178
Feature activation+4.65e-02

Pos logit contributions
rob+0.016
aro+0.014
ock+0.013
izards+0.013
 maybe+0.013
Neg logit contributions
attered-0.022
tr-0.021
 printed-0.020
opened-0.020
-0.020
Token minutes
Token ID2431
Feature activation+1.20e-02

Pos logit contributions
rob+0.080
 Zig+0.079
ock+0.078
 Skip+0.076
 Bub+0.075
Neg logit contributions
aming-0.059
 cracked-0.058
attered-0.057
-0.056
 painted-0.053
Token,
Token ID11
Feature activation-7.51e-03

Pos logit contributions
 Skip+0.018
rob+0.018
 maybe+0.017
ock+0.017
 villain+0.017
Neg logit contributions
attered-0.021
aming-0.019
 cracked-0.018
opened-0.018
ney-0.018
Token she
Token ID673
Feature activation+6.22e-03

Pos logit contributions
attered+0.016
aming+0.015
 mailbox+0.014
 cracked+0.014
ifying+0.014
Neg logit contributions
rob-0.008
 Skip-0.008
ock-0.008
 inspection-0.007
 maybe-0.007
Token felt
Token ID2936
Feature activation+1.43e-03

Pos logit contributions
 Skip+0.010
 maybe+0.010
 Zoe+0.010
rob+0.010
 inspection+0.010
Neg logit contributions
attered-0.010
aming-0.009
ney-0.009
opened-0.008
ifying-0.008
Token tired
Token ID10032
Feature activation-1.02e-03

Pos logit contributions
rob+0.003
 Skip+0.003
 inspection+0.002
 Zoe+0.002
aro+0.002
Neg logit contributions
attered-0.002
aming-0.002
ifying-0.002
 cracked-0.002
amed-0.002
Token
Token ID220
Feature activation+4.00e-03

Pos logit contributions
rob+0.011
ock+0.010
 villain+0.010
aro+0.010
 Skip+0.010
Neg logit contributions
attered-0.010
ney-0.009
opened-0.009
tr-0.009
 mailbox-0.009
Token
Token ID198
Feature activation+5.02e-04

Pos logit contributions
 villain+0.007
rob+0.007
 Skip+0.007
ock+0.007
aro+0.007
Neg logit contributions
ney-0.006
attered-0.005
opened-0.005
-0.005
um-0.005
Token
Token ID198
Feature activation+6.83e-03

Pos logit contributions
 villain+0.001
rob+0.001
aro+0.001
 Skip+0.001
ock+0.001
Neg logit contributions
-0.001
ney-0.001
opened-0.001
aming-0.001
aut-0.001
TokenAfter
Token ID3260
Feature activation+6.07e-03

Pos logit contributions
 Skip+0.009
 maybe+0.009
ock+0.009
rob+0.009
 Zoe+0.009
Neg logit contributions
attered-0.013
ney-0.012
aming-0.012
ifying-0.012
 mailbox-0.011
Token a
Token ID257
Feature activation+1.06e-02

Pos logit contributions
ock+0.009
rob+0.009
 Skip+0.009
 inspection+0.009
 maybe+0.009
Neg logit contributions
attered-0.009
aming-0.009
 cracked-0.008
 mailbox-0.008
opened-0.008
Token few
Token ID1178
Feature activation+4.64e-02

Pos logit contributions
rob+0.012
aro+0.011
 maybe+0.011
 Skip+0.011
ock+0.011
Neg logit contributions
attered-0.021
tr-0.020
 mailbox-0.020
opened-0.020
 cracked-0.020
Token minutes
Token ID2431
Feature activation+1.63e-02

Pos logit contributions
 Skip+0.072
 Zig+0.071
 Zoe+0.071
rob+0.070
 maybe+0.070
Neg logit contributions
attered-0.069
aming-0.068
 cracked-0.066
amed-0.063
um-0.061
Token,
Token ID11
Feature activation-9.69e-03

Pos logit contributions
 Skip+0.022
 maybe+0.021
rob+0.021
 Zoe+0.020
 inspection+0.020
Neg logit contributions
attered-0.031
aming-0.028
ney-0.028
opened-0.027
amed-0.027
Token the
Token ID262
Feature activation-8.14e-04

Pos logit contributions
attered+0.018
 mailbox+0.017
 cracked+0.017
ifying+0.016
aming+0.016
Neg logit contributions
rob-0.013
ock-0.013
 Skip-0.012
 inspection-0.012
 Zoe-0.012
Token owner
Token ID4870
Feature activation-1.61e-03

Pos logit contributions
attered+0.002
ney+0.002
aming+0.001
lette+0.001
opened+0.001
Neg logit contributions
 Skip-0.001
 villain-0.001
 maybe-0.001
 he-0.001
 Zoe-0.001
Token realized
Token ID6939
Feature activation+1.42e-02

Pos logit contributions
attered+0.003
ifying+0.003
aming+0.002
olk+0.002
lette+0.002
Neg logit contributions
 he-0.003
 maybe-0.003
 but-0.002
 Zoe-0.002
ock-0.002
Token."
Token ID526
Feature activation-2.88e-03

Pos logit contributions
 Skip+0.004
 maybe+0.004
rob+0.004
 Zoe+0.004
 inspection+0.004
Neg logit contributions
attered-0.004
ney-0.003
aming-0.003
opened-0.003
ifying-0.003
Token
Token ID198
Feature activation-6.85e-03

Pos logit contributions
attered+0.004
+0.004
aut+0.004
ney+0.004
um+0.004
Neg logit contributions
rob-0.005
ock-0.005
 villain-0.005
aro-0.005
 Skip-0.005
Token
Token ID198
Feature activation-1.90e-03

Pos logit contributions
aming+0.010
ney+0.009
+0.009
attered+0.009
opened+0.009
Neg logit contributions
 villain-0.012
rob-0.012
 Skip-0.011
ock-0.011
aro-0.010
TokenAfter
Token ID3260
Feature activation+6.63e-03

Pos logit contributions
attered+0.004
ney+0.004
aming+0.004
ifying+0.004
opened+0.004
Neg logit contributions
 Skip-0.002
 maybe-0.002
 Zoe-0.002
rob-0.002
 inspection-0.002
Token a
Token ID257
Feature activation+1.22e-02

Pos logit contributions
rob+0.012
ock+0.012
 inspection+0.011
 Skip+0.011
 maybe+0.010
Neg logit contributions
aming-0.009
attered-0.009
 cracked-0.008
ifying-0.008
 mailbox-0.008
Token few
Token ID1178
Feature activation+4.62e-02

Pos logit contributions
rob+0.017
 maybe+0.015
 Skip+0.015
ock+0.015
aro+0.015
Neg logit contributions
attered-0.021
tr-0.021
 mailbox-0.021
 cracked-0.020
ifying-0.020
Token seconds
Token ID4201
Feature activation+8.36e-03

Pos logit contributions
ock+0.068
rob+0.068
 Skip+0.068
 Zig+0.067
 Zoe+0.066
Neg logit contributions
aming-0.070
 cracked-0.069
attered-0.069
amed-0.064
 mailbox-0.064
Token,
Token ID11
Feature activation-2.72e-03

Pos logit contributions
 Skip+0.012
rob+0.012
 maybe+0.011
ock+0.011
 Zoe+0.011
Neg logit contributions
attered-0.015
aming-0.014
ney-0.014
opened-0.013
 cracked-0.013
Token she
Token ID673
Feature activation+7.35e-03

Pos logit contributions
attered+0.005
 cracked+0.005
tr+0.005
ifying+0.004
aming+0.004
Neg logit contributions
rob-0.004
ock-0.004
 Skip-0.003
aro-0.003
 inspection-0.003
Token sees
Token ID7224
Feature activation+7.64e-03

Pos logit contributions
 maybe+0.011
 Skip+0.011
 villain+0.010
 Zoe+0.010
 he+0.010
Neg logit contributions
attered-0.014
aming-0.012
ney-0.012
lette-0.012
ifying-0.012
Token Tom
Token ID4186
Feature activation-1.22e-03

Pos logit contributions
rob+0.013
 Skip+0.013
ner+0.012
aro+0.012
 inspection+0.012
Neg logit contributions
attered-0.012
aming-0.010
 cracked-0.010
 mailbox-0.010
um-0.010
Token:
Token ID25
Feature activation-8.21e-03

Pos logit contributions
 maybe+0.016
 Skip+0.015
 Zoe+0.015
 Boo+0.015
ock+0.014
Neg logit contributions
attered-0.016
aming-0.014
ifying-0.013
ney-0.013
opened-0.013
Token Don
Token ID2094
Feature activation+1.49e-02

Pos logit contributions
attered+0.016
aming+0.015
ney+0.015
opened+0.014
ifying+0.014
Neg logit contributions
 Skip-0.011
 maybe-0.010
rob-0.010
 Zoe-0.010
 inspection-0.010
Token't
Token ID470
Feature activation-1.41e-02

Pos logit contributions
 maybe+0.015
 Skip+0.015
 inspection+0.015
rob+0.015
 Zoe+0.014
Neg logit contributions
attered-0.032
aming-0.031
ney-0.030
opened-0.029
tr-0.029
Token jump
Token ID4391
Feature activation-4.10e-03

Pos logit contributions
attered+0.026
aming+0.024
ney+0.024
amed+0.023
opened+0.023
Neg logit contributions
 maybe-0.020
 Skip-0.020
 Zoe-0.019
 but-0.018
ock-0.018
Token to
Token ID284
Feature activation+9.11e-03

Pos logit contributions
attered+0.007
aming+0.006
ney+0.006
opened+0.006
ifying+0.006
Neg logit contributions
 Skip-0.006
 maybe-0.006
rob-0.006
 Zoe-0.006
 inspection-0.006
Token conclusions
Token ID13242
Feature activation+4.60e-02

Pos logit contributions
rob+0.018
ner+0.016
aro+0.016
 Zac+0.016
 Skip+0.015
Neg logit contributions
 mailbox-0.012
 painted-0.011
ra-0.010
 frame-0.010
-0.010
Token and
Token ID290
Feature activation+8.76e-03

Pos logit contributions
rob+0.095
aro+0.088
 inspection+0.087
 Zig+0.087
 Skip+0.087
Neg logit contributions
-0.053
 mailbox-0.045
 painted-0.042
ney-0.041
aut-0.041
Token accuse
Token ID24456
Feature activation+1.05e-02

Pos logit contributions
rob+0.014
ock+0.013
 Skip+0.013
 inspection+0.013
 Zig+0.012
Neg logit contributions
attered-0.014
 mailbox-0.013
tr-0.013
 painted-0.013
 cracked-0.013
Token your
Token ID534
Feature activation+2.04e-04

Pos logit contributions
rob+0.019
 inspection+0.017
 Skip+0.016
ner+0.016
aro+0.016
Neg logit contributions
 mailbox-0.015
ige-0.015
attered-0.015
ra-0.014
 painted-0.014
Token friends
Token ID2460
Feature activation+1.82e-03

Pos logit contributions
 Skip+0.000
 maybe+0.000
rob+0.000
 Zoe+0.000
 inspection+0.000
Neg logit contributions
attered-0.000
aming-0.000
ney-0.000
opened-0.000
ifying-0.000
Token.
Token ID13
Feature activation-4.81e-03

Pos logit contributions
 Skip+0.003
rob+0.003
 maybe+0.003
 Zoe+0.003
 inspection+0.003
Neg logit contributions
attered-0.003
aming-0.003
ney-0.003
opened-0.003
ifying-0.003
Token
Token ID220
Feature activation-2.82e-03

Pos logit contributions
attered+0.032
aming+0.031
ney+0.030
ige+0.029
amed+0.029
Neg logit contributions
 maybe-0.015
 Skip-0.015
 Zoe-0.014
 he-0.014
 Boo-0.013
Token
Token ID198
Feature activation-1.14e-02

Pos logit contributions
attered+0.005
ney+0.005
opened+0.005
aming+0.004
aut+0.004
Neg logit contributions
rob-0.004
 villain-0.004
 Skip-0.004
aro-0.004
 inspection-0.004
Token
Token ID198
Feature activation-1.20e-02

Pos logit contributions
attered+0.018
aming+0.018
+0.017
ney+0.017
opened+0.017
Neg logit contributions
 villain-0.018
rob-0.018
 Skip-0.017
 maybe-0.017
aro-0.017
TokenAfter
Token ID3260
Feature activation+4.69e-03

Pos logit contributions
attered+0.025
ige+0.023
tr+0.022
 mailbox+0.022
ra+0.022
Neg logit contributions
 he-0.015
 Skip-0.014
ock-0.013
 maybe-0.013
 but-0.013
Token a
Token ID257
Feature activation+9.36e-03

Pos logit contributions
rob+0.008
ock+0.008
 inspection+0.008
 Skip+0.007
 maybe+0.007
Neg logit contributions
attered-0.006
aming-0.006
 painted-0.006
-0.006
 printed-0.006
Token few
Token ID1178
Feature activation+4.59e-02

Pos logit contributions
rob+0.012
 maybe+0.011
ock+0.011
aro+0.011
 Zig+0.010
Neg logit contributions
attered-0.017
 printed-0.017
tr-0.016
 mailbox-0.016
 cracked-0.016
Token moments
Token ID7188
Feature activation+1.89e-02

Pos logit contributions
 Zig+0.069
rob+0.067
ock+0.067
 inspection+0.065
 maybe+0.064
Neg logit contributions
aming-0.072
attered-0.070
 cracked-0.069
amed-0.066
-0.065
Token,
Token ID11
Feature activation-2.20e-02

Pos logit contributions
rob+0.036
 Skip+0.033
 villains+0.032
 villain+0.032
ock+0.031
Neg logit contributions
attered-0.024
 painted-0.023
-0.023
 printed-0.023
aming-0.022
Token he
Token ID339
Feature activation+2.94e-03

Pos logit contributions
attered+0.045
ney+0.041
aming+0.041
tr+0.040
opened+0.040
Neg logit contributions
 maybe-0.028
 Skip-0.028
 Zoe-0.026
 he-0.026
 but-0.025
Token came
Token ID1625
Feature activation-1.57e-02

Pos logit contributions
 maybe+0.005
 Skip+0.005
 Zoe+0.005
rob+0.005
 he+0.005
Neg logit contributions
attered-0.005
ney-0.004
aming-0.004
opened-0.004
ifying-0.004
Token up
Token ID510
Feature activation-9.71e-03

Pos logit contributions
attered+0.024
tr+0.022
ney+0.021
 mailbox+0.021
ige+0.021
Neg logit contributions
 Skip-0.027
 Zoe-0.026
 but-0.026
 maybe-0.026
 Boo-0.025
Token to
Token ID284
Feature activation-1.28e-02

Pos logit contributions
attered+0.014
aming+0.012
ney+0.012
opened+0.011
 mailbox+0.011
Neg logit contributions
rob-0.013
 Skip-0.013
 maybe-0.013
ock-0.012
 Zoe-0.012
Token meet
Token ID1826
Feature activation-2.93e-03

Pos logit contributions
attered+0.020
ra+0.019
+0.019
aming+0.019
 painted+0.019
Neg logit contributions
rob-0.021
 Skip-0.019
ock-0.019
 maybe-0.018
 villain-0.018
Token him
Token ID683
Feature activation-2.08e-03

Pos logit contributions
attered+0.005
ney+0.005
 mailbox+0.005
aming+0.005
+0.005
Neg logit contributions
rob-0.004
ock-0.004
 inspection-0.004
 villain-0.004
 maybe-0.004
Token.
Token ID13
Feature activation+1.29e-03

Pos logit contributions
attered+0.004
aming+0.003
ney+0.003
opened+0.003
ifying+0.003
Neg logit contributions
 Skip-0.003
 maybe-0.003
rob-0.003
 Zoe-0.003
 inspection-0.003
Token The
Token ID383
Feature activation+1.13e-02

Pos logit contributions
rob+0.002
ock+0.002
 maybe+0.002
 inspection+0.002
 Skip+0.002
Neg logit contributions
attered-0.002
ney-0.002
ifying-0.002
opened-0.002
aming-0.002
Token three
Token ID1115
Feature activation+4.59e-02

Pos logit contributions
 Skip+0.018
 maybe+0.018
 Zoe+0.017
rob+0.017
 inspection+0.017
Neg logit contributions
attered-0.019
aming-0.017
ney-0.017
opened-0.017
ifying-0.016
Token snakes
Token ID26042
Feature activation-7.20e-03

Pos logit contributions
rob+0.076
 Skip+0.075
 Zig+0.071
 villain+0.071
 maybe+0.071
Neg logit contributions
attered-0.073
ney-0.070
 cracked-0.066
aming-0.065
um-0.064
Token began
Token ID2540
Feature activation-9.94e-03

Pos logit contributions
attered+0.012
aming+0.011
ney+0.010
opened+0.010
amed+0.010
Neg logit contributions
 Skip-0.011
rob-0.011
 maybe-0.011
 Zoe-0.011
 inspection-0.011
Token to
Token ID284
Feature activation-1.09e-03

Pos logit contributions
attered+0.017
aming+0.015
ney+0.015
opened+0.015
ifying+0.015
Neg logit contributions
 Skip-0.015
 maybe-0.015
 Zoe-0.015
 inspection-0.015
rob-0.015
Token play
Token ID711
Feature activation+1.71e-02

Pos logit contributions
attered+0.002
aming+0.002
ney+0.002
 cracked+0.002
opened+0.002
Neg logit contributions
rob-0.002
 maybe-0.002
 Skip-0.002
 inspection-0.001
 Zoe-0.001
Token together
Token ID1978
Feature activation+1.04e-02

Pos logit contributions
rob+0.031
 inspection+0.030
ock+0.029
 maybe+0.028
aro+0.028
Neg logit contributions
attered-0.022
aming-0.021
ney-0.021
amed-0.020
-0.020
Token he
Token ID339
Feature activation+5.52e-03

Pos logit contributions
rob+0.009
 Skip+0.009
ock+0.009
 inspection+0.009
 villain+0.009
Neg logit contributions
attered-0.014
 cracked-0.012
aming-0.012
 mailbox-0.012
opened-0.012
Token said
Token ID531
Feature activation+6.47e-04

Pos logit contributions
 maybe+0.008
 Skip+0.008
 Zoe+0.008
rob+0.008
 inspection+0.007
Neg logit contributions
attered-0.010
aming-0.009
ney-0.009
ifying-0.009
opened-0.009
Token it
Token ID340
Feature activation+1.19e-03

Pos logit contributions
rob+0.001
 Skip+0.001
 maybe+0.001
ock+0.001
 Zoe+0.001
Neg logit contributions
attered-0.001
aming-0.001
ney-0.001
opened-0.001
ifying-0.001
Token was
Token ID373
Feature activation-3.83e-04

Pos logit contributions
 maybe+0.002
 Skip+0.002
 Zoe+0.002
 he+0.002
 but+0.002
Neg logit contributions
attered-0.002
aming-0.002
ifying-0.002
ney-0.002
opened-0.002
Token only
Token ID691
Feature activation+4.55e-03

Pos logit contributions
attered+0.001
ney+0.001
aut+0.001
aming+0.001
opened+0.001
Neg logit contributions
 maybe-0.000
 Skip-0.000
 Zoe-0.000
 but-0.000
 he-0.000
Token $
Token ID720
Feature activation-5.16e-02

Pos logit contributions
 maybe+0.006
 Zoe+0.006
 Skip+0.006
 but+0.006
 inspection+0.005
Neg logit contributions
attered-0.009
aming-0.008
ney-0.008
aut-0.008
opened-0.008
Token10
Token ID940
Feature activation+7.94e-04

Pos logit contributions
attered+0.085
tr+0.082
ney+0.082
olk+0.075
 intact+0.075
Neg logit contributions
ock-0.069
ner-0.067
 Many-0.066
 but-0.066
 maybe-0.065
Token.
Token ID13
Feature activation-3.23e-03

Pos logit contributions
 Skip+0.001
 maybe+0.001
rob+0.001
 Zoe+0.001
 inspection+0.001
Neg logit contributions
attered-0.001
aming-0.001
ney-0.001
opened-0.001
ifying-0.001
Token
Token ID220
Feature activation+6.34e-03

Pos logit contributions
attered+0.006
ney+0.005
aming+0.005
opened+0.005
 mailbox+0.005
Neg logit contributions
rob-0.005
 Skip-0.004
ock-0.004
 maybe-0.004
 villain-0.004
Token
Token ID198
Feature activation-4.65e-03

Pos logit contributions
rob+0.009
 villain+0.009
 Skip+0.009
ock+0.009
 inspection+0.008
Neg logit contributions
attered-0.011
opened-0.011
ney-0.011
aming-0.010
amed-0.010
Token
Token ID198
Feature activation-2.86e-03

Pos logit contributions
attered+0.008
aming+0.008
ney+0.008
amed+0.007
opened+0.007
Neg logit contributions
 villain-0.007
rob-0.007
 maybe-0.007
 Skip-0.006
 inspection-0.006
Token with
Token ID351
Feature activation-1.54e-02

Pos logit contributions
attered+0.014
ige+0.013
aming+0.013
aut+0.013
olk+0.013
Neg logit contributions
 but-0.009
 he-0.009
 maybe-0.008
 though-0.008
 Zoe-0.008
Token lots
Token ID6041
Feature activation+8.38e-03

Pos logit contributions
attered+0.029
aming+0.027
ney+0.027
opened+0.026
tr+0.025
Neg logit contributions
 Skip-0.021
 maybe-0.021
rob-0.020
 Zoe-0.020
 inspection-0.019
Token of
Token ID286
Feature activation+3.68e-03

Pos logit contributions
 Skip+0.009
 maybe+0.008
rob+0.008
 Zoe+0.008
 inspection+0.008
Neg logit contributions
attered-0.019
aming-0.017
ney-0.017
opened-0.017
ifying-0.017
Token pretty
Token ID2495
Feature activation+1.18e-02

Pos logit contributions
 Skip+0.007
 villain+0.007
rob+0.007
 maybe+0.007
 Boo+0.007
Neg logit contributions
attered-0.004
 mailbox-0.004
ney-0.004
 cracked-0.004
aming-0.004
Token or
Token ID393
Feature activation-6.41e-03

Pos logit contributions
 Skip+0.019
 maybe+0.019
rob+0.018
 Zoe+0.018
 villain+0.018
Neg logit contributions
attered-0.019
aming-0.017
ney-0.017
 mailbox-0.017
opened-0.016
Tokennam
Token ID7402
Feature activation-4.68e-02

Pos logit contributions
attered+0.010
aming+0.009
ney+0.009
opened+0.009
amed+0.008
Neg logit contributions
 maybe-0.011
 Skip-0.011
 Zoe-0.010
rob-0.010
 inspection-0.010
Tokenents
Token ID658
Feature activation-1.90e-04

Pos logit contributions
attered+0.096
tr+0.095
 mailbox+0.090
aming+0.089
 hyd+0.089
Neg logit contributions
ock-0.043
ner-0.040
 inspection-0.039
 Zoe-0.038
 maybe-0.037
Token.
Token ID13
Feature activation-7.16e-04

Pos logit contributions
attered+0.000
aming+0.000
ney+0.000
opened+0.000
amed+0.000
Neg logit contributions
 maybe-0.000
 Skip-0.000
rob-0.000
 Zoe-0.000
ock-0.000
Token One
Token ID1881
Feature activation-9.82e-03

Pos logit contributions
attered+0.001
tr+0.001
 mailbox+0.001
ney+0.001
opened+0.001
Neg logit contributions
aro-0.001
rob-0.001
 Skip-0.001
 villain-0.001
ock-0.001
Token day
Token ID1110
Feature activation+2.68e-03

Pos logit contributions
attered+0.017
amed+0.015
ifying+0.015
aming+0.014
opened+0.014
Neg logit contributions
 maybe-0.016
 he-0.015
ock-0.015
 but-0.015
 Zoe-0.015
Token,
Token ID11
Feature activation-1.41e-02

Pos logit contributions
 Skip+0.004
rob+0.004
 maybe+0.004
ock+0.004
 Zoe+0.003
Neg logit contributions
attered-0.005
aming-0.004
ney-0.004
ifying-0.004
opened-0.004
Token.
Token ID13
Feature activation+6.72e-04

Pos logit contributions
attered+0.031
amed+0.028
aut+0.028
aming+0.028
ney+0.028
Neg logit contributions
 but-0.016
 he-0.015
 maybe-0.014
 Zoe-0.013
rob-0.013
Token Mom
Token ID11254
Feature activation-2.82e-03

Pos logit contributions
 maybe+0.001
 Skip+0.001
 Zoe+0.001
rob+0.001
 inspection+0.001
Neg logit contributions
attered-0.001
aming-0.001
ney-0.001
opened-0.001
ifying-0.001
Token says
Token ID1139
Feature activation+3.99e-03

Pos logit contributions
attered+0.006
aming+0.005
ifying+0.005
ney+0.005
opened+0.005
Neg logit contributions
 maybe-0.004
 Skip-0.004
 Zoe-0.004
rob-0.004
 inspection-0.004
Token they
Token ID484
Feature activation+7.17e-04

Pos logit contributions
rob+0.006
 Skip+0.005
ock+0.005
 maybe+0.005
 villain+0.005
Neg logit contributions
attered-0.007
 mailbox-0.006
aming-0.006
ney-0.006
 cracked-0.006
Token are
Token ID389
Feature activation+6.96e-04

Pos logit contributions
 maybe+0.001
 Skip+0.001
 but+0.001
 Zoe+0.001
 he+0.001
Neg logit contributions
attered-0.001
aming-0.001
amed-0.001
ney-0.001
aut-0.001
Token av
Token ID1196
Feature activation-4.61e-02

Pos logit contributions
 maybe+0.001
 but+0.001
 Zoe+0.001
 he+0.001
 Skip+0.001
Neg logit contributions
attered-0.001
aut-0.001
ney-0.001
ifying-0.001
tr-0.001
Tokenoc
Token ID420
Feature activation+1.07e-02

Pos logit contributions
attered+0.102
aming+0.093
tr+0.092
ifying+0.091
 cracked+0.091
Neg logit contributions
rob-0.049
 Skip-0.045
ock-0.042
 maybe-0.040
 inspection-0.040
Tokenados
Token ID22484
Feature activation-1.68e-02

Pos logit contributions
 Skip+0.018
 maybe+0.018
rob+0.018
 Zoe+0.018
 inspection+0.017
Neg logit contributions
attered-0.017
aming-0.015
ney-0.015
ifying-0.014
opened-0.014
Token.
Token ID13
Feature activation+2.59e-03

Pos logit contributions
attered+0.031
aming+0.028
ney+0.027
opened+0.027
amed+0.026
Neg logit contributions
 maybe-0.024
 Skip-0.024
 Zoe-0.024
rob-0.023
 but-0.023
Token Sara
Token ID24799
Feature activation-7.42e-03

Pos logit contributions
rob+0.002
 Skip+0.002
ock+0.002
 maybe+0.002
 villain+0.002
Neg logit contributions
attered-0.006
aming-0.006
ney-0.006
opened-0.006
ifying-0.006
Token wants
Token ID3382
Feature activation-1.28e-02

Pos logit contributions
attered+0.015
ifying+0.014
aming+0.014
ney+0.013
tr+0.013
Neg logit contributions
 maybe-0.010
 Skip-0.010
 Zoe-0.009
 but-0.009
 Boo-0.009
Token the
Token ID262
Feature activation-4.86e-03

Pos logit contributions
 Skip+0.006
 maybe+0.006
 Zoe+0.006
rob+0.006
 inspection+0.006
Neg logit contributions
attered-0.012
aming-0.011
ney-0.011
opened-0.010
ifying-0.010
Token price
Token ID2756
Feature activation+1.38e-02

Pos logit contributions
attered+0.008
aming+0.008
ney+0.007
amed+0.007
aut+0.007
Neg logit contributions
 maybe-0.008
 Skip-0.008
 Zoe-0.008
 but-0.008
 he-0.007
Token -
Token ID532
Feature activation-2.81e-03

Pos logit contributions
 Skip+0.019
 maybe+0.019
rob+0.018
 Zoe+0.018
 inspection+0.018
Neg logit contributions
attered-0.026
aming-0.023
ney-0.023
opened-0.023
ifying-0.022
Token it
Token ID340
Feature activation+7.23e-03

Pos logit contributions
attered+0.005
aming+0.005
ney+0.005
opened+0.005
ifying+0.005
Neg logit contributions
 Skip-0.004
 maybe-0.004
rob-0.004
 Zoe-0.003
 inspection-0.003
Token was
Token ID373
Feature activation+8.13e-04

Pos logit contributions
 maybe+0.012
 Skip+0.012
 Zoe+0.011
rob+0.011
 he+0.011
Neg logit contributions
attered-0.012
aming-0.011
ney-0.010
ifying-0.010
opened-0.010
Token $
Token ID720
Feature activation-4.60e-02

Pos logit contributions
 maybe+0.001
 he+0.001
 but+0.001
 Zoe+0.001
 Skip+0.001
Neg logit contributions
attered-0.002
ney-0.001
aut-0.001
aming-0.001
olk-0.001
Token5
Token ID20
Feature activation-1.94e-02

Pos logit contributions
attered+0.075
tr+0.075
ney+0.075
olk+0.070
 hyd+0.069
Neg logit contributions
ner-0.061
 Many-0.060
ock-0.060
 maybe-0.057
 but-0.056
Token.
Token ID13
Feature activation+9.39e-03

Pos logit contributions
attered+0.038
tr+0.034
ney+0.033
aming+0.033
amed+0.033
Neg logit contributions
 but-0.025
 maybe-0.025
ock-0.024
 he-0.024
 Skip-0.024
Token Sally
Token ID25737
Feature activation+4.06e-03

Pos logit contributions
rob+0.013
 Skip+0.013
 maybe+0.012
ock+0.012
 villain+0.012
Neg logit contributions
attered-0.018
aming-0.016
ney-0.016
opened-0.016
ifying-0.015
Token was
Token ID373
Feature activation+1.04e-02

Pos logit contributions
 Zoe+0.005
 maybe+0.005
 he+0.005
 Skip+0.005
 but+0.005
Neg logit contributions
attered-0.008
ifying-0.007
olk-0.007
aming-0.007
tr-0.007
Token so
Token ID523
Feature activation+4.37e-03

Pos logit contributions
 maybe+0.014
 Skip+0.014
 Zoe+0.014
 but+0.013
 inspection+0.013
Neg logit contributions
attered-0.020
ney-0.018
aming-0.017
tr-0.017
ifying-0.017
Token box
Token ID3091
Feature activation-1.27e-02

Pos logit contributions
attered+0.016
amed+0.014
aming+0.014
aut+0.013
ney+0.013
Neg logit contributions
 but-0.014
 Skip-0.013
 maybe-0.013
 he-0.013
 Zoe-0.013
Token full
Token ID1336
Feature activation-1.52e-02

Pos logit contributions
attered+0.028
ige+0.026
aut+0.026
olk+0.026
ifying+0.025
Neg logit contributions
 but-0.015
 he-0.015
 maybe-0.012
 though-0.012
ock-0.011
Token of
Token ID286
Feature activation-2.51e-03

Pos logit contributions
attered+0.044
amed+0.041
ra+0.041
ney+0.041
aming+0.041
Neg logit contributions
 but-0.007
 he-0.006
 maybe-0.005
ock-0.005
 Zoe-0.004
Token beautiful
Token ID4950
Feature activation+1.59e-03

Pos logit contributions
attered+0.004
aming+0.004
ney+0.004
opened+0.004
ifying+0.003
Neg logit contributions
 Skip-0.004
 maybe-0.004
rob-0.004
 Zoe-0.004
 inspection-0.004
Token or
Token ID393
Feature activation-6.15e-03

Pos logit contributions
 maybe+0.002
 inspection+0.002
 but+0.002
 Zoe+0.002
 he+0.002
Neg logit contributions
attered-0.003
aming-0.003
ney-0.003
amed-0.003
ifying-0.003
Tokennam
Token ID7402
Feature activation-4.56e-02

Pos logit contributions
attered+0.011
ney+0.011
aming+0.010
lette+0.010
tr+0.010
Neg logit contributions
 maybe-0.010
 he-0.009
 but-0.009
 Zoe-0.009
 Skip-0.008
Tokenents
Token ID658
Feature activation-4.99e-03

Pos logit contributions
tr+0.093
attered+0.092
 mailbox+0.088
 hyd+0.087
ifying+0.085
Neg logit contributions
ock-0.041
ner-0.038
 inspection-0.038
 Zoe-0.037
rob-0.036
Token.
Token ID13
Feature activation-3.63e-03

Pos logit contributions
attered+0.009
aming+0.008
opened+0.008
aut+0.008
ney+0.008
Neg logit contributions
 maybe-0.007
 he-0.007
 but-0.007
ock-0.007
 inspection-0.007
Token
Token ID220
Feature activation-1.63e-03

Pos logit contributions
attered+0.006
aming+0.006
ney+0.006
opened+0.005
 mailbox+0.005
Neg logit contributions
rob-0.005
 Skip-0.005
 maybe-0.005
ock-0.005
aro-0.005
Token
Token ID198
Feature activation-4.65e-03

Pos logit contributions
attered+0.003
ney+0.002
opened+0.002
+0.002
aming+0.002
Neg logit contributions
rob-0.003
 villain-0.003
 Skip-0.003
aro-0.002
ock-0.002
Token
Token ID198
Feature activation+5.58e-04

Pos logit contributions
+0.006
aming+0.006
um+0.006
Mom+0.005
attered+0.005
Neg logit contributions
aro-0.009
 villain-0.009
rob-0.009
ock-0.008
 maybe-0.008
Token the
Token ID262
Feature activation+6.05e-03

Pos logit contributions
rob+0.005
ock+0.005
 Skip+0.005
 villain+0.004
 inspection+0.004
Neg logit contributions
attered-0.007
 mailbox-0.007
aming-0.007
 cracked-0.007
-0.007
Token shiny
Token ID22441
Feature activation+1.32e-02

Pos logit contributions
 maybe+0.009
 Skip+0.008
 Zoe+0.008
 inspection+0.008
 but+0.008
Neg logit contributions
attered-0.012
aming-0.011
amed-0.010
ifying-0.010
opened-0.010
Token things
Token ID1243
Feature activation-6.90e-03

Pos logit contributions
 Skip+0.023
 maybe+0.022
rob+0.022
 Zoe+0.021
ock+0.021
Neg logit contributions
attered-0.020
ney-0.018
 mailbox-0.018
aming-0.017
tr-0.017
Token were
Token ID547
Feature activation+1.39e-03

Pos logit contributions
attered+0.015
aming+0.014
ney+0.013
opened+0.013
amed+0.013
Neg logit contributions
 maybe-0.008
 Skip-0.008
 Zoe-0.008
 inspection-0.007
 but-0.007
Token or
Token ID393
Feature activation-6.34e-03

Pos logit contributions
 maybe+0.002
 Skip+0.002
 Zoe+0.002
 but+0.002
 inspection+0.002
Neg logit contributions
attered-0.002
ney-0.002
aming-0.002
ifying-0.002
opened-0.002
Tokennam
Token ID7402
Feature activation-4.48e-02

Pos logit contributions
attered+0.011
aming+0.009
ney+0.009
opened+0.009
ifying+0.009
Neg logit contributions
 Skip-0.010
 maybe-0.010
rob-0.010
 Zoe-0.010
 inspection-0.010
Tokenents
Token ID658
Feature activation-5.01e-03

Pos logit contributions
tr+0.095
attered+0.091
 mailbox+0.091
 narrative+0.089
 hyd+0.089
Neg logit contributions
ock-0.036
ner-0.033
 Zoe-0.032
 but-0.032
 inspection-0.032
Token and
Token ID290
Feature activation+7.48e-03

Pos logit contributions
attered+0.009
aming+0.008
ney+0.008
opened+0.008
ifying+0.008
Neg logit contributions
 Skip-0.007
 maybe-0.007
rob-0.007
 Zoe-0.007
 inspection-0.007
Token they
Token ID484
Feature activation+4.80e-03

Pos logit contributions
 Skip+0.009
rob+0.009
ock+0.009
 maybe+0.009
 inspection+0.009
Neg logit contributions
attered-0.015
aming-0.013
ney-0.013
 cracked-0.013
 mailbox-0.013
Token were
Token ID547
Feature activation+6.59e-03

Pos logit contributions
 maybe+0.008
 Skip+0.008
 Zoe+0.008
 he+0.007
 inspection+0.007
Neg logit contributions
attered-0.008
aming-0.007
ney-0.007
opened-0.007
ifying-0.007
Token very
Token ID845
Feature activation+2.74e-03

Pos logit contributions
 maybe+0.011
 Skip+0.011
 Zoe+0.010
rob+0.010
 inspection+0.010
Neg logit contributions
attered-0.011
aming-0.010
ney-0.010
opened-0.009
ifying-0.009
TokenThe
Token ID464
Feature activation+1.96e-02

Pos logit contributions
 Skip+0.005
 maybe+0.005
 inspection+0.005
 Zoe+0.005
rob+0.004
Neg logit contributions
attered-0.004
aming-0.003
ney-0.003
opened-0.003
tr-0.003
Token le
Token ID443
Feature activation+7.67e-03

Pos logit contributions
 Skip+0.029
 maybe+0.029
rob+0.028
 Zoe+0.028
 inspection+0.027
Neg logit contributions
attered-0.035
aming-0.032
ney-0.032
ifying-0.031
opened-0.031
Tokenopard
Token ID15478
Feature activation+3.82e-03

Pos logit contributions
 Skip+0.016
 maybe+0.015
 inspection+0.015
 Zoe+0.015
ock+0.015
Neg logit contributions
attered-0.010
ney-0.008
tr-0.008
opened-0.008
ra-0.008
Token toy
Token ID13373
Feature activation+1.64e-02

Pos logit contributions
 maybe+0.006
 Skip+0.006
 Zoe+0.006
rob+0.006
 inspection+0.006
Neg logit contributions
attered-0.006
aming-0.006
ney-0.005
ifying-0.005
amed-0.005
Token costs
Token ID3484
Feature activation-1.44e-02

Pos logit contributions
 Skip+0.026
 maybe+0.026
 inspection+0.026
rob+0.026
 Zoe+0.025
Neg logit contributions
attered-0.026
ney-0.023
aming-0.023
 mailbox-0.023
tr-0.022
Token $
Token ID720
Feature activation-4.46e-02

Pos logit contributions
attered+0.024
aming+0.022
ney+0.022
opened+0.021
amed+0.021
Neg logit contributions
 Skip-0.023
rob-0.022
 maybe-0.022
 Zoe-0.021
 inspection-0.021
Token10
Token ID940
Feature activation+4.02e-03

Pos logit contributions
attered+0.070
ney+0.065
tr+0.065
aming+0.063
olk+0.061
Neg logit contributions
ock-0.067
 maybe-0.066
 Many-0.063
rob-0.063
 Skip-0.063
Token."
Token ID526
Feature activation+4.95e-03

Pos logit contributions
 Skip+0.007
rob+0.006
 maybe+0.006
 Zoe+0.006
 inspection+0.006
Neg logit contributions
attered-0.006
ney-0.006
ifying-0.006
ra-0.006
opened-0.006
Token
Token ID220
Feature activation+3.35e-03

Pos logit contributions
 Skip+0.007
 villain+0.007
rob+0.007
ock+0.006
 maybe+0.006
Neg logit contributions
attered-0.010
opened-0.009
ney-0.009
aming-0.008
tr-0.008
Token
Token ID198
Feature activation-7.55e-04

Pos logit contributions
rob+0.006
 villain+0.006
 Skip+0.005
ock+0.005
 villains+0.005
Neg logit contributions
attered-0.005
ney-0.005
opened-0.005
um-0.005
-0.005
Token
Token ID198
Feature activation+2.08e-03

Pos logit contributions
ney+0.001
opened+0.001
attered+0.001
+0.001
um+0.001
Neg logit contributions
 villain-0.002
 Skip-0.001
 villains-0.001
aro-0.001
rob-0.001
Token hill
Token ID12788
Feature activation-1.39e-02

Pos logit contributions
attered+0.001
aming+0.001
amed+0.001
lette+0.001
ige+0.001
Neg logit contributions
 Skip-0.001
 maybe-0.001
 but-0.001
 Zoe-0.001
 he-0.001
Token and
Token ID290
Feature activation+5.44e-03

Pos logit contributions
attered+0.028
aming+0.025
ney+0.024
ifying+0.024
opened+0.024
Neg logit contributions
 maybe-0.017
 Skip-0.017
 Zoe-0.017
 but-0.017
 inspection-0.016
Token put
Token ID1234
Feature activation+2.33e-03

Pos logit contributions
 Skip+0.010
rob+0.010
 maybe+0.010
 Zoe+0.010
ock+0.010
Neg logit contributions
attered-0.007
aming-0.006
ney-0.006
opened-0.006
amed-0.006
Token her
Token ID607
Feature activation-7.39e-03

Pos logit contributions
 Skip+0.003
 maybe+0.003
rob+0.003
 Zoe+0.003
 inspection+0.003
Neg logit contributions
attered-0.005
aming-0.004
ney-0.004
opened-0.004
amed-0.004
Token or
Token ID393
Feature activation-2.78e-03

Pos logit contributions
attered+0.012
aming+0.011
ney+0.010
opened+0.010
aut+0.010
Neg logit contributions
 maybe-0.012
 Zoe-0.012
 Skip-0.012
 inspection-0.012
 but-0.012
Tokennam
Token ID7402
Feature activation-4.39e-02

Pos logit contributions
attered+0.004
aming+0.004
ney+0.004
opened+0.004
ifying+0.004
Neg logit contributions
 maybe-0.005
 Skip-0.005
 Zoe-0.004
rob-0.004
 inspection-0.004
Tokenents
Token ID658
Feature activation-1.03e-03

Pos logit contributions
tr+0.092
attered+0.088
 mailbox+0.086
 hyd+0.085
 narrative+0.085
Neg logit contributions
ock-0.035
ner-0.034
 inspection-0.034
 Zoe-0.032
 Many-0.032
Token all
Token ID477
Feature activation+6.66e-03

Pos logit contributions
attered+0.002
aming+0.001
ney+0.001
aut+0.001
opened+0.001
Neg logit contributions
 maybe-0.002
 he-0.002
ock-0.002
 but-0.002
 Zoe-0.002
Token around
Token ID1088
Feature activation+1.94e-02

Pos logit contributions
 Zoe+0.007
 maybe+0.007
ock+0.007
 Skip+0.007
rob+0.007
Neg logit contributions
attered-0.014
aming-0.013
ney-0.013
ifying-0.013
tr-0.013
Token.
Token ID13
Feature activation-3.07e-03

Pos logit contributions
 Skip+0.028
 maybe+0.027
rob+0.027
 Zoe+0.026
 inspection+0.026
Neg logit contributions
attered-0.035
aming-0.032
ney-0.032
opened-0.031
ifying-0.030
Token She
Token ID1375
Feature activation-1.21e-03

Pos logit contributions
attered+0.006
aming+0.005
ney+0.005
opened+0.005
ifying+0.005
Neg logit contributions
 maybe-0.004
 Skip-0.004
 Zoe-0.004
rob-0.004
 inspection-0.004
TokenLet
Token ID5756
Feature activation+5.34e-03

Pos logit contributions
 maybe+0.003
 Skip+0.003
 Zoe+0.003
rob+0.003
 inspection+0.003
Neg logit contributions
attered-0.003
aming-0.003
ney-0.003
opened-0.003
ifying-0.003
Token's
Token ID338
Feature activation-1.67e-03

Pos logit contributions
 Skip+0.011
 maybe+0.011
 Zoe+0.011
ock+0.011
 but+0.011
Neg logit contributions
attered-0.006
ney-0.006
aming-0.005
ifying-0.005
opened-0.005
Token go
Token ID467
Feature activation+3.93e-03

Pos logit contributions
attered+0.003
ney+0.003
aming+0.003
aut+0.003
amed+0.003
Neg logit contributions
 maybe-0.002
 Skip-0.002
 Zoe-0.002
 but-0.002
 he-0.002
Token find
Token ID1064
Feature activation-4.15e-03

Pos logit contributions
 Skip+0.007
rob+0.007
 maybe+0.007
 inspection+0.007
 villain+0.007
Neg logit contributions
attered-0.005
aming-0.005
opened-0.005
ney-0.005
-0.005
Token a
Token ID257
Feature activation-3.78e-03

Pos logit contributions
attered+0.006
aming+0.006
 mailbox+0.006
ige+0.006
ifying+0.005
Neg logit contributions
 Skip-0.007
 inspection-0.007
rob-0.007
 villain-0.007
 villains-0.006
Token y
Token ID331
Feature activation-4.36e-02

Pos logit contributions
attered+0.007
aming+0.006
ney+0.006
opened+0.006
tr+0.006
Neg logit contributions
 Skip-0.006
 maybe-0.006
rob-0.006
 Zoe-0.005
 inspection-0.005
Tokenummy
Token ID13513
Feature activation-1.23e-02

Pos logit contributions
amed+0.060
attered+0.059
tr+0.057
aming+0.056
ney+0.056
Neg logit contributions
ock-0.077
 Zoe-0.071
 inspection-0.071
 but-0.070
 Skip-0.069
Token surprise
Token ID5975
Feature activation-2.44e-03

Pos logit contributions
attered+0.019
aming+0.017
 mailbox+0.016
tr+0.016
opened+0.016
Neg logit contributions
 Skip-0.021
 maybe-0.020
 inspection-0.020
 Zoe-0.020
aro-0.020
Token!"
Token ID2474
Feature activation-6.59e-04

Pos logit contributions
attered+0.004
aming+0.004
ney+0.004
opened+0.004
tr+0.004
Neg logit contributions
 Skip-0.004
 maybe-0.004
rob-0.003
 Zoe-0.003
 inspection-0.003
Token So
Token ID1406
Feature activation-5.57e-03

Pos logit contributions
attered+0.001
aming+0.001
ifying+0.001
opened+0.001
 mailbox+0.001
Neg logit contributions
rob-0.001
 Skip-0.001
 maybe-0.001
aro-0.001
 inspection-0.001
Token they
Token ID484
Feature activation-1.04e-02

Pos logit contributions
attered+0.012
aming+0.011
ney+0.010
ifying+0.010
opened+0.010
Neg logit contributions
 Skip-0.006
rob-0.006
 maybe-0.006
ock-0.006
 inspection-0.006
Token her
Token ID607
Feature activation+1.10e-03

Pos logit contributions
 Skip+0.010
 maybe+0.010
rob+0.010
 Zoe+0.010
 inspection+0.009
Neg logit contributions
attered-0.016
aming-0.015
ney-0.014
opened-0.014
amed-0.014
Token a
Token ID257
Feature activation-1.28e-05

Pos logit contributions
 maybe+0.001
 Skip+0.001
 Zoe+0.001
 but+0.001
 he+0.001
Neg logit contributions
attered-0.002
ney-0.002
aming-0.002
opened-0.002
tr-0.002
Token box
Token ID3091
Feature activation-1.34e-02

Pos logit contributions
attered+0.000
aming+0.000
ney+0.000
opened+0.000
amed+0.000
Neg logit contributions
 Skip-0.000
 maybe-0.000
 Zoe-0.000
rob-0.000
 inspection-0.000
Token of
Token ID286
Feature activation-2.08e-03

Pos logit contributions
attered+0.033
aut+0.032
olk+0.032
ra+0.032
inct+0.032
Neg logit contributions
 but-0.010
 he-0.009
ock-0.007
 though-0.007
 maybe-0.006
Token ch
Token ID442
Feature activation-1.17e-02

Pos logit contributions
attered+0.003
aming+0.003
ney+0.003
 mailbox+0.003
opened+0.003
Neg logit contributions
 Skip-0.003
 maybe-0.003
rob-0.003
 inspection-0.003
 Zoe-0.003
Tokenocol
Token ID4668
Feature activation-4.36e-02

Pos logit contributions
attered+0.020
tr+0.018
ney+0.018
aming+0.018
opened+0.018
Neg logit contributions
 Skip-0.017
 maybe-0.017
ock-0.017
 inspection-0.017
rob-0.017
Tokenates
Token ID689
Feature activation-1.73e-02

Pos logit contributions
attered+0.075
 mailbox+0.065
ifying+0.064
opened+0.064
+0.064
Neg logit contributions
rob-0.070
ock-0.064
aro-0.064
 Skip-0.060
 Zig-0.059
Token.
Token ID13
Feature activation+6.89e-03

Pos logit contributions
attered+0.035
ney+0.033
opened+0.032
aming+0.032
amed+0.031
Neg logit contributions
 he-0.022
 but-0.021
 maybe-0.021
 Zoe-0.019
ock-0.019
Token Lily
Token ID20037
Feature activation-4.93e-03

Pos logit contributions
 maybe+0.005
 Zoe+0.005
 Skip+0.005
 but+0.004
 inspection+0.004
Neg logit contributions
attered-0.017
aming-0.017
ney-0.017
amed-0.016
ifying-0.016
Token opened
Token ID4721
Feature activation-1.51e-02

Pos logit contributions
attered+0.009
ifying+0.009
aming+0.009
ney+0.008
tr+0.008
Neg logit contributions
 maybe-0.007
 Skip-0.006
 he-0.006
 Zoe-0.006
 but-0.006
Token the
Token ID262
Feature activation-1.28e-02

Pos logit contributions
attered+0.035
ney+0.033
aming+0.032
opened+0.031
tr+0.031
Neg logit contributions
 Skip-0.014
 maybe-0.014
 Zoe-0.014
 inspection-0.013
rob-0.013
TokenOnce
Token ID7454
Feature activation-5.22e-03

Pos logit contributions
 maybe+0.001
 Skip+0.001
 inspection+0.001
rob+0.001
ock+0.001
Neg logit contributions
attered-0.001
ney-0.001
aming-0.001
opened-0.001
ifying-0.001
Token upon
Token ID2402
Feature activation-5.42e-03

Pos logit contributions
attered+0.006
ney+0.005
um+0.005
+0.005
opened+0.005
Neg logit contributions
rob-0.011
 Skip-0.010
 villain-0.010
 villains-0.010
ock-0.010
Token a
Token ID257
Feature activation-1.07e-03

Pos logit contributions
aming+0.005
+0.005
attered+0.005
ney+0.004
tr+0.004
Neg logit contributions
aro-0.012
rob-0.012
 inspection-0.011
 Skip-0.011
 Zoe-0.011
Token time
Token ID640
Feature activation+3.58e-03

Pos logit contributions
attered+0.001
aming+0.001
ney+0.001
opened+0.001
tr+0.001
Neg logit contributions
 maybe-0.002
rob-0.002
 Skip-0.002
aro-0.002
 Zoe-0.002
Token,
Token ID11
Feature activation-5.05e-03

Pos logit contributions
rob+0.008
ock+0.008
 villain+0.007
 villains+0.007
 Skip+0.007
Neg logit contributions
-0.004
um-0.003
attered-0.003
au-0.003
ney-0.003
Token Mr
Token ID1770
Feature activation-4.27e-02

Pos logit contributions
attered+0.008
aming+0.007
ney+0.007
ifying+0.007
opened+0.007
Neg logit contributions
rob-0.009
 Skip-0.008
ock-0.008
 villain-0.008
 maybe-0.008
Token.
Token ID13
Feature activation-3.75e-04

Pos logit contributions
attered+0.078
aming+0.069
amed+0.069
ifying+0.068
opened+0.067
Neg logit contributions
 Skip-0.062
 maybe-0.061
rob-0.058
 Zoe-0.058
ock-0.057
Token Frog
Token ID28328
Feature activation-1.11e-02

Pos logit contributions
attered+0.001
ney+0.001
aming+0.001
opened+0.001
amed+0.001
Neg logit contributions
 Skip-0.000
 Zoe-0.000
rob-0.000
 maybe-0.000
 Boo-0.000
Token and
Token ID290
Feature activation+6.19e-03

Pos logit contributions
attered+0.019
aming+0.017
ifying+0.017
ney+0.017
opened+0.016
Neg logit contributions
 Skip-0.017
 maybe-0.017
 Zoe-0.016
rob-0.016
 he-0.016
Token a
Token ID257
Feature activation-2.22e-03

Pos logit contributions
ock+0.007
 inspection+0.007
aro+0.006
rob+0.006
 maybe+0.006
Neg logit contributions
attered-0.012
 mailbox-0.012
-0.011
ige-0.011
tr-0.011
Token naughty
Token ID45128
Feature activation+1.29e-03

Pos logit contributions
attered+0.004
ney+0.003
aming+0.003
 mailbox+0.003
opened+0.003
Neg logit contributions
rob-0.003
 Skip-0.003
 maybe-0.003
 inspection-0.003
ock-0.003
Token could
Token ID714
Feature activation-8.87e-03

Pos logit contributions
 maybe+0.019
 Skip+0.018
 he+0.017
 Zoe+0.017
 inspection+0.017
Neg logit contributions
attered-0.015
aming-0.013
ney-0.013
ifying-0.013
tr-0.013
Token not
Token ID407
Feature activation-8.14e-03

Pos logit contributions
attered+0.013
aming+0.012
ney+0.012
 cracked+0.012
opened+0.012
Neg logit contributions
rob-0.016
 Skip-0.015
ock-0.015
 Zoe-0.015
aro-0.014
Token find
Token ID1064
Feature activation-3.74e-03

Pos logit contributions
attered+0.014
aming+0.013
ney+0.013
opened+0.012
amed+0.012
Neg logit contributions
 maybe-0.012
 Skip-0.012
 Zoe-0.012
 inspection-0.012
rob-0.011
Token his
Token ID465
Feature activation-2.36e-02

Pos logit contributions
attered+0.007
 cracked+0.007
ifying+0.007
 mailbox+0.007
aming+0.007
Neg logit contributions
rob-0.005
 Skip-0.005
 Zoe-0.004
aro-0.004
 inspection-0.004
Token famous
Token ID5863
Feature activation-1.36e-02

Pos logit contributions
attered+0.041
amed+0.038
aming+0.038
tr+0.036
ney+0.035
Neg logit contributions
 maybe-0.038
 he-0.037
 but-0.036
 Skip-0.033
 Zoe-0.033
Token las
Token ID39990
Feature activation-4.27e-02

Pos logit contributions
attered+0.023
aming+0.020
amed+0.020
ney+0.020
ifying+0.019
Neg logit contributions
 maybe-0.022
 he-0.021
 but-0.021
 Skip-0.021
 inspection-0.020
Tokenagna
Token ID48669
Feature activation-1.06e-02

Pos logit contributions
attered+0.060
aming+0.053
amed+0.052
ney+0.050
ifying+0.049
Neg logit contributions
rob-0.077
 Skip-0.075
 maybe-0.075
 villain-0.073
 but-0.070
Token.
Token ID13
Feature activation+3.02e-04

Pos logit contributions
attered+0.018
aming+0.017
ney+0.016
opened+0.016
amed+0.016
Neg logit contributions
 maybe-0.016
 Skip-0.016
 Zoe-0.015
rob-0.015
 inspection-0.015
Token
Token ID198
Feature activation-1.55e-03

Pos logit contributions
rob+0.000
 Skip+0.000
ock+0.000
 inspection+0.000
 maybe+0.000
Neg logit contributions
attered-0.001
ney-0.001
opened-0.001
aming-0.001
tr-0.001
Token
Token ID198
Feature activation+2.88e-03

Pos logit contributions
ney+0.002
+0.002
aming+0.002
opened+0.002
attered+0.002
Neg logit contributions
rob-0.003
 villain-0.003
 Skip-0.002
ock-0.002
 inspection-0.002
TokenThe
Token ID464
Feature activation-2.45e-03

Pos logit contributions
 Skip+0.004
 maybe+0.004
 Zoe+0.004
ock+0.004
rob+0.004
Neg logit contributions
attered-0.005
ney-0.005
aming-0.005
 mailbox-0.005
ifying-0.005
Token it
Token ID340
Feature activation-2.60e-03

Pos logit contributions
attered+0.002
 cracked+0.002
ney+0.002
+0.002
 mailbox+0.002
Neg logit contributions
rob-0.001
ock-0.001
 villain-0.001
 Skip-0.001
 inspection-0.001
Token made
Token ID925
Feature activation+3.71e-03

Pos logit contributions
attered+0.004
aming+0.004
ney+0.003
amed+0.003
opened+0.003
Neg logit contributions
 maybe-0.005
 Skip-0.004
 he-0.004
 Zoe-0.004
 Boo-0.004
Token Bob
Token ID5811
Feature activation-6.06e-03

Pos logit contributions
rob+0.006
 villain+0.005
 villains+0.004
ock+0.004
ner+0.004
Neg logit contributions
attered-0.006
 cracked-0.006
olk-0.006
um-0.006
aming-0.006
Token's
Token ID338
Feature activation-6.35e-03

Pos logit contributions
attered+0.011
aming+0.010
ney+0.010
opened+0.010
tr+0.010
Neg logit contributions
 maybe-0.008
 Skip-0.008
 inspection-0.008
 Zoe-0.008
 but-0.008
Token face
Token ID1986
Feature activation-1.50e-02

Pos logit contributions
aming+0.011
attered+0.011
amed+0.010
ifying+0.010
opened+0.010
Neg logit contributions
 but-0.009
 maybe-0.009
 he-0.009
 inspection-0.009
 Boo-0.009
Token scr
Token ID6040
Feature activation-4.26e-02

Pos logit contributions
attered+0.026
ney+0.026
amed+0.026
aming+0.025
 mailbox+0.025
Neg logit contributions
 but-0.021
 he-0.021
 maybe-0.020
 though-0.019
 inspection-0.018
Tokenunch
Token ID3316
Feature activation-6.77e-03

Pos logit contributions
tr+0.087
 mailbox+0.086
attered+0.083
 printed+0.083
ifying+0.081
Neg logit contributions
ock-0.039
 Many-0.038
 maybe-0.037
 Lia-0.036
aro-0.036
Token up
Token ID510
Feature activation-9.19e-03

Pos logit contributions
attered+0.011
ney+0.011
opened+0.011
tr+0.011
 mailbox+0.011
Neg logit contributions
 he-0.010
ock-0.010
 but-0.010
 maybe-0.010
 Skip-0.009
Token.
Token ID13
Feature activation+3.95e-03

Pos logit contributions
attered+0.020
ney+0.019
tr+0.019
amed+0.018
lette+0.018
Neg logit contributions
 but-0.012
 he-0.011
 maybe-0.011
 inspection-0.010
 them-0.010
Token He
Token ID679
Feature activation+2.04e-03

Pos logit contributions
ock+0.004
rob+0.004
 villain+0.004
 maybe+0.004
 Skip+0.004
Neg logit contributions
attered-0.009
ney-0.008
opened-0.008
aut-0.008
 mailbox-0.008
Token didn
Token ID1422
Feature activation+1.05e-02

Pos logit contributions
 maybe+0.003
 Skip+0.003
 Zoe+0.003
 he+0.003
 Boo+0.003
Neg logit contributions
attered-0.004
aming-0.003
ifying-0.003
ney-0.003
opened-0.003
Token wallet
Token ID13008
Feature activation-3.13e-03

Pos logit contributions
attered+0.005
aming+0.005
ifying+0.005
amed+0.005
ney+0.005
Neg logit contributions
 maybe-0.005
 but-0.004
 he-0.004
 villain-0.004
 Skip-0.004
Token had
Token ID550
Feature activation-1.16e-02

Pos logit contributions
attered+0.006
ifying+0.006
aming+0.005
ige+0.005
lette+0.005
Neg logit contributions
 he-0.005
 maybe-0.004
 but-0.004
 villain-0.004
 Zoe-0.004
Token coins
Token ID10796
Feature activation-1.57e-02

Pos logit contributions
attered+0.017
ney+0.016
aming+0.016
aut+0.015
opened+0.015
Neg logit contributions
 he-0.020
 maybe-0.020
 but-0.019
 Zoe-0.019
ock-0.018
Token,
Token ID11
Feature activation+8.88e-03

Pos logit contributions
attered+0.031
ney+0.028
aming+0.028
tr+0.028
aut+0.028
Neg logit contributions
 but-0.021
 he-0.021
 maybe-0.019
 inspection-0.018
 though-0.018
Token gum
Token ID27142
Feature activation-8.70e-03

Pos logit contributions
 Skip+0.016
rob+0.016
 maybe+0.016
 inspection+0.015
 Zoe+0.015
Neg logit contributions
attered-0.013
aming-0.011
ney-0.011
 cracked-0.011
ifying-0.011
Token wra
Token ID7917
Feature activation-4.26e-02

Pos logit contributions
attered+0.016
aming+0.016
ney+0.015
opened+0.015
ifying+0.015
Neg logit contributions
 he-0.011
 maybe-0.011
 but-0.011
 Skip-0.011
rob-0.010
Tokenppers
Token ID11799
Feature activation-9.31e-03

Pos logit contributions
 narrative+0.070
 rightful+0.070
ney+0.066
attered+0.066
 mailbox+0.063
Neg logit contributions
 but-0.056
 inspection-0.053
ock-0.052
ner-0.050
 them-0.050
Token,
Token ID11
Feature activation+1.34e-03

Pos logit contributions
attered+0.019
aming+0.018
ney+0.018
opened+0.017
tr+0.017
Neg logit contributions
 maybe-0.011
 he-0.011
 but-0.011
 Skip-0.010
 inspection-0.010
Token buttons
Token ID12163
Feature activation+4.34e-03

Pos logit contributions
 Skip+0.002
 maybe+0.002
rob+0.002
 Zoe+0.002
 inspection+0.002
Neg logit contributions
attered-0.002
aming-0.002
ney-0.002
 cracked-0.002
ifying-0.002
Token,
Token ID11
Feature activation-3.05e-03

Pos logit contributions
 Skip+0.006
 maybe+0.006
rob+0.005
 Zoe+0.005
 inspection+0.005
Neg logit contributions
attered-0.008
aming-0.008
ney-0.008
opened-0.007
ifying-0.007
Token and
Token ID290
Feature activation+3.99e-04

Pos logit contributions
attered+0.005
aming+0.005
ney+0.005
 cracked+0.005
ifying+0.005
Neg logit contributions
 Skip-0.005
rob-0.004
 maybe-0.004
 Zoe-0.004
 inspection-0.004
Token a
Token ID257
Feature activation+3.87e-03

Pos logit contributions
attered+0.006
ney+0.005
aming+0.005
ifying+0.005
opened+0.005
Neg logit contributions
 maybe-0.005
 Zoe-0.005
 inspection-0.005
 but-0.005
 Skip-0.005
Token box
Token ID3091
Feature activation+1.74e-03

Pos logit contributions
 maybe+0.005
 but+0.005
 Zoe+0.005
 Skip+0.005
 villain+0.005
Neg logit contributions
attered-0.008
aming-0.007
aut-0.007
amed-0.007
ney-0.007
Token of
Token ID286
Feature activation+1.44e-03

Pos logit contributions
 but+0.002
 he+0.002
 maybe+0.002
 Zoe+0.002
 though+0.002
Neg logit contributions
attered-0.004
aut-0.003
ige-0.003
aming-0.003
ney-0.003
Token shiny
Token ID22441
Feature activation-5.14e-03

Pos logit contributions
 Skip+0.003
 maybe+0.002
rob+0.002
 Zoe+0.002
 inspection+0.002
Neg logit contributions
attered-0.002
aming-0.002
ney-0.002
opened-0.002
ifying-0.002
Token or
Token ID393
Feature activation-8.24e-03

Pos logit contributions
attered+0.009
aming+0.008
ney+0.008
opened+0.008
amed+0.008
Neg logit contributions
 maybe-0.008
 Skip-0.008
rob-0.008
 Zoe-0.008
 inspection-0.008
Tokennam
Token ID7402
Feature activation-4.24e-02

Pos logit contributions
attered+0.014
ney+0.013
aming+0.013
opened+0.012
aut+0.012
Neg logit contributions
 maybe-0.014
 Zoe-0.013
 Skip-0.012
 he-0.012
 but-0.012
Tokenents
Token ID658
Feature activation-7.57e-03

Pos logit contributions
attered+0.086
tr+0.086
 mailbox+0.083
 hyd+0.082
 narrative+0.080
Neg logit contributions
ock-0.037
ner-0.035
 inspection-0.035
 Zoe-0.033
 Many-0.033
Token that
Token ID326
Feature activation-6.80e-03

Pos logit contributions
attered+0.013
aming+0.012
ney+0.011
opened+0.011
aut+0.011
Neg logit contributions
 maybe-0.011
ock-0.011
 Skip-0.011
 but-0.011
 he-0.011
Token spark
Token ID9009
Feature activation+1.02e-02

Pos logit contributions
attered+0.015
aming+0.013
ney+0.013
opened+0.013
ifying+0.013
Neg logit contributions
 maybe-0.008
 Skip-0.007
 Zoe-0.007
rob-0.007
 inspection-0.007
Tokenled
Token ID992
Feature activation-5.38e-03

Pos logit contributions
 Skip+0.015
 maybe+0.014
rob+0.014
 Zoe+0.014
 Zig+0.013
Neg logit contributions
attered-0.019
aming-0.017
ney-0.017
ifying-0.016
opened-0.016
Token in
Token ID287
Feature activation-9.81e-03

Pos logit contributions
attered+0.010
ney+0.009
opened+0.009
ige+0.009
aming+0.008
Neg logit contributions
 but-0.008
 maybe-0.008
 he-0.008
 Zoe-0.007
 Skip-0.007
Token a
Token ID257
Feature activation+3.54e-03

Pos logit contributions
 Skip+0.005
rob+0.005
aro+0.005
 villain+0.005
 inspection+0.005
Neg logit contributions
attered-0.003
aming-0.003
 mailbox-0.003
amed-0.003
 cracked-0.003
Token tree
Token ID5509
Feature activation-2.36e-03

Pos logit contributions
 Skip+0.005
 maybe+0.005
 Zoe+0.005
rob+0.005
 villain+0.005
Neg logit contributions
attered-0.006
aming-0.006
ney-0.005
amed-0.005
opened-0.005
Token with
Token ID351
Feature activation-6.19e-03

Pos logit contributions
attered+0.005
ige+0.005
ifying+0.005
lette+0.005
aming+0.005
Neg logit contributions
 but-0.003
 he-0.002
 maybe-0.002
 Zoe-0.002
 though-0.002
Token shiny
Token ID22441
Feature activation+3.17e-03

Pos logit contributions
attered+0.012
ney+0.011
aming+0.011
opened+0.010
ifying+0.010
Neg logit contributions
 maybe-0.008
 Zoe-0.008
 Skip-0.008
 he-0.008
 inspection-0.008
Token or
Token ID393
Feature activation-2.33e-03

Pos logit contributions
 maybe+0.005
 inspection+0.005
 Zoe+0.005
 Skip+0.005
rob+0.005
Neg logit contributions
attered-0.005
aming-0.005
ney-0.005
opened-0.005
amed-0.005
Tokennam
Token ID7402
Feature activation-4.22e-02

Pos logit contributions
attered+0.004
ney+0.004
aming+0.004
opened+0.003
amed+0.003
Neg logit contributions
 maybe-0.004
 Skip-0.004
 Zoe-0.004
 he-0.003
 but-0.003
Tokenents
Token ID658
Feature activation+7.47e-04

Pos logit contributions
attered+0.085
tr+0.085
 hyd+0.082
 mailbox+0.080
aming+0.079
Neg logit contributions
ock-0.037
 inspection-0.037
 Zoe-0.036
 Many-0.035
ner-0.034
Token hanging
Token ID10938
Feature activation-1.39e-02

Pos logit contributions
 maybe+0.001
 he+0.001
 inspection+0.001
 but+0.001
 Zoe+0.001
Neg logit contributions
attered-0.001
aming-0.001
aut-0.001
ney-0.001
opened-0.001
Token from
Token ID422
Feature activation-1.08e-02

Pos logit contributions
attered+0.025
aming+0.023
ney+0.023
aut+0.022
opened+0.022
Neg logit contributions
 maybe-0.020
 Zoe-0.020
 but-0.019
 inspection-0.019
 Zig-0.019
Token its
Token ID663
Feature activation+1.71e-03

Pos logit contributions
attered+0.021
aming+0.019
ney+0.019
opened+0.018
ifying+0.018
Neg logit contributions
 Skip-0.014
 maybe-0.014
rob-0.014
 Zoe-0.013
 inspection-0.013
Token branches
Token ID13737
Feature activation-1.60e-02

Pos logit contributions
 Zoe+0.003
 but+0.003
 Zig+0.003
 maybe+0.002
 inspection+0.002
Neg logit contributions
attered-0.003
aming-0.003
amed-0.003
opened-0.002
amy-0.002
Token the
Token ID262
Feature activation-1.41e-03

Pos logit contributions
rob+0.019
 villain+0.017
 Skip+0.017
 maybe+0.017
 Nice+0.016
Neg logit contributions
aming-0.017
attered-0.016
 cracked-0.016
 painted-0.016
amed-0.015
Token ants
Token ID27842
Feature activation-3.69e-03

Pos logit contributions
attered+0.003
aming+0.003
ney+0.003
opened+0.003
ifying+0.003
Neg logit contributions
 Skip-0.002
 maybe-0.002
rob-0.002
 Zoe-0.002
ock-0.002
Token spotted
Token ID13489
Feature activation-7.58e-04

Pos logit contributions
attered+0.006
ney+0.005
aming+0.005
tr+0.005
ifying+0.005
Neg logit contributions
 maybe-0.006
 Skip-0.006
 inspection-0.006
 he-0.006
 Zoe-0.006
Token a
Token ID257
Feature activation-1.57e-03

Pos logit contributions
attered+0.001
aming+0.001
ney+0.001
opened+0.001
 mailbox+0.001
Neg logit contributions
 Skip-0.001
rob-0.001
 maybe-0.001
 Zoe-0.001
ock-0.001
Token delicious
Token ID12625
Feature activation-2.03e-02

Pos logit contributions
attered+0.003
aming+0.002
amed+0.002
ney+0.002
opened+0.002
Neg logit contributions
 Skip-0.002
 maybe-0.002
 villain-0.002
 but-0.002
 Zoe-0.002
Token mor
Token ID2146
Feature activation-4.21e-02

Pos logit contributions
attered+0.036
aming+0.032
amed+0.031
ney+0.031
ifying+0.031
Neg logit contributions
 Skip-0.031
 maybe-0.030
 Zoe-0.029
rob-0.029
 but-0.029
Tokensel
Token ID741
Feature activation-2.98e-03

Pos logit contributions
attered+0.072
tr+0.068
ney+0.065
 mailbox+0.064
 rightful+0.062
Neg logit contributions
ock-0.062
 Skip-0.061
ner-0.059
rob-0.058
 inspection-0.056
Token of
Token ID286
Feature activation-1.79e-03

Pos logit contributions
attered+0.006
aming+0.006
amed+0.006
opened+0.005
ney+0.005
Neg logit contributions
 but-0.004
 he-0.004
 maybe-0.003
ock-0.003
 inspection-0.003
Token food
Token ID2057
Feature activation-3.73e-03

Pos logit contributions
attered+0.003
aming+0.003
ney+0.002
 mailbox+0.002
opened+0.002
Neg logit contributions
 Skip-0.003
 maybe-0.003
rob-0.003
 Zoe-0.003
 inspection-0.003
Token.
Token ID13
Feature activation+2.94e-03

Pos logit contributions
attered+0.007
amed+0.007
aming+0.007
ney+0.007
opened+0.006
Neg logit contributions
 he-0.005
 but-0.005
 maybe-0.005
 Skip-0.005
 Zoe-0.005
Token E
Token ID412
Feature activation-1.62e-03

Pos logit contributions
 Skip+0.004
 maybe+0.004
 Zoe+0.004
rob+0.004
 inspection+0.004
Neg logit contributions
attered-0.005
aming-0.005
ney-0.005
opened-0.005
ifying-0.005
Token a
Token ID257
Feature activation+1.37e-03

Pos logit contributions
attered+0.009
aming+0.008
ney+0.008
amed+0.008
opened+0.008
Neg logit contributions
 maybe-0.006
 Skip-0.006
 but-0.006
 Zoe-0.006
 he-0.006
Token big
Token ID1263
Feature activation-3.61e-03

Pos logit contributions
 maybe+0.002
 but+0.002
 Zoe+0.002
 he+0.002
 Skip+0.002
Neg logit contributions
attered-0.003
aming-0.002
amed-0.002
aut-0.002
ney-0.002
Token box
Token ID3091
Feature activation-6.15e-03

Pos logit contributions
attered+0.006
aming+0.005
aut+0.005
amed+0.005
ney+0.005
Neg logit contributions
 but-0.006
 maybe-0.006
 Skip-0.006
 Zoe-0.006
 he-0.006
Token of
Token ID286
Feature activation-1.24e-03

Pos logit contributions
attered+0.012
aut+0.011
ige+0.011
amed+0.011
aming+0.011
Neg logit contributions
 but-0.008
 he-0.008
 Zoe-0.007
 maybe-0.007
 inspection-0.007
Token ch
Token ID442
Feature activation-8.88e-03

Pos logit contributions
attered+0.002
aming+0.002
ney+0.002
opened+0.002
amed+0.002
Neg logit contributions
 maybe-0.002
 Skip-0.002
 Zoe-0.002
rob-0.002
 inspection-0.002
Tokenocol
Token ID4668
Feature activation-4.21e-02

Pos logit contributions
attered+0.021
tr+0.020
ney+0.019
aming+0.019
opened+0.019
Neg logit contributions
ock-0.007
 inspection-0.007
 maybe-0.007
aro-0.007
 Lia-0.007
Tokenates
Token ID689
Feature activation-1.29e-02

Pos logit contributions
attered+0.080
+0.068
opened+0.067
 mailbox+0.067
 rightful+0.067
Neg logit contributions
rob-0.061
ock-0.055
aro-0.054
 inspection-0.051
 Zig-0.050
Token!
Token ID0
Feature activation-2.32e-03

Pos logit contributions
attered+0.025
ney+0.023
opened+0.022
aming+0.022
amed+0.022
Neg logit contributions
 he-0.017
 but-0.017
 maybe-0.017
 Zoe-0.016
 inspection-0.016
Token
Token ID220
Feature activation+1.46e-03

Pos logit contributions
attered+0.004
aming+0.004
ney+0.004
opened+0.004
ifying+0.004
Neg logit contributions
 Skip-0.003
 maybe-0.003
 Zoe-0.003
 inspection-0.003
rob-0.003
Token
Token ID198
Feature activation-4.50e-03

Pos logit contributions
rob+0.002
 villain+0.002
 Skip+0.002
ock+0.002
aro+0.002
Neg logit contributions
attered-0.002
ney-0.002
opened-0.002
aming-0.002
-0.002
Token
Token ID198
Feature activation-5.66e-04

Pos logit contributions
+0.005
ney+0.005
opened+0.004
um+0.004
Mom+0.004
Neg logit contributions
 villain-0.010
rob-0.010
aro-0.009
ock-0.009
 villains-0.009
Token a
Token ID257
Feature activation+1.44e-03

Pos logit contributions
attered+0.011
ney+0.010
ifying+0.010
aming+0.010
tr+0.010
Neg logit contributions
 Skip-0.011
 maybe-0.011
 Zoe-0.010
rob-0.010
 he-0.010
Token big
Token ID1263
Feature activation-1.36e-03

Pos logit contributions
 maybe+0.002
 Skip+0.002
 Zoe+0.002
rob+0.002
 inspection+0.002
Neg logit contributions
attered-0.002
aming-0.002
ney-0.002
amed-0.002
opened-0.002
Token box
Token ID3091
Feature activation-9.77e-03

Pos logit contributions
attered+0.002
aming+0.002
amed+0.002
ifying+0.002
ney+0.002
Neg logit contributions
 maybe-0.002
 Skip-0.002
rob-0.002
 but-0.002
 Zoe-0.002
Token of
Token ID286
Feature activation-9.41e-05

Pos logit contributions
attered+0.023
aming+0.021
olk+0.021
ige+0.021
ifying+0.021
Neg logit contributions
 he-0.009
 but-0.009
ock-0.008
 maybe-0.008
 though-0.007
Token ch
Token ID442
Feature activation-6.43e-03

Pos logit contributions
attered+0.000
 mailbox+0.000
ney+0.000
aming+0.000
opened+0.000
Neg logit contributions
 Skip-0.000
 inspection-0.000
rob-0.000
 maybe-0.000
 Zoe-0.000
Tokenocol
Token ID4668
Feature activation-4.21e-02

Pos logit contributions
attered+0.013
tr+0.012
aming+0.012
ney+0.012
opened+0.012
Neg logit contributions
ock-0.007
 Skip-0.007
 maybe-0.007
 inspection-0.007
rob-0.007
Tokenates
Token ID689
Feature activation-1.40e-02

Pos logit contributions
attered+0.074
opened+0.063
+0.063
ifying+0.063
 mailbox+0.063
Neg logit contributions
rob-0.066
ock-0.061
aro-0.060
 Skip-0.056
 Zig-0.056
Token to
Token ID284
Feature activation+5.60e-03

Pos logit contributions
attered+0.027
aming+0.025
ney+0.025
opened+0.024
tr+0.024
Neg logit contributions
 he-0.018
 maybe-0.018
 but-0.018
 Skip-0.018
 Zoe-0.017
Token share
Token ID2648
Feature activation-1.64e-03

Pos logit contributions
rob+0.010
 inspection+0.010
 Skip+0.010
aro+0.010
 villain+0.010
Neg logit contributions
 mailbox-0.007
opened-0.007
 painted-0.007
 cracked-0.007
 printed-0.007
Token with
Token ID351
Feature activation-3.79e-03

Pos logit contributions
attered+0.002
 mailbox+0.002
ney+0.002
opened+0.002
ra+0.002
Neg logit contributions
rob-0.003
 Skip-0.003
 villain-0.003
 inspection-0.003
 maybe-0.003
Token the
Token ID262
Feature activation+9.11e-03

Pos logit contributions
attered+0.006
um+0.006
ra+0.006
 mailbox+0.006
 cracked+0.005
Neg logit contributions
 inspection-0.005
rob-0.005
 Skip-0.005
ner-0.005
ock-0.005
Token big
Token ID1263
Feature activation+8.44e-03

Pos logit contributions
 Skip+0.018
 Zoe+0.018
 villain+0.018
 maybe+0.018
 he+0.017
Neg logit contributions
attered-0.021
aming-0.019
amed-0.019
lette-0.018
aut-0.018
Token tree
Token ID5509
Feature activation-5.95e-04

Pos logit contributions
 Skip+0.012
 maybe+0.012
rob+0.012
 Zoe+0.012
 inspection+0.011
Neg logit contributions
attered-0.015
aming-0.014
ney-0.014
amed-0.013
opened-0.013
Token with
Token ID351
Feature activation-1.00e-02

Pos logit contributions
attered+0.001
ige+0.001
amed+0.001
aming+0.001
lette+0.001
Neg logit contributions
 he-0.001
 but-0.001
 maybe-0.001
 Zoe-0.001
 Zip-0.000
Token many
Token ID867
Feature activation+2.11e-02

Pos logit contributions
attered+0.018
aming+0.016
ney+0.016
opened+0.016
ifying+0.015
Neg logit contributions
 maybe-0.014
 Skip-0.014
 Zoe-0.014
rob-0.014
 Zig-0.014
Token or
Token ID393
Feature activation+2.09e-03

Pos logit contributions
 Skip+0.033
 maybe+0.033
rob+0.032
 Zoe+0.031
ock+0.031
Neg logit contributions
attered-0.036
ney-0.032
aming-0.032
opened-0.031
 mailbox-0.030
Tokennam
Token ID7402
Feature activation-4.20e-02

Pos logit contributions
 maybe+0.003
 Skip+0.003
 Zoe+0.003
rob+0.003
 inspection+0.003
Neg logit contributions
attered-0.003
aming-0.003
ney-0.003
opened-0.003
amed-0.003
Tokenents
Token ID658
Feature activation+1.70e-03

Pos logit contributions
attered+0.084
tr+0.082
 mailbox+0.080
aming+0.080
 hyd+0.078
Neg logit contributions
ock-0.040
 Zoe-0.039
 inspection-0.039
rob-0.035
 Many-0.035
Token on
Token ID319
Feature activation+5.78e-03

Pos logit contributions
 maybe+0.003
 inspection+0.003
 Zoe+0.003
 he+0.003
ock+0.003
Neg logit contributions
attered-0.003
aming-0.003
ney-0.003
opened-0.002
aut-0.002
Token it
Token ID340
Feature activation+1.27e-02

Pos logit contributions
rob+0.009
 Skip+0.009
 maybe+0.009
ock+0.009
 villain+0.008
Neg logit contributions
attered-0.010
ney-0.009
 mailbox-0.009
 cracked-0.009
opened-0.009
Token.
Token ID13
Feature activation+3.55e-03

Pos logit contributions
 maybe+0.020
 Skip+0.020
 Zoe+0.019
rob+0.019
 inspection+0.019
Neg logit contributions
attered-0.021
aming-0.019
ney-0.019
opened-0.018
ifying-0.018
Token The
Token ID383
Feature activation+1.98e-03

Pos logit contributions
rob+0.005
ock+0.005
 villain+0.005
 Skip+0.005
aro+0.005
Neg logit contributions
attered-0.007
opened-0.006
ney-0.006
tr-0.006
 mailbox-0.006
Token."
Token ID526
Feature activation+8.51e-03

Pos logit contributions
attered+0.019
aming+0.017
ney+0.017
opened+0.016
amed+0.016
Neg logit contributions
 maybe-0.011
 Skip-0.011
 Zoe-0.011
ock-0.011
rob-0.011
Token
Token ID198
Feature activation+3.68e-03

Pos logit contributions
rob+0.010
 villain+0.010
ock+0.009
aro+0.009
 Skip+0.009
Neg logit contributions
attered-0.018
ney-0.017
tr-0.016
 mailbox-0.016
opened-0.016
Token
Token ID198
Feature activation+4.30e-03

Pos logit contributions
 villain+0.007
aro+0.007
rob+0.007
 Skip+0.006
 villains+0.006
Neg logit contributions
ney-0.005
-0.005
aming-0.004
attered-0.004
opened-0.004
TokenThey
Token ID2990
Feature activation-8.35e-04

Pos logit contributions
 Skip+0.006
 maybe+0.006
rob+0.006
 Zoe+0.006
 inspection+0.005
Neg logit contributions
attered-0.008
aming-0.007
ney-0.007
opened-0.007
ifying-0.007
Token walked
Token ID6807
Feature activation+3.40e-03

Pos logit contributions
attered+0.002
ney+0.001
ifying+0.001
opened+0.001
umm+0.001
Neg logit contributions
 maybe-0.001
 Zoe-0.001
 them-0.001
 but-0.001
 inspection-0.001
Token around
Token ID1088
Feature activation+2.50e-02

Pos logit contributions
 Skip+0.006
 maybe+0.006
rob+0.006
 Zoe+0.006
 inspection+0.006
Neg logit contributions
attered-0.005
aming-0.005
ney-0.005
opened-0.004
amed-0.004
Token the
Token ID262
Feature activation-8.84e-03

Pos logit contributions
rob+0.039
ock+0.037
 Skip+0.037
 villain+0.035
aro+0.035
Neg logit contributions
attered-0.041
aming-0.039
ney-0.039
 painted-0.038
 mailbox-0.038
Token museum
Token ID13257
Feature activation-7.73e-03

Pos logit contributions
attered+0.017
aming+0.015
ifying+0.015
lette+0.015
amed+0.015
Neg logit contributions
 Skip-0.012
 maybe-0.012
 Zoe-0.012
rob-0.011
 he-0.011
Token and
Token ID290
Feature activation+4.84e-03

Pos logit contributions
attered+0.014
aming+0.013
ney+0.013
ifying+0.013
opened+0.013
Neg logit contributions
 maybe-0.011
 Skip-0.010
 Zoe-0.010
 inspection-0.010
rob-0.010
Token looked
Token ID3114
Feature activation-1.06e-02

Pos logit contributions
 Skip+0.009
rob+0.009
ock+0.008
 maybe+0.008
 villain+0.008
Neg logit contributions
attered-0.007
aming-0.006
ney-0.006
opened-0.006
 cracked-0.006
Token at
Token ID379
Feature activation+3.40e-03

Pos logit contributions
attered+0.016
ney+0.015
aming+0.014
tr+0.014
opened+0.014
Neg logit contributions
 maybe-0.018
 Skip-0.018
 Zoe-0.018
 inspection-0.017
 but-0.017
Token empty
Token ID6565
Feature activation-9.43e-03

Pos logit contributions
attered+0.013
aming+0.013
amed+0.012
ige+0.011
lette+0.011
Neg logit contributions
ock-0.012
 but-0.011
 Zoe-0.011
rob-0.011
 Skip-0.011
Token ig
Token ID45329
Feature activation+5.92e-03

Pos logit contributions
attered+0.015
aming+0.014
ney+0.013
opened+0.013
amed+0.013
Neg logit contributions
 maybe-0.015
 Skip-0.015
rob-0.015
 Zoe-0.015
 inspection-0.015
Tokenloo
Token ID29680
Feature activation-7.21e-03

Pos logit contributions
 Skip+0.008
 maybe+0.008
rob+0.008
 Zoe+0.008
 inspection+0.008
Neg logit contributions
attered-0.011
aming-0.010
ney-0.010
opened-0.010
ifying-0.010
Token.
Token ID13
Feature activation+4.91e-03

Pos logit contributions
attered+0.012
aming+0.011
opened+0.011
ney+0.011
ige+0.011
Neg logit contributions
 maybe-0.011
 Skip-0.010
rob-0.010
 he-0.010
ock-0.010
Token After
Token ID2293
Feature activation+5.46e-03

Pos logit contributions
rob+0.007
 Skip+0.006
ock+0.006
 maybe+0.006
 inspection+0.006
Neg logit contributions
attered-0.009
ney-0.008
aming-0.008
ifying-0.008
opened-0.008
Token playing
Token ID2712
Feature activation+2.71e-02

Pos logit contributions
 Skip+0.007
rob+0.007
 maybe+0.007
ock+0.007
 Zoe+0.007
Neg logit contributions
attered-0.010
aming-0.009
ney-0.009
opened-0.009
 cracked-0.009
Token,
Token ID11
Feature activation+2.79e-04

Pos logit contributions
rob+0.060
 Skip+0.056
ock+0.055
 villain+0.054
 inspection+0.053
Neg logit contributions
ney-0.025
attered-0.025
aming-0.023
opened-0.023
amed-0.023
Token they
Token ID484
Feature activation-1.12e-03

Pos logit contributions
rob+0.000
ock+0.000
 inspection+0.000
 Skip+0.000
 maybe+0.000
Neg logit contributions
attered-0.001
ifying-0.001
 mailbox-0.001
aming-0.001
-0.001
Token went
Token ID1816
Feature activation+8.27e-03

Pos logit contributions
attered+0.002
aming+0.002
ney+0.002
opened+0.002
tr+0.002
Neg logit contributions
 maybe-0.002
 he-0.002
 Skip-0.002
 Zoe-0.002
 villain-0.002
Token inside
Token ID2641
Feature activation-3.85e-04

Pos logit contributions
 Skip+0.016
 maybe+0.016
rob+0.015
 Zoe+0.015
 inspection+0.015
Neg logit contributions
attered-0.011
aming-0.010
ney-0.010
opened-0.009
ifying-0.009
Token the
Token ID262
Feature activation-1.12e-02

Pos logit contributions
attered+0.001
aming+0.001
ney+0.000
opened+0.000
 cracked+0.000
Neg logit contributions
rob-0.001
 Skip-0.001
 villain-0.001
aro-0.001
ock-0.001
Token the
Token ID262
Feature activation-7.53e-03

Pos logit contributions
attered+0.028
aming+0.026
ney+0.025
ra+0.025
 mailbox+0.025
Neg logit contributions
rob-0.020
 Skip-0.019
ock-0.019
 inspection-0.018
 villain-0.018
Token little
Token ID1310
Feature activation+2.26e-03

Pos logit contributions
attered+0.015
aming+0.014
ney+0.014
opened+0.013
ifying+0.013
Neg logit contributions
 maybe-0.010
 Skip-0.010
 Zoe-0.009
rob-0.009
 inspection-0.009
Token girl
Token ID2576
Feature activation-1.38e-02

Pos logit contributions
 Zoe+0.002
 but+0.002
 maybe+0.002
 Skip+0.002
 inspection+0.002
Neg logit contributions
attered-0.005
aming-0.005
ney-0.005
ifying-0.005
tr-0.005
Token had
Token ID550
Feature activation-1.13e-02

Pos logit contributions
attered+0.025
ifying+0.023
aming+0.022
opened+0.022
olk+0.021
Neg logit contributions
 maybe-0.021
 Skip-0.019
 Zoe-0.019
 he-0.019
 but-0.019
Token so
Token ID523
Feature activation+8.10e-03

Pos logit contributions
attered+0.021
ney+0.018
aming+0.018
tr+0.018
ifying+0.018
Neg logit contributions
 Skip-0.017
 maybe-0.017
 Zoe-0.016
 inspection-0.016
 but-0.016
Token many
Token ID867
Feature activation+2.45e-02

Pos logit contributions
 Skip+0.013
 maybe+0.013
rob+0.012
 Zoe+0.012
 inspection+0.012
Neg logit contributions
attered-0.014
aming-0.012
ney-0.012
opened-0.012
ifying-0.012
Token new
Token ID649
Feature activation+1.17e-02

Pos logit contributions
rob+0.042
 Skip+0.040
 Zig+0.038
 villain+0.038
 Boo+0.038
Neg logit contributions
attered-0.035
aming-0.034
 painted-0.033
 mailbox-0.033
 cracked-0.033
Token adventures
Token ID17545
Feature activation+4.36e-03

Pos logit contributions
rob+0.018
 Skip+0.018
ock+0.017
 villain+0.017
 Zig+0.017
Neg logit contributions
attered-0.019
ney-0.018
aming-0.018
 mailbox-0.018
 painted-0.017
Token.
Token ID13
Feature activation-1.26e-02

Pos logit contributions
 Skip+0.007
rob+0.007
 maybe+0.007
 Zoe+0.007
 inspection+0.007
Neg logit contributions
attered-0.007
aming-0.006
ney-0.006
opened-0.006
 mailbox-0.006
Token She
Token ID1375
Feature activation-5.42e-03

Pos logit contributions
attered+0.022
ney+0.020
aming+0.019
opened+0.019
tr+0.019
Neg logit contributions
 Skip-0.019
rob-0.019
 maybe-0.018
 inspection-0.018
 Zoe-0.018
Token learned
Token ID4499
Feature activation+7.93e-03

Pos logit contributions
attered+0.009
aming+0.008
ifying+0.008
ney+0.007
opened+0.007
Neg logit contributions
 maybe-0.009
 Skip-0.009
 Zoe-0.009
rob-0.009
 inspection-0.008
Token!"
Token ID2474
Feature activation+3.09e-03

Pos logit contributions
rob+0.035
 Skip+0.034
 maybe+0.034
 inspection+0.033
 Zoe+0.033
Neg logit contributions
attered-0.037
aming-0.034
ney-0.034
opened-0.032
 mailbox-0.032
Token So
Token ID1406
Feature activation+2.75e-03

Pos logit contributions
rob+0.004
 villain+0.004
ock+0.004
aro+0.004
 Skip+0.004
Neg logit contributions
attered-0.006
ney-0.005
aming-0.005
-0.005
tr-0.005
Token they
Token ID484
Feature activation-8.12e-03

Pos logit contributions
rob+0.003
ock+0.003
 Skip+0.003
 villain+0.003
 maybe+0.003
Neg logit contributions
attered-0.006
aming-0.005
ney-0.005
 mailbox-0.005
opened-0.005
Token j
Token ID474
Feature activation+9.35e-03

Pos logit contributions
attered+0.012
ney+0.011
aming+0.010
opened+0.010
 mailbox+0.010
Neg logit contributions
 Skip-0.015
 maybe-0.015
 Zoe-0.015
ock-0.014
 Zig-0.014
Tokenogged
Token ID42545
Feature activation+9.88e-03

Pos logit contributions
ock+0.012
 Zoe+0.012
 Skip+0.012
 inspection+0.012
 maybe+0.012
Neg logit contributions
attered-0.017
ney-0.016
tr-0.016
 mailbox-0.016
aming-0.016
Token around
Token ID1088
Feature activation+2.49e-02

Pos logit contributions
 maybe+0.020
rob+0.020
 villain+0.019
 Skip+0.019
ock+0.019
Neg logit contributions
attered-0.012
amed-0.010
aming-0.010
ney-0.009
opened-0.009
Token the
Token ID262
Feature activation+4.93e-03

Pos logit contributions
rob+0.045
ock+0.045
 villain+0.040
aro+0.040
 maybe+0.040
Neg logit contributions
-0.031
aming-0.031
 painted-0.030
attered-0.029
ra-0.029
Token park
Token ID3952
Feature activation+5.51e-03

Pos logit contributions
 Skip+0.006
 Zoe+0.006
 maybe+0.006
rob+0.006
ock+0.006
Neg logit contributions
attered-0.010
lette-0.009
ney-0.009
aming-0.009
ifying-0.009
Token one
Token ID530
Feature activation+2.23e-02

Pos logit contributions
rob+0.009
 Skip+0.008
ock+0.008
 maybe+0.008
 villain+0.008
Neg logit contributions
attered-0.009
ney-0.008
aming-0.008
amed-0.008
tr-0.008
Token more
Token ID517
Feature activation+2.46e-02

Pos logit contributions
 Skip+0.035
 maybe+0.034
rob+0.034
 Zoe+0.033
 inspection+0.033
Neg logit contributions
attered-0.038
aming-0.034
ney-0.034
opened-0.033
ifying-0.033
Token time
Token ID640
Feature activation+5.97e-04

Pos logit contributions
rob+0.047
 Skip+0.046
 maybe+0.045
 villain+0.043
 Boo+0.043
Neg logit contributions
attered-0.032
aming-0.030
ney-0.030
aut-0.027
opened-0.027
Token Tom
Token ID4186
Feature activation-7.31e-03

Pos logit contributions
rob+0.017
ock+0.017
aro+0.015
 Skip+0.015
 villain+0.015
Neg logit contributions
attered-0.017
-0.016
 mailbox-0.016
 cracked-0.016
 painted-0.015
Token says
Token ID1139
Feature activation-2.07e-03

Pos logit contributions
attered+0.011
+0.010
aming+0.010
 cracked+0.010
ney+0.010
Neg logit contributions
rob-0.012
 inspection-0.011
 Skip-0.011
aro-0.011
 maybe-0.011
Token.
Token ID13
Feature activation+6.72e-03

Pos logit contributions
attered+0.004
aming+0.003
ney+0.003
opened+0.003
ifying+0.003
Neg logit contributions
 Skip-0.003
rob-0.003
 maybe-0.003
 Zoe-0.003
ock-0.003
Token
Token ID198
Feature activation+4.49e-03

Pos logit contributions
rob+0.012
ock+0.012
aro+0.011
 Skip+0.011
 inspection+0.011
Neg logit contributions
tr-0.009
ney-0.009
attered-0.009
opened-0.009
um-0.008
Token
Token ID198
Feature activation+7.11e-03

Pos logit contributions
rob+0.008
 villain+0.008
 Skip+0.008
 inspection+0.008
aro+0.008
Neg logit contributions
ney-0.006
opened-0.006
attered-0.006
Mom-0.006
aming-0.006
TokenL
Token ID43
Feature activation+2.77e-02

Pos logit contributions
 Skip+0.010
 maybe+0.010
 Zoe+0.010
 he+0.009
rob+0.009
Neg logit contributions
attered-0.013
ifying-0.012
aming-0.011
ney-0.011
tr-0.011
Tokenily
Token ID813
Feature activation-5.92e-03

Pos logit contributions
 Skip+0.046
rob+0.044
 maybe+0.044
 inspection+0.043
 Zoe+0.043
Neg logit contributions
attered-0.044
aming-0.043
amed-0.040
ige-0.039
ney-0.039
Token agrees
Token ID14386
Feature activation-8.60e-04

Pos logit contributions
attered+0.012
aming+0.011
ney+0.011
ifying+0.011
opened+0.011
Neg logit contributions
 maybe-0.007
 Skip-0.007
 Zoe-0.007
 but-0.007
 Boo-0.007
Token.
Token ID13
Feature activation+9.95e-03

Pos logit contributions
aming+0.001
attered+0.001
+0.001
um+0.001
au+0.001
Neg logit contributions
rob-0.002
 Skip-0.002
 villain-0.002
ock-0.002
aro-0.002
Token She
Token ID1375
Feature activation-6.45e-03

Pos logit contributions
rob+0.018
ock+0.017
 Skip+0.017
aro+0.017
 villain+0.016
Neg logit contributions
attered-0.015
ney-0.013
 mailbox-0.013
tr-0.013
opened-0.013
Token has
Token ID468
Feature activation+8.48e-05

Pos logit contributions
attered+0.012
ney+0.011
aming+0.011
ifying+0.010
opened+0.010
Neg logit contributions
 maybe-0.009
 Skip-0.009
 Zoe-0.009
rob-0.009
 Boo-0.009
Token.
Token ID13
Feature activation+2.69e-03

Pos logit contributions
rob+0.021
 Skip+0.021
 Zoe+0.020
 maybe+0.020
 inspection+0.019
Neg logit contributions
attered-0.017
ney-0.015
aming-0.015
opened-0.015
 mailbox-0.015
Token But
Token ID887
Feature activation+1.54e-02

Pos logit contributions
rob+0.005
ock+0.005
aro+0.004
 villain+0.004
 Skip+0.004
Neg logit contributions
attered-0.004
ney-0.004
tr-0.004
opened-0.004
 cracked-0.004
Token Lily
Token ID20037
Feature activation+7.74e-03

Pos logit contributions
rob+0.020
ock+0.018
 Skip+0.018
 inspection+0.018
 Zoe+0.016
Neg logit contributions
attered-0.029
aming-0.028
 cracked-0.028
-0.027
 mailbox-0.027
Token was
Token ID373
Feature activation+1.65e-02

Pos logit contributions
 maybe+0.012
 Skip+0.012
 but+0.011
 Zoe+0.011
 he+0.011
Neg logit contributions
attered-0.014
ifying-0.012
aming-0.012
ney-0.012
opened-0.012
Token so
Token ID523
Feature activation+8.35e-03

Pos logit contributions
 Skip+0.025
 maybe+0.025
rob+0.025
 Zoe+0.024
 inspection+0.024
Neg logit contributions
attered-0.029
aming-0.026
ney-0.026
opened-0.025
amed-0.025
Token upset
Token ID9247
Feature activation+2.77e-03

Pos logit contributions
 maybe+0.013
 Skip+0.012
 he+0.012
ock+0.012
 Zoe+0.011
Neg logit contributions
attered-0.015
ifying-0.013
ney-0.013
aming-0.013
tr-0.013
Token about
Token ID546
Feature activation-6.39e-04

Pos logit contributions
rob+0.006
 Skip+0.006
 Zac+0.006
 villain+0.005
 Zoe+0.005
Neg logit contributions
attered-0.003
-0.003
aming-0.003
tr-0.003
 mailbox-0.002
Token losing
Token ID6078
Feature activation+7.20e-03

Pos logit contributions
attered+0.001
+0.001
 painted+0.001
 mailbox+0.001
 cracked+0.001
Neg logit contributions
rob-0.001
 Skip-0.001
 villain-0.001
ock-0.001
 Zac-0.001
Token her
Token ID607
Feature activation-1.29e-02

Pos logit contributions
rob+0.011
 Skip+0.010
 inspection+0.010
aro+0.010
 maybe+0.010
Neg logit contributions
attered-0.012
aming-0.011
 mailbox-0.011
-0.011
 cracked-0.011
Token penny
Token ID26369
Feature activation-3.94e-03

Pos logit contributions
attered+0.022
aming+0.020
ney+0.019
opened+0.019
ifying+0.019
Neg logit contributions
 maybe-0.020
 Skip-0.019
 but-0.019
 Zoe-0.019
ock-0.019
Token and
Token ID290
Feature activation+3.49e-03

Pos logit contributions
attered+0.007
aming+0.006
ney+0.006
ifying+0.006
opened+0.006
Neg logit contributions
 maybe-0.006
 Skip-0.006
 but-0.005
ock-0.005
rob-0.005
Token But
Token ID887
Feature activation+1.86e-02

Pos logit contributions
rob+0.010
ock+0.010
 villain+0.009
 Skip+0.009
aro+0.009
Neg logit contributions
attered-0.009
ney-0.008
tr-0.008
opened-0.008
 mailbox-0.008
Token there
Token ID612
Feature activation+1.02e-02

Pos logit contributions
rob+0.020
 Skip+0.020
 maybe+0.019
ock+0.019
 inspection+0.018
Neg logit contributions
attered-0.040
aming-0.037
ney-0.036
opened-0.036
tr-0.036
Token is
Token ID318
Feature activation+6.17e-03

Pos logit contributions
 maybe+0.016
 Skip+0.015
 Zoe+0.015
rob+0.015
 inspection+0.014
Neg logit contributions
attered-0.018
aming-0.017
ney-0.016
ifying-0.016
opened-0.016
Token no
Token ID645
Feature activation-1.24e-03

Pos logit contributions
 maybe+0.010
 Skip+0.009
 Zoe+0.009
rob+0.009
 inspection+0.009
Neg logit contributions
attered-0.011
aming-0.010
ney-0.009
opened-0.009
ifying-0.009
Token one
Token ID530
Feature activation+1.30e-02

Pos logit contributions
attered+0.002
aming+0.002
ney+0.002
opened+0.002
tr+0.002
Neg logit contributions
 Skip-0.002
 maybe-0.002
rob-0.002
 Zoe-0.002
 inspection-0.002
Token in
Token ID287
Feature activation+1.19e-02

Pos logit contributions
 maybe+0.020
 Skip+0.020
 Zoe+0.020
rob+0.019
 inspection+0.019
Neg logit contributions
attered-0.022
aming-0.020
ney-0.020
opened-0.019
ifying-0.019
Token the
Token ID262
Feature activation+1.73e-03

Pos logit contributions
rob+0.016
 Skip+0.016
 maybe+0.016
ock+0.015
 inspection+0.015
Neg logit contributions
attered-0.022
ifying-0.020
aming-0.020
ney-0.020
 mailbox-0.020
Token park
Token ID3952
Feature activation+5.58e-03

Pos logit contributions
 but+0.003
 Whe+0.003
 villain+0.003
 maybe+0.003
 Zoe+0.003
Neg logit contributions
attered-0.003
aming-0.002
amed-0.002
opened-0.002
ifying-0.002
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
 maybe+0.008
 Skip+0.008
 Zoe+0.008
 inspection+0.008
 but+0.008
Neg logit contributions
attered-0.010
aming-0.009
ney-0.009
amed-0.009
opened-0.009
Token They
Token ID1119
Feature activation+9.52e-03

Pos logit contributions
rob+0.016
 Skip+0.015
ock+0.015
 maybe+0.015
 inspection+0.015
Neg logit contributions
attered-0.018
ney-0.016
tr-0.016
aming-0.016
opened-0.016
Token are
Token ID389
Feature activation+2.68e-03

Pos logit contributions
 maybe+0.012
 Skip+0.011
 Zoe+0.011
 but+0.011
 he+0.011
Neg logit contributions
attered-0.020
aming-0.019
amed-0.018
ney-0.018
opened-0.017
Token.
Token ID13
Feature activation+1.20e-03

Pos logit contributions
 Skip+0.009
 maybe+0.009
rob+0.009
 Zoe+0.009
 inspection+0.009
Neg logit contributions
attered-0.009
aming-0.008
ney-0.008
opened-0.008
ifying-0.008
Token He
Token ID679
Feature activation+4.33e-03

Pos logit contributions
rob+0.002
ock+0.002
 Skip+0.002
 inspection+0.002
 villain+0.002
Neg logit contributions
attered-0.002
ney-0.002
aming-0.002
ifying-0.002
-0.002
Token walked
Token ID6807
Feature activation+5.41e-03

Pos logit contributions
 Skip+0.008
 maybe+0.007
rob+0.007
 Zoe+0.007
 inspection+0.007
Neg logit contributions
attered-0.006
aming-0.006
ney-0.006
opened-0.005
tr-0.005
Token along
Token ID1863
Feature activation+8.11e-03

Pos logit contributions
rob+0.011
 maybe+0.011
 villain+0.010
 Skip+0.010
aro+0.010
Neg logit contributions
attered-0.006
amed-0.006
aming-0.006
ney-0.006
opened-0.005
Token the
Token ID262
Feature activation+7.05e-05

Pos logit contributions
rob+0.013
 maybe+0.012
 Skip+0.012
ock+0.012
 villain+0.012
Neg logit contributions
attered-0.013
ney-0.012
aming-0.012
 mailbox-0.012
opened-0.012
Token path
Token ID3108
Feature activation+1.27e-02

Pos logit contributions
 Skip+0.000
 maybe+0.000
 Zoe+0.000
rob+0.000
 inspection+0.000
Neg logit contributions
attered-0.000
aming-0.000
ifying-0.000
opened-0.000
ney-0.000
Token,
Token ID11
Feature activation+7.53e-04

Pos logit contributions
rob+0.021
 Skip+0.020
 villain+0.019
 maybe+0.019
ock+0.018
Neg logit contributions
attered-0.021
ney-0.020
aming-0.019
amed-0.018
opened-0.018
Token slowly
Token ID6364
Feature activation+4.23e-03

Pos logit contributions
rob+0.001
ock+0.001
 Skip+0.001
 maybe+0.001
 villain+0.001
Neg logit contributions
attered-0.001
aming-0.001
ney-0.001
 cracked-0.001
amed-0.001
Token and
Token ID290
Feature activation-4.57e-04

Pos logit contributions
 Skip+0.007
 maybe+0.006
rob+0.006
 Zoe+0.006
ock+0.006
Neg logit contributions
attered-0.007
aming-0.006
ney-0.006
opened-0.006
amed-0.006
Token carefully
Token ID7773
Feature activation+6.39e-04

Pos logit contributions
attered+0.001
aming+0.001
 cracked+0.001
ney+0.001
amed+0.001
Neg logit contributions
rob-0.001
ock-0.001
 Skip-0.001
 villain-0.001
 maybe-0.001
Token.
Token ID13
Feature activation+5.94e-04

Pos logit contributions
 Skip+0.001
 maybe+0.001
rob+0.001
 Zoe+0.001
ock+0.001
Neg logit contributions
attered-0.001
aming-0.001
ney-0.001
opened-0.001
amed-0.001
Token we
Token ID356
Feature activation-4.83e-03

Pos logit contributions
attered+0.021
 mailbox+0.020
aming+0.020
 cracked+0.020
ney+0.020
Neg logit contributions
rob-0.012
 Skip-0.012
ock-0.011
 inspection-0.011
 Zoe-0.010
Token need
Token ID761
Feature activation-4.20e-03

Pos logit contributions
attered+0.009
aming+0.009
amed+0.008
ney+0.008
opened+0.008
Neg logit contributions
 maybe-0.007
 but-0.006
 he-0.006
 villain-0.006
 Zoe-0.006
Token to
Token ID284
Feature activation-1.20e-03

Pos logit contributions
attered+0.008
aming+0.007
ney+0.007
opened+0.007
ifying+0.007
Neg logit contributions
 Skip-0.006
 maybe-0.006
rob-0.006
 Zoe-0.005
 inspection-0.005
Token go
Token ID467
Feature activation-2.27e-03

Pos logit contributions
attered+0.002
aming+0.002
ney+0.002
amed+0.002
ifying+0.002
Neg logit contributions
 maybe-0.002
 Skip-0.002
 Zoe-0.002
 but-0.001
ock-0.001
Token closer
Token ID5699
Feature activation+4.87e-03

Pos logit contributions
attered+0.003
aming+0.002
ney+0.002
opened+0.002
 mailbox+0.002
Neg logit contributions
 Skip-0.005
rob-0.004
 maybe-0.004
 inspection-0.004
 Zoe-0.004
Token to
Token ID284
Feature activation+1.36e-03

Pos logit contributions
rob+0.009
 Skip+0.009
 villain+0.008
aro+0.008
 Zoe+0.008
Neg logit contributions
attered-0.007
ney-0.007
aming-0.006
 mailbox-0.006
aut-0.006
Token the
Token ID262
Feature activation-7.95e-03

Pos logit contributions
rob+0.002
 inspection+0.002
 Skip+0.002
aro+0.002
ock+0.002
Neg logit contributions
 mailbox-0.002
 printed-0.002
 painted-0.002
ra-0.002
 cracked-0.002
Token rainbow
Token ID27223
Feature activation+6.14e-03

Pos logit contributions
attered+0.016
aming+0.014
amed+0.014
ung+0.013
ney+0.013
Neg logit contributions
 Zoe-0.011
 Skip-0.011
 maybe-0.010
 Boo-0.010
 villain-0.010
Token."
Token ID526
Feature activation+1.30e-03

Pos logit contributions
 Skip+0.009
 maybe+0.009
rob+0.008
 Zoe+0.008
 inspection+0.008
Neg logit contributions
attered-0.011
ney-0.010
aming-0.010
opened-0.010
tr-0.010
Token
Token ID198
Feature activation+1.93e-03

Pos logit contributions
rob+0.003
aro+0.003
ock+0.003
 Skip+0.003
 villain+0.003
Neg logit contributions
ney-0.002
opened-0.001
um-0.001
attered-0.001
tr-0.001
Token
Token ID198
Feature activation+6.87e-03

Pos logit contributions
rob+0.004
 villain+0.004
 Skip+0.004
aro+0.003
 inspection+0.003
Neg logit contributions
ney-0.003
opened-0.002
um-0.002
attered-0.002
Mom-0.002
Token every
Token ID790
Feature activation+8.89e-04

Pos logit contributions
attered+0.002
tr+0.002
 mailbox+0.001
aut+0.001
ney+0.001
Neg logit contributions
rob-0.002
 Skip-0.002
aro-0.002
 villain-0.002
ock-0.002
Token day
Token ID1110
Feature activation-8.36e-03

Pos logit contributions
 maybe+0.002
 Skip+0.002
 Zoe+0.002
ock+0.002
rob+0.002
Neg logit contributions
attered-0.001
aming-0.001
amed-0.001
ney-0.001
opened-0.001
Token.
Token ID13
Feature activation+1.21e-03

Pos logit contributions
attered+0.014
aming+0.013
ney+0.013
opened+0.012
ifying+0.012
Neg logit contributions
 maybe-0.013
 Skip-0.013
 Zoe-0.012
rob-0.012
 inspection-0.012
Token She
Token ID1375
Feature activation-2.09e-04

Pos logit contributions
rob+0.002
 Skip+0.002
 villain+0.002
aro+0.002
ock+0.002
Neg logit contributions
attered-0.002
ney-0.002
 mailbox-0.002
um-0.002
opened-0.002
Token reads
Token ID9743
Feature activation+3.80e-03

Pos logit contributions
attered+0.000
aming+0.000
ney+0.000
opened+0.000
ifying+0.000
Neg logit contributions
 Skip-0.000
 maybe-0.000
rob-0.000
 Zoe-0.000
 inspection-0.000
Token about
Token ID546
Feature activation-4.44e-04

Pos logit contributions
aro+0.007
rob+0.007
 Skip+0.006
 villain+0.006
 inspection+0.006
Neg logit contributions
 mailbox-0.005
 painted-0.005
tr-0.005
um-0.005
 printed-0.005
Token animals
Token ID4695
Feature activation+1.46e-02

Pos logit contributions
 mailbox+0.001
 painted+0.001
ige+0.001
+0.001
 envelope+0.001
Neg logit contributions
rob-0.001
 Skip-0.001
aro-0.001
 inspection-0.001
 villain-0.001
Token,
Token ID11
Feature activation+3.03e-03

Pos logit contributions
rob+0.026
 inspection+0.025
 Skip+0.025
 villain+0.024
ock+0.024
Neg logit contributions
-0.018
attered-0.018
 mailbox-0.018
ney-0.018
 painted-0.017
Token and
Token ID290
Feature activation+2.34e-04

Pos logit contributions
aro+0.006
rob+0.005
 Skip+0.005
ner+0.005
orts+0.005
Neg logit contributions
 painted-0.004
 mailbox-0.004
 backyard-0.003
-0.003
 cracked-0.003
Token cars
Token ID5006
Feature activation+7.88e-03

Pos logit contributions
rob+0.000
aro+0.000
 Skip+0.000
ock+0.000
 inspection+0.000
Neg logit contributions
 mailbox-0.000
 painted-0.000
attered-0.000
-0.000
 printed-0.000
Token,
Token ID11
Feature activation-4.89e-04

Pos logit contributions
 Skip+0.014
rob+0.014
 maybe+0.014
 inspection+0.014
aro+0.013
Neg logit contributions
attered-0.011
ney-0.010
tr-0.010
 mailbox-0.009
aming-0.009
Token someone
Token ID2130
Feature activation+1.53e-02

Pos logit contributions
rob+0.007
ner+0.006
 inspection+0.006
 Skip+0.006
 villain+0.006
Neg logit contributions
 mailbox-0.004
 cracked-0.003
attered-0.003
amed-0.003
 painted-0.003
Token who
Token ID508
Feature activation+6.78e-03

Pos logit contributions
rob+0.027
 Skip+0.026
ock+0.025
 Zoe+0.025
aro+0.025
Neg logit contributions
attered-0.023
ney-0.020
-0.020
aming-0.020
 mailbox-0.020
Token understand
Token ID1833
Feature activation+8.12e-03

Pos logit contributions
rob+0.014
 Skip+0.014
 inspection+0.014
ock+0.014
 Zoe+0.013
Neg logit contributions
attered-0.008
 cracked-0.007
ney-0.006
 painted-0.006
opened-0.006
Token how
Token ID703
Feature activation-1.34e-02

Pos logit contributions
rob+0.015
 Skip+0.014
 inspection+0.014
 Zig+0.013
 Zoe+0.013
Neg logit contributions
attered-0.011
 mailbox-0.010
ney-0.010
aming-0.010
 cracked-0.010
Token he
Token ID339
Feature activation-4.17e-04

Pos logit contributions
attered+0.023
aming+0.021
ney+0.021
opened+0.020
tr+0.020
Neg logit contributions
 Skip-0.020
 maybe-0.020
 Zoe-0.020
rob-0.019
 inspection-0.019
Token felt
Token ID2936
Feature activation-3.11e-03

Pos logit contributions
attered+0.001
ney+0.000
aming+0.000
opened+0.000
 cracked+0.000
Neg logit contributions
 Skip-0.001
 inspection-0.001
rob-0.001
 Zoe-0.001
 maybe-0.001
Token.
Token ID13
Feature activation+1.45e-03

Pos logit contributions
attered+0.005
aming+0.004
ney+0.004
opened+0.004
 mailbox+0.004
Neg logit contributions
rob-0.005
 Skip-0.005
 Zoe-0.005
 inspection-0.005
 Zig-0.005
Token Together
Token ID17083
Feature activation-5.51e-03

Pos logit contributions
 Skip+0.002
rob+0.002
 maybe+0.002
ock+0.002
 Zoe+0.002
Neg logit contributions
attered-0.003
ney-0.003
aming-0.003
opened-0.003
tr-0.003
Token,
Token ID11
Feature activation-7.95e-03

Pos logit contributions
attered+0.013
aming+0.012
ney+0.012
ifying+0.011
opened+0.011
Neg logit contributions
 Skip-0.006
 maybe-0.006
 Zoe-0.006
 he-0.005
 inspection-0.005
Token they
Token ID484
Feature activation-7.71e-03

Pos logit contributions
attered+0.015
 mailbox+0.015
+0.014
 cracked+0.014
opened+0.014
Neg logit contributions
rob-0.009
ock-0.009
 Skip-0.008
 inspection-0.008
 villain-0.008
Token had
Token ID550
Feature activation-1.15e-02

Pos logit contributions
attered+0.015
ney+0.013
olk+0.013
aming+0.013
ifying+0.013
Neg logit contributions
 maybe-0.011
 he-0.011
 Skip-0.010
 them-0.010
 Bub-0.010
Token paper
Token ID3348
Feature activation+2.97e-03

Pos logit contributions
attered+0.010
aming+0.009
 mailbox+0.009
ney+0.009
ifying+0.009
Neg logit contributions
 Skip-0.009
rob-0.009
 maybe-0.008
aro-0.008
 inspection-0.008
Token,
Token ID11
Feature activation-2.72e-03

Pos logit contributions
 Skip+0.005
 maybe+0.005
rob+0.004
 Zoe+0.004
 inspection+0.004
Neg logit contributions
attered-0.005
aming-0.005
ney-0.004
opened-0.004
ifying-0.004
Token on
Token ID319
Feature activation-9.28e-03

Pos logit contributions
 painted+0.002
 cracked+0.002
attered+0.002
 mailbox+0.002
tr+0.002
Neg logit contributions
 Skip-0.007
rob-0.006
aro-0.006
 Zig-0.006
 inspection-0.006
Token the
Token ID262
Feature activation+1.52e-03

Pos logit contributions
attered+0.016
aming+0.015
 mailbox+0.014
ney+0.014
 cracked+0.014
Neg logit contributions
 Skip-0.014
rob-0.014
 maybe-0.013
aro-0.013
 Zig-0.013
Token wall
Token ID3355
Feature activation+9.22e-03

Pos logit contributions
 but+0.002
ock+0.002
rob+0.002
 Skip+0.002
 maybe+0.002
Neg logit contributions
attered-0.003
ung-0.002
opened-0.002
aming-0.002
amed-0.002
Token,
Token ID11
Feature activation-5.25e-03

Pos logit contributions
 Skip+0.014
rob+0.014
 maybe+0.014
 Zoe+0.013
 inspection+0.013
Neg logit contributions
attered-0.016
ney-0.014
aming-0.014
opened-0.014
tr-0.014
Token on
Token ID319
Feature activation-1.04e-02

Pos logit contributions
attered+0.009
aming+0.008
 mailbox+0.008
ney+0.008
 cracked+0.008
Neg logit contributions
 Skip-0.008
rob-0.008
 Zoe-0.008
 inspection-0.008
 maybe-0.007
Token the
Token ID262
Feature activation-2.52e-04

Pos logit contributions
attered+0.020
aming+0.018
ney+0.018
opened+0.018
 mailbox+0.017
Neg logit contributions
 Skip-0.014
rob-0.013
 maybe-0.013
 Zoe-0.013
 inspection-0.013
Token floor
Token ID4314
Feature activation+2.45e-03

Pos logit contributions
attered+0.000
aming+0.000
opened+0.000
ung+0.000
ney+0.000
Neg logit contributions
 but-0.000
 maybe-0.000
 Skip-0.000
ock-0.000
rob-0.000
Token.
Token ID13
Feature activation-1.31e-03

Pos logit contributions
 Skip+0.004
rob+0.004
 maybe+0.004
 Zoe+0.003
 inspection+0.003
Neg logit contributions
attered-0.004
ney-0.004
aming-0.004
opened-0.004
 cracked-0.004
Token She
Token ID1375
Feature activation-2.01e-03

Pos logit contributions
attered+0.002
tr+0.002
ney+0.002
 mailbox+0.002
opened+0.002
Neg logit contributions
rob-0.002
 Skip-0.002
aro-0.002
 inspection-0.002
 villain-0.002
Token attractive
Token ID10966
Feature activation-2.11e-03

Pos logit contributions
 but+0.002
 Skip+0.002
 maybe+0.002
rob+0.002
 inspection+0.002
Neg logit contributions
attered-0.002
ifying-0.002
aming-0.001
ra-0.001
lette-0.001
Token barn
Token ID25203
Feature activation-6.52e-03

Pos logit contributions
attered+0.003
aming+0.002
ifying+0.002
ney+0.002
amed+0.002
Neg logit contributions
 Skip-0.004
 maybe-0.004
 inspection-0.004
 Zoe-0.004
rob-0.004
Token.
Token ID13
Feature activation-2.22e-03

Pos logit contributions
attered+0.012
aming+0.011
ney+0.011
opened+0.010
ifying+0.010
Neg logit contributions
 Skip-0.009
 maybe-0.009
rob-0.009
 Zoe-0.009
 inspection-0.009
Token\n
Token ID198
Feature activation-1.46e-03

Pos logit contributions
attered+0.004
ney+0.003
tr+0.003
opened+0.003
aming+0.003
Neg logit contributions
rob-0.003
ock-0.003
 inspection-0.003
 villain-0.003
 Skip-0.003
Token\n
Token ID198
Feature activation+1.24e-03

Pos logit contributions
ney+0.002
aming+0.002
+0.002
opened+0.002
amed+0.002
Neg logit contributions
 villain-0.003
rob-0.003
ock-0.003
 Skip-0.003
aro-0.003
TokenAs
Token ID1722
Feature activation-2.88e-03

Pos logit contributions
 maybe+0.002
 Skip+0.002
rob+0.002
 Zoe+0.002
 inspection+0.002
Neg logit contributions
attered-0.002
aming-0.002
ney-0.002
opened-0.002
amed-0.002
Token the
Token ID262
Feature activation+2.12e-03

Pos logit contributions
attered+0.006
aming+0.005
ney+0.005
opened+0.005
 cracked+0.005
Neg logit contributions
rob-0.004
 Skip-0.003
ock-0.003
 maybe-0.003
 inspection-0.003
Token sun
Token ID4252
Feature activation-4.78e-03

Pos logit contributions
 maybe+0.003
 Skip+0.003
 Boo+0.003
 but+0.003
 Zoe+0.003
Neg logit contributions
attered-0.004
ifying-0.004
aming-0.004
opened-0.003
ney-0.003
Token went
Token ID1816
Feature activation-7.43e-03

Pos logit contributions
attered+0.007
ifying+0.007
ige+0.007
lette+0.006
opened+0.006
Neg logit contributions
 he-0.008
 Zoe-0.008
 Zig-0.008
 Skip-0.008
 maybe-0.008
Token down
Token ID866
Feature activation+1.09e-03

Pos logit contributions
attered+0.012
ney+0.012
aming+0.012
ige+0.012
lette+0.012
Neg logit contributions
 Skip-0.012
 he-0.012
 but-0.012
 maybe-0.012
 Zig-0.011
Token,
Token ID11
Feature activation-8.45e-03

Pos logit contributions
 Skip+0.002
rob+0.001
 maybe+0.001
ock+0.001
 Zoe+0.001
Neg logit contributions
attered-0.002
aming-0.002
ney-0.002
opened-0.002
 cracked-0.002
Token around
Token ID1088
Feature activation+1.87e-02

Pos logit contributions
 Zoe+0.011
 Skip+0.011
ock+0.011
 Boo+0.010
 Zig+0.010
Neg logit contributions
attered-0.015
ige-0.014
lette-0.013
 mailbox-0.013
aming-0.013
Token the
Token ID262
Feature activation-4.89e-03

Pos logit contributions
rob+0.027
 Skip+0.025
ock+0.025
 maybe+0.025
 villain+0.024
Neg logit contributions
attered-0.033
aming-0.031
ney-0.030
 mailbox-0.030
opened-0.030
Token barrel
Token ID9036
Feature activation-1.28e-02

Pos logit contributions
attered+0.009
ung+0.009
ra+0.008
tr+0.008
ifying+0.008
Neg logit contributions
 but-0.008
 he-0.007
 Skip-0.006
 inspection-0.006
 maybe-0.006
Token and
Token ID290
Feature activation-1.30e-04

Pos logit contributions
attered+0.025
aming+0.023
ifying+0.022
ney+0.022
ige+0.022
Neg logit contributions
 maybe-0.017
 Skip-0.016
 but-0.016
 Zoe-0.016
 he-0.016
Token began
Token ID2540
Feature activation-1.29e-02

Pos logit contributions
attered+0.000
aming+0.000
ney+0.000
opened+0.000
 cracked+0.000
Neg logit contributions
 Skip-0.000
rob-0.000
 maybe-0.000
ock-0.000
 Zoe-0.000
Token to
Token ID284
Feature activation-8.38e-03

Pos logit contributions
attered+0.028
ney+0.025
aming+0.025
ige+0.024
um+0.024
Neg logit contributions
 Zoe-0.015
 inspection-0.015
 maybe-0.015
 Skip-0.015
 but-0.015
Token change
Token ID1487
Feature activation-2.65e-03

Pos logit contributions
attered+0.016
aming+0.014
ney+0.014
opened+0.013
ifying+0.013
Neg logit contributions
 Skip-0.012
 maybe-0.012
rob-0.011
 Zoe-0.011
 inspection-0.011
Token it
Token ID340
Feature activation-2.75e-03

Pos logit contributions
attered+0.005
aming+0.004
ney+0.004
opened+0.004
ifying+0.004
Neg logit contributions
 Skip-0.004
 maybe-0.004
rob-0.004
 Zoe-0.004
 inspection-0.004
Token.
Token ID13
Feature activation-8.35e-04

Pos logit contributions
attered+0.005
aming+0.005
ige+0.005
ney+0.005
amed+0.005
Neg logit contributions
 maybe-0.004
 but-0.004
 Skip-0.004
 he-0.003
 Zoe-0.003
Token The
Token ID383
Feature activation+4.66e-04

Pos logit contributions
attered+0.002
aming+0.001
ney+0.001
opened+0.001
ifying+0.001
Neg logit contributions
 Skip-0.001
 maybe-0.001
 Zoe-0.001
rob-0.001
 inspection-0.001
Token barrel
Token ID9036
Feature activation-1.19e-02

Pos logit contributions
 maybe+0.001
 but+0.001
 he+0.001
 Skip+0.001
 villain+0.001
Neg logit contributions
attered-0.001
aming-0.001
ra-0.001
ung-0.001
ifying-0.001
Token she
Token ID673
Feature activation+6.62e-03

Pos logit contributions
rob+0.016
 Skip+0.015
 maybe+0.015
ock+0.015
 inspection+0.014
Neg logit contributions
attered-0.017
aming-0.015
ney-0.014
 cracked-0.014
opened-0.014
Token wanted
Token ID2227
Feature activation-6.97e-03

Pos logit contributions
 maybe+0.010
 Zoe+0.009
 Skip+0.009
 he+0.009
 but+0.009
Neg logit contributions
attered-0.012
ifying-0.011
aming-0.011
ney-0.010
olk-0.010
Token to
Token ID284
Feature activation+4.62e-03

Pos logit contributions
attered+0.009
+0.007
aming+0.007
 mailbox+0.007
tr+0.007
Neg logit contributions
rob-0.014
 Skip-0.013
ock-0.013
 villain-0.013
 maybe-0.013
Token keep
Token ID1394
Feature activation-4.29e-03

Pos logit contributions
rob+0.007
 Skip+0.007
 inspection+0.007
 maybe+0.007
ock+0.006
Neg logit contributions
attered-0.007
 mailbox-0.007
ney-0.007
aming-0.007
-0.007
Token playing
Token ID2712
Feature activation+8.37e-03

Pos logit contributions
attered+0.007
aming+0.007
 mailbox+0.006
opened+0.006
lette+0.006
Neg logit contributions
rob-0.006
 Skip-0.006
ock-0.006
 inspection-0.006
 maybe-0.006
Token.
Token ID13
Feature activation-1.48e-03

Pos logit contributions
 Skip+0.015
 inspection+0.015
rob+0.015
 maybe+0.014
 Zoe+0.014
Neg logit contributions
attered-0.012
aming-0.011
ney-0.010
 mailbox-0.010
opened-0.010
Token But
Token ID887
Feature activation+7.30e-03

Pos logit contributions
attered+0.003
aming+0.003
ney+0.003
ifying+0.003
opened+0.003
Neg logit contributions
 maybe-0.002
 Skip-0.002
 Zoe-0.002
 inspection-0.002
rob-0.002
Token her
Token ID607
Feature activation-1.28e-02

Pos logit contributions
 maybe+0.008
 Skip+0.008
 Zoe+0.008
rob+0.008
 inspection+0.008
Neg logit contributions
attered-0.015
aming-0.014
ney-0.014
opened-0.014
ifying-0.013
Token mom
Token ID1995
Feature activation+3.24e-03

Pos logit contributions
attered+0.023
aming+0.021
opened+0.021
amed+0.020
ifying+0.020
Neg logit contributions
 Zoe-0.019
 maybe-0.019
 Skip-0.019
 Boo-0.018
 but-0.018
Token was
Token ID373
Feature activation+1.06e-02

Pos logit contributions
 maybe+0.005
 Skip+0.005
 Zoe+0.005
 but+0.005
 he+0.004
Neg logit contributions
attered-0.006
ifying-0.005
ney-0.005
aming-0.005
opened-0.005
Token very
Token ID845
Feature activation+8.11e-03

Pos logit contributions
 maybe+0.018
 Skip+0.018
 Zoe+0.017
rob+0.017
ock+0.016
Neg logit contributions
attered-0.017
ney-0.015
aming-0.015
ifying-0.015
tr-0.015
Token noticed
Token ID6810
Feature activation+8.95e-04

Pos logit contributions
 maybe+0.011
 Skip+0.011
 Zoe+0.010
rob+0.010
 he+0.010
Neg logit contributions
attered-0.010
aming-0.009
ney-0.008
opened-0.008
ifying-0.008
Token that
Token ID326
Feature activation-5.58e-03

Pos logit contributions
rob+0.001
 Skip+0.001
ock+0.001
 inspection+0.001
 villain+0.001
Neg logit contributions
attered-0.001
 mailbox-0.001
aming-0.001
 cracked-0.001
 painted-0.001
Token the
Token ID262
Feature activation-6.15e-03

Pos logit contributions
attered+0.011
 mailbox+0.010
ney+0.010
aming+0.010
+0.010
Neg logit contributions
rob-0.007
 Skip-0.006
ock-0.006
 inspection-0.006
aro-0.006
Token ground
Token ID2323
Feature activation-9.61e-03

Pos logit contributions
attered+0.011
aming+0.010
opened+0.009
amed+0.009
ney+0.009
Neg logit contributions
 maybe-0.009
 Skip-0.009
 Zoe-0.009
 but-0.009
 villain-0.009
Token was
Token ID373
Feature activation-3.54e-03

Pos logit contributions
attered+0.017
aming+0.016
ige+0.016
opened+0.015
ney+0.015
Neg logit contributions
 maybe-0.014
 he-0.013
 but-0.013
 Skip-0.013
 Zoe-0.013
Token getting
Token ID1972
Feature activation-2.90e-02

Pos logit contributions
attered+0.006
ige+0.006
aut+0.006
ney+0.006
ra+0.005
Neg logit contributions
 but-0.006
 he-0.006
 maybe-0.005
 them-0.005
 though-0.005
Token very
Token ID845
Feature activation-9.85e-03

Pos logit contributions
attered+0.059
ney+0.058
ige+0.058
lette+0.058
ra+0.057
Neg logit contributions
 he-0.037
 them-0.036
 though-0.036
 but-0.036
 maybe-0.035
Token wet
Token ID9583
Feature activation-1.90e-02

Pos logit contributions
tr+0.018
lette+0.018
attered+0.018
ige+0.017
ney+0.017
Neg logit contributions
 he-0.015
 but-0.014
 maybe-0.013
 though-0.013
 inspection-0.012
Token.
Token ID13
Feature activation+3.36e-03

Pos logit contributions
attered+0.036
ney+0.034
opened+0.033
ige+0.033
aming+0.033
Neg logit contributions
 but-0.025
 maybe-0.025
 he-0.024
ock-0.024
rob-0.023
Token Jack
Token ID3619
Feature activation-4.90e-03

Pos logit contributions
rob+0.005
ock+0.005
aro+0.004
 inspection+0.004
 Skip+0.004
Neg logit contributions
attered-0.006
ney-0.006
-0.005
opened-0.005
ifying-0.005
Token realized
Token ID6939
Feature activation+4.38e-03

Pos logit contributions
attered+0.009
ifying+0.008
opened+0.008
olk+0.008
tr+0.008
Neg logit contributions
 he-0.007
 Skip-0.007
 but-0.007
 maybe-0.007
 Zoe-0.007
Tokency
Token ID948
Feature activation-1.08e-02

Pos logit contributions
 Zoe+0.011
rob+0.011
 inspection+0.011
 Skip+0.011
 Boo+0.010
Neg logit contributions
attered-0.007
aming-0.006
ifying-0.006
tr-0.006
amed-0.006
Token was
Token ID373
Feature activation+6.66e-03

Pos logit contributions
ifying+0.021
attered+0.020
olk+0.019
 intact+0.018
ney+0.018
Neg logit contributions
 but-0.015
 maybe-0.014
 he-0.014
 Zoe-0.014
 Boo-0.012
Token so
Token ID523
Feature activation+8.74e-05

Pos logit contributions
 maybe+0.009
 Zoe+0.009
 but+0.008
 Skip+0.008
 he+0.008
Neg logit contributions
attered-0.013
ney-0.012
ifying-0.012
tr-0.012
opened-0.012
Token proud
Token ID6613
Feature activation-1.08e-02

Pos logit contributions
 maybe+0.000
 Zoe+0.000
 he+0.000
 but+0.000
 Skip+0.000
Neg logit contributions
attered-0.000
tr-0.000
ney-0.000
ifying-0.000
 mailbox-0.000
Token that
Token ID326
Feature activation-2.06e-02

Pos logit contributions
attered+0.023
aming+0.021
ney+0.020
opened+0.020
 mailbox+0.020
Neg logit contributions
 Skip-0.013
rob-0.013
 maybe-0.012
 inspection-0.012
 villain-0.012
Token her
Token ID607
Feature activation-3.76e-02

Pos logit contributions
attered+0.054
aming+0.050
ney+0.050
opened+0.049
ifying+0.049
Neg logit contributions
 maybe-0.015
 Skip-0.014
 Zoe-0.013
rob-0.012
 he-0.012
Token family
Token ID1641
Feature activation-1.36e-02

Pos logit contributions
attered+0.068
ifying+0.063
aming+0.062
opened+0.060
amed+0.058
Neg logit contributions
 maybe-0.058
 he-0.054
 but-0.053
 Zoe-0.053
 Boo-0.053
Token was
Token ID373
Feature activation+4.32e-04

Pos logit contributions
ifying+0.025
attered+0.025
tr+0.023
ney+0.022
aming+0.022
Neg logit contributions
 maybe-0.020
 he-0.020
 but-0.019
 Zoe-0.019
ock-0.017
Token so
Token ID523
Feature activation-1.14e-03

Pos logit contributions
 maybe+0.001
 but+0.001
 Zoe+0.001
 he+0.001
 them+0.000
Neg logit contributions
attered-0.001
ney-0.001
tr-0.001
ifying-0.001
aut-0.001
Token tidy
Token ID43044
Feature activation+3.01e-03

Pos logit contributions
attered+0.002
tr+0.002
ney+0.002
ifying+0.002
 mailbox+0.002
Neg logit contributions
 maybe-0.002
 Zoe-0.002
 he-0.001
 but-0.001
aro-0.001
Token.
Token ID13
Feature activation-3.71e-03

Pos logit contributions
 maybe+0.004
 Skip+0.004
 Zoe+0.004
 but+0.004
 he+0.004
Neg logit contributions
attered-0.005
ney-0.005
aming-0.005
opened-0.005
ifying-0.005
Token had
Token ID550
Feature activation-1.36e-02

Pos logit contributions
attered+0.008
aming+0.007
ney+0.007
opened+0.007
ige+0.007
Neg logit contributions
 maybe-0.007
 he-0.007
ock-0.007
 but-0.007
 Skip-0.007
Token delayed
Token ID11038
Feature activation-9.95e-03

Pos logit contributions
attered+0.021
ney+0.019
opened+0.019
aming+0.019
aut+0.018
Neg logit contributions
ock-0.022
 he-0.022
 maybe-0.021
 but-0.021
 inspection-0.021
Token its
Token ID663
Feature activation-2.81e-03

Pos logit contributions
attered+0.021
ney+0.019
aming+0.019
opened+0.018
amed+0.018
Neg logit contributions
 maybe-0.012
 Skip-0.012
 but-0.011
 Zoe-0.011
ock-0.011
Token smoke
Token ID7523
Feature activation-1.07e-02

Pos logit contributions
attered+0.005
aming+0.004
amed+0.004
opened+0.004
ifying+0.004
Neg logit contributions
 maybe-0.004
 but-0.004
 Zoe-0.004
 Skip-0.004
 inspection-0.004
Token from
Token ID422
Feature activation-1.57e-02

Pos logit contributions
attered+0.020
aming+0.018
opened+0.018
ige+0.018
amed+0.018
Neg logit contributions
 but-0.015
 maybe-0.015
 he-0.014
ock-0.014
 Skip-0.014
Token coming
Token ID2406
Feature activation-2.80e-02

Pos logit contributions
attered+0.031
aming+0.029
ney+0.028
opened+0.028
ifying+0.028
Neg logit contributions
 Skip-0.019
 maybe-0.019
rob-0.019
 Zoe-0.018
 inspection-0.018
Token out
Token ID503
Feature activation-3.17e-03

Pos logit contributions
attered+0.051
ige+0.049
aming+0.047
lette+0.046
amed+0.046
Neg logit contributions
 but-0.040
 he-0.039
 maybe-0.038
ock-0.037
 Zig-0.036
Token of
Token ID286
Feature activation-1.59e-02

Pos logit contributions
attered+0.007
aming+0.006
ney+0.006
opened+0.006
tr+0.006
Neg logit contributions
 maybe-0.004
 Skip-0.004
 Zoe-0.004
 inspection-0.004
ock-0.004
Token it
Token ID340
Feature activation-1.93e-03

Pos logit contributions
attered+0.034
aming+0.031
ney+0.031
opened+0.030
ifying+0.030
Neg logit contributions
 Skip-0.018
 maybe-0.018
rob-0.017
 Zoe-0.017
 inspection-0.017
Token.
Token ID13
Feature activation-2.94e-03

Pos logit contributions
attered+0.004
ige+0.004
aming+0.004
amed+0.004
lette+0.004
Neg logit contributions
 but-0.003
 maybe-0.002
 he-0.002
 Skip-0.002
 inspection-0.002
Token He
Token ID679
Feature activation+1.25e-04

Pos logit contributions
attered+0.006
ney+0.005
opened+0.005
aming+0.005
aut+0.005
Neg logit contributions
rob-0.004
aro-0.004
ock-0.004
 Skip-0.004
 villain-0.004
Token saw
Token ID2497
Feature activation-6.01e-03

Pos logit contributions
 maybe+0.010
 Skip+0.010
 Zoe+0.010
 he+0.010
rob+0.009
Neg logit contributions
attered-0.011
ifying-0.010
aming-0.010
ney-0.010
opened-0.010
Token a
Token ID257
Feature activation+2.64e-03

Pos logit contributions
attered+0.011
aming+0.010
ney+0.010
 mailbox+0.010
amed+0.010
Neg logit contributions
 Skip-0.009
rob-0.008
 maybe-0.008
 inspection-0.008
 Zoe-0.008
Token stream
Token ID4269
Feature activation+6.93e-03

Pos logit contributions
 Skip+0.004
 maybe+0.004
 Zoe+0.004
 but+0.004
 villain+0.004
Neg logit contributions
attered-0.005
aming-0.004
amed-0.004
ney-0.004
opened-0.004
Token of
Token ID286
Feature activation-7.65e-03

Pos logit contributions
 but+0.009
 he+0.009
ock+0.009
 maybe+0.009
 Skip+0.009
Neg logit contributions
attered-0.013
ra-0.011
ige-0.011
opened-0.011
ifying-0.011
Token light
Token ID1657
Feature activation+4.68e-04

Pos logit contributions
attered+0.013
aming+0.011
ney+0.011
opened+0.011
ifying+0.011
Neg logit contributions
 Skip-0.012
 maybe-0.012
rob-0.012
 Zoe-0.012
 inspection-0.012
Token coming
Token ID2406
Feature activation-2.96e-02

Pos logit contributions
 but+0.001
 Zoe+0.001
ock+0.001
 he+0.001
 maybe+0.001
Neg logit contributions
attered-0.001
ige-0.001
opened-0.001
amed-0.001
aming-0.001
Token from
Token ID422
Feature activation-1.64e-02

Pos logit contributions
attered+0.063
ige+0.060
ney+0.058
lette+0.057
ocado+0.056
Neg logit contributions
 but-0.038
 them-0.037
 Zoe-0.034
 Zig-0.034
 maybe-0.033
Token the
Token ID262
Feature activation-7.62e-03

Pos logit contributions
attered+0.034
ney+0.030
aming+0.030
ige+0.030
opened+0.030
Neg logit contributions
 maybe-0.020
 Skip-0.020
 Zoe-0.019
 inspection-0.019
rob-0.019
Token sky
Token ID6766
Feature activation-1.44e-02

Pos logit contributions
attered+0.014
lette+0.013
aming+0.013
amed+0.013
opened+0.013
Neg logit contributions
 but-0.010
 Zoe-0.010
 Skip-0.010
 he-0.010
 maybe-0.010
Token.
Token ID13
Feature activation-4.43e-03

Pos logit contributions
attered+0.029
amed+0.027
ige+0.027
aming+0.026
opened+0.026
Neg logit contributions
 but-0.018
 he-0.017
 maybe-0.017
 Zoe-0.017
ock-0.016
Token It
Token ID632
Feature activation+5.19e-03

Pos logit contributions
attered+0.008
aming+0.008
ney+0.008
opened+0.007
ifying+0.007
Neg logit contributions
 Skip-0.006
 maybe-0.006
rob-0.006
 Zoe-0.006
 inspection-0.006
Token.
Token ID13
Feature activation-2.55e-03

Pos logit contributions
attered+0.013
aming+0.012
ney+0.012
opened+0.011
ifying+0.011
Neg logit contributions
 Skip-0.014
 maybe-0.013
rob-0.013
 Zoe-0.013
 inspection-0.013
Token Lily
Token ID20037
Feature activation-1.02e-02

Pos logit contributions
attered+0.006
aming+0.005
ney+0.005
opened+0.005
ifying+0.005
Neg logit contributions
 Skip-0.003
 maybe-0.002
rob-0.002
 Zoe-0.002
 inspection-0.002
Token was
Token ID373
Feature activation+3.89e-03

Pos logit contributions
attered+0.021
ifying+0.020
olk+0.018
tr+0.018
opened+0.018
Neg logit contributions
 maybe-0.013
 he-0.013
 but-0.012
 Skip-0.012
 Zoe-0.012
Token proud
Token ID6613
Feature activation-6.03e-03

Pos logit contributions
 maybe+0.005
 Zoe+0.005
 Skip+0.005
 he+0.004
 but+0.004
Neg logit contributions
attered-0.008
ifying-0.007
ney-0.007
aut-0.007
tr-0.007
Token of
Token ID286
Feature activation+7.63e-04

Pos logit contributions
attered+0.011
aming+0.010
ney+0.010
 painted+0.010
+0.010
Neg logit contributions
rob-0.009
 Skip-0.008
 Zig-0.008
 inspection-0.008
aro-0.008
Token her
Token ID607
Feature activation-2.93e-02

Pos logit contributions
rob+0.001
 inspection+0.001
 villain+0.001
ock+0.001
 Skip+0.001
Neg logit contributions
aming-0.001
 mailbox-0.001
-0.001
attered-0.001
um-0.001
Token mom
Token ID1995
Feature activation+2.05e-03

Pos logit contributions
attered+0.059
ney+0.052
ifying+0.052
aming+0.052
amed+0.051
Neg logit contributions
 maybe-0.042
 he-0.039
 Skip-0.039
 but-0.039
 villain-0.038
Tokenmy
Token ID1820
Feature activation-1.06e-02

Pos logit contributions
 maybe+0.003
 Skip+0.003
rob+0.003
 Zoe+0.003
ock+0.003
Neg logit contributions
attered-0.004
aming-0.004
ney-0.004
opened-0.004
ifying-0.003
Token and
Token ID290
Feature activation-1.67e-02

Pos logit contributions
attered+0.024
aming+0.021
ney+0.021
opened+0.021
amed+0.021
Neg logit contributions
 maybe-0.012
 but-0.012
 Skip-0.011
 he-0.011
 Zoe-0.011
Token daddy
Token ID35822
Feature activation-5.63e-03

Pos logit contributions
attered+0.030
ney+0.027
ifying+0.027
aming+0.026
tr+0.026
Neg logit contributions
 maybe-0.027
 he-0.025
 Skip-0.025
 Zoe-0.024
 but-0.023
Token.
Token ID13
Feature activation-9.76e-03

Pos logit contributions
attered+0.011
ney+0.010
aming+0.010
opened+0.010
tr+0.010
Neg logit contributions
 maybe-0.007
 Skip-0.007
 Zoe-0.007
 but-0.007
 he-0.007