TOP ACTIVATIONS
MAX = 4.295

not tried to get the cream . She wished her mom
! This is not ice cream !" Lily cried .
They lifted him onto a stret cher and put him in
want to buy an ice cream , then let 's give
with a man on a stret cher . The man 's
won 't get any ice cream later ." L
the elderly man on a stret cher . She thanked Tim
baby should get a check - up . But the baby
need to get a check - up to make sure you
won 't get any ice cream ." Tim my still refused
she needed a little pick - me - up , she
a little pick - me - up , she would go
castle was destroyed and no - one was able to help
It said : " Happy Birthday , Mom ! Do not
young to get an X - ray , but she said
she wanted to buy ice cream for all her friends .
We can 't get ice cream ." Just
knew he couldn 't un - m elt it .
's go and get ice cream , sweet ie ."
had not touched the or nam ents . They hoped Mom

MIN ACTIVATIONS
MIN = -5.045

used Clim ber the Lad der . It helped him to
thanked Clim ber the Lad der for its help . And
the tree to its once former beauty . He was happy
his mother even p rou der . <|endoftext|> John and his
This made her even glad der and she smiled with joy
car made him even sad der . He asked everyone if
the village back to its former glory !"
to be restored to its former glory ."
it made her even sad der . Suddenly ,
Then something even we ir der happened . The animal suddenly
back the village to its former state . With the help
Jen feels even sad der . She thinks Tim is
girl felt even p rou der . She was happy that
!" Benny felt even sad der and started to cry .
little boy felt even sad der . He started to cry
, leaving it to its former state .
." Lily felt even sad der than before .
. Red felt even sad der . But then
. Tim felt even sad der now , and he cried
. Lily felt even sad der than before . From that

INTERVAL 1.960 - 4.295
CONTAINS 0.385%

t hurt you . ” The rabbit smiled
to take care of your hair . And you are very
" Hello , kids !" the bird said .
Me too ! Let ’ s pick some flowers
said you are very good twins and you deserve a friend

INTERVAL -0.375 - 1.960
CONTAINS 72.076%

pron was impressive because it had a picture of a rainbow
harvest of corn . He was so excited that his plan
cat sitting under a tree . The cat looked sad and
told Lily they needed to go back inside to get it
, T illy wanted to make something special . She went

INTERVAL -2.710 - -0.375
CONTAINS 27.504%

. The flood kept getting bigger and bigger .
very proud of himself . <|endoftext|> Once upon a time ,
named Lily . She loved to draw and color with her
, and told the others to form a line and to
want to save it . Please , don 't cut it

INTERVAL -5.045 - -2.710
CONTAINS 0.035%

it . Find your own beetle ." M ia
with a ball . Spot w agged his long tail very
. She asked Tim " Do you want to play a
and he asked his mom if he could keep it .
, she realized that    the ground was alive !
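
The four INTERVAL edges listed above are consistent with splitting the range from MIN = -5.045 to MAX = 4.295 into four equal-width bins of width 2.335. The sketch below reproduces those edges and shows one way a CONTAINS percentage per bin could be computed. It assumes equal-width binning, which the dump itself does not state, and the helper names `equal_width_bins` and `bin_shares` are hypothetical, not part of any dashboard tool.

```python
def equal_width_bins(lo, hi, n_bins):
    """Split [lo, hi] into n_bins intervals of equal width (assumed scheme)."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    edges[-1] = hi  # pin the last edge so it matches hi exactly
    return list(zip(edges[:-1], edges[1:]))


def bin_shares(values, lo, hi, n_bins):
    """Return the percentage of values that falls into each equal-width bin.

    Each bin is half-open [left, right), except the last, which also
    includes hi. Values outside [lo, hi] are not expected here.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [100.0 * c / len(values) for c in counts]


# MIN and MAX as reported in the dump above.
bins = equal_width_bins(-5.045, 4.295, 4)
for left, right in reversed(bins):
    print(f"INTERVAL {left:.3f} - {right:.3f}")

# Hypothetical activations, only to show how bin_shares is called.
example_values = [-5.0, -1.0, 0.0, 1.0, 4.0]
print(bin_shares(example_values, -5.045, 4.295, 4))
```

The reported shares (0.385%, 72.076%, 27.504%, 0.035%) sum to 100%, which fits the four bins covering the whole observed activation range.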
Token not
Token ID407
Feature activation+1.38e+00

Pos logit contributions
rolley+0.046
aughter+0.041
he+0.041
 down+0.040
cream+0.039
Neg logit contributions
 villains-0.039
emy-0.039
ib-0.037
 hops-0.037
 weaknesses-0.037
Token tried
Token ID3088
Feature activation+7.42e-02

Pos logit contributions
 Mom+0.034
rolley+0.032
 down+0.032
cream+0.031
 Dad+0.030
Neg logit contributions
ib-0.051
ub-0.051
emy-0.051
 hops-0.050
 villains-0.049
Token to
Token ID284
Feature activation+6.80e-02

Pos logit contributions
 Mom+0.002
rolley+0.002
cream+0.002
 down+0.002
owner+0.002
Neg logit contributions
ib-0.003
emy-0.003
ub-0.003
 hops-0.002
 villains-0.002
Token get
Token ID651
Feature activation+8.93e-01

Pos logit contributions
rolley+0.002
 down+0.002
cream+0.002
nam+0.002
owner+0.002
Neg logit contributions
emy-0.002
ib-0.002
 villains-0.002
 hops-0.002
ub-0.002
Token the
Token ID262
Feature activation+1.07e+00

Pos logit contributions
rolley+0.027
person+0.025
he+0.025
belt+0.024
appointed+0.024
Neg logit contributions
 hops-0.027
 nightmares-0.026
 conflicts-0.026
 appearance-0.026
 spiders-0.025
Token cream
Token ID8566
Feature activation+4.29e+00

Pos logit contributions
 Mom+0.029
rolley+0.029
cream+0.028
 down+0.028
owner+0.027
Neg logit contributions
ib-0.036
ub-0.035
emy-0.035
 hops-0.035
 villains-0.034
Token.
Token ID13
Feature activation-5.33e-01

Pos logit contributions
nam+0.105
appointed+0.104
amed+0.103
rolley+0.103
cream+0.100
Neg logit contributions
ub-0.145
ib-0.142
 appearance-0.140
 hops-0.139
 villains-0.135
Token She
Token ID1375
Feature activation+4.24e-01

Pos logit contributions
ib+0.015
emy+0.015
 hops+0.015
ub+0.015
 appearance+0.015
Neg logit contributions
rolley-0.017
cream-0.017
owner-0.016
nam-0.016
 down-0.016
Token wished
Token ID16555
Feature activation-7.69e-01

Pos logit contributions
rolley+0.015
 Mom+0.014
 down+0.014
owner+0.014
cream+0.014
Neg logit contributions
emy-0.011
 villains-0.011
 hops-0.011
ib-0.011
ub-0.010
Token her
Token ID607
Feature activation+2.25e+00

Pos logit contributions
ib+0.027
ub+0.027
emy+0.027
 hops+0.027
 villains+0.026
Neg logit contributions
 Mom-0.020
 down-0.019
rolley-0.019
 Dad-0.019
cream-0.018
Token mom
Token ID1995
Feature activation+1.44e+00

Pos logit contributions
appointed+0.041
rolley+0.040
cream+0.039
''+0.038
amed+0.038
Neg logit contributions
 villains-0.093
 nightmares-0.092
 appearance-0.092
ib-0.092
emy-0.092
Token!
Token ID0
Feature activation-1.41e+00

Pos logit contributions
nam+0.018
rolley+0.018
owner+0.017
cream+0.017
 Mom+0.017
Neg logit contributions
ub-0.021
emy-0.021
 hops-0.021
ib-0.020
 tweet-0.019
Token This
Token ID770
Feature activation-1.11e+00

Pos logit contributions
emy+0.052
 hops+0.052
 villains+0.052
ub+0.052
ib+0.051
Neg logit contributions
 Mom-0.036
 Dad-0.033
cream-0.032
 down-0.032
rolley-0.032
Token is
Token ID318
Feature activation+2.67e-01

Pos logit contributions
ib+0.029
ub+0.029
 hops+0.028
pill+0.028
emy+0.027
Neg logit contributions
nam-0.039
 Mom-0.038
rolley-0.038
owner-0.037
 forehead-0.037
Token not
Token ID407
Feature activation+1.64e+00

Pos logit contributions
rolley+0.008
owner+0.007
person+0.007
 down+0.007
 Mom+0.007
Neg logit contributions
emy-0.009
ib-0.008
 hops-0.008
ub-0.008
pill-0.008
Token ice
Token ID4771
Feature activation+1.20e+00

Pos logit contributions
rolley+0.045
owner+0.043
person+0.042
 Mom+0.042
 clerk+0.041
Neg logit contributions
ib-0.055
emy-0.054
ub-0.053
 hops-0.052
 villains-0.052
Token cream
Token ID8566
Feature activation+4.26e+00

Pos logit contributions
nam+0.022
rolley+0.022
 Mom+0.022
owner+0.022
 forehead+0.021
Neg logit contributions
ub-0.051
ib-0.051
emy-0.049
 hops-0.048
 appearance-0.047
Token!"
Token ID2474
Feature activation-1.79e-01

Pos logit contributions
nam+0.096
rolley+0.096
 Mom+0.096
owner+0.091
cream+0.088
Neg logit contributions
ub-0.161
ib-0.157
 hops-0.154
emy-0.153
pill-0.151
Token Lily
Token ID20037
Feature activation-3.97e-01

Pos logit contributions
 villains+0.005
 hops+0.005
emy+0.005
ib+0.005
ub+0.005
Neg logit contributions
rolley-0.006
amed-0.006
cream-0.005
 forehead-0.005
bled-0.005
Token cried
Token ID16896
Feature activation-8.32e-01

Pos logit contributions
emy+0.016
ib+0.015
ub+0.015
 hops+0.015
pill+0.015
Neg logit contributions
rolley-0.009
belt-0.008
nam-0.008
cream-0.008
owner-0.007
Token.
Token ID13
Feature activation-1.37e+00

Pos logit contributions
ib+0.038
ub+0.038
emy+0.037
 hops+0.037
 villains+0.037
Neg logit contributions
 Mom-0.013
rolley-0.012
 down-0.012
cream-0.012
 Dad-0.011
Token
Token ID198
Feature activation-5.40e-01

Pos logit contributions
ib+0.054
emy+0.053
ub+0.053
 hops+0.052
 villains+0.052
Neg logit contributions
rolley-0.030
cream-0.028
 Mom-0.028
nam-0.027
 down-0.027
Token They
Token ID1119
Feature activation+3.78e-01

Pos logit contributions
ib+0.016
ub+0.016
emy+0.016
 hops+0.016
 villains+0.016
Neg logit contributions
 Mom-0.009
 Dad-0.008
 down-0.008
rolley-0.008
 Mum-0.008
Token lifted
Token ID13663
Feature activation+8.09e-01

Pos logit contributions
rolley+0.008
 down+0.008
 Mom+0.008
owner+0.007
aughter+0.007
Neg logit contributions
ib-0.014
 villains-0.014
 hops-0.014
emy-0.014
ub-0.014
Token him
Token ID683
Feature activation-1.60e-01

Pos logit contributions
rolley+0.018
cream+0.017
nam+0.017
owner+0.017
 Mom+0.016
Neg logit contributions
ub-0.031
ib-0.031
 hops-0.031
emy-0.031
 villains-0.030
Token onto
Token ID4291
Feature activation+5.75e-01

Pos logit contributions
emy+0.009
ib+0.009
pill+0.009
 villains+0.009
ub+0.009
Neg logit contributions
 down-0.001
 Mom-0.001
cream-0.001
rolley-0.001
 up-0.001
Token a
Token ID257
Feature activation-3.18e-01

Pos logit contributions
rolley+0.009
cream+0.009
 Mom+0.009
 down+0.008
owner+0.008
Neg logit contributions
ib-0.025
ub-0.025
emy-0.025
 hops-0.025
 villains-0.025
Token stret
Token ID5759
Feature activation+4.18e+00

Pos logit contributions
ub+0.012
ib+0.012
emy+0.012
 hops+0.012
 villains+0.012
Neg logit contributions
 Mom-0.007
rolley-0.007
cream-0.007
 down-0.007
 Dad-0.007
Tokencher
Token ID2044
Feature activation+1.92e+00

Pos logit contributions
 Mom+0.131
 Mum+0.124
rolley+0.122
 Dad+0.121
appointed+0.119
Neg logit contributions
ub-0.126
ib-0.120
 villains-0.120
 tweet-0.117
 hops-0.113
Token and
Token ID290
Feature activation-2.78e-01

Pos logit contributions
 Mom+0.040
 down+0.040
rolley+0.037
 forehead+0.035
owner+0.035
Neg logit contributions
emy-0.081
pill-0.078
ib-0.078
 weaknesses-0.077
 hops-0.077
Token put
Token ID1234
Feature activation+1.22e+00

Pos logit contributions
 villains+0.010
 hops+0.010
emy+0.010
ib+0.010
 tweet+0.009
Neg logit contributions
rolley-0.007
aughter-0.006
owner-0.006
appointed-0.006
he-0.006
Token him
Token ID683
Feature activation-9.84e-02

Pos logit contributions
rolley+0.034
appointed+0.034
owner+0.032
uct+0.031
person+0.031
Neg logit contributions
 hops-0.039
ib-0.039
ub-0.038
emy-0.037
 villains-0.037
Token in
Token ID287
Feature activation-1.48e-01

Pos logit contributions
ib+0.004
ub+0.004
 hops+0.004
emy+0.004
 villains+0.004
Neg logit contributions
 Mom-0.002
rolley-0.002
owner-0.002
 Dad-0.002
 Mum-0.002
Token want
Token ID765
Feature activation-4.96e-01

Pos logit contributions
pill+0.020
emy+0.020
ib+0.020
 villains+0.019
 hops+0.019
Neg logit contributions
 Mom-0.015
 down-0.015
 Dad-0.014
cream-0.014
rolley-0.014
Token to
Token ID284
Feature activation-3.37e-01

Pos logit contributions
ib+0.019
ub+0.019
emy+0.019
 hops+0.018
 villains+0.018
Neg logit contributions
 Mom-0.012
rolley-0.011
 down-0.011
cream-0.011
 Dad-0.011
Token buy
Token ID2822
Feature activation+1.03e+00

Pos logit contributions
ib+0.011
ub+0.011
emy+0.011
 hops+0.011
 villains+0.011
Neg logit contributions
 Mom-0.009
rolley-0.009
 down-0.009
cream-0.008
nam-0.008
Token an
Token ID281
Feature activation+2.10e-01

Pos logit contributions
rolley+0.028
appointed+0.027
 forehead+0.027
owner+0.027
belt+0.026
Neg logit contributions
ib-0.034
 appearance-0.032
 hops-0.032
ub-0.032
emy-0.031
Token ice
Token ID4771
Feature activation+8.44e-01

Pos logit contributions
nam+0.005
 down+0.005
 Mom+0.005
owner+0.005
rolley+0.005
Neg logit contributions
ib-0.007
ub-0.007
emy-0.007
 appearance-0.007
 hops-0.007
Token cream
Token ID8566
Feature activation+4.10e+00

Pos logit contributions
appointed+0.016
amed+0.015
nam+0.015
 doorway+0.014
 forehead+0.014
Neg logit contributions
ub-0.037
ib-0.037
 hops-0.032
emy-0.032
 appearance-0.031
Token,
Token ID11
Feature activation+8.53e-01

Pos logit contributions
appointed+0.114
amed+0.112
nam+0.109
 forehead+0.105
rolley+0.104
Neg logit contributions
ub-0.137
ib-0.130
 appearance-0.119
 hops-0.116
 rect-0.114
Token then
Token ID788
Feature activation+1.08e+00

Pos logit contributions
 Mom+0.022
 Dad+0.020
 down+0.020
 Mum+0.020
cream+0.019
Neg logit contributions
ib-0.032
ub-0.032
emy-0.031
 hops-0.031
pill-0.031
Token let
Token ID1309
Feature activation-5.89e-01

Pos logit contributions
 Mom+0.030
 down+0.028
rolley+0.028
cream+0.027
 Dad+0.027
Neg logit contributions
ib-0.037
ub-0.037
emy-0.036
 hops-0.036
 villains-0.035
Token's
Token ID338
Feature activation+6.22e-01

Pos logit contributions
emy+0.016
 hops+0.016
 tweet+0.015
 appearance+0.015
ib+0.015
Neg logit contributions
nam-0.020
rolley-0.019
belt-0.018
owner-0.018
 forehead-0.017
Token give
Token ID1577
Feature activation+2.87e-01

Pos logit contributions
rolley+0.019
owner+0.018
 Mom+0.018
cream+0.018
 down+0.017
Neg logit contributions
ub-0.018
ib-0.018
 hops-0.018
emy-0.018
 villains-0.017
Token with
Token ID351
Feature activation+1.62e-01

Pos logit contributions
emy+0.005
ib+0.005
ub+0.004
pill+0.004
 villains+0.004
Neg logit contributions
 Mom-0.003
 down-0.003
 Dad-0.003
rolley-0.003
 Mum-0.003
Token a
Token ID257
Feature activation-4.22e-01

Pos logit contributions
rolley+0.004
cream+0.004
appointed+0.004
owner+0.004
person+0.004
Neg logit contributions
 hops-0.005
ib-0.005
 appearances-0.005
 villains-0.005
 tweet-0.005
Token man
Token ID582
Feature activation+9.78e-01

Pos logit contributions
ub+0.016
 hops+0.016
emy+0.015
ib+0.015
 villains+0.015
Neg logit contributions
 Mom-0.009
cream-0.009
rolley-0.009
 down-0.009
nam-0.009
Token on
Token ID319
Feature activation-8.31e-02

Pos logit contributions
 Mom+0.021
 down+0.020
rolley+0.019
cream+0.019
 Dad+0.019
Neg logit contributions
emy-0.040
ib-0.040
ub-0.039
 villains-0.039
 weaknesses-0.039
Token a
Token ID257
Feature activation-7.50e-01

Pos logit contributions
emy+0.003
ib+0.003
 villains+0.003
 hops+0.003
ub+0.003
Neg logit contributions
rolley-0.002
cream-0.001
owner-0.001
nam-0.001
 down-0.001
Token stret
Token ID5759
Feature activation+4.07e+00

Pos logit contributions
 hops+0.027
emy+0.027
ub+0.027
 villains+0.027
ib+0.027
Neg logit contributions
 Mom-0.017
rolley-0.017
cream-0.017
 down-0.016
nam-0.016
Tokencher
Token ID2044
Feature activation+1.71e+00

Pos logit contributions
 Mom+0.135
 Mum+0.130
rolley+0.128
 Dad+0.125
appointed+0.123
Neg logit contributions
ub-0.112
 villains-0.107
 tweet-0.106
ib-0.105
 hops-0.096
Token.
Token ID13
Feature activation-5.59e-02

Pos logit contributions
 Mom+0.032
rolley+0.030
cream+0.029
nam+0.029
 down+0.028
Neg logit contributions
ib-0.072
ub-0.072
emy-0.070
 hops-0.070
 villains-0.070
Token The
Token ID383
Feature activation-2.76e-01

Pos logit contributions
pill+0.001
ub+0.001
ib+0.001
emy+0.001
 hops+0.001
Neg logit contributions
nam-0.002
rolley-0.002
amed-0.002
he-0.002
bled-0.002
Token man
Token ID582
Feature activation+5.10e-01

Pos logit contributions
 villains+0.006
 appearance+0.005
emy+0.005
 hunters+0.005
Like+0.005
Neg logit contributions
nam-0.011
cream-0.010
uct-0.010
chard-0.009
aughter-0.009
Token's
Token ID338
Feature activation-6.48e-01

Pos logit contributions
 Mom+0.009
rolley+0.008
 down+0.008
cream+0.008
 Dad+0.008
Neg logit contributions
ib-0.022
ub-0.022
emy-0.022
 hops-0.022
 villains-0.021
Token won
Token ID1839
Feature activation+3.51e-02

Pos logit contributions
 Mom+0.002
 down+0.002
rolley+0.002
cream+0.002
 Dad+0.002
Neg logit contributions
emy-0.001
ib-0.001
ub-0.001
 hops-0.001
pill-0.001
Token't
Token ID470
Feature activation-1.75e-01

Pos logit contributions
rolley+0.001
owner+0.001
cream+0.001
nam+0.001
aughter+0.001
Neg logit contributions
ub-0.001
 villains-0.001
ib-0.001
emy-0.001
 hops-0.001
Token get
Token ID651
Feature activation+7.31e-01

Pos logit contributions
pill+0.006
ib+0.006
emy+0.006
ub+0.006
 hops+0.006
Neg logit contributions
 Mom-0.005
 down-0.005
 Dad-0.004
cream-0.004
he-0.004
Token any
Token ID597
Feature activation+3.83e-01

Pos logit contributions
rolley+0.022
owner+0.020
aughter+0.020
person+0.020
 clerk+0.019
Neg logit contributions
 hops-0.023
ub-0.023
ib-0.023
 appearance-0.022
emy-0.022
Token ice
Token ID4771
Feature activation+9.86e-01

Pos logit contributions
 Mom+0.011
rolley+0.010
 down+0.010
cream+0.010
 Dad+0.010
Neg logit contributions
ib-0.013
ub-0.013
emy-0.013
 hops-0.013
 villains-0.012
Token cream
Token ID8566
Feature activation+4.07e+00

Pos logit contributions
appointed+0.017
nam+0.016
 Mom+0.016
owner+0.016
uct+0.016
Neg logit contributions
ib-0.044
ub-0.043
 hops-0.039
 villains-0.039
emy-0.039
Token later
Token ID1568
Feature activation+1.06e+00

Pos logit contributions
nam+0.103
amed+0.102
appointed+0.102
rolley+0.101
owner+0.097
Neg logit contributions
ub-0.138
ib-0.135
 appearance-0.129
 villains-0.127
 hops-0.126
Token."
Token ID526
Feature activation+4.23e-02

Pos logit contributions
rolley+0.024
cream+0.023
nam+0.023
owner+0.022
 Daughter+0.021
Neg logit contributions
ib-0.041
ub-0.041
 villains-0.040
 hops-0.039
 appearance-0.039
Token
Token ID198
Feature activation-4.62e-01

Pos logit contributions
rolley+0.001
cream+0.001
 Mom+0.001
 down+0.001
nam+0.001
Neg logit contributions
ib-0.002
ub-0.002
emy-0.002
 hops-0.002
pill-0.002
Token
Token ID198
Feature activation-5.38e-01

Pos logit contributions
emy+0.017
ib+0.017
ub+0.017
 hops+0.016
 villains+0.016
Neg logit contributions
cream-0.011
rolley-0.011
nam-0.011
owner-0.011
 down-0.011
TokenL
Token ID43
Feature activation+2.53e-01

Pos logit contributions
ib+0.019
ub+0.019
 hops+0.019
emy+0.019
pill+0.019
Neg logit contributions
 Mom-0.014
rolley-0.013
 down-0.013
nam-0.013
owner-0.012
Token the
Token ID262
Feature activation+2.53e-01

Pos logit contributions
rolley+0.021
appointed+0.021
 forehead+0.019
uct+0.019
 Daughter+0.018
Neg logit contributions
 hops-0.025
ib-0.025
ub-0.025
emy-0.025
 tweet-0.025
Token elderly
Token ID13830
Feature activation+1.83e+00

Pos logit contributions
 Mom+0.007
cream+0.006
appointed+0.006
uct+0.006
rolley+0.006
Neg logit contributions
 villains-0.008
 hops-0.008
ib-0.008
 appearances-0.008
 appearance-0.008
Token man
Token ID582
Feature activation+1.02e+00

Pos logit contributions
nam+0.043
 Mom+0.041
cream+0.041
uct+0.040
appointed+0.039
Neg logit contributions
 villains-0.068
 hops-0.065
ib-0.065
emy-0.063
ub-0.063
Token on
Token ID319
Feature activation+1.94e-01

Pos logit contributions
 Mom+0.021
rolley+0.020
 Dad+0.019
 Mum+0.018
owner+0.018
Neg logit contributions
ib-0.041
ub-0.041
 hops-0.041
emy-0.040
 villains-0.040
Token a
Token ID257
Feature activation-4.10e-01

Pos logit contributions
rolley+0.003
cream+0.003
nam+0.003
 Mom+0.003
owner+0.003
Neg logit contributions
ib-0.008
ub-0.008
emy-0.008
 villains-0.008
 hops-0.008
Token stret
Token ID5759
Feature activation+4.05e+00

Pos logit contributions
 villains+0.014
ub+0.014
 hops+0.014
emy+0.014
ib+0.014
Neg logit contributions
 Mom-0.010
cream-0.010
rolley-0.010
 Dad-0.010
 down-0.010
Tokencher
Token ID2044
Feature activation+2.12e+00

Pos logit contributions
 Mom+0.131
 Mum+0.123
rolley+0.122
 Dad+0.120
appointed+0.117
Neg logit contributions
ub-0.116
 villains-0.113
ib-0.112
 tweet-0.112
 hops-0.106
Token.
Token ID13
Feature activation-1.10e+00

Pos logit contributions
 Mom+0.036
rolley+0.033
 down+0.033
cream+0.032
 Dad+0.031
Neg logit contributions
ib-0.093
ub-0.093
emy-0.092
 hops-0.092
 villains-0.091
Token She
Token ID1375
Feature activation+4.30e-01

Pos logit contributions
ib+0.040
ub+0.040
emy+0.040
 hops+0.039
pill+0.039
Neg logit contributions
rolley-0.026
 Mom-0.026
nam-0.025
 down-0.025
cream-0.025
Token thanked
Token ID26280
Feature activation-2.63e-01

Pos logit contributions
 Mom+0.008
rolley+0.007
 down+0.007
cream+0.007
nam+0.007
Neg logit contributions
ib-0.019
ub-0.019
emy-0.018
 hops-0.018
 villains-0.018
Token Tim
Token ID5045
Feature activation-4.68e-02

Pos logit contributions
ib+0.010
ub+0.010
emy+0.010
 hops+0.010
 villains+0.010
Neg logit contributions
 Mom-0.006
rolley-0.006
 down-0.005
cream-0.005
nam-0.005
Token baby
Token ID5156
Feature activation+8.17e-01

Pos logit contributions
 villains+0.002
 appearances+0.002
emy+0.002
 appearance+0.002
 hops+0.002
Neg logit contributions
nam-0.002
cream-0.002
uct-0.002
amed-0.002
 down-0.002
Token should
Token ID815
Feature activation+3.27e-01

Pos logit contributions
 Mom+0.021
rolley+0.020
nam+0.020
 down+0.019
cream+0.019
Neg logit contributions
ib-0.028
ub-0.028
 hops-0.028
emy-0.028
 villains-0.028
Token get
Token ID651
Feature activation-1.05e-01

Pos logit contributions
 Mom+0.009
 down+0.009
rolley+0.009
cream+0.009
 Dad+0.009
Neg logit contributions
ib-0.011
ub-0.011
emy-0.011
 hops-0.010
pill-0.010
Token a
Token ID257
Feature activation+1.19e+00

Pos logit contributions
 hops+0.003
ib+0.003
ub+0.003
 appearance+0.003
emy+0.003
Neg logit contributions
rolley-0.003
owner-0.003
he-0.002
uct-0.002
person-0.002
Token check
Token ID2198
Feature activation+1.65e+00

Pos logit contributions
 Mom+0.031
rolley+0.031
cream+0.030
 down+0.030
nam+0.029
Neg logit contributions
ub-0.040
ib-0.039
emy-0.039
 hops-0.039
 villains-0.038
Token-
Token ID12
Feature activation+4.04e+00

Pos logit contributions
nam+0.042
appointed+0.041
rolley+0.041
 Mom+0.041
owner+0.041
Neg logit contributions
ib-0.054
ub-0.054
 hops-0.052
 tweet-0.050
pill-0.050
Tokenup
Token ID929
Feature activation+2.58e+00

Pos logit contributions
 Mom+0.117
 forehead+0.112
nam+0.111
amas+0.110
 Dad+0.110
Neg logit contributions
ub-0.117
pill-0.115
ib-0.115
 tweet-0.109
emy-0.108
Token.
Token ID13
Feature activation+3.30e-01

Pos logit contributions
 Mom+0.055
rolley+0.052
cream+0.050
 down+0.050
nam+0.050
Neg logit contributions
ib-0.101
ub-0.101
emy-0.099
 hops-0.099
 villains-0.098
Token But
Token ID887
Feature activation+6.61e-01

Pos logit contributions
rolley+0.007
 Mom+0.006
cream+0.006
owner+0.006
nam+0.006
Neg logit contributions
ib-0.013
ub-0.013
 hops-0.013
emy-0.013
pill-0.013
Token the
Token ID262
Feature activation+1.11e-01

Pos logit contributions
rolley+0.011
cream+0.010
 Mom+0.010
owner+0.010
 down+0.010
Neg logit contributions
ub-0.028
ib-0.028
 hops-0.028
emy-0.028
 villains-0.027
Token baby
Token ID5156
Feature activation+6.09e-01

Pos logit contributions
cream+0.003
rolley+0.003
nam+0.003
amed+0.003
uct+0.003
Neg logit contributions
 villains-0.003
 appearances-0.003
 hops-0.003
 appearance-0.003
emy-0.003
Token need
Token ID761
Feature activation-3.23e-01

Pos logit contributions
 Mom+0.003
 down+0.002
cream+0.002
 Dad+0.002
rolley+0.002
Neg logit contributions
emy-0.003
ib-0.003
pill-0.003
ub-0.003
 hops-0.003
Token to
Token ID284
Feature activation-7.71e-01

Pos logit contributions
ub+0.014
ib+0.013
emy+0.013
pill+0.013
 hops+0.013
Neg logit contributions
 Mom-0.007
 down-0.006
 Dad-0.006
cream-0.006
 Grand-0.006
Token get
Token ID651
Feature activation+1.82e-01

Pos logit contributions
pill+0.029
emy+0.028
ib+0.028
ub+0.027
 hops+0.027
Neg logit contributions
 Mom-0.021
 Dad-0.019
 down-0.019
 Mum-0.018
 forehead-0.018
Token a
Token ID257
Feature activation+1.22e+00

Pos logit contributions
rolley+0.005
 Mom+0.004
owner+0.004
aughter+0.004
he+0.004
Neg logit contributions
ub-0.006
ib-0.006
 hops-0.006
emy-0.006
 villains-0.006
Token check
Token ID2198
Feature activation+1.97e+00

Pos logit contributions
 Mom+0.031
 down+0.029
rolley+0.028
 Dad+0.028
 clerk+0.027
Neg logit contributions
ib-0.045
pill-0.045
emy-0.044
ub-0.044
 hops-0.043
Token-
Token ID12
Feature activation+4.02e+00

Pos logit contributions
 Mom+0.052
rolley+0.051
nam+0.050
owner+0.050
uct+0.049
Neg logit contributions
ib-0.065
ub-0.064
 hops-0.062
emy-0.060
 villains-0.060
Tokenup
Token ID929
Feature activation+2.70e+00

Pos logit contributions
 Mom+0.116
rolley+0.109
 Mum+0.107
 Dad+0.107
nam+0.106
Neg logit contributions
ib-0.125
ub-0.125
emy-0.123
pill-0.120
 hops-0.119
Token to
Token ID284
Feature activation+2.73e-01

Pos logit contributions
 Mom+0.069
 down+0.066
rolley+0.065
cream+0.063
 Dad+0.063
Neg logit contributions
ib-0.098
emy-0.097
ub-0.096
 hops-0.095
pill-0.094
Token make
Token ID787
Feature activation+4.26e-01

Pos logit contributions
rolley+0.007
 Mom+0.007
cream+0.007
owner+0.007
 down+0.007
Neg logit contributions
ib-0.009
emy-0.009
 villains-0.009
ub-0.009
 hops-0.009
Token sure
Token ID1654
Feature activation-1.47e-01

Pos logit contributions
rolley+0.010
 Mom+0.010
owner+0.009
nam+0.009
 down+0.009
Neg logit contributions
emy-0.016
ib-0.015
ub-0.015
 hops-0.015
 villains-0.015
Token you
Token ID345
Feature activation-1.64e-01

Pos logit contributions
 villains+0.004
 appearance+0.004
 hops+0.004
ib+0.004
ub+0.004
Neg logit contributions
rolley-0.005
cream-0.004
amed-0.004
owner-0.004
appointed-0.004
Token won
Token ID1839
Feature activation-1.49e-01

Pos logit contributions
ib+0.008
emy+0.008
pill+0.008
ub+0.008
 hops+0.008
Neg logit contributions
 Mom-0.010
 down-0.009
cream-0.009
 Dad-0.009
rolley-0.009
Token't
Token ID470
Feature activation-4.75e-01

Pos logit contributions
ub+0.004
emy+0.004
ib+0.004
 villains+0.004
pill+0.004
Neg logit contributions
rolley-0.005
 Mom-0.005
nam-0.005
cream-0.005
owner-0.005
Token get
Token ID651
Feature activation+3.28e-01

Pos logit contributions
pill+0.017
ib+0.017
emy+0.017
ub+0.017
 hops+0.017
Neg logit contributions
 Mom-0.012
 down-0.012
cream-0.011
owner-0.011
 Dad-0.011
Token any
Token ID597
Feature activation+9.76e-02

Pos logit contributions
rolley+0.009
cakes+0.008
owner+0.008
he+0.008
aughter+0.008
Neg logit contributions
ub-0.011
 hops-0.011
ib-0.011
emy-0.011
 appearance-0.011
Token ice
Token ID4771
Feature activation+8.10e-01

Pos logit contributions
 Mom+0.002
 down+0.002
rolley+0.002
 Dad+0.002
 Mum+0.002
Neg logit contributions
ib-0.004
emy-0.004
pill-0.004
ub-0.004
 villains-0.004
Token cream
Token ID8566
Feature activation+4.00e+00

Pos logit contributions
 Mom+0.015
nam+0.014
appointed+0.014
owner+0.014
rolley+0.014
Neg logit contributions
ub-0.035
ib-0.034
 hops-0.032
emy-0.032
 villains-0.031
Token."
Token ID526
Feature activation+4.81e-01

Pos logit contributions
nam+0.103
rolley+0.103
amed+0.101
appointed+0.101
 Mom+0.096
Neg logit contributions
ub-0.137
ib-0.130
 appearance-0.125
 hops-0.122
 villains-0.119
Token Tim
Token ID5045
Feature activation-8.14e-02

Pos logit contributions
rolley+0.011
nam+0.010
owner+0.010
cream+0.010
 down+0.010
Neg logit contributions
ib-0.018
ub-0.018
emy-0.018
pill-0.018
 hops-0.017
Tokenmy
Token ID1820
Feature activation-1.22e-01

Pos logit contributions
emy+0.003
ib+0.002
ub+0.002
 hops+0.002
 appearance+0.002
Neg logit contributions
nam-0.003
rolley-0.002
person-0.002
cream-0.002
 middle-0.002
Token still
Token ID991
Feature activation-3.28e-01

Pos logit contributions
emy+0.004
ib+0.004
 hops+0.004
ub+0.004
 villains+0.004
Neg logit contributions
rolley-0.003
nam-0.003
 Mom-0.003
 down-0.003
cream-0.003
Token refused
Token ID6520
Feature activation-4.43e-01

Pos logit contributions
ib+0.012
ub+0.012
emy+0.012
 hops+0.012
 villains+0.012
Neg logit contributions
 Mom-0.008
rolley-0.008
 down-0.007
cream-0.007
nam-0.007
Token she
Token ID673
Feature activation+4.57e-02

Pos logit contributions
ib+0.022
ub+0.022
emy+0.022
pill+0.022
 hops+0.022
Neg logit contributions
 Mom-0.013
 down-0.011
 Dad-0.011
 Mum-0.011
nam-0.011
Token needed
Token ID2622
Feature activation-2.42e-01

Pos logit contributions
 Mom+0.001
rolley+0.001
 started+0.001
cream+0.001
 down+0.001
Neg logit contributions
emy-0.002
 villains-0.002
ib-0.002
 weaknesses-0.002
 appearances-0.002
Token a
Token ID257
Feature activation+4.19e-01

Pos logit contributions
emy+0.011
ub+0.011
ib+0.011
pill+0.011
 villains+0.010
Neg logit contributions
 Mom-0.005
 Dad-0.004
 Mum-0.004
 down-0.004
 Grand-0.004
Token little
Token ID1310
Feature activation-5.05e-02

Pos logit contributions
 Mom+0.009
 down+0.009
nam+0.008
 Dad+0.008
owner+0.008
Neg logit contributions
ib-0.017
emy-0.017
ub-0.016
 villains-0.016
pill-0.016
Token pick
Token ID2298
Feature activation+1.04e+00

Pos logit contributions
emy+0.002
ib+0.002
ub+0.002
pill+0.002
 villains+0.002
Neg logit contributions
 Mom-0.001
 down-0.001
 Dad-0.001
 forehead-0.001
 clerk-0.001
Token-
Token ID12
Feature activation+3.87e+00

Pos logit contributions
 Mom+0.026
 down+0.025
rolley+0.024
owner+0.024
 Dad+0.023
Neg logit contributions
emy-0.039
ib-0.038
ub-0.038
 villains-0.038
 hops-0.037
Tokenme
Token ID1326
Feature activation+1.69e+00

Pos logit contributions
 Mom+0.111
 forehead+0.109
nam+0.102
appointed+0.101
 Dad+0.101
Neg logit contributions
ub-0.123
pill-0.122
ib-0.122
emy-0.115
 weaknesses-0.114
Token-
Token ID12
Feature activation+3.84e+00

Pos logit contributions
 Mom+0.042
 down+0.041
cream+0.039
rolley+0.039
owner+0.039
Neg logit contributions
emy-0.064
 villains-0.063
ib-0.062
 hops-0.061
ub-0.061
Tokenup
Token ID929
Feature activation+2.16e+00

Pos logit contributions
 Mom+0.111
 forehead+0.109
nam+0.102
appointed+0.102
 Dad+0.101
Neg logit contributions
ub-0.121
pill-0.120
ib-0.120
emy-0.113
 weaknesses-0.111
Token,
Token ID11
Feature activation+5.80e-01

Pos logit contributions
 Mom+0.060
 down+0.058
 Dad+0.055
rolley+0.055
owner+0.055
Neg logit contributions
emy-0.075
ib-0.075
ub-0.072
 villains-0.072
 hops-0.071
Token she
Token ID673
Feature activation+3.85e-01

Pos logit contributions
 Mom+0.015
rolley+0.014
 down+0.014
cream+0.014
 Dad+0.014
Neg logit contributions
ib-0.021
ub-0.020
emy-0.020
 hops-0.020
 villains-0.020
Token a
Token ID257
Feature activation+4.19e-01

Pos logit contributions
emy+0.011
ub+0.011
ib+0.011
pill+0.011
 villains+0.010
Neg logit contributions
 Mom-0.005
 Dad-0.004
 Mum-0.004
 down-0.004
 Grand-0.004
Token little
Token ID1310
Feature activation-5.05e-02

Pos logit contributions
 Mom+0.009
 down+0.009
nam+0.008
 Dad+0.008
owner+0.008
Neg logit contributions
ib-0.017
emy-0.017
ub-0.016
 villains-0.016
pill-0.016
Token pick
Token ID2298
Feature activation+1.04e+00

Pos logit contributions
emy+0.002
ib+0.002
ub+0.002
pill+0.002
 villains+0.002
Neg logit contributions
 Mom-0.001
 down-0.001
 Dad-0.001
 forehead-0.001
 clerk-0.001
Token-
Token ID12
Feature activation+3.87e+00

Pos logit contributions
 Mom+0.026
 down+0.025
rolley+0.024
owner+0.024
 Dad+0.023
Neg logit contributions
emy-0.039
ib-0.038
ub-0.038
 villains-0.038
 hops-0.037
Tokenme
Token ID1326
Feature activation+1.69e+00

Pos logit contributions
 Mom+0.111
 forehead+0.109
nam+0.102
appointed+0.101
 Dad+0.101
Neg logit contributions
ub-0.123
pill-0.122
ib-0.122
emy-0.115
 weaknesses-0.114
Token-
Token ID12
Feature activation+3.84e+00

Pos logit contributions
 Mom+0.042
 down+0.041
cream+0.039
rolley+0.039
owner+0.039
Neg logit contributions
emy-0.064
 villains-0.063
ib-0.062
 hops-0.061
ub-0.061
Tokenup
Token ID929
Feature activation+2.16e+00

Pos logit contributions
 Mom+0.111
 forehead+0.109
nam+0.102
appointed+0.102
 Dad+0.101
Neg logit contributions
ub-0.121
pill-0.120
ib-0.120
emy-0.113
 weaknesses-0.111
Token,
Token ID11
Feature activation+5.80e-01

Pos logit contributions
 Mom+0.060
 down+0.058
 Dad+0.055
rolley+0.055
owner+0.055
Neg logit contributions
emy-0.075
ib-0.075
ub-0.072
 villains-0.072
 hops-0.071
Token she
Token ID673
Feature activation+3.85e-01

Pos logit contributions
 Mom+0.015
rolley+0.014
 down+0.014
cream+0.014
 Dad+0.014
Neg logit contributions
ib-0.021
ub-0.020
emy-0.020
 hops-0.020
 villains-0.020
Token would
Token ID561
Feature activation+2.83e-01

Pos logit contributions
 Mom+0.008
rolley+0.007
 down+0.007
cream+0.007
 Dad+0.007
Neg logit contributions
ib-0.016
ub-0.016
emy-0.015
 hops-0.015
 villains-0.015
Token go
Token ID467
Feature activation+3.15e-01

Pos logit contributions
 Mom+0.008
 Mum+0.007
 Dad+0.007
 down+0.007
he+0.006
Neg logit contributions
emy-0.011
ib-0.011
pill-0.011
 villains-0.010
ub-0.010
Token castle
Token ID16669
Feature activation-7.38e-01

Pos logit contributions
ub+0.030
pill+0.030
emy+0.030
ib+0.028
 villains+0.028
Neg logit contributions
 Mom-0.032
 down-0.032
 mom-0.030
 Dad-0.030
 Mum-0.029
Token was
Token ID373
Feature activation+1.72e-01

Pos logit contributions
ib+0.030
ub+0.030
emy+0.029
 hops+0.029
 villains+0.029
Neg logit contributions
 Mom-0.015
rolley-0.014
 down-0.014
cream-0.014
nam-0.014
Token destroyed
Token ID6572
Feature activation+7.58e-01

Pos logit contributions
 Mom+0.004
rolley+0.004
 down+0.004
cream+0.004
 Dad+0.004
Neg logit contributions
ib-0.006
ub-0.006
emy-0.006
 hops-0.006
 villains-0.006
Token and
Token ID290
Feature activation+5.03e-01

Pos logit contributions
 Mom+0.014
 down+0.013
rolley+0.012
 Dad+0.012
nam+0.012
Neg logit contributions
emy-0.033
ib-0.033
ub-0.033
 hops-0.032
 villains-0.032
Token no
Token ID645
Feature activation+9.29e-01

Pos logit contributions
 Mom+0.011
rolley+0.011
 down+0.011
cream+0.011
owner+0.010
Neg logit contributions
ib-0.019
ub-0.019
emy-0.019
 hops-0.019
 villains-0.019
Token-
Token ID12
Feature activation+3.81e+00

Pos logit contributions
 Mom+0.026
 down+0.025
rolley+0.024
nam+0.024
 Dad+0.024
Neg logit contributions
ub-0.031
ib-0.031
emy-0.031
pill-0.030
 hops-0.030
Tokenone
Token ID505
Feature activation-2.88e-01

Pos logit contributions
 Mom+0.119
rolley+0.114
 down+0.111
 Dad+0.111
 Mum+0.110
Neg logit contributions
ib-0.114
ub-0.114
emy-0.112
 villains-0.110
 hops-0.109
Token was
Token ID373
Feature activation-2.12e-01

Pos logit contributions
ub+0.011
ib+0.011
emy+0.010
 villains+0.010
 hops+0.010
Neg logit contributions
 Mom-0.007
 down-0.007
rolley-0.007
cream-0.006
 Dad-0.006
Token able
Token ID1498
Feature activation+6.51e-01

Pos logit contributions
ub+0.008
ib+0.008
emy+0.008
 hops+0.008
 villains+0.007
Neg logit contributions
 Mom-0.005
rolley-0.005
cream-0.005
 down-0.005
 Dad-0.005
Token to
Token ID284
Feature activation-2.64e-01

Pos logit contributions
 Mom+0.017
 down+0.016
rolley+0.016
cream+0.015
 Dad+0.015
Neg logit contributions
ib-0.023
ub-0.023
emy-0.023
 hops-0.023
 villains-0.022
Token help
Token ID1037
Feature activation+2.81e-01

Pos logit contributions
pill+0.010
emy+0.009
ib+0.009
ub+0.009
 hops+0.009
Neg logit contributions
 Mom-0.007
 down-0.007
 Dad-0.006
nam-0.006
 Mum-0.006
Token It
Token ID632
Feature activation-7.34e-01

Pos logit contributions
ub+0.036
emy+0.036
ib+0.036
 hops+0.036
 villains+0.035
Neg logit contributions
 Mom-0.022
 Dad-0.019
 down-0.019
rolley-0.019
 Mum-0.019
Token said
Token ID531
Feature activation-4.79e-01

Pos logit contributions
ib+0.030
ub+0.030
emy+0.030
 hops+0.030
 villains+0.030
Neg logit contributions
 Mom-0.014
rolley-0.014
 down-0.013
cream-0.013
 Dad-0.013
Token:
Token ID25
Feature activation-2.93e-01

Pos logit contributions
emy+0.020
ub+0.019
 weaknesses+0.019
pill+0.019
ib+0.019
Neg logit contributions
 Mom-0.013
 Dad-0.011
 Mum-0.011
 down-0.010
 mom-0.010
Token "
Token ID366
Feature activation-4.76e-01

Pos logit contributions
emy+0.014
 villains+0.014
 weaknesses+0.014
ub+0.014
pill+0.014
Neg logit contributions
 Mom-0.005
 Dad-0.004
 down-0.004
 Mum-0.004
 -0.004
TokenHappy
Token ID25082
Feature activation-3.36e-01

Pos logit contributions
ib+0.017
ub+0.017
emy+0.017
 hops+0.017
 villains+0.017
Neg logit contributions
 Mom-0.012
rolley-0.011
cream-0.011
 down-0.011
owner-0.011
Token Birthday
Token ID33511
Feature activation+3.81e+00

Pos logit contributions
ib+0.010
ub+0.010
emy+0.010
 hops+0.010
 villains+0.010
Neg logit contributions
 Mom-0.010
rolley-0.010
 down-0.010
cream-0.010
nam-0.010
Token,
Token ID11
Feature activation+1.05e+00

Pos logit contributions
rolley+0.151
nam+0.150
 forehead+0.147
cream+0.145
appointed+0.139
Neg logit contributions
ub-0.087
ib-0.084
 tweet-0.080
emy-0.077
 hops-0.075
Token Mom
Token ID11254
Feature activation+1.43e+00

Pos logit contributions
rolley+0.018
cream+0.017
 down+0.016
nam+0.016
 Mom+0.016
Neg logit contributions
emy-0.045
ib-0.045
ub-0.045
 hops-0.044
 villains-0.044
Token!
Token ID0
Feature activation-1.97e+00

Pos logit contributions
nam+0.030
rolley+0.030
 Mom+0.029
owner+0.028
cream+0.028
Neg logit contributions
ib-0.056
ub-0.056
emy-0.054
 hops-0.054
pill-0.053
Token Do
Token ID2141
Feature activation-2.00e+00

Pos logit contributions
ib+0.068
ub+0.067
emy+0.065
pill+0.064
 hops+0.064
Neg logit contributions
 down-0.051
rolley-0.050
nam-0.050
cream-0.049
owner-0.049
Token not
Token ID407
Feature activation+6.43e-01

Pos logit contributions
ub+0.067
ib+0.067
emy+0.066
 hops+0.065
 villains+0.064
Neg logit contributions
 Mom-0.057
 Dad-0.052
 down-0.052
cream-0.052
nam-0.052
Token young
Token ID1862
Feature activation+8.93e-01

Pos logit contributions
ub+0.016
ib+0.016
emy+0.016
 hops+0.016
 villains+0.015
Neg logit contributions
rolley-0.011
 Mom-0.011
cream-0.011
owner-0.011
 down-0.011
Token to
Token ID284
Feature activation-9.40e-01

Pos logit contributions
 Mom+0.020
 down+0.019
 Dad+0.018
owner+0.017
 Mum+0.017
Neg logit contributions
ib-0.037
ub-0.036
emy-0.036
pill-0.035
 villains-0.035
Token get
Token ID651
Feature activation+4.50e-01

Pos logit contributions
pill+0.033
ib+0.033
emy+0.032
ub+0.032
 hops+0.031
Neg logit contributions
 Mom-0.027
 down-0.025
 Dad-0.025
 Mum-0.024
rolley-0.024
Token an
Token ID281
Feature activation+1.68e-01

Pos logit contributions
rolley+0.009
 Mom+0.009
cream+0.008
owner+0.008
 down+0.008
Neg logit contributions
ub-0.018
ib-0.018
 hops-0.018
emy-0.017
 villains-0.017
Token X
Token ID1395
Feature activation+4.75e-01

Pos logit contributions
 Mom+0.005
rolley+0.004
 down+0.004
cream+0.004
nam+0.004
Neg logit contributions
ib-0.006
ub-0.006
emy-0.006
 hops-0.006
 villains-0.005
Token-
Token ID12
Feature activation+3.80e+00

Pos logit contributions
 Mom+0.010
 down+0.010
rolley+0.010
cream+0.010
 Dad+0.009
Neg logit contributions
ib-0.019
emy-0.018
ub-0.018
 hops-0.018
 villains-0.018
Tokenray
Token ID2433
Feature activation+1.01e+00

Pos logit contributions
 Mom+0.110
 forehead+0.105
rolley+0.102
nam+0.102
 Daughter+0.101
Neg logit contributions
ub-0.122
ib-0.122
 hops-0.113
pill-0.113
emy-0.113
Token,
Token ID11
Feature activation-1.79e-01

Pos logit contributions
 Mom+0.020
rolley+0.019
 down+0.019
cream+0.018
nam+0.018
Neg logit contributions
ib-0.041
ub-0.041
emy-0.040
 hops-0.040
 villains-0.040
Token but
Token ID475
Feature activation+7.97e-01

Pos logit contributions
emy+0.008
ub+0.008
ib+0.008
 villains+0.007
 hops+0.007
Neg logit contributions
 Mom-0.004
 Dad-0.004
 down-0.004
 Mum-0.003
 Grand-0.003
Token she
Token ID673
Feature activation+6.33e-01

Pos logit contributions
rolley+0.019
cream+0.018
 forehead+0.016
amas+0.016
amed+0.016
Neg logit contributions
 hops-0.028
 appearance-0.028
ub-0.028
ib-0.028
 weaknesses-0.028
Token said
Token ID531
Feature activation+1.47e-01

Pos logit contributions
rolley+0.016
 down+0.016
nam+0.016
 Mom+0.015
cream+0.015
Neg logit contributions
emy-0.022
ub-0.022
ib-0.022
 hops-0.022
 villains-0.021
Token she
Token ID673
Feature activation-2.27e-01

Pos logit contributions
ub+0.014
emy+0.014
ib+0.013
 villains+0.013
 hops+0.013
Neg logit contributions
 Mom-0.006
nam-0.006
 Dad-0.006
 down-0.006
 Mum-0.005
Token wanted
Token ID2227
Feature activation-2.64e-01

Pos logit contributions
ib+0.009
ub+0.009
emy+0.009
 villains+0.009
pill+0.009
Neg logit contributions
 Mom-0.005
 down-0.005
rolley-0.004
cream-0.004
owner-0.004
Token to
Token ID284
Feature activation+3.80e-01

Pos logit contributions
emy+0.011
ub+0.011
ib+0.011
 villains+0.011
 hops+0.011
Neg logit contributions
 Mom-0.006
 down-0.005
 Dad-0.005
 Mum-0.005
 Grand-0.005
Token buy
Token ID2822
Feature activation+1.66e+00

Pos logit contributions
 Mom+0.011
rolley+0.010
 down+0.010
cream+0.010
 Dad+0.010
Neg logit contributions
ib-0.012
ub-0.012
emy-0.012
 hops-0.012
pill-0.012
Token ice
Token ID4771
Feature activation+7.01e-01

Pos logit contributions
rolley+0.048
 Mom+0.046
 forehead+0.046
appointed+0.046
cream+0.045
Neg logit contributions
ib-0.052
ub-0.050
 hops-0.050
pill-0.049
emy-0.049
Token cream
Token ID8566
Feature activation+3.78e+00

Pos logit contributions
appointed+0.014
 Mom+0.014
 forehead+0.014
nam+0.013
rolley+0.013
Neg logit contributions
ub-0.029
ib-0.029
 hops-0.026
pill-0.026
emy-0.026
Token for
Token ID329
Feature activation+8.39e-01

Pos logit contributions
appointed+0.094
rolley+0.094
nam+0.092
 forehead+0.091
 Mom+0.091
Neg logit contributions
ub-0.130
ib-0.125
 hops-0.119
pill-0.117
 appearance-0.116
Token all
Token ID477
Feature activation-1.39e-01

Pos logit contributions
rolley+0.017
 Mom+0.017
cream+0.016
 down+0.016
owner+0.016
Neg logit contributions
ib-0.034
emy-0.033
ub-0.033
 hops-0.033
 villains-0.032
Token her
Token ID607
Feature activation+1.56e-02

Pos logit contributions
ib+0.005
 hops+0.005
ub+0.005
emy+0.005
pill+0.005
Neg logit contributions
 Mom-0.003
rolley-0.003
cream-0.003
nam-0.003
 forehead-0.003
Token friends
Token ID2460
Feature activation+9.24e-01

Pos logit contributions
rolley+0.000
 Mom+0.000
 down+0.000
owner+0.000
cream+0.000
Neg logit contributions
ib-0.001
emy-0.001
 hops-0.001
ub-0.001
 villains-0.001
Token.
Token ID13
Feature activation+5.43e-01

Pos logit contributions
 Mom+0.018
rolley+0.017
cream+0.016
 down+0.016
owner+0.016
Neg logit contributions
ub-0.038
ib-0.038
emy-0.038
 hops-0.038
 villains-0.037
TokenWe
Token ID1135
Feature activation+4.37e-01

Pos logit contributions
ib+0.030
emy+0.029
ub+0.029
 villains+0.029
pill+0.029
Neg logit contributions
 Mom-0.021
 down-0.020
rolley-0.020
cream-0.020
 Dad-0.020
Token can
Token ID460
Feature activation-8.84e-01

Pos logit contributions
 Mom+0.014
 down+0.013
 Dad+0.013
cream+0.013
rolley+0.013
Neg logit contributions
emy-0.013
ib-0.013
ub-0.013
pill-0.013
 hops-0.013
Token't
Token ID470
Feature activation+5.38e-02

Pos logit contributions
ib+0.031
emy+0.031
ub+0.031
 hops+0.031
pill+0.030
Neg logit contributions
 Mom-0.024
 down-0.022
rolley-0.022
 Dad-0.022
cream-0.022
Token get
Token ID651
Feature activation+7.05e-01

Pos logit contributions
 Mom+0.001
 down+0.001
 Dad+0.001
cream+0.001
 Mum+0.001
Neg logit contributions
pill-0.002
ib-0.002
emy-0.002
 hops-0.002
ub-0.002
Token ice
Token ID4771
Feature activation+8.71e-01

Pos logit contributions
rolley+0.020
owner+0.019
aughter+0.019
 clerk+0.019
person+0.018
Neg logit contributions
ub-0.022
ib-0.022
 hops-0.022
emy-0.021
 appearance-0.021
Token cream
Token ID8566
Feature activation+3.77e+00

Pos logit contributions
 doorway+0.017
owner+0.016
person+0.015
 Mom+0.015
amed+0.015
Neg logit contributions
ub-0.037
ib-0.037
emy-0.032
 appearance-0.032
 hops-0.032
Token."
Token ID526
Feature activation+6.76e-01

Pos logit contributions
amed+0.113
rolley+0.112
appointed+0.108
nam+0.108
owner+0.107
Neg logit contributions
ub-0.111
ib-0.104
 appearance-0.101
 villains-0.093
 hops-0.092
Token
Token ID220
Feature activation+8.04e-01

Pos logit contributions
rolley+0.019
aughter+0.018
nam+0.018
he+0.017
amed+0.016
Neg logit contributions
emy-0.023
 hops-0.022
ib-0.021
 villains-0.019
ub-0.019
Token
Token ID198
Feature activation+2.58e-01

Pos logit contributions
nam+0.034
 forehead+0.033
 doorway+0.032
rolley+0.030
 Daughter+0.030
Neg logit contributions
emy-0.018
ub-0.017
ib-0.017
pill-0.015
 hops-0.015
Token
Token ID198
Feature activation+8.44e-02

Pos logit contributions
nam+0.011
 forehead+0.010
aughter+0.010
rolley+0.010
 doorway+0.009
Neg logit contributions
emy-0.007
ub-0.005
ib-0.005
Like-0.005
pill-0.004
TokenJust
Token ID5703
Feature activation-3.97e-01

Pos logit contributions
nam+0.003
 Mom+0.003
aughter+0.003
 down+0.003
 Dad+0.003
Neg logit contributions
emy-0.003
 villains-0.002
ib-0.002
ub-0.002
 hops-0.002
Token knew
Token ID2993
Feature activation-3.59e-01

Pos logit contributions
 Mom+0.011
rolley+0.010
 down+0.010
cream+0.010
nam+0.010
Neg logit contributions
ib-0.019
ub-0.019
emy-0.019
 hops-0.019
 villains-0.019
Token he
Token ID339
Feature activation+2.86e-01

Pos logit contributions
ib+0.014
ub+0.013
emy+0.013
 hops+0.013
 villains+0.013
Neg logit contributions
 Mom-0.009
 down-0.008
rolley-0.008
cream-0.008
 Dad-0.008
Token couldn
Token ID3521
Feature activation-9.75e-01

Pos logit contributions
 Mom+0.007
rolley+0.006
 down+0.006
cream+0.006
nam+0.006
Neg logit contributions
ib-0.011
ub-0.011
emy-0.011
 hops-0.011
 villains-0.011
Token't
Token ID470
Feature activation+3.83e-01

Pos logit contributions
ib+0.033
emy+0.033
ub+0.033
 hops+0.033
 villains+0.032
Neg logit contributions
 Mom-0.027
 down-0.025
rolley-0.025
cream-0.024
 Dad-0.024
Token un
Token ID555
Feature activation+6.93e-01

Pos logit contributions
 Mom+0.010
 down+0.009
 Dad+0.009
 Mum+0.009
he+0.009
Neg logit contributions
pill-0.015
emy-0.015
ib-0.014
 hops-0.014
ub-0.014
Token-
Token ID12
Feature activation+3.76e+00

Pos logit contributions
 Mom+0.015
 forehead+0.014
 Dad+0.013
 Mum+0.013
nam+0.013
Neg logit contributions
ub-0.028
ib-0.027
emy-0.026
 villains-0.025
 appearance-0.025
Tokenm
Token ID76
Feature activation+4.73e-01

Pos logit contributions
 Mom+0.110
 forehead+0.106
aughter+0.105
rolley+0.104
 Daughter+0.102
Neg logit contributions
ub-0.117
ib-0.112
 weaknesses-0.109
 villains-0.108
pill-0.108
Tokenelt
Token ID2120
Feature activation+9.96e-01

Pos logit contributions
 Mom+0.013
rolley+0.012
 down+0.012
cream+0.012
owner+0.012
Neg logit contributions
ib-0.016
ub-0.016
emy-0.016
 hops-0.016
 villains-0.016
Token it
Token ID340
Feature activation-3.70e-01

Pos logit contributions
rolley+0.027
 Mom+0.026
owner+0.025
cream+0.025
person+0.025
Neg logit contributions
ib-0.032
 hops-0.032
ub-0.032
 appearance-0.032
emy-0.031
Token.
Token ID13
Feature activation+6.64e-01

Pos logit contributions
emy+0.015
ib+0.015
ub+0.015
pill+0.014
 villains+0.014
Neg logit contributions
 Mom-0.008
 down-0.008
rolley-0.007
cream-0.007
 Dad-0.007
Token
Token ID220
Feature activation+9.61e-01

Pos logit contributions
nam+0.018
 cream+0.017
 forehead+0.016
amed+0.016
rolley+0.016
Neg logit contributions
ub-0.022
ib-0.022
emy-0.022
pill-0.021
 hops-0.021
Token's
Token ID338
Feature activation+4.89e-01

Pos logit contributions
ub+0.017
 hops+0.017
emy+0.016
 appearance+0.016
ib+0.016
Neg logit contributions
nam-0.022
cream-0.021
rolley-0.021
belt-0.020
owner-0.020
Token go
Token ID467
Feature activation+4.63e-01

Pos logit contributions
rolley+0.015
cream+0.014
owner+0.014
 Mom+0.014
nam+0.014
Neg logit contributions
ub-0.015
 hops-0.014
 villains-0.014
ib-0.014
emy-0.014
Token and
Token ID290
Feature activation+2.29e-01

Pos logit contributions
 Mom+0.011
 down+0.011
rolley+0.011
 Dad+0.010
cream+0.010
Neg logit contributions
ib-0.017
emy-0.017
ub-0.017
 villains-0.017
pill-0.017
Token get
Token ID651
Feature activation+4.55e-01

Pos logit contributions
rolley+0.007
owner+0.007
cream+0.007
 Mom+0.007
aughter+0.007
Neg logit contributions
ub-0.007
 hops-0.007
 villains-0.007
ib-0.006
 tweet-0.006
Token ice
Token ID4771
Feature activation+6.85e-01

Pos logit contributions
owner+0.012
rolley+0.012
person+0.011
aughter+0.011
appointed+0.011
Neg logit contributions
ub-0.015
 hops-0.015
ib-0.015
 villains-0.014
emy-0.014
Token cream
Token ID8566
Feature activation+3.75e+00

Pos logit contributions
owner+0.015
nam+0.015
 Mom+0.014
appointed+0.014
person+0.013
Neg logit contributions
ub-0.028
ib-0.027
 hops-0.024
emy-0.024
pill-0.023
Token,
Token ID11
Feature activation+1.40e+00

Pos logit contributions
nam+0.100
appointed+0.095
owner+0.094
rolley+0.093
uct+0.093
Neg logit contributions
ub-0.126
ib-0.116
 hops-0.110
 appearance-0.108
 tweet-0.106
Token sweet
Token ID6029
Feature activation+4.03e-02

Pos logit contributions
 Mom+0.035
 Dad+0.031
 Mum+0.030
 down+0.030
 Grand+0.028
Neg logit contributions
ib-0.055
emy-0.055
pill-0.054
ub-0.054
 hops-0.054
Tokenie
Token ID494
Feature activation-1.05e+00

Pos logit contributions
nam+0.001
owner+0.001
 Mom+0.001
rolley+0.001
cream+0.001
Neg logit contributions
ub-0.001
ib-0.001
emy-0.001
 hops-0.001
pill-0.001
Token."
Token ID526
Feature activation+2.56e-01

Pos logit contributions
ub+0.044
ib+0.043
emy+0.042
 hops+0.042
 villains+0.041
Neg logit contributions
 Mom-0.020
nam-0.020
rolley-0.019
cream-0.019
owner-0.018
Token
Token ID198
Feature activation-6.22e-02

Pos logit contributions
nam+0.007
cream+0.006
rolley+0.006
owner+0.006
belt+0.006
Neg logit contributions
ub-0.009
ib-0.009
emy-0.009
 hops-0.009
pill-0.009
Token had
Token ID550
Feature activation+9.98e-01

Pos logit contributions
rolley+0.027
 Mom+0.026
 down+0.025
owner+0.025
aughter+0.024
Neg logit contributions
 villains-0.035
ib-0.034
 hops-0.034
emy-0.034
ub-0.034
Token not
Token ID407
Feature activation+1.14e+00

Pos logit contributions
rolley+0.030
 Mom+0.028
 down+0.027
aughter+0.027
cream+0.027
Neg logit contributions
emy-0.031
ib-0.031
 villains-0.031
 hops-0.030
 weaknesses-0.030
Token touched
Token ID12615
Feature activation+6.03e-01

Pos logit contributions
 Mom+0.026
 down+0.025
nam+0.025
 Dad+0.024
cream+0.024
Neg logit contributions
ub-0.046
emy-0.045
ib-0.045
pill-0.044
 hops-0.043
Token the
Token ID262
Feature activation+3.66e-01

Pos logit contributions
 Mom+0.012
rolley+0.011
 down+0.011
cream+0.011
owner+0.010
Neg logit contributions
ib-0.025
ub-0.025
emy-0.025
 hops-0.025
 villains-0.024
Token or
Token ID393
Feature activation+2.30e-01

Pos logit contributions
rolley+0.010
cream+0.010
 Mom+0.010
 Mum+0.010
 Dad+0.010
Neg logit contributions
ib-0.011
 appearance-0.011
pill-0.011
 villains-0.011
 hops-0.011
Tokennam
Token ID7402
Feature activation+3.75e+00

Pos logit contributions
rolley+0.004
 Mom+0.004
cream+0.004
owner+0.004
 down+0.004
Neg logit contributions
ib-0.010
 villains-0.009
 hops-0.009
ub-0.009
emy-0.009
Tokenents
Token ID658
Feature activation+1.28e+00

Pos logit contributions
 Mom+0.154
 forehead+0.148
 Mum+0.147
 down+0.146
 Dad+0.146
Neg logit contributions
ib-0.081
ub-0.081
emy-0.070
pill-0.070
 villains-0.065
Token.
Token ID13
Feature activation-9.30e-01

Pos logit contributions
 Mom+0.025
 down+0.024
rolley+0.022
cream+0.022
owner+0.022
Neg logit contributions
emy-0.055
ib-0.053
ub-0.053
 hops-0.053
pill-0.052
Token They
Token ID1119
Feature activation+2.64e-01

Pos logit contributions
ib+0.030
ub+0.030
emy+0.029
 hops+0.029
 villains+0.029
Neg logit contributions
 Mom-0.027
rolley-0.026
 down-0.026
cream-0.026
nam-0.025
Token hoped
Token ID10719
Feature activation+5.42e-01

Pos logit contributions
 Mom+0.008
rolley+0.008
 down+0.008
cream+0.008
owner+0.007
Neg logit contributions
ib-0.008
emy-0.008
ub-0.008
 villains-0.008
 hops-0.008
Token Mom
Token ID11254
Feature activation+1.46e+00

Pos logit contributions
 Mom+0.012
 Dad+0.010
 Mum+0.010
 Grand+0.010
 down+0.010
Neg logit contributions
ub-0.024
ib-0.024
emy-0.023
 hops-0.023
pill-0.023
Token used
Token ID973
Feature activation-7.59e-01

Pos logit contributions
ib+0.004
ub+0.004
emy+0.004
 villains+0.004
pill+0.004
Neg logit contributions
 Mom-0.003
 down-0.003
 Dad-0.002
rolley-0.002
cream-0.002
Token Clim
Token ID28573
Feature activation-1.13e+00

Pos logit contributions
ib+0.033
ub+0.033
emy+0.033
 hops+0.033
 villains+0.032
Neg logit contributions
 Mom-0.013
rolley-0.012
 down-0.011
cream-0.011
nam-0.011
Tokenber
Token ID527
Feature activation+7.22e-01

Pos logit contributions
ib+0.039
ub+0.038
emy+0.038
 hops+0.038
 villains+0.037
Neg logit contributions
 Mom-0.031
rolley-0.029
 down-0.029
cream-0.028
 Dad-0.028
Token the
Token ID262
Feature activation-5.07e-01

Pos logit contributions
 down+0.017
 Mom+0.016
cream+0.015
 Dad+0.015
rolley+0.015
Neg logit contributions
emy-0.029
ib-0.028
ub-0.028
 villains-0.027
 hops-0.027
Token Lad
Token ID12862
Feature activation-1.05e+00

Pos logit contributions
ib+0.019
ub+0.019
emy+0.019
 hops+0.019
 villains+0.019
Neg logit contributions
 Mom-0.012
 down-0.011
rolley-0.011
cream-0.011
 Dad-0.011
Tokender
Token ID1082
Feature activation-5.04e+00

Pos logit contributions
ub+0.036
emy+0.035
ib+0.035
 hops+0.034
pill+0.034
Neg logit contributions
 Mom-0.029
 down-0.027
rolley-0.027
 Dad-0.026
cream-0.026
Token.
Token ID13
Feature activation-2.05e-01

Pos logit contributions
ib+0.175
ub+0.175
emy+0.174
 villains+0.171
 hops+0.170
Neg logit contributions
 Mom-0.131
 down-0.126
rolley-0.125
cream-0.122
 Dad-0.121
Token It
Token ID632
Feature activation-1.10e+00

Pos logit contributions
emy+0.008
ib+0.008
ub+0.008
 hops+0.008
 villains+0.008
Neg logit contributions
 Mom-0.004
rolley-0.004
 Dad-0.004
 down-0.004
cream-0.004
Token helped
Token ID4193
Feature activation-6.24e-01

Pos logit contributions
ib+0.047
ub+0.047
emy+0.046
 villains+0.045
pill+0.045
Neg logit contributions
 Mom-0.021
 down-0.020
cream-0.019
rolley-0.019
owner-0.018
Token him
Token ID683
Feature activation-8.66e-01

Pos logit contributions
ib+0.024
emy+0.024
pill+0.024
ub+0.024
 hops+0.024
Neg logit contributions
 Mom-0.015
 down-0.014
 Dad-0.013
 Mum-0.013
cream-0.013
Token to
Token ID284
Feature activation-1.15e+00

Pos logit contributions
emy+0.032
ib+0.032
pill+0.031
ub+0.031
 villains+0.031
Neg logit contributions
 down-0.023
 Mom-0.023
cream-0.021
 up-0.021
 Mum-0.020
Token thanked
Token ID26280
Feature activation-4.62e-01

Pos logit contributions
ib+0.030
ub+0.030
emy+0.030
 hops+0.029
 villains+0.029
Neg logit contributions
 Mom-0.019
rolley-0.018
 down-0.017
cream-0.017
 Dad-0.017
Token Clim
Token ID28573
Feature activation-1.01e+00

Pos logit contributions
ib+0.020
ub+0.020
emy+0.020
 hops+0.020
 villains+0.020
Neg logit contributions
 Mom-0.009
 Dad-0.007
 Mum-0.007
 down-0.007
 Grand-0.007
Tokenber
Token ID527
Feature activation+1.66e+00

Pos logit contributions
ib+0.035
ub+0.034
emy+0.034
 hops+0.034
 villains+0.033
Neg logit contributions
 Mom-0.027
rolley-0.026
 down-0.026
cream-0.026
owner-0.025
Token the
Token ID262
Feature activation-9.27e-01

Pos logit contributions
 Mom+0.042
rolley+0.040
 down+0.040
cream+0.039
 Dad+0.039
Neg logit contributions
ib-0.059
ub-0.059
emy-0.058
 hops-0.057
 villains-0.057
Token Lad
Token ID12862
Feature activation-3.67e-01

Pos logit contributions
ub+0.036
ib+0.036
emy+0.035
pill+0.034
 hops+0.033
Neg logit contributions
 Mom-0.023
 down-0.021
 clerk-0.021
owner-0.021
 Dad-0.021
Tokender
Token ID1082
Feature activation-4.92e+00

Pos logit contributions
ub+0.012
emy+0.012
ib+0.012
 hops+0.012
pill+0.012
Neg logit contributions
 Mom-0.010
rolley-0.009
 down-0.009
cream-0.009
 Dad-0.009
Token for
Token ID329
Feature activation-8.08e-01

Pos logit contributions
ib+0.170
ub+0.170
emy+0.169
 villains+0.166
 hops+0.165
Neg logit contributions
 Mom-0.129
 down-0.123
rolley-0.123
cream-0.119
 Dad-0.118
Token its
Token ID663
Feature activation-1.15e+00

Pos logit contributions
ib+0.029
ub+0.029
emy+0.028
pill+0.028
 hops+0.028
Neg logit contributions
 Mom-0.022
 down-0.020
rolley-0.020
 Dad-0.020
nam-0.020
Token help
Token ID1037
Feature activation-1.12e+00

Pos logit contributions
ub+0.041
ib+0.039
emy+0.039
pill+0.039
 villains+0.038
Neg logit contributions
 Mom-0.032
 down-0.030
 Dad-0.029
 Mum-0.029
 forehead-0.029
Token.
Token ID13
Feature activation-1.03e-01

Pos logit contributions
ib+0.051
emy+0.051
ub+0.051
 hops+0.050
pill+0.049
Neg logit contributions
 Mom-0.018
 down-0.016
rolley-0.016
 Dad-0.015
cream-0.015
Token And
Token ID843
Feature activation+2.12e-01

Pos logit contributions
ib+0.004
ub+0.004
emy+0.004
 hops+0.004
pill+0.004
Neg logit contributions
 Mom-0.002
rolley-0.002
 down-0.002
nam-0.002
cream-0.002
Token the
Token ID262
Feature activation+8.29e-03

Pos logit contributions
ub+0.019
ib+0.019
 hops+0.019
emy+0.019
pill+0.019
Neg logit contributions
rolley-0.008
 Mom-0.007
 forehead-0.007
nam-0.007
owner-0.007
Token tree
Token ID5509
Feature activation-1.54e+00

Pos logit contributions
 Mom+0.000
 down+0.000
owner+0.000
rolley+0.000
 middle+0.000
Neg logit contributions
ub-0.000
ib-0.000
emy-0.000
 villains-0.000
 hops-0.000
Token to
Token ID284
Feature activation-1.47e-01

Pos logit contributions
emy+0.057
ib+0.057
ub+0.056
 villains+0.055
 hops+0.055
Neg logit contributions
 Mom-0.038
 down-0.037
rolley-0.035
cream-0.035
owner-0.035
Token its
Token ID663
Feature activation-2.74e-01

Pos logit contributions
ib+0.006
ub+0.006
emy+0.006
 hops+0.006
 villains+0.006
Neg logit contributions
 Mom-0.003
rolley-0.003
 down-0.003
cream-0.003
 Dad-0.003
Token once
Token ID1752
Feature activation-2.47e+00

Pos logit contributions
ub+0.010
ib+0.010
emy+0.010
pill+0.010
 hops+0.010
Neg logit contributions
 Mom-0.007
 down-0.006
rolley-0.006
 Dad-0.006
 Mum-0.006
Token former
Token ID1966
Feature activation-4.67e+00

Pos logit contributions
ub+0.089
emy+0.089
ib+0.088
 villains+0.088
pill+0.087
Neg logit contributions
 Mom-0.063
 down-0.061
rolley-0.059
owner-0.057
cream-0.057
Token beauty
Token ID8737
Feature activation-1.26e+00

Pos logit contributions
ub+0.160
 villains+0.154
emy+0.154
ib+0.153
pill+0.150
Neg logit contributions
 Mom-0.123
 down-0.118
rolley-0.118
cream-0.114
 Mum-0.112
Token.
Token ID13
Feature activation+5.36e-01

Pos logit contributions
ib+0.054
emy+0.054
ub+0.054
 hops+0.052
 villains+0.052
Neg logit contributions
 Mom-0.023
 down-0.022
rolley-0.021
cream-0.021
 Dad-0.020
Token He
Token ID679
Feature activation+1.11e-01

Pos logit contributions
nam+0.013
 down+0.013
 forehead+0.013
rolley+0.012
 Mom+0.012
Neg logit contributions
ub-0.019
ib-0.019
emy-0.019
pill-0.019
 hops-0.018
Token was
Token ID373
Feature activation+3.92e-01

Pos logit contributions
 Mom+0.002
 down+0.002
rolley+0.002
cream+0.002
 Mum+0.002
Neg logit contributions
ib-0.005
ub-0.005
emy-0.005
 villains-0.005
pill-0.005
Token happy
Token ID3772
Feature activation-5.39e-01

Pos logit contributions
rolley+0.009
cream+0.009
nam+0.008
aughter+0.008
 Mom+0.008
Neg logit contributions
 hops-0.014
ub-0.014
ib-0.014
 villains-0.014
emy-0.014
Token his
Token ID465
Feature activation+6.88e-01

Pos logit contributions
nam+0.044
rolley+0.044
 forehead+0.042
owner+0.042
 down+0.041
Neg logit contributions
emy-0.051
ub-0.050
 hops-0.050
ib-0.050
 villains-0.049
Token mother
Token ID2802
Feature activation-7.97e-01

Pos logit contributions
rolley+0.016
cream+0.016
nam+0.015
owner+0.015
cakes+0.015
Neg logit contributions
 hops-0.026
emy-0.025
ib-0.025
 appearance-0.025
ub-0.025
Token even
Token ID772
Feature activation-1.93e+00

Pos logit contributions
 hops+0.027
ub+0.027
ib+0.027
 tweet+0.026
emy+0.026
Neg logit contributions
rolley-0.020
 Mom-0.020
cream-0.019
nam-0.019
aughter-0.019
Token p
Token ID279
Feature activation-1.86e+00

Pos logit contributions
 hops+0.058
ib+0.058
emy+0.057
ub+0.057
 villains+0.056
Neg logit contributions
 Mom-0.059
rolley-0.058
cream-0.056
 down-0.055
owner-0.055
Tokenrou
Token ID472
Feature activation-1.08e+00

Pos logit contributions
ub+0.053
ib+0.053
emy+0.052
pill+0.050
 hops+0.049
Neg logit contributions
 Mom-0.062
nam-0.059
 down-0.059
rolley-0.058
 Dad-0.057
Tokender
Token ID1082
Feature activation-4.62e+00

Pos logit contributions
ib+0.023
ub+0.023
emy+0.023
 hops+0.022
 villains+0.022
Neg logit contributions
 Mom-0.043
rolley-0.042
nam-0.042
 down-0.041
 Dad-0.041
Token.
Token ID13
Feature activation+2.80e-01

Pos logit contributions
ib+0.173
ub+0.172
emy+0.171
 hops+0.169
 villains+0.168
Neg logit contributions
 Mom-0.109
rolley-0.103
 down-0.103
cream-0.100
 Dad-0.099
Token<|endoftext|>
Token ID50256
Feature activation-4.97e-01

Pos logit contributions
 forehead+0.008
nam+0.008
led+0.007
 cream+0.007
amed+0.007
Neg logit contributions
ub-0.009
ib-0.009
pill-0.009
emy-0.008
 hops-0.008
TokenJohn
Token ID7554
Feature activation+5.51e-01

Pos logit contributions
ib+0.016
 hops+0.016
pill+0.016
ub+0.016
emy+0.015
Neg logit contributions
nam-0.015
 forehead-0.014
cream-0.014
rolley-0.014
owner-0.014
Token and
Token ID290
Feature activation-4.79e-01

Pos logit contributions
 Mom+0.012
 down+0.011
rolley+0.011
cream+0.010
 Dad+0.010
Neg logit contributions
ib-0.022
ub-0.022
emy-0.022
 villains-0.022
pill-0.022
Token his
Token ID465
Feature activation-1.87e-01

Pos logit contributions
 hops+0.017
 villains+0.016
emy+0.016
ib+0.016
 tweet+0.016
Neg logit contributions
rolley-0.012
cakes-0.012
amas-0.012
aughter-0.011
amed-0.011
Token This
Token ID770
Feature activation-1.12e+00

Pos logit contributions
rolley+0.024
cream+0.023
nam+0.023
 Mom+0.023
owner+0.023
Neg logit contributions
emy-0.029
ib-0.029
pill-0.029
ub-0.028
 hops-0.028
Token made
Token ID925
Feature activation+1.90e+00

Pos logit contributions
ub+0.049
ib+0.048
emy+0.046
 villains+0.045
 conclusions+0.045
Neg logit contributions
 Mom-0.022
 down-0.021
 up-0.019
owner-0.019
 Mum-0.018
Token her
Token ID607
Feature activation+1.18e+00

Pos logit contributions
 Mom+0.034
rolley+0.031
 down+0.031
cream+0.030
 Dad+0.029
Neg logit contributions
ib-0.083
ub-0.082
emy-0.082
 hops-0.081
 villains-0.080
Token even
Token ID772
Feature activation-1.63e+00

Pos logit contributions
rolley+0.032
cream+0.031
 Mom+0.030
owner+0.030
 clerk+0.029
Neg logit contributions
 hops-0.040
ib-0.039
ub-0.039
emy-0.039
 villains-0.038
Token glad
Token ID9675
Feature activation-5.26e-01

Pos logit contributions
ib+0.051
ub+0.051
emy+0.050
 hops+0.050
 villains+0.049
Neg logit contributions
 Mom-0.048
rolley-0.047
cream-0.045
 down-0.045
 Dad-0.045
Tokender
Token ID1082
Feature activation-4.46e+00

Pos logit contributions
ub+0.022
ib+0.022
emy+0.021
 villains+0.021
pill+0.021
Neg logit contributions
 Mom-0.012
 down-0.011
 Dad-0.011
 Mum-0.011
 Grand-0.010
Token and
Token ID290
Feature activation+4.78e-01

Pos logit contributions
ub+0.165
ib+0.165
emy+0.162
 villains+0.161
 hops+0.160
Neg logit contributions
 Mom-0.107
 down-0.102
rolley-0.098
cream-0.098
 Dad-0.096
Token she
Token ID673
Feature activation+7.23e-01

Pos logit contributions
 Mom+0.012
rolley+0.011
 down+0.011
cream+0.011
 Dad+0.011
Neg logit contributions
ib-0.017
ub-0.017
emy-0.017
 hops-0.017
 villains-0.017
Token smiled
Token ID13541
Feature activation+9.44e-01

Pos logit contributions
 Mom+0.014
 down+0.013
cream+0.013
rolley+0.013
 Dad+0.013
Neg logit contributions
ib-0.031
ub-0.030
emy-0.030
 villains-0.029
pill-0.029
Token with
Token ID351
Feature activation-6.82e-01

Pos logit contributions
 Mom+0.019
rolley+0.019
cream+0.019
nam+0.018
owner+0.018
Neg logit contributions
ib-0.038
ub-0.038
emy-0.037
 hops-0.037
 villains-0.036
Token joy
Token ID8716
Feature activation-3.24e-01

Pos logit contributions
ib+0.025
 hops+0.025
emy+0.025
ub+0.025
 villains+0.025
Neg logit contributions
rolley-0.016
 Mom-0.016
cream-0.015
 down-0.015
owner-0.015
Token car
Token ID1097
Feature activation+6.71e-01

Pos logit contributions
 Mom+0.003
nam+0.002
 down+0.002
 Dad+0.002
owner+0.002
Neg logit contributions
ub-0.004
ib-0.004
emy-0.004
pill-0.003
 hops-0.003
Token made
Token ID925
Feature activation+1.26e+00

Pos logit contributions
 Mom+0.012
 down+0.012
rolley+0.011
cream+0.011
owner+0.011
Neg logit contributions
emy-0.030
ib-0.029
ub-0.029
pill-0.028
 hops-0.028
Token him
Token ID683
Feature activation-8.23e-01

Pos logit contributions
 Mom+0.027
rolley+0.026
 down+0.025
cream+0.025
 Dad+0.024
Neg logit contributions
ib-0.050
ub-0.049
emy-0.049
 hops-0.049
 villains-0.048
Token even
Token ID772
Feature activation-2.18e+00

Pos logit contributions
emy+0.031
ib+0.031
ub+0.031
pill+0.031
 hops+0.030
Neg logit contributions
 Mom-0.020
 down-0.019
rolley-0.018
nam-0.017
cream-0.017
Token sad
Token ID6507
Feature activation-7.22e-01

Pos logit contributions
ib+0.066
ub+0.066
emy+0.065
 hops+0.065
 villains+0.064
Neg logit contributions
 Mom-0.067
rolley-0.065
 down-0.064
cream-0.063
 Dad-0.062
Tokender
Token ID1082
Feature activation-4.39e+00

Pos logit contributions
emy+0.035
ib+0.034
ub+0.034
pill+0.034
 hops+0.033
Neg logit contributions
 Mom-0.011
 down-0.010
 Dad-0.009
cream-0.008
 Mum-0.008
Token.
Token ID13
Feature activation+7.99e-01

Pos logit contributions
ib+0.176
ub+0.175
emy+0.174
 hops+0.172
 villains+0.170
Neg logit contributions
 Mom-0.093
rolley-0.087
 down-0.086
cream-0.084
 Dad-0.083
Token He
Token ID679
Feature activation+1.29e-01

Pos logit contributions
rolley+0.015
 down+0.015
nam+0.014
cream+0.014
 Mom+0.014
Neg logit contributions
emy-0.033
ib-0.033
ub-0.033
pill-0.032
 hops-0.032
Token asked
Token ID1965
Feature activation-5.36e-01

Pos logit contributions
 Mom+0.003
 down+0.002
cream+0.002
 Dad+0.002
rolley+0.002
Neg logit contributions
ib-0.005
ub-0.005
emy-0.005
pill-0.005
 hops-0.005
Token everyone
Token ID2506
Feature activation-3.68e-01

Pos logit contributions
ib+0.024
ub+0.024
emy+0.024
pill+0.024
 hops+0.024
Neg logit contributions
 Mom-0.009
 down-0.008
 Dad-0.008
rolley-0.008
 Mum-0.007
Token if
Token ID611
Feature activation-2.42e+00

Pos logit contributions
ib+0.017
emy+0.017
ub+0.016
pill+0.016
 weaknesses+0.015
Neg logit contributions
 down-0.008
 Mom-0.008
 up-0.008
 Dad-0.007
 Mum-0.006
Token the
Token ID262
Feature activation-9.36e-02

Pos logit contributions
ib+0.031
ub+0.031
emy+0.030
 hops+0.030
 villains+0.030
Neg logit contributions
 Mom-0.017
rolley-0.016
 down-0.016
cream-0.016
owner-0.015
Token village
Token ID7404
Feature activation-4.23e-01

Pos logit contributions
ub+0.003
ib+0.003
emy+0.003
 villains+0.003
 hops+0.003
Neg logit contributions
 Mom-0.003
 down-0.003
 Dad-0.003
 clerk-0.002
nam-0.002
Token back
Token ID736
Feature activation+1.78e-01

Pos logit contributions
ib+0.017
ub+0.017
 hops+0.016
emy+0.016
 villains+0.016
Neg logit contributions
 Mom-0.009
rolley-0.009
cream-0.008
 down-0.008
nam-0.008
Token to
Token ID284
Feature activation-6.44e-01

Pos logit contributions
 Mom+0.004
rolley+0.004
 down+0.004
cream+0.004
nam+0.004
Neg logit contributions
ib-0.007
ub-0.007
emy-0.007
 hops-0.007
 villains-0.007
Token its
Token ID663
Feature activation-7.29e-01

Pos logit contributions
emy+0.026
ib+0.026
ub+0.026
pill+0.026
 hops+0.025
Neg logit contributions
 Mom-0.015
 down-0.013
 Dad-0.013
 Mum-0.013
 Grand-0.013
Token former
Token ID1966
Feature activation-4.38e+00

Pos logit contributions
ub+0.025
ib+0.025
emy+0.024
pill+0.024
 villains+0.023
Neg logit contributions
 Mom-0.021
 down-0.019
rolley-0.019
 Dad-0.019
 Mum-0.019
Token glory
Token ID13476
Feature activation-1.08e-01

Pos logit contributions
ub+0.136
ib+0.135
emy+0.135
 villains+0.132
 hops+0.132
Neg logit contributions
 Mom-0.130
rolley-0.125
 down-0.123
cream-0.121
 Dad-0.120
Token!"
Token ID2474
Feature activation+4.45e-01

Pos logit contributions
ib+0.005
ub+0.004
 hops+0.004
emy+0.004
 villains+0.004
Neg logit contributions
 Mom-0.002
rolley-0.002
cream-0.002
 down-0.002
nam-0.002
Token (space)
Token ID220
Feature activation+1.06e+00

Pos logit contributions
nam+0.008
rolley+0.007
 Mom+0.007
belt+0.007
appointed+0.007
Neg logit contributions
ib-0.019
ub-0.019
pill-0.019
 hops-0.019
emy-0.018
Token (newline)
Token ID198
Feature activation+2.05e-02

Pos logit contributions
nam+0.028
 forehead+0.025
rolley+0.025
cream+0.024
 down+0.024
Neg logit contributions
ub-0.039
ib-0.039
emy-0.038
pill-0.037
 hops-0.037
Token (newline)
Token ID198
Feature activation-1.75e-01

Pos logit contributions
 down+0.000
nam+0.000
rolley+0.000
 Mom+0.000
cream+0.000
Neg logit contributions
emy-0.001
ub-0.001
ib-0.001
pill-0.001
 hops-0.001
Token to
Token ID284
Feature activation-5.67e-01

Pos logit contributions
ub+0.002
ib+0.002
 hops+0.002
emy+0.002
 villains+0.002
Neg logit contributions
 Mom-0.002
rolley-0.001
owner-0.001
nam-0.001
cream-0.001
Token be
Token ID307
Feature activation-1.02e+00

Pos logit contributions
ib+0.020
emy+0.020
ub+0.020
pill+0.019
 hops+0.019
Neg logit contributions
 Mom-0.016
 down-0.015
rolley-0.014
 Dad-0.014
nam-0.014
Token restored
Token ID15032
Feature activation-5.30e-01

Pos logit contributions
ib+0.037
emy+0.036
 hops+0.036
ub+0.036
 villains+0.036
Neg logit contributions
 Mom-0.025
rolley-0.025
owner-0.023
cream-0.023
 Mum-0.023
Token to
Token ID284
Feature activation+1.33e-01

Pos logit contributions
ib+0.022
ub+0.021
emy+0.021
 hops+0.021
 villains+0.021
Neg logit contributions
 Mom-0.011
rolley-0.010
cream-0.010
 down-0.010
owner-0.010
Token its
Token ID663
Feature activation-1.04e-01

Pos logit contributions
 Mom+0.003
 down+0.003
 Dad+0.003
nam+0.003
 Grand+0.003
Neg logit contributions
ub-0.005
emy-0.005
ib-0.005
pill-0.005
 hops-0.005
Token former
Token ID1966
Feature activation-4.36e+00

Pos logit contributions
ub+0.004
ib+0.004
emy+0.004
pill+0.003
 hops+0.003
Neg logit contributions
 Mom-0.003
 down-0.003
 Dad-0.003
 Mum-0.003
rolley-0.003
Token glory
Token ID13476
Feature activation+6.53e-02

Pos logit contributions
ub+0.140
emy+0.138
ib+0.138
 villains+0.136
pill+0.135
Neg logit contributions
 Mom-0.125
rolley-0.119
 down-0.119
cream-0.115
 Dad-0.115
Token."
Token ID526
Feature activation+4.58e-01

Pos logit contributions
 Mom+0.001
rolley+0.001
cream+0.001
owner+0.001
 down+0.001
Neg logit contributions
ib-0.003
ub-0.003
emy-0.003
 hops-0.003
 villains-0.003
Token (space)
Token ID220
Feature activation+8.73e-01

Pos logit contributions
 Mom+0.006
rolley+0.006
cream+0.006
nam+0.006
 down+0.006
Neg logit contributions
ib-0.022
ub-0.022
emy-0.021
 hops-0.021
pill-0.021
Token (newline)
Token ID198
Feature activation-1.34e-02

Pos logit contributions
nam+0.023
cream+0.022
rolley+0.022
 forehead+0.021
owner+0.021
Neg logit contributions
ub-0.032
ib-0.031
emy-0.031
 hops-0.030
pill-0.030
Token (newline)
Token ID198
Feature activation-1.72e-01

Pos logit contributions
ub+0.001
ib+0.001
emy+0.001
 hops+0.001
pill+0.001
Neg logit contributions
rolley-0.000
cream-0.000
 Mom-0.000
nam-0.000
 down-0.000
Token it
Token ID340
Feature activation-2.63e-01

Pos logit contributions
ub+0.003
 hops+0.003
ib+0.003
emy+0.003
 villains+0.003
Neg logit contributions
rolley-0.002
 Mom-0.002
cream-0.002
 down-0.002
nam-0.002
Token made
Token ID925
Feature activation+1.10e+00

Pos logit contributions
ib+0.011
ub+0.011
emy+0.011
 villains+0.011
 hops+0.011
Neg logit contributions
 Mom-0.005
 down-0.005
rolley-0.004
cream-0.004
 Dad-0.004
Token her
Token ID607
Feature activation+7.49e-01

Pos logit contributions
rolley+0.027
cream+0.026
nam+0.025
belt+0.025
owner+0.024
Neg logit contributions
emy-0.039
 hops-0.039
 villains-0.038
ib-0.038
ub-0.038
Token even
Token ID772
Feature activation-2.05e+00

Pos logit contributions
rolley+0.021
cream+0.021
owner+0.020
 clerk+0.020
person+0.020
Neg logit contributions
 hops-0.025
ub-0.023
 appearance-0.023
 villains-0.023
ib-0.023
Token sad
Token ID6507
Feature activation-7.93e-01

Pos logit contributions
 hops+0.055
ib+0.051
ub+0.051
 villains+0.051
emy+0.050
Neg logit contributions
rolley-0.071
 Mom-0.068
cream-0.068
aughter-0.067
person-0.066
Tokender
Token ID1082
Feature activation-4.35e+00

Pos logit contributions
emy+0.038
ib+0.038
ub+0.037
pill+0.037
 hops+0.036
Neg logit contributions
 Mom-0.013
 down-0.011
 Dad-0.011
 Mum-0.010
rolley-0.009
Token.
Token ID13
Feature activation+6.75e-01

Pos logit contributions
ib+0.185
ub+0.185
emy+0.184
 hops+0.182
 villains+0.180
Neg logit contributions
 Mom-0.081
rolley-0.076
 down-0.074
cream-0.073
nam-0.071
Token (newline)
Token ID198
Feature activation+8.55e-02

Pos logit contributions
rolley+0.018
cream+0.018
nam+0.018
 forehead+0.017
belt+0.016
Neg logit contributions
pill-0.023
emy-0.023
 hops-0.022
ib-0.022
 villains-0.022
Token (newline)
Token ID198
Feature activation-6.36e-02

Pos logit contributions
nam+0.003
cream+0.002
rolley+0.002
 forehead+0.002
 down+0.002
Neg logit contributions
emy-0.003
 hops-0.003
ub-0.003
ib-0.003
 villains-0.003
TokenSuddenly
Token ID38582
Feature activation-2.33e-02

Pos logit contributions
emy+0.002
ib+0.002
ub+0.002
 villains+0.002
 hops+0.002
Neg logit contributions
 Mom-0.002
rolley-0.001
 down-0.001
cream-0.001
nam-0.001
Token,
Token ID11
Feature activation+3.64e-01

Pos logit contributions
ib+0.001
ub+0.001
emy+0.001
 hops+0.001
 villains+0.001
Neg logit contributions
 Mom-0.001
 down-0.001
rolley-0.001
cream-0.001
 Dad-0.001
TokenThen
Token ID6423
Feature activation+3.33e-01

Pos logit contributions
 Mom+0.005
rolley+0.005
 down+0.004
 Dad+0.004
owner+0.004
Neg logit contributions
ib-0.006
ub-0.006
emy-0.006
 hops-0.006
 villains-0.005
Token something
Token ID1223
Feature activation-1.43e+00

Pos logit contributions
 Mom+0.008
 down+0.008
rolley+0.008
 Dad+0.007
cream+0.007
Neg logit contributions
ib-0.012
ub-0.012
emy-0.012
 hops-0.012
 villains-0.012
Token even
Token ID772
Feature activation-2.39e-02

Pos logit contributions
ib+0.048
ub+0.047
emy+0.047
 hops+0.046
 villains+0.045
Neg logit contributions
 Mom-0.041
 down-0.038
rolley-0.038
cream-0.038
 Dad-0.037
Token we
Token ID356
Feature activation+6.60e-01

Pos logit contributions
 hops+0.001
emy+0.001
ib+0.001
ub+0.001
 villains+0.001
Neg logit contributions
 Mom-0.001
rolley-0.001
 Dad-0.001
cream-0.001
owner-0.001
Tokenir
Token ID343
Feature activation-8.46e-01

Pos logit contributions
 Mom+0.017
 down+0.017
rolley+0.016
 Dad+0.016
cream+0.016
Neg logit contributions
ib-0.024
emy-0.024
ub-0.023
pill-0.023
 villains-0.023
Tokender
Token ID1082
Feature activation-4.33e+00

Pos logit contributions
ub+0.039
ib+0.038
emy+0.038
 villains+0.038
 hops+0.037
Neg logit contributions
 Mom-0.012
 down-0.012
cream-0.011
rolley-0.011
 Dad-0.011
Token happened
Token ID3022
Feature activation+3.47e-01

Pos logit contributions
ub+0.169
ib+0.169
emy+0.167
 villains+0.166
 hops+0.162
Neg logit contributions
 Mom-0.092
 down-0.090
cream-0.086
rolley-0.086
 Dad-0.084
Token.
Token ID13
Feature activation+3.90e-01

Pos logit contributions
 Mom+0.006
 down+0.006
rolley+0.006
cream+0.005
 Dad+0.005
Neg logit contributions
ib-0.015
ub-0.015
emy-0.015
 hops-0.015
 villains-0.015
Token The
Token ID383
Feature activation-2.32e-02

Pos logit contributions
 Mom+0.010
rolley+0.009
 down+0.009
cream+0.009
nam+0.009
Neg logit contributions
ib-0.014
ub-0.014
emy-0.014
 hops-0.014
 villains-0.014
Token animal
Token ID5044
Feature activation-1.12e+00

Pos logit contributions
ib+0.001
ub+0.001
emy+0.001
 hops+0.001
pill+0.001
Neg logit contributions
 Mom-0.001
 down-0.001
rolley-0.001
 Dad-0.001
nam-0.001
Token suddenly
Token ID6451
Feature activation-4.87e-01

Pos logit contributions
ib+0.050
ub+0.049
emy+0.048
pill+0.047
 villains+0.047
Neg logit contributions
 Mom-0.021
 down-0.019
 Dad-0.018
rolley-0.018
 Mum-0.018
Token back
Token ID736
Feature activation+5.85e-01

Pos logit contributions
 Mom+0.021
 down+0.019
 Dad+0.018
rolley+0.018
 Mum+0.018
Neg logit contributions
ub-0.047
emy-0.047
ib-0.047
 villains-0.046
 hops-0.046
Token the
Token ID262
Feature activation-2.37e-01

Pos logit contributions
 Mom+0.010
 down+0.010
rolley+0.009
cream+0.009
 Dad+0.009
Neg logit contributions
ub-0.026
ib-0.026
emy-0.026
 hops-0.025
 villains-0.025
Token village
Token ID7404
Feature activation-6.60e-01

Pos logit contributions
ub+0.009
 villains+0.009
emy+0.009
ib+0.009
 hops+0.008
Neg logit contributions
 Mom-0.006
 clerk-0.006
 Dad-0.005
owner-0.005
 Mum-0.005
Token to
Token ID284
Feature activation-5.98e-02

Pos logit contributions
ib+0.025
ub+0.025
emy+0.024
 hops+0.024
 villains+0.024
Neg logit contributions
 Mom-0.015
rolley-0.015
 down-0.014
cream-0.014
nam-0.014
Token its
Token ID663
Feature activation-3.82e-01

Pos logit contributions
emy+0.003
ib+0.003
ub+0.003
pill+0.003
 hops+0.003
Neg logit contributions
 Mom-0.001
 Dad-0.001
 down-0.001
 Mum-0.001
 Grand-0.001
Token former
Token ID1966
Feature activation-4.32e+00

Pos logit contributions
ub+0.015
ib+0.015
emy+0.014
pill+0.014
 hops+0.014
Neg logit contributions
 Mom-0.009
 down-0.008
 Dad-0.008
rolley-0.008
 Mum-0.008
Token state
Token ID1181
Feature activation-6.32e-01

Pos logit contributions
ub+0.143
emy+0.142
ib+0.140
 villains+0.140
pill+0.139
Neg logit contributions
 Mom-0.120
rolley-0.114
 down-0.113
 Mum-0.110
 Dad-0.110
Token.
Token ID13
Feature activation+8.41e-01

Pos logit contributions
ib+0.026
ub+0.026
emy+0.026
 hops+0.025
 villains+0.025
Neg logit contributions
 Mom-0.013
rolley-0.012
 down-0.012
cream-0.012
 Dad-0.011
Token With
Token ID2080
Feature activation-1.32e+00

Pos logit contributions
nam+0.016
 down+0.016
rolley+0.016
cream+0.016
 forehead+0.016
Neg logit contributions
ib-0.034
ub-0.034
emy-0.034
pill-0.034
 hops-0.033
Token the
Token ID262
Feature activation-1.18e-02

Pos logit contributions
ub+0.059
ib+0.059
emy+0.058
 villains+0.058
pill+0.057
Neg logit contributions
 Mom-0.024
 Dad-0.021
 down-0.020
 Mum-0.020
nam-0.020
Token help
Token ID1037
Feature activation+6.72e-01

Pos logit contributions
emy+0.000
ub+0.000
ib+0.000
 villains+0.000
 hops+0.000
Neg logit contributions
 Mom-0.000
 clerk-0.000
 Mum-0.000
 Dad-0.000
owner-0.000
Token (newline)
Token ID198
Feature activation-6.59e-01

Pos logit contributions
emy+0.021
ib+0.020
ub+0.019
 hops+0.018
 villains+0.017
Neg logit contributions
nam-0.018
 forehead-0.017
rolley-0.016
cream-0.016
person-0.016
TokenJen
Token ID44875
Feature activation-2.73e-01

Pos logit contributions
ib+0.023
emy+0.023
ub+0.023
 hops+0.022
 villains+0.022
Neg logit contributions
 Mom-0.017
rolley-0.017
 down-0.017
nam-0.017
cream-0.017
Token feels
Token ID5300
Feature activation-1.49e+00

Pos logit contributions
ub+0.011
emy+0.011
ib+0.011
 villains+0.011
 weaknesses+0.011
Neg logit contributions
 Mom-0.006
 down-0.005
 Dad-0.005
cream-0.005
rolley-0.005
Token even
Token ID772
Feature activation-1.59e+00

Pos logit contributions
 hops+0.052
 villains+0.049
emy+0.049
ib+0.048
ub+0.047
Neg logit contributions
rolley-0.042
cream-0.039
owner-0.037
aughter-0.037
cakes-0.037
Token sad
Token ID6507
Feature activation-1.05e+00

Pos logit contributions
 hops+0.045
ib+0.045
emy+0.044
ub+0.044
 villains+0.043
Neg logit contributions
 Mom-0.052
rolley-0.051
cream-0.049
 Dad-0.048
 down-0.048
Tokender
Token ID1082
Feature activation-4.25e+00

Pos logit contributions
pill+0.059
emy+0.059
 weaknesses+0.057
ib+0.057
 hops+0.056
Neg logit contributions
 Mom-0.013
 down-0.012
 Dad-0.009
 up-0.008
.-0.008
Token.
Token ID13
Feature activation-1.05e+00

Pos logit contributions
ub+0.194
ib+0.192
 tweet+0.188
 hops+0.188
emy+0.188
Neg logit contributions
 Mom-0.071
 down-0.064
 Dad-0.060
rolley-0.058
cream-0.057
Token She
Token ID1375
Feature activation-7.71e-02

Pos logit contributions
 villains+0.048
 tale+0.048
ub+0.047
emy+0.046
 hops+0.046
Neg logit contributions
 Mom-0.024
 Dad-0.021
 Mum-0.020
 down-0.018
 Grand-0.018
Token thinks
Token ID6834
Feature activation-4.60e-01

Pos logit contributions
ub+0.003
ib+0.003
emy+0.003
 appearances+0.003
pill+0.003
Neg logit contributions
 Mom-0.002
 down-0.002
 started-0.002
 Dad-0.002
cream-0.002
Token Tim
Token ID5045
Feature activation+3.24e-01

Pos logit contributions
ub+0.022
emy+0.022
ib+0.021
pill+0.021
Like+0.021
Neg logit contributions
 Mom-0.010
 Dad-0.009
 Mum-0.008
 mom-0.008
 Grand-0.007
Token is
Token ID318
Feature activation-3.23e-01

Pos logit contributions
 Mom+0.009
 down+0.009
cream+0.009
rolley+0.008
 up+0.008
Neg logit contributions
ub-0.011
ib-0.011
emy-0.011
 villains-0.011
pill-0.010
Token girl
Token ID2576
Feature activation+2.77e-01

Pos logit contributions
cream+0.002
rolley+0.002
 Mom+0.002
nam+0.002
 down+0.002
Neg logit contributions
 hops-0.003
ub-0.003
 villains-0.003
ib-0.003
 appearance-0.003
Token felt
Token ID2936
Feature activation-9.33e-01

Pos logit contributions
 Mom+0.006
rolley+0.006
nam+0.005
cream+0.005
 down+0.005
Neg logit contributions
ub-0.011
 hops-0.011
ib-0.011
emy-0.011
 villains-0.010
Token even
Token ID772
Feature activation-1.04e+00

Pos logit contributions
 hops+0.033
ub+0.033
 villains+0.032
ib+0.032
pill+0.031
Neg logit contributions
rolley-0.023
cream-0.023
aughter-0.022
owner-0.022
person-0.021
Token p
Token ID279
Feature activation-1.29e+00

Pos logit contributions
 hops+0.026
ub+0.024
ib+0.024
 villains+0.024
emy+0.023
Neg logit contributions
rolley-0.037
 Mom-0.037
cream-0.036
aughter-0.036
person-0.036
Tokenrou
Token ID472
Feature activation-1.05e+00

Pos logit contributions
ub+0.039
ib+0.038
emy+0.037
 hops+0.037
pill+0.036
Neg logit contributions
 Mom-0.041
rolley-0.039
 down-0.039
nam-0.038
owner-0.038
Tokender
Token ID1082
Feature activation-4.24e+00

Pos logit contributions
ib+0.018
ub+0.017
emy+0.017
 hops+0.017
 villains+0.016
Neg logit contributions
 Mom-0.047
rolley-0.045
 down-0.045
nam-0.045
cream-0.045
Token.
Token ID13
Feature activation+9.89e-01

Pos logit contributions
ib+0.173
ub+0.172
emy+0.172
 villains+0.170
 hops+0.169
Neg logit contributions
 Mom-0.086
 down-0.082
rolley-0.079
cream-0.078
 Dad-0.076
Token She
Token ID1375
Feature activation+4.56e-01

Pos logit contributions
rolley+0.024
cream+0.024
 down+0.024
nam+0.024
 Mom+0.023
Neg logit contributions
ub-0.035
ib-0.035
emy-0.035
pill-0.034
 hops-0.034
Token was
Token ID373
Feature activation+2.94e-01

Pos logit contributions
 Mom+0.009
rolley+0.008
 down+0.008
cream+0.008
 Dad+0.008
Neg logit contributions
ib-0.019
ub-0.019
emy-0.019
 hops-0.019
 villains-0.019
Token happy
Token ID3772
Feature activation-4.71e-01

Pos logit contributions
rolley+0.006
cream+0.006
aughter+0.006
owner+0.006
 Mom+0.006
Neg logit contributions
ub-0.011
 hops-0.011
 villains-0.011
ib-0.011
pill-0.011
Token that
Token ID326
Feature activation+1.32e+00

Pos logit contributions
emy+0.020
ib+0.020
pill+0.019
 villains+0.019
ub+0.019
Neg logit contributions
 Mom-0.012
 down-0.010
 Dad-0.010
 Mum-0.010
 Grand-0.009
Token!"
Token ID2474
Feature activation+8.92e-02

Pos logit contributions
ib+0.019
emy+0.019
ub+0.019
 hops+0.019
 villains+0.019
Neg logit contributions
 Mom-0.008
 down-0.008
rolley-0.007
 Dad-0.007
cream-0.007
Token Benny
Token ID44275
Feature activation-1.26e+00

Pos logit contributions
rolley+0.002
nam+0.002
aughter+0.002
owner+0.002
 down+0.002
Neg logit contributions
emy-0.003
ib-0.003
ub-0.003
pill-0.003
 hops-0.003
Token felt
Token ID2936
Feature activation-1.20e+00

Pos logit contributions
ib+0.050
emy+0.050
 hops+0.049
ub+0.049
 villains+0.048
Neg logit contributions
rolley-0.027
 Mom-0.026
nam-0.026
cream-0.025
 down-0.025
Token even
Token ID772
Feature activation-1.44e+00

Pos logit contributions
 hops+0.043
emy+0.042
ub+0.042
ib+0.042
 villains+0.042
Neg logit contributions
rolley-0.031
cream-0.029
 Mom-0.028
owner-0.027
aughter-0.027
Token sad
Token ID6507
Feature activation-9.18e-01

Pos logit contributions
 hops+0.042
ib+0.040
emy+0.040
ub+0.039
 villains+0.038
Neg logit contributions
rolley-0.047
 Mom-0.046
cream-0.045
aughter-0.044
 Dad-0.044
Tokender
Token ID1082
Feature activation-4.21e+00

Pos logit contributions
pill+0.045
emy+0.043
ib+0.043
 villains+0.042
 weaknesses+0.042
Neg logit contributions
 Mom-0.018
 down-0.018
 Dad-0.015
 up-0.014
 mom-0.014
Token and
Token ID290
Feature activation-2.35e-01

Pos logit contributions
ub+0.161
ib+0.161
 hops+0.159
 tweet+0.158
 villains+0.158
Neg logit contributions
 Mom-0.098
 down-0.092
 Dad-0.088
 Mum-0.086
cream-0.085
Token started
Token ID2067
Feature activation-5.61e-01

Pos logit contributions
 hops+0.009
emy+0.008
 villains+0.008
ib+0.008
ub+0.008
Neg logit contributions
rolley-0.006
aughter-0.005
owner-0.005
cream-0.005
person-0.005
Token to
Token ID284
Feature activation-3.67e-01

Pos logit contributions
ib+0.024
ub+0.024
emy+0.023
pill+0.023
 villains+0.023
Neg logit contributions
 Mom-0.012
 down-0.012
 Dad-0.010
 up-0.010
 Mum-0.010
Token cry
Token ID3960
Feature activation-4.49e-01

Pos logit contributions
pill+0.015
ib+0.015
ub+0.015
emy+0.015
 villains+0.014
Neg logit contributions
 Mom-0.009
 down-0.009
 Dad-0.008
 Mum-0.008
 forehead-0.008
Token.
Token ID13
Feature activation+5.34e-01

Pos logit contributions
ib+0.021
emy+0.021
ub+0.021
 hops+0.021
pill+0.020
Neg logit contributions
 Mom-0.007
 down-0.006
rolley-0.006
 Dad-0.006
cream-0.006
Token little
Token ID1310
Feature activation+3.66e-01

Pos logit contributions
ub+0.004
ib+0.004
pill+0.004
emy+0.004
 villains+0.004
Neg logit contributions
 Mom-0.004
 Dad-0.003
 down-0.003
 mom-0.003
 Mum-0.003
Token boy
Token ID2933
Feature activation+8.33e-01

Pos logit contributions
 Mom+0.011
 down+0.010
 Dad+0.010
 Mum+0.009
 mom+0.009
Neg logit contributions
ub-0.013
 villains-0.012
pill-0.012
emy-0.012
ib-0.012
Token felt
Token ID2936
Feature activation-1.06e+00

Pos logit contributions
 Mom+0.014
 down+0.013
 started+0.012
cream+0.012
 Dad+0.012
Neg logit contributions
ib-0.038
ub-0.038
emy-0.037
pill-0.037
 weaknesses-0.037
Token even
Token ID772
Feature activation-1.13e+00

Pos logit contributions
ib+0.045
ub+0.045
emy+0.045
 villains+0.043
pill+0.043
Neg logit contributions
 Mom-0.021
 down-0.020
 Dad-0.019
nam-0.018
cream-0.018
Token sad
Token ID6507
Feature activation-7.49e-01

Pos logit contributions
 hops+0.033
ib+0.033
emy+0.033
ub+0.032
 villains+0.032
Neg logit contributions
 Mom-0.036
rolley-0.035
cream-0.034
 Dad-0.034
owner-0.033
Tokender
Token ID1082
Feature activation-4.21e+00

Pos logit contributions
pill+0.037
emy+0.036
ib+0.036
 villains+0.035
ub+0.035
Neg logit contributions
 down-0.013
 Mom-0.013
 up-0.010
 Dad-0.010
 Mum-0.009
Token.
Token ID13
Feature activation-3.79e-02

Pos logit contributions
ub+0.175
ib+0.173
 tweet+0.171
 villains+0.169
 weaknesses+0.168
Neg logit contributions
 down-0.083
 Mom-0.083
 Dad-0.074
he-0.073
 up-0.073
Token He
Token ID679
Feature activation-5.90e-02

Pos logit contributions
ub+0.002
ib+0.002
emy+0.002
 hops+0.002
 villains+0.002
Neg logit contributions
 Mom-0.001
 Dad-0.001
 down-0.001
rolley-0.001
 Mum-0.001
Token started
Token ID2067
Feature activation-7.02e-01

Pos logit contributions
ib+0.003
ub+0.003
emy+0.002
pill+0.002
 appearances+0.002
Neg logit contributions
 Mom-0.001
 down-0.001
cream-0.001
 Dad-0.001
 started-0.001
Token to
Token ID284
Feature activation-5.74e-01

Pos logit contributions
ib+0.034
ub+0.033
emy+0.033
pill+0.032
 villains+0.032
Neg logit contributions
 down-0.013
 Mom-0.012
 up-0.011
 Dad-0.010
 -0.010
Token cry
Token ID3960
Feature activation-5.22e-01

Pos logit contributions
ib+0.024
pill+0.024
ub+0.024
emy+0.024
 villains+0.022
Neg logit contributions
 Mom-0.013
 down-0.013
 forehead-0.012
 up-0.012
nam-0.011
Token,
Token ID11
Feature activation-2.31e-01

Pos logit contributions
ib+0.018
emy+0.018
ub+0.018
 hops+0.018
 villains+0.018
Neg logit contributions
 Mom-0.008
 down-0.007
rolley-0.007
cream-0.007
 Dad-0.007
Token leaving
Token ID4305
Feature activation+1.22e-01

Pos logit contributions
ib+0.009
emy+0.009
ub+0.008
 villains+0.008
pill+0.008
Neg logit contributions
 Mom-0.006
 down-0.006
 Dad-0.006
 Mum-0.005
rolley-0.005
Token it
Token ID340
Feature activation-1.52e+00

Pos logit contributions
 Mom+0.002
rolley+0.002
cream+0.002
nam+0.002
 down+0.002
Neg logit contributions
ib-0.005
ub-0.005
emy-0.005
 hops-0.005
 villains-0.005
Token to
Token ID284
Feature activation+1.21e-02

Pos logit contributions
emy+0.065
ib+0.064
ub+0.063
pill+0.062
 villains+0.062
Neg logit contributions
 down-0.031
 Mom-0.029
 up-0.027
rolley-0.026
owner-0.026
Token its
Token ID663
Feature activation+3.08e-02

Pos logit contributions
 Mom+0.000
 down+0.000
 Dad+0.000
 Mum+0.000
rolley+0.000
Neg logit contributions
ib-0.000
ub-0.000
emy-0.000
pill-0.000
 hops-0.000
Token former
Token ID1966
Feature activation-4.14e+00

Pos logit contributions
 Mom+0.001
 down+0.001
rolley+0.001
 Dad+0.001
 Mum+0.001
Neg logit contributions
ib-0.001
ub-0.001
emy-0.001
pill-0.001
 hops-0.001
Token state
Token ID1181
Feature activation-6.69e-01

Pos logit contributions
ub+0.138
emy+0.137
pill+0.136
 villains+0.136
ib+0.135
Neg logit contributions
 Mom-0.113
rolley-0.107
 down-0.107
cream-0.103
 Mum-0.103
Token.
Token ID13
Feature activation+1.00e+00

Pos logit contributions
ib+0.029
ub+0.029
emy+0.029
 hops+0.028
 villains+0.028
Neg logit contributions
 Mom-0.012
rolley-0.011
 down-0.011
cream-0.011
nam-0.010
Token (space)
Token ID220
Feature activation+1.20e+00

Pos logit contributions
cream+0.017
nam+0.017
 forehead+0.017
 cream+0.017
rolley+0.017
Neg logit contributions
ub-0.043
ib-0.043
pill-0.042
emy-0.042
 hops-0.042
Token (newline)
Token ID198
Feature activation+3.06e-01

Pos logit contributions
nam+0.032
 forehead+0.031
rolley+0.030
cream+0.029
owner+0.028
Neg logit contributions
ub-0.043
ib-0.042
emy-0.041
pill-0.041
 hops-0.040
Token (newline)
Token ID198
Feature activation+6.24e-02

Pos logit contributions
rolley+0.007
 down+0.007
nam+0.007
cream+0.007
 Mom+0.007
Neg logit contributions
emy-0.012
ub-0.012
ib-0.012
 hops-0.011
pill-0.011
Token."
Token ID526
Feature activation+2.89e-01

Pos logit contributions
ib+0.026
emy+0.026
ub+0.026
 hops+0.026
pill+0.026
Neg logit contributions
 Mom-0.013
 down-0.012
rolley-0.011
 Dad-0.011
cream-0.011
Token Lily
Token ID20037
Feature activation-2.00e-01

Pos logit contributions
nam+0.008
rolley+0.007
cream+0.007
amed+0.007
 down+0.007
Neg logit contributions
ib-0.010
ub-0.010
emy-0.009
 villains-0.009
 hops-0.009
Token felt
Token ID2936
Feature activation-1.15e+00

Pos logit contributions
ub+0.008
ib+0.008
 hops+0.008
emy+0.008
 villains+0.007
Neg logit contributions
rolley-0.004
 Mom-0.004
nam-0.004
cream-0.004
 down-0.004
Token even
Token ID772
Feature activation-1.31e+00

Pos logit contributions
 hops+0.040
 villains+0.039
emy+0.039
ub+0.039
ib+0.039
Neg logit contributions
rolley-0.031
cream-0.029
 Mom-0.028
owner-0.028
aughter-0.027
Token sad
Token ID6507
Feature activation-8.73e-01

Pos logit contributions
 hops+0.038
ib+0.037
emy+0.037
 villains+0.036
ub+0.036
Neg logit contributions
rolley-0.042
 Mom-0.042
cream-0.040
 Dad-0.039
 Mum-0.039
Tokender
Token ID1082
Feature activation-4.13e+00

Pos logit contributions
pill+0.041
emy+0.040
ib+0.040
 hops+0.039
 weaknesses+0.039
Neg logit contributions
 Mom-0.018
 down-0.018
 Dad-0.015
 up-0.014
 mom-0.014
Token than
Token ID621
Feature activation+1.05e+00

Pos logit contributions
ib+0.155
ub+0.154
 hops+0.151
emy+0.151
 tweet+0.150
Neg logit contributions
 Mom-0.101
 down-0.096
 Dad-0.091
 Mum-0.089
rolley-0.089
Token before
Token ID878
Feature activation-1.16e+00

Pos logit contributions
 Mom+0.023
rolley+0.022
 down+0.021
cream+0.021
 Dad+0.021
Neg logit contributions
ib-0.041
ub-0.041
emy-0.041
 hops-0.041
 villains-0.040
Token.
Token ID13
Feature activation-2.98e-02

Pos logit contributions
ub+0.053
ib+0.053
emy+0.052
pill+0.051
 villains+0.051
Neg logit contributions
 Mom-0.021
 down-0.018
 Dad-0.018
 Mum-0.017
 Grand-0.016
Token (newline)
Token ID198
Feature activation+1.12e-01

Pos logit contributions
ib+0.001
ub+0.001
emy+0.001
pill+0.001
 hops+0.001
Neg logit contributions
rolley-0.001
cream-0.001
nam-0.001
 forehead-0.001
owner-0.001
Token (newline)
Token ID198
Feature activation-4.28e-02

Pos logit contributions
nam+0.003
 forehead+0.003
rolley+0.003
cream+0.003
 down+0.003
Neg logit contributions
emy-0.004
ub-0.004
ib-0.004
pill-0.003
 hops-0.003
Token.
Token ID13
Feature activation+1.17e+00

Pos logit contributions
 Mom+0.004
rolley+0.004
cream+0.004
nam+0.004
owner+0.004
Neg logit contributions
ib-0.007
ub-0.007
emy-0.007
 hops-0.007
 villains-0.007
Token Red
Token ID2297
Feature activation+1.84e-01

Pos logit contributions
rolley+0.033
amed+0.033
 forehead+0.032
aughter+0.032
 cream+0.032
Neg logit contributions
pill-0.034
emy-0.034
ub-0.034
 villains-0.033
ib-0.033
Token felt
Token ID2936
Feature activation-9.74e-01

Pos logit contributions
 Mom+0.004
rolley+0.003
 down+0.003
cream+0.003
owner+0.003
Neg logit contributions
ib-0.008
ub-0.008
emy-0.008
 hops-0.007
 villains-0.007
Token even
Token ID772
Feature activation-9.96e-01

Pos logit contributions
 hops+0.034
ib+0.034
emy+0.034
ub+0.034
 villains+0.034
Neg logit contributions
rolley-0.025
 Mom-0.024
cream-0.023
owner-0.023
 down-0.022
Token sad
Token ID6507
Feature activation-6.85e-01

Pos logit contributions
 hops+0.026
emy+0.024
ib+0.024
 villains+0.024
ub+0.023
Neg logit contributions
rolley-0.035
 Mom-0.034
aughter-0.033
cream-0.033
 Dad-0.033
Tokender
Token ID1082
Feature activation-4.13e+00

Pos logit contributions
pill+0.031
emy+0.031
ib+0.030
 tweet+0.030
 hops+0.030
Neg logit contributions
 down-0.015
 Mom-0.014
 up-0.013
 Dad-0.012
 Grand-0.011
Token.
Token ID13
Feature activation+8.41e-01

Pos logit contributions
ub+0.157
ib+0.157
 tweet+0.155
 villains+0.152
emy+0.151
Neg logit contributions
 Mom-0.096
 down-0.094
 Dad-0.087
he-0.085
cream-0.085
Token (newline)
Token ID198
Feature activation+2.35e-01

Pos logit contributions
rolley+0.024
 forehead+0.022
nam+0.022
 cream+0.022
aughter+0.022
Neg logit contributions
emy-0.027
ub-0.026
ib-0.025
 villains-0.025
pill-0.025
Token (newline)
Token ID198
Feature activation+3.88e-02

Pos logit contributions
nam+0.007
 forehead+0.006
rolley+0.006
 down+0.006
cream+0.006
Neg logit contributions
emy-0.008
ub-0.008
ib-0.008
 villains-0.008
 hops-0.007
TokenBut
Token ID1537
Feature activation+1.16e+00

Pos logit contributions
 Mom+0.001
 down+0.001
rolley+0.001
 Dad+0.001
 Mum+0.001
Neg logit contributions
emy-0.001
ub-0.001
ib-0.001
 villains-0.001
 hops-0.001
Token then
Token ID788
Feature activation+1.28e-01

Pos logit contributions
rolley+0.033
aughter+0.030
cakes+0.030
cream+0.029
he+0.028
Neg logit contributions
emy-0.038
ub-0.037
 villains-0.037
ib-0.036
 hops-0.036
Token.
Token ID13
Feature activation+4.38e-01

Pos logit contributions
ib+0.034
emy+0.034
ub+0.033
 hops+0.033
 villains+0.032
Neg logit contributions
 Mom-0.016
 down-0.016
rolley-0.015
cream-0.015
 Dad-0.014
Token Tim
Token ID5045
Feature activation+3.82e-02

Pos logit contributions
rolley+0.010
nam+0.010
 down+0.010
 forehead+0.009
owner+0.009
Neg logit contributions
ib-0.017
emy-0.016
ub-0.016
pill-0.016
 hops-0.016
Token felt
Token ID2936
Feature activation-9.39e-01

Pos logit contributions
 Mom+0.001
 down+0.001
rolley+0.001
cream+0.001
 Dad+0.001
Neg logit contributions
ib-0.002
emy-0.002
ub-0.002
 villains-0.002
pill-0.002
Token even
Token ID772
Feature activation-1.10e+00

Pos logit contributions
 hops+0.035
ub+0.034
ib+0.034
 villains+0.034
emy+0.033
Neg logit contributions
rolley-0.023
cream-0.021
 Mom-0.021
owner-0.021
aughter-0.020
Token sad
Token ID6507
Feature activation-6.37e-01

Pos logit contributions
 hops+0.030
ib+0.028
ub+0.028
emy+0.028
 villains+0.028
Neg logit contributions
rolley-0.038
 Mom-0.038
cream-0.036
 Dad-0.036
aughter-0.036
Tokender
Token ID1082
Feature activation-4.12e+00

Pos logit contributions
emy+0.028
pill+0.028
ib+0.028
 villains+0.027
 weaknesses+0.027
Neg logit contributions
 down-0.014
 Mom-0.014
 up-0.012
 Dad-0.011
 Mum-0.011
Token now
Token ID783
Feature activation-7.44e-02

Pos logit contributions
ub+0.167
ib+0.166
 villains+0.164
 tweet+0.163
emy+0.163
Neg logit contributions
 Mom-0.082
 down-0.082
 Dad-0.073
cream-0.072
 up-0.072
Token,
Token ID11
Feature activation-1.75e-02

Pos logit contributions
ib+0.003
emy+0.003
ub+0.003
pill+0.003
 villains+0.003
Neg logit contributions
 Mom-0.002
 down-0.002
 Dad-0.001
 Mum-0.001
 up-0.001
Token and
Token ID290
Feature activation+2.83e-01

Pos logit contributions
ib+0.001
emy+0.001
ub+0.001
pill+0.001
 villains+0.001
Neg logit contributions
 Mom-0.000
 down-0.000
 Dad-0.000
nam-0.000
 Mum-0.000
Token he
Token ID339
Feature activation+1.61e-01

Pos logit contributions
rolley+0.007
 Mom+0.007
cream+0.006
owner+0.006
 down+0.006
Neg logit contributions
 hops-0.010
emy-0.010
ub-0.010
ib-0.010
 villains-0.010
Token cried
Token ID16896
Feature activation-9.82e-01

Pos logit contributions
 Mom+0.004
 down+0.003
 Dad+0.003
rolley+0.003
cream+0.003
Neg logit contributions
ib-0.007
emy-0.006
ub-0.006
pill-0.006
 villains-0.006
Token.
Token ID13
Feature activation+9.25e-01

Pos logit contributions
 Mom+0.021
rolley+0.021
cream+0.020
 down+0.019
owner+0.019
Neg logit contributions
ib-0.038
ub-0.038
emy-0.037
 hops-0.037
 villains-0.037
Token Lily
Token ID20037
Feature activation+2.27e-01

Pos logit contributions
rolley+0.025
nam+0.024
cream+0.024
 cream+0.023
 forehead+0.023
Neg logit contributions
emy-0.032
ib-0.032
pill-0.032
ub-0.031
 hops-0.030
Token felt
Token ID2936
Feature activation-1.13e+00

Pos logit contributions
 Mom+0.004
 down+0.004
rolley+0.004
cream+0.004
owner+0.004
Neg logit contributions
ib-0.010
ub-0.010
emy-0.010
pill-0.010
 villains-0.010
Token even
Token ID772
Feature activation-1.21e+00

Pos logit contributions
ib+0.045
ub+0.045
emy+0.045
 hops+0.044
 villains+0.043
Neg logit contributions
 Mom-0.024
 down-0.023
rolley-0.022
cream-0.022
 Dad-0.022
Token sad
Token ID6507
Feature activation-7.76e-01

Pos logit contributions
 hops+0.035
ib+0.035
emy+0.035
ub+0.035
 villains+0.034
Neg logit contributions
 Mom-0.038
rolley-0.038
cream-0.037
 Dad-0.036
 down-0.036
Tokender
Token ID1082
Feature activation-4.07e+00

Pos logit contributions
pill+0.037
emy+0.036
ib+0.035
 villains+0.035
 weaknesses+0.034
Neg logit contributions
 down-0.016
 Mom-0.016
 up-0.014
 Dad-0.012
 tightly-0.012
Token than
Token ID621
Feature activation+1.11e+00

Pos logit contributions
ub+0.163
ib+0.162
 tweet+0.160
 villains+0.158
 weaknesses+0.157
Neg logit contributions
 Mom-0.087
 down-0.085
he-0.077
 Dad-0.076
 up-0.075
Token before
Token ID878
Feature activation-1.10e+00

Pos logit contributions
 Mom+0.023
nam+0.022
 down+0.020
 Dad+0.020
 Mum+0.019
Neg logit contributions
ib-0.047
emy-0.047
ub-0.047
 hops-0.046
pill-0.046
Token.
Token ID13
Feature activation+4.93e-01

Pos logit contributions
ub+0.056
ib+0.055
emy+0.054
 villains+0.054
pill+0.053
Neg logit contributions
 Mom-0.015
 down-0.013
 Dad-0.012
 Mum-0.011
 Grand-0.011
Token From
Token ID3574
Feature activation-1.47e+00

Pos logit contributions
rolley+0.013
cream+0.012
nam+0.012
 forehead+0.012
owner+0.012
Neg logit contributions
ib-0.017
emy-0.017
ub-0.017
pill-0.017
 hops-0.017
Token that
Token ID326
Feature activation+6.85e-01

Pos logit contributions
emy+0.045
 hops+0.044
ib+0.043
ub+0.042
 villains+0.042
Neg logit contributions
rolley-0.044
 Mom-0.042
nam-0.042
owner-0.041
cream-0.041
Tokent
Token ID83
Feature activation-2.27e-01

Pos logit contributions
ub+0.015
pill+0.015
ib+0.014
emy+0.013
 hops+0.012
Neg logit contributions
 forehead-0.019
nam-0.018
rolley-0.017
 doorway-0.017
amed-0.017
Token hurt
Token ID5938
Feature activation+7.45e-02

Pos logit contributions
ib+0.008
ub+0.007
emy+0.007
 hops+0.007
pill+0.007
Neg logit contributions
 Mom-0.007
 down-0.006
rolley-0.006
 Dad-0.006
cream-0.006
Token you
Token ID345
Feature activation-8.62e-01

Pos logit contributions
 Mom+0.002
rolley+0.002
 down+0.002
cream+0.002
 Dad+0.002
Neg logit contributions
ib-0.003
ub-0.003
emy-0.003
 hops-0.003
 villains-0.003
Token.
Token ID13
Feature activation-1.45e-02

Pos logit contributions
ib+0.037
ub+0.037
emy+0.036
 hops+0.036
 villains+0.036
Neg logit contributions
 Mom-0.016
rolley-0.015
 down-0.015
cream-0.014
 Dad-0.014
Tokenâ
Token ID22940
Feature activation+5.73e-01

Pos logit contributions
ib+0.000
ub+0.000
emy+0.000
 hops+0.000
pill+0.000
Neg logit contributions
 Mom-0.000
rolley-0.000
 down-0.000
cream-0.000
nam-0.000
Token€
Token ID26391
Feature activation+2.38e+00

Pos logit contributions
 Mom+0.013
nam+0.012
rolley+0.011
 Dad+0.011
 Mum+0.011
Neg logit contributions
ub-0.023
ib-0.023
 hops-0.023
emy-0.022
 tweet-0.022
Token\n
Token ID198
Feature activation+1.08e-01

Pos logit contributions
rolley+0.051
 Mom+0.049
nam+0.049
cream+0.048
owner+0.047
Neg logit contributions
ib-0.094
ub-0.094
 hops-0.094
emy-0.094
pill-0.091
Token\n
Token ID198
Feature activation-3.79e-03

Pos logit contributions
nam+0.003
rolley+0.003
 down+0.003
cream+0.003
 forehead+0.003
Neg logit contributions
emy-0.004
ub-0.004
ib-0.004
 hops-0.004
pill-0.004
TokenThe
Token ID464
Feature activation-2.70e-01

Pos logit contributions
ib+0.000
ub+0.000
 hops+0.000
 appearance+0.000
emy+0.000
Neg logit contributions
 Mom-0.000
rolley-0.000
cream-0.000
owner-0.000
 down-0.000
Token rabbit
Token ID22746
Feature activation-7.84e-01

Pos logit contributions
ub+0.009
ib+0.009
pill+0.008
 villains+0.008
emy+0.008
Neg logit contributions
 Mom-0.008
 down-0.008
owner-0.008
 forehead-0.008
 Mum-0.008
Token smiled
Token ID13541
Feature activation+6.30e-01

Pos logit contributions
ib+0.034
ub+0.034
emy+0.033
 villains+0.033
 weaknesses+0.033
Neg logit contributions
 Mom-0.015
 down-0.013
owner-0.013
cream-0.013
 Dad-0.013
Token to
Token ID284
Feature activation-5.88e-01

Pos logit contributions
ib+0.003
 hops+0.003
emy+0.003
ub+0.003
 villains+0.002
Neg logit contributions
rolley-0.002
 Mom-0.002
owner-0.002
cream-0.002
 down-0.002
Token take
Token ID1011
Feature activation+6.12e-02

Pos logit contributions
pill+0.022
emy+0.022
ib+0.022
ub+0.022
 hops+0.021
Neg logit contributions
 Mom-0.015
 down-0.013
 Mum-0.013
 Dad-0.013
 mom-0.013
Token care
Token ID1337
Feature activation+1.21e+00

Pos logit contributions
 Mom+0.001
rolley+0.001
cream+0.001
 down+0.001
owner+0.001
Neg logit contributions
ib-0.002
ub-0.002
emy-0.002
 hops-0.002
 villains-0.002
Token of
Token ID286
Feature activation-4.25e-01

Pos logit contributions
 Mom+0.031
rolley+0.030
cream+0.030
nam+0.029
 down+0.029
Neg logit contributions
ub-0.043
ib-0.043
emy-0.041
 hops-0.041
 villains-0.040
Token your
Token ID534
Feature activation+6.33e-01

Pos logit contributions
ib+0.016
ub+0.016
emy+0.016
 hops+0.016
 villains+0.016
Neg logit contributions
 Mom-0.010
rolley-0.009
 down-0.009
cream-0.009
owner-0.009
Token hair
Token ID4190
Feature activation+2.10e+00

Pos logit contributions
rolley+0.017
cream+0.017
 Mom+0.016
owner+0.016
 down+0.016
Neg logit contributions
ib-0.022
 hops-0.021
emy-0.021
ub-0.021
 appearance-0.020
Token.
Token ID13
Feature activation-1.32e+00

Pos logit contributions
rolley+0.047
 Mom+0.045
owner+0.044
cream+0.044
nam+0.043
Neg logit contributions
ib-0.080
ub-0.080
 hops-0.075
 appearance-0.075
 villains-0.075
Token And
Token ID843
Feature activation-4.88e-01

Pos logit contributions
ub+0.048
emy+0.048
 hops+0.047
ib+0.047
 villains+0.046
Neg logit contributions
 Mom-0.036
 Dad-0.032
 Mum-0.031
 Grand-0.031
cream-0.030
Token you
Token ID345
Feature activation-7.21e-02

Pos logit contributions
emy+0.020
pill+0.020
ib+0.019
ub+0.019
 villains+0.019
Neg logit contributions
 Mom-0.013
 Dad-0.011
 down-0.011
 Mum-0.011
nam-0.011
Token are
Token ID389
Feature activation-2.24e-01

Pos logit contributions
emy+0.003
ib+0.003
pill+0.003
ub+0.002
 appearances+0.002
Neg logit contributions
 Mom-0.002
he-0.002
 -0.002
 down-0.002
 started-0.002
Token very
Token ID845
Feature activation+9.76e-02

Pos logit contributions
 hops+0.008
emy+0.008
ib+0.008
 villains+0.008
ub+0.008
Neg logit contributions
rolley-0.006
cream-0.005
owner-0.005
 Mom-0.005
 clerk-0.005
Token\n
Token ID198
Feature activation-4.54e-01

Pos logit contributions
ib+0.052
ub+0.051
emy+0.051
 hops+0.051
 villains+0.050
Neg logit contributions
rolley-0.027
 Mom-0.027
cream-0.026
nam-0.026
 down-0.025
Token\n
Token ID198
Feature activation-4.40e-01

Pos logit contributions
emy+0.016
ib+0.016
ub+0.016
 hops+0.016
 villains+0.015
Neg logit contributions
nam-0.012
cream-0.012
 forehead-0.012
rolley-0.012
aughter-0.011
Token"
Token ID1
Feature activation-6.16e-02

Pos logit contributions
ib+0.015
emy+0.015
ub+0.015
 hops+0.015
 villains+0.015
Neg logit contributions
 Mom-0.012
rolley-0.012
cream-0.011
 down-0.011
nam-0.011
TokenHello
Token ID15496
Feature activation+1.32e+00

Pos logit contributions
ib+0.002
ub+0.002
 appearance+0.002
emy+0.002
 weaknesses+0.002
Neg logit contributions
 Mom-0.002
 Dad-0.001
 down-0.001
nam-0.001
Mom-0.001
Token,
Token ID11
Feature activation+6.04e-01

Pos logit contributions
 Mom+0.038
 down+0.036
 Dad+0.035
 Mum+0.034
 Grand+0.032
Neg logit contributions
 villains-0.045
emy-0.045
ib-0.045
 weaknesses-0.044
ub-0.044
Token kids
Token ID3988
Feature activation+2.35e+00

Pos logit contributions
 Mom+0.018
 Dad+0.016
 down+0.016
 Mum+0.016
 mom+0.015
Neg logit contributions
ib-0.021
 weaknesses-0.021
ub-0.021
 villains-0.021
emy-0.021
Token!"
Token ID2474
Feature activation-5.21e-01

Pos logit contributions
 Mom+0.040
 down+0.037
rolley+0.036
cream+0.035
 Dad+0.034
Neg logit contributions
ib-0.104
emy-0.104
ub-0.104
 hops-0.102
 villains-0.102
Token the
Token ID262
Feature activation-8.56e-02

Pos logit contributions
 villains+0.019
 hops+0.019
ib+0.018
emy+0.018
ub+0.018
Neg logit contributions
rolley-0.014
cream-0.013
appointed-0.012
 down-0.012
 forehead-0.012
Token bird
Token ID6512
Feature activation-1.16e+00

Pos logit contributions
ib+0.002
emy+0.002
ub+0.002
 hops+0.002
 villains+0.002
Neg logit contributions
rolley-0.003
 Mom-0.003
cream-0.003
 down-0.003
nam-0.003
Token said
Token ID531
Feature activation-8.41e-01

Pos logit contributions
ib+0.057
ub+0.057
emy+0.056
 villains+0.056
 hops+0.055
Neg logit contributions
 Mom-0.014
 down-0.013
cream-0.012
rolley-0.012
 Dad-0.011
Token.
Token ID13
Feature activation-1.09e+00

Pos logit contributions
ib+0.034
ub+0.033
 hops+0.033
emy+0.032
 villains+0.032
Neg logit contributions
rolley-0.019
cream-0.017
 Mom-0.016
nam-0.016
 Daughter-0.016
TokenMe
Token ID5308
Feature activation+1.51e+00

Pos logit contributions
rolley+0.017
 Mom+0.016
 forehead+0.016
nam+0.016
cream+0.016
Neg logit contributions
ub-0.025
ib-0.025
emy-0.025
pill-0.024
 hops-0.024
Token too
Token ID1165
Feature activation-6.21e-01

Pos logit contributions
rolley+0.047
nam+0.043
owner+0.043
person+0.043
cream+0.042
Neg logit contributions
ub-0.047
ib-0.047
 tweet-0.045
 hops-0.044
emy-0.043
Token!
Token ID0
Feature activation+5.07e-01

Pos logit contributions
ub+0.023
ib+0.022
emy+0.022
 hops+0.022
pill+0.021
Neg logit contributions
rolley-0.015
 Mom-0.015
owner-0.015
cream-0.015
nam-0.014
Token Let
Token ID3914
Feature activation-8.27e-01

Pos logit contributions
 Mom+0.014
rolley+0.013
 down+0.013
cream+0.013
 Dad+0.013
Neg logit contributions
ib-0.017
ub-0.017
emy-0.017
 hops-0.016
 villains-0.016
Tokenâ
Token ID22940
Feature activation+7.57e-01

Pos logit contributions
ub+0.021
 hops+0.020
 tweet+0.020
 appearance+0.019
ib+0.019
Neg logit contributions
nam-0.029
rolley-0.028
belt-0.027
owner-0.027
cream-0.027
Token€
Token ID26391
Feature activation+3.07e+00

Pos logit contributions
 Mom+0.019
nam+0.018
rolley+0.018
 down+0.017
 Dad+0.017
Neg logit contributions
ub-0.027
ib-0.027
 hops-0.026
 tweet-0.026
emy-0.025
Token™
Token ID8151
Feature activation+4.64e-01

Pos logit contributions
nam+0.082
rolley+0.082
 forehead+0.080
owner+0.076
 down+0.075
Neg logit contributions
ub-0.112
ib-0.105
 hops-0.103
pill-0.103
emy-0.102
Tokens
Token ID82
Feature activation+2.84e-01

Pos logit contributions
 forehead+0.017
nam+0.016
 doorway+0.015
 Mom+0.015
rolley+0.015
Neg logit contributions
ub-0.014
ib-0.012
pill-0.012
emy-0.010
azz-0.010
Token pick
Token ID2298
Feature activation+3.87e-01

Pos logit contributions
 Mom+0.008
rolley+0.008
cream+0.008
nam+0.008
 down+0.008
Neg logit contributions
ub-0.009
ib-0.009
 hops-0.009
emy-0.009
 villains-0.008
Token some
Token ID617
Feature activation-1.93e-01

Pos logit contributions
nam+0.010
appointed+0.010
 forehead+0.010
owner+0.010
belt+0.009
Neg logit contributions
ub-0.013
pill-0.013
ib-0.013
 hops-0.012
emy-0.012
Token flowers
Token ID12734
Feature activation-5.91e-01

Pos logit contributions
ib+0.006
ub+0.006
emy+0.006
 hops+0.006
 villains+0.006
Neg logit contributions
 Mom-0.005
rolley-0.005
 down-0.005
cream-0.005
 Dad-0.005
Token said
Token ID531
Feature activation-1.17e+00

Pos logit contributions
 Mom+0.005
 down+0.004
cream+0.004
 started+0.004
 up+0.004
Neg logit contributions
ub-0.007
emy-0.007
ib-0.007
 conclusions-0.007
 villains-0.007
Token you
Token ID345
Feature activation-2.53e-01

Pos logit contributions
emy+0.051
ib+0.049
ub+0.049
 weaknesses+0.048
 conclusions+0.047
Neg logit contributions
 Mom-0.031
 Dad-0.027
 Grand-0.025
 Mum-0.025
 mom-0.024
Token are
Token ID389
Feature activation-5.24e-01

Pos logit contributions
emy+0.009
ub+0.009
ib+0.008
 appearances+0.008
pill+0.008
Neg logit contributions
 Mom-0.007
cream-0.007
 down-0.007
 Dad-0.007
nam-0.006
Token very
Token ID845
Feature activation-9.56e-02

Pos logit contributions
 hops+0.020
emy+0.020
 villains+0.019
ib+0.019
 appearances+0.019
Neg logit contributions
rolley-0.013
cream-0.012
owner-0.011
 down-0.010
 clerk-0.010
Token good
Token ID922
Feature activation+1.36e+00

Pos logit contributions
ib+0.004
ub+0.004
emy+0.004
 hops+0.004
 villains+0.003
Neg logit contributions
 Mom-0.002
 down-0.002
rolley-0.002
 Dad-0.002
cream-0.002
Token twins
Token ID20345
Feature activation+2.67e+00

Pos logit contributions
 Mom+0.030
 down+0.028
 Dad+0.028
 Mum+0.025
 Grand+0.025
Neg logit contributions
ub-0.055
emy-0.055
ib-0.054
pill-0.054
 hops-0.053
Token and
Token ID290
Feature activation-8.02e-01

Pos logit contributions
 Mom+0.043
rolley+0.039
 down+0.039
cream+0.038
 Dad+0.037
Neg logit contributions
ib-0.120
ub-0.120
emy-0.119
 hops-0.118
 villains-0.117
Token you
Token ID345
Feature activation-3.95e-01

Pos logit contributions
ib+0.027
emy+0.027
 hops+0.027
ub+0.027
 villains+0.026
Neg logit contributions
 Mom-0.021
rolley-0.021
 down-0.020
cream-0.020
owner-0.020
Token deserve
Token ID10925
Feature activation+5.89e-01

Pos logit contributions
ib+0.011
ub+0.011
emy+0.011
 hops+0.011
 villains+0.011
Neg logit contributions
 Mom-0.013
rolley-0.012
 down-0.012
cream-0.012
 Dad-0.012
Token a
Token ID257
Feature activation+1.11e+00

Pos logit contributions
 Mom+0.013
 down+0.012
rolley+0.012
cream+0.012
 Dad+0.012
Neg logit contributions
ib-0.023
ub-0.023
emy-0.023
 hops-0.022
 villains-0.022
Token friend
Token ID1545
Feature activation+1.17e+00

Pos logit contributions
 Mom+0.028
 down+0.024
 Dad+0.023
 Mum+0.023
 mom+0.023
Neg logit contributions
emy-0.044
ib-0.044
pill-0.044
ub-0.042
 villains-0.042
Tokenpron
Token ID31186
Feature activation+2.15e+00

Pos logit contributions
 Mom+0.014
rolley+0.014
 down+0.013
cream+0.013
 Dad+0.013
Neg logit contributions
ib-0.018
ub-0.018
emy-0.018
 hops-0.017
 villains-0.017
Token was
Token ID373
Feature activation+1.63e-01

Pos logit contributions
nam+0.058
rolley+0.057
 Dad+0.053
cream+0.053
 Mom+0.053
Neg logit contributions
 villains-0.070
 hops-0.069
ub-0.068
ib-0.067
 appearance-0.065
Token impressive
Token ID8036
Feature activation+2.94e-01

Pos logit contributions
rolley+0.004
owner+0.004
person+0.004
cream+0.004
aughter+0.004
Neg logit contributions
ib-0.006
 hops-0.006
emy-0.006
 villains-0.006
ub-0.006
Token because
Token ID780
Feature activation+2.38e-01

Pos logit contributions
 Mom+0.006
 down+0.006
 up+0.006
 Dad+0.006
cream+0.006
Neg logit contributions
emy-0.012
ub-0.012
ib-0.012
 conclusions-0.012
 villains-0.012
Token it
Token ID340
Feature activation+9.89e-02

Pos logit contributions
rolley+0.006
cream+0.005
owner+0.005
 down+0.005
 forehead+0.005
Neg logit contributions
ib-0.009
ub-0.008
emy-0.008
 hops-0.008
 villains-0.008
Token had
Token ID550
Feature activation+9.35e-01

Pos logit contributions
 Mom+0.002
rolley+0.002
 down+0.002
cream+0.002
 Dad+0.002
Neg logit contributions
ib-0.004
ub-0.004
emy-0.004
 hops-0.004
 villains-0.004
Token a
Token ID257
Feature activation-1.43e-01

Pos logit contributions
 Mom+0.020
rolley+0.019
 down+0.019
owner+0.018
cream+0.018
Neg logit contributions
ib-0.037
ub-0.036
emy-0.036
 villains-0.036
 hops-0.036
Token picture
Token ID4286
Feature activation-2.54e-01

Pos logit contributions
ib+0.006
emy+0.005
ub+0.005
 hops+0.005
 villains+0.005
Neg logit contributions
 Mom-0.003
 down-0.003
rolley-0.003
 Dad-0.003
cream-0.003
Token of
Token ID286
Feature activation+7.21e-01

Pos logit contributions
ib+0.009
ub+0.009
 hops+0.009
emy+0.008
 villains+0.008
Neg logit contributions
 Mom-0.006
rolley-0.006
nam-0.006
cream-0.006
owner-0.006
Token a
Token ID257
Feature activation+2.12e-01

Pos logit contributions
 Mom+0.013
rolley+0.013
 down+0.013
cream+0.012
owner+0.012
Neg logit contributions
ib-0.030
emy-0.030
ub-0.030
 hops-0.030
 villains-0.029
Token rainbow
Token ID27223
Feature activation-8.26e-01

Pos logit contributions
 Mom+0.006
 down+0.005
rolley+0.005
 Dad+0.005
cream+0.005
Neg logit contributions
ib-0.008
ub-0.008
emy-0.007
 hops-0.007
pill-0.007
Token harvest
Token ID13222
Feature activation+2.49e-01

Pos logit contributions
 Mom+0.012
 Dad+0.011
 Mum+0.011
 down+0.011
 mom+0.010
Neg logit contributions
 villains-0.016
pill-0.016
ub-0.015
ib-0.015
emy-0.015
Token of
Token ID286
Feature activation+3.45e-01

Pos logit contributions
 Mom+0.006
nam+0.006
rolley+0.006
cream+0.006
appointed+0.006
Neg logit contributions
ib-0.009
 hops-0.009
ub-0.009
emy-0.008
pill-0.008
Token corn
Token ID11676
Feature activation-3.55e-01

Pos logit contributions
 Mom+0.010
rolley+0.010
cream+0.009
owner+0.009
 down+0.009
Neg logit contributions
ib-0.011
 hops-0.011
emy-0.011
ub-0.011
pill-0.011
Token.
Token ID13
Feature activation+1.95e-01

Pos logit contributions
ib+0.014
ub+0.014
emy+0.014
 hops+0.014
 villains+0.014
Neg logit contributions
 Mom-0.007
rolley-0.007
 down-0.007
cream-0.007
 Dad-0.007
Token He
Token ID679
Feature activation+2.26e-01

Pos logit contributions
nam+0.006
rolley+0.005
 down+0.005
owner+0.005
 forehead+0.005
Neg logit contributions
ib-0.006
 hops-0.006
pill-0.006
emy-0.006
ub-0.006
Token was
Token ID373
Feature activation+3.09e-01

Pos logit contributions
 Mom+0.005
rolley+0.005
 down+0.005
cream+0.005
nam+0.005
Neg logit contributions
ib-0.009
ub-0.009
emy-0.009
 hops-0.009
 villains-0.008
Token so
Token ID523
Feature activation+1.55e-02

Pos logit contributions
rolley+0.008
cream+0.007
aughter+0.007
owner+0.007
person+0.007
Neg logit contributions
 hops-0.011
ib-0.010
ub-0.010
emy-0.010
 tweet-0.010
Token excited
Token ID6568
Feature activation+5.80e-01

Pos logit contributions
rolley+0.000
owner+0.000
cream+0.000
person+0.000
 clerk+0.000
Neg logit contributions
 hops-0.001
ib-0.001
ub-0.001
emy-0.001
pill-0.001
Token that
Token ID326
Feature activation+1.67e+00

Pos logit contributions
 Mom+0.014
rolley+0.014
 down+0.014
cream+0.013
 Dad+0.013
Neg logit contributions
ib-0.021
ub-0.021
emy-0.021
 hops-0.021
 villains-0.020
Token his
Token ID465
Feature activation+9.42e-01

Pos logit contributions
rolley+0.059
aughter+0.056
amed+0.055
owner+0.054
 forehead+0.054
Neg logit contributions
 hops-0.047
ib-0.042
 villains-0.041
emy-0.041
ub-0.040
Token plan
Token ID1410
Feature activation+8.83e-03

Pos logit contributions
rolley+0.022
nam+0.021
cream+0.021
 Mom+0.021
 down+0.021
Neg logit contributions
emy-0.035
 hops-0.035
ib-0.035
ub-0.033
 villains-0.033
Token cat
Token ID3797
Feature activation-8.58e-01

Pos logit contributions
pill+0.000
 weaknesses+0.000
ub+0.000
ib+0.000
 villains+0.000
Neg logit contributions
 Mom-0.000
 down-0.000
 Dad-0.000
 mom-0.000
 cream-0.000
Token sitting
Token ID5586
Feature activation+6.11e-01

Pos logit contributions
 weaknesses+0.044
ub+0.043
ib+0.042
 appearance+0.041
emy+0.041
Neg logit contributions
 down-0.015
 Mom-0.013
 up-0.012
he-0.011
 onto-0.010
Token under
Token ID739
Feature activation+3.62e-01

Pos logit contributions
 Daughter+0.019
person+0.018
rolley+0.018
cakes+0.018
 forehead+0.017
Neg logit contributions
ib-0.019
emy-0.018
 hops-0.018
 hunters-0.017
ub-0.016
Token a
Token ID257
Feature activation-5.64e-01

Pos logit contributions
rolley+0.011
aughter+0.011
owner+0.010
belt+0.010
appointed+0.010
Neg logit contributions
 villains-0.012
emy-0.011
ib-0.011
 hops-0.011
 spiders-0.011
Token tree
Token ID5509
Feature activation-2.19e+00

Pos logit contributions
ib+0.018
emy+0.018
ub+0.018
 hops+0.018
 villains+0.018
Neg logit contributions
 Mom-0.016
rolley-0.015
cream-0.015
nam-0.015
 down-0.015
Token.
Token ID13
Feature activation+6.33e-01

Pos logit contributions
ib+0.098
ub+0.098
emy+0.098
 hops+0.096
 villains+0.096
Neg logit contributions
 Mom-0.036
 down-0.033
rolley-0.033
cream-0.031
 Dad-0.030
Token The
Token ID383
Feature activation-3.70e-01

Pos logit contributions
rolley+0.022
aughter+0.021
owner+0.020
nam+0.020
amed+0.019
Neg logit contributions
 villains-0.019
 hops-0.019
emy-0.018
ib-0.018
pill-0.017
Token cat
Token ID3797
Feature activation-1.05e+00

Pos logit contributions
ib+0.011
emy+0.011
ub+0.011
 hops+0.010
 villains+0.010
Neg logit contributions
 Mom-0.012
rolley-0.012
 down-0.011
cream-0.011
nam-0.011
Token looked
Token ID3114
Feature activation-6.55e-01

Pos logit contributions
ib+0.044
ub+0.044
emy+0.043
 hops+0.043
 villains+0.042
Neg logit contributions
 Mom-0.021
rolley-0.019
 down-0.019
cream-0.019
 Dad-0.018
Token sad
Token ID6507
Feature activation-3.24e-01

Pos logit contributions
 hops+0.023
emy+0.022
ib+0.022
ub+0.022
 appearance+0.021
Neg logit contributions
rolley-0.018
owner-0.017
cream-0.017
person-0.016
 Mom-0.016
Token and
Token ID290
Feature activation-3.94e-02

Pos logit contributions
ub+0.013
pill+0.013
ib+0.013
emy+0.013
 hops+0.013
Neg logit contributions
 Mom-0.007
 down-0.007
 Dad-0.006
cream-0.006
rolley-0.006
Token told
Token ID1297
Feature activation-1.33e-01

Pos logit contributions
 Mom+0.011
rolley+0.010
 down+0.010
cream+0.010
nam+0.010
Neg logit contributions
emy-0.022
ib-0.021
ub-0.021
 hops-0.021
 villains-0.021
Token Lily
Token ID20037
Feature activation+2.83e-01

Pos logit contributions
ib+0.006
ub+0.006
emy+0.006
 hops+0.006
 villains+0.005
Neg logit contributions
 Mom-0.003
rolley-0.002
 down-0.002
cream-0.002
 Dad-0.002
Token they
Token ID484
Feature activation-4.15e-01

Pos logit contributions
 Mom+0.007
 down+0.007
 Dad+0.007
cream+0.006
rolley+0.006
Neg logit contributions
ib-0.011
ub-0.010
emy-0.010
 hops-0.010
 villains-0.010
Token needed
Token ID2622
Feature activation-6.79e-01

Pos logit contributions
 villains+0.013
 weaknesses+0.013
ub+0.013
ib+0.013
 hops+0.013
Neg logit contributions
rolley-0.011
owner-0.011
 Mom-0.011
cream-0.011
 down-0.010
Token to
Token ID284
Feature activation-6.03e-01

Pos logit contributions
ub+0.030
emy+0.029
pill+0.029
ib+0.029
 hops+0.029
Neg logit contributions
 Mom-0.015
 Dad-0.013
 Mum-0.013
 down-0.012
 Grand-0.012
Token go
Token ID467
Feature activation+1.88e-01

Pos logit contributions
pill+0.022
ib+0.022
emy+0.022
ub+0.022
 hops+0.021
Neg logit contributions
 Mom-0.016
 Dad-0.015
 down-0.015
 Mum-0.014
rolley-0.014
Token back
Token ID736
Feature activation+2.26e-01

Pos logit contributions
 down+0.003
 Mom+0.003
rolley+0.003
 Dad+0.003
cream+0.003
Neg logit contributions
ib-0.008
emy-0.008
pill-0.008
ub-0.008
 villains-0.008
Token inside
Token ID2641
Feature activation+6.47e-01

Pos logit contributions
 Mom+0.006
rolley+0.005
nam+0.005
cream+0.005
owner+0.005
Neg logit contributions
ib-0.008
ub-0.008
emy-0.008
 hops-0.008
 villains-0.008
Token to
Token ID284
Feature activation-1.10e-01

Pos logit contributions
cream+0.019
rolley+0.018
 Mom+0.018
owner+0.018
nam+0.018
Neg logit contributions
ib-0.020
ub-0.020
emy-0.020
 hops-0.020
 villains-0.019
Token get
Token ID651
Feature activation+3.95e-01

Pos logit contributions
ib+0.004
ub+0.004
emy+0.004
 hops+0.004
 villains+0.003
Neg logit contributions
 Mom-0.003
rolley-0.003
cream-0.003
owner-0.003
 down-0.003
Token it
Token ID340
Feature activation-1.29e+00

Pos logit contributions
rolley+0.010
owner+0.010
person+0.010
appointed+0.010
aughter+0.010
Neg logit contributions
 hops-0.013
ib-0.012
ub-0.012
emy-0.012
 nightmares-0.012
Token,
Token ID11
Feature activation-2.03e-01

Pos logit contributions
 Mom+0.018
 down+0.017
rolley+0.017
cream+0.017
 Dad+0.017
Neg logit contributions
ib-0.018
ub-0.018
emy-0.018
 hops-0.017
 villains-0.017
Token T
Token ID309
Feature activation+6.70e-01

Pos logit contributions
ub+0.006
ib+0.006
emy+0.006
 hops+0.005
 tweet+0.005
Neg logit contributions
rolley-0.007
cream-0.007
owner-0.006
person-0.006
 forehead-0.006
Tokenilly
Token ID6548
Feature activation-5.19e-01

Pos logit contributions
 Mom+0.019
rolley+0.018
 Dad+0.018
 down+0.018
 Mum+0.018
Neg logit contributions
 villains-0.022
ub-0.022
 hops-0.022
ib-0.021
emy-0.021
Token wanted
Token ID2227
Feature activation-6.65e-01

Pos logit contributions
ub+0.022
ib+0.022
emy+0.022
 villains+0.022
 hops+0.022
Neg logit contributions
 Mom-0.010
 down-0.009
rolley-0.009
 Dad-0.008
cream-0.008
Token to
Token ID284
Feature activation-1.84e-02

Pos logit contributions
ub+0.030
ib+0.029
emy+0.028
 hops+0.028
pill+0.027
Neg logit contributions
 Mom-0.015
 Dad-0.013
 Mum-0.013
 down-0.013
 Grand-0.012
Token make
Token ID787
Feature activation+2.72e-01

Pos logit contributions
 hops+0.000
 villains+0.000
ib+0.000
emy+0.000
ub+0.000
Neg logit contributions
rolley-0.001
owner-0.001
cream-0.001
 down-0.001
nam-0.001
Token something
Token ID1223
Feature activation-1.20e+00

Pos logit contributions
 Mom+0.006
rolley+0.006
 down+0.005
cream+0.005
owner+0.005
Neg logit contributions
ib-0.011
emy-0.011
ub-0.011
 hops-0.011
pill-0.010
Token special
Token ID2041
Feature activation-3.51e-01

Pos logit contributions
ib+0.045
ub+0.045
emy+0.045
 hops+0.045
 villains+0.044
Neg logit contributions
 Mom-0.027
rolley-0.027
cream-0.025
 down-0.025
owner-0.025
Token.
Token ID13
Feature activation+3.34e-01

Pos logit contributions
emy+0.016
ib+0.015
ub+0.015
 hops+0.015
 villains+0.015
Neg logit contributions
 Mom-0.006
 down-0.006
rolley-0.005
 Dad-0.005
cream-0.005
Token She
Token ID1375
Feature activation+5.41e-01

Pos logit contributions
cream+0.011
nam+0.011
rolley+0.011
 cream+0.011
 forehead+0.011
Neg logit contributions
ub-0.009
ib-0.009
 hops-0.009
emy-0.008
pill-0.008
Token went
Token ID1816
Feature activation-5.26e-01

Pos logit contributions
nam+0.013
 Mom+0.013
owner+0.013
rolley+0.013
 forehead+0.013
Neg logit contributions
ib-0.019
emy-0.019
 hops-0.019
ub-0.019
 villains-0.018
Token.
Token ID13
Feature activation+1.19e+00

Pos logit contributions
rolley+0.011
cream+0.010
 Daughter+0.010
 clerk+0.009
owner+0.009
Neg logit contributions
 hops-0.017
ib-0.017
emy-0.016
ub-0.016
 villains-0.016
Token\n
Token ID198
Feature activation+3.56e-01

Pos logit contributions
rolley+0.024
 down+0.023
nam+0.023
 Mom+0.023
cream+0.022
Neg logit contributions
ib-0.048
emy-0.048
ub-0.048
 hops-0.047
pill-0.047
Token\n
Token ID198
Feature activation+1.55e-01

Pos logit contributions
nam+0.008
rolley+0.008
cream+0.008
 down+0.008
 forehead+0.008
Neg logit contributions
emy-0.014
ub-0.014
ib-0.014
 hops-0.013
 villains-0.013
TokenThe
Token ID464
Feature activation-1.17e-01

Pos logit contributions
 Mom+0.004
Mom+0.004
appointed+0.004
rolley+0.004
owner+0.004
Neg logit contributions
ib-0.005
 hops-0.005
 appearance-0.005
ub-0.005
pill-0.005
Token flood
Token ID6947
Feature activation+1.09e-01

Pos logit contributions
ub+0.004
ib+0.004
pill+0.004
emy+0.004
 hops+0.004
Neg logit contributions
 Mom-0.003
 down-0.003
 Dad-0.003
 mom-0.003
owner-0.003
Token kept
Token ID4030
Feature activation-5.43e-01

Pos logit contributions
 Mom+0.002
 down+0.002
owner+0.002
rolley+0.002
cream+0.002
Neg logit contributions
ib-0.005
emy-0.005
ub-0.005
pill-0.005
 villains-0.005
Token getting
Token ID1972
Feature activation+4.28e-02

Pos logit contributions
ib+0.023
ub+0.022
pill+0.022
emy+0.022
 villains+0.021
Neg logit contributions
 down-0.015
 Mom-0.013
 up-0.013
 onto-0.012
 Dad-0.011
Token bigger
Token ID5749
Feature activation-7.49e-01

Pos logit contributions
rolley+0.001
 Mom+0.001
aughter+0.001
person+0.001
cream+0.001
Neg logit contributions
ib-0.001
ub-0.001
emy-0.001
 hops-0.001
 villains-0.001
Token and
Token ID290
Feature activation+3.44e-01

Pos logit contributions
ub+0.031
 tweet+0.030
emy+0.030
 appearances+0.030
 villains+0.029
Neg logit contributions
 down-0.017
 Mom-0.015
 up-0.014
 Dad-0.013
 onto-0.013
Token bigger
Token ID5749
Feature activation-3.99e-01

Pos logit contributions
 Mom+0.010
rolley+0.010
 down+0.009
cream+0.009
 Dad+0.009
Neg logit contributions
ib-0.011
ub-0.011
emy-0.011
 hops-0.011
 villains-0.011
Token.
Token ID13
Feature activation+3.91e-01

Pos logit contributions
ub+0.019
emy+0.018
ib+0.018
pill+0.018
 appearances+0.018
Neg logit contributions
 down-0.007
 Mom-0.006
 up-0.005
 Dad-0.005
 onto-0.005
Token very
Token ID845
Feature activation-1.13e-01

Pos logit contributions
rolley+0.006
cream+0.006
cakes+0.006
aughter+0.006
amas+0.005
Neg logit contributions
 hops-0.007
 villains-0.007
ub-0.007
ib-0.007
emy-0.007
Token proud
Token ID6613
Feature activation+1.09e-01

Pos logit contributions
ub+0.004
 hops+0.004
ib+0.004
emy+0.004
 villains+0.004
Neg logit contributions
rolley-0.003
cream-0.003
amas-0.003
owner-0.003
 Mom-0.003
Token of
Token ID286
Feature activation+3.30e-01

Pos logit contributions
 Mom+0.002
 down+0.002
 Dad+0.002
rolley+0.002
 Mum+0.002
Neg logit contributions
ib-0.004
ub-0.004
emy-0.004
 hops-0.004
 villains-0.004
Token himself
Token ID2241
Feature activation-3.04e-01

Pos logit contributions
 Mom+0.006
rolley+0.006
 down+0.006
cream+0.006
 Dad+0.006
Neg logit contributions
ib-0.014
ub-0.014
emy-0.014
 hops-0.014
 villains-0.014
Token.
Token ID13
Feature activation+4.09e-01

Pos logit contributions
ib+0.013
ub+0.013
emy+0.013
 hops+0.013
 villains+0.012
Neg logit contributions
 Mom-0.006
rolley-0.005
 down-0.005
cream-0.005
 Dad-0.005
Token<|endoftext|>
Token ID50256
Feature activation-6.30e-01

Pos logit contributions
nam+0.013
 forehead+0.013
 cream+0.012
bled+0.012
 down+0.012
Neg logit contributions
emy-0.012
pill-0.012
ib-0.012
ub-0.011
 hops-0.011
TokenOnce
Token ID7454
Feature activation-5.94e-01

Pos logit contributions
ib+0.021
 hops+0.021
ub+0.021
pill+0.020
emy+0.020
Neg logit contributions
nam-0.018
 forehead-0.017
cream-0.017
rolley-0.017
 Mom-0.017
Token upon
Token ID2402
Feature activation+3.77e-01

Pos logit contributions
ub+0.015
ib+0.014
emy+0.014
 hops+0.013
alk+0.011
Neg logit contributions
cream-0.024
rolley-0.023
 forehead-0.023
person-0.022
nam-0.021
Token a
Token ID257
Feature activation+3.04e-01

Pos logit contributions
rolley+0.009
owner+0.008
aughter+0.008
 clerk+0.008
cream+0.008
Neg logit contributions
 villains-0.014
ib-0.014
emy-0.014
 hops-0.014
pill-0.013
Token time
Token ID640
Feature activation-3.38e-01

Pos logit contributions
 Mom+0.008
 down+0.007
 Dad+0.007
rolley+0.007
 Mum+0.007
Neg logit contributions
ub-0.011
ib-0.011
 villains-0.011
pill-0.011
 hops-0.011
Token,
Token ID11
Feature activation-1.69e-01

Pos logit contributions
 villains+0.013
 weaknesses+0.013
pill+0.013
 conclusions+0.013
emy+0.013
Neg logit contributions
 Mom-0.009
 Dad-0.008
 down-0.008
 Mum-0.007
 -0.007
Token named
Token ID3706
Feature activation+3.65e-01

Pos logit contributions
 Mom+0.002
rolley+0.002
 down+0.002
cream+0.002
owner+0.002
Neg logit contributions
ib-0.002
emy-0.002
ub-0.002
 hops-0.002
 villains-0.002
Token Lily
Token ID20037
Feature activation+1.29e-01

Pos logit contributions
rolley+0.012
cream+0.011
owner+0.010
belt+0.010
person+0.010
Neg logit contributions
ib-0.012
ub-0.011
emy-0.011
 tweet-0.011
 hops-0.011
Token.
Token ID13
Feature activation-6.95e-01

Pos logit contributions
cakes+0.003
rolley+0.003
owner+0.003
 clerk+0.003
aughter+0.003
Neg logit contributions
ib-0.005
emy-0.005
ub-0.004
 villains-0.004
 hops-0.004
Token She
Token ID1375
Feature activation+1.42e-01

Pos logit contributions
ub+0.025
emy+0.025
ib+0.025
 villains+0.025
 hops+0.025
Neg logit contributions
 Mom-0.019
 down-0.017
 Dad-0.016
 Mum-0.016
rolley-0.016
Token loved
Token ID6151
Feature activation-7.54e-01

Pos logit contributions
rolley+0.003
 Mom+0.003
owner+0.003
 down+0.003
nam+0.003
Neg logit contributions
ib-0.006
emy-0.006
 hops-0.006
 villains-0.006
ub-0.006
Token to
Token ID284
Feature activation-9.59e-01

Pos logit contributions
ub+0.034
ib+0.032
emy+0.032
 hops+0.032
 villains+0.031
Neg logit contributions
 Mom-0.017
 down-0.015
 Dad-0.015
cream-0.014
 Grand-0.014
Token draw
Token ID3197
Feature activation-2.45e-01

Pos logit contributions
ub+0.037
emy+0.036
pill+0.035
ib+0.035
 villains+0.033
Neg logit contributions
 Mom-0.027
 Dad-0.024
 Mum-0.024
 down-0.023
 Grand-0.022
Token and
Token ID290
Feature activation-8.57e-01

Pos logit contributions
emy+0.010
ub+0.010
ib+0.010
 hops+0.010
 villains+0.010
Neg logit contributions
 Mom-0.005
 down-0.005
rolley-0.005
cream-0.005
nam-0.005
Token color
Token ID3124
Feature activation-1.06e-01

Pos logit contributions
ib+0.026
ub+0.026
emy+0.026
 hops+0.025
pill+0.025
Neg logit contributions
 Mom-0.027
 down-0.026
nam-0.025
rolley-0.025
 Dad-0.025
Token with
Token ID351
Feature activation-6.14e-02

Pos logit contributions
emy+0.005
ib+0.005
ub+0.005
 villains+0.005
 hops+0.005
Neg logit contributions
 Mom-0.002
 down-0.002
cream-0.002
 Dad-0.002
rolley-0.002
Token her
Token ID607
Feature activation+4.54e-01

Pos logit contributions
ib+0.002
emy+0.002
ub+0.002
 hops+0.002
 weaknesses+0.002
Neg logit contributions
rolley-0.001
cream-0.001
owner-0.001
 forehead-0.001
appointed-0.001
Token,
Token ID11
Feature activation-1.56e-02

Pos logit contributions
 Mom+0.013
 down+0.013
owner+0.012
 Dad+0.012
 clerk+0.012
Neg logit contributions
emy-0.025
ib-0.025
ub-0.025
pill-0.024
 hops-0.024
Token and
Token ID290
Feature activation+9.28e-01

Pos logit contributions
ib+0.001
ub+0.001
emy+0.001
 villains+0.001
pill+0.001
Neg logit contributions
 Mom-0.000
 down-0.000
 Dad-0.000
 Mum-0.000
owner-0.000
Token told
Token ID1297
Feature activation+4.82e-01

Pos logit contributions
 Mom+0.021
 down+0.020
rolley+0.019
 Dad+0.019
cream+0.019
Neg logit contributions
ib-0.036
emy-0.036
ub-0.036
 hops-0.035
 villains-0.035
Token the
Token ID262
Feature activation+6.42e-01

Pos logit contributions
 Mom+0.013
 Dad+0.012
 Mum+0.012
 down+0.011
 up+0.011
Neg logit contributions
ib-0.019
pill-0.018
emy-0.018
ub-0.018
 villains-0.018
Token others
Token ID1854
Feature activation+1.57e-01

Pos logit contributions
 Mom+0.018
 down+0.016
 Dad+0.016
 Mum+0.016
rolley+0.015
Neg logit contributions
ub-0.024
ib-0.023
emy-0.023
pill-0.022
 hops-0.022
Token to
Token ID284
Feature activation-9.07e-01

Pos logit contributions
 Mom+0.004
 down+0.003
 Dad+0.003
 Mum+0.003
cream+0.003
Neg logit contributions
ib-0.006
emy-0.006
ub-0.006
pill-0.006
 hops-0.006
Token form
Token ID1296
Feature activation+1.72e-01

Pos logit contributions
pill+0.035
ib+0.034
ub+0.033
emy+0.033
 hops+0.031
Neg logit contributions
 Mom-0.025
 down-0.022
 Mum-0.022
 Dad-0.022
 mom-0.021
Token a
Token ID257
Feature activation+2.06e-01

Pos logit contributions
 Mom+0.003
 down+0.003
 Dad+0.003
rolley+0.003
cream+0.003
Neg logit contributions
ib-0.007
emy-0.007
ub-0.007
 hops-0.007
 villains-0.007
Token line
Token ID1627
Feature activation+1.16e+00

Pos logit contributions
 Mom+0.005
 down+0.005
rolley+0.005
owner+0.005
 Mum+0.005
Neg logit contributions
ib-0.008
emy-0.008
pill-0.008
ub-0.008
 hops-0.007
Token and
Token ID290
Feature activation+7.27e-01

Pos logit contributions
 Mom+0.025
rolley+0.024
 down+0.023
cream+0.023
nam+0.022
Neg logit contributions
ib-0.046
ub-0.046
emy-0.045
 hops-0.045
 villains-0.044
Token to
Token ID284
Feature activation-2.04e-01

Pos logit contributions
 Mom+0.018
 down+0.017
rolley+0.017
cream+0.017
 Dad+0.016
Neg logit contributions
ib-0.027
ub-0.026
emy-0.026
 hops-0.026
pill-0.026
Token want
Token ID765
Feature activation-3.56e-03

Pos logit contributions
 Mom+0.014
 down+0.014
cream+0.013
 Dad+0.013
rolley+0.013
Neg logit contributions
emy-0.014
ib-0.014
pill-0.013
ub-0.013
 hops-0.013
Token to
Token ID284
Feature activation-1.46e-01

Pos logit contributions
ub+0.000
ib+0.000
emy+0.000
 hops+0.000
pill+0.000
Neg logit contributions
 Mom-0.000
 Dad-0.000
 down-0.000
 Mum-0.000
cream-0.000
Token save
Token ID3613
Feature activation+3.08e-01

Pos logit contributions
ib+0.005
emy+0.005
ub+0.005
pill+0.005
 hops+0.005
Neg logit contributions
 Mom-0.004
 down-0.004
rolley-0.004
 Dad-0.004
 Mum-0.004
Token it
Token ID340
Feature activation-1.38e+00

Pos logit contributions
rolley+0.008
owner+0.008
belt+0.008
appointed+0.008
 clerk+0.007
Neg logit contributions
emy-0.010
ib-0.010
 hops-0.010
ub-0.010
 conclusions-0.010
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
ib+0.063
ub+0.063
emy+0.063
 hops+0.062
pill+0.062
Neg logit contributions
 Mom-0.022
 down-0.020
rolley-0.019
cream-0.019
 Dad-0.019
Token Please
Token ID4222
Feature activation-5.30e-01

Pos logit contributions
ib+0.038
ub+0.037
emy+0.037
 hops+0.037
pill+0.036
Neg logit contributions
 Mom-0.031
rolley-0.029
 down-0.029
cream-0.029
 Dad-0.028
Token,
Token ID11
Feature activation-2.25e-02

Pos logit contributions
ib+0.016
ub+0.016
emy+0.015
 hops+0.015
 villains+0.015
Neg logit contributions
 Mom-0.017
rolley-0.016
 down-0.016
cream-0.016
 Dad-0.016
Token don
Token ID836
Feature activation-8.31e-01

Pos logit contributions
ib+0.001
pill+0.001
emy+0.001
ub+0.001
 hops+0.001
Neg logit contributions
 Mom-0.001
 Dad-0.001
 Mum-0.001
 down-0.001
 mom-0.001
Token't
Token ID470
Feature activation+3.55e-01

Pos logit contributions
 hops+0.019
 villains+0.018
ub+0.018
pill+0.018
ib+0.018
Neg logit contributions
rolley-0.031
nam-0.031
owner-0.029
cream-0.029
 forehead-0.029
Token cut
Token ID2005
Feature activation+8.08e-01

Pos logit contributions
rolley+0.011
 Mom+0.011
cream+0.011
owner+0.011
 down+0.011
Neg logit contributions
ib-0.010
ub-0.010
emy-0.010
 hops-0.010
 villains-0.010
Token it
Token ID340
Feature activation-6.04e-01

Pos logit contributions
rolley+0.021
 Mom+0.020
owner+0.020
nam+0.019
 clerk+0.019
Neg logit contributions
ib-0.029
ub-0.028
emy-0.028
 hops-0.028
 villains-0.027
Token it
Token ID340
Feature activation-4.29e-01

Pos logit contributions
rolley+0.011
 forehead+0.011
owner+0.010
aughter+0.010
 clerk+0.010
Neg logit contributions
ib-0.015
 hops-0.015
emy-0.015
 appearance-0.014
ub-0.014
Token.
Token ID13
Feature activation-1.14e+00

Pos logit contributions
ib+0.016
ub+0.016
emy+0.016
 villains+0.015
 hops+0.015
Neg logit contributions
rolley-0.010
 Mom-0.009
 Daughter-0.009
owner-0.009
nam-0.009
Token Find
Token ID9938
Feature activation-3.14e-01

Pos logit contributions
ub+0.038
ib+0.038
emy+0.038
 hops+0.038
 villains+0.037
Neg logit contributions
 Mom-0.032
rolley-0.029
 down-0.029
 Dad-0.029
cream-0.029
Token your
Token ID534
Feature activation+1.35e-01

Pos logit contributions
ib+0.011
emy+0.010
 hops+0.010
ub+0.010
 villains+0.010
Neg logit contributions
rolley-0.009
 Mom-0.008
nam-0.008
appointed-0.008
owner-0.008
Token own
Token ID898
Feature activation-1.02e-01

Pos logit contributions
 Mom+0.003
 down+0.003
 Dad+0.003
 Mum+0.003
nam+0.002
Neg logit contributions
ub-0.006
emy-0.005
ib-0.005
 villains-0.005
pill-0.005
Token beetle
Token ID45489
Feature activation-2.81e+00

Pos logit contributions
ib+0.003
ub+0.003
emy+0.003
 hops+0.003
 villains+0.003
Neg logit contributions
 Mom-0.003
rolley-0.003
cream-0.003
 down-0.003
nam-0.003
Token."
Token ID526
Feature activation+6.75e-02

Pos logit contributions
ub+0.124
ib+0.124
emy+0.124
 hops+0.121
 villains+0.121
Neg logit contributions
 Mom-0.048
 down-0.044
rolley-0.043
cream-0.042
nam-0.040
Token
Token ID198
Feature activation-3.19e-01

Pos logit contributions
nam+0.002
rolley+0.002
cream+0.002
 forehead+0.002
owner+0.002
Neg logit contributions
ib-0.002
emy-0.002
ub-0.002
 hops-0.002
 villains-0.002
Token
Token ID198
Feature activation-3.69e-01

Pos logit contributions
emy+0.011
ub+0.010
ib+0.010
 hops+0.010
 villains+0.009
Neg logit contributions
nam-0.010
cream-0.009
rolley-0.009
 forehead-0.009
 down-0.009
TokenM
Token ID44
Feature activation-5.25e-01

Pos logit contributions
ib+0.014
ub+0.014
emy+0.013
 hops+0.013
 villains+0.013
Neg logit contributions
 Mom-0.009
rolley-0.009
 down-0.009
cream-0.008
nam-0.008
Tokenia
Token ID544
Feature activation-8.58e-01

Pos logit contributions
 villains+0.015
pill+0.015
 hops+0.014
ub+0.014
 weaknesses+0.014
Neg logit contributions
 Mom-0.018
 Mum-0.017
 Dad-0.017
 mom-0.016
owner-0.016
Token with
Token ID351
Feature activation+1.02e+00

Pos logit contributions
 Mom+0.001
 down+0.001
rolley+0.001
cream+0.001
 Dad+0.001
Neg logit contributions
ib-0.001
ub-0.001
emy-0.001
 hops-0.001
pill-0.001
Token a
Token ID257
Feature activation-3.26e-02

Pos logit contributions
rolley+0.027
owner+0.027
 Mom+0.026
nam+0.026
cream+0.026
Neg logit contributions
emy-0.034
 hops-0.033
ib-0.033
 villains-0.033
pill-0.033
Token ball
Token ID2613
Feature activation-6.14e-01

Pos logit contributions
ib+0.001
ub+0.001
emy+0.001
 hops+0.001
pill+0.001
Neg logit contributions
 Mom-0.001
 down-0.001
rolley-0.001
 Dad-0.001
owner-0.001
Token.
Token ID13
Feature activation-2.40e-02

Pos logit contributions
ib+0.027
emy+0.027
ub+0.027
 hops+0.026
 weaknesses+0.026
Neg logit contributions
 Mom-0.011
 down-0.010
rolley-0.010
 Dad-0.009
cream-0.009
Token Spot
Token ID15899
Feature activation-2.03e-01

Pos logit contributions
ib+0.001
emy+0.001
ub+0.001
 hops+0.001
 villains+0.001
Neg logit contributions
 Mom-0.001
rolley-0.001
 down-0.001
cream-0.001
nam-0.001
Token w
Token ID266
Feature activation-2.74e+00

Pos logit contributions
ib+0.008
ub+0.008
emy+0.008
 villains+0.008
 weaknesses+0.008
Neg logit contributions
 Mom-0.004
 down-0.004
rolley-0.004
cream-0.004
 Dad-0.004
Tokenagged
Token ID14655
Feature activation+9.55e-01

Pos logit contributions
ub+0.134
 appearance+0.133
 appearances+0.130
 hops+0.128
ib+0.128
Neg logit contributions
amed-0.030
 Mom-0.029
cream-0.029
rolley-0.029
aughter-0.025
Token his
Token ID465
Feature activation-1.67e+00

Pos logit contributions
rolley+0.027
aughter+0.024
cream+0.024
 Daughter+0.024
cakes+0.023
Neg logit contributions
emy-0.034
 hops-0.034
 villains-0.033
ib-0.033
ub-0.031
Token long
Token ID890
Feature activation-2.82e+00

Pos logit contributions
emy+0.037
ib+0.035
 hops+0.033
 tweet+0.031
 villains+0.029
Neg logit contributions
rolley-0.074
cream-0.073
cakes-0.072
 clerk-0.070
aughter-0.069
Token tail
Token ID7894
Feature activation-1.56e+00

Pos logit contributions
ub+0.096
pill+0.093
ib+0.093
emy+0.091
 hops+0.090
Neg logit contributions
 Mom-0.079
 down-0.074
rolley-0.071
 Dad-0.071
owner-0.070
Token very
Token ID845
Feature activation-1.58e+00

Pos logit contributions
ub+0.053
emy+0.053
ib+0.053
pill+0.052
 villains+0.051
Neg logit contributions
 Mom-0.044
 down-0.042
 Dad-0.039
owner-0.039
 Mum-0.039
Token.
Token ID13
Feature activation+1.83e-01

Pos logit contributions
 Mom+0.011
 down+0.010
rolley+0.010
owner+0.009
 Dad+0.009
Neg logit contributions
ib-0.024
emy-0.024
ub-0.024
 hops-0.023
pill-0.023
Token She
Token ID1375
Feature activation+4.31e-01

Pos logit contributions
rolley+0.006
nam+0.005
cream+0.005
 forehead+0.005
 Mom+0.005
Neg logit contributions
ib-0.005
ub-0.005
pill-0.005
 hops-0.005
emy-0.005
Token asked
Token ID1965
Feature activation-4.39e-01

Pos logit contributions
 Mom+0.010
rolley+0.009
 down+0.009
cream+0.009
nam+0.009
Neg logit contributions
ib-0.017
ub-0.016
emy-0.016
 hops-0.016
 villains-0.016
Token Tim
Token ID5045
Feature activation-2.10e-01

Pos logit contributions
pill+0.021
ub+0.021
emy+0.020
ib+0.020
 hops+0.020
Neg logit contributions
 Mom-0.008
 down-0.007
 Dad-0.007
 Mum-0.006
 Grand-0.006
Token "
Token ID366
Feature activation-9.03e-01

Pos logit contributions
 villains+0.009
emy+0.009
pill+0.009
 conclusions+0.009
ib+0.009
Neg logit contributions
 down-0.005
 Mom-0.005
 up-0.005
 Dad-0.004
 mom-0.004
TokenDo
Token ID5211
Feature activation-2.83e+00

Pos logit contributions
ib+0.029
ub+0.029
emy+0.029
 hops+0.028
 villains+0.028
Neg logit contributions
 Mom-0.026
rolley-0.025
 down-0.025
cream-0.024
 Dad-0.024
Token you
Token ID345
Feature activation-8.09e-01

Pos logit contributions
ub+0.107
ib+0.106
emy+0.102
 hops+0.101
 villains+0.100
Neg logit contributions
 Mom-0.075
 Dad-0.067
 Mum-0.064
nam-0.063
 down-0.062
Token want
Token ID765
Feature activation-7.12e-01

Pos logit contributions
pill+0.027
emy+0.027
 appearances+0.026
ub+0.025
ib+0.025
Neg logit contributions
 Mom-0.024
 down-0.023
 Dad-0.023
 Mum-0.022
nam-0.022
Token to
Token ID284
Feature activation-3.71e-01

Pos logit contributions
ib+0.026
ub+0.026
emy+0.026
 hops+0.025
 villains+0.025
Neg logit contributions
 Mom-0.018
 down-0.017
rolley-0.017
cream-0.017
 Dad-0.016
Token play
Token ID711
Feature activation+3.84e-01

Pos logit contributions
ib+0.011
ub+0.011
emy+0.011
 hops+0.011
 villains+0.011
Neg logit contributions
 Mom-0.011
rolley-0.011
cream-0.011
 down-0.011
nam-0.011
Token a
Token ID257
Feature activation+1.45e+00

Pos logit contributions
 Mom+0.009
 down+0.008
rolley+0.008
cream+0.008
 Dad+0.008
Neg logit contributions
ib-0.015
ub-0.015
emy-0.015
 hops-0.015
 villains-0.014
Token and
Token ID290
Feature activation+4.19e-01

Pos logit contributions
 down+0.006
 Mom+0.006
 up+0.005
 Dad+0.005
cream+0.005
Neg logit contributions
emy-0.011
 villains-0.011
ib-0.010
ub-0.010
 conclusions-0.010
Token he
Token ID339
Feature activation+5.38e-01

Pos logit contributions
 Mom+0.010
rolley+0.010
 down+0.009
cream+0.009
owner+0.009
Neg logit contributions
ib-0.015
emy-0.015
 hops-0.015
ub-0.015
 villains-0.015
Token asked
Token ID1965
Feature activation-3.16e-01

Pos logit contributions
 Mom+0.012
rolley+0.012
 down+0.011
cream+0.011
nam+0.011
Neg logit contributions
ib-0.021
ub-0.021
emy-0.021
 hops-0.020
 villains-0.020
Token his
Token ID465
Feature activation+7.27e-01

Pos logit contributions
ib+0.011
ub+0.011
emy+0.011
 hops+0.011
 villains+0.011
Neg logit contributions
rolley-0.008
 Mom-0.008
cream-0.007
 down-0.007
owner-0.007
Token mom
Token ID1995
Feature activation-1.97e-02

Pos logit contributions
rolley+0.005
 down+0.004
 Mom+0.004
cream+0.004
nam+0.004
Neg logit contributions
ib-0.039
emy-0.039
ub-0.039
 hops-0.039
 villains-0.038
Token if
Token ID611
Feature activation-2.89e+00

Pos logit contributions
emy+0.001
 villains+0.001
ib+0.001
ub+0.001
 hops+0.001
Neg logit contributions
 Mom-0.001
 down-0.001
 Dad-0.001
 Mum-0.001
nam-0.001
Token he
Token ID339
Feature activation-3.34e-01

Pos logit contributions
ib+0.121
ub+0.121
emy+0.119
pill+0.117
 hops+0.116
Neg logit contributions
 Mom-0.065
 Dad-0.059
 Mum-0.057
 down-0.056
nam-0.055
Token could
Token ID714
Feature activation-7.83e-01

Pos logit contributions
ub+0.013
ib+0.013
emy+0.013
 villains+0.013
 hops+0.013
Neg logit contributions
 Mom-0.007
 Dad-0.007
cream-0.007
 down-0.007
 Mum-0.007
Token keep
Token ID1394
Feature activation-6.09e-01

Pos logit contributions
pill+0.032
emy+0.031
ib+0.030
 villains+0.030
 appearances+0.029
Neg logit contributions
 Mom-0.022
 Mum-0.020
 Dad-0.019
 down-0.019
 mom-0.018
Token it
Token ID340
Feature activation-1.52e+00

Pos logit contributions
ub+0.027
ib+0.027
 villains+0.027
emy+0.026
pill+0.026
Neg logit contributions
 Mom-0.013
 down-0.012
 Dad-0.011
 Mum-0.011
 up-0.011
Token.
Token ID13
Feature activation+2.98e-01

Pos logit contributions
ib+0.061
ub+0.061
emy+0.061
 hops+0.060
pill+0.060
Neg logit contributions
 Mom-0.033
 down-0.031
rolley-0.030
cream-0.029
 Dad-0.029
Token,
Token ID11
Feature activation+7.57e-01

Pos logit contributions
 Mom+0.001
 down+0.001
 Mum+0.000
 Dad+0.000
 up+0.000
Neg logit contributions
emy-0.003
pill-0.003
 villains-0.003
 weaknesses-0.003
ib-0.003
Token she
Token ID673
Feature activation+4.92e-01

Pos logit contributions
rolley+0.019
cream+0.018
 Mom+0.018
owner+0.018
 down+0.017
Neg logit contributions
 hops-0.027
ib-0.027
ub-0.026
emy-0.026
 villains-0.026
Token realized
Token ID6939
Feature activation-9.27e-01

Pos logit contributions
 Mom+0.009
 started+0.008
 down+0.008
 Dad+0.008
 Mum+0.007
Neg logit contributions
 appearances-0.022
 villains-0.022
ub-0.022
 weaknesses-0.022
ib-0.022
Token that
Token ID326
Feature activation+9.27e-01

Pos logit contributions
ib+0.039
ub+0.039
emy+0.038
 villains+0.037
 hops+0.037
Neg logit contributions
 Mom-0.021
 down-0.019
 Dad-0.018
 Mum-0.018
nam-0.018
TokenÂ
Token ID5523
Feature activation+7.45e-01

Pos logit contributions
 Mom+0.014
 down+0.011
nam+0.010
 Dad+0.010
 Mum+0.010
Neg logit contributions
ub-0.051
emy-0.050
ib-0.049
pill-0.047
 villains-0.047
Token 
Token ID1849
Feature activation-2.98e+00

Pos logit contributions
rolley+0.021
cream+0.021
cakes+0.020
owner+0.020
 clerk+0.020
Neg logit contributions
 hops-0.024
ib-0.024
emy-0.024
 appearances-0.023
 tweet-0.023
Tokenthe
Token ID1169
Feature activation-4.73e-01

Pos logit contributions
ib+0.115
ub+0.112
emy+0.112
 hops+0.109
pill+0.108
Neg logit contributions
nam-0.066
 Mom-0.066
cream-0.066
rolley-0.065
 down-0.063
Token ground
Token ID2323
Feature activation-3.62e-01

Pos logit contributions
ub+0.018
ib+0.018
emy+0.018
 villains+0.017
pill+0.017
Neg logit contributions
 Mom-0.012
 down-0.011
rolley-0.011
 Dad-0.010
cream-0.010
Token was
Token ID373
Feature activation+5.05e-01

Pos logit contributions
emy+0.016
ib+0.016
ub+0.016
 villains+0.016
pill+0.016
Neg logit contributions
 Mom-0.006
 down-0.006
rolley-0.005
 Dad-0.005
cream-0.005
Token alive
Token ID6776
Feature activation+7.99e-01

Pos logit contributions
 Mom+0.012
 down+0.011
 Dad+0.010
 Mum+0.010
nam+0.010
Neg logit contributions
ib-0.020
ub-0.020
emy-0.020
pill-0.019
 villains-0.019
Token!
Token ID0
Feature activation+6.41e-01

Pos logit contributions
 down+0.014
 Mom+0.014
 up+0.012
 Dad+0.011
 Mum+0.011
Neg logit contributions
emy-0.037
ub-0.037
ib-0.037
 villains-0.036
 weaknesses-0.036