TOP ACTIVATIONS
MAX = 7.932

of her and said , " See , you can do
mom smiled and said , " I love spending time with
her back and said , " I love you too ,
felt bad and said , " I 'm sorry , Lily
mom smiled and said , " That 's my good boy
him back and said , " I love you , Bear
big hug and said , " You 're a smart and
over and everyone cheered . " H oor ay for Blue
and hugged Sammy back . " That 's what friends are
mom smiled and said , " I 'm glad I could
her back and said , " You 're my best friend
Bob grinned , relieved . " P hew ! That 's
her back and said , " I love you too ,
my smiled and said , " Yes , we did it
around the jungle together . " Thank you , Max !"
my smiled and said , " Thank you , Lily .
parents smiled and said , " We are glad you understand
and hugged her tight . " It 's ok ," she
hugged her and said , " Thank you , Lily .
her back and said , " You 're welcome , my

MIN ACTIVATIONS
MIN = -6.349

the toilet , Adam asked Laura to wait outside . She
She called out to her daughter , Sarah . " Sarah
, the mom asked her daughter , " Would you like
her face and asked her daughter , " Do you want
the mum said to her daughter , " Let 's try
, the mom asked the daughter if she wanted to help
and showed it to her daughter . " What do you
Mom ? ” asked Amy . “
One day , mom asked Amy to replace the powder in
was angry and asked her daughter why she had done something
, Mama and Papa asked Sarah to share her toys with
She was very excited because her mom was making a pie
Polly heard a knock on her door . She opened it
loved to listen . Whenever his Mum called him in for
to call out to the daughter , " Come inside quick
scared and he asked his brother , " What is that
to start her day because her parents had a surprise for
. She asked Tom and Anna to watch Ben for a
. When her mom asked her why she was frown ing
. He was sad because his puppy was stuck in a

INTERVAL 4.362 - 7.932
CONTAINS 1.291%

follow the rabbit . The rabbit took the kitten
and trees ." " OK , but you have
to her mom my , " Look at this book !
too . Max said , " Can I try sitting on
to eat . The end . <|endoftext|> Once

INTERVAL 0.792 - 4.362
CONTAINS 28.065%

ister . We will be more careful next time ."
. They liked to swing , slide and climb on the
let 's play with the dolls . You can be the
woke up and wanted to play with the bow . He
and jewels . Jack was very happy and danced around on

INTERVAL -2.779 - 0.792
CONTAINS 67.277%

night , Lily 's mom my took her outside and showed
made his way closer and asked , " Hey ! Hi
and their coat . They still liked motion , but they
up and took him to the vet . The vet gave
be more careful next time ." They nodded .

INTERVAL -6.349 - -2.779
CONTAINS 3.366%

oop ing around , he mentioned dust . He thought out
again . She thanked the boy and they became friends .
She asked , " Do you want to play ?" The
P ips que ak called out , " Help ! Help
He watched as his mum started to work hard on the
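
The four INTERVAL ranges above look like equal-width bins spanning MIN = -6.349 to MAX = 7.932. The following is a minimal sketch of that reading, not the dashboard's actual code; the function name `equal_width_edges` is hypothetical, and the inputs are the rounded values printed above, so the computed edges match the listed ones only to about three decimals.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + k * width for k in range(n_bins + 1)]


# MIN and MAX as printed in this dump (already rounded to 3 decimals).
edges = equal_width_edges(-6.349, 7.932, 4)

# Share of activations reported for each interval, highest interval first.
shares = [1.291, 28.065, 67.277, 3.366]

for left, right in zip(edges[:-1], edges[1:]):
    print(f"INTERVAL {left:.3f} - {right:.3f}")
print(f"total share: {sum(shares):.3f}%")
```

Under this assumption the reported shares should add up to roughly 100%, which gives a quick consistency check on the four CONTAINS lines.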
Token of
Token ID286
Feature activation+1.34e+00

Pos logit contributions
izing+0.008
ler+0.007
ww+0.007
hest+0.007
E+0.007
Neg logit contributions
 wife-0.010
 opinion-0.010
 attitude-0.010
 stomach-0.010
 name-0.010
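
The per-token record above, like those that follow, has its labels fused to its values (`Token ID286`, `Feature activation+1.34e+00`, ` wife-0.010`). The sketch below shows one way to split such records into structured fields. It assumes the layout seen in this dump (a `Token…` line, a `Token ID…` line, a `Feature activation…` line, then the two contribution lists); `parse_records` and the dictionary keys are hypothetical names, not part of any tool.

```python
import re

TOKEN_ID = re.compile(r"Token ID\d+")
# Greedy prefix so the sign that starts the number is the last +/- on the line.
CONTRIB = re.compile(r"^(.*)([+-]\d+\.\d+)$")


def parse_records(lines):
    """Group dump lines into one dict per token record."""
    records, current, bucket = [], None, None
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if line.startswith("Token") and TOKEN_ID.fullmatch(nxt):
            # A record starts where a "Token..." line precedes "Token ID<n>".
            current = {"token": line[len("Token"):], "token_id": None,
                       "activation": None, "pos": [], "neg": []}
            records.append(current)
            bucket = None
        elif current is None:
            continue  # still in the header part of the dump
        elif TOKEN_ID.fullmatch(line):
            current["token_id"] = int(line[len("Token ID"):])
        elif line.startswith("Feature activation"):
            current["activation"] = float(line[len("Feature activation"):])
        elif line == "Pos logit contributions":
            bucket = "pos"
        elif line == "Neg logit contributions":
            bucket = "neg"
        elif bucket:
            match = CONTRIB.match(line)
            if match:
                # Keep the token text verbatim, including any leading space.
                current[bucket].append((match.group(1), float(match.group(2))))
    return records
```

Leading spaces are kept in the token text because they distinguish word-initial tokens (` wife`) from word-internal ones (`izing`).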
Token her
Token ID607
Feature activation-2.45e+00

Pos logit contributions
izing+0.041
ler+0.041
ww+0.040
hest+0.040
corn+0.039
Neg logit contributions
 wife-0.024
 opinion-0.023
 attitude-0.023
 stomach-0.021
 name-0.021
Token and
Token ID290
Feature activation+2.53e+00

Pos logit contributions
 wife+0.058
 stomach+0.053
sed+0.053
 attitude+0.052
 name+0.052
Neg logit contributions
izing-0.063
ler-0.057
hest-0.053
corn-0.053
is-0.053
Token said
Token ID531
Feature activation-2.19e+00

Pos logit contributions
izing+0.067
ler+0.062
ww+0.061
E+0.060
hest+0.060
Neg logit contributions
 wife-0.058
 opinion-0.056
 attitude-0.055
 stomach-0.054
 name-0.053
Token,
Token ID11
Feature activation-1.45e+00

Pos logit contributions
 wife+0.042
 opinion+0.038
 attitude+0.038
sed+0.037
 fault+0.037
Neg logit contributions
izing-0.067
ww-0.063
E-0.062
ler-0.061
corn-0.060
Token "
Token ID366
Feature activation+7.93e+00

Pos logit contributions
 wife+0.014
 opinion+0.013
 attitude+0.012
 name+0.012
 stomach+0.011
Neg logit contributions
izing-0.057
ler-0.057
hest-0.055
ww-0.055
E-0.054
TokenSee
Token ID6214
Feature activation+9.50e-01

Pos logit contributions
ler+0.196
ls+0.178
izing+0.175
iflower+0.167
corn+0.164
Neg logit contributions
 opinion-0.173
 wife-0.168
 ending-0.166
 name-0.164
 known-0.164
Token,
Token ID11
Feature activation+1.84e+00

Pos logit contributions
izing+0.024
ww+0.023
corn+0.021
E+0.021
ler+0.021
Neg logit contributions
 wife-0.024
 attitude-0.023
 fault-0.022
 opinion-0.022
sed-0.022
Token you
Token ID345
Feature activation-1.97e+00

Pos logit contributions
ler+0.049
izing+0.049
hest+0.047
corn+0.045
ww+0.045
Neg logit contributions
 wife-0.041
 opinion-0.041
 attitude-0.040
 name-0.039
 stomach-0.038
Token can
Token ID460
Feature activation-6.23e-01

Pos logit contributions
 wife+0.044
 opinion+0.041
 attitude+0.040
 stomach+0.039
 name+0.038
Neg logit contributions
izing-0.054
ler-0.052
hest-0.050
ww-0.050
E-0.050
Token do
Token ID466
Feature activation+1.41e+00

Pos logit contributions
 wife+0.011
 opinion+0.011
 attitude+0.011
 stomach+0.010
 name+0.010
Neg logit contributions
izing-0.019
ler-0.019
hest-0.018
E-0.018
ww-0.018
Token mom
Token ID1995
Feature activation+2.43e+00

Pos logit contributions
 wife+0.040
urance+0.040
 attitude+0.038
 permission+0.038
 advisor+0.037
Neg logit contributions
E-0.072
ler-0.071
ww-0.070
corn-0.070
izing-0.069
Token smiled
Token ID13541
Feature activation-1.16e+00

Pos logit contributions
izing+0.074
ler+0.070
hest+0.069
ww+0.068
E+0.068
Neg logit contributions
 wife-0.047
 opinion-0.044
 attitude-0.043
 stomach-0.042
 name-0.042
Token and
Token ID290
Feature activation+2.50e+00

Pos logit contributions
 fault+0.037
 opinion+0.037
sed+0.037
 attitude+0.036
 wife+0.036
Neg logit contributions
izing-0.019
ler-0.018
E-0.016
B-0.015
corn-0.015
Token said
Token ID531
Feature activation-1.86e+00

Pos logit contributions
izing+0.063
ler+0.060
ww+0.059
E+0.058
corn+0.058
Neg logit contributions
 wife-0.059
 opinion-0.058
 attitude-0.057
 stomach-0.055
 name-0.055
Token,
Token ID11
Feature activation-7.58e-01

Pos logit contributions
 wife+0.036
 opinion+0.034
 attitude+0.033
 name+0.032
 fault+0.032
Neg logit contributions
izing-0.056
ww-0.053
ler-0.053
E-0.052
corn-0.051
Token "
Token ID366
Feature activation+7.93e+00

Pos logit contributions
 wife+0.009
 opinion+0.008
 attitude+0.008
 stomach+0.007
 name+0.007
Neg logit contributions
izing-0.029
ler-0.028
hest-0.027
ww-0.027
E-0.027
TokenI
Token ID40
Feature activation-1.03e+00

Pos logit contributions
ler+0.200
izing+0.179
iflower+0.171
ls+0.170
hest+0.164
Neg logit contributions
 wife-0.184
joy-0.164
 opinion-0.164
 nod-0.161
 ending-0.160
Token love
Token ID1842
Feature activation+4.21e-01

Pos logit contributions
 wife+0.021
 stomach+0.020
sed+0.020
 attitude+0.020
 behavior+0.019
Neg logit contributions
E-0.027
ler-0.027
ww-0.027
hest-0.026
izing-0.026
Token spending
Token ID4581
Feature activation+1.39e+00

Pos logit contributions
ler+0.006
izing+0.006
hest+0.006
E+0.005
sc+0.005
Neg logit contributions
 wife-0.015
 opinion-0.014
 parents-0.014
 attitude-0.014
 name-0.014
Token time
Token ID640
Feature activation+1.51e+00

Pos logit contributions
izing+0.041
ler+0.039
ww+0.038
hest+0.038
corn+0.038
Neg logit contributions
 wife-0.028
 opinion-0.027
 attitude-0.026
 stomach-0.026
 name-0.025
Token with
Token ID351
Feature activation+2.22e+00

Pos logit contributions
izing+0.038
E+0.037
ler+0.036
ww+0.035
corn+0.034
Neg logit contributions
 wife-0.035
 opinion-0.034
 attitude-0.032
 stomach-0.032
 goodbye-0.032
Token her
Token ID607
Feature activation-1.84e+00

Pos logit contributions
izing+0.030
ler+0.029
hest+0.028
ww+0.028
E+0.028
Neg logit contributions
 wife-0.022
 opinion-0.021
 attitude-0.021
 stomach-0.020
 name-0.020
Token back
Token ID736
Feature activation-1.96e+00

Pos logit contributions
 wife+0.049
 opinion+0.048
 fault+0.047
sed+0.046
 attitude+0.046
Neg logit contributions
izing-0.040
ler-0.039
E-0.036
corn-0.034
hest-0.034
Token and
Token ID290
Feature activation+2.38e+00

Pos logit contributions
 wife+0.048
 opinion+0.047
 attitude+0.046
 stomach+0.045
 name+0.045
Neg logit contributions
izing-0.048
ler-0.046
E-0.044
ww-0.044
hest-0.043
Token said
Token ID531
Feature activation-2.15e+00

Pos logit contributions
izing+0.060
ler+0.057
ww+0.055
corn+0.055
E+0.055
Neg logit contributions
 wife-0.057
 opinion-0.055
 attitude-0.055
 name-0.053
 stomach-0.053
Token,
Token ID11
Feature activation-1.50e+00

Pos logit contributions
 wife+0.040
 opinion+0.037
 attitude+0.036
 name+0.035
 fault+0.035
Neg logit contributions
izing-0.066
ww-0.063
ler-0.063
E-0.062
corn-0.061
Token "
Token ID366
Feature activation+7.87e+00

Pos logit contributions
 wife+0.020
 opinion+0.018
 attitude+0.018
 fault+0.017
 stomach+0.017
Neg logit contributions
izing-0.054
ler-0.051
E-0.051
ww-0.050
corn-0.050
TokenI
Token ID40
Feature activation-1.15e+00

Pos logit contributions
ler+0.196
izing+0.190
ls+0.179
hest+0.172
 pounds+0.171
Neg logit contributions
 wife-0.167
 ending-0.153
joy-0.151
 name-0.150
 opinion-0.148
Token love
Token ID1842
Feature activation+2.75e-01

Pos logit contributions
 wife+0.026
 attitude+0.025
 stomach+0.024
sed+0.024
 opinion+0.024
Neg logit contributions
izing-0.028
E-0.028
ler-0.028
ww-0.028
hest-0.027
Token you
Token ID345
Feature activation-5.65e-01

Pos logit contributions
ler+0.005
hest+0.005
izing+0.005
sc+0.005
plets+0.004
Neg logit contributions
 parents-0.008
 wife-0.008
 opinion-0.008
 stomach-0.008
 permission-0.008
Token too
Token ID1165
Feature activation-1.07e+00

Pos logit contributions
 wife+0.013
 opinion+0.012
 attitude+0.012
 stomach+0.011
 name+0.011
Neg logit contributions
izing-0.015
ler-0.015
ww-0.014
hest-0.014
E-0.014
Token,
Token ID11
Feature activation+6.70e-01

Pos logit contributions
 wife+0.018
 opinion+0.017
 attitude+0.016
 name+0.016
 stomach+0.015
Neg logit contributions
izing-0.035
ler-0.033
ww-0.033
E-0.033
hest-0.032
Token felt
Token ID2936
Feature activation+1.72e-01

Pos logit contributions
ler+0.012
izing+0.012
corn+0.011
E+0.011
B+0.011
Neg logit contributions
 wife-0.012
 fault-0.012
 opinion-0.011
 attitude-0.011
 name-0.011
Token bad
Token ID2089
Feature activation-5.76e-01

Pos logit contributions
izing+0.004
ler+0.004
E+0.004
hest+0.004
ww+0.004
Neg logit contributions
 wife-0.004
 attitude-0.004
 opinion-0.004
sed-0.004
 fault-0.004
Token and
Token ID290
Feature activation+1.99e+00

Pos logit contributions
 wife+0.016
 opinion+0.014
 stomach+0.014
 name+0.014
sed+0.014
Neg logit contributions
izing-0.013
ler-0.012
ww-0.011
corn-0.011
B-0.011
Token said
Token ID531
Feature activation-2.68e+00

Pos logit contributions
izing+0.049
ler+0.048
ww+0.046
E+0.045
corn+0.044
Neg logit contributions
 wife-0.046
 opinion-0.045
 fault-0.045
 attitude-0.045
 name-0.044
Token,
Token ID11
Feature activation-1.65e+00

Pos logit contributions
 wife+0.061
 fault+0.056
 opinion+0.055
 name+0.055
 attitude+0.053
Neg logit contributions
izing-0.072
E-0.066
ww-0.065
ler-0.065
B-0.063
Token "
Token ID366
Feature activation+7.86e+00

Pos logit contributions
 wife+0.025
 opinion+0.023
 attitude+0.023
 fault+0.022
 stomach+0.021
Neg logit contributions
izing-0.056
E-0.053
ler-0.052
corn-0.052
ww-0.052
TokenI
Token ID40
Feature activation-1.31e+00

Pos logit contributions
hest+0.203
ler+0.195
corn+0.192
 pounds+0.191
ds+0.187
Neg logit contributions
 wife-0.152
 opinion-0.148
 ending-0.143
 attitude-0.143
 name-0.135
Token'm
Token ID1101
Feature activation-1.49e+00

Pos logit contributions
 wife+0.026
 attitude+0.023
 stomach+0.023
urance+0.022
sed+0.022
Neg logit contributions
ler-0.037
E-0.037
ww-0.036
izing-0.036
w-0.035
Token sorry
Token ID7926
Feature activation-5.80e-01

Pos logit contributions
 wife+0.016
 attitude+0.014
 opinion+0.014
 stomach+0.013
 name+0.013
Neg logit contributions
izing-0.058
ler-0.056
E-0.055
ww-0.055
hest-0.055
Token,
Token ID11
Feature activation+4.37e-01

Pos logit contributions
 wife+0.011
 opinion+0.011
 attitude+0.011
 stomach+0.010
 name+0.010
Neg logit contributions
izing-0.017
ler-0.017
ww-0.016
E-0.016
hest-0.016
Token Lily
Token ID20037
Feature activation-1.84e+00

Pos logit contributions
izing+0.014
ler+0.014
hest+0.014
w+0.013
corn+0.013
Neg logit contributions
 opinion-0.008
 wife-0.007
 attitude-0.007
 name-0.007
 stomach-0.007
Token mom
Token ID1995
Feature activation+2.07e+00

Pos logit contributions
 wife+0.037
 attitude+0.034
 fault+0.033
urance+0.033
joy+0.032
Neg logit contributions
ww-0.087
corn-0.085
izing-0.084
ler-0.083
E-0.083
Token smiled
Token ID13541
Feature activation-1.50e+00

Pos logit contributions
izing+0.066
ww+0.061
hest+0.061
ler+0.061
E+0.060
Neg logit contributions
 wife-0.038
 opinion-0.035
 stomach-0.034
 attitude-0.034
 parents-0.033
Token and
Token ID290
Feature activation+2.62e+00

Pos logit contributions
 wife+0.040
 opinion+0.039
 fault+0.038
 attitude+0.038
sed+0.037
Neg logit contributions
izing-0.033
ler-0.032
E-0.030
hest-0.028
ww-0.028
Token said
Token ID531
Feature activation-1.47e+00

Pos logit contributions
izing+0.071
ler+0.068
ww+0.066
hest+0.066
E+0.066
Neg logit contributions
 wife-0.058
 opinion-0.055
 attitude-0.055
 stomach-0.053
 name-0.053
Token,
Token ID11
Feature activation-7.88e-01

Pos logit contributions
 wife+0.030
 opinion+0.029
 attitude+0.028
 name+0.027
 stomach+0.027
Neg logit contributions
izing-0.043
ler-0.040
ww-0.040
E-0.040
corn-0.039
Token "
Token ID366
Feature activation+7.82e+00

Pos logit contributions
 wife+0.009
 opinion+0.008
 attitude+0.008
 name+0.008
 stomach+0.008
Neg logit contributions
izing-0.030
ler-0.029
hest-0.028
ww-0.028
E-0.028
TokenThat
Token ID2504
Feature activation+1.19e+00

Pos logit contributions
ler+0.175
izing+0.159
hest+0.158
corn+0.151
iflower+0.148
Neg logit contributions
 wife-0.196
joy-0.187
 name-0.179
 known-0.177
 opinion-0.176
Token's
Token ID338
Feature activation+8.47e-01

Pos logit contributions
izing+0.033
ler+0.031
E+0.031
hest+0.030
ww+0.030
Neg logit contributions
 wife-0.026
 opinion-0.025
 attitude-0.025
 name-0.024
 stomach-0.024
Token my
Token ID616
Feature activation-1.76e+00

Pos logit contributions
izing+0.026
ler+0.026
hest+0.025
ww+0.024
E+0.024
Neg logit contributions
 wife-0.015
 opinion-0.015
 attitude-0.014
 name-0.014
 stomach-0.014
Token good
Token ID922
Feature activation-3.31e-01

Pos logit contributions
 wife+0.036
 opinion+0.033
 attitude+0.033
 stomach+0.032
 name+0.031
Neg logit contributions
izing-0.048
ler-0.047
ww-0.046
E-0.045
corn-0.045
Token boy
Token ID2933
Feature activation-1.07e+00

Pos logit contributions
 wife+0.005
sed+0.004
urance+0.004
 stomach+0.004
 opinion+0.004
Neg logit contributions
ww-0.011
ler-0.010
izing-0.010
corn-0.010
E-0.010
Token him
Token ID683
Feature activation-1.37e+00

Pos logit contributions
izing+0.059
hest+0.059
ww+0.056
ler+0.054
corn+0.054
Neg logit contributions
 wife-0.035
 name-0.033
 attitude-0.032
 parents-0.032
 stomach-0.032
Token back
Token ID736
Feature activation-1.60e+00

Pos logit contributions
 wife+0.036
 opinion+0.034
 attitude+0.034
 stomach+0.033
 name+0.033
Neg logit contributions
izing-0.032
ler-0.030
ww-0.029
hest-0.029
E-0.029
Token and
Token ID290
Feature activation+2.35e+00

Pos logit contributions
 wife+0.036
 opinion+0.035
 attitude+0.034
 stomach+0.033
 name+0.033
Neg logit contributions
izing-0.043
ler-0.041
hest-0.040
ww-0.040
E-0.040
Token said
Token ID531
Feature activation-1.99e+00

Pos logit contributions
izing+0.065
ler+0.062
hest+0.061
E+0.060
ww+0.060
Neg logit contributions
 wife-0.052
 opinion-0.048
 attitude-0.048
 stomach-0.047
 name-0.046
Token,
Token ID11
Feature activation-1.66e+00

Pos logit contributions
 wife+0.035
 opinion+0.033
 attitude+0.032
 name+0.030
 fault+0.030
Neg logit contributions
izing-0.063
ler-0.061
ww-0.060
E-0.060
corn-0.059
Token "
Token ID366
Feature activation+7.81e+00

Pos logit contributions
 wife+0.020
 opinion+0.018
 attitude+0.017
 stomach+0.017
 name+0.016
Neg logit contributions
izing-0.062
ler-0.060
hest-0.059
ww-0.059
E-0.059
TokenI
Token ID40
Feature activation-1.48e+00

Pos logit contributions
izing+0.170
hest+0.170
ler+0.163
ls+0.155
corn+0.146
Neg logit contributions
 wife-0.194
joy-0.181
 nod-0.181
 name-0.174
 ending-0.166
Token love
Token ID1842
Feature activation+1.83e-01

Pos logit contributions
 wife+0.029
 attitude+0.028
sed+0.027
 stomach+0.026
 behavior+0.026
Neg logit contributions
ler-0.040
E-0.040
ww-0.040
izing-0.039
hest-0.039
Token you
Token ID345
Feature activation-8.84e-01

Pos logit contributions
izing+0.003
ler+0.003
hest+0.003
E+0.002
sc+0.002
Neg logit contributions
 wife-0.006
 opinion-0.006
 parents-0.006
 stomach-0.006
 name-0.006
Token,
Token ID11
Feature activation+6.22e-01

Pos logit contributions
 wife+0.019
 opinion+0.018
 attitude+0.018
 name+0.017
 fault+0.017
Neg logit contributions
izing-0.024
ler-0.024
ww-0.023
hest-0.023
E-0.022
Token Bear
Token ID14732
Feature activation-5.15e-01

Pos logit contributions
izing+0.020
hest+0.019
w+0.019
ler+0.018
ww+0.018
Neg logit contributions
 wife-0.011
 opinion-0.011
 parents-0.011
 stomach-0.011
 attitude-0.011
Token big
Token ID1263
Feature activation-5.53e-01

Pos logit contributions
izing+0.021
ler+0.020
ww+0.020
E+0.020
hest+0.019
Neg logit contributions
 wife-0.015
 opinion-0.015
 attitude-0.014
 stomach-0.014
 name-0.013
Token hug
Token ID16225
Feature activation+4.40e-01

Pos logit contributions
 wife+0.008
 opinion+0.007
 attitude+0.007
 fault+0.007
urance+0.007
Neg logit contributions
ler-0.018
izing-0.018
ww-0.018
corn-0.017
E-0.017
Token and
Token ID290
Feature activation+2.41e+00

Pos logit contributions
izing+0.011
ler+0.010
ww+0.010
hest+0.010
corn+0.010
Neg logit contributions
 wife-0.011
 opinion-0.010
 attitude-0.010
 stomach-0.010
 name-0.010
Token said
Token ID531
Feature activation-1.61e+00

Pos logit contributions
izing+0.060
ler+0.058
ww+0.057
corn+0.056
E+0.055
Neg logit contributions
 wife-0.056
 opinion-0.054
 attitude-0.054
 stomach-0.052
 name-0.052
Token,
Token ID11
Feature activation-1.40e+00

Pos logit contributions
 wife+0.033
 opinion+0.031
 attitude+0.031
 name+0.030
sed+0.030
Neg logit contributions
izing-0.047
ww-0.043
ler-0.043
E-0.042
corn-0.042
Token "
Token ID366
Feature activation+7.80e+00

Pos logit contributions
 wife+0.015
 opinion+0.013
 attitude+0.013
 name+0.012
 stomach+0.012
Neg logit contributions
izing-0.054
ler-0.053
hest-0.052
ww-0.052
E-0.051
TokenYou
Token ID1639
Feature activation-1.88e+00

Pos logit contributions
ler+0.189
izing+0.180
hest+0.177
ls+0.170
 pounds+0.165
Neg logit contributions
 wife-0.179
 opinion-0.161
 ending-0.159
joy-0.159
 name-0.159
Token're
Token ID821
Feature activation-1.66e-01

Pos logit contributions
 wife+0.041
 attitude+0.038
 opinion+0.037
sed+0.037
joy+0.037
Neg logit contributions
ler-0.049
izing-0.049
hest-0.048
ww-0.048
E-0.046
Token a
Token ID257
Feature activation+2.00e+00

Pos logit contributions
 wife+0.003
 attitude+0.003
 opinion+0.003
 behavior+0.003
 stomach+0.003
Neg logit contributions
izing-0.005
hest-0.005
E-0.005
ler-0.005
ww-0.005
Token smart
Token ID4451
Feature activation-1.77e-01

Pos logit contributions
izing+0.059
ler+0.055
hest+0.054
corn+0.053
ww+0.052
Neg logit contributions
 wife-0.043
 opinion-0.040
 name-0.039
 stomach-0.039
 attitude-0.039
Token and
Token ID290
Feature activation+2.11e+00

Pos logit contributions
 wife+0.004
 opinion+0.004
sed+0.004
 stomach+0.004
 attitude+0.004
Neg logit contributions
ler-0.005
izing-0.004
ww-0.004
E-0.004
corn-0.004
Token over
Token ID625
Feature activation+4.06e-01

Pos logit contributions
 wife+0.018
 opinion+0.017
 attitude+0.017
 stomach+0.017
 name+0.017
Neg logit contributions
izing-0.027
ler-0.026
corn-0.026
E-0.026
ww-0.025
Token and
Token ID290
Feature activation+1.48e+00

Pos logit contributions
izing+0.012
ler+0.011
hest+0.011
ww+0.011
E+0.011
Neg logit contributions
 wife-0.008
 opinion-0.008
 attitude-0.008
 stomach-0.008
 name-0.008
Token everyone
Token ID2506
Feature activation+1.40e+00

Pos logit contributions
izing+0.040
ler+0.038
hest+0.038
E+0.038
ww+0.037
Neg logit contributions
 wife-0.032
 stomach-0.030
 opinion-0.030
 attitude-0.030
 name-0.029
Token cheered
Token ID34767
Feature activation-2.25e-01

Pos logit contributions
izing+0.041
E+0.038
ww+0.038
hest+0.038
corn+0.037
Neg logit contributions
 wife-0.028
 stomach-0.026
 opinion-0.026
 tum-0.025
 attitude-0.025
Token.
Token ID13
Feature activation-3.29e-01

Pos logit contributions
 wife+0.005
 opinion+0.005
 attitude+0.004
 stomach+0.004
 name+0.004
Neg logit contributions
izing-0.006
ler-0.006
E-0.006
ww-0.006
hest-0.006
Token "
Token ID366
Feature activation+7.79e+00

Pos logit contributions
 wife+0.007
 opinion+0.007
 attitude+0.007
 stomach+0.006
 fault+0.006
Neg logit contributions
izing-0.009
ler-0.008
E-0.008
hest-0.008
corn-0.008
TokenH
Token ID39
Feature activation+1.16e+00

Pos logit contributions
ler+0.195
izing+0.192
iflower+0.172
ls+0.172
hest+0.171
Neg logit contributions
 wife-0.169
joy-0.160
 ending-0.154
 known-0.152
 name-0.151
Tokenoor
Token ID2675
Feature activation+1.40e+00

Pos logit contributions
izing+0.031
ler+0.030
E+0.030
corn+0.029
sc+0.028
Neg logit contributions
 wife-0.024
 opinion-0.024
 attitude-0.023
 name-0.022
 ending-0.022
Tokenay
Token ID323
Feature activation+9.22e-02

Pos logit contributions
izing+0.028
ler+0.025
E+0.025
corn+0.025
web+0.025
Neg logit contributions
 opinion-0.038
 attitude-0.037
 name-0.037
 wife-0.037
 stomach-0.036
Token for
Token ID329
Feature activation+9.79e-01

Pos logit contributions
izing+0.002
ler+0.002
ww+0.002
hest+0.002
E+0.002
Neg logit contributions
 wife-0.003
 opinion-0.003
 attitude-0.003
 stomach-0.002
 name-0.002
Token Blue
Token ID4518
Feature activation+6.18e-01

Pos logit contributions
izing+0.024
ler+0.023
hest+0.023
E+0.022
ww+0.022
Neg logit contributions
 wife-0.024
 opinion-0.023
 attitude-0.023
 stomach-0.022
 name-0.022
Token and
Token ID290
Feature activation+2.22e+00

Pos logit contributions
 opinion+0.044
 wife+0.044
 attitude+0.044
 fault+0.043
sed+0.043
Neg logit contributions
izing-0.034
ler-0.031
E-0.030
B-0.028
hest-0.027
Token hugged
Token ID41130
Feature activation-8.08e-02

Pos logit contributions
izing+0.059
ler+0.056
ww+0.056
corn+0.054
E+0.054
Neg logit contributions
 wife-0.049
 opinion-0.048
 attitude-0.048
 fault-0.046
 stomach-0.045
Token Sammy
Token ID44067
Feature activation-1.68e+00

Pos logit contributions
 wife+0.002
 opinion+0.002
 attitude+0.002
 fault+0.002
sed+0.002
Neg logit contributions
izing-0.002
ler-0.002
E-0.002
B-0.002
corn-0.002
Token back
Token ID736
Feature activation-1.78e+00

Pos logit contributions
 wife+0.034
 opinion+0.033
 attitude+0.032
 fault+0.031
sed+0.030
Neg logit contributions
izing-0.047
ler-0.047
hest-0.044
corn-0.044
ww-0.044
Token.
Token ID13
Feature activation-1.43e+00

Pos logit contributions
 wife+0.036
 attitude+0.035
 opinion+0.035
 ending+0.033
 stomach+0.033
Neg logit contributions
izing-0.051
ler-0.048
E-0.046
corn-0.046
ww-0.045
Token "
Token ID366
Feature activation+7.75e+00

Pos logit contributions
 wife+0.028
 opinion+0.026
 attitude+0.026
 stomach+0.025
 behavior+0.024
Neg logit contributions
izing-0.041
E-0.039
ww-0.039
ler-0.039
corn-0.038
TokenThat
Token ID2504
Feature activation+1.23e+00

Pos logit contributions
ler+0.192
izing+0.192
ls+0.177
hest+0.172
corn+0.169
Neg logit contributions
 wife-0.177
 name-0.168
 opinion-0.163
joy-0.160
 ending-0.159
Token's
Token ID338
Feature activation+1.03e+00

Pos logit contributions
izing+0.032
ler+0.030
ww+0.029
hest+0.029
corn+0.029
Neg logit contributions
 wife-0.029
 opinion-0.028
 attitude-0.028
 stomach-0.027
 name-0.026
Token what
Token ID644
Feature activation-3.76e-02

Pos logit contributions
izing+0.029
ler+0.029
hest+0.027
E+0.027
corn+0.027
Neg logit contributions
 opinion-0.021
 wife-0.021
 name-0.021
 attitude-0.020
 ending-0.020
Token friends
Token ID2460
Feature activation+2.12e-01

Pos logit contributions
 wife+0.001
 attitude+0.001
 opinion+0.001
sed+0.001
 stomach+0.001
Neg logit contributions
izing-0.001
ww-0.001
ler-0.001
corn-0.001
w-0.001
Token are
Token ID389
Feature activation+1.81e+00

Pos logit contributions
izing+0.007
ler+0.006
E+0.006
ww+0.006
corn+0.006
Neg logit contributions
 wife-0.004
 opinion-0.004
 attitude-0.004
 name-0.004
 stomach-0.003
Token mom
Token ID1995
Feature activation+2.66e+00

Pos logit contributions
 wife+0.039
 attitude+0.037
 opinion+0.035
 fault+0.034
 permission+0.034
Neg logit contributions
E-0.078
B-0.075
corn-0.074
izing-0.074
ww-0.074
Token smiled
Token ID13541
Feature activation-1.30e+00

Pos logit contributions
izing+0.090
hest+0.084
ler+0.083
ww+0.082
E+0.081
Neg logit contributions
 wife-0.044
 opinion-0.041
 stomach-0.040
 name-0.039
 attitude-0.038
Token and
Token ID290
Feature activation+2.73e+00

Pos logit contributions
 wife+0.036
 opinion+0.036
 attitude+0.035
 fault+0.035
sed+0.035
Neg logit contributions
izing-0.026
ler-0.025
E-0.023
B-0.022
ww-0.022
Token said
Token ID531
Feature activation-2.14e+00

Pos logit contributions
izing+0.070
ler+0.067
ww+0.065
E+0.064
corn+0.064
Neg logit contributions
 wife-0.064
 opinion-0.062
 attitude-0.061
 stomach-0.059
 name-0.059
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
 wife+0.040
 opinion+0.037
 attitude+0.036
 fault+0.035
 name+0.035
Neg logit contributions
izing-0.066
ww-0.063
ler-0.063
E-0.062
corn-0.061
Token "
Token ID366
Feature activation+7.74e+00

Pos logit contributions
 wife+0.018
 opinion+0.017
 attitude+0.017
 stomach+0.015
 fault+0.015
Neg logit contributions
izing-0.050
ler-0.047
E-0.047
ww-0.046
corn-0.046
TokenI
Token ID40
Feature activation-1.45e+00

Pos logit contributions
ler+0.190
izing+0.187
ls+0.171
hest+0.171
 pounds+0.169
Neg logit contributions
 wife-0.154
joy-0.149
 name-0.147
 ending-0.145
 feeling-0.142
Token'm
Token ID1101
Feature activation-7.21e-01

Pos logit contributions
 wife+0.031
 attitude+0.029
 stomach+0.028
sed+0.028
 opinion+0.028
Neg logit contributions
ler-0.038
E-0.038
ww-0.037
izing-0.037
hest-0.037
Token glad
Token ID9675
Feature activation+3.55e-02

Pos logit contributions
 wife+0.013
 opinion+0.012
 attitude+0.012
 stomach+0.011
 name+0.011
Neg logit contributions
izing-0.023
ler-0.022
ww-0.021
hest-0.021
E-0.021
Token I
Token ID314
Feature activation-2.07e+00

Pos logit contributions
izing+0.001
ler+0.000
hest+0.000
E+0.000
ww+0.000
Neg logit contributions
 wife-0.001
 opinion-0.001
 attitude-0.001
 stomach-0.001
 name-0.001
Token could
Token ID714
Feature activation-7.31e-01

Pos logit contributions
 wife+0.046
 attitude+0.043
 opinion+0.043
 stomach+0.042
 fault+0.040
Neg logit contributions
ler-0.055
izing-0.055
hest-0.053
E-0.053
ww-0.052
Token her
Token ID607
Feature activation-1.11e+00

Pos logit contributions
izing+0.010
ler+0.010
hest+0.010
ww+0.010
E+0.010
Neg logit contributions
 wife-0.008
 opinion-0.007
 attitude-0.007
 stomach-0.007
 name-0.007
Token back
Token ID736
Feature activation-2.23e+00

Pos logit contributions
 wife+0.024
 opinion+0.023
 attitude+0.022
 name+0.021
 stomach+0.021
Neg logit contributions
izing-0.031
ler-0.030
hest-0.028
E-0.028
ww-0.028
Token and
Token ID290
Feature activation+1.85e+00

Pos logit contributions
 wife+0.055
 opinion+0.053
 attitude+0.052
 stomach+0.050
 name+0.050
Neg logit contributions
izing-0.055
ler-0.053
ww-0.051
E-0.050
hest-0.050
Token said
Token ID531
Feature activation-2.54e+00

Pos logit contributions
izing+0.044
ler+0.043
ww+0.042
corn+0.042
E+0.040
Neg logit contributions
 opinion-0.045
 wife-0.044
 attitude-0.044
 fault-0.042
 name-0.042
Token,
Token ID11
Feature activation-1.57e+00

Pos logit contributions
 wife+0.052
 opinion+0.050
 fault+0.049
 attitude+0.047
sed+0.047
Neg logit contributions
izing-0.069
ww-0.069
ler-0.067
E-0.065
corn-0.063
Token "
Token ID366
Feature activation+7.74e+00

Pos logit contributions
 wife+0.021
 opinion+0.020
 attitude+0.019
 fault+0.018
 stomach+0.018
Neg logit contributions
izing-0.056
ler-0.053
E-0.052
corn-0.052
ww-0.052
TokenYou
Token ID1639
Feature activation-1.18e+00

Pos logit contributions
izing+0.187
ler+0.184
ls+0.174
hest+0.170
ang+0.163
Neg logit contributions
 wife-0.165
 ending-0.158
 name-0.151
 opinion-0.150
 attitude-0.149
Token're
Token ID821
Feature activation-2.58e-01

Pos logit contributions
 wife+0.023
 attitude+0.021
 opinion+0.020
joy+0.020
 ending+0.020
Neg logit contributions
ler-0.034
hest-0.034
izing-0.034
ww-0.034
E-0.032
Token my
Token ID616
Feature activation-4.20e-01

Pos logit contributions
 wife+0.004
 attitude+0.004
 opinion+0.004
 stomach+0.004
 behavior+0.004
Neg logit contributions
izing-0.009
ler-0.008
hest-0.008
ww-0.008
E-0.008
Token best
Token ID1266
Feature activation-7.04e-02

Pos logit contributions
 wife+0.007
 opinion+0.007
 attitude+0.007
 stomach+0.006
 name+0.006
Neg logit contributions
izing-0.013
ler-0.013
ww-0.013
E-0.012
hest-0.012
Token friend
Token ID1545
Feature activation+5.65e-01

Pos logit contributions
 attitude+0.001
 wife+0.001
 opinion+0.001
 ending+0.001
 stomach+0.001
Neg logit contributions
ww-0.003
ler-0.003
izing-0.002
ch-0.002
E-0.002
TokenBob
Token ID18861
Feature activation+1.78e+00

Pos logit contributions
ler+0.083
izing+0.079
plets+0.073
hest+0.071
sc+0.070
Neg logit contributions
 wife-0.117
 opinion-0.104
 nod-0.103
 name-0.101
 parents-0.100
Token grinned
Token ID36268
Feature activation-1.79e+00

Pos logit contributions
izing+0.052
ler+0.050
E+0.048
ww+0.048
hest+0.048
Neg logit contributions
 wife-0.035
 opinion-0.033
 attitude-0.033
 stomach-0.032
 fault-0.031
Token,
Token ID11
Feature activation+4.67e-01

Pos logit contributions
 wife+0.048
sed+0.048
 attitude+0.047
 fault+0.047
 opinion+0.046
Neg logit contributions
izing-0.040
ler-0.037
E-0.035
B-0.033
hest-0.031
Token relieved
Token ID22598
Feature activation-1.32e+00

Pos logit contributions
izing+0.014
ler+0.013
E+0.013
ww+0.013
hest+0.013
Neg logit contributions
 wife-0.009
 opinion-0.009
 attitude-0.009
 name-0.008
 fault-0.008
Token.
Token ID13
Feature activation-1.31e+00

Pos logit contributions
 wife+0.031
 fault+0.031
 opinion+0.030
sed+0.030
 attitude+0.029
Neg logit contributions
izing-0.034
ler-0.031
ww-0.028
E-0.028
is-0.028
Token "
Token ID366
Feature activation+7.73e+00

Pos logit contributions
 wife+0.023
 opinion+0.021
 attitude+0.021
 behavior+0.020
 stomach+0.020
Neg logit contributions
izing-0.040
E-0.039
ww-0.038
corn-0.038
ler-0.038
TokenP
Token ID47
Feature activation+7.70e-01

Pos logit contributions
ler+0.198
izing+0.187
ls+0.183
hest+0.180
corn+0.180
Neg logit contributions
 wife-0.174
 name-0.163
 opinion-0.161
 ending-0.155
 attitude-0.153
Tokenhew
Token ID6391
Feature activation-8.01e-01

Pos logit contributions
izing+0.019
ler+0.018
corn+0.017
hest+0.016
E+0.016
Neg logit contributions
 wife-0.020
 opinion-0.019
 name-0.018
 attitude-0.018
 stomach-0.018
Token!
Token ID0
Feature activation-2.36e-01

Pos logit contributions
 wife+0.016
 opinion+0.015
 attitude+0.014
 stomach+0.014
 name+0.014
Neg logit contributions
izing-0.024
ler-0.023
ww-0.023
hest-0.022
E-0.022
Token That
Token ID1320
Feature activation+9.91e-01

Pos logit contributions
 wife+0.006
 attitude+0.006
 opinion+0.006
 stomach+0.006
 ending+0.006
Neg logit contributions
izing-0.005
ww-0.005
E-0.005
ler-0.005
corn-0.005
Token's
Token ID338
Feature activation+2.35e-01

Pos logit contributions
izing+0.027
ler+0.026
E+0.025
hest+0.025
ww+0.025
Neg logit contributions
 wife-0.022
 opinion-0.021
 attitude-0.021
 name-0.020
 stomach-0.020
Token her
Token ID607
Feature activation-1.90e+00

Pos logit contributions
izing+0.027
ler+0.026
hest+0.025
ww+0.025
E+0.025
Neg logit contributions
 wife-0.020
 opinion-0.019
 attitude-0.019
 stomach-0.018
 name-0.018
Token back
Token ID736
Feature activation-2.10e+00

Pos logit contributions
 wife+0.052
 opinion+0.051
 fault+0.050
sed+0.049
 attitude+0.049
Neg logit contributions
ler-0.038
izing-0.038
E-0.034
corn-0.033
hest-0.033
Token and
Token ID290
Feature activation+2.12e+00

Pos logit contributions
 wife+0.052
 opinion+0.050
 attitude+0.049
 stomach+0.048
 name+0.048
Neg logit contributions
izing-0.052
ler-0.049
E-0.047
ww-0.047
hest-0.046
Token said
Token ID531
Feature activation-2.33e+00

Pos logit contributions
izing+0.052
ler+0.050
ww+0.048
corn+0.048
E+0.048
Neg logit contributions
 wife-0.051
 opinion-0.050
 attitude-0.050
 name-0.048
 stomach-0.048
Token,
Token ID11
Feature activation-1.77e+00

Pos logit contributions
 wife+0.044
 opinion+0.041
 attitude+0.040
 fault+0.039
 name+0.039
Neg logit contributions
izing-0.070
ww-0.068
ler-0.067
E-0.067
corn-0.064
Token "
Token ID366
Feature activation+7.72e+00

Pos logit contributions
 wife+0.024
 opinion+0.023
 attitude+0.022
 fault+0.021
 stomach+0.021
Neg logit contributions
izing-0.063
ler-0.059
E-0.059
ww-0.058
corn-0.058
TokenI
Token ID40
Feature activation-1.29e+00

Pos logit contributions
ler+0.192
izing+0.190
ls+0.175
 pounds+0.168
hest+0.167
Neg logit contributions
 wife-0.160
 ending-0.150
joy-0.148
 name-0.148
 opinion-0.144
Token love
Token ID1842
Feature activation+1.41e-01

Pos logit contributions
 wife+0.029
 attitude+0.027
 stomach+0.026
 opinion+0.026
sed+0.026
Neg logit contributions
ler-0.032
izing-0.032
E-0.032
ww-0.032
hest-0.032
Token you
Token ID345
Feature activation-5.96e-01

Pos logit contributions
ler+0.003
hest+0.003
izing+0.003
sc+0.002
plets+0.002
Neg logit contributions
 parents-0.004
 wife-0.004
 opinion-0.004
 permission-0.004
 fault-0.004
Token too
Token ID1165
Feature activation-1.17e+00

Pos logit contributions
 wife+0.013
 opinion+0.012
 attitude+0.012
 stomach+0.012
 name+0.012
Neg logit contributions
izing-0.017
ler-0.016
ww-0.015
hest-0.015
E-0.015
Token,
Token ID11
Feature activation+3.73e-01

Pos logit contributions
 wife+0.020
 opinion+0.018
 attitude+0.018
 name+0.017
 stomach+0.017
Neg logit contributions
izing-0.038
ler-0.036
E-0.035
hest-0.035
ww-0.035
Tokenmy
Token ID1820
Feature activation+1.95e+00

Pos logit contributions
izing+0.078
hest+0.073
ww+0.072
ler+0.072
E+0.070
Neg logit contributions
 wife-0.044
 opinion-0.042
 stomach-0.041
 attitude-0.040
 name-0.040
Token smiled
Token ID13541
Feature activation-1.34e+00

Pos logit contributions
izing+0.060
ler+0.056
hest+0.056
ww+0.056
E+0.055
Neg logit contributions
 wife-0.038
 opinion-0.035
 stomach-0.034
 attitude-0.034
 name-0.033
Token and
Token ID290
Feature activation+2.76e+00

Pos logit contributions
 wife+0.039
 opinion+0.038
 attitude+0.038
 fault+0.038
sed+0.037
Neg logit contributions
izing-0.026
ler-0.025
E-0.023
B-0.022
hest-0.022
Token said
Token ID531
Feature activation-1.67e+00

Pos logit contributions
izing+0.075
ler+0.071
ww+0.069
hest+0.069
E+0.069
Neg logit contributions
 wife-0.062
 opinion-0.059
 attitude-0.058
 stomach-0.056
 name-0.056
Token,
Token ID11
Feature activation-1.52e+00

Pos logit contributions
 wife+0.034
 opinion+0.032
 attitude+0.031
 name+0.031
 fault+0.030
Neg logit contributions
izing-0.049
ler-0.046
ww-0.046
E-0.045
corn-0.045
Token "
Token ID366
Feature activation+7.71e+00

Pos logit contributions
 wife+0.018
 opinion+0.016
 attitude+0.016
 stomach+0.015
 name+0.015
Neg logit contributions
izing-0.057
ler-0.055
hest-0.054
ww-0.054
E-0.054
TokenYes
Token ID5297
Feature activation-4.13e-01

Pos logit contributions
ler+0.193
ls+0.188
izing+0.177
 pounds+0.174
hest+0.172
Neg logit contributions
 wife-0.156
 ending-0.141
 name-0.138
 nod-0.136
 known-0.132
Token,
Token ID11
Feature activation+8.34e-01

Pos logit contributions
 wife+0.005
 opinion+0.005
 attitude+0.005
 stomach+0.005
 tum+0.004
Neg logit contributions
izing-0.015
ler-0.015
hest-0.014
E-0.014
corn-0.014
Token we
Token ID356
Feature activation-2.33e+00

Pos logit contributions
izing+0.018
ler+0.017
hest+0.017
E+0.016
ww+0.016
Neg logit contributions
 wife-0.023
 opinion-0.023
 attitude-0.022
 stomach-0.022
 tum-0.022
Token did
Token ID750
Feature activation+8.44e-01

Pos logit contributions
 wife+0.054
 attitude+0.051
 opinion+0.050
 stomach+0.049
sed+0.048
Neg logit contributions
izing-0.059
ler-0.058
hest-0.057
ww-0.056
E-0.056
Token it
Token ID340
Feature activation+3.03e-01

Pos logit contributions
izing+0.020
ler+0.019
hest+0.019
E+0.018
ww+0.018
Neg logit contributions
 wife-0.022
 opinion-0.021
 name-0.020
 attitude-0.020
 stomach-0.020
Token around
Token ID1088
Feature activation+2.84e-01

Pos logit contributions
izing+0.029
ler+0.028
ww+0.028
hest+0.028
E+0.027
Neg logit contributions
 wife-0.024
 opinion-0.023
 attitude-0.023
 name-0.022
 stomach-0.022
Token the
Token ID262
Feature activation+4.72e-03

Pos logit contributions
izing+0.008
ler+0.008
ww+0.008
hest+0.008
E+0.008
Neg logit contributions
 wife-0.006
 opinion-0.006
 attitude-0.005
 stomach-0.005
 name-0.005
Token jungle
Token ID20712
Feature activation-1.93e-01

Pos logit contributions
izing+0.000
ler+0.000
ww+0.000
E+0.000
corn+0.000
Neg logit contributions
 wife-0.000
 opinion-0.000
 stomach-0.000
 name-0.000
 tum-0.000
Token together
Token ID1978
Feature activation-1.52e+00

Pos logit contributions
 wife+0.004
 opinion+0.003
 attitude+0.003
 stomach+0.003
 name+0.003
Neg logit contributions
izing-0.006
ler-0.006
hest-0.006
ww-0.006
E-0.005
Token.
Token ID13
Feature activation+9.37e-02

Pos logit contributions
 wife+0.028
 opinion+0.027
 attitude+0.026
 stomach+0.025
 name+0.025
Neg logit contributions
izing-0.047
ler-0.045
hest-0.044
E-0.043
corn-0.043
Token "
Token ID366
Feature activation+7.71e+00

Pos logit contributions
izing+0.003
ler+0.003
E+0.003
hest+0.003
ww+0.003
Neg logit contributions
 wife-0.002
 opinion-0.002
 attitude-0.002
 stomach-0.002
 name-0.002
TokenThank
Token ID10449
Feature activation+2.96e+00

Pos logit contributions
izing+0.188
ler+0.188
ls+0.165
iflower+0.160
 pounds+0.152
Neg logit contributions
 wife-0.178
 name-0.172
joy-0.170
 opinion-0.168
 ending-0.167
Token you
Token ID345
Feature activation-2.03e-01

Pos logit contributions
ler+0.046
izing+0.044
hest+0.040
corn+0.036
E+0.036
Neg logit contributions
 wife-0.096
 opinion-0.094
 name-0.093
 parents-0.090
 stomach-0.089
Token,
Token ID11
Feature activation+6.17e-02

Pos logit contributions
 wife+0.004
 opinion+0.004
 attitude+0.004
 name+0.004
 stomach+0.004
Neg logit contributions
izing-0.006
ww-0.006
ler-0.006
hest-0.006
E-0.006
Token Max
Token ID5436
Feature activation+1.82e-01

Pos logit contributions
izing+0.002
ler+0.002
hest+0.002
E+0.001
corn+0.001
Neg logit contributions
 wife-0.001
 opinion-0.001
 name-0.001
 attitude-0.001
 stomach-0.001
Token!"
Token ID2474
Feature activation+3.68e-01

Pos logit contributions
izing+0.004
ler+0.004
hest+0.003
E+0.003
ww+0.003
Neg logit contributions
 wife-0.005
 opinion-0.005
 attitude-0.005
 name-0.005
 stomach-0.005
Tokenmy
Token ID1820
Feature activation+1.96e+00

Pos logit contributions
izing+0.039
ler+0.036
hest+0.036
ww+0.035
E+0.035
Neg logit contributions
 wife-0.030
 opinion-0.029
 attitude-0.028
 name-0.027
 stomach-0.027
Token smiled
Token ID13541
Feature activation-1.66e+00

Pos logit contributions
izing+0.059
ler+0.056
hest+0.055
ww+0.054
E+0.054
Neg logit contributions
 wife-0.038
 opinion-0.036
 attitude-0.035
 stomach-0.034
 name-0.034
Token and
Token ID290
Feature activation+2.52e+00

Pos logit contributions
 opinion+0.050
 wife+0.050
 fault+0.050
sed+0.050
 attitude+0.049
Neg logit contributions
izing-0.030
ler-0.028
E-0.026
B-0.025
corn-0.023
Token said
Token ID531
Feature activation-2.06e+00

Pos logit contributions
izing+0.072
ler+0.068
hest+0.067
E+0.066
ww+0.065
Neg logit contributions
 wife-0.054
 opinion-0.050
 attitude-0.048
 stomach-0.048
 name-0.048
Token,
Token ID11
Feature activation-1.61e+00

Pos logit contributions
 wife+0.040
 opinion+0.037
 attitude+0.036
 name+0.035
 fault+0.035
Neg logit contributions
izing-0.063
ww-0.059
E-0.059
ler-0.058
corn-0.057
Token "
Token ID366
Feature activation+7.71e+00

Pos logit contributions
 wife+0.020
 opinion+0.019
 attitude+0.018
 stomach+0.017
 name+0.017
Neg logit contributions
izing-0.059
ler-0.057
E-0.056
ww-0.055
hest-0.055
TokenThank
Token ID10449
Feature activation+3.44e+00

Pos logit contributions
ler+0.202
izing+0.183
ls+0.179
plets+0.172
hest+0.171
Neg logit contributions
 wife-0.172
 ending-0.153
 opinion-0.152
joy-0.151
 name-0.148
Token you
Token ID345
Feature activation-8.63e-01

Pos logit contributions
ler+0.065
izing+0.063
hest+0.060
plets+0.056
corn+0.056
Neg logit contributions
 parents-0.092
 wife-0.092
 opinion-0.091
 permission-0.089
 name-0.089
Token,
Token ID11
Feature activation+1.19e+00

Pos logit contributions
 wife+0.014
 opinion+0.013
 attitude+0.013
 name+0.013
 stomach+0.012
Neg logit contributions
izing-0.028
ler-0.027
ww-0.027
hest-0.026
E-0.026
Token Lily
Token ID20037
Feature activation-7.07e-01

Pos logit contributions
ler+0.040
hest+0.039
izing+0.038
w+0.038
plets+0.037
Neg logit contributions
 opinion-0.020
 attitude-0.019
 name-0.018
 wife-0.017
 ending-0.017
Token.
Token ID13
Feature activation+7.76e-01

Pos logit contributions
 wife+0.014
 opinion+0.013
 attitude+0.013
 stomach+0.012
 name+0.012
Neg logit contributions
izing-0.021
ler-0.020
ww-0.020
corn-0.020
hest-0.019
Token parents
Token ID3397
Feature activation+1.80e+00

Pos logit contributions
 fault+0.030
sed+0.028
 opinion+0.028
urance+0.027
 attitude+0.026
Neg logit contributions
ler-0.106
iflower-0.105
izing-0.103
is-0.101
E-0.101
Token smiled
Token ID13541
Feature activation-1.55e+00

Pos logit contributions
izing+0.050
ler+0.048
E+0.047
hest+0.046
ww+0.046
Neg logit contributions
 wife-0.038
 opinion-0.036
 attitude-0.036
 name-0.035
 fault-0.035
Token and
Token ID290
Feature activation+2.18e+00

Pos logit contributions
 wife+0.047
 opinion+0.046
 fault+0.046
sed+0.046
 attitude+0.044
Neg logit contributions
izing-0.028
ler-0.027
E-0.024
B-0.022
ww-0.021
Token said
Token ID531
Feature activation-1.43e+00

Pos logit contributions
izing+0.054
ler+0.051
ww+0.050
E+0.049
corn+0.049
Neg logit contributions
 wife-0.051
 opinion-0.051
 attitude-0.051
 fault-0.050
 name-0.049
Token,
Token ID11
Feature activation-1.01e+00

Pos logit contributions
 wife+0.029
 opinion+0.028
 attitude+0.027
 name+0.026
 fault+0.026
Neg logit contributions
izing-0.042
ler-0.039
E-0.038
ww-0.038
corn-0.037
Token "
Token ID366
Feature activation+7.70e+00

Pos logit contributions
 wife+0.014
 opinion+0.013
 attitude+0.013
 fault+0.012
 stomach+0.012
Neg logit contributions
izing-0.036
ler-0.034
E-0.034
ww-0.033
corn-0.033
TokenWe
Token ID1135
Feature activation-2.61e+00

Pos logit contributions
ler+0.169
hest+0.157
izing+0.155
corn+0.151
iflower+0.146
Neg logit contributions
 wife-0.203
joy-0.184
 opinion-0.178
 nod-0.176
 attitude-0.174
Token are
Token ID389
Feature activation-8.41e-01

Pos logit contributions
 wife+0.063
sed+0.060
 stomach+0.059
 attitude+0.058
 ending+0.057
Neg logit contributions
ler-0.061
izing-0.059
hest-0.058
ww-0.058
w-0.057
Token glad
Token ID9675
Feature activation-1.15e-01

Pos logit contributions
 wife+0.017
 attitude+0.016
 opinion+0.016
 stomach+0.015
 behavior+0.015
Neg logit contributions
izing-0.025
ler-0.023
hest-0.023
ww-0.023
E-0.023
Token you
Token ID345
Feature activation-2.06e+00

Pos logit contributions
 wife+0.004
 opinion+0.004
 attitude+0.004
 stomach+0.004
 name+0.004
Neg logit contributions
izing-0.001
ler-0.001
hest-0.001
ww-0.001
E-0.001
Token understand
Token ID1833
Feature activation-2.19e-01

Pos logit contributions
 wife+0.044
 attitude+0.040
 opinion+0.040
 stomach+0.039
 fault+0.039
Neg logit contributions
ler-0.055
izing-0.055
hest-0.054
ww-0.052
w-0.051
Token and
Token ID290
Feature activation+1.61e+00

Pos logit contributions
 wife+0.066
sed+0.063
 name+0.062
 advisor+0.061
 subjects+0.059
Neg logit contributions
izing-0.053
ler-0.052
is-0.048
B-0.044
hest-0.043
Token hugged
Token ID41130
Feature activation+1.43e+00

Pos logit contributions
izing+0.041
ler+0.039
ww+0.038
E+0.037
corn+0.037
Neg logit contributions
 wife-0.037
 opinion-0.036
 attitude-0.036
 name-0.035
 fault-0.035
Token her
Token ID607
Feature activation-2.19e+00

Pos logit contributions
izing+0.043
ler+0.042
hest+0.041
ww+0.041
corn+0.040
Neg logit contributions
 wife-0.027
 attitude-0.026
 opinion-0.026
 name-0.025
 stomach-0.024
Token tight
Token ID5381
Feature activation-1.49e+00

Pos logit contributions
 wife+0.055
 opinion+0.053
 fault+0.053
sed+0.052
 attitude+0.052
Neg logit contributions
izing-0.049
ler-0.047
E-0.043
hest-0.042
corn-0.041
Token.
Token ID13
Feature activation-6.68e-01

Pos logit contributions
 wife+0.034
 opinion+0.031
sed+0.028
 name+0.028
 attitude+0.028
Neg logit contributions
ler-0.039
izing-0.039
hest-0.034
sc-0.034
ww-0.033
Token "
Token ID366
Feature activation+7.70e+00

Pos logit contributions
 wife+0.014
 opinion+0.013
 attitude+0.013
 fault+0.012
 ending+0.012
Neg logit contributions
izing-0.019
E-0.018
ler-0.018
corn-0.018
ww-0.018
TokenIt
Token ID1026
Feature activation+2.69e-01

Pos logit contributions
izing+0.177
hest+0.170
ler+0.169
corn+0.154
plets+0.149
Neg logit contributions
 wife-0.192
 opinion-0.181
 ending-0.174
 attitude-0.173
joy-0.172
Token's
Token ID338
Feature activation+1.27e-01

Pos logit contributions
izing+0.008
ler+0.007
E+0.007
hest+0.007
ww+0.007
Neg logit contributions
 wife-0.006
 opinion-0.006
 attitude-0.005
 name-0.005
 stomach-0.005
Token ok
Token ID12876
Feature activation-1.05e-01

Pos logit contributions
izing+0.005
ler+0.005
hest+0.005
E+0.005
ww+0.005
Neg logit contributions
 wife-0.001
 opinion-0.001
 attitude-0.001
 stomach-0.001
 name-0.001
Token,"
Token ID553
Feature activation+1.25e+00

Pos logit contributions
 wife+0.002
 opinion+0.002
 attitude+0.002
 stomach+0.002
 name+0.002
Neg logit contributions
izing-0.003
ler-0.003
ww-0.003
E-0.003
corn-0.003
Token she
Token ID673
Feature activation-3.62e-01

Pos logit contributions
izing+0.030
ler+0.029
hest+0.028
ww+0.027
E+0.027
Neg logit contributions
 wife-0.032
 opinion-0.030
 attitude-0.030
 stomach-0.029
 name-0.029
Token hugged
Token ID41130
Feature activation+1.14e+00

Pos logit contributions
izing+0.060
ler+0.057
hest+0.056
ww+0.056
E+0.055
Neg logit contributions
 wife-0.039
 opinion-0.037
 attitude-0.036
 stomach-0.035
 name-0.035
Token her
Token ID607
Feature activation-2.07e+00

Pos logit contributions
izing+0.033
ler+0.032
hest+0.031
ww+0.031
corn+0.031
Neg logit contributions
 wife-0.023
 opinion-0.022
 attitude-0.022
 stomach-0.021
 name-0.021
Token and
Token ID290
Feature activation+2.37e+00

Pos logit contributions
 wife+0.058
 opinion+0.056
 fault+0.055
sed+0.055
 attitude+0.054
Neg logit contributions
ler-0.040
izing-0.040
E-0.037
hest-0.035
corn-0.034
Token said
Token ID531
Feature activation-2.32e+00

Pos logit contributions
izing+0.062
ler+0.058
ww+0.057
E+0.056
corn+0.056
Neg logit contributions
 wife-0.055
 opinion-0.053
 attitude-0.053
 stomach-0.051
 name-0.051
Token,
Token ID11
Feature activation-1.71e+00

Pos logit contributions
 wife+0.046
 opinion+0.042
 attitude+0.041
 name+0.041
 fault+0.040
Neg logit contributions
izing-0.070
ler-0.065
E-0.065
ww-0.065
corn-0.062
Token "
Token ID366
Feature activation+7.69e+00

Pos logit contributions
 wife+0.022
 opinion+0.020
 attitude+0.019
 stomach+0.018
 name+0.018
Neg logit contributions
izing-0.063
ler-0.060
E-0.059
ww-0.059
hest-0.059
TokenThank
Token ID10449
Feature activation+3.69e+00

Pos logit contributions
ler+0.191
corn+0.184
hest+0.183
 pounds+0.176
izing+0.174
Neg logit contributions
 wife-0.157
 ending-0.145
 opinion-0.143
 attitude-0.139
 feeling-0.136
Token you
Token ID345
Feature activation-6.55e-01

Pos logit contributions
ler+0.072
izing+0.069
hest+0.068
corn+0.065
plets+0.064
Neg logit contributions
 parents-0.096
 wife-0.095
 opinion-0.094
 permission-0.094
 goodbye-0.092
Token,
Token ID11
Feature activation+1.05e+00

Pos logit contributions
 wife+0.011
 opinion+0.010
 attitude+0.010
 name+0.009
 stomach+0.009
Neg logit contributions
izing-0.022
ler-0.021
ww-0.020
hest-0.020
E-0.020
Token Lily
Token ID20037
Feature activation-8.89e-01

Pos logit contributions
ler+0.034
izing+0.034
hest+0.033
corn+0.032
w+0.032
Neg logit contributions
 opinion-0.018
 wife-0.017
 attitude-0.017
 name-0.016
 ending-0.016
Token.
Token ID13
Feature activation+1.02e+00

Pos logit contributions
 wife+0.019
 opinion+0.018
 attitude+0.018
 stomach+0.017
sed+0.017
Neg logit contributions
izing-0.024
ler-0.024
ww-0.023
corn-0.023
E-0.023
Token her
Token ID607
Feature activation-2.24e+00

Pos logit contributions
izing+0.020
ler+0.019
E+0.019
ww+0.019
hest+0.019
Neg logit contributions
 wife-0.015
 opinion-0.015
 attitude-0.014
 stomach-0.014
 name-0.014
Token back
Token ID736
Feature activation-2.64e+00

Pos logit contributions
 wife+0.059
 opinion+0.057
 attitude+0.056
 fault+0.055
 ending+0.055
Neg logit contributions
ler-0.048
izing-0.047
E-0.043
corn-0.042
hest-0.041
Token and
Token ID290
Feature activation+2.18e+00

Pos logit contributions
 wife+0.068
 opinion+0.066
 attitude+0.066
 fault+0.064
 stomach+0.064
Neg logit contributions
izing-0.059
ler-0.056
ww-0.053
E-0.052
corn-0.052
Token said
Token ID531
Feature activation-2.36e+00

Pos logit contributions
izing+0.058
ler+0.055
ww+0.053
E+0.053
hest+0.053
Neg logit contributions
 wife-0.050
 opinion-0.048
 attitude-0.047
 stomach-0.046
 name-0.046
Token,
Token ID11
Feature activation-1.71e+00

Pos logit contributions
 wife+0.046
 opinion+0.044
 attitude+0.042
 fault+0.042
 name+0.041
Neg logit contributions
izing-0.070
ww-0.066
ler-0.065
E-0.065
corn-0.063
Token "
Token ID366
Feature activation+7.68e+00

Pos logit contributions
 wife+0.022
 opinion+0.020
 attitude+0.019
 stomach+0.018
 name+0.018
Neg logit contributions
izing-0.063
ler-0.060
E-0.059
ww-0.059
hest-0.059
TokenYou
Token ID1639
Feature activation-1.75e+00

Pos logit contributions
ler+0.185
izing+0.182
ls+0.173
hest+0.169
corn+0.157
Neg logit contributions
 wife-0.179
 ending-0.159
joy-0.158
 opinion-0.158
 name-0.158
Token're
Token ID821
Feature activation-4.95e-01

Pos logit contributions
 wife+0.033
 attitude+0.031
 opinion+0.030
sed+0.030
 behavior+0.029
Neg logit contributions
ler-0.051
izing-0.050
hest-0.050
ww-0.049
E-0.048
Token welcome
Token ID7062
Feature activation-6.92e-01

Pos logit contributions
 wife+0.008
 attitude+0.008
 opinion+0.007
 behavior+0.007
sed+0.007
Neg logit contributions
izing-0.016
hest-0.015
E-0.015
ww-0.015
ler-0.015
Token,
Token ID11
Feature activation+9.14e-01

Pos logit contributions
 wife+0.013
 attitude+0.013
 opinion+0.013
 fault+0.012
 ending+0.012
Neg logit contributions
izing-0.020
ww-0.019
E-0.019
ler-0.019
corn-0.019
Token my
Token ID616
Feature activation-1.49e+00

Pos logit contributions
izing+0.027
ler+0.026
hest+0.025
w+0.024
E+0.024
Neg logit contributions
 wife-0.019
 opinion-0.019
 attitude-0.018
 stomach-0.018
 name-0.018
Token the
Token ID262
Feature activation-7.21e-01

Pos logit contributions
sed+0.037
 attitude+0.036
 wife+0.036
 opinion+0.036
 stomach+0.035
Neg logit contributions
E-0.028
izing-0.028
corn-0.026
ler-0.025
Little-0.025
Token toilet
Token ID16146
Feature activation-1.19e+00

Pos logit contributions
 wife+0.016
 opinion+0.015
 attitude+0.015
 stomach+0.014
 fault+0.014
Neg logit contributions
izing-0.020
E-0.019
ler-0.019
hest-0.018
ww-0.018
Token,
Token ID11
Feature activation+3.30e-01

Pos logit contributions
 wife+0.019
 opinion+0.018
 attitude+0.017
 name+0.016
 stomach+0.016
Neg logit contributions
izing-0.040
ler-0.038
hest-0.037
ww-0.037
corn-0.037
Token Adam
Token ID7244
Feature activation+5.85e-01

Pos logit contributions
izing+0.009
ler+0.008
corn+0.008
ww+0.008
E+0.008
Neg logit contributions
 wife-0.007
 opinion-0.007
 attitude-0.007
 stomach-0.007
 name-0.007
Token asked
Token ID1965
Feature activation-2.97e+00

Pos logit contributions
izing+0.016
ler+0.016
corn+0.015
hest+0.015
E+0.015
Neg logit contributions
 wife-0.012
 opinion-0.012
 attitude-0.012
 name-0.011
 stomach-0.011
Token Laura
Token ID16753
Feature activation-6.35e+00

Pos logit contributions
sed+0.072
 fault+0.071
 wife+0.069
 stomach+0.067
 opinion+0.067
Neg logit contributions
izing-0.065
E-0.064
corn-0.064
ler-0.063
hest-0.060
Token to
Token ID284
Feature activation-6.61e-01

Pos logit contributions
sed+0.133
 attitude+0.127
 wife+0.122
 ending+0.121
 opinion+0.120
Neg logit contributions
izing-0.158
ler-0.153
corn-0.150
E-0.149
hest-0.148
Token wait
Token ID4043
Feature activation-3.38e+00

Pos logit contributions
 fault+0.013
sed+0.013
 attitude+0.012
joy+0.012
 ending+0.012
Neg logit contributions
izing-0.018
E-0.017
hest-0.017
ler-0.017
ww-0.016
Token outside
Token ID2354
Feature activation-3.89e+00

Pos logit contributions
 wife+0.073
 opinion+0.069
 attitude+0.068
 stomach+0.068
 name+0.066
Neg logit contributions
izing-0.093
ler-0.088
hest-0.085
E-0.084
ww-0.083
Token.
Token ID13
Feature activation-1.27e+00

Pos logit contributions
 wife+0.081
 opinion+0.075
 attitude+0.074
 name+0.073
sed+0.072
Neg logit contributions
izing-0.110
ler-0.105
hest-0.099
corn-0.098
E-0.097
Token She
Token ID1375
Feature activation+4.80e-01

Pos logit contributions
 wife+0.029
 fault+0.028
sed+0.027
 stomach+0.027
 opinion+0.027
Neg logit contributions
izing-0.030
E-0.029
corn-0.028
hest-0.027
Now-0.026
Token She
Token ID1375
Feature activation-3.74e-01

Pos logit contributions
 wife+0.017
 attitude+0.016
 opinion+0.016
 fault+0.016
 behavior+0.015
Neg logit contributions
izing-0.028
E-0.027
ler-0.026
ww-0.026
hest-0.026
Token called
Token ID1444
Feature activation-3.13e+00

Pos logit contributions
 wife+0.006
 opinion+0.006
 name+0.006
 stomach+0.005
 tum+0.005
Neg logit contributions
izing-0.013
hest-0.012
ler-0.012
E-0.012
ww-0.012
Token out
Token ID503
Feature activation-2.78e+00

Pos logit contributions
 wife+0.070
 opinion+0.065
 attitude+0.065
 stomach+0.063
 fault+0.062
Neg logit contributions
izing-0.083
ler-0.079
E-0.078
ww-0.077
hest-0.077
Token to
Token ID284
Feature activation-1.25e+00

Pos logit contributions
 wife+0.042
 attitude+0.039
 fault+0.039
 opinion+0.038
 stomach+0.037
Neg logit contributions
izing-0.092
ww-0.087
ler-0.087
E-0.087
corn-0.086
Token her
Token ID607
Feature activation-4.05e+00

Pos logit contributions
 wife+0.025
 opinion+0.024
 attitude+0.024
 stomach+0.023
 name+0.023
Neg logit contributions
izing-0.036
ler-0.035
E-0.034
hest-0.034
ww-0.033
Token daughter
Token ID4957
Feature activation-6.20e+00

Pos logit contributions
 attitude+0.063
 fault+0.059
 wife+0.059
 ending+0.058
 opinion+0.058
Neg logit contributions
izing-0.122
ler-0.120
E-0.119
corn-0.118
ww-0.113
Token,
Token ID11
Feature activation-2.24e+00

Pos logit contributions
 fault+0.130
 attitude+0.130
 wife+0.129
sed+0.126
 ending+0.123
Neg logit contributions
corn-0.147
izing-0.146
ler-0.146
E-0.140
ch-0.139
Token Sarah
Token ID10490
Feature activation-2.95e+00

Pos logit contributions
 wife+0.047
 attitude+0.046
 opinion+0.045
 fault+0.043
 stomach+0.043
Neg logit contributions
izing-0.061
ler-0.058
E-0.057
corn-0.057
ww-0.057
Token.
Token ID13
Feature activation-1.16e+00

Pos logit contributions
 wife+0.052
 opinion+0.050
 attitude+0.049
 fault+0.046
 stomach+0.046
Neg logit contributions
izing-0.091
ler-0.089
corn-0.086
E-0.085
ww-0.085
Token "
Token ID366
Feature activation+5.40e+00

Pos logit contributions
 wife+0.028
 fault+0.026
sed+0.025
 attitude+0.025
 stomach+0.025
Neg logit contributions
izing-0.028
E-0.028
Little-0.026
 million-0.025
B-0.025
TokenSarah
Token ID29284
Feature activation-1.18e+00

Pos logit contributions
izing+0.122
ler+0.117
hest+0.104
corn+0.100
ww+0.099
Neg logit contributions
 wife-0.153
 opinion-0.147
 name-0.144
joy-0.136
 attitude-0.136
Token,
Token ID11
Feature activation+7.52e-01

Pos logit contributions
 wife+0.008
 opinion+0.007
 fault+0.007
 attitude+0.007
 behavior+0.007
Neg logit contributions
izing-0.012
ww-0.011
ler-0.011
E-0.011
corn-0.011
Token the
Token ID262
Feature activation-1.68e+00

Pos logit contributions
izing+0.020
ler+0.019
E+0.018
ww+0.018
corn+0.018
Neg logit contributions
 wife-0.017
 opinion-0.017
 attitude-0.016
 fault-0.016
 stomach-0.016
Token mom
Token ID1995
Feature activation+8.93e-01

Pos logit contributions
 fault+0.024
 attitude+0.023
 opinion+0.022
sed+0.022
urance+0.021
Neg logit contributions
izing-0.053
E-0.053
ww-0.052
ler-0.052
Little-0.051
Token asked
Token ID1965
Feature activation-2.11e+00

Pos logit contributions
izing+0.026
ler+0.025
hest+0.024
E+0.024
ww+0.024
Neg logit contributions
 wife-0.019
 opinion-0.017
 attitude-0.017
 stomach-0.017
 name-0.017
Token her
Token ID607
Feature activation-4.62e+00

Pos logit contributions
 wife+0.049
 attitude+0.046
 opinion+0.046
 fault+0.046
 stomach+0.045
Neg logit contributions
izing-0.053
E-0.052
ler-0.049
ww-0.049
corn-0.049
Token daughter
Token ID4957
Feature activation-6.03e+00

Pos logit contributions
sed+0.107
 spine+0.105
urance+0.098
ses+0.097
 fault+0.093
Neg logit contributions
E-0.106
in-0.100
 else-0.100
ww-0.099
is-0.097
Token,
Token ID11
Feature activation-2.14e+00

Pos logit contributions
sed+0.151
 attitude+0.138
ses+0.135
urance+0.132
 wife+0.132
Neg logit contributions
 else-0.119
izing-0.118
corn-0.117
E-0.116
in-0.112
Token "
Token ID366
Feature activation+5.92e+00

Pos logit contributions
 attitude+0.039
 fault+0.038
 wife+0.037
 behavior+0.036
 opinion+0.036
Neg logit contributions
izing-0.063
E-0.062
corn-0.058
Little-0.058
ww-0.058
TokenWould
Token ID17353
Feature activation+2.95e+00

Pos logit contributions
izing+0.158
ler+0.156
hest+0.152
ww+0.148
ang+0.140
Neg logit contributions
 wife-0.136
joy-0.121
 permission-0.116
 opinion-0.114
 name-0.111
Token you
Token ID345
Feature activation-2.40e+00

Pos logit contributions
ler+0.047
izing+0.046
hest+0.040
web+0.037
plets+0.037
Neg logit contributions
 wife-0.097
 opinion-0.094
 name-0.090
 parents-0.090
 stomach-0.089
Token like
Token ID588
Feature activation-4.77e-01

Pos logit contributions
 wife+0.054
 attitude+0.052
 opinion+0.051
 stomach+0.050
sed+0.050
Neg logit contributions
izing-0.060
ler-0.060
hest-0.059
ww-0.058
E-0.058
Token her
Token ID607
Feature activation-3.09e+00

Pos logit contributions
 wife+0.010
 opinion+0.009
 attitude+0.009
 fault+0.009
sed+0.009
Neg logit contributions
izing-0.011
E-0.010
ler-0.010
hest-0.010
ww-0.010
Token face
Token ID1986
Feature activation-8.40e-01

Pos logit contributions
 fault+0.032
 wife+0.030
 opinion+0.030
 attitude+0.029
 ending+0.028
Neg logit contributions
izing-0.113
ler-0.111
hest-0.105
E-0.105
is-0.103
Token and
Token ID290
Feature activation+1.37e+00

Pos logit contributions
 wife+0.020
 opinion+0.019
 attitude+0.019
 stomach+0.018
 fault+0.018
Neg logit contributions
izing-0.021
ler-0.020
hest-0.019
ww-0.019
corn-0.019
Token asked
Token ID1965
Feature activation-2.42e+00

Pos logit contributions
izing+0.034
ler+0.033
ww+0.032
corn+0.032
E+0.032
Neg logit contributions
 opinion-0.030
 attitude-0.030
 wife-0.030
 fault-0.030
 stomach-0.029
Token her
Token ID607
Feature activation-3.85e+00

Pos logit contributions
 wife+0.054
 attitude+0.052
 opinion+0.051
sed+0.051
 fault+0.051
Neg logit contributions
izing-0.063
E-0.061
ler-0.058
corn-0.057
hest-0.056
Token daughter
Token ID4957
Feature activation-5.97e+00

Pos logit contributions
sed+0.057
 attitude+0.054
urance+0.053
 fault+0.052
 ending+0.049
Neg logit contributions
E-0.119
izing-0.117
corn-0.116
Little-0.113
ler-0.112
Token,
Token ID11
Feature activation-1.70e+00

Pos logit contributions
sed+0.145
 attitude+0.134
 ending+0.130
ses+0.127
urance+0.125
Neg logit contributions
corn-0.129
E-0.127
hest-0.126
izing-0.126
sc-0.123
Token "
Token ID366
Feature activation+6.02e+00

Pos logit contributions
 wife+0.025
 attitude+0.025
 opinion+0.024
 fault+0.023
 stomach+0.023
Neg logit contributions
izing-0.057
E-0.054
ler-0.053
corn-0.053
ww-0.052
TokenDo
Token ID5211
Feature activation+1.87e+00

Pos logit contributions
izing+0.152
ler+0.148
hest+0.136
ww+0.134
ang+0.127
Neg logit contributions
 wife-0.152
 name-0.138
joy-0.134
 opinion-0.133
 permission-0.129
Token you
Token ID345
Feature activation-3.04e+00

Pos logit contributions
izing+0.016
ww+0.013
ler+0.013
E+0.013
hest+0.012
Neg logit contributions
 wife-0.076
 opinion-0.073
 attitude-0.073
 stomach-0.072
 fault-0.071
Token want
Token ID765
Feature activation+4.80e-01

Pos logit contributions
 wife+0.060
 attitude+0.056
sed+0.056
 stomach+0.055
 opinion+0.055
Neg logit contributions
izing-0.085
ler-0.084
hest-0.082
ww-0.081
E-0.081
Token the
Token ID262
Feature activation-1.75e+00

Pos logit contributions
 wife+0.007
 opinion+0.006
 attitude+0.006
 fault+0.006
 stomach+0.006
Neg logit contributions
izing-0.013
ler-0.012
ww-0.012
E-0.012
corn-0.012
Token mum
Token ID25682
Feature activation+4.86e-01

Pos logit contributions
 fault+0.027
 attitude+0.026
sed+0.024
 opinion+0.024
joy+0.023
Neg logit contributions
ww-0.055
izing-0.054
is-0.053
E-0.053
ler-0.053
Token said
Token ID531
Feature activation-3.18e+00

Pos logit contributions
izing+0.015
ler+0.014
hest+0.014
E+0.014
ww+0.013
Neg logit contributions
 wife-0.010
 opinion-0.009
 attitude-0.009
 name-0.009
 stomach-0.009
Token to
Token ID284
Feature activation+6.72e-01

Pos logit contributions
 attitude+0.068
 fault+0.068
 wife+0.068
 opinion+0.066
sed+0.063
Neg logit contributions
ww-0.081
izing-0.080
E-0.078
corn-0.074
ler-0.073
Token her
Token ID607
Feature activation-3.91e+00

Pos logit contributions
izing+0.018
ler+0.017
E+0.017
ww+0.017
hest+0.017
Neg logit contributions
 wife-0.015
 opinion-0.014
 attitude-0.014
 fault-0.013
 stomach-0.013
Token daughter
Token ID4957
Feature activation-5.85e+00

Pos logit contributions
sed+0.075
 fault+0.072
urance+0.071
 spine+0.070
 attitude+0.067
Neg logit contributions
izing-0.099
E-0.098
ler-0.097
Little-0.095
in-0.093
Token,
Token ID11
Feature activation-2.40e+00

Pos logit contributions
 wife+0.133
 attitude+0.127
sed+0.126
 fault+0.120
 rightful+0.119
Neg logit contributions
izing-0.127
corn-0.125
ww-0.119
ler-0.118
E-0.117
Token "
Token ID366
Feature activation+5.97e+00

Pos logit contributions
 attitude+0.044
 wife+0.042
 opinion+0.042
 fault+0.041
 behavior+0.040
Neg logit contributions
izing-0.070
E-0.069
ww-0.067
corn-0.067
ler-0.066
TokenLet
Token ID5756
Feature activation+4.36e-01

Pos logit contributions
izing+0.180
ler+0.177
hest+0.167
iflower+0.165
ls+0.161
Neg logit contributions
 wife-0.114
 opinion-0.103
 name-0.102
joy-0.098
 nod-0.097
Token's
Token ID338
Feature activation-2.62e+00

Pos logit contributions
izing+0.012
ler+0.012
hest+0.011
E+0.011
ww+0.011
Neg logit contributions
 wife-0.009
 opinion-0.009
 attitude-0.009
 stomach-0.008
 name-0.008
Token try
Token ID1949
Feature activation-8.32e-01

Pos logit contributions
 wife+0.047
 attitude+0.046
joy+0.045
 fault+0.045
sed+0.044
Neg logit contributions
izing-0.077
ler-0.074
E-0.073
hest-0.073
ww-0.073
Token,
Token ID11
Feature activation+4.10e-01

Pos logit contributions
 wife+0.017
 opinion+0.016
 attitude+0.016
 fault+0.016
 behavior+0.015
Neg logit contributions
izing-0.025
ww-0.024
ler-0.024
E-0.023
corn-0.023
Token the
Token ID262
Feature activation-2.00e+00

Pos logit contributions
izing+0.011
ler+0.011
hest+0.011
ww+0.011
E+0.010
Neg logit contributions
 wife-0.009
 opinion-0.009
 attitude-0.008
 stomach-0.008
 name-0.008
Token mom
Token ID1995
Feature activation+6.88e-01

Pos logit contributions
 fault+0.029
 attitude+0.028
 opinion+0.027
sed+0.026
joy+0.026
Neg logit contributions
izing-0.062
E-0.062
ww-0.062
ler-0.060
Little-0.060
Token asked
Token ID1965
Feature activation-2.39e+00

Pos logit contributions
izing+0.022
ler+0.020
hest+0.020
E+0.020
ww+0.019
Neg logit contributions
 wife-0.013
 opinion-0.012
 name-0.012
 attitude-0.012
 stomach-0.012
Token the
Token ID262
Feature activation-2.10e+00

Pos logit contributions
 wife+0.056
 fault+0.054
 attitude+0.053
 opinion+0.053
sed+0.052
Neg logit contributions
izing-0.060
E-0.058
ww-0.055
ler-0.055
corn-0.054
Token daughter
Token ID4957
Feature activation-5.77e+00

Pos logit contributions
 fault+0.042
 attitude+0.041
sed+0.041
urance+0.039
joy+0.038
Neg logit contributions
E-0.056
ww-0.053
izing-0.053
Little-0.052
is-0.051
Token if
Token ID611
Feature activation+6.52e-01

Pos logit contributions
sed+0.127
 attitude+0.115
urance+0.115
 ending+0.112
 fault+0.112
Neg logit contributions
E-0.133
 else-0.132
izing-0.130
hest-0.128
corn-0.128
Token she
Token ID673
Feature activation-3.58e+00

Pos logit contributions
izing+0.012
E+0.010
ww+0.010
ler+0.010
corn+0.010
Neg logit contributions
 opinion-0.020
 attitude-0.020
 wife-0.019
 fault-0.019
 name-0.019
Token wanted
Token ID2227
Feature activation-9.97e-01

Pos logit contributions
 wife+0.079
 attitude+0.078
 opinion+0.076
 fault+0.075
 stomach+0.073
Neg logit contributions
izing-0.089
ler-0.088
E-0.087
ww-0.086
hest-0.085
Token to
Token ID284
Feature activation-1.79e+00

Pos logit contributions
 attitude+0.016
sed+0.015
 wife+0.015
 opinion+0.014
 fault+0.014
Neg logit contributions
izing-0.030
corn-0.029
ler-0.029
ww-0.029
E-0.029
Token help
Token ID1037
Feature activation+1.02e-01

Pos logit contributions
sed+0.037
joy+0.037
 fault+0.036
 attitude+0.035
 ending+0.034
Neg logit contributions
hest-0.046
E-0.045
izing-0.045
ler-0.044
ww-0.044
Token and
Token ID290
Feature activation+8.21e-01

Pos logit contributions
 wife+0.004
 attitude+0.004
 opinion+0.003
 fault+0.003
 stomach+0.003
Neg logit contributions
izing-0.005
ler-0.005
hest-0.004
ww-0.004
corn-0.004
Token showed
Token ID3751
Feature activation-1.54e-01

Pos logit contributions
izing+0.026
ler+0.025
hest+0.024
E+0.023
ww+0.023
Neg logit contributions
 wife-0.016
 name-0.015
 opinion-0.014
 stomach-0.014
 tum-0.014
Token it
Token ID340
Feature activation+1.00e-01

Pos logit contributions
 wife+0.003
 opinion+0.003
 name+0.003
 attitude+0.003
 stomach+0.003
Neg logit contributions
izing-0.005
ler-0.004
hest-0.004
ww-0.004
corn-0.004
Token to
Token ID284
Feature activation+5.24e-01

Pos logit contributions
izing+0.003
hest+0.003
ler+0.003
E+0.003
corn+0.003
Neg logit contributions
 wife-0.001
 attitude-0.001
 stomach-0.001
 opinion-0.001
 fault-0.001
Token her
Token ID607
Feature activation-3.37e+00

Pos logit contributions
izing+0.016
ler+0.015
ww+0.015
hest+0.015
corn+0.015
Neg logit contributions
 wife-0.010
 opinion-0.010
 attitude-0.009
 name-0.009
 stomach-0.009
Token daughter
Token ID4957
Feature activation-5.72e+00

Pos logit contributions
 attitude+0.046
 wife+0.046
 opinion+0.044
 fault+0.043
 stomach+0.043
Neg logit contributions
izing-0.114
E-0.109
ler-0.109
corn-0.106
ww-0.105
Token.
Token ID13
Feature activation-1.74e+00

Pos logit contributions
sed+0.126
 attitude+0.122
 fault+0.121
 ending+0.120
 wife+0.119
Neg logit contributions
izing-0.135
ler-0.134
corn-0.132
E-0.131
hest-0.129
Token "
Token ID366
Feature activation+5.61e+00

Pos logit contributions
 wife+0.032
 fault+0.030
 attitude+0.030
 ending+0.029
sed+0.029
Neg logit contributions
E-0.052
izing-0.052
corn-0.048
ch-0.047
hest-0.047
TokenWhat
Token ID2061
Feature activation-1.23e+00

Pos logit contributions
izing+0.155
ler+0.142
iflower+0.123
ls+0.119
hest+0.117
Neg logit contributions
 wife-0.140
 name-0.137
 opinion-0.131
 names-0.127
joy-0.121
Token do
Token ID466
Feature activation+1.96e+00

Pos logit contributions
 wife+0.027
 opinion+0.026
 attitude+0.026
 stomach+0.025
 name+0.025
Neg logit contributions
izing-0.034
ler-0.032
hest-0.031
ww-0.031
E-0.031
Token you
Token ID345
Feature activation-2.59e+00

Pos logit contributions
izing+0.027
ler+0.025
hest+0.024
E+0.023
ww+0.022
Neg logit contributions
 wife-0.070
 opinion-0.068
 attitude-0.066
 name-0.066
 stomach-0.065
Token Mom
Token ID11254
Feature activation-1.33e+00

Pos logit contributions
 wife+0.007
 opinion+0.006
 name+0.006
 attitude+0.006
 stomach+0.006
Neg logit contributions
izing-0.011
ler-0.011
hest-0.011
E-0.010
ww-0.010
Token?
Token ID30
Feature activation-1.03e+00

Pos logit contributions
 wife+0.033
 attitude+0.032
 opinion+0.032
 stomach+0.031
 name+0.030
Neg logit contributions
izing-0.032
ler-0.031
ww-0.030
corn-0.029
hest-0.029
Tokenâ
Token ID22940
Feature activation+7.91e-01

Pos logit contributions
 wife+0.015
 name+0.015
 opinion+0.015
 attitude+0.013
 stomach+0.013
Neg logit contributions
izing-0.036
ler-0.035
ww-0.033
hest-0.033
corn-0.033
Token€
Token ID26391
Feature activation-8.06e-01

Pos logit contributions
izing+0.024
ler+0.023
E+0.022
hest+0.022
ww+0.022
Neg logit contributions
 wife-0.015
 opinion-0.015
 attitude-0.014
 name-0.014
 stomach-0.013
Token asked
Token ID1965
Feature activation-2.31e+00

Pos logit contributions
 wife+0.019
 opinion+0.019
 attitude+0.018
 stomach+0.018
 name+0.017
Neg logit contributions
izing-0.020
ler-0.019
ww-0.019
corn-0.019
hest-0.019
Token Amy
Token ID14235
Feature activation-5.69e+00

Pos logit contributions
 wife+0.043
 opinion+0.040
 attitude+0.040
 stomach+0.038
 name+0.038
Neg logit contributions
izing-0.071
ler-0.068
E-0.066
ww-0.066
corn-0.066
Token.
Token ID13
Feature activation-1.31e+00

Pos logit contributions
 wife+0.112
 attitude+0.111
sed+0.110
 ending+0.107
 opinion+0.106
Neg logit contributions
izing-0.150
ler-0.144
corn-0.143
E-0.142
hest-0.142
Token\n
Token ID198
Feature activation+2.18e+00

Pos logit contributions
 wife+0.023
 attitude+0.021
 opinion+0.021
 stomach+0.020
 fault+0.020
Neg logit contributions
izing-0.042
E-0.040
corn-0.039
ww-0.039
ler-0.039
Token\n
Token ID198
Feature activation+3.38e+00

Pos logit contributions
izing+0.080
ler+0.079
ww+0.073
hest+0.072
corn+0.072
Neg logit contributions
 name-0.027
 wife-0.027
 opinion-0.026
 attitude-0.022
 stomach-0.021
Tokenâ
Token ID22940
Feature activation+7.36e-01

Pos logit contributions
ler+0.085
izing+0.079
iflower+0.070
plets+0.069
hest+0.069
Neg logit contributions
 name-0.078
 opinion-0.078
 wife-0.077
 meeting-0.074
 Her-0.073
Token€
Token ID26391
Feature activation-8.66e-01

Pos logit contributions
izing+0.022
ler+0.021
ww+0.020
hest+0.020
E+0.020
Neg logit contributions
 wife-0.015
 opinion-0.014
 attitude-0.014
 stomach-0.013
 name-0.013
Token One
Token ID1881
Feature activation-9.46e-01

Pos logit contributions
 wife+0.017
 fault+0.016
 permission+0.015
 attitude+0.015
 stomach+0.015
Neg logit contributions
izing-0.021
E-0.021
corn-0.021
hest-0.020
ww-0.019
Token day
Token ID1110
Feature activation-6.94e-01

Pos logit contributions
 wife+0.019
 opinion+0.018
 attitude+0.018
 stomach+0.018
 name+0.017
Neg logit contributions
izing-0.027
ler-0.026
ww-0.025
hest-0.025
E-0.025
Token,
Token ID11
Feature activation+7.17e-01

Pos logit contributions
 wife+0.015
 fault+0.014
 opinion+0.014
 attitude+0.013
 stomach+0.013
Neg logit contributions
izing-0.019
ww-0.019
corn-0.018
w-0.018
ler-0.018
Token mom
Token ID1995
Feature activation+7.00e-01

Pos logit contributions
izing+0.018
ler+0.017
ww+0.016
E+0.016
corn+0.016
Neg logit contributions
 wife-0.017
 opinion-0.016
 attitude-0.016
 fault-0.016
 stomach-0.016
Token asked
Token ID1965
Feature activation-1.80e+00

Pos logit contributions
izing+0.021
ler+0.020
hest+0.019
E+0.019
ww+0.019
Neg logit contributions
 wife-0.014
 opinion-0.013
 attitude-0.013
 name-0.013
 stomach-0.013
Token Amy
Token ID14235
Feature activation-5.66e+00

Pos logit contributions
 wife+0.040
 opinion+0.038
 attitude+0.037
 fault+0.037
 stomach+0.037
Neg logit contributions
izing-0.047
E-0.044
ler-0.044
corn-0.044
ww-0.044
Token to
Token ID284
Feature activation-1.16e+00

Pos logit contributions
 wife+0.094
 attitude+0.093
sed+0.091
 opinion+0.088
 fault+0.086
Neg logit contributions
izing-0.167
corn-0.163
ler-0.163
E-0.162
hest-0.160
Token replace
Token ID6330
Feature activation+2.30e+00

Pos logit contributions
 wife+0.019
 attitude+0.017
 opinion+0.017
 fault+0.017
 stomach+0.016
Neg logit contributions
izing-0.038
ler-0.037
E-0.036
hest-0.036
ww-0.036
Token the
Token ID262
Feature activation-1.28e+00

Pos logit contributions
ler+0.075
izing+0.075
ww+0.075
hest+0.075
plets+0.072
Neg logit contributions
 wife-0.037
 opinion-0.035
 name-0.034
 parents-0.033
 stomach-0.033
Token powder
Token ID11913
Feature activation+1.67e+00

Pos logit contributions
 wife+0.031
 attitude+0.030
 opinion+0.029
 stomach+0.028
 ending+0.028
Neg logit contributions
izing-0.031
ler-0.030
E-0.029
hest-0.028
corn-0.028
Token in
Token ID287
Feature activation+7.28e-01

Pos logit contributions
izing+0.047
E+0.046
ww+0.046
ler+0.045
corn+0.044
Neg logit contributions
 wife-0.033
 opinion-0.033
 name-0.032
 attitude-0.031
 stomach-0.031
Token was
Token ID373
Feature activation-1.36e+00

Pos logit contributions
izing+0.029
ler+0.028
hest+0.027
ww+0.027
E+0.027
Neg logit contributions
 wife-0.022
 opinion-0.020
 attitude-0.020
 stomach-0.020
 name-0.019
Token angry
Token ID7954
Feature activation-2.07e+00

Pos logit contributions
 attitude+0.040
 wife+0.040
 fault+0.039
 behavior+0.039
 opinion+0.038
Neg logit contributions
izing-0.025
E-0.024
hest-0.024
ler-0.023
 million-0.022
Token and
Token ID290
Feature activation+1.15e+00

Pos logit contributions
 wife+0.054
 opinion+0.052
 attitude+0.051
 stomach+0.050
 name+0.049
Neg logit contributions
izing-0.049
ler-0.045
ww-0.042
hest-0.042
E-0.041
Token asked
Token ID1965
Feature activation-2.48e+00

Pos logit contributions
izing+0.032
ler+0.030
E+0.030
ww+0.029
hest+0.029
Neg logit contributions
 wife-0.025
 opinion-0.024
 attitude-0.023
 fault-0.023
 name-0.022
Token her
Token ID607
Feature activation-4.38e+00

Pos logit contributions
 wife+0.064
 attitude+0.060
 fault+0.060
sed+0.059
 opinion+0.059
Neg logit contributions
izing-0.057
E-0.055
B-0.050
corn-0.047
ler-0.047
Token daughter
Token ID4957
Feature activation-5.66e+00

Pos logit contributions
sed+0.125
urance+0.118
 attitude+0.118
 spine+0.117
ses+0.115
Neg logit contributions
E-0.072
in-0.072
 else-0.070
izing-0.068
is-0.066
Token why
Token ID1521
Feature activation-1.71e+00

Pos logit contributions
sed+0.160
ses+0.149
 attitude+0.147
 wife+0.145
 ending+0.144
Neg logit contributions
 else-0.102
izing-0.086
in-0.084
 you-0.081
 easily-0.079
Token she
Token ID673
Feature activation-1.37e+00

Pos logit contributions
 wife+0.048
 opinion+0.047
 attitude+0.046
 name+0.045
 stomach+0.044
Neg logit contributions
izing-0.037
ler-0.034
E-0.033
ww-0.032
corn-0.032
Token had
Token ID550
Feature activation+3.70e-01

Pos logit contributions
 wife+0.030
 opinion+0.028
 attitude+0.028
 stomach+0.027
 name+0.027
Neg logit contributions
izing-0.038
ler-0.036
ww-0.035
hest-0.035
E-0.035
Token done
Token ID1760
Feature activation-3.69e-01

Pos logit contributions
izing+0.009
ww+0.009
ler+0.009
corn+0.008
E+0.008
Neg logit contributions
 wife-0.008
 attitude-0.008
 opinion-0.008
 name-0.008
sed-0.008
Token something
Token ID1223
Feature activation-1.85e-01

Pos logit contributions
 wife+0.009
 opinion+0.008
 attitude+0.008
 name+0.008
 stomach+0.008
Neg logit contributions
izing-0.009
ler-0.009
E-0.008
corn-0.008
ww-0.008
Token,
Token ID11
Feature activation+8.58e-01

Pos logit contributions
 wife+0.003
 attitude+0.003
 fault+0.003
 opinion+0.003
 stomach+0.002
Neg logit contributions
izing-0.005
ww-0.004
ler-0.004
corn-0.004
E-0.004
Token Mama
Token ID40084
Feature activation-2.96e-01

Pos logit contributions
izing+0.027
ler+0.026
hest+0.025
ww+0.025
E+0.025
Neg logit contributions
 wife-0.015
 opinion-0.014
 attitude-0.014
 stomach-0.014
 name-0.014
Token and
Token ID290
Feature activation+3.48e-01

Pos logit contributions
 wife+0.007
 opinion+0.006
 attitude+0.006
 stomach+0.006
 name+0.006
Neg logit contributions
izing-0.008
ler-0.008
ww-0.007
E-0.007
corn-0.007
Token Papa
Token ID42328
Feature activation-1.16e+00

Pos logit contributions
izing+0.011
ler+0.010
E+0.010
ww+0.010
corn+0.010
Neg logit contributions
 wife-0.006
 opinion-0.006
 attitude-0.006
 fault-0.005
 stomach-0.005
Token asked
Token ID1965
Feature activation-2.88e+00

Pos logit contributions
 wife+0.026
 attitude+0.026
 opinion+0.025
 fault+0.025
 stomach+0.024
Neg logit contributions
izing-0.029
ler-0.029
ww-0.028
corn-0.028
E-0.028
Token Sarah
Token ID10490
Feature activation-5.63e+00

Pos logit contributions
 wife+0.061
 attitude+0.057
 opinion+0.056
 fault+0.056
 stomach+0.056
Neg logit contributions
izing-0.079
E-0.076
ler-0.074
corn-0.073
ww-0.073
Token to
Token ID284
Feature activation-1.33e+00

Pos logit contributions
sed+0.123
 wife+0.123
 attitude+0.120
 opinion+0.116
 ending+0.116
Neg logit contributions
izing-0.139
ler-0.132
corn-0.131
E-0.130
hest-0.128
Token share
Token ID2648
Feature activation+9.50e-01

Pos logit contributions
 fault+0.023
sed+0.023
joy+0.022
 attitude+0.022
 wife+0.022
Neg logit contributions
izing-0.040
hest-0.037
ler-0.037
E-0.037
ww-0.036
Token her
Token ID607
Feature activation-1.60e+00

Pos logit contributions
izing+0.030
ler+0.029
ww+0.028
hest+0.028
E+0.027
Neg logit contributions
 wife-0.018
 opinion-0.017
 attitude-0.016
 name-0.016
 stomach-0.016
Token toys
Token ID14958
Feature activation+1.85e+00

Pos logit contributions
 wife+0.035
 attitude+0.033
 fault+0.033
 ending+0.033
 opinion+0.033
Neg logit contributions
izing-0.040
ler-0.039
corn-0.037
ww-0.037
E-0.037
Token with
Token ID351
Feature activation+1.13e+00

Pos logit contributions
izing+0.055
ler+0.053
ww+0.052
E+0.052
hest+0.051
Neg logit contributions
 wife-0.036
 opinion-0.034
 attitude-0.033
 stomach-0.032
 name-0.032
Token She
Token ID1375
Feature activation+1.47e-01

Pos logit contributions
E+0.009
B+0.007
izing+0.007
Fl+0.007
hest+0.007
Neg logit contributions
 wife-0.011
 fault-0.011
sed-0.011
 permission-0.011
 behavior-0.010
Token was
Token ID373
Feature activation-1.49e+00

Pos logit contributions
izing+0.004
ler+0.004
ww+0.004
E+0.004
hest+0.004
Neg logit contributions
 wife-0.003
 opinion-0.003
 attitude-0.003
 stomach-0.003
 name-0.003
Token very
Token ID845
Feature activation-6.49e-01

Pos logit contributions
 wife+0.038
 attitude+0.037
 fault+0.036
 behavior+0.036
 opinion+0.036
Neg logit contributions
izing-0.032
E-0.031
ww-0.030
ler-0.030
ch-0.029
Token excited
Token ID6568
Feature activation-1.43e+00

Pos logit contributions
 wife+0.014
 attitude+0.013
 opinion+0.013
 fault+0.013
 stomach+0.013
Neg logit contributions
izing-0.017
ler-0.017
E-0.016
ww-0.016
hest-0.016
Token because
Token ID780
Feature activation-3.30e-01

Pos logit contributions
 wife+0.028
 attitude+0.026
 opinion+0.026
 stomach+0.025
 name+0.025
Neg logit contributions
izing-0.043
ler-0.040
ww-0.039
hest-0.039
corn-0.039
Token her
Token ID607
Feature activation-5.62e+00

Pos logit contributions
 wife+0.007
 opinion+0.006
 parents+0.006
 ending+0.006
 name+0.006
Neg logit contributions
izing-0.010
ler-0.010
hest-0.009
corn-0.009
ww-0.009
Token mom
Token ID1995
Feature activation+4.64e-01

Pos logit contributions
urance+0.092
 attitude+0.091
 advisor+0.090
 permission+0.089
 tone+0.085
Neg logit contributions
ww-0.140
corn-0.133
E-0.133
iflower-0.133
izing-0.130
Token was
Token ID373
Feature activation-6.14e-01

Pos logit contributions
izing+0.013
ler+0.012
ww+0.012
E+0.012
hest+0.012
Neg logit contributions
 wife-0.010
 opinion-0.010
 attitude-0.009
 name-0.009
 stomach-0.009
Token making
Token ID1642
Feature activation+5.49e-01

Pos logit contributions
 wife+0.012
 attitude+0.012
 opinion+0.012
 stomach+0.012
 fault+0.011
Neg logit contributions
izing-0.018
ler-0.017
ww-0.016
E-0.016
hest-0.016
Token a
Token ID257
Feature activation+1.41e+00

Pos logit contributions
izing+0.017
ler+0.017
hest+0.016
ww+0.016
E+0.016
Neg logit contributions
 wife-0.010
 opinion-0.010
 name-0.009
 stomach-0.009
 tum-0.009
Token pie
Token ID2508
Feature activation+2.10e+00

Pos logit contributions
izing+0.043
ler+0.041
hest+0.040
E+0.038
plets+0.038
Neg logit contributions
 wife-0.030
 name-0.028
 stomach-0.027
 opinion-0.027
 tum-0.027
Token Polly
Token ID36898
Feature activation+2.04e-01

Pos logit contributions
izing+0.001
ww+0.001
ler+0.001
E+0.001
hest+0.001
Neg logit contributions
 wife-0.001
 fault-0.001
 opinion-0.001
 attitude-0.001
sed-0.001
Token heard
Token ID2982
Feature activation-1.01e+00

Pos logit contributions
izing+0.006
ler+0.006
ww+0.006
hest+0.006
E+0.006
Neg logit contributions
 wife-0.004
 opinion-0.004
 attitude-0.004
 stomach-0.004
 name-0.004
Token a
Token ID257
Feature activation+1.66e+00

Pos logit contributions
 wife+0.022
 attitude+0.021
sed+0.021
 fault+0.021
 opinion+0.021
Neg logit contributions
izing-0.026
E-0.025
ler-0.025
hest-0.025
ww-0.024
Token knock
Token ID10643
Feature activation-3.00e-01

Pos logit contributions
izing+0.047
ler+0.046
hest+0.045
corn+0.044
w+0.043
Neg logit contributions
 wife-0.035
 opinion-0.033
 stomach-0.032
 name-0.032
 tum-0.031
Token on
Token ID319
Feature activation-1.25e+00

Pos logit contributions
 wife+0.006
 opinion+0.006
 attitude+0.006
 stomach+0.006
 name+0.006
Neg logit contributions
izing-0.008
E-0.008
hest-0.008
ler-0.008
ww-0.008
Token her
Token ID607
Feature activation-5.60e+00

Pos logit contributions
 wife+0.030
 opinion+0.029
 attitude+0.029
 fault+0.028
sed+0.028
Neg logit contributions
izing-0.031
ler-0.029
E-0.029
corn-0.028
ww-0.028
Token door
Token ID3420
Feature activation-1.67e+00

Pos logit contributions
 attitude+0.116
 ending+0.114
 known+0.109
 wife+0.108
 opinion+0.108
Neg logit contributions
izing-0.132
E-0.125
web-0.124
ww-0.121
 million-0.120
Token.
Token ID13
Feature activation-1.67e+00

Pos logit contributions
 wife+0.037
 attitude+0.034
 opinion+0.034
 fault+0.033
 stomach+0.033
Neg logit contributions
izing-0.046
ler-0.043
hest-0.042
E-0.041
ww-0.040
Token She
Token ID1375
Feature activation+3.06e-01

Pos logit contributions
 fault+0.048
 permission+0.044
 wife+0.044
 attitude+0.044
 behavior+0.044
Neg logit contributions
E-0.029
ww-0.027
sc-0.026
izing-0.026
B-0.026
Token opened
Token ID4721
Feature activation-4.94e-01

Pos logit contributions
izing+0.009
ler+0.009
hest+0.009
ww+0.009
E+0.009
Neg logit contributions
 wife-0.006
 opinion-0.005
 attitude-0.005
 name-0.005
 stomach-0.005
Token it
Token ID340
Feature activation-1.09e+00

Pos logit contributions
 wife+0.011
 opinion+0.010
 attitude+0.010
 name+0.010
 stomach+0.010
Neg logit contributions
izing-0.014
ler-0.013
hest-0.013
ww-0.013
E-0.013
Token loved
Token ID6151
Feature activation-1.40e+00

Pos logit contributions
 wife+0.010
 attitude+0.010
 opinion+0.010
 fault+0.010
sed+0.009
Neg logit contributions
izing-0.011
ler-0.011
ww-0.011
E-0.011
corn-0.011
Token to
Token ID284
Feature activation-2.75e+00

Pos logit contributions
 wife+0.019
 attitude+0.019
 opinion+0.018
 name+0.017
 stomach+0.017
Neg logit contributions
izing-0.048
corn-0.046
ler-0.046
E-0.045
ww-0.045
Token listen
Token ID6004
Feature activation-1.56e+00

Pos logit contributions
 fault+0.055
sed+0.054
 wife+0.052
joy+0.052
 attitude+0.051
Neg logit contributions
izing-0.071
ler-0.071
ww-0.070
hest-0.070
E-0.070
Token.
Token ID13
Feature activation-6.07e-01

Pos logit contributions
 wife+0.017
 opinion+0.015
 attitude+0.015
 stomach+0.014
 name+0.014
Neg logit contributions
izing-0.060
ler-0.058
ww-0.057
hest-0.057
E-0.057
Token Whenever
Token ID21326
Feature activation-8.25e-01

Pos logit contributions
 wife+0.012
sed+0.011
 permission+0.011
 attitude+0.011
 parents+0.011
Neg logit contributions
izing-0.017
E-0.017
hest-0.017
corn-0.016
ww-0.016
Token his
Token ID465
Feature activation-5.59e+00

Pos logit contributions
 wife+0.021
 fault+0.021
 attitude+0.021
 opinion+0.021
sed+0.020
Neg logit contributions
izing-0.018
ww-0.018
corn-0.017
ler-0.017
E-0.016
Token Mum
Token ID38367
Feature activation-1.55e-01

Pos logit contributions
urance+0.074
 attitude+0.072
joy+0.071
 fault+0.069
 opinion+0.065
Neg logit contributions
ww-0.161
izing-0.159
ler-0.158
E-0.156
iflower-0.154
Token called
Token ID1444
Feature activation-2.36e+00

Pos logit contributions
 wife+0.003
 opinion+0.003
 attitude+0.003
 stomach+0.003
 name+0.003
Neg logit contributions
izing-0.004
ler-0.004
hest-0.004
ww-0.004
E-0.004
Token him
Token ID683
Feature activation-3.38e+00

Pos logit contributions
 wife+0.055
 opinion+0.052
 attitude+0.052
 fault+0.049
sed+0.049
Neg logit contributions
izing-0.059
ler-0.057
E-0.055
ww-0.055
corn-0.055
Token in
Token ID287
Feature activation-2.83e+00

Pos logit contributions
 attitude+0.060
 wife+0.058
 opinion+0.058
 stomach+0.055
 fault+0.054
Neg logit contributions
corn-0.096
izing-0.095
ww-0.093
ler-0.093
E-0.091
Token for
Token ID329
Feature activation-9.28e-01

Pos logit contributions
 wife+0.061
 opinion+0.056
 attitude+0.055
 fault+0.054
sed+0.054
Neg logit contributions
izing-0.077
ler-0.074
E-0.072
ww-0.072
corn-0.071
Token to
Token ID284
Feature activation-1.29e+00

Pos logit contributions
 wife+0.054
 attitude+0.053
 opinion+0.051
 fault+0.048
 ending+0.048
Neg logit contributions
izing-0.077
B-0.075
ch-0.074
corn-0.070
ler-0.070
Token call
Token ID869
Feature activation-2.62e+00

Pos logit contributions
 wife+0.029
 attitude+0.027
 fault+0.027
 opinion+0.026
sed+0.026
Neg logit contributions
izing-0.032
ler-0.030
E-0.030
Little-0.029
corn-0.029
Token out
Token ID503
Feature activation-2.65e+00

Pos logit contributions
 wife+0.058
 opinion+0.054
 attitude+0.054
 fault+0.053
sed+0.053
Neg logit contributions
izing-0.068
ler-0.064
E-0.063
ww-0.061
hest-0.061
Token to
Token ID284
Feature activation-4.64e-01

Pos logit contributions
 wife+0.052
 fault+0.051
 attitude+0.049
 opinion+0.048
sed+0.048
Neg logit contributions
izing-0.075
E-0.071
ww-0.068
ch-0.068
ler-0.067
Token the
Token ID262
Feature activation-1.27e+00

Pos logit contributions
 fault+0.011
 wife+0.011
 attitude+0.011
 opinion+0.011
sed+0.011
Neg logit contributions
izing-0.011
E-0.010
Little-0.010
ler-0.009
B-0.009
Token daughter
Token ID4957
Feature activation-5.57e+00

Pos logit contributions
 attitude+0.023
 opinion+0.022
 wife+0.022
 fault+0.021
 stomach+0.020
Neg logit contributions
izing-0.037
ler-0.036
E-0.035
ww-0.035
hest-0.035
Token,
Token ID11
Feature activation-1.79e+00

Pos logit contributions
 wife+0.125
 attitude+0.121
sed+0.117
 opinion+0.117
 ending+0.116
Neg logit contributions
izing-0.130
ler-0.125
hest-0.117
sc-0.117
corn-0.117
Token "
Token ID366
Feature activation+5.84e+00

Pos logit contributions
 wife+0.036
 opinion+0.034
 attitude+0.034
 stomach+0.032
 fault+0.032
Neg logit contributions
izing-0.052
ler-0.049
E-0.048
ww-0.047
hest-0.047
TokenCome
Token ID16773
Feature activation-9.21e-01

Pos logit contributions
izing+0.136
ler+0.128
hest+0.125
ww+0.117
corn+0.114
Neg logit contributions
 wife-0.158
 opinion-0.143
 permission-0.141
 name-0.140
joy-0.137
Token inside
Token ID2641
Feature activation-1.56e+00

Pos logit contributions
 wife+0.020
 opinion+0.019
 attitude+0.019
 stomach+0.018
 name+0.018
Neg logit contributions
izing-0.025
ler-0.024
hest-0.023
ww-0.023
E-0.023
Token quick
Token ID2068
Feature activation-1.05e+00

Pos logit contributions
 wife+0.033
 opinion+0.032
 attitude+0.031
 stomach+0.031
 name+0.030
Neg logit contributions
izing-0.043
ler-0.041
ww-0.041
E-0.041
hest-0.040
Token scared
Token ID12008
Feature activation-1.16e+00

Pos logit contributions
 wife+0.001
 attitude+0.001
 opinion+0.001
 fault+0.001
 stomach+0.001
Neg logit contributions
izing-0.002
ler-0.001
ww-0.001
E-0.001
hest-0.001
Token and
Token ID290
Feature activation+4.35e-01

Pos logit contributions
 wife+0.025
 opinion+0.024
 attitude+0.024
 name+0.023
 stomach+0.023
Neg logit contributions
izing-0.032
ler-0.031
ww-0.030
hest-0.030
E-0.030
Token he
Token ID339
Feature activation-7.89e-01

Pos logit contributions
izing+0.011
ler+0.010
ww+0.010
E+0.010
corn+0.009
Neg logit contributions
 wife-0.010
 opinion-0.010
 fault-0.010
 attitude-0.010
 name-0.010
Token asked
Token ID1965
Feature activation-3.46e+00

Pos logit contributions
 wife+0.018
 opinion+0.017
 attitude+0.017
 fault+0.017
 stomach+0.016
Neg logit contributions
izing-0.020
ler-0.020
E-0.019
ww-0.019
hest-0.019
Token his
Token ID465
Feature activation-4.04e+00

Pos logit contributions
 fault+0.094
sed+0.092
 attitude+0.090
 wife+0.087
 opinion+0.086
Neg logit contributions
E-0.062
B-0.061
izing-0.057
corn-0.057
in-0.057
Token brother
Token ID3956
Feature activation-5.53e+00

Pos logit contributions
 attitude+0.064
urance+0.064
 stomach+0.059
 gaze+0.058
 fault+0.057
Neg logit contributions
B-0.107
ler-0.105
E-0.105
ch-0.104
Little-0.104
Token,
Token ID11
Feature activation-2.13e+00

Pos logit contributions
 attitude+0.129
sed+0.128
 wife+0.126
 rightful+0.120
urance+0.118
Neg logit contributions
corn-0.107
 else-0.105
in-0.104
ch-0.104
B-0.103
Token "
Token ID366
Feature activation+5.12e+00

Pos logit contributions
 attitude+0.045
 fault+0.043
 behavior+0.042
 opinion+0.042
 wife+0.042
Neg logit contributions
izing-0.056
E-0.056
B-0.054
Little-0.053
Fl-0.052
TokenWhat
Token ID2061
Feature activation-1.82e+00

Pos logit contributions
izing+0.118
hest+0.111
ler+0.109
ww+0.109
corn+0.100
Neg logit contributions
 wife-0.141
 name-0.127
 opinion-0.127
 stomach-0.124
 permission-0.124
Token is
Token ID318
Feature activation+1.15e+00

Pos logit contributions
 wife+0.049
 attitude+0.046
 opinion+0.044
joy+0.044
sed+0.044
Neg logit contributions
ww-0.038
izing-0.037
ler-0.037
corn-0.037
hest-0.037
Token that
Token ID326
Feature activation-8.35e-01

Pos logit contributions
izing+0.030
ler+0.028
E+0.028
hest+0.028
ww+0.027
Neg logit contributions
 wife-0.027
 opinion-0.026
 attitude-0.025
 stomach-0.025
 name-0.025
Token to
Token ID284
Feature activation-1.62e+00

Pos logit contributions
 wife+0.032
 attitude+0.030
 opinion+0.030
 name+0.029
sed+0.029
Neg logit contributions
izing-0.039
ler-0.036
ww-0.035
hest-0.034
corn-0.034
Token start
Token ID923
Feature activation-1.43e+00

Pos logit contributions
 fault+0.036
sed+0.034
joy+0.034
 attitude+0.032
 wife+0.032
Neg logit contributions
hest-0.040
izing-0.040
ww-0.039
E-0.039
ler-0.038
Token her
Token ID607
Feature activation-2.96e+00

Pos logit contributions
 wife+0.027
 attitude+0.026
 opinion+0.025
sed+0.024
 stomach+0.024
Neg logit contributions
izing-0.043
corn-0.040
ler-0.040
E-0.039
ww-0.039
Token day
Token ID1110
Feature activation-1.86e+00

Pos logit contributions
 wife+0.059
 attitude+0.055
 opinion+0.054
 name+0.052
 stomach+0.051
Neg logit contributions
izing-0.084
ler-0.081
ww-0.080
corn-0.078
E-0.077
Token because
Token ID780
Feature activation+4.29e-01

Pos logit contributions
 wife+0.035
 opinion+0.033
 attitude+0.032
 stomach+0.031
 name+0.031
Neg logit contributions
izing-0.057
ler-0.055
hest-0.053
ww-0.053
E-0.053
Token her
Token ID607
Feature activation-5.52e+00

Pos logit contributions
ler+0.012
izing+0.012
corn+0.012
hest+0.011
ww+0.011
Neg logit contributions
 wife-0.009
 opinion-0.009
 parents-0.009
 ending-0.008
 name-0.008
Token parents
Token ID3397
Feature activation+2.70e-01

Pos logit contributions
urance+0.079
 attitude+0.078
joy+0.078
 wife+0.073
 permission+0.072
Neg logit contributions
ww-0.150
izing-0.146
iflower-0.145
E-0.145
corn-0.144
Token had
Token ID550
Feature activation-1.09e+00

Pos logit contributions
izing+0.008
ler+0.008
ww+0.007
E+0.007
hest+0.007
Neg logit contributions
 wife-0.005
 opinion-0.005
 attitude-0.005
 name-0.005
 stomach-0.005
Token a
Token ID257
Feature activation+6.65e-01

Pos logit contributions
 wife+0.022
 attitude+0.022
 opinion+0.021
 name+0.020
 stomach+0.020
Neg logit contributions
izing-0.030
ww-0.029
corn-0.028
ler-0.028
E-0.028
Token surprise
Token ID5975
Feature activation-6.36e-02

Pos logit contributions
izing+0.017
ler+0.016
ww+0.016
corn+0.016
E+0.016
Neg logit contributions
 wife-0.015
 attitude-0.015
 opinion-0.014
 stomach-0.014
 behavior-0.014
Token for
Token ID329
Feature activation+1.26e+00

Pos logit contributions
 wife+0.001
 attitude+0.001
 opinion+0.001
 stomach+0.001
 fault+0.001
Neg logit contributions
izing-0.002
ler-0.002
hest-0.002
ww-0.002
corn-0.002
Token.
Token ID13
Feature activation-1.86e+00

Pos logit contributions
 attitude+0.035
 wife+0.035
 opinion+0.035
 stomach+0.034
sed+0.034
Neg logit contributions
izing-0.034
hest-0.030
ler-0.030
sc-0.029
is-0.029
Token She
Token ID1375
Feature activation-2.41e-01

Pos logit contributions
 wife+0.044
 fault+0.044
 opinion+0.041
 permission+0.041
 ending+0.040
Neg logit contributions
E-0.040
izing-0.040
corn-0.038
ler-0.037
ww-0.036
Token asked
Token ID1965
Feature activation-2.15e+00

Pos logit contributions
 wife+0.005
 opinion+0.005
 attitude+0.005
 stomach+0.005
 fault+0.005
Neg logit contributions
izing-0.007
ler-0.006
E-0.006
ww-0.006
corn-0.006
Token Tom
Token ID4186
Feature activation-3.94e+00

Pos logit contributions
 wife+0.044
 opinion+0.044
 attitude+0.042
 fault+0.041
sed+0.041
Neg logit contributions
izing-0.058
E-0.057
ler-0.056
corn-0.056
hest-0.053
Token and
Token ID290
Feature activation+4.24e-01

Pos logit contributions
sed+0.086
 wife+0.083
 attitude+0.083
 ending+0.082
 opinion+0.080
Neg logit contributions
ler-0.092
izing-0.091
corn-0.091
hest-0.088
E-0.088
Token Anna
Token ID11735
Feature activation-5.51e+00

Pos logit contributions
corn+0.009
E+0.008
ler+0.008
izing+0.008
ww+0.008
Neg logit contributions
 fault-0.011
 attitude-0.011
 opinion-0.011
 wife-0.011
 name-0.011
Token to
Token ID284
Feature activation-7.14e-01

Pos logit contributions
sed+0.125
 fault+0.114
urance+0.113
 wife+0.113
 attitude+0.113
Neg logit contributions
 else-0.130
sc-0.125
izing-0.119
ler-0.119
hest-0.117
Token watch
Token ID2342
Feature activation-1.02e+00

Pos logit contributions
 fault+0.014
sed+0.014
joy+0.013
 attitude+0.013
 ending+0.013
Neg logit contributions
izing-0.019
E-0.019
hest-0.018
Little-0.018
ler-0.017
Token Ben
Token ID3932
Feature activation-3.17e+00

Pos logit contributions
 wife+0.021
 opinion+0.020
 attitude+0.020
 stomach+0.019
 name+0.019
Neg logit contributions
izing-0.029
ler-0.028
E-0.027
hest-0.027
ww-0.027
Token for
Token ID329
Feature activation+8.92e-02

Pos logit contributions
 wife+0.069
 attitude+0.064
 opinion+0.063
sed+0.063
 stomach+0.061
Neg logit contributions
izing-0.084
ler-0.083
hest-0.079
corn-0.078
E-0.077
Token a
Token ID257
Feature activation+9.26e-02

Pos logit contributions
izing+0.003
ler+0.002
E+0.002
hest+0.002
corn+0.002
Neg logit contributions
 wife-0.002
 opinion-0.002
 attitude-0.002
 stomach-0.002
 name-0.002
Token.
Token ID13
Feature activation-1.64e+00

Pos logit contributions
 wife+0.044
 opinion+0.041
 attitude+0.040
 name+0.039
 stomach+0.039
Neg logit contributions
izing-0.061
ler-0.058
hest-0.056
ww-0.056
corn-0.055
Token When
Token ID1649
Feature activation+1.95e-01

Pos logit contributions
 wife+0.037
sed+0.033
 fault+0.031
 opinion+0.031
 behavior+0.031
Neg logit contributions
E-0.045
izing-0.043
hest-0.041
B-0.038
ler-0.038
Token her
Token ID607
Feature activation-4.92e+00

Pos logit contributions
izing+0.005
ler+0.005
hest+0.004
ww+0.004
E+0.004
Neg logit contributions
 wife-0.005
 opinion-0.005
 attitude-0.005
 stomach-0.005
 name-0.004
Token mom
Token ID1995
Feature activation-1.19e+00

Pos logit contributions
 attitude+0.069
joy+0.065
 wife+0.065
urance+0.065
 opinion+0.062
Neg logit contributions
ww-0.145
izing-0.143
ler-0.142
E-0.140
Little-0.137
Token asked
Token ID1965
Feature activation-2.07e+00

Pos logit contributions
 wife+0.026
 opinion+0.025
 attitude+0.025
 stomach+0.024
 name+0.024
Neg logit contributions
izing-0.032
ler-0.031
hest-0.030
E-0.030
ww-0.030
Token her
Token ID607
Feature activation-5.50e+00

Pos logit contributions
 wife+0.047
 opinion+0.044
 attitude+0.044
 stomach+0.043
 fault+0.042
Neg logit contributions
izing-0.055
ler-0.052
E-0.052
ww-0.050
hest-0.050
Token why
Token ID1521
Feature activation-2.39e+00

Pos logit contributions
sed+0.135
ses+0.120
 fault+0.118
urance+0.117
 spine+0.117
Neg logit contributions
E-0.117
izing-0.117
 else-0.112
corn-0.111
ler-0.110
Token she
Token ID673
Feature activation-3.01e+00

Pos logit contributions
 wife+0.062
 opinion+0.060
 attitude+0.059
 stomach+0.057
 name+0.057
Neg logit contributions
izing-0.056
ler-0.053
ww-0.051
E-0.050
hest-0.050
Token was
Token ID373
Feature activation-2.10e+00

Pos logit contributions
 wife+0.064
 attitude+0.059
 opinion+0.059
 fault+0.057
 stomach+0.057
Neg logit contributions
izing-0.081
ler-0.081
E-0.077
ww-0.077
hest-0.076
Token frown
Token ID22002
Feature activation-1.11e+00

Pos logit contributions
 wife+0.044
 attitude+0.042
sed+0.042
 name+0.041
 names+0.040
Neg logit contributions
izing-0.055
ler-0.054
hest-0.050
is-0.050
E-0.050
Tokening
Token ID278
Feature activation-1.32e+00

Pos logit contributions
 wife+0.025
 opinion+0.023
 attitude+0.023
 stomach+0.022
 name+0.022
Neg logit contributions
izing-0.030
ler-0.029
hest-0.028
ww-0.028
E-0.028
Token.
Token ID13
Feature activation+1.80e-01

Pos logit contributions
izing+0.013
ler+0.012
hest+0.012
ww+0.012
E+0.012
Neg logit contributions
 wife-0.006
 opinion-0.006
 attitude-0.006
 stomach-0.006
 name-0.006
Token He
Token ID679
Feature activation+5.33e-01

Pos logit contributions
E+0.004
corn+0.004
hest+0.004
ler+0.004
ch+0.004
Neg logit contributions
 wife-0.005
 permission-0.004
 fault-0.004
sed-0.004
 parents-0.004
Token was
Token ID373
Feature activation-1.64e+00

Pos logit contributions
izing+0.016
ler+0.015
hest+0.015
ww+0.015
E+0.015
Neg logit contributions
 wife-0.010
 opinion-0.010
 attitude-0.010
 stomach-0.009
 name-0.009
Token sad
Token ID6507
Feature activation-2.24e+00

Pos logit contributions
 attitude+0.045
 behavior+0.044
 fault+0.044
 wife+0.044
 opinion+0.043
Neg logit contributions
izing-0.029
ch-0.029
E-0.029
ler-0.028
ww-0.028
Token because
Token ID780
Feature activation-4.77e-01

Pos logit contributions
 wife+0.057
 opinion+0.054
 attitude+0.053
sed+0.051
urance+0.051
Neg logit contributions
izing-0.052
ler-0.050
ww-0.045
hest-0.045
E-0.043
Token his
Token ID465
Feature activation-5.50e+00

Pos logit contributions
 wife+0.010
 opinion+0.010
 stomach+0.010
 attitude+0.009
 parents+0.009
Neg logit contributions
izing-0.014
ler-0.013
ww-0.013
hest-0.013
E-0.013
Token puppy
Token ID26188
Feature activation+1.36e+00

Pos logit contributions
urance+0.088
joy+0.083
 attitude+0.080
 hesitation+0.079
 permission+0.078
Neg logit contributions
ler-0.135
iflower-0.135
ww-0.135
izing-0.130
w-0.127
Token was
Token ID373
Feature activation-1.01e+00

Pos logit contributions
izing+0.036
ler+0.034
ww+0.033
hest+0.033
E+0.033
Neg logit contributions
 wife-0.031
 opinion-0.029
 attitude-0.029
 name-0.028
 stomach-0.028
Token stuck
Token ID7819
Feature activation-1.46e-01

Pos logit contributions
 wife+0.024
 opinion+0.023
 attitude+0.023
 name+0.021
 behavior+0.021
Neg logit contributions
ler-0.024
izing-0.024
E-0.023
ww-0.023
hest-0.022
Token in
Token ID287
Feature activation-2.81e-01

Pos logit contributions
 wife+0.003
 opinion+0.003
 attitude+0.003
 stomach+0.003
 name+0.003
Neg logit contributions
izing-0.004
ler-0.004
hest-0.004
E-0.003
ww-0.003
Token a
Token ID257
Feature activation+1.09e+00

Pos logit contributions
 opinion+0.004
 wife+0.004
 attitude+0.004
 stomach+0.004
 name+0.004
Neg logit contributions
izing-0.010
ler-0.009
ww-0.009
hest-0.009
plets-0.009
Token follow
Token ID1061
Feature activation+2.31e-01

Pos logit contributions
 wife+0.014
joy+0.014
sed+0.014
 fault+0.013
 attitude+0.013
Neg logit contributions
izing-0.024
ler-0.023
hest-0.023
ww-0.022
E-0.022
Token the
Token ID262
Feature activation-5.00e-01

Pos logit contributions
izing+0.007
ler+0.006
ww+0.006
hest+0.006
E+0.006
Neg logit contributions
 wife-0.005
 opinion-0.005
 attitude-0.005
 name-0.004
 stomach-0.004
Token rabbit
Token ID22746
Feature activation-1.91e+00

Pos logit contributions
 wife+0.011
 opinion+0.011
 attitude+0.010
 name+0.010
 stomach+0.010
Neg logit contributions
izing-0.014
ler-0.013
hest-0.013
corn-0.012
E-0.012
Token.
Token ID13
Feature activation-1.70e+00

Pos logit contributions
 wife+0.040
 opinion+0.038
 attitude+0.037
 stomach+0.036
 name+0.035
Neg logit contributions
izing-0.053
ler-0.051
hest-0.050
corn-0.048
ww-0.048
Token
Token ID198
Feature activation+1.86e+00

Pos logit contributions
 wife+0.029
 opinion+0.027
 attitude+0.026
 name+0.026
 stomach+0.025
Neg logit contributions
izing-0.056
ler-0.054
hest-0.052
ww-0.052
E-0.051
Token
Token ID198
Feature activation+4.42e+00

Pos logit contributions
izing+0.066
ler+0.064
ww+0.062
hest+0.060
corn+0.060
Neg logit contributions
 wife-0.027
 opinion-0.025
 name-0.025
 attitude-0.023
 stomach-0.023
TokenThe
Token ID464
Feature activation-3.87e-02

Pos logit contributions
izing+0.103
ler+0.100
plets+0.093
ww+0.086
iflower+0.085
Neg logit contributions
 wife-0.112
 nod-0.098
 opinion-0.097
 name-0.096
 aid-0.093
Token rabbit
Token ID22746
Feature activation+1.36e+00

Pos logit contributions
 wife+0.001
 opinion+0.001
 attitude+0.001
 name+0.001
 stomach+0.001
Neg logit contributions
izing-0.001
ler-0.001
hest-0.001
ww-0.001
corn-0.001
Token took
Token ID1718
Feature activation+1.65e+00

Pos logit contributions
izing+0.037
ler+0.036
hest+0.034
ww+0.034
E+0.034
Neg logit contributions
 wife-0.030
 opinion-0.028
 attitude-0.028
 stomach-0.027
 name-0.027
Token the
Token ID262
Feature activation-4.04e-01

Pos logit contributions
izing+0.055
ler+0.051
hest+0.050
ww+0.050
corn+0.049
Neg logit contributions
 wife-0.029
 name-0.027
 opinion-0.027
 attitude-0.026
 stomach-0.026
Token kitten
Token ID42143
Feature activation-7.09e-01

Pos logit contributions
 wife+0.010
 opinion+0.009
 attitude+0.009
 name+0.009
 stomach+0.009
Neg logit contributions
izing-0.011
ler-0.010
hest-0.010
E-0.010
corn-0.010
Token and
Token ID290
Feature activation+1.72e+00

Pos logit contributions
izing+0.068
E+0.068
ww+0.065
corn+0.064
hest+0.063
Neg logit contributions
 wife-0.039
 opinion-0.036
 attitude-0.036
 name-0.035
 stomach-0.035
Token trees
Token ID7150
Feature activation+1.14e+00

Pos logit contributions
izing+0.043
ler+0.041
hest+0.039
ww+0.039
E+0.039
Neg logit contributions
 wife-0.042
 opinion-0.040
 attitude-0.040
 stomach-0.039
 name-0.039
Token."
Token ID526
Feature activation-1.98e+00

Pos logit contributions
izing+0.036
E+0.035
ww+0.035
ler+0.035
hest+0.034
Neg logit contributions
 wife-0.019
 opinion-0.018
 attitude-0.017
 name-0.017
 stomach-0.017
Token
Token ID198
Feature activation+1.33e+00

Pos logit contributions
 wife+0.044
 attitude+0.043
 opinion+0.043
 ending+0.042
 fault+0.042
Neg logit contributions
E-0.049
izing-0.049
corn-0.047
ww-0.046
ler-0.046
Token
Token ID198
Feature activation+3.43e+00

Pos logit contributions
izing+0.046
ler+0.045
ww+0.043
hest+0.043
corn+0.042
Neg logit contributions
 wife-0.020
 opinion-0.018
 name-0.018
 attitude-0.017
 stomach-0.017
Token"
Token ID1
Feature activation+6.70e+00

Pos logit contributions
izing+0.078
ler+0.075
hest+0.070
plets+0.068
iflower+0.062
Neg logit contributions
 wife-0.096
 name-0.085
 opinion-0.083
 His-0.083
 nod-0.082
TokenOK
Token ID11380
Feature activation-1.16e+00

Pos logit contributions
ler+0.166
izing+0.162
hest+0.146
iflower+0.146
ons+0.143
Neg logit contributions
 wife-0.172
 name-0.157
 opinion-0.152
 nod-0.152
 names-0.150
Token,
Token ID11
Feature activation+5.12e-01

Pos logit contributions
 wife+0.024
 attitude+0.022
 opinion+0.022
 stomach+0.022
 fault+0.021
Neg logit contributions
izing-0.034
ler-0.032
ww-0.032
corn-0.031
E-0.031
Token but
Token ID475
Feature activation-3.93e-01

Pos logit contributions
izing+0.017
ler+0.017
hest+0.016
E+0.016
ww+0.016
Neg logit contributions
 wife-0.009
 opinion-0.008
 name-0.008
 attitude-0.008
 stomach-0.007
Token you
Token ID345
Feature activation-3.08e+00

Pos logit contributions
 wife+0.010
 attitude+0.010
 opinion+0.010
 stomach+0.010
 fault+0.009
Neg logit contributions
izing-0.009
ler-0.008
ww-0.008
E-0.008
corn-0.008
Token have
Token ID423
Feature activation-4.27e-01

Pos logit contributions
 wife+0.056
 opinion+0.052
 attitude+0.052
 stomach+0.050
 name+0.049
Neg logit contributions
izing-0.095
ler-0.092
hest-0.090
E-0.089
ww-0.089
Token to
Token ID284
Feature activation+6.58e-01

Pos logit contributions
 wife+0.052
 attitude+0.050
 opinion+0.050
 fault+0.048
 ending+0.047
Neg logit contributions
izing-0.082
ler-0.076
ww-0.076
E-0.075
corn-0.075
Token her
Token ID607
Feature activation-3.35e+00

Pos logit contributions
izing+0.018
ler+0.017
E+0.017
hest+0.017
ww+0.017
Neg logit contributions
 wife-0.014
 opinion-0.013
 attitude-0.013
 stomach-0.013
 name-0.013
Token mom
Token ID1995
Feature activation-3.18e+00

Pos logit contributions
 attitude+0.041
 wife+0.040
 opinion+0.039
 ending+0.039
urance+0.038
Neg logit contributions
izing-0.113
ler-0.109
corn-0.105
E-0.104
hest-0.102
Tokenmy
Token ID1820
Feature activation-3.41e+00

Pos logit contributions
 wife+0.071
sed+0.066
 attitude+0.066
 opinion+0.066
 stomach+0.063
Neg logit contributions
izing-0.082
ler-0.079
corn-0.075
is-0.073
hest-0.073
Token,
Token ID11
Feature activation-1.77e+00

Pos logit contributions
 wife+0.070
 opinion+0.064
 attitude+0.063
sed+0.062
 stomach+0.061
Neg logit contributions
izing-0.097
ler-0.093
corn-0.089
hest-0.087
E-0.086
Token "
Token ID366
Feature activation+5.83e+00

Pos logit contributions
 wife+0.026
 attitude+0.024
 opinion+0.024
 stomach+0.022
 behavior+0.022
Neg logit contributions
izing-0.060
ler-0.057
E-0.057
corn-0.056
hest-0.055
TokenLook
Token ID8567
Feature activation-1.20e-01

Pos logit contributions
izing+0.150
ler+0.142
hest+0.137
 pounds+0.131
igs+0.130
Neg logit contributions
 wife-0.126
 name-0.125
 nod-0.115
 opinion-0.114
 Her-0.113
Token at
Token ID379
Feature activation+7.38e-01

Pos logit contributions
 wife+0.002
 opinion+0.002
 name+0.002
 stomach+0.002
 attitude+0.001
Neg logit contributions
izing-0.004
ler-0.004
hest-0.004
ww-0.004
E-0.004
Token this
Token ID428
Feature activation-7.88e-01

Pos logit contributions
izing+0.018
ler+0.017
hest+0.017
E+0.016
sc+0.016
Neg logit contributions
 wife-0.019
 opinion-0.018
 name-0.017
 stomach-0.017
 parents-0.017
Token book
Token ID1492
Feature activation+1.67e+00

Pos logit contributions
 wife+0.021
 opinion+0.020
 name+0.020
 attitude+0.019
 stomach+0.019
Neg logit contributions
izing-0.019
E-0.017
ler-0.017
hest-0.017
ww-0.017
Token!
Token ID0
Feature activation-5.90e-01

Pos logit contributions
izing+0.045
ler+0.043
E+0.043
ww+0.041
hest+0.041
Neg logit contributions
 wife-0.036
 opinion-0.036
 name-0.034
 attitude-0.034
 stomach-0.034
Token too
Token ID1165
Feature activation-1.52e+00

Pos logit contributions
 opinion+0.010
 wife+0.010
 attitude+0.010
 name+0.009
 stomach+0.009
Neg logit contributions
izing-0.027
ler-0.026
ww-0.026
E-0.026
corn-0.026
Token.
Token ID13
Feature activation-8.92e-01

Pos logit contributions
 wife+0.021
 opinion+0.020
 attitude+0.019
 name+0.019
 stomach+0.018
Neg logit contributions
izing-0.053
ler-0.052
E-0.051
corn-0.050
ww-0.050
Token Max
Token ID5436
Feature activation+1.96e+00

Pos logit contributions
 wife+0.016
 opinion+0.015
 attitude+0.015
 stomach+0.014
 name+0.014
Neg logit contributions
izing-0.028
ler-0.027
E-0.027
ww-0.026
hest-0.026
Token said
Token ID531
Feature activation-1.95e+00

Pos logit contributions
izing+0.058
ler+0.055
hest+0.054
ww+0.053
E+0.053
Neg logit contributions
 wife-0.040
 opinion-0.037
 attitude-0.036
 stomach-0.036
 name-0.035
Token,
Token ID11
Feature activation-1.59e+00

Pos logit contributions
 wife+0.042
 attitude+0.040
 fault+0.040
 opinion+0.040
urance+0.038
Neg logit contributions
izing-0.051
ww-0.050
ler-0.048
E-0.047
ch-0.046
Token "
Token ID366
Feature activation+6.57e+00

Pos logit contributions
 wife+0.022
 opinion+0.021
 attitude+0.020
 fault+0.019
 stomach+0.019
Neg logit contributions
izing-0.056
ler-0.053
E-0.053
ww-0.052
corn-0.052
TokenCan
Token ID6090
Feature activation+1.44e+00

Pos logit contributions
izing+0.180
ler+0.167
hest+0.165
iflower+0.159
plets+0.151
Neg logit contributions
 wife-0.150
 name-0.138
 opinion-0.134
 nod-0.133
 His-0.125
Token I
Token ID314
Feature activation-2.13e+00

Pos logit contributions
izing+0.024
ler+0.022
ww+0.021
E+0.021
hest+0.021
Neg logit contributions
 wife-0.047
 opinion-0.045
 attitude-0.045
 stomach-0.044
 name-0.044
Token try
Token ID1949
Feature activation+3.35e-01

Pos logit contributions
 wife+0.036
 opinion+0.034
 attitude+0.034
 stomach+0.032
 name+0.032
Neg logit contributions
izing-0.069
ler-0.066
hest-0.064
ww-0.064
E-0.064
Token sitting
Token ID5586
Feature activation+1.04e+00

Pos logit contributions
izing+0.008
ler+0.008
E+0.008
hest+0.008
ww+0.008
Neg logit contributions
 wife-0.008
 opinion-0.008
 name-0.008
 stomach-0.008
 fault-0.007
Token on
Token ID319
Feature activation-2.25e-01

Pos logit contributions
izing+0.029
ler+0.028
E+0.028
hest+0.028
ww+0.027
Neg logit contributions
 wife-0.022
 opinion-0.021
 name-0.020
 stomach-0.020
 attitude-0.019
Token to
Token ID284
Feature activation-1.13e+00

Pos logit contributions
izing+0.069
ler+0.066
corn+0.064
ww+0.064
hest+0.063
Neg logit contributions
 wife-0.049
 opinion-0.047
 attitude-0.047
 name-0.045
 stomach-0.044
Token eat
Token ID4483
Feature activation+1.46e+00

Pos logit contributions
 wife+0.022
 opinion+0.020
 attitude+0.020
 stomach+0.019
 name+0.019
Neg logit contributions
izing-0.034
ler-0.033
hest-0.032
ww-0.032
E-0.032
Token.
Token ID13
Feature activation-7.67e-01

Pos logit contributions
izing+0.049
ler+0.048
ww+0.048
hest+0.047
E+0.047
Neg logit contributions
 wife-0.021
 stomach-0.020
 opinion-0.019
 attitude-0.019
 tum-0.019
Token
Token ID220
Feature activation-9.17e-01

Pos logit contributions
 wife+0.012
 opinion+0.011
 attitude+0.011
 stomach+0.011
 name+0.011
Neg logit contributions
izing-0.026
ler-0.025
ww-0.024
hest-0.024
E-0.024
Token
Token ID198
Feature activation+3.27e+00

Pos logit contributions
 wife+0.018
 attitude+0.017
 opinion+0.017
 stomach+0.016
 ending+0.016
Neg logit contributions
izing-0.027
hest-0.025
E-0.025
ler-0.025
corn-0.025
Token
Token ID198
Feature activation+4.76e+00

Pos logit contributions
izing+0.129
ler+0.127
ww+0.121
corn+0.114
w+0.113
Neg logit contributions
 name-0.034
 wife-0.034
 opinion-0.031
 stomach-0.027
 parents-0.026
TokenThe
Token ID464
Feature activation-5.30e-01

Pos logit contributions
ler+0.121
plets+0.111
izing+0.111
ww+0.110
 pounds+0.105
Neg logit contributions
 wife-0.098
 opinion-0.095
 name-0.093
 meeting-0.091
 ending-0.090
Token end
Token ID886
Feature activation-1.77e+00

Pos logit contributions
 wife+0.008
 opinion+0.008
 attitude+0.008
 stomach+0.007
 name+0.007
Neg logit contributions
izing-0.018
ler-0.017
hest-0.017
ww-0.017
E-0.017
Token.
Token ID13
Feature activation+5.46e-01

Pos logit contributions
 wife+0.030
 opinion+0.028
 attitude+0.028
 stomach+0.027
 name+0.027
Neg logit contributions
izing-0.057
ler-0.055
hest-0.053
ww-0.053
E-0.053
Token<|endoftext|>
Token ID50256
Feature activation-1.24e+00

Pos logit contributions
izing+0.017
ler+0.016
E+0.016
hest+0.016
ww+0.016
Neg logit contributions
 wife-0.010
 opinion-0.010
 attitude-0.009
 stomach-0.009
 name-0.009
TokenOnce
Token ID7454
Feature activation-3.02e+00

Pos logit contributions
 wife+0.028
 attitude+0.026
 opinion+0.026
sed+0.025
 stomach+0.025
Neg logit contributions
izing-0.033
E-0.032
ler-0.032
hest-0.032
corn-0.031
Tokenister
Token ID1694
Feature activation-1.31e+00

Pos logit contributions
izing+0.012
ler+0.011
ww+0.011
hest+0.011
E+0.010
Neg logit contributions
 wife-0.019
 opinion-0.018
 attitude-0.018
 stomach-0.018
 name-0.018
Token.
Token ID13
Feature activation-1.96e-01

Pos logit contributions
 wife+0.026
 opinion+0.025
 attitude+0.025
 stomach+0.024
 ending+0.023
Neg logit contributions
izing-0.038
ler-0.037
corn-0.036
ww-0.035
hest-0.035
Token We
Token ID775
Feature activation-2.52e+00

Pos logit contributions
 wife+0.006
 attitude+0.005
 opinion+0.005
 stomach+0.005
 fault+0.005
Neg logit contributions
izing-0.004
ler-0.004
E-0.004
ww-0.004
corn-0.004
Token will
Token ID481
Feature activation-2.26e+00

Pos logit contributions
 wife+0.062
joy+0.060
 attitude+0.058
 stomach+0.058
 ending+0.058
Neg logit contributions
ler-0.058
izing-0.056
w-0.055
ww-0.055
hest-0.054
Token be
Token ID307
Feature activation+4.30e-01

Pos logit contributions
joy+0.045
 wife+0.043
 attitude+0.040
 behavior+0.040
sed+0.039
Neg logit contributions
izing-0.063
ler-0.058
hest-0.057
ww-0.057
E-0.057
Token more
Token ID517
Feature activation+2.25e+00

Pos logit contributions
izing+0.011
ler+0.010
ww+0.010
hest+0.010
E+0.010
Neg logit contributions
 attitude-0.010
 wife-0.010
 ending-0.010
 opinion-0.009
 stomach-0.009
Token careful
Token ID8161
Feature activation-6.16e-01

Pos logit contributions
izing+0.062
ler+0.060
E+0.060
hest+0.058
corn+0.058
Neg logit contributions
 wife-0.051
 opinion-0.048
 name-0.046
 attitude-0.046
 fault-0.044
Token next
Token ID1306
Feature activation+2.24e-01

Pos logit contributions
 wife+0.015
 opinion+0.014
 attitude+0.014
 stomach+0.013
 name+0.013
Neg logit contributions
izing-0.016
ler-0.015
hest-0.015
ww-0.015
E-0.014
Token time
Token ID640
Feature activation-1.32e+00

Pos logit contributions
izing+0.007
ler+0.007
hest+0.007
E+0.007
ww+0.007
Neg logit contributions
 wife-0.004
 opinion-0.004
 attitude-0.004
 stomach-0.004
 name-0.004
Token."
Token ID526
Feature activation-1.13e+00

Pos logit contributions
 wife+0.028
 opinion+0.027
 attitude+0.026
 name+0.026
 stomach+0.026
Neg logit contributions
izing-0.037
ler-0.035
E-0.035
hest-0.034
ww-0.034
Token
Token ID198
Feature activation+2.05e+00

Pos logit contributions
 opinion+0.025
 attitude+0.025
 wife+0.025
 ending+0.024
 fault+0.024
Neg logit contributions
E-0.028
izing-0.028
corn-0.027
ww-0.027
ler-0.026
Token.
Token ID13
Feature activation-1.05e+00

Pos logit contributions
 wife+0.027
 opinion+0.025
 attitude+0.024
 stomach+0.023
 name+0.023
Neg logit contributions
izing-0.055
ler-0.053
E-0.052
ww-0.052
hest-0.052
Token They
Token ID1119
Feature activation+1.17e-01

Pos logit contributions
 wife+0.021
 behavior+0.020
sed+0.020
 permission+0.019
 fault+0.019
Neg logit contributions
E-0.029
hest-0.027
corn-0.027
sc-0.027
ch-0.027
Token liked
Token ID8288
Feature activation-1.25e+00

Pos logit contributions
izing+0.003
ler+0.003
hest+0.003
E+0.003
ww+0.003
Neg logit contributions
 wife-0.002
 opinion-0.002
 attitude-0.002
 stomach-0.002
 name-0.002
Token to
Token ID284
Feature activation-1.04e+00

Pos logit contributions
 wife+0.016
 attitude+0.015
 opinion+0.015
 stomach+0.014
 name+0.013
Neg logit contributions
izing-0.045
corn-0.043
ler-0.043
ww-0.043
E-0.042
Token swing
Token ID9628
Feature activation+1.87e+00

Pos logit contributions
 fault+0.024
 wife+0.023
sed+0.023
 ending+0.023
 behavior+0.022
Neg logit contributions
hest-0.023
izing-0.023
ler-0.023
corn-0.022
ch-0.022
Token,
Token ID11
Feature activation+1.21e+00

Pos logit contributions
izing+0.060
ler+0.057
E+0.057
ww+0.055
hest+0.055
Neg logit contributions
 wife-0.032
 opinion-0.031
 name-0.031
 attitude-0.031
 stomach-0.029
Token slide
Token ID10649
Feature activation+1.04e+00

Pos logit contributions
izing+0.037
ler+0.035
ww+0.034
E+0.034
hest+0.033
Neg logit contributions
 wife-0.024
 opinion-0.023
 name-0.022
 attitude-0.022
 stomach-0.022
Token and
Token ID290
Feature activation+7.46e-01

Pos logit contributions
izing+0.034
E+0.033
hest+0.033
ler+0.033
ww+0.032
Neg logit contributions
 wife-0.016
 name-0.015
 attitude-0.015
 stomach-0.015
 opinion-0.015
Token climb
Token ID12080
Feature activation+7.41e-01

Pos logit contributions
izing+0.022
ler+0.021
hest+0.021
ww+0.021
E+0.021
Neg logit contributions
 wife-0.015
 opinion-0.014
 attitude-0.014
 stomach-0.013
 name-0.013
Token on
Token ID319
Feature activation-1.14e+00

Pos logit contributions
izing+0.023
ler+0.022
ww+0.021
E+0.021
hest+0.021
Neg logit contributions
 wife-0.013
 opinion-0.013
 name-0.013
 attitude-0.013
 stomach-0.012
Token the
Token ID262
Feature activation-1.34e+00

Pos logit contributions
 wife+0.019
 attitude+0.018
 opinion+0.018
 stomach+0.018
 name+0.017
Neg logit contributions
izing-0.036
ler-0.036
ww-0.035
hest-0.035
w-0.034
Token let
Token ID1309
Feature activation+1.49e+00

Pos logit contributions
 wife+0.004
 opinion+0.004
 attitude+0.004
 stomach+0.004
 name+0.004
Neg logit contributions
izing-0.005
ler-0.004
ww-0.004
hest-0.004
E-0.004
Token's
Token ID338
Feature activation-2.23e+00

Pos logit contributions
izing+0.045
ler+0.042
hest+0.041
E+0.040
ww+0.040
Neg logit contributions
 wife-0.030
 opinion-0.028
 name-0.027
 stomach-0.027
 attitude-0.026
Token play
Token ID711
Feature activation+1.76e+00

Pos logit contributions
 wife+0.040
 attitude+0.038
 fault+0.037
joy+0.037
 opinion+0.037
Neg logit contributions
izing-0.066
ler-0.064
E-0.063
ww-0.063
hest-0.063
Token with
Token ID351
Feature activation+2.30e+00

Pos logit contributions
izing+0.045
ler+0.042
ww+0.041
hest+0.040
E+0.040
Neg logit contributions
 opinion-0.042
 wife-0.042
 name-0.041
 attitude-0.039
 stomach-0.039
Token the
Token ID262
Feature activation-2.84e-01

Pos logit contributions
izing+0.060
hest+0.057
plets+0.056
ler+0.056
sc+0.056
Neg logit contributions
 parents-0.057
 mom-0.056
 dad-0.056
 wife-0.056
 stomach-0.051
Token dolls
Token ID36062
Feature activation+9.67e-01

Pos logit contributions
 wife+0.007
 opinion+0.006
 attitude+0.006
 name+0.006
 stomach+0.006
Neg logit contributions
izing-0.008
ler-0.007
hest-0.007
E-0.007
ww-0.007
Token.
Token ID13
Feature activation+1.44e-01

Pos logit contributions
izing+0.030
E+0.028
ww+0.028
ler+0.027
corn+0.027
Neg logit contributions
 wife-0.019
 opinion-0.018
 name-0.017
 attitude-0.017
 stomach-0.016
Token You
Token ID921
Feature activation-2.00e+00

Pos logit contributions
izing+0.004
ler+0.004
hest+0.004
ww+0.004
E+0.004
Neg logit contributions
 wife-0.003
 opinion-0.003
 attitude-0.003
 stomach-0.003
 name-0.003
Token can
Token ID460
Feature activation-5.46e-01

Pos logit contributions
 wife+0.041
 opinion+0.040
 attitude+0.039
 name+0.038
 stomach+0.037
Neg logit contributions
izing-0.059
ler-0.055
E-0.054
ww-0.054
hest-0.053
Token be
Token ID307
Feature activation+2.13e+00

Pos logit contributions
 wife+0.008
 opinion+0.008
 attitude+0.007
 name+0.007
 stomach+0.007
Neg logit contributions
izing-0.019
ler-0.018
ww-0.018
corn-0.018
E-0.018
Token the
Token ID262
Feature activation+8.18e-01

Pos logit contributions
izing+0.050
ler+0.047
hest+0.045
E+0.045
ww+0.045
Neg logit contributions
 wife-0.056
 opinion-0.054
 attitude-0.053
 name-0.052
 stomach-0.052
Token woke
Token ID19092
Feature activation-9.28e-01

Pos logit contributions
 attitude+0.006
 opinion+0.006
 wife+0.006
 fault+0.005
 name+0.005
Neg logit contributions
ler-0.006
corn-0.006
ww-0.006
izing-0.006
E-0.006
Token up
Token ID510
Feature activation-2.42e+00

Pos logit contributions
 wife+0.024
 opinion+0.024
 attitude+0.024
 stomach+0.022
 name+0.022
Neg logit contributions
izing-0.022
ler-0.020
E-0.019
corn-0.019
ww-0.019
Token and
Token ID290
Feature activation+6.04e-01

Pos logit contributions
 wife+0.063
 attitude+0.061
 opinion+0.061
 fault+0.059
sed+0.059
Neg logit contributions
izing-0.053
ler-0.052
E-0.047
corn-0.046
is-0.046
Token wanted
Token ID2227
Feature activation-3.00e-01

Pos logit contributions
izing+0.015
ler+0.015
corn+0.014
E+0.014
ww+0.014
Neg logit contributions
 opinion-0.014
 fault-0.014
 attitude-0.014
 wife-0.014
 behavior-0.013
Token to
Token ID284
Feature activation-4.27e-01

Pos logit contributions
 attitude+0.005
 wife+0.005
 opinion+0.005
sed+0.005
urance+0.005
Neg logit contributions
corn-0.009
B-0.009
E-0.009
ch-0.009
izing-0.009
Token play
Token ID711
Feature activation+1.60e+00

Pos logit contributions
 fault+0.008
 wife+0.008
joy+0.008
sed+0.008
 attitude+0.008
Neg logit contributions
E-0.012
ler-0.011
hest-0.011
izing-0.011
ww-0.011
Token with
Token ID351
Feature activation+2.02e+00

Pos logit contributions
izing+0.035
ler+0.033
hest+0.032
E+0.032
corn+0.031
Neg logit contributions
 wife-0.044
 attitude-0.042
 opinion-0.042
 fault-0.041
 stomach-0.041
Token the
Token ID262
Feature activation+2.38e-01

Pos logit contributions
izing+0.062
ww+0.059
ler+0.059
hest+0.058
E+0.056
Neg logit contributions
 wife-0.039
 opinion-0.037
 attitude-0.036
 stomach-0.036
 name-0.035
Token bow
Token ID9563
Feature activation+2.29e+00

Pos logit contributions
izing+0.006
ler+0.006
E+0.006
ww+0.006
hest+0.006
Neg logit contributions
 wife-0.006
 opinion-0.005
 name-0.005
 attitude-0.005
 stomach-0.005
Token.
Token ID13
Feature activation-7.95e-01

Pos logit contributions
izing+0.067
ler+0.066
hest+0.063
ww+0.061
E+0.061
Neg logit contributions
 wife-0.045
 opinion-0.043
 attitude-0.042
 stomach-0.040
 name-0.040
Token He
Token ID679
Feature activation+7.80e-01

Pos logit contributions
 wife+0.018
 ending+0.016
 permission+0.016
 fault+0.016
 attitude+0.016
Neg logit contributions
izing-0.020
E-0.020
corn-0.019
ww-0.019
hest-0.019
Token and
Token ID290
Feature activation+2.12e+00

Pos logit contributions
izing+0.119
E+0.111
ww+0.110
ler+0.110
hest+0.109
Neg logit contributions
 wife-0.082
 name-0.076
 opinion-0.075
 stomach-0.075
 attitude-0.073
Token jewels
Token ID42701
Feature activation+3.69e+00

Pos logit contributions
izing+0.053
ler+0.050
hest+0.048
E+0.048
ww+0.048
Neg logit contributions
 wife-0.053
 opinion-0.050
 attitude-0.049
 name-0.048
 stomach-0.048
Token.
Token ID13
Feature activation-2.25e+00

Pos logit contributions
izing+0.108
ler+0.104
hest+0.102
ww+0.098
corn+0.098
Neg logit contributions
 wife-0.074
 opinion-0.071
 attitude-0.070
 stomach-0.067
 name-0.065
Token Jack
Token ID3619
Feature activation+1.39e+00

Pos logit contributions
 wife+0.039
 opinion+0.037
 attitude+0.036
 stomach+0.035
 name+0.034
Neg logit contributions
izing-0.072
ler-0.069
E-0.067
ww-0.067
hest-0.067
Token was
Token ID373
Feature activation-4.82e-01

Pos logit contributions
izing+0.041
ler+0.039
ww+0.038
hest+0.038
E+0.038
Neg logit contributions
 wife-0.028
 opinion-0.026
 attitude-0.026
 name-0.025
 stomach-0.025
Token very
Token ID845
Feature activation+1.14e+00

Pos logit contributions
 wife+0.013
 attitude+0.012
 opinion+0.012
 behavior+0.012
 fault+0.012
Neg logit contributions
izing-0.010
ler-0.010
hest-0.010
E-0.009
ww-0.009
Token happy
Token ID3772
Feature activation-8.65e-01

Pos logit contributions
izing+0.032
ler+0.031
hest+0.030
ww+0.030
E+0.030
Neg logit contributions
 wife-0.024
 opinion-0.023
 attitude-0.022
 stomach-0.022
 name-0.021
Token and
Token ID290
Feature activation+1.97e+00

Pos logit contributions
 wife+0.022
 opinion+0.021
 attitude+0.021
 stomach+0.020
 name+0.020
Neg logit contributions
izing-0.021
ler-0.019
corn-0.019
ww-0.018
hest-0.018
Token danced
Token ID39480
Feature activation+6.95e-01

Pos logit contributions
izing+0.050
ww+0.046
ler+0.046
corn+0.045
hest+0.045
Neg logit contributions
 wife-0.045
 opinion-0.044
 attitude-0.044
 fault-0.044
 stomach-0.042
Token around
Token ID1088
Feature activation+3.87e-01

Pos logit contributions
izing+0.016
ler+0.016
hest+0.015
E+0.015
corn+0.015
Neg logit contributions
 wife-0.018
 opinion-0.017
 attitude-0.017
 fault-0.017
 stomach-0.016
Token on
Token ID319
Feature activation-5.21e-01

Pos logit contributions
izing+0.010
ler+0.010
corn+0.009
E+0.009
hest+0.009
Neg logit contributions
 wife-0.009
 opinion-0.008
 attitude-0.008
 stomach-0.008
 name-0.008
Token night
Token ID1755
Feature activation-1.04e+00

Pos logit contributions
izing+0.008
ler+0.007
hest+0.007
E+0.007
ww+0.007
Neg logit contributions
 wife-0.006
 opinion-0.005
 attitude-0.005
 stomach-0.005
 name-0.005
Token,
Token ID11
Feature activation+4.76e-02

Pos logit contributions
 wife+0.019
 opinion+0.018
 attitude+0.017
 fault+0.017
 name+0.016
Neg logit contributions
izing-0.033
ler-0.032
ww-0.031
E-0.031
corn-0.030
Token Lily
Token ID20037
Feature activation-2.82e-01

Pos logit contributions
izing+0.001
ler+0.001
E+0.001
ww+0.001
hest+0.001
Neg logit contributions
 wife-0.001
 opinion-0.001
 attitude-0.001
 name-0.001
 stomach-0.001
Token's
Token ID338
Feature activation-3.18e+00

Pos logit contributions
 wife+0.007
 opinion+0.007
 attitude+0.006
 fault+0.006
 name+0.006
Neg logit contributions
ler-0.007
izing-0.007
E-0.006
ww-0.006
corn-0.006
Token mom
Token ID1995
Feature activation+6.60e-01

Pos logit contributions
 attitude+0.043
 wife+0.043
urance+0.042
 opinion+0.042
joy+0.040
Neg logit contributions
ler-0.092
izing-0.090
E-0.090
web-0.089
ww-0.089
Tokenmy
Token ID1820
Feature activation-1.13e-01

Pos logit contributions
izing+0.018
ler+0.017
hest+0.016
ww+0.016
E+0.016
Neg logit contributions
 wife-0.015
 opinion-0.014
 attitude-0.014
 stomach-0.014
 name-0.014
Token took
Token ID1718
Feature activation+9.71e-01

Pos logit contributions
 wife+0.002
 opinion+0.002
 attitude+0.002
 stomach+0.002
 name+0.002
Neg logit contributions
izing-0.003
ler-0.003
E-0.003
corn-0.003
ww-0.003
Token her
Token ID607
Feature activation-2.38e+00

Pos logit contributions
izing+0.033
ler+0.031
hest+0.030
ww+0.030
corn+0.030
Neg logit contributions
 wife-0.016
 opinion-0.016
 name-0.015
 attitude-0.015
 stomach-0.015
Token outside
Token ID2354
Feature activation-2.14e+00

Pos logit contributions
 wife+0.039
 attitude+0.037
 opinion+0.036
sed+0.035
 stomach+0.035
Neg logit contributions
ler-0.073
izing-0.073
sc-0.069
hest-0.069
E-0.069
Token and
Token ID290
Feature activation+1.62e+00

Pos logit contributions
 wife+0.044
sed+0.040
 attitude+0.040
 opinion+0.040
 name+0.039
Neg logit contributions
ler-0.058
izing-0.058
hest-0.054
sc-0.054
E-0.054
Token showed
Token ID3751
Feature activation+6.35e-01

Pos logit contributions
izing+0.046
ler+0.044
E+0.042
ww+0.042
hest+0.042
Neg logit contributions
 wife-0.034
 opinion-0.032
 attitude-0.032
 stomach-0.031
 name-0.031
Token made
Token ID925
Feature activation+1.36e+00

Pos logit contributions
 wife+0.006
 fault+0.006
 opinion+0.006
 attitude+0.006
 name+0.006
Neg logit contributions
izing-0.007
ler-0.007
corn-0.007
E-0.007
ww-0.007
Token his
Token ID465
Feature activation-3.02e+00

Pos logit contributions
izing+0.041
ler+0.039
ww+0.039
corn+0.038
hest+0.038
Neg logit contributions
 wife-0.027
 name-0.026
 opinion-0.026
 stomach-0.024
 attitude-0.024
Token way
Token ID835
Feature activation-4.05e-01

Pos logit contributions
 wife+0.048
 attitude+0.045
 fault+0.044
urance+0.043
 opinion+0.042
Neg logit contributions
izing-0.087
ww-0.083
ler-0.083
E-0.080
corn-0.079
Token closer
Token ID5699
Feature activation-2.63e+00

Pos logit contributions
 wife+0.007
 opinion+0.007
 attitude+0.007
 stomach+0.006
 name+0.006
Neg logit contributions
izing-0.013
ler-0.012
ww-0.012
E-0.012
hest-0.012
Token and
Token ID290
Feature activation+1.98e+00

Pos logit contributions
 wife+0.048
 opinion+0.045
 attitude+0.044
 stomach+0.043
 name+0.042
Neg logit contributions
izing-0.082
ler-0.078
hest-0.076
ww-0.075
E-0.075
Token asked
Token ID1965
Feature activation-2.67e+00

Pos logit contributions
izing+0.054
ler+0.052
ww+0.052
hest+0.052
E+0.051
Neg logit contributions
 wife-0.044
 opinion-0.040
 stomach-0.039
 name-0.039
 attitude-0.039
Token,
Token ID11
Feature activation-1.98e+00

Pos logit contributions
 fault+0.067
sed+0.066
 wife+0.066
 attitude+0.065
 stomach+0.064
Neg logit contributions
izing-0.059
E-0.055
ler-0.053
corn-0.053
B-0.052
Token "
Token ID366
Feature activation+5.65e+00

Pos logit contributions
 wife+0.032
 attitude+0.032
 opinion+0.031
 fault+0.031
 behavior+0.030
Neg logit contributions
izing-0.064
E-0.060
ler-0.059
hest-0.058
corn-0.058
TokenHey
Token ID10814
Feature activation-7.07e-01

Pos logit contributions
izing+0.136
ler+0.133
ww+0.125
hest+0.122
ang+0.116
Neg logit contributions
 wife-0.139
 name-0.129
joy-0.128
 permission-0.127
 opinion-0.125
Token!
Token ID0
Feature activation+1.28e-01

Pos logit contributions
 wife+0.017
 opinion+0.016
 attitude+0.015
 name+0.015
 stomach+0.015
Neg logit contributions
izing-0.018
ler-0.018
hest-0.017
E-0.017
ww-0.017
Token Hi
Token ID15902
Feature activation-5.20e-01

Pos logit contributions
izing+0.004
ler+0.003
E+0.003
ww+0.003
hest+0.003
Neg logit contributions
 wife-0.003
 attitude-0.003
 opinion-0.003
 stomach-0.002
 fault-0.002
Token and
Token ID290
Feature activation+2.04e+00

Pos logit contributions
izing+0.054
ler+0.052
hest+0.050
ww+0.049
corn+0.049
Neg logit contributions
 wife-0.049
 opinion-0.046
 attitude-0.046
 name-0.044
 stomach-0.044
Token their
Token ID511
Feature activation-7.45e-01

Pos logit contributions
corn+0.043
ler+0.042
izing+0.041
ww+0.040
ch+0.040
Neg logit contributions
 opinion-0.050
 fault-0.049
 attitude-0.048
 wife-0.048
 name-0.047
Token coat
Token ID13209
Feature activation+1.62e+00

Pos logit contributions
 wife+0.016
 opinion+0.016
 attitude+0.016
 stomach+0.015
urance+0.015
Neg logit contributions
izing-0.018
ler-0.018
corn-0.017
ww-0.017
E-0.017
Token.
Token ID13
Feature activation-1.35e+00

Pos logit contributions
izing+0.044
ler+0.043
hest+0.041
corn+0.040
ww+0.039
Neg logit contributions
 wife-0.035
 opinion-0.034
 attitude-0.032
 name-0.032
 stomach-0.031
Token They
Token ID1119
Feature activation+1.21e+00

Pos logit contributions
 wife+0.023
 opinion+0.022
 attitude+0.021
 stomach+0.021
 name+0.020
Neg logit contributions
izing-0.043
ler-0.041
E-0.041
ww-0.041
hest-0.041
Token still
Token ID991
Feature activation-1.35e+00

Pos logit contributions
ler+0.031
izing+0.030
E+0.029
hest+0.029
corn+0.029
Neg logit contributions
 wife-0.027
 opinion-0.026
 attitude-0.026
 fault-0.025
 name-0.025
Token liked
Token ID8288
Feature activation+1.13e+00

Pos logit contributions
 wife+0.029
 opinion+0.028
 attitude+0.027
 fault+0.027
 stomach+0.027
Neg logit contributions
ler-0.035
izing-0.035
corn-0.033
ww-0.033
E-0.033
Token motion
Token ID6268
Feature activation+1.92e+00

Pos logit contributions
izing+0.039
ler+0.038
hest+0.038
E+0.037
ww+0.037
Neg logit contributions
 wife-0.017
 opinion-0.016
 attitude-0.015
 stomach-0.015
 name-0.015
Token,
Token ID11
Feature activation+6.75e-01

Pos logit contributions
izing+0.060
E+0.057
ww+0.057
ler+0.057
hest+0.057
Neg logit contributions
 wife-0.033
 opinion-0.032
 attitude-0.032
 stomach-0.031
 name-0.030
Token but
Token ID475
Feature activation+1.48e+00

Pos logit contributions
izing+0.019
ler+0.018
hest+0.018
ww+0.018
E+0.018
Neg logit contributions
 wife-0.014
 opinion-0.014
 attitude-0.013
 stomach-0.013
 name-0.013
Token they
Token ID484
Feature activation+4.71e-01

Pos logit contributions
izing+0.037
ler+0.035
hest+0.034
E+0.033
ww+0.033
Neg logit contributions
 wife-0.037
 opinion-0.036
 attitude-0.035
 stomach-0.034
 name-0.034
Token up
Token ID510
Feature activation-9.79e-01

Pos logit contributions
 wife+0.036
 opinion+0.034
 attitude+0.034
 stomach+0.033
 name+0.033
Neg logit contributions
izing-0.043
ler-0.041
ww-0.039
hest-0.039
E-0.039
Token and
Token ID290
Feature activation+1.30e+00

Pos logit contributions
 wife+0.023
 opinion+0.022
 attitude+0.022
 stomach+0.021
 name+0.021
Neg logit contributions
izing-0.025
ler-0.024
E-0.023
ww-0.023
hest-0.023
Token took
Token ID1718
Feature activation+7.30e-01

Pos logit contributions
izing+0.039
ler+0.037
hest+0.037
ww+0.036
E+0.036
Neg logit contributions
 wife-0.026
 opinion-0.024
 attitude-0.023
 stomach-0.023
 name-0.023
Token him
Token ID683
Feature activation-1.92e+00

Pos logit contributions
izing+0.024
ler+0.023
hest+0.022
ww+0.022
corn+0.021
Neg logit contributions
 wife-0.012
 opinion-0.012
 name-0.012
 stomach-0.011
 attitude-0.011
Token to
Token ID284
Feature activation-8.54e-01

Pos logit contributions
 wife+0.037
 attitude+0.034
 opinion+0.033
 stomach+0.033
 ending+0.032
Neg logit contributions
ler-0.055
izing-0.054
hest-0.053
sc-0.052
B-0.052
Token the
Token ID262
Feature activation-1.16e+00

Pos logit contributions
 attitude+0.022
 wife+0.022
sed+0.021
 opinion+0.021
 fault+0.021
Neg logit contributions
E-0.017
izing-0.016
B-0.016
ch-0.015
Little-0.015
Token vet
Token ID20202
Feature activation-3.04e+00

Pos logit contributions
 attitude+0.027
 opinion+0.025
urance+0.024
 fault+0.024
 behavior+0.024
Neg logit contributions
E-0.028
izing-0.027
ww-0.027
ch-0.027
Fl-0.026
Token.
Token ID13
Feature activation-1.16e+00

Pos logit contributions
 ending+0.075
 attitude+0.074
 fault+0.072
 stomach+0.071
 wife+0.070
Neg logit contributions
ler-0.064
sc-0.063
izing-0.063
corn-0.061
oc-0.059
Token The
Token ID383
Feature activation-7.38e-01

Pos logit contributions
 wife+0.024
 opinion+0.023
 attitude+0.023
 fault+0.022
 stomach+0.022
Neg logit contributions
izing-0.033
E-0.031
corn-0.030
ww-0.029
ler-0.029
Token vet
Token ID20202
Feature activation-5.50e-01

Pos logit contributions
 attitude+0.017
 fault+0.017
 opinion+0.016
joy+0.016
urance+0.016
Neg logit contributions
izing-0.018
E-0.017
ch-0.017
ww-0.016
ler-0.016
Token gave
Token ID2921
Feature activation+2.37e+00

Pos logit contributions
 wife+0.013
 opinion+0.012
 attitude+0.012
 stomach+0.012
 fault+0.012
Neg logit contributions
izing-0.014
ler-0.014
E-0.013
corn-0.013
ww-0.013
Token be
Token ID307
Feature activation+1.24e+00

Pos logit contributions
 wife+0.030
 opinion+0.029
 attitude+0.028
 name+0.027
 stomach+0.027
Neg logit contributions
izing-0.050
ler-0.048
hest-0.047
E-0.047
ww-0.047
Token more
Token ID517
Feature activation+1.67e+00

Pos logit contributions
izing+0.034
ler+0.032
hest+0.031
E+0.031
ww+0.031
Neg logit contributions
 wife-0.028
 opinion-0.027
 attitude-0.026
 stomach-0.026
 name-0.026
Token careful
Token ID8161
Feature activation-5.10e-01

Pos logit contributions
izing+0.045
ler+0.043
E+0.042
hest+0.042
ww+0.041
Neg logit contributions
 wife-0.039
 opinion-0.037
 attitude-0.036
 name-0.035
 stomach-0.035
Token next
Token ID1306
Feature activation-6.31e-01

Pos logit contributions
 wife+0.012
 opinion+0.011
 attitude+0.011
 stomach+0.011
 name+0.010
Neg logit contributions
izing-0.014
ler-0.013
hest-0.013
ww-0.013
E-0.013
Token time
Token ID640
Feature activation-1.65e+00

Pos logit contributions
 wife+0.013
 opinion+0.012
 attitude+0.012
 stomach+0.012
 name+0.011
Neg logit contributions
izing-0.018
ler-0.018
ww-0.017
corn-0.017
hest-0.017
Token."
Token ID526
Feature activation-1.24e+00

Pos logit contributions
 wife+0.032
 opinion+0.030
 attitude+0.029
 name+0.028
 stomach+0.028
Neg logit contributions
izing-0.050
ler-0.048
E-0.047
hest-0.046
ww-0.046
Token\n
Token ID198
Feature activation+1.83e+00

Pos logit contributions
 wife+0.024
 opinion+0.023
 attitude+0.022
 stomach+0.022
 name+0.021
Neg logit contributions
izing-0.037
ler-0.036
E-0.035
ww-0.035
hest-0.035
Token\n
Token ID198
Feature activation+4.12e+00

Pos logit contributions
izing+0.062
ler+0.060
ww+0.058
hest+0.057
corn+0.057
Neg logit contributions
 wife-0.030
 opinion-0.027
 name-0.027
 attitude-0.026
 stomach-0.026
TokenThey
Token ID2990
Feature activation+2.21e-01

Pos logit contributions
izing+0.094
ler+0.086
plets+0.083
hest+0.080
ww+0.074
Neg logit contributions
 wife-0.111
 parents-0.098
 name-0.097
 nod-0.095
 His-0.095
Token nodded
Token ID14464
Feature activation-1.85e+00

Pos logit contributions
ler+0.006
izing+0.006
E+0.005
hest+0.005
corn+0.005
Neg logit contributions
 opinion-0.005
 wife-0.005
 attitude-0.005
 fault-0.005
 name-0.005
Token.
Token ID13
Feature activation-1.37e+00

Pos logit contributions
 fault+0.060
sed+0.058
 wife+0.057
 attitude+0.056
 opinion+0.056
Neg logit contributions
izing-0.027
ler-0.025
B-0.024
E-0.024
Little-0.021
Tokenoop
Token ID11224
Feature activation+1.36e+00

Pos logit contributions
izing+0.000
ler+0.000
hest+0.000
ww+0.000
E+0.000
Neg logit contributions
 wife-0.000
 attitude-0.000
 opinion-0.000
 fault-0.000
 ending-0.000
Tokening
Token ID278
Feature activation-4.71e-01

Pos logit contributions
izing+0.025
E+0.023
ww+0.022
ler+0.021
corn+0.021
Neg logit contributions
 wife-0.041
 opinion-0.039
 attitude-0.039
 name-0.039
 stomach-0.039
Token around
Token ID1088
Feature activation-1.57e+00

Pos logit contributions
 wife+0.009
 opinion+0.009
 attitude+0.009
 stomach+0.009
 name+0.009
Neg logit contributions
izing-0.013
ww-0.013
E-0.013
hest-0.013
corn-0.013
Token,
Token ID11
Feature activation-1.71e-01

Pos logit contributions
 wife+0.034
 opinion+0.032
 attitude+0.032
 name+0.031
 fault+0.031
Neg logit contributions
izing-0.043
ler-0.041
E-0.039
corn-0.039
hest-0.039
Token he
Token ID339
Feature activation+7.17e-01

Pos logit contributions
 wife+0.004
 opinion+0.004
 attitude+0.004
 stomach+0.004
 name+0.004
Neg logit contributions
izing-0.004
ler-0.004
hest-0.004
ww-0.004
E-0.004
Token mentioned
Token ID4750
Feature activation-2.78e+00

Pos logit contributions
izing+0.021
ler+0.020
E+0.019
ww+0.019
corn+0.019
Neg logit contributions
 wife-0.014
 opinion-0.014
 attitude-0.014
 fault-0.013
 name-0.013
Token dust
Token ID8977
Feature activation+1.42e+00

Pos logit contributions
 attitude+0.066
sed+0.064
 wife+0.063
 opinion+0.063
 fault+0.062
Neg logit contributions
izing-0.066
E-0.060
ler-0.060
ww-0.059
corn-0.059
Token.
Token ID13
Feature activation-1.45e+00

Pos logit contributions
izing+0.037
ler+0.036
hest+0.033
corn+0.033
ww+0.032
Neg logit contributions
 wife-0.034
 opinion-0.032
 attitude-0.031
 stomach-0.030
 name-0.029
Token He
Token ID679
Feature activation+5.93e-01

Pos logit contributions
 wife+0.030
 fault+0.029
 attitude+0.027
 ending+0.027
 behavior+0.027
Neg logit contributions
E-0.039
izing-0.038
corn-0.036
ch-0.036
hest-0.036
Token thought
Token ID1807
Feature activation-1.69e+00

Pos logit contributions
izing+0.019
ler+0.018
hest+0.018
ww+0.018
E+0.017
Neg logit contributions
 wife-0.011
 opinion-0.010
 name-0.009
 stomach-0.009
 attitude-0.009
Token out
Token ID503
Feature activation-7.58e-01

Pos logit contributions
 wife+0.046
 opinion+0.045
 attitude+0.045
 stomach+0.044
 tum+0.044
Neg logit contributions
izing-0.038
ww-0.035
E-0.035
ler-0.035
corn-0.034
Token again
Token ID757
Feature activation-1.42e+00

Pos logit contributions
 wife+0.019
 opinion+0.019
 attitude+0.018
 name+0.018
sed+0.017
Neg logit contributions
izing-0.015
ler-0.013
B-0.012
corn-0.012
hest-0.012
Token.
Token ID13
Feature activation-9.58e-01

Pos logit contributions
 wife+0.028
 opinion+0.026
 attitude+0.026
 stomach+0.025
 name+0.025
Neg logit contributions
izing-0.043
ler-0.040
hest-0.039
ww-0.039
E-0.038
Token She
Token ID1375
Feature activation+1.05e+00

Pos logit contributions
 wife+0.020
 opinion+0.019
 attitude+0.019
 stomach+0.018
 fault+0.018
Neg logit contributions
izing-0.026
E-0.026
ww-0.025
ler-0.024
hest-0.024
Token thanked
Token ID26280
Feature activation+8.51e-01

Pos logit contributions
izing+0.032
ler+0.031
hest+0.030
ww+0.030
E+0.030
Neg logit contributions
 wife-0.020
 opinion-0.019
 attitude-0.018
 stomach-0.018
 name-0.018
Token the
Token ID262
Feature activation-4.54e-01

Pos logit contributions
izing+0.024
ler+0.023
hest+0.022
ww+0.022
corn+0.022
Neg logit contributions
 wife-0.018
 opinion-0.017
 attitude-0.017
 stomach-0.016
 name-0.016
Token boy
Token ID2933
Feature activation-3.47e+00

Pos logit contributions
 fault+0.011
 wife+0.011
 attitude+0.011
urance+0.011
 opinion+0.011
Neg logit contributions
izing-0.010
E-0.010
ler-0.010
B-0.009
is-0.009
Token and
Token ID290
Feature activation+1.50e+00

Pos logit contributions
 wife+0.076
 opinion+0.074
 attitude+0.073
sed+0.071
 stomach+0.070
Neg logit contributions
izing-0.087
ler-0.086
corn-0.082
hest-0.081
ww-0.079
Token they
Token ID484
Feature activation-1.01e+00

Pos logit contributions
izing+0.039
ler+0.037
ww+0.036
corn+0.036
E+0.036
Neg logit contributions
 opinion-0.034
 wife-0.033
 attitude-0.033
 fault-0.032
 name-0.032
Token became
Token ID2627
Feature activation+5.15e-01

Pos logit contributions
 wife+0.022
 opinion+0.022
 attitude+0.021
 fault+0.021
 name+0.020
Neg logit contributions
izing-0.026
ler-0.026
ww-0.025
E-0.025
corn-0.025
Token friends
Token ID2460
Feature activation-1.03e+00

Pos logit contributions
izing+0.017
ler+0.016
E+0.016
ww+0.016
hest+0.016
Neg logit contributions
 wife-0.008
 opinion-0.008
 attitude-0.008
 stomach-0.007
 name-0.007
Token.
Token ID13
Feature activation-2.84e-01

Pos logit contributions
 wife+0.024
 opinion+0.024
 attitude+0.024
sed+0.023
 name+0.023
Neg logit contributions
izing-0.025
ler-0.024
hest-0.023
corn-0.023
E-0.022
Token She
Token ID1375
Feature activation+1.20e+00

Pos logit contributions
 wife+0.034
 attitude+0.032
 opinion+0.032
 behavior+0.031
 ending+0.031
Neg logit contributions
izing-0.051
E-0.050
ww-0.047
corn-0.047
hest-0.046
Token asked
Token ID1965
Feature activation-1.47e+00

Pos logit contributions
izing+0.037
ler+0.034
hest+0.034
ww+0.033
E+0.033
Neg logit contributions
 wife-0.024
 opinion-0.022
 name-0.021
 attitude-0.021
 stomach-0.021
Token,
Token ID11
Feature activation-1.42e+00

Pos logit contributions
 wife+0.032
 opinion+0.030
 attitude+0.030
 stomach+0.029
 name+0.029
Neg logit contributions
izing-0.040
ler-0.038
E-0.037
ww-0.037
hest-0.037
Token "
Token ID366
Feature activation+6.07e+00

Pos logit contributions
 wife+0.020
 attitude+0.019
 opinion+0.019
 fault+0.018
 stomach+0.018
Neg logit contributions
izing-0.049
E-0.046
ler-0.046
corn-0.045
ww-0.045
TokenDo
Token ID5211
Feature activation+2.39e+00

Pos logit contributions
izing+0.148
ler+0.139
hest+0.133
ww+0.127
iflower+0.119
Neg logit contributions
 wife-0.161
joy-0.144
 name-0.140
 opinion-0.140
 permission-0.136
Token you
Token ID345
Feature activation-2.80e+00

Pos logit contributions
izing+0.023
ler+0.019
ww+0.018
E+0.018
hest+0.018
Neg logit contributions
 wife-0.095
 opinion-0.092
 attitude-0.092
 stomach-0.090
 name-0.090
Token want
Token ID765
Feature activation+9.85e-01

Pos logit contributions
 wife+0.063
sed+0.059
 parents+0.056
 stomach+0.056
 fault+0.056
Neg logit contributions
ler-0.071
hest-0.068
izing-0.067
ww-0.066
E-0.065
Token to
Token ID284
Feature activation-8.97e-01

Pos logit contributions
izing+0.036
ler+0.035
ww+0.035
hest+0.034
E+0.034
Neg logit contributions
 wife-0.012
 opinion-0.011
 attitude-0.011
 stomach-0.010
 name-0.010
Token play
Token ID711
Feature activation+1.90e+00

Pos logit contributions
 attitude+0.017
sed+0.017
 wife+0.017
joy+0.016
 fault+0.016
Neg logit contributions
hest-0.024
ch-0.024
izing-0.024
ww-0.023
E-0.023
Token?"
Token ID1701
Feature activation-9.96e-01

Pos logit contributions
izing+0.050
ler+0.047
ww+0.046
E+0.046
hest+0.045
Neg logit contributions
 wife-0.044
 opinion-0.044
 name-0.042
 attitude-0.040
 stomach-0.040
Token The
Token ID383
Feature activation-1.98e-01

Pos logit contributions
 wife+0.020
 attitude+0.019
 opinion+0.019
 fault+0.018
 ending+0.018
Neg logit contributions
izing-0.029
E-0.028
ler-0.027
corn-0.027
ww-0.026
Token P
Token ID350
Feature activation-4.64e-01

Pos logit contributions
 wife+0.024
 attitude+0.022
 opinion+0.022
 ending+0.021
 behavior+0.021
Neg logit contributions
izing-0.035
E-0.034
ler-0.033
hest-0.033
ww-0.032
Tokenips
Token ID2419
Feature activation+1.80e-01

Pos logit contributions
 wife+0.012
 opinion+0.011
 attitude+0.011
 stomach+0.011
 name+0.011
Neg logit contributions
izing-0.011
ler-0.011
hest-0.011
E-0.010
ww-0.010
Tokenque
Token ID4188
Feature activation+1.67e+00

Pos logit contributions
izing+0.005
E+0.005
ww+0.005
ler+0.005
hest+0.005
Neg logit contributions
 wife-0.004
 opinion-0.004
 attitude-0.004
 stomach-0.004
 name-0.003
Tokenak
Token ID461
Feature activation+7.75e-01

Pos logit contributions
corn+0.036
izing+0.036
ww+0.035
E+0.034
Little+0.032
Neg logit contributions
 opinion-0.043
 attitude-0.042
 wife-0.042
 fault-0.041
 stomach-0.041
Token called
Token ID1444
Feature activation-3.48e+00

Pos logit contributions
izing+0.022
ler+0.021
ww+0.020
hest+0.020
E+0.020
Neg logit contributions
 wife-0.017
 opinion-0.016
 attitude-0.016
 stomach-0.015
 name-0.015
Token out
Token ID503
Feature activation-3.12e+00

Pos logit contributions
 wife+0.083
 opinion+0.073
 attitude+0.072
 stomach+0.071
 behavior+0.069
Neg logit contributions
izing-0.085
ler-0.080
ch-0.077
hest-0.076
sc-0.075
Token,
Token ID11
Feature activation-2.42e+00

Pos logit contributions
 wife+0.051
 attitude+0.045
 fault+0.045
 opinion+0.044
 stomach+0.042
Neg logit contributions
izing-0.104
ler-0.096
E-0.096
ww-0.096
ch-0.094
Token "
Token ID366
Feature activation+5.84e+00

Pos logit contributions
 wife+0.036
 opinion+0.034
 attitude+0.034
 stomach+0.032
 fault+0.031
Neg logit contributions
izing-0.082
ler-0.078
E-0.077
ww-0.076
corn-0.076
TokenHelp
Token ID22087
Feature activation+9.83e-02

Pos logit contributions
izing+0.133
ler+0.122
hest+0.117
corn+0.113
ww+0.108
Neg logit contributions
 wife-0.149
 opinion-0.141
 permission-0.140
 name-0.139
 ending-0.135
Token!
Token ID0
Feature activation-7.11e-01

Pos logit contributions
izing+0.002
ler+0.002
hest+0.002
E+0.002
ww+0.002
Neg logit contributions
 wife-0.003
 opinion-0.003
 attitude-0.003
 stomach-0.003
 name-0.002
Token Help
Token ID10478
Feature activation+3.43e-01

Pos logit contributions
 wife+0.015
 opinion+0.014
 attitude+0.014
 stomach+0.014
 name+0.014
Neg logit contributions
izing-0.020
ler-0.019
E-0.019
ww-0.019
hest-0.018
Token He
Token ID679
Feature activation-5.75e-01

Pos logit contributions
 wife+0.030
 ending+0.028
 fault+0.028
 attitude+0.027
 opinion+0.027
Neg logit contributions
izing-0.039
E-0.038
ww-0.036
corn-0.035
ch-0.035
Token watched
Token ID7342
Feature activation-1.15e+00

Pos logit contributions
 wife+0.010
 name+0.010
 opinion+0.009
sed+0.008
 tum+0.008
Neg logit contributions
izing-0.020
hest-0.018
ler-0.018
ww-0.017
E-0.017
Token as
Token ID355
Feature activation-1.08e+00

Pos logit contributions
 wife+0.022
 opinion+0.021
 name+0.020
 attitude+0.020
 stomach+0.020
Neg logit contributions
izing-0.035
ww-0.034
ler-0.033
hest-0.033
corn-0.033
Token his
Token ID465
Feature activation-4.03e+00

Pos logit contributions
 wife+0.021
 opinion+0.018
 name+0.018
 attitude+0.018
 stomach+0.018
Neg logit contributions
izing-0.033
ler-0.032
hest-0.032
ww-0.031
corn-0.031
Token mum
Token ID25682
Feature activation-4.78e-01

Pos logit contributions
 attitude+0.041
 fault+0.039
urance+0.039
 opinion+0.039
 ending+0.037
Neg logit contributions
izing-0.147
ww-0.140
ler-0.139
E-0.136
iflower-0.132
Token started
Token ID2067
Feature activation-2.93e+00

Pos logit contributions
 wife+0.009
 name+0.008
 opinion+0.008
 parents+0.007
 nod+0.007
Neg logit contributions
izing-0.015
ww-0.015
E-0.014
hest-0.014
ler-0.014
Token to
Token ID284
Feature activation-1.18e-01

Pos logit contributions
 wife+0.050
 opinion+0.045
 attitude+0.045
 stomach+0.042
 name+0.042
Neg logit contributions
izing-0.095
ler-0.090
corn-0.087
E-0.086
ww-0.086
Token work
Token ID670
Feature activation+2.07e-01

Pos logit contributions
 wife+0.002
 opinion+0.002
 name+0.002
 attitude+0.002
 stomach+0.002
Neg logit contributions
izing-0.004
ler-0.003
ww-0.003
hest-0.003
E-0.003
Token hard
Token ID1327
Feature activation-9.89e-01

Pos logit contributions
izing+0.006
ww+0.006
E+0.006
ler+0.006
hest+0.006
Neg logit contributions
 attitude-0.004
 opinion-0.004
 wife-0.004
 name-0.004
 fault-0.004
Token on
Token ID319
Feature activation-4.60e-01

Pos logit contributions
 wife+0.015
 attitude+0.015
 opinion+0.015
 name+0.014
 behavior+0.014
Neg logit contributions
izing-0.032
E-0.031
ww-0.031
ler-0.031
hest-0.031
Token the
Token ID262
Feature activation-7.87e-01

Pos logit contributions
 wife+0.008
 opinion+0.008
 stomach+0.008
 parents+0.008
 attitude+0.008
Neg logit contributions
izing-0.014
ww-0.014
ler-0.014
hest-0.014
E-0.013