TOP ACTIVATIONS
MAX = 0.238

replied . Sh aron helped her mom pour the
a lot . Er in was glad that she
nodded . She liked mac aron i and cheese . She
their favorite food : mac aron i and cheese .
table . Sh aron smiled and took a big
a delicious bowl of mac aron i and cheese . And
ig loo . Er in felt so happy and
an ig loo ! Er in had never seen an
Soon , the mac aron i and cheese was ready
perfect battery ! Er nie thanked the creature and
discovered something ! Col in found a giant hole
Mom and they loved mac aron i and cheese .
about we make some mac aron i and cheese for dinner
my friend ?" Col in smiled and said ,
Then they placed some rainbow sprink les on top . It
. It has icing and sprink les . It smells y
'm making a delicious mac aron i and cheese ," M
your friend !" Col in took the hand home
the forest !" Er nie nodded and thanked the
had pink frost ing and sprink les . It had candles

MIN ACTIVATIONS
MIN = -0.203

omm a and smiled sheep ishly . M omm a shook
andy , it isn 't ours . We can 't take
is angry . It his ses . It scratches . It
like that . It his ses and fl aps its wings
ally , that 's not ours . Leave it where it
was , it sl ugg ishly moved in the water but
snake sees Tom and his ses . It opens its mouth
up things that are not ours ". The daddy
’ s not ours . â €
of Tom . It his ses and scratches . It hurts
isn ’ t ours without permission . It â
and angry . It his ses and fl aps its wings
to listen and kept playing anyway . The next
make his house safe like ours ?" Her mom and dad
it . The cat his ses and scratches him . Ben
is afraid . It his ses and scratches Tom 's hand
scare it . It his ses and shows its teeth .
The snake his ses . L ila and Tom
it ’ s ours ” .
embarrassed . He smiled sheep ishly at his mom , who

INTERVAL 0.128 - 0.238
CONTAINS 0.206%

was filled with gifts for Ad ri ana . She was
feel better ." Anna and Mom got some bread
test was to see how well Lisa could do certain things
circles , triangles , and stars . She sees how Anna
same shapes as their paper shapes . Lily and Ben smiled

INTERVAL 0.018 - 0.128
CONTAINS 28.558%

the farm ! <|endoftext|> One day , Max and his Mom
Tom and Lily were playing with magnets on the fridge .
but then he noticed something even stranger - a little car
scared , but she tried to stay brave .
ground . She picked it up and showed her mom .

INTERVAL -0.093 - 0.018
CONTAINS 70.138%

One day , the veterinarian went on a long journey
small rocks , sticks and leaves and anything else she thought
. " Tom , why did you pull the
and stepped back . But then , she noticed something comfortable
the hose and the water and the tape . He smiles

INTERVAL -0.203 - -0.093
CONTAINS 1.097%

to be your best friend forever !" Suddenly
a rock at all ! It was a mean old crocod
before riding off on his horse . He will always be
see what is in the sack . " Let
t eddy bear , her ball and her book . They
Token replied
Token ID8712
Feature activation-8.14e-02

Pos logit contributions
 sav+0.041
die+0.041
by+0.041
 late+0.040
idi+0.038
Neg logit contributions
 plunge-0.036
 umbrella-0.035
ishly-0.033
aroo-0.033
 hi-0.032
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
 plunge+0.153
ishly+0.152
aroo+0.148
 umbrella+0.146
anger+0.146
Neg logit contributions
die-0.118
by-0.108
 sav-0.106
bo-0.102
idi-0.101
Token
Token ID198
Feature activation-2.12e-03

Pos logit contributions
 plunge+0.025
anger+0.024
aroo+0.024
 hi+0.023
 umbrella+0.023
Neg logit contributions
by-0.018
bo-0.017
die-0.017
ce-0.017
en-0.016
Token
Token ID198
Feature activation+3.66e-03

Pos logit contributions
 plunge+0.005
aroo+0.005
anger+0.005
 umbrella+0.005
aro+0.005
Neg logit contributions
by-0.002
die-0.002
bo-0.002
 sav-0.002
ce-0.002
TokenSh
Token ID2484
Feature activation+8.42e-03

Pos logit contributions
by+0.005
die+0.005
bo+0.005
ce+0.005
aron+0.005
Neg logit contributions
 plunge-0.007
aroo-0.007
anger-0.007
 umbrella-0.006
adders-0.006
Tokenaron
Token ID8045
Feature activation+2.32e-01

Pos logit contributions
die+0.015
by+0.014
 sav+0.014
bo+0.013
 writes+0.013
Neg logit contributions
aroo-0.014
anger-0.014
 plunge-0.013
aro-0.012
oot-0.012
Token helped
Token ID4193
Feature activation-7.64e-03

Pos logit contributions
 late+0.307
 sav+0.288
die+0.285
by+0.267
aron+0.262
Neg logit contributions
aroo-0.469
 plunge-0.454
aro-0.434
 umbrella-0.428
ishly-0.426
Token her
Token ID607
Feature activation-1.13e-02

Pos logit contributions
 plunge+0.015
aroo+0.013
 umbrella+0.013
anger+0.013
 hi+0.013
Neg logit contributions
die-0.012
bo-0.011
 sav-0.011
by-0.010
ce-0.010
Token mom
Token ID1995
Feature activation+2.91e-02

Pos logit contributions
 plunge+0.019
 umbrella+0.018
aroo+0.017
anger+0.017
adders+0.016
Neg logit contributions
die-0.019
ce-0.019
aron-0.019
bo-0.019
 sav-0.019
Token pour
Token ID12797
Feature activation+4.92e-02

Pos logit contributions
die+0.041
 sav+0.039
idi+0.038
aron+0.037
 late+0.037
Neg logit contributions
 plunge-0.056
 umbrella-0.053
anger-0.050
aroo-0.050
adders-0.049
Token the
Token ID262
Feature activation+1.98e-03

Pos logit contributions
die+0.086
idi+0.082
ce+0.081
aron+0.079
Grand+0.079
Neg logit contributions
 plunge-0.088
 umbrella-0.085
anger-0.074
aroo-0.072
adders-0.072
Token a
Token ID257
Feature activation-1.30e-02

Pos logit contributions
 plunge+0.021
anger+0.020
aroo+0.019
 umbrella+0.019
adders+0.019
Neg logit contributions
die-0.011
ce-0.011
bo-0.011
by-0.011
 sav-0.010
Token lot
Token ID1256
Feature activation-2.24e-02

Pos logit contributions
 plunge+0.024
aroo+0.022
anger+0.022
 umbrella+0.022
vere+0.021
Neg logit contributions
bo-0.019
die-0.019
ce-0.019
by-0.019
aron-0.019
Token.
Token ID13
Feature activation+1.79e-02

Pos logit contributions
 plunge+0.052
anger+0.051
aroo+0.050
 umbrella+0.049
adders+0.048
Neg logit contributions
die-0.023
bo-0.022
by-0.021
 sav-0.021
idi-0.021
Token
Token ID198
Feature activation+1.30e-02

Pos logit contributions
by+0.025
ce+0.025
die+0.024
bo+0.024
idi+0.023
Neg logit contributions
 plunge-0.033
anger-0.033
aroo-0.032
 umbrella-0.031
aro-0.031
Token
Token ID198
Feature activation+1.70e-02

Pos logit contributions
die+0.016
by+0.016
ce+0.016
bo+0.016
 sav+0.016
Neg logit contributions
 plunge-0.027
aroo-0.025
anger-0.025
 umbrella-0.024
aro-0.024
TokenEr
Token ID9139
Feature activation+2.31e-01

Pos logit contributions
die+0.024
by+0.024
ce+0.024
bo+0.024
 sav+0.024
Neg logit contributions
 plunge-0.032
anger-0.030
aroo-0.030
 umbrella-0.029
adders-0.029
Tokenin
Token ID259
Feature activation+1.85e-02

Pos logit contributions
 sav+0.327
die+0.308
aron+0.300
 late+0.295
idi+0.294
Neg logit contributions
aroo-0.424
 plunge-0.398
anger-0.373
aro-0.361
fs-0.360
Token was
Token ID373
Feature activation+2.50e-02

Pos logit contributions
die+0.021
 late+0.020
aron+0.019
ce+0.019
by+0.018
Neg logit contributions
aroo-0.038
 plunge-0.038
 umbrella-0.037
anger-0.036
ishly-0.035
Token glad
Token ID9675
Feature activation-4.12e-02

Pos logit contributions
by+0.041
die+0.040
ce+0.040
aron+0.038
bo+0.037
Neg logit contributions
 plunge-0.043
aroo-0.041
 umbrella-0.040
 hi-0.038
adders-0.038
Token that
Token ID326
Feature activation+1.21e-02

Pos logit contributions
 plunge+0.057
aroo+0.053
 umbrella+0.053
anger+0.052
 hi+0.051
Neg logit contributions
die-0.080
bo-0.080
ce-0.079
aron-0.077
idi-0.075
Token she
Token ID673
Feature activation+4.25e-02

Pos logit contributions
die+0.022
aron+0.022
bo+0.021
ce+0.021
by+0.020
Neg logit contributions
 plunge-0.020
aroo-0.018
aro-0.017
anger-0.017
vere-0.017
Token nodded
Token ID14464
Feature activation+5.20e-02

Pos logit contributions
 late+0.011
by+0.011
ce+0.011
bo+0.011
die+0.011
Neg logit contributions
ishly-0.017
 plunge-0.017
 umbrella-0.017
 hi-0.016
anger-0.016
Token.
Token ID13
Feature activation+3.43e-02

Pos logit contributions
 sav+0.079
ce+0.075
die+0.074
bo+0.073
by+0.073
Neg logit contributions
 yes-0.090
 hi-0.087
 umbrella-0.085
 please-0.085
ishly-0.080
Token She
Token ID1375
Feature activation-9.43e-04

Pos logit contributions
by+0.059
bo+0.058
ce+0.056
die+0.054
 sav+0.054
Neg logit contributions
anger-0.054
 plunge-0.053
adders-0.051
 hi-0.051
 umbrella-0.050
Token liked
Token ID8288
Feature activation+3.16e-02

Pos logit contributions
 plunge+0.002
 umbrella+0.002
aroo+0.002
anger+0.002
 hi+0.002
Neg logit contributions
die-0.001
ce-0.001
by-0.001
 late-0.001
bo-0.001
Token mac
Token ID8352
Feature activation+1.18e-01

Pos logit contributions
bo+0.050
aron+0.049
ce+0.047
die+0.046
by+0.045
Neg logit contributions
 plunge-0.057
 umbrella-0.053
 hi-0.052
aro-0.050
aroo-0.050
Tokenaron
Token ID8045
Feature activation+2.29e-01

Pos logit contributions
idi+0.126
die+0.116
by+0.114
 late+0.111
 sav+0.106
Neg logit contributions
anger-0.254
ishly-0.241
adders-0.239
 plunge-0.236
 umbrella-0.234
Tokeni
Token ID72
Feature activation+5.43e-02

Pos logit contributions
die+0.326
 sav+0.326
 late+0.324
idi+0.318
aron+0.309
Neg logit contributions
 plunge-0.427
aroo-0.411
aro-0.386
anger-0.385
ishly-0.385
Token and
Token ID290
Feature activation+1.07e-02

Pos logit contributions
idi+0.061
die+0.058
 sav+0.057
ce+0.055
Grand+0.055
Neg logit contributions
 plunge-0.119
aroo-0.115
 umbrella-0.112
anger-0.112
adders-0.109
Token cheese
Token ID9891
Feature activation+3.85e-02

Pos logit contributions
ce+0.014
by+0.014
en+0.014
athan+0.014
aron+0.014
Neg logit contributions
 plunge-0.019
aroo-0.019
 umbrella-0.018
 hi-0.018
aro-0.017
Token.
Token ID13
Feature activation+2.72e-02

Pos logit contributions
die+0.044
idi+0.043
by+0.042
ce+0.042
bo+0.041
Neg logit contributions
 plunge-0.083
aroo-0.080
anger-0.080
 umbrella-0.079
adders-0.079
Token She
Token ID1375
Feature activation-6.02e-03

Pos logit contributions
by+0.047
bo+0.046
ce+0.045
die+0.043
en+0.042
Neg logit contributions
anger-0.042
 plunge-0.042
adders-0.041
 hi-0.040
 umbrella-0.039
Token their
Token ID511
Feature activation+1.37e-02

Pos logit contributions
die+0.122
ce+0.121
bo+0.120
aron+0.119
 sav+0.118
Neg logit contributions
 plunge-0.163
 umbrella-0.150
anger-0.144
adders-0.143
aroo-0.139
Token favorite
Token ID4004
Feature activation-2.80e-03

Pos logit contributions
ce+0.022
bo+0.022
aron+0.022
die+0.021
 sav+0.021
Neg logit contributions
 plunge-0.024
 umbrella-0.022
anger-0.022
aroo-0.022
adders-0.021
Token food
Token ID2057
Feature activation-8.71e-04

Pos logit contributions
 plunge+0.005
anger+0.005
aroo+0.005
 umbrella+0.005
adders+0.005
Neg logit contributions
bo-0.004
ce-0.004
die-0.004
by-0.004
 sav-0.004
Token:
Token ID25
Feature activation+6.13e-02

Pos logit contributions
 plunge+0.002
anger+0.002
 umbrella+0.002
aroo+0.002
adders+0.002
Neg logit contributions
die-0.001
ce-0.001
bo-0.001
by-0.001
idi-0.001
Token mac
Token ID8352
Feature activation+1.22e-01

Pos logit contributions
by+0.092
bo+0.091
ce+0.088
die+0.086
idi+0.083
Neg logit contributions
 plunge-0.110
 umbrella-0.110
 hi-0.108
adders-0.105
aroo-0.104
Tokenaron
Token ID8045
Feature activation+2.28e-01

Pos logit contributions
idi+0.124
 late+0.118
by+0.114
Kat+0.111
die+0.111
Neg logit contributions
anger-0.261
ittens-0.249
adders-0.235
ishly-0.235
 umbrella-0.233
Tokeni
Token ID72
Feature activation+5.17e-02

Pos logit contributions
 late+0.354
die+0.347
 sav+0.347
idi+0.342
ce+0.321
Neg logit contributions
 plunge-0.400
aroo-0.392
aro-0.368
 umbrella-0.365
ittens-0.361
Token and
Token ID290
Feature activation+1.01e-02

Pos logit contributions
idi+0.067
die+0.061
 late+0.060
 sav+0.060
Kat+0.060
Neg logit contributions
ittens-0.100
aroo-0.100
anger-0.098
 plunge-0.097
 umbrella-0.096
Token cheese
Token ID9891
Feature activation+3.90e-02

Pos logit contributions
by+0.015
en+0.015
athan+0.015
ce+0.015
aron+0.014
Neg logit contributions
 plunge-0.017
aroo-0.016
 umbrella-0.016
 hi-0.016
 tweet-0.015
Token.
Token ID13
Feature activation+1.66e-02

Pos logit contributions
die+0.048
idi+0.048
by+0.047
bo+0.045
ce+0.044
Neg logit contributions
 plunge-0.079
anger-0.079
aroo-0.077
 umbrella-0.076
adders-0.075
Token
Token ID198
Feature activation+1.62e-02

Pos logit contributions
by+0.024
bo+0.024
ce+0.023
die+0.023
idi+0.022
Neg logit contributions
 plunge-0.031
anger-0.030
adders-0.030
 umbrella-0.030
 hi-0.029
Token table
Token ID3084
Feature activation+7.82e-02

Pos logit contributions
die+0.004
by+0.004
bo+0.004
ce+0.003
en+0.003
Neg logit contributions
 plunge-0.004
 umbrella-0.004
anger-0.004
aroo-0.004
adders-0.004
Token.
Token ID13
Feature activation+6.36e-03

Pos logit contributions
die+0.094
bo+0.088
idi+0.084
by+0.083
ce+0.082
Neg logit contributions
 plunge-0.175
anger-0.168
adders-0.162
 umbrella-0.160
aroo-0.154
Token
Token ID198
Feature activation+7.63e-03

Pos logit contributions
by+0.009
bo+0.009
die+0.009
ce+0.009
y+0.008
Neg logit contributions
 plunge-0.012
anger-0.012
 hi-0.011
aroo-0.011
adders-0.011
Token
Token ID198
Feature activation+1.11e-02

Pos logit contributions
die+0.006
by+0.006
bo+0.006
ce+0.006
 sav+0.006
Neg logit contributions
 plunge-0.019
anger-0.018
aroo-0.018
 umbrella-0.018
adders-0.018
TokenSh
Token ID2484
Feature activation+6.80e-03

Pos logit contributions
die+0.015
by+0.015
ce+0.015
bo+0.015
 sav+0.015
Neg logit contributions
 plunge-0.022
anger-0.021
aroo-0.020
 umbrella-0.020
adders-0.020
Tokenaron
Token ID8045
Feature activation+2.26e-01

Pos logit contributions
die+0.010
by+0.010
 sav+0.009
bo+0.009
idi+0.009
Neg logit contributions
anger-0.013
aroo-0.012
 plunge-0.012
adders-0.012
ittens-0.012
Token smiled
Token ID13541
Feature activation+6.06e-02

Pos logit contributions
 late+0.286
 sav+0.273
die+0.264
aron+0.257
by+0.245
Neg logit contributions
aroo-0.463
 plunge-0.453
aro-0.431
ishly-0.426
 umbrella-0.418
Token and
Token ID290
Feature activation+3.88e-03

Pos logit contributions
bo+0.087
ce+0.086
 sav+0.085
idi+0.084
die+0.082
Neg logit contributions
 plunge-0.105
 umbrella-0.103
 hi-0.102
anger-0.100
ishly-0.100
Token took
Token ID1718
Feature activation+1.02e-02

Pos logit contributions
die+0.005
bo+0.005
by+0.005
ce+0.005
aron+0.005
Neg logit contributions
 plunge-0.008
aroo-0.007
 umbrella-0.007
anger-0.007
aro-0.007
Token a
Token ID257
Feature activation-8.74e-03

Pos logit contributions
die+0.016
aron+0.014
 sav+0.014
bo+0.014
y+0.014
Neg logit contributions
 plunge-0.019
anger-0.017
aroo-0.017
 umbrella-0.016
 hi-0.016
Token big
Token ID1263
Feature activation+4.60e-02

Pos logit contributions
 plunge+0.017
anger+0.016
aroo+0.016
 umbrella+0.016
adders+0.015
Neg logit contributions
die-0.012
ce-0.012
bo-0.012
by-0.012
 sav-0.012
Token a
Token ID257
Feature activation-1.60e-02

Pos logit contributions
 plunge+0.016
anger+0.015
aroo+0.014
 umbrella+0.014
adders+0.014
Neg logit contributions
die-0.014
bo-0.014
aron-0.014
ce-0.014
by-0.014
Token delicious
Token ID12625
Feature activation-9.52e-03

Pos logit contributions
 plunge+0.030
 umbrella+0.028
aroo+0.028
anger+0.027
 hi+0.026
Neg logit contributions
bo-0.024
en-0.023
aron-0.023
die-0.023
ce-0.022
Token bowl
Token ID9396
Feature activation-4.84e-03

Pos logit contributions
 plunge+0.015
 umbrella+0.014
aroo+0.013
anger+0.013
 leap+0.012
Neg logit contributions
bo-0.017
die-0.017
aron-0.016
 late-0.016
Grand-0.016
Token of
Token ID286
Feature activation-1.98e-02

Pos logit contributions
 plunge+0.008
anger+0.007
aroo+0.007
 umbrella+0.007
adders+0.007
Neg logit contributions
die-0.008
by-0.008
idi-0.008
ce-0.008
bo-0.008
Token mac
Token ID8352
Feature activation+1.17e-01

Pos logit contributions
 plunge+0.036
 umbrella+0.033
aroo+0.031
 hi+0.030
adders+0.029
Neg logit contributions
bo-0.033
ce-0.033
die-0.032
by-0.032
aron-0.031
Tokenaron
Token ID8045
Feature activation+2.26e-01

Pos logit contributions
 late+0.120
die+0.113
idi+0.110
by+0.109
Kat+0.101
Neg logit contributions
anger-0.264
 plunge-0.237
ishly-0.236
ittens-0.234
 umbrella-0.233
Tokeni
Token ID72
Feature activation+5.28e-02

Pos logit contributions
 late+0.364
die+0.350
aron+0.345
 sav+0.338
Grand+0.336
Neg logit contributions
 plunge-0.383
aroo-0.369
 umbrella-0.340
aro-0.340
ishly-0.338
Token and
Token ID290
Feature activation-9.39e-03

Pos logit contributions
die+0.064
idi+0.064
 late+0.062
Grand+0.060
bo+0.058
Neg logit contributions
 plunge-0.104
anger-0.102
aroo-0.101
ittens-0.098
 umbrella-0.098
Token cheese
Token ID9891
Feature activation+4.19e-02

Pos logit contributions
 plunge+0.017
aroo+0.016
 umbrella+0.016
 hi+0.015
adders+0.015
Neg logit contributions
by-0.013
en-0.013
ce-0.013
aron-0.013
bo-0.013
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
die+0.052
idi+0.050
by+0.049
bo+0.048
ce+0.048
Neg logit contributions
 plunge-0.087
anger-0.086
aroo-0.084
 umbrella-0.083
adders-0.082
Token And
Token ID843
Feature activation-7.06e-02

Pos logit contributions
by+0.017
die+0.017
bo+0.017
ce+0.016
idi+0.016
Neg logit contributions
 plunge-0.019
anger-0.018
aroo-0.018
 umbrella-0.017
 hi-0.017
Token ig
Token ID45329
Feature activation-6.85e-02

Pos logit contributions
 plunge+0.007
 umbrella+0.007
anger+0.006
aroo+0.006
adders+0.006
Neg logit contributions
die-0.008
idi-0.008
aron-0.007
ce-0.007
bo-0.007
Tokenloo
Token ID29680
Feature activation-8.29e-02

Pos logit contributions
 plunge+0.085
anger+0.076
aroo+0.073
 umbrella+0.071
adders+0.069
Neg logit contributions
die-0.144
ce-0.144
bo-0.144
by-0.143
 sav-0.143
Token.
Token ID13
Feature activation+7.14e-03

Pos logit contributions
 plunge+0.190
anger+0.184
 umbrella+0.177
aroo+0.175
adders+0.173
Neg logit contributions
die-0.093
by-0.085
idi-0.083
bo-0.082
ce-0.081
Token
Token ID220
Feature activation-1.80e-02

Pos logit contributions
die+0.010
ce+0.009
bo+0.009
by+0.009
idi+0.009
Neg logit contributions
 plunge-0.014
anger-0.014
adders-0.013
aroo-0.013
 umbrella-0.013
Token
Token ID198
Feature activation+2.64e-03

Pos logit contributions
 plunge+0.047
anger+0.044
aroo+0.043
 umbrella+0.043
adders+0.043
Neg logit contributions
die-0.014
ce-0.013
by-0.013
bo-0.013
 sav-0.013
TokenEr
Token ID9139
Feature activation+2.26e-01

Pos logit contributions
die+0.003
by+0.003
bo+0.003
en+0.003
 sav+0.003
Neg logit contributions
 plunge-0.005
anger-0.005
vere-0.005
aroo-0.005
adders-0.005
Tokenin
Token ID259
Feature activation+1.57e-02

Pos logit contributions
 sav+0.338
die+0.333
 late+0.327
idi+0.305
aron+0.303
Neg logit contributions
aroo-0.384
 plunge-0.358
aro-0.341
anger-0.340
arest-0.325
Token felt
Token ID2936
Feature activation+4.42e-04

Pos logit contributions
 late+0.018
die+0.018
aron+0.017
 sav+0.016
bo+0.016
Neg logit contributions
aroo-0.032
 plunge-0.032
 umbrella-0.031
anger-0.030
aro-0.030
Token so
Token ID523
Feature activation-2.64e-02

Pos logit contributions
die+0.001
by+0.001
ce+0.001
bo+0.001
aron+0.001
Neg logit contributions
 plunge-0.001
aroo-0.001
 umbrella-0.001
 hi-0.001
adders-0.001
Token happy
Token ID3772
Feature activation-2.35e-02

Pos logit contributions
 plunge+0.055
anger+0.051
aroo+0.050
 umbrella+0.049
adders+0.049
Neg logit contributions
die-0.035
ce-0.034
by-0.034
bo-0.034
 sav-0.033
Token and
Token ID290
Feature activation-3.95e-03

Pos logit contributions
 plunge+0.042
anger+0.040
aroo+0.038
 umbrella+0.038
 hi+0.036
Neg logit contributions
bo-0.038
die-0.038
ce-0.036
idi-0.036
by-0.036
Token an
Token ID281
Feature activation-5.54e-02

Pos logit contributions
 plunge+0.050
anger+0.046
aroo+0.045
 umbrella+0.045
adders+0.044
Neg logit contributions
die-0.036
ce-0.036
bo-0.036
by-0.035
aron-0.035
Token ig
Token ID45329
Feature activation-6.86e-02

Pos logit contributions
 plunge+0.121
anger+0.115
aroo+0.111
aro+0.109
vere+0.109
Neg logit contributions
 sav-0.064
die-0.062
aron-0.062
ce-0.062
bo-0.061
Tokenloo
Token ID29680
Feature activation-8.38e-02

Pos logit contributions
 plunge+0.099
 hi+0.084
aroo+0.084
 umbrella+0.083
adders+0.082
Neg logit contributions
ce-0.134
bo-0.133
by-0.128
aron-0.127
 sav-0.127
Token!
Token ID0
Feature activation-3.65e-02

Pos logit contributions
 plunge+0.181
anger+0.172
aroo+0.167
 umbrella+0.166
adders+0.163
Neg logit contributions
die-0.103
by-0.099
bo-0.097
ce-0.097
idi-0.095
Token
Token ID198
Feature activation-1.67e-03

Pos logit contributions
 plunge+0.080
anger+0.075
aroo+0.073
 umbrella+0.072
adders+0.071
Neg logit contributions
die-0.043
ce-0.043
bo-0.042
 sav-0.042
by-0.042
TokenEr
Token ID9139
Feature activation+2.24e-01

Pos logit contributions
 plunge+0.004
anger+0.004
aroo+0.004
 umbrella+0.004
adders+0.004
Neg logit contributions
die-0.001
by-0.001
ce-0.001
bo-0.001
 sav-0.001
Tokenin
Token ID259
Feature activation+1.21e-02

Pos logit contributions
die+0.313
 sav+0.312
 late+0.303
aron+0.288
idi+0.280
Neg logit contributions
aroo-0.415
 plunge-0.399
anger-0.376
aro-0.375
arest-0.353
Token had
Token ID550
Feature activation+4.25e-02

Pos logit contributions
 late+0.013
aron+0.013
die+0.013
 sav+0.013
bo+0.012
Neg logit contributions
aroo-0.025
 plunge-0.025
 umbrella-0.024
anger-0.023
aro-0.023
Token never
Token ID1239
Feature activation-3.89e-02

Pos logit contributions
die+0.056
ce+0.055
by+0.055
aron+0.054
bo+0.053
Neg logit contributions
 plunge-0.087
 umbrella-0.080
anger-0.080
adders-0.079
aroo-0.078
Token seen
Token ID1775
Feature activation-8.19e-02

Pos logit contributions
 plunge+0.061
 umbrella+0.055
adders+0.052
 hi+0.050
 footing+0.049
Neg logit contributions
ce-0.070
aron-0.069
die-0.069
bo-0.067
idi-0.065
Token an
Token ID281
Feature activation-5.54e-02

Pos logit contributions
 plunge+0.141
anger+0.128
 umbrella+0.127
aroo+0.122
adders+0.119
Neg logit contributions
die-0.140
by-0.138
ce-0.138
bo-0.136
aron-0.134
Token
Token ID198
Feature activation+2.61e-02

Pos logit contributions
die+0.026
bo+0.026
by+0.026
 sav+0.025
aron+0.025
Neg logit contributions
 plunge-0.052
anger-0.049
aroo-0.049
vere-0.047
adders-0.047
TokenSoon
Token ID28093
Feature activation+1.57e-03

Pos logit contributions
by+0.036
bo+0.035
die+0.035
en+0.034
aron+0.033
Neg logit contributions
 plunge-0.048
vere-0.046
aroo-0.045
fs-0.044
anger-0.044
Token,
Token ID11
Feature activation+5.77e-03

Pos logit contributions
die+0.003
bo+0.003
aron+0.003
 sav+0.003
ce+0.003
Neg logit contributions
 plunge-0.003
anger-0.002
aroo-0.002
 hi-0.002
 umbrella-0.002
Token the
Token ID262
Feature activation+1.12e-02

Pos logit contributions
bo+0.007
die+0.007
by+0.007
ce+0.006
aron+0.006
Neg logit contributions
 plunge-0.013
anger-0.012
 umbrella-0.012
aroo-0.011
 hi-0.011
Token mac
Token ID8352
Feature activation+1.15e-01

Pos logit contributions
aron+0.018
die+0.018
en+0.018
ce+0.018
bo+0.018
Neg logit contributions
 plunge-0.020
 umbrella-0.019
anger-0.018
aroo-0.017
 hi-0.017
Tokenaron
Token ID8045
Feature activation+2.22e-01

Pos logit contributions
 late+0.110
die+0.106
by+0.096
idi+0.093
Kat+0.088
Neg logit contributions
anger-0.270
aroo-0.250
 plunge-0.242
ittens-0.242
adders-0.241
Tokeni
Token ID72
Feature activation+5.35e-02

Pos logit contributions
 late+0.332
 sav+0.327
die+0.316
aron+0.313
Grand+0.297
Neg logit contributions
aroo-0.401
 plunge-0.395
aro-0.365
 umbrella-0.362
anger-0.358
Token and
Token ID290
Feature activation+1.03e-02

Pos logit contributions
die+0.069
 late+0.065
Grand+0.063
idi+0.063
 sav+0.062
Neg logit contributions
aroo-0.103
 plunge-0.099
anger-0.099
adders-0.096
aro-0.094
Token cheese
Token ID9891
Feature activation+3.76e-02

Pos logit contributions
aron+0.014
athan+0.014
ce+0.014
en+0.013
by+0.013
Neg logit contributions
 plunge-0.019
aroo-0.018
 umbrella-0.018
 hi-0.017
aro-0.017
Token was
Token ID373
Feature activation+2.74e-02

Pos logit contributions
die+0.041
ce+0.039
by+0.039
bo+0.039
idi+0.038
Neg logit contributions
 plunge-0.085
anger-0.081
aroo-0.080
 umbrella-0.079
adders-0.078
Token ready
Token ID3492
Feature activation-8.23e-03

Pos logit contributions
by+0.045
any+0.045
very+0.044
ce+0.044
Em+0.044
Neg logit contributions
 plunge-0.036
adders-0.035
 ours-0.035
 umbrella-0.035
 please-0.034
Token perfect
Token ID2818
Feature activation-6.05e-03

Pos logit contributions
ce+0.020
bo+0.019
die+0.019
aron+0.019
by+0.019
Neg logit contributions
 plunge-0.018
anger-0.015
 umbrella-0.015
aroo-0.015
vere-0.014
Token battery
Token ID6555
Feature activation+3.11e-02

Pos logit contributions
 plunge+0.011
anger+0.010
 umbrella+0.010
aroo+0.010
vere+0.009
Neg logit contributions
bo-0.010
die-0.010
ce-0.010
aron-0.009
 sav-0.009
Token!
Token ID0
Feature activation-4.11e-02

Pos logit contributions
die+0.039
ce+0.038
bo+0.037
by+0.037
 sav+0.037
Neg logit contributions
 plunge-0.066
anger-0.062
aroo-0.061
 umbrella-0.060
adders-0.059
Token
Token ID198
Feature activation+1.00e-02

Pos logit contributions
 plunge+0.086
anger+0.083
aroo+0.082
aro+0.077
 hi+0.077
Neg logit contributions
by-0.049
die-0.049
ce-0.046
 sav-0.046
idi-0.045
Token
Token ID198
Feature activation+1.48e-02

Pos logit contributions
die+0.010
by+0.010
 sav+0.010
bo+0.010
aron+0.010
Neg logit contributions
 plunge-0.023
aroo-0.022
anger-0.021
vere-0.021
aro-0.021
TokenEr
Token ID9139
Feature activation+2.20e-01

Pos logit contributions
by+0.021
die+0.021
 sav+0.019
bo+0.019
aron+0.019
Neg logit contributions
 plunge-0.029
aroo-0.027
vere-0.026
anger-0.026
aro-0.026
Tokennie
Token ID11952
Feature activation+1.02e-01

Pos logit contributions
 sav+0.229
 late+0.194
 gre+0.172
idi+0.167
die+0.167
Neg logit contributions
aroo-0.503
 plunge-0.471
anger-0.449
aro-0.448
arest-0.433
Token thanked
Token ID26280
Feature activation+6.10e-02

Pos logit contributions
die+0.118
 late+0.112
by+0.103
aron+0.102
bo+0.101
Neg logit contributions
 plunge-0.217
aroo-0.215
anger-0.200
ishly-0.200
 umbrella-0.199
Token the
Token ID262
Feature activation+1.32e-02

Pos logit contributions
die+0.079
bo+0.075
aron+0.073
by+0.072
ce+0.071
Neg logit contributions
 plunge-0.130
aroo-0.121
anger-0.119
 umbrella-0.116
aro-0.115
Token creature
Token ID7185
Feature activation-4.38e-02

Pos logit contributions
die+0.021
ce+0.021
bo+0.021
by+0.021
 sav+0.021
Neg logit contributions
 plunge-0.024
anger-0.022
aroo-0.021
 umbrella-0.021
adders-0.021
Token and
Token ID290
Feature activation+6.12e-03

Pos logit contributions
 plunge+0.093
anger+0.087
aroo+0.086
 umbrella+0.085
adders+0.082
Neg logit contributions
bo-0.053
die-0.052
ce-0.052
 sav-0.052
idi-0.050
Token discovered
Token ID5071
Feature activation-2.91e-02

Pos logit contributions
die+0.020
ce+0.020
by+0.020
bo+0.020
 sav+0.020
Neg logit contributions
 plunge-0.057
anger-0.054
aroo-0.053
 umbrella-0.052
adders-0.052
Token something
Token ID1223
Feature activation+4.87e-02

Pos logit contributions
 plunge+0.058
anger+0.055
aroo+0.054
 umbrella+0.052
adders+0.052
Neg logit contributions
die-0.038
by-0.038
ce-0.038
bo-0.038
 sav-0.038
Token!
Token ID0
Feature activation-2.90e-02

Pos logit contributions
die+0.073
aron+0.073
bo+0.072
ce+0.072
 sav+0.070
Neg logit contributions
 plunge-0.094
 umbrella-0.087
anger-0.086
aroo-0.084
vere-0.081
Token
Token ID198
Feature activation+7.38e-03

Pos logit contributions
 plunge+0.055
anger+0.050
 hi+0.047
aroo+0.047
adders+0.047
Neg logit contributions
ce-0.044
die-0.043
aron-0.043
bo-0.043
by-0.042
Token
Token ID198
Feature activation+1.06e-02

Pos logit contributions
die+0.009
aron+0.009
bo+0.008
by+0.008
 sav+0.008
Neg logit contributions
 plunge-0.016
anger-0.014
vere-0.014
aroo-0.014
adders-0.014
TokenCol
Token ID5216
Feature activation+2.20e-01

Pos logit contributions
aron+0.015
die+0.015
bo+0.014
by+0.014
 sav+0.014
Neg logit contributions
 plunge-0.021
anger-0.018
aroo-0.018
vere-0.018
adders-0.018
Tokenin
Token ID259
Feature activation+1.93e-02

Pos logit contributions
 late+0.379
 sav+0.367
die+0.359
aron+0.355
Grand+0.337
Neg logit contributions
 plunge-0.335
anger-0.326
aroo-0.323
aro-0.292
arest-0.292
Token found
Token ID1043
Feature activation-3.90e-02

Pos logit contributions
aron+0.023
 late+0.022
 sav+0.021
die+0.021
bo+0.021
Neg logit contributions
aroo-0.038
 plunge-0.037
 umbrella-0.036
anger-0.036
aro-0.034
Token a
Token ID257
Feature activation-1.46e-02

Pos logit contributions
 plunge+0.083
anger+0.078
aroo+0.076
 umbrella+0.075
adders+0.074
Neg logit contributions
die-0.048
ce-0.048
by-0.048
bo-0.047
 sav-0.047
Token giant
Token ID6175
Feature activation-5.25e-03

Pos logit contributions
 plunge+0.027
anger+0.025
 umbrella+0.024
aroo+0.024
 hi+0.023
Neg logit contributions
bo-0.022
ce-0.022
aron-0.022
die-0.021
 sav-0.021
Token hole
Token ID7604
Feature activation-2.32e-02

Pos logit contributions
 plunge+0.008
aroo+0.008
anger+0.008
 umbrella+0.008
 hi+0.007
Neg logit contributions
die-0.009
bo-0.009
by-0.009
aron-0.009
ce-0.009
Token Mom
Token ID11254
Feature activation+4.98e-02

Pos logit contributions
bo+0.036
aron+0.036
ce+0.034
ny+0.034
die+0.032
Neg logit contributions
 plunge-0.030
 umbrella-0.029
aro-0.028
 hi-0.028
 please-0.027
Token and
Token ID290
Feature activation+1.08e-02

Pos logit contributions
idi+0.067
die+0.066
ce+0.065
 sav+0.064
Grand+0.061
Neg logit contributions
 umbrella-0.093
 plunge-0.087
ishly-0.086
adders-0.084
 yes-0.084
Token they
Token ID484
Feature activation+2.99e-02

Pos logit contributions
ce+0.016
bo+0.016
aron+0.016
by+0.015
die+0.015
Neg logit contributions
 plunge-0.019
 umbrella-0.019
aroo-0.018
 hi-0.018
ishly-0.017
Token loved
Token ID6151
Feature activation+1.76e-02

Pos logit contributions
ce+0.033
die+0.032
by+0.031
idi+0.030
bo+0.030
Neg logit contributions
 plunge-0.068
aroo-0.063
 umbrella-0.063
anger-0.062
adders-0.061
Token mac
Token ID8352
Feature activation+1.16e-01

Pos logit contributions
bo+0.031
aron+0.030
die+0.029
ce+0.029
idi+0.028
Neg logit contributions
 plunge-0.027
 umbrella-0.026
 hi-0.025
aro-0.024
 please-0.024
Tokenaron
Token ID8045
Feature activation+2.19e-01

Pos logit contributions
idi+0.125
die+0.110
 late+0.103
by+0.101
ce+0.099
Neg logit contributions
anger-0.246
adders-0.233
ishly-0.230
 umbrella-0.229
ittens-0.226
Tokeni
Token ID72
Feature activation+4.92e-02

Pos logit contributions
 sav+0.330
die+0.329
 late+0.326
idi+0.325
aron+0.311
Neg logit contributions
 plunge-0.395
aroo-0.375
aro-0.363
 umbrella-0.357
ishly-0.352
Token and
Token ID290
Feature activation+7.60e-03

Pos logit contributions
idi+0.058
 sav+0.054
die+0.054
ce+0.051
Grand+0.051
Neg logit contributions
 plunge-0.105
aroo-0.102
 umbrella-0.100
anger-0.099
adders-0.099
Token cheese
Token ID9891
Feature activation+3.65e-02

Pos logit contributions
ce+0.011
aron+0.011
by+0.011
en+0.011
athan+0.011
Neg logit contributions
 plunge-0.013
aroo-0.013
 umbrella-0.013
 hi-0.012
aro-0.012
Token.
Token ID13
Feature activation+1.22e-02

Pos logit contributions
idi+0.042
die+0.042
ce+0.040
bo+0.040
by+0.040
Neg logit contributions
 plunge-0.077
anger-0.075
aroo-0.075
 umbrella-0.074
adders-0.074
Token
Token ID198
Feature activation+1.28e-02

Pos logit contributions
ce+0.020
by+0.019
bo+0.019
en+0.018
nie+0.017
Neg logit contributions
 umbrella-0.020
 hi-0.019
ittens-0.019
 plunge-0.019
adders-0.019
Token about
Token ID546
Feature activation+6.08e-02

Pos logit contributions
die+0.028
ce+0.028
by+0.028
bo+0.027
 sav+0.027
Neg logit contributions
 plunge-0.047
anger-0.044
aroo-0.043
 umbrella-0.042
adders-0.042
Token we
Token ID356
Feature activation-2.20e-02

Pos logit contributions
die+0.103
by+0.103
aron+0.102
bo+0.101
ce+0.100
Neg logit contributions
 plunge-0.101
anger-0.091
aroo-0.091
 umbrella-0.089
adders-0.087
Token make
Token ID787
Feature activation+2.54e-02

Pos logit contributions
 plunge+0.040
anger+0.037
aroo+0.036
 umbrella+0.035
adders+0.035
Neg logit contributions
die-0.034
ce-0.034
by-0.034
bo-0.034
 sav-0.034
Token some
Token ID617
Feature activation-2.35e-02

Pos logit contributions
die+0.037
aron+0.037
by+0.037
idi+0.037
bo+0.037
Neg logit contributions
 plunge-0.048
anger-0.043
 umbrella-0.042
aroo-0.042
adders-0.041
Token mac
Token ID8352
Feature activation+1.16e-01

Pos logit contributions
 plunge+0.045
 umbrella+0.041
anger+0.040
aroo+0.039
adders+0.039
Neg logit contributions
die-0.035
bo-0.034
idi-0.033
aron-0.033
by-0.033
Tokenaron
Token ID8045
Feature activation+2.19e-01

Pos logit contributions
idi+0.126
by+0.117
die+0.115
Kat+0.106
 late+0.105
Neg logit contributions
anger-0.246
 plunge-0.224
ittens-0.222
adders-0.216
mm-0.214
Tokeni
Token ID72
Feature activation+4.78e-02

Pos logit contributions
die+0.309
 sav+0.305
idi+0.303
aron+0.297
ce+0.291
Neg logit contributions
 plunge-0.422
aroo-0.393
anger-0.386
 umbrella-0.378
aro-0.373
Token and
Token ID290
Feature activation+6.09e-04

Pos logit contributions
idi+0.066
die+0.066
 sav+0.065
by+0.064
ce+0.061
Neg logit contributions
 plunge-0.091
anger-0.085
aroo-0.085
 umbrella-0.083
adders-0.081
Token cheese
Token ID9891
Feature activation+3.79e-02

Pos logit contributions
aron+0.001
by+0.001
bo+0.001
ce+0.001
die+0.001
Neg logit contributions
 plunge-0.001
aroo-0.001
 umbrella-0.001
 hi-0.001
aro-0.001
Token for
Token ID329
Feature activation-4.65e-02

Pos logit contributions
die+0.055
by+0.054
idi+0.053
bo+0.052
ce+0.052
Neg logit contributions
 plunge-0.070
anger-0.068
aroo-0.065
 umbrella-0.065
adders-0.063
Token dinner
Token ID8073
Feature activation+5.13e-04

Pos logit contributions
 plunge+0.082
anger+0.075
aroo+0.074
 umbrella+0.073
adders+0.072
Neg logit contributions
die-0.074
bo-0.073
by-0.073
ce-0.073
 sav-0.072
Token my
Token ID616
Feature activation-5.12e-02

Pos logit contributions
ce+0.082
aron+0.078
idi+0.078
by+0.076
y+0.075
Neg logit contributions
 plunge-0.073
aroo-0.068
vere-0.067
 umbrella-0.067
 hi-0.066
Token friend
Token ID1545
Feature activation-9.08e-02

Pos logit contributions
 plunge+0.086
anger+0.077
aroo+0.077
 umbrella+0.077
adders+0.071
Neg logit contributions
ce-0.087
bo-0.086
idi-0.085
die-0.085
aron-0.084
Token?"
Token ID1701
Feature activation+2.60e-02

Pos logit contributions
 plunge+0.191
aroo+0.181
anger+0.178
 umbrella+0.176
 hi+0.171
Neg logit contributions
idi-0.106
 sav-0.105
die-0.105
ce-0.105
by-0.103
Token
Token ID198
Feature activation+1.83e-02

Pos logit contributions
die+0.031
by+0.031
aron+0.030
ce+0.030
bo+0.029
Neg logit contributions
aroo-0.051
 plunge-0.051
 hi-0.049
aro-0.048
ishly-0.047
Token
Token ID198
Feature activation+2.00e-02

Pos logit contributions
by+0.019
 sav+0.019
en+0.019
die+0.019
aron+0.019
Neg logit contributions
aroo-0.038
 plunge-0.038
anger-0.037
aro-0.036
ittens-0.036
TokenCol
Token ID5216
Feature activation+2.19e-01

Pos logit contributions
by+0.024
die+0.023
 sav+0.023
aron+0.023
bo+0.022
Neg logit contributions
 plunge-0.041
aroo-0.040
anger-0.038
vere-0.038
aro-0.037
Tokenin
Token ID259
Feature activation+1.80e-02

Pos logit contributions
 late+0.389
 sav+0.381
die+0.367
Grand+0.354
 gre+0.348
Neg logit contributions
aroo-0.319
anger-0.318
 plunge-0.311
aro-0.284
ute-0.261
Token smiled
Token ID13541
Feature activation+5.86e-02

Pos logit contributions
 late+0.028
 sav+0.025
 gre+0.025
die+0.025
 birth+0.025
Neg logit contributions
aroo-0.029
ishly-0.027
 plunge-0.027
anger-0.026
 umbrella-0.025
Token and
Token ID290
Feature activation+5.63e-03

Pos logit contributions
ce+0.090
idi+0.084
 sav+0.080
bo+0.080
by+0.078
Neg logit contributions
 hi-0.101
 umbrella-0.096
ishly-0.096
 yes-0.096
 please-0.093
Token said
Token ID531
Feature activation-3.44e-03

Pos logit contributions
ce+0.008
die+0.008
by+0.008
aron+0.008
bo+0.008
Neg logit contributions
 plunge-0.010
aroo-0.010
 hi-0.009
 umbrella-0.009
anger-0.009
Token,
Token ID11
Feature activation+2.74e-03

Pos logit contributions
 plunge+0.006
aroo+0.006
anger+0.006
 hi+0.006
aro+0.005
Neg logit contributions
die-0.006
 sav-0.005
bo-0.005
ce-0.005
aron-0.005
Token Then
Token ID3244
Feature activation+3.59e-02

Pos logit contributions
by+0.031
ce+0.030
bo+0.029
die+0.029
aron+0.028
Neg logit contributions
 plunge-0.051
aroo-0.048
anger-0.048
 umbrella-0.046
adders-0.046
Token they
Token ID484
Feature activation+3.06e-02

Pos logit contributions
aron+0.058
die+0.058
ce+0.057
bo+0.056
by+0.056
Neg logit contributions
 plunge-0.063
aroo-0.057
 umbrella-0.056
anger-0.056
 hi-0.054
Token placed
Token ID4624
Feature activation+5.71e-02

Pos logit contributions
die+0.037
ce+0.036
aron+0.036
bo+0.035
by+0.034
Neg logit contributions
 plunge-0.065
 umbrella-0.061
adders-0.059
anger-0.059
aroo-0.059
Token some
Token ID617
Feature activation-1.91e-02

Pos logit contributions
ce+0.086
die+0.083
aron+0.083
idi+0.082
bo+0.080
Neg logit contributions
 plunge-0.108
 umbrella-0.101
aroo-0.100
aro-0.096
anger-0.095
Token rainbow
Token ID27223
Feature activation+4.78e-02

Pos logit contributions
 plunge+0.034
 umbrella+0.033
aroo+0.031
anger+0.030
adders+0.029
Neg logit contributions
die-0.032
ce-0.030
idi-0.030
aron-0.030
bo-0.030
Token sprink
Token ID27562
Feature activation+2.17e-01

Pos logit contributions
die+0.069
 sav+0.065
idi+0.065
aron+0.063
ce+0.061
Neg logit contributions
 plunge-0.096
anger-0.094
 umbrella-0.092
aroo-0.089
adders-0.086
Tokenles
Token ID829
Feature activation+5.61e-02

Pos logit contributions
die+0.348
idi+0.339
 sav+0.324
Grand+0.314
athan+0.304
Neg logit contributions
 plunge-0.386
anger-0.374
fs-0.354
 umbrella-0.349
aroo-0.347
Token on
Token ID319
Feature activation-1.89e-02

Pos logit contributions
die+0.077
idi+0.074
 sav+0.071
by+0.071
ce+0.071
Neg logit contributions
 plunge-0.115
anger-0.109
aroo-0.107
 umbrella-0.107
adders-0.104
Token top
Token ID1353
Feature activation-1.18e-02

Pos logit contributions
 plunge+0.032
 umbrella+0.029
aroo+0.029
anger+0.029
aro+0.027
Neg logit contributions
die-0.031
by-0.029
bo-0.029
ce-0.028
aron-0.028
Token.
Token ID13
Feature activation+2.76e-02

Pos logit contributions
anger+0.022
 plunge+0.022
 umbrella+0.021
adders+0.020
aroo+0.020
Neg logit contributions
die-0.018
idi-0.017
by-0.016
 sav-0.016
bo-0.016
Token It
Token ID632
Feature activation-7.55e-02

Pos logit contributions
by+0.037
ce+0.035
bo+0.035
die+0.035
aron+0.033
Neg logit contributions
 plunge-0.055
anger-0.053
aroo-0.052
adders-0.050
 umbrella-0.050
Token.
Token ID13
Feature activation+3.14e-02

Pos logit contributions
die+0.107
by+0.100
bo+0.099
idi+0.098
ce+0.095
Neg logit contributions
anger-0.171
 plunge-0.167
 umbrella-0.164
aroo-0.159
adders-0.157
Token It
Token ID632
Feature activation-7.57e-02

Pos logit contributions
die+0.035
ce+0.035
bo+0.035
 sav+0.035
by+0.034
Neg logit contributions
 plunge-0.070
anger-0.066
aroo-0.065
 umbrella-0.063
adders-0.063
Token has
Token ID468
Feature activation+7.76e-02

Pos logit contributions
 plunge+0.107
 umbrella+0.102
anger+0.101
adders+0.097
aroo+0.093
Neg logit contributions
die-0.138
by-0.135
aron-0.133
idi-0.128
athan-0.128
Token icing
Token ID41567
Feature activation+1.16e-01

Pos logit contributions
die+0.142
by+0.134
ce+0.133
bo+0.132
aron+0.127
Neg logit contributions
 plunge-0.121
adders-0.119
 umbrella-0.118
anger-0.108
 hi-0.103
Token and
Token ID290
Feature activation+1.82e-02

Pos logit contributions
die+0.181
idi+0.172
by+0.165
 sav+0.152
bo+0.148
Neg logit contributions
anger-0.215
 umbrella-0.201
 plunge-0.200
adders-0.196
ittens-0.186
Token sprink
Token ID27562
Feature activation+2.17e-01

Pos logit contributions
die+0.028
by+0.028
bo+0.027
athan+0.026
idi+0.026
Neg logit contributions
 plunge-0.031
 umbrella-0.030
adders-0.028
 hi-0.027
aroo-0.027
Tokenles
Token ID829
Feature activation+6.36e-02

Pos logit contributions
die+0.361
idi+0.350
 sav+0.312
Kat+0.304
Grand+0.302
Neg logit contributions
anger-0.368
 plunge-0.367
fs-0.347
 umbrella-0.339
ittens-0.337
Token.
Token ID13
Feature activation+3.51e-02

Pos logit contributions
die+0.095
idi+0.091
by+0.088
 sav+0.081
bo+0.080
Neg logit contributions
anger-0.120
 plunge-0.120
adders-0.120
 umbrella-0.117
aroo-0.113
Token It
Token ID632
Feature activation-7.28e-02

Pos logit contributions
by+0.043
die+0.043
bo+0.042
ce+0.042
 sav+0.041
Neg logit contributions
 plunge-0.075
anger-0.071
aroo-0.069
adders-0.069
 umbrella-0.068
Token smells
Token ID25760
Feature activation+1.39e-01

Pos logit contributions
 plunge+0.101
 umbrella+0.097
anger+0.093
adders+0.092
aroo+0.086
Neg logit contributions
die-0.132
by-0.129
aron-0.124
athan-0.123
 late-0.123
Token y
Token ID331
Feature activation-1.89e-03

Pos logit contributions
die+0.288
idi+0.257
by+0.257
any+0.254
bo+0.252
Neg logit contributions
 plunge-0.208
 hi-0.190
 umbrella-0.189
 please-0.180
 aren-0.178
Token'm
Token ID1101
Feature activation-6.48e-03

Pos logit contributions
die+0.056
aron+0.055
idi+0.055
ce+0.054
bo+0.054
Neg logit contributions
 plunge-0.040
anger-0.035
aroo-0.034
 umbrella-0.034
adders-0.033
Token making
Token ID1642
Feature activation+7.45e-02

Pos logit contributions
 plunge+0.013
aroo+0.012
 umbrella+0.012
adders+0.012
anger+0.011
Neg logit contributions
by-0.010
die-0.009
bo-0.009
aron-0.009
ce-0.009
Token a
Token ID257
Feature activation-2.10e-02

Pos logit contributions
by+0.112
die+0.112
bo+0.111
aron+0.111
idi+0.111
Neg logit contributions
 plunge-0.141
 umbrella-0.126
anger-0.121
aroo-0.118
adders-0.118
Token delicious
Token ID12625
Feature activation-1.22e-02

Pos logit contributions
 plunge+0.039
 umbrella+0.035
anger+0.035
aroo+0.033
vere+0.032
Neg logit contributions
bo-0.033
aron-0.032
die-0.032
ce-0.031
idi-0.031
Token mac
Token ID8352
Feature activation+1.09e-01

Pos logit contributions
 plunge+0.022
 umbrella+0.020
anger+0.019
aroo+0.018
ittens+0.018
Neg logit contributions
bo-0.020
die-0.020
idi-0.019
aron-0.019
by-0.018
Tokenaron
Token ID8045
Feature activation+2.17e-01

Pos logit contributions
idi+0.114
die+0.110
by+0.103
 late+0.102
 sav+0.093
Neg logit contributions
anger-0.239
 plunge-0.230
aroo-0.222
ittens-0.220
 umbrella-0.212
Tokeni
Token ID72
Feature activation+4.32e-02

Pos logit contributions
die+0.305
 sav+0.299
idi+0.298
aron+0.290
 late+0.285
Neg logit contributions
 plunge-0.422
aroo-0.397
anger-0.383
 umbrella-0.377
aro-0.373
Token and
Token ID290
Feature activation-1.23e-02

Pos logit contributions
idi+0.079
die+0.076
 sav+0.075
by+0.072
ce+0.072
Neg logit contributions
 plunge-0.066
aroo-0.061
anger-0.059
 umbrella-0.058
ittens-0.057
Token cheese
Token ID9891
Feature activation+3.60e-02

Pos logit contributions
 plunge+0.023
aroo+0.021
 umbrella+0.020
 hi+0.020
ittens+0.019
Neg logit contributions
by-0.018
bo-0.017
aron-0.017
ce-0.017
y-0.017
Token,"
Token ID553
Feature activation-1.49e-02

Pos logit contributions
idi+0.067
die+0.066
by+0.065
bo+0.064
ce+0.064
Neg logit contributions
 plunge-0.053
anger-0.049
aroo-0.048
 umbrella-0.048
adders-0.046
Token M
Token ID337
Feature activation-9.52e-03

Pos logit contributions
 plunge+0.028
aroo+0.026
 umbrella+0.025
aro+0.024
 hi+0.024
Neg logit contributions
die-0.023
aron-0.022
by-0.022
bo-0.022
ce-0.021
Token your
Token ID534
Feature activation-1.41e-02

Pos logit contributions
 plunge+0.039
aroo+0.037
 umbrella+0.036
 hi+0.036
vere+0.034
Neg logit contributions
by-0.043
ce-0.042
idi-0.041
 sav-0.040
aron-0.040
Token friend
Token ID1545
Feature activation-9.02e-02

Pos logit contributions
 plunge+0.024
anger+0.022
aroo+0.021
 umbrella+0.021
adders+0.021
Neg logit contributions
ce-0.023
die-0.023
bo-0.023
by-0.023
 sav-0.023
Token!"
Token ID2474
Feature activation+9.06e-04

Pos logit contributions
 plunge+0.185
anger+0.172
aroo+0.172
 umbrella+0.168
adders+0.165
Neg logit contributions
ce-0.113
die-0.113
 sav-0.112
idi-0.112
by-0.111
Token
Token ID198
Feature activation+8.76e-03

Pos logit contributions
die+0.001
ce+0.001
by+0.001
idi+0.001
bo+0.001
Neg logit contributions
anger-0.002
 plunge-0.002
aroo-0.002
ishly-0.002
 hi-0.002
Token
Token ID198
Feature activation+1.23e-02

Pos logit contributions
die+0.010
aron+0.009
en+0.009
by+0.009
 sav+0.009
Neg logit contributions
 plunge-0.019
aroo-0.018
anger-0.018
aro-0.017
vere-0.017
TokenCol
Token ID5216
Feature activation+2.16e-01

Pos logit contributions
die+0.019
aron+0.019
by+0.018
en+0.018
bo+0.018
Neg logit contributions
 plunge-0.021
aroo-0.021
vere-0.019
aro-0.019
anger-0.019
Tokenin
Token ID259
Feature activation+1.55e-02

Pos logit contributions
 late+0.418
 sav+0.391
die+0.386
Grand+0.373
 gre+0.365
Neg logit contributions
aroo-0.292
 plunge-0.277
anger-0.268
aro-0.249
ig-0.233
Token took
Token ID1718
Feature activation+3.15e-03

Pos logit contributions
 late+0.020
die+0.018
 sav+0.017
Grand+0.017
aron+0.017
Neg logit contributions
aroo-0.030
 plunge-0.028
ishly-0.028
 umbrella-0.028
anger-0.027
Token the
Token ID262
Feature activation+5.30e-03

Pos logit contributions
die+0.005
aron+0.005
 sav+0.004
Grand+0.004
ce+0.004
Neg logit contributions
 plunge-0.006
anger-0.006
aroo-0.005
 hi-0.005
aro-0.005
Token hand
Token ID1021
Feature activation-1.05e-01

Pos logit contributions
die+0.009
by+0.008
bo+0.008
ce+0.008
 sav+0.008
Neg logit contributions
 plunge-0.009
anger-0.009
aroo-0.008
adders-0.008
 umbrella-0.008
Token home
Token ID1363
Feature activation-6.94e-02

Pos logit contributions
 plunge+0.235
anger+0.221
aroo+0.218
 umbrella+0.217
adders+0.209
Neg logit contributions
die-0.118
 sav-0.117
ce-0.115
idi-0.114
aron-0.112
Token the
Token ID262
Feature activation+1.20e-03

Pos logit contributions
die+0.029
ce+0.029
bo+0.029
 sav+0.029
by+0.029
Neg logit contributions
 plunge-0.037
anger-0.035
aroo-0.034
adders-0.033
 umbrella-0.033
Token forest
Token ID8222
Feature activation-3.87e-02

Pos logit contributions
die+0.002
ce+0.002
aron+0.002
bo+0.002
by+0.002
Neg logit contributions
 plunge-0.002
anger-0.002
 umbrella-0.002
aroo-0.002
adders-0.002
Token!"
Token ID2474
Feature activation-1.04e-02

Pos logit contributions
 plunge+0.081
anger+0.075
aroo+0.074
 umbrella+0.073
adders+0.072
Neg logit contributions
die-0.048
ce-0.048
bo-0.047
by-0.047
 sav-0.047
Token
Token ID198
Feature activation+2.42e-03

Pos logit contributions
 plunge+0.022
anger+0.021
aroo+0.020
 hi+0.019
vere+0.019
Neg logit contributions
die-0.013
by-0.012
idi-0.012
bo-0.012
ce-0.012
Token
Token ID198
Feature activation+5.79e-03

Pos logit contributions
die+0.002
by+0.002
 sav+0.002
bo+0.002
aron+0.002
Neg logit contributions
 plunge-0.006
aroo-0.005
anger-0.005
vere-0.005
aro-0.005
TokenEr
Token ID9139
Feature activation+2.16e-01

Pos logit contributions
die+0.008
by+0.007
bo+0.007
 sav+0.007
aron+0.007
Neg logit contributions
 plunge-0.012
aroo-0.011
vere-0.011
anger-0.011
aro-0.010
Tokennie
Token ID11952
Feature activation+9.99e-02

Pos logit contributions
 sav+0.224
 late+0.198
 gre+0.170
die+0.169
idi+0.167
Neg logit contributions
aroo-0.498
 plunge-0.462
aro-0.445
anger-0.439
arest-0.424
Token nodded
Token ID14464
Feature activation+4.87e-02

Pos logit contributions
die+0.113
 late+0.111
aron+0.103
ce+0.100
bo+0.100
Neg logit contributions
 plunge-0.213
aroo-0.207
ishly-0.200
 umbrella-0.196
anger-0.194
Token and
Token ID290
Feature activation-5.09e-04

Pos logit contributions
die+0.061
 sav+0.060
bo+0.059
by+0.058
ce+0.058
Neg logit contributions
 plunge-0.099
 umbrella-0.093
 hi-0.093
aroo-0.093
ishly-0.092
Token thanked
Token ID26280
Feature activation+5.52e-02

Pos logit contributions
 plunge+0.001
aroo+0.001
 umbrella+0.001
anger+0.001
aro+0.001
Neg logit contributions
die-0.001
aron-0.001
bo-0.001
ce-0.001
by-0.001
Token the
Token ID262
Feature activation+7.70e-03

Pos logit contributions
die+0.076
bo+0.071
aron+0.070
by+0.068
 sav+0.067
Neg logit contributions
 plunge-0.114
aroo-0.105
anger-0.104
 umbrella-0.101
aro-0.101
Token had
Token ID550
Feature activation+5.35e-02

Pos logit contributions
 plunge+0.156
 umbrella+0.147
aroo+0.145
anger+0.140
adders+0.138
Neg logit contributions
die-0.109
by-0.107
ce-0.107
aron-0.106
Grand-0.100
Token pink
Token ID11398
Feature activation+7.76e-02

Pos logit contributions
die+0.084
ce+0.083
aron+0.080
bo+0.079
by+0.078
Neg logit contributions
 plunge-0.094
 umbrella-0.092
adders-0.090
anger-0.086
 hi-0.080
Token frost
Token ID21682
Feature activation+1.27e-01

Pos logit contributions
die+0.107
idi+0.099
bo+0.098
by+0.097
ce+0.096
Neg logit contributions
 plunge-0.158
anger-0.156
 umbrella-0.156
aroo-0.145
adders-0.141
Tokening
Token ID278
Feature activation+7.83e-02

Pos logit contributions
die+0.209
 sav+0.208
idi+0.204
ce+0.186
aron+0.185
Neg logit contributions
anger-0.223
 plunge-0.220
aroo-0.201
 umbrella-0.199
adders-0.196
Token and
Token ID290
Feature activation+1.96e-02

Pos logit contributions
die+0.118
idi+0.114
by+0.112
ce+0.110
bo+0.108
Neg logit contributions
 plunge-0.139
 umbrella-0.136
anger-0.134
aroo-0.127
ishly-0.124
Token sprink
Token ID27562
Feature activation+2.15e-01

Pos logit contributions
by+0.032
ny+0.030
bo+0.030
die+0.029
ce+0.029
Neg logit contributions
 umbrella-0.029
 plunge-0.029
aroo-0.028
 yes-0.027
mm-0.026
Tokenles
Token ID829
Feature activation+5.56e-02

Pos logit contributions
die+0.367
idi+0.351
 sav+0.321
Grand+0.313
 late+0.313
Neg logit contributions
 plunge-0.361
anger-0.349
fs-0.339
aroo-0.332
ittens-0.328
Token.
Token ID13
Feature activation+3.79e-02

Pos logit contributions
die+0.079
idi+0.074
by+0.070
 sav+0.067
bo+0.067
Neg logit contributions
 plunge-0.112
anger-0.108
aroo-0.106
 umbrella-0.105
adders-0.103
Token It
Token ID632
Feature activation-7.48e-02

Pos logit contributions
by+0.054
ce+0.051
bo+0.051
die+0.050
 sav+0.049
Neg logit contributions
 plunge-0.072
anger-0.069
aroo-0.069
adders-0.068
 umbrella-0.067
Token had
Token ID550
Feature activation+6.77e-02

Pos logit contributions
aroo+0.126
 umbrella+0.124
 plunge+0.120
 yes+0.116
aro+0.116
Neg logit contributions
die-0.103
by-0.102
 late-0.101
athan-0.094
aron-0.091
Token candles
Token ID32268
Feature activation+1.19e-01

Pos logit contributions
die+0.115
ce+0.109
by+0.104
bo+0.102
aron+0.102
Neg logit contributions
 umbrella-0.108
 plunge-0.108
adders-0.103
anger-0.096
 hi-0.092
Tokenomm
Token ID2002
Feature activation+1.14e-01

Pos logit contributions
 plunge+0.018
anger+0.017
aroo+0.016
 umbrella+0.016
adders+0.016
Neg logit contributions
ce-0.014
die-0.014
bo-0.014
by-0.014
 sav-0.014
Tokena
Token ID64
Feature activation+5.74e-02

Pos logit contributions
die+0.120
ce+0.120
by+0.119
bo+0.119
 sav+0.116
Neg logit contributions
 plunge-0.261
anger-0.246
aroo-0.241
 umbrella-0.237
adders-0.235
Token and
Token ID290
Feature activation-5.34e-03

Pos logit contributions
die+0.070
 sav+0.067
bo+0.066
by+0.066
idi+0.066
Neg logit contributions
 plunge-0.122
anger-0.114
 umbrella-0.114
aroo-0.113
ishly-0.110
Token smiled
Token ID13541
Feature activation+4.43e-02

Pos logit contributions
 plunge+0.011
aroo+0.010
anger+0.010
 umbrella+0.009
ishly+0.009
Neg logit contributions
die-0.007
by-0.007
ce-0.007
bo-0.007
aron-0.007
Token sheep
Token ID15900
Feature activation-7.34e-02

Pos logit contributions
ce+0.066
 sav+0.064
idi+0.063
bo+0.063
die+0.061
Neg logit contributions
 hi-0.078
 plunge-0.076
ishly-0.075
 umbrella-0.073
 yes-0.072
Tokenishly
Token ID29735
Feature activation-2.03e-01

Pos logit contributions
 plunge+0.093
anger+0.084
aroo+0.083
 umbrella+0.079
adders+0.076
Neg logit contributions
die-0.153
ce-0.151
 sav-0.150
by-0.150
bo-0.150
Token.
Token ID13
Feature activation+6.52e-03

Pos logit contributions
 plunge+0.448
anger+0.419
 umbrella+0.410
 hi+0.409
aroo+0.404
Neg logit contributions
die-0.244
ce-0.232
bo-0.228
idi-0.227
 sav-0.224
Token M
Token ID337
Feature activation-9.66e-03

Pos logit contributions
by+0.010
bo+0.010
ce+0.009
en+0.009
y+0.009
Neg logit contributions
 plunge-0.011
 hi-0.011
adders-0.011
aroo-0.011
anger-0.011
Tokenomm
Token ID2002
Feature activation+1.13e-01

Pos logit contributions
 plunge+0.019
anger+0.018
aroo+0.018
 umbrella+0.017
adders+0.017
Neg logit contributions
die-0.013
ce-0.013
by-0.013
bo-0.013
 sav-0.013
Tokena
Token ID64
Feature activation+5.77e-02

Pos logit contributions
die+0.114
bo+0.113
ce+0.113
by+0.113
 sav+0.109
Neg logit contributions
 plunge-0.267
anger-0.252
aroo-0.245
 umbrella-0.243
adders-0.241
Token shook
Token ID14682
Feature activation+2.34e-02

Pos logit contributions
die+0.067
by+0.064
 late+0.063
bo+0.061
aron+0.060
Neg logit contributions
 plunge-0.124
aroo-0.120
anger-0.116
 umbrella-0.115
ishly-0.114
Tokenandy
Token ID10757
Feature activation-2.22e-02

Pos logit contributions
 plunge+0.060
anger+0.054
aroo+0.053
adders+0.052
 umbrella+0.051
Neg logit contributions
ce-0.061
die-0.060
bo-0.060
by-0.060
 sav-0.059
Token,
Token ID11
Feature activation-3.32e-02

Pos logit contributions
 plunge+0.045
anger+0.042
aroo+0.042
 umbrella+0.041
adders+0.040
Neg logit contributions
die-0.029
 sav-0.028
by-0.028
idi-0.028
ce-0.028
Token it
Token ID340
Feature activation-3.26e-02

Pos logit contributions
 plunge+0.057
anger+0.053
aroo+0.051
adders+0.050
 umbrella+0.049
Neg logit contributions
die-0.053
ce-0.053
bo-0.053
 sav-0.053
by-0.052
Token isn
Token ID2125
Feature activation-9.46e-03

Pos logit contributions
 plunge+0.073
anger+0.068
aroo+0.067
 umbrella+0.067
adders+0.065
Neg logit contributions
die-0.035
ce-0.035
by-0.035
 sav-0.035
aron-0.034
Token't
Token ID470
Feature activation-6.94e-02

Pos logit contributions
 plunge+0.012
anger+0.011
adders+0.011
aroo+0.010
 umbrella+0.010
Neg logit contributions
bo-0.021
ce-0.019
die-0.019
 late-0.019
y-0.018
Token ours
Token ID16903
Feature activation-2.00e-01

Pos logit contributions
 plunge+0.123
 umbrella+0.116
anger+0.112
aroo+0.111
adders+0.107
Neg logit contributions
ce-0.111
by-0.109
idi-0.109
bo-0.108
die-0.107
Token.
Token ID13
Feature activation-4.08e-03

Pos logit contributions
 plunge+0.474
anger+0.445
aroo+0.437
adders+0.430
 umbrella+0.426
Neg logit contributions
by-0.192
bo-0.191
die-0.190
aron-0.186
 sav-0.185
Token We
Token ID775
Feature activation+1.26e-02

Pos logit contributions
 plunge+0.008
anger+0.007
aroo+0.007
 umbrella+0.007
adders+0.007
Neg logit contributions
bo-0.006
die-0.006
ce-0.006
 sav-0.006
aron-0.006
Token can
Token ID460
Feature activation+9.57e-03

Pos logit contributions
bo+0.019
by+0.018
 sav+0.017
 added+0.017
y+0.017
Neg logit contributions
 plunge-0.024
anger-0.024
aroo-0.022
ittens-0.022
vere-0.022
Token't
Token ID470
Feature activation-5.21e-02

Pos logit contributions
bo+0.015
 sav+0.015
 asked+0.014
by+0.013
 added+0.013
Neg logit contributions
anger-0.018
 plunge-0.017
adders-0.017
aroo-0.016
fs-0.015
Token take
Token ID1011
Feature activation-4.59e-03

Pos logit contributions
anger+0.104
 plunge+0.102
adders+0.096
aroo+0.092
fs+0.091
Neg logit contributions
bo-0.074
 sav-0.070
by-0.069
 asked-0.069
ce-0.066
Token is
Token ID318
Feature activation+2.64e-02

Pos logit contributions
 plunge+0.148
 umbrella+0.139
aroo+0.135
anger+0.133
aro+0.131
Neg logit contributions
aron-0.128
ce-0.128
by-0.128
die-0.126
Grand-0.121
Token angry
Token ID7954
Feature activation-1.72e-02

Pos logit contributions
by+0.045
ce+0.044
bo+0.043
en+0.041
aron+0.041
Neg logit contributions
 umbrella-0.044
 plunge-0.043
aroo-0.042
adders-0.041
 hi-0.040
Token.
Token ID13
Feature activation+3.31e-02

Pos logit contributions
 plunge+0.037
anger+0.036
 umbrella+0.035
adders+0.035
aroo+0.035
Neg logit contributions
die-0.020
idi-0.020
by-0.020
ce-0.019
 sav-0.019
Token It
Token ID632
Feature activation-8.07e-02

Pos logit contributions
by+0.051
die+0.050
ce+0.048
bo+0.047
 sav+0.047
Neg logit contributions
 plunge-0.061
anger-0.058
aroo-0.058
adders-0.055
 umbrella-0.055
Token his
Token ID465
Feature activation+6.91e-03

Pos logit contributions
 plunge+0.141
 umbrella+0.131
anger+0.129
aroo+0.129
 hi+0.124
Neg logit contributions
ce-0.127
by-0.126
die-0.126
aron-0.125
idi-0.119
Tokenses
Token ID8448
Feature activation-1.97e-01

Pos logit contributions
by+0.016
die+0.016
 sav+0.016
bo+0.015
ce+0.015
Neg logit contributions
 plunge-0.008
anger-0.006
aroo-0.006
adders-0.006
 umbrella-0.006
Token.
Token ID13
Feature activation+3.38e-02

Pos logit contributions
 plunge+0.506
anger+0.481
aroo+0.474
 umbrella+0.468
adders+0.465
Neg logit contributions
die-0.153
ce-0.150
by-0.149
bo-0.146
 sav-0.145
Token It
Token ID632
Feature activation-7.97e-02

Pos logit contributions
die+0.047
ce+0.047
bo+0.047
by+0.047
 sav+0.046
Neg logit contributions
 plunge-0.066
anger-0.062
aroo-0.060
 umbrella-0.059
adders-0.058
Token scratches
Token ID37627
Feature activation+1.11e-02

Pos logit contributions
 plunge+0.146
anger+0.134
aroo+0.133
 umbrella+0.133
adders+0.128
Neg logit contributions
ce-0.120
die-0.119
by-0.118
aron-0.118
idi-0.114
Token.
Token ID13
Feature activation+3.46e-02

Pos logit contributions
die+0.012
ce+0.012
bo+0.012
by+0.012
 sav+0.012
Neg logit contributions
 plunge-0.025
anger-0.024
aroo-0.023
 umbrella-0.023
adders-0.023
Token It
Token ID632
Feature activation-7.60e-02

Pos logit contributions
die+0.044
ce+0.044
bo+0.044
 sav+0.044
by+0.043
Neg logit contributions
 plunge-0.072
anger-0.067
aroo-0.066
 umbrella-0.065
adders-0.064
Token like
Token ID588
Feature activation+4.79e-02

Pos logit contributions
ce+0.033
die+0.032
idi+0.032
 sav+0.032
aron+0.031
Neg logit contributions
 plunge-0.037
aroo-0.033
 umbrella-0.033
anger-0.032
adders-0.031
Token that
Token ID326
Feature activation-3.89e-03

Pos logit contributions
ce+0.074
aron+0.074
die+0.073
bo+0.071
idi+0.071
Neg logit contributions
 plunge-0.088
aroo-0.082
 umbrella-0.081
aro-0.080
anger-0.080
Token.
Token ID13
Feature activation+1.60e-02

Pos logit contributions
 plunge+0.009
aroo+0.008
 umbrella+0.008
anger+0.008
aro+0.008
Neg logit contributions
die-0.004
ce-0.004
idi-0.004
by-0.004
bo-0.004
Token It
Token ID632
Feature activation-9.02e-02

Pos logit contributions
die+0.024
ce+0.024
bo+0.024
by+0.024
 sav+0.024
Neg logit contributions
 plunge-0.030
anger-0.027
aroo-0.027
 umbrella-0.026
adders-0.026
Token his
Token ID465
Feature activation-3.89e-03

Pos logit contributions
 plunge+0.169
aroo+0.155
 umbrella+0.151
anger+0.149
aro+0.147
Neg logit contributions
ce-0.132
aron-0.131
die-0.130
by-0.128
Grand-0.127
Tokenses
Token ID8448
Feature activation-1.96e-01

Pos logit contributions
 plunge+0.006
anger+0.006
aroo+0.005
adders+0.005
 umbrella+0.005
Neg logit contributions
die-0.007
by-0.007
bo-0.007
 sav-0.007
ce-0.007
Token and
Token ID290
Feature activation+1.02e-02

Pos logit contributions
 plunge+0.505
anger+0.479
aroo+0.472
 umbrella+0.464
adders+0.461
Neg logit contributions
die-0.153
ce-0.151
by-0.149
bo-0.148
 sav-0.147
Token fl
Token ID781
Feature activation-3.18e-02

Pos logit contributions
die+0.014
 sav+0.014
bo+0.014
ce+0.014
by+0.014
Neg logit contributions
 plunge-0.020
anger-0.019
aroo-0.018
adders-0.018
 umbrella-0.018
Tokenaps
Token ID1686
Feature activation+5.52e-02

Pos logit contributions
 plunge+0.060
 umbrella+0.054
anger+0.050
fs+0.049
aroo+0.047
Neg logit contributions
bo-0.051
by-0.051
y-0.048
die-0.046
 late-0.043
Token its
Token ID663
Feature activation-8.84e-02

Pos logit contributions
die+0.062
bo+0.061
by+0.061
ce+0.060
 sav+0.060
Neg logit contributions
 plunge-0.124
anger-0.117
aroo-0.114
 umbrella-0.112
adders-0.111
Token wings
Token ID12098
Feature activation-5.43e-02

Pos logit contributions
 plunge+0.117
anger+0.106
aroo+0.102
 umbrella+0.099
adders+0.098
Neg logit contributions
die-0.180
by-0.178
bo-0.178
ce-0.178
 sav-0.177
Tokenally
Token ID453
Feature activation-7.57e-03

Pos logit contributions
by+0.091
ce+0.088
die+0.088
bo+0.087
 sav+0.087
Neg logit contributions
 plunge-0.129
aroo-0.121
anger-0.118
aro-0.115
adders-0.114
Token,
Token ID11
Feature activation-1.85e-02

Pos logit contributions
 plunge+0.013
anger+0.012
 umbrella+0.012
aroo+0.012
adders+0.011
Neg logit contributions
idi-0.012
die-0.012
 sav-0.012
aron-0.012
bo-0.012
Token that
Token ID326
Feature activation-9.36e-03

Pos logit contributions
 plunge+0.033
anger+0.033
aroo+0.032
aro+0.031
adders+0.031
Neg logit contributions
 sav-0.027
die-0.025
by-0.025
ce-0.025
y-0.025
Token's
Token ID338
Feature activation-1.01e-02

Pos logit contributions
 plunge+0.024
anger+0.023
aroo+0.023
adders+0.022
aro+0.022
Neg logit contributions
die-0.007
 sav-0.007
bo-0.007
by-0.007
ce-0.007
Token not
Token ID407
Feature activation+1.72e-02

Pos logit contributions
 plunge+0.018
 umbrella+0.017
anger+0.017
aroo+0.016
adders+0.016
Neg logit contributions
ce-0.016
by-0.016
die-0.016
bo-0.016
aron-0.015
Token ours
Token ID16903
Feature activation-1.96e-01

Pos logit contributions
ce+0.026
bo+0.025
idi+0.025
die+0.025
by+0.024
Neg logit contributions
 plunge-0.032
 umbrella-0.030
anger-0.029
aroo-0.029
adders-0.029
Token.
Token ID13
Feature activation-2.04e-02

Pos logit contributions
 plunge+0.483
anger+0.456
aroo+0.451
adders+0.439
aro+0.438
Neg logit contributions
by-0.171
bo-0.167
 sav-0.167
die-0.166
aron-0.162
Token Leave
Token ID17446
Feature activation-1.55e-01

Pos logit contributions
 plunge+0.039
anger+0.036
aroo+0.035
 umbrella+0.034
adders+0.034
Neg logit contributions
bo-0.030
die-0.029
ce-0.029
 sav-0.029
aron-0.028
Token it
Token ID340
Feature activation-1.80e-02

Pos logit contributions
 plunge+0.326
anger+0.309
aroo+0.301
adders+0.297
vere+0.293
Neg logit contributions
 sav-0.195
by-0.193
bo-0.193
die-0.191
ce-0.188
Token where
Token ID810
Feature activation-4.25e-02

Pos logit contributions
 plunge+0.036
anger+0.034
aroo+0.033
adders+0.032
aro+0.032
Neg logit contributions
bo-0.024
 sav-0.024
by-0.024
die-0.024
 late-0.023
Token it
Token ID340
Feature activation-1.87e-02

Pos logit contributions
 plunge+0.069
anger+0.068
ittens+0.065
aroo+0.064
ishly+0.062
Neg logit contributions
 late-0.070
bo-0.067
by-0.067
 sav-0.064
y-0.063
Token was
Token ID373
Feature activation+1.61e-02

Pos logit contributions
by+0.011
ce+0.011
 sav+0.010
bo+0.010
idi+0.010
Neg logit contributions
 plunge-0.032
anger-0.031
adders-0.031
aroo-0.030
 umbrella-0.030
Token,
Token ID11
Feature activation-1.12e-02

Pos logit contributions
by+0.024
bo+0.024
die+0.024
ce+0.024
aron+0.023
Neg logit contributions
 plunge-0.032
 umbrella-0.028
aroo-0.027
anger-0.027
adders-0.027
Token it
Token ID340
Feature activation-1.72e-02

Pos logit contributions
 plunge+0.023
anger+0.021
aroo+0.020
 umbrella+0.020
adders+0.020
Neg logit contributions
bo-0.015
die-0.015
by-0.015
ce-0.015
aron-0.015
Token sl
Token ID1017
Feature activation+4.28e-02

Pos logit contributions
 plunge+0.037
aroo+0.034
 umbrella+0.033
anger+0.033
aro+0.032
Neg logit contributions
aron-0.021
die-0.020
by-0.019
ce-0.019
 sav-0.019
Tokenugg
Token ID6837
Feature activation+3.62e-02

Pos logit contributions
bo+0.092
ce+0.091
by+0.090
die+0.090
aron+0.090
Neg logit contributions
 plunge-0.054
 umbrella-0.046
aroo-0.045
anger-0.044
vere-0.042
Tokenishly
Token ID29735
Feature activation-1.95e-01

Pos logit contributions
 sav+0.058
Grand+0.056
die+0.056
idi+0.054
ce+0.053
Neg logit contributions
 plunge-0.061
anger-0.061
aroo-0.056
 umbrella-0.054
fs-0.052
Token moved
Token ID3888
Feature activation+1.28e-02

Pos logit contributions
 plunge+0.444
anger+0.417
aroo+0.408
 umbrella+0.403
adders+0.398
Neg logit contributions
die-0.212
ce-0.209
bo-0.207
by-0.206
 sav-0.204
Token in
Token ID287
Feature activation+1.72e-02

Pos logit contributions
ce+0.019
 sav+0.019
aron+0.019
idi+0.019
die+0.019
Neg logit contributions
 plunge-0.023
 umbrella-0.021
anger-0.020
aroo-0.020
 hi-0.020
Token the
Token ID262
Feature activation-2.80e-04

Pos logit contributions
bo+0.025
ce+0.023
by+0.023
die+0.023
aron+0.023
Neg logit contributions
 plunge-0.035
 umbrella-0.032
anger-0.029
aroo-0.029
 hi-0.029
Token water
Token ID1660
Feature activation+3.73e-02

Pos logit contributions
 plunge+0.001
 umbrella+0.000
 hi+0.000
anger+0.000
 raft+0.000
Neg logit contributions
aron-0.001
die-0.000
ce-0.000
Grand-0.000
idi-0.000
Token but
Token ID475
Feature activation-5.26e-02

Pos logit contributions
die+0.038
idi+0.037
ce+0.037
by+0.036
bo+0.036
Neg logit contributions
 plunge-0.089
anger-0.083
 umbrella-0.081
aroo-0.081
adders-0.078
Token snake
Token ID17522
Feature activation-3.39e-02

Pos logit contributions
 plunge+0.087
anger+0.084
adders+0.080
vere+0.078
ittens+0.078
Neg logit contributions
die-0.084
ce-0.082
 sav-0.082
bo-0.080
idi-0.080
Token sees
Token ID7224
Feature activation-2.25e-03

Pos logit contributions
 plunge+0.065
anger+0.060
aroo+0.059
 umbrella+0.058
adders+0.057
Neg logit contributions
die-0.049
ce-0.048
by-0.048
bo-0.048
 sav-0.047
Token Tom
Token ID4186
Feature activation+6.34e-03

Pos logit contributions
 plunge+0.005
anger+0.004
aroo+0.004
 umbrella+0.004
adders+0.004
Neg logit contributions
die-0.003
ce-0.003
by-0.003
bo-0.003
 sav-0.003
Token and
Token ID290
Feature activation+1.92e-02

Pos logit contributions
die+0.006
idi+0.006
ce+0.006
by+0.006
 sav+0.006
Neg logit contributions
 plunge-0.015
anger-0.014
aroo-0.014
 umbrella-0.014
adders-0.014
Token his
Token ID465
Feature activation+2.69e-03

Pos logit contributions
die+0.024
ce+0.024
by+0.024
aron+0.024
bo+0.023
Neg logit contributions
 plunge-0.040
aroo-0.037
anger-0.037
 umbrella-0.037
aro-0.036
Tokenses
Token ID8448
Feature activation-1.94e-01

Pos logit contributions
by+0.005
die+0.005
bo+0.005
 sav+0.005
ce+0.005
Neg logit contributions
 plunge-0.004
anger-0.004
aroo-0.004
adders-0.004
aro-0.004
Token.
Token ID13
Feature activation+2.51e-02

Pos logit contributions
 plunge+0.473
anger+0.446
aroo+0.441
 umbrella+0.437
adders+0.436
Neg logit contributions
die-0.180
ce-0.175
idi-0.172
by-0.172
 sav-0.170
Token It
Token ID632
Feature activation-8.12e-02

Pos logit contributions
die+0.036
by+0.036
ce+0.035
bo+0.035
 sav+0.034
Neg logit contributions
 plunge-0.048
anger-0.046
aroo-0.045
 umbrella-0.043
adders-0.043
Token opens
Token ID9808
Feature activation+2.69e-02

Pos logit contributions
 plunge+0.155
anger+0.144
aroo+0.142
 umbrella+0.139
adders+0.137
Neg logit contributions
die-0.117
ce-0.116
by-0.115
bo-0.114
 sav-0.113
Token its
Token ID663
Feature activation-7.55e-02

Pos logit contributions
by+0.046
ce+0.046
die+0.046
bo+0.044
idi+0.044
Neg logit contributions
 plunge-0.043
aroo-0.040
anger-0.040
 umbrella-0.040
 hi-0.038
Token mouth
Token ID5422
Feature activation-4.60e-02

Pos logit contributions
 plunge+0.124
anger+0.115
aroo+0.112
adders+0.108
 umbrella+0.107
Neg logit contributions
die-0.130
bo-0.129
by-0.129
 sav-0.128
ce-0.127
Token up
Token ID510
Feature activation+4.79e-02

Pos logit contributions
die+0.013
ce+0.013
idi+0.012
 sav+0.012
aron+0.012
Neg logit contributions
 plunge-0.012
anger-0.011
 umbrella-0.011
adders-0.011
aroo-0.010
Token things
Token ID1243
Feature activation+4.20e-02

Pos logit contributions
die+0.065
ce+0.064
bo+0.064
by+0.063
idi+0.063
Neg logit contributions
 plunge-0.096
anger-0.088
 umbrella-0.088
aroo-0.087
adders-0.086
Token that
Token ID326
Feature activation-1.38e-02

Pos logit contributions
die+0.060
ce+0.058
by+0.058
idi+0.057
bo+0.057
Neg logit contributions
 plunge-0.083
anger-0.077
aroo-0.075
 umbrella-0.074
adders-0.073
Token are
Token ID389
Feature activation-8.62e-02

Pos logit contributions
 plunge+0.019
anger+0.018
aroo+0.017
 umbrella+0.016
adders+0.016
Neg logit contributions
die-0.027
bo-0.027
by-0.027
ce-0.027
 sav-0.027
Token not
Token ID407
Feature activation+1.27e-02

Pos logit contributions
 plunge+0.142
 umbrella+0.132
adders+0.129
aroo+0.127
anger+0.126
Neg logit contributions
ce-0.148
by-0.145
bo-0.143
aron-0.142
die-0.141
Token ours
Token ID16903
Feature activation-1.94e-01

Pos logit contributions
ce+0.022
bo+0.021
idi+0.021
by+0.021
 sav+0.021
Neg logit contributions
 plunge-0.021
 umbrella-0.019
anger-0.019
adders-0.019
aroo-0.019
Token".
Token ID1911
Feature activation+1.20e-02

Pos logit contributions
 plunge+0.472
anger+0.446
aroo+0.438
 umbrella+0.432
adders+0.429
Neg logit contributions
die-0.179
ce-0.178
bo-0.175
by-0.175
 sav-0.173
Token
Token ID198
Feature activation+1.14e-02

Pos logit contributions
ce+0.013
die+0.013
by+0.012
bo+0.012
aron+0.012
Neg logit contributions
 plunge-0.027
aroo-0.026
anger-0.025
 umbrella-0.025
aro-0.024
Token
Token ID198
Feature activation+1.62e-02

Pos logit contributions
 sav+0.011
ce+0.011
die+0.011
by+0.011
bo+0.011
Neg logit contributions
 plunge-0.026
aroo-0.025
anger-0.024
 umbrella-0.024
adders-0.024
TokenThe
Token ID464
Feature activation-3.81e-02

Pos logit contributions
by+0.022
die+0.022
ce+0.021
aron+0.021
 sav+0.021
Neg logit contributions
 plunge-0.031
aroo-0.030
anger-0.029
adders-0.028
 umbrella-0.028
Token daddy
Token ID35822
Feature activation+1.09e-01

Pos logit contributions
 plunge+0.072
 umbrella+0.067
aroo+0.067
anger+0.066
adders+0.062
Neg logit contributions
aron-0.057
bo-0.055
die-0.055
ce-0.055
by-0.053
Tokenâ
Token ID22940
Feature activation+3.38e-02

Pos logit contributions
 plunge+0.026
anger+0.024
aroo+0.023
 umbrella+0.023
adders+0.022
Neg logit contributions
die-0.020
ce-0.020
by-0.020
 sav-0.020
aron-0.020
Token€
Token ID26391
Feature activation+3.30e-02

Pos logit contributions
die+0.054
by+0.053
ce+0.052
 sav+0.052
idi+0.052
Neg logit contributions
 plunge-0.060
anger-0.056
aroo-0.054
 umbrella-0.053
adders-0.052
Token™
Token ID8151
Feature activation+1.49e-02

Pos logit contributions
ce+0.054
bo+0.054
die+0.054
 sav+0.053
by+0.053
Neg logit contributions
 plunge-0.058
anger-0.052
aroo-0.052
 umbrella-0.050
aro-0.048
Tokens
Token ID82
Feature activation-4.44e-03

Pos logit contributions
bo+0.017
by+0.016
ce+0.016
die+0.016
y+0.016
Neg logit contributions
 plunge-0.034
anger-0.031
aroo-0.031
 umbrella-0.031
adders-0.030
Token not
Token ID407
Feature activation+1.92e-02

Pos logit contributions
 plunge+0.009
anger+0.008
aroo+0.008
aro+0.008
adders+0.008
Neg logit contributions
bo-0.006
die-0.006
by-0.006
 sav-0.006
ce-0.006
Token ours
Token ID16903
Feature activation-1.94e-01

Pos logit contributions
die+0.028
ce+0.028
bo+0.028
by+0.028
 sav+0.027
Neg logit contributions
 plunge-0.037
anger-0.034
aroo-0.033
 umbrella-0.033
adders-0.032
Token.
Token ID13
Feature activation-1.16e-02

Pos logit contributions
 plunge+0.495
anger+0.468
aroo+0.461
 umbrella+0.452
adders+0.450
Neg logit contributions
die-0.152
by-0.152
bo-0.150
ce-0.148
 sav-0.148
Tokenâ
Token ID22940
Feature activation+3.47e-02

Pos logit contributions
 plunge+0.020
anger+0.019
aroo+0.019
 umbrella+0.018
adders+0.018
Neg logit contributions
by-0.018
die-0.018
ce-0.018
bo-0.018
 sav-0.018
Token€
Token ID26391
Feature activation+3.40e-02

Pos logit contributions
die+0.055
ce+0.054
by+0.054
bo+0.054
 sav+0.054
Neg logit contributions
 plunge-0.062
anger-0.057
aroo-0.056
 umbrella-0.054
adders-0.054
Token
Token ID198
Feature activation-5.38e-03

Pos logit contributions
ce+0.047
 sav+0.047
die+0.047
bo+0.046
by+0.046
Neg logit contributions
 plunge-0.069
anger-0.063
aroo-0.063
 umbrella-0.061
adders-0.060
Token
Token ID198
Feature activation-1.90e-03

Pos logit contributions
 plunge+0.014
anger+0.013
aroo+0.013
 umbrella+0.013
adders+0.013
Neg logit contributions
die-0.004
ce-0.004
bo-0.004
by-0.004
 sav-0.004
Token of
Token ID286
Feature activation-1.37e-02

Pos logit contributions
 plunge+0.094
 umbrella+0.090
aroo+0.089
anger+0.088
adders+0.088
Neg logit contributions
die-0.048
idi-0.047
 sav-0.046
ce-0.046
by-0.046
Token Tom
Token ID4186
Feature activation-8.05e-03

Pos logit contributions
 plunge+0.034
anger+0.033
aroo+0.032
vere+0.031
adders+0.031
Neg logit contributions
 sav-0.012
ce-0.011
bo-0.011
die-0.011
by-0.011
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
 plunge+0.020
anger+0.019
aroo+0.019
 umbrella+0.018
adders+0.018
Neg logit contributions
die-0.007
ce-0.007
idi-0.007
 sav-0.007
by-0.007
Token It
Token ID632
Feature activation-8.74e-02

Pos logit contributions
 sav+0.012
bo+0.012
aron+0.012
ce+0.012
die+0.012
Neg logit contributions
 plunge-0.023
anger-0.022
aroo-0.021
 umbrella-0.021
aro-0.020
Token his
Token ID465
Feature activation-3.95e-03

Pos logit contributions
 plunge+0.167
anger+0.155
aroo+0.152
 umbrella+0.149
adders+0.147
Neg logit contributions
die-0.126
ce-0.125
by-0.125
bo-0.124
 sav-0.124
Tokenses
Token ID8448
Feature activation-1.94e-01

Pos logit contributions
 plunge+0.005
anger+0.004
aro+0.004
aroo+0.004
adders+0.004
Neg logit contributions
by-0.008
die-0.008
 sav-0.008
bo-0.008
idi-0.008
Token and
Token ID290
Feature activation+3.11e-04

Pos logit contributions
 plunge+0.529
anger+0.501
aroo+0.493
 umbrella+0.484
adders+0.479
Neg logit contributions
die-0.124
bo-0.123
ce-0.122
by-0.120
 sav-0.118
Token scratches
Token ID37627
Feature activation+5.97e-04

Pos logit contributions
 sav+0.000
die+0.000
bo+0.000
ce+0.000
by+0.000
Neg logit contributions
 plunge-0.001
anger-0.001
aroo-0.001
adders-0.001
 umbrella-0.001
Token.
Token ID13
Feature activation+1.12e-02

Pos logit contributions
die+0.001
ce+0.001
bo+0.001
by+0.001
 sav+0.001
Neg logit contributions
 plunge-0.001
anger-0.001
aroo-0.001
 umbrella-0.001
adders-0.001
Token It
Token ID632
Feature activation-8.55e-02

Pos logit contributions
 sav+0.013
aron+0.013
bo+0.013
 late+0.013
ce+0.013
Neg logit contributions
 plunge-0.025
anger-0.023
aroo-0.023
 umbrella-0.023
aro-0.022
Token hurts
Token ID20406
Feature activation+6.98e-03

Pos logit contributions
 plunge+0.169
anger+0.158
aroo+0.154
 umbrella+0.151
adders+0.150
Neg logit contributions
die-0.118
bo-0.117
ce-0.117
by-0.116
 sav-0.116
Token isn
Token ID2125
Feature activation+3.07e-03

Pos logit contributions
 plunge+0.000
anger+0.000
aroo+0.000
 umbrella+0.000
adders+0.000
Neg logit contributions
ce-0.000
die-0.000
by-0.000
aron-0.000
bo-0.000
Tokenâ
Token ID22940
Feature activation+3.67e-02

Pos logit contributions
bo+0.006
ce+0.006
 late+0.006
y+0.006
 over+0.006
Neg logit contributions
 plunge-0.004
anger-0.004
adders-0.004
 umbrella-0.004
 hi-0.004
Token€
Token ID26391
Feature activation+3.61e-02

Pos logit contributions
die+0.062
by+0.061
ce+0.060
 sav+0.059
idi+0.059
Neg logit contributions
 plunge-0.062
anger-0.057
aroo-0.056
 umbrella-0.053
adders-0.053
Token™
Token ID8151
Feature activation+1.75e-02

Pos logit contributions
die+0.062
ce+0.062
bo+0.062
by+0.062
 sav+0.061
Neg logit contributions
 plunge-0.059
anger-0.054
aroo-0.053
 umbrella-0.051
adders-0.050
Tokent
Token ID83
Feature activation-1.86e-02

Pos logit contributions
bo+0.032
ce+0.031
by+0.031
die+0.030
y+0.030
Neg logit contributions
 plunge-0.028
anger-0.025
 umbrella-0.024
aroo-0.024
adders-0.024
Token ours
Token ID16903
Feature activation-1.93e-01

Pos logit contributions
 plunge+0.036
anger+0.034
aroo+0.033
 umbrella+0.032
adders+0.032
Neg logit contributions
die-0.026
ce-0.026
by-0.026
bo-0.026
 sav-0.026
Token without
Token ID1231
Feature activation-4.59e-02

Pos logit contributions
 plunge+0.446
anger+0.421
aroo+0.413
 umbrella+0.407
adders+0.404
Neg logit contributions
die-0.199
by-0.197
bo-0.196
ce-0.196
 sav-0.194
Token permission
Token ID7170
Feature activation-1.26e-01

Pos logit contributions
 plunge+0.078
anger+0.068
aroo+0.068
 umbrella+0.067
aro+0.065
Neg logit contributions
ce-0.078
die-0.077
bo-0.077
by-0.076
aron-0.076
Token.
Token ID13
Feature activation+1.51e-03

Pos logit contributions
 plunge+0.274
anger+0.259
aroo+0.252
adders+0.247
 umbrella+0.247
Neg logit contributions
die-0.149
ce-0.147
 sav-0.146
by-0.146
idi-0.145
Token It
Token ID632
Feature activation-9.27e-02

Pos logit contributions
die+0.002
ce+0.002
bo+0.002
by+0.002
 sav+0.002
Neg logit contributions
 plunge-0.003
anger-0.003
aroo-0.003
 umbrella-0.003
adders-0.003
Tokenâ
Token ID22940
Feature activation+3.70e-02

Pos logit contributions
 plunge+0.198
anger+0.186
aroo+0.182
 umbrella+0.179
adders+0.177
Neg logit contributions
die-0.112
ce-0.112
by-0.111
bo-0.110
 sav-0.109
Token and
Token ID290
Feature activation+1.33e-02

Pos logit contributions
 plunge+0.091
 umbrella+0.089
aroo+0.087
anger+0.086
adders+0.086
Neg logit contributions
die-0.049
idi-0.048
 sav-0.048
ce-0.047
by-0.047
Token angry
Token ID7954
Feature activation-2.44e-02

Pos logit contributions
die+0.019
ce+0.019
bo+0.019
by+0.019
 sav+0.019
Neg logit contributions
 plunge-0.025
anger-0.024
aroo-0.023
 umbrella-0.022
adders-0.022
Token.
Token ID13
Feature activation+2.37e-02

Pos logit contributions
 plunge+0.053
anger+0.051
aroo+0.050
adders+0.050
 umbrella+0.049
Neg logit contributions
die-0.029
idi-0.029
ce-0.028
 sav-0.027
by-0.027
Token It
Token ID632
Feature activation-8.75e-02

Pos logit contributions
die+0.038
by+0.037
ce+0.037
bo+0.037
 sav+0.037
Neg logit contributions
 plunge-0.042
anger-0.039
aroo-0.038
 umbrella-0.037
adders-0.037
Token his
Token ID465
Feature activation-1.31e-03

Pos logit contributions
 plunge+0.161
aroo+0.146
anger+0.145
 umbrella+0.145
 hi+0.141
Neg logit contributions
ce-0.131
by-0.129
die-0.129
aron-0.129
bo-0.124
Tokenses
Token ID8448
Feature activation-1.93e-01

Pos logit contributions
 plunge+0.002
anger+0.001
aroo+0.001
adders+0.001
aro+0.001
Neg logit contributions
by-0.003
 sav-0.003
die-0.003
bo-0.003
ce-0.003
Token and
Token ID290
Feature activation+1.56e-02

Pos logit contributions
 plunge+0.496
anger+0.470
aroo+0.464
 umbrella+0.456
adders+0.455
Neg logit contributions
die-0.151
ce-0.149
by-0.147
bo-0.146
 sav-0.144
Token fl
Token ID781
Feature activation-2.70e-02

Pos logit contributions
 sav+0.021
die+0.021
bo+0.020
ce+0.020
by+0.020
Neg logit contributions
 plunge-0.032
anger-0.031
aroo-0.029
adders-0.029
 umbrella-0.028
Tokenaps
Token ID1686
Feature activation+5.61e-02

Pos logit contributions
 plunge+0.065
 umbrella+0.058
fs+0.054
aroo+0.054
anger+0.053
Neg logit contributions
by-0.030
bo-0.030
y-0.027
 late-0.026
die-0.025
Token its
Token ID663
Feature activation-8.11e-02

Pos logit contributions
die+0.064
bo+0.063
by+0.062
ce+0.061
 sav+0.060
Neg logit contributions
 plunge-0.127
anger-0.120
aroo-0.116
 umbrella-0.113
adders-0.112
Token wings
Token ID12098
Feature activation-5.30e-02

Pos logit contributions
 plunge+0.121
anger+0.113
aroo+0.106
 umbrella+0.103
adders+0.103
Neg logit contributions
die-0.153
bo-0.153
by-0.152
 sav-0.149
ce-0.149
Token to
Token ID284
Feature activation+5.68e-03

Pos logit contributions
 plunge+0.169
 umbrella+0.161
anger+0.156
aroo+0.152
adders+0.151
Neg logit contributions
die-0.083
ce-0.083
aron-0.083
 sav-0.077
bo-0.077
Token listen
Token ID6004
Feature activation-1.16e-02

Pos logit contributions
die+0.010
ce+0.010
by+0.010
bo+0.010
 sav+0.010
Neg logit contributions
 plunge-0.009
anger-0.008
aroo-0.008
 umbrella-0.008
adders-0.008
Token and
Token ID290
Feature activation-3.42e-02

Pos logit contributions
 plunge+0.021
 umbrella+0.020
anger+0.020
 hi+0.020
ishly+0.019
Neg logit contributions
die-0.017
 sav-0.016
bo-0.016
idi-0.016
ce-0.016
Token kept
Token ID4030
Feature activation+1.23e-02

Pos logit contributions
 plunge+0.059
anger+0.055
aroo+0.054
 umbrella+0.052
adders+0.052
Neg logit contributions
die-0.055
ce-0.055
by-0.055
bo-0.055
 sav-0.054
Token playing
Token ID2712
Feature activation-1.95e-02

Pos logit contributions
aron+0.022
die+0.021
ce+0.021
by+0.021
idi+0.020
Neg logit contributions
 plunge-0.020
 umbrella-0.019
aroo-0.018
 hi-0.017
aro-0.017
Token anyway
Token ID6949
Feature activation-1.93e-01

Pos logit contributions
 plunge+0.048
anger+0.046
aroo+0.045
 umbrella+0.044
adders+0.044
Neg logit contributions
ce-0.017
bo-0.017
die-0.017
 sav-0.017
by-0.017
Token.
Token ID13
Feature activation-1.92e-02

Pos logit contributions
 plunge+0.526
anger+0.494
aroo+0.490
adders+0.476
 umbrella+0.474
Neg logit contributions
by-0.120
ce-0.119
bo-0.115
aron-0.114
die-0.114
Token
Token ID198
Feature activation-2.96e-03

Pos logit contributions
 plunge+0.038
anger+0.036
 umbrella+0.035
aroo+0.035
adders+0.035
Neg logit contributions
ce-0.025
by-0.025
bo-0.024
die-0.024
aron-0.024
Token
Token ID198
Feature activation+4.37e-03

Pos logit contributions
 plunge+0.008
anger+0.007
aroo+0.007
 umbrella+0.007
adders+0.007
Neg logit contributions
die-0.002
ce-0.002
by-0.002
bo-0.002
 sav-0.002
TokenThe
Token ID464
Feature activation-5.03e-02

Pos logit contributions
die+0.006
ce+0.006
bo+0.006
by+0.006
 sav+0.006
Neg logit contributions
 plunge-0.009
anger-0.008
aroo-0.008
 umbrella-0.008
adders-0.008
Token next
Token ID1306
Feature activation+2.63e-02

Pos logit contributions
 plunge+0.105
aroo+0.096
 umbrella+0.096
anger+0.096
adders+0.091
Neg logit contributions
aron-0.066
die-0.066
ce-0.065
bo-0.064
by-0.063
Token make
Token ID787
Feature activation+1.34e-02

Pos logit contributions
 plunge+0.168
anger+0.156
aroo+0.153
 umbrella+0.150
adders+0.149
Neg logit contributions
die-0.118
ce-0.117
bo-0.117
by-0.116
 sav-0.115
Token his
Token ID465
Feature activation-1.66e-02

Pos logit contributions
die+0.020
ce+0.019
by+0.019
aron+0.019
bo+0.019
Neg logit contributions
 plunge-0.025
anger-0.023
 umbrella-0.023
aroo-0.023
adders-0.022
Token house
Token ID2156
Feature activation-4.79e-02

Pos logit contributions
 plunge+0.028
anger+0.025
 umbrella+0.025
aroo+0.025
adders+0.024
Neg logit contributions
ce-0.028
die-0.028
bo-0.027
aron-0.027
idi-0.027
Token safe
Token ID3338
Feature activation-1.26e-01

Pos logit contributions
 plunge+0.110
anger+0.104
aroo+0.103
adders+0.099
 umbrella+0.099
Neg logit contributions
die-0.050
ce-0.050
bo-0.050
by-0.050
 sav-0.049
Token like
Token ID588
Feature activation+4.50e-02

Pos logit contributions
 plunge+0.219
 umbrella+0.201
anger+0.200
aroo+0.197
adders+0.189
Neg logit contributions
ce-0.197
die-0.196
idi-0.196
by-0.195
bo-0.194
Token ours
Token ID16903
Feature activation-1.93e-01

Pos logit contributions
 sav+0.059
ce+0.057
die+0.057
bo+0.057
 late+0.056
Neg logit contributions
 plunge-0.091
anger-0.086
aroo-0.084
vere-0.080
adders-0.080
Token?"
Token ID1701
Feature activation+2.86e-03

Pos logit contributions
 plunge+0.354
anger+0.329
aroo+0.320
adders+0.311
 umbrella+0.308
Neg logit contributions
die-0.290
by-0.289
bo-0.287
ce-0.285
 sav-0.285
Token Her
Token ID2332
Feature activation-4.34e-02

Pos logit contributions
die+0.005
ce+0.005
by+0.005
bo+0.005
aron+0.005
Neg logit contributions
 plunge-0.004
anger-0.004
 umbrella-0.004
aroo-0.004
 hi-0.004
Token mom
Token ID1995
Feature activation+2.95e-02

Pos logit contributions
 plunge+0.067
 umbrella+0.065
anger+0.063
aroo+0.060
ittens+0.059
Neg logit contributions
ce-0.078
bo-0.077
aron-0.076
 sav-0.076
die-0.075
Token and
Token ID290
Feature activation-9.30e-03

Pos logit contributions
by+0.040
 late+0.040
die+0.040
 sav+0.038
bo+0.037
Neg logit contributions
 umbrella-0.052
ishly-0.051
aroo-0.051
 plunge-0.050
 yes-0.049
Token dad
Token ID9955
Feature activation+8.27e-02

Pos logit contributions
 plunge+0.014
 umbrella+0.014
aroo+0.013
 hi+0.013
anger+0.013
Neg logit contributions
bo-0.016
ce-0.016
die-0.016
aron-0.016
by-0.015
Token it
Token ID340
Feature activation-4.52e-03

Pos logit contributions
 plunge+0.026
anger+0.025
vere+0.025
adders+0.024
ittens+0.024
Neg logit contributions
bo-0.013
ce-0.013
by-0.013
 sav-0.012
die-0.012
Token.
Token ID13
Feature activation+3.32e-02

Pos logit contributions
 plunge+0.012
anger+0.012
aroo+0.011
adders+0.011
 umbrella+0.011
Neg logit contributions
bo-0.003
die-0.003
ce-0.003
by-0.003
 sav-0.003
Token The
Token ID383
Feature activation-5.04e-02

Pos logit contributions
die+0.040
by+0.040
ce+0.039
bo+0.038
idi+0.036
Neg logit contributions
 plunge-0.072
anger-0.069
aroo-0.067
adders-0.066
 umbrella-0.065
Token cat
Token ID3797
Feature activation-8.27e-02

Pos logit contributions
 plunge+0.096
anger+0.089
aroo+0.086
adders+0.085
aro+0.084
Neg logit contributions
ce-0.073
 sav-0.073
die-0.073
by-0.072
bo-0.071
Token his
Token ID465
Feature activation+4.48e-03

Pos logit contributions
 plunge+0.157
anger+0.145
aroo+0.144
 umbrella+0.142
adders+0.139
Neg logit contributions
die-0.119
by-0.118
ce-0.117
bo-0.115
aron-0.115
Tokenses
Token ID8448
Feature activation-1.93e-01

Pos logit contributions
by+0.010
die+0.010
 sav+0.010
bo+0.010
ce+0.010
Neg logit contributions
 plunge-0.005
anger-0.004
aro-0.004
aroo-0.004
adders-0.004
Token and
Token ID290
Feature activation+1.80e-02

Pos logit contributions
 plunge+0.479
anger+0.454
aroo+0.450
 umbrella+0.444
adders+0.443
Neg logit contributions
die-0.164
by-0.161
ce-0.160
 sav-0.155
bo-0.155
Token scratches
Token ID37627
Feature activation+3.21e-03

Pos logit contributions
 sav+0.025
die+0.025
bo+0.025
ce+0.024
idi+0.024
Neg logit contributions
 plunge-0.036
anger-0.034
aroo-0.032
adders-0.032
 umbrella-0.031
Token him
Token ID683
Feature activation-5.97e-02

Pos logit contributions
die+0.004
by+0.004
ce+0.004
 sav+0.004
bo+0.004
Neg logit contributions
 plunge-0.007
anger-0.007
aroo-0.007
 umbrella-0.006
adders-0.006
Token.
Token ID13
Feature activation+3.00e-02

Pos logit contributions
 plunge+0.149
anger+0.141
aroo+0.139
 umbrella+0.137
adders+0.136
Neg logit contributions
die-0.051
ce-0.051
by-0.050
 sav-0.050
bo-0.049
Token Ben
Token ID3932
Feature activation+5.20e-02

Pos logit contributions
by+0.036
die+0.035
ce+0.035
bo+0.035
 sav+0.033
Neg logit contributions
 plunge-0.065
anger-0.062
aroo-0.061
adders-0.059
 umbrella-0.059
Token is
Token ID318
Feature activation+2.38e-02

Pos logit contributions
 plunge+0.147
 umbrella+0.136
anger+0.136
aroo+0.135
 hi+0.132
Neg logit contributions
by-0.121
die-0.120
ce-0.118
bo-0.118
aron-0.116
Token afraid
Token ID7787
Feature activation-5.79e-02

Pos logit contributions
by+0.038
ce+0.036
bo+0.035
 sav+0.034
die+0.034
Neg logit contributions
 plunge-0.043
aroo-0.040
 umbrella-0.040
adders-0.039
 hi-0.038
Token.
Token ID13
Feature activation+1.76e-02

Pos logit contributions
 plunge+0.115
 umbrella+0.111
aroo+0.108
adders+0.107
anger+0.107
Neg logit contributions
die-0.080
 sav-0.077
idi-0.075
aron-0.074
ce-0.074
Token It
Token ID632
Feature activation-8.17e-02

Pos logit contributions
 sav+0.019
aron+0.019
 late+0.019
bo+0.018
ce+0.017
Neg logit contributions
 plunge-0.043
anger-0.039
 umbrella-0.038
aroo-0.037
aro-0.037
Token his
Token ID465
Feature activation+2.20e-03

Pos logit contributions
 plunge+0.155
anger+0.141
aroo+0.141
 umbrella+0.139
adders+0.136
Neg logit contributions
ce-0.117
by-0.116
die-0.116
aron-0.115
bo-0.114
Tokenses
Token ID8448
Feature activation-1.92e-01

Pos logit contributions
by+0.005
die+0.005
 sav+0.005
bo+0.005
ce+0.005
Neg logit contributions
 plunge-0.002
anger-0.002
aroo-0.002
aro-0.002
adders-0.002
Token and
Token ID290
Feature activation+1.78e-02

Pos logit contributions
 plunge+0.497
anger+0.471
aroo+0.464
 umbrella+0.457
adders+0.454
Neg logit contributions
die-0.148
ce-0.146
by-0.144
bo-0.144
 sav-0.142
Token scratches
Token ID37627
Feature activation+4.36e-03

Pos logit contributions
 sav+0.024
die+0.023
bo+0.022
ce+0.022
idi+0.022
Neg logit contributions
 plunge-0.037
anger-0.036
aroo-0.034
adders-0.033
vere-0.032
Token Tom
Token ID4186
Feature activation+8.21e-03

Pos logit contributions
die+0.004
ce+0.004
bo+0.004
by+0.004
 sav+0.004
Neg logit contributions
 plunge-0.010
anger-0.010
aroo-0.009
 umbrella-0.009
adders-0.009
Token's
Token ID338
Feature activation+8.17e-03

Pos logit contributions
idi+0.008
die+0.008
ce+0.008
 sav+0.008
by+0.007
Neg logit contributions
 plunge-0.020
anger-0.018
aroo-0.018
 umbrella-0.018
adders-0.018
Token hand
Token ID1021
Feature activation-1.07e-01

Pos logit contributions
die+0.015
by+0.015
bo+0.015
ce+0.015
 sav+0.014
Neg logit contributions
 plunge-0.013
anger-0.012
aroo-0.011
adders-0.011
 umbrella-0.011
Token scare
Token ID19437
Feature activation-2.01e-02

Pos logit contributions
bo+0.043
 sav+0.042
ce+0.042
by+0.042
die+0.042
Neg logit contributions
 plunge-0.053
anger-0.053
adders-0.049
aroo-0.049
ittens-0.048
Token it
Token ID340
Feature activation-5.39e-03

Pos logit contributions
 plunge+0.040
aroo+0.036
anger+0.036
aro+0.035
 umbrella+0.035
Neg logit contributions
aron-0.028
die-0.028
ce-0.027
 sav-0.027
idi-0.027
Token.
Token ID13
Feature activation+1.48e-02

Pos logit contributions
 plunge+0.012
anger+0.011
aroo+0.011
 umbrella+0.011
adders+0.011
Neg logit contributions
die-0.006
ce-0.006
by-0.006
bo-0.006
 sav-0.006
Token It
Token ID632
Feature activation-8.44e-02

Pos logit contributions
ce+0.020
by+0.020
bo+0.020
die+0.020
aron+0.019
Neg logit contributions
 plunge-0.029
anger-0.027
aroo-0.026
adders-0.026
 umbrella-0.026
Token his
Token ID465
Feature activation+1.76e-03

Pos logit contributions
 plunge+0.157
 umbrella+0.141
anger+0.141
aroo+0.141
 hi+0.137
Neg logit contributions
aron-0.123
ce-0.123
die-0.122
by-0.121
bo-0.118
Tokenses
Token ID8448
Feature activation-1.91e-01

Pos logit contributions
by+0.003
 sav+0.003
die+0.003
bo+0.003
ce+0.003
Neg logit contributions
 plunge-0.003
anger-0.002
adders-0.002
aroo-0.002
aro-0.002
Token and
Token ID290
Feature activation+2.86e-03

Pos logit contributions
 plunge+0.485
anger+0.460
aroo+0.453
 umbrella+0.446
adders+0.444
Neg logit contributions
die-0.155
ce-0.152
by-0.151
bo-0.149
 sav-0.148
Token shows
Token ID2523
Feature activation+2.73e-02

Pos logit contributions
die+0.004
 sav+0.004
bo+0.004
ce+0.004
by+0.004
Neg logit contributions
 plunge-0.006
anger-0.005
aroo-0.005
adders-0.005
 umbrella-0.005
Token its
Token ID663
Feature activation-7.47e-02

Pos logit contributions
die+0.036
ce+0.036
 sav+0.035
bo+0.035
by+0.035
Neg logit contributions
 plunge-0.056
anger-0.053
aroo-0.052
 umbrella-0.050
adders-0.050
Token teeth
Token ID9941
Feature activation-3.35e-05

Pos logit contributions
 plunge+0.154
anger+0.148
adders+0.144
aroo+0.143
aro+0.137
Neg logit contributions
die-0.098
by-0.096
bo-0.094
 sav-0.091
ce-0.089
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
 plunge+0.000
anger+0.000
adders+0.000
aroo+0.000
 umbrella+0.000
Neg logit contributions
die-0.000
idi-0.000
ce-0.000
 sav-0.000
by-0.000
Token
Token ID198
Feature activation+3.63e-03

Pos logit contributions
ce+0.008
bo+0.008
 sav+0.008
die+0.008
aron+0.008
Neg logit contributions
 plunge-0.015
anger-0.014
aroo-0.014
 umbrella-0.013
adders-0.013
Token
Token ID198
Feature activation+5.87e-03

Pos logit contributions
die+0.003
by+0.003
ce+0.003
bo+0.003
 sav+0.003
Neg logit contributions
 plunge-0.009
anger-0.009
aroo-0.009
 umbrella-0.008
adders-0.008
TokenThe
Token ID464
Feature activation-3.61e-02

Pos logit contributions
die+0.007
by+0.007
ce+0.007
bo+0.007
 sav+0.007
Neg logit contributions
 plunge-0.013
anger-0.012
aroo-0.012
 umbrella-0.012
adders-0.012
Token snake
Token ID17522
Feature activation-3.35e-02

Pos logit contributions
 plunge+0.053
anger+0.048
aroo+0.046
 umbrella+0.045
adders+0.045
Neg logit contributions
die-0.069
ce-0.069
by-0.068
bo-0.068
 sav-0.068
Token his
Token ID465
Feature activation-6.04e-04

Pos logit contributions
 plunge+0.065
anger+0.060
aroo+0.060
 umbrella+0.059
adders+0.058
Neg logit contributions
die-0.047
ce-0.046
aron-0.046
by-0.046
 sav-0.045
Tokenses
Token ID8448
Feature activation-1.91e-01

Pos logit contributions
 plunge+0.001
anger+0.001
aroo+0.001
aro+0.001
adders+0.001
Neg logit contributions
by-0.001
die-0.001
 sav-0.001
bo-0.001
idi-0.001
Token.
Token ID13
Feature activation+1.58e-02

Pos logit contributions
 plunge+0.468
anger+0.444
aroo+0.438
 umbrella+0.432
adders+0.431
Neg logit contributions
die-0.170
by-0.166
ce-0.166
 sav-0.163
idi-0.163
Token L
Token ID406
Feature activation+3.47e-02

Pos logit contributions
ce+0.020
 sav+0.020
bo+0.020
die+0.020
aron+0.020
Neg logit contributions
 plunge-0.033
anger-0.030
aroo-0.030
 umbrella-0.029
adders-0.029
Tokenila
Token ID10102
Feature activation-3.21e-02

Pos logit contributions
ce+0.055
y+0.050
bo+0.050
aron+0.049
idi+0.049
Neg logit contributions
 plunge-0.065
ishly-0.061
 umbrella-0.059
vere-0.057
 hi-0.056
Token and
Token ID290
Feature activation+4.16e-04

Pos logit contributions
 plunge+0.082
anger+0.078
aroo+0.077
 umbrella+0.076
adders+0.075
Neg logit contributions
die-0.025
by-0.025
ce-0.025
bo-0.024
 sav-0.024
Token Tom
Token ID4186
Feature activation-2.61e-03

Pos logit contributions
 sav+0.001
die+0.001
by+0.001
ce+0.001
bo+0.001
Neg logit contributions
 plunge-0.001
anger-0.001
aroo-0.001
adders-0.001
 umbrella-0.001
Token it
Token ID340
Feature activation-1.64e-02

Pos logit contributions
 plunge+0.065
anger+0.060
aroo+0.059
adders+0.057
 umbrella+0.057
Neg logit contributions
ce-0.051
by-0.051
 sav-0.051
die-0.051
bo-0.051
Tokenâ
Token ID22940
Feature activation+3.69e-02

Pos logit contributions
 plunge+0.036
anger+0.034
aroo+0.033
 umbrella+0.033
adders+0.033
Neg logit contributions
die-0.019
ce-0.018
aron-0.018
by-0.018
 sav-0.018
Token€
Token ID26391
Feature activation+3.25e-02

Pos logit contributions
die+0.064
by+0.063
idi+0.060
 sav+0.059
ce+0.058
Neg logit contributions
 plunge-0.060
anger-0.057
aroo-0.054
 umbrella-0.051
adders-0.050
Token™
Token ID8151
Feature activation+1.83e-02

Pos logit contributions
die+0.056
by+0.055
ce+0.055
bo+0.055
 sav+0.055
Neg logit contributions
 plunge-0.052
anger-0.049
aroo-0.047
adders-0.046
 umbrella-0.046
Tokens
Token ID82
Feature activation-6.52e-03

Pos logit contributions
bo+0.029
ce+0.028
by+0.028
die+0.028
 sav+0.028
Neg logit contributions
 plunge-0.033
anger-0.030
aroo-0.029
 umbrella-0.029
adders-0.028
Token ours
Token ID16903
Feature activation-1.90e-01

Pos logit contributions
 plunge+0.013
anger+0.012
aroo+0.012
 umbrella+0.012
adders+0.011
Neg logit contributions
die-0.009
ce-0.009
bo-0.009
by-0.009
 sav-0.009
Tokenâ
Token ID22940
Feature activation+3.65e-02

Pos logit contributions
 plunge+0.424
anger+0.399
aroo+0.390
 umbrella+0.386
adders+0.381
Neg logit contributions
die-0.214
ce-0.213
bo-0.210
by-0.208
 sav-0.206
Token€
Token ID26391
Feature activation+3.17e-02

Pos logit contributions
die+0.063
by+0.062
idi+0.059
 sav+0.058
ce+0.058
Neg logit contributions
 plunge-0.059
anger-0.056
aroo-0.053
 umbrella-0.050
adders-0.050
Token.
Token ID13
Feature activation-2.85e-03

Pos logit contributions
die+0.056
by+0.054
bo+0.054
ce+0.054
idi+0.054
Neg logit contributions
 plunge-0.049
anger-0.047
aroo-0.045
adders-0.044
 umbrella-0.043
Token
Token ID198
Feature activation+8.91e-03

Pos logit contributions
 plunge+0.005
anger+0.005
aroo+0.005
 umbrella+0.004
adders+0.004
Neg logit contributions
by-0.004
die-0.004
ce-0.004
bo-0.004
 sav-0.004
Token
Token ID198
Feature activation+1.37e-02

Pos logit contributions
by+0.009
die+0.008
bo+0.008
ce+0.008
 sav+0.008
Neg logit contributions
 plunge-0.021
anger-0.020
aroo-0.020
adders-0.020
 umbrella-0.019
Token embarrassed
Token ID21100
Feature activation-9.41e-03

Pos logit contributions
 plunge+0.037
aroo+0.035
 umbrella+0.034
vere+0.033
adders+0.033
Neg logit contributions
ce-0.029
by-0.029
die-0.028
aron-0.028
bo-0.028
Token.
Token ID13
Feature activation+2.51e-02

Pos logit contributions
 plunge+0.017
anger+0.016
aroo+0.015
 umbrella+0.015
 hi+0.015
Neg logit contributions
die-0.016
bo-0.015
ce-0.015
by-0.015
idi-0.014
Token He
Token ID679
Feature activation+1.90e-02

Pos logit contributions
by+0.037
ce+0.035
die+0.034
bo+0.034
y+0.033
Neg logit contributions
 plunge-0.046
anger-0.044
 hi-0.043
aroo-0.042
ishly-0.041
Token smiled
Token ID13541
Feature activation+6.15e-02

Pos logit contributions
ce+0.024
die+0.024
aron+0.023
by+0.023
 late+0.022
Neg logit contributions
 plunge-0.038
 umbrella-0.036
aroo-0.036
anger-0.035
 hi-0.034
Token sheep
Token ID15900
Feature activation-6.20e-02

Pos logit contributions
ce+0.090
idi+0.085
bo+0.085
 sav+0.083
die+0.082
Neg logit contributions
 plunge-0.113
 hi-0.110
 umbrella-0.110
ishly-0.108
anger-0.105
Tokenishly
Token ID29735
Feature activation-1.90e-01

Pos logit contributions
 plunge+0.132
anger+0.123
aroo+0.121
 umbrella+0.119
adders+0.118
Neg logit contributions
die-0.076
ce-0.075
by-0.075
bo-0.075
 sav-0.074
Token at
Token ID379
Feature activation-2.89e-02

Pos logit contributions
 plunge+0.435
anger+0.406
 umbrella+0.400
aroo+0.394
 hi+0.392
Neg logit contributions
die-0.209
ce-0.206
bo-0.199
idi-0.195
by-0.195
Token his
Token ID465
Feature activation-4.88e-03

Pos logit contributions
 plunge+0.055
aroo+0.050
 umbrella+0.049
anger+0.049
aro+0.046
Neg logit contributions
ce-0.042
die-0.041
bo-0.041
 sav-0.041
aron-0.041
Token mom
Token ID1995
Feature activation+3.28e-02

Pos logit contributions
 plunge+0.009
anger+0.008
 umbrella+0.008
aroo+0.008
adders+0.007
Neg logit contributions
ce-0.008
bo-0.008
die-0.008
aron-0.008
 sav-0.008
Token,
Token ID11
Feature activation-5.56e-03

Pos logit contributions
idi+0.040
die+0.039
 sav+0.038
ce+0.037
Ar+0.036
Neg logit contributions
 umbrella-0.066
 plunge-0.066
ishly-0.065
 hi-0.060
aroo-0.060
Token who
Token ID508
Feature activation-3.43e-02

Pos logit contributions
 plunge+0.010
 umbrella+0.009
aroo+0.009
 hi+0.009
ishly+0.009
Neg logit contributions
by-0.009
ce-0.009
bo-0.008
die-0.008
idi-0.008
Token was
Token ID373
Feature activation+2.39e-02

Pos logit contributions
die+0.077
bo+0.073
ce+0.072
idi+0.070
aron+0.070
Neg logit contributions
 plunge-0.097
 umbrella-0.088
anger-0.087
aroo-0.085
adders-0.085
Token filled
Token ID5901
Feature activation+5.10e-02

Pos logit contributions
die+0.034
by+0.033
ce+0.033
bo+0.032
aron+0.032
Neg logit contributions
 plunge-0.046
aroo-0.042
 umbrella-0.042
anger-0.041
adders-0.041
Token with
Token ID351
Feature activation+4.80e-02

Pos logit contributions
die+0.079
bo+0.073
aron+0.072
by+0.071
idi+0.070
Neg logit contributions
 plunge-0.091
aroo-0.084
aro-0.077
vere-0.073
anger-0.073
Token gifts
Token ID13201
Feature activation+5.22e-02

Pos logit contributions
ce+0.075
die+0.074
bo+0.074
aron+0.072
idi+0.071
Neg logit contributions
 plunge-0.091
 umbrella-0.083
aroo-0.079
 hi-0.078
adders-0.078
Token for
Token ID329
Feature activation-5.24e-02

Pos logit contributions
die+0.054
bo+0.052
ce+0.052
idi+0.051
 sav+0.051
Neg logit contributions
 plunge-0.123
anger-0.116
aroo-0.113
 umbrella-0.113
adders-0.111
Token Ad
Token ID1215
Feature activation+1.51e-01

Pos logit contributions
 plunge+0.108
anger+0.101
aroo+0.099
 umbrella+0.097
adders+0.096
Neg logit contributions
die-0.069
bo-0.067
ce-0.066
by-0.066
aron-0.065
Tokenri
Token ID380
Feature activation+3.60e-02

Pos logit contributions
die+0.246
 sav+0.236
idi+0.231
aron+0.225
bo+0.221
Neg logit contributions
 plunge-0.265
anger-0.258
aroo-0.244
adders-0.238
aro-0.238
Tokenana
Token ID2271
Feature activation+1.20e-01

Pos logit contributions
die+0.050
 sav+0.048
by+0.047
idi+0.047
ce+0.047
Neg logit contributions
 plunge-0.070
anger-0.067
aroo-0.067
adders-0.063
aro-0.063
Token.
Token ID13
Feature activation+1.39e-02

Pos logit contributions
die+0.123
 sav+0.118
idi+0.116
ce+0.115
bo+0.113
Neg logit contributions
 plunge-0.273
anger-0.262
aroo-0.261
 umbrella-0.256
adders-0.251
Token She
Token ID1375
Feature activation-1.66e-02

Pos logit contributions
die+0.021
by+0.020
ce+0.019
bo+0.019
idi+0.019
Neg logit contributions
 plunge-0.026
anger-0.026
 hi-0.024
aroo-0.024
adders-0.024
Token was
Token ID373
Feature activation+2.35e-02

Pos logit contributions
 plunge+0.037
aroo+0.036
anger+0.035
 umbrella+0.034
aro+0.034
Neg logit contributions
die-0.017
bo-0.017
by-0.016
 late-0.016
ce-0.016
Token feel
Token ID1254
Feature activation+3.45e-02

Pos logit contributions
 plunge+0.018
anger+0.017
 umbrella+0.016
aroo+0.016
adders+0.016
Neg logit contributions
ce-0.013
die-0.013
idi-0.013
 sav-0.013
by-0.012
Token better
Token ID1365
Feature activation+5.20e-02

Pos logit contributions
ce+0.053
by+0.050
die+0.049
idi+0.049
aron+0.047
Neg logit contributions
 plunge-0.066
 umbrella-0.061
 hi-0.059
aroo-0.059
adders-0.058
Token."
Token ID526
Feature activation+4.11e-02

Pos logit contributions
idi+0.079
die+0.075
ce+0.074
by+0.072
 sav+0.071
Neg logit contributions
 plunge-0.095
 umbrella-0.088
aroo-0.088
anger-0.088
adders-0.087
Token
Token ID198
Feature activation+2.55e-02

Pos logit contributions
die+0.043
ce+0.042
by+0.042
bo+0.041
aron+0.041
Neg logit contributions
 plunge-0.093
aroo-0.089
anger-0.087
 umbrella-0.085
 hi-0.085
Token
Token ID198
Feature activation+2.78e-02

Pos logit contributions
by+0.026
ce+0.025
die+0.025
 sav+0.025
bo+0.025
Neg logit contributions
 plunge-0.057
aroo-0.055
anger-0.053
 umbrella-0.053
vere-0.052
TokenAnna
Token ID31160
Feature activation+1.31e-01

Pos logit contributions
by+0.034
en+0.034
ce+0.033
bo+0.033
die+0.033
Neg logit contributions
 plunge-0.055
aroo-0.055
aro-0.051
 umbrella-0.051
vere-0.051
Token and
Token ID290
Feature activation+2.33e-02

Pos logit contributions
 late+0.173
by+0.168
die+0.167
Grand+0.166
ce+0.162
Neg logit contributions
 plunge-0.226
ishly-0.225
 umbrella-0.217
 hi-0.214
anger-0.211
Token Mom
Token ID11254
Feature activation+4.89e-02

Pos logit contributions
die+0.029
ce+0.029
bo+0.029
by+0.029
aron+0.028
Neg logit contributions
 plunge-0.049
anger-0.046
aroo-0.045
 umbrella-0.045
adders-0.044
Token got
Token ID1392
Feature activation+5.11e-03

Pos logit contributions
die+0.082
ce+0.079
idi+0.077
by+0.077
 late+0.077
Neg logit contributions
 plunge-0.083
 umbrella-0.076
anger-0.073
aroo-0.072
adders-0.071
Token some
Token ID617
Feature activation-2.31e-02

Pos logit contributions
die+0.007
aron+0.007
ce+0.007
 sav+0.007
idi+0.007
Neg logit contributions
 plunge-0.010
 umbrella-0.010
aroo-0.009
adders-0.009
anger-0.009
Token bread
Token ID8509
Feature activation-4.46e-04

Pos logit contributions
 plunge+0.041
anger+0.038
 umbrella+0.037
aroo+0.037
adders+0.036
Neg logit contributions
die-0.037
bo-0.037
ce-0.036
by-0.036
 sav-0.036
Token test
Token ID1332
Feature activation+3.29e-02

Pos logit contributions
 plunge+0.115
anger+0.108
aroo+0.104
 umbrella+0.102
adders+0.099
Neg logit contributions
die-0.117
by-0.111
aron-0.110
bo-0.110
 sav-0.108
Token was
Token ID373
Feature activation+1.46e-02

Pos logit contributions
die+0.031
 sav+0.029
by+0.029
bo+0.029
ce+0.029
Neg logit contributions
 plunge-0.079
anger-0.076
aroo-0.075
 umbrella-0.073
adders-0.072
Token to
Token ID284
Feature activation+2.03e-02

Pos logit contributions
by+0.021
die+0.020
bo+0.020
ce+0.019
 sav+0.019
Neg logit contributions
 plunge-0.027
 umbrella-0.026
adders-0.026
aroo-0.025
anger-0.025
Token see
Token ID766
Feature activation-2.59e-02

Pos logit contributions
 sav+0.027
ce+0.027
by+0.026
bo+0.026
die+0.026
Neg logit contributions
 plunge-0.041
anger-0.039
aroo-0.038
adders-0.037
 umbrella-0.037
Token how
Token ID703
Feature activation+1.29e-02

Pos logit contributions
 plunge+0.051
anger+0.047
aroo+0.046
 umbrella+0.045
adders+0.045
Neg logit contributions
die-0.036
ce-0.036
by-0.035
bo-0.035
 sav-0.035
Token well
Token ID880
Feature activation+1.33e-01

Pos logit contributions
die+0.017
ce+0.016
bo+0.016
aron+0.016
 sav+0.016
Neg logit contributions
 plunge-0.027
anger-0.025
aroo-0.025
 umbrella-0.024
adders-0.024
Token Lisa
Token ID15378
Feature activation+1.25e-02

Pos logit contributions
die+0.183
bo+0.177
by+0.177
ce+0.175
 sav+0.175
Neg logit contributions
 plunge-0.264
anger-0.248
aroo-0.242
 umbrella-0.237
adders-0.233
Token could
Token ID714
Feature activation-2.16e-02

Pos logit contributions
die+0.013
by+0.012
 sav+0.012
ce+0.012
idi+0.012
Neg logit contributions
 plunge-0.028
anger-0.027
aroo-0.027
 umbrella-0.026
adders-0.026
Token do
Token ID466
Feature activation-3.17e-02

Pos logit contributions
 plunge+0.037
anger+0.035
aroo+0.034
aro+0.032
adders+0.032
Neg logit contributions
ce-0.035
 sav-0.034
bo-0.034
by-0.033
idi-0.033
Token certain
Token ID1728
Feature activation-6.71e-02

Pos logit contributions
 plunge+0.066
anger+0.061
aroo+0.060
 umbrella+0.059
adders+0.058
Neg logit contributions
ce-0.040
die-0.040
by-0.040
bo-0.040
 sav-0.040
Token things
Token ID1243
Feature activation+4.95e-02

Pos logit contributions
 plunge+0.134
anger+0.125
aroo+0.122
 umbrella+0.120
adders+0.119
Neg logit contributions
die-0.092
ce-0.090
bo-0.090
by-0.090
 sav-0.089
Token circles
Token ID13332
Feature activation+3.47e-02

Pos logit contributions
by+0.008
die+0.008
bo+0.008
ce+0.008
aron+0.007
Neg logit contributions
 plunge-0.011
 umbrella-0.011
aroo-0.010
 hi-0.010
aro-0.010
Token,
Token ID11
Feature activation+4.32e-03

Pos logit contributions
die+0.051
by+0.049
bo+0.049
 sav+0.048
idi+0.048
Neg logit contributions
 plunge-0.067
anger-0.063
aroo-0.061
 umbrella-0.061
adders-0.060
Token triangles
Token ID44360
Feature activation+1.39e-01

Pos logit contributions
by+0.006
die+0.006
bo+0.006
idi+0.006
athan+0.005
Neg logit contributions
 plunge-0.008
 umbrella-0.008
 tweet-0.008
aroo-0.008
 hi-0.007
Token,
Token ID11
Feature activation+2.97e-03

Pos logit contributions
die+0.201
idi+0.199
bo+0.195
 sav+0.194
by+0.194
Neg logit contributions
 plunge-0.267
anger-0.251
aroo-0.246
 umbrella-0.242
adders-0.242
Token and
Token ID290
Feature activation+1.75e-02

Pos logit contributions
by+0.004
die+0.004
idi+0.004
bo+0.004
 late+0.004
Neg logit contributions
 plunge-0.006
 umbrella-0.006
 tweet-0.005
aroo-0.005
adders-0.005
Token stars
Token ID5788
Feature activation+1.45e-01

Pos logit contributions
ce+0.026
by+0.026
aron+0.026
idi+0.026
bo+0.026
Neg logit contributions
 plunge-0.031
 umbrella-0.030
 hi-0.028
 tweet-0.027
aroo-0.027
Token.
Token ID13
Feature activation+2.23e-02

Pos logit contributions
die+0.192
idi+0.188
 sav+0.176
by+0.176
bo+0.169
Neg logit contributions
 plunge-0.301
anger-0.286
aroo-0.283
 umbrella-0.279
adders-0.273
Token She
Token ID1375
Feature activation-3.12e-03

Pos logit contributions
by+0.034
bo+0.034
ce+0.033
die+0.033
 sav+0.032
Neg logit contributions
 plunge-0.041
anger-0.040
adders-0.038
 umbrella-0.038
aroo-0.037
Token sees
Token ID7224
Feature activation+3.36e-04

Pos logit contributions
 plunge+0.005
 umbrella+0.005
adders+0.005
 yeah+0.004
 hi+0.004
Neg logit contributions
 late-0.005
aron-0.005
ce-0.005
by-0.005
bo-0.005
Token how
Token ID703
Feature activation+2.53e-02

Pos logit contributions
die+0.001
ce+0.001
bo+0.000
aron+0.000
by+0.000
Neg logit contributions
 plunge-0.001
 umbrella-0.001
anger-0.001
aroo-0.001
adders-0.001
Token Anna
Token ID11735
Feature activation+8.64e-02

Pos logit contributions
aron+0.045
die+0.040
ce+0.040
idi+0.040
 sav+0.039
Neg logit contributions
 plunge-0.044
adders-0.041
aro-0.040
 umbrella-0.039
aroo-0.039
Token same
Token ID976
Feature activation+4.26e-02

Pos logit contributions
ce+0.027
die+0.026
idi+0.026
bo+0.025
by+0.025
Neg logit contributions
 plunge-0.028
 umbrella-0.025
anger-0.025
aroo-0.024
adders-0.023
Token shapes
Token ID15268
Feature activation+1.52e-01

Pos logit contributions
die+0.065
bo+0.064
idi+0.063
ce+0.062
athan+0.062
Neg logit contributions
 plunge-0.074
 umbrella-0.073
anger-0.070
aroo-0.068
 hi-0.064
Token as
Token ID355
Feature activation+4.55e-03

Pos logit contributions
die+0.235
idi+0.223
ce+0.221
by+0.219
 sav+0.216
Neg logit contributions
 plunge-0.282
anger-0.268
aroo-0.267
 umbrella-0.265
adders-0.258
Token their
Token ID511
Feature activation+1.24e-02

Pos logit contributions
bo+0.007
aron+0.007
die+0.007
ce+0.007
 sav+0.007
Neg logit contributions
 plunge-0.008
aroo-0.008
adders-0.008
anger-0.007
aro-0.007
Token paper
Token ID3348
Feature activation+1.32e-01

Pos logit contributions
die+0.020
bo+0.020
ce+0.020
aron+0.019
idi+0.019
Neg logit contributions
 plunge-0.021
anger-0.020
 umbrella-0.020
aroo-0.019
adders-0.019
Token shapes
Token ID15268
Feature activation+1.49e-01

Pos logit contributions
die+0.160
idi+0.151
ce+0.144
 sav+0.143
by+0.141
Neg logit contributions
 plunge-0.284
anger-0.280
 umbrella-0.270
aroo-0.266
adders-0.259
Token.
Token ID13
Feature activation+2.47e-02

Pos logit contributions
die+0.191
idi+0.178
ce+0.173
by+0.169
 sav+0.168
Neg logit contributions
 plunge-0.321
anger-0.307
 umbrella-0.299
aroo-0.298
adders-0.296
Token Lily
Token ID20037
Feature activation+4.73e-02

Pos logit contributions
by+0.033
ce+0.031
die+0.031
bo+0.030
 sav+0.030
Neg logit contributions
 plunge-0.050
anger-0.048
adders-0.047
aroo-0.047
 umbrella-0.046
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
 late+0.071
die+0.063
by+0.062
bo+0.061
ce+0.058
Neg logit contributions
 umbrella-0.070
ishly-0.069
 plunge-0.068
aroo-0.065
 yes-0.065
Token Ben
Token ID3932
Feature activation+5.08e-02

Pos logit contributions
ce+0.016
aron+0.016
die+0.016
bo+0.016
by+0.016
Neg logit contributions
 plunge-0.022
 umbrella-0.020
aroo-0.020
anger-0.020
adders-0.020
Token smiled
Token ID13541
Feature activation+5.70e-02

Pos logit contributions
die+0.073
ce+0.070
 late+0.069
idi+0.066
by+0.066
Neg logit contributions
 plunge-0.098
 umbrella-0.091
ishly-0.087
 hi-0.087
adders-0.086
Token the
Token ID262
Feature activation+9.13e-04

Pos logit contributions
 plunge+0.089
anger+0.084
aroo+0.082
 umbrella+0.081
adders+0.080
Neg logit contributions
die-0.037
ce-0.036
bo-0.036
by-0.036
 sav-0.036
Token farm
Token ID5318
Feature activation-1.49e-02

Pos logit contributions
die+0.001
ce+0.001
 sav+0.001
by+0.001
bo+0.001
Neg logit contributions
 plunge-0.002
anger-0.002
aroo-0.002
 umbrella-0.002
adders-0.002
Token!
Token ID0
Feature activation-4.26e-02

Pos logit contributions
 plunge+0.033
aroo+0.031
anger+0.031
adders+0.030
 umbrella+0.030
Neg logit contributions
die-0.017
bo-0.016
by-0.016
idi-0.016
ce-0.016
Token<|endoftext|>
Token ID50256
Feature activation-3.38e-02

Pos logit contributions
 plunge+0.102
anger+0.098
aroo+0.096
 umbrella+0.094
adders+0.094
Neg logit contributions
ce-0.039
die-0.039
by-0.038
bo-0.037
idi-0.037
TokenOne
Token ID3198
Feature activation-9.95e-04

Pos logit contributions
 plunge+0.065
anger+0.061
aroo+0.059
adders+0.056
 umbrella+0.056
Neg logit contributions
by-0.046
die-0.045
 sav-0.044
ce-0.043
idi-0.042
Token day
Token ID1110
Feature activation+3.75e-02

Pos logit contributions
 plunge+0.002
aroo+0.002
anger+0.002
 umbrella+0.002
adders+0.002
Neg logit contributions
die-0.001
ce-0.001
aron-0.001
 sav-0.001
bo-0.001
Token,
Token ID11
Feature activation-2.16e-02

Pos logit contributions
die+0.062
aron+0.062
bo+0.060
 sav+0.060
ce+0.059
Neg logit contributions
 plunge-0.066
aroo-0.060
anger-0.058
 umbrella-0.056
 hi-0.055
Token Max
Token ID5436
Feature activation-4.84e-02

Pos logit contributions
 plunge+0.043
aroo+0.041
anger+0.039
 umbrella+0.038
 hi+0.037
Neg logit contributions
ce-0.030
aron-0.029
 sav-0.029
die-0.029
bo-0.029
Token and
Token ID290
Feature activation-2.01e-02

Pos logit contributions
 umbrella+0.086
 plunge+0.080
aroo+0.079
 hi+0.071
 raft+0.070
Neg logit contributions
aron-0.064
clock-0.063
ce-0.063
die-0.057
Grand-0.057
Token his
Token ID465
Feature activation-1.06e-02

Pos logit contributions
 plunge+0.050
aroo+0.047
anger+0.047
 umbrella+0.047
adders+0.045
Neg logit contributions
ce-0.017
die-0.017
bo-0.016
aron-0.016
 sav-0.016
Token Mom
Token ID11254
Feature activation+2.80e-02

Pos logit contributions
 plunge+0.023
anger+0.022
aro+0.022
aroo+0.021
adders+0.021
Neg logit contributions
by-0.013
die-0.012
bo-0.012
 sav-0.011
idi-0.011
TokenTom
Token ID13787
Feature activation+6.49e-02

Pos logit contributions
 plunge+0.037
anger+0.035
aroo+0.034
adders+0.032
aro+0.031
Neg logit contributions
die-0.036
by-0.035
 sav-0.034
aron-0.034
idi-0.034
Token and
Token ID290
Feature activation-1.65e-02

Pos logit contributions
 birth+0.083
ce+0.082
idi+0.079
Grand+0.079
 late+0.078
Neg logit contributions
 umbrella-0.108
 plunge-0.100
 are-0.100
ow-0.098
aroo-0.097
Token Lily
Token ID20037
Feature activation+3.70e-02

Pos logit contributions
 plunge+0.036
aroo+0.034
 umbrella+0.033
anger+0.033
 hi+0.033
Neg logit contributions
ce-0.020
die-0.019
aron-0.019
 sav-0.019
by-0.018
Token were
Token ID547
Feature activation-4.06e-02

Pos logit contributions
die+0.062
ce+0.062
idi+0.058
 late+0.057
aron+0.056
Neg logit contributions
 umbrella-0.058
 plunge-0.058
 hi-0.052
adders-0.051
ishly-0.051
Token playing
Token ID2712
Feature activation-1.98e-04

Pos logit contributions
 plunge+0.074
anger+0.067
 umbrella+0.067
adders+0.066
aroo+0.066
Neg logit contributions
ce-0.063
die-0.062
bo-0.062
aron-0.061
by-0.061
Token with
Token ID351
Feature activation+4.78e-02

Pos logit contributions
 plunge+0.000
anger+0.000
aroo+0.000
 umbrella+0.000
adders+0.000
Neg logit contributions
die-0.000
ce-0.000
by-0.000
bo-0.000
 sav-0.000
Token magnets
Token ID37446
Feature activation+5.56e-02

Pos logit contributions
aron+0.068
ce+0.066
bo+0.063
 sav+0.061
die+0.061
Neg logit contributions
 plunge-0.095
 umbrella-0.093
aroo-0.090
adders-0.087
anger-0.085
Token on
Token ID319
Feature activation-3.59e-02

Pos logit contributions
die+0.060
ce+0.058
idi+0.058
 sav+0.057
by+0.056
Neg logit contributions
 plunge-0.128
anger-0.121
aroo-0.121
 umbrella-0.117
adders-0.117
Token the
Token ID262
Feature activation+2.14e-03

Pos logit contributions
 plunge+0.068
 umbrella+0.064
anger+0.062
aroo+0.061
adders+0.061
Neg logit contributions
aron-0.051
bo-0.051
die-0.049
ce-0.048
by-0.047
Token fridge
Token ID25772
Feature activation+4.51e-02

Pos logit contributions
aron+0.004
ce+0.004
die+0.004
bo+0.004
by+0.004
Neg logit contributions
 plunge-0.004
 umbrella-0.003
anger-0.003
aroo-0.003
 raft-0.003
Token.
Token ID13
Feature activation-2.75e-03

Pos logit contributions
die+0.048
idi+0.045
ce+0.044
by+0.043
bo+0.043
Neg logit contributions
 plunge-0.104
anger-0.102
adders-0.098
 umbrella-0.098
aroo-0.097
Token but
Token ID475
Feature activation-4.24e-02

Pos logit contributions
 plunge+0.009
 umbrella+0.008
anger+0.008
adders+0.008
aroo+0.008
Neg logit contributions
bo-0.009
by-0.009
die-0.009
ce-0.009
idi-0.009
Token then
Token ID788
Feature activation+1.41e-02

Pos logit contributions
 plunge+0.084
anger+0.078
aroo+0.076
 umbrella+0.075
adders+0.074
Neg logit contributions
die-0.059
ce-0.058
bo-0.058
by-0.058
 sav-0.057
Token he
Token ID339
Feature activation+4.65e-02

Pos logit contributions
aron+0.021
bo+0.021
die+0.021
ce+0.021
 sav+0.020
Neg logit contributions
 plunge-0.026
anger-0.024
aroo-0.024
 umbrella-0.023
adders-0.023
Token noticed
Token ID6810
Feature activation+1.09e-02

Pos logit contributions
aron+0.060
die+0.058
ce+0.056
 sav+0.056
 late+0.054
Neg logit contributions
 plunge-0.091
anger-0.087
aroo-0.086
 umbrella-0.086
adders-0.083
Token something
Token ID1223
Feature activation+4.86e-02

Pos logit contributions
die+0.017
ce+0.017
bo+0.016
by+0.016
 sav+0.016
Neg logit contributions
 plunge-0.020
anger-0.019
aroo-0.018
 umbrella-0.018
adders-0.018
Token even
Token ID772
Feature activation+6.78e-02

Pos logit contributions
die+0.078
ce+0.075
by+0.075
bo+0.075
aron+0.074
Neg logit contributions
 plunge-0.090
 umbrella-0.083
anger-0.082
aroo-0.081
vere-0.078
Token stranger
Token ID16195
Feature activation+7.28e-02

Pos logit contributions
ce+0.100
aron+0.098
by+0.098
die+0.096
 sav+0.095
Neg logit contributions
 plunge-0.127
 umbrella-0.115
aroo-0.111
adders-0.111
 hi-0.108
Token -
Token ID532
Feature activation-1.28e-03

Pos logit contributions
die+0.092
ce+0.089
aron+0.089
bo+0.088
 sav+0.088
Neg logit contributions
 plunge-0.155
anger-0.143
aroo-0.143
 umbrella-0.140
 hi-0.136
Token a
Token ID257
Feature activation-1.62e-02

Pos logit contributions
 plunge+0.002
anger+0.002
aroo+0.002
 umbrella+0.002
adders+0.002
Neg logit contributions
by-0.002
ce-0.002
die-0.002
bo-0.002
aron-0.002
Token little
Token ID1310
Feature activation+1.99e-02

Pos logit contributions
 plunge+0.028
 umbrella+0.026
anger+0.025
aroo+0.025
 hi+0.024
Neg logit contributions
bo-0.026
ce-0.026
aron-0.026
by-0.026
die-0.026
Token car
Token ID1097
Feature activation-6.87e-02

Pos logit contributions
ce+0.034
bo+0.033
idi+0.033
aron+0.033
die+0.033
Neg logit contributions
 plunge-0.033
 umbrella-0.032
aroo-0.030
anger-0.030
 hi-0.029
Token scared
Token ID12008
Feature activation-4.09e-02

Pos logit contributions
by+0.031
ce+0.030
bo+0.029
die+0.029
aron+0.028
Neg logit contributions
 plunge-0.030
aroo-0.027
 umbrella-0.026
 hi-0.026
vere-0.025
Token,
Token ID11
Feature activation-8.78e-03

Pos logit contributions
 plunge+0.086
 umbrella+0.082
anger+0.081
aroo+0.079
 hi+0.077
Neg logit contributions
die-0.052
ce-0.050
bo-0.050
 sav-0.050
idi-0.049
Token but
Token ID475
Feature activation-4.94e-02

Pos logit contributions
 plunge+0.014
anger+0.013
aroo+0.013
 umbrella+0.012
adders+0.012
Neg logit contributions
die-0.015
ce-0.015
by-0.015
bo-0.015
 sav-0.015
Token she
Token ID673
Feature activation+3.41e-02

Pos logit contributions
 plunge+0.093
anger+0.085
aroo+0.085
 umbrella+0.083
adders+0.082
Neg logit contributions
die-0.073
ce-0.073
bo-0.072
by-0.072
 sav-0.071
Token tried
Token ID3088
Feature activation-2.86e-02

Pos logit contributions
die+0.040
by+0.040
ce+0.040
 late+0.040
aron+0.039
Neg logit contributions
 plunge-0.069
aroo-0.068
 umbrella-0.065
anger-0.064
aro-0.062
Token to
Token ID284
Feature activation+2.29e-02

Pos logit contributions
 plunge+0.059
 umbrella+0.055
anger+0.053
 hi+0.052
adders+0.051
Neg logit contributions
bo-0.038
aron-0.037
ce-0.036
 sav-0.035
die-0.035
Token stay
Token ID2652
Feature activation-6.08e-02

Pos logit contributions
 sav+0.035
die+0.035
idi+0.034
by+0.034
bo+0.033
Neg logit contributions
anger-0.043
 plunge-0.041
adders-0.040
aroo-0.039
ittens-0.039
Token brave
Token ID14802
Feature activation-4.40e-02

Pos logit contributions
 plunge+0.133
anger+0.125
aroo+0.123
 umbrella+0.121
adders+0.120
Neg logit contributions
die-0.071
ce-0.070
by-0.070
bo-0.070
 sav-0.069
Token.
Token ID13
Feature activation+2.86e-04

Pos logit contributions
 plunge+0.088
anger+0.082
 umbrella+0.080
adders+0.077
ishly+0.077
Neg logit contributions
idi-0.060
die-0.058
 sav-0.057
bo-0.055
ce-0.054
Token
Token ID198
Feature activation-6.39e-03

Pos logit contributions
die+0.000
bo+0.000
by+0.000
ce+0.000
aron+0.000
Neg logit contributions
 plunge-0.001
anger-0.001
aroo-0.000
adders-0.000
 umbrella-0.000
Token
Token ID198
Feature activation-3.14e-03

Pos logit contributions
 plunge+0.017
anger+0.016
aroo+0.016
 umbrella+0.015
adders+0.015
Neg logit contributions
die-0.005
ce-0.005
by-0.004
bo-0.004
 sav-0.004
Token ground
Token ID2323
Feature activation-3.22e-02

Pos logit contributions
die+0.000
ce+0.000
by+0.000
bo+0.000
aron+0.000
Neg logit contributions
 plunge-0.000
anger-0.000
 umbrella-0.000
aroo-0.000
adders-0.000
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
 plunge+0.074
anger+0.071
 umbrella+0.069
aroo+0.068
adders+0.068
Neg logit contributions
die-0.036
by-0.033
idi-0.033
bo-0.033
ce-0.033
Token She
Token ID1375
Feature activation-1.72e-02

Pos logit contributions
die+0.016
by+0.015
ce+0.015
bo+0.015
aron+0.015
Neg logit contributions
 plunge-0.019
anger-0.019
 umbrella-0.018
adders-0.018
aroo-0.017
Token picked
Token ID6497
Feature activation+2.30e-02

Pos logit contributions
 plunge+0.033
aroo+0.033
 umbrella+0.033
 hi+0.031
ittens+0.030
Neg logit contributions
 late-0.021
ce-0.021
aron-0.021
bo-0.020
die-0.019
Token it
Token ID340
Feature activation-8.84e-03

Pos logit contributions
die+0.040
ce+0.039
aron+0.039
bo+0.038
idi+0.038
Neg logit contributions
 plunge-0.039
anger-0.036
aroo-0.034
 umbrella-0.034
 hi-0.033
Token up
Token ID510
Feature activation+6.92e-02

Pos logit contributions
 plunge+0.022
anger+0.020
aroo+0.020
adders+0.020
aro+0.020
Neg logit contributions
by-0.008
die-0.008
ce-0.008
bo-0.008
 sav-0.008
Token and
Token ID290
Feature activation-6.75e-03

Pos logit contributions
die+0.071
bo+0.068
ce+0.068
by+0.064
idi+0.063
Neg logit contributions
 plunge-0.162
anger-0.154
 umbrella-0.153
aroo-0.153
adders-0.144
Token showed
Token ID3751
Feature activation+1.13e-02

Pos logit contributions
 plunge+0.014
 umbrella+0.013
anger+0.013
aroo+0.013
 hi+0.013
Neg logit contributions
ce-0.008
die-0.008
bo-0.008
aron-0.008
by-0.008
Token her
Token ID607
Feature activation-1.08e-02

Pos logit contributions
by+0.013
bo+0.013
 sav+0.012
idi+0.012
ce+0.012
Neg logit contributions
 plunge-0.025
anger-0.023
aroo-0.023
adders-0.023
ittens-0.022
Token mom
Token ID1995
Feature activation+2.35e-02

Pos logit contributions
 plunge+0.024
anger+0.022
adders+0.022
vere+0.021
aroo+0.021
Neg logit contributions
by-0.013
die-0.013
bo-0.012
 sav-0.012
idi-0.012
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
ce+0.039
idi+0.038
 sav+0.038
aron+0.037
die+0.036
Neg logit contributions
 umbrella-0.038
 plunge-0.035
 hi-0.032
aroo-0.031
anger-0.030
Token
Token ID198
Feature activation-7.84e-03

Pos logit contributions
 plunge+0.033
anger+0.031
aroo+0.031
 umbrella+0.030
adders+0.030
Neg logit contributions
die-0.010
ce-0.010
bo-0.010
by-0.010
 sav-0.010
TokenOne
Token ID3198
Feature activation+5.35e-03

Pos logit contributions
 plunge+0.015
aroo+0.014
anger+0.014
adders+0.014
 umbrella+0.013
Neg logit contributions
by-0.011
die-0.011
aron-0.010
bo-0.010
 sav-0.010
Token day
Token ID1110
Feature activation+3.75e-02

Pos logit contributions
die+0.005
 sav+0.005
aron+0.005
bo+0.005
ce+0.005
Neg logit contributions
 plunge-0.013
aroo-0.012
anger-0.012
 umbrella-0.012
adders-0.012
Token,
Token ID11
Feature activation-1.37e-02

Pos logit contributions
die+0.059
bo+0.057
 sav+0.056
aron+0.056
ce+0.056
Neg logit contributions
 plunge-0.070
aroo-0.064
anger-0.064
 umbrella-0.061
adders-0.060
Token the
Token ID262
Feature activation+4.43e-05

Pos logit contributions
 plunge+0.033
anger+0.031
aroo+0.030
 umbrella+0.030
adders+0.030
Neg logit contributions
ce-0.013
die-0.013
 sav-0.013
by-0.013
bo-0.013
Token veterinarian
Token ID41404
Feature activation-3.01e-02

Pos logit contributions
ce+0.000
die+0.000
 sav+0.000
bo+0.000
by+0.000
Neg logit contributions
 plunge-0.000
anger-0.000
aroo-0.000
adders-0.000
 umbrella-0.000
Token went
Token ID1816
Feature activation+7.06e-03

Pos logit contributions
 plunge+0.067
aroo+0.064
anger+0.064
 umbrella+0.063
adders+0.061
Neg logit contributions
die-0.033
aron-0.031
 sav-0.030
ce-0.030
bo-0.030
Token on
Token ID319
Feature activation-3.95e-02

Pos logit contributions
die+0.009
by+0.008
bo+0.008
ce+0.008
aron+0.008
Neg logit contributions
 plunge-0.015
anger-0.014
aroo-0.014
 umbrella-0.014
adders-0.014
Token a
Token ID257
Feature activation-1.91e-02

Pos logit contributions
 plunge+0.072
anger+0.065
 umbrella+0.065
aroo+0.063
adders+0.063
Neg logit contributions
bo-0.061
die-0.060
aron-0.059
by-0.058
ce-0.058
Token long
Token ID890
Feature activation+4.78e-02

Pos logit contributions
 plunge+0.035
anger+0.033
aroo+0.032
adders+0.031
aro+0.030
Neg logit contributions
die-0.030
by-0.029
 sav-0.029
ce-0.029
bo-0.029
Token journey
Token ID7002
Feature activation-7.18e-02

Pos logit contributions
die+0.076
bo+0.075
ce+0.075
aron+0.075
by+0.075
Neg logit contributions
 plunge-0.081
 umbrella-0.075
anger-0.075
aroo-0.074
adders-0.070
Token small
Token ID1402
Feature activation+5.21e-02

Pos logit contributions
 plunge+0.016
 umbrella+0.014
aroo+0.014
adders+0.014
anger+0.014
Neg logit contributions
aron-0.011
bo-0.011
by-0.011
ce-0.011
die-0.010
Token rocks
Token ID12586
Feature activation+1.87e-02

Pos logit contributions
bo+0.104
idi+0.102
die+0.102
ce+0.101
aron+0.100
Neg logit contributions
 plunge-0.074
 umbrella-0.068
adders-0.065
anger-0.065
aroo-0.059
Token,
Token ID11
Feature activation-1.29e-02

Pos logit contributions
die+0.016
ce+0.016
by+0.015
bo+0.015
 sav+0.015
Neg logit contributions
 plunge-0.047
anger-0.044
aroo-0.044
 umbrella-0.043
adders-0.043
Token sticks
Token ID16461
Feature activation-1.04e-02

Pos logit contributions
 plunge+0.018
 umbrella+0.016
aroo+0.015
adders+0.015
anger+0.015
Neg logit contributions
by-0.026
bo-0.026
ce-0.026
die-0.026
aron-0.025
Token and
Token ID290
Feature activation-9.03e-03

Pos logit contributions
 plunge+0.026
anger+0.024
aroo+0.024
 umbrella+0.023
adders+0.023
Neg logit contributions
die-0.009
ce-0.009
by-0.009
bo-0.009
 sav-0.009
Token leaves
Token ID5667
Feature activation-7.95e-03

Pos logit contributions
 plunge+0.015
 umbrella+0.013
adders+0.013
aroo+0.013
anger+0.012
Neg logit contributions
aron-0.016
ce-0.016
by-0.016
bo-0.016
die-0.015
Token and
Token ID290
Feature activation-9.58e-03

Pos logit contributions
 plunge+0.020
anger+0.019
aroo+0.018
 umbrella+0.018
adders+0.018
Neg logit contributions
die-0.007
ce-0.007
 sav-0.007
by-0.007
idi-0.007
Token anything
Token ID1997
Feature activation+6.11e-02

Pos logit contributions
 plunge+0.016
 umbrella+0.015
adders+0.014
aroo+0.014
anger+0.014
Neg logit contributions
aron-0.016
ce-0.016
die-0.016
bo-0.016
by-0.016
Token else
Token ID2073
Feature activation+3.27e-04

Pos logit contributions
die+0.116
aron+0.111
bo+0.111
ce+0.110
by+0.109
Neg logit contributions
 plunge-0.095
anger-0.084
adders-0.084
aroo-0.084
 umbrella-0.082
Token she
Token ID673
Feature activation+2.85e-02

Pos logit contributions
die+0.001
aron+0.001
ce+0.001
 sav+0.001
bo+0.001
Neg logit contributions
 plunge-0.001
anger-0.001
adders-0.001
aroo-0.001
 umbrella-0.001
Token thought
Token ID1807
Feature activation-1.00e-02

Pos logit contributions
aron+0.046
die+0.044
ce+0.043
 late+0.041
en+0.041
Neg logit contributions
 plunge-0.048
anger-0.045
adders-0.044
aroo-0.044
vere-0.042
Token.
Token ID13
Feature activation+1.94e-02

Pos logit contributions
idi+0.020
die+0.020
 sav+0.019
by+0.019
ce+0.018
Neg logit contributions
 plunge-0.033
 umbrella-0.033
anger-0.032
aroo-0.032
adders-0.031
Token
Token ID198
Feature activation+2.02e-02

Pos logit contributions
by+0.024
die+0.024
bo+0.023
ce+0.022
en+0.021
Neg logit contributions
 plunge-0.041
anger-0.041
aroo-0.038
adders-0.038
 umbrella-0.038
Token
Token ID198
Feature activation+2.30e-02

Pos logit contributions
die+0.016
by+0.016
ce+0.016
bo+0.016
 sav+0.016
Neg logit contributions
 plunge-0.051
anger-0.049
aroo-0.048
 umbrella-0.047
adders-0.047
Token"
Token ID1
Feature activation-7.84e-03

Pos logit contributions
die+0.034
by+0.034
bo+0.033
ce+0.033
 sav+0.033
Neg logit contributions
 plunge-0.042
anger-0.040
aroo-0.039
adders-0.039
 umbrella-0.038
TokenTom
Token ID13787
Feature activation+9.20e-02

Pos logit contributions
 plunge+0.015
aroo+0.015
anger+0.014
aro+0.014
adders+0.013
Neg logit contributions
by-0.011
die-0.011
ce-0.011
aron-0.011
 sav-0.011
Token,
Token ID11
Feature activation+9.80e-03

Pos logit contributions
idi+0.141
die+0.140
by+0.138
ce+0.136
 sav+0.133
Neg logit contributions
 plunge-0.158
aroo-0.152
anger-0.148
 umbrella-0.147
ishly-0.140
Token why
Token ID1521
Feature activation+8.40e-02

Pos logit contributions
bo+0.014
 sav+0.014
ce+0.013
idi+0.013
die+0.013
Neg logit contributions
 plunge-0.019
anger-0.018
aroo-0.017
vere-0.017
adders-0.017
Token did
Token ID750
Feature activation-6.24e-03

Pos logit contributions
die+0.139
ce+0.138
by+0.137
bo+0.137
 sav+0.136
Neg logit contributions
 plunge-0.144
anger-0.133
aroo-0.130
 umbrella-0.127
adders-0.126
Token you
Token ID345
Feature activation-7.74e-03

Pos logit contributions
 plunge+0.010
anger+0.010
adders+0.009
 umbrella+0.009
aroo+0.009
Neg logit contributions
bo-0.011
 sav-0.011
ce-0.011
by-0.011
idi-0.011
Token pull
Token ID2834
Feature activation-3.22e-02

Pos logit contributions
anger+0.015
adders+0.014
 plunge+0.014
aro+0.014
aroo+0.014
Neg logit contributions
bo-0.011
 sav-0.011
 asked-0.011
 added-0.011
 over-0.010
Token the
Token ID262
Feature activation+1.41e-02

Pos logit contributions
 plunge+0.071
anger+0.067
aroo+0.065
 umbrella+0.064
adders+0.064
Neg logit contributions
die-0.037
ce-0.037
bo-0.037
by-0.037
 sav-0.036
Token and
Token ID290
Feature activation-1.27e-02

Pos logit contributions
 plunge+0.130
 umbrella+0.125
anger+0.121
aroo+0.117
adders+0.116
Neg logit contributions
die-0.079
idi-0.076
ce-0.076
by-0.075
 sav-0.075
Token stepped
Token ID10764
Feature activation+4.97e-03

Pos logit contributions
 plunge+0.026
anger+0.025
aroo+0.024
adders+0.024
 umbrella+0.024
Neg logit contributions
die-0.016
bo-0.016
 sav-0.016
ce-0.016
by-0.016
Token back
Token ID736
Feature activation-7.03e-02

Pos logit contributions
bo+0.006
ce+0.006
 sav+0.006
die+0.006
by+0.006
Neg logit contributions
 plunge-0.010
anger-0.010
aroo-0.010
adders-0.009
aro-0.009
Token.
Token ID13
Feature activation+5.90e-03

Pos logit contributions
 plunge+0.164
anger+0.155
 umbrella+0.152
aroo+0.152
adders+0.148
Neg logit contributions
die-0.073
bo-0.072
ce-0.071
 sav-0.071
by-0.070
Token But
Token ID887
Feature activation-1.29e-02

Pos logit contributions
by+0.007
die+0.007
ce+0.007
bo+0.007
aron+0.007
Neg logit contributions
 plunge-0.013
anger-0.012
aroo-0.012
 umbrella-0.011
adders-0.011
Token then
Token ID788
Feature activation+8.53e-03

Pos logit contributions
 plunge+0.026
anger+0.025
aroo+0.024
adders+0.024
 umbrella+0.024
Neg logit contributions
ce-0.017
 sav-0.017
die-0.017
by-0.017
idi-0.016
Token,
Token ID11
Feature activation-1.09e-02

Pos logit contributions
die+0.013
aron+0.013
bo+0.013
ce+0.013
by+0.013
Neg logit contributions
 plunge-0.015
anger-0.014
aroo-0.014
 umbrella-0.014
adders-0.013
Token she
Token ID673
Feature activation+3.02e-02

Pos logit contributions
 plunge+0.023
anger+0.022
aroo+0.021
adders+0.021
 umbrella+0.020
Neg logit contributions
ce-0.014
 sav-0.014
by-0.014
die-0.014
bo-0.013
Token noticed
Token ID6810
Feature activation+5.00e-03

Pos logit contributions
aron+0.037
die+0.034
 sav+0.034
 late+0.034
ce+0.033
Neg logit contributions
 plunge-0.062
 umbrella-0.059
aroo-0.058
anger-0.057
 hi-0.055
Token something
Token ID1223
Feature activation+4.83e-02

Pos logit contributions
ce+0.007
die+0.007
by+0.007
 sav+0.007
bo+0.007
Neg logit contributions
 plunge-0.010
anger-0.009
aroo-0.009
adders-0.009
 umbrella-0.009
Token comfortable
Token ID6792
Feature activation-2.73e-02

Pos logit contributions
die+0.067
by+0.066
ce+0.066
bo+0.065
 sav+0.065
Neg logit contributions
 plunge-0.096
anger-0.089
aroo-0.088
 umbrella-0.087
adders-0.085
Token the
Token ID262
Feature activation+9.84e-03

Pos logit contributions
 plunge+0.010
anger+0.009
aroo+0.009
 umbrella+0.009
adders+0.009
Neg logit contributions
die-0.006
ce-0.005
bo-0.005
by-0.005
 sav-0.005
Token hose
Token ID31489
Feature activation+9.99e-03

Pos logit contributions
die+0.016
bo+0.016
ce+0.016
by+0.016
 sav+0.016
Neg logit contributions
 plunge-0.017
anger-0.016
 umbrella-0.015
aroo-0.015
adders-0.015
Token and
Token ID290
Feature activation+1.91e-02

Pos logit contributions
die+0.009
ce+0.008
by+0.008
bo+0.008
 sav+0.008
Neg logit contributions
 plunge-0.025
anger-0.024
aroo-0.023
 umbrella-0.023
adders-0.023
Token the
Token ID262
Feature activation+1.27e-02

Pos logit contributions
bo+0.024
by+0.024
ce+0.023
die+0.023
aron+0.023
Neg logit contributions
 plunge-0.041
 umbrella-0.038
aroo-0.038
anger-0.037
aro-0.037
Token water
Token ID1660
Feature activation+4.38e-02

Pos logit contributions
bo+0.023
ce+0.023
die+0.023
by+0.023
idi+0.023
Neg logit contributions
 plunge-0.020
 umbrella-0.018
anger-0.018
aroo-0.017
adders-0.017
Token and
Token ID290
Feature activation+1.77e-02

Pos logit contributions
die+0.049
by+0.047
idi+0.046
ce+0.045
bo+0.044
Neg logit contributions
 plunge-0.101
anger-0.095
 umbrella-0.094
aroo-0.094
adders-0.091
Token the
Token ID262
Feature activation+1.21e-02

Pos logit contributions
die+0.020
ce+0.020
bo+0.020
by+0.019
aron+0.019
Neg logit contributions
 plunge-0.040
aroo-0.037
anger-0.037
 umbrella-0.037
adders-0.036
Token tape
Token ID9154
Feature activation+6.40e-02

Pos logit contributions
bo+0.022
ce+0.022
die+0.022
by+0.022
idi+0.021
Neg logit contributions
 plunge-0.019
 umbrella-0.018
anger-0.017
aroo-0.017
adders-0.017
Token.
Token ID13
Feature activation+3.01e-02

Pos logit contributions
die+0.055
ce+0.053
by+0.053
bo+0.052
 sav+0.052
Neg logit contributions
 plunge-0.161
anger-0.153
aroo-0.150
 umbrella-0.149
adders-0.147
Token He
Token ID679
Feature activation+2.22e-02

Pos logit contributions
by+0.043
die+0.040
bo+0.040
ce+0.040
en+0.039
Neg logit contributions
anger-0.057
 plunge-0.057
aroo-0.055
adders-0.055
 hi-0.055
Token smiles
Token ID21845
Feature activation+7.78e-02

Pos logit contributions
die+0.032
by+0.032
aron+0.032
ce+0.032
bo+0.032
Neg logit contributions
 plunge-0.041
anger-0.038
 umbrella-0.038
aroo-0.038
adders-0.037
Token to
Token ID284
Feature activation+1.84e-02

Pos logit contributions
 plunge+0.124
 umbrella+0.109
 hi+0.109
anger+0.105
aroo+0.104
Neg logit contributions
aron-0.076
ce-0.075
die-0.073
bo-0.072
idi-0.071
Token be
Token ID307
Feature activation-2.70e-02

Pos logit contributions
by+0.027
bo+0.027
die+0.027
ce+0.027
 sav+0.027
Neg logit contributions
 plunge-0.034
anger-0.032
aroo-0.031
adders-0.031
 umbrella-0.030
Token your
Token ID534
Feature activation-1.26e-02

Pos logit contributions
 plunge+0.039
 hi+0.038
aroo+0.037
 umbrella+0.035
aro+0.035
Neg logit contributions
ce-0.047
aron-0.045
 sav-0.045
by-0.045
idi-0.045
Token best
Token ID1266
Feature activation-2.91e-02

Pos logit contributions
 plunge+0.021
anger+0.019
aroo+0.019
 umbrella+0.018
adders+0.018
Neg logit contributions
ce-0.022
die-0.021
bo-0.021
 sav-0.021
aron-0.021
Token friend
Token ID1545
Feature activation-8.63e-02

Pos logit contributions
 plunge+0.047
anger+0.043
aroo+0.042
 umbrella+0.042
adders+0.040
Neg logit contributions
ce-0.051
die-0.050
bo-0.050
 sav-0.050
idi-0.050
Token forever
Token ID8097
Feature activation-1.00e-01

Pos logit contributions
 plunge+0.167
anger+0.156
aroo+0.156
 umbrella+0.151
adders+0.150
Neg logit contributions
ce-0.119
die-0.118
 sav-0.117
by-0.117
idi-0.117
Token!"
Token ID2474
Feature activation-5.06e-03

Pos logit contributions
 plunge+0.171
 umbrella+0.151
anger+0.151
aroo+0.151
vere+0.147
Neg logit contributions
idi-0.164
ce-0.160
die-0.158
bo-0.158
 sav-0.156
Token␣
Token ID220
Feature activation-1.31e-02

Pos logit contributions
 plunge+0.011
anger+0.011
 umbrella+0.011
aroo+0.011
ittens+0.010
Neg logit contributions
ce-0.005
die-0.005
bo-0.005
idi-0.005
by-0.005
Token\n
Token ID198
Feature activation+2.85e-03

Pos logit contributions
 plunge+0.033
anger+0.032
aroo+0.031
adders+0.031
 umbrella+0.030
Neg logit contributions
die-0.011
ce-0.011
by-0.011
 sav-0.011
bo-0.010
Token\n
Token ID198
Feature activation+4.97e-03

Pos logit contributions
die+0.002
ce+0.002
by+0.002
bo+0.002
 sav+0.002
Neg logit contributions
 plunge-0.007
anger-0.007
aroo-0.007
 umbrella-0.006
adders-0.006
TokenSuddenly
Token ID38582
Feature activation+2.96e-02

Pos logit contributions
bo+0.007
by+0.007
ce+0.007
die+0.007
aron+0.007
Neg logit contributions
 plunge-0.009
aroo-0.008
anger-0.008
vere-0.008
 umbrella-0.008
Token a
Token ID257
Feature activation-2.30e-02

Pos logit contributions
 plunge+0.089
aroo+0.081
anger+0.081
 umbrella+0.081
adders+0.079
Neg logit contributions
ce-0.067
bo-0.067
by-0.067
 sav-0.066
die-0.065
Token rock
Token ID3881
Feature activation-8.13e-03

Pos logit contributions
 plunge+0.042
anger+0.038
aroo+0.037
 umbrella+0.037
 hi+0.035
Neg logit contributions
ce-0.037
bo-0.037
aron-0.036
die-0.036
 sav-0.035
Token at
Token ID379
Feature activation-3.69e-02

Pos logit contributions
 plunge+0.018
anger+0.016
adders+0.016
aroo+0.016
 umbrella+0.016
Neg logit contributions
bo-0.009
by-0.009
ce-0.009
 sav-0.009
die-0.009
Token all
Token ID477
Feature activation-3.36e-02

Pos logit contributions
 plunge+0.069
anger+0.066
aroo+0.064
adders+0.063
 umbrella+0.062
Neg logit contributions
die-0.054
by-0.053
 sav-0.053
bo-0.053
ce-0.053
Token!
Token ID0
Feature activation-3.66e-02

Pos logit contributions
 plunge+0.073
aroo+0.067
anger+0.066
 umbrella+0.065
aro+0.064
Neg logit contributions
die-0.042
ce-0.042
aron-0.041
bo-0.041
 sav-0.041
Token It
Token ID632
Feature activation-9.52e-02

Pos logit contributions
 plunge+0.072
anger+0.068
aroo+0.066
 umbrella+0.065
aro+0.065
Neg logit contributions
die-0.051
 sav-0.050
bo-0.049
idi-0.049
by-0.048
Token was
Token ID373
Feature activation+1.51e-02

Pos logit contributions
 plunge+0.238
aroo+0.223
anger+0.223
 umbrella+0.218
adders+0.213
Neg logit contributions
die-0.081
ce-0.080
by-0.080
aron-0.078
 sav-0.077
Token a
Token ID257
Feature activation-2.14e-02

Pos logit contributions
by+0.025
ce+0.025
bo+0.025
aron+0.024
 sav+0.023
Neg logit contributions
 plunge-0.026
 umbrella-0.024
aroo-0.023
anger-0.022
adders-0.022
Token mean
Token ID1612
Feature activation+9.83e-03

Pos logit contributions
 plunge+0.040
anger+0.036
aroo+0.036
 umbrella+0.036
vere+0.033
Neg logit contributions
aron-0.032
ce-0.032
bo-0.032
die-0.032
by-0.031
Token old
Token ID1468
Feature activation+7.70e-02

Pos logit contributions
die+0.015
by+0.015
ce+0.015
bo+0.015
 sav+0.015
Neg logit contributions
 plunge-0.018
anger-0.016
aroo-0.016
adders-0.016
 umbrella-0.015
Token crocod
Token ID37565
Feature activation-2.63e-02

Pos logit contributions
die+0.132
by+0.131
ce+0.131
bo+0.130
 sav+0.130
Neg logit contributions
 plunge-0.126
anger-0.115
aroo-0.112
adders-0.110
 umbrella-0.109
Token before
Token ID878
Feature activation+3.86e-02

Pos logit contributions
 plunge+0.025
aroo+0.023
anger+0.023
 umbrella+0.023
 hi+0.022
Neg logit contributions
by-0.014
ce-0.014
bo-0.014
die-0.014
idi-0.013
Token riding
Token ID10311
Feature activation+1.56e-02

Pos logit contributions
die+0.058
bo+0.058
by+0.057
aron+0.057
ce+0.057
Neg logit contributions
 plunge-0.073
anger-0.066
aroo-0.066
 umbrella-0.064
vere-0.063
Token off
Token ID572
Feature activation-3.04e-02

Pos logit contributions
die+0.016
ce+0.016
by+0.016
bo+0.016
 sav+0.016
Neg logit contributions
 plunge-0.036
anger-0.034
aroo-0.033
 umbrella-0.032
adders-0.032
Token on
Token ID319
Feature activation-4.04e-02

Pos logit contributions
 plunge+0.061
aroo+0.057
anger+0.057
aro+0.055
 umbrella+0.054
Neg logit contributions
die-0.043
bo-0.041
by-0.040
aron-0.040
ce-0.040
Token his
Token ID465
Feature activation-6.93e-03

Pos logit contributions
 plunge+0.090
aroo+0.082
anger+0.081
 umbrella+0.081
adders+0.080
Neg logit contributions
bo-0.047
die-0.046
ce-0.046
aron-0.045
by-0.045
Token horse
Token ID8223
Feature activation-1.04e-01

Pos logit contributions
 plunge+0.012
anger+0.011
aroo+0.011
adders+0.010
aro+0.010
Neg logit contributions
die-0.012
by-0.012
 sav-0.012
ce-0.011
bo-0.011
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
 plunge+0.246
anger+0.232
aroo+0.227
adders+0.223
 umbrella+0.221
Neg logit contributions
by-0.103
ce-0.103
die-0.103
bo-0.103
 sav-0.102
Token He
Token ID679
Feature activation+7.72e-03

Pos logit contributions
by+0.014
ce+0.014
die+0.014
bo+0.014
y+0.013
Neg logit contributions
 plunge-0.019
anger-0.018
aroo-0.018
adders-0.017
 hi-0.017
Token will
Token ID481
Feature activation+1.77e-02

Pos logit contributions
die+0.008
ce+0.008
aron+0.007
bo+0.007
by+0.007
Neg logit contributions
 plunge-0.018
aroo-0.017
anger-0.017
 umbrella-0.016
aro-0.016
Token always
Token ID1464
Feature activation+1.04e-01

Pos logit contributions
 sav+0.020
bo+0.019
ce+0.019
by+0.019
die+0.019
Neg logit contributions
 plunge-0.040
anger-0.038
aroo-0.037
adders-0.036
 umbrella-0.036
Token be
Token ID307
Feature activation-2.40e-02

Pos logit contributions
die+0.197
ce+0.196
by+0.195
bo+0.195
 sav+0.195
Neg logit contributions
 plunge-0.151
anger-0.138
aroo-0.134
 umbrella-0.129
adders-0.128
Token see
Token ID766
Feature activation-2.11e-02

Pos logit contributions
 sav+0.033
by+0.030
bo+0.030
 asked+0.030
idi+0.029
Neg logit contributions
anger-0.059
 plunge-0.057
adders-0.055
vere-0.055
fs-0.055
Token what
Token ID644
Feature activation+2.40e-02

Pos logit contributions
anger+0.052
 plunge+0.051
adders+0.047
vere+0.047
ittens+0.047
Neg logit contributions
 sav-0.019
by-0.019
die-0.019
bo-0.018
aron-0.017
Token is
Token ID318
Feature activation+1.36e-02

Pos logit contributions
aron+0.039
die+0.038
ce+0.037
 sav+0.036
idi+0.036
Neg logit contributions
 plunge-0.043
aroo-0.040
 umbrella-0.038
aro-0.038
 hi-0.037
Token in
Token ID287
Feature activation+2.23e-02

Pos logit contributions
die+0.019
by+0.019
ce+0.019
bo+0.018
 sav+0.018
Neg logit contributions
 plunge-0.027
anger-0.025
aroo-0.025
 umbrella-0.025
adders-0.024
Token the
Token ID262
Feature activation+5.17e-03

Pos logit contributions
die+0.023
ce+0.023
bo+0.023
by+0.023
 sav+0.022
Neg logit contributions
 plunge-0.052
anger-0.049
aroo-0.048
 umbrella-0.047
adders-0.047
Token sack
Token ID23704
Feature activation-9.97e-02

Pos logit contributions
 sav+0.010
die+0.010
by+0.010
bo+0.010
 late+0.010
Neg logit contributions
 plunge-0.007
anger-0.007
aro-0.007
aroo-0.007
adders-0.007
Token.
Token ID13
Feature activation+5.11e-03

Pos logit contributions
 plunge+0.294
anger+0.283
aroo+0.277
aro+0.274
ittens+0.272
Neg logit contributions
bo-0.036
 sav-0.034
y-0.033
ce-0.031
 late-0.028
Token\n
Token ID198
Feature activation+1.66e-03

Pos logit contributions
die+0.005
ce+0.005
by+0.005
bo+0.005
 sav+0.005
Neg logit contributions
 plunge-0.012
anger-0.011
aroo-0.011
 umbrella-0.011
adders-0.011
Token\n
Token ID198
Feature activation+4.56e-03

Pos logit contributions
die+0.001
ce+0.001
by+0.001
bo+0.001
 sav+0.001
Neg logit contributions
 plunge-0.004
anger-0.004
aroo-0.004
 umbrella-0.004
adders-0.004
Token"
Token ID1
Feature activation-2.25e-02

Pos logit contributions
die+0.006
ce+0.006
by+0.006
bo+0.006
 sav+0.005
Neg logit contributions
 plunge-0.010
anger-0.009
aroo-0.009
 umbrella-0.009
adders-0.009
TokenLet
Token ID5756
Feature activation+3.03e-02

Pos logit contributions
 plunge+0.053
aroo+0.049
anger+0.048
 umbrella+0.048
aro+0.047
Neg logit contributions
die-0.024
by-0.024
ce-0.023
aron-0.023
bo-0.022
Token t
Token ID256
Feature activation-9.02e-03

Pos logit contributions
by+0.003
die+0.003
 sav+0.003
ce+0.003
bo+0.003
Neg logit contributions
 plunge-0.004
anger-0.004
aroo-0.003
vere-0.003
aro-0.003
Tokeneddy
Token ID21874
Feature activation+1.30e-01

Pos logit contributions
 plunge+0.028
aroo+0.025
anger+0.025
vere+0.025
adders+0.025
Neg logit contributions
by-0.004
bo-0.004
die-0.003
y-0.003
ce-0.003
Token bear
Token ID6842
Feature activation-4.48e-02

Pos logit contributions
 sav+0.199
ce+0.198
die+0.198
idi+0.196
by+0.196
Neg logit contributions
 plunge-0.228
aroo-0.214
anger-0.213
 umbrella-0.211
adders-0.199
Token,
Token ID11
Feature activation+3.93e-03

Pos logit contributions
anger+0.091
 plunge+0.091
 umbrella+0.091
aroo+0.091
adders+0.088
Neg logit contributions
die-0.053
idi-0.052
 sav-0.051
ce-0.050
by-0.050
Token her
Token ID607
Feature activation+8.74e-04

Pos logit contributions
ce+0.004
by+0.004
die+0.004
bo+0.004
 sav+0.004
Neg logit contributions
 plunge-0.009
anger-0.009
aroo-0.009
 umbrella-0.009
adders-0.009
Token ball
Token ID2613
Feature activation-9.36e-02

Pos logit contributions
die+0.001
by+0.001
ce+0.001
 sav+0.001
bo+0.001
Neg logit contributions
 plunge-0.002
anger-0.001
aroo-0.001
adders-0.001
aro-0.001
Token and
Token ID290
Feature activation+1.85e-02

Pos logit contributions
 plunge+0.211
anger+0.203
 umbrella+0.199
aroo+0.198
adders+0.195
Neg logit contributions
die-0.101
ce-0.099
idi-0.099
by-0.097
 sav-0.096
Token her
Token ID607
Feature activation+6.83e-05

Pos logit contributions
ce+0.025
aron+0.024
by+0.024
bo+0.024
die+0.023
Neg logit contributions
 plunge-0.037
 umbrella-0.036
aroo-0.035
adders-0.034
anger-0.034
Token book
Token ID1492
Feature activation-4.24e-02

Pos logit contributions
die+0.000
by+0.000
bo+0.000
 sav+0.000
ce+0.000
Neg logit contributions
 plunge-0.000
anger-0.000
aroo-0.000
adders-0.000
aro-0.000
Token.
Token ID13
Feature activation+2.41e-02

Pos logit contributions
 plunge+0.099
anger+0.097
 umbrella+0.094
adders+0.094
aroo+0.092
Neg logit contributions
die-0.043
ce-0.041
idi-0.040
by-0.040
 sav-0.039
Token They
Token ID1119
Feature activation+2.47e-02

Pos logit contributions
by+0.036
bo+0.035
ce+0.035
die+0.033
 sav+0.032
Neg logit contributions
 plunge-0.043
anger-0.043
 hi-0.041
aroo-0.041
adders-0.041