TOP ACTIVATIONS
MAX = 0.170

until it got darker . Rh onda was scared but Alex
named Cr ust y . Cr ust y lived in a
strong and happy today !" Qu ackers smiled and said ,
beautiful colors . Alex and Rh onda were so excited to
two friends named Alex and Rh onda . Alex wanted to
bit of magic . <|endoftext|> Isa ac was preparing breakfast in
lived a nice bear named Che wy . Che wy had
Qu ackers was safe . Qu ackers thanked the little girl
" Don 't worry , Rh in oc eros !" The
a little . Cr ack ! The
was a little crab named Cr ust y . Cr ust
bear named Che wy . Che wy had a depend able
day , an explorer named Che e wanted to study a
able to do more . Del ight ed , the insect
From that day on , Qu ackers and the little girl
duck named Qu ackers . Qu ackers lived on a farm
girl named Gr acy . Gr acy loved to run around
't know . Luc as started to complain "
creatures . One day , Cr ust y found a shiny
a gr umpy bear called Gr umbles . He used to

MIN ACTIVATIONS
MIN = -0.120

they could go check the mailbox again . Her mom said
Mama to take out the journal and write something down .
tools to create his master pieces . <|endoftext|> Once upon a
you for not touching the file . Thank you for being
better way to repair the TV together ." They
the mail out of the mailbox . And you should not
took turns to stir the batter with a wooden spoon .
big stick to break the vine . Lily looked around and
meant she could play the piano again . <|endoftext|> Once upon
if she could turn the journal into what she wanted ?
You need to remove the glue and put it away ,
happy to have read the newspaper . <|endoftext|> Once upon a
shouldn 't have brought the glue . Now we can 't
, let 's get your journal so you can write about
we need to pack our diary and some cr ay ons
the beautiful things in this journal , and it doesn 't
. We can cut the paper into strips and write our
were sorry for measuring the radio . Their mom said it
help Mom to pour the batter into a pan . Mom
them sprinkle them over the batter . Mom poured

INTERVAL 0.097 - 0.170
CONTAINS 0.430%

about the hurt . <|endoftext|> L ily and Ben were twins
As he was r aking the leaves , Jack
who smiled back . The w affle maker made a pretty
your friends ." L ily walked with mom to
a threat !" L ily and Ben did not

INTERVAL 0.025 - 0.097
CONTAINS 21.856%

very creative and smart ." S ara and Ben
and said , " Don 't worry , Lily . We
dad 's hand . She felt a little better . <|endoftext|>
Tim my was very excited to eat the avocado
- the hole was much deeper than it had expected !

INTERVAL -0.047 - 0.025
CONTAINS 73.576%

Jack said , " Let 's fill the leak ".
He couldn 't find it anywhere . He looked and looked
upon . Everything was so exciting ! Suddenly
and look for things to put in their net .
He ran outside and enjoyed playing with his toys in the

INTERVAL -0.120 - -0.047
CONTAINS 4.138%

the most things in the house was the winner .
the box she acted like it was such a special surprise
saw something blue in the sky . It was so far
for watering the flowers . It is very hard to turn
she saw a white hat inside the bag . When she
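
NOTE ON THE INTERVALS
The four interval boundaries above appear consistent with splitting the range from MIN (-0.120) to MAX (0.170) into four equal-width bins, and the four CONTAINS percentages sum to 100%. The sketch below checks that arithmetic. It is an illustration only: the helper name `equal_width_edges` is hypothetical, the equal-width assumption is inferred from the printed numbers rather than stated by the dashboard, and the inputs are the rounded values shown here, so the edges can differ in the last digit.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Rounded MIN and MAX activations as printed in this dump.
edges = equal_width_edges(-0.120, 0.170, 4)

# CONTAINS percentages as printed, ordered from the lowest interval to the highest.
shares = [4.138, 73.576, 21.856, 0.430]

for (start, end), share in zip(zip(edges, edges[1:]), shares):
    print(f"{start:+.4f} to {end:+.4f}: {share:.3f}%")
print(f"total: {sum(shares):.3f}%")
```

Under this assumption each bin is 0.0725 wide, and the computed inner edges agree with the printed boundaries to within rounding.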
Token until
Token ID1566
Feature activation-1.72e-02

Pos logit contributions
urious+0.031
elly+0.030
ude+0.030
resa+0.030
ida+0.030
Neg logit contributions
 he-0.021
 heal-0.021
 back-0.018
 bed-0.018
 drown-0.018
Token it
Token ID340
Feature activation-2.47e-02

Pos logit contributions
 heal+0.023
 bleed+0.018
 shield+0.015
 protect+0.015
 back+0.015
Neg logit contributions
elia-0.034
iley-0.034
ida-0.033
oph-0.033
resa-0.033
Token got
Token ID1392
Feature activation+2.97e-02

Pos logit contributions
 he+0.038
 back+0.034
 out+0.031
 heal+0.031
 his+0.029
Neg logit contributions
urious-0.053
elly-0.051
ubby-0.049
resa-0.049
illing-0.048
Token darker
Token ID18646
Feature activation-1.49e-03

Pos logit contributions
urious+0.064
elly+0.060
ubby+0.058
illing+0.056
ude+0.056
Neg logit contributions
 he-0.042
 heal-0.038
 back-0.035
 bed-0.033
 out-0.032
Token.
Token ID13
Feature activation-3.18e-02

Pos logit contributions
 he+0.002
 heal+0.002
 back+0.002
 out+0.002
 they+0.001
Neg logit contributions
urious-0.003
elly-0.003
ubby-0.003
ude-0.003
illing-0.003
Token Rh
Token ID10323
Feature activation+1.70e-01

Pos logit contributions
 he+0.046
 back+0.040
 heal+0.038
 out+0.037
 away+0.034
Neg logit contributions
urious-0.072
elly-0.069
ubby-0.066
resa-0.066
illing-0.065
Tokenonda
Token ID13533
Feature activation+4.05e-02

Pos logit contributions
 blond+0.232
illing+0.225
 blonde+0.224
elly+0.212
ited+0.208
Neg logit contributions
 back-0.279
 away-0.258
 again-0.249
 out-0.247
 helicopter-0.243
Token was
Token ID373
Feature activation+3.55e-02

Pos logit contributions
urious+0.079
elly+0.075
ubby+0.072
resa+0.071
ude+0.070
Neg logit contributions
 he-0.070
 back-0.062
 heal-0.060
 out-0.059
 his-0.056
Token scared
Token ID12008
Feature activation+4.06e-02

Pos logit contributions
urious+0.065
illing+0.061
ubby+0.061
elly+0.060
ude+0.059
Neg logit contributions
 he-0.059
 heal-0.052
 back-0.048
 bed-0.046
 they-0.046
Token but
Token ID475
Feature activation-5.40e-03

Pos logit contributions
elly+0.084
urious+0.083
ubby+0.080
illing+0.077
ude+0.076
Neg logit contributions
 he-0.058
 heal-0.055
 back-0.053
 out-0.048
 bed-0.047
Token Alex
Token ID4422
Feature activation+5.77e-02

Pos logit contributions
 he+0.006
 back+0.005
 heal+0.005
 out+0.005
 bed+0.005
Neg logit contributions
urious-0.013
resa-0.012
elly-0.012
ida-0.012
ubby-0.012
Token named
Token ID3706
Feature activation+2.37e-02

Pos logit contributions
urious+0.076
elly+0.075
ubby+0.069
illing+0.069
resa+0.066
Neg logit contributions
 he-0.072
 back-0.069
 heal-0.066
 out-0.064
 anything-0.063
Token Cr
Token ID3864
Feature activation+1.61e-01

Pos logit contributions
urious+0.054
ubby+0.053
elly+0.053
resa+0.053
ody+0.053
Neg logit contributions
 he-0.025
 heal-0.024
 back-0.024
 away-0.022
 out-0.021
Tokenust
Token ID436
Feature activation+8.21e-02

Pos logit contributions
resa+0.194
ised+0.184
elia+0.180
 types+0.179
urious+0.178
Neg logit contributions
 back-0.320
 he-0.318
 out-0.308
 heal-0.291
 fall-0.276
Tokeny
Token ID88
Feature activation+5.66e-02

Pos logit contributions
urious+0.088
resa+0.080
elly+0.079
ubby+0.074
illing+0.073
Neg logit contributions
 he-0.215
 back-0.202
 out-0.195
 his-0.189
 down-0.189
Token.
Token ID13
Feature activation-8.03e-03

Pos logit contributions
urious+0.161
elly+0.157
ubby+0.153
resa+0.151
illing+0.150
Neg logit contributions
 he-0.046
 back-0.036
 out-0.031
 heal-0.031
 down-0.026
Token Cr
Token ID3864
Feature activation+1.68e-01

Pos logit contributions
 he+0.010
 back+0.009
 out+0.008
 his+0.008
 heal+0.007
Neg logit contributions
urious-0.020
elly-0.019
ubby-0.019
illing-0.018
resa-0.018
Tokenust
Token ID436
Feature activation+8.76e-02

Pos logit contributions
resa+0.231
iley+0.212
ited+0.211
ya+0.203
elia+0.201
Neg logit contributions
 back-0.300
 out-0.287
 he-0.269
 heal-0.259
 away-0.254
Tokeny
Token ID88
Feature activation+6.33e-02

Pos logit contributions
resa+0.107
urious+0.105
ami+0.100
elly+0.098
ookie+0.098
Neg logit contributions
 he-0.216
 back-0.201
 out-0.195
 his-0.191
 away-0.190
Token lived
Token ID5615
Feature activation+5.90e-02

Pos logit contributions
urious+0.120
elly+0.114
ubby+0.110
resa+0.108
illing+0.107
Neg logit contributions
 he-0.113
 back-0.101
 out-0.096
 heal-0.096
 his-0.090
Token in
Token ID287
Feature activation-3.80e-03

Pos logit contributions
urious+0.154
elly+0.149
ubby+0.145
resa+0.144
illing+0.142
Neg logit contributions
 he-0.063
 back-0.052
 heal-0.047
 out-0.047
 his-0.041
Token a
Token ID257
Feature activation-3.12e-03

Pos logit contributions
 he+0.005
 back+0.004
 out+0.004
 heal+0.004
 his+0.003
Neg logit contributions
urious-0.009
elly-0.009
ubby-0.009
resa-0.009
illing-0.009
Token strong
Token ID1913
Feature activation+7.80e-03

Pos logit contributions
urious+0.045
elly+0.043
ubby+0.042
resa+0.041
illing+0.040
Neg logit contributions
 he-0.038
 back-0.034
 out-0.032
 heal-0.032
 down-0.030
Token and
Token ID290
Feature activation-2.95e-02

Pos logit contributions
elly+0.017
urious+0.016
ubby+0.016
illing+0.015
ude+0.015
Neg logit contributions
 he-0.011
 back-0.010
 heal-0.010
 out-0.009
 bed-0.009
Token happy
Token ID3772
Feature activation+3.34e-03

Pos logit contributions
 he+0.045
 back+0.043
 out+0.041
 heal+0.040
 bed+0.039
Neg logit contributions
urious-0.054
ubby-0.050
elly-0.050
 types-0.048
ida-0.048
Token today
Token ID1909
Feature activation-2.07e-02

Pos logit contributions
elly+0.007
urious+0.007
ubby+0.006
iling+0.006
ude+0.006
Neg logit contributions
 heal-0.005
 he-0.004
 back-0.004
 away-0.004
 bed-0.004
Token!"
Token ID2474
Feature activation-3.50e-03

Pos logit contributions
 he+0.029
 heal+0.028
 back+0.025
 Zig+0.025
 protect+0.024
Neg logit contributions
elly-0.042
urious-0.041
ubby-0.039
elia-0.038
ude-0.038
Token Qu
Token ID2264
Feature activation+1.66e-01

Pos logit contributions
 he+0.005
 back+0.004
 heal+0.004
 out+0.004
 away+0.004
Neg logit contributions
urious-0.008
elly-0.007
ubby-0.007
resa-0.007
ody-0.007
Tokenackers
Token ID28874
Feature activation+7.31e-02

Pos logit contributions
resa+0.297
atherine+0.276
ami+0.271
 elderly+0.270
 types+0.260
Neg logit contributions
 he-0.264
 back-0.262
 away-0.243
 out-0.242
 his-0.241
Token smiled
Token ID13541
Feature activation+5.54e-02

Pos logit contributions
urious+0.152
elly+0.148
ubby+0.141
resa+0.139
illing+0.138
Neg logit contributions
 he-0.121
 back-0.107
 out-0.098
 heal-0.097
 his-0.096
Token and
Token ID290
Feature activation-1.97e-02

Pos logit contributions
urious+0.096
elly+0.093
ubby+0.089
ude+0.087
illing+0.084
Neg logit contributions
 he-0.098
 heal-0.090
 anything-0.083
 Zig-0.083
 bed-0.082
Token said
Token ID531
Feature activation+2.00e-02

Pos logit contributions
 he+0.035
 heal+0.034
 back+0.033
 anything+0.033
 out+0.033
Neg logit contributions
urious-0.030
elly-0.028
resa-0.028
ody-0.028
ida-0.028
Token,
Token ID11
Feature activation+1.15e-03

Pos logit contributions
ubby+0.030
ida+0.029
ude+0.029
elly+0.028
mber+0.027
Neg logit contributions
 bed-0.031
 roof-0.029
 ramp-0.029
stack-0.028
 drown-0.028
Token beautiful
Token ID4950
Feature activation+3.82e-03

Pos logit contributions
 he+0.002
 back+0.002
 out+0.002
 heal+0.002
 his+0.001
Neg logit contributions
urious-0.002
elly-0.002
ubby-0.002
resa-0.002
illing-0.002
Token colors
Token ID7577
Feature activation-4.42e-03

Pos logit contributions
urious+0.008
elly+0.008
ubby+0.007
illing+0.007
resa+0.007
Neg logit contributions
 he-0.006
 back-0.005
 heal-0.005
 out-0.005
 away-0.005
Token.
Token ID13
Feature activation-2.89e-02

Pos logit contributions
 he+0.005
 back+0.004
 heal+0.004
 out+0.004
 away+0.003
Neg logit contributions
elly-0.011
urious-0.011
illing-0.010
ubby-0.010
ude-0.010
Token Alex
Token ID4422
Feature activation+7.08e-02

Pos logit contributions
 he+0.041
 back+0.035
 heal+0.033
 out+0.033
 down+0.030
Neg logit contributions
urious-0.066
elly-0.063
ubby-0.061
resa-0.061
illing-0.060
Token and
Token ID290
Feature activation-3.27e-03

Pos logit contributions
urious+0.148
elly+0.146
ubby+0.140
illing+0.133
ami+0.131
Neg logit contributions
 he-0.119
 back-0.103
 they-0.096
 out-0.094
 it-0.091
Token Rh
Token ID10323
Feature activation+1.65e-01

Pos logit contributions
 he+0.005
 back+0.005
 out+0.004
 heal+0.004
 his+0.004
Neg logit contributions
urious-0.007
elly-0.006
ubby-0.006
resa-0.006
illing-0.006
Tokenonda
Token ID13533
Feature activation+3.96e-02

Pos logit contributions
elly+0.205
illing+0.191
 types+0.174
resa+0.168
osa+0.168
Neg logit contributions
 back-0.328
 away-0.306
 he-0.304
 out-0.296
 heal-0.290
Token were
Token ID547
Feature activation+3.33e-02

Pos logit contributions
urious+0.080
elly+0.076
ubby+0.074
resa+0.072
illing+0.072
Neg logit contributions
 he-0.064
 back-0.057
 heal-0.055
 out-0.054
 his-0.051
Token so
Token ID523
Feature activation+2.19e-02

Pos logit contributions
urious+0.064
ubby+0.061
elly+0.060
ude+0.060
illing+0.060
Neg logit contributions
 he-0.054
 heal-0.048
 back-0.045
 out-0.043
 bed-0.042
Token excited
Token ID6568
Feature activation+2.73e-02

Pos logit contributions
urious+0.049
elly+0.047
ami+0.046
resa+0.045
oph+0.045
Neg logit contributions
 he-0.034
 back-0.030
 out-0.028
 they-0.028
 down-0.027
Token to
Token ID284
Feature activation-1.27e-02

Pos logit contributions
urious+0.069
elly+0.068
illing+0.065
resa+0.064
ude+0.064
Neg logit contributions
 heal-0.025
 he-0.024
 back-0.023
 out-0.019
 away-0.019
Token two
Token ID734
Feature activation+4.51e-02

Pos logit contributions
urious+0.066
elly+0.061
resa+0.058
ida+0.058
ited+0.058
Neg logit contributions
 he-0.072
 back-0.067
 out-0.064
 his-0.060
 they-0.060
Token friends
Token ID2460
Feature activation+3.20e-02

Pos logit contributions
urious+0.089
elly+0.087
resa+0.082
illing+0.081
ida+0.080
Neg logit contributions
 he-0.077
 back-0.071
 out-0.065
 his-0.062
 they-0.061
Token named
Token ID3706
Feature activation+3.56e-02

Pos logit contributions
urious+0.070
elly+0.068
ubby+0.066
resa+0.065
illing+0.064
Neg logit contributions
 he-0.047
 back-0.041
 heal-0.039
 out-0.039
 down-0.036
Token Alex
Token ID4422
Feature activation+7.29e-02

Pos logit contributions
urious+0.095
elly+0.092
ubby+0.089
resa+0.089
elia+0.087
Neg logit contributions
 he-0.036
 back-0.028
 heal-0.027
 out-0.025
 away-0.023
Token and
Token ID290
Feature activation-2.47e-02

Pos logit contributions
urious+0.168
elly+0.160
ubby+0.154
resa+0.153
ida+0.151
Neg logit contributions
 he-0.106
 back-0.093
 out-0.086
 heal-0.081
 they-0.081
Token Rh
Token ID10323
Feature activation+1.65e-01

Pos logit contributions
 heal+0.028
 back+0.028
 he+0.026
 away+0.025
 out+0.025
Neg logit contributions
ida-0.052
resa-0.052
ody-0.052
urious-0.050
ude-0.050
Tokenonda
Token ID13533
Feature activation+4.28e-02

Pos logit contributions
elly+0.221
urious+0.205
resa+0.201
illing+0.200
 types+0.196
Neg logit contributions
 he-0.347
 back-0.321
 his-0.302
 out-0.296
 heal-0.292
Token.
Token ID13
Feature activation-1.75e-02

Pos logit contributions
urious+0.117
elly+0.114
ubby+0.111
illing+0.110
resa+0.109
Neg logit contributions
 he-0.038
 back-0.030
 heal-0.029
 out-0.026
 down-0.023
Token Alex
Token ID4422
Feature activation+6.87e-02

Pos logit contributions
 he+0.021
 back+0.017
 out+0.015
 his+0.015
 they+0.015
Neg logit contributions
urious-0.045
elly-0.043
ubby-0.042
ida-0.041
illing-0.041
Token wanted
Token ID2227
Feature activation+1.71e-02

Pos logit contributions
urious+0.143
elly+0.137
ubby+0.132
resa+0.129
illing+0.127
Neg logit contributions
 he-0.114
 back-0.099
 out-0.093
 heal-0.091
 they-0.089
Token to
Token ID284
Feature activation-3.98e-02

Pos logit contributions
urious+0.043
resa+0.042
elly+0.042
illing+0.041
ody+0.040
Neg logit contributions
 he-0.015
 heal-0.013
 back-0.012
 away-0.011
 down-0.011
Token bit
Token ID1643
Feature activation-2.39e-03

Pos logit contributions
 he+0.020
 back+0.017
 Zig+0.016
 his+0.016
 heal+0.016
Neg logit contributions
urious-0.024
elly-0.023
ubby-0.022
elia-0.021
resa-0.021
Token of
Token ID286
Feature activation-3.12e-02

Pos logit contributions
 he+0.003
 back+0.003
 heal+0.003
 out+0.003
 his+0.003
Neg logit contributions
urious-0.005
elly-0.005
ubby-0.005
illing-0.005
resa-0.005
Token magic
Token ID5536
Feature activation-1.65e-02

Pos logit contributions
 he+0.052
 back+0.047
 out+0.044
 Zig+0.044
 heal+0.044
Neg logit contributions
urious-0.056
elly-0.053
ubby-0.052
 types-0.051
iling-0.050
Token.
Token ID13
Feature activation-3.39e-02

Pos logit contributions
 he+0.021
 back+0.018
 heal+0.017
 Zig+0.017
 his+0.016
Neg logit contributions
elly-0.038
urious-0.037
ubby-0.035
ude-0.034
ida-0.034
Token<|endoftext|>
Token ID50256
Feature activation+6.26e-02

Pos logit contributions
 he+0.047
 back+0.042
 heal+0.040
 out+0.039
 away+0.037
Neg logit contributions
urious-0.077
elly-0.074
resa-0.071
ubby-0.071
illing-0.070
TokenIsa
Token ID39443
Feature activation+1.65e-01

Pos logit contributions
urious+0.134
elly+0.130
ubby+0.125
resa+0.123
illing+0.122
Neg logit contributions
 he-0.092
 back-0.082
 heal-0.078
 out-0.076
 away-0.071
Tokenac
Token ID330
Feature activation+1.15e-01

Pos logit contributions
resa+0.236
elly+0.226
urious+0.224
ida+0.221
usting+0.221
Neg logit contributions
 back-0.334
 out-0.329
 he-0.306
 away-0.306
 his-0.305
Token was
Token ID373
Feature activation+4.28e-02

Pos logit contributions
illing+0.219
elly+0.219
ubby+0.205
illed+0.195
urious+0.193
Neg logit contributions
 he-0.193
 back-0.193
 his-0.180
 him-0.172
 it-0.171
Token preparing
Token ID10629
Feature activation+1.01e-02

Pos logit contributions
urious+0.090
ubby+0.089
elly+0.086
illing+0.086
ady+0.085
Neg logit contributions
 he-0.061
 heal-0.052
 back-0.050
 Zig-0.047
 they-0.047
Token breakfast
Token ID12607
Feature activation-3.91e-03

Pos logit contributions
urious+0.021
resa+0.020
elly+0.020
 types+0.020
illing+0.020
Neg logit contributions
 he-0.012
 Zig-0.011
 back-0.011
 heal-0.011
 crash-0.010
Token in
Token ID287
Feature activation-1.91e-02

Pos logit contributions
 he+0.004
 back+0.004
 heal+0.004
 Zig+0.004
 crash+0.003
Neg logit contributions
urious-0.009
elly-0.009
resa-0.009
illing-0.009
ude-0.009
Token lived
Token ID5615
Feature activation+5.40e-02

Pos logit contributions
urious+0.015
elly+0.014
ubby+0.014
resa+0.014
illing+0.014
Neg logit contributions
 he-0.009
 back-0.007
 out-0.007
 his-0.007
 heal-0.007
Token a
Token ID257
Feature activation+1.91e-02

Pos logit contributions
urious+0.134
elly+0.129
ubby+0.124
resa+0.124
illing+0.122
Neg logit contributions
 he-0.066
 back-0.058
 out-0.053
 heal-0.051
 his-0.048
Token nice
Token ID3621
Feature activation+4.20e-02

Pos logit contributions
urious+0.030
elly+0.028
ubby+0.026
resa+0.026
illing+0.026
Neg logit contributions
 he-0.041
 back-0.038
 out-0.036
 heal-0.036
 his-0.034
Token bear
Token ID6842
Feature activation+3.61e-02

Pos logit contributions
urious+0.074
elly+0.069
ubby+0.066
resa+0.065
illing+0.065
Neg logit contributions
 he-0.082
 back-0.074
 out-0.070
 heal-0.070
 his-0.067
Token named
Token ID3706
Feature activation+2.44e-02

Pos logit contributions
urious+0.078
elly+0.076
ubby+0.073
resa+0.072
illing+0.072
Neg logit contributions
 he-0.052
 back-0.047
 heal-0.044
 out-0.043
 away-0.040
Token Che
Token ID2580
Feature activation+1.63e-01

Pos logit contributions
elly+0.056
urious+0.056
ubby+0.055
resa+0.055
elia+0.055
Neg logit contributions
 back-0.025
 he-0.024
 heal-0.023
 away-0.022
 out-0.021
Tokenwy
Token ID21768
Feature activation+5.10e-02

Pos logit contributions
elly+0.257
urious+0.244
ida+0.239
ubby+0.237
illing+0.232
Neg logit contributions
 he-0.317
 back-0.292
 heal-0.287
 away-0.277
 his-0.276
Token.
Token ID13
Feature activation-9.81e-03

Pos logit contributions
urious+0.142
elly+0.137
ubby+0.134
resa+0.132
illing+0.131
Neg logit contributions
 he-0.047
 back-0.037
 out-0.032
 heal-0.032
 his-0.029
Token Che
Token ID2580
Feature activation+1.60e-01

Pos logit contributions
 he+0.013
 back+0.010
 his+0.010
 out+0.010
 heal+0.009
Neg logit contributions
urious-0.025
ida-0.023
elly-0.023
ubby-0.023
elia-0.022
Tokenwy
Token ID21768
Feature activation+5.55e-02

Pos logit contributions
ida+0.269
elly+0.266
urious+0.255
ubby+0.250
illing+0.248
Neg logit contributions
 he-0.291
 heal-0.262
 back-0.261
 his-0.257
 away-0.251
Token had
Token ID550
Feature activation+1.06e-02

Pos logit contributions
urious+0.104
elly+0.100
ubby+0.098
illing+0.093
ida+0.093
Neg logit contributions
 he-0.106
 back-0.093
 his-0.087
 heal-0.086
 out-0.085
Token Qu
Token ID2264
Feature activation+1.47e-01

Pos logit contributions
 he+0.022
 heal+0.021
 back+0.020
 out+0.018
 bed+0.018
Neg logit contributions
urious-0.038
elly-0.037
elia-0.037
ida-0.035
oph-0.035
Tokenackers
Token ID28874
Feature activation+5.31e-02

Pos logit contributions
resa+0.240
atherine+0.214
 elderly+0.212
ami+0.208
elia+0.208
Neg logit contributions
 back-0.267
 he-0.263
 out-0.256
 away-0.245
 his-0.235
Token was
Token ID373
Feature activation+2.70e-02

Pos logit contributions
urious+0.108
elly+0.103
ubby+0.100
resa+0.098
illing+0.097
Neg logit contributions
 he-0.087
 back-0.077
 heal-0.073
 out-0.073
 down-0.068
Token safe
Token ID3338
Feature activation+8.53e-03

Pos logit contributions
urious+0.053
ubby+0.050
elly+0.050
illing+0.049
ude+0.049
Neg logit contributions
 he-0.043
 heal-0.036
 back-0.035
 they-0.034
 out-0.034
Token.
Token ID13
Feature activation-3.35e-02

Pos logit contributions
elly+0.017
urious+0.015
ude+0.015
illing+0.015
ubby+0.015
Neg logit contributions
 heal-0.011
 he-0.011
 Zig-0.010
 Blink-0.009
 bed-0.009
Token Qu
Token ID2264
Feature activation+1.62e-01

Pos logit contributions
 he+0.043
 back+0.038
 heal+0.038
 out+0.036
 away+0.033
Neg logit contributions
urious-0.076
elly-0.076
ubby-0.072
resa-0.071
illing-0.070
Tokenackers
Token ID28874
Feature activation+6.67e-02

Pos logit contributions
resa+0.312
atherine+0.281
 elderly+0.280
ami+0.275
becca+0.270
Neg logit contributions
 back-0.231
 he-0.219
 out-0.217
 away-0.209
 his-0.200
Token thanked
Token ID26280
Feature activation+3.52e-02

Pos logit contributions
urious+0.139
elly+0.136
ubby+0.130
resa+0.128
illing+0.126
Neg logit contributions
 he-0.111
 back-0.097
 out-0.089
 heal-0.089
 his-0.086
Token the
Token ID262
Feature activation+1.29e-02

Pos logit contributions
urious+0.079
resa+0.078
ody+0.077
ida+0.077
ubby+0.076
Neg logit contributions
 he-0.035
 out-0.032
 back-0.031
 heal-0.030
 down-0.029
Token little
Token ID1310
Feature activation+2.68e-02

Pos logit contributions
urious+0.027
elly+0.027
illing+0.026
iling+0.026
ody+0.026
Neg logit contributions
 he-0.021
 back-0.020
 heal-0.018
 out-0.017
 his-0.016
Token girl
Token ID2576
Feature activation-1.29e-02

Pos logit contributions
urious+0.045
elly+0.043
ubby+0.041
illing+0.041
resa+0.041
Neg logit contributions
 he-0.054
 back-0.050
 out-0.047
 heal-0.046
 away-0.045
Token "
Token ID366
Feature activation+5.00e-02

Pos logit contributions
 he+0.035
 back+0.033
 heal+0.032
 away+0.030
 out+0.029
Neg logit contributions
urious-0.056
elly-0.055
resa-0.053
ubby-0.052
illing-0.052
TokenDon
Token ID3987
Feature activation+4.46e-02

Pos logit contributions
urious+0.092
ubby+0.088
elly+0.088
ude+0.083
illing+0.081
Neg logit contributions
 he-0.095
 back-0.083
 heal-0.079
 out-0.077
 his-0.076
Token't
Token ID470
Feature activation+2.60e-02

Pos logit contributions
urious+0.107
elly+0.103
ubby+0.101
resa+0.101
ude+0.099
Neg logit contributions
 he-0.053
 back-0.045
 out-0.042
 heal-0.041
 his-0.039
Token worry
Token ID5490
Feature activation+4.51e-02

Pos logit contributions
urious+0.042
ida+0.036
elly+0.036
resa+0.036
ubby+0.036
Neg logit contributions
 he-0.051
 back-0.046
 out-0.045
 down-0.044
 heal-0.044
Token,
Token ID11
Feature activation+1.45e-02

Pos logit contributions
urious+0.096
elly+0.096
ubby+0.095
elia+0.091
ude+0.091
Neg logit contributions
 he-0.056
 back-0.051
 heal-0.051
 out-0.047
 away-0.046
Token Rh
Token ID10323
Feature activation+1.61e-01

Pos logit contributions
urious+0.032
elly+0.030
ubby+0.030
resa+0.029
ida+0.029
Neg logit contributions
 he-0.020
 back-0.018
 out-0.017
 heal-0.017
 away-0.016
Tokenin
Token ID259
Feature activation+2.47e-02

Pos logit contributions
elly+0.270
urious+0.265
illing+0.259
ydia+0.257
osa+0.252
Neg logit contributions
 he-0.275
 heal-0.232
 back-0.225
 his-0.221
 car-0.208
Tokenoc
Token ID420
Feature activation+3.67e-02

Pos logit contributions
urious+0.043
elly+0.040
ubby+0.039
resa+0.038
illing+0.037
Neg logit contributions
 he-0.048
 back-0.044
 out-0.042
 heal-0.041
 down-0.039
Tokeneros
Token ID27498
Feature activation+1.87e-02

Pos logit contributions
urious+0.069
elly+0.065
ubby+0.063
resa+0.063
ude+0.061
Neg logit contributions
 he-0.066
 back-0.059
 out-0.056
 heal-0.055
 they-0.053
Token!"
Token ID2474
Feature activation-6.68e-03

Pos logit contributions
urious+0.041
elly+0.039
ubby+0.039
iling+0.037
elia+0.037
Neg logit contributions
 he-0.024
 back-0.022
 bed-0.021
 heal-0.020
 out-0.020
Token The
Token ID383
Feature activation+3.13e-02

Pos logit contributions
 back+0.011
 heal+0.010
 anything+0.009
 he+0.009
 out+0.009
Neg logit contributions
resa-0.011
urious-0.011
elly-0.011
ude-0.011
ody-0.010
Token a
Token ID257
Feature activation-2.14e-02

Pos logit contributions
 he+0.042
 Zig+0.036
 heal+0.036
 bed+0.036
stack+0.034
Neg logit contributions
urious-0.049
elly-0.047
ubby-0.044
illing-0.042
ude-0.042
Token little
Token ID1310
Feature activation-4.48e-03

Pos logit contributions
 he+0.038
 back+0.033
 out+0.032
 they+0.031
 heal+0.031
Neg logit contributions
urious-0.037
elly-0.037
ubby-0.036
ude-0.033
illing-0.033
Token.
Token ID13
Feature activation-3.64e-02

Pos logit contributions
 he+0.007
 Zig+0.006
 they+0.006
 his+0.006
 heal+0.006
Neg logit contributions
elly-0.008
urious-0.008
ubby-0.008
ited-0.007
illing-0.007
Token\n
Token ID198
Feature activation+5.98e-02

Pos logit contributions
 he+0.046
 back+0.045
 heal+0.043
 away+0.042
 out+0.041
Neg logit contributions
elly-0.077
urious-0.075
ubby-0.070
ude-0.070
resa-0.070
Token\n
Token ID198
Feature activation+6.23e-02

Pos logit contributions
urious+0.150
elly+0.145
ubby+0.142
resa+0.140
illing+0.138
Neg logit contributions
 he-0.072
 back-0.060
 out-0.055
 heal-0.054
 his-0.051
TokenCr
Token ID13916
Feature activation+1.61e-01

Pos logit contributions
urious+0.115
elly+0.109
ubby+0.108
resa+0.104
ody+0.103
Neg logit contributions
 he-0.121
 back-0.105
 out-0.098
 his-0.097
 heal-0.097
Tokenack
Token ID441
Feature activation+5.34e-02

Pos logit contributions
resa+0.232
 elderly+0.227
 types+0.217
elia+0.215
ubby+0.211
Neg logit contributions
 he-0.286
 away-0.275
 back-0.273
 out-0.264
 down-0.242
Token!
Token ID0
Feature activation-2.37e-02

Pos logit contributions
urious+0.115
elly+0.109
ubby+0.106
resa+0.105
illing+0.103
Neg logit contributions
 he-0.086
 back-0.075
 out-0.070
 heal-0.068
 his-0.066
Token\n
Token ID198
Feature activation+6.03e-02

Pos logit contributions
 he+0.028
 back+0.024
 out+0.022
 heal+0.021
 his+0.020
Neg logit contributions
urious-0.060
elly-0.057
ubby-0.056
resa-0.055
illing-0.055
Token\n
Token ID198
Feature activation+6.01e-02

Pos logit contributions
urious+0.121
ubby+0.114
elly+0.114
ited+0.110
ude+0.109
Neg logit contributions
 he-0.111
 back-0.094
 they-0.087
 his-0.087
 out-0.086
TokenThe
Token ID464
Feature activation+1.87e-02

Pos logit contributions
urious+0.116
ubby+0.110
elly+0.109
ude+0.106
resa+0.104
Neg logit contributions
 he-0.113
 back-0.096
 they-0.089
 his-0.088
 out-0.088
Token was
Token ID373
Feature activation+4.66e-02

Pos logit contributions
urious+0.023
elly+0.022
ubby+0.021
resa+0.021
illing+0.021
Neg logit contributions
 he-0.015
 back-0.012
 out-0.012
 his-0.011
 heal-0.011
Token a
Token ID257
Feature activation+2.04e-02

Pos logit contributions
urious+0.125
elly+0.119
ubby+0.116
ida+0.116
resa+0.116
Neg logit contributions
 he-0.050
 back-0.042
 out-0.040
 his-0.037
 heal-0.034
Token little
Token ID1310
Feature activation+3.32e-02

Pos logit contributions
urious+0.030
elly+0.028
ubby+0.027
resa+0.026
illing+0.026
Neg logit contributions
 he-0.045
 back-0.042
 out-0.040
 heal-0.039
 his-0.038
Token crab
Token ID32202
Feature activation+4.18e-02

Pos logit contributions
elly+0.047
ubby+0.046
urious+0.046
resa+0.042
elia+0.041
Neg logit contributions
 he-0.072
 back-0.066
 heal-0.066
 out-0.063
 anything-0.062
Token named
Token ID3706
Feature activation+2.37e-02

Pos logit contributions
urious+0.076
elly+0.075
ubby+0.069
illing+0.069
resa+0.066
Neg logit contributions
 he-0.072
 back-0.069
 heal-0.066
 out-0.064
 anything-0.063
Token Cr
Token ID3864
Feature activation+1.61e-01

Pos logit contributions
urious+0.054
ubby+0.053
elly+0.053
resa+0.053
ody+0.053
Neg logit contributions
 he-0.025
 heal-0.024
 back-0.024
 away-0.022
 out-0.021
Tokenust
Token ID436
Feature activation+8.21e-02

Pos logit contributions
resa+0.194
ised+0.184
elia+0.180
 types+0.179
urious+0.178
Neg logit contributions
 back-0.320
 he-0.318
 out-0.308
 heal-0.291
 fall-0.276
Tokeny
Token ID88
Feature activation+5.66e-02

Pos logit contributions
urious+0.088
resa+0.080
elly+0.079
ubby+0.074
illing+0.073
Neg logit contributions
 he-0.215
 back-0.202
 out-0.195
 his-0.189
 down-0.189
Token.
Token ID13
Feature activation-8.03e-03

Pos logit contributions
urious+0.161
elly+0.157
ubby+0.153
resa+0.151
illing+0.150
Neg logit contributions
 he-0.046
 back-0.036
 out-0.031
 heal-0.031
 down-0.026
Token Cr
Token ID3864
Feature activation+1.68e-01

Pos logit contributions
 he+0.010
 back+0.009
 out+0.008
 his+0.008
 heal+0.007
Neg logit contributions
urious-0.020
elly-0.019
ubby-0.019
illing-0.018
resa-0.018
Tokenust
Token ID436
Feature activation+8.76e-02

Pos logit contributions
resa+0.231
iley+0.212
ited+0.211
ya+0.203
elia+0.201
Neg logit contributions
 back-0.300
 out-0.287
 he-0.269
 heal-0.259
 away-0.254
Token bear
Token ID6842
Feature activation+3.61e-02

Pos logit contributions
urious+0.074
elly+0.069
ubby+0.066
resa+0.065
illing+0.065
Neg logit contributions
 he-0.082
 back-0.074
 out-0.070
 heal-0.070
 his-0.067
Token named
Token ID3706
Feature activation+2.44e-02

Pos logit contributions
urious+0.078
elly+0.076
ubby+0.073
resa+0.072
illing+0.072
Neg logit contributions
 he-0.052
 back-0.047
 heal-0.044
 out-0.043
 away-0.040
Token Che
Token ID2580
Feature activation+1.63e-01

Pos logit contributions
elly+0.056
urious+0.056
ubby+0.055
resa+0.055
elia+0.055
Neg logit contributions
 back-0.025
 he-0.024
 heal-0.023
 away-0.022
 out-0.021
Tokenwy
Token ID21768
Feature activation+5.10e-02

Pos logit contributions
elly+0.257
urious+0.244
ida+0.239
ubby+0.237
illing+0.232
Neg logit contributions
 he-0.317
 back-0.292
 heal-0.287
 away-0.277
 his-0.276
Token.
Token ID13
Feature activation-9.81e-03

Pos logit contributions
urious+0.142
elly+0.137
ubby+0.134
resa+0.132
illing+0.131
Neg logit contributions
 he-0.047
 back-0.037
 out-0.032
 heal-0.032
 his-0.029
Token Che
Token ID2580
Feature activation+1.60e-01

Pos logit contributions
 he+0.013
 back+0.010
 his+0.010
 out+0.010
 heal+0.009
Neg logit contributions
urious-0.025
ida-0.023
elly-0.023
ubby-0.023
elia-0.022
Tokenwy
Token ID21768
Feature activation+5.55e-02

Pos logit contributions
ida+0.269
elly+0.266
urious+0.255
ubby+0.250
illing+0.248
Neg logit contributions
 he-0.291
 heal-0.262
 back-0.261
 his-0.257
 away-0.251
Token had
Token ID550
Feature activation+1.06e-02

Pos logit contributions
urious+0.104
elly+0.100
ubby+0.098
illing+0.093
ida+0.093
Neg logit contributions
 he-0.106
 back-0.093
 his-0.087
 heal-0.086
 out-0.085
Token a
Token ID257
Feature activation-7.52e-03

Pos logit contributions
urious+0.023
elly+0.022
ubby+0.021
resa+0.021
ude+0.021
Neg logit contributions
 he-0.015
 heal-0.013
 back-0.013
 out-0.012
 away-0.012
Token depend
Token ID4745
Feature activation+7.54e-02

Pos logit contributions
 he+0.014
 heal+0.012
 back+0.012
 out+0.012
 down+0.012
Neg logit contributions
urious-0.012
ubby-0.012
elly-0.012
resa-0.012
ude-0.011
Tokenable
Token ID540
Feature activation+8.09e-03

Pos logit contributions
urious+0.085
resa+0.070
ude+0.069
oph+0.068
ami+0.067
Neg logit contributions
 he-0.205
 back-0.186
 out-0.181
 his-0.180
 away-0.179
Token day
Token ID1110
Feature activation+5.28e-03

Pos logit contributions
urious+0.173
elly+0.167
ubby+0.162
resa+0.161
ude+0.158
Neg logit contributions
 he-0.097
 back-0.083
 out-0.077
 heal-0.075
 his-0.071
Token,
Token ID11
Feature activation+2.97e-02

Pos logit contributions
urious+0.012
elly+0.012
ubby+0.012
illing+0.012
ude+0.011
Neg logit contributions
 he-0.006
 heal-0.006
 back-0.006
 away-0.005
 out-0.005
Token an
Token ID281
Feature activation+6.92e-02

Pos logit contributions
urious+0.069
elly+0.067
ubby+0.065
resa+0.064
illing+0.064
Neg logit contributions
 he-0.038
 back-0.034
 heal-0.032
 out-0.030
 away-0.028
Token explorer
Token ID39349
Feature activation+3.81e-02

Pos logit contributions
urious+0.122
elly+0.118
ude+0.115
oph+0.114
ited+0.113
Neg logit contributions
 back-0.120
 he-0.119
 his-0.111
 out-0.103
 safe-0.100
Token named
Token ID3706
Feature activation+2.35e-02

Pos logit contributions
urious+0.079
elly+0.076
ubby+0.073
resa+0.072
ude+0.071
Neg logit contributions
 he-0.059
 back-0.054
 heal-0.051
 out-0.050
 down-0.047
Token Che
Token ID2580
Feature activation+1.60e-01

Pos logit contributions
urious+0.058
elly+0.056
resa+0.055
ubby+0.055
ody+0.054
Neg logit contributions
 he-0.025
 back-0.023
 heal-0.022
 out-0.019
 away-0.019
Tokene
Token ID68
Feature activation+5.00e-02

Pos logit contributions
elly+0.277
ubby+0.257
urious+0.249
illing+0.238
ida+0.238
Neg logit contributions
 he-0.306
 back-0.277
 away-0.274
 his-0.267
 heal-0.261
Token wanted
Token ID2227
Feature activation+3.78e-02

Pos logit contributions
urious+0.114
ubby+0.109
elly+0.107
ited+0.103
 types+0.102
Neg logit contributions
 he-0.077
 back-0.064
 his-0.060
 out-0.059
 heal-0.058
Token to
Token ID284
Feature activation-2.88e-02

Pos logit contributions
urious+0.101
elly+0.098
resa+0.096
illing+0.095
ude+0.094
Neg logit contributions
 he-0.031
 back-0.027
 heal-0.026
 away-0.023
 out-0.023
Token study
Token ID2050
Feature activation+1.19e-02

Pos logit contributions
 he+0.036
 back+0.033
 out+0.030
 down+0.029
 away+0.028
Neg logit contributions
urious-0.065
resa-0.061
ude-0.061
ida-0.060
ody-0.059
Token a
Token ID257
Feature activation-2.13e-02

Pos logit contributions
urious+0.026
elly+0.026
ude+0.025
 types+0.024
ubby+0.024
Neg logit contributions
 he-0.013
 heal-0.013
 back-0.012
 drown-0.011
 away-0.011
Token able
Token ID1498
Feature activation-1.97e-02

Pos logit contributions
urious+0.069
ubby+0.067
elly+0.064
ady+0.064
ude+0.064
Neg logit contributions
 he-0.071
 heal-0.063
 back-0.058
 out-0.055
 bed-0.055
Token to
Token ID284
Feature activation-3.99e-02

Pos logit contributions
 heal+0.022
 he+0.021
stack+0.018
 ramp+0.018
 Zig+0.017
Neg logit contributions
urious-0.043
elly-0.042
illing-0.041
ude-0.041
ubby-0.040
Token do
Token ID466
Feature activation-3.03e-02

Pos logit contributions
 he+0.051
 back+0.042
 out+0.039
 heal+0.037
 down+0.037
Neg logit contributions
urious-0.093
ubby-0.087
elly-0.086
resa-0.086
ude-0.086
Token more
Token ID517
Feature activation-1.46e-02

Pos logit contributions
 he+0.030
 heal+0.027
 back+0.027
 down+0.023
stack+0.023
Neg logit contributions
urious-0.071
ude-0.067
ida-0.066
elly-0.066
 types-0.065
Token.
Token ID13
Feature activation-2.43e-02

Pos logit contributions
 he+0.025
 back+0.022
 heal+0.021
 out+0.021
 away+0.020
Neg logit contributions
urious-0.029
elly-0.028
ubby-0.027
illing-0.026
resa-0.026
Token Del
Token ID4216
Feature activation+1.59e-01

Pos logit contributions
 he+0.035
 back+0.033
 heal+0.032
 out+0.030
 away+0.029
Neg logit contributions
urious-0.052
elly-0.049
resa-0.048
illing-0.047
ude-0.046
Tokenight
Token ID432
Feature activation+4.77e-02

Pos logit contributions
resa+0.218
Very+0.212
Fre+0.211
 toddler+0.205
Children+0.205
Neg logit contributions
 back-0.234
 he-0.218
 his-0.209
.-0.200
 out-0.199
Tokened
Token ID276
Feature activation+3.14e-02

Pos logit contributions
urious+0.081
elly+0.077
ubby+0.074
resa+0.072
illing+0.071
Neg logit contributions
 he-0.094
 back-0.086
 heal-0.082
 out-0.081
 his-0.077
Token,
Token ID11
Feature activation+4.48e-03

Pos logit contributions
ubby+0.065
ude+0.065
elly+0.064
urious+0.064
ida+0.063
Neg logit contributions
 he-0.034
 heal-0.034
 back-0.031
 bed-0.030
 roof-0.029
Token the
Token ID262
Feature activation+2.58e-02

Pos logit contributions
urious+0.010
ude+0.010
resa+0.010
elly+0.010
ida+0.010
Neg logit contributions
 heal-0.005
 back-0.004
 he-0.004
 away-0.004
 out-0.004
Token insect
Token ID12833
Feature activation-4.12e-04

Pos logit contributions
urious+0.047
elly+0.047
iling+0.046
ady+0.045
illing+0.045
Neg logit contributions
 he-0.046
 back-0.046
 out-0.041
 they-0.039
 it-0.039
Token From
Token ID3574
Feature activation+2.49e-02

Pos logit contributions
 he+0.008
 back+0.007
 heal+0.007
 out+0.007
 away+0.007
Neg logit contributions
urious-0.010
elly-0.010
resa-0.009
ude-0.009
elia-0.009
Token that
Token ID326
Feature activation+7.42e-03

Pos logit contributions
urious+0.066
elly+0.064
illing+0.062
ubby+0.061
ude+0.060
Neg logit contributions
 he-0.030
 back-0.025
 out-0.021
 they-0.021
 his-0.021
Token day
Token ID1110
Feature activation+2.34e-02

Pos logit contributions
urious+0.018
elly+0.018
ubby+0.017
resa+0.017
ida+0.017
Neg logit contributions
 he-0.010
 back-0.008
 out-0.007
 heal-0.007
 his-0.007
Token on
Token ID319
Feature activation-4.10e-04

Pos logit contributions
ubby+0.059
elly+0.058
urious+0.057
illing+0.056
ude+0.055
Neg logit contributions
 he-0.024
 heal-0.020
 back-0.018
 out-0.016
 anything-0.016
Token,
Token ID11
Feature activation+1.47e-02

Pos logit contributions
 heal+0.001
 back+0.001
 out+0.001
 he+0.001
 roof+0.001
Neg logit contributions
ubby-0.001
elly-0.001
ude-0.001
elia-0.001
iley-0.001
Token Qu
Token ID2264
Feature activation+1.59e-01

Pos logit contributions
elly+0.031
urious+0.031
ubby+0.030
elia+0.030
ude+0.029
Neg logit contributions
 back-0.017
 heal-0.017
 he-0.017
 out-0.016
 away-0.015
Tokenackers
Token ID28874
Feature activation+6.16e-02

Pos logit contributions
resa+0.283
atherine+0.260
 elderly+0.250
ami+0.249
 types+0.247
Neg logit contributions
 back-0.256
 he-0.248
 out-0.233
 away-0.228
 heal-0.223
Token and
Token ID290
Feature activation-5.45e-03

Pos logit contributions
urious+0.128
elly+0.123
ubby+0.118
resa+0.117
illing+0.114
Neg logit contributions
 he-0.101
 back-0.089
 out-0.083
 heal-0.083
 his-0.079
Token the
Token ID262
Feature activation+2.03e-02

Pos logit contributions
 he+0.005
 out+0.005
 back+0.005
 heal+0.005
 bed+0.004
Neg logit contributions
urious-0.013
 types-0.012
resa-0.012
elly-0.012
ida-0.012
Token little
Token ID1310
Feature activation+3.76e-02

Pos logit contributions
elly+0.041
urious+0.040
illing+0.039
iling+0.039
ami+0.039
Neg logit contributions
 he-0.036
 back-0.035
 heal-0.031
 out-0.031
 they-0.030
Token girl
Token ID2576
Feature activation+9.44e-03

Pos logit contributions
urious+0.071
elly+0.069
illing+0.068
resa+0.068
ubby+0.066
Neg logit contributions
 he-0.068
 back-0.064
 out-0.059
 away-0.056
 they-0.056
Token duck
Token ID22045
Feature activation+4.82e-02

Pos logit contributions
elly+0.044
urious+0.043
ubby+0.043
resa+0.040
elia+0.039
Neg logit contributions
 he-0.065
 back-0.060
 heal-0.059
 out-0.058
 anything-0.056
Token named
Token ID3706
Feature activation+2.14e-02

Pos logit contributions
urious+0.095
elly+0.091
ubby+0.088
illing+0.086
resa+0.085
Neg logit contributions
 he-0.080
 back-0.073
 heal-0.069
 out-0.068
 down-0.064
Token Qu
Token ID2264
Feature activation+1.50e-01

Pos logit contributions
urious+0.042
ubby+0.042
elly+0.042
resa+0.040
elia+0.040
Neg logit contributions
 he-0.031
 back-0.028
 heal-0.027
 out-0.025
 away-0.025
Tokenackers
Token ID28874
Feature activation+7.16e-02

Pos logit contributions
resa+0.216
urious+0.201
elia+0.198
 types+0.196
atherine+0.195
Neg logit contributions
 he-0.310
 back-0.307
 out-0.295
 heal-0.276
 away-0.271
Token.
Token ID13
Feature activation-8.58e-03

Pos logit contributions
urious+0.193
elly+0.187
ubby+0.183
illing+0.179
ude+0.177
Neg logit contributions
 he-0.066
 back-0.054
 heal-0.051
 out-0.048
 down-0.044
Token Qu
Token ID2264
Feature activation+1.58e-01

Pos logit contributions
 he+0.011
 back+0.009
 out+0.008
 heal+0.008
 his+0.008
Neg logit contributions
urious-0.021
elly-0.020
ubby-0.020
resa-0.020
illing-0.019
Tokenackers
Token ID28874
Feature activation+7.66e-02

Pos logit contributions
resa+0.278
ami+0.250
atherine+0.246
elia+0.244
ida+0.239
Neg logit contributions
 back-0.275
 he-0.271
 out-0.264
 away-0.246
 his-0.243
Token lived
Token ID5615
Feature activation+5.90e-02

Pos logit contributions
urious+0.132
elly+0.125
ubby+0.120
resa+0.119
illing+0.116
Neg logit contributions
 he-0.151
 back-0.137
 out-0.130
 heal-0.129
 his-0.124
Token on
Token ID319
Feature activation-5.59e-03

Pos logit contributions
urious+0.153
elly+0.148
ubby+0.145
resa+0.142
illing+0.141
Neg logit contributions
 he-0.063
 back-0.052
 heal-0.048
 out-0.046
 down-0.041
Token a
Token ID257
Feature activation-6.04e-03

Pos logit contributions
 he+0.008
 back+0.007
 heal+0.006
 out+0.006
 down+0.006
Neg logit contributions
urious-0.013
elly-0.012
ubby-0.012
resa-0.012
ude-0.012
Token farm
Token ID5318
Feature activation-1.54e-02

Pos logit contributions
 he+0.010
 back+0.009
 out+0.008
 heal+0.008
 down+0.008
Neg logit contributions
urious-0.012
elly-0.012
ubby-0.011
resa-0.011
illing-0.011
Token girl
Token ID2576
Feature activation+1.52e-02

Pos logit contributions
elly+0.040
urious+0.040
ubby+0.038
resa+0.036
elia+0.035
Neg logit contributions
 he-0.059
 back-0.054
 heal-0.053
 out-0.052
 his-0.050
Token named
Token ID3706
Feature activation+2.41e-02

Pos logit contributions
urious+0.030
elly+0.029
ubby+0.028
resa+0.027
illing+0.027
Neg logit contributions
 he-0.026
 back-0.023
 heal-0.022
 out-0.022
 down-0.020
Token Gr
Token ID1902
Feature activation+1.56e-01

Pos logit contributions
urious+0.070
elly+0.067
ubby+0.066
ude+0.065
illing+0.065
Neg logit contributions
 he-0.020
 back-0.017
 out-0.015
 heal-0.013
 they-0.012
Tokenacy
Token ID1590
Feature activation+4.28e-02

Pos logit contributions
ited+0.249
urious+0.243
resa+0.239
atherine+0.234
becca+0.226
Neg logit contributions
 he-0.243
 back-0.241
 out-0.239
 heal-0.234
 bed-0.211
Token.
Token ID13
Feature activation-1.26e-02

Pos logit contributions
urious+0.118
elly+0.114
ubby+0.110
resa+0.110
illing+0.109
Neg logit contributions
 he-0.039
 back-0.031
 heal-0.028
 out-0.027
 down-0.024
Token Gr
Token ID1902
Feature activation+1.58e-01

Pos logit contributions
 he+0.016
 back+0.014
 out+0.013
 heal+0.012
 down+0.012
Neg logit contributions
urious-0.032
ubby-0.029
elly-0.029
ited-0.029
illing-0.029
Tokenacy
Token ID1590
Feature activation+5.26e-02

Pos logit contributions
ited+0.254
resa+0.236
urious+0.231
atherine+0.231
becca+0.229
Neg logit contributions
 out-0.231
 he-0.223
 heal-0.220
 back-0.218
 bed-0.212
Token loved
Token ID6151
Feature activation+3.03e-02

Pos logit contributions
urious+0.102
elly+0.101
ubby+0.098
ami+0.094
ited+0.093
Neg logit contributions
 he-0.093
 back-0.082
 out-0.076
 they-0.076
 heal-0.073
Token to
Token ID284
Feature activation-2.59e-02

Pos logit contributions
urious+0.076
elly+0.074
resa+0.073
illing+0.072
 types+0.071
Neg logit contributions
 he-0.028
 back-0.025
 heal-0.024
 away-0.022
 out-0.022
Token run
Token ID1057
Feature activation+2.27e-02

Pos logit contributions
 he+0.032
 back+0.029
 out+0.026
 away+0.026
 down+0.025
Neg logit contributions
urious-0.060
resa-0.056
elly-0.056
ude-0.055
ubby-0.055
Token around
Token ID1088
Feature activation-1.93e-03

Pos logit contributions
urious+0.048
elly+0.048
ubby+0.046
illing+0.045
resa+0.044
Neg logit contributions
 he-0.032
 heal-0.028
 back-0.026
 out-0.024
 anything-0.024
Token't
Token ID470
Feature activation+2.27e-02

Pos logit contributions
urious+0.078
elly+0.076
ubby+0.074
resa+0.073
illing+0.072
Neg logit contributions
 he-0.036
 back-0.031
 heal-0.028
 out-0.028
 away-0.026
Token know
Token ID760
Feature activation-5.59e-04

Pos logit contributions
urious+0.032
elly+0.031
ubby+0.030
ida+0.029
illing+0.029
Neg logit contributions
 he-0.048
 back-0.044
 heal-0.043
 out-0.042
 down-0.041
Token.
Token ID13
Feature activation-3.13e-02

Pos logit contributions
 he+0.001
 back+0.001
 heal+0.001
 away+0.001
 out+0.001
Neg logit contributions
urious-0.001
elly-0.001
ude-0.001
ami-0.001
ubby-0.001
Token⏎
Token ID198
Feature activation+5.22e-02

Pos logit contributions
 he+0.038
 back+0.034
 heal+0.033
 out+0.031
 away+0.029
Neg logit contributions
urious-0.075
elly-0.071
resa-0.070
ude-0.069
ubby-0.068
Token⏎
Token ID198
Feature activation+5.22e-02

Pos logit contributions
urious+0.139
elly+0.135
ubby+0.132
ited+0.131
illing+0.129
Neg logit contributions
 he-0.058
 back-0.044
 out-0.041
 his-0.040
 they-0.038
TokenLuc
Token ID22946
Feature activation+1.58e-01

Pos logit contributions
elly+0.100
ubby+0.096
urious+0.095
ited+0.091
illing+0.088
Neg logit contributions
 he-0.099
 his-0.082
 back-0.081
 out-0.077
 they-0.076
Tokenas
Token ID292
Feature activation+4.96e-02

Pos logit contributions
usting+0.210
resa+0.205
Fine+0.199
ited+0.198
elly+0.196
Neg logit contributions
 he-0.326
 back-0.306
 out-0.295
 his-0.293
 they-0.267
Token started
Token ID2067
Feature activation+1.74e-02

Pos logit contributions
urious+0.095
elly+0.094
ubby+0.089
resa+0.087
illing+0.086
Neg logit contributions
 he-0.089
 back-0.079
 out-0.075
 heal-0.072
 his-0.071
Token to
Token ID284
Feature activation-2.23e-02

Pos logit contributions
urious+0.048
elly+0.046
illing+0.045
ude+0.045
resa+0.045
Neg logit contributions
 he-0.015
 back-0.012
 heal-0.012
 out-0.010
 away-0.009
Token complain
Token ID13121
Feature activation+3.53e-02

Pos logit contributions
 he+0.034
 back+0.030
 out+0.028
 heal+0.028
 they+0.026
Neg logit contributions
urious-0.048
elly-0.044
resa-0.043
ubby-0.043
ude-0.043
Token "
Token ID366
Feature activation+1.98e-02

Pos logit contributions
urious+0.075
elly+0.074
ude+0.070
illing+0.069
resa+0.068
Neg logit contributions
 he-0.042
 heal-0.040
 back-0.039
 drown-0.037
 Zig-0.036
Token creatures
Token ID8109
Feature activation-9.24e-03

Pos logit contributions
 he+0.004
 heal+0.004
 back+0.004
 out+0.003
 away+0.003
Neg logit contributions
elly-0.008
urious-0.008
ubby-0.008
illing-0.008
ude-0.007
Token.
Token ID13
Feature activation-2.07e-02

Pos logit contributions
 he+0.009
 heal+0.009
 back+0.008
 bed+0.007
 away+0.007
Neg logit contributions
elly-0.023
urious-0.022
illing-0.022
ubby-0.022
ude-0.021
Token One
Token ID1881
Feature activation+6.07e-02

Pos logit contributions
 he+0.030
 back+0.026
 out+0.024
 heal+0.024
 his+0.023
Neg logit contributions
urious-0.047
elly-0.045
ubby-0.044
resa-0.043
illing-0.043
Token day
Token ID1110
Feature activation+1.09e-02

Pos logit contributions
urious+0.156
elly+0.149
resa+0.148
atherine+0.148
ody+0.147
Neg logit contributions
 he-0.074
 back-0.060
 out-0.054
 his-0.053
 they-0.050
Token,
Token ID11
Feature activation+2.04e-02

Pos logit contributions
urious+0.031
elly+0.029
resa+0.029
ubby+0.029
ami+0.028
Neg logit contributions
 he-0.010
 back-0.008
 out-0.007
 they-0.007
 heal-0.006
Token Cr
Token ID3864
Feature activation+1.57e-01

Pos logit contributions
urious+0.050
elly+0.049
resa+0.048
ubby+0.047
ody+0.047
Neg logit contributions
 he-0.021
 back-0.019
 heal-0.018
 out-0.017
 bed-0.016
Tokenust
Token ID436
Feature activation+7.77e-02

Pos logit contributions
resa+0.202
ited+0.186
iley+0.179
 types+0.172
elia+0.172
Neg logit contributions
 back-0.305
 out-0.294
 he-0.279
 heal-0.264
 away-0.261
Tokeny
Token ID88
Feature activation+5.93e-02

Pos logit contributions
urious+0.085
resa+0.082
elly+0.078
ami+0.074
ookie+0.074
Neg logit contributions
 he-0.200
 back-0.187
 out-0.183
 down-0.177
 away-0.177
Token found
Token ID1043
Feature activation+5.19e-02

Pos logit contributions
urious+0.128
elly+0.122
ubby+0.120
resa+0.119
atherine+0.116
Neg logit contributions
 he-0.096
 back-0.081
 out-0.077
 they-0.073
 his-0.072
Token a
Token ID257
Feature activation-2.19e-03

Pos logit contributions
urious+0.121
elly+0.117
ubby+0.114
resa+0.112
illing+0.110
Neg logit contributions
 he-0.068
 back-0.058
 heal-0.056
 out-0.053
 down-0.050
Token shiny
Token ID22441
Feature activation+2.49e-02

Pos logit contributions
 he+0.004
 back+0.004
 heal+0.003
 out+0.003
 down+0.003
Neg logit contributions
urious-0.004
elly-0.004
ubby-0.004
ida-0.004
resa-0.004
Token a
Token ID257
Feature activation+2.31e-02

Pos logit contributions
urious+0.128
elly+0.122
ubby+0.118
resa+0.118
ida+0.118
Neg logit contributions
 he-0.052
 back-0.043
 out-0.042
 his-0.038
 heal-0.036
Token gr
Token ID1036
Feature activation+1.09e-01

Pos logit contributions
urious+0.034
elly+0.032
ubby+0.030
resa+0.029
illing+0.029
Neg logit contributions
 he-0.052
 back-0.047
 out-0.045
 heal-0.045
 his-0.043
Tokenumpy
Token ID32152
Feature activation+4.54e-02

Pos logit contributions
urious+0.105
ited+0.078
atherine+0.077
osa+0.073
resa+0.071
Neg logit contributions
 he-0.297
 back-0.262
 heal-0.257
 out-0.256
 his-0.246
Token bear
Token ID6842
Feature activation+3.48e-02

Pos logit contributions
urious+0.090
elly+0.085
ubby+0.082
resa+0.081
illing+0.080
Neg logit contributions
 he-0.078
 back-0.070
 out-0.066
 heal-0.065
 his-0.062
Token called
Token ID1444
Feature activation+1.50e-02

Pos logit contributions
urious+0.063
elly+0.062
ubby+0.058
illing+0.058
resa+0.057
Neg logit contributions
 he-0.061
 back-0.057
 heal-0.054
 out-0.053
 down-0.051
Token Gr
Token ID1902
Feature activation+1.57e-01

Pos logit contributions
ubby+0.032
elly+0.032
resa+0.031
elia+0.031
urious+0.031
Neg logit contributions
 he-0.018
 back-0.018
 heal-0.017
 away-0.016
 down-0.015
Tokenumbles
Token ID25329
Feature activation+4.62e-02

Pos logit contributions
urious+0.217
resa+0.210
ited+0.210
atherine+0.204
becca+0.196
Neg logit contributions
 he-0.320
 back-0.303
 out-0.296
 heal-0.286
 his-0.282
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
elly+0.110
urious+0.109
ubby+0.106
illing+0.106
ude+0.103
Neg logit contributions
 he-0.051
 back-0.045
 heal-0.044
 out-0.040
 bed-0.039
Token He
Token ID679
Feature activation+6.11e-03

Pos logit contributions
 he+0.019
 back+0.017
 heal+0.016
 out+0.016
 away+0.015
Neg logit contributions
urious-0.024
elly-0.024
ubby-0.023
resa-0.023
illing-0.022
Token used
Token ID973
Feature activation+2.94e-02

Pos logit contributions
urious+0.011
elly+0.010
ubby+0.010
resa+0.010
illing+0.010
Neg logit contributions
 he-0.012
 back-0.010
 out-0.010
 heal-0.010
 down-0.009
Token to
Token ID284
Feature activation-2.33e-02

Pos logit contributions
urious+0.074
elly+0.070
ubby+0.070
ude+0.069
 types+0.069
Neg logit contributions
 he-0.025
 heal-0.021
 back-0.021
 they-0.018
 we-0.017
Token they
Token ID484
Feature activation-6.70e-02

Pos logit contributions
 he+0.002
 back+0.002
 his+0.001
 out+0.001
 heal+0.001
Neg logit contributions
urious-0.003
elly-0.003
ubby-0.003
illing-0.003
resa-0.003
Token could
Token ID714
Feature activation-5.65e-02

Pos logit contributions
 he+0.112
 heal+0.103
 back+0.099
 bed+0.094
 out+0.093
Neg logit contributions
urious-0.115
ody-0.108
ubby-0.107
ida-0.106
ude-0.106
Token go
Token ID467
Feature activation-3.22e-02

Pos logit contributions
 he+0.073
 back+0.070
 down+0.066
 Zig+0.065
 out+0.063
Neg logit contributions
urious-0.109
ude-0.108
ubby-0.100
ida-0.098
illing-0.097
Token check
Token ID2198
Feature activation-8.83e-03

Pos logit contributions
 heal+0.036
 nobody+0.033
 he+0.032
 shield+0.031
pan+0.031
Neg logit contributions
ubby-0.060
elly-0.057
 elderly-0.056
ady-0.055
illing-0.054
Token the
Token ID262
Feature activation-6.06e-02

Pos logit contributions
 heal+0.009
 he+0.008
 Zig+0.007
 back+0.007
 roof+0.007
Neg logit contributions
ubby-0.020
elly-0.019
ude-0.018
iley-0.018
ida-0.018
Token mailbox
Token ID37282
Feature activation-1.14e-01

Pos logit contributions
 he+0.098
 heal+0.080
 back+0.079
 they+0.078
 out+0.077
Neg logit contributions
urious-0.119
elly-0.115
ubby-0.113
resa-0.112
elia-0.108
Token again
Token ID757
Feature activation-4.42e-02

Pos logit contributions
 Zig+0.129
 he+0.129
 heal+0.113
 drown+0.111
 Snow+0.109
Neg logit contributions
urious-0.232
elly-0.218
 elderly-0.213
 types-0.211
resa-0.209
Token.
Token ID13
Feature activation-4.44e-02

Pos logit contributions
 he+0.051
 back+0.043
 heal+0.042
 Zig+0.039
 out+0.039
Neg logit contributions
urious-0.104
elly-0.100
ude-0.097
ubby-0.097
resa-0.096
Token Her
Token ID2332
Feature activation+5.21e-03

Pos logit contributions
 he+0.063
 back+0.061
 heal+0.059
 out+0.056
 away+0.055
Neg logit contributions
urious-0.093
elly-0.091
resa-0.088
illing-0.086
elia-0.085
Token mom
Token ID1995
Feature activation+2.47e-02

Pos logit contributions
urious+0.011
ubby+0.010
resa+0.010
ude+0.010
elia+0.010
Neg logit contributions
 he-0.008
 heal-0.007
 back-0.007
 out-0.007
 down-0.006
Token said
Token ID531
Feature activation+8.20e-03

Pos logit contributions
urious+0.051
elly+0.051
ubby+0.048
illing+0.047
resa+0.046
Neg logit contributions
 he-0.042
 back-0.037
 out-0.035
 heal-0.034
 his-0.033
Token Mama
Token ID40084
Feature activation-3.71e-03

Pos logit contributions
 down+0.038
 back+0.037
 heal+0.037
 he+0.036
 roof+0.033
Neg logit contributions
iley-0.070
elia-0.067
ubby-0.066
 elderly-0.066
resa-0.065
Token to
Token ID284
Feature activation-4.95e-02

Pos logit contributions
 he+0.005
 heal+0.004
 back+0.004
 Zig+0.004
 out+0.004
Neg logit contributions
urious-0.008
elly-0.008
illing-0.008
ubby-0.008
resa-0.008
Token take
Token ID1011
Feature activation+5.10e-03

Pos logit contributions
 he+0.066
 back+0.055
 down+0.052
 out+0.049
 Zig+0.048
Neg logit contributions
resa-0.100
urious-0.099
ida-0.096
ody-0.095
ude-0.095
Token out
Token ID503
Feature activation-3.02e-02

Pos logit contributions
elia+0.010
ubby+0.010
elly+0.010
urious+0.010
ady+0.010
Neg logit contributions
 he-0.006
 heal-0.005
 ramp-0.005
 roof-0.005
 mast-0.005
Token the
Token ID262
Feature activation-5.66e-02

Pos logit contributions
 he+0.038
 heal+0.034
 Zig+0.031
 back+0.031
 down+0.028
Neg logit contributions
ubby-0.063
urious-0.063
elly-0.062
ady-0.060
illing-0.059
Token journal
Token ID3989
Feature activation-1.12e-01

Pos logit contributions
 he+0.088
 heal+0.072
 back+0.071
 they+0.070
 Zig+0.069
Neg logit contributions
urious-0.111
ubby-0.108
elly-0.108
resa-0.103
elia-0.100
Token and
Token ID290
Feature activation-6.85e-02

Pos logit contributions
 he+0.136
 Zig+0.133
 heal+0.123
 anything+0.113
 his+0.110
Neg logit contributions
elly-0.240
urious-0.229
resa-0.218
illing-0.218
ady-0.218
Token write
Token ID3551
Feature activation-1.08e-02

Pos logit contributions
 he+0.087
 back+0.087
 Zig+0.085
 cho+0.080
 heal+0.079
Neg logit contributions
urious-0.119
resa-0.117
elia-0.114
ody-0.112
ida-0.111
Token something
Token ID1223
Feature activation-5.24e-02

Pos logit contributions
 he+0.013
 drown+0.010
 heal+0.010
 back+0.010
 Zig+0.010
Neg logit contributions
urious-0.023
elly-0.022
ude-0.022
ami-0.022
resa-0.022
Token down
Token ID866
Feature activation-3.89e-02

Pos logit contributions
 Zig+0.075
 he+0.075
 back+0.072
stack+0.070
 heal+0.070
Neg logit contributions
elly-0.091
ida-0.088
urious-0.088
 types-0.085
resa-0.084
Token.
Token ID13
Feature activation-4.58e-02

Pos logit contributions
 he+0.040
 heal+0.036
 Zig+0.034
 back+0.033
 out+0.029
Neg logit contributions
elly-0.091
urious-0.090
ubby-0.088
ude-0.085
illing-0.085
Token tools
Token ID4899
Feature activation-4.83e-02

Pos logit contributions
urious+0.004
elly+0.003
ubby+0.003
resa+0.003
illing+0.003
Neg logit contributions
 he-0.004
 back-0.003
 out-0.003
 heal-0.003
 his-0.003
Token to
Token ID284
Feature activation-4.42e-02

Pos logit contributions
 Zig+0.059
 heal+0.054
 drown+0.051
 bed+0.048
 Blink+0.047
Neg logit contributions
elly-0.097
illing-0.092
ody-0.092
ami-0.091
ude-0.091
Token create
Token ID2251
Feature activation-2.01e-02

Pos logit contributions
 he+0.061
 back+0.055
 Zig+0.050
 down+0.049
 out+0.049
Neg logit contributions
urious-0.094
ami-0.088
resa-0.087
elly-0.086
ude-0.086
Token his
Token ID465
Feature activation-1.60e-02

Pos logit contributions
 he+0.026
stack+0.025
 Zig+0.024
 back+0.023
 heal+0.023
Neg logit contributions
urious-0.041
ude-0.039
 types-0.039
ami-0.039
elly-0.038
Token master
Token ID4958
Feature activation-3.27e-02

Pos logit contributions
 he+0.027
 Zig+0.023
 heal+0.023
 they+0.023
 out+0.023
Neg logit contributions
urious-0.032
resa-0.028
ubby-0.027
elly-0.027
ude-0.027
Tokenpieces
Token ID34154
Feature activation-1.11e-01

Pos logit contributions
 he+0.045
 Zig+0.043
 Snow+0.041
 heal+0.040
 back+0.038
Neg logit contributions
urious-0.067
ami-0.061
ude-0.060
resa-0.060
ody-0.060
Token.
Token ID13
Feature activation-3.09e-02

Pos logit contributions
 he+0.141
 Zig+0.140
 heal+0.129
 Frost+0.125
 Snow+0.123
Neg logit contributions
urious-0.226
ude-0.214
elly-0.214
ody-0.209
resa-0.207
Token<|endoftext|>
Token ID50256
Feature activation+6.46e-02

Pos logit contributions
 away+0.043
 he+0.042
 back+0.042
 heal+0.040
 out+0.040
Neg logit contributions
elly-0.062
urious-0.062
illing-0.058
resa-0.057
ami-0.056
TokenOnce
Token ID7454
Feature activation+6.78e-02

Pos logit contributions
urious+0.165
elly+0.158
ubby+0.155
resa+0.153
ida+0.152
Neg logit contributions
 he-0.076
 back-0.064
 out-0.059
 heal-0.056
 his-0.054
Token upon
Token ID2402
Feature activation+2.92e-02

Pos logit contributions
urious+0.112
elly+0.106
ubby+0.101
resa+0.100
illing+0.098
Neg logit contributions
 he-0.137
 back-0.125
 out-0.119
 heal-0.118
 his-0.113
Token a
Token ID257
Feature activation+2.99e-02

Pos logit contributions
urious+0.070
elly+0.067
ubby+0.065
resa+0.065
illing+0.064
Neg logit contributions
 he-0.037
 back-0.032
 out-0.029
 heal-0.029
 down-0.027
Token you
Token ID345
Feature activation+8.66e-04

Pos logit contributions
urious+0.082
ubby+0.079
elly+0.079
resa+0.079
elia+0.078
Neg logit contributions
 he-0.036
 heal-0.031
 back-0.031
 out-0.029
 roof-0.028
Token for
Token ID329
Feature activation-9.33e-03

Pos logit contributions
elly+0.002
ubby+0.002
urious+0.002
resa+0.002
illing+0.002
Neg logit contributions
 he-0.001
 heal-0.001
 his-0.001
 away-0.001
 back-0.001
Token not
Token ID407
Feature activation-2.27e-02

Pos logit contributions
 he+0.015
 back+0.013
 heal+0.012
 out+0.012
 down+0.011
Neg logit contributions
urious-0.019
resa-0.018
elly-0.018
ubby-0.018
illing-0.017
Token touching
Token ID15241
Feature activation-1.53e-02

Pos logit contributions
 he+0.029
 back+0.024
 heal+0.023
 out+0.022
 they+0.020
Neg logit contributions
urious-0.054
elly-0.052
illing-0.050
ubby-0.050
resa-0.049
Token the
Token ID262
Feature activation-4.93e-02

Pos logit contributions
 he+0.018
 heal+0.016
 back+0.015
 they+0.013
 Zig+0.013
Neg logit contributions
elly-0.035
urious-0.035
ubby-0.033
ude-0.032
elia-0.032
Token file
Token ID2393
Feature activation-1.08e-01

Pos logit contributions
 he+0.080
 back+0.065
 heal+0.064
 out+0.063
 they+0.062
Neg logit contributions
urious-0.099
elly-0.098
ubby-0.093
resa-0.090
illing-0.089
Token.
Token ID13
Feature activation-4.32e-02

Pos logit contributions
 he+0.125
 Zig+0.115
 heal+0.108
 his+0.098
 Snow+0.098
Neg logit contributions
elly-0.243
urious-0.229
illing-0.219
iling-0.219
ubby-0.217
Token Thank
Token ID6952
Feature activation+3.28e-02

Pos logit contributions
 he+0.066
 back+0.061
 out+0.056
 heal+0.056
 away+0.056
Neg logit contributions
elly-0.090
urious-0.088
ubby-0.085
resa-0.083
illing-0.082
Token you
Token ID345
Feature activation-8.78e-04

Pos logit contributions
urious+0.085
elly+0.083
ubby+0.082
resa+0.082
elia+0.081
Neg logit contributions
 he-0.032
 back-0.026
 heal-0.025
 out-0.024
 down-0.022
Token for
Token ID329
Feature activation-1.14e-02

Pos logit contributions
 he+0.001
 heal+0.001
 away+0.001
 his+0.001
 back+0.001
Neg logit contributions
elly-0.002
ubby-0.002
urious-0.002
resa-0.002
illing-0.002
Token being
Token ID852
Feature activation-9.68e-04

Pos logit contributions
 he+0.018
 back+0.016
 out+0.015
 heal+0.015
 down+0.014
Neg logit contributions
urious-0.023
elly-0.022
resa-0.021
ubby-0.021
illing-0.021
Token better
Token ID1365
Feature activation-4.53e-02

Pos logit contributions
 he+0.023
 back+0.020
 heal+0.020
 out+0.018
 Zig+0.017
Neg logit contributions
urious-0.053
elly-0.053
ude-0.052
resa-0.050
ami-0.050
Token way
Token ID835
Feature activation-5.00e-02

Pos logit contributions
 he+0.052
 heal+0.046
 back+0.045
 Zig+0.045
 out+0.045
Neg logit contributions
urious-0.104
elly-0.104
ubby-0.099
ude-0.096
illing-0.096
Token to
Token ID284
Feature activation-7.18e-02

Pos logit contributions
 he+0.050
 Zig+0.048
 heal+0.044
 ramp+0.042
 Frost+0.042
Neg logit contributions
urious-0.107
elly-0.106
ubby-0.103
iling-0.100
illing-0.099
Token repair
Token ID9185
Feature activation-3.09e-02

Pos logit contributions
 he+0.090
 back+0.079
 out+0.075
 down+0.075
 Zig+0.071
Neg logit contributions
urious-0.143
ude-0.139
resa-0.137
iling-0.137
ody-0.136
Token the
Token ID262
Feature activation-5.43e-02

Pos logit contributions
 he+0.034
 Zig+0.028
 heal+0.028
 back+0.028
 roof+0.025
Neg logit contributions
urious-0.069
elly-0.068
ubby-0.065
ude-0.065
 types-0.065
Token TV
Token ID3195
Feature activation-1.08e-01

Pos logit contributions
 he+0.088
 back+0.073
 out+0.072
 they+0.072
 heal+0.071
Neg logit contributions
urious-0.106
elly-0.102
ubby-0.099
resa-0.098
ude-0.094
Token together
Token ID1978
Feature activation-5.22e-02

Pos logit contributions
 Zig+0.136
 he+0.131
 heal+0.116
 Snow+0.108
 back+0.107
Neg logit contributions
elly-0.221
urious-0.221
ude-0.204
resa-0.202
 types-0.199
Token."
Token ID526
Feature activation-2.02e-02

Pos logit contributions
 he+0.059
 Zig+0.053
 back+0.049
 heal+0.048
 out+0.046
Neg logit contributions
urious-0.120
elly-0.115
ude-0.112
iling-0.110
ubby-0.109
Token␣
Token ID220
Feature activation+5.77e-02

Pos logit contributions
 heal+0.030
 back+0.029
 out+0.028
 anything+0.028
 he+0.028
Neg logit contributions
urious-0.038
resa-0.037
elly-0.037
ody-0.036
elia-0.036
Token⏎
Token ID198
Feature activation+3.62e-02

Pos logit contributions
urious+0.124
elly+0.119
ubby+0.115
ude+0.114
illing+0.113
Neg logit contributions
 he-0.084
 back-0.074
 heal-0.071
 out-0.068
 away-0.068
TokenThey
Token ID2990
Feature activation-2.78e-02

Pos logit contributions
urious+0.083
elly+0.079
resa+0.077
ubby+0.076
illing+0.075
Neg logit contributions
 he-0.049
 back-0.044
 out-0.041
 heal-0.041
 down-0.038
Token the
Token ID262
Feature activation-4.98e-02

Pos logit contributions
urious+0.006
elly+0.006
ubby+0.006
elia+0.006
atherine+0.006
Neg logit contributions
 he-0.005
 heal-0.004
stack-0.004
 roof-0.004
 back-0.004
Token mail
Token ID6920
Feature activation-7.99e-02

Pos logit contributions
 he+0.081
 back+0.068
 heal+0.065
 out+0.064
 they+0.064
Neg logit contributions
urious-0.096
elly-0.095
ubby-0.093
resa-0.090
ida-0.087
Token out
Token ID503
Feature activation-3.19e-02

Pos logit contributions
 he+0.102
 Zig+0.092
 drown+0.081
 heal+0.076
 Frost+0.076
Neg logit contributions
elly-0.174
resa-0.164
urious-0.163
ami-0.161
ubby-0.158
Token of
Token ID286
Feature activation-4.76e-02

Pos logit contributions
 he+0.043
 heal+0.037
 back+0.036
 Zig+0.034
 out+0.034
Neg logit contributions
elly-0.069
urious-0.069
ubby-0.067
illing-0.064
iling-0.063
Token the
Token ID262
Feature activation-5.03e-02

Pos logit contributions
 he+0.055
 back+0.046
 heal+0.044
 Zig+0.041
 out+0.040
Neg logit contributions
urious-0.109
ubby-0.107
elly-0.106
resa-0.104
ody-0.102
Token mailbox
Token ID37282
Feature activation-1.08e-01

Pos logit contributions
 he+0.096
 Zig+0.082
 back+0.078
 heal+0.078
 they+0.077
Neg logit contributions
urious-0.084
elly-0.080
ubby-0.079
resa-0.078
 elderly-0.074
Token.
Token ID13
Feature activation-4.54e-02

Pos logit contributions
 he+0.127
 Zig+0.118
 heal+0.110
 his+0.099
 back+0.095
Neg logit contributions
elly-0.240
urious-0.232
resa-0.213
ami-0.212
illing-0.212
Token And
Token ID843
Feature activation-1.58e-02

Pos logit contributions
 he+0.059
 back+0.051
 out+0.047
 heal+0.046
 down+0.043
Neg logit contributions
urious-0.108
elly-0.105
ubby-0.101
resa-0.100
illing-0.099
Token you
Token ID345
Feature activation-1.59e-03

Pos logit contributions
 he+0.016
 back+0.014
 heal+0.013
 out+0.013
 down+0.011
Neg logit contributions
urious-0.040
elly-0.038
resa-0.038
ubby-0.038
ida-0.037
Token should
Token ID815
Feature activation-3.62e-02

Pos logit contributions
 he+0.003
 back+0.003
 heal+0.003
 out+0.003
 they+0.002
Neg logit contributions
urious-0.003
elly-0.003
ubby-0.003
resa-0.003
illing-0.003
Token not
Token ID407
Feature activation-2.08e-02

Pos logit contributions
 he+0.051
 back+0.045
 out+0.044
 down+0.043
 heal+0.039
Neg logit contributions
urious-0.074
resa-0.071
ubby-0.071
ida-0.070
ody-0.069
Token took
Token ID1718
Feature activation+3.13e-02

Pos logit contributions
urious+0.032
ody+0.030
elly+0.030
ude+0.030
ubby+0.030
Neg logit contributions
 he-0.027
 heal-0.024
 back-0.022
 bed-0.022
 Zig-0.021
Token turns
Token ID4962
Feature activation-2.95e-02

Pos logit contributions
elly+0.065
ubby+0.065
urious+0.065
elia+0.064
atherine+0.063
Neg logit contributions
 he-0.038
 heal-0.033
stack-0.031
 roof-0.030
 back-0.027
Token to
Token ID284
Feature activation-4.74e-02

Pos logit contributions
 he+0.032
 heal+0.029
 back+0.027
 Zig+0.025
 bed+0.024
Neg logit contributions
elly-0.067
illing-0.067
urious-0.065
ude-0.063
ami-0.063
Token stir
Token ID11240
Feature activation+8.08e-03

Pos logit contributions
 he+0.062
 back+0.051
 out+0.048
 heal+0.046
 down+0.045
Neg logit contributions
urious-0.106
resa-0.101
ida-0.099
elly-0.099
ubby-0.098
Token the
Token ID262
Feature activation-5.85e-02

Pos logit contributions
urious+0.020
elly+0.020
ubby+0.019
ude+0.019
ami+0.018
Neg logit contributions
 he-0.009
 back-0.007
 heal-0.007
 out-0.006
 Zig-0.006
Token batter
Token ID4984
Feature activation-1.07e-01

Pos logit contributions
 he+0.100
 Zig+0.083
 heal+0.080
 back+0.079
 they+0.078
Neg logit contributions
urious-0.109
elly-0.108
ubby-0.104
resa-0.099
 types-0.099
Token with
Token ID351
Feature activation-3.93e-02

Pos logit contributions
 he+0.138
 Zig+0.131
 heal+0.111
 Snow+0.108
 Brown+0.108
Neg logit contributions
urious-0.223
elly-0.221
 types-0.211
ude-0.210
ami-0.209
Token a
Token ID257
Feature activation-4.24e-02

Pos logit contributions
 he+0.054
 heal+0.046
 back+0.045
 Zig+0.042
 down+0.039
Neg logit contributions
urious-0.081
ubby-0.078
 types-0.078
elly-0.078
ude-0.077
Token wooden
Token ID13510
Feature activation-3.97e-02

Pos logit contributions
 he+0.077
 Zig+0.067
 back+0.066
 heal+0.066
 his+0.063
Neg logit contributions
elly-0.071
urious-0.071
ubby-0.071
ude-0.067
ady-0.067
Token spoon
Token ID24556
Feature activation-5.14e-02

Pos logit contributions
 he+0.068
 Zig+0.060
 heal+0.058
 back+0.057
 out+0.054
Neg logit contributions
elly-0.072
ubby-0.071
urious-0.071
resa-0.066
ude-0.066
Token.
Token ID13
Feature activation-3.44e-02

Pos logit contributions
 he+0.058
 heal+0.052
 Zig+0.052
 back+0.048
 anything+0.045
Neg logit contributions
elly-0.119
illing-0.114
ubby-0.111
ude-0.111
urious-0.110
Token big
Token ID1263
Feature activation-3.35e-02

Pos logit contributions
 he+0.074
 heal+0.062
 back+0.062
 down+0.062
 him+0.062
Neg logit contributions
urious-0.067
ida-0.065
resa-0.064
elly-0.063
ubby-0.061
Token stick
Token ID4859
Feature activation-7.54e-02

Pos logit contributions
 he+0.050
 heal+0.042
 back+0.042
 out+0.039
 they+0.038
Neg logit contributions
urious-0.068
ubby-0.066
elly-0.066
resa-0.064
ida-0.062
Token to
Token ID284
Feature activation-6.11e-02

Pos logit contributions
 he+0.085
 heal+0.081
 Zig+0.076
 back+0.076
 bed+0.069
Neg logit contributions
elly-0.161
illing-0.152
urious-0.151
ubby-0.150
ude-0.149
Token break
Token ID2270
Feature activation+3.21e-03

Pos logit contributions
 he+0.078
 back+0.073
 down+0.064
 out+0.064
 Zig+0.061
Neg logit contributions
urious-0.122
ubby-0.116
ude-0.116
resa-0.114
ida-0.113
Token the
Token ID262
Feature activation-5.25e-02

Pos logit contributions
elly+0.008
ubby+0.007
urious+0.007
ude+0.007
 types+0.007
Neg logit contributions
 he-0.003
 heal-0.002
 Zig-0.002
 back-0.002
 bed-0.002
Token vine
Token ID17793
Feature activation-1.07e-01

Pos logit contributions
 he+0.081
 they+0.067
 heal+0.066
 Zig+0.066
 back+0.064
Neg logit contributions
urious-0.099
ubby-0.098
elly-0.096
resa-0.095
 elderly-0.092
Token.
Token ID13
Feature activation-4.32e-02

Pos logit contributions
 he+0.130
 Zig+0.123
 heal+0.110
 Snow+0.108
 his+0.108
Neg logit contributions
urious-0.215
elly-0.214
illing-0.201
ude-0.196
 types-0.193
Token Lily
Token ID20037
Feature activation+4.67e-02

Pos logit contributions
 back+0.059
 heal+0.059
 he+0.058
 out+0.055
 away+0.054
Neg logit contributions
urious-0.091
elly-0.088
resa-0.085
illing-0.083
elia-0.083
Token looked
Token ID3114
Feature activation+6.31e-03

Pos logit contributions
elly+0.097
urious+0.097
ubby+0.094
illing+0.090
resa+0.089
Neg logit contributions
 he-0.076
 back-0.067
 out-0.062
 heal-0.061
 his-0.059
Token around
Token ID1088
Feature activation-9.73e-03

Pos logit contributions
urious+0.014
elly+0.013
ubby+0.013
illing+0.012
elia+0.012
Neg logit contributions
 he-0.009
 heal-0.008
 back-0.007
 bed-0.007
 Zig-0.007
Token and
Token ID290
Feature activation-3.60e-02

Pos logit contributions
 he+0.015
 heal+0.014
 back+0.013
 out+0.013
 Zig+0.012
Neg logit contributions
urious-0.019
elly-0.018
resa-0.017
ubby-0.017
ude-0.017
Token meant
Token ID4001
Feature activation-2.63e-02

Pos logit contributions
 he+0.072
 back+0.065
 heal+0.063
 out+0.062
 anything+0.058
Neg logit contributions
urious-0.091
elly-0.085
ude-0.082
ida-0.081
ubby-0.081
Token she
Token ID673
Feature activation-4.64e-02

Pos logit contributions
 he+0.028
 heal+0.026
 back+0.025
 ramp+0.025
 bed+0.024
Neg logit contributions
ude-0.052
ida-0.051
urious-0.050
elly-0.049
illing-0.049
Token could
Token ID714
Feature activation-4.40e-02

Pos logit contributions
 he+0.079
 back+0.071
 out+0.067
 heal+0.067
 down+0.063
Neg logit contributions
urious-0.091
elly-0.086
ubby-0.083
resa-0.082
ude-0.081
Token play
Token ID711
Feature activation-3.58e-03

Pos logit contributions
 he+0.059
 back+0.052
 down+0.050
 out+0.049
 Zig+0.048
Neg logit contributions
urious-0.089
ude-0.087
ida-0.083
resa-0.082
ubby-0.082
Token the
Token ID262
Feature activation-4.73e-02

Pos logit contributions
 he+0.005
 heal+0.004
 back+0.004
 drown+0.004
 Zig+0.004
Neg logit contributions
elly-0.007
ude-0.006
urious-0.006
ida-0.006
iling-0.006
Token piano
Token ID19132
Feature activation-1.07e-01

Pos logit contributions
 he+0.080
 back+0.069
 heal+0.067
 out+0.066
 his+0.065
Neg logit contributions
urious-0.089
resa-0.084
elly-0.084
ubby-0.082
ida-0.081
Token again
Token ID757
Feature activation-3.79e-02

Pos logit contributions
 he+0.124
 Zig+0.115
 heal+0.108
 Frost+0.106
 Snow+0.102
Neg logit contributions
urious-0.220
elly-0.214
ida-0.210
ude-0.208
ody-0.208
Token.
Token ID13
Feature activation-4.42e-02

Pos logit contributions
 he+0.043
 back+0.036
 heal+0.036
 out+0.034
 Zig+0.032
Neg logit contributions
urious-0.092
elly-0.089
ude-0.086
ubby-0.085
resa-0.085
Token<|endoftext|>
Token ID50256
Feature activation+4.90e-02

Pos logit contributions
 he+0.062
 back+0.058
 heal+0.055
 out+0.054
 away+0.052
Neg logit contributions
urious-0.096
elly-0.094
resa-0.090
elia-0.088
illing-0.088
TokenOnce
Token ID7454
Feature activation+5.55e-02

Pos logit contributions
urious+0.101
elly+0.098
ubby+0.093
resa+0.092
ude+0.091
Neg logit contributions
 he-0.076
 back-0.068
 heal-0.066
 out-0.063
 away-0.059
Token upon
Token ID2402
Feature activation+1.65e-02

Pos logit contributions
urious+0.087
elly+0.082
ubby+0.078
resa+0.076
illing+0.075
Neg logit contributions
 he-0.116
 back-0.107
 heal-0.103
 out-0.102
 down-0.097
Token if
Token ID611
Feature activation-1.94e-02

Pos logit contributions
urious+0.012
elly+0.012
ubby+0.011
resa+0.011
illing+0.011
Neg logit contributions
 he-0.008
 back-0.007
 out-0.006
 heal-0.006
 they-0.006
Token she
Token ID673
Feature activation-2.68e-02

Pos logit contributions
 back+0.017
 heal+0.017
 he+0.015
 down+0.015
 away+0.015
Neg logit contributions
ida-0.045
elia-0.045
urious-0.044
resa-0.043
ude-0.042
Token could
Token ID714
Feature activation-2.87e-02

Pos logit contributions
 he+0.043
 back+0.037
 out+0.034
 heal+0.033
 his+0.033
Neg logit contributions
elly-0.056
urious-0.056
ubby-0.054
illing-0.053
resa-0.053
Token turn
Token ID1210
Feature activation+9.60e-04

Pos logit contributions
 he+0.044
 back+0.042
 out+0.039
 down+0.038
 heal+0.037
Neg logit contributions
urious-0.057
elly-0.053
ude-0.052
resa-0.052
ubby-0.052
Token the
Token ID262
Feature activation-3.62e-02

Pos logit contributions
elly+0.002
ubby+0.002
ody+0.002
urious+0.002
ida+0.002
Neg logit contributions
 he-0.001
 heal-0.001
 Zig-0.001
stack-0.001
 drown-0.001
Token journal
Token ID3989
Feature activation-1.06e-01

Pos logit contributions
 he+0.063
 Zig+0.056
 heal+0.055
 they+0.054
 back+0.054
Neg logit contributions
urious-0.066
elly-0.061
ubby-0.060
ody-0.059
resa-0.059
Token into
Token ID656
Feature activation-2.34e-02

Pos logit contributions
 Zig+0.115
 he+0.113
 Frost+0.104
 drown+0.097
 heal+0.094
Neg logit contributions
urious-0.231
elly-0.221
ude-0.214
ody-0.214
ida-0.213
Token what
Token ID644
Feature activation-4.29e-02

Pos logit contributions
 he+0.026
 back+0.023
 heal+0.021
 out+0.019
 down+0.019
Neg logit contributions
urious-0.055
elly-0.054
ubby-0.053
ude-0.052
resa-0.052
Token she
Token ID673
Feature activation-6.13e-02

Pos logit contributions
 back+0.046
 he+0.043
 heal+0.042
 down+0.041
 out+0.040
Neg logit contributions
urious-0.093
ida-0.091
 types-0.091
elly-0.090
ude-0.089
Token wanted
Token ID2227
Feature activation-6.75e-03

Pos logit contributions
 he+0.105
 back+0.090
 out+0.083
 heal+0.081
 his+0.080
Neg logit contributions
elly-0.124
urious-0.123
ubby-0.121
illing-0.119
resa-0.118
Token?
Token ID30
Feature activation-4.10e-02

Pos logit contributions
 he+0.008
 heal+0.007
 back+0.007
 Zig+0.006
 drown+0.006
Neg logit contributions
urious-0.015
elly-0.015
ody-0.014
ubby-0.014
ude-0.014
Token You
Token ID921
Feature activation-2.14e-02

Pos logit contributions
 he+0.071
 back+0.065
 out+0.060
 heal+0.058
 away+0.057
Neg logit contributions
urious-0.124
elly-0.122
ubby-0.118
resa-0.116
illing-0.114
Token need
Token ID761
Feature activation-1.30e-02

Pos logit contributions
 he+0.032
 back+0.028
 heal+0.026
 out+0.026
 down+0.025
Neg logit contributions
urious-0.046
elly-0.044
ubby-0.043
resa-0.042
ude-0.042
Token to
Token ID284
Feature activation-4.57e-02

Pos logit contributions
 he+0.013
 back+0.010
 heal+0.009
 down+0.008
 Zig+0.008
Neg logit contributions
urious-0.033
elly-0.033
ubby-0.031
resa-0.031
illing-0.031
Token remove
Token ID4781
Feature activation+8.26e-03

Pos logit contributions
 he+0.062
 back+0.053
 out+0.049
 down+0.047
 heal+0.047
Neg logit contributions
urious-0.103
elly-0.098
resa-0.097
ubby-0.097
ude-0.095
Token the
Token ID262
Feature activation-6.26e-02

Pos logit contributions
elly+0.019
urious+0.019
ubby+0.019
elia+0.018
resa+0.018
Neg logit contributions
 he-0.010
 heal-0.008
 back-0.007
 out-0.006
 they-0.006
Token glue
Token ID22749
Feature activation-1.06e-01

Pos logit contributions
 he+0.102
 back+0.085
 heal+0.083
 they+0.082
 out+0.080
Neg logit contributions
urious-0.121
elly-0.121
ubby-0.117
resa-0.113
illing-0.110
Token and
Token ID290
Feature activation-7.41e-02

Pos logit contributions
 he+0.150
 Zig+0.138
 heal+0.127
 back+0.118
 his+0.115
Neg logit contributions
elly-0.215
ubby-0.204
urious-0.200
ude-0.193
illing-0.191
Token put
Token ID1234
Feature activation-4.78e-03

Pos logit contributions
 he+0.099
 Zig+0.088
 back+0.088
 out+0.080
 heal+0.079
Neg logit contributions
urious-0.141
 types-0.132
resa-0.131
 elderly-0.129
elly-0.127
Token it
Token ID340
Feature activation-7.89e-02

Pos logit contributions
 he+0.005
 heal+0.004
 Zig+0.004
 bed+0.003
 drown+0.003
Neg logit contributions
ubby-0.011
elly-0.011
urious-0.011
ody-0.011
elia-0.011
Token away
Token ID1497
Feature activation-2.23e-02

Pos logit contributions
 he+0.103
 Zig+0.090
 bed+0.081
 heal+0.079
stack+0.078
Neg logit contributions
ubby-0.156
urious-0.155
elly-0.152
ady-0.141
umpy-0.141
Token,
Token ID11
Feature activation-2.57e-02

Pos logit contributions
 he+0.027
 heal+0.022
 back+0.021
 Zig+0.020
 bed+0.019
Neg logit contributions
elly-0.051
urious-0.051
ubby-0.049
illing-0.047
ude-0.047
Token happy
Token ID3772
Feature activation-1.83e-02

Pos logit contributions
 he+0.027
 heal+0.024
 Zig+0.023
 back+0.023
 bed+0.022
Neg logit contributions
urious-0.030
ubby-0.029
illing-0.028
ady-0.027
ude-0.027
Token to
Token ID284
Feature activation-3.56e-02

Pos logit contributions
 heal+0.022
 Zig+0.021
 ramp+0.020
 he+0.020
 away+0.020
Neg logit contributions
elly-0.036
urious-0.036
ude-0.035
 types-0.034
iling-0.033
Token have
Token ID423
Feature activation-5.69e-02

Pos logit contributions
 he+0.046
 Zig+0.042
 ramp+0.040
 car+0.039
 sled+0.038
Neg logit contributions
ubby-0.067
ude-0.065
urious-0.065
iling-0.063
ida-0.061
Token read
Token ID1100
Feature activation-1.92e-02

Pos logit contributions
 he+0.080
 heal+0.072
 Zig+0.070
stack+0.069
 roof+0.069
Neg logit contributions
urious-0.108
ude-0.105
ady-0.104
ody-0.104
iling-0.103
Token the
Token ID262
Feature activation-4.76e-02

Pos logit contributions
 he+0.022
 drown+0.021
 heal+0.020
 back+0.020
 crash+0.018
Neg logit contributions
elly-0.038
ude-0.037
urious-0.037
ami-0.036
 types-0.036
Token newspaper
Token ID7533
Feature activation-1.06e-01

Pos logit contributions
 he+0.083
 back+0.073
 out+0.070
 heal+0.069
 they+0.068
Neg logit contributions
urious-0.088
elly-0.081
resa-0.081
ubby-0.080
ida-0.077
Token.
Token ID13
Feature activation-4.61e-02

Pos logit contributions
 Zig+0.117
 he+0.117
 drown+0.115
 heal+0.109
 Frost+0.102
Neg logit contributions
elly-0.212
urious-0.206
ody-0.204
oph-0.201
resa-0.200
Token<|endoftext|>
Token ID50256
Feature activation+4.89e-02

Pos logit contributions
 back+0.068
 he+0.067
 heal+0.066
 away+0.065
 out+0.064
Neg logit contributions
urious-0.093
elly-0.091
resa-0.086
elia-0.084
illing-0.084
TokenOnce
Token ID7454
Feature activation+5.66e-02

Pos logit contributions
urious+0.105
elly+0.103
ubby+0.098
resa+0.096
illing+0.096
Neg logit contributions
 he-0.070
 back-0.062
 heal-0.060
 out-0.057
 away-0.054
Token upon
Token ID2402
Feature activation+1.85e-02

Pos logit contributions
urious+0.090
elly+0.085
ubby+0.081
resa+0.080
illing+0.078
Neg logit contributions
 he-0.118
 back-0.108
 heal-0.103
 out-0.103
 his-0.097
Token a
Token ID257
Feature activation+2.41e-02

Pos logit contributions
urious+0.046
elly+0.044
ubby+0.043
resa+0.043
illing+0.042
Neg logit contributions
 he-0.023
 back-0.019
 out-0.018
 heal-0.017
 his-0.016
Token shouldn
Token ID6584
Feature activation-2.20e-02

Pos logit contributions
 he+0.058
 back+0.052
 out+0.049
 heal+0.048
 his+0.046
Neg logit contributions
urious-0.075
elly-0.072
ubby-0.069
resa-0.068
illing-0.067
Token't
Token ID470
Feature activation-1.39e-02

Pos logit contributions
 he+0.028
 back+0.025
 out+0.024
 heal+0.023
 down+0.023
Neg logit contributions
urious-0.049
ude-0.046
resa-0.046
ubby-0.046
elly-0.046
Token have
Token ID423
Feature activation-4.33e-02

Pos logit contributions
 he+0.018
 back+0.017
 out+0.016
 down+0.015
 bed+0.015
Neg logit contributions
urious-0.030
elly-0.028
ida-0.028
resa-0.028
ude-0.027
Token brought
Token ID3181
Feature activation+5.06e-03

Pos logit contributions
 he+0.073
 back+0.064
 heal+0.063
 out+0.061
 down+0.058
Neg logit contributions
urious-0.084
elly-0.078
ody-0.076
resa-0.075
ude-0.075
Token the
Token ID262
Feature activation-4.79e-02

Pos logit contributions
elia+0.010
urious+0.010
ody+0.009
elly+0.009
ubby+0.009
Neg logit contributions
 he-0.007
 heal-0.006
 roof-0.006
stack-0.006
pan-0.006
Token glue
Token ID22749
Feature activation-1.05e-01

Pos logit contributions
 he+0.082
 back+0.072
 heal+0.069
 out+0.068
 down+0.065
Neg logit contributions
urious-0.092
elly-0.089
ubby-0.086
resa-0.085
ida-0.082
Token.
Token ID13
Feature activation-5.29e-02

Pos logit contributions
 he+0.124
 Zig+0.114
 heal+0.106
 Frost+0.090
 Snow+0.089
Neg logit contributions
elly-0.236
ubby-0.224
urious-0.220
ody-0.217
umpy-0.213
Token Now
Token ID2735
Feature activation-1.64e-02

Pos logit contributions
 he+0.072
 back+0.066
 out+0.063
 heal+0.060
 away+0.060
Neg logit contributions
urious-0.117
elly-0.114
resa-0.110
ubby-0.109
illing-0.107
Token we
Token ID356
Feature activation-3.03e-02

Pos logit contributions
 he+0.017
 numb+0.016
 heal+0.016
 Zig+0.015
 back+0.015
Neg logit contributions
elia-0.032
urious-0.031
oph-0.031
ody-0.030
ude-0.030
Token can
Token ID460
Feature activation-3.68e-02

Pos logit contributions
 he+0.045
 back+0.037
 heal+0.037
 his+0.035
 out+0.035
Neg logit contributions
urious-0.063
ubby-0.061
elly-0.060
ody-0.058
ida-0.057
Token't
Token ID470
Feature activation-1.44e-02

Pos logit contributions
 he+0.049
 back+0.041
 out+0.039
 down+0.039
 anything+0.037
Neg logit contributions
urious-0.080
ubby-0.076
ude-0.075
resa-0.074
elly-0.074
Token,
Token ID11
Feature activation-2.96e-02

Pos logit contributions
urious+0.013
elly+0.013
ubby+0.012
resa+0.012
ude+0.012
Neg logit contributions
 he-0.011
 back-0.009
 heal-0.009
 out-0.009
 his-0.008
Token let
Token ID1309
Feature activation-1.20e-02

Pos logit contributions
 heal+0.033
 he+0.032
 back+0.032
 away+0.030
 out+0.027
Neg logit contributions
elly-0.061
resa-0.058
ubby-0.058
urious-0.058
ida-0.058
Token's
Token ID338
Feature activation-3.34e-02

Pos logit contributions
 he+0.017
 heal+0.015
 back+0.013
 roof+0.013
 Zig+0.013
Neg logit contributions
urious-0.025
ody-0.024
elly-0.024
oph-0.023
ubby-0.023
Token get
Token ID651
Feature activation-2.05e-03

Pos logit contributions
 he+0.041
 back+0.033
 Zig+0.032
 heal+0.031
 out+0.030
Neg logit contributions
urious-0.070
ude-0.068
resa-0.067
ida-0.067
ubby-0.065
Token your
Token ID534
Feature activation-3.01e-02

Pos logit contributions
 roof+0.002
 ramp+0.002
stack+0.002
 heal+0.002
 fort+0.002
Neg logit contributions
ubby-0.004
elia-0.004
ody-0.004
elly-0.004
 elderly-0.004
Token journal
Token ID3989
Feature activation-1.05e-01

Pos logit contributions
 he+0.047
 heal+0.038
 they+0.037
 Zig+0.036
 out+0.036
Neg logit contributions
urious-0.058
ubby-0.057
elly-0.055
resa-0.053
 elderly-0.053
Token so
Token ID523
Feature activation+4.23e-03

Pos logit contributions
 he+0.126
 Zig+0.116
 heal+0.106
 his+0.095
 anything+0.094
Neg logit contributions
elly-0.223
urious-0.215
ody-0.207
oph-0.206
ubby-0.202
Token you
Token ID345
Feature activation-2.66e-02

Pos logit contributions
elia+0.009
ubby+0.009
urious+0.009
ida+0.009
elly+0.009
Neg logit contributions
stack-0.004
 bed-0.003
pan-0.003
 stret-0.003
 ramp-0.003
Token can
Token ID460
Feature activation-4.54e-02

Pos logit contributions
 he+0.036
 back+0.031
 heal+0.029
 his+0.029
 bed+0.029
Neg logit contributions
urious-0.057
ubby-0.055
elly-0.055
ody-0.052
ited-0.052
Token write
Token ID3551
Feature activation-2.23e-02

Pos logit contributions
 he+0.059
 back+0.053
 out+0.050
 down+0.049
 they+0.047
Neg logit contributions
urious-0.096
ubby-0.092
ude-0.091
ody-0.091
elly-0.091
Token about
Token ID546
Feature activation-3.80e-02

Pos logit contributions
 he+0.023
 drown+0.018
 heal+0.018
stack+0.017
 roof+0.017
Neg logit contributions
urious-0.050
elly-0.050
ude-0.049
ubby-0.049
ody-0.049
Token we
Token ID356
Feature activation-2.68e-02

Pos logit contributions
 back+0.018
 heal+0.018
 he+0.018
 away+0.016
 out+0.016
Neg logit contributions
resa-0.037
elia-0.035
ida-0.035
ude-0.035
urious-0.035
Token need
Token ID761
Feature activation-1.34e-02

Pos logit contributions
 he+0.038
 back+0.031
 heal+0.030
 his+0.029
 out+0.029
Neg logit contributions
urious-0.058
ubby-0.057
elly-0.055
ude-0.054
ida-0.053
Token to
Token ID284
Feature activation-4.83e-02

Pos logit contributions
 he+0.014
 heal+0.011
 back+0.010
 down+0.010
 out+0.009
Neg logit contributions
urious-0.032
elly-0.031
resa-0.030
elia-0.030
ubby-0.030
Token pack
Token ID2353
Feature activation-2.80e-03

Pos logit contributions
 he+0.068
 back+0.056
 out+0.053
 down+0.051
 they+0.050
Neg logit contributions
urious-0.103
ubby-0.098
resa-0.097
ude-0.096
ida-0.095
Token our
Token ID674
Feature activation-3.57e-02

Pos logit contributions
 he+0.003
 heal+0.003
 back+0.003
 Zig+0.002
 crash+0.002
Neg logit contributions
urious-0.006
elly-0.006
ubby-0.006
elia-0.006
resa-0.006
Token diary
Token ID26339
Feature activation-1.05e-01

Pos logit contributions
 he+0.066
 heal+0.058
 out+0.055
 back+0.055
 Zig+0.053
Neg logit contributions
urious-0.063
ubby-0.058
elly-0.057
resa-0.054
elia-0.054
Token and
Token ID290
Feature activation-7.63e-02

Pos logit contributions
 he+0.138
 heal+0.114
 Zig+0.112
 his+0.104
 crash+0.103
Neg logit contributions
elly-0.220
urious-0.219
ubby-0.206
resa-0.205
elia-0.203
Token some
Token ID617
Feature activation-3.77e-02

Pos logit contributions
 he+0.098
 Zig+0.094
 back+0.089
 down+0.088
 out+0.086
Neg logit contributions
urious-0.129
 elderly-0.124
resa-0.123
elia-0.118
atherine-0.118
Token cr
Token ID1067
Feature activation-3.73e-04

Pos logit contributions
 he+0.057
 Zig+0.052
 back+0.051
 down+0.049
 heal+0.049
Neg logit contributions
elly-0.066
urious-0.065
ubby-0.064
 types-0.062
 elderly-0.060
Tokenay
Token ID323
Feature activation-2.43e-02

Pos logit contributions
 he+0.001
 back+0.001
 out+0.001
 heal+0.001
 his+0.001
Neg logit contributions
urious-0.001
elly-0.000
resa-0.000
ida-0.000
ubby-0.000
Tokenons
Token ID684
Feature activation-5.83e-02

Pos logit contributions
 he+0.048
 back+0.045
 heal+0.043
 out+0.042
 down+0.042
Neg logit contributions
urious-0.037
elly-0.037
ubby-0.036
ami-0.035
ody-0.034
Token the
Token ID262
Feature activation-2.64e-02

Pos logit contributions
 he+0.038
 back+0.034
 heal+0.033
 out+0.031
 down+0.028
Neg logit contributions
urious-0.067
elly-0.065
ubby-0.063
resa-0.062
ida-0.061
Token beautiful
Token ID4950
Feature activation-2.31e-02

Pos logit contributions
 he+0.049
 back+0.044
 out+0.042
 heal+0.042
 his+0.039
Neg logit contributions
urious-0.048
elly-0.046
ubby-0.044
resa-0.043
illing-0.043
Token things
Token ID1243
Feature activation-5.85e-02

Pos logit contributions
 he+0.043
 back+0.039
 heal+0.038
 out+0.036
 away+0.034
Neg logit contributions
urious-0.041
elly-0.039
ubby-0.038
resa-0.036
 types-0.036
Token in
Token ID287
Feature activation-4.46e-02

Pos logit contributions
 he+0.064
 Zig+0.062
 heal+0.061
 bed+0.059
 drown+0.057
Neg logit contributions
elly-0.121
ody-0.115
elia-0.115
ubby-0.115
urious-0.115
Token this
Token ID428
Feature activation-4.92e-02

Pos logit contributions
 he+0.044
 back+0.036
 out+0.032
 heal+0.032
 his+0.029
Neg logit contributions
urious-0.121
elly-0.117
ubby-0.114
resa-0.113
illing-0.112
Token journal
Token ID3989
Feature activation-1.05e-01

Pos logit contributions
 he+0.086
 back+0.077
 heal+0.072
 out+0.072
 his+0.068
Neg logit contributions
urious-0.095
elly-0.091
ubby-0.088
resa-0.086
illing-0.085
Token,
Token ID11
Feature activation-2.41e-02

Pos logit contributions
 he+0.148
 heal+0.132
 back+0.128
 Zig+0.125
 his+0.124
Neg logit contributions
elly-0.213
urious-0.211
ubby-0.197
elia-0.197
resa-0.196
Token and
Token ID290
Feature activation-5.64e-02

Pos logit contributions
 he+0.035
 back+0.031
 heal+0.031
 away+0.029
 out+0.028
Neg logit contributions
urious-0.046
elly-0.045
resa-0.044
ubby-0.044
ida-0.044
Token it
Token ID340
Feature activation-4.96e-02

Pos logit contributions
 he+0.060
 back+0.055
 heal+0.051
 Zig+0.048
 out+0.048
Neg logit contributions
urious-0.129
resa-0.121
elia-0.121
elly-0.120
ida-0.119
Token doesn
Token ID1595
Feature activation-1.89e-02

Pos logit contributions
 he+0.078
 back+0.068
 bed+0.066
 Zig+0.065
 heal+0.065
Neg logit contributions
urious-0.099
 types-0.088
elly-0.088
elia-0.087
ida-0.086
Token't
Token ID470
Feature activation-6.60e-03

Pos logit contributions
 he+0.024
 back+0.021
 away+0.019
 out+0.019
 down+0.018
Neg logit contributions
urious-0.042
elly-0.041
ubby-0.041
ude-0.041
resa-0.040
Token.
Token ID13
Feature activation-4.92e-02

Pos logit contributions
 he+0.068
 heal+0.057
 Zig+0.051
 back+0.050
 out+0.049
Neg logit contributions
urious-0.131
elly-0.130
ude-0.123
ida-0.123
ubby-0.122
Token We
Token ID775
Feature activation-4.11e-02

Pos logit contributions
 he+0.070
 back+0.067
 heal+0.062
 out+0.062
 away+0.060
Neg logit contributions
urious-0.104
elly-0.103
ubby-0.098
resa-0.097
illing-0.096
Token can
Token ID460
Feature activation-4.50e-02

Pos logit contributions
 he+0.060
 back+0.052
 heal+0.048
 out+0.048
 his+0.045
Neg logit contributions
urious-0.090
elly-0.086
ubby-0.085
resa-0.083
ude-0.082
Token cut
Token ID2005
Feature activation+6.87e-03

Pos logit contributions
 he+0.061
 back+0.053
 out+0.050
 down+0.048
 heal+0.047
Neg logit contributions
urious-0.098
resa-0.093
ude-0.093
ubby-0.093
elly-0.092
Token the
Token ID262
Feature activation-5.67e-02

Pos logit contributions
elly+0.016
ubby+0.016
urious+0.015
ami+0.015
ude+0.015
Neg logit contributions
 he-0.008
 heal-0.006
 back-0.006
 bed-0.005
 roof-0.005
Token paper
Token ID3348
Feature activation-1.04e-01

Pos logit contributions
 he+0.095
 back+0.081
 heal+0.078
 out+0.076
 his+0.075
Neg logit contributions
urious-0.110
elly-0.108
ubby-0.105
resa-0.103
ida-0.101
Token into
Token ID656
Feature activation-3.18e-02

Pos logit contributions
 he+0.166
 Zig+0.155
 heal+0.141
 Snow+0.135
 his+0.133
Neg logit contributions
elly-0.184
ubby-0.172
urious-0.171
ude-0.169
ami-0.164
Token strips
Token ID22670
Feature activation-7.11e-02

Pos logit contributions
 he+0.043
 back+0.037
 out+0.035
 heal+0.034
 his+0.032
Neg logit contributions
urious-0.075
elly-0.072
ubby-0.069
resa-0.069
illing-0.068
Token and
Token ID290
Feature activation-7.43e-02

Pos logit contributions
 he+0.097
 Zig+0.088
 heal+0.086
 back+0.081
 Snow+0.078
Neg logit contributions
elly-0.147
ubby-0.142
urious-0.136
illing-0.136
umpy-0.133
Token write
Token ID3551
Feature activation-1.49e-02

Pos logit contributions
 he+0.099
 back+0.088
 Zig+0.081
 heal+0.081
 out+0.078
Neg logit contributions
urious-0.141
ude-0.136
resa-0.134
elly-0.134
ubby-0.133
Token our
Token ID674
Feature activation-3.39e-02

Pos logit contributions
 he+0.020
 heal+0.016
 back+0.015
 bed+0.014
 Zig+0.014
Neg logit contributions
urious-0.032
elly-0.031
ubby-0.031
ude-0.031
ami-0.030
Token were
Token ID547
Feature activation-7.61e-03

Pos logit contributions
 he+0.112
 heal+0.103
 back+0.099
 bed+0.094
 out+0.093
Neg logit contributions
urious-0.115
ody-0.108
ubby-0.107
ida-0.106
ude-0.106
Token sorry
Token ID7926
Feature activation+9.53e-04

Pos logit contributions
 he+0.012
 heal+0.011
 back+0.010
 out+0.009
 Zig+0.009
Neg logit contributions
urious-0.014
ubby-0.014
illing-0.014
elly-0.014
ude-0.013
Token for
Token ID329
Feature activation-1.29e-02

Pos logit contributions
elly+0.002
urious+0.002
illing+0.002
ude+0.002
ubby+0.002
Neg logit contributions
stack-0.001
 Zig-0.001
 heal-0.001
 he-0.001
 back-0.001
Token measuring
Token ID15964
Feature activation-2.93e-02

Pos logit contributions
 he+0.019
 back+0.018
 down+0.017
 out+0.017
 heal+0.017
Neg logit contributions
resa-0.021
urious-0.020
iling-0.020
ude-0.020
elia-0.020
Token the
Token ID262
Feature activation-4.64e-02

Pos logit contributions
 he+0.037
 Zig+0.033
 heal+0.033
 back+0.028
 disappear+0.027
Neg logit contributions
urious-0.064
elly-0.060
ude-0.059
 types-0.059
ubby-0.058
Token radio
Token ID5243
Feature activation-1.04e-01

Pos logit contributions
 he+0.073
 back+0.060
 heal+0.060
 out+0.058
 Zig+0.057
Neg logit contributions
urious-0.092
ubby-0.089
elly-0.089
resa-0.086
ude-0.082
Token.
Token ID13
Feature activation-4.43e-02

Pos logit contributions
 Zig+0.125
 he+0.124
 heal+0.105
 Snow+0.100
 his+0.100
Neg logit contributions
urious-0.212
elly-0.211
ude-0.199
illing-0.197
umpy-0.191
Token Their
Token ID5334
Feature activation-9.58e-03

Pos logit contributions
 back+0.069
 he+0.067
 heal+0.065
 out+0.065
 away+0.065
Neg logit contributions
elly-0.085
urious-0.085
resa-0.080
ude-0.078
illing-0.077
Token mom
Token ID1995
Feature activation+2.97e-02

Pos logit contributions
 he+0.014
 out+0.011
 heal+0.011
 back+0.011
 down+0.011
Neg logit contributions
urious-0.021
elly-0.020
ubby-0.020
resa-0.019
ude-0.019
Token said
Token ID531
Feature activation+1.60e-02

Pos logit contributions
urious+0.066
elly+0.065
ubby+0.062
resa+0.060
ude+0.060
Neg logit contributions
 he-0.047
 back-0.042
 out-0.038
 heal-0.037
 they-0.036
Token it
Token ID340
Feature activation-3.17e-02

Pos logit contributions
ude+0.033
ubby+0.033
elly+0.032
resa+0.032
ida+0.032
Neg logit contributions
 ramp-0.016
 heal-0.015
 down-0.014
 back-0.014
 bed-0.014
Token help
Token ID1037
Feature activation-3.41e-02

Pos logit contributions
 he+0.040
 back+0.036
 out+0.034
 heal+0.033
 away+0.032
Neg logit contributions
elly-0.053
urious-0.052
ubby-0.050
resa-0.049
illing-0.048
Token Mom
Token ID11254
Feature activation-9.96e-03

Pos logit contributions
 he+0.040
 back+0.036
 Zig+0.035
 down+0.030
 away+0.030
Neg logit contributions
urious-0.075
elly-0.071
ude-0.070
iling-0.069
ubby-0.069
Token to
Token ID284
Feature activation-3.88e-02

Pos logit contributions
 he+0.015
 back+0.013
 Zig+0.012
 heal+0.012
 out+0.012
Neg logit contributions
urious-0.020
elly-0.019
illing-0.019
ubby-0.019
resa-0.018
Token pour
Token ID12797
Feature activation-1.80e-02

Pos logit contributions
 he+0.046
 back+0.039
 down+0.035
 Zig+0.033
 out+0.033
Neg logit contributions
urious-0.087
resa-0.086
ida-0.083
ude-0.082
ubby-0.082
Token the
Token ID262
Feature activation-6.01e-02

Pos logit contributions
 he+0.025
 Zig+0.022
 heal+0.022
 shield+0.020
stack+0.020
Neg logit contributions
elly-0.036
ubby-0.034
urious-0.034
 types-0.033
illing-0.032
Token batter
Token ID4984
Feature activation-1.04e-01

Pos logit contributions
 he+0.100
 back+0.082
 heal+0.081
 his+0.078
 Zig+0.077
Neg logit contributions
urious-0.118
elly-0.117
ubby-0.112
illing-0.107
resa-0.107
Token into
Token ID656
Feature activation-3.32e-02

Pos logit contributions
 he+0.130
 Zig+0.122
 heal+0.106
 Brown+0.096
 Blink+0.096
Neg logit contributions
elly-0.224
urious-0.216
ubby-0.213
illing-0.206
 types-0.205
Token a
Token ID257
Feature activation-5.68e-02

Pos logit contributions
 he+0.043
 back+0.038
 heal+0.036
 down+0.033
 Zig+0.033
Neg logit contributions
elly-0.068
urious-0.068
ubby-0.066
ami-0.066
ude-0.066
Token pan
Token ID3425
Feature activation-8.00e-02

Pos logit contributions
 he+0.098
 Zig+0.086
 heal+0.081
 his+0.080
 him+0.080
Neg logit contributions
elly-0.101
urious-0.100
ubby-0.097
ida-0.095
resa-0.092
Token.
Token ID13
Feature activation-4.83e-02

Pos logit contributions
 he+0.108
 Zig+0.103
 heal+0.092
 his+0.086
 anything+0.085
Neg logit contributions
elly-0.164
illing-0.152
urious-0.149
ami-0.148
ubby-0.148
Token Mom
Token ID11254
Feature activation+3.01e-02

Pos logit contributions
 he+0.067
 back+0.065
 heal+0.063
 away+0.061
 out+0.060
Neg logit contributions
urious-0.102
elly-0.100
resa-0.095
illing-0.094
ubby-0.093
Token them
Token ID606
Feature activation-4.50e-02

Pos logit contributions
urious+0.044
elly+0.042
resa+0.042
ami+0.041
ude+0.041
Neg logit contributions
 he-0.025
 back-0.020
 heal-0.019
 Zig-0.018
 out-0.018
Token sprinkle
Token ID45799
Feature activation-9.62e-03

Pos logit contributions
 he+0.065
 Zig+0.063
 Snow+0.055
 back+0.055
 cho+0.054
Neg logit contributions
urious-0.080
illing-0.076
ude-0.075
resa-0.073
elly-0.073
Token them
Token ID606
Feature activation-6.03e-02

Pos logit contributions
 he+0.012
 heal+0.010
 back+0.010
 Zig+0.010
 bed+0.009
Neg logit contributions
elly-0.021
ubby-0.020
urious-0.019
illing-0.019
ami-0.019
Token over
Token ID625
Feature activation-3.09e-02

Pos logit contributions
 he+0.072
 Zig+0.065
 heal+0.061
stack+0.061
 bed+0.058
Neg logit contributions
elly-0.125
urious-0.124
ubby-0.119
illing-0.119
ude-0.118
Token the
Token ID262
Feature activation-5.02e-02

Pos logit contributions
 he+0.034
 heal+0.030
 back+0.028
 Zig+0.026
 out+0.026
Neg logit contributions
urious-0.072
elly-0.072
ubby-0.070
ude-0.068
ami-0.068
Token batter
Token ID4984
Feature activation-1.04e-01

Pos logit contributions
 he+0.087
 Zig+0.072
 heal+0.072
 out+0.070
 his+0.070
Neg logit contributions
urious-0.094
elly-0.091
resa-0.088
ubby-0.087
illing-0.084
Token.
Token ID13
Feature activation-3.09e-02

Pos logit contributions
 he+0.125
 Zig+0.119
 heal+0.108
 drown+0.099
 shield+0.097
Neg logit contributions
elly-0.226
urious-0.217
ami-0.210
 types-0.209
ude-0.208
Token\n
Token ID198
Feature activation+5.25e-02

Pos logit contributions
 he+0.043
 back+0.039
 heal+0.037
 out+0.035
 away+0.034
Neg logit contributions
urious-0.068
elly-0.068
ubby-0.064
resa-0.063
illing-0.063
Token\n
Token ID198
Feature activation+5.36e-02

Pos logit contributions
urious+0.141
elly+0.134
ubby+0.134
resa+0.131
ited+0.131
Neg logit contributions
 he-0.057
 back-0.045
 out-0.041
 his-0.039
 heal-0.039
TokenMom
Token ID29252
Feature activation+5.13e-02

Pos logit contributions
urious+0.107
ubby+0.100
elly+0.098
 types+0.096
ude+0.096
Neg logit contributions
 he-0.095
 back-0.079
 they-0.079
 his-0.074
 out-0.073
Token poured
Token ID19036
Feature activation+2.65e-02

Pos logit contributions
urious+0.122
elly+0.118
ubby+0.117
iling+0.108
ida+0.108
Neg logit contributions
 he-0.077
 back-0.066
 out-0.059
 they-0.059
 his-0.055
Token about
Token ID546
Feature activation-2.01e-02

Pos logit contributions
 he+0.048
 heal+0.045
 away+0.042
 back+0.042
 ramp+0.041
Neg logit contributions
urious-0.026
elly-0.024
illing-0.023
ude-0.022
resa-0.022
Token the
Token ID262
Feature activation-1.68e-02

Pos logit contributions
 he+0.022
 back+0.018
 heal+0.017
 out+0.017
 away+0.015
Neg logit contributions
urious-0.050
elly-0.047
resa-0.046
ubby-0.046
ude-0.046
Token hurt
Token ID5938
Feature activation+1.38e-02

Pos logit contributions
 he+0.024
 back+0.021
 heal+0.021
 out+0.020
 down+0.019
Neg logit contributions
urious-0.037
elly-0.035
ubby-0.035
resa-0.033
illing-0.033
Token.
Token ID13
Feature activation-3.46e-02

Pos logit contributions
urious+0.028
elly+0.027
ubby+0.027
ude+0.026
illing+0.025
Neg logit contributions
 he-0.022
 back-0.020
 heal-0.019
 out-0.018
 down-0.017
Token<|endoftext|>
Token ID50256
Feature activation+6.38e-02

Pos logit contributions
 he+0.047
 back+0.042
 heal+0.039
 out+0.039
 away+0.036
Neg logit contributions
urious-0.079
elly-0.078
ubby-0.074
resa-0.073
illing-0.073
TokenL
Token ID43
Feature activation+1.08e-01

Pos logit contributions
urious+0.126
elly+0.121
ubby+0.117
resa+0.114
illing+0.113
Neg logit contributions
 he-0.108
 back-0.096
 heal-0.092
 out-0.091
 his-0.085
Tokenily
Token ID813
Feature activation+5.34e-02

Pos logit contributions
urious+0.198
resa+0.183
elly+0.183
ude+0.180
illing+0.178
Neg logit contributions
 he-0.202
 back-0.198
 out-0.189
 bed-0.177
 his-0.177
Token and
Token ID290
Feature activation-4.64e-04

Pos logit contributions
urious+0.105
elly+0.101
ubby+0.097
resa+0.096
illing+0.094
Neg logit contributions
 he-0.091
 back-0.081
 out-0.077
 heal-0.076
 his-0.072
Token Ben
Token ID3932
Feature activation+6.55e-02

Pos logit contributions
 he+0.001
 heal+0.001
 ramp+0.001
 back+0.001
 away+0.000
Neg logit contributions
urious-0.001
ody-0.001
resa-0.001
elia-0.001
ubby-0.001
Token were
Token ID547
Feature activation+4.01e-02

Pos logit contributions
urious+0.145
elly+0.139
ubby+0.135
resa+0.132
illing+0.131
Neg logit contributions
 he-0.093
 back-0.081
 heal-0.077
 out-0.076
 down-0.071
Token twins
Token ID20345
Feature activation+4.67e-03

Pos logit contributions
urious+0.084
elly+0.080
ubby+0.078
resa+0.077
illing+0.076
Neg logit contributions
 he-0.064
 back-0.057
 heal-0.054
 out-0.053
 his-0.050
Token\n
Token ID198
Feature activation+4.73e-02

Pos logit contributions
 he+0.039
 heal+0.036
 back+0.036
 out+0.034
 away+0.033
Neg logit contributions
urious-0.066
elly-0.063
resa-0.060
illing-0.059
ude-0.058
Token\n
Token ID198
Feature activation+4.90e-02

Pos logit contributions
urious+0.130
ubby+0.125
elly+0.125
ited+0.122
resa+0.121
Neg logit contributions
 he-0.050
 back-0.036
 out-0.034
 his-0.033
 they-0.032
TokenAs
Token ID1722
Feature activation+7.21e-03

Pos logit contributions
ubby+0.089
elia+0.084
elly+0.082
urious+0.080
atherine+0.080
Neg logit contributions
 he-0.101
 his-0.082
 back-0.080
 they-0.079
 out-0.075
Token he
Token ID339
Feature activation+6.72e-03

Pos logit contributions
urious+0.018
elly+0.017
resa+0.017
illing+0.017
ubby+0.017
Neg logit contributions
 heal-0.007
 back-0.006
 Zig-0.005
 bleed-0.005
pan-0.005
Token was
Token ID373
Feature activation+2.15e-02

Pos logit contributions
elly+0.017
ubby+0.017
ody+0.017
urious+0.016
ida+0.016
Neg logit contributions
 he-0.008
 back-0.007
 his-0.006
 they-0.006
 out-0.006
Token r
Token ID374
Feature activation+1.01e-01

Pos logit contributions
urious+0.047
elly+0.045
ubby+0.043
resa+0.042
ude+0.042
Neg logit contributions
 he-0.033
 back-0.030
 out-0.028
 heal-0.027
 his-0.026
Tokenaking
Token ID868
Feature activation+1.40e-02

Pos logit contributions
urious+0.126
ya+0.121
oph+0.121
elia+0.120
ody+0.117
Neg logit contributions
 down-0.202
 back-0.200
 out-0.199
 he-0.198
 away-0.190
Token the
Token ID262
Feature activation-1.78e-02

Pos logit contributions
ubby+0.024
 types+0.024
elly+0.023
ude+0.023
ised+0.023
Neg logit contributions
 heal-0.018
 Zig-0.017
 drown-0.016
 back-0.016
stack-0.016
Token leaves
Token ID5667
Feature activation-2.78e-02

Pos logit contributions
 he+0.029
 back+0.024
 heal+0.024
 out+0.023
 his+0.022
Neg logit contributions
urious-0.037
elly-0.035
ubby-0.034
resa-0.034
illing-0.033
Token,
Token ID11
Feature activation+8.55e-03

Pos logit contributions
 Zig+0.034
 Snow+0.033
 heal+0.033
 he+0.032
 Frost+0.032
Neg logit contributions
urious-0.050
illing-0.049
elly-0.049
ude-0.049
ubby-0.048
Token Jack
Token ID3619
Feature activation+6.24e-02

Pos logit contributions
urious+0.019
ude+0.018
resa+0.017
ody+0.017
ami+0.017
Neg logit contributions
 heal-0.010
 back-0.009
 away-0.008
 Zig-0.008
 bleed-0.007
Token who
Token ID508
Feature activation-1.95e-02

Pos logit contributions
 he+0.021
 back+0.020
 heal+0.020
 out+0.019
 away+0.018
Neg logit contributions
urious-0.025
elly-0.023
resa-0.023
illing-0.022
ody-0.022
Token smiled
Token ID13541
Feature activation+4.72e-02

Pos logit contributions
 he+0.033
 back+0.026
 his+0.026
 they+0.025
 out+0.025
Neg logit contributions
urious-0.043
elly-0.041
ubby-0.040
illing-0.038
resa-0.038
Token back
Token ID736
Feature activation+6.95e-03

Pos logit contributions
urious+0.101
elly+0.098
ubby+0.093
ude+0.091
illing+0.091
Neg logit contributions
 he-0.070
 heal-0.061
 back-0.060
 out-0.056
 bed-0.054
Token.
Token ID13
Feature activation-3.68e-02

Pos logit contributions
urious+0.018
elly+0.017
ubby+0.017
resa+0.016
illing+0.016
Neg logit contributions
 he-0.008
 back-0.007
 heal-0.006
 out-0.006
 his-0.005
Token The
Token ID383
Feature activation+1.80e-02

Pos logit contributions
 he+0.052
 back+0.045
 heal+0.042
 out+0.042
 away+0.039
Neg logit contributions
urious-0.083
elly-0.080
ubby-0.077
resa-0.077
illing-0.076
Token w
Token ID266
Feature activation+1.28e-01

Pos logit contributions
ude+0.049
urious+0.048
illed+0.047
elly+0.047
udging+0.046
Neg logit contributions
 he-0.023
 back-0.020
 his-0.017
 bed-0.016
 car-0.015
Tokenaffle
Token ID30697
Feature activation+1.76e-02

Pos logit contributions
urious+0.228
resa+0.204
elia+0.199
atherine+0.198
ida+0.196
Neg logit contributions
 he-0.249
 back-0.212
 his-0.206
 out-0.201
 him-0.198
Token maker
Token ID16009
Feature activation+9.09e-03

Pos logit contributions
urious+0.035
elly+0.034
ubby+0.033
resa+0.032
illing+0.032
Neg logit contributions
 he-0.029
 back-0.026
 out-0.025
 heal-0.025
 down-0.023
Token made
Token ID925
Feature activation+3.68e-02

Pos logit contributions
urious+0.018
elly+0.017
ubby+0.017
resa+0.016
illing+0.016
Neg logit contributions
 he-0.016
 back-0.014
 out-0.013
 heal-0.013
 his-0.013
Token a
Token ID257
Feature activation-6.33e-03

Pos logit contributions
urious+0.087
elly+0.084
ubby+0.081
resa+0.081
illing+0.080
Neg logit contributions
 he-0.045
 back-0.039
 heal-0.037
 out-0.035
 down-0.033
Token pretty
Token ID2495
Feature activation+9.24e-04

Pos logit contributions
 he+0.011
 back+0.010
 heal+0.010
 out+0.010
 down+0.009
Neg logit contributions
urious-0.012
elly-0.011
resa-0.011
ubby-0.011
ida-0.010
Token your
Token ID534
Feature activation-1.29e-03

Pos logit contributions
urious+0.052
ubby+0.051
ady+0.050
elly+0.047
elia+0.047
Neg logit contributions
 he-0.055
 heal-0.046
 roof-0.045
 bed-0.044
 Zig-0.043
Token friends
Token ID2460
Feature activation-4.22e-03

Pos logit contributions
 he+0.003
 heal+0.002
 back+0.002
 out+0.002
 down+0.002
Neg logit contributions
urious-0.002
ubby-0.002
elly-0.002
resa-0.002
ida-0.002
Token."
Token ID526
Feature activation+1.55e-03

Pos logit contributions
 he+0.006
 Zig+0.005
 heal+0.004
 out+0.004
 roof+0.004
Neg logit contributions
urious-0.009
elly-0.009
ubby-0.008
iling-0.008
umpy-0.008
Token\n
Token ID198
Feature activation+6.45e-02

Pos logit contributions
elly+0.003
urious+0.003
resa+0.003
ubby+0.003
ody+0.003
Neg logit contributions
 heal-0.002
 he-0.002
 back-0.002
 out-0.002
 away-0.002
Token\n
Token ID198
Feature activation+6.48e-02

Pos logit contributions
urious+0.175
ubby+0.169
elly+0.168
resa+0.165
ited+0.164
Neg logit contributions
 he-0.069
 back-0.053
 heal-0.047
 out-0.046
 his-0.046
TokenL
Token ID43
Feature activation+1.10e-01

Pos logit contributions
urious+0.122
ubby+0.121
elly+0.119
ude+0.116
illing+0.114
Neg logit contributions
 he-0.119
 back-0.099
 they-0.095
 heal-0.094
 his-0.094
Tokenily
Token ID813
Feature activation+5.09e-02

Pos logit contributions
urious+0.207
ude+0.194
resa+0.193
ited+0.191
elia+0.185
Neg logit contributions
 he-0.185
 back-0.175
 out-0.168
 heal-0.154
 his-0.154
Token walked
Token ID6807
Feature activation+5.62e-02

Pos logit contributions
ubby+0.114
elly+0.114
urious+0.110
illing+0.107
ami+0.103
Neg logit contributions
 he-0.078
 back-0.072
 out-0.062
 away-0.060
 his-0.060
Token with
Token ID351
Feature activation+2.03e-02

Pos logit contributions
urious+0.149
elly+0.145
resa+0.141
ubby+0.139
illing+0.139
Neg logit contributions
 he-0.050
 heal-0.042
 bed-0.032
 back-0.032
 anything-0.031
Token mom
Token ID1995
Feature activation+2.66e-02

Pos logit contributions
urious+0.042
atherine+0.040
ubby+0.040
ody+0.040
elly+0.039
Neg logit contributions
 he-0.027
 heal-0.022
 back-0.020
 they-0.020
 down-0.020
Token to
Token ID284
Feature activation-9.01e-03

Pos logit contributions
urious+0.062
elly+0.058
resa+0.057
ubby+0.056
illing+0.056
Neg logit contributions
 he-0.035
 heal-0.029
 back-0.028
 out-0.027
 his-0.026
Token a
Token ID257
Feature activation-6.89e-03

Pos logit contributions
urious+0.019
resa+0.019
elia+0.018
 elderly+0.018
ida+0.018
Neg logit contributions
 he-0.013
 back-0.011
 heal-0.011
 Zig-0.011
 bed-0.010
Token threat
Token ID2372
Feature activation-9.82e-03

Pos logit contributions
 he+0.013
 Zig+0.012
 down+0.011
 him+0.011
 out+0.011
Neg logit contributions
urious-0.010
elly-0.010
resa-0.010
ida-0.009
 elderly-0.009
Token!"
Token ID2474
Feature activation+9.55e-03

Pos logit contributions
 he+0.012
 heal+0.010
 Zig+0.010
 his+0.010
 back+0.009
Neg logit contributions
elly-0.022
ubby-0.020
urious-0.020
illing-0.019
umpy-0.019
Token\n
Token ID198
Feature activation+7.38e-02

Pos logit contributions
urious+0.022
elly+0.021
resa+0.020
ubby+0.020
illing+0.020
Neg logit contributions
 he-0.013
 back-0.012
 heal-0.011
 out-0.011
 away-0.010
Token\n
Token ID198
Feature activation+7.48e-02

Pos logit contributions
urious+0.202
elly+0.192
ubby+0.192
resa+0.189
elia+0.188
Neg logit contributions
 he-0.078
 back-0.060
 out-0.054
 heal-0.054
 they-0.053
TokenL
Token ID43
Feature activation+1.18e-01

Pos logit contributions
urious+0.142
ubby+0.136
elly+0.136
illing+0.132
 types+0.131
Neg logit contributions
 he-0.137
 back-0.114
 they-0.112
 heal-0.109
 his-0.109
Tokenily
Token ID813
Feature activation+5.79e-02

Pos logit contributions
urious+0.249
ited+0.240
resa+0.231
elia+0.228
ised+0.216
Neg logit contributions
 he-0.167
 back-0.156
 out-0.155
 his-0.144
 it-0.144
Token and
Token ID290
Feature activation+1.75e-02

Pos logit contributions
ubby+0.163
elly+0.153
urious+0.152
iling+0.148
illing+0.147
Neg logit contributions
 he-0.059
 back-0.052
 out-0.041
 fell-0.039
 down-0.039
Token Ben
Token ID3932
Feature activation+6.88e-02

Pos logit contributions
urious+0.040
elly+0.039
ubby+0.038
resa+0.038
ude+0.037
Neg logit contributions
 he-0.024
 back-0.020
 heal-0.020
 out-0.019
 away-0.018
Token did
Token ID750
Feature activation+3.14e-02

Pos logit contributions
elly+0.145
urious+0.142
ubby+0.142
illing+0.134
ami+0.134
Neg logit contributions
 he-0.112
 back-0.103
 out-0.095
 away-0.092
 heal-0.091
Token not
Token ID407
Feature activation+2.92e-02

Pos logit contributions
urious+0.075
elly+0.069
resa+0.069
ida+0.068
ude+0.068
Neg logit contributions
 he-0.038
 back-0.033
 heal-0.030
 out-0.029
 down-0.027
Token very
Token ID845
Feature activation+1.85e-02

Pos logit contributions
ubby+0.015
 elderly+0.015
elia+0.014
resa+0.014
urious+0.014
Neg logit contributions
 he-0.013
 heal-0.011
 back-0.010
 Zig-0.010
 down-0.010
Token creative
Token ID7325
Feature activation+2.44e-02

Pos logit contributions
urious+0.031
elly+0.029
ubby+0.029
resa+0.028
illing+0.027
Neg logit contributions
 he-0.037
 back-0.034
 heal-0.032
 out-0.032
 down-0.031
Token and
Token ID290
Feature activation-3.64e-02

Pos logit contributions
urious+0.063
elly+0.061
ubby+0.059
resa+0.059
illing+0.058
Neg logit contributions
 he-0.027
 back-0.023
 out-0.021
 heal-0.020
 his-0.019
Token smart
Token ID4451
Feature activation+1.54e-02

Pos logit contributions
 he+0.071
 back+0.066
 out+0.063
 heal+0.062
 bed+0.061
Neg logit contributions
urious-0.055
elly-0.051
ubby-0.051
resa-0.050
ida-0.049
Token."
Token ID526
Feature activation-5.48e-03

Pos logit contributions
urious+0.040
elly+0.039
ubby+0.038
illing+0.037
resa+0.037
Neg logit contributions
 he-0.016
 back-0.013
 heal-0.012
 out-0.012
 his-0.010
Token\n
Token ID198
Feature activation+6.62e-02

Pos logit contributions
 back+0.009
 heal+0.009
 away+0.009
 out+0.009
 he+0.008
Neg logit contributions
elly-0.010
urious-0.009
resa-0.008
ude-0.008
ubby-0.008
Token\n
Token ID198
Feature activation+6.76e-02

Pos logit contributions
urious+0.163
elly+0.157
ubby+0.153
resa+0.151
illing+0.149
Neg logit contributions
 he-0.083
 back-0.070
 heal-0.064
 out-0.064
 his-0.060
TokenS
Token ID50
Feature activation+1.16e-01

Pos logit contributions
urious+0.113
elly+0.105
ubby+0.105
resa+0.100
illing+0.100
Neg logit contributions
 he-0.141
 back-0.123
 heal-0.117
 his-0.117
 they-0.117
Tokenara
Token ID3301
Feature activation+5.89e-02

Pos logit contributions
urious+0.127
illing+0.118
ubby+0.116
resa+0.115
ida+0.115
Neg logit contributions
 he-0.277
 back-0.268
 heal-0.253
 away-0.250
 out-0.247
Token and
Token ID290
Feature activation+9.08e-03

Pos logit contributions
urious+0.117
elly+0.114
ubby+0.111
resa+0.108
illing+0.108
Neg logit contributions
 he-0.100
 back-0.092
 out-0.085
 heal-0.083
 down-0.080
Token Ben
Token ID3932
Feature activation+5.84e-02

Pos logit contributions
urious+0.018
elly+0.017
ubby+0.017
resa+0.017
ody+0.017
Neg logit contributions
 he-0.014
 heal-0.012
 back-0.012
 out-0.012
 away-0.011
Token and
Token ID290
Feature activation-3.16e-02

Pos logit contributions
 he+0.022
 back+0.020
 out+0.019
 heal+0.018
 down+0.017
Neg logit contributions
urious-0.030
elly-0.028
ubby-0.027
resa-0.027
illing-0.027
Token said
Token ID531
Feature activation+1.93e-02

Pos logit contributions
 he+0.059
 back+0.057
 heal+0.056
 out+0.054
 anything+0.052
Neg logit contributions
urious-0.051
elly-0.046
resa-0.045
ida-0.044
illing-0.044
Token,
Token ID11
Feature activation-2.01e-03

Pos logit contributions
elly+0.031
ude+0.031
ubby+0.030
ida+0.030
oke+0.030
Neg logit contributions
 bed-0.027
 ramp-0.026
 back-0.026
stack-0.025
 away-0.025
Token "
Token ID366
Feature activation+4.93e-02

Pos logit contributions
 away+0.003
 back+0.003
 anything+0.003
 heal+0.003
 out+0.003
Neg logit contributions
resa-0.003
illing-0.003
elly-0.003
ude-0.003
urious-0.003
TokenDon
Token ID3987
Feature activation+4.70e-02

Pos logit contributions
urious+0.087
ubby+0.083
elly+0.083
resa+0.079
illing+0.079
Neg logit contributions
 he-0.095
 back-0.085
 heal-0.081
 out-0.080
 they-0.077
Token't
Token ID470
Feature activation+2.57e-02

Pos logit contributions
urious+0.121
elly+0.117
ubby+0.114
resa+0.113
illing+0.112
Neg logit contributions
 he-0.051
 back-0.042
 out-0.038
 heal-0.038
 his-0.034
Token worry
Token ID5490
Feature activation+5.36e-02

Pos logit contributions
urious+0.043
elly+0.039
resa+0.038
ubby+0.037
ida+0.037
Neg logit contributions
 he-0.051
 back-0.047
 out-0.045
 heal-0.044
 down-0.043
Token,
Token ID11
Feature activation+1.96e-02

Pos logit contributions
elly+0.120
urious+0.118
ubby+0.116
ida+0.111
ude+0.109
Neg logit contributions
 he-0.069
 back-0.059
 heal-0.058
 out-0.053
 away-0.053
Token Lily
Token ID20037
Feature activation+3.85e-02

Pos logit contributions
urious+0.044
elly+0.043
ida+0.043
ubby+0.042
resa+0.042
Neg logit contributions
 he-0.022
 back-0.020
 out-0.019
 down-0.018
 away-0.018
Token.
Token ID13
Feature activation-4.14e-02

Pos logit contributions
urious+0.079
elly+0.078
ubby+0.076
ude+0.075
resa+0.073
Neg logit contributions
 he-0.044
 heal-0.042
 float-0.042
 bed-0.041
 back-0.041
Token We
Token ID775
Feature activation-2.04e-02

Pos logit contributions
 he+0.056
 back+0.049
 out+0.045
 his+0.044
 away+0.043
Neg logit contributions
urious-0.095
elly-0.092
ubby-0.090
resa-0.088
elia-0.086
Token dad
Token ID9955
Feature activation+2.54e-03

Pos logit contributions
 he+0.029
 Grad+0.023
 heal+0.023
 they+0.022
 Zig+0.022
Neg logit contributions
ubby-0.032
urious-0.029
 elderly-0.028
ude-0.028
iet-0.027
Token's
Token ID338
Feature activation-1.37e-02

Pos logit contributions
urious+0.004
ady+0.004
illing+0.004
elia+0.004
resa+0.004
Neg logit contributions
 he-0.004
 Zig-0.004
 anything-0.004
 heal-0.003
 drown-0.003
Token hand
Token ID1021
Feature activation-1.86e-02

Pos logit contributions
 he+0.024
 they+0.019
 Zig+0.019
 anything+0.019
 heal+0.019
Neg logit contributions
ubby-0.024
urious-0.023
ady-0.022
ude-0.021
 elderly-0.021
Token.
Token ID13
Feature activation-3.71e-02

Pos logit contributions
 he+0.027
 drown+0.025
 anything+0.025
 Zig+0.024
 ramp+0.024
Neg logit contributions
elly-0.030
urious-0.030
ady-0.029
illing-0.029
illed-0.029
Token She
Token ID1375
Feature activation-3.32e-02

Pos logit contributions
 he+0.061
 back+0.057
 away+0.054
 heal+0.054
 out+0.053
Neg logit contributions
elly-0.068
urious-0.068
resa-0.062
ude-0.061
ubby-0.060
Token felt
Token ID2936
Feature activation+3.41e-02

Pos logit contributions
 he+0.059
 back+0.053
 out+0.050
 heal+0.050
 down+0.047
Neg logit contributions
urious-0.063
elly-0.060
ubby-0.057
resa-0.057
illing-0.056
Token a
Token ID257
Feature activation+4.08e-03

Pos logit contributions
urious+0.060
ubby+0.060
elly+0.059
 elderly+0.059
ady+0.057
Neg logit contributions
 he-0.045
 ramp-0.043
stack-0.041
 roof-0.039
 Zig-0.038
Token little
Token ID1310
Feature activation+7.37e-03

Pos logit contributions
urious+0.007
ubby+0.006
elly+0.006
resa+0.006
ited+0.006
Neg logit contributions
 he-0.008
 back-0.007
 out-0.007
 heal-0.007
 down-0.007
Token better
Token ID1365
Feature activation+1.14e-02

Pos logit contributions
elly+0.015
urious+0.015
ubby+0.015
ited+0.014
illing+0.014
Neg logit contributions
 he-0.010
 Zig-0.008
 his-0.008
 heal-0.008
 out-0.008
Token.
Token ID13
Feature activation-3.89e-02

Pos logit contributions
elly+0.026
ubby+0.025
urious+0.025
illing+0.024
ude+0.024
Neg logit contributions
 he-0.013
 back-0.011
 heal-0.011
 Zig-0.010
 out-0.010
Token<|endoftext|>
Token ID50256
Feature activation+6.33e-02

Pos logit contributions
 he+0.060
 back+0.056
 heal+0.052
 out+0.051
 away+0.051
Neg logit contributions
elly-0.078
urious-0.078
resa-0.072
ubby-0.071
ude-0.070
Token\n
Token ID198
Feature activation+4.97e-02

Pos logit contributions
 he+0.034
 back+0.032
 heal+0.030
 out+0.029
 away+0.027
Neg logit contributions
urious-0.061
elly-0.059
resa-0.056
illing-0.055
ubby-0.054
Token\n
Token ID198
Feature activation+5.10e-02

Pos logit contributions
urious+0.141
ubby+0.138
elly+0.136
elia+0.136
ami+0.133
Neg logit contributions
 he-0.050
 back-0.031
 they-0.031
 his-0.031
 out-0.030
TokenTim
Token ID14967
Feature activation+8.39e-02

Pos logit contributions
ubby+0.107
urious+0.105
elia+0.104
elly+0.104
ody+0.102
Neg logit contributions
 he-0.088
 they-0.069
 his-0.067
 back-0.066
 heal-0.066
Tokenmy
Token ID1820
Feature activation+3.97e-02

Pos logit contributions
elly+0.201
ida+0.194
ubby+0.193
illing+0.193
Fre+0.191
Neg logit contributions
 back-0.103
 he-0.100
 away-0.087
 down-0.085
 his-0.084
Token was
Token ID373
Feature activation+3.04e-02

Pos logit contributions
elly+0.091
urious+0.088
ubby+0.088
illing+0.083
ida+0.083
Neg logit contributions
 he-0.059
 back-0.051
 out-0.045
 heal-0.045
 his-0.044
Token very
Token ID845
Feature activation+2.92e-02

Pos logit contributions
urious+0.069
elly+0.067
ubby+0.064
resa+0.063
ami+0.062
Neg logit contributions
 he-0.044
 back-0.039
 out-0.036
 heal-0.035
 his-0.033
Token excited
Token ID6568
Feature activation+2.27e-02

Pos logit contributions
urious+0.057
elly+0.055
resa+0.052
ami+0.052
ubby+0.052
Neg logit contributions
 he-0.053
 back-0.047
 out-0.045
 heal-0.044
 his-0.043
Token to
Token ID284
Feature activation-1.49e-02

Pos logit contributions
urious+0.055
elly+0.053
resa+0.051
illing+0.051
ami+0.050
Neg logit contributions
 he-0.024
 heal-0.023
 back-0.022
 out-0.020
 Zig-0.020
Token eat
Token ID4483
Feature activation+2.21e-02

Pos logit contributions
 he+0.019
 back+0.017
 out+0.015
 heal+0.015
 they+0.015
Neg logit contributions
urious-0.033
resa-0.031
ubby-0.031
ude-0.031
ida-0.030
Token the
Token ID262
Feature activation-3.12e-02

Pos logit contributions
urious+0.050
elly+0.050
ami+0.048
resa+0.048
 types+0.048
Neg logit contributions
 he-0.021
 heal-0.020
 back-0.020
 Zig-0.018
 roof-0.016
Token avocado
Token ID40377
Feature activation-5.49e-02

Pos logit contributions
 he+0.053
 back+0.047
 out+0.044
 heal+0.044
 his+0.042
Neg logit contributions
urious-0.063
elly-0.061
ubby-0.059
resa-0.058
illing-0.057
Token -
Token ID532
Feature activation-1.75e-02

Pos logit contributions
 he+0.019
 back+0.016
 heal+0.015
 Zig+0.015
 out+0.014
Neg logit contributions
urious-0.042
elly-0.041
ude-0.039
ubby-0.039
illing-0.039
Token the
Token ID262
Feature activation-8.07e-03

Pos logit contributions
 heal+0.017
 back+0.016
 away+0.015
 he+0.015
 throne+0.015
Neg logit contributions
 elderly-0.036
resa-0.035
ude-0.034
ida-0.033
ubby-0.033
Token hole
Token ID7604
Feature activation-1.13e-02

Pos logit contributions
 he+0.013
 back+0.012
 out+0.012
 down+0.011
 heal+0.011
Neg logit contributions
urious-0.015
elly-0.015
resa-0.014
ubby-0.014
ida-0.014
Token was
Token ID373
Feature activation+1.65e-02

Pos logit contributions
 he+0.018
 heal+0.017
 back+0.016
 bed+0.016
 Zig+0.015
Neg logit contributions
urious-0.021
elly-0.021
ubby-0.020
ude-0.019
illing-0.019
Token much
Token ID881
Feature activation+4.26e-03

Pos logit contributions
ubby+0.030
urious+0.029
illing+0.028
elly+0.028
 elderly+0.028
Neg logit contributions
 he-0.027
 heal-0.022
 Zig-0.022
 they-0.021
 back-0.021
Token deeper
Token ID9211
Feature activation+3.13e-02

Pos logit contributions
urious+0.009
elly+0.008
ubby+0.008
ited+0.008
 elderly+0.008
Neg logit contributions
 he-0.006
 Zig-0.005
 back-0.005
 heal-0.005
 out-0.005
Token than
Token ID621
Feature activation+4.14e-03

Pos logit contributions
urious+0.064
elly+0.062
ubby+0.060
ude+0.056
illing+0.056
Neg logit contributions
 he-0.049
 heal-0.044
 back-0.042
 out-0.039
 they-0.038
Token it
Token ID340
Feature activation-5.32e-02

Pos logit contributions
ody+0.009
elly+0.009
ami+0.009
ubby+0.009
urious+0.009
Neg logit contributions
 heal-0.004
 back-0.004
 he-0.004
 down-0.004
 out-0.004
Token had
Token ID550
Feature activation-3.23e-02

Pos logit contributions
 he+0.078
 heal+0.070
 back+0.070
 out+0.065
 bed+0.064
Neg logit contributions
urious-0.108
elly-0.103
ude-0.099
ubby-0.098
illing-0.097
Token expected
Token ID2938
Feature activation-1.57e-02

Pos logit contributions
 he+0.044
 heal+0.038
 back+0.037
 down+0.034
 out+0.033
Neg logit contributions
urious-0.069
elly-0.067
ude-0.066
ody-0.064
ubby-0.064
Token!
Token ID0
Feature activation-3.30e-02

Pos logit contributions
 he+0.018
 heal+0.016
 Zig+0.015
 back+0.015
 roof+0.014
Neg logit contributions
elly-0.034
urious-0.033
ude-0.032
ubby-0.032
illing-0.032
Token\n
Token ID198
Feature activation+4.42e-02

Pos logit contributions
urious+0.108
elly+0.105
ubby+0.102
resa+0.099
illing+0.099
Neg logit contributions
 he-0.055
 back-0.047
 out-0.043
 heal-0.042
 his-0.040
TokenJack
Token ID14295
Feature activation+6.22e-02

Pos logit contributions
urious+0.071
elly+0.071
ubby+0.069
illing+0.063
resa+0.063
Neg logit contributions
 he-0.091
 back-0.081
 out-0.078
 heal-0.077
 his-0.075
Token said
Token ID531
Feature activation+1.53e-02

Pos logit contributions
urious+0.131
elly+0.130
ubby+0.124
resa+0.119
illing+0.119
Neg logit contributions
 he-0.101
 back-0.090
 out-0.082
 heal-0.079
 away-0.078
Token,
Token ID11
Feature activation-8.66e-03

Pos logit contributions
ude+0.027
ida+0.027
ubby+0.026
elly+0.026
resa+0.025
Neg logit contributions
 bed-0.019
 ramp-0.019
 heal-0.018
 drown-0.018
 fall-0.018
Token "
Token ID366
Feature activation+3.24e-02

Pos logit contributions
 he+0.013
 heal+0.013
 anything+0.013
 back+0.013
 away+0.013
Neg logit contributions
resa-0.014
ude-0.014
illing-0.013
urious-0.013
ubby-0.013
TokenLet
Token ID5756
Feature activation-1.73e-02

Pos logit contributions
ubby+0.057
urious+0.055
elly+0.055
illing+0.052
iling+0.050
Neg logit contributions
 he-0.063
 back-0.054
 his-0.050
 out-0.050
 heal-0.049
Token's
Token ID338
Feature activation-2.17e-02

Pos logit contributions
 he+0.025
 heal+0.022
 back+0.021
 bed+0.020
 Zig+0.020
Neg logit contributions
urious-0.038
ody-0.034
elly-0.034
resa-0.034
oph-0.034
Token fill
Token ID6070
Feature activation+1.22e-02

Pos logit contributions
 he+0.028
 back+0.022
 heal+0.020
 his+0.019
 they+0.019
Neg logit contributions
urious-0.048
ubby-0.046
ude-0.046
resa-0.046
ida-0.045
Token the
Token ID262
Feature activation-3.87e-02

Pos logit contributions
urious+0.028
elly+0.027
elia+0.027
 types+0.027
ubby+0.026
Neg logit contributions
 he-0.014
 heal-0.012
 back-0.011
 Zig-0.010
 crash-0.010
Token leak
Token ID13044
Feature activation-4.29e-02

Pos logit contributions
 he+0.060
 back+0.048
 heal+0.047
 his+0.047
 out+0.046
Neg logit contributions
urious-0.081
elly-0.079
ubby-0.078
illing-0.074
ude-0.072
Token".
Token ID1911
Feature activation-2.10e-02

Pos logit contributions
 he+0.060
 heal+0.051
 Zig+0.049
 back+0.047
 his+0.045
Neg logit contributions
urious-0.089
elly-0.089
ubby-0.086
illing-0.083
iling-0.083
Token He
Token ID679
Feature activation-5.94e-03

Pos logit contributions
 back+0.044
 heal+0.043
 he+0.043
 away+0.043
 out+0.042
Neg logit contributions
urious-0.078
elly-0.072
resa-0.071
illing-0.068
 types-0.067
Token couldn
Token ID3521
Feature activation+2.65e-03

Pos logit contributions
 he+0.011
 back+0.010
 out+0.009
 heal+0.009
 his+0.009
Neg logit contributions
urious-0.011
elly-0.011
ubby-0.010
resa-0.010
illing-0.010
Token't
Token ID470
Feature activation+7.07e-03

Pos logit contributions
elly+0.005
urious+0.005
ubby+0.005
ude+0.005
resa+0.005
Neg logit contributions
 he-0.004
 back-0.004
 away-0.003
 down-0.003
 out-0.003
Token find
Token ID1064
Feature activation+8.61e-03

Pos logit contributions
urious+0.015
ubby+0.014
elly+0.014
ida+0.014
resa+0.014
Neg logit contributions
 he-0.010
 back-0.008
 down-0.008
 out-0.008
 they-0.008
Token it
Token ID340
Feature activation-6.39e-02

Pos logit contributions
ubby+0.019
elly+0.017
urious+0.017
ody+0.017
 types+0.017
Neg logit contributions
stack-0.008
 heal-0.007
 he-0.007
 hyd-0.006
 roof-0.006
Token anywhere
Token ID6609
Feature activation-3.40e-02

Pos logit contributions
 he+0.064
 heal+0.056
 back+0.050
 Zig+0.047
stack+0.045
Neg logit contributions
urious-0.145
elly-0.141
ubby-0.138
illing-0.132
ody-0.132
Token.
Token ID13
Feature activation-3.39e-02

Pos logit contributions
stack+0.043
 heal+0.039
pan+0.038
 mast+0.037
 ramp+0.036
Neg logit contributions
elly-0.061
iling-0.058
illing-0.056
urious-0.055
ude-0.054
Token He
Token ID679
Feature activation-9.25e-03

Pos logit contributions
 he+0.041
 back+0.039
 heal+0.038
 out+0.037
 away+0.035
Neg logit contributions
urious-0.080
elly-0.074
resa-0.073
illing-0.070
ude-0.069
Token looked
Token ID3114
Feature activation+4.50e-03

Pos logit contributions
 he+0.014
 back+0.013
 out+0.012
 heal+0.012
 his+0.011
Neg logit contributions
urious-0.020
elly-0.020
ubby-0.019
illing-0.018
resa-0.018
Token and
Token ID290
Feature activation-2.74e-02

Pos logit contributions
urious+0.010
ubby+0.009
elly+0.009
elia+0.009
ited+0.009
Neg logit contributions
 he-0.006
 heal-0.005
 back-0.005
 Zig-0.004
 bed-0.004
Token looked
Token ID3114
Feature activation+6.22e-04

Pos logit contributions
 he+0.042
 heal+0.039
 back+0.039
 out+0.037
 bed+0.035
Neg logit contributions
urious-0.055
elly-0.047
ubby-0.047
resa-0.047
elia-0.047
Token upon
Token ID2402
Feature activation-1.82e-02

Pos logit contributions
elly+0.021
urious+0.021
ubby+0.020
ude+0.020
illing+0.019
Neg logit contributions
 he-0.014
 heal-0.013
 back-0.011
 they-0.011
 bed-0.011
Token.
Token ID13
Feature activation-3.99e-02

Pos logit contributions
 he+0.029
 heal+0.029
 back+0.024
 mast+0.024
 drown+0.024
Neg logit contributions
urious-0.031
elly-0.028
oph-0.027
ida-0.027
elia-0.027
Token Everything
Token ID11391
Feature activation-2.46e-02

Pos logit contributions
 he+0.059
 back+0.054
 heal+0.052
 out+0.049
 away+0.048
Neg logit contributions
urious-0.085
elly-0.082
resa-0.079
ubby-0.078
illing-0.078
Token was
Token ID373
Feature activation+1.46e-02

Pos logit contributions
 he+0.040
 back+0.036
 heal+0.036
 bed+0.035
 out+0.034
Neg logit contributions
urious-0.046
elly-0.043
ida-0.042
ude-0.042
illing-0.041
Token so
Token ID523
Feature activation+2.82e-03

Pos logit contributions
urious+0.031
elly+0.030
ubby+0.029
resa+0.029
illing+0.028
Neg logit contributions
 he-0.022
 back-0.020
 heal-0.018
 out-0.018
 down-0.017
Token exciting
Token ID7895
Feature activation-8.90e-04

Pos logit contributions
urious+0.005
elly+0.005
resa+0.005
ubby+0.005
ami+0.005
Neg logit contributions
 he-0.005
 back-0.005
 out-0.005
 heal-0.004
 they-0.004
Token!
Token ID0
Feature activation-3.88e-02

Pos logit contributions
 heal+0.001
 he+0.001
 back+0.001
 bed+0.001
 out+0.001
Neg logit contributions
elly-0.002
urious-0.002
ude-0.002
ami-0.002
ida-0.002
Token
Token ID220
Feature activation+5.49e-02

Pos logit contributions
 he+0.054
 back+0.050
 heal+0.050
 out+0.046
 away+0.045
Neg logit contributions
urious-0.084
elly-0.082
resa-0.079
ubby-0.077
elia-0.077
Token\n
Token ID198
Feature activation+4.41e-02

Pos logit contributions
urious+0.120
elly+0.115
ubby+0.110
ude+0.110
illing+0.109
Neg logit contributions
 he-0.079
 back-0.070
 heal-0.067
 out-0.063
 away-0.063
Token\n
Token ID198
Feature activation+4.44e-02

Pos logit contributions
urious+0.117
elly+0.114
ubby+0.114
ited+0.109
iling+0.108
Neg logit contributions
 he-0.049
 back-0.037
 out-0.033
 they-0.032
 his-0.032
TokenSuddenly
Token ID38582
Feature activation-1.08e-02

Pos logit contributions
ubby+0.085
urious+0.084
elly+0.084
 types+0.077
iling+0.077
Neg logit contributions
 he-0.080
 back-0.066
 they-0.063
 his-0.061
 out-0.061
Token and
Token ID290
Feature activation-2.82e-02

Pos logit contributions
urious+0.088
elly+0.088
ude+0.085
ubby+0.083
illing+0.083
Neg logit contributions
 he-0.063
 heal-0.055
 back-0.053
 out-0.048
 bed-0.048
Token look
Token ID804
Feature activation+1.15e-02

Pos logit contributions
 he+0.042
 back+0.038
 heal+0.037
 out+0.036
 anything+0.035
Neg logit contributions
resa-0.048
ude-0.046
ida-0.046
ubby-0.046
illing-0.046
Token for
Token ID329
Feature activation+2.53e-03

Pos logit contributions
urious+0.025
elly+0.024
ubby+0.024
illing+0.023
ude+0.022
Neg logit contributions
 he-0.017
 heal-0.014
 back-0.013
 bed-0.012
 out-0.012
Token things
Token ID1243
Feature activation-4.16e-02

Pos logit contributions
urious+0.005
elly+0.005
ubby+0.005
resa+0.005
illing+0.005
Neg logit contributions
 he-0.004
 back-0.004
 out-0.003
 heal-0.003
 his-0.003
Token to
Token ID284
Feature activation-3.11e-02

Pos logit contributions
 heal+0.052
 roof+0.050
 bed+0.048
 crash+0.048
 ramp+0.045
Neg logit contributions
elly-0.083
illing-0.075
ody-0.074
elia-0.074
as-0.071
Token put
Token ID1234
Feature activation-9.02e-03

Pos logit contributions
 he+0.034
 back+0.030
 out+0.027
 heal+0.026
 his+0.025
Neg logit contributions
urious-0.082
elly-0.080
ubby-0.077
illing-0.076
resa-0.076
Token in
Token ID287
Feature activation-2.34e-02

Pos logit contributions
 he+0.011
 heal+0.009
 roof+0.008
 crash+0.008
 bed+0.008
Neg logit contributions
ubby-0.020
elly-0.020
urious-0.019
ady-0.019
iling-0.018
Token their
Token ID511
Feature activation-2.63e-02

Pos logit contributions
 he+0.022
 back+0.017
 out+0.015
 heal+0.015
 down+0.013
Neg logit contributions
urious-0.065
elly-0.063
ubby-0.061
resa-0.061
illing-0.060
Token net
Token ID2010
Feature activation-4.72e-02

Pos logit contributions
 he+0.044
 heal+0.039
 out+0.036
 Zig+0.035
 Brown+0.035
Neg logit contributions
urious-0.053
ubby-0.049
elly-0.047
ude-0.047
iet-0.045
Token.
Token ID13
Feature activation-2.20e-02

Pos logit contributions
 he+0.053
 heal+0.047
 Zig+0.043
 crash+0.041
 back+0.040
Neg logit contributions
elly-0.108
urious-0.108
ubby-0.103
ady-0.102
ude-0.101
Token
Token ID198
Feature activation+5.53e-02

Pos logit contributions
 he+0.027
 back+0.023
 out+0.021
 heal+0.020
 his+0.019
Neg logit contributions
urious-0.055
elly-0.052
ubby-0.051
resa-0.051
ida-0.050
Token He
Token ID679
Feature activation-6.65e-03

Pos logit contributions
 heal+0.040
 back+0.039
 he+0.039
 away+0.038
 out+0.037
Neg logit contributions
urious-0.066
resa-0.061
elly-0.060
ude-0.059
illing-0.058
Token ran
Token ID4966
Feature activation+4.47e-02

Pos logit contributions
 he+0.012
 back+0.010
 out+0.010
 heal+0.010
 his+0.009
Neg logit contributions
urious-0.013
elly-0.012
ubby-0.012
resa-0.012
illing-0.012
Token outside
Token ID2354
Feature activation+2.75e-03

Pos logit contributions
elly+0.088
urious+0.087
resa+0.087
illing+0.086
ida+0.085
Neg logit contributions
 heal-0.056
 he-0.055
 anything-0.046
 nobody-0.044
 drown-0.042
Token and
Token ID290
Feature activation-2.44e-02

Pos logit contributions
urious+0.005
elly+0.004
illing+0.004
resa+0.004
illed+0.004
Neg logit contributions
 heal-0.005
 he-0.004
 anything-0.004
 back-0.004
 Zig-0.004
Token enjoyed
Token ID8359
Feature activation+1.24e-02

Pos logit contributions
 heal+0.046
 anything+0.041
 he+0.038
 protect+0.037
 drown+0.037
Neg logit contributions
urious-0.034
ida-0.033
ody-0.033
resa-0.032
ya-0.032
Token playing
Token ID2712
Feature activation+1.21e-02

Pos logit contributions
ude+0.022
 elderly+0.021
 types+0.021
iley+0.021
illing+0.021
Neg logit contributions
 heal-0.016
 back-0.016
 he-0.014
 away-0.014
 out-0.014
Token with
Token ID351
Feature activation+2.04e-03

Pos logit contributions
elly+0.020
urious+0.020
iling+0.020
ubby+0.019
ised+0.019
Neg logit contributions
 he-0.016
 heal-0.015
 drown-0.015
 back-0.014
 explode-0.014
Token his
Token ID465
Feature activation+2.69e-04

Pos logit contributions
urious+0.005
ubby+0.005
 types+0.005
elly+0.005
ude+0.005
Neg logit contributions
 heal-0.001
 he-0.001
 back-0.001
 out-0.001
 down-0.001
Token toys
Token ID14958
Feature activation-3.26e-02

Pos logit contributions
urious+0.000
elly+0.000
ubby+0.000
resa+0.000
ude+0.000
Neg logit contributions
 he-0.000
 heal-0.000
 back-0.000
 out-0.000
 away-0.000
Token in
Token ID287
Feature activation-2.49e-02

Pos logit contributions
 heal+0.038
 he+0.037
 Zig+0.036
 back+0.032
 drown+0.032
Neg logit contributions
elly-0.066
urious-0.066
ude-0.064
illing-0.063
resa-0.062
Token the
Token ID262
Feature activation-4.11e-02

Pos logit contributions
 he+0.026
 back+0.023
 heal+0.023
 out+0.020
 down+0.019
Neg logit contributions
urious-0.060
ubby-0.058
elly-0.058
ude-0.056
resa-0.055
Token the
Token ID262
Feature activation-2.75e-02

Pos logit contributions
urious+0.048
 types+0.045
elly+0.045
illing+0.044
ida+0.044
Neg logit contributions
 he-0.030
 heal-0.029
stack-0.025
 Zig-0.025
 back-0.025
Token most
Token ID749
Feature activation+2.27e-02

Pos logit contributions
 he+0.047
 heal+0.040
 back+0.039
 Zig+0.039
 his+0.038
Neg logit contributions
urious-0.050
ubby-0.047
elly-0.047
resa-0.045
 elderly-0.045
Token things
Token ID1243
Feature activation-3.83e-02

Pos logit contributions
urious+0.040
elly+0.038
ubby+0.037
resa+0.036
illing+0.035
Neg logit contributions
 he-0.043
 back-0.039
 out-0.037
 heal-0.037
 his-0.035
Token in
Token ID287
Feature activation-2.73e-02

Pos logit contributions
 heal+0.033
 bed+0.031
 he+0.031
 Zig+0.030
 back+0.027
Neg logit contributions
elly-0.087
ida-0.085
ude-0.084
ody-0.084
urious-0.083
Token the
Token ID262
Feature activation-2.61e-02

Pos logit contributions
 he+0.028
 back+0.024
 heal+0.021
 out+0.021
 down+0.019
Neg logit contributions
urious-0.072
elly-0.069
ubby-0.068
resa-0.067
ude-0.066
Token house
Token ID2156
Feature activation-6.07e-02

Pos logit contributions
 he+0.050
 heal+0.043
 Zig+0.043
 they+0.041
 his+0.040
Neg logit contributions
urious-0.046
elly-0.042
ubby-0.041
resa-0.040
 elderly-0.038
Token was
Token ID373
Feature activation+9.66e-03

Pos logit contributions
 he+0.080
 back+0.071
 heal+0.071
 out+0.065
 Zig+0.065
Neg logit contributions
urious-0.128
elly-0.126
ude-0.121
illing-0.119
ubby-0.119
Token the
Token ID262
Feature activation-1.32e-02

Pos logit contributions
urious+0.020
elly+0.019
illing+0.019
ude+0.019
ubby+0.019
Neg logit contributions
 he-0.014
 back-0.012
 heal-0.012
 out-0.011
 down-0.011
Token winner
Token ID8464
Feature activation-2.25e-02

Pos logit contributions
 he+0.025
 heal+0.021
 back+0.021
 his+0.020
 they+0.020
Neg logit contributions
urious-0.022
resa-0.021
ubby-0.021
elly-0.021
ida-0.020
Token.
Token ID13
Feature activation-3.85e-02

Pos logit contributions
 he+0.028
 heal+0.027
 back+0.024
 Zig+0.023
 his+0.021
Neg logit contributions
elly-0.049
urious-0.048
ude-0.047
illing-0.046
ida-0.046
Token
Token ID198
Feature activation+4.91e-02

Pos logit contributions
 he+0.054
 back+0.049
 heal+0.046
 out+0.045
 away+0.043
Neg logit contributions
urious-0.086
elly-0.083
ubby-0.080
resa-0.079
illing-0.078
Token the
Token ID262
Feature activation-2.25e-02

Pos logit contributions
urious+0.032
elly+0.030
ida+0.030
ubby+0.030
elia+0.030
Neg logit contributions
 he-0.013
 heal-0.011
 back-0.010
 bed-0.008
 they-0.008
Token box
Token ID3091
Feature activation-6.31e-02

Pos logit contributions
 he+0.036
 back+0.032
 heal+0.031
 out+0.030
 down+0.029
Neg logit contributions
urious-0.046
elly-0.044
ubby-0.042
resa-0.042
ida-0.041
Token she
Token ID673
Feature activation-4.84e-02

Pos logit contributions
 he+0.072
 heal+0.069
 Zig+0.064
 back+0.063
 drown+0.060
Neg logit contributions
urious-0.135
elly-0.131
ubby-0.126
ude-0.125
ida-0.124
Token acted
Token ID13134
Feature activation+1.35e-02

Pos logit contributions
 he+0.072
 back+0.063
 out+0.059
 heal+0.058
 his+0.055
Neg logit contributions
urious-0.107
elly-0.103
ubby-0.100
resa-0.098
illing-0.097
Token like
Token ID588
Feature activation-2.29e-02

Pos logit contributions
urious+0.027
elly+0.027
ubby+0.026
illing+0.025
ude+0.025
Neg logit contributions
 he-0.020
 back-0.017
 heal-0.017
 out-0.016
 bed-0.016
Token it
Token ID340
Feature activation-5.29e-02

Pos logit contributions
 he+0.025
 heal+0.024
 back+0.024
 away+0.021
 down+0.021
Neg logit contributions
urious-0.050
elly-0.049
ody-0.048
ida-0.048
ami-0.048
Token was
Token ID373
Feature activation-9.01e-03

Pos logit contributions
 he+0.072
 back+0.064
 heal+0.063
 out+0.059
 bed+0.058
Neg logit contributions
urious-0.111
elly-0.106
ude-0.102
ida-0.102
ubby-0.100
Token such
Token ID884
Feature activation-8.67e-03

Pos logit contributions
 he+0.012
 heal+0.011
 back+0.010
 down+0.009
 out+0.009
Neg logit contributions
urious-0.018
elly-0.018
ude-0.018
ubby-0.018
ida-0.018
Token a
Token ID257
Feature activation-2.27e-02

Pos logit contributions
 he+0.012
 back+0.010
 heal+0.009
 out+0.009
 down+0.009
Neg logit contributions
urious-0.020
elly-0.019
ubby-0.019
resa-0.018
ude-0.018
Token special
Token ID2041
Feature activation-1.83e-02

Pos logit contributions
 he+0.043
 heal+0.038
 back+0.038
 out+0.038
 down+0.038
Neg logit contributions
urious-0.035
elly-0.034
ida-0.034
resa-0.034
ubby-0.033
Token surprise
Token ID5975
Feature activation-3.58e-02

Pos logit contributions
 he+0.029
 back+0.026
 heal+0.025
 out+0.024
 down+0.023
Neg logit contributions
urious-0.035
elly-0.035
ubby-0.033
resa-0.033
ida-0.032
Token saw
Token ID2497
Feature activation+2.19e-02

Pos logit contributions
urious+0.014
elly+0.013
ubby+0.013
resa+0.013
ami+0.013
Neg logit contributions
 he-0.009
 back-0.008
 out-0.008
 his-0.007
 they-0.007
Token something
Token ID1223
Feature activation-2.20e-02

Pos logit contributions
urious+0.052
elly+0.050
ubby+0.048
resa+0.048
ude+0.047
Neg logit contributions
 he-0.027
 back-0.024
 heal-0.023
 out-0.021
 away-0.020
Token blue
Token ID4171
Feature activation+6.19e-03

Pos logit contributions
 he+0.035
 heal+0.035
 back+0.034
 away+0.031
 bed+0.031
Neg logit contributions
elly-0.037
urious-0.037
illing-0.035
ida-0.035
ubby-0.034
Token in
Token ID287
Feature activation-3.58e-02

Pos logit contributions
urious+0.015
elly+0.015
ubby+0.014
resa+0.014
illing+0.014
Neg logit contributions
 he-0.008
 back-0.006
 heal-0.006
 out-0.006
 his-0.005
Token the
Token ID262
Feature activation-4.17e-02

Pos logit contributions
 he+0.028
 back+0.020
 his+0.018
 out+0.017
 heal+0.015
Neg logit contributions
urious-0.108
elly-0.104
ubby-0.102
illing-0.101
resa-0.100
Token sky
Token ID6766
Feature activation-6.22e-02

Pos logit contributions
 he+0.079
 they+0.067
 heal+0.067
 Zig+0.063
 back+0.060
Neg logit contributions
elly-0.072
urious-0.072
resa-0.066
ubby-0.065
ady-0.064
Token.
Token ID13
Feature activation-3.78e-02

Pos logit contributions
 he+0.061
 back+0.055
 heal+0.055
 Zig+0.049
 out+0.047
Neg logit contributions
urious-0.155
elly-0.151
illing-0.146
ami-0.143
ude-0.142
Token It
Token ID632
Feature activation-4.58e-02

Pos logit contributions
 he+0.047
 back+0.042
 heal+0.040
 out+0.038
 away+0.037
Neg logit contributions
urious-0.090
elly-0.086
ubby-0.083
resa-0.083
illing-0.083
Token was
Token ID373
Feature activation+1.84e-02

Pos logit contributions
 he+0.077
 back+0.072
 heal+0.070
 out+0.067
 away+0.064
Neg logit contributions
urious-0.086
elly-0.080
ude-0.077
ubby-0.076
resa-0.076
Token so
Token ID523
Feature activation+9.45e-03

Pos logit contributions
urious+0.042
elly+0.041
ubby+0.039
resa+0.039
illing+0.039
Neg logit contributions
 he-0.025
 back-0.021
 heal-0.020
 out-0.019
 away-0.019
Token far
Token ID1290
Feature activation-6.10e-03

Pos logit contributions
urious+0.019
elly+0.018
ubby+0.017
resa+0.017
illing+0.017
Neg logit contributions
 he-0.017
 back-0.014
 out-0.014
 heal-0.013
 his-0.013
Token for
Token ID329
Feature activation+3.44e-03

Pos logit contributions
 he+0.007
 heal+0.006
 back+0.006
 Zig+0.006
 down+0.005
Neg logit contributions
ubby-0.010
elly-0.010
urious-0.010
 elderly-0.009
illing-0.009
Token watering
Token ID41737
Feature activation-8.88e-03

Pos logit contributions
urious+0.008
elly+0.007
ubby+0.007
resa+0.007
illing+0.007
Neg logit contributions
 he-0.005
 back-0.004
 heal-0.004
 out-0.004
 down-0.004
Token the
Token ID262
Feature activation-4.16e-02

Pos logit contributions
 he+0.011
 heal+0.009
 back+0.009
 Zig+0.008
 bed+0.008
Neg logit contributions
elly-0.021
urious-0.020
ubby-0.020
illing-0.019
 types-0.019
Token flowers
Token ID12734
Feature activation-7.11e-02

Pos logit contributions
 he+0.059
 back+0.051
 heal+0.048
 out+0.047
 his+0.044
Neg logit contributions
urious-0.094
elly-0.091
ubby-0.088
resa-0.087
illing-0.086
Token.
Token ID13
Feature activation-3.54e-02

Pos logit contributions
 he+0.087
 Zig+0.080
 heal+0.079
 his+0.070
 Brown+0.069
Neg logit contributions
elly-0.156
ubby-0.142
urious-0.142
illing-0.141
iling-0.138
Token It
Token ID632
Feature activation-4.94e-02

Pos logit contributions
 he+0.048
 back+0.042
 out+0.038
 heal+0.038
 away+0.036
Neg logit contributions
urious-0.082
elly-0.079
ubby-0.077
resa-0.075
illing-0.075
Token is
Token ID318
Feature activation-2.23e-03

Pos logit contributions
 he+0.087
 back+0.080
 bed+0.077
 heal+0.077
 out+0.075
Neg logit contributions
urious-0.086
elly-0.080
ubby-0.078
iling-0.076
ude-0.072
Token very
Token ID845
Feature activation+1.74e-02

Pos logit contributions
 he+0.003
 heal+0.003
 back+0.003
 Zig+0.003
 down+0.002
Neg logit contributions
ubby-0.004
urious-0.004
elly-0.004
illing-0.004
ady-0.004
Token hard
Token ID1327
Feature activation-7.97e-03

Pos logit contributions
urious+0.033
elly+0.032
ubby+0.032
illing+0.030
ida+0.029
Neg logit contributions
 he-0.030
 back-0.027
 heal-0.027
 out-0.025
 anything-0.024
Token to
Token ID284
Feature activation-2.50e-02

Pos logit contributions
 he+0.009
 back+0.008
 heal+0.008
 out+0.007
 Zig+0.007
Neg logit contributions
elly-0.019
urious-0.019
ubby-0.018
illing-0.018
ude-0.017
Token turn
Token ID1210
Feature activation+1.54e-02

Pos logit contributions
 he+0.036
 back+0.031
 down+0.027
 heal+0.027
 out+0.027
Neg logit contributions
ubby-0.050
urious-0.050
elly-0.048
ude-0.048
ida-0.047
Token she
Token ID673
Feature activation-4.08e-02

Pos logit contributions
 heal+0.036
 back+0.031
 bleed+0.030
 drown+0.029
 ramp+0.029
Neg logit contributions
iley-0.031
oph-0.031
ubby-0.030
elia-0.030
ude-0.030
Token saw
Token ID2497
Feature activation+1.55e-03

Pos logit contributions
 he+0.064
 back+0.056
 heal+0.054
 out+0.053
 down+0.050
Neg logit contributions
urious-0.086
elly-0.082
ubby-0.079
resa-0.078
ude-0.077
Token a
Token ID257
Feature activation-1.30e-02

Pos logit contributions
urious+0.004
elly+0.003
resa+0.003
ubby+0.003
ami+0.003
Neg logit contributions
 he-0.002
 back-0.002
 heal-0.002
 out-0.001
 down-0.001
Token white
Token ID2330
Feature activation+3.79e-04

Pos logit contributions
 he+0.024
 back+0.021
 heal+0.020
 out+0.020
 down+0.019
Neg logit contributions
urious-0.024
elly-0.023
ubby-0.022
resa-0.022
ida-0.021
Token hat
Token ID6877
Feature activation-6.07e-02

Pos logit contributions
elly+0.001
urious+0.001
ubby+0.001
resa+0.001
illing+0.001
Neg logit contributions
 he-0.001
 back-0.001
 heal-0.001
 out-0.000
 they-0.000
Token inside
Token ID2641
Feature activation-5.35e-02

Pos logit contributions
 he+0.078
 heal+0.070
 anything+0.068
 back+0.066
 Zig+0.065
Neg logit contributions
elly-0.119
urious-0.115
ami-0.114
ude-0.113
illing-0.113
Token the
Token ID262
Feature activation-4.88e-02

Pos logit contributions
 he+0.054
 heal+0.049
 back+0.048
 out+0.041
 Zig+0.039
Neg logit contributions
urious-0.134
elly-0.131
ubby-0.124
ami-0.124
illing-0.124
Token bag
Token ID6131
Feature activation-4.78e-02

Pos logit contributions
 he+0.085
 heal+0.076
 they+0.072
 Zig+0.070
 anything+0.067
Neg logit contributions
elly-0.083
urious-0.080
resa-0.080
ubby-0.079
 elderly-0.079
Token.
Token ID13
Feature activation-4.35e-02

Pos logit contributions
 he+0.050
 heal+0.044
 back+0.043
 out+0.039
 anything+0.038
Neg logit contributions
urious-0.117
elly-0.116
illing-0.111
resa-0.110
ubby-0.109
Token When
Token ID1649
Feature activation-3.80e-02

Pos logit contributions
 he+0.064
 back+0.059
 heal+0.057
 out+0.054
 away+0.054
Neg logit contributions
urious-0.091
elly-0.090
resa-0.086
ubby-0.084
illing-0.084
Token she
Token ID673
Feature activation-2.54e-02

Pos logit contributions
 heal+0.048
 back+0.040
 bleed+0.039
 drown+0.038
 ramp+0.037
Neg logit contributions
elia-0.077
oph-0.075
resa-0.074
ida-0.074
iley-0.072