TOP ACTIVATIONS
MAX = 9.436

and swinging their long tr unks . Amy and her mom
do not see the p uddle on the ground . They
and fall into the p uddle . They get wet and
But those are not tr unks . Those are my pants
" I know , sk unks are not very pleasant .
! We have long tr unks !" Mom hears
Now , let 's c uddle and read your book .
He lands in the p uddle next to Anna . He
wished for . They c uddle with their new toys .
't splash in the p uddle ," she says .
put on his swimming tr unks and find some interesting things
go to bed and c uddle with their doll and car
He runs to the p uddle . He jumps in it
you . We can c uddle and sn uggle and keep
the comfortable couch and c uddle with their blankets . They
They put on their tr unks and jumped right in .
Now , let 's c uddle and read a story .
ph ants have strong tr unks that they can use to
. They grabbed their tr unks and set off .
balls and mice and c uddle with the boys . One

MIN ACTIVATIONS
MIN = -5.277

the night . <|endoftext|> Jo Jo was walking outside on a
? Jo jo dug into his pocket
to the village . Jo Jo smiled with anticipation and asked
thing to do . <|endoftext|> Jo el saw a banana in
unloaded , John and Jo Jo started to make their way
happy boy . <|endoftext|> Jo Jo was sad . He wanted
felt so proud ! <|endoftext|> Jo was a lively three -
of the night . <|endoftext|> Jo Jo was walking outside on
lucky happy boy . <|endoftext|> Jo Jo was sad . He
and harder . Jo jo and his mom were
friend Jo Jo . Jo Jo was thrilled to see Big
've got an idea Jo Jo . Maybe next time we
life perfectly . Jo decided to take the puppy
Jo shouted . Jo opened the book and saw
into his best friend Jo Jo . Jo Jo was thrilled
't bud ge ! Jo Jo was getting frustrated . She
the village ?" Jo Jo was excited . They
village ?" Jo Jo was excited . They knew
Don 't worry , Jo Jo . I can help you
His mom looked at Jo Jo and said " Don 't

INTERVAL 5.758 - 9.436
CONTAINS 0.012%

with her toys and c uddle with her t eddy bear
" That 's a good choice ," he said .
He knew he had no choice . He decided to surrender
stepped in the deep p uddle . " Mom
land in a deep p uddle of water . They are

INTERVAL 2.080 - 5.758
CONTAINS 3.194%

to their room and sleep for a little while .
. He kept asking for a muff in .
saw the cat trying to catch the bird ie . She
seal said , " Please forgive me , I just wanted
you . I 'm sorry for being sad earlier ." The

INTERVAL -1.598 - 2.080
CONTAINS 91.599%

used to being angry , but the so apy water was
Lily . I will try some of your pasta . But
fair !" She tried to pull her blocks back , but
nice , dear ," said her mom . " But be
luck iest deer around ! <|endoftext|> Once upon a time there

INTERVAL -5.277 - -1.598
CONTAINS 5.195%

moved . Soon he tried twisting it around other things that
He looked at it with delight and immediately wanted to wear
Tim my learned that candles can be dangerous and should
They relaxed , laughed and ch atted the whole way .
a time there was a little boy called Sean . He
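
NOTE ON THE INTERVALS
The four INTERVAL blocks above appear to split the range from MIN (-5.277) to MAX (9.436) into four equal-width bins, and the four CONTAINS percentages sum to 100%. The sketch below reproduces the bin edges under that assumption. The dump does not say how the bins were actually chosen, so equal-width binning is inferred from the listed numbers, and the name interval_edges is hypothetical.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi.

    Assumption: the dump's INTERVAL blocks use equal-width bins.
    This is inferred from the listed edges, not stated in the dump.
    """
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as listed under MIN ACTIVATIONS and TOP ACTIVATIONS.
edges = interval_edges(-5.277, 9.436)

# Edges as printed in the INTERVAL headers, lowest to highest.
listed_edges = [-5.277, -1.598, 2.080, 5.758, 9.436]

# Largest gap between a computed edge and the edge printed in the dump.
max_gap = max(abs(e - l) for e, l in zip(edges, listed_edges))

# CONTAINS percentages, ordered from the lowest interval to the highest.
shares = [5.195, 91.599, 3.194, 0.012]
total_share = sum(shares)

for left, right, share in zip(edges, edges[1:], shares):
    print(f"{left:8.3f} to {right:7.3f}  contains {share:7.3f}%")
```

Each computed edge lands within 0.001 of the edge printed in the dump, which is consistent with equal-width bins rounded to three decimals for display.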
Token and
Token ID290
Feature activation+1.09e-01

Pos logit contributions
 wished+0.027
 s+0.026
 living+0.025
 t+0.023
 sharp+0.023
Neg logit contributions
ete-0.067
icking-0.065
ement-0.062
ory-0.062
oming-0.060
Token swinging
Token ID25635
Feature activation-1.19e+00

Pos logit contributions
ete+0.008
glers+0.008
icking+0.008
ement+0.008
+0.008
Neg logit contributions
 wished-0.005
 living-0.005
 s-0.005
 sharp-0.005
 responsible-0.005
Token their
Token ID511
Feature activation-1.36e+00

Pos logit contributions
 s+0.049
 wished+0.048
 living+0.045
 t+0.044
 sharp+0.043
Neg logit contributions
ete-0.123
icking-0.120
ement-0.114
ory-0.113
ancer-0.111
Token long
Token ID890
Feature activation-2.67e+00

Pos logit contributions
 wished+0.071
 living+0.060
 s+0.057
 knocking+0.055
 responsible+0.055
Neg logit contributions
ete-0.117
icking-0.114
ory-0.110
ement-0.108
glers-0.106
Token tr
Token ID491
Feature activation-1.89e+00

Pos logit contributions
 wished+0.123
 living+0.104
 s+0.098
 never+0.095
 knocking+0.092
Neg logit contributions
ete-0.252
ory-0.238
ement-0.235
icking-0.234
ails-0.229
Tokenunks
Token ID14125
Feature activation+7.10e+00

Pos logit contributions
 s+0.170
 living+0.159
 wished+0.159
 snow+0.158
 never+0.157
Neg logit contributions
ete-0.094
icking-0.092
ancer-0.087
ory-0.083
ylan-0.077
Token.
Token ID13
Feature activation+5.78e-02

Pos logit contributions
ete+0.725
icking+0.716
ement+0.678
ory+0.672
oming+0.658
Neg logit contributions
 wished-0.278
 s-0.273
 living-0.252
 t-0.241
 sharp-0.240
Token Amy
Token ID14235
Feature activation-8.44e-02

Pos logit contributions
ete+0.005
icking+0.005
ement+0.005
+0.004
ory+0.004
Neg logit contributions
 wished-0.003
 s-0.003
 living-0.003
 t-0.003
 sharp-0.003
Token and
Token ID290
Feature activation+1.30e-01

Pos logit contributions
 s+0.004
 living+0.004
 snow+0.003
 sharp+0.003
 wished+0.003
Neg logit contributions
ete-0.008
icking-0.008
ory-0.007
ement-0.007
oured-0.007
Token her
Token ID607
Feature activation-1.35e+00

Pos logit contributions
icking+0.011
ete+0.011
ement+0.010
ory+0.010
iane+0.009
Neg logit contributions
 wished-0.007
 s-0.006
 living-0.006
 sharp-0.006
 snow-0.006
Token mom
Token ID1995
Feature activation-1.68e-01

Pos logit contributions
 wished+0.077
 living+0.069
 s+0.069
 knocking+0.066
 r+0.064
Neg logit contributions
ete-0.110
icking-0.110
ement-0.107
ory-0.104
-0.097
Token do
Token ID466
Feature activation+7.75e-01

Pos logit contributions
 s+0.011
 wished+0.011
 living+0.011
 t+0.010
 snow+0.010
Neg logit contributions
ete-0.023
icking-0.023
ory-0.022
ement-0.021
-0.021
Token not
Token ID407
Feature activation-1.12e+00

Pos logit contributions
icking+0.073
ete+0.072
oming+0.068
ails+0.067
+0.067
Neg logit contributions
 wished-0.034
 s-0.031
 living-0.031
 snow-0.030
 t-0.030
Token see
Token ID766
Feature activation+1.42e+00

Pos logit contributions
 living+0.072
 s+0.071
 wished+0.070
 cob+0.069
 snow+0.067
Neg logit contributions
ory-0.080
ete-0.078
icking-0.072
ola-0.072
ancer-0.069
Token the
Token ID262
Feature activation+5.19e-01

Pos logit contributions
ete+0.125
icking+0.120
ement+0.117
ays+0.117
ory+0.113
Neg logit contributions
 wished-0.072
 s-0.070
 bark-0.066
 living-0.064
 r-0.063
Token p
Token ID279
Feature activation-1.87e+00

Pos logit contributions
ete+0.050
icking+0.049
ory+0.046
ement+0.046
+0.045
Neg logit contributions
 wished-0.024
 s-0.023
 living-0.021
 t-0.020
 happy-0.020
Tokenuddle
Token ID24500
Feature activation+6.97e+00

Pos logit contributions
 s+0.184
 mess+0.184
 sharp+0.178
 toy+0.178
 r+0.177
Neg logit contributions
ete-0.088
ancer-0.075
icking-0.071
ola-0.065
ory-0.064
Token on
Token ID319
Feature activation+1.89e-01

Pos logit contributions
ete+0.810
icking+0.752
ory+0.750
ancer+0.737
ement+0.725
Neg logit contributions
 s-0.245
 living-0.220
 r-0.213
 wished-0.212
 t-0.201
Token the
Token ID262
Feature activation+5.20e-01

Pos logit contributions
ete+0.020
icking+0.019
ement+0.019
ory+0.019
+0.018
Neg logit contributions
 wished-0.007
 s-0.007
 living-0.007
 t-0.006
 sharp-0.006
Token ground
Token ID2323
Feature activation-5.44e-01

Pos logit contributions
ete+0.046
icking+0.046
ory+0.043
+0.042
oming+0.042
Neg logit contributions
 wished-0.026
 s-0.023
 dreamed-0.022
 happy-0.021
 r-0.020
Token.
Token ID13
Feature activation+1.43e-01

Pos logit contributions
 s+0.020
 wished+0.020
 living+0.018
 t+0.018
 sharp+0.017
Neg logit contributions
ete-0.057
icking-0.056
ement-0.054
ory-0.053
oming-0.052
Token They
Token ID1119
Feature activation-2.54e-01

Pos logit contributions
ete+0.014
icking+0.014
ement+0.013
ory+0.013
ancer+0.013
Neg logit contributions
 s-0.006
 wished-0.006
 living-0.006
 t-0.005
 sharp-0.005
Token and
Token ID290
Feature activation+1.22e-01

Pos logit contributions
icking+0.051
ays+0.050
ete+0.048
ement+0.048
oming+0.047
Neg logit contributions
 wished-0.035
 s-0.033
 living-0.032
 gir-0.032
 dreamed-0.031
Token fall
Token ID2121
Feature activation-1.29e+00

Pos logit contributions
icking+0.011
ete+0.011
+0.010
ancer+0.010
oming+0.010
Neg logit contributions
 wished-0.005
 living-0.005
 s-0.005
 songs-0.005
 gir-0.005
Token into
Token ID656
Feature activation+5.71e-01

Pos logit contributions
 wished+0.064
 s+0.060
 living+0.058
 t+0.055
 sharp+0.054
Neg logit contributions
icking-0.117
ete-0.116
ement-0.109
oming-0.108
ory-0.107
Token the
Token ID262
Feature activation+5.46e-01

Pos logit contributions
ete+0.048
icking+0.044
ola+0.044
+0.044
 Derek+0.043
Neg logit contributions
 wished-0.030
 s-0.028
 gir-0.028
 dreamed-0.026
 living-0.026
Token p
Token ID279
Feature activation-1.89e+00

Pos logit contributions
ete+0.047
icking+0.046
+0.045
ory+0.045
oming+0.044
Neg logit contributions
 wished-0.028
 s-0.024
 dreamed-0.024
 never-0.024
 t-0.024
Tokenuddle
Token ID24500
Feature activation+6.92e+00

Pos logit contributions
 s+0.205
 living+0.197
 mess+0.196
 r+0.195
 wished+0.194
Neg logit contributions
ete-0.072
icking-0.059
ancer-0.057
ement-0.052
ory-0.052
Token.
Token ID13
Feature activation+1.51e-01

Pos logit contributions
ete+0.697
icking+0.693
ement+0.660
ory+0.644
oming+0.635
Neg logit contributions
 wished-0.279
 s-0.269
 living-0.251
 t-0.244
 sharp-0.238
Token They
Token ID1119
Feature activation-2.59e-01

Pos logit contributions
ete+0.014
icking+0.013
ement+0.013
ory+0.013
+0.013
Neg logit contributions
 wished-0.007
 s-0.007
 living-0.007
 t-0.007
 sharp-0.007
Token get
Token ID651
Feature activation+5.37e-01

Pos logit contributions
 living+0.012
 wished+0.011
 s+0.011
 snow+0.011
 gir+0.011
Neg logit contributions
ete-0.023
icking-0.022
-0.021
ory-0.021
oming-0.021
Token wet
Token ID9583
Feature activation+4.87e-01

Pos logit contributions
ete+0.054
icking+0.054
ory+0.050
ement+0.050
oming+0.050
Neg logit contributions
 wished-0.021
 s-0.021
 living-0.020
 t-0.018
 bark-0.018
Token and
Token ID290
Feature activation+1.23e-01

Pos logit contributions
icking+0.047
ete+0.046
ement+0.045
ory+0.043
ays+0.043
Neg logit contributions
 wished-0.021
 s-0.020
 living-0.019
 sharp-0.019
 bark-0.018
Token But
Token ID887
Feature activation-2.01e-01

Pos logit contributions
ete+0.002
icking+0.002
ement+0.002
ory+0.002
oming+0.002
Neg logit contributions
 living-0.002
 s-0.002
 wished-0.002
 snow-0.002
 t-0.002
Token those
Token ID883
Feature activation+2.55e-01

Pos logit contributions
 wished+0.012
 s+0.012
 living+0.012
 t+0.012
 r+0.012
Neg logit contributions
ete-0.017
icking-0.016
ement-0.015
ory-0.015
oming-0.015
Token are
Token ID389
Feature activation-1.22e-01

Pos logit contributions
ete+0.022
icking+0.021
ement+0.020
ory+0.020
ancer+0.019
Neg logit contributions
 s-0.015
 wished-0.015
 living-0.014
 t-0.014
 sharp-0.013
Token not
Token ID407
Feature activation-1.18e+00

Pos logit contributions
 wished+0.004
 s+0.004
 living+0.003
 t+0.003
 sharp+0.003
Neg logit contributions
ete-0.014
icking-0.013
ory-0.013
ement-0.013
ancer-0.012
Token tr
Token ID491
Feature activation-1.91e+00

Pos logit contributions
 wished+0.047
 s+0.043
 living+0.039
 bark+0.038
 t+0.037
Neg logit contributions
icking-0.119
ete-0.118
ory-0.112
ancer-0.109
ola-0.108
Tokenunks
Token ID14125
Feature activation+6.91e+00

Pos logit contributions
 s+0.177
 living+0.171
 mess+0.167
 r+0.167
 t+0.166
Neg logit contributions
ete-0.095
icking-0.085
ory-0.079
ancer-0.073
ement-0.071
Token.
Token ID13
Feature activation+2.02e-02

Pos logit contributions
ete+0.684
icking+0.658
ement+0.632
ory+0.626
ancer+0.610
Neg logit contributions
 s-0.316
 wished-0.308
 living-0.298
 t-0.284
 sharp-0.280
Token Those
Token ID5845
Feature activation-1.54e+00

Pos logit contributions
ete+0.002
icking+0.002
ement+0.002
ory+0.002
oming+0.002
Neg logit contributions
 s-0.001
 living-0.001
 wished-0.001
 snow-0.001
 t-0.001
Token are
Token ID389
Feature activation-1.20e-01

Pos logit contributions
 s+0.086
 wished+0.084
 living+0.081
 t+0.080
 sharp+0.078
Neg logit contributions
ete-0.139
icking-0.134
ement-0.127
ory-0.125
oming-0.122
Token my
Token ID616
Feature activation+3.45e-01

Pos logit contributions
 s+0.004
 wished+0.004
 living+0.004
 t+0.003
 sharp+0.003
Neg logit contributions
ete-0.013
icking-0.013
ement-0.012
ory-0.012
ancer-0.012
Token pants
Token ID12581
Feature activation+1.90e+00

Pos logit contributions
ete+0.035
icking+0.034
ement+0.033
ory+0.032
ancer+0.032
Neg logit contributions
 s-0.015
 wished-0.014
 living-0.014
 t-0.013
 sharp-0.013
Token "
Token ID366
Feature activation-5.35e-01

Pos logit contributions
ete+0.008
icking+0.008
ement+0.007
ory+0.007
ancer+0.007
Neg logit contributions
 s-0.004
 wished-0.004
 living-0.004
 t-0.004
 sharp-0.004
TokenI
Token ID40
Feature activation+1.97e+00

Pos logit contributions
 s+0.033
 t+0.031
 wished+0.030
 mess+0.030
 living+0.030
Neg logit contributions
ete-0.041
ement-0.038
ory-0.038
-0.038
ola-0.038
Token know
Token ID760
Feature activation-5.87e-01

Pos logit contributions
ete+0.185
icking+0.180
ory+0.174
ement+0.168
ancer+0.167
Neg logit contributions
 s-0.091
 wished-0.089
 living-0.089
 snow-0.083
 sharp-0.083
Token,
Token ID11
Feature activation+5.61e-01

Pos logit contributions
 s+0.036
 wished+0.036
 living+0.035
 t+0.034
 sharp+0.033
Neg logit contributions
ete-0.048
icking-0.046
ement-0.044
ory-0.044
ancer-0.042
Token sk
Token ID1341
Feature activation-1.96e+00

Pos logit contributions
ete+0.051
icking+0.049
ement+0.047
ory+0.047
ancer+0.046
Neg logit contributions
 s-0.030
 wished-0.029
 living-0.028
 t-0.027
 sharp-0.027
Tokenunks
Token ID14125
Feature activation+6.91e+00

Pos logit contributions
 s+0.171
 living+0.171
 wished+0.161
 r+0.160
 t+0.159
Neg logit contributions
ete-0.109
ement-0.097
icking-0.096
ory-0.093
iring-0.087
Token are
Token ID389
Feature activation-6.68e-02

Pos logit contributions
ete+0.685
icking+0.655
ement+0.636
ory+0.624
ancer+0.610
Neg logit contributions
 s-0.325
 wished-0.313
 living-0.307
 t-0.287
 r-0.285
Token not
Token ID407
Feature activation-1.16e+00

Pos logit contributions
 wished+0.002
 s+0.002
 living+0.002
 t+0.002
 bark+0.002
Neg logit contributions
ete-0.007
icking-0.007
ory-0.007
ement-0.007
oming-0.007
Token very
Token ID845
Feature activation+6.68e-01

Pos logit contributions
 wished+0.048
 s+0.044
 bark+0.039
 never+0.039
 t+0.039
Neg logit contributions
icking-0.110
ete-0.105
ory-0.105
ancer-0.101
iring-0.098
Token pleasant
Token ID15497
Feature activation+2.14e-01

Pos logit contributions
ete+0.079
icking+0.077
ory+0.075
ola+0.072
oming+0.072
Neg logit contributions
 wished-0.016
 s-0.014
 living-0.014
 t-0.013
 snow-0.012
Token.
Token ID13
Feature activation+2.51e-02

Pos logit contributions
icking+0.020
ete+0.020
ory+0.019
ement+0.019
ails+0.018
Neg logit contributions
 wished-0.010
 s-0.009
 living-0.009
 sharp-0.009
 t-0.008
Token!
Token ID0
Feature activation+1.44e+00

Pos logit contributions
 wished+0.071
 s+0.070
 living+0.066
 sharp+0.064
 t+0.064
Neg logit contributions
ete-0.102
icking-0.099
ement-0.093
ory-0.093
ancer-0.090
Token We
Token ID775
Feature activation+6.09e-01

Pos logit contributions
ete+0.136
icking+0.135
ement+0.127
ory+0.124
oming+0.122
Neg logit contributions
 s-0.075
 wished-0.071
 living-0.070
 t-0.066
 snow-0.066
Token have
Token ID423
Feature activation+4.63e-01

Pos logit contributions
ete+0.051
icking+0.049
ory+0.048
ola+0.046
ancer+0.045
Neg logit contributions
 living-0.032
 s-0.032
 wished-0.032
 soft-0.030
 sharp-0.030
Token long
Token ID890
Feature activation-2.71e+00

Pos logit contributions
ete+0.046
icking+0.043
ory+0.042
ola+0.040
oming+0.040
Neg logit contributions
 wished-0.019
 s-0.018
 bark-0.017
 cry-0.016
 cough-0.016
Token tr
Token ID491
Feature activation-1.90e+00

Pos logit contributions
 wished+0.146
 s+0.135
 living+0.132
 happy+0.129
 sharp+0.127
Neg logit contributions
ete-0.233
icking-0.225
ory-0.219
ement-0.213
ails-0.213
Tokenunks
Token ID14125
Feature activation+6.87e+00

Pos logit contributions
 s+0.188
 living+0.180
 wished+0.176
 t+0.176
 r+0.176
Neg logit contributions
ete-0.079
icking-0.075
ory-0.066
ancer-0.064
ement-0.061
Token!"
Token ID2474
Feature activation-7.43e-01

Pos logit contributions
ete+0.704
icking+0.683
ement+0.658
ory+0.644
ancer+0.634
Neg logit contributions
 s-0.287
 wished-0.275
 living-0.272
 t-0.258
 sharp-0.247
Token
Token ID198
Feature activation+9.58e-02

Pos logit contributions
 wished+0.039
 s+0.036
 sharp+0.036
 t+0.035
 living+0.035
Neg logit contributions
ete-0.067
icking-0.063
ory-0.062
ement-0.062
-0.060
Token
Token ID198
Feature activation+8.46e-02

Pos logit contributions
ete+0.009
icking+0.008
ory+0.008
ement+0.008
ola+0.008
Neg logit contributions
 wished-0.005
 s-0.005
 t-0.004
 sharp-0.004
 living-0.004
TokenMom
Token ID29252
Feature activation+5.05e-01

Pos logit contributions
ete+0.006
icking+0.006
ola+0.006
ory+0.006
iring+0.005
Neg logit contributions
 wished-0.006
 s-0.005
 sharp-0.005
 t-0.005
 bark-0.005
Token hears
Token ID23180
Feature activation-1.56e+00

Pos logit contributions
ete+0.045
icking+0.044
ement+0.042
ory+0.041
ays+0.041
Neg logit contributions
 wished-0.027
 s-0.025
 sharp-0.024
 happy-0.024
 living-0.023
Token Now
Token ID2735
Feature activation+3.15e-01

Pos logit contributions
ete+0.006
icking+0.006
ement+0.006
ory+0.005
oming+0.005
Neg logit contributions
 s-0.005
 wished-0.005
 living-0.005
 t-0.005
 soft-0.004
Token,
Token ID11
Feature activation+5.43e-01

Pos logit contributions
ete+0.029
icking+0.028
ement+0.027
ory+0.027
ancer+0.026
Neg logit contributions
 s-0.016
 wished-0.016
 living-0.015
 t-0.015
 happy-0.014
Token let
Token ID1309
Feature activation-9.66e-01

Pos logit contributions
ete+0.053
icking+0.052
ory+0.049
ement+0.049
ancer+0.048
Neg logit contributions
 s-0.024
 wished-0.024
 living-0.022
 sharp-0.021
 t-0.021
Token's
Token ID338
Feature activation-6.36e-01

Pos logit contributions
 wished+0.032
 s+0.031
 living+0.028
 sharp+0.027
 t+0.026
Neg logit contributions
ete-0.108
icking-0.105
ory-0.099
ement-0.098
oming-0.096
Token c
Token ID269
Feature activation-1.29e+00

Pos logit contributions
 wished+0.028
 s+0.027
 living+0.026
 sharp+0.025
 t+0.024
Neg logit contributions
ete-0.063
icking-0.061
ory-0.058
ement-0.057
ancer-0.056
Tokenuddle
Token ID24500
Feature activation+6.85e+00

Pos logit contributions
 s+0.111
 mess+0.106
 living+0.105
 snow+0.103
 soft+0.103
Neg logit contributions
ete-0.073
 Derek-0.067
icking-0.064
ylan-0.059
ennis-0.059
Token and
Token ID290
Feature activation+7.12e-02

Pos logit contributions
ete+0.772
ement+0.727
icking+0.713
ory+0.704
ancer+0.701
Neg logit contributions
 s-0.274
 snow-0.242
 living-0.239
 r-0.237
 t-0.234
Token read
Token ID1100
Feature activation-1.01e+00

Pos logit contributions
ete+0.006
icking+0.006
ory+0.006
oming+0.006
ola+0.005
Neg logit contributions
 wished-0.004
 s-0.003
 sharp-0.003
 living-0.003
 gir-0.003
Token your
Token ID534
Feature activation-1.27e-01

Pos logit contributions
 s+0.039
 wished+0.038
 living+0.036
 t+0.034
 sharp+0.033
Neg logit contributions
ete-0.107
icking-0.104
ement-0.099
ory-0.098
ancer-0.096
Token book
Token ID1492
Feature activation+4.71e-02

Pos logit contributions
 wished+0.006
 s+0.005
 living+0.005
 never+0.005
 knocking+0.005
Neg logit contributions
ete-0.012
icking-0.012
ement-0.011
ory-0.011
ancer-0.011
Token.
Token ID13
Feature activation+7.21e-02

Pos logit contributions
ete+0.004
icking+0.004
ement+0.004
ory+0.004
ails+0.004
Neg logit contributions
 wished-0.003
 s-0.003
 living-0.002
 sharp-0.002
 t-0.002
Token He
Token ID679
Feature activation+4.74e-01

Pos logit contributions
 s+0.005
 wished+0.005
 living+0.005
 sharp+0.005
 r+0.005
Neg logit contributions
icking-0.009
ete-0.009
ement-0.009
ory-0.009
oming-0.009
Token lands
Token ID8604
Feature activation-4.68e-02

Pos logit contributions
ete+0.043
icking+0.042
ement+0.040
ory+0.040
ays+0.039
Neg logit contributions
 wished-0.022
 s-0.022
 living-0.022
 t-0.021
 happy-0.020
Token in
Token ID287
Feature activation+3.09e-01

Pos logit contributions
 s+0.002
 wished+0.002
 living+0.002
 t+0.002
 sharp+0.002
Neg logit contributions
ete-0.005
icking-0.005
ory-0.004
ement-0.004
ancer-0.004
Token the
Token ID262
Feature activation+2.23e-01

Pos logit contributions
ete+0.031
icking+0.030
ement+0.029
ory+0.028
+0.028
Neg logit contributions
 wished-0.013
 s-0.012
 living-0.012
 sharp-0.011
 t-0.011
Token p
Token ID279
Feature activation-1.85e+00

Pos logit contributions
ete+0.021
icking+0.021
ement+0.020
ory+0.020
oming+0.019
Neg logit contributions
 wished-0.010
 s-0.010
 t-0.009
 living-0.009
 sharp-0.009
Tokenuddle
Token ID24500
Feature activation+6.85e+00

Pos logit contributions
 s+0.182
 living+0.173
 mess+0.172
 r+0.172
 snow+0.171
Neg logit contributions
ete-0.087
ancer-0.077
icking-0.076
avored-0.071
ement-0.068
Token next
Token ID1306
Feature activation-1.71e-01

Pos logit contributions
ete+0.721
icking+0.701
ement+0.667
ory+0.666
ancer+0.653
Neg logit contributions
 s-0.264
 wished-0.259
 living-0.244
 t-0.233
 sharp-0.228
Token to
Token ID284
Feature activation-5.41e-01

Pos logit contributions
 wished+0.008
 s+0.007
 living+0.007
 t+0.006
 sharp+0.006
Neg logit contributions
ete-0.017
icking-0.017
ement-0.016
ory-0.016
oming-0.015
Token Anna
Token ID11735
Feature activation-2.27e+00

Pos logit contributions
 s+0.019
 living+0.017
 wished+0.017
 t+0.017
 happy+0.016
Neg logit contributions
icking-0.060
ete-0.060
ement-0.058
oming-0.057
ory-0.056
Token.
Token ID13
Feature activation-4.96e-02

Pos logit contributions
 wished+0.107
 s+0.104
 sharp+0.094
 soft+0.090
 living+0.089
Neg logit contributions
ete-0.214
icking-0.214
ory-0.196
ement-0.195
ails-0.191
Token He
Token ID679
Feature activation+5.18e-01

Pos logit contributions
 s+0.003
 wished+0.003
 living+0.003
 t+0.002
 sharp+0.002
Neg logit contributions
ete-0.004
icking-0.004
ement-0.004
ory-0.004
ancer-0.004
Token wished
Token ID16555
Feature activation-3.53e+00

Pos logit contributions
ete+0.003
icking+0.003
ement+0.003
ory+0.002
oming+0.002
Neg logit contributions
 wished-0.001
 s-0.001
 living-0.001
 t-0.001
 sharp-0.001
Token for
Token ID329
Feature activation+2.29e+00

Pos logit contributions
 s+0.156
 living+0.154
 wished+0.150
 t+0.142
 happy+0.142
Neg logit contributions
ete-0.355
icking-0.341
ory-0.331
ement-0.328
-0.323
Token.
Token ID13
Feature activation+3.04e-02

Pos logit contributions
ete+0.258
icking+0.251
ement+0.243
oming+0.241
ory+0.240
Neg logit contributions
 s-0.077
 wished-0.075
 living-0.071
 t-0.070
 happy-0.065
Token They
Token ID1119
Feature activation-3.68e-01

Pos logit contributions
ete+0.003
icking+0.003
ory+0.002
ement+0.002
ancer+0.002
Neg logit contributions
 s-0.002
 wished-0.002
 sharp-0.002
 t-0.001
 mess-0.001
Token c
Token ID269
Feature activation-1.29e+00

Pos logit contributions
 s+0.016
 living+0.016
 wished+0.015
 sharp+0.015
 knocking+0.015
Neg logit contributions
ete-0.035
icking-0.033
ory-0.032
ement-0.031
ola-0.031
Tokenuddle
Token ID24500
Feature activation+6.84e+00

Pos logit contributions
 s+0.119
 living+0.112
 wished+0.110
 snow+0.109
 soft+0.108
Neg logit contributions
ete-0.071
icking-0.066
ola-0.059
 Derek-0.059
ancer-0.058
Token with
Token ID351
Feature activation+9.50e-02

Pos logit contributions
ete+0.703
icking+0.646
ory+0.633
ement+0.631
ancer+0.624
Neg logit contributions
 s-0.341
 living-0.311
 t-0.307
 snow-0.301
 r-0.295
Token their
Token ID511
Feature activation-1.30e+00

Pos logit contributions
ete+0.010
icking+0.009
ory+0.009
ement+0.009
ancer+0.009
Neg logit contributions
 wished-0.004
 s-0.004
 living-0.003
 bark-0.003
 knocking-0.003
Token new
Token ID649
Feature activation+1.26e+00

Pos logit contributions
 wished+0.046
 knocking+0.044
 s+0.043
 never+0.041
 bark+0.041
Neg logit contributions
ete-0.124
icking-0.122
ory-0.120
ement-0.118
ancer-0.114
Token toys
Token ID14958
Feature activation-5.72e-01

Pos logit contributions
ory+0.115
ete+0.114
icking+0.114
ement+0.111
ippers+0.108
Neg logit contributions
 wished-0.049
 never-0.045
 knocking-0.045
 s-0.043
 sharp-0.039
Token.
Token ID13
Feature activation+5.43e-02

Pos logit contributions
 s+0.021
 wished+0.021
 living+0.019
 t+0.018
 sharp+0.018
Neg logit contributions
ete-0.061
icking-0.059
ement-0.056
ory-0.056
ancer-0.055
Token't
Token ID470
Feature activation+5.22e-01

Pos logit contributions
 s+0.010
 wished+0.010
 living+0.009
 r+0.009
 t+0.009
Neg logit contributions
ete-0.012
icking-0.012
ory-0.011
ement-0.011
ancer-0.011
Token splash
Token ID22870
Feature activation-1.49e-01

Pos logit contributions
ete+0.050
icking+0.050
ory+0.046
ancer+0.045
ement+0.044
Neg logit contributions
 wished-0.024
 s-0.023
 living-0.023
 sharp-0.021
 happy-0.021
Token in
Token ID287
Feature activation+3.16e-01

Pos logit contributions
 s+0.007
 bad+0.007
 mess+0.007
 r+0.007
 snow+0.007
Neg logit contributions
ete-0.015
icking-0.014
ement-0.014
ancer-0.014
-0.014
Token the
Token ID262
Feature activation+4.09e-01

Pos logit contributions
ete+0.029
icking+0.029
ement+0.027
oming+0.027
ory+0.027
Neg logit contributions
 wished-0.015
 s-0.014
 bark-0.014
 living-0.013
 t-0.013
Token p
Token ID279
Feature activation-1.92e+00

Pos logit contributions
icking+0.031
ete+0.031
ory+0.029
oming+0.029
ement+0.029
Neg logit contributions
 wished-0.026
 s-0.023
 never-0.023
 sharp-0.023
 bark-0.023
Tokenuddle
Token ID24500
Feature activation+6.84e+00

Pos logit contributions
 s+0.155
 mess+0.149
 r+0.146
 snow+0.146
 living+0.145
Neg logit contributions
ete-0.129
icking-0.109
ancer-0.107
ement-0.105
ory-0.105
Token,"
Token ID553
Feature activation-1.44e+00

Pos logit contributions
ete+0.731
icking+0.698
ement+0.669
ancer+0.665
ory+0.664
Neg logit contributions
 s-0.274
 wished-0.253
 living-0.242
 t-0.239
 snow-0.235
Token she
Token ID673
Feature activation-1.62e-01

Pos logit contributions
 wished+0.073
 s+0.073
 living+0.071
 t+0.068
 sharp+0.067
Neg logit contributions
ete-0.127
icking-0.122
ement-0.118
ory-0.115
ola-0.113
Token says
Token ID1139
Feature activation-3.70e-02

Pos logit contributions
 wished+0.010
 living+0.010
 s+0.010
 never+0.009
 sharp+0.009
Neg logit contributions
ete-0.012
icking-0.012
ement-0.011
ory-0.011
ancer-0.011
Token.
Token ID13
Feature activation+5.48e-02

Pos logit contributions
 wished+0.002
 s+0.002
 living+0.001
 t+0.001
 snow+0.001
Neg logit contributions
icking-0.004
ete-0.004
ement-0.003
ory-0.003
ails-0.003
Token
Token ID198
Feature activation+1.40e-01

Pos logit contributions
ete+0.005
icking+0.005
ement+0.005
ory+0.005
+0.005
Neg logit contributions
 wished-0.003
 s-0.003
 living-0.002
 t-0.002
 sharp-0.002
Token put
Token ID1234
Feature activation-2.09e-01

Pos logit contributions
 wished+0.031
 s+0.031
 living+0.030
 t+0.028
 sharp+0.028
Neg logit contributions
ete-0.059
icking-0.057
ory-0.054
ement-0.053
oming-0.052
Token on
Token ID319
Feature activation+1.04e-01

Pos logit contributions
 s+0.006
 snow+0.006
 toy+0.006
 t+0.006
 soft+0.005
Neg logit contributions
ete-0.026
icking-0.024
ancer-0.024
ement-0.024
awn-0.024
Token his
Token ID465
Feature activation-1.73e-01

Pos logit contributions
ete+0.010
icking+0.009
ement+0.009
ory+0.009
ays+0.009
Neg logit contributions
 wished-0.005
 s-0.004
 living-0.004
 knocking-0.004
 mess-0.004
Token swimming
Token ID14899
Feature activation+1.76e+00

Pos logit contributions
 s+0.008
 wished+0.007
 living+0.007
 t+0.007
 sharp+0.007
Neg logit contributions
ete-0.017
icking-0.017
ement-0.016
ory-0.016
ancer-0.016
Token tr
Token ID491
Feature activation-1.99e+00

Pos logit contributions
icking+0.166
ete+0.165
ement+0.161
ancer+0.157
avored+0.156
Neg logit contributions
 s-0.091
 t-0.086
 toy-0.084
 living-0.084
 snow-0.084
Tokenunks
Token ID14125
Feature activation+6.84e+00

Pos logit contributions
 s+0.203
 living+0.197
 snow+0.195
 r+0.194
 t+0.189
Neg logit contributions
icking-0.066
ancer-0.063
ete-0.059
ory-0.052
ancing-0.048
Token and
Token ID290
Feature activation+1.75e-02

Pos logit contributions
ete+0.814
ement+0.776
icking+0.775
ory+0.751
ancer+0.748
Neg logit contributions
 living-0.189
 s-0.180
 t-0.166
 wished-0.160
 toy-0.152
Token find
Token ID1064
Feature activation+1.18e+00

Pos logit contributions
ete+0.002
icking+0.002
ory+0.001
 Derek+0.001
ancer+0.001
Neg logit contributions
 wished-0.001
 s-0.001
 living-0.001
 sharp-0.001
 t-0.001
Token some
Token ID617
Feature activation+2.69e-01

Pos logit contributions
+0.138
ete+0.135
icking+0.135
ement+0.131
ory+0.130
Neg logit contributions
 s-0.035
 living-0.034
 snow-0.033
 soft-0.033
 sharp-0.032
Token interesting
Token ID3499
Feature activation+1.06e+00

Pos logit contributions
ete+0.030
icking+0.030
+0.029
ement+0.028
ancer+0.028
Neg logit contributions
 s-0.010
 living-0.010
 soft-0.009
 snow-0.009
 sharp-0.009
Token things
Token ID1243
Feature activation-8.65e-01

Pos logit contributions
ete+0.095
icking+0.093
ement+0.086
+0.086
ancer+0.086
Neg logit contributions
 s-0.059
 wished-0.057
 living-0.056
 t-0.055
 snow-0.054
Token go
Token ID467
Feature activation+1.30e+00

Pos logit contributions
 living+0.018
 s+0.017
 knocking+0.017
 wished+0.017
 sharp+0.017
Neg logit contributions
ete-0.024
icking-0.024
ory-0.022
-0.021
ement-0.020
Token to
Token ID284
Feature activation-4.87e-01

Pos logit contributions
icking+0.113
ete+0.110
ory+0.104
ails+0.103
ancer+0.101
Neg logit contributions
 wished-0.071
 s-0.066
 sharp-0.065
 cob-0.060
 funny-0.060
Token bed
Token ID3996
Feature activation-7.05e-01

Pos logit contributions
 wished+0.026
 gir+0.025
 sharp+0.023
 s+0.021
 songs+0.021
Neg logit contributions
ete-0.040
icking-0.040
-0.037
ory-0.037
ola-0.036
Token and
Token ID290
Feature activation+7.52e-02

Pos logit contributions
 wished+0.038
 sharp+0.034
 snow+0.034
 gir+0.034
 cob+0.033
Neg logit contributions
icking-0.063
ement-0.057
ete-0.056
ails-0.055
iring-0.055
Token c
Token ID269
Feature activation-1.37e+00

Pos logit contributions
icking+0.006
ete+0.006
ory+0.005
ancer+0.005
+0.005
Neg logit contributions
 gir-0.004
 wished-0.003
 sharp-0.003
 s-0.003
 living-0.003
Tokenuddle
Token ID24500
Feature activation+6.84e+00

Pos logit contributions
 s+0.159
 living+0.153
 wished+0.153
 snow+0.150
 r+0.149
Neg logit contributions
ete-0.042
icking-0.034
ory-0.030
ement-0.029
ancer-0.028
Token with
Token ID351
Feature activation+1.02e-01

Pos logit contributions
ete+0.675
icking+0.648
ory+0.618
ement+0.617
ancer+0.603
Neg logit contributions
 s-0.319
 wished-0.308
 living-0.298
 t-0.287
 sharp-0.279
Token their
Token ID511
Feature activation-1.34e+00

Pos logit contributions
ete+0.010
icking+0.010
ement+0.009
ory+0.009
ays+0.009
Neg logit contributions
 wished-0.004
 s-0.004
 living-0.004
 bark-0.004
 sharp-0.004
Token doll
Token ID3654
Feature activation-9.48e-01

Pos logit contributions
 wished+0.067
 bark+0.061
 never+0.060
 s+0.058
 knocking+0.057
Neg logit contributions
icking-0.115
ete-0.108
ement-0.106
oming-0.103
-0.101
Token and
Token ID290
Feature activation+7.05e-02

Pos logit contributions
 wished+0.047
 s+0.043
 sharp+0.042
 living+0.040
 knocking+0.039
Neg logit contributions
icking-0.089
ete-0.079
ory-0.078
ement-0.076
ails-0.076
Token car
Token ID1097
Feature activation-5.43e-01

Pos logit contributions
icking+0.006
ete+0.005
ory+0.005
 Derek+0.005
ippers+0.005
Neg logit contributions
 wished-0.003
 gir-0.003
 knocking-0.003
 sharp-0.003
 s-0.003
Token He
Token ID679
Feature activation+6.61e-01

Pos logit contributions
ete+0.013
icking+0.013
ement+0.012
ory+0.012
ancer+0.012
Neg logit contributions
 s-0.009
 wished-0.009
 living-0.009
 t-0.009
 sharp-0.008
Token runs
Token ID4539
Feature activation-1.50e+00

Pos logit contributions
ete+0.061
icking+0.060
ement+0.057
ory+0.056
ays+0.056
Neg logit contributions
 wished-0.030
 s-0.029
 living-0.028
 happy-0.027
 sharp-0.027
Token to
Token ID284
Feature activation-4.43e-01

Pos logit contributions
 wished+0.081
 s+0.078
 living+0.072
 cough+0.070
 sharp+0.070
Neg logit contributions
ete-0.125
icking-0.120
anger-0.117
ory-0.117
ola-0.113
Token the
Token ID262
Feature activation+4.67e-01

Pos logit contributions
 s+0.016
 t+0.015
 wished+0.014
 living+0.014
 bark+0.014
Neg logit contributions
ete-0.049
icking-0.049
ement-0.048
ory-0.046
ancer-0.046
Token p
Token ID279
Feature activation-1.84e+00

Pos logit contributions
icking+0.038
ete+0.037
anger+0.035
ancer+0.035
ory+0.034
Neg logit contributions
 wished-0.025
 s-0.022
 never-0.021
 dreamed-0.021
 happy-0.021
Tokenuddle
Token ID24500
Feature activation+6.84e+00

Pos logit contributions
 s+0.190
 living+0.183
 snow+0.181
 r+0.181
 wished+0.179
Neg logit contributions
ete-0.082
icking-0.065
ement-0.062
ancer-0.061
avored-0.059
Token.
Token ID13
Feature activation+1.58e-01

Pos logit contributions
ete+0.707
icking+0.694
ement+0.661
ory+0.652
ancer+0.638
Neg logit contributions
 wished-0.267
 s-0.263
 living-0.243
 t-0.232
 sharp-0.231
Token He
Token ID679
Feature activation+6.63e-01

Pos logit contributions
ete+0.014
icking+0.013
ement+0.012
ory+0.012
ancer+0.012
Neg logit contributions
 s-0.009
 wished-0.009
 living-0.009
 t-0.008
 sharp-0.008
Token jumps
Token ID18045
Feature activation-3.35e-01

Pos logit contributions
ete+0.057
icking+0.056
ays+0.054
oming+0.052
ement+0.052
Neg logit contributions
 wished-0.032
 s-0.029
 happy-0.029
 songs-0.029
 living-0.028
Token in
Token ID287
Feature activation+4.09e-01

Pos logit contributions
 wished+0.020
 s+0.019
 living+0.018
 t+0.017
 bark+0.017
Neg logit contributions
ete-0.028
icking-0.027
oming-0.025
ement-0.025
ory-0.025
Token it
Token ID340
Feature activation-2.95e-01

Pos logit contributions
ete+0.040
icking+0.038
ement+0.037
ory+0.036
ola+0.036
Neg logit contributions
 wished-0.018
 s-0.017
 living-0.016
 sharp-0.015
 gir-0.015
Token you
Token ID345
Feature activation+6.96e-01

Pos logit contributions
ete+0.012
icking+0.012
ory+0.011
ement+0.011
ancer+0.011
Neg logit contributions
 wished-0.011
 s-0.010
 bark-0.010
 living-0.010
 knocking-0.009
Token.
Token ID13
Feature activation+8.94e-02

Pos logit contributions
ete+0.069
icking+0.068
ory+0.064
ement+0.063
ancer+0.063
Neg logit contributions
 wished-0.030
 s-0.029
 living-0.026
 sharp-0.026
 bark-0.025
Token We
Token ID775
Feature activation+7.06e-01

Pos logit contributions
ete+0.006
icking+0.006
ement+0.006
ory+0.006
oming+0.006
Neg logit contributions
 s-0.006
 wished-0.006
 living-0.006
 t-0.006
 sharp-0.006
Token can
Token ID460
Feature activation+4.72e-01

Pos logit contributions
ete+0.055
icking+0.055
ory+0.051
ancer+0.051
ola+0.049
Neg logit contributions
 s-0.042
 wished-0.040
 living-0.040
 sharp-0.039
 soft-0.038
Token c
Token ID269
Feature activation-1.31e+00

Pos logit contributions
ete+0.044
icking+0.044
ory+0.040
ancer+0.040
+0.040
Neg logit contributions
 wished-0.022
 s-0.021
 living-0.020
 sharp-0.020
 toy-0.019
Tokenuddle
Token ID24500
Feature activation+6.83e+00

Pos logit contributions
 s+0.133
 living+0.127
 snow+0.127
 soft+0.125
 mess+0.124
Neg logit contributions
ete-0.057
icking-0.048
 Derek-0.047
ola-0.044
ory-0.043
Token and
Token ID290
Feature activation+1.16e-01

Pos logit contributions
ete+0.686
icking+0.652
ement+0.624
ory+0.621
ancer+0.604
Neg logit contributions
 s-0.333
 living-0.309
 t-0.304
 snow-0.302
 wished-0.297
Token sn
Token ID3013
Feature activation+5.33e-02

Pos logit contributions
ete+0.009
icking+0.009
ancer+0.009
ory+0.009
ola+0.008
Neg logit contributions
 wished-0.006
 s-0.005
 sharp-0.005
 living-0.005
 gir-0.005
Tokenuggle
Token ID33498
Feature activation+1.82e+00

Pos logit contributions
ete+0.003
icking+0.003
ement+0.003
 Derek+0.002
ory+0.002
Neg logit contributions
 s-0.005
 living-0.005
 snow-0.005
 soft-0.005
 r-0.005
Token and
Token ID290
Feature activation+1.10e-01

Pos logit contributions
ete+0.154
icking+0.152
ory+0.143
ement+0.140
ancer+0.138
Neg logit contributions
 wished-0.103
 s-0.103
 living-0.096
 sharp-0.094
 t-0.092
Token keep
Token ID1394
Feature activation+1.29e+00

Pos logit contributions
icking+0.009
ete+0.009
ory+0.009
ancer+0.009
oming+0.008
Neg logit contributions
 wished-0.005
 s-0.005
 gir-0.005
 living-0.005
 toy-0.005
Token the
Token ID262
Feature activation+4.31e-01

Pos logit contributions
ete+0.016
icking+0.016
ement+0.015
ory+0.015
+0.015
Neg logit contributions
 wished-0.006
 s-0.006
 living-0.006
 t-0.005
 sharp-0.005
Token comfortable
Token ID6792
Feature activation+8.62e-01

Pos logit contributions
ement+0.036
icking+0.035
anger+0.035
ete+0.033
oming+0.033
Neg logit contributions
 wished-0.025
 wanted-0.019
 never-0.018
 dreamed-0.018
 listened-0.017
Token couch
Token ID18507
Feature activation-1.92e+00

Pos logit contributions
ete+0.075
anger+0.075
icking+0.075
ement+0.073
ails+0.071
Neg logit contributions
 wished-0.044
 sharp-0.037
 s-0.036
 happy-0.035
 never-0.035
Token and
Token ID290
Feature activation+1.43e-01

Pos logit contributions
 wished+0.080
 s+0.078
 living+0.071
 sharp+0.070
 t+0.069
Neg logit contributions
ete-0.188
icking-0.188
ement-0.179
ory-0.175
-0.171
Token c
Token ID269
Feature activation-1.28e+00

Pos logit contributions
icking+0.012
ete+0.012
ory+0.011
ement+0.011
ancer+0.010
Neg logit contributions
 wished-0.006
 gir-0.006
 s-0.006
 sharp-0.006
 living-0.006
Tokenuddle
Token ID24500
Feature activation+6.83e+00

Pos logit contributions
 s+0.126
 living+0.121
 wished+0.120
 r+0.117
 snow+0.117
Neg logit contributions
ete-0.062
icking-0.056
ory-0.050
ement-0.048
ancer-0.048
Token with
Token ID351
Feature activation+8.29e-02

Pos logit contributions
ete+0.752
icking+0.723
ory+0.694
ement+0.690
ancer+0.679
Neg logit contributions
 s-0.246
 wished-0.230
 living-0.223
 t-0.212
 sharp-0.204
Token their
Token ID511
Feature activation-1.29e+00

Pos logit contributions
ete+0.009
icking+0.009
ement+0.008
ory+0.008
ancer+0.008
Neg logit contributions
 s-0.003
 wished-0.003
 living-0.003
 t-0.003
 sharp-0.003
Token blankets
Token ID34794
Feature activation-5.70e-01

Pos logit contributions
 wished+0.060
 s+0.052
 never+0.050
 living+0.049
 knocking+0.048
Neg logit contributions
ete-0.119
icking-0.119
ement-0.115
ory-0.111
-0.111
Token.
Token ID13
Feature activation+9.10e-02

Pos logit contributions
 wished+0.022
 s+0.022
 living+0.020
 sharp+0.019
 t+0.019
Neg logit contributions
ete-0.058
icking-0.058
ement-0.055
ory-0.054
oming-0.053
Token They
Token ID1119
Feature activation-3.29e-01

Pos logit contributions
ete+0.008
icking+0.008
ory+0.007
ement+0.007
+0.007
Neg logit contributions
 wished-0.005
 s-0.005
 t-0.005
 sharp-0.005
 living-0.004
Token They
Token ID1119
Feature activation-4.41e-01

Pos logit contributions
 s+0.006
 wished+0.006
 living+0.006
 t+0.006
 sharp+0.006
Neg logit contributions
ete-0.012
icking-0.011
ement-0.011
ory-0.011
ancer-0.010
Token put
Token ID1234
Feature activation-2.12e-01

Pos logit contributions
 s+0.016
 wished+0.016
 living+0.014
 t+0.014
 sharp+0.013
Neg logit contributions
ete-0.048
icking-0.046
ement-0.045
ory-0.044
ancer-0.043
Token on
Token ID319
Feature activation+5.40e-02

Pos logit contributions
 s+0.008
 t+0.007
 snow+0.007
 soft+0.007
 living+0.007
Neg logit contributions
ete-0.024
icking-0.023
ement-0.022
ory-0.022
ancer-0.022
Token their
Token ID511
Feature activation-1.29e+00

Pos logit contributions
ete+0.005
icking+0.005
ement+0.005
ory+0.005
ays+0.005
Neg logit contributions
 wished-0.003
 living-0.002
 knocking-0.002
 s-0.002
 mess-0.002
Token tr
Token ID491
Feature activation-1.97e+00

Pos logit contributions
 s+0.062
 wished+0.060
 living+0.058
 t+0.057
 sharp+0.055
Neg logit contributions
ete-0.125
icking-0.122
ement-0.116
ory-0.114
ancer-0.113
Tokenunks
Token ID14125
Feature activation+6.83e+00

Pos logit contributions
 s+0.201
 snow+0.193
 living+0.191
 t+0.187
 r+0.187
Neg logit contributions
ancer-0.067
icking-0.064
ete-0.062
ory-0.055
umber-0.047
Token and
Token ID290
Feature activation+3.27e-02

Pos logit contributions
ete+0.831
ement+0.801
icking+0.788
ancer+0.773
ory+0.770
Neg logit contributions
 living-0.161
 s-0.161
 t-0.147
 wished-0.133
 toy-0.127
Token jumped
Token ID11687
Feature activation+9.02e-02

Pos logit contributions
ete+0.003
icking+0.003
ory+0.003
ement+0.003
 Derek+0.003
Neg logit contributions
 wished-0.001
 s-0.001
 living-0.001
 sharp-0.001
 mess-0.001
Token right
Token ID826
Feature activation+5.87e-01

Pos logit contributions
ete+0.009
icking+0.008
ement+0.008
ancer+0.008
ory+0.008
Neg logit contributions
 s-0.005
 t-0.004
 living-0.004
 sharp-0.004
 r-0.004
Token in
Token ID287
Feature activation+2.27e-01

Pos logit contributions
ete+0.056
icking+0.053
ancer+0.051
+0.050
avored+0.050
Neg logit contributions
 s-0.030
 living-0.029
 wished-0.029
 sharp-0.028
 t-0.028
Token.
Token ID13
Feature activation-1.08e-01

Pos logit contributions
ete+0.023
icking+0.022
ement+0.022
ory+0.021
oming+0.021
Neg logit contributions
 wished-0.009
 s-0.009
 living-0.008
 sharp-0.008
 t-0.008
Token Now
Token ID2735
Feature activation+3.10e-01

Pos logit contributions
ete+0.001
icking+0.001
ement+0.001
ory+0.001
oming+0.001
Neg logit contributions
 wished-0.001
 living-0.001
 s-0.001
 t-0.001
 soft-0.001
Token,
Token ID11
Feature activation+5.40e-01

Pos logit contributions
ete+0.030
icking+0.029
ement+0.028
ory+0.028
ancer+0.027
Neg logit contributions
 s-0.015
 wished-0.014
 t-0.014
 living-0.014
 happy-0.013
Token let
Token ID1309
Feature activation-9.76e-01

Pos logit contributions
ete+0.054
icking+0.053
ory+0.050
ement+0.050
ancer+0.049
Neg logit contributions
 s-0.023
 wished-0.023
 living-0.022
 t-0.020
 sharp-0.020
Token's
Token ID338
Feature activation-6.69e-01

Pos logit contributions
 wished+0.031
 s+0.030
 living+0.027
 sharp+0.026
 t+0.025
Neg logit contributions
ete-0.110
icking-0.108
ory-0.101
ement-0.101
ancer-0.098
Token c
Token ID269
Feature activation-1.29e+00

Pos logit contributions
 wished+0.030
 s+0.029
 living+0.028
 sharp+0.027
 t+0.026
Neg logit contributions
ete-0.066
icking-0.064
ory-0.060
ement-0.059
-0.058
Tokenuddle
Token ID24500
Feature activation+6.82e+00

Pos logit contributions
 s+0.113
 living+0.106
 mess+0.106
 snow+0.104
 soft+0.103
Neg logit contributions
ete-0.071
 Derek-0.066
icking-0.063
ennis-0.059
ylan-0.058
Token and
Token ID290
Feature activation+3.72e-02

Pos logit contributions
ete+0.739
ement+0.687
icking+0.683
ory+0.678
ancer+0.660
Neg logit contributions
 s-0.296
 living-0.263
 snow-0.262
 r-0.255
 t-0.255
Token read
Token ID1100
Feature activation-1.02e+00

Pos logit contributions
icking+0.003
ete+0.003
ory+0.003
oming+0.003
+0.003
Neg logit contributions
 wished-0.002
 s-0.002
 sharp-0.002
 gir-0.002
 living-0.002
Token a
Token ID257
Feature activation+8.96e-02

Pos logit contributions
 wished+0.041
 s+0.040
 living+0.037
 t+0.035
 sharp+0.035
Neg logit contributions
ete-0.105
icking-0.102
ement-0.098
ory-0.097
ancer-0.095
Token story
Token ID1621
Feature activation+7.28e-01

Pos logit contributions
ete+0.008
ory+0.007
oming+0.007
ement+0.007
icking+0.007
Neg logit contributions
 wished-0.005
 never-0.005
 knocking-0.004
 s-0.004
 chased-0.004
Token.
Token ID13
Feature activation+8.28e-03

Pos logit contributions
icking+0.060
ete+0.060
ement+0.057
anger+0.056
ola+0.056
Neg logit contributions
 wished-0.042
 s-0.038
 sharp-0.036
 bark-0.035
 living-0.035
Tokenph
Token ID746
Feature activation-2.13e+00

Pos logit contributions
ete+0.054
icking+0.051
+0.050
ory+0.049
ement+0.049
Neg logit contributions
 wished-0.049
 s-0.048
 living-0.047
 sharp-0.046
 t-0.046
Tokenants
Token ID1187
Feature activation+7.62e-01

Pos logit contributions
 s+0.191
 wished+0.190
 living+0.185
 t+0.182
 sharp+0.180
Neg logit contributions
ete-0.117
icking-0.112
ement-0.100
ory-0.099
oming-0.094
Token have
Token ID423
Feature activation+4.84e-02

Pos logit contributions
ete+0.073
icking+0.070
ement+0.068
ory+0.068
ancer+0.067
Neg logit contributions
 s-0.037
 wished-0.037
 living-0.035
 t-0.034
 r-0.033
Token strong
Token ID1913
Feature activation+2.65e-01

Pos logit contributions
ete+0.005
icking+0.005
ory+0.004
ails+0.004
ola+0.004
Neg logit contributions
 s-0.002
 wished-0.002
 living-0.002
 t-0.002
 mess-0.002
Token tr
Token ID491
Feature activation-1.95e+00

Pos logit contributions
ete+0.025
icking+0.024
ement+0.023
ory+0.023
oming+0.022
Neg logit contributions
 wished-0.013
 s-0.012
 living-0.012
 t-0.011
 sharp-0.011
Tokenunks
Token ID14125
Feature activation+6.81e+00

Pos logit contributions
 s+0.183
 wished+0.176
 living+0.175
 never+0.174
 sharp+0.174
Neg logit contributions
icking-0.080
ete-0.078
ancer-0.075
ory-0.071
oming-0.060
Token that
Token ID326
Feature activation-1.01e+00

Pos logit contributions
ete+0.736
icking+0.707
ement+0.686
ory+0.682
ancer+0.670
Neg logit contributions
 s-0.252
 wished-0.246
 living-0.238
 t-0.221
 sharp-0.215
Token they
Token ID484
Feature activation-3.75e-01

Pos logit contributions
 s+0.047
 wished+0.047
 living+0.044
 t+0.043
 sharp+0.042
Neg logit contributions
ete-0.100
icking-0.097
ement-0.092
ory-0.092
oming-0.089
Token can
Token ID460
Feature activation+2.35e-01

Pos logit contributions
 s+0.018
 wished+0.017
 living+0.017
 t+0.016
 sharp+0.016
Neg logit contributions
ete-0.036
icking-0.035
ory-0.033
ement-0.033
ancer-0.032
Token use
Token ID779
Feature activation+1.31e+00

Pos logit contributions
ete+0.024
icking+0.024
ement+0.023
ory+0.023
ancer+0.022
Neg logit contributions
 s-0.010
 t-0.009
 wished-0.009
 bark-0.009
 sharp-0.009
Token to
Token ID284
Feature activation-7.82e-01

Pos logit contributions
ete+0.132
icking+0.128
ement+0.122
ory+0.121
ancer+0.119
Neg logit contributions
 s-0.058
 wished-0.056
 living-0.054
 t-0.052
 sharp-0.051
Token.
Token ID13
Feature activation-1.76e-01

Pos logit contributions
ete+0.014
ement+0.013
umber+0.013
+0.013
icking+0.013
Neg logit contributions
 s-0.004
 living-0.004
 many-0.003
 never-0.003
 toy-0.003
Token They
Token ID1119
Feature activation-4.73e-01

Pos logit contributions
 s+0.007
 wished+0.007
 living+0.007
 snow+0.006
 bark+0.006
Neg logit contributions
ete-0.018
icking-0.017
ement-0.017
ory-0.017
oming-0.016
Token grabbed
Token ID13176
Feature activation-3.35e-01

Pos logit contributions
 s+0.020
 living+0.019
 wished+0.019
 sharp+0.018
 snow+0.017
Neg logit contributions
ete-0.048
icking-0.047
ory-0.044
ement-0.043
ancer-0.043
Token their
Token ID511
Feature activation-1.27e+00

Pos logit contributions
 s+0.011
 wished+0.011
 living+0.010
 t+0.009
 sharp+0.009
Neg logit contributions
ete-0.037
icking-0.036
ement-0.035
ory-0.035
ancer-0.034
Token tr
Token ID491
Feature activation-1.99e+00

Pos logit contributions
 s+0.060
 wished+0.059
 living+0.057
 t+0.056
 sharp+0.054
Neg logit contributions
ete-0.124
icking-0.120
ement-0.115
ory-0.114
ancer-0.112
Tokenunks
Token ID14125
Feature activation+6.81e+00

Pos logit contributions
 s+0.164
 living+0.157
 t+0.157
 snow+0.153
 toy+0.152
Neg logit contributions
ete-0.104
ancer-0.101
icking-0.100
ory-0.096
olphins-0.087
Token and
Token ID290
Feature activation+2.35e-02

Pos logit contributions
ete+0.801
ement+0.758
icking+0.753
ory+0.744
ancer+0.736
Neg logit contributions
 s-0.187
 living-0.178
 t-0.170
 wished-0.165
 toy-0.152
Token set
Token ID900
Feature activation+1.76e+00

Pos logit contributions
ete+0.002
icking+0.002
ory+0.002
ement+0.002
ancer+0.002
Neg logit contributions
 s-0.001
 wished-0.001
 living-0.001
 sharp-0.001
 t-0.001
Token off
Token ID572
Feature activation-1.11e+00

Pos logit contributions
ete+0.139
icking+0.134
ement+0.124
ory+0.123
oming+0.121
Neg logit contributions
 wished-0.115
 s-0.112
 living-0.109
 t-0.105
 sharp-0.104
Token.
Token ID13
Feature activation-1.53e-01

Pos logit contributions
 wished+0.060
 s+0.056
 living+0.053
 sharp+0.052
 t+0.052
Neg logit contributions
ete-0.099
icking-0.098
ement-0.091
oming-0.090
ory-0.089
Token
Token ID198
Feature activation-2.21e-02

Pos logit contributions
 s+0.006
 wished+0.006
 living+0.005
 snow+0.005
 bark+0.005
Neg logit contributions
ete-0.016
icking-0.016
ory-0.015
ement-0.015
ancer-0.015
Token balls
Token ID11333
Feature activation-3.26e-01

Pos logit contributions
 wished+0.057
 s+0.056
 living+0.052
 t+0.050
 knocking+0.049
Neg logit contributions
ete-0.099
icking-0.092
ory-0.092
ement-0.088
ola-0.086
Token and
Token ID290
Feature activation+1.38e-01

Pos logit contributions
 wished+0.014
 s+0.014
 living+0.013
 t+0.012
 sharp+0.012
Neg logit contributions
ete-0.032
icking-0.032
ement-0.030
ory-0.030
oming-0.029
Token mice
Token ID10693
Feature activation+5.06e-02

Pos logit contributions
ete+0.011
glers+0.011
 Derek+0.011
icking+0.011
+0.011
Neg logit contributions
 wished-0.006
 s-0.005
 living-0.005
 accident-0.005
 fake-0.005
Token and
Token ID290
Feature activation+1.43e-01

Pos logit contributions
icking+0.004
ete+0.004
oming+0.004
ails+0.003
ory+0.003
Neg logit contributions
 s-0.003
 wished-0.003
 snow-0.003
 soft-0.002
 mess-0.002
Token c
Token ID269
Feature activation-1.29e+00

Pos logit contributions
ete+0.012
icking+0.012
 Derek+0.011
ory+0.011
ancer+0.011
Neg logit contributions
 wished-0.005
 s-0.005
 accident-0.005
 living-0.005
 mess-0.005
Tokenuddle
Token ID24500
Feature activation+6.80e+00

Pos logit contributions
 s+0.121
 wished+0.120
 living+0.117
 t+0.115
 sharp+0.114
Neg logit contributions
ete-0.065
icking-0.061
ement-0.055
ory-0.055
ancer-0.052
Token with
Token ID351
Feature activation+8.86e-02

Pos logit contributions
ete+0.690
icking+0.674
ement+0.640
ory+0.636
ancer+0.622
Neg logit contributions
 wished-0.284
 s-0.282
 living-0.262
 t-0.252
 sharp-0.247
Token the
Token ID262
Feature activation+3.84e-01

Pos logit contributions
ete+0.008
icking+0.008
ory+0.007
ement+0.007
ola+0.007
Neg logit contributions
 wished-0.005
 s-0.005
 living-0.004
 mess-0.004
 t-0.004
Token boys
Token ID6510
Feature activation+3.18e-01

Pos logit contributions
icking+0.037
ete+0.037
ancer+0.035
avored+0.035
ory+0.034
Neg logit contributions
 wished-0.015
 s-0.014
 never-0.012
 living-0.012
 dreamed-0.012
Token.
Token ID13
Feature activation+5.86e-02

Pos logit contributions
icking+0.027
ete+0.025
oming+0.024
ays+0.024
ails+0.024
Neg logit contributions
 wished-0.016
 s-0.016
 mess-0.015
 sharp-0.015
 soft-0.014
Token One
Token ID1881
Feature activation-3.60e-01

Pos logit contributions
icking+0.006
ete+0.006
ement+0.006
ory+0.005
oming+0.005
Neg logit contributions
 s-0.002
 living-0.002
 wished-0.002
 snow-0.002
 t-0.002
Token the
Token ID262
Feature activation-1.48e-01

Pos logit contributions
 s+0.010
 wished+0.010
 living+0.009
 t+0.009
 sharp+0.008
Neg logit contributions
ete-0.028
icking-0.028
ement-0.026
ory-0.026
ancer-0.026
Token night
Token ID1755
Feature activation-1.26e+00

Pos logit contributions
 wished+0.006
 s+0.006
 living+0.005
 t+0.005
 sharp+0.005
Neg logit contributions
ete-0.015
icking-0.014
ement-0.014
ory-0.013
ancer-0.013
Token.
Token ID13
Feature activation-6.34e-01

Pos logit contributions
 s+0.058
 wished+0.058
 living+0.054
 t+0.053
 sharp+0.052
Neg logit contributions
ete-0.122
icking-0.119
ement-0.113
ory-0.112
ancer-0.110
Token<|endoftext|>
Token ID50256
Feature activation+8.90e-01

Pos logit contributions
 s+0.025
 wished+0.025
 living+0.024
 happy+0.022
 t+0.022
Neg logit contributions
icking-0.066
ete-0.065
ement-0.062
ory-0.061
oming-0.060
TokenJo
Token ID9908
Feature activation-5.23e+00

Pos logit contributions
icking+0.098
ete+0.096
ory+0.091
ement+0.089
oming+0.088
Neg logit contributions
 living-0.034
 s-0.032
 wished-0.029
 t-0.028
 r-0.028
TokenJo
Token ID9908
Feature activation-5.28e+00

Pos logit contributions
 s+0.292
 wished+0.288
 living+0.278
 sharp+0.270
 t+0.270
Neg logit contributions
ete-0.457
icking-0.442
ement-0.415
ory-0.414
ancer-0.400
Token was
Token ID373
Feature activation-4.42e-01

Pos logit contributions
 s+0.273
 wished+0.269
 living+0.258
 t+0.249
 sharp+0.249
Neg logit contributions
ete-0.482
icking-0.468
ement-0.442
ory-0.440
ancer-0.427
Token walking
Token ID6155
Feature activation-1.12e+00

Pos logit contributions
 wished+0.016
 s+0.015
 bark+0.014
 sharp+0.014
 gir+0.014
Neg logit contributions
icking-0.044
ete-0.043
oming-0.041
ement-0.041
ancer-0.041
Token outside
Token ID2354
Feature activation-8.82e-01

Pos logit contributions
 s+0.062
 wished+0.061
 living+0.058
 t+0.057
 sharp+0.056
Neg logit contributions
ete-0.099
icking-0.096
ement-0.090
ory-0.090
ancer-0.087
Token on
Token ID319
Feature activation-1.78e-01

Pos logit contributions
 wished+0.047
 s+0.043
 sharp+0.042
 bark+0.039
 gir+0.039
Neg logit contributions
icking-0.077
ete-0.075
ement-0.070
oming-0.070
ancer-0.070
Token a
Token ID257
Feature activation-4.44e-01

Pos logit contributions
 wished+0.006
 s+0.006
 living+0.006
 t+0.005
 sharp+0.005
Neg logit contributions
ete-0.019
icking-0.018
ement-0.018
ory-0.017
oming-0.017
Token?
Token ID30
Feature activation+2.35e+00

Pos logit contributions
ement+0.092
ete+0.090
oming+0.090
icking+0.090
+0.089
Neg logit contributions
 living-0.036
 sharp-0.035
 toy-0.035
 t-0.035
 s-0.034
Tokenâ
Token ID22940
Feature activation+7.33e-01

Pos logit contributions
icking+0.238
ete+0.222
ement+0.218
glers+0.210
ory+0.210
Neg logit contributions
 wished-0.104
 s-0.103
 living-0.101
 happy-0.092
 t-0.092
Token
Token ID26391
Feature activation+4.20e-01

Pos logit contributions
ete+0.057
icking+0.054
ory+0.051
oming+0.051
ancer+0.050
Neg logit contributions
 s-0.047
 wished-0.047
 bark-0.045
 living-0.045
 r-0.044
Token
Token ID198
Feature activation-5.57e-01

Pos logit contributions
ete+0.038
icking+0.037
ement+0.035
ory+0.035
ancer+0.034
Neg logit contributions
 s-0.022
 wished-0.022
 living-0.021
 happy-0.020
 t-0.020
Token
Token ID198
Feature activation-5.51e-01

Pos logit contributions
 s+0.019
 wished+0.018
 living+0.017
 happy+0.016
 soft+0.015
Neg logit contributions
icking-0.063
ete-0.059
ory-0.058
ement-0.057
ays-0.056
TokenJo
Token ID9908
Feature activation-5.27e+00

Pos logit contributions
 s+0.026
 wished+0.026
 living+0.025
 happy+0.023
 sharp+0.023
Neg logit contributions
icking-0.053
ete-0.052
ement-0.049
ory-0.048
ancer-0.047
Tokenjo
Token ID7639
Feature activation-1.82e+00

Pos logit contributions
 s+0.271
 wished+0.270
 living+0.256
 t+0.248
 sharp+0.244
Neg logit contributions
ete-0.483
icking-0.469
ement-0.443
ory-0.441
ancer-0.430
Token dug
Token ID18735
Feature activation+1.79e+00

Pos logit contributions
 s+0.075
 wished+0.072
 living+0.071
 t+0.067
 sharp+0.066
Neg logit contributions
ete-0.185
icking-0.179
ement-0.170
ory-0.170
ola-0.163
Token into
Token ID656
Feature activation+1.71e-01

Pos logit contributions
ete+0.198
icking+0.195
ancer+0.186
ory+0.184
ement+0.184
Neg logit contributions
 s-0.062
 wished-0.058
 sharp-0.055
 living-0.055
 t-0.054
Token his
Token ID465
Feature activation-5.32e-01

Pos logit contributions
ete+0.018
icking+0.017
ement+0.016
ory+0.016
ola+0.016
Neg logit contributions
 wished-0.006
 s-0.006
 living-0.006
 bark-0.005
 never-0.005
Token pocket
Token ID10000
Feature activation+4.17e+00

Pos logit contributions
 s+0.032
 wished+0.032
 living+0.031
 t+0.030
 sharp+0.029
Neg logit contributions
ete-0.045
icking-0.043
ement-0.041
ory-0.040
ancer-0.040
Token to
Token ID284
Feature activation-8.71e-01

Pos logit contributions
ete+0.283
icking+0.274
ement+0.272
ory+0.262
ancer+0.262
Neg logit contributions
 s-0.115
 t-0.108
 living-0.105
 happy-0.099
 wished-0.099
Token the
Token ID262
Feature activation-2.90e-02

Pos logit contributions
 wished+0.034
 s+0.034
 living+0.032
 t+0.030
 sharp+0.030
Neg logit contributions
ete-0.090
icking-0.087
ory-0.083
ement-0.082
ancer-0.081
Token village
Token ID7404
Feature activation-3.53e-01

Pos logit contributions
 wished+0.002
 s+0.002
 living+0.002
 t+0.001
 sharp+0.001
Neg logit contributions
ete-0.003
icking-0.002
ory-0.002
ement-0.002
ancer-0.002
Token.
Token ID13
Feature activation-5.63e-01

Pos logit contributions
 s+0.016
 wished+0.015
 sharp+0.014
 t+0.014
 living+0.014
Neg logit contributions
ete-0.035
icking-0.034
ement-0.032
ory-0.032
oming-0.031
Token Jo
Token ID5302
Feature activation-1.71e+00

Pos logit contributions
 s+0.029
 wished+0.028
 living+0.027
 t+0.026
 sharp+0.025
Neg logit contributions
ete-0.052
icking-0.051
ory-0.048
ement-0.048
oming-0.047
TokenJo
Token ID9908
Feature activation-5.26e+00

Pos logit contributions
 s+0.081
 wished+0.078
 living+0.078
 sharp+0.076
 mess+0.075
Neg logit contributions
ete-0.162
icking-0.155
ement-0.150
ory-0.145
ola-0.144
Token smiled
Token ID13541
Feature activation+2.28e-03

Pos logit contributions
 s+0.243
 wished+0.238
 living+0.234
 sharp+0.224
 t+0.222
Neg logit contributions
ete-0.507
icking-0.493
ement-0.470
ory-0.460
ola-0.445
Token with
Token ID351
Feature activation-3.44e-01

Pos logit contributions
ete+0.000
icking+0.000
ement+0.000
ory+0.000
ancer+0.000
Neg logit contributions
 s-0.000
 wished-0.000
 living-0.000
 t-0.000
 sharp-0.000
Token anticipation
Token ID24054
Feature activation-1.15e+00

Pos logit contributions
 s+0.017
 wished+0.017
 living+0.016
 t+0.016
 sharp+0.016
Neg logit contributions
ete-0.032
icking-0.031
ement-0.029
ory-0.029
oming-0.028
Token and
Token ID290
Feature activation-3.10e-01

Pos logit contributions
 s+0.049
 wished+0.048
 living+0.045
 t+0.044
 sharp+0.043
Neg logit contributions
ete-0.115
icking-0.112
ement-0.106
ory-0.106
ancer-0.103
Token asked
Token ID1965
Feature activation-9.74e-01

Pos logit contributions
 s+0.015
 wished+0.014
 living+0.014
 t+0.013
 sharp+0.013
Neg logit contributions
ete-0.030
icking-0.029
ement-0.027
ory-0.027
oming-0.026
Token thing
Token ID1517
Feature activation-4.43e-01

Pos logit contributions
ete+0.175
icking+0.172
ory+0.161
ement+0.161
ancer+0.155
Neg logit contributions
 wished-0.098
 s-0.098
 living-0.090
 r-0.086
 sharp-0.086
Token to
Token ID284
Feature activation-7.89e-01

Pos logit contributions
 wished+0.016
 living+0.016
 s+0.016
 t+0.015
 sharp+0.014
Neg logit contributions
ete-0.048
icking-0.046
ory-0.045
ement-0.044
oming-0.043
Token do
Token ID466
Feature activation+3.80e-01

Pos logit contributions
 s+0.035
 wished+0.035
 living+0.033
 t+0.032
 sharp+0.032
Neg logit contributions
ete-0.079
icking-0.077
ement-0.073
ory-0.072
ancer-0.071
Token.
Token ID13
Feature activation-4.39e-01

Pos logit contributions
ete+0.039
icking+0.038
ory+0.037
ement+0.036
ancer+0.036
Neg logit contributions
 wished-0.015
 s-0.015
 living-0.015
 sharp-0.014
 t-0.014
Token<|endoftext|>
Token ID50256
Feature activation+1.01e+00

Pos logit contributions
 wished+0.024
 s+0.024
 living+0.023
 t+0.022
 happy+0.022
Neg logit contributions
ete-0.038
icking-0.038
ory-0.036
ement-0.035
oming-0.035
TokenJo
Token ID9908
Feature activation-5.25e+00

Pos logit contributions
icking+0.112
ete+0.108
ory+0.102
ement+0.100
ancing+0.099
Neg logit contributions
 living-0.042
 s-0.037
 r-0.033
 t-0.032
 wished-0.032
Tokenel
Token ID417
Feature activation+1.60e+00

Pos logit contributions
 s+0.252
 wished+0.252
 living+0.237
 t+0.229
 sharp+0.224
Neg logit contributions
ete-0.499
icking-0.485
ement-0.460
ory-0.457
ancer-0.449
Token saw
Token ID2497
Feature activation-1.21e+00

Pos logit contributions
ete+0.128
icking+0.125
ory+0.116
ement+0.115
ancer+0.112
Neg logit contributions
 s-0.100
 wished-0.099
 living-0.094
 t-0.092
 sharp-0.092
Token a
Token ID257
Feature activation-2.64e-01

Pos logit contributions
 s+0.036
 wished+0.035
 living+0.032
 t+0.030
 sharp+0.030
Neg logit contributions
ete-0.137
icking-0.134
ement-0.128
ory-0.127
ancer-0.125
Token banana
Token ID25996
Feature activation-6.39e-01

Pos logit contributions
 s+0.009
 wished+0.009
 living+0.009
 t+0.008
 sharp+0.008
Neg logit contributions
ete-0.028
icking-0.028
ement-0.026
ory-0.026
ancer-0.026
Token in
Token ID287
Feature activation+3.44e-02

Pos logit contributions
 s+0.025
 wished+0.024
 living+0.024
 t+0.023
 sharp+0.022
Neg logit contributions
ete-0.069
icking-0.065
ory-0.063
ement-0.062
ancer-0.062
Token unloaded
Token ID49975
Feature activation+6.24e-01

Pos logit contributions
ete+0.009
icking+0.009
ement+0.008
ory+0.008
oming+0.008
Neg logit contributions
 wished-0.004
 s-0.004
 living-0.004
 t-0.003
 sharp-0.003
Token,
Token ID11
Feature activation+2.39e-01

Pos logit contributions
ete+0.068
ancer+0.067
ory+0.065
icking+0.064
ays+0.064
Neg logit contributions
 s-0.023
 sharp-0.023
 t-0.022
 soft-0.021
 happy-0.021
Token John
Token ID1757
Feature activation-8.60e-01

Pos logit contributions
ete+0.020
icking+0.020
ory+0.019
ement+0.019
ancer+0.018
Neg logit contributions
 s-0.014
 wished-0.014
 living-0.013
 t-0.013
 sharp-0.013
Token and
Token ID290
Feature activation-1.62e-01

Pos logit contributions
 s+0.037
 wished+0.037
 living+0.035
 t+0.033
 sharp+0.033
Neg logit contributions
ete-0.085
icking-0.083
ement-0.078
ory-0.078
oming-0.076
Token Jo
Token ID5302
Feature activation-1.69e+00

Pos logit contributions
 wished+0.013
 mess+0.012
 living+0.012
 s+0.012
 sharp+0.012
Neg logit contributions
ete-0.010
icking-0.009
 Derek-0.008
ola-0.008
ement-0.008
TokenJo
Token ID9908
Feature activation-5.24e+00

Pos logit contributions
 s+0.083
 living+0.082
 wished+0.081
 sharp+0.080
 mess+0.080
Neg logit contributions
ete-0.156
icking-0.151
ement-0.143
ory-0.140
ola-0.138
Token started
Token ID2067
Feature activation-4.74e-01

Pos logit contributions
 s+0.259
 wished+0.253
 living+0.253
 sharp+0.245
 t+0.239
Neg logit contributions
ete-0.486
icking-0.474
ement-0.448
ory-0.439
ola-0.424
Token to
Token ID284
Feature activation-8.13e-01

Pos logit contributions
 wished+0.025
 cob+0.021
 s+0.021
 dreamed+0.021
 mess+0.021
Neg logit contributions
icking-0.041
ete-0.040
 Derek-0.037
ipping-0.037
oming-0.037
Token make
Token ID787
Feature activation+3.60e-01

Pos logit contributions
 wished+0.041
 living+0.040
 s+0.037
 sharp+0.036
 soft+0.035
Neg logit contributions
ete-0.072
icking-0.072
-0.065
 Derek-0.064
ory-0.064
Token their
Token ID511
Feature activation-1.45e+00

Pos logit contributions
ete+0.037
icking+0.036
ement+0.035
ory+0.034
oming+0.033
Neg logit contributions
 wished-0.014
 s-0.013
 living-0.013
 t-0.012
 bark-0.012
Token way
Token ID835
Feature activation-1.17e-02

Pos logit contributions
 wished+0.076
 s+0.076
 living+0.072
 t+0.069
 sharp+0.068
Neg logit contributions
ete-0.132
icking-0.128
ement-0.120
ory-0.120
ancer-0.116
Token happy
Token ID3772
Feature activation-1.76e+00

Pos logit contributions
 wished+0.003
 s+0.003
 living+0.003
 sharp+0.002
 r+0.002
Neg logit contributions
ete-0.007
icking-0.006
ory-0.006
ement-0.006
ancer-0.006
Token boy
Token ID2933
Feature activation-2.20e+00

Pos logit contributions
 wished+0.080
 s+0.075
 sharp+0.071
 living+0.068
 r+0.067
Neg logit contributions
ete-0.159
icking-0.154
ory-0.149
ement-0.147
-0.142
Token.
Token ID13
Feature activation-3.59e-01

Pos logit contributions
 wished+0.083
 living+0.081
 s+0.078
 t+0.074
 happy+0.073
Neg logit contributions
ete-0.236
icking-0.229
ement-0.220
ory-0.219
oming-0.217
Token<|endoftext|>
Token ID50256
Feature activation+1.04e+00

Pos logit contributions
 living+0.012
 wished+0.011
 s+0.011
 never+0.010
 happy+0.010
Neg logit contributions
icking-0.041
ete-0.038
ement-0.037
oming-0.037
ory-0.037
TokenJo
Token ID9908
Feature activation-5.23e+00

Pos logit contributions
icking+0.103
ete+0.100
ory+0.094
ement+0.091
oming+0.090
Neg logit contributions
 living-0.055
 s-0.050
 wished-0.047
 r-0.046
 t-0.046
TokenJo
Token ID9908
Feature activation-5.24e+00

Pos logit contributions
 s+0.281
 wished+0.278
 living+0.266
 t+0.258
 sharp+0.256
Neg logit contributions
ete-0.469
icking-0.454
ement-0.428
ory-0.426
ancer-0.413
Token was
Token ID373
Feature activation-1.19e-01

Pos logit contributions
 s+0.248
 wished+0.244
 living+0.233
 t+0.224
 sharp+0.223
Neg logit contributions
ete-0.502
icking-0.488
ement-0.462
ory-0.460
ancer-0.448
Token sad
Token ID6507
Feature activation-2.34e-01

Pos logit contributions
 wished+0.004
 s+0.004
 bark+0.004
 sharp+0.004
 mess+0.004
Neg logit contributions
icking-0.012
ete-0.012
oming-0.011
ement-0.011
ancer-0.011
Token.
Token ID13
Feature activation-4.39e-01

Pos logit contributions
 wished+0.010
 s+0.009
 living+0.009
 sharp+0.008
 t+0.008
Neg logit contributions
ete-0.024
icking-0.023
ement-0.022
ory-0.022
ancer-0.021
Token He
Token ID679
Feature activation+2.02e-01

Pos logit contributions
 s+0.027
 wished+0.027
 living+0.026
 t+0.025
 sharp+0.025
Neg logit contributions
ete-0.036
icking-0.034
ement-0.032
ory-0.032
ancer-0.031
Token wanted
Token ID2227
Feature activation-2.12e+00

Pos logit contributions
ete+0.020
icking+0.020
ory+0.019
ement+0.019
ola+0.018
Neg logit contributions
 s-0.007
 living-0.007
 snow-0.006
 wished-0.006
 gir-0.006
Token felt
Token ID2936
Feature activation-9.09e-01

Pos logit contributions
 s+0.016
 wished+0.016
 living+0.015
 t+0.014
 sharp+0.014
Neg logit contributions
ete-0.038
icking-0.037
ement-0.035
ory-0.035
ancer-0.034
Token so
Token ID523
Feature activation+1.61e+00

Pos logit contributions
 wished+0.034
 s+0.034
 living+0.032
 t+0.030
 happy+0.030
Neg logit contributions
ete-0.096
icking-0.094
ement-0.090
ory-0.089
ancer-0.088
Token proud
Token ID6613
Feature activation+3.56e-01

Pos logit contributions
ete+0.170
icking+0.165
ement+0.158
ory+0.156
oming+0.154
Neg logit contributions
 s-0.060
 wished-0.058
 living-0.055
 t-0.054
 snow-0.052
Token!
Token ID0
Feature activation+1.29e+00

Pos logit contributions
ete+0.029
icking+0.028
ory+0.026
ement+0.026
ancer+0.025
Neg logit contributions
 s-0.023
 wished-0.023
 living-0.022
 t-0.021
 sharp-0.021
Token<|endoftext|>
Token ID50256
Feature activation+1.04e+00

Pos logit contributions
icking+0.143
ete+0.139
ement+0.133
ory+0.133
oming+0.130
Neg logit contributions
 s-0.043
 wished-0.043
 living-0.042
 happy-0.037
 t-0.035
TokenJo
Token ID9908
Feature activation-5.23e+00

Pos logit contributions
icking+0.097
ete+0.095
ory+0.088
ement+0.086
oming+0.084
Neg logit contributions
 living-0.059
 s-0.055
 wished-0.052
 r-0.051
 t-0.051
Token was
Token ID373
Feature activation-4.04e-02

Pos logit contributions
 s+0.312
 wished+0.308
 living+0.298
 sharp+0.289
 t+0.289
Neg logit contributions
ete-0.437
icking-0.423
ory-0.395
ement-0.395
ancer-0.378
Token a
Token ID257
Feature activation-2.27e-01

Pos logit contributions
 wished+0.002
 s+0.001
 bark+0.001
 sharp+0.001
 t+0.001
Neg logit contributions
ete-0.004
icking-0.004
ement-0.004
oming-0.004
ory-0.004
Token lively
Token ID29696
Feature activation+5.31e-01

Pos logit contributions
 wished+0.007
 s+0.007
 living+0.006
 knocking+0.006
 mess+0.006
Neg logit contributions
ete-0.024
ory-0.024
icking-0.023
ancer-0.022
ement-0.022
Token three
Token ID1115
Feature activation-2.76e+00

Pos logit contributions
ete+0.056
icking+0.055
ory+0.054
ement+0.052
ancer+0.051
Neg logit contributions
 wished-0.019
 s-0.018
 living-0.016
 sharp-0.016
 never-0.015
Token-
Token ID12
Feature activation-5.06e-01

Pos logit contributions
 wished+0.160
 s+0.153
 living+0.148
 sharp+0.147
 knocking+0.145
Neg logit contributions
ete-0.221
ement-0.220
ory-0.210
icking-0.209
ola-0.202
Token of
Token ID286
Feature activation-2.65e-01

Pos logit contributions
ete+0.175
icking+0.170
ory+0.164
ement+0.161
ancer+0.159
Neg logit contributions
 s-0.069
 wished-0.067
 living-0.066
 r-0.060
 sharp-0.060
Token the
Token ID262
Feature activation-1.48e-01

Pos logit contributions
 s+0.010
 wished+0.010
 living+0.009
 t+0.009
 sharp+0.008
Neg logit contributions
ete-0.028
icking-0.028
ement-0.026
ory-0.026
ancer-0.026
Token night
Token ID1755
Feature activation-1.26e+00

Pos logit contributions
 wished+0.006
 s+0.006
 living+0.005
 t+0.005
 sharp+0.005
Neg logit contributions
ete-0.015
icking-0.014
ement-0.014
ory-0.013
ancer-0.013
Token.
Token ID13
Feature activation-6.34e-01

Pos logit contributions
 s+0.058
 wished+0.058
 living+0.054
 t+0.053
 sharp+0.052
Neg logit contributions
ete-0.122
icking-0.119
ement-0.113
ory-0.112
ancer-0.110
Token<|endoftext|>
Token ID50256
Feature activation+8.90e-01

Pos logit contributions
 s+0.025
 wished+0.025
 living+0.024
 happy+0.022
 t+0.022
Neg logit contributions
icking-0.066
ete-0.065
ement-0.062
ory-0.061
oming-0.060
TokenJo
Token ID9908
Feature activation-5.23e+00

Pos logit contributions
icking+0.098
ete+0.096
ory+0.091
ement+0.089
oming+0.088
Neg logit contributions
 living-0.034
 s-0.032
 wished-0.029
 t-0.028
 r-0.028
TokenJo
Token ID9908
Feature activation-5.28e+00

Pos logit contributions
 s+0.292
 wished+0.288
 living+0.278
 sharp+0.270
 t+0.270
Neg logit contributions
ete-0.457
icking-0.442
ement-0.415
ory-0.414
ancer-0.400
Token was
Token ID373
Feature activation-4.42e-01

Pos logit contributions
 s+0.273
 wished+0.269
 living+0.258
 t+0.249
 sharp+0.249
Neg logit contributions
ete-0.482
icking-0.468
ement-0.442
ory-0.440
ancer-0.427
Token walking
Token ID6155
Feature activation-1.12e+00

Pos logit contributions
 wished+0.016
 s+0.015
 bark+0.014
 sharp+0.014
 gir+0.014
Neg logit contributions
icking-0.044
ete-0.043
oming-0.041
ement-0.041
ancer-0.041
Token outside
Token ID2354
Feature activation-8.82e-01

Pos logit contributions
 s+0.062
 wished+0.061
 living+0.058
 t+0.057
 sharp+0.056
Neg logit contributions
ete-0.099
icking-0.096
ement-0.090
ory-0.090
ancer-0.087
Token on
Token ID319
Feature activation-1.78e-01

Pos logit contributions
 wished+0.047
 s+0.043
 sharp+0.042
 bark+0.039
 gir+0.039
Neg logit contributions
icking-0.077
ete-0.075
ement-0.070
oming-0.070
ancer-0.070
Token lucky
Token ID9670
Feature activation-6.59e-02

Pos logit contributions
 wished+0.006
 s+0.006
 living+0.005
 bark+0.005
 r+0.005
Neg logit contributions
ete-0.018
ory-0.017
icking-0.017
-0.017
ement-0.017
Token happy
Token ID3772
Feature activation-1.76e+00

Pos logit contributions
 wished+0.003
 s+0.003
 living+0.003
 sharp+0.002
 r+0.002
Neg logit contributions
ete-0.007
icking-0.006
ory-0.006
ement-0.006
ancer-0.006
Token boy
Token ID2933
Feature activation-2.20e+00

Pos logit contributions
 wished+0.080
 s+0.075
 sharp+0.071
 living+0.068
 r+0.067
Neg logit contributions
ete-0.159
icking-0.154
ory-0.149
ement-0.147
-0.142
Token.
Token ID13
Feature activation-3.59e-01

Pos logit contributions
 wished+0.083
 living+0.081
 s+0.078
 t+0.074
 happy+0.073
Neg logit contributions
ete-0.236
icking-0.229
ement-0.220
ory-0.219
oming-0.217
Token<|endoftext|>
Token ID50256
Feature activation+1.04e+00

Pos logit contributions
 living+0.012
 wished+0.011
 s+0.011
 never+0.010
 happy+0.010
Neg logit contributions
icking-0.041
ete-0.038
ement-0.037
oming-0.037
ory-0.037
TokenJo
Token ID9908
Feature activation-5.23e+00

Pos logit contributions
icking+0.103
ete+0.100
ory+0.094
ement+0.091
oming+0.090
Neg logit contributions
 living-0.055
 s-0.050
 wished-0.047
 r-0.046
 t-0.046
TokenJo
Token ID9908
Feature activation-5.24e+00

Pos logit contributions
 s+0.281
 wished+0.278
 living+0.266
 t+0.258
 sharp+0.256
Neg logit contributions
ete-0.469
icking-0.454
ement-0.428
ory-0.426
ancer-0.413
Token was
Token ID373
Feature activation-1.19e-01

Pos logit contributions
 s+0.248
 wished+0.244
 living+0.233
 t+0.224
 sharp+0.223
Neg logit contributions
ete-0.502
icking-0.488
ement-0.462
ory-0.460
ancer-0.448
Token sad
Token ID6507
Feature activation-2.34e-01

Pos logit contributions
 wished+0.004
 s+0.004
 bark+0.004
 sharp+0.004
 mess+0.004
Neg logit contributions
icking-0.012
ete-0.012
oming-0.011
ement-0.011
ancer-0.011
Token.
Token ID13
Feature activation-4.39e-01

Pos logit contributions
 wished+0.010
 s+0.009
 living+0.009
 sharp+0.008
 t+0.008
Neg logit contributions
ete-0.024
icking-0.023
ement-0.022
ory-0.022
ancer-0.021
Token He
Token ID679
Feature activation+2.02e-01

Pos logit contributions
 s+0.027
 wished+0.027
 living+0.026
 t+0.025
 sharp+0.025
Neg logit contributions
ete-0.036
icking-0.034
ement-0.032
ory-0.032
ancer-0.031
Token and
Token ID290
Feature activation-2.27e-01

Pos logit contributions
 wished+0.121
 s+0.114
 t+0.107
 bark+0.104
 bang+0.104
Neg logit contributions
icking-0.169
ete-0.158
ails-0.157
iring-0.156
oming-0.155
Token harder
Token ID7069
Feature activation-2.13e+00

Pos logit contributions
 wished+0.011
 s+0.011
 living+0.010
 t+0.010
 sharp+0.010
Neg logit contributions
ete-0.021
icking-0.021
ement-0.019
ory-0.019
oming-0.019
Token.
Token ID13
Feature activation-3.38e-01

Pos logit contributions
 wished+0.113
 s+0.104
 t+0.098
 living+0.095
 happy+0.094
Neg logit contributions
icking-0.179
ete-0.174
oming-0.166
ails-0.164
ement-0.162
Token\n
Token ID198
Feature activation-2.89e-01

Pos logit contributions
 s+0.017
 wished+0.017
 living+0.016
 t+0.015
 sharp+0.015
Neg logit contributions
ete-0.032
icking-0.031
ement-0.029
ory-0.029
ancer-0.029
Token\n
Token ID198
Feature activation-3.04e-01

Pos logit contributions
 s+0.011
 wished+0.011
 living+0.011
 t+0.010
 sharp+0.010
Neg logit contributions
ete-0.030
icking-0.030
ement-0.028
ory-0.028
ancer-0.027
TokenJo
Token ID9908
Feature activation-5.21e+00

Pos logit contributions
 wished+0.018
 s+0.018
 t+0.017
 living+0.017
 mess+0.017
Neg logit contributions
ete-0.026
ory-0.022
icking-0.022
-0.022
ola-0.022
Tokenjo
Token ID7639
Feature activation-1.81e+00

Pos logit contributions
 s+0.240
 living+0.233
 wished+0.230
 t+0.220
 sharp+0.217
Neg logit contributions
ete-0.505
icking-0.493
ement-0.461
ory-0.460
ola-0.446
Token and
Token ID290
Feature activation-2.10e-01

Pos logit contributions
 living+0.081
 s+0.080
 soft+0.076
 sharp+0.075
 snow+0.074
Neg logit contributions
ete-0.169
icking-0.165
ory-0.154
ement-0.151
ola-0.147
Token his
Token ID465
Feature activation-3.31e-01

Pos logit contributions
 s+0.010
 mess+0.009
 living+0.009
 wished+0.009
 sharp+0.009
Neg logit contributions
ete-0.019
icking-0.019
 Derek-0.018
ancer-0.017
ory-0.017
Token mom
Token ID1995
Feature activation-4.99e-01

Pos logit contributions
 wished+0.016
 s+0.016
 living+0.015
 t+0.015
 sharp+0.014
Neg logit contributions
ete-0.031
icking-0.030
ement-0.028
ory-0.028
ancer-0.027
Token were
Token ID547
Feature activation-4.40e-01

Pos logit contributions
 wished+0.023
 s+0.023
 sharp+0.022
 living+0.022
 soft+0.021
Neg logit contributions
icking-0.048
ete-0.046
ory-0.043
ement-0.042
ancer-0.041
Token friend
Token ID1545
Feature activation+2.29e+00

Pos logit contributions
ete+0.135
icking+0.131
ory+0.124
ement+0.123
ancer+0.117
Neg logit contributions
 wished-0.076
 s-0.075
 living-0.070
 sharp-0.067
 t-0.066
Token Jo
Token ID5302
Feature activation-1.66e+00

Pos logit contributions
ete+0.237
icking+0.231
ement+0.222
ory+0.219
ancer+0.216
Neg logit contributions
 s-0.090
 wished-0.089
 living-0.086
 t-0.081
 sharp-0.077
TokenJo
Token ID9908
Feature activation-5.17e+00

Pos logit contributions
 s+0.082
 wished+0.081
 sharp+0.076
 living+0.076
 t+0.074
Neg logit contributions
ete-0.155
icking-0.152
ory-0.142
ement-0.142
ancer-0.138
Token.
Token ID13
Feature activation-6.29e-01

Pos logit contributions
 s+0.252
 wished+0.250
 living+0.237
 sharp+0.235
 t+0.231
Neg logit contributions
ete-0.485
icking-0.477
ement-0.448
ory-0.442
ancer-0.430
Token Jo
Token ID5302
Feature activation-1.66e+00

Pos logit contributions
 s+0.039
 wished+0.038
 living+0.037
 t+0.036
 sharp+0.035
Neg logit contributions
ete-0.052
icking-0.050
ement-0.047
ory-0.047
ancer-0.045
TokenJo
Token ID9908
Feature activation-5.21e+00

Pos logit contributions
 s+0.078
 wished+0.075
 living+0.074
 sharp+0.073
 mess+0.071
Neg logit contributions
ete-0.158
icking-0.153
ement-0.147
ory-0.143
ola-0.142
Token was
Token ID373
Feature activation-1.74e-01

Pos logit contributions
 s+0.231
 wished+0.227
 living+0.217
 t+0.209
 sharp+0.208
Neg logit contributions
ete-0.513
icking-0.501
ement-0.477
ory-0.470
ancer-0.458
Token thrilled
Token ID20536
Feature activation-4.33e-01

Pos logit contributions
 wished+0.008
 s+0.008
 living+0.007
 snow+0.007
 sharp+0.007
Neg logit contributions
ete-0.017
icking-0.016
ement-0.015
oming-0.015
ory-0.015
Token to
Token ID284
Feature activation-9.28e-01

Pos logit contributions
 s+0.015
 living+0.014
 wished+0.014
 t+0.013
 happy+0.013
Neg logit contributions
ete-0.048
icking-0.046
ory-0.044
ancer-0.044
ement-0.044
Token see
Token ID766
Feature activation+8.89e-01

Pos logit contributions
 s+0.037
 wished+0.035
 t+0.033
 happy+0.032
 bark+0.031
Neg logit contributions
ete-0.099
icking-0.098
ement-0.094
ory-0.092
ancer-0.091
Token Big
Token ID4403
Feature activation-5.93e-01

Pos logit contributions
ete+0.092
icking+0.089
ement+0.085
ory+0.084
ancer+0.083
Neg logit contributions
 s-0.036
 wished-0.035
 living-0.033
 t-0.032
 sharp-0.031
Token've
Token ID1053
Feature activation+3.22e+00

Pos logit contributions
ete+0.159
icking+0.154
ory+0.145
ement+0.145
ancer+0.141
Neg logit contributions
 s-0.099
 wished-0.098
 living-0.095
 t-0.091
 sharp-0.089
Token got
Token ID1392
Feature activation-7.53e-01

Pos logit contributions
ete+0.330
icking+0.322
ory+0.300
ement+0.297
ola+0.294
Neg logit contributions
 s-0.120
 wished-0.114
 living-0.111
 snow-0.108
 soft-0.106
Token an
Token ID281
Feature activation+1.19e+00

Pos logit contributions
 s+0.025
 wished+0.024
 living+0.022
 t+0.021
 sharp+0.021
Neg logit contributions
ete-0.084
icking-0.082
ement-0.079
ory-0.078
ancer-0.077
Token idea
Token ID2126
Feature activation+2.16e+00

Pos logit contributions
ete+0.117
icking+0.114
ement+0.108
ory+0.108
ancer+0.107
Neg logit contributions
 s-0.055
 wished-0.054
 sharp-0.050
 t-0.050
 living-0.050
Token Jo
Token ID5302
Feature activation-1.68e+00

Pos logit contributions
ete+0.191
icking+0.185
ement+0.175
ory+0.174
ancer+0.170
Neg logit contributions
 s-0.118
 wished-0.117
 living-0.112
 t-0.108
 sharp-0.107
TokenJo
Token ID9908
Feature activation-5.21e+00

Pos logit contributions
 s+0.092
 wished+0.092
 living+0.087
 t+0.085
 sharp+0.084
Neg logit contributions
ete-0.148
icking-0.144
ement-0.136
ory-0.135
ancer-0.131
Token.
Token ID13
Feature activation-4.09e-01

Pos logit contributions
 s+0.299
 wished+0.298
 living+0.284
 t+0.276
 sharp+0.274
Neg logit contributions
ete-0.445
icking-0.431
ement-0.406
ory-0.403
ancer-0.392
Token Maybe
Token ID6674
Feature activation+1.06e-01

Pos logit contributions
 s+0.024
 wished+0.024
 living+0.024
 t+0.022
 never+0.022
Neg logit contributions
ete-0.034
icking-0.034
ory-0.033
ement-0.033
oming-0.031
Token next
Token ID1306
Feature activation-3.50e-01

Pos logit contributions
ete+0.010
icking+0.010
ement+0.010
ory+0.009
+0.009
Neg logit contributions
 wished-0.005
 s-0.005
 living-0.005
 t-0.005
 snow-0.005
Token time
Token ID640
Feature activation+2.76e-01

Pos logit contributions
 wished+0.016
 s+0.016
 living+0.015
 t+0.014
 sharp+0.014
Neg logit contributions
ete-0.034
icking-0.033
ement-0.032
ory-0.031
-0.030
Token we
Token ID356
Feature activation+3.39e-01

Pos logit contributions
ete+0.023
icking+0.022
ement+0.021
ory+0.021
+0.021
Neg logit contributions
 s-0.017
 wished-0.017
 living-0.016
 t-0.016
 happy-0.015
Token life
Token ID1204
Feature activation+6.90e-01

Pos logit contributions
 s+0.059
 t+0.057
 wished+0.056
 living+0.054
 soft+0.054
Neg logit contributions
ete-0.169
icking-0.161
ement-0.155
ancer-0.154
ory-0.154
Token perfectly
Token ID7138
Feature activation+2.89e+00

Pos logit contributions
ete+0.069
icking+0.067
ement+0.064
ory+0.063
ancer+0.063
Neg logit contributions
 s-0.030
 wished-0.029
 living-0.029
 t-0.028
 sharp-0.028
Token.
Token ID13
Feature activation-4.22e-01

Pos logit contributions
ete+0.250
icking+0.246
ory+0.232
ement+0.231
oming+0.226
Neg logit contributions
 s-0.161
 wished-0.160
 living-0.150
 snow-0.145
 t-0.144
Token\n
Token ID198
Feature activation-3.18e-01

Pos logit contributions
 s+0.020
 wished+0.019
 living+0.019
 t+0.018
 happy+0.017
Neg logit contributions
ete-0.041
icking-0.041
ory-0.038
ement-0.038
ancer-0.037
Token\n
Token ID198
Feature activation-3.44e-01

Pos logit contributions
 s+0.016
 wished+0.016
 t+0.015
 mess+0.015
 sharp+0.015
Neg logit contributions
ete-0.029
ement-0.026
icking-0.026
ola-0.026
oming-0.025
TokenJo
Token ID9908
Feature activation-5.21e+00

Pos logit contributions
 s+0.019
 mess+0.018
 cough+0.018
 wished+0.018
 t+0.018
Neg logit contributions
ete-0.030
ola-0.027
ement-0.027
icking-0.026
oming-0.026
Token decided
Token ID3066
Feature activation+1.19e+00

Pos logit contributions
 s+0.242
 wished+0.238
 living+0.228
 t+0.219
 sharp+0.216
Neg logit contributions
ete-0.505
icking-0.491
ement-0.465
ory-0.463
ancer-0.450
Token to
Token ID284
Feature activation-8.20e-01

Pos logit contributions
ete+0.119
icking+0.115
ory+0.109
ement+0.109
oming+0.106
Neg logit contributions
 s-0.051
 wished-0.049
 living-0.047
 t-0.045
 snow-0.044
Token take
Token ID1011
Feature activation+3.43e-01

Pos logit contributions
 s+0.036
 wished+0.034
 t+0.033
 bark+0.032
 living+0.031
Neg logit contributions
ete-0.083
icking-0.082
ement-0.080
ancer-0.078
oming-0.077
Token the
Token ID262
Feature activation+6.95e-02

Pos logit contributions
icking+0.041
ete+0.041
ement+0.038
ory+0.038
ancer+0.038
Neg logit contributions
 s-0.009
 wished-0.009
 t-0.008
 sharp-0.008
 living-0.008
Token puppy
Token ID26188
Feature activation+1.39e+00

Pos logit contributions
ete+0.007
icking+0.007
ement+0.006
ory+0.006
ancer+0.006
Neg logit contributions
 s-0.003
 wished-0.003
 living-0.003
 t-0.003
 sharp-0.003
Token Jo
Token ID5302
Feature activation-1.65e+00

Pos logit contributions
 s+0.054
 living+0.054
 wished+0.053
 sharp+0.051
 mess+0.051
Neg logit contributions
ete-0.082
icking-0.079
ement-0.075
ola-0.073
ory-0.073
Token shouted
Token ID17293
Feature activation-5.31e-01

Pos logit contributions
 s+0.084
 living+0.084
 mess+0.079
 wished+0.078
 snow+0.077
Neg logit contributions
ete-0.151
icking-0.144
ement-0.137
ola-0.136
ory-0.135
Token.
Token ID13
Feature activation-3.13e-01

Pos logit contributions
 wished+0.024
 s+0.024
 living+0.023
 snow+0.021
 sharp+0.021
Neg logit contributions
ete-0.052
icking-0.050
ement-0.048
ory-0.047
oming-0.046
Token\n
Token ID198
Feature activation-2.06e-01

Pos logit contributions
 s+0.015
 wished+0.015
 living+0.015
 t+0.014
 sharp+0.014
Neg logit contributions
ete-0.029
icking-0.029
ement-0.027
ory-0.027
ancer-0.026
Token\n
Token ID198
Feature activation-2.08e-01

Pos logit contributions
 s+0.009
 wished+0.009
 living+0.008
 t+0.008
 sharp+0.008
Neg logit contributions
ete-0.021
icking-0.020
ement-0.019
ory-0.019
ancer-0.019
TokenJo
Token ID9908
Feature activation-5.18e+00

Pos logit contributions
 wished+0.011
 s+0.011
 mess+0.010
 living+0.010
 sharp+0.010
Neg logit contributions
ete-0.018
icking-0.017
ola-0.016
ement-0.016
iring-0.016
Token opened
Token ID4721
Feature activation+6.46e-01

Pos logit contributions
 s+0.227
 living+0.222
 wished+0.214
 snow+0.202
 sharp+0.201
Neg logit contributions
ete-0.515
icking-0.500
ement-0.471
ory-0.467
ola-0.451
Token the
Token ID262
Feature activation+2.41e-01

Pos logit contributions
ete+0.064
icking+0.062
ement+0.060
ory+0.058
oming+0.058
Neg logit contributions
 wished-0.028
 s-0.027
 living-0.026
 mess-0.023
 sharp-0.023
Token book
Token ID1492
Feature activation-1.44e-01

Pos logit contributions
ete+0.022
icking+0.022
ement+0.022
anger+0.021
oming+0.021
Neg logit contributions
 wished-0.011
 s-0.011
 living-0.009
 snow-0.009
 r-0.009
Token and
Token ID290
Feature activation-1.28e-01

Pos logit contributions
 s+0.005
 living+0.005
 t+0.005
 wished+0.004
 toy+0.004
Neg logit contributions
ete-0.016
icking-0.015
ory-0.015
ancer-0.015
oming-0.014
Token saw
Token ID2497
Feature activation-1.04e+00

Pos logit contributions
 s+0.005
 wished+0.005
 living+0.005
 t+0.005
 sharp+0.005
Neg logit contributions
ete-0.013
icking-0.013
ory-0.012
ement-0.012
ancer-0.012
Token into
Token ID656
Feature activation+2.11e-01

Pos logit contributions
 s+0.055
 wished+0.055
 living+0.053
 t+0.051
 sharp+0.050
Neg logit contributions
ete-0.096
icking-0.093
ement-0.088
ory-0.087
ancer-0.085
Token his
Token ID465
Feature activation-4.33e-01

Pos logit contributions
ete+0.022
icking+0.021
ement+0.020
ory+0.020
 Derek+0.019
Neg logit contributions
 wished-0.008
 s-0.008
 living-0.008
 t-0.007
 sharp-0.007
Token best
Token ID1266
Feature activation+1.47e+00

Pos logit contributions
 wished+0.023
 s+0.023
 living+0.021
 t+0.020
 sharp+0.020
Neg logit contributions
ete-0.039
icking-0.038
ement-0.036
ory-0.035
-0.034
Token friend
Token ID1545
Feature activation+2.29e+00

Pos logit contributions
ete+0.135
icking+0.131
ory+0.124
ement+0.123
ancer+0.117
Neg logit contributions
 wished-0.076
 s-0.075
 living-0.070
 sharp-0.067
 t-0.066
Token Jo
Token ID5302
Feature activation-1.66e+00

Pos logit contributions
ete+0.237
icking+0.231
ement+0.222
ory+0.219
ancer+0.216
Neg logit contributions
 s-0.090
 wished-0.089
 living-0.086
 t-0.081
 sharp-0.077
TokenJo
Token ID9908
Feature activation-5.17e+00

Pos logit contributions
 s+0.082
 wished+0.081
 sharp+0.076
 living+0.076
 t+0.074
Neg logit contributions
ete-0.155
icking-0.152
ory-0.142
ement-0.142
ancer-0.138
Token.
Token ID13
Feature activation-6.29e-01

Pos logit contributions
 s+0.252
 wished+0.250
 living+0.237
 sharp+0.235
 t+0.231
Neg logit contributions
ete-0.485
icking-0.477
ement-0.448
ory-0.442
ancer-0.430
Token Jo
Token ID5302
Feature activation-1.66e+00

Pos logit contributions
 s+0.039
 wished+0.038
 living+0.037
 t+0.036
 sharp+0.035
Neg logit contributions
ete-0.052
icking-0.050
ement-0.047
ory-0.047
ancer-0.045
TokenJo
Token ID9908
Feature activation-5.21e+00

Pos logit contributions
 s+0.078
 wished+0.075
 living+0.074
 sharp+0.073
 mess+0.071
Neg logit contributions
ete-0.158
icking-0.153
ement-0.147
ory-0.143
ola-0.142
Token was
Token ID373
Feature activation-1.74e-01

Pos logit contributions
 s+0.231
 wished+0.227
 living+0.217
 t+0.209
 sharp+0.208
Neg logit contributions
ete-0.513
icking-0.501
ement-0.477
ory-0.470
ancer-0.458
Token thrilled
Token ID20536
Feature activation-4.33e-01

Pos logit contributions
 wished+0.008
 s+0.008
 living+0.007
 snow+0.007
 sharp+0.007
Neg logit contributions
ete-0.017
icking-0.016
ement-0.015
oming-0.015
ory-0.015
Token't
Token ID470
Feature activation+3.09e-01

Pos logit contributions
ete+0.195
ory+0.190
icking+0.186
ement+0.185
ails+0.176
Neg logit contributions
 s-0.114
 never-0.105
 t-0.105
 r-0.104
 sharp-0.103
Token bud
Token ID3600
Feature activation-9.52e-01

Pos logit contributions
ete+0.025
icking+0.025
ory+0.022
ancer+0.022
ement+0.022
Neg logit contributions
 wished-0.019
 living-0.019
 s-0.018
 snow-0.017
 gir-0.016
Tokenge
Token ID469
Feature activation+1.24e+00

Pos logit contributions
 s+0.063
 wished+0.063
 living+0.062
 t+0.061
 sharp+0.060
Neg logit contributions
ete-0.074
icking-0.068
ement-0.067
ory-0.065
-0.063
Token!
Token ID0
Feature activation+1.30e+00

Pos logit contributions
ete+0.133
icking+0.130
ement+0.124
ory+0.123
ancer+0.121
Neg logit contributions
 s-0.045
 wished-0.044
 living-0.041
 t-0.039
 sharp-0.038
Token Jo
Token ID5302
Feature activation-1.67e+00

Pos logit contributions
ete+0.119
icking+0.115
ory+0.109
ement+0.108
oming+0.105
Neg logit contributions
 s-0.069
 wished-0.068
 living-0.065
 t-0.063
 r-0.061
TokenJo
Token ID9908
Feature activation-5.17e+00

Pos logit contributions
 s+0.086
 living+0.084
 wished+0.079
 mess+0.078
 snow+0.078
Neg logit contributions
ete-0.152
icking-0.149
ement-0.142
ola-0.137
ory-0.136
Token was
Token ID373
Feature activation-2.66e-02

Pos logit contributions
 s+0.240
 wished+0.232
 living+0.231
 t+0.217
 sharp+0.214
Neg logit contributions
ete-0.500
icking-0.489
ement-0.463
ory-0.456
ola-0.440
Token getting
Token ID1972
Feature activation-1.67e+00

Pos logit contributions
 s+0.001
 wished+0.001
 snow+0.001
 gir+0.001
 living+0.001
Neg logit contributions
icking-0.002
ete-0.002
ement-0.002
oming-0.002
iring-0.002
Token frustrated
Token ID14718
Feature activation+2.34e-01

Pos logit contributions
 s+0.063
 wished+0.063
 sharp+0.060
 happy+0.059
 r+0.058
Neg logit contributions
ete-0.185
icking-0.177
ement-0.175
ancer-0.170
ory-0.169
Token.
Token ID13
Feature activation-4.21e-01

Pos logit contributions
ete+0.025
icking+0.024
ory+0.023
ement+0.023
ancer+0.022
Neg logit contributions
 s-0.009
 wished-0.009
 living-0.008
 t-0.008
 sharp-0.008
Token She
Token ID1375
Feature activation-3.19e-01

Pos logit contributions
 s+0.021
 wished+0.021
 living+0.020
 t+0.019
 sharp+0.019
Neg logit contributions
ete-0.039
icking-0.038
ement-0.036
ory-0.036
ancer-0.035
Token the
Token ID262
Feature activation+2.73e-01

Pos logit contributions
ete+0.017
icking+0.017
ory+0.016
ement+0.016
ancer+0.016
Neg logit contributions
 s-0.006
 wished-0.006
 living-0.006
 t-0.006
 happy-0.005
Token village
Token ID7404
Feature activation-2.83e-01

Pos logit contributions
ete+0.023
icking+0.022
ement+0.021
ory+0.021
+0.020
Neg logit contributions
 wished-0.016
 s-0.015
 t-0.014
 sharp-0.014
 bark-0.014
Token?"
Token ID1701
Feature activation+4.26e-01

Pos logit contributions
 s+0.012
 wished+0.012
 living+0.012
 t+0.011
 sharp+0.011
Neg logit contributions
ete-0.028
icking-0.027
ement-0.026
ory-0.026
ancer-0.025
Token\n
Token ID198
Feature activation-1.29e-01

Pos logit contributions
ete+0.039
icking+0.038
ory+0.035
ement+0.035
ancer+0.034
Neg logit contributions
 s-0.023
 wished-0.023
 living-0.021
 t-0.021
 sharp-0.020
Token\n
Token ID198
Feature activation-1.39e-01

Pos logit contributions
 s+0.006
 wished+0.005
 living+0.005
 t+0.005
 sharp+0.005
Neg logit contributions
ete-0.013
icking-0.013
ement-0.012
ory-0.012
ancer-0.012
TokenJo
Token ID9908
Feature activation-5.17e+00

Pos logit contributions
 wished+0.007
 s+0.007
 living+0.006
 t+0.006
 sharp+0.006
Neg logit contributions
ete-0.013
icking-0.012
ement-0.012
ola-0.012
ory-0.012
TokenJo
Token ID9908
Feature activation-5.17e+00

Pos logit contributions
 s+0.250
 wished+0.244
 living+0.240
 sharp+0.230
 t+0.229
Neg logit contributions
ete-0.490
icking-0.474
ement-0.455
ory-0.443
ola-0.432
Token was
Token ID373
Feature activation+1.47e-01

Pos logit contributions
 s+0.253
 wished+0.248
 living+0.240
 t+0.231
 sharp+0.229
Neg logit contributions
ete-0.489
icking-0.474
ement-0.452
ory-0.444
ancer-0.430
Token excited
Token ID6568
Feature activation-8.16e-01

Pos logit contributions
ete+0.013
icking+0.013
ement+0.013
oming+0.012
ola+0.012
Neg logit contributions
 s-0.007
 wished-0.007
 snow-0.007
 living-0.007
 mess-0.006
Token.
Token ID13
Feature activation-3.54e-01

Pos logit contributions
 s+0.034
 wished+0.034
 living+0.031
 t+0.030
 sharp+0.030
Neg logit contributions
ete-0.083
icking-0.081
ement-0.078
ory-0.077
ancer-0.075
Token They
Token ID1119
Feature activation-5.48e-01

Pos logit contributions
 s+0.019
 wished+0.019
 living+0.018
 t+0.017
 bark+0.017
Neg logit contributions
ete-0.032
icking-0.031
ory-0.030
ement-0.029
oming-0.029
Token village
Token ID7404
Feature activation-2.83e-01

Pos logit contributions
ete+0.023
icking+0.022
ement+0.021
ory+0.021
+0.020
Neg logit contributions
 wished-0.016
 s-0.015
 t-0.014
 sharp-0.014
 bark-0.014
Token?"
Token ID1701
Feature activation+4.26e-01

Pos logit contributions
 s+0.012
 wished+0.012
 living+0.012
 t+0.011
 sharp+0.011
Neg logit contributions
ete-0.028
icking-0.027
ement-0.026
ory-0.026
ancer-0.025
Token\n
Token ID198
Feature activation-1.29e-01

Pos logit contributions
ete+0.039
icking+0.038
ory+0.035
ement+0.035
ancer+0.034
Neg logit contributions
 s-0.023
 wished-0.023
 living-0.021
 t-0.021
 sharp-0.020
Token\n
Token ID198
Feature activation-1.39e-01

Pos logit contributions
 s+0.006
 wished+0.005
 living+0.005
 t+0.005
 sharp+0.005
Neg logit contributions
ete-0.013
icking-0.013
ement-0.012
ory-0.012
ancer-0.012
TokenJo
Token ID9908
Feature activation-5.17e+00

Pos logit contributions
 wished+0.007
 s+0.007
 living+0.006
 t+0.006
 sharp+0.006
Neg logit contributions
ete-0.013
icking-0.012
ement-0.012
ola-0.012
ory-0.012
TokenJo
Token ID9908
Feature activation-5.17e+00

Pos logit contributions
 s+0.250
 wished+0.244
 living+0.240
 sharp+0.230
 t+0.229
Neg logit contributions
ete-0.490
icking-0.474
ement-0.455
ory-0.443
ola-0.432
Token was
Token ID373
Feature activation+1.47e-01

Pos logit contributions
 s+0.253
 wished+0.248
 living+0.240
 t+0.231
 sharp+0.229
Neg logit contributions
ete-0.489
icking-0.474
ement-0.452
ory-0.444
ancer-0.430
Token excited
Token ID6568
Feature activation-8.16e-01

Pos logit contributions
ete+0.013
icking+0.013
ement+0.013
oming+0.012
ola+0.012
Neg logit contributions
 s-0.007
 wished-0.007
 snow-0.007
 living-0.007
 mess-0.006
Token.
Token ID13
Feature activation-3.54e-01

Pos logit contributions
 s+0.034
 wished+0.034
 living+0.031
 t+0.030
 sharp+0.030
Neg logit contributions
ete-0.083
icking-0.081
ement-0.078
ory-0.077
ancer-0.075
Token They
Token ID1119
Feature activation-5.48e-01

Pos logit contributions
 s+0.019
 wished+0.019
 living+0.018
 t+0.017
 bark+0.017
Neg logit contributions
ete-0.032
icking-0.031
ory-0.030
ement-0.029
oming-0.029
Token knew
Token ID2993
Feature activation-1.26e+00

Pos logit contributions
 s+0.024
 wished+0.024
 living+0.023
 sharp+0.022
 snow+0.022
Neg logit contributions
ete-0.054
icking-0.052
ory-0.049
ement-0.049
ola-0.047
TokenDon
Token ID3987
Feature activation-2.55e-01

Pos logit contributions
 s+0.042
 wished+0.040
 living+0.040
 t+0.039
 sharp+0.038
Neg logit contributions
ete-0.060
icking-0.059
ancer-0.056
ement-0.055
ory-0.055
Token't
Token ID470
Feature activation+3.26e-01

Pos logit contributions
 s+0.015
 never+0.014
 living+0.014
 t+0.013
 loud+0.013
Neg logit contributions
ory-0.023
ete-0.022
icking-0.021
ails-0.020
ement-0.020
Token worry
Token ID5490
Feature activation-5.13e-01

Pos logit contributions
ete+0.028
icking+0.028
ory+0.026
 Derek+0.024
oming+0.024
Neg logit contributions
 wished-0.018
 living-0.017
 s-0.017
 gir-0.016
 many-0.015
Token,
Token ID11
Feature activation+3.25e-01

Pos logit contributions
 wished+0.030
 s+0.030
 living+0.029
 t+0.028
 sharp+0.028
Neg logit contributions
ete-0.043
icking-0.042
ement-0.039
ory-0.039
ancer-0.038
Token Jo
Token ID5302
Feature activation-1.61e+00

Pos logit contributions
ete+0.024
icking+0.023
ement+0.021
ancer+0.021
ola+0.020
Neg logit contributions
 wished-0.022
 s-0.022
 living-0.021
 bark-0.020
 t-0.020
TokenJo
Token ID9908
Feature activation-5.16e+00

Pos logit contributions
 s+0.088
 wished+0.088
 living+0.084
 t+0.081
 sharp+0.081
Neg logit contributions
ete-0.142
icking-0.139
ement-0.130
ory-0.128
ancer-0.127
Token.
Token ID13
Feature activation-2.94e-01

Pos logit contributions
 s+0.242
 wished+0.241
 living+0.228
 t+0.220
 sharp+0.219
Neg logit contributions
ete-0.493
icking-0.485
ement-0.457
ory-0.452
ancer-0.443
Token I
Token ID314
Feature activation+7.60e-01

Pos logit contributions
 s+0.018
 wished+0.018
 living+0.018
 t+0.017
 snow+0.016
Neg logit contributions
ete-0.024
icking-0.023
ement-0.023
ory-0.022
oming-0.021
Token can
Token ID460
Feature activation+4.09e-01

Pos logit contributions
ete+0.069
icking+0.066
ement+0.064
ory+0.062
oming+0.061
Neg logit contributions
 wished-0.041
 s-0.041
 living-0.039
 t-0.038
 happy-0.037
Token help
Token ID1037
Feature activation+1.02e+00

Pos logit contributions
ete+0.041
icking+0.039
ement+0.039
ory+0.038
ancer+0.037
Neg logit contributions
 s-0.019
 wished-0.017
 t-0.017
 living-0.017
 never-0.016
Token you
Token ID345
Feature activation+5.18e-01

Pos logit contributions
ete+0.084
icking+0.081
ement+0.078
ory+0.076
+0.074
Neg logit contributions
 s-0.063
 wished-0.061
 living-0.061
 t-0.059
 sharp-0.058
TokenHis
Token ID6653
Feature activation+5.61e-01

Pos logit contributions
 wished+0.012
 s+0.012
 mess+0.011
 living+0.011
 t+0.011
Neg logit contributions
ete-0.021
icking-0.019
ement-0.019
ola-0.018
iring-0.018
Token mom
Token ID1995
Feature activation-3.95e-01

Pos logit contributions
ete+0.053
icking+0.052
ement+0.049
ory+0.049
ancer+0.048
Neg logit contributions
 s-0.027
 wished-0.027
 living-0.026
 t-0.025
 sharp-0.025
Token looked
Token ID3114
Feature activation-7.53e-01

Pos logit contributions
 s+0.017
 wished+0.017
 sharp+0.016
 living+0.016
 soft+0.015
Neg logit contributions
ete-0.038
icking-0.038
ement-0.035
ory-0.035
ancer-0.034
Token at
Token ID379
Feature activation-6.77e-01

Pos logit contributions
 wished+0.042
 s+0.041
 living+0.039
 t+0.038
 sharp+0.038
Neg logit contributions
ete-0.064
icking-0.063
ement-0.060
ory-0.059
oming-0.058
Token Jo
Token ID5302
Feature activation-1.62e+00

Pos logit contributions
 wished+0.041
 s+0.041
 living+0.040
 sharp+0.038
 t+0.038
Neg logit contributions
ete-0.055
icking-0.053
ement-0.050
ory-0.049
ancer-0.048
TokenJo
Token ID9908
Feature activation-5.16e+00

Pos logit contributions
 wished+0.081
 s+0.077
 sharp+0.073
 living+0.070
 many+0.068
Neg logit contributions
icking-0.151
ete-0.148
ement-0.142
ory-0.136
oming-0.133
Token and
Token ID290
Feature activation-1.36e-01

Pos logit contributions
 wished+0.253
 s+0.249
 sharp+0.236
 living+0.229
 t+0.226
Neg logit contributions
icking-0.484
ete-0.472
ement-0.452
ory-0.430
oming-0.418
Token said
Token ID531
Feature activation+5.58e-01

Pos logit contributions
 s+0.007
 living+0.006
 wished+0.006
 sharp+0.006
 mess+0.006
Neg logit contributions
ete-0.012
icking-0.012
ement-0.011
ory-0.011
 Derek-0.011
Token "
Token ID366
Feature activation-7.25e-01

Pos logit contributions
ete+0.046
icking+0.045
ement+0.042
ory+0.041
oming+0.040
Neg logit contributions
 wished-0.033
 s-0.033
 living-0.031
 snow-0.030
 t-0.030
TokenDon
Token ID3987
Feature activation-2.55e-01

Pos logit contributions
 s+0.042
 wished+0.040
 living+0.040
 t+0.039
 sharp+0.038
Neg logit contributions
ete-0.060
icking-0.059
ancer-0.056
ement-0.055
ory-0.055
Token't
Token ID470
Feature activation+3.26e-01

Pos logit contributions
 s+0.015
 never+0.014
 living+0.014
 t+0.013
 loud+0.013
Neg logit contributions
ory-0.023
ete-0.022
icking-0.021
ails-0.020
ement-0.020
Token with
Token ID351
Feature activation+7.85e-03

Pos logit contributions
 wished+0.007
 s+0.007
 living+0.006
 sharp+0.006
 never+0.006
Neg logit contributions
ete-0.012
icking-0.011
ement-0.011
-0.011
ory-0.011
Token her
Token ID607
Feature activation-1.41e+00

Pos logit contributions
avored+0.001
ions+0.001
ays+0.001
icking+0.001
iring+0.001
Neg logit contributions
 wished-0.000
 never-0.000
 dreamed-0.000
 knocking-0.000
 happy-0.000
Token toys
Token ID14958
Feature activation-6.86e-01

Pos logit contributions
 wished+0.058
 s+0.052
 never+0.052
 knocking+0.051
 bang+0.045
Neg logit contributions
icking-0.135
-0.130
avored-0.128
ement-0.128
iring-0.126
Token and
Token ID290
Feature activation+1.21e-01

Pos logit contributions
 s+0.027
 wished+0.027
 living+0.026
 t+0.025
 sharp+0.024
Neg logit contributions
ete-0.072
icking-0.070
ement-0.066
ory-0.066
ancer-0.065
Token c
Token ID269
Feature activation-1.31e+00

Pos logit contributions
icking+0.009
glers+0.009
 Derek+0.008
ete+0.008
ory+0.008
Neg logit contributions
 wished-0.006
 s-0.005
 accident-0.005
 living-0.005
 sharp-0.005
Tokenuddle
Token ID24500
Feature activation+6.49e+00

Pos logit contributions
 s+0.120
 wished+0.116
 living+0.115
 t+0.113
 sharp+0.112
Neg logit contributions
ete-0.070
icking-0.063
ement-0.058
ory-0.057
ancer-0.056
Token with
Token ID351
Feature activation+2.24e-02

Pos logit contributions
ete+0.640
icking+0.614
ement+0.584
ory+0.582
ancer+0.572
Neg logit contributions
 s-0.313
 wished-0.297
 living-0.291
 t-0.284
 sharp-0.277
Token her
Token ID607
Feature activation-1.38e+00

Pos logit contributions
ete+0.002
icking+0.002
ory+0.002
ement+0.002
ola+0.002
Neg logit contributions
 wished-0.001
 s-0.001
 living-0.001
 knocking-0.001
 bark-0.001
Token t
Token ID256
Feature activation-2.92e+00

Pos logit contributions
 wished+0.047
 s+0.044
 never+0.039
 knocking+0.039
 living+0.038
Neg logit contributions
icking-0.143
ete-0.141
ement-0.138
ory-0.136
-0.135
Tokeneddy
Token ID21874
Feature activation+4.44e-02

Pos logit contributions
 s+0.219
 t+0.217
 living+0.215
 toy+0.213
 wished+0.212
Neg logit contributions
ete-0.203
icking-0.180
ement-0.174
-0.167
oming-0.165
Token bear
Token ID6842
Feature activation-1.52e-01

Pos logit contributions
icking+0.004
ete+0.004
ory+0.004
ement+0.004
ancer+0.004
Neg logit contributions
 wished-0.002
 s-0.002
 living-0.002
 sharp-0.002
 never-0.002
Token "
Token ID366
Feature activation-6.32e-01

Pos logit contributions
 s+0.015
 wished+0.014
 living+0.014
 t+0.013
 happy+0.013
Neg logit contributions
icking-0.029
ete-0.029
ory-0.027
ement-0.027
oming-0.026
TokenThat
Token ID2504
Feature activation-1.99e+00

Pos logit contributions
 s+0.037
 wished+0.036
 living+0.035
 sharp+0.034
 t+0.034
Neg logit contributions
ete-0.053
icking-0.050
ory-0.049
ement-0.048
ancer-0.048
Token's
Token ID338
Feature activation-7.04e-01

Pos logit contributions
 wished+0.085
 s+0.081
 living+0.075
 sharp+0.072
 happy+0.071
Neg logit contributions
ete-0.195
icking-0.192
ory-0.182
ement-0.182
ancer-0.175
Token a
Token ID257
Feature activation+6.96e-02

Pos logit contributions
 s+0.022
 wished+0.022
 living+0.020
 t+0.019
 sharp+0.019
Neg logit contributions
ete-0.080
icking-0.077
ement-0.074
ory-0.074
ancer-0.073
Token good
Token ID922
Feature activation-1.30e+00

Pos logit contributions
ete+0.007
ory+0.007
ement+0.007
icking+0.007
+0.007
Neg logit contributions
 wished-0.002
 s-0.002
 living-0.002
 r-0.002
 knocking-0.002
Token choice
Token ID3572
Feature activation+5.90e+00

Pos logit contributions
 wished+0.075
 s+0.071
 living+0.068
 sharp+0.065
 happy+0.064
Neg logit contributions
ete-0.109
icking-0.105
ory-0.104
ement-0.099
ails-0.095
Token,"
Token ID553
Feature activation-2.09e+00

Pos logit contributions
ete+0.560
icking+0.544
ement+0.519
ory+0.513
ancer+0.500
Neg logit contributions
 s-0.282
 wished-0.280
 living-0.262
 t-0.254
 sharp-0.251
Token he
Token ID339
Feature activation+6.62e-02

Pos logit contributions
 wished+0.102
 s+0.102
 living+0.095
 t+0.093
 sharp+0.091
Neg logit contributions
ete-0.201
icking-0.197
ement-0.184
ory-0.184
ancer-0.180
Token said
Token ID531
Feature activation+5.41e-01

Pos logit contributions
ete+0.006
icking+0.006
ory+0.006
ement+0.006
ola+0.005
Neg logit contributions
 s-0.003
 living-0.003
 wished-0.003
 t-0.003
 snow-0.003
Token.
Token ID13
Feature activation-2.95e-01

Pos logit contributions
ete+0.058
icking+0.056
ement+0.054
ory+0.053
ancer+0.052
Neg logit contributions
 s-0.020
 wished-0.020
 living-0.018
 t-0.017
 sharp-0.017
Token\n
Token ID198
Feature activation-2.36e-01

Pos logit contributions
 s+0.011
 wished+0.011
 living+0.010
 t+0.010
 happy+0.010
Neg logit contributions
icking-0.031
ete-0.031
ory-0.029
ement-0.029
ancer-0.029
Token He
Token ID679
Feature activation+4.73e-01

Pos logit contributions
 s+0.003
 wished+0.002
 living+0.002
 t+0.002
 soft+0.002
Neg logit contributions
ete-0.005
icking-0.005
ement-0.004
oming-0.004
ory-0.004
Token knew
Token ID2993
Feature activation-1.26e+00

Pos logit contributions
ete+0.049
icking+0.047
ory+0.046
ement+0.044
ola+0.044
Neg logit contributions
 living-0.017
 s-0.017
 wished-0.016
 soft-0.015
 snow-0.015
Token he
Token ID339
Feature activation+1.35e-01

Pos logit contributions
 s+0.052
 wished+0.051
 t+0.051
 living+0.051
 snow+0.049
Neg logit contributions
ete-0.130
icking-0.123
ory-0.122
ement-0.121
oming-0.121
Token had
Token ID550
Feature activation-3.95e-01

Pos logit contributions
ete+0.015
icking+0.014
ement+0.014
ory+0.014
oming+0.013
Neg logit contributions
 wished-0.005
 s-0.005
 living-0.004
 t-0.004
 sharp-0.004
Token no
Token ID645
Feature activation-1.65e-01

Pos logit contributions
 wished+0.018
 s+0.018
 living+0.016
 soft+0.016
 happy+0.016
Neg logit contributions
ete-0.036
icking-0.034
ory-0.032
ails-0.031
ola-0.031
Token choice
Token ID3572
Feature activation+5.97e+00

Pos logit contributions
 wished+0.008
 s+0.007
 living+0.007
 t+0.007
 sharp+0.006
Neg logit contributions
ete-0.016
icking-0.016
ement-0.015
ory-0.015
ancer-0.014
Token.
Token ID13
Feature activation-6.43e-02

Pos logit contributions
ete+0.636
icking+0.617
ory+0.593
ement+0.584
oming+0.580
Neg logit contributions
 s-0.223
 wished-0.220
 living-0.216
 t-0.205
 sharp-0.202
Token He
Token ID679
Feature activation+4.50e-01

Pos logit contributions
 s+0.003
 wished+0.003
 living+0.003
 t+0.003
 soft+0.003
Neg logit contributions
ete-0.006
icking-0.006
oming-0.006
ement-0.006
ory-0.006
Token decided
Token ID3066
Feature activation+1.26e+00

Pos logit contributions
ete+0.046
icking+0.044
ory+0.043
ement+0.041
ola+0.041
Neg logit contributions
 living-0.018
 s-0.017
 soft-0.015
 wished-0.015
 happy-0.015
Token to
Token ID284
Feature activation-5.62e-01

Pos logit contributions
ete+0.117
icking+0.114
ory+0.107
ement+0.104
ails+0.103
Neg logit contributions
 s-0.059
 wished-0.057
 living-0.054
 snow-0.052
 t-0.052
Token surrender
Token ID16908
Feature activation+2.12e+00

Pos logit contributions
 s+0.026
 wished+0.026
 living+0.024
 t+0.024
 sharp+0.023
Neg logit contributions
ete-0.054
icking-0.053
ement-0.050
ory-0.050
ancer-0.049
Token stepped
Token ID10764
Feature activation+2.63e-02

Pos logit contributions
ete+0.075
icking+0.075
ory+0.070
ement+0.068
ancer+0.067
Neg logit contributions
 s-0.026
 living-0.025
 wished-0.025
 snow-0.023
 sharp-0.023
Token in
Token ID287
Feature activation-8.43e-05

Pos logit contributions
ete+0.003
icking+0.003
ement+0.002
ancer+0.002
ory+0.002
Neg logit contributions
 s-0.001
 living-0.001
 t-0.001
 sharp-0.001
 r-0.001
Token the
Token ID262
Feature activation+9.88e-02

Pos logit contributions
 wished+0.000
 s+0.000
 living+0.000
 sharp+0.000
 gir+0.000
Neg logit contributions
ete-0.000
icking-0.000
ement-0.000
-0.000
ory-0.000
Token deep
Token ID2769
Feature activation+7.71e-01

Pos logit contributions
ete+0.009
icking+0.009
ement+0.008
ory+0.008
+0.008
Neg logit contributions
 wished-0.005
 s-0.004
 never-0.004
 t-0.004
 happy-0.004
Token p
Token ID279
Feature activation-1.86e+00

Pos logit contributions
ete+0.072
icking+0.069
ory+0.067
ement+0.067
oming+0.065
Neg logit contributions
 wished-0.037
 s-0.036
 living-0.034
 t-0.033
 never-0.032
Tokenuddle
Token ID24500
Feature activation+6.39e+00

Pos logit contributions
 s+0.187
 living+0.179
 r+0.179
 mess+0.177
 snow+0.176
Neg logit contributions
ete-0.084
icking-0.071
ancer-0.070
avored-0.065
ement-0.062
Token.
Token ID13
Feature activation-3.57e-01

Pos logit contributions
ete+0.678
icking+0.653
ory+0.627
ement+0.624
ancer+0.614
Neg logit contributions
 s-0.247
 wished-0.239
 living-0.229
 t-0.218
 sharp-0.213
Token\n
Token ID198
Feature activation-3.44e-01

Pos logit contributions
 s+0.018
 wished+0.017
 living+0.017
 t+0.016
 sharp+0.016
Neg logit contributions
ete-0.034
icking-0.033
ement-0.031
ory-0.031
ancer-0.030
Token\n
Token ID198
Feature activation-3.67e-01

Pos logit contributions
 wished+0.016
 s+0.016
 t+0.015
 living+0.015
 sharp+0.015
Neg logit contributions
ete-0.033
icking-0.031
ement-0.030
ory-0.029
ola-0.029
Token"
Token ID1
Feature activation-4.44e-02

Pos logit contributions
 s+0.021
 wished+0.020
 t+0.020
 living+0.019
 sharp+0.019
Neg logit contributions
ete-0.033
icking-0.030
ola-0.029
umber-0.028
-0.028
TokenMom
Token ID29252
Feature activation+4.80e-01

Pos logit contributions
 s+0.003
 living+0.003
 wished+0.003
 t+0.002
 sharp+0.002
Neg logit contributions
ete-0.004
icking-0.003
ory-0.003
ola-0.003
-0.003
Token land
Token ID1956
Feature activation-3.59e-01

Pos logit contributions
 living+0.040
 wished+0.040
 s+0.040
 sharp+0.036
 t+0.036
Neg logit contributions
ete-0.083
icking-0.080
ory-0.076
ement-0.075
-0.073
Token in
Token ID287
Feature activation+3.19e-01

Pos logit contributions
 s+0.016
 loud+0.016
 happy+0.015
 many+0.015
 living+0.015
Neg logit contributions
ete-0.039
ory-0.035
-0.035
icking-0.035
ement-0.034
Token a
Token ID257
Feature activation+2.09e-01

Pos logit contributions
ete+0.032
icking+0.031
ement+0.030
+0.029
ory+0.029
Neg logit contributions
 wished-0.013
 s-0.012
 living-0.012
 bark-0.011
 r-0.011
Token deep
Token ID2769
Feature activation+8.30e-01

Pos logit contributions
ete+0.019
oming+0.017
uddle+0.017
+0.017
umber+0.017
Neg logit contributions
 wished-0.009
 feelings-0.008
 accident-0.008
 listening-0.008
 wanted-0.008
Token p
Token ID279
Feature activation-1.87e+00

Pos logit contributions
ete+0.071
ement+0.067
icking+0.066
ory+0.066
oming+0.064
Neg logit contributions
 wished-0.044
 s-0.041
 living-0.040
 t-0.038
 knocking-0.037
Tokenuddle
Token ID24500
Feature activation+6.63e+00

Pos logit contributions
 s+0.184
 living+0.174
 mess+0.174
 r+0.173
 sharp+0.172
Neg logit contributions
ete-0.087
icking-0.077
ancer-0.075
avored-0.071
ory-0.069
Token of
Token ID286
Feature activation+4.69e-02

Pos logit contributions
ete+0.714
icking+0.692
ory+0.661
ement+0.660
ancer+0.647
Neg logit contributions
 s-0.244
 wished-0.238
 living-0.224
 t-0.212
 sharp-0.207
Token water
Token ID1660
Feature activation+9.69e-01

Pos logit contributions
ete+0.004
icking+0.004
ory+0.004
ement+0.004
oming+0.004
Neg logit contributions
 wished-0.002
 s-0.002
 never-0.002
 bark-0.002
 dreamed-0.002
Token.
Token ID13
Feature activation+3.48e-02

Pos logit contributions
ete+0.102
icking+0.100
ement+0.095
ory+0.094
oming+0.092
Neg logit contributions
 wished-0.037
 s-0.036
 living-0.034
 t-0.032
 sharp-0.032
Token They
Token ID1119
Feature activation-4.01e-01

Pos logit contributions
ete+0.004
icking+0.003
ement+0.003
ory+0.003
ancer+0.003
Neg logit contributions
 s-0.002
 wished-0.001
 living-0.001
 t-0.001
 snow-0.001
Token are
Token ID389
Feature activation-1.36e-01

Pos logit contributions
 wished+0.020
 living+0.020
 s+0.020
 toy+0.018
 sharp+0.018
Neg logit contributions
ete-0.035
icking-0.034
ory-0.032
oming-0.031
ola-0.031
Token to
Token ID284
Feature activation-5.05e-01

Pos logit contributions
ete+0.139
icking+0.134
ement+0.129
ory+0.127
ancer+0.124
Neg logit contributions
 s-0.056
 wished-0.055
 living-0.052
 t-0.050
 sharp-0.048
Token their
Token ID511
Feature activation-1.34e+00

Pos logit contributions
 wished+0.024
 s+0.021
 sharp+0.020
 living+0.020
 gir+0.020
Neg logit contributions
ete-0.047
icking-0.045
ory-0.043
-0.042
ancer-0.042
Token room
Token ID2119
Feature activation-1.21e+00

Pos logit contributions
 wished+0.059
 s+0.057
 sharp+0.052
 knocking+0.050
 bang+0.049
Neg logit contributions
icking-0.126
ete-0.126
ory-0.118
ement-0.115
iring-0.113
Token and
Token ID290
Feature activation+1.33e-01

Pos logit contributions
 wished+0.055
 s+0.053
 sharp+0.049
 living+0.048
 snow+0.047
Neg logit contributions
icking-0.116
ete-0.115
ement-0.107
ory-0.107
ails-0.104
Token sleep
Token ID3993
Feature activation-9.19e-01

Pos logit contributions
ete+0.012
icking+0.012
ory+0.011
ancer+0.011
ola+0.011
Neg logit contributions
 s-0.006
 wished-0.006
 gir-0.006
 sharp-0.006
 living-0.005
Token for
Token ID329
Feature activation+2.27e+00

Pos logit contributions
 wished+0.041
 s+0.039
 living+0.036
 snow+0.036
 sharp+0.035
Neg logit contributions
ete-0.088
icking-0.088
ement-0.083
ory-0.082
ancer-0.081
Token a
Token ID257
Feature activation+9.00e-02

Pos logit contributions
ete+0.215
icking+0.200
ory+0.196
ola+0.193
ement+0.192
Neg logit contributions
 wished-0.100
 s-0.089
 living-0.087
 gir-0.086
 bark-0.083
Token little
Token ID1310
Feature activation-1.78e+00

Pos logit contributions
ete+0.008
ory+0.008
ancer+0.007
+0.007
icking+0.007
Neg logit contributions
 wished-0.005
 feelings-0.004
 s-0.004
 never-0.004
 chased-0.004
Token while
Token ID981
Feature activation-9.44e-01

Pos logit contributions
 wished+0.096
 s+0.093
 never+0.090
 sharp+0.089
 gir+0.083
Neg logit contributions
ory-0.134
icking-0.130
ancer-0.128
-0.125
ete-0.124
Token.
Token ID13
Feature activation+9.60e-02

Pos logit contributions
 s+0.031
 wished+0.029
 living+0.028
 t+0.026
 happy+0.026
Neg logit contributions
ete-0.106
icking-0.103
ement-0.099
ory-0.099
oming-0.097
Token\n
Token ID198
Feature activation+9.69e-02

Pos logit contributions
ete+0.009
icking+0.009
ement+0.009
ory+0.009
ancer+0.008
Neg logit contributions
 wished-0.004
 s-0.004
 living-0.004
 t-0.004
 sharp-0.004
Token.
Token ID13
Feature activation-2.11e-01

Pos logit contributions
ete+0.064
icking+0.063
ement+0.060
ory+0.059
ancer+0.059
Neg logit contributions
 s-0.022
 wished-0.021
 living-0.020
 t-0.019
 sharp-0.019
Token\n
Token ID198
Feature activation-1.19e-01

Pos logit contributions
 s+0.012
 wished+0.011
 living+0.011
 t+0.011
 soft+0.010
Neg logit contributions
ete-0.019
icking-0.018
ory-0.017
ement-0.017
oming-0.017
TokenHe
Token ID1544
Feature activation+2.04e-01

Pos logit contributions
 wished+0.007
 s+0.007
 t+0.007
 living+0.006
 mess+0.006
Neg logit contributions
ete-0.010
icking-0.009
ement-0.009
ails-0.009
iring-0.009
Token kept
Token ID4030
Feature activation+7.45e-01

Pos logit contributions
ete+0.020
icking+0.019
ory+0.018
ement+0.018
ola+0.017
Neg logit contributions
 living-0.009
 s-0.008
 wished-0.008
 soft-0.008
 happy-0.008
Token asking
Token ID4737
Feature activation-1.97e+00

Pos logit contributions
icking+0.064
ete+0.063
+0.060
 Derek+0.060
ipping+0.060
Neg logit contributions
 wished-0.035
 living-0.030
 songs-0.030
 cob-0.030
 s-0.029
Token for
Token ID329
Feature activation+2.21e+00

Pos logit contributions
 s+0.094
 wished+0.093
 living+0.088
 t+0.086
 sharp+0.084
Neg logit contributions
ete-0.190
icking-0.184
ement-0.175
ory-0.174
ancer-0.170
Token a
Token ID257
Feature activation-2.47e-02

Pos logit contributions
ete+0.197
icking+0.187
ory+0.181
ement+0.180
ancer+0.174
Neg logit contributions
 wished-0.102
 living-0.096
 s-0.095
 mess-0.089
 r-0.086
Token muff
Token ID27563
Feature activation-3.22e+00

Pos logit contributions
 wished+0.001
 s+0.001
 dreamed+0.001
 never+0.001
 living+0.001
Neg logit contributions
ete-0.002
ory-0.002
-0.002
icking-0.002
ole-0.002
Tokenin
Token ID259
Feature activation-3.16e-01

Pos logit contributions
 living+0.210
 wished+0.202
 happy+0.201
 s+0.200
 sharp+0.200
Neg logit contributions
ete-0.244
icking-0.229
ancing-0.218
-0.215
 Derek-0.210
Token.
Token ID13
Feature activation-2.04e-01

Pos logit contributions
 s+0.014
 wished+0.014
 living+0.013
 t+0.013
 sharp+0.012
Neg logit contributions
ete-0.032
icking-0.031
ement-0.029
ory-0.029
oming-0.029
Token\n
Token ID198
Feature activation-1.05e-01

Pos logit contributions
 s+0.011
 living+0.010
 wished+0.010
 t+0.010
 soft+0.009
Neg logit contributions
ete-0.019
icking-0.018
ory-0.018
ement-0.017
oming-0.017
Token saw
Token ID2497
Feature activation-6.28e-01

Pos logit contributions
ete+0.049
icking+0.048
ory+0.045
ement+0.044
ola+0.044
Neg logit contributions
 s-0.020
 wished-0.019
 living-0.019
 sharp-0.018
 bark-0.017
Token the
Token ID262
Feature activation+5.83e-01

Pos logit contributions
 wished+0.022
 s+0.022
 living+0.020
 t+0.019
 bark+0.019
Neg logit contributions
ete-0.066
icking-0.064
ement-0.062
ory-0.061
oming-0.060
Token cat
Token ID3797
Feature activation-3.83e-01

Pos logit contributions
ete+0.058
icking+0.058
ement+0.056
ory+0.055
ancer+0.054
Neg logit contributions
 wished-0.023
 s-0.022
 living-0.019
 bark-0.019
 r-0.019
Token trying
Token ID2111
Feature activation-7.82e-02

Pos logit contributions
 wished+0.017
 s+0.016
 sharp+0.015
 t+0.014
 living+0.014
Neg logit contributions
icking-0.037
ete-0.036
ement-0.034
oming-0.034
ory-0.033
Token to
Token ID284
Feature activation-6.55e-01

Pos logit contributions
 wished+0.004
 s+0.004
 living+0.003
 t+0.003
 gir+0.003
Neg logit contributions
ete-0.007
icking-0.007
ory-0.006
ement-0.006
ails-0.006
Token catch
Token ID4929
Feature activation+2.17e+00

Pos logit contributions
 s+0.030
 wished+0.029
 living+0.028
 t+0.027
 sharp+0.026
Neg logit contributions
ete-0.064
icking-0.063
ement-0.060
ory-0.059
ancer-0.058
Token the
Token ID262
Feature activation+1.15e-01

Pos logit contributions
ete+0.222
icking+0.216
ement+0.206
ory+0.205
ancer+0.200
Neg logit contributions
 s-0.089
 wished-0.088
 living-0.082
 t-0.079
 sharp-0.077
Token bird
Token ID6512
Feature activation-9.56e-01

Pos logit contributions
icking+0.011
ete+0.011
ement+0.010
ancer+0.010
ory+0.010
Neg logit contributions
 wished-0.005
 s-0.004
 living-0.004
 never-0.004
 happy-0.004
Tokenie
Token ID494
Feature activation+1.97e+00

Pos logit contributions
 s+0.037
 wished+0.037
 living+0.035
 t+0.033
 sharp+0.032
Neg logit contributions
ete-0.100
icking-0.097
ement-0.092
ory-0.092
ancer-0.090
Token.
Token ID13
Feature activation-3.02e-01

Pos logit contributions
ete+0.203
icking+0.199
ement+0.189
ory+0.188
ancer+0.183
Neg logit contributions
 s-0.078
 wished-0.077
 living-0.071
 t-0.068
 sharp-0.067
Token She
Token ID1375
Feature activation-6.79e-02

Pos logit contributions
 s+0.014
 wished+0.014
 living+0.013
 bark+0.013
 t+0.013
Neg logit contributions
icking-0.029
ete-0.029
ement-0.027
ory-0.027
ancer-0.027
Token seal
Token ID13810
Feature activation+4.35e-01

Pos logit contributions
ete+0.081
ory+0.077
icking+0.076
oming+0.075
ancer+0.074
Neg logit contributions
 wished-0.032
 s-0.030
 living-0.027
 cob-0.027
 knocking-0.027
Token said
Token ID531
Feature activation+6.93e-01

Pos logit contributions
ete+0.041
icking+0.041
ory+0.039
ails+0.036
umber+0.036
Neg logit contributions
 living-0.019
 s-0.018
 soft-0.017
 snow-0.017
 knocking-0.016
Token,
Token ID11
Feature activation+4.98e-01

Pos logit contributions
ete+0.058
icking+0.057
ement+0.053
ory+0.052
oming+0.051
Neg logit contributions
 wished-0.040
 s-0.039
 living-0.038
 t-0.036
 mess-0.036
Token "
Token ID366
Feature activation-5.58e-01

Pos logit contributions
ete+0.040
icking+0.038
ancer+0.036
ement+0.036
oming+0.035
Neg logit contributions
 wished-0.028
 mess-0.027
 dreamed-0.025
 living-0.025
 s-0.025
TokenPlease
Token ID5492
Feature activation+1.76e+00

Pos logit contributions
 s+0.038
 living+0.037
 t+0.036
 soft+0.035
 sharp+0.035
Neg logit contributions
ete-0.040
ory-0.038
icking-0.038
umber-0.036
ancer-0.036
Token forgive
Token ID20927
Feature activation+2.55e+00

Pos logit contributions
ete+0.185
ement+0.183
icking+0.174
ays+0.173
ancer+0.171
Neg logit contributions
 s-0.078
 t-0.074
 r-0.071
 wished-0.070
 bark-0.070
Token me
Token ID502
Feature activation+3.51e-01

Pos logit contributions
ete+0.245
icking+0.240
ement+0.228
ory+0.223
ancer+0.221
Neg logit contributions
 s-0.124
 wished-0.122
 living-0.117
 t-0.115
 sharp-0.113
Token,
Token ID11
Feature activation+4.87e-01

Pos logit contributions
ete+0.034
icking+0.033
ement+0.032
ory+0.031
ancer+0.031
Neg logit contributions
 s-0.017
 wished-0.016
 living-0.015
 t-0.015
 sharp-0.015
Token I
Token ID314
Feature activation+8.19e-01

Pos logit contributions
ete+0.049
icking+0.048
ement+0.048
ails+0.046
+0.046
Neg logit contributions
 s-0.023
 t-0.022
 wished-0.021
 happy-0.021
 sharp-0.021
Token just
Token ID655
Feature activation+1.51e+00

Pos logit contributions
ete+0.080
icking+0.077
ement+0.074
ory+0.072
ancer+0.071
Neg logit contributions
 s-0.039
 wished-0.039
 living-0.036
 t-0.036
 r-0.035
Token wanted
Token ID2227
Feature activation-1.87e+00

Pos logit contributions
ete+0.139
icking+0.139
ory+0.133
ancer+0.127
ola+0.126
Neg logit contributions
 s-0.065
 wished-0.062
 living-0.062
 snow-0.058
 sharp-0.058
Token you
Token ID345
Feature activation+5.14e-01

Pos logit contributions
 living+0.017
 s+0.017
 never+0.016
 sharp+0.015
 loud+0.015
Neg logit contributions
icking-0.042
oured-0.040
ete-0.039
ory-0.039
ays-0.038
Token.
Token ID13
Feature activation-3.10e-01

Pos logit contributions
ete+0.049
icking+0.048
ement+0.048
ays+0.046
+0.046
Neg logit contributions
 s-0.025
 living-0.023
 t-0.022
 happy-0.022
 soft-0.022
Token I
Token ID314
Feature activation+7.83e-01

Pos logit contributions
 living+0.016
 s+0.016
 wished+0.016
 never+0.015
 t+0.015
Neg logit contributions
icking-0.028
ete-0.027
ement-0.027
ory-0.027
oming-0.025
Token'm
Token ID1101
Feature activation+8.54e-01

Pos logit contributions
ete+0.073
icking+0.071
ory+0.066
ement+0.066
ancer+0.065
Neg logit contributions
 s-0.040
 wished-0.039
 living-0.037
 t-0.036
 sharp-0.036
Token sorry
Token ID7926
Feature activation-3.33e-01

Pos logit contributions
ete+0.093
icking+0.090
ory+0.087
ement+0.087
ancer+0.085
Neg logit contributions
 s-0.031
 wished-0.030
 living-0.029
 happy-0.028
 t-0.027
Token for
Token ID329
Feature activation+2.17e+00

Pos logit contributions
 s+0.019
 living+0.018
 wished+0.018
 t+0.018
 happy+0.017
Neg logit contributions
ete-0.030
icking-0.028
ement-0.028
ory-0.027
-0.027
Token being
Token ID852
Feature activation-2.57e+00

Pos logit contributions
ete+0.197
icking+0.193
ement+0.175
oming+0.174
+0.173
Neg logit contributions
 wished-0.106
 s-0.101
 living-0.092
 r-0.091
 dreamed-0.090
Token sad
Token ID6507
Feature activation-1.15e-01

Pos logit contributions
 s+0.113
 wished+0.111
 living+0.105
 t+0.102
 snow+0.099
Neg logit contributions
ete-0.254
icking-0.248
ement-0.234
ory-0.234
oming-0.228
Token earlier
Token ID2961
Feature activation+1.57e+00

Pos logit contributions
 wished+0.005
 s+0.005
 living+0.005
 t+0.005
 sharp+0.005
Neg logit contributions
ete-0.011
icking-0.011
ement-0.010
ory-0.010
ancer-0.010
Token."
Token ID526
Feature activation-1.52e+00

Pos logit contributions
ete+0.150
icking+0.146
ement+0.140
ory+0.137
ancer+0.135
Neg logit contributions
 s-0.070
 wished-0.069
 living-0.065
 t-0.063
 sharp-0.063
Token The
Token ID383
Feature activation+6.65e-01

Pos logit contributions
 s+0.062
 wished+0.057
 living+0.055
 bark+0.053
 happy+0.053
Neg logit contributions
ete-0.157
icking-0.156
ement-0.148
ory-0.148
ancer-0.148
Token used
Token ID973
Feature activation+1.72e+00

Pos logit contributions
ete+0.176
icking+0.169
oming+0.163
iring+0.161
ement+0.160
Neg logit contributions
 wished-0.061
 s-0.059
 t-0.058
 living-0.058
 bark-0.057
Token to
Token ID284
Feature activation-7.04e-01

Pos logit contributions
ete+0.179
icking+0.174
ement+0.165
ory+0.162
oming+0.160
Neg logit contributions
 wished-0.067
 s-0.067
 living-0.062
 t-0.059
 bark-0.057
Token being
Token ID852
Feature activation-2.59e+00

Pos logit contributions
 wished+0.034
 s+0.033
 living+0.032
 t+0.030
 sharp+0.029
Neg logit contributions
ete-0.066
icking-0.065
ory-0.060
ement-0.059
oming-0.059
Token angry
Token ID7954
Feature activation-2.04e-01

Pos logit contributions
 s+0.087
 wished+0.087
 living+0.078
 t+0.077
 bark+0.073
Neg logit contributions
ete-0.280
icking-0.272
ory-0.258
ement-0.256
oming-0.252
Token,
Token ID11
Feature activation+3.30e-01

Pos logit contributions
 s+0.008
 wished+0.008
 living+0.008
 t+0.007
 sharp+0.007
Neg logit contributions
ete-0.021
icking-0.021
ory-0.020
ement-0.020
-0.019
Token but
Token ID475
Feature activation+8.76e-01

Pos logit contributions
ete+0.031
icking+0.030
ement+0.028
ory+0.028
oming+0.028
Neg logit contributions
 wished-0.015
 s-0.015
 living-0.015
 t-0.014
 sharp-0.014
Token the
Token ID262
Feature activation+1.94e-01

Pos logit contributions
ete+0.086
icking+0.084
ement+0.080
ory+0.080
ancer+0.078
Neg logit contributions
 wished-0.041
 s-0.040
 living-0.038
 t-0.037
 happy-0.036
Token so
Token ID523
Feature activation+1.71e+00

Pos logit contributions
ete+0.012
ement+0.011
icking+0.011
ory+0.011
oming+0.010
Neg logit contributions
 wished-0.014
 s-0.013
 living-0.013
 bark-0.013
 knocking-0.013
Tokenapy
Token ID12826
Feature activation+2.26e-01

Pos logit contributions
ete+0.184
icking+0.180
ement+0.171
ory+0.170
ancer+0.167
Neg logit contributions
 s-0.061
 wished-0.060
 living-0.056
 t-0.053
 sharp-0.051
Token water
Token ID1660
Feature activation+8.01e-01

Pos logit contributions
ete+0.023
icking+0.023
ory+0.021
ement+0.021
ancer+0.021
Neg logit contributions
 s-0.010
 wished-0.010
 living-0.009
 r-0.009
 sharp-0.009
Token was
Token ID373
Feature activation+1.86e-01

Pos logit contributions
ete+0.082
icking+0.078
ory+0.076
ement+0.076
ancer+0.076
Neg logit contributions
 s-0.035
 wished-0.035
 living-0.032
 r-0.031
 t-0.031
Token Lily
Token ID20037
Feature activation-5.36e-01

Pos logit contributions
ete+0.054
icking+0.053
ement+0.051
ancer+0.049
ory+0.049
Neg logit contributions
 wished-0.022
 s-0.022
 t-0.021
 living-0.020
 sharp-0.020
Token.
Token ID13
Feature activation+6.95e-02

Pos logit contributions
 s+0.026
 wished+0.026
 living+0.024
 sharp+0.023
 t+0.023
Neg logit contributions
ete-0.049
icking-0.049
ory-0.046
ement-0.045
ails-0.044
Token I
Token ID314
Feature activation+8.29e-01

Pos logit contributions
ete+0.006
icking+0.006
ement+0.006
ory+0.005
oming+0.005
Neg logit contributions
 s-0.004
 living-0.004
 wished-0.004
 t-0.004
 snow-0.004
Token will
Token ID481
Feature activation+3.02e-01

Pos logit contributions
ete+0.077
icking+0.075
ement+0.072
ory+0.071
ancer+0.069
Neg logit contributions
 s-0.042
 wished-0.042
 living-0.039
 t-0.039
 sharp-0.038
Token try
Token ID1949
Feature activation+1.57e+00

Pos logit contributions
ete+0.028
icking+0.027
ory+0.026
ement+0.025
+0.025
Neg logit contributions
 wished-0.015
 s-0.015
 living-0.015
 t-0.013
 snow-0.013
Token some
Token ID617
Feature activation+2.87e-01

Pos logit contributions
ete+0.138
icking+0.133
ory+0.128
oming+0.127
ails+0.125
Neg logit contributions
 wished-0.087
 s-0.081
 living-0.071
 t-0.071
 bark-0.070
Token of
Token ID286
Feature activation+4.59e-03

Pos logit contributions
ete+0.029
icking+0.029
ancer+0.028
+0.028
awn+0.028
Neg logit contributions
 s-0.012
 sharp-0.012
 snow-0.012
 soft-0.012
 fun-0.011
Token your
Token ID534
Feature activation-1.03e-01

Pos logit contributions
ete+0.000
icking+0.000
ement+0.000
ancer+0.000
ory+0.000
Neg logit contributions
 s-0.000
 living-0.000
 t-0.000
 sharp-0.000
 snow-0.000
Token pasta
Token ID26296
Feature activation+3.95e-01

Pos logit contributions
 wished+0.006
 s+0.006
 living+0.006
 t+0.006
 sharp+0.006
Neg logit contributions
ete-0.009
icking-0.008
ement-0.008
ory-0.008
ancer-0.008
Token.
Token ID13
Feature activation+9.21e-02

Pos logit contributions
ete+0.042
icking+0.041
ement+0.039
ory+0.039
oming+0.038
Neg logit contributions
 s-0.014
 wished-0.014
 living-0.013
 t-0.013
 sharp-0.013
Token But
Token ID887
Feature activation-1.70e-01

Pos logit contributions
ete+0.008
icking+0.008
ement+0.008
ory+0.008
oming+0.008
Neg logit contributions
 s-0.005
 wished-0.005
 living-0.005
 t-0.004
 snow-0.004
Token fair
Token ID3148
Feature activation-5.97e-01

Pos logit contributions
 wished+0.051
 s+0.046
 bark+0.043
 dreamed+0.043
 living+0.042
Neg logit contributions
icking-0.110
ete-0.106
ory-0.099
iring-0.099
ancer-0.098
Token!"
Token ID2474
Feature activation-7.42e-01

Pos logit contributions
 s+0.022
 living+0.021
 wished+0.021
 t+0.020
 sharp+0.019
Neg logit contributions
ete-0.065
icking-0.062
ement-0.060
ory-0.059
-0.059
Token She
Token ID1375
Feature activation-9.23e-03

Pos logit contributions
 s+0.030
 wished+0.029
 living+0.027
 happy+0.025
 t+0.025
Neg logit contributions
ete-0.079
icking-0.077
ory-0.073
ement-0.072
ancer-0.072
Token tried
Token ID3088
Feature activation-1.48e-01

Pos logit contributions
 s+0.000
 wished+0.000
 living+0.000
 t+0.000
 happy+0.000
Neg logit contributions
ete-0.001
icking-0.001
ement-0.001
ory-0.001
ancer-0.001
Token to
Token ID284
Feature activation-4.58e-01

Pos logit contributions
 wished+0.008
 s+0.008
 living+0.007
 gir+0.007
 dreamed+0.007
Neg logit contributions
icking-0.012
ete-0.012
ory-0.012
ails-0.011
ement-0.011
Token pull
Token ID2834
Feature activation-1.37e+00

Pos logit contributions
 wished+0.024
 living+0.023
 s+0.022
 gir+0.021
 cob+0.020
Neg logit contributions
ete-0.041
icking-0.040
ory-0.037
ola-0.036
 Derek-0.036
Token her
Token ID607
Feature activation-1.31e+00

Pos logit contributions
 s+0.046
 sharp+0.043
 t+0.043
 toy+0.043
 snow+0.042
Neg logit contributions
ete-0.155
icking-0.151
ancer-0.145
ory-0.143
ement-0.142
Token blocks
Token ID7021
Feature activation-1.90e+00

Pos logit contributions
 wished+0.059
 s+0.055
 living+0.052
 bark+0.048
 t+0.047
Neg logit contributions
ete-0.129
icking-0.128
ement-0.122
ory-0.119
oming-0.117
Token back
Token ID736
Feature activation+9.88e-01

Pos logit contributions
 s+0.096
 wished+0.090
 sharp+0.089
 t+0.089
 happy+0.088
Neg logit contributions
ete-0.181
icking-0.168
ory-0.166
ement-0.165
ancer-0.164
Token,
Token ID11
Feature activation+5.43e-01

Pos logit contributions
ete+0.107
icking+0.099
ory+0.098
ancer+0.097
ement+0.096
Neg logit contributions
 s-0.039
 t-0.036
 living-0.036
 toy-0.036
 sharp-0.035
Token but
Token ID475
Feature activation+1.06e+00

Pos logit contributions
icking+0.038
ete+0.035
ement+0.034
oming+0.033
 Derek+0.032
Neg logit contributions
 s-0.035
 wished-0.035
 living-0.034
 snow-0.033
 sharp-0.032
Token nice
Token ID3621
Feature activation+7.85e-01

Pos logit contributions
 soft+0.015
 sharp+0.015
 happy+0.015
 wished+0.014
 s+0.014
Neg logit contributions
ete-0.088
icking-0.086
ement-0.085
ays-0.084
ancer-0.084
Token,
Token ID11
Feature activation+5.09e-01

Pos logit contributions
ete+0.073
icking+0.071
ement+0.067
ory+0.066
ancer+0.065
Neg logit contributions
 s-0.041
 wished-0.040
 living-0.039
 t-0.038
 sharp-0.038
Token dear
Token ID13674
Feature activation+1.05e+00

Pos logit contributions
ete+0.058
icking+0.056
ement+0.054
ory+0.053
ancer+0.052
Neg logit contributions
 s-0.016
 wished-0.015
 living-0.014
 t-0.014
 sharp-0.013
Token,"
Token ID553
Feature activation-1.70e+00

Pos logit contributions
ete+0.100
icking+0.098
ory+0.095
ement+0.092
ola+0.089
Neg logit contributions
 wished-0.048
 s-0.048
 living-0.045
 sharp-0.043
 t-0.041
Token said
Token ID531
Feature activation+6.55e-01

Pos logit contributions
 s+0.112
 wished+0.111
 living+0.109
 sharp+0.104
 t+0.104
Neg logit contributions
ete-0.127
icking-0.125
ement-0.116
ory-0.114
-0.108
Token her
Token ID607
Feature activation-1.39e+00

Pos logit contributions
ete+0.074
icking+0.072
ement+0.069
ory+0.069
ancer+0.068
Neg logit contributions
 s-0.021
 wished-0.020
 t-0.018
 living-0.018
 sharp-0.017
Token mom
Token ID1995
Feature activation-3.19e-01

Pos logit contributions
 wished+0.070
 s+0.066
 living+0.066
 never+0.060
 r+0.059
Neg logit contributions
icking-0.128
ete-0.127
ory-0.120
ement-0.118
ancer-0.114
Token.
Token ID13
Feature activation-6.03e-02

Pos logit contributions
 wished+0.014
 s+0.014
 living+0.013
 sharp+0.013
 snow+0.012
Neg logit contributions
icking-0.031
ete-0.031
ory-0.029
ement-0.029
oming-0.028
Token "
Token ID366
Feature activation-5.17e-01

Pos logit contributions
 wished+0.003
 s+0.003
 sharp+0.003
 living+0.003
 t+0.003
Neg logit contributions
ete-0.005
icking-0.005
-0.005
ement-0.005
ails-0.005
TokenBut
Token ID1537
Feature activation-8.64e-02

Pos logit contributions
 s+0.030
 wished+0.028
 t+0.027
 mess+0.027
 r+0.026
Neg logit contributions
ete-0.043
icking-0.041
ory-0.040
ancer-0.040
ement-0.039
Token be
Token ID307
Feature activation-6.00e-01

Pos logit contributions
 wished+0.004
 t+0.004
 s+0.004
 living+0.004
 happy+0.004
Neg logit contributions
ete-0.009
icking-0.008
ory-0.008
ement-0.008
-0.008
Token luck
Token ID8458
Feature activation+1.89e+00

Pos logit contributions
ete+0.015
icking+0.014
ement+0.013
ory+0.013
ancer+0.013
Neg logit contributions
 s-0.011
 wished-0.011
 living-0.011
 t-0.010
 sharp-0.010
Tokeniest
Token ID6386
Feature activation+3.64e+00

Pos logit contributions
ete+0.100
icking+0.096
ement+0.085
ory+0.084
ays+0.081
Neg logit contributions
 living-0.171
 s-0.171
 wished-0.169
 t-0.167
 sharp-0.165
Token deer
Token ID20096
Feature activation-5.83e-02

Pos logit contributions
ete+0.350
icking+0.340
ement+0.323
ory+0.321
ancer+0.313
Neg logit contributions
 s-0.173
 wished-0.171
 living-0.162
 t-0.156
 sharp-0.154
Token around
Token ID1088
Feature activation-8.92e-01

Pos logit contributions
 wished+0.003
 living+0.003
 s+0.002
 t+0.002
 never+0.002
Neg logit contributions
ete-0.006
icking-0.006
ement-0.005
ory-0.005
ays-0.005
Token!
Token ID0
Feature activation+1.33e+00

Pos logit contributions
 living+0.037
 wished+0.036
 s+0.036
 t+0.034
 never+0.033
Neg logit contributions
ete-0.092
icking-0.092
ory-0.086
ancer-0.085
oured-0.085
Token<|endoftext|>
Token ID50256
Feature activation+1.14e+00

Pos logit contributions
icking+0.158
ete+0.147
ement+0.146
oming+0.146
ory+0.144
Neg logit contributions
 s-0.037
 wished-0.035
 living-0.035
 happy-0.030
 never-0.029
TokenOnce
Token ID7454
Feature activation-1.76e+00

Pos logit contributions
icking+0.106
ete+0.104
ory+0.097
ement+0.094
oming+0.093
Neg logit contributions
 living-0.064
 s-0.060
 wished-0.057
 r-0.056
 t-0.055
Token upon
Token ID2402
Feature activation+2.96e+00

Pos logit contributions
 living+0.137
 happy+0.136
 s+0.135
 wished+0.135
 t+0.132
Neg logit contributions
ete-0.110
icking-0.105
ement-0.098
ory-0.097
-0.097
Token a
Token ID257
Feature activation-1.37e-01

Pos logit contributions
ete+0.255
icking+0.230
ola+0.228
ancer+0.226
ement+0.223
Neg logit contributions
 wished-0.135
 s-0.133
 mess-0.131
 knocking-0.128
 gir-0.127
Token time
Token ID640
Feature activation+2.60e-01

Pos logit contributions
 wished+0.008
 s+0.007
 knocking+0.007
 living+0.006
 t+0.006
Neg logit contributions
ete-0.012
ory-0.011
-0.011
ement-0.011
icking-0.011
Token there
Token ID612
Feature activation+3.28e-01

Pos logit contributions
ete+0.025
icking+0.025
ory+0.024
ement+0.023
ancer+0.023
Neg logit contributions
 s-0.012
 living-0.012
 wished-0.011
 t-0.011
 happy-0.011
Token moved
Token ID3888
Feature activation+4.70e-01

Pos logit contributions
 wished+0.011
 s+0.011
 t+0.010
 living+0.010
 sharp+0.010
Neg logit contributions
ete-0.032
icking-0.030
ement-0.029
ory-0.029
ancer-0.028
Token.
Token ID13
Feature activation-2.00e-01

Pos logit contributions
ete+0.052
icking+0.050
ory+0.048
ement+0.048
ancer+0.048
Neg logit contributions
 s-0.017
 wished-0.016
 sharp-0.016
 living-0.016
 t-0.015
Token Soon
Token ID15894
Feature activation+5.14e-01

Pos logit contributions
 s+0.009
 wished+0.008
 living+0.008
 happy+0.008
 soft+0.008
Neg logit contributions
icking-0.020
ete-0.020
ement-0.019
ory-0.019
ancer-0.018
Token he
Token ID339
Feature activation+9.88e-02

Pos logit contributions
ete+0.044
icking+0.042
ement+0.040
ory+0.040
oming+0.038
Neg logit contributions
 s-0.030
 wished-0.029
 living-0.028
 t-0.027
 sharp-0.027
Token tried
Token ID3088
Feature activation-2.56e-01

Pos logit contributions
ete+0.010
icking+0.010
ement+0.009
ory+0.009
ancer+0.009
Neg logit contributions
 s-0.004
 wished-0.004
 living-0.004
 t-0.004
 sharp-0.004
Token twisting
Token ID36373
Feature activation-2.34e+00

Pos logit contributions
 wished+0.012
 s+0.012
 living+0.010
 t+0.010
 snow+0.010
Neg logit contributions
ete-0.024
icking-0.023
ement-0.022
oming-0.022
ory-0.022
Token it
Token ID340
Feature activation-3.09e-01

Pos logit contributions
 s+0.089
 wished+0.088
 t+0.082
 living+0.082
 sharp+0.082
Neg logit contributions
ete-0.248
icking-0.244
ory-0.229
ancer-0.227
ement-0.227
Token around
Token ID1088
Feature activation-8.14e-01

Pos logit contributions
 s+0.014
 wished+0.014
 living+0.013
 t+0.013
 sharp+0.013
Neg logit contributions
ete-0.030
icking-0.029
ory-0.028
ement-0.028
ancer-0.027
Token other
Token ID584
Feature activation-5.62e-01

Pos logit contributions
 s+0.032
 wished+0.031
 living+0.031
 t+0.030
 sharp+0.029
Neg logit contributions
ete-0.086
icking-0.084
ory-0.079
ement-0.078
ancer-0.078
Token things
Token ID1243
Feature activation-9.39e-01

Pos logit contributions
 wished+0.024
 s+0.024
 living+0.021
 sharp+0.020
 r+0.020
Neg logit contributions
ete-0.056
icking-0.055
ement-0.053
ory-0.052
ancer-0.050
Token that
Token ID326
Feature activation-7.19e-01

Pos logit contributions
 s+0.042
 wished+0.042
 t+0.040
 living+0.039
 sharp+0.039
Neg logit contributions
ete-0.094
icking-0.090
ory-0.087
ancer-0.085
ement-0.085
Token He
Token ID679
Feature activation+3.74e-01

Pos logit contributions
ete+0.101
icking+0.095
ement+0.092
+0.090
ory+0.088
Neg logit contributions
 wished-0.083
 s-0.080
 living-0.078
 sharp-0.078
 t-0.076
Token looked
Token ID3114
Feature activation-8.12e-01

Pos logit contributions
ete+0.037
icking+0.035
ory+0.034
ement+0.033
ola+0.033
Neg logit contributions
 s-0.015
 living-0.015
 wished-0.015
 snow-0.014
 sharp-0.014
Token at
Token ID379
Feature activation-7.63e-01

Pos logit contributions
 s+0.038
 wished+0.038
 living+0.036
 t+0.035
 sharp+0.034
Neg logit contributions
ete-0.079
icking-0.077
ement-0.073
ory-0.072
ancer-0.071
Token it
Token ID340
Feature activation-3.61e-01

Pos logit contributions
 s+0.032
 wished+0.032
 living+0.030
 t+0.028
 r+0.028
Neg logit contributions
ete-0.077
icking-0.074
ement-0.072
ory-0.071
ancer-0.069
Token with
Token ID351
Feature activation-2.95e-01

Pos logit contributions
 s+0.015
 living+0.014
 wished+0.014
 t+0.014
 sharp+0.014
Neg logit contributions
ete-0.038
icking-0.036
ory-0.035
ancer-0.035
ement-0.034
Token delight
Token ID10974
Feature activation-1.63e+00

Pos logit contributions
 wished+0.011
 s+0.010
 living+0.010
 bark+0.009
 knocking+0.009
Neg logit contributions
ete-0.030
icking-0.028
ement-0.028
ory-0.027
 Derek-0.027
Token and
Token ID290
Feature activation-1.51e-01

Pos logit contributions
 s+0.062
 wished+0.058
 living+0.057
 t+0.056
 sharp+0.054
Neg logit contributions
ete-0.174
icking-0.166
ement-0.160
-0.159
ancer-0.159
Token immediately
Token ID3393
Feature activation+2.45e-01

Pos logit contributions
 s+0.007
 living+0.007
 sharp+0.006
 wished+0.006
 snow+0.006
Neg logit contributions
ete-0.014
icking-0.014
ory-0.013
ement-0.013
 Derek-0.012
Token wanted
Token ID2227
Feature activation-2.08e+00

Pos logit contributions
ete+0.023
icking+0.022
ory+0.022
ement+0.021
 Derek+0.020
Neg logit contributions
 living-0.011
 s-0.011
 wished-0.011
 snow-0.010
 mess-0.010
Token to
Token ID284
Feature activation-7.92e-01

Pos logit contributions
 wished+0.088
 s+0.088
 living+0.080
 t+0.076
 r+0.074
Neg logit contributions
ete-0.209
icking-0.200
ement-0.191
ory-0.190
oming-0.183
Token wear
Token ID5806
Feature activation+1.17e+00

Pos logit contributions
 s+0.033
 wished+0.032
 t+0.032
 bark+0.030
 sharp+0.029
Neg logit contributions
ete-0.082
icking-0.081
ement-0.078
ancer-0.076
oming-0.076
Token\n
Token ID198
Feature activation-2.89e-01

Pos logit contributions
 s+0.011
 wished+0.011
 living+0.010
 t+0.010
 sharp+0.009
Neg logit contributions
ete-0.025
icking-0.024
ement-0.023
ory-0.023
ancer-0.022
TokenTim
Token ID14967
Feature activation-4.49e-01

Pos logit contributions
 s+0.016
 wished+0.016
 t+0.015
 living+0.015
 sharp+0.015
Neg logit contributions
ete-0.026
icking-0.023
ory-0.023
ola-0.022
-0.022
Tokenmy
Token ID1820
Feature activation-6.65e-02

Pos logit contributions
 s+0.024
 wished+0.023
 living+0.022
 sharp+0.022
 soft+0.021
Neg logit contributions
ete-0.040
icking-0.040
ory-0.036
ement-0.035
oured-0.035
Token learned
Token ID4499
Feature activation+1.08e+00

Pos logit contributions
 s+0.003
 living+0.003
 sharp+0.003
 wished+0.003
 soft+0.003
Neg logit contributions
ete-0.006
icking-0.006
ory-0.006
ement-0.005
ola-0.005
Token that
Token ID326
Feature activation-8.19e-01

Pos logit contributions
ete+0.106
icking+0.102
ement+0.096
ory+0.096
 Derek+0.094
Neg logit contributions
 wished-0.048
 s-0.048
 living-0.044
 t-0.043
 bark-0.041
Token candles
Token ID32268
Feature activation-3.37e+00

Pos logit contributions
 s+0.042
 wished+0.040
 bark+0.037
 sharp+0.036
 t+0.035
Neg logit contributions
icking-0.072
ete-0.072
ement-0.065
oming-0.064
iring-0.064
Token can
Token ID460
Feature activation+3.84e-01

Pos logit contributions
 living+0.140
 wished+0.137
 s+0.135
 never+0.133
 r+0.133
Neg logit contributions
ete-0.359
ory-0.335
ement-0.321
icking-0.316
oured-0.315
Token be
Token ID307
Feature activation-7.28e-01

Pos logit contributions
ete+0.041
icking+0.039
ement+0.039
ory+0.038
oured+0.037
Neg logit contributions
 s-0.015
 wished-0.014
 r-0.014
 living-0.014
 never-0.014
Token dangerous
Token ID4923
Feature activation+1.66e+00

Pos logit contributions
 s+0.025
 wished+0.024
 living+0.023
 t+0.021
 bark+0.021
Neg logit contributions
ete-0.078
icking-0.077
ory-0.073
ement-0.072
oming-0.071
Token and
Token ID290
Feature activation-2.16e-01

Pos logit contributions
ete+0.165
icking+0.160
ement+0.152
ory+0.152
ancer+0.148
Neg logit contributions
 s-0.074
 wished-0.074
 living-0.071
 t-0.067
 sharp-0.066
Token should
Token ID815
Feature activation+5.11e-01

Pos logit contributions
 s+0.010
 wished+0.010
 living+0.010
 t+0.010
 bark+0.009
Neg logit contributions
ete-0.020
icking-0.020
ory-0.018
ement-0.018
ancer-0.018
Token They
Token ID1119
Feature activation-5.08e-01

Pos logit contributions
 s+0.007
 wished+0.006
 living+0.006
 t+0.006
 bark+0.006
Neg logit contributions
icking-0.012
ete-0.012
ement-0.012
ory-0.012
oming-0.011
Token relaxed
Token ID18397
Feature activation-4.80e-01

Pos logit contributions
 s+0.025
 wished+0.024
 living+0.024
 sharp+0.023
 t+0.022
Neg logit contributions
ete-0.047
icking-0.046
ory-0.043
ement-0.043
ancer-0.042
Token,
Token ID11
Feature activation+4.58e-01

Pos logit contributions
 wished+0.021
 s+0.021
 living+0.019
 t+0.018
 sharp+0.018
Neg logit contributions
ete-0.048
icking-0.047
ement-0.044
ory-0.044
ancer-0.043
Token laughed
Token ID13818
Feature activation-2.96e-01

Pos logit contributions
icking+0.045
ete+0.045
ory+0.042
oming+0.041
ancer+0.041
Neg logit contributions
 wished-0.017
 s-0.017
 living-0.016
 sharp-0.016
 cob-0.015
Token and
Token ID290
Feature activation+6.63e-02

Pos logit contributions
 wished+0.012
 s+0.012
 living+0.012
 t+0.011
 sharp+0.011
Neg logit contributions
ete-0.030
icking-0.029
ement-0.028
ory-0.027
oming-0.027
Token ch
Token ID442
Feature activation-2.82e+00

Pos logit contributions
ete+0.007
icking+0.006
ory+0.006
ancer+0.006
 Derek+0.006
Neg logit contributions
 s-0.003
 living-0.003
 mess-0.003
 wished-0.002
 sharp-0.002
Tokenatted
Token ID16898
Feature activation+1.39e+00

Pos logit contributions
 s+0.264
 wished+0.255
 sharp+0.243
 happy+0.242
 r+0.242
Neg logit contributions
ete-0.132
icking-0.128
-0.122
ancer-0.121
avored-0.120
Token the
Token ID262
Feature activation+2.30e-01

Pos logit contributions
ete+0.131
icking+0.128
ement+0.122
ory+0.120
oming+0.116
Neg logit contributions
 wished-0.065
 s-0.063
 living-0.062
 t-0.058
 sharp-0.057
Token whole
Token ID2187
Feature activation-1.25e+00

Pos logit contributions
ete+0.018
icking+0.018
ement+0.017
ory+0.017
oming+0.016
Neg logit contributions
 wished-0.014
 s-0.014
 living-0.013
 t-0.013
 sharp-0.013
Token way
Token ID835
Feature activation+1.40e-02

Pos logit contributions
 wished+0.065
 s+0.065
 living+0.062
 t+0.060
 sharp+0.059
Neg logit contributions
ete-0.113
icking-0.109
ement-0.105
ory-0.104
oming-0.100
Token.
Token ID13
Feature activation-1.19e-01

Pos logit contributions
ete+0.001
icking+0.001
ement+0.001
ory+0.001
umber+0.001
Neg logit contributions
 s-0.001
 living-0.001
 wished-0.001
 happy-0.001
 soft-0.001
Token a
Token ID257
Feature activation-9.73e-02

Pos logit contributions
ete+0.273
icking+0.248
ola+0.247
ancer+0.244
ory+0.240
Neg logit contributions
 wished-0.128
 s-0.124
 mess-0.122
 knocking-0.119
 gir-0.115
Token time
Token ID640
Feature activation+3.15e-01

Pos logit contributions
 wished+0.005
 s+0.005
 t+0.004
 knocking+0.004
 gir+0.004
Neg logit contributions
ete-0.009
-0.008
ement-0.008
icking-0.008
ory-0.008
Token there
Token ID612
Feature activation+3.29e-01

Pos logit contributions
ete+0.031
icking+0.031
ory+0.030
ement+0.029
ancer+0.029
Neg logit contributions
 s-0.013
 living-0.013
 happy-0.013
 t-0.013
 wished-0.012
Token was
Token ID373
Feature activation+1.48e-01

Pos logit contributions
ete+0.038
icking+0.037
ory+0.036
ancer+0.036
oming+0.035
Neg logit contributions
 wished-0.010
 s-0.010
 t-0.009
 living-0.009
 r-0.009
Token a
Token ID257
Feature activation-1.06e-01

Pos logit contributions
ete+0.015
icking+0.015
ory+0.014
+0.014
avored+0.014
Neg logit contributions
 wished-0.005
 s-0.005
 mess-0.005
 knocking-0.005
 living-0.005
Token little
Token ID1310
Feature activation-2.01e+00

Pos logit contributions
 wished+0.003
 s+0.003
 never+0.003
 living+0.003
 feelings+0.002
Neg logit contributions
ete-0.011
ory-0.011
icking-0.011
ancer-0.011
-0.010
Token boy
Token ID2933
Feature activation-2.11e+00

Pos logit contributions
 wished+0.082
 s+0.082
 living+0.076
 t+0.072
 sharp+0.071
Neg logit contributions
ete-0.205
icking-0.200
ory-0.191
ement-0.190
ancer-0.186
Token called
Token ID1444
Feature activation+3.06e+00

Pos logit contributions
 s+0.140
 wished+0.140
 mess+0.136
 sharp+0.130
 t+0.130
Neg logit contributions
icking-0.146
ete-0.143
ails-0.138
ory-0.135
ola-0.129
Token Sean
Token ID11465
Feature activation+1.81e+00

Pos logit contributions
ete+0.279
icking+0.278
ement+0.271
oming+0.260
ory+0.259
Neg logit contributions
 s-0.158
 wished-0.148
 living-0.148
 t-0.147
 sharp-0.142
Token.
Token ID13
Feature activation-2.98e-01

Pos logit contributions
ete+0.170
icking+0.162
ory+0.158
ement+0.153
ola+0.151
Neg logit contributions
 s-0.084
 wished-0.082
 mess-0.078
 sharp-0.074
 r-0.074
Token He
Token ID679
Feature activation+4.14e-01

Pos logit contributions
 s+0.019
 wished+0.019
 living+0.018
 t+0.017
 sharp+0.017
Neg logit contributions
ete-0.024
icking-0.023
ement-0.022
ory-0.022
ancer-0.021