TOP ACTIVATIONS
MAX = 22.910

the fridge . Pe ppa didn 't know how
it ." Pe ep listened to the owl
found the cave and heard Pe ppa 's o inks .
big tree . Pe ep had an idea .
the plane was taking . Pe anut kept dreaming and chewing
ep had an idea . Pe ep took some soft leaves
smiled and continued to hold Pe anut close to her heart
, anyway ." Wil bur thought hard about what
from the store and asked Pe ppa to help organize it
found a huge volcano . Pe ppa was curious , so
From that day on , Pe ppa never went for a
mind your advice ." Wil bur smiled back and replied
. One day Pe ep ers was feeling extra
animals . One day , Pe ppa 's mom bought some
a far away place caught Pe anut 's eye . He
said , " Wow , Pe anut is such a cool
all broken . Wil ma felt a big sad
to buy any vegetables . Pe ppa felt sorry for him
asked . " Pe anuts are a kind of
to see the attic , Nib bles ? There are many

MIN ACTIVATIONS
MIN = -23.777

He took out a magn ifying glass and a notebook and
a flashlight and a magn ifying glass to examine their finds
said , holding the magn ifying glass over a black ant
the watch with the magn ifying glass . He saw the
the necklace and the magn ifying glass and started to examine
, next to her magn ifying glass . She looked at
He brought his magn ifying glass and looked closely .
hat and grabbed her magn ifying glass . She was ready
flashlight , and a magn ifying glass . As
She takes out a magn ifying glass . She looks at
She had a big magn ifying glass that made things look
, handing him the magn ifying glass . She looked around
it she had a magn ifying glass , a tiny torch
He pointed his magn ifying glass at different classmates and
the necklace and the magn ifying glass and told him that
. They had a magn ifying glass and a small box
When Daisy put the magn ifying glass down , the cater
. Lily had a magn ifying glass that she found in
and handed him the magn ifying glass . He
She took out a magn ifying glass and looked very carefully

INTERVAL 11.238 - 22.910
CONTAINS 0.947%

happy vib es . <|endoftext|> Once there was a sheep .
brave and special . <|endoftext|> Once there was a little girl
're welcome Jos ie !" <|endoftext|> Once upon a time ,
show her mom . <|endoftext|> Once upon a time , there
are a great friend , Mr . Ot ter . I

INTERVAL -0.434 - 11.238
CONTAINS 53.594%

!" <|endoftext|> Once upon a time , there was a little
high . However , T erence really wanted to explore and
She had a watch for him ! He was so happy
all day long until it was time to go home .
, she noticed something y ummy in the garden .

INTERVAL -12.105 - -0.434
CONTAINS 45.160%

of animals , trees and clouds . Everything was so peaceful
boy nodded and used his net to scoop up the cr
and Star ran very fast . They passed all the other
, Sam kicked the ball too hard and it went into
bananas . He wished he could steal something too .

INTERVAL -23.777 - -12.105
CONTAINS 0.299%

Jen is worried . She says , " You
!" Anna says to Ben . " We can 't slide
Max go to say hello . " Hi ,
now , Tom is dead . And Sam is alive .
Tom . A funny joke . Try again , Tom ,"
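
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX, and the CONTAINS shares add up to 100%. The sketch below checks that arithmetic against the printed values. The helper name `interval_edges` and the bin count of 4 are assumptions made for illustration, not something the dump states, and the inputs are the rounded figures shown above, so the recomputed edges can differ from the printed ones in the last decimal place.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (hypothetical helper)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX exactly as printed in the dump (already rounded to 3 decimals).
edges = interval_edges(-23.777, 22.910)

# Shares printed under each INTERVAL header, highest interval first.
shares = [0.947, 53.594, 45.160, 0.299]
total_share = sum(shares)

# Print each recomputed interval, lowest first, then the summed share.
for low, high in zip(edges[:-1], edges[1:]):
    print(f"{low:.3f} to {high:.3f}")
print(f"total share: {total_share:.3f}%")
```

Under that equal-width assumption, almost all activations (about 98.8%) fall in the two middle bins, while the outer bins that hold the TOP and MIN examples each contain under 1%.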

PER-TOKEN DETAILS

Token the
Token ID262
Feature activation+9.64e-01

Pos logit contributions
ll+0.005
ma+0.005
tha+0.005
igi+0.004
aman+0.004
Neg logit contributions
der-0.005
.-0.004
 dot-0.004
powers-0.004
lier-0.004
Token fridge
Token ID25772
Feature activation-1.75e+00

Pos logit contributions
ma+0.020
ll+0.018
Once+0.016
aman+0.016
tha+0.015
Neg logit contributions
der-0.035
 dot-0.032
angs-0.030
powers-0.030
.-0.029
Token.
Token ID13
Feature activation-9.75e+00

Pos logit contributions
 dot+0.019
powers+0.016
der+0.016
 trigger+0.014
angs+0.013
Neg logit contributions
ll-0.079
ma-0.078
tha-0.068
aman-0.068
Once-0.066
Token
Token ID198
Feature activation+1.07e+01

Pos logit contributions
.+0.258
der+0.234
 It+0.222
 He+0.205
 dot+0.196
Neg logit contributions
ll-0.307
ma-0.293
prise-0.282
tha-0.247
igi-0.245
Token
Token ID198
Feature activation+1.07e+01

Pos logit contributions
tha+0.411
ll+0.401
eking+0.397
prise+0.393
ma+0.390
Neg logit contributions
.-0.211
 It-0.131
 He-0.118
".-0.106
elly-0.102
TokenPe
Token ID6435
Feature activation+2.25e+01

Pos logit contributions
ll+0.254
ma+0.244
tha+0.216
prise+0.204
ulu+0.191
Neg logit contributions
.-0.346
der-0.316
elly-0.288
 dot-0.287
powers-0.286
Tokenppa
Token ID44989
Feature activation+1.01e+01

Pos logit contributions
ll+0.324
 ah+0.297
 Madd+0.288
ma+0.286
ulu+0.282
Neg logit contributions
.-0.836
der-0.824
ly-0.733
lier-0.732
powers-0.716
Token didn
Token ID1422
Feature activation+4.48e+00

Pos logit contributions
ll+0.287
igi+0.273
tha+0.267
ma+0.256
aman+0.254
Neg logit contributions
.-0.258
der-0.247
powers-0.204
 dot-0.203
lier-0.202
Token't
Token ID470
Feature activation+3.61e+00

Pos logit contributions
ll+0.089
ma+0.081
tha+0.070
igi+0.063
Once+0.063
Neg logit contributions
der-0.158
 dot-0.148
.-0.148
powers-0.144
lier-0.138
Token know
Token ID760
Feature activation+4.90e+00

Pos logit contributions
ll+0.099
ma+0.094
Once+0.084
tha+0.082
igi+0.080
Neg logit contributions
der-0.101
 dot-0.089
powers-0.087
.-0.084
ifying-0.083
Token how
Token ID703
Feature activation-2.44e+00

Pos logit contributions
ll+0.128
ma+0.120
tha+0.109
igi+0.101
aman+0.101
Neg logit contributions
der-0.142
 dot-0.132
.-0.132
powers-0.128
lier-0.119
Token it
Token ID340
Feature activation-2.43e+00

Pos logit contributions
ll+0.034
ma+0.032
tha+0.029
igi+0.027
Once+0.027
Neg logit contributions
der-0.028
 dot-0.028
powers-0.025
elly-0.025
.-0.025
Token."
Token ID526
Feature activation-3.64e+00

Pos logit contributions
der+0.057
 dot+0.055
powers+0.051
attered+0.049
angs+0.049
Neg logit contributions
ll-0.079
ma-0.072
tha-0.062
igi-0.061
Once-0.061
Token
Token ID220
Feature activation+8.93e+00

Pos logit contributions
.+0.116
der+0.112
 He+0.096
 It+0.096
 dot+0.095
Neg logit contributions
ll-0.090
ma-0.087
prise-0.079
tha-0.077
igi-0.074
Token
Token ID198
Feature activation+1.03e+01

Pos logit contributions
ll+0.242
ma+0.234
tha+0.215
igi+0.202
prise+0.202
Neg logit contributions
der-0.245
.-0.244
powers-0.218
 dot-0.214
lier-0.199
Token
Token ID198
Feature activation+1.03e+01

Pos logit contributions
ma+0.374
tha+0.369
prise+0.351
ll+0.346
aman+0.333
Neg logit contributions
.-0.208
der-0.155
 He-0.146
powers-0.134
 It-0.133
TokenPe
Token ID6435
Feature activation+2.24e+01

Pos logit contributions
ma+0.289
tha+0.276
ll+0.276
prise+0.252
 done+0.247
Neg logit contributions
.-0.260
der-0.260
powers-0.227
lier-0.218
 dot-0.211
Tokenep
Token ID538
Feature activation+2.25e+00

Pos logit contributions
 Madd+0.544
ll+0.531
Once+0.519
aman+0.509
ma+0.502
Neg logit contributions
.-0.592
der-0.585
lier-0.457
powers-0.451
 dot-0.444
Token listened
Token ID16399
Feature activation+2.47e+00

Pos logit contributions
ll+0.064
ma+0.059
tha+0.055
igi+0.052
aman+0.052
Neg logit contributions
der-0.061
.-0.057
 dot-0.054
powers-0.052
lier-0.050
Token to
Token ID284
Feature activation+2.26e+00

Pos logit contributions
ll+0.072
ma+0.064
tha+0.059
Once+0.057
aman+0.056
Neg logit contributions
 dot-0.061
der-0.060
angs-0.058
powers-0.057
elly-0.053
Token the
Token ID262
Feature activation+8.26e-01

Pos logit contributions
ll+0.076
ma+0.068
tha+0.062
prise+0.062
igi+0.061
Neg logit contributions
der-0.052
.-0.052
 dot-0.045
powers-0.041
 target-0.039
Token owl
Token ID39610
Feature activation+3.83e+00

Pos logit contributions
ll+0.026
ma+0.025
tha+0.023
igi+0.022
aman+0.022
Neg logit contributions
der-0.019
.-0.017
 dot-0.017
powers-0.017
lier-0.016
Token found
Token ID1043
Feature activation+3.30e+00

Pos logit contributions
.+0.050
der+0.046
 dot+0.042
lier+0.040
ly+0.039
Neg logit contributions
ll-0.049
ma-0.046
igi-0.046
tha-0.045
aman-0.044
Token the
Token ID262
Feature activation+9.67e-01

Pos logit contributions
ll+0.119
ma+0.109
tha+0.101
igi+0.098
Once+0.097
Neg logit contributions
der-0.066
.-0.066
 dot-0.054
powers-0.052
lier-0.050
Token cave
Token ID11527
Feature activation-2.10e+00

Pos logit contributions
ll+0.032
ma+0.030
tha+0.028
igi+0.027
aman+0.027
Neg logit contributions
der-0.021
.-0.019
 dot-0.019
powers-0.018
lier-0.017
Token and
Token ID290
Feature activation+5.21e+00

Pos logit contributions
der+0.053
 dot+0.050
powers+0.049
.+0.046
angs+0.044
Neg logit contributions
ll-0.062
ma-0.059
tha-0.053
aman-0.049
igi-0.048
Token heard
Token ID2982
Feature activation+6.52e-01

Pos logit contributions
ll+0.162
ma+0.149
prise+0.131
aman+0.130
liv+0.130
Neg logit contributions
.-0.135
der-0.126
 dot-0.105
ly-0.104
 towards-0.101
Token Pe
Token ID2631
Feature activation+2.23e+01

Pos logit contributions
ll+0.021
ma+0.020
tha+0.019
aman+0.018
igi+0.018
Neg logit contributions
der-0.014
 dot-0.013
.-0.013
powers-0.012
lier-0.011
Tokenppa
Token ID44989
Feature activation+1.02e+01

Pos logit contributions
ll+0.286
ma+0.250
 Madd+0.231
Once+0.229
ulu+0.207
Neg logit contributions
.-0.856
der-0.830
lier-0.792
 dot-0.782
obs-0.765
Token's
Token ID338
Feature activation+2.30e+00

Pos logit contributions
ll+0.333
ma+0.313
Once+0.276
tha+0.268
 done+0.264
Neg logit contributions
der-0.243
powers-0.203
elly-0.199
 dot-0.199
pool-0.186
Token o
Token ID267
Feature activation+1.04e+01

Pos logit contributions
ll+0.057
ma+0.056
tha+0.048
 babys+0.046
Once+0.046
Neg logit contributions
der-0.064
elly-0.060
lier-0.059
pool-0.056
ifying-0.056
Tokeninks
Token ID2973
Feature activation-2.41e+00

Pos logit contributions
ll+0.226
ma+0.208
tha+0.189
igi+0.175
aman+0.172
Neg logit contributions
der-0.343
.-0.328
 dot-0.316
powers-0.309
lier-0.296
Token.
Token ID13
Feature activation-9.72e+00

Pos logit contributions
powers+0.058
der+0.054
elly+0.050
oky+0.047
 dot+0.045
Neg logit contributions
ma-0.068
ll-0.063
Once-0.060
 rejo-0.059
aman-0.058
Token big
Token ID1263
Feature activation-3.01e+00

Pos logit contributions
der+0.086
.+0.080
 dot+0.076
powers+0.071
lier+0.065
Neg logit contributions
ll-0.165
ma-0.156
tha-0.147
igi-0.141
aman-0.139
Token tree
Token ID5509
Feature activation-6.71e-02

Pos logit contributions
der+0.082
powers+0.073
.+0.069
angs+0.068
 dot+0.067
Neg logit contributions
ma-0.082
ll-0.082
tha-0.075
aman-0.070
igi-0.068
Token.
Token ID13
Feature activation-1.02e+01

Pos logit contributions
der+0.001
powers+0.001
 trigger+0.001
angs+0.001
obs+0.001
Neg logit contributions
ll-0.002
ma-0.002
Once-0.002
 done-0.002
 waiting-0.002
Token
Token ID198
Feature activation+1.05e+01

Pos logit contributions
der+0.208
.+0.197
 dot+0.178
powers+0.172
lier+0.155
Neg logit contributions
ll-0.356
ma-0.345
tha-0.312
igi-0.302
aman-0.298
Token
Token ID198
Feature activation+1.06e+01

Pos logit contributions
ma+0.376
tha+0.372
ll+0.367
aman+0.360
prise+0.343
Neg logit contributions
.-0.230
 It-0.164
 He-0.151
der-0.140
powers-0.138
TokenPe
Token ID6435
Feature activation+2.23e+01

Pos logit contributions
ma+0.283
ll+0.274
tha+0.243
aman+0.223
prise+0.220
Neg logit contributions
.-0.300
der-0.276
pool-0.253
powers-0.252
 dot-0.252
Tokenep
Token ID538
Feature activation+2.16e+00

Pos logit contributions
ma+0.526
 babys+0.516
aman+0.512
ll+0.508
 Madd+0.503
Neg logit contributions
.-0.600
der-0.520
powers-0.472
lier-0.459
tie-0.453
Token had
Token ID550
Feature activation-1.31e+00

Pos logit contributions
ll+0.061
ma+0.058
tha+0.053
aman+0.052
prise+0.052
Neg logit contributions
der-0.057
.-0.056
 dot-0.050
powers-0.048
lier-0.046
Token an
Token ID281
Feature activation-8.28e-01

Pos logit contributions
der+0.025
.+0.023
 dot+0.021
powers+0.020
lier+0.018
Neg logit contributions
ll-0.048
ma-0.046
tha-0.043
igi-0.041
aman-0.041
Token idea
Token ID2126
Feature activation+2.11e+00

Pos logit contributions
der+0.017
.+0.015
 dot+0.014
powers+0.014
lier+0.012
Neg logit contributions
ll-0.029
ma-0.028
tha-0.026
igi-0.025
aman-0.024
Token.
Token ID13
Feature activation-1.00e+01

Pos logit contributions
ll+0.085
ma+0.081
 told+0.074
tha+0.071
Once+0.071
Neg logit contributions
der-0.036
ifying-0.034
 dot-0.034
attered-0.032
 trigger-0.032
Token the
Token ID262
Feature activation+8.66e-01

Pos logit contributions
ll+0.102
ma+0.101
tha+0.089
 done+0.085
Once+0.085
Neg logit contributions
der-0.060
 dot-0.055
powers-0.052
 trigger-0.050
attered-0.049
Token plane
Token ID6614
Feature activation+2.90e-01

Pos logit contributions
ll+0.027
ma+0.025
tha+0.024
igi+0.023
aman+0.022
Neg logit contributions
der-0.021
.-0.020
 dot-0.019
powers-0.018
 target-0.017
Token was
Token ID373
Feature activation+3.46e+00

Pos logit contributions
ll+0.010
ma+0.010
tha+0.009
Once+0.008
aman+0.008
Neg logit contributions
der-0.006
powers-0.005
 dot-0.005
elly-0.005
angs-0.005
Token taking
Token ID2263
Feature activation-7.86e-01

Pos logit contributions
ll+0.098
ma+0.093
tha+0.085
igi+0.081
Once+0.080
Neg logit contributions
der-0.094
 dot-0.085
powers-0.083
elly-0.079
attered-0.077
Token.
Token ID13
Feature activation-9.98e+00

Pos logit contributions
der+0.021
 dot+0.018
powers+0.017
elly+0.016
attered+0.016
Neg logit contributions
ll-0.022
ma-0.022
tha-0.020
igi-0.019
Once-0.019
Token Pe
Token ID2631
Feature activation+2.22e+01

Pos logit contributions
 dot+0.228
der+0.228
.+0.212
angs+0.208
powers+0.204
Neg logit contributions
ll-0.293
ma-0.265
tha-0.258
aman-0.249
igi-0.243
Tokenanut
Token ID20651
Feature activation+9.66e+00

Pos logit contributions
 babys+0.389
loe+0.370
ll+0.361
 Madd+0.358
liv+0.356
Neg logit contributions
.-0.720
der-0.701
lier-0.647
elly-0.614
ly-0.613
Token kept
Token ID4030
Feature activation+2.33e+00

Pos logit contributions
ll+0.266
ma+0.256
tha+0.255
aman+0.247
igi+0.245
Neg logit contributions
der-0.252
.-0.248
 dot-0.220
powers-0.212
lier-0.203
Token dreaming
Token ID28368
Feature activation-2.15e+00

Pos logit contributions
ll+0.075
ma+0.069
tha+0.064
igi+0.063
aman+0.062
Neg logit contributions
.-0.055
der-0.052
 dot-0.048
powers-0.044
 target-0.043
Token and
Token ID290
Feature activation+5.17e+00

Pos logit contributions
der+0.042
 dot+0.038
.+0.037
powers+0.036
lier+0.032
Neg logit contributions
ll-0.076
ma-0.072
tha-0.067
igi-0.064
aman-0.063
Token chewing
Token ID36615
Feature activation+4.67e-01

Pos logit contributions
ll+0.118
ma+0.115
tha+0.109
aman+0.104
igi+0.102
Neg logit contributions
.-0.166
der-0.165
powers-0.143
 dot-0.142
 target-0.133
Tokenep
Token ID538
Feature activation+2.16e+00

Pos logit contributions
ma+0.526
 babys+0.516
aman+0.512
ll+0.508
 Madd+0.503
Neg logit contributions
.-0.600
der-0.520
powers-0.472
lier-0.459
tie-0.453
Token had
Token ID550
Feature activation-1.31e+00

Pos logit contributions
ll+0.061
ma+0.058
tha+0.053
aman+0.052
prise+0.052
Neg logit contributions
der-0.057
.-0.056
 dot-0.050
powers-0.048
lier-0.046
Token an
Token ID281
Feature activation-8.28e-01

Pos logit contributions
der+0.025
.+0.023
 dot+0.021
powers+0.020
lier+0.018
Neg logit contributions
ll-0.048
ma-0.046
tha-0.043
igi-0.041
aman-0.041
Token idea
Token ID2126
Feature activation+2.11e+00

Pos logit contributions
der+0.017
.+0.015
 dot+0.014
powers+0.014
lier+0.012
Neg logit contributions
ll-0.029
ma-0.028
tha-0.026
igi-0.025
aman-0.024
Token.
Token ID13
Feature activation-1.00e+01

Pos logit contributions
ll+0.085
ma+0.081
 told+0.074
tha+0.071
Once+0.071
Neg logit contributions
der-0.036
ifying-0.034
 dot-0.034
attered-0.032
 trigger-0.032
Token Pe
Token ID2631
Feature activation+2.22e+01

Pos logit contributions
der+0.200
 dot+0.191
.+0.175
 target+0.171
powers+0.170
Neg logit contributions
ll-0.335
tha-0.308
ma-0.303
igi-0.288
aman-0.284
Tokenep
Token ID538
Feature activation+2.31e+00

Pos logit contributions
 babys+0.446
 Madd+0.437
aman+0.437
Once+0.427
ll+0.415
Neg logit contributions
.-0.649
der-0.590
lier-0.562
powers-0.553
angs-0.530
Token took
Token ID1718
Feature activation+1.84e+00

Pos logit contributions
ll+0.066
ma+0.061
tha+0.056
igi+0.053
Once+0.051
Neg logit contributions
der-0.062
 dot-0.057
.-0.055
powers-0.055
lier-0.051
Token some
Token ID617
Feature activation-2.41e+00

Pos logit contributions
ll+0.055
ma+0.051
tha+0.047
igi+0.045
aman+0.045
Neg logit contributions
der-0.046
.-0.044
 dot-0.042
powers-0.041
lier-0.038
Token soft
Token ID2705
Feature activation-1.84e+00

Pos logit contributions
der+0.059
.+0.055
 dot+0.053
powers+0.051
lier+0.047
Neg logit contributions
ll-0.075
ma-0.070
tha-0.065
igi-0.062
aman-0.061
Token leaves
Token ID5667
Feature activation-3.71e+00

Pos logit contributions
der+0.041
.+0.037
 dot+0.036
powers+0.034
lier+0.033
Neg logit contributions
ll-0.059
ma-0.057
tha-0.053
igi-0.051
aman-0.050
Token smiled
Token ID13541
Feature activation+5.75e+00

Pos logit contributions
ll+0.138
igi+0.130
ma+0.129
prise+0.126
tha+0.125
Neg logit contributions
.-0.187
der-0.182
 dot-0.158
lier-0.152
 target-0.148
Token and
Token ID290
Feature activation+4.93e+00

Pos logit contributions
ll+0.111
Once+0.109
ma+0.108
 glad+0.100
 pleased+0.085
Neg logit contributions
powers-0.187
angs-0.185
 trigger-0.171
inters-0.169
ifying-0.166
Token continued
Token ID3767
Feature activation-1.69e+00

Pos logit contributions
ll+0.115
ma+0.114
igi+0.099
prise+0.098
tha+0.096
Neg logit contributions
.-0.160
der-0.158
 dot-0.136
powers-0.128
 It-0.128
Token to
Token ID284
Feature activation+2.30e+00

Pos logit contributions
der+0.035
 dot+0.031
.+0.031
powers+0.030
lier+0.028
Neg logit contributions
ll-0.058
ma-0.055
tha-0.052
igi-0.049
aman-0.049
Token hold
Token ID1745
Feature activation+2.25e+00

Pos logit contributions
ll+0.078
ma+0.073
tha+0.069
igi+0.066
aman+0.064
Neg logit contributions
der-0.048
.-0.047
 dot-0.044
powers-0.040
lier-0.038
Token Pe
Token ID2631
Feature activation+2.21e+01

Pos logit contributions
ll+0.074
ma+0.070
tha+0.065
igi+0.062
aman+0.062
Neg logit contributions
der-0.050
.-0.047
 dot-0.045
powers-0.043
lier-0.040
Tokenanut
Token ID20651
Feature activation+9.53e+00

Pos logit contributions
ll+0.291
Once+0.291
 Madd+0.288
Els+0.276
 babys+0.268
Neg logit contributions
.-0.802
lier-0.711
der-0.706
attered-0.681
 dot-0.676
Token close
Token ID1969
Feature activation-1.31e+00

Pos logit contributions
ll+0.346
ma+0.332
Once+0.282
tha+0.276
ppa+0.271
Neg logit contributions
der-0.199
 dot-0.161
elly-0.153
angs-0.152
powers-0.152
Token to
Token ID284
Feature activation+2.43e+00

Pos logit contributions
der+0.027
powers+0.026
 dot+0.025
angs+0.023
 trigger+0.022
Neg logit contributions
ll-0.043
ma-0.041
aman-0.038
tha-0.037
Once-0.035
Token her
Token ID607
Feature activation+3.08e+00

Pos logit contributions
ll+0.066
ma+0.063
tha+0.058
aman+0.054
igi+0.053
Neg logit contributions
der-0.067
 dot-0.062
powers-0.061
.-0.060
angs-0.056
Token heart
Token ID2612
Feature activation+3.33e+00

Pos logit contributions
ll+0.109
ma+0.104
tha+0.097
aman+0.091
igi+0.090
Neg logit contributions
der-0.062
 dot-0.052
powers-0.052
.-0.049
lier-0.048
Token,
Token ID11
Feature activation+8.28e+00

Pos logit contributions
ll+0.078
ma+0.073
tha+0.068
igi+0.065
aman+0.064
Neg logit contributions
der-0.045
 dot-0.040
.-0.039
powers-0.038
lier-0.035
Token anyway
Token ID6949
Feature activation+3.22e-01

Pos logit contributions
ll+0.252
ma+0.237
tha+0.217
igi+0.208
aman+0.203
Neg logit contributions
der-0.206
.-0.196
 dot-0.182
powers-0.174
lier-0.165
Token."
Token ID526
Feature activation-3.21e+00

Pos logit contributions
ll+0.010
ma+0.010
tha+0.009
igi+0.009
aman+0.008
Neg logit contributions
der-0.007
 dot-0.007
powers-0.007
attered-0.006
lier-0.006
Token
Token ID198
Feature activation+1.07e+01

Pos logit contributions
.+0.086
der+0.080
 It+0.068
 He+0.068
powers+0.067
Neg logit contributions
ll-0.095
ma-0.092
prise-0.083
tha-0.081
igi-0.079
Token
Token ID198
Feature activation+1.07e+01

Pos logit contributions
ma+0.361
ll+0.355
prise+0.349
tha+0.348
aman+0.324
Neg logit contributions
.-0.239
der-0.175
 He-0.166
 It-0.163
powers-0.155
TokenWil
Token ID22327
Feature activation+2.21e+01

Pos logit contributions
ma+0.211
ll+0.210
prise+0.173
tha+0.172
ulu+0.165
Neg logit contributions
.-0.370
der-0.368
powers-0.332
 dot-0.317
 He-0.314
Tokenbur
Token ID6236
Feature activation+1.40e+01

Pos logit contributions
 Madd+0.346
 babys+0.342
 ah+0.310
prise+0.304
ll+0.297
Neg logit contributions
.-0.725
der-0.671
axe-0.626
tie-0.625
lier-0.623
Token thought
Token ID1807
Feature activation+4.27e+00

Pos logit contributions
ll+0.367
prise+0.349
ma+0.342
tha+0.327
aman+0.325
Neg logit contributions
.-0.394
der-0.382
ly-0.317
lier-0.316
 dot-0.305
Token hard
Token ID1327
Feature activation-3.52e+00

Pos logit contributions
ll+0.140
ma+0.133
tha+0.121
aman+0.113
igi+0.113
Neg logit contributions
der-0.093
 dot-0.091
powers-0.083
attered-0.082
elly-0.079
Token about
Token ID546
Feature activation+3.36e+00

Pos logit contributions
 dot+0.084
powers+0.084
der+0.080
ifying+0.077
 target+0.074
Neg logit contributions
ll-0.110
ma-0.101
tha-0.085
aman-0.083
 done-0.081
Token what
Token ID644
Feature activation-1.50e+00

Pos logit contributions
ll+0.100
ma+0.093
tha+0.086
igi+0.082
aman+0.081
Neg logit contributions
der-0.087
.-0.083
 dot-0.076
powers-0.073
lier-0.069
Token from
Token ID422
Feature activation+8.82e+00

Pos logit contributions
der+0.023
 dot+0.022
powers+0.021
lier+0.020
 target+0.019
Neg logit contributions
ll-0.034
ma-0.032
tha-0.028
igi-0.028
prise-0.026
Token the
Token ID262
Feature activation+6.66e-01

Pos logit contributions
ll+0.269
ma+0.241
tha+0.223
igi+0.213
prise+0.206
Neg logit contributions
der-0.224
.-0.222
 dot-0.200
powers-0.188
 target-0.179
Token store
Token ID3650
Feature activation-1.69e+00

Pos logit contributions
ll+0.019
ma+0.017
tha+0.016
igi+0.015
ulu+0.015
Neg logit contributions
.-0.018
der-0.017
 dot-0.016
powers-0.015
 target-0.015
Token and
Token ID290
Feature activation+4.94e+00

Pos logit contributions
der+0.025
 dot+0.024
powers+0.022
angs+0.018
 trigger+0.018
Neg logit contributions
ll-0.069
ma-0.067
tha-0.062
aman-0.059
igi-0.058
Token asked
Token ID1965
Feature activation+6.46e+00

Pos logit contributions
ll+0.153
ma+0.130
igi+0.130
Eventually+0.122
Finally+0.122
Neg logit contributions
.-0.126
der-0.108
ly-0.092
lier-0.088
 drops-0.086
Token Pe
Token ID2631
Feature activation+2.21e+01

Pos logit contributions
ll+0.199
ma+0.180
tha+0.167
liv+0.163
prise+0.160
Neg logit contributions
.-0.181
der-0.166
ly-0.133
lier-0.127
 It-0.125
Tokenppa
Token ID44989
Feature activation+9.84e+00

Pos logit contributions
 Madd+0.285
ll+0.279
 ah+0.255
tha+0.233
ulu+0.229
Neg logit contributions
.-0.822
der-0.772
lier-0.757
 dot-0.703
ly-0.703
Token to
Token ID284
Feature activation+2.40e+00

Pos logit contributions
ll+0.364
ma+0.346
tha+0.311
igi+0.303
Once+0.300
Neg logit contributions
der-0.182
 dot-0.168
powers-0.156
.-0.146
lier-0.139
Token help
Token ID1037
Feature activation-7.04e-01

Pos logit contributions
ll+0.078
ma+0.070
tha+0.067
igi+0.067
aman+0.063
Neg logit contributions
.-0.055
der-0.051
 dot-0.047
powers-0.045
lier-0.043
Token organize
Token ID16481
Feature activation-2.57e+00

Pos logit contributions
der+0.020
 dot+0.018
powers+0.018
angs+0.016
elly+0.016
Neg logit contributions
ma-0.019
ll-0.019
tha-0.017
igi-0.015
aman-0.015
Token it
Token ID340
Feature activation-2.17e+00

Pos logit contributions
der+0.080
 dot+0.075
powers+0.072
angs+0.069
attered+0.068
Neg logit contributions
ma-0.063
ll-0.058
tha-0.053
igi-0.048
aman-0.047
Token found
Token ID1043
Feature activation+2.95e+00

Pos logit contributions
.+0.061
der+0.058
 dot+0.053
powers+0.051
lier+0.050
Neg logit contributions
ll-0.054
ma-0.052
tha-0.049
aman-0.047
igi-0.047
Token a
Token ID257
Feature activation-4.72e+00

Pos logit contributions
ll+0.099
ma+0.098
tha+0.092
igi+0.089
aman+0.087
Neg logit contributions
der-0.059
 dot-0.055
powers-0.050
.-0.049
lier-0.046
Token huge
Token ID3236
Feature activation-1.63e+00

Pos logit contributions
der+0.096
.+0.089
 dot+0.085
powers+0.081
lier+0.074
Neg logit contributions
ll-0.164
ma-0.155
tha-0.145
igi-0.139
aman-0.137
Token volcano
Token ID32410
Feature activation+9.64e-01

Pos logit contributions
der+0.043
powers+0.038
.+0.037
 dot+0.036
angs+0.035
Neg logit contributions
ll-0.045
ma-0.045
tha-0.041
aman-0.039
igi-0.038
Token.
Token ID13
Feature activation-1.01e+01

Pos logit contributions
ll+0.032
ma+0.031
 waiting+0.029
lly+0.027
 told+0.027
Neg logit contributions
powers-0.019
 trigger-0.017
der-0.017
 dot-0.016
angs-0.016
Token Pe
Token ID2631
Feature activation+2.19e+01

Pos logit contributions
der+0.244
 dot+0.231
.+0.221
powers+0.214
lier+0.211
Neg logit contributions
ll-0.287
ma-0.269
tha-0.265
aman-0.257
Once-0.246
Tokenppa
Token ID44989
Feature activation+9.93e+00

Pos logit contributions
ll+0.301
 Madd+0.298
 ah+0.283
ulu+0.279
ma+0.264
Neg logit contributions
.-0.782
der-0.773
lier-0.721
ly-0.694
obs-0.692
Token was
Token ID373
Feature activation+3.32e+00

Pos logit contributions
ll+0.284
ma+0.278
igi+0.276
tha+0.272
aman+0.267
Neg logit contributions
.-0.255
der-0.241
 dot-0.202
powers-0.198
lier-0.196
Token curious
Token ID11040
Feature activation-3.02e+00

Pos logit contributions
ll+0.124
ma+0.116
aman+0.108
igi+0.107
tha+0.107
Neg logit contributions
.-0.065
der-0.059
 dot-0.053
powers-0.050
 across-0.047
Token,
Token ID11
Feature activation+8.01e+00

Pos logit contributions
powers+0.089
der+0.088
 trigger+0.088
 dot+0.088
ifying+0.085
Neg logit contributions
ma-0.069
ll-0.064
Once-0.057
aman-0.054
 done-0.051
Token so
Token ID523
Feature activation+6.38e+00

Pos logit contributions
ll+0.220
ma+0.205
tha+0.189
igi+0.178
aman+0.174
Neg logit contributions
der-0.224
.-0.214
 dot-0.203
powers-0.196
lier-0.187
Token From
Token ID3574
Feature activation+1.01e+01

Pos logit contributions
der+0.292
.+0.275
 dot+0.254
powers+0.247
lier+0.233
Neg logit contributions
ll-0.292
ma-0.277
tha-0.249
igi-0.236
prise-0.234
Token that
Token ID326
Feature activation-3.15e+00

Pos logit contributions
ll+0.442
ma+0.413
prise+0.406
 babys+0.398
 ah+0.389
Neg logit contributions
.-0.124
der-0.098
ly-0.066
 dot-0.053
tie-0.052
Token day
Token ID1110
Feature activation+3.77e+00

Pos logit contributions
.+0.050
 thing+0.035
 dot+0.029
 across+0.029
 word+0.028
Neg logit contributions
ll-0.124
tha-0.121
iza-0.119
prise-0.119
aj-0.119
Token on
Token ID319
Feature activation-4.63e-01

Pos logit contributions
ll+0.144
ma+0.136
tha+0.131
aman+0.128
igi+0.126
Neg logit contributions
.-0.062
der-0.059
 dot-0.050
powers-0.044
pool-0.042
Token,
Token ID11
Feature activation+7.92e+00

Pos logit contributions
.+0.010
der+0.007
 He+0.005
 It+0.005
 anymore+0.005
Neg logit contributions
ll-0.018
prise-0.018
Danny-0.016
ma-0.016
Jamie-0.016
Token Pe
Token ID2631
Feature activation+2.19e+01

Pos logit contributions
ll+0.238
ma+0.228
prise+0.210
Danny+0.199
liv+0.198
Neg logit contributions
.-0.213
der-0.194
 dot-0.167
pool-0.156
lier-0.155
Tokenppa
Token ID44989
Feature activation+9.71e+00

Pos logit contributions
 ah+0.362
 Madd+0.361
ll+0.333
Danny+0.324
liv+0.322
Neg logit contributions
.-0.735
der-0.680
lier-0.651
ly-0.632
 dot-0.625
Token never
Token ID1239
Feature activation+8.13e+00

Pos logit contributions
prise+0.256
ma+0.252
aman+0.249
tha+0.247
ll+0.242
Neg logit contributions
.-0.262
der-0.251
 dot-0.215
powers-0.204
lier-0.198
Token went
Token ID1816
Feature activation+1.52e+00

Pos logit contributions
ll+0.192
ma+0.175
tha+0.158
igi+0.147
aman+0.144
Neg logit contributions
der-0.259
.-0.243
 dot-0.238
powers-0.231
lier-0.219
Token for
Token ID329
Feature activation+2.83e+00

Pos logit contributions
ll+0.049
ma+0.044
tha+0.041
igi+0.040
Once+0.039
Neg logit contributions
der-0.037
 dot-0.035
powers-0.030
pool-0.029
 trigger-0.029
Token a
Token ID257
Feature activation-4.67e+00

Pos logit contributions
ll+0.116
ma+0.110
tha+0.104
igi+0.100
aman+0.100
Neg logit contributions
der-0.040
.-0.038
 dot-0.033
powers-0.032
lier-0.027
Token mind
Token ID2000
Feature activation+1.14e+00

Pos logit contributions
ll+0.078
ma+0.073
tha+0.067
igi+0.064
aman+0.063
Neg logit contributions
der-0.076
.-0.071
 dot-0.069
powers-0.066
lier-0.063
Token your
Token ID534
Feature activation+3.63e+00

Pos logit contributions
ll+0.032
ma+0.030
tha+0.028
igi+0.026
aman+0.026
Neg logit contributions
der-0.031
 dot-0.028
.-0.028
powers-0.027
lier-0.026
Token advice
Token ID5608
Feature activation+8.25e-02

Pos logit contributions
ll+0.099
ma+0.097
tha+0.089
aman+0.082
 glad+0.079
Neg logit contributions
der-0.099
 dot-0.089
elly-0.084
lier-0.084
angs-0.083
Token."
Token ID526
Feature activation-3.39e+00

Pos logit contributions
ll+0.003
ma+0.003
tha+0.003
igi+0.003
aman+0.003
Neg logit contributions
der-0.002
 dot-0.001
powers-0.001
.-0.001
angs-0.001
Token
Token ID198
Feature activation+1.05e+01

Pos logit contributions
.+0.095
der+0.094
 dot+0.084
powers+0.082
lier+0.078
Neg logit contributions
ll-0.094
ma-0.085
tha-0.078
igi-0.076
prise-0.074
TokenWil
Token ID22327
Feature activation+2.19e+01

Pos logit contributions
ll+0.357
tha+0.336
ma+0.331
prise+0.323
 done+0.314
Neg logit contributions
.-0.231
 He-0.170
der-0.168
 It-0.156
powers-0.152
Tokenbur
Token ID6236
Feature activation+1.35e+01

Pos logit contributions
 Madd+0.193
ll+0.185
 babys+0.184
Once+0.154
ulu+0.149
Neg logit contributions
.-0.871
der-0.829
lier-0.783
ray-0.777
ly-0.751
Token smiled
Token ID13541
Feature activation+6.29e+00

Pos logit contributions
ll+0.302
ma+0.268
prise+0.267
tha+0.258
igi+0.255
Neg logit contributions
.-0.434
der-0.421
 dot-0.362
lier-0.356
powers-0.356
Token back
Token ID736
Feature activation+2.90e+00

Pos logit contributions
ma+0.141
ll+0.140
Once+0.129
tha+0.122
 glad+0.121
Neg logit contributions
powers-0.181
angs-0.178
 dot-0.168
der-0.163
 trigger-0.162
Token and
Token ID290
Feature activation+5.20e+00

Pos logit contributions
ll+0.082
ma+0.077
tha+0.071
aman+0.066
Once+0.064
Neg logit contributions
 dot-0.074
powers-0.071
der-0.071
angs-0.069
 trigger-0.067
Token replied
Token ID8712
Feature activation+8.74e+00

Pos logit contributions
ll+0.110
ma+0.101
tha+0.088
igi+0.084
prise+0.081
Neg logit contributions
der-0.179
.-0.175
 dot-0.161
powers-0.160
lier-0.152
Token.
Token ID13
Feature activation-1.04e+01

Pos logit contributions
ll+0.185
ma+0.169
Once+0.156
tha+0.152
aman+0.152
Neg logit contributions
 dot-0.063
der-0.056
attered-0.052
 trigger-0.052
angs-0.049
Token
Token ID198
Feature activation+1.01e+01

Pos logit contributions
.+0.257
der+0.251
 dot+0.215
 It+0.206
powers+0.204
Neg logit contributions
ll-0.328
ma-0.322
prise-0.295
tha-0.292
igi-0.277
Token
Token ID198
Feature activation+1.02e+01

Pos logit contributions
tha+0.349
ll+0.339
ma+0.336
prise+0.327
aman+0.315
Neg logit contributions
.-0.227
der-0.152
 It-0.149
 He-0.147
pool-0.140
TokenOne
Token ID3198
Feature activation+1.08e+00

Pos logit contributions
ll+0.312
ma+0.311
tha+0.289
prise+0.272
igi+0.256
Neg logit contributions
.-0.241
der-0.240
 dot-0.213
powers-0.204
pool-0.200
Token day
Token ID1110
Feature activation+3.80e+00

Pos logit contributions
ll+0.024
ma+0.022
tha+0.018
ulu+0.018
Once+0.018
Neg logit contributions
der-0.039
 dot-0.034
powers-0.034
angs-0.033
 trigger-0.032
Token Pe
Token ID2631
Feature activation+2.18e+01

Pos logit contributions
ll+0.113
ma+0.107
tha+0.095
igi+0.091
prise+0.090
Neg logit contributions
.-0.102
der-0.098
 dot-0.088
powers-0.083
pool-0.080
Tokenep
Token ID538
Feature activation+1.95e+00

Pos logit contributions
 Madd+0.471
ll+0.454
Once+0.443
 babys+0.441
 ah+0.438
Neg logit contributions
.-0.615
der-0.589
lier-0.550
powers-0.535
ly-0.518
Tokeners
Token ID364
Feature activation-4.68e-01

Pos logit contributions
ll+0.053
ma+0.049
tha+0.046
igi+0.043
aman+0.043
Neg logit contributions
der-0.054
.-0.052
 dot-0.049
powers-0.047
lier-0.045
Token was
Token ID373
Feature activation+3.38e+00

Pos logit contributions
.+0.012
der+0.011
 dot+0.009
lier+0.009
pool+0.009
Neg logit contributions
ll-0.014
ma-0.013
igi-0.013
tha-0.013
prise-0.012
Token feeling
Token ID4203
Feature activation+2.02e+00

Pos logit contributions
ll+0.102
ma+0.097
tha+0.090
aman+0.087
igi+0.086
Neg logit contributions
.-0.085
der-0.081
 dot-0.073
powers-0.071
lier-0.067
Token extra
Token ID3131
Feature activation+1.61e+00

Pos logit contributions
ll+0.049
ma+0.047
 glad+0.043
Once+0.041
tha+0.041
Neg logit contributions
der-0.059
 dot-0.055
powers-0.052
 target-0.051
.-0.050
Token animals
Token ID4695
Feature activation-2.22e+00

Pos logit contributions
der+0.073
.+0.069
 dot+0.064
powers+0.062
lier+0.058
Neg logit contributions
ll-0.110
ma-0.103
tha-0.097
igi-0.092
aman-0.091
Token.
Token ID13
Feature activation-1.03e+01

Pos logit contributions
der+0.028
 dot+0.026
powers+0.026
attered+0.024
 trigger+0.022
Neg logit contributions
ll-0.094
ma-0.087
Once-0.079
 told-0.076
 ha-0.074
Token One
Token ID1881
Feature activation+2.52e-01

Pos logit contributions
.+0.159
der+0.156
 dot+0.120
powers+0.110
 It+0.103
Neg logit contributions
ll-0.421
ma-0.398
tha-0.375
prise-0.374
igi-0.356
Token day
Token ID1110
Feature activation+3.68e+00

Pos logit contributions
ll+0.006
ma+0.005
tha+0.005
Once+0.004
igi+0.004
Neg logit contributions
der-0.009
powers-0.008
 dot-0.008
lier-0.007
.-0.007
Token,
Token ID11
Feature activation+7.84e+00

Pos logit contributions
ll+0.116
ma+0.112
tha+0.101
prise+0.097
igi+0.096
Neg logit contributions
.-0.094
der-0.085
 dot-0.074
powers-0.071
elly-0.067
Token Pe
Token ID2631
Feature activation+2.18e+01

Pos logit contributions
ll+0.222
ma+0.201
prise+0.175
tha+0.169
igi+0.166
Neg logit contributions
.-0.230
der-0.218
 dot-0.196
pool-0.181
 It-0.178
Tokenppa
Token ID44989
Feature activation+9.51e+00

Pos logit contributions
 ah+0.314
ll+0.308
 Madd+0.304
ulu+0.271
ma+0.268
Neg logit contributions
.-0.791
der-0.758
lier-0.702
ly-0.694
obs-0.677
Token's
Token ID338
Feature activation+1.77e+00

Pos logit contributions
ll+0.272
ma+0.256
tha+0.245
igi+0.235
aman+0.229
Neg logit contributions
.-0.246
der-0.241
 dot-0.221
powers-0.207
lier-0.205
Token mom
Token ID1995
Feature activation+9.40e+00

Pos logit contributions
ll+0.048
ma+0.044
tha+0.041
igi+0.039
aman+0.038
Neg logit contributions
der-0.050
.-0.048
 dot-0.046
powers-0.044
lier-0.041
Token bought
Token ID5839
Feature activation+1.56e+00

Pos logit contributions
ll+0.251
tha+0.248
liv+0.245
igi+0.243
Els+0.241
Neg logit contributions
.-0.267
lier-0.187
ly-0.183
y-0.182
!-0.181
Token some
Token ID617
Feature activation-3.06e+00

Pos logit contributions
ll+0.054
ma+0.049
tha+0.046
igi+0.044
prise+0.043
Neg logit contributions
.-0.036
der-0.034
 dot-0.026
ly-0.026
lier-0.026
Token a
Token ID257
Feature activation-5.02e+00

Pos logit contributions
ll+0.063
ma+0.059
tha+0.055
igi+0.052
aman+0.051
Neg logit contributions
der-0.045
.-0.043
 dot-0.041
powers-0.038
 target-0.036
Token far
Token ID1290
Feature activation-5.47e+00

Pos logit contributions
der+0.120
.+0.114
 dot+0.110
powers+0.104
lier+0.098
Neg logit contributions
ll-0.156
ma-0.146
tha-0.136
igi-0.130
aman-0.126
Token away
Token ID1497
Feature activation+1.31e+00

Pos logit contributions
der+0.119
powers+0.099
 dot+0.097
.+0.097
angs+0.092
Neg logit contributions
ll-0.183
ma-0.177
tha-0.165
aman-0.158
igi-0.157
Token place
Token ID1295
Feature activation-1.59e+00

Pos logit contributions
ll+0.037
ma+0.035
tha+0.033
igi+0.031
aman+0.031
Neg logit contributions
der-0.032
powers-0.031
angs-0.030
attered-0.029
 trigger-0.028
Token caught
Token ID4978
Feature activation+3.76e+00

Pos logit contributions
der+0.022
 dot+0.020
powers+0.018
 trigger+0.018
attered+0.014
Neg logit contributions
ll-0.069
ma-0.067
tha-0.060
 done-0.058
igi-0.057
Token Pe
Token ID2631
Feature activation+2.18e+01

Pos logit contributions
ll+0.123
ma+0.121
igi+0.109
tha+0.109
aman+0.103
Neg logit contributions
der-0.084
 dot-0.080
powers-0.079
attered-0.067
angs-0.067
Tokenanut
Token ID20651
Feature activation+9.23e+00

Pos logit contributions
ll+0.256
ma+0.226
tha+0.184
Once+0.179
aman+0.163
Neg logit contributions
der-0.891
.-0.882
 dot-0.835
lier-0.811
powers-0.808
Token's
Token ID338
Feature activation+1.67e+00

Pos logit contributions
ll+0.360
ma+0.338
ppa+0.282
igi+0.280
Once+0.279
Neg logit contributions
der-0.158
attered-0.150
powers-0.147
 dot-0.146
 trigger-0.129
Token eye
Token ID4151
Feature activation-1.31e+00

Pos logit contributions
ll+0.049
ma+0.047
tha+0.043
aman+0.040
igi+0.040
Neg logit contributions
der-0.043
 dot-0.038
.-0.038
powers-0.037
lier-0.036
Token.
Token ID13
Feature activation-1.03e+01

Pos logit contributions
powers+0.017
 dot+0.016
der+0.015
attered+0.014
elly+0.013
Neg logit contributions
ll-0.052
ma-0.052
aman-0.046
tha-0.045
ppa-0.045
Token He
Token ID679
Feature activation-1.95e+00

Pos logit contributions
 dot+0.252
der+0.245
angs+0.239
powers+0.229
.+0.229
Neg logit contributions
ll-0.267
tha-0.246
aman-0.240
Wil-0.236
ma-0.233
Token said
Token ID531
Feature activation+7.76e+00

Pos logit contributions
ll+0.124
ma+0.111
igi+0.105
prise+0.102
tha+0.100
Neg logit contributions
.-0.141
der-0.135
 dot-0.117
 target-0.111
lier-0.108
Token,
Token ID11
Feature activation+7.68e+00

Pos logit contributions
ll+0.199
ma+0.184
tha+0.169
igi+0.159
aman+0.156
Neg logit contributions
der-0.231
.-0.218
 dot-0.211
powers-0.204
lier-0.194
Token "
Token ID366
Feature activation+2.14e+00

Pos logit contributions
ll+0.289
ma+0.279
tha+0.259
igi+0.253
prise+0.250
Neg logit contributions
der-0.141
.-0.131
 dot-0.114
powers-0.101
lier-0.095
TokenWow
Token ID22017
Feature activation+8.58e+00

Pos logit contributions
ll+0.062
ma+0.058
tha+0.054
igi+0.051
aman+0.050
Neg logit contributions
der-0.057
.-0.053
 dot-0.052
powers-0.049
lier-0.047
Token,
Token ID11
Feature activation+7.76e+00

Pos logit contributions
ll+0.206
ma+0.191
tha+0.165
Once+0.147
prise+0.145
Neg logit contributions
der-0.280
 dot-0.259
powers-0.257
 target-0.243
elly-0.240
Token Pe
Token ID2631
Feature activation+2.18e+01

Pos logit contributions
ll+0.265
ma+0.243
tha+0.229
igi+0.225
aman+0.214
Neg logit contributions
der-0.172
.-0.167
 dot-0.143
powers-0.134
ly-0.129
Tokenanut
Token ID20651
Feature activation+9.17e+00

Pos logit contributions
ll+0.272
Once+0.257
 babys+0.246
Els+0.241
ma+0.239
Neg logit contributions
.-0.823
der-0.772
lier-0.740
 dot-0.720
obs-0.710
Token is
Token ID318
Feature activation+1.26e+00

Pos logit contributions
ll+0.327
ma+0.308
lly+0.263
tha+0.253
prise+0.247
Neg logit contributions
der-0.210
 trigger-0.187
 dot-0.187
powers-0.186
attered-0.178
Token such
Token ID884
Feature activation+1.45e+00

Pos logit contributions
ll+0.036
ma+0.034
tha+0.032
igi+0.030
aman+0.029
Neg logit contributions
der-0.034
.-0.030
 dot-0.030
powers-0.030
lier-0.027
Token a
Token ID257
Feature activation-4.89e+00

Pos logit contributions
Once+0.037
ll+0.036
 glad+0.036
ma+0.034
Els+0.033
Neg logit contributions
der-0.038
 dot-0.032
obs-0.032
attered-0.032
pool-0.032
Token cool
Token ID3608
Feature activation-2.80e+00

Pos logit contributions
der+0.115
.+0.109
 dot+0.105
powers+0.099
lier+0.093
Neg logit contributions
ll-0.154
ma-0.145
tha-0.135
igi-0.129
aman-0.127
Token all
Token ID477
Feature activation-3.07e+00

Pos logit contributions
ll+0.087
ma+0.087
tha+0.079
Once+0.076
aman+0.074
Neg logit contributions
der-0.089
powers-0.082
 dot-0.080
.-0.076
obs-0.075
Token broken
Token ID5445
Feature activation-6.09e+00

Pos logit contributions
der+0.085
powers+0.078
 dot+0.075
obs+0.074
lier+0.070
Neg logit contributions
ll-0.082
ma-0.082
 done-0.069
tha-0.069
Once-0.067
Token.
Token ID13
Feature activation-1.01e+01

Pos logit contributions
powers+0.124
affe+0.107
vious+0.106
 trigger+0.104
angs+0.104
Neg logit contributions
ma-0.198
ll-0.185
tha-0.165
 done-0.162
 eventually-0.151
Token\n
Token ID198
Feature activation+1.05e+01

Pos logit contributions
.+0.205
der+0.191
 It+0.169
 He+0.157
 dot+0.151
Neg logit contributions
ll-0.373
ma-0.345
prise-0.325
igi-0.312
tha-0.308
Token\n
Token ID198
Feature activation+1.05e+01

Pos logit contributions
ll+0.375
ma+0.359
prise+0.350
tha+0.342
eda+0.328
Neg logit contributions
.-0.227
 He-0.167
 It-0.166
der-0.142
".-0.137
TokenWil
Token ID22327
Feature activation+2.18e+01

Pos logit contributions
ll+0.225
ma+0.197
prise+0.168
tha+0.165
ulu+0.153
Neg logit contributions
.-0.366
der-0.344
powers-0.309
 He-0.308
 It-0.305
Tokenma
Token ID2611
Feature activation+7.17e+00

Pos logit contributions
 Madd+0.231
ulu+0.195
Els+0.180
ll+0.176
igi+0.171
Neg logit contributions
.-0.884
ray-0.789
der-0.784
ly-0.767
lier-0.745
Token felt
Token ID2936
Feature activation+2.58e+00

Pos logit contributions
ll+0.203
igi+0.184
tha+0.177
aman+0.176
ma+0.175
Neg logit contributions
.-0.196
der-0.183
 dot-0.164
ly-0.156
lier-0.153
Token a
Token ID257
Feature activation-4.46e+00

Pos logit contributions
ll+0.067
ma+0.065
tha+0.058
Once+0.055
igi+0.054
Neg logit contributions
der-0.072
 dot-0.067
.-0.065
powers-0.064
lier-0.060
Token big
Token ID1263
Feature activation-2.91e+00

Pos logit contributions
der+0.098
.+0.090
 dot+0.086
powers+0.082
lier+0.077
Neg logit contributions
ll-0.148
ma-0.140
tha-0.131
igi-0.125
aman-0.123
Token sad
Token ID6507
Feature activation+6.06e-01

Pos logit contributions
der+0.075
powers+0.071
obs+0.067
.+0.066
angs+0.066
Neg logit contributions
ma-0.076
ll-0.075
tha-0.067
Once-0.066
aman-0.063
Token to
Token ID284
Feature activation+1.74e+00

Pos logit contributions
 dot+0.093
der+0.088
powers+0.078
 obstacle+0.072
 target+0.070
Neg logit contributions
ll-0.172
ma-0.163
tha-0.148
 done-0.142
igi-0.138
Token buy
Token ID2822
Feature activation+4.94e-01

Pos logit contributions
ll+0.047
ma+0.043
tha+0.040
igi+0.038
aman+0.037
Neg logit contributions
der-0.049
.-0.047
 dot-0.045
powers-0.043
lier-0.041
Token any
Token ID597
Feature activation-5.92e+00

Pos logit contributions
ma+0.014
ll+0.013
tha+0.012
igi+0.012
aman+0.011
Neg logit contributions
der-0.013
 dot-0.013
powers-0.011
 target-0.011
angs-0.011
Token vegetables
Token ID13701
Feature activation-4.63e+00

Pos logit contributions
der+0.152
 dot+0.148
angs+0.133
attered+0.132
pool+0.132
Neg logit contributions
ll-0.168
ma-0.168
prise-0.136
igi-0.135
tha-0.134
Token.
Token ID13
Feature activation-1.05e+01

Pos logit contributions
 dot+0.051
der+0.050
powers+0.043
angs+0.039
 trigger+0.037
Neg logit contributions
ll-0.209
ma-0.195
tha-0.176
 done-0.174
Once-0.173
Token Pe
Token ID2631
Feature activation+2.17e+01

Pos logit contributions
der+0.228
.+0.210
 dot+0.198
powers+0.191
lier+0.176
Neg logit contributions
ll-0.357
ma-0.337
tha-0.313
igi-0.301
aman-0.294
Tokenppa
Token ID44989
Feature activation+9.61e+00

Pos logit contributions
 ah+0.360
ll+0.355
 Madd+0.340
ulu+0.321
Danny+0.311
Neg logit contributions
.-0.739
der-0.728
ly-0.663
lier-0.654
attered-0.629
Token felt
Token ID2936
Feature activation+1.93e+00

Pos logit contributions
ll+0.270
prise+0.265
igi+0.262
tha+0.257
ma+0.256
Neg logit contributions
.-0.249
der-0.242
 dot-0.191
lier-0.190
ly-0.187
Token sorry
Token ID7926
Feature activation+1.21e+00

Pos logit contributions
ll+0.052
ma+0.049
tha+0.045
igi+0.042
aman+0.042
Neg logit contributions
der-0.054
.-0.051
 dot-0.049
powers-0.048
lier-0.045
Token for
Token ID329
Feature activation+2.38e+00

Pos logit contributions
ll+0.039
ma+0.036
tha+0.033
Once+0.032
aman+0.031
Neg logit contributions
der-0.026
 dot-0.025
powers-0.025
.-0.023
angs-0.023
Token him
Token ID683
Feature activation+2.64e+00

Pos logit contributions
ll+0.080
ma+0.069
igi+0.066
prise+0.065
liv+0.064
Neg logit contributions
.-0.058
der-0.054
 dot-0.043
lier-0.040
ly-0.039
Token asked
Token ID1965
Feature activation+5.87e+00

Pos logit contributions
ll+0.134
ma+0.121
tha+0.116
igi+0.116
prise+0.109
Neg logit contributions
.-0.171
der-0.168
 dot-0.152
powers-0.143
lier-0.142
Token.
Token ID13
Feature activation-1.06e+01

Pos logit contributions
ma+0.273
ll+0.268
tha+0.254
aman+0.238
Once+0.237
Neg logit contributions
der-0.053
 dot-0.043
powers-0.041
elly-0.029
pool-0.027
Token\n
Token ID198
Feature activation+9.92e+00

Pos logit contributions
.+0.244
der+0.227
 It+0.198
 dot+0.195
 He+0.190
Neg logit contributions
ll-0.374
ma-0.339
igi-0.331
prise-0.320
tha-0.316
Token\n
Token ID198
Feature activation+9.95e+00

Pos logit contributions
ll+0.330
tha+0.297
igi+0.297
ma+0.296
ulu+0.286
Neg logit contributions
.-0.237
der-0.181
 It-0.176
powers-0.167
 He-0.166
Token"
Token ID1
Feature activation+2.64e+00

Pos logit contributions
ll+0.284
ma+0.249
igi+0.239
tha+0.235
ulu+0.227
Neg logit contributions
.-0.268
der-0.257
 dot-0.237
powers-0.234
lier-0.218
TokenPe
Token ID6435
Feature activation+2.17e+01

Pos logit contributions
ll+0.076
ma+0.072
tha+0.067
igi+0.063
aman+0.062
Neg logit contributions
der-0.069
.-0.065
 dot-0.063
powers-0.060
lier-0.056
Tokenanuts
Token ID37555
Feature activation+1.60e+00

Pos logit contributions
ll+0.459
ma+0.401
Once+0.396
ulu+0.388
aman+0.388
Neg logit contributions
.-0.679
der-0.634
powers-0.587
lier-0.575
 dot-0.568
Token are
Token ID389
Feature activation+4.84e+00

Pos logit contributions
ll+0.046
ma+0.042
tha+0.038
igi+0.036
aman+0.035
Neg logit contributions
der-0.044
 dot-0.040
powers-0.039
.-0.038
lier-0.036
Token a
Token ID257
Feature activation-5.09e+00

Pos logit contributions
ll+0.151
ma+0.142
tha+0.132
igi+0.125
aman+0.125
Neg logit contributions
der-0.116
.-0.110
 dot-0.103
powers-0.099
lier-0.092
Token kind
Token ID1611
Feature activation-2.35e+00

Pos logit contributions
der+0.111
.+0.103
 dot+0.098
powers+0.094
lier+0.087
Neg logit contributions
ll-0.169
ma-0.160
tha-0.150
igi-0.143
aman-0.141
Token of
Token ID286
Feature activation-2.18e+00

Pos logit contributions
der+0.058
 dot+0.051
powers+0.051
ifying+0.048
lier+0.047
Neg logit contributions
ll-0.070
ma-0.069
tha-0.061
igi-0.061
aman-0.060
Token to
Token ID284
Feature activation+2.20e+00

Pos logit contributions
.+0.014
der+0.014
powers+0.010
 dot+0.010
lier+0.008
Neg logit contributions
ll-0.043
ma-0.041
tha-0.041
aman-0.039
igi-0.039
Token see
Token ID766
Feature activation+4.89e+00

Pos logit contributions
ll+0.070
ma+0.066
tha+0.063
igi+0.060
aman+0.060
Neg logit contributions
.-0.051
der-0.049
 dot-0.046
powers-0.043
 target-0.040
Token the
Token ID262
Feature activation+6.38e-01

Pos logit contributions
ll+0.132
ma+0.118
tha+0.115
igi+0.105
aman+0.104
Neg logit contributions
.-0.144
der-0.140
 dot-0.124
lier-0.119
powers-0.118
Token attic
Token ID46113
Feature activation-5.74e+00

Pos logit contributions
ll+0.022
ma+0.020
tha+0.019
igi+0.018
aman+0.018
Neg logit contributions
der-0.013
.-0.013
 dot-0.012
powers-0.011
lier-0.010
Token,
Token ID11
Feature activation+7.92e+00

Pos logit contributions
der+0.165
powers+0.150
 dot+0.149
.+0.141
angs+0.138
Neg logit contributions
ll-0.154
ma-0.149
tha-0.129
Once-0.120
igi-0.119
Token Nib
Token ID49532
Feature activation+2.17e+01

Pos logit contributions
ll+0.258
ma+0.237
tha+0.234
liv+0.231
str+0.225
Neg logit contributions
.-0.191
der-0.178
 dot-0.166
 bone-0.142
 He-0.142
Tokenbles
Token ID7689
Feature activation-2.57e+00

Pos logit contributions
ll+0.482
ma+0.398
tha+0.347
igi+0.342
ulu+0.294
Neg logit contributions
der-0.760
powers-0.727
 dot-0.719
.-0.702
pool-0.670
Token?
Token ID30
Feature activation-8.76e+00

Pos logit contributions
der+0.065
.+0.060
 dot+0.058
powers+0.056
lier+0.053
Neg logit contributions
ll-0.077
ma-0.072
tha-0.067
igi-0.063
aman-0.062
Token There
Token ID1318
Feature activation-4.65e+00

Pos logit contributions
der+0.217
 dot+0.191
.+0.186
powers+0.181
 target+0.179
Neg logit contributions
ll-0.259
ma-0.242
tha-0.220
Once-0.217
igi-0.212
Token are
Token ID389
Feature activation+5.48e+00

Pos logit contributions
.+0.110
der+0.110
 dot+0.100
powers+0.096
lier+0.091
Neg logit contributions
ll-0.141
ma-0.131
tha-0.126
igi-0.121
aman-0.118
Token many
Token ID867
Feature activation-1.69e+00

Pos logit contributions
ll+0.183
ma+0.172
tha+0.164
aman+0.155
igi+0.154
Neg logit contributions
der-0.120
.-0.114
 dot-0.104
powers-0.101
lier-0.093
Token He
Token ID679
Feature activation-3.82e+00

Pos logit contributions
der+0.119
.+0.107
 dot+0.106
powers+0.101
lier+0.095
Neg logit contributions
ll-0.153
ma-0.145
tha-0.134
igi-0.126
aman-0.125
Token took
Token ID1718
Feature activation-2.85e-01

Pos logit contributions
der+0.099
.+0.097
powers+0.083
 dot+0.083
 target+0.077
Neg logit contributions
ll-0.114
igi-0.106
aman-0.104
tha-0.104
ma-0.103
Token out
Token ID503
Feature activation-4.57e+00

Pos logit contributions
der+0.006
 dot+0.006
.+0.005
powers+0.005
elly+0.005
Neg logit contributions
ma-0.009
ll-0.009
tha-0.008
Once-0.008
igi-0.008
Token a
Token ID257
Feature activation-6.40e+00

Pos logit contributions
der+0.111
 dot+0.100
.+0.095
powers+0.095
lier+0.090
Neg logit contributions
ma-0.138
ll-0.135
tha-0.121
Once-0.118
aman-0.113
Token magn
Token ID7842
Feature activation-7.55e-01

Pos logit contributions
.+0.118
der+0.113
 dot+0.111
powers+0.104
 target+0.095
Neg logit contributions
ll-0.234
ma-0.215
tha-0.208
igi-0.203
aman-0.194
Tokenifying
Token ID4035
Feature activation-2.38e+01

Pos logit contributions
der+0.010
.+0.009
powers+0.009
 textures+0.007
 obstacle+0.007
Neg logit contributions
ll-0.030
ma-0.029
tha-0.027
aman-0.026
Once-0.025
Token glass
Token ID5405
Feature activation-3.78e+00

Pos logit contributions
der+0.490
.+0.430
 dot+0.417
powers+0.410
lier+0.387
Neg logit contributions
ll-0.801
ma-0.777
tha-0.715
aman-0.693
Once-0.685
Token and
Token ID290
Feature activation+3.38e+00

Pos logit contributions
der+0.096
powers+0.091
 dot+0.087
lier+0.085
attered+0.083
Neg logit contributions
ma-0.102
ll-0.099
tha-0.082
Once-0.081
aman-0.080
Token a
Token ID257
Feature activation-6.41e+00

Pos logit contributions
ll+0.091
ma+0.084
tha+0.078
igi+0.074
aman+0.072
Neg logit contributions
der-0.097
.-0.092
 dot-0.088
powers-0.085
lier-0.080
Token notebook
Token ID20922
Feature activation-9.15e+00

Pos logit contributions
der+0.139
.+0.128
 dot+0.123
powers+0.117
lier+0.109
Neg logit contributions
ll-0.214
ma-0.202
tha-0.189
igi-0.180
aman-0.178
Token and
Token ID290
Feature activation+3.36e+00

Pos logit contributions
der+0.156
 dot+0.152
powers+0.149
elly+0.141
attered+0.140
Neg logit contributions
ll-0.320
ma-0.313
Once-0.270
tha-0.268
 done-0.264
Token a
Token ID257
Feature activation-6.47e+00

Pos logit contributions
.+0.047
der+0.046
powers+0.038
 dot+0.038
lier+0.032
Neg logit contributions
ll-0.132
ma-0.124
tha-0.119
aman-0.115
igi-0.114
Token flashlight
Token ID38371
Feature activation-8.61e+00

Pos logit contributions
.+0.106
 dot+0.098
der+0.094
powers+0.090
 target+0.083
Neg logit contributions
ll-0.248
ma-0.234
tha-0.223
igi-0.214
aman-0.213
Token and
Token ID290
Feature activation+3.23e+00

Pos logit contributions
powers+0.234
der+0.229
lier+0.216
 dot+0.213
elly+0.206
Neg logit contributions
ll-0.225
ma-0.215
tha-0.186
Once-0.165
aman-0.165
Token a
Token ID257
Feature activation-6.47e+00

Pos logit contributions
ll+0.106
ma+0.104
tha+0.096
Once+0.091
prise+0.091
Neg logit contributions
der-0.068
 dot-0.059
.-0.057
powers-0.054
lier-0.054
Token magn
Token ID7842
Feature activation-7.78e-01

Pos logit contributions
der+0.136
.+0.124
 dot+0.119
powers+0.114
lier+0.105
Neg logit contributions
ll-0.221
ma-0.209
tha-0.195
igi-0.187
aman-0.184
Tokenifying
Token ID4035
Feature activation-2.37e+01

Pos logit contributions
powers+0.011
der+0.011
 obstacle+0.010
.+0.009
 textures+0.009
Neg logit contributions
ll-0.031
ma-0.028
tha-0.026
igi-0.026
ulu-0.026
Token glass
Token ID5405
Feature activation-3.91e+00

Pos logit contributions
der+0.515
.+0.445
powers+0.435
 dot+0.432
lier+0.419
Neg logit contributions
ll-0.762
ma-0.743
tha-0.676
Once-0.670
aman-0.659
Token to
Token ID284
Feature activation+4.41e-01

Pos logit contributions
der+0.071
powers+0.070
 dot+0.064
lier+0.063
assion+0.061
Neg logit contributions
ma-0.127
ll-0.125
ulu-0.112
igi-0.106
tha-0.104
Token examine
Token ID10716
Feature activation-2.09e+00

Pos logit contributions
ma+0.011
ll+0.011
tha+0.010
lly+0.009
igi+0.009
Neg logit contributions
der-0.014
elly-0.011
powers-0.011
 dot-0.011
lier-0.011
Token their
Token ID511
Feature activation+2.36e+00

Pos logit contributions
der+0.046
powers+0.046
 dot+0.043
attered+0.040
 target+0.040
Neg logit contributions
ma-0.067
ll-0.058
tha-0.056
igi-0.054
lly-0.053
Token finds
Token ID7228
Feature activation-5.17e-01

Pos logit contributions
ll+0.071
ma+0.068
tha+0.062
aman+0.058
igi+0.057
Neg logit contributions
der-0.060
 dot-0.051
.-0.051
powers-0.050
lier-0.049
Token said
Token ID531
Feature activation+6.08e+00

Pos logit contributions
ll+0.076
ma+0.067
tha+0.063
igi+0.061
aman+0.056
Neg logit contributions
.-0.145
der-0.144
 dot-0.135
powers-0.132
lier-0.127
Token,
Token ID11
Feature activation+6.24e+00

Pos logit contributions
ma+0.171
ll+0.167
tha+0.153
 pleaded+0.140
 babys+0.140
Neg logit contributions
der-0.162
 dot-0.147
 obstacle-0.144
ifying-0.142
powers-0.141
Token holding
Token ID4769
Feature activation+4.77e-01

Pos logit contributions
ll+0.137
ma+0.129
tha+0.119
Once+0.106
aman+0.105
Neg logit contributions
der-0.202
powers-0.186
 dot-0.185
.-0.184
elly-0.174
Token the
Token ID262
Feature activation-1.06e+00

Pos logit contributions
ma+0.012
tha+0.010
Once+0.008
Danny+0.008
ilia+0.008
Neg logit contributions
der-0.017
powers-0.015
elly-0.014
 dot-0.014
oky-0.013
Token magn
Token ID7842
Feature activation-5.87e-01

Pos logit contributions
der+0.026
lier+0.023
.+0.022
powers+0.022
attered+0.021
Neg logit contributions
ll-0.032
ma-0.032
tha-0.029
aman-0.027
Once-0.026
Tokenifying
Token ID4035
Feature activation-2.37e+01

Pos logit contributions
der+0.010
powers+0.010
 textures+0.010
 sounds+0.009
.+0.009
Neg logit contributions
ll-0.020
ma-0.019
tha-0.017
Once-0.017
ulu-0.017
Token glass
Token ID5405
Feature activation-3.80e+00

Pos logit contributions
der+0.569
lier+0.495
powers+0.484
ly+0.480
attered+0.472
Neg logit contributions
ma-0.670
ll-0.641
Once-0.609
aman-0.579
 babys-0.574
Token over
Token ID625
Feature activation-5.52e-01

Pos logit contributions
powers+0.074
assion+0.071
der+0.067
 feats+0.066
 Dogs+0.065
Neg logit contributions
ma-0.113
ll-0.101
ulu-0.097
ela-0.089
Once-0.087
Token a
Token ID257
Feature activation-6.35e+00

Pos logit contributions
der+0.018
powers+0.016
 dot+0.016
angs+0.015
obs+0.015
Neg logit contributions
ma-0.013
ll-0.012
tha-0.011
Once-0.010
aman-0.010
Token black
Token ID2042
Feature activation-6.39e+00

Pos logit contributions
der+0.152
powers+0.128
attered+0.123
.+0.123
angs+0.121
Neg logit contributions
ll-0.191
ma-0.187
Once-0.172
tha-0.170
aman-0.164
Token ant
Token ID1885
Feature activation-2.07e+00

Pos logit contributions
powers+0.175
attered+0.154
assion+0.143
der+0.140
inters+0.140
Neg logit contributions
ma-0.164
 ha-0.151
Once-0.141
aman-0.133
 babys-0.132
Token the
Token ID262
Feature activation-1.04e+00

Pos logit contributions
ll+0.022
ma+0.020
tha+0.019
igi+0.018
aman+0.017
Neg logit contributions
der-0.021
.-0.019
 dot-0.019
powers-0.018
lier-0.017
Token watch
Token ID2342
Feature activation-2.86e+00

Pos logit contributions
der+0.025
powers+0.021
.+0.021
 dot+0.021
lier+0.021
Neg logit contributions
ll-0.032
ma-0.031
tha-0.028
aman-0.026
Once-0.026
Token with
Token ID351
Feature activation-5.64e-02

Pos logit contributions
der+0.057
powers+0.057
 dot+0.049
attered+0.047
elly+0.046
Neg logit contributions
ma-0.098
ll-0.092
tha-0.084
Once-0.081
aman-0.081
Token the
Token ID262
Feature activation-1.06e+00

Pos logit contributions
der+0.001
 dot+0.001
powers+0.001
elly+0.001
obs+0.001
Neg logit contributions
ma-0.002
ll-0.001
tha-0.001
Once-0.001
 glad-0.001
Token magn
Token ID7842
Feature activation-8.27e-01

Pos logit contributions
der+0.024
.+0.021
 dot+0.021
powers+0.020
lier+0.020
Neg logit contributions
ll-0.034
ma-0.033
tha-0.030
igi-0.028
aman-0.028
Tokenifying
Token ID4035
Feature activation-2.36e+01

Pos logit contributions
powers+0.015
der+0.014
 obstacle+0.014
 textures+0.014
 situations+0.012
Neg logit contributions
ll-0.028
ma-0.026
tha-0.024
Once-0.023
ulu-0.023
Token glass
Token ID5405
Feature activation-4.01e+00

Pos logit contributions
der+0.520
powers+0.436
 dot+0.436
.+0.427
lier+0.419
Neg logit contributions
ma-0.738
ll-0.734
Once-0.656
tha-0.656
aman-0.650
Token.
Token ID13
Feature activation-1.20e+01

Pos logit contributions
powers+0.064
der+0.062
 dot+0.059
attered+0.055
obs+0.054
Neg logit contributions
ma-0.142
ll-0.139
ulu-0.122
tha-0.120
 done-0.116
Token He
Token ID679
Feature activation-3.81e+00

Pos logit contributions
ifying+0.208
 dot+0.207
der+0.191
lier+0.191
angs+0.183
Neg logit contributions
ll-0.359
Once-0.342
 Madd-0.332
ma-0.332
Finally-0.323
Token saw
Token ID2497
Feature activation+1.20e+00

Pos logit contributions
.+0.093
der+0.083
 dot+0.070
 drops+0.068
 spots+0.067
Neg logit contributions
ll-0.120
igi-0.116
aman-0.114
ma-0.114
tha-0.113
Token the
Token ID262
Feature activation-1.15e+00

Pos logit contributions
ll+0.040
ma+0.040
tha+0.036
igi+0.035
Once+0.034
Neg logit contributions
der-0.024
 dot-0.022
.-0.021
powers-0.021
lier-0.019
Token the
Token ID262
Feature activation-9.62e-01

Pos logit contributions
der+0.131
 dot+0.120
.+0.119
powers+0.116
lier+0.110
Neg logit contributions
ll-0.112
ma-0.109
tha-0.097
igi-0.091
aman-0.089
Token necklace
Token ID38987
Feature activation-2.99e+00

Pos logit contributions
.+0.021
der+0.021
 dot+0.020
powers+0.019
 target+0.017
Neg logit contributions
ll-0.032
ma-0.030
tha-0.028
igi-0.027
aman-0.027
Token and
Token ID290
Feature activation+3.06e+00

Pos logit contributions
der+0.077
 dot+0.070
powers+0.070
lier+0.064
angs+0.064
Neg logit contributions
ll-0.089
ma-0.085
tha-0.075
igi-0.071
aman-0.070
Token the
Token ID262
Feature activation-1.02e+00

Pos logit contributions
ll+0.088
ma+0.079
tha+0.074
igi+0.072
aman+0.071
Neg logit contributions
der-0.082
.-0.082
 dot-0.075
powers-0.072
 target-0.068
Token magn
Token ID7842
Feature activation-7.83e-01

Pos logit contributions
der+0.022
.+0.022
 dot+0.021
powers+0.019
 target+0.018
Neg logit contributions
ll-0.034
ma-0.032
tha-0.030
igi-0.029
aman-0.028
Tokenifying
Token ID4035
Feature activation-2.36e+01

Pos logit contributions
der+0.011
powers+0.011
.+0.010
 obstacle+0.009
 word+0.008
Neg logit contributions
ll-0.031
ma-0.029
tha-0.027
igi-0.026
aman-0.026
Token glass
Token ID5405
Feature activation-3.80e+00

Pos logit contributions
der+0.469
.+0.408
 dot+0.397
powers+0.392
lier+0.368
Neg logit contributions
ll-0.808
ma-0.783
tha-0.725
aman-0.700
igi-0.695
Token and
Token ID290
Feature activation+2.97e+00

Pos logit contributions
der+0.044
powers+0.042
 dot+0.040
 obstacle+0.038
lier+0.036
Neg logit contributions
ma-0.151
ll-0.149
ulu-0.134
tha-0.133
igi-0.132
Token started
Token ID2067
Feature activation-7.65e+00

Pos logit contributions
ll+0.087
ma+0.079
tha+0.073
igi+0.071
aman+0.070
Neg logit contributions
der-0.080
.-0.079
 dot-0.072
powers-0.069
 target-0.065
Token to
Token ID284
Feature activation+6.48e-01

Pos logit contributions
der+0.167
 dot+0.145
lier+0.138
pool+0.135
powers+0.133
Neg logit contributions
ll-0.255
ma-0.243
tha-0.224
igi-0.211
Once-0.210
Token examine
Token ID10716
Feature activation-2.32e+00

Pos logit contributions
ll+0.020
ma+0.019
tha+0.018
igi+0.017
aman+0.017
Neg logit contributions
der-0.016
.-0.014
 dot-0.014
powers-0.014
lier-0.013
Token,
Token ID11
Feature activation+6.18e+00

Pos logit contributions
der+0.088
 dot+0.083
powers+0.078
.+0.073
angs+0.071
Neg logit contributions
ll-0.135
ma-0.133
tha-0.118
aman-0.112
Once-0.112
Token next
Token ID1306
Feature activation-1.10e+00

Pos logit contributions
ll+0.227
igi+0.198
ma+0.196
prise+0.191
tha+0.186
Neg logit contributions
.-0.138
der-0.128
 dot-0.107
 towards-0.102
 It-0.095
Token to
Token ID284
Feature activation+8.02e-01

Pos logit contributions
der+0.025
 dot+0.022
lier+0.021
 trigger+0.021
powers+0.020
Neg logit contributions
ma-0.037
ll-0.036
tha-0.032
Once-0.031
 done-0.031
Token her
Token ID607
Feature activation+9.12e-01

Pos logit contributions
ll+0.021
ma+0.020
tha+0.018
igi+0.017
aman+0.017
Neg logit contributions
der-0.023
.-0.022
 dot-0.021
powers-0.021
lier-0.020
Token magn
Token ID7842
Feature activation-7.08e-01

Pos logit contributions
ll+0.020
ma+0.020
tha+0.017
aman+0.015
Once+0.015
Neg logit contributions
der-0.030
 dot-0.027
powers-0.027
.-0.027
lier-0.026
Tokenifying
Token ID4035
Feature activation-2.36e+01

Pos logit contributions
der+0.009
powers+0.008
.+0.007
 obstacle+0.007
 dot+0.006
Neg logit contributions
ll-0.030
ma-0.029
tha-0.026
aman-0.026
Once-0.025
Token glass
Token ID5405
Feature activation-3.84e+00

Pos logit contributions
der+0.493
 dot+0.427
.+0.423
powers+0.410
lier+0.397
Neg logit contributions
ll-0.779
ma-0.766
tha-0.688
aman-0.678
Once-0.673
Token.
Token ID13
Feature activation-1.20e+01

Pos logit contributions
der+0.093
 dot+0.087
powers+0.086
lier+0.078
attered+0.078
Neg logit contributions
ma-0.117
ll-0.109
tha-0.096
ulu-0.091
igi-0.090
Token She
Token ID1375
Feature activation-2.37e+00

Pos logit contributions
der+0.252
 dot+0.232
.+0.223
powers+0.212
lier+0.204
Neg logit contributions
ll-0.392
ma-0.378
tha-0.349
Once-0.343
aman-0.336
Token looked
Token ID3114
Feature activation-2.98e+00

Pos logit contributions
der+0.064
.+0.064
powers+0.056
 dot+0.055
 spots+0.050
Neg logit contributions
ll-0.066
igi-0.061
tha-0.060
ma-0.059
aman-0.057
Token at
Token ID379
Feature activation+4.30e-01

Pos logit contributions
.+0.041
der+0.041
 dot+0.032
powers+0.029
lier+0.028
Neg logit contributions
ll-0.126
ma-0.118
tha-0.115
igi-0.112
aman-0.109
Token\n
Token ID198
Feature activation+9.04e+00

Pos logit contributions
tha+0.309
ll+0.300
ma+0.293
aman+0.290
prise+0.276
Neg logit contributions
.-0.206
powers-0.143
 It-0.143
 He-0.140
der-0.136
TokenHe
Token ID1544
Feature activation-3.23e+00

Pos logit contributions
ll+0.232
ma+0.216
tha+0.196
igi+0.185
aman+0.181
Neg logit contributions
der-0.273
.-0.256
 dot-0.250
powers-0.241
lier-0.229
Token brought
Token ID3181
Feature activation-1.20e+00

Pos logit contributions
.+0.079
der+0.071
ly+0.064
powers+0.064
 dot+0.064
Neg logit contributions
ll-0.101
ma-0.097
aman-0.089
tha-0.089
prise-0.085
Token his
Token ID465
Feature activation+4.14e-01

Pos logit contributions
der+0.026
.+0.023
 dot+0.022
powers+0.021
lier+0.020
Neg logit contributions
ll-0.041
ma-0.039
tha-0.037
igi-0.035
aman-0.034
Token magn
Token ID7842
Feature activation-5.08e-01

Pos logit contributions
ll+0.011
ma+0.011
tha+0.010
igi+0.009
aman+0.009
Neg logit contributions
der-0.011
.-0.011
 dot-0.010
powers-0.010
lier-0.010
Tokenifying
Token ID4035
Feature activation-2.35e+01

Pos logit contributions
der+0.006
.+0.005
powers+0.004
 dot+0.004
 obstacle+0.003
Neg logit contributions
ll-0.022
ma-0.021
tha-0.020
Once-0.019
igi-0.019
Token glass
Token ID5405
Feature activation-3.52e+00

Pos logit contributions
der+0.462
.+0.410
 dot+0.400
powers+0.378
lier+0.357
Neg logit contributions
ll-0.824
ma-0.787
tha-0.729
aman-0.702
igi-0.701
Token and
Token ID290
Feature activation+3.46e+00

Pos logit contributions
der+0.083
 dot+0.074
lier+0.070
powers+0.070
.+0.066
Neg logit contributions
ll-0.107
ma-0.104
tha-0.090
igi-0.089
ulu-0.087
Token looked
Token ID3114
Feature activation-2.57e+00

Pos logit contributions
ll+0.092
ma+0.086
tha+0.079
igi+0.073
aman+0.072
Neg logit contributions
der-0.098
 dot-0.089
.-0.088
powers-0.084
lier-0.083
Token closely
Token ID7173
Feature activation-3.69e+00

Pos logit contributions
der+0.053
 dot+0.047
.+0.045
powers+0.045
lier+0.040
Neg logit contributions
ll-0.090
ma-0.085
tha-0.079
igi-0.076
aman-0.074
Token.
Token ID13
Feature activation-1.17e+01

Pos logit contributions
 dot+0.054
powers+0.051
der+0.046
 target+0.043
angs+0.043
Neg logit contributions
ll-0.149
ma-0.140
tha-0.127
aman-0.122
Once-0.121
Token hat
Token ID6877
Feature activation-5.02e+00

Pos logit contributions
ll+0.033
ma+0.033
tha+0.030
Once+0.028
aman+0.027
Neg logit contributions
der-0.024
 dot-0.021
.-0.020
powers-0.020
lier-0.019
Token and
Token ID290
Feature activation+3.21e+00

Pos logit contributions
der+0.115
 dot+0.103
powers+0.101
.+0.100
lier+0.091
Neg logit contributions
ll-0.161
ma-0.155
tha-0.140
igi-0.133
aman-0.132
Token grabbed
Token ID13176
Feature activation-9.17e-01

Pos logit contributions
ll+0.095
ma+0.085
tha+0.080
igi+0.077
aman+0.077
Neg logit contributions
der-0.084
.-0.082
 dot-0.075
powers-0.073
 target-0.067
Token her
Token ID607
Feature activation+1.10e+00

Pos logit contributions
der+0.028
powers+0.026
 dot+0.026
elly+0.025
attered+0.024
Neg logit contributions
ma-0.021
ll-0.018
tha-0.018
Once-0.017
aman-0.016
Token magn
Token ID7842
Feature activation-3.36e-01

Pos logit contributions
ma+0.033
ll+0.032
tha+0.029
Once+0.026
 babys+0.026
Neg logit contributions
der-0.028
 dot-0.024
powers-0.024
lier-0.023
attered-0.023
Tokenifying
Token ID4035
Feature activation-2.34e+01

Pos logit contributions
der+0.005
 obstacle+0.004
powers+0.004
.+0.004
 word+0.003
Neg logit contributions
ll-0.013
ma-0.013
tha-0.012
aman-0.012
Once-0.012
Token glass
Token ID5405
Feature activation-3.49e+00

Pos logit contributions
der+0.511
 dot+0.448
.+0.434
powers+0.424
lier+0.415
Neg logit contributions
ll-0.753
ma-0.734
tha-0.660
aman-0.648
Once-0.647
Token.
Token ID13
Feature activation-1.19e+01

Pos logit contributions
der+0.051
 dot+0.045
powers+0.044
 obstacle+0.043
 target+0.041
Neg logit contributions
ma-0.135
ll-0.127
ulu-0.110
igi-0.109
Once-0.107
Token She
Token ID1375
Feature activation-2.10e+00

Pos logit contributions
der+0.258
.+0.231
 dot+0.231
powers+0.217
lier+0.209
Neg logit contributions
ll-0.390
ma-0.369
tha-0.342
Once-0.330
igi-0.329
Token was
Token ID373
Feature activation+1.62e+00

Pos logit contributions
der+0.058
.+0.058
 dot+0.051
powers+0.051
lier+0.047
Neg logit contributions
ll-0.058
ma-0.053
tha-0.052
aman-0.051
igi-0.050
Token ready
Token ID3492
Feature activation+5.33e-03

Pos logit contributions
ll+0.048
ma+0.042
tha+0.041
aman+0.041
igi+0.039
Neg logit contributions
.-0.045
der-0.041
 dot-0.036
powers-0.035
lier-0.034
Token flashlight
Token ID38371
Feature activation-8.06e+00

Pos logit contributions
der+0.116
.+0.105
 dot+0.101
powers+0.097
lier+0.091
Neg logit contributions
ll-0.173
ma-0.164
tha-0.152
igi-0.145
aman-0.144
Token,
Token ID11
Feature activation+7.03e+00

Pos logit contributions
der+0.156
powers+0.150
 dot+0.138
lier+0.134
elly+0.133
Neg logit contributions
ll-0.271
ma-0.268
tha-0.238
aman-0.222
 done-0.219
Token and
Token ID290
Feature activation+3.97e+00

Pos logit contributions
ll+0.239
ma+0.229
tha+0.213
igi+0.203
aman+0.201
Neg logit contributions
der-0.146
.-0.131
 dot-0.129
powers-0.123
lier-0.113
Token a
Token ID257
Feature activation-5.49e+00

Pos logit contributions
ll+0.130
ma+0.124
tha+0.115
igi+0.110
aman+0.108
Neg logit contributions
der-0.087
.-0.079
 dot-0.078
powers-0.074
lier-0.069
Token magn
Token ID7842
Feature activation-3.95e-01

Pos logit contributions
der+0.126
.+0.112
 dot+0.108
powers+0.105
lier+0.100
Neg logit contributions
ll-0.175
ma-0.168
tha-0.154
aman-0.146
igi-0.146
Tokenifying
Token ID4035
Feature activation-2.33e+01

Pos logit contributions
powers+0.006
der+0.006
 obstacle+0.005
 textures+0.005
 target+0.005
Neg logit contributions
ll-0.015
ma-0.014
tha-0.013
aman-0.012
igi-0.012
Token glass
Token ID5405
Feature activation-3.57e+00

Pos logit contributions
der+0.517
.+0.444
powers+0.442
 dot+0.436
lier+0.420
Neg logit contributions
ll-0.733
ma-0.727
tha-0.659
aman-0.650
Once-0.647
Token.
Token ID13
Feature activation-1.10e+01

Pos logit contributions
der+0.052
powers+0.045
 dot+0.044
 target+0.036
 trigger+0.036
Neg logit contributions
ma-0.142
ll-0.135
ulu-0.118
tha-0.118
igi-0.116
Token\n
Token ID198
Feature activation+8.93e+00

Pos logit contributions
der+0.223
.+0.208
 dot+0.193
powers+0.184
lier+0.170
Neg logit contributions
ll-0.391
ma-0.366
tha-0.345
igi-0.330
aman-0.323
Token\n
Token ID198
Feature activation+8.98e+00

Pos logit contributions
ll+0.304
tha+0.287
ma+0.269
aman+0.254
igi+0.249
Neg logit contributions
.-0.204
 It-0.162
der-0.155
powers-0.151
 He-0.148
TokenAs
Token ID1722
Feature activation+2.18e+00

Pos logit contributions
ll+0.260
tha+0.230
ma+0.227
igi+0.202
aman+0.197
Neg logit contributions
.-0.232
der-0.220
powers-0.214
 dot-0.207
lier-0.203
Token She
Token ID1375
Feature activation-1.48e+00

Pos logit contributions
der+0.268
 dot+0.242
.+0.241
powers+0.235
lier+0.220
Neg logit contributions
ll-0.339
ma-0.320
tha-0.298
igi-0.280
aman-0.280
Token takes
Token ID2753
Feature activation-1.36e-01

Pos logit contributions
der+0.037
 dot+0.034
.+0.034
powers+0.033
pool+0.032
Neg logit contributions
ll-0.043
ma-0.041
tha-0.036
aman-0.034
Once-0.034
Token out
Token ID503
Feature activation-4.06e+00

Pos logit contributions
der+0.003
 dot+0.003
powers+0.003
elly+0.003
.+0.003
Neg logit contributions
ma-0.004
ll-0.004
tha-0.004
Once-0.004
prise-0.003
Token a
Token ID257
Feature activation-5.82e+00

Pos logit contributions
der+0.106
 dot+0.099
powers+0.096
lier+0.093
attered+0.092
Neg logit contributions
ma-0.115
ll-0.104
tha-0.098
Once-0.096
ulu-0.092
Token magn
Token ID7842
Feature activation-3.20e-01

Pos logit contributions
der+0.121
.+0.110
 dot+0.106
powers+0.101
lier+0.093
Neg logit contributions
ll-0.200
ma-0.190
tha-0.178
igi-0.170
aman-0.168
Tokenifying
Token ID4035
Feature activation-2.33e+01

Pos logit contributions
der+0.005
powers+0.004
.+0.004
 textures+0.004
 obstacle+0.003
Neg logit contributions
ll-0.012
ma-0.011
tha-0.011
aman-0.010
Once-0.010
Token glass
Token ID5405
Feature activation-3.75e+00

Pos logit contributions
der+0.460
.+0.410
 dot+0.396
powers+0.382
lier+0.357
Neg logit contributions
ll-0.811
ma-0.782
tha-0.726
aman-0.696
igi-0.692
Token.
Token ID13
Feature activation-1.14e+01

Pos logit contributions
der+0.091
powers+0.088
 dot+0.082
lier+0.082
attered+0.080
Neg logit contributions
ma-0.106
ll-0.097
Once-0.083
ulu-0.082
tha-0.081
Token She
Token ID1375
Feature activation-1.89e+00

Pos logit contributions
der+0.239
 dot+0.213
.+0.211
powers+0.205
lier+0.194
Neg logit contributions
ll-0.379
ma-0.357
tha-0.336
aman-0.320
Once-0.318
Token looks
Token ID3073
Feature activation-3.59e+00

Pos logit contributions
der+0.046
.+0.042
 dot+0.041
powers+0.039
lier+0.037
Neg logit contributions
ll-0.058
ma-0.055
tha-0.051
igi-0.048
aman-0.048
Token at
Token ID379
Feature activation+9.11e-01

Pos logit contributions
der+0.061
.+0.057
 dot+0.051
powers+0.048
lier+0.044
Neg logit contributions
ll-0.139
ma-0.132
tha-0.125
igi-0.121
aman-0.119
Token She
Token ID1375
Feature activation-2.02e+00

Pos logit contributions
der+0.263
.+0.250
 dot+0.227
powers+0.217
lier+0.205
Neg logit contributions
ll-0.367
ma-0.346
tha-0.326
igi-0.307
prise-0.303
Token had
Token ID550
Feature activation-2.82e+00

Pos logit contributions
.+0.045
der+0.042
 dot+0.037
lier+0.032
powers+0.032
Neg logit contributions
ll-0.066
tha-0.062
igi-0.062
aman-0.062
ma-0.061
Token a
Token ID257
Feature activation-6.02e+00

Pos logit contributions
der+0.046
.+0.043
 dot+0.039
powers+0.037
lier+0.033
Neg logit contributions
ll-0.111
ma-0.105
tha-0.100
igi-0.096
aman-0.095
Token big
Token ID1263
Feature activation-4.47e+00

Pos logit contributions
.+0.100
 dot+0.095
der+0.088
powers+0.082
 target+0.077
Neg logit contributions
ll-0.235
ma-0.217
tha-0.209
igi-0.206
aman-0.196
Token magn
Token ID7842
Feature activation-1.35e-02

Pos logit contributions
.+0.100
 dot+0.096
der+0.093
 target+0.081
powers+0.080
Neg logit contributions
ll-0.154
ma-0.133
igi-0.133
tha-0.130
aman-0.122
Tokenifying
Token ID4035
Feature activation-2.33e+01

Pos logit contributions
der+0.000
powers+0.000
.+0.000
 obstacle+0.000
 target+0.000
Neg logit contributions
ll-0.001
ma-0.001
tha-0.000
aman-0.000
Once-0.000
Token glass
Token ID5405
Feature activation-3.33e+00

Pos logit contributions
der+0.476
.+0.411
 dot+0.400
powers+0.400
lier+0.372
Neg logit contributions
ll-0.787
ma-0.760
tha-0.698
Once-0.682
aman-0.680
Token that
Token ID326
Feature activation-4.29e+00

Pos logit contributions
der+0.081
powers+0.079
 dot+0.076
lier+0.071
 trigger+0.071
Neg logit contributions
ma-0.093
ll-0.092
ulu-0.078
tha-0.075
igi-0.073
Token made
Token ID925
Feature activation-2.90e+00

Pos logit contributions
.+0.115
der+0.112
 dot+0.102
powers+0.099
lier+0.092
Neg logit contributions
ll-0.120
ma-0.110
tha-0.104
igi-0.101
aman-0.100
Token things
Token ID1243
Feature activation-5.67e+00

Pos logit contributions
der+0.083
 dot+0.075
powers+0.073
.+0.069
 target+0.069
Neg logit contributions
ll-0.074
ma-0.074
tha-0.065
aman-0.060
igi-0.060
Token look
Token ID804
Feature activation-8.91e-01

Pos logit contributions
der+0.148
elly+0.134
powers+0.131
attered+0.124
angs+0.124
Neg logit contributions
ll-0.159
ma-0.145
 done-0.132
Once-0.127
tha-0.126
Token,
Token ID11
Feature activation+6.42e+00

Pos logit contributions
ll+0.179
ma+0.170
tha+0.151
 glad+0.141
Once+0.139
Neg logit contributions
 dot-0.169
der-0.159
powers-0.156
elly-0.154
 target-0.152
Token handing
Token ID22786
Feature activation-1.42e+00

Pos logit contributions
ll+0.238
ma+0.227
tha+0.212
igi+0.209
aman+0.205
Neg logit contributions
der-0.120
.-0.114
 dot-0.100
powers-0.090
lier-0.085
Token him
Token ID683
Feature activation+1.69e+00

Pos logit contributions
der+0.039
.+0.036
 dot+0.035
powers+0.034
lier+0.032
Neg logit contributions
ll-0.040
ma-0.038
tha-0.035
igi-0.032
aman-0.032
Token the
Token ID262
Feature activation-6.08e-01

Pos logit contributions
ma+0.043
ll+0.038
tha+0.033
Once+0.030
igi+0.029
Neg logit contributions
der-0.058
 dot-0.054
powers-0.049
elly-0.048
attered-0.047
Token magn
Token ID7842
Feature activation-6.72e-01

Pos logit contributions
der+0.015
.+0.012
lier+0.012
 dot+0.012
powers+0.012
Neg logit contributions
ll-0.018
ma-0.018
tha-0.016
prise-0.015
aman-0.015
Tokenifying
Token ID4035
Feature activation-2.33e+01

Pos logit contributions
der+0.011
powers+0.011
 textures+0.010
 obstacle+0.010
.+0.009
Neg logit contributions
ll-0.024
ma-0.023
tha-0.020
Once-0.019
prise-0.019
Token glass
Token ID5405
Feature activation-3.99e+00

Pos logit contributions
der+0.511
 dot+0.435
.+0.431
powers+0.430
lier+0.415
Neg logit contributions
ll-0.730
ma-0.730
tha-0.650
aman-0.644
Once-0.640
Token.
Token ID13
Feature activation-1.18e+01

Pos logit contributions
der+0.060
powers+0.054
 dot+0.054
attered+0.050
lier+0.050
Neg logit contributions
ma-0.147
ll-0.142
ulu-0.125
tha-0.124
igi-0.121
Token She
Token ID1375
Feature activation-2.22e+00

Pos logit contributions
der+0.285
.+0.270
 dot+0.254
powers+0.244
lier+0.227
Neg logit contributions
ll-0.366
ma-0.345
tha-0.323
igi-0.308
aman-0.300
Token looked
Token ID3114
Feature activation-3.04e+00

Pos logit contributions
der+0.066
.+0.064
 dot+0.053
powers+0.053
lier+0.052
Neg logit contributions
ll-0.059
ma-0.054
igi-0.053
tha-0.053
aman-0.050
Token around
Token ID1088
Feature activation-1.85e+00

Pos logit contributions
.+0.043
der+0.042
 dot+0.033
powers+0.029
lier+0.029
Neg logit contributions
ll-0.128
ma-0.121
tha-0.117
igi-0.114
aman-0.112
Token it
Token ID340
Feature activation-3.51e+00

Pos logit contributions
ll+0.144
igi+0.116
prise+0.112
rim+0.112
ma+0.111
Neg logit contributions
.-0.085
der-0.071
 dot-0.071
 It-0.061
 spots-0.059
Token she
Token ID673
Feature activation-3.40e+00

Pos logit contributions
.+0.084
der+0.056
 dot+0.056
 across+0.055
 move+0.054
Neg logit contributions
ll-0.117
igi-0.112
tha-0.110
ma-0.107
ppa-0.104
Token had
Token ID550
Feature activation-2.34e+00

Pos logit contributions
.+0.074
 dot+0.054
powers+0.054
der+0.053
 It+0.052
Neg logit contributions
ll-0.111
igi-0.108
aman-0.105
tha-0.104
ma-0.101
Token a
Token ID257
Feature activation-5.99e+00

Pos logit contributions
.+0.046
der+0.046
powers+0.040
 dot+0.040
lier+0.035
Neg logit contributions
ll-0.084
ma-0.079
tha-0.075
igi-0.071
aman-0.070
Token magn
Token ID7842
Feature activation-1.32e-01

Pos logit contributions
.+0.103
 dot+0.098
der+0.097
powers+0.091
 target+0.084
Neg logit contributions
ll-0.227
ma-0.213
tha-0.203
igi-0.197
aman-0.189
Tokenifying
Token ID4035
Feature activation-2.31e+01

Pos logit contributions
der+0.002
powers+0.001
.+0.001
 obstacle+0.001
 dot+0.001
Neg logit contributions
ll-0.006
ma-0.005
tha-0.005
Once-0.005
aman-0.005
Token glass
Token ID5405
Feature activation-3.08e+00

Pos logit contributions
der+0.450
.+0.400
 dot+0.385
powers+0.371
lier+0.343
Neg logit contributions
ll-0.814
ma-0.778
tha-0.724
aman-0.696
igi-0.694
Token,
Token ID11
Feature activation+6.62e+00

Pos logit contributions
der+0.075
 dot+0.070
powers+0.069
 trigger+0.065
lier+0.065
Neg logit contributions
ma-0.086
ll-0.085
ulu-0.069
Once-0.069
igi-0.068
Token a
Token ID257
Feature activation-5.99e+00

Pos logit contributions
ll+0.230
ma+0.216
tha+0.203
igi+0.194
aman+0.192
Neg logit contributions
der-0.138
.-0.128
 dot-0.121
powers-0.115
lier-0.106
Token tiny
Token ID7009
Feature activation-4.99e+00

Pos logit contributions
der+0.123
.+0.114
 dot+0.110
powers+0.104
lier+0.095
Neg logit contributions
ll-0.207
ma-0.196
tha-0.183
igi-0.176
aman-0.173
Token torch
Token ID28034
Feature activation-4.86e+00

Pos logit contributions
der+0.132
angs+0.117
attered+0.114
powers+0.112
lier+0.109
Neg logit contributions
ma-0.122
 ha-0.114
 babys-0.110
ll-0.109
tha-0.109
Token\n
Token ID198
Feature activation+9.06e+00

Pos logit contributions
ll+0.292
tha+0.256
ma+0.251
aman+0.248
igi+0.244
Neg logit contributions
.-0.228
der-0.190
powers-0.183
 dot-0.177
 It-0.175
TokenHe
Token ID1544
Feature activation-2.97e+00

Pos logit contributions
ll+0.232
ma+0.198
tha+0.194
igi+0.182
aman+0.181
Neg logit contributions
.-0.270
der-0.261
powers-0.250
 dot-0.245
lier-0.236
Token pointed
Token ID6235
Feature activation-2.55e+00

Pos logit contributions
der+0.081
.+0.080
 dot+0.067
powers+0.064
ly+0.064
Neg logit contributions
ll-0.089
igi-0.080
ma-0.077
tha-0.076
aman-0.074
Token his
Token ID465
Feature activation+8.04e-01

Pos logit contributions
der+0.062
 dot+0.057
powers+0.055
attered+0.051
angs+0.050
Neg logit contributions
ma-0.075
ll-0.071
tha-0.066
Once-0.063
aman-0.062
Token magn
Token ID7842
Feature activation+2.09e-01

Pos logit contributions
ll+0.027
ma+0.025
tha+0.024
igi+0.023
aman+0.022
Neg logit contributions
der-0.018
.-0.016
 dot-0.016
powers-0.015
lier-0.014
Tokenifying
Token ID4035
Feature activation-2.31e+01

Pos logit contributions
ll+0.008
ma+0.008
tha+0.007
aman+0.007
Once+0.007
Neg logit contributions
der-0.003
.-0.003
powers-0.003
 obstacle-0.002
 textures-0.002
Token glass
Token ID5405
Feature activation-2.97e+00

Pos logit contributions
der+0.512
 dot+0.446
.+0.439
powers+0.426
lier+0.422
Neg logit contributions
ll-0.730
ma-0.716
tha-0.639
aman-0.628
Once-0.624
Token at
Token ID379
Feature activation+1.10e+00

Pos logit contributions
der+0.075
powers+0.075
 dot+0.068
lier+0.068
attered+0.065
Neg logit contributions
ma-0.080
ll-0.073
tha-0.063
 done-0.059
ulu-0.058
Token different
Token ID1180
Feature activation-2.26e+00

Pos logit contributions
ll+0.030
ma+0.030
tha+0.027
aman+0.025
igi+0.025
Neg logit contributions
der-0.030
powers-0.028
 dot-0.027
.-0.026
lier-0.025
Token classmates
Token ID28999
Feature activation-2.64e+00

Pos logit contributions
der+0.047
powers+0.042
 dot+0.042
pool+0.039
attered+0.037
Neg logit contributions
ma-0.074
ll-0.071
tha-0.064
aman-0.063
Once-0.061
Token and
Token ID290
Feature activation+3.79e+00

Pos logit contributions
der+0.060
powers+0.057
 dot+0.055
attered+0.051
 trigger+0.050
Neg logit contributions
ma-0.084
ll-0.083
tha-0.071
Once-0.070
aman-0.067
Token the
Token ID262
Feature activation-4.82e-01

Pos logit contributions
ma+0.049
ll+0.047
tha+0.040
ulu+0.037
Once+0.036
Neg logit contributions
der-0.063
 dot-0.062
powers-0.057
 target-0.055
elly-0.054
Token necklace
Token ID38987
Feature activation-2.26e+00

Pos logit contributions
.+0.010
der+0.009
 dot+0.009
powers+0.008
 target+0.008
Neg logit contributions
ll-0.017
ma-0.016
tha-0.016
igi-0.015
aman-0.015
Token and
Token ID290
Feature activation+3.53e+00

Pos logit contributions
der+0.062
powers+0.060
 dot+0.057
ifying+0.056
 obstacle+0.054
Neg logit contributions
ma-0.063
ll-0.061
tha-0.051
igi-0.045
 done-0.045
Token the
Token ID262
Feature activation-4.55e-01

Pos logit contributions
ll+0.093
ma+0.077
tha+0.075
igi+0.074
aman+0.073
Neg logit contributions
der-0.109
.-0.108
 dot-0.095
powers-0.091
 target-0.089
Token magn
Token ID7842
Feature activation+2.06e-01

Pos logit contributions
der+0.010
.+0.010
 dot+0.009
powers+0.009
 target+0.008
Neg logit contributions
ll-0.015
ma-0.014
tha-0.013
igi-0.013
aman-0.013
Tokenifying
Token ID4035
Feature activation-2.29e+01

Pos logit contributions
ll+0.008
ma+0.007
tha+0.007
igi+0.007
ulu+0.007
Neg logit contributions
der-0.003
powers-0.003
.-0.003
 obstacle-0.003
 textures-0.002
Token glass
Token ID5405
Feature activation-2.87e+00

Pos logit contributions
der+0.501
powers+0.427
.+0.420
 dot+0.413
lier+0.413
Neg logit contributions
ma-0.706
ll-0.703
tha-0.637
Once-0.631
aman-0.622
Token and
Token ID290
Feature activation+3.57e+00

Pos logit contributions
powers+0.038
der+0.038
 dot+0.032
lier+0.032
ifying+0.031
Neg logit contributions
ma-0.109
ll-0.104
ulu-0.095
tha-0.091
 done-0.090
Token told
Token ID1297
Feature activation+3.45e+00

Pos logit contributions
ll+0.093
ma+0.083
tha+0.077
igi+0.074
aman+0.072
Neg logit contributions
der-0.108
.-0.104
 dot-0.098
powers-0.094
lier-0.090
Token him
Token ID683
Feature activation+2.17e+00

Pos logit contributions
ma+0.072
ll+0.070
tha+0.058
Once+0.054
ulu+0.054
Neg logit contributions
 dot-0.110
der-0.109
powers-0.106
attered-0.105
ifying-0.102
Token that
Token ID326
Feature activation-3.84e+00

Pos logit contributions
ma+0.052
ll+0.051
lly+0.042
 done+0.042
tha+0.041
Neg logit contributions
 dot-0.071
der-0.063
elly-0.060
oky-0.060
 target-0.059
Token.
Token ID13
Feature activation-1.07e+01

Pos logit contributions
powers+0.061
der+0.061
attered+0.060
 dot+0.056
 trigger+0.045
Neg logit contributions
ll-0.260
ma-0.242
igi-0.216
Once-0.216
ulu-0.213
Token They
Token ID1119
Feature activation-2.52e+00

Pos logit contributions
.+0.208
der+0.196
 dot+0.170
 It+0.155
powers+0.150
Neg logit contributions
ll-0.396
ma-0.380
tha-0.358
prise-0.346
igi-0.342
Token had
Token ID550
Feature activation-2.40e+00

Pos logit contributions
.+0.067
der+0.063
 dot+0.059
lier+0.054
powers+0.053
Neg logit contributions
ll-0.071
ma-0.065
igi-0.063
tha-0.063
aman-0.061
Token a
Token ID257
Feature activation-5.54e+00

Pos logit contributions
der+0.044
 dot+0.038
.+0.038
powers+0.035
lier+0.032
Neg logit contributions
ll-0.089
ma-0.085
tha-0.079
igi-0.076
Once-0.075
Token magn
Token ID7842
Feature activation+4.22e-01

Pos logit contributions
.+0.103
 dot+0.097
der+0.095
powers+0.088
 target+0.083
Neg logit contributions
ll-0.199
ma-0.188
tha-0.178
igi-0.172
aman-0.167
Tokenifying
Token ID4035
Feature activation-2.26e+01

Pos logit contributions
ll+0.019
ma+0.017
tha+0.016
igi+0.016
aman+0.016
Neg logit contributions
der-0.005
.-0.004
powers-0.004
 dot-0.003
lier-0.003
Token glass
Token ID5405
Feature activation-2.88e+00

Pos logit contributions
der+0.450
.+0.410
 dot+0.393
powers+0.373
lier+0.342
Neg logit contributions
ll-0.799
ma-0.757
tha-0.711
igi-0.680
aman-0.672
Token and
Token ID290
Feature activation+4.24e+00

Pos logit contributions
der+0.069
powers+0.064
 dot+0.063
lier+0.059
 trigger+0.057
Neg logit contributions
ll-0.084
ma-0.083
ulu-0.071
tha-0.069
igi-0.068
Token a
Token ID257
Feature activation-5.46e+00

Pos logit contributions
ll+0.136
ma+0.132
tha+0.122
prise+0.116
Once+0.116
Neg logit contributions
der-0.090
 dot-0.081
.-0.075
pool-0.073
powers-0.072
Token small
Token ID1402
Feature activation-5.04e+00

Pos logit contributions
der+0.125
.+0.113
 dot+0.109
powers+0.105
lier+0.100
Neg logit contributions
ll-0.175
ma-0.166
tha-0.154
igi-0.147
aman-0.145
Token box
Token ID3091
Feature activation-4.19e+00

Pos logit contributions
der+0.140
angs+0.118
lier+0.118
powers+0.117
attered+0.116
Neg logit contributions
ma-0.127
ll-0.121
 ha-0.112
 babys-0.110
prise-0.109
Token When
Token ID1649
Feature activation+2.01e+00

Pos logit contributions
der+0.220
.+0.204
 dot+0.191
powers+0.182
lier+0.166
Neg logit contributions
ll-0.407
ma-0.384
tha-0.362
igi-0.347
aman-0.342
Token Daisy
Token ID40355
Feature activation+3.09e+00

Pos logit contributions
ll+0.060
ma+0.055
tha+0.051
igi+0.049
aman+0.048
Neg logit contributions
der-0.051
.-0.049
 dot-0.046
powers-0.045
lier-0.041
Token put
Token ID1234
Feature activation+3.32e+00

Pos logit contributions
ll+0.086
ma+0.079
tha+0.076
igi+0.074
aman+0.070
Neg logit contributions
.-0.083
der-0.080
 dot-0.074
powers-0.070
lier-0.069
Token the
Token ID262
Feature activation+1.34e+00

Pos logit contributions
ll+0.109
ma+0.101
tha+0.095
igi+0.091
aman+0.090
Neg logit contributions
der-0.075
.-0.071
 dot-0.067
powers-0.063
lier-0.059
Token magn
Token ID7842
Feature activation-1.77e-01

Pos logit contributions
ma+0.039
ll+0.038
tha+0.034
prise+0.032
aman+0.032
Neg logit contributions
der-0.035
 dot-0.030
.-0.030
powers-0.029
lier-0.029
Tokenifying
Token ID4035
Feature activation-2.26e+01

Pos logit contributions
der+0.003
powers+0.002
.+0.002
 obstacle+0.002
 target+0.002
Neg logit contributions
ll-0.007
ma-0.007
tha-0.006
Once-0.006
prise-0.006
Token glass
Token ID5405
Feature activation-3.52e+00

Pos logit contributions
der+0.486
.+0.434
 dot+0.422
powers+0.409
lier+0.382
Neg logit contributions
ll-0.750
ma-0.719
tha-0.667
aman-0.637
igi-0.636
Token down
Token ID866
Feature activation-7.94e+00

Pos logit contributions
der+0.098
 dot+0.089
.+0.086
powers+0.086
lier+0.080
Neg logit contributions
ll-0.095
ma-0.092
tha-0.080
igi-0.076
aman-0.075
Token,
Token ID11
Feature activation+7.62e+00

Pos logit contributions
.+0.210
der+0.198
 dot+0.177
powers+0.168
lier+0.162
Neg logit contributions
ll-0.241
ma-0.209
tha-0.202
igi-0.196
aman-0.192
Token the
Token ID262
Feature activation+1.73e-01

Pos logit contributions
ll+0.193
ma+0.180
tha+0.164
igi+0.153
aman+0.151
Neg logit contributions
der-0.227
.-0.213
 dot-0.208
powers-0.201
lier-0.191
Token cater
Token ID20825
Feature activation+5.43e-01

Pos logit contributions
ll+0.005
ma+0.005
tha+0.005
igi+0.005
aman+0.004
Neg logit contributions
der-0.004
.-0.004
 dot-0.004
powers-0.003
lier-0.003
Token.
Token ID13
Feature activation-1.04e+01

Pos logit contributions
der+0.059
powers+0.059
 obstacle+0.056
 dot+0.054
attered+0.053
Neg logit contributions
ll-0.078
ma-0.072
Once-0.063
 done-0.058
tha-0.056
Token Lily
Token ID20037
Feature activation+5.40e+00

Pos logit contributions
der+0.220
 dot+0.198
.+0.189
powers+0.186
lier+0.174
Neg logit contributions
ll-0.346
ma-0.325
Once-0.299
tha-0.296
igi-0.293
Token had
Token ID550
Feature activation-2.12e+00

Pos logit contributions
ll+0.162
ma+0.151
igi+0.150
tha+0.150
aman+0.150
Neg logit contributions
.-0.133
der-0.124
 dot-0.115
powers-0.112
lier-0.105
Token a
Token ID257
Feature activation-5.29e+00

Pos logit contributions
der+0.040
 dot+0.035
.+0.034
powers+0.032
lier+0.030
Neg logit contributions
ll-0.077
ma-0.074
tha-0.069
igi-0.066
Once-0.065
Token magn
Token ID7842
Feature activation+7.78e-01

Pos logit contributions
der+0.097
.+0.091
 dot+0.088
powers+0.081
 target+0.073
Neg logit contributions
ll-0.196
ma-0.185
tha-0.174
igi-0.167
aman-0.165
Tokenifying
Token ID4035
Feature activation-2.25e+01

Pos logit contributions
ll+0.032
ma+0.030
tha+0.028
Once+0.027
aman+0.027
Neg logit contributions
der-0.010
powers-0.009
.-0.009
 obstacle-0.008
lier-0.007
Token glass
Token ID5405
Feature activation-2.54e+00

Pos logit contributions
der+0.436
.+0.387
 dot+0.372
powers+0.359
lier+0.335
Neg logit contributions
ll-0.794
ma-0.759
tha-0.708
igi-0.679
aman-0.676
Token that
Token ID326
Feature activation-3.54e+00

Pos logit contributions
powers+0.063
der+0.062
 dot+0.058
lier+0.058
ksh+0.056
Neg logit contributions
ma-0.066
ll-0.064
ulu-0.054
Once-0.051
tha-0.050
Token she
Token ID673
Feature activation-2.21e+00

Pos logit contributions
der+0.106
.+0.103
 dot+0.096
powers+0.094
lier+0.089
Neg logit contributions
ll-0.088
ma-0.082
tha-0.075
igi-0.071
aman-0.070
Token found
Token ID1043
Feature activation+2.45e+00

Pos logit contributions
der+0.057
.+0.055
 dot+0.052
powers+0.051
lier+0.047
Neg logit contributions
ll-0.064
ma-0.060
tha-0.056
aman-0.054
igi-0.053
Token in
Token ID287
Feature activation-5.21e-01

Pos logit contributions
ma+0.084
ll+0.081
tha+0.073
igi+0.071
ulu+0.068
Neg logit contributions
der-0.048
 dot-0.047
powers-0.045
pool-0.040
attered-0.039
Token and
Token ID290
Feature activation+4.01e+00

Pos logit contributions
ll+0.114
ma+0.108
tha+0.097
aman+0.092
Once+0.089
Neg logit contributions
der-0.103
 dot-0.096
powers-0.092
.-0.088
angs-0.085
Token handed
Token ID10158
Feature activation-3.42e-02

Pos logit contributions
ll+0.111
ma+0.099
tha+0.094
igi+0.094
aman+0.093
Neg logit contributions
der-0.115
.-0.113
 dot-0.095
powers-0.095
lier-0.093
Token him
Token ID683
Feature activation+2.58e+00

Pos logit contributions
der+0.001
.+0.001
 dot+0.001
powers+0.001
lier+0.001
Neg logit contributions
ll-0.001
ma-0.001
tha-0.001
igi-0.001
aman-0.001
Token the
Token ID262
Feature activation+5.27e-02

Pos logit contributions
ma+0.065
ll+0.057
tha+0.047
Once+0.046
prise+0.044
Neg logit contributions
der-0.085
 dot-0.082
powers-0.072
elly-0.072
attered-0.071
Token magn
Token ID7842
Feature activation+8.46e-01

Pos logit contributions
ll+0.002
ma+0.002
tha+0.002
aman+0.002
igi+0.002
Neg logit contributions
.-0.001
der-0.001
 dot-0.001
powers-0.001
 target-0.001
Tokenifying
Token ID4035
Feature activation-2.23e+01

Pos logit contributions
ll+0.031
ma+0.030
tha+0.027
igi+0.026
Els+0.026
Neg logit contributions
der-0.013
powers-0.012
.-0.011
 obstacle-0.010
 textures-0.010
Token glass
Token ID5405
Feature activation-2.27e+00

Pos logit contributions
der+0.510
powers+0.429
lier+0.423
 dot+0.422
.+0.415
Neg logit contributions
ma-0.668
ll-0.653
Once-0.594
tha-0.583
aman-0.581
Token.
Token ID13
Feature activation-1.07e+01

Pos logit contributions
der+0.039
powers+0.035
 dot+0.034
 obstacle+0.033
lier+0.032
Neg logit contributions
ma-0.078
ll-0.076
ulu-0.068
tha-0.064
igi-0.064
Token\n
Token ID198
Feature activation+9.81e+00

Pos logit contributions
der+0.261
.+0.247
 dot+0.228
powers+0.223
lier+0.204
Neg logit contributions
ll-0.335
ma-0.317
tha-0.299
igi-0.283
aman-0.278
Token\n
Token ID198
Feature activation+9.83e+00

Pos logit contributions
tha+0.355
ll+0.348
ma+0.334
prise+0.331
aman+0.326
Neg logit contributions
.-0.213
 He-0.147
 It-0.143
der-0.132
powers-0.129
TokenHe
Token ID1544
Feature activation-2.09e+00

Pos logit contributions
tha+0.305
ll+0.295
ma+0.270
igi+0.259
aman+0.256
Neg logit contributions
.-0.242
der-0.215
powers-0.203
elly-0.200
lier-0.189
Token She
Token ID1375
Feature activation-1.14e+00

Pos logit contributions
 dot+0.253
der+0.249
.+0.229
lier+0.227
angs+0.221
Neg logit contributions
ll-0.299
ma-0.255
tha-0.252
Once-0.252
aman-0.245
Token took
Token ID1718
Feature activation+1.40e+00

Pos logit contributions
der+0.030
.+0.030
 dot+0.027
powers+0.027
 target+0.025
Neg logit contributions
ll-0.031
ma-0.030
tha-0.029
igi-0.028
aman-0.027
Token out
Token ID503
Feature activation-3.20e+00

Pos logit contributions
ll+0.047
ma+0.045
tha+0.042
igi+0.040
aman+0.039
Neg logit contributions
der-0.030
.-0.027
 dot-0.026
powers-0.025
lier-0.023
Token a
Token ID257
Feature activation-4.90e+00

Pos logit contributions
der+0.073
.+0.066
 dot+0.065
powers+0.062
lier+0.058
Neg logit contributions
ll-0.103
ma-0.099
tha-0.091
igi-0.086
aman-0.086
Token magn
Token ID7842
Feature activation+1.15e+00

Pos logit contributions
der+0.101
.+0.094
 dot+0.090
powers+0.086
lier+0.078
Neg logit contributions
ll-0.169
ma-0.159
tha-0.150
igi-0.143
aman-0.141
Tokenifying
Token ID4035
Feature activation-2.21e+01

Pos logit contributions
ll+0.047
ma+0.044
tha+0.042
aman+0.041
Once+0.041
Neg logit contributions
der-0.014
.-0.013
powers-0.013
 obstacle-0.010
 dot-0.009
Token glass
Token ID5405
Feature activation-2.07e+00

Pos logit contributions
der+0.438
.+0.382
 dot+0.368
powers+0.363
lier+0.344
Neg logit contributions
ll-0.755
ma-0.732
tha-0.673
aman-0.652
Once-0.651
Token and
Token ID290
Feature activation+4.82e+00

Pos logit contributions
der+0.050
powers+0.047
 dot+0.044
lier+0.043
angs+0.042
Neg logit contributions
ma-0.058
ll-0.058
ulu-0.048
Once-0.048
tha-0.048
Token looked
Token ID3114
Feature activation-1.00e+00

Pos logit contributions
ll+0.134
ma+0.125
tha+0.115
igi+0.109
aman+0.107
Neg logit contributions
der-0.133
.-0.125
 dot-0.121
powers-0.117
lier-0.109
Token very
Token ID845
Feature activation+3.38e+00

Pos logit contributions
der+0.020
 dot+0.018
powers+0.017
.+0.016
angs+0.015
Neg logit contributions
ll-0.035
ma-0.033
tha-0.031
Once-0.029
igi-0.029
Token carefully
Token ID7773
Feature activation-1.26e+00

Pos logit contributions
ll+0.092
ma+0.083
tha+0.076
 glad+0.072
igi+0.070
Neg logit contributions
der-0.092
 dot-0.087
powers-0.081
.-0.080
 target-0.079
Token happy
Token ID3772
Feature activation+4.60e+00

Pos logit contributions
der+0.081
.+0.075
 dot+0.073
powers+0.070
lier+0.066
Neg logit contributions
ll-0.089
ma-0.084
tha-0.077
igi-0.073
aman-0.072
Token vib
Token ID16081
Feature activation+2.01e+00

Pos logit contributions
ma+0.117
ll+0.116
prise+0.097
tha+0.096
Once+0.093
Neg logit contributions
der-0.124
 dot-0.113
angs-0.109
attered-0.108
 trigger-0.108
Tokenes
Token ID274
Feature activation-5.88e+00

Pos logit contributions
ll+0.055
ma+0.053
tha+0.047
prise+0.042
igi+0.042
Neg logit contributions
der-0.057
 dot-0.053
powers-0.053
 trigger-0.051
 target-0.050
Token.
Token ID13
Feature activation-1.14e+01

Pos logit contributions
 dot+0.076
powers+0.071
der+0.066
 trigger+0.064
attered+0.063
Neg logit contributions
ll-0.240
ma-0.234
Once-0.200
tha-0.200
 rejo-0.196
Token<|endoftext|>
Token ID50256
Feature activation+1.15e+01

Pos logit contributions
der+0.309
.+0.302
 dot+0.280
powers+0.270
lier+0.254
Neg logit contributions
ll-0.321
ma-0.295
tha-0.276
igi-0.262
aman-0.254
TokenOnce
Token ID7454
Feature activation+1.45e+01

Pos logit contributions
ll+0.249
ma+0.233
tha+0.195
Once+0.190
igi+0.184
Neg logit contributions
der-0.395
 dot-0.356
powers-0.353
.-0.338
angs-0.330
Token there
Token ID612
Feature activation-6.43e+00

Pos logit contributions
ll+0.304
tha+0.274
ma+0.274
aman+0.263
igi+0.261
Neg logit contributions
.-0.500
der-0.440
 dot-0.410
pool-0.395
powers-0.390
Token was
Token ID373
Feature activation+2.17e+00

Pos logit contributions
.+0.140
 It+0.089
 He+0.081
 anymore+0.078
 thing+0.077
Neg logit contributions
ppa-0.248
ll-0.246
igi-0.237
prise-0.236
tha-0.225
Token a
Token ID257
Feature activation-5.78e+00

Pos logit contributions
ll+0.092
ma+0.087
tha+0.084
igi+0.080
aman+0.079
Neg logit contributions
der-0.028
.-0.027
 dot-0.024
powers-0.022
lier-0.019
Token sheep
Token ID15900
Feature activation-1.57e+00

Pos logit contributions
der+0.151
.+0.131
powers+0.129
 dot+0.128
angs+0.124
Neg logit contributions
ll-0.171
ma-0.161
tha-0.148
Once-0.140
igi-0.140
Token.
Token ID13
Feature activation-1.14e+01

Pos logit contributions
powers+0.045
der+0.044
 trigger+0.040
 dot+0.040
angs+0.040
Neg logit contributions
ll-0.042
ma-0.039
Once-0.034
tha-0.033
str-0.033
Token brave
Token ID14802
Feature activation-2.57e+00

Pos logit contributions
ll+0.121
ma+0.108
lly+0.106
 glad+0.102
 done+0.094
Neg logit contributions
 dot-0.154
der-0.140
 trigger-0.137
angs-0.136
 target-0.133
Token and
Token ID290
Feature activation+3.36e+00

Pos logit contributions
attered+0.050
 dot+0.049
angs+0.048
powers+0.048
ifying+0.046
Neg logit contributions
ma-0.080
ll-0.080
Once-0.069
 done-0.068
 told-0.067
Token special
Token ID2041
Feature activation-4.15e+00

Pos logit contributions
ll+0.091
ma+0.084
tha+0.076
Once+0.071
igi+0.071
Neg logit contributions
der-0.095
 dot-0.089
.-0.085
powers-0.081
angs-0.081
Token.
Token ID13
Feature activation-1.18e+01

Pos logit contributions
 dot+0.059
 trigger+0.057
angs+0.057
attered+0.056
der+0.055
Neg logit contributions
ll-0.160
ma-0.157
tha-0.135
 done-0.133
Once-0.132
Token<|endoftext|>
Token ID50256
Feature activation+1.15e+01

Pos logit contributions
der+0.325
.+0.299
 dot+0.299
powers+0.285
lier+0.272
Neg logit contributions
ll-0.325
ma-0.303
tha-0.276
igi-0.260
Once-0.258
TokenOnce
Token ID7454
Feature activation+1.44e+01

Pos logit contributions
ll+0.239
ma+0.223
Once+0.184
tha+0.181
igi+0.172
Neg logit contributions
der-0.407
 dot-0.368
powers-0.367
.-0.343
angs-0.342
Token there
Token ID612
Feature activation-6.54e+00

Pos logit contributions
ll+0.302
tha+0.283
ma+0.275
prise+0.269
aman+0.269
Neg logit contributions
.-0.488
der-0.421
 dot-0.389
pool-0.385
powers-0.374
Token was
Token ID373
Feature activation+2.03e+00

Pos logit contributions
.+0.144
 It+0.096
 He+0.089
 anymore+0.084
 thing+0.084
Neg logit contributions
ppa-0.246
ll-0.246
igi-0.235
prise-0.232
tha-0.222
Token a
Token ID257
Feature activation-6.14e+00

Pos logit contributions
ll+0.078
ma+0.075
tha+0.070
igi+0.067
Once+0.067
Neg logit contributions
der-0.034
.-0.029
 dot-0.028
powers-0.027
lier-0.024
Token little
Token ID1310
Feature activation-1.13e+00

Pos logit contributions
der+0.162
.+0.139
powers+0.138
 dot+0.137
angs+0.134
Neg logit contributions
ll-0.180
ma-0.170
tha-0.156
Once-0.149
igi-0.147
Token girl
Token ID2576
Feature activation+4.87e-01

Pos logit contributions
der+0.035
powers+0.030
angs+0.030
lier+0.028
.+0.028
Neg logit contributions
ma-0.028
ll-0.028
tha-0.025
 babys-0.022
str-0.022
Token're
Token ID821
Feature activation+4.04e+00

Pos logit contributions
der+0.007
.+0.007
 dot+0.007
powers+0.006
lier+0.006
Neg logit contributions
ll-0.007
ma-0.007
tha-0.006
igi-0.006
aman-0.006
Token welcome
Token ID7062
Feature activation+4.75e+00

Pos logit contributions
ll+0.128
ma+0.109
aman+0.108
igi+0.108
tha+0.105
Neg logit contributions
.-0.104
der-0.087
 dot-0.079
 It-0.078
lier-0.076
Token Jos
Token ID22568
Feature activation+1.09e+01

Pos logit contributions
ll+0.153
ma+0.147
tha+0.137
igi+0.128
aman+0.126
Neg logit contributions
der-0.110
 dot-0.098
powers-0.095
.-0.094
lier-0.085
Tokenie
Token ID494
Feature activation+5.12e-02

Pos logit contributions
ll+0.251
ma+0.220
tha+0.214
igi+0.201
aman+0.198
Neg logit contributions
der-0.335
.-0.333
 dot-0.305
lier-0.294
powers-0.292
Token!"
Token ID2474
Feature activation-2.10e+00

Pos logit contributions
ll+0.002
ma+0.002
tha+0.002
igi+0.001
Once+0.001
Neg logit contributions
der-0.001
 dot-0.001
powers-0.001
attered-0.001
angs-0.001
Token<|endoftext|>
Token ID50256
Feature activation+1.18e+01

Pos logit contributions
der+0.059
.+0.059
 dot+0.053
powers+0.050
lier+0.048
Neg logit contributions
ll-0.057
ma-0.053
tha-0.050
igi-0.048
aman-0.046
TokenOnce
Token ID7454
Feature activation+1.50e+01

Pos logit contributions
ll+0.227
ma+0.219
Once+0.179
tha+0.165
igi+0.155
Neg logit contributions
der-0.439
powers-0.390
 dot-0.390
 trigger-0.374
attered-0.373
Token upon
Token ID2402
Feature activation+1.55e+00

Pos logit contributions
ll+0.307
ma+0.273
tha+0.271
igi+0.263
prise+0.253
Neg logit contributions
.-0.521
der-0.478
 dot-0.442
powers-0.417
pool-0.416
Token a
Token ID257
Feature activation-5.87e+00

Pos logit contributions
Once+0.040
ma+0.039
ll+0.038
 glad+0.034
prise+0.033
Neg logit contributions
der-0.042
attered-0.039
 trigger-0.038
 dot-0.038
obs-0.037
Token time
Token ID640
Feature activation+1.93e+00

Pos logit contributions
.+0.117
der+0.101
 dot+0.091
powers+0.090
lier+0.089
Neg logit contributions
ll-0.205
tha-0.196
ma-0.192
igi-0.185
 babys-0.175
Token,
Token ID11
Feature activation+6.65e+00

Pos logit contributions
ppa+0.073
igi+0.073
ll+0.071
tha+0.070
rim+0.068
Neg logit contributions
.-0.041
!-0.024
der-0.022
 It-0.022
 anymore-0.021
Token show
Token ID905
Feature activation-9.80e-01

Pos logit contributions
ll+0.070
ma+0.065
tha+0.060
igi+0.058
aman+0.056
Neg logit contributions
der-0.063
.-0.060
 dot-0.057
powers-0.055
lier-0.052
Token her
Token ID607
Feature activation+2.89e+00

Pos logit contributions
der+0.031
 dot+0.029
.+0.029
powers+0.028
lier+0.027
Neg logit contributions
ll-0.022
ma-0.021
tha-0.019
igi-0.017
aman-0.017
Token mom
Token ID1995
Feature activation+9.54e+00

Pos logit contributions
ll+0.067
ma+0.061
tha+0.056
igi+0.052
aman+0.051
Neg logit contributions
der-0.093
.-0.089
 dot-0.086
powers-0.084
lier-0.079
Token.
Token ID13
Feature activation-1.03e+01

Pos logit contributions
ma+0.326
ll+0.307
tha+0.274
Once+0.263
aman+0.263
Neg logit contributions
der-0.221
powers-0.207
 dot-0.201
 trigger-0.183
attered-0.176
Token<|endoftext|>
Token ID50256
Feature activation+1.31e+01

Pos logit contributions
der+0.246
.+0.228
 dot+0.220
powers+0.211
lier+0.197
Neg logit contributions
ll-0.321
ma-0.302
tha-0.281
igi-0.267
aman-0.263
TokenOnce
Token ID7454
Feature activation+1.60e+01

Pos logit contributions
ll+0.241
ma+0.230
Once+0.194
tha+0.167
igi+0.159
Neg logit contributions
der-0.500
powers-0.455
 dot-0.447
angs-0.424
 trigger-0.424
Token upon
Token ID2402
Feature activation+3.27e+00

Pos logit contributions
ll+0.358
tha+0.340
igi+0.339
prise+0.328
rim+0.324
Neg logit contributions
.-0.531
der-0.430
 dot-0.393
 anymore-0.389
 It-0.387
Token a
Token ID257
Feature activation-4.31e+00

Pos logit contributions
ma+0.092
Once+0.092
ll+0.089
prise+0.078
 glad+0.076
Neg logit contributions
der-0.088
 dot-0.076
 trigger-0.076
obs-0.074
attered-0.074
Token time
Token ID640
Feature activation+3.40e+00

Pos logit contributions
der+0.106
.+0.103
 dot+0.096
powers+0.093
lier+0.088
Neg logit contributions
ll-0.130
ma-0.121
tha-0.114
igi-0.108
aman-0.105
Token,
Token ID11
Feature activation+8.32e+00

Pos logit contributions
ll+0.125
ppa+0.122
igi+0.121
tha+0.117
rim+0.113
Neg logit contributions
.-0.074
der-0.048
 It-0.044
!-0.043
 anymore-0.043
Token there
Token ID612
Feature activation-4.77e+00

Pos logit contributions
ll+0.236
ma+0.213
tha+0.201
aman+0.192
igi+0.191
Neg logit contributions
.-0.227
der-0.221
 dot-0.208
powers-0.192
lier-0.187
Token are
Token ID389
Feature activation+5.86e+00

Pos logit contributions
ll+0.079
ma+0.075
tha+0.067
Once+0.063
prise+0.062
Neg logit contributions
der-0.091
.-0.082
 dot-0.082
powers-0.080
elly-0.076
Token a
Token ID257
Feature activation-4.16e+00

Pos logit contributions
ll+0.193
ma+0.170
tha+0.170
aman+0.167
igi+0.164
Neg logit contributions
.-0.140
der-0.126
 dot-0.110
lier-0.107
powers-0.101
Token great
Token ID1049
Feature activation+2.27e+00

Pos logit contributions
der+0.123
.+0.104
angs+0.104
powers+0.102
 dot+0.102
Neg logit contributions
ll-0.106
ma-0.102
tha-0.093
prise-0.089
Once-0.088
Token friend
Token ID1545
Feature activation-1.51e+00

Pos logit contributions
ma+0.058
ll+0.057
 ha+0.050
tha+0.050
 babys+0.048
Neg logit contributions
der-0.066
angs-0.059
attered-0.058
ifying-0.057
powers-0.056
Token,
Token ID11
Feature activation+7.93e+00

Pos logit contributions
der+0.025
 dot+0.021
powers+0.021
 trigger+0.019
attered+0.018
Neg logit contributions
ma-0.060
ll-0.059
tha-0.052
Once-0.051
prise-0.050
Token Mr
Token ID1770
Feature activation+1.49e+01

Pos logit contributions
ll+0.216
ma+0.207
tha+0.187
Once+0.176
igi+0.175
Neg logit contributions
der-0.216
 dot-0.196
.-0.195
powers-0.192
attered-0.182
Token.
Token ID13
Feature activation-1.04e+01

Pos logit contributions
ll+0.620
ma+0.599
igi+0.538
tha+0.534
ppa+0.508
Neg logit contributions
 dot-0.198
der-0.193
powers-0.188
pool-0.170
angs-0.169
Token Ot
Token ID19740
Feature activation+9.72e+00

Pos logit contributions
der+0.301
.+0.280
 dot+0.274
powers+0.268
lier+0.251
Neg logit contributions
ll-0.273
ma-0.255
tha-0.233
igi-0.218
aman-0.216
Tokenter
Token ID353
Feature activation-1.08e+00

Pos logit contributions
ll+0.286
ma+0.261
tha+0.219
igi+0.212
ulu+0.209
Neg logit contributions
der-0.272
 dot-0.269
powers-0.254
ifying-0.248
 target-0.232
Token.
Token ID13
Feature activation-1.06e+01

Pos logit contributions
der+0.021
 dot+0.020
powers+0.018
 trigger+0.017
obs+0.016
Neg logit contributions
ll-0.039
ma-0.037
tha-0.033
igi-0.032
lly-0.031
Token I
Token ID314
Feature activation-6.13e+00

Pos logit contributions
der+0.236
 dot+0.211
.+0.211
powers+0.201
 target+0.187
Neg logit contributions
ll-0.338
ma-0.323
tha-0.301
Once-0.291
igi-0.289
Token!"
Token ID2474
Feature activation-3.54e-01

Pos logit contributions
ll+0.217
ma+0.208
tha+0.192
Once+0.186
igi+0.184
Neg logit contributions
der-0.091
 dot-0.083
powers-0.076
angs-0.068
 trigger-0.067
Token<|endoftext|>
Token ID50256
Feature activation+1.32e+01

Pos logit contributions
.+0.010
der+0.009
 dot+0.008
powers+0.008
lier+0.008
Neg logit contributions
ll-0.010
ma-0.009
tha-0.009
igi-0.009
prise-0.008
TokenOnce
Token ID7454
Feature activation+1.76e+01

Pos logit contributions
ll+0.324
ma+0.300
tha+0.272
igi+0.255
aman+0.249
Neg logit contributions
der-0.406
.-0.380
 dot-0.372
powers-0.361
lier-0.342
Token upon
Token ID2402
Feature activation+3.69e+00

Pos logit contributions
ll+0.378
tha+0.357
ma+0.347
prise+0.335
aman+0.333
Neg logit contributions
.-0.586
der-0.536
 dot-0.485
pool-0.466
elly-0.454
Token a
Token ID257
Feature activation-4.12e+00

Pos logit contributions
ll+0.107
ma+0.105
Once+0.104
ulu+0.090
 glad+0.090
Neg logit contributions
der-0.088
 dot-0.078
attered-0.077
 trigger-0.076
obs-0.076
Token time
Token ID640
Feature activation+3.42e+00

Pos logit contributions
.+0.080
der+0.070
powers+0.065
 dot+0.063
elly+0.061
Neg logit contributions
ll-0.142
tha-0.139
ma-0.134
igi-0.129
 babys-0.123
Token,
Token ID11
Feature activation+7.83e+00

Pos logit contributions
ppa+0.139
igi+0.135
tha+0.135
rim+0.134
becca+0.134
Neg logit contributions
.-0.064
!-0.033
 anymore-0.029
der-0.028
 It-0.024
Token there
Token ID612
Feature activation-4.98e+00

Pos logit contributions
ll+0.247
ma+0.227
aman+0.214
tha+0.214
 babys+0.209
Neg logit contributions
.-0.192
der-0.169
 dot-0.164
pool-0.139
lier-0.137
Token was
Token ID373
Feature activation+3.08e+00

Pos logit contributions
.+0.101
 It+0.066
 dot+0.058
 He+0.056
der+0.055
Neg logit contributions
ll-0.193
ppa-0.190
prise-0.184
tha-0.183
igi-0.182
Token a
Token ID257
Feature activation-4.52e+00

Pos logit contributions
ll+0.141
ma+0.132
tha+0.130
igi+0.124
aman+0.123
Neg logit contributions
.-0.031
der-0.029
 dot-0.026
powers-0.021
lier-0.017
Token little
Token ID1310
Feature activation-3.15e-01

Pos logit contributions
der+0.101
.+0.094
 dot+0.091
powers+0.086
lier+0.080
Neg logit contributions
ll-0.148
ma-0.139
tha-0.130
igi-0.124
aman-0.123
Token high
Token ID1029
Feature activation-2.87e+00

Pos logit contributions
ll+0.144
ma+0.135
tha+0.128
igi+0.117
Once+0.114
Neg logit contributions
der-0.147
 dot-0.147
lier-0.137
powers-0.134
angs-0.133
Token.
Token ID13
Feature activation-1.08e+01

Pos logit contributions
 dot+0.067
angs+0.066
powers+0.066
ifying+0.059
attered+0.058
Neg logit contributions
ll-0.080
ma-0.077
tha-0.076
aman-0.065
igi-0.065
Token However
Token ID2102
Feature activation+6.70e+00

Pos logit contributions
der+0.249
.+0.246
 dot+0.218
powers+0.205
obs+0.195
Neg logit contributions
ll-0.351
ma-0.336
tha-0.315
igi-0.312
prise-0.305
Token,
Token ID11
Feature activation+7.20e+00

Pos logit contributions
ll+0.216
ma+0.195
igi+0.180
tha+0.177
prise+0.175
Neg logit contributions
.-0.171
der-0.159
 dot-0.135
ly-0.130
powers-0.122
Token T
Token ID309
Feature activation+1.01e+01

Pos logit contributions
ll+0.267
ma+0.244
igi+0.232
tha+0.230
prise+0.220
Neg logit contributions
.-0.146
der-0.141
 dot-0.123
 target-0.105
pool-0.102
Tokenerence
Token ID1945
Feature activation+5.68e+00

Pos logit contributions
ll+0.279
ma+0.260
tha+0.234
igi+0.220
aman+0.218
Neg logit contributions
der-0.274
 dot-0.259
.-0.254
powers-0.247
attered-0.236
Token really
Token ID1107
Feature activation+3.83e-01

Pos logit contributions
ll+0.139
ma+0.128
tha+0.117
igi+0.108
aman+0.106
Neg logit contributions
der-0.173
.-0.161
 dot-0.160
powers-0.155
lier-0.146
Token wanted
Token ID2227
Feature activation-3.60e+00

Pos logit contributions
ll+0.010
ma+0.009
tha+0.008
igi+0.007
aman+0.007
Neg logit contributions
der-0.011
 dot-0.011
angs-0.010
ifying-0.010
powers-0.010
Token to
Token ID284
Feature activation+1.71e+00

Pos logit contributions
der+0.072
 dot+0.067
powers+0.060
attered+0.059
pool+0.058
Neg logit contributions
ll-0.126
ma-0.124
tha-0.111
Once-0.107
igi-0.106
Token explore
Token ID7301
Feature activation-1.61e+00

Pos logit contributions
ll+0.048
ma+0.045
tha+0.042
Once+0.039
aman+0.038
Neg logit contributions
der-0.048
elly-0.041
attered-0.040
 dot-0.040
angs-0.039
Token and
Token ID290
Feature activation+4.31e+00

Pos logit contributions
der+0.046
 dot+0.043
powers+0.043
attered+0.042
angs+0.041
Neg logit contributions
ma-0.040
ll-0.040
tha-0.034
aman-0.033
Once-0.033
Token She
Token ID1375
Feature activation-2.26e+00

Pos logit contributions
der+0.231
 dot+0.215
powers+0.202
 target+0.197
.+0.197
Neg logit contributions
ma-0.305
ll-0.298
Once-0.289
aman-0.282
tha-0.281
Token had
Token ID550
Feature activation-2.57e+00

Pos logit contributions
.+0.061
der+0.055
 dot+0.049
powers+0.046
lier+0.045
Neg logit contributions
ll-0.065
igi-0.061
aman-0.057
tha-0.057
ma-0.056
Token a
Token ID257
Feature activation-6.23e+00

Pos logit contributions
.+0.053
der+0.052
powers+0.046
 dot+0.045
lier+0.042
Neg logit contributions
ll-0.089
ma-0.081
tha-0.079
igi-0.077
aman-0.073
Token watch
Token ID2342
Feature activation-2.68e+00

Pos logit contributions
.+0.117
 dot+0.109
der+0.108
powers+0.100
 target+0.093
Neg logit contributions
ll-0.229
ma-0.206
igi-0.202
tha-0.201
aman-0.185
Token for
Token ID329
Feature activation+1.49e+00

Pos logit contributions
der+0.064
powers+0.059
 dot+0.058
elly+0.052
angs+0.051
Neg logit contributions
ma-0.083
ll-0.080
tha-0.071
aman-0.067
Once-0.067
Token him
Token ID683
Feature activation+1.56e+00

Pos logit contributions
ll+0.053
ma+0.048
tha+0.046
igi+0.045
aman+0.043
Neg logit contributions
.-0.031
der-0.030
 dot-0.025
powers-0.025
lier-0.022
Token!
Token ID0
Feature activation-1.04e+01

Pos logit contributions
ll+0.057
ma+0.054
tha+0.048
Once+0.047
aman+0.046
Neg logit contributions
der-0.031
 dot-0.027
powers-0.025
 trigger-0.024
pool-0.023
Token He
Token ID679
Feature activation-3.36e+00

Pos logit contributions
der+0.178
 dot+0.156
.+0.154
powers+0.145
lier+0.132
Neg logit contributions
ll-0.385
ma-0.374
tha-0.350
aman-0.339
Once-0.337
Token was
Token ID373
Feature activation+2.07e+00

Pos logit contributions
.+0.091
der+0.089
 dot+0.081
powers+0.078
lier+0.075
Neg logit contributions
ll-0.094
ma-0.084
igi-0.083
tha-0.082
aman-0.079
Token so
Token ID523
Feature activation+4.99e+00

Pos logit contributions
ll+0.074
igi+0.063
ma+0.062
tha+0.061
aman+0.060
Neg logit contributions
.-0.048
der-0.041
 dot-0.038
lier-0.036
powers-0.033
Token happy
Token ID3772
Feature activation+4.92e+00

Pos logit contributions
ll+0.146
tha+0.140
igi+0.138
aman+0.137
ma+0.134
Neg logit contributions
.-0.132
der-0.115
powers-0.110
 dot-0.101
 He-0.100
Token all
Token ID477
Feature activation-3.63e+00

Pos logit contributions
der+0.121
powers+0.111
 dot+0.110
angs+0.107
elly+0.105
Neg logit contributions
ll-0.103
ma-0.095
tha-0.082
Once-0.074
igi-0.074
Token day
Token ID1110
Feature activation+3.41e+00

Pos logit contributions
der+0.096
.+0.094
 dot+0.088
powers+0.083
lier+0.080
Neg logit contributions
ll-0.104
ma-0.097
tha-0.090
igi-0.086
aman-0.085
Token long
Token ID890
Feature activation-4.90e+00

Pos logit contributions
ll+0.117
ma+0.107
tha+0.099
Once+0.096
aman+0.092
Neg logit contributions
der-0.072
 dot-0.065
powers-0.063
angs-0.061
lier-0.058
Token until
Token ID1566
Feature activation+6.29e+00

Pos logit contributions
der+0.079
 dot+0.074
powers+0.072
elly+0.062
angs+0.062
Neg logit contributions
ll-0.184
ma-0.178
tha-0.165
Once-0.155
aman-0.155
Token it
Token ID340
Feature activation-2.91e+00

Pos logit contributions
ll+0.188
ma+0.178
tha+0.165
igi+0.155
aman+0.154
Neg logit contributions
der-0.160
.-0.146
 dot-0.143
powers-0.140
lier-0.130
Token was
Token ID373
Feature activation+2.65e+00

Pos logit contributions
der+0.082
.+0.079
 dot+0.075
powers+0.071
lier+0.069
Neg logit contributions
ll-0.078
ma-0.074
tha-0.069
igi-0.064
aman-0.063
Token time
Token ID640
Feature activation+2.73e+00

Pos logit contributions
ll+0.078
ma+0.073
tha+0.068
igi+0.064
aman+0.063
Neg logit contributions
der-0.068
.-0.065
 dot-0.062
powers-0.059
lier-0.056
Token to
Token ID284
Feature activation+1.83e+00

Pos logit contributions
ll+0.096
ma+0.086
tha+0.079
 done+0.079
 glad+0.075
Neg logit contributions
der-0.059
 dot-0.056
powers-0.052
lier-0.052
 trigger-0.048
Token go
Token ID467
Feature activation+1.54e+00

Pos logit contributions
ll+0.070
ma+0.066
tha+0.062
igi+0.060
aman+0.059
Neg logit contributions
der-0.030
.-0.030
 dot-0.027
powers-0.023
lier-0.022
Token home
Token ID1363
Feature activation+2.87e+00

Pos logit contributions
ll+0.034
ma+0.031
tha+0.024
Once+0.023
 done+0.023
Neg logit contributions
der-0.056
 dot-0.050
powers-0.049
lier-0.048
angs-0.048
Token.
Token ID13
Feature activation-1.07e+01

Pos logit contributions
ll+0.136
ma+0.131
tha+0.121
 done+0.116
Once+0.115
Neg logit contributions
der-0.023
 dot-0.022
powers-0.018
 trigger-0.018
angs-0.015
Token,
Token ID11
Feature activation+6.45e+00

Pos logit contributions
.+0.161
der+0.158
 dot+0.144
powers+0.135
lier+0.131
Neg logit contributions
ll-0.167
ma-0.149
tha-0.142
igi-0.142
aman-0.131
Token she
Token ID673
Feature activation-3.66e+00

Pos logit contributions
ll+0.195
ma+0.173
tha+0.161
igi+0.159
aman+0.153
Neg logit contributions
der-0.166
.-0.165
 dot-0.146
powers-0.143
lier-0.135
Token noticed
Token ID6810
Feature activation+5.85e-01

Pos logit contributions
.+0.094
der+0.083
 dot+0.076
 drops+0.074
lier+0.072
Neg logit contributions
ll-0.104
igi-0.099
ma-0.098
tha-0.097
aman-0.094
Token something
Token ID1223
Feature activation-7.54e+00

Pos logit contributions
ma+0.018
ll+0.017
tha+0.015
aman+0.015
Once+0.015
Neg logit contributions
powers-0.013
der-0.013
 dot-0.013
pool-0.012
 target-0.012
Token y
Token ID331
Feature activation-9.25e-01

Pos logit contributions
der+0.189
powers+0.177
 dot+0.170
obs+0.164
 target+0.157
Neg logit contributions
ll-0.209
ma-0.202
ulu-0.184
lly-0.183
str-0.181
Tokenummy
Token ID13513
Feature activation+9.79e-01

Pos logit contributions
der+0.030
powers+0.030
 target+0.029
 dot+0.028
 robot+0.026
Neg logit contributions
ll-0.020
ma-0.017
aman-0.016
tha-0.015
igi-0.014
Token in
Token ID287
Feature activation-1.65e+00

Pos logit contributions
ll+0.030
ma+0.029
tha+0.026
 done+0.026
lly+0.025
Neg logit contributions
powers-0.022
der-0.021
 dot-0.020
pool-0.019
 target-0.018
Token the
Token ID262
Feature activation-7.77e-01

Pos logit contributions
der+0.056
powers+0.053
obs+0.049
lier+0.048
attered+0.048
Neg logit contributions
ma-0.036
tha-0.031
ll-0.030
Once-0.030
aman-0.029
Token garden
Token ID11376
Feature activation-5.87e+00

Pos logit contributions
der+0.022
powers+0.019
 dot+0.019
.+0.019
lier+0.018
Neg logit contributions
ll-0.021
ma-0.021
tha-0.018
aman-0.017
igi-0.017
Token.
Token ID13
Feature activation-1.17e+01

Pos logit contributions
powers+0.084
der+0.073
 dot+0.067
 trigger+0.059
tie+0.056
Neg logit contributions
ma-0.247
ll-0.235
aman-0.212
tha-0.211
lly-0.209
Token\n
Token ID198
Feature activation+9.05e+00

Pos logit contributions
der+0.203
.+0.190
 dot+0.174
powers+0.165
lier+0.148
Neg logit contributions
ll-0.445
ma-0.419
tha-0.396
igi-0.382
aman-0.374
Token of
Token ID286
Feature activation-3.20e+00

Pos logit contributions
der+0.118
 dot+0.107
attered+0.106
powers+0.105
ifying+0.103
Neg logit contributions
ma-0.167
ll-0.166
ulu-0.146
Once-0.144
igi-0.139
Token animals
Token ID4695
Feature activation-3.61e+00

Pos logit contributions
der+0.072
.+0.066
 dot+0.064
powers+0.061
lier+0.057
Neg logit contributions
ll-0.104
ma-0.098
tha-0.091
igi-0.087
aman-0.086
Token,
Token ID11
Feature activation+6.43e+00

Pos logit contributions
der+0.087
powers+0.079
 dot+0.078
 trigger+0.072
attered+0.072
Neg logit contributions
ll-0.117
ma-0.106
tha-0.092
Once-0.092
igi-0.088
Token trees
Token ID7150
Feature activation-3.67e+00

Pos logit contributions
ll+0.196
ma+0.185
tha+0.171
igi+0.162
aman+0.160
Neg logit contributions
der-0.158
.-0.145
 dot-0.142
powers-0.136
lier-0.127
Token and
Token ID290
Feature activation+3.51e+00

Pos logit contributions
der+0.098
.+0.090
 dot+0.089
powers+0.087
lier+0.081
Neg logit contributions
ll-0.104
ma-0.097
tha-0.089
igi-0.084
aman-0.083
Token clouds
Token ID15114
Feature activation-2.75e+00

Pos logit contributions
ll+0.116
ma+0.110
tha+0.102
igi+0.098
aman+0.096
Neg logit contributions
der-0.076
.-0.069
 dot-0.067
powers-0.065
lier-0.060
Token.
Token ID13
Feature activation-1.19e+01

Pos logit contributions
powers+0.049
der+0.044
 trigger+0.039
 dot+0.038
angs+0.037
Neg logit contributions
ll-0.104
ma-0.099
Once-0.087
tha-0.086
aman-0.086
Token Everything
Token ID11391
Feature activation-1.75e+00

Pos logit contributions
der+0.232
 dot+0.210
.+0.210
powers+0.201
lier+0.183
Neg logit contributions
ll-0.393
ma-0.381
tha-0.360
aman-0.348
Once-0.346
Token was
Token ID373
Feature activation+1.68e+00

Pos logit contributions
der+0.058
 dot+0.054
powers+0.054
 trigger+0.052
angs+0.051
Neg logit contributions
ll-0.037
ma-0.035
tha-0.029
Once-0.027
lly-0.027
Token so
Token ID523
Feature activation+4.70e+00

Pos logit contributions
ll+0.060
ma+0.054
tha+0.051
prise+0.051
igi+0.048
Neg logit contributions
.-0.039
der-0.033
 dot-0.030
lier-0.027
 across-0.025
Token peaceful
Token ID12309
Feature activation-3.42e+00

Pos logit contributions
ll+0.171
ma+0.166
tha+0.156
aman+0.148
igi+0.147
Neg logit contributions
.-0.085
der-0.085
 dot-0.073
powers-0.072
lier-0.065
Token boy
Token ID2933
Feature activation+3.32e+00

Pos logit contributions
ll+0.018
ma+0.018
tha+0.016
Once+0.015
igi+0.014
Neg logit contributions
der-0.016
powers-0.015
.-0.014
lier-0.014
 dot-0.014
Token nodded
Token ID14464
Feature activation+3.87e+00

Pos logit contributions
ll+0.085
ma+0.078
igi+0.074
tha+0.074
aman+0.071
Neg logit contributions
der-0.100
.-0.098
 dot-0.088
lier-0.085
powers-0.083
Token and
Token ID290
Feature activation+3.31e+00

Pos logit contributions
ma+0.135
ll+0.130
Once+0.118
tha+0.117
igi+0.112
Neg logit contributions
powers-0.075
der-0.073
 dot-0.067
angs-0.063
obs-0.062
Token used
Token ID973
Feature activation-3.68e+00

Pos logit contributions
ll+0.089
ma+0.082
igi+0.075
tha+0.074
aman+0.070
Neg logit contributions
der-0.099
.-0.095
 dot-0.084
powers-0.081
lier-0.079
Token his
Token ID465
Feature activation+4.79e-01

Pos logit contributions
der+0.106
 dot+0.095
powers+0.093
lier+0.090
elly+0.090
Neg logit contributions
ma-0.093
ll-0.091
tha-0.081
Once-0.078
aman-0.075
Token net
Token ID2010
Feature activation-6.85e+00

Pos logit contributions
ll+0.015
ma+0.014
tha+0.013
igi+0.012
Once+0.012
Neg logit contributions
der-0.012
.-0.010
 dot-0.010
powers-0.010
lier-0.009
Token to
Token ID284
Feature activation+8.63e-01

Pos logit contributions
der+0.158
 dot+0.148
elly+0.135
ifying+0.132
oky+0.126
Neg logit contributions
ll-0.214
ma-0.206
tha-0.173
 done-0.171
igi-0.168
Token scoop
Token ID23529
Feature activation+6.40e-03

Pos logit contributions
ll+0.025
ma+0.025
tha+0.023
aman+0.022
igi+0.021
Neg logit contributions
der-0.022
powers-0.019
 dot-0.019
.-0.018
elly-0.018
Token up
Token ID510
Feature activation-4.47e+00

Pos logit contributions
ma+0.000
ll+0.000
tha+0.000
igi+0.000
ulu+0.000
Neg logit contributions
der-0.000
powers-0.000
 dot-0.000
 target-0.000
ifying-0.000
Token the
Token ID262
Feature activation-7.88e-01

Pos logit contributions
der+0.147
powers+0.139
 dot+0.137
attered+0.128
elly+0.127
Neg logit contributions
ma-0.095
ll-0.084
Once-0.073
tha-0.072
ulu-0.067
Token cr
Token ID1067
Feature activation-3.39e+00

Pos logit contributions
der+0.018
.+0.015
powers+0.015
 dot+0.014
lier+0.014
Neg logit contributions
ll-0.026
ma-0.025
tha-0.023
igi-0.021
Once-0.021
Token and
Token ID290
Feature activation+4.53e+00

Pos logit contributions
ll+0.177
ma+0.168
igi+0.167
tha+0.166
prise+0.163
Neg logit contributions
.-0.125
der-0.120
 dot-0.108
powers-0.103
 target-0.099
Token Star
Token ID2907
Feature activation+7.85e+00

Pos logit contributions
ll+0.130
ma+0.119
tha+0.110
igi+0.107
prise+0.103
Neg logit contributions
der-0.122
.-0.119
 dot-0.111
powers-0.105
lier-0.100
Token ran
Token ID4966
Feature activation-1.03e+00

Pos logit contributions
ll+0.184
ma+0.167
tha+0.149
Once+0.134
igi+0.133
Neg logit contributions
der-0.253
 dot-0.232
powers-0.227
.-0.227
angs-0.217
Token very
Token ID845
Feature activation+3.24e+00

Pos logit contributions
der+0.028
 dot+0.026
angs+0.026
elly+0.025
ifying+0.024
Neg logit contributions
ll-0.027
ma-0.024
Once-0.024
 done-0.022
tha-0.022
Token fast
Token ID3049
Feature activation-4.91e+00

Pos logit contributions
ll+0.111
ma+0.102
tha+0.096
igi+0.091
Once+0.090
Neg logit contributions
der-0.069
 dot-0.061
.-0.060
powers-0.056
angs-0.055
Token.
Token ID13
Feature activation-1.05e+01

Pos logit contributions
angs+0.076
ifying+0.073
 trigger+0.070
attered+0.069
 dot+0.068
Neg logit contributions
ll-0.176
ma-0.160
 done-0.150
 ha-0.147
Once-0.146
Token They
Token ID1119
Feature activation-2.00e+00

Pos logit contributions
der+0.216
 dot+0.203
angs+0.193
.+0.192
lier+0.185
Neg logit contributions
ll-0.310
ma-0.289
tha-0.288
Once-0.288
aman-0.276
Token passed
Token ID3804
Feature activation+3.30e-01

Pos logit contributions
.+0.054
der+0.052
 dot+0.050
powers+0.046
 target+0.046
Neg logit contributions
ll-0.055
ma-0.053
tha-0.050
igi-0.049
aman-0.049
Token all
Token ID477
Feature activation-3.26e+00

Pos logit contributions
ll+0.011
ma+0.010
tha+0.010
igi+0.009
aman+0.009
Neg logit contributions
der-0.007
.-0.007
 dot-0.007
powers-0.006
lier-0.006
Token the
Token ID262
Feature activation+7.06e-01

Pos logit contributions
der+0.078
.+0.077
 dot+0.072
powers+0.068
lier+0.064
Neg logit contributions
ll-0.100
ma-0.093
tha-0.088
igi-0.085
aman-0.083
Token other
Token ID584
Feature activation-3.07e+00

Pos logit contributions
ll+0.020
ma+0.019
tha+0.018
igi+0.017
aman+0.016
Neg logit contributions
der-0.019
.-0.018
 dot-0.017
powers-0.017
lier-0.016
Token,
Token ID11
Feature activation+6.33e+00

Pos logit contributions
ll+0.036
ma+0.033
tha+0.031
igi+0.029
aman+0.029
Neg logit contributions
der-0.035
.-0.034
 dot-0.032
powers-0.030
lier-0.029
Token Sam
Token ID3409
Feature activation+1.56e+00

Pos logit contributions
ll+0.225
ma+0.212
tha+0.198
igi+0.191
aman+0.188
Neg logit contributions
der-0.126
.-0.117
 dot-0.110
powers-0.104
lier-0.096
Token kicked
Token ID12165
Feature activation-2.98e+00

Pos logit contributions
ll+0.037
ma+0.034
tha+0.031
igi+0.030
aman+0.029
Neg logit contributions
der-0.048
.-0.047
 dot-0.044
powers-0.043
lier-0.041
Token the
Token ID262
Feature activation-8.88e-01

Pos logit contributions
der+0.087
powers+0.080
 dot+0.079
.+0.075
obs+0.073
Neg logit contributions
ll-0.074
ma-0.073
tha-0.064
Once-0.061
aman-0.060
Token ball
Token ID2613
Feature activation-5.86e+00

Pos logit contributions
der+0.024
lier+0.022
obs+0.022
angs+0.022
storms+0.021
Neg logit contributions
ll-0.021
ma-0.021
Once-0.020
 ha-0.019
 done-0.017
Token too
Token ID1165
Feature activation-2.93e+00

Pos logit contributions
obs+0.161
powers+0.157
der+0.153
ifying+0.150
 dot+0.149
Neg logit contributions
ll-0.141
ma-0.126
Once-0.126
 done-0.110
 eventually-0.109
Token hard
Token ID1327
Feature activation-5.45e+00

Pos logit contributions
 zone+0.090
 dot+0.089
ifying+0.087
attered+0.086
obs+0.086
Neg logit contributions
ll-0.065
Once-0.059
ma-0.051
 done-0.049
Finally-0.048
Token and
Token ID290
Feature activation+3.31e+00

Pos logit contributions
 dot+0.128
powers+0.125
 trigger+0.121
ifying+0.121
angs+0.117
Neg logit contributions
ll-0.171
ma-0.152
 done-0.132
Once-0.126
 told-0.120
Token it
Token ID340
Feature activation-4.01e+00

Pos logit contributions
ll+0.088
ma+0.082
tha+0.076
igi+0.069
Once+0.069
Neg logit contributions
der-0.091
 dot-0.084
powers-0.083
.-0.081
elly-0.079
Token went
Token ID1816
Feature activation-1.83e-01

Pos logit contributions
der+0.115
 dot+0.105
.+0.103
powers+0.102
obs+0.099
Neg logit contributions
ll-0.103
ma-0.094
tha-0.084
Once-0.081
igi-0.079
Token into
Token ID656
Feature activation-1.66e+00

Pos logit contributions
der+0.005
 dot+0.005
powers+0.005
obs+0.005
elly+0.005
Neg logit contributions
ll-0.004
ma-0.004
Once-0.004
 done-0.004
tha-0.004
Token bananas
Token ID35484
Feature activation-1.76e+00

Pos logit contributions
der+0.011
.+0.010
 dot+0.009
powers+0.009
lier+0.008
Neg logit contributions
ll-0.020
ma-0.019
tha-0.019
aman-0.017
igi-0.017
Token.
Token ID13
Feature activation-1.13e+01

Pos logit contributions
 dot+0.029
der+0.025
powers+0.025
attered+0.023
obs+0.022
Neg logit contributions
ll-0.071
ma-0.065
Once-0.058
tha-0.057
igi-0.055
Token He
Token ID679
Feature activation-3.24e+00

Pos logit contributions
der+0.179
.+0.159
 dot+0.156
powers+0.141
lier+0.129
Neg logit contributions
ll-0.442
ma-0.418
tha-0.398
igi-0.381
aman-0.379
Token wished
Token ID16555
Feature activation+5.88e+00

Pos logit contributions
der+0.082
.+0.077
 target+0.066
 dot+0.066
powers+0.065
Neg logit contributions
ma-0.098
ll-0.097
aman-0.090
tha-0.089
prise-0.088
Token he
Token ID339
Feature activation-3.92e+00

Pos logit contributions
ll+0.207
ma+0.196
tha+0.184
igi+0.177
aman+0.177
Neg logit contributions
der-0.124
.-0.115
 dot-0.100
powers-0.095
lier-0.088
Token could
Token ID714
Feature activation-4.07e+00

Pos logit contributions
der+0.111
.+0.103
 dot+0.102
powers+0.098
lier+0.092
Neg logit contributions
ll-0.107
ma-0.099
tha-0.092
igi-0.086
aman-0.083
Token steal
Token ID8711
Feature activation-5.19e+00

Pos logit contributions
der+0.105
.+0.099
 dot+0.095
powers+0.092
lier+0.086
Neg logit contributions
ll-0.119
ma-0.112
tha-0.103
igi-0.098
aman-0.097
Token something
Token ID1223
Feature activation-7.09e+00

Pos logit contributions
der+0.134
 dot+0.123
.+0.122
powers+0.118
lier+0.108
Neg logit contributions
ll-0.150
ma-0.143
tha-0.130
igi-0.124
aman-0.122
Token too
Token ID1165
Feature activation-2.23e+00

Pos logit contributions
der+0.163
.+0.153
 dot+0.143
powers+0.138
lier+0.129
Neg logit contributions
ll-0.228
ma-0.217
tha-0.203
igi-0.193
aman-0.191
Token.
Token ID13
Feature activation-1.12e+01

Pos logit contributions
 dot+0.039
attered+0.035
der+0.032
angs+0.032
ifying+0.031
Neg logit contributions
ll-0.089
ma-0.075
Once-0.074
 glad-0.073
 done-0.073
Token\n
Token ID198
Feature activation+9.33e+00

Pos logit contributions
der+0.220
.+0.206
powers+0.182
 dot+0.181
lier+0.161
Neg logit contributions
ll-0.407
ma-0.392
tha-0.368
igi-0.354
prise-0.348
Token\n
Token ID198
Feature activation+8.42e+00

Pos logit contributions
der+0.128
.+0.126
 dot+0.113
powers+0.112
lier+0.101
Neg logit contributions
ll-0.163
ma-0.152
tha-0.143
igi-0.136
aman-0.134
Token\n
Token ID198
Feature activation+8.46e+00

Pos logit contributions
ll+0.251
ma+0.231
tha+0.230
aman+0.217
igi+0.209
Neg logit contributions
.-0.225
der-0.207
powers-0.188
 dot-0.186
 It-0.168
TokenJen
Token ID44875
Feature activation+8.79e+00

Pos logit contributions
ll+0.241
ma+0.217
tha+0.216
aman+0.200
igi+0.191
Neg logit contributions
.-0.229
der-0.221
powers-0.205
 dot-0.202
lier-0.189
Token is
Token ID318
Feature activation-3.71e-01

Pos logit contributions
ll+0.236
ma+0.220
tha+0.199
igi+0.186
aman+0.185
Neg logit contributions
der-0.250
.-0.233
 dot-0.229
powers-0.221
lier-0.208
Token worried
Token ID7960
Feature activation-7.05e-01

Pos logit contributions
.+0.010
der+0.009
 dot+0.008
powers+0.008
lier+0.007
Neg logit contributions
ll-0.012
ma-0.010
tha-0.010
aman-0.010
igi-0.010
Token.
Token ID13
Feature activation-1.22e+01

Pos logit contributions
powers+0.012
 target+0.011
 dot+0.011
ifying+0.011
 trigger+0.010
Neg logit contributions
ma-0.027
ll-0.026
Once-0.024
 done-0.022
 told-0.021
Token She
Token ID1375
Feature activation-2.46e+00

Pos logit contributions
der+0.273
 dot+0.252
powers+0.244
lier+0.232
.+0.232
Neg logit contributions
ll-0.373
ma-0.356
tha-0.336
aman-0.324
Once-0.319
Token says
Token ID1139
Feature activation+5.47e+00

Pos logit contributions
der+0.070
.+0.065
 dot+0.065
powers+0.062
lier+0.059
Neg logit contributions
ll-0.065
ma-0.061
tha-0.055
igi-0.051
aman-0.051
Token,
Token ID11
Feature activation+6.02e+00

Pos logit contributions
ll+0.134
ma+0.124
tha+0.112
igi+0.104
aman+0.102
Neg logit contributions
der-0.167
.-0.156
 dot-0.155
powers-0.149
lier-0.143
Token "
Token ID366
Feature activation+3.66e-01

Pos logit contributions
ll+0.218
ma+0.207
tha+0.195
igi+0.188
aman+0.185
Neg logit contributions
der-0.117
.-0.108
 dot-0.100
powers-0.093
lier-0.085
TokenYou
Token ID1639
Feature activation-1.22e+00

Pos logit contributions
ll+0.010
ma+0.010
tha+0.009
igi+0.008
aman+0.008
Neg logit contributions
der-0.010
.-0.010
 dot-0.009
powers-0.009
lier-0.008
Token!"
Token ID2474
Feature activation-3.03e+00

Pos logit contributions
ll+0.117
ma+0.115
tha+0.100
aman+0.091
igi+0.086
Neg logit contributions
der-0.175
 dot-0.164
powers-0.161
attered-0.156
.-0.154
Token Anna
Token ID11735
Feature activation+4.12e+00

Pos logit contributions
.+0.078
der+0.074
 dot+0.065
powers+0.064
 It+0.059
Neg logit contributions
ll-0.100
ma-0.087
igi-0.085
tha-0.084
aman-0.079
Token says
Token ID1139
Feature activation+5.37e+00

Pos logit contributions
ll+0.109
ma+0.101
tha+0.093
igi+0.089
aman+0.086
Neg logit contributions
der-0.120
.-0.115
 dot-0.109
powers-0.105
lier-0.100
Token to
Token ID284
Feature activation+4.18e-01

Pos logit contributions
ma+0.226
ll+0.226
tha+0.206
 pleaded+0.194
 glad+0.191
Neg logit contributions
 dot-0.060
der-0.056
powers-0.053
 target-0.051
elly-0.050
Token Ben
Token ID3932
Feature activation+4.03e+00

Pos logit contributions
ll+0.012
ma+0.011
tha+0.010
igi+0.009
aman+0.009
Neg logit contributions
der-0.012
.-0.011
 dot-0.010
powers-0.010
lier-0.009
Token.
Token ID13
Feature activation-1.22e+01

Pos logit contributions
ma+0.180
ll+0.177
tha+0.157
Once+0.154
 pleaded+0.151
Neg logit contributions
der-0.049
 dot-0.042
powers-0.039
ifying-0.033
attered-0.031
Token "
Token ID366
Feature activation+2.44e-01

Pos logit contributions
der+0.282
 dot+0.267
powers+0.262
lier+0.247
angs+0.242
Neg logit contributions
ll-0.361
ma-0.349
tha-0.324
Once-0.317
aman-0.301
TokenWe
Token ID1135
Feature activation-3.09e+00

Pos logit contributions
ll+0.007
ma+0.006
tha+0.006
igi+0.005
aman+0.005
Neg logit contributions
der-0.007
.-0.006
 dot-0.006
powers-0.006
lier-0.006
Token can
Token ID460
Feature activation-2.04e+00

Pos logit contributions
der+0.078
.+0.069
 dot+0.068
powers+0.065
lier+0.062
Neg logit contributions
ll-0.095
ma-0.089
tha-0.082
igi-0.077
Once-0.076
Token't
Token ID470
Feature activation+1.51e+00

Pos logit contributions
der+0.065
 dot+0.059
.+0.058
powers+0.057
lier+0.055
Neg logit contributions
ll-0.050
ma-0.046
tha-0.041
igi-0.037
Once-0.037
Token slide
Token ID10649
Feature activation-2.31e+00

Pos logit contributions
ll+0.037
ma+0.037
 ha+0.031
Once+0.031
lly+0.030
Neg logit contributions
der-0.049
elly-0.043
powers-0.042
attered-0.041
ifying-0.041
Token Max
Token ID5436
Feature activation+1.94e+00

Pos logit contributions
ma+0.062
ll+0.058
tha+0.049
Once+0.047
aman+0.043
Neg logit contributions
der-0.098
attered-0.092
powers-0.091
 dot-0.091
elly-0.090
Token go
Token ID467
Feature activation+2.36e-01

Pos logit contributions
ll+0.052
ma+0.047
tha+0.042
Once+0.041
igi+0.038
Neg logit contributions
der-0.056
 dot-0.050
powers-0.049
.-0.047
elly-0.047
Token to
Token ID284
Feature activation+4.85e-01

Pos logit contributions
aman+0.010
tha+0.010
ll+0.010
ma+0.010
liv+0.010
Neg logit contributions
.-0.003
 towards-0.002
powers-0.001
ly-0.001
der-0.001
Token say
Token ID910
Feature activation+5.96e+00

Pos logit contributions
ll+0.015
ma+0.014
tha+0.013
igi+0.013
aman+0.012
Neg logit contributions
.-0.012
der-0.011
 dot-0.010
powers-0.010
 target-0.009
Token hello
Token ID23748
Feature activation-4.27e+00

Pos logit contributions
ll+0.146
ma+0.142
tha+0.125
Once+0.121
 glad+0.116
Neg logit contributions
der-0.173
 dot-0.165
powers-0.157
attered-0.155
obs-0.151
Token.
Token ID13
Feature activation-1.22e+01

Pos logit contributions
 dot+0.090
 obstacle+0.088
powers+0.088
der+0.088
 trigger+0.083
Neg logit contributions
ll-0.141
ma-0.128
tha-0.114
Once-0.111
igi-0.107
Token\n
Token ID198
Feature activation+8.34e+00

Pos logit contributions
der+0.245
.+0.230
 dot+0.214
powers+0.203
lier+0.185
Neg logit contributions
ll-0.436
ma-0.415
tha-0.388
igi-0.372
aman-0.367
Token\n
Token ID198
Feature activation+8.38e+00

Pos logit contributions
ll+0.281
tha+0.272
ma+0.272
aman+0.255
prise+0.254
Neg logit contributions
.-0.197
der-0.145
 It-0.142
powers-0.137
 dot-0.136
Token"
Token ID1
Feature activation+1.19e+00

Pos logit contributions
ll+0.245
ma+0.227
tha+0.215
aman+0.203
prise+0.197
Neg logit contributions
.-0.222
der-0.199
 dot-0.190
powers-0.189
elly-0.181
TokenHi
Token ID17250
Feature activation+3.71e+00

Pos logit contributions
ll+0.033
ma+0.030
tha+0.028
igi+0.026
aman+0.026
Neg logit contributions
der-0.034
.-0.032
 dot-0.031
powers-0.029
lier-0.028
Token,
Token ID11
Feature activation+6.03e+00

Pos logit contributions
ll+0.091
ma+0.085
tha+0.076
igi+0.071
aman+0.070
Neg logit contributions
der-0.116
.-0.106
 dot-0.105
powers-0.102
lier-0.096
Token now
Token ID783
Feature activation+1.77e+00

Pos logit contributions
ll+0.197
ma+0.182
aman+0.167
igi+0.167
tha+0.165
Neg logit contributions
.-0.164
der-0.164
 dot-0.144
powers-0.138
lier-0.128
Token,
Token ID11
Feature activation+5.93e+00

Pos logit contributions
ma+0.030
ll+0.030
tha+0.027
Once+0.023
lly+0.022
Neg logit contributions
der-0.065
 dot-0.063
powers-0.062
elly-0.060
obs-0.059
Token Tom
Token ID4186
Feature activation+6.67e+00

Pos logit contributions
ll+0.163
ma+0.152
tha+0.140
igi+0.132
aman+0.130
Neg logit contributions
der-0.165
.-0.155
 dot-0.150
powers-0.145
lier-0.137
Token is
Token ID318
Feature activation-3.01e-01

Pos logit contributions
ll+0.207
ma+0.194
tha+0.174
igi+0.165
Once+0.161
Neg logit contributions
der-0.165
 dot-0.151
powers-0.145
.-0.139
pool-0.134
Token dead
Token ID2636
Feature activation-8.43e+00

Pos logit contributions
der+0.009
 dot+0.008
powers+0.008
elly+0.007
pool+0.007
Neg logit contributions
ll-0.008
ma-0.008
tha-0.007
 glad-0.007
 done-0.006
Token.
Token ID13
Feature activation-1.23e+01

Pos logit contributions
ifying+0.162
attered+0.159
powers+0.157
tie+0.156
obs+0.153
Neg logit contributions
ma-0.260
ll-0.239
 waiting-0.210
 told-0.209
 ha-0.207
Token And
Token ID843
Feature activation+6.41e+00

Pos logit contributions
der+0.273
.+0.244
 dot+0.241
powers+0.231
lier+0.216
Neg logit contributions
ll-0.405
ma-0.385
tha-0.356
igi-0.340
Once-0.339
Token Sam
Token ID3409
Feature activation+1.02e+00

Pos logit contributions
ll+0.200
ma+0.184
tha+0.173
igi+0.169
aman+0.168
Neg logit contributions
der-0.159
.-0.157
 dot-0.142
powers-0.136
 target-0.124
Token is
Token ID318
Feature activation-3.26e-01

Pos logit contributions
ll+0.027
ma+0.025
tha+0.023
igi+0.022
aman+0.022
Neg logit contributions
der-0.029
.-0.028
 dot-0.027
powers-0.026
lier-0.025
Token alive
Token ID6776
Feature activation-8.65e+00

Pos logit contributions
der+0.010
 dot+0.009
powers+0.009
pool+0.008
obs+0.008
Neg logit contributions
ll-0.008
ma-0.008
tha-0.007
 glad-0.007
igi-0.007
Token.
Token ID13
Feature activation-1.23e+01

Pos logit contributions
powers+0.153
obs+0.150
 dot+0.149
 trigger+0.148
attered+0.140
Neg logit contributions
ma-0.316
ll-0.291
Once-0.252
tha-0.242
 told-0.241
Token Tom
Token ID4186
Feature activation+6.66e+00

Pos logit contributions
ll+0.197
ma+0.178
tha+0.173
igi+0.167
aman+0.161
Neg logit contributions
der-0.145
.-0.142
 dot-0.127
powers-0.120
lier-0.114
Token.
Token ID13
Feature activation-1.21e+01

Pos logit contributions
ll+0.286
ma+0.281
tha+0.247
 done+0.244
igi+0.242
Neg logit contributions
der-0.091
 dot-0.077
powers-0.063
ifying-0.057
 trigger-0.057
Token A
Token ID317
Feature activation-8.40e+00

Pos logit contributions
der+0.267
.+0.245
 dot+0.242
powers+0.234
 target+0.218
Neg logit contributions
ll-0.372
ma-0.361
tha-0.326
Once-0.322
igi-0.309
Token funny
Token ID8258
Feature activation-5.36e+00

Pos logit contributions
der+0.195
.+0.172
 dot+0.171
powers+0.164
lier+0.155
Neg logit contributions
ll-0.268
ma-0.253
tha-0.233
igi-0.222
aman-0.220
Token joke
Token ID9707
Feature activation-2.21e+00

Pos logit contributions
der+0.121
.+0.107
 dot+0.106
powers+0.103
lier+0.096
Neg logit contributions
ll-0.172
ma-0.165
tha-0.153
igi-0.145
aman-0.144
Token.
Token ID13
Feature activation-1.21e+01

Pos logit contributions
der+0.042
 dot+0.041
powers+0.040
attered+0.037
 trigger+0.036
Neg logit contributions
ll-0.074
ma-0.072
 done-0.062
prise-0.060
tha-0.059
Token Try
Token ID9993
Feature activation-6.95e+00

Pos logit contributions
der+0.259
.+0.238
 dot+0.234
powers+0.226
attered+0.208
Neg logit contributions
ll-0.388
ma-0.372
tha-0.337
Once-0.334
igi-0.322
Token again
Token ID757
Feature activation-3.45e-01

Pos logit contributions
der+0.152
.+0.137
 dot+0.136
powers+0.132
lier+0.120
Neg logit contributions
ll-0.230
ma-0.218
tha-0.201
igi-0.192
aman-0.190
Token,
Token ID11
Feature activation+5.99e+00

Pos logit contributions
der+0.007
 dot+0.006
.+0.006
powers+0.006
lier+0.006
Neg logit contributions
ll-0.012
ma-0.011
tha-0.010
igi-0.010
aman-0.010
Token Tom
Token ID4186
Feature activation+6.69e+00

Pos logit contributions
ll+0.183
ma+0.170
tha+0.161
igi+0.154
aman+0.150
Neg logit contributions
der-0.152
.-0.144
 dot-0.136
powers-0.129
lier-0.122
Token,"
Token ID553
Feature activation+3.62e+00

Pos logit contributions
ll+0.240
ma+0.224
tha+0.200
igi+0.192
 done+0.189
Neg logit contributions
der-0.136
 dot-0.122
powers-0.113
ifying-0.105
attered-0.104