TOP ACTIVATIONS
MAX = 0.506

skin might get s og gy . Amelia quickly stood up
She had a big pig gy bank where she saved her
in the water ! Dog gy was happy and excited ,
. It was so fog gy that no one could even
was so bright that Dog gy had to close his eyes
pond and explore the fog gy world . He
was a cute little pig gy named Pink y . Pink
. Bug gy saw a big leaf and
and they were very bag gy and floppy so it was
. One day , Bug gy got stuck in the rain
that and soon a fog gy mist filled the room .
day , Kitty saw Dog gy eating a big bowl of
a best friend named Dog gy . They liked to play
to her friend , Dog gy . Dog gy said ,
Finally , everything turned fog gy and a thick mist began
. One day , Big gy saw a little cat sitting
a huge dog named Big gy . Big gy was very
was always there for Bug gy , no matter what .
with . Big gy wanted to help the cat
coins back in his pig gy bank and covered it with

MIN ACTIVATIONS
MIN = -0.705

will find gold or pear ls . What will you do
it . The dog how ls along . He likes the
tent . The bear grow ls and hangs his head .
. It moves and craw ls on the paper . It
to share . He grow ls and shows his teeth .
help Tom . She craw ls under the couch and gets
dog bar ks and grow ls at Sara . Sara feels
to play . He grow ls and scratches the toy .
not listen . He grow ls and shakes his head .
sharp teeth . He grow ls at Ben and tries to
flower . The beetle craw ls onto her palm .
like it . It grow ls and snaps at Sara .
He bar ks and grow ls and ch ases them .
back at night . Ow ls like the night ," mom
The dog grow ls and shows his teeth .
Ben and Lily and grow ls . It wants to eat
like that . He grow ls . He shows his teeth
The snail craw ls on Ben 's tower .
But then , Lily craw ls under the sofa . She
! I love painting ow ls . ” <|endoftext|> One

INTERVAL 0.204 - 0.506
CONTAINS 5.175%

to an end . Everyone who watched it was sad .
“ What are you doing in here ? This is
lemon to play with . Tim my was so excited to
mom my . K enny wanted his mom my
them went together to the store . They got some boxes

INTERVAL -0.099 - 0.204
CONTAINS 76.033%

. She smiled and said to the tire " Hi ,
bad cook ," Sam says . " I 'm
challenge , no matter what it is ". The
running over to see what was happening . Tim and Sue
liked to slide , swing and climb on the equipment .

INTERVAL -0.402 - -0.099
CONTAINS 18.553%

The dragon and Princess stayed up there for a while and
. The bird was so lively and so special that she
of you ." She hugged Sara . She said
what to say . They look at each other . They
next day . John was sad because the generous drain was

INTERVAL -0.705 - -0.402
CONTAINS 0.238%

in the village watched with smiles on their faces and cl
a big box !" Sam says . " We can make
to go home ," Mia says . " But
sorry and brave . He says , " Don 't worry
boy who was very depend able . He was always doing
Token skin
Token ID4168
Feature activation+1.16e-01

Pos logit contributions
ls+0.145
bs+0.130
s+0.127
oms+0.123
ushes+0.121
Neg logit contributions
penter-0.092
gy-0.090
IT-0.086
isk-0.085
zer-0.085
Token might
Token ID1244
Feature activation-6.69e-02

Pos logit contributions
gy+0.198
penter+0.184
zer+0.177
IT+0.176
isk+0.173
Neg logit contributions
ls-0.303
bs-0.273
oms-0.265
ushes-0.262
s-0.259
Token get
Token ID651
Feature activation+5.22e-02

Pos logit contributions
ls+0.170
bs+0.153
s+0.147
oms+0.147
ushes+0.142
Neg logit contributions
gy-0.118
penter-0.114
zer-0.110
IT-0.110
isk-0.106
Token s
Token ID264
Feature activation-7.01e-02

Pos logit contributions
gy+0.106
penter+0.095
 driving+0.093
isk+0.091
zer+0.090
Neg logit contributions
ls-0.120
bs-0.106
oms-0.106
ushes-0.104
s-0.101
Tokenog
Token ID519
Feature activation-1.67e-01

Pos logit contributions
ls+0.142
bs+0.124
oms+0.118
s+0.116
ushes+0.115
Neg logit contributions
gy-0.163
penter-0.157
zer-0.152
IT-0.151
isk-0.150
Tokengy
Token ID1360
Feature activation+5.06e-01

Pos logit contributions
ls+0.612
s+0.588
bs+0.566
oms+0.545
ushes+0.532
Neg logit contributions
lf-0.081
penter-0.075
IT-0.066
achable-0.064
zer-0.062
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
gy+0.976
penter+0.951
zer+0.917
IT+0.915
isk+0.896
Neg logit contributions
ls-1.201
bs-1.070
s-1.031
oms-1.026
ushes-0.992
Token Amelia
Token ID49769
Feature activation-5.35e-03

Pos logit contributions
gy+0.023
penter+0.022
zer+0.021
isk+0.021
IT+0.021
Neg logit contributions
ls-0.029
bs-0.026
s-0.025
oms-0.025
ushes-0.025
Token quickly
Token ID2952
Feature activation-5.55e-02

Pos logit contributions
ls+0.013
bs+0.012
s+0.012
oms+0.011
ushes+0.011
Neg logit contributions
gy-0.010
penter-0.009
IT-0.009
zer-0.009
isk-0.009
Token stood
Token ID6204
Feature activation+8.60e-02

Pos logit contributions
ls+0.132
bs+0.119
oms+0.113
s+0.112
ushes+0.109
Neg logit contributions
gy-0.108
penter-0.103
IT-0.100
zer-0.099
isk-0.098
Token up
Token ID510
Feature activation-1.10e-01

Pos logit contributions
gy+0.180
penter+0.169
zer+0.165
IT+0.163
isk+0.159
Neg logit contributions
ls-0.193
bs-0.175
oms-0.166
s-0.164
ushes-0.161
Token She
Token ID1375
Feature activation-1.07e-01

Pos logit contributions
gy+0.030
penter+0.028
IT+0.027
isk+0.027
zer+0.026
Neg logit contributions
ls-0.052
bs-0.047
oms-0.045
s-0.045
ushes-0.044
Token had
Token ID550
Feature activation+4.77e-02

Pos logit contributions
ls+0.232
s+0.213
bs+0.204
oms+0.177
ushes+0.173
Neg logit contributions
IT-0.218
uding-0.215
gy-0.214
penter-0.207
isk-0.207
Token a
Token ID257
Feature activation+1.08e-01

Pos logit contributions
gy+0.093
penter+0.093
isk+0.089
uding+0.087
zer+0.087
Neg logit contributions
ls-0.108
bs-0.102
ushes-0.096
s-0.095
oms-0.089
Token big
Token ID1263
Feature activation-2.46e-01

Pos logit contributions
gy+0.277
penter+0.270
IT+0.266
isk+0.265
zer+0.263
Neg logit contributions
ls-0.185
bs-0.160
s-0.149
oms-0.142
ushes-0.142
Token pig
Token ID12967
Feature activation+1.34e-01

Pos logit contributions
ls+0.554
bs+0.492
s+0.475
oms+0.474
ushes+0.452
Neg logit contributions
gy-0.479
IT-0.467
penter-0.466
isk-0.461
uding-0.455
Tokengy
Token ID1360
Feature activation+5.05e-01

Pos logit contributions
IT+0.236
achable+0.215
uding+0.202
ulum+0.194
ressive+0.194
Neg logit contributions
s-0.348
ls-0.342
bs-0.324
ushes-0.281
able-0.272
Token bank
Token ID3331
Feature activation+1.85e-01

Pos logit contributions
IT+0.840
achable+0.808
uding+0.789
penter+0.784
isk+0.781
Neg logit contributions
ls-1.221
s-1.204
bs-1.123
l-0.974
oms-0.960
Token where
Token ID810
Feature activation+2.64e-01

Pos logit contributions
gy+0.418
penter+0.400
IT+0.398
zer+0.391
isk+0.384
Neg logit contributions
ls-0.386
bs-0.339
s-0.325
oms-0.322
ushes-0.316
Token she
Token ID673
Feature activation-1.35e-01

Pos logit contributions
gy+0.524
penter+0.480
isk+0.466
zer+0.463
 driving+0.457
Neg logit contributions
ls-0.594
oms-0.533
bs-0.524
ushes-0.521
s-0.502
Token saved
Token ID7448
Feature activation+8.47e-02

Pos logit contributions
ls+0.321
bs+0.287
s+0.282
oms+0.255
ushes+0.251
Neg logit contributions
IT-0.249
gy-0.248
achable-0.245
penter-0.239
uding-0.236
Token her
Token ID607
Feature activation-9.77e-02

Pos logit contributions
gy+0.179
penter+0.173
IT+0.166
zer+0.166
isk+0.163
Neg logit contributions
ls-0.189
bs-0.166
s-0.158
oms-0.157
ushes-0.155
Token in
Token ID287
Feature activation+1.31e-02

Pos logit contributions
ls+0.077
bs+0.071
ushes+0.068
oms+0.068
s+0.065
Neg logit contributions
gy-0.048
penter-0.044
zer-0.043
isk-0.041
IT-0.041
Token the
Token ID262
Feature activation-3.58e-02

Pos logit contributions
gy+0.030
penter+0.029
isk+0.028
IT+0.028
zer+0.028
Neg logit contributions
ls-0.026
bs-0.023
s-0.022
ushes-0.022
oms-0.022
Token water
Token ID1660
Feature activation-1.43e-01

Pos logit contributions
ls+0.101
bs+0.094
ushes+0.089
oms+0.088
s+0.085
Neg logit contributions
gy-0.054
penter-0.050
IT-0.049
zer-0.048
isk-0.046
Token!
Token ID0
Feature activation+9.37e-02

Pos logit contributions
ls+0.334
s+0.293
bs+0.292
oms+0.288
ushes+0.271
Neg logit contributions
gy-0.282
penter-0.272
IT-0.265
zer-0.264
isk-0.260
Token Dog
Token ID8532
Feature activation+2.93e-01

Pos logit contributions
gy+0.170
penter+0.162
zer+0.159
IT+0.158
isk+0.155
Neg logit contributions
ls-0.234
bs-0.210
s-0.204
oms-0.200
ushes-0.196
Tokengy
Token ID1360
Feature activation+5.04e-01

Pos logit contributions
lf+0.090
IT+0.090
penter+0.085
uding+0.079
achable+0.076
Neg logit contributions
ls-1.102
s-1.076
bs-1.009
l-0.972
ushes-0.968
Token was
Token ID373
Feature activation+7.46e-02

Pos logit contributions
gy+0.997
penter+0.983
IT+0.968
zer+0.947
isk+0.944
Neg logit contributions
ls-1.136
s-1.048
bs-1.005
oms-0.945
ushes-0.879
Token happy
Token ID3772
Feature activation-1.90e-01

Pos logit contributions
gy+0.140
penter+0.133
IT+0.131
zer+0.130
isk+0.130
Neg logit contributions
ls-0.178
bs-0.159
s-0.157
oms-0.156
ushes-0.156
Token and
Token ID290
Feature activation+6.60e-02

Pos logit contributions
ls+0.395
s+0.364
bs+0.315
oms+0.313
ushes+0.279
Neg logit contributions
IT-0.446
isk-0.435
gy-0.435
penter-0.429
zer-0.405
Token excited
Token ID6568
Feature activation-1.76e-01

Pos logit contributions
gy+0.126
penter+0.118
zer+0.112
 driving+0.111
IT+0.109
Neg logit contributions
ls-0.160
oms-0.142
bs-0.142
ushes-0.139
s-0.131
Token,
Token ID11
Feature activation-1.27e-02

Pos logit contributions
ls+0.387
s+0.343
bs+0.323
oms+0.311
ushes+0.283
Neg logit contributions
gy-0.384
penter-0.378
IT-0.362
isk-0.360
zer-0.359
Token.
Token ID13
Feature activation+1.79e-02

Pos logit contributions
ls+0.260
s+0.239
oms+0.213
bs+0.212
l+0.205
Neg logit contributions
IT-0.124
Billy-0.121
uries-0.118
achable-0.117
zer-0.115
Token It
Token ID632
Feature activation-3.66e-02

Pos logit contributions
gy+0.038
penter+0.036
zer+0.036
isk+0.036
uding+0.034
Neg logit contributions
ls-0.035
bs-0.035
oms-0.032
ushes-0.032
s-0.031
Token was
Token ID373
Feature activation+7.49e-02

Pos logit contributions
ls+0.098
bs+0.089
oms+0.085
s+0.085
ushes+0.084
Neg logit contributions
gy-0.061
penter-0.057
zer-0.055
IT-0.055
isk-0.054
Token so
Token ID523
Feature activation+1.18e-01

Pos logit contributions
zer+0.171
gy+0.170
IT+0.168
penter+0.166
uding+0.163
Neg logit contributions
ls-0.143
bs-0.125
ushes-0.118
s-0.118
oms-0.114
Token fog
Token ID19558
Feature activation-9.08e-02

Pos logit contributions
gy+0.220
lf+0.211
penter+0.206
 always+0.205
 driving+0.205
Neg logit contributions
ls-0.260
bs-0.236
oms-0.235
ushes-0.226
s-0.217
Tokengy
Token ID1360
Feature activation+5.04e-01

Pos logit contributions
ls+0.258
s+0.237
bs+0.220
oms+0.216
l+0.202
Neg logit contributions
IT-0.130
zer-0.127
penter-0.125
 dentist-0.121
uding-0.120
Token that
Token ID326
Feature activation+1.57e-01

Pos logit contributions
gy+0.991
penter+0.989
IT+0.980
zer+0.975
isk+0.956
Neg logit contributions
ls-1.158
s-1.040
bs-1.006
oms-0.958
ushes-0.864
Token no
Token ID645
Feature activation+2.49e-01

Pos logit contributions
gy+0.307
penter+0.291
zer+0.281
IT+0.281
isk+0.277
Neg logit contributions
ls-0.372
bs-0.332
oms-0.319
s-0.315
ushes-0.312
Token one
Token ID530
Feature activation+2.10e-01

Pos logit contributions
gy+0.363
penter+0.343
IT+0.331
zer+0.328
isk+0.327
Neg logit contributions
ls-0.729
bs-0.661
s-0.636
oms-0.636
ushes-0.619
Token could
Token ID714
Feature activation-5.49e-02

Pos logit contributions
gy+0.370
penter+0.345
zer+0.334
IT+0.326
isk+0.323
Neg logit contributions
ls-0.534
bs-0.472
oms-0.468
ushes-0.459
s-0.450
Token even
Token ID772
Feature activation+1.91e-01

Pos logit contributions
ls+0.129
bs+0.120
s+0.112
oms+0.106
ushes+0.104
Neg logit contributions
IT-0.101
penter-0.099
gy-0.099
zer-0.097
isk-0.093
Token was
Token ID373
Feature activation+7.42e-02

Pos logit contributions
gy+0.064
penter+0.060
zer+0.056
lf+0.055
IT+0.054
Neg logit contributions
ls-0.116
bs-0.109
oms-0.104
ushes-0.103
s-0.099
Token so
Token ID523
Feature activation+1.18e-01

Pos logit contributions
gy+0.135
penter+0.128
IT+0.126
zer+0.126
isk+0.124
Neg logit contributions
ls-0.184
bs-0.164
ushes-0.160
s-0.159
oms-0.159
Token bright
Token ID6016
Feature activation-1.50e-01

Pos logit contributions
 always+0.184
 never+0.180
 not+0.172
 because+0.171
achable+0.163
Neg logit contributions
ls-0.302
bs-0.280
oms-0.273
ushes-0.245
s-0.237
Token that
Token ID326
Feature activation+1.57e-01

Pos logit contributions
ls+0.396
s+0.344
bs+0.342
oms+0.337
ushes+0.320
Neg logit contributions
gy-0.264
isk-0.249
penter-0.247
IT-0.244
zer-0.242
Token Dog
Token ID8532
Feature activation+2.92e-01

Pos logit contributions
gy+0.309
penter+0.296
IT+0.294
isk+0.292
zer+0.291
Neg logit contributions
ls-0.369
bs-0.324
s-0.318
oms-0.300
ushes-0.293
Tokengy
Token ID1360
Feature activation+5.04e-01

Pos logit contributions
penter+0.079
lf+0.076
IT+0.074
zer+0.069
achable+0.065
Neg logit contributions
ls-1.130
s-1.084
bs-1.053
oms-0.987
l-0.983
Token had
Token ID550
Feature activation+4.80e-02

Pos logit contributions
gy+0.970
penter+0.942
IT+0.911
zer+0.901
isk+0.896
Neg logit contributions
ls-1.200
bs-1.083
s-1.060
oms-1.024
ushes-0.962
Token to
Token ID284
Feature activation-4.37e-03

Pos logit contributions
gy+0.082
penter+0.078
zer+0.075
IT+0.074
isk+0.074
Neg logit contributions
ls-0.125
bs-0.113
oms-0.109
s-0.109
ushes-0.107
Token close
Token ID1969
Feature activation-1.62e-01

Pos logit contributions
ls+0.010
bs+0.009
oms+0.009
ushes+0.008
s+0.008
Neg logit contributions
gy-0.008
penter-0.008
IT-0.008
zer-0.008
itor-0.008
Token his
Token ID465
Feature activation+5.55e-02

Pos logit contributions
ls+0.395
bs+0.352
oms+0.339
s+0.339
ushes+0.331
Neg logit contributions
gy-0.306
penter-0.294
IT-0.283
zer-0.282
isk-0.281
Token eyes
Token ID2951
Feature activation-7.02e-02

Pos logit contributions
gy+0.110
penter+0.103
zer+0.100
IT+0.099
isk+0.098
Neg logit contributions
ls-0.132
bs-0.118
oms-0.113
ushes-0.110
s-0.110
Token pond
Token ID16723
Feature activation+4.00e-02

Pos logit contributions
ls+0.094
bs+0.079
oms+0.079
s+0.078
ushes+0.074
Neg logit contributions
gy-0.083
penter-0.079
isk-0.078
IT-0.076
zer-0.076
Token and
Token ID290
Feature activation+6.56e-02

Pos logit contributions
IT+0.075
penter+0.073
gy+0.073
zer+0.071
isk+0.069
Neg logit contributions
ls-0.096
bs-0.085
s-0.085
oms-0.080
ushes-0.078
Token explore
Token ID7301
Feature activation+1.84e-01

Pos logit contributions
gy+0.118
penter+0.115
IT+0.113
zer+0.110
isk+0.110
Neg logit contributions
ls-0.160
bs-0.148
s-0.140
oms-0.136
ushes-0.130
Token the
Token ID262
Feature activation-4.01e-02

Pos logit contributions
gy+0.372
penter+0.353
IT+0.342
isk+0.338
zer+0.338
Neg logit contributions
ls-0.434
bs-0.382
oms-0.368
s-0.368
ushes-0.355
Token fog
Token ID19558
Feature activation-8.86e-02

Pos logit contributions
ls+0.111
bs+0.103
oms+0.097
ushes+0.097
s+0.096
Neg logit contributions
gy-0.062
penter-0.058
IT-0.056
zer-0.056
isk-0.054
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
ls+0.193
s+0.174
oms+0.164
bs+0.163
ushes+0.145
Neg logit contributions
IT-0.186
zer-0.181
penter-0.179
isk-0.177
gy-0.170
Token world
Token ID995
Feature activation+8.50e-02

Pos logit contributions
penter+0.896
gy+0.888
zer+0.871
IT+0.861
isk+0.854
Neg logit contributions
ls-1.254
s-1.130
bs-1.089
oms-1.070
ushes-0.997
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
gy+0.167
penter+0.159
IT+0.153
zer+0.153
isk+0.151
Neg logit contributions
ls-0.201
bs-0.179
oms-0.172
s-0.169
ushes-0.169
Token
Token ID198
Feature activation-2.96e-02

Pos logit contributions
lf+0.033
gy+0.032
eman+0.031
zer+0.031
penter+0.030
Neg logit contributions
ls-0.028
bs-0.028
s-0.027
ushes-0.026
oms-0.025
Token
Token ID198
Feature activation-3.04e-02

Pos logit contributions
ls+0.068
bs+0.060
s+0.058
oms+0.057
ushes+0.056
Neg logit contributions
gy-0.060
penter-0.058
IT-0.056
zer-0.055
isk-0.055
TokenHe
Token ID1544
Feature activation-5.88e-02

Pos logit contributions
ls+0.064
s+0.060
bs+0.060
l+0.055
ushes+0.054
Neg logit contributions
gy-0.063
uding-0.057
IT-0.056
itor-0.055
penter-0.055
Token was
Token ID373
Feature activation+7.46e-02

Pos logit contributions
ls+0.083
s+0.075
bs+0.071
oms+0.069
ushes+0.067
Neg logit contributions
gy-0.065
penter-0.065
IT-0.061
isk-0.061
zer-0.061
Token a
Token ID257
Feature activation+1.07e-01

Pos logit contributions
gy+0.151
isk+0.147
penter+0.146
uding+0.141
IT+0.139
Neg logit contributions
ls-0.165
bs-0.148
s-0.142
ushes-0.138
oms-0.132
Token cute
Token ID13779
Feature activation-1.79e-01

Pos logit contributions
gy+0.231
penter+0.217
IT+0.216
isk+0.212
uding+0.211
Neg logit contributions
ls-0.236
bs-0.208
s-0.199
oms-0.196
ushes-0.188
Token little
Token ID1310
Feature activation+7.03e-03

Pos logit contributions
ls+0.340
s+0.303
bs+0.284
oms+0.269
l+0.235
Neg logit contributions
penter-0.398
IT-0.395
uding-0.394
gy-0.391
isk-0.380
Token pig
Token ID12967
Feature activation+1.33e-01

Pos logit contributions
gy+0.014
penter+0.013
zer+0.012
IT+0.012
isk+0.012
Neg logit contributions
ls-0.017
bs-0.015
oms-0.015
ushes-0.014
s-0.014
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
IT+0.238
achable+0.213
itor+0.203
ulum+0.200
 coaster+0.200
Neg logit contributions
s-0.347
ls-0.321
bs-0.302
l-0.266
able-0.251
Token named
Token ID3706
Feature activation+3.32e-01

Pos logit contributions
IT+1.011
isk+0.952
achable+0.946
itor+0.945
ipment+0.924
Neg logit contributions
s-1.060
ls-1.028
bs-0.915
l-0.830
oms-0.744
Token Pink
Token ID14657
Feature activation+6.40e-02

Pos logit contributions
penter+0.511
gy+0.502
IT+0.498
zer+0.487
uding+0.479
Neg logit contributions
ls-0.907
bs-0.820
s-0.808
oms-0.758
ushes-0.749
Tokeny
Token ID88
Feature activation-1.83e-01

Pos logit contributions
gy+0.157
penter+0.157
IT+0.153
zer+0.152
isk+0.147
Neg logit contributions
ls-0.115
bs-0.099
s-0.096
oms-0.091
ushes-0.091
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
ls+0.330
s+0.314
bs+0.271
l+0.252
oms+0.224
Neg logit contributions
IT-0.474
uding-0.433
penter-0.429
zer-0.421
isk-0.420
Token Pink
Token ID14657
Feature activation+6.39e-02

Pos logit contributions
gy+0.023
penter+0.021
IT+0.020
zer+0.020
isk+0.020
Neg logit contributions
ls-0.043
bs-0.039
oms-0.038
ushes-0.037
s-0.037
Token.
Token ID13
Feature activation+1.61e-02

Pos logit contributions
ls+0.591
s+0.513
bs+0.491
oms+0.473
ushes+0.455
Neg logit contributions
IT-0.552
zer-0.546
penter-0.545
gy-0.540
isk-0.528
Token
Token ID220
Feature activation-1.43e-01

Pos logit contributions
gy+0.033
penter+0.033
IT+0.032
zer+0.032
isk+0.031
Neg logit contributions
ls-0.036
s-0.031
bs-0.031
oms-0.029
ushes-0.028
Token
Token ID198
Feature activation-2.99e-02

Pos logit contributions
ls+0.366
bs+0.328
oms+0.319
ushes+0.303
s+0.302
Neg logit contributions
gy-0.248
penter-0.231
isk-0.225
IT-0.220
zer-0.219
Token
Token ID198
Feature activation-3.01e-02

Pos logit contributions
ls+0.070
bs+0.062
oms+0.059
s+0.059
ushes+0.058
Neg logit contributions
gy-0.060
penter-0.057
IT-0.055
zer-0.055
isk-0.054
TokenBug
Token ID25624
Feature activation+1.83e-01

Pos logit contributions
ls+0.079
s+0.073
bs+0.070
l+0.063
oms+0.063
Neg logit contributions
gy-0.049
itor-0.047
IT-0.046
penter-0.046
uding-0.045
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
penter+0.025
IT+0.023
gy+0.011
zer+0.010
lf+0.008
Neg logit contributions
ls-0.751
s-0.702
bs-0.701
ushes-0.674
oms-0.669
Token saw
Token ID2497
Feature activation-3.28e-02

Pos logit contributions
IT+1.014
uding+1.002
penter+0.996
gy+0.992
achable+0.991
Neg logit contributions
ls-1.083
s-1.005
bs-0.977
oms-0.850
l-0.823
Token a
Token ID257
Feature activation+1.05e-01

Pos logit contributions
ls+0.086
s+0.075
bs+0.073
l+0.067
oms+0.066
Neg logit contributions
IT-0.055
penter-0.055
gy-0.055
uding-0.054
itor-0.054
Token big
Token ID1263
Feature activation-2.49e-01

Pos logit contributions
gy+0.238
penter+0.228
IT+0.221
zer+0.221
isk+0.218
Neg logit contributions
ls-0.216
bs-0.189
oms-0.180
s-0.178
ushes-0.175
Token leaf
Token ID12835
Feature activation+1.30e-02

Pos logit contributions
ls+0.597
oms+0.514
bs+0.511
s+0.495
ushes+0.467
Neg logit contributions
gy-0.474
IT-0.454
penter-0.451
isk-0.447
zer-0.438
Token and
Token ID290
Feature activation+6.48e-02

Pos logit contributions
IT+0.028
gy+0.028
isk+0.027
itor+0.027
penter+0.027
Neg logit contributions
ls-0.028
s-0.024
oms-0.024
bs-0.023
l-0.020
Token and
Token ID290
Feature activation+6.88e-02

Pos logit contributions
gy+0.285
penter+0.267
IT+0.262
zer+0.261
isk+0.258
Neg logit contributions
ls-0.406
bs-0.368
s-0.360
oms-0.353
ushes-0.340
Token they
Token ID484
Feature activation-1.41e-01

Pos logit contributions
gy+0.132
IT+0.129
penter+0.129
isk+0.126
zer+0.122
Neg logit contributions
ls-0.161
bs-0.145
s-0.142
oms-0.130
ushes-0.128
Token were
Token ID547
Feature activation-1.47e-01

Pos logit contributions
ls+0.343
bs+0.311
s+0.296
oms+0.293
ushes+0.287
Neg logit contributions
gy-0.265
penter-0.252
IT-0.245
zer-0.244
isk-0.240
Token very
Token ID845
Feature activation-3.74e-02

Pos logit contributions
ls+0.379
bs+0.344
s+0.330
oms+0.327
ushes+0.326
Neg logit contributions
gy-0.250
IT-0.238
penter-0.234
zer-0.229
isk-0.227
Token bag
Token ID6131
Feature activation-1.63e-01

Pos logit contributions
ls+0.080
oms+0.072
ushes+0.071
bs+0.070
s+0.067
Neg logit contributions
gy-0.073
penter-0.069
 driving-0.068
 never-0.066
 accident-0.065
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
ls+0.376
bs+0.331
s+0.322
oms+0.319
ushes+0.310
Neg logit contributions
gy-0.326
penter-0.312
IT-0.312
zer-0.307
isk-0.300
Token and
Token ID290
Feature activation+6.87e-02

Pos logit contributions
gy+0.910
penter+0.890
IT+0.885
zer+0.878
isk+0.855
Neg logit contributions
ls-1.246
s-1.101
bs-1.092
oms-1.058
ushes-1.012
Token floppy
Token ID49415
Feature activation-1.03e-01

Pos logit contributions
gy+0.130
IT+0.125
penter+0.125
isk+0.121
zer+0.121
Neg logit contributions
ls-0.165
bs-0.148
s-0.144
oms-0.137
ushes-0.135
Token so
Token ID523
Feature activation+1.22e-01

Pos logit contributions
ls+0.240
s+0.224
bs+0.217
oms+0.199
ushes+0.186
Neg logit contributions
isk-0.192
IT-0.192
penter-0.190
lf-0.187
zer-0.178
Token it
Token ID340
Feature activation-5.35e-02

Pos logit contributions
gy+0.229
penter+0.219
zer+0.208
IT+0.205
isk+0.203
Neg logit contributions
ls-0.293
bs-0.263
oms-0.259
ushes-0.252
s-0.247
Token was
Token ID373
Feature activation+7.30e-02

Pos logit contributions
ls+0.138
bs+0.122
s+0.121
oms+0.115
ushes+0.113
Neg logit contributions
gy-0.092
IT-0.087
penter-0.087
zer-0.086
isk-0.084
Token.
Token ID13
Feature activation+1.41e-02

Pos logit contributions
gy+0.022
penter+0.022
IT+0.021
isk+0.021
zer+0.021
Neg logit contributions
ls-0.029
bs-0.025
s-0.025
oms-0.024
ushes-0.023
Token One
Token ID1881
Feature activation+1.62e-01

Pos logit contributions
gy+0.030
penter+0.030
IT+0.029
zer+0.029
uding+0.028
Neg logit contributions
ls-0.030
s-0.026
bs-0.026
oms-0.025
ushes-0.024
Token day
Token ID1110
Feature activation+7.09e-02

Pos logit contributions
gy+0.306
penter+0.303
IT+0.300
uding+0.291
zer+0.291
Neg logit contributions
ls-0.383
bs-0.353
s-0.342
oms-0.321
ushes-0.309
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
gy+0.119
penter+0.107
zer+0.105
IT+0.102
isk+0.099
Neg logit contributions
ls-0.185
ushes-0.169
bs-0.168
oms-0.166
s-0.157
Token Bug
Token ID15217
Feature activation+2.98e-01

Pos logit contributions
ls+0.042
bs+0.038
s+0.038
oms+0.035
ushes+0.034
Neg logit contributions
gy-0.017
penter-0.017
IT-0.016
isk-0.016
itor-0.016
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
penter+0.107
IT+0.100
lf+0.084
zer+0.072
itor+0.070
Neg logit contributions
ls-1.165
s-1.096
bs-1.083
ushes-1.028
oms-1.015
Token got
Token ID1392
Feature activation-8.05e-02

Pos logit contributions
penter+1.017
gy+1.001
IT+0.996
zer+0.990
achable+0.989
Neg logit contributions
ls-1.105
s-1.012
bs-0.980
oms-0.868
l-0.826
Token stuck
Token ID7819
Feature activation+1.43e-01

Pos logit contributions
ls+0.215
bs+0.193
oms+0.191
ushes+0.187
s+0.185
Neg logit contributions
gy-0.137
penter-0.126
zer-0.120
IT-0.118
isk-0.117
Token in
Token ID287
Feature activation+1.01e-02

Pos logit contributions
penter+0.305
IT+0.302
isk+0.300
gy+0.298
zer+0.298
Neg logit contributions
ls-0.302
s-0.257
bs-0.253
oms-0.249
ushes-0.234
Token the
Token ID262
Feature activation-3.99e-02

Pos logit contributions
penter+0.022
itor+0.022
IT+0.022
gy+0.021
achable+0.021
Neg logit contributions
ls-0.020
s-0.017
bs-0.016
ushes-0.016
oms-0.015
Token rain
Token ID6290
Feature activation+8.85e-02

Pos logit contributions
ls+0.093
bs+0.082
oms+0.080
s+0.079
ushes+0.076
Neg logit contributions
gy-0.076
penter-0.072
isk-0.072
uding-0.070
zer-0.069
Token that
Token ID326
Feature activation+1.54e-01

Pos logit contributions
ls+0.799
bs+0.681
s+0.665
oms+0.663
ushes+0.642
Neg logit contributions
penter-0.549
zer-0.542
gy-0.540
 accident-0.530
IT-0.518
Token and
Token ID290
Feature activation+6.38e-02

Pos logit contributions
gy+0.261
penter+0.250
IT+0.239
zer+0.237
isk+0.228
Neg logit contributions
ls-0.404
bs-0.355
oms-0.354
ushes-0.340
s-0.335
Token soon
Token ID2582
Feature activation+1.62e-01

Pos logit contributions
gy+0.102
penter+0.094
IT+0.090
zer+0.090
isk+0.090
Neg logit contributions
ls-0.176
bs-0.161
s-0.155
oms-0.153
ushes-0.152
Token a
Token ID257
Feature activation+1.05e-01

Pos logit contributions
gy+0.253
penter+0.227
zer+0.222
 accident+0.221
lf+0.221
Neg logit contributions
ls-0.416
oms-0.377
bs-0.374
ushes-0.371
s-0.354
Token fog
Token ID19558
Feature activation-9.11e-02

Pos logit contributions
gy+0.211
penter+0.201
zer+0.194
IT+0.194
isk+0.191
Neg logit contributions
ls-0.242
bs-0.215
oms-0.206
s-0.204
ushes-0.202
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
ls+0.210
s+0.182
bs+0.178
oms+0.177
ushes+0.163
Neg logit contributions
penter-0.174
gy-0.170
IT-0.169
isk-0.168
zer-0.167
Token mist
Token ID4020
Feature activation-4.13e-02

Pos logit contributions
gy+0.899
penter+0.869
isk+0.853
IT+0.849
zer+0.836
Neg logit contributions
ls-1.235
bs-1.111
s-1.103
oms-1.063
ushes-0.992
Token filled
Token ID5901
Feature activation+4.30e-02

Pos logit contributions
ls+0.098
bs+0.087
oms+0.084
s+0.083
ushes+0.082
Neg logit contributions
gy-0.081
penter-0.077
zer-0.074
IT-0.074
isk-0.073
Token the
Token ID262
Feature activation-3.81e-02

Pos logit contributions
gy+0.085
penter+0.083
zer+0.079
IT+0.078
isk+0.078
Neg logit contributions
ls-0.100
bs-0.088
s-0.085
ushes-0.084
oms-0.083
Token room
Token ID2119
Feature activation+1.76e-01

Pos logit contributions
ls+0.092
bs+0.083
oms+0.079
s+0.078
ushes+0.078
Neg logit contributions
gy-0.074
penter-0.069
IT-0.068
zer-0.067
uding-0.064
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
gy+0.296
penter+0.279
zer+0.264
IT+0.263
isk+0.257
Neg logit contributions
ls-0.465
bs-0.420
oms-0.404
ushes-0.396
s-0.395
Token day
Token ID1110
Feature activation+7.10e-02

Pos logit contributions
gy+0.304
penter+0.300
IT+0.296
uding+0.290
zer+0.288
Neg logit contributions
ls-0.397
bs-0.371
s-0.360
oms-0.334
ushes-0.331
Token,
Token ID11
Feature activation-9.89e-03

Pos logit contributions
gy+0.123
penter+0.112
zer+0.109
IT+0.106
isk+0.105
Neg logit contributions
ls-0.183
bs-0.165
oms-0.162
ushes-0.162
s-0.154
Token Kitty
Token ID21168
Feature activation+2.18e-01

Pos logit contributions
ls+0.027
bs+0.025
s+0.024
oms+0.023
ushes+0.023
Neg logit contributions
gy-0.015
penter-0.015
IT-0.015
zer-0.015
isk-0.014
Token saw
Token ID2497
Feature activation-3.17e-02

Pos logit contributions
penter+0.418
IT+0.409
achable+0.403
gy+0.403
zer+0.403
Neg logit contributions
ls-0.524
s-0.471
bs-0.455
oms-0.417
ushes-0.381
Token Dog
Token ID8532
Feature activation+2.92e-01

Pos logit contributions
ls+0.072
bs+0.063
s+0.062
ushes+0.056
oms+0.055
Neg logit contributions
uding-0.062
gy-0.062
penter-0.061
IT-0.059
uries-0.059
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
lf+0.200
IT+0.182
achable+0.175
uding+0.163
penter+0.159
Neg logit contributions
s-1.017
ls-0.975
bs-0.912
l-0.883
oms-0.844
Token eating
Token ID6600
Feature activation+1.70e-01

Pos logit contributions
penter+0.901
gy+0.867
itor+0.864
zer+0.852
eman+0.834
Neg logit contributions
ls-1.160
s-1.094
bs-1.049
oms-0.976
l-0.908
Token a
Token ID257
Feature activation+1.06e-01

Pos logit contributions
gy+0.318
penter+0.306
IT+0.289
isk+0.283
uding+0.283
Neg logit contributions
ls-0.423
bs-0.373
s-0.361
oms-0.351
ushes-0.347
Token big
Token ID1263
Feature activation-2.48e-01

Pos logit contributions
gy+0.214
penter+0.202
zer+0.197
IT+0.194
isk+0.191
Neg logit contributions
ls-0.243
bs-0.219
oms-0.211
ushes-0.207
s-0.203
Token bowl
Token ID9396
Feature activation-9.31e-02

Pos logit contributions
ls+0.585
bs+0.509
s+0.504
oms+0.486
ushes+0.462
Neg logit contributions
gy-0.476
penter-0.466
IT-0.456
isk-0.450
uding-0.446
Token of
Token ID286
Feature activation-3.13e-01

Pos logit contributions
ls+0.166
bs+0.145
s+0.141
oms+0.135
ushes+0.128
Neg logit contributions
gy-0.235
IT-0.230
penter-0.225
zer-0.223
isk-0.220
Token a
Token ID257
Feature activation+1.06e-01

Pos logit contributions
gy+0.080
penter+0.079
IT+0.075
isk+0.074
zer+0.073
Neg logit contributions
ls-0.124
bs-0.114
s-0.108
ushes-0.107
oms-0.105
Token best
Token ID1266
Feature activation-4.30e-02

Pos logit contributions
gy+0.267
penter+0.263
IT+0.258
isk+0.256
zer+0.254
Neg logit contributions
ls-0.188
bs-0.163
s-0.153
ushes-0.145
oms-0.144
Token friend
Token ID1545
Feature activation+1.77e-01

Pos logit contributions
ls+0.103
bs+0.098
s+0.093
ushes+0.089
oms+0.083
Neg logit contributions
IT-0.078
penter-0.078
gy-0.075
uding-0.073
isk-0.072
Token named
Token ID3706
Feature activation+3.33e-01

Pos logit contributions
IT+0.274
achable+0.254
uding+0.249
penter+0.249
 doorstep+0.242
Neg logit contributions
ls-0.478
bs-0.459
s-0.456
oms-0.415
ushes-0.397
Token Dog
Token ID8532
Feature activation+2.91e-01

Pos logit contributions
penter+0.575
IT+0.557
uding+0.556
zer+0.555
eman+0.550
Neg logit contributions
ls-0.794
bs-0.776
s-0.724
ushes-0.676
oms-0.633
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
IT+0.383
lf+0.357
achable+0.341
ulum+0.338
uding+0.334
Neg logit contributions
s-0.829
ls-0.797
bs-0.758
l-0.729
ushes-0.652
Token.
Token ID13
Feature activation+1.97e-02

Pos logit contributions
lf+1.187
IT+1.183
penter+1.152
zer+1.147
uding+1.145
Neg logit contributions
s-0.759
ls-0.748
bs-0.686
oms-0.587
l-0.579
Token They
Token ID1119
Feature activation-9.94e-02

Pos logit contributions
gy+0.029
penter+0.028
zer+0.027
IT+0.027
uding+0.026
Neg logit contributions
ls-0.055
bs-0.050
s-0.049
oms-0.048
ushes-0.048
Token liked
Token ID8288
Feature activation+9.39e-02

Pos logit contributions
ls+0.256
bs+0.240
s+0.229
oms+0.218
ushes+0.214
Neg logit contributions
gy-0.165
penter-0.161
IT-0.158
zer-0.156
isk-0.151
Token to
Token ID284
Feature activation-6.71e-03

Pos logit contributions
penter+0.227
gy+0.225
IT+0.216
isk+0.212
zer+0.210
Neg logit contributions
ls-0.176
bs-0.156
s-0.154
ushes-0.134
oms-0.132
Token play
Token ID711
Feature activation+1.33e-01

Pos logit contributions
ls+0.015
bs+0.015
s+0.013
oms+0.013
ushes+0.012
Neg logit contributions
penter-0.012
uding-0.012
IT-0.012
itor-0.012
gy-0.012
Token to
Token ID284
Feature activation-5.69e-03

Pos logit contributions
gy+0.254
penter+0.242
IT+0.233
zer+0.233
isk+0.230
Neg logit contributions
ls-0.339
bs-0.304
oms-0.291
s-0.291
ushes-0.286
Token her
Token ID607
Feature activation-1.00e-01

Pos logit contributions
ls+0.010
bs+0.009
ushes+0.008
s+0.008
oms+0.007
Neg logit contributions
gy-0.014
penter-0.014
zer-0.013
uding-0.013
ologist-0.013
Token friend
Token ID1545
Feature activation+1.77e-01

Pos logit contributions
ls+0.232
s+0.203
bs+0.202
ushes+0.182
oms+0.169
Neg logit contributions
IT-0.207
gy-0.199
penter-0.197
uding-0.195
zer-0.195
Token,
Token ID11
Feature activation-1.16e-02

Pos logit contributions
IT+0.391
penter+0.384
gy+0.382
zer+0.377
uding+0.374
Neg logit contributions
ls-0.360
bs-0.325
s-0.324
oms-0.298
ushes-0.263
Token Dog
Token ID8532
Feature activation+2.90e-01

Pos logit contributions
ls+0.024
bs+0.022
s+0.022
oms+0.020
ushes+0.019
Neg logit contributions
penter-0.022
lf-0.022
uding-0.022
IT-0.022
zer-0.022
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
IT+0.237
lf+0.196
astical+0.194
achable+0.187
ulum+0.176
Neg logit contributions
ls-0.952
s-0.945
bs-0.873
l-0.865
oms-0.822
Token.
Token ID13
Feature activation+1.77e-02

Pos logit contributions
IT+1.117
penter+1.090
zer+1.061
Wh+1.057
lf+1.055
Neg logit contributions
ls-0.942
s-0.862
bs-0.833
oms-0.792
l-0.716
Token Dog
Token ID8532
Feature activation+2.90e-01

Pos logit contributions
gy+0.020
penter+0.019
zer+0.019
IT+0.019
isk+0.019
Neg logit contributions
ls-0.054
bs-0.050
s-0.050
oms-0.047
ushes-0.047
Tokengy
Token ID1360
Feature activation+5.02e-01

Pos logit contributions
IT+0.095
uding+0.085
astical+0.083
lf+0.080
achable+0.080
Neg logit contributions
ls-1.098
s-1.054
bs-1.004
l-0.967
oms-0.949
Token said
Token ID531
Feature activation-1.75e-01

Pos logit contributions
gy+1.093
IT+1.076
uding+1.068
penter+1.059
zer+1.054
Neg logit contributions
ls-1.017
s-0.917
bs-0.874
oms-0.802
l-0.761
Token,
Token ID11
Feature activation-1.17e-02

Pos logit contributions
ls+0.494
ushes+0.478
bs+0.465
oms+0.460
s+0.426
Neg logit contributions
gy-0.248
penter-0.221
zer-0.213
 driving-0.209
lf-0.201
TokenFinally
Token ID11158
Feature activation-6.68e-03

Pos logit contributions
ls+0.078
bs+0.073
s+0.070
ushes+0.068
oms+0.067
Neg logit contributions
gy-0.054
penter-0.050
zer-0.047
IT-0.046
isk-0.046
Token,
Token ID11
Feature activation-1.79e-02

Pos logit contributions
ls+0.016
bs+0.014
oms+0.013
s+0.013
ushes+0.012
Neg logit contributions
gy-0.014
penter-0.013
IT-0.013
isk-0.013
zer-0.013
Token everything
Token ID2279
Feature activation-1.03e-01

Pos logit contributions
ls+0.037
bs+0.034
s+0.031
oms+0.029
ushes+0.028
Neg logit contributions
gy-0.040
penter-0.040
isk-0.039
zer-0.038
IT-0.038
Token turned
Token ID2900
Feature activation-7.81e-02

Pos logit contributions
ls+0.258
ushes+0.247
bs+0.243
oms+0.233
s+0.222
Neg logit contributions
gy-0.160
penter-0.158
lf-0.146
 accident-0.146
 never-0.145
Token fog
Token ID19558
Feature activation-9.24e-02

Pos logit contributions
ls+0.172
bs+0.160
ushes+0.156
oms+0.151
s+0.135
Neg logit contributions
gy-0.159
penter-0.145
 driving-0.138
 because-0.135
IT-0.133
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
ls+0.192
bs+0.163
s+0.162
oms+0.158
ushes+0.151
Neg logit contributions
gy-0.202
penter-0.202
zer-0.198
IT-0.196
isk-0.195
Token and
Token ID290
Feature activation+6.34e-02

Pos logit contributions
gy+0.875
penter+0.817
IT+0.783
zer+0.778
isk+0.767
Neg logit contributions
ls-1.296
bs-1.172
oms-1.128
ushes-1.116
s-1.104
Token a
Token ID257
Feature activation+1.03e-01

Pos logit contributions
gy+0.124
penter+0.119
zer+0.115
isk+0.113
IT+0.113
Neg logit contributions
ls-0.152
bs-0.136
oms-0.129
s-0.127
ushes-0.124
Token thick
Token ID6546
Feature activation-1.81e-01

Pos logit contributions
gy+0.217
penter+0.207
IT+0.200
zer+0.200
isk+0.198
Neg logit contributions
ls-0.229
bs-0.203
oms-0.194
s-0.192
ushes-0.189
Token mist
Token ID4020
Feature activation-4.26e-02

Pos logit contributions
ls+0.415
oms+0.373
bs+0.365
s+0.352
ushes+0.320
Neg logit contributions
gy-0.345
isk-0.340
penter-0.329
IT-0.329
uding-0.324
Token began
Token ID2540
Feature activation-1.36e-01

Pos logit contributions
ls+0.103
bs+0.092
oms+0.089
s+0.088
ushes+0.087
Neg logit contributions
gy-0.082
penter-0.077
zer-0.074
IT-0.074
isk-0.073
Token.
Token ID13
Feature activation+2.19e-02

Pos logit contributions
IT+0.496
zer+0.475
penter+0.472
gy+0.469
itor+0.468
Neg logit contributions
ls-0.405
s-0.372
bs-0.360
oms-0.312
ushes-0.307
Token One
Token ID1881
Feature activation+1.66e-01

Pos logit contributions
gy+0.045
isk+0.044
IT+0.044
zer+0.044
itor+0.043
Neg logit contributions
ls-0.046
s-0.044
bs-0.043
oms-0.039
ushes-0.038
Token day
Token ID1110
Feature activation+7.20e-02

Pos logit contributions
gy+0.326
IT+0.325
penter+0.322
uding+0.321
zer+0.310
Neg logit contributions
ls-0.369
bs-0.350
s-0.341
oms-0.303
ushes-0.301
Token,
Token ID11
Feature activation-9.74e-03

Pos logit contributions
gy+0.133
penter+0.126
zer+0.121
IT+0.121
isk+0.119
Neg logit contributions
ls-0.179
bs-0.160
oms-0.154
s-0.152
ushes-0.152
Token Big
Token ID4403
Feature activation+1.78e-01

Pos logit contributions
ls+0.025
s+0.024
bs+0.023
oms+0.020
ushes+0.020
Neg logit contributions
isk-0.017
IT-0.016
gy-0.016
uding-0.015
itor-0.015
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
lf+0.096
penter+0.073
achable+0.072
 doorstep+0.068
astical+0.068
Neg logit contributions
ls-0.653
s-0.606
bs-0.596
oms-0.586
ushes-0.571
Token saw
Token ID2497
Feature activation-2.81e-02

Pos logit contributions
achable+1.018
penter+1.008
uding+1.007
IT+1.005
zer+0.995
Neg logit contributions
ls-1.087
s-1.013
bs-0.951
oms-0.838
l-0.813
Token a
Token ID257
Feature activation+1.08e-01

Pos logit contributions
ls+0.056
s+0.047
bs+0.047
l+0.042
oms+0.039
Neg logit contributions
uding-0.062
uries-0.061
isk-0.061
gy-0.061
 doorstep-0.060
Token little
Token ID1310
Feature activation+1.08e-02

Pos logit contributions
gy+0.257
penter+0.247
IT+0.242
isk+0.240
zer+0.239
Neg logit contributions
ls-0.211
bs-0.179
s-0.174
oms-0.171
ushes-0.162
Token cat
Token ID3797
Feature activation+1.98e-01

Pos logit contributions
gy+0.020
penter+0.019
IT+0.019
zer+0.019
isk+0.018
Neg logit contributions
ls-0.027
bs-0.024
oms-0.023
s-0.023
ushes-0.022
Token sitting
Token ID5586
Feature activation+1.96e-01

Pos logit contributions
IT+0.354
penter+0.337
achable+0.336
itor+0.336
gy+0.329
Neg logit contributions
ls-0.504
s-0.464
bs-0.440
oms-0.409
l-0.373
Token a
Token ID257
Feature activation+1.08e-01

Pos logit contributions
gy+0.150
penter+0.147
isk+0.144
uding+0.142
IT+0.141
Neg logit contributions
ls-0.169
bs-0.155
s-0.144
ushes-0.144
oms-0.138
Token huge
Token ID3236
Feature activation-1.13e-01

Pos logit contributions
gy+0.248
IT+0.237
penter+0.235
uding+0.232
isk+0.231
Neg logit contributions
ls-0.221
bs-0.192
s-0.184
oms-0.178
ushes-0.169
Token dog
Token ID3290
Feature activation+6.72e-02

Pos logit contributions
ls+0.294
oms+0.252
bs+0.248
s+0.245
l+0.217
Neg logit contributions
gy-0.194
uding-0.188
IT-0.186
isk-0.183
achable-0.174
Token named
Token ID3706
Feature activation+3.33e-01

Pos logit contributions
IT+0.125
uding+0.112
bet+0.105
achable+0.105
isk+0.102
Neg logit contributions
ls-0.174
s-0.172
bs-0.160
l-0.135
oms-0.134
Token Big
Token ID4403
Feature activation+1.78e-01

Pos logit contributions
penter+0.550
eman+0.546
uding+0.533
gy+0.524
ologist+0.524
Neg logit contributions
ls-0.843
bs-0.800
s-0.786
oms-0.673
ushes-0.668
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
lf+0.260
penter+0.257
uding+0.251
 doorstep+0.249
IT+0.245
Neg logit contributions
ls-0.464
s-0.446
bs-0.424
oms-0.390
l-0.377
Token.
Token ID13
Feature activation+2.12e-02

Pos logit contributions
ode+1.225
ologist+1.198
uding+1.195
IT+1.191
 lending+1.188
Neg logit contributions
s-0.776
ls-0.671
l-0.562
bs-0.548
 says-0.396
Token Big
Token ID4403
Feature activation+1.78e-01

Pos logit contributions
gy+0.030
isk+0.029
penter+0.029
zer+0.029
IT+0.029
Neg logit contributions
ls-0.058
bs-0.055
s-0.054
oms-0.051
ushes-0.050
Tokengy
Token ID1360
Feature activation+5.02e-01

Pos logit contributions
lf+0.150
penter+0.141
 doorstep+0.133
achable+0.130
uding+0.128
Neg logit contributions
ls-0.587
s-0.552
bs-0.520
oms-0.505
ushes-0.499
Token was
Token ID373
Feature activation+7.62e-02

Pos logit contributions
uding+1.064
ologist+1.041
 lending+1.040
IT+1.039
penter+1.039
Neg logit contributions
s-0.966
ls-0.950
bs-0.840
l-0.753
led-0.705
Token very
Token ID845
Feature activation-3.88e-02

Pos logit contributions
IT+0.176
uding+0.171
eman+0.171
zer+0.169
itor+0.168
Neg logit contributions
ls-0.140
bs-0.129
s-0.118
ushes-0.110
oms-0.105
Token was
Token ID373
Feature activation+7.08e-02

Pos logit contributions
penter+0.195
 driving+0.185
IT+0.182
lf+0.182
bike+0.177
Neg logit contributions
ls-0.323
s-0.287
bs-0.286
ushes-0.264
oms-0.263
Token always
Token ID1464
Feature activation+3.81e-01

Pos logit contributions
gy+0.122
IT+0.121
penter+0.118
zer+0.118
uding+0.114
Neg logit contributions
ls-0.178
bs-0.160
s-0.153
ushes-0.150
oms-0.147
Token there
Token ID612
Feature activation-3.90e-02

Pos logit contributions
gy+0.793
penter+0.746
IT+0.720
zer+0.720
isk+0.712
Neg logit contributions
ls-0.873
bs-0.772
oms-0.747
s-0.727
ushes-0.722
Token for
Token ID329
Feature activation-5.53e-02

Pos logit contributions
ls+0.076
s+0.072
bs+0.068
oms+0.062
ushes+0.059
Neg logit contributions
penter-0.089
IT-0.089
gy-0.088
zer-0.086
isk-0.083
Token Bug
Token ID15217
Feature activation+2.98e-01

Pos logit contributions
ls+0.130
bs+0.117
s+0.109
oms+0.108
ushes+0.102
Neg logit contributions
penter-0.108
gy-0.105
IT-0.105
itor-0.101
uding-0.097
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
IT+0.140
lf+0.134
penter+0.134
achable+0.096
itor+0.090
Neg logit contributions
ls-1.111
s-1.086
bs-1.074
ushes-0.991
l-0.969
Token,
Token ID11
Feature activation-1.43e-02

Pos logit contributions
penter+0.962
gy+0.949
IT+0.937
zer+0.927
lf+0.908
Neg logit contributions
ls-1.098
bs-1.058
s-1.044
oms-0.953
l-0.854
Token no
Token ID645
Feature activation+2.47e-01

Pos logit contributions
ls+0.037
bs+0.034
s+0.033
oms+0.030
ushes+0.029
Neg logit contributions
penter-0.024
IT-0.024
gy-0.023
itor-0.023
isk-0.022
Token matter
Token ID2300
Feature activation+2.28e-01

Pos logit contributions
gy+0.350
penter+0.346
zer+0.323
IT+0.322
isk+0.316
Neg logit contributions
ls-0.710
bs-0.660
s-0.634
oms-0.624
ushes-0.609
Token what
Token ID644
Feature activation+1.17e-02

Pos logit contributions
penter+0.419
gy+0.416
zer+0.413
IT+0.409
itor+0.395
Neg logit contributions
ls-0.564
s-0.503
oms-0.497
bs-0.495
ushes-0.450
Token.
Token ID13
Feature activation+1.41e-02

Pos logit contributions
gy+0.022
penter+0.022
IT+0.021
isk+0.021
zer+0.021
Neg logit contributions
ls-0.029
bs-0.025
s-0.025
oms-0.024
ushes-0.023
Token with
Token ID351
Feature activation+5.56e-02

Pos logit contributions
gy+0.296
IT+0.280
uding+0.275
penter+0.275
achable+0.274
Neg logit contributions
ls-0.279
s-0.245
bs-0.240
oms-0.219
ushes-0.216
Token.
Token ID13
Feature activation+2.03e-02

Pos logit contributions
gy+0.117
penter+0.115
IT+0.114
isk+0.111
achable+0.111
Neg logit contributions
ls-0.124
s-0.111
bs-0.110
oms-0.097
ushes-0.092
Token\n
Token ID198
Feature activation-2.83e-02

Pos logit contributions
gy+0.042
penter+0.041
isk+0.040
zer+0.040
IT+0.040
Neg logit contributions
ls-0.045
s-0.040
bs-0.040
oms-0.037
ushes-0.036
Token\n
Token ID198
Feature activation-2.82e-02

Pos logit contributions
ls+0.065
bs+0.058
oms+0.056
s+0.056
ushes+0.054
Neg logit contributions
gy-0.057
penter-0.054
IT-0.053
zer-0.052
isk-0.052
TokenBig
Token ID12804
Feature activation+1.24e-01

Pos logit contributions
ls+0.071
s+0.067
bs+0.066
oms+0.058
ushes+0.057
Neg logit contributions
gy-0.048
isk-0.047
itor-0.045
IT-0.045
uding-0.044
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
uding+0.079
achable+0.076
lf+0.075
IT+0.072
astical+0.070
Neg logit contributions
ls-0.434
s-0.400
oms-0.384
bs-0.384
l-0.376
Token wanted
Token ID2227
Feature activation+8.95e-02

Pos logit contributions
uding+1.064
IT+1.056
achable+1.051
zer+1.030
penter+1.027
Neg logit contributions
ls-1.009
s-0.969
bs-0.900
l-0.774
oms-0.752
Token to
Token ID284
Feature activation-3.22e-03

Pos logit contributions
gy+0.220
IT+0.215
penter+0.213
uding+0.211
isk+0.210
Neg logit contributions
ls-0.162
bs-0.145
s-0.143
oms-0.122
ushes-0.118
Token help
Token ID1037
Feature activation+3.92e-02

Pos logit contributions
ls+0.007
bs+0.006
s+0.006
oms+0.006
ushes+0.006
Neg logit contributions
gy-0.006
penter-0.006
zer-0.006
uding-0.006
IT-0.006
Token the
Token ID262
Feature activation-3.42e-02

Pos logit contributions
gy+0.083
IT+0.081
penter+0.081
isk+0.080
zer+0.078
Neg logit contributions
ls-0.083
bs-0.074
s-0.074
oms-0.067
ushes-0.064
Token cat
Token ID3797
Feature activation+1.97e-01

Pos logit contributions
ls+0.093
s+0.084
bs+0.083
oms+0.079
ushes+0.077
Neg logit contributions
gy-0.055
penter-0.053
uding-0.051
IT-0.051
zer-0.050
Token coins
Token ID10796
Feature activation-1.66e-01

Pos logit contributions
ls+0.090
bs+0.085
oms+0.081
ushes+0.078
s+0.076
Neg logit contributions
gy-0.078
penter-0.073
zer-0.071
 driving-0.068
isk-0.067
Token back
Token ID736
Feature activation-9.22e-02

Pos logit contributions
ls+0.392
bs+0.362
oms+0.348
ushes+0.348
s+0.330
Neg logit contributions
gy-0.315
penter-0.295
zer-0.279
 driving-0.273
IT-0.264
Token in
Token ID287
Feature activation+1.24e-02

Pos logit contributions
ls+0.205
bs+0.181
oms+0.173
s+0.172
ushes+0.169
Neg logit contributions
gy-0.194
penter-0.186
zer-0.180
IT-0.180
isk-0.178
Token his
Token ID465
Feature activation+5.32e-02

Pos logit contributions
gy+0.025
penter+0.024
zer+0.024
IT+0.024
isk+0.023
Neg logit contributions
ls-0.027
bs-0.024
ushes-0.024
s-0.024
oms-0.023
Token pig
Token ID12967
Feature activation+1.33e-01

Pos logit contributions
gy+0.107
penter+0.102
zer+0.099
IT+0.098
isk+0.097
Neg logit contributions
ls-0.125
bs-0.111
oms-0.106
s-0.105
ushes-0.104
Tokengy
Token ID1360
Feature activation+5.03e-01

Pos logit contributions
achable+0.053
IT+0.050
penter+0.039
uding+0.034
lf+0.031
Neg logit contributions
ls-0.505
s-0.498
bs-0.476
ushes-0.463
oms-0.439
Token bank
Token ID3331
Feature activation+1.83e-01

Pos logit contributions
achable+0.707
IT+0.707
penter+0.699
uding+0.688
gy+0.676
Neg logit contributions
ls-1.388
s-1.300
bs-1.246
oms-1.164
ushes-1.146
Token and
Token ID290
Feature activation+6.37e-02

Pos logit contributions
gy+0.308
penter+0.291
zer+0.277
IT+0.273
isk+0.273
Neg logit contributions
ls-0.484
bs-0.438
oms-0.421
ushes-0.412
s-0.410
Token covered
Token ID5017
Feature activation+8.82e-02

Pos logit contributions
gy+0.123
IT+0.118
penter+0.116
zer+0.115
isk+0.115
Neg logit contributions
ls-0.150
bs-0.132
s-0.128
ushes-0.122
oms-0.121
Token it
Token ID340
Feature activation-5.74e-02

Pos logit contributions
gy+0.189
penter+0.183
IT+0.177
isk+0.176
zer+0.174
Neg logit contributions
ls-0.195
bs-0.163
s-0.162
ushes-0.157
oms-0.156
Token with
Token ID351
Feature activation+5.41e-02

Pos logit contributions
ls+0.149
bs+0.139
oms+0.135
ushes+0.129
s+0.127
Neg logit contributions
gy-0.098
penter-0.091
zer-0.086
IT-0.085
isk-0.083
Token will
Token ID481
Feature activation-2.59e-01

Pos logit contributions
gy+0.194
penter+0.187
zer+0.180
isk+0.179
IT+0.179
Neg logit contributions
ls-0.136
bs-0.114
oms-0.109
ushes-0.106
s-0.106
Token find
Token ID1064
Feature activation+1.50e-01

Pos logit contributions
ls+0.545
bs+0.504
s+0.474
oms+0.470
ushes+0.461
Neg logit contributions
penter-0.482
gy-0.479
isk-0.476
bike-0.470
zer-0.470
Token gold
Token ID3869
Feature activation-1.89e-02

Pos logit contributions
gy+0.233
penter+0.219
zer+0.212
IT+0.208
isk+0.200
Neg logit contributions
ls-0.404
bs-0.370
oms-0.361
s-0.354
ushes-0.353
Token or
Token ID393
Feature activation+1.13e-01

Pos logit contributions
ls+0.048
bs+0.044
oms+0.042
s+0.041
ushes+0.041
Neg logit contributions
gy-0.033
penter-0.032
zer-0.030
IT-0.030
isk-0.030
Token pear
Token ID25286
Feature activation-7.36e-02

Pos logit contributions
gy+0.230
penter+0.219
IT+0.214
zer+0.214
isk+0.213
Neg logit contributions
ls-0.263
bs-0.232
s-0.222
oms-0.221
ushes-0.215
Tokenls
Token ID7278
Feature activation-7.05e-01

Pos logit contributions
ls+0.146
bs+0.144
ushes+0.139
oms+0.136
s+0.123
Neg logit contributions
gy-0.154
penter-0.144
isk-0.138
zer-0.136
lf-0.134
Token.
Token ID13
Feature activation+1.42e-02

Pos logit contributions
ls+1.728
bs+1.541
oms+1.483
s+1.474
ushes+1.439
Neg logit contributions
gy-1.335
penter-1.267
IT-1.224
zer-1.224
isk-1.208
Token What
Token ID1867
Feature activation+1.10e-01

Pos logit contributions
gy+0.028
lf+0.027
IT+0.027
zer+0.027
isk+0.026
Neg logit contributions
ls-0.030
bs-0.029
ushes-0.028
s-0.027
oms-0.027
Token will
Token ID481
Feature activation-2.58e-01

Pos logit contributions
gy+0.222
penter+0.211
zer+0.204
IT+0.204
isk+0.201
Neg logit contributions
ls-0.254
bs-0.226
oms-0.217
s-0.214
ushes-0.212
Token you
Token ID345
Feature activation-5.36e-02

Pos logit contributions
ls+0.572
bs+0.474
s+0.468
oms+0.452
ushes+0.429
Neg logit contributions
gy-0.557
penter-0.536
isk-0.526
IT-0.523
zer-0.521
Token do
Token ID466
Feature activation+3.01e-02

Pos logit contributions
ls+0.131
bs+0.121
oms+0.114
s+0.113
ushes+0.110
Neg logit contributions
gy-0.098
penter-0.092
IT-0.090
zer-0.090
isk-0.088
Token it
Token ID340
Feature activation-5.52e-02

Pos logit contributions
ls+0.718
oms+0.673
bs+0.646
ushes+0.645
s+0.587
Neg logit contributions
 accident-0.316
 then-0.311
zer-0.305
 because-0.304
gy-0.284
Token.
Token ID13
Feature activation+1.95e-02

Pos logit contributions
ls+0.136
oms+0.122
bs+0.121
ushes+0.117
s+0.110
Neg logit contributions
gy-0.099
penter-0.091
zer-0.084
IT-0.084
isk-0.084
Token The
Token ID383
Feature activation+3.82e-02

Pos logit contributions
gy+0.039
penter+0.038
IT+0.036
zer+0.036
isk+0.035
Neg logit contributions
ls-0.045
bs-0.039
oms-0.038
s-0.037
ushes-0.037
Token dog
Token ID3290
Feature activation+6.30e-02

Pos logit contributions
gy+0.066
penter+0.062
IT+0.060
zer+0.060
isk+0.059
Neg logit contributions
ls-0.100
bs-0.090
oms-0.087
s-0.086
ushes-0.085
Token how
Token ID703
Feature activation+1.94e-01

Pos logit contributions
gy+0.132
penter+0.128
zer+0.125
isk+0.122
IT+0.121
Neg logit contributions
ls-0.124
bs-0.117
ushes-0.116
oms-0.113
s-0.098
Tokenls
Token ID7278
Feature activation-7.04e-01

Pos logit contributions
 never+0.330
gy+0.314
 always+0.303
penter+0.298
 accident+0.292
Neg logit contributions
oms-0.419
ls-0.415
ushes-0.363
bs-0.355
s-0.349
Token along
Token ID1863
Feature activation-1.78e-01

Pos logit contributions
ushes+1.547
ls+1.478
oms+1.395
bs+1.325
s+1.226
Neg logit contributions
gy-1.302
penter-1.280
IT-1.177
isk-1.160
zer-1.159
Token.
Token ID13
Feature activation+2.15e-02

Pos logit contributions
ls+0.445
bs+0.406
oms+0.392
ushes+0.382
s+0.367
Neg logit contributions
gy-0.300
penter-0.288
lf-0.279
zer-0.277
IT-0.268
Token He
Token ID679
Feature activation+1.92e-02

Pos logit contributions
gy+0.029
penter+0.029
zer+0.027
IT+0.026
isk+0.025
Neg logit contributions
ls-0.061
bs-0.053
oms-0.053
s-0.051
ushes-0.050
Token likes
Token ID7832
Feature activation-1.68e-01

Pos logit contributions
gy+0.043
penter+0.042
zer+0.041
IT+0.040
isk+0.040
Neg logit contributions
ls-0.037
bs-0.034
ushes-0.032
oms-0.032
s-0.030
Token the
Token ID262
Feature activation-3.83e-02

Pos logit contributions
ls+0.392
bs+0.347
s+0.332
oms+0.327
ushes+0.321
Neg logit contributions
gy-0.345
penter-0.329
IT-0.318
zer-0.310
isk-0.309
Token tent
Token ID11105
Feature activation-1.07e-01

Pos logit contributions
ls+0.086
bs+0.077
oms+0.075
s+0.073
ushes+0.072
Neg logit contributions
gy-0.059
penter-0.056
IT-0.055
isk-0.054
zer-0.053
Token.
Token ID13
Feature activation+1.83e-02

Pos logit contributions
ls+0.243
s+0.222
bs+0.218
oms+0.213
ushes+0.200
Neg logit contributions
gy-0.200
IT-0.197
zer-0.193
achable-0.191
penter-0.190
Token The
Token ID383
Feature activation+3.59e-02

Pos logit contributions
gy+0.040
penter+0.038
zer+0.037
IT+0.037
isk+0.036
Neg logit contributions
ls-0.039
bs-0.035
oms-0.033
s-0.033
ushes-0.032
Token bear
Token ID6842
Feature activation+2.37e-01

Pos logit contributions
gy+0.067
IT+0.062
penter+0.062
isk+0.060
zer+0.059
Neg logit contributions
ls-0.090
bs-0.079
oms-0.078
s-0.075
ushes-0.075
Token grow
Token ID1663
Feature activation-1.99e-02

Pos logit contributions
gy+0.543
penter+0.519
zer+0.507
isk+0.495
IT+0.490
Neg logit contributions
ls-0.464
bs-0.420
ushes-0.396
oms-0.395
s-0.376
Tokenls
Token ID7278
Feature activation-7.03e-01

Pos logit contributions
bs+0.027
ushes+0.020
oms+0.018
irk+0.017
 tant+0.017
Neg logit contributions
gy-0.048
 driving-0.044
 always-0.043
 because-0.042
ter-0.042
Token and
Token ID290
Feature activation+7.34e-02

Pos logit contributions
ls+1.932
bs+1.756
oms+1.697
ushes+1.678
s+1.669
Neg logit contributions
gy-1.112
penter-1.037
zer-0.991
IT-0.981
isk-0.968
Token hangs
Token ID28087
Feature activation-4.13e-01

Pos logit contributions
gy+0.156
penter+0.147
zer+0.142
isk+0.140
IT+0.139
Neg logit contributions
ls-0.160
bs-0.143
oms-0.140
ushes-0.138
s-0.136
Token his
Token ID465
Feature activation+5.18e-02

Pos logit contributions
ls+0.961
bs+0.901
oms+0.864
ushes+0.818
s+0.810
Neg logit contributions
gy-0.812
penter-0.727
zer-0.703
isk-0.697
 driving-0.681
Token head
Token ID1182
Feature activation+1.77e-01

Pos logit contributions
gy+0.084
penter+0.080
IT+0.077
isk+0.075
zer+0.075
Neg logit contributions
ls-0.140
bs-0.127
oms-0.122
s-0.121
ushes-0.120
Token.
Token ID13
Feature activation+2.36e-02

Pos logit contributions
gy+0.355
penter+0.331
zer+0.320
IT+0.315
isk+0.312
Neg logit contributions
ls-0.411
bs-0.364
oms-0.354
ushes-0.346
s-0.339
Token.
Token ID13
Feature activation+2.41e-02

Pos logit contributions
gy+0.348
IT+0.343
penter+0.343
isk+0.333
zer+0.329
Neg logit contributions
ls-0.318
bs-0.282
s-0.271
oms-0.255
ushes-0.242
Token It
Token ID632
Feature activation-3.40e-02

Pos logit contributions
gy+0.045
penter+0.043
zer+0.041
IT+0.041
isk+0.041
Neg logit contributions
ls-0.059
bs-0.054
oms-0.051
s-0.051
ushes-0.051
Token moves
Token ID6100
Feature activation-2.16e-01

Pos logit contributions
ls+0.066
bs+0.062
oms+0.061
ushes+0.059
s+0.054
Neg logit contributions
gy-0.075
penter-0.071
zer-0.068
IT-0.067
isk-0.067
Token and
Token ID290
Feature activation+7.40e-02

Pos logit contributions
ls+0.563
oms+0.507
bs+0.496
ushes+0.477
s+0.465
Neg logit contributions
gy-0.374
penter-0.341
zer-0.326
isk-0.316
IT-0.309
Token craw
Token ID27784
Feature activation-1.39e-01

Pos logit contributions
gy+0.163
penter+0.159
zer+0.152
IT+0.150
 driving+0.148
Neg logit contributions
ls-0.147
bs-0.129
ushes-0.128
oms-0.125
s-0.120
Tokenls
Token ID7278
Feature activation-7.02e-01

Pos logit contributions
ushes+0.048
oms+0.030
bs+0.028
ls+0.006
reci+0.003
Neg logit contributions
gy-0.462
 because-0.446
zer-0.440
penter-0.438
isk-0.427
Token on
Token ID319
Feature activation-1.93e-01

Pos logit contributions
ls+1.696
bs+1.510
oms+1.478
ushes+1.469
s+1.400
Neg logit contributions
gy-1.328
penter-1.230
zer-1.185
IT-1.178
isk-1.148
Token the
Token ID262
Feature activation-3.61e-02

Pos logit contributions
ls+0.347
s+0.293
bs+0.277
ushes+0.264
l+0.255
Neg logit contributions
penter-0.458
gy-0.457
isk-0.443
storm-0.438
bike-0.434
Token paper
Token ID3348
Feature activation-6.08e-02

Pos logit contributions
ls+0.098
bs+0.089
oms+0.086
ushes+0.086
s+0.085
Neg logit contributions
gy-0.057
penter-0.054
zer-0.053
IT-0.052
isk-0.050
Token.
Token ID13
Feature activation+2.55e-02

Pos logit contributions
ls+0.158
bs+0.142
oms+0.137
ushes+0.135
s+0.134
Neg logit contributions
gy-0.104
penter-0.098
zer-0.094
IT-0.093
isk-0.091
Token It
Token ID632
Feature activation-3.30e-02

Pos logit contributions
gy+0.043
penter+0.040
zer+0.039
IT+0.039
isk+0.038
Neg logit contributions
ls-0.067
bs-0.062
oms-0.059
s-0.059
ushes-0.058
Token to
Token ID284
Feature activation-1.81e-02

Pos logit contributions
ls+0.342
bs+0.296
s+0.292
oms+0.280
ushes+0.278
Neg logit contributions
gy-0.351
penter-0.333
isk-0.326
IT-0.324
zer-0.319
Token share
Token ID2648
Feature activation-3.85e-02

Pos logit contributions
ls+0.042
bs+0.038
s+0.037
oms+0.036
ushes+0.035
Neg logit contributions
gy-0.035
penter-0.034
zer-0.033
IT-0.032
isk-0.032
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
ls+0.095
bs+0.084
s+0.081
oms+0.081
ushes+0.080
Neg logit contributions
gy-0.073
penter-0.069
IT-0.067
zer-0.066
isk-0.065
Token He
Token ID679
Feature activation+9.35e-03

Pos logit contributions
gy+0.018
penter+0.017
zer+0.017
IT+0.017
isk+0.016
Neg logit contributions
ls-0.027
bs-0.024
oms-0.023
s-0.023
ushes-0.023
Token grow
Token ID1663
Feature activation-1.98e-02

Pos logit contributions
gy+0.022
penter+0.022
zer+0.021
IT+0.021
isk+0.021
Neg logit contributions
ls-0.017
bs-0.015
oms-0.014
ushes-0.014
s-0.013
Tokenls
Token ID7278
Feature activation-7.02e-01

Pos logit contributions
bs+0.014
ushes+0.008
oms+0.008
forts+0.006
Happy+0.005
Neg logit contributions
gy-0.057
 because-0.056
 not-0.054
 always-0.053
 accident-0.053
Token and
Token ID290
Feature activation+6.52e-02

Pos logit contributions
ls+1.922
bs+1.752
oms+1.694
ushes+1.676
s+1.654
Neg logit contributions
gy-1.110
penter-1.039
zer-0.985
IT-0.976
isk-0.962
Token shows
Token ID2523
Feature activation-1.77e-01

Pos logit contributions
gy+0.152
penter+0.147
zer+0.140
IT+0.138
isk+0.137
Neg logit contributions
ls-0.122
oms-0.109
bs-0.107
ushes-0.106
s-0.102
Token his
Token ID465
Feature activation+4.93e-02

Pos logit contributions
ls+0.459
bs+0.413
oms+0.398
s+0.394
ushes+0.390
Neg logit contributions
gy-0.310
penter-0.292
IT-0.280
zer-0.280
isk-0.276
Token teeth
Token ID9941
Feature activation-4.16e-02

Pos logit contributions
gy+0.088
penter+0.084
IT+0.081
zer+0.080
isk+0.080
Neg logit contributions
ls-0.126
bs-0.113
oms-0.108
s-0.108
ushes-0.107
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
ls+0.104
bs+0.093
oms+0.090
s+0.089
ushes+0.088
Neg logit contributions
gy-0.076
penter-0.072
IT-0.070
zer-0.069
isk-0.068
Token help
Token ID1037
Feature activation+2.58e-02

Pos logit contributions
gy+0.115
penter+0.110
isk+0.105
zer+0.104
IT+0.102
Neg logit contributions
ls-0.160
oms-0.141
bs-0.138
s-0.136
ushes-0.132
Token Tom
Token ID4186
Feature activation+1.21e-01

Pos logit contributions
gy+0.049
penter+0.046
IT+0.045
zer+0.043
achable+0.043
Neg logit contributions
ls-0.063
bs-0.058
s-0.057
oms-0.053
ushes-0.052
Token.
Token ID13
Feature activation+1.12e-02

Pos logit contributions
gy+0.214
penter+0.198
zer+0.190
isk+0.188
IT+0.187
Neg logit contributions
ls-0.314
bs-0.280
oms-0.273
ushes-0.268
s-0.266
Token She
Token ID1375
Feature activation-1.16e-01

Pos logit contributions
gy+0.023
penter+0.021
zer+0.021
IT+0.021
isk+0.020
Neg logit contributions
ls-0.026
bs-0.023
oms-0.022
s-0.022
ushes-0.022
Token craw
Token ID27784
Feature activation-1.42e-01

Pos logit contributions
ls+0.192
bs+0.172
ushes+0.157
oms+0.154
s+0.153
Neg logit contributions
gy-0.295
penter-0.288
isk-0.279
zer-0.277
IT-0.276
Tokenls
Token ID7278
Feature activation-7.02e-01

Pos logit contributions
ushes+0.060
bs+0.040
oms+0.029
ls+0.020
reci+0.013
Neg logit contributions
zer-0.461
gy-0.460
 accident-0.450
penter-0.446
itor-0.439
Token under
Token ID739
Feature activation+1.29e-02

Pos logit contributions
ls+1.520
bs+1.350
oms+1.289
ushes+1.273
s+1.249
Neg logit contributions
gy-1.524
penter-1.440
zer-1.395
IT-1.393
isk-1.368
Token the
Token ID262
Feature activation-4.03e-02

Pos logit contributions
gy+0.034
penter+0.033
isk+0.032
uding+0.032
achable+0.031
Neg logit contributions
ls-0.022
s-0.018
bs-0.018
oms-0.017
ushes-0.015
Token couch
Token ID18507
Feature activation+2.15e-01

Pos logit contributions
ls+0.129
bs+0.120
oms+0.116
ushes+0.115
s+0.113
Neg logit contributions
gy-0.045
penter-0.042
zer-0.039
IT-0.038
isk-0.037
Token and
Token ID290
Feature activation+6.58e-02

Pos logit contributions
gy+0.372
penter+0.351
zer+0.340
IT+0.340
isk+0.332
Neg logit contributions
ls-0.558
bs-0.503
s-0.491
oms-0.486
ushes-0.475
Token gets
Token ID3011
Feature activation-3.66e-01

Pos logit contributions
gy+0.146
penter+0.137
zer+0.133
IT+0.132
 driving+0.131
Neg logit contributions
ls-0.127
ushes-0.115
bs-0.113
oms-0.110
s-0.102
Token dog
Token ID3290
Feature activation+6.20e-02

Pos logit contributions
gy+0.070
penter+0.065
IT+0.064
uding+0.062
isk+0.062
Neg logit contributions
ls-0.092
bs-0.080
s-0.078
oms-0.077
ushes-0.075
Token bar
Token ID2318
Feature activation+3.92e-02

Pos logit contributions
gy+0.144
penter+0.140
zer+0.138
isk+0.135
IT+0.135
Neg logit contributions
ls-0.113
bs-0.106
ushes-0.102
oms-0.099
s-0.092
Tokenks
Token ID591
Feature activation-2.95e-01

Pos logit contributions
penter+0.078
gy+0.076
zer+0.074
itor+0.073
 accident+0.072
Neg logit contributions
ls-0.071
bs-0.066
ushes-0.064
s-0.060
oms-0.060
Token and
Token ID290
Feature activation+7.12e-02

Pos logit contributions
ls+0.882
ushes+0.841
bs+0.833
oms+0.819
s+0.765
Neg logit contributions
 because-0.278
 accident-0.277
penter-0.275
zer-0.274
gy-0.271
Token grow
Token ID1663
Feature activation-1.89e-02

Pos logit contributions
gy+0.159
penter+0.151
zer+0.149
IT+0.146
isk+0.145
Neg logit contributions
ls-0.139
bs-0.128
ushes-0.127
oms-0.121
s-0.115
Tokenls
Token ID7278
Feature activation-7.02e-01

Pos logit contributions
bs+0.017
ushes+0.012
oms+0.009
prise+0.006
forts+0.006
Neg logit contributions
gy-0.054
 because-0.051
 not-0.050
 accident-0.050
 always-0.050
Token at
Token ID379
Feature activation-1.67e-01

Pos logit contributions
ls+1.823
bs+1.662
ushes+1.604
oms+1.602
s+1.548
Neg logit contributions
gy-1.204
penter-1.118
zer-1.074
IT-1.058
isk-1.044
Token Sara
Token ID24799
Feature activation-1.22e-01

Pos logit contributions
ls+0.333
bs+0.282
s+0.266
oms+0.265
ushes+0.257
Neg logit contributions
gy-0.393
penter-0.392
zer-0.368
IT-0.368
uding-0.361
Token.
Token ID13
Feature activation+2.64e-02

Pos logit contributions
ls+0.284
s+0.257
bs+0.255
oms+0.238
ushes+0.231
Neg logit contributions
penter-0.236
gy-0.235
IT-0.231
zer-0.223
isk-0.218
Token Sara
Token ID24799
Feature activation-1.21e-01

Pos logit contributions
gy+0.043
penter+0.040
IT+0.038
zer+0.038
isk+0.038
Neg logit contributions
ls-0.071
bs-0.064
oms-0.062
s-0.061
ushes-0.060
Token feels
Token ID5300
Feature activation-3.31e-01

Pos logit contributions
ls+0.220
bs+0.199
ushes+0.188
oms+0.184
s+0.173
Neg logit contributions
gy-0.295
penter-0.280
zer-0.278
IT-0.275
isk-0.272
Token to
Token ID284
Feature activation-1.75e-02

Pos logit contributions
ls+0.398
bs+0.354
s+0.340
oms+0.333
ushes+0.325
Neg logit contributions
gy-0.372
penter-0.351
IT-0.344
isk-0.337
zer-0.337
Token play
Token ID711
Feature activation+1.24e-01

Pos logit contributions
ls+0.040
bs+0.038
s+0.036
oms+0.035
ushes+0.034
Neg logit contributions
gy-0.032
penter-0.031
zer-0.031
IT-0.030
isk-0.030
Token.
Token ID13
Feature activation+1.41e-02

Pos logit contributions
gy+0.246
IT+0.225
penter+0.224
isk+0.221
zer+0.218
Neg logit contributions
ls-0.296
bs-0.260
s-0.255
oms-0.251
ushes-0.242
Token He
Token ID679
Feature activation+1.09e-02

Pos logit contributions
gy+0.026
penter+0.024
zer+0.024
IT+0.024
isk+0.023
Neg logit contributions
ls-0.036
bs-0.032
oms-0.031
s-0.031
ushes-0.030
Token grow
Token ID1663
Feature activation-1.81e-02

Pos logit contributions
gy+0.026
penter+0.025
zer+0.024
IT+0.024
isk+0.024
Neg logit contributions
ls-0.021
bs-0.018
ushes-0.017
oms-0.017
s-0.017
Tokenls
Token ID7278
Feature activation-7.02e-01

Pos logit contributions
bs+0.020
ushes+0.013
oms+0.012
eal+0.011
forts+0.011
Neg logit contributions
 because-0.044
 when-0.043
 animals-0.043
 doctors-0.043
 always-0.043
Token and
Token ID290
Feature activation+6.72e-02

Pos logit contributions
ls+1.905
bs+1.731
oms+1.671
ushes+1.650
s+1.640
Neg logit contributions
gy-1.127
penter-1.059
zer-1.007
IT-1.001
isk-0.984
Token scratches
Token ID37627
Feature activation-1.95e-01

Pos logit contributions
gy+0.154
penter+0.148
zer+0.142
IT+0.141
isk+0.138
Neg logit contributions
ls-0.133
oms-0.116
bs-0.115
ushes-0.113
s-0.111
Token the
Token ID262
Feature activation-4.35e-02

Pos logit contributions
ls+0.508
bs+0.459
oms+0.443
s+0.435
ushes+0.435
Neg logit contributions
gy-0.337
penter-0.317
zer-0.306
IT-0.304
isk-0.299
Token toy
Token ID13373
Feature activation+6.43e-02

Pos logit contributions
ls+0.110
bs+0.098
oms+0.095
s+0.095
ushes+0.093
Neg logit contributions
gy-0.079
penter-0.074
IT-0.071
isk-0.071
zer-0.071
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
gy+0.120
penter+0.113
IT+0.110
isk+0.110
zer+0.110
Neg logit contributions
ls-0.160
bs-0.143
s-0.138
oms-0.138
ushes-0.133
Token not
Token ID407
Feature activation+3.23e-01

Pos logit contributions
ls+0.540
bs+0.499
s+0.485
oms+0.475
ushes+0.470
Neg logit contributions
gy-0.155
penter-0.133
IT-0.132
zer-0.127
uding-0.121
Token listen
Token ID6004
Feature activation+5.60e-02

Pos logit contributions
gy+0.433
penter+0.401
isk+0.381
zer+0.377
 accident+0.360
Neg logit contributions
ls-0.973
oms-0.862
ushes-0.857
s-0.842
bs-0.834
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
gy+0.104
penter+0.099
IT+0.095
zer+0.095
isk+0.094
Neg logit contributions
ls-0.139
bs-0.125
oms-0.119
s-0.119
ushes-0.117
Token He
Token ID679
Feature activation+1.54e-02

Pos logit contributions
gy+0.028
penter+0.027
zer+0.026
IT+0.026
isk+0.025
Neg logit contributions
ls-0.038
bs-0.034
oms-0.033
s-0.033
ushes-0.033
Token grow
Token ID1663
Feature activation-1.69e-02

Pos logit contributions
gy+0.038
penter+0.037
zer+0.036
IT+0.035
isk+0.035
Neg logit contributions
ls-0.029
bs-0.025
oms-0.024
s-0.023
ushes-0.023
Tokenls
Token ID7278
Feature activation-7.01e-01

Pos logit contributions
bs+0.017
ushes+0.011
oms+0.009
eal+0.007
 piping+0.007
Neg logit contributions
gy-0.045
 driving-0.044
 because-0.044
penter-0.043
 doctors-0.043
Token and
Token ID290
Feature activation+6.63e-02

Pos logit contributions
ls+1.947
bs+1.786
oms+1.724
ushes+1.714
s+1.665
Neg logit contributions
gy-1.071
penter-1.001
zer-0.944
IT-0.928
isk-0.920
Token shakes
Token ID29959
Feature activation-2.41e-01

Pos logit contributions
gy+0.139
penter+0.132
zer+0.127
isk+0.126
IT+0.125
Neg logit contributions
ls-0.147
bs-0.130
oms-0.129
ushes-0.126
s-0.124
Token his
Token ID465
Feature activation+5.53e-02

Pos logit contributions
ls+0.515
bs+0.455
s+0.453
ushes+0.421
oms+0.397
Neg logit contributions
penter-0.499
gy-0.494
IT-0.475
astical-0.466
achable-0.453
Token head
Token ID1182
Feature activation+1.81e-01

Pos logit contributions
penter+0.072
gy+0.071
IT+0.068
isk+0.065
zer+0.065
Neg logit contributions
ls-0.165
bs-0.150
s-0.146
ushes-0.146
oms-0.143
Token.
Token ID13
Feature activation+1.71e-02

Pos logit contributions
gy+0.326
penter+0.308
zer+0.295
IT+0.294
isk+0.290
Neg logit contributions
ls-0.460
bs-0.413
oms-0.399
s-0.392
ushes-0.390
Token sharp
Token ID7786
Feature activation-7.56e-02

Pos logit contributions
ls+0.622
bs+0.559
oms+0.534
s+0.533
ushes+0.529
Neg logit contributions
gy-0.451
penter-0.429
IT-0.412
zer-0.410
isk-0.407
Token teeth
Token ID9941
Feature activation-4.67e-02

Pos logit contributions
ls+0.183
bs+0.164
oms+0.158
s+0.155
ushes+0.155
Neg logit contributions
gy-0.145
penter-0.136
zer-0.133
IT-0.131
isk-0.129
Token.
Token ID13
Feature activation+2.50e-02

Pos logit contributions
ls+0.104
bs+0.092
s+0.092
oms+0.086
ushes+0.083
Neg logit contributions
gy-0.096
penter-0.095
IT-0.095
isk-0.092
zer-0.090
Token He
Token ID679
Feature activation+2.16e-02

Pos logit contributions
gy+0.049
penter+0.045
IT+0.045
zer+0.044
isk+0.043
Neg logit contributions
ls-0.060
bs-0.054
oms-0.052
s-0.052
ushes-0.051
Token grow
Token ID1663
Feature activation-2.08e-02

Pos logit contributions
gy+0.052
penter+0.050
zer+0.048
IT+0.048
isk+0.048
Neg logit contributions
ls-0.040
bs-0.036
oms-0.034
ushes-0.033
s-0.033
Tokenls
Token ID7278
Feature activation-7.00e-01

Pos logit contributions
bs+0.017
ushes+0.010
oms+0.009
eal+0.007
forts+0.007
Neg logit contributions
gy-0.061
 because-0.057
 animals-0.057
 driving-0.057
 not-0.056
Token at
Token ID379
Feature activation-1.67e-01

Pos logit contributions
ls+1.912
bs+1.743
oms+1.685
ushes+1.666
s+1.650
Neg logit contributions
gy-1.121
penter-1.045
zer-0.997
IT-0.987
isk-0.967
Token Ben
Token ID3932
Feature activation+1.10e-02

Pos logit contributions
ls+0.335
bs+0.271
s+0.264
oms+0.254
ushes+0.245
Neg logit contributions
penter-0.373
gy-0.364
achable-0.352
astical-0.351
uding-0.350
Token and
Token ID290
Feature activation+7.63e-02

Pos logit contributions
penter+0.021
IT+0.021
gy+0.021
achable+0.020
zer+0.020
Neg logit contributions
ls-0.025
s-0.023
bs-0.022
oms-0.021
ushes-0.020
Token tries
Token ID8404
Feature activation-2.33e-01

Pos logit contributions
gy+0.156
penter+0.149
zer+0.144
IT+0.142
isk+0.141
Neg logit contributions
ls-0.173
oms-0.153
bs-0.153
ushes-0.148
s-0.146
Token to
Token ID284
Feature activation-1.06e-02

Pos logit contributions
ls+0.505
s+0.446
bs+0.439
ushes+0.408
oms+0.402
Neg logit contributions
penter-0.503
gy-0.497
IT-0.487
zer-0.473
lf-0.473
Token flower
Token ID15061
Feature activation-1.92e-02

Pos logit contributions
ls+0.101
bs+0.092
oms+0.089
ushes+0.088
s+0.087
Neg logit contributions
gy-0.061
penter-0.057
zer-0.056
IT-0.056
isk-0.054
Token.
Token ID13
Feature activation+2.08e-02

Pos logit contributions
ls+0.047
bs+0.043
s+0.041
oms+0.041
ushes+0.040
Neg logit contributions
gy-0.036
penter-0.034
IT-0.033
zer-0.033
isk-0.032
Token The
Token ID383
Feature activation+3.96e-02

Pos logit contributions
gy+0.032
penter+0.031
IT+0.029
achable+0.029
zer+0.029
Neg logit contributions
ls-0.056
bs-0.049
oms-0.048
s-0.046
ushes-0.046
Token beetle
Token ID45489
Feature activation+1.37e-02

Pos logit contributions
gy+0.057
zer+0.054
IT+0.053
penter+0.052
 accident+0.051
Neg logit contributions
ls-0.107
bs-0.099
oms-0.096
ushes-0.096
s-0.094
Token craw
Token ID27784
Feature activation-1.42e-01

Pos logit contributions
gy+0.027
penter+0.027
zer+0.027
 never+0.026
 always+0.025
Neg logit contributions
ls-0.025
bs-0.025
ushes-0.024
oms-0.023
s-0.021
Tokenls
Token ID7278
Feature activation-7.00e-01

Pos logit contributions
ushes+0.070
bs+0.044
oms+0.036
reci+0.028
apy+0.027
Neg logit contributions
gy-0.415
zer-0.403
isk-0.393
 because-0.389
 accident-0.388
Token onto
Token ID4291
Feature activation-2.24e-01

Pos logit contributions
ls+1.641
bs+1.491
ushes+1.470
oms+1.428
s+1.326
Neg logit contributions
gy-1.326
penter-1.197
zer-1.169
IT-1.162
isk-1.139
Token her
Token ID607
Feature activation-9.91e-02

Pos logit contributions
ls+0.483
bs+0.420
s+0.407
ushes+0.396
oms+0.392
Neg logit contributions
gy-0.481
penter-0.460
IT-0.441
isk-0.438
zer-0.433
Token palm
Token ID18057
Feature activation+8.23e-03

Pos logit contributions
ls+0.250
bs+0.226
s+0.220
oms+0.214
ushes+0.211
Neg logit contributions
gy-0.176
penter-0.175
IT-0.166
isk-0.166
zer-0.160
Token.
Token ID13
Feature activation+2.21e-02

Pos logit contributions
gy+0.014
penter+0.013
zer+0.012
IT+0.012
isk+0.012
Neg logit contributions
ls-0.022
bs-0.020
oms-0.019
ushes-0.019
s-0.019
Token\n
Token ID198
Feature activation-3.56e-02

Pos logit contributions
penter+0.029
gy+0.028
IT+0.026
achable+0.026
isk+0.026
Neg logit contributions
ls-0.063
bs-0.054
oms-0.054
s-0.053
ushes-0.051
Token like
Token ID588
Feature activation+8.11e-02

Pos logit contributions
gy+0.463
penter+0.416
isk+0.402
zer+0.398
 accident+0.383
Neg logit contributions
ls-0.917
oms-0.827
ushes-0.827
s-0.799
bs-0.795
Token it
Token ID340
Feature activation-5.92e-02

Pos logit contributions
gy+0.166
penter+0.158
IT+0.151
zer+0.149
isk+0.147
Neg logit contributions
ls-0.191
bs-0.170
oms-0.161
s-0.161
ushes-0.156
Token.
Token ID13
Feature activation+2.04e-02

Pos logit contributions
ls+0.144
bs+0.128
s+0.125
oms+0.121
ushes+0.116
Neg logit contributions
gy-0.114
penter-0.109
IT-0.106
zer-0.106
isk-0.103
Token It
Token ID632
Feature activation-4.09e-02

Pos logit contributions
gy+0.025
zer+0.024
penter+0.023
 because+0.022
IT+0.022
Neg logit contributions
ls-0.059
oms-0.053
bs-0.053
ushes-0.050
s-0.049
Token grow
Token ID1663
Feature activation-1.86e-02

Pos logit contributions
ls+0.081
bs+0.078
oms+0.077
ushes+0.075
s+0.065
Neg logit contributions
gy-0.089
penter-0.084
zer-0.081
isk-0.080
IT-0.078
Tokenls
Token ID7278
Feature activation-7.00e-01

Pos logit contributions
bs+0.014
ushes+0.011
oms+0.009
eal+0.005
apy+0.004
Neg logit contributions
gy-0.055
 because-0.054
 when-0.051
 accident-0.050
 not-0.050
Token and
Token ID290
Feature activation+7.28e-02

Pos logit contributions
ls+2.019
bs+1.866
ushes+1.849
oms+1.811
s+1.729
Neg logit contributions
gy-0.944
penter-0.857
zer-0.812
IT-0.783
 driving-0.761
Token snaps
Token ID23429
Feature activation-3.13e-01

Pos logit contributions
gy+0.153
penter+0.143
zer+0.140
isk+0.135
 driving+0.135
Neg logit contributions
ls-0.151
oms-0.141
ushes-0.138
bs-0.137
s-0.128
Token at
Token ID379
Feature activation-1.66e-01

Pos logit contributions
ls+0.749
bs+0.694
oms+0.687
ushes+0.663
s+0.649
Neg logit contributions
gy-0.528
zer-0.474
uding-0.460
penter-0.454
 because-0.452
Token Sara
Token ID24799
Feature activation-1.21e-01

Pos logit contributions
ls+0.329
bs+0.280
oms+0.264
s+0.263
ushes+0.257
Neg logit contributions
gy-0.394
penter-0.388
IT-0.367
zer-0.367
isk-0.357
Token.
Token ID13
Feature activation+2.60e-02

Pos logit contributions
ls+0.296
bs+0.265
s+0.261
oms+0.250
ushes+0.244
Neg logit contributions
gy-0.225
penter-0.223
IT-0.216
zer-0.211
isk-0.207
Token He
Token ID679
Feature activation+1.95e-02

Pos logit contributions
gy+0.050
penter+0.048
zer+0.046
IT+0.046
isk+0.045
Neg logit contributions
ls-0.075
bs-0.067
oms-0.065
s-0.064
ushes-0.064
Token bar
Token ID2318
Feature activation+3.54e-02

Pos logit contributions
gy+0.045
penter+0.043
zer+0.042
IT+0.041
isk+0.041
Neg logit contributions
ls-0.038
bs-0.034
oms-0.032
ushes-0.032
s-0.031
Tokenks
Token ID591
Feature activation-2.95e-01

Pos logit contributions
 driving+0.064
 never+0.063
uding+0.062
gy+0.062
 who+0.062
Neg logit contributions
ls-0.069
bs-0.059
ushes-0.058
s-0.055
oms-0.055
Token and
Token ID290
Feature activation+7.24e-02

Pos logit contributions
ls+0.831
oms+0.772
bs+0.769
ushes+0.761
s+0.711
Neg logit contributions
gy-0.367
penter-0.362
zer-0.343
 driving-0.328
isk-0.320
Token grow
Token ID1663
Feature activation-2.00e-02

Pos logit contributions
gy+0.158
penter+0.147
zer+0.145
IT+0.143
isk+0.143
Neg logit contributions
ls-0.151
bs-0.134
oms-0.132
ushes-0.131
s-0.124
Tokenls
Token ID7278
Feature activation-7.00e-01

Pos logit contributions
bs+0.017
ushes+0.013
oms+0.010
eal+0.007
apy+0.006
Neg logit contributions
gy-0.056
 because-0.056
 when-0.053
 driving-0.052
 always-0.052
Token and
Token ID290
Feature activation+7.34e-02

Pos logit contributions
ls+1.889
bs+1.726
oms+1.667
ushes+1.665
s+1.612
Neg logit contributions
gy-1.119
penter-1.036
zer-0.984
IT-0.970
isk-0.956
Token ch
Token ID442
Feature activation-1.33e-01

Pos logit contributions
gy+0.148
penter+0.139
zer+0.135
IT+0.133
isk+0.132
Neg logit contributions
ls-0.167
bs-0.148
oms-0.147
ushes-0.144
s-0.141
Tokenases
Token ID1386
Feature activation-2.51e-01

Pos logit contributions
ls+0.256
s+0.249
ushes+0.229
bs+0.223
oms+0.217
Neg logit contributions
gy-0.243
uding-0.240
zer-0.238
isk-0.230
 accident-0.215
Token them
Token ID606
Feature activation-1.32e-01

Pos logit contributions
ls+0.628
oms+0.562
bs+0.559
ushes+0.547
s+0.527
Neg logit contributions
gy-0.441
penter-0.421
zer-0.408
isk-0.401
IT-0.394
Token.
Token ID13
Feature activation+2.58e-02

Pos logit contributions
ls+0.340
bs+0.306
oms+0.300
ushes+0.294
s+0.288
Neg logit contributions
gy-0.228
penter-0.212
zer-0.202
IT-0.199
isk-0.199
Token back
Token ID736
Feature activation-9.79e-02

Pos logit contributions
gy+0.010
penter+0.010
IT+0.010
zer+0.010
isk+0.010
Neg logit contributions
ls-0.011
bs-0.009
s-0.009
oms-0.009
ushes-0.009
Token at
Token ID379
Feature activation-1.65e-01

Pos logit contributions
ls+0.256
bs+0.230
s+0.224
oms+0.217
ushes+0.215
Neg logit contributions
gy-0.165
penter-0.160
IT-0.160
zer-0.156
isk-0.153
Token night
Token ID1755
Feature activation-4.87e-02

Pos logit contributions
ls+0.353
bs+0.290
s+0.282
oms+0.277
ushes+0.271
Neg logit contributions
gy-0.352
penter-0.352
IT-0.328
isk-0.327
achable-0.325
Token.
Token ID13
Feature activation+2.06e-02

Pos logit contributions
ls+0.110
bs+0.098
oms+0.093
s+0.092
ushes+0.091
Neg logit contributions
gy-0.100
penter-0.096
zer-0.092
IT-0.092
isk-0.091
Token Ow
Token ID11960
Feature activation-4.92e-02

Pos logit contributions
gy+0.039
IT+0.037
zer+0.037
lf+0.037
penter+0.036
Neg logit contributions
ls-0.046
bs-0.044
s-0.043
ushes-0.042
oms-0.041
Tokenls
Token ID7278
Feature activation-7.00e-01

Pos logit contributions
ushes+0.096
ls+0.092
bs+0.092
oms+0.083
s+0.080
Neg logit contributions
gy-0.105
penter-0.098
lf-0.097
zer-0.097
IT-0.094
Token like
Token ID588
Feature activation+8.64e-02

Pos logit contributions
ls+1.710
ushes+1.533
bs+1.528
oms+1.486
s+1.446
Neg logit contributions
gy-1.301
penter-1.214
zer-1.155
isk-1.137
IT-1.136
Token the
Token ID262
Feature activation-3.61e-02

Pos logit contributions
gy+0.172
penter+0.162
IT+0.161
isk+0.156
zer+0.154
Neg logit contributions
ls-0.208
bs-0.182
s-0.176
oms-0.173
ushes-0.167
Token night
Token ID1755
Feature activation-4.85e-02

Pos logit contributions
ls+0.087
bs+0.076
oms+0.074
s+0.072
ushes+0.071
Neg logit contributions
gy-0.071
penter-0.067
IT-0.066
isk-0.065
zer-0.063
Token,"
Token ID553
Feature activation-1.65e-01

Pos logit contributions
ls+0.122
bs+0.110
ushes+0.106
oms+0.105
s+0.104
Neg logit contributions
gy-0.087
penter-0.082
zer-0.078
IT-0.077
isk-0.076
Token mom
Token ID1995
Feature activation+1.36e-01

Pos logit contributions
ls+0.421
bs+0.382
s+0.363
oms+0.358
ushes+0.351
Neg logit contributions
gy-0.293
penter-0.276
IT-0.276
zer-0.268
isk-0.267
Token\n
Token ID198
Feature activation-3.12e-02

Pos logit contributions
gy+0.041
penter+0.039
IT+0.039
zer+0.038
isk+0.038
Neg logit contributions
ls-0.041
bs-0.036
s-0.035
oms-0.034
ushes-0.034
Token\n
Token ID198
Feature activation-3.11e-02

Pos logit contributions
ls+0.084
bs+0.076
oms+0.075
ushes+0.071
s+0.069
Neg logit contributions
gy-0.050
penter-0.047
zer-0.045
isk-0.044
IT-0.043
TokenThe
Token ID464
Feature activation+4.27e-02

Pos logit contributions
ls+0.080
bs+0.071
oms+0.070
ushes+0.068
s+0.067
Neg logit contributions
gy-0.056
penter-0.053
zer-0.051
IT-0.050
isk-0.050
Token dog
Token ID3290
Feature activation+6.22e-02

Pos logit contributions
gy+0.086
IT+0.079
penter+0.079
uding+0.077
isk+0.077
Neg logit contributions
ls-0.104
bs-0.090
oms-0.088
s-0.087
ushes-0.085
Token grow
Token ID1663
Feature activation-1.82e-02

Pos logit contributions
gy+0.149
penter+0.144
zer+0.141
IT+0.139
isk+0.139
Neg logit contributions
ls-0.116
bs-0.104
oms-0.098
ushes-0.096
s-0.094
Tokenls
Token ID7278
Feature activation-7.00e-01

Pos logit contributions
bs+0.018
ushes+0.011
oms+0.006
forts+0.006
able+0.006
Neg logit contributions
gy-0.049
 driving-0.048
penter-0.048
 accident-0.047
 not-0.047
Token and
Token ID290
Feature activation+7.12e-02

Pos logit contributions
ls+1.945
bs+1.790
ushes+1.729
oms+1.722
s+1.665
Neg logit contributions
gy-1.066
penter-0.987
zer-0.943
IT-0.920
isk-0.908
Token shows
Token ID2523
Feature activation-1.74e-01

Pos logit contributions
gy+0.159
penter+0.152
zer+0.148
isk+0.146
IT+0.145
Neg logit contributions
ls-0.145
bs-0.130
oms-0.127
ushes-0.125
s-0.121
Token his
Token ID465
Feature activation+5.57e-02

Pos logit contributions
ls+0.447
oms+0.415
bs+0.410
ushes+0.395
s+0.389
Neg logit contributions
gy-0.273
penter-0.254
zer-0.248
isk-0.238
 Billy-0.237
Token teeth
Token ID9941
Feature activation-4.51e-02

Pos logit contributions
gy+0.096
penter+0.090
IT+0.087
zer+0.086
isk+0.085
Neg logit contributions
ls-0.147
bs-0.132
oms-0.127
s-0.126
ushes-0.125
Token.
Token ID13
Feature activation+1.98e-02

Pos logit contributions
ls+0.116
bs+0.105
oms+0.101
ushes+0.099
s+0.099
Neg logit contributions
gy-0.079
penter-0.074
zer-0.071
IT-0.070
isk-0.070
Token Ben
Token ID3932
Feature activation+7.43e-03

Pos logit contributions
ls+0.683
bs+0.614
oms+0.591
s+0.586
ushes+0.579
Neg logit contributions
gy-0.466
penter-0.440
IT-0.422
zer-0.421
isk-0.415
Token and
Token ID290
Feature activation+7.50e-02

Pos logit contributions
gy+0.013
penter+0.013
IT+0.012
zer+0.012
isk+0.012
Neg logit contributions
ls-0.019
bs-0.017
s-0.017
oms-0.016
ushes-0.015
Token Lily
Token ID20037
Feature activation+3.38e-02

Pos logit contributions
gy+0.140
penter+0.131
zer+0.128
isk+0.125
IT+0.124
Neg logit contributions
ls-0.183
bs-0.164
oms-0.163
ushes-0.159
s-0.159
Token and
Token ID290
Feature activation+7.57e-02

Pos logit contributions
gy+0.057
penter+0.054
IT+0.052
zer+0.052
isk+0.051
Neg logit contributions
ls-0.089
bs-0.081
oms-0.078
s-0.077
ushes-0.076
Token grow
Token ID1663
Feature activation-2.30e-02

Pos logit contributions
gy+0.154
penter+0.146
zer+0.141
IT+0.140
isk+0.140
Neg logit contributions
ls-0.174
bs-0.153
oms-0.150
ushes-0.147
s-0.147
Tokenls
Token ID7278
Feature activation-6.99e-01

Pos logit contributions
bs+0.042
ushes+0.036
oms+0.033
rav+0.031
forts+0.030
Neg logit contributions
 because-0.042
ter-0.041
gy-0.041
 when-0.039
 accident-0.039
Token.
Token ID13
Feature activation+2.49e-02

Pos logit contributions
ls+1.852
bs+1.720
ushes+1.672
oms+1.656
s+1.589
Neg logit contributions
gy-1.127
penter-1.027
zer-0.998
IT-0.960
uding-0.947
Token It
Token ID632
Feature activation-3.23e-02

Pos logit contributions
gy+0.047
penter+0.044
IT+0.042
zer+0.042
isk+0.041
Neg logit contributions
ls-0.061
bs-0.055
oms-0.053
s-0.052
ushes-0.052
Token wants
Token ID3382
Feature activation-1.80e-01

Pos logit contributions
ls+0.059
bs+0.056
oms+0.053
ushes+0.052
s+0.048
Neg logit contributions
gy-0.076
penter-0.073
zer-0.071
isk-0.069
IT-0.069
Token to
Token ID284
Feature activation-1.45e-02

Pos logit contributions
ls+0.442
bs+0.395
oms+0.378
s+0.377
ushes+0.370
Neg logit contributions
gy-0.341
penter-0.324
IT-0.312
zer-0.311
isk-0.307
Token eat
Token ID4483
Feature activation+3.05e-02

Pos logit contributions
ls+0.034
bs+0.030
s+0.029
oms+0.027
ushes+0.027
Neg logit contributions
penter-0.027
gy-0.027
IT-0.026
zer-0.026
uding-0.025
Token like
Token ID588
Feature activation+8.22e-02

Pos logit contributions
gy+0.470
penter+0.424
isk+0.407
zer+0.397
 accident+0.383
Neg logit contributions
ls-0.914
ushes-0.827
oms-0.826
bs-0.792
s-0.791
Token that
Token ID326
Feature activation+1.49e-01

Pos logit contributions
gy+0.178
penter+0.170
IT+0.162
uding+0.156
isk+0.156
Neg logit contributions
ls-0.188
bs-0.161
oms-0.154
s-0.152
ushes-0.144
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
gy+0.284
penter+0.269
IT+0.261
zer+0.259
isk+0.257
Neg logit contributions
ls-0.367
bs-0.328
s-0.315
oms-0.313
ushes-0.304
Token He
Token ID679
Feature activation+1.55e-02

Pos logit contributions
gy+0.026
penter+0.025
IT+0.024
zer+0.024
isk+0.023
Neg logit contributions
ls-0.039
bs-0.035
oms-0.034
s-0.034
ushes-0.033
Token grow
Token ID1663
Feature activation-1.82e-02

Pos logit contributions
gy+0.036
penter+0.035
zer+0.034
isk+0.033
IT+0.033
Neg logit contributions
ls-0.030
bs-0.027
oms-0.026
ushes-0.026
s-0.025
Tokenls
Token ID7278
Feature activation-6.99e-01

Pos logit contributions
bs+0.014
ushes+0.009
oms+0.007
forts+0.006
prise+0.006
Neg logit contributions
gy-0.050
 because-0.047
 always-0.047
 not-0.047
 actually-0.047
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
ls+1.968
bs+1.810
ushes+1.778
oms+1.768
s+1.671
Neg logit contributions
gy-1.023
penter-0.930
zer-0.881
isk-0.842
IT-0.839
Token He
Token ID679
Feature activation+1.71e-02

Pos logit contributions
gy+0.028
penter+0.027
zer+0.026
IT+0.026
isk+0.025
Neg logit contributions
ls-0.038
bs-0.034
oms-0.033
s-0.033
ushes-0.033
Token shows
Token ID2523
Feature activation-1.73e-01

Pos logit contributions
gy+0.039
penter+0.038
zer+0.037
isk+0.036
IT+0.036
Neg logit contributions
ls-0.034
bs-0.030
oms-0.029
ushes-0.028
s-0.028
Token his
Token ID465
Feature activation+5.22e-02

Pos logit contributions
ls+0.500
bs+0.458
oms+0.447
s+0.439
ushes+0.437
Neg logit contributions
gy-0.248
penter-0.234
zer-0.222
IT-0.219
isk-0.217
Token teeth
Token ID9941
Feature activation-4.22e-02

Pos logit contributions
gy+0.095
penter+0.094
IT+0.090
isk+0.088
zer+0.087
Neg logit contributions
ls-0.130
bs-0.116
s-0.112
ushes-0.110
oms-0.109
Token\n
Token ID198
Feature activation-4.33e-02

Pos logit contributions
gy+0.027
penter+0.026
IT+0.025
zer+0.025
isk+0.025
Neg logit contributions
ls-0.044
bs-0.038
s-0.037
oms-0.037
ushes-0.036
Token\n
Token ID198
Feature activation-4.21e-02

Pos logit contributions
ls+0.105
bs+0.094
oms+0.090
s+0.088
ushes+0.088
Neg logit contributions
gy-0.082
penter-0.078
zer-0.075
IT-0.074
isk-0.074
TokenThe
Token ID464
Feature activation+3.09e-02

Pos logit contributions
ls+0.101
bs+0.089
oms+0.087
ushes+0.085
s+0.083
Neg logit contributions
gy-0.082
penter-0.079
zer-0.075
IT-0.074
isk-0.074
Token snail
Token ID47374
Feature activation-1.12e-02

Pos logit contributions
gy+0.058
penter+0.055
zer+0.053
IT+0.053
isk+0.052
Neg logit contributions
ls-0.077
bs-0.069
oms-0.066
s-0.065
ushes-0.064
Token craw
Token ID27784
Feature activation-1.38e-01

Pos logit contributions
ls+0.021
bs+0.019
ushes+0.018
oms+0.018
s+0.017
Neg logit contributions
gy-0.026
penter-0.025
zer-0.025
IT-0.024
isk-0.024
Tokenls
Token ID7278
Feature activation-6.99e-01

Pos logit contributions
ushes+0.078
bs+0.031
oms+0.026
Happy+0.020
 piping+0.012
Neg logit contributions
gy-0.437
 because-0.408
isk-0.405
zer-0.402
penter-0.401
Token on
Token ID319
Feature activation-1.95e-01

Pos logit contributions
ls+1.617
bs+1.456
ushes+1.426
oms+1.396
s+1.328
Neg logit contributions
gy-1.377
penter-1.272
IT-1.236
zer-1.234
isk-1.209
Token Ben
Token ID3932
Feature activation+7.09e-03

Pos logit contributions
ls+0.372
bs+0.325
s+0.317
ushes+0.303
oms+0.300
Neg logit contributions
gy-0.454
penter-0.447
bike-0.423
uding-0.420
IT-0.420
Token's
Token ID338
Feature activation+3.55e-03

Pos logit contributions
gy+0.014
penter+0.013
IT+0.013
zer+0.012
isk+0.012
Neg logit contributions
ls-0.017
bs-0.015
s-0.015
oms-0.015
ushes-0.014
Token tower
Token ID10580
Feature activation+1.89e-01

Pos logit contributions
penter+0.006
gy+0.006
isk+0.005
IT+0.005
zer+0.005
Neg logit contributions
ls-0.009
bs-0.008
s-0.008
oms-0.008
ushes-0.008
Token.
Token ID13
Feature activation+2.11e-02

Pos logit contributions
gy+0.350
penter+0.331
IT+0.319
zer+0.319
isk+0.314
Neg logit contributions
ls-0.472
bs-0.423
oms-0.407
s-0.405
ushes-0.398
TokenBut
Token ID1537
Feature activation+2.42e-01

Pos logit contributions
ls+0.094
bs+0.087
s+0.084
ushes+0.082
oms+0.081
Neg logit contributions
gy-0.054
penter-0.049
IT-0.048
zer-0.047
isk-0.047
Token then
Token ID788
Feature activation+3.37e-01

Pos logit contributions
IT+0.388
isk+0.355
uding+0.352
eman+0.350
bike+0.350
Neg logit contributions
ls-0.652
bs-0.592
s-0.552
oms-0.506
l-0.473
Token,
Token ID11
Feature activation-1.66e-02

Pos logit contributions
gy+0.590
penter+0.536
zer+0.519
IT+0.494
lf+0.493
Neg logit contributions
ls-0.844
bs-0.773
ushes-0.771
oms-0.756
s-0.730
Token Lily
Token ID20037
Feature activation+3.14e-02

Pos logit contributions
ls+0.033
bs+0.029
s+0.026
oms+0.026
ushes+0.024
Neg logit contributions
gy-0.039
penter-0.039
IT-0.039
isk-0.038
zer-0.038
Token craw
Token ID27784
Feature activation-1.40e-01

Pos logit contributions
gy+0.075
penter+0.072
zer+0.070
IT+0.069
isk+0.069
Neg logit contributions
ls-0.058
bs-0.052
ushes-0.050
oms-0.048
s-0.047
Tokenls
Token ID7278
Feature activation-6.99e-01

Pos logit contributions
ushes+0.101
apy+0.041
bs+0.036
Happy+0.035
oms+0.018
Neg logit contributions
 because-0.407
gy-0.400
 accident-0.398
zer-0.382
isk-0.382
Token under
Token ID739
Feature activation+1.47e-02

Pos logit contributions
ls+1.486
bs+1.353
ushes+1.346
oms+1.281
s+1.242
Neg logit contributions
gy-1.460
penter-1.359
zer-1.333
IT-1.330
isk-1.323
Token the
Token ID262
Feature activation-3.71e-02

Pos logit contributions
penter+0.038
gy+0.037
bike+0.036
isk+0.036
uding+0.036
Neg logit contributions
ls-0.027
bs-0.022
oms-0.021
s-0.021
ushes-0.017
Token sofa
Token ID34902
Feature activation+1.00e-01

Pos logit contributions
ls+0.110
bs+0.101
oms+0.098
s+0.097
ushes+0.096
Neg logit contributions
gy-0.051
penter-0.047
zer-0.045
IT-0.045
isk-0.044
Token.
Token ID13
Feature activation+2.03e-02

Pos logit contributions
gy+0.199
penter+0.190
IT+0.184
zer+0.182
isk+0.180
Neg logit contributions
ls-0.238
bs-0.209
s-0.206
oms-0.204
ushes-0.191
Token She
Token ID1375
Feature activation-1.10e-01

Pos logit contributions
gy+0.040
penter+0.038
IT+0.037
isk+0.036
zer+0.036
Neg logit contributions
ls-0.048
bs-0.042
oms-0.041
s-0.041
ushes-0.040
Token!
Token ID0
Feature activation+9.07e-02

Pos logit contributions
gy+0.206
penter+0.195
IT+0.191
zer+0.188
isk+0.187
Neg logit contributions
ls-0.294
bs-0.264
s-0.256
oms-0.254
ushes-0.248
Token I
Token ID314
Feature activation+8.35e-02

Pos logit contributions
gy+0.171
penter+0.159
zer+0.158
IT+0.156
isk+0.155
Neg logit contributions
ls-0.218
bs-0.198
s-0.192
oms-0.192
ushes-0.189
Token love
Token ID1842
Feature activation-7.47e-02

Pos logit contributions
gy+0.161
penter+0.152
zer+0.146
IT+0.146
isk+0.146
Neg logit contributions
ls-0.204
bs-0.176
oms-0.172
s-0.168
ushes-0.168
Token painting
Token ID12036
Feature activation+1.54e-01

Pos logit contributions
ls+0.157
bs+0.139
s+0.137
oms+0.131
ushes+0.125
Neg logit contributions
gy-0.169
penter-0.158
isk-0.156
IT-0.154
achable-0.149
Token ow
Token ID12334
Feature activation+9.56e-02

Pos logit contributions
gy+0.279
penter+0.263
IT+0.253
zer+0.252
isk+0.250
Neg logit contributions
ls-0.388
bs-0.348
oms-0.335
s-0.334
ushes-0.328
Tokenls
Token ID7278
Feature activation-6.99e-01

Pos logit contributions
lf+0.222
gy+0.221
zer+0.215
penter+0.213
 driving+0.208
Neg logit contributions
ushes-0.146
ls-0.144
bs-0.136
oms-0.129
s-0.114
Token.
Token ID13
Feature activation+9.21e-03

Pos logit contributions
ls+1.761
bs+1.571
oms+1.527
s+1.527
ushes+1.466
Neg logit contributions
gy-1.271
penter-1.185
IT-1.157
zer-1.153
isk-1.147
Tokenâ
Token ID22940
Feature activation+2.14e-02

Pos logit contributions
gy+0.019
zer+0.017
penter+0.017
isk+0.017
IT+0.017
Neg logit contributions
ls-0.020
bs-0.019
s-0.018
oms-0.018
ushes-0.018
Token€
Token ID26391
Feature activation+1.60e-01

Pos logit contributions
gy+0.031
penter+0.029
zer+0.027
IT+0.027
isk+0.027
Neg logit contributions
ls-0.062
bs-0.057
oms-0.055
s-0.054
ushes-0.054
Token<|endoftext|>
Token ID50256
Feature activation-3.88e-02

Pos logit contributions
gy+0.332
penter+0.320
IT+0.309
isk+0.306
zer+0.304
Neg logit contributions
ls-0.359
bs-0.319
s-0.310
oms-0.305
ushes-0.296
TokenOne
Token ID3198
Feature activation+2.18e-01

Pos logit contributions
ls+0.074
bs+0.066
ushes+0.061
oms+0.061
s+0.059
Neg logit contributions
gy-0.092
penter-0.088
IT-0.087
zer-0.086
isk-0.085
Token to
Token ID284
Feature activation-6.12e-03

Pos logit contributions
ls+0.112
bs+0.100
s+0.097
oms+0.096
ushes+0.094
Neg logit contributions
gy-0.088
penter-0.085
zer-0.082
IT-0.081
isk-0.080
Token an
Token ID281
Feature activation+1.71e-01

Pos logit contributions
ls+0.011
bs+0.010
oms+0.009
ushes+0.009
s+0.009
Neg logit contributions
gy-0.013
penter-0.013
itor-0.013
zer-0.012
ologist-0.012
Token end
Token ID886
Feature activation+1.14e-01

Pos logit contributions
gy+0.421
isk+0.384
zer+0.378
IT+0.376
uding+0.367
Neg logit contributions
ls-0.299
oms-0.256
bs-0.239
s-0.230
l-0.218
Token.
Token ID13
Feature activation+1.39e-02

Pos logit contributions
gy+0.243
penter+0.233
IT+0.231
zer+0.229
isk+0.226
Neg logit contributions
ls-0.255
bs-0.224
s-0.218
oms-0.217
ushes-0.209
Token Everyone
Token ID11075
Feature activation-8.39e-02

Pos logit contributions
gy+0.028
penter+0.026
zer+0.026
isk+0.026
IT+0.025
Neg logit contributions
ls-0.031
bs-0.029
s-0.028
oms-0.028
ushes-0.027
Token who
Token ID508
Feature activation+3.09e-01

Pos logit contributions
ls+0.238
bs+0.217
oms+0.214
ushes+0.214
s+0.202
Neg logit contributions
gy-0.121
penter-0.115
zer-0.109
IT-0.106
lf-0.103
Token watched
Token ID7342
Feature activation+1.33e-01

Pos logit contributions
gy+0.474
penter+0.446
zer+0.418
isk+0.413
IT+0.412
Neg logit contributions
ls-0.854
bs-0.772
oms-0.762
ushes-0.757
s-0.724
Token it
Token ID340
Feature activation-5.69e-02

Pos logit contributions
gy+0.254
penter+0.240
IT+0.232
isk+0.231
zer+0.229
Neg logit contributions
ls-0.327
bs-0.294
oms-0.283
s-0.281
ushes-0.272
Token was
Token ID373
Feature activation+7.35e-02

Pos logit contributions
ls+0.144
bs+0.130
oms+0.125
s+0.124
ushes+0.122
Neg logit contributions
gy-0.102
penter-0.097
zer-0.093
IT-0.093
isk-0.092
Token sad
Token ID6507
Feature activation-1.91e-01

Pos logit contributions
gy+0.151
penter+0.143
zer+0.140
IT+0.140
isk+0.138
Neg logit contributions
ls-0.167
bs-0.149
oms-0.141
s-0.141
ushes-0.140
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
ls+0.463
s+0.417
bs+0.413
oms+0.404
ushes+0.378
Neg logit contributions
gy-0.368
zer-0.358
IT-0.358
penter-0.358
isk-0.352
Token�
Token ID129
Feature activation+3.66e-02

Pos logit contributions
gy+0.307
penter+0.299
IT+0.296
zer+0.288
isk+0.287
Neg logit contributions
ls-0.369
bs-0.333
s-0.331
oms-0.316
ushes-0.310
Token�
Token ID241
Feature activation+1.48e-01

Pos logit contributions
gy+0.056
eman+0.052
penter+0.051
lf+0.050
ologist+0.050
Neg logit contributions
ls-0.093
bs-0.084
s-0.082
oms-0.080
ushes-0.080
TokenWhat
Token ID2061
Feature activation+1.21e-01

Pos logit contributions
gy+0.294
penter+0.276
isk+0.269
IT+0.267
zer+0.265
Neg logit contributions
ls-0.340
bs-0.312
s-0.302
oms-0.295
ushes-0.292
Token are
Token ID389
Feature activation-3.45e-01

Pos logit contributions
gy+0.254
penter+0.243
zer+0.235
IT+0.233
isk+0.230
Neg logit contributions
ls-0.270
bs-0.236
oms-0.228
ushes-0.225
s-0.222
Token you
Token ID345
Feature activation-4.94e-02

Pos logit contributions
ls+0.783
bs+0.687
oms+0.655
s+0.655
ushes+0.624
Neg logit contributions
gy-0.727
penter-0.689
IT-0.675
zer-0.671
isk-0.668
Token doing
Token ID1804
Feature activation+2.04e-01

Pos logit contributions
ls+0.135
oms+0.121
bs+0.120
s+0.118
ushes+0.110
Neg logit contributions
penter-0.072
gy-0.071
isk-0.066
zer-0.065
IT-0.063
Token in
Token ID287
Feature activation+6.77e-03

Pos logit contributions
isk+0.443
gy+0.433
IT+0.407
penter+0.404
Wh+0.402
Neg logit contributions
ls-0.426
s-0.347
oms-0.346
bs-0.330
ushes-0.300
Token here
Token ID994
Feature activation-1.17e-02

Pos logit contributions
gy+0.013
penter+0.013
IT+0.013
achable+0.013
zer+0.013
Neg logit contributions
ls-0.015
bs-0.013
oms-0.013
s-0.013
ushes-0.012
Token?
Token ID30
Feature activation+1.10e-01

Pos logit contributions
ls+0.029
oms+0.025
s+0.025
bs+0.025
ushes+0.023
Neg logit contributions
gy-0.021
zer-0.020
penter-0.019
IT-0.019
isk-0.019
Token This
Token ID770
Feature activation+3.29e-02

Pos logit contributions
gy+0.204
penter+0.191
zer+0.187
IT+0.186
isk+0.184
Neg logit contributions
ls-0.272
bs-0.246
oms-0.238
s-0.237
ushes-0.230
Token is
Token ID318
Feature activation-1.89e-01

Pos logit contributions
gy+0.075
isk+0.070
zer+0.070
IT+0.070
penter+0.069
Neg logit contributions
ls-0.067
bs-0.057
s-0.057
oms-0.055
ushes-0.052
Token lemon
Token ID18873
Feature activation-3.85e-02

Pos logit contributions
gy+0.244
penter+0.232
IT+0.229
isk+0.226
zer+0.222
Neg logit contributions
ls-0.258
bs-0.229
oms-0.219
s-0.218
ushes-0.210
Token to
Token ID284
Feature activation-1.01e-02

Pos logit contributions
ls+0.091
bs+0.082
ushes+0.080
oms+0.079
s+0.076
Neg logit contributions
gy-0.073
penter-0.070
zer-0.067
IT-0.065
isk-0.064
Token play
Token ID711
Feature activation+1.22e-01

Pos logit contributions
ls+0.024
bs+0.022
s+0.021
oms+0.021
ushes+0.020
Neg logit contributions
gy-0.019
penter-0.018
IT-0.017
zer-0.017
uding-0.017
Token with
Token ID351
Feature activation+4.43e-02

Pos logit contributions
gy+0.262
IT+0.238
isk+0.238
penter+0.236
achable+0.233
Neg logit contributions
ls-0.269
s-0.234
bs-0.232
oms-0.219
ushes-0.208
Token.
Token ID13
Feature activation+7.76e-03

Pos logit contributions
gy+0.086
penter+0.083
IT+0.080
isk+0.078
zer+0.077
Neg logit contributions
ls-0.108
bs-0.095
s-0.093
oms-0.089
ushes-0.088
Token Tim
Token ID5045
Feature activation+2.87e-01

Pos logit contributions
gy+0.016
zer+0.016
IT+0.015
uding+0.015
isk+0.015
Neg logit contributions
ls-0.015
bs-0.015
s-0.014
oms-0.014
ushes-0.013
Tokenmy
Token ID1820
Feature activation+1.14e-01

Pos logit contributions
gy+0.464
penter+0.443
IT+0.424
zer+0.422
uding+0.418
Neg logit contributions
ls-0.762
bs-0.690
s-0.687
oms-0.653
ushes-0.642
Token was
Token ID373
Feature activation+6.63e-02

Pos logit contributions
gy+0.214
penter+0.202
IT+0.201
zer+0.196
uding+0.195
Neg logit contributions
ls-0.278
s-0.249
bs-0.249
oms-0.230
ushes-0.225
Token so
Token ID523
Feature activation+1.16e-01

Pos logit contributions
gy+0.162
IT+0.158
isk+0.155
storm+0.154
zer+0.154
Neg logit contributions
ls-0.101
bs-0.093
ushes-0.093
s-0.085
oms-0.080
Token excited
Token ID6568
Feature activation-1.84e-01

Pos logit contributions
gy+0.205
 always+0.202
 driving+0.200
penter+0.197
 never+0.192
Neg logit contributions
ls-0.292
oms-0.262
bs-0.243
ushes-0.232
s-0.230
Token to
Token ID284
Feature activation-1.32e-02

Pos logit contributions
ls+0.417
s+0.391
bs+0.334
oms+0.316
l+0.302
Neg logit contributions
gy-0.392
IT-0.380
isk-0.378
penter-0.372
achable-0.368
Token mom
Token ID1995
Feature activation+1.35e-01

Pos logit contributions
penter+0.107
gy+0.105
IT+0.105
isk+0.101
zer+0.098
Neg logit contributions
ls-0.122
bs-0.103
s-0.100
ushes-0.100
oms-0.099
Tokenmy
Token ID1820
Feature activation+1.26e-01

Pos logit contributions
IT+0.266
achable+0.257
uding+0.251
itor+0.247
zer+0.246
Neg logit contributions
s-0.291
ls-0.288
bs-0.283
oms-0.246
ushes-0.235
Token.
Token ID13
Feature activation+1.27e-02

Pos logit contributions
IT+0.241
gy+0.232
achable+0.230
zer+0.229
uding+0.228
Neg logit contributions
ls-0.283
s-0.275
bs-0.269
oms-0.238
ushes-0.230
Token
Token ID198
Feature activation-2.99e-02

Pos logit contributions
zer+0.029
gy+0.028
penter+0.028
lf+0.028
eman+0.028
Neg logit contributions
s-0.022
ls-0.022
ushes-0.020
bs-0.020
oms-0.018
Token
Token ID198
Feature activation-3.08e-02

Pos logit contributions
ls+0.067
bs+0.059
s+0.057
oms+0.056
ushes+0.056
Neg logit contributions
gy-0.062
penter-0.060
zer-0.058
IT-0.058
isk-0.057
TokenK
Token ID42
Feature activation+2.24e-01

Pos logit contributions
ls+0.076
s+0.073
bs+0.071
ushes+0.066
l+0.063
Neg logit contributions
IT-0.050
gy-0.050
uding-0.049
zer-0.047
itor-0.047
Tokenenny
Token ID11870
Feature activation-1.15e-01

Pos logit contributions
gy+0.488
penter+0.465
IT+0.450
zer+0.450
isk+0.446
Neg logit contributions
ls-0.488
bs-0.430
oms-0.410
s-0.405
ushes-0.401
Token wanted
Token ID2227
Feature activation+8.77e-02

Pos logit contributions
ls+0.241
s+0.224
bs+0.219
oms+0.189
l+0.174
Neg logit contributions
IT-0.239
gy-0.232
uding-0.232
zer-0.232
penter-0.231
Token his
Token ID465
Feature activation+5.37e-02

Pos logit contributions
gy+0.226
penter+0.214
IT+0.214
isk+0.211
uding+0.209
Neg logit contributions
ls-0.149
bs-0.134
s-0.129
oms-0.113
ushes-0.111
Token mom
Token ID1995
Feature activation+1.35e-01

Pos logit contributions
penter+0.096
gy+0.095
isk+0.093
IT+0.093
zer+0.090
Neg logit contributions
ls-0.134
s-0.118
bs-0.117
oms-0.114
ushes-0.113
Tokenmy
Token ID1820
Feature activation+1.26e-01

Pos logit contributions
gy+0.212
IT+0.209
penter+0.209
zer+0.206
uding+0.199
Neg logit contributions
ls-0.351
bs-0.331
s-0.329
ushes-0.311
oms-0.306
Token them
Token ID606
Feature activation-1.41e-01

Pos logit contributions
ls+0.615
bs+0.533
oms+0.510
s+0.495
ushes+0.489
Neg logit contributions
gy-0.796
penter-0.760
IT-0.734
zer-0.729
isk-0.721
Token went
Token ID1816
Feature activation-5.17e-02

Pos logit contributions
ls+0.345
bs+0.314
oms+0.299
s+0.299
ushes+0.292
Neg logit contributions
gy-0.261
penter-0.246
IT-0.241
zer-0.239
isk-0.236
Token together
Token ID1978
Feature activation-2.12e-01

Pos logit contributions
ls+0.126
bs+0.114
ushes+0.108
oms+0.107
s+0.104
Neg logit contributions
gy-0.100
penter-0.093
 driving-0.088
zer-0.086
isk-0.085
Token to
Token ID284
Feature activation-7.93e-03

Pos logit contributions
ls+0.512
bs+0.463
s+0.450
oms+0.445
ushes+0.432
Neg logit contributions
gy-0.405
penter-0.388
IT-0.377
isk-0.373
zer-0.373
Token the
Token ID262
Feature activation-4.30e-02

Pos logit contributions
ls+0.015
bs+0.014
oms+0.013
s+0.013
ushes+0.013
Neg logit contributions
penter-0.016
gy-0.016
zer-0.016
isk-0.016
storm-0.016
Token store
Token ID3650
Feature activation+2.27e-01

Pos logit contributions
ls+0.108
s+0.102
oms+0.097
bs+0.095
ushes+0.094
Neg logit contributions
gy-0.070
uding-0.063
zer-0.063
eman-0.060
isk-0.060
Token.
Token ID13
Feature activation-1.64e-04

Pos logit contributions
gy+0.443
zer+0.433
IT+0.430
penter+0.425
isk+0.415
Neg logit contributions
ls-0.515
s-0.473
bs-0.458
oms-0.449
ushes-0.440
Token They
Token ID1119
Feature activation-1.15e-01

Pos logit contributions
ls+0.000
bs+0.000
oms+0.000
s+0.000
ushes+0.000
Neg logit contributions
gy-0.000
penter-0.000
IT-0.000
isk-0.000
zer-0.000
Token got
Token ID1392
Feature activation-8.13e-02

Pos logit contributions
ls+0.278
bs+0.254
s+0.243
oms+0.240
ushes+0.236
Neg logit contributions
gy-0.215
penter-0.205
IT-0.200
zer-0.199
isk-0.194
Token some
Token ID617
Feature activation-6.63e-03

Pos logit contributions
ls+0.216
bs+0.194
oms+0.188
ushes+0.181
s+0.177
Neg logit contributions
gy-0.130
 driving-0.127
penter-0.121
zer-0.112
 always-0.106
Token boxes
Token ID10559
Feature activation-7.07e-02

Pos logit contributions
ls+0.017
bs+0.015
ushes+0.014
oms+0.014
s+0.014
Neg logit contributions
penter-0.012
gy-0.011
 driving-0.011
zer-0.011
 accident-0.010
Token.
Token ID13
Feature activation-1.05e-03

Pos logit contributions
gy+0.211
penter+0.199
IT+0.195
zer+0.194
isk+0.191
Neg logit contributions
ls-0.269
bs-0.240
oms-0.232
s-0.231
ushes-0.224
Token She
Token ID1375
Feature activation-1.23e-01

Pos logit contributions
ls+0.002
bs+0.002
s+0.002
oms+0.002
ushes+0.002
Neg logit contributions
gy-0.002
penter-0.002
zer-0.002
isk-0.002
IT-0.002
Token smiled
Token ID13541
Feature activation-1.49e-01

Pos logit contributions
ls+0.272
s+0.243
bs+0.242
oms+0.221
ushes+0.213
Neg logit contributions
gy-0.262
IT-0.247
penter-0.246
isk-0.244
zer-0.236
Token and
Token ID290
Feature activation+5.20e-02

Pos logit contributions
ls+0.369
bs+0.329
s+0.326
oms+0.318
ushes+0.312
Neg logit contributions
gy-0.273
penter-0.265
IT-0.259
isk-0.258
zer-0.256
Token said
Token ID531
Feature activation-1.87e-01

Pos logit contributions
gy+0.109
penter+0.104
zer+0.101
IT+0.100
isk+0.099
Neg logit contributions
ls-0.117
bs-0.103
oms-0.099
ushes-0.097
s-0.097
Token to
Token ID284
Feature activation-1.15e-02

Pos logit contributions
ls+0.520
bs+0.466
ushes+0.465
oms+0.455
s+0.444
Neg logit contributions
gy-0.276
penter-0.253
zer-0.247
IT-0.234
 driving-0.232
Token the
Token ID262
Feature activation-5.20e-02

Pos logit contributions
ls+0.014
bs+0.013
oms+0.011
s+0.011
ushes+0.010
Neg logit contributions
gy-0.031
penter-0.031
ologist-0.031
eman-0.029
storm-0.029
Token tire
Token ID15867
Feature activation+2.87e-01

Pos logit contributions
ls+0.124
oms+0.111
s+0.110
bs+0.110
ushes+0.108
Neg logit contributions
gy-0.091
isk-0.084
uding-0.084
zer-0.083
lf-0.082
Token "
Token ID366
Feature activation+1.18e-01

Pos logit contributions
IT+0.551
gy+0.534
zer+0.530
uding+0.526
achable+0.522
Neg logit contributions
ls-0.650
s-0.603
bs-0.581
oms-0.559
l-0.517
TokenHi
Token ID17250
Feature activation+6.95e-02

Pos logit contributions
gy+0.211
penter+0.198
zer+0.189
isk+0.187
IT+0.187
Neg logit contributions
ls-0.294
bs-0.270
s-0.258
oms-0.256
ushes-0.254
Token,
Token ID11
Feature activation-2.73e-02

Pos logit contributions
gy+0.144
penter+0.139
isk+0.136
IT+0.133
zer+0.132
Neg logit contributions
ls-0.156
s-0.133
oms-0.132
bs-0.131
ushes-0.123
Token bad
Token ID2089
Feature activation-2.04e-01

Pos logit contributions
gy+0.254
isk+0.242
penter+0.241
IT+0.239
zer+0.233
Neg logit contributions
ls-0.208
bs-0.182
s-0.171
oms-0.170
ushes-0.163
Token cook
Token ID4255
Feature activation+1.96e-01

Pos logit contributions
ls+0.471
bs+0.414
s+0.396
oms+0.394
ushes+0.355
Neg logit contributions
isk-0.417
gy-0.407
IT-0.404
penter-0.388
zer-0.376
Token,"
Token ID553
Feature activation-1.61e-01

Pos logit contributions
gy+0.380
penter+0.355
IT+0.353
zer+0.346
isk+0.345
Neg logit contributions
ls-0.470
bs-0.418
oms-0.402
s-0.400
ushes-0.391
Token Sam
Token ID3409
Feature activation+1.59e-01

Pos logit contributions
ls+0.424
bs+0.382
oms+0.368
s+0.365
ushes+0.360
Neg logit contributions
gy-0.280
penter-0.263
IT-0.253
zer-0.253
isk-0.249
Token says
Token ID1139
Feature activation-4.21e-01

Pos logit contributions
gy+0.496
penter+0.487
zer+0.473
IT+0.473
isk+0.469
Neg logit contributions
ls-0.181
bs-0.145
ushes-0.132
oms-0.131
s-0.120
Token.
Token ID13
Feature activation+2.43e-02

Pos logit contributions
ls+1.124
bs+1.008
ushes+0.981
oms+0.978
s+0.951
Neg logit contributions
gy-0.704
penter-0.650
zer-0.617
isk-0.593
IT-0.590
Token
Token ID198
Feature activation-3.25e-02

Pos logit contributions
gy+0.047
penter+0.044
IT+0.044
zer+0.044
isk+0.043
Neg logit contributions
ls-0.059
bs-0.052
s-0.051
oms-0.050
ushes-0.049
Token
Token ID198
Feature activation-3.24e-02

Pos logit contributions
ls+0.083
bs+0.076
oms+0.073
ushes+0.070
s+0.070
Neg logit contributions
gy-0.056
penter-0.053
zer-0.051
isk-0.050
IT-0.049
Token"
Token ID1
Feature activation-1.07e-01

Pos logit contributions
ls+0.073
bs+0.065
oms+0.062
s+0.061
ushes+0.061
Neg logit contributions
gy-0.068
penter-0.065
IT-0.063
zer-0.063
isk-0.062
TokenI
Token ID40
Feature activation+9.35e-02

Pos logit contributions
ls+0.263
bs+0.239
s+0.228
oms+0.225
ushes+0.225
Neg logit contributions
gy-0.207
isk-0.184
penter-0.183
IT-0.181
zer-0.179
Token'm
Token ID1101
Feature activation-1.07e-01

Pos logit contributions
gy+0.201
penter+0.188
isk+0.182
zer+0.181
IT+0.177
Neg logit contributions
ls-0.211
bs-0.181
oms-0.179
ushes-0.174
s-0.170
Token challenge
Token ID4427
Feature activation-1.61e-02

Pos logit contributions
ls+0.090
oms+0.078
bs+0.077
s+0.076
ushes+0.072
Neg logit contributions
gy-0.083
isk-0.077
penter-0.075
zer-0.073
IT-0.073
Token,
Token ID11
Feature activation-1.48e-02

Pos logit contributions
ls+0.037
bs+0.033
s+0.033
oms+0.032
ushes+0.031
Neg logit contributions
gy-0.032
penter-0.030
IT-0.030
isk-0.030
zer-0.030
Token no
Token ID645
Feature activation+2.48e-01

Pos logit contributions
ls+0.037
bs+0.035
oms+0.034
s+0.033
ushes+0.032
Neg logit contributions
gy-0.026
isk-0.024
IT-0.023
zer-0.023
penter-0.023
Token matter
Token ID2300
Feature activation+2.28e-01

Pos logit contributions
gy+0.408
isk+0.384
penter+0.367
zer+0.358
IT+0.346
Neg logit contributions
ls-0.661
bs-0.596
s-0.596
oms-0.588
ushes-0.545
Token what
Token ID644
Feature activation+1.46e-02

Pos logit contributions
gy+0.402
penter+0.387
zer+0.382
IT+0.364
isk+0.358
Neg logit contributions
ls-0.567
oms-0.520
bs-0.503
s-0.500
ushes-0.471
Token it
Token ID340
Feature activation-5.65e-02

Pos logit contributions
gy+0.027
penter+0.026
zer+0.025
IT+0.025
isk+0.024
Neg logit contributions
ls-0.035
bs-0.031
oms-0.030
ushes-0.030
s-0.030
Token is
Token ID318
Feature activation-1.84e-01

Pos logit contributions
ls+0.111
bs+0.099
ushes+0.097
oms+0.092
s+0.088
Neg logit contributions
gy-0.128
penter-0.122
IT-0.122
zer-0.119
uding-0.116
Token".
Token ID1911
Feature activation+2.19e-02

Pos logit contributions
ls+0.418
bs+0.373
s+0.361
oms+0.361
ushes+0.342
Neg logit contributions
gy-0.374
penter-0.356
isk-0.352
zer-0.352
IT-0.344
Token
Token ID198
Feature activation-3.59e-02

Pos logit contributions
gy+0.044
penter+0.042
isk+0.042
zer+0.041
lf+0.041
Neg logit contributions
ls-0.049
s-0.045
bs-0.045
oms-0.042
ushes-0.041
Token
Token ID198
Feature activation-3.60e-02

Pos logit contributions
ls+0.086
bs+0.077
oms+0.074
s+0.072
ushes+0.072
Neg logit contributions
gy-0.070
penter-0.066
zer-0.064
IT-0.064
isk-0.063
TokenThe
Token ID464
Feature activation+4.03e-02

Pos logit contributions
ls+0.079
s+0.072
bs+0.069
ushes+0.066
oms+0.065
Neg logit contributions
gy-0.079
isk-0.069
IT-0.068
uding-0.068
penter-0.067
Token running
Token ID2491
Feature activation+1.91e-01

Pos logit contributions
ls+0.115
bs+0.103
oms+0.099
s+0.097
ushes+0.097
Neg logit contributions
gy-0.091
penter-0.085
IT-0.082
zer-0.082
isk-0.081
Token over
Token ID625
Feature activation+3.49e-03

Pos logit contributions
gy+0.349
penter+0.333
zer+0.321
IT+0.320
isk+0.317
Neg logit contributions
ls-0.477
bs-0.427
s-0.411
oms-0.411
ushes-0.402
Token to
Token ID284
Feature activation-8.99e-03

Pos logit contributions
gy+0.007
penter+0.007
isk+0.006
IT+0.006
zer+0.006
Neg logit contributions
ls-0.008
bs-0.007
s-0.007
oms-0.007
ushes-0.007
Token see
Token ID766
Feature activation-9.38e-02

Pos logit contributions
ls+0.019
bs+0.017
s+0.016
oms+0.016
ushes+0.015
Neg logit contributions
gy-0.019
penter-0.019
isk-0.018
zer-0.018
lf-0.018
Token what
Token ID644
Feature activation+1.35e-02

Pos logit contributions
ls+0.233
bs+0.209
s+0.200
oms+0.200
ushes+0.196
Neg logit contributions
gy-0.176
penter-0.166
zer-0.160
IT-0.160
isk-0.159
Token was
Token ID373
Feature activation+7.06e-02

Pos logit contributions
gy+0.025
penter+0.023
IT+0.023
zer+0.023
isk+0.022
Neg logit contributions
ls-0.034
bs-0.030
oms-0.029
s-0.029
ushes-0.029
Token happening
Token ID5836
Feature activation+2.97e-01

Pos logit contributions
gy+0.102
isk+0.095
penter+0.092
zer+0.091
IT+0.089
Neg logit contributions
ls-0.207
bs-0.182
s-0.179
oms-0.178
ushes-0.170
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
gy+0.571
penter+0.544
zer+0.526
IT+0.525
isk+0.522
Neg logit contributions
ls-0.726
bs-0.644
oms-0.625
s-0.625
ushes-0.600
Token Tim
Token ID5045
Feature activation+2.89e-01

Pos logit contributions
gy+0.023
penter+0.022
zer+0.022
IT+0.021
isk+0.021
Neg logit contributions
ls-0.034
bs-0.031
s-0.030
oms-0.030
ushes-0.029
Token and
Token ID290
Feature activation+6.25e-02

Pos logit contributions
gy+0.576
penter+0.560
zer+0.558
uding+0.554
IT+0.554
Neg logit contributions
ls-0.647
s-0.601
bs-0.560
oms-0.530
ushes-0.488
Token Sue
Token ID26089
Feature activation+6.30e-02

Pos logit contributions
gy+0.104
penter+0.098
zer+0.094
IT+0.094
isk+0.092
Neg logit contributions
ls-0.168
bs-0.152
oms-0.146
s-0.145
ushes-0.143
Token liked
Token ID8288
Feature activation+8.99e-02

Pos logit contributions
ls+0.258
bs+0.236
s+0.225
oms+0.224
ushes+0.218
Neg logit contributions
gy-0.184
penter-0.176
IT-0.171
zer-0.170
isk-0.167
Token to
Token ID284
Feature activation-9.41e-03

Pos logit contributions
gy+0.210
penter+0.205
isk+0.199
IT+0.196
zer+0.193
Neg logit contributions
ls-0.182
bs-0.158
s-0.155
oms-0.151
ushes-0.142
Token slide
Token ID10649
Feature activation+1.17e-01

Pos logit contributions
ls+0.020
bs+0.019
oms+0.018
s+0.017
ushes+0.016
Neg logit contributions
gy-0.018
penter-0.018
uding-0.017
isk-0.017
IT-0.017
Token,
Token ID11
Feature activation-1.28e-02

Pos logit contributions
gy+0.187
penter+0.164
 because+0.154
IT+0.150
 driving+0.150
Neg logit contributions
ls-0.297
bs-0.290
oms-0.284
ushes-0.283
s-0.242
Token swing
Token ID9628
Feature activation+1.51e-01

Pos logit contributions
ls+0.032
s+0.029
bs+0.029
oms+0.027
ushes+0.025
Neg logit contributions
penter-0.023
gy-0.023
IT-0.022
isk-0.022
uding-0.022
Token and
Token ID290
Feature activation+6.58e-02

Pos logit contributions
gy+0.233
penter+0.214
zer+0.198
IT+0.198
isk+0.198
Neg logit contributions
ls-0.414
bs-0.385
oms-0.370
ushes-0.369
s-0.350
Token climb
Token ID12080
Feature activation+1.02e-01

Pos logit contributions
gy+0.109
penter+0.105
IT+0.101
isk+0.100
zer+0.100
Neg logit contributions
ls-0.174
bs-0.160
s-0.152
oms-0.151
ushes-0.145
Token on
Token ID319
Feature activation-1.92e-01

Pos logit contributions
gy+0.185
penter+0.170
IT+0.164
zer+0.162
isk+0.158
Neg logit contributions
ls-0.253
bs-0.236
oms-0.224
ushes-0.222
s-0.215
Token the
Token ID262
Feature activation-3.95e-02

Pos logit contributions
ls+0.439
bs+0.382
oms+0.371
s+0.369
ushes+0.359
Neg logit contributions
gy-0.399
penter-0.386
isk-0.369
IT-0.367
zer-0.367
Token equipment
Token ID5112
Feature activation+8.67e-02

Pos logit contributions
ls+0.109
bs+0.101
oms+0.096
ushes+0.096
s+0.095
Neg logit contributions
gy-0.060
penter-0.056
zer-0.054
IT-0.053
isk-0.052
Token.
Token ID13
Feature activation+1.91e-02

Pos logit contributions
gy+0.169
penter+0.158
zer+0.157
IT+0.155
isk+0.155
Neg logit contributions
ls-0.208
bs-0.184
s-0.181
oms-0.179
ushes-0.172
TokenThe
Token ID464
Feature activation+4.35e-02

Pos logit contributions
ls+0.106
bs+0.095
oms+0.093
s+0.090
ushes+0.090
Neg logit contributions
gy-0.063
penter-0.062
IT-0.059
zer-0.058
isk-0.057
Token dragon
Token ID10441
Feature activation+7.43e-02

Pos logit contributions
gy+0.073
penter+0.067
IT+0.066
isk+0.066
zer+0.065
Neg logit contributions
ls-0.118
bs-0.105
oms-0.102
s-0.100
ushes-0.098
Token and
Token ID290
Feature activation+5.96e-02

Pos logit contributions
gy+0.145
IT+0.141
isk+0.138
zer+0.135
penter+0.134
Neg logit contributions
ls-0.175
s-0.155
bs-0.151
oms-0.144
ushes-0.136
Token Princess
Token ID8449
Feature activation+1.03e-01

Pos logit contributions
gy+0.095
penter+0.089
IT+0.085
zer+0.084
isk+0.083
Neg logit contributions
ls-0.163
bs-0.148
oms-0.145
ushes-0.142
s-0.142
Token stayed
Token ID9658
Feature activation+2.06e-01

Pos logit contributions
gy+0.206
penter+0.197
zer+0.189
IT+0.188
isk+0.186
Neg logit contributions
ls-0.239
bs-0.213
oms-0.205
ushes-0.201
s-0.201
Token up
Token ID510
Feature activation-1.11e-01

Pos logit contributions
gy+0.407
penter+0.377
zer+0.360
 driving+0.354
IT+0.349
Neg logit contributions
ls-0.492
bs-0.454
oms-0.429
s-0.421
ushes-0.406
Token there
Token ID612
Feature activation-3.88e-02

Pos logit contributions
ls+0.246
bs+0.220
oms+0.209
s+0.205
ushes+0.203
Neg logit contributions
gy-0.236
penter-0.224
IT-0.214
zer-0.213
isk-0.210
Token for
Token ID329
Feature activation-5.47e-02

Pos logit contributions
ls+0.091
bs+0.082
s+0.080
oms+0.078
ushes+0.077
Neg logit contributions
gy-0.075
penter-0.073
IT-0.070
zer-0.070
isk-0.069
Token a
Token ID257
Feature activation+1.06e-01

Pos logit contributions
ls+0.132
bs+0.120
oms+0.112
ushes+0.111
s+0.109
Neg logit contributions
gy-0.103
penter-0.098
IT-0.095
uding-0.092
zer-0.091
Token while
Token ID981
Feature activation+8.24e-02

Pos logit contributions
gy+0.242
penter+0.234
IT+0.233
zer+0.227
isk+0.227
Neg logit contributions
ls-0.215
bs-0.191
s-0.184
oms-0.177
ushes-0.165
Token and
Token ID290
Feature activation+6.04e-02

Pos logit contributions
gy+0.166
IT+0.159
penter+0.159
isk+0.158
zer+0.155
Neg logit contributions
ls-0.193
bs-0.172
s-0.167
oms-0.166
ushes-0.155
Token.
Token ID13
Feature activation+1.27e-02

Pos logit contributions
IT+0.167
penter+0.160
gy+0.157
itor+0.157
zer+0.155
Neg logit contributions
ls-0.199
s-0.182
bs-0.176
oms-0.168
l-0.155
Token The
Token ID383
Feature activation+4.06e-02

Pos logit contributions
gy+0.024
penter+0.023
zer+0.022
IT+0.022
isk+0.022
Neg logit contributions
ls-0.030
bs-0.027
s-0.026
oms-0.026
ushes-0.026
Token bird
Token ID6512
Feature activation+8.30e-02

Pos logit contributions
gy+0.079
isk+0.074
penter+0.074
IT+0.074
uding+0.073
Neg logit contributions
ls-0.098
bs-0.085
oms-0.083
s-0.081
ushes-0.077
Token was
Token ID373
Feature activation+6.67e-02

Pos logit contributions
gy+0.154
penter+0.147
IT+0.144
zer+0.142
isk+0.140
Neg logit contributions
ls-0.205
bs-0.183
s-0.178
oms-0.174
ushes-0.169
Token so
Token ID523
Feature activation+1.17e-01

Pos logit contributions
gy+0.124
penter+0.121
IT+0.118
zer+0.117
isk+0.116
Neg logit contributions
ls-0.159
bs-0.141
s-0.137
oms-0.136
ushes-0.135
Token lively
Token ID29696
Feature activation-1.24e-01

Pos logit contributions
gy+0.223
 always+0.206
penter+0.202
 never+0.201
 driving+0.199
Neg logit contributions
ls-0.267
oms-0.248
bs-0.244
ushes-0.224
s-0.210
Token and
Token ID290
Feature activation+6.13e-02

Pos logit contributions
ls+0.307
s+0.267
oms+0.260
bs+0.257
ushes+0.222
Neg logit contributions
gy-0.243
isk-0.232
IT-0.231
penter-0.226
zer-0.217
Token so
Token ID523
Feature activation+1.17e-01

Pos logit contributions
gy+0.121
penter+0.116
zer+0.110
IT+0.110
isk+0.108
Neg logit contributions
ls-0.143
bs-0.127
oms-0.125
ushes-0.122
s-0.120
Token special
Token ID2041
Feature activation-1.14e-01

Pos logit contributions
gy+0.211
penter+0.198
lf+0.186
IT+0.183
zer+0.182
Neg logit contributions
ls-0.285
oms-0.262
bs-0.257
ushes-0.252
s-0.239
Token that
Token ID326
Feature activation+1.54e-01

Pos logit contributions
ls+0.279
s+0.244
bs+0.235
oms+0.231
ushes+0.203
Neg logit contributions
gy-0.227
isk-0.217
IT-0.215
zer-0.208
penter-0.206
Token she
Token ID673
Feature activation-1.42e-01

Pos logit contributions
gy+0.311
penter+0.297
zer+0.284
IT+0.283
isk+0.280
Neg logit contributions
ls-0.353
bs-0.312
oms-0.306
ushes-0.300
s-0.295
Token of
Token ID286
Feature activation-3.20e-01

Pos logit contributions
ls+0.343
bs+0.284
oms+0.268
s+0.263
ushes+0.256
Neg logit contributions
gy-0.646
penter-0.619
IT-0.606
zer-0.604
isk-0.601
Token you
Token ID345
Feature activation-4.74e-02

Pos logit contributions
ls+0.730
bs+0.645
oms+0.619
s+0.610
ushes+0.603
Neg logit contributions
gy-0.669
penter-0.636
IT-0.610
zer-0.610
isk-0.602
Token."
Token ID526
Feature activation-8.09e-03

Pos logit contributions
ls+0.114
bs+0.101
s+0.097
oms+0.097
ushes+0.095
Neg logit contributions
gy-0.092
penter-0.088
IT-0.085
zer-0.085
isk-0.083
Token
Token ID198
Feature activation-3.36e-02

Pos logit contributions
ls+0.024
oms+0.021
bs+0.021
ushes+0.020
led+0.019
Neg logit contributions
gy-0.010
 Johnson-0.009
penter-0.009
 because-0.009
isk-0.008
Token
Token ID198
Feature activation-3.32e-02

Pos logit contributions
ls+0.097
bs+0.090
oms+0.090
ushes+0.083
s+0.081
Neg logit contributions
gy-0.040
isk-0.038
zer-0.037
penter-0.036
lf-0.035
TokenShe
Token ID3347
Feature activation-1.12e-01

Pos logit contributions
ls+0.083
oms+0.076
bs+0.075
ushes+0.068
s+0.067
Neg logit contributions
isk-0.054
gy-0.053
zer-0.053
penter-0.053
Wh-0.052
Token hugged
Token ID41130
Feature activation-1.49e-01

Pos logit contributions
ls+0.274
bs+0.249
oms+0.238
ushes+0.236
s+0.234
Neg logit contributions
penter-0.197
gy-0.195
zer-0.186
isk-0.185
IT-0.182
Token Sara
Token ID24799
Feature activation-1.15e-01

Pos logit contributions
ls+0.351
bs+0.316
oms+0.310
s+0.300
ushes+0.294
Neg logit contributions
gy-0.291
penter-0.271
isk-0.266
zer-0.266
IT-0.264
Token.
Token ID13
Feature activation+1.91e-02

Pos logit contributions
ls+0.327
bs+0.295
oms+0.292
ushes+0.283
s+0.274
Neg logit contributions
gy-0.173
penter-0.157
zer-0.150
isk-0.150
IT-0.143
Token She
Token ID1375
Feature activation-1.14e-01

Pos logit contributions
gy+0.037
penter+0.035
IT+0.033
isk+0.033
zer+0.033
Neg logit contributions
ls-0.046
bs-0.040
oms-0.039
s-0.038
ushes-0.037
Token said
Token ID531
Feature activation-1.77e-01

Pos logit contributions
ls+0.224
bs+0.194
s+0.186
oms+0.179
ushes+0.177
Neg logit contributions
gy-0.275
penter-0.259
IT-0.257
zer-0.254
uding-0.251
Token what
Token ID644
Feature activation+2.00e-02

Pos logit contributions
ls+0.579
bs+0.497
oms+0.494
s+0.481
ushes+0.453
Neg logit contributions
gy-0.453
IT-0.421
penter-0.421
isk-0.413
zer-0.405
Token to
Token ID284
Feature activation-7.13e-03

Pos logit contributions
gy+0.041
penter+0.038
IT+0.037
isk+0.037
zer+0.037
Neg logit contributions
ls-0.047
bs-0.042
s-0.040
oms-0.040
ushes-0.039
Token say
Token ID910
Feature activation-1.32e-01

Pos logit contributions
ls+0.018
bs+0.016
oms+0.015
s+0.015
ushes+0.015
Neg logit contributions
gy-0.013
penter-0.012
zer-0.012
IT-0.012
isk-0.012
Token.
Token ID13
Feature activation+2.48e-02

Pos logit contributions
ls+0.317
bs+0.285
oms+0.275
s+0.272
ushes+0.264
Neg logit contributions
gy-0.258
penter-0.241
IT-0.234
zer-0.234
isk-0.233
Token They
Token ID1119
Feature activation-9.86e-02

Pos logit contributions
gy+0.046
penter+0.044
IT+0.042
zer+0.041
uding+0.041
Neg logit contributions
ls-0.061
bs-0.054
oms-0.052
s-0.049
ushes-0.049
Token look
Token ID804
Feature activation-1.05e-01

Pos logit contributions
ls+0.232
bs+0.203
oms+0.198
s+0.195
ushes+0.194
Neg logit contributions
gy-0.195
penter-0.186
IT-0.179
zer-0.179
isk-0.177
Token at
Token ID379
Feature activation-1.62e-01

Pos logit contributions
ls+0.244
ushes+0.226
oms+0.226
bs+0.221
s+0.214
Neg logit contributions
gy-0.167
 because-0.166
penter-0.165
 driving-0.154
 then-0.151
Token each
Token ID1123
Feature activation+1.02e-01

Pos logit contributions
ls+0.404
bs+0.362
oms+0.347
s+0.346
ushes+0.341
Neg logit contributions
gy-0.300
penter-0.285
zer-0.274
IT-0.273
isk-0.270
Token other
Token ID584
Feature activation-8.29e-02

Pos logit contributions
gy+0.203
penter+0.193
IT+0.187
zer+0.187
isk+0.183
Neg logit contributions
ls-0.236
bs-0.209
oms-0.205
ushes-0.200
s-0.190
Token.
Token ID13
Feature activation+2.56e-02

Pos logit contributions
ls+0.218
bs+0.197
oms+0.190
s+0.188
ushes+0.186
Neg logit contributions
gy-0.142
penter-0.134
zer-0.128
IT-0.128
isk-0.126
Token They
Token ID1119
Feature activation-9.76e-02

Pos logit contributions
gy+0.051
penter+0.048
IT+0.047
zer+0.046
isk+0.045
Neg logit contributions
ls-0.060
bs-0.053
oms-0.051
s-0.050
ushes-0.049
Token next
Token ID1306
Feature activation+2.23e-01

Pos logit contributions
ls+0.101
oms+0.087
s+0.086
bs+0.086
ushes+0.080
Neg logit contributions
gy-0.068
isk-0.064
penter-0.063
IT-0.062
zer-0.062
Token day
Token ID1110
Feature activation+7.36e-02

Pos logit contributions
gy+0.426
IT+0.403
zer+0.401
penter+0.401
isk+0.400
Neg logit contributions
ls-0.527
oms-0.476
bs-0.468
s-0.461
ushes-0.431
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
gy+0.143
penter+0.135
zer+0.131
IT+0.131
isk+0.131
Neg logit contributions
ls-0.178
bs-0.160
s-0.155
oms-0.155
ushes-0.149
Token John
Token ID1757
Feature activation+2.61e-01

Pos logit contributions
gy+0.025
zer+0.024
isk+0.024
penter+0.023
lf+0.023
Neg logit contributions
ls-0.026
s-0.024
bs-0.024
ushes-0.024
oms-0.023
Token was
Token ID373
Feature activation+7.18e-02

Pos logit contributions
gy+0.482
penter+0.466
IT+0.465
isk+0.454
zer+0.453
Neg logit contributions
ls-0.638
bs-0.576
s-0.574
oms-0.545
ushes-0.519
Token sad
Token ID6507
Feature activation-1.91e-01

Pos logit contributions
gy+0.142
penter+0.137
IT+0.136
zer+0.135
isk+0.134
Neg logit contributions
ls-0.158
bs-0.143
ushes-0.138
oms-0.137
s-0.137
Token because
Token ID780
Feature activation+3.64e-01

Pos logit contributions
ls+0.513
bs+0.463
s+0.462
oms+0.452
ushes+0.427
Neg logit contributions
gy-0.318
penter-0.307
IT-0.306
zer-0.301
isk-0.297
Token the
Token ID262
Feature activation-3.89e-02

Pos logit contributions
gy+0.717
isk+0.663
penter+0.663
IT+0.661
zer+0.648
Neg logit contributions
ls-0.888
bs-0.784
oms-0.757
s-0.757
ushes-0.731
Token generous
Token ID14431
Feature activation-4.29e-02

Pos logit contributions
ls+0.100
bs+0.089
oms+0.087
s+0.085
ushes+0.084
Neg logit contributions
gy-0.069
penter-0.066
zer-0.064
IT-0.063
isk-0.063
Token drain
Token ID14782
Feature activation+3.67e-02

Pos logit contributions
ls+0.115
bs+0.104
oms+0.099
s+0.098
ushes+0.098
Neg logit contributions
gy-0.072
penter-0.067
IT-0.064
zer-0.064
isk-0.062
Token was
Token ID373
Feature activation+7.16e-02

Pos logit contributions
gy+0.067
IT+0.064
penter+0.063
isk+0.062
zer+0.062
Neg logit contributions
ls-0.091
s-0.080
bs-0.080
oms-0.078
ushes-0.073
Token in
Token ID287
Feature activation+1.30e-02

Pos logit contributions
ls+0.238
bs+0.214
ushes+0.211
oms+0.208
s+0.199
Neg logit contributions
gy-0.131
penter-0.128
zer-0.119
IT-0.117
lf-0.115
Token the
Token ID262
Feature activation-3.85e-02

Pos logit contributions
gy+0.024
penter+0.023
zer+0.022
IT+0.022
isk+0.021
Neg logit contributions
ls-0.032
bs-0.029
oms-0.028
s-0.027
ushes-0.027
Token village
Token ID7404
Feature activation+1.35e-01

Pos logit contributions
ls+0.114
bs+0.104
oms+0.101
s+0.100
ushes+0.099
Neg logit contributions
gy-0.053
penter-0.050
zer-0.047
IT-0.047
isk-0.046
Token watched
Token ID7342
Feature activation+1.36e-01

Pos logit contributions
gy+0.240
penter+0.231
zer+0.218
IT+0.216
lf+0.212
Neg logit contributions
ls-0.334
bs-0.297
ushes-0.286
oms-0.285
s-0.275
Token with
Token ID351
Feature activation+5.12e-02

Pos logit contributions
gy+0.243
penter+0.231
zer+0.224
IT+0.221
isk+0.217
Neg logit contributions
ls-0.340
bs-0.304
oms-0.290
ushes-0.289
s-0.289
Token smiles
Token ID21845
Feature activation-4.08e-01

Pos logit contributions
gy+0.094
penter+0.091
zer+0.089
IT+0.087
 driving+0.086
Neg logit contributions
ls-0.125
bs-0.107
s-0.105
oms-0.103
ushes-0.098
Token on
Token ID319
Feature activation-1.87e-01

Pos logit contributions
ls+0.996
bs+0.894
s+0.871
oms+0.865
ushes+0.835
Neg logit contributions
gy-0.757
penter-0.721
IT-0.704
zer-0.700
isk-0.700
Token their
Token ID511
Feature activation-2.02e-01

Pos logit contributions
ls+0.442
bs+0.394
oms+0.380
s+0.371
ushes+0.370
Neg logit contributions
gy-0.366
penter-0.347
zer-0.338
IT-0.336
isk-0.329
Token faces
Token ID6698
Feature activation-1.76e-01

Pos logit contributions
ls+0.465
bs+0.412
oms+0.396
ushes+0.393
s+0.388
Neg logit contributions
gy-0.408
penter-0.384
zer-0.382
IT-0.372
uding-0.360
Token and
Token ID290
Feature activation+6.42e-02

Pos logit contributions
ls+0.450
bs+0.404
oms+0.389
s+0.384
ushes+0.382
Neg logit contributions
gy-0.310
penter-0.293
zer-0.281
IT-0.280
isk-0.276
Token cl
Token ID537
Feature activation-9.27e-02

Pos logit contributions
gy+0.126
penter+0.119
zer+0.114
IT+0.113
 driving+0.112
Neg logit contributions
ls-0.151
bs-0.131
oms-0.130
ushes-0.126
s-0.124
Token a
Token ID257
Feature activation+1.07e-01

Pos logit contributions
ls+0.389
bs+0.331
s+0.324
oms+0.320
ushes+0.297
Neg logit contributions
penter-0.372
zer-0.358
eman-0.356
gy-0.355
isk-0.355
Token big
Token ID1263
Feature activation-2.45e-01

Pos logit contributions
gy+0.232
penter+0.221
IT+0.216
isk+0.216
zer+0.214
Neg logit contributions
ls-0.233
bs-0.201
oms-0.195
s-0.193
ushes-0.187
Token box
Token ID3091
Feature activation-4.68e-02

Pos logit contributions
ls+0.572
oms+0.506
bs+0.485
s+0.472
ushes+0.438
Neg logit contributions
gy-0.459
isk-0.453
IT-0.445
penter-0.433
uding-0.431
Token!"
Token ID2474
Feature activation-2.96e-02

Pos logit contributions
ls+0.106
bs+0.091
oms+0.090
s+0.089
ushes+0.084
Neg logit contributions
gy-0.097
IT-0.094
zer-0.093
isk-0.092
penter-0.091
Token Sam
Token ID3409
Feature activation+1.59e-01

Pos logit contributions
ls+0.088
bs+0.080
oms+0.079
ushes+0.078
s+0.077
Neg logit contributions
gy-0.040
penter-0.037
zer-0.035
IT-0.035
isk-0.033
Token says
Token ID1139
Feature activation-4.20e-01

Pos logit contributions
gy+0.492
penter+0.479
IT+0.464
zer+0.462
isk+0.458
Neg logit contributions
ls-0.192
bs-0.152
oms-0.140
ushes-0.140
s-0.130
Token.
Token ID13
Feature activation+2.34e-02

Pos logit contributions
ls+1.121
bs+0.993
oms+0.971
ushes+0.963
s+0.944
Neg logit contributions
gy-0.713
penter-0.652
zer-0.619
IT-0.601
isk-0.597
Token "
Token ID366
Feature activation+1.31e-01

Pos logit contributions
gy+0.038
penter+0.036
zer+0.034
IT+0.033
isk+0.033
Neg logit contributions
ls-0.063
bs-0.057
oms-0.055
ushes-0.054
s-0.054
TokenWe
Token ID1135
Feature activation-7.62e-02

Pos logit contributions
gy+0.258
penter+0.243
isk+0.237
IT+0.235
zer+0.231
Neg logit contributions
ls-0.305
bs-0.281
s-0.271
ushes-0.268
oms-0.267
Token can
Token ID460
Feature activation-2.22e-01

Pos logit contributions
ls+0.137
bs+0.117
oms+0.110
s+0.109
ushes+0.107
Neg logit contributions
gy-0.194
penter-0.187
zer-0.182
IT-0.182
isk-0.180
Token make
Token ID787
Feature activation+1.04e-02

Pos logit contributions
ls+0.481
bs+0.446
s+0.440
oms+0.420
ushes+0.399
Neg logit contributions
IT-0.436
isk-0.435
gy-0.425
zer-0.422
penter-0.417
Token to
Token ID284
Feature activation-8.64e-03

Pos logit contributions
ls+0.173
bs+0.155
oms+0.148
s+0.147
ushes+0.146
Neg logit contributions
gy-0.142
penter-0.137
isk-0.132
IT-0.131
zer-0.129
Token go
Token ID467
Feature activation-3.99e-02

Pos logit contributions
ls+0.017
bs+0.017
oms+0.015
ushes+0.014
s+0.014
Neg logit contributions
penter-0.018
gy-0.017
itor-0.017
IT-0.017
isk-0.017
Token home
Token ID1363
Feature activation-5.02e-02

Pos logit contributions
ls+0.096
bs+0.086
oms+0.083
s+0.082
ushes+0.081
Neg logit contributions
gy-0.077
penter-0.073
zer-0.070
IT-0.070
isk-0.069
Token,"
Token ID553
Feature activation-1.59e-01

Pos logit contributions
ls+0.119
bs+0.105
oms+0.101
s+0.101
ushes+0.099
Neg logit contributions
gy-0.099
penter-0.094
IT-0.091
zer-0.091
isk-0.090
Token Mia
Token ID32189
Feature activation-4.56e-02

Pos logit contributions
ls+0.384
bs+0.346
oms+0.340
ushes+0.330
s+0.325
Neg logit contributions
gy-0.301
penter-0.288
zer-0.277
IT-0.269
lf-0.265
Token says
Token ID1139
Feature activation-4.16e-01

Pos logit contributions
ls+0.059
bs+0.048
ushes+0.047
oms+0.044
s+0.039
Neg logit contributions
gy-0.135
penter-0.132
IT-0.126
zer-0.126
isk-0.126
Token.
Token ID13
Feature activation+3.02e-02

Pos logit contributions
ls+1.075
bs+0.961
oms+0.929
ushes+0.920
s+0.913
Neg logit contributions
gy-0.735
penter-0.687
zer-0.660
IT-0.650
isk-0.642
Token\n
Token ID198
Feature activation-2.91e-02

Pos logit contributions
gy+0.055
penter+0.052
IT+0.050
zer+0.050
isk+0.048
Neg logit contributions
ls-0.075
bs-0.067
oms-0.065
ushes-0.064
s-0.062
Token\n
Token ID198
Feature activation-2.90e-02

Pos logit contributions
ls+0.073
bs+0.066
oms+0.065
ushes+0.063
s+0.061
Neg logit contributions
gy-0.051
penter-0.048
zer-0.047
IT-0.046
isk-0.045
Token"
Token ID1
Feature activation-1.01e-01

Pos logit contributions
ls+0.065
bs+0.057
oms+0.055
s+0.054
ushes+0.054
Neg logit contributions
gy-0.061
penter-0.059
zer-0.057
IT-0.057
isk-0.056
TokenBut
Token ID1537
Feature activation+2.47e-01

Pos logit contributions
ls+0.252
bs+0.226
oms+0.217
s+0.216
ushes+0.213
Neg logit contributions
gy-0.188
penter-0.176
zer-0.170
IT-0.170
isk-0.168
Token sorry
Token ID7926
Feature activation-1.74e-01

Pos logit contributions
ls+0.873
oms+0.769
bs+0.762
ushes+0.754
s+0.749
Neg logit contributions
gy-0.539
penter-0.515
zer-0.509
 driving-0.490
 accident-0.474
Token and
Token ID290
Feature activation+7.35e-02

Pos logit contributions
ls+0.449
bs+0.404
oms+0.391
ushes+0.390
s+0.375
Neg logit contributions
gy-0.303
penter-0.284
zer-0.272
isk-0.270
IT-0.265
Token brave
Token ID14802
Feature activation-1.11e-02

Pos logit contributions
gy+0.134
 driving+0.128
penter+0.128
 owl+0.123
 never+0.122
Neg logit contributions
ushes-0.158
ls-0.154
bs-0.146
oms-0.142
s-0.132
Token.
Token ID13
Feature activation+2.32e-02

Pos logit contributions
ls+0.029
bs+0.026
oms+0.025
ushes+0.025
s+0.024
Neg logit contributions
gy-0.019
penter-0.018
zer-0.017
IT-0.017
isk-0.017
Token He
Token ID679
Feature activation+2.39e-02

Pos logit contributions
gy+0.038
penter+0.036
zer+0.034
IT+0.034
isk+0.033
Neg logit contributions
ls-0.063
bs-0.057
oms-0.056
ushes-0.054
s-0.054
Token says
Token ID1139
Feature activation-4.18e-01

Pos logit contributions
gy+0.064
penter+0.063
zer+0.060
IT+0.060
isk+0.060
Neg logit contributions
ls-0.035
bs-0.032
ushes-0.030
oms-0.029
s-0.028
Token,
Token ID11
Feature activation-1.19e-02

Pos logit contributions
ls+1.149
ushes+1.092
bs+1.065
oms+1.042
s+0.994
Neg logit contributions
gy-0.618
penter-0.539
zer-0.537
 driving-0.503
 Squirrel-0.474
Token "
Token ID366
Feature activation+1.34e-01

Pos logit contributions
ls+0.030
bs+0.027
s+0.027
oms+0.026
ushes+0.025
Neg logit contributions
gy-0.021
IT-0.020
penter-0.020
isk-0.019
uding-0.019
TokenDon
Token ID3987
Feature activation-2.63e-02

Pos logit contributions
gy+0.225
penter+0.214
zer+0.207
IT+0.205
isk+0.199
Neg logit contributions
ls-0.359
bs-0.323
oms-0.316
ushes-0.308
s-0.306
Token't
Token ID470
Feature activation+1.54e-01

Pos logit contributions
ls+0.085
bs+0.080
oms+0.077
ushes+0.077
s+0.075
Neg logit contributions
gy-0.028
penter-0.026
IT-0.024
zer-0.024
isk-0.024
Token worry
Token ID5490
Feature activation+7.66e-02

Pos logit contributions
penter+0.270
gy+0.267
IT+0.251
zer+0.250
isk+0.248
Neg logit contributions
ls-0.392
bs-0.357
s-0.347
oms-0.334
ushes-0.321
Token boy
Token ID2933
Feature activation-3.33e-02

Pos logit contributions
gy+0.013
penter+0.013
zer+0.012
IT+0.012
isk+0.012
Neg logit contributions
ls-0.013
bs-0.012
oms-0.011
ushes-0.011
s-0.011
Token who
Token ID508
Feature activation+3.07e-01

Pos logit contributions
s+0.084
bs+0.083
ls+0.079
able+0.069
oms+0.063
Neg logit contributions
IT-0.061
itor-0.055
achable-0.054
uding-0.054
isk-0.053
Token was
Token ID373
Feature activation+7.63e-02

Pos logit contributions
gy+0.464
penter+0.429
zer+0.410
IT+0.405
isk+0.401
Neg logit contributions
ls-0.880
bs-0.794
oms-0.777
s-0.762
ushes-0.759
Token very
Token ID845
Feature activation-3.67e-02

Pos logit contributions
IT+0.185
zer+0.176
isk+0.175
eman+0.175
uding+0.175
Neg logit contributions
ls-0.129
bs-0.116
s-0.109
oms-0.100
ushes-0.100
Token depend
Token ID4745
Feature activation+6.90e-02

Pos logit contributions
ls+0.083
oms+0.072
ushes+0.072
bs+0.072
s+0.068
Neg logit contributions
gy-0.076
penter-0.070
 driving-0.068
zer-0.067
 always-0.065
Tokenable
Token ID540
Feature activation-5.01e-01

Pos logit contributions
gy+0.209
penter+0.196
zer+0.195
 accident+0.193
IT+0.189
Neg logit contributions
ls-0.089
oms-0.079
ushes-0.068
bs-0.062
s-0.050
Token.
Token ID13
Feature activation+2.48e-02

Pos logit contributions
ls+0.826
s+0.799
bs+0.660
oms+0.643
ushes+0.569
Neg logit contributions
IT-1.287
isk-1.221
gy-1.211
zer-1.192
penter-1.179
Token He
Token ID679
Feature activation+2.58e-02

Pos logit contributions
gy+0.056
zer+0.055
penter+0.054
isk+0.053
IT+0.053
Neg logit contributions
ls-0.043
s-0.041
bs-0.040
ushes-0.040
oms-0.038
Token was
Token ID373
Feature activation+7.52e-02

Pos logit contributions
IT+0.059
uding+0.058
itor+0.056
ologist+0.055
gy+0.054
Neg logit contributions
ls-0.046
s-0.046
bs-0.043
l-0.037
led-0.035
Token always
Token ID1464
Feature activation+3.84e-01

Pos logit contributions
IT+0.163
gy+0.157
zer+0.156
isk+0.154
uding+0.154
Neg logit contributions
ls-0.144
bs-0.132
ushes-0.122
s-0.122
oms-0.116
Token doing
Token ID1804
Feature activation+2.09e-01

Pos logit contributions
gy+0.748
penter+0.718
zer+0.692
IT+0.691
isk+0.684
Neg logit contributions
ls-0.901
bs-0.810
s-0.771
oms-0.771
ushes-0.762