TOP ACTIVATIONS
MAX = 0.328

question , and she courage ously answered it .
my showed Billy his screw driver and they decided to build
He takes the screw driver and opens the hole .
the screw with this screw driver . Let 's find more
big one was named Max and the small one was named
merry - go - round and make it spin .
dog named Sam came running and tried to take the steak
t catch up and just watched Jack reach the
were two brothers named Jeff and Tom . They decided to
thing was getting very excited and decided to jump right in
hide - and - seek and tag . Tim my had
a little girl named Mia and her dog , Buddy ,
boy came run n ig and kicked the old engine .
Jane was picking up shrimp and putting them into her bucket
three - years - old and was very inqu is itive
It was funny - looking and he didn 't know what
he got his little screw driver and started to screw in
were two siblings named Anne and Jake who loved to play
's friend called him again and said , " I 'm
were two brothers called Jack and Tom . They were best

MIN ACTIVATIONS
MIN = -0.388

should be compassionate . Comp assion ate means you care about
car . " V room , v room ,
car . " V room , v room !"
car and said , " V anish !" The toy car
bows . V era was so happy .
truck . " V room , v room !"
a sign that said " V est for sale , one
doll . " V room , v room ,
again . V ince was very happy and
her mission ! V iolet grinned and knew that
other . " V room , v room ,
roaring . " V room , v room ,
V icky ?" V icky said in a shaky
will try !" V icky took a deep breath
her ! V iv ian enjoyed playing with
car . " V room , v room ,
wand and said , " V anish !" Suddenly , his
inside . " V room , v room !"
?" Sophie explained , " V elvet is a special kind
noises and moves . " V room , v room !"

INTERVAL 0.149 - 0.328
CONTAINS 6.391%

. He approached the hand and he realized it was made
soon !" The boat smiled and knew it had a new
was walking in his garden and he saw a pit .
the floor . They laughed and played in the sticky mess
to the playground and climbed and slid and swung . They

INTERVAL -0.030 - 0.149
CONTAINS 58.467%

, Sarah found a big bowl and spoon . Then ,
<|endoftext|> One day , there was a little boy who was
" I am taking a great bath ,
I can give you some fuel , but you must promise
. Let me add it to your tablet ."

INTERVAL -0.209 - -0.030
CONTAINS 30.445%

nice . You owe Tommy an apology and a new car
the sun . It was warm and bright . Ben did
a picture of Max and the little girl to remember their
Thank you , sir . You are wise ," Tom says
in the toy box . Finally , they found it in

INTERVAL -0.388 - -0.209
CONTAINS 4.696%

knew that her story was special , just like the por
the pond and disappears . L ily and Tom
even more . The truck driver opened the door
!" And they did . The end . <|endoftext|> Once upon
me !" she said . " Okay , L
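
The four INTERVAL blocks above look like equal-width bins spanning the reported MIN and MAX. That is an inference from the numbers, not something the dump states. The sketch below recomputes the bin edges under that assumption and checks that the CONTAINS shares add up to roughly 100%. The names `bin_edges` and `N_BINS` are hypothetical and do not come from the tool that produced this output.

```python
# Reported extremes, copied from the MAX and MIN lines of this dump.
MAX_ACT = 0.328
MIN_ACT = -0.388
N_BINS = 4  # assumption: the four INTERVAL blocks are equal-width bins


def bin_edges(lo, hi, n):
    """Return n + 1 evenly spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n
    return [round(lo + i * width, 3) for i in range(n + 1)]


edges = bin_edges(MIN_ACT, MAX_ACT, N_BINS)

# Pair consecutive edges, highest interval first, matching the listing order.
intervals = list(zip(edges[:-1], edges[1:]))[::-1]

# Shares copied from the CONTAINS lines, in the same top-to-bottom order.
shares = [6.391, 58.467, 30.445, 4.696]

for (lo, hi), share in zip(intervals, shares):
    print(f"INTERVAL {lo:.3f} - {hi:.3f}  CONTAINS {share:.3f}%")
print(f"total {sum(shares):.3f}%")
```

If the recomputed edges match the INTERVAL lines, the equal-width reading is consistent with the data. The largest share sits in the bin just below 0.149, so most sampled tokens have activations close to zero.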
Token question
Token ID1808
Feature activation-1.20e-02

Pos logit contributions
icious+0.022
rous+0.021
reci+0.020
 disl+0.020
Little+0.020
Neg logit contributions
 and-0.018
 in-0.016
 on-0.016
 he-0.016
 under-0.015
Token,
Token ID11
Feature activation+1.54e-01

Pos logit contributions
Ms+0.021
One+0.021
Little+0.021
icious+0.020
 difference+0.020
Neg logit contributions
ed-0.012
 knees-0.011
driver-0.010
 under-0.010
 hid-0.010
Token and
Token ID290
Feature activation+2.74e-01

Pos logit contributions
ed+0.152
 under+0.152
driver+0.140
 and+0.137
 in+0.128
Neg logit contributions
Little-0.312
icious-0.310
Ms-0.298
Everybody-0.292
One-0.291
Token she
Token ID673
Feature activation+1.83e-01

Pos logit contributions
 under+0.265
 and+0.262
driver+0.253
ed+0.249
 in+0.220
Neg logit contributions
Little-0.552
icious-0.533
Everybody-0.524
Ms-0.524
Br-0.518
Token courage
Token ID11917
Feature activation+2.34e-02

Pos logit contributions
 under+0.227
 and+0.220
ed+0.209
driver+0.209
 in+0.199
Neg logit contributions
Little-0.331
icious-0.330
Everybody-0.312
Ms-0.311
E-0.307
Tokenously
Token ID3481
Feature activation+3.22e-01

Pos logit contributions
 under+0.022
driver+0.021
 knees+0.020
aces+0.019
ed+0.019
Neg logit contributions
Little-0.039
qu-0.039
reci-0.039
is-0.038
 consider-0.038
Token answered
Token ID9373
Feature activation-2.22e-02

Pos logit contributions
 under+0.363
driver+0.352
 and+0.338
ed+0.331
 in+0.306
Neg logit contributions
Little-0.580
icious-0.577
 disl-0.562
reci-0.554
Everybody-0.549
Token it
Token ID340
Feature activation+1.16e-01

Pos logit contributions
Little+0.044
icious+0.044
Ms+0.042
Everybody+0.042
Everyone+0.042
Neg logit contributions
 under-0.021
driver-0.021
ed-0.020
 and-0.020
 knees-0.018
Token.
Token ID13
Feature activation+5.94e-02

Pos logit contributions
 under+0.105
 and+0.103
ed+0.095
driver+0.094
 in+0.089
Neg logit contributions
icious-0.247
Little-0.244
Ms-0.233
Everybody-0.233
reci-0.232
Token
Token ID220
Feature activation-1.54e-01

Pos logit contributions
 and+0.089
 under+0.083
ed+0.081
 in+0.079
 he+0.079
Neg logit contributions
icious-0.104
reci-0.096
Little-0.096
rous-0.094
 difference-0.094
Token
Token ID198
Feature activation-1.85e-01

Pos logit contributions
icious+0.259
ograph+0.240
rous+0.239
Little+0.238
 disl+0.233
Neg logit contributions
 and-0.259
ed-0.237
 under-0.234
 in-0.230
driver-0.222
Tokenmy
Token ID1820
Feature activation+8.78e-02

Pos logit contributions
Little+0.068
icious+0.065
reci+0.063
Ms+0.062
One+0.061
Neg logit contributions
 under-0.053
driver-0.052
ed-0.051
 and-0.048
 in-0.047
Token showed
Token ID3751
Feature activation+2.07e-02

Pos logit contributions
 under+0.102
driver+0.099
ed+0.095
 in+0.085
 on+0.084
Neg logit contributions
Little-0.161
icious-0.153
 disl-0.151
reci-0.149
Ms-0.149
Token Billy
Token ID15890
Feature activation+3.03e-02

Pos logit contributions
 and+0.016
 under+0.016
ed+0.014
 in+0.014
driver+0.013
Neg logit contributions
icious-0.049
Little-0.048
reci-0.046
Everybody-0.046
Ms-0.046
Token his
Token ID465
Feature activation+9.66e-02

Pos logit contributions
 under+0.031
 and+0.030
ed+0.029
driver+0.028
 in+0.026
Neg logit contributions
icious-0.061
Little-0.061
Everybody-0.058
Ms-0.058
 disl-0.058
Token screw
Token ID9580
Feature activation+1.11e-01

Pos logit contributions
 and+0.126
 under+0.125
ed+0.117
 in+0.114
driver+0.112
Neg logit contributions
icious-0.174
Little-0.171
Everybody-0.162
reci-0.161
Ms-0.161
Tokendriver
Token ID26230
Feature activation+3.18e-01

Pos logit contributions
 explorers+0.068
 under+0.067
ously+0.061
edly+0.061
 hid+0.056
Neg logit contributions
assion-0.222
One-0.211
is-0.208
Each-0.207
V-0.207
Token and
Token ID290
Feature activation+2.87e-01

Pos logit contributions
 under+0.179
ed+0.168
 knees+0.164
driver+0.149
 explorers+0.141
Neg logit contributions
One-0.695
assion-0.694
icious-0.691
reci-0.680
 disl-0.679
Token they
Token ID484
Feature activation+2.01e-01

Pos logit contributions
 under+0.291
ed+0.269
 and+0.255
aces+0.246
 in+0.240
Neg logit contributions
Little-0.566
 disl-0.555
Everybody-0.542
Everyone-0.537
assion-0.527
Token decided
Token ID3066
Feature activation+1.47e-01

Pos logit contributions
 under+0.194
 and+0.193
ed+0.183
driver+0.180
edly+0.160
Neg logit contributions
Little-0.384
reci-0.377
icious-0.377
Everybody-0.376
assion-0.373
Token to
Token ID284
Feature activation+1.37e-01

Pos logit contributions
ed+0.118
 under+0.114
aces+0.111
driver+0.105
edly+0.094
Neg logit contributions
assion-0.310
 disl-0.310
Everybody-0.297
Everyone-0.294
One-0.293
Token build
Token ID1382
Feature activation+3.78e-02

Pos logit contributions
 under+0.218
ed+0.218
 and+0.209
driver+0.194
 in+0.192
Neg logit contributions
icious-0.202
Little-0.201
Everybody-0.192
 disl-0.188
Everyone-0.188
Token
Token ID198
Feature activation-1.24e-01

Pos logit contributions
 difference+0.192
icious+0.190
assion+0.178
reci+0.177
redible+0.176
Neg logit contributions
 and-0.178
 he-0.164
 He-0.158
 in-0.153
 under-0.144
TokenHe
Token ID1544
Feature activation+1.63e-01

Pos logit contributions
 difference+0.195
reci+0.180
icious+0.176
assion+0.175
 former+0.165
Neg logit contributions
 and-0.213
 he-0.198
 in-0.184
 He-0.182
 under-0.172
Token takes
Token ID2753
Feature activation+1.67e-02

Pos logit contributions
 under+0.155
driver+0.148
ed+0.144
 and+0.141
 in+0.130
Neg logit contributions
Little-0.327
icious-0.319
reci-0.312
Everybody-0.310
E-0.307
Token the
Token ID262
Feature activation-5.48e-02

Pos logit contributions
 under+0.017
 and+0.017
ed+0.016
driver+0.015
 in+0.015
Neg logit contributions
Little-0.033
icious-0.033
Everybody-0.032
E-0.032
Ms-0.032
Token screw
Token ID9580
Feature activation+1.15e-01

Pos logit contributions
icious+0.101
Little+0.099
Everybody+0.095
reci+0.094
Ms+0.094
Neg logit contributions
 and-0.069
 under-0.068
ed-0.063
 in-0.062
driver-0.061
Tokendriver
Token ID26230
Feature activation+3.16e-01

Pos logit contributions
 knees+0.081
 shrugged+0.081
 hid+0.078
edly+0.077
 explorers+0.076
Neg logit contributions
Little-0.215
is-0.213
qu-0.207
 difference-0.206
E-0.205
Token and
Token ID290
Feature activation+2.92e-01

Pos logit contributions
 knees+0.192
ed+0.180
 under+0.179
edly+0.174
 ear+0.172
Neg logit contributions
is-0.650
Little-0.648
One-0.645
reci-0.639
Everybody-0.630
Token opens
Token ID9808
Feature activation+9.63e-02

Pos logit contributions
 under+0.389
fully+0.388
ed+0.383
ously+0.383
edly+0.366
Neg logit contributions
Little-0.437
 disl-0.406
V-0.395
E-0.394
reci-0.389
Token the
Token ID262
Feature activation-6.24e-02

Pos logit contributions
 and+0.082
 under+0.079
ed+0.071
 in+0.070
driver+0.066
Neg logit contributions
icious-0.223
Little-0.217
reci-0.209
Everybody-0.209
Ms-0.208
Token hole
Token ID7604
Feature activation+9.01e-02

Pos logit contributions
icious+0.135
Little+0.132
Everybody+0.128
 disl+0.127
reci+0.126
Neg logit contributions
 and-0.063
 under-0.061
 in-0.055
ed-0.054
driver-0.053
Token.
Token ID13
Feature activation+7.27e-02

Pos logit contributions
driver+0.112
 hid+0.103
 knees+0.103
 wouldn+0.100
 ear+0.100
Neg logit contributions
is-0.125
One-0.117
icious-0.116
Little-0.115
 difference-0.115
Token the
Token ID262
Feature activation-5.85e-02

Pos logit contributions
 and+0.050
 under+0.049
ed+0.044
 in+0.042
driver+0.041
Neg logit contributions
icious-0.154
Little-0.151
Everybody-0.146
reci-0.145
Ms-0.145
Token screw
Token ID9580
Feature activation+1.13e-01

Pos logit contributions
icious+0.106
Little+0.103
Everybody+0.098
 disl+0.098
Ms+0.098
Neg logit contributions
 and-0.078
 under-0.077
 in-0.072
ed-0.070
 on-0.069
Token with
Token ID351
Feature activation+2.73e-02

Pos logit contributions
 under+0.079
 knees+0.079
ed+0.075
ously+0.074
edly+0.070
Neg logit contributions
Little-0.229
icious-0.228
Everybody-0.219
redible-0.218
 difference-0.217
Token this
Token ID428
Feature activation-8.02e-02

Pos logit contributions
 and+0.031
 under+0.030
 in+0.027
ed+0.027
driver+0.026
Neg logit contributions
icious-0.055
Little-0.054
Everybody-0.052
Ms-0.052
reci-0.052
Token screw
Token ID9580
Feature activation+1.09e-01

Pos logit contributions
icious+0.153
Little+0.149
Everybody+0.143
Ms+0.142
reci+0.142
Neg logit contributions
 and-0.099
 under-0.097
ed-0.089
 in-0.089
driver-0.086
Tokendriver
Token ID26230
Feature activation+3.15e-01

Pos logit contributions
 under+0.088
 knees+0.081
ed+0.079
ously+0.077
edly+0.074
Neg logit contributions
icious-0.208
Little-0.206
assion-0.199
ograph-0.197
 difference-0.197
Token.
Token ID13
Feature activation+7.16e-02

Pos logit contributions
 under+0.240
ed+0.228
 and+0.220
driver+0.201
 knees+0.199
Neg logit contributions
icious-0.693
Little-0.682
reci-0.656
Everybody-0.656
 disl-0.650
Token Let
Token ID3914
Feature activation-2.84e-03

Pos logit contributions
 and+0.089
 under+0.080
driver+0.078
 in+0.076
ed+0.073
Neg logit contributions
icious-0.138
rous-0.133
Little-0.131
reci-0.130
 difference-0.129
Token's
Token ID338
Feature activation+3.06e-02

Pos logit contributions
icious+0.007
 former+0.007
 disl+0.007
 lifetime+0.007
 sym+0.007
Neg logit contributions
 and-0.003
 in-0.002
 on-0.002
 out-0.002
 under-0.002
Token find
Token ID1064
Feature activation+3.12e-02

Pos logit contributions
 under+0.039
 and+0.038
ed+0.037
driver+0.035
 in+0.035
Neg logit contributions
icious-0.056
Little-0.055
Everybody-0.052
reci-0.051
 difference-0.051
Token more
Token ID517
Feature activation+3.60e-03

Pos logit contributions
 and+0.026
 under+0.024
 in+0.023
 out+0.022
 on+0.021
Neg logit contributions
icious-0.080
 disl-0.075
reci-0.075
 sym-0.074
 former-0.073
Token big
Token ID1263
Feature activation-8.10e-02

Pos logit contributions
icious+0.214
reci+0.199
Little+0.199
rous+0.194
E+0.193
Neg logit contributions
 and-0.297
 under-0.281
ed-0.276
 in-0.268
driver-0.264
Token one
Token ID530
Feature activation+7.25e-02

Pos logit contributions
Little+0.130
icious+0.129
 difference+0.129
 disl+0.124
 sym+0.122
Neg logit contributions
ed-0.112
 under-0.111
driver-0.105
 and-0.104
 in-0.098
Token was
Token ID373
Feature activation+1.65e-01

Pos logit contributions
 under+0.060
 and+0.059
ed+0.057
driver+0.056
 in+0.050
Neg logit contributions
icious-0.158
Little-0.157
Everybody-0.150
 difference-0.149
 disl-0.149
Token named
Token ID3706
Feature activation+5.03e-02

Pos logit contributions
 and+0.154
 under+0.153
ed+0.141
 in+0.134
driver+0.133
Neg logit contributions
icious-0.354
Little-0.348
Everybody-0.334
reci-0.332
Ms-0.332
Token Max
Token ID5436
Feature activation+4.18e-02

Pos logit contributions
 and+0.062
 under+0.060
 in+0.056
ed+0.055
driver+0.053
Neg logit contributions
icious-0.099
Little-0.094
reci-0.093
 difference-0.092
rous-0.091
Token and
Token ID290
Feature activation+3.07e-01

Pos logit contributions
 under+0.036
 and+0.036
ed+0.033
driver+0.031
 in+0.031
Neg logit contributions
icious-0.092
Little-0.091
Everybody-0.087
Ms-0.087
reci-0.087
Token the
Token ID262
Feature activation-5.69e-02

Pos logit contributions
 under+0.314
ed+0.299
 and+0.287
driver+0.281
 in+0.258
Neg logit contributions
Little-0.613
 disl-0.584
icious-0.581
Everybody-0.579
Br-0.575
Token small
Token ID1402
Feature activation-1.66e-01

Pos logit contributions
icious+0.110
Little+0.106
reci+0.102
Everybody+0.101
Ms+0.101
Neg logit contributions
 and-0.072
 under-0.070
 in-0.065
 on-0.062
ed-0.062
Token one
Token ID530
Feature activation+3.89e-02

Pos logit contributions
icious+0.277
Little+0.277
 difference+0.269
Everybody+0.264
reci+0.263
Neg logit contributions
 under-0.220
 and-0.214
ed-0.212
driver-0.201
 in-0.195
Token was
Token ID373
Feature activation+1.38e-01

Pos logit contributions
 under+0.038
driver+0.038
 and+0.037
ed+0.036
 in+0.032
Neg logit contributions
Little-0.078
icious-0.076
 sym-0.073
Everybody-0.073
 difference-0.073
Token named
Token ID3706
Feature activation+2.95e-02

Pos logit contributions
 and+0.133
 under+0.132
ed+0.124
driver+0.117
 in+0.115
Neg logit contributions
icious-0.293
Little-0.289
Everybody-0.276
Ms-0.275
reci-0.274
Token merry
Token ID44570
Feature activation-1.12e-02

Pos logit contributions
 and+0.049
 under+0.047
 in+0.044
ed+0.043
driver+0.041
Neg logit contributions
icious-0.069
Little-0.068
Ms-0.065
E-0.064
reci-0.063
Token-
Token ID12
Feature activation+1.90e-01

Pos logit contributions
icious+0.022
Little+0.022
Everybody+0.021
Everyone+0.021
 sym+0.021
Neg logit contributions
 under-0.011
driver-0.011
 and-0.011
ed-0.011
 in-0.010
Tokengo
Token ID2188
Feature activation+4.57e-02

Pos logit contributions
 and+0.264
 under+0.261
ed+0.247
 in+0.241
driver+0.235
Neg logit contributions
icious-0.328
Little-0.320
reci-0.304
Everybody-0.304
Ms-0.302
Token-
Token ID12
Feature activation+1.76e-01

Pos logit contributions
 and+0.051
 under+0.049
ed+0.046
 in+0.045
 on+0.042
Neg logit contributions
icious-0.093
Little-0.091
reci-0.087
 disl-0.087
Everybody-0.087
Tokenround
Token ID744
Feature activation+1.27e-01

Pos logit contributions
 and+0.249
ed+0.235
 under+0.234
 in+0.225
 on+0.213
Neg logit contributions
icious-0.323
 difference-0.307
reci-0.303
Little-0.297
 disl-0.290
Token and
Token ID290
Feature activation+3.05e-01

Pos logit contributions
 under+0.098
 and+0.092
driver+0.088
ed+0.085
 in+0.079
Neg logit contributions
icious-0.282
Little-0.281
Everybody-0.270
Ms-0.268
 sym-0.265
Token make
Token ID787
Feature activation+2.18e-02

Pos logit contributions
 under+0.326
driver+0.323
ed+0.312
 and+0.303
 in+0.277
Neg logit contributions
Little-0.579
icious-0.560
 disl-0.554
Everybody-0.550
Everyone-0.540
Token it
Token ID340
Feature activation+1.41e-01

Pos logit contributions
 and+0.024
 under+0.022
 in+0.021
ed+0.021
 on+0.020
Neg logit contributions
icious-0.046
Little-0.044
reci-0.044
ograph-0.043
Ms-0.043
Token spin
Token ID7906
Feature activation-1.03e-02

Pos logit contributions
 and+0.171
 under+0.168
ed+0.155
 in+0.154
driver+0.148
Neg logit contributions
icious-0.269
Little-0.265
reci-0.251
Ms-0.251
Everybody-0.251
Token.
Token ID13
Feature activation+7.90e-02

Pos logit contributions
icious+0.024
Little+0.023
Everybody+0.022
E+0.022
reci+0.022
Neg logit contributions
 under-0.008
 and-0.008
ed-0.007
driver-0.007
 in-0.006
Token
Token ID198
Feature activation-1.70e-01

Pos logit contributions
 and+0.095
 under+0.084
 he+0.081
 in+0.078
ed+0.078
Neg logit contributions
icious-0.162
Little-0.152
reci-0.151
rous-0.150
 difference-0.150
Token dog
Token ID3290
Feature activation+1.46e-01

Pos logit contributions
icious+0.294
Little+0.281
 disl+0.275
reci+0.275
Everybody+0.270
Neg logit contributions
 and-0.177
 under-0.173
 in-0.161
 on-0.153
ed-0.146
Token named
Token ID3706
Feature activation+7.28e-02

Pos logit contributions
 under+0.127
 and+0.126
ed+0.117
driver+0.113
 in+0.109
Neg logit contributions
icious-0.315
Little-0.311
Everybody-0.299
Ms-0.297
reci-0.297
Token Sam
Token ID3409
Feature activation+6.63e-02

Pos logit contributions
 and+0.077
 under+0.075
ed+0.069
 in+0.067
driver+0.064
Neg logit contributions
icious-0.152
Little-0.148
reci-0.142
Everybody-0.141
 disl-0.140
Token came
Token ID1625
Feature activation+1.90e-01

Pos logit contributions
 and+0.057
 under+0.056
ed+0.051
 in+0.049
driver+0.047
Neg logit contributions
icious-0.149
Little-0.147
Everybody-0.141
reci-0.140
Ms-0.140
Token running
Token ID2491
Feature activation+1.33e-01

Pos logit contributions
 and+0.082
 under+0.074
 in+0.061
ed+0.054
 on+0.048
Neg logit contributions
icious-0.543
Little-0.526
 disl-0.514
reci-0.514
ograph-0.510
Token and
Token ID290
Feature activation+3.05e-01

Pos logit contributions
 and+0.089
 under+0.089
ed+0.079
driver+0.077
 in+0.072
Neg logit contributions
icious-0.315
Little-0.313
Everybody-0.302
Ms-0.300
reci-0.299
Token tried
Token ID3088
Feature activation+1.41e-01

Pos logit contributions
 under+0.256
 and+0.254
driver+0.238
ed+0.236
 in+0.211
Neg logit contributions
Little-0.663
icious-0.648
Everybody-0.636
 disl-0.631
Ms-0.629
Token to
Token ID284
Feature activation+1.49e-01

Pos logit contributions
ed+0.109
 under+0.093
driver+0.093
aces+0.091
edly+0.080
Neg logit contributions
One-0.303
Little-0.300
 disl-0.298
Everybody-0.296
Br-0.294
Token take
Token ID1011
Feature activation+5.28e-03

Pos logit contributions
 and+0.176
 under+0.168
 in+0.156
ed+0.144
 on+0.143
Neg logit contributions
icious-0.298
Little-0.289
reci-0.285
Ms-0.278
rous-0.278
Token the
Token ID262
Feature activation-6.66e-02

Pos logit contributions
 under+0.006
 and+0.006
ed+0.006
driver+0.006
 knees+0.005
Neg logit contributions
Little-0.010
Everyone-0.009
Everybody-0.009
icious-0.009
E-0.009
Token steak
Token ID26320
Feature activation+1.77e-01

Pos logit contributions
icious+0.115
Little+0.114
Everybody+0.108
 difference+0.107
reci+0.107
Neg logit contributions
 and-0.088
 under-0.087
ed-0.085
driver-0.079
 in-0.079
Token
Token ID26391
Feature activation-5.00e-02

Pos logit contributions
icious+0.250
Little+0.224
ograph+0.223
 disl+0.217
 former+0.217
Neg logit contributions
 and-0.244
 under-0.238
 in-0.224
 he-0.219
ed-0.215
Token
Token ID8151
Feature activation+1.10e-01

Pos logit contributions
icious+0.084
Little+0.082
reci+0.080
Everybody+0.078
rous+0.078
Neg logit contributions
 and-0.075
 under-0.071
ed-0.068
 in-0.067
 on-0.063
Tokent
Token ID83
Feature activation+1.48e-01

Pos logit contributions
 and+0.155
ed+0.149
 under+0.143
 in+0.137
 on+0.133
Neg logit contributions
icious-0.197
Little-0.194
 difference-0.190
ograph-0.186
rous-0.184
Token catch
Token ID4929
Feature activation+1.72e-02

Pos logit contributions
 and+0.146
 under+0.143
ed+0.131
 in+0.128
driver+0.122
Neg logit contributions
icious-0.316
Little-0.310
Everybody-0.297
Ms-0.296
reci-0.296
Token up
Token ID510
Feature activation+5.77e-02

Pos logit contributions
 and+0.017
 under+0.017
 in+0.015
ed+0.015
 on+0.014
Neg logit contributions
icious-0.038
Little-0.037
reci-0.036
ograph-0.035
Everybody-0.035
Token and
Token ID290
Feature activation+3.04e-01

Pos logit contributions
 under+0.046
 and+0.045
ed+0.044
driver+0.042
 in+0.038
Neg logit contributions
Little-0.126
icious-0.126
Everybody-0.121
 disl-0.121
reci-0.120
Token just
Token ID655
Feature activation+1.68e-01

Pos logit contributions
 under+0.266
 and+0.261
driver+0.257
ed+0.244
aces+0.228
Neg logit contributions
Little-0.631
icious-0.606
 disl-0.605
Everybody-0.596
assion-0.596
Token watched
Token ID7342
Feature activation-2.11e-05

Pos logit contributions
 under+0.159
 and+0.156
driver+0.148
ed+0.147
 in+0.134
Neg logit contributions
icious-0.347
Little-0.345
 disl-0.334
reci-0.332
Everybody-0.328
Token Jack
Token ID3619
Feature activation+4.32e-02

Pos logit contributions
Little+0.000
icious+0.000
Everybody+0.000
 disl+0.000
Everyone+0.000
Neg logit contributions
 under-0.000
ed-0.000
 and-0.000
driver-0.000
 in-0.000
Token reach
Token ID3151
Feature activation+8.49e-02

Pos logit contributions
 under+0.037
 and+0.036
ed+0.035
driver+0.034
 in+0.031
Neg logit contributions
icious-0.093
Little-0.092
Everybody-0.088
reci-0.087
Ms-0.087
Token the
Token ID262
Feature activation-6.54e-02

Pos logit contributions
ed+0.075
driver+0.073
 under+0.072
 and+0.071
 knees+0.062
Neg logit contributions
Little-0.174
icious-0.172
 difference-0.165
Everybody-0.165
rous-0.165
Token were
Token ID547
Feature activation+1.96e-01

Pos logit contributions
 and+0.055
 under+0.054
ed+0.049
 in+0.047
driver+0.045
Neg logit contributions
icious-0.136
Little-0.134
Everybody-0.128
reci-0.128
Ms-0.127
Token two
Token ID734
Feature activation+3.60e-02

Pos logit contributions
 and+0.245
 under+0.240
ed+0.225
 in+0.223
 on+0.209
Neg logit contributions
icious-0.370
Little-0.361
reci-0.347
 disl-0.344
Everybody-0.342
Token brothers
Token ID9397
Feature activation+1.79e-01

Pos logit contributions
 under+0.049
 and+0.049
ed+0.047
driver+0.046
 in+0.045
Neg logit contributions
icious-0.061
Little-0.060
Everybody-0.057
reci-0.056
Ms-0.056
Token named
Token ID3706
Feature activation+4.48e-02

Pos logit contributions
 under+0.169
driver+0.162
ed+0.159
 and+0.149
 knees+0.143
Neg logit contributions
Little-0.350
icious-0.348
Everybody-0.339
reci-0.338
 sym-0.335
Token Jeff
Token ID5502
Feature activation+9.33e-02

Pos logit contributions
 and+0.046
 under+0.046
ed+0.042
 in+0.041
driver+0.040
Neg logit contributions
icious-0.093
Little-0.091
Everybody-0.087
reci-0.087
Ms-0.086
Token and
Token ID290
Feature activation+3.04e-01

Pos logit contributions
 under+0.065
 and+0.064
ed+0.057
driver+0.055
 in+0.053
Neg logit contributions
icious-0.220
Little-0.218
Everybody-0.210
reci-0.208
Ms-0.208
Token Tom
Token ID4186
Feature activation+2.26e-03

Pos logit contributions
 under+0.344
 and+0.335
ed+0.321
driver+0.309
 in+0.298
Neg logit contributions
Little-0.582
icious-0.575
Everybody-0.559
 disl-0.556
Ms-0.547
Token.
Token ID13
Feature activation+9.13e-02

Pos logit contributions
 under+0.002
 and+0.002
ed+0.002
driver+0.002
 in+0.002
Neg logit contributions
icious-0.005
Little-0.005
Everybody-0.005
reci-0.005
Ms-0.005
Token They
Token ID1119
Feature activation+1.28e-01

Pos logit contributions
 and+0.092
 under+0.084
 in+0.076
ed+0.074
 he+0.074
Neg logit contributions
icious-0.204
Little-0.196
ograph-0.189
gary-0.188
reci-0.187
Token decided
Token ID3066
Feature activation+1.52e-01

Pos logit contributions
 and+0.123
 under+0.123
ed+0.113
 in+0.107
driver+0.107
Neg logit contributions
icious-0.272
Little-0.268
Everybody-0.257
reci-0.255
 disl-0.254
Token to
Token ID284
Feature activation+1.38e-01

Pos logit contributions
 under+0.116
ed+0.114
driver+0.112
aces+0.107
 knees+0.092
Neg logit contributions
 disl-0.327
Everybody-0.320
Little-0.318
Everyone-0.316
assion-0.316
Token thing
Token ID1517
Feature activation+1.48e-01

Pos logit contributions
driver+0.064
ed+0.054
aces+0.052
 knees+0.047
 under+0.047
Neg logit contributions
Little-0.089
icious-0.083
 sym-0.081
 wonderful-0.081
 difference-0.079
Token was
Token ID373
Feature activation+1.81e-01

Pos logit contributions
 under+0.176
 and+0.173
ed+0.166
driver+0.163
 in+0.154
Neg logit contributions
icious-0.270
Little-0.267
Everybody-0.256
 disl-0.254
 difference-0.253
Token getting
Token ID1972
Feature activation+1.01e-01

Pos logit contributions
 and+0.192
 under+0.186
 in+0.172
ed+0.163
 on+0.158
Neg logit contributions
icious-0.387
Little-0.377
Everybody-0.362
E-0.361
reci-0.360
Token very
Token ID845
Feature activation+8.33e-02

Pos logit contributions
 under+0.121
 and+0.121
ed+0.117
driver+0.117
 knees+0.107
Neg logit contributions
icious-0.180
Little-0.179
Everybody-0.171
reci-0.171
rous-0.170
Token excited
Token ID6568
Feature activation-1.46e-02

Pos logit contributions
 and+0.127
 under+0.125
 in+0.118
ed+0.117
 on+0.112
Neg logit contributions
icious-0.137
Little-0.132
Ms-0.125
ograph-0.125
Everybody-0.124
Token and
Token ID290
Feature activation+3.04e-01

Pos logit contributions
Little+0.026
 difference+0.026
icious+0.026
 disl+0.025
 sym+0.025
Neg logit contributions
driver-0.017
ed-0.015
 under-0.014
aces-0.014
pered-0.013
Token decided
Token ID3066
Feature activation+1.66e-01

Pos logit contributions
 under+0.281
driver+0.273
 and+0.273
ed+0.262
 in+0.233
Neg logit contributions
Little-0.628
icious-0.621
 disl-0.616
Everybody-0.603
reci-0.596
Token to
Token ID284
Feature activation+1.48e-01

Pos logit contributions
 under+0.102
ed+0.091
 and+0.088
driver+0.086
aces+0.075
Neg logit contributions
Little-0.390
Everybody-0.383
Everyone-0.378
icious-0.376
 disl-0.375
Token jump
Token ID4391
Feature activation+9.59e-02

Pos logit contributions
 under+0.224
ed+0.223
 and+0.211
driver+0.206
 in+0.192
Neg logit contributions
Little-0.234
icious-0.228
Everybody-0.225
Everyone-0.220
 disl-0.219
Token right
Token ID826
Feature activation-1.71e-02

Pos logit contributions
driver+0.058
 under+0.053
ed+0.053
 and+0.052
 knees+0.047
Neg logit contributions
Little-0.220
icious-0.215
Everybody-0.214
 sym-0.211
One-0.211
Token in
Token ID287
Feature activation+8.02e-02

Pos logit contributions
Little+0.037
icious+0.036
rous+0.035
Everybody+0.035
One+0.035
Neg logit contributions
driver-0.014
ed-0.012
 under-0.012
 and-0.012
 knees-0.010
Token hide
Token ID7808
Feature activation+1.51e-01

Pos logit contributions
 and+0.058
 under+0.055
 in+0.053
 on+0.049
ed+0.048
Neg logit contributions
icious-0.102
Little-0.100
reci-0.096
Ms-0.094
E-0.094
Token-
Token ID12
Feature activation+1.92e-01

Pos logit contributions
driver+0.089
ed+0.070
 under+0.058
 wouldn+0.057
ongs+0.057
Neg logit contributions
Little-0.338
E-0.334
 disl-0.328
Gr-0.327
Ms-0.325
Tokenand
Token ID392
Feature activation+7.04e-02

Pos logit contributions
 and+0.324
 under+0.322
ed+0.315
 in+0.306
driver+0.300
Neg logit contributions
icious-0.285
Little-0.269
 difference-0.257
reci-0.256
ograph-0.251
Token-
Token ID12
Feature activation+1.74e-01

Pos logit contributions
 under+0.069
 and+0.068
driver+0.064
ed+0.062
 in+0.058
Neg logit contributions
Little-0.142
icious-0.142
Everybody-0.137
 disl-0.136
Everyone-0.134
Tokenseek
Token ID36163
Feature activation-1.56e-02

Pos logit contributions
 and+0.287
 under+0.283
ed+0.281
 in+0.271
driver+0.262
Neg logit contributions
icious-0.268
 difference-0.247
Little-0.245
reci-0.243
ograph-0.232
Token and
Token ID290
Feature activation+3.04e-01

Pos logit contributions
icious+0.032
Little+0.032
Everybody+0.032
Everyone+0.031
Ms+0.031
Neg logit contributions
 under-0.013
driver-0.013
 and-0.013
ed-0.012
 in-0.011
Token tag
Token ID7621
Feature activation+1.56e-01

Pos logit contributions
 and+0.333
 under+0.331
driver+0.329
ed+0.324
 in+0.292
Neg logit contributions
Little-0.560
icious-0.541
 disl-0.535
Everybody-0.533
Br-0.530
Token.
Token ID13
Feature activation+8.56e-02

Pos logit contributions
driver+0.145
 under+0.138
 and+0.137
ed+0.129
 knees+0.123
Neg logit contributions
icious-0.289
Little-0.283
Everybody-0.281
 disl-0.281
 sym-0.281
Token Tim
Token ID5045
Feature activation-5.02e-02

Pos logit contributions
 and+0.107
 under+0.098
 he+0.092
 in+0.091
ed+0.089
Neg logit contributions
icious-0.172
reci-0.163
rous-0.160
Little-0.160
 difference-0.160
Tokenmy
Token ID1820
Feature activation+7.87e-02

Pos logit contributions
Little+0.080
icious+0.075
reci+0.073
Ms+0.072
One+0.072
Neg logit contributions
driver-0.070
 under-0.067
ed-0.064
edly-0.060
 knees-0.060
Token had
Token ID550
Feature activation+1.35e-01

Pos logit contributions
driver+0.093
 under+0.087
ed+0.080
fully+0.077
 on+0.074
Neg logit contributions
Little-0.139
 sym-0.129
Ms-0.129
V-0.127
icious-0.127
Token a
Token ID257
Feature activation+3.84e-02

Pos logit contributions
 under+0.193
 and+0.189
ed+0.184
driver+0.172
 in+0.165
Neg logit contributions
icious-0.429
Little-0.428
Everybody-0.408
 disl-0.404
Ms-0.404
Token little
Token ID1310
Feature activation-6.46e-02

Pos logit contributions
 and+0.057
 under+0.054
 in+0.051
 on+0.049
 he+0.048
Neg logit contributions
icious-0.066
reci-0.063
Little-0.062
rous-0.061
 disl-0.061
Token girl
Token ID2576
Feature activation+1.20e-01

Pos logit contributions
Little+0.097
 difference+0.094
icious+0.093
 sym+0.089
Everybody+0.089
Neg logit contributions
 under-0.097
ed-0.096
driver-0.093
 and-0.092
aces-0.088
Token named
Token ID3706
Feature activation+4.32e-02

Pos logit contributions
 under+0.140
 and+0.138
ed+0.132
driver+0.132
 in+0.123
Neg logit contributions
Little-0.222
icious-0.220
Everybody-0.208
reci-0.208
Ms-0.207
Token Mia
Token ID32189
Feature activation+2.24e-02

Pos logit contributions
 under+0.055
 and+0.054
ed+0.051
 in+0.051
 on+0.047
Neg logit contributions
icious-0.083
Little-0.079
reci-0.076
Everybody-0.076
ograph-0.076
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 under+0.019
 and+0.018
driver+0.018
ed+0.018
 in+0.016
Neg logit contributions
Little-0.048
icious-0.047
reci-0.046
Ms-0.045
Everybody-0.045
Token her
Token ID607
Feature activation+1.00e-01

Pos logit contributions
 under+0.357
driver+0.355
 and+0.338
ed+0.324
 knees+0.314
Neg logit contributions
Little-0.539
Everybody-0.502
Ms-0.502
 disl-0.501
Br-0.499
Token dog
Token ID3290
Feature activation+1.03e-01

Pos logit contributions
 and+0.123
 under+0.119
 in+0.111
ed+0.109
 on+0.104
Neg logit contributions
icious-0.201
Little-0.192
reci-0.186
Everybody-0.185
Ms-0.184
Token,
Token ID11
Feature activation+1.73e-01

Pos logit contributions
 and+0.072
 under+0.067
 in+0.058
ed+0.054
 on+0.050
Neg logit contributions
icious-0.265
Little-0.259
 disl-0.251
rous-0.249
Everybody-0.249
Token Buddy
Token ID36896
Feature activation+3.85e-02

Pos logit contributions
 under+0.198
ed+0.192
 and+0.191
driver+0.189
 in+0.174
Neg logit contributions
Little-0.326
icious-0.320
Ms-0.308
Everybody-0.308
Br-0.301
Token,
Token ID11
Feature activation+1.69e-01

Pos logit contributions
 and+0.031
 under+0.029
 in+0.025
ed+0.025
 on+0.022
Neg logit contributions
icious-0.094
Little-0.091
Everybody-0.088
reci-0.088
 disl-0.087
Token boy
Token ID2933
Feature activation+1.33e-01

Pos logit contributions
driver+0.273
aces+0.238
ed+0.222
 and+0.221
 under+0.213
Neg logit contributions
Little-0.407
Everybody-0.376
rous-0.372
Br-0.370
icious-0.369
Token came
Token ID1625
Feature activation+2.07e-01

Pos logit contributions
 under+0.140
driver+0.137
ed+0.130
 and+0.124
aces+0.118
Neg logit contributions
Little-0.247
icious-0.238
Everybody-0.234
Ms-0.230
 difference-0.229
Token run
Token ID1057
Feature activation+1.45e-01

Pos logit contributions
driver+0.141
ed+0.122
aces+0.118
 under+0.116
 knees+0.108
Neg logit contributions
Little-0.467
Everybody-0.455
icious-0.453
Ms-0.452
 difference-0.451
Tokenn
Token ID77
Feature activation+4.16e-02

Pos logit contributions
driver+0.097
 under+0.080
 knees+0.079
aces+0.079
ed+0.077
Neg logit contributions
Little-0.322
Ms-0.311
Everybody-0.311
One-0.311
icious-0.310
Tokenig
Token ID328
Feature activation+1.83e-01

Pos logit contributions
 under+0.050
 and+0.048
driver+0.047
ed+0.046
 in+0.044
Neg logit contributions
icious-0.073
Little-0.072
Everybody-0.071
Everyone-0.069
Ms-0.068
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 under+0.296
 and+0.291
driver+0.276
ed+0.274
 in+0.268
Neg logit contributions
Little-0.249
icious-0.248
Everybody-0.239
Everyone-0.229
Br-0.229
Token kicked
Token ID12165
Feature activation+2.64e-02

Pos logit contributions
 under+0.291
 and+0.290
ed+0.267
driver+0.259
 in+0.251
Neg logit contributions
Little-0.634
icious-0.630
Everybody-0.606
Ms-0.599
 disl-0.599
Token the
Token ID262
Feature activation-6.17e-02

Pos logit contributions
 under+0.025
 and+0.024
ed+0.024
driver+0.024
 in+0.021
Neg logit contributions
Little-0.053
icious-0.052
E-0.051
Everybody-0.051
Br-0.050
Token old
Token ID1468
Feature activation-1.03e-01

Pos logit contributions
icious+0.111
Little+0.109
Everybody+0.103
reci+0.103
Ms+0.103
Neg logit contributions
 and-0.081
 under-0.080
ed-0.075
 in-0.073
driver-0.072
Token engine
Token ID3113
Feature activation+9.03e-02

Pos logit contributions
Little+0.177
icious+0.175
Everybody+0.168
 disl+0.167
 difference+0.166
Neg logit contributions
 under-0.134
 and-0.130
ed-0.127
driver-0.119
 in-0.119
Token.
Token ID13
Feature activation+8.25e-02

Pos logit contributions
ed+0.082
 under+0.079
driver+0.074
 and+0.070
 explorers+0.067
Neg logit contributions
icious-0.164
Little-0.164
 difference-0.162
Everybody-0.162
 sym-0.160
Token Jane
Token ID12091
Feature activation+7.13e-02

Pos logit contributions
icious+0.213
Little+0.209
Everybody+0.202
reci+0.201
Ms+0.201
Neg logit contributions
 and-0.056
 under-0.055
ed-0.048
 in-0.045
driver-0.043
Token was
Token ID373
Feature activation+1.79e-01

Pos logit contributions
 under+0.069
 and+0.068
driver+0.063
ed+0.062
 in+0.059
Neg logit contributions
icious-0.148
Little-0.147
Everybody-0.140
reci-0.139
Ms-0.139
Token picking
Token ID10868
Feature activation+5.37e-02

Pos logit contributions
 and+0.160
 under+0.158
ed+0.144
 in+0.138
driver+0.134
Neg logit contributions
icious-0.394
Little-0.388
Everybody-0.372
reci-0.371
Ms-0.369
Token up
Token ID510
Feature activation+6.10e-02

Pos logit contributions
driver+0.057
 under+0.054
 and+0.051
ed+0.050
 knees+0.049
Neg logit contributions
Little-0.101
rous-0.099
icious-0.098
E-0.098
Everybody-0.096
Token shrimp
Token ID30060
Feature activation+9.87e-02

Pos logit contributions
driver+0.070
 under+0.067
ed+0.066
 and+0.062
 knees+0.057
Neg logit contributions
Little-0.110
rous-0.108
icious-0.107
Everybody-0.105
 difference-0.105
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
driver+0.087
 under+0.075
ed+0.069
 knees+0.068
 hid+0.063
Neg logit contributions
icious-0.198
Little-0.196
rous-0.192
Ms-0.192
 difference-0.191
Token putting
Token ID5137
Feature activation+4.00e-02

Pos logit contributions
 under+0.317
 and+0.315
ed+0.293
driver+0.286
 in+0.277
Neg logit contributions
icious-0.611
Little-0.604
Everybody-0.578
 disl-0.575
Ms-0.574
Token them
Token ID606
Feature activation+1.28e-01

Pos logit contributions
 and+0.046
 under+0.046
ed+0.043
 in+0.041
driver+0.040
Neg logit contributions
icious-0.078
Little-0.077
Everybody-0.073
reci-0.073
Ms-0.072
Token into
Token ID656
Feature activation+9.36e-02

Pos logit contributions
 and+0.060
 under+0.057
driver+0.056
ed+0.052
 he+0.038
Neg logit contributions
icious-0.317
Little-0.312
rous-0.308
Everybody-0.307
reci-0.305
Token her
Token ID607
Feature activation+9.58e-02

Pos logit contributions
 and+0.088
 under+0.077
 in+0.074
 on+0.068
 he+0.064
Neg logit contributions
icious-0.225
Little-0.214
reci-0.211
assion-0.209
Everybody-0.206
Token bucket
Token ID19236
Feature activation+1.02e-01

Pos logit contributions
 and+0.117
 under+0.115
ed+0.106
 in+0.105
driver+0.100
Neg logit contributions
icious-0.184
Little-0.180
Everybody-0.172
reci-0.171
Ms-0.170
Token three
Token ID1115
Feature activation+3.05e-02

Pos logit contributions
 and+0.046
 under+0.046
ed+0.044
driver+0.043
 in+0.040
Neg logit contributions
icious-0.102
Little-0.101
E-0.096
Everybody-0.096
Ms-0.096
Token-
Token ID12
Feature activation+1.87e-01

Pos logit contributions
 and+0.038
 under+0.036
 in+0.034
ed+0.034
 on+0.031
Neg logit contributions
icious-0.060
Little-0.057
 disl-0.055
ograph-0.055
Everybody-0.055
Tokenyears
Token ID19002
Feature activation-2.40e-02

Pos logit contributions
 and+0.268
 under+0.265
ed+0.252
 in+0.246
driver+0.242
Neg logit contributions
icious-0.314
Little-0.305
Everybody-0.289
reci-0.289
 disl-0.286
Token-
Token ID12
Feature activation+1.70e-01

Pos logit contributions
icious+0.050
Little+0.050
Everybody+0.047
reci+0.047
Ms+0.047
Neg logit contributions
 and-0.024
 under-0.024
ed-0.022
 in-0.021
driver-0.021
Tokenold
Token ID727
Feature activation+1.53e-02

Pos logit contributions
 and+0.280
 under+0.271
ed+0.269
 in+0.262
driver+0.255
Neg logit contributions
icious-0.258
Little-0.244
reci-0.240
 difference-0.237
Br-0.231
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 under+0.013
driver+0.013
ed+0.011
 and+0.011
 knees+0.011
Neg logit contributions
Little-0.032
Ms-0.030
icious-0.030
Everybody-0.029
E-0.029
Token was
Token ID373
Feature activation+1.38e-01

Pos logit contributions
 under+0.344
 and+0.329
ed+0.310
driver+0.303
 in+0.292
Neg logit contributions
Little-0.582
icious-0.573
Everybody-0.556
 disl-0.552
Ms-0.552
Token very
Token ID845
Feature activation+6.48e-02

Pos logit contributions
 and+0.125
 under+0.123
ed+0.112
 in+0.107
driver+0.105
Neg logit contributions
icious-0.303
Little-0.299
Everybody-0.287
reci-0.285
Ms-0.284
Token inqu
Token ID7949
Feature activation-5.42e-02

Pos logit contributions
 and+0.090
 under+0.088
 in+0.084
ed+0.083
 on+0.078
Neg logit contributions
icious-0.114
Little-0.111
reci-0.106
ograph-0.106
Everybody-0.105
Tokenis
Token ID271
Feature activation-4.43e-02

Pos logit contributions
Br+0.053
icious+0.046
ograph+0.045
Ms+0.043
assion+0.043
Neg logit contributions
 and-0.122
 under-0.117
 in-0.116
 he-0.108
 behind-0.108
Tokenitive
Token ID1800
Feature activation+1.86e-01

Pos logit contributions
Little+0.078
is+0.077
icious+0.076
Everybody+0.072
Ms+0.072
Neg logit contributions
 under-0.051
driver-0.051
 knees-0.046
 hid-0.045
 and-0.044
Token It
Token ID632
Feature activation+9.90e-02

Pos logit contributions
icious+0.296
Little+0.293
Everybody+0.279
reci+0.277
Ms+0.277
Neg logit contributions
 and-0.153
 under-0.151
ed-0.142
 in-0.135
driver-0.134
Token was
Token ID373
Feature activation+1.79e-01

Pos logit contributions
 and+0.103
 under+0.102
ed+0.094
 in+0.091
driver+0.088
Neg logit contributions
icious-0.204
Little-0.201
Everybody-0.192
reci-0.191
Ms-0.191
Token funny
Token ID8258
Feature activation-8.80e-02

Pos logit contributions
 and+0.196
 under+0.192
 in+0.175
ed+0.170
 on+0.163
Neg logit contributions
icious-0.368
Little-0.362
Everybody-0.347
reci-0.345
Ms-0.343
Token-
Token ID12
Feature activation+1.70e-01

Pos logit contributions
 disl+0.174
 sym+0.169
 difference+0.166
icious+0.162
 wonderful+0.161
Neg logit contributions
driver-0.088
ed-0.079
 under-0.075
 wouldn-0.067
ously-0.064
Tokenlooking
Token ID11534
Feature activation-4.82e-03

Pos logit contributions
 and+0.269
ed+0.263
 under+0.255
 in+0.250
driver+0.241
Neg logit contributions
icious-0.275
 difference-0.263
Little-0.256
reci-0.251
Br-0.245
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 difference+0.009
Little+0.009
 disl+0.009
icious+0.008
 sym+0.008
Neg logit contributions
ed-0.005
driver-0.005
 under-0.005
ously-0.004
 knees-0.004
Token he
Token ID339
Feature activation+2.34e-01

Pos logit contributions
 and+0.456
 under+0.451
ed+0.425
 in+0.418
driver+0.408
Neg logit contributions
icious-0.488
Little-0.476
Everybody-0.451
reci-0.448
Ms-0.446
Token didn
Token ID1422
Feature activation+1.02e-01

Pos logit contributions
 under+0.211
 and+0.200
driver+0.197
ed+0.193
 in+0.174
Neg logit contributions
Little-0.493
icious-0.493
 disl-0.474
Everybody-0.468
E-0.465
Token't
Token ID470
Feature activation+1.04e-01

Pos logit contributions
 under+0.141
 and+0.138
ed+0.134
 in+0.127
driver+0.125
Neg logit contributions
icious-0.171
Little-0.170
Everybody-0.162
E-0.159
Everyone-0.158
Token know
Token ID760
Feature activation-3.07e-02

Pos logit contributions
 and+0.124
 under+0.123
ed+0.115
 in+0.111
driver+0.110
Neg logit contributions
icious-0.199
Little-0.195
Everybody-0.186
reci-0.185
Ms-0.184
Token what
Token ID644
Feature activation-2.28e-02

Pos logit contributions
Little+0.051
icious+0.050
Everybody+0.048
 disl+0.048
 difference+0.047
Neg logit contributions
ed-0.042
 under-0.040
 and-0.039
driver-0.038
 in-0.035
Token he
Token ID339
Feature activation+2.13e-01

Pos logit contributions
 under+0.022
driver+0.022
ed+0.021
aces+0.020
 and+0.020
Neg logit contributions
Little-0.097
Everybody-0.091
Br-0.091
icious-0.091
Everyone-0.091
Token got
Token ID1392
Feature activation+9.58e-02

Pos logit contributions
 under+0.239
driver+0.234
aces+0.217
ed+0.217
 and+0.208
Neg logit contributions
Little-0.381
E-0.362
Everyone-0.359
Gr-0.358
Ms-0.356
Token his
Token ID465
Feature activation+8.54e-02

Pos logit contributions
driver+0.091
 and+0.087
 under+0.085
ously+0.082
aces+0.082
Neg logit contributions
Little-0.190
rous-0.187
Everybody-0.185
Everyone-0.182
E-0.180
Token little
Token ID1310
Feature activation-1.01e-01

Pos logit contributions
ed+0.127
 under+0.123
 and+0.120
driver+0.119
ously+0.113
Neg logit contributions
Little-0.125
rous-0.120
icious-0.118
 disl-0.117
 difference-0.116
Token screw
Token ID9580
Feature activation+9.55e-02

Pos logit contributions
 difference+0.168
Little+0.166
icious+0.162
 former+0.158
ograph+0.157
Neg logit contributions
 under-0.130
ed-0.129
 and-0.122
driver-0.118
 in-0.113
Tokendriver
Token ID26230
Feature activation+3.03e-01

Pos logit contributions
 ear+0.078
 shrugged+0.072
 throat+0.072
 throats+0.070
 explorers+0.068
Neg logit contributions
is-0.157
qu-0.154
Little-0.151
assion-0.145
off-0.142
Token and
Token ID290
Feature activation+2.73e-01

Pos logit contributions
ed+0.217
 knees+0.208
 ear+0.196
 under+0.190
edly+0.189
Neg logit contributions
is-0.592
Little-0.585
One-0.579
icious-0.577
Each-0.559
Token started
Token ID2067
Feature activation+1.22e-01

Pos logit contributions
aces+0.306
 under+0.301
ed+0.290
pered+0.289
edly+0.280
Neg logit contributions
Little-0.478
Everyone-0.435
Everybody-0.431
 disl-0.422
E-0.419
Token to
Token ID284
Feature activation+1.27e-01

Pos logit contributions
ed+0.102
 under+0.101
 and+0.090
aces+0.085
driver+0.081
Neg logit contributions
Little-0.255
 disl-0.248
icious-0.243
assion-0.243
Everybody-0.243
Token screw
Token ID9580
Feature activation+9.05e-02

Pos logit contributions
 and+0.175
 under+0.174
ed+0.164
 in+0.159
driver+0.157
Neg logit contributions
icious-0.222
Little-0.218
Everybody-0.206
reci-0.205
Ms-0.204
Token in
Token ID287
Feature activation+5.83e-02

Pos logit contributions
 under+0.076
 knees+0.070
 ear+0.069
edly+0.069
 explorers+0.067
Neg logit contributions
Little-0.172
E-0.172
Everybody-0.165
V-0.165
Everyone-0.164
Token were
Token ID547
Feature activation+1.96e-01

Pos logit contributions
 and+0.055
 under+0.054
ed+0.049
 in+0.047
driver+0.045
Neg logit contributions
icious-0.136
Little-0.134
Everybody-0.128
reci-0.128
Ms-0.127
Token two
Token ID734
Feature activation+3.60e-02

Pos logit contributions
 and+0.245
 under+0.240
ed+0.225
 in+0.223
 on+0.209
Neg logit contributions
icious-0.370
Little-0.361
reci-0.347
 disl-0.344
Everybody-0.342
Token siblings
Token ID20569
Feature activation+1.18e-01

Pos logit contributions
 under+0.049
 and+0.049
ed+0.047
driver+0.046
 in+0.045
Neg logit contributions
icious-0.061
Little-0.060
Everybody-0.057
reci-0.056
Ms-0.056
Token named
Token ID3706
Feature activation+4.48e-02

Pos logit contributions
 under+0.107
driver+0.103
ed+0.102
 and+0.097
 knees+0.088
Neg logit contributions
Little-0.237
icious-0.237
Everybody-0.228
 sym-0.227
ograph-0.226
Token Anne
Token ID15397
Feature activation+1.15e-01

Pos logit contributions
 and+0.048
 under+0.046
 in+0.043
ed+0.042
driver+0.039
Neg logit contributions
icious-0.095
Little-0.091
reci-0.088
Br-0.087
ograph-0.087
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 under+0.084
 and+0.082
ed+0.076
driver+0.072
 in+0.069
Neg logit contributions
icious-0.266
Little-0.264
Everybody-0.254
Ms-0.253
reci-0.252
Token Jake
Token ID14757
Feature activation+6.27e-02

Pos logit contributions
 under+0.346
 and+0.337
ed+0.322
driver+0.312
 in+0.300
Neg logit contributions
Little-0.576
icious-0.569
Everybody-0.549
 disl-0.543
Ms-0.543
Token who
Token ID508
Feature activation+6.25e-02

Pos logit contributions
 and+0.054
 under+0.051
ed+0.047
 in+0.046
driver+0.041
Neg logit contributions
icious-0.145
Little-0.142
Everybody-0.136
Ms-0.136
reci-0.135
Token loved
Token ID6151
Feature activation-3.85e-02

Pos logit contributions
 under+0.068
 and+0.067
ed+0.063
driver+0.061
 in+0.059
Neg logit contributions
Little-0.121
icious-0.121
Everybody-0.117
reci-0.116
 disl-0.115
Token to
Token ID284
Feature activation+1.47e-01

Pos logit contributions
 disl+0.090
Little+0.088
icious+0.087
rous+0.087
Everybody+0.086
Neg logit contributions
 under-0.024
ed-0.024
driver-0.023
 and-0.022
aces-0.018
Token play
Token ID711
Feature activation-7.45e-02

Pos logit contributions
 and+0.234
 under+0.223
 in+0.213
ed+0.207
driver+0.205
Neg logit contributions
icious-0.231
Little-0.224
Ms-0.213
reci-0.209
E-0.209
Token's
Token ID338
Feature activation+9.18e-02

Pos logit contributions
icious+0.029
Little+0.028
redible+0.028
Ms+0.028
ograph+0.027
Neg logit contributions
ously-0.011
driver-0.010
 knees-0.010
ed-0.010
 under-0.009
Token friend
Token ID1545
Feature activation+5.09e-02

Pos logit contributions
 and+0.121
 under+0.120
ed+0.111
 in+0.110
driver+0.107
Neg logit contributions
icious-0.165
Little-0.162
Everybody-0.154
reci-0.153
Ms-0.152
Token called
Token ID1444
Feature activation+1.13e-01

Pos logit contributions
 under+0.045
 and+0.044
ed+0.042
driver+0.041
 in+0.038
Neg logit contributions
icious-0.109
Little-0.109
Everybody-0.104
Ms-0.103
reci-0.103
Token him
Token ID683
Feature activation+2.13e-01

Pos logit contributions
 and+0.121
 under+0.121
ed+0.114
driver+0.108
 in+0.106
Neg logit contributions
icious-0.222
Little-0.222
Everybody-0.212
Ms-0.211
 disl-0.211
Token again
Token ID757
Feature activation+7.08e-02

Pos logit contributions
 under+0.171
 and+0.170
driver+0.164
ed+0.160
 knees+0.145
Neg logit contributions
Little-0.457
icious-0.453
Ms-0.439
Everybody-0.437
 difference-0.432
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 and+0.058
 under+0.055
ed+0.049
 in+0.048
driver+0.044
Neg logit contributions
icious-0.165
Little-0.162
Everybody-0.156
reci-0.155
Ms-0.155
Token said
Token ID531
Feature activation+1.42e-02

Pos logit contributions
 under+0.314
driver+0.310
 and+0.309
ed+0.295
aces+0.270
Neg logit contributions
Little-0.587
icious-0.574
Everybody-0.561
 disl-0.558
 difference-0.549
Token,
Token ID11
Feature activation+1.71e-01

Pos logit contributions
 under+0.012
 and+0.011
ed+0.010
driver+0.010
 knees+0.009
Neg logit contributions
Little-0.031
icious-0.030
 disl-0.030
Everybody-0.030
Everyone-0.029
Token "
Token ID366
Feature activation-5.38e-02

Pos logit contributions
 under+0.201
ed+0.194
 and+0.187
driver+0.183
 in+0.174
Neg logit contributions
Little-0.314
icious-0.313
Everybody-0.300
Ms-0.299
 disl-0.296
TokenI
Token ID40
Feature activation-1.75e-02

Pos logit contributions
 difference+0.083
icious+0.080
 disl+0.077
reci+0.077
 former+0.076
Neg logit contributions
 and-0.082
ed-0.078
 he-0.077
 in-0.074
 ever-0.072
Token'm
Token ID1101
Feature activation+9.45e-03

Pos logit contributions
icious+0.029
Little+0.028
Ms+0.026
Everybody+0.026
redible+0.026
Neg logit contributions
 and-0.027
 under-0.026
ed-0.025
 in-0.025
 on-0.024
Token were
Token ID547
Feature activation+1.96e-01

Pos logit contributions
 and+0.055
 under+0.054
ed+0.049
 in+0.047
driver+0.045
Neg logit contributions
icious-0.136
Little-0.134
Everybody-0.128
reci-0.128
Ms-0.127
Token two
Token ID734
Feature activation+3.60e-02

Pos logit contributions
 and+0.245
 under+0.240
ed+0.225
 in+0.223
 on+0.209
Neg logit contributions
icious-0.370
Little-0.361
reci-0.347
 disl-0.344
Everybody-0.342
Token brothers
Token ID9397
Feature activation+1.79e-01

Pos logit contributions
 under+0.049
 and+0.049
ed+0.047
driver+0.046
 in+0.045
Neg logit contributions
icious-0.061
Little-0.060
Everybody-0.057
reci-0.056
Ms-0.056
Token called
Token ID1444
Feature activation+1.01e-01

Pos logit contributions
 under+0.169
driver+0.162
ed+0.159
 and+0.149
 knees+0.143
Neg logit contributions
Little-0.350
icious-0.348
Everybody-0.339
reci-0.338
 sym-0.335
Token Jack
Token ID3619
Feature activation+5.63e-02

Pos logit contributions
 under+0.106
 and+0.105
ed+0.098
driver+0.093
 in+0.092
Neg logit contributions
icious-0.202
Little-0.200
Everybody-0.191
Ms-0.190
 disl-0.189
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 under+0.037
driver+0.035
ed+0.033
 knees+0.030
 and+0.030
Neg logit contributions
Little-0.127
icious-0.125
Ms-0.123
 sym-0.121
reci-0.121
Token Tom
Token ID4186
Feature activation+2.49e-04

Pos logit contributions
 under+0.334
 and+0.329
ed+0.311
driver+0.300
 in+0.291
Neg logit contributions
Little-0.586
icious-0.583
Everybody-0.561
 disl-0.558
Ms-0.552
Token.
Token ID13
Feature activation+8.74e-02

Pos logit contributions
 under+0.000
 and+0.000
ed+0.000
driver+0.000
 in+0.000
Neg logit contributions
icious-0.001
Little-0.001
Everybody-0.001
reci-0.001
Ms-0.001
Token They
Token ID1119
Feature activation+1.25e-01

Pos logit contributions
 and+0.091
 He+0.073
 he+0.072
 under+0.070
 in+0.066
Neg logit contributions
icious-0.197
gary-0.185
Little-0.185
redible-0.183
rous-0.183
Token were
Token ID547
Feature activation+1.52e-01

Pos logit contributions
 under+0.119
 and+0.118
ed+0.110
driver+0.105
 in+0.103
Neg logit contributions
icious-0.265
Little-0.262
Everybody-0.252
reci-0.251
 disl-0.249
Token best
Token ID1266
Feature activation+4.36e-02

Pos logit contributions
driver+0.166
ed+0.165
 under+0.159
 and+0.157
 knees+0.140
Neg logit contributions
icious-0.297
Little-0.292
Everybody-0.281
Ms-0.279
rous-0.279
Token should
Token ID815
Feature activation+8.07e-02

Pos logit contributions
icious+0.010
Little+0.010
reci+0.009
Everybody+0.009
Everyone+0.009
Neg logit contributions
 under-0.009
ed-0.008
 and-0.008
driver-0.008
 knees-0.008
Token be
Token ID307
Feature activation-2.74e-02

Pos logit contributions
 and+0.091
 under+0.090
ed+0.084
 in+0.081
driver+0.080
Neg logit contributions
icious-0.160
Little-0.157
Everybody-0.150
reci-0.149
Ms-0.149
Token compassionate
Token ID32533
Feature activation-1.47e-01

Pos logit contributions
icious+0.050
Little+0.049
reci+0.047
 sym+0.046
ograph+0.046
Neg logit contributions
 and-0.042
 in-0.041
 under-0.040
 on-0.039
ed-0.037
Token.
Token ID13
Feature activation+4.65e-02

Pos logit contributions
 difference+0.233
Little+0.225
One+0.221
icious+0.221
 disl+0.220
Neg logit contributions
 under-0.181
driver-0.168
 knees-0.165
 he-0.164
 wouldn-0.159
Token Comp
Token ID3082
Feature activation-7.27e-02

Pos logit contributions
 and+0.070
 under+0.069
ed+0.066
 in+0.064
driver+0.063
Neg logit contributions
icious-0.075
Little-0.073
Everybody-0.069
reci-0.069
Ms-0.068
Tokenassion
Token ID11857
Feature activation-3.88e-01

Pos logit contributions
Little+0.076
Br+0.073
icious+0.070
 difference+0.067
 sym+0.065
Neg logit contributions
ed-0.157
 and-0.156
driver-0.154
 in-0.146
 under-0.146
Tokenate
Token ID378
Feature activation-4.85e-02

Pos logit contributions
icious+0.666
gary+0.628
Little+0.618
 disl+0.615
rous+0.604
Neg logit contributions
 and-0.655
ed-0.586
 under-0.575
 in-0.574
 on-0.548
Token means
Token ID1724
Feature activation+1.60e-02

Pos logit contributions
Little+0.074
icious+0.073
reci+0.072
 disl+0.072
 difference+0.071
Neg logit contributions
 under-0.068
driver-0.065
 and-0.064
ed-0.063
 knees-0.062
Token you
Token ID345
Feature activation+1.47e-02

Pos logit contributions
 and+0.023
 under+0.023
ed+0.022
 in+0.021
driver+0.021
Neg logit contributions
icious-0.027
Little-0.026
Everybody-0.025
reci-0.024
Ms-0.024
Token care
Token ID1337
Feature activation+5.41e-02

Pos logit contributions
 and+0.018
 under+0.018
ed+0.017
 in+0.016
driver+0.016
Neg logit contributions
icious-0.028
Little-0.027
Everybody-0.026
reci-0.026
Ms-0.026
Token about
Token ID546
Feature activation+2.89e-02

Pos logit contributions
 and+0.042
 under+0.036
 in+0.035
 on+0.031
ed+0.028
Neg logit contributions
icious-0.139
Little-0.137
rous-0.133
ograph-0.133
Br-0.133
Token car
Token ID1097
Feature activation+1.45e-02

Pos logit contributions
icious+0.323
Little+0.317
Everybody+0.307
ograph+0.302
Br+0.301
Neg logit contributions
 under-0.248
 and-0.240
ed-0.239
 in-0.223
driver-0.220
Token.
Token ID13
Feature activation+7.94e-03

Pos logit contributions
ed+0.016
 under+0.015
 knees+0.014
ously+0.014
driver+0.014
Neg logit contributions
icious-0.025
One-0.024
assion-0.024
ograph-0.024
 sym-0.024
Token '\n'
Token ID198
Feature activation-2.40e-01

Pos logit contributions
 and+0.010
 under+0.010
ed+0.009
 in+0.009
 he+0.009
Neg logit contributions
icious-0.015
Little-0.015
ograph-0.014
 disl-0.014
rous-0.014
Token '\n'
Token ID198
Feature activation-2.41e-01

Pos logit contributions
 disl+0.565
icious+0.551
 former+0.548
 difference+0.507
assion+0.501
Neg logit contributions
 and-0.352
 on-0.283
 in-0.281
 he-0.277
 under-0.273
Token"
Token ID1
Feature activation-8.26e-02

Pos logit contributions
 difference+0.353
 disl+0.348
 former+0.333
icious+0.309
 lasting+0.309
Neg logit contributions
 and-0.479
 in-0.421
 on-0.418
 under-0.416
 he-0.414
TokenV
Token ID53
Feature activation-3.36e-01

Pos logit contributions
 disl+0.105
 difference+0.102
icious+0.100
 former+0.098
rous+0.095
Neg logit contributions
 and-0.160
ed-0.153
 under-0.152
 on-0.151
 in-0.151
Tokenroom
Token ID3823
Feature activation+2.76e-02

Pos logit contributions
icious+0.429
Little+0.421
Everybody+0.394
ograph+0.380
Everyone+0.380
Neg logit contributions
 under-0.617
 and-0.592
ed-0.575
driver-0.567
 he-0.561
Token,
Token ID11
Feature activation+9.39e-02

Pos logit contributions
 under+0.020
 and+0.020
ed+0.018
 in+0.017
driver+0.016
Neg logit contributions
icious-0.065
Little-0.064
Everybody-0.061
 disl-0.061
Ms-0.061
Token v
Token ID410
Feature activation-1.62e-01

Pos logit contributions
 and+0.144
 under+0.139
 in+0.131
ed+0.128
driver+0.128
Neg logit contributions
icious-0.153
Little-0.149
reci-0.142
rous-0.141
Everybody-0.141
Tokenroom
Token ID3823
Feature activation+2.75e-02

Pos logit contributions
icious+0.238
Little+0.231
Everybody+0.218
reci+0.215
Ms+0.215
Neg logit contributions
 and-0.266
 under-0.265
ed-0.251
 in-0.246
driver-0.243
Token,
Token ID11
Feature activation+9.39e-02

Pos logit contributions
 and+0.031
 under+0.031
ed+0.029
 in+0.028
driver+0.027
Neg logit contributions
icious-0.054
Little-0.053
Everybody-0.050
reci-0.050
Ms-0.050
Token car
Token ID1097
Feature activation+1.64e-02

Pos logit contributions
icious+0.158
Little+0.153
rous+0.153
reci+0.152
 disl+0.151
Neg logit contributions
 and-0.095
 under-0.087
 on-0.081
 in-0.081
 he-0.077
Token.
Token ID13
Feature activation+1.12e-02

Pos logit contributions
ed+0.015
 under+0.014
 knees+0.014
ously+0.013
driver+0.013
Neg logit contributions
icious-0.032
 difference-0.031
One-0.031
Little-0.031
 sym-0.030
Token '\n'
Token ID198
Feature activation-2.42e-01

Pos logit contributions
 and+0.016
 under+0.014
 in+0.013
 on+0.013
ed+0.012
Neg logit contributions
icious-0.023
ograph-0.021
Little-0.021
reci-0.021
rous-0.020
Token '\n'
Token ID198
Feature activation-2.42e-01

Pos logit contributions
icious+0.585
 former+0.555
ograph+0.537
 difference+0.530
gary+0.529
Neg logit contributions
 and-0.352
 under-0.257
 he-0.252
 in-0.251
 on-0.236
Token"
Token ID1
Feature activation-8.04e-02

Pos logit contributions
 difference+0.438
 former+0.423
icious+0.390
reci+0.389
 disl+0.375
Neg logit contributions
 and-0.428
 under-0.374
 in-0.372
 on-0.365
 he-0.349
TokenV
Token ID53
Feature activation-3.34e-01

Pos logit contributions
 difference+0.110
 former+0.106
rous+0.104
 disl+0.103
icious+0.101
Neg logit contributions
 and-0.142
ed-0.135
 on-0.135
 he-0.133
 under-0.132
Tokenroom
Token ID3823
Feature activation+3.21e-02

Pos logit contributions
icious+0.410
Little+0.395
Everybody+0.369
Everyone+0.357
redible+0.355
Neg logit contributions
 under-0.630
 and-0.593
ed-0.588
 he-0.582
 in-0.575
Token,
Token ID11
Feature activation+9.81e-02

Pos logit contributions
 under+0.024
 and+0.023
ed+0.021
 in+0.020
driver+0.019
Neg logit contributions
icious-0.075
Little-0.073
Everybody-0.071
 disl-0.070
Ms-0.070
Token v
Token ID410
Feature activation-1.60e-01

Pos logit contributions
 and+0.154
 under+0.144
 in+0.138
 on+0.135
driver+0.134
Neg logit contributions
icious-0.166
rous-0.157
Little-0.156
Ms-0.154
 former-0.152
Tokenroom
Token ID3823
Feature activation+3.16e-02

Pos logit contributions
icious+0.218
Little+0.208
Everybody+0.198
Everyone+0.192
redible+0.191
Neg logit contributions
 under-0.280
 and-0.277
ed-0.258
 in-0.258
 he-0.250
Token!"
Token ID2474
Feature activation-8.85e-02

Pos logit contributions
 and+0.034
 under+0.034
ed+0.032
 in+0.031
driver+0.030
Neg logit contributions
icious-0.063
Little-0.062
Everybody-0.059
reci-0.059
Ms-0.059
Token car
Token ID1097
Feature activation+1.51e-02

Pos logit contributions
ograph+0.028
 difference+0.027
assion+0.027
icious+0.027
One+0.027
Neg logit contributions
ed-0.022
driver-0.021
 under-0.021
ously-0.020
 and-0.019
Token and
Token ID290
Feature activation+2.17e-01

Pos logit contributions
ed+0.013
driver+0.013
 under+0.012
ously+0.012
 knees+0.011
Neg logit contributions
One-0.029
assion-0.029
icious-0.029
Little-0.028
Ms-0.027
Token said
Token ID531
Feature activation-6.31e-02

Pos logit contributions
driver+0.286
ed+0.260
aces+0.260
pered+0.254
ously+0.247
Neg logit contributions
Everyone-0.320
 disl-0.320
Br-0.318
Everybody-0.313
Little-0.312
Token,
Token ID11
Feature activation+9.60e-02

Pos logit contributions
Little+0.133
icious+0.133
Everybody+0.133
 disl+0.133
Everyone+0.132
Neg logit contributions
 under-0.051
 and-0.047
ed-0.043
driver-0.042
 knees-0.038
Token "
Token ID366
Feature activation-1.18e-01

Pos logit contributions
 under+0.122
ed+0.121
driver+0.114
 and+0.111
aces+0.105
Neg logit contributions
Little-0.159
icious-0.156
Ms-0.155
One-0.152
assion-0.150
TokenV
Token ID53
Feature activation-3.34e-01

Pos logit contributions
 difference+0.175
 disl+0.164
 former+0.160
icious+0.160
rous+0.155
Neg logit contributions
 and-0.208
ed-0.201
 under-0.191
 went-0.191
 in-0.190
Tokenanish
Token ID7115
Feature activation+2.66e-02

Pos logit contributions
icious+0.460
Little+0.447
Everybody+0.417
E+0.407
ograph+0.406
Neg logit contributions
 under-0.575
 and-0.570
ed-0.545
driver-0.534
 in-0.531
Token!"
Token ID2474
Feature activation-8.90e-02

Pos logit contributions
 and+0.028
 under+0.027
 in+0.025
ed+0.024
 on+0.022
Neg logit contributions
icious-0.057
reci-0.056
Little-0.056
 former-0.055
rous-0.055
Token The
Token ID383
Feature activation-2.78e-01

Pos logit contributions
icious+0.104
Little+0.102
Everybody+0.095
Br+0.093
Everyone+0.093
Neg logit contributions
 under-0.169
 and-0.165
ed-0.162
driver-0.159
 in-0.157
Token toy
Token ID13373
Feature activation-1.87e-02

Pos logit contributions
icious+0.467
Little+0.447
Ms+0.428
rous+0.420
Everybody+0.415
Neg logit contributions
 and-0.437
 under-0.428
 in-0.408
 on-0.389
ed-0.387
Token car
Token ID1097
Feature activation+1.42e-02

Pos logit contributions
Everyone+0.026
Tal+0.026
ograph+0.025
Everybody+0.025
Little+0.025
Neg logit contributions
ously-0.025
 under-0.025
driver-0.025
 and-0.024
ed-0.024
Tokenbows
Token ID25435
Feature activation+1.03e-02

Pos logit contributions
 sym+0.017
icious+0.017
qu+0.017
redible+0.016
assion+0.016
Neg logit contributions
 under-0.043
 knees-0.043
ously-0.042
 he-0.042
driver-0.042
Token.
Token ID13
Feature activation+9.80e-03

Pos logit contributions
driver+0.011
 knees+0.010
ed+0.010
 under+0.010
pered+0.009
Neg logit contributions
qu-0.018
icious-0.018
 sym-0.017
is-0.017
 difference-0.017
Token ' '
Token ID220
Feature activation-2.04e-01

Pos logit contributions
 and+0.014
 under+0.013
ed+0.012
 on+0.012
 in+0.012
Neg logit contributions
icious-0.019
Little-0.018
 disl-0.018
gary-0.018
ograph-0.018
Token '\n'
Token ID198
Feature activation-2.36e-01

Pos logit contributions
icious+0.373
rous+0.365
ograph+0.364
 disl+0.360
 difference+0.358
Neg logit contributions
 and-0.321
 under-0.277
ed-0.274
 in-0.271
 on-0.257
Token '\n'
Token ID198
Feature activation-2.36e-01

Pos logit contributions
 difference+0.517
 disl+0.517
 former+0.511
icious+0.508
rous+0.489
Neg logit contributions
 and-0.353
 on-0.286
 under-0.284
 in-0.274
ed-0.247
TokenV
Token ID53
Feature activation-3.33e-01

Pos logit contributions
 disl+0.327
 difference+0.320
 former+0.303
icious+0.294
rous+0.287
Neg logit contributions
 and-0.482
 on-0.440
 under-0.437
 in-0.430
ed-0.415
Tokenera
Token ID8607
Feature activation-1.50e-03

Pos logit contributions
icious+0.479
Little+0.467
reci+0.439
 disl+0.439
Ms+0.437
Neg logit contributions
 and-0.565
 under-0.554
ed-0.533
 in-0.519
driver-0.506
Token was
Token ID373
Feature activation+5.22e-02

Pos logit contributions
icious+0.003
Little+0.003
Everybody+0.003
reci+0.003
 disl+0.003
Neg logit contributions
 under-0.001
 and-0.001
ed-0.001
driver-0.001
 in-0.001
Token so
Token ID523
Feature activation-2.65e-02

Pos logit contributions
driver+0.056
 under+0.053
 and+0.053
ed+0.052
 knees+0.048
Neg logit contributions
icious-0.102
Little-0.101
Everybody-0.097
 sym-0.095
Ms-0.095
Token happy
Token ID3772
Feature activation-2.08e-01

Pos logit contributions
icious+0.047
Little+0.046
Ms+0.043
Everybody+0.043
reci+0.043
Neg logit contributions
 and-0.037
 under-0.036
ed-0.034
 in-0.034
driver-0.032
Token.
Token ID13
Feature activation+8.81e-03

Pos logit contributions
 disl+0.318
 sym+0.302
Little+0.302
 difference+0.299
 wonderful+0.289
Neg logit contributions
driver-0.276
aces-0.262
pered-0.243
ed-0.234
 brow-0.233
Token truck
Token ID7779
Feature activation+6.19e-02

Pos logit contributions
icious+0.259
Little+0.252
Everybody+0.244
 disl+0.239
reci+0.239
Neg logit contributions
 and-0.176
 under-0.174
 in-0.158
ed-0.156
 on-0.152
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
ed+0.050
 knees+0.050
ously+0.049
 under+0.047
driver+0.046
Neg logit contributions
assion-0.119
One-0.119
icious-0.117
Little-0.115
 difference-0.112
Token '\n'
Token ID198
Feature activation-2.35e-01

Pos logit contributions
 and+0.015
 under+0.013
 he+0.013
ed+0.013
 in+0.012
Neg logit contributions
icious-0.030
Little-0.028
ograph-0.028
reci-0.028
rous-0.027
Token '\n'
Token ID198
Feature activation-2.36e-01

Pos logit contributions
icious+0.511
 former+0.481
ograph+0.463
 disl+0.461
assion+0.457
Neg logit contributions
 and-0.365
 under-0.301
 he-0.296
 in-0.292
 on-0.276
Token"
Token ID1
Feature activation-7.66e-02

Pos logit contributions
 difference+0.309
icious+0.305
ograph+0.290
reci+0.289
 former+0.285
Neg logit contributions
 and-0.459
 under-0.429
 he-0.423
 in-0.423
ed-0.405
TokenV
Token ID53
Feature activation-3.33e-01

Pos logit contributions
 disl+0.097
rous+0.095
 difference+0.093
 former+0.089
reci+0.089
Neg logit contributions
 and-0.142
 under-0.136
ed-0.136
 he-0.135
 on-0.134
Tokenroom
Token ID3823
Feature activation+3.36e-02

Pos logit contributions
icious+0.433
Little+0.414
Everybody+0.386
redible+0.375
Everyone+0.374
Neg logit contributions
 under-0.611
 and-0.585
ed-0.567
 in-0.559
 he-0.558
Token,
Token ID11
Feature activation+9.92e-02

Pos logit contributions
 under+0.025
 and+0.024
ed+0.022
 in+0.021
driver+0.020
Neg logit contributions
icious-0.078
Little-0.077
Everybody-0.074
 disl-0.074
Ms-0.073
Token v
Token ID410
Feature activation-1.58e-01

Pos logit contributions
 and+0.156
 under+0.151
 in+0.142
ed+0.140
driver+0.139
Neg logit contributions
icious-0.157
Little-0.152
Everybody-0.145
reci-0.145
rous-0.145
Tokenroom
Token ID3823
Feature activation+3.33e-02

Pos logit contributions
icious+0.223
Little+0.213
Everybody+0.202
Ms+0.199
 disl+0.197
Neg logit contributions
 under-0.270
 and-0.268
ed-0.250
 in-0.249
driver-0.243
Token!"
Token ID2474
Feature activation-8.59e-02

Pos logit contributions
 under+0.035
 and+0.035
ed+0.032
 in+0.031
driver+0.030
Neg logit contributions
icious-0.068
Little-0.067
Everybody-0.064
Ms-0.063
 disl-0.063
Token a
Token ID257
Feature activation-7.32e-02

Pos logit contributions
Little+0.077
icious+0.076
Everybody+0.073
rous+0.073
E+0.072
Neg logit contributions
 under-0.040
 and-0.040
ed-0.038
driver-0.036
 in-0.035
Token sign
Token ID1051
Feature activation+5.04e-02

Pos logit contributions
icious+0.132
Little+0.129
Everybody+0.122
reci+0.122
Ms+0.121
Neg logit contributions
 and-0.097
 under-0.095
ed-0.089
 in-0.088
driver-0.086
Token that
Token ID326
Feature activation-8.32e-02

Pos logit contributions
ed+0.042
 under+0.042
driver+0.038
 knees+0.038
 and+0.036
Neg logit contributions
Little-0.100
icious-0.098
One-0.097
Ms-0.096
 difference-0.095
Token said
Token ID531
Feature activation-5.85e-02

Pos logit contributions
Little+0.145
icious+0.141
Everybody+0.136
 disl+0.136
reci+0.135
Neg logit contributions
 under-0.103
 and-0.101
driver-0.097
ed-0.096
 in-0.091
Token "
Token ID366
Feature activation-1.14e-01

Pos logit contributions
icious+0.110
Little+0.109
Everybody+0.107
Everyone+0.106
 disl+0.105
Neg logit contributions
 under-0.065
 and-0.060
ed-0.060
 knees-0.056
driver-0.055
TokenV
Token ID53
Feature activation-3.32e-01

Pos logit contributions
 difference+0.162
 disl+0.150
icious+0.146
rous+0.137
 sym+0.135
Neg logit contributions
ed-0.214
 and-0.214
 in-0.204
 on-0.200
 under-0.199
Tokenest
Token ID395
Feature activation+3.42e-03

Pos logit contributions
icious+0.444
Little+0.426
Everybody+0.399
ograph+0.390
E+0.389
Neg logit contributions
 under-0.595
 and-0.569
 he-0.547
ed-0.546
 in-0.539
Token for
Token ID329
Feature activation-1.43e-03

Pos logit contributions
 under+0.004
ed+0.004
 and+0.004
driver+0.004
 in+0.004
Neg logit contributions
icious-0.006
Little-0.006
 difference-0.006
Everybody-0.006
ograph-0.006
Token sale
Token ID5466
Feature activation-3.57e-03

Pos logit contributions
icious+0.002
Little+0.002
reci+0.002
Br+0.002
 disl+0.002
Neg logit contributions
 and-0.002
 under-0.002
 in-0.002
ed-0.002
 on-0.002
Token,
Token ID11
Feature activation+1.00e-01

Pos logit contributions
icious+0.006
Little+0.006
Ms+0.005
Everybody+0.005
reci+0.005
Neg logit contributions
 under-0.005
 and-0.005
ed-0.005
 in-0.005
driver-0.005
Token one
Token ID530
Feature activation-3.58e-02

Pos logit contributions
 under+0.125
ed+0.122
 and+0.121
driver+0.112
 in+0.111
Neg logit contributions
icious-0.183
Little-0.180
Everybody-0.170
E-0.168
Ms-0.167
Token doll
Token ID3654
Feature activation+7.78e-02

Pos logit contributions
 and+0.023
 under+0.023
ed+0.022
driver+0.021
 in+0.020
Neg logit contributions
icious-0.038
Little-0.038
Everybody-0.036
reci-0.036
 disl-0.036
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
ed+0.082
ously+0.077
driver+0.076
 under+0.073
 and+0.071
Neg logit contributions
icious-0.144
Little-0.138
assion-0.137
One-0.137
ograph-0.136
Token
Token ID198
Feature activation-2.36e-01

Pos logit contributions
 and+0.022
 he+0.019
 under+0.019
 He+0.018
ed+0.017
Neg logit contributions
icious-0.027
ograph-0.027
Little-0.026
reci-0.025
 former-0.025
Token
Token ID198
Feature activation-2.36e-01

Pos logit contributions
 former+0.518
 disl+0.517
icious+0.514
ograph+0.484
assion+0.480
Neg logit contributions
 and-0.352
 he-0.280
 under-0.279
 on-0.274
 in-0.273
Token"
Token ID1
Feature activation-7.71e-02

Pos logit contributions
 difference+0.265
 former+0.256
reci+0.244
 disl+0.244
ograph+0.242
Neg logit contributions
 and-0.546
 he-0.484
 under-0.479
 on-0.476
 in-0.475
TokenV
Token ID53
Feature activation-3.32e-01

Pos logit contributions
 difference+0.091
 disl+0.090
reci+0.088
 former+0.088
icious+0.088
Neg logit contributions
 and-0.150
 under-0.142
ed-0.142
 on-0.139
 in-0.138
Tokenroom
Token ID3823
Feature activation+3.29e-02

Pos logit contributions
icious+0.439
Little+0.429
Everybody+0.400
Everyone+0.386
E+0.384
Neg logit contributions
 under-0.594
 and-0.576
ed-0.556
 in-0.545
driver-0.545
Token,
Token ID11
Feature activation+9.88e-02

Pos logit contributions
 under+0.025
 and+0.023
ed+0.022
 in+0.020
driver+0.020
Neg logit contributions
icious-0.075
Little-0.074
 disl-0.071
Everybody-0.071
redible-0.070
Token v
Token ID410
Feature activation-1.57e-01

Pos logit contributions
 and+0.144
 under+0.143
ed+0.135
 in+0.132
driver+0.130
Neg logit contributions
icious-0.163
Little-0.160
Everybody-0.151
reci-0.150
Ms-0.149
Tokenroom
Token ID3823
Feature activation+3.25e-02

Pos logit contributions
icious+0.228
Little+0.221
Everybody+0.209
Ms+0.205
reci+0.205
Neg logit contributions
 and-0.259
 under-0.258
ed-0.243
 in-0.239
driver-0.236
Token,
Token ID11
Feature activation+9.87e-02

Pos logit contributions
 under+0.035
 and+0.034
ed+0.033
 in+0.031
driver+0.031
Neg logit contributions
icious-0.064
Little-0.063
 disl-0.060
Everybody-0.060
Ms-0.059
Token again
Token ID757
Feature activation-1.35e-02

Pos logit contributions
Little+0.158
icious+0.157
 difference+0.150
Ms+0.149
One+0.148
Neg logit contributions
driver-0.140
ed-0.139
 under-0.134
 knees-0.134
aces-0.133
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
Little+0.029
icious+0.029
 disl+0.027
Everybody+0.027
 difference+0.027
Neg logit contributions
 under-0.012
ed-0.012
driver-0.011
 and-0.011
 in-0.010
Token
Token ID220
Feature activation-2.04e-01

Pos logit contributions
 and+0.011
 under+0.011
ed+0.010
 in+0.009
driver+0.009
Neg logit contributions
icious-0.022
Little-0.022
Everybody-0.021
reci-0.021
 disl-0.021
Token
Token ID198
Feature activation-2.35e-01

Pos logit contributions
rous+0.360
icious+0.352
 disl+0.350
reci+0.346
ograph+0.344
Neg logit contributions
 and-0.323
 under-0.281
 He-0.280
ed-0.276
 he-0.274
Token
Token ID198
Feature activation-2.35e-01

Pos logit contributions
 former+0.512
icious+0.492
 disl+0.490
 difference+0.481
rous+0.476
Neg logit contributions
 and-0.351
 on-0.288
 he-0.278
 under-0.271
 He-0.268
TokenV
Token ID53
Feature activation-3.32e-01

Pos logit contributions
 former+0.303
 difference+0.298
 disl+0.293
rous+0.285
icious+0.266
Neg logit contributions
 and-0.485
 on-0.447
 he-0.435
driver-0.430
 in-0.428
Tokenince
Token ID924
Feature activation+2.82e-02

Pos logit contributions
icious+0.460
Little+0.448
Everybody+0.418
reci+0.411
Ms+0.411
Neg logit contributions
 under-0.571
 and-0.569
ed-0.540
 in-0.531
driver-0.524
Token was
Token ID373
Feature activation+5.16e-02

Pos logit contributions
 under+0.031
driver+0.027
ed+0.027
 and+0.027
 knees+0.024
Neg logit contributions
Little-0.056
icious-0.054
Everybody-0.052
Everyone-0.051
Ms-0.051
Token very
Token ID845
Feature activation-2.77e-02

Pos logit contributions
 under+0.055
 and+0.055
ed+0.052
driver+0.051
 in+0.047
Neg logit contributions
icious-0.103
Little-0.101
Everybody-0.097
Ms-0.096
 disl-0.095
Token happy
Token ID3772
Feature activation-2.08e-01

Pos logit contributions
icious+0.045
Little+0.043
reci+0.042
ograph+0.041
Everybody+0.041
Neg logit contributions
 and-0.043
 under-0.042
 in-0.040
ed-0.040
 on-0.038
Token and
Token ID290
Feature activation+2.20e-01

Pos logit contributions
 difference+0.380
Little+0.367
 wonderful+0.356
 disl+0.355
 sym+0.350
Neg logit contributions
driver-0.207
aces-0.198
ed-0.190
ped-0.176
pered-0.175
Token her
Token ID607
Feature activation+1.80e-02

Pos logit contributions
icious+0.088
Little+0.087
Everybody+0.083
Ms+0.082
 disl+0.082
Neg logit contributions
 and-0.046
 under-0.045
ed-0.043
driver-0.041
 in-0.040
Token mission
Token ID4365
Feature activation-9.95e-03

Pos logit contributions
 and+0.030
 under+0.029
 in+0.028
 on+0.027
ed+0.027
Neg logit contributions
icious-0.028
Little-0.026
Ms-0.025
E-0.025
Everybody-0.025
Token!
Token ID0
Feature activation-1.73e-02

Pos logit contributions
Little+0.023
icious+0.023
 difference+0.023
Everybody+0.023
Ms+0.022
Neg logit contributions
 under-0.006
ed-0.006
driver-0.006
 and-0.005
 knees-0.005
Token
Token ID198
Feature activation-2.35e-01

Pos logit contributions
icious+0.032
 disl+0.029
Little+0.028
rous+0.028
redible+0.028
Neg logit contributions
 and-0.027
ed-0.025
 under-0.025
 on-0.024
 in-0.024
Token
Token ID198
Feature activation-2.35e-01

Pos logit contributions
icious+0.419
 former+0.416
 disl+0.383
 difference+0.373
rous+0.361
Neg logit contributions
 and-0.438
 under-0.407
 on-0.402
 in-0.388
ed-0.354
TokenV
Token ID53
Feature activation-3.31e-01

Pos logit contributions
 disl+0.316
 former+0.306
 difference+0.295
rous+0.278
 sym+0.265
Neg logit contributions
 and-0.490
 on-0.483
 under-0.460
 in-0.454
ed-0.426
Tokeniolet
Token ID19194
Feature activation+3.38e-02

Pos logit contributions
icious+0.396
Little+0.385
reci+0.356
 disl+0.356
Everybody+0.355
Neg logit contributions
 and-0.637
 under-0.629
ed-0.606
 in-0.593
driver-0.585
Token grinned
Token ID36268
Feature activation-2.82e-02

Pos logit contributions
 under+0.035
 and+0.033
driver+0.032
ed+0.031
 in+0.029
Neg logit contributions
Little-0.067
icious-0.067
Everybody-0.065
Everyone-0.064
Ms-0.063
Token and
Token ID290
Feature activation+2.20e-01

Pos logit contributions
Little+0.061
 difference+0.061
 sym+0.061
 former+0.060
assion+0.060
Neg logit contributions
driver-0.023
 hid-0.017
 wouldn-0.016
aces-0.015
 knees-0.015
Token knew
Token ID2993
Feature activation+1.36e-03

Pos logit contributions
driver+0.283
 under+0.267
 and+0.259
aces+0.239
ed+0.236
Neg logit contributions
Little-0.390
Everybody-0.364
icious-0.363
Everyone-0.361
Br-0.354
Token that
Token ID326
Feature activation-8.65e-02

Pos logit contributions
ed+0.001
 under+0.001
driver+0.001
 and+0.001
 knees+0.001
Neg logit contributions
Little-0.003
Everybody-0.002
icious-0.002
Everyone-0.002
 difference-0.002
Token other
Token ID584
Feature activation-4.75e-02

Pos logit contributions
icious+0.067
Little+0.062
 disl+0.061
reci+0.059
ograph+0.059
Neg logit contributions
 and-0.061
 under-0.059
 in-0.055
ed-0.054
 on-0.052
Token.
Token ID13
Feature activation+1.58e-02

Pos logit contributions
 difference+0.087
icious+0.084
Little+0.084
 disl+0.083
assion+0.082
Neg logit contributions
ed-0.049
driver-0.046
 under-0.045
ously-0.040
 wouldn-0.039
Token
Token ID198
Feature activation-2.36e-01

Pos logit contributions
 and+0.020
 under+0.017
 in+0.016
 on+0.016
 He+0.016
Neg logit contributions
icious-0.036
gary-0.032
ograph-0.032
Little-0.032
redible-0.032
Token
Token ID198
Feature activation-2.36e-01

Pos logit contributions
icious+0.635
 former+0.586
ograph+0.585
gary+0.577
rous+0.574
Neg logit contributions
 and-0.297
 he-0.200
 He-0.196
 in-0.190
 under-0.189
Token"
Token ID1
Feature activation-7.37e-02

Pos logit contributions
 difference+0.510
icious+0.454
 disl+0.449
rous+0.448
 former+0.446
Neg logit contributions
 and-0.369
 in-0.300
 on-0.291
 he-0.283
 under-0.282
TokenV
Token ID53
Feature activation-3.31e-01

Pos logit contributions
rous+0.094
 difference+0.092
assion+0.088
icious+0.088
 disl+0.087
Neg logit contributions
 and-0.132
ed-0.128
 under-0.127
 in-0.124
 on-0.123
Tokenroom
Token ID3823
Feature activation+3.60e-02

Pos logit contributions
icious+0.422
Little+0.406
Everybody+0.382
Everyone+0.370
redible+0.367
Neg logit contributions
 under-0.610
 and-0.573
 he-0.565
ed-0.564
driver-0.555
Token,
Token ID11
Feature activation+1.03e-01

Pos logit contributions
 under+0.025
 and+0.023
ed+0.022
 in+0.020
driver+0.019
Neg logit contributions
icious-0.084
Little-0.084
 disl-0.080
Everybody-0.080
Ms-0.079
Token v
Token ID410
Feature activation-1.56e-01

Pos logit contributions
 and+0.164
 under+0.157
 in+0.149
 on+0.145
driver+0.144
Neg logit contributions
icious-0.167
Little-0.160
rous-0.155
Ms-0.154
reci-0.153
Tokenroom
Token ID3823
Feature activation+3.57e-02

Pos logit contributions
icious+0.208
Little+0.188
Everybody+0.187
redible+0.183
Everyone+0.183
Neg logit contributions
 under-0.281
 and-0.274
 he-0.257
 in-0.256
 knees-0.252
Token,
Token ID11
Feature activation+1.03e-01

Pos logit contributions
 under+0.027
 and+0.026
ed+0.024
 in+0.022
driver+0.022
Neg logit contributions
icious-0.083
Little-0.082
Everybody-0.078
 disl-0.078
Ms-0.078
Token roaring
Token ID43170
Feature activation-1.73e-02

Pos logit contributions
ed+0.094
 and+0.093
 under+0.092
driver+0.092
 knees+0.083
Neg logit contributions
icious-0.155
Little-0.152
Everybody-0.148
 difference-0.146
 disl-0.145
Token.
Token ID13
Feature activation+1.43e-02

Pos logit contributions
Little+0.030
icious+0.029
 sym+0.029
Everyone+0.029
Everybody+0.029
Neg logit contributions
driver-0.020
 knees-0.018
ed-0.017
ously-0.017
 under-0.017
Token
Token ID198
Feature activation-2.37e-01

Pos logit contributions
 and+0.016
 under+0.014
 in+0.013
 on+0.012
ed+0.012
Neg logit contributions
icious-0.033
rous-0.031
Little-0.031
ograph-0.030
gary-0.030
Token
Token ID198
Feature activation-2.37e-01

Pos logit contributions
icious+0.602
 former+0.571
ograph+0.556
 disl+0.553
rous+0.549
Neg logit contributions
 and-0.333
 under-0.249
 he-0.238
 in-0.237
 on-0.221
Token"
Token ID1
Feature activation-7.58e-02

Pos logit contributions
 difference+0.441
 former+0.412
 disl+0.400
rous+0.399
icious+0.397
Neg logit contributions
 and-0.415
 under-0.365
 in-0.355
 on-0.349
 he-0.332
TokenV
Token ID53
Feature activation-3.30e-01

Pos logit contributions
rous+0.100
 difference+0.099
 disl+0.099
icious+0.096
 former+0.095
Neg logit contributions
 and-0.137
ed-0.130
 under-0.128
 on-0.126
 in-0.125
Tokenroom
Token ID3823
Feature activation+3.56e-02

Pos logit contributions
icious+0.436
Little+0.423
Everybody+0.397
ograph+0.386
Everyone+0.384
Neg logit contributions
 under-0.592
 and-0.568
ed-0.552
 in-0.545
 he-0.536
Token,
Token ID11
Feature activation+1.03e-01

Pos logit contributions
 under+0.026
 and+0.024
ed+0.023
 in+0.021
driver+0.021
Neg logit contributions
icious-0.082
Little-0.081
 disl-0.078
Everybody-0.078
redible-0.077
Token v
Token ID410
Feature activation-1.56e-01

Pos logit contributions
 and+0.156
 under+0.148
 in+0.141
 on+0.136
driver+0.136
Neg logit contributions
icious-0.176
Little-0.166
rous-0.165
Ms-0.162
reci-0.161
Tokenroom
Token ID3823
Feature activation+3.54e-02

Pos logit contributions
icious+0.234
Little+0.227
Everybody+0.215
 disl+0.211
Ms+0.210
Neg logit contributions
 under-0.249
 and-0.249
ed-0.232
 in-0.230
driver-0.226
Token,
Token ID11
Feature activation+1.02e-01

Pos logit contributions
 under+0.032
 and+0.031
ed+0.029
 in+0.028
driver+0.027
Neg logit contributions
icious-0.076
Little-0.075
Everybody-0.072
 disl-0.071
reci-0.071
Token V
Token ID569
Feature activation-2.83e-01

Pos logit contributions
icious+0.158
Little+0.155
Br+0.152
E+0.152
Everybody+0.152
Neg logit contributions
 and-0.061
 under-0.055
 in-0.051
 on-0.046
ed-0.043
Tokenicky
Token ID17479
Feature activation-4.28e-03

Pos logit contributions
icious+0.430
Little+0.421
ograph+0.388
Everybody+0.387
Ms+0.386
Neg logit contributions
 under-0.443
 and-0.438
ed-0.420
driver-0.411
 in-0.406
Token?"
Token ID1701
Feature activation-1.19e-01

Pos logit contributions
Little+0.006
icious+0.006
ograph+0.006
 difference+0.006
Ms+0.006
Neg logit contributions
 under-0.006
ed-0.006
driver-0.006
 and-0.006
 knees-0.006
Token
Token ID198
Feature activation-2.34e-01

Pos logit contributions
icious+0.239
rous+0.218
 difference+0.217
 former+0.216
Little+0.214
Neg logit contributions
 and-0.180
 under-0.157
 in-0.153
ed-0.143
 he-0.143
Token
Token ID198
Feature activation-2.34e-01

Pos logit contributions
icious+0.584
 former+0.581
 difference+0.541
 fullest+0.533
rous+0.531
Neg logit contributions
 and-0.340
 in-0.259
 under-0.250
 to-0.241
 on-0.236
TokenV
Token ID53
Feature activation-3.30e-01

Pos logit contributions
 difference+0.382
 former+0.370
rous+0.355
 lifetime+0.335
reci+0.330
Neg logit contributions
 and-0.465
 in-0.412
 under-0.388
 ran-0.378
 on-0.375
Tokenicky
Token ID17479
Feature activation-4.97e-03

Pos logit contributions
icious+0.506
Little+0.497
Everybody+0.460
E+0.457
Ms+0.453
Neg logit contributions
 under-0.514
 and-0.504
driver-0.481
ed-0.479
 in-0.467
Token said
Token ID531
Feature activation-6.20e-02

Pos logit contributions
Little+0.010
icious+0.009
Everybody+0.009
reci+0.009
Everyone+0.009
Neg logit contributions
 under-0.005
driver-0.005
ed-0.005
 and-0.005
 on-0.004
Token in
Token ID287
Feature activation+4.21e-03

Pos logit contributions
Little+0.111
Everyone+0.110
Everybody+0.110
 disl+0.110
Ms+0.108
Neg logit contributions
aces-0.064
 under-0.062
driver-0.061
 knees-0.057
eer-0.056
Token a
Token ID257
Feature activation-7.54e-02

Pos logit contributions
 and+0.005
 under+0.005
ed+0.004
 in+0.004
driver+0.004
Neg logit contributions
icious-0.009
Little-0.008
Everybody-0.008
reci-0.008
Ms-0.008
Token shaky
Token ID35335
Feature activation-1.49e-01

Pos logit contributions
icious+0.130
Little+0.118
reci+0.117
redible+0.116
assion+0.116
Neg logit contributions
 and-0.120
 under-0.111
 in-0.107
 on-0.105
 he-0.102
Token will
Token ID481
Feature activation+7.23e-02

Pos logit contributions
icious+0.042
Little+0.041
Everybody+0.039
 difference+0.039
reci+0.038
Neg logit contributions
 under-0.045
ed-0.043
 and-0.042
driver-0.041
 in-0.041
Token try
Token ID1949
Feature activation-1.99e-03

Pos logit contributions
 and+0.092
 under+0.086
 in+0.081
ed+0.077
 on+0.076
Neg logit contributions
icious-0.139
Little-0.136
Ms-0.132
reci-0.130
rous-0.130
Token!"
Token ID2474
Feature activation-8.49e-02

Pos logit contributions
icious+0.005
Little+0.004
reci+0.004
Ms+0.004
Everybody+0.004
Neg logit contributions
 and-0.002
 under-0.002
 in-0.002
 on-0.002
ed-0.002
Token
Token ID198
Feature activation-2.33e-01

Pos logit contributions
Little+0.113
E+0.106
Br+0.106
Everyone+0.106
Everybody+0.104
Neg logit contributions
 under-0.131
driver-0.131
ed-0.125
 explorers-0.120
 behind-0.119
Token
Token ID198
Feature activation-2.33e-01

Pos logit contributions
icious+0.511
 disl+0.505
 former+0.495
 difference+0.479
rous+0.476
Neg logit contributions
 and-0.368
 under-0.299
 in-0.288
 he-0.273
 on-0.267
TokenV
Token ID53
Feature activation-3.30e-01

Pos logit contributions
 disl+0.303
 difference+0.298
rous+0.292
reci+0.285
icious+0.257
Neg logit contributions
 and-0.476
 in-0.431
 under-0.429
 he-0.413
 on-0.394
Tokenicky
Token ID17479
Feature activation-3.64e-03

Pos logit contributions
icious+0.499
Little+0.488
Everybody+0.456
E+0.445
Everyone+0.444
Neg logit contributions
 under-0.526
 and-0.511
ed-0.491
driver-0.486
 in-0.474
Token took
Token ID1718
Feature activation-2.48e-02

Pos logit contributions
Little+0.008
Everybody+0.008
icious+0.008
reci+0.008
 sym+0.008
Neg logit contributions
 under-0.003
driver-0.002
ed-0.002
 and-0.002
 in-0.002
Token a
Token ID257
Feature activation-7.45e-02

Pos logit contributions
Little+0.040
E+0.039
Everybody+0.039
Everyone+0.038
icious+0.038
Neg logit contributions
 under-0.033
 and-0.033
driver-0.032
ed-0.031
 knees-0.030
Token deep
Token ID2769
Feature activation-1.51e-01

Pos logit contributions
icious+0.133
Little+0.124
 disl+0.122
reci+0.120
Everybody+0.119
Neg logit contributions
 and-0.109
 under-0.105
 in-0.098
driver-0.095
 on-0.095
Token breath
Token ID8033
Feature activation+1.05e-01

Pos logit contributions
Little+0.258
Br+0.245
One+0.242
E+0.241
rous+0.239
Neg logit contributions
ed-0.189
 under-0.175
 wouldn-0.163
driver-0.162
 and-0.158
Token her
Token ID607
Feature activation+2.12e-02

Pos logit contributions
icious+0.118
Little+0.111
reci+0.110
 disl+0.110
rous+0.109
Neg logit contributions
 and-0.060
 under-0.056
 in-0.053
ed-0.049
 on-0.049
Token!
Token ID0
Feature activation-1.56e-02

Pos logit contributions
 and+0.015
 under+0.014
 in+0.012
ed+0.011
 on+0.010
Neg logit contributions
icious-0.053
Little-0.052
 disl-0.051
Everybody-0.051
Br-0.051
Token
Token ID220
Feature activation-2.01e-01

Pos logit contributions
icious+0.029
rous+0.026
 disl+0.026
redible+0.026
 difference+0.026
Neg logit contributions
 and-0.024
ed-0.022
 under-0.021
 on-0.021
 in-0.020
Token
Token ID198
Feature activation-2.32e-01

Pos logit contributions
icious+0.368
rous+0.360
 difference+0.353
ograph+0.347
reci+0.341
Neg logit contributions
 and-0.306
ed-0.297
 in-0.275
 under-0.269
 on-0.264
Token
Token ID198
Feature activation-2.32e-01

Pos logit contributions
 former+0.515
icious+0.507
 difference+0.507
rous+0.493
 disl+0.477
Neg logit contributions
 and-0.317
 on-0.278
 in-0.256
 under-0.253
ed-0.251
TokenV
Token ID53
Feature activation-3.30e-01

Pos logit contributions
 difference+0.321
 former+0.318
 disl+0.312
rous+0.305
itionally+0.291
Neg logit contributions
 and-0.469
 on-0.461
 in-0.424
 under-0.410
ed-0.409
Tokeniv
Token ID452
Feature activation-2.72e-02

Pos logit contributions
icious+0.487
Little+0.469
 disl+0.460
rous+0.453
Br+0.450
Neg logit contributions
 and-0.547
 under-0.521
ed-0.513
 in-0.495
driver-0.494
Tokenian
Token ID666
Feature activation+7.36e-02

Pos logit contributions
Little+0.046
icious+0.045
reci+0.043
Everybody+0.043
E+0.043
Neg logit contributions
 under-0.038
 and-0.037
ed-0.034
 in-0.034
driver-0.034
Token enjoyed
Token ID8359
Feature activation-5.66e-02

Pos logit contributions
 under+0.080
driver+0.072
 and+0.068
 knees+0.068
ed+0.065
Neg logit contributions
Little-0.143
icious-0.134
Everybody-0.132
Ms-0.131
Everyone-0.130
Token playing
Token ID2712
Feature activation-5.24e-02

Pos logit contributions
Little+0.104
icious+0.101
 disl+0.099
Ms+0.098
Everybody+0.097
Neg logit contributions
ed-0.066
 under-0.065
driver-0.064
 and-0.063
 knees-0.057
Token with
Token ID351
Feature activation-4.10e-02

Pos logit contributions
Little+0.106
icious+0.103
Everybody+0.102
 difference+0.101
 sym+0.101
Neg logit contributions
ed-0.047
driver-0.046
 knees-0.043
 under-0.042
ously-0.039
Token car
Token ID1097
Feature activation+2.25e-02

Pos logit contributions
icious+0.294
Little+0.287
Everybody+0.280
ograph+0.279
One+0.277
Neg logit contributions
 under-0.245
ed-0.239
 and-0.233
 in-0.219
driver-0.217
Token.
Token ID13
Feature activation+1.63e-02

Pos logit contributions
ed+0.024
 under+0.022
ously+0.022
 knees+0.022
driver+0.021
Neg logit contributions
icious-0.039
One-0.038
assion-0.038
ograph-0.037
 sym-0.037
Token
Token ID198
Feature activation-2.31e-01

Pos logit contributions
 and+0.022
 under+0.021
ed+0.020
 in+0.019
 he+0.019
Neg logit contributions
icious-0.030
Little-0.029
ograph-0.028
 disl-0.028
reci-0.027
Token
Token ID198
Feature activation-2.31e-01

Pos logit contributions
 disl+0.524
icious+0.520
 former+0.516
assion+0.477
 difference+0.475
Neg logit contributions
 and-0.338
 in-0.267
 he-0.266
 on-0.264
 under-0.264
Token"
Token ID1
Feature activation-7.27e-02

Pos logit contributions
 difference+0.319
 disl+0.310
 former+0.297
icious+0.283
 lasting+0.280
Neg logit contributions
 and-0.465
 in-0.413
 under-0.412
 on-0.402
 he-0.400
TokenV
Token ID53
Feature activation-3.29e-01

Pos logit contributions
 disl+0.088
 difference+0.087
icious+0.085
 former+0.083
rous+0.081
Neg logit contributions
 and-0.143
ed-0.137
 under-0.136
 in-0.135
 on-0.134
Tokenroom
Token ID3823
Feature activation+3.55e-02

Pos logit contributions
icious+0.405
Little+0.401
Everybody+0.371
ograph+0.362
Everyone+0.357
Neg logit contributions
 under-0.614
 and-0.579
 he-0.569
ed-0.568
driver-0.557
Token,
Token ID11
Feature activation+1.02e-01

Pos logit contributions
 under+0.027
 and+0.026
ed+0.024
 in+0.022
driver+0.021
Neg logit contributions
icious-0.082
Little-0.080
 disl-0.077
Everybody-0.077
Ms-0.076
Token v
Token ID410
Feature activation-1.55e-01

Pos logit contributions
 and+0.161
 under+0.157
 in+0.148
ed+0.146
driver+0.145
Neg logit contributions
icious-0.162
Little-0.157
reci-0.149
Everybody-0.149
rous-0.148
Tokenroom
Token ID3823
Feature activation+3.49e-02

Pos logit contributions
icious+0.220
Little+0.212
Everybody+0.201
 disl+0.196
Ms+0.196
Neg logit contributions
 under-0.261
 and-0.261
ed-0.243
 in-0.241
driver-0.236
Token,
Token ID11
Feature activation+1.02e-01

Pos logit contributions
 under+0.036
 and+0.036
ed+0.033
 in+0.032
driver+0.031
Neg logit contributions
icious-0.071
Little-0.070
Everybody-0.067
 disl-0.066
Ms-0.066
Token wand
Token ID11569
Feature activation+9.73e-02

Pos logit contributions
icious+0.234
Little+0.230
reci+0.227
 difference+0.224
Everybody+0.223
Neg logit contributions
 and-0.170
 under-0.168
ed-0.166
driver-0.153
 in-0.148
Token and
Token ID290
Feature activation+2.22e-01

Pos logit contributions
driver+0.082
keeping+0.076
 chefs+0.071
oll+0.068
 Lee+0.068
Neg logit contributions
One-0.169
assion-0.168
Each-0.164
 wonderful-0.163
 sym-0.163
Token said
Token ID531
Feature activation-5.84e-02

Pos logit contributions
driver+0.258
ed+0.228
 and+0.224
aces+0.223
ously+0.221
Neg logit contributions
Little-0.379
Everyone-0.365
Everybody-0.362
Br-0.362
 sym-0.359
Token,
Token ID11
Feature activation+1.01e-01

Pos logit contributions
Everybody+0.117
 disl+0.116
Everyone+0.116
Little+0.114
assion+0.114
Neg logit contributions
 under-0.053
 and-0.046
ed-0.043
 knees-0.042
driver-0.041
Token "
Token ID366
Feature activation-1.12e-01

Pos logit contributions
 under+0.125
ed+0.124
driver+0.116
 and+0.115
 in+0.107
Neg logit contributions
Little-0.171
icious-0.169
Ms-0.165
assion-0.163
One-0.162
TokenV
Token ID53
Feature activation-3.29e-01

Pos logit contributions
 difference+0.169
 disl+0.151
icious+0.147
rous+0.145
 former+0.142
Neg logit contributions
ed-0.200
 and-0.193
 in-0.185
 went-0.181
 on-0.178
Tokenanish
Token ID7115
Feature activation+3.01e-02

Pos logit contributions
icious+0.429
Little+0.414
Everybody+0.385
ograph+0.376
E+0.375
Neg logit contributions
 under-0.594
 and-0.579
ed-0.556
driver-0.547
 in-0.544
Token!"
Token ID2474
Feature activation-8.52e-02

Pos logit contributions
 and+0.044
 under+0.042
ed+0.040
 in+0.040
 on+0.037
Neg logit contributions
icious-0.052
reci-0.050
Little-0.050
rous-0.050
 former-0.050
Token Suddenly
Token ID24975
Feature activation-1.42e-01

Pos logit contributions
icious+0.139
Everyone+0.139
Little+0.138
Everybody+0.136
Br+0.132
Neg logit contributions
 under-0.108
driver-0.103
ed-0.101
ously-0.094
 in-0.093
Token,
Token ID11
Feature activation+9.99e-02

Pos logit contributions
Little+0.214
Everyone+0.208
Everybody+0.203
 wonderful+0.197
One+0.196
Neg logit contributions
edly-0.198
aces-0.176
keeping-0.174
fully-0.170
driver-0.169
Token his
Token ID465
Feature activation+2.36e-02

Pos logit contributions
ously+0.147
driver+0.143
aces+0.142
edly+0.138
pered+0.134
Neg logit contributions
Everyone-0.153
Little-0.151
Everybody-0.146
 wonderful-0.136
The-0.135
Token inside
Token ID2641
Feature activation-2.86e-03

Pos logit contributions
ously+0.008
driver+0.007
 throat+0.007
 nor+0.007
aces+0.007
Neg logit contributions
One-0.019
V-0.019
Little-0.019
E-0.018
The-0.018
Token.
Token ID13
Feature activation+1.75e-02

Pos logit contributions
Little+0.005
Ms+0.005
One+0.005
E+0.005
icious+0.005
Neg logit contributions
ed-0.003
driver-0.003
 under-0.003
aces-0.003
 knees-0.003
Token
Token ID198
Feature activation-2.34e-01

Pos logit contributions
 and+0.020
 under+0.019
ed+0.017
 in+0.017
driver+0.016
Neg logit contributions
icious-0.038
Little-0.035
reci-0.035
 difference-0.035
ograph-0.034
Token
Token ID198
Feature activation-2.34e-01

Pos logit contributions
icious+0.529
 former+0.491
ograph+0.480
 difference+0.469
 disl+0.462
Neg logit contributions
 and-0.355
 under-0.294
 he-0.283
 in-0.280
 on-0.267
Token"
Token ID1
Feature activation-7.35e-02

Pos logit contributions
 difference+0.405
icious+0.375
 former+0.372
reci+0.362
 sym+0.352
Neg logit contributions
 and-0.406
 under-0.366
 in-0.364
 on-0.352
 he-0.348
TokenV
Token ID53
Feature activation-3.28e-01

Pos logit contributions
 difference+0.103
icious+0.102
 former+0.101
 disl+0.100
reci+0.097
Neg logit contributions
 and-0.128
ed-0.126
 under-0.124
 on-0.119
 he-0.119
Tokenroom
Token ID3823
Feature activation+3.75e-02

Pos logit contributions
icious+0.412
Little+0.400
Everybody+0.372
ograph+0.361
redible+0.360
Neg logit contributions
 under-0.614
 and-0.584
ed-0.574
 he-0.564
 in-0.559
Token,
Token ID11
Feature activation+1.05e-01

Pos logit contributions
 under+0.027
 and+0.025
ed+0.023
 in+0.021
driver+0.020
Neg logit contributions
icious-0.087
Little-0.086
 disl-0.082
Everybody-0.082
redible-0.082
Token v
Token ID410
Feature activation-1.54e-01

Pos logit contributions
 and+0.166
 under+0.161
 in+0.152
driver+0.149
 on+0.147
Neg logit contributions
icious-0.169
Little-0.161
rous-0.156
Ms-0.155
reci-0.155
Tokenroom
Token ID3823
Feature activation+3.72e-02

Pos logit contributions
icious+0.203
Little+0.190
Everybody+0.184
Everyone+0.180
redible+0.180
Neg logit contributions
 under-0.275
 and-0.271
 in-0.252
 he-0.248
ed-0.247
Token!"
Token ID2474
Feature activation-8.30e-02

Pos logit contributions
 under+0.031
 and+0.030
ed+0.028
 in+0.026
driver+0.026
Neg logit contributions
icious-0.082
Little-0.081
Everybody-0.078
 disl-0.077
E-0.077
Token?"
Token ID1701
Feature activation-1.17e-01

Pos logit contributions
icious+0.091
Little+0.088
ograph+0.084
redible+0.081
Ms+0.081
Neg logit contributions
 under-0.102
ed-0.101
 and-0.100
driver-0.097
ously-0.094
Token Sophie
Token ID35331
Feature activation-2.62e-02

Pos logit contributions
icious+0.198
reci+0.184
Little+0.182
 disl+0.175
redible+0.175
Neg logit contributions
 and-0.201
 under-0.175
 in-0.171
ed-0.166
 he-0.163
Token explained
Token ID4893
Feature activation-8.29e-02

Pos logit contributions
icious+0.054
Little+0.051
Everybody+0.051
rous+0.050
gary+0.049
Neg logit contributions
 and-0.034
 under-0.029
 in-0.029
ed-0.027
 he-0.025
Token,
Token ID11
Feature activation+1.01e-01

Pos logit contributions
Little+0.167
icious+0.166
 disl+0.161
Everybody+0.160
Ms+0.159
Neg logit contributions
 under-0.083
ed-0.080
 and-0.077
driver-0.076
 in-0.068
Token "
Token ID366
Feature activation-1.11e-01

Pos logit contributions
 under+0.106
 and+0.103
ed+0.101
driver+0.097
 in+0.093
Neg logit contributions
icious-0.203
Little-0.199
Everybody-0.191
Ms-0.190
 disl-0.189
TokenV
Token ID53
Feature activation-3.28e-01

Pos logit contributions
icious+0.160
 difference+0.158
reci+0.150
Little+0.147
 disl+0.146
Neg logit contributions
 and-0.205
 under-0.186
ed-0.183
 in-0.182
 on-0.178
Tokenelvet
Token ID32667
Feature activation+2.13e-02

Pos logit contributions
icious+0.458
Little+0.446
Everybody+0.417
reci+0.412
Ms+0.411
Neg logit contributions
 and-0.564
 under-0.561
ed-0.535
 in-0.523
driver-0.518
Token is
Token ID318
Feature activation+5.21e-02

Pos logit contributions
 and+0.021
 under+0.021
ed+0.019
 in+0.019
driver+0.018
Neg logit contributions
icious-0.045
Little-0.044
Everybody-0.042
reci-0.042
Ms-0.042
Token a
Token ID257
Feature activation-7.39e-02

Pos logit contributions
 and+0.068
 under+0.067
 in+0.062
ed+0.060
 on+0.059
Neg logit contributions
icious-0.097
Little-0.095
Everybody-0.091
reci-0.091
 disl-0.089
Token special
Token ID2041
Feature activation-2.53e-01

Pos logit contributions
icious+0.130
reci+0.130
Everybody+0.126
Little+0.125
 disl+0.122
Neg logit contributions
 and-0.115
 under-0.106
 in-0.102
 on-0.099
 he-0.091
Token kind
Token ID1611
Feature activation-1.88e-01

Pos logit contributions
icious+0.446
Little+0.430
reci+0.414
Everybody+0.414
 disl+0.406
Neg logit contributions
 and-0.383
 under-0.362
 in-0.339
ed-0.328
 on-0.325
Token noises
Token ID26782
Feature activation+1.75e-02

Pos logit contributions
icious+0.102
Little+0.101
Everybody+0.096
reci+0.095
Ms+0.095
Neg logit contributions
 and-0.057
 under-0.057
ed-0.053
 in-0.051
driver-0.050
Token and
Token ID290
Feature activation+2.27e-01

Pos logit contributions
ed+0.018
 knees+0.016
driver+0.016
ously+0.015
 under+0.015
Neg logit contributions
icious-0.034
Little-0.033
Everyone-0.033
Everybody-0.032
 difference-0.032
Token moves
Token ID6100
Feature activation+6.75e-03

Pos logit contributions
 under+0.342
ed+0.336
 and+0.332
driver+0.317
 in+0.312
Neg logit contributions
icious-0.338
Little-0.334
 disl-0.333
reci-0.320
Everybody-0.319
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
 under+0.007
ed+0.007
driver+0.007
 knees+0.006
 and+0.006
Neg logit contributions
Little-0.013
E-0.012
Everyone-0.012
Everybody-0.012
One-0.012
Token "
Token ID366
Feature activation-1.10e-01

Pos logit contributions
 and+0.021
 under+0.020
ed+0.019
 in+0.019
driver+0.018
Neg logit contributions
icious-0.026
Little-0.025
ograph-0.024
reci-0.024
 disl-0.023
TokenV
Token ID53
Feature activation-3.28e-01

Pos logit contributions
 difference+0.148
 former+0.131
rous+0.129
assion+0.128
 disl+0.126
Neg logit contributions
 and-0.207
 on-0.193
ed-0.188
 in-0.188
driver-0.188
Tokenroom
Token ID3823
Feature activation+3.54e-02

Pos logit contributions
icious+0.449
Little+0.434
Everybody+0.405
E+0.393
Everyone+0.392
Neg logit contributions
 under-0.573
 and-0.559
ed-0.544
driver-0.528
 in-0.526
Token,
Token ID11
Feature activation+1.01e-01

Pos logit contributions
 and+0.038
 under+0.038
ed+0.035
 in+0.033
driver+0.033
Neg logit contributions
icious-0.072
Little-0.071
Everybody-0.067
reci-0.067
Ms-0.067
Token v
Token ID410
Feature activation-1.55e-01

Pos logit contributions
 and+0.160
 under+0.154
 in+0.147
 on+0.143
driver+0.141
Neg logit contributions
icious-0.165
Little-0.158
rous-0.153
reci-0.153
Ms-0.153
Tokenroom
Token ID3823
Feature activation+3.51e-02

Pos logit contributions
icious+0.227
Little+0.223
Everybody+0.208
reci+0.208
Ms+0.206
Neg logit contributions
 and-0.254
 under-0.252
ed-0.241
 in-0.235
driver-0.233
Token!"
Token ID2474
Feature activation-8.31e-02

Pos logit contributions
 and+0.054
 under+0.053
ed+0.050
 in+0.049
driver+0.049
Neg logit contributions
icious-0.055
Little-0.054
Everybody-0.051
reci-0.051
Ms-0.051
Token.
Token ID13
Feature activation+4.51e-02

Pos logit contributions
 difference+0.213
Little+0.203
 wonderful+0.201
reci+0.192
is+0.191
Neg logit contributions
driver-0.129
 knees-0.112
ed-0.111
 under-0.110
icked-0.107
Token He
Token ID679
Feature activation+1.25e-01

Pos logit contributions
 and+0.036
 under+0.034
ed+0.031
 in+0.029
driver+0.029
Neg logit contributions
icious-0.106
Little-0.103
reci-0.100
Everybody-0.099
 disl-0.099
Token approached
Token ID10448
Feature activation-7.09e-03

Pos logit contributions
driver+0.123
 under+0.121
ed+0.109
 and+0.105
fully+0.100
Neg logit contributions
Little-0.250
icious-0.236
reci-0.235
 sym-0.234
One-0.233
Token the
Token ID262
Feature activation-1.06e-01

Pos logit contributions
Little+0.013
Everybody+0.012
Everyone+0.012
rous+0.012
One+0.012
Neg logit contributions
driver-0.008
ed-0.008
 under-0.008
aces-0.007
 hid-0.007
Token hand
Token ID1021
Feature activation+3.68e-02

Pos logit contributions
icious+0.200
Little+0.193
 disl+0.190
reci+0.189
Everybody+0.185
Neg logit contributions
 and-0.143
 under-0.136
 in-0.128
ed-0.122
 on-0.120
Token and
Token ID290
Feature activation+2.53e-01

Pos logit contributions
driver+0.022
 under+0.021
ed+0.021
 hid+0.019
aces+0.019
Neg logit contributions
Little-0.080
icious-0.077
One-0.076
 sym-0.076
assion-0.076
Token he
Token ID339
Feature activation+1.85e-01

Pos logit contributions
driver+0.266
 under+0.258
 and+0.245
aces+0.240
ed+0.232
Neg logit contributions
Little-0.495
Everybody-0.463
icious-0.452
Everyone-0.452
Br-0.451
Token realized
Token ID6939
Feature activation-6.34e-03

Pos logit contributions
 under+0.181
 and+0.175
driver+0.170
ed+0.163
 in+0.153
Neg logit contributions
Little-0.375
icious-0.370
Everybody-0.356
Ms-0.352
Br-0.349
Token it
Token ID340
Feature activation+9.54e-02

Pos logit contributions
Little+0.012
Everybody+0.011
 disl+0.011
Everyone+0.011
icious+0.011
Neg logit contributions
ed-0.007
 under-0.007
driver-0.007
 and-0.006
 knees-0.006
Token was
Token ID373
Feature activation+8.63e-02

Pos logit contributions
driver+0.124
 under+0.123
 knees+0.119
edly+0.114
ed+0.112
Neg logit contributions
Little-0.153
 wonderful-0.143
assion-0.142
icious-0.140
One-0.138
Token made
Token ID925
Feature activation+2.10e-02

Pos logit contributions
ed+0.095
driver+0.095
 and+0.094
 under+0.094
 knees+0.084
Neg logit contributions
icious-0.164
Little-0.162
Ms-0.154
 difference-0.154
 disl-0.153
Token soon
Token ID2582
Feature activation+5.03e-02

Pos logit contributions
icious+0.008
 disl+0.007
Little+0.007
reci+0.007
Everybody+0.007
Neg logit contributions
 and-0.004
 under-0.004
 in-0.004
 on-0.003
ed-0.003
Token!"
Token ID2474
Feature activation-7.72e-02

Pos logit contributions
 under+0.082
 and+0.081
ed+0.078
driver+0.076
 in+0.075
Neg logit contributions
icious-0.073
Little-0.071
Everybody-0.067
reci-0.066
Ms-0.066
Token The
Token ID383
Feature activation-2.66e-01

Pos logit contributions
icious+0.127
Little+0.125
Everybody+0.118
reci+0.117
Ms+0.117
Neg logit contributions
 under-0.111
 and-0.110
ed-0.105
driver-0.103
 in-0.102
Token boat
Token ID8848
Feature activation+8.57e-02

Pos logit contributions
icious+0.579
Little+0.548
 disl+0.541
Ms+0.540
assion+0.532
Neg logit contributions
 and-0.293
 under-0.276
 in-0.262
 on-0.243
ed-0.233
Token smiled
Token ID13541
Feature activation+9.36e-03

Pos logit contributions
driver+0.090
 under+0.087
ed+0.079
aces+0.074
 and+0.073
Neg logit contributions
Little-0.157
Everybody-0.152
icious-0.151
Everyone-0.150
 sym-0.150
Token and
Token ID290
Feature activation+2.30e-01

Pos logit contributions
driver+0.008
aces+0.006
 hid+0.006
 knees+0.006
keeping+0.005
Neg logit contributions
 sym-0.020
 difference-0.020
 wonderful-0.020
One-0.020
Everybody-0.019
Token knew
Token ID2993
Feature activation+1.07e-02

Pos logit contributions
driver+0.328
 under+0.313
 and+0.305
ed+0.282
 in+0.278
Neg logit contributions
Little-0.365
icious-0.353
Br-0.347
Everybody-0.345
 sym-0.343
Token it
Token ID340
Feature activation+7.30e-02

Pos logit contributions
ed+0.011
driver+0.011
 under+0.011
 and+0.011
aces+0.009
Neg logit contributions
Little-0.021
icious-0.021
Everybody-0.020
 difference-0.020
Everyone-0.020
Token had
Token ID550
Feature activation+7.52e-02

Pos logit contributions
driver+0.107
pered+0.097
aces+0.094
 knees+0.090
 under+0.090
Neg logit contributions
 wonderful-0.108
Little-0.103
assion-0.101
One-0.094
 lifetime-0.094
Token a
Token ID257
Feature activation-6.73e-02

Pos logit contributions
 and+0.117
 under+0.114
ed+0.107
 in+0.107
 on+0.102
Neg logit contributions
icious-0.121
Little-0.119
Everybody-0.111
E-0.111
Ms-0.111
Token new
Token ID649
Feature activation-2.26e-01

Pos logit contributions
icious+0.101
Little+0.096
reci+0.094
Ms+0.091
redible+0.091
Neg logit contributions
 and-0.122
 under-0.110
 in-0.107
 on-0.105
 he-0.103
Token was
Token ID373
Feature activation+7.73e-02

Pos logit contributions
driver+0.267
aces+0.245
fully+0.237
 under+0.230
ously+0.226
Neg logit contributions
Little-0.277
One-0.258
E-0.250
ograph-0.249
rous-0.247
Token walking
Token ID6155
Feature activation-6.31e-03

Pos logit contributions
driver+0.096
ously+0.088
ed+0.084
fully+0.083
edly+0.079
Neg logit contributions
rous-0.129
Ms-0.128
icious-0.127
One-0.126
Everybody-0.125
Token in
Token ID287
Feature activation+2.88e-02

Pos logit contributions
One+0.012
The+0.011
Everyone+0.011
Everybody+0.011
Most+0.011
Neg logit contributions
driver-0.007
 wouldn-0.006
 nor-0.006
aces-0.006
ongs-0.006
Token his
Token ID465
Feature activation+4.67e-02

Pos logit contributions
ed+0.042
driver+0.039
 under+0.038
 and+0.036
aces+0.036
Neg logit contributions
Little-0.048
One-0.045
rous-0.043
 difference-0.043
E-0.043
Token garden
Token ID11376
Feature activation+6.00e-02

Pos logit contributions
 and+0.069
 under+0.069
ed+0.065
driver+0.063
 in+0.063
Neg logit contributions
icious-0.075
Little-0.074
Everybody-0.070
Ms-0.069
reci-0.069
Token and
Token ID290
Feature activation+2.40e-01

Pos logit contributions
driver+0.058
ed+0.054
 under+0.049
aces+0.045
 wouldn+0.044
Neg logit contributions
Little-0.122
One-0.121
icious-0.119
Ms-0.116
 difference-0.116
Token he
Token ID339
Feature activation+1.74e-01

Pos logit contributions
driver+0.342
aces+0.312
 and+0.299
 under+0.293
ously+0.292
Neg logit contributions
Little-0.382
 disl-0.354
rous-0.352
Everybody-0.352
Everyone-0.349
Token saw
Token ID2497
Feature activation-8.37e-02

Pos logit contributions
driver+0.258
 under+0.248
 and+0.246
ed+0.235
fully+0.227
Neg logit contributions
Little-0.270
icious-0.254
Everybody-0.249
Br-0.249
One-0.247
Token a
Token ID257
Feature activation-5.51e-02

Pos logit contributions
icious+0.178
Little+0.176
Everybody+0.169
Ms+0.167
 disl+0.167
Neg logit contributions
 and-0.081
 under-0.080
ed-0.073
 in-0.070
driver-0.069
Token pit
Token ID6028
Feature activation+5.30e-02

Pos logit contributions
icious+0.093
Little+0.088
reci+0.087
 disl+0.085
Everybody+0.084
Neg logit contributions
 and-0.083
 under-0.080
 in-0.076
 on-0.073
 he-0.071
Token.
Token ID13
Feature activation+2.98e-02

Pos logit contributions
ed+0.067
aces+0.067
driver+0.067
ously+0.065
fully+0.061
Neg logit contributions
One-0.084
 difference-0.080
Little-0.080
icious-0.077
is-0.076
Token the
Token ID262
Feature activation-1.41e-01

Pos logit contributions
driver+0.344
ously+0.333
ed+0.322
aces+0.321
pered+0.312
Neg logit contributions
Little-0.300
 wonderful-0.282
 beautiful-0.267
The-0.266
Everyone-0.256
Token floor
Token ID4314
Feature activation+9.50e-02

Pos logit contributions
icious+0.255
Little+0.250
Everybody+0.238
reci+0.236
Ms+0.235
Neg logit contributions
 and-0.183
 under-0.181
ed-0.169
 in-0.165
driver-0.162
Token.
Token ID13
Feature activation+8.13e-03

Pos logit contributions
driver+0.106
 under+0.100
ed+0.099
 ear+0.097
 and+0.097
Neg logit contributions
icious-0.152
Little-0.140
redible-0.140
 disl-0.140
qu-0.140
Token They
Token ID1119
Feature activation+6.43e-02

Pos logit contributions
 under+0.010
 and+0.009
ed+0.009
 in+0.009
driver+0.009
Neg logit contributions
icious-0.016
Little-0.015
Everybody-0.015
reci-0.015
Ms-0.015
Token laughed
Token ID13818
Feature activation+1.52e-03

Pos logit contributions
driver+0.076
ed+0.076
 under+0.069
aces+0.068
edly+0.067
Neg logit contributions
Little-0.109
icious-0.104
One-0.102
Each-0.101
Ms-0.100
Token and
Token ID290
Feature activation+2.17e-01

Pos logit contributions
driver+0.001
aces+0.001
ongs+0.001
ed+0.001
ously+0.001
Neg logit contributions
 sym-0.003
icious-0.003
Little-0.003
 wonderful-0.003
Everyone-0.003
Token played
Token ID2826
Feature activation-8.40e-02

Pos logit contributions
driver+0.273
 under+0.243
ed+0.241
 and+0.239
ously+0.216
Neg logit contributions
Little-0.376
icious-0.371
Br-0.369
 sym-0.367
 disl-0.358
Token in
Token ID287
Feature activation+1.67e-03

Pos logit contributions
icious+0.171
Little+0.166
Everybody+0.163
Everyone+0.161
 difference+0.159
Neg logit contributions
ed-0.080
driver-0.079
 under-0.073
 and-0.071
ously-0.065
Token the
Token ID262
Feature activation-1.42e-01

Pos logit contributions
 and+0.002
 under+0.002
 in+0.002
ed+0.002
 on+0.002
Neg logit contributions
icious-0.003
Little-0.003
reci-0.003
Everybody-0.003
Ms-0.003
Token sticky
Token ID23408
Feature activation-1.02e-01

Pos logit contributions
icious+0.239
Little+0.234
Everybody+0.220
Ms+0.219
reci+0.219
Neg logit contributions
 and-0.201
 under-0.200
ed-0.191
driver-0.184
 in-0.184
Token mess
Token ID2085
Feature activation-6.44e-02

Pos logit contributions
icious+0.142
Little+0.139
is+0.132
Ms+0.131
 difference+0.131
Neg logit contributions
driver-0.157
ed-0.156
 under-0.155
 and-0.144
ously-0.144
Token to
Token ID284
Feature activation+8.70e-02

Pos logit contributions
aces+0.138
ongs+0.133
ities+0.128
 throat+0.125
oll+0.125
Neg logit contributions
One-0.162
Everyone-0.156
L-0.152
V-0.151
The-0.151
Token the
Token ID262
Feature activation-1.23e-01

Pos logit contributions
 and+0.099
 under+0.096
 in+0.087
ed+0.083
 on+0.080
Neg logit contributions
icious-0.182
Little-0.174
reci-0.169
 disl-0.167
Everybody-0.167
Token playground
Token ID24817
Feature activation+4.29e-03

Pos logit contributions
icious+0.215
Little+0.210
reci+0.201
Everybody+0.199
 disl+0.198
Neg logit contributions
 and-0.170
 under-0.169
ed-0.155
 in-0.155
driver-0.151
Token and
Token ID290
Feature activation+2.35e-01

Pos logit contributions
ed+0.004
driver+0.004
 wouldn+0.004
ped+0.004
aces+0.003
Neg logit contributions
One-0.008
 difference-0.008
is-0.007
Each-0.007
The-0.007
Token climbed
Token ID19952
Feature activation+1.68e-02

Pos logit contributions
driver+0.300
aces+0.287
ously+0.279
fully+0.279
edly+0.275
Neg logit contributions
Br-0.361
V-0.359
Little-0.356
Ms-0.351
Everybody-0.348
Token and
Token ID290
Feature activation+2.34e-01

Pos logit contributions
driver+0.019
ously+0.018
aces+0.016
 wouldn+0.016
 throat+0.016
Neg logit contributions
One-0.029
Little-0.028
V-0.028
The-0.028
E-0.028
Token slid
Token ID27803
Feature activation+1.66e-02

Pos logit contributions
driver+0.241
 and+0.239
ously+0.226
 under+0.224
ed+0.215
Neg logit contributions
Little-0.435
Everybody-0.428
Br-0.426
Everyone-0.422
One-0.408
Token and
Token ID290
Feature activation+2.34e-01

Pos logit contributions
driver+0.013
 under+0.011
 wouldn+0.011
ed+0.010
ously+0.010
Neg logit contributions
Everybody-0.033
Everyone-0.033
Little-0.033
Br-0.033
One-0.033
Token swung
Token ID28507
Feature activation+3.17e-02

Pos logit contributions
 and+0.246
driver+0.242
 under+0.228
ously+0.226
fully+0.223
Neg logit contributions
Br-0.425
Little-0.418
Everybody-0.412
Everyone-0.406
V-0.402
Token.
Token ID13
Feature activation+2.34e-02

Pos logit contributions
driver+0.032
ed+0.031
 knees+0.030
 under+0.030
 wouldn+0.028
Neg logit contributions
One-0.056
Little-0.056
Everybody-0.056
Everyone-0.055
Br-0.055
Token They
Token ID1119
Feature activation+7.69e-02

Pos logit contributions
 and+0.024
 under+0.024
ed+0.022
 in+0.021
driver+0.021
Neg logit contributions
icious-0.049
Little-0.048
Everybody-0.046
reci-0.046
Ms-0.046
Token,
Token ID11
Feature activation+1.19e-01

Pos logit contributions
Little+0.053
One+0.046
L+0.046
E+0.045
Everyone+0.045
Neg logit contributions
edly-0.044
aces-0.039
pered-0.038
eer-0.038
driver-0.037
Token Sarah
Token ID10490
Feature activation-2.19e-02

Pos logit contributions
ed+0.154
 under+0.152
driver+0.146
 and+0.146
aces+0.137
Neg logit contributions
Little-0.211
icious-0.196
Everybody-0.194
Everyone-0.192
Ms-0.190
Token found
Token ID1043
Feature activation-1.96e-02

Pos logit contributions
Little+0.031
ograph+0.028
Em+0.027
reci+0.027
One+0.027
Neg logit contributions
edly-0.031
 under-0.030
pered-0.029
ously-0.029
fully-0.028
Token a
Token ID257
Feature activation-5.56e-02

Pos logit contributions
Little+0.038
icious+0.038
Ms+0.037
E+0.036
Everybody+0.036
Neg logit contributions
 under-0.021
 and-0.021
ed-0.019
driver-0.019
 in-0.018
Token big
Token ID1263
Feature activation-1.93e-01

Pos logit contributions
icious+0.092
Little+0.090
reci+0.089
 disl+0.086
Everybody+0.086
Neg logit contributions
 and-0.085
 under-0.082
 in-0.078
 on-0.075
driver-0.073
Token bowl
Token ID9396
Feature activation+4.34e-02

Pos logit contributions
 difference+0.296
Little+0.287
icious+0.286
rous+0.272
ograph+0.271
Neg logit contributions
ed-0.293
 under-0.265
edly-0.258
aces-0.258
fully-0.257
Token and
Token ID290
Feature activation+2.39e-01

Pos logit contributions
 knees+0.072
 teeth+0.068
ked+0.067
iously+0.066
ed+0.065
Neg logit contributions
is-0.047
 batch-0.045
One-0.045
icious-0.044
Little-0.044
Token spoon
Token ID24556
Feature activation+1.36e-01

Pos logit contributions
 under+0.237
 and+0.228
ed+0.218
driver+0.216
 in+0.199
Neg logit contributions
Little-0.491
icious-0.474
E-0.465
Everybody-0.460
Ms-0.459
Token.
Token ID13
Feature activation+3.24e-02

Pos logit contributions
 ear+0.130
 under+0.128
 explorers+0.125
 knees+0.124
driver+0.116
Neg logit contributions
is-0.222
icious-0.218
qu-0.214
Little-0.213
redible-0.211
Token Then
Token ID3244
Feature activation-3.71e-02

Pos logit contributions
 under+0.036
 and+0.033
ed+0.033
 in+0.031
driver+0.030
Neg logit contributions
Little-0.063
icious-0.062
Everybody-0.060
Ms-0.059
E-0.059
Token,
Token ID11
Feature activation+1.16e-01

Pos logit contributions
Little+0.072
Everyone+0.065
Everybody+0.065
 wonderful+0.064
One+0.064
Neg logit contributions
eer-0.040
aces-0.039
driver-0.038
 under-0.037
ed-0.035
Token<|endoftext|>
Token ID50256
Feature activation-3.02e-01

Pos logit contributions
 and+0.032
 under+0.029
 he+0.028
 in+0.027
ed+0.027
Neg logit contributions
icious-0.034
redible-0.032
ograph-0.031
gary-0.030
reci-0.030
TokenOne
Token ID3198
Feature activation-2.10e-01

Pos logit contributions
icious+0.343
Little+0.315
 disl+0.303
reci+0.295
Everybody+0.292
Neg logit contributions
 and-0.619
 under-0.611
ed-0.588
 in-0.582
 on-0.566
Token day
Token ID1110
Feature activation-1.00e-02

Pos logit contributions
icious+0.357
Little+0.346
Everybody+0.329
reci+0.328
Ms+0.328
Neg logit contributions
 and-0.302
 under-0.298
ed-0.278
 in-0.276
driver-0.264
Token,
Token ID11
Feature activation+1.05e-01

Pos logit contributions
Little+0.023
icious+0.022
Everybody+0.022
Everyone+0.022
One+0.022
Neg logit contributions
 under-0.007
 and-0.007
driver-0.007
ed-0.006
aces-0.005
Token there
Token ID612
Feature activation-6.15e-02

Pos logit contributions
 and+0.102
 under+0.101
driver+0.097
ed+0.095
 in+0.085
Neg logit contributions
Little-0.216
icious-0.211
E-0.205
Everybody-0.205
Br-0.203
Token was
Token ID373
Feature activation+6.12e-02

Pos logit contributions
Little+0.106
E+0.098
Everyone+0.098
Everybody+0.097
 difference+0.096
Neg logit contributions
driver-0.084
 under-0.077
aces-0.075
ed-0.071
edly-0.070
Token a
Token ID257
Feature activation-6.81e-02

Pos logit contributions
driver+0.067
ed+0.066
 under+0.065
 and+0.063
 knees+0.059
Neg logit contributions
Little-0.120
icious-0.117
E-0.114
One-0.114
Br-0.112
Token little
Token ID1310
Feature activation-1.48e-01

Pos logit contributions
icious+0.110
Little+0.104
reci+0.103
 disl+0.101
Everybody+0.101
Neg logit contributions
 and-0.106
 under-0.103
 in-0.096
 on-0.091
ed-0.091
Token boy
Token ID2933
Feature activation+2.51e-02

Pos logit contributions
Little+0.199
 difference+0.185
E+0.175
icious+0.173
rous+0.172
Neg logit contributions
ed-0.260
aces-0.251
 under-0.245
driver-0.243
ously-0.235
Token who
Token ID508
Feature activation-7.52e-03

Pos logit contributions
driver+0.043
ed+0.042
aces+0.042
edly+0.039
 knees+0.039
Neg logit contributions
Little-0.033
One-0.031
L-0.030
ograph-0.029
rous-0.028
Token was
Token ID373
Feature activation+6.07e-02

Pos logit contributions
Little+0.013
 disl+0.013
One+0.013
E+0.012
 wonderful+0.012
Neg logit contributions
driver-0.010
 under-0.009
fully-0.009
 knees-0.009
aces-0.009
Tokenâ
Token ID22940
Feature activation-8.91e-02

Pos logit contributions
 difference+0.466
 former+0.455
icious+0.406
 lifetime+0.406
 fullest+0.402
Neg logit contributions
 and-0.328
 on-0.254
 in-0.253
 he-0.253
 He-0.237
Token
Token ID26391
Feature activation-1.47e-01

Pos logit contributions
icious+0.111
Little+0.108
Everybody+0.104
E+0.104
Br+0.102
Neg logit contributions
 and-0.167
 under-0.162
ed-0.158
 in-0.154
driver-0.150
Token
Token ID129
Feature activation-7.55e-02

Pos logit contributions
icious+0.232
Little+0.219
reci+0.215
Everybody+0.210
rous+0.210
Neg logit contributions
 and-0.241
 under-0.229
ed-0.220
 in-0.214
driver-0.209
Token
Token ID241
Feature activation-1.55e-01

Pos logit contributions
Little+0.117
icious+0.117
Everybody+0.111
gary+0.111
Br+0.110
Neg logit contributions
 and-0.127
 under-0.118
 in-0.113
ed-0.107
 knees-0.103
TokenI
Token ID40
Feature activation-6.47e-02

Pos logit contributions
icious+0.226
 difference+0.203
Little+0.202
reci+0.201
redible+0.198
Neg logit contributions
 and-0.284
 under-0.263
ed-0.259
 in-0.255
 he-0.251
Token am
Token ID716
Feature activation+1.81e-02

Pos logit contributions
icious+0.097
Little+0.094
Everybody+0.088
Ms+0.088
reci+0.087
Neg logit contributions
 and-0.106
 under-0.104
ed-0.099
 in-0.097
driver-0.095
Token taking
Token ID2263
Feature activation-4.85e-02

Pos logit contributions
 under+0.019
 and+0.018
ed+0.018
driver+0.018
 in+0.016
Neg logit contributions
icious-0.036
Little-0.036
Everybody-0.034
 disl-0.034
Ms-0.034
Token a
Token ID257
Feature activation-6.19e-02

Pos logit contributions
icious+0.103
Little+0.095
redible+0.092
 difference+0.092
reci+0.092
Neg logit contributions
 and-0.060
 in-0.054
 under-0.054
ed-0.053
 on-0.052
Token great
Token ID1049
Feature activation-1.83e-01

Pos logit contributions
icious+0.112
Little+0.101
 disl+0.101
Ms+0.098
reci+0.098
Neg logit contributions
 and-0.096
 under-0.091
 in-0.090
 on-0.087
 out-0.085
Token bath
Token ID7837
Feature activation-8.75e-02

Pos logit contributions
icious+0.264
Little+0.260
Everybody+0.244
reci+0.242
 difference+0.240
Neg logit contributions
 under-0.301
 and-0.298
ed-0.289
driver-0.279
 in-0.277
Token,
Token ID11
Feature activation+1.11e-01

Pos logit contributions
icious+0.152
Little+0.152
 sym+0.143
Everybody+0.143
 difference+0.143
Neg logit contributions
 under-0.092
ed-0.085
 knees-0.084
driver-0.084
ously-0.081
Token I
Token ID314
Feature activation+2.24e-02

Pos logit contributions
 under+0.195
 and+0.193
ed+0.189
driver+0.176
 in+0.175
Neg logit contributions
icious-0.275
Little-0.272
Everybody-0.257
reci-0.254
Ms-0.253
Token can
Token ID460
Feature activation+1.04e-01

Pos logit contributions
 under+0.032
 and+0.032
ed+0.031
driver+0.030
 in+0.029
Neg logit contributions
icious-0.036
Little-0.036
reci-0.034
Everybody-0.034
 disl-0.033
Token give
Token ID1577
Feature activation-4.59e-03

Pos logit contributions
 and+0.138
 under+0.133
 in+0.124
ed+0.121
driver+0.120
Neg logit contributions
icious-0.189
Little-0.184
Ms-0.176
reci-0.176
 disl-0.176
Token you
Token ID345
Feature activation+3.76e-02

Pos logit contributions
icious+0.007
Little+0.007
 sym+0.007
reci+0.007
 disl+0.007
Neg logit contributions
 and-0.007
 under-0.007
 in-0.007
ed-0.007
 on-0.007
Token some
Token ID617
Feature activation-1.15e-02

Pos logit contributions
 and+0.035
 in+0.031
 under+0.028
 on+0.027
 out+0.025
Neg logit contributions
icious-0.092
Little-0.088
reci-0.086
 sym-0.085
ograph-0.084
Token fuel
Token ID5252
Feature activation+8.58e-02

Pos logit contributions
icious+0.021
Little+0.021
 disl+0.020
Ms+0.020
Everybody+0.020
Neg logit contributions
 and-0.015
 under-0.015
 in-0.014
ed-0.013
 on-0.013
Token,
Token ID11
Feature activation+1.45e-01

Pos logit contributions
 under+0.083
ed+0.081
 knees+0.076
driver+0.075
ously+0.074
Neg logit contributions
icious-0.159
Little-0.158
 difference-0.156
 disl-0.155
Ms-0.153
Token but
Token ID475
Feature activation+1.38e-01

Pos logit contributions
ed+0.121
 under+0.107
driver+0.093
 and+0.091
 knees+0.084
Neg logit contributions
icious-0.332
Little-0.329
Everybody-0.312
 difference-0.311
Ms-0.309
Token you
Token ID345
Feature activation+3.23e-02

Pos logit contributions
ed+0.203
 under+0.192
 and+0.186
driver+0.185
ously+0.175
Neg logit contributions
Little-0.218
icious-0.217
 difference-0.204
Everybody-0.202
One-0.201
Token must
Token ID1276
Feature activation+1.23e-01

Pos logit contributions
 under+0.047
ed+0.047
 and+0.044
driver+0.043
 knees+0.042
Neg logit contributions
icious-0.051
Little-0.050
reci-0.049
Everybody-0.048
 difference-0.047
Token promise
Token ID6991
Feature activation-3.46e-02

Pos logit contributions
 and+0.163
 under+0.160
ed+0.149
 in+0.147
driver+0.144
Neg logit contributions
icious-0.221
Little-0.217
Everybody-0.206
reci-0.205
Ms-0.204
Token.
Token ID13
Feature activation+2.98e-02

Pos logit contributions
 under+0.114
driver+0.112
ed+0.111
 and+0.103
 knees+0.100
Neg logit contributions
Little-0.185
icious-0.179
 difference-0.173
Ms-0.171
 disl-0.170
Token Let
Token ID3914
Feature activation-4.14e-02

Pos logit contributions
 and+0.037
 under+0.035
ed+0.033
 in+0.033
driver+0.032
Neg logit contributions
icious-0.056
Little-0.055
rous-0.053
reci-0.052
Everybody-0.052
Token me
Token ID502
Feature activation-9.67e-02

Pos logit contributions
icious+0.109
 former+0.108
 disl+0.106
 sym+0.103
 lifetime+0.102
Neg logit contributions
 and-0.036
 in-0.026
ed-0.025
 out-0.023
 under-0.023
Token add
Token ID751
Feature activation+4.76e-04

Pos logit contributions
icious+0.153
Little+0.150
Everybody+0.142
ograph+0.138
reci+0.138
Neg logit contributions
 under-0.144
 and-0.141
ed-0.139
driver-0.133
 in-0.129
Token it
Token ID340
Feature activation+8.09e-02

Pos logit contributions
 and+0.001
 under+0.000
 in+0.000
 on+0.000
ed+0.000
Neg logit contributions
icious-0.001
Little-0.001
 disl-0.001
 former-0.001
 difference-0.001
Token to
Token ID284
Feature activation+9.07e-02

Pos logit contributions
 under+0.041
 and+0.038
ed+0.036
driver+0.035
 knees+0.029
Neg logit contributions
icious-0.198
Little-0.196
Ms-0.190
Everybody-0.190
One-0.189
Token your
Token ID534
Feature activation-1.10e-01

Pos logit contributions
 and+0.108
 in+0.089
 as+0.082
 out+0.079
 under+0.078
Neg logit contributions
reci-0.203
icious-0.197
 lifetime-0.196
Br-0.195
oral-0.190
Token tablet
Token ID14147
Feature activation+3.76e-02

Pos logit contributions
icious+0.194
Little+0.191
Everybody+0.187
E+0.187
Br+0.184
Neg logit contributions
 and-0.173
 under-0.163
 in-0.153
 on-0.146
 knees-0.138
Token."
Token ID526
Feature activation-8.54e-02

Pos logit contributions
 under+0.043
 and+0.042
ed+0.041
driver+0.040
 in+0.038
Neg logit contributions
icious-0.070
Little-0.068
Ms-0.065
Everybody-0.065
reci-0.064
Token␣
Token ID220
Feature activation-1.90e-01

Pos logit contributions
icious+0.146
Little+0.138
rous+0.133
ograph+0.131
 disl+0.131
Neg logit contributions
 and-0.141
 under-0.130
ed-0.123
 in-0.122
 on-0.116
Token↵
Token ID198
Feature activation-2.20e-01

Pos logit contributions
icious+0.390
 former+0.379
 lifetime+0.379
 disl+0.378
rous+0.378
Neg logit contributions
 and-0.283
 under-0.235
 in-0.228
ed-0.215
 out-0.212
Token nice
Token ID3621
Feature activation-2.25e-01

Pos logit contributions
 under+0.080
 and+0.080
ed+0.078
driver+0.074
 in+0.073
Neg logit contributions
icious-0.078
Little-0.076
Everybody-0.072
 disl-0.071
Ms-0.070
Token.
Token ID13
Feature activation+2.09e-02

Pos logit contributions
 difference+0.374
 disl+0.338
icious+0.329
 future+0.323
Little+0.315
Neg logit contributions
 under-0.258
 knees-0.257
edly-0.252
driver-0.243
 barely-0.242
Token You
Token ID921
Feature activation-3.14e-02

Pos logit contributions
 under+0.028
 and+0.026
ed+0.025
 in+0.025
 on+0.024
Neg logit contributions
Little-0.037
 disl-0.037
icious-0.036
E-0.035
Everybody-0.035
Token owe
Token ID19059
Feature activation-1.04e-01

Pos logit contributions
Little+0.059
icious+0.058
reci+0.057
Everybody+0.057
 disl+0.057
Neg logit contributions
 under-0.035
 and-0.033
ed-0.032
driver-0.031
 knees-0.030
Token Tommy
Token ID19919
Feature activation-5.59e-02

Pos logit contributions
icious+0.199
Little+0.193
reci+0.188
Everybody+0.185
 sym+0.184
Neg logit contributions
 and-0.129
 under-0.122
 in-0.114
ed-0.113
driver-0.108
Token an
Token ID281
Feature activation-9.40e-02

Pos logit contributions
icious+0.111
Little+0.110
Everybody+0.105
 disl+0.105
Ms+0.105
Neg logit contributions
 under-0.060
 and-0.058
ed-0.055
driver-0.053
 in-0.052
Token apology
Token ID17029
Feature activation-5.13e-02

Pos logit contributions
icious+0.128
Little+0.126
Ms+0.120
Everybody+0.119
reci+0.116
Neg logit contributions
 and-0.168
 under-0.164
 in-0.154
driver-0.152
ed-0.152
Token and
Token ID290
Feature activation+2.29e-01

Pos logit contributions
 disl+0.089
Little+0.088
 difference+0.086
icious+0.086
One+0.086
Neg logit contributions
 under-0.055
 knees-0.053
ed-0.052
edly-0.047
driver-0.045
Token a
Token ID257
Feature activation-6.75e-02

Pos logit contributions
 under+0.284
ed+0.269
 and+0.264
driver+0.251
 in+0.237
Neg logit contributions
Little-0.406
 disl-0.405
icious-0.390
Everybody-0.380
Everyone-0.378
Token new
Token ID649
Feature activation-2.25e-01

Pos logit contributions
icious+0.116
Little+0.111
reci+0.109
redible+0.106
Ms+0.104
Neg logit contributions
 and-0.106
 under-0.096
 in-0.093
 on-0.092
 he-0.091
Token car
Token ID1097
Feature activation+2.55e-02

Pos logit contributions
icious+0.478
Little+0.460
Ms+0.450
reci+0.446
redible+0.434
Neg logit contributions
 and-0.287
 under-0.250
 in-0.238
driver-0.236
 on-0.230
Token the
Token ID262
Feature activation-1.37e-01

Pos logit contributions
Little+0.097
 disl+0.095
rous+0.094
 difference+0.092
One+0.091
Neg logit contributions
ed-0.061
driver-0.057
 under-0.054
aces-0.050
 and-0.049
Token sun
Token ID4252
Feature activation-1.95e-02

Pos logit contributions
icious+0.212
Little+0.208
Everybody+0.196
reci+0.195
Ms+0.193
Neg logit contributions
 and-0.210
 under-0.210
ed-0.202
driver-0.195
 in-0.193
Token.
Token ID13
Feature activation+1.21e-02

Pos logit contributions
 difference+0.032
Little+0.032
icious+0.032
 sym+0.031
 wonderful+0.030
Neg logit contributions
driver-0.025
pered-0.022
ously-0.022
ed-0.021
edly-0.021
Token It
Token ID632
Feature activation-6.81e-03

Pos logit contributions
 under+0.017
 and+0.017
ed+0.016
 in+0.015
driver+0.015
Neg logit contributions
icious-0.021
Little-0.020
Everybody-0.019
reci-0.019
Ms-0.019
Token was
Token ID373
Feature activation+5.30e-02

Pos logit contributions
Little+0.014
icious+0.014
Everybody+0.013
Br+0.013
reci+0.013
Neg logit contributions
 under-0.007
 and-0.006
driver-0.006
ed-0.006
 in-0.006
Token warm
Token ID5814
Feature activation-1.75e-01

Pos logit contributions
driver+0.091
ed+0.089
ously+0.080
 knees+0.077
aces+0.076
Neg logit contributions
icious-0.072
Little-0.071
 wonderful-0.070
 sym-0.068
Br-0.068
Token and
Token ID290
Feature activation+2.21e-01

Pos logit contributions
Little+0.282
 beautiful+0.275
 lovely+0.265
 sym+0.260
 difference+0.257
Neg logit contributions
driver-0.248
pered-0.210
ously-0.202
icked-0.202
 elbow-0.200
Token bright
Token ID6016
Feature activation-1.92e-01

Pos logit contributions
 under+0.364
 and+0.363
ed+0.360
driver+0.358
 in+0.335
Neg logit contributions
Little-0.308
icious-0.301
 disl-0.284
reci-0.278
Br-0.278
Token.
Token ID13
Feature activation+1.07e-02

Pos logit contributions
 difference+0.217
 beautiful+0.217
 sym+0.213
 wonderful+0.212
 lovely+0.209
Neg logit contributions
driver-0.357
 wouldn-0.311
 knees-0.307
ously-0.301
 hid-0.296
Token Ben
Token ID3932
Feature activation-5.27e-02

Pos logit contributions
 and+0.017
 under+0.016
ed+0.016
 in+0.015
 he+0.015
Neg logit contributions
icious-0.017
Little-0.016
ograph-0.016
 disl-0.015
reci-0.015
Token did
Token ID750
Feature activation+4.88e-02

Pos logit contributions
Little+0.105
icious+0.098
One+0.098
 disl+0.097
E+0.097
Neg logit contributions
 under-0.054
driver-0.053
ed-0.050
 and-0.045
 on-0.045
Token a
Token ID257
Feature activation+1.40e-02

Pos logit contributions
icious+0.085
Little+0.083
Everybody+0.080
reci+0.079
Ms+0.079
Neg logit contributions
 and-0.039
 under-0.038
ed-0.035
 in-0.034
driver-0.032
Token picture
Token ID4286
Feature activation+1.60e-02

Pos logit contributions
 and+0.024
 under+0.022
 in+0.022
 on+0.021
 he+0.020
Neg logit contributions
icious-0.022
reci-0.021
Little-0.021
Everybody-0.020
 disl-0.020
Token of
Token ID286
Feature activation+1.16e-01

Pos logit contributions
 under+0.015
 knees+0.014
ed+0.014
driver+0.014
 and+0.013
Neg logit contributions
icious-0.032
Little-0.031
Everybody-0.030
Everyone-0.030
Ms-0.030
Token Max
Token ID5436
Feature activation+2.85e-02

Pos logit contributions
 and+0.116
 under+0.108
 in+0.100
 on+0.092
 he+0.092
Neg logit contributions
icious-0.253
Little-0.245
reci-0.243
Everybody-0.236
Br-0.235
Token and
Token ID290
Feature activation+2.92e-01

Pos logit contributions
 under+0.024
 and+0.024
ed+0.022
driver+0.022
 in+0.020
Neg logit contributions
icious-0.062
Little-0.061
Ms-0.058
reci-0.058
Everybody-0.058
Token the
Token ID262
Feature activation-6.23e-02

Pos logit contributions
 under+0.334
ed+0.320
driver+0.313
 and+0.307
 in+0.286
Neg logit contributions
Little-0.532
 disl-0.514
icious-0.504
Everybody-0.503
Everyone-0.499
Token little
Token ID1310
Feature activation-9.21e-02

Pos logit contributions
icious+0.116
Little+0.111
reci+0.109
Everybody+0.107
Ms+0.107
Neg logit contributions
 and-0.083
 under-0.080
 in-0.075
ed-0.072
 on-0.071
Token girl
Token ID2576
Feature activation+1.04e-01

Pos logit contributions
Little+0.139
icious+0.138
Everybody+0.130
 difference+0.130
reci+0.127
Neg logit contributions
 under-0.141
 and-0.138
ed-0.134
driver-0.131
 in-0.128
Token to
Token ID284
Feature activation+1.40e-01

Pos logit contributions
 and+0.103
 under+0.100
ed+0.091
 in+0.090
driver+0.084
Neg logit contributions
icious-0.225
Little-0.221
Everybody-0.211
Ms-0.210
reci-0.210
Token remember
Token ID3505
Feature activation+6.18e-02

Pos logit contributions
 and+0.208
 under+0.195
 in+0.191
 on+0.176
ed+0.175
Neg logit contributions
icious-0.242
Little-0.232
Br-0.223
Ms-0.222
reci-0.222
Token their
Token ID511
Feature activation+4.17e-02

Pos logit contributions
ed+0.068
 under+0.067
 and+0.064
driver+0.064
 in+0.057
Neg logit contributions
Little-0.119
icious-0.118
 disl-0.112
Ms-0.112
Everybody-0.112
TokenThank
Token ID10449
Feature activation-2.23e-01

Pos logit contributions
 difference+0.100
icious+0.100
 former+0.099
rous+0.096
Little+0.093
Neg logit contributions
 and-0.126
ed-0.115
 in-0.115
 he-0.114
 under-0.112
Token you
Token ID345
Feature activation-8.68e-03

Pos logit contributions
icious+0.329
 former+0.316
Little+0.303
 sym+0.301
 disl+0.300
Neg logit contributions
 and-0.386
 under-0.379
 in-0.373
ed-0.361
 on-0.354
Token,
Token ID11
Feature activation+1.07e-01

Pos logit contributions
icious+0.020
Little+0.020
Ms+0.019
Everybody+0.019
 difference+0.019
Neg logit contributions
 under-0.006
ed-0.006
 and-0.006
driver-0.005
 in-0.005
Token sir
Token ID15967
Feature activation-1.67e-02

Pos logit contributions
 and+0.144
 under+0.142
ed+0.134
 in+0.130
driver+0.128
Neg logit contributions
icious-0.190
Little-0.186
Everybody-0.177
reci-0.176
Ms-0.175
Token.
Token ID13
Feature activation+1.95e-02

Pos logit contributions
icious+0.034
Little+0.033
Everybody+0.031
reci+0.031
Ms+0.031
Neg logit contributions
 and-0.019
 under-0.018
ed-0.017
 in-0.017
driver-0.016
Token You
Token ID921
Feature activation-3.35e-02

Pos logit contributions
 and+0.027
 under+0.026
ed+0.025
 in+0.025
driver+0.024
Neg logit contributions
icious-0.034
Little-0.033
rous-0.032
Everybody-0.031
redible-0.031
Token are
Token ID389
Feature activation+6.24e-02

Pos logit contributions
icious+0.063
Little+0.062
Everybody+0.059
Ms+0.059
 disl+0.058
Neg logit contributions
 and-0.042
 under-0.041
ed-0.038
 in-0.038
driver-0.036
Token wise
Token ID10787
Feature activation-1.92e-01

Pos logit contributions
 and+0.061
 under+0.061
 in+0.057
 on+0.052
ed+0.050
Neg logit contributions
icious-0.136
Little-0.135
reci-0.132
Br-0.130
Everybody-0.129
Token,"
Token ID553
Feature activation-7.55e-02

Pos logit contributions
 difference+0.286
Little+0.281
E+0.277
icious+0.273
One+0.272
Neg logit contributions
 under-0.263
ed-0.256
driver-0.249
 wouldn-0.240
 knees-0.235
Token Tom
Token ID4186
Feature activation-6.54e-02

Pos logit contributions
icious+0.172
Little+0.166
Everybody+0.160
reci+0.159
ograph+0.159
Neg logit contributions
 and-0.071
 under-0.067
 in-0.059
ed-0.058
 he-0.057
Token says
Token ID1139
Feature activation-7.01e-02

Pos logit contributions
icious+0.087
Little+0.085
reci+0.080
Everybody+0.079
Ms+0.079
Neg logit contributions
 under-0.114
 and-0.110
ed-0.109
driver-0.106
 in-0.105
Token in
Token ID287
Feature activation+3.83e-02

Pos logit contributions
 and+0.065
driver+0.064
ed+0.059
 under+0.054
ously+0.050
Neg logit contributions
Little-0.230
icious-0.222
rous-0.222
Ms-0.221
reci-0.220
Token the
Token ID262
Feature activation-1.03e-01

Pos logit contributions
 and+0.044
 under+0.043
 in+0.039
ed+0.038
driver+0.037
Neg logit contributions
icious-0.078
Little-0.075
reci-0.073
Everybody-0.072
 disl-0.072
Token toy
Token ID13373
Feature activation+1.29e-02

Pos logit contributions
icious+0.191
Little+0.186
Everybody+0.179
reci+0.177
 disl+0.177
Neg logit contributions
 and-0.134
 under-0.133
ed-0.122
 in-0.122
driver-0.115
Token box
Token ID3091
Feature activation+1.30e-01

Pos logit contributions
 under+0.017
 and+0.016
ed+0.016
driver+0.016
 in+0.015
Neg logit contributions
Little-0.021
icious-0.021
Ms-0.020
ograph-0.020
 difference-0.020
Token.
Token ID13
Feature activation+4.13e-02

Pos logit contributions
 under+0.123
 and+0.121
driver+0.117
ed+0.117
 in+0.106
Neg logit contributions
icious-0.261
Little-0.256
Everybody-0.245
Ms-0.243
 difference-0.241
Token Finally
Token ID9461
Feature activation-7.75e-02

Pos logit contributions
 and+0.045
 under+0.041
 in+0.037
 He+0.037
ed+0.036
Neg logit contributions
icious-0.092
rous-0.088
Little-0.088
reci-0.087
ograph-0.087
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
Little+0.146
Everybody+0.137
Everyone+0.137
 wonderful+0.133
icious+0.131
Neg logit contributions
driver-0.083
aces-0.082
ed-0.078
eer-0.075
edly-0.073
Token they
Token ID484
Feature activation+1.65e-01

Pos logit contributions
driver+0.143
ed+0.139
 and+0.136
aces+0.135
 under+0.133
Neg logit contributions
Little-0.240
icious-0.230
Everybody-0.226
Br-0.225
Ms-0.224
Token found
Token ID1043
Feature activation-9.45e-03

Pos logit contributions
driver+0.218
aces+0.217
 knees+0.212
edly+0.202
fully+0.197
Neg logit contributions
Little-0.251
Happy-0.235
Everybody-0.234
Everyone-0.234
prise-0.233
Token it
Token ID340
Feature activation+9.35e-02

Pos logit contributions
Little+0.017
icious+0.016
E+0.016
Ms+0.016
Everybody+0.016
Neg logit contributions
 and-0.011
 under-0.011
driver-0.011
ed-0.010
 knees-0.010
Token in
Token ID287
Feature activation+3.36e-02

Pos logit contributions
driver+0.072
 knees+0.061
 ear+0.059
 throat+0.054
aces+0.053
Neg logit contributions
Little-0.195
icious-0.194
One-0.187
redible-0.186
 wonderful-0.186
Token knew
Token ID2993
Feature activation+1.67e-02

Pos logit contributions
 under+0.069
driver+0.065
ed+0.064
 and+0.062
 in+0.058
Neg logit contributions
Little-0.126
icious-0.125
Everybody-0.118
 disl-0.117
reci-0.116
Token that
Token ID326
Feature activation-7.20e-02

Pos logit contributions
 under+0.019
ed+0.018
driver+0.018
 and+0.017
 knees+0.016
Neg logit contributions
Little-0.032
icious-0.031
Everybody-0.030
rous-0.029
Everyone-0.029
Token her
Token ID607
Feature activation+3.32e-02

Pos logit contributions
Little+0.148
icious+0.145
 difference+0.140
Ms+0.137
rous+0.137
Neg logit contributions
ed-0.071
 under-0.071
driver-0.070
 and-0.066
 in-0.059
Token story
Token ID1621
Feature activation+2.63e-02

Pos logit contributions
 and+0.048
 under+0.045
 in+0.042
ed+0.041
 on+0.040
Neg logit contributions
icious-0.059
Little-0.057
Everybody-0.056
E-0.055
Ms-0.055
Token was
Token ID373
Feature activation+6.90e-02

Pos logit contributions
 under+0.027
driver+0.025
 knees+0.025
ed+0.024
 and+0.024
Neg logit contributions
Little-0.052
icious-0.051
Everybody-0.049
reci-0.049
Everyone-0.049
Token special
Token ID2041
Feature activation-2.42e-01

Pos logit contributions
 and+0.088
 under+0.087
ed+0.080
 in+0.080
driver+0.076
Neg logit contributions
icious-0.128
Little-0.125
Everybody-0.119
reci-0.119
E-0.118
Token,
Token ID11
Feature activation+1.10e-01

Pos logit contributions
 difference+0.513
Little+0.493
 wonderful+0.491
icious+0.476
 rewarding+0.468
Neg logit contributions
driver-0.218
ed-0.209
ked-0.185
 under-0.181
 knees-0.181
Token just
Token ID655
Feature activation+9.80e-02

Pos logit contributions
ed+0.081
driver+0.078
 under+0.071
 knees+0.061
 hid+0.056
Neg logit contributions
Little-0.260
icious-0.255
 difference-0.247
 disl-0.243
 wonderful-0.242
Token like
Token ID588
Feature activation-8.46e-03

Pos logit contributions
driver+0.092
 under+0.088
ed+0.088
 and+0.085
 knees+0.074
Neg logit contributions
icious-0.204
Little-0.201
 disl-0.197
 difference-0.195
rous-0.193
Token the
Token ID262
Feature activation-1.26e-01

Pos logit contributions
icious+0.016
Little+0.016
Everybody+0.015
 disl+0.015
Ms+0.015
Neg logit contributions
 under-0.009
 and-0.009
ed-0.009
driver-0.009
 in-0.008
Token por
Token ID16964
Feature activation+4.23e-02

Pos logit contributions
icious+0.214
Little+0.204
E+0.201
ograph+0.201
Everybody+0.199
Neg logit contributions
 and-0.193
 under-0.178
 in-0.171
 on-0.163
ed-0.160
Token the
Token ID262
Feature activation-1.26e-01

Pos logit contributions
 and+0.019
 under+0.017
 in+0.016
 on+0.014
 he+0.013
Neg logit contributions
icious-0.050
reci-0.048
assion-0.047
ograph-0.047
Little-0.047
Token pond
Token ID16723
Feature activation-1.09e-02

Pos logit contributions
icious+0.223
Little+0.218
Everybody+0.208
reci+0.207
 difference+0.205
Neg logit contributions
 and-0.165
 under-0.164
ed-0.157
driver-0.151
 in-0.149
Token and
Token ID290
Feature activation+2.33e-01

Pos logit contributions
 difference+0.021
icious+0.020
One+0.020
Little+0.019
Ms+0.019
Neg logit contributions
driver-0.011
ed-0.010
aces-0.008
 knees-0.008
 under-0.008
Token disappears
Token ID27934
Feature activation-5.01e-02

Pos logit contributions
driver+0.320
ed+0.317
fully+0.304
 under+0.302
 and+0.301
Neg logit contributions
 disl-0.353
 difference-0.343
icious-0.337
Little-0.335
reci-0.334
Token.
Token ID13
Feature activation+2.15e-02

Pos logit contributions
Little+0.085
 sym+0.084
icious+0.084
rous+0.083
 difference+0.082
Neg logit contributions
driver-0.060
ed-0.049
 he-0.048
 knees-0.047
aces-0.046
Token\n
Token ID198
Feature activation-2.32e-01

Pos logit contributions
 and+0.030
 He+0.028
 he+0.026
 under+0.026
 in+0.025
Neg logit contributions
icious-0.045
rous-0.041
 lifetime-0.041
assion-0.041
ograph-0.040
Token\n
Token ID198
Feature activation-2.32e-01

Pos logit contributions
icious+0.607
 former+0.572
ograph+0.565
 lifetime+0.554
redible+0.542
Neg logit contributions
 and-0.301
 under-0.231
 in-0.225
 He-0.219
 he-0.216
TokenL
Token ID43
Feature activation-2.03e-01

Pos logit contributions
 difference+0.448
 former+0.416
 sym+0.386
rous+0.382
 impact+0.377
Neg logit contributions
 and-0.418
 in-0.380
 on-0.370
 under-0.363
 he-0.355
Tokenily
Token ID813
Feature activation-1.90e-02

Pos logit contributions
icious+0.369
Little+0.348
Everybody+0.341
E+0.338
Everyone+0.337
Neg logit contributions
 under-0.246
 and-0.239
 knees-0.238
ed-0.236
 he-0.234
Token and
Token ID290
Feature activation+2.31e-01

Pos logit contributions
Little+0.041
 difference+0.040
reci+0.039
V+0.039
Girl+0.039
Neg logit contributions
fully-0.014
driver-0.013
ously-0.010
edly-0.010
ed-0.010
Token Tom
Token ID4186
Feature activation-6.18e-02

Pos logit contributions
 under+0.243
 and+0.239
ed+0.236
driver+0.224
 in+0.217
Neg logit contributions
icious-0.463
Little-0.453
Ms-0.436
Everybody-0.432
 disl-0.429
Token even
Token ID772
Feature activation+6.86e-02

Pos logit contributions
driver+0.013
ed+0.010
aces+0.009
 knees+0.009
 under+0.009
Neg logit contributions
 sym-0.038
Little-0.038
Everybody-0.037
 wonderful-0.037
reci-0.037
Token more
Token ID517
Feature activation-5.83e-02

Pos logit contributions
 and+0.090
 under+0.086
 in+0.083
 on+0.078
ed+0.075
Neg logit contributions
icious-0.140
Little-0.132
ograph-0.128
E-0.127
Everybody-0.125
Token.
Token ID13
Feature activation+8.07e-03

Pos logit contributions
icious+0.114
Little+0.111
 sym+0.110
 disl+0.109
 difference+0.108
Neg logit contributions
driver-0.060
ed-0.057
 under-0.056
 and-0.049
 wouldn-0.048
Token\n
Token ID198
Feature activation-2.36e-01

Pos logit contributions
 and+0.011
 under+0.009
 in+0.009
 he+0.009
ed+0.009
Neg logit contributions
icious-0.017
gary-0.015
ograph-0.015
redible-0.015
Little-0.015
Token\n
Token ID198
Feature activation-2.37e-01

Pos logit contributions
icious+0.522
 former+0.496
 disl+0.494
assion+0.470
ograph+0.466
Neg logit contributions
 and-0.366
 in-0.292
 under-0.290
 he-0.274
 on-0.273
TokenThe
Token ID464
Feature activation-2.68e-01

Pos logit contributions
 difference+0.356
 former+0.309
icious+0.302
reci+0.297
 disl+0.292
Neg logit contributions
 and-0.470
 in-0.424
 under-0.403
 he-0.389
 on-0.383
Token truck
Token ID7779
Feature activation+5.88e-02

Pos logit contributions
icious+0.531
Little+0.517
Ms+0.495
reci+0.495
 disl+0.492
Neg logit contributions
 and-0.315
 under-0.307
 in-0.280
ed-0.279
driver-0.268
Token driver
Token ID4639
Feature activation+5.96e-04

Pos logit contributions
 under+0.055
driver+0.053
ed+0.050
 and+0.047
aces+0.044
Neg logit contributions
Little-0.120
icious-0.116
Everybody-0.116
Everyone-0.115
One-0.113
Token opened
Token ID4721
Feature activation+1.97e-02

Pos logit contributions
 under+0.001
driver+0.001
ed+0.001
 and+0.000
 knees+0.000
Neg logit contributions
Little-0.001
icious-0.001
Everybody-0.001
reci-0.001
Everyone-0.001
Token the
Token ID262
Feature activation-1.42e-01

Pos logit contributions
 under+0.023
driver+0.022
ed+0.022
 and+0.021
aces+0.020
Neg logit contributions
Little-0.035
rous-0.034
icious-0.034
E-0.033
Everybody-0.033
Token door
Token ID3420
Feature activation+8.33e-02

Pos logit contributions
icious+0.306
Little+0.301
Everybody+0.288
reci+0.286
Ms+0.286
Neg logit contributions
 and-0.140
 under-0.137
ed-0.125
 in-0.122
driver-0.118
Token!"
Token ID2474
Feature activation-9.02e-02

Pos logit contributions
 and+0.109
 under+0.108
ed+0.102
 in+0.101
driver+0.098
Neg logit contributions
icious-0.103
Little-0.100
Everybody-0.094
reci-0.093
Ms-0.093
Token And
Token ID843
Feature activation-2.86e-02

Pos logit contributions
Little+0.136
icious+0.134
Everybody+0.128
Everyone+0.126
 disl+0.124
Neg logit contributions
 under-0.140
ed-0.132
driver-0.130
 and-0.129
 in-0.126
Token they
Token ID484
Feature activation+1.35e-01

Pos logit contributions
Little+0.046
Everybody+0.042
Everyone+0.041
 wonderful+0.041
The+0.041
Neg logit contributions
driver-0.040
aces-0.039
ed-0.036
ously-0.033
ps-0.032
Token did
Token ID750
Feature activation+4.64e-02

Pos logit contributions
driver+0.151
 under+0.149
 and+0.147
ed+0.142
 he+0.129
Neg logit contributions
Little-0.252
icious-0.248
Everybody-0.239
Br-0.237
Everyone-0.236
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
ed+0.045
driver+0.044
 under+0.043
 and+0.041
 in+0.037
Neg logit contributions
icious-0.096
Little-0.093
 disl-0.090
Everybody-0.090
 difference-0.089
Token The
Token ID383
Feature activation-2.74e-01

Pos logit contributions
 and+0.016
 under+0.014
ed+0.013
 in+0.013
 he+0.012
Neg logit contributions
icious-0.024
rous-0.023
ograph-0.023
Little-0.023
Gr-0.022
Token end
Token ID886
Feature activation-2.99e-02

Pos logit contributions
icious+0.421
Little+0.406
Ms+0.395
E+0.387
 disl+0.387
Neg logit contributions
 and-0.455
 under-0.439
 in-0.415
ed-0.406
 on-0.401
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
icious+0.069
Little+0.067
Everybody+0.064
Ms+0.064
reci+0.064
Neg logit contributions
 and-0.026
 under-0.025
ed-0.022
 in-0.022
driver-0.021
Token<|endoftext|>
Token ID50256
Feature activation-3.14e-01

Pos logit contributions
 and+0.017
 under+0.013
 on+0.013
 in+0.013
 He+0.013
Neg logit contributions
 former-0.022
icious-0.021
rous-0.021
 disl-0.021
itionally-0.020
TokenOnce
Token ID7454
Feature activation-1.50e-01

Pos logit contributions
icious+0.507
 disl+0.466
 former+0.456
rous+0.446
reci+0.445
Neg logit contributions
 and-0.569
 under-0.539
 in-0.517
 on-0.504
ed-0.495
Token upon
Token ID2402
Feature activation-8.04e-02

Pos logit contributions
icious+0.266
ograph+0.238
 disl+0.238
 former+0.238
Little+0.236
Neg logit contributions
 and-0.244
 under-0.234
 in-0.220
 he-0.214
 on-0.210
Token me
Token ID502
Feature activation-1.10e-01

Pos logit contributions
icious+0.141
 former+0.135
 lifetime+0.133
 difference+0.133
reci+0.132
Neg logit contributions
 and-0.111
 under-0.099
 in-0.098
 on-0.095
ed-0.092
Token!"
Token ID2474
Feature activation-8.41e-02

Pos logit contributions
icious+0.200
Little+0.198
Everybody+0.187
Ms+0.185
reci+0.185
Neg logit contributions
 under-0.138
 and-0.137
ed-0.130
 in-0.125
driver-0.124
Token she
Token ID673
Feature activation+1.29e-01

Pos logit contributions
icious+0.172
 difference+0.157
ograph+0.153
Little+0.153
 former+0.153
Neg logit contributions
 and-0.115
 under-0.108
 in-0.103
ed-0.102
 he-0.098
Token said
Token ID531
Feature activation-5.93e-02

Pos logit contributions
 and+0.177
 under+0.175
ed+0.165
 in+0.160
driver+0.158
Neg logit contributions
icious-0.224
Little-0.220
Everybody-0.209
reci-0.207
Ms-0.206
Token.
Token ID13
Feature activation+1.27e-02

Pos logit contributions
Little+0.086
Gr+0.083
icious+0.082
Everybody+0.082
Br+0.082
Neg logit contributions
driver-0.070
 under-0.070
ed-0.067
eer-0.066
 and-0.065
Token\n
Token ID198
Feature activation-2.34e-01

Pos logit contributions
 under+0.021
 and+0.021
ed+0.020
 in+0.020
driver+0.019
Neg logit contributions
icious-0.018
Little-0.018
Everybody-0.016
Ms-0.016
E-0.016
Token\n
Token ID198
Feature activation-2.35e-01

Pos logit contributions
icious+0.589
 former+0.566
 difference+0.553
 disl+0.534
 lifetime+0.526
Neg logit contributions
 and-0.338
 in-0.257
 to-0.247
 under-0.239
 on-0.239
Token"
Token ID1
Feature activation-7.58e-02

Pos logit contributions
 difference+0.394
reci+0.363
icious+0.356
 former+0.347
 disl+0.344
Neg logit contributions
 and-0.444
 in-0.402
 under-0.381
 ran-0.373
 on-0.371
TokenOkay
Token ID16454
Feature activation-9.02e-02

Pos logit contributions
 difference+0.125
 former+0.119
reci+0.112
 disl+0.110
icious+0.109
Neg logit contributions
 and-0.121
 he-0.112
 in-0.112
 on-0.111
ed-0.108
Token,
Token ID11
Feature activation+9.90e-02

Pos logit contributions
icious+0.208
Little+0.204
reci+0.199
Everybody+0.197
 disl+0.196
Neg logit contributions
 and-0.077
 under-0.074
 in-0.065
ed-0.064
driver-0.060
Token L
Token ID406
Feature activation-2.01e-01

Pos logit contributions
 and+0.127
 under+0.127
ed+0.120
 in+0.115
driver+0.115
Neg logit contributions
icious-0.179
Little-0.176
Everybody-0.166
Ms-0.165
reci-0.164