TOP ACTIVATIONS
MAX = 0.386

should be compassionate . Comp assion ate means you care about
mom and dad too . <|endoftext|> Tim and Mia liked to
hot and dangerous grill . <|endoftext|> Once upon a time ,
back outside to play . <|endoftext|> L ily did not like
Their mom is happy . <|endoftext|> L ily likes to watch
done it without you !" <|endoftext|> Once upon a time ,
. They are friends . <|endoftext|> L ily and Ben were
more careful and patient . <|endoftext|> L ily and Max were
They learned their lesson . <|endoftext|> J ill and Ben were
. They were happy . <|endoftext|> L ila and Tom were
with wild dogs again . <|endoftext|> Ben and Mia love to
the lemon juice together . <|endoftext|> One day , a little
of fun playing together . <|endoftext|> Once upon a time ,
happy friends once more . <|endoftext|> Once upon a time ,
. They are happy . <|endoftext|> L ily and Ben were
their brother and sister . <|endoftext|> S ara was a creative
park a cleaner place . <|endoftext|> One day , a little
They learn a lesson . <|endoftext|> L ila and Ben are
. They were happy . <|endoftext|> Tom and Sam are friends
fun in the park . <|endoftext|> Sam and Mia are friends

MIN ACTIVATIONS
MIN = -0.330

question , and she courage ously answered it .
boy smiled mis chie v ously and said , " Well
so he stepped forward courage ously . When he
cat grinned mis chie v ously . " Never try to
tools and found a screw driver , hammer , and some
he got his little screw driver and started to screw in
mom gave him a screw driver and he was helping to
and took out a screw driver . Then he carefully unsc
He handed her a screw driver and showed her how to
she found a big screw driver . She used it to
The mother got a screw driver and fixed the broken part
tool was her tiny screw driver . One day
. It was a screw driver that her daddy had left
a hammer and a screw driver that he liked to store
a screw and a screw driver and started to work .
tightened it with a screw driver . The clock was now
were connected , the screw driver tightened screws , and the
the wood and his screw driver to screw it together .
a hammer and a screw driver . M olly
! He found a screw driver and gave it to her

INTERVAL 0.207 - 0.386
CONTAINS 7.030%

was the start of a new tradition , every day he
the seeds around . They had so much
garden , there was a big bush . A little girl
for the plant . L ily gave the plant
a big hug . From that day on ,

INTERVAL 0.028 - 0.207
CONTAINS 27.326%

yellow . She painted for a long time ,
loyal underground animal . His name was Tom . Tom wanted
show you ." Tom runs to the slide and
was so graceful that all the animals who passed by stopped
" Be gentle with the stamps and stickers ," Mom

INTERVAL -0.151 - 0.028
CONTAINS 59.637%

glad to see it again ! Tommy had found the box
She picked it up . She liked how it looked and
. He ran towards the car and bit it with his
make him feel better by giving him a y ummy ice
went to bed . She was scared of the dark ,

INTERVAL -0.330 - -0.151
CONTAINS 6.007%

so pretty ! It bloom ed today !" Ben
with just a few paints and switches . From then on
walking through the park when he heard a noise . He
played a trick on her and now she was stuck !
while they were playing , they saw a big box near
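
NOTE ON THE INTERVALS
The four interval boundaries above are consistent with splitting the range from MIN = -0.330 to MAX = 0.386 into four equal-width bins of width 0.179, and the four CONTAINS percentages sum to 100%. The sketch below reproduces the edges and shows how such percentages could be computed. It is an assumption that the dashboard bins this way; the function names and the toy activation values are hypothetical and not taken from the source.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_percentages(values, edges):
    """Percentage of values in each bin; the last bin includes its upper edge."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            upper_ok = v <= edges[i + 1] if is_last else v < edges[i + 1]
            if edges[i] <= v and upper_ok:
                counts[i] += 1
                break
    return [100.0 * c / len(values) for c in counts]


# MIN and MAX as reported in the dump.
edges = equal_width_edges(-0.330, 0.386, 4)
rounded_edges = [round(e, 3) for e in edges]

# CONTAINS percentages as reported in the dump, lowest interval first.
reported = [6.007, 59.637, 27.326, 7.030]
reported_total = sum(reported)

# Toy activations (made up for illustration), one per bin.
toy_percentages = bin_percentages([-0.3, 0.0, 0.1, 0.3], edges)
```

Here `rounded_edges` matches the boundaries printed in the INTERVAL lines, which supports the equal-width reading but does not prove it.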

Token should
Token ID815
Feature activation-4.43e-02

Pos logit contributions
Little+0.020
One+0.019
reci+0.019
Ms+0.019
 difference+0.019
Neg logit contributions
ed-0.024
 knees-0.022
fully-0.021
ously-0.021
driver-0.020
Token be
Token ID307
Feature activation+4.59e-02

Pos logit contributions
ed+0.062
 knees+0.052
driver+0.050
 under+0.049
 and+0.049
Neg logit contributions
Ms-0.088
Little-0.088
One-0.086
rous-0.086
 difference-0.084
Token compassionate
Token ID32533
Feature activation+1.67e-01

Pos logit contributions
Little+0.073
Ms+0.072
rous+0.070
 difference+0.069
One+0.068
Neg logit contributions
ed-0.085
 and-0.074
 under-0.073
 in-0.072
 knees-0.071
Token.
Token ID13
Feature activation-4.79e-02

Pos logit contributions
 difference+0.284
One+0.279
Ms+0.269
 future+0.265
Little+0.262
Neg logit contributions
 knees-0.226
ed-0.220
 wouldn-0.210
driver-0.206
ped-0.200
Token Comp
Token ID3082
Feature activation+5.96e-02

Pos logit contributions
ed+0.087
 knees+0.075
 and+0.075
driver+0.074
 under+0.073
Neg logit contributions
Ms-0.075
Little-0.075
rous-0.075
 difference-0.072
 lifetime-0.071
Tokenassion
Token ID11857
Feature activation+3.86e-01

Pos logit contributions
Little+0.061
 difference+0.056
 lifetime+0.054
 memorable+0.053
rous+0.053
Neg logit contributions
ed-0.151
driver-0.136
 and-0.129
 wouldn-0.129
fully-0.128
Tokenate
Token ID378
Feature activation+4.53e-02

Pos logit contributions
Ms+0.666
rous+0.666
Little+0.649
 lifetime+0.645
gary+0.645
Neg logit contributions
ed-0.692
 and-0.615
 in-0.554
 under-0.550
driver-0.534
Token means
Token ID1724
Feature activation-4.00e-02

Pos logit contributions
Ms+0.067
 difference+0.066
Little+0.065
rous+0.064
One+0.063
Neg logit contributions
ed-0.076
 knees-0.076
driver-0.071
 hid-0.066
 under-0.066
Token you
Token ID345
Feature activation+4.46e-03

Pos logit contributions
ed+0.076
 knees+0.066
driver+0.066
 under+0.063
fully+0.062
Neg logit contributions
Ms-0.059
Little-0.058
One-0.057
rous-0.056
Mrs-0.056
Token care
Token ID1337
Feature activation-5.78e-02

Pos logit contributions
One+0.007
Ms+0.007
Little+0.007
 difference+0.007
 rewarding+0.007
Neg logit contributions
ed-0.007
 knees-0.007
driver-0.006
 under-0.006
fully-0.006
Token about
Token ID546
Feature activation-1.79e-03

Pos logit contributions
ed+0.052
 and+0.042
 under+0.039
driver+0.037
 in+0.037
Neg logit contributions
rous-0.145
Little-0.145
Ms-0.144
One-0.141
Mrs-0.139
Token mom
Token ID1995
Feature activation+9.35e-04

Pos logit contributions
Ms+0.011
Little+0.011
One+0.011
rous+0.011
Mrs+0.011
Neg logit contributions
ed-0.013
 knees-0.011
 and-0.011
 under-0.011
driver-0.011
Token and
Token ID290
Feature activation-2.08e-01

Pos logit contributions
 difference+0.002
Ms+0.002
One+0.002
Little+0.002
Mrs+0.002
Neg logit contributions
ed-0.001
 knees-0.001
driver-0.001
fully-0.001
ped-0.001
Token dad
Token ID9955
Feature activation-1.12e-02

Pos logit contributions
ed+0.348
driver+0.289
fully+0.286
 knees+0.278
 under+0.271
Neg logit contributions
Ms-0.366
Mrs-0.357
rous-0.355
Little-0.352
 difference-0.352
Token too
Token ID1165
Feature activation-4.27e-02

Pos logit contributions
ed+0.013
driver+0.011
 knees+0.011
fully+0.009
 under+0.009
Neg logit contributions
Ms-0.025
 difference-0.025
Little-0.024
One-0.024
rous-0.024
Token.
Token ID13
Feature activation-3.38e-02

Pos logit contributions
ed+0.045
 knees+0.035
driver+0.034
 under+0.033
 and+0.032
Neg logit contributions
Ms-0.099
Little-0.098
One-0.097
rous-0.097
Mrs-0.095
Token<|endoftext|>
Token ID50256
Feature activation+3.21e-01

Pos logit contributions
 and+0.031
ed+0.027
 under+0.023
 in+0.022
 went+0.020
Neg logit contributions
rous-0.087
icious-0.086
gary-0.086
redible-0.086
Ms-0.084
TokenTim
Token ID14967
Feature activation+6.87e-02

Pos logit contributions
 former+0.437
icious+0.420
 disl+0.404
 lifetime+0.401
rous+0.401
Neg logit contributions
ed-0.718
 and-0.667
 in-0.650
 on-0.649
 under-0.637
Token and
Token ID290
Feature activation-2.16e-01

Pos logit contributions
One+0.148
Little+0.148
rous+0.146
Ms+0.144
 difference+0.143
Neg logit contributions
ed-0.071
 knees-0.069
driver-0.069
fully-0.060
ked-0.055
Token Mia
Token ID32189
Feature activation+1.38e-02

Pos logit contributions
ed+0.375
driver+0.344
aces+0.333
ier+0.332
ps+0.329
Neg logit contributions
Little-0.338
The-0.334
Every-0.331
One-0.329
Mrs-0.324
Token liked
Token ID8288
Feature activation+3.11e-02

Pos logit contributions
One+0.021
Little+0.021
rous+0.020
 difference+0.020
reci+0.020
Neg logit contributions
ed-0.024
driver-0.023
 knees-0.023
fully-0.022
ped-0.020
Token to
Token ID284
Feature activation-7.49e-02

Pos logit contributions
rous+0.074
Ms+0.074
One+0.073
Little+0.073
Mrs+0.073
Neg logit contributions
ed-0.031
driver-0.023
 knees-0.022
ped-0.020
 under-0.020
Token hot
Token ID3024
Feature activation+1.53e-01

Pos logit contributions
Ms+0.180
Little+0.179
rous+0.175
One+0.173
Mrs+0.170
Neg logit contributions
ed-0.171
 knees-0.149
driver-0.145
 and-0.143
 under-0.143
Token and
Token ID290
Feature activation-2.24e-01

Pos logit contributions
Little+0.230
 difference+0.221
 future+0.216
rous+0.207
One+0.207
Neg logit contributions
ed-0.293
icked-0.271
driver-0.266
aces-0.263
fully-0.257
Token dangerous
Token ID4923
Feature activation+1.50e-01

Pos logit contributions
ed+0.433
driver+0.394
aces+0.376
icked+0.371
 knees+0.371
Neg logit contributions
rous-0.315
Little-0.314
 rewarding-0.312
 difference-0.297
One-0.296
Token grill
Token ID29901
Feature activation-1.93e-02

Pos logit contributions
 difference+0.232
 future+0.219
Little+0.214
rous+0.206
Ms+0.201
Neg logit contributions
ed-0.289
icked-0.271
aces-0.267
driver-0.267
fully-0.251
Token.
Token ID13
Feature activation-5.02e-02

Pos logit contributions
 knees+0.024
ed+0.024
driver+0.023
ped+0.022
icked+0.022
Neg logit contributions
 difference-0.035
One-0.034
Ms-0.034
Little-0.034
 future-0.034
Token<|endoftext|>
Token ID50256
Feature activation+3.21e-01

Pos logit contributions
 and+0.081
ed+0.073
 he+0.063
 under+0.063
 because+0.063
Neg logit contributions
rous-0.099
redible-0.094
icious-0.091
ograph-0.090
prise-0.089
TokenOnce
Token ID7454
Feature activation+1.80e-01

Pos logit contributions
rous+0.520
 former+0.519
icious+0.512
 disl+0.489
 lifetime+0.488
Neg logit contributions
ed-0.626
 and-0.606
 under-0.564
 in-0.563
 on-0.551
Token upon
Token ID2402
Feature activation+9.61e-02

Pos logit contributions
icious+0.324
 former+0.314
ograph+0.304
Ms+0.304
 sym+0.298
Neg logit contributions
ed-0.310
 and-0.307
 under-0.294
 in-0.288
 on-0.273
Token a
Token ID257
Feature activation+4.58e-02

Pos logit contributions
Ms+0.216
Little+0.215
rous+0.212
One+0.210
Mrs+0.207
Neg logit contributions
ed-0.108
 knees-0.086
driver-0.083
 and-0.082
 under-0.082
Token time
Token ID640
Feature activation+7.24e-03

Pos logit contributions
Ms+0.082
assion+0.080
Girl+0.079
rous+0.077
 former+0.077
Neg logit contributions
 and-0.072
ed-0.065
 under-0.063
 knees-0.061
 in-0.061
Token,
Token ID11
Feature activation-1.13e-01

Pos logit contributions
Ms+0.019
Little+0.019
rous+0.019
One+0.019
Mrs+0.019
Neg logit contributions
ed-0.005
 knees-0.004
 and-0.003
 under-0.003
driver-0.003
Token back
Token ID736
Feature activation+3.30e-02

Pos logit contributions
ed+0.151
ped+0.146
 wouldn+0.145
fully+0.139
 knees+0.138
Neg logit contributions
One-0.240
Ms-0.235
rous-0.235
Mrs-0.229
 future-0.220
Token outside
Token ID2354
Feature activation+2.76e-02

Pos logit contributions
rous+0.075
One+0.073
Ms+0.073
Little+0.073
Mrs+0.072
Neg logit contributions
ed-0.032
driver-0.027
 knees-0.027
fully-0.026
aces-0.026
Token to
Token ID284
Feature activation-7.44e-02

Pos logit contributions
One+0.064
Ms+0.064
Little+0.064
rous+0.064
 difference+0.063
Neg logit contributions
ed-0.026
driver-0.022
 knees-0.020
ped-0.020
icked-0.019
Token play
Token ID711
Feature activation+1.11e-01

Pos logit contributions
ed+0.124
 knees+0.114
 and+0.111
 under+0.110
 hid+0.106
Neg logit contributions
Ms-0.126
Little-0.124
Mrs-0.119
icious-0.118
rous-0.118
Token.
Token ID13
Feature activation-3.43e-02

Pos logit contributions
One+0.236
Little+0.226
Ms+0.223
rous+0.222
 difference+0.218
Neg logit contributions
ed-0.133
driver-0.117
 knees-0.111
ped-0.108
 wouldn-0.107
Token<|endoftext|>
Token ID50256
Feature activation+3.21e-01

Pos logit contributions
 and+0.030
 under+0.026
ed+0.024
 in+0.023
 knees+0.022
Neg logit contributions
icious-0.091
rous-0.088
redible-0.087
is-0.087
 lifetime-0.086
TokenL
Token ID43
Feature activation+2.03e-01

Pos logit contributions
 former+0.438
icious+0.429
rous+0.420
Ms+0.414
 lifetime+0.413
Neg logit contributions
ed-0.698
 and-0.655
 in-0.628
 under-0.624
 on-0.622
Tokenily
Token ID813
Feature activation-1.57e-02

Pos logit contributions
Ms+0.389
Little+0.386
One+0.381
rous+0.377
Mrs+0.372
Neg logit contributions
ed-0.296
 knees-0.254
driver-0.246
 under-0.235
 and-0.229
Token did
Token ID750
Feature activation-5.32e-02

Pos logit contributions
driver+0.019
fully+0.017
ed+0.017
 knees+0.017
ked+0.016
Neg logit contributions
Little-0.033
One-0.031
Love-0.030
Em-0.030
str-0.029
Token not
Token ID407
Feature activation-4.00e-02

Pos logit contributions
ed+0.052
 knees+0.040
driver+0.038
 under+0.037
 and+0.037
Neg logit contributions
Ms-0.127
Little-0.127
rous-0.125
One-0.124
Mrs-0.123
Token like
Token ID588
Feature activation+1.18e-02

Pos logit contributions
ed+0.069
driver+0.057
 knees+0.056
ped+0.054
 under+0.053
Neg logit contributions
One-0.069
Little-0.065
rous-0.064
Every-0.064
Each-0.063
Token Their
Token ID5334
Feature activation+1.14e-01

Pos logit contributions
 and+0.037
ed+0.034
 under+0.029
 knees+0.027
 He+0.027
Neg logit contributions
rous-0.083
gary-0.082
Ms-0.080
redible-0.080
icious-0.080
Token mom
Token ID1995
Feature activation-9.93e-04

Pos logit contributions
Ms+0.240
Little+0.234
Mrs+0.228
One+0.227
rous+0.226
Neg logit contributions
ed-0.151
 knees-0.140
 and-0.131
 under-0.125
driver-0.123
Token is
Token ID318
Feature activation-4.70e-02

Pos logit contributions
ed+0.001
 knees+0.001
driver+0.001
 and+0.001
 under+0.001
Neg logit contributions
Ms-0.002
Little-0.002
rous-0.002
One-0.002
Mrs-0.002
Token happy
Token ID3772
Feature activation+2.17e-01

Pos logit contributions
ed+0.078
 and+0.076
 in+0.075
 under+0.073
 on+0.072
Neg logit contributions
Little-0.081
Ms-0.078
 lifetime-0.077
igsaw-0.076
reci-0.076
Token.
Token ID13
Feature activation-3.77e-02

Pos logit contributions
One+0.441
 difference+0.437
 future+0.418
Ms+0.411
Little+0.408
Neg logit contributions
ed-0.289
ped-0.245
 knees-0.242
driver-0.241
aces-0.231
Token<|endoftext|>
Token ID50256
Feature activation+3.21e-01

Pos logit contributions
 and+0.031
ed+0.031
 under+0.022
 in+0.019
 knees+0.019
Neg logit contributions
 lifetime-0.096
icious-0.096
rous-0.095
gary-0.095
Little-0.094
TokenL
Token ID43
Feature activation+1.99e-01

Pos logit contributions
 former+0.393
icious+0.390
rous+0.383
Ms+0.377
Little+0.371
Neg logit contributions
ed-0.732
 and-0.678
 under-0.652
 in-0.648
 on-0.646
Tokenily
Token ID813
Feature activation-1.66e-02

Pos logit contributions
Ms+0.361
Little+0.357
One+0.354
Mrs+0.348
rous+0.348
Neg logit contributions
ed-0.300
 knees-0.273
driver-0.265
 wouldn-0.251
 hid-0.240
Token likes
Token ID7832
Feature activation+6.76e-02

Pos logit contributions
 knees+0.026
driver+0.026
fully+0.025
ed+0.025
ked+0.024
Neg logit contributions
rous-0.026
Love-0.026
Little-0.026
reci-0.026
 future-0.026
Token to
Token ID284
Feature activation-7.61e-02

Pos logit contributions
Ms+0.167
Little+0.166
rous+0.165
One+0.163
Mrs+0.162
Neg logit contributions
ed-0.061
 knees-0.045
driver-0.044
 under-0.040
ped-0.038
Token watch
Token ID2342
Feature activation+5.28e-02

Pos logit contributions
ed+0.163
 under+0.136
driver+0.134
ped+0.132
 knees+0.132
Neg logit contributions
Little-0.100
One-0.099
rous-0.097
Ms-0.095
 disl-0.094
Token done
Token ID1760
Feature activation+3.69e-02

Pos logit contributions
Ms+0.017
Little+0.017
rous+0.016
 difference+0.016
Mrs+0.016
Neg logit contributions
ed-0.010
 knees-0.008
 and-0.008
 under-0.007
driver-0.007
Token it
Token ID340
Feature activation-8.50e-02

Pos logit contributions
Ms+0.077
Little+0.077
rous+0.075
One+0.075
Mrs+0.074
Neg logit contributions
ed-0.047
 knees-0.040
driver-0.038
 under-0.037
 and-0.037
Token without
Token ID1231
Feature activation-5.72e-02

Pos logit contributions
ed+0.140
 knees+0.138
driver+0.127
ously+0.123
ped+0.122
Neg logit contributions
One-0.133
Little-0.128
Ms-0.127
icious-0.126
Mrs-0.121
Token you
Token ID345
Feature activation+4.58e-03

Pos logit contributions
ed+0.095
 knees+0.085
 and+0.081
 under+0.081
driver+0.080
Neg logit contributions
Ms-0.097
Little-0.097
rous-0.094
One-0.093
 difference-0.092
Token!"
Token ID2474
Feature activation+4.61e-02

Pos logit contributions
One+0.009
Little+0.008
Ms+0.008
rous+0.008
Mrs+0.008
Neg logit contributions
ed-0.007
driver-0.006
 knees-0.006
 under-0.005
fully-0.005
Token<|endoftext|>
Token ID50256
Feature activation+3.21e-01

Pos logit contributions
Little+0.071
Ms+0.070
One+0.068
Mrs+0.068
rous+0.065
Neg logit contributions
ed-0.083
driver-0.074
 knees-0.073
ps-0.071
ped-0.070
TokenOnce
Token ID7454
Feature activation+1.81e-01

Pos logit contributions
 former+0.487
rous+0.477
icious+0.474
 rewarding+0.457
 lifetime+0.457
Neg logit contributions
ed-0.648
 and-0.623
 under-0.587
 in-0.585
 on-0.574
Token upon
Token ID2402
Feature activation+1.00e-01

Pos logit contributions
icious+0.316
 former+0.313
Ms+0.304
ograph+0.296
Little+0.291
Neg logit contributions
ed-0.319
 and-0.313
 under-0.301
 in-0.290
 on-0.278
Token a
Token ID257
Feature activation+5.32e-02

Pos logit contributions
Little+0.212
One+0.211
Ms+0.210
rous+0.206
Mrs+0.204
Neg logit contributions
ed-0.124
 knees-0.103
driver-0.102
 under-0.094
 hid-0.091
Token time
Token ID640
Feature activation+1.18e-02

Pos logit contributions
Ms+0.092
assion+0.089
Girl+0.089
Little+0.088
 former+0.087
Neg logit contributions
 and-0.085
ed-0.081
 under-0.077
 knees-0.074
 in-0.074
Token,
Token ID11
Feature activation-1.08e-01

Pos logit contributions
Little+0.030
Ms+0.029
One+0.029
rous+0.029
Mrs+0.028
Neg logit contributions
ed-0.010
 knees-0.007
driver-0.007
 under-0.006
 and-0.006
Token.
Token ID13
Feature activation-2.86e-02

Pos logit contributions
 difference+0.400
 future+0.352
One+0.348
 rewarding+0.345
 harmony+0.336
Neg logit contributions
ed-0.308
ped-0.291
driver-0.288
 wouldn-0.284
icked-0.281
Token They
Token ID1119
Feature activation-7.46e-02

Pos logit contributions
 and+0.034
ed+0.033
 in+0.028
 under+0.027
 on+0.026
Neg logit contributions
rous-0.065
 lifetime-0.063
icious-0.062
str-0.062
gary-0.061
Token are
Token ID389
Feature activation-4.91e-02

Pos logit contributions
ed+0.112
 and+0.098
 knees+0.096
 under+0.095
driver+0.094
Neg logit contributions
Ms-0.139
Little-0.139
rous-0.136
Mrs-0.133
One-0.132
Token friends
Token ID2460
Feature activation-3.05e-02

Pos logit contributions
ed+0.094
 knees+0.083
driver+0.083
fully+0.078
 under+0.076
Neg logit contributions
Ms-0.071
rous-0.070
Little-0.069
One-0.069
 rewarding-0.068
Token.
Token ID13
Feature activation-2.88e-02

Pos logit contributions
ed+0.039
 knees+0.036
driver+0.034
ped+0.032
 hid+0.030
Neg logit contributions
 difference-0.061
One-0.061
Little-0.060
Ms-0.060
rous-0.058
Token<|endoftext|>
Token ID50256
Feature activation+3.21e-01

Pos logit contributions
 and+0.041
ed+0.039
 in+0.035
 under+0.034
 on+0.033
Neg logit contributions
rous-0.059
icious-0.057
 lifetime-0.057
str-0.056
gary-0.056
TokenL
Token ID43
Feature activation+2.04e-01

Pos logit contributions
 former+0.382
icious+0.372
Ms+0.368
rous+0.357
Little+0.355
Neg logit contributions
ed-0.743
 and-0.679
 in-0.666
 under-0.659
 on-0.658
Tokenily
Token ID813
Feature activation-1.17e-02

Pos logit contributions
Ms+0.342
One+0.329
Little+0.329
rous+0.323
Mrs+0.322
Neg logit contributions
ed-0.345
 knees-0.313
driver-0.301
 wouldn-0.291
ped-0.286
Token and
Token ID290
Feature activation-2.10e-01

Pos logit contributions
ed+0.018
driver+0.017
 knees+0.017
fully+0.016
ked+0.015
Neg logit contributions
rous-0.020
One-0.020
Little-0.020
 difference-0.019
 future-0.019
Token Ben
Token ID3932
Feature activation+1.78e-02

Pos logit contributions
ed+0.365
driver+0.348
 knees+0.319
aces+0.317
fully+0.315
Neg logit contributions
Little-0.329
rous-0.324
Mrs-0.319
One-0.317
Every-0.316
Token were
Token ID547
Feature activation-9.48e-02

Pos logit contributions
One+0.029
Little+0.028
reci+0.028
rous+0.028
 difference+0.028
Neg logit contributions
ed-0.029
 knees-0.028
driver-0.028
fully-0.027
edly-0.025
Token more
Token ID517
Feature activation+6.06e-02

Pos logit contributions
Ms+0.070
Little+0.070
rous+0.068
One+0.067
 difference+0.066
Neg logit contributions
ed-0.081
 knees-0.071
driver-0.069
 and-0.069
 under-0.069
Token careful
Token ID8161
Feature activation+1.02e-01

Pos logit contributions
Ms+0.099
Little+0.098
rous+0.096
One+0.095
Mrs+0.093
Neg logit contributions
ed-0.107
 knees-0.093
driver-0.091
 and-0.091
 under-0.091
Token and
Token ID290
Feature activation-2.12e-01

Pos logit contributions
One+0.182
Every+0.174
 future+0.168
 difference+0.168
 rewarding+0.167
Neg logit contributions
ed-0.147
ked-0.135
icked-0.133
 hid-0.133
driver-0.130
Token patient
Token ID5827
Feature activation+7.28e-02

Pos logit contributions
ed+0.427
 knees+0.380
driver+0.373
 under+0.368
 and+0.368
Neg logit contributions
Ms-0.288
Little-0.287
rous-0.280
One-0.276
Mrs-0.270
Token.
Token ID13
Feature activation-3.47e-02

Pos logit contributions
 difference+0.131
One+0.129
Ms+0.129
 rewarding+0.126
 future+0.125
Neg logit contributions
ed-0.100
driver-0.091
 knees-0.090
icked-0.085
ped-0.084
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
ed+0.039
 and+0.037
 under+0.031
 fell+0.029
 in+0.028
Neg logit contributions
icious-0.080
redible-0.080
 lifetime-0.079
rous-0.079
gary-0.079
TokenL
Token ID43
Feature activation+1.99e-01

Pos logit contributions
 former+0.454
icious+0.441
 lifetime+0.427
rous+0.424
 disl+0.417
Neg logit contributions
ed-0.697
 and-0.658
 on-0.634
 in-0.633
 under-0.626
Tokenily
Token ID813
Feature activation-1.57e-02

Pos logit contributions
Ms+0.383
Little+0.379
One+0.372
rous+0.371
Mrs+0.366
Neg logit contributions
ed-0.293
 knees-0.250
driver-0.243
 under-0.233
 and-0.230
Token and
Token ID290
Feature activation-2.18e-01

Pos logit contributions
 knees+0.022
driver+0.022
fully+0.022
ked+0.021
ed+0.021
Neg logit contributions
Little-0.027
 future-0.027
rous-0.027
Love-0.026
One-0.026
Token Max
Token ID5436
Feature activation-1.10e-02

Pos logit contributions
ed+0.358
driver+0.342
aces+0.310
 knees+0.308
ps+0.306
Neg logit contributions
Little-0.372
Mrs-0.361
rous-0.357
Ms-0.351
One-0.351
Token were
Token ID547
Feature activation-1.02e-01

Pos logit contributions
ed+0.018
 knees+0.017
driver+0.017
fully+0.016
ously+0.015
Neg logit contributions
One-0.019
Little-0.019
 difference-0.018
Ms-0.018
rous-0.017
Token They
Token ID1119
Feature activation-7.81e-02

Pos logit contributions
 and+0.035
ed+0.034
 under+0.029
 in+0.028
 on+0.028
Neg logit contributions
rous-0.060
icious-0.060
gary-0.059
redible-0.059
Ms-0.057
Token learned
Token ID4499
Feature activation+1.51e-03

Pos logit contributions
ed+0.102
driver+0.089
 knees+0.086
fully+0.079
 under+0.078
Neg logit contributions
Little-0.158
Ms-0.158
One-0.158
Mrs-0.152
rous-0.152
Token their
Token ID511
Feature activation+6.29e-03

Pos logit contributions
Ms+0.003
Little+0.003
rous+0.003
One+0.003
Mrs+0.003
Neg logit contributions
ed-0.002
 knees-0.001
driver-0.001
 under-0.001
 and-0.001
Token lesson
Token ID11483
Feature activation+1.17e-02

Pos logit contributions
Ms+0.012
rous+0.012
Little+0.012
Girl+0.012
Mrs+0.012
Neg logit contributions
 and-0.009
ed-0.009
 knees-0.009
 in-0.009
 under-0.008
Token.
Token ID13
Feature activation-2.95e-02

Pos logit contributions
One+0.022
 future+0.022
Mrs+0.022
 rewarding+0.022
Little+0.022
Neg logit contributions
ed-0.014
 knees-0.014
driver-0.014
icked-0.013
ped-0.012
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
 and+0.033
ed+0.031
 under+0.026
 in+0.025
 on+0.024
Neg logit contributions
rous-0.069
icious-0.069
gary-0.069
redible-0.069
Ms-0.067
TokenJ
Token ID41
Feature activation+7.38e-02

Pos logit contributions
 former+0.382
icious+0.377
rous+0.374
Ms+0.369
 difference+0.362
Neg logit contributions
ed-0.741
 and-0.680
 in-0.660
 under-0.656
 on-0.653
Tokenill
Token ID359
Feature activation-1.10e-01

Pos logit contributions
Ms+0.118
Little+0.117
rous+0.115
One+0.114
Mrs+0.112
Neg logit contributions
ed-0.128
 knees-0.115
driver-0.110
 under-0.110
 and-0.107
Token and
Token ID290
Feature activation-2.15e-01

Pos logit contributions
driver+0.119
 knees+0.113
ed+0.110
fully+0.105
pered+0.096
Neg logit contributions
Little-0.229
str-0.218
One-0.217
Love-0.214
rous-0.214
Token Ben
Token ID3932
Feature activation+1.64e-02

Pos logit contributions
driver+0.357
ed+0.351
aces+0.317
ps+0.317
fully+0.304
Neg logit contributions
Little-0.342
rous-0.334
The-0.328
Mrs-0.327
One-0.327
Token were
Token ID547
Feature activation-9.85e-02

Pos logit contributions
Little+0.031
One+0.030
This+0.029
Love+0.029
Che+0.029
Neg logit contributions
driver-0.022
 knees-0.021
fully-0.021
ed-0.021
iously-0.019
Token.
Token ID13
Feature activation-2.35e-02

Pos logit contributions
ed+0.025
driver+0.021
 knees+0.021
ped+0.020
icked+0.019
Neg logit contributions
One-0.037
 difference-0.037
Little-0.037
Ms-0.036
 rewarding-0.036
Token They
Token ID1119
Feature activation-7.15e-02

Pos logit contributions
 and+0.025
ed+0.022
 he+0.018
 in+0.018
 under+0.018
Neg logit contributions
rous-0.058
icious-0.058
redible-0.056
str-0.056
gary-0.056
Token were
Token ID547
Feature activation-9.85e-02

Pos logit contributions
ed+0.115
 knees+0.100
driver+0.099
 under+0.095
 and+0.093
Neg logit contributions
Ms-0.124
Little-0.124
One-0.121
rous-0.120
Mrs-0.119
Token happy
Token ID3772
Feature activation+2.20e-01

Pos logit contributions
ed+0.199
 knees+0.177
driver+0.174
 under+0.172
 and+0.172
Neg logit contributions
Ms-0.135
Little-0.134
rous-0.131
One-0.129
Mrs-0.126
Token.
Token ID13
Feature activation-2.60e-02

Pos logit contributions
 difference+0.449
One+0.447
Little+0.437
Mrs+0.433
Ms+0.430
Neg logit contributions
ed-0.282
driver-0.244
ped-0.225
aces-0.224
 knees-0.221
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
 and+0.035
ed+0.032
 in+0.028
 under+0.028
 on+0.027
Neg logit contributions
icious-0.057
rous-0.057
redible-0.055
gary-0.055
 lifetime-0.054
TokenL
Token ID43
Feature activation+2.02e-01

Pos logit contributions
 former+0.380
icious+0.371
rous+0.368
Ms+0.365
 rewarding+0.359
Neg logit contributions
ed-0.742
 and-0.684
 in-0.666
 under-0.661
 on-0.657
Tokenila
Token ID10102
Feature activation-1.39e-03

Pos logit contributions
Ms+0.387
Little+0.379
One+0.376
Mrs+0.370
rous+0.369
Neg logit contributions
ed-0.291
 knees-0.256
driver-0.249
 wouldn-0.235
ped-0.228
Token and
Token ID290
Feature activation-2.13e-01

Pos logit contributions
driver+0.001
 knees+0.001
ed+0.001
fully+0.001
icked+0.001
Neg logit contributions
Little-0.003
Love-0.003
Em-0.003
One-0.003
L-0.003
Token Tom
Token ID4186
Feature activation+3.01e-02

Pos logit contributions
ed+0.364
driver+0.331
 knees+0.306
ps+0.305
fully+0.304
Neg logit contributions
Little-0.343
Mrs-0.341
The-0.338
One-0.338
Every-0.334
Token were
Token ID547
Feature activation-9.69e-02

Pos logit contributions
One+0.055
Little+0.054
Love+0.053
reci+0.052
This+0.052
Neg logit contributions
 knees-0.039
driver-0.039
fully-0.038
ed-0.037
edly-0.035
Token with
Token ID351
Feature activation+2.80e-02

Pos logit contributions
rous+0.048
 difference+0.048
One+0.048
Ms+0.047
Little+0.047
Neg logit contributions
ed-0.036
 knees-0.033
driver-0.031
ously-0.030
ped-0.030
Token wild
Token ID4295
Feature activation+1.95e-01

Pos logit contributions
rous+0.051
Little+0.051
Ms+0.051
One+0.050
 difference+0.049
Neg logit contributions
ed-0.044
driver-0.036
 knees-0.035
 under-0.034
fully-0.034
Token dogs
Token ID6844
Feature activation-1.15e-01

Pos logit contributions
Little+0.323
 disl+0.322
One+0.321
 difference+0.320
rous+0.317
Neg logit contributions
ed-0.296
driver-0.268
icked-0.261
 knees-0.261
fully-0.260
Token again
Token ID757
Feature activation-3.89e-02

Pos logit contributions
ed+0.163
 knees+0.142
ped+0.137
driver+0.136
icked+0.129
Neg logit contributions
One-0.221
Ms-0.216
 difference-0.215
Little-0.213
rous-0.211
Token.
Token ID13
Feature activation-3.04e-02

Pos logit contributions
ed+0.041
 knees+0.033
driver+0.032
 under+0.030
 and+0.029
Neg logit contributions
Little-0.090
Ms-0.090
One-0.088
rous-0.088
Mrs-0.086
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
 and+0.037
ed+0.036
 under+0.031
 teeth+0.029
 in+0.029
Neg logit contributions
gary-0.066
rous-0.066
icious-0.066
redible-0.065
is-0.064
TokenBen
Token ID11696
Feature activation+5.44e-02

Pos logit contributions
 former+0.408
icious+0.397
rous+0.391
 disl+0.389
 rewarding+0.387
Neg logit contributions
ed-0.724
 and-0.670
 in-0.654
 on-0.649
 under-0.647
Token and
Token ID290
Feature activation-2.13e-01

Pos logit contributions
Little+0.108
Love+0.107
L+0.105
 disl+0.103
One+0.102
Neg logit contributions
 knees-0.061
driver-0.058
fully-0.053
iously-0.053
 toes-0.052
Token Mia
Token ID32189
Feature activation+1.30e-02

Pos logit contributions
ed+0.361
driver+0.340
ps+0.327
aces+0.327
ier+0.324
Neg logit contributions
The-0.339
Little-0.334
One-0.331
Mrs-0.327
Every-0.325
Token love
Token ID1842
Feature activation+9.54e-02

Pos logit contributions
Little+0.024
One+0.024
rous+0.023
reci+0.023
Every+0.023
Neg logit contributions
driver-0.017
 knees-0.016
ed-0.016
fully-0.015
edly-0.014
Token to
Token ID284
Feature activation-7.19e-02

Pos logit contributions
rous+0.187
One+0.179
Little+0.176
Mrs+0.175
Ms+0.175
Neg logit contributions
ed-0.128
driver-0.117
ously-0.105
aces-0.104
 knees-0.100
Token the
Token ID262
Feature activation+1.03e-01

Pos logit contributions
ed+0.002
 knees+0.002
driver+0.001
 under+0.001
 and+0.001
Neg logit contributions
Ms-0.003
Little-0.003
rous-0.003
One-0.003
Mrs-0.003
Token lemon
Token ID18873
Feature activation+6.48e-02

Pos logit contributions
Ms+0.154
Mrs+0.145
Little+0.143
rous+0.142
One+0.141
Neg logit contributions
ed-0.195
 and-0.183
 knees-0.179
 under-0.175
 in-0.170
Token juice
Token ID13135
Feature activation-6.08e-02

Pos logit contributions
Ms+0.109
Little+0.109
One+0.107
is+0.107
 difference+0.106
Neg logit contributions
ed-0.100
 knees-0.091
driver-0.090
ped-0.083
 hid-0.083
Token together
Token ID1978
Feature activation-1.97e-02

Pos logit contributions
ed+0.094
 knees+0.089
driver+0.087
ped+0.085
 teeth+0.083
Neg logit contributions
One-0.091
Ms-0.091
is-0.089
 difference-0.088
icious-0.088
Token.
Token ID13
Feature activation-5.11e-02

Pos logit contributions
ed+0.027
driver+0.025
 knees+0.023
 hid+0.022
ped+0.022
Neg logit contributions
One-0.038
Ms-0.037
Little-0.036
Every-0.036
Mrs-0.035
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
ed+0.074
 and+0.072
 under+0.063
 knees+0.062
 in+0.060
Neg logit contributions
rous-0.103
 difference-0.096
redible-0.096
icious-0.095
 rewarding-0.094
TokenOne
Token ID3198
Feature activation+2.14e-01

Pos logit contributions
icious+0.480
 former+0.478
rous+0.473
 rewarding+0.454
Ms+0.454
Neg logit contributions
ed-0.653
 and-0.617
 in-0.577
 on-0.576
 under-0.575
Token day
Token ID1110
Feature activation+3.70e-02

Pos logit contributions
Ms+0.373
Mrs+0.353
icious+0.349
 former+0.348
ograph+0.347
Neg logit contributions
ed-0.348
 and-0.348
 in-0.323
 under-0.320
 knees-0.314
Token,
Token ID11
Feature activation-1.06e-01

Pos logit contributions
Little+0.080
One+0.080
Ms+0.078
Mrs+0.075
Every+0.075
Neg logit contributions
ed-0.038
 knees-0.036
driver-0.036
aces-0.032
fully-0.031
Token a
Token ID257
Feature activation+4.57e-02

Pos logit contributions
ed+0.124
 knees+0.104
driver+0.101
 and+0.094
 under+0.094
Neg logit contributions
Little-0.231
Ms-0.230
One-0.227
rous-0.226
Mrs-0.222
Token little
Token ID1310
Feature activation+1.91e-01

Pos logit contributions
rous+0.076
Ms+0.076
 difference+0.073
 former+0.073
icious+0.072
Neg logit contributions
ed-0.076
 and-0.071
 under-0.068
 knees-0.067
 in-0.067
Token of
Token ID286
Feature activation-9.05e-03

Pos logit contributions
Little+0.085
Ms+0.084
rous+0.083
One+0.082
 difference+0.081
Neg logit contributions
ed-0.064
 knees-0.056
driver-0.054
 under-0.051
ped-0.050
Token fun
Token ID1257
Feature activation+1.73e-01

Pos logit contributions
 and+0.018
ed+0.018
 under+0.017
 in+0.017
 knees+0.016
Neg logit contributions
Ms-0.013
Little-0.012
Mrs-0.012
Girl-0.012
str-0.012
Token playing
Token ID2712
Feature activation+2.14e-02

Pos logit contributions
Little+0.338
rous+0.337
One+0.332
Ms+0.327
 difference+0.325
Neg logit contributions
ed-0.242
driver-0.204
 knees-0.196
ped-0.187
fully-0.184
Token together
Token ID1978
Feature activation-1.63e-02

Pos logit contributions
Little+0.040
rous+0.040
Ms+0.039
One+0.039
 difference+0.038
Neg logit contributions
ed-0.031
 knees-0.027
driver-0.026
ped-0.025
icked-0.024
Token.
Token ID13
Feature activation-4.25e-02

Pos logit contributions
ed+0.019
driver+0.018
ped+0.016
 knees+0.015
icked+0.014
Neg logit contributions
One-0.035
rous-0.033
 difference-0.033
Little-0.033
Ms-0.033
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
ed+0.070
 and+0.068
 under+0.060
 knees+0.058
 fell+0.056
Neg logit contributions
redible-0.078
rous-0.077
Ms-0.075
gary-0.075
icious-0.075
TokenOnce
Token ID7454
Feature activation+1.84e-01

Pos logit contributions
 former+0.572
icious+0.560
rous+0.556
 lifetime+0.536
 rewarding+0.530
Neg logit contributions
 and-0.593
ed-0.582
 under-0.537
 in-0.535
 on-0.524
Token upon
Token ID2402
Feature activation+9.88e-02

Pos logit contributions
icious+0.331
 former+0.325
Ms+0.312
ograph+0.311
 sym+0.304
Neg logit contributions
ed-0.313
 and-0.310
 under-0.299
 in-0.289
 on-0.277
Token a
Token ID257
Feature activation+4.93e-02

Pos logit contributions
Ms+0.218
Little+0.218
rous+0.214
One+0.213
Mrs+0.210
Neg logit contributions
ed-0.114
 knees-0.092
driver-0.089
 under-0.086
 and-0.086
Token time
Token ID640
Feature activation+1.09e-02

Pos logit contributions
Ms+0.088
assion+0.086
Girl+0.085
 former+0.084
Little+0.083
Neg logit contributions
 and-0.078
ed-0.071
 under-0.069
 in-0.067
 knees-0.066
Token,
Token ID11
Feature activation-1.09e-01

Pos logit contributions
Ms+0.029
Little+0.028
rous+0.028
One+0.028
Mrs+0.028
Neg logit contributions
ed-0.008
 knees-0.006
 and-0.005
driver-0.005
 under-0.005
Token happy
Token ID3772
Feature activation+2.15e-01

Pos logit contributions
ed+0.183
driver+0.162
 knees+0.160
fully+0.148
 under+0.145
Neg logit contributions
Little-0.200
Ms-0.199
rous-0.195
One-0.192
 difference-0.191
Token friends
Token ID2460
Feature activation-4.57e-02

Pos logit contributions
 difference+0.403
Little+0.389
 sym+0.387
 lifetime+0.377
Em+0.372
Neg logit contributions
driver-0.288
ed-0.276
aces-0.263
ps-0.247
 knees-0.241
Token once
Token ID1752
Feature activation+4.23e-02

Pos logit contributions
 knees+0.070
driver+0.066
ped+0.059
ps+0.058
ed+0.057
Neg logit contributions
 difference-0.082
 lifetime-0.077
 memorable-0.075
 sym-0.075
Little-0.075
Token more
Token ID517
Feature activation+5.42e-02

Pos logit contributions
Little+0.084
Ms+0.082
rous+0.082
One+0.082
 difference+0.081
Neg logit contributions
ed-0.058
driver-0.051
 knees-0.050
 hid-0.044
 under-0.044
Token.
Token ID13
Feature activation-5.10e-02

Pos logit contributions
Little+0.114
 rewarding+0.110
rous+0.109
Ms+0.108
One+0.108
Neg logit contributions
ed-0.073
driver-0.063
 knees-0.057
ped-0.055
icked-0.053
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
 and+0.086
ed+0.086
 under+0.076
 knees+0.073
 fell+0.072
Neg logit contributions
redible-0.087
rous-0.087
icious-0.083
gary-0.083
Ms-0.082
TokenOnce
Token ID7454
Feature activation+1.77e-01

Pos logit contributions
 former+0.534
icious+0.532
rous+0.528
 disl+0.518
 rewarding+0.507
Neg logit contributions
ed-0.605
 and-0.594
 in-0.552
 under-0.550
 on-0.545
Token upon
Token ID2402
Feature activation+9.35e-02

Pos logit contributions
icious+0.344
 former+0.331
ograph+0.321
Ms+0.316
 disl+0.310
Neg logit contributions
 and-0.296
ed-0.288
 under-0.284
 in-0.280
 on-0.264
Token a
Token ID257
Feature activation+4.49e-02

Pos logit contributions
Ms+0.213
Little+0.212
rous+0.209
One+0.206
 difference+0.205
Neg logit contributions
ed-0.102
 knees-0.081
 and-0.078
 under-0.078
driver-0.077
Token time
Token ID640
Feature activation+4.97e-03

Pos logit contributions
Ms+0.084
Girl+0.083
assion+0.082
 disl+0.080
redible+0.080
Neg logit contributions
 and-0.068
 under-0.059
ed-0.059
 in-0.057
 knees-0.055
Token,
Token ID11
Feature activation-1.14e-01

Pos logit contributions
Ms+0.014
Little+0.014
rous+0.013
 difference+0.013
Mrs+0.013
Neg logit contributions
ed-0.003
 and-0.002
 knees-0.002
 under-0.002
driver-0.002
Token.
Token ID13
Feature activation-3.36e-02

Pos logit contributions
ed+0.095
 knees+0.094
driver+0.087
ped+0.086
ously+0.085
Neg logit contributions
Ms-0.120
One-0.117
Mrs-0.115
 difference-0.114
Little-0.112
Token They
Token ID1119
Feature activation-7.55e-02

Pos logit contributions
ed+0.031
 and+0.027
 under+0.024
 knees+0.023
 in+0.022
Neg logit contributions
rous-0.084
Ms-0.083
Little-0.083
icious-0.083
str-0.082
Token are
Token ID389
Feature activation-5.02e-02

Pos logit contributions
ed+0.127
 knees+0.108
driver+0.106
 under+0.103
 and+0.102
Neg logit contributions
Ms-0.129
Little-0.128
One-0.126
rous-0.125
 difference-0.123
Token happy
Token ID3772
Feature activation+2.19e-01

Pos logit contributions
ed+0.104
driver+0.095
 knees+0.093
fully+0.088
 hid+0.085
Neg logit contributions
Ms-0.067
Little-0.066
rous-0.064
One-0.064
Mrs-0.063
Token.
Token ID13
Feature activation-3.28e-02

Pos logit contributions
One+0.404
 difference+0.396
 future+0.387
 talent+0.375
rous+0.374
Neg logit contributions
ed-0.311
driver-0.310
ped-0.279
aces-0.273
 wouldn-0.270
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
ed+0.046
 and+0.046
 under+0.040
 in+0.039
 on+0.037
Neg logit contributions
rous-0.068
icious-0.068
redible-0.066
str-0.065
gary-0.065
TokenL
Token ID43
Feature activation+2.01e-01

Pos logit contributions
 former+0.344
Ms+0.342
rous+0.342
icious+0.340
Little+0.332
Neg logit contributions
ed-0.760
 and-0.694
 under-0.679
 in-0.676
 on-0.666
Tokenily
Token ID813
Feature activation-1.22e-02

Pos logit contributions
Ms+0.350
One+0.344
Little+0.341
rous+0.335
Mrs+0.330
Neg logit contributions
ed-0.320
 knees-0.282
driver-0.279
 wouldn-0.267
ped-0.260
Token and
Token ID290
Feature activation-2.14e-01

Pos logit contributions
ed+0.019
driver+0.019
 knees+0.019
fully+0.018
ked+0.018
Neg logit contributions
Little-0.020
rous-0.019
One-0.019
Ms-0.019
 future-0.019
Token Ben
Token ID3932
Feature activation+1.48e-02

Pos logit contributions
ed+0.371
driver+0.365
aces+0.334
ier+0.330
fully+0.330
Neg logit contributions
Little-0.332
rous-0.323
One-0.318
Every-0.313
This-0.312
Token were
Token ID547
Feature activation-9.70e-02

Pos logit contributions
One+0.025
Little+0.025
reci+0.024
Ms+0.023
rous+0.023
Neg logit contributions
ed-0.023
 knees-0.022
fully-0.022
driver-0.022
edly-0.021
Token their
Token ID511
Feature activation+3.60e-03

Pos logit contributions
ed+0.357
driver+0.294
fully+0.289
 knees+0.285
ped+0.280
Neg logit contributions
Ms-0.345
 difference-0.340
Little-0.338
Mrs-0.335
rous-0.334
Token brother
Token ID3956
Feature activation-7.53e-02

Pos logit contributions
Ms+0.007
Little+0.007
rous+0.007
One+0.007
Mrs+0.007
Neg logit contributions
ed-0.005
 knees-0.004
driver-0.004
 under-0.004
 and-0.004
Token and
Token ID290
Feature activation-2.07e-01

Pos logit contributions
ed+0.086
 knees+0.074
driver+0.072
fully+0.066
ped+0.064
Neg logit contributions
 difference-0.162
Ms-0.161
One-0.160
Little-0.159
rous-0.156
Token sister
Token ID6621
Feature activation-7.85e-02

Pos logit contributions
ed+0.372
driver+0.306
fully+0.302
 knees+0.300
ped+0.293
Neg logit contributions
Ms-0.326
 difference-0.323
Little-0.316
Mrs-0.315
One-0.314
Token.
Token ID13
Feature activation-2.99e-02

Pos logit contributions
ed+0.085
 knees+0.069
driver+0.067
 under+0.062
fully+0.061
Neg logit contributions
Ms-0.177
Little-0.176
One-0.174
rous-0.173
 difference-0.173
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
 and+0.030
ed+0.029
 under+0.024
 in+0.023
 knees+0.022
Neg logit contributions
rous-0.072
gary-0.071
Ms-0.070
icious-0.070
redible-0.070
TokenS
Token ID50
Feature activation+1.27e-01

Pos logit contributions
 former+0.411
icious+0.398
rous+0.382
Ms+0.381
 lifetime+0.381
Neg logit contributions
ed-0.728
 and-0.674
 in-0.654
 on-0.652
 under-0.648
Tokenara
Token ID3301
Feature activation+3.30e-02

Pos logit contributions
 former+0.175
 disl+0.170
 rewarding+0.168
 sym+0.168
 wond+0.160
Neg logit contributions
 and-0.268
ed-0.254
 went-0.231
 under-0.229
 fell-0.226
Token was
Token ID373
Feature activation-7.58e-02

Pos logit contributions
 future+0.062
rous+0.062
reci+0.061
Love+0.061
One+0.060
Neg logit contributions
driver-0.041
 knees-0.040
fully-0.038
iously-0.037
ked-0.037
Token a
Token ID257
Feature activation+5.42e-02

Pos logit contributions
ed+0.094
driver+0.078
 knees+0.077
fully+0.069
 and+0.068
Neg logit contributions
rous-0.161
Ms-0.161
Little-0.160
One-0.158
Mrs-0.154
Token creative
Token ID7325
Feature activation+2.10e-01

Pos logit contributions
Ms+0.086
Little+0.084
rous+0.083
One+0.081
 difference+0.081
Neg logit contributions
ed-0.095
 and-0.086
 knees-0.085
 under-0.084
 in-0.082
Token park
Token ID3952
Feature activation+1.86e-02

Pos logit contributions
Ms+0.165
Little+0.161
rous+0.158
One+0.157
Mrs+0.154
Neg logit contributions
ed-0.179
 knees-0.161
 and-0.157
 under-0.155
driver-0.153
Token a
Token ID257
Feature activation+4.46e-02

Pos logit contributions
Little+0.027
 difference+0.027
Ms+0.026
rous+0.026
One+0.025
Neg logit contributions
ed-0.036
 knees-0.032
driver-0.031
ped-0.031
icked-0.030
Token cleaner
Token ID21723
Feature activation+2.19e-02

Pos logit contributions
Ms+0.064
reci+0.064
rous+0.064
itionally+0.061
One+0.061
Neg logit contributions
 and-0.083
 in-0.076
 under-0.075
 knees-0.074
 out-0.073
Token place
Token ID1295
Feature activation+6.93e-04

Pos logit contributions
 difference+0.050
Little+0.048
 future+0.047
One+0.046
 former+0.046
Neg logit contributions
ed-0.025
icked-0.022
ped-0.020
ked-0.020
 knees-0.019
Token.
Token ID13
Feature activation-4.63e-02

Pos logit contributions
 difference+0.001
Little+0.001
rous+0.001
 rewarding+0.001
Ms+0.001
Neg logit contributions
ed-0.001
 knees-0.001
ped-0.001
driver-0.001
icked-0.001
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
ed+0.075
 and+0.073
 under+0.062
 knees+0.061
 went+0.059
Neg logit contributions
redible-0.087
icious-0.086
rous-0.085
ograph-0.082
gary-0.082
TokenOne
Token ID3198
Feature activation+2.14e-01

Pos logit contributions
rous+0.457
icious+0.451
 former+0.446
Ms+0.435
 rewarding+0.427
Neg logit contributions
ed-0.671
 and-0.626
 under-0.596
 in-0.588
 on-0.580
Token day
Token ID1110
Feature activation+3.49e-02

Pos logit contributions
Ms+0.372
Mrs+0.347
icious+0.347
 former+0.344
Gr+0.340
Neg logit contributions
ed-0.353
 and-0.345
 in-0.323
 under-0.322
 knees-0.317
Token,
Token ID11
Feature activation-1.09e-01

Pos logit contributions
Little+0.078
One+0.077
Ms+0.076
Mrs+0.073
rous+0.073
Neg logit contributions
ed-0.035
driver-0.033
 knees-0.032
aces-0.029
fully-0.029
Token a
Token ID257
Feature activation+4.66e-02

Pos logit contributions
ed+0.128
 knees+0.106
driver+0.104
 under+0.097
 and+0.097
Neg logit contributions
Little-0.237
Ms-0.237
One-0.234
rous-0.233
Mrs-0.228
Token little
Token ID1310
Feature activation+1.90e-01

Pos logit contributions
rous+0.078
Ms+0.078
 former+0.075
 difference+0.074
icious+0.074
Neg logit contributions
ed-0.076
 and-0.072
 under-0.068
 knees-0.067
 in-0.067
Token They
Token ID1119
Feature activation-7.67e-02

Pos logit contributions
 and+0.030
ed+0.027
 in+0.021
 under+0.020
 as+0.019
Neg logit contributions
rous-0.079
icious-0.078
redible-0.078
str-0.078
is-0.077
Token learn
Token ID2193
Feature activation+3.48e-02

Pos logit contributions
ed+0.135
 knees+0.118
driver+0.115
 under+0.113
 and+0.113
Neg logit contributions
Ms-0.124
Little-0.124
rous-0.121
One-0.120
 difference-0.118
Token a
Token ID257
Feature activation+5.62e-02

Pos logit contributions
Ms+0.081
Little+0.080
rous+0.080
One+0.080
Mrs+0.079
Neg logit contributions
ed-0.036
 knees-0.028
driver-0.027
 under-0.025
ped-0.024
Token lesson
Token ID11483
Feature activation+1.22e-02

Pos logit contributions
rous+0.111
Little+0.107
icious+0.105
Ms+0.103
gary+0.101
Neg logit contributions
 and-0.084
ed-0.077
 in-0.074
 on-0.073
 under-0.073
Token.
Token ID13
Feature activation-3.11e-02

Pos logit contributions
One+0.024
Ms+0.021
 future+0.021
 rewarding+0.021
Every+0.021
Neg logit contributions
ed-0.017
 knees-0.015
driver-0.014
ped-0.014
 teeth-0.013
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
 and+0.030
ed+0.028
 in+0.021
 or+0.021
 under+0.021
Neg logit contributions
rous-0.076
str-0.076
icious-0.075
 lifetime-0.074
gary-0.074
TokenL
Token ID43
Feature activation+2.03e-01

Pos logit contributions
 former+0.362
Ms+0.353
icious+0.352
rous+0.346
Little+0.340
Neg logit contributions
ed-0.752
 and-0.685
 in-0.665
 under-0.663
 on-0.660
Tokenila
Token ID10102
Feature activation-3.61e-03

Pos logit contributions
Ms+0.370
Little+0.361
One+0.359
rous+0.353
Mrs+0.352
Neg logit contributions
ed-0.307
 knees-0.277
driver-0.268
 wouldn-0.257
 hid-0.246
Token and
Token ID290
Feature activation-2.14e-01

Pos logit contributions
 knees+0.003
driver+0.003
fully+0.003
ed+0.003
ked+0.003
Neg logit contributions
rous-0.008
reci-0.008
 future-0.008
Little-0.008
 difference-0.008
Token Ben
Token ID3932
Feature activation+1.47e-02

Pos logit contributions
ed+0.394
driver+0.354
ps+0.333
aces+0.331
fully+0.329
Neg logit contributions
Little-0.329
rous-0.317
The-0.316
Mrs-0.313
Every-0.311
Token are
Token ID389
Feature activation-5.15e-02

Pos logit contributions
One+0.022
Little+0.022
 difference+0.022
reci+0.021
rous+0.021
Neg logit contributions
ed-0.025
 knees-0.024
driver-0.024
fully-0.024
ped-0.023
Token.
Token ID13
Feature activation-3.01e-02

Pos logit contributions
ed+0.084
 knees+0.081
ped+0.072
driver+0.072
icked+0.067
Neg logit contributions
Ms-0.139
One-0.136
Mrs-0.135
 difference-0.134
Little-0.132
Token They
Token ID1119
Feature activation-7.90e-02

Pos logit contributions
ed+0.025
 and+0.024
 under+0.020
 knees+0.018
 in+0.017
Neg logit contributions
icious-0.076
str-0.076
redible-0.075
rous-0.075
Little-0.075
Token were
Token ID547
Feature activation-9.97e-02

Pos logit contributions
ed+0.113
driver+0.096
 knees+0.094
fully+0.089
ped+0.087
Neg logit contributions
Ms-0.153
Little-0.152
One-0.151
Mrs-0.148
rous-0.147
Token happy
Token ID3772
Feature activation+2.19e-01

Pos logit contributions
ed+0.212
driver+0.192
 knees+0.190
fully+0.179
 under+0.176
Neg logit contributions
Ms-0.125
Little-0.124
rous-0.122
One-0.120
Mrs-0.118
Token.
Token ID13
Feature activation-3.13e-02

Pos logit contributions
 difference+0.422
One+0.413
 future+0.400
Little+0.391
Love+0.389
Neg logit contributions
ed-0.305
driver-0.291
ped-0.256
aces-0.253
 wouldn-0.240
Token<|endoftext|>
Token ID50256
Feature activation+3.20e-01

Pos logit contributions
 and+0.042
ed+0.041
 under+0.036
 in+0.035
 he+0.033
Neg logit contributions
icious-0.066
redible-0.065
gary-0.064
rous-0.064
str-0.063
TokenTom
Token ID13787
Feature activation+5.87e-02

Pos logit contributions
 former+0.362
rous+0.361
icious+0.360
Ms+0.356
 disl+0.351
Neg logit contributions
ed-0.748
 and-0.685
 under-0.669
 in-0.668
 on-0.658
Token and
Token ID290
Feature activation-2.14e-01

Pos logit contributions
One+0.078
Little+0.077
Ms+0.076
Mrs+0.072
Today+0.072
Neg logit contributions
ed-0.112
 knees-0.111
driver-0.109
fully-0.102
edly-0.099
Token Sam
Token ID3409
Feature activation-2.19e-02

Pos logit contributions
ed+0.333
ps+0.315
driver+0.309
 knees+0.306
iously+0.306
Neg logit contributions
The-0.356
Every-0.355
Little-0.349
 Love-0.348
One-0.347
Token are
Token ID389
Feature activation-4.85e-02

Pos logit contributions
driver+0.028
 knees+0.027
ed+0.027
fully+0.026
edly+0.024
Neg logit contributions
Little-0.042
One-0.041
Every-0.039
reci-0.039
Ms-0.039
Token friends
Token ID2460
Feature activation-3.87e-02

Pos logit contributions
ed+0.061
 knees+0.050
driver+0.050
fully+0.046
 under+0.046
Neg logit contributions
Ms-0.103
rous-0.103
Little-0.103
One-0.101
Mrs-0.099
Token fun
Token ID1257
Feature activation+1.76e-01

Pos logit contributions
Little+0.019
Ms+0.019
rous+0.019
One+0.018
Mrs+0.018
Neg logit contributions
ed-0.020
 knees-0.018
 and-0.018
 under-0.017
driver-0.017
Token in
Token ID287
Feature activation-1.54e-02

Pos logit contributions
One+0.392
rous+0.386
 difference+0.379
Little+0.379
 sym+0.374
Neg logit contributions
ed-0.194
driver-0.172
 knees-0.144
 wouldn-0.142
ped-0.132
Token the
Token ID262
Feature activation+1.14e-01

Pos logit contributions
 and+0.014
 under+0.012
 in+0.011
 on+0.011
 knees+0.010
Neg logit contributions
icious-0.041
Ms-0.039
Little-0.039
rig-0.039
assion-0.038
Token park
Token ID3952
Feature activation+2.90e-02

Pos logit contributions
Ms+0.225
Em+0.218
Mrs+0.214
icious+0.213
rig+0.207
Neg logit contributions
 and-0.169
 on-0.152
 in-0.151
ed-0.150
 under-0.149
Token.
Token ID13
Feature activation-3.74e-02

Pos logit contributions
One+0.059
 difference+0.058
Ms+0.058
rous+0.057
Little+0.057
Neg logit contributions
ed-0.036
driver-0.032
 knees-0.031
ped-0.029
icked-0.027
Token<|endoftext|>
Token ID50256
Feature activation+3.19e-01

Pos logit contributions
 and+0.046
ed+0.042
 under+0.040
 in+0.038
 on+0.035
Neg logit contributions
rous-0.083
icious-0.082
is-0.081
gary-0.081
Ms-0.081
TokenSam
Token ID16305
Feature activation+1.87e-02

Pos logit contributions
 former+0.440
icious+0.421
 disl+0.405
Ms+0.404
rous+0.403
Neg logit contributions
ed-0.702
 and-0.664
 in-0.639
 on-0.637
 under-0.629
Token and
Token ID290
Feature activation-2.16e-01

Pos logit contributions
Little+0.041
One+0.040
rous+0.040
Ms+0.040
 difference+0.039
Neg logit contributions
ed-0.021
 knees-0.020
driver-0.019
fully-0.017
ked-0.016
Token Mia
Token ID32189
Feature activation+1.02e-02

Pos logit contributions
ed+0.383
driver+0.367
aces+0.349
 knees+0.343
ps+0.335
Neg logit contributions
Little-0.329
The-0.321
Mrs-0.321
One-0.314
Ms-0.312
Token are
Token ID389
Feature activation-5.36e-02

Pos logit contributions
Little+0.016
One+0.016
rous+0.015
 difference+0.015
reci+0.015
Neg logit contributions
ed-0.018
driver-0.017
 knees-0.016
fully-0.016
ped-0.015
Token friends
Token ID2460
Feature activation-4.10e-02

Pos logit contributions
ed+0.075
 knees+0.063
driver+0.063
 under+0.059
 and+0.059
Neg logit contributions
Ms-0.106
Little-0.105
rous-0.105
One-0.103
Mrs-0.101
Token question
Token ID1808
Feature activation+5.33e-02

Pos logit contributions
rous+0.076
icious+0.072
reci+0.068
Little+0.067
 former+0.066
Neg logit contributions
 and-0.069
ed-0.063
 in-0.063
 on-0.062
 ran-0.060
Token,
Token ID11
Feature activation-1.14e-01

Pos logit contributions
Ms+0.100
One+0.099
Mrs+0.097
Every+0.095
Today+0.092
Neg logit contributions
ed-0.069
 knees-0.065
ped-0.062
 hid-0.057
erved-0.056
Token and
Token ID290
Feature activation-2.26e-01

Pos logit contributions
ed+0.169
 knees+0.142
driver+0.136
ped+0.132
icked+0.126
Neg logit contributions
Ms-0.221
Little-0.219
One-0.216
Mrs-0.214
Girl-0.205
Token she
Token ID673
Feature activation-1.52e-01

Pos logit contributions
ed+0.294
driver+0.261
 knees+0.255
aces+0.243
ps+0.230
Neg logit contributions
Ms-0.447
Little-0.445
Mrs-0.436
One-0.426
rous-0.424
Token courage
Token ID11917
Feature activation-1.03e-03

Pos logit contributions
ed+0.229
driver+0.210
 knees+0.202
 under+0.195
fully+0.193
Neg logit contributions
Little-0.268
Ms-0.267
One-0.263
Mrs-0.256
Each-0.248
Tokenously
Token ID3481
Feature activation-3.30e-01

Pos logit contributions
 knees+0.001
 teeth+0.001
erved+0.001
ed+0.001
driver+0.001
Neg logit contributions
str-0.002
Ms-0.002
Mrs-0.002
 lifetime-0.002
is-0.002
Token answered
Token ID9373
Feature activation+9.78e-03

Pos logit contributions
ed+0.446
driver+0.441
 knees+0.397
aces+0.377
 under+0.370
Neg logit contributions
One-0.569
Little-0.563
Ms-0.558
Mrs-0.557
str-0.557
Token it
Token ID340
Feature activation-9.51e-02

Pos logit contributions
Ms+0.020
Little+0.019
Mrs+0.019
One+0.019
 difference+0.018
Neg logit contributions
ed-0.012
driver-0.011
 knees-0.011
 hid-0.010
 fell-0.009
Token.
Token ID13
Feature activation-4.51e-02

Pos logit contributions
ed+0.108
 knees+0.097
driver+0.097
 under+0.084
 hid+0.083
Neg logit contributions
Ms-0.203
One-0.199
Little-0.198
 difference-0.193
Mrs-0.193
Token
Token ID220
Feature activation+2.15e-01

Pos logit contributions
ed+0.083
 and+0.074
 knees+0.071
driver+0.070
 under+0.069
Neg logit contributions
rous-0.071
 difference-0.069
Little-0.068
icious-0.067
 rewarding-0.067
Token\n
Token ID198
Feature activation+2.43e-01

Pos logit contributions
rous+0.299
Little+0.282
 difference+0.281
icious+0.277
 former+0.272
Neg logit contributions
ed-0.459
 and-0.409
driver-0.396
 under-0.383
 in-0.380
Token boy
Token ID2933
Feature activation-2.39e-02

Pos logit contributions
Little+0.242
Ms+0.235
 difference+0.231
One+0.229
rous+0.228
Neg logit contributions
ed-0.375
driver-0.336
 knees-0.329
icked-0.326
 under-0.323
Token smiled
Token ID13541
Feature activation-5.06e-02

Pos logit contributions
ed+0.040
driver+0.037
aces+0.034
 under+0.034
 knees+0.034
Neg logit contributions
Little-0.040
Ms-0.038
One-0.038
Mrs-0.037
Girl-0.036
Token mis
Token ID2984
Feature activation+1.16e-01

Pos logit contributions
driver+0.053
ed+0.046
 knees+0.046
aces+0.045
 hid+0.045
Neg logit contributions
 difference-0.108
One-0.107
Little-0.105
 sym-0.105
Mrs-0.103
Tokenchie
Token ID3043
Feature activation+1.23e-01

Pos logit contributions
Ms+0.149
Little+0.149
rous+0.144
One+0.143
 difference+0.141
Neg logit contributions
ed-0.247
driver-0.216
 knees-0.214
 and-0.209
 under-0.208
Tokenv
Token ID85
Feature activation+1.74e-01

Pos logit contributions
Ms+0.183
Little+0.179
rous+0.178
One+0.176
 difference+0.175
Neg logit contributions
ed-0.241
 and-0.206
 knees-0.200
 under-0.199
ped-0.197
Tokenously
Token ID3481
Feature activation-3.26e-01

Pos logit contributions
Little+0.337
Girl+0.323
Ms+0.309
 difference+0.308
oral+0.306
Neg logit contributions
 under-0.183
 knees-0.179
 hid-0.175
driver-0.170
 wouldn-0.161
Token and
Token ID290
Feature activation-2.29e-01

Pos logit contributions
ed+0.332
driver+0.300
 knees+0.253
aces+0.251
icked+0.250
Neg logit contributions
Little-0.706
Mrs-0.705
One-0.704
 difference-0.701
Ms-0.699
Token said
Token ID531
Feature activation+1.41e-02

Pos logit contributions
driver+0.361
ed+0.353
aces+0.343
ps+0.309
 under+0.302
Neg logit contributions
Little-0.388
Ms-0.367
Mrs-0.364
rous-0.355
str-0.351
Token,
Token ID11
Feature activation-1.17e-01

Pos logit contributions
Ms+0.029
Little+0.028
One+0.027
Mrs+0.027
Ja+0.027
Neg logit contributions
 knees-0.015
ed-0.015
 under-0.013
driver-0.013
 hid-0.013
Token "
Token ID366
Feature activation+9.59e-02

Pos logit contributions
ed+0.178
 knees+0.153
driver+0.149
 under+0.143
 hid+0.140
Neg logit contributions
Ms-0.219
Little-0.215
Mrs-0.211
One-0.211
Girl-0.202
TokenWell
Token ID5779
Feature activation+4.22e-02

Pos logit contributions
 difference+0.158
rous+0.148
igsaw+0.139
Ms+0.138
reci+0.134
Neg logit contributions
ed-0.185
 and-0.152
driver-0.148
fully-0.142
ped-0.140
Token so
Token ID523
Feature activation+5.01e-02

Pos logit contributions
ed+0.168
icked+0.141
driver+0.135
ped+0.131
 knees+0.127
Neg logit contributions
Little-0.225
 difference-0.213
One-0.209
 rewarding-0.206
 wonderful-0.203
Token he
Token ID339
Feature activation-1.77e-01

Pos logit contributions
Little+0.118
One+0.116
Mrs+0.114
 lifetime+0.113
Ms+0.113
Neg logit contributions
ed-0.048
driver-0.041
aces-0.038
 knees-0.036
ps-0.035
Token stepped
Token ID10764
Feature activation-8.38e-02

Pos logit contributions
driver+0.249
aces+0.236
ed+0.225
 knees+0.212
 toes+0.208
Neg logit contributions
Prin-0.331
One-0.330
Little-0.324
Gr-0.321
N-0.317
Token forward
Token ID2651
Feature activation-4.12e-02

Pos logit contributions
 knees+0.099
driver+0.092
 teeth+0.091
aces+0.091
ed+0.087
Neg logit contributions
Little-0.176
One-0.176
Ms-0.167
Mrs-0.163
Gr-0.162
Token courage
Token ID11917
Feature activation+5.07e-03

Pos logit contributions
ed+0.056
icked+0.052
ped+0.048
led+0.047
 knees+0.047
Neg logit contributions
One-0.075
Little-0.071
 difference-0.070
 future-0.068
Mrs-0.067
Tokenously
Token ID3481
Feature activation-3.23e-01

Pos logit contributions
One+0.011
Mrs+0.010
Every+0.010
Little+0.010
Ms+0.010
Neg logit contributions
ed-0.006
icked-0.005
 knees-0.005
ped-0.005
driver-0.004
Token.
Token ID13
Feature activation-4.72e-02

Pos logit contributions
ed+0.360
driver+0.312
 knees+0.284
icked+0.266
aces+0.258
Neg logit contributions
One-0.692
Ms-0.674
Little-0.673
 difference-0.667
Mrs-0.665
Token\n
Token ID198
Feature activation+2.48e-01

Pos logit contributions
ed+0.088
 and+0.081
 knees+0.075
 under+0.073
 in+0.073
Neg logit contributions
rous-0.077
icious-0.074
Little-0.072
 difference-0.070
 rewarding-0.070
Token\n
Token ID198
Feature activation+2.48e-01

Pos logit contributions
icious+0.491
rous+0.491
 former+0.488
 difference+0.483
 disl+0.477
Neg logit contributions
 and-0.418
ed-0.378
 in-0.347
 under-0.334
 on-0.334
TokenWhen
Token ID2215
Feature activation+4.22e-02

Pos logit contributions
 difference+0.394
rous+0.394
 former+0.339
reci+0.337
 sym+0.334
Neg logit contributions
ed-0.462
 and-0.461
 ran-0.431
driver-0.425
 in-0.422
Token he
Token ID339
Feature activation-1.74e-01

Pos logit contributions
Little+0.094
One+0.088
Mrs+0.086
Ms+0.086
The+0.085
Neg logit contributions
aces-0.056
driver-0.052
ed-0.051
 knees-0.046
icked-0.046
Token cat
Token ID3797
Feature activation-3.41e-02

Pos logit contributions
rous+0.508
Ms+0.497
Little+0.488
rig+0.475
 sym+0.474
Neg logit contributions
ed-0.318
 and-0.283
 under-0.277
 knees-0.276
driver-0.270
Token grinned
Token ID36268
Feature activation-2.94e-02

Pos logit contributions
ed+0.044
driver+0.044
 knees+0.040
aces+0.039
fully+0.039
Neg logit contributions
Little-0.065
Ms-0.061
Girl-0.060
One-0.060
str-0.060
Token mis
Token ID2984
Feature activation+1.20e-01

Pos logit contributions
driver+0.031
 hid+0.026
 knees+0.026
ed+0.025
 wouldn+0.025
Neg logit contributions
 difference-0.063
One-0.063
Little-0.062
 sym-0.061
 former-0.060
Tokenchie
Token ID3043
Feature activation+1.25e-01

Pos logit contributions
Ms+0.126
Little+0.125
rous+0.121
One+0.119
Mrs+0.115
Neg logit contributions
ed-0.280
 knees-0.252
driver-0.249
 and-0.246
 under-0.246
Tokenv
Token ID85
Feature activation+1.78e-01

Pos logit contributions
Ms+0.162
Little+0.159
rous+0.158
One+0.155
 difference+0.153
Neg logit contributions
ed-0.265
 and-0.230
 knees-0.229
 under-0.226
driver-0.225
Tokenously
Token ID3481
Feature activation-3.20e-01

Pos logit contributions
Little+0.327
Ms+0.316
 difference+0.311
Girl+0.305
oral+0.300
Neg logit contributions
 knees-0.218
 under-0.215
ed-0.212
 hid-0.210
 wouldn-0.198
Token.
Token ID13
Feature activation-4.33e-02

Pos logit contributions
ed+0.288
driver+0.255
 knees+0.210
icked+0.209
aces+0.201
Neg logit contributions
 difference-0.731
One-0.730
Ms-0.725
Mrs-0.721
Little-0.720
Token "
Token ID366
Feature activation+9.99e-02

Pos logit contributions
ed+0.060
 knees+0.050
 and+0.049
driver+0.049
 under+0.048
Neg logit contributions
Little-0.086
rous-0.086
Ms-0.086
One-0.084
 difference-0.083
TokenNever
Token ID12295
Feature activation-4.23e-02

Pos logit contributions
 difference+0.136
Ms+0.134
rous+0.129
igsaw+0.129
assion+0.125
Neg logit contributions
ed-0.208
ped-0.174
 and-0.172
driver-0.169
 ran-0.168
Token try
Token ID1949
Feature activation+9.40e-05

Pos logit contributions
ed+0.071
driver+0.060
 knees+0.060
 under+0.058
fully+0.056
Neg logit contributions
Little-0.072
One-0.071
Ms-0.070
 difference-0.070
rous-0.070
Token to
Token ID284
Feature activation-7.88e-02

Pos logit contributions
Little+0.000
One+0.000
Ms+0.000
rous+0.000
 difference+0.000
Neg logit contributions
ed-0.000
 knees-0.000
driver-0.000
 under-0.000
ped-0.000
Token tools
Token ID4899
Feature activation-1.75e-01

Pos logit contributions
ed+0.022
 and+0.020
 knees+0.020
 under+0.020
driver+0.019
Neg logit contributions
Ms-0.034
Little-0.033
icious-0.033
One-0.032
rous-0.032
Token and
Token ID290
Feature activation-2.21e-01

Pos logit contributions
ed+0.142
 knees+0.135
ped+0.130
icked+0.128
 teeth+0.122
Neg logit contributions
Little-0.393
Mrs-0.382
Ms-0.381
Every-0.376
One-0.376
Token found
Token ID1043
Feature activation-1.83e-02

Pos logit contributions
ed+0.289
aces+0.282
ped+0.276
icked+0.273
ps+0.271
Neg logit contributions
Little-0.407
Mrs-0.371
Em-0.369
rous-0.364
Ms-0.359
Token a
Token ID257
Feature activation+4.41e-02

Pos logit contributions
ed+0.023
 knees+0.021
driver+0.019
 under+0.018
fully+0.018
Neg logit contributions
Ms-0.036
Little-0.036
rous-0.035
Mrs-0.035
One-0.035
Token screw
Token ID9580
Feature activation-7.14e-02

Pos logit contributions
Ms+0.070
Little+0.070
rous+0.070
One+0.067
icious+0.066
Neg logit contributions
ed-0.074
 and-0.070
driver-0.068
 knees-0.068
 under-0.068
Tokendriver
Token ID26230
Feature activation-3.05e-01

Pos logit contributions
 knees+0.064
ped+0.064
ed+0.063
icked+0.055
 hid+0.049
Neg logit contributions
One-0.139
is-0.136
 difference-0.132
igsaw-0.131
assion-0.130
Token,
Token ID11
Feature activation-1.10e-01

Pos logit contributions
ed+0.367
 knees+0.342
icked+0.303
ped+0.299
 teeth+0.290
Neg logit contributions
One-0.590
Mrs-0.565
Ms-0.557
is-0.556
Every-0.550
Token hammer
Token ID15554
Feature activation-1.20e-01

Pos logit contributions
ed+0.158
icked+0.123
 knees+0.121
ped+0.118
fully+0.113
Neg logit contributions
Little-0.218
One-0.217
Mrs-0.217
Ms-0.215
Every-0.206
Token,
Token ID11
Feature activation-1.11e-01

Pos logit contributions
 knees+0.108
ed+0.101
 under+0.084
 hid+0.083
edly+0.081
Neg logit contributions
Little-0.260
One-0.258
Ms-0.257
str-0.254
Mrs-0.254
Token and
Token ID290
Feature activation-2.20e-01

Pos logit contributions
ed+0.106
icked+0.082
 knees+0.078
ped+0.070
fully+0.067
Neg logit contributions
One-0.257
Little-0.256
Mrs-0.253
Ms-0.248
Every-0.246
Token some
Token ID617
Feature activation+1.08e-02

Pos logit contributions
ed+0.314
icked+0.266
ped+0.266
 knees+0.257
fully+0.255
Neg logit contributions
Little-0.409
Mrs-0.391
One-0.383
Ms-0.382
Every-0.377
Token he
Token ID339
Feature activation-1.87e-01

Pos logit contributions
Little+0.095
One+0.090
Mrs+0.089
 lifetime+0.088
Every+0.088
Neg logit contributions
ed-0.043
aces-0.037
ps-0.037
driver-0.036
ped-0.031
Token got
Token ID1392
Feature activation-8.24e-02

Pos logit contributions
ed+0.252
driver+0.237
aces+0.232
ps+0.232
fully+0.220
Neg logit contributions
Little-0.334
Ms-0.332
Em-0.326
Mrs-0.321
Girl-0.319
Token his
Token ID465
Feature activation-7.86e-02

Pos logit contributions
ed+0.102
driver+0.099
aces+0.094
fully+0.093
ously+0.093
Neg logit contributions
rous-0.166
Ms-0.158
Little-0.158
Mrs-0.156
Here-0.152
Token little
Token ID1310
Feature activation+1.80e-01

Pos logit contributions
ed+0.146
icked+0.123
driver+0.122
ped+0.121
fully+0.121
Neg logit contributions
rous-0.119
Little-0.116
 difference-0.111
One-0.110
rig-0.110
Token screw
Token ID9580
Feature activation-7.33e-02

Pos logit contributions
 difference+0.309
rous+0.287
Little+0.286
 former+0.284
 future+0.276
Neg logit contributions
ed-0.300
icked-0.265
ped-0.255
fully-0.250
driver-0.242
Tokendriver
Token ID26230
Feature activation-3.04e-01

Pos logit contributions
ped+0.071
icked+0.070
 knees+0.068
 shrugged+0.066
 hid+0.065
Neg logit contributions
is-0.122
igsaw-0.117
Little-0.117
One-0.114
off-0.114
Token and
Token ID290
Feature activation-2.27e-01

Pos logit contributions
ed+0.334
 knees+0.308
icked+0.292
 teeth+0.286
ped+0.268
Neg logit contributions
One-0.605
Mrs-0.589
is-0.586
Little-0.572
rous-0.571
Token started
Token ID2067
Feature activation-1.16e-01

Pos logit contributions
ed+0.316
aces+0.302
ps+0.279
fully+0.275
pered+0.273
Neg logit contributions
Little-0.401
Em-0.369
Mrs-0.366
Love-0.361
Ms-0.353
Token to
Token ID284
Feature activation-8.76e-02

Pos logit contributions
ed+0.149
icked+0.115
ped+0.113
aces+0.112
fully+0.111
Neg logit contributions
Mrs-0.235
Ms-0.235
Little-0.235
One-0.234
rous-0.230
Token screw
Token ID9580
Feature activation-7.30e-02

Pos logit contributions
ed+0.146
 knees+0.121
driver+0.119
 under+0.118
 and+0.116
Neg logit contributions
Little-0.154
Ms-0.152
One-0.149
rous-0.148
 difference-0.146
Token in
Token ID287
Feature activation-3.29e-02

Pos logit contributions
 knees+0.076
ed+0.070
ped+0.068
 hid+0.067
icked+0.067
Neg logit contributions
One-0.138
Little-0.137
V-0.136
E-0.135
Ms-0.133
Token mom
Token ID1995
Feature activation-1.97e-02

Pos logit contributions
Little+0.019
Ms+0.019
 difference+0.019
rous+0.019
One+0.019
Neg logit contributions
ed-0.017
driver-0.015
fully-0.014
ped-0.014
icked-0.014
Token gave
Token ID2921
Feature activation-7.74e-03

Pos logit contributions
ed+0.029
 knees+0.027
aces+0.025
driver+0.025
fully+0.025
Neg logit contributions
Little-0.036
Ms-0.035
Mrs-0.034
One-0.034
Em-0.033
Token him
Token ID683
Feature activation-1.64e-01

Pos logit contributions
ed+0.008
 knees+0.007
driver+0.007
 hid+0.006
 under+0.006
Neg logit contributions
Ms-0.018
Little-0.018
rous-0.018
Mrs-0.017
One-0.017
Token a
Token ID257
Feature activation+4.48e-02

Pos logit contributions
ed+0.227
 knees+0.198
driver+0.192
ped+0.183
fully+0.182
Neg logit contributions
Ms-0.320
Little-0.316
rous-0.312
Mrs-0.308
One-0.306
Token screw
Token ID9580
Feature activation-7.04e-02

Pos logit contributions
Ms+0.069
Little+0.069
rous+0.069
icious+0.068
One+0.067
Neg logit contributions
ed-0.076
 and-0.074
driver-0.071
 knees-0.070
 under-0.069
Tokendriver
Token ID26230
Feature activation-3.04e-01

Pos logit contributions
 knees+0.069
ped+0.064
ed+0.058
 hid+0.055
 ran+0.053
Neg logit contributions
igsaw-0.139
One-0.130
Mrs-0.127
is-0.126
Every-0.125
Token and
Token ID290
Feature activation-2.21e-01

Pos logit contributions
 knees+0.267
ed+0.260
 hid+0.214
icked+0.213
ped+0.212
Neg logit contributions
One-0.680
Mrs-0.678
Ms-0.671
Every-0.651
Each-0.644
Token he
Token ID339
Feature activation-1.76e-01

Pos logit contributions
ed+0.358
aces+0.323
ked+0.302
 knees+0.297
ped+0.295
Neg logit contributions
Little-0.384
Mrs-0.369
Ms-0.364
Em-0.347
Love-0.343
Token was
Token ID373
Feature activation-8.64e-02

Pos logit contributions
ed+0.244
fully+0.218
 knees+0.211
ped+0.206
ously+0.206
Neg logit contributions
Little-0.314
Ms-0.303
One-0.302
Mrs-0.301
Each-0.298
Token helping
Token ID5742
Feature activation+4.46e-03

Pos logit contributions
ed+0.125
 knees+0.106
driver+0.102
 under+0.097
fully+0.096
Neg logit contributions
Ms-0.167
Little-0.165
One-0.161
Mrs-0.161
rous-0.161
Token to
Token ID284
Feature activation-8.41e-02

Pos logit contributions
Ms+0.009
Mrs+0.008
Little+0.008
 difference+0.008
One+0.008
Neg logit contributions
ed-0.007
icked-0.006
driver-0.006
ked-0.005
 knees-0.005
Token and
Token ID290
Feature activation-2.26e-01

Pos logit contributions
icked+0.113
 knees+0.113
ed+0.112
eer+0.111
ped+0.111
Neg logit contributions
Little-0.290
Mrs-0.281
Every-0.281
The-0.279
Em-0.276
Token took
Token ID1718
Feature activation-3.87e-02

Pos logit contributions
aces+0.330
ed+0.322
ps+0.301
pered+0.291
ped+0.290
Neg logit contributions
Little-0.387
Em-0.357
The-0.348
Mrs-0.341
L-0.339
Token out
Token ID503
Feature activation-7.47e-02

Pos logit contributions
 knees+0.046
ed+0.045
driver+0.041
ously+0.040
aces+0.040
Neg logit contributions
Little-0.077
Ms-0.073
rous-0.073
E-0.073
 difference-0.071
Token a
Token ID257
Feature activation+4.45e-02

Pos logit contributions
ed+0.092
 knees+0.078
driver+0.071
ped+0.070
 hid+0.070
Neg logit contributions
Little-0.158
Ms-0.158
One-0.155
rous-0.155
Mrs-0.153
Token screw
Token ID9580
Feature activation-6.96e-02

Pos logit contributions
rous+0.081
Ms+0.081
icious+0.080
Little+0.079
One+0.077
Neg logit contributions
ed-0.065
 and-0.063
driver-0.062
 under-0.060
 knees-0.059
Tokendriver
Token ID26230
Feature activation-3.04e-01

Pos logit contributions
ped+0.065
 knees+0.060
ed+0.058
icked+0.058
 teeth+0.051
Neg logit contributions
igsaw-0.140
is-0.134
 difference-0.132
One-0.129
Little-0.127
Token.
Token ID13
Feature activation-5.26e-02

Pos logit contributions
ed+0.298
 knees+0.268
icked+0.244
ped+0.240
 teeth+0.225
Neg logit contributions
One-0.654
Mrs-0.642
Ms-0.634
is-0.627
 difference-0.623
Token Then
Token ID3244
Feature activation+3.28e-02

Pos logit contributions
ed+0.075
 and+0.062
 knees+0.061
driver+0.060
 under+0.059
Neg logit contributions
rous-0.104
Little-0.104
Ms-0.103
One-0.102
 difference-0.100
Token he
Token ID339
Feature activation-1.75e-01

Pos logit contributions
Little+0.068
 difference+0.062
Ms+0.062
Mrs+0.061
One+0.060
Neg logit contributions
ed-0.040
aces-0.040
driver-0.038
ps-0.037
fully-0.037
Token carefully
Token ID7773
Feature activation-6.76e-02

Pos logit contributions
ed+0.248
fully+0.229
 knees+0.228
ped+0.218
aces+0.213
Neg logit contributions
Little-0.302
One-0.288
Ms-0.282
Em-0.282
Mrs-0.281
Token unsc
Token ID28594
Feature activation+5.14e-02

Pos logit contributions
ed+0.098
 knees+0.092
fully+0.089
ped+0.087
aces+0.085
Neg logit contributions
Little-0.108
One-0.104
Mrs-0.101
Em-0.101
Che-0.100
Token He
Token ID679
Feature activation-1.06e-01

Pos logit contributions
ed+0.077
driver+0.067
 knees+0.066
 under+0.064
 wouldn+0.063
Neg logit contributions
Ms-0.091
Mrs-0.087
Little-0.087
One-0.086
Girl-0.083
Token handed
Token ID10158
Feature activation-6.05e-03

Pos logit contributions
ed+0.128
 knees+0.126
fully+0.119
 under+0.117
aces+0.117
Neg logit contributions
Little-0.202
One-0.201
Ms-0.198
Ja-0.194
Mrs-0.194
Token her
Token ID607
Feature activation-5.93e-02

Pos logit contributions
driver+0.009
ed+0.009
 knees+0.009
aces+0.009
 wouldn+0.008
Neg logit contributions
Ms-0.011
Little-0.010
Mrs-0.010
One-0.010
rous-0.010
Token a
Token ID257
Feature activation+4.49e-02

Pos logit contributions
ed+0.082
 knees+0.069
driver+0.066
ped+0.066
fully+0.065
Neg logit contributions
Little-0.117
Ms-0.114
rous-0.114
One-0.112
Mrs-0.111
Token screw
Token ID9580
Feature activation-6.93e-02

Pos logit contributions
rous+0.081
Little+0.081
Ms+0.081
One+0.079
icious+0.078
Neg logit contributions
ed-0.068
driver-0.061
 and-0.061
 knees-0.059
 under-0.058
Tokendriver
Token ID26230
Feature activation-3.04e-01

Pos logit contributions
ped+0.063
 knees+0.062
 hid+0.059
led+0.057
 headed+0.054
Neg logit contributions
igsaw-0.129
One-0.125
The-0.122
is-0.119
Every-0.119
Token and
Token ID290
Feature activation-2.19e-01

Pos logit contributions
 knees+0.287
ed+0.284
 hid+0.258
icked+0.255
 elbow+0.251
Neg logit contributions
One-0.624
Mrs-0.613
is-0.596
Every-0.592
Ms-0.580
Token showed
Token ID3751
Feature activation-3.04e-03

Pos logit contributions
aces+0.290
ed+0.290
ps+0.273
pered+0.272
ked+0.260
Neg logit contributions
Little-0.395
Mrs-0.361
Ms-0.354
Love-0.348
The-0.348
Token her
Token ID607
Feature activation-5.77e-02

Pos logit contributions
ed+0.004
 knees+0.003
driver+0.003
 under+0.003
aces+0.003
Neg logit contributions
Ms-0.006
Little-0.006
E-0.006
rous-0.006
Mrs-0.006
Token how
Token ID703
Feature activation+5.52e-02

Pos logit contributions
ed+0.079
 knees+0.064
 under+0.062
ped+0.062
driver+0.061
Neg logit contributions
Little-0.114
Ms-0.112
rous-0.111
 difference-0.110
One-0.110
Token to
Token ID284
Feature activation-7.99e-02

Pos logit contributions
Little+0.125
rous+0.125
One+0.123
Ms+0.123
 rewarding+0.122
Neg logit contributions
ed-0.056
 knees-0.046
 under-0.042
ped-0.042
driver-0.042
Token she
Token ID673
Feature activation-1.49e-01

Pos logit contributions
ed+0.145
driver+0.135
 knees+0.125
aces+0.121
ps+0.118
Neg logit contributions
Little-0.233
Ms-0.224
rous-0.222
One-0.220
Mrs-0.220
Token found
Token ID1043
Feature activation-1.51e-02

Pos logit contributions
ed+0.200
fully+0.195
driver+0.186
ped+0.185
aces+0.179
Neg logit contributions
Little-0.266
One-0.257
Ms-0.256
Mrs-0.249
Love-0.247
Token a
Token ID257
Feature activation+4.61e-02

Pos logit contributions
ed+0.019
 knees+0.018
driver+0.017
 under+0.016
fully+0.016
Neg logit contributions
Little-0.030
Ms-0.030
rous-0.029
Mrs-0.029
One-0.028
Token big
Token ID1263
Feature activation+2.28e-01

Pos logit contributions
Ms+0.072
Little+0.072
rous+0.071
One+0.069
icious+0.068
Neg logit contributions
ed-0.080
 and-0.073
 knees-0.073
driver-0.071
 under-0.071
Token screw
Token ID9580
Feature activation-6.85e-02

Pos logit contributions
 difference+0.364
Little+0.349
rous+0.341
Ms+0.333
One+0.330
Neg logit contributions
ed-0.430
ped-0.363
icked-0.362
aces-0.353
fully-0.349
Tokendriver
Token ID26230
Feature activation-3.04e-01

Pos logit contributions
ped+0.068
ed+0.066
 knees+0.065
icked+0.064
 teeth+0.063
Neg logit contributions
is-0.125
One-0.120
igsaw-0.117
 carn-0.114
Little-0.113
Token.
Token ID13
Feature activation-4.66e-02

Pos logit contributions
ed+0.296
 knees+0.264
 teeth+0.249
icked+0.249
ped+0.228
Neg logit contributions
One-0.653
is-0.642
Mrs-0.618
Every-0.612
Little-0.601
Token She
Token ID1375
Feature activation-6.65e-02

Pos logit contributions
ed+0.071
 knees+0.063
driver+0.062
 under+0.061
 hid+0.058
Neg logit contributions
Ms-0.085
Little-0.084
One-0.081
rous-0.081
Mrs-0.081
Token used
Token ID973
Feature activation-3.79e-02

Pos logit contributions
ed+0.089
fully+0.076
driver+0.073
 under+0.073
ps+0.073
Neg logit contributions
Little-0.127
One-0.125
Every-0.121
Ms-0.121
Each-0.121
Token it
Token ID340
Feature activation-8.84e-02

Pos logit contributions
ed+0.048
 knees+0.040
 under+0.037
driver+0.037
fully+0.036
Neg logit contributions
Ms-0.077
Little-0.076
One-0.076
rous-0.076
Mrs-0.074
Token to
Token ID284
Feature activation-7.90e-02

Pos logit contributions
 knees+0.101
 elbow+0.100
ed+0.097
ped+0.088
icked+0.083
Neg logit contributions
One-0.162
Ms-0.159
Each-0.157
Every-0.157
assion-0.157
TokenThe
Token ID464
Feature activation+2.35e-01

Pos logit contributions
 difference+0.367
rous+0.338
assion+0.325
 former+0.316
 sym+0.298
Neg logit contributions
 and-0.470
ed-0.449
 ran-0.433
driver-0.423
 he-0.418
Token mother
Token ID2802
Feature activation-6.41e-02

Pos logit contributions
rous+0.439
Ms+0.436
 difference+0.407
reci+0.406
icious+0.406
Neg logit contributions
ed-0.351
 and-0.327
 knees-0.311
driver-0.302
 ran-0.301
Token got
Token ID1392
Feature activation-8.19e-02

Pos logit contributions
ed+0.102
 knees+0.097
driver+0.089
aces+0.086
ped+0.086
Neg logit contributions
Little-0.112
Ms-0.107
Girl-0.103
Mrs-0.102
Em-0.101
Token a
Token ID257
Feature activation+4.15e-02

Pos logit contributions
ed+0.092
 knees+0.080
driver+0.077
fully+0.073
 under+0.071
Neg logit contributions
Ms-0.178
Little-0.177
rous-0.175
Mrs-0.173
One-0.170
Token screw
Token ID9580
Feature activation-7.05e-02

Pos logit contributions
rous+0.071
Ms+0.070
Little+0.069
One+0.069
icious+0.068
Neg logit contributions
ed-0.065
 and-0.062
 knees-0.060
driver-0.060
 under-0.057
Tokendriver
Token ID26230
Feature activation-3.03e-01

Pos logit contributions
 knees+0.066
ped+0.058
ed+0.056
icked+0.053
 teeth+0.052
Neg logit contributions
is-0.136
One-0.135
Little-0.134
igsaw-0.133
Em-0.131
Token and
Token ID290
Feature activation-2.27e-01

Pos logit contributions
 knees+0.248
ed+0.244
 teeth+0.200
icked+0.195
ped+0.190
Neg logit contributions
One-0.682
Little-0.668
is-0.667
Mrs-0.664
Ms-0.659
Token fixed
Token ID5969
Feature activation+7.64e-02

Pos logit contributions
ed+0.321
 knees+0.281
ped+0.272
fully+0.271
ps+0.266
Neg logit contributions
Little-0.432
Ms-0.416
Mrs-0.412
One-0.400
Em-0.397
Token the
Token ID262
Feature activation+1.02e-01

Pos logit contributions
Ms+0.156
Little+0.156
rous+0.151
One+0.150
Mrs+0.149
Neg logit contributions
ed-0.101
 knees-0.084
driver-0.082
 under-0.079
 and-0.077
Token broken
Token ID5445
Feature activation+1.12e-01

Pos logit contributions
Ms+0.186
rous+0.181
Little+0.181
One+0.174
Mrs+0.173
Neg logit contributions
ed-0.159
 knees-0.140
 and-0.139
driver-0.137
 under-0.135
Token part
Token ID636
Feature activation+3.19e-02

Pos logit contributions
One+0.166
Little+0.165
Ms+0.161
 difference+0.161
rous+0.158
Neg logit contributions
ed-0.201
 knees-0.172
icked-0.170
ped-0.168
 under-0.168
Token tool
Token ID2891
Feature activation-8.83e-02

Pos logit contributions
 difference+0.252
Little+0.251
One+0.250
 future+0.244
Ms+0.233
Neg logit contributions
ed-0.311
icked-0.285
ped-0.271
ked-0.268
aces-0.260
Token was
Token ID373
Feature activation-8.20e-02

Pos logit contributions
ed+0.138
ked+0.132
ped+0.128
fully+0.126
 knees+0.126
Neg logit contributions
One-0.132
str-0.130
Every-0.127
Little-0.126
Each-0.125
Token her
Token ID607
Feature activation-5.70e-02

Pos logit contributions
ed+0.128
 knees+0.115
ped+0.104
fully+0.101
ously+0.099
Neg logit contributions
Ms-0.149
One-0.149
Little-0.146
rous-0.146
Mrs-0.145
Token tiny
Token ID7009
Feature activation+2.14e-01

Pos logit contributions
ed+0.115
ped+0.098
icked+0.096
 knees+0.093
 hid+0.092
Neg logit contributions
 difference-0.079
rous-0.079
Little-0.078
One-0.077
 former-0.076
Token screw
Token ID9580
Feature activation-6.87e-02

Pos logit contributions
 difference+0.332
 former+0.310
One+0.302
 future+0.295
igsaw+0.292
Neg logit contributions
ed-0.357
icked-0.342
ped-0.342
fully-0.310
led-0.302
Tokendriver
Token ID26230
Feature activation-3.02e-01

Pos logit contributions
ped+0.087
ed+0.082
 knees+0.080
icked+0.079
ked+0.076
Neg logit contributions
igsaw-0.113
is-0.106
One-0.106
 beautiful-0.104
assion-0.103
Token.
Token ID13
Feature activation-4.67e-02

Pos logit contributions
ed+0.396
 knees+0.370
icked+0.354
ped+0.349
ked+0.323
Neg logit contributions
One-0.565
Every-0.544
is-0.531
 difference-0.527
Mrs-0.526
Token
Token ID198
Feature activation+2.55e-01

Pos logit contributions
 and+0.055
ed+0.052
 in+0.042
 He+0.041
 knees+0.039
Neg logit contributions
icious-0.109
rous-0.108
 rewarding-0.107
 lifetime-0.106
prise-0.104
Token
Token ID198
Feature activation+2.55e-01

Pos logit contributions
 former+0.638
icious+0.623
 disl+0.603
 rewarding+0.578
 rightful+0.578
Neg logit contributions
 and-0.389
 in-0.312
 on-0.296
 under-0.289
ed-0.271
TokenOne
Token ID3198
Feature activation+2.13e-01

Pos logit contributions
 difference+0.326
 lifetime+0.293
rous+0.290
 former+0.288
 disl+0.284
Neg logit contributions
 and-0.601
 in-0.555
 under-0.516
 ran-0.513
 on-0.509
Token day
Token ID1110
Feature activation+3.46e-02

Pos logit contributions
Ms+0.290
Little+0.285
rous+0.279
One+0.273
Mrs+0.270
Neg logit contributions
ed-0.431
 knees-0.384
 and-0.379
 under-0.378
driver-0.374
Token.
Token ID13
Feature activation-5.32e-02

Pos logit contributions
ed+0.122
 knees+0.110
ped+0.108
 teeth+0.097
ously+0.095
Neg logit contributions
One-0.160
Em-0.154
ograph-0.152
Mrs-0.151
Ms-0.151
Token It
Token ID632
Feature activation-1.48e-02

Pos logit contributions
ed+0.076
 knees+0.064
driver+0.063
 under+0.062
 and+0.061
Neg logit contributions
Ms-0.103
Little-0.102
One-0.100
rous-0.100
Mrs-0.099
Token was
Token ID373
Feature activation-9.10e-02

Pos logit contributions
ed+0.023
 knees+0.022
ously+0.021
fully+0.021
aces+0.020
Neg logit contributions
Little-0.024
One-0.024
Em-0.023
Ms-0.022
str-0.022
Token a
Token ID257
Feature activation+4.41e-02

Pos logit contributions
ed+0.158
 knees+0.134
driver+0.127
ously+0.124
fully+0.121
Neg logit contributions
Ms-0.156
One-0.156
Mrs-0.154
Little-0.147
 rewarding-0.147
Token screw
Token ID9580
Feature activation-6.90e-02

Pos logit contributions
Ms+0.071
Little+0.070
rous+0.070
One+0.067
reci+0.067
Neg logit contributions
ed-0.074
 and-0.069
 under-0.068
 knees-0.068
driver-0.067
Tokendriver
Token ID26230
Feature activation-3.02e-01

Pos logit contributions
ped+0.090
ed+0.085
 knees+0.083
icked+0.082
ously+0.076
Neg logit contributions
is-0.118
igsaw-0.117
 carn-0.105
 difference-0.104
Em-0.104
Token that
Token ID326
Feature activation+8.33e-02

Pos logit contributions
ed+0.355
 knees+0.327
icked+0.323
ped+0.298
 teeth+0.269
Neg logit contributions
is-0.606
One-0.600
 difference-0.564
Mrs-0.563
str-0.559
Token her
Token ID607
Feature activation-6.10e-02

Pos logit contributions
Little+0.163
Ms+0.160
rous+0.159
One+0.157
Mrs+0.155
Neg logit contributions
ed-0.108
driver-0.096
 knees-0.094
ously-0.089
ped-0.085
Token daddy
Token ID35822
Feature activation-3.80e-02

Pos logit contributions
ed+0.100
fully+0.083
ped+0.083
driver+0.082
icked+0.081
Neg logit contributions
Little-0.102
 difference-0.100
rous-0.098
 former-0.098
One-0.097
Token had
Token ID550
Feature activation-8.78e-02

Pos logit contributions
ed+0.057
 knees+0.051
driver+0.048
fully+0.047
ped+0.047
Neg logit contributions
Little-0.068
Ms-0.065
Mrs-0.065
str-0.064
Love-0.064
Token left
Token ID1364
Feature activation+3.45e-02

Pos logit contributions
ed+0.119
 knees+0.105
driver+0.099
 under+0.094
fully+0.094
Neg logit contributions
One-0.170
Ms-0.169
Little-0.165
Mrs-0.165
rous-0.164
Token a
Token ID257
Feature activation+4.09e-02

Pos logit contributions
ed+0.143
 knees+0.121
driver+0.115
 under+0.114
fully+0.112
Neg logit contributions
Ms-0.177
One-0.177
Little-0.175
Mrs-0.170
rous-0.169
Token hammer
Token ID15554
Feature activation-1.18e-01

Pos logit contributions
Ms+0.062
Little+0.062
rous+0.061
icious+0.059
Mrs+0.058
Neg logit contributions
ed-0.072
 and-0.068
 knees-0.068
driver-0.066
 under-0.066
Token and
Token ID290
Feature activation-2.31e-01

Pos logit contributions
 knees+0.149
ed+0.140
ped+0.129
ously+0.123
icked+0.123
Neg logit contributions
One-0.224
Every-0.221
is-0.215
Each-0.211
Mrs-0.204
Token a
Token ID257
Feature activation+4.29e-02

Pos logit contributions
ed+0.312
fully+0.256
 under+0.252
ps+0.251
ked+0.250
Neg logit contributions
One-0.475
Every-0.454
Each-0.433
Mrs-0.421
Little-0.419
Token screw
Token ID9580
Feature activation-6.80e-02

Pos logit contributions
Little+0.071
Ms+0.071
rous+0.070
One+0.067
icious+0.066
Neg logit contributions
ed-0.072
 knees-0.065
 and-0.065
driver-0.065
 under-0.063
Tokendriver
Token ID26230
Feature activation-3.02e-01

Pos logit contributions
 knees+0.062
ped+0.059
ed+0.057
 shrugged+0.053
 headed+0.052
Neg logit contributions
One-0.126
is-0.125
Every-0.117
L-0.115
Each-0.115
Token that
Token ID326
Feature activation+8.08e-02

Pos logit contributions
ed+0.409
 knees+0.396
ped+0.332
icked+0.332
ously+0.330
Neg logit contributions
One-0.556
Every-0.508
is-0.508
Each-0.499
Mrs-0.475
Token he
Token ID339
Feature activation-1.85e-01

Pos logit contributions
Little+0.181
One+0.179
Every+0.175
Ms+0.175
Each+0.174
Neg logit contributions
ed-0.084
 knees-0.070
ked-0.067
ously-0.066
fully-0.065
Token liked
Token ID8288
Feature activation+2.22e-02

Pos logit contributions
ed+0.250
 knees+0.223
ously+0.220
fully+0.212
 under+0.198
Neg logit contributions
One-0.352
Every-0.333
Little-0.333
Each-0.331
E-0.324
Token to
Token ID284
Feature activation-8.37e-02

Pos logit contributions
One+0.046
Ms+0.043
Mrs+0.042
Every+0.042
Little+0.041
Neg logit contributions
ed-0.030
 knees-0.025
icked-0.025
ped-0.024
ked-0.023
Token store
Token ID3650
Feature activation+1.21e-02

Pos logit contributions
ed+0.155
 knees+0.121
 under+0.120
ped+0.120
icked+0.119
Neg logit contributions
One-0.137
Little-0.134
rous-0.126
Ms-0.125
Every-0.124
Token a
Token ID257
Feature activation+4.56e-02

Pos logit contributions
ed+0.111
 knees+0.102
 hid+0.094
ped+0.093
 wouldn+0.093
Neg logit contributions
Little-0.127
rous-0.125
Ms-0.123
One-0.122
igsaw-0.119
Token screw
Token ID9580
Feature activation-6.92e-02

Pos logit contributions
Ms+0.081
Little+0.081
rous+0.080
One+0.079
Mrs+0.077
Neg logit contributions
ed-0.072
 knees-0.062
driver-0.062
 and-0.061
 under-0.061
Token and
Token ID290
Feature activation-2.19e-01

Pos logit contributions
ped+0.068
 knees+0.065
ed+0.062
icked+0.056
 hid+0.055
Neg logit contributions
igsaw-0.125
is-0.122
One-0.119
ay-0.115
 difference-0.114
Token a
Token ID257
Feature activation+4.72e-02

Pos logit contributions
ed+0.301
aces+0.293
ously+0.269
ped+0.268
fully+0.266
Neg logit contributions
Little-0.387
rous-0.356
Mrs-0.348
Ms-0.348
Em-0.345
Token screw
Token ID9580
Feature activation-6.79e-02

Pos logit contributions
Ms+0.085
Little+0.085
rous+0.083
One+0.082
Mrs+0.081
Neg logit contributions
ed-0.074
 knees-0.063
driver-0.062
 and-0.061
 under-0.061
Tokendriver
Token ID26230
Feature activation-3.01e-01

Pos logit contributions
 knees+0.059
edly+0.052
ped+0.052
ed+0.050
erved+0.047
Neg logit contributions
is-0.117
str-0.115
One-0.113
 difference-0.109
Che-0.108
Token and
Token ID290
Feature activation-2.18e-01

Pos logit contributions
 knees+0.357
ed+0.357
edly+0.302
 teeth+0.300
 toes+0.294
Neg logit contributions
One-0.540
is-0.525
str-0.517
Each-0.507
Every-0.502
Token started
Token ID2067
Feature activation-1.08e-01

Pos logit contributions
ed+0.318
aces+0.298
ped+0.278
 knees+0.274
ps+0.271
Neg logit contributions
Little-0.379
Ms-0.351
Mrs-0.349
rous-0.342
One-0.340
Token to
Token ID284
Feature activation-8.14e-02

Pos logit contributions
ed+0.139
icked+0.113
ped+0.113
aces+0.113
fully+0.108
Neg logit contributions
Mrs-0.208
Ms-0.206
One-0.206
Little-0.206
rous-0.205
Token work
Token ID670
Feature activation-7.39e-02

Pos logit contributions
ed+0.133
 knees+0.110
driver+0.108
 under+0.108
 and+0.106
Neg logit contributions
Little-0.145
Ms-0.142
One-0.141
rous-0.140
 difference-0.138
Token.
Token ID13
Feature activation-4.51e-02

Pos logit contributions
ed+0.104
 knees+0.092
ped+0.087
icked+0.084
 hid+0.081
Neg logit contributions
Ms-0.137
One-0.137
Little-0.134
Mrs-0.133
rous-0.132
Token tightened
Token ID37257
Feature activation-5.10e-02

Pos logit contributions
ed+0.305
aces+0.273
ously+0.271
ped+0.270
driver+0.262
Neg logit contributions
Little-0.424
rous-0.388
Em-0.380
Mrs-0.376
Ms-0.375
Token it
Token ID340
Feature activation-9.19e-02

Pos logit contributions
ed+0.074
 knees+0.066
driver+0.063
 hid+0.061
icked+0.061
Neg logit contributions
Little-0.092
Ms-0.092
rous-0.088
One-0.088
Mrs-0.088
Token with
Token ID351
Feature activation+1.88e-02

Pos logit contributions
ed+0.095
 knees+0.089
ped+0.080
driver+0.076
 hid+0.075
Neg logit contributions
One-0.192
Ms-0.189
Little-0.186
Every-0.184
Each-0.183
Token a
Token ID257
Feature activation+4.67e-02

Pos logit contributions
Little+0.037
Ms+0.037
rous+0.036
One+0.036
Mrs+0.035
Neg logit contributions
ed-0.026
 knees-0.022
driver-0.021
 under-0.021
 and-0.020
Token screw
Token ID9580
Feature activation-6.73e-02

Pos logit contributions
rous+0.081
Ms+0.078
Little+0.076
itionally+0.074
One+0.074
Neg logit contributions
ed-0.077
 and-0.074
driver-0.072
 knees-0.068
 in-0.067
Tokendriver
Token ID26230
Feature activation-3.01e-01

Pos logit contributions
ped+0.068
 knees+0.068
ed+0.067
 hid+0.061
icked+0.058
Neg logit contributions
is-0.125
One-0.121
assion-0.117
 difference-0.116
Little-0.115
Token.
Token ID13
Feature activation-4.53e-02

Pos logit contributions
ed+0.369
 knees+0.343
ped+0.315
icked+0.308
 hid+0.291
Neg logit contributions
One-0.569
is-0.552
Ms-0.545
Mrs-0.545
Every-0.537
Token The
Token ID383
Feature activation+2.31e-01

Pos logit contributions
ed+0.077
 and+0.067
 knees+0.064
driver+0.062
 in+0.061
Neg logit contributions
Ms-0.080
rous-0.078
icious-0.077
 difference-0.076
Little-0.076
Token clock
Token ID8801
Feature activation-9.70e-02

Pos logit contributions
Little+0.401
Ms+0.387
One+0.381
 difference+0.378
Mrs+0.375
Neg logit contributions
ed-0.389
 knees-0.325
driver-0.318
 under-0.315
icked-0.315
Token was
Token ID373
Feature activation-8.64e-02

Pos logit contributions
ed+0.127
 knees+0.113
fully+0.107
 under+0.107
ped+0.107
Neg logit contributions
Little-0.179
 difference-0.169
One-0.166
Love-0.164
str-0.164
Token now
Token ID783
Feature activation+1.89e-02

Pos logit contributions
ed+0.130
 knees+0.111
driver+0.106
 under+0.103
 and+0.101
Neg logit contributions
Ms-0.161
Little-0.161
rous-0.157
One-0.157
Mrs-0.155
Token were
Token ID547
Feature activation-1.06e-01

Pos logit contributions
 knees+0.126
ed+0.121
 teeth+0.109
icked+0.108
edly+0.108
Neg logit contributions
Little-0.190
Every-0.181
Each-0.180
reci-0.179
str-0.176
Token connected
Token ID5884
Feature activation+2.02e-02

Pos logit contributions
ed+0.194
 knees+0.166
fully+0.155
ped+0.152
icked+0.150
Neg logit contributions
Ms-0.158
One-0.158
Little-0.157
Mrs-0.155
 rewarding-0.154
Token,
Token ID11
Feature activation-1.08e-01

Pos logit contributions
One+0.040
Ms+0.040
Mrs+0.040
Little+0.040
assion+0.040
Neg logit contributions
 knees-0.021
 elbow-0.019
icked-0.019
ed-0.019
ped-0.018
Token the
Token ID262
Feature activation+1.10e-01

Pos logit contributions
ed+0.119
icked+0.097
 knees+0.092
ped+0.087
driver+0.080
Neg logit contributions
Little-0.244
Mrs-0.236
Ms-0.236
One-0.232
 rewarding-0.229
Token screw
Token ID9580
Feature activation-6.60e-02

Pos logit contributions
Little+0.183
Ms+0.176
 difference+0.175
One+0.172
Mrs+0.168
Neg logit contributions
ed-0.191
icked-0.164
 knees-0.158
ped-0.158
driver-0.157
Tokendriver
Token ID26230
Feature activation-3.01e-01

Pos logit contributions
 knees+0.056
ped+0.049
icked+0.047
 under+0.047
ed+0.047
Neg logit contributions
is-0.124
str-0.124
Little-0.122
Each-0.120
One-0.120
Token tightened
Token ID37257
Feature activation-4.80e-02

Pos logit contributions
 knees+0.298
ed+0.284
icked+0.259
ped+0.245
 elbow+0.235
Neg logit contributions
is-0.586
Every-0.579
One-0.578
str-0.570
Each-0.569
Token screws
Token ID23742
Feature activation-1.02e-01

Pos logit contributions
ed+0.051
 knees+0.049
icked+0.047
 hid+0.043
ped+0.041
Neg logit contributions
Little-0.095
Ms-0.094
One-0.092
redible-0.092
Mrs-0.091
Token,
Token ID11
Feature activation-1.07e-01

Pos logit contributions
 knees+0.102
ed+0.095
ped+0.093
 elbow+0.087
icked+0.086
Neg logit contributions
One-0.205
is-0.204
Every-0.203
Little-0.202
Ms-0.201
Token and
Token ID290
Feature activation-2.19e-01

Pos logit contributions
ed+0.095
icked+0.076
 knees+0.074
ped+0.064
 hid+0.056
Neg logit contributions
Little-0.255
Mrs-0.253
Ms-0.250
One-0.249
The-0.247
Token the
Token ID262
Feature activation+1.11e-01

Pos logit contributions
ed+0.335
aces+0.286
driver+0.281
 knees+0.280
ped+0.277
Neg logit contributions
Little-0.403
Ms-0.380
Mrs-0.371
rous-0.360
 rewarding-0.356
Token the
Token ID262
Feature activation+1.05e-01

Pos logit contributions
 knees+0.122
ed+0.116
 hid+0.109
edly+0.108
icked+0.101
Neg logit contributions
One-0.239
Little-0.238
Ms-0.233
rous-0.229
Mrs-0.227
Token wood
Token ID4898
Feature activation-3.27e-02

Pos logit contributions
One+0.184
Ms+0.183
Little+0.183
 difference+0.178
Mrs+0.176
Neg logit contributions
ed-0.172
 knees-0.141
fully-0.136
icked-0.135
ped-0.134
Token and
Token ID290
Feature activation-2.24e-01

Pos logit contributions
ed+0.036
icked+0.031
 knees+0.030
ped+0.030
ked+0.029
Neg logit contributions
 difference-0.062
One-0.062
str-0.061
Every-0.060
Each-0.060
Token his
Token ID465
Feature activation-6.58e-02

Pos logit contributions
ed+0.397
icked+0.332
ously+0.325
ped+0.323
aces+0.313
Neg logit contributions
Little-0.345
One-0.345
The-0.323
Mrs-0.319
Every-0.317
Token screw
Token ID9580
Feature activation-6.64e-02

Pos logit contributions
ed+0.100
fully+0.079
icked+0.078
ped+0.076
 under+0.076
Neg logit contributions
Little-0.118
 difference-0.117
One-0.116
Ms-0.114
 former-0.114
Tokendriver
Token ID26230
Feature activation-3.01e-01

Pos logit contributions
 knees+0.048
ed+0.044
ped+0.043
 under+0.043
edly+0.043
Neg logit contributions
One-0.129
Each-0.124
assion-0.123
str-0.123
Every-0.122
Token to
Token ID284
Feature activation-8.15e-02

Pos logit contributions
ed+0.387
 knees+0.360
ped+0.324
edly+0.315
icked+0.313
Neg logit contributions
One-0.554
Every-0.515
Each-0.514
assion-0.489
 difference-0.487
Token screw
Token ID9580
Feature activation-6.56e-02

Pos logit contributions
ed+0.141
 knees+0.117
driver+0.112
 under+0.112
ped+0.111
Neg logit contributions
Little-0.138
One-0.136
Ms-0.135
rous-0.134
 difference-0.130
Token it
Token ID340
Feature activation-9.10e-02

Pos logit contributions
 knees+0.065
ed+0.061
edly+0.057
ped+0.056
ously+0.056
Neg logit contributions
One-0.125
Little-0.121
The-0.120
Ms-0.119
Ja-0.117
Token together
Token ID1978
Feature activation-2.14e-02

Pos logit contributions
ed+0.116
 knees+0.109
ously+0.106
ped+0.101
 elbow+0.097
Neg logit contributions
One-0.173
Every-0.157
assion-0.154
Each-0.152
 former-0.148
Token.
Token ID13
Feature activation-5.06e-02

Pos logit contributions
ed+0.037
ped+0.034
 knees+0.033
ked+0.031
icked+0.031
Neg logit contributions
One-0.033
Every-0.030
Each-0.029
The-0.028
Ms-0.028
Token a
Token ID257
Feature activation+4.75e-02

Pos logit contributions
Ms+0.047
Little+0.047
rous+0.046
One+0.046
Mrs+0.045
Neg logit contributions
ed-0.029
 knees-0.025
driver-0.024
 under-0.022
ped-0.022
Token hammer
Token ID15554
Feature activation-1.16e-01

Pos logit contributions
Ms+0.081
rous+0.081
icious+0.081
Little+0.080
One+0.079
Neg logit contributions
ed-0.074
 and-0.072
driver-0.070
 knees-0.068
 under-0.068
Token and
Token ID290
Feature activation-2.20e-01

Pos logit contributions
 knees+0.139
ed+0.137
ped+0.126
icked+0.124
 hid+0.122
Neg logit contributions
One-0.205
Mrs-0.203
Little-0.199
 difference-0.198
Ms-0.197
Token a
Token ID257
Feature activation+4.84e-02

Pos logit contributions
ed+0.316
aces+0.295
fully+0.270
ped+0.269
driver+0.267
Neg logit contributions
Little-0.403
Mrs-0.382
Ms-0.378
rous-0.372
Em-0.367
Token screw
Token ID9580
Feature activation-6.61e-02

Pos logit contributions
Ms+0.088
rous+0.088
Little+0.088
One+0.085
icious+0.084
Neg logit contributions
ed-0.072
 and-0.065
driver-0.065
 knees-0.065
 under-0.062
Tokendriver
Token ID26230
Feature activation-3.01e-01

Pos logit contributions
 knees+0.051
ped+0.043
ed+0.043
 toes+0.040
erved+0.038
Neg logit contributions
is-0.130
One-0.129
L-0.124
Em-0.124
Little-0.124
Token.
Token ID13
Feature activation-4.46e-02

Pos logit contributions
 knees+0.370
ed+0.367
 teeth+0.316
 toes+0.315
ped+0.309
Neg logit contributions
One-0.514
Mrs-0.502
Ms-0.498
is-0.497
Em-0.486
Token
Token ID198
Feature activation+2.55e-01

Pos logit contributions
ed+0.074
 and+0.063
 knees+0.063
 under+0.061
driver+0.059
Neg logit contributions
Little-0.077
rous-0.077
Ms-0.076
One-0.075
 difference-0.074
Token
Token ID198
Feature activation+2.54e-01

Pos logit contributions
 former+0.520
 difference+0.493
assion+0.492
 lifetime+0.489
 disl+0.489
Neg logit contributions
 and-0.414
ed-0.407
 in-0.358
 on-0.343
 under-0.340
TokenM
Token ID44
Feature activation+9.15e-02

Pos logit contributions
 difference+0.413
 former+0.400
 disl+0.390
assion+0.380
str+0.377
Neg logit contributions
ed-0.482
 and-0.444
 in-0.421
 under-0.404
 on-0.399
Tokenolly
Token ID5098
Feature activation+3.14e-02

Pos logit contributions
Ms+0.081
Little+0.080
rous+0.075
One+0.075
icious+0.072
Neg logit contributions
ed-0.228
 knees-0.211
driver-0.210
 under-0.205
 and-0.202
Token!
Token ID0
Feature activation-1.72e-02

Pos logit contributions
ed+0.072
driver+0.060
ped+0.057
icked+0.057
fully+0.055
Neg logit contributions
 difference-0.112
Ms-0.107
Little-0.107
 former-0.107
rous-0.106
Token He
Token ID679
Feature activation-1.08e-01

Pos logit contributions
ed+0.024
 knees+0.020
driver+0.020
 under+0.019
 and+0.019
Neg logit contributions
Little-0.034
Ms-0.034
One-0.033
Mrs-0.033
rous-0.033
Token found
Token ID1043
Feature activation-1.30e-02

Pos logit contributions
aces+0.165
ed+0.157
ier+0.153
fully+0.152
driver+0.152
Neg logit contributions
Little-0.162
One-0.161
Ms-0.159
Gr-0.157
Em-0.156
Token a
Token ID257
Feature activation+4.95e-02

Pos logit contributions
ed+0.017
 knees+0.017
driver+0.015
aces+0.015
fully+0.015
Neg logit contributions
Ms-0.024
Little-0.024
rous-0.023
Mrs-0.023
One-0.023
Token screw
Token ID9580
Feature activation-6.69e-02

Pos logit contributions
Little+0.075
Ms+0.075
rous+0.074
One+0.072
icious+0.070
Neg logit contributions
ed-0.089
 and-0.080
 knees-0.080
 under-0.079
driver-0.079
Tokendriver
Token ID26230
Feature activation-3.00e-01

Pos logit contributions
ped+0.069
 knees+0.065
icked+0.063
erved+0.060
ed+0.059
Neg logit contributions
is-0.121
igsaw-0.119
One-0.111
 carn-0.107
Mrs-0.106
Token and
Token ID290
Feature activation-2.21e-01

Pos logit contributions
ed+0.296
 knees+0.276
icked+0.276
ped+0.258
 teeth+0.242
Neg logit contributions
One-0.630
is-0.625
Mrs-0.607
Every-0.596
Ms-0.591
Token gave
Token ID2921
Feature activation-3.16e-03

Pos logit contributions
ed+0.300
aces+0.281
ps+0.279
ped+0.262
fully+0.261
Neg logit contributions
Little-0.395
Mrs-0.389
One-0.375
Ms-0.372
Em-0.368
Token it
Token ID340
Feature activation-8.48e-02

Pos logit contributions
ed+0.004
 knees+0.004
driver+0.004
aces+0.004
 hid+0.003
Neg logit contributions
Little-0.006
Ms-0.006
Mrs-0.006
E-0.006
rous-0.006
Token to
Token ID284
Feature activation-7.80e-02

Pos logit contributions
ed+0.093
 knees+0.083
ped+0.078
ously+0.077
fully+0.076
Neg logit contributions
One-0.173
rous-0.172
Ms-0.171
Mrs-0.168
Each-0.166
Token her
Token ID607
Feature activation-5.31e-02

Pos logit contributions
ed+0.104
driver+0.084
 knees+0.079
aces+0.078
icked+0.074
Neg logit contributions
Little-0.167
Ms-0.165
Mrs-0.163
One-0.162
 difference-0.158
Token was
Token ID373
Feature activation-7.65e-02

Pos logit contributions
ed+0.006
driver+0.005
 knees+0.005
fully+0.005
aces+0.004
Neg logit contributions
Little-0.008
Ms-0.007
One-0.007
 difference-0.007
Every-0.007
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
ed+0.124
 knees+0.107
driver+0.105
 under+0.103
 and+0.103
Neg logit contributions
Ms-0.134
Little-0.134
rous-0.131
One-0.130
Mrs-0.127
Token start
Token ID923
Feature activation-5.39e-03

Pos logit contributions
Ms+0.194
Little+0.191
rous+0.189
Mrs+0.182
One+0.181
Neg logit contributions
ed-0.194
 and-0.184
 knees-0.177
 under-0.176
 in-0.172
Token of
Token ID286
Feature activation-7.05e-03

Pos logit contributions
ed+0.008
driver+0.007
 knees+0.007
 hid+0.006
ped+0.006
Neg logit contributions
Little-0.010
One-0.010
 difference-0.010
rous-0.010
Ms-0.010
Token a
Token ID257
Feature activation+5.33e-02

Pos logit contributions
ed+0.011
 knees+0.009
driver+0.009
 under+0.008
 hid+0.008
Neg logit contributions
Ms-0.013
Little-0.013
One-0.013
rous-0.013
 difference-0.013
Token new
Token ID649
Feature activation+2.37e-01

Pos logit contributions
reci+0.090
rous+0.088
Ms+0.085
Ta+0.083
 disl+0.082
Neg logit contributions
 and-0.091
 in-0.081
 knees-0.077
 under-0.077
 as-0.077
Token tradition
Token ID6761
Feature activation-2.46e-03

Pos logit contributions
Ms+0.359
Little+0.357
rous+0.349
One+0.345
Mrs+0.338
Neg logit contributions
ed-0.441
 knees-0.388
driver-0.380
 under-0.376
 and-0.375
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
ed+0.003
ped+0.002
driver+0.002
icked+0.002
 knees+0.002
Neg logit contributions
 difference-0.005
 lifetime-0.005
Little-0.005
Ms-0.005
 rewarding-0.005
Token every
Token ID790
Feature activation+4.86e-02

Pos logit contributions
ed+0.112
driver+0.087
 knees+0.083
icked+0.082
ped+0.077
Neg logit contributions
Little-0.247
 difference-0.247
rous-0.241
 memorable-0.240
 rewarding-0.239
Token day
Token ID1110
Feature activation+3.84e-02

Pos logit contributions
Ms+0.065
Little+0.064
rous+0.062
One+0.062
Mrs+0.061
Neg logit contributions
ed-0.099
 knees-0.088
driver-0.087
 and-0.086
 under-0.086
Token he
Token ID339
Feature activation-1.68e-01

Pos logit contributions
Little+0.081
 lifetime+0.076
 difference+0.075
Mrs+0.075
Ms+0.075
Neg logit contributions
ed-0.045
driver-0.042
 knees-0.041
ped-0.040
ps-0.038
Token the
Token ID262
Feature activation+1.06e-01

Pos logit contributions
rous+0.025
Little+0.025
Ms+0.025
 difference+0.025
One+0.025
Neg logit contributions
ed-0.020
driver-0.017
 knees-0.017
fully-0.016
 under-0.015
Token seeds
Token ID11904
Feature activation-1.37e-01

Pos logit contributions
Ms+0.213
Little+0.212
rous+0.208
One+0.206
Mrs+0.203
Neg logit contributions
ed-0.147
 knees-0.123
 and-0.119
driver-0.119
 under-0.118
Token around
Token ID1088
Feature activation+5.20e-02

Pos logit contributions
ed+0.134
 knees+0.130
driver+0.124
 teeth+0.121
ped+0.113
Neg logit contributions
One-0.270
rous-0.268
 difference-0.267
Every-0.263
Each-0.260
Token.
Token ID13
Feature activation-4.55e-02

Pos logit contributions
Little+0.096
Ms+0.092
rous+0.090
One+0.089
Mrs+0.088
Neg logit contributions
ed-0.075
driver-0.068
 knees-0.066
aces-0.065
icked-0.064
Token
Token ID220
Feature activation+2.20e-01

Pos logit contributions
ed+0.088
 and+0.080
 knees+0.074
 under+0.072
driver+0.072
Neg logit contributions
icious-0.069
rous-0.069
 difference-0.067
 rewarding-0.066
Ms-0.065
Token\n
Token ID198
Feature activation+2.46e-01

Pos logit contributions
rous+0.329
 difference+0.305
 lifetime+0.304
icious+0.301
Little+0.300
Neg logit contributions
ed-0.451
 and-0.408
driver-0.381
 in-0.374
 under-0.369
Token\n
Token ID198
Feature activation+2.46e-01

Pos logit contributions
 difference+0.552
rous+0.522
 former+0.501
icious+0.500
 rightful+0.499
Neg logit contributions
 and-0.360
ed-0.343
 in-0.287
 on-0.282
 he-0.277
TokenThey
Token ID2990
Feature activation-9.23e-02

Pos logit contributions
 difference+0.372
rous+0.322
igsaw+0.297
reci+0.287
itionally+0.287
Neg logit contributions
ed-0.488
 and-0.478
 ran-0.459
driver-0.448
 in-0.435
Token had
Token ID550
Feature activation-8.57e-02

Pos logit contributions
ed+0.123
driver+0.104
 knees+0.101
fully+0.094
aces+0.093
Neg logit contributions
Little-0.184
One-0.180
Ms-0.178
rous-0.177
str-0.177
Token so
Token ID523
Feature activation+5.05e-02

Pos logit contributions
ed+0.115
 knees+0.097
driver+0.095
 under+0.091
fully+0.088
Neg logit contributions
Ms-0.174
Little-0.172
One-0.170
rous-0.170
Mrs-0.167
Token much
Token ID881
Feature activation+1.15e-01

Pos logit contributions
Little+0.065
Ms+0.065
rous+0.063
One+0.063
Mrs+0.061
Neg logit contributions
ed-0.105
 knees-0.095
driver-0.093
 under-0.091
 and-0.090
Token garden
Token ID11376
Feature activation-1.71e-02

Pos logit contributions
Ms+0.196
Little+0.195
Mrs+0.191
icious+0.186
ograph+0.186
Neg logit contributions
ed-0.193
 and-0.179
 under-0.175
 in-0.172
 knees-0.165
Token,
Token ID11
Feature activation-1.11e-01

Pos logit contributions
ed+0.011
 and+0.009
 under+0.008
 in+0.007
 knees+0.007
Neg logit contributions
Ms-0.048
Little-0.047
icious-0.046
Mrs-0.046
 former-0.046
Token there
Token ID612
Feature activation+6.35e-02

Pos logit contributions
ed+0.179
driver+0.163
 knees+0.154
ously+0.154
ped+0.149
Neg logit contributions
Little-0.192
One-0.188
rous-0.181
Ms-0.180
Every-0.179
Token was
Token ID373
Feature activation-7.96e-02

Pos logit contributions
Little+0.113
One+0.113
Every+0.110
rous+0.106
Ms+0.106
Neg logit contributions
ed-0.098
driver-0.094
 knees-0.087
ked-0.082
fully-0.081
Token a
Token ID257
Feature activation+5.06e-02

Pos logit contributions
ed+0.112
driver+0.099
 knees+0.095
fully+0.089
ously+0.086
Neg logit contributions
One-0.159
Ms-0.157
Little-0.157
Mrs-0.152
rous-0.151
Token big
Token ID1263
Feature activation+2.32e-01

Pos logit contributions
rous+0.085
Ms+0.084
reci+0.081
Each+0.080
Little+0.080
Neg logit contributions
 and-0.079
ed-0.078
 under-0.076
 in-0.075
 knees-0.073
Token bush
Token ID24484
Feature activation-3.95e-02

Pos logit contributions
 difference+0.357
Little+0.342
rous+0.339
Ms+0.330
One+0.325
Neg logit contributions
ed-0.460
driver-0.392
ped-0.377
icked-0.376
fully-0.376
Token.
Token ID13
Feature activation-4.74e-02

Pos logit contributions
ed+0.050
driver+0.041
 knees+0.040
ped+0.038
fully+0.036
Neg logit contributions
One-0.083
Ms-0.082
Little-0.082
rous-0.081
Mrs-0.080
Token A
Token ID317
Feature activation+1.52e-01

Pos logit contributions
ed+0.077
 and+0.070
 under+0.066
 knees+0.065
 in+0.064
Neg logit contributions
rous-0.084
 difference-0.083
Ms-0.081
Little-0.081
 former-0.081
Token little
Token ID1310
Feature activation+1.90e-01

Pos logit contributions
rous+0.263
Ms+0.252
 former+0.249
reci+0.247
icious+0.247
Neg logit contributions
ed-0.240
 and-0.234
 in-0.220
 under-0.220
 knees-0.213
Token girl
Token ID2576
Feature activation-4.14e-02

Pos logit contributions
 difference+0.220
Little+0.214
rous+0.211
Girl+0.208
One+0.205
Neg logit contributions
ed-0.404
driver-0.383
 knees-0.366
fully-0.362
icked-0.359
Token for
Token ID329
Feature activation-3.26e-02

Pos logit contributions
 knees+0.055
ed+0.054
driver+0.048
aces+0.047
 hid+0.046
Neg logit contributions
 difference-0.106
rous-0.099
 bou-0.099
Little-0.098
 sym-0.097
Token the
Token ID262
Feature activation+1.08e-01

Pos logit contributions
ed+0.043
 knees+0.035
driver+0.034
 under+0.034
 and+0.034
Neg logit contributions
Ms-0.068
Little-0.067
rous-0.066
One-0.066
Mrs-0.065
Token plant
Token ID4618
Feature activation-3.82e-02

Pos logit contributions
Ms+0.215
Little+0.214
rous+0.208
One+0.206
Mrs+0.205
Neg logit contributions
ed-0.152
 knees-0.129
 and-0.128
 under-0.125
driver-0.119
Token.
Token ID13
Feature activation-4.71e-02

Pos logit contributions
ed+0.053
driver+0.048
 knees+0.047
ped+0.045
ked+0.043
Neg logit contributions
 difference-0.068
One-0.064
 rewarding-0.064
rous-0.063
Ms-0.063
Token\n
Token ID198
Feature activation+2.53e-01

Pos logit contributions
ed+0.066
 and+0.061
 under+0.051
driver+0.051
 knees+0.050
Neg logit contributions
rous-0.101
 lifetime-0.098
icious-0.097
 difference-0.097
Little-0.096
Token\n
Token ID198
Feature activation+2.54e-01

Pos logit contributions
 former+0.568
icious+0.534
 lifetime+0.518
 difference+0.517
rous+0.516
Neg logit contributions
 and-0.420
ed-0.359
 under-0.340
 in-0.332
 on-0.322
TokenL
Token ID43
Feature activation+1.90e-01

Pos logit contributions
 difference+0.353
rous+0.336
 former+0.320
 lifetime+0.289
str+0.285
Neg logit contributions
 and-0.533
ed-0.525
 ran-0.511
 in-0.498
 under-0.484
Tokenily
Token ID813
Feature activation-1.86e-02

Pos logit contributions
Ms+0.393
Little+0.390
One+0.383
rous+0.383
Mrs+0.375
Neg logit contributions
ed-0.244
 knees-0.210
driver-0.197
 under-0.190
 and-0.188
Token gave
Token ID2921
Feature activation-3.56e-03

Pos logit contributions
ed+0.026
fully+0.026
driver+0.026
ped+0.024
aces+0.024
Neg logit contributions
Little-0.032
Ms-0.031
One-0.030
Em-0.030
Mrs-0.029
Token the
Token ID262
Feature activation+1.09e-01

Pos logit contributions
ed+0.005
 knees+0.004
driver+0.004
 under+0.004
 and+0.004
Neg logit contributions
Ms-0.007
Little-0.007
rous-0.007
One-0.007
Mrs-0.007
Token plant
Token ID4618
Feature activation-3.75e-02

Pos logit contributions
Little+0.231
Ms+0.229
rous+0.224
Mrs+0.221
One+0.221
Neg logit contributions
ed-0.142
 and-0.125
 knees-0.118
 under-0.116
 in-0.112
Token a
Token ID257
Feature activation+4.45e-02

Pos logit contributions
ed+0.128
 knees+0.118
driver+0.114
 wouldn+0.104
 under+0.104
Neg logit contributions
Ms-0.199
One-0.193
Mrs-0.193
rous-0.193
Little-0.191
Token big
Token ID1263
Feature activation+2.28e-01

Pos logit contributions
icious+0.077
itionally+0.076
Ms+0.076
Little+0.076
rous+0.075
Neg logit contributions
ed-0.072
 and-0.069
 under-0.063
 in-0.063
driver-0.062
Token hug
Token ID16225
Feature activation+9.90e-03

Pos logit contributions
Ms+0.370
Little+0.369
rous+0.362
One+0.358
 difference+0.354
Neg logit contributions
ed-0.396
 knees-0.345
driver-0.336
 under-0.333
 and-0.328
Token.
Token ID13
Feature activation-4.05e-02

Pos logit contributions
Ms+0.017
Mrs+0.017
Little+0.016
Today+0.016
One+0.016
Neg logit contributions
 knees-0.014
 hid-0.011
ed-0.011
ped-0.011
driver-0.010
Token\n
Token ID198
Feature activation+2.54e-01

Pos logit contributions
ed+0.075
 and+0.068
 knees+0.063
 under+0.062
driver+0.061
Neg logit contributions
rous-0.064
icious-0.063
Little-0.062
 rewarding-0.061
One-0.061
Token\n
Token ID198
Feature activation+2.53e-01

Pos logit contributions
 former+0.528
 difference+0.514
icious+0.499
rous+0.499
 rewarding+0.489
Neg logit contributions
 and-0.414
ed-0.373
 under-0.335
 in-0.335
 on-0.325
TokenFrom
Token ID4863
Feature activation+8.64e-02

Pos logit contributions
 difference+0.339
rous+0.318
 former+0.285
itionally+0.278
 lifetime+0.278
Neg logit contributions
 and-0.515
ed-0.500
 ran-0.480
fully-0.475
 in-0.467
Token that
Token ID326
Feature activation+8.51e-02

Pos logit contributions
Ms+0.181
Little+0.180
rous+0.177
One+0.176
Mrs+0.173
Neg logit contributions
ed-0.111
 knees-0.092
driver-0.089
 and-0.088
 under-0.088
Token day
Token ID1110
Feature activation+3.27e-02

Pos logit contributions
Ms+0.146
icious+0.144
rous+0.141
Little+0.139
redible+0.139
Neg logit contributions
ed-0.142
 and-0.139
 under-0.128
 in-0.128
 on-0.127
Token on
Token ID319
Feature activation-7.73e-02

Pos logit contributions
Little+0.078
One+0.076
Ms+0.075
Every+0.073
Today+0.072
Neg logit contributions
 knees-0.031
ed-0.026
driver-0.024
 hid-0.023
 teeth-0.022
Token,
Token ID11
Feature activation-1.14e-01

Pos logit contributions
ed+0.119
ked+0.107
 knees+0.101
icked+0.099
driver+0.099
Neg logit contributions
Little-0.146
Ms-0.134
Em-0.134
One-0.132
igsaw-0.132
Token yellow
Token ID7872
Feature activation+1.82e-01

Pos logit contributions
Little+0.059
Ms+0.058
rous+0.057
icious+0.057
 lifetime+0.057
Neg logit contributions
ed-0.024
 and-0.021
 knees-0.020
 under-0.019
 in-0.018
Token.
Token ID13
Feature activation-4.06e-02

Pos logit contributions
One+0.318
Ms+0.316
Little+0.315
 difference+0.305
rous+0.305
Neg logit contributions
ed-0.285
 knees-0.241
driver-0.236
 under-0.236
 hid-0.233
Token\n
Token ID198
Feature activation+2.59e-01

Pos logit contributions
ed+0.077
 and+0.067
 knees+0.066
 under+0.063
 in+0.062
Neg logit contributions
Little-0.063
rous-0.063
Ms-0.063
icious-0.063
 difference-0.062
Token\n
Token ID198
Feature activation+2.58e-01

Pos logit contributions
 former+0.570
 difference+0.543
icious+0.543
ograph+0.535
 lifetime+0.532
Neg logit contributions
 and-0.416
ed-0.371
 in-0.350
 under-0.337
 on-0.320
TokenShe
Token ID3347
Feature activation-5.18e-02

Pos logit contributions
 difference+0.366
 former+0.334
rous+0.332
 lifetime+0.324
ograph+0.323
Neg logit contributions
ed-0.521
 and-0.487
 in-0.465
 knees-0.462
 ran-0.462
Token painted
Token ID13055
Feature activation+8.81e-02

Pos logit contributions
ed+0.069
driver+0.062
fully+0.058
 knees+0.058
ped+0.057
Neg logit contributions
One-0.098
Little-0.097
Ms-0.095
Em-0.094
rous-0.094
Token for
Token ID329
Feature activation-2.85e-02

Pos logit contributions
Little+0.189
Ms+0.189
rous+0.187
 difference+0.183
One+0.183
Neg logit contributions
ed-0.110
 and-0.091
 under-0.087
 knees-0.087
driver-0.084
Token a
Token ID257
Feature activation+5.36e-02

Pos logit contributions
ed+0.036
 knees+0.032
 and+0.031
 under+0.030
driver+0.030
Neg logit contributions
Little-0.059
rous-0.058
Ms-0.058
 difference-0.057
One-0.056
Token long
Token ID890
Feature activation+1.25e-01

Pos logit contributions
rous+0.102
reci+0.096
icious+0.096
 disl+0.094
Little+0.094
Neg logit contributions
 and-0.085
 in-0.076
 under-0.074
 on-0.074
ed-0.074
Token time
Token ID640
Feature activation+1.10e-02

Pos logit contributions
Ms+0.172
Little+0.171
rous+0.167
One+0.164
Mrs+0.161
Neg logit contributions
ed-0.247
 knees-0.221
driver-0.216
 and-0.215
 under-0.214
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
One+0.025
Little+0.024
Ms+0.024
rous+0.024
 difference+0.024
Neg logit contributions
ed-0.011
ped-0.009
 knees-0.009
driver-0.009
 hid-0.008
Token loyal
Token ID9112
Feature activation+1.40e-01

Pos logit contributions
rous+0.087
Ms+0.086
 former+0.082
reci+0.082
icious+0.082
Neg logit contributions
 and-0.079
ed-0.075
 under-0.074
 in-0.072
 knees-0.071
Token underground
Token ID11447
Feature activation+4.31e-02

Pos logit contributions
 difference+0.225
Ms+0.221
Little+0.216
One+0.211
Mrs+0.209
Neg logit contributions
ed-0.252
ped-0.217
driver-0.213
icked-0.211
 knees-0.211
Token animal
Token ID5044
Feature activation-2.35e-02

Pos logit contributions
 difference+0.063
One+0.063
Little+0.062
Ms+0.062
rous+0.062
Neg logit contributions
ed-0.079
driver-0.070
 knees-0.068
ped-0.067
icked-0.065
Token.
Token ID13
Feature activation-4.62e-02

Pos logit contributions
ed+0.041
driver+0.038
 knees+0.036
ped+0.034
fully+0.034
Neg logit contributions
One-0.037
Little-0.036
Ms-0.035
rous-0.034
Every-0.034
Token His
Token ID2399
Feature activation+2.01e-02

Pos logit contributions
ed+0.080
 knees+0.070
 and+0.070
driver+0.068
 under+0.068
Neg logit contributions
Ms-0.075
Little-0.074
rous-0.074
 difference-0.072
One-0.071
Token name
Token ID1438
Feature activation+9.43e-02

Pos logit contributions
Ms+0.029
Little+0.029
rous+0.029
One+0.028
 difference+0.028
Neg logit contributions
ed-0.038
 knees-0.034
driver-0.033
 under-0.033
 and-0.032
Token was
Token ID373
Feature activation-7.71e-02

Pos logit contributions
Little+0.147
One+0.142
Ms+0.139
Every+0.138
This+0.138
Neg logit contributions
 knees-0.161
driver-0.158
ed-0.146
ously-0.144
fully-0.144
Token Tom
Token ID4186
Feature activation+2.34e-02

Pos logit contributions
ed+0.089
 under+0.073
 and+0.071
 in+0.069
 hid+0.068
Neg logit contributions
 difference-0.170
rous-0.169
Little-0.167
 former-0.167
Ms-0.166
Token.
Token ID13
Feature activation-4.54e-02

Pos logit contributions
One+0.054
Ms+0.053
Little+0.053
Mrs+0.052
 difference+0.051
Neg logit contributions
ed-0.026
 knees-0.021
driver-0.020
ped-0.018
fully-0.018
Token Tom
Token ID4186
Feature activation+2.24e-02

Pos logit contributions
ed+0.044
 and+0.038
 knees+0.034
 under+0.034
driver+0.034
Neg logit contributions
rous-0.108
Little-0.108
Ms-0.107
Gr-0.105
 former-0.105
Token wanted
Token ID2227
Feature activation-1.13e-02

Pos logit contributions
One+0.034
Every+0.031
str+0.030
Little+0.030
Love+0.030
Neg logit contributions
ed-0.040
driver-0.040
fully-0.038
ously-0.037
 knees-0.037
Token show
Token ID905
Feature activation+5.26e-02

Pos logit contributions
ed+0.393
driver+0.327
ped+0.326
fully+0.322
 knees+0.322
Neg logit contributions
Little-0.344
One-0.338
rous-0.337
Ms-0.334
 difference-0.329
Token you
Token ID345
Feature activation+7.41e-03

Pos logit contributions
Little+0.075
Ms+0.074
rous+0.072
One+0.072
Mrs+0.070
Neg logit contributions
ed-0.102
driver-0.090
 knees-0.089
 under-0.087
 and-0.086
Token."
Token ID526
Feature activation+5.19e-02

Pos logit contributions
Ms+0.013
Little+0.013
rous+0.012
One+0.012
 difference+0.012
Neg logit contributions
ed-0.012
 knees-0.011
driver-0.010
 under-0.010
 and-0.010
Token\n
Token ID198
Feature activation+2.59e-01

Pos logit contributions
Ms+0.088
Little+0.088
rous+0.086
One+0.085
 difference+0.084
Neg logit contributions
ed-0.088
 knees-0.076
driver-0.074
 and-0.074
 under-0.073
Token\n
Token ID198
Feature activation+2.59e-01

Pos logit contributions
 former+0.556
rous+0.521
 lifetime+0.514
ograph+0.514
 difference+0.513
Neg logit contributions
 and-0.424
ed-0.396
 in-0.354
 under-0.350
 he-0.347
TokenTom
Token ID13787
Feature activation+5.10e-02

Pos logit contributions
rous+0.313
 difference+0.307
 former+0.290
Ms+0.289
reci+0.287
Neg logit contributions
ed-0.573
 and-0.521
driver-0.500
 under-0.499
 knees-0.499
Token runs
Token ID4539
Feature activation-1.04e-01

Pos logit contributions
One+0.079
Little+0.078
Ms+0.075
V+0.075
rig+0.075
Neg logit contributions
fully-0.083
ed-0.083
edly-0.077
ped-0.075
driver-0.072
Token to
Token ID284
Feature activation-7.16e-02

Pos logit contributions
icked+0.144
aces+0.139
 knees+0.136
ped+0.133
 teeth+0.133
Neg logit contributions
One-0.200
Ms-0.182
Mrs-0.181
L-0.181
V-0.180
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
ed+0.108
 knees+0.089
driver+0.089
 under+0.085
 and+0.085
Neg logit contributions
Ms-0.136
Little-0.136
rous-0.132
One-0.132
Mrs-0.130
Token slide
Token ID10649
Feature activation-6.23e-02

Pos logit contributions
Ms+0.226
Little+0.221
rous+0.220
Mrs+0.214
One+0.212
Neg logit contributions
ed-0.178
 knees-0.156
 and-0.153
 under-0.149
driver-0.148
Token and
Token ID290
Feature activation-2.10e-01

Pos logit contributions
ed+0.069
ped+0.064
icked+0.062
driver+0.057
 knees+0.057
Neg logit contributions
One-0.135
Every-0.126
Today-0.125
Each-0.123
Little-0.122
Token was
Token ID373
Feature activation-8.29e-02

Pos logit contributions
ed+0.026
driver+0.025
 knees+0.023
fully+0.022
aces+0.021
Neg logit contributions
One-0.038
Little-0.038
rous-0.037
str-0.037
Ms-0.036
Token so
Token ID523
Feature activation+5.33e-02

Pos logit contributions
ed+0.113
 knees+0.095
driver+0.093
 under+0.087
 and+0.087
Neg logit contributions
Ms-0.167
Little-0.166
rous-0.164
One-0.164
Mrs-0.160
Token graceful
Token ID44363
Feature activation+2.09e-01

Pos logit contributions
Ms+0.092
 difference+0.090
icious+0.089
Little+0.088
Mrs+0.086
Neg logit contributions
ed-0.093
 and-0.083
 in-0.080
 under-0.080
 ran-0.077
Token that
Token ID326
Feature activation+8.41e-02

Pos logit contributions
 beautiful+0.229
 wonderful+0.227
Little+0.225
is+0.225
rous+0.221
Neg logit contributions
driver-0.396
 elbow-0.362
 knees-0.361
icked-0.348
ps-0.335
Token all
Token ID477
Feature activation-6.30e-02

Pos logit contributions
Little+0.187
One+0.185
Ms+0.184
rous+0.182
 difference+0.178
Neg logit contributions
ed-0.097
driver-0.081
 knees-0.079
 under-0.070
 and-0.070
Token the
Token ID262
Feature activation+1.11e-01

Pos logit contributions
ed+0.104
driver+0.094
 knees+0.092
icked+0.082
 hid+0.081
Neg logit contributions
Little-0.110
rous-0.108
Ms-0.106
 difference-0.105
One-0.104
Token animals
Token ID4695
Feature activation-1.23e-01

Pos logit contributions
Ms+0.239
igsaw+0.236
icious+0.235
Mrs+0.231
rous+0.231
Neg logit contributions
ed-0.130
 and-0.126
 in-0.113
 under-0.112
 knees-0.111
Token who
Token ID508
Feature activation+6.34e-03

Pos logit contributions
driver+0.152
 knees+0.138
iously+0.132
 teeth+0.129
ed+0.126
Neg logit contributions
reci-0.239
rous-0.229
str-0.228
 wonderful-0.220
Little-0.216
Token passed
Token ID3804
Feature activation-1.96e-02

Pos logit contributions
Little+0.011
rous+0.011
reci+0.011
Love+0.011
One+0.011
Neg logit contributions
ed-0.009
 knees-0.009
driver-0.009
fully-0.008
iously-0.008
Token by
Token ID416
Feature activation+4.85e-02

Pos logit contributions
ed+0.023
driver+0.021
 knees+0.021
 under+0.018
icked+0.018
Neg logit contributions
Little-0.041
rous-0.041
Ms-0.041
One-0.040
Mrs-0.039
Token stopped
Token ID5025
Feature activation-7.08e-03

Pos logit contributions
Little+0.106
Ms+0.105
One+0.105
rous+0.105
 difference+0.102
Neg logit contributions
ed-0.055
driver-0.046
 knees-0.044
 under-0.040
 hid-0.039
Token\n
Token ID198
Feature activation+2.73e-01

Pos logit contributions
 former+0.643
icious+0.631
 lifetime+0.620
rous+0.616
assion+0.603
Neg logit contributions
 and-0.413
 he-0.276
 under-0.276
 in-0.273
ed-0.273
Token"
Token ID1
Feature activation+7.63e-02

Pos logit contributions
 difference+0.562
 former+0.545
 sym+0.522
 lifetime+0.515
igsaw+0.512
Neg logit contributions
 and-0.420
ed-0.385
 ran-0.356
 in-0.348
 on-0.340
TokenBe
Token ID3856
Feature activation+1.22e-01

Pos logit contributions
 difference+0.101
rous+0.096
rig+0.096
igsaw+0.092
str+0.089
Neg logit contributions
ed-0.159
 and-0.146
driver-0.143
 on-0.138
fully-0.137
Token gentle
Token ID10296
Feature activation+2.04e-01

Pos logit contributions
Ms+0.205
Little+0.205
rous+0.203
 difference+0.203
 sym+0.200
Neg logit contributions
ed-0.204
 and-0.179
 under-0.175
 in-0.175
 knees-0.171
Token with
Token ID351
Feature activation+3.13e-02

Pos logit contributions
Little+0.404
Ms+0.399
One+0.395
Mrs+0.388
rous+0.379
Neg logit contributions
ed-0.284
 knees-0.245
driver-0.224
 wouldn-0.224
icked-0.224
Token the
Token ID262
Feature activation+1.18e-01

Pos logit contributions
Little+0.054
Ms+0.053
rous+0.053
 difference+0.052
One+0.052
Neg logit contributions
ed-0.050
 and-0.046
 knees-0.044
 under-0.043
driver-0.043
Token stamps
Token ID25560
Feature activation-8.89e-02

Pos logit contributions
Little+0.216
Ms+0.216
rous+0.214
One+0.207
icious+0.204
Neg logit contributions
ed-0.178
 and-0.166
 knees-0.157
 under-0.154
driver-0.150
Token and
Token ID290
Feature activation-2.08e-01

Pos logit contributions
ed+0.152
 knees+0.138
driver+0.129
 under+0.126
ped+0.124
Neg logit contributions
Ms-0.142
Little-0.141
One-0.138
rous-0.137
 difference-0.137
Token stickers
Token ID28568
Feature activation-9.03e-02

Pos logit contributions
ed+0.344
 knees+0.298
driver+0.291
 under+0.288
 and+0.287
Neg logit contributions
Ms-0.356
Little-0.355
rous-0.347
One-0.344
Mrs-0.338
Token,"
Token ID553
Feature activation+4.86e-02

Pos logit contributions
ed+0.145
 knees+0.133
 under+0.121
driver+0.121
 hid+0.120
Neg logit contributions
Ms-0.152
Little-0.151
One-0.148
rous-0.146
Mrs-0.145
Token Mom
Token ID11254
Feature activation+7.61e-02

Pos logit contributions
Ms+0.106
rous+0.106
icious+0.104
Little+0.104
 lifetime+0.104
Neg logit contributions
ed-0.055
 and-0.050
 knees-0.049
 under-0.046
 in-0.044
Token glad
Token ID9675
Feature activation+2.05e-01

Pos logit contributions
Ms+0.091
Little+0.086
One+0.084
 difference+0.084
icious+0.083
Neg logit contributions
ed-0.085
 and-0.074
 under-0.073
 in-0.072
 hid-0.070
Token to
Token ID284
Feature activation-8.36e-02

Pos logit contributions
Little+0.330
Mrs+0.319
One+0.311
 difference+0.303
rous+0.300
Neg logit contributions
ed-0.337
driver-0.328
ps-0.304
aces-0.301
icked-0.297
Token see
Token ID766
Feature activation+6.00e-02

Pos logit contributions
ed+0.130
 knees+0.113
 and+0.113
 under+0.111
driver+0.109
Neg logit contributions
Ms-0.153
Little-0.151
rous-0.149
Mrs-0.145
One-0.144
Token it
Token ID340
Feature activation-9.15e-02

Pos logit contributions
Ms+0.144
Little+0.143
rous+0.142
One+0.140
 difference+0.138
Neg logit contributions
ed-0.058
 knees-0.044
 and-0.043
driver-0.043
 under-0.042
Token again
Token ID757
Feature activation-5.00e-02

Pos logit contributions
ed+0.119
driver+0.105
 knees+0.102
ped+0.093
icked+0.092
Neg logit contributions
 difference-0.186
One-0.183
Little-0.181
rous-0.177
 future-0.174
Token!
Token ID0
Feature activation-1.96e-02

Pos logit contributions
ed+0.051
 knees+0.040
driver+0.038
 under+0.036
 and+0.035
Neg logit contributions
Little-0.118
Ms-0.118
rous-0.116
One-0.116
Mrs-0.114
Token Tommy
Token ID19919
Feature activation+3.62e-03

Pos logit contributions
ed+0.027
 and+0.022
 knees+0.022
 under+0.021
driver+0.021
Neg logit contributions
Little-0.040
rous-0.040
Ms-0.040
One-0.039
 difference-0.039
Token had
Token ID550
Feature activation-8.75e-02

Pos logit contributions
Little+0.007
Ms+0.006
Mrs+0.006
Em+0.006
One+0.006
Neg logit contributions
ed-0.005
driver-0.005
aces-0.005
fully-0.005
 knees-0.004
Token found
Token ID1043
Feature activation-1.76e-02

Pos logit contributions
ed+0.116
 knees+0.096
driver+0.094
 under+0.092
 and+0.091
Neg logit contributions
Ms-0.180
Little-0.179
rous-0.176
One-0.175
Mrs-0.173
Token the
Token ID262
Feature activation+1.07e-01

Pos logit contributions
ed+0.023
 knees+0.021
driver+0.019
 under+0.018
fully+0.018
Neg logit contributions
Ms-0.035
rous-0.035
Little-0.035
Mrs-0.034
 difference-0.034
Token box
Token ID3091
Feature activation-8.42e-02

Pos logit contributions
Ms+0.199
Little+0.196
rous+0.192
One+0.190
Mrs+0.189
Neg logit contributions
ed-0.164
 knees-0.142
 under-0.141
 and-0.140
driver-0.139
Token She
Token ID1375
Feature activation-6.38e-02

Pos logit contributions
ed+0.065
 knees+0.055
driver+0.055
 under+0.054
 and+0.052
Neg logit contributions
Ms-0.080
Little-0.080
One-0.078
rous-0.078
Mrs-0.077
Token picked
Token ID6497
Feature activation+1.06e-02

Pos logit contributions
ed+0.085
driver+0.075
 under+0.068
ps+0.066
fully+0.066
Neg logit contributions
One-0.130
Little-0.128
Ms-0.125
Mrs-0.124
Every-0.123
Token it
Token ID340
Feature activation-8.66e-02

Pos logit contributions
Ms+0.023
rous+0.023
Little+0.023
One+0.023
Mrs+0.022
Neg logit contributions
ed-0.011
 knees-0.010
driver-0.010
 under-0.009
 hid-0.009
Token up
Token ID510
Feature activation+3.29e-02

Pos logit contributions
ed+0.123
driver+0.122
aces+0.115
ously+0.114
 elbow+0.111
Neg logit contributions
Little-0.152
One-0.151
Ms-0.150
rous-0.148
Mrs-0.145
Token.
Token ID13
Feature activation-4.18e-02

Pos logit contributions
One+0.073
The+0.071
Little+0.071
Mrs+0.070
Every+0.069
Neg logit contributions
driver-0.036
ed-0.034
icked-0.031
aces-0.030
ped-0.029
Token She
Token ID1375
Feature activation-6.07e-02

Pos logit contributions
ed+0.066
 knees+0.056
driver+0.055
 under+0.054
 and+0.054
Neg logit contributions
Ms-0.075
Little-0.075
rous-0.073
One-0.073
Mrs-0.072
Token liked
Token ID8288
Feature activation+2.85e-02

Pos logit contributions
ed+0.088
driver+0.082
fully+0.076
ps+0.075
aces+0.073
Neg logit contributions
One-0.109
Little-0.109
Every-0.105
Mrs-0.104
Love-0.104
Token how
Token ID703
Feature activation+5.84e-02

Pos logit contributions
Ms+0.057
rous+0.057
Little+0.057
One+0.056
Mrs+0.056
Neg logit contributions
ed-0.039
 knees-0.031
driver-0.031
 under-0.029
icked-0.029
Token it
Token ID340
Feature activation-8.61e-02

Pos logit contributions
Ms+0.133
Little+0.133
rous+0.131
One+0.130
Mrs+0.128
Neg logit contributions
ed-0.064
 knees-0.051
driver-0.049
 under-0.048
 and-0.047
Token looked
Token ID3114
Feature activation-1.17e-01

Pos logit contributions
driver+0.113
ed+0.112
ously+0.110
fully+0.106
 knees+0.105
Neg logit contributions
Little-0.161
Love-0.151
str-0.150
One-0.149
rous-0.147
Token and
Token ID290
Feature activation-2.19e-01

Pos logit contributions
ed+0.130
driver+0.120
 knees+0.113
ped+0.102
icked+0.097
Neg logit contributions
rous-0.253
One-0.250
 difference-0.249
Ms-0.249
Little-0.247
Token.
Token ID13
Feature activation-3.50e-02

Pos logit contributions
ed+0.206
 knees+0.177
driver+0.171
ped+0.169
fully+0.157
Neg logit contributions
One-0.292
 difference-0.285
Ms-0.283
 rewarding-0.277
rous-0.276
Token He
Token ID679
Feature activation-1.02e-01

Pos logit contributions
ed+0.043
 and+0.036
 knees+0.035
driver+0.034
 under+0.034
Neg logit contributions
rous-0.076
Ms-0.075
Little-0.075
icious-0.073
 difference-0.073
Token ran
Token ID4966
Feature activation-1.46e-01

Pos logit contributions
ed+0.136
driver+0.129
fully+0.127
ier+0.118
aces+0.115
Neg logit contributions
One-0.192
Ms-0.187
E-0.185
Mrs-0.185
L-0.183
Token towards
Token ID3371
Feature activation+7.94e-03

Pos logit contributions
aces+0.219
 elbow+0.206
 knees+0.200
icked+0.199
ongs+0.186
Neg logit contributions
Ms-0.249
Mrs-0.248
One-0.248
L-0.242
V-0.241
Token the
Token ID262
Feature activation+1.15e-01

Pos logit contributions
Ms+0.015
Little+0.015
One+0.015
Mrs+0.015
rous+0.015
Neg logit contributions
ed-0.011
driver-0.010
 knees-0.010
aces-0.009
icked-0.009
Token car
Token ID1097
Feature activation-2.22e-02

Pos logit contributions
Ms+0.199
Little+0.199
One+0.193
rous+0.193
 difference+0.191
Neg logit contributions
ed-0.190
 knees-0.162
driver-0.158
 under-0.155
 and-0.155
Token and
Token ID290
Feature activation-2.11e-01

Pos logit contributions
ed+0.024
ped+0.022
icked+0.021
led+0.017
 knees+0.017
Neg logit contributions
One-0.049
 difference-0.047
Ms-0.047
Mrs-0.046
rig-0.045
Token bit
Token ID1643
Feature activation+6.70e-03

Pos logit contributions
aces+0.261
fully+0.253
driver+0.253
ed+0.249
ier+0.245
Neg logit contributions
Ms-0.377
Mrs-0.363
Little-0.360
V-0.359
Girl-0.355
Token it
Token ID340
Feature activation-8.25e-02

Pos logit contributions
E+0.011
One+0.010
Little+0.010
Ms+0.010
Mrs+0.010
Neg logit contributions
ed-0.011
driver-0.010
aces-0.010
 wouldn-0.010
fully-0.009
Token with
Token ID351
Feature activation+2.43e-02

Pos logit contributions
ed+0.097
ously+0.087
edly+0.080
driver+0.079
 knees+0.079
Neg logit contributions
One-0.147
Every-0.135
V-0.133
assion-0.131
 difference-0.131
Token his
Token ID465
Feature activation-5.75e-02

Pos logit contributions
Little+0.051
Ms+0.051
rous+0.051
One+0.051
Mrs+0.050
Neg logit contributions
ed-0.028
driver-0.022
fully-0.022
 knees-0.022
 wouldn-0.020
Token make
Token ID787
Feature activation+1.31e-02

Pos logit contributions
ed+0.140
 knees+0.117
driver+0.116
 under+0.113
 and+0.111
Neg logit contributions
Little-0.153
Ms-0.152
One-0.149
rous-0.147
 difference-0.145
Token him
Token ID683
Feature activation-1.65e-01

Pos logit contributions
Little+0.028
Ms+0.028
rous+0.028
 difference+0.027
Mrs+0.027
Neg logit contributions
ed-0.016
 knees-0.013
driver-0.013
 under-0.012
fully-0.012
Token feel
Token ID1254
Feature activation-2.80e-02

Pos logit contributions
ed+0.265
driver+0.217
 knees+0.217
ped+0.215
fully+0.212
Neg logit contributions
 difference-0.287
Little-0.284
rous-0.283
One-0.276
Ms-0.275
Token better
Token ID1365
Feature activation+1.01e-01

Pos logit contributions
ed+0.056
 knees+0.050
driver+0.049
 under+0.048
 and+0.047
Neg logit contributions
Little-0.039
Ms-0.038
rous-0.038
 difference-0.037
One-0.037
Token by
Token ID416
Feature activation+4.43e-02

Pos logit contributions
 difference+0.179
Little+0.177
 rewarding+0.170
One+0.169
 former+0.168
Neg logit contributions
ed-0.150
ped-0.140
icked-0.139
 knees-0.127
driver-0.125
Token giving
Token ID3501
Feature activation+1.63e-02

Pos logit contributions
Little+0.085
Ms+0.084
rous+0.083
One+0.082
Mrs+0.081
Neg logit contributions
ed-0.066
 knees-0.053
driver-0.053
 under-0.051
 and-0.050
Token him
Token ID683
Feature activation-1.66e-01

Pos logit contributions
Little+0.036
rous+0.035
 difference+0.034
Ms+0.034
Mrs+0.034
Neg logit contributions
ed-0.017
driver-0.016
 knees-0.015
icked-0.014
aces-0.013
Token a
Token ID257
Feature activation+4.11e-02

Pos logit contributions
ed+0.216
 knees+0.184
driver+0.180
 under+0.170
fully+0.168
Neg logit contributions
Little-0.335
Ms-0.333
rous-0.331
One-0.327
 difference-0.324
Token y
Token ID331
Feature activation+1.79e-01

Pos logit contributions
Ms+0.070
Little+0.069
rous+0.068
icious+0.068
One+0.066
Neg logit contributions
ed-0.066
 and-0.064
driver-0.062
 knees-0.059
 in-0.059
Tokenummy
Token ID13513
Feature activation+3.19e-02

Pos logit contributions
 difference+0.163
rous+0.158
Little+0.156
One+0.149
Ms+0.146
Neg logit contributions
ed-0.444
driver-0.398
 under-0.393
 knees-0.392
ped-0.389
Token ice
Token ID4771
Feature activation+4.61e-03

Pos logit contributions
 difference+0.043
Little+0.043
Ms+0.041
rous+0.040
One+0.040
Neg logit contributions
ed-0.063
 knees-0.055
driver-0.054
icked-0.054
fully-0.053
Token went
Token ID1816
Feature activation-1.38e-01

Pos logit contributions
Little+0.098
One+0.091
Em+0.089
Love+0.087
Prin+0.086
Neg logit contributions
driver-0.098
ed-0.094
fully-0.092
ped-0.088
icked-0.086
Token to
Token ID284
Feature activation-8.73e-02

Pos logit contributions
ed+0.183
ped+0.164
 knees+0.158
 wouldn+0.158
driver+0.154
Neg logit contributions
rous-0.266
One-0.263
Ms-0.261
 future-0.252
Mrs-0.252
Token bed
Token ID3996
Feature activation-5.36e-02

Pos logit contributions
 and+0.094
 knees+0.090
 under+0.087
 in+0.085
ed+0.083
Neg logit contributions
icious-0.201
Ms-0.196
Little-0.190
 lifetime-0.189
gary-0.188
Token.
Token ID13
Feature activation-5.44e-02

Pos logit contributions
ed+0.061
driver+0.051
ped+0.050
 knees+0.049
icked+0.046
Neg logit contributions
Ms-0.116
Mrs-0.114
One-0.113
Little-0.112
rous-0.110
Token She
Token ID1375
Feature activation-7.47e-02

Pos logit contributions
ed+0.077
 knees+0.065
driver+0.063
 and+0.063
 under+0.062
Neg logit contributions
Ms-0.107
Little-0.106
rous-0.105
One-0.103
 difference-0.102
Token was
Token ID373
Feature activation-9.65e-02

Pos logit contributions
ed+0.107
driver+0.096
fully+0.086
 knees+0.085
ped+0.085
Neg logit contributions
Little-0.145
One-0.144
Ms-0.141
rous-0.138
Mrs-0.137
Token scared
Token ID12008
Feature activation+1.04e-01

Pos logit contributions
ed+0.145
driver+0.128
 knees+0.123
fully+0.114
ped+0.111
Neg logit contributions
Ms-0.179
rous-0.177
Little-0.176
One-0.175
Mrs-0.173
Token of
Token ID286
Feature activation-2.05e-02

Pos logit contributions
Little+0.201
One+0.196
 future+0.191
 difference+0.190
Em+0.186
Neg logit contributions
ed-0.144
ped-0.137
driver-0.133
icked-0.130
ps-0.126
Token the
Token ID262
Feature activation+9.63e-02

Pos logit contributions
ed+0.033
driver+0.028
 knees+0.026
icked+0.026
ped+0.026
Neg logit contributions
Little-0.037
rous-0.036
Ms-0.036
One-0.036
 difference-0.035
Token dark
Token ID3223
Feature activation+1.10e-01

Pos logit contributions
Ms+0.155
Little+0.151
rous+0.149
One+0.146
Mrs+0.145
Neg logit contributions
ed-0.168
 knees-0.152
 and-0.150
 under-0.150
driver-0.143
Token,
Token ID11
Feature activation-1.17e-01

Pos logit contributions
Little+0.209
 future+0.207
 difference+0.207
One+0.202
 sym+0.200
Neg logit contributions
icked-0.155
ed-0.152
driver-0.142
ked-0.140
ped-0.139
Token so
Token ID523
Feature activation+4.31e-02

Pos logit contributions
ed+0.079
 and+0.069
 under+0.068
 knees+0.067
 in+0.065
Neg logit contributions
Little-0.104
Ms-0.103
rous-0.101
One-0.100
 difference-0.098
Token pretty
Token ID2495
Feature activation+2.61e-01

Pos logit contributions
Little+0.061
Ms+0.060
icious+0.058
ograph+0.056
Em+0.056
Neg logit contributions
ed-0.085
 in-0.082
 and-0.082
 under-0.081
 on-0.077
Token!
Token ID0
Feature activation-3.20e-02

Pos logit contributions
Little+0.260
One+0.259
 difference+0.247
 beautiful+0.247
rous+0.245
Neg logit contributions
 knees-0.540
ed-0.521
driver-0.521
ked-0.515
ped-0.511
Token It
Token ID632
Feature activation-7.82e-03

Pos logit contributions
ed+0.043
 and+0.038
driver+0.034
 under+0.034
 knees+0.034
Neg logit contributions
rous-0.066
Little-0.063
Ms-0.063
 difference-0.063
 lifetime-0.062
Token bloom
Token ID29955
Feature activation+4.41e-02

Pos logit contributions
ed+0.012
driver+0.011
 knees+0.011
fully+0.010
 under+0.010
Neg logit contributions
Ms-0.013
Little-0.013
rous-0.013
One-0.013
 difference-0.013
Tokened
Token ID276
Feature activation-1.73e-01

Pos logit contributions
Ms+0.114
Little+0.114
rous+0.112
 sym+0.112
One+0.112
Neg logit contributions
 knees-0.026
driver-0.023
ed-0.018
ously-0.016
 hid-0.015
Token today
Token ID1909
Feature activation+5.94e-02

Pos logit contributions
ed+0.196
 knees+0.183
driver+0.182
ped+0.152
 hid+0.152
Neg logit contributions
Little-0.365
rous-0.361
Ms-0.360
One-0.360
 difference-0.349
Token!"
Token ID2474
Feature activation+4.67e-02

Pos logit contributions
Little+0.092
Ms+0.092
One+0.090
 difference+0.087
rous+0.086
Neg logit contributions
ed-0.104
driver-0.094
 knees-0.093
ped-0.091
ked-0.087
Token\n
Token ID198
Feature activation+2.55e-01

Pos logit contributions
Little+0.058
rous+0.057
Ms+0.057
One+0.055
 difference+0.055
Neg logit contributions
ed-0.100
 and-0.090
 knees-0.090
 under-0.087
driver-0.087
Token\n
Token ID198
Feature activation+2.55e-01

Pos logit contributions
 former+0.560
 lifetime+0.556
 difference+0.524
icious+0.524
assion+0.523
Neg logit contributions
 and-0.415
ed-0.337
 in-0.304
 under-0.296
 on-0.293
TokenBen
Token ID11696
Feature activation+4.77e-02

Pos logit contributions
 difference+0.415
 former+0.387
 lifetime+0.384
rous+0.375
 sym+0.371
Neg logit contributions
 and-0.452
ed-0.441
 ran-0.413
 in-0.396
 knees-0.394
Token with
Token ID351
Feature activation+1.17e-02

Pos logit contributions
 difference+0.413
 sym+0.396
 former+0.378
Little+0.376
 future+0.374
Neg logit contributions
ed-0.262
icked-0.259
driver-0.254
ped-0.236
 knees-0.230
Token just
Token ID655
Feature activation-1.08e-01

Pos logit contributions
Little+0.022
rous+0.022
 difference+0.022
Ms+0.021
One+0.021
Neg logit contributions
ed-0.017
driver-0.014
 knees-0.014
 under-0.013
fully-0.013
Token a
Token ID257
Feature activation+3.90e-02

Pos logit contributions
ed+0.147
driver+0.132
 knees+0.126
icked+0.115
ped+0.113
Neg logit contributions
rous-0.206
 difference-0.203
 rewarding-0.200
Little-0.200
 former-0.199
Token few
Token ID1178
Feature activation+8.73e-02

Pos logit contributions
Ms+0.059
rous+0.059
Little+0.056
One+0.056
icious+0.055
Neg logit contributions
ed-0.070
 and-0.069
driver-0.063
 knees-0.063
 in-0.062
Token paints
Token ID32281
Feature activation-2.18e-02

Pos logit contributions
Little+0.144
Ms+0.143
rous+0.142
One+0.140
 difference+0.139
Neg logit contributions
ed-0.147
 knees-0.128
driver-0.127
 under-0.123
fully-0.121
Token and
Token ID290
Feature activation-2.28e-01

Pos logit contributions
 knees+0.025
ed+0.024
ped+0.023
icked+0.021
 hid+0.021
Neg logit contributions
Little-0.044
 difference-0.043
One-0.042
Ms-0.042
 sym-0.042
Token switches
Token ID18225
Feature activation-1.03e-01

Pos logit contributions
ed+0.398
 knees+0.339
driver+0.337
 under+0.334
fully+0.329
Neg logit contributions
Little-0.364
Ms-0.352
rous-0.350
One-0.349
 rewarding-0.341
Token.
Token ID13
Feature activation-4.93e-02

Pos logit contributions
 knees+0.123
ed+0.118
ped+0.109
icked+0.108
 hid+0.101
Neg logit contributions
Little-0.185
 difference-0.185
 lifetime-0.183
One-0.183
assion-0.182
Token From
Token ID3574
Feature activation+1.04e-01

Pos logit contributions
ed+0.062
 and+0.052
 knees+0.050
driver+0.049
 under+0.046
Neg logit contributions
rous-0.106
Ms-0.104
Little-0.103
icious-0.101
One-0.101
Token then
Token ID788
Feature activation-2.90e-02

Pos logit contributions
Little+0.230
Ms+0.229
rous+0.227
One+0.224
Mrs+0.221
Neg logit contributions
ed-0.122
 knees-0.098
driver-0.096
 under-0.092
 and-0.091
Token on
Token ID319
Feature activation-8.26e-02

Pos logit contributions
ed+0.020
 knees+0.019
 hid+0.014
driver+0.014
ped+0.013
Neg logit contributions
Little-0.076
One-0.074
Every-0.072
rous-0.071
Ms-0.071
Token walking
Token ID6155
Feature activation-2.85e-03

Pos logit contributions
ed+0.123
driver+0.106
 knees+0.105
fully+0.095
 hid+0.092
Neg logit contributions
rous-0.184
Little-0.182
Ms-0.181
One-0.179
Mrs-0.174
Token through
Token ID832
Feature activation-2.25e-02

Pos logit contributions
icked+0.003
driver+0.003
ed+0.003
 wouldn+0.003
aces+0.003
Neg logit contributions
One-0.006
rous-0.006
Today-0.006
Every-0.006
Little-0.006
Token the
Token ID262
Feature activation+1.03e-01

Pos logit contributions
ed+0.030
 knees+0.026
 and+0.025
 under+0.025
driver+0.025
Neg logit contributions
Ms-0.045
Little-0.045
rous-0.044
One-0.044
Mrs-0.043
Token park
Token ID3952
Feature activation+1.33e-02

Pos logit contributions
icious+0.187
gary+0.178
Ms+0.178
reci+0.170
 disl+0.165
Neg logit contributions
 and-0.172
 in-0.158
 under-0.149
ed-0.148
 inside-0.146
Token when
Token ID618
Feature activation-2.65e-02

Pos logit contributions
One+0.027
Every+0.025
Each+0.025
Today+0.024
rous+0.024
Neg logit contributions
ed-0.018
driver-0.017
icked-0.016
ped-0.015
 wouldn-0.015
Token he
Token ID339
Feature activation-1.82e-01

Pos logit contributions
ed+0.030
icked+0.026
driver+0.025
aces+0.025
ps+0.023
Neg logit contributions
One-0.060
Little-0.060
rous-0.057
Every-0.056
Today-0.055
Token heard
Token ID2982
Feature activation-8.00e-02

Pos logit contributions
ed+0.249
 and+0.202
 knees+0.198
 under+0.197
 in+0.187
Neg logit contributions
Ms-0.374
 difference-0.367
Little-0.366
icious-0.365
rous-0.364
Token a
Token ID257
Feature activation+4.18e-02

Pos logit contributions
ed+0.093
 knees+0.075
 and+0.072
driver+0.072
 under+0.071
Neg logit contributions
Ms-0.176
Little-0.175
rous-0.173
One-0.171
 difference-0.169
Token noise
Token ID7838
Feature activation+3.63e-03

Pos logit contributions
rous+0.083
Ms+0.079
 former+0.078
gary+0.078
Each+0.078
Neg logit contributions
 and-0.055
ed-0.051
 in-0.049
 ran-0.049
 knees-0.048
Token.
Token ID13
Feature activation-4.95e-02

Pos logit contributions
Little+0.007
One+0.007
Mrs+0.006
 sym+0.006
Ms+0.006
Neg logit contributions
ed-0.005
 knees-0.005
driver-0.005
aces-0.005
icked-0.005
Token He
Token ID679
Feature activation-1.16e-01

Pos logit contributions
ed+0.071
icked+0.064
driver+0.063
 under+0.061
ps+0.061
Neg logit contributions
Little-0.090
Mrs-0.089
One-0.088
Ms-0.088
Every-0.085
Token played
Token ID2826
Feature activation+8.37e-03

Pos logit contributions
ed+0.141
 knees+0.128
driver+0.127
fully+0.119
aces+0.111
Neg logit contributions
Ms-0.170
One-0.169
Mrs-0.167
rous-0.167
Little-0.166
Token a
Token ID257
Feature activation+3.81e-02

Pos logit contributions
Little+0.016
Ms+0.016
 difference+0.016
One+0.016
rous+0.016
Neg logit contributions
ed-0.011
 knees-0.010
icked-0.009
ped-0.009
driver-0.009
Token trick
Token ID6908
Feature activation+3.12e-02

Pos logit contributions
Ms+0.061
rous+0.060
Little+0.058
reci+0.057
Girl+0.056
Neg logit contributions
 and-0.067
 under-0.064
 in-0.063
ed-0.062
 knees-0.060
Token on
Token ID319
Feature activation-8.69e-02

Pos logit contributions
Little+0.062
Ms+0.061
 difference+0.060
One+0.059
rous+0.059
Neg logit contributions
 knees-0.035
ed-0.034
ped-0.030
icked-0.029
 teeth-0.029
Token her
Token ID607
Feature activation-6.93e-02

Pos logit contributions
ed+0.143
driver+0.120
icked+0.115
 knees+0.115
ked+0.113
Neg logit contributions
Little-0.161
Ms-0.158
Mrs-0.152
One-0.151
 difference-0.146
Token and
Token ID290
Feature activation-2.27e-01

Pos logit contributions
ed+0.120
driver+0.103
icked+0.098
ped+0.096
 knees+0.093
Neg logit contributions
 difference-0.115
Little-0.115
 former-0.111
Ms-0.111
rous-0.111
Token now
Token ID783
Feature activation+8.63e-03

Pos logit contributions
ed+0.295
driver+0.284
aces+0.251
 knees+0.248
fully+0.237
Neg logit contributions
Little-0.440
Ms-0.427
rous-0.422
Mrs-0.419
 rewarding-0.409
Token she
Token ID673
Feature activation-1.56e-01

Pos logit contributions
Little+0.017
 lifetime+0.017
 difference+0.016
 future+0.016
 sym+0.016
Neg logit contributions
ed-0.010
driver-0.009
 knees-0.009
icked-0.008
aces-0.008
Token was
Token ID373
Feature activation-9.30e-02

Pos logit contributions
driver+0.205
ed+0.199
 knees+0.193
fully+0.185
aces+0.171
Neg logit contributions
Little-0.281
 lifetime-0.269
Ms-0.263
One-0.261
Em-0.258
Token stuck
Token ID7819
Feature activation+1.12e-02

Pos logit contributions
ed+0.127
 knees+0.111
driver+0.110
fully+0.099
 and+0.095
Neg logit contributions
Ms-0.181
Little-0.179
rous-0.179
One-0.178
Mrs-0.175
Token!
Token ID0
Feature activation-2.36e-02

Pos logit contributions
One+0.022
rous+0.021
Little+0.021
Em+0.021
Every+0.021
Neg logit contributions
 knees-0.012
ed-0.011
 teeth-0.010
icked-0.010
ped-0.010
Token while
Token ID981
Feature activation-7.04e-02

Pos logit contributions
ed+0.174
 knees+0.148
driver+0.147
 under+0.140
 and+0.140
Neg logit contributions
Little-0.213
Ms-0.213
One-0.208
rous-0.206
Mrs-0.203
Token they
Token ID484
Feature activation-1.74e-01

Pos logit contributions
ed+0.091
driver+0.076
 knees+0.074
fully+0.069
aces+0.069
Neg logit contributions
Little-0.148
One-0.148
Ms-0.145
rous-0.142
Mrs-0.140
Token were
Token ID547
Feature activation-1.15e-01

Pos logit contributions
ed+0.206
driver+0.190
 knees+0.178
fully+0.167
aces+0.157
Neg logit contributions
Little-0.371
One-0.365
Ms-0.362
rous-0.357
Every-0.347
Token playing
Token ID2712
Feature activation+1.33e-02

Pos logit contributions
ed+0.199
driver+0.194
ously+0.188
fully+0.185
 wouldn+0.175
Neg logit contributions
rous-0.175
Ms-0.164
One-0.163
gary-0.153
Mrs-0.153
Token,
Token ID11
Feature activation-1.16e-01

Pos logit contributions
Little+0.025
One+0.025
rous+0.024
Ms+0.024
 difference+0.024
Neg logit contributions
ed-0.017
driver-0.015
 knees-0.015
 wouldn-0.015
ped-0.015
Token they
Token ID484
Feature activation-1.74e-01

Pos logit contributions
ed+0.218
driver+0.193
aces+0.189
 knees+0.180
ps+0.178
Neg logit contributions
Little-0.187
One-0.180
Ms-0.179
E-0.170
Mrs-0.168
Token saw
Token ID2497
Feature activation+6.72e-02

Pos logit contributions
driver+0.327
ed+0.303
aces+0.294
fully+0.285
 knees+0.284
Neg logit contributions
Little-0.252
Ms-0.248
One-0.247
Mrs-0.235
Prin-0.233
Token a
Token ID257
Feature activation+4.32e-02

Pos logit contributions
Little+0.143
Ms+0.141
rous+0.139
One+0.139
Mrs+0.136
Neg logit contributions
ed-0.084
 knees-0.071
driver-0.069
 under-0.065
 and-0.064
Token big
Token ID1263
Feature activation+2.27e-01

Pos logit contributions
Ms+0.069
rous+0.068
Little+0.066
reci+0.066
One+0.065
Neg logit contributions
ed-0.071
 and-0.070
 under-0.067
 in-0.066
 knees-0.066
Token box
Token ID3091
Feature activation-8.33e-02

Pos logit contributions
 difference+0.359
Little+0.346
rous+0.341
Ms+0.328
One+0.323
Neg logit contributions
ed-0.446
icked-0.369
ped-0.369
aces-0.366
fully-0.360
Token near
Token ID1474
Feature activation+3.39e-02

Pos logit contributions
ed+0.084
 knees+0.067
driver+0.064
 teeth+0.060
ped+0.060
Neg logit contributions
One-0.194
Little-0.186
Every-0.184
rous-0.183
Each-0.182