TOP ACTIVATIONS
MAX = 5.270

walked and walked , and finally , they found the nest
walked and walked , and finally , they found the yogurt
little girl was disappointed but soon spotted a small elephant standing
saved and saved , and finally had enough money to buy
stared and stared , until finally the music stopped . The
," said Lily . She organized all the toys in the
baby was disappointed , but soon she saw a little bird
she screamed for help . Soon a friendly polar bear appeared
but she kept walking and soon found a beautiful flower .
around the park , and finally , Johnny found the watch
and boy continued walking and soon stopped at the tree .
walked for a while and finally found the puppy 's home
she found it ! She closed her eyes , and soon
They kept walking , and soon they came to a playground
park for the frame , but they couldn 't find it
went off upset , but soon he saw his best friend
to sleep again . She closed her eyes tightly and counted
L ily resisted , but eventually she gave in and took
little girl kept walking and soon found a soft , squ
open the box , but finally managed to tear it open

MIN ACTIVATIONS
MIN = -4.511

The elephant is just saying hello
baby , don 't say a word . Mama is going
That ’ s the bird that was calling !
The refrigerator door is not for
or baskets , to sing a song or tell a story
my . You listened to the cater pillar and respected his
thing from the basket as a gift . Choose what you
But how about you draw a picture of you z ooming
shop at the end of the road . If you do
It 's important to follow the rules ." Tim
to be careful and ask the owner if we can pet
there , ” replied the snail .
get to the end of the wire first . All of
m so glad you followed the light , now let ’
Let 's go and ask the truck driver if we can
bear , I 'll ask the doctor for permission to give
other happy and grow as a family ." L
" I was born as a cater pillar . A cater
That ’ s a special ball , it has
careful and wise . " The dog might bite you or

INTERVAL 2.825 - 5.270
CONTAINS 1.883%

in her backyard . One day , while she was playing
tired but also very satisfied with her Tuesday . She knew
As she walked , she saw a boy who
to collect rocks . He wore his big bath robe because
the pastry . She smiled because she knew she could eat

INTERVAL 0.380 - 2.825
CONTAINS 39.458%

Billy and his mom went to a store called the
. <|endoftext|> Once upon a time , there was a little
Da isy smiled and picked up the plastic foot .
the end of the day , Sir C aw had to
Inside the sack , she found a lot of powder

INTERVAL -2.065 - 0.380
CONTAINS 51.081%

touched the sky . Inside the castle , there was a
bird and said " hi " to it . The bird
bear . Emma noticed how Lily felt and offered to share
book with a beautiful poem on the front . The poem
<|endoftext|> Jason and his family were in their house . There

INTERVAL -4.511 - -2.065
CONTAINS 7.578%

I love you too , Anna . Have fun with Lily
" Please teach me , thumb !" The
the lake . He wanted to see it better so he
her mom and hugged her tight . From then on ,
princess or a fairy or a doctor . One day ,
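
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX, with each CONTAINS figure giving the share of sampled activations that fall in that bin. The dump does not state this; it is inferred from the printed numbers. The sketch below is a minimal check of that reading. The helper name `equal_width_edges` and the tolerance are assumptions made for illustration, and the small mismatch it allows for is presumably because MIN and MAX are themselves rounded to three decimals.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced bin edges from lo to hi (hypothetical helper)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Values copied from the dump above.
MIN_ACT = -4.511
MAX_ACT = 5.270
reported_edges = [-4.511, -2.065, 0.380, 2.825, 5.270]
reported_shares = [1.883, 39.458, 51.081, 7.578]  # percent, top bin first

# Assumption: four equal-width bins between MIN and MAX.
computed_edges = equal_width_edges(MIN_ACT, MAX_ACT, 4)

# Largest gap between a computed edge and the edge printed in the dump.
max_edge_gap = max(abs(c - r) for c, r in zip(computed_edges, reported_edges))

# The CONTAINS percentages should account for every sampled activation.
total_share = sum(reported_shares)

print([round(e, 3) for e in computed_edges])
print(max_edge_gap, total_share)
```

Under this reading, about 90% of activations sit in the two middle bins, and under 2% reach the top bin that the TOP ACTIVATIONS examples are drawn from.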
Token walked
Token ID6807
Feature activation+2.10e+00

Pos logit contributions
 her+5.767
 his+5.244
 the+5.017
 their+4.962
led+4.851
Neg logit contributions
 promise-8.683
ode-8.260
 rule-8.189
 hill-7.848
 request-7.844
Token and
Token ID290
Feature activation+4.24e+00

Pos logit contributions
 her+3.080
 the+2.574
 his+2.570
led+2.502
 their+2.447
Neg logit contributions
 rule-4.736
 promise-4.458
 world-4.343
 basis-4.301
age-4.279
Token walked
Token ID6807
Feature activation+1.65e+00

Pos logit contributions
 her+7.028
led+6.273
 his+6.145
 the+5.824
 their+5.781
Neg logit contributions
 hill-8.669
 promise-8.308
 ramp-8.195
ode-8.123
 road-8.072
Token,
Token ID11
Feature activation+3.38e+00

Pos logit contributions
 her+2.348
 the+1.996
 his+1.966
 their+1.892
led+1.858
Neg logit contributions
 rule-3.779
 promise-3.540
 basis-3.449
 world-3.434
 difference-3.432
Token and
Token ID290
Feature activation+4.29e+00

Pos logit contributions
 her+6.478
 his+5.790
led+5.721
 their+5.695
 the+5.455
Neg logit contributions
 rule-6.159
 ramp-5.816
 hill-5.811
 promise-5.795
 road-5.709
Token finally
Token ID3443
Feature activation+5.27e+00

Pos logit contributions
 her+7.163
led+5.836
 his+5.772
 their+5.425
 its+5.317
Neg logit contributions
 rule-9.232
 road-9.131
 ramp-9.047
age-8.982
oc-8.958
Token,
Token ID11
Feature activation+3.49e+00

Pos logit contributions
 her+9.809
led+8.911
 his+7.501
ged+6.936
ied+6.930
Neg logit contributions
ari-11.198
 promise-10.949
affe-10.888
oc-10.871
 river-10.785
Token they
Token ID484
Feature activation+3.50e+00

Pos logit contributions
 her+4.586
led+4.058
 his+3.308
ed+3.096
icked+2.962
Neg logit contributions
 rule-8.589
 ramp-8.496
 hill-8.484
 road-8.442
 difference-8.155
Token found
Token ID1043
Feature activation+3.86e+00

Pos logit contributions
 her+5.339
 his+4.714
led+4.555
 their+4.486
 the+4.354
Neg logit contributions
 promise-7.671
 rule-7.545
age-7.431
asis-7.383
ode-7.203
Token the
Token ID262
Feature activation-1.25e+00

Pos logit contributions
 her+4.723
led+3.886
ged+3.303
 his+3.246
 their+3.026
Neg logit contributions
 difference-10.003
 rule-9.608
 promise-9.420
 world-9.329
oc-9.300
Token nest
Token ID16343
Feature activation+1.48e+00

Pos logit contributions
icious+1.666
 rule+1.665
opus+1.602
'."+1.594
rr+1.591
Neg logit contributions
 her-3.080
 the-2.929
 his-2.739
 their-2.704
 it-2.592
Token walked
Token ID6807
Feature activation+2.30e+00

Pos logit contributions
 her+4.043
 his+3.596
 the+3.591
 their+3.464
led+3.286
Neg logit contributions
 promise-6.391
 rule-6.263
ode-5.976
Two-5.972
age-5.908
Token and
Token ID290
Feature activation+4.16e+00

Pos logit contributions
 her+3.204
 the+2.728
 his+2.685
led+2.645
 their+2.563
Neg logit contributions
 rule-5.298
 promise-5.063
 world-4.826
age-4.821
 request-4.814
Token walked
Token ID6807
Feature activation+1.72e+00

Pos logit contributions
 her+6.728
 his+5.914
led+5.888
 its+5.518
 the+5.508
Neg logit contributions
 hill-8.543
 promise-8.458
 ramp-8.182
Two-8.023
 rule-7.996
Token,
Token ID11
Feature activation+2.87e+00

Pos logit contributions
 her+2.452
 the+2.106
 his+2.055
 their+2.000
led+1.971
Neg logit contributions
 rule-3.957
 promise-3.721
 basis-3.608
age-3.590
reath-3.582
Token and
Token ID290
Feature activation+4.09e+00

Pos logit contributions
 her+5.174
 his+4.608
 their+4.537
led+4.527
 the+4.449
Neg logit contributions
 rule-5.456
 promise-5.203
 ramp-4.963
 basis-4.951
 hill-4.921
Token finally
Token ID3443
Feature activation+5.22e+00

Pos logit contributions
 her+6.050
 his+5.012
led+4.850
 their+4.675
 its+4.573
Neg logit contributions
 rule-9.192
 promise-8.889
 road-8.680
age-8.628
 hill-8.628
Token,
Token ID11
Feature activation+2.82e+00

Pos logit contributions
 her+9.373
led+8.934
 his+7.419
ged+7.007
bled+6.983
Neg logit contributions
ari-10.964
 promise-10.750
Two-10.507
affe-10.298
 team-10.280
Token they
Token ID484
Feature activation+2.82e+00

Pos logit contributions
 her+2.754
led+2.298
 his+1.971
 their+1.777
 the+1.758
Neg logit contributions
 rule-7.564
 promise-7.227
 hill-7.212
 world-7.191
 road-7.158
Token found
Token ID1043
Feature activation+3.73e+00

Pos logit contributions
 her+4.111
 his+3.552
 the+3.435
 their+3.433
led+3.398
Neg logit contributions
 rule-6.281
 promise-6.215
age-5.945
asis-5.899
Two-5.888
Token the
Token ID262
Feature activation-2.09e+00

Pos logit contributions
 her+3.887
led+3.394
ged+2.840
 his+2.804
 their+2.488
Neg logit contributions
 difference-9.995
 rule-9.942
 promise-9.677
 world-9.549
 goal-9.196
Token yogurt
Token ID32132
Feature activation+2.78e-01

Pos logit contributions
'."+4.444
!'"+4.337
.'"+4.335
age+4.322
rr+4.321
Neg logit contributions
 her-3.816
 the-3.599
 his-3.437
 their-3.034
 it-2.926
Token little
Token ID1310
Feature activation+7.47e-01

Pos logit contributions
 rule+2.267
icious+2.185
'."+2.147
 difference+2.136
!'"+2.131
Neg logit contributions
 her-2.234
 the-2.107
 his-1.939
 their-1.936
led-1.726
Token girl
Token ID2576
Feature activation+6.89e-01

Pos logit contributions
 her+1.918
 the+1.798
 his+1.780
 their+1.757
led+1.645
Neg logit contributions
 rule-0.885
 promise-0.791
icious-0.766
 difference-0.757
oral-0.748
Token was
Token ID373
Feature activation+1.80e+00

Pos logit contributions
 her+0.865
 the+0.750
 his+0.715
 their+0.706
led+0.599
Neg logit contributions
 rule-1.714
 promise-1.634
icious-1.614
oral-1.597
 difference-1.590
Token disappointed
Token ID11679
Feature activation+5.03e-01

Pos logit contributions
 her+2.580
 the+2.229
 his+2.192
 their+2.167
led+2.130
Neg logit contributions
 rule-4.169
 promise-3.947
umpy-3.833
age-3.779
icious-3.770
Token but
Token ID475
Feature activation+3.36e+00

Pos logit contributions
 her+0.632
 the+0.552
 his+0.525
 their+0.516
led+0.444
Neg logit contributions
 rule-1.262
 promise-1.193
icious-1.190
 difference-1.173
oral-1.171
Token soon
Token ID2582
Feature activation+5.18e+00

Pos logit contributions
 her+4.904
 their+4.680
led+4.608
 his+4.380
 the+4.250
Neg logit contributions
 promise-7.120
 rule-6.919
 judge-6.789
 request-6.633
 world-6.539
Token spotted
Token ID13489
Feature activation+2.40e+00

Pos logit contributions
led+6.919
bled+6.446
 her+6.243
ged+6.136
 their+6.078
Neg logit contributions
-11.750
Two-11.647
 promise-11.579
 world-11.576
 story-11.544
Token a
Token ID257
Feature activation-2.38e+00

Pos logit contributions
 her+1.631
led+1.349
 his+1.315
 their+1.249
 the+1.179
Neg logit contributions
 rule-6.868
 difference-6.529
 world-6.441
 charity-6.405
icious-6.377
Token small
Token ID1402
Feature activation-1.52e+00

Pos logit contributions
icious+5.348
'."+5.266
reath+5.219
opl+5.180
.'"+5.179
Neg logit contributions
 the-3.917
 her-3.810
 his-3.510
 their-3.500
 it-3.263
Token elephant
Token ID20950
Feature activation-1.41e+00

Pos logit contributions
icious+2.928
 rule+2.783
reath+2.770
oral+2.764
arl+2.711
Neg logit contributions
 her-2.936
 the-2.890
 his-2.685
 their-2.668
 it-2.455
Token standing
Token ID5055
Feature activation+9.78e-01

Pos logit contributions
 rule+3.743
icious+3.654
reath+3.630
oral+3.621
arl+3.607
Neg logit contributions
 her-1.740
 the-1.586
 his-1.459
 their-1.357
 it-1.237
Token saved
Token ID7448
Feature activation+2.76e+00

Pos logit contributions
 her+1.065
 the+0.944
 his+0.907
 their+0.900
led+0.770
Neg logit contributions
 rule-1.798
 promise-1.711
icious-1.691
oral-1.667
umpy-1.657
Token and
Token ID290
Feature activation+3.62e+00

Pos logit contributions
led+1.284
 her+1.034
 his+0.796
 their+0.790
ed+0.762
Neg logit contributions
 rule-8.819
 promise-8.382
 difference-8.325
 law-8.288
icious-8.179
Token saved
Token ID7448
Feature activation+2.28e+00

Pos logit contributions
 her+5.352
 his+4.775
led+4.748
 their+4.663
 the+4.581
Neg logit contributions
 promise-7.884
 rule-7.843
 occasion-7.228
 deal-7.081
 request-7.078
Token,
Token ID11
Feature activation+1.49e+00

Pos logit contributions
 her+1.344
led+1.177
 his+1.078
 their+1.019
 the+0.926
Neg logit contributions
 rule-6.964
 promise-6.600
 difference-6.553
 law-6.461
icious-6.457
Token and
Token ID290
Feature activation+3.64e+00

Pos logit contributions
 her+2.369
 his+2.137
 the+2.126
 their+2.080
led+2.013
Neg logit contributions
 rule-3.156
 promise-2.931
icious-2.873
 difference-2.862
 world-2.799
Token finally
Token ID3443
Feature activation+5.16e+00

Pos logit contributions
 her+3.887
 his+3.788
 their+3.502
led+3.489
 the+3.174
Neg logit contributions
 rule-9.211
 promise-8.935
 occasion-8.460
 difference-8.334
 charity-8.333
Token had
Token ID550
Feature activation+2.47e+00

Pos logit contributions
led+9.306
 his+8.087
 her+7.652
ged+7.342
 their+7.233
Neg logit contributions
 promise-11.451
 rule-10.996
 difference-10.724
 goal-10.503
 story-10.344
Token enough
Token ID1576
Feature activation+8.20e-01

Pos logit contributions
 her+2.287
led+2.278
 his+1.932
 the+1.775
 their+1.772
Neg logit contributions
 rule-6.681
 promise-6.312
 difference-6.260
oral-6.134
icious-6.110
Token money
Token ID1637
Feature activation+1.50e+00

Pos logit contributions
 her+1.685
 the+1.546
 his+1.506
 their+1.494
led+1.404
Neg logit contributions
 rule-1.406
 promise-1.291
icious-1.261
 difference-1.255
reath-1.250
Token to
Token ID284
Feature activation-1.06e+00

Pos logit contributions
 her+1.761
 the+1.504
 his+1.501
 their+1.444
led+1.288
Neg logit contributions
 rule-3.763
 promise-3.572
icious-3.472
 difference-3.441
oral-3.439
Token buy
Token ID2822
Feature activation+1.66e+00

Pos logit contributions
 rule+1.441
umpy+1.382
opus+1.363
icious+1.361
affe+1.355
Neg logit contributions
 her-2.641
 the-2.525
 his-2.435
 their-2.394
 it-2.201
Token stared
Token ID18484
Feature activation+2.35e+00

Pos logit contributions
 her+0.500
 the+0.435
 his+0.414
 their+0.406
led+0.344
Neg logit contributions
 rule-1.013
icious-0.960
 promise-0.959
reath-0.945
oral-0.943
Token and
Token ID290
Feature activation+2.77e+00

Pos logit contributions
 her+2.140
 the+1.779
 their+1.700
 his+1.659
led+1.450
Neg logit contributions
 rule-6.307
 promise-6.088
umpy-6.045
 world-6.041
 difference-6.040
Token stared
Token ID18484
Feature activation+2.24e+00

Pos logit contributions
 her+4.100
 the+3.564
 their+3.548
led+3.521
 his+3.495
Neg logit contributions
 rule-5.740
 promise-5.692
icious-5.557
 request-5.464
rr-5.350
Token,
Token ID11
Feature activation+1.89e+00

Pos logit contributions
 her+2.054
 the+1.709
 their+1.630
 his+1.609
led+1.399
Neg logit contributions
 rule-6.005
 promise-5.789
 difference-5.782
umpy-5.739
ater-5.733
Token until
Token ID1566
Feature activation+3.90e+00

Pos logit contributions
 her+2.936
 their+2.599
 his+2.581
 the+2.558
led+2.512
Neg logit contributions
 rule-3.891
icious-3.659
 promise-3.614
 difference-3.576
 world-3.562
Token finally
Token ID3443
Feature activation+5.15e+00

Pos logit contributions
led+5.039
 her+4.332
 his+4.044
 their+3.771
ed+3.665
Neg logit contributions
 rule-8.963
Two-8.926
 goal-8.656
oral-8.461
 sentence-8.437
Token the
Token ID262
Feature activation-2.48e+00

Pos logit contributions
led+9.391
 his+7.844
ged+7.586
 their+7.585
 her+7.539
Neg logit contributions
Two-10.191
 promise-10.185
 goal-10.178
 rule-10.070
 judge-10.053
Token music
Token ID2647
Feature activation+1.04e+00

Pos logit contributions
'."+4.946
icious+4.925
!'"+4.900
oc+4.844
.'"+4.827
Neg logit contributions
 the-4.881
 her-4.820
 his-4.479
 their-4.243
 it-4.061
Token stopped
Token ID5025
Feature activation+5.68e-01

Pos logit contributions
 her+1.234
 the+1.046
 his+1.012
 their+0.996
led+0.830
Neg logit contributions
 rule-2.660
icious-2.498
 promise-2.497
oral-2.468
 difference-2.465
Token.
Token ID13
Feature activation+3.52e-01

Pos logit contributions
 her+0.869
 the+0.778
 his+0.750
 their+0.738
led+0.662
Neg logit contributions
 rule-1.255
icious-1.180
 promise-1.177
 difference-1.158
reath-1.156
Token The
Token ID383
Feature activation-2.12e+00

Pos logit contributions
 her+0.558
 the+0.502
 his+0.483
 their+0.476
led+0.421
Neg logit contributions
 rule-0.766
icious-0.720
 promise-0.718
reath-0.707
 difference-0.705
Token,"
Token ID553
Feature activation+1.47e+00

Pos logit contributions
 her+0.351
 the+0.327
 his+0.318
 their+0.314
led+0.290
Neg logit contributions
 rule-0.236
icious-0.217
 promise-0.213
reath-0.211
oral-0.210
Token said
Token ID531
Feature activation+2.39e+00

Pos logit contributions
 her+2.040
 the+1.772
 their+1.757
 his+1.749
led+1.577
Neg logit contributions
 rule-3.303
 promise-3.171
oral-3.079
 difference-3.077
icious-3.075
Token Lily
Token ID20037
Feature activation+1.50e+00

Pos logit contributions
 her+2.230
led+2.030
 their+1.806
 his+1.773
 the+1.727
Neg logit contributions
 rule-6.218
 promise-5.938
 difference-5.832
'."-5.799
.'"-5.786
Token.
Token ID13
Feature activation-5.85e-02

Pos logit contributions
 her+2.585
 the+2.370
 their+2.339
 his+2.281
led+2.108
Neg logit contributions
 rule-2.920
 promise-2.840
age-2.724
 request-2.673
 difference-2.672
Token She
Token ID1375
Feature activation+2.60e+00

Pos logit contributions
 rule+0.126
icious+0.118
 promise+0.117
reath+0.116
oral+0.115
Neg logit contributions
 her-0.097
 the-0.088
 his-0.084
 their-0.083
 it-0.074
Token organized
Token ID8389
Feature activation+5.12e+00

Pos logit contributions
 her+3.942
 their+3.499
 the+3.452
 his+3.344
 its+2.927
Neg logit contributions
 promise-5.767
 rule-5.655
age-5.510
 request-5.475
oc-5.423
Token all
Token ID477
Feature activation+3.59e+00

Pos logit contributions
led+7.900
 their+6.275
 her+6.245
ed+6.053
 they+5.929
Neg logit contributions
 shelf-11.591
 rule-11.382
 basis-10.731
 request-10.700
 charity-10.511
Token the
Token ID262
Feature activation+7.85e-01

Pos logit contributions
 her+6.192
led+5.715
 their+5.697
bled+5.680
 the+5.545
Neg logit contributions
 rule-6.742
 promise-6.431
 basis-6.193
 difference-6.162
 world-6.151
Token toys
Token ID14958
Feature activation+2.93e+00

Pos logit contributions
 her+1.528
 the+1.397
 his+1.351
 their+1.349
led+1.242
Neg logit contributions
 rule-1.437
 promise-1.325
 difference-1.298
icious-1.298
reath-1.293
Token in
Token ID287
Feature activation+1.50e+00

Pos logit contributions
 her+3.763
 his+3.225
 their+3.103
 the+3.102
led+2.714
Neg logit contributions
 rule-7.198
 promise-7.003
 world-6.648
ater-6.593
age-6.563
Token the
Token ID262
Feature activation-1.62e+00

Pos logit contributions
 her+0.977
led+0.762
 the+0.717
 his+0.678
 their+0.655
Neg logit contributions
 rule-4.556
 promise-4.301
 difference-4.249
 world-4.198
 charity-4.183
Token baby
Token ID5156
Feature activation+8.16e-01

Pos logit contributions
 rule+2.059
icious+2.025
reath+2.000
ode+1.987
'."+1.976
Neg logit contributions
 her-2.061
 the-1.927
 his-1.816
 their-1.796
 it-1.615
Token was
Token ID373
Feature activation+2.34e+00

Pos logit contributions
 her+1.432
 the+1.290
 his+1.258
 their+1.254
led+1.114
Neg logit contributions
 rule-1.623
 promise-1.528
icious-1.518
 difference-1.485
umpy-1.484
Token disappointed
Token ID11679
Feature activation+1.04e+00

Pos logit contributions
 her+3.510
 the+3.066
 their+3.015
led+3.012
 his+3.002
Neg logit contributions
 rule-5.311
 promise-4.909
umpy-4.837
age-4.738
 request-4.730
Token,
Token ID11
Feature activation+1.64e+00

Pos logit contributions
 her+1.390
 the+1.218
 his+1.169
 their+1.153
led+1.059
Neg logit contributions
 rule-2.493
 promise-2.346
 difference-2.298
icious-2.292
umpy-2.283
Token but
Token ID475
Feature activation+3.81e+00

Pos logit contributions
 her+2.558
 the+2.289
 their+2.244
 his+2.236
led+2.180
Neg logit contributions
 rule-3.453
 promise-3.182
icious-3.131
 difference-3.123
 world-3.104
Token soon
Token ID2582
Feature activation+5.08e+00

Pos logit contributions
 her+4.392
led+4.319
 their+4.307
 his+4.029
ed+3.939
Neg logit contributions
 promise-8.657
 rule-8.558
 world-8.217
 judge-8.139
 request-8.050
Token she
Token ID673
Feature activation+1.25e+00

Pos logit contributions
led+6.747
bled+5.922
 her+5.634
 his+5.620
ied+5.497
Neg logit contributions
-11.915
Two-11.696
 world-11.628
 difference-11.476
 judge-11.416
Token saw
Token ID2497
Feature activation+2.87e+00

Pos logit contributions
 her+1.693
 the+1.466
 their+1.424
 his+1.413
 its+1.197
Neg logit contributions
 rule-2.963
 promise-2.870
icious-2.780
oral-2.765
ater-2.746
Token a
Token ID257
Feature activation-2.39e+00

Pos logit contributions
 her+2.029
led+1.908
 his+1.696
 their+1.680
 the+1.504
Neg logit contributions
 rule-8.038
 difference-7.653
 world-7.541
 promise-7.495
 law-7.484
Token little
Token ID1310
Feature activation-9.47e-01

Pos logit contributions
icious+5.321
'."+5.272
opl+5.237
reath+5.222
.'"+5.177
Neg logit contributions
 her-4.024
 the-3.944
 his-3.629
 their-3.543
 it-3.343
Token bird
Token ID6512
Feature activation-6.49e-01

Pos logit contributions
icious+1.405
 rule+1.397
reath+1.346
oral+1.335
'."+1.276
Neg logit contributions
 her-2.199
 the-2.094
 his-2.000
 their-1.980
 it-1.856
Token she
Token ID673
Feature activation+1.76e+00

Pos logit contributions
led+2.795
 her+2.512
 his+2.344
 their+2.263
 the+2.094
Neg logit contributions
 rule-8.548
 promise-8.157
 law-7.961
 sentence-7.941
 request-7.928
Token screamed
Token ID25421
Feature activation+1.67e+00

Pos logit contributions
 her+2.381
 the+2.068
 their+2.046
 his+1.985
 its+1.710
Neg logit contributions
 rule-4.150
 promise-4.037
age-3.910
ode-3.864
icious-3.823
Token for
Token ID329
Feature activation+3.00e+00

Pos logit contributions
 her+2.315
 the+2.081
 his+2.010
 their+2.002
led+1.964
Neg logit contributions
 rule-3.776
 promise-3.556
icious-3.487
 difference-3.434
 request-3.424
Token help
Token ID1037
Feature activation+1.38e+00

Pos logit contributions
led+7.973
ed+7.349
 her+7.297
 their+7.054
 the+6.935
Neg logit contributions
 rule-3.602
 promise-3.217
 charity-2.889
 request-2.882
reath-2.752
Token.
Token ID13
Feature activation+4.38e-01

Pos logit contributions
 her+2.188
 the+2.001
 their+1.972
 his+1.966
led+1.944
Neg logit contributions
 rule-2.880
 promise-2.673
age-2.591
icious-2.590
 basis-2.578
Token Soon
Token ID15894
Feature activation+5.04e+00

Pos logit contributions
 her+0.808
 the+0.739
 his+0.716
 their+0.708
led+0.643
Neg logit contributions
 rule-0.837
icious-0.779
 promise-0.775
reath-0.761
 difference-0.760
Token a
Token ID257
Feature activation-2.03e+00

Pos logit contributions
led+9.849
 his+7.751
 their+7.528
bled+7.505
 her+7.389
Neg logit contributions
 promise-10.199
age-10.119
 rule-10.083
 difference-9.985
Two-9.881
Token friendly
Token ID8030
Feature activation-4.38e-01

Pos logit contributions
ater+4.268
'."+4.264
opl+4.263
icious+4.250
.'"+4.217
Neg logit contributions
 her-3.502
 the-3.343
 his-3.147
 their-2.997
 it-2.872
Token polar
Token ID13559
Feature activation-9.36e-02

Pos logit contributions
 rule+0.608
icious+0.587
oral+0.556
 promise+0.554
reath+0.549
Neg logit contributions
 her-1.055
 the-0.992
 his-0.960
 their-0.942
 it-0.884
Token bear
Token ID6842
Feature activation+4.66e-01

Pos logit contributions
 rule+0.130
icious+0.120
 promise+0.116
oral+0.115
reath+0.114
Neg logit contributions
 her-0.224
 the-0.210
 his-0.206
 their-0.201
led-0.187
Token appeared
Token ID4120
Feature activation+1.30e+00

Pos logit contributions
 her+0.487
 the+0.411
 his+0.386
 their+0.378
led+0.307
Neg logit contributions
 rule-1.264
icious-1.201
 promise-1.201
 difference-1.183
reath-1.183
Token but
Token ID475
Feature activation+4.23e+00

Pos logit contributions
 her+3.615
 the+3.337
 their+3.331
led+3.309
 his+3.285
Neg logit contributions
 rule-4.088
 promise-3.713
icious-3.643
 world-3.603
 difference-3.588
Token she
Token ID673
Feature activation+1.20e+00

Pos logit contributions
led+6.210
ed+5.760
 their+5.603
 his+5.359
 her+5.226
Neg logit contributions
 promise-9.676
 rule-8.953
 request-8.945
 judge-8.606
oral-8.381
Token kept
Token ID4030
Feature activation+2.27e+00

Pos logit contributions
 her+1.714
 the+1.543
 their+1.491
 his+1.463
led+1.292
Neg logit contributions
 rule-2.769
 promise-2.685
icious-2.585
oral-2.570
umpy-2.529
Token walking
Token ID6155
Feature activation+7.14e-01

Pos logit contributions
 her+2.719
led+2.589
 the+2.387
 their+2.323
 his+2.319
Neg logit contributions
 rule-5.456
 promise-5.207
 request-5.020
 world-5.015
 difference-4.990
Token and
Token ID290
Feature activation+3.34e+00

Pos logit contributions
 her+0.969
 the+0.854
 his+0.819
 their+0.810
led+0.722
Neg logit contributions
 rule-1.708
 promise-1.610
icious-1.584
 difference-1.574
reath-1.569
Token soon
Token ID2582
Feature activation+5.02e+00

Pos logit contributions
 her+4.999
led+4.979
 their+4.513
 his+4.486
 the+4.467
Neg logit contributions
 rule-7.118
 promise-6.881
 world-6.641
 request-6.548
 basis-6.461
Token found
Token ID1043
Feature activation+2.66e+00

Pos logit contributions
 his+6.816
led+6.451
 her+6.241
 their+5.722
 how+5.643
Neg logit contributions
 promise-11.316
Two-11.194
age-11.171
 rule-11.030
ari-10.916
Token a
Token ID257
Feature activation-1.97e+00

Pos logit contributions
 her+1.997
led+1.904
 his+1.693
 their+1.631
 the+1.540
Neg logit contributions
 rule-7.538
 difference-7.113
age-7.049
 world-6.993
 promise-6.965
Token beautiful
Token ID4950
Feature activation-1.44e+00

Pos logit contributions
icious+4.442
reath+4.349
'."+4.340
rr+4.322
opl+4.308
Neg logit contributions
 her-3.132
 the-3.054
 their-2.786
 his-2.775
 it-2.622
Token flower
Token ID15061
Feature activation-1.32e+00

Pos logit contributions
icious+2.290
 rule+2.219
 Deal+2.210
umpy+2.123
reath+2.098
Neg logit contributions
 her-3.338
 the-3.192
 his-3.004
 their-2.987
 it-2.872
Token.
Token ID13
Feature activation-2.53e-01

Pos logit contributions
 rule+3.382
icious+3.346
oral+3.277
umpy+3.257
 Deal+3.247
Neg logit contributions
 her-1.723
 the-1.498
 their-1.335
 his-1.308
 it-1.221
Token around
Token ID1088
Feature activation+2.71e+00

Pos logit contributions
 her+5.813
led+5.154
 his+4.903
 their+4.825
ed+4.807
Neg logit contributions
 rule-6.380
 world-6.164
 promise-6.096
 difference-6.044
 hill-5.955
Token the
Token ID262
Feature activation-1.82e+00

Pos logit contributions
 her+2.683
led+2.449
 his+1.963
ed+1.813
 the+1.802
Neg logit contributions
 rule-7.332
 world-7.031
 difference-6.907
 basis-6.806
 promise-6.780
Token park
Token ID3952
Feature activation+1.66e+00

Pos logit contributions
 rule+2.187
icious+2.172
rr+2.155
affe+2.155
oral+2.153
Neg logit contributions
 her-4.799
 the-4.719
 his-4.349
 their-4.322
 it-4.109
Token,
Token ID11
Feature activation+3.64e+00

Pos logit contributions
 her+2.342
 his+1.999
 the+1.992
 their+1.898
led+1.833
Neg logit contributions
 rule-3.914
 promise-3.679
 difference-3.610
 world-3.564
 basis-3.555
Token and
Token ID290
Feature activation+4.11e+00

Pos logit contributions
 her+6.864
led+6.333
 his+6.052
 their+5.883
 the+5.874
Neg logit contributions
 rule-6.894
 promise-6.598
 world-6.307
 basis-6.194
 difference-6.192
Token finally
Token ID3443
Feature activation+5.02e+00

Pos logit contributions
 her+6.350
led+5.359
 his+5.143
 their+4.847
 the+4.806
Neg logit contributions
 rule-9.194
 promise-8.869
 occasion-8.698
 charity-8.574
 former-8.571
Token,
Token ID11
Feature activation+3.35e+00

Pos logit contributions
 her+7.976
led+7.075
 his+6.124
 their+5.533
 the+5.226
Neg logit contributions
ari-11.528
 promise-11.379
Two-10.774
oc-10.750
rr-10.670
Token Johnny
Token ID15470
Feature activation+2.12e+00

Pos logit contributions
 her+3.637
led+3.291
 his+2.520
 the+2.264
 their+2.191
Neg logit contributions
 rule-8.756
 promise-8.455
 charity-8.438
 hill-8.419
 road-8.364
Token found
Token ID1043
Feature activation+3.31e+00

Pos logit contributions
 her+3.251
 their+2.846
 his+2.814
 the+2.799
led+2.491
Neg logit contributions
 rule-4.499
 promise-4.453
age-4.388
ater-4.250
 difference-4.211
Token the
Token ID262
Feature activation-1.87e+00

Pos logit contributions
 her+3.111
led+2.614
 their+2.179
 the+2.058
 his+2.028
Neg logit contributions
 rule-8.774
 difference-8.561
 promise-8.344
oc-8.302
 world-8.290
Token watch
Token ID2342
Feature activation-1.02e+00

Pos logit contributions
oral+3.443
icious+3.439
rr+3.414
umpy+3.411
'."+3.407
Neg logit contributions
 her-3.770
 the-3.565
 his-3.279
 their-3.232
 it-2.999
Token and
Token ID290
Feature activation+2.21e+00

Pos logit contributions
 her+0.270
 the+0.237
 his+0.224
 their+0.219
led+0.187
Neg logit contributions
 rule-0.540
icious-0.512
 promise-0.508
reath-0.504
 difference-0.502
Token boy
Token ID2933
Feature activation+1.58e+00

Pos logit contributions
 her+3.100
led+2.899
 the+2.766
 their+2.677
 his+2.568
Neg logit contributions
 rule-4.668
 promise-4.503
 request-4.343
affe-4.334
 difference-4.322
Token continued
Token ID3767
Feature activation+1.95e+00

Pos logit contributions
 her+2.115
 the+1.862
 his+1.836
 their+1.792
 its+1.588
Neg logit contributions
 rule-3.633
 promise-3.617
icious-3.458
 difference-3.414
 request-3.401
Token walking
Token ID6155
Feature activation+1.18e+00

Pos logit contributions
 her+2.054
led+1.721
 the+1.709
 his+1.688
 their+1.536
Neg logit contributions
 rule-5.039
 promise-4.673
oral-4.624
 sentence-4.609
reath-4.605
Token and
Token ID290
Feature activation+2.96e+00

Pos logit contributions
 her+1.300
 the+1.072
 his+1.040
 their+0.993
led+0.914
Neg logit contributions
 rule-3.072
 promise-2.918
 difference-2.846
oral-2.828
reath-2.828
Token soon
Token ID2582
Feature activation+5.02e+00

Pos logit contributions
 her+4.669
led+4.217
 his+4.127
 their+3.929
 the+3.903
Neg logit contributions
 rule-6.050
 promise-5.766
 judge-5.749
 hill-5.650
 world-5.619
Token stopped
Token ID5025
Feature activation+4.94e-01

Pos logit contributions
 her+7.054
led+6.604
 his+5.946
bled+5.724
 how+5.449
Neg logit contributions
Two-11.481
 judge-11.458
 world-11.402
 difference-11.383
ari-11.266
Token at
Token ID379
Feature activation+1.84e+00

Pos logit contributions
 her+0.586
 the+0.506
 his+0.481
 their+0.470
led+0.402
Neg logit contributions
 rule-1.268
icious-1.200
 promise-1.200
 difference-1.183
reath-1.182
Token the
Token ID262
Feature activation-2.38e+00

Pos logit contributions
 her+1.560
led+1.237
 his+1.131
 the+1.098
 their+1.020
Neg logit contributions
 rule-5.285
 promise-5.000
 difference-4.964
 world-4.882
 charity-4.841
Token tree
Token ID5509
Feature activation-7.68e-01

Pos logit contributions
reath+4.501
rr+4.386
agons+4.359
icious+4.348
 rule+4.317
Neg logit contributions
 her-4.970
 the-4.948
 his-4.317
 their-4.201
 it-3.993
Token.
Token ID13
Feature activation+1.42e-01

Pos logit contributions
 rule+1.778
icious+1.725
reath+1.714
oral+1.679
 promise+1.671
Neg logit contributions
 her-1.152
 the-1.050
 his-0.979
 their-0.962
 it-0.830
Token walked
Token ID6807
Feature activation+2.44e+00

Pos logit contributions
 her+5.933
 the+5.117
 his+5.096
led+4.942
 their+4.936
Neg logit contributions
 promise-9.041
ode-8.323
 rule-8.284
 request-8.152
age-8.104
Token for
Token ID329
Feature activation+2.79e+00

Pos logit contributions
 her+3.551
 the+2.975
led+2.952
 his+2.923
 their+2.804
Neg logit contributions
 rule-5.446
 promise-5.210
age-5.019
 basis-4.996
 world-4.984
Token a
Token ID257
Feature activation-8.38e-03

Pos logit contributions
 her+4.509
led+4.375
ed+3.810
 his+3.774
 the+3.740
Neg logit contributions
 rule-5.864
 promise-5.496
 charity-5.431
 difference-5.310
 basis-5.306
Token while
Token ID981
Feature activation+1.51e+00

Pos logit contributions
 rule+0.017
icious+0.016
 promise+0.016
reath+0.016
oral+0.016
Neg logit contributions
 her-0.014
 the-0.013
 his-0.013
 their-0.013
 it-0.011
Token and
Token ID290
Feature activation+4.27e+00

Pos logit contributions
 her+2.190
 the+1.848
 his+1.831
 their+1.809
led+1.756
Neg logit contributions
 rule-3.407
 promise-3.195
oral-3.129
 difference-3.122
umpy-3.121
Token finally
Token ID3443
Feature activation+5.00e+00

Pos logit contributions
 her+7.128
led+6.010
 his+5.995
 their+5.759
 its+5.512
Neg logit contributions
 rule-8.838
 promise-8.743
 basis-8.687
oc-8.638
 occasion-8.606
Token found
Token ID1043
Feature activation+3.43e+00

Pos logit contributions
 her+9.010
led+7.434
 their+7.243
 his+7.200
 the+6.509
Neg logit contributions
 promise-10.829
oc-10.800
ari-10.775
affe-10.548
asis-10.461
Token the
Token ID262
Feature activation-8.97e-01

Pos logit contributions
 her+3.058
led+3.000
 their+1.973
 the+1.971
 his+1.909
Neg logit contributions
 rule-9.225
 difference-9.111
 promise-8.972
oc-8.684
 request-8.665
Token puppy
Token ID26188
Feature activation+1.54e+00

Pos logit contributions
 rule+1.349
reath+1.308
icious+1.306
ater+1.265
rr+1.258
Neg logit contributions
 her-2.063
 the-1.925
 his-1.861
 their-1.828
 it-1.706
Token's
Token ID338
Feature activation-5.78e-01

Pos logit contributions
 her+2.293
 the+2.023
 their+1.961
 his+1.958
led+1.779
Neg logit contributions
 rule-3.408
 promise-3.297
 difference-3.169
age-3.131
icious-3.122
Token home
Token ID1363
Feature activation+1.45e+00

Pos logit contributions
 rule+1.101
icious+1.086
reath+1.058
oral+1.049
ater+1.036
Neg logit contributions
 her-1.087
 the-0.990
 his-0.965
 their-0.941
 it-0.852
Token she
Token ID673
Feature activation+9.87e-01

Pos logit contributions
led+5.402
 his+4.965
 her+4.929
 their+4.698
bled+4.613
Neg logit contributions
 promise-7.448
-7.407
 rule-7.382
'."-7.187
 difference-7.073
Token found
Token ID1043
Feature activation+3.97e+00

Pos logit contributions
 her+1.298
 the+1.145
 their+1.101
 his+1.085
led+0.928
Neg logit contributions
 rule-2.366
 promise-2.282
icious-2.232
oral-2.200
reath-2.190
Token it
Token ID340
Feature activation+1.74e+00

Pos logit contributions
led+4.138
 their+3.531
 his+3.108
 her+3.096
ged+2.908
Neg logit contributions
 rule-10.210
age-9.956
 world-9.873
 difference-9.840
 veterinarian-9.673
Token!
Token ID0
Feature activation+5.77e-01

Pos logit contributions
 her+2.260
 their+2.042
 the+2.012
 his+1.986
led+1.724
Neg logit contributions
 rule-4.123
 promise-3.971
icious-3.849
 world-3.788
age-3.761
Token She
Token ID1375
Feature activation+2.17e+00

Pos logit contributions
 her+0.950
 the+0.859
 his+0.833
 their+0.823
led+0.743
Neg logit contributions
 rule-1.211
icious-1.132
 promise-1.128
 difference-1.111
reath-1.111
Token closed
Token ID4838
Feature activation+5.00e+00

Pos logit contributions
 her+3.312
 their+3.035
 the+2.959
 his+2.825
 its+2.565
Neg logit contributions
 promise-4.579
 rule-4.562
ode-4.262
age-4.259
 request-4.257
Token her
Token ID607
Feature activation+7.52e-01

Pos logit contributions
led+5.025
ed+4.107
icked+3.460
 able+3.287
 their+2.737
Neg logit contributions
 shelf-13.571
 rule-13.383
'."-13.045
.'"-12.939
!'"-12.931
Token eyes
Token ID2951
Feature activation+1.88e+00

Pos logit contributions
 her+1.175
 the+1.053
 his+1.012
 their+1.010
led+0.904
Neg logit contributions
 rule-1.657
 promise-1.562
icious-1.528
 difference-1.526
reath-1.521
Token,
Token ID11
Feature activation+3.82e+00

Pos logit contributions
 her+2.805
 their+2.521
 his+2.491
 the+2.474
led+2.214
Neg logit contributions
 rule-4.034
 promise-3.824
icious-3.801
ater-3.784
age-3.762
Token and
Token ID290
Feature activation+3.90e+00

Pos logit contributions
 their+6.200
 her+6.179
led+5.816
 his+5.711
 the+5.613
Neg logit contributions
 rule-7.187
 promise-7.067
ode-6.643
age-6.635
ater-6.575
Token soon
Token ID2582
Feature activation+4.54e+00

Pos logit contributions
 her+5.977
 their+5.777
 his+5.604
 the+5.376
led+5.226
Neg logit contributions
 promise-7.832
 rule-7.514
age-7.197
 request-7.088
 shelf-6.928
TokenThey
Token ID2990
Feature activation+2.63e+00

Pos logit contributions
 rule+1.177
 difference+1.102
 promise+1.093
reath+1.083
icious+1.049
Neg logit contributions
 her-1.636
 the-1.519
 his-1.489
 their-1.454
 it-1.343
Token kept
Token ID4030
Feature activation+2.91e+00

Pos logit contributions
 her+3.553
 the+3.127
 his+3.070
 their+2.986
led+2.844
Neg logit contributions
 rule-6.083
 promise-6.026
ode-5.635
 request-5.635
Two-5.635
Token walking
Token ID6155
Feature activation+1.20e+00

Pos logit contributions
 her+4.491
led+4.261
 the+3.871
 his+3.836
ed+3.700
Neg logit contributions
 rule-6.112
 promise-5.713
ode-5.501
agons-5.497
 world-5.462
Token,
Token ID11
Feature activation+2.72e+00

Pos logit contributions
 her+1.624
 the+1.402
 his+1.348
 their+1.328
led+1.241
Neg logit contributions
 rule-2.884
 promise-2.726
 difference-2.625
umpy-2.618
reath-2.608
Token and
Token ID290
Feature activation+3.27e+00

Pos logit contributions
 her+5.009
led+4.557
 their+4.434
 his+4.427
 the+4.377
Neg logit contributions
 rule-5.071
 promise-4.707
 basis-4.501
 ramp-4.432
 hill-4.420
Token soon
Token ID2582
Feature activation+4.97e+00

Pos logit contributions
 her+4.763
led+4.255
 his+3.862
 the+3.774
 their+3.753
Neg logit contributions
 rule-7.415
 promise-6.860
 hill-6.839
 road-6.836
 basis-6.735
Token they
Token ID484
Feature activation+3.30e+00

Pos logit contributions
 her+6.593
led+6.441
bled+5.021
 how+4.820
 his+4.667
Neg logit contributions
Two-11.542
 river-11.481
 road-11.359
asis-11.344
 world-11.227
Token came
Token ID1625
Feature activation+1.75e+00

Pos logit contributions
 her+5.410
 his+4.603
 their+4.528
 the+4.431
led+4.319
Neg logit contributions
 rule-6.807
Two-6.635
 promise-6.631
umpy-6.510
asis-6.447
Token to
Token ID284
Feature activation+2.09e-01

Pos logit contributions
 her+2.304
 the+1.931
 his+1.917
 their+1.864
led+1.820
Neg logit contributions
 rule-4.183
 promise-3.978
umpy-3.954
 world-3.944
rr-3.864
Token a
Token ID257
Feature activation-1.67e+00

Pos logit contributions
 her+0.147
 the+0.116
 his+0.103
 their+0.099
 it+0.064
Neg logit contributions
 rule-0.644
icious-0.619
 promise-0.614
reath-0.611
oral-0.609
Token playground
Token ID24817
Feature activation-1.86e-02

Pos logit contributions
icious+3.379
 rule+3.339
reath+3.313
rr+3.305
agons+3.302
Neg logit contributions
 her-2.970
 the-2.955
 his-2.643
 their-2.636
 it-2.530
Token park
Token ID3952
Feature activation+2.35e+00

Pos logit contributions
 rule+0.287
icious+0.268
reath+0.262
 promise+0.257
oral+0.252
Neg logit contributions
 her-0.619
 the-0.584
 their-0.560
 his-0.559
 it-0.522
Token for
Token ID329
Feature activation+4.04e+00

Pos logit contributions
 her+3.464
 his+2.984
 the+2.934
led+2.888
 their+2.827
Neg logit contributions
 rule-5.474
 promise-5.146
 difference-4.968
 world-4.966
 basis-4.925
Token the
Token ID262
Feature activation-4.05e-01

Pos logit contributions
led+6.937
 her+6.399
ed+6.013
icked+5.389
ied+5.373
Neg logit contributions
 rule-8.706
 charity-8.387
 promise-8.245
 difference-8.040
 world-8.007
Token frame
Token ID5739
Feature activation+8.94e-01

Pos logit contributions
 rule+0.649
icious+0.626
reath+0.615
oral+0.599
'."+0.598
Neg logit contributions
 her-0.872
 the-0.812
 their-0.769
 his-0.756
 it-0.715
Token,
Token ID11
Feature activation+3.44e+00

Pos logit contributions
 her+1.246
 the+1.097
 his+1.060
 their+1.040
led+0.929
Neg logit contributions
 rule-2.099
 promise-1.984
 difference-1.938
icious-1.937
age-1.908
Token but
Token ID475
Feature activation+4.96e+00

Pos logit contributions
 her+6.860
led+6.644
 his+6.098
ed+6.044
 their+6.040
Neg logit contributions
 rule-6.057
 promise-5.763
 world-5.443
 difference-5.356
ode-5.302
Token they
Token ID484
Feature activation+3.34e+00

Pos logit contributions
led+6.642
 her+6.613
ed+5.430
 his+4.825
icked+4.812
Neg logit contributions
 rule-12.507
 promise-12.493
 hill-12.129
 road-11.804
 river-11.247
Token couldn
Token ID3521
Feature activation-1.87e-01

Pos logit contributions
 her+5.169
 their+4.573
 his+4.566
 the+4.533
led+4.244
Neg logit contributions
 promise-7.564
 rule-7.342
age-6.896
ode-6.801
 request-6.797
Token't
Token ID470
Feature activation-3.44e-01

Pos logit contributions
 rule+0.296
 promise+0.273
icious+0.273
reath+0.271
ater+0.267
Neg logit contributions
 her-0.414
 the-0.387
 his-0.373
 their-0.369
 it-0.344
Token find
Token ID1064
Feature activation+3.36e+00

Pos logit contributions
 rule+0.501
icious+0.476
oral+0.458
reath+0.454
 difference+0.449
Neg logit contributions
 her-0.803
 the-0.750
 his-0.726
 their-0.714
 it-0.667
Token it
Token ID340
Feature activation+2.29e+00

Pos logit contributions
led+2.469
 her+2.338
ed+1.568
 their+1.364
 the+1.356
Neg logit contributions
 rule-9.921
 promise-9.526
 difference-9.507
 law-9.025
 request-8.966
Token went
Token ID1816
Feature activation+1.59e+00

Pos logit contributions
 her+0.505
 the+0.447
 his+0.428
 their+0.420
led+0.365
Neg logit contributions
 rule-0.853
icious-0.806
 promise-0.803
reath-0.792
 difference-0.790
Token off
Token ID572
Feature activation+2.04e+00

Pos logit contributions
 her+1.688
 the+1.433
 his+1.368
 their+1.351
led+1.321
Neg logit contributions
 rule-4.239
 promise-3.976
 request-3.852
age-3.828
 sentence-3.818
Token upset
Token ID9247
Feature activation+1.18e+00

Pos logit contributions
 her+2.148
led+1.844
 the+1.739
 their+1.703
 his+1.683
Neg logit contributions
 rule-5.310
 promise-4.976
 request-4.841
 world-4.814
oral-4.801
Token,
Token ID11
Feature activation+8.86e-01

Pos logit contributions
 her+1.547
 the+1.369
 their+1.292
 his+1.274
led+1.167
Neg logit contributions
 rule-2.826
 promise-2.617
icious-2.602
 difference-2.585
reath-2.578
Token but
Token ID475
Feature activation+3.03e+00

Pos logit contributions
 her+1.325
 the+1.183
 his+1.133
 their+1.129
led+1.040
Neg logit contributions
 rule-1.977
 promise-1.844
icious-1.833
 difference-1.809
oral-1.789
Token soon
Token ID2582
Feature activation+4.94e+00

Pos logit contributions
 her+3.708
led+3.359
 their+3.320
 the+3.132
ed+2.881
Neg logit contributions
 rule-7.184
 promise-7.121
 request-6.682
 judge-6.664
oral-6.507
Token he
Token ID339
Feature activation+1.64e+00

Pos logit contributions
led+6.802
 her+6.725
 their+5.793
 its+5.276
bled+5.211
Neg logit contributions
 team-11.180
 judge-11.083
 goal-10.953
 story-10.897
 promise-10.803
Token saw
Token ID2497
Feature activation+2.22e+00

Pos logit contributions
 her+2.286
 their+1.977
 the+1.976
 his+1.915
 its+1.684
Neg logit contributions
 rule-3.832
 promise-3.729
icious-3.606
ater-3.593
umpy-3.588
Token his
Token ID465
Feature activation-1.15e+00

Pos logit contributions
 her+1.723
led+1.366
 their+1.302
 the+1.258
 his+1.098
Neg logit contributions
 rule-6.353
 difference-5.984
 promise-5.891
 world-5.872
 charity-5.845
Token best
Token ID1266
Feature activation-1.57e+00

Pos logit contributions
icious+2.481
 rule+2.435
'."+2.431
!'"+2.396
reath+2.387
Neg logit contributions
 her-1.929
 the-1.777
 his-1.722
 their-1.645
 it-1.435
Token friend
Token ID1545
Feature activation+4.29e-01

Pos logit contributions
icious+2.723
 rule+2.595
reath+2.512
oral+2.508
'."+2.500
Neg logit contributions
 her-3.487
 the-3.229
 his-3.198
 their-3.078
 it-2.850
Token to
Token ID284
Feature activation-1.89e+00

Pos logit contributions
 her+1.834
 the+1.596
led+1.577
 their+1.561
 his+1.525
Neg logit contributions
 rule-4.068
 promise-3.813
 difference-3.760
 sentence-3.759
icious-3.757
Token sleep
Token ID3993
Feature activation+1.32e+00

Pos logit contributions
oral+3.387
umpy+3.282
icious+3.241
 rule+3.215
affe+3.190
Neg logit contributions
 her-4.320
 the-4.014
 his-3.909
 their-3.559
 it-3.417
Token again
Token ID757
Feature activation-6.58e-01

Pos logit contributions
 her+1.499
 the+1.292
 their+1.252
 his+1.238
led+1.077
Neg logit contributions
 rule-3.384
icious-3.168
 promise-3.162
reath-3.129
'."-3.093
Token.
Token ID13
Feature activation-1.93e-01

Pos logit contributions
 rule+1.653
icious+1.602
oral+1.576
reath+1.576
 promise+1.559
Neg logit contributions
 her-0.901
 the-0.796
 his-0.740
 their-0.715
 it-0.615
Token She
Token ID1375
Feature activation+1.27e+00

Pos logit contributions
 rule+0.417
icious+0.397
 promise+0.394
reath+0.389
oral+0.387
Neg logit contributions
 her-0.318
 the-0.284
 his-0.270
 their-0.264
 it-0.237
Token closed
Token ID4838
Feature activation+4.94e+00

Pos logit contributions
 her+1.998
 the+1.801
 their+1.770
 his+1.722
 its+1.532
Neg logit contributions
 rule-2.718
 promise-2.617
icious-2.501
reath-2.486
oral-2.479
Token her
Token ID607
Feature activation+5.66e-01

Pos logit contributions
led+3.310
ed+2.789
 able+2.208
icked+1.990
 they+1.759
Neg logit contributions
 rule-14.389
 shelf-14.304
'."-13.881
.'"-13.783
ute-13.662
Token eyes
Token ID2951
Feature activation+1.74e+00

Pos logit contributions
 her+0.859
 the+0.770
 his+0.736
 their+0.732
led+0.647
Neg logit contributions
 rule-1.270
 promise-1.193
icious-1.184
reath-1.172
 difference-1.169
Token tightly
Token ID17707
Feature activation+1.85e+00

Pos logit contributions
 her+2.547
 their+2.272
 the+2.261
 his+2.238
led+1.976
Neg logit contributions
 rule-3.826
 promise-3.600
icious-3.556
age-3.556
ater-3.535
Token and
Token ID290
Feature activation+1.97e+00

Pos logit contributions
 her+2.676
 their+2.373
 the+2.360
 his+2.349
led+2.052
Neg logit contributions
 rule-4.077
 promise-3.814
ater-3.793
'."-3.769
age-3.745
Token counted
Token ID14789
Feature activation+2.41e+00

Pos logit contributions
 her+3.384
 the+3.119
 their+3.079
 his+3.014
led+2.827
Neg logit contributions
 rule-3.790
 promise-3.686
 request-3.414
 sentence-3.408
age-3.389
TokenL
Token ID43
Feature activation+5.65e-01

Pos logit contributions
 rule+1.946
reath+1.809
 difference+1.803
ode+1.789
asis+1.780
Neg logit contributions
 her-2.332
 the-2.129
 their-2.043
 his-2.004
 it-1.850
Tokenily
Token ID813
Feature activation+6.55e-01

Pos logit contributions
 her+0.948
 the+0.865
 his+0.834
 their+0.823
 it+0.730
Neg logit contributions
 rule-1.178
icious-1.105
 promise-1.101
reath-1.086
oral-1.082
Token resisted
Token ID26643
Feature activation+9.99e-01

Pos logit contributions
 her+0.840
 the+0.736
 his+0.703
 their+0.699
led+0.586
Neg logit contributions
 rule-1.608
 promise-1.532
icious-1.525
oral-1.495
 difference-1.492
Token,
Token ID11
Feature activation+1.64e+00

Pos logit contributions
 her+1.311
 the+1.151
 his+1.121
 their+1.106
led+1.033
Neg logit contributions
 rule-2.392
 promise-2.260
icious-2.237
 difference-2.215
reath-2.189
Token but
Token ID475
Feature activation+4.32e+00

Pos logit contributions
 her+2.654
 his+2.408
 the+2.393
 their+2.372
led+2.287
Neg logit contributions
 rule-3.388
 promise-3.155
icious-3.101
 difference-3.053
 basis-3.015
Token eventually
Token ID4191
Feature activation+4.93e+00

Pos logit contributions
 his+6.451
led+6.399
 their+5.907
 her+5.770
ed+5.738
Neg logit contributions
 promise-10.042
 rule-9.361
 request-8.604
 judge-8.519
oral-8.449
Token she
Token ID673
Feature activation+1.25e+00

Pos logit contributions
led+7.491
 his+7.035
 their+6.461
 her+6.365
ged+6.138
Neg logit contributions
 promise-11.844
 rule-10.980
 judge-10.670
 story-10.565
-10.458
Token gave
Token ID2921
Feature activation+3.13e+00

Pos logit contributions
 her+1.717
 the+1.496
 their+1.457
 his+1.432
 its+1.236
Neg logit contributions
 rule-2.964
 promise-2.856
icious-2.804
oral-2.772
umpy-2.762
Token in
Token ID287
Feature activation+8.16e-01

Pos logit contributions
 her+2.389
led+2.349
 their+2.240
 his+2.204
 the+1.984
Neg logit contributions
 rule-8.430
 promise-8.274
 difference-8.065
 request-7.899
 world-7.823
Token and
Token ID290
Feature activation+3.16e+00

Pos logit contributions
 her+0.617
 the+0.490
 his+0.465
 their+0.449
led+0.367
Neg logit contributions
 rule-2.432
 promise-2.327
icious-2.308
oral-2.278
 difference-2.276
Token took
Token ID1718
Feature activation+2.96e+00

Pos logit contributions
 her+3.812
 his+3.642
 their+3.400
 the+3.341
led+3.247
Neg logit contributions
 promise-7.327
 rule-7.176
 request-6.762
 difference-6.721
icious-6.650
Token little
Token ID1310
Feature activation+1.06e+00

Pos logit contributions
 rule+1.789
icious+1.749
'."+1.710
!'"+1.692
age+1.670
Neg logit contributions
 her-1.935
 the-1.798
 his-1.664
 their-1.631
 it-1.524
Token girl
Token ID2576
Feature activation+7.28e-01

Pos logit contributions
 her+2.742
 his+2.578
 the+2.573
 their+2.546
led+2.391
Neg logit contributions
 rule-1.199
 promise-1.083
 difference-1.029
umpy-0.998
oral-0.996
Token kept
Token ID4030
Feature activation+2.25e+00

Pos logit contributions
 her+0.955
 the+0.831
 his+0.806
 their+0.792
led+0.678
Neg logit contributions
 rule-1.762
 promise-1.682
icious-1.664
 difference-1.639
oral-1.635
Token walking
Token ID6155
Feature activation+1.00e+00

Pos logit contributions
 her+3.217
led+3.055
 their+2.857
 the+2.832
 his+2.787
Neg logit contributions
 rule-4.912
 promise-4.609
 sentence-4.474
 request-4.443
 world-4.435
Token and
Token ID290
Feature activation+3.12e+00

Pos logit contributions
 her+1.260
 the+1.089
 his+1.051
 their+1.050
led+0.950
Neg logit contributions
 rule-2.473
 promise-2.335
 difference-2.283
icious-2.266
umpy-2.264
Token soon
Token ID2582
Feature activation+4.93e+00

Pos logit contributions
 her+4.872
led+4.549
 their+4.413
 his+4.336
 the+4.206
Neg logit contributions
 rule-6.474
 promise-6.228
 world-6.086
 hill-5.987
asis-5.953
Token found
Token ID1043
Feature activation+3.34e+00

Pos logit contributions
 his+6.935
led+6.744
 her+6.707
 their+6.129
bled+6.068
Neg logit contributions
 world-10.934
 promise-10.649
pillar-10.628
 difference-10.594
age-10.572
Token a
Token ID257
Feature activation-1.71e+00

Pos logit contributions
led+3.068
 their+2.725
 her+2.652
 his+2.470
ged+2.219
Neg logit contributions
 rule-9.057
 difference-8.633
 world-8.593
 promise-8.483
reath-8.420
Token soft
Token ID2705
Feature activation-7.84e-01

Pos logit contributions
icious+3.737
'."+3.737
 rule+3.712
oral+3.688
opl+3.685
Neg logit contributions
 her-2.773
 the-2.623
 his-2.424
 their-2.336
 it-2.230
Token,
Token ID11
Feature activation+1.64e-01

Pos logit contributions
 rule+1.496
icious+1.469
oral+1.435
reath+1.391
 promise+1.391
Neg logit contributions
 her-1.508
 the-1.393
 his-1.326
 their-1.271
 it-1.211
Token squ
Token ID2809
Feature activation+1.40e-01

Pos logit contributions
 her+0.282
 the+0.257
 his+0.247
 their+0.243
 it+0.218
Neg logit contributions
 rule-0.334
icious-0.314
 promise-0.311
oral-0.309
reath-0.308
Token open
Token ID1280
Feature activation+3.81e+00

Pos logit contributions
 her+0.038
 the+0.035
 his+0.034
 their+0.034
 it+0.031
Neg logit contributions
 rule-0.030
icious-0.028
oral-0.028
reath-0.027
 difference-0.027
Token the
Token ID262
Feature activation-2.13e-01

Pos logit contributions
led+2.876
ed+1.985
 her+1.767
 his+1.634
bled+1.485
Neg logit contributions
 rule-11.236
 world-10.865
 difference-10.779
 hill-10.657
 shelf-10.642
Token box
Token ID3091
Feature activation+2.03e+00

Pos logit contributions
 rule+0.264
icious+0.249
reath+0.236
 promise+0.234
umpy+0.233
Neg logit contributions
 her-0.536
 the-0.511
 his-0.495
 their-0.490
 it-0.459
Token,
Token ID11
Feature activation+2.76e+00

Pos logit contributions
 her+2.842
 their+2.482
 his+2.471
 the+2.455
led+2.272
Neg logit contributions
 rule-4.746
 promise-4.479
 difference-4.301
 world-4.281
 request-4.271
Token but
Token ID475
Feature activation+4.72e+00

Pos logit contributions
 her+5.207
led+5.089
 their+4.810
 his+4.805
 the+4.621
Neg logit contributions
 rule-5.007
 promise-4.452
 request-4.291
 world-4.266
 basis-4.216
Token finally
Token ID3443
Feature activation+4.92e+00

Pos logit contributions
led+5.725
 his+5.260
 their+4.877
 her+4.592
ed+4.312
Neg logit contributions
 promise-12.121
 rule-12.000
 shelf-11.476
 goal-11.179
 request-11.032
Token managed
Token ID5257
Feature activation+3.07e+00

Pos logit contributions
led+7.966
 his+7.563
 their+7.397
 her+7.154
ged+6.381
Neg logit contributions
 promise-11.271
 rule-10.049
-9.986
'."-9.767
 request-9.751
Token to
Token ID284
Feature activation+5.75e-01

Pos logit contributions
led+3.719
 her+3.407
 his+3.053
 their+2.959
 the+2.806
Neg logit contributions
 rule-7.544
 promise-7.168
 sentence-6.849
 request-6.831
 difference-6.810
Token tear
Token ID11626
Feature activation+2.85e+00

Pos logit contributions
 her+1.167
 the+1.070
 his+1.043
 their+1.033
led+0.956
Neg logit contributions
 rule-0.997
 promise-0.924
icious-0.914
reath-0.894
 difference-0.894
Token it
Token ID340
Feature activation+1.35e+00

Pos logit contributions
 her+1.803
led+1.515
 his+1.484
 their+1.392
 the+1.179
Neg logit contributions
 rule-8.112
ater-7.890
 world-7.818
'."-7.802
!'"-7.769
Token open
Token ID1280
Feature activation+1.22e+00

Pos logit contributions
 her+1.835
 the+1.614
 their+1.581
 his+1.552
led+1.401
Neg logit contributions
 rule-3.172
 promise-3.009
icious-2.934
 difference-2.917
ater-2.879
Token
Token ID6184
Feature activation-2.14e+00

Pos logit contributions
 rule+5.389
reath+5.120
 promise+5.114
ater+5.063
icious+5.004
Neg logit contributions
 her-2.927
 the-2.645
 his-2.460
 their-2.359
 it-2.136
Token
Token ID95
Feature activation-2.65e+00

Pos logit contributions
 rule+3.267
 difference+3.003
 promise+2.944
ode+2.895
 sentence+2.874
Neg logit contributions
 her-5.063
 the-4.859
 their-4.489
 his-4.472
 it-4.336
Token
Token ID26391
Feature activation-1.69e+00

Pos logit contributions
 rule+3.924
 promise+3.713
 difference+3.701
ode+3.648
 affect+3.601
Neg logit contributions
 her-6.325
 the-6.113
 his-5.656
 their-5.477
 it-5.315
Token
Token ID129
Feature activation-1.48e+00

Pos logit contributions
 rule+3.463
reath+3.237
 difference+3.180
 promise+3.150
oral+3.147
Neg logit contributions
 her-3.110
 the-3.002
 their-2.705
 his-2.703
 it-2.659
Token
Token ID241
Feature activation-2.27e+00

Pos logit contributions
 rule+2.166
ode+2.049
 difference+2.038
 promise+2.030
oral+1.975
Neg logit contributions
 her-3.511
 the-3.307
 his-3.166
 their-3.132
 it-2.893
TokenThe
Token ID464
Feature activation-4.51e+00

Pos logit contributions
 difference+4.457
ode+4.449
 rule+4.430
reath+4.335
 promise+4.325
Neg logit contributions
 her-4.599
 the-4.283
 his-4.096
 their-4.003
 it-3.942
Token elephant
Token ID20950
Feature activation-1.91e+00

Pos logit contributions
ater+11.176
uer+11.101
iver+10.891
affe+10.872
icious+10.855
Neg logit contributions
 the-8.984
 her-8.889
 his-7.830
 their-7.138
 it-7.127
Token is
Token ID318
Feature activation-6.15e-01

Pos logit contributions
 rule+4.990
icious+4.915
arl+4.881
reath+4.869
oral+4.830
Neg logit contributions
 her-2.433
 the-2.335
 his-2.086
 their-1.875
 it-1.843
Token just
Token ID655
Feature activation-7.13e-01

Pos logit contributions
 rule+1.598
oral+1.538
icious+1.538
reath+1.522
 promise+1.516
Neg logit contributions
 her-0.758
 the-0.679
 his-0.634
 their-0.602
 it-0.514
Token saying
Token ID2282
Feature activation-8.91e-01

Pos logit contributions
 rule+2.003
oral+1.944
icious+1.940
reath+1.902
 promise+1.901
Neg logit contributions
 her-0.744
 the-0.658
 his-0.595
 their-0.561
 it-0.465
Token hello
Token ID23748
Feature activation-1.56e+00

Pos logit contributions
 rule+2.561
icious+2.495
 promise+2.468
oral+2.450
reath+2.449
Neg logit contributions
 her-0.884
 the-0.787
 his-0.712
 their-0.666
 it-0.573
Token baby
Token ID5156
Feature activation-1.77e+00

Pos logit contributions
icious+6.612
reath+6.577
 rule+6.558
oral+6.512
umpy+6.328
Neg logit contributions
 her-3.136
 the-3.094
 his-2.583
 their-2.432
 it-2.327
Token,
Token ID11
Feature activation-1.99e+00

Pos logit contributions
 rule+4.774
icious+4.634
oral+4.586
reath+4.584
arl+4.480
Neg logit contributions
 her-2.207
 the-2.127
 his-1.787
 their-1.711
 it-1.607
Token don
Token ID836
Feature activation-1.32e+00

Pos logit contributions
 rule+5.905
oral+5.816
reath+5.771
icious+5.760
umpy+5.756
Neg logit contributions
 her-2.069
 the-1.963
 his-1.604
 their-1.562
 it-1.473
Token't
Token ID470
Feature activation-3.05e+00

Pos logit contributions
 rule+2.403
reath+2.281
 promise+2.272
ater+2.236
ode+2.234
Neg logit contributions
 her-2.730
 the-2.629
 his-2.418
 their-2.359
 it-2.333
Token say
Token ID910
Feature activation-7.95e-01

Pos logit contributions
oral+7.200
reath+6.994
ari+6.888
opl+6.744
agons+6.695
Neg logit contributions
 her-6.056
 the-5.884
 his-5.252
 their-4.977
 it-4.883
Token a
Token ID257
Feature activation-4.47e+00

Pos logit contributions
 rule+1.936
oral+1.872
icious+1.870
umpy+1.852
reath+1.841
Neg logit contributions
 her-1.115
 the-1.014
 his-0.948
 their-0.918
 it-0.833
Token word
Token ID1573
Feature activation-2.55e+00

Pos logit contributions
ode+11.662
icious+11.496
opl+11.493
agons+11.280
iver+11.209
Neg logit contributions
 the-8.135
 her-7.857
 his-7.329
 their-7.033
 it-6.831
Token.
Token ID13
Feature activation-3.15e+00

Pos logit contributions
icious+6.962
reath+6.950
 rule+6.807
ode+6.790
agons+6.773
Neg logit contributions
 her-3.386
 the-3.066
 his-2.735
 their-2.678
 it-2.473
Token Mama
Token ID40084
Feature activation-1.80e+00

Pos logit contributions
 rule+7.987
asis+7.886
ater+7.736
reath+7.713
 promise+7.599
Neg logit contributions
 her-5.248
 the-4.946
 his-4.570
 their-4.327
 it-4.211
Token is
Token ID318
Feature activation-6.22e-01

Pos logit contributions
 rule+4.493
icious+4.336
 basis+4.270
oral+4.235
reath+4.230
Neg logit contributions
 her-2.684
 the-2.533
 his-2.219
 their-2.153
 it-2.035
Token going
Token ID1016
Feature activation-1.41e+00

Pos logit contributions
 rule+1.584
icious+1.540
oral+1.538
reath+1.516
 promise+1.514
Neg logit contributions
 her-0.789
 the-0.703
 his-0.648
 their-0.638
 it-0.537
TokenThat
Token ID2504
Feature activation-2.14e+00

Pos logit contributions
 rule+4.749
 difference+4.706
 promise+4.639
ode+4.627
reath+4.625
Neg logit contributions
 her-4.287
 the-4.036
 his-3.886
 their-3.720
 it-3.668
Tokenâ
Token ID22940
Feature activation-2.45e+00

Pos logit contributions
reath+5.096
arl+5.061
icious+4.994
 rule+4.983
oral+4.868
Neg logit contributions
 her-3.360
 the-3.321
 his-3.012
 their-2.823
 it-2.805
Token
Token ID26391
Feature activation-1.95e+00

Pos logit contributions
 rule+3.589
reath+3.165
 charity+3.157
 difference+3.140
oral+3.120
Neg logit contributions
 her-5.975
 the-5.864
 his-5.488
led-5.237
 their-5.212
Token
Token ID8151
Feature activation-1.31e+00

Pos logit contributions
 rule+4.125
reath+3.880
 difference+3.823
 promise+3.805
icious+3.768
Neg logit contributions
 her-3.540
 the-3.489
 his-3.175
 their-3.110
 it-3.104
Tokens
Token ID82
Feature activation-1.89e+00

Pos logit contributions
 rule+3.042
reath+2.924
 difference+2.858
ode+2.853
icious+2.851
Neg logit contributions
 her-1.990
 the-1.866
 his-1.730
 their-1.692
 it-1.533
Token the
Token ID262
Feature activation-4.36e+00

Pos logit contributions
 rule+4.854
oral+4.765
reath+4.732
ater+4.717
affe+4.705
Neg logit contributions
 her-2.544
 the-2.493
 his-2.207
 their-2.117
 it-1.908
Token bird
Token ID6512
Feature activation-1.79e+00

Pos logit contributions
oc+11.268
ater+11.086
icious+11.073
uer+11.009
affe+10.915
Neg logit contributions
 the-8.073
 her-8.009
 his-7.246
 it-6.299
 a-6.287
Token that
Token ID326
Feature activation-1.70e+00

Pos logit contributions
 rule+4.717
oral+4.581
icious+4.559
arl+4.525
reath+4.515
Neg logit contributions
 her-2.329
 the-2.187
 his-1.970
 their-1.820
 it-1.754
Token was
Token ID373
Feature activation-8.80e-01

Pos logit contributions
 rule+4.564
icious+4.498
reath+4.470
oral+4.443
arl+4.391
Neg logit contributions
 her-2.109
 the-1.990
 his-1.796
 their-1.656
 it-1.589
Token calling
Token ID4585
Feature activation-1.64e+00

Pos logit contributions
 rule+2.472
oral+2.388
icious+2.376
 promise+2.365
reath+2.357
Neg logit contributions
 her-0.927
 the-0.827
 his-0.749
 their-0.711
 it-0.595
Token!
Token ID0
Feature activation-2.92e+00

Pos logit contributions
 rule+5.077
icious+4.996
ode+4.956
reath+4.916
 promise+4.897
Neg logit contributions
 her-1.479
 the-1.305
 his-1.139
 their-1.041
 it-0.871
Token
Token ID6184
Feature activation-2.01e+00

Pos logit contributions
 rule+6.400
reath+6.192
ater+6.057
 promise+5.997
icious+5.968
Neg logit contributions
 her-4.369
 the-4.198
 his-3.849
 their-3.714
 it-3.604
Token
Token ID95
Feature activation-2.41e+00

Pos logit contributions
 rule+3.115
 difference+2.924
oral+2.866
 promise+2.865
 affect+2.821
Neg logit contributions
 her-4.612
 the-4.470
 his-4.133
 their-4.107
 it-4.006
Token
Token ID26391
Feature activation-1.79e+00

Pos logit contributions
 rule+3.327
 difference+3.213
 affect+3.174
 promise+3.164
oral+3.156
Neg logit contributions
 her-5.861
 the-5.698
 his-5.365
 their-5.176
 it-5.028
Token
Token ID129
Feature activation-1.31e+00

Pos logit contributions
 rule+3.298
reath+3.127
 difference+3.036
icious+3.026
oral+3.025
Neg logit contributions
 her-3.619
 the-3.547
 his-3.223
 their-3.218
 it-3.213
Token
Token ID241
Feature activation-2.28e+00

Pos logit contributions
 rule+1.864
ode+1.774
 difference+1.773
 promise+1.755
oral+1.746
Neg logit contributions
 her-3.106
 the-2.935
 his-2.824
 their-2.781
 it-2.589
TokenThe
Token ID464
Feature activation-4.36e+00

Pos logit contributions
 difference+4.537
ode+4.513
reath+4.399
 rule+4.398
icious+4.379
Neg logit contributions
 her-4.594
 the-4.295
 his-4.110
 their-4.059
 it-3.972
Token refrigerator
Token ID30500
Feature activation-2.25e+00

Pos logit contributions
uer+11.179
ater+11.111
oc+11.069
affe+10.978
iver+10.936
Neg logit contributions
 the-8.103
 her-7.985
 his-7.146
 their-6.497
 all-6.372
Token door
Token ID3420
Feature activation-1.30e+00

Pos logit contributions
reath+4.660
icious+4.625
 basis+4.624
 rule+4.542
oral+4.511
Neg logit contributions
 her-4.100
 the-3.950
 his-3.621
 their-3.518
 it-3.386
Token is
Token ID318
Feature activation-4.28e-01

Pos logit contributions
 rule+2.733
icious+2.681
reath+2.675
 difference+2.626
 basis+2.613
Neg logit contributions
 her-2.254
 the-2.096
 his-1.926
 their-1.915
 it-1.775
Token not
Token ID407
Feature activation-1.32e-01

Pos logit contributions
 rule+1.091
icious+1.059
reath+1.050
oral+1.040
 difference+1.039
Neg logit contributions
 her-0.534
 the-0.477
 his-0.443
 their-0.434
 it-0.367
Token for
Token ID329
Feature activation-4.21e-01

Pos logit contributions
 rule+0.329
icious+0.317
oral+0.313
reath+0.311
 promise+0.310
Neg logit contributions
 her-0.172
 the-0.153
 his-0.145
 their-0.142
 it-0.120
Token or
Token ID393
Feature activation-1.13e+00

Pos logit contributions
 rule+3.236
reath+3.222
icious+3.221
oral+3.143
 sentence+3.101
Neg logit contributions
 her-1.745
 the-1.634
 his-1.472
 their-1.454
 it-1.266
Token baskets
Token ID46530
Feature activation-1.45e+00

Pos logit contributions
icious+2.436
 rule+2.406
umpy+2.390
oral+2.371
reath+2.350
Neg logit contributions
 her-1.952
 the-1.804
 his-1.721
 their-1.704
 it-1.475
Token,
Token ID11
Feature activation-1.30e+00

Pos logit contributions
icious+3.404
 rule+3.386
reath+3.381
oral+3.280
opus+3.268
Neg logit contributions
 her-2.202
 the-2.088
 their-1.900
 his-1.895
 it-1.675
Token to
Token ID284
Feature activation-1.89e+00

Pos logit contributions
 rule+3.095
icious+3.052
reath+3.026
oral+3.009
opus+2.988
Neg logit contributions
 her-1.903
 the-1.733
 their-1.637
 his-1.619
 it-1.395
Token sing
Token ID1702
Feature activation-1.49e+00

Pos logit contributions
umpy+4.760
oral+4.750
opus+4.748
affe+4.687
icious+4.634
Neg logit contributions
 her-2.865
 the-2.629
 his-2.433
 their-2.389
 it-2.041
Token a
Token ID257
Feature activation-4.35e+00

Pos logit contributions
 rule+4.221
icious+4.155
oral+4.086
asis+4.057
reath+4.054
Neg logit contributions
 her-1.672
 the-1.508
 their-1.384
 his-1.333
 it-1.126
Token song
Token ID3496
Feature activation-2.29e+00

Pos logit contributions
opl+11.578
umpy+11.345
icious+11.099
asis+11.078
ode+11.064
Neg logit contributions
 her-7.665
 the-7.408
 their-7.069
 his-6.773
 it-6.559
Token or
Token ID393
Feature activation-1.65e+00

Pos logit contributions
icious+6.344
reath+6.231
oral+6.179
 rule+6.176
ode+6.156
Neg logit contributions
 her-2.917
 the-2.704
 their-2.428
 his-2.325
 it-2.065
Token tell
Token ID1560
Feature activation-1.34e+00

Pos logit contributions
icious+4.044
umpy+4.003
opus+3.980
oral+3.977
 rule+3.969
Neg logit contributions
 her-2.472
 the-2.268
 his-2.117
 their-2.102
 it-1.741
Token a
Token ID257
Feature activation-3.96e+00

Pos logit contributions
 rule+3.956
icious+3.880
umpy+3.850
asis+3.827
ode+3.814
Neg logit contributions
 her-1.288
 the-1.103
 their-1.006
 his-0.967
 it-0.765
Token story
Token ID1621
Feature activation-1.73e+00

Pos logit contributions
asis+11.375
ode+11.337
icious+11.286
ater+11.270
iver+11.143
Neg logit contributions
 the-6.042
 her-5.837
 their-5.361
 his-5.341
 it-5.119
Tokenmy
Token ID1820
Feature activation-1.96e+00

Pos logit contributions
 rule+4.177
icious+4.086
reath+4.002
oral+3.933
Two+3.884
Neg logit contributions
 her-2.892
 the-2.660
 his-2.526
 their-2.289
 it-2.080
Token.
Token ID13
Feature activation-3.06e+00

Pos logit contributions
 rule+4.763
icious+4.678
reath+4.571
oral+4.565
arl+4.481
Neg logit contributions
 her-3.107
 the-2.910
 his-2.721
 their-2.508
 it-2.378
Token You
Token ID921
Feature activation-1.41e+00

Pos logit contributions
 rule+7.548
reath+7.338
ater+7.329
 promise+7.037
icious+6.975
Neg logit contributions
 her-5.196
 the-5.081
 his-4.944
 their-4.485
 it-4.296
Token listened
Token ID16399
Feature activation-1.06e+00

Pos logit contributions
 rule+2.703
icious+2.616
reath+2.567
oral+2.553
arl+2.548
Neg logit contributions
 her-2.734
 the-2.637
 his-2.432
 their-2.361
 it-2.265
Token to
Token ID284
Feature activation-1.27e+00

Pos logit contributions
 rule+3.235
icious+3.168
reath+3.149
 promise+3.099
ode+3.095
Neg logit contributions
 her-0.835
 the-0.714
 his-0.620
 their-0.601
 it-0.460
Token the
Token ID262
Feature activation-4.34e+00

Pos logit contributions
 rule+3.356
umpy+3.251
 promise+3.225
icious+3.225
oral+3.222
Neg logit contributions
 her-1.694
 the-1.528
 his-1.433
 their-1.355
 it-1.137
Token cater
Token ID20825
Feature activation-2.58e+00

Pos logit contributions
icious+10.198
iver+10.088
umpy+10.083
 basis+9.907
age+9.890
Neg logit contributions
 the-8.900
 her-8.700
 his-7.874
 their-7.336
 a-7.026
Tokenpillar
Token ID41643
Feature activation-2.23e+00

Pos logit contributions
 rule+2.580
 promise+2.312
icious+2.125
 Deal+2.112
 sentence+2.093
Neg logit contributions
 her-7.880
 the-7.532
led-6.953
 his-6.913
 it-6.903
Token and
Token ID290
Feature activation-7.56e-01

Pos logit contributions
icious+6.033
 rule+5.993
reath+5.856
Two+5.853
arl+5.848
Neg logit contributions
 her-2.964
 the-2.669
 his-2.395
 their-2.260
 it-2.053
Token respected
Token ID14462
Feature activation-9.22e-01

Pos logit contributions
 rule+1.730
icious+1.692
reath+1.651
oral+1.646
umpy+1.645
Neg logit contributions
 her-1.183
 the-1.087
 his-1.024
 their-0.992
 it-0.900
Token his
Token ID465
Feature activation-3.29e+00

Pos logit contributions
 rule+3.008
icious+2.939
reath+2.908
 promise+2.901
oral+2.891
Neg logit contributions
 her-0.608
 the-0.504
 his-0.413
 their-0.375
 it-0.279
Token thing
Token ID1517
Feature activation-1.66e+00

Pos logit contributions
icious+6.726
reath+6.507
 Driver+6.485
Two+6.440
agons+6.438
Neg logit contributions
 her-2.972
 the-2.895
 his-2.601
 their-2.335
 it-2.330
Token from
Token ID422
Feature activation-1.11e+00

Pos logit contributions
 rule+4.388
icious+4.367
oral+4.335
reath+4.319
 basis+4.248
Neg logit contributions
 her-2.167
 the-1.938
 his-1.766
 their-1.755
 it-1.590
Token the
Token ID262
Feature activation-3.93e+00

Pos logit contributions
 rule+3.222
icious+3.172
oral+3.135
reath+3.120
umpy+3.117
Neg logit contributions
 her-1.101
 the-0.975
 his-0.854
 their-0.822
 it-0.674
Token basket
Token ID7988
Feature activation-2.27e+00

Pos logit contributions
icious+9.136
umpy+8.974
iver+8.928
age+8.675
ides+8.515
Neg logit contributions
 the-8.162
 her-7.806
 his-7.106
 their-6.935
 it-6.534
Token as
Token ID355
Feature activation-1.13e+00

Pos logit contributions
reath+6.200
icious+6.156
 rule+6.072
oral+5.980
agons+5.922
Neg logit contributions
 her-2.858
 the-2.745
 his-2.347
 their-2.260
 it-2.048
Token a
Token ID257
Feature activation-4.32e+00

Pos logit contributions
 rule+3.436
icious+3.383
 difference+3.329
reath+3.305
 promise+3.304
Neg logit contributions
 her-0.979
 the-0.857
 his-0.729
 their-0.716
 it-0.600
Token gift
Token ID6979
Feature activation-2.23e+00

Pos logit contributions
opl+11.587
icious+11.188
iver+11.134
umpy+10.881
agons+10.821
Neg logit contributions
 her-8.094
 the-7.799
 his-7.067
 it-6.989
 their-6.951
Token.
Token ID13
Feature activation-2.71e+00

Pos logit contributions
icious+5.738
 rule+5.575
reath+5.543
oral+5.520
Two+5.512
Neg logit contributions
 her-3.284
 the-2.956
 his-2.685
 their-2.660
 it-2.442
Token Choose
Token ID17489
Feature activation-3.49e-01

Pos logit contributions
 rule+5.664
asis+5.575
reath+5.455
 affect+5.426
opus+5.410
Neg logit contributions
 her-5.258
 the-5.004
 his-4.589
 their-4.512
 it-4.371
Token what
Token ID644
Feature activation-1.27e+00

Pos logit contributions
 rule+1.036
icious+1.005
 promise+0.987
oral+0.987
umpy+0.985
Neg logit contributions
 her-0.294
 the-0.248
 his-0.219
 their-0.211
 it-0.165
Token you
Token ID345
Feature activation-2.71e+00

Pos logit contributions
icious+2.368
 rule+2.315
reath+2.278
umpy+2.241
ater+2.239
Neg logit contributions
 her-2.600
 the-2.461
 his-2.317
 their-2.280
 it-2.157
Token But
Token ID887
Feature activation-5.05e-01

Pos logit contributions
asis+7.198
 rule+7.122
reath+7.081
ater+6.829
 affect+6.788
Neg logit contributions
 her-6.017
 the-5.839
 his-5.327
 their-5.245
 it-5.126
Token how
Token ID703
Feature activation-1.36e+00

Pos logit contributions
 rule+1.004
icious+0.962
umpy+0.959
reath+0.957
oral+0.945
Neg logit contributions
 her-0.916
 the-0.851
 his-0.803
 their-0.791
 it-0.736
Token about
Token ID546
Feature activation-1.73e+00

Pos logit contributions
 rule+2.815
icious+2.763
reath+2.755
ater+2.722
 basis+2.703
Neg logit contributions
 her-2.360
 the-2.266
 their-2.084
 his-2.062
 it-1.951
Token you
Token ID345
Feature activation-2.90e+00

Pos logit contributions
 rule+5.694
reath+5.604
umpy+5.598
icious+5.595
oral+5.542
Neg logit contributions
 her-1.195
 the-1.119
 their-0.820
 his-0.785
 it-0.676
Token draw
Token ID3197
Feature activation-1.23e+00

Pos logit contributions
umm+7.360
icious+7.336
reath+7.212
oral+7.053
 rule+7.050
Neg logit contributions
 the-4.571
 her-4.533
 his-4.027
 their-3.767
 it-3.673
Token a
Token ID257
Feature activation-4.30e+00

Pos logit contributions
 rule+3.582
 promise+3.448
oral+3.421
icious+3.417
reath+3.399
Neg logit contributions
 her-1.246
 the-1.098
 his-0.990
 their-0.944
 it-0.795
Token picture
Token ID4286
Feature activation-2.31e+00

Pos logit contributions
reath+11.146
icious+10.621
ode+10.586
 Deal+10.420
agons+10.333
Neg logit contributions
 the-8.010
 his-7.356
 her-7.191
 their-7.081
 it-7.053
Token of
Token ID286
Feature activation-1.66e+00

Pos logit contributions
 rule+6.462
reath+6.424
icious+6.336
ode+6.302
 promise+6.294
Neg logit contributions
 her-2.841
 the-2.538
 his-2.203
 their-2.107
 it-1.910
Token you
Token ID345
Feature activation-2.79e+00

Pos logit contributions
umpy+5.529
 rule+5.517
oral+5.494
reath+5.436
icious+5.434
Neg logit contributions
 her-1.085
 the-0.922
 his-0.695
 their-0.690
 it-0.472
Token z
Token ID1976
Feature activation-1.68e+00

Pos logit contributions
 rule+7.687
reath+7.593
ode+7.527
icious+7.512
umm+7.445
Neg logit contributions
 her-3.768
 the-3.639
 his-2.989
 their-2.973
 it-2.692
Tokenooming
Token ID30602
Feature activation-1.59e+00

Pos logit contributions
 rule+2.869
 difference+2.751
icious+2.708
umpy+2.595
 promise+2.578
Neg logit contributions
 her-3.510
 the-3.335
 his-3.252
 their-3.155
led-2.985
Token shop
Token ID6128
Feature activation-2.23e+00

Pos logit contributions
reath+7.043
rr+7.021
icious+7.009
umpy+6.950
ater+6.932
Neg logit contributions
 the-7.544
 her-7.284
 his-6.698
 their-6.326
 it-6.211
Token at
Token ID379
Feature activation-3.85e-01

Pos logit contributions
reath+5.925
icious+5.848
 rule+5.842
oral+5.673
 Deal+5.539
Neg logit contributions
 her-2.930
 the-2.889
 his-2.495
 their-2.431
 it-2.252
Token the
Token ID262
Feature activation-3.11e+00

Pos logit contributions
 rule+1.235
icious+1.211
reath+1.192
 promise+1.190
oral+1.182
Neg logit contributions
 her-0.233
 the-0.189
 his-0.154
 their-0.142
 it-0.082
Token end
Token ID886
Feature activation-2.43e+00

Pos logit contributions
rr+5.010
umpy+4.881
ater+4.800
reath+4.797
arl+4.703
Neg logit contributions
 her-7.871
 the-7.850
 his-7.209
 their-6.869
 it-6.726
Token of
Token ID286
Feature activation-1.04e+00

Pos logit contributions
icious+6.396
 rule+6.387
reath+6.260
 difference+6.126
 promise+6.095
Neg logit contributions
 her-3.382
 the-3.331
 his-2.888
 their-2.862
 it-2.821
Token the
Token ID262
Feature activation-4.27e+00

Pos logit contributions
 rule+3.852
icious+3.819
reath+3.781
umpy+3.744
 promise+3.741
Neg logit contributions
 her-0.199
 the-0.108
 his-0.008
 their+0.044
 it+0.183
Token road
Token ID2975
Feature activation-2.23e+00

Pos logit contributions
ater+8.640
rr+8.332
icious+8.256
umpy+8.239
reath+8.208
Neg logit contributions
 the-10.627
 her-10.338
 his-9.743
 it-9.306
 there-8.913
Token.
Token ID13
Feature activation-2.54e+00

Pos logit contributions
reath+5.820
icious+5.742
 rule+5.726
oral+5.618
ode+5.507
Neg logit contributions
 her-2.993
 the-2.946
 his-2.566
 their-2.512
 it-2.301
Token If
Token ID1002
Feature activation-9.33e-01

Pos logit contributions
 rule+6.428
reath+6.144
 promise+6.084
ater+5.992
 affect+5.942
Neg logit contributions
 her-3.936
 the-3.927
 his-3.606
 it-3.276
 their-3.271
Token you
Token ID345
Feature activation-2.44e+00

Pos logit contributions
 rule+2.343
reath+2.270
icious+2.270
umpy+2.241
ater+2.222
Neg logit contributions
 her-1.263
 the-1.183
 his-1.084
 their-1.024
 it-0.948
Token do
Token ID466
Feature activation-8.75e-01

Pos logit contributions
icious+4.845
 rule+4.768
oral+4.692
reath+4.648
 sentence+4.647
Neg logit contributions
 her-4.782
 the-4.723
 his-4.358
 their-4.172
 it-3.996
Token It
Token ID632
Feature activation-4.68e-01

Pos logit contributions
reath+6.370
 rule+6.282
ater+6.178
asis+6.150
oral+6.012
Neg logit contributions
 the-3.827
 her-3.708
 their-3.397
 his-3.321
 it-3.264
Token's
Token ID338
Feature activation+5.43e-01

Pos logit contributions
 rule+0.901
icious+0.862
reath+0.853
oral+0.848
 basis+0.835
Neg logit contributions
 her-0.864
 the-0.809
 his-0.761
 their-0.757
 it-0.696
Token important
Token ID1593
Feature activation-1.19e+00

Pos logit contributions
 her+0.800
 the+0.710
 his+0.684
 their+0.672
led+0.601
Neg logit contributions
 rule-1.245
 promise-1.170
icious-1.168
 difference-1.145
reath-1.145
Token to
Token ID284
Feature activation-2.17e+00

Pos logit contributions
icious+3.345
 rule+3.315
reath+3.257
 difference+3.246
oral+3.246
Neg logit contributions
 her-1.224
 the-1.089
 their-0.960
 his-0.958
 it-0.852
Token follow
Token ID1061
Feature activation-1.03e+00

Pos logit contributions
oral+4.035
umpy+3.939
opus+3.846
 rule+3.780
icious+3.752
Neg logit contributions
 her-4.871
 the-4.540
 his-4.201
 their-4.129
 it-3.903
Token the
Token ID262
Feature activation-4.27e+00

Pos logit contributions
 rule+3.322
icious+3.267
oral+3.238
reath+3.219
 promise+3.190
Neg logit contributions
 her-0.668
 the-0.589
 his-0.443
 their-0.438
 it-0.332
Token rules
Token ID3173
Feature activation-1.69e+00

Pos logit contributions
icious+8.823
iver+8.784
umpy+8.448
'."+8.413
oc+8.400
Neg logit contributions
 the-9.964
 her-9.571
 his-8.934
 their-8.830
 it-8.546
Token."
Token ID526
Feature activation-5.87e-01

Pos logit contributions
 rule+3.193
reath+3.182
icious+3.151
 difference+3.130
oral+3.074
Neg logit contributions
 her-3.336
 the-3.232
 their-2.969
 his-2.967
 it-2.816
Token\n
Token ID198
Feature activation-6.89e-01

Pos logit contributions
 rule+1.298
reath+1.230
icious+1.222
ater+1.209
 promise+1.203
Neg logit contributions
 her-0.966
 the-0.891
 his-0.862
 their-0.841
 it-0.741
Token\n
Token ID198
Feature activation-1.66e+00

Pos logit contributions
 rule+1.455
icious+1.370
reath+1.370
 promise+1.361
ater+1.354
Neg logit contributions
 her-1.180
 the-1.101
 his-1.059
 their-1.037
 it-0.932
TokenTim
Token ID14967
Feature activation-1.16e-01

Pos logit contributions
 rule+2.873
reath+2.821
 difference+2.713
 promise+2.669
arl+2.639
Neg logit contributions
 her-3.572
 his-3.355
 the-3.349
 their-3.286
 it-2.931
Token to
Token ID284
Feature activation-3.26e+00

Pos logit contributions
 rule+2.011
reath+1.975
icious+1.972
 promise+1.922
oral+1.922
Neg logit contributions
 her-0.547
 the-0.479
 their-0.404
 his-0.403
 it-0.310
Token be
Token ID307
Feature activation+3.22e-01

Pos logit contributions
opus+7.095
umpy+6.962
ini+6.810
oral+6.802
umm+6.695
Neg logit contributions
 her-7.294
 the-6.915
 their-6.340
 his-6.242
 it-5.716
Token careful
Token ID8161
Feature activation-1.07e+00

Pos logit contributions
 her+0.583
 the+0.532
 his+0.515
 their+0.508
led+0.457
Neg logit contributions
 rule-0.631
icious-0.589
 promise-0.586
reath-0.577
oral-0.576
Token and
Token ID290
Feature activation-1.15e+00

Pos logit contributions
 rule+2.775
oral+2.680
reath+2.679
icious+2.678
 difference+2.639
Neg logit contributions
 her-1.308
 the-1.194
 his-1.071
 their-1.063
 it-0.972
Token ask
Token ID1265
Feature activation-9.05e-01

Pos logit contributions
 rule+2.314
icious+2.287
oral+2.280
reath+2.243
umpy+2.229
Neg logit contributions
 her-2.054
 the-1.931
 their-1.800
 his-1.797
 it-1.668
Token the
Token ID262
Feature activation-4.26e+00

Pos logit contributions
 rule+2.629
umpy+2.552
icious+2.550
oral+2.527
reath+2.522
Neg logit contributions
 her-0.871
 the-0.763
 their-0.659
 his-0.655
 it-0.514
Token owner
Token ID4870
Feature activation-1.88e+00

Pos logit contributions
oc+10.234
iver+10.086
 basis+10.063
icious+10.010
age+9.973
Neg logit contributions
 the-8.444
 her-8.151
 his-7.244
 their-7.184
 a-6.693
Token if
Token ID611
Feature activation-1.40e-01

Pos logit contributions
 rule+5.251
icious+5.178
reath+5.120
 difference+5.079
 basis+5.003
Neg logit contributions
 her-2.204
 the-2.042
 his-1.779
 their-1.735
 it-1.473
Token we
Token ID356
Feature activation-2.35e+00

Pos logit contributions
 rule+0.311
icious+0.295
reath+0.291
 promise+0.290
umpy+0.289
Neg logit contributions
 her-0.221
 the-0.202
 his-0.190
 their-0.188
 it-0.169
Token can
Token ID460
Feature activation-3.57e+00

Pos logit contributions
icious+4.939
 rule+4.823
opl+4.776
ini+4.764
reath+4.751
Neg logit contributions
 her-4.418
 the-4.371
 their-3.843
 his-3.824
 it-3.764
Token pet
Token ID4273
Feature activation-1.01e+00

Pos logit contributions
oral+9.027
reath+8.984
umm+8.969
affe+8.954
icious+8.928
Neg logit contributions
 her-6.176
 the-5.840
 his-4.956
 their-4.955
 it-4.796
Token there
Token ID612
Feature activation-2.15e+00

Pos logit contributions
 rule+2.169
icious+2.054
oral+2.046
reath+2.032
 promise+2.024
Neg logit contributions
 her-0.989
 the-0.901
 his-0.808
 their-0.763
 it-0.701
Token,
Token ID11
Feature activation-1.66e+00

Pos logit contributions
 rule+5.614
reath+5.534
ater+5.454
oral+5.440
icious+5.431
Neg logit contributions
 her-3.022
 the-2.911
 his-2.470
 their-2.361
 it-2.200
Tokenâ
Token ID22940
Feature activation-2.27e+00

Pos logit contributions
 rule+4.720
icious+4.639
reath+4.591
ater+4.576
oral+4.570
Neg logit contributions
 her-1.863
 the-1.718
 his-1.512
 their-1.414
 it-1.301
Token
Token ID26391
Feature activation-1.64e+00

Pos logit contributions
 rule+3.236
oral+2.963
 difference+2.895
 promise+2.841
 charity+2.822
Neg logit contributions
 her-5.643
 the-5.468
 his-5.141
 their-4.887
 it-4.793
Token replied
Token ID8712
Feature activation-6.81e-01

Pos logit contributions
 rule+3.216
reath+3.010
 promise+2.975
 difference+2.950
oral+2.922
Neg logit contributions
 her-3.170
 the-3.068
 their-2.791
 his-2.788
 it-2.751
Token the
Token ID262
Feature activation-4.25e+00

Pos logit contributions
 rule+1.863
icious+1.800
oral+1.769
 promise+1.767
reath+1.766
Neg logit contributions
 her-0.774
 the-0.681
 his-0.629
 their-0.602
 it-0.517
Token snail
Token ID47374
Feature activation-1.68e+00

Pos logit contributions
icious+10.847
uer+10.572
ater+10.544
oc+10.523
iver+10.400
Neg logit contributions
 her-8.325
 the-8.162
 his-7.318
 their-6.452
 a-6.410
Token.
Token ID13
Feature activation-2.11e+00

Pos logit contributions
 rule+4.476
icious+4.442
reath+4.331
oral+4.279
 difference+4.226
Neg logit contributions
 her-2.192
 the-1.960
 his-1.821
 their-1.706
 it-1.527
Token
Token ID6184
Feature activation-2.03e+00

Pos logit contributions
 rule+5.087
 promise+4.902
reath+4.853
ater+4.802
oral+4.774
Neg logit contributions
 her-3.386
 the-3.213
 his-3.009
 their-2.844
 it-2.745
Token
Token ID95
Feature activation-2.67e+00

Pos logit contributions
 rule+3.368
oral+3.124
 difference+3.113
 promise+3.102
icious+3.009
Neg logit contributions
 her-4.500
 the-4.345
 his-3.990
 their-3.950
 it-3.891
Token
Token ID26391
Feature activation-1.82e+00

Pos logit contributions
 rule+3.890
oral+3.773
 promise+3.756
 difference+3.702
 affect+3.646
Neg logit contributions
 her-6.414
 the-6.249
 his-5.845
 their-5.565
 it-5.501
Token get
Token ID651
Feature activation+2.22e-01

Pos logit contributions
icious+3.921
arl+3.887
reath+3.879
oral+3.873
ode+3.863
Neg logit contributions
 her-4.483
 the-4.342
 his-4.168
 their-4.019
 it-3.700
Token to
Token ID284
Feature activation-1.71e+00

Pos logit contributions
 her+0.192
 the+0.157
 his+0.145
 their+0.140
 it+0.105
Neg logit contributions
 rule-0.644
icious-0.617
 promise-0.613
reath-0.609
oral-0.608
Token the
Token ID262
Feature activation-3.77e+00

Pos logit contributions
icious+6.207
 rule+6.143
oral+6.138
ode+6.134
umpy+6.131
Neg logit contributions
 her-0.587
 the-0.488
 his-0.307
 their-0.220
 it+0.037
Token end
Token ID886
Feature activation-1.58e+00

Pos logit contributions
reath+9.062
umpy+8.756
arl+8.747
oc+8.718
age+8.676
Neg logit contributions
 the-7.087
 her-6.712
 his-6.283
 their-6.059
 a-5.582
Token of
Token ID286
Feature activation-1.73e-01

Pos logit contributions
icious+4.013
 rule+4.009
reath+3.936
 difference+3.859
oral+3.842
Neg logit contributions
 her-2.172
 the-1.998
 his-1.840
 their-1.834
 it-1.622
Token the
Token ID262
Feature activation-4.24e+00

Pos logit contributions
 rule+0.612
icious+0.595
 promise+0.591
reath+0.589
oral+0.588
Neg logit contributions
 her-0.049
 the-0.023
 his-0.012
 their-0.008
 it+0.020
Token wire
Token ID6503
Feature activation-1.78e+00

Pos logit contributions
icious+9.572
reath+9.367
'."+9.205
ater+9.150
iver+9.104
Neg logit contributions
 the-9.296
 her-8.799
 his-8.388
 their-8.006
 all-7.373
Token first
Token ID717
Feature activation-2.30e+00

Pos logit contributions
icious+4.644
 rule+4.609
reath+4.601
oral+4.545
arl+4.436
Neg logit contributions
 her-2.313
 the-2.117
 his-1.964
 their-1.903
 it-1.637
Token.
Token ID13
Feature activation-1.77e+00

Pos logit contributions
icious+5.598
reath+5.400
oral+5.333
 Deal+5.315
 rule+5.238
Neg logit contributions
 her-3.683
 the-3.454
 his-3.233
 their-3.075
 it-2.853
Token All
Token ID1439
Feature activation+1.55e+00

Pos logit contributions
 rule+4.327
reath+4.155
ater+4.108
 promise+4.099
icious+4.055
Neg logit contributions
 her-2.759
 the-2.548
 his-2.413
 their-2.305
 it-2.115
Token of
Token ID286
Feature activation+6.72e-01

Pos logit contributions
 her+2.012
 the+1.682
 his+1.663
 their+1.652
led+1.469
Neg logit contributions
 rule-3.708
 promise-3.532
 difference-3.476
umpy-3.449
 world-3.405
Tokenm
Token ID76
Feature activation-1.45e+00

Pos logit contributions
 rule+1.881
reath+1.796
icious+1.755
 difference+1.753
 promise+1.724
Neg logit contributions
 her-1.626
 the-1.516
 his-1.438
 their-1.412
 it-1.268
Token so
Token ID523
Feature activation-1.51e+00

Pos logit contributions
 rule+3.396
oral+3.255
icious+3.203
 difference+3.185
 basis+3.171
Neg logit contributions
 her-2.339
 the-2.155
 his-2.032
 their-1.947
 it-1.773
Token glad
Token ID9675
Feature activation-1.16e+00

Pos logit contributions
 rule+3.691
icious+3.558
ode+3.533
reath+3.498
oral+3.495
Neg logit contributions
 her-2.273
 the-2.125
 his-1.998
 their-1.887
 it-1.726
Token you
Token ID345
Feature activation-1.63e+00

Pos logit contributions
 rule+2.981
icious+2.915
reath+2.888
 promise+2.824
umpy+2.824
Neg logit contributions
 her-1.589
 the-1.477
 his-1.364
 their-1.276
 it-1.202
Token followed
Token ID3940
Feature activation-1.54e+00

Pos logit contributions
 rule+3.742
oral+3.611
icious+3.552
 sentence+3.470
 difference+3.467
Neg logit contributions
 her-2.614
 the-2.464
 his-2.264
 their-2.158
 it-2.023
Token the
Token ID262
Feature activation-4.23e+00

Pos logit contributions
 rule+5.145
icious+5.096
oral+5.048
umpy+4.983
reath+4.970
Neg logit contributions
 her-1.026
 the-0.886
 his-0.716
 their-0.592
 it-0.471
Token light
Token ID1657
Feature activation-2.56e+00

Pos logit contributions
ater+9.990
icious+9.611
'."+9.606
umpy+9.566
iver+9.511
Neg logit contributions
 her-8.543
 the-8.521
 his-8.259
 all-6.909
 it-6.834
Token,
Token ID11
Feature activation-1.88e+00

Pos logit contributions
icious+7.354
 rule+7.172
oral+7.156
reath+7.092
Two+6.867
Neg logit contributions
 her-3.235
 the-2.990
 his-2.720
 their-2.558
 it-2.262
Token now
Token ID783
Feature activation-1.77e+00

Pos logit contributions
 rule+5.399
reath+5.242
oral+5.235
icious+5.183
ater+5.135
Neg logit contributions
 her-2.149
 the-2.024
 his-1.781
 their-1.670
 it-1.602
Token let
Token ID1309
Feature activation-1.10e+00

Pos logit contributions
 rule+4.979
icious+4.889
reath+4.838
oral+4.794
ater+4.736
Neg logit contributions
 her-2.138
 the-1.947
 his-1.764
 their-1.622
 it-1.608
Tokenâ
Token ID22940
Feature activation-2.41e+00

Pos logit contributions
 rule+2.571
icious+2.435
oral+2.434
ater+2.423
reath+2.416
Neg logit contributions
 her-1.750
 the-1.614
 his-1.505
 their-1.444
 it-1.335
Token Let
Token ID3914
Feature activation-1.38e-01

Pos logit contributions
 rule+2.190
icious+2.041
reath+2.039
 promise+2.019
asis+1.984
Neg logit contributions
 her-3.073
 the-2.929
 their-2.808
 his-2.781
 it-2.588
Token's
Token ID338
Feature activation-2.80e+00

Pos logit contributions
 rule+0.283
icious+0.266
reath+0.264
 promise+0.260
oral+0.258
Neg logit contributions
 her-0.247
 the-0.226
 his-0.213
 their-0.213
 it-0.192
Token go
Token ID467
Feature activation-1.62e+00

Pos logit contributions
umpy+6.493
opus+6.476
affe+6.453
pillar+6.317
oral+6.296
Neg logit contributions
 her-5.665
 the-5.495
 their-4.953
 his-4.882
 it-4.501
Token and
Token ID290
Feature activation-1.68e+00

Pos logit contributions
icious+4.705
 rule+4.584
oral+4.473
affe+4.414
 sentence+4.404
Neg logit contributions
 her-1.754
 the-1.613
 their-1.419
 his-1.418
 it-1.203
Token ask
Token ID1265
Feature activation-5.90e-01

Pos logit contributions
umpy+3.022
icious+2.991
oral+2.979
 rule+2.943
reath+2.907
Neg logit contributions
 her-3.641
 the-3.531
 his-3.294
 their-3.273
 it-3.080
Token the
Token ID262
Feature activation-4.22e+00

Pos logit contributions
 rule+1.543
icious+1.481
umpy+1.479
age+1.468
reath+1.465
Neg logit contributions
 her-0.729
 the-0.653
 their-0.598
 his-0.594
 it-0.495
Token truck
Token ID7779
Feature activation-2.18e+00

Pos logit contributions
iver+10.591
oc+10.451
age+10.348
opl+10.294
umpy+10.139
Neg logit contributions
 the-8.407
 her-8.124
 their-7.395
 his-7.206
 a-6.598
Token driver
Token ID4639
Feature activation-1.74e+00

Pos logit contributions
icious+6.285
reath+6.209
 rule+6.156
ater+6.044
opus+6.026
Neg logit contributions
 her-2.319
 the-2.241
 his-1.994
 their-1.978
 they-1.464
Token if
Token ID611
Feature activation+9.12e-02

Pos logit contributions
 rule+4.878
icious+4.843
reath+4.774
 difference+4.723
agons+4.671
Neg logit contributions
 her-1.932
 the-1.808
 their-1.576
 his-1.573
 it-1.238
Token we
Token ID356
Feature activation-2.21e+00

Pos logit contributions
 her+0.145
 the+0.132
 his+0.126
 their+0.124
 it+0.111
Neg logit contributions
 rule-0.199
icious-0.189
 promise-0.187
reath-0.186
umpy-0.184
Token can
Token ID460
Feature activation-3.28e+00

Pos logit contributions
icious+4.479
opl+4.390
 rule+4.309
reath+4.223
 basis+4.195
Neg logit contributions
 her-4.399
 the-4.355
 his-3.928
 their-3.917
 it-3.822
Token bear
Token ID6842
Feature activation-1.56e+00

Pos logit contributions
 rule+2.569
icious+2.484
reath+2.466
oral+2.464
 promise+2.428
Neg logit contributions
 her-0.908
 the-0.812
 his-0.737
 their-0.697
 it-0.598
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
 rule+4.311
icious+4.245
reath+4.212
oral+4.179
 difference+4.119
Neg logit contributions
 her-1.733
 the-1.595
 his-1.433
 their-1.293
 it-1.176
Token I
Token ID314
Feature activation-2.61e+00

Pos logit contributions
 rule+3.086
icious+3.012
reath+2.995
oral+2.957
ater+2.934
Neg logit contributions
 her-1.045
 the-0.956
 his-0.846
 their-0.785
 it-0.717
Token'll
Token ID1183
Feature activation-3.03e+00

Pos logit contributions
icious+5.327
 rule+5.100
reath+5.063
umm+5.027
 basis+5.022
Neg logit contributions
 her-5.178
 the-4.827
 his-4.640
 their-4.389
 it-4.342
Token ask
Token ID1265
Feature activation-1.08e+00

Pos logit contributions
oral+6.604
icious+6.563
opus+6.423
umpy+6.421
ari+6.307
Neg logit contributions
 her-6.319
 the-5.782
 his-5.522
 their-5.135
 it-5.093
Token the
Token ID262
Feature activation-4.22e+00

Pos logit contributions
 rule+3.048
icious+2.952
umpy+2.934
oral+2.917
reath+2.907
Neg logit contributions
 her-1.204
 the-1.075
 his-0.974
 their-0.904
 it-0.779
Token doctor
Token ID6253
Feature activation-2.12e+00

Pos logit contributions
iver+10.046
icious+9.947
'."+9.922
!'"+9.886
ater+9.847
Neg logit contributions
 her-8.402
 the-8.397
 his-7.739
 it-6.588
 a-6.570
Token for
Token ID329
Feature activation-1.03e+00

Pos logit contributions
icious+6.145
 rule+6.121
reath+6.022
oral+5.924
 difference+5.877
Neg logit contributions
 her-2.290
 the-2.072
 his-1.802
 their-1.747
 it-1.441
Token permission
Token ID7170
Feature activation-2.30e+00

Pos logit contributions
 rule+2.396
umpy+2.351
icious+2.301
oral+2.276
 difference+2.275
Neg logit contributions
 her-1.625
 the-1.490
 his-1.412
 their-1.332
 it-1.231
Token to
Token ID284
Feature activation-2.87e+00

Pos logit contributions
icious+6.369
reath+6.369
 rule+6.364
 difference+6.284
oral+6.206
Neg logit contributions
 her-2.860
 the-2.590
 his-2.338
 their-2.048
 it-2.013
Token give
Token ID1577
Feature activation-5.92e-01

Pos logit contributions
umpy+6.059
oral+5.878
opus+5.850
affe+5.778
icious+5.701
Neg logit contributions
 her-6.249
 the-5.857
 his-5.550
 their-5.104
 it-4.849
Token other
Token ID584
Feature activation-2.69e+00

Pos logit contributions
arl+9.324
icious+8.928
ari+8.861
ode+8.829
 Deal+8.752
Neg logit contributions
 her-4.819
 the-4.737
 his-4.252
 their-3.839
 it-3.419
Token happy
Token ID3772
Feature activation-2.41e+00

Pos logit contributions
oral+7.747
arl+7.733
 rule+7.596
reath+7.582
icious+7.558
Neg logit contributions
 her-3.261
 the-3.258
 his-2.738
 their-2.658
 it-2.257
Token and
Token ID290
Feature activation-1.05e+00

Pos logit contributions
icious+6.022
 rule+5.944
reath+5.920
oral+5.801
Two+5.774
Neg logit contributions
 her-3.788
 the-3.528
 his-3.228
 their-3.172
 it-2.945
Token grow
Token ID1663
Feature activation-4.30e-01

Pos logit contributions
 rule+2.193
oral+2.130
icious+2.128
reath+2.087
umpy+2.055
Neg logit contributions
 her-1.836
 the-1.721
 his-1.607
 their-1.605
 it-1.481
Token as
Token ID355
Feature activation-1.26e+00

Pos logit contributions
 rule+0.887
icious+0.834
 promise+0.824
reath+0.811
oral+0.809
Neg logit contributions
 her-0.752
 the-0.703
 their-0.660
 his-0.657
led-0.599
Token a
Token ID257
Feature activation-4.20e+00

Pos logit contributions
 rule+3.262
icious+3.180
 basis+3.126
age+3.113
reath+3.113
Neg logit contributions
 her-1.637
 the-1.529
 their-1.377
 his-1.367
 it-1.218
Token family
Token ID1641
Feature activation-3.08e+00

Pos logit contributions
icious+10.982
iver+10.877
reath+10.784
ater+10.716
umpy+10.677
Neg logit contributions
 the-7.612
 her-7.303
 their-6.975
 his-6.698
 it-6.679
Token."
Token ID526
Feature activation-1.54e+00

Pos logit contributions
icious+8.587
reath+8.418
 rule+8.412
oral+8.334
 Deal+8.263
Neg logit contributions
 her-4.222
 the-4.195
 their-3.529
 his-3.422
 it-3.164
Token\n
Token ID198
Feature activation-1.67e+00

Pos logit contributions
 rule+3.914
icious+3.743
reath+3.728
ater+3.669
ode+3.654
Neg logit contributions
 her-2.409
 the-2.178
 his-2.071
 their-2.014
 it-1.743
Token\n
Token ID198
Feature activation-3.00e+00

Pos logit contributions
 rule+3.865
ode+3.696
icious+3.679
 promise+3.622
ater+3.601
Neg logit contributions
 her-2.852
 the-2.630
 his-2.513
 their-2.442
 it-2.183
TokenL
Token ID43
Feature activation-1.32e+00

Pos logit contributions
 rule+6.471
 difference+6.339
ode+6.201
umm+6.161
reath+6.159
Neg logit contributions
 her-6.239
 the-5.559
 his-5.533
 their-5.411
 it-4.743
Token "
Token ID366
Feature activation-2.77e+00

Pos logit contributions
icious+2.872
 rule+2.871
age+2.752
 Driver+2.748
umpy+2.704
Neg logit contributions
 her-3.898
 the-3.712
 it-3.261
 his-3.222
 their-3.208
TokenI
Token ID40
Feature activation-2.18e+00

Pos logit contributions
agons+5.555
 rule+5.526
reath+5.510
 difference+5.273
asis+5.190
Neg logit contributions
 her-5.764
 the-5.234
 his-5.129
 their-4.843
 it-4.649
Token was
Token ID373
Feature activation-7.57e-01

Pos logit contributions
icious+4.300
agons+4.207
 rule+4.188
reath+4.188
ode+4.135
Neg logit contributions
 her-4.333
 the-4.006
 his-3.856
 their-3.755
 it-3.567
Token born
Token ID4642
Feature activation-1.62e+00

Pos logit contributions
 rule+2.010
oral+1.956
icious+1.941
reath+1.926
 promise+1.909
Neg logit contributions
 her-0.900
 the-0.809
 his-0.745
 their-0.713
 it-0.623
Token as
Token ID355
Feature activation-9.97e-01

Pos logit contributions
 rule+4.713
icious+4.635
reath+4.587
oral+4.580
 promise+4.529
Neg logit contributions
 her-1.656
 the-1.540
 his-1.309
 their-1.200
 it-1.068
Token a
Token ID257
Feature activation-4.20e+00

Pos logit contributions
 rule+2.634
icious+2.564
reath+2.509
 promise+2.499
ode+2.488
Neg logit contributions
 her-1.250
 the-1.157
 his-1.046
 their-1.007
 it-0.920
Token cater
Token ID20825
Feature activation-2.43e+00

Pos logit contributions
icious+10.911
iver+10.801
opl+10.782
umpy+10.564
reath+10.419
Neg logit contributions
 her-7.449
 the-7.365
 his-6.598
 it-6.477
 their-6.366
Tokenpillar
Token ID41643
Feature activation-2.19e+00

Pos logit contributions
 rule+2.378
 promise+2.144
 charity+1.935
 sentence+1.918
oral+1.914
Neg logit contributions
 her-7.387
 the-7.005
led-6.625
 his-6.565
 it-6.430
Token.
Token ID13
Feature activation-2.95e+00

Pos logit contributions
 rule+6.184
icious+6.095
oral+6.046
reath+5.918
 difference+5.871
Neg logit contributions
 her-2.695
 the-2.395
 his-2.030
 it-1.884
 their-1.879
Token A
Token ID317
Feature activation-3.09e+00

Pos logit contributions
 rule+6.924
 promise+6.634
ater+6.603
asis+6.568
oral+6.523
Neg logit contributions
 her-5.349
 the-5.163
 his-4.740
 it-4.445
 their-4.438
Token cater
Token ID20825
Feature activation-1.92e+00

Pos logit contributions
icious+7.456
 rule+7.219
oral+7.200
ides+7.189
opl+7.152
Neg logit contributions
 her-5.341
 the-5.125
 it-4.352
 his-4.297
 their-4.059
TokenThat
Token ID2504
Feature activation-2.31e+00

Pos logit contributions
 rule+4.767
reath+4.635
agons+4.526
 difference+4.518
 promise+4.466
Neg logit contributions
 her-5.321
 the-4.790
 his-4.601
 their-4.523
 it-4.394
Tokenâ
Token ID22940
Feature activation-2.30e+00

Pos logit contributions
arl+5.828
reath+5.803
icious+5.752
 rule+5.626
oral+5.533
Neg logit contributions
 her-3.534
 the-3.397
 his-3.042
 their-2.970
 it-2.890
Token
Token ID26391
Feature activation-1.71e+00

Pos logit contributions
 rule+3.413
 difference+3.039
reath+3.029
ode+2.991
 promise+2.975
Neg logit contributions
 her-5.516
 the-5.305
 his-4.889
 their-4.846
led-4.702
Token
Token ID8151
Feature activation-1.20e+00

Pos logit contributions
 rule+3.034
reath+2.868
icious+2.815
 difference+2.796
oral+2.781
Neg logit contributions
 her-3.576
 the-3.522
 their-3.227
 his-3.161
 it-3.160
Tokens
Token ID82
Feature activation-1.41e+00

Pos logit contributions
 rule+2.699
reath+2.584
icious+2.537
 difference+2.516
ode+2.499
Neg logit contributions
 her-1.911
 the-1.789
 their-1.651
 his-1.651
 it-1.506
Token a
Token ID257
Feature activation-4.19e+00

Pos logit contributions
 rule+3.630
oral+3.546
reath+3.525
ater+3.502
affe+3.500
Neg logit contributions
 her-1.859
 the-1.738
 his-1.530
 their-1.522
 it-1.340
Token special
Token ID2041
Feature activation-3.67e+00

Pos logit contributions
 affect+11.437
arians+11.393
ater+11.347
opl+11.342
ides+11.327
Neg logit contributions
 the-6.114
 her-5.914
 his-5.401
 it-5.023
 their-4.955
Token ball
Token ID2613
Feature activation-1.95e+00

Pos logit contributions
icious+9.484
 Deal+8.717
ater+8.694
umpy+8.602
arl+8.560
Neg logit contributions
 her-6.873
 the-6.496
 his-6.064
 it-5.801
 their-5.471
Token,
Token ID11
Feature activation-1.47e+00

Pos logit contributions
icious+4.761
 rule+4.732
oral+4.699
reath+4.684
 Deal+4.542
Neg logit contributions
 her-2.910
 the-2.713
 his-2.459
 their-2.367
 it-2.243
Token it
Token ID340
Feature activation-1.36e+00

Pos logit contributions
 rule+4.209
oral+4.081
reath+4.062
icious+4.057
ater+4.007
Neg logit contributions
 her-1.549
 the-1.419
 his-1.221
 their-1.191
 it-1.097
Token has
Token ID468
Feature activation-8.54e-01

Pos logit contributions
 rule+3.071
oral+2.989
reath+2.973
icious+2.968
arl+2.936
Neg logit contributions
 her-2.162
 the-2.034
 his-1.866
 their-1.808
 it-1.712
Token careful
Token ID8161
Feature activation-2.28e-01

Pos logit contributions
 her+1.027
 the+0.895
 his+0.859
 their+0.848
led+0.748
Neg logit contributions
 rule-2.017
icious-1.908
 promise-1.905
reath-1.867
 difference-1.864
Token and
Token ID290
Feature activation+8.37e-01

Pos logit contributions
 rule+0.600
icious+0.575
reath+0.570
oral+0.569
 promise+0.569
Neg logit contributions
 her-0.268
 the-0.231
 his-0.216
 their-0.210
 it-0.179
Token wise
Token ID10787
Feature activation-7.12e-01

Pos logit contributions
 her+1.494
 the+1.357
 his+1.321
 their+1.311
led+1.188
Neg logit contributions
 rule-1.642
 promise-1.540
icious-1.517
reath-1.483
 difference-1.476
Token.
Token ID13
Feature activation-1.86e+00

Pos logit contributions
 rule+1.722
icious+1.676
reath+1.641
oral+1.636
 promise+1.629
Neg logit contributions
 her-1.023
 the-0.913
 his-0.845
 their-0.820
 it-0.736
Token "
Token ID366
Feature activation-2.08e+00

Pos logit contributions
 rule+4.760
 promise+4.590
reath+4.513
asis+4.492
ater+4.481
Neg logit contributions
 her-2.854
 the-2.546
 his-2.305
 their-2.263
 it-2.062
TokenThe
Token ID464
Feature activation-4.19e+00

Pos logit contributions
 rule+3.838
 promise+3.730
 difference+3.712
reath+3.568
asis+3.501
Neg logit contributions
 her-4.500
 the-3.980
 his-3.803
 their-3.748
 it-3.513
Token dog
Token ID3290
Feature activation-1.56e+00

Pos logit contributions
icious+9.570
iver+9.491
umpy+9.429
ides+9.205
ater+9.174
Neg logit contributions
 her-8.888
 the-8.753
 his-7.781
 their-7.513
 a-7.056
Token might
Token ID1244
Feature activation-2.12e+00

Pos logit contributions
 rule+3.683
icious+3.643
reath+3.613
oral+3.608
arl+3.515
Neg logit contributions
 her-2.355
 the-2.232
 his-2.052
 their-1.940
 it-1.838
Token bite
Token ID13197
Feature activation-1.14e+00

Pos logit contributions
icious+4.425
oral+4.377
opus+4.330
ode+4.276
umm+4.260
Neg logit contributions
 her-4.165
 the-3.848
 his-3.758
 their-3.545
 it-3.253
Token you
Token ID345
Feature activation-1.87e+00

Pos logit contributions
 rule+2.345
icious+2.264
reath+2.243
oral+2.239
 promise+2.224
Neg logit contributions
 her-2.110
 the-1.985
 his-1.856
 their-1.821
 it-1.695
Token or
Token ID393
Feature activation-1.19e+00

Pos logit contributions
 rule+4.850
oral+4.813
icious+4.729
reath+4.724
Two+4.638
Neg logit contributions
 her-2.521
 the-2.343
 his-2.121
 their-1.999
 it-1.792
Token in
Token ID287
Feature activation+1.20e+00

Pos logit contributions
 her+1.134
 the+0.951
 his+0.926
 their+0.908
led+0.816
Neg logit contributions
 rule-3.080
 promise-2.877
icious-2.852
 world-2.849
 difference-2.841
Token her
Token ID607
Feature activation+7.31e-01

Pos logit contributions
 her+0.230
led+0.040
 the+0.023
 his+0.020
 their-0.003
Neg logit contributions
 rule-4.138
 promise-3.954
 difference-3.936
 world-3.920
reath-3.909
Token backyard
Token ID24296
Feature activation+3.83e-01

Pos logit contributions
 her+1.936
 the+1.825
 his+1.787
 their+1.768
led+1.684
Neg logit contributions
 rule-0.791
 promise-0.696
icious-0.680
 difference-0.666
reath-0.666
Token.
Token ID13
Feature activation-4.81e-01

Pos logit contributions
 her+0.608
 the+0.548
 his+0.527
 their+0.519
led+0.461
Neg logit contributions
 rule-0.833
icious-0.782
 promise-0.780
reath-0.767
 difference-0.766
Token One
Token ID1881
Feature activation+1.01e+00

Pos logit contributions
 rule+0.812
icious+0.764
 promise+0.759
ater+0.748
reath+0.742
Neg logit contributions
 her-1.046
 the-0.961
 his-0.921
 their-0.913
 it-0.836
Token day
Token ID1110
Feature activation+3.22e+00

Pos logit contributions
 her+2.347
 the+2.183
 his+2.138
 their+2.121
led+2.023
Neg logit contributions
 rule-1.457
 promise-1.323
icious-1.267
umpy-1.263
oral-1.254
Token,
Token ID11
Feature activation+3.24e+00

Pos logit contributions
 her+4.515
 his+4.255
led+4.196
 their+4.021
 the+4.003
Neg logit contributions
 rule-6.989
-6.714
Two-6.710
'."-6.668
 promise-6.568
Token while
Token ID981
Feature activation+2.16e+00

Pos logit contributions
led+3.267
 his+2.755
 her+2.677
 their+2.522
ed+2.509
Neg logit contributions
 rule-8.323
-7.823
Two-7.783
 promise-7.675
 judge-7.646
Token she
Token ID673
Feature activation+1.72e+00

Pos logit contributions
 her+2.048
led+1.939
 his+1.809
 the+1.757
 their+1.659
Neg logit contributions
 rule-5.520
Two-5.303
oral-5.269
 promise-5.195
 world-5.187
Token was
Token ID373
Feature activation+2.94e+00

Pos logit contributions
 her+2.330
 the+2.025
 their+1.994
 his+1.982
 its+1.724
Neg logit contributions
 rule-4.114
 promise-3.929
umpy-3.837
Two-3.802
ater-3.782
Token playing
Token ID2712
Feature activation+2.19e+00

Pos logit contributions
 her+4.094
 his+3.814
led+3.763
 their+3.667
 the+3.656
Neg logit contributions
 rule-6.759
 promise-6.072
icious-6.065
 world-6.045
reath-6.012
Token tired
Token ID10032
Feature activation+5.47e-01

Pos logit contributions
 her+3.291
 their+2.849
 his+2.846
 the+2.843
led+2.843
Neg logit contributions
 rule-4.967
 promise-4.862
 difference-4.730
umpy-4.589
icious-4.588
Token but
Token ID475
Feature activation+3.33e+00

Pos logit contributions
 her+0.753
 the+0.665
 his+0.638
 their+0.629
led+0.551
Neg logit contributions
 rule-1.302
 promise-1.226
icious-1.225
 difference-1.207
reath-1.201
Token also
Token ID635
Feature activation+2.21e+00

Pos logit contributions
 her+5.401
led+5.222
 their+5.096
 his+5.063
 the+4.863
Neg logit contributions
 promise-6.971
 rule-6.676
 difference-6.193
 request-6.186
 world-6.060
Token very
Token ID845
Feature activation+6.60e-01

Pos logit contributions
 her+3.082
 their+2.736
 the+2.734
 his+2.730
led+2.472
Neg logit contributions
 rule-5.022
 promise-4.890
 request-4.584
umpy-4.539
 sentence-4.532
Token satisfied
Token ID11378
Feature activation+8.05e-01

Pos logit contributions
 her+0.981
 the+0.878
 his+0.841
 their+0.833
led+0.732
Neg logit contributions
 rule-1.491
 promise-1.414
icious-1.396
 difference-1.376
reath-1.371
Token with
Token ID351
Feature activation+3.34e+00

Pos logit contributions
 her+1.259
 the+1.133
 his+1.092
 their+1.083
led+0.987
Neg logit contributions
 rule-1.745
 promise-1.630
icious-1.620
 difference-1.599
oral-1.588
Token her
Token ID607
Feature activation-1.78e+00

Pos logit contributions
led+3.029
ed+2.703
 her+2.302
 their+2.068
 his+2.051
Neg logit contributions
 rule-9.695
 promise-9.336
 difference-9.272
 request-8.849
 law-8.685
Token Tuesday
Token ID3431
Feature activation+1.45e-01

Pos logit contributions
icious+3.665
 rule+3.612
umpy+3.601
reath+3.549
'."+3.525
Neg logit contributions
 her-3.263
 the-3.100
 his-2.877
 their-2.863
 it-2.583
Token.
Token ID13
Feature activation-4.99e-01

Pos logit contributions
 her+0.211
 the+0.188
 his+0.179
 their+0.176
 it+0.153
Neg logit contributions
 rule-0.336
icious-0.318
 promise-0.316
reath-0.314
oral-0.312
Token She
Token ID1375
Feature activation-4.94e-03

Pos logit contributions
 rule+1.099
 promise+1.023
icious+1.023
reath+1.015
ater+1.011
Neg logit contributions
 her-0.834
 the-0.751
 his-0.708
 their-0.695
 it-0.631
Token knew
Token ID2993
Feature activation+1.67e+00

Pos logit contributions
 rule+0.012
icious+0.012
reath+0.012
 promise+0.012
oral+0.012
Neg logit contributions
 her-0.006
 the-0.006
 his-0.005
 their-0.005
 it-0.004
Token
Token ID198
Feature activation+5.79e-01

Pos logit contributions
 her+1.060
 the+0.970
 his+0.943
 their+0.931
led+0.855
Neg logit contributions
 rule-1.083
icious-1.006
 promise-1.004
oral-0.986
 difference-0.985
Token
Token ID198
Feature activation-7.76e-01

Pos logit contributions
 her+1.094
 the+1.001
 his+0.973
 their+0.962
led+0.884
Neg logit contributions
 rule-1.069
 promise-0.988
icious-0.988
oral-0.972
reath-0.967
TokenAs
Token ID1722
Feature activation+2.60e+00

Pos logit contributions
 rule+1.357
 difference+1.290
 promise+1.278
reath+1.267
ode+1.247
Neg logit contributions
 her-1.656
 the-1.504
 their-1.432
 his-1.431
 it-1.319
Token she
Token ID673
Feature activation+1.53e+00

Pos logit contributions
led+1.971
 her+1.908
 his+1.761
ed+1.468
 the+1.310
Neg logit contributions
 rule-7.328
 promise-6.924
oral-6.918
 world-6.872
Two-6.840
Token walked
Token ID6807
Feature activation+2.23e+00

Pos logit contributions
 her+2.128
 the+1.794
 his+1.737
 their+1.733
 its+1.517
Neg logit contributions
 rule-3.629
 promise-3.458
oral-3.409
oc-3.406
icious-3.386
Token,
Token ID11
Feature activation+3.37e+00

Pos logit contributions
 her+2.627
 the+2.181
 their+2.169
 his+2.102
led+1.991
Neg logit contributions
 rule-5.610
age-5.317
umpy-5.203
 promise-5.176
Two-5.152
Token she
Token ID673
Feature activation+1.31e+00

Pos logit contributions
led+3.198
 his+2.862
 her+2.774
 their+2.633
 its+2.364
Neg logit contributions
 rule-8.689
Two-8.341
-8.243
 judge-8.077
 occasion-8.053
Token saw
Token ID2497
Feature activation+2.98e+00

Pos logit contributions
 her+1.911
 the+1.643
 their+1.620
 his+1.604
 its+1.399
Neg logit contributions
 rule-2.972
 promise-2.815
icious-2.780
ater-2.769
oral-2.753
Token a
Token ID257
Feature activation-2.11e+00

Pos logit contributions
led+2.450
 her+2.296
 his+1.957
 their+1.874
 the+1.807
Neg logit contributions
 rule-8.125
 charity-7.696
 law-7.650
Two-7.649
 world-7.561
Token boy
Token ID2933
Feature activation+3.03e-01

Pos logit contributions
icious+4.484
'."+4.446
opl+4.415
 rule+4.391
.'"+4.388
Neg logit contributions
 her-3.598
 the-3.535
 his-3.188
 their-3.171
 it-3.027
Token who
Token ID508
Feature activation+1.04e+00

Pos logit contributions
 her+0.445
 the+0.397
 his+0.380
 their+0.374
led+0.326
Neg logit contributions
 rule-0.696
icious-0.657
 promise-0.654
reath-0.646
oral-0.644
Token to
Token ID284
Feature activation+4.98e-01

Pos logit contributions
 her+1.578
 the+1.308
 their+1.244
 his+1.240
led+1.157
Neg logit contributions
 rule-3.921
 promise-3.664
umpy-3.602
 world-3.599
 difference-3.581
Token collect
Token ID2824
Feature activation+1.91e+00

Pos logit contributions
 her+1.237
 the+1.155
 his+1.129
 their+1.121
led+1.059
Neg logit contributions
 rule-0.633
 promise-0.565
icious-0.560
 difference-0.546
reath-0.546
Token rocks
Token ID12586
Feature activation+4.75e-01

Pos logit contributions
 her+3.325
 the+2.966
 their+2.925
 his+2.834
led+2.757
Neg logit contributions
 rule-3.688
 promise-3.387
 difference-3.384
reath-3.382
ater-3.310
Token.
Token ID13
Feature activation+6.24e-01

Pos logit contributions
 her+0.717
 the+0.641
 his+0.616
 their+0.607
led+0.536
Neg logit contributions
 rule-1.063
 promise-0.997
icious-0.996
reath-0.981
 difference-0.979
Token He
Token ID679
Feature activation+2.17e+00

Pos logit contributions
 her+1.176
 the+1.069
 his+1.031
 their+1.028
led+0.952
Neg logit contributions
 rule-1.176
icious-1.084
 promise-1.083
reath-1.063
 difference-1.063
Token wore
Token ID12408
Feature activation+3.32e+00

Pos logit contributions
 her+3.138
 the+2.740
 their+2.729
 his+2.617
 its+2.422
Neg logit contributions
 rule-4.944
 promise-4.670
Two-4.540
 world-4.520
ater-4.513
Token his
Token ID465
Feature activation+4.79e-01

Pos logit contributions
 her+4.551
led+4.228
ed+3.820
 their+3.663
 the+3.610
Neg logit contributions
 rule-7.380
ater-6.803
reath-6.750
ode-6.593
 world-6.539
Token big
Token ID1263
Feature activation+3.70e-01

Pos logit contributions
 her+0.844
 the+0.767
 his+0.740
 their+0.733
led+0.661
Neg logit contributions
 rule-0.957
 promise-0.892
icious-0.888
reath-0.877
 difference-0.870
Token bath
Token ID7837
Feature activation+8.04e-01

Pos logit contributions
 her+0.645
 the+0.587
 his+0.567
 their+0.559
led+0.502
Neg logit contributions
 rule-0.743
icious-0.695
 promise-0.692
reath-0.681
 difference-0.679
Tokenrobe
Token ID25481
Feature activation+1.37e+00

Pos logit contributions
 her+2.050
 the+1.922
 his+1.870
 their+1.866
 it+1.740
Neg logit contributions
 rule-0.969
 promise-0.846
icious-0.833
reath-0.830
ater-0.814
Token because
Token ID780
Feature activation+2.82e+00

Pos logit contributions
 her+1.678
 the+1.432
 their+1.421
 his+1.354
 its+1.160
Neg logit contributions
 rule-3.448
 promise-3.216
age-3.192
umpy-3.178
ater-3.175
Token the
Token ID262
Feature activation-5.22e-01

Pos logit contributions
led+1.297
 her+1.256
 their+1.000
 his+0.952
 the+0.758
Neg logit contributions
 rule-7.968
 promise-7.626
 difference-7.625
 sentence-7.494
'."-7.480
Token pastry
Token ID41738
Feature activation+1.83e-02

Pos logit contributions
 rule+0.806
icious+0.778
umpy+0.753
reath+0.750
oral+0.744
Neg logit contributions
 her-1.166
 the-1.081
 his-1.030
 their-1.015
 it-0.959
Token.
Token ID13
Feature activation+7.00e-01

Pos logit contributions
 her+0.028
 the+0.025
 his+0.024
 their+0.023
 it+0.021
Neg logit contributions
 rule-0.042
icious-0.040
reath-0.039
 promise-0.039
oral-0.039
Token She
Token ID1375
Feature activation+2.06e+00

Pos logit contributions
 her+1.193
 the+1.082
 his+1.059
 their+1.046
led+0.954
Neg logit contributions
 rule-1.418
icious-1.325
 promise-1.321
 difference-1.304
reath-1.299
Token smiled
Token ID13541
Feature activation+2.86e+00

Pos logit contributions
 her+2.922
 their+2.605
 the+2.526
 his+2.492
led+2.228
Neg logit contributions
 rule-4.700
 promise-4.613
oc-4.392
icious-4.379
oral-4.342
Token because
Token ID780
Feature activation+2.99e+00

Pos logit contributions
 her+4.576
 their+4.184
led+4.171
 the+4.106
 his+3.954
Neg logit contributions
 rule-5.522
 promise-5.394
 world-5.390
 difference-5.290
ater-5.233
Token she
Token ID673
Feature activation+1.78e+00

Pos logit contributions
 her+2.486
led+2.384
 his+2.278
 their+2.180
 the+1.923
Neg logit contributions
 rule-8.269
 promise-7.808
oral-7.773
 world-7.767
 difference-7.728
Token knew
Token ID2993
Feature activation+2.43e+00

Pos logit contributions
 her+2.593
 their+2.336
 the+2.313
 his+2.270
 its+1.973
Neg logit contributions
 rule-4.040
 promise-3.909
icious-3.793
ater-3.744
reath-3.727
Token she
Token ID673
Feature activation+1.30e+00

Pos logit contributions
 her+1.894
led+1.743
 his+1.602
 their+1.560
 the+1.462
Neg logit contributions
 rule-6.847
 promise-6.576
 difference-6.506
icious-6.319
oral-6.303
Token could
Token ID714
Feature activation-1.63e+00

Pos logit contributions
 her+2.044
 the+1.840
 their+1.819
 his+1.812
 its+1.568
Neg logit contributions
 rule-2.818
 promise-2.707
icious-2.608
oral-2.570
umpy-2.567
Token eat
Token ID4483
Feature activation+6.79e-01

Pos logit contributions
umpy+2.956
opus+2.920
 rule+2.918
oral+2.900
icious+2.875
Neg logit contributions
 her-3.552
 the-3.254
 his-3.085
 their-2.929
 it-2.790
Token
Token ID198
Feature activation-1.23e+00

Pos logit contributions
 rule+3.203
reath+3.107
 promise+3.083
icious+3.043
 difference+3.028
Neg logit contributions
 her-2.344
 the-2.161
 his-2.074
 their-2.032
 it-1.867
TokenBilly
Token ID49640
Feature activation+1.85e-01

Pos logit contributions
 rule+2.022
reath+1.925
 difference+1.874
 promise+1.873
icious+1.803
Neg logit contributions
 her-2.763
 the-2.555
 his-2.508
 their-2.498
 it-2.273
Token and
Token ID290
Feature activation+1.36e+00

Pos logit contributions
 her+0.253
 the+0.224
 his+0.213
 their+0.209
 it+0.180
Neg logit contributions
 rule-0.444
icious-0.420
 promise-0.417
reath-0.415
oral-0.412
Token his
Token ID465
Feature activation-1.06e+00

Pos logit contributions
 her+0.518
 the+0.310
 their+0.250
led+0.221
 his+0.163
Neg logit contributions
 rule-4.534
 promise-4.330
 sentence-4.233
 difference-4.213
 request-4.210
Token mom
Token ID1995
Feature activation+1.03e+00

Pos logit contributions
icious+2.582
 rule+2.538
'."+2.481
!'"+2.477
reath+2.476
Neg logit contributions
 her-1.515
 the-1.351
 his-1.293
 their-1.262
 it-1.017
Token went
Token ID1816
Feature activation+1.23e+00

Pos logit contributions
 her+1.437
 the+1.281
 his+1.246
 their+1.226
led+1.084
Neg logit contributions
 rule-2.355
 promise-2.251
icious-2.234
oral-2.180
 difference-2.179
Token to
Token ID284
Feature activation+9.31e-02

Pos logit contributions
 her+1.288
 the+1.085
 his+1.041
 their+0.994
led+0.954
Neg logit contributions
 rule-3.308
 promise-3.125
oral-3.024
 difference-3.020
 request-3.016
Token a
Token ID257
Feature activation-1.79e+00

Pos logit contributions
 her+0.041
 the+0.028
 his+0.022
 their+0.020
 it+0.004
Neg logit contributions
 rule-0.312
icious-0.303
 promise-0.299
reath-0.299
oral-0.298
Token store
Token ID3650
Feature activation+9.94e-01

Pos logit contributions
icious+3.199
reath+3.085
affe+3.053
rr+3.033
ode+3.022
Neg logit contributions
 her-3.857
 the-3.730
 their-3.463
 his-3.461
 it-3.219
Token called
Token ID1444
Feature activation-8.04e-02

Pos logit contributions
 her+1.488
 the+1.306
 his+1.271
 their+1.242
led+1.153
Neg logit contributions
 rule-2.256
 promise-2.108
 difference-2.048
icious-2.047
oral-2.042
Token the
Token ID262
Feature activation-2.16e+00

Pos logit contributions
 rule+0.241
icious+0.233
 promise+0.231
reath+0.229
oral+0.228
Neg logit contributions
 her-0.064
 the-0.053
 his-0.047
 their-0.046
 it-0.032
Token.
Token ID13
Feature activation-4.09e-01

Pos logit contributions
 rule+3.783
arl+3.692
oral+3.685
icious+3.669
reath+3.662
Neg logit contributions
 her-2.256
 the-2.044
 his-1.878
 their-1.733
 it-1.573
Token<|endoftext|>
Token ID50256
Feature activation-9.82e-02

Pos logit contributions
 rule+0.846
 promise+0.794
icious+0.794
ater+0.787
reath+0.781
Neg logit contributions
 her-0.726
 the-0.667
 his-0.642
 their-0.629
 it-0.564
TokenOnce
Token ID7454
Feature activation+1.89e+00

Pos logit contributions
 rule+0.196
icious+0.185
 promise+0.184
reath+0.181
oral+0.178
Neg logit contributions
 her-0.176
 the-0.164
 his-0.156
 their-0.153
 it-0.140
Token upon
Token ID2402
Feature activation+9.91e-01

Pos logit contributions
 her+3.478
 the+3.095
 his+3.081
 their+3.050
led+2.998
Neg logit contributions
 rule-3.481
 promise-3.302
oral-3.248
umpy-3.183
 difference-3.176
Token a
Token ID257
Feature activation-4.31e-01

Pos logit contributions
 her+0.644
 the+0.482
 his+0.430
 their+0.430
led+0.344
Neg logit contributions
 rule-3.017
icious-2.878
 promise-2.873
reath-2.833
oral-2.821
Token time
Token ID640
Feature activation+1.99e+00

Pos logit contributions
 rule+0.763
reath+0.709
icious+0.707
 promise+0.695
oral+0.693
Neg logit contributions
 her-0.856
 the-0.789
 their-0.760
 his-0.759
 it-0.678
Token,
Token ID11
Feature activation+2.25e+00

Pos logit contributions
 her+2.560
 his+2.145
 the+2.134
led+2.086
 their+2.073
Neg logit contributions
 rule-4.862
 promise-4.624
rr-4.567
oral-4.499
umpy-4.472
Token there
Token ID612
Feature activation+2.56e+00

Pos logit contributions
 her+2.010
 his+1.497
 the+1.463
 their+1.407
led+1.261
Neg logit contributions
 rule-6.344
 promise-6.065
umpy-6.014
Two-5.999
rr-5.961
Token was
Token ID373
Feature activation+1.93e+00

Pos logit contributions
 her+3.847
led+3.419
 their+3.224
 his+3.179
 the+3.152
Neg logit contributions
 rule-5.660
 world-5.396
 promise-5.299
 difference-5.203
oc-5.149
Token a
Token ID257
Feature activation-1.47e+00

Pos logit contributions
 her+1.396
led+1.096
 the+1.090
 his+1.053
 their+0.972
Neg logit contributions
 rule-5.521
icious-5.330
Two-5.287
 world-5.215
reath-5.187
Token little
Token ID1310
Feature activation-1.29e+00

Pos logit contributions
 rule+2.980
icious+2.904
'."+2.869
opl+2.858
age+2.853
Neg logit contributions
 her-2.632
 the-2.532
 his-2.343
 their-2.335
 it-2.098
Token
Token ID198
Feature activation-9.87e-01

Pos logit contributions
 rule+1.363
icious+1.280
 promise+1.275
ode+1.261
reath+1.258
Neg logit contributions
 her-1.133
 the-1.032
 his-0.973
 their-0.970
 it-0.881
TokenDa
Token ID26531
Feature activation-6.29e-02

Pos logit contributions
 rule+2.239
 promise+2.100
 difference+2.094
reath+2.088
ode+2.075
Neg logit contributions
 her-1.638
 the-1.436
 their-1.363
 his-1.336
 it-1.194
Tokenisy
Token ID13560
Feature activation-4.33e-01

Pos logit contributions
 rule+0.065
icious+0.056
 promise+0.054
 difference+0.054
reath+0.053
Neg logit contributions
 her-0.174
 the-0.163
 his-0.159
 their-0.158
led-0.151
Token smiled
Token ID13541
Feature activation+1.92e+00

Pos logit contributions
 rule+1.141
icious+1.096
reath+1.083
 difference+1.068
arl+1.064
Neg logit contributions
 her-0.521
 the-0.456
 his-0.427
 their-0.409
 it-0.346
Token and
Token ID290
Feature activation+2.52e+00

Pos logit contributions
 her+2.857
 the+2.600
 their+2.523
 his+2.457
led+2.419
Neg logit contributions
 rule-4.020
 promise-3.883
 difference-3.792
 world-3.781
 request-3.758
Token picked
Token ID6497
Feature activation+1.91e+00

Pos logit contributions
 her+3.452
 the+3.085
 his+3.059
 their+3.049
led+2.877
Neg logit contributions
 promise-5.422
 rule-5.410
 request-5.184
 judge-5.012
icious-4.992
Token up
Token ID510
Feature activation+1.75e+00

Pos logit contributions
 her+1.374
 their+1.113
 his+1.069
 the+1.037
led+1.022
Neg logit contributions
 rule-5.465
 promise-5.256
 difference-5.199
reath-5.183
 sentence-5.152
Token the
Token ID262
Feature activation-4.86e-01

Pos logit contributions
 her+0.338
led+0.084
 his+0.076
 their+0.057
 the+0.048
Neg logit contributions
 rule-5.905
 promise-5.687
 difference-5.645
 sentence-5.565
reath-5.542
Token plastic
Token ID7309
Feature activation-5.97e-01

Pos logit contributions
 rule+0.791
icious+0.784
oral+0.750
reath+0.745
'."+0.742
Neg logit contributions
 her-1.048
 the-0.975
 his-0.953
 their-0.916
 it-0.858
Token foot
Token ID2366
Feature activation-2.72e-01

Pos logit contributions
icious+0.927
 rule+0.922
oral+0.870
 difference+0.863
reath+0.862
Neg logit contributions
 her-1.344
 the-1.252
 his-1.229
 their-1.190
 it-1.120
Token.
Token ID13
Feature activation-8.64e-01

Pos logit contributions
 rule+0.647
icious+0.625
reath+0.616
oral+0.615
 difference+0.609
Neg logit contributions
 her-0.389
 the-0.344
 his-0.326
 their-0.318
 it-0.281
Token the
Token ID262
Feature activation-1.11e+00

Pos logit contributions
 her+2.342
led+2.028
 his+1.990
 the+1.899
 their+1.867
Neg logit contributions
 rule-4.157
 promise-3.908
 difference-3.887
oral-3.876
icious-3.813
Token end
Token ID886
Feature activation+4.67e-01

Pos logit contributions
 rule+1.498
umpy+1.433
ater+1.422
reath+1.403
arl+1.399
Neg logit contributions
 her-2.797
 the-2.656
 his-2.516
 their-2.492
 it-2.281
Token of
Token ID286
Feature activation+2.31e+00

Pos logit contributions
 her+0.750
 the+0.673
 his+0.652
 their+0.639
led+0.573
Neg logit contributions
 rule-1.007
icious-0.943
 promise-0.942
oral-0.927
 difference-0.926
Token the
Token ID262
Feature activation-1.36e+00

Pos logit contributions
 her+1.128
led+0.910
 his+0.558
ed+0.489
 the+0.475
Neg logit contributions
 rule-7.394
oral-6.980
 promise-6.977
 difference-6.954
'."-6.887
Token day
Token ID1110
Feature activation+9.53e-01

Pos logit contributions
 rule+2.427
ater+2.379
icious+2.355
umpy+2.315
reath+2.302
Neg logit contributions
 her-2.821
 the-2.673
 his-2.472
 their-2.457
 it-2.261
Token,
Token ID11
Feature activation+1.40e+00

Pos logit contributions
 her+1.054
 the+0.865
 his+0.851
 their+0.821
led+0.712
Neg logit contributions
 rule-2.488
 promise-2.364
icious-2.360
 difference-2.348
oral-2.344
Token Sir
Token ID7361
Feature activation+1.01e+00

Pos logit contributions
 her+2.512
 the+2.187
 his+2.179
 their+2.146
led+2.094
Neg logit contributions
 rule-2.672
icious-2.478
 promise-2.464
 difference-2.463
oral-2.448
Token C
Token ID327
Feature activation-1.12e+00

Pos logit contributions
 her+2.014
 the+1.825
 his+1.795
 their+1.785
led+1.629
Neg logit contributions
 rule-1.748
icious-1.637
 promise-1.631
oral-1.603
ater-1.602
Tokenaw
Token ID707
Feature activation+6.94e-01

Pos logit contributions
 rule+1.892
 promise+1.695
 difference+1.619
 basis+1.618
 sentence+1.601
Neg logit contributions
 her-2.470
 the-2.284
 his-2.169
 their-2.148
led-2.069
Token had
Token ID550
Feature activation+1.36e+00

Pos logit contributions
 her+1.146
 the+1.026
 his+0.996
 their+0.986
 it+0.864
Neg logit contributions
 rule-1.455
icious-1.372
 promise-1.365
oral-1.345
reath-1.334
Token to
Token ID284
Feature activation-1.74e+00

Pos logit contributions
 her+1.267
 the+1.001
 his+0.958
 their+0.943
led+0.885
Neg logit contributions
 rule-3.784
 promise-3.614
icious-3.568
 difference-3.557
age-3.511
Token
Token ID198
Feature activation-8.78e-03

Pos logit contributions
 her+1.545
 the+1.414
 his+1.377
 their+1.365
led+1.279
Neg logit contributions
 rule-1.457
 promise-1.343
icious-1.324
oral-1.324
 difference-1.308
TokenInside
Token ID24441
Feature activation+3.04e+00

Pos logit contributions
 rule+0.015
icious+0.014
 promise+0.013
 difference+0.013
reath+0.013
Neg logit contributions
 her-0.019
 the-0.017
 his-0.017
 their-0.017
 it-0.015
Token the
Token ID262
Feature activation+2.55e-01

Pos logit contributions
 her+2.035
led+2.025
 his+1.809
 their+1.535
ed+1.386
Neg logit contributions
 rule-8.580
 world-8.478
'."-8.243
 promise-8.239
oral-8.232
Token sack
Token ID23704
Feature activation+1.67e+00

Pos logit contributions
 her+0.695
 the+0.654
 his+0.639
 their+0.635
 it+0.593
Neg logit contributions
 rule-0.268
icious-0.237
 promise-0.233
reath-0.227
oral-0.225
Token,
Token ID11
Feature activation+3.02e+00

Pos logit contributions
 her+2.016
 their+1.736
 his+1.728
 the+1.709
led+1.495
Neg logit contributions
 rule-4.177
 promise-3.969
 world-3.874
oral-3.864
age-3.845
Token she
Token ID673
Feature activation+1.29e+00

Pos logit contributions
 her+3.202
 his+3.085
led+3.017
 their+2.764
 the+2.719
Neg logit contributions
 rule-7.472
 promise-7.036
Two-6.946
 goal-6.794
 world-6.753
Token found
Token ID1043
Feature activation+3.63e+00

Pos logit contributions
 her+1.567
 the+1.359
 his+1.324
 their+1.308
led+1.102
Neg logit contributions
 rule-3.246
 promise-3.124
ater-3.014
icious-3.008
oral-2.977
Token a
Token ID257
Feature activation-2.30e+00

Pos logit contributions
 her+3.291
 their+3.125
led+3.100
 his+3.058
 the+2.610
Neg logit contributions
 rule-9.495
age-9.032
Two-9.010
 world-8.912
affe-8.871
Token lot
Token ID1256
Feature activation+7.13e-01

Pos logit contributions
opl+5.360
icious+5.297
'."+5.264
.'"+5.186
umpy+5.172
Neg logit contributions
 her-3.727
 the-3.513
 his-3.261
 their-3.234
 it-3.068
Token of
Token ID286
Feature activation-1.61e-01

Pos logit contributions
 her+1.183
 the+1.066
 his+1.036
 their+1.020
led+0.920
Neg logit contributions
 rule-1.499
 promise-1.390
icious-1.387
reath-1.368
 difference-1.367
Token powder
Token ID11913
Feature activation+4.04e-01

Pos logit contributions
 rule+0.326
icious+0.311
 promise+0.304
oral+0.302
'."+0.299
Neg logit contributions
 her-0.288
 the-0.261
 his-0.252
 their-0.248
 it-0.223
Token touched
Token ID12615
Feature activation+7.50e-01

Pos logit contributions
 her+2.201
 the+1.885
 their+1.883
 his+1.873
led+1.632
Neg logit contributions
 rule-3.456
 promise-3.220
umpy-3.142
 world-3.132
icious-3.127
Token the
Token ID262
Feature activation-2.79e+00

Pos logit contributions
 her+0.650
 the+0.516
 his+0.488
 their+0.473
led+0.381
Neg logit contributions
 rule-2.158
icious-2.049
 promise-2.043
 difference-2.027
reath-2.018
Token sky
Token ID6766
Feature activation-2.05e-01

Pos logit contributions
icious+5.703
age+5.518
reath+5.499
'."+5.402
 rule+5.395
Neg logit contributions
 the-5.458
 her-5.280
 his-4.991
 their-4.707
 into-4.350
Token.
Token ID13
Feature activation-1.09e+00

Pos logit contributions
 rule+0.468
icious+0.445
oral+0.441
 promise+0.441
reath+0.439
Neg logit contributions
 her-0.309
 the-0.282
 his-0.264
 their-0.262
 it-0.232
Token Inside
Token ID14384
Feature activation+1.93e+00

Pos logit contributions
 rule+2.227
 promise+2.185
icious+2.159
ater+2.141
reath+2.123
Neg logit contributions
 her-1.977
 the-1.863
 his-1.748
 their-1.710
 it-1.598
Token the
Token ID262
Feature activation-8.33e-01

Pos logit contributions
 her+2.052
 his+1.643
led+1.617
 their+1.604
 the+1.545
Neg logit contributions
 rule-5.060
 world-4.791
 promise-4.753
oral-4.715
 difference-4.708
Token castle
Token ID16669
Feature activation+1.48e+00

Pos logit contributions
icious+1.044
 rule+1.033
reath+0.968
oral+0.946
'."+0.943
Neg logit contributions
 her-2.113
 the-2.030
 his-1.967
 their-1.926
 it-1.792
Token,
Token ID11
Feature activation+1.58e+00

Pos logit contributions
 her+2.039
 the+1.756
 his+1.733
 their+1.721
led+1.541
Neg logit contributions
 rule-3.475
 promise-3.275
 world-3.186
oral-3.185
 difference-3.172
Token there
Token ID612
Feature activation+2.18e+00

Pos logit contributions
 her+1.200
 his+0.884
 the+0.835
 their+0.806
led+0.749
Neg logit contributions
 rule-4.642
 promise-4.403
Two-4.312
 world-4.306
 difference-4.298
Token was
Token ID373
Feature activation+1.53e+00

Pos logit contributions
 her+3.118
 his+2.651
 their+2.641
 the+2.635
led+2.630
Neg logit contributions
 rule-4.935
 promise-4.646
 world-4.610
 difference-4.512
oral-4.453
Token a
Token ID257
Feature activation-2.89e+00

Pos logit contributions
 her+1.169
 the+0.813
 his+0.800
 their+0.794
led+0.772
Neg logit contributions
 rule-4.577
 promise-4.255
icious-4.251
oral-4.222
 difference-4.206
Token bird
Token ID6512
Feature activation-4.09e-01

Pos logit contributions
'."+5.353
icious+5.305
.'"+5.294
 basis+5.278
 rule+5.223
Neg logit contributions
 the-3.714
 her-3.703
 his-3.284
 it-3.144
 their-3.075
Token and
Token ID290
Feature activation+1.55e+00

Pos logit contributions
 rule+1.076
icious+1.032
oral+1.021
 promise+1.015
reath+1.015
Neg logit contributions
 her-0.496
 the-0.435
 his-0.402
 their-0.385
 it-0.339
Token said
Token ID531
Feature activation+1.75e+00

Pos logit contributions
 her+2.024
 the+1.741
 their+1.730
 his+1.682
led+1.566
Neg logit contributions
 rule-3.578
 promise-3.382
affe-3.364
Two-3.315
'."-3.294
Token "
Token ID366
Feature activation-2.30e+00

Pos logit contributions
 her+2.268
 the+1.897
 their+1.863
 his+1.818
led+1.775
Neg logit contributions
 rule-4.198
 promise-3.954
affe-3.921
oral-3.903
'."-3.883
Tokenhi
Token ID5303
Feature activation-2.01e+00

Pos logit contributions
 rule+3.889
 promise+3.650
 difference+3.612
agons+3.419
reath+3.413
Neg logit contributions
 her-5.514
 the-5.153
 his-4.874
 it-4.627
 their-4.547
Token"
Token ID1
Feature activation+7.68e-02

Pos logit contributions
 rule+4.268
 former+4.089
 promise+4.083
icious+4.033
reath+4.013
Neg logit contributions
 her-3.793
 the-3.792
 his-3.404
 it-3.206
 their-3.110
Token to
Token ID284
Feature activation+6.28e-01

Pos logit contributions
 her+0.092
 the+0.080
 his+0.076
 their+0.073
 it+0.062
Neg logit contributions
 rule-0.198
icious-0.188
 promise-0.188
reath-0.186
 difference-0.186
Token it
Token ID340
Feature activation-3.72e-01

Pos logit contributions
 her+0.313
 the+0.203
 his+0.173
 their+0.170
led+0.094
Neg logit contributions
 rule-2.048
 promise-1.956
icious-1.954
reath-1.937
 difference-1.933
Token.
Token ID13
Feature activation+5.41e-01

Pos logit contributions
 rule+0.858
icious+0.821
reath+0.808
oral+0.807
 promise+0.800
Neg logit contributions
 her-0.567
 the-0.517
 his-0.492
 their-0.474
 it-0.423
Token The
Token ID383
Feature activation-1.06e+00

Pos logit contributions
 her+0.676
 the+0.583
 his+0.554
 their+0.546
led+0.473
Neg logit contributions
 rule-1.359
icious-1.286
 promise-1.281
oral-1.264
 difference-1.263
Token bird
Token ID6512
Feature activation+1.32e+00

Pos logit contributions
icious+1.427
 rule+1.400
reath+1.345
 basis+1.322
'."+1.317
Neg logit contributions
 her-2.597
 the-2.490
 his-2.346
 their-2.284
 it-2.217
Token bear
Token ID6842
Feature activation-3.15e-01

Pos logit contributions
 her+0.385
 the+0.360
 his+0.352
 their+0.346
 it+0.322
Neg logit contributions
 rule-0.215
icious-0.196
 promise-0.193
oral-0.189
reath-0.188
Token.
Token ID13
Feature activation-4.41e-02

Pos logit contributions
 rule+0.797
icious+0.767
oral+0.758
reath+0.753
 promise+0.750
Neg logit contributions
 her-0.406
 the-0.355
 his-0.333
 their-0.324
 it-0.281
Token Emma
Token ID18966
Feature activation+3.66e-01

Pos logit contributions
 rule+0.096
icious+0.090
 promise+0.090
reath+0.090
ater+0.088
Neg logit contributions
 her-0.072
 the-0.065
 his-0.062
 their-0.061
 it-0.055
Token noticed
Token ID6810
Feature activation+1.87e+00

Pos logit contributions
 her+0.441
 the+0.383
 his+0.363
 their+0.356
led+0.299
Neg logit contributions
 rule-0.932
icious-0.885
 promise-0.882
reath-0.871
 difference-0.869
Token how
Token ID703
Feature activation+8.72e-01

Pos logit contributions
 her+1.610
 their+1.334
 his+1.330
 the+1.322
led+1.240
Neg logit contributions
 rule-5.040
icious-4.792
 difference-4.763
 promise-4.734
oral-4.707
Token Lily
Token ID20037
Feature activation+2.88e-01

Pos logit contributions
 her+1.375
 the+1.239
 his+1.204
 their+1.192
led+1.052
Neg logit contributions
 rule-1.873
 promise-1.759
icious-1.751
umpy-1.742
oral-1.718
Token felt
Token ID2936
Feature activation+7.14e-01

Pos logit contributions
 her+0.343
 the+0.298
 his+0.282
 their+0.275
led+0.231
Neg logit contributions
 rule-0.741
icious-0.704
 promise-0.700
reath-0.694
oral-0.692
Token and
Token ID290
Feature activation+2.47e+00

Pos logit contributions
 her+1.064
 the+0.946
 his+0.911
 their+0.907
led+0.806
Neg logit contributions
 rule-1.607
 promise-1.512
icious-1.498
 difference-1.487
umpy-1.476
Token offered
Token ID4438
Feature activation+1.93e+00

Pos logit contributions
 her+3.289
 the+2.993
 their+2.990
 his+2.928
led+2.744
Neg logit contributions
 rule-5.348
 promise-5.312
 request-5.134
 difference-5.040
 judge-5.023
Token to
Token ID284
Feature activation-1.07e+00

Pos logit contributions
 her+1.542
 his+1.316
led+1.312
 the+1.305
 their+1.278
Neg logit contributions
 rule-5.389
 promise-5.130
 difference-5.044
 request-4.978
 sentence-4.938
Token share
Token ID2648
Feature activation+6.58e-01

Pos logit contributions
 rule+2.485
icious+2.435
umpy+2.431
oral+2.407
asis+2.399
Neg logit contributions
 her-1.769
 the-1.565
 his-1.487
 their-1.431
 it-1.284
Token book
Token ID1492
Feature activation-2.02e-01

Pos logit contributions
icious+3.221
 rule+3.122
umpy+3.104
oral+3.035
'."+3.019
Neg logit contributions
 her-2.633
 the-2.432
 his-2.348
 their-2.253
 it-2.039
Token with
Token ID351
Feature activation+1.10e+00

Pos logit contributions
 rule+0.545
icious+0.527
reath+0.521
 promise+0.520
 difference+0.518
Neg logit contributions
 her-0.220
 the-0.188
 his-0.178
 their-0.170
 it-0.139
Token a
Token ID257
Feature activation-2.69e+00

Pos logit contributions
 her+1.394
 the+1.215
 their+1.166
 his+1.153
led+1.074
Neg logit contributions
 rule-2.657
icious-2.494
 promise-2.473
reath-2.469
 difference-2.445
Token beautiful
Token ID4950
Feature activation-1.51e+00

Pos logit contributions
icious+5.999
rr+5.887
umpy+5.821
oral+5.760
 rule+5.731
Neg logit contributions
 her-4.882
 the-4.523
 his-4.237
 it-4.103
 their-3.804
Token poem
Token ID21247
Feature activation-7.65e-01

Pos logit contributions
 rule+2.486
icious+2.470
umpy+2.410
oral+2.334
 promise+2.325
Neg logit contributions
 her-3.463
 the-3.219
 his-3.083
 it-2.893
 their-2.892
Token on
Token ID319
Feature activation+1.65e-01

Pos logit contributions
 rule+1.944
icious+1.894
 promise+1.877
oral+1.870
 difference+1.868
Neg logit contributions
 her-0.988
 the-0.862
 his-0.829
 their-0.788
 it-0.696
Token the
Token ID262
Feature activation-1.91e+00

Pos logit contributions
 her+0.068
 the+0.042
 his+0.033
 their+0.028
 it+0.004
Neg logit contributions
 rule-0.559
icious-0.539
 promise-0.538
reath-0.532
 difference-0.532
Token front
Token ID2166
Feature activation-2.84e+00

Pos logit contributions
oral+3.308
 rule+3.296
icious+3.257
umpy+3.208
rr+3.207
Neg logit contributions
 her-4.141
 the-3.886
 his-3.560
 their-3.365
 it-3.223
Token.
Token ID13
Feature activation-2.27e-01

Pos logit contributions
icious+6.323
reath+6.283
 affect+6.119
 rule+6.033
 difference+5.929
Neg logit contributions
 her-5.428
 the-5.021
 his-4.551
 their-4.057
 it-4.039
Token The
Token ID383
Feature activation-1.11e+00

Pos logit contributions
 rule+0.531
 promise+0.507
icious+0.502
reath+0.495
oral+0.493
Neg logit contributions
 her-0.340
 the-0.307
 his-0.294
 their-0.284
 it-0.250
Token poem
Token ID21247
Feature activation+7.36e-01

Pos logit contributions
 rule+1.900
icious+1.835
!'"+1.795
'."+1.791
 difference+1.785
Neg logit contributions
 her-2.345
 the-2.212
 his-2.065
 their-1.966
 it-1.910
Token<|endoftext|>
Token ID50256
Feature activation-6.08e-01

Pos logit contributions
 rule+1.946
icious+1.855
reath+1.840
 promise+1.834
ater+1.832
Neg logit contributions
 her-1.546
 the-1.451
 their-1.361
 his-1.344
 it-1.208
TokenJason
Token ID26497
Feature activation-6.76e-01

Pos logit contributions
 rule+1.227
icious+1.164
 promise+1.160
reath+1.138
ater+1.111
Neg logit contributions
 her-1.117
 the-1.055
 his-0.993
 their-0.974
 it-0.900
Token and
Token ID290
Feature activation-3.93e-01

Pos logit contributions
 rule+1.773
icious+1.698
reath+1.683
oral+1.657
 promise+1.652
Neg logit contributions
 her-0.854
 the-0.757
 his-0.701
 their-0.678
 it-0.579
Token his
Token ID465
Feature activation-1.51e+00

Pos logit contributions
 rule+1.317
icious+1.295
reath+1.268
 promise+1.257
oral+1.257
Neg logit contributions
 her-0.211
 the-0.152
 his-0.134
 their-0.117
 it-0.050
Token family
Token ID1641
Feature activation-6.51e-01

Pos logit contributions
 rule+3.584
icious+3.540
asis+3.496
reath+3.470
age+3.418
Neg logit contributions
 her-2.340
 the-2.180
 their-2.015
 his-1.968
 it-1.666
Token were
Token ID547
Feature activation+2.20e-01

Pos logit contributions
 rule+1.769
icious+1.701
reath+1.692
 sentence+1.651
ater+1.648
Neg logit contributions
 her-0.762
 the-0.675
 his-0.615
 their-0.610
 it-0.498
Token in
Token ID287
Feature activation+6.70e-01

Pos logit contributions
 her+0.241
 the+0.206
 his+0.193
 their+0.190
 it+0.153
Neg logit contributions
 rule-0.589
icious-0.561
 promise-0.559
reath-0.555
oral-0.552
Token their
Token ID511
Feature activation+2.88e-01

Pos logit contributions
 her+0.286
 the+0.171
 his+0.139
 their+0.125
led+0.068
Neg logit contributions
 rule-2.218
 promise-2.121
icious-2.109
 difference-2.101
oral-2.087
Token house
Token ID2156
Feature activation-4.80e-01

Pos logit contributions
 her+0.611
 the+0.566
 his+0.550
 their+0.545
 it+0.496
Neg logit contributions
 rule-0.474
icious-0.438
 promise-0.434
reath-0.428
oral-0.425
Token.
Token ID13
Feature activation+5.96e-01

Pos logit contributions
 rule+1.163
reath+1.120
icious+1.119
oral+1.102
 promise+1.096
Neg logit contributions
 her-0.668
 the-0.600
 his-0.568
 their-0.561
 it-0.466
Token There
Token ID1318
Feature activation+1.23e+00

Pos logit contributions
 her+1.140
 the+1.043
 his+1.009
 their+0.997
led+0.923
Neg logit contributions
 rule-1.093
icious-1.012
 promise-1.008
 difference-0.992
oral-0.991
TokenI
Token ID40
Feature activation-2.18e+00

Pos logit contributions
 rule+5.596
 difference+5.380
 promise+5.248
reath+5.219
agons+5.125
Neg logit contributions
 her-5.489
 the-4.931
 his-4.681
 their-4.538
 it-4.362
Token love
Token ID1842
Feature activation-1.06e+00

Pos logit contributions
icious+3.994
 rule+3.981
ode+3.886
reath+3.885
agons+3.881
Neg logit contributions
 her-4.585
 the-4.299
 his-4.107
 their-4.017
 it-3.891
Token you
Token ID345
Feature activation-1.88e+00

Pos logit contributions
 rule+2.272
umpy+2.154
icious+2.151
 promise+2.143
oral+2.115
Neg logit contributions
 her-1.942
 the-1.791
 his-1.685
 their-1.632
 it-1.542
Token too
Token ID1165
Feature activation-1.94e+00

Pos logit contributions
 rule+5.567
reath+5.456
oral+5.388
asis+5.364
umpy+5.353
Neg logit contributions
 her-1.996
 the-1.782
 his-1.516
 their-1.424
 it-1.273
Token,
Token ID11
Feature activation-1.67e+00

Pos logit contributions
 rule+5.251
reath+5.198
ode+5.182
age+5.102
ulture+5.081
Neg logit contributions
 her-2.428
 the-2.255
 his-1.958
 their-1.885
 it-1.686
Token Anna
Token ID11735
Feature activation-2.14e+00

Pos logit contributions
 rule+4.736
age+4.620
ute+4.578
umpy+4.565
reath+4.562
Neg logit contributions
 her-1.912
 the-1.714
 his-1.473
 their-1.440
 it-1.288
Token.
Token ID13
Feature activation-1.90e+00

Pos logit contributions
 rule+5.218
icious+5.138
reath+5.078
oral+4.980
Two+4.929
Neg logit contributions
 her-3.482
 the-3.301
 his-2.889
 their-2.859
 it-2.675
Token Have
Token ID8192
Feature activation-9.19e-01

Pos logit contributions
 rule+3.352
ater+3.103
icious+3.078
reath+3.069
 promise+3.055
Neg logit contributions
 her-4.163
 the-3.903
 his-3.642
 their-3.618
 it-3.454
Token fun
Token ID1257
Feature activation-2.08e+00

Pos logit contributions
 rule+2.701
icious+2.617
ater+2.611
umpy+2.610
reath+2.604
Neg logit contributions
 her-0.888
 the-0.747
 his-0.657
 their-0.630
 it-0.513
Token with
Token ID351
Feature activation+3.39e-01

Pos logit contributions
oral+6.204
icious+6.197
 rule+6.185
affe+6.119
reath+6.065
Neg logit contributions
 her-1.996
 the-1.790
 their-1.475
 his-1.422
 it-1.219
Token Lily
Token ID20037
Feature activation-1.77e+00

Pos logit contributions
 her+0.537
 the+0.483
 his+0.464
 their+0.457
led+0.404
Neg logit contributions
 rule-0.742
icious-0.698
 promise-0.695
reath-0.685
oral-0.683
Token "
Token ID366
Feature activation-1.89e+00

Pos logit contributions
 rule+0.408
icious+0.388
 promise+0.388
reath+0.383
 difference+0.383
Neg logit contributions
 her-0.217
 the-0.195
 his-0.184
 their-0.178
 it-0.156
TokenPlease
Token ID5492
Feature activation-2.81e+00

Pos logit contributions
 rule+3.368
 promise+3.220
reath+3.062
 difference+3.061
arl+2.948
Neg logit contributions
 her-4.173
 the-3.827
 his-3.735
 their-3.504
 it-3.442
Token teach
Token ID4545
Feature activation-9.93e-01

Pos logit contributions
umpy+8.345
 rule+8.301
oral+8.288
opl+8.160
reath+8.128
Neg logit contributions
 the-3.042
 her-3.031
 his-2.356
 it-2.114
 their-1.986
Token me
Token ID502
Feature activation-1.29e+00

Pos logit contributions
 rule+1.867
 promise+1.778
umpy+1.766
reath+1.764
oral+1.752
Neg logit contributions
 her-2.035
 the-1.912
 his-1.820
 their-1.733
 it-1.641
Token,
Token ID11
Feature activation-2.60e+00

Pos logit contributions
 rule+3.926
icious+3.826
umpy+3.784
oral+3.782
reath+3.770
Neg logit contributions
 her-1.068
 the-0.958
 his-0.829
 their-0.716
 it-0.581
Token thumb
Token ID15683
Feature activation-2.17e+00

Pos logit contributions
age+6.379
reath+6.376
oral+6.339
asis+6.331
 rule+6.283
Neg logit contributions
 the-4.228
 her-4.129
 his-3.810
 it-3.559
 their-3.236
Token!"
Token ID2474
Feature activation-6.30e-01

Pos logit contributions
 rule+5.286
oral+5.248
icious+5.178
reath+5.087
 Deal+5.081
Neg logit contributions
 her-3.251
 the-3.047
 his-2.826
 their-2.615
 it-2.513
Token␣
Token ID220
Feature activation-9.46e-01

Pos logit contributions
 rule+1.309
icious+1.245
reath+1.236
 promise+1.216
umpy+1.202
Neg logit contributions
 her-1.117
 the-1.052
 his-1.020
 their-0.972
 it-0.881
Token\n
Token ID198
Feature activation-1.30e+00

Pos logit contributions
 rule+2.034
reath+1.925
icious+1.918
 promise+1.908
ater+1.899
Neg logit contributions
 her-1.586
 the-1.509
 his-1.439
 their-1.383
 it-1.261
Token\n
Token ID198
Feature activation-1.93e+00

Pos logit contributions
 rule+2.970
 promise+2.839
icious+2.821
reath+2.807
 difference+2.785
Neg logit contributions
 her-2.120
 the-2.017
 his-1.944
 their-1.816
 it-1.655
TokenThe
Token ID464
Feature activation-9.45e-01

Pos logit contributions
 rule+3.853
reath+3.763
 promise+3.757
arl+3.724
 difference+3.707
Neg logit contributions
 her-3.686
 the-3.520
 his-3.513
 their-3.257
 it-2.953
Token the
Token ID262
Feature activation-3.07e+00

Pos logit contributions
 her+0.051
 the+0.035
 his+0.028
 their+0.025
 it+0.008
Neg logit contributions
 rule-0.362
icious-0.350
 promise-0.348
reath-0.345
 difference-0.344
Token lake
Token ID13546
Feature activation-3.03e-01

Pos logit contributions
ides+5.415
icious+5.348
reath+5.296
oral+5.238
 rule+5.231
Neg logit contributions
 the-7.451
 her-7.204
 their-6.571
 his-6.455
 a-5.985
Token.
Token ID13
Feature activation+8.04e-02

Pos logit contributions
 rule+0.688
icious+0.662
reath+0.650
 promise+0.647
oral+0.642
Neg logit contributions
 her-0.461
 the-0.421
 his-0.402
 their-0.396
 it-0.348
Token He
Token ID679
Feature activation+1.22e-01

Pos logit contributions
 her+0.131
 the+0.120
 his+0.115
 their+0.113
 it+0.100
Neg logit contributions
 rule-0.173
icious-0.163
 promise-0.162
reath-0.161
oral-0.159
Token wanted
Token ID2227
Feature activation+1.59e+00

Pos logit contributions
 her+0.169
 the+0.151
 his+0.144
 their+0.141
 it+0.122
Neg logit contributions
 rule-0.292
icious-0.277
 promise-0.275
reath-0.272
 difference-0.272
Token to
Token ID284
Feature activation-2.27e+00

Pos logit contributions
 her+1.664
led+1.288
 the+1.262
 his+1.240
 their+1.228
Neg logit contributions
 rule-4.272
 promise-3.991
 difference-3.941
 world-3.911
oral-3.908
Token see
Token ID766
Feature activation+1.00e+00

Pos logit contributions
icious+4.555
oral+4.267
arl+4.255
umpy+4.189
umm+4.154
Neg logit contributions
 her-4.961
 the-4.839
 his-4.600
 their-4.396
 it-4.121
Token it
Token ID340
Feature activation+3.97e-01

Pos logit contributions
 her+0.723
 the+0.516
 his+0.486
 their+0.478
led+0.400
Neg logit contributions
 rule-3.007
 promise-2.882
 difference-2.869
icious-2.845
reath-2.827
Token better
Token ID1365
Feature activation+3.41e-01

Pos logit contributions
 her+0.517
 the+0.453
 his+0.432
 their+0.424
led+0.364
Neg logit contributions
 rule-0.978
icious-0.926
 promise-0.924
reath-0.911
 difference-0.909
Token so
Token ID523
Feature activation+1.77e+00

Pos logit contributions
 her+0.488
 the+0.434
 his+0.416
 their+0.409
led+0.355
Neg logit contributions
 rule-0.799
icious-0.755
 promise-0.752
reath-0.742
 difference-0.740
Token he
Token ID339
Feature activation+2.51e-01

Pos logit contributions
 her+1.577
 the+1.199
 their+1.135
led+1.132
 his+1.078
Neg logit contributions
 rule-4.909
 promise-4.640
icious-4.568
oral-4.555
'."-4.551
Token her
Token ID607
Feature activation-1.31e+00

Pos logit contributions
 her+0.253
 the+0.142
 his+0.114
 their+0.107
led+0.034
Neg logit contributions
 rule-2.341
 promise-2.246
icious-2.244
reath-2.222
 difference-2.221
Token mom
Token ID1995
Feature activation-1.25e+00

Pos logit contributions
 rule+3.170
icious+3.169
age+3.095
ode+3.093
asis+3.086
Neg logit contributions
 her-1.959
 the-1.759
 his-1.621
 their-1.590
 it-1.341
Token and
Token ID290
Feature activation+1.61e+00

Pos logit contributions
 rule+3.638
icious+3.515
reath+3.494
umpy+3.418
Two+3.415
Neg logit contributions
 her-1.319
 the-1.143
 his-0.989
 their-0.951
 it-0.763
Token hugged
Token ID41130
Feature activation+9.94e-01

Pos logit contributions
 her+2.098
 the+1.848
 his+1.813
 their+1.783
led+1.672
Neg logit contributions
 rule-3.723
 promise-3.651
oral-3.473
icious-3.459
 request-3.440
Token her
Token ID607
Feature activation-1.55e+00

Pos logit contributions
 her+0.288
 the+0.144
 his+0.117
 their+0.095
led+0.016
Neg logit contributions
 rule-3.355
 promise-3.232
icious-3.205
 difference-3.191
reath-3.184
Token tight
Token ID5381
Feature activation-2.17e+00

Pos logit contributions
 rule+3.897
icious+3.846
umpy+3.771
rr+3.730
reath+3.727
Neg logit contributions
 her-2.227
 the-2.044
 his-1.852
 their-1.798
 it-1.532
Token.
Token ID13
Feature activation-1.06e+00

Pos logit contributions
icious+5.828
 rule+5.676
reath+5.639
oral+5.635
asis+5.487
Neg logit contributions
 her-2.936
 the-2.679
 his-2.436
 their-2.250
 it-2.012
Token From
Token ID3574
Feature activation+6.78e-01

Pos logit contributions
 rule+2.445
icious+2.310
 promise+2.296
reath+2.280
ater+2.275
Neg logit contributions
 her-1.749
 the-1.564
 his-1.451
 their-1.443
 it-1.315
Token then
Token ID788
Feature activation+5.23e-02

Pos logit contributions
 her+0.751
 the+0.637
 his+0.616
 their+0.601
led+0.503
Neg logit contributions
 rule-1.782
 promise-1.696
icious-1.692
 difference-1.666
oral-1.661
Token on
Token ID319
Feature activation+3.33e-01

Pos logit contributions
 her+0.040
 the+0.032
 his+0.029
 their+0.028
 it+0.020
Neg logit contributions
 rule-0.157
icious-0.151
 promise-0.150
reath-0.149
 difference-0.149
Token,
Token ID11
Feature activation+7.70e-01

Pos logit contributions
 her+0.340
 the+0.287
 his+0.268
 their+0.262
led+0.209
Neg logit contributions
 rule-0.913
icious-0.870
 promise-0.867
reath-0.857
 difference-0.856
Token princess
Token ID21752
Feature activation-8.84e-01

Pos logit contributions
icious+5.365
opl+5.226
rr+5.127
oral+5.121
 rule+5.118
Neg logit contributions
 her-5.330
 the-4.821
 their-4.657
 his-4.644
 it-4.283
Token or
Token ID393
Feature activation+5.15e-01

Pos logit contributions
 rule+2.239
icious+2.179
oral+2.159
reath+2.129
 promise+2.125
Neg logit contributions
 her-1.161
 the-1.026
 his-0.932
 their-0.906
 it-0.804
Token a
Token ID257
Feature activation-2.97e+00

Pos logit contributions
 her+0.321
 the+0.241
 his+0.213
 their+0.204
led+0.134
Neg logit contributions
 rule-1.607
icious-1.534
 promise-1.534
reath-1.516
 difference-1.512
Token fairy
Token ID25607
Feature activation-1.82e+00

Pos logit contributions
icious+6.529
iver+6.284
umpy+6.204
opl+6.114
 rule+6.106
Neg logit contributions
 her-6.141
 the-5.486
 his-5.307
 their-5.248
 it-4.823
Token or
Token ID393
Feature activation+1.25e-01

Pos logit contributions
 rule+4.585
icious+4.567
oral+4.544
reath+4.371
 promise+4.367
Neg logit contributions
 her-2.647
 the-2.359
 his-2.086
 their-2.043
 it-1.871
Token a
Token ID257
Feature activation-3.21e+00

Pos logit contributions
 her+0.075
 the+0.055
 his+0.048
 their+0.045
 it+0.024
Neg logit contributions
 rule-0.400
icious-0.386
 promise-0.383
reath-0.380
 difference-0.380
Token doctor
Token ID6253
Feature activation-1.10e+00

Pos logit contributions
icious+7.174
iver+6.858
umpy+6.736
opl+6.670
rig+6.599
Neg logit contributions
 her-6.812
 the-6.089
 his-5.847
 their-5.794
 it-5.346
Token.
Token ID13
Feature activation-1.27e+00

Pos logit contributions
 rule+2.569
icious+2.555
oral+2.526
reath+2.476
 difference+2.452
Neg logit contributions
 her-1.646
 the-1.465
 his-1.362
 their-1.357
 it-1.190
Token One
Token ID1881
Feature activation+2.53e-01

Pos logit contributions
 rule+2.805
ater+2.699
icious+2.691
asis+2.686
 promise+2.677
Neg logit contributions
 her-2.253
 the-2.009
 his-1.888
 their-1.878
 it-1.665
Token day
Token ID1110
Feature activation+2.36e+00

Pos logit contributions
 her+0.566
 the+0.526
 his+0.511
 their+0.506
led+0.465
Neg logit contributions
 rule-0.393
icious-0.364
 promise-0.358
reath-0.352
 difference-0.351
Token,
Token ID11
Feature activation+2.40e+00

Pos logit contributions
 her+2.888
led+2.635
 his+2.573
 the+2.539
 their+2.450
Neg logit contributions
 rule-5.439
Two-5.147
'."-5.121
 promise-5.118
oral-5.068