TOP ACTIVATIONS
MAX = 0.585

Billy to seat down on the bench . Billy said ,
oxy if she had seen the stone . She shook her
a shiny ac orn on the ground . Nut ty wanted
let me guide you around the track , I â
asked her to come up the tree and sit with him
and music came out of the trumpet ! Bert laughed ,
you ." Daisy put down the pipe and ran inside .
, but he accidentally knocked the bowl over . All the
Twe ety flew around the forest asking all the animals
showed Sarah how to use the counter top to un load
tree standing by itself in the middle of a field .
cow tried to balance on the log . She said ,
boy who wanted to catch the moon . He had seen
showed Billy how to hold the arrow and throw it .
you ." She put on the dress and tw irled around
mom my asked him what the boy looked like and Tim
She tried to touch the spider but it ran away
So Suz ie climbed up the tree and looked at the
my helped his mom clean the house and she paid him
helped her put it in the car . After that ,

MIN ACTIVATIONS
MIN = -0.718

It looks like a chim ney !" The
was a big rh in oc eros named Remy . Remy
something scary in the chim ney . They thanked Mr .
a bird on the chim ney . He aimed his toy
That is a rh in oc eros , Lily . It
They heard the rh in oc eros roar . They heard
They saw a big chim ney in the distance and Rachel
there is a rh in oc eros in the park .
there was a rh in oc eros who was always naughty
was a big rh in oc eros . He was troubled
am taking the rh in oc eros to a new home
there was a rh in oc eros . The rh in
man come down the chim ney . The man was very
quickly ran to the chim ney . She took a few
a game near the chim ney . They were very happy
had a big rh in oc eros that he liked to
eros . The rh in oc eros looked sad and scared
cozy spot in the chim ney and the two friends enjoyed
It is a rh in oc eros . It is bright
of the train 's chim ney . The steam was very

INTERVAL 0.260 - 0.585
CONTAINS 13.033%

scatter their things all around the palace . It was fun
sees a big cookie on the table . He loves cookies
top . She quickly snatched the fruit and started to eat
piece moved . He showed her how to plan her moves
I like to listen to you and learn from you .

INTERVAL -0.066 - 0.260
CONTAINS 42.506%

outside one day when they found a big , thick model
imagine fun adventures with her uncle . They would go on
castles and shapes with their toys . One day
yes , so they went outside to the garden .
, the pupil is always happy and never forg ets to

INTERVAL -0.392 - -0.066
CONTAINS 43.789%

I am a thief , and you are very stupid !"
Everyone cl ucked in delight . The chicken
to it tightly . They started to pull and tug and
alive . They said Lily and her parents had to leave
do not want to stop . They like their band .

INTERVAL -0.718 - -0.392
CONTAINS 0.672%

the pige ons and sp arrow s . He liked to
lay there , W igg ly said , " Bug gy
It was a tiny three year old girl . She said
he remembered how his j olly day of playing almost delayed
her back home . Mill ie was embarrassed because she had

Token Billy
Token ID15890
Feature activation+1.63e-01

Pos logit contributions
oc+0.057
y+0.055
ney+0.055
ess+0.054
ard+0.054
Neg logit contributions
 your-0.016
 these-0.015
 an-0.015
 the-0.015
 anything-0.015
Token to
Token ID284
Feature activation+1.87e-01

Pos logit contributions
 the+0.316
 a+0.272
 an+0.271
 your+0.253
 her+0.252
Neg logit contributions
ney-0.464
oc-0.462
ess-0.442
ard-0.441
ushed-0.438
Token seat
Token ID5852
Feature activation-1.54e-01

Pos logit contributions
 the+0.326
 an+0.310
 your+0.304
 anything+0.302
 these+0.299
Neg logit contributions
ney-0.458
oc-0.451
ess-0.422
ard-0.414
ushed-0.412
Token down
Token ID866
Feature activation+7.84e-02

Pos logit contributions
ney+0.463
oc+0.446
ushed+0.442
ouse+0.441
y+0.441
Neg logit contributions
 the-0.182
 your-0.174
 an-0.169
 these-0.157
 our-0.156
Token on
Token ID319
Feature activation+1.17e-01

Pos logit contributions
 the+0.099
 your+0.092
 an+0.090
 anything+0.086
 our+0.083
Neg logit contributions
ney-0.237
oc-0.231
ard-0.221
y-0.221
board-0.220
Token the
Token ID262
Feature activation+5.85e-01

Pos logit contributions
 your+0.041
 the+0.034
 these+0.031
 our+0.030
 an+0.026
Neg logit contributions
oc-0.413
ard-0.395
ney-0.392
board-0.376
y-0.376
Token bench
Token ID7624
Feature activation-1.32e-01

Pos logit contributions
 the+1.493
 an+1.348
 a+1.327
 your+1.307
 her+1.261
Neg logit contributions
ney-1.254
oc-1.252
ard-1.170
ess-1.169
ushed-1.154
Token.
Token ID13
Feature activation-2.49e-01

Pos logit contributions
oc+0.299
ney+0.283
ard+0.276
ushed+0.276
an+0.270
Neg logit contributions
 the-0.344
 a-0.312
 an-0.307
 her-0.285
 your-0.283
Token Billy
Token ID15890
Feature activation+1.54e-01

Pos logit contributions
ney+0.613
oc+0.613
ard+0.579
ess+0.576
ushed+0.564
Neg logit contributions
 the-0.538
 an-0.482
 your-0.475
 a-0.468
 her-0.451
Token said
Token ID531
Feature activation-1.47e-01

Pos logit contributions
 the+0.405
 a+0.353
 an+0.346
 her+0.334
 your+0.329
Neg logit contributions
ney-0.341
oc-0.337
ess-0.321
ides-0.316
ushed-0.316
Token,
Token ID11
Feature activation-1.73e-01

Pos logit contributions
oc+0.231
ushed+0.225
ouse+0.220
y+0.217
ess+0.217
Neg logit contributions
 the-0.389
 an-0.370
 your-0.358
 such-0.353
 a-0.351
Tokenoxy
Token ID23536
Feature activation-2.49e-01

Pos logit contributions
 the+0.380
 an+0.352
 a+0.351
 your+0.339
 her+0.323
Neg logit contributions
ney-0.244
oc-0.228
ushed-0.217
suits-0.216
uddle-0.212
Token if
Token ID611
Feature activation+1.23e-01

Pos logit contributions
ney+0.600
oc+0.584
ard+0.564
uddle+0.560
ushed+0.556
Neg logit contributions
 the-0.592
 an-0.530
 a-0.529
 your-0.497
 her-0.491
Token she
Token ID673
Feature activation+2.15e-01

Pos logit contributions
 the+0.192
 your+0.174
 an+0.171
 our+0.170
 these+0.166
Neg logit contributions
oc-0.318
ess-0.312
ard-0.301
ari-0.299
ney-0.299
Token had
Token ID550
Feature activation-2.11e-01

Pos logit contributions
 the+0.545
 a+0.509
 an+0.498
 it+0.459
 her+0.456
Neg logit contributions
ney-0.484
uddle-0.462
oc-0.461
ivering-0.460
ard-0.455
Token seen
Token ID1775
Feature activation-2.63e-02

Pos logit contributions
ney+0.470
oc+0.464
ard+0.460
ess+0.452
ouse+0.446
Neg logit contributions
 the-0.399
 your-0.351
 an-0.345
 their-0.343
 these-0.342
Token the
Token ID262
Feature activation+5.84e-01

Pos logit contributions
oc+0.090
ard+0.089
ess+0.089
ney+0.088
y+0.087
Neg logit contributions
 the-0.016
 your-0.013
 our-0.012
 their-0.012
 an-0.012
Token stone
Token ID7815
Feature activation-9.45e-02

Pos logit contributions
 the+1.290
 an+1.149
 a+1.129
 your+1.103
 her+1.056
Neg logit contributions
ney-1.451
oc-1.445
ard-1.364
ess-1.358
ushed-1.350
Token.
Token ID13
Feature activation-2.45e-01

Pos logit contributions
oc+0.230
ney+0.230
ushed+0.218
ess+0.215
ard+0.213
Neg logit contributions
 the-0.223
 a-0.197
 an-0.196
 your-0.186
 her-0.185
Token She
Token ID1375
Feature activation-6.79e-02

Pos logit contributions
oc+0.657
ney+0.652
ard+0.622
ess+0.621
ari+0.604
Neg logit contributions
 the-0.480
 an-0.424
 a-0.412
 your-0.409
 their-0.387
Token shook
Token ID14682
Feature activation-1.47e-01

Pos logit contributions
ney+0.149
oc+0.147
ard+0.139
ess+0.136
uddle+0.134
Neg logit contributions
 the-0.174
 an-0.157
 a-0.155
 your-0.148
 her-0.148
Token her
Token ID607
Feature activation+4.52e-01

Pos logit contributions
oc+0.512
y+0.501
ney+0.499
jo+0.493
ushed+0.490
Neg logit contributions
 the-0.090
 your-0.080
 these-0.076
 an-0.075
 our-0.070
Token a
Token ID257
Feature activation+4.88e-01

Pos logit contributions
oc+0.472
ard+0.436
ney+0.433
y+0.430
ess+0.423
Neg logit contributions
 our-0.071
 the-0.070
 your-0.066
 their-0.063
 these-0.059
Token shiny
Token ID22441
Feature activation+1.60e-01

Pos logit contributions
 the+0.959
 a+0.831
 an+0.825
 your+0.790
 her+0.736
Neg logit contributions
ney-1.334
oc-1.315
ushed-1.280
ess-1.271
ard-1.267
Token ac
Token ID936
Feature activation-1.19e-01

Pos logit contributions
 the+0.380
 an+0.335
 a+0.335
 your+0.329
 her+0.311
Neg logit contributions
ney-0.383
oc-0.372
ess-0.358
ushed-0.358
jo-0.358
Tokenorn
Token ID1211
Feature activation-3.77e-01

Pos logit contributions
ushed+0.189
ney+0.189
jo+0.187
ess+0.186
ides+0.180
Neg logit contributions
 the-0.360
 a-0.323
 an-0.316
 your-0.315
 it-0.310
Token on
Token ID319
Feature activation+1.15e-01

Pos logit contributions
oc+1.184
ney+1.182
ushed+1.152
ess+1.129
ari+1.127
Neg logit contributions
 the-0.643
 a-0.550
 an-0.537
 your-0.485
 her-0.483
Token the
Token ID262
Feature activation+5.84e-01

Pos logit contributions
 your+0.035
 these+0.032
 our+0.027
nam+0.016
 the+0.015
Neg logit contributions
oc-0.426
ney-0.384
y-0.383
ard-0.379
aron-0.367
Token ground
Token ID2323
Feature activation-2.00e-01

Pos logit contributions
 the+1.610
 an+1.462
 a+1.446
 your+1.422
 her+1.376
Neg logit contributions
oc-1.138
ney-1.133
ushed-1.056
ess-1.051
ard-1.045
Token.
Token ID13
Feature activation-2.45e-01

Pos logit contributions
oc+0.449
ney+0.418
ushed+0.416
ari+0.407
ouse+0.407
Neg logit contributions
 the-0.543
 a-0.491
 an-0.482
 her-0.453
 your-0.441
Token Nut
Token ID11959
Feature activation+1.09e-01

Pos logit contributions
oc+0.614
ney+0.613
ard+0.581
ess+0.577
ushed+0.566
Neg logit contributions
 the-0.534
 an-0.476
 a-0.465
 your-0.460
 her-0.439
Tokenty
Token ID774
Feature activation-2.03e-01

Pos logit contributions
 the+0.285
 a+0.248
 an+0.243
 her+0.234
 your+0.230
Neg logit contributions
ney-0.240
board-0.226
uddle-0.225
ari-0.224
oc-0.223
Token wanted
Token ID2227
Feature activation-2.53e-01

Pos logit contributions
ney+0.493
oc+0.472
umpy+0.468
ari+0.461
ess+0.458
Neg logit contributions
 the-0.496
 a-0.436
 an-0.426
 her-0.404
 it-0.396
Token let
Token ID1309
Feature activation+3.63e-02

Pos logit contributions
 the+0.642
 a+0.560
 an+0.555
 your+0.507
 her+0.496
Neg logit contributions
ney-0.862
oc-0.846
ushed-0.816
uddle-0.801
ouse-0.800
Token me
Token ID502
Feature activation+6.88e-02

Pos logit contributions
 the+0.051
 an+0.045
 a+0.041
 their+0.041
 your+0.040
Neg logit contributions
oc-0.106
ard-0.104
ney-0.102
ess-0.102
board-0.101
Token guide
Token ID5698
Feature activation-9.95e-02

Pos logit contributions
 the+0.130
 an+0.114
 a+0.111
 your+0.109
 her+0.104
Neg logit contributions
ney-0.190
oc-0.189
ard-0.180
ess-0.179
ushed-0.177
Token you
Token ID345
Feature activation+3.10e-01

Pos logit contributions
oc+0.334
ard+0.326
ney+0.322
ess+0.321
y+0.318
Neg logit contributions
 the-0.095
 an-0.082
 a-0.073
 their-0.066
 your-0.066
Token around
Token ID1088
Feature activation-1.13e-02

Pos logit contributions
 the+0.790
 a+0.710
 an+0.707
 your+0.685
 her+0.651
Neg logit contributions
ney-0.697
oc-0.691
ushed-0.654
ard-0.639
ess-0.634
Token the
Token ID262
Feature activation+5.83e-01

Pos logit contributions
oc+0.032
ney+0.032
ard+0.031
ess+0.030
y+0.030
Neg logit contributions
 the-0.018
 an-0.016
 a-0.016
 your-0.015
 their-0.015
Token track
Token ID2610
Feature activation-1.73e-02

Pos logit contributions
 the+1.470
 an+1.317
 a+1.306
 your+1.278
 her+1.211
Neg logit contributions
oc-1.279
ney-1.272
ess-1.212
ushed-1.212
ard-1.183
Token,
Token ID11
Feature activation-1.67e-01

Pos logit contributions
oc+0.038
ney+0.037
ushed+0.036
ess+0.035
ouse+0.035
Neg logit contributions
 the-0.045
 a-0.040
 an-0.040
 your-0.039
 her-0.037
Token I
Token ID314
Feature activation+1.60e-01

Pos logit contributions
ney+0.435
oc+0.432
ard+0.424
ess+0.413
ari+0.406
Neg logit contributions
 the-0.305
 an-0.273
 a-0.259
 your-0.257
 their-0.250
Tokenâ
Token ID22940
Feature activation+4.29e-03

Pos logit contributions
 the+0.379
 a+0.345
 an+0.345
 your+0.317
 it+0.313
Neg logit contributions
ney-0.376
ard-0.364
ushed-0.359
uddle-0.356
ivering-0.353
Token
Token ID26391
Feature activation-9.91e-02

Pos logit contributions
 the+0.010
 an+0.009
 a+0.009
 your+0.009
 this+0.008
Neg logit contributions
ney-0.010
oc-0.010
uddle-0.010
ushed-0.009
ess-0.009
Token asked
Token ID1965
Feature activation-4.26e-02

Pos logit contributions
ney+0.293
uddle+0.290
oc+0.278
umpy+0.276
ard+0.276
Neg logit contributions
 the-0.374
 a-0.335
 an-0.330
 her-0.315
 your-0.308
Token her
Token ID607
Feature activation+4.68e-01

Pos logit contributions
oc+0.146
ess+0.141
ney+0.140
ard+0.138
y+0.137
Neg logit contributions
 the-0.038
 your-0.032
 an-0.031
 our-0.030
 these-0.029
Token to
Token ID284
Feature activation+1.76e-01

Pos logit contributions
 the+0.876
 an+0.765
 a+0.741
 your+0.736
 their+0.695
Neg logit contributions
ney-1.303
oc-1.298
ess-1.235
ard-1.233
ushed-1.209
Token come
Token ID1282
Feature activation+9.78e-02

Pos logit contributions
 the+0.299
 an+0.266
 your+0.255
 their+0.252
 a+0.247
Neg logit contributions
oc-0.452
ney-0.451
ard-0.439
ess-0.438
ouse-0.426
Token up
Token ID510
Feature activation+1.43e-01

Pos logit contributions
 the+0.192
 an+0.169
 a+0.163
 your+0.163
 their+0.154
Neg logit contributions
ney-0.261
oc-0.260
ess-0.247
ard-0.245
ushed-0.240
Token the
Token ID262
Feature activation+5.83e-01

Pos logit contributions
 the+0.365
 an+0.334
 your+0.328
 a+0.325
 these+0.315
Neg logit contributions
ney-0.283
oc-0.282
ess-0.264
ard-0.262
y-0.255
Token tree
Token ID5509
Feature activation-1.49e-01

Pos logit contributions
 the+1.488
 a+1.328
 an+1.325
 your+1.280
 her+1.226
Neg logit contributions
oc-1.262
ney-1.240
ushed-1.218
ess-1.184
ard-1.167
Token and
Token ID290
Feature activation-2.54e-01

Pos logit contributions
oc+0.324
ushed+0.323
ney+0.313
ard+0.309
uddle+0.308
Neg logit contributions
 the-0.394
 a-0.352
 an-0.346
 her-0.329
 your-0.324
Token sit
Token ID1650
Feature activation+1.07e-01

Pos logit contributions
oc+0.667
ney+0.650
ess+0.636
ouse+0.607
ard+0.601
Neg logit contributions
 the-0.412
 an-0.394
 anything-0.369
 your-0.367
 a-0.360
Token with
Token ID351
Feature activation+1.01e-02

Pos logit contributions
 the+0.195
 an+0.169
 a+0.165
 your+0.160
 her+0.154
Neg logit contributions
oc-0.311
ney-0.311
ard-0.297
ess-0.295
ushed-0.290
Token him
Token ID683
Feature activation+3.34e-01

Pos logit contributions
 an+0.005
 our+0.005
 the+0.004
 your+0.004
 these+0.004
Neg logit contributions
oc-0.033
ess-0.033
ard-0.032
jo-0.032
y-0.031
Token and
Token ID290
Feature activation-2.46e-01

Pos logit contributions
 the+0.412
 a+0.367
 an+0.366
 your+0.355
 her+0.346
Neg logit contributions
ney-0.308
oc-0.295
ess-0.284
ushed-0.283
ari-0.279
Token music
Token ID2647
Feature activation+1.62e-01

Pos logit contributions
oc+0.571
ney+0.558
ard+0.542
ushed+0.542
y+0.531
Neg logit contributions
 the-0.465
 an-0.449
 your-0.449
 our-0.426
 these-0.422
Token came
Token ID1625
Feature activation-1.55e-01

Pos logit contributions
 the+0.402
 an+0.358
 a+0.357
 your+0.340
 her+0.336
Neg logit contributions
oc-0.373
ney-0.372
ard-0.346
ess-0.344
ari-0.339
Token out
Token ID503
Feature activation+2.05e-01

Pos logit contributions
ney+0.473
oc+0.471
ard+0.451
ess+0.449
ushed+0.444
Neg logit contributions
 the-0.252
 an-0.215
 a-0.208
 your-0.203
 her-0.191
Token of
Token ID286
Feature activation-1.06e-01

Pos logit contributions
 the+0.517
 an+0.460
 a+0.456
 your+0.442
 her+0.430
Neg logit contributions
ney-0.460
oc-0.459
ess-0.431
ard-0.430
ushed-0.425
Token the
Token ID262
Feature activation+5.83e-01

Pos logit contributions
ney+0.363
oc+0.360
ard+0.357
y+0.355
aron+0.350
Neg logit contributions
 your-0.045
 the-0.034
 an-0.033
 these-0.032
 our-0.030
Token trumpet
Token ID40334
Feature activation-9.14e-02

Pos logit contributions
 the+1.458
 a+1.280
 your+1.240
 an+1.236
 their+1.171
Neg logit contributions
oc-1.346
ess-1.309
ney-1.303
ushed-1.288
ari-1.235
Token!
Token ID0
Feature activation-2.67e-01

Pos logit contributions
oc+0.197
ney+0.193
ushed+0.190
ess+0.178
umpy+0.176
Neg logit contributions
 the-0.244
 a-0.221
 an-0.211
 her-0.201
 their-0.201
Token Bert
Token ID22108
Feature activation+7.36e-02

Pos logit contributions
ney+0.671
oc+0.671
ard+0.634
ess+0.631
ushed+0.619
Neg logit contributions
 the-0.574
 an-0.512
 a-0.500
 your-0.494
 her-0.472
Token laughed
Token ID13818
Feature activation-5.44e-02

Pos logit contributions
 the+0.198
 a+0.171
 an+0.166
 it+0.160
 her+0.159
Neg logit contributions
ney-0.157
oc-0.155
ari-0.149
uddle-0.148
ess-0.147
Token,
Token ID11
Feature activation-1.70e-01

Pos logit contributions
oc+0.096
ney+0.094
ard+0.089
ess+0.088
ushed+0.087
Neg logit contributions
 the-0.153
 an-0.142
 your-0.139
 a-0.138
 these-0.133
Token you
Token ID345
Feature activation+3.01e-01

Pos logit contributions
 the+0.088
 an+0.085
 their+0.074
 a+0.073
 anything+0.073
Neg logit contributions
ard-0.210
oc-0.207
ney-0.203
board-0.200
ess-0.199
Token."
Token ID526
Feature activation-1.67e-01

Pos logit contributions
 the+0.769
 a+0.692
 an+0.687
 your+0.674
 her+0.642
Neg logit contributions
oc-0.678
ney-0.675
ushed-0.636
ard-0.624
uddle-0.621
Token Daisy
Token ID40355
Feature activation+1.73e-01

Pos logit contributions
oc+0.433
ney+0.432
ard+0.409
ess+0.408
ushed+0.401
Neg logit contributions
 the-0.344
 an-0.308
 a-0.299
 your-0.294
 their-0.279
Token put
Token ID1234
Feature activation+4.28e-02

Pos logit contributions
 the+0.466
 a+0.414
 an+0.410
 her+0.397
 your+0.389
Neg logit contributions
ney-0.376
oc-0.371
ard-0.351
ari-0.349
ess-0.347
Token down
Token ID866
Feature activation+7.04e-02

Pos logit contributions
 our+0.023
 your+0.021
 the+0.018
 their+0.018
 an+0.017
Neg logit contributions
y-0.142
oc-0.141
board-0.138
ard-0.137
ushed-0.137
Token the
Token ID262
Feature activation+5.83e-01

Pos logit contributions
 your+0.026
 anything+0.024
 our+0.022
 the+0.022
 these+0.016
Neg logit contributions
ney-0.248
oc-0.247
y-0.246
board-0.246
ess-0.243
Token pipe
Token ID12656
Feature activation-2.32e-01

Pos logit contributions
 the+1.446
 an+1.303
 a+1.288
 your+1.260
 her+1.215
Neg logit contributions
ney-1.285
oc-1.282
ess-1.204
ard-1.198
ushed-1.184
Token and
Token ID290
Feature activation-2.42e-01

Pos logit contributions
oc+0.576
ney+0.556
ess+0.529
ard+0.524
ushed+0.521
Neg logit contributions
 the-0.547
 a-0.488
 an-0.483
 her-0.458
 your-0.450
Token ran
Token ID4966
Feature activation-9.56e-02

Pos logit contributions
oc+0.500
ney+0.480
ushed+0.472
ard+0.459
board+0.453
Neg logit contributions
 the-0.527
 an-0.506
 anything-0.502
 your-0.500
 such-0.480
Token inside
Token ID2641
Feature activation+3.81e-03

Pos logit contributions
ney+0.231
ess+0.231
oc+0.227
board+0.226
ard+0.226
Neg logit contributions
 the-0.179
 an-0.167
 your-0.161
 anything-0.156
 a-0.151
Token.
Token ID13
Feature activation-2.38e-01

Pos logit contributions
 the+0.009
 an+0.008
 your+0.008
 a+0.008
 anything+0.008
Neg logit contributions
oc-0.008
ney-0.008
ess-0.007
ard-0.007
y-0.007
Token,
Token ID11
Feature activation-1.76e-01

Pos logit contributions
ney+0.098
oc+0.097
ard+0.092
ess+0.091
ushed+0.090
Neg logit contributions
 the-0.100
 an-0.090
 a-0.088
 your-0.087
 her-0.083
Token but
Token ID475
Feature activation-1.19e-01

Pos logit contributions
ard+0.355
oc+0.339
ney+0.331
board+0.321
y+0.315
Neg logit contributions
 the-0.348
 your-0.331
 an-0.324
 anything-0.322
 our-0.320
Token he
Token ID339
Feature activation+1.84e-01

Pos logit contributions
oc+0.322
y+0.315
ney+0.314
ouse+0.301
ard+0.298
Neg logit contributions
 an-0.152
 the-0.147
 anything-0.146
 such-0.142
 these-0.138
Token accidentally
Token ID14716
Feature activation-2.45e-02

Pos logit contributions
 the+0.476
 a+0.426
 an+0.424
 her+0.396
 your+0.395
Neg logit contributions
ney-0.405
oc-0.401
ari-0.389
ard-0.388
ess-0.380
Token knocked
Token ID13642
Feature activation+6.42e-03

Pos logit contributions
oc+0.052
ney+0.052
ard+0.049
ushed+0.048
ouse+0.048
Neg logit contributions
 the-0.059
 an-0.053
 your-0.052
 a-0.051
 their-0.050
Token the
Token ID262
Feature activation+5.83e-01

Pos logit contributions
 the+0.008
 an+0.008
 your+0.007
 a+0.007
 our+0.006
Neg logit contributions
ney-0.019
y-0.019
oc-0.019
ard-0.019
board-0.018
Token bowl
Token ID9396
Feature activation-2.07e-01

Pos logit contributions
 the+1.190
 a+1.047
 an+1.029
 your+0.959
 her+0.931
Neg logit contributions
oc-1.585
ney-1.568
jo-1.548
ari-1.525
ushed-1.524
Token over
Token ID625
Feature activation+1.04e-01

Pos logit contributions
oc+0.592
ney+0.579
ess+0.558
ari+0.557
ard+0.555
Neg logit contributions
 the-0.404
 an-0.345
 a-0.343
 your-0.323
 her-0.317
Token.
Token ID13
Feature activation-2.55e-01

Pos logit contributions
 the+0.197
 an+0.181
 your+0.179
 a+0.172
 these+0.167
Neg logit contributions
oc-0.265
ney-0.261
ard-0.251
y-0.244
ushed-0.242
Token All
Token ID1439
Feature activation-4.79e-03

Pos logit contributions
oc+0.652
ney+0.643
ard+0.616
ess+0.602
ushed+0.600
Neg logit contributions
 the-0.483
 your-0.438
 an-0.433
 anything-0.428
 these-0.418
Token the
Token ID262
Feature activation+5.77e-01

Pos logit contributions
ney+0.015
y+0.015
oc+0.015
ard+0.015
board+0.014
Neg logit contributions
 an-0.005
 the-0.005
 your-0.005
 a-0.005
 anything-0.004
Token
Token ID198
Feature activation-3.10e-02

Pos logit contributions
ney+0.071
oc+0.071
ard+0.067
ess+0.067
ushed+0.066
Neg logit contributions
 the-0.062
 an-0.056
 a-0.054
 your-0.054
 her-0.051
TokenTwe
Token ID32665
Feature activation-2.23e-01

Pos logit contributions
oc+0.075
ney+0.074
ard+0.070
ess+0.070
ushed+0.068
Neg logit contributions
 the-0.071
 an-0.064
 a-0.062
 your-0.061
 her-0.058
Tokenety
Token ID2963
Feature activation-3.57e-01

Pos logit contributions
uddle+0.437
board+0.414
ney+0.414
oc+0.412
ard+0.405
Neg logit contributions
 the-0.557
 her-0.527
 an-0.524
 your-0.519
 a-0.514
Token flew
Token ID13112
Feature activation-6.45e-02

Pos logit contributions
ney+0.820
ari+0.801
umpy+0.800
oc+0.800
ides+0.792
Neg logit contributions
 the-0.922
 a-0.800
 an-0.770
 her-0.754
 your-0.712
Token around
Token ID1088
Feature activation-2.11e-02

Pos logit contributions
y+0.157
ney+0.151
ess+0.151
ushed+0.142
suit+0.142
Neg logit contributions
 the-0.114
 your-0.113
 anything-0.108
 an-0.108
 these-0.107
Token the
Token ID262
Feature activation+5.83e-01

Pos logit contributions
oc+0.062
ney+0.060
y+0.058
ard+0.057
ess+0.057
Neg logit contributions
 your-0.028
 the-0.028
 an-0.027
 anything-0.026
 our-0.026
Token forest
Token ID8222
Feature activation-2.45e-01

Pos logit contributions
 the+1.616
 an+1.504
 a+1.458
 your+1.447
 her+1.413
Neg logit contributions
ney-1.113
oc-1.087
ard-1.018
ess-0.987
uddle-0.969
Token asking
Token ID4737
Feature activation+8.05e-02

Pos logit contributions
oc+0.587
ney+0.582
ard+0.555
ushed+0.549
ouse+0.546
Neg logit contributions
 the-0.579
 an-0.512
 a-0.510
 your-0.477
 her-0.477
Token all
Token ID477
Feature activation+2.83e-01

Pos logit contributions
 the+0.116
 your+0.105
 an+0.103
 our+0.100
 these+0.096
Neg logit contributions
oc-0.231
ess-0.227
ney-0.223
y-0.217
ard-0.216
Token the
Token ID262
Feature activation+5.77e-01

Pos logit contributions
 an+0.176
 the+0.163
 anything+0.147
 your+0.134
 a+0.131
Neg logit contributions
y-1.005
oc-1.002
ney-1.000
jo-0.968
ess-0.955
Token animals
Token ID4695
Feature activation+2.68e-02

Pos logit contributions
 the+1.108
 an+0.964
 a+0.953
 your+0.913
 her+0.876
Neg logit contributions
ney-1.622
oc-1.607
ushed-1.540
ard-1.530
ess-1.528
Token showed
Token ID3751
Feature activation-6.98e-02

Pos logit contributions
ney+0.759
oc+0.727
ari+0.705
ess+0.697
uddle+0.681
Neg logit contributions
 the-0.891
 a-0.800
 an-0.776
 her-0.760
 your-0.756
Token Sarah
Token ID10490
Feature activation+1.97e-01

Pos logit contributions
oc+0.225
y+0.220
jo+0.220
ess+0.219
ard+0.217
Neg logit contributions
 the-0.057
 an-0.052
 anything-0.045
 a-0.043
 your-0.041
Token how
Token ID703
Feature activation+3.13e-01

Pos logit contributions
 the+0.198
 your+0.165
 an+0.164
 our+0.143
 a+0.142
Neg logit contributions
ney-0.646
y-0.642
oc-0.641
ard-0.619
ouse-0.617
Token to
Token ID284
Feature activation+1.81e-01

Pos logit contributions
 the+0.515
 an+0.463
 anything+0.446
 your+0.432
 a+0.432
Neg logit contributions
ney-0.795
y-0.790
ess-0.772
ushed-0.760
ouse-0.754
Token use
Token ID779
Feature activation+3.11e-02

Pos logit contributions
 the+0.363
 an+0.330
 a+0.315
 your+0.313
 anything+0.297
Neg logit contributions
oc-0.452
ney-0.450
ard-0.432
ess-0.427
ushed-0.417
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 the+0.018
 an+0.015
 your+0.013
 such+0.011
 a+0.011
Neg logit contributions
oc-0.101
ard-0.099
y-0.099
ney-0.098
uddle-0.095
Token counter
Token ID3753
Feature activation-1.73e-01

Pos logit contributions
 the+1.270
 an+1.132
 a+1.115
 your+1.079
 her+1.032
Neg logit contributions
ney-1.475
oc-1.474
ess-1.397
ushed-1.386
ard-1.384
Tokentop
Token ID4852
Feature activation-2.11e-01

Pos logit contributions
ney+0.405
oc+0.403
ard+0.380
ess+0.378
ushed+0.372
Neg logit contributions
 the-0.406
 an-0.365
 a-0.357
 your-0.352
 her-0.338
Token to
Token ID284
Feature activation+1.79e-01

Pos logit contributions
ney+0.478
oc+0.475
ess+0.449
ushed+0.447
ard+0.440
Neg logit contributions
 the-0.529
 an-0.469
 a-0.467
 your-0.456
 her-0.443
Token un
Token ID555
Feature activation+4.54e-02

Pos logit contributions
 the+0.330
 an+0.310
 your+0.294
 a+0.289
 anything+0.278
Neg logit contributions
oc-0.446
ney-0.442
ard-0.425
board-0.413
ess-0.405
Tokenload
Token ID2220
Feature activation-1.62e-01

Pos logit contributions
 the+0.116
 an+0.105
 a+0.103
 your+0.102
 her+0.098
Neg logit contributions
oc-0.097
ney-0.097
ard-0.091
ess-0.090
ushed-0.089
Token tree
Token ID5509
Feature activation-1.47e-01

Pos logit contributions
ushed+0.426
ides+0.420
oc+0.410
ouse+0.409
ney+0.403
Neg logit contributions
 the-0.343
 a-0.280
 an-0.265
 this-0.260
 your-0.253
Token standing
Token ID5055
Feature activation-2.77e-02

Pos logit contributions
oc+0.428
ushed+0.417
ney+0.412
ard+0.412
ides+0.410
Neg logit contributions
 the-0.290
 a-0.249
 an-0.241
 her-0.220
 your-0.218
Token by
Token ID416
Feature activation-1.51e-02

Pos logit contributions
ney+0.076
oc+0.074
ess+0.071
ard+0.070
ushed+0.069
Neg logit contributions
 the-0.053
 an-0.046
 your-0.046
 a-0.045
 their-0.044
Token itself
Token ID2346
Feature activation+1.35e-01

Pos logit contributions
oc+0.058
ney+0.057
ess+0.055
y+0.055
ard+0.054
Neg logit contributions
 the-0.006
 your-0.005
 our-0.004
 an-0.004
 their-0.004
Token in
Token ID287
Feature activation+7.69e-02

Pos logit contributions
 the+0.359
 a+0.326
 an+0.316
 her+0.295
 your+0.291
Neg logit contributions
oc-0.296
ney-0.294
ushed-0.292
umpy-0.273
jo-0.273
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 your+0.033
 our+0.033
 these+0.033
 their+0.029
 those+0.027
Neg logit contributions
oc-0.255
ney-0.248
y-0.230
ard-0.227
 pad-0.220
Token middle
Token ID3504
Feature activation-2.91e-02

Pos logit contributions
 the+1.489
 an+1.355
 a+1.322
 your+1.316
 their+1.273
Neg logit contributions
ney-1.247
oc-1.228
ard-1.156
ess-1.138
ari-1.102
Token of
Token ID286
Feature activation-1.11e-01

Pos logit contributions
ushed+0.079
oc+0.079
ouse+0.076
ess+0.076
uer+0.075
Neg logit contributions
 the-0.064
 a-0.054
 an-0.052
 her-0.051
 your-0.049
Token a
Token ID257
Feature activation+4.95e-01

Pos logit contributions
 pad+0.345
oc+0.340
y+0.338
ney+0.338
ard+0.333
Neg logit contributions
 our-0.044
 anything-0.043
 your-0.043
 such-0.034
 these-0.030
Token field
Token ID2214
Feature activation-1.98e-01

Pos logit contributions
 the+1.038
 a+0.877
 your+0.830
 an+0.821
 it+0.786
Neg logit contributions
ushed-1.349
ides-1.292
ney-1.270
oc-1.248
ess-1.239
Token.
Token ID13
Feature activation-2.50e-01

Pos logit contributions
oc+0.461
ney+0.425
ushed+0.423
uddle+0.421
ard+0.415
Neg logit contributions
 the-0.518
 a-0.465
 an-0.457
 her-0.426
 your-0.420
Token cow
Token ID9875
Feature activation-6.89e-02

Pos logit contributions
 the+1.439
 a+1.286
 an+1.266
 your+1.202
 her+1.187
Neg logit contributions
ney-1.301
oc-1.293
ushed-1.239
ouse-1.215
ard-1.208
Token tried
Token ID3088
Feature activation+6.16e-03

Pos logit contributions
ney+0.165
oc+0.163
umpy+0.155
ess+0.153
ushed+0.153
Neg logit contributions
 the-0.169
 a-0.148
 an-0.144
 her-0.138
 their-0.135
Token to
Token ID284
Feature activation+1.88e-01

Pos logit contributions
 the+0.010
 your+0.009
 an+0.009
 these+0.009
 our+0.008
Neg logit contributions
ney-0.016
oc-0.016
y-0.015
ess-0.015
ard-0.015
Token balance
Token ID5236
Feature activation+1.67e-02

Pos logit contributions
 the+0.377
 an+0.350
 your+0.348
 these+0.337
 anything+0.324
Neg logit contributions
oc-0.434
ney-0.421
ard-0.401
an-0.400
ess-0.397
Token on
Token ID319
Feature activation+1.17e-01

Pos logit contributions
 the+0.025
 an+0.023
 your+0.022
 a+0.021
 these+0.021
Neg logit contributions
oc-0.047
ney-0.046
y-0.045
ard-0.044
ess-0.044
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 your+0.027
 the+0.022
 an+0.022
 our+0.019
 such+0.015
Neg logit contributions
oc-0.431
ard-0.414
ney-0.410
y-0.410
ess-0.405
Token log
Token ID2604
Feature activation-1.63e-01

Pos logit contributions
 the+1.555
 an+1.405
 a+1.390
 your+1.362
 her+1.312
Neg logit contributions
ney-1.189
oc-1.177
ess-1.108
ard-1.099
ushed-1.099
Token.
Token ID13
Feature activation-2.49e-01

Pos logit contributions
oc+0.364
ney+0.360
ushed+0.341
ess+0.337
ard+0.337
Neg logit contributions
 the-0.413
 an-0.367
 a-0.366
 your-0.350
 her-0.346
Token She
Token ID1375
Feature activation-7.29e-02

Pos logit contributions
oc+0.603
ney+0.601
ard+0.572
ess+0.570
ushed+0.552
Neg logit contributions
 the-0.528
 an-0.480
 your-0.472
 a-0.463
 anything-0.447
Token said
Token ID531
Feature activation-1.52e-01

Pos logit contributions
ney+0.157
oc+0.155
ess+0.148
ard+0.147
ari+0.145
Neg logit contributions
 the-0.189
 an-0.169
 a-0.168
 your-0.161
 her-0.158
Token,
Token ID11
Feature activation-1.76e-01

Pos logit contributions
oc+0.276
ess+0.270
y+0.269
ney+0.266
ushed+0.265
Neg logit contributions
 the-0.353
 an-0.330
 your-0.326
 our-0.318
 such-0.317
Token boy
Token ID2933
Feature activation-2.45e-01

Pos logit contributions
ney+0.119
oc+0.112
ushed+0.112
ides+0.112
ess+0.109
Neg logit contributions
 the-0.152
 a-0.134
 your-0.132
 an-0.129
 her-0.125
Token who
Token ID508
Feature activation-9.20e-02

Pos logit contributions
uddle+0.651
ides+0.630
ari+0.625
ushed+0.624
board+0.620
Neg logit contributions
 the-0.581
 a-0.505
 an-0.479
 her-0.469
 your-0.468
Token wanted
Token ID2227
Feature activation-2.57e-01

Pos logit contributions
ari+0.204
uddle+0.201
ney+0.200
ess+0.197
ard+0.197
Neg logit contributions
 the-0.234
 a-0.216
 an-0.209
 her-0.197
 your-0.194
Token to
Token ID284
Feature activation+1.76e-01

Pos logit contributions
ney+0.497
board+0.463
ard+0.439
y+0.431
ess+0.418
Neg logit contributions
 your-0.478
 the-0.468
 our-0.467
 these-0.466
 anything-0.456
Token catch
Token ID4929
Feature activation+1.83e-01

Pos logit contributions
 the+0.353
 an+0.319
 your+0.316
 these+0.306
 their+0.304
Neg logit contributions
ney-0.423
oc-0.415
ard-0.396
ouse-0.390
board-0.390
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 your+0.104
 the+0.099
 our+0.092
 these+0.077
 their+0.074
Neg logit contributions
oc-0.623
ard-0.599
y-0.591
uddle-0.589
an-0.587
Token moon
Token ID8824
Feature activation-9.83e-02

Pos logit contributions
 the+1.397
 an+1.257
 a+1.238
 your+1.208
 her+1.160
Neg logit contributions
ney-1.340
oc-1.336
ard-1.253
ess-1.248
ushed-1.240
Token.
Token ID13
Feature activation-2.51e-01

Pos logit contributions
oc+0.206
ney+0.203
ushed+0.197
ess+0.193
ard+0.191
Neg logit contributions
 the-0.266
 a-0.238
 an-0.236
 your-0.225
 her-0.224
Token He
Token ID679
Feature activation-1.43e-01

Pos logit contributions
ney+0.545
oc+0.540
ard+0.508
ess+0.503
ushed+0.493
Neg logit contributions
 the-0.634
 an-0.576
 a-0.564
 your-0.558
 her-0.538
Token had
Token ID550
Feature activation-2.11e-01

Pos logit contributions
ney+0.293
oc+0.291
ard+0.273
ess+0.271
ushed+0.266
Neg logit contributions
 the-0.385
 an-0.348
 a-0.344
 your-0.335
 her-0.327
Token seen
Token ID1775
Feature activation-3.81e-02

Pos logit contributions
ney+0.677
age+0.641
 pad+0.629
ard+0.601
ess+0.599
Neg logit contributions
 these-0.157
 your-0.150
 the-0.149
 our-0.148
 those-0.132
Token showed
Token ID3751
Feature activation-7.46e-02

Pos logit contributions
ney+0.752
oc+0.733
ard+0.694
ess+0.690
ari+0.690
Neg logit contributions
 the-0.930
 a-0.831
 an-0.831
 your-0.798
 her-0.786
Token Billy
Token ID15890
Feature activation+1.51e-01

Pos logit contributions
oc+0.207
ess+0.196
jo+0.193
y+0.192
an+0.186
Neg logit contributions
 anything-0.084
 the-0.076
 an-0.076
 your-0.075
 these-0.074
Token how
Token ID703
Feature activation+3.08e-01

Pos logit contributions
 the+0.108
 your+0.100
 an+0.093
 anything+0.091
 these+0.089
Neg logit contributions
y-0.517
oc-0.507
ney-0.496
ard-0.479
an-0.479
Token to
Token ID284
Feature activation+1.82e-01

Pos logit contributions
 the+0.521
 an+0.470
 anything+0.460
 these+0.442
 a+0.440
Neg logit contributions
ney-0.790
ushed-0.764
umpy-0.764
y-0.759
oc-0.758
Token hold
Token ID1745
Feature activation+1.12e-01

Pos logit contributions
 the+0.364
 an+0.333
 a+0.316
 your+0.314
 anything+0.303
Neg logit contributions
oc-0.459
ney-0.450
ard-0.428
ess-0.423
ushed-0.417
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 the+0.058
 your+0.050
 an+0.049
 our+0.043
 anything+0.039
Neg logit contributions
oc-0.403
ard-0.391
y-0.383
ney-0.379
ushed-0.371
Token arrow
Token ID15452
Feature activation-5.32e-02

Pos logit contributions
 the+1.378
 an+1.230
 a+1.227
 your+1.186
 her+1.129
Neg logit contributions
ney-1.367
oc-1.360
ess-1.296
ushed-1.276
ard-1.266
Token and
Token ID290
Feature activation-2.48e-01

Pos logit contributions
oc+0.122
ney+0.122
ess+0.114
ard+0.114
ushed+0.114
Neg logit contributions
 the-0.130
 an-0.117
 a-0.116
 your-0.112
 her-0.109
Token throw
Token ID3714
Feature activation+4.33e-02

Pos logit contributions
oc+0.638
ney+0.592
ushed+0.573
ess+0.567
ouse+0.561
Neg logit contributions
 the-0.444
 an-0.426
 your-0.415
 these-0.394
 anything-0.390
Token it
Token ID340
Feature activation+4.59e-01

Pos logit contributions
 the+0.022
 an+0.020
 your+0.019
 a+0.013
 our+0.013
Neg logit contributions
oc-0.166
ard-0.156
ney-0.154
y-0.153
uddle-0.148
Token.
Token ID13
Feature activation-2.52e-01

Pos logit contributions
 the+0.938
 an+0.828
 a+0.808
 your+0.794
 her+0.757
Neg logit contributions
ney-1.213
oc-1.209
ard-1.148
ess-1.142
ushed-1.127
Token you
Token ID345
Feature activation+3.01e-01

Pos logit contributions
 the+0.021
 an+0.017
 a+0.013
 their+0.012
 anything+0.012
Neg logit contributions
oc-0.088
ard-0.086
ouse-0.084
ess-0.083
uddle-0.083
Token."
Token ID526
Feature activation-1.52e-01

Pos logit contributions
 the+0.594
 an+0.522
 a+0.509
 your+0.500
 her+0.476
Neg logit contributions
ney-0.811
oc-0.809
ard-0.769
ess-0.765
ushed-0.755
Token She
Token ID1375
Feature activation-6.32e-02

Pos logit contributions
ney+0.407
oc+0.402
ard+0.384
ess+0.381
ushed+0.378
Neg logit contributions
 the-0.313
 an-0.272
 a-0.268
 your-0.262
 her-0.252
Token put
Token ID1234
Feature activation+4.18e-02

Pos logit contributions
ney+0.146
oc+0.144
ard+0.135
ess+0.134
ari+0.131
Neg logit contributions
 the-0.158
 an-0.141
 a-0.140
 your-0.135
 her-0.133
Token on
Token ID319
Feature activation+1.12e-01

Pos logit contributions
 our+0.045
 your+0.043
 an+0.037
 the+0.036
 my+0.035
Neg logit contributions
y-0.118
les-0.114
board-0.113
oc-0.113
ouse-0.113
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 our+0.054
 your+0.047
 these+0.045
 anything+0.039
 those+0.037
Neg logit contributions
ard-0.356
oc-0.355
board-0.354
jo-0.352
ess-0.349
Token dress
Token ID6576
Feature activation-7.60e-02

Pos logit contributions
 the+1.477
 an+1.339
 a+1.310
 your+1.295
 her+1.249
Neg logit contributions
ney-1.242
oc-1.234
ard-1.164
ess-1.148
ushed-1.129
Token and
Token ID290
Feature activation-2.54e-01

Pos logit contributions
oc+0.161
ney+0.156
ushed+0.144
umpy+0.144
ard+0.143
Neg logit contributions
 the-0.209
 a-0.187
 an-0.185
 her-0.180
 your-0.175
Token tw
Token ID665
Feature activation+9.80e-03

Pos logit contributions
oc+0.488
ouse+0.486
ney+0.475
board+0.472
ushed+0.467
Neg logit contributions
 the-0.529
 an-0.515
 your-0.514
 such-0.503
 our-0.498
Tokenirled
Token ID49376
Feature activation-2.03e-01

Pos logit contributions
 the+0.029
 an+0.026
 a+0.026
 your+0.026
 her+0.025
Neg logit contributions
ney-0.017
oc-0.017
ard-0.016
ess-0.016
ushed-0.016
Token around
Token ID1088
Feature activation-1.65e-02

Pos logit contributions
oc+0.475
ney+0.466
ard+0.462
y+0.455
ess+0.453
Neg logit contributions
 the-0.416
 an-0.389
 your-0.380
 a-0.363
 our-0.356
Token mom
Token ID1995
Feature activation+1.53e-01

Pos logit contributions
oc+0.188
ney+0.185
ard+0.181
ess+0.177
ari+0.176
Neg logit contributions
 the-0.078
 a-0.064
 an-0.063
 your-0.058
 her-0.055
Tokenmy
Token ID1820
Feature activation-3.34e-01

Pos logit contributions
 the+0.421
 her+0.374
 a+0.373
 an+0.356
 your+0.338
Neg logit contributions
ney-0.334
oc-0.331
uddle-0.318
umpy-0.318
mer-0.317
Token asked
Token ID1965
Feature activation-3.38e-02

Pos logit contributions
ney+0.799
oc+0.764
ess+0.734
umpy+0.731
ard+0.720
Neg logit contributions
 the-0.869
 a-0.765
 an-0.758
 her-0.736
 your-0.702
Token him
Token ID683
Feature activation+3.42e-01

Pos logit contributions
oc+0.080
ney+0.078
ard+0.075
ess+0.074
ushed+0.073
Neg logit contributions
 the-0.075
 an-0.067
 your-0.066
 a-0.065
 these-0.062
Token what
Token ID644
Feature activation+3.32e-01

Pos logit contributions
 the+0.580
 an+0.510
 your+0.499
 a+0.482
 their+0.458
Neg logit contributions
oc-0.966
ney-0.962
ess-0.923
ard-0.905
y-0.905
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 the+0.693
 an+0.632
 your+0.622
 a+0.602
 these+0.583
Neg logit contributions
oc-0.807
ney-0.803
ess-0.747
ard-0.744
board-0.742
Token boy
Token ID2933
Feature activation-2.38e-01

Pos logit contributions
 the+1.456
 an+1.314
 a+1.311
 your+1.258
 her+1.211
Neg logit contributions
ney-1.292
oc-1.288
ushed-1.216
ess-1.207
ard-1.205
Token looked
Token ID3114
Feature activation-1.74e-01

Pos logit contributions
ney+0.565
oc+0.533
uddle+0.511
ard+0.509
an+0.506
Neg logit contributions
 the-0.610
 a-0.545
 an-0.536
 her-0.505
 it-0.495
Token like
Token ID588
Feature activation+3.84e-02

Pos logit contributions
ney+0.470
oc+0.459
ess+0.437
ard+0.437
ari+0.433
Neg logit contributions
 the-0.374
 an-0.331
 a-0.327
 your-0.313
 her-0.303
Token and
Token ID290
Feature activation-2.47e-01

Pos logit contributions
 the+0.054
 an+0.047
 your+0.046
 a+0.043
 our+0.041
Neg logit contributions
oc-0.116
ney-0.116
ard-0.112
board-0.110
ess-0.110
Token Tim
Token ID5045
Feature activation+2.68e-01

Pos logit contributions
oc+0.592
ney+0.574
ushed+0.550
ard+0.548
ess+0.546
Neg logit contributions
 the-0.484
 an-0.449
 your-0.435
 a-0.426
 anything-0.414
Token
Token ID26391
Feature activation-1.02e-01

Pos logit contributions
 the+0.007
 an+0.006
 your+0.005
 a+0.005
 this+0.005
Neg logit contributions
oc-0.007
ney-0.007
uddle-0.007
ushed-0.007
yt-0.007
Token She
Token ID1375
Feature activation-6.00e-02

Pos logit contributions
ney+0.205
oc+0.202
an+0.192
ushed+0.192
ard+0.191
Neg logit contributions
 the-0.275
 an-0.250
 a-0.246
 your-0.240
 it-0.232
Token tried
Token ID3088
Feature activation+2.12e-04

Pos logit contributions
ney+0.130
oc+0.127
ari+0.121
ess+0.121
ard+0.121
Neg logit contributions
 the-0.157
 a-0.141
 an-0.140
 your-0.136
 her-0.131
Token to
Token ID284
Feature activation+1.88e-01

Pos logit contributions
 the+0.000
 an+0.000
 a+0.000
 your+0.000
 their+0.000
Neg logit contributions
ney-0.001
oc-0.001
ess-0.001
y-0.001
ard-0.001
Token touch
Token ID3638
Feature activation+2.41e-02

Pos logit contributions
 the+0.356
 an+0.323
 your+0.308
 a+0.305
 their+0.300
Neg logit contributions
oc-0.481
ney-0.479
ard-0.455
ess-0.451
an-0.445
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 the+0.011
 an+0.010
 a+0.007
 your+0.006
 their+0.006
Neg logit contributions
oc-0.091
ard-0.089
y-0.088
ney-0.087
ess-0.086
Token spider
Token ID19230
Feature activation-6.42e-02

Pos logit contributions
 the+1.380
 a+1.227
 an+1.220
 your+1.180
 her+1.116
Neg logit contributions
oc-1.365
ney-1.354
ushed-1.308
ess-1.297
ari-1.272
Token but
Token ID475
Feature activation-1.09e-01

Pos logit contributions
ney+0.144
oc+0.142
ushed+0.136
ess+0.135
ari+0.134
Neg logit contributions
 the-0.163
 an-0.144
 a-0.144
 your-0.141
 her-0.136
Token it
Token ID340
Feature activation+4.57e-01

Pos logit contributions
ney+0.346
oc+0.338
ouse+0.336
y+0.335
ess+0.331
Neg logit contributions
 an-0.097
 the-0.095
 a-0.084
 their-0.081
 your-0.080
Token ran
Token ID4966
Feature activation-9.89e-02

Pos logit contributions
 the+1.272
 an+1.144
 a+1.136
 your+1.088
 her+1.088
Neg logit contributions
oc-0.900
ney-0.895
ard-0.844
ari-0.840
ess-0.824
Token away
Token ID1497
Feature activation+4.54e-02

Pos logit contributions
ney+0.242
oc+0.241
ess+0.234
ard+0.232
y+0.229
Neg logit contributions
 the-0.202
 an-0.180
 your-0.174
 a-0.173
 these-0.166
Token So
Token ID1406
Feature activation-3.99e-01

Pos logit contributions
ney+0.427
oc+0.426
ard+0.405
ess+0.403
ushed+0.397
Neg logit contributions
 the-0.320
 an-0.283
 a-0.276
 your-0.271
 her-0.257
Token Suz
Token ID22691
Feature activation+2.59e-02

Pos logit contributions
oc+1.156
ney+1.104
ess+1.088
y+1.085
ard+1.081
Neg logit contributions
 the-0.540
 an-0.516
 your-0.481
 anything-0.464
 a-0.462
Tokenie
Token ID494
Feature activation-4.09e-01

Pos logit contributions
 the+0.082
 her+0.074
 a+0.072
 an+0.070
 your+0.070
Neg logit contributions
oc-0.047
ney-0.043
uddle-0.042
ouse-0.040
ushed-0.039
Token climbed
Token ID19952
Feature activation-2.37e-02

Pos logit contributions
ney+0.940
oc+0.928
ard+0.870
ess+0.866
ushed+0.855
Neg logit contributions
 the-1.044
 a-0.923
 an-0.919
 your-0.878
 her-0.868
Token up
Token ID510
Feature activation+1.52e-01

Pos logit contributions
ney+0.059
y+0.058
oc+0.057
ard+0.056
ess+0.056
Neg logit contributions
 your-0.039
 the-0.039
 our-0.037
 an-0.037
 anything-0.036
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 your+0.168
 the+0.154
 our+0.151
 an+0.142
 anything+0.141
Neg logit contributions
ney-0.467
y-0.454
oc-0.450
ess-0.443
ard-0.439
Token tree
Token ID5509
Feature activation-1.36e-01

Pos logit contributions
 the+1.510
 a+1.320
 an+1.283
 your+1.253
 it+1.220
Neg logit contributions
oc-1.281
ushed-1.250
ney-1.231
ess-1.195
ouse-1.177
Token and
Token ID290
Feature activation-2.45e-01

Pos logit contributions
oc+0.304
ushed+0.296
ney+0.284
ouse+0.283
umpy+0.282
Neg logit contributions
 the-0.359
 a-0.323
 an-0.313
 her-0.303
 their-0.287
Token looked
Token ID3114
Feature activation-1.74e-01

Pos logit contributions
oc+0.465
ney+0.459
ard+0.440
ess+0.428
ushed+0.427
Neg logit contributions
 the-0.572
 your-0.563
 an-0.560
 our-0.543
 anything-0.534
Token at
Token ID379
Feature activation-4.71e-02

Pos logit contributions
ney+0.447
oc+0.445
ard+0.425
ess+0.422
ushed+0.419
Neg logit contributions
 the-0.352
 an-0.314
 a-0.305
 your-0.302
 her-0.285
Token the
Token ID262
Feature activation+5.79e-01

Pos logit contributions
y+0.166
ney+0.166
oc+0.163
jo+0.160
ess+0.159
Neg logit contributions
 your-0.020
 our-0.017
 an-0.014
 the-0.012
 these-0.011
Tokenmy
Token ID1820
Feature activation-3.52e-01

Pos logit contributions
 the+0.709
 a+0.640
 an+0.608
 her+0.595
 your+0.592
Neg logit contributions
ney-0.543
ari-0.523
oc-0.506
ides-0.503
ess-0.495
Token helped
Token ID4193
Feature activation-2.15e-02

Pos logit contributions
ney+0.818
ari+0.782
oc+0.764
ess+0.761
ard+0.741
Neg logit contributions
 the-0.934
 a-0.841
 an-0.803
 your-0.782
 her-0.767
Token his
Token ID465
Feature activation+4.31e-01

Pos logit contributions
oc+0.071
y+0.069
ard+0.069
ess+0.069
ney+0.068
Neg logit contributions
 an-0.016
 the-0.015
 anything-0.014
 a-0.013
 these-0.012
Token mom
Token ID1995
Feature activation+1.55e-01

Pos logit contributions
 the+0.788
 an+0.688
 a+0.664
 your+0.654
 her+0.620
Neg logit contributions
ney-1.223
oc-1.220
ard-1.154
ess-1.154
ushed-1.137
Token clean
Token ID3424
Feature activation+1.10e-01

Pos logit contributions
 the+0.394
 an+0.353
 a+0.350
 your+0.340
 her+0.331
Neg logit contributions
ney-0.340
oc-0.339
ess-0.321
ard-0.320
ushed-0.315
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 the+0.060
 an+0.056
 a+0.044
 your+0.038
 anything+0.035
Neg logit contributions
oc-0.409
ney-0.393
ard-0.390
y-0.388
board-0.378
Token house
Token ID2156
Feature activation-3.25e-01

Pos logit contributions
 the+1.392
 an+1.247
 a+1.229
 your+1.210
 her+1.160
Neg logit contributions
oc-1.353
ney-1.347
ess-1.274
ard-1.267
ushed-1.251
Token and
Token ID290
Feature activation-2.43e-01

Pos logit contributions
ney+0.661
oc+0.660
ard+0.616
ess+0.613
ushed+0.601
Neg logit contributions
 the-0.865
 an-0.785
 a-0.773
 your-0.760
 her-0.736
Token she
Token ID673
Feature activation+2.10e-01

Pos logit contributions
oc+0.563
ney+0.552
ushed+0.529
ouse+0.514
board+0.507
Neg logit contributions
 an-0.435
 anything-0.433
 the-0.432
 your-0.412
 such-0.400
Token paid
Token ID3432
Feature activation-1.10e-01

Pos logit contributions
 the+0.556
 an+0.503
 a+0.497
 your+0.486
 her+0.473
Neg logit contributions
ney-0.437
oc-0.436
ard-0.410
ess-0.405
ari-0.395
Token him
Token ID683
Feature activation+3.38e-01

Pos logit contributions
oc+0.292
ess+0.288
ard+0.277
y+0.271
uddle+0.269
Neg logit contributions
 an-0.134
 anything-0.129
 the-0.124
 your-0.122
 these-0.116
Token helped
Token ID4193
Feature activation-2.48e-02

Pos logit contributions
 the+0.544
 a+0.484
 an+0.467
 her+0.451
 it+0.434
Neg logit contributions
ney-0.537
ari-0.518
oc-0.512
ess-0.506
umpy-0.504
Token her
Token ID607
Feature activation+4.64e-01

Pos logit contributions
oc+0.080
ney+0.078
ard+0.078
y+0.077
board+0.076
Neg logit contributions
 the-0.027
 an-0.024
 your-0.023
 a-0.021
 these-0.020
Token put
Token ID1234
Feature activation+4.16e-02

Pos logit contributions
 the+0.921
 an+0.813
 a+0.788
 your+0.782
 her+0.740
Neg logit contributions
ney-1.250
oc-1.244
ard-1.179
ess-1.175
ushed-1.160
Token it
Token ID340
Feature activation+4.58e-01

Pos logit contributions
 an+0.018
 the+0.017
 your+0.015
 our+0.013
 these+0.011
Neg logit contributions
oc-0.148
ard-0.145
y-0.144
ney-0.143
uddle-0.139
Token in
Token ID287
Feature activation+7.37e-02

Pos logit contributions
 the+0.835
 an+0.727
 a+0.703
 your+0.695
 her+0.655
Neg logit contributions
ney-1.304
oc-1.291
ard-1.236
ess-1.230
ushed-1.215
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 the+0.021
 your+0.015
 an+0.013
 anything+0.012
 these+0.010
Neg logit contributions
oc-0.282
ney-0.270
y-0.263
aron-0.259
ard-0.259
Token car
Token ID1097
Feature activation-1.36e-01

Pos logit contributions
 the+1.427
 an+1.270
 a+1.268
 your+1.226
 their+1.168
Neg logit contributions
ney-1.318
oc-1.308
ess-1.285
ushed-1.269
ari-1.241
Token.
Token ID13
Feature activation-2.47e-01

Pos logit contributions
oc+0.275
ushed+0.273
ney+0.270
ess+0.255
umpy+0.253
Neg logit contributions
 the-0.375
 a-0.335
 an-0.329
 your-0.320
 their-0.312
Token After
Token ID2293
Feature activation-7.64e-03

Pos logit contributions
ney+0.590
oc+0.590
ard+0.558
ess+0.554
ushed+0.547
Neg logit contributions
 the-0.549
 an-0.493
 a-0.480
 your-0.479
 their-0.451
Token that
Token ID326
Feature activation+2.59e-01

Pos logit contributions
oc+0.021
board+0.021
ops+0.021
ard+0.020
ushed+0.020
Neg logit contributions
 your-0.008
 the-0.007
 anything-0.007
 our-0.007
 an-0.007
Token,
Token ID11
Feature activation-1.77e-01

Pos logit contributions
 the+0.690
 an+0.619
 a+0.610
 your+0.602
 her+0.584
Neg logit contributions
ney-0.544
oc-0.537
ard-0.504
ess-0.501
ushed-0.500
TokenIt
Token ID1026
Feature activation+9.75e-02

Pos logit contributions
 the+0.074
 an+0.065
 a+0.063
 your+0.062
 her+0.059
Neg logit contributions
ney-0.102
oc-0.102
ard-0.097
ess-0.096
ushed-0.095
Token looks
Token ID3073
Feature activation-2.22e-01

Pos logit contributions
 the+0.289
 an+0.265
 a+0.262
 your+0.258
 her+0.248
Neg logit contributions
ney-0.170
oc-0.169
ard-0.157
ess-0.156
ushed-0.154
Token like
Token ID588
Feature activation+1.82e-02

Pos logit contributions
oc+0.564
ney+0.549
ard+0.533
ess+0.527
y+0.518
Neg logit contributions
 the-0.400
 your-0.344
 an-0.336
 her-0.335
 these-0.332
Token a
Token ID257
Feature activation+4.80e-01

Pos logit contributions
 the+0.010
 her+0.008
 their+0.008
 your+0.007
nam+0.007
Neg logit contributions
ard-0.055
ney-0.053
oc-0.051
board-0.051
ess-0.050
Token chim
Token ID18205
Feature activation-2.51e-01

Pos logit contributions
 the+1.048
 a+0.933
 an+0.930
 your+0.896
 it+0.836
Neg logit contributions
ney-1.202
oc-1.173
ushed-1.147
ard-1.138
ess-1.128
Tokenney
Token ID1681
Feature activation-7.11e-01

Pos logit contributions
ushed+0.170
ides+0.150
uddle+0.149
 pad+0.144
int+0.141
Neg logit contributions
 the-0.948
 a-0.888
 an-0.861
 this-0.819
 your-0.799
Token!"
Token ID2474
Feature activation-2.32e-01

Pos logit contributions
ney+1.605
oc+1.603
ushed+1.545
ess+1.529
ari+1.496
Neg logit contributions
 the-1.802
 a-1.588
 an-1.579
 your-1.542
 it-1.467
Token
Token ID220
Feature activation-6.93e-03

Pos logit contributions
ney+0.583
oc+0.583
ard+0.550
ess+0.547
ushed+0.537
Neg logit contributions
 the-0.497
 an-0.443
 your-0.431
 a-0.431
 her-0.409
Token
Token ID198
Feature activation-3.19e-02

Pos logit contributions
ney+0.018
oc+0.018
ard+0.017
ess+0.017
ushed+0.017
Neg logit contributions
 the-0.015
 an-0.013
 a-0.013
 your-0.012
 her-0.012
Token
Token ID198
Feature activation-3.29e-02

Pos logit contributions
ney+0.083
oc+0.082
ard+0.079
ess+0.078
ushed+0.078
Neg logit contributions
 the-0.068
 an-0.060
 a-0.059
 your-0.057
 her-0.054
TokenThe
Token ID464
Feature activation+1.59e-01

Pos logit contributions
ney+0.080
oc+0.079
ard+0.076
ess+0.076
ushed+0.075
Neg logit contributions
 the-0.074
 an-0.066
 a-0.065
 your-0.063
 her-0.060
Token was
Token ID373
Feature activation-2.30e-01

Pos logit contributions
 the+0.639
 a+0.566
 an+0.564
 your+0.541
 her+0.537
Neg logit contributions
ney-0.456
oc-0.449
ard-0.433
ess-0.429
ari-0.426
Token a
Token ID257
Feature activation+4.80e-01

Pos logit contributions
oc+0.734
ney+0.721
y+0.719
ess+0.695
ard+0.681
Neg logit contributions
 those-0.151
 anything-0.151
 the-0.148
 these-0.145
 your-0.134
Token big
Token ID1263
Feature activation+1.35e-01

Pos logit contributions
 the+0.982
 a+0.847
 an+0.842
 your+0.816
 their+0.774
Neg logit contributions
ney-1.293
oc-1.272
ard-1.225
ess-1.219
ushed-1.214
Token rh
Token ID9529
Feature activation-1.39e-01

Pos logit contributions
 the+0.342
 an+0.303
 a+0.302
 your+0.297
 her+0.284
Neg logit contributions
ney-0.298
oc-0.294
ess-0.278
ard-0.277
ushed-0.277
Tokenin
Token ID259
Feature activation-2.78e-01

Pos logit contributions
ney+0.249
ushed+0.243
ard+0.238
oc+0.236
ess+0.236
Neg logit contributions
 the-0.394
 a-0.357
 an-0.345
 your-0.344
 it-0.328
Tokenoc
Token ID420
Feature activation-7.11e-01

Pos logit contributions
ides+0.232
ushed+0.222
uddle+0.193
ows+0.178
ivered+0.171
Neg logit contributions
 the-1.120
 a-1.052
 an-0.988
 this-0.970
 your-0.966
Tokeneros
Token ID27498
Feature activation-2.73e-01

Pos logit contributions
ney+1.240
oc+1.189
ard+1.125
ouse+1.111
ushed+1.110
Neg logit contributions
 the-2.100
 an-1.908
 a-1.885
 your-1.867
 her-1.781
Token named
Token ID3706
Feature activation-1.72e-01

Pos logit contributions
uddle+0.639
ney+0.634
ard+0.634
ides+0.632
ushed+0.630
Neg logit contributions
 the-0.678
 a-0.611
 an-0.586
 your-0.546
 their-0.533
Token Remy
Token ID32912
Feature activation+1.70e-01

Pos logit contributions
oc+0.503
ess+0.478
ard+0.461
ari+0.452
ney+0.447
Neg logit contributions
 the-0.237
 anything-0.222
 her-0.221
 an-0.220
 such-0.213
Token.
Token ID13
Feature activation-2.63e-01

Pos logit contributions
 the+0.464
 a+0.419
 an+0.404
 their+0.384
 her+0.378
Neg logit contributions
uddle-0.377
ushed-0.367
ney-0.365
board-0.358
ard-0.355
Token Remy
Token ID32912
Feature activation+1.69e-01

Pos logit contributions
oc+0.789
ney+0.783
ard+0.748
ess+0.743
ushed+0.731
Neg logit contributions
 the-0.438
 an-0.377
 a-0.364
 your-0.361
 her-0.339
Token something
Token ID1223
Feature activation+3.92e-01

Pos logit contributions
ney+0.786
ess+0.762
oc+0.755
y+0.751
ard+0.720
Neg logit contributions
 the-0.164
 your-0.124
 these-0.115
 an-0.111
 her-0.102
Token scary
Token ID14343
Feature activation+6.41e-02

Pos logit contributions
 the+0.662
 an+0.576
 a+0.551
 your+0.542
 her+0.512
Neg logit contributions
ney-1.140
oc-1.132
ard-1.091
ess-1.088
ushed-1.068
Token in
Token ID287
Feature activation+6.18e-02

Pos logit contributions
 the+0.158
 a+0.140
 an+0.140
 your+0.135
 her+0.132
Neg logit contributions
ney-0.152
oc-0.151
ushed-0.142
ess-0.140
ard-0.140
Token the
Token ID262
Feature activation+5.59e-01

Pos logit contributions
 an+0.028
 the+0.026
 such+0.021
 a+0.020
 Higher+0.019
Neg logit contributions
y-0.212
oc-0.201
ney-0.201
mer-0.199
ard-0.191
Token chim
Token ID18205
Feature activation-2.53e-01

Pos logit contributions
 the+1.451
 a+1.307
 an+1.303
 your+1.268
 her+1.206
Neg logit contributions
oc-1.211
ushed-1.189
ess-1.171
ney-1.169
jo-1.121
Tokenney
Token ID1681
Feature activation-7.11e-01

Pos logit contributions
ushed+0.148
int+0.146
affe+0.145
ides+0.142
 pad+0.139
Neg logit contributions
 the-0.988
 a-0.895
 an-0.887
 your-0.854
 their-0.852
Token.
Token ID13
Feature activation-2.61e-01

Pos logit contributions
oc+1.465
ney+1.445
ushed+1.437
ess+1.416
ouse+1.386
Neg logit contributions
 the-1.986
 a-1.747
 an-1.717
 your-1.662
 their-1.640
Token They
Token ID1119
Feature activation-1.19e-01

Pos logit contributions
ney+0.593
oc+0.592
ess+0.554
ard+0.552
ushed+0.540
Neg logit contributions
 the-0.632
 an-0.568
 a-0.557
 your-0.550
 her-0.531
Token thanked
Token ID26280
Feature activation-1.41e-01

Pos logit contributions
oc+0.275
ney+0.273
ard+0.256
ess+0.254
ari+0.249
Neg logit contributions
 the-0.298
 an-0.266
 a-0.265
 her-0.251
 your-0.248
Token Mr
Token ID1770
Feature activation+2.46e-01

Pos logit contributions
ess+0.398
oc+0.386
ney+0.370
aron+0.365
y+0.361
Neg logit contributions
 your-0.136
 anything-0.127
 an-0.123
 these-0.118
 my-0.116
Token.
Token ID13
Feature activation-2.62e-01

Pos logit contributions
 the+0.583
 a+0.530
 their+0.503
 an+0.490
 her+0.467
Neg logit contributions
ushed-0.584
ouse-0.583
oc-0.580
uddle-0.579
ivering-0.578
Token a
Token ID257
Feature activation+4.80e-01

Pos logit contributions
oc+0.424
y+0.417
ney+0.413
board+0.402
an+0.393
Neg logit contributions
 our-0.103
 your-0.101
 these-0.100
nam-0.084
 anything-0.081
Token bird
Token ID6512
Feature activation-1.79e-01

Pos logit contributions
 the+0.858
 a+0.731
 an+0.710
 your+0.686
 it+0.645
Neg logit contributions
ney-1.406
oc-1.380
ushed-1.345
ess-1.342
ard-1.341
Token on
Token ID319
Feature activation+9.91e-02

Pos logit contributions
ney+0.539
ushed+0.527
oc+0.524
ari+0.523
ess+0.522
Neg logit contributions
 the-0.324
 a-0.271
 an-0.261
 your-0.251
 her-0.248
Token the
Token ID262
Feature activation+5.65e-01

Pos logit contributions
 your+0.037
 these+0.037
nam+0.034
 those+0.032
 our+0.031
Neg logit contributions
oc-0.331
ney-0.312
y-0.302
ard-0.300
an-0.290
Token chim
Token ID18205
Feature activation-2.49e-01

Pos logit contributions
 the+1.578
 an+1.440
 a+1.418
 your+1.398
 her+1.353
Neg logit contributions
ney-1.077
oc-1.076
ard-0.997
ess-0.990
ushed-0.976
Tokenney
Token ID1681
Feature activation-7.11e-01

Pos logit contributions
oc+0.131
int+0.128
ides+0.123
ushed+0.121
ard+0.115
Neg logit contributions
 the-0.996
 a-0.920
 an-0.907
 your-0.862
 her-0.856
Token.
Token ID13
Feature activation-2.62e-01

Pos logit contributions
oc+1.645
ushed+1.599
ney+1.539
ess+1.534
ouse+1.531
Neg logit contributions
 the-1.915
 a-1.704
 an-1.638
 her-1.559
 it-1.546
Token He
Token ID679
Feature activation-1.51e-01

Pos logit contributions
oc+0.734
ney+0.731
ard+0.696
ess+0.688
ushed+0.678
Neg logit contributions
 the-0.487
 an-0.432
 a-0.418
 your-0.417
 her-0.395
Token aimed
Token ID8998
Feature activation-2.99e-02

Pos logit contributions
ney+0.307
oc+0.305
ard+0.286
ess+0.283
ushed+0.278
Neg logit contributions
 the-0.406
 an-0.367
 a-0.363
 your-0.355
 her-0.346
Token his
Token ID465
Feature activation+4.07e-01

Pos logit contributions
oc+0.072
y+0.070
ard+0.066
ney+0.064
an+0.064
Neg logit contributions
 the-0.051
 an-0.047
 your-0.047
 these-0.047
 anything-0.046
Token toy
Token ID13373
Feature activation-9.90e-02

Pos logit contributions
 the+1.110
 an+1.015
 a+1.004
 your+0.979
 their+0.945
Neg logit contributions
ney-0.812
oc-0.803
ard-0.748
ess-0.747
ushed-0.732
TokenThat
Token ID2504
Feature activation-1.81e-02

Pos logit contributions
ney+0.470
oc+0.447
umpy+0.424
ard+0.419
y+0.417
Neg logit contributions
 the-0.421
 your-0.380
 an-0.370
 such-0.359
 anything-0.358
Token is
Token ID318
Feature activation-2.48e-01

Pos logit contributions
ney+0.034
ushed+0.034
oc+0.033
uddle+0.032
ard+0.032
Neg logit contributions
 the-0.051
 a-0.047
 an-0.047
 your-0.046
 it-0.044
Token a
Token ID257
Feature activation+4.76e-01

Pos logit contributions
ess+0.791
ney+0.784
oc+0.782
an+0.756
anda+0.734
Neg logit contributions
 the-0.116
 her-0.109
 these-0.101
 Higher-0.100
nam-0.085
Token rh
Token ID9529
Feature activation-1.39e-01

Pos logit contributions
 the+1.192
 a+1.081
 an+1.070
 your+1.052
 their+0.988
Neg logit contributions
ney-1.039
ushed-1.021
ard-0.979
oc-0.976
ouse-0.966
Tokenin
Token ID259
Feature activation-2.79e-01

Pos logit contributions
ney+0.245
ushed+0.238
ess+0.231
jo+0.229
ouse+0.228
Neg logit contributions
 the-0.400
 a-0.361
 your-0.353
 an-0.353
 it-0.339
Tokenoc
Token ID420
Feature activation-7.11e-01

Pos logit contributions
ushed+0.177
uddle+0.176
udd+0.147
ney+0.143
suits+0.128
Neg logit contributions
 the-1.154
 a-1.092
 an-1.056
 your-1.053
 it-1.043
Tokeneros
Token ID27498
Feature activation-2.72e-01

Pos logit contributions
ney+1.317
ouse+1.224
oc+1.218
uddle+1.197
jo+1.193
Neg logit contributions
 the-2.000
 an-1.809
 a-1.784
 your-1.774
 this-1.689
Token,
Token ID11
Feature activation-1.88e-01

Pos logit contributions
ney+0.596
ushed+0.582
oc+0.569
uddle+0.564
ouse+0.555
Neg logit contributions
 the-0.715
 a-0.645
 an-0.631
 your-0.621
 it-0.592
Token Lily
Token ID20037
Feature activation+2.94e-01

Pos logit contributions
oc+0.541
ard+0.537
ess+0.529
board+0.529
uddle+0.525
Neg logit contributions
 the-0.245
 an-0.221
 a-0.196
 anything-0.193
 their-0.190
Token.
Token ID13
Feature activation-2.68e-01

Pos logit contributions
 the+0.742
 a+0.639
 your+0.625
 an+0.605
 it+0.604
Neg logit contributions
uddle-0.735
ushed-0.701
ides-0.689
board-0.688
ari-0.684
Token It
Token ID632
Feature activation+5.22e-02

Pos logit contributions
oc+0.721
ney+0.710
ard+0.671
ess+0.669
ari+0.650
Neg logit contributions
 the-0.536
 an-0.470
 a-0.457
 your-0.451
 her-0.445
Token They
Token ID1119
Feature activation-1.23e-01

Pos logit contributions
ney+0.547
oc+0.534
ess+0.516
ard+0.497
ari+0.490
Neg logit contributions
 the-0.686
 your-0.626
 an-0.617
 her-0.597
 a-0.593
Token heard
Token ID2982
Feature activation-1.86e-01

Pos logit contributions
ney+0.250
oc+0.248
ard+0.232
ess+0.230
ushed+0.227
Neg logit contributions
 the-0.325
 an-0.297
 a-0.291
 your-0.289
 her-0.277
Token the
Token ID262
Feature activation+5.52e-01

Pos logit contributions
ess+0.589
oc+0.580
jo+0.578
y+0.576
ney+0.576
Neg logit contributions
 your-0.118
 the-0.100
 these-0.097
 an-0.086
 our-0.080
Token rh
Token ID9529
Feature activation-1.39e-01

Pos logit contributions
 the+1.168
 a+1.039
 an+1.015
 your+0.946
 her+0.927
Neg logit contributions
oc-1.470
ney-1.439
ard-1.372
ushed-1.367
ess-1.363
Tokenin
Token ID259
Feature activation-2.77e-01

Pos logit contributions
ney+0.258
ushed+0.251
oc+0.250
ess+0.248
ard+0.241
Neg logit contributions
 the-0.382
 a-0.339
 your-0.338
 an-0.330
 it-0.319
Tokenoc
Token ID420
Feature activation-7.11e-01

Pos logit contributions
ushed+0.230
ouse+0.225
ney+0.214
uddle+0.203
umpy+0.201
Neg logit contributions
 the-1.091
 a-1.011
 an-0.962
 this-0.940
 their-0.938
Tokeneros
Token ID27498
Feature activation-2.71e-01

Pos logit contributions
ney+1.243
oc+1.204
ouse+1.144
ess+1.143
ard+1.140
Neg logit contributions
 the-2.075
 an-1.867
 your-1.849
 a-1.841
 their-1.759
Token roar
Token ID34532
Feature activation-1.11e-01

Pos logit contributions
ney+0.659
oc+0.630
ushed+0.630
ouse+0.621
ess+0.613
Neg logit contributions
 the-0.661
 a-0.578
 an-0.562
 their-0.544
 her-0.536
Token.
Token ID13
Feature activation-2.70e-01

Pos logit contributions
ney+0.245
oc+0.240
ushed+0.231
ouse+0.230
ess+0.224
Neg logit contributions
 the-0.303
 a-0.274
 an-0.264
 their-0.256
 her-0.253
Token They
Token ID1119
Feature activation-1.21e-01

Pos logit contributions
oc+0.579
ney+0.563
ess+0.535
ard+0.514
aron+0.505
Neg logit contributions
 the-0.637
 your-0.582
 an-0.578
 these-0.563
 her-0.561
Token heard
Token ID2982
Feature activation-1.86e-01

Pos logit contributions
ney+0.252
oc+0.250
ard+0.235
ess+0.233
ushed+0.229
Neg logit contributions
 the-0.321
 an-0.291
 a-0.287
 your-0.280
 her-0.272
Token They
Token ID1119
Feature activation-1.06e-01

Pos logit contributions
ney+0.669
oc+0.665
ard+0.635
ess+0.630
ushed+0.619
Neg logit contributions
 the-0.517
 an-0.455
 your-0.445
 a-0.444
 her-0.421
Token saw
Token ID2497
Feature activation-1.47e-01

Pos logit contributions
ney+0.218
oc+0.217
ard+0.203
ess+0.202
ushed+0.198
Neg logit contributions
 the-0.279
 an-0.253
 a-0.249
 your-0.246
 her-0.237
Token a
Token ID257
Feature activation+4.81e-01

Pos logit contributions
ney+0.498
y+0.485
oc+0.482
board+0.482
ard+0.482
Neg logit contributions
 your-0.094
 the-0.075
 my-0.069
 our-0.065
 these-0.059
Token big
Token ID1263
Feature activation+1.36e-01

Pos logit contributions
 the+0.878
 a+0.757
 an+0.740
 your+0.700
 it+0.661
Neg logit contributions
ney-1.385
oc-1.372
ushed-1.327
ess-1.326
ard-1.322
Token chim
Token ID18205
Feature activation-2.47e-01

Pos logit contributions
 the+0.345
 an+0.304
 a+0.304
 your+0.298
 her+0.284
Neg logit contributions
ney-0.303
oc-0.296
ess-0.286
ushed-0.282
ard-0.280
Tokenney
Token ID1681
Feature activation-7.10e-01

Pos logit contributions
ides+0.162
oc+0.157
int+0.145
 pad+0.142
ard+0.123
Neg logit contributions
 the-0.974
 a-0.900
 an-0.878
 this-0.830
 their-0.822
Token in
Token ID287
Feature activation+6.01e-02

Pos logit contributions
oc+2.100
ney+2.035
ess+2.024
ushed+2.005
ari+1.990
Neg logit contributions
 the-1.415
 a-1.196
 an-1.158
 it-1.091
 this-1.071
Token the
Token ID262
Feature activation+5.63e-01

Pos logit contributions
 your+0.019
 anything+0.010
 these+0.006
 an+0.005
 the+0.004
Neg logit contributions
oc-0.225
ney-0.225
y-0.217
ard-0.210
mer-0.201
Token distance
Token ID5253
Feature activation-2.00e-01

Pos logit contributions
 the+1.478
 a+1.323
 an+1.315
 your+1.260
 her+1.224
Neg logit contributions
oc-1.206
ushed-1.176
ney-1.155
ess-1.131
ouse-1.131
Token and
Token ID290
Feature activation-2.58e-01

Pos logit contributions
oc+0.425
ney+0.417
ouse+0.393
ushed+0.393
ari+0.385
Neg logit contributions
 the-0.573
 a-0.503
 an-0.496
 it-0.478
 their-0.478
Token Rachel
Token ID15984
Feature activation+7.17e-02

Pos logit contributions
ney+0.476
oc+0.463
ard+0.456
ushed+0.447
y+0.447
Neg logit contributions
 the-0.605
 your-0.590
 an-0.568
 these-0.552
 anything-0.552
Token there
Token ID612
Feature activation+2.24e-01

Pos logit contributions
oke+0.005
oc+0.005
ess+0.005
ari+0.005
affe+0.005
Neg logit contributions
 such-0.003
 the-0.003
 an-0.003
 your-0.003
nam-0.003
Token is
Token ID318
Feature activation-2.56e-01

Pos logit contributions
 the+0.619
 an+0.562
 a+0.554
 your+0.546
 her+0.528
Neg logit contributions
ney-0.436
oc-0.436
ard-0.408
ess-0.400
ushed-0.399
Token a
Token ID257
Feature activation+4.75e-01

Pos logit contributions
ney+0.950
ess+0.933
oc+0.926
ari+0.889
y+0.882
Neg logit contributions
 the-0.127
 your-0.092
 these-0.082
 her-0.071
 an-0.063
Token rh
Token ID9529
Feature activation-1.37e-01

Pos logit contributions
 the+0.997
 an+0.875
 a+0.866
 your+0.846
 their+0.802
Neg logit contributions
ney-1.236
oc-1.219
ard-1.177
ushed-1.169
ess-1.161
Tokenin
Token ID259
Feature activation-2.77e-01

Pos logit contributions
ney+0.258
ushed+0.249
ess+0.241
ouse+0.239
oc+0.239
Neg logit contributions
 the-0.381
 a-0.341
 your-0.334
 an-0.333
 it-0.322
Tokenoc
Token ID420
Feature activation-7.10e-01

Pos logit contributions
ushed+0.201
uddle+0.159
ouse+0.155
ney+0.149
jo+0.144
Neg logit contributions
 the-1.151
 a-1.081
 an-1.040
 it-1.026
 your-1.025
Tokeneros
Token ID27498
Feature activation-2.71e-01

Pos logit contributions
ney+1.287
oc+1.208
ouse+1.188
ard+1.168
jo+1.161
Neg logit contributions
 the-2.032
 an-1.846
 a-1.820
 your-1.798
 this-1.717
Token in
Token ID287
Feature activation+5.53e-02

Pos logit contributions
ney+0.610
oc+0.595
ushed+0.590
ouse+0.569
ess+0.569
Neg logit contributions
 the-0.696
 a-0.622
 an-0.614
 your-0.594
 their-0.575
Token the
Token ID262
Feature activation+5.55e-01

Pos logit contributions
 these+0.009
 your+0.009
 anything+0.008
 an+0.007
 the+0.007
Neg logit contributions
oc-0.205
ney-0.195
y-0.185
jo-0.182
ess-0.179
Token park
Token ID3952
Feature activation-1.14e-01

Pos logit contributions
 the+1.476
 an+1.334
 a+1.320
 your+1.295
 their+1.243
Neg logit contributions
ney-1.139
oc-1.129
ess-1.069
ard-1.064
ushed-1.063
Token.
Token ID13
Feature activation-2.80e-01

Pos logit contributions
oc+0.228
ney+0.228
ushed+0.224
ouse+0.218
ess+0.215
Neg logit contributions
 the-0.324
 a-0.286
 an-0.283
 your-0.274
 their-0.271
Token there
Token ID612
Feature activation+2.09e-01

Pos logit contributions
ney+0.322
oc+0.319
ard+0.301
ess+0.300
ushed+0.297
Neg logit contributions
 the-0.316
 an-0.282
 a-0.278
 your-0.272
 her-0.262
Token was
Token ID373
Feature activation-2.56e-01

Pos logit contributions
 the+0.573
 an+0.509
 a+0.508
 her+0.483
 your+0.482
Neg logit contributions
oc-0.441
ney-0.437
ess-0.408
ard-0.404
ushed-0.403
Token a
Token ID257
Feature activation+4.68e-01

Pos logit contributions
ney+0.904
oc+0.876
board+0.854
ess+0.853
ard+0.851
Neg logit contributions
 the-0.145
 your-0.124
 these-0.115
 our-0.105
 anything-0.084
Token rh
Token ID9529
Feature activation-1.53e-01

Pos logit contributions
 the+1.005
 a+0.878
 an+0.858
 your+0.832
 their+0.786
Neg logit contributions
ney-1.208
oc-1.172
ushed-1.155
ess-1.147
ard-1.144
Tokenin
Token ID259
Feature activation-2.86e-01

Pos logit contributions
ney+0.275
ess+0.266
ushed+0.265
oc+0.263
ard+0.258
Neg logit contributions
 the-0.435
 a-0.393
 an-0.390
 your-0.382
 it-0.363
Tokenoc
Token ID420
Feature activation-7.10e-01

Pos logit contributions
ides+0.239
ushed+0.234
ows+0.226
uddle+0.214
suits+0.200
Neg logit contributions
 the-1.127
 a-1.055
 an-0.994
 this-0.988
 it-0.983
Tokeneros
Token ID27498
Feature activation-2.67e-01

Pos logit contributions
ney+1.261
oc+1.185
ouse+1.175
ess+1.153
ushed+1.143
Neg logit contributions
 the-2.057
 an-1.875
 a-1.838
 your-1.832
 these-1.744
Token who
Token ID508
Feature activation-9.37e-02

Pos logit contributions
ney+0.615
ushed+0.611
uddle+0.606
ouse+0.593
ides+0.591
Neg logit contributions
 the-0.676
 a-0.611
 an-0.587
 it-0.537
 this-0.532
Token was
Token ID373
Feature activation-2.32e-01

Pos logit contributions
ney+0.186
uddle+0.185
oc+0.177
ouse+0.175
ard+0.175
Neg logit contributions
 the-0.259
 a-0.232
 an-0.229
 it-0.213
 your-0.212
Token always
Token ID1464
Feature activation+1.96e-02

Pos logit contributions
ney+0.705
oc+0.696
ard+0.661
ess+0.657
y+0.648
Neg logit contributions
 the-0.355
 your-0.304
 an-0.298
 these-0.281
 her-0.280
Token naughty
Token ID45128
Feature activation+3.98e-02

Pos logit contributions
 the+0.035
 an+0.031
 your+0.031
 a+0.029
 their+0.029
Neg logit contributions
ney-0.054
oc-0.053
ard-0.051
y-0.050
ess-0.049
Token was
Token ID373
Feature activation-2.51e-01

Pos logit contributions
 the+0.594
 an+0.535
 a+0.534
 your+0.509
 her+0.507
Neg logit contributions
ney-0.422
oc-0.419
ard-0.392
ess-0.389
ouse-0.385
Token a
Token ID257
Feature activation+4.69e-01

Pos logit contributions
ney+0.846
oc+0.819
board+0.806
y+0.799
ard+0.797
Neg logit contributions
 the-0.156
 your-0.145
 these-0.131
 our-0.130
 their-0.102
Token big
Token ID1263
Feature activation+1.26e-01

Pos logit contributions
 the+0.993
 a+0.868
 an+0.856
 your+0.826
 their+0.779
Neg logit contributions
ney-1.220
oc-1.189
ushed-1.161
ess-1.156
ard-1.154
Token rh
Token ID9529
Feature activation-1.50e-01

Pos logit contributions
 the+0.325
 a+0.284
 an+0.281
 your+0.277
 her+0.263
Neg logit contributions
ney-0.284
oc-0.275
ushed-0.272
ess-0.272
ouse-0.263
Tokenin
Token ID259
Feature activation-2.83e-01

Pos logit contributions
ney+0.269
ushed+0.259
ess+0.259
oc+0.259
ard+0.253
Neg logit contributions
 the-0.429
 a-0.388
 an-0.385
 your-0.378
 it-0.359
Tokenoc
Token ID420
Feature activation-7.10e-01

Pos logit contributions
ushed+0.226
ides+0.210
ows+0.194
uddle+0.185
udd+0.175
Neg logit contributions
 the-1.133
 a-1.065
 this-1.008
 an-1.007
 it-0.997
Tokeneros
Token ID27498
Feature activation-2.67e-01

Pos logit contributions
ney+1.266
oc+1.196
ouse+1.179
ess+1.156
ushed+1.148
Neg logit contributions
 the-2.052
 an-1.871
 a-1.835
 your-1.828
 these-1.739
Token.
Token ID13
Feature activation-2.56e-01

Pos logit contributions
ney+0.599
ushed+0.594
uddle+0.586
ouse+0.577
ides+0.571
Neg logit contributions
 the-0.684
 a-0.619
 an-0.598
 it-0.549
 this-0.544
Token He
Token ID679
Feature activation-1.52e-01

Pos logit contributions
oc+0.726
ney+0.721
ard+0.688
ess+0.681
ari+0.670
Neg logit contributions
 the-0.465
 an-0.406
 your-0.395
 a-0.393
 her-0.378
Token was
Token ID373
Feature activation-2.33e-01

Pos logit contributions
ney+0.312
uddle+0.302
oc+0.297
ard+0.295
ouse+0.294
Neg logit contributions
 the-0.418
 a-0.378
 an-0.374
 your-0.346
 it-0.341
Token troubled
Token ID17840
Feature activation-3.31e-02

Pos logit contributions
ney+0.603
oc+0.572
ess+0.548
ard+0.543
umpy+0.535
Neg logit contributions
 the-0.377
 your-0.364
 anything-0.357
 these-0.344
 their-0.339
Token am
Token ID716
Feature activation-1.80e-01

Pos logit contributions
 the+0.305
 a+0.274
 an+0.271
 your+0.256
 it+0.249
Neg logit contributions
ney-0.312
oc-0.300
ard-0.296
ari-0.295
ess-0.290
Token taking
Token ID2263
Feature activation+1.98e-01

Pos logit contributions
ney+0.486
ess+0.481
anda+0.478
ouse+0.477
oc+0.473
Neg logit contributions
 the-0.216
 an-0.190
 these-0.179
 her-0.172
 their-0.171
Token the
Token ID262
Feature activation+5.59e-01

Pos logit contributions
 the+0.193
 an+0.160
 their+0.142
 her+0.134
 your+0.130
Neg logit contributions
oc-0.643
ard-0.623
ess-0.604
ney-0.599
y-0.597
Token rh
Token ID9529
Feature activation-1.44e-01

Pos logit contributions
 the+1.309
 an+1.172
 a+1.155
 your+1.130
 her+1.079
Neg logit contributions
ney-1.317
oc-1.313
ess-1.231
ard-1.229
ushed-1.226
Tokenin
Token ID259
Feature activation-2.83e-01

Pos logit contributions
ney+0.249
ushed+0.241
ess+0.237
board+0.232
ard+0.231
Neg logit contributions
 the-0.411
 a-0.369
 your-0.369
 an-0.367
 our-0.348
Tokenoc
Token ID420
Feature activation-7.10e-01

Pos logit contributions
ushed+0.115
ney+0.084
uddle+0.070
aron+0.064
ouse+0.061
Neg logit contributions
 the-1.229
 a-1.159
 an-1.144
 your-1.126
 this-1.105
Tokeneros
Token ID27498
Feature activation-2.70e-01

Pos logit contributions
ney+1.257
ouse+1.206
ess+1.180
ushed+1.168
jo+1.162
Neg logit contributions
 the-2.015
 an-1.813
 your-1.807
 a-1.768
 this-1.712
Token to
Token ID284
Feature activation+1.82e-01

Pos logit contributions
ney+0.708
oc+0.703
ard+0.670
ushed+0.669
ess+0.667
Neg logit contributions
 the-0.563
 an-0.497
 a-0.489
 your-0.478
 her-0.451
Token a
Token ID257
Feature activation+4.92e-01

Pos logit contributions
 the+0.143
 an+0.131
 her+0.128
 these+0.114
 their+0.111
Neg logit contributions
oc-0.619
ney-0.572
ess-0.545
ard-0.541
age-0.518
Token new
Token ID649
Feature activation+1.76e-01

Pos logit contributions
 the+1.015
 an+0.903
 a+0.875
 your+0.863
 her+0.834
Neg logit contributions
oc-1.284
ney-1.282
ard-1.212
ess-1.201
ari-1.176
Token home
Token ID1363
Feature activation+8.44e-02

Pos logit contributions
 the+0.410
 a+0.365
 an+0.363
 your+0.351
 our+0.327
Neg logit contributions
ushed-0.425
ney-0.412
ouse-0.403
umpy-0.402
ard-0.400
Token there
Token ID612
Feature activation+2.26e-01

Pos logit contributions
y+0.411
ney+0.410
suits+0.409
board+0.393
uddle+0.392
Neg logit contributions
 such-0.269
 your-0.267
 anything-0.254
 the-0.251
 my-0.249
Token was
Token ID373
Feature activation-2.34e-01

Pos logit contributions
 the+0.635
 an+0.568
 a+0.566
 her+0.538
 their+0.535
Neg logit contributions
ney-0.451
oc-0.451
ard-0.420
ess-0.414
ari-0.411
Token a
Token ID257
Feature activation+4.76e-01

Pos logit contributions
ney+0.804
board+0.761
oc+0.754
ess+0.743
y+0.723
Neg logit contributions
 your-0.156
 the-0.128
 our-0.116
 these-0.104
 her-0.095
Token rh
Token ID9529
Feature activation-1.36e-01

Pos logit contributions
 the+0.972
 a+0.848
 an+0.835
 your+0.807
 their+0.764
Neg logit contributions
ney-1.265
oc-1.242
ushed-1.209
ess-1.206
ard-1.205
Tokenin
Token ID259
Feature activation-2.77e-01

Pos logit contributions
ney+0.253
ushed+0.251
ess+0.244
ouse+0.238
jo+0.237
Neg logit contributions
 the-0.376
 a-0.340
 your-0.328
 an-0.327
 it-0.318
Tokenoc
Token ID420
Feature activation-7.10e-01

Pos logit contributions
ushed+0.201
ides+0.181
jo+0.169
udd+0.161
umpy+0.159
Neg logit contributions
 the-1.125
 a-1.068
 an-1.013
 it-1.012
 this-1.006
Tokeneros
Token ID27498
Feature activation-2.71e-01

Pos logit contributions
ney+1.240
oc+1.186
ard+1.141
ouse+1.138
ushed+1.115
Neg logit contributions
 the-2.078
 an-1.892
 a-1.868
 your-1.843
 their-1.762
Token.
Token ID13
Feature activation-2.65e-01

Pos logit contributions
ney+0.645
ushed+0.640
ess+0.620
ouse+0.615
uddle+0.612
Neg logit contributions
 the-0.671
 a-0.607
 an-0.580
 it-0.545
 your-0.542
Token The
Token ID383
Feature activation+1.61e-01

Pos logit contributions
oc+0.727
ney+0.704
ard+0.684
board+0.675
ess+0.670
Neg logit contributions
 the-0.443
 an-0.414
 your-0.411
 anything-0.391
 a-0.385
Token rh
Token ID9529
Feature activation-1.36e-01

Pos logit contributions
 the+0.410
 an+0.368
 a+0.364
 your+0.356
 their+0.341
Neg logit contributions
ney-0.353
oc-0.346
ard-0.329
ess-0.328
ushed-0.327
Tokenin
Token ID259
Feature activation-2.76e-01

Pos logit contributions
ney+0.244
ess+0.239
ard+0.237
ushed+0.235
jo+0.234
Neg logit contributions
 the-0.379
 a-0.339
 an-0.329
 your-0.329
 it-0.321
Token man
Token ID582
Feature activation-2.35e-01

Pos logit contributions
 the+0.020
 a+0.018
 an+0.018
 your+0.017
 her+0.017
Neg logit contributions
ney-0.019
oc-0.019
ushed-0.019
jo-0.019
ari-0.018
Token come
Token ID1282
Feature activation+9.41e-02

Pos logit contributions
ney+0.673
ushed+0.658
ess+0.657
oc+0.653
uddle+0.650
Neg logit contributions
 the-0.471
 a-0.400
 an-0.381
 her-0.372
 it-0.361
Token down
Token ID866
Feature activation+6.07e-02

Pos logit contributions
 the+0.190
 an+0.165
 a+0.165
 your+0.157
 her+0.152
Neg logit contributions
ney-0.260
oc-0.260
ushed-0.249
ard-0.247
uddle-0.246
Token the
Token ID262
Feature activation+5.64e-01

Pos logit contributions
 the+0.130
 your+0.122
 an+0.119
 these+0.117
 anything+0.117
Neg logit contributions
oc-0.139
ney-0.135
y-0.126
ess-0.125
ard-0.124
Token chim
Token ID18205
Feature activation-2.49e-01

Pos logit contributions
 the+1.370
 a+1.200
 an+1.172
 your+1.117
 her+1.079
Neg logit contributions
oc-1.348
ushed-1.339
ney-1.261
jo-1.256
ess-1.240
Tokenney
Token ID1681
Feature activation-7.10e-01

Pos logit contributions
ides+0.163
int+0.147
ard+0.141
oc+0.138
ushed+0.129
Neg logit contributions
 the-0.972
 a-0.897
 an-0.872
 your-0.831
 this-0.822
Token.
Token ID13
Feature activation-2.61e-01

Pos logit contributions
oc+1.567
ushed+1.545
ney+1.507
ouse+1.493
ess+1.482
Neg logit contributions
 the-1.931
 a-1.712
 an-1.642
 her-1.579
 your-1.568
Token The
Token ID383
Feature activation+1.64e-01

Pos logit contributions
oc+0.719
ney+0.714
ard+0.692
ess+0.665
board+0.658
Neg logit contributions
 the-0.454
 an-0.418
 your-0.414
 anything-0.395
 a-0.394
Token man
Token ID582
Feature activation-2.34e-01

Pos logit contributions
 the+0.434
 a+0.387
 an+0.386
 your+0.369
 her+0.360
Neg logit contributions
oc-0.354
ney-0.347
ushed-0.329
ess-0.327
ard-0.325
Token was
Token ID373
Feature activation-2.30e-01

Pos logit contributions
ney+0.567
ides+0.550
oc+0.547
uddle+0.547
ess+0.544
Neg logit contributions
 the-0.587
 a-0.504
 an-0.478
 it-0.469
 her-0.454
Token very
Token ID845
Feature activation+1.16e-01

Pos logit contributions
ney+0.492
y+0.439
ivering+0.439
umpy+0.435
ess+0.412
Neg logit contributions
 anything-0.455
 these-0.448
 your-0.439
 the-0.437
 our-0.427
Token quickly
Token ID2952
Feature activation-4.38e-02

Pos logit contributions
 the+0.321
 a+0.278
 an+0.263
 her+0.261
 it+0.245
Neg logit contributions
uddle-0.326
ney-0.323
oc-0.319
ushed-0.313
umpy-0.310
Token ran
Token ID4966
Feature activation-1.12e-01

Pos logit contributions
ney+0.108
oc+0.106
ess+0.099
ushed+0.098
umpy+0.098
Neg logit contributions
 the-0.105
 a-0.092
 an-0.091
 her-0.087
 your-0.086
Token to
Token ID284
Feature activation+1.69e-01

Pos logit contributions
ess+0.268
ney+0.259
ard+0.255
y+0.255
board+0.251
Neg logit contributions
 the-0.200
 your-0.191
 an-0.186
 anything-0.181
 our-0.177
Token the
Token ID262
Feature activation+5.64e-01

Pos logit contributions
 anything+0.089
 our+0.078
 these+0.077
 your+0.075
nam+0.070
Neg logit contributions
oc-0.471
ess-0.469
jo-0.457
board-0.457
ard-0.454
Token chim
Token ID18205
Feature activation-2.50e-01

Pos logit contributions
 the+1.470
 an+1.362
 your+1.317
 a+1.291
 their+1.287
Neg logit contributions
ney-1.117
oc-1.048
ard-1.027
board-1.025
uddle-1.009
Tokenney
Token ID1681
Feature activation-7.09e-01

Pos logit contributions
ides+0.131
ard+0.112
int+0.112
ouse+0.105
oc+0.104
Neg logit contributions
 the-1.002
 an-0.918
 a-0.916
 your-0.878
 their-0.878
Token.
Token ID13
Feature activation-2.57e-01

Pos logit contributions
oc+1.584
ney+1.528
ushed+1.497
ess+1.463
ouse+1.455
Neg logit contributions
 the-1.876
 a-1.667
 an-1.655
 her-1.568
 your-1.553
Token She
Token ID1375
Feature activation-7.47e-02

Pos logit contributions
ney+0.609
oc+0.599
ard+0.572
ess+0.572
ushed+0.556
Neg logit contributions
 the-0.587
 an-0.524
 your-0.515
 a-0.512
 their-0.490
Token took
Token ID1718
Feature activation-1.00e-01

Pos logit contributions
oc+0.172
ney+0.169
ess+0.162
ard+0.161
umpy+0.160
Neg logit contributions
 the-0.184
 an-0.163
 a-0.163
 your-0.152
 her-0.152
Token a
Token ID257
Feature activation+4.78e-01

Pos logit contributions
ess+0.255
ard+0.254
int+0.252
est+0.250
les+0.245
Neg logit contributions
nam-0.112
 your-0.106
 our-0.103
 my-0.086
 anything-0.083
Token few
Token ID1178
Feature activation+3.14e-02

Pos logit contributions
 the+0.782
 an+0.658
 a+0.652
 your+0.621
 her+0.584
Neg logit contributions
ney-1.473
oc-1.462
ess-1.398
ard-1.395
ushed-1.392
Token a
Token ID257
Feature activation+4.79e-01

Pos logit contributions
oc+0.077
ney+0.075
uddle+0.072
board+0.070
ushed+0.070
Neg logit contributions
 the-0.051
 your-0.045
 an-0.045
 these-0.043
 her-0.042
Token game
Token ID983
Feature activation-1.48e-01

Pos logit contributions
 the+1.154
 a+1.025
 an+1.020
 your+1.001
 their+0.964
Neg logit contributions
ney-1.108
oc-1.073
ard-1.045
ushed-1.033
ess-1.013
Token near
Token ID1474
Feature activation+3.22e-03

Pos logit contributions
oc+0.336
ney+0.331
ard+0.314
ess+0.311
ushed+0.310
Neg logit contributions
 the-0.374
 an-0.335
 a-0.334
 your-0.319
 her-0.311
Token the
Token ID262
Feature activation+5.62e-01

Pos logit contributions
 your+0.001
 these+0.000
 the+0.000
 her+0.000
 an+0.000
Neg logit contributions
oc-0.013
ney-0.013
y-0.012
ess-0.012
ard-0.011
Token chim
Token ID18205
Feature activation-2.49e-01

Pos logit contributions
 the+1.564
 an+1.407
 a+1.397
 your+1.364
 her+1.312
Neg logit contributions
oc-1.084
ney-1.082
ushed-1.039
ard-1.021
ess-1.021
Tokenney
Token ID1681
Feature activation-7.09e-01

Pos logit contributions
int+0.146
ides+0.144
ard+0.135
ushed+0.126
suit+0.125
Neg logit contributions
 the-1.010
 a-0.923
 an-0.911
 their-0.874
 your-0.870
Token.
Token ID13
Feature activation-2.60e-01

Pos logit contributions
oc+1.490
ney+1.435
ushed+1.433
ess+1.403
ouse+1.393
Neg logit contributions
 the-1.989
 a-1.771
 an-1.730
 your-1.654
 their-1.641
Token They
Token ID1119
Feature activation-1.07e-01

Pos logit contributions
ney+0.650
oc+0.650
ard+0.623
ess+0.621
board+0.613
Neg logit contributions
 the-0.505
 your-0.475
 an-0.463
 her-0.446
 a-0.435
Token were
Token ID547
Feature activation-1.58e-01

Pos logit contributions
ney+0.223
oc+0.222
ard+0.208
ess+0.206
ushed+0.203
Neg logit contributions
 the-0.278
 an-0.254
 a-0.248
 your-0.247
 her-0.237
Token very
Token ID845
Feature activation+1.15e-01

Pos logit contributions
ney+0.377
umpy+0.348
y+0.346
ess+0.341
oc+0.335
Neg logit contributions
 your-0.242
 the-0.239
 these-0.235
 an-0.235
 anything-0.234
Token happy
Token ID3772
Feature activation+1.27e-01

Pos logit contributions
 the+0.207
 an+0.193
 such+0.179
 a+0.179
 anything+0.178
Neg logit contributions
ney-0.310
oc-0.304
umpy-0.298
uddle-0.294
ushed-0.293
Token had
Token ID550
Feature activation-2.23e-01

Pos logit contributions
 the+0.718
 a+0.636
 an+0.608
 their+0.574
 her+0.573
Neg logit contributions
oc-0.624
ney-0.623
uddle-0.622
ides-0.621
ivering-0.607
Token a
Token ID257
Feature activation+4.82e-01

Pos logit contributions
ney+0.725
jo+0.670
y+0.669
ard+0.668
 pad+0.667
Neg logit contributions
 your-0.126
 the-0.122
 these-0.121
 our-0.108
 my-0.090
Token big
Token ID1263
Feature activation+1.33e-01

Pos logit contributions
 the+0.981
 an+0.869
 a+0.864
 your+0.827
 their+0.792
Neg logit contributions
ney-1.295
oc-1.280
ard-1.239
ess-1.217
ushed-1.210
Token rh
Token ID9529
Feature activation-1.35e-01

Pos logit contributions
 the+0.347
 an+0.312
 a+0.311
 your+0.304
 their+0.290
Neg logit contributions
ney-0.289
oc-0.280
ess-0.274
ard-0.271
ushed-0.269
Tokenin
Token ID259
Feature activation-2.74e-01

Pos logit contributions
ney+0.247
oc+0.235
ushed+0.233
ess+0.232
ard+0.228
Neg logit contributions
 the-0.385
 a-0.347
 an-0.343
 your-0.340
 it-0.325
Tokenoc
Token ID420
Feature activation-7.09e-01

Pos logit contributions
ushed+0.186
ides+0.182
ney+0.170
umpy+0.161
ivering+0.158
Neg logit contributions
 the-1.107
 a-1.064
 an-1.016
 this-0.986
 your-0.985
Tokeneros
Token ID27498
Feature activation-2.68e-01

Pos logit contributions
ney+1.319
oc+1.227
ouse+1.212
umpy+1.192
ard+1.188
Neg logit contributions
 the-1.983
 an-1.805
 a-1.775
 your-1.762
 this-1.674
Token that
Token ID326
Feature activation+2.52e-01

Pos logit contributions
ney+0.723
oc+0.704
ushed+0.689
ess+0.684
ard+0.683
Neg logit contributions
 the-0.560
 a-0.492
 an-0.486
 your-0.455
 her-0.438
Token he
Token ID339
Feature activation+1.74e-01

Pos logit contributions
 the+0.533
 an+0.481
 your+0.466
 a+0.463
 their+0.439
Neg logit contributions
ney-0.634
oc-0.634
ard-0.598
ess-0.594
ushed-0.583
Token liked
Token ID8288
Feature activation-2.87e-01

Pos logit contributions
 the+0.459
 a+0.417
 an+0.411
 your+0.387
 her+0.380
Neg logit contributions
ney-0.363
oc-0.350
uddle-0.344
ard-0.344
ess-0.336
Token to
Token ID284
Feature activation+1.72e-01

Pos logit contributions
oc+0.845
ney+0.840
ess+0.802
ard+0.801
y+0.781
Neg logit contributions
 the-0.462
 an-0.409
 your-0.391
 a-0.385
 their-0.357
Tokeneros
Token ID27498
Feature activation-2.71e-01

Pos logit contributions
ney+1.240
oc+1.186
ard+1.141
ouse+1.138
ushed+1.115
Neg logit contributions
 the-2.078
 an-1.892
 a-1.868
 your-1.843
 their-1.762
Token.
Token ID13
Feature activation-2.65e-01

Pos logit contributions
ney+0.645
ushed+0.640
ess+0.620
ouse+0.615
uddle+0.612
Neg logit contributions
 the-0.671
 a-0.607
 an-0.580
 it-0.545
 your-0.542
Token The
Token ID383
Feature activation+1.61e-01

Pos logit contributions
oc+0.727
ney+0.704
ard+0.684
board+0.675
ess+0.670
Neg logit contributions
 the-0.443
 an-0.414
 your-0.411
 anything-0.391
 a-0.385
Token rh
Token ID9529
Feature activation-1.36e-01

Pos logit contributions
 the+0.410
 an+0.368
 a+0.364
 your+0.356
 their+0.341
Neg logit contributions
ney-0.353
oc-0.346
ard-0.329
ess-0.328
ushed-0.327
Tokenin
Token ID259
Feature activation-2.76e-01

Pos logit contributions
ney+0.244
ess+0.239
ard+0.237
ushed+0.235
jo+0.234
Neg logit contributions
 the-0.379
 a-0.339
 an-0.329
 your-0.329
 it-0.321
Tokenoc
Token ID420
Feature activation-7.09e-01

Pos logit contributions
ides+0.180
uddle+0.179
ivering+0.163
umpy+0.160
udd+0.154
Neg logit contributions
 the-1.122
 a-1.047
 an-0.997
 it-0.996
 this-0.980
Tokeneros
Token ID27498
Feature activation-2.70e-01

Pos logit contributions
ney+1.191
oc+1.156
ard+1.097
ess+1.077
ouse+1.065
Neg logit contributions
 the-2.132
 an-1.945
 a-1.919
 your-1.903
 her-1.831
Token looked
Token ID3114
Feature activation-1.94e-01

Pos logit contributions
ney+0.608
ides+0.599
umpy+0.596
ivering+0.593
uddle+0.586
Neg logit contributions
 the-0.701
 a-0.628
 an-0.599
 it-0.571
 this-0.546
Token sad
Token ID6507
Feature activation+2.85e-02

Pos logit contributions
oc+0.489
ney+0.451
ard+0.446
ushed+0.442
ess+0.440
Neg logit contributions
 the-0.371
 your-0.331
 an-0.326
 her-0.315
 anything-0.312
Token and
Token ID290
Feature activation-2.68e-01

Pos logit contributions
 the+0.075
 a+0.070
 an+0.067
 her+0.063
 their+0.062
Neg logit contributions
oc-0.070
ney-0.067
ard-0.063
ari-0.063
ushed-0.062
Token scared
Token ID12008
Feature activation-5.04e-03

Pos logit contributions
ney+0.565
oc+0.548
y+0.546
ushed+0.543
ard+0.535
Neg logit contributions
 the-0.541
 an-0.518
 your-0.505
 such-0.501
 these-0.494
Token cozy
Token ID37438
Feature activation+7.01e-02

Pos logit contributions
 the+0.986
 a+0.866
 an+0.863
 your+0.832
 their+0.782
Neg logit contributions
ney-1.271
oc-1.266
ard-1.219
ushed-1.200
ess-1.196
Token spot
Token ID4136
Feature activation-1.57e-01

Pos logit contributions
 the+0.176
 a+0.157
 an+0.155
 your+0.152
 her+0.147
Neg logit contributions
ney-0.157
oc-0.155
ushed-0.148
ard-0.147
ess-0.145
Token in
Token ID287
Feature activation+6.24e-02

Pos logit contributions
oc+0.412
ney+0.411
ard+0.390
ess+0.387
ushed+0.382
Neg logit contributions
 the-0.329
 an-0.289
 a-0.284
 your-0.276
 her-0.266
Token the
Token ID262
Feature activation+5.61e-01

Pos logit contributions
 your+0.014
 anything+0.012
 the+0.009
 these+0.009
 our+0.008
Neg logit contributions
oc-0.228
ney-0.222
y-0.214
ard-0.213
aron-0.209
Token chim
Token ID18205
Feature activation-2.50e-01

Pos logit contributions
 the+1.448
 a+1.298
 an+1.287
 your+1.252
 her+1.206
Neg logit contributions
oc-1.211
ushed-1.170
ney-1.169
ess-1.138
ouse-1.122
Tokenney
Token ID1681
Feature activation-7.09e-01

Pos logit contributions
ides+0.114
oc+0.110
affe+0.104
int+0.091
ard+0.088
Neg logit contributions
 the-1.019
 a-0.937
 an-0.933
 your-0.904
 their-0.889
Token and
Token ID290
Feature activation-2.61e-01

Pos logit contributions
oc+1.530
ney+1.501
ushed+1.452
ess+1.439
ouse+1.421
Neg logit contributions
 the-1.935
 a-1.715
 an-1.684
 your-1.596
 her-1.594
Token the
Token ID262
Feature activation+5.58e-01

Pos logit contributions
oc+0.584
ney+0.584
ard+0.548
ess+0.546
board+0.543
Neg logit contributions
 the-0.505
 an-0.488
 your-0.481
 anything-0.473
 these-0.454
Token two
Token ID734
Feature activation+1.77e-01

Pos logit contributions
 the+1.374
 a+1.232
 an+1.215
 your+1.146
 her+1.128
Neg logit contributions
ney-1.264
oc-1.256
ushed-1.199
ess-1.190
ouse-1.180
Token friends
Token ID2460
Feature activation+1.23e-02

Pos logit contributions
 the+0.405
 a+0.351
 an+0.339
 her+0.331
 their+0.323
Neg logit contributions
ney-0.458
oc-0.451
ouse-0.432
ess-0.429
ushed-0.424
Token enjoyed
Token ID8359
Feature activation-1.04e-01

Pos logit contributions
 the+0.031
 a+0.027
 an+0.026
 her+0.025
 their+0.025
Neg logit contributions
oc-0.029
ney-0.029
umpy-0.027
uddle-0.027
ess-0.027
Token It
Token ID632
Feature activation+5.18e-02

Pos logit contributions
oc+0.724
ney+0.703
ard+0.688
uddle+0.680
board+0.678
Neg logit contributions
 the-0.471
 your-0.449
 an-0.429
 anything-0.424
 her-0.422
Token is
Token ID318
Feature activation-2.53e-01

Pos logit contributions
 the+0.149
 an+0.136
 a+0.134
 your+0.132
 her+0.128
Neg logit contributions
ney-0.096
oc-0.094
ard-0.088
ess-0.088
ushed-0.086
Token a
Token ID257
Feature activation+4.77e-01

Pos logit contributions
ney+0.756
y+0.733
oc+0.717
ess+0.709
aron+0.698
Neg logit contributions
 the-0.251
 these-0.241
 your-0.237
 anything-0.213
 an-0.209
Token rh
Token ID9529
Feature activation-1.35e-01

Pos logit contributions
 the+1.046
 a+0.931
 an+0.919
 your+0.898
 their+0.848
Neg logit contributions
ney-1.197
oc-1.165
ard-1.135
ushed-1.132
ess-1.123
Tokenin
Token ID259
Feature activation-2.76e-01

Pos logit contributions
ney+0.250
ushed+0.245
ess+0.239
oc+0.236
ouse+0.233
Neg logit contributions
 the-0.382
 a-0.344
 your-0.336
 an-0.333
 it-0.325
Tokenoc
Token ID420
Feature activation-7.09e-01

Pos logit contributions
ushed+0.226
ney+0.208
udd+0.204
ides+0.202
uddle+0.193
Neg logit contributions
 the-1.090
 a-1.038
 it-0.990
 an-0.973
 this-0.969
Tokeneros
Token ID27498
Feature activation-2.70e-01

Pos logit contributions
ney+1.334
ouse+1.245
oc+1.245
jo+1.216
uddle+1.196
Neg logit contributions
 the-1.968
 an-1.783
 a-1.762
 your-1.737
 this-1.657
Token.
Token ID13
Feature activation-2.78e-01

Pos logit contributions
ney+0.596
ushed+0.579
oc+0.569
ess+0.563
ouse+0.558
Neg logit contributions
 the-0.717
 a-0.646
 an-0.620
 your-0.609
 it-0.593
Token It
Token ID632
Feature activation+5.25e-02

Pos logit contributions
oc+0.766
ney+0.741
ard+0.719
ess+0.709
umpy+0.704
Neg logit contributions
 the-0.455
 your-0.414
 an-0.412
 anything-0.400
 her-0.391
Token is
Token ID318
Feature activation-2.53e-01

Pos logit contributions
 the+0.151
 an+0.137
 a+0.136
 your+0.134
 their+0.129
Neg logit contributions
ney-0.098
oc-0.095
ess-0.091
ard-0.091
ushed-0.089
Token bright
Token ID6016
Feature activation+7.14e-02

Pos logit contributions
ney+0.697
y+0.673
oc+0.665
ess+0.656
aron+0.645
Neg logit contributions
 the-0.313
 these-0.299
 your-0.290
 anything-0.284
 an-0.278
Token of
Token ID286
Feature activation-1.21e-01

Pos logit contributions
 the+0.461
 an+0.419
 a+0.409
 your+0.409
 her+0.393
Neg logit contributions
ney-0.414
oc-0.411
ard-0.387
ess-0.383
ushed-0.375
Token the
Token ID262
Feature activation+5.65e-01

Pos logit contributions
ney+0.413
oc+0.403
ard+0.389
y+0.388
yt+0.382
Neg logit contributions
 your-0.053
 these-0.047
 the-0.043
 anything-0.043
 an-0.034
Token train
Token ID4512
Feature activation-1.71e-01

Pos logit contributions
 the+1.474
 a+1.290
 an+1.269
 your+1.247
 their+1.186
Neg logit contributions
ushed-1.267
oc-1.232
ney-1.209
ouse-1.177
ess-1.175
Token's
Token ID338
Feature activation-2.61e-01

Pos logit contributions
ushed+0.383
oc+0.382
ney+0.377
ouse+0.364
umpy+0.362
Neg logit contributions
 the-0.455
 a-0.400
 an-0.387
 this-0.364
 it-0.359
Token chim
Token ID18205
Feature activation-2.46e-01

Pos logit contributions
oc+0.630
ushed+0.590
ney+0.581
ouse+0.578
board+0.572
Neg logit contributions
 the-0.635
 a-0.573
 an-0.565
 your-0.532
 their-0.526
Tokenney
Token ID1681
Feature activation-7.09e-01

Pos logit contributions
oc+0.167
pillar+0.154
affe+0.139
ides+0.138
int+0.134
Neg logit contributions
 the-0.957
 a-0.890
 an-0.880
 their-0.818
 your-0.815
Token.
Token ID13
Feature activation-2.62e-01

Pos logit contributions
oc+1.571
ney+1.517
ushed+1.493
ess+1.473
ouse+1.452
Neg logit contributions
 the-1.926
 a-1.724
 an-1.683
 your-1.567
 this-1.549
Token The
Token ID383
Feature activation+1.66e-01

Pos logit contributions
ney+0.708
oc+0.703
ard+0.686
ess+0.667
board+0.651
Neg logit contributions
 the-0.480
 your-0.428
 an-0.425
 her-0.408
 a-0.403
Token steam
Token ID13324
Feature activation+5.02e-02

Pos logit contributions
 the+0.414
 a+0.366
 an+0.360
 your+0.340
 their+0.336
Neg logit contributions
ney-0.381
oc-0.379
ushed-0.367
ess-0.362
ouse-0.361
Token was
Token ID373
Feature activation-2.32e-01

Pos logit contributions
 the+0.131
 a+0.114
 an+0.110
 it+0.107
 their+0.106
Neg logit contributions
oc-0.113
ney-0.109
umpy-0.105
ari-0.104
ouse-0.104
Token very
Token ID845
Feature activation+1.16e-01

Pos logit contributions
ney+0.625
oc+0.579
y+0.578
ushed+0.570
ard+0.569
Neg logit contributions
 the-0.366
 your-0.345
 an-0.331
 these-0.327
 anything-0.326
Token scatter
Token ID41058
Feature activation+3.40e-02

Pos logit contributions
 the+0.330
 an+0.296
 your+0.290
 a+0.280
 these+0.271
Neg logit contributions
ney-0.437
oc-0.432
ess-0.416
ard-0.413
board-0.401
Token their
Token ID511
Feature activation+4.55e-01

Pos logit contributions
 the+0.014
 your+0.014
 an+0.014
 a+0.008
 our+0.008
Neg logit contributions
y-0.120
ney-0.119
oc-0.119
ard-0.116
uddle-0.114
Token things
Token ID1243
Feature activation+1.12e-01

Pos logit contributions
 the+0.982
 an+0.868
 a+0.853
 your+0.833
 her+0.798
Neg logit contributions
oc-1.163
ney-1.161
ess-1.096
ard-1.091
ushed-1.075
Token all
Token ID477
Feature activation+2.65e-01

Pos logit contributions
 the+0.222
 an+0.197
 a+0.191
 your+0.190
 her+0.179
Neg logit contributions
ney-0.295
oc-0.290
ard-0.280
ess-0.277
ushed-0.273
Token around
Token ID1088
Feature activation-2.81e-02

Pos logit contributions
 the+0.528
 an+0.464
 a+0.453
 your+0.444
 her+0.423
Neg logit contributions
ney-0.715
oc-0.713
ard-0.678
ess-0.674
ushed-0.665
Token the
Token ID262
Feature activation+5.65e-01

Pos logit contributions
oc+0.090
ney+0.089
ard+0.087
y+0.086
ess+0.085
Neg logit contributions
 the-0.034
 an-0.030
 your-0.029
 a-0.028
 these-0.024
Token palace
Token ID20562
Feature activation-2.35e-01

Pos logit contributions
 the+1.484
 a+1.313
 an+1.292
 your+1.267
 their+1.201
Neg logit contributions
oc-1.219
ushed-1.181
ney-1.176
ess-1.151
ouse-1.114
Token.
Token ID13
Feature activation-2.55e-01

Pos logit contributions
oc+0.518
ney+0.477
ushed+0.470
int+0.458
ard+0.457
Neg logit contributions
 the-0.633
 a-0.562
 an-0.550
 their-0.523
 her-0.518
Token It
Token ID632
Feature activation+6.23e-02

Pos logit contributions
ney+0.599
oc+0.595
ard+0.574
ess+0.570
ari+0.556
Neg logit contributions
 the-0.563
 your-0.510
 an-0.509
 a-0.491
 her-0.482
Token was
Token ID373
Feature activation-2.28e-01

Pos logit contributions
 the+0.169
 an+0.153
 a+0.151
 your+0.146
 her+0.143
Neg logit contributions
oc-0.126
ney-0.125
ard-0.117
ess-0.117
ushed-0.114
Token fun
Token ID1257
Feature activation+1.85e-01

Pos logit contributions
ney+0.676
y+0.640
ushed+0.620
ess+0.617
aron+0.613
Neg logit contributions
 your-0.220
 the-0.211
 these-0.200
 our-0.183
 anything-0.175
Token sees
Token ID7224
Feature activation-2.23e-01

Pos logit contributions
 the+0.733
 a+0.647
 an+0.631
 your+0.621
 their+0.586
Neg logit contributions
ushed-0.644
oc-0.643
ess-0.640
ney-0.621
ides-0.609
Token a
Token ID257
Feature activation+4.80e-01

Pos logit contributions
board+0.687
ney+0.676
oc+0.661
y+0.648
ard+0.630
Neg logit contributions
 your-0.154
 these-0.148
 our-0.148
nam-0.118
 such-0.114
Token big
Token ID1263
Feature activation+1.35e-01

Pos logit contributions
 the+0.890
 a+0.756
 an+0.748
 your+0.714
 their+0.676
Neg logit contributions
ney-1.392
oc-1.372
ushed-1.327
ess-1.322
ard-1.322
Token cookie
Token ID19751
Feature activation-3.58e-02

Pos logit contributions
 the+0.345
 an+0.309
 a+0.306
 your+0.301
 her+0.289
Neg logit contributions
ney-0.294
oc-0.291
ess-0.274
ard-0.273
ushed-0.271
Token on
Token ID319
Feature activation+9.91e-02

Pos logit contributions
oc+0.112
ney+0.111
ess+0.109
ushed+0.108
jo+0.106
Neg logit contributions
 the-0.061
 a-0.051
 an-0.049
 your-0.047
 her-0.045
Token the
Token ID262
Feature activation+5.64e-01

Pos logit contributions
 your+0.038
 these+0.035
nam+0.032
 our+0.029
 anything+0.026
Neg logit contributions
oc-0.346
ney-0.316
ard-0.314
board-0.313
y-0.306
Token table
Token ID3084
Feature activation-1.60e-01

Pos logit contributions
 the+1.576
 an+1.443
 a+1.416
 your+1.399
 her+1.356
Neg logit contributions
ney-1.070
oc-1.063
ard-0.990
ess-0.980
ushed-0.958
Token.
Token ID13
Feature activation-2.73e-01

Pos logit contributions
oc+0.368
ess+0.361
ushed+0.349
jo+0.344
ney+0.340
Neg logit contributions
 the-0.430
 a-0.388
 an-0.374
 her-0.354
 your-0.350
Token He
Token ID679
Feature activation-1.57e-01

Pos logit contributions
oc+0.626
ney+0.621
board+0.581
ard+0.568
ess+0.546
Neg logit contributions
 the-0.621
 your-0.570
 an-0.564
 her-0.552
 their-0.549
Token loves
Token ID10408
Feature activation-3.79e-01

Pos logit contributions
ney+0.308
oc+0.301
ess+0.289
ard+0.284
ari+0.279
Neg logit contributions
 the-0.446
 an-0.408
 a-0.403
 your-0.394
 her-0.382
Token cookies
Token ID14746
Feature activation+1.68e-01

Pos logit contributions
oc+0.762
y+0.696
ard+0.677
aron+0.658
board+0.652
Neg logit contributions
 an-0.723
 the-0.637
 such-0.625
 anything-0.625
wered-0.619
Token top
Token ID1353
Feature activation+5.60e-02

Pos logit contributions
 the+1.282
 an+1.147
 a+1.135
 your+1.084
 her+1.052
Neg logit contributions
ney-1.365
oc-1.363
ushed-1.288
ard-1.285
ouse-1.272
Token.
Token ID13
Feature activation-2.68e-01

Pos logit contributions
 the+0.143
 a+0.123
 an+0.118
 her+0.115
 this+0.113
Neg logit contributions
ushed-0.131
oc-0.129
ney-0.125
ouse-0.124
ess-0.122
Token She
Token ID1375
Feature activation-8.28e-02

Pos logit contributions
ney+0.606
oc+0.602
ess+0.581
ard+0.574
ushed+0.558
Neg logit contributions
 the-0.605
 your-0.554
 an-0.552
 a-0.533
 their-0.531
Token quickly
Token ID2952
Feature activation-5.00e-02

Pos logit contributions
ney+0.169
oc+0.167
ard+0.160
ess+0.159
ari+0.156
Neg logit contributions
 the-0.225
 an-0.202
 a-0.201
 your-0.192
 her-0.189
Token snatched
Token ID43276
Feature activation-1.20e-01

Pos logit contributions
ney+0.118
oc+0.115
ess+0.109
ard+0.108
ushed+0.107
Neg logit contributions
 the-0.122
 an-0.107
 a-0.106
 your-0.103
 her-0.101
Token the
Token ID262
Feature activation+5.58e-01

Pos logit contributions
oc+0.379
ish+0.360
les+0.357
y+0.352
ard+0.351
Neg logit contributions
nam-0.055
 your-0.049
 anything-0.045
 our-0.043
 an-0.039
Token fruit
Token ID8234
Feature activation-4.27e-02

Pos logit contributions
 the+1.339
 a+1.199
 an+1.185
 your+1.110
 her+1.071
Neg logit contributions
ney-1.322
jo-1.297
oc-1.280
ushed-1.241
ari-1.236
Token and
Token ID290
Feature activation-2.68e-01

Pos logit contributions
ney+0.116
oc+0.115
ushed+0.109
ard+0.109
ess+0.109
Neg logit contributions
 the-0.087
 an-0.075
 a-0.074
 your-0.072
 her-0.070
Token started
Token ID2067
Feature activation-1.51e-01

Pos logit contributions
oc+0.632
ney+0.576
ushed+0.574
ard+0.565
board+0.553
Neg logit contributions
 the-0.508
 an-0.507
 your-0.494
 anything-0.482
 these-0.461
Token to
Token ID284
Feature activation+1.66e-01

Pos logit contributions
oc+0.347
ney+0.341
ard+0.336
y+0.333
ushed+0.326
Neg logit contributions
 your-0.234
 the-0.234
 an-0.232
 our-0.224
 anything-0.222
Token eat
Token ID4483
Feature activation+8.32e-02

Pos logit contributions
 the+0.334
 an+0.317
 your+0.299
 a+0.298
 anything+0.288
Neg logit contributions
oc-0.421
ney-0.399
ard-0.373
ess-0.373
ushed-0.368
Token piece
Token ID3704
Feature activation-1.34e-01

Pos logit contributions
 the+0.495
 an+0.448
 a+0.434
 your+0.433
 their+0.410
Neg logit contributions
ney-0.524
oc-0.524
ard-0.494
ess-0.490
ushed-0.478
Token moved
Token ID3888
Feature activation-8.97e-02

Pos logit contributions
ney+0.329
oc+0.324
ard+0.308
ess+0.308
ushed+0.303
Neg logit contributions
 the-0.290
 an-0.261
 a-0.253
 your-0.251
 their-0.238
Token.
Token ID13
Feature activation-2.51e-01

Pos logit contributions
ney+0.205
oc+0.204
ard+0.192
ess+0.191
ushed+0.189
Neg logit contributions
 the-0.216
 an-0.194
 a-0.190
 your-0.187
 her-0.180
Token He
Token ID679
Feature activation-1.51e-01

Pos logit contributions
ney+0.657
oc+0.655
ard+0.622
ess+0.618
ushed+0.610
Neg logit contributions
 the-0.518
 an-0.458
 a-0.447
 your-0.439
 her-0.419
Token showed
Token ID3751
Feature activation-7.53e-02

Pos logit contributions
ney+0.325
oc+0.321
ard+0.306
ess+0.303
ari+0.297
Neg logit contributions
 the-0.398
 an-0.356
 a-0.356
 your-0.336
 her-0.333
Token her
Token ID607
Feature activation+4.55e-01

Pos logit contributions
ess+0.252
oc+0.251
jo+0.246
y+0.245
les+0.241
Neg logit contributions
 an-0.045
 the-0.044
 your-0.037
 our-0.035
 anything-0.034
Token how
Token ID703
Feature activation+3.05e-01

Pos logit contributions
 the+0.441
 your+0.407
 an+0.388
 our+0.352
 a+0.334
Neg logit contributions
oc-1.492
ney-1.491
ouse-1.443
y-1.433
ushed-1.403
Token to
Token ID284
Feature activation+1.80e-01

Pos logit contributions
 the+0.513
 an+0.469
 our+0.463
 anything+0.459
 their+0.456
Neg logit contributions
ney-0.739
y-0.705
umpy-0.696
ard-0.687
ess-0.681
Token plan
Token ID1410
Feature activation-1.57e-01

Pos logit contributions
 the+0.360
 an+0.336
 your+0.317
 a+0.316
 anything+0.307
Neg logit contributions
oc-0.440
ney-0.436
ard-0.410
ess-0.405
ushed-0.398
Token her
Token ID607
Feature activation+4.54e-01

Pos logit contributions
ney+0.486
oc+0.485
ess+0.467
ard+0.467
board+0.464
Neg logit contributions
 the-0.186
 an-0.161
 your-0.155
 a-0.142
 our-0.141
Token moves
Token ID6100
Feature activation-1.39e-01

Pos logit contributions
 the+1.070
 an+0.954
 a+0.942
 your+0.915
 her+0.883
Neg logit contributions
ney-1.077
oc-1.074
ard-1.019
ess-1.013
ushed-1.004
TokenI
Token ID40
Feature activation-5.02e-02

Pos logit contributions
 the+0.076
 an+0.067
 a+0.065
 your+0.064
 her+0.061
Neg logit contributions
ney-0.098
oc-0.097
ard-0.092
ess-0.092
ushed-0.091
Token like
Token ID588
Feature activation+1.83e-02

Pos logit contributions
ney+0.140
ard+0.136
an+0.131
ari+0.131
oc+0.128
Neg logit contributions
 the-0.103
 a-0.095
 an-0.094
 your-0.086
 her-0.083
Token to
Token ID284
Feature activation+1.72e-01

Pos logit contributions
 the+0.023
 an+0.022
 their+0.020
 anything+0.020
 such+0.020
Neg logit contributions
ney-0.044
ard-0.044
board-0.044
y-0.042
oc-0.042
Token listen
Token ID6004
Feature activation+1.63e-01

Pos logit contributions
 the+0.334
 an+0.290
 a+0.286
 your+0.279
 her+0.267
Neg logit contributions
ney-0.482
oc-0.479
ard-0.457
ess-0.456
ushed-0.452
Token to
Token ID284
Feature activation+1.73e-01

Pos logit contributions
 the+0.285
 an+0.247
 a+0.238
 your+0.234
 her+0.222
Neg logit contributions
ney-0.474
oc-0.471
ard-0.451
ess-0.450
ushed-0.442
Token you
Token ID345
Feature activation+2.96e-01

Pos logit contributions
 an+0.292
 the+0.283
 a+0.260
 their+0.256
 these+0.255
Neg logit contributions
oc-0.410
ney-0.390
ess-0.388
ard-0.382
jo-0.380
Token and
Token ID290
Feature activation-2.64e-01

Pos logit contributions
 the+0.788
 a+0.712
 an+0.696
 your+0.687
 her+0.660
Neg logit contributions
ney-0.650
oc-0.638
ushed-0.624
ard-0.598
ari-0.595
Token learn
Token ID2193
Feature activation+1.17e-01

Pos logit contributions
oc+0.629
ney+0.627
ess+0.593
ouse+0.578
board+0.563
Neg logit contributions
 the-0.476
 an-0.464
 a-0.418
 these-0.416
 anything-0.413
Token from
Token ID422
Feature activation+4.72e-02

Pos logit contributions
 the+0.183
 an+0.162
 a+0.149
 her+0.147
 your+0.145
Neg logit contributions
ney-0.326
oc-0.326
ess-0.314
uddle-0.306
y-0.301
Token you
Token ID345
Feature activation+2.97e-01

Pos logit contributions
 an+0.056
 the+0.055
 such+0.048
 a+0.047
 these+0.047
Neg logit contributions
ney-0.131
oc-0.131
an-0.128
ess-0.127
y-0.126
Token.
Token ID13
Feature activation-2.64e-01

Pos logit contributions
 the+0.797
 a+0.721
 an+0.712
 your+0.699
 her+0.670
Neg logit contributions
oc-0.637
ney-0.637
ushed-0.615
ard-0.602
ouse-0.587
Token outside
Token ID2354
Feature activation+4.47e-02

Pos logit contributions
 the+0.340
 your+0.305
 an+0.300
 anything+0.291
 these+0.290
Neg logit contributions
oc-0.472
ney-0.456
uddle-0.428
board-0.427
ushed-0.426
Token one
Token ID530
Feature activation+1.76e-01

Pos logit contributions
 the+0.086
 an+0.077
 your+0.075
 a+0.073
 her+0.071
Neg logit contributions
ney-0.119
oc-0.118
ard-0.110
y-0.110
ess-0.109
Token day
Token ID1110
Feature activation-1.01e-01

Pos logit contributions
 the+0.412
 a+0.359
 an+0.351
 your+0.351
 their+0.334
Neg logit contributions
oc-0.433
ney-0.425
ushed-0.421
ess-0.414
ard-0.413
Token when
Token ID618
Feature activation+7.65e-02

Pos logit contributions
oc+0.220
ney+0.220
ess+0.202
ard+0.201
ushed+0.201
Neg logit contributions
 the-0.265
 an-0.239
 a-0.238
 your-0.230
 her-0.223
Token they
Token ID484
Feature activation+2.63e-01

Pos logit contributions
 your+0.096
 such+0.092
 these+0.091
 the+0.090
 my+0.086
Neg logit contributions
y-0.191
ney-0.180
yt-0.177
ushed-0.176
uddle-0.176
Token found
Token ID1043
Feature activation-6.11e-02

Pos logit contributions
 the+0.699
 an+0.637
 a+0.625
 your+0.620
 her+0.596
Neg logit contributions
ney-0.532
oc-0.529
ard-0.496
ess-0.490
ushed-0.483
Token a
Token ID257
Feature activation+4.83e-01

Pos logit contributions
ney+0.178
y+0.172
an+0.170
board+0.166
oc+0.166
Neg logit contributions
 your-0.043
nam-0.040
 these-0.037
 my-0.035
 those-0.030
Token big
Token ID1263
Feature activation+1.39e-01

Pos logit contributions
 the+0.891
 an+0.770
 a+0.770
 your+0.735
 their+0.687
Neg logit contributions
ney-1.383
oc-1.377
ard-1.322
ess-1.311
ushed-1.311
Token,
Token ID11
Feature activation-1.80e-01

Pos logit contributions
 the+0.364
 a+0.324
 an+0.321
 your+0.316
 their+0.300
Neg logit contributions
ney-0.302
oc-0.295
ess-0.289
ushed-0.284
jo-0.281
Token thick
Token ID6546
Feature activation+5.13e-02

Pos logit contributions
ney+0.477
oc+0.471
ard+0.448
ess+0.435
board+0.435
Neg logit contributions
 the-0.342
 an-0.300
 your-0.293
 a-0.284
 anything-0.281
Token model
Token ID2746
Feature activation-1.04e-01

Pos logit contributions
 the+0.128
 a+0.115
 an+0.111
 your+0.108
 it+0.105
Neg logit contributions
ney-0.116
oc-0.114
ari-0.114
ushed-0.113
ess-0.113
Token imagine
Token ID5967
Feature activation-9.62e-02

Pos logit contributions
oc+0.259
ney+0.258
ushed+0.245
ard+0.245
ess+0.244
Neg logit contributions
 the-0.173
 an-0.149
 a-0.148
 your-0.143
 her-0.138
Token fun
Token ID1257
Feature activation+1.86e-01

Pos logit contributions
oc+0.290
ney+0.287
ess+0.276
jo+0.276
ard+0.273
Neg logit contributions
 the-0.095
 your-0.087
 an-0.087
 our-0.080
 a-0.075
Token adventures
Token ID17545
Feature activation-4.76e-02

Pos logit contributions
 the+0.365
 an+0.322
 a+0.313
 your+0.307
 her+0.292
Neg logit contributions
ney-0.507
oc-0.506
ard-0.481
ess-0.478
ushed-0.471
Token with
Token ID351
Feature activation+5.92e-03

Pos logit contributions
ney+0.107
oc+0.107
ard+0.101
ess+0.100
ushed+0.097
Neg logit contributions
 the-0.114
 an-0.103
 a-0.100
 your-0.100
 her-0.095
Token her
Token ID607
Feature activation+4.50e-01

Pos logit contributions
 your+0.003
 the+0.003
 an+0.003
 our+0.003
 these+0.002
Neg logit contributions
oc-0.020
ess-0.019
jo-0.019
ney-0.019
ard-0.019
Token uncle
Token ID7711
Feature activation-3.19e-02

Pos logit contributions
 the+0.940
 a+0.815
 an+0.809
 your+0.741
 their+0.739
Neg logit contributions
ney-1.221
oc-1.211
ard-1.183
ushed-1.166
umpy-1.149
Token.
Token ID13
Feature activation-2.57e-01

Pos logit contributions
ney+0.076
oc+0.075
ard+0.071
ushed+0.071
ess+0.070
Neg logit contributions
 the-0.075
 an-0.067
 a-0.066
 your-0.064
 her-0.062
Token They
Token ID1119
Feature activation-1.03e-01

Pos logit contributions
ney+0.593
oc+0.590
ess+0.555
ard+0.555
ari+0.539
Neg logit contributions
 the-0.609
 an-0.548
 a-0.537
 your-0.531
 her-0.507
Token would
Token ID561
Feature activation-2.96e-01

Pos logit contributions
ney+0.200
oc+0.200
ess+0.185
ard+0.184
ari+0.179
Neg logit contributions
 the-0.276
 an-0.256
 your-0.251
 a-0.249
 her-0.236
Token go
Token ID467
Feature activation+5.00e-02

Pos logit contributions
oc+0.748
ney+0.741
ess+0.698
ard+0.693
jo+0.678
Neg logit contributions
 the-0.596
 an-0.541
 your-0.520
 a-0.516
 anything-0.489
Token on
Token ID319
Feature activation+1.03e-01

Pos logit contributions
 the+0.090
 an+0.082
 your+0.079
 a+0.076
 anything+0.075
Neg logit contributions
ney-0.133
oc-0.132
ess-0.128
ari-0.127
y-0.125
Token castles
Token ID49203
Feature activation+1.01e-02

Pos logit contributions
 the+0.175
 an+0.160
 your+0.159
 these+0.143
 her+0.141
Neg logit contributions
oc-0.487
uddle-0.481
ney-0.470
ard-0.467
an-0.458
Token and
Token ID290
Feature activation-2.73e-01

Pos logit contributions
 the+0.028
 an+0.025
 a+0.025
 your+0.024
 her+0.023
Neg logit contributions
oc-0.021
ney-0.020
ard-0.019
ushed-0.019
ess-0.019
Token shapes
Token ID15268
Feature activation-7.80e-02

Pos logit contributions
ney+0.613
oc+0.599
ard+0.554
y+0.544
board+0.540
Neg logit contributions
 the-0.562
 an-0.546
 your-0.521
 a-0.497
 such-0.494
Token with
Token ID351
Feature activation-3.70e-04

Pos logit contributions
oc+0.164
ney+0.163
ard+0.152
ushed+0.151
ess+0.151
Neg logit contributions
 the-0.209
 a-0.188
 an-0.187
 your-0.180
 their-0.178
Token their
Token ID511
Feature activation+4.41e-01

Pos logit contributions
oc+0.001
ney+0.001
y+0.001
ard+0.001
uddle+0.001
Neg logit contributions
 the-0.000
 an-0.000
 your-0.000
 a-0.000
 such-0.000
Token toys
Token ID14958
Feature activation+1.54e-02

Pos logit contributions
 the+0.949
 a+0.814
 an+0.812
 your+0.796
 their+0.775
Neg logit contributions
oc-1.146
ney-1.135
ess-1.087
ard-1.079
ushed-1.064
Token.
Token ID13
Feature activation-2.75e-01

Pos logit contributions
 the+0.043
 a+0.038
 an+0.037
 their+0.036
 her+0.035
Neg logit contributions
oc-0.033
ney-0.033
ushed-0.032
jo-0.031
ess-0.030
Token\n
Token ID198
Feature activation-3.15e-02

Pos logit contributions
ney+0.680
oc+0.679
ard+0.641
ess+0.638
ushed+0.627
Neg logit contributions
 the-0.603
 an-0.540
 a-0.527
 your-0.520
 her-0.498
Token\n
Token ID198
Feature activation-3.20e-02

Pos logit contributions
ney+0.083
oc+0.083
ard+0.079
ess+0.078
ushed+0.078
Neg logit contributions
 the-0.066
 an-0.058
 a-0.057
 your-0.056
 their-0.053
TokenOne
Token ID3198
Feature activation-1.12e-01

Pos logit contributions
ney+0.073
oc+0.073
ard+0.069
ess+0.068
ushed+0.068
Neg logit contributions
 the-0.077
 an-0.070
 a-0.068
 your-0.067
 her-0.065
Token day
Token ID1110
Feature activation-1.04e-01

Pos logit contributions
oc+0.278
ney+0.273
ard+0.265
ushed+0.263
ess+0.260
Neg logit contributions
 the-0.266
 a-0.233
 an-0.230
 your-0.224
 their-0.216
Token yes
Token ID3763
Feature activation+5.30e-02

Pos logit contributions
oc+0.358
ard+0.355
board+0.346
ess+0.345
y+0.344
Neg logit contributions
 the-0.293
 your-0.274
 an-0.264
 our-0.262
 a-0.247
Token,
Token ID11
Feature activation-1.86e-01

Pos logit contributions
 the+0.123
 a+0.108
 an+0.102
 it+0.100
 your+0.100
Neg logit contributions
ney-0.148
ushed-0.140
oc-0.139
uddle-0.131
ouse-0.130
Token so
Token ID523
Feature activation+1.93e-01

Pos logit contributions
oc+0.323
ard+0.322
ney+0.316
board+0.308
ess+0.299
Neg logit contributions
 the-0.478
 an-0.443
 your-0.441
 anything-0.429
 a-0.425
Token they
Token ID484
Feature activation+2.58e-01

Pos logit contributions
 an+0.266
 the+0.257
 anything+0.252
 your+0.246
 a+0.235
Neg logit contributions
y-0.527
oc-0.524
ushed-0.512
yt-0.497
board-0.497
Token went
Token ID1816
Feature activation-1.82e-01

Pos logit contributions
 the+0.661
 an+0.599
 a+0.588
 your+0.581
 her+0.560
Neg logit contributions
ney-0.546
oc-0.544
ard-0.510
ess-0.507
ushed-0.498
Token outside
Token ID2354
Feature activation+4.09e-02

Pos logit contributions
ney+0.505
oc+0.494
ess+0.485
ard+0.483
ari+0.475
Neg logit contributions
 the-0.306
 an-0.273
 your-0.265
 a-0.255
 anything-0.251
Token to
Token ID284
Feature activation+1.70e-01

Pos logit contributions
 the+0.085
 an+0.076
 your+0.075
 a+0.074
 her+0.070
Neg logit contributions
ney-0.103
oc-0.103
ard-0.097
ess-0.096
ushed-0.094
Token the
Token ID262
Feature activation+5.63e-01

Pos logit contributions
 anything+0.298
 nothing+0.298
 your+0.293
 an+0.288
 such+0.284
Neg logit contributions
oc-0.347
ard-0.315
ney-0.313
board-0.299
ari-0.282
Token garden
Token ID11376
Feature activation-1.54e-01

Pos logit contributions
 the+1.438
 an+1.316
 a+1.275
 your+1.271
 her+1.231
Neg logit contributions
ney-1.192
oc-1.165
ard-1.099
ess-1.085
uddle-1.067
Token.
Token ID13
Feature activation-2.62e-01

Pos logit contributions
oc+0.326
ney+0.319
ushed+0.305
ess+0.302
ouse+0.302
Neg logit contributions
 the-0.424
 a-0.382
 an-0.379
 her-0.355
 their-0.354
Token\n
Token ID198
Feature activation-2.88e-02

Pos logit contributions
ney+0.659
oc+0.656
ard+0.631
ess+0.621
ushed+0.617
Neg logit contributions
 the-0.542
 an-0.487
 your-0.476
 a-0.470
 her-0.443
Token,
Token ID11
Feature activation-1.84e-01

Pos logit contributions
oc+0.307
ess+0.292
ard+0.289
y+0.284
ney+0.284
Neg logit contributions
 the-0.205
 an-0.202
 a-0.189
 your-0.185
 anything-0.171
Token the
Token ID262
Feature activation+5.61e-01

Pos logit contributions
oc+0.640
ess+0.599
ney+0.597
ard+0.589
y+0.578
Neg logit contributions
 an-0.145
 the-0.124
 a-0.119
 your-0.115
 her-0.108
Token pupil
Token ID34047
Feature activation-2.34e-01

Pos logit contributions
 the+1.593
 an+1.457
 a+1.434
 your+1.414
 her+1.371
Neg logit contributions
ney-1.051
oc-1.046
ard-0.970
ess-0.963
ushed-0.945
Token is
Token ID318
Feature activation-2.47e-01

Pos logit contributions
ney+0.497
oc+0.495
uddle+0.468
ard+0.466
ess+0.463
Neg logit contributions
 the-0.625
 a-0.551
 an-0.548
 your-0.528
 her-0.519
Token always
Token ID1464
Feature activation+1.60e-02

Pos logit contributions
ney+0.705
oc+0.677
y+0.674
ess+0.654
ouse+0.642
Neg logit contributions
 the-0.321
 an-0.296
 your-0.286
 these-0.284
 anything-0.271
Token happy
Token ID3772
Feature activation+1.32e-01

Pos logit contributions
 the+0.024
 an+0.022
 these+0.021
 your+0.020
 anything+0.020
Neg logit contributions
ney-0.043
oc-0.042
y-0.042
ushed-0.040
ard-0.040
Token and
Token ID290
Feature activation-2.64e-01

Pos logit contributions
 the+0.261
 an+0.230
 a+0.224
 your+0.220
 her+0.209
Neg logit contributions
ney-0.358
oc-0.356
ard-0.339
ess-0.337
ushed-0.333
Token never
Token ID1239
Feature activation+1.21e-01

Pos logit contributions
oc+0.687
ney+0.683
ushed+0.658
ess+0.649
ard+0.630
Neg logit contributions
 the-0.422
 an-0.411
 your-0.376
 anything-0.373
 a-0.371
Token forg
Token ID5784
Feature activation-1.11e-01

Pos logit contributions
 the+0.304
 an+0.273
 a+0.269
 your+0.264
 her+0.254
Neg logit contributions
ney-0.268
oc-0.266
ard-0.250
ess-0.248
ushed-0.245
Tokenets
Token ID1039
Feature activation-4.71e-01

Pos logit contributions
ney+0.130
oc+0.124
uddle+0.107
an+0.106
ushed+0.106
Neg logit contributions
 the-0.399
 a-0.371
 an-0.369
 your-0.358
 anything-0.350
Token to
Token ID284
Feature activation+1.70e-01

Pos logit contributions
oc+1.443
ney+1.424
ess+1.403
ard+1.381
y+1.377
Neg logit contributions
 the-0.530
 an-0.475
 your-0.443
 a-0.431
 these-0.401
Token I
Token ID314
Feature activation+1.48e-01

Pos logit contributions
 the+0.102
 an+0.094
 a+0.092
 your+0.090
 her+0.087
Neg logit contributions
oc-0.073
ney-0.072
ard-0.068
ess-0.068
ushed-0.066
Token am
Token ID716
Feature activation-1.56e-01

Pos logit contributions
 the+0.328
 an+0.290
 a+0.290
 your+0.266
 it+0.266
Neg logit contributions
ney-0.387
ard-0.372
oc-0.366
ushed-0.366
ess-0.358
Token a
Token ID257
Feature activation+4.77e-01

Pos logit contributions
oc+0.387
ess+0.377
ard+0.376
ney+0.374
umpy+0.358
Neg logit contributions
 the-0.275
 an-0.246
 their-0.243
 her-0.241
 your-0.238
Token thief
Token ID25906
Feature activation-9.91e-02

Pos logit contributions
 the+1.012
 a+0.910
 an+0.904
 your+0.879
 her+0.814
Neg logit contributions
ney-1.236
oc-1.180
ushed-1.168
ard-1.145
ari-1.141
Token,
Token ID11
Feature activation-1.86e-01

Pos logit contributions
ney+0.216
ushed+0.209
uddle+0.205
ouse+0.201
oc+0.200
Neg logit contributions
 the-0.268
 a-0.240
 an-0.233
 your-0.232
 her-0.220
Token and
Token ID290
Feature activation-2.66e-01

Pos logit contributions
oc+0.504
ney+0.494
ard+0.488
ess+0.484
board+0.473
Neg logit contributions
 the-0.306
 an-0.285
 their-0.264
 a-0.263
 your-0.259
Token you
Token ID345
Feature activation+2.92e-01

Pos logit contributions
oc+0.699
ney+0.656
ard+0.637
ess+0.635
y+0.625
Neg logit contributions
 the-0.376
 an-0.361
 their-0.348
 her-0.318
 such-0.315
Token are
Token ID389
Feature activation-6.12e-02

Pos logit contributions
 the+0.698
 a+0.627
 an+0.625
 your+0.588
 it+0.562
Neg logit contributions
ney-0.687
oc-0.665
ard-0.653
ushed-0.643
jo-0.636
Token very
Token ID845
Feature activation+1.13e-01

Pos logit contributions
ney+0.141
umpy+0.136
ard+0.135
oc+0.135
ess+0.134
Neg logit contributions
 the-0.107
 their-0.096
 an-0.095
 her-0.092
 these-0.092
Token stupid
Token ID8531
Feature activation+4.95e-02

Pos logit contributions
 the+0.225
 an+0.201
 a+0.192
 her+0.189
 their+0.187
Neg logit contributions
oc-0.292
ney-0.289
ard-0.278
umpy-0.276
ess-0.273
Token!"
Token ID2474
Feature activation-2.35e-01

Pos logit contributions
 the+0.132
 a+0.119
 an+0.118
 your+0.115
 it+0.110
Neg logit contributions
oc-0.108
ney-0.108
ushed-0.101
ouse-0.099
ari-0.098
Token Everyone
Token ID11075
Feature activation+2.38e-02

Pos logit contributions
oc+0.599
ney+0.597
ard+0.568
ess+0.560
ari+0.549
Neg logit contributions
 the-0.615
 an-0.562
 a-0.546
 your-0.545
 her-0.523
Token cl
Token ID537
Feature activation-1.62e-02

Pos logit contributions
 the+0.064
 an+0.057
 a+0.057
 your+0.055
 her+0.054
Neg logit contributions
ney-0.050
oc-0.049
ard-0.046
ess-0.046
ari-0.045
Tokenucked
Token ID17758
Feature activation-1.98e-01

Pos logit contributions
ney+0.034
oc+0.032
ard+0.031
board+0.031
jo+0.030
Neg logit contributions
 the-0.041
 an-0.037
 their-0.037
 a-0.037
 your-0.036
Token in
Token ID287
Feature activation+5.81e-02

Pos logit contributions
oc+0.365
ney+0.360
ard+0.338
ess+0.332
ushed+0.326
Neg logit contributions
 the-0.553
 an-0.511
 your-0.500
 a-0.499
 her-0.480
Token delight
Token ID10974
Feature activation+2.54e-02

Pos logit contributions
 the+0.123
 your+0.116
 an+0.115
 a+0.109
 anything+0.108
Neg logit contributions
oc-0.136
ney-0.127
ard-0.125
y-0.123
ushed-0.121
Token.
Token ID13
Feature activation-2.62e-01

Pos logit contributions
 the+0.071
 a+0.063
 an+0.063
 your+0.061
 their+0.060
Neg logit contributions
ney-0.054
oc-0.053
ari-0.048
ess-0.048
ard-0.048
Token
Token ID220
Feature activation+2.97e-04

Pos logit contributions
oc+0.654
ney+0.651
ard+0.620
ess+0.611
ushed+0.602
Neg logit contributions
 the-0.556
 an-0.502
 your-0.488
 a-0.486
 her-0.468
Token\n
Token ID198
Feature activation-2.84e-02

Pos logit contributions
 the+0.001
 an+0.001
 a+0.001
 your+0.001
 her+0.001
Neg logit contributions
ney-0.001
oc-0.001
ard-0.001
ess-0.001
ushed-0.001
Token\n
Token ID198
Feature activation-2.84e-02

Pos logit contributions
ney+0.070
oc+0.070
ard+0.066
ess+0.066
ushed+0.065
Neg logit contributions
 the-0.062
 an-0.056
 a-0.054
 your-0.054
 her-0.052
TokenThe
Token ID464
Feature activation+1.64e-01

Pos logit contributions
oc+0.074
ney+0.072
ard+0.069
umpy+0.068
ess+0.067
Neg logit contributions
 the-0.058
 an-0.052
 your-0.050
 a-0.050
 anything-0.049
Token chicken
Token ID9015
Feature activation+1.58e-02

Pos logit contributions
 the+0.348
 an+0.312
 your+0.300
 a+0.300
 her+0.287
Neg logit contributions
oc-0.419
ney-0.414
ard-0.397
ess-0.388
ushed-0.382
Token to
Token ID284
Feature activation+1.69e-01

Pos logit contributions
 the+0.175
 an+0.167
 anything+0.161
 your+0.160
 a+0.156
Neg logit contributions
oc-0.237
ney-0.228
ess-0.225
ard-0.224
y-0.217
Token it
Token ID340
Feature activation+4.40e-01

Pos logit contributions
 these+0.116
 such+0.114
 your+0.104
 an+0.103
 our+0.102
Neg logit contributions
oc-0.501
an-0.493
ard-0.478
y-0.472
ess-0.471
Token tightly
Token ID17707
Feature activation-1.06e-01

Pos logit contributions
 the+1.178
 a+1.061
 an+1.029
 her+1.015
 their+0.962
Neg logit contributions
ney-0.978
oc-0.958
ushed-0.940
ard-0.921
ari-0.921
Token.
Token ID13
Feature activation-2.63e-01

Pos logit contributions
ney+0.216
oc+0.216
ard+0.198
ushed+0.198
ari+0.197
Neg logit contributions
 the-0.305
 a-0.273
 an-0.272
 her-0.261
 your-0.258
Token They
Token ID1119
Feature activation-1.19e-01

Pos logit contributions
ney+0.546
oc+0.545
ess+0.524
board+0.517
uddle+0.516
Neg logit contributions
 the-0.610
 your-0.555
 an-0.552
 these-0.537
 a-0.533
Token started
Token ID2067
Feature activation-1.49e-01

Pos logit contributions
ney+0.292
oc+0.291
ard+0.274
ess+0.274
ushed+0.271
Neg logit contributions
 the-0.259
 an-0.234
 your-0.230
 a-0.227
 her-0.213
Token to
Token ID284
Feature activation+1.69e-01

Pos logit contributions
ney+0.378
y+0.374
oc+0.372
ess+0.358
ushed+0.356
Neg logit contributions
 the-0.250
 your-0.239
 an-0.231
 these-0.220
 my-0.216
Token pull
Token ID2834
Feature activation-1.24e-02

Pos logit contributions
 the+0.376
 an+0.346
 your+0.336
 a+0.333
 these+0.315
Neg logit contributions
oc-0.401
ney-0.397
ard-0.372
ess-0.370
ushed-0.362
Token and
Token ID290
Feature activation-2.70e-01

Pos logit contributions
oc+0.032
ney+0.031
y+0.030
ard+0.030
ess+0.029
Neg logit contributions
 the-0.023
 an-0.022
 your-0.021
 a-0.020
 these-0.020
Token tug
Token ID27762
Feature activation+2.05e-02

Pos logit contributions
oc+0.548
ney+0.530
ard+0.506
board+0.495
ushed+0.486
Neg logit contributions
 your-0.580
 the-0.579
 an-0.577
 such-0.540
 my-0.536
Token and
Token ID290
Feature activation-2.70e-01

Pos logit contributions
 the+0.052
 a+0.045
 an+0.045
 their+0.044
 her+0.043
Neg logit contributions
ney-0.051
ari-0.048
ess-0.047
oc-0.047
ouse-0.045
Token alive
Token ID6776
Feature activation+4.81e-02

Pos logit contributions
ney+0.068
oc+0.068
ess+0.065
umpy+0.063
an+0.063
Neg logit contributions
 the-0.047
 an-0.042
 these-0.041
 anything-0.039
 this-0.039
Token.
Token ID13
Feature activation-2.66e-01

Pos logit contributions
 the+0.136
 a+0.123
 an+0.123
 your+0.119
 her+0.116
Neg logit contributions
ney-0.094
oc-0.092
ard-0.089
ushed-0.087
ess-0.086
Token They
Token ID1119
Feature activation-1.18e-01

Pos logit contributions
ney+0.642
oc+0.639
ess+0.604
ard+0.601
ushed+0.593
Neg logit contributions
 the-0.601
 an-0.540
 a-0.526
 your-0.519
 her-0.497
Token said
Token ID531
Feature activation-1.58e-01

Pos logit contributions
ney+0.248
oc+0.247
ard+0.232
ess+0.230
ushed+0.226
Neg logit contributions
 the-0.305
 an-0.277
 a-0.272
 your-0.267
 her-0.259
Token Lily
Token ID20037
Feature activation+2.95e-01

Pos logit contributions
ess+0.361
y+0.360
oc+0.355
oke+0.349
anda+0.349
Neg logit contributions
 an-0.215
 such-0.210
 the-0.207
 your-0.203
 my-0.191
Token and
Token ID290
Feature activation-2.70e-01

Pos logit contributions
 the+0.834
 a+0.750
 an+0.744
 her+0.711
 your+0.710
Neg logit contributions
oc-0.589
ney-0.581
ard-0.552
ushed-0.530
uddle-0.530
Token her
Token ID607
Feature activation+4.43e-01

Pos logit contributions
oc+0.748
y+0.743
ari+0.740
ess+0.721
board+0.721
Neg logit contributions
 anything-0.305
 the-0.302
 an-0.299
 your-0.297
 such-0.279
Token parents
Token ID3397
Feature activation+6.13e-04

Pos logit contributions
 the+0.946
 a+0.819
 an+0.817
 your+0.787
 their+0.769
Neg logit contributions
ney-1.166
oc-1.166
ard-1.119
ess-1.097
ushed-1.079
Token had
Token ID550
Feature activation-2.23e-01

Pos logit contributions
 the+0.002
 a+0.001
 an+0.001
 their+0.001
 her+0.001
Neg logit contributions
ney-0.002
oc-0.001
uddle-0.001
ouse-0.001
ard-0.001
Token to
Token ID284
Feature activation+1.74e-01

Pos logit contributions
ney+0.571
ess+0.549
oc+0.535
ushed+0.529
ari+0.527
Neg logit contributions
 the-0.325
 these-0.301
 your-0.301
 anything-0.272
 this-0.268
Token leave
Token ID2666
Feature activation+2.11e-02

Pos logit contributions
 the+0.331
 an+0.290
 a+0.281
 your+0.276
 her+0.262
Neg logit contributions
ney-0.480
oc-0.479
ard-0.454
ess-0.454
ushed-0.445
Token do
Token ID466
Feature activation+1.39e-02

Pos logit contributions
 the+0.667
 a+0.589
 an+0.565
 her+0.533
 their+0.518
Neg logit contributions
oc-0.834
ney-0.818
jo-0.769
ess-0.759
ari-0.757
Token not
Token ID407
Feature activation+3.25e-01

Pos logit contributions
 the+0.016
 an+0.014
 your+0.014
 these+0.014
 such+0.013
Neg logit contributions
oc-0.041
ney-0.040
ushed-0.040
ouse-0.039
ess-0.039
Token want
Token ID765
Feature activation-1.80e-01

Pos logit contributions
 the+0.658
 a+0.570
 an+0.560
 your+0.545
 her+0.533
Neg logit contributions
oc-0.913
ney-0.901
ard-0.862
ushed-0.861
uddle-0.852
Token to
Token ID284
Feature activation+1.76e-01

Pos logit contributions
board+0.321
y+0.297
ney+0.291
ess+0.291
uddle+0.279
Neg logit contributions
nam-0.328
 your-0.319
 these-0.311
 the-0.311
 an-0.307
Token stop
Token ID2245
Feature activation+7.05e-02

Pos logit contributions
 the+0.330
 an+0.287
 a+0.280
 your+0.274
 her+0.260
Neg logit contributions
ney-0.498
oc-0.496
ard-0.473
ess-0.470
ushed-0.465
Token.
Token ID13
Feature activation-2.73e-01

Pos logit contributions
 the+0.142
 an+0.134
 a+0.127
 your+0.126
 these+0.122
Neg logit contributions
oc-0.167
ney-0.166
ard-0.162
y-0.158
ess-0.156
Token They
Token ID1119
Feature activation-1.19e-01

Pos logit contributions
ney+0.570
oc+0.565
ard+0.532
ess+0.528
ushed+0.515
Neg logit contributions
 the-0.699
 an-0.637
 a-0.621
 your-0.619
 her-0.596
Token like
Token ID588
Feature activation+1.61e-02

Pos logit contributions
ney+0.327
oc+0.323
ard+0.306
ess+0.305
ushed+0.304
Neg logit contributions
 the-0.247
 an-0.215
 a-0.213
 your-0.204
 her-0.198
Token their
Token ID511
Feature activation+4.49e-01

Pos logit contributions
 an+0.023
 such+0.023
 these+0.022
 your+0.022
 the+0.021
Neg logit contributions
y-0.037
ney-0.036
board-0.036
uddle-0.035
jo-0.034
Token band
Token ID4097
Feature activation+1.21e-02

Pos logit contributions
 the+0.913
 an+0.825
 your+0.789
 a+0.787
 her+0.751
Neg logit contributions
ney-1.150
oc-1.116
uddle-1.093
ess-1.080
ard-1.075
Token.
Token ID13
Feature activation-2.73e-01

Pos logit contributions
 the+0.033
 a+0.030
 their+0.029
 an+0.028
 your+0.028
Neg logit contributions
ney-0.024
oc-0.024
ard-0.023
int-0.022
ushed-0.022
Token the
Token ID262
Feature activation+5.48e-01

Pos logit contributions
 the+0.199
 an+0.191
 anything+0.175
 such+0.175
 these+0.171
Neg logit contributions
ney-0.338
oc-0.333
ess-0.319
ard-0.316
y-0.315
Token pige
Token ID32372
Feature activation-9.98e-02

Pos logit contributions
 the+1.366
 an+1.234
 a+1.212
 your+1.194
 her+1.149
Neg logit contributions
ney-1.196
oc-1.192
ard-1.119
ess-1.113
ushed-1.097
Tokenons
Token ID684
Feature activation-1.13e-01

Pos logit contributions
ney+0.235
oc+0.226
ess+0.225
ushed+0.220
ard+0.218
Neg logit contributions
 the-0.240
 an-0.208
 a-0.208
 your-0.204
 her-0.197
Token and
Token ID290
Feature activation-2.64e-01

Pos logit contributions
oc+0.241
ney+0.238
ushed+0.222
ess+0.220
ard+0.218
Neg logit contributions
 the-0.305
 a-0.272
 an-0.270
 your-0.262
 her-0.258
Token sp
Token ID599
Feature activation-6.69e-02

Pos logit contributions
oc+0.661
ney+0.636
ess+0.611
ard+0.604
board+0.595
Neg logit contributions
 the-0.479
 an-0.449
 your-0.420
 a-0.416
 such-0.406
Tokenarrow
Token ID6018
Feature activation-5.45e-01

Pos logit contributions
oc+0.112
ney+0.108
ard+0.097
ess+0.094
ari+0.091
Neg logit contributions
 the-0.204
 an-0.189
 a-0.186
 your-0.185
 her-0.181
Tokens
Token ID82
Feature activation-4.50e-01

Pos logit contributions
ney+1.011
oc+0.980
ard+0.896
ess+0.895
ushed+0.889
Neg logit contributions
 the-1.630
 a-1.451
 an-1.443
 your-1.409
 her-1.407
Token.
Token ID13
Feature activation-2.54e-01

Pos logit contributions
oc+0.956
ney+0.925
ard+0.892
ess+0.855
ari+0.848
Neg logit contributions
 the-1.224
 a-1.099
 an-1.098
 her-1.044
 your-1.043
Token He
Token ID679
Feature activation-1.53e-01

Pos logit contributions
oc+0.639
ney+0.633
ard+0.602
ess+0.598
ari+0.583
Neg logit contributions
 the-0.540
 an-0.482
 a-0.469
 your-0.463
 her-0.443
Token liked
Token ID8288
Feature activation-2.81e-01

Pos logit contributions
ney+0.328
oc+0.321
ard+0.306
ess+0.306
ari+0.298
Neg logit contributions
 the-0.399
 an-0.360
 a-0.356
 your-0.347
 her-0.336
Token to
Token ID284
Feature activation+1.78e-01

Pos logit contributions
y+0.752
oc+0.723
ney+0.723
ard+0.718
board+0.699
Neg logit contributions
 an-0.309
 the-0.298
 your-0.294
 such-0.273
 our-0.268
Token lay
Token ID3830
Feature activation-7.80e-02

Pos logit contributions
 the+0.665
 an+0.591
 a+0.588
 your+0.563
 her+0.558
Neg logit contributions
ney-0.603
oc-0.595
ess-0.562
ard-0.560
ushed-0.549
Token there
Token ID612
Feature activation+2.33e-01

Pos logit contributions
ney+0.196
y+0.192
uddle+0.191
ushed+0.190
oc+0.189
Neg logit contributions
 the-0.143
 an-0.134
 your-0.128
 a-0.127
 these-0.122
Token,
Token ID11
Feature activation-1.82e-01

Pos logit contributions
 the+0.605
 an+0.553
 a+0.541
 your+0.540
 her+0.513
Neg logit contributions
ney-0.472
oc-0.467
ard-0.442
ess-0.439
ushed-0.430
Token W
Token ID370
Feature activation+1.48e-01

Pos logit contributions
y+0.491
ess+0.481
ard+0.481
oc+0.478
mer+0.475
Neg logit contributions
 your-0.186
 my-0.168
 our-0.151
 the-0.142
 such-0.141
Tokenigg
Token ID6950
Feature activation-3.34e-01

Pos logit contributions
 the+0.390
 an+0.368
 a+0.363
 your+0.361
 this+0.334
Neg logit contributions
oc-0.280
ney-0.250
ard-0.223
yt-0.221
ari-0.221
Tokenly
Token ID306
Feature activation-4.87e-01

Pos logit contributions
oc+0.355
ney+0.355
ushed+0.275
ess+0.275
ari+0.274
Neg logit contributions
 the-1.257
 a-1.146
 an-1.144
 your-1.134
 her-1.116
Token said
Token ID531
Feature activation-1.50e-01

Pos logit contributions
ney+1.054
oc+1.049
ess+0.981
ari+0.952
ushed+0.952
Neg logit contributions
 the-1.281
 an-1.148
 a-1.147
 your-1.087
 her-1.080
Token,
Token ID11
Feature activation-1.82e-01

Pos logit contributions
ess+0.283
oc+0.277
ouse+0.275
y+0.273
board+0.269
Neg logit contributions
 the-0.325
 your-0.301
 an-0.293
 our-0.292
 these-0.290
Token "
Token ID366
Feature activation+4.48e-02

Pos logit contributions
oc+0.401
ess+0.391
ard+0.390
ney+0.388
y+0.372
Neg logit contributions
 the-0.378
 your-0.357
 an-0.346
 anything-0.341
 these-0.336
TokenBug
Token ID25624
Feature activation-1.18e-01

Pos logit contributions
 the+0.092
 an+0.081
 a+0.080
 your+0.077
 her+0.075
Neg logit contributions
ney-0.118
oc-0.118
ard-0.111
uddle-0.111
ess-0.111
Tokengy
Token ID1360
Feature activation-3.76e-01

Pos logit contributions
ney+0.249
oc+0.244
ess+0.229
ard+0.228
ushed+0.227
Neg logit contributions
 the-0.316
 an-0.279
 a-0.279
 your-0.275
 her-0.262
Token It
Token ID632
Feature activation+6.23e-02

Pos logit contributions
oc+0.742
ney+0.740
ess+0.710
ard+0.707
ushed+0.687
Neg logit contributions
 the-0.436
 your-0.381
 an-0.380
 a-0.366
 her-0.352
Token was
Token ID373
Feature activation-2.31e-01

Pos logit contributions
 the+0.178
 an+0.162
 a+0.160
 your+0.157
 her+0.153
Neg logit contributions
ney-0.117
oc-0.115
ard-0.108
ess-0.106
ushed-0.105
Token a
Token ID257
Feature activation+4.80e-01

Pos logit contributions
ney+0.767
y+0.762
ess+0.735
oc+0.733
jo+0.733
Neg logit contributions
 these-0.139
 your-0.123
 the-0.115
 anything-0.106
 our-0.104
Token tiny
Token ID7009
Feature activation+4.83e-02

Pos logit contributions
 the+0.968
 an+0.848
 a+0.836
 your+0.814
 her+0.773
Neg logit contributions
ney-1.284
oc-1.277
ard-1.219
ess-1.210
ushed-1.202
Token three
Token ID1115
Feature activation+1.31e-01

Pos logit contributions
 the+0.114
 a+0.102
 an+0.098
 your+0.094
 her+0.092
Neg logit contributions
ney-0.118
ushed-0.115
oc-0.113
ides-0.111
ess-0.109
Token year
Token ID614
Feature activation-4.16e-01

Pos logit contributions
 the+0.402
 a+0.362
 her+0.348
 an+0.346
 it+0.342
Neg logit contributions
ney-0.231
ushed-0.215
ess-0.207
ouse-0.205
ides-0.205
Token old
Token ID1468
Feature activation+2.20e-02

Pos logit contributions
oc+1.261
ney+1.243
ushed+1.199
ouse+1.177
ess+1.170
Neg logit contributions
 the-0.737
 an-0.609
 a-0.607
 your-0.593
 her-0.554
Token girl
Token ID2576
Feature activation-2.76e-01

Pos logit contributions
 the+0.058
 a+0.051
 an+0.049
 her+0.047
 their+0.046
Neg logit contributions
ney-0.050
oc-0.049
ushed-0.047
ides-0.047
ouse-0.047
Token.
Token ID13
Feature activation-2.59e-01

Pos logit contributions
ney+0.653
oc+0.619
uddle+0.610
umpy+0.605
ushed+0.603
Neg logit contributions
 the-0.703
 a-0.622
 an-0.607
 her-0.590
 your-0.578
Token She
Token ID1375
Feature activation-6.95e-02

Pos logit contributions
ney+0.650
oc+0.649
ess+0.616
ard+0.615
ushed+0.600
Neg logit contributions
 the-0.550
 an-0.493
 a-0.481
 your-0.478
 their-0.451
Token said
Token ID531
Feature activation-1.58e-01

Pos logit contributions
ney+0.155
oc+0.153
ard+0.145
uddle+0.142
ess+0.141
Neg logit contributions
 the-0.181
 a-0.161
 an-0.160
 her-0.152
 your-0.150
Token he
Token ID339
Feature activation+1.70e-01

Pos logit contributions
oc+0.273
ard+0.267
ney+0.264
ess+0.256
y+0.256
Neg logit contributions
 the-0.166
 an-0.156
 your-0.150
 a-0.145
 her-0.139
Token remembered
Token ID12086
Feature activation-1.73e-01

Pos logit contributions
 the+0.417
 a+0.365
 an+0.362
 her+0.340
 their+0.333
Neg logit contributions
ney-0.397
oc-0.394
ess-0.389
ari-0.370
ard-0.369
Token how
Token ID703
Feature activation+2.92e-01

Pos logit contributions
oc+0.515
ney+0.511
y+0.506
ard+0.499
ry+0.493
Neg logit contributions
 the-0.145
 an-0.127
 your-0.127
 these-0.115
 our-0.114
Token his
Token ID465
Feature activation+4.04e-01

Pos logit contributions
 the+0.424
 an+0.387
 these+0.368
 such+0.359
 a+0.357
Neg logit contributions
ney-0.799
ouse-0.775
uddle-0.772
ushed-0.770
oc-0.768
Token j
Token ID474
Feature activation+3.85e-02

Pos logit contributions
 the+0.899
 an+0.798
 a+0.784
 your+0.765
 her+0.739
Neg logit contributions
oc-1.018
ney-1.018
ess-0.958
ard-0.955
ushed-0.939
Tokenolly
Token ID5098
Feature activation-4.12e-01

Pos logit contributions
 the+0.122
 your+0.112
 an+0.110
 her+0.110
 a+0.109
Neg logit contributions
oc-0.060
ney-0.058
ess-0.052
aron-0.049
ard-0.047
Token day
Token ID1110
Feature activation-1.05e-01

Pos logit contributions
oc+1.021
ney+1.018
ess+0.961
ari+0.951
ushed+0.933
Neg logit contributions
 the-0.964
 a-0.855
 an-0.850
 her-0.811
 your-0.805
Token of
Token ID286
Feature activation-1.20e-01

Pos logit contributions
oc+0.211
ney+0.210
ess+0.195
ard+0.194
ushed+0.191
Neg logit contributions
 the-0.285
 an-0.258
 a-0.255
 your-0.248
 her-0.243
Token playing
Token ID2712
Feature activation+1.79e-01

Pos logit contributions
y+0.312
uddle+0.312
oc+0.310
ney+0.310
board+0.308
Neg logit contributions
 the-0.144
 an-0.140
 your-0.133
 anything-0.126
 our-0.126
Token almost
Token ID2048
Feature activation+9.95e-02

Pos logit contributions
 the+0.372
 an+0.334
 a+0.322
 your+0.321
 her+0.303
Neg logit contributions
oc-0.451
ney-0.451
ard-0.430
board-0.418
ess-0.418
Token delayed
Token ID11038
Feature activation-4.35e-02

Pos logit contributions
 the+0.172
 an+0.154
 your+0.147
 a+0.146
 her+0.136
Neg logit contributions
ney-0.272
oc-0.265
ard-0.264
y-0.258
ess-0.256
Token her
Token ID607
Feature activation+4.44e-01

Pos logit contributions
ess+0.415
y+0.395
ney+0.385
jo+0.374
ys+0.370
Neg logit contributions
 your-0.127
 the-0.126
 our-0.116
 an-0.115
 these-0.114
Token back
Token ID736
Feature activation+6.63e-02

Pos logit contributions
 the+0.883
 an+0.776
 a+0.759
 your+0.739
 her+0.707
Neg logit contributions
ney-1.205
oc-1.204
ard-1.145
ess-1.134
ushed-1.123
Token home
Token ID1363
Feature activation+7.72e-02

Pos logit contributions
 the+0.179
 a+0.160
 an+0.159
 your+0.153
 her+0.153
Neg logit contributions
ney-0.141
oc-0.140
ard-0.136
ushed-0.132
ouse-0.129
Token.
Token ID13
Feature activation-2.61e-01

Pos logit contributions
 the+0.216
 a+0.193
 an+0.192
 her+0.185
 their+0.181
Neg logit contributions
oc-0.166
ney-0.158
ard-0.153
an-0.146
ushed-0.143
Token Mill
Token ID9212
Feature activation+2.62e-02

Pos logit contributions
ney+0.664
oc+0.661
ess+0.627
ard+0.626
ushed+0.614
Neg logit contributions
 the-0.552
 an-0.491
 a-0.479
 your-0.475
 her-0.450
Tokenie
Token ID494
Feature activation-4.15e-01

Pos logit contributions
 the+0.080
 her+0.072
 your+0.068
 a+0.068
 an+0.067
Neg logit contributions
oc-0.050
ney-0.046
uddle-0.044
ouse-0.044
ushed-0.044
Token was
Token ID373
Feature activation-2.31e-01

Pos logit contributions
oc+0.876
ney+0.861
ard+0.815
ess+0.806
ari+0.792
Neg logit contributions
 the-1.129
 a-1.011
 an-1.009
 her-0.953
 your-0.947
Token embarrassed
Token ID21100
Feature activation+8.29e-03

Pos logit contributions
ney+0.520
umpy+0.470
ivering+0.466
ess+0.455
y+0.445
Neg logit contributions
 these-0.426
 your-0.410
 anything-0.398
 the-0.397
 our-0.396
Token because
Token ID780
Feature activation-7.11e-02

Pos logit contributions
 the+0.021
 an+0.019
 a+0.019
 your+0.018
 her+0.018
Neg logit contributions
ney-0.019
oc-0.019
ard-0.018
ushed-0.017
ess-0.017
Token she
Token ID673
Feature activation+2.05e-01

Pos logit contributions
ney+0.172
ess+0.166
jo+0.165
oc+0.164
y+0.162
Neg logit contributions
 your-0.095
 our-0.094
 anything-0.094
 an-0.092
 the-0.092
Token had
Token ID550
Feature activation-2.19e-01

Pos logit contributions
 the+0.501
 a+0.451
 an+0.445
 her+0.423
 your+0.415
Neg logit contributions
oc-0.490
ney-0.472
ard-0.457
ari-0.455
ess-0.436