TOP ACTIVATIONS
MAX = 4.535

mother said , gently . Molly 's eyes
" Alice stared . She had never
meeting darling . " Alice asked ,
? " Jimmy was even more embarrassed .
! " Alice 's dad
be fair to everyone . Lucy and Charlie agreed , and
son was happy again and he promised to be good so
down on their knees . Chloe suggested they play a game
" Anna smiled and said yes !
and things to explore . Thomas leaned next to her and
rely on each other ". Tina hugged her back and knew
" Alice wanted to go too ,
umpy old man inside . Molly smiled and promised to never
s me ! " Lucy was very curious and asked
an answer . " Daisy thought about it for a
She was so excited and she learned something new .
the books he lf ". Toby agreed and they safely returned
questions ". El iza understood the importance of taking
and be depend able . Molly sighed and reluctantly agreed .
said . Emily was very scared but she

MIN ACTIVATIONS
MIN = -5.627

. She wanted to play a game with her friends .
the dog wanted to play a game with his owner .
. He wanted to play a game but he did not
both went home after a great day . Every day ,
and Sue wanted to make a cake for Mom 's birthday
Bob o loved to play a sport called ball . He
. Spot wanted to play a fun game . He had
soon became a blur ! The mixer was spinning at top
, Anna wanted to play a new game . She called
. They want to play a game . They think of
Hop wanted to play a game with the ball .
, Rose wanted to play a game with Max Outside .
. He wanted to play a game with his friend ,
. They wanted to play a game . They were both
. They wanted to play a game . They saw some
and Lily wanted to make a snow man . They put
around it very quickly . The bunny was moving so fast
the snow . They make a big snow man with a
, Lucy wanted to play a game . She found a
. He wanted to play a game . So he brought

INTERVAL 1.995 - 4.535
CONTAINS 12.304%

car engine is not working and needs to be fixed ,
and Jack were very happy . They shared the treats and
?" Tom asked . The owl nodded and said , "
a regular cup of coffee . When Tim asked for one
it 's so cool !" " Yes ,

INTERVAL -0.546 - 1.995
CONTAINS 49.540%

lunch . Today , Mary 's mom had brought
was a silly yellow bird . He just loved to perform
grabbed her friends . Together they formed a plan .
The crab felt so much better now that he was with
. They hoped he was happy and safe . They learned

INTERVAL -3.086 - -0.546
CONTAINS 34.988%

But then , it started to get gloomy outside and it
She knows he did not mean to be bad . She
families about their wonderful day at the new park . <|endoftext|>
. " I don 't believe I can do it ."
, and so she asked her mom if she could have

INTERVAL -5.627 - -3.086
CONTAINS 3.168%

upon a time there was a happy boy named Sam .
She loved to play outside in the green grass and pick
and they went back to the store to exchange it for
park . They see a big slide . Sara wants to
even came to check on the bird and they became friends
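
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The dump does not state the binning rule, so the following is only a minimal sketch under that assumption; the function name `interval_edges` is hypothetical. It recomputes the bin edges from the printed extremes and checks that they agree with the printed boundaries to within the three-decimal rounding, and that the CONTAINS shares add up to 100%.

```python
# Assumption: the dashboard splits [MIN, MAX] into equal-width bins.
# This rule is inferred from the printed numbers, not stated in the dump.

def interval_edges(lo: float, hi: float, n_bins: int) -> list[float]:
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# Extremes as printed under MIN ACTIVATIONS and TOP ACTIVATIONS.
edges = interval_edges(-5.627, 4.535, 4)

# Boundaries and shares as printed in the INTERVAL / CONTAINS lines.
printed = [-5.627, -3.086, -0.546, 1.995, 4.535]
shares = [12.304, 49.540, 34.988, 3.168]

# Largest disagreement between recomputed and printed edges.
max_gap = max(abs(e - p) for e, p in zip(edges, printed))
total_share = sum(shares)

print(f"bin width: {(4.535 - -5.627) / 4:.4f}")
print(f"max gap to printed edges: {max_gap:.4f}")
print(f"total share: {total_share:.3f}%")
```

Under this reading, roughly half of all activations fall in the bin just below 2.0, while the lowest bin holds only a few percent of them.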
Token mother
Token ID2802
Feature activation+3.68e+00

Pos logit contributions
olded+4.217
 agreed+4.066
ave+3.956
 suggested+3.953
 reached+3.893
Neg logit contributions
 new-1.744
 job-1.732
 big-1.604
 gift-1.591
 things-1.532
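
Each per-token block in this dump follows the same layout: a `Token` line, a `Token ID` line, a `Feature activation` line, then five positive and five negative logit contributions with the value fused onto the token text. As a hedged illustration, the sketch below parses one such block into a small record. The names `TokenBlock` and `parse_block` are hypothetical and not part of whatever tool produced the dump; the sample is copied from the first block above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TokenBlock:
    token: str          # token text, leading space preserved
    token_id: int
    activation: float
    pos: list = field(default_factory=list)  # (token, contribution) pairs
    neg: list = field(default_factory=list)


# A contribution line is token text immediately followed by a signed decimal.
_CONTRIB = re.compile(r"^(.*?)([+-]\d+\.\d+)$")


def parse_block(lines):
    """Parse one 'Token ...' block as laid out in this dump."""
    block = TokenBlock(
        token=lines[0][len("Token"):],
        token_id=int(lines[1][len("Token ID"):]),
        activation=float(lines[2][len("Feature activation"):]),
    )
    target = None
    for line in lines[3:]:
        if line.startswith("Pos logit contributions"):
            target = block.pos
        elif line.startswith("Neg logit contributions"):
            target = block.neg
        elif line.strip() and target is not None:
            m = _CONTRIB.match(line)
            if m:
                target.append((m.group(1), float(m.group(2))))
    return block


sample = [
    "Token mother",
    "Token ID2802",
    "Feature activation+3.68e+00",
    "",
    "Pos logit contributions",
    "olded+4.217",
    " agreed+4.066",
    "ave+3.956",
    " suggested+3.953",
    " reached+3.893",
    "Neg logit contributions",
    " new-1.744",
    " job-1.732",
    " big-1.604",
    " gift-1.591",
    " things-1.532",
]

block = parse_block(sample)
print(block.token_id, block.activation, block.pos[0], block.neg[-1])
```

Keeping the leading space in `token` matters here, because the dump distinguishes tokens such as `Alice` (ID 44484) from ` Alice` (ID 14862).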
Token said
Token ID531
Feature activation+1.71e+00

Pos logit contributions
olded+7.656
ave+7.196
 From+7.015
oped+6.581
 Did+6.551
Neg logit contributions
 red-7.367
 new-7.343
 big-7.312
 gift-6.993
 cake-6.909
Token,
Token ID11
Feature activation+2.85e+00

Pos logit contributions
olded+4.723
ave+4.597
 From+4.409
os+4.365
 agreed+4.364
Neg logit contributions
 new-1.815
 red-1.740
 things-1.688
 thing-1.642
 big-1.601
Token gently
Token ID15165
Feature activation+3.21e+00

Pos logit contributions
ave+6.337
olded+6.258
 agreed+5.704
oped+5.660
 apologised+5.644
Neg logit contributions
 red-5.222
 new-5.139
 big-4.973
 things-4.727
 long-4.678
Token.
Token ID13
Feature activation+3.09e+00

Pos logit contributions
olded+9.015
ave+8.725
 From+8.684
Well+8.557
os+8.388
Neg logit contributions
 new-3.636
 red-3.621
 big-3.352
 gift-3.348
 things-3.258
Token Molly
Token ID30236
Feature activation+4.52e+00

Pos logit contributions
ave+6.159
olded+5.926
 apologised+5.876
 managed+5.767
 agreed+5.683
Neg logit contributions
 new-6.297
 red-6.083
 big-5.638
 TV-5.548
 long-5.472
Tokenâ
Token ID22940
Feature activation+2.00e+00

Pos logit contributions
olded+12.283
 Did+11.203
oped+10.937
ave+10.714
os+10.615
Neg logit contributions
 big-8.346
 red-7.732
 new-7.387
 toy-7.305
 shiny-6.988
Token
Token ID26391
Feature activation+2.66e+00

Pos logit contributions
olded+3.947
 could+3.926
 couldn+3.780
 hadn+3.530
 suggested+3.514
Neg logit contributions
 job-3.413
 sunny-3.096
 thing-3.075
 gift-3.007
 task-3.002
Token
Token ID8151
Feature activation+2.32e+00

Pos logit contributions
olded+5.274
 agreed+4.969
 apologised+4.912
ave+4.905
 suggested+4.777
Neg logit contributions
 job-4.914
 things-4.792
 big-4.711
 new-4.679
 red-4.534
Tokens
Token ID82
Feature activation+1.99e+00

Pos logit contributions
olded+4.700
 apologised+4.412
 thanked+4.278
 offered+4.255
ave+4.195
Neg logit contributions
 new-3.996
 big-3.765
 red-3.641
 lunch-3.555
 end-3.444
Token eyes
Token ID2951
Feature activation+1.92e+00

Pos logit contributions
olded+5.596
 agreed+5.360
 suggested+5.309
ave+5.201
oped+5.196
Neg logit contributions
 new-2.064
 big-2.051
 gift-2.002
 things-1.946
 red-1.921
Tokenâ
Token ID22940
Feature activation+1.37e+00

Pos logit contributions
olded+6.556
ave+6.381
 forgot+5.799
 managed+5.798
 apologised+5.775
Neg logit contributions
 new-5.866
 TV-5.411
 red-5.312
 big-5.279
 sunny-5.233
Token
Token ID26391
Feature activation+2.42e+00

Pos logit contributions
olded+2.627
 could+2.459
 couldn+2.313
 suggested+2.259
 reached+2.251
Neg logit contributions
 job-2.383
 thing-2.186
 sunny-2.183
 task-2.141
 gift-2.139
Token
Token ID220
Feature activation+2.65e+00

Pos logit contributions
olded+4.824
ave+4.297
 agreed+4.295
 apologised+4.217
 apologized+3.984
Neg logit contributions
 things-4.522
 big-4.521
 job-4.488
 new-4.472
 red-4.382
Token
Token ID198
Feature activation+2.21e+00

Pos logit contributions
 apologised+5.187
olded+5.186
 agreed+5.169
ave+4.894
 stared+4.893
Neg logit contributions
 new-5.370
 big-5.009
 red-4.558
 things-4.554
 pretty-4.370
Token
Token ID198
Feature activation+1.74e+00

Pos logit contributions
olded+4.818
ave+4.273
 wouldn+4.202
 apologised+4.122
 agreed+4.119
Neg logit contributions
 new-4.046
 big-3.957
 sunny-3.618
 long-3.470
 red-3.460
TokenAlice
Token ID44484
Feature activation+4.52e+00

Pos logit contributions
olded+3.936
owed+3.439
 couldn+3.366
ave+3.323
 could+3.319
Neg logit contributions
 sunny-2.780
 new-2.660
 big-2.613
 rainy-2.542
 thing-2.498
Token stared
Token ID18484
Feature activation-9.14e-02

Pos logit contributions
olded+14.675
 Did+13.157
ver+12.980
oped+12.786
anced+12.679
Neg logit contributions
 big-5.891
 red-5.660
 new-4.980
 toy-4.885
 black-4.843
Token.
Token ID13
Feature activation+2.68e+00

Pos logit contributions
 new+0.115
 red+0.112
 big+0.104
 fake+0.101
 thing+0.100
Neg logit contributions
olded-0.225
 agreed-0.205
 could-0.205
 suggested-0.205
 hadn-0.203
Token She
Token ID1375
Feature activation+3.85e+00

Pos logit contributions
olded+5.031
 agreed+4.673
 apologised+4.569
 managed+4.512
ave+4.504
Neg logit contributions
 new-5.924
 red-5.832
 pink-5.448
 TV-5.411
 big-5.407
Token had
Token ID550
Feature activation+2.42e+00

Pos logit contributions
olded+8.684
ver+7.289
oped+7.286
os+7.194
anced+7.009
Neg logit contributions
 big-8.855
 red-8.528
 new-8.479
 shiny-8.405
 pretty-8.317
Token never
Token ID1239
Feature activation+1.19e+00

Pos logit contributions
olded+7.421
oped+7.140
owed+6.677
os+6.613
ave+6.481
Neg logit contributions
 new-2.737
 big-2.733
 red-2.568
 fake-2.553
 pretty-2.442
Token meeting
Token ID3249
Feature activation+1.50e+00

Pos logit contributions
 new+1.904
 gift+1.871
 red+1.856
 thing+1.746
 food+1.721
Neg logit contributions
olded-6.212
 agreed-5.788
 could-5.788
 suggested-5.699
 From-5.571
Token darling
Token ID40003
Feature activation+1.61e+00

Pos logit contributions
olded+3.982
ave+3.707
 apologised+3.703
oped+3.654
 hadn+3.621
Neg logit contributions
 new-1.849
 things-1.694
 red-1.650
 gift-1.635
 big-1.626
Token.
Token ID13
Feature activation+1.97e+00

Pos logit contributions
olded+4.480
ave+4.177
 apologised+4.071
 From+4.024
 agreed+3.994
Neg logit contributions
 new-1.689
 job-1.645
 gift-1.640
 thing-1.620
 red-1.613
Tokenâ
Token ID22940
Feature activation+9.67e-01

Pos logit contributions
olded+3.457
 managed+3.355
oped+3.277
 apologised+3.243
 couldn+3.220
Neg logit contributions
 new-3.931
 TV-3.879
 red-3.790
 birthday-3.690
 gift-3.686
Token
Token ID26391
Feature activation+1.20e+00

Pos logit contributions
olded+1.725
 could+1.627
 couldn+1.598
 hadn+1.519
 suggested+1.513
Neg logit contributions
 job-1.738
 gift-1.657
 sunny-1.632
 thing-1.625
 new-1.572
Token Alice
Token ID14862
Feature activation+4.52e+00

Pos logit contributions
olded+2.134
 apologised+1.899
 agreed+1.834
 suggested+1.788
ave+1.765
Neg logit contributions
 job-2.240
 new-2.236
 big-2.191
 red-2.181
 things-2.178
Token asked
Token ID1965
Feature activation+6.50e-01

Pos logit contributions
olded+13.949
ave+12.360
alled+12.315
 apologised+12.283
Well+12.278
Neg logit contributions
 big-7.328
 red-6.722
 new-6.546
 pretty-5.961
 pink-5.952
Token,
Token ID11
Feature activation+2.11e+00

Pos logit contributions
olded+1.636
 agreed+1.503
 hadn+1.502
ave+1.489
 apologised+1.481
Neg logit contributions
 new-0.855
 things-0.775
 gift-0.758
 big-0.746
 red-0.744
Token
Token ID6184
Feature activation-9.57e-01

Pos logit contributions
olded+4.627
ave+4.432
 apologised+4.195
oped+4.040
 agreed+3.935
Neg logit contributions
 new-3.845
 red-3.671
 big-3.594
 gift-3.448
 real-3.380
Token
Token ID95
Feature activation+6.36e-01

Pos logit contributions
 new+1.682
 gift+1.666
 red+1.661
 job+1.660
 thing+1.618
Neg logit contributions
olded-1.658
 could-1.507
 agreed-1.501
 suggested-1.480
 couldn-1.432
Token
Token ID26391
Feature activation+1.04e+00

Pos logit contributions
olded+1.059
 could+0.985
 couldn+0.977
 suggested+0.961
 agreed+0.952
Neg logit contributions
 job-1.180
 gift-1.146
 new-1.140
 thing-1.123
 red-1.111
Token?
Token ID30
Feature activation+2.40e+00

Pos logit contributions
olded+4.392
ave+3.945
 apologised+3.915
 hadn+3.913
oped+3.897
Neg logit contributions
 red-1.823
 new-1.784
 sunny-1.703
 big-1.685
 thing-1.657
Tokenâ
Token ID22940
Feature activation+7.17e-01

Pos logit contributions
olded+4.367
ave+4.276
 apologised+4.205
 agreed+4.108
 managed+4.097
Neg logit contributions
 red-4.434
 new-4.362
 TV-4.331
 gift-4.247
 big-4.125
Token
Token ID26391
Feature activation+2.04e+00

Pos logit contributions
olded+1.330
 could+1.260
 couldn+1.198
 hadn+1.162
 suggested+1.137
Neg logit contributions
 job-1.256
 thing-1.197
 gift-1.182
 sunny-1.177
 red-1.156
Token
Token ID198
Feature activation+1.92e+00

Pos logit contributions
olded+4.302
 apologised+3.851
ave+3.800
 agreed+3.696
 apologized+3.616
Neg logit contributions
 red-3.516
 big-3.505
 job-3.464
 new-3.393
 view-3.324
Token
Token ID198
Feature activation+1.69e+00

Pos logit contributions
olded+3.819
 couldn+3.352
 wouldn+3.351
ave+3.341
 apologised+3.335
Neg logit contributions
 new-3.653
 big-3.578
 sunny-3.381
 red-3.330
 long-3.236
TokenJimmy
Token ID40335
Feature activation+4.52e+00

Pos logit contributions
olded+3.623
 couldn+3.413
 could+3.310
owed+3.290
 wouldn+3.237
Neg logit contributions
 sunny-2.691
 new-2.613
 big-2.557
 red-2.497
 nice-2.481
Token was
Token ID373
Feature activation+2.00e+00

Pos logit contributions
olded+13.365
ave+11.853
oped+11.751
 Did+11.672
alled+11.633
Neg logit contributions
 red-7.295
 big-7.206
 new-6.523
 thing-6.246
 black-6.218
Token even
Token ID772
Feature activation+4.81e-01

Pos logit contributions
olded+5.682
oped+5.418
ave+5.342
os+5.330
 could+5.319
Neg logit contributions
 big-2.115
 red-2.058
 new-2.031
 fake-1.918
 nice-1.897
Token more
Token ID517
Feature activation+8.58e-01

Pos logit contributions
olded+1.411
 From+1.339
oped+1.331
 could+1.312
 hadn+1.308
Neg logit contributions
 new-0.356
 red-0.320
 fake-0.301
 big-0.286
 sunny-0.276
Token embarrassed
Token ID21100
Feature activation+1.20e+00

Pos logit contributions
olded+2.263
oped+2.116
os+2.089
 could+2.079
ave+2.073
Neg logit contributions
 red-0.972
 new-0.956
 big-0.933
 shiny-0.901
 fake-0.893
Token.
Token ID13
Feature activation+2.81e+00

Pos logit contributions
olded+3.375
oped+3.102
 From+3.070
 agreed+3.063
ave+3.058
Neg logit contributions
 things-1.243
 thing-1.202
 new-1.200
 red-1.163
 food-1.074
Token!
Token ID0
Feature activation+2.72e+00

Pos logit contributions
olded+5.647
 From+5.007
 apologised+4.995
ave+4.987
 apologized+4.872
Neg logit contributions
 things-2.384
 red-2.356
 new-2.329
 thing-2.222
 food-2.109
Tokenâ
Token ID22940
Feature activation+1.59e+00

Pos logit contributions
olded+5.257
ave+5.067
 apologised+4.896
 agreed+4.831
 managed+4.811
Neg logit contributions
 new-5.277
 red-5.245
 TV-4.951
 sun-4.945
 sunny-4.930
Token
Token ID26391
Feature activation+2.59e+00

Pos logit contributions
olded+3.195
 could+3.052
 couldn+2.907
 suggested+2.845
 hadn+2.757
Neg logit contributions
 job-2.661
 sunny-2.506
 thing-2.458
 red-2.345
 gift-2.337
Token
Token ID198
Feature activation+2.09e+00

Pos logit contributions
olded+5.466
 apologised+5.023
ave+4.898
 agreed+4.868
 apologized+4.814
Neg logit contributions
 big-4.625
 job-4.535
 long-4.519
 view-4.499
 red-4.449
Token
Token ID198
Feature activation+1.79e+00

Pos logit contributions
olded+4.281
ave+3.807
 agreed+3.797
 wouldn+3.745
 apologised+3.718
Neg logit contributions
 big-3.916
 new-3.910
 sunny-3.687
 red-3.568
 long-3.557
TokenAlice
Token ID44484
Feature activation+4.51e+00

Pos logit contributions
olded+4.043
 couldn+3.703
 could+3.680
owed+3.657
 wouldn+3.626
Neg logit contributions
 sunny-2.743
 big-2.574
 new-2.558
 rainy-2.499
 red-2.476
Tokenâ
Token ID22940
Feature activation+1.93e+00

Pos logit contributions
olded+14.330
 Did+13.135
ver+12.764
ave+12.556
 apologised+12.477
Neg logit contributions
 big-6.216
 red-6.015
 pink-5.151
 new-5.142
 black-5.130
Token
Token ID26391
Feature activation+2.56e+00

Pos logit contributions
olded+3.958
 could+3.832
 couldn+3.671
 suggested+3.579
 hadn+3.463
Neg logit contributions
 job-3.165
 sunny-2.980
 thing-2.867
 gift-2.777
 red-2.760
Token
Token ID8151
Feature activation+2.69e+00

Pos logit contributions
olded+4.965
 apologised+4.636
 agreed+4.560
 apologized+4.416
 suggested+4.385
Neg logit contributions
 big-4.730
 job-4.718
 new-4.631
 long-4.626
 red-4.621
Tokens
Token ID82
Feature activation+1.39e+00

Pos logit contributions
olded+5.109
 apologised+4.877
 offered+4.789
 thanked+4.580
 apologized+4.513
Neg logit contributions
 new-5.029
 big-4.803
 red-4.566
 lunch-4.483
 home-4.477
Token dad
Token ID9955
Feature activation+3.84e+00

Pos logit contributions
olded+3.876
 agreed+3.747
 suggested+3.687
ave+3.616
 reached+3.596
Neg logit contributions
 new-1.415
 big-1.361
 red-1.302
 gift-1.294
 job-1.282
Token be
Token ID307
Feature activation+1.55e+00

Pos logit contributions
olded+1.993
oped+1.779
ave+1.772
 could+1.755
 From+1.746
Neg logit contributions
 new-1.071
 gift-1.032
 big-1.015
 red-0.985
 job-0.977
Token fair
Token ID3148
Feature activation+9.74e-01

Pos logit contributions
olded+4.827
oped+4.741
 couldn+4.554
os+4.542
 could+4.533
Neg logit contributions
 new-1.244
 nice-1.098
 red-1.046
 fake-1.018
 things-1.000
Token to
Token ID284
Feature activation+6.40e-01

Pos logit contributions
olded+2.439
oped+2.234
 reached+2.222
 From+2.221
 apologised+2.184
Neg logit contributions
 new-1.299
 things-1.280
 red-1.206
 thing-1.204
 gift-1.163
Token everyone
Token ID2506
Feature activation+2.36e+00

Pos logit contributions
olded+1.697
oped+1.513
ave+1.511
 could+1.493
 agreed+1.487
Neg logit contributions
 new-0.814
 gift-0.737
 things-0.735
 big-0.725
 food-0.716
Token.
Token ID13
Feature activation+3.09e+00

Pos logit contributions
olded+6.734
 From+6.155
ave+6.137
oped+6.134
anced+5.883
Neg logit contributions
 new-2.901
 things-2.836
 thing-2.588
 big-2.547
 red-2.474
Token Lucy
Token ID22162
Feature activation+4.51e+00

Pos logit contributions
oped+6.667
olded+6.539
 managed+6.318
ave+6.267
 hadn+6.196
Neg logit contributions
 new-6.143
 TV-5.443
 big-5.158
 cake-5.123
 food-5.108
Token and
Token ID290
Feature activation+7.42e-01

Pos logit contributions
olded+15.138
 Did+13.949
oped+13.721
 Afterwards+13.276
anced+13.269
Neg logit contributions
 big-5.259
 new-4.785
 red-4.764
 toy-4.524
 things-3.926
Token Charlie
Token ID11526
Feature activation+3.60e+00

Pos logit contributions
olded+1.572
 could+1.400
oped+1.360
ave+1.357
 hadn+1.346
Neg logit contributions
 new-1.237
 food-1.233
 dessert-1.208
 job-1.206
 lunch-1.206
Token agreed
Token ID4987
Feature activation+1.26e+00

Pos logit contributions
olded+8.168
 Did+7.341
oped+7.008
 From+6.776
os+6.753
Neg logit contributions
 end-6.696
 big-6.378
 new-6.117
 gift-6.065
 red-6.003
Token,
Token ID11
Feature activation+3.09e+00

Pos logit contributions
olded+3.283
oped+3.042
os+3.039
ave+3.001
 could+2.911
Neg logit contributions
 new-1.617
 gift-1.454
 things-1.411
 big-1.398
 thing-1.384
Token and
Token ID290
Feature activation+2.67e+00

Pos logit contributions
olded+8.416
ave+8.118
oped+8.095
os+8.054
owed+7.690
Neg logit contributions
 new-4.235
 red-3.965
 big-3.774
 long-3.663
 gift-3.651
Token son
Token ID3367
Feature activation+3.92e+00

Pos logit contributions
 new+1.132
 red+1.105
 gift+1.096
 job+1.062
 thing+1.054
Neg logit contributions
olded-2.332
 could-2.127
 agreed-2.123
 suggested-2.073
 From-2.052
Token was
Token ID373
Feature activation+1.94e+00

Pos logit contributions
olded+10.203
oped+9.469
 Did+9.165
 From+9.053
Well+8.799
Neg logit contributions
 toy-6.004
 big-5.883
 cake-5.823
 red-5.798
 food-5.776
Token happy
Token ID3772
Feature activation+1.26e+00

Pos logit contributions
olded+5.002
 could+4.864
oped+4.829
os+4.777
 From+4.736
Neg logit contributions
 new-2.249
 big-2.239
 red-2.173
 fake-2.139
 nice-2.139
Token again
Token ID757
Feature activation+1.68e+00

Pos logit contributions
olded+3.402
ave+3.081
oped+3.067
 From+2.941
 apologised+2.932
Neg logit contributions
 new-1.503
 things-1.492
 thing-1.474
 gift-1.433
 red-1.407
Token and
Token ID290
Feature activation+3.36e+00

Pos logit contributions
olded+4.937
ave+4.438
oped+4.305
 apologised+4.286
 agreed+4.218
Neg logit contributions
 new-1.799
 things-1.703
 red-1.655
 big-1.614
 thing-1.614
Token he
Token ID339
Feature activation+4.51e+00

Pos logit contributions
olded+8.350
oped+7.813
ave+7.806
os+7.352
ver+7.245
Neg logit contributions
 new-5.833
 big-5.309
 sand-5.225
 job-5.197
 food-5.149
Token promised
Token ID8072
Feature activation+1.21e+00

Pos logit contributions
olded+11.159
 Did+10.285
oped+9.552
os+9.427
owed+9.318
Neg logit contributions
 big-10.097
 new-9.497
 red-9.170
 toy-8.867
 shiny-8.830
Token to
Token ID284
Feature activation+1.23e+00

Pos logit contributions
olded+3.004
 agreed+2.803
ave+2.801
oped+2.800
 could+2.773
Neg logit contributions
 new-1.675
 things-1.552
 job-1.509
 attention-1.440
 food-1.436
Token be
Token ID307
Feature activation+4.75e-01

Pos logit contributions
olded+3.088
ave+2.717
 suggested+2.679
oped+2.673
 agreed+2.653
Neg logit contributions
 gift-1.692
 job-1.639
 new-1.588
 big-1.585
 red-1.564
Token good
Token ID922
Feature activation+2.03e-01

Pos logit contributions
olded+1.373
oped+1.313
 couldn+1.295
 could+1.288
os+1.282
Neg logit contributions
 new-0.407
 red-0.370
 gift-0.353
 nice-0.352
 big-0.348
Token so
Token ID523
Feature activation+1.09e+00

Pos logit contributions
olded+0.493
 hadn+0.459
 From+0.457
 could+0.455
 reached+0.449
Neg logit contributions
 things-0.239
 thing-0.236
 gift-0.233
 job-0.232
 new-0.231
Token down
Token ID866
Feature activation+7.45e-01

Pos logit contributions
olded+2.098
 From+1.882
 hadn+1.879
ave+1.868
 could+1.864
Neg logit contributions
 red-0.813
 new-0.791
 sand-0.710
 shiny-0.705
 things-0.703
Token on
Token ID319
Feature activation-3.78e-01

Pos logit contributions
olded+1.967
ave+1.738
 From+1.721
 could+1.712
 hadn+1.702
Neg logit contributions
 red-0.882
 new-0.875
 big-0.826
 dark-0.801
 gift-0.795
Token their
Token ID511
Feature activation-1.00e+00

Pos logit contributions
 new+0.448
 red+0.438
 big+0.413
 sand+0.413
 gift+0.412
Neg logit contributions
olded-0.993
ave-0.882
 agreed-0.874
 could-0.867
 suggested-0.854
Token knees
Token ID14475
Feature activation+9.14e-01

Pos logit contributions
 new+1.134
 red+1.093
 gift+1.069
 sand+1.068
 food+1.043
Neg logit contributions
olded-2.525
 agreed-2.294
 could-2.294
 suggested-2.284
 From-2.216
Token.
Token ID13
Feature activation+3.37e+00

Pos logit contributions
olded+2.439
 From+2.238
 apologised+2.172
 hadn+2.158
 could+2.155
Neg logit contributions
 red-0.990
 new-0.973
 big-0.901
 sunny-0.891
 sand-0.866
Token Chloe
Token ID29476
Feature activation+4.50e+00

Pos logit contributions
olded+7.232
ave+6.982
 managed+6.751
 apologised+6.474
 forgot+6.450
Neg logit contributions
 new-6.541
 red-6.444
 TV-6.305
 lunch-5.932
 big-5.883
Token suggested
Token ID5220
Feature activation+1.20e+00

Pos logit contributions
olded+14.269
 Did+12.088
ver+11.957
oped+11.864
icked+11.849
Neg logit contributions
 red-6.868
 big-6.811
 new-6.334
 pink-6.101
 black-6.012
Token they
Token ID484
Feature activation+1.98e+00

Pos logit contributions
olded+2.863
ave+2.713
oped+2.664
 could+2.581
 From+2.566
Neg logit contributions
 new-1.719
 things-1.620
 red-1.539
 lunch-1.521
 big-1.500
Token play
Token ID711
Feature activation-4.96e-01

Pos logit contributions
olded+5.505
 From+5.107
oped+4.930
 hadn+4.908
 apologised+4.835
Neg logit contributions
 big-2.102
 red-2.044
 new-1.942
 gift-1.931
 food-1.921
Token a
Token ID257
Feature activation-3.61e+00

Pos logit contributions
 new+0.552
 red+0.521
 sand+0.518
 thing+0.514
 lunch+0.511
Neg logit contributions
olded-1.313
ave-1.153
 could-1.152
 suggested-1.146
 hadn-1.145
Token game
Token ID983
Feature activation+4.26e-01

Pos logit contributions
 gift+2.417
 red+2.343
 food+2.238
 thing+2.212
 things+2.209
Neg logit contributions
olded-10.088
 agreed-9.845
 could-9.798
 suggested-9.624
 couldn-9.430
Tokenâ
Token ID22940
Feature activation+7.94e-01

Pos logit contributions
olded+4.689
 apologised+4.624
ave+4.578
 agreed+4.568
 couldn+4.464
Neg logit contributions
 new-4.768
 red-4.726
 things-4.600
 gift-4.440
 big-4.402
Token
Token ID26391
Feature activation+2.02e+00

Pos logit contributions
olded+1.436
 could+1.366
 couldn+1.306
 suggested+1.243
 hadn+1.243
Neg logit contributions
 job-1.411
 thing-1.342
 sunny-1.333
 gift-1.333
 task-1.303
Token
Token ID220
Feature activation+2.34e+00

Pos logit contributions
olded+4.079
ave+3.689
 apologised+3.629
 agreed+3.604
 apologized+3.432
Neg logit contributions
 big-3.542
 job-3.542
 things-3.508
 red-3.503
 new-3.490
Token
Token ID198
Feature activation+2.06e+00

Pos logit contributions
 apologised+4.280
 agreed+4.264
olded+4.240
 apologized+4.061
 thanked+4.051
Neg logit contributions
 new-4.806
 big-4.514
 red-4.269
 things-4.268
 lunch-4.104
Token
Token ID198
Feature activation+1.35e+00

Pos logit contributions
olded+4.288
 wouldn+3.956
 couldn+3.885
 apologised+3.810
owed+3.777
Neg logit contributions
 new-3.692
 big-3.637
 sunny-3.404
 long-3.342
 red-3.328
TokenAnna
Token ID31160
Feature activation+4.50e+00

Pos logit contributions
olded+2.992
 couldn+2.827
 could+2.714
owed+2.688
 wouldn+2.661
Neg logit contributions
 sunny-1.971
 new-1.957
 big-1.909
 nice-1.858
 red-1.843
Token smiled
Token ID13541
Feature activation+2.05e-02

Pos logit contributions
olded+13.107
oped+12.045
anced+11.964
alled+11.935
rol+11.805
Neg logit contributions
 big-7.042
 red-6.759
 toy-6.189
 new-6.163
 pink-6.029
Token and
Token ID290
Feature activation+3.19e+00

Pos logit contributions
olded+0.049
 could+0.043
 hadn+0.043
os+0.043
oped+0.043
Neg logit contributions
 new-0.028
 red-0.028
 big-0.026
 gift-0.026
 shiny-0.025
Token said
Token ID531
Feature activation+6.41e-01

Pos logit contributions
os+5.341
olded+5.024
oped+4.754
 hadn+4.364
ver+4.357
Neg logit contributions
 new-8.141
 big-8.034
 red-7.845
 shiny-7.731
 pretty-7.652
Token yes
Token ID3763
Feature activation+9.49e-01

Pos logit contributions
olded+1.682
 hadn+1.583
 suggested+1.527
 agreed+1.515
 apologised+1.514
Neg logit contributions
 new-0.694
 red-0.664
 gift-0.630
 thing-0.623
 big-0.619
Token!
Token ID0
Feature activation+3.05e+00

Pos logit contributions
olded+2.592
oped+2.418
os+2.381
 reached+2.357
 hadn+2.349
Neg logit contributions
 new-0.964
 gift-0.886
 red-0.860
 job-0.842
 thing-0.838
Token and
Token ID290
Feature activation+1.25e+00

Pos logit contributions
olded+2.171
 From+2.072
 agreed+2.012
 apologised+1.955
 hadn+1.940
Neg logit contributions
 red-1.058
 new-0.951
 shiny-0.945
 big-0.938
 things-0.925
Token things
Token ID1243
Feature activation+1.88e+00

Pos logit contributions
olded+3.636
ave+3.346
 agreed+3.341
 From+3.336
oped+3.321
Neg logit contributions
 red-1.275
 new-1.175
 shiny-1.161
 things-1.148
 big-1.136
Token to
Token ID284
Feature activation+1.53e-01

Pos logit contributions
olded+4.602
 From+4.468
ave+4.023
 agreed+3.963
 apologised+3.923
Neg logit contributions
 shiny-2.738
 red-2.718
 new-2.677
 big-2.664
 things-2.557
Token explore
Token ID7301
Feature activation-6.49e-02

Pos logit contributions
olded+0.381
 agreed+0.343
ave+0.342
 From+0.339
 suggested+0.337
Neg logit contributions
 new-0.194
 red-0.193
 big-0.190
 sand-0.186
 gift-0.184
Token.
Token ID13
Feature activation+2.98e+00

Pos logit contributions
 new+0.076
 things+0.070
 red+0.069
 thing+0.066
 big+0.066
Neg logit contributions
olded-0.180
ave-0.157
 agreed-0.156
 apologised-0.154
oped-0.153
Token Thomas
Token ID5658
Feature activation+4.50e+00

Pos logit contributions
olded+6.159
ave+5.907
 managed+5.656
 apologised+5.579
 agreed+5.519
Neg logit contributions
 new-5.763
 red-5.653
 TV-5.527
 things-5.362
 big-5.351
Token leaned
Token ID23831
Feature activation+8.14e-01

Pos logit contributions
olded+12.141
oped+10.407
anced+10.387
icked+10.306
Well+10.208
Neg logit contributions
 big-8.537
 red-8.454
 toy-8.105
 shiny-7.645
 new-7.613
Token next
Token ID1306
Feature activation+1.41e-01

Pos logit contributions
olded+2.069
 From+1.856
 apologised+1.816
 agreed+1.813
 hadn+1.812
Neg logit contributions
 red-1.012
 new-0.949
 big-0.913
 sunny-0.901
 dark-0.896
Token to
Token ID284
Feature activation+1.06e+00

Pos logit contributions
olded+0.357
 agreed+0.320
 apologised+0.318
 hadn+0.317
 From+0.313
Neg logit contributions
 new-0.175
 red-0.172
 gift-0.159
 things-0.159
 thing-0.158
Token her
Token ID607
Feature activation+7.50e-02

Pos logit contributions
olded+3.012
ave+2.747
 agreed+2.660
oped+2.642
 suggested+2.624
Neg logit contributions
 new-1.244
 red-1.122
 big-1.110
 things-1.068
 sand-1.058
Token and
Token ID290
Feature activation+3.34e+00

Pos logit contributions
olded+0.195
 agreed+0.179
ave+0.176
 suggested+0.175
 reached+0.174
Neg logit contributions
 new-0.091
 red-0.085
 gift-0.081
 big-0.081
 things-0.080
Token rely
Token ID8814
Feature activation+4.43e-01

Pos logit contributions
 new+0.102
 red+0.097
 gift+0.096
 job+0.095
 big+0.094
Neg logit contributions
olded-0.172
oped-0.151
 From-0.150
os-0.149
 agreed-0.149
Token on
Token ID319
Feature activation+1.55e+00

Pos logit contributions
olded+1.268
 agreed+1.182
 From+1.181
 apologised+1.173
 suggested+1.156
Neg logit contributions
 red-0.377
 new-0.365
 things-0.352
 big-0.334
 gift-0.330
Token each
Token ID1123
Feature activation-7.05e-01

Pos logit contributions
olded+4.580
 agreed+4.262
 reached+4.238
oped+4.181
ave+4.159
Neg logit contributions
 new-1.587
 things-1.510
 red-1.383
 big-1.356
 thing-1.299
Token other
Token ID584
Feature activation+1.21e+00

Pos logit contributions
 new+0.459
 gift+0.447
 thing+0.440
 red+0.428
 things+0.414
Neg logit contributions
olded-2.158
 agreed-1.948
 hadn-1.940
 From-1.934
 could-1.930
Token".
Token ID1911
Feature activation+3.06e+00

Pos logit contributions
olded+3.258
oped+2.978
 From+2.938
 reached+2.912
 apologised+2.879
Neg logit contributions
 new-1.453
 things-1.452
 red-1.350
 food-1.336
 thing-1.325
Token Tina
Token ID35903
Feature activation+4.49e+00

Pos logit contributions
olded+6.524
ave+6.186
 managed+5.879
 wouldn+5.717
 hadn+5.647
Neg logit contributions
 new-5.675
 lunch-5.599
 TV-5.582
 dessert-5.522
 things-5.299
Token hugged
Token ID41130
Feature activation+6.48e-01

Pos logit contributions
olded+11.124
oped+10.797
 Did+9.685
os+9.416
anced+9.408
Neg logit contributions
 new-9.202
 big-9.036
 red-8.830
 pretty-8.491
 nice-8.247
Token her
Token ID607
Feature activation-8.57e-01

Pos logit contributions
olded+1.651
ave+1.471
 agreed+1.465
 could+1.459
 hadn+1.450
Neg logit contributions
 new-0.828
 red-0.777
 things-0.766
 TV-0.739
 thing-0.731
Token back
Token ID736
Feature activation+2.92e-01

Pos logit contributions
 new+0.889
 red+0.849
 gift+0.826
 things+0.773
 big+0.773
Neg logit contributions
olded-2.297
 agreed-2.090
 suggested-2.084
 could-2.077
 reached-2.042
Token and
Token ID290
Feature activation+2.68e+00

Pos logit contributions
olded+0.746
 agreed+0.672
 could+0.668
 hadn+0.667
ave+0.660
Neg logit contributions
 new-0.370
 red-0.352
 gift-0.331
 thing-0.329
 things-0.329
Token knew
Token ID2993
Feature activation+2.06e+00

Pos logit contributions
olded+4.654
os+4.266
ave+4.069
oped+4.066
ver+3.999
Neg logit contributions
 new-6.139
 red-5.753
 shiny-5.642
 big-5.532
 things-5.513
Tokenâ
Token ID22940
Feature activation+1.08e+00

Pos logit contributions
olded+2.828
oped+2.514
 managed+2.508
 apologised+2.490
ave+2.479
Neg logit contributions
 new-3.265
 TV-3.055
 red-2.970
 dessert-2.960
 big-2.952
Token
Token ID26391
Feature activation+1.31e+00

Pos logit contributions
olded+1.953
 could+1.836
 couldn+1.782
 hadn+1.713
 suggested+1.700
Neg logit contributions
 job-1.940
 gift-1.798
 thing-1.786
 sunny-1.782
 new-1.749
Token
Token ID220
Feature activation+2.17e+00

Pos logit contributions
olded+2.368
 apologised+2.050
 agreed+2.005
ave+1.997
 suggested+1.987
Neg logit contributions
 new-2.470
 job-2.432
 things-2.409
 big-2.401
 red-2.381
Token
Token ID198
Feature activation+1.12e+00

Pos logit contributions
 apologised+4.114
olded+4.111
 agreed+4.000
 stared+3.842
 apologized+3.840
Neg logit contributions
 new-4.452
 big-4.048
 things-3.788
 red-3.760
 lunch-3.673
Token
Token ID198
Feature activation+5.85e-01

Pos logit contributions
olded+2.183
 apologised+1.852
 wouldn+1.841
 agreed+1.802
 couldn+1.774
Neg logit contributions
 new-2.168
 big-2.138
 sunny-1.969
 red-1.938
 lunch-1.930
TokenAlice
Token ID44484
Feature activation+4.49e+00

Pos logit contributions
olded+1.194
 couldn+1.047
 could+1.046
os+1.005
 wouldn+0.996
Neg logit contributions
 new-0.970
 sunny-0.962
 big-0.948
 fake-0.930
 gift-0.918
Token wanted
Token ID2227
Feature activation-6.42e-03

Pos logit contributions
olded+14.469
 Did+12.970
oped+12.579
ver+12.521
Well+12.504
Neg logit contributions
 big-6.334
 red-5.710
 new-5.279
 toy-5.099
 black-4.896
Token to
Token ID284
Feature activation+5.86e-01

Pos logit contributions
 new+0.008
 red+0.007
 food+0.007
 things+0.007
 big+0.007
Neg logit contributions
olded-0.016
oped-0.015
 agreed-0.015
 hadn-0.015
 apologised-0.014
Token go
Token ID467
Feature activation-9.67e-01

Pos logit contributions
olded+1.592
oped+1.421
ave+1.416
 hadn+1.399
 suggested+1.385
Neg logit contributions
 new-0.675
 big-0.665
 gift-0.658
 job-0.655
 sand-0.638
Token too
Token ID1165
Feature activation-8.75e-01

Pos logit contributions
 new+1.035
 red+0.963
 gift+0.944
 things+0.911
 big+0.911
Neg logit contributions
olded-2.534
 agreed-2.310
 could-2.308
 hadn-2.281
 suggested-2.273
Token,
Token ID11
Feature activation+1.80e+00

Pos logit contributions
 new+1.094
 red+1.060
 big+1.037
 thing+1.001
 dark+1.001
Neg logit contributions
olded-2.108
 agreed-1.885
 suggested-1.861
 apologised-1.859
 could-1.859
Tokenumpy
Token ID32152
Feature activation-7.29e-01

Pos logit contributions
 red+0.765
 new+0.762
 thing+0.757
 gift+0.757
 job+0.748
Neg logit contributions
olded-1.011
 could-0.945
 suggested-0.925
 agreed-0.925
 hadn-0.906
Token old
Token ID1468
Feature activation-3.13e-01

Pos logit contributions
 new+0.704
 red+0.686
 gift+0.678
 job+0.659
 thing+0.655
Neg logit contributions
olded-1.990
 could-1.825
 suggested-1.810
 agreed-1.800
 From-1.762
Token man
Token ID582
Feature activation+1.35e+00

Pos logit contributions
 new+0.276
 red+0.275
 gift+0.273
 thing+0.260
 food+0.254
Neg logit contributions
olded-0.888
 could-0.808
 From-0.802
 agreed-0.794
ave-0.789
Token inside
Token ID2641
Feature activation+1.31e+00

Pos logit contributions
olded+3.423
 From+3.161
ave+3.095
 agreed+3.081
 apologised+3.057
Neg logit contributions
 gift-1.515
 red-1.486
 new-1.479
 food-1.471
 thing-1.457
Token.
Token ID13
Feature activation+2.95e+00

Pos logit contributions
olded+3.819
ave+3.496
 apologised+3.338
oped+3.321
 agreed+3.316
Neg logit contributions
 new-1.456
 things-1.298
 red-1.273
 food-1.245
 gift-1.234
Token Molly
Token ID30236
Feature activation+4.49e+00

Pos logit contributions
olded+6.447
ave+5.690
 wouldn+5.688
owed+5.632
 managed+5.622
Neg logit contributions
 new-5.797
 lunch-5.299
 big-5.233
 red-5.110
 bed-4.991
Token smiled
Token ID13541
Feature activation+7.55e-01

Pos logit contributions
olded+11.973
oped+10.610
anced+10.545
 Did+10.399
os+10.348
Neg logit contributions
 big-8.899
 new-8.218
 red-8.108
 toy-7.699
 fake-7.486
Token and
Token ID290
Feature activation+3.73e+00

Pos logit contributions
olded+1.893
oped+1.650
 hadn+1.644
os+1.639
 suggested+1.613
Neg logit contributions
 new-1.070
 red-1.031
 big-0.971
 shiny-0.922
 pink-0.917
Token promised
Token ID8072
Feature activation+1.78e+00

Pos logit contributions
olded+8.389
os+8.069
oped+7.827
ave+7.380
ver+7.318
Neg logit contributions
 new-8.255
 big-7.992
 shiny-7.616
 pretty-7.558
 red-7.522
Token to
Token ID284
Feature activation+1.00e+00

Pos logit contributions
olded+4.832
oped+4.602
os+4.443
ave+4.420
 agreed+4.327
Neg logit contributions
 new-2.297
 things-1.993
 big-1.924
 attention-1.922
 job-1.914
Token never
Token ID1239
Feature activation-1.03e+00

Pos logit contributions
olded+2.545
ave+2.217
oped+2.194
 suggested+2.178
os+2.174
Neg logit contributions
 gift-1.358
 job-1.315
 new-1.291
 big-1.289
 sunny-1.243
Tokens
Token ID82
Feature activation+2.08e+00

Pos logit contributions
olded+4.096
 apologised+3.894
 thanked+3.886
 stro+3.766
ave+3.740
Neg logit contributions
 big-2.239
 new-2.194
 red-2.131
 thing-1.864
 fake-1.844
Token me
Token ID502
Feature activation+2.42e+00

Pos logit contributions
olded+5.791
ave+5.417
oped+5.374
 agreed+5.343
 suggested+5.073
Neg logit contributions
 big-2.694
 new-2.686
 dark-2.399
 pretty-2.366
 red-2.346
Token!
Token ID0
Feature activation+2.47e+00

Pos logit contributions
olded+6.347
ave+6.079
 apologised+5.892
 suggested+5.890
 thanked+5.840
Neg logit contributions
 new-2.969
 big-2.890
 red-2.811
 job-2.741
 gift-2.652
Tokenâ
Token ID22940
Feature activation+1.28e+00

Pos logit contributions
olded+4.507
ave+4.330
 wouldn+4.322
 managed+4.321
 couldn+4.315
Neg logit contributions
 new-4.800
 red-4.600
 gift-4.535
 big-4.479
 luxury-4.464
Token
Token ID26391
Feature activation+1.75e+00

Pos logit contributions
olded+2.399
 could+2.192
 couldn+2.186
 suggested+2.118
 apologised+2.083
Neg logit contributions
 job-2.232
 thing-2.170
 sunny-2.112
 gift-2.105
 red-2.058
Token Lucy
Token ID22162
Feature activation+4.49e+00

Pos logit contributions
olded+3.433
 apologised+3.082
 agreed+3.006
ave+3.001
 suggested+2.969
Neg logit contributions
 big-3.127
 things-3.108
 job-3.081
 new-3.080
 red-3.034
Token was
Token ID373
Feature activation+1.46e+00

Pos logit contributions
olded+13.097
alled+11.706
ave+11.644
anced+11.095
ver+10.934
Neg logit contributions
 big-8.448
 red-7.669
 new-7.420
 pretty-7.230
 shiny-7.143
Token very
Token ID845
Feature activation+1.12e+00

Pos logit contributions
olded+3.835
ave+3.564
 agreed+3.487
oped+3.460
 apologised+3.403
Neg logit contributions
 big-1.904
 new-1.897
 red-1.793
 dark-1.713
 fake-1.684
Token curious
Token ID11040
Feature activation+1.11e+00

Pos logit contributions
olded+2.986
ave+2.826
oped+2.758
 apologised+2.675
 From+2.672
Neg logit contributions
 big-1.403
 new-1.369
 red-1.252
 pretty-1.251
 shiny-1.236
Token and
Token ID290
Feature activation+3.32e+00

Pos logit contributions
olded+3.076
oped+2.821
ave+2.703
 apologised+2.699
 From+2.698
Neg logit contributions
 new-1.256
 red-1.236
 things-1.184
 thing-1.160
 gift-1.135
Token asked
Token ID1965
Feature activation+7.90e-01

Pos logit contributions
olded+8.391
oped+8.383
os+8.221
ave+8.029
ver+7.669
Neg logit contributions
 new-6.006
 big-5.842
 pretty-5.682
 shiny-5.639
 red-5.603
Token an
Token ID281
Feature activation-2.18e+00

Pos logit contributions
olded+1.459
 agreed+1.357
ave+1.339
oped+1.330
 suggested+1.325
Neg logit contributions
 new-0.606
 food-0.552
 things-0.549
 big-0.519
 gift-0.518
Token answer
Token ID3280
Feature activation+2.03e+00

Pos logit contributions
 new+1.886
 gift+1.850
 red+1.832
 thing+1.729
 food+1.703
Neg logit contributions
olded-5.877
 could-5.466
 agreed-5.465
 suggested-5.381
 From-5.262
Token.
Token ID13
Feature activation+2.99e+00

Pos logit contributions
olded+5.771
 From+5.277
oped+5.226
ave+5.209
 apologised+5.193
Neg logit contributions
 thing-2.262
 new-2.249
 things-2.193
 gift-2.128
 red-2.109
Tokenâ
Token ID22940
Feature activation+1.11e+00

Pos logit contributions
olded+5.879
ave+5.348
oped+5.253
owed+5.199
os+5.183
Neg logit contributions
 new-6.331
 red-5.602
 sight-5.570
 big-5.543
 sunny-5.521
Token
Token ID26391
Feature activation+2.40e+00

Pos logit contributions
olded+2.013
 could+1.883
 couldn+1.813
os+1.754
 suggested+1.709
Neg logit contributions
 job-2.014
 thing-1.911
 sunny-1.888
 gift-1.874
 task-1.846
Token Daisy
Token ID40355
Feature activation+4.48e+00

Pos logit contributions
olded+4.679
ave+4.094
 agreed+3.946
 apologised+3.939
alled+3.808
Neg logit contributions
 things-4.509
 new-4.493
 big-4.487
 job-4.473
 long-4.289
Token thought
Token ID1807
Feature activation+1.34e+00

Pos logit contributions
olded+12.096
oped+10.858
anced+10.406
os+10.344
uffed+10.223
Neg logit contributions
 big-8.641
 red-7.926
 new-7.769
 toy-7.512
 magic-7.379
Token about
Token ID546
Feature activation+1.22e+00

Pos logit contributions
olded+3.512
ave+3.222
 agreed+3.157
oped+3.143
 apologised+3.129
Neg logit contributions
 new-1.861
 big-1.634
 things-1.605
 red-1.592
 shiny-1.562
Token it
Token ID340
Feature activation+2.21e+00

Pos logit contributions
olded+3.412
ave+3.215
oped+2.997
 agreed+2.971
 suggested+2.870
Neg logit contributions
 new-1.627
 things-1.564
 food-1.444
 red-1.422
 fake-1.394
Token for
Token ID329
Feature activation+2.47e-01

Pos logit contributions
olded+6.917
oped+6.278
ave+6.129
 apologised+6.017
 reached+5.954
Neg logit contributions
 new-2.475
 big-2.045
 things-2.012
 gift-1.959
 thing-1.939
Token a
Token ID257
Feature activation-1.58e+00

Pos logit contributions
olded+0.681
ave+0.618
 agreed+0.616
 could+0.611
 suggested+0.608
Neg logit contributions
 new-0.262
 food-0.249
 lunch-0.246
 gift-0.241
 things-0.240
Token She
Token ID1375
Feature activation+4.29e+00

Pos logit contributions
olded+7.882
 managed+7.162
ave+7.112
 wouldn+7.067
 refused+6.919
Neg logit contributions
 new-7.600
 TV-7.142
 sun-7.137
 red-6.938
 birthday-6.874
Token was
Token ID373
Feature activation+1.85e+00

Pos logit contributions
olded+11.110
 Did+9.285
oped+8.800
uffed+8.741
anced+8.401
Neg logit contributions
 big-9.981
 new-9.263
 shiny-8.794
 pretty-8.752
 red-8.683
Token so
Token ID523
Feature activation+1.34e+00

Pos logit contributions
olded+5.244
ave+4.961
oped+4.924
 suggested+4.731
 could+4.721
Neg logit contributions
 new-2.057
 big-2.002
 red-1.914
 fake-1.823
 nice-1.805
Token excited
Token ID6568
Feature activation+1.19e+00

Pos logit contributions
olded+3.764
ave+3.479
oped+3.470
 suggested+3.449
 could+3.440
Neg logit contributions
 big-1.480
 new-1.420
 pretty-1.335
 nice-1.314
 shiny-1.285
Token and
Token ID290
Feature activation+3.41e+00

Pos logit contributions
olded+3.230
oped+2.942
ave+2.864
 apologised+2.848
 From+2.836
Neg logit contributions
 new-1.463
 things-1.386
 red-1.272
 thing-1.255
 food-1.249
Token she
Token ID673
Feature activation+4.48e+00

Pos logit contributions
olded+9.655
oped+9.286
os+9.070
ave+8.952
ver+8.532
Neg logit contributions
 new-5.212
 big-4.995
 pretty-4.675
 shiny-4.638
 nice-4.545
Token learned
Token ID4499
Feature activation+5.32e-01

Pos logit contributions
olded+11.788
 Did+10.350
ver+9.769
 Although+9.651
oped+9.566
Neg logit contributions
 big-9.884
 pretty-8.887
 shiny-8.845
 red-8.679
 new-8.673
Token something
Token ID1223
Feature activation-3.85e-01

Pos logit contributions
olded+1.470
oped+1.399
ave+1.359
 could+1.324
 agreed+1.321
Neg logit contributions
 new-0.610
 things-0.539
 big-0.510
 fake-0.507
 food-0.492
Token new
Token ID649
Feature activation+4.20e-01

Pos logit contributions
 new+0.164
 fake+0.095
 things+0.093
 red+0.088
 big+0.085
Neg logit contributions
olded-1.310
oped-1.222
 agreed-1.216
ave-1.212
 reached-1.203
Token.
Token ID13
Feature activation+3.55e+00

Pos logit contributions
olded+1.098
oped+0.984
ave+0.963
 apologised+0.957
 reached+0.952
Neg logit contributions
 new-0.529
 thing-0.518
 gift-0.508
 things-0.501
 job-0.460
Token\n
Token ID198
Feature activation+1.13e+00

Pos logit contributions
olded+7.656
 wouldn+6.844
ave+6.837
 managed+6.834
 refused+6.679
Neg logit contributions
 new-7.351
 sun-6.910
 big-6.603
 sight-6.590
 home-6.495
Token the
Token ID262
Feature activation-1.49e+00

Pos logit contributions
 new+0.586
 sand+0.545
 red+0.545
 lunch+0.534
 food+0.531
Neg logit contributions
olded-1.359
ave-1.226
 agreed-1.221
 could-1.210
 suggested-1.191
Token books
Token ID3835
Feature activation+1.61e+00

Pos logit contributions
 new+1.280
 red+1.236
 gift+1.221
 food+1.161
 sand+1.156
Neg logit contributions
olded-4.117
 agreed-3.810
 could-3.766
 suggested-3.739
 From-3.671
Tokenhe
Token ID258
Feature activation+9.18e-01

Pos logit contributions
olded+4.207
 apologised+3.864
 hadn+3.722
 From+3.691
 apologized+3.676
Neg logit contributions
 red-1.943
 new-1.922
 big-1.919
 cake-1.844
 dark-1.779
Tokenlf
Token ID1652
Feature activation+2.03e+00

Pos logit contributions
olded+2.572
 apologised+2.257
 suggested+2.233
 agreed+2.233
ave+2.221
Neg logit contributions
 things-1.030
 thing-1.023
 cake-0.996
 job-0.974
 food-0.972
Token".
Token ID1911
Feature activation+2.92e+00

Pos logit contributions
olded+5.519
 apologised+5.082
 From+4.981
 hadn+4.920
oped+4.897
Neg logit contributions
 new-2.331
 red-2.303
 food-2.166
 things-2.148
 big-2.135
Token Toby
Token ID37578
Feature activation+4.46e+00

Pos logit contributions
olded+6.110
ave+5.805
 agreed+5.597
 managed+5.314
 apologised+5.251
Neg logit contributions
 new-5.570
 bed-5.528
 TV-5.269
 lunch-5.199
 job-5.190
Token agreed
Token ID4987
Feature activation+1.00e+00

Pos logit contributions
olded+11.530
 Did+11.141
oped+10.904
os+10.817
anced+10.771
Neg logit contributions
 big-8.348
 new-7.837
 red-7.702
 toy-7.359
 little-7.281
Token and
Token ID290
Feature activation+3.50e+00

Pos logit contributions
olded+2.606
os+2.360
ave+2.356
oped+2.337
 From+2.271
Neg logit contributions
 new-1.238
 red-1.110
 big-1.106
 thing-1.104
 things-1.083
Token they
Token ID484
Feature activation+2.29e+00

Pos logit contributions
olded+8.980
ave+8.590
oped+8.487
 Did+8.387
os+8.370
Neg logit contributions
 new-5.250
 big-5.020
 sand-4.999
 food-4.971
 job-4.848
Token safely
Token ID11512
Feature activation+7.65e-01

Pos logit contributions
olded+4.707
 Did+4.260
 From+4.208
 Perhaps+3.987
 Doing+3.980
Neg logit contributions
 big-4.010
 sand-3.881
 red-3.860
 sunny-3.738
 dark-3.706
Token returned
Token ID4504
Feature activation-2.19e-01

Pos logit contributions
olded+1.249
 From+1.093
 hadn+1.069
 could+1.048
 apologised+1.025
Neg logit contributions
 new-1.503
 red-1.485
 sand-1.475
 big-1.468
 food-1.454
Token questions
Token ID2683
Feature activation+1.92e+00

Pos logit contributions
olded+2.822
ave+2.605
 agreed+2.566
oped+2.538
 reached+2.518
Neg logit contributions
 new-1.249
 things-1.199
 red-1.127
 food-1.111
 thing-1.102
Token".
Token ID1911
Feature activation+2.92e+00

Pos logit contributions
olded+5.870
 apologised+5.420
ave+5.379
 From+5.268
 agreed+5.262
Neg logit contributions
 new-1.693
 things-1.656
 red-1.624
 thing-1.517
 gift-1.488
Token\n
Token ID198
Feature activation+1.56e+00

Pos logit contributions
olded+6.234
ave+5.882
 managed+5.562
 hadn+5.434
 wouldn+5.423
Neg logit contributions
 new-5.515
 lunch-5.314
 TV-5.261
 dessert-5.198
 things-5.024
Token\n
Token ID198
Feature activation+1.68e+00

Pos logit contributions
olded+3.262
ave+2.883
 wouldn+2.821
 agreed+2.768
 could+2.743
Neg logit contributions
 new-2.852
 big-2.708
 sunny-2.509
 red-2.483
 things-2.411
TokenEl
Token ID9527
Feature activation+1.51e+00

Pos logit contributions
olded+3.839
 couldn+3.446
 could+3.431
 wouldn+3.371
ave+3.296
Neg logit contributions
 sunny-2.551
 new-2.496
 big-2.400
 thing-2.329
 gift-2.314
Tokeniza
Token ID23638
Feature activation+4.46e+00

Pos logit contributions
olded+3.219
 suggested+2.822
 From+2.774
 apologised+2.740
 could+2.735
Neg logit contributions
 thing-2.400
 new-2.299
 red-2.255
 things-2.249
 fake-2.218
Token understood
Token ID7247
Feature activation+9.68e-01

Pos logit contributions
olded+11.733
oped+10.739
 Did+10.343
Well+10.253
rol+9.746
Neg logit contributions
 big-7.960
 red-7.634
 new-7.372
 end-7.067
 cake-7.006
Token the
Token ID262
Feature activation-1.14e+00

Pos logit contributions
olded+2.661
ave+2.466
oped+2.458
 reached+2.333
os+2.327
Neg logit contributions
 new-1.110
 things-1.025
 thing-0.955
 gift-0.949
 big-0.933
Token importance
Token ID6817
Feature activation+1.13e+00

Pos logit contributions
 new+1.089
 gift+1.065
 red+1.028
 thing+0.997
 job+0.985
Neg logit contributions
olded-3.104
 could-2.844
 agreed-2.832
 suggested-2.770
 From-2.748
Token of
Token ID286
Feature activation-7.67e-01

Pos logit contributions
olded+2.984
 From+2.686
 apologised+2.683
 couldn+2.643
oped+2.627
Neg logit contributions
 new-1.185
 thing-1.162
 red-1.130
 gift-1.109
 things-1.104
Token taking
Token ID2263
Feature activation+4.94e-01

Pos logit contributions
 new+1.036
 gift+0.975
 red+0.963
 things+0.961
 thing+0.943
Neg logit contributions
olded-1.848
 could-1.645
ave-1.643
 agreed-1.639
 suggested-1.610
Token and
Token ID290
Feature activation+1.68e+00

Pos logit contributions
olded+1.137
 agreed+1.017
 From+1.013
 could+1.011
 hadn+1.011
Neg logit contributions
 things-0.508
 new-0.495
 red-0.488
 thing-0.472
 job-0.460
Token be
Token ID307
Feature activation+4.54e-01

Pos logit contributions
olded+4.553
oped+4.266
 From+4.250
ave+4.206
os+4.140
Neg logit contributions
 new-2.018
 big-1.910
 sunny-1.854
 red-1.841
 job-1.797
Token depend
Token ID4745
Feature activation+7.00e-01

Pos logit contributions
olded+1.255
 could+1.178
 From+1.164
 couldn+1.144
 agreed+1.137
Neg logit contributions
 new-0.436
 red-0.432
 sunny-0.400
 big-0.397
 nice-0.391
Tokenable
Token ID540
Feature activation+6.93e-01

Pos logit contributions
olded+1.874
 couldn+1.774
 hadn+1.757
 From+1.743
 apologised+1.688
Neg logit contributions
 thing-0.689
 things-0.647
 gift-0.620
 food-0.586
 job-0.580
Token.
Token ID13
Feature activation+2.98e+00

Pos logit contributions
olded+1.849
 reached+1.729
oped+1.703
 From+1.687
 hadn+1.683
Neg logit contributions
 new-0.695
 things-0.683
 gift-0.676
 thing-0.652
 food-0.652
Token Molly
Token ID30236
Feature activation+4.46e+00

Pos logit contributions
olded+6.473
ave+6.205
 managed+6.113
 wouldn+5.829
owed+5.814
Neg logit contributions
 new-5.372
 TV-5.226
 birthday-5.175
 lunch-5.030
 sunny-5.016
Token sighed
Token ID21893
Feature activation+9.23e-01

Pos logit contributions
olded+11.746
anced+10.742
icked+10.499
oped+10.322
 Did+10.156
Neg logit contributions
 big-8.842
 new-8.305
 red-8.209
 toy-7.805
 pretty-7.427
Token and
Token ID290
Feature activation+3.81e+00

Pos logit contributions
olded+2.282
ave+2.040
os+2.025
 could+1.991
 agreed+1.973
Neg logit contributions
 new-1.279
 red-1.224
 big-1.154
 fake-1.147
 thing-1.135
Token reluctantly
Token ID35462
Feature activation+2.69e+00

Pos logit contributions
olded+8.624
os+8.218
oped+8.172
ave+8.100
ives+7.827
Neg logit contributions
 new-7.570
 big-7.338
 red-7.194
 nice-7.143
 dark-7.017
Token agreed
Token ID4987
Feature activation+7.83e-01

Pos logit contributions
olded+4.781
 Did+4.723
os+4.708
 From+4.592
 hadn+4.431
Neg logit contributions
 big-5.292
 red-5.142
 nice-5.106
 new-5.096
 gift-5.050
Token.
Token ID13
Feature activation+3.12e+00

Pos logit contributions
olded+2.069
oped+1.905
os+1.896
ave+1.872
 From+1.857
Neg logit contributions
 new-0.899
 gift-0.799
 red-0.788
 thing-0.786
 things-0.780
Token said
Token ID531
Feature activation+1.54e+00

Pos logit contributions
olded+6.956
 From+6.472
oped+6.245
Well+6.073
abella+5.865
Neg logit contributions
 new-6.578
 big-6.264
 red-6.263
 things-6.172
 fake-6.111
Token.
Token ID13
Feature activation+2.70e+00

Pos logit contributions
olded+3.801
 suggested+3.565
 agreed+3.506
ave+3.486
 thanked+3.413
Neg logit contributions
 new-2.202
 red-2.090
 thing-2.033
 gift-2.017
 big-2.008
Token
Token ID220
Feature activation+2.49e+00

Pos logit contributions
olded+5.190
ave+5.047
 apologised+4.959
 couldn+4.941
 agreed+4.912
Neg logit contributions
 new-5.064
 red-4.931
 TV-4.586
 cotton-4.575
 big-4.543
Token\n
Token ID198
Feature activation+2.17e+00

Pos logit contributions
olded+5.005
 apologised+4.974
 agreed+4.898
 stared+4.755
 thanked+4.701
Neg logit contributions
 new-4.945
 big-4.487
 things-4.186
 red-4.155
 lunch-3.949
Token\n
Token ID198
Feature activation+1.59e+00

Pos logit contributions
olded+4.549
 wouldn+4.082
 couldn+4.052
 apologised+4.019
 agreed+3.926
Neg logit contributions
 new-3.994
 big-3.927
 sunny-3.594
 red-3.547
 nice-3.522
TokenEmily
Token ID48640
Feature activation+4.45e+00

Pos logit contributions
olded+3.650
 couldn+3.506
 could+3.420
owed+3.344
 wouldn+3.325
Neg logit contributions
 sunny-2.236
 big-2.167
 new-2.150
 nice-2.064
 red-2.055
Token was
Token ID373
Feature activation+1.63e+00

Pos logit contributions
olded+12.812
 Did+11.775
os+11.587
anced+11.062
ver+11.051
Neg logit contributions
 big-7.538
 red-7.046
 toy-6.811
 new-6.599
 shiny-6.278
Token very
Token ID845
Feature activation+1.37e+00

Pos logit contributions
olded+4.621
oped+4.358
os+4.347
ave+4.346
 could+4.335
Neg logit contributions
 big-1.626
 new-1.599
 red-1.562
 fake-1.489
 nice-1.441
Token scared
Token ID12008
Feature activation+2.79e-01

Pos logit contributions
olded+3.876
oped+3.739
os+3.737
 could+3.702
ave+3.685
Neg logit contributions
 big-1.354
 new-1.339
 nice-1.243
 red-1.223
 shiny-1.175
Token but
Token ID475
Feature activation+2.93e+00

Pos logit contributions
olded+0.719
 agreed+0.645
 hadn+0.640
oped+0.638
 apologised+0.635
Neg logit contributions
 new-0.323
 red-0.316
 things-0.303
 thing-0.298
 big-0.296
Token she
Token ID673
Feature activation+3.42e+00

Pos logit contributions
os+7.850
olded+7.795
ave+7.564
oped+7.397
anced+6.962
Neg logit contributions
 new-4.238
 big-3.829
 gloomy-3.707
 dark-3.673
 sunny-3.666
Token.
Token ID13
Feature activation-1.68e+00

Pos logit contributions
 new+2.256
 red+2.229
 gift+2.206
 thing+2.125
 fake+2.083
Neg logit contributions
olded-4.383
 agreed-3.994
 could-3.951
 suggested-3.898
 apologised-3.827
Token She
Token ID1375
Feature activation-6.53e-01

Pos logit contributions
 red+4.148
 new+4.148
 gift+4.084
 sunny+4.012
 thing+4.002
Neg logit contributions
olded-1.908
 agreed-1.537
 could-1.493
 suggested-1.467
 apologised-1.424
Token wanted
Token ID2227
Feature activation-2.31e+00

Pos logit contributions
 red+1.479
 new+1.462
 big+1.449
 shiny+1.430
 sunny+1.426
Neg logit contributions
olded-0.926
 agreed-0.749
 apologised-0.715
 suggested-0.706
oped-0.667
Token to
Token ID284
Feature activation-1.73e+00

Pos logit contributions
 new+3.291
 gift+3.265
 red+3.242
 thing+3.137
 sand+3.100
Neg logit contributions
olded-4.956
 could-4.527
 agreed-4.522
 suggested-4.437
 From-4.309
Token play
Token ID711
Feature activation-3.83e+00

Pos logit contributions
 new+2.367
 gift+2.334
 red+2.328
 job+2.237
 food+2.234
Neg logit contributions
olded-3.870
 agreed-3.501
 could-3.474
 suggested-3.420
 hadn-3.367
Token a
Token ID257
Feature activation-5.63e+00

Pos logit contributions
 gift+6.295
 rainy+6.100
 new+6.098
 attention+6.079
 red+6.063
Neg logit contributions
olded-7.337
 could-7.026
 agreed-6.941
 suggested-6.786
 reached-6.547
Token game
Token ID983
Feature activation-1.54e+00

Pos logit contributions
 attention+8.118
 food+6.881
 gift+6.797
 things+6.716
ags+6.542
Neg logit contributions
 could-13.394
 suggested-13.010
olded-12.854
 agreed-12.742
 couldn-12.513
Token with
Token ID351
Feature activation-1.39e+00

Pos logit contributions
 new+2.086
 red+2.050
 gift+2.046
 thing+1.986
 food+1.949
Neg logit contributions
olded-3.417
 agreed-3.102
 could-3.070
 apologised-3.025
 suggested-3.020
Token her
Token ID607
Feature activation-2.11e+00

Pos logit contributions
 new+1.760
 red+1.702
 gift+1.648
 things+1.630
 food+1.601
Neg logit contributions
olded-3.296
 agreed-3.047
 could-3.010
 suggested-2.984
 hadn-2.955
Token friends
Token ID2460
Feature activation-2.96e-01

Pos logit contributions
 new+1.979
 gift+1.928
 red+1.925
 thing+1.811
 fake+1.784
Neg logit contributions
olded-5.588
 agreed-5.176
 could-5.158
 suggested-5.090
 From-4.983
Token.
Token ID13
Feature activation+2.15e-01

Pos logit contributions
 red+0.306
 new+0.297
 gift+0.289
 thing+0.282
 things+0.271
Neg logit contributions
olded-0.764
 apologised-0.705
 From-0.691
 agreed-0.688
 thanked-0.684
Token the
Token ID262
Feature activation-3.61e+00

Pos logit contributions
 new+2.517
 red+2.493
 gift+2.441
 thing+2.391
 food+2.360
Neg logit contributions
olded-4.268
 agreed-3.857
 could-3.819
 suggested-3.791
 hadn-3.703
Token dog
Token ID3290
Feature activation+2.10e+00

Pos logit contributions
 gift+3.479
 sand+3.270
 new+3.259
 dessert+3.239
 job+3.204
Neg logit contributions
olded-9.272
 could-8.988
 agreed-8.864
 suggested-8.719
 couldn-8.514
Token wanted
Token ID2227
Feature activation-1.61e+00

Pos logit contributions
olded+4.420
alled+3.699
anced+3.687
 apologised+3.653
oped+3.633
Neg logit contributions
 big-4.037
 red-3.851
 dark-3.814
 new-3.799
 food-3.770
Token to
Token ID284
Feature activation-9.88e-01

Pos logit contributions
 new+2.176
 red+2.107
 gift+2.072
 food+2.057
 thing+2.007
Neg logit contributions
olded-3.670
 agreed-3.336
 could-3.292
 suggested-3.244
 hadn-3.229
Token play
Token ID711
Feature activation-3.18e+00

Pos logit contributions
 new+1.056
 gift+1.048
 red+1.033
 food+1.013
 big+1.010
Neg logit contributions
olded-2.539
 agreed-2.277
 could-2.275
 hadn-2.251
 suggested-2.247
Token a
Token ID257
Feature activation-5.48e+00

Pos logit contributions
 gift+4.903
 new+4.774
 red+4.728
 fake+4.658
 rainy+4.621
Neg logit contributions
olded-6.409
 agreed-6.021
 could-5.992
 suggested-5.837
 From-5.633
Token game
Token ID983
Feature activation-1.48e+00

Pos logit contributions
 attention+6.011
 gift+5.212
 things+4.977
 food+4.955
 cotton+4.945
Neg logit contributions
 could-14.579
 agreed-14.265
 suggested-14.260
 couldn-14.004
olded-13.877
Token with
Token ID351
Feature activation-3.89e-01

Pos logit contributions
 new+1.818
 red+1.788
 gift+1.775
 thing+1.731
 food+1.692
Neg logit contributions
olded-3.456
 agreed-3.139
 could-3.116
 apologised-3.074
 suggested-3.065
Token his
Token ID465
Feature activation-1.69e+00

Pos logit contributions
 new+0.507
 red+0.483
 food+0.471
 things+0.465
 big+0.459
Neg logit contributions
olded-0.945
 agreed-0.863
 hadn-0.862
 suggested-0.857
ave-0.842
Token owner
Token ID4870
Feature activation+1.08e-01

Pos logit contributions
 new+1.661
 red+1.602
 gift+1.565
 thing+1.482
 food+1.471
Neg logit contributions
olded-4.495
 agreed-4.127
 could-4.101
 suggested-4.062
 apologised-3.997
Token.
Token ID13
Feature activation+1.52e+00

Pos logit contributions
olded+0.288
 apologised+0.264
 hadn+0.261
 From+0.258
oped+0.255
Neg logit contributions
 red-0.114
 new-0.109
 food-0.104
 thing-0.104
 gift-0.102
Token.
Token ID13
Feature activation-1.01e+00

Pos logit contributions
olded+0.834
 apologised+0.724
oped+0.718
 agreed+0.717
ave+0.712
Neg logit contributions
 new-0.287
 red-0.277
 big-0.263
 sunny-0.249
 real-0.242
Token He
Token ID679
Feature activation+8.20e-01

Pos logit contributions
 red+2.096
 new+2.070
 sunny+1.999
 big+1.982
 gift+1.981
Neg logit contributions
olded-1.619
 agreed-1.341
 apologised-1.293
 could-1.287
 suggested-1.269
Token wanted
Token ID2227
Feature activation-2.21e+00

Pos logit contributions
olded+1.452
 agreed+1.129
 apologised+1.121
 suggested+1.081
os+1.068
Neg logit contributions
 red-1.773
 big-1.766
 new-1.739
 shiny-1.721
 pretty-1.685
Token to
Token ID284
Feature activation-1.32e+00

Pos logit contributions
 new+3.121
 gift+3.081
 red+3.068
 thing+2.959
 food+2.936
Neg logit contributions
olded-4.783
 agreed-4.364
 could-4.359
 suggested-4.276
 From-4.159
Token play
Token ID711
Feature activation-3.32e+00

Pos logit contributions
 new+1.545
 red+1.530
 gift+1.522
 food+1.484
 job+1.476
Neg logit contributions
olded-3.242
 agreed-2.931
 could-2.912
 suggested-2.866
 hadn-2.850
Token a
Token ID257
Feature activation-5.44e+00

Pos logit contributions
 gift+5.055
 new+4.896
 red+4.832
 fake+4.809
 rainy+4.758
Neg logit contributions
olded-6.777
 could-6.401
 agreed-6.372
 suggested-6.242
 From-5.999
Token game
Token ID983
Feature activation-1.35e+00

Pos logit contributions
 attention+6.555
 gift+5.603
 food+5.435
 things+5.375
 cotton+5.147
Neg logit contributions
 could-13.833
 suggested-13.621
 agreed-13.340
olded-13.265
 couldn-13.120
Token but
Token ID475
Feature activation+1.14e+00

Pos logit contributions
 new+1.783
 red+1.754
 gift+1.732
 thing+1.697
 food+1.665
Neg logit contributions
olded-3.029
 agreed-2.745
 could-2.709
 apologised-2.696
 hadn-2.678
Token he
Token ID339
Feature activation+2.20e+00

Pos logit contributions
olded+2.624
ave+2.393
oped+2.266
 apologised+2.247
 From+2.163
Neg logit contributions
 red-1.837
 new-1.836
 thing-1.804
 things-1.766
 sunny-1.747
Token did
Token ID750
Feature activation-7.04e-01

Pos logit contributions
olded+3.627
 From+3.003
 congratulated+2.847
alled+2.831
 apologised+2.819
Neg logit contributions
 big-5.026
 thing-4.945
 red-4.868
 new-4.803
 toy-4.786
Token not
Token ID407
Feature activation-1.70e+00

Pos logit contributions
 new+1.127
 things+1.083
 red+1.060
 thing+1.054
 fake+1.043
Neg logit contributions
olded-1.486
 agreed-1.308
ave-1.296
 suggested-1.293
 could-1.275
Token both
Token ID1111
Feature activation+1.82e+00

Pos logit contributions
olded+3.686
 Did+3.011
 Perhaps+3.000
 From+2.995
Well+2.924
Neg logit contributions
 sand-2.920
 red-2.852
 big-2.819
 shiny-2.711
 sunny-2.679
Token went
Token ID1816
Feature activation-2.49e+00

Pos logit contributions
olded+2.945
 From+2.260
 suggested+2.208
Well+2.197
 Perhaps+2.160
Neg logit contributions
 red-4.136
 sand-4.108
 sunny-4.056
 big-3.992
 shiny-3.978
Token home
Token ID1363
Feature activation-6.96e-01

Pos logit contributions
 gift+2.922
 new+2.919
 red+2.874
 thing+2.791
 fake+2.752
Neg logit contributions
olded-5.925
 agreed-5.486
 could-5.486
 suggested-5.383
 From-5.257
Token after
Token ID706
Feature activation-1.82e+00

Pos logit contributions
 new+0.937
 red+0.865
 gift+0.849
 big+0.844
 job+0.829
Neg logit contributions
olded-1.603
 could-1.446
 suggested-1.440
 agreed-1.436
 apologised-1.421
Token a
Token ID257
Feature activation-3.99e+00

Pos logit contributions
 new+2.632
 red+2.550
 gift+2.547
 food+2.478
 sand+2.463
Neg logit contributions
olded-3.969
 agreed-3.589
 could-3.548
 suggested-3.492
ave-3.394
Token great
Token ID1049
Feature activation-5.43e+00

Pos logit contributions
 red+5.249
 things+5.239
 thing+5.157
 gift+5.155
 food+5.068
Neg logit contributions
olded-8.656
 could-8.628
 suggested-8.531
 agreed-8.480
 couldn-8.253
Token day
Token ID1110
Feature activation-1.09e+00

Pos logit contributions
 fake+6.340
 cotton+6.314
 red+6.291
 gift+6.203
 attention+5.976
Neg logit contributions
 agreed-12.721
 suggested-12.598
 couldn-12.518
 reached-12.391
 could-12.344
Token.
Token ID13
Feature activation+1.04e+00

Pos logit contributions
 new+1.270
 red+1.246
 gift+1.212
 food+1.194
 sand+1.188
Neg logit contributions
olded-2.607
 agreed-2.385
 could-2.352
 apologised-2.320
 hadn-2.305
Token Every
Token ID3887
Feature activation-1.36e+00

Pos logit contributions
olded+1.709
 agreed+1.417
 apologised+1.342
 hadn+1.332
ave+1.297
Neg logit contributions
 new-2.292
 gift-2.208
 big-2.189
 food-2.162
 red-2.149
Token day
Token ID1110
Feature activation-1.36e-01

Pos logit contributions
 new+0.724
 gift+0.691
 red+0.661
 thing+0.641
 sunny+0.631
Neg logit contributions
olded-4.282
 could-3.887
 agreed-3.878
 suggested-3.821
ave-3.792
Token,
Token ID11
Feature activation-5.01e-01

Pos logit contributions
 new+0.177
 things+0.168
 food+0.167
 red+0.167
 thing+0.165
Neg logit contributions
olded-0.321
 agreed-0.290
 hadn-0.278
 apologised-0.278
ave-0.277
Token and
Token ID290
Feature activation-9.56e-02

Pos logit contributions
olded+4.005
 hadn+3.423
 apologised+3.419
 suggested+3.297
oped+3.255
Neg logit contributions
 red-2.009
 big-1.846
 new-1.828
 end-1.764
 thing-1.745
Token Sue
Token ID26089
Feature activation+1.06e+00

Pos logit contributions
 red+0.119
 big+0.117
 new+0.116
 gift+0.114
 sunny+0.113
Neg logit contributions
olded-0.235
ave-0.205
 hadn-0.204
 apologised-0.201
oped-0.201
Token wanted
Token ID2227
Feature activation-2.13e+00

Pos logit contributions
olded+2.470
 apologised+2.092
 suggested+2.062
 From+1.999
 hadn+1.992
Neg logit contributions
 big-1.602
 red-1.580
 new-1.517
 end-1.481
 pretty-1.473
Token to
Token ID284
Feature activation-5.30e-01

Pos logit contributions
 new+2.993
 gift+2.941
 red+2.934
 thing+2.822
 food+2.811
Neg logit contributions
olded-4.630
 agreed-4.216
 could-4.213
 suggested-4.129
 hadn-4.024
Token make
Token ID787
Feature activation-3.61e+00

Pos logit contributions
 new+0.595
 red+0.592
 big+0.581
 gift+0.579
 sand+0.562
Neg logit contributions
olded-1.378
 hadn-1.218
 agreed-1.216
ave-1.210
 apologised-1.203
Token a
Token ID257
Feature activation-5.42e+00

Pos logit contributions
 gift+4.778
 rainy+4.540
 sunny+4.510
 thing+4.467
 red+4.402
Neg logit contributions
olded-8.232
 suggested-7.669
 agreed-7.595
 could-7.526
 From-7.269
Token cake
Token ID12187
Feature activation-2.51e+00

Pos logit contributions
 attention+4.767
 thing+4.547
 cotton+4.507
 things+4.446
 rainy+4.295
Neg logit contributions
 could-14.139
olded-14.118
 suggested-14.040
 agreed-13.941
 couldn-13.642
Token for
Token ID329
Feature activation-2.44e+00

Pos logit contributions
 new+3.223
 gift+3.178
 red+3.160
 thing+3.050
 fake+3.039
Neg logit contributions
olded-5.663
 could-5.245
 agreed-5.241
 suggested-5.148
 couldn-4.958
Token Mom
Token ID11254
Feature activation-6.53e-01

Pos logit contributions
 new+2.400
 gift+2.382
 red+2.352
 thing+2.250
 fake+2.200
Neg logit contributions
olded-6.276
 agreed-5.829
 could-5.813
 suggested-5.734
 From-5.603
Token's
Token ID338
Feature activation-2.84e+00

Pos logit contributions
 red+0.772
 new+0.756
 gift+0.730
 food+0.727
 cake+0.713
Neg logit contributions
olded-1.596
 apologised-1.442
 From-1.436
 hadn-1.435
 agreed-1.418
Token birthday
Token ID10955
Feature activation-1.10e+00

Pos logit contributions
 new+1.349
 gift+1.282
 red+1.271
 fake+1.159
 cotton+1.156
Neg logit contributions
olded-8.697
 agreed-8.215
 could-8.207
 suggested-8.082
 couldn-7.920
Token Bob
Token ID5811
Feature activation+7.51e-01

Pos logit contributions
 red+1.455
 new+1.412
 food+1.380
 sunny+1.379
 big+1.370
Neg logit contributions
olded-1.126
 apologised-0.905
 agreed-0.897
 could-0.890
ave-0.889
Tokeno
Token ID78
Feature activation+1.42e+00

Pos logit contributions
olded+1.188
 apologised+0.958
 agreed+0.898
 declared+0.862
 suggested+0.856
Neg logit contributions
 red-1.703
 big-1.651
 new-1.617
 thing-1.617
 sunny-1.593
Token loved
Token ID6151
Feature activation-2.47e+00

Pos logit contributions
olded+2.465
 apologised+1.980
oped+1.880
ave+1.828
alled+1.821
Neg logit contributions
 red-3.156
 big-3.132
 new-3.023
 sunny-2.960
 thing-2.951
Token to
Token ID284
Feature activation-1.14e+00

Pos logit contributions
 new+3.534
 gift+3.524
 red+3.470
 thing+3.387
 fake+3.344
Neg logit contributions
olded-5.212
 could-4.792
 agreed-4.770
 suggested-4.689
 From-4.552
Token play
Token ID711
Feature activation-3.25e+00

Pos logit contributions
 red+1.150
 new+1.147
 gift+1.103
 sunny+1.087
 food+1.079
Neg logit contributions
olded-3.030
 could-2.710
 agreed-2.706
 suggested-2.702
 hadn-2.677
Token a
Token ID257
Feature activation-5.42e+00

Pos logit contributions
 gift+4.971
 new+4.803
 red+4.756
 fake+4.706
 rainy+4.671
Neg logit contributions
olded-6.632
 could-6.253
 agreed-6.206
 suggested-6.061
 From-5.942
Token sport
Token ID6332
Feature activation-1.61e+00

Pos logit contributions
 attention+7.631
 gift+7.183
 dessert+6.746
 food+6.671
 cotton+6.611
Neg logit contributions
 could-12.606
 suggested-12.121
 agreed-12.094
olded-11.975
 couldn-11.846
Token called
Token ID1444
Feature activation+2.04e-01

Pos logit contributions
 new+2.074
 red+2.049
 gift+2.026
 thing+1.986
 food+1.949
Neg logit contributions
olded-3.706
 agreed-3.335
 could-3.301
 suggested-3.281
 hadn-3.252
Token ball
Token ID2613
Feature activation-1.63e+00

Pos logit contributions
olded+0.433
 agreed+0.391
 apologised+0.391
 hadn+0.387
 suggested+0.386
Neg logit contributions
 new-0.321
 red-0.318
 gift-0.304
 food-0.300
 thing-0.299
Token.
Token ID13
Feature activation+2.38e-01

Pos logit contributions
 new+1.990
 red+1.960
 gift+1.933
 thing+1.867
 food+1.856
Neg logit contributions
olded-3.852
 agreed-3.507
 could-3.479
 hadn-3.451
 suggested-3.432
Token He
Token ID679
Feature activation+1.70e+00

Pos logit contributions
olded+0.391
 apologised+0.323
ave+0.311
 hadn+0.311
 agreed+0.300
Neg logit contributions
 red-0.530
 new-0.530
 big-0.502
 sunny-0.497
 food-0.496
Token.
Token ID13
Feature activation-2.75e-01

Pos logit contributions
 new+0.534
 red+0.534
 thing+0.501
 sunny+0.500
 food+0.494
Neg logit contributions
olded-1.339
 could-1.218
 agreed-1.215
 apologised-1.206
 hadn-1.196
Token Spot
Token ID15899
Feature activation+2.25e+00

Pos logit contributions
 red+0.588
 new+0.576
 big+0.563
 food+0.552
 sunny+0.544
Neg logit contributions
olded-0.438
 agreed-0.365
 apologised-0.362
 hadn-0.360
ave-0.350
Token wanted
Token ID2227
Feature activation-1.82e+00

Pos logit contributions
olded+4.514
alled+3.779
 apologised+3.672
anced+3.484
 declared+3.354
Neg logit contributions
 big-5.005
 red-4.890
 new-4.692
 shiny-4.584
 pretty-4.574
Token to
Token ID284
Feature activation-1.10e+00

Pos logit contributions
 new+2.494
 red+2.427
 gift+2.398
 food+2.355
 thing+2.315
Neg logit contributions
olded-4.066
 agreed-3.695
 could-3.663
 suggested-3.604
 hadn-3.560
Token play
Token ID711
Feature activation-3.11e+00

Pos logit contributions
 new+1.187
 red+1.163
 gift+1.158
 food+1.129
 big+1.129
Neg logit contributions
olded-2.817
 agreed-2.533
 could-2.533
 hadn-2.511
 suggested-2.507
Token a
Token ID257
Feature activation-5.40e+00

Pos logit contributions
 gift+4.840
 new+4.711
 red+4.671
 fake+4.595
 sand+4.569
Neg logit contributions
olded-6.204
 agreed-5.821
 could-5.812
 suggested-5.652
 From-5.458
Token fun
Token ID1257
Feature activation-4.43e+00

Pos logit contributions
 attention+5.238
 gift+4.589
 cotton+4.337
 things+4.306
 food+4.198
Neg logit contributions
 could-14.561
 agreed-14.506
 suggested-14.487
olded-14.178
 couldn-13.866
Token game
Token ID983
Feature activation-1.08e+00

Pos logit contributions
 gift+3.563
 attention+3.509
 cotton+3.500
 new+3.482
 fake+3.435
Neg logit contributions
 agreed-11.885
 suggested-11.700
 could-11.611
olded-11.534
 reached-11.241
Token.
Token ID13
Feature activation+5.45e-01

Pos logit contributions
 new+1.292
 red+1.261
 gift+1.237
 thing+1.227
 food+1.202
Neg logit contributions
olded-2.586
 apologised-2.330
 agreed-2.325
 From-2.303
 hadn-2.302
Token He
Token ID679
Feature activation+1.77e+00

Pos logit contributions
olded+0.933
 apologised+0.813
 hadn+0.793
 agreed+0.779
ave+0.761
Neg logit contributions
 new-1.145
 red-1.128
 big-1.098
 food-1.050
 things-1.043
Token had
Token ID550
Feature activation-3.02e-01

Pos logit contributions
olded+3.042
 apologised+2.455
 From+2.402
oped+2.326
 Although+2.300
Neg logit contributions
 red-4.051
 big-4.002
 new-3.811
 shiny-3.749
 black-3.672
Token soon
Token ID2582
Feature activation+1.73e+00

Pos logit contributions
olded+3.756
 From+3.343
 apologised+2.975
 suggested+2.884
oped+2.852
Neg logit contributions
 red-3.906
 big-3.901
 shiny-3.738
 new-3.729
 dark-3.720
Token became
Token ID2627
Feature activation-9.98e-01

Pos logit contributions
olded+4.384
 From+3.774
os+3.598
ave+3.577
 suggested+3.560
Neg logit contributions
 new-2.693
 red-2.605
 big-2.573
 dark-2.478
 things-2.464
Token a
Token ID257
Feature activation-2.84e+00

Pos logit contributions
 new+1.120
 red+1.109
 fake+1.027
 big+1.017
 shiny+1.009
Neg logit contributions
olded-2.541
 agreed-2.311
 could-2.269
 From-2.257
 suggested-2.252
Token blur
Token ID23671
Feature activation-5.46e-01

Pos logit contributions
 new+2.991
 gift+2.990
 red+2.928
 food+2.870
 things+2.834
Neg logit contributions
olded-6.944
 could-6.559
 agreed-6.507
 suggested-6.385
 couldn-6.304
Token!
Token ID0
Feature activation+2.33e+00

Pos logit contributions
 red+0.650
 new+0.627
 thing+0.624
 gift+0.600
 things+0.588
Neg logit contributions
olded-1.347
 From-1.219
 agreed-1.211
 suggested-1.202
 apologised-1.196
Token The
Token ID383
Feature activation-5.38e+00

Pos logit contributions
olded+5.130
 apologised+4.569
 managed+4.432
ave+4.406
 hadn+4.400
Neg logit contributions
 new-4.311
 red-4.130
 big-4.099
 dark-4.004
 sun-3.900
Token mixer
Token ID33938
Feature activation+1.40e+00

Pos logit contributions
 cotton+6.707
 birthday+6.544
 sand+6.378
 job+6.348
 dessert+6.324
Neg logit contributions
 agreed-13.032
 could-12.797
 suggested-12.725
olded-12.523
 couldn-12.360
Token was
Token ID373
Feature activation-1.02e+00

Pos logit contributions
olded+2.424
 From+1.995
 suggested+1.805
 agreed+1.757
 apologised+1.744
Neg logit contributions
 red-3.055
 big-2.952
 new-2.903
 thing-2.890
 shiny-2.855
Token spinning
Token ID19493
Feature activation-4.11e-01

Pos logit contributions
 red+1.322
 new+1.321
 big+1.244
 fake+1.234
 shiny+1.227
Neg logit contributions
olded-2.397
 agreed-2.158
 could-2.155
 suggested-2.130
 From-2.090
Token at
Token ID379
Feature activation-1.32e+00

Pos logit contributions
 red+0.572
 new+0.567
 big+0.536
 shiny+0.531
 thing+0.527
Neg logit contributions
olded-0.979
 suggested-0.847
 agreed-0.836
 apologised-0.836
 hadn-0.818
Token top
Token ID1353
Feature activation+7.92e-02

Pos logit contributions
 new+1.611
 red+1.545
 sand+1.473
 gift+1.472
 things+1.461
Neg logit contributions
olded-3.278
 agreed-2.925
 suggested-2.906
 could-2.893
 apologised-2.838
Token,
Token ID11
Feature activation-1.27e+00

Pos logit contributions
 new+1.855
 red+1.818
 gift+1.785
 thing+1.785
 things+1.746
Neg logit contributions
olded-3.429
 agreed-3.107
 suggested-3.052
 could-3.052
 apologised-3.022
Token Anna
Token ID11735
Feature activation+1.34e+00

Pos logit contributions
 new+2.205
 red+2.169
 thing+2.120
 gift+2.118
 things+2.088
Neg logit contributions
olded-2.401
 agreed-2.088
 suggested-2.061
 apologised-2.051
 could-2.037
Token wanted
Token ID2227
Feature activation-2.28e+00

Pos logit contributions
olded+2.604
 apologised+2.119
 From+2.065
ver+2.044
alled+2.022
Neg logit contributions
 big-2.575
 red-2.573
 dark-2.461
 new-2.437
 food-2.431
Token to
Token ID284
Feature activation-1.59e+00

Pos logit contributions
 new+3.248
 gift+3.219
 red+3.196
 thing+3.090
 food+3.057
Neg logit contributions
olded-4.892
 could-4.467
 agreed-4.465
 suggested-4.379
 From-4.251
Token play
Token ID711
Feature activation-3.50e+00

Pos logit contributions
 new+2.013
 gift+1.982
 red+1.982
 sand+1.911
 food+1.908
Neg logit contributions
olded-3.706
 agreed-3.345
 could-3.319
 suggested-3.283
 hadn-3.239
Token a
Token ID257
Feature activation-5.37e+00

Pos logit contributions
 gift+5.261
 new+5.087
 job+4.979
 fake+4.974
 red+4.949
Neg logit contributions
olded-7.220
 could-6.916
 agreed-6.887
 suggested-6.665
 reached-6.483
Token new
Token ID649
Feature activation-4.60e+00

Pos logit contributions
 attention+7.044
 gift+6.438
 cotton+6.269
 things+6.253
 food+6.230
Neg logit contributions
 could-12.982
 agreed-12.628
 suggested-12.548
olded-12.476
 couldn-12.321
Token game
Token ID983
Feature activation-1.27e+00

Pos logit contributions
 cotton+3.851
 attention+3.789
 gift+3.685
 red+3.572
 gloomy+3.538
Neg logit contributions
 agreed-12.164
olded-11.859
 suggested-11.834
 could-11.818
 couldn-11.538
Token.
Token ID13
Feature activation+8.34e-01

Pos logit contributions
 new+1.495
 red+1.468
 gift+1.464
 thing+1.428
 food+1.387
Neg logit contributions
olded-3.038
 agreed-2.753
 could-2.722
 apologised-2.722
 From-2.710
Token She
Token ID1375
Feature activation+1.82e+00

Pos logit contributions
olded+1.223
 apologised+1.069
oped+1.053
 hadn+1.053
 managed+1.001
Neg logit contributions
 red-1.902
 new-1.884
 big-1.808
 things-1.808
 thing-1.798
Token called
Token ID1444
Feature activation-1.50e+00

Pos logit contributions
olded+3.052
 From+2.514
 apologised+2.406
ver+2.397
oped+2.339
Neg logit contributions
 red-4.208
 big-4.036
 shiny-3.857
 toy-3.855
 new-3.850
Token.
Token ID13
Feature activation+3.81e-02

Pos logit contributions
 new+1.546
 red+1.534
 gift+1.500
 thing+1.468
 food+1.466
Neg logit contributions
olded-3.324
 agreed-3.001
 could-2.978
 From-2.967
 suggested-2.940
Token They
Token ID1119
Feature activation+2.99e-02

Pos logit contributions
olded+0.055
 hadn+0.048
 apologised+0.046
 agreed+0.046
oped+0.045
Neg logit contributions
 red-0.085
 new-0.085
 TV-0.082
 lunch-0.082
 big-0.081
Token want
Token ID765
Feature activation-2.33e+00

Pos logit contributions
olded+0.070
 suggested+0.062
 From+0.062
oped+0.061
ave+0.061
Neg logit contributions
 big-0.038
 red-0.038
 sand-0.037
 food-0.036
 dark-0.036
Token to
Token ID284
Feature activation-1.03e+00

Pos logit contributions
 new+3.298
 gift+3.274
 red+3.250
 thing+3.143
 job+3.109
Neg logit contributions
olded-5.007
 could-4.579
 agreed-4.579
 suggested-4.491
 From-4.353
Token play
Token ID711
Feature activation-2.57e+00

Pos logit contributions
 new+1.183
 red+1.166
 gift+1.165
 sand+1.139
 food+1.137
Neg logit contributions
olded-2.592
 agreed-2.302
 suggested-2.286
 could-2.275
 hadn-2.269
Token a
Token ID257
Feature activation-5.34e+00

Pos logit contributions
 new+3.705
 gift+3.690
 red+3.645
 fake+3.515
 job+3.508
Neg logit contributions
olded-5.423
 agreed-5.028
 could-5.007
 suggested-4.887
 From-4.721
Token game
Token ID983
Feature activation-3.25e-01

Pos logit contributions
 attention+7.865
 gift+7.408
 job+7.336
 cotton+7.283
ags+7.206
Neg logit contributions
 could-11.763
 agreed-11.711
 suggested-11.478
olded-11.030
 couldn-10.780
Token.
Token ID13
Feature activation+1.41e-01

Pos logit contributions
 new+0.365
 gift+0.354
 thing+0.352
 red+0.351
 lunch+0.346
Neg logit contributions
olded-0.815
 apologised-0.740
 From-0.735
 hadn-0.731
 agreed-0.718
Token They
Token ID1119
Feature activation-2.36e-01

Pos logit contributions
olded+0.212
 hadn+0.182
 apologised+0.179
oped+0.179
 agreed+0.175
Neg logit contributions
 new-0.310
 red-0.308
 thing-0.296
 TV-0.295
 big-0.295
Token think
Token ID892
Feature activation-1.27e+00

Pos logit contributions
 red+0.323
 big+0.315
 sand+0.307
 new+0.306
 gift+0.302
Neg logit contributions
olded-0.532
 From-0.471
 suggested-0.464
 agreed-0.461
ave-0.459
Token of
Token ID286
Feature activation-1.52e+00

Pos logit contributions
 new+2.214
 red+2.171
 gift+2.098
 sand+2.087
 things+2.080
Neg logit contributions
olded-2.440
 agreed-2.164
 suggested-2.110
 could-2.103
ave-2.089
Token\n
Token ID198
Feature activation-1.57e+00

Pos logit contributions
 new+2.746
 red+2.652
 gift+2.625
 big+2.625
 thing+2.592
Neg logit contributions
olded-1.753
 agreed-1.439
 apologised-1.387
 could-1.370
 suggested-1.356
TokenHop
Token ID23483
Feature activation+1.15e+00

Pos logit contributions
 new+2.996
 red+2.945
 gift+2.909
 thing+2.886
 fake+2.846
Neg logit contributions
olded-2.653
 agreed-2.293
 could-2.228
 suggested-2.204
 apologised-2.153
Token wanted
Token ID2227
Feature activation-1.82e+00

Pos logit contributions
olded+1.907
 apologised+1.450
 hadn+1.406
 From+1.383
ave+1.340
Neg logit contributions
 red-2.389
 thing-2.380
 new-2.344
 big-2.334
 things-2.326
Token to
Token ID284
Feature activation-1.61e+00

Pos logit contributions
 new+2.516
 red+2.439
 gift+2.409
 food+2.355
 thing+2.340
Neg logit contributions
olded-4.087
 agreed-3.697
 could-3.664
 suggested-3.621
 hadn-3.569
Token play
Token ID711
Feature activation-3.19e+00

Pos logit contributions
 new+1.906
 red+1.849
 gift+1.849
 food+1.762
 thing+1.756
Neg logit contributions
olded-3.929
 could-3.541
 agreed-3.538
 suggested-3.504
 hadn-3.455
Token a
Token ID257
Feature activation-5.34e+00

Pos logit contributions
 gift+5.106
 red+4.938
 new+4.926
 fake+4.848
 cotton+4.845
Neg logit contributions
olded-6.188
 agreed-5.865
 could-5.864
 suggested-5.635
 From-5.459
Token game
Token ID983
Feature activation-1.76e+00

Pos logit contributions
 attention+5.890
 gift+5.436
 groceries+5.300
 food+5.274
 cotton+5.231
Neg logit contributions
 could-14.041
 agreed-13.784
 suggested-13.563
 couldn-13.396
olded-13.158
Token with
Token ID351
Feature activation-1.56e+00

Pos logit contributions
 new+2.314
 red+2.264
 gift+2.263
 thing+2.199
 food+2.158
Neg logit contributions
olded-3.970
 agreed-3.593
 could-3.573
 suggested-3.525
 apologised-3.484
Token the
Token ID262
Feature activation-3.14e+00

Pos logit contributions
 new+2.070
 red+1.991
 gift+1.932
 food+1.913
 thing+1.908
Neg logit contributions
olded-3.595
 agreed-3.263
 suggested-3.222
 could-3.216
 hadn-3.165
Token ball
Token ID2613
Feature activation-7.96e-01

Pos logit contributions
 gift+1.740
 new+1.620
 red+1.560
 dessert+1.559
 sand+1.544
Neg logit contributions
olded-9.221
 could-8.927
 agreed-8.907
 suggested-8.738
 couldn-8.553
Token.
Token ID13
Feature activation+1.76e-02

Pos logit contributions
 new+0.879
 red+0.867
 gift+0.820
 thing+0.817
 food+0.795
Neg logit contributions
olded-2.031
 apologised-1.806
 agreed-1.797
 From-1.797
 hadn-1.783
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
 new+2.922
 gift+2.881
 red+2.871
 thing+2.769
 food+2.736
Neg logit contributions
olded-4.781
 agreed-4.370
 could-4.356
 suggested-4.282
 From-4.167
Token Rose
Token ID8049
Feature activation+1.99e+00

Pos logit contributions
 new+2.606
 red+2.558
 gift+2.517
 things+2.467
 thing+2.459
Neg logit contributions
olded-2.368
 agreed-2.063
 suggested-1.981
 apologised-1.979
 could-1.961
Token wanted
Token ID2227
Feature activation-2.06e+00

Pos logit contributions
olded+3.875
alled+3.181
oped+3.090
anced+3.081
ver+3.069
Neg logit contributions
 big-3.991
 new-3.861
 red-3.841
 dark-3.738
 pretty-3.721
Token to
Token ID284
Feature activation-1.27e+00

Pos logit contributions
 new+2.878
 gift+2.821
 red+2.816
 food+2.709
 thing+2.708
Neg logit contributions
olded-4.504
 agreed-4.106
 could-4.089
 suggested-4.008
 hadn-3.921
Token play
Token ID711
Feature activation-3.43e+00

Pos logit contributions
 new+1.493
 gift+1.467
 red+1.448
 food+1.407
 big+1.404
Neg logit contributions
olded-3.129
 agreed-2.825
 could-2.814
 hadn-2.768
 suggested-2.761
Token a
Token ID257
Feature activation-5.34e+00

Pos logit contributions
 gift+5.429
 new+5.284
 red+5.239
 sand+5.200
 fake+5.197
Neg logit contributions
olded-6.739
 agreed-6.353
 could-6.331
 suggested-6.213
 From-5.922
Token game
Token ID983
Feature activation-1.71e+00

Pos logit contributions
 attention+6.213
 gift+5.445
 food+5.334
 things+5.310
 sand+5.212
Neg logit contributions
 could-13.535
 suggested-13.406
 agreed-13.310
olded-13.100
 couldn-12.892
Token with
Token ID351
Feature activation-5.29e-01

Pos logit contributions
 new+2.278
 red+2.237
 gift+2.232
 thing+2.163
 food+2.124
Neg logit contributions
olded-3.808
 agreed-3.460
 could-3.439
 suggested-3.374
 apologised-3.352
Token Max
Token ID5436
Feature activation+2.31e-01

Pos logit contributions
 new+0.833
 red+0.784
 things+0.775
 food+0.767
 big+0.754
Neg logit contributions
olded-1.134
 agreed-1.026
 hadn-1.004
 could-1.002
ave-0.994
Token Outside
Token ID22151
Feature activation-4.34e-01

Pos logit contributions
olded+0.631
 apologised+0.560
ave+0.560
 From+0.555
 reached+0.550
Neg logit contributions
 red-0.228
 new-0.228
 food-0.211
 things-0.207
 lunch-0.205
Token.
Token ID13
Feature activation+1.27e+00

Pos logit contributions
 new+0.466
 red+0.451
 thing+0.445
 things+0.444
 gift+0.425
Neg logit contributions
olded-1.182
 agreed-1.048
ave-1.045
 apologised-1.045
 apologized-1.001
Token.
Token ID13
Feature activation+4.80e-01

Pos logit contributions
olded+0.878
 apologised+0.785
 agreed+0.778
 hadn+0.766
ave+0.757
Neg logit contributions
 red-0.411
 thing-0.404
 new-0.391
 big-0.380
 things-0.368
Token He
Token ID679
Feature activation+1.06e+00

Pos logit contributions
olded+0.765
 apologised+0.608
 hadn+0.597
 agreed+0.587
 suggested+0.585
Neg logit contributions
 red-1.095
 thing-1.043
 new-1.043
 dark-1.043
 sunny-1.039
Token wanted
Token ID2227
Feature activation-2.10e+00

Pos logit contributions
olded+1.686
 apologised+1.263
 agreed+1.234
 suggested+1.171
oped+1.147
Neg logit contributions
 big-2.528
 red-2.517
 shiny-2.476
 new-2.431
 pretty-2.419
Token to
Token ID284
Feature activation-1.33e+00

Pos logit contributions
 new+2.927
 gift+2.868
 red+2.867
 thing+2.758
 food+2.755
Neg logit contributions
olded-4.587
 agreed-4.182
 could-4.167
 suggested-4.093
 hadn-3.992
Token play
Token ID711
Feature activation-3.26e+00

Pos logit contributions
 new+1.654
 gift+1.645
 red+1.638
 food+1.601
 job+1.584
Neg logit contributions
olded-3.148
 agreed-2.834
 could-2.812
 suggested-2.776
 hadn-2.755
Token a
Token ID257
Feature activation-5.31e+00

Pos logit contributions
 gift+4.961
 new+4.834
 red+4.779
 fake+4.707
 rainy+4.696
Neg logit contributions
olded-6.580
 could-6.251
 agreed-6.211
 suggested-6.055
 From-5.855
Token game
Token ID983
Feature activation-1.07e+00

Pos logit contributions
 attention+5.983
 gift+5.349
 things+5.087
 cotton+5.042
 food+5.007
Neg logit contributions
 could-13.683
 suggested-13.077
 couldn-13.005
olded-12.994
 agreed-12.980
Token with
Token ID351
Feature activation-6.73e-01

Pos logit contributions
 new+1.351
 red+1.325
 gift+1.301
 thing+1.280
 food+1.253
Neg logit contributions
olded-2.478
 agreed-2.237
 apologised-2.225
 hadn-2.201
 From-2.199
Token his
Token ID465
Feature activation-1.56e+00

Pos logit contributions
 new+0.838
 red+0.803
 things+0.788
 food+0.786
 big+0.765
Neg logit contributions
olded-1.656
 agreed-1.524
 suggested-1.504
 hadn-1.493
 could-1.479
Token friend
Token ID1545
Feature activation+1.46e+00

Pos logit contributions
 new+1.383
 red+1.335
 gift+1.300
 food+1.222
 thing+1.217
Neg logit contributions
olded-4.319
 agreed-3.968
 could-3.913
 suggested-3.911
 apologised-3.857
Token,
Token ID11
Feature activation-2.68e-01

Pos logit contributions
olded+4.008
 apologised+3.669
 hadn+3.613
Well+3.603
oped+3.581
Neg logit contributions
 red-1.670
 new-1.603
 food-1.535
 big-1.480
 lunch-1.450
Token.
Token ID13
Feature activation-4.50e-01

Pos logit contributions
 new+1.798
 red+1.768
 gift+1.718
 thing+1.690
 sunny+1.669
Neg logit contributions
olded-3.329
 could-3.010
 agreed-2.985
 hadn-2.958
 apologised-2.948
Token They
Token ID1119
Feature activation+1.04e+00

Pos logit contributions
 red+1.038
 new+1.016
 sunny+0.989
 big+0.986
 thing+0.980
Neg logit contributions
olded-0.640
 apologised-0.519
 hadn-0.503
 suggested-0.502
 agreed-0.498
Token wanted
Token ID2227
Feature activation-1.64e+00

Pos logit contributions
olded+1.955
 apologised+1.624
ave+1.488
 From+1.486
 declared+1.472
Neg logit contributions
 red-2.182
 big-2.140
 sunny-2.076
 new-2.049
 shiny-2.042
Token to
Token ID284
Feature activation-8.82e-01

Pos logit contributions
 new+2.228
 red+2.167
 gift+2.116
 food+2.095
 sand+2.060
Neg logit contributions
olded-3.752
 agreed-3.373
 could-3.355
 suggested-3.308
 hadn-3.279
Token play
Token ID711
Feature activation-2.86e+00

Pos logit contributions
 new+0.960
 red+0.944
 big+0.914
 gift+0.905
 sand+0.898
Neg logit contributions
olded-2.313
 could-2.044
 hadn-2.040
 suggested-2.039
 agreed-2.031
Token a
Token ID257
Feature activation-5.30e+00

Pos logit contributions
 gift+4.130
 new+4.070
 red+4.008
 fake+3.939
 job+3.902
Neg logit contributions
olded-6.012
 agreed-5.663
 could-5.631
 suggested-5.487
 From-5.295
Token game
Token ID983
Feature activation-1.31e+00

Pos logit contributions
 attention+6.380
 gift+5.831
 food+5.566
 cotton+5.452
 things+5.380
Neg logit contributions
 could-13.248
 agreed-13.121
 suggested-12.981
olded-12.549
 couldn-12.320
Token.
Token ID13
Feature activation-5.18e-02

Pos logit contributions
 new+1.662
 red+1.627
 gift+1.606
 thing+1.571
 food+1.542
Neg logit contributions
olded-3.042
 agreed-2.741
 apologised-2.724
 could-2.718
 hadn-2.698
Token They
Token ID1119
Feature activation+1.01e+00

Pos logit contributions
 red+0.110
 new+0.109
 big+0.104
 sunny+0.103
 thing+0.103
Neg logit contributions
olded-0.083
 apologised-0.071
ave-0.067
 agreed-0.067
 hadn-0.066
Token were
Token ID547
Feature activation-2.09e+00

Pos logit contributions
olded+1.714
 apologised+1.405
 From+1.403
ave+1.272
 Perhaps+1.244
Neg logit contributions
 red-2.193
 big-2.131
 shiny-2.026
 dark-2.025
 new-2.009
Token both
Token ID1111
Feature activation-9.00e-01

Pos logit contributions
 new+2.985
 red+2.946
 gift+2.927
 thing+2.815
 food+2.803
Neg logit contributions
olded-4.484
 agreed-4.076
 could-4.073
 suggested-3.999
 From-3.888
Token.
Token ID13
Feature activation-5.10e-01

Pos logit contributions
 new+2.536
 red+2.492
 gift+2.476
 thing+2.402
 food+2.372
Neg logit contributions
olded-4.069
 could-3.691
 agreed-3.679
 suggested-3.613
 hadn-3.568
Token They
Token ID1119
Feature activation+9.84e-01

Pos logit contributions
 red+1.221
 new+1.203
 big+1.166
 sunny+1.152
 gift+1.149
Neg logit contributions
olded-0.658
 hadn-0.531
 apologised-0.524
 agreed-0.519
 could-0.513
Token wanted
Token ID2227
Feature activation-2.12e+00

Pos logit contributions
olded+1.935
 apologised+1.638
 From+1.518
ave+1.511
 declared+1.492
Neg logit contributions
 red-1.950
 big-1.899
 new-1.822
 sunny-1.819
 shiny-1.809
Token to
Token ID284
Feature activation-1.01e+00

Pos logit contributions
 new+2.979
 red+2.922
 gift+2.922
 thing+2.809
 food+2.802
Neg logit contributions
olded-4.619
 agreed-4.203
 could-4.196
 suggested-4.118
 From-4.015
Token play
Token ID711
Feature activation-3.03e+00

Pos logit contributions
 new+1.123
 red+1.102
 gift+1.067
 big+1.063
 sand+1.050
Neg logit contributions
olded-2.608
 could-2.307
 agreed-2.300
 suggested-2.296
 hadn-2.295
Token a
Token ID257
Feature activation-5.29e+00

Pos logit contributions
 gift+4.539
 new+4.444
 red+4.375
 fake+4.335
 job+4.299
Neg logit contributions
olded-6.230
 agreed-5.915
 could-5.875
 suggested-5.713
 From-5.499
Token game
Token ID983
Feature activation-1.38e+00

Pos logit contributions
 attention+6.361
 gift+5.768
 food+5.502
 cotton+5.470
 things+5.357
Neg logit contributions
 could-13.266
 agreed-13.138
 suggested-13.006
olded-12.654
 couldn-12.383
Token.
Token ID13
Feature activation+8.86e-02

Pos logit contributions
 new+1.811
 red+1.773
 gift+1.759
 thing+1.719
 food+1.688
Neg logit contributions
olded-3.116
 agreed-2.804
 could-2.785
 apologised-2.775
 hadn-2.752
Token They
Token ID1119
Feature activation+9.48e-01

Pos logit contributions
olded+0.138
 apologised+0.121
 hadn+0.116
 agreed+0.114
 vowed+0.113
Neg logit contributions
 new-0.192
 red-0.192
 big-0.182
 lunch-0.178
 sunny-0.177
Token saw
Token ID2497
Feature activation-1.88e+00

Pos logit contributions
olded+1.639
 From+1.366
 apologised+1.353
ave+1.236
 declared+1.200
Neg logit contributions
 red-2.007
 big-1.938
 dark-1.848
 shiny-1.847
 new-1.832
Token some
Token ID617
Feature activation-2.50e+00

Pos logit contributions
 new+1.911
 red+1.862
 gift+1.809
 sand+1.743
 food+1.736
Neg logit contributions
olded-4.864
 agreed-4.466
 could-4.461
 suggested-4.406
 hadn-4.343
Token and
Token ID290
Feature activation-4.57e-01

Pos logit contributions
olded+2.953
 apologised+2.517
 hadn+2.494
 exclaimed+2.400
oped+2.379
Neg logit contributions
 red-1.808
 new-1.717
 big-1.684
 gift-1.663
 things-1.637
Token Lily
Token ID20037
Feature activation+1.77e-01

Pos logit contributions
 gift+0.687
 red+0.686
 new+0.680
 big+0.666
 thing+0.658
Neg logit contributions
olded-0.989
 hadn-0.847
oped-0.845
 agreed-0.831
 could-0.825
Token wanted
Token ID2227
Feature activation-2.82e+00

Pos logit contributions
olded+0.362
 apologised+0.297
 suggested+0.296
 hadn+0.296
 From+0.289
Neg logit contributions
 red-0.296
 gift-0.289
 sand-0.286
 big-0.284
 new-0.282
Token to
Token ID284
Feature activation-1.06e+00

Pos logit contributions
 gift+4.207
 new+4.128
 red+4.127
 thing+4.063
 job+4.045
Neg logit contributions
olded-5.766
 could-5.338
 agreed-5.332
 suggested-5.265
 couldn-5.041
Token make
Token ID787
Feature activation-3.02e+00

Pos logit contributions
 new+1.194
 red+1.182
 gift+1.167
 big+1.130
 sand+1.111
Neg logit contributions
olded-2.697
 agreed-2.410
 could-2.383
 hadn-2.382
 apologised-2.366
Token a
Token ID257
Feature activation-5.28e+00

Pos logit contributions
 gift+3.640
 new+3.463
 red+3.459
 rainy+3.421
 thing+3.397
Neg logit contributions
olded-7.135
 agreed-6.617
 suggested-6.588
 could-6.584
 From-6.325
Token snow
Token ID6729
Feature activation-2.57e+00

Pos logit contributions
 attention+4.379
 cotton+4.081
 things+4.000
 rainy+3.987
 thing+3.906
Neg logit contributions
 could-13.972
 suggested-13.927
olded-13.873
 agreed-13.767
 couldn-13.483
Tokenman
Token ID805
Feature activation-2.19e+00

Pos logit contributions
 new+4.341
 gift+4.291
 red+4.256
 fake+4.152
 thing+4.122
Neg logit contributions
olded-4.776
 could-4.367
 agreed-4.355
 suggested-4.238
 couldn-4.075
Token.
Token ID13
Feature activation-2.28e-01

Pos logit contributions
 new+2.943
 gift+2.907
 red+2.894
 thing+2.787
 food+2.762
Neg logit contributions
olded-4.841
 agreed-4.424
 could-4.422
 suggested-4.337
 From-4.228
Token They
Token ID1119
Feature activation+1.22e+00

Pos logit contributions
 red+0.544
 new+0.539
 thing+0.517
 things+0.516
 gift+0.515
Neg logit contributions
olded-0.296
 hadn-0.241
 apologised-0.236
 could-0.232
 couldn-0.232
Token put
Token ID1234
Feature activation-2.16e+00

Pos logit contributions
olded+1.880
 From+1.614
 apologised+1.548
oped+1.446
os+1.443
Neg logit contributions
 red-2.695
 big-2.623
 shiny-2.500
 gift-2.446
 pretty-2.436
Token around
Token ID1088
Feature activation-9.08e-01

Pos logit contributions
 new+2.009
 red+2.003
 thing+1.927
 things+1.911
 gift+1.890
Neg logit contributions
olded-3.118
 agreed-2.767
 suggested-2.727
 could-2.716
 apologised-2.672
Token it
Token ID340
Feature activation-6.07e-01

Pos logit contributions
 new+1.057
 red+1.040
 thing+0.998
 things+0.986
 big+0.958
Neg logit contributions
olded-2.339
 agreed-2.068
 suggested-2.052
ave-2.031
 apologised-2.024
Token very
Token ID845
Feature activation-1.43e+00

Pos logit contributions
 new+0.672
 red+0.670
 big+0.620
 thing+0.616
 fake+0.599
Neg logit contributions
olded-1.597
 agreed-1.403
 apologised-1.396
 suggested-1.396
 From-1.388
Token quickly
Token ID2952
Feature activation+5.65e-01

Pos logit contributions
 new+1.515
 red+1.469
 big+1.386
 thing+1.373
 gift+1.360
Neg logit contributions
olded-3.663
 agreed-3.341
 suggested-3.314
 could-3.277
 apologised-3.223
Token.
Token ID13
Feature activation+2.47e+00

Pos logit contributions
olded+1.598
 From+1.437
 agreed+1.431
 suggested+1.429
 apologised+1.418
Neg logit contributions
 red-0.554
 new-0.538
 things-0.498
 big-0.496
 thing-0.494
Token The
Token ID383
Feature activation-5.27e+00

Pos logit contributions
olded+5.539
 apologised+4.982
 vowed+4.863
ave+4.828
 hadn+4.754
Neg logit contributions
 new-4.790
 red-4.560
 big-4.502
 dark-4.297
 sun-4.183
Token bunny
Token ID44915
Feature activation+2.58e+00

Pos logit contributions
 cotton+6.020
 birthday+5.747
 rainy+5.648
 gift+5.632
 job+5.628
Neg logit contributions
 agreed-13.065
 could-13.006
olded-12.843
 suggested-12.755
 couldn-12.569
Token was
Token ID373
Feature activation-8.32e-01

Pos logit contributions
olded+5.656
 apologised+4.337
anced+4.166
 From+4.124
oped+4.061
Neg logit contributions
 red-5.410
 big-5.386
 new-5.154
 shiny-4.908
 pretty-4.887
Token moving
Token ID3867
Feature activation-6.16e-01

Pos logit contributions
 new+1.089
 red+1.085
 big+1.021
 fake+1.016
 shiny+1.009
Neg logit contributions
olded-1.963
 suggested-1.754
 agreed-1.753
 could-1.746
 apologised-1.701
Token so
Token ID523
Feature activation-4.26e-01

Pos logit contributions
 new+0.739
 red+0.730
 thing+0.684
 big+0.678
 things+0.677
Neg logit contributions
olded-1.572
 suggested-1.385
 agreed-1.373
 apologised-1.354
 From-1.346
Token fast
Token ID3049
Feature activation+3.91e-01

Pos logit contributions
 new+0.255
 red+0.238
 big+0.230
 shiny+0.207
 thing+0.192
Neg logit contributions
olded-1.352
 suggested-1.230
 agreed-1.228
 apologised-1.203
ave-1.199
Token the
Token ID262
Feature activation-3.46e+00

Pos logit contributions
 gift+4.958
 new+4.728
 job+4.699
 thing+4.656
 fake+4.568
Neg logit contributions
olded-6.373
 could-6.237
 agreed-6.054
 suggested-6.030
 From-5.944
Token snow
Token ID6729
Feature activation-1.33e+00

Pos logit contributions
 gift+3.390
 thing+3.108
 new+3.086
 cotton+3.046
 job+3.044
Neg logit contributions
olded-8.583
 could-8.340
 suggested-8.165
 agreed-8.143
 From-8.089
Token.
Token ID13
Feature activation-8.40e-01

Pos logit contributions
 red+1.513
 new+1.508
 thing+1.470
 food+1.457
 gift+1.453
Neg logit contributions
olded-3.313
 agreed-2.977
 could-2.950
 suggested-2.936
 From-2.926
Token They
Token ID1119
Feature activation-5.23e-01

Pos logit contributions
 red+1.918
 new+1.899
 sunny+1.851
 gift+1.833
 dessert+1.819
Neg logit contributions
olded-1.184
 apologised-0.939
 agreed-0.939
 could-0.931
 suggested-0.919
Token make
Token ID787
Feature activation-2.67e+00

Pos logit contributions
 red+0.751
 sunny+0.713
 big+0.712
 sand+0.703
 shiny+0.702
Neg logit contributions
olded-1.207
 From-1.041
 agreed-1.038
 suggested-1.033
 apologised-1.031
Token a
Token ID257
Feature activation-5.26e+00

Pos logit contributions
 gift+2.345
 new+2.260
 red+2.201
 thing+2.127
 job+2.126
Neg logit contributions
olded-7.135
 could-6.730
 agreed-6.689
 suggested-6.576
 couldn-6.444
Token big
Token ID1263
Feature activation-4.90e+00

Pos logit contributions
 attention+3.628
 gift+3.609
 dessert+3.542
 cotton+3.488
 groceries+3.453
Neg logit contributions
 could-15.428
 suggested-14.469
 thanked-14.468
 couldn-14.464
 agreed-14.345
Token snow
Token ID6729
Feature activation-2.77e+00

Pos logit contributions
 gift+4.641
 attention+4.334
 fake+4.227
 new+4.195
 job+4.171
Neg logit contributions
 could-12.620
 agreed-12.171
 couldn-11.992
 thanked-11.895
 From-11.881
Tokenman
Token ID805
Feature activation-1.61e+00

Pos logit contributions
 new+4.712
 gift+4.646
 red+4.572
 fake+4.514
 thing+4.421
Neg logit contributions
olded-5.098
 could-4.747
 agreed-4.681
 suggested-4.527
 couldn-4.416
Token with
Token ID351
Feature activation-1.91e+00

Pos logit contributions
 red+2.111
 new+2.111
 gift+2.092
 thing+2.013
 food+2.009
Neg logit contributions
olded-3.640
 agreed-3.293
 could-3.263
 suggested-3.213
 From-3.182
Token a
Token ID257
Feature activation-2.98e+00

Pos logit contributions
 new+1.951
 red+1.929
 gift+1.894
 food+1.815
 sand+1.805
Neg logit contributions
olded-4.878
 agreed-4.510
 could-4.490
 suggested-4.433
 hadn-4.341
Token,
Token ID11
Feature activation-1.27e+00

Pos logit contributions
 new+2.504
 red+2.468
 gift+2.455
 thing+2.385
 food+2.346
Neg logit contributions
olded-4.309
 agreed-3.926
 could-3.906
 suggested-3.852
 hadn-3.759
Token Lucy
Token ID22162
Feature activation+1.43e+00

Pos logit contributions
 new+2.528
 red+2.506
 gift+2.456
 thing+2.442
 things+2.419
Neg logit contributions
olded-2.093
 agreed-1.772
 suggested-1.743
 apologised-1.738
 could-1.733
Token wanted
Token ID2227
Feature activation-2.50e+00

Pos logit contributions
olded+2.701
 apologised+2.181
ver+2.113
 agreed+2.078
oped+2.032
Neg logit contributions
 big-2.838
 red-2.821
 dark-2.707
 new-2.695
 food-2.668
Token to
Token ID284
Feature activation-1.16e+00

Pos logit contributions
 gift+3.618
 new+3.611
 red+3.578
 thing+3.481
 job+3.440
Neg logit contributions
olded-5.272
 could-4.838
 agreed-4.811
 suggested-4.743
 couldn-4.589
Token play
Token ID711
Feature activation-3.54e+00

Pos logit contributions
 new+1.366
 gift+1.359
 red+1.334
 job+1.312
 big+1.303
Neg logit contributions
olded-2.858
 agreed-2.570
 could-2.542
 suggested-2.510
 hadn-2.509
Token a
Token ID257
Feature activation-5.26e+00

Pos logit contributions
 gift+5.539
 new+5.315
 red+5.275
 fake+5.269
 cotton+5.221
Neg logit contributions
olded-6.989
 could-6.656
 agreed-6.602
 suggested-6.469
 From-6.172
Token game
Token ID983
Feature activation-1.72e+00

Pos logit contributions
 attention+5.910
 gift+5.194
 food+5.174
 things+5.087
 cotton+5.043
Neg logit contributions
 could-13.368
olded-13.098
 suggested-13.069
 agreed-12.990
 couldn-12.823
Token.
Token ID13
Feature activation+3.18e-01

Pos logit contributions
 new+2.228
 gift+2.187
 red+2.186
 thing+2.115
 food+2.073
Neg logit contributions
olded-3.907
 agreed-3.562
 could-3.537
 suggested-3.470
 apologised-3.445
Token She
Token ID1375
Feature activation+1.64e+00

Pos logit contributions
olded+0.435
 apologised+0.373
 agreed+0.361
 suggested+0.356
oped+0.353
Neg logit contributions
 red-0.739
 new-0.734
 thing-0.712
 things-0.709
 big-0.705
Token found
Token ID1043
Feature activation-1.91e+00

Pos logit contributions
olded+2.642
 From+2.112
 apologised+2.065
oped+2.016
ver+2.011
Neg logit contributions
 red-3.875
 big-3.778
 new-3.593
 toy-3.575
 shiny-3.565
Token a
Token ID257
Feature activation-3.33e+00

Pos logit contributions
 new+1.943
 red+1.893
 gift+1.869
 thing+1.776
 food+1.764
Neg logit contributions
olded-4.898
 agreed-4.532
 could-4.507
 suggested-4.457
 hadn-4.371
Token.
Token ID13
Feature activation-2.55e-01

Pos logit contributions
 red+0.717
 new+0.708
 thing+0.695
 gift+0.691
 big+0.680
Neg logit contributions
olded-0.764
 agreed-0.679
 apologised-0.665
 hadn-0.652
 suggested-0.644
Token He
Token ID679
Feature activation+1.24e+00

Pos logit contributions
 red+0.580
 new+0.561
 sunny+0.554
 big+0.546
 dark+0.543
Neg logit contributions
olded-0.390
 apologised-0.320
 agreed-0.312
 suggested-0.306
ave-0.300
Token wanted
Token ID2227
Feature activation-2.62e+00

Pos logit contributions
olded+2.239
 apologised+1.768
 agreed+1.672
 suggested+1.649
 declared+1.630
Neg logit contributions
 red-2.743
 big-2.724
 shiny-2.671
 new-2.643
 pretty-2.593
Token to
Token ID284
Feature activation-1.21e+00

Pos logit contributions
 gift+3.863
 new+3.814
 red+3.789
 thing+3.702
 job+3.680
Neg logit contributions
olded-5.457
 could-5.040
 agreed-4.992
 suggested-4.913
 couldn-4.765
Token play
Token ID711
Feature activation-3.47e+00

Pos logit contributions
 new+1.483
 gift+1.459
 red+1.451
 food+1.421
 big+1.409
Neg logit contributions
olded-2.912
 agreed-2.608
 could-2.596
 suggested-2.567
 hadn-2.548
Token a
Token ID257
Feature activation-5.25e+00

Pos logit contributions
 gift+5.339
 new+5.106
 red+5.081
 fake+5.077
 rainy+5.043
Neg logit contributions
olded-6.937
 could-6.696
 agreed-6.612
 suggested-6.419
 From-6.251
Token game
Token ID983
Feature activation-7.91e-01

Pos logit contributions
 attention+5.886
 gift+5.257
 things+5.029
 food+5.020
 cotton+4.940
Neg logit contributions
 could-13.450
 agreed-12.942
 suggested-12.916
 couldn-12.877
olded-12.840
Token.
Token ID13
Feature activation+8.62e-01

Pos logit contributions
 new+0.973
 red+0.951
 gift+0.928
 thing+0.918
 sunny+0.901
Neg logit contributions
olded-1.883
 apologised-1.706
 agreed-1.686
 hadn-1.674
 From-1.671
Token So
Token ID1406
Feature activation-1.61e+00

Pos logit contributions
olded+1.366
 apologised+1.213
 hadn+1.133
 agreed+1.118
oped+1.091
Neg logit contributions
 new-1.943
 red-1.934
 big-1.857
 things-1.815
 thing-1.813
Token he
Token ID339
Feature activation+1.74e+00

Pos logit contributions
 new+2.983
 red+2.959
 gift+2.912
 thing+2.900
 job+2.848
Neg logit contributions
olded-2.920
 agreed-2.524
 suggested-2.457
 could-2.446
 apologised-2.398
Token brought
Token ID3181
Feature activation-2.10e+00

Pos logit contributions
olded+3.051
 From+2.604
 apologised+2.515
oped+2.473
 Although+2.428
Neg logit contributions
 red-3.792
 big-3.667
 shiny-3.484
 toy-3.454
 new-3.447
Token car
Token ID1097
Feature activation+2.30e+00

Pos logit contributions
 new+1.170
 red+1.153
 gift+1.123
 things+1.077
 thing+1.075
Neg logit contributions
olded-3.974
 agreed-3.706
 could-3.630
 suggested-3.616
 apologised-3.564
Token engine
Token ID3113
Feature activation+2.51e+00

Pos logit contributions
olded+4.602
 From+4.176
abella+3.919
ave+3.894
 agreed+3.880
Neg logit contributions
 red-4.497
 big-4.355
 thing-4.341
 new-4.277
 things-4.273
Token is
Token ID318
Feature activation+1.75e-01

Pos logit contributions
olded+5.425
 From+5.012
ave+4.660
Well+4.639
abella+4.600
Neg logit contributions
 red-4.701
 thing-4.583
 things-4.570
 big-4.480
 shiny-4.406
Token not
Token ID407
Feature activation-1.31e+00

Pos logit contributions
olded+0.487
 agreed+0.465
 From+0.446
ave+0.442
 suggested+0.440
Neg logit contributions
 red-0.167
 new-0.156
 big-0.152
 fake-0.151
 shiny-0.147
Token working
Token ID1762
Feature activation-3.45e-01

Pos logit contributions
 new+1.326
 red+1.310
 gift+1.272
 fake+1.216
 shiny+1.204
Neg logit contributions
olded-3.433
 agreed-3.141
 could-3.111
 suggested-3.081
 From-3.079
Token and
Token ID290
Feature activation+2.18e+00

Pos logit contributions
 new+0.399
 red+0.389
 things+0.372
 big+0.365
 fake+0.364
Neg logit contributions
olded-0.875
 agreed-0.803
 From-0.783
 suggested-0.780
 apologised-0.775
Token needs
Token ID2476
Feature activation-7.13e-01

Pos logit contributions
olded+5.634
ave+5.303
 From+4.892
 agreed+4.864
 apologised+4.666
Neg logit contributions
 red-3.339
 big-3.221
 dark-3.180
 shiny-3.126
 new-3.122
Token to
Token ID284
Feature activation+1.82e-01

Pos logit contributions
 new+0.816
 gift+0.773
 red+0.771
 things+0.751
 food+0.750
Neg logit contributions
olded-1.805
 agreed-1.694
 suggested-1.629
 could-1.619
ave-1.608
Token be
Token ID307
Feature activation+6.22e-01

Pos logit contributions
olded+0.472
ave+0.434
 agreed+0.434
 could+0.424
oped+0.422
Neg logit contributions
 new-0.210
 job-0.199
 big-0.199
 food-0.197
 gift-0.197
Token fixed
Token ID5969
Feature activation+3.15e-01

Pos logit contributions
olded+1.503
 From+1.425
Well+1.406
 agreed+1.405
oped+1.390
Neg logit contributions
 new-0.775
 red-0.767
 big-0.721
 fake-0.707
 dark-0.707
Token,
Token ID11
Feature activation+1.56e+00

Pos logit contributions
olded+0.830
 agreed+0.761
ave+0.748
 reached+0.742
 apologised+0.736
Neg logit contributions
 things-0.361
 new-0.361
 red-0.354
 thing-0.344
 food-0.328
Token and
Token ID290
Feature activation+9.60e-01

Pos logit contributions
olded+10.292
oped+9.592
ver+9.445
anced+9.192
 Did+9.110
Neg logit contributions
 big-5.692
 red-5.586
 cake-5.264
 toy-5.079
 new-5.056
Token Jack
Token ID3619
Feature activation+3.18e+00

Pos logit contributions
olded+2.025
ave+1.731
oped+1.714
 could+1.609
os+1.569
Neg logit contributions
 lunch-1.754
 dessert-1.728
 sand-1.690
 new-1.681
 things-1.672
Token were
Token ID547
Feature activation+4.78e-01

Pos logit contributions
olded+6.432
oped+5.160
 Did+5.149
ver+5.065
anced+5.051
Neg logit contributions
 big-6.753
 cake-6.572
 red-6.535
 end-6.437
 pretty-6.172
Token very
Token ID845
Feature activation-1.03e-01

Pos logit contributions
olded+1.285
oped+1.188
ave+1.182
 suggested+1.160
 could+1.147
Neg logit contributions
 new-0.501
 red-0.499
 big-0.492
 fake-0.455
 food-0.451
Token happy
Token ID3772
Feature activation-1.93e-01

Pos logit contributions
 new+0.099
 big+0.095
 red+0.094
 shiny+0.089
 pretty+0.086
Neg logit contributions
olded-0.279
 could-0.257
ave-0.257
oped-0.257
 suggested-0.256
Token.
Token ID13
Feature activation+2.22e+00

Pos logit contributions
 new+0.219
 things+0.209
 gift+0.206
 thing+0.205
 red+0.203
Neg logit contributions
olded-0.502
ave-0.450
oped-0.440
 could-0.440
 apologised-0.438
Token They
Token ID1119
Feature activation+2.75e+00

Pos logit contributions
olded+3.970
oped+3.710
 hadn+3.382
owed+3.342
ver+3.232
Neg logit contributions
 new-4.596
 red-4.485
 birthday-4.434
 TV-4.393
 lunch-4.388
Token shared
Token ID4888
Feature activation-1.03e+00

Pos logit contributions
olded+5.257
os+4.552
oped+4.446
ver+4.387
 Did+4.305
Neg logit contributions
 big-5.608
 cake-5.508
 pretty-5.471
 red-5.451
 shiny-5.343
Token the
Token ID262
Feature activation-2.21e+00

Pos logit contributions
 new+1.249
 red+1.198
 food+1.172
 gift+1.143
 things+1.141
Neg logit contributions
olded-2.577
 could-2.319
 agreed-2.286
 hadn-2.277
ave-2.268
Token treats
Token ID18432
Feature activation-9.02e-01

Pos logit contributions
 new+1.624
 gift+1.586
 red+1.571
 thing+1.463
 food+1.439
Neg logit contributions
olded-6.276
 agreed-5.856
 could-5.855
 suggested-5.770
 From-5.651
Token and
Token ID290
Feature activation+2.05e+00

Pos logit contributions
 new+1.342
 red+1.308
 gift+1.305
 food+1.288
 cake+1.282
Neg logit contributions
olded-1.920
 could-1.715
 hadn-1.680
 suggested-1.677
 agreed-1.657
Token?"
Token ID1701
Feature activation+1.73e+00

Pos logit contributions
olded+1.591
 From+1.412
 agreed+1.403
 apologised+1.386
ave+1.382
Neg logit contributions
 new-0.759
 food-0.751
 things-0.721
 red-0.721
 lunch-0.718
Token Tom
Token ID4186
Feature activation+2.69e+00

Pos logit contributions
olded+3.819
ave+3.577
oped+3.506
owed+3.499
os+3.406
Neg logit contributions
 new-3.005
 red-2.959
 big-2.881
 things-2.746
 shiny-2.692
Token asked
Token ID1965
Feature activation+2.84e-01

Pos logit contributions
olded+4.030
oped+3.822
anced+3.803
ave+3.456
os+3.281
Neg logit contributions
 red-6.520
 new-6.394
 big-6.381
 fake-6.176
 black-6.141
Token.
Token ID13
Feature activation+1.57e+00

Pos logit contributions
olded+0.744
 could+0.678
 hadn+0.672
oped+0.668
ave+0.668
Neg logit contributions
 new-0.312
 red-0.282
 things-0.277
 food-0.272
 thing-0.271
Token The
Token ID383
Feature activation-1.77e+00

Pos logit contributions
olded+2.932
os+2.655
 hadn+2.617
owed+2.593
oped+2.585
Neg logit contributions
 new-3.062
 red-2.978
 big-2.913
 fake-2.718
 sight-2.708
Token owl
Token ID39610
Feature activation+2.65e+00

Pos logit contributions
 new+1.705
 red+1.645
 gift+1.640
 thing+1.543
 fake+1.540
Neg logit contributions
olded-4.672
 agreed-4.309
 could-4.276
 suggested-4.228
 From-4.134
Token nodded
Token ID14464
Feature activation+5.37e-01

Pos logit contributions
olded+4.064
ave+3.311
anced+3.296
oped+3.254
 From+3.251
Neg logit contributions
 new-6.510
 red-6.380
 shiny-6.121
 fake-6.087
 big-6.077
Token and
Token ID290
Feature activation+2.70e+00

Pos logit contributions
olded+1.344
ave+1.154
 could+1.144
 From+1.137
 hadn+1.136
Neg logit contributions
 red-0.702
 new-0.690
 big-0.629
 thing-0.627
 fake-0.621
Token said
Token ID531
Feature activation+1.20e+00

Pos logit contributions
olded+4.672
oped+4.328
os+4.284
ave+4.234
 hadn+3.767
Neg logit contributions
 new-6.256
 food-5.932
 cake-5.798
 big-5.785
 red-5.780
Token,
Token ID11
Feature activation+2.26e+00

Pos logit contributions
olded+3.237
ave+3.017
 hadn+3.006
 suggested+2.978
oped+2.944
Neg logit contributions
 new-1.309
 red-1.194
 things-1.142
 thing-1.113
 big-1.087
Token "
Token ID366
Feature activation+1.60e+00

Pos logit contributions
olded+4.906
ave+4.742
os+4.616
oped+4.598
 could+4.239
Neg logit contributions
 red-3.961
 new-3.907
 big-3.637
 fake-3.585
 real-3.519
Token a
Token ID257
Feature activation-1.98e+00

Pos logit contributions
 new+0.267
 red+0.261
 gift+0.260
 food+0.257
 big+0.253
Neg logit contributions
olded-0.233
 could-0.210
 agreed-0.205
 From-0.201
 reached-0.196
Token regular
Token ID3218
Feature activation-1.30e+00

Pos logit contributions
 new+1.593
 gift+1.544
 red+1.526
 thing+1.417
 fake+1.410
Neg logit contributions
olded-5.488
 could-5.095
 agreed-5.094
 suggested-5.021
 From-4.937
Token cup
Token ID6508
Feature activation+3.75e-01

Pos logit contributions
 new+1.224
 red+1.189
 gift+1.181
 thing+1.121
 job+1.100
Neg logit contributions
olded-3.489
 could-3.214
 agreed-3.165
 suggested-3.144
 From-3.131
Token of
Token ID286
Feature activation-4.05e-01

Pos logit contributions
olded+0.904
 From+0.837
 apologised+0.808
 hadn+0.802
 couldn+0.799
Neg logit contributions
 gift-0.460
 red-0.453
 cake-0.445
 new-0.441
 food-0.436
Token coffee
Token ID6891
Feature activation+4.89e-01

Pos logit contributions
 new+0.299
 red+0.279
 food+0.264
 sand+0.256
 gift+0.255
Neg logit contributions
olded-1.208
 agreed-1.109
 could-1.105
ave-1.091
 From-1.090
Token.
Token ID13
Feature activation+2.87e+00

Pos logit contributions
olded+1.326
 From+1.219
 apologised+1.177
 agreed+1.166
 hadn+1.158
Neg logit contributions
 cake-0.509
 food-0.499
 red-0.495
 lunch-0.494
 gift-0.490
Token When
Token ID1649
Feature activation+1.06e-02

Pos logit contributions
ave+6.853
olded+6.727
 managed+6.715
oped+6.504
owed+6.489
Neg logit contributions
 new-4.573
 red-4.324
 TV-4.185
 dessert-3.986
 sand-3.920
Token Tim
Token ID5045
Feature activation+2.74e+00

Pos logit contributions
olded+0.021
ave+0.018
 agreed+0.017
 could+0.017
os+0.017
Neg logit contributions
 new-0.020
 lunch-0.019
 job-0.019
 sand-0.019
 dessert-0.019
Token asked
Token ID1965
Feature activation+4.28e-01

Pos logit contributions
olded+5.710
Well+4.725
 apologised+4.560
 From+4.557
anced+4.536
Neg logit contributions
 big-5.104
 new-4.793
 red-4.771
 toy-4.630
 lunch-4.620
Token for
Token ID329
Feature activation-6.77e-01

Pos logit contributions
olded+1.091
ave+1.006
oped+0.995
 could+0.987
 hadn+0.986
Neg logit contributions
 new-0.523
 gift-0.484
 things-0.477
 red-0.475
 food-0.473
Token one
Token ID530
Feature activation-1.49e+00

Pos logit contributions
 new+0.697
 gift+0.656
 food+0.654
 red+0.644
 lunch+0.641
Neg logit contributions
olded-1.764
 could-1.631
 agreed-1.626
 suggested-1.605
ave-1.602
Token it
Token ID340
Feature activation+2.24e+00

Pos logit contributions
 new+0.552
 red+0.549
 gift+0.549
 thing+0.537
 big+0.536
Neg logit contributions
olded-0.705
ave-0.622
 agreed-0.613
 suggested-0.611
oped-0.605
Token's
Token ID338
Feature activation-2.24e-01

Pos logit contributions
olded+4.415
 From+3.897
 apologised+3.780
Well+3.716
 agreed+3.682
Neg logit contributions
 big-4.426
 new-4.298
 red-4.256
 magic-4.133
 thing-4.118
Token so
Token ID523
Feature activation+5.04e-01

Pos logit contributions
 new+0.243
 red+0.242
 big+0.233
 gift+0.231
 fake+0.226
Neg logit contributions
olded-0.597
 agreed-0.551
ave-0.538
 suggested-0.533
 could-0.531
Token cool
Token ID3608
Feature activation+9.29e-01

Pos logit contributions
olded+1.729
ave+1.613
 agreed+1.613
 suggested+1.601
 apologised+1.591
Neg logit contributions
 big-0.221
 new-0.217
 red-0.198
 shiny-0.188
 pretty-0.184
Token!"
Token ID2474
Feature activation+2.81e+00

Pos logit contributions
olded+2.498
 apologised+2.207
 From+2.202
ave+2.160
 apologized+2.156
Neg logit contributions
 red-1.138
 thing-1.104
 new-1.101
 things-1.089
 job-1.068
Token [space]
Token ID220
Feature activation+2.74e+00

Pos logit contributions
olded+5.575
 hadn+5.294
 managed+5.043
ave+5.027
'd+4.950
Neg logit contributions
 new-5.688
 big-5.428
 TV-5.410
 dessert-5.230
 red-5.225
Token [newline]
Token ID198
Feature activation+2.04e+00

Pos logit contributions
olded+6.036
 agreed+5.930
ave+5.791
 apologised+5.757
 hadn+5.558
Neg logit contributions
 new-5.194
 big-5.007
 things-4.243
 red-4.184
 home-4.123
Token [newline]
Token ID198
Feature activation+1.35e+00

Pos logit contributions
olded+4.307
 wouldn+3.996
 hadn+3.935
 agreed+3.815
 refused+3.799
Neg logit contributions
 big-3.755
 new-3.702
 red-3.171
 sunny-3.163
 nice-3.156
Token"
Token ID1
Feature activation+9.08e-01

Pos logit contributions
olded+2.864
 hadn+2.567
 wouldn+2.560
 could+2.560
 couldn+2.555
Neg logit contributions
 big-2.118
 sunny-2.071
 new-2.057
 thing-2.025
 gift-1.978
TokenYes
Token ID5297
Feature activation+2.18e-01

Pos logit contributions
olded+1.971
 From+1.680
 couldn+1.672
 apologised+1.648
 could+1.640
Neg logit contributions
 red-1.403
 sunny-1.402
 cotton-1.364
 shiny-1.342
 thing-1.341
Token,
Token ID11
Feature activation+1.19e+00

Pos logit contributions
olded+0.559
ave+0.513
 suggested+0.506
 could+0.496
 agreed+0.495
Neg logit contributions
 new-0.249
 red-0.243
 big-0.233
 thing-0.231
 gift-0.227
Token lunch
Token ID9965
Feature activation-2.13e-02

Pos logit contributions
 new+0.663
 red+0.601
 gift+0.588
 food+0.584
 lunch+0.540
Neg logit contributions
olded-3.669
 agreed-3.380
 could-3.361
 suggested-3.344
 apologised-3.321
Token.
Token ID13
Feature activation+6.90e-01

Pos logit contributions
 new+0.022
 red+0.022
 lunch+0.022
 food+0.022
 cake+0.021
Neg logit contributions
olded-0.057
 apologised-0.052
 From-0.051
 hadn-0.051
oped-0.050
Token [newline]
Token ID198
Feature activation-3.52e-01

Pos logit contributions
olded+1.227
 apologised+1.013
ave+0.982
 hadn+0.977
 agreed+0.950
Neg logit contributions
 new-1.449
 red-1.395
 big-1.378
 view-1.354
 sunny-1.348
Token [newline]
Token ID198
Feature activation-3.17e-01

Pos logit contributions
 new+0.737
 big+0.716
 red+0.700
 sunny+0.689
 gift+0.688
Neg logit contributions
olded-0.573
 apologised-0.472
 agreed-0.464
 hadn-0.450
ave-0.448
TokenToday
Token ID8888
Feature activation-6.64e-01

Pos logit contributions
 new+0.618
 sunny+0.613
 big+0.611
 red+0.610
 thing+0.606
Neg logit contributions
olded-0.540
 apologised-0.438
 could-0.433
 agreed-0.431
 hadn-0.430
Token,
Token ID11
Feature activation-4.77e-01

Pos logit contributions
 new+0.807
 thing+0.766
 red+0.756
 gift+0.737
 job+0.734
Neg logit contributions
olded-1.675
 apologised-1.464
 agreed-1.456
ave-1.435
 could-1.433
Token Mary
Token ID5335
Feature activation+3.17e+00

Pos logit contributions
 new+0.975
 red+0.947
 things+0.937
 thing+0.932
 food+0.928
Neg logit contributions
olded-0.814
 apologised-0.669
ave-0.663
 agreed-0.652
oped-0.647
Token's
Token ID338
Feature activation-2.62e+00

Pos logit contributions
olded+7.479
Well+6.358
alled+6.334
anced+6.330
 apologised+6.302
Neg logit contributions
 big-5.932
 red-5.602
 new-5.305
 food-5.212
 toy-5.176
Token mom
Token ID1995
Feature activation+2.72e+00

Pos logit contributions
 new+2.675
 gift+2.649
 red+2.621
 thing+2.530
 fake+2.507
Neg logit contributions
olded-6.605
 could-6.157
 agreed-6.128
 suggested-6.010
 couldn-5.929
Token had
Token ID550
Feature activation+8.27e-01

Pos logit contributions
olded+4.701
alled+4.137
anced+4.094
 From+4.042
oped+3.939
Neg logit contributions
 big-6.100
 red-6.033
 new-5.790
 cake-5.608
 real-5.450
Token brought
Token ID3181
Feature activation-2.25e-02

Pos logit contributions
olded+2.107
oped+1.938
 From+1.908
os+1.886
 agreed+1.876
Neg logit contributions
 new-1.007
 red-0.997
 big-0.970
 food-0.966
 lunch-0.957
Token was
Token ID373
Feature activation-3.18e+00

Pos logit contributions
olded+1.745
ave+1.365
 apologised+1.332
 suggested+1.237
 agreed+1.221
Neg logit contributions
 new-2.286
 thing-2.235
 red-2.178
 big-2.147
 things-2.136
Token a
Token ID257
Feature activation-3.52e+00

Pos logit contributions
 gift+3.751
 new+3.659
 red+3.527
 job+3.518
 thing+3.507
Neg logit contributions
olded-7.555
 could-7.157
 agreed-7.127
 suggested-6.988
 From-6.960
Token silly
Token ID14397
Feature activation-2.14e+00

Pos logit contributions
 gift+3.372
 food+3.217
 things+3.159
 attention+3.148
 new+3.123
Neg logit contributions
olded-8.915
 could-8.734
 agreed-8.542
 suggested-8.414
 couldn-8.308
Token yellow
Token ID7872
Feature activation-2.34e+00

Pos logit contributions
 new+1.859
 gift+1.817
 red+1.811
 thing+1.707
 food+1.674
Neg logit contributions
olded-5.772
 agreed-5.356
 could-5.351
 suggested-5.276
 From-5.162
Token bird
Token ID6512
Feature activation-6.65e-01

Pos logit contributions
 new+2.071
 gift+2.032
 red+2.010
 thing+1.896
 food+1.880
Neg logit contributions
olded-6.209
 could-5.786
 agreed-5.782
 suggested-5.687
 From-5.566
Token.
Token ID13
Feature activation+3.77e-01

Pos logit contributions
 red+0.997
 new+0.980
 thing+0.960
 gift+0.943
 sunny+0.942
Neg logit contributions
olded-1.418
 agreed-1.259
 apologised-1.229
 hadn-1.206
 suggested-1.202
Token He
Token ID679
Feature activation+1.38e+00

Pos logit contributions
olded+0.670
 apologised+0.551
ave+0.547
 agreed+0.529
 suggested+0.524
Neg logit contributions
 red-0.812
 sunny-0.764
 new-0.760
 dark-0.744
 big-0.740
Token just
Token ID655
Feature activation+1.66e+00

Pos logit contributions
olded+2.481
 apologised+1.884
 agreed+1.829
 suggested+1.803
 From+1.743
Neg logit contributions
 red-3.171
 big-3.085
 shiny-3.076
 pretty-3.014
 new-2.984
Token loved
Token ID6151
Feature activation-1.62e+00

Pos logit contributions
olded+2.756
 From+2.167
 suggested+2.152
oped+2.063
 apologised+2.043
Neg logit contributions
 red-3.796
 shiny-3.667
 sunny-3.626
 dark-3.579
 pretty-3.577
Token to
Token ID284
Feature activation-5.48e-01

Pos logit contributions
 new+2.179
 red+2.154
 gift+2.096
 food+2.052
 things+2.037
Neg logit contributions
olded-3.706
 agreed-3.358
 could-3.291
 suggested-3.281
 apologised-3.210
Token perform
Token ID1620
Feature activation-9.34e-01

Pos logit contributions
 red+0.552
 new+0.534
 gift+0.512
 sand+0.512
 sunny+0.510
Neg logit contributions
olded-1.507
 suggested-1.330
 agreed-1.328
 hadn-1.314
 could-1.312
Token grabbed
Token ID13176
Feature activation-3.79e-01

Pos logit contributions
olded+7.411
ave+7.203
oped+7.161
os+7.119
 From+6.891
Neg logit contributions
 new-5.846
 big-5.709
 food-5.653
 sand-5.470
 dark-5.245
Token her
Token ID607
Feature activation-7.94e-01

Pos logit contributions
 new+0.456
 red+0.442
 food+0.436
 gift+0.433
 things+0.427
Neg logit contributions
olded-0.925
 could-0.838
 agreed-0.834
 suggested-0.833
 hadn-0.830
Token friends
Token ID2460
Feature activation-1.66e-01

Pos logit contributions
 new+0.803
 gift+0.766
 red+0.760
 food+0.733
 thing+0.721
Neg logit contributions
olded-2.130
 agreed-1.954
 suggested-1.926
 could-1.916
ave-1.892
Token.
Token ID13
Feature activation+3.39e+00

Pos logit contributions
 new+0.179
 food+0.173
 red+0.173
 gift+0.170
 lunch+0.169
Neg logit contributions
olded-0.411
 could-0.383
 From-0.375
 agreed-0.374
 hadn-0.372
Token Together
Token ID17083
Feature activation-4.86e-01

Pos logit contributions
 hadn+6.593
olded+6.399
ave+6.229
'd+6.066
 apologised+6.051
Neg logit contributions
 new-7.111
 big-6.691
 sand-6.644
 red-6.642
 home-6.593
Token they
Token ID484
Feature activation+1.33e+00

Pos logit contributions
 new+0.788
 red+0.764
 lunch+0.742
 things+0.741
 gift+0.740
Neg logit contributions
olded-0.993
ave-0.876
 could-0.874
 agreed-0.862
 hadn-0.850
Token formed
Token ID7042
Feature activation-4.49e-01

Pos logit contributions
olded+2.079
 From+1.857
 Did+1.659
Well+1.629
lett+1.603
Neg logit contributions
 big-2.789
 red-2.728
 new-2.668
 food-2.659
 shiny-2.640
Token a
Token ID257
Feature activation-1.93e+00

Pos logit contributions
 new+0.471
 red+0.448
 big+0.424
 sand+0.422
 food+0.422
Neg logit contributions
olded-1.212
 could-1.095
 agreed-1.079
 hadn-1.075
 suggested-1.074
Token plan
Token ID1410
Feature activation+1.36e+00

Pos logit contributions
 new+1.433
 gift+1.382
 red+1.359
 thing+1.264
 fake+1.251
Neg logit contributions
olded-5.471
 agreed-5.087
 could-5.081
 suggested-5.000
 From-4.912
Token.
Token ID13
Feature activation+2.77e+00

Pos logit contributions
olded+3.610
 From+3.294
ave+3.229
 apologised+3.205
 could+3.184
Neg logit contributions
 new-1.515
 gift-1.476
 things-1.350
 thing-1.350
 food-1.343
Token [newline]
Token ID198
Feature activation+1.67e+00

Pos logit contributions
olded+5.041
 hadn+4.828
ave+4.697
 apologised+4.598
 wouldn+4.578
Neg logit contributions
 new-5.581
 sand-5.519
 lunch-5.399
 big-5.325
 red-5.241
Token The
Token ID383
Feature activation-2.65e+00

Pos logit contributions
olded+5.471
owed+4.803
 managed+4.795
 agreed+4.789
 apologised+4.716
Neg logit contributions
 TV-5.636
 red-5.556
 new-5.539
 sun-5.343
 sand-5.245
Token crab
Token ID32202
Feature activation+2.90e+00

Pos logit contributions
 new+2.418
 gift+2.417
 red+2.351
 job+2.212
 thing+2.212
Neg logit contributions
olded-6.972
 agreed-6.567
 could-6.566
 suggested-6.407
 couldn-6.303
Token felt
Token ID2936
Feature activation-1.27e-01

Pos logit contributions
olded+6.529
 Did+5.289
anced+5.225
 From+5.108
oped+5.098
Neg logit contributions
 big-5.845
 new-5.806
 red-5.667
 game-5.534
 toy-5.447
Token so
Token ID523
Feature activation-8.83e-02

Pos logit contributions
 new+0.137
 big+0.126
 red+0.124
 sunny+0.122
 fake+0.120
Neg logit contributions
olded-0.340
oped-0.308
 suggested-0.308
os-0.307
 agreed-0.304
Token much
Token ID881
Feature activation-2.14e+00

Pos logit contributions
 new+0.074
 big+0.069
 sunny+0.067
 red+0.066
 pretty+0.064
Neg logit contributions
olded-0.258
 suggested-0.235
 could-0.230
 hadn-0.229
 agreed-0.229
Token better
Token ID1365
Feature activation+4.28e-01

Pos logit contributions
 new+1.950
 gift+1.898
 red+1.887
 thing+1.782
 food+1.759
Neg logit contributions
olded-5.682
 could-5.266
 agreed-5.264
 suggested-5.182
 From-5.077
Token now
Token ID783
Feature activation+9.71e-01

Pos logit contributions
olded+1.115
 apologised+0.978
ave+0.977
 From+0.968
 agreed+0.952
Neg logit contributions
 new-0.499
 thing-0.491
 red-0.476
 things-0.472
 job-0.464
Token that
Token ID326
Feature activation+1.48e+00

Pos logit contributions
olded+2.753
ave+2.503
oped+2.381
os+2.374
 agreed+2.332
Neg logit contributions
 new-1.072
 thing-1.034
 things-1.011
 red-0.976
 food-0.952
Token he
Token ID339
Feature activation+3.52e+00

Pos logit contributions
olded+3.594
ave+3.130
oped+3.005
owed+2.829
 agreed+2.823
Neg logit contributions
 new-2.664
 thing-2.617
 things-2.574
 sunny-2.533
 red-2.471
Token was
Token ID373
Feature activation+2.45e-01

Pos logit contributions
olded+7.383
 From+6.245
 Did+6.177
oped+6.079
os+6.008
Neg logit contributions
 big-7.699
 new-7.471
 bad-7.245
 real-7.213
 red-7.107
Token with
Token ID351
Feature activation+5.23e-02

Pos logit contributions
olded+0.639
 From+0.577
 could+0.573
 suggested+0.568
 agreed+0.567
Neg logit contributions
 new-0.280
 red-0.268
 big-0.265
 fake-0.251
 sunny-0.250
Token.
Token ID13
Feature activation-3.16e-02

Pos logit contributions
olded+0.295
 reached+0.266
 agreed+0.264
oped+0.264
 From+0.263
Neg logit contributions
 things-0.135
 thing-0.134
 new-0.134
 red-0.134
 food-0.132
Token They
Token ID1119
Feature activation+2.37e+00

Pos logit contributions
 new+0.074
 red+0.072
 gift+0.071
 things+0.070
 big+0.069
Neg logit contributions
olded-0.041
 agreed-0.037
ave-0.033
 could-0.033
 reached-0.033
Token hoped
Token ID10719
Feature activation-1.76e-01

Pos logit contributions
os+3.650
olded+3.587
oped+3.445
alled+3.006
 Did+2.976
Neg logit contributions
 big-5.269
 pretty-5.093
 bad-5.076
 shiny-5.006
 red-4.975
Token he
Token ID339
Feature activation+2.73e+00

Pos logit contributions
 new+0.321
 things+0.298
 red+0.296
 gift+0.294
 big+0.287
Neg logit contributions
olded-0.337
 agreed-0.315
ave-0.296
 suggested-0.294
oped-0.288
Token was
Token ID373
Feature activation-1.04e+00

Pos logit contributions
olded+5.444
 exclaimed+4.869
 From+4.630
 declared+4.597
 congratulated+4.539
Neg logit contributions
 new-5.456
 things-5.271
 big-5.179
 thing-5.136
 food-5.110
Token happy
Token ID3772
Feature activation+1.60e-01

Pos logit contributions
 new+1.503
 red+1.427
 fake+1.410
 gift+1.396
 food+1.388
Neg logit contributions
olded-2.298
 agreed-2.076
 suggested-2.051
 could-2.038
oped-2.009
Token and
Token ID290
Feature activation+9.89e-01

Pos logit contributions
olded+0.407
oped+0.367
ave+0.358
 apologised+0.352
 From+0.352
Neg logit contributions
 new-0.200
 things-0.190
 gift-0.189
 red-0.186
 food-0.184
Token safe
Token ID3338
Feature activation+4.55e-01

Pos logit contributions
olded+2.469
oped+2.418
os+2.399
 reached+2.264
ave+2.257
Neg logit contributions
 new-1.148
 red-1.074
 shiny-1.049
 big-1.044
 pretty-1.038
Token.
Token ID13
Feature activation+4.32e-01

Pos logit contributions
olded+1.213
 From+1.070
oped+1.069
ave+1.066
 agreed+1.062
Neg logit contributions
 new-0.503
 things-0.489
 food-0.486
 thing-0.478
 gift-0.476
Token They
Token ID1119
Feature activation+2.17e+00

Pos logit contributions
olded+0.626
 agreed+0.560
oped+0.523
ave+0.518
 managed+0.513
Neg logit contributions
 new-0.970
 red-0.932
 big-0.909
 dark-0.907
 gift-0.901
Token learned
Token ID4499
Feature activation-1.39e+00

Pos logit contributions
olded+3.098
os+2.954
oped+2.816
 Did+2.445
alled+2.430
Neg logit contributions
 big-5.035
 pretty-4.809
 new-4.803
 dark-4.802
 shiny-4.774
Token But
Token ID887
Feature activation-5.57e-02

Pos logit contributions
olded+4.055
 agreed+3.715
owed+3.591
 managed+3.537
oped+3.532
Neg logit contributions
 new-3.450
 red-3.432
 TV-3.253
 job-3.218
 sand-3.186
Token then
Token ID788
Feature activation-2.21e-01

Pos logit contributions
 thing+0.093
 new+0.093
 red+0.092
 job+0.090
 things+0.090
Neg logit contributions
olded-0.113
 agreed-0.098
ave-0.098
 could-0.095
os-0.095
Token,
Token ID11
Feature activation-1.62e-01

Pos logit contributions
 things+0.287
 new+0.286
 red+0.278
 thing+0.274
 food+0.270
Neg logit contributions
olded-0.543
ave-0.482
 agreed-0.473
 could-0.463
 suggested-0.458
Token it
Token ID340
Feature activation-1.21e-02

Pos logit contributions
 new+0.277
 things+0.271
 thing+0.269
 food+0.269
 red+0.267
Neg logit contributions
olded-0.327
ave-0.283
 agreed-0.283
 could-0.276
oped-0.276
Token started
Token ID2067
Feature activation-1.63e+00

Pos logit contributions
 new+0.027
 red+0.027
 big+0.027
 fake+0.027
 thing+0.026
Neg logit contributions
olded-0.017
 From-0.014
 agreed-0.014
oped-0.013
os-0.013
Token to
Token ID284
Feature activation-1.67e+00

Pos logit contributions
 new+2.145
 red+2.114
 gift+2.047
 fake+2.005
 rainy+1.989
Neg logit contributions
olded-3.688
 agreed-3.391
 could-3.360
 suggested-3.283
 From-3.229
Token get
Token ID651
Feature activation-3.01e+00

Pos logit contributions
 new+1.039
 red+1.020
 gift+0.983
 rainy+0.923
 sand+0.919
Neg logit contributions
olded-4.988
 could-4.614
 agreed-4.606
 suggested-4.541
 From-4.495
Token gloomy
Token ID46400
Feature activation-1.98e+00

Pos logit contributions
 gift+3.150
 new+3.001
 thing+2.949
 job+2.879
 red+2.852
Neg logit contributions
olded-7.722
 could-7.237
 agreed-7.144
 suggested-7.131
 couldn-6.935
Token outside
Token ID2354
Feature activation+8.57e-02

Pos logit contributions
 new+2.583
 red+2.542
 gift+2.527
 thing+2.442
 food+2.424
Neg logit contributions
olded-4.508
 agreed-4.108
 could-4.094
 suggested-4.026
 From-3.936
Token and
Token ID290
Feature activation+2.38e-01

Pos logit contributions
olded+0.229
 agreed+0.205
ave+0.204
 apologised+0.201
 suggested+0.200
Neg logit contributions
 new-0.094
 red-0.092
 things-0.088
 thing-0.088
 big-0.087
Token it
Token ID340
Feature activation+1.00e+00

Pos logit contributions
olded+0.541
ave+0.478
 agreed+0.470
 From+0.456
oped+0.450
Neg logit contributions
 red-0.372
 new-0.367
 sand-0.362
 dark-0.361
 rainy-0.356
Token She
Token ID1375
Feature activation-3.71e-01

Pos logit contributions
olded+0.527
 agreed+0.433
ave+0.430
 hadn+0.425
 apologised+0.422
Neg logit contributions
 red-0.781
 new-0.769
 big-0.744
 dark-0.737
 thing-0.724
Token knows
Token ID4206
Feature activation-2.18e-01

Pos logit contributions
 new+0.787
 red+0.772
 big+0.757
 fake+0.753
 gift+0.750
Neg logit contributions
olded-0.543
 agreed-0.457
 suggested-0.451
oped-0.451
os-0.448
Token he
Token ID339
Feature activation+3.87e-01

Pos logit contributions
 new+0.395
 things+0.372
 red+0.371
 food+0.362
 cake+0.362
Neg logit contributions
olded-0.421
ave-0.386
 agreed-0.370
oped-0.367
 suggested-0.358
Token did
Token ID750
Feature activation-3.29e+00

Pos logit contributions
olded+0.525
 agreed+0.429
 apologised+0.418
 suggested+0.417
 From+0.417
Neg logit contributions
 new-0.847
 things-0.833
 food-0.828
 thing-0.826
 red-0.826
Token not
Token ID407
Feature activation-2.29e+00

Pos logit contributions
 red+5.232
 gift+5.219
 cotton+5.094
 new+5.074
 sand+5.044
Neg logit contributions
olded-6.561
 could-6.118
 agreed-6.022
 suggested-5.868
 couldn-5.728
Token mean
Token ID1612
Feature activation-1.43e+00

Pos logit contributions
 new+3.295
 gift+3.254
 red+3.237
 thing+3.126
 food+3.100
Neg logit contributions
olded-4.881
 agreed-4.455
 could-4.450
 suggested-4.365
 From-4.232
Token to
Token ID284
Feature activation-2.23e+00

Pos logit contributions
 new+1.886
 red+1.832
 things+1.819
 thing+1.805
 gift+1.772
Neg logit contributions
olded-3.281
 agreed-2.975
 could-2.931
 suggested-2.910
 hadn-2.897
Token be
Token ID307
Feature activation-7.96e-01

Pos logit contributions
 new+3.047
 gift+3.009
 red+2.994
 thing+2.886
 food+2.859
Neg logit contributions
olded-4.921
 agreed-4.500
 could-4.497
 suggested-4.412
 From-4.293
Token bad
Token ID2089
Feature activation-1.17e+00

Pos logit contributions
 new+0.794
 red+0.784
 fake+0.739
 things+0.719
 gift+0.709
Neg logit contributions
olded-2.045
 could-1.936
 hadn-1.912
 From-1.905
 agreed-1.902
Token.
Token ID13
Feature activation+6.85e-01

Pos logit contributions
 new+1.282
 things+1.261
 red+1.252
 thing+1.243
 gift+1.243
Neg logit contributions
olded-2.891
 agreed-2.632
 could-2.628
 From-2.605
 hadn-2.601
Token She
Token ID1375
Feature activation-1.88e-01

Pos logit contributions
olded+1.059
 agreed+0.912
 apologised+0.876
 managed+0.867
 hadn+0.862
Neg logit contributions
 new-1.505
 red-1.500
 big-1.445
 dark-1.420
 things-1.398
Token families
Token ID4172
Feature activation+4.32e-01

Pos logit contributions
 new+1.562
 gift+1.512
 red+1.450
 food+1.381
 fake+1.380
Neg logit contributions
olded-3.765
 could-3.458
 agreed-3.454
 suggested-3.404
ave-3.333
Token about
Token ID546
Feature activation-8.27e-01

Pos logit contributions
olded+0.958
ave+0.857
oped+0.847
 apologised+0.833
 reached+0.832
Neg logit contributions
 new-0.699
 gift-0.640
 food-0.634
 things-0.628
 big-0.619
Token their
Token ID511
Feature activation-2.14e+00

Pos logit contributions
 new+1.032
 red+0.957
 gift+0.937
 things+0.928
 food+0.925
Neg logit contributions
olded-2.128
ave-1.909
 agreed-1.876
 suggested-1.844
 could-1.843
Token wonderful
Token ID7932
Feature activation-2.48e+00

Pos logit contributions
 new+1.487
 gift+1.445
 red+1.431
 thing+1.322
 food+1.300
Neg logit contributions
olded-6.159
 agreed-5.744
 could-5.741
 suggested-5.664
 From-5.550
Token day
Token ID1110
Feature activation-3.64e-01

Pos logit contributions
 new+1.932
 gift+1.903
 red+1.882
 thing+1.759
 fake+1.747
Neg logit contributions
olded-6.802
 agreed-6.389
 could-6.384
 suggested-6.279
 couldn-6.153
Token at
Token ID379
Feature activation-2.37e+00

Pos logit contributions
 new+0.399
 red+0.381
 things+0.366
 sunny+0.363
 gift+0.361
Neg logit contributions
olded-0.932
 agreed-0.845
 apologised-0.832
 From-0.822
 could-0.820
Token the
Token ID262
Feature activation-2.67e+00

Pos logit contributions
 new+2.995
 gift+2.965
 red+2.945
 thing+2.839
 food+2.806
Neg logit contributions
olded-5.422
 could-5.013
 agreed-5.006
 suggested-4.911
 From-4.789
Token new
Token ID649
Feature activation-1.93e+00

Pos logit contributions
 gift+2.138
 new+2.102
 red+2.073
 thing+1.983
 food+1.959
Neg logit contributions
olded-7.217
 could-6.830
 agreed-6.796
 suggested-6.687
 couldn-6.568
Token park
Token ID3952
Feature activation+1.09e+00

Pos logit contributions
 new+1.443
 gift+1.389
 red+1.385
 thing+1.295
 sand+1.272
Neg logit contributions
olded-5.565
 agreed-5.136
 could-5.135
 suggested-5.072
 From-4.977
Token.
Token ID13
Feature activation+1.38e+00

Pos logit contributions
olded+3.164
 apologised+2.852
ave+2.811
 From+2.808
 hadn+2.768
Neg logit contributions
 new-1.015
 red-0.941
 gift-0.913
 things-0.898
 food-0.887
Token<|endoftext|>
Token ID50256
Feature activation-6.98e-02

Pos logit contributions
olded+2.700
 apologised+2.377
 managed+2.371
ave+2.370
 agreed+2.325
Neg logit contributions
 new-2.728
 big-2.488
 red-2.465
 gift-2.386
 luxury-2.353
Token.
Token ID13
Feature activation+1.71e+00

Pos logit contributions
 new+0.781
 red+0.753
 thing+0.697
 gift+0.694
 things+0.686
Neg logit contributions
olded-1.683
 could-1.536
 agreed-1.527
 suggested-1.523
 hadn-1.504
Token "
Token ID366
Feature activation+5.40e-01

Pos logit contributions
olded+3.355
ave+2.989
 apologised+2.962
 agreed+2.870
 hadn+2.837
Neg logit contributions
 new-3.310
 red-3.223
 big-2.941
 things-2.907
 round-2.904
TokenI
Token ID40
Feature activation+1.21e+00

Pos logit contributions
olded+0.939
 agreed+0.811
 apologised+0.774
 suggested+0.765
 From+0.757
Neg logit contributions
 thing-1.018
 red-1.015
 new-0.998
 shiny-0.979
 big-0.978
Token don
Token ID836
Feature activation-1.16e+00

Pos logit contributions
olded+2.513
 From+2.193
 apologised+2.193
alled+2.159
 agreed+2.127
Neg logit contributions
 big-1.951
 thing-1.941
 gift-1.941
 new-1.939
 red-1.911
Token't
Token ID470
Feature activation-1.09e+00

Pos logit contributions
 new+2.105
 red+2.104
 gift+2.046
 big+2.042
 fake+2.016
Neg logit contributions
olded-2.014
 agreed-1.808
 could-1.731
 suggested-1.707
 apologised-1.688
Token believe
Token ID1975
Feature activation-1.43e+00

Pos logit contributions
 red+2.078
 gift+2.072
 new+2.059
 thing+2.026
 job+2.024
Neg logit contributions
olded-1.836
 could-1.559
 agreed-1.543
 From-1.541
 suggested-1.529
Token I
Token ID314
Feature activation+1.76e+00

Pos logit contributions
 new+2.147
 red+2.128
 gift+2.088
 things+2.052
 thing+2.049
Neg logit contributions
olded-2.993
 agreed-2.707
 suggested-2.656
 could-2.620
ave-2.568
Token can
Token ID460
Feature activation-7.44e-01

Pos logit contributions
olded+3.810
 From+3.583
 apologised+3.520
lett+3.380
 agreed+3.340
Neg logit contributions
 big-2.733
 red-2.676
 gift-2.639
 new-2.634
 shiny-2.604
Token do
Token ID466
Feature activation+2.04e-01

Pos logit contributions
 new+0.926
 red+0.923
 gift+0.918
 big+0.899
 job+0.875
Neg logit contributions
olded-1.752
 agreed-1.575
 suggested-1.546
 apologised-1.544
 From-1.534
Token it
Token ID340
Feature activation+2.85e-01

Pos logit contributions
olded+0.487
 agreed+0.450
 suggested+0.441
ave+0.438
oped+0.433
Neg logit contributions
 new-0.284
 things-0.273
 red-0.264
 shiny-0.260
 fake-0.258
Token."
Token ID526
Feature activation+1.85e+00

Pos logit contributions
olded+0.725
 apologised+0.651
 From+0.645
 hadn+0.644
 agreed+0.633
Neg logit contributions
 new-0.356
 red-0.342
 big-0.326
 things-0.323
 shiny-0.315
Token,
Token ID11
Feature activation+8.94e-01

Pos logit contributions
 new+1.342
 red+1.324
 thing+1.308
 gift+1.301
 things+1.299
Neg logit contributions
olded-2.433
 could-2.206
 agreed-2.201
 suggested-2.154
 From-2.149
Token and
Token ID290
Feature activation+2.40e+00

Pos logit contributions
olded+2.287
oped+2.056
ave+2.055
os+2.019
 agreed+1.982
Neg logit contributions
 new-1.201
 red-1.160
 things-1.105
 sunny-1.098
 food-1.092
Token so
Token ID523
Feature activation+1.10e+00

Pos logit contributions
olded+5.771
oped+5.408
os+5.281
ave+5.126
anced+4.872
Neg logit contributions
 new-4.021
 things-3.914
 dark-3.842
 big-3.789
 gloomy-3.683
Token she
Token ID673
Feature activation+2.58e+00

Pos logit contributions
olded+2.465
ave+2.108
oped+2.048
 agreed+2.026
 apologised+2.019
Neg logit contributions
 new-1.905
 big-1.869
 red-1.857
 sunny-1.818
 thing-1.799
Token asked
Token ID1965
Feature activation-2.49e-01

Pos logit contributions
olded+4.502
os+4.058
oped+3.937
 From+3.777
anced+3.728
Neg logit contributions
 big-5.631
 red-5.380
 thing-5.363
 dark-5.307
 food-5.273
Token her
Token ID607
Feature activation-2.41e+00

Pos logit contributions
 new+0.284
 things+0.258
 red+0.257
 gift+0.255
 thing+0.253
Neg logit contributions
olded-0.647
 agreed-0.584
 hadn-0.584
 suggested-0.579
oped-0.578
Token mom
Token ID1995
Feature activation-4.62e-01

Pos logit contributions
 new+2.446
 gift+2.418
 red+2.409
 thing+2.299
 sand+2.277
Neg logit contributions
olded-6.176
 could-5.740
 agreed-5.733
 suggested-5.639
 From-5.503
Token if
Token ID611
Feature activation+9.05e-01

Pos logit contributions
 new+0.652
 red+0.618
 gift+0.606
 things+0.589
 food+0.585
Neg logit contributions
olded-1.039
 agreed-0.941
 apologised-0.932
 hadn-0.931
 thanked-0.915
Token she
Token ID673
Feature activation+2.78e+00

Pos logit contributions
olded+1.803
ave+1.643
oped+1.617
 From+1.616
os+1.560
Neg logit contributions
 new-1.641
 things-1.603
 food-1.564
 cake-1.550
 thing-1.547
Token could
Token ID714
Feature activation+3.59e-01

Pos logit contributions
olded+3.107
alled+2.950
oped+2.569
 From+2.560
 Afterwards+2.418
Neg logit contributions
 big-7.539
 thing-7.392
 new-7.252
 red-7.222
 gift-7.212
Token have
Token ID423
Feature activation-9.95e-01

Pos logit contributions
olded+0.932
 agreed+0.835
 suggested+0.824
 From+0.824
 apologised+0.823
Neg logit contributions
 job-0.402
 big-0.392
 gift-0.392
 new-0.384
 dessert-0.382
Token upon
Token ID2402
Feature activation-1.93e+00

Pos logit contributions
 new+1.008
 red+0.953
 big+0.930
 gift+0.906
 thing+0.897
Neg logit contributions
olded-1.375
 apologised-1.207
 agreed-1.203
ave-1.189
 suggested-1.167
Token a
Token ID257
Feature activation-2.31e+00

Pos logit contributions
 new+1.857
 red+1.797
 gift+1.778
 thing+1.693
 food+1.692
Neg logit contributions
olded-5.053
 agreed-4.642
 could-4.622
 suggested-4.572
 apologised-4.474
Token time
Token ID640
Feature activation-1.12e+00

Pos logit contributions
 new+1.313
 gift+1.284
 red+1.259
 thing+1.154
 food+1.129
Neg logit contributions
olded-6.882
 agreed-6.462
 could-6.454
 suggested-6.369
 From-6.245
Token there
Token ID612
Feature activation+1.16e+00

Pos logit contributions
 new+1.341
 red+1.308
 thing+1.245
 gift+1.239
 big+1.220
Neg logit contributions
olded-2.721
 agreed-2.435
 apologised-2.412
 suggested-2.404
ave-2.390
Token was
Token ID373
Feature activation-3.14e+00

Pos logit contributions
olded+2.223
ave+1.737
 apologised+1.673
 agreed+1.530
 suggested+1.475
Neg logit contributions
 thing-2.580
 new-2.545
 big-2.491
 red-2.487
 sunny-2.410
Token a
Token ID257
Feature activation-3.52e+00

Pos logit contributions
 gift+3.708
 new+3.581
 red+3.452
 thing+3.452
 job+3.442
Neg logit contributions
olded-7.406
 could-7.051
 agreed-7.042
 suggested-6.922
 From-6.868
Token happy
Token ID3772
Feature activation-2.96e+00

Pos logit contributions
 gift+3.255
 food+3.092
 things+3.045
 attention+3.041
 new+3.001
Neg logit contributions
olded-9.010
 could-8.839
 agreed-8.679
 suggested-8.525
 couldn-8.419
Token boy
Token ID2933
Feature activation-1.72e+00

Pos logit contributions
 new+2.831
 gift+2.797
 red+2.732
 fake+2.663
 things+2.619
Neg logit contributions
olded-7.458
 agreed-7.088
 could-7.080
 suggested-6.948
 From-6.830
Token named
Token ID3706
Feature activation-7.46e-01

Pos logit contributions
 new+3.492
 red+3.491
 gift+3.469
 thing+3.410
 food+3.364
Neg logit contributions
olded-2.619
 agreed-2.283
 could-2.236
 suggested-2.190
 hadn-2.138
Token Sam
Token ID3409
Feature activation+2.32e-01

Pos logit contributions
 new+1.409
 red+1.391
 food+1.338
 big+1.336
 lunch+1.335
Neg logit contributions
olded-1.355
 apologised-1.157
 agreed-1.151
 could-1.119
 hadn-1.115
Token.
Token ID13
Feature activation-4.50e-01

Pos logit contributions
olded+0.676
 apologised+0.596
oped+0.589
 agreed+0.588
 From+0.583
Neg logit contributions
 new-0.197
 red-0.195
 big-0.178
 thing-0.173
 fake-0.165
Token She
Token ID1375
Feature activation-7.96e-01

Pos logit contributions
 red+3.770
 new+3.749
 gift+3.711
 sunny+3.651
 thing+3.637
Neg logit contributions
olded-1.810
 agreed-1.458
 could-1.414
 suggested-1.392
 apologised-1.366
Token loved
Token ID6151
Feature activation-2.83e+00

Pos logit contributions
 red+1.831
 new+1.817
 big+1.793
 shiny+1.771
 sunny+1.764
Neg logit contributions
olded-1.092
 agreed-0.881
 apologised-0.844
 suggested-0.838
oped-0.777
Token to
Token ID284
Feature activation-2.24e+00

Pos logit contributions
 gift+4.150
 new+4.120
 red+4.027
 thing+4.017
 job+3.936
Neg logit contributions
olded-5.839
 could-5.427
 agreed-5.349
 suggested-5.289
 couldn-5.159
Token play
Token ID711
Feature activation-3.47e+00

Pos logit contributions
 new+2.752
 gift+2.714
 red+2.698
 thing+2.591
 food+2.564
Neg logit contributions
olded-5.245
 agreed-4.824
 could-4.821
 suggested-4.735
 From-4.615
Token outside
Token ID2354
Feature activation-2.84e+00

Pos logit contributions
 gift+5.015
 new+4.904
 fake+4.790
 red+4.786
 rainy+4.766
Neg logit contributions
olded-7.331
 could-6.985
 agreed-6.888
 suggested-6.742
 From-6.565
Token in
Token ID287
Feature activation-3.86e+00

Pos logit contributions
 gift+4.097
 new+4.045
 red+3.877
 fake+3.846
 job+3.841
Neg logit contributions
olded-5.970
 could-5.676
 agreed-5.567
 suggested-5.462
 From-5.337
Token the
Token ID262
Feature activation-3.05e+00

Pos logit contributions
 gift+6.409
 job+6.328
 thing+6.009
 new+5.987
 task+5.897
Neg logit contributions
 could-7.054
olded-6.815
 agreed-6.765
 suggested-6.695
 From-6.686
Token green
Token ID4077
Feature activation-2.39e+00

Pos logit contributions
 gift+2.569
 new+2.409
 job+2.346
 red+2.324
 thing+2.312
Neg logit contributions
olded-8.097
 could-7.811
 agreed-7.727
 suggested-7.623
 From-7.569
Token grass
Token ID8701
Feature activation-1.06e+00

Pos logit contributions
 new+2.139
 gift+2.116
 red+2.073
 thing+1.960
 fake+1.950
Neg logit contributions
olded-6.325
 could-5.919
 agreed-5.910
 suggested-5.815
 From-5.690
Token and
Token ID290
Feature activation-2.07e+00

Pos logit contributions
 red+1.220
 new+1.205
 thing+1.137
 sunny+1.130
 big+1.120
Neg logit contributions
olded-2.642
 apologised-2.349
 agreed-2.337
 hadn-2.336
 could-2.312
Token pick
Token ID2298
Feature activation-2.63e+00

Pos logit contributions
 new+2.719
 red+2.680
 gift+2.661
 thing+2.559
 sand+2.546
Neg logit contributions
olded-4.718
 agreed-4.296
 could-4.271
 suggested-4.218
 From-4.102
Token and
Token ID290
Feature activation+1.78e+00

Pos logit contributions
olded+5.095
 reached+4.621
 From+4.614
 apologised+4.610
oped+4.563
Neg logit contributions
 thing-2.331
 new-2.282
 gift-2.247
 things-2.239
 red-2.090
Token they
Token ID484
Feature activation+1.59e+00

Pos logit contributions
oped+4.429
olded+4.391
ave+4.129
os+4.128
ver+3.853
Neg logit contributions
 new-2.487
 gift-2.427
 things-2.243
 cake-2.223
 red-2.155
Token went
Token ID1816
Feature activation-1.88e+00

Pos logit contributions
olded+2.774
oped+2.307
 From+2.292
os+2.251
 declared+2.189
Neg logit contributions
 gift-3.006
 big-2.902
 red-2.895
 pretty-2.854
 shiny-2.826
Token back
Token ID736
Feature activation-8.64e-01

Pos logit contributions
 new+2.271
 gift+2.210
 red+2.204
 food+2.086
 thing+2.079
Neg logit contributions
olded-4.443
 agreed-4.108
 could-4.093
 suggested-4.022
 hadn-3.951
Token to
Token ID284
Feature activation-1.61e+00

Pos logit contributions
 new+1.019
 red+0.951
 gift+0.925
 big+0.907
 food+0.900
Neg logit contributions
olded-2.153
 agreed-1.981
 apologised-1.943
 hadn-1.932
 could-1.929
Token the
Token ID262
Feature activation-3.43e+00

Pos logit contributions
 new+2.006
 gift+1.926
 red+1.916
 job+1.831
 food+1.821
Neg logit contributions
olded-3.860
 agreed-3.520
 could-3.508
 suggested-3.448
ave-3.377
Token store
Token ID3650
Feature activation+3.51e-01

Pos logit contributions
 gift+4.404
 cotton+4.355
 red+4.346
 sand+4.289
 food+4.267
Neg logit contributions
olded-7.684
 could-7.226
 suggested-6.974
 agreed-6.928
 From-6.910
Token to
Token ID284
Feature activation-1.32e+00

Pos logit contributions
olded+0.883
 apologised+0.807
oped+0.805
 could+0.805
 hadn+0.805
Neg logit contributions
 new-0.410
 red-0.365
 things-0.354
 big-0.349
 gift-0.347
Token exchange
Token ID5163
Feature activation-7.39e-01

Pos logit contributions
 new+1.871
 gift+1.819
 red+1.788
 job+1.745
 big+1.723
Neg logit contributions
olded-2.911
 agreed-2.634
 could-2.610
 suggested-2.593
ave-2.525
Token it
Token ID340
Feature activation+4.05e-02

Pos logit contributions
 new+0.810
 gift+0.756
 things+0.728
 food+0.727
 red+0.725
Neg logit contributions
olded-1.914
 agreed-1.766
 suggested-1.749
 hadn-1.743
oped-1.739
Token for
Token ID329
Feature activation-1.58e+00

Pos logit contributions
olded+0.108
oped+0.099
 agreed+0.098
 hadn+0.098
 apologised+0.097
Neg logit contributions
 new-0.045
 gift-0.041
 food-0.040
 things-0.040
 red-0.039
Token park
Token ID3952
Feature activation+1.42e-01

Pos logit contributions
 gift+3.181
 thing+2.920
 new+2.918
 job+2.874
 red+2.854
Neg logit contributions
olded-8.149
 could-7.881
 agreed-7.763
 suggested-7.750
 From-7.626
Token.
Token ID13
Feature activation-2.11e-01

Pos logit contributions
olded+0.397
 apologised+0.358
 hadn+0.354
ave+0.344
oped+0.342
Neg logit contributions
 red-0.139
 new-0.139
 thing-0.128
 sunny-0.128
 things-0.126
Token They
Token ID1119
Feature activation+3.22e-01

Pos logit contributions
 red+0.392
 new+0.389
 sunny+0.373
 lunch+0.370
 sand+0.369
Neg logit contributions
olded-0.395
 agreed-0.340
 apologised-0.339
ave-0.335
 hadn-0.330
Token see
Token ID766
Feature activation-1.54e+00

Pos logit contributions
olded+0.810
 apologised+0.693
 From+0.684
 suggested+0.679
ave+0.677
Neg logit contributions
 red-0.446
 big-0.432
 sunny-0.421
 shiny-0.411
 new-0.405
Token a
Token ID257
Feature activation-3.88e+00

Pos logit contributions
 new+1.537
 red+1.499
 sand+1.388
 gift+1.378
 big+1.365
Neg logit contributions
olded-4.094
 agreed-3.715
 could-3.680
 suggested-3.672
 hadn-3.622
Token big
Token ID1263
Feature activation-3.62e+00

Pos logit contributions
 gift+2.839
 attention+2.748
 food+2.655
 job+2.584
 lunch+2.535
Neg logit contributions
 could-10.533
olded-10.417
 suggested-10.171
 agreed-10.137
 couldn-10.007
Token slide
Token ID10649
Feature activation-5.60e-01

Pos logit contributions
 gift+3.699
 new+3.652
 job+3.408
 food+3.400
 attention+3.399
Neg logit contributions
olded-8.600
 agreed-8.470
 could-8.468
 suggested-8.226
 couldn-8.099
Token.
Token ID13
Feature activation+5.28e-01

Pos logit contributions
 red+0.679
 new+0.644
 thing+0.614
 big+0.601
 things+0.592
Neg logit contributions
olded-1.401
 apologised-1.247
 agreed-1.243
 hadn-1.234
 From-1.233
Token Sara
Token ID24799
Feature activation+1.96e+00

Pos logit contributions
olded+0.961
 apologised+0.841
ave+0.831
 hadn+0.828
 agreed+0.810
Neg logit contributions
 red-1.043
 new-1.022
 big-0.974
 sand-0.973
 lunch-0.972
Token wants
Token ID3382
Feature activation-9.03e-01

Pos logit contributions
olded+4.380
 apologised+3.771
ave+3.635
oped+3.606
owed+3.591
Neg logit contributions
 red-3.728
 new-3.418
 big-3.415
 shiny-3.235
 things-3.229
Token to
Token ID284
Feature activation-9.88e-01

Pos logit contributions
 new+1.167
 red+1.136
 food+1.073
 things+1.064
 big+1.055
Neg logit contributions
olded-2.184
 agreed-1.952
 hadn-1.941
 apologised-1.930
 From-1.929
Token even
Token ID772
Feature activation+1.60e-01

Pos logit contributions
olded+6.698
oped+5.434
os+5.397
 Did+5.302
ver+5.134
Neg logit contributions
 big-5.864
 new-5.700
 red-5.597
 food-5.589
 shiny-5.511
Token came
Token ID1625
Feature activation-1.25e+00

Pos logit contributions
olded+0.213
 From+0.162
oped+0.162
 hadn+0.161
 could+0.154
Neg logit contributions
 new-0.376
 red-0.368
 food-0.359
 gift-0.357
 big-0.357
Token to
Token ID284
Feature activation-1.12e+00

Pos logit contributions
 new+1.371
 red+1.330
 gift+1.269
 big+1.249
 fake+1.226
Neg logit contributions
olded-3.164
 agreed-2.923
 could-2.882
 suggested-2.858
 hadn-2.815
Token check
Token ID2198
Feature activation-5.73e-01

Pos logit contributions
 new+1.428
 gift+1.373
 red+1.345
 food+1.311
 sand+1.293
Neg logit contributions
olded-2.741
 could-2.431
 agreed-2.428
 suggested-2.417
ave-2.377
Token on
Token ID319
Feature activation-7.62e-01

Pos logit contributions
 new+0.620
 things+0.579
 gift+0.578
 red+0.577
 thing+0.575
Neg logit contributions
olded-1.485
 agreed-1.378
 suggested-1.358
 apologised-1.328
 hadn-1.322
Token the
Token ID262
Feature activation-3.89e+00

Pos logit contributions
 new+0.980
 red+0.938
 food+0.914
 fake+0.891
 things+0.891
Neg logit contributions
olded-1.882
 agreed-1.719
 suggested-1.667
ave-1.656
 could-1.656
Token bird
Token ID6512
Feature activation-8.08e-01

Pos logit contributions
 gift+3.544
 job+3.513
 sunny+3.449
 new+3.443
 cotton+3.433
Neg logit contributions
olded-9.971
 could-9.786
 agreed-9.495
 suggested-9.462
 couldn-9.404
Token and
Token ID290
Feature activation+1.69e+00

Pos logit contributions
 new+1.619
 red+1.565
 gift+1.561
 food+1.561
 cake+1.518
Neg logit contributions
olded-1.331
 agreed-1.141
 could-1.093
 apologised-1.078
 suggested-1.075
Token they
Token ID484
Feature activation+1.30e+00

Pos logit contributions
olded+3.639
oped+3.159
ave+3.100
os+3.074
 From+2.943
Neg logit contributions
 new-3.092
 food-2.993
 sand-2.904
 cake-2.897
 shiny-2.890
Token became
Token ID2627
Feature activation-1.42e+00

Pos logit contributions
olded+2.343
 From+1.853
 Perhaps+1.748
ave+1.727
Well+1.706
Neg logit contributions
 big-2.597
 red-2.567
 dark-2.545
 sand-2.544
 shiny-2.526
Token friends
Token ID2460
Feature activation-7.35e-02

Pos logit contributions
 new+1.024
 red+0.962
 gift+0.921
 food+0.892
 things+0.883
Neg logit contributions
olded-4.164
 agreed-3.822
 could-3.796
 suggested-3.763
ave-3.722