TOP ACTIVATIONS
MAX = 32.529

of a plan to unf ree ze the saf ari .
said , " Ant if ree ze . Poison . Do
so she could run fr ree ly . The
Now , I will unf ree ze you all !" He
teacher told them to unf ree ze and sit back down
present ?" B ree said , " Well ,
Wh isk ers . Wh isk ers liked to sleep all
loved to play with Wh isk ers and eat y ummy
a lazy cat named Wh isk ers . Wh isk ers
saw that Mr . Wh isk ers was lost and she
she could and pushed Wh isk ers out of the way
I 'm sorry , Wh isk ers . Let 's go
her pet cat , Wh isk ers . Lily loved to
One sunny day , Wh isk ers went for a walk
. One day , Wh isk ers saw a can of
. After Wh isk ers got better , he
Wh isk ers . Wh isk ers looked at Spot and
scared . Mr . Wh isk ers said , " Don
wise old cat named Wh isk ers . Wh isk ers
family sh oo ed Wh isk ers away and told Max

MIN ACTIVATIONS
MIN = -23.660

make the frame of the puzzle . Then , they look
You can 't have that puzzle , Lisa . It is
smiled . She liked the puzzle . Sam watched
to solve the giant 's puzzle . Finally , she managed
clever , and soon the puzzle was finished . All the
. " I became a puzzle master !" <|endoftext|> Once upon
work . He hugged his puzzle and took it home with
. She saw a black puzzle sitting on the shelf .
Tim was playing with the puzzle , his friend Sara came
. We can fix the puzzle ." She showed him how
inside . It is a puzzle . It has many pieces
He decided to join a puzzle - s olving group .
who can put together the puzzle first ."
lots of fun putting the puzzle together . They were very
I will clean up the puzzle ." Tom 's mom smiled
the key to completing the puzzle . <|endoftext|> Once upon a
and started to put the puzzle together . She worked very
. She wanted a new puzzle that was harder and more
happy that he finished the puzzle all on his own .
pieces and pie cing the puzzle together ! <|endoftext|> Once upon

INTERVAL 18.482 - 32.529
CONTAINS 0.152%

skip around the beach and Bob followed him .
. <|endoftext|> One morning , Bob the bear was walking in
. He shouted with g lee , " Look , Molly
not pleased . Anna sc olded the goat . Anna said
Suddenly , the mouse sc ur ried over to the edge

INTERVAL 4.435 - 18.482
CONTAINS 25.918%

could get better . Emily was really sad
't keep going . That 's when the hardest
watch the birds who flew around . One day
interesting so she picked it up . When
to help him . She asked her mom if she could

INTERVAL -9.612 - 4.435
CONTAINS 68.364%

dad and little brother were happy to see the strawberries .
from the ground . He looked down and saw a small
End . <|endoftext|> Once upon a time there was a boy
The man smiled and came closer . He said "
inside , he found his mom . She was feeling jealous

INTERVAL -23.660 - -9.612
CONTAINS 5.566%

, and he hoped he would always receive special things like
" Mom my , where are you ? Come back !"
to resist and spend one more night in the magical forest
the mail man that there is mail inside the mailbox ,"
his be ak and opened it . Inside the case was
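
The four INTERVAL ranges above look like an equal-width split of the span from MIN to MAX. The sketch below is a minimal check of that reading, assuming equal-width binning is how the dashboard derived them; the source does not state this. The name `interval_edges` and the constants are illustrative, copied from the values listed above.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as reported in the listing (already rounded to 3 decimals).
ACT_MIN, ACT_MAX = -23.660, 32.529

edges = interval_edges(ACT_MIN, ACT_MAX)

# Pair adjacent edges and reverse so the highest interval comes first,
# matching the order used in the listing.
intervals = list(zip(edges[:-1], edges[1:]))[::-1]

# CONTAINS percentages copied from the listing, highest interval first.
shares = [0.152, 25.918, 68.364, 5.566]

for (a, b), pct in zip(intervals, shares):
    print(f"INTERVAL {a:.3f} - {b:.3f}  CONTAINS {pct:.3f}%")
```

Because MIN and MAX are themselves rounded, the recomputed edges can differ from the listed ones in the last decimal place. The four percentages sum to 100%, so the intervals cover every recorded activation.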
Token of
Token ID286
Feature activation-3.83e+00

Pos logit contributions
ur+0.322
arted+0.313
 stro+0.283
ked+0.283
anged+0.274
Neg logit contributions
 best-0.302
 world-0.294
 thing-0.293
 biggest-0.292
 lot-0.283
Token a
Token ID257
Feature activation-2.89e+00

Pos logit contributions
 thing+0.177
 best+0.175
 world+0.173
 problem+0.172
 biggest+0.170
Neg logit contributions
arted-0.181
ur-0.177
 stro-0.166
ked-0.165
oured-0.153
Token plan
Token ID1410
Feature activation-6.18e+00

Pos logit contributions
iest+0.068
 thing+0.064
 things+0.063
 biggest+0.059
 best+0.059
Neg logit contributions
 stro-0.177
ur-0.176
arted-0.168
 pat-0.165
ree-0.165
Token to
Token ID284
Feature activation+6.96e+00

Pos logit contributions
 best+0.249
 thing+0.239
 world+0.234
 biggest+0.234
 things+0.229
Neg logit contributions
arted-0.356
ur-0.354
ked-0.332
 stro-0.328
anged-0.324
Token unf
Token ID3684
Feature activation+1.24e+01

Pos logit contributions
arted+0.369
ur+0.358
ked+0.340
 stro+0.335
anged+0.333
Neg logit contributions
 best-0.322
 thing-0.314
 world-0.303
 things-0.303
 lot-0.297
Tokenree
Token ID631
Feature activation+3.25e+01

Pos logit contributions
arted+0.345
 stro+0.339
oured+0.298
ged+0.295
 argued+0.292
Neg logit contributions
 thing-0.793
 world-0.761
 important-0.697
 end-0.695
 well-0.690
Tokenze
Token ID2736
Feature activation+1.06e+01

Pos logit contributions
arted+1.739
ur+1.665
anged+1.561
ked+1.535
oured+1.528
Neg logit contributions
 best-1.488
 thing-1.456
 things-1.443
 world-1.399
 important-1.368
Token the
Token ID262
Feature activation+1.72e+00

Pos logit contributions
arted+0.580
ur+0.569
ked+0.536
 stro+0.531
anged+0.523
Neg logit contributions
 best-0.465
 thing-0.451
 world-0.437
 things-0.434
 biggest-0.427
Token saf
Token ID1932
Feature activation+1.68e+00

Pos logit contributions
arted+0.098
ur+0.097
ked+0.091
 stro+0.088
ree+0.087
Neg logit contributions
 best-0.063
 thing-0.063
iest-0.060
 things-0.057
 important-0.057
Tokenari
Token ID2743
Feature activation+3.82e+00

Pos logit contributions
ur+0.070
arted+0.070
ked+0.064
 stro+0.061
anged+0.061
Neg logit contributions
 best-0.095
 thing-0.094
 things-0.091
 important-0.090
 biggest-0.090
Token.
Token ID13
Feature activation-2.74e+00

Pos logit contributions
arted+0.210
ur+0.208
ked+0.195
 stro+0.192
anged+0.187
Neg logit contributions
 best-0.162
 thing-0.155
 biggest-0.152
 lot-0.150
 things-0.149
Token said
Token ID531
Feature activation+1.17e+01

Pos logit contributions
 best+0.217
 thing+0.210
 things+0.199
 world+0.198
 important+0.195
Neg logit contributions
arted-0.376
ur-0.368
ked-0.348
anged-0.343
 stro-0.343
Token,
Token ID11
Feature activation+3.22e+00

Pos logit contributions
arted+0.634
ur+0.611
ked+0.577
 stro+0.570
anged+0.568
Neg logit contributions
 best-0.534
 thing-0.520
 things-0.501
 world-0.494
 biggest-0.487
Token "
Token ID366
Feature activation-7.14e-01

Pos logit contributions
arted+0.214
ur+0.205
anged+0.199
oured+0.195
ked+0.185
Neg logit contributions
 best-0.109
 thing-0.107
 lot-0.106
 things-0.105
 world-0.102
TokenAnt
Token ID13217
Feature activation+4.52e+00

Pos logit contributions
 thing+0.030
 best+0.030
 world+0.029
 things+0.029
 lot+0.028
Neg logit contributions
arted-0.041
ur-0.041
ked-0.037
 stro-0.037
oured-0.037
Tokenif
Token ID361
Feature activation-5.11e+00

Pos logit contributions
ur+0.191
anged+0.176
arted+0.170
 stro+0.165
ked+0.164
Neg logit contributions
 biggest-0.251
 best-0.249
 world-0.244
 thing-0.242
 important-0.237
Tokenree
Token ID631
Feature activation+3.24e+01

Pos logit contributions
 world+0.193
 best+0.186
 thing+0.184
 universe+0.180
 game+0.177
Neg logit contributions
ur-0.299
arted-0.286
ked-0.268
anged-0.268
olded-0.265
Tokenze
Token ID2736
Feature activation+1.04e+01

Pos logit contributions
arted+1.545
ur+1.503
ked+1.396
 stro+1.379
anged+1.365
Neg logit contributions
 best-1.655
 thing-1.619
 things-1.569
 world-1.563
 biggest-1.535
Token.
Token ID13
Feature activation-2.91e+00

Pos logit contributions
ur+0.581
arted+0.580
ked+0.544
 stro+0.531
ree+0.519
Neg logit contributions
 best-0.420
 world-0.405
 biggest-0.397
 thing-0.395
 important-0.380
Token Poison
Token ID19357
Feature activation+5.92e-01

Pos logit contributions
 thing+0.118
 best+0.115
 things+0.111
 biggest+0.109
 world+0.108
Neg logit contributions
arted-0.175
ur-0.157
ked-0.154
anged-0.153
oured-0.152
Token.
Token ID13
Feature activation-2.93e+00

Pos logit contributions
ur+0.042
arted+0.042
ked+0.040
 stro+0.039
anged+0.039
Neg logit contributions
 best-0.015
 biggest-0.014
 world-0.014
 thing-0.014
 important-0.014
Token Do
Token ID2141
Feature activation-6.81e-01

Pos logit contributions
 thing+0.128
 best+0.126
 things+0.122
 world+0.120
 lot+0.118
Neg logit contributions
arted-0.166
ur-0.150
ked-0.146
oured-0.145
anged-0.143
Token so
Token ID523
Feature activation-1.47e+00

Pos logit contributions
ur+0.064
arted+0.064
ked+0.061
 stro+0.059
oured+0.058
Neg logit contributions
 best-0.069
 thing-0.068
 world-0.067
 things-0.066
 lot-0.066
Token she
Token ID673
Feature activation-9.03e+00

Pos logit contributions
 world+0.063
 problem+0.058
 best+0.055
 story+0.055
 game+0.055
Neg logit contributions
ur-0.081
 stro-0.075
arted-0.074
anged-0.070
ked-0.070
Token could
Token ID714
Feature activation-5.03e+00

Pos logit contributions
 best+0.360
 thing+0.354
 world+0.345
 things+0.337
 biggest+0.333
Neg logit contributions
ur-0.525
arted-0.525
 stro-0.496
ked-0.493
anged-0.477
Token run
Token ID1057
Feature activation+2.23e+00

Pos logit contributions
 best+0.237
 thing+0.231
 world+0.224
 things+0.223
 biggest+0.219
Neg logit contributions
arted-0.260
ur-0.255
ked-0.239
 stro-0.237
anged-0.233
Token fr
Token ID1216
Feature activation+2.81e+00

Pos logit contributions
arted+0.104
ur+0.102
ked+0.095
 stro+0.093
anged+0.091
Neg logit contributions
 best-0.116
 thing-0.112
 world-0.110
 things-0.108
 biggest-0.108
Tokenree
Token ID631
Feature activation+3.22e+01

Pos logit contributions
arted+0.158
ur+0.154
ked+0.148
 stro+0.147
anged+0.144
Neg logit contributions
 best-0.119
 thing-0.115
 things-0.112
 world-0.112
 important-0.109
Tokenly
Token ID306
Feature activation+5.34e+00

Pos logit contributions
arted+1.605
ur+1.570
ked+1.469
 stro+1.452
anged+1.431
Neg logit contributions
 best-1.577
 thing-1.537
 world-1.492
 things-1.485
 biggest-1.460
Token.
Token ID13
Feature activation-3.03e+00

Pos logit contributions
arted+0.262
ur+0.258
 stro+0.248
ked+0.247
oured+0.233
Neg logit contributions
 best-0.250
 important-0.242
 biggest-0.237
 world-0.228
 thing-0.226
Token
Token ID198
Feature activation+6.10e+00

Pos logit contributions
 world+0.121
 lot+0.120
 best+0.119
 day+0.116
 things+0.114
Neg logit contributions
arted-0.182
ked-0.180
ur-0.176
anged-0.168
oured-0.164
Token
Token ID198
Feature activation+6.08e+00

Pos logit contributions
arted+0.357
ked+0.347
ur+0.345
anged+0.329
oured+0.326
Neg logit contributions
 best-0.259
 world-0.258
 lot-0.254
 thing-0.252
 things-0.243
TokenThe
Token ID464
Feature activation-2.67e+00

Pos logit contributions
arted+0.350
ur+0.346
 stro+0.328
ree+0.323
ked+0.316
Neg logit contributions
 best-0.248
 thing-0.235
 biggest-0.228
 things-0.227
 problem-0.223
TokenNow
Token ID3844
Feature activation-1.35e+00

Pos logit contributions
 best+0.118
 thing+0.117
 world+0.114
 things+0.110
 biggest+0.108
Neg logit contributions
arted-0.184
ur-0.183
ked-0.171
anged-0.170
oured-0.168
Token,
Token ID11
Feature activation+1.34e+00

Pos logit contributions
 best+0.051
 thing+0.049
 world+0.048
 things+0.047
 biggest+0.047
Neg logit contributions
arted-0.081
ur-0.080
ked-0.075
 stro-0.075
anged-0.073
Token I
Token ID314
Feature activation-9.40e+00

Pos logit contributions
arted+0.067
ur+0.067
ked+0.062
 stro+0.062
anged+0.060
Neg logit contributions
 best-0.064
 thing-0.062
 world-0.061
 things-0.060
 biggest-0.059
Token will
Token ID481
Feature activation-1.18e+01

Pos logit contributions
 best+0.363
 thing+0.352
 world+0.339
 things+0.337
 biggest+0.331
Neg logit contributions
arted-0.564
ur-0.556
ked-0.525
 stro-0.523
anged-0.515
Token unf
Token ID3684
Feature activation+1.10e+01

Pos logit contributions
 best+0.526
 thing+0.512
 world+0.495
 things+0.491
 lot+0.484
Neg logit contributions
arted-0.646
ur-0.622
ked-0.594
anged-0.583
 stro-0.577
Tokenree
Token ID631
Feature activation+3.12e+01

Pos logit contributions
arted+0.253
 stro+0.243
 inspected+0.224
ked+0.222
oured+0.218
Neg logit contributions
 thing-0.708
 world-0.673
 end-0.660
 story-0.631
 game-0.620
Tokenze
Token ID2736
Feature activation+9.38e+00

Pos logit contributions
arted+1.729
ur+1.687
anged+1.568
ked+1.563
 stro+1.551
Neg logit contributions
 best-1.362
 thing-1.329
 things-1.294
 world-1.267
 lot-1.237
Token you
Token ID345
Feature activation-8.34e+00

Pos logit contributions
arted+0.499
ur+0.490
ked+0.461
 stro+0.456
anged+0.447
Neg logit contributions
 best-0.424
 thing-0.411
 world-0.399
 things-0.395
 biggest-0.391
Token all
Token ID477
Feature activation-2.60e+00

Pos logit contributions
 best+0.369
 thing+0.364
 world+0.349
 things+0.348
 important+0.342
Neg logit contributions
arted-0.445
ur-0.441
ked-0.412
 stro-0.407
oured-0.397
Token!"
Token ID2474
Feature activation+1.74e+00

Pos logit contributions
 best+0.112
 thing+0.107
 biggest+0.104
 things+0.103
 world+0.103
Neg logit contributions
arted-0.140
ur-0.137
 stro-0.125
ked-0.124
oured-0.121
Token He
Token ID679
Feature activation+1.37e+00

Pos logit contributions
arted+0.095
ur+0.089
ked+0.089
anged+0.087
oured+0.084
Neg logit contributions
 things-0.073
 best-0.073
 world-0.072
 lot-0.072
 day-0.072
Token teacher
Token ID4701
Feature activation+2.01e+00

Pos logit contributions
arted+0.006
ur+0.005
anged+0.005
ree+0.005
olded+0.005
Neg logit contributions
 best-0.006
 thing-0.006
 world-0.006
 biggest-0.006
 important-0.006
Token told
Token ID1297
Feature activation+1.23e+01

Pos logit contributions
arted+0.102
ur+0.098
ked+0.091
ree+0.088
anged+0.088
Neg logit contributions
 best-0.100
 thing-0.096
 things-0.093
 world-0.092
 important-0.091
Token them
Token ID606
Feature activation+4.23e+00

Pos logit contributions
arted+0.573
ur+0.570
ked+0.530
 stro+0.522
anged+0.519
Neg logit contributions
 best-0.642
 thing-0.637
 things-0.629
 lot-0.613
 world-0.608
Token to
Token ID284
Feature activation+5.29e+00

Pos logit contributions
ur+0.188
arted+0.185
 stro+0.169
ked+0.167
anged+0.161
Neg logit contributions
 best-0.220
 world-0.219
 thing-0.215
 biggest-0.213
 lot-0.205
Token unf
Token ID3684
Feature activation+1.08e+01

Pos logit contributions
arted+0.304
ur+0.294
ked+0.284
anged+0.277
 stro+0.276
Neg logit contributions
 best-0.224
 thing-0.219
 world-0.211
 things-0.209
 lot-0.206
Tokenree
Token ID631
Feature activation+3.08e+01

Pos logit contributions
arted+0.230
 Sammy+0.224
 stro+0.224
 Joe+0.218
oured+0.207
Neg logit contributions
 thing-0.730
 world-0.659
 game-0.636
 well-0.634
 important-0.630
Tokenze
Token ID2736
Feature activation+8.85e+00

Pos logit contributions
arted+1.784
ur+1.723
anged+1.598
oured+1.571
ked+1.562
Neg logit contributions
 best-1.289
 thing-1.275
 things-1.267
 important-1.182
 world-1.179
Token and
Token ID290
Feature activation+4.31e+00

Pos logit contributions
arted+0.490
ur+0.481
ked+0.451
 stro+0.447
anged+0.443
Neg logit contributions
 best-0.387
 thing-0.379
 things-0.365
 world-0.364
 important-0.355
Token sit
Token ID1650
Feature activation+5.56e+00

Pos logit contributions
arted+0.224
ur+0.217
ked+0.205
anged+0.200
 stro+0.199
Neg logit contributions
 best-0.204
 thing-0.200
 things-0.194
 world-0.191
 important-0.189
Token back
Token ID736
Feature activation+2.35e+00

Pos logit contributions
arted+0.308
ur+0.305
ked+0.279
 stro+0.278
anged+0.278
Neg logit contributions
 best-0.251
 thing-0.242
 things-0.234
 world-0.232
 important-0.227
Token down
Token ID866
Feature activation+7.04e+00

Pos logit contributions
arted+0.137
ur+0.134
ked+0.127
 stro+0.126
anged+0.124
Neg logit contributions
 best-0.097
 thing-0.093
 things-0.091
 world-0.090
 important-0.088
Token present
Token ID1944
Feature activation-1.17e+01

Pos logit contributions
 best+0.331
 important+0.320
 biggest+0.317
 world+0.310
 strongest+0.310
Neg logit contributions
ur-0.560
 stro-0.553
oured-0.540
ked-0.506
ree-0.499
Token?"
Token ID1701
Feature activation+1.44e+00

Pos logit contributions
 best+0.487
 thing+0.474
 world+0.460
 things+0.460
 lot+0.455
Neg logit contributions
arted-0.642
ur-0.640
ked-0.596
 stro-0.588
oured-0.584
Token
Token ID198
Feature activation+4.38e+00

Pos logit contributions
arted+0.058
ur+0.057
oured+0.053
ked+0.051
anged+0.051
Neg logit contributions
 best-0.087
 things-0.084
 thing-0.083
 lot-0.083
 world-0.083
Token
Token ID198
Feature activation+4.38e+00

Pos logit contributions
arted+0.254
ur+0.250
ked+0.241
oured+0.240
anged+0.238
Neg logit contributions
 best-0.189
 lot-0.181
 world-0.179
 thing-0.174
 things-0.174
TokenB
Token ID33
Feature activation+2.98e+00

Pos logit contributions
arted+0.216
ur+0.212
anged+0.198
oured+0.197
ked+0.197
Neg logit contributions
 best-0.219
 thing-0.214
 world-0.211
 lot-0.209
 things-0.209
Tokenree
Token ID631
Feature activation+3.04e+01

Pos logit contributions
arted+0.110
ur+0.104
 stro+0.098
ked+0.096
anged+0.093
Neg logit contributions
 best-0.179
 thing-0.176
 world-0.175
 things-0.171
 lot-0.169
Token said
Token ID531
Feature activation+9.64e+00

Pos logit contributions
arted+1.663
ur+1.471
anged+1.466
pered+1.334
oured+1.327
Neg logit contributions
 thing-1.304
 best-1.264
 important-1.250
 is-1.223
 things-1.188
Token,
Token ID11
Feature activation+1.17e+00

Pos logit contributions
arted+0.571
ur+0.530
anged+0.524
oured+0.513
ked+0.509
Neg logit contributions
 best-0.421
 thing-0.409
 things-0.383
 important-0.367
 biggest-0.366
Token "
Token ID366
Feature activation-2.80e+00

Pos logit contributions
arted+0.072
anged+0.070
ur+0.070
oured+0.066
ree+0.064
Neg logit contributions
 best-0.046
 thing-0.044
 lot-0.042
 things-0.042
 world-0.041
TokenWell
Token ID5779
Feature activation-2.34e+00

Pos logit contributions
 best+0.105
 thing+0.096
 things+0.095
 problem+0.094
 important+0.092
Neg logit contributions
arted-0.176
ur-0.169
 stro-0.167
ked-0.166
ree-0.160
Token,
Token ID11
Feature activation+1.09e+00

Pos logit contributions
 best+0.112
 thing+0.110
 world+0.107
 things+0.105
 biggest+0.104
Neg logit contributions
arted-0.121
ur-0.119
ked-0.110
 stro-0.109
anged-0.107
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.480
ur+0.455
ked+0.423
anged+0.415
 stro+0.412
Neg logit contributions
 best-0.874
 thing-0.859
 things-0.841
 world-0.836
 lot-0.831
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.330
ur+0.305
ked+0.255
 stro+0.255
anged+0.230
Neg logit contributions
 best-1.432
 thing-1.396
 world-1.390
 things-1.372
 important-1.363
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
arted+1.483
ur+1.462
ked+1.368
 stro+1.320
anged+1.293
Neg logit contributions
 best-1.275
 thing-1.230
 lot-1.202
 biggest-1.201
 world-1.201
Token.
Token ID13
Feature activation-2.73e+00

Pos logit contributions
arted+0.102
ur+0.100
ked+0.095
 stro+0.094
anged+0.093
Neg logit contributions
 best-0.056
 thing-0.054
 world-0.052
 things-0.051
 lot-0.050
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
 best+0.178
 thing+0.174
 things+0.171
 world+0.170
 lot+0.169
Neg logit contributions
arted-0.094
ur-0.090
ked-0.081
 stro-0.080
anged-0.078
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.270
ur+0.250
ked+0.195
 stro+0.188
anged+0.173
Neg logit contributions
 best-1.488
 thing-1.464
 world-1.442
 things-1.437
 important-1.421
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
arted+1.389
ur+1.361
ked+1.276
 stro+1.210
ree+1.178
Neg logit contributions
 best-1.336
 problem-1.282
 thing-1.278
 biggest-1.270
 world-1.264
Token liked
Token ID8288
Feature activation-1.45e+00

Pos logit contributions
ur+0.076
 stro+0.074
arted+0.073
ked+0.072
anged+0.070
Neg logit contributions
 biggest-0.071
 best-0.068
 thing-0.068
 world-0.066
 things-0.065
Token to
Token ID284
Feature activation+6.93e+00

Pos logit contributions
 best+0.083
 thing+0.082
 things+0.080
 world+0.079
 important+0.078
Neg logit contributions
arted-0.061
ur-0.059
ked-0.055
 stro-0.054
anged-0.054
Token sleep
Token ID3993
Feature activation+4.87e+00

Pos logit contributions
arted+0.338
ur+0.333
ked+0.310
 stro+0.309
anged+0.302
Neg logit contributions
 best-0.346
 thing-0.336
 world-0.327
 things-0.324
 biggest-0.321
Token all
Token ID477
Feature activation-1.61e+00

Pos logit contributions
arted+0.239
ur+0.237
ked+0.227
 stro+0.224
anged+0.210
Neg logit contributions
 best-0.223
 thing-0.220
 world-0.216
 biggest-0.215
 things-0.208
Token loved
Token ID6151
Feature activation-2.25e+00

Pos logit contributions
arted+0.401
ur+0.389
ked+0.372
anged+0.365
oured+0.362
Neg logit contributions
 best-0.211
 thing-0.199
 world-0.194
 things-0.191
 important-0.190
Token to
Token ID284
Feature activation+6.97e+00

Pos logit contributions
 things+0.115
 best+0.113
 thing+0.112
 lot+0.109
 important+0.109
Neg logit contributions
arted-0.112
ur-0.108
oured-0.106
ked-0.105
anged-0.102
Token play
Token ID711
Feature activation+3.40e+00

Pos logit contributions
arted+0.412
ur+0.381
ked+0.377
anged+0.364
oured+0.363
Neg logit contributions
 best-0.297
 thing-0.297
 lot-0.289
 things-0.287
 world-0.281
Token with
Token ID351
Feature activation+7.26e+00

Pos logit contributions
arted+0.187
ur+0.180
anged+0.170
ked+0.168
oured+0.166
Neg logit contributions
 thing-0.159
 best-0.159
 things-0.155
 world-0.151
 lot-0.150
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.306
ur+0.297
ked+0.273
 stro+0.267
anged+0.265
Neg logit contributions
 best-0.419
 thing-0.411
 things-0.404
 world-0.397
 lot-0.396
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.268
ur+0.249
ked+0.193
 stro+0.183
anged+0.172
Neg logit contributions
 best-1.487
 thing-1.466
 world-1.439
 things-1.436
 biggest-1.424
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
ur+1.441
arted+1.430
ked+1.321
 stro+1.282
ree+1.254
Neg logit contributions
 best-1.304
 thing-1.254
 problem-1.237
 world-1.235
 biggest-1.232
Token and
Token ID290
Feature activation+6.05e+00

Pos logit contributions
ur+0.069
arted+0.068
ked+0.065
 stro+0.064
anged+0.060
Neg logit contributions
 thing-0.083
 best-0.082
 world-0.082
 biggest-0.081
 things-0.079
Token eat
Token ID4483
Feature activation+5.06e+00

Pos logit contributions
arted+0.324
ur+0.310
oured+0.291
ked+0.291
anged+0.285
Neg logit contributions
 best-0.290
 thing-0.283
 things-0.282
 lot-0.277
 world-0.266
Token y
Token ID331
Feature activation-2.68e+00

Pos logit contributions
arted+0.306
ur+0.301
ked+0.285
 stro+0.283
anged+0.278
Neg logit contributions
 best-0.195
 thing-0.189
 world-0.182
 things-0.179
 biggest-0.177
Tokenummy
Token ID13513
Feature activation-2.05e+00

Pos logit contributions
 best+0.101
 thing+0.098
 things+0.095
 world+0.094
 lot+0.092
Neg logit contributions
arted-0.163
ur-0.160
ked-0.154
 stro-0.152
oured-0.148
Token a
Token ID257
Feature activation-2.90e+00

Pos logit contributions
 best+0.177
 thing+0.176
 world+0.170
 things+0.167
 biggest+0.164
Neg logit contributions
arted-0.265
ur-0.261
 stro-0.247
ked-0.246
oured-0.237
Token lazy
Token ID16931
Feature activation-5.35e+00

Pos logit contributions
iest+0.093
 best+0.092
 things+0.091
 thing+0.088
 biggest+0.088
Neg logit contributions
 stro-0.166
ur-0.163
ree-0.157
arted-0.156
ked-0.155
Token cat
Token ID3797
Feature activation-1.45e+00

Pos logit contributions
 best+0.214
 biggest+0.206
 things+0.193
 important+0.192
 thing+0.191
Neg logit contributions
ur-0.287
arted-0.279
 stro-0.273
ree-0.266
ked-0.261
Token named
Token ID3706
Feature activation+1.35e+01

Pos logit contributions
 best+0.092
 thing+0.091
 world+0.089
 things+0.088
 important+0.087
Neg logit contributions
arted-0.052
ur-0.050
ked-0.045
 stro-0.044
anged-0.043
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.480
ur+0.455
ked+0.423
anged+0.415
 stro+0.412
Neg logit contributions
 best-0.874
 thing-0.859
 things-0.841
 world-0.836
 lot-0.831
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.330
ur+0.305
ked+0.255
 stro+0.255
anged+0.230
Neg logit contributions
 best-1.432
 thing-1.396
 world-1.390
 things-1.372
 important-1.363
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
arted+1.483
ur+1.462
ked+1.368
 stro+1.320
anged+1.293
Neg logit contributions
 best-1.275
 thing-1.230
 lot-1.202
 biggest-1.201
 world-1.201
Token.
Token ID13
Feature activation-2.73e+00

Pos logit contributions
arted+0.102
ur+0.100
ked+0.095
 stro+0.094
anged+0.093
Neg logit contributions
 best-0.056
 thing-0.054
 world-0.052
 things-0.051
 lot-0.050
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
 best+0.178
 thing+0.174
 things+0.171
 world+0.170
 lot+0.169
Neg logit contributions
arted-0.094
ur-0.090
ked-0.081
 stro-0.080
anged-0.078
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.270
ur+0.250
ked+0.195
 stro+0.188
anged+0.173
Neg logit contributions
 best-1.488
 thing-1.464
 world-1.442
 things-1.437
 important-1.421
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
arted+1.389
ur+1.361
ked+1.276
 stro+1.210
ree+1.178
Neg logit contributions
 best-1.336
 problem-1.282
 thing-1.278
 biggest-1.270
 world-1.264
Token saw
Token ID2497
Feature activation+5.11e+00

Pos logit contributions
 best+0.541
 thing+0.517
 important+0.515
 special+0.481
 things+0.480
Neg logit contributions
arted-0.683
ur-0.612
ree-0.607
ked-0.596
anged-0.593
Token that
Token ID326
Feature activation-1.88e+00

Pos logit contributions
arted+0.261
ur+0.252
anged+0.235
ked+0.232
ree+0.230
Neg logit contributions
 best-0.253
 things-0.250
 thing-0.247
 lot-0.235
 important-0.231
Token Mr
Token ID1770
Feature activation+7.50e+00

Pos logit contributions
 best+0.091
 thing+0.089
 things+0.086
 world+0.086
 important+0.084
Neg logit contributions
arted-0.097
ur-0.095
ked-0.089
anged-0.087
oured-0.086
Token.
Token ID13
Feature activation-2.73e+00

Pos logit contributions
ur+0.411
arted+0.402
ked+0.388
 stro+0.382
anged+0.378
Neg logit contributions
 biggest-0.309
 best-0.308
 thing-0.303
 lot-0.296
 important-0.292
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
 best+0.155
 world+0.155
 thing+0.153
 things+0.149
 lot+0.146
Neg logit contributions
arted-0.119
ur-0.110
anged-0.103
ked-0.099
oured-0.098
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.297
ur+0.275
 stro+0.221
ked+0.220
anged+0.197
Neg logit contributions
 best-1.468
 thing-1.437
 world-1.421
 things-1.412
 important-1.398
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
arted+1.433
ur+1.425
ked+1.331
 stro+1.272
oured+1.240
Neg logit contributions
 best-1.329
 biggest-1.264
 world-1.256
 lot-1.255
 thing-1.254
Token was
Token ID373
Feature activation-4.57e+00

Pos logit contributions
arted+0.094
ur+0.092
ked+0.088
 stro+0.088
anged+0.086
Neg logit contributions
 best-0.062
 thing-0.061
 biggest-0.058
 world-0.058
 things-0.057
Token lost
Token ID2626
Feature activation-1.40e+00

Pos logit contributions
 best+0.193
 thing+0.187
 world+0.180
 things+0.179
 biggest+0.176
Neg logit contributions
arted-0.260
ur-0.255
ked-0.241
 stro-0.239
anged-0.236
Token and
Token ID290
Feature activation+6.06e+00

Pos logit contributions
 best+0.078
 biggest+0.075
 thing+0.074
 world+0.074
 important+0.073
Neg logit contributions
arted-0.058
ur-0.057
ked-0.052
 stro-0.052
anged-0.049
Token she
Token ID673
Feature activation-8.80e+00

Pos logit contributions
arted+0.327
ur+0.322
 stro+0.305
ked+0.300
anged+0.291
Neg logit contributions
 best-0.268
 thing-0.259
 world-0.259
 biggest-0.252
 lot-0.245
Token she
Token ID673
Feature activation-8.80e+00

Pos logit contributions
arted+0.324
ur+0.321
ked+0.302
 stro+0.299
anged+0.293
Neg logit contributions
 best-0.209
 thing-0.202
 world-0.199
 biggest-0.192
 things-0.192
Token could
Token ID714
Feature activation-4.76e+00

Pos logit contributions
 best+0.256
 thing+0.231
 things+0.222
 lot+0.219
 biggest+0.215
Neg logit contributions
arted-0.609
ur-0.588
anged-0.560
ree-0.560
 stro-0.552
Token and
Token ID290
Feature activation+6.07e+00

Pos logit contributions
 best+0.193
 thing+0.180
 lot+0.168
 world+0.168
 things+0.166
Neg logit contributions
arted-0.280
 stro-0.268
ur-0.268
anged-0.266
oured-0.261
Token pushed
Token ID7121
Feature activation+1.37e+01

Pos logit contributions
arted+0.305
ur+0.298
anged+0.275
amb+0.275
ked+0.274
Neg logit contributions
 thing-0.288
 best-0.287
 things-0.279
 lot-0.270
 important-0.263
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
ur+0.681
arted+0.673
oured+0.658
 stro+0.646
ked+0.626
Neg logit contributions
 best-0.675
 things-0.674
 thing-0.672
 lot-0.653
 good-0.600
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.335
ur+0.313
 stro+0.289
ked+0.259
anged+0.231
Neg logit contributions
 best-1.411
 thing-1.368
 world-1.364
 things-1.348
 lot-1.331
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
arted+1.520
ur+1.498
ked+1.409
 stro+1.346
ree+1.331
Neg logit contributions
 best-1.259
 thing-1.205
 world-1.197
 biggest-1.176
 important-1.158
Token out
Token ID503
Feature activation+5.89e+00

Pos logit contributions
arted+0.072
ur+0.069
 stro+0.068
ked+0.066
anged+0.065
Neg logit contributions
 best-0.086
 thing-0.084
 things-0.082
 world-0.080
 lot-0.080
Token of
Token ID286
Feature activation-3.82e+00

Pos logit contributions
arted+0.368
ur+0.362
ked+0.344
 stro+0.342
anged+0.337
Neg logit contributions
 best-0.215
 thing-0.208
 world-0.199
 things-0.198
 biggest-0.193
Token the
Token ID262
Feature activation+1.70e+00

Pos logit contributions
 thing+0.156
 lot+0.150
 things+0.150
 best+0.146
 world+0.140
Neg logit contributions
ur-0.228
arted-0.216
ked-0.214
anged-0.212
 stro-0.210
Token way
Token ID835
Feature activation-4.55e+00

Pos logit contributions
arted+0.096
oured+0.081
ur+0.079
 Sammy+0.078
 aback+0.077
Neg logit contributions
iest-0.060
 thing-0.057
 important-0.056
 best-0.055
 things-0.049
TokenI
Token ID40
Feature activation-9.32e+00

Pos logit contributions
 best+0.026
 thing+0.026
 world+0.025
 things+0.025
 biggest+0.024
Neg logit contributions
arted-0.033
ur-0.032
ked-0.030
 stro-0.030
anged-0.030
Token'm
Token ID1101
Feature activation+5.32e-01

Pos logit contributions
 best+0.381
 thing+0.371
 world+0.353
 lot+0.351
 important+0.349
Neg logit contributions
arted-0.564
ur-0.530
ked-0.515
anged-0.498
oured-0.494
Token sorry
Token ID7926
Feature activation+3.90e+00

Pos logit contributions
arted+0.031
ur+0.030
ked+0.029
anged+0.029
 stro+0.028
Neg logit contributions
 best-0.023
 thing-0.021
 important-0.020
 things-0.020
 world-0.020
Token,
Token ID11
Feature activation+3.33e+00

Pos logit contributions
ur+0.184
arted+0.180
ked+0.164
 stro+0.162
ree+0.162
Neg logit contributions
 best-0.192
 thing-0.189
 world-0.187
 biggest-0.183
 lot-0.179
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.159
ur+0.157
ked+0.148
 stro+0.147
ree+0.142
Neg logit contributions
 best-0.162
 thing-0.158
 world-0.154
 things-0.153
 biggest-0.152
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.471
ur+0.461
ked+0.407
anged+0.385
 stro+0.377
Neg logit contributions
 best-1.284
 thing-1.268
 biggest-1.239
 things-1.236
 world-1.234
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
ur+1.295
arted+1.245
ked+1.196
ree+1.097
 stro+1.091
Neg logit contributions
 best-1.408
 biggest-1.388
 problem-1.378
 world-1.377
 lot-1.376
Token.
Token ID13
Feature activation-2.73e+00

Pos logit contributions
ur+0.083
arted+0.082
ked+0.078
 stro+0.077
oured+0.074
Neg logit contributions
 best-0.069
 thing-0.068
 biggest-0.067
 things-0.065
 world-0.065
Token Let
Token ID3914
Feature activation+2.11e+00

Pos logit contributions
 best+0.094
 thing+0.092
 world+0.088
 biggest+0.086
 things+0.086
Neg logit contributions
arted-0.178
ur-0.171
ked-0.164
anged-0.163
 stro-0.161
Token's
Token ID338
Feature activation-7.62e+00

Pos logit contributions
arted+0.117
ur+0.112
ked+0.107
 stro+0.106
ree+0.102
Neg logit contributions
 thing-0.085
 best-0.085
 world-0.083
 biggest-0.081
 things-0.080
Token go
Token ID467
Feature activation-2.59e+00

Pos logit contributions
 best+0.306
 thing+0.296
 world+0.286
 things+0.284
 biggest+0.277
Neg logit contributions
arted-0.451
ur-0.441
ked-0.419
 stro-0.413
anged-0.410
Token her
Token ID607
Feature activation+3.66e+00

Pos logit contributions
ur+0.262
 stro+0.261
arted+0.258
ked+0.247
anged+0.232
Neg logit contributions
 world-0.313
 best-0.300
 thing-0.299
 biggest-0.291
 problem-0.288
Token pet
Token ID4273
Feature activation+1.54e+01

Pos logit contributions
 stro+0.164
ur+0.164
ked+0.163
arted+0.158
oured+0.153
Neg logit contributions
iest-0.154
 thing-0.154
body-0.149
 world-0.149
 important-0.143
Token cat
Token ID3797
Feature activation-1.44e+00

Pos logit contributions
arted+0.857
ur+0.839
ked+0.828
 stro+0.823
anged+0.794
Neg logit contributions
 best-0.591
 important-0.576
 thing-0.564
 biggest-0.551
 world-0.551
Token,
Token ID11
Feature activation+3.33e+00

Pos logit contributions
 best+0.070
 thing+0.067
 things+0.066
 world+0.065
 important+0.064
Neg logit contributions
arted-0.075
ur-0.072
ked-0.068
 stro-0.067
anged-0.066
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.147
ur+0.140
ked+0.130
anged+0.130
oured+0.125
Neg logit contributions
 best-0.194
 thing-0.183
 things-0.182
 world-0.182
 lot-0.181
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.332
ur+0.307
ked+0.255
 stro+0.251
anged+0.230
Neg logit contributions
 best-1.434
 thing-1.403
 world-1.390
 things-1.377
 important-1.364
Tokeners
Token ID364
Feature activation+1.59e+00

Pos logit contributions
ur+1.395
arted+1.395
ked+1.299
 stro+1.237
anged+1.194
Neg logit contributions
 best-1.335
 problem-1.276
 biggest-1.275
 thing-1.272
 lot-1.269
Token.
Token ID13
Feature activation-2.73e+00

Pos logit contributions
arted+0.097
ur+0.095
ked+0.090
 stro+0.090
anged+0.088
Neg logit contributions
 best-0.061
 thing-0.059
 world-0.057
 things-0.056
 biggest-0.055
Token Lily
Token ID20037
Feature activation+6.07e+00

Pos logit contributions
 best+0.132
 lot+0.128
 thing+0.127
 world+0.126
 things+0.125
Neg logit contributions
arted-0.144
ur-0.138
ked-0.131
anged-0.128
oured-0.127
Token loved
Token ID6151
Feature activation-2.25e+00

Pos logit contributions
arted+0.401
ur+0.389
ked+0.372
anged+0.365
oured+0.362
Neg logit contributions
 best-0.211
 thing-0.199
 world-0.194
 things-0.191
 important-0.190
Token to
Token ID284
Feature activation+6.97e+00

Pos logit contributions
 things+0.115
 best+0.113
 thing+0.112
 lot+0.109
 important+0.109
Neg logit contributions
arted-0.112
ur-0.108
oured-0.106
ked-0.105
anged-0.102
Token One
Token ID1881
Feature activation-2.58e+00

Pos logit contributions
 best+0.124
 thing+0.118
 lot+0.118
 world+0.117
 things+0.116
Neg logit contributions
arted-0.158
ur-0.150
ked-0.145
oured-0.142
 stro-0.140
Token sunny
Token ID27737
Feature activation-8.77e+00

Pos logit contributions
 best+0.045
 biggest+0.044
 things+0.040
 problem+0.039
 puzzle+0.038
Neg logit contributions
ur-0.195
arted-0.187
ree-0.181
anged-0.178
 stro-0.177
Token day
Token ID1110
Feature activation-1.29e+01

Pos logit contributions
 best+0.120
 biggest+0.112
 important+0.089
 puzzle+0.086
 lot+0.086
Neg logit contributions
arted-0.718
ur-0.705
anged-0.669
ree-0.669
 stro-0.659
Token,
Token ID11
Feature activation+3.32e+00

Pos logit contributions
 best+0.571
 thing+0.562
 things+0.544
 lot+0.536
 world+0.521
Neg logit contributions
arted-0.723
ur-0.692
ked-0.651
 stro-0.648
oured-0.643
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
ur+0.108
arted+0.107
ked+0.106
 stro+0.102
anged+0.094
Neg logit contributions
 best-0.209
 world-0.209
 thing-0.203
 important-0.200
 biggest-0.199
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.257
ur+0.239
ked+0.183
 stro+0.173
anged+0.161
Neg logit contributions
 best-1.500
 thing-1.479
 world-1.453
 things-1.449
 biggest-1.436
Tokeners
Token ID364
Feature activation+1.61e+00

Pos logit contributions
arted+1.409
ur+1.389
ked+1.312
 stro+1.239
ree+1.219
Neg logit contributions
 best-1.308
 thing-1.266
 problem-1.252
 biggest-1.250
 world-1.249
Token went
Token ID1816
Feature activation+1.23e+00

Pos logit contributions
arted+0.091
ur+0.087
ked+0.082
oured+0.081
anged+0.080
Neg logit contributions
 best-0.071
 thing-0.069
 world-0.068
 things-0.067
 lot-0.064
Token for
Token ID329
Feature activation+8.09e-01

Pos logit contributions
arted+0.067
ur+0.064
ked+0.063
anged+0.062
 stro+0.062
Neg logit contributions
 best-0.057
 thing-0.054
 things-0.054
 world-0.053
 important-0.052
Token a
Token ID257
Feature activation-2.89e+00

Pos logit contributions
arted+0.049
ur+0.049
ked+0.046
 stro+0.046
anged+0.045
Neg logit contributions
 best-0.030
 thing-0.029
 world-0.028
 things-0.028
 biggest-0.028
Token walk
Token ID2513
Feature activation+2.95e+00

Pos logit contributions
 best+0.160
 thing+0.156
 world+0.156
 things+0.153
 biggest+0.153
Neg logit contributions
ur-0.123
arted-0.112
 stro-0.108
ked-0.106
ree-0.106
Token.
Token ID13
Feature activation-2.77e+00

Pos logit contributions
 world+0.109
 best+0.107
 thing+0.107
 biggest+0.105
 problem+0.101
Neg logit contributions
ur-0.125
arted-0.118
ked-0.115
 stro-0.112
anged-0.107
Token One
Token ID1881
Feature activation-2.55e+00

Pos logit contributions
 best+0.128
 thing+0.124
 lot+0.122
 world+0.121
 things+0.121
Neg logit contributions
arted-0.153
ur-0.146
ked-0.140
oured-0.137
 stro-0.137
Token day
Token ID1110
Feature activation-1.29e+01

Pos logit contributions
 best+0.041
 biggest+0.040
 things+0.036
 world+0.035
 problem+0.032
Neg logit contributions
ur-0.196
arted-0.190
anged-0.183
 stro-0.182
ree-0.181
Token,
Token ID11
Feature activation+3.32e+00

Pos logit contributions
 best+0.568
 thing+0.560
 things+0.542
 lot+0.536
 world+0.514
Neg logit contributions
arted-0.728
ur-0.697
ked-0.653
 stro-0.650
oured-0.647
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.101
ur+0.100
ked+0.095
 stro+0.092
anged+0.086
Neg logit contributions
 best-0.220
 world-0.217
 thing-0.215
 biggest-0.210
 important-0.209
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.288
ur+0.265
ked+0.212
 stro+0.209
anged+0.188
Neg logit contributions
 best-1.472
 thing-1.443
 world-1.428
 things-1.420
 important-1.405
Tokeners
Token ID364
Feature activation+1.61e+00

Pos logit contributions
arted+1.444
ur+1.423
ked+1.342
 stro+1.278
ree+1.250
Neg logit contributions
 best-1.288
 thing-1.246
 world-1.234
 lot-1.224
 biggest-1.223
Token saw
Token ID2497
Feature activation+5.14e+00

Pos logit contributions
arted+0.094
ur+0.090
ked+0.084
oured+0.083
anged+0.083
Neg logit contributions
 best-0.068
 thing-0.067
 things-0.064
 world-0.064
 lot-0.062
Token a
Token ID257
Feature activation-2.91e+00

Pos logit contributions
arted+0.352
ur+0.346
ked+0.327
anged+0.326
oured+0.326
Neg logit contributions
 best-0.165
 thing-0.162
 things-0.159
 lot-0.149
 important-0.148
Token can
Token ID460
Feature activation-1.04e+01

Pos logit contributions
 best+0.080
iest+0.079
 biggest+0.072
 things+0.071
 important+0.066
Neg logit contributions
ur-0.166
 stro-0.164
ree-0.164
ked-0.157
 Fro-0.157
Token of
Token ID286
Feature activation-3.83e+00

Pos logit contributions
 best+0.432
 thing+0.424
 world+0.405
 things+0.402
 lot+0.395
Neg logit contributions
arted-0.601
ur-0.581
ked-0.553
 stro-0.545
anged-0.538
Token.
Token ID13
Feature activation-2.72e+00

Pos logit contributions
 best+0.479
 biggest+0.466
 thing+0.461
 lot+0.451
 world+0.449
Neg logit contributions
ur-0.475
arted-0.469
 stro-0.443
ked-0.431
oured-0.420
Token\n
Token ID198
Feature activation+6.41e+00

Pos logit contributions
 best+0.119
 lot+0.118
 world+0.113
 things+0.110
 important+0.109
Neg logit contributions
arted-0.148
ur-0.141
ked-0.141
oured-0.137
anged-0.136
Token\n
Token ID198
Feature activation+6.43e+00

Pos logit contributions
arted+0.358
ur+0.346
ked+0.337
anged+0.329
oured+0.326
Neg logit contributions
 best-0.283
 lot-0.278
 world-0.274
 thing-0.271
 important-0.267
TokenAfter
Token ID3260
Feature activation+2.21e+00

Pos logit contributions
arted+0.326
ur+0.321
ked+0.300
 stro+0.299
anged+0.290
Neg logit contributions
 best-0.309
 thing-0.301
 world-0.291
 things-0.289
 biggest-0.287
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.119
ur+0.117
ked+0.110
 stro+0.109
anged+0.107
Neg logit contributions
 best-0.100
 thing-0.097
 world-0.094
 things-0.094
 biggest-0.092
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.325
ur+0.289
ked+0.251
 stro+0.251
anged+0.214
Neg logit contributions
 best-1.426
 thing-1.390
 world-1.389
 things-1.370
 important-1.362
Tokeners
Token ID364
Feature activation+1.61e+00

Pos logit contributions
arted+1.469
ur+1.460
ked+1.361
 stro+1.327
oured+1.294
Neg logit contributions
 best-1.280
 thing-1.239
 world-1.215
 biggest-1.206
 lot-1.203
Token got
Token ID1392
Feature activation-2.05e+00

Pos logit contributions
arted+0.089
ur+0.082
ked+0.080
anged+0.078
ree+0.078
Neg logit contributions
 best-0.072
 thing-0.069
 world-0.067
 things-0.067
 lot-0.066
Token better
Token ID1365
Feature activation-9.01e+00

Pos logit contributions
 best+0.098
 things+0.090
 important+0.088
 thing+0.087
 biggest+0.085
Neg logit contributions
arted-0.114
ur-0.110
ked-0.107
ree-0.105
anged-0.102
Token,
Token ID11
Feature activation+3.33e+00

Pos logit contributions
 best+0.421
 thing+0.410
 things+0.404
 world+0.397
 lot+0.384
Neg logit contributions
arted-0.496
ur-0.480
ked-0.450
oured-0.436
 stro-0.434
Token he
Token ID339
Feature activation-4.26e+00

Pos logit contributions
arted+0.128
ur+0.125
ked+0.113
 stro+0.111
anged+0.110
Neg logit contributions
 best-0.202
 thing-0.198
 things-0.193
 world-0.193
 lot-0.190
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.508
anged+0.487
ur+0.477
ked+0.468
oured+0.462
Neg logit contributions
 best-0.837
 things-0.824
 thing-0.818
 world-0.796
 biggest-0.788
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.351
ur+0.323
 stro+0.288
ked+0.275
anged+0.258
Neg logit contributions
 best-1.409
 world-1.369
 thing-1.368
 things-1.345
 important-1.328
Tokeners
Token ID364
Feature activation+1.61e+00

Pos logit contributions
arted+1.474
ur+1.456
ked+1.362
 stro+1.295
oured+1.265
Neg logit contributions
 best-1.289
 thing-1.233
 world-1.224
 biggest-1.222
 lot-1.219
Token.
Token ID13
Feature activation-2.73e+00

Pos logit contributions
arted+0.116
ur+0.108
anged+0.105
ree+0.105
oured+0.103
Neg logit contributions
 best-0.047
 things-0.040
 thing-0.039
 world-0.039
 lot-0.036
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
 lot+0.145
 best+0.140
 things+0.136
 thing+0.134
 important+0.133
Neg logit contributions
arted-0.129
ur-0.119
anged-0.117
oured-0.116
ked-0.112
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.350
ur+0.328
 stro+0.297
ked+0.279
anged+0.261
Neg logit contributions
 best-1.403
 world-1.360
 thing-1.357
 things-1.342
 important-1.333
Tokeners
Token ID364
Feature activation+1.61e+00

Pos logit contributions
arted+1.451
ur+1.423
ked+1.341
 stro+1.269
oured+1.239
Neg logit contributions
 best-1.307
 thing-1.255
 world-1.252
 biggest-1.239
 lot-1.229
Token looked
Token ID3114
Feature activation-3.67e+00

Pos logit contributions
arted+0.095
ur+0.090
ree+0.086
anged+0.085
ked+0.083
Neg logit contributions
 best-0.068
 thing-0.063
 things-0.062
 lot-0.061
 world-0.060
Token at
Token ID379
Feature activation+8.85e+00

Pos logit contributions
 best+0.172
 things+0.160
 important+0.160
 thing+0.160
 lot+0.157
Neg logit contributions
arted-0.199
ur-0.197
anged-0.188
ked-0.187
 stro-0.184
Token Spot
Token ID15899
Feature activation+1.43e+01

Pos logit contributions
ur+0.371
arted+0.368
 stro+0.334
anged+0.331
ked+0.331
Neg logit contributions
 best-0.514
 thing-0.509
 things-0.501
 lot-0.494
 world-0.480
Token and
Token ID290
Feature activation+6.07e+00

Pos logit contributions
arted+0.774
ur+0.760
ked+0.716
 stro+0.708
anged+0.697
Neg logit contributions
 best-0.635
 thing-0.619
 world-0.599
 things-0.594
 biggest-0.586
Token scared
Token ID12008
Feature activation+4.65e+00

Pos logit contributions
 best+0.188
 thing+0.183
 world+0.176
 things+0.175
 biggest+0.172
Neg logit contributions
arted-0.266
ur-0.261
ked-0.247
 stro-0.245
anged-0.241
Token.
Token ID13
Feature activation-2.72e+00

Pos logit contributions
arted+0.287
ur+0.282
ked+0.268
 stro+0.265
anged+0.262
Neg logit contributions
 best-0.175
 thing-0.170
 world-0.163
 things-0.162
 biggest-0.158
Token Mr
Token ID1770
Feature activation+7.50e+00

Pos logit contributions
 best+0.115
 things+0.109
 lot+0.109
 world+0.107
 thing+0.103
Neg logit contributions
arted-0.155
ur-0.152
ked-0.148
anged-0.148
oured-0.142
Token.
Token ID13
Feature activation-2.72e+00

Pos logit contributions
arted+0.443
ur+0.442
ked+0.420
 stro+0.414
anged+0.408
Neg logit contributions
 best-0.285
 thing-0.278
 world-0.269
 biggest-0.268
 things-0.263
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
 best+0.157
 world+0.151
 lot+0.150
 thing+0.150
 things+0.148
Neg logit contributions
arted-0.116
ur-0.113
anged-0.103
oured-0.100
 stro-0.100
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.338
ur+0.314
 stro+0.277
ked+0.264
anged+0.243
Neg logit contributions
 best-1.431
 thing-1.378
 world-1.374
 things-1.361
 important-1.358
Tokeners
Token ID364
Feature activation+1.60e+00

Pos logit contributions
arted+1.421
ur+1.402
ked+1.305
 stro+1.243
oured+1.206
Neg logit contributions
 best-1.320
 world-1.271
 biggest-1.268
 lot-1.259
 thing-1.258
Token said
Token ID531
Feature activation+1.18e+01

Pos logit contributions
arted+0.080
ur+0.078
anged+0.071
ree+0.071
ked+0.071
Neg logit contributions
 best-0.079
 thing-0.076
 things-0.073
 world-0.072
 lot-0.072
Token,
Token ID11
Feature activation+3.32e+00

Pos logit contributions
arted+0.637
ur+0.626
ked+0.589
 stro+0.583
anged+0.573
Neg logit contributions
 best-0.529
 thing-0.514
 world-0.499
 things-0.495
 biggest-0.487
Token "
Token ID366
Feature activation-5.76e-01

Pos logit contributions
arted+0.200
ur+0.191
anged+0.187
oured+0.182
ked+0.178
Neg logit contributions
 best-0.131
 thing-0.126
 things-0.125
 lot-0.123
 important-0.123
TokenDon
Token ID3987
Feature activation+3.43e+00

Pos logit contributions
 best+0.028
 thing+0.027
 world+0.026
 things+0.026
 biggest+0.025
Neg logit contributions
arted-0.030
ur-0.029
ked-0.027
 stro-0.027
anged-0.027
Token wise
Token ID10787
Feature activation+2.92e-01

Pos logit contributions
arted+0.093
ur+0.088
 stro+0.085
ked+0.085
ree+0.082
Neg logit contributions
 thing-0.066
 best-0.064
iest-0.063
 things-0.063
 lot-0.062
Token old
Token ID1468
Feature activation-1.19e+00

Pos logit contributions
arted+0.021
ur+0.020
ked+0.019
 stro+0.019
anged+0.019
Neg logit contributions
 best-0.008
 thing-0.008
 world-0.008
 things-0.008
 important-0.007
Token cat
Token ID3797
Feature activation-1.45e+00

Pos logit contributions
 best+0.044
 thing+0.043
 world+0.041
 things+0.041
 important+0.040
Neg logit contributions
arted-0.075
ur-0.073
ked-0.069
 stro-0.068
anged-0.068
Token named
Token ID3706
Feature activation+1.35e+01

Pos logit contributions
 best+0.071
 thing+0.069
 things+0.067
 world+0.065
 important+0.064
Neg logit contributions
arted-0.076
ur-0.070
ked-0.067
oured-0.067
anged-0.066
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.508
anged+0.487
ur+0.477
ked+0.468
oured+0.462
Neg logit contributions
 best-0.837
 things-0.824
 thing-0.818
 world-0.796
 biggest-0.788
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.351
ur+0.323
 stro+0.288
ked+0.275
anged+0.258
Neg logit contributions
 best-1.409
 world-1.369
 thing-1.368
 things-1.345
 important-1.328
Tokeners
Token ID364
Feature activation+1.61e+00

Pos logit contributions
arted+1.474
ur+1.456
ked+1.362
 stro+1.295
oured+1.265
Neg logit contributions
 best-1.289
 thing-1.233
 world-1.224
 biggest-1.222
 lot-1.219
Token.
Token ID13
Feature activation-2.73e+00

Pos logit contributions
arted+0.116
ur+0.108
anged+0.105
ree+0.105
oured+0.103
Neg logit contributions
 best-0.047
 things-0.040
 thing-0.039
 world-0.039
 lot-0.036
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
 lot+0.145
 best+0.140
 things+0.136
 thing+0.134
 important+0.133
Neg logit contributions
arted-0.129
ur-0.119
anged-0.117
oured-0.116
ked-0.112
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.350
ur+0.328
 stro+0.297
ked+0.279
anged+0.261
Neg logit contributions
 best-1.403
 world-1.360
 thing-1.357
 things-1.342
 important-1.333
Tokeners
Token ID364
Feature activation+1.61e+00

Pos logit contributions
arted+1.451
ur+1.423
ked+1.341
 stro+1.269
oured+1.239
Neg logit contributions
 best-1.307
 thing-1.255
 world-1.252
 biggest-1.239
 lot-1.229
Token family
Token ID1641
Feature activation-3.83e+00

Pos logit contributions
 best+0.309
 thing+0.306
 world+0.294
 important+0.289
 things+0.289
Neg logit contributions
arted-0.428
ur-0.416
ked-0.399
 stro-0.393
anged-0.389
Token sh
Token ID427
Feature activation+1.05e+01

Pos logit contributions
 best+0.134
 thing+0.128
 things+0.123
 world+0.122
 lot+0.119
Neg logit contributions
arted-0.248
ur-0.244
ked-0.231
 stro-0.227
anged-0.226
Tokenoo
Token ID2238
Feature activation+1.53e+01

Pos logit contributions
ur+0.504
 stro+0.496
arted+0.487
ked+0.461
oured+0.461
Neg logit contributions
 best-0.489
 world-0.476
 thing-0.467
 important-0.467
 things-0.450
Tokened
Token ID276
Feature activation+1.56e+01

Pos logit contributions
arted+0.616
ur+0.605
 stro+0.544
ked+0.538
anged+0.533
Neg logit contributions
 best-0.902
 thing-0.888
 things-0.869
 world-0.853
 lot-0.845
Token Wh
Token ID854
Feature activation+1.77e+01

Pos logit contributions
arted+0.853
ur+0.843
oured+0.799
anged+0.791
ree+0.762
Neg logit contributions
 best-0.718
 thing-0.715
 things-0.701
 lot-0.676
 world-0.650
Tokenisk
Token ID1984
Feature activation+2.83e+01

Pos logit contributions
arted+0.342
ur+0.324
 stro+0.282
ked+0.249
oured+0.235
Neg logit contributions
 best-1.412
 thing-1.367
 world-1.361
 things-1.354
 important-1.331
Tokeners
Token ID364
Feature activation+1.61e+00

Pos logit contributions
arted+1.511
ur+1.490
ked+1.409
 stro+1.349
anged+1.327
Neg logit contributions
 best-1.277
 thing-1.222
 world-1.208
 biggest-1.205
 lot-1.177
Token away
Token ID1497
Feature activation+8.48e+00

Pos logit contributions
arted+0.083
ur+0.080
anged+0.073
ked+0.073
 stro+0.072
Neg logit contributions
 best-0.079
 thing-0.074
 things-0.073
 world-0.072
 lot-0.071
Token and
Token ID290
Feature activation+6.06e+00

Pos logit contributions
arted+0.446
ur+0.435
ked+0.405
 stro+0.405
anged+0.399
Neg logit contributions
 best-0.401
 thing-0.393
 things-0.378
 world-0.376
 lot-0.369
Token told
Token ID1297
Feature activation+1.39e+01

Pos logit contributions
arted+0.314
ur+0.303
oured+0.281
ked+0.274
anged+0.273
Neg logit contributions
 best-0.299
 thing-0.290
 things-0.286
 important-0.275
 lot-0.275
Token Max
Token ID5436
Feature activation+1.17e+01

Pos logit contributions
arted+0.435
ur+0.419
ked+0.375
 stro+0.368
anged+0.359
Neg logit contributions
 best-0.946
 thing-0.931
 things-0.911
 world-0.907
 lot-0.897
Token make
Token ID787
Feature activation-1.51e+00

Pos logit contributions
arted+0.191
ur+0.188
ked+0.177
 stro+0.176
anged+0.173
Neg logit contributions
 best-0.172
 thing-0.168
 world-0.164
 things-0.161
 biggest-0.161
Token the
Token ID262
Feature activation-7.24e-01

Pos logit contributions
 thing+0.062
 best+0.062
 world+0.061
 important+0.058
 biggest+0.057
Neg logit contributions
arted-0.079
ur-0.076
ked-0.074
 stro-0.073
anged-0.071
Token frame
Token ID5739
Feature activation-1.20e+01

Pos logit contributions
iest+0.026
 thing+0.019
body+0.019
 important+0.018
 fog+0.017
Neg logit contributions
arted-0.035
ked-0.034
oured-0.033
anged-0.032
ents-0.032
Token of
Token ID286
Feature activation-6.32e+00

Pos logit contributions
 world+0.491
 thing+0.489
 lot+0.482
 best+0.474
 important+0.471
Neg logit contributions
ur-0.596
arted-0.576
ked-0.537
anged-0.525
 stro-0.524
Token the
Token ID262
Feature activation-8.16e-01

Pos logit contributions
 best+0.291
 thing+0.286
 world+0.280
 important+0.274
 biggest+0.273
Neg logit contributions
arted-0.306
ur-0.294
 stro-0.285
ked-0.284
anged-0.273
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.029
 thing+0.023
 important+0.021
body+0.021
 best+0.020
Neg logit contributions
arted-0.042
ked-0.042
 stro-0.040
anged-0.039
oured-0.039
Token.
Token ID13
Feature activation-5.31e+00

Pos logit contributions
 biggest+0.934
 thing+0.932
 best+0.922
 lot+0.919
 important+0.915
Neg logit contributions
ur-0.894
arted-0.869
anged-0.791
ried-0.767
 stro-0.767
Token Then
Token ID3244
Feature activation+7.82e+00

Pos logit contributions
 lot+0.236
 best+0.234
 thing+0.227
 things+0.225
 world+0.219
Neg logit contributions
ur-0.286
arted-0.277
ked-0.267
 stro-0.266
anged-0.257
Token,
Token ID11
Feature activation+6.91e-01

Pos logit contributions
ur+0.355
arted+0.352
 stro+0.332
ked+0.332
anged+0.331
Neg logit contributions
 world-0.393
 best-0.378
 biggest-0.376
 thing-0.374
 important-0.374
Token they
Token ID484
Feature activation-9.85e+00

Pos logit contributions
arted+0.041
ur+0.040
ked+0.038
 stro+0.038
anged+0.037
Neg logit contributions
 best-0.028
 thing-0.027
 world-0.026
 things-0.026
 biggest-0.025
Token look
Token ID804
Feature activation-7.29e+00

Pos logit contributions
 best+0.438
 thing+0.424
 world+0.410
 things+0.410
 lot+0.405
Neg logit contributions
arted-0.541
ur-0.524
ked-0.496
 stro-0.489
anged-0.484
TokenYou
Token ID1639
Feature activation-1.18e+01

Pos logit contributions
 best+0.115
 thing+0.109
 things+0.105
 world+0.104
 biggest+0.103
Neg logit contributions
arted-0.180
ur-0.176
ked-0.168
 stro-0.167
anged-0.163
Token can
Token ID460
Feature activation-1.28e+01

Pos logit contributions
 best+0.382
 thing+0.377
 world+0.373
 important+0.354
 things+0.353
Neg logit contributions
ur-0.757
arted-0.731
ked-0.707
 stro-0.691
oured-0.690
Token't
Token ID470
Feature activation-4.40e+00

Pos logit contributions
 best+0.591
 thing+0.575
 world+0.557
 things+0.554
 biggest+0.544
Neg logit contributions
arted-0.684
ur-0.667
ked-0.629
 stro-0.621
anged-0.613
Token have
Token ID423
Feature activation-8.48e+00

Pos logit contributions
 best+0.209
 thing+0.202
 world+0.196
 things+0.195
 biggest+0.193
Neg logit contributions
arted-0.228
ur-0.224
ked-0.209
 stro-0.207
anged-0.204
Token that
Token ID326
Feature activation-4.39e+00

Pos logit contributions
 best+0.409
 world+0.402
 thing+0.395
 biggest+0.388
iest+0.377
Neg logit contributions
arted-0.402
ur-0.389
 stro-0.361
ked-0.359
oured-0.353
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
 best+0.116
iest+0.112
 important+0.110
 biggest+0.109
 lot+0.109
Neg logit contributions
arted-0.239
ur-0.239
 aback-0.235
oured-0.229
ked-0.225
Token,
Token ID11
Feature activation+7.49e-01

Pos logit contributions
 best+0.958
 world+0.944
 lot+0.944
 biggest+0.938
 thing+0.936
Neg logit contributions
ur-1.008
arted-0.965
oured-0.854
anged-0.849
ked-0.848
Token Lisa
Token ID15378
Feature activation+5.75e+00

Pos logit contributions
arted+0.020
ur+0.019
ked+0.017
 stro+0.016
ree+0.016
Neg logit contributions
 best-0.052
 world-0.051
 thing-0.050
 biggest-0.050
 lot-0.049
Token.
Token ID13
Feature activation-5.37e+00

Pos logit contributions
ur+0.302
arted+0.300
 stro+0.287
ked+0.285
anged+0.274
Neg logit contributions
 best-0.243
 biggest-0.235
 lot-0.233
 world-0.233
 thing-0.232
Token It
Token ID632
Feature activation-8.54e+00

Pos logit contributions
 best+0.226
 thing+0.224
 world+0.215
 things+0.213
 biggest+0.212
Neg logit contributions
arted-0.303
ur-0.293
ked-0.279
 stro-0.277
anged-0.274
Token is
Token ID318
Feature activation-1.61e+01

Pos logit contributions
 thing+0.327
 world+0.321
 lot+0.314
 best+0.313
 things+0.302
Neg logit contributions
ur-0.488
arted-0.479
ked-0.450
 stro-0.445
anged-0.433
Token smiled
Token ID13541
Feature activation+6.22e+00

Pos logit contributions
ur+0.198
arted+0.196
oured+0.173
ree+0.173
amb+0.170
Neg logit contributions
 best-0.176
 thing-0.175
 things-0.172
 lot-0.159
 biggest-0.158
Token.
Token ID13
Feature activation-5.16e+00

Pos logit contributions
arted+0.362
ur+0.355
ked+0.342
 stro+0.336
anged+0.324
Neg logit contributions
 best-0.241
 thing-0.233
 world-0.231
 lot-0.224
 biggest-0.224
Token She
Token ID1375
Feature activation-4.65e+00

Pos logit contributions
 best+0.215
 world+0.201
 lot+0.201
 things+0.200
 thing+0.197
Neg logit contributions
ur-0.301
arted-0.298
anged-0.288
oured-0.286
ked-0.284
Token liked
Token ID8288
Feature activation-3.92e+00

Pos logit contributions
 best+0.191
 thing+0.176
 important+0.173
 lot+0.171
 things+0.168
Neg logit contributions
arted-0.280
ur-0.272
ked-0.251
anged-0.250
ree-0.246
Token the
Token ID262
Feature activation-8.16e-01

Pos logit contributions
 best+0.196
 thing+0.191
 world+0.186
 biggest+0.182
 things+0.182
Neg logit contributions
arted-0.192
ur-0.186
ked-0.174
 stro-0.172
anged-0.169
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.023
body+0.020
ww+0.019
ooky+0.017
 fog+0.017
Neg logit contributions
 grinning-0.039
arted-0.038
 aback-0.038
oured-0.038
 Jack-0.038
Token.
Token ID13
Feature activation-5.31e+00

Pos logit contributions
 biggest+0.971
 lot+0.951
 world+0.947
 thing+0.943
 best+0.924
Neg logit contributions
ur-0.904
arted-0.860
 stro-0.808
anged-0.769
oured-0.743
Token\n
Token ID198
Feature activation+3.82e+00

Pos logit contributions
 best+0.228
 world+0.216
 thing+0.211
 lot+0.211
 things+0.210
Neg logit contributions
ur-0.305
arted-0.305
ked-0.294
anged-0.291
oured-0.290
Token\n
Token ID198
Feature activation+3.78e+00

Pos logit contributions
arted+0.233
ked+0.231
ur+0.225
oured+0.219
anged+0.216
Neg logit contributions
 lot-0.148
 best-0.147
 world-0.147
 thing-0.141
 things-0.141
TokenSam
Token ID16305
Feature activation+9.09e+00

Pos logit contributions
arted+0.208
ur+0.204
ked+0.192
 stro+0.190
anged+0.188
Neg logit contributions
 best-0.167
 thing-0.163
 world-0.158
 things-0.157
 biggest-0.154
Token watched
Token ID7342
Feature activation+7.69e+00

Pos logit contributions
arted+0.505
ur+0.502
ked+0.456
anged+0.453
ree+0.452
Neg logit contributions
 best-0.394
 thing-0.367
 world-0.362
 things-0.361
 game-0.361
Token to
Token ID284
Feature activation+4.57e+00

Pos logit contributions
 best+0.186
 thing+0.182
 world+0.177
 things+0.177
 biggest+0.173
Neg logit contributions
arted-0.150
ur-0.146
ked-0.136
 stro-0.134
anged-0.132
Token solve
Token ID8494
Feature activation-2.77e+00

Pos logit contributions
ur+0.225
arted+0.223
 stro+0.203
ked+0.203
anged+0.199
Neg logit contributions
 best-0.220
 thing-0.211
 world-0.207
 biggest-0.205
 things-0.203
Token the
Token ID262
Feature activation-7.65e-01

Pos logit contributions
 best+0.120
 world+0.120
 thing+0.120
 biggest+0.119
 important+0.113
Neg logit contributions
arted-0.134
ur-0.129
 stro-0.119
ked-0.116
anged-0.115
Token giant
Token ID6175
Feature activation-4.94e-01

Pos logit contributions
iest+0.023
 thing+0.016
 biggest+0.016
body+0.015
 lot+0.015
Neg logit contributions
arted-0.037
 aback-0.037
ur-0.035
 Charlie-0.034
 Jack-0.034
Token's
Token ID338
Feature activation-1.01e+01

Pos logit contributions
 best+0.017
 lot+0.016
 biggest+0.016
 important+0.016
 thing+0.015
Neg logit contributions
ur-0.028
arted-0.028
anged-0.027
 stro-0.026
oured-0.026
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.227
 thing+0.189
 lot+0.185
 best+0.164
 biggest+0.158
Neg logit contributions
arted-0.668
ur-0.668
anged-0.632
oured-0.614
ree-0.610
Token.
Token ID13
Feature activation-5.31e+00

Pos logit contributions
 best+0.897
 thing+0.889
 biggest+0.887
 lot+0.864
 world+0.846
Neg logit contributions
ur-1.050
arted-1.039
 stro-0.940
anged-0.930
ked-0.913
Token Finally
Token ID9461
Feature activation+1.36e+00

Pos logit contributions
 best+0.304
 world+0.298
 thing+0.295
 things+0.291
 lot+0.290
Neg logit contributions
arted-0.223
ur-0.214
ked-0.207
anged-0.201
 stro-0.197
Token,
Token ID11
Feature activation+6.91e-01

Pos logit contributions
arted+0.067
ur+0.066
ked+0.061
 stro+0.061
anged+0.060
Neg logit contributions
 best-0.066
 thing-0.064
 world-0.064
 biggest-0.063
 things-0.062
Token she
Token ID673
Feature activation-1.15e+01

Pos logit contributions
ur+0.033
arted+0.033
 stro+0.032
ked+0.031
oured+0.030
Neg logit contributions
 best-0.032
 thing-0.031
 world-0.031
 biggest-0.031
 important-0.030
Token managed
Token ID5257
Feature activation+3.70e+00

Pos logit contributions
 best+0.537
 thing+0.516
 world+0.493
 things+0.489
 important+0.485
Neg logit contributions
arted-0.617
ur-0.594
ked-0.563
 stro-0.551
anged-0.547
Token clever
Token ID14169
Feature activation-7.18e+00

Pos logit contributions
 best+0.234
 thing+0.212
 important+0.207
 things+0.199
 biggest+0.196
Neg logit contributions
arted-0.476
ur-0.466
ked-0.446
anged-0.445
 stro-0.436
Token,
Token ID11
Feature activation+8.96e-01

Pos logit contributions
 best+0.385
 thing+0.378
 world+0.372
 biggest+0.370
 lot+0.367
Neg logit contributions
arted-0.316
ur-0.309
ked-0.287
 stro-0.279
anged-0.277
Token and
Token ID290
Feature activation+3.59e+00

Pos logit contributions
arted+0.042
ur+0.042
ked+0.039
anged+0.038
 stro+0.038
Neg logit contributions
 best-0.047
 thing-0.046
 things-0.045
 world-0.045
 important-0.044
Token soon
Token ID2582
Feature activation+6.73e+00

Pos logit contributions
arted+0.200
ur+0.196
ked+0.185
 stro+0.183
anged+0.181
Neg logit contributions
 best-0.157
 thing-0.152
 things-0.146
 world-0.146
 biggest-0.143
Token the
Token ID262
Feature activation-8.16e-01

Pos logit contributions
arted+0.341
ur+0.334
ked+0.313
 stro+0.310
anged+0.304
Neg logit contributions
 best-0.307
 world-0.306
 thing-0.305
 biggest-0.296
 important-0.293
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
 thing+0.029
 lot+0.029
iest+0.029
 important+0.028
 world+0.028
Neg logit contributions
arted-0.043
ur-0.039
oured-0.038
ked-0.038
anged-0.037
Token was
Token ID373
Feature activation-7.16e+00

Pos logit contributions
 best+0.882
 biggest+0.880
 thing+0.865
 world+0.852
 lot+0.850
Neg logit contributions
arted-1.097
ur-1.065
ked-1.014
 stro-0.992
anged-0.990
Token finished
Token ID5201
Feature activation+2.69e+00

Pos logit contributions
 best+0.325
 thing+0.318
 world+0.308
 things+0.306
 biggest+0.300
Neg logit contributions
arted-0.382
ur-0.375
ked-0.352
 stro-0.348
anged-0.342
Token.
Token ID13
Feature activation-5.37e+00

Pos logit contributions
arted+0.138
ur+0.136
ked+0.129
anged+0.127
 stro+0.126
Neg logit contributions
 thing-0.114
 best-0.113
 biggest-0.111
 world-0.109
 lot-0.109
Token All
Token ID1439
Feature activation-5.03e+00

Pos logit contributions
 best+0.247
 lot+0.243
 thing+0.236
 world+0.236
 things+0.234
Neg logit contributions
ur-0.269
arted-0.267
ked-0.259
 stro-0.258
anged-0.250
Token the
Token ID262
Feature activation-9.96e-01

Pos logit contributions
 best+0.252
 world+0.243
 thing+0.241
 biggest+0.241
 lot+0.238
Neg logit contributions
arted-0.243
ur-0.229
anged-0.213
ked-0.210
 stro-0.208
Token.
Token ID13
Feature activation-5.12e+00

Pos logit contributions
arted+0.983
ur+0.950
ked+0.910
anged+0.901
 stro+0.893
Neg logit contributions
 best-0.511
 thing-0.493
 things-0.473
 important-0.457
 world-0.457
Token "
Token ID366
Feature activation-3.02e+00

Pos logit contributions
 best+0.215
 lot+0.205
 biggest+0.187
 important+0.183
 world+0.183
Neg logit contributions
arted-0.289
ur-0.286
anged-0.285
ked-0.272
ried-0.257
TokenI
Token ID40
Feature activation-1.18e+01

Pos logit contributions
 best+0.129
 problem+0.127
 thing+0.125
 things+0.124
 important+0.118
Neg logit contributions
arted-0.166
ur-0.158
 stro-0.155
ked-0.155
ree-0.147
Token became
Token ID2627
Feature activation-5.90e+00

Pos logit contributions
 best+0.528
 thing+0.514
 world+0.498
 things+0.497
 biggest+0.488
Neg logit contributions
arted-0.634
ur-0.629
ked-0.589
 stro-0.589
anged-0.575
Token a
Token ID257
Feature activation-5.43e+00

Pos logit contributions
 world+0.191
 thing+0.189
 best+0.176
iest+0.172
 game+0.171
Neg logit contributions
arted-0.357
ur-0.351
 stro-0.339
ked-0.319
oured-0.317
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
 things+0.148
iest+0.144
 thing+0.139
 world+0.137
 best+0.132
Neg logit contributions
ur-0.335
 stro-0.328
arted-0.309
oured-0.305
ree-0.301
Token master
Token ID4958
Feature activation+1.11e+00

Pos logit contributions
 best+0.963
 thing+0.957
 lot+0.939
 world+0.931
 biggest+0.928
Neg logit contributions
ur-0.999
arted-0.959
 stro-0.900
oured-0.876
ked-0.865
Token!"
Token ID2474
Feature activation+8.30e-02

Pos logit contributions
arted+0.060
ur+0.059
ked+0.055
 stro+0.055
oured+0.054
Neg logit contributions
 best-0.049
 thing-0.048
 world-0.046
 things-0.045
 biggest-0.045
Token<|endoftext|>
Token ID50256
Feature activation-2.71e-01

Pos logit contributions
arted+0.005
ur+0.005
ked+0.004
anged+0.004
oured+0.004
Neg logit contributions
 best-0.004
 lot-0.003
 things-0.003
 day-0.003
 thing-0.003
TokenOnce
Token ID7454
Feature activation-2.67e+00

Pos logit contributions
 best+0.014
 thing+0.014
 world+0.013
 things+0.013
 biggest+0.013
Neg logit contributions
arted-0.013
ur-0.012
ked-0.012
 stro-0.011
anged-0.011
Token upon
Token ID2402
Feature activation-4.15e+00

Pos logit contributions
 best+0.139
 thing+0.137
 world+0.134
 biggest+0.132
 things+0.132
Neg logit contributions
arted-0.121
ur-0.118
ked-0.107
anged-0.107
 stro-0.105
Token work
Token ID670
Feature activation-1.52e+00

Pos logit contributions
ur+0.326
arted+0.325
ked+0.320
anged+0.317
oured+0.314
Neg logit contributions
iest-0.163
 thing-0.149
 best-0.149
 biggest-0.148
 world-0.147
Token.
Token ID13
Feature activation-5.16e+00

Pos logit contributions
 biggest+0.073
 best+0.073
 lot+0.070
 thing+0.070
 world+0.069
Neg logit contributions
arted-0.070
ur-0.069
anged-0.063
ked-0.063
 stro-0.062
Token He
Token ID679
Feature activation-2.29e-01

Pos logit contributions
 thing+0.250
 best+0.248
 world+0.247
 things+0.240
 important+0.239
Neg logit contributions
arted-0.252
ur-0.250
ked-0.249
 stro-0.248
oured-0.236
Token hugged
Token ID41130
Feature activation+1.13e+01

Pos logit contributions
 best+0.010
 important+0.010
 thing+0.009
 lot+0.009
 world+0.009
Neg logit contributions
arted-0.013
ur-0.012
ked-0.012
anged-0.012
ree-0.011
Token his
Token ID465
Feature activation+5.49e+00

Pos logit contributions
ur+0.766
arted+0.729
anged+0.710
oured+0.709
olded+0.706
Neg logit contributions
 things-0.455
 thing-0.453
 lot-0.401
 best-0.390
 day-0.384
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
ked+0.309
arted+0.307
ur+0.302
 stro+0.301
oured+0.297
Neg logit contributions
 best-0.180
 world-0.176
 thing-0.175
iest-0.174
 important-0.171
Token and
Token ID290
Feature activation+3.49e+00

Pos logit contributions
 best+1.130
 thing+1.109
 world+1.082
 things+1.074
 biggest+1.068
Neg logit contributions
arted-0.899
ur-0.892
 stro-0.809
ked-0.804
anged-0.782
Token took
Token ID1718
Feature activation+2.62e+00

Pos logit contributions
arted+0.167
ur+0.164
ree+0.149
ked+0.148
anged+0.144
Neg logit contributions
 best-0.189
 thing-0.180
 things-0.180
 important-0.173
 lot-0.170
Token it
Token ID340
Feature activation-1.00e+01

Pos logit contributions
arted+0.151
ur+0.148
ked+0.140
 stro+0.138
anged+0.137
Neg logit contributions
 best-0.106
 thing-0.102
 world-0.101
 things-0.098
 biggest-0.097
Token home
Token ID1363
Feature activation+3.61e+00

Pos logit contributions
 best+0.491
 thing+0.469
 things+0.462
 world+0.454
 important+0.442
Neg logit contributions
arted-0.518
ur-0.489
 stro-0.487
ked-0.482
anged-0.472
Token with
Token ID351
Feature activation+4.55e+00

Pos logit contributions
arted+0.242
ur+0.230
ked+0.223
anged+0.219
 stro+0.217
Neg logit contributions
 best-0.125
 thing-0.119
 things-0.115
 world-0.112
 important-0.112
Token.
Token ID13
Feature activation-5.10e+00

Pos logit contributions
arted+0.392
ur+0.388
 stro+0.381
ked+0.354
oured+0.350
Neg logit contributions
 best-0.347
 thing-0.346
 problem-0.345
 biggest-0.336
 puzzle-0.333
Token She
Token ID1375
Feature activation-4.56e+00

Pos logit contributions
 lot+0.245
 best+0.237
 world+0.233
 biggest+0.229
 thing+0.226
Neg logit contributions
ked-0.252
ur-0.249
arted-0.249
anged-0.238
 stro-0.226
Token saw
Token ID2497
Feature activation+2.70e+00

Pos logit contributions
 best+0.199
 thing+0.187
 world+0.186
 important+0.183
 things+0.183
Neg logit contributions
arted-0.257
ur-0.245
ked-0.237
ree-0.228
anged-0.228
Token a
Token ID257
Feature activation-5.38e+00

Pos logit contributions
arted+0.155
ur+0.151
 stro+0.145
ked+0.143
oured+0.139
Neg logit contributions
 best-0.103
 thing-0.103
 world-0.101
 important-0.095
 problem-0.095
Token black
Token ID2042
Feature activation-3.95e+00

Pos logit contributions
 best+0.137
iest+0.133
 important+0.123
 things+0.122
 biggest+0.122
Neg logit contributions
 stro-0.311
oured-0.304
ur-0.299
arted-0.282
ree-0.281
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
 best+0.157
 biggest+0.144
 important+0.143
 world+0.129
 things+0.129
Neg logit contributions
ur-0.223
arted-0.217
 stro-0.215
oured-0.213
ked-0.211
Token sitting
Token ID5586
Feature activation-3.72e+00

Pos logit contributions
 biggest+1.025
 best+1.020
 lot+0.984
 important+0.962
 thing+0.947
Neg logit contributions
ur-0.874
arted-0.851
 stro-0.771
oured-0.750
anged-0.735
Token on
Token ID319
Feature activation-3.04e+00

Pos logit contributions
 best+0.164
 thing+0.160
 biggest+0.156
 game+0.154
 world+0.154
Neg logit contributions
arted-0.192
ur-0.183
ked-0.180
oured-0.171
 stro-0.170
Token the
Token ID262
Feature activation-8.89e-01

Pos logit contributions
 best+0.164
 biggest+0.152
 important+0.151
 problem+0.149
 world+0.149
Neg logit contributions
arted-0.127
ur-0.121
 stro-0.120
ked-0.116
oured-0.112
Token shelf
Token ID18316
Feature activation-1.06e+01

Pos logit contributions
 problem+0.032
 best+0.031
 thing+0.031
 important+0.029
iest+0.029
Neg logit contributions
oured-0.047
arted-0.047
ur-0.046
ked-0.045
 stro-0.042
Token.
Token ID13
Feature activation-5.40e+00

Pos logit contributions
 best+0.461
 thing+0.449
 things+0.436
 important+0.434
 biggest+0.433
Neg logit contributions
arted-0.564
ur-0.555
ked-0.513
oured-0.510
 stro-0.509
Token Tim
Token ID5045
Feature activation+1.45e+01

Pos logit contributions
 best+0.157
 thing+0.154
 things+0.150
 world+0.150
 lot+0.147
Neg logit contributions
arted-0.090
ur-0.087
ked-0.079
 stro-0.078
anged-0.076
Token was
Token ID373
Feature activation-6.97e+00

Pos logit contributions
arted+0.843
ur+0.837
ked+0.780
 stro+0.772
ree+0.744
Neg logit contributions
 best-0.604
 thing-0.583
 world-0.573
 things-0.570
 game-0.550
Token playing
Token ID2712
Feature activation-7.39e-01

Pos logit contributions
 best+0.325
 important+0.301
 thing+0.300
 things+0.300
 lot+0.291
Neg logit contributions
arted-0.383
ur-0.366
ked-0.351
 stro-0.348
anged-0.339
Token with
Token ID351
Feature activation+4.78e+00

Pos logit contributions
 best+0.037
 thing+0.035
 world+0.035
 biggest+0.035
 important+0.034
Neg logit contributions
arted-0.035
ur-0.034
ked-0.032
 stro-0.031
anged-0.031
Token the
Token ID262
Feature activation-7.81e-01

Pos logit contributions
arted+0.232
ur+0.226
ked+0.212
 stro+0.209
anged+0.206
Neg logit contributions
 best-0.240
 thing-0.234
 world-0.228
 things-0.225
 biggest-0.223
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.022
ww+0.019
 thing+0.018
 fog+0.018
asis+0.018
Neg logit contributions
arted-0.038
ked-0.038
ents-0.037
 aback-0.037
oured-0.036
Token,
Token ID11
Feature activation+7.70e-01

Pos logit contributions
 biggest+1.089
 world+1.085
 best+1.082
 thing+1.069
 lot+1.061
Neg logit contributions
ur-0.840
arted-0.831
anged-0.755
ked-0.729
oured-0.686
Token his
Token ID465
Feature activation+5.43e+00

Pos logit contributions
arted+0.036
ur+0.035
ked+0.034
 stro+0.034
anged+0.032
Neg logit contributions
 world-0.038
 best-0.037
 important-0.035
 thing-0.035
 biggest-0.035
Token friend
Token ID1545
Feature activation-1.70e+00

Pos logit contributions
arted+0.275
ked+0.265
ur+0.263
anged+0.249
 stro+0.244
Neg logit contributions
iest-0.223
 thing-0.221
 world-0.220
 best-0.216
 lot-0.215
Token Sara
Token ID24799
Feature activation+8.59e+00

Pos logit contributions
 best+0.074
 thing+0.072
 things+0.067
 day+0.063
 bad+0.063
Neg logit contributions
arted-0.100
ur-0.095
oured-0.092
 stro-0.091
ked-0.090
Token came
Token ID1625
Feature activation+3.08e-01

Pos logit contributions
arted+0.496
ur+0.478
oured+0.448
ked+0.444
anged+0.441
Neg logit contributions
 best-0.359
 thing-0.343
 things-0.340
 world-0.332
 day-0.316
Token.
Token ID13
Feature activation-5.10e+00

Pos logit contributions
ur+0.232
arted+0.217
ked+0.197
 stro+0.195
anged+0.193
Neg logit contributions
 best-0.260
 world-0.255
 biggest-0.254
 lot-0.248
 thing-0.247
Token We
Token ID775
Feature activation-9.69e+00

Pos logit contributions
 best+0.188
 thing+0.187
 world+0.180
 things+0.176
 important+0.174
Neg logit contributions
arted-0.312
ur-0.295
ked-0.291
 stro-0.283
anged-0.282
Token can
Token ID460
Feature activation-1.28e+01

Pos logit contributions
 best+0.272
 thing+0.262
 world+0.249
 things+0.245
 biggest+0.245
Neg logit contributions
arted-0.680
ur-0.679
ked-0.641
 stro-0.639
anged-0.632
Token fix
Token ID4259
Feature activation+6.13e+00

Pos logit contributions
 best+0.582
 thing+0.559
 world+0.544
 things+0.541
 lot+0.535
Neg logit contributions
arted-0.709
ur-0.679
ked-0.651
 stro-0.636
anged-0.635
Token the
Token ID262
Feature activation-7.81e-01

Pos logit contributions
arted+0.297
ur+0.284
ked+0.271
 stro+0.266
anged+0.265
Neg logit contributions
 best-0.274
 world-0.273
 thing-0.270
 important-0.263
 biggest-0.260
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.024
ww+0.017
body+0.016
 thing+0.016
 fog+0.016
Neg logit contributions
arted-0.040
ked-0.039
oured-0.039
 Jack-0.037
 Sammy-0.036
Token."
Token ID526
Feature activation-3.43e+00

Pos logit contributions
 best+1.155
 thing+1.142
 lot+1.125
 world+1.117
 biggest+1.116
Neg logit contributions
ur-0.787
arted-0.775
ked-0.666
oured-0.664
anged-0.651
Token She
Token ID1375
Feature activation-4.75e+00

Pos logit contributions
 things+0.133
 lot+0.126
 thing+0.126
 world+0.122
 day+0.120
Neg logit contributions
arted-0.191
ked-0.187
ur-0.185
anged-0.176
ried-0.174
Token showed
Token ID3751
Feature activation+6.13e+00

Pos logit contributions
 best+0.219
 thing+0.206
 important+0.205
 things+0.195
 lot+0.194
Neg logit contributions
arted-0.257
ur-0.236
ree-0.229
anged-0.226
ked-0.221
Token him
Token ID683
Feature activation+6.93e+00

Pos logit contributions
 stro+0.263
ur+0.262
arted+0.253
anged+0.248
ked+0.245
Neg logit contributions
 things-0.354
 thing-0.347
 best-0.329
 other-0.308
 biggest-0.307
Token how
Token ID703
Feature activation+2.72e+00

Pos logit contributions
arted+0.292
ur+0.285
ked+0.263
 stro+0.259
anged+0.254
Neg logit contributions
 best-0.395
 thing-0.386
 world-0.376
 things-0.375
 biggest-0.370
Token inside
Token ID2641
Feature activation+4.50e+00

Pos logit contributions
 world+0.860
 best+0.846
 lot+0.843
 thing+0.838
 biggest+0.818
Neg logit contributions
ur-0.642
arted-0.631
ked-0.604
 stro-0.574
anged-0.542
Token.
Token ID13
Feature activation-5.12e+00

Pos logit contributions
arted+0.209
ur+0.200
anged+0.190
ked+0.188
 stro+0.183
Neg logit contributions
 biggest-0.205
 best-0.205
 world-0.198
 thing-0.195
 puzzle-0.195
Token It
Token ID632
Feature activation-8.30e+00

Pos logit contributions
 thing+0.201
 things+0.199
 best+0.199
 lot+0.195
 world+0.195
Neg logit contributions
arted-0.299
ur-0.296
ked-0.278
oured-0.274
 stro-0.271
Token is
Token ID318
Feature activation-1.59e+01

Pos logit contributions
 thing+0.206
 best+0.201
 world+0.199
 important+0.186
 biggest+0.186
Neg logit contributions
ur-0.580
arted-0.558
 stro-0.551
ked-0.531
anged-0.527
Token a
Token ID257
Feature activation-5.40e+00

Pos logit contributions
 best+0.627
 thing+0.617
 world+0.603
 things+0.587
 biggest+0.582
Neg logit contributions
arted-0.919
ur-0.909
ked-0.861
 stro-0.854
anged-0.834
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.158
 best+0.154
 biggest+0.149
 things+0.143
 important+0.138
Neg logit contributions
ur-0.303
 stro-0.302
ree-0.290
ked-0.275
arted-0.271
Token.
Token ID13
Feature activation-5.29e+00

Pos logit contributions
 biggest+1.058
 lot+1.031
 best+1.015
 world+0.976
 important+0.966
Neg logit contributions
ur-0.844
arted-0.757
 stro-0.720
ked-0.718
anged-0.702
Token It
Token ID632
Feature activation-8.44e+00

Pos logit contributions
 best+0.230
 thing+0.228
 things+0.223
 lot+0.223
 world+0.221
Neg logit contributions
arted-0.292
ur-0.291
ked-0.271
anged-0.265
 stro-0.263
Token has
Token ID468
Feature activation-1.60e+01

Pos logit contributions
 thing+0.222
 world+0.216
 best+0.216
 lot+0.200
 important+0.198
Neg logit contributions
ur-0.576
arted-0.553
 stro-0.536
ked-0.523
anged-0.519
Token many
Token ID867
Feature activation-1.02e+01

Pos logit contributions
 world+0.671
 thing+0.669
 best+0.665
 important+0.612
iest+0.601
Neg logit contributions
arted-0.820
ur-0.811
ked-0.780
 stro-0.765
oured-0.729
Token pieces
Token ID5207
Feature activation-1.28e+01

Pos logit contributions
 best+0.325
 biggest+0.303
 world+0.301
 thing+0.300
 day+0.287
Neg logit contributions
arted-0.559
 stro-0.547
anged-0.536
ur-0.531
ked-0.524
Token He
Token ID679
Feature activation-1.21e-01

Pos logit contributions
 thing+0.085
 best+0.084
 world+0.083
 things+0.082
 biggest+0.080
Neg logit contributions
arted-0.093
ked-0.089
ur-0.089
 stro-0.088
anged-0.085
Token decided
Token ID3066
Feature activation-1.08e+00

Pos logit contributions
 best+0.006
 thing+0.006
 important+0.005
 things+0.005
 world+0.005
Neg logit contributions
arted-0.006
ur-0.006
ked-0.006
anged-0.006
 stro-0.005
Token to
Token ID284
Feature activation+4.53e+00

Pos logit contributions
 things+0.055
 thing+0.054
 best+0.054
 important+0.052
 world+0.050
Neg logit contributions
arted-0.055
ur-0.052
anged-0.050
ked-0.050
oured-0.049
Token join
Token ID4654
Feature activation+8.93e+00

Pos logit contributions
arted+0.224
ur+0.223
 stro+0.203
ked+0.202
anged+0.199
Neg logit contributions
 best-0.219
 thing-0.210
 world-0.205
 biggest-0.204
 things-0.203
Token a
Token ID257
Feature activation-5.40e+00

Pos logit contributions
arted+0.501
ur+0.491
 stro+0.471
ked+0.463
anged+0.455
Neg logit contributions
 best-0.391
 thing-0.389
 things-0.387
 world-0.363
 lot-0.358
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.126
 best+0.124
 thing+0.119
 biggest+0.117
 things+0.114
Neg logit contributions
ur-0.359
ree-0.345
arted-0.323
 stro-0.322
 Sammy-0.310
Token-
Token ID12
Feature activation+3.05e+00

Pos logit contributions
 best+1.020
 thing+0.994
 world+0.968
 biggest+0.965
 things+0.961
Neg logit contributions
arted-1.005
ur-1.002
ked-0.909
 stro-0.906
anged-0.889
Tokens
Token ID82
Feature activation-7.85e+00

Pos logit contributions
arted+0.141
ur+0.140
ked+0.128
 stro+0.127
anged+0.125
Neg logit contributions
 best-0.162
 thing-0.155
 world-0.152
 things-0.150
 biggest-0.149
Tokenolving
Token ID10890
Feature activation+1.25e+01

Pos logit contributions
 best+0.358
 biggest+0.353
 thing+0.346
 lot+0.343
 important+0.331
Neg logit contributions
ur-0.368
arted-0.360
ree-0.323
anged-0.323
 stro-0.322
Token group
Token ID1448
Feature activation-6.44e+00

Pos logit contributions
arted+0.750
ked+0.728
ur+0.725
anged+0.690
ree+0.687
Neg logit contributions
 biggest-0.321
 best-0.315
 important-0.305
 worst-0.294
 lot-0.294
Token.
Token ID13
Feature activation-5.40e+00

Pos logit contributions
 best+0.301
 thing+0.292
 biggest+0.284
 world+0.284
 lot+0.282
Neg logit contributions
arted-0.330
ur-0.327
ked-0.300
 stro-0.298
anged-0.294
Token who
Token ID508
Feature activation-2.06e+00

Pos logit contributions
arted+0.398
ur+0.388
ked+0.362
 stro+0.358
anged+0.354
Neg logit contributions
 best-0.418
 thing-0.406
 things-0.396
 world-0.393
 biggest-0.387
Token can
Token ID460
Feature activation-1.28e+01

Pos logit contributions
 best+0.063
 thing+0.059
 biggest+0.057
 world+0.057
 important+0.057
Neg logit contributions
ur-0.135
arted-0.134
ked-0.127
anged-0.127
 stro-0.127
Token put
Token ID1234
Feature activation+5.47e+00

Pos logit contributions
 best+0.572
 thing+0.557
 world+0.537
 things+0.535
 biggest+0.527
Neg logit contributions
arted-0.699
ur-0.681
ked-0.643
 stro-0.637
anged-0.630
Token together
Token ID1978
Feature activation+9.54e-01

Pos logit contributions
arted+0.265
ur+0.259
ked+0.251
oured+0.243
 stro+0.243
Neg logit contributions
 best-0.243
 world-0.241
 thing-0.235
 important-0.231
 problem-0.228
Token the
Token ID262
Feature activation-7.81e-01

Pos logit contributions
arted+0.048
ur+0.046
ked+0.043
oured+0.043
 stro+0.042
Neg logit contributions
 best-0.044
 thing-0.044
 world-0.042
 important-0.042
 lot-0.042
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.022
 fog+0.019
 important+0.019
ooky+0.019
 lot+0.019
Neg logit contributions
oured-0.040
 aback-0.039
arted-0.037
ked-0.037
 Sammy-0.035
Token first
Token ID717
Feature activation-1.59e+00

Pos logit contributions
 best+1.079
 thing+1.046
 lot+1.044
 world+1.033
 important+1.020
Neg logit contributions
ur-0.896
arted-0.895
ked-0.799
oured-0.787
 stro-0.771
Token."
Token ID526
Feature activation-3.45e+00

Pos logit contributions
 best+0.063
 thing+0.062
 world+0.060
 lot+0.060
 biggest+0.060
Neg logit contributions
arted-0.088
ur-0.087
ked-0.082
oured-0.079
anged-0.079
Token
Token ID220
Feature activation+1.44e+00

Pos logit contributions
 best+0.121
 things+0.120
 lot+0.117
 world+0.110
 day+0.109
Neg logit contributions
arted-0.216
ur-0.214
ked-0.210
ree-0.198
anged-0.196
Token
Token ID198
Feature activation+3.78e+00

Pos logit contributions
arted+0.089
ked+0.086
ur+0.086
 stro+0.084
oured+0.082
Neg logit contributions
 best-0.058
 things-0.052
 world-0.051
 biggest-0.050
 thing-0.049
Token
Token ID198
Feature activation+3.74e+00

Pos logit contributions
arted+0.242
ur+0.233
ked+0.233
anged+0.223
oured+0.216
Neg logit contributions
 best-0.140
 world-0.140
 lot-0.137
 things-0.132
 thing-0.131
Token lots
Token ID6041
Feature activation-5.50e+00

Pos logit contributions
 thing+0.431
 best+0.428
 world+0.412
 things+0.402
 biggest+0.396
Neg logit contributions
arted-0.669
ur-0.656
 stro-0.623
ked-0.622
oured-0.605
Token of
Token ID286
Feature activation-6.22e+00

Pos logit contributions
 best+0.235
 thing+0.234
 biggest+0.232
 world+0.231
 things+0.216
Neg logit contributions
ur-0.290
arted-0.286
 stro-0.275
oured-0.262
anged-0.259
Token fun
Token ID1257
Feature activation-9.02e+00

Pos logit contributions
 thing+0.152
 world+0.146
 biggest+0.143
iest+0.139
 best+0.135
Neg logit contributions
arted-0.399
ur-0.394
 stro-0.386
anged-0.373
oured-0.373
Token putting
Token ID5137
Feature activation+4.81e+00

Pos logit contributions
 biggest+0.480
 best+0.462
 thing+0.448
 world+0.448
iest+0.429
Neg logit contributions
arted-0.361
anged-0.345
ur-0.340
 stro-0.333
ked-0.327
Token the
Token ID262
Feature activation-7.81e-01

Pos logit contributions
arted+0.238
ked+0.218
ur+0.216
oured+0.213
 stro+0.209
Neg logit contributions
 best-0.197
 world-0.197
 thing-0.195
 biggest-0.193
 important-0.192
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.023
 fog+0.021
ww+0.021
 thing+0.018
asis+0.018
Neg logit contributions
 aback-0.035
oured-0.035
arted-0.035
ked-0.033
ents-0.033
Token together
Token ID1978
Feature activation+8.69e-01

Pos logit contributions
 best+1.187
 thing+1.176
 lot+1.168
 biggest+1.167
 world+1.159
Neg logit contributions
ur-0.756
arted-0.753
ked-0.640
anged-0.631
oured-0.625
Token.
Token ID13
Feature activation-5.31e+00

Pos logit contributions
arted+0.049
ur+0.048
ked+0.045
 stro+0.045
anged+0.044
Neg logit contributions
 best-0.036
 thing-0.035
 world-0.034
 biggest-0.034
 things-0.034
Token They
Token ID1119
Feature activation-4.81e+00

Pos logit contributions
 best+0.292
 thing+0.286
 world+0.279
 things+0.278
 lot+0.277
Neg logit contributions
arted-0.236
ur-0.231
ked-0.217
 stro-0.213
oured-0.209
Token were
Token ID547
Feature activation-8.05e+00

Pos logit contributions
 best+0.214
 thing+0.200
 lot+0.197
 things+0.193
 world+0.193
Neg logit contributions
arted-0.272
ur-0.263
ked-0.246
anged-0.242
 stro-0.240
Token very
Token ID845
Feature activation-1.51e+01

Pos logit contributions
 thing+0.302
 world+0.292
 best+0.284
 things+0.281
 problem+0.278
Neg logit contributions
arted-0.477
ur-0.464
ked-0.439
 stro-0.438
oured-0.421
Token I
Token ID314
Feature activation-9.99e+00

Pos logit contributions
ur+0.070
arted+0.068
 stro+0.066
ked+0.065
anged+0.063
Neg logit contributions
 best-0.042
 world-0.041
 thing-0.041
 lot-0.040
 biggest-0.039
Token will
Token ID481
Feature activation-1.25e+01

Pos logit contributions
 best+0.363
 thing+0.350
 world+0.334
 things+0.329
 lot+0.327
Neg logit contributions
arted-0.644
ur-0.622
ked-0.596
 stro-0.580
anged-0.579
Token clean
Token ID3424
Feature activation+1.99e-01

Pos logit contributions
 best+0.551
 thing+0.532
 things+0.513
 world+0.510
 lot+0.506
Neg logit contributions
arted-0.719
ur-0.687
ked-0.661
anged-0.644
 stro-0.637
Token up
Token ID510
Feature activation+5.41e+00

Pos logit contributions
arted+0.010
ur+0.010
ked+0.009
 stro+0.009
anged+0.009
Neg logit contributions
 best-0.009
 thing-0.009
 world-0.009
 things-0.009
 biggest-0.009
Token the
Token ID262
Feature activation-7.81e-01

Pos logit contributions
arted+0.273
ur+0.269
 stro+0.250
ked+0.248
anged+0.242
Neg logit contributions
 best-0.252
 thing-0.245
 world-0.239
 biggest-0.237
 lot-0.233
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.029
 best+0.024
 important+0.023
 biggest+0.022
 strongest+0.022
Neg logit contributions
arted-0.036
ents-0.034
oured-0.034
 aback-0.032
ked-0.032
Token."
Token ID526
Feature activation-3.43e+00

Pos logit contributions
 best+1.101
 biggest+1.083
 lot+1.071
 important+1.058
 thing+1.057
Neg logit contributions
ur-0.815
arted-0.807
ked-0.699
oured-0.699
ried-0.691
Token Tom
Token ID4186
Feature activation+1.29e+01

Pos logit contributions
 best+0.156
 thing+0.152
 things+0.150
 lot+0.150
 day+0.143
Neg logit contributions
arted-0.201
ur-0.196
oured-0.188
ked-0.186
ried-0.172
Token's
Token ID338
Feature activation-1.02e+01

Pos logit contributions
arted+0.722
ur+0.671
ree+0.635
anged+0.623
ked+0.597
Neg logit contributions
 best-0.639
 thing-0.580
 things-0.575
 game-0.563
 is-0.559
Token mom
Token ID1995
Feature activation+2.34e+00

Pos logit contributions
 best+0.483
 world+0.474
 thing+0.473
 important+0.462
 biggest+0.462
Neg logit contributions
arted-0.490
ur-0.477
ked-0.460
anged-0.452
 stro-0.452
Token smiled
Token ID13541
Feature activation+5.94e+00

Pos logit contributions
arted+0.120
ur+0.112
ree+0.107
anged+0.102
ked+0.100
Neg logit contributions
 best-0.119
 thing-0.113
 things-0.109
 game-0.106
 bad-0.105
Token the
Token ID262
Feature activation-3.68e+00

Pos logit contributions
 best+0.575
 thing+0.552
 biggest+0.548
 world+0.531
 lot+0.528
Neg logit contributions
arted-0.789
ur-0.779
 stro-0.731
ked-0.726
anged-0.706
Token key
Token ID1994
Feature activation-1.23e+01

Pos logit contributions
 best+0.107
 thing+0.106
 important+0.101
iest+0.098
 biggest+0.098
Neg logit contributions
arted-0.224
ur-0.219
 stro-0.210
ked-0.204
ree-0.202
Token to
Token ID284
Feature activation+2.98e+00

Pos logit contributions
 best+0.558
 thing+0.543
 world+0.525
 things+0.523
 lot+0.523
Neg logit contributions
ur-0.637
arted-0.636
ked-0.594
 stro-0.585
oured-0.576
Token completing
Token ID14339
Feature activation-2.51e-01

Pos logit contributions
arted+0.160
ur+0.153
ked+0.148
anged+0.144
 stro+0.143
Neg logit contributions
 best-0.136
 thing-0.135
 things-0.131
 world-0.130
 lot-0.128
Token the
Token ID262
Feature activation-1.15e+00

Pos logit contributions
 best+0.012
 thing+0.011
 world+0.011
 biggest+0.011
 important+0.011
Neg logit contributions
arted-0.012
ur-0.012
 stro-0.011
ked-0.011
anged-0.010
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.025
 thing+0.025
 best+0.025
 lot+0.023
 biggest+0.022
Neg logit contributions
ur-0.067
arted-0.066
 stro-0.061
oured-0.061
ree-0.060
Token.
Token ID13
Feature activation-4.93e+00

Pos logit contributions
 best+0.929
 thing+0.902
 biggest+0.900
 lot+0.869
 world+0.856
Neg logit contributions
ur-1.000
arted-0.988
 stro-0.878
anged-0.876
ree-0.865
Token<|endoftext|>
Token ID50256
Feature activation+3.80e-01

Pos logit contributions
 world+0.234
 best+0.231
 lot+0.225
 thing+0.224
 things+0.220
Neg logit contributions
arted-0.238
ked-0.237
ur-0.230
anged-0.225
oured-0.217
TokenOnce
Token ID7454
Feature activation-1.77e+00

Pos logit contributions
arted+0.020
ur+0.019
ked+0.018
 stro+0.018
anged+0.018
Neg logit contributions
 best-0.018
 thing-0.018
 world-0.018
 things-0.017
 lot-0.017
Token upon
Token ID2402
Feature activation-3.06e+00

Pos logit contributions
 best+0.091
 thing+0.089
 world+0.086
 things+0.086
 biggest+0.085
Neg logit contributions
arted-0.083
ur-0.081
ked-0.075
 stro-0.074
anged-0.073
Token a
Token ID257
Feature activation-4.37e+00

Pos logit contributions
 best+0.142
 problem+0.141
 thing+0.138
 world+0.135
 biggest+0.132
Neg logit contributions
ur-0.144
 stro-0.144
arted-0.142
ked-0.136
ree-0.129
Token and
Token ID290
Feature activation+6.73e-01

Pos logit contributions
 best+0.568
 thing+0.554
 things+0.536
 world+0.532
 important+0.529
Neg logit contributions
arted-0.576
ur-0.559
ked-0.520
 stro-0.508
anged-0.505
Token started
Token ID2067
Feature activation-5.32e+00

Pos logit contributions
arted+0.033
ur+0.033
ked+0.031
 stro+0.030
anged+0.030
Neg logit contributions
 best-0.033
 thing-0.032
 world-0.031
 things-0.031
 biggest-0.031
Token to
Token ID284
Feature activation+2.98e+00

Pos logit contributions
 best+0.284
 thing+0.276
 things+0.273
 world+0.268
 lot+0.264
Neg logit contributions
arted-0.244
ur-0.240
ked-0.226
oured-0.217
 stro-0.215
Token put
Token ID1234
Feature activation+4.60e+00

Pos logit contributions
arted+0.147
ur+0.144
ked+0.135
 stro+0.133
anged+0.131
Neg logit contributions
 best-0.147
 thing-0.143
 world-0.139
 things-0.138
 biggest-0.136
Token the
Token ID262
Feature activation-1.15e+00

Pos logit contributions
arted+0.236
ur+0.227
ked+0.217
 stro+0.216
anged+0.211
Neg logit contributions
 best-0.204
 world-0.204
 thing-0.200
 important-0.195
 biggest-0.192
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
 best+0.034
 important+0.034
 world+0.034
 thing+0.033
 lot+0.031
Neg logit contributions
arted-0.063
 stro-0.062
oured-0.062
ur-0.062
ked-0.061
Token together
Token ID1978
Feature activation+1.23e+00

Pos logit contributions
 best+1.076
 thing+1.051
 world+1.046
 lot+1.037
 biggest+1.035
Neg logit contributions
ur-0.902
arted-0.890
ked-0.794
 stro-0.788
oured-0.783
Token.
Token ID13
Feature activation-4.72e+00

Pos logit contributions
arted+0.061
ur+0.061
ked+0.057
 stro+0.057
oured+0.056
Neg logit contributions
 best-0.054
 thing-0.053
 world-0.052
 biggest-0.052
 lot-0.051
Token She
Token ID1375
Feature activation-3.94e+00

Pos logit contributions
 best+0.228
 world+0.226
 lot+0.223
 thing+0.219
 things+0.215
Neg logit contributions
arted-0.225
ked-0.221
ur-0.221
anged-0.212
oured-0.202
Token worked
Token ID3111
Feature activation-5.05e+00

Pos logit contributions
 best+0.173
 thing+0.167
 things+0.160
 important+0.160
 lot+0.160
Neg logit contributions
arted-0.220
ur-0.212
ked-0.201
anged-0.195
ree-0.193
Token very
Token ID845
Feature activation-1.39e+01

Pos logit contributions
 best+0.154
 thing+0.150
 world+0.145
 things+0.140
 biggest+0.139
Neg logit contributions
arted-0.342
ur-0.335
ked-0.319
 stro-0.316
anged-0.313
Token.
Token ID13
Feature activation-8.12e+00

Pos logit contributions
arted+0.051
ur+0.051
ked+0.047
 stro+0.047
anged+0.046
Neg logit contributions
 best-0.042
 thing-0.041
 world-0.040
 things-0.040
 biggest-0.039
Token She
Token ID1375
Feature activation-7.20e+00

Pos logit contributions
 best+0.384
 world+0.380
 thing+0.376
 things+0.365
 lot+0.364
Neg logit contributions
arted-0.406
ur-0.388
ked-0.385
oured-0.370
anged-0.370
Token wanted
Token ID2227
Feature activation-4.25e+00

Pos logit contributions
 best+0.345
 thing+0.331
 things+0.325
 world+0.322
 lot+0.320
Neg logit contributions
arted-0.368
ur-0.355
ked-0.338
anged-0.327
oured-0.320
Token a
Token ID257
Feature activation-6.21e+00

Pos logit contributions
 best+0.237
 thing+0.232
 things+0.226
 world+0.224
 biggest+0.221
Neg logit contributions
arted-0.187
ur-0.181
ked-0.169
 stro-0.165
anged-0.163
Token new
Token ID649
Feature activation-1.23e+01

Pos logit contributions
 best+0.195
 world+0.193
 things+0.187
 important+0.186
 biggest+0.183
Neg logit contributions
ur-0.364
 stro-0.355
ree-0.337
oured-0.336
arted-0.333
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
 best+0.388
 biggest+0.366
 important+0.361
 world+0.358
 things+0.350
Neg logit contributions
ur-0.753
arted-0.740
 stro-0.728
oured-0.696
ree-0.695
Token that
Token ID326
Feature activation-4.07e+00

Pos logit contributions
 best+1.006
 biggest+0.982
 thing+0.976
 world+0.971
 lot+0.970
Neg logit contributions
ur-0.968
arted-0.925
 stro-0.850
ked-0.824
anged-0.823
Token was
Token ID373
Feature activation-6.57e+00

Pos logit contributions
 best+0.164
 thing+0.160
 biggest+0.157
 world+0.157
 lot+0.155
Neg logit contributions
ur-0.219
arted-0.216
 stro-0.206
anged-0.196
ked-0.196
Token harder
Token ID7069
Feature activation-1.18e+00

Pos logit contributions
 world+0.264
 thing+0.259
 best+0.256
 biggest+0.244
 things+0.242
Neg logit contributions
ur-0.365
arted-0.358
 stro-0.336
ked-0.325
oured-0.318
Token and
Token ID290
Feature activation+4.46e+00

Pos logit contributions
 best+0.058
 biggest+0.057
 thing+0.057
 lot+0.057
 world+0.056
Neg logit contributions
arted-0.049
ur-0.048
anged-0.045
 stro-0.043
ree-0.042
Token more
Token ID517
Feature activation-1.28e+01

Pos logit contributions
arted+0.239
ur+0.234
 stro+0.225
ked+0.218
anged+0.212
Neg logit contributions
 world-0.192
 thing-0.191
 best-0.190
 lot-0.182
 biggest-0.177
Token happy
Token ID3772
Feature activation-8.62e+00

Pos logit contributions
 best+0.313
 thing+0.305
 world+0.297
 things+0.296
 biggest+0.292
Neg logit contributions
arted-0.225
ur-0.219
ked-0.200
 stro-0.198
anged-0.195
Token that
Token ID326
Feature activation-6.90e+00

Pos logit contributions
 best+0.399
 thing+0.386
 biggest+0.377
 world+0.376
 lot+0.372
Neg logit contributions
arted-0.439
ur-0.429
 stro-0.403
ked-0.402
anged-0.390
Token he
Token ID339
Feature activation-8.25e+00

Pos logit contributions
 best+0.315
 biggest+0.315
 thing+0.313
 world+0.309
 lot+0.304
Neg logit contributions
ur-0.319
arted-0.318
 stro-0.302
ked-0.278
anged-0.277
Token finished
Token ID5201
Feature activation+1.99e+00

Pos logit contributions
 best+0.355
 thing+0.354
 world+0.338
 things+0.336
 lot+0.336
Neg logit contributions
arted-0.442
ur-0.439
 stro-0.421
ked-0.408
oured-0.403
Token the
Token ID262
Feature activation-1.15e+00

Pos logit contributions
arted+0.099
ur+0.097
ked+0.091
 stro+0.090
anged+0.088
Neg logit contributions
 best-0.097
 thing-0.095
 world-0.092
 things-0.091
 biggest-0.090
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
 best+0.031
 thing+0.030
iest+0.028
 biggest+0.028
 important+0.028
Neg logit contributions
arted-0.068
ur-0.068
 stro-0.066
oured-0.064
ked-0.062
Token all
Token ID477
Feature activation-3.79e+00

Pos logit contributions
 best+0.976
 thing+0.964
 biggest+0.947
 lot+0.939
 world+0.920
Neg logit contributions
ur-0.970
arted-0.956
 stro-0.881
oured-0.854
anged-0.843
Token on
Token ID319
Feature activation-2.45e+00

Pos logit contributions
 best+0.146
 biggest+0.139
 thing+0.139
 lot+0.139
 world+0.135
Neg logit contributions
arted-0.220
ur-0.212
 stro-0.196
anged-0.195
oured-0.194
Token his
Token ID465
Feature activation+6.24e+00

Pos logit contributions
 best+0.140
 thing+0.136
 biggest+0.134
 world+0.132
 important+0.130
Neg logit contributions
arted-0.097
ur-0.093
 stro-0.086
ked-0.084
anged-0.082
Token own
Token ID898
Feature activation-3.47e+00

Pos logit contributions
ur+0.349
anged+0.343
arted+0.343
 stro+0.342
oured+0.328
Neg logit contributions
 thing-0.186
iest-0.180
 best-0.169
 biggest-0.169
 world-0.167
Token.
Token ID13
Feature activation-4.19e+00

Pos logit contributions
 thing+0.150
 best+0.149
 biggest+0.148
 lot+0.137
iest+0.137
Neg logit contributions
arted-0.157
ur-0.156
anged-0.148
 stro-0.148
oured-0.146
Token pieces
Token ID5207
Feature activation-1.24e+01

Pos logit contributions
iest+0.019
 lot+0.015
body+0.015
 worst+0.015
 fog+0.015
Neg logit contributions
 aback-0.030
oured-0.028
arted-0.028
ked-0.025
 Jack-0.025
Token and
Token ID290
Feature activation+3.70e+00

Pos logit contributions
 best+0.640
 thing+0.632
 world+0.626
 biggest+0.616
 lot+0.615
Neg logit contributions
arted-0.563
ur-0.559
ked-0.500
anged-0.489
 stro-0.488
Token pie
Token ID2508
Feature activation-1.77e+00

Pos logit contributions
arted+0.195
ur+0.190
ked+0.178
 stro+0.177
anged+0.175
Neg logit contributions
 best-0.169
 thing-0.166
 world-0.161
 things-0.158
 biggest-0.158
Tokencing
Token ID2259
Feature activation-3.04e+00

Pos logit contributions
 thing+0.091
 best+0.091
 world+0.090
 biggest+0.089
 important+0.088
Neg logit contributions
arted-0.079
ur-0.077
ked-0.071
anged-0.069
 stro-0.068
Token the
Token ID262
Feature activation-7.65e-01

Pos logit contributions
 best+0.147
 thing+0.143
 world+0.142
 biggest+0.138
 important+0.137
Neg logit contributions
arted-0.149
ur-0.145
ked-0.136
 stro-0.133
anged-0.132
Token puzzle
Token ID15027
Feature activation-2.08e+01

Pos logit contributions
iest+0.027
ooky+0.020
body+0.020
br+0.019
time+0.019
Neg logit contributions
arted-0.036
ked-0.034
oured-0.034
 aback-0.034
 argued-0.033
Token together
Token ID1978
Feature activation+9.03e-01

Pos logit contributions
 best+1.060
 world+1.045
 thing+1.042
 biggest+1.037
 lot+1.033
Neg logit contributions
ur-0.866
arted-0.865
anged-0.750
ked-0.750
oured-0.731
Token!
Token ID0
Feature activation-2.02e+00

Pos logit contributions
arted+0.048
ur+0.047
ked+0.044
 stro+0.044
anged+0.043
Neg logit contributions
 best-0.039
 biggest-0.037
 thing-0.037
 world-0.037
 important-0.036
Token<|endoftext|>
Token ID50256
Feature activation-2.13e-01

Pos logit contributions
 best+0.099
 thing+0.095
 world+0.094
 lot+0.093
 things+0.091
Neg logit contributions
arted-0.103
ur-0.102
ked-0.099
 stro-0.095
anged-0.094
TokenOnce
Token ID7454
Feature activation-2.60e+00

Pos logit contributions
 best+0.010
 thing+0.010
 things+0.010
 important+0.010
 biggest+0.010
Neg logit contributions
arted-0.011
ur-0.010
ked-0.010
 stro-0.009
anged-0.009
Token upon
Token ID2402
Feature activation-4.09e+00

Pos logit contributions
 best+0.128
 thing+0.125
 world+0.121
 things+0.121
 biggest+0.118
Neg logit contributions
arted-0.129
ur-0.126
ked-0.118
 stro-0.117
anged-0.114
Token skip
Token ID14267
Feature activation+1.63e+00

Pos logit contributions
arted+0.348
ur+0.342
ked+0.323
anged+0.315
oured+0.309
Neg logit contributions
 best-0.299
 thing-0.298
 world-0.286
 things-0.284
 lot-0.282
Token around
Token ID1088
Feature activation+8.12e+00

Pos logit contributions
arted+0.076
ur+0.075
ked+0.070
anged+0.070
 stro+0.070
Neg logit contributions
 best-0.085
 thing-0.084
 things-0.082
 world-0.081
 lot-0.080
Token the
Token ID262
Feature activation+1.22e+00

Pos logit contributions
arted+0.425
ur+0.419
ked+0.394
 stro+0.389
anged+0.387
Neg logit contributions
 best-0.383
 thing-0.381
 things-0.371
 world-0.367
 lot-0.356
Token beach
Token ID10481
Feature activation-1.03e+00

Pos logit contributions
arted+0.063
oured+0.058
ur+0.058
ked+0.054
 stro+0.054
Neg logit contributions
 best-0.049
 important-0.047
 thing-0.046
 puzzle-0.046
 lot-0.046
Token and
Token ID290
Feature activation+5.52e+00

Pos logit contributions
 best+0.050
 thing+0.049
 world+0.047
 important+0.047
 biggest+0.047
Neg logit contributions
arted-0.050
ur-0.049
ked-0.046
 stro-0.045
anged-0.044
Token Bob
Token ID5811
Feature activation+2.04e+01

Pos logit contributions
arted+0.296
ur+0.290
ked+0.273
anged+0.267
 stro+0.265
Neg logit contributions
 best-0.254
 thing-0.249
 things-0.242
 biggest-0.231
 world-0.231
Token followed
Token ID3940
Feature activation+7.00e+00

Pos logit contributions
arted+1.142
ur+1.117
ked+1.038
anged+1.030
ree+1.024
Neg logit contributions
 best-0.896
 thing-0.839
 things-0.814
 world-0.809
 biggest-0.794
Token him
Token ID683
Feature activation+8.97e+00

Pos logit contributions
arted+0.395
oured+0.383
ur+0.382
anged+0.377
 stro+0.372
Neg logit contributions
 thing-0.325
 things-0.305
 best-0.285
 lot-0.280
 day-0.269
Token.
Token ID13
Feature activation-3.34e+00

Pos logit contributions
arted+0.480
ur+0.468
ked+0.442
 stro+0.439
anged+0.433
Neg logit contributions
 best-0.412
 thing-0.400
 things-0.387
 world-0.385
 biggest-0.377
Token' '
Token ID220
Feature activation+3.41e+00

Pos logit contributions
 lot+0.143
 best+0.139
 thing+0.139
 world+0.138
 things+0.135
Neg logit contributions
ked-0.182
ur-0.180
 stro-0.175
anged-0.171
arted-0.170
Token'\n'
Token ID198
Feature activation+5.75e+00

Pos logit contributions
arted+0.198
ur+0.193
ked+0.192
 stro+0.190
anged+0.185
Neg logit contributions
 best-0.145
 world-0.135
 things-0.131
 thing-0.130
 lot-0.129
Token.
Token ID13
Feature activation-3.80e+00

Pos logit contributions
 best+0.543
 thing+0.522
 world+0.503
 things+0.501
 important+0.487
Neg logit contributions
arted-0.843
ur-0.830
ked-0.784
 stro-0.778
anged-0.766
Token<|endoftext|>
Token ID50256
Feature activation+1.28e+00

Pos logit contributions
 best+0.180
 world+0.175
 thing+0.173
 lot+0.169
 things+0.168
Neg logit contributions
arted-0.201
ur-0.195
ked-0.190
oured-0.181
anged-0.181
TokenOne
Token ID3198
Feature activation-8.04e+00

Pos logit contributions
arted+0.057
ur+0.055
ked+0.051
 stro+0.051
anged+0.049
Neg logit contributions
 best-0.069
 thing-0.067
 things-0.065
 important-0.065
 biggest-0.065
Token morning
Token ID3329
Feature activation-8.81e+00

Pos logit contributions
 biggest+0.157
 best+0.151
 things+0.144
 lot+0.140
 world+0.132
Neg logit contributions
ur-0.557
arted-0.551
anged-0.530
 Joe-0.528
oured-0.510
Token,
Token ID11
Feature activation+2.17e+00

Pos logit contributions
 best+0.402
 thing+0.392
 world+0.378
 things+0.378
 lot+0.369
Neg logit contributions
arted-0.472
ur-0.460
ked-0.433
 stro-0.428
anged-0.422
Token Bob
Token ID5811
Feature activation+1.98e+01

Pos logit contributions
arted+0.127
ur+0.125
ked+0.118
 stro+0.117
anged+0.115
Neg logit contributions
 best-0.088
 thing-0.085
 world-0.082
 things-0.082
 biggest-0.080
Token the
Token ID262
Feature activation+5.03e-01

Pos logit contributions
arted+1.164
ur+1.122
ked+1.073
 stro+1.040
oured+1.034
Neg logit contributions
 best-0.813
 thing-0.812
 world-0.786
 things-0.773
 lot-0.746
Token bear
Token ID6842
Feature activation+8.96e-01

Pos logit contributions
arted+0.034
ur+0.033
ked+0.032
 stro+0.031
ree+0.031
Neg logit contributions
 thing-0.012
 best-0.012
 things-0.011
 important-0.011
 problem-0.011
Token was
Token ID373
Feature activation-5.83e+00

Pos logit contributions
arted+0.052
ur+0.051
ked+0.048
 stro+0.047
anged+0.047
Neg logit contributions
 best-0.037
 thing-0.036
 world-0.035
 things-0.034
 lot-0.034
Token walking
Token ID6155
Feature activation+2.13e+00

Pos logit contributions
 best+0.253
 world+0.230
 thing+0.230
 things+0.226
 important+0.226
Neg logit contributions
arted-0.340
ur-0.331
ked-0.318
anged-0.309
 stro-0.309
Token in
Token ID287
Feature activation+1.46e+00

Pos logit contributions
arted+0.120
ur+0.114
 stro+0.112
olded+0.110
ked+0.110
Neg logit contributions
 thing-0.097
 best-0.095
 things-0.093
 day-0.091
 world-0.090
Token.
Token ID13
Feature activation-3.57e+00

Pos logit contributions
 best+0.038
 lot+0.036
 important+0.036
 biggest+0.035
 things+0.035
Neg logit contributions
arted-0.046
ur-0.046
ked-0.044
 stro-0.042
anged-0.042
Token He
Token ID679
Feature activation+1.38e+00

Pos logit contributions
 world+0.166
 thing+0.164
 best+0.162
 biggest+0.160
 things+0.160
Neg logit contributions
ur-0.187
 stro-0.179
ked-0.175
arted-0.170
anged-0.169
Token shouted
Token ID17293
Feature activation+1.61e+01

Pos logit contributions
arted+0.082
ur+0.080
ked+0.077
anged+0.075
oured+0.074
Neg logit contributions
 best-0.054
 thing-0.052
 lot-0.051
 important-0.051
 things-0.051
Token with
Token ID351
Feature activation+6.35e+00

Pos logit contributions
arted+0.957
anged+0.891
 stro+0.867
ur+0.865
oured+0.851
Neg logit contributions
 thing-0.679
 best-0.675
 things-0.645
 lot-0.610
 important-0.603
Token g
Token ID308
Feature activation+7.43e+00

Pos logit contributions
arted+0.339
ur+0.332
ked+0.312
 stro+0.309
anged+0.303
Neg logit contributions
 best-0.291
 thing-0.283
 world-0.275
 things-0.272
 biggest-0.268
Tokenlee
Token ID7197
Feature activation+1.89e+01

Pos logit contributions
 stro+0.201
ked+0.180
ur+0.178
arted+0.173
oured+0.163
Neg logit contributions
 best-0.505
 important-0.496
 things-0.492
 thing-0.484
 lot-0.472
Token,
Token ID11
Feature activation+2.36e+00

Pos logit contributions
arted+1.041
ur+1.013
 stro+0.958
anged+0.949
ked+0.946
Neg logit contributions
 best-0.833
 thing-0.791
 things-0.770
 world-0.758
 important-0.747
Token "
Token ID366
Feature activation-1.57e+00

Pos logit contributions
arted+0.138
ur+0.138
anged+0.134
oured+0.127
ked+0.125
Neg logit contributions
 best-0.093
 thing-0.090
 lot-0.088
 things-0.088
 world-0.087
TokenLook
Token ID8567
Feature activation+2.48e+00

Pos logit contributions
 best+0.064
 thing+0.064
 world+0.062
 things+0.060
 biggest+0.060
Neg logit contributions
ur-0.090
arted-0.090
 stro-0.084
ked-0.084
anged-0.082
Token,
Token ID11
Feature activation+2.32e+00

Pos logit contributions
arted+0.147
 stro+0.139
ur+0.136
ked+0.134
anged+0.133
Neg logit contributions
 best-0.107
 thing-0.102
 things-0.100
 biggest-0.093
 lot-0.091
Token Molly
Token ID30236
Feature activation+8.54e+00

Pos logit contributions
arted+0.075
ur+0.073
ked+0.065
 stro+0.064
anged+0.062
Neg logit contributions
 best-0.154
 thing-0.151
 world-0.148
 things-0.147
 biggest-0.146
Token not
Token ID407
Feature activation+1.32e+00

Pos logit contributions
 best+0.190
 important+0.169
 thing+0.160
 biggest+0.158
 things+0.156
Neg logit contributions
arted-0.344
ur-0.340
anged-0.332
ked-0.325
oured-0.325
Token pleased
Token ID10607
Feature activation+4.14e+00

Pos logit contributions
arted+0.083
ur+0.082
ked+0.078
 stro+0.077
anged+0.077
Neg logit contributions
 best-0.049
 thing-0.046
 things-0.045
 world-0.044
 important-0.044
Token.
Token ID13
Feature activation-3.39e+00

Pos logit contributions
arted+0.240
ur+0.235
ked+0.222
 stro+0.220
anged+0.216
Neg logit contributions
 best-0.165
 biggest-0.157
 thing-0.156
 world-0.154
 lot-0.154
Token Anna
Token ID11735
Feature activation+3.64e+00

Pos logit contributions
 best+0.152
 world+0.150
 things+0.148
 thing+0.148
 lot+0.145
Neg logit contributions
ur-0.181
arted-0.178
ked-0.174
anged-0.169
oured-0.167
Token sc
Token ID629
Feature activation+1.08e+01

Pos logit contributions
arted+0.208
ur+0.201
ree+0.185
anged+0.185
ked+0.182
Neg logit contributions
 best-0.157
 thing-0.148
 things-0.143
 world-0.142
 lot-0.139
Tokenolded
Token ID48959
Feature activation+1.93e+01

Pos logit contributions
arted+0.271
 stro+0.235
ked+0.214
oured+0.202
ur+0.200
Neg logit contributions
 best-0.746
 important-0.688
 biggest-0.687
 lot-0.678
 things-0.666
Token the
Token ID262
Feature activation+9.73e-01

Pos logit contributions
arted+0.959
ur+0.935
ked+0.878
 stro+0.867
anged+0.849
Neg logit contributions
 best-0.953
 thing-0.928
 world-0.904
 things-0.891
 biggest-0.889
Token goat
Token ID26917
Feature activation+4.05e+00

Pos logit contributions
arted+0.045
ked+0.042
ur+0.042
 stro+0.042
ree+0.038
Neg logit contributions
 best-0.048
 thing-0.047
 biggest-0.046
 world-0.045
 things-0.045
Token.
Token ID13
Feature activation-3.50e+00

Pos logit contributions
arted+0.199
ur+0.193
ked+0.181
 stro+0.179
anged+0.178
Neg logit contributions
 best-0.204
 thing-0.198
 things-0.192
 world-0.191
 important-0.188
Token Anna
Token ID11735
Feature activation+3.52e+00

Pos logit contributions
 world+0.145
 lot+0.144
 best+0.143
 things+0.142
 thing+0.141
Neg logit contributions
ur-0.196
arted-0.193
oured-0.187
anged-0.186
ked-0.186
Token said
Token ID531
Feature activation+1.10e+01

Pos logit contributions
arted+0.186
ur+0.181
ked+0.165
anged+0.164
ree+0.164
Neg logit contributions
 best-0.168
 thing-0.160
 things-0.156
 world-0.153
 lot-0.152
Token Suddenly
Token ID24975
Feature activation+6.93e+00

Pos logit contributions
 best+0.187
 world+0.185
 lot+0.178
 thing+0.176
 things+0.174
Neg logit contributions
arted-0.211
ked-0.210
ur-0.201
anged-0.197
oured-0.190
Token,
Token ID11
Feature activation+2.23e+00

Pos logit contributions
arted+0.345
ur+0.344
ked+0.318
 stro+0.318
anged+0.310
Neg logit contributions
 best-0.330
 thing-0.322
 world-0.318
 biggest-0.311
 important-0.307
Token the
Token ID262
Feature activation+6.95e-01

Pos logit contributions
arted+0.119
ur+0.117
 stro+0.110
ked+0.109
anged+0.106
Neg logit contributions
 best-0.099
 thing-0.096
 world-0.095
 biggest-0.092
 important-0.092
Token mouse
Token ID10211
Feature activation-7.20e+00

Pos logit contributions
arted+0.040
ur+0.040
 stro+0.039
ked+0.038
oured+0.037
Neg logit contributions
 best-0.022
 important-0.021
 thing-0.021
 lot-0.020
 biggest-0.020
Token sc
Token ID629
Feature activation+1.06e+01

Pos logit contributions
 best+0.350
 thing+0.342
 world+0.331
 things+0.330
 biggest+0.326
Neg logit contributions
arted-0.361
ur-0.353
ked-0.331
 stro-0.328
anged-0.322
Tokenur
Token ID333
Feature activation+2.41e+01

Pos logit contributions
arted+0.141
ked+0.116
 stro+0.113
ior+0.078
ree+0.065
Neg logit contributions
 best-0.810
 biggest-0.771
 lot-0.761
 problem-0.758
 important-0.741
Tokenried
Token ID2228
Feature activation+1.87e+01

Pos logit contributions
ur+0.364
arted+0.354
 stro+0.311
anged+0.254
ked+0.253
Neg logit contributions
 best-1.974
 thing-1.942
 things-1.914
 world-1.908
 biggest-1.873
Token over
Token ID625
Feature activation+1.54e+01

Pos logit contributions
anged+0.858
ur+0.853
arted+0.851
 stro+0.835
ree+0.789
Neg logit contributions
 thing-0.958
 best-0.948
 things-0.928
 world-0.901
 lot-0.892
Token to
Token ID284
Feature activation+6.33e+00

Pos logit contributions
arted+0.774
ur+0.758
ked+0.745
anged+0.729
 stro+0.713
Neg logit contributions
 thing-0.748
 things-0.739
 best-0.734
 world-0.726
 lot-0.708
Token the
Token ID262
Feature activation+1.12e+00

Pos logit contributions
arted+0.346
anged+0.331
ur+0.330
ked+0.326
oured+0.316
Neg logit contributions
 thing-0.287
 best-0.278
 things-0.276
 lot-0.272
 world-0.270
Token edge
Token ID5743
Feature activation+1.34e+00

Pos logit contributions
arted+0.065
ur+0.063
 stro+0.060
ked+0.059
oured+0.059
Neg logit contributions
 best-0.038
 thing-0.037
iest-0.037
 things-0.035
 important-0.035
Token could
Token ID714
Feature activation-5.27e+00

Pos logit contributions
 best+0.424
 thing+0.417
 world+0.408
 things+0.401
 biggest+0.399
Neg logit contributions
ur-0.491
arted-0.489
 stro-0.463
ked-0.452
anged-0.441
Token get
Token ID651
Feature activation-2.01e-01

Pos logit contributions
 best+0.235
 thing+0.227
 world+0.220
 things+0.219
 lot+0.215
Neg logit contributions
arted-0.292
ur-0.282
ked-0.271
anged-0.263
 stro-0.262
Token better
Token ID1365
Feature activation-9.47e+00

Pos logit contributions
 best+0.009
 thing+0.009
 world+0.009
 lot+0.008
 things+0.008
Neg logit contributions
arted-0.011
ur-0.011
 stro-0.010
ked-0.010
anged-0.010
Token.
Token ID13
Feature activation-3.11e+00

Pos logit contributions
 best+0.416
 thing+0.405
 world+0.394
 biggest+0.391
 lot+0.386
Neg logit contributions
arted-0.506
ur-0.500
 stro-0.472
ked-0.468
anged-0.463
Token' '
Token ID220
Feature activation+3.67e+00

Pos logit contributions
 world+0.130
 best+0.130
 lot+0.126
 day+0.121
 things+0.121
Neg logit contributions
ked-0.180
arted-0.173
ur-0.166
anged-0.165
oured-0.158
Token'\n'
Token ID198
Feature activation+6.09e+00

Pos logit contributions
arted+0.217
ked+0.214
ur+0.209
anged+0.204
 stro+0.204
Neg logit contributions
 best-0.150
 world-0.140
 lot-0.136
 things-0.135
 important-0.131
Token'\n'
Token ID198
Feature activation+6.13e+00

Pos logit contributions
ked+0.360
arted+0.358
ur+0.347
anged+0.343
oured+0.333
Neg logit contributions
 best-0.251
 world-0.248
 lot-0.244
 important-0.233
 thing-0.232
TokenEmily
Token ID48640
Feature activation+7.26e+00

Pos logit contributions
arted+0.342
ur+0.333
ked+0.320
anged+0.312
 stro+0.309
Neg logit contributions
 best-0.267
 thing-0.260
 world-0.254
 things-0.249
 lot-0.248
Token was
Token ID373
Feature activation-4.81e+00

Pos logit contributions
arted+0.437
ur+0.421
ked+0.406
anged+0.397
ree+0.388
Neg logit contributions
 best-0.285
 thing-0.272
 world-0.263
 important-0.263
 things-0.260
Token really
Token ID1107
Feature activation-6.96e+00

Pos logit contributions
 best+0.180
 important+0.162
 thing+0.161
 world+0.155
 biggest+0.154
Neg logit contributions
arted-0.307
ur-0.300
ked-0.295
anged-0.290
ree-0.282
Token sad
Token ID6507
Feature activation-2.53e+00

Pos logit contributions
 best+0.290
 thing+0.284
 world+0.275
 things+0.273
 biggest+0.266
Neg logit contributions
arted-0.395
ur-0.389
 stro-0.366
ked-0.365
anged-0.356
Token't
Token ID470
Feature activation-1.98e+00

Pos logit contributions
arted+0.179
ur+0.178
 stro+0.158
ked+0.156
ree+0.155
Neg logit contributions
 best-0.184
 thing-0.183
 things-0.176
 world-0.176
 biggest-0.175
Token keep
Token ID1394
Feature activation+7.13e+00

Pos logit contributions
 best+0.084
 thing+0.082
 world+0.079
 things+0.078
 lot+0.076
Neg logit contributions
arted-0.113
ur-0.110
ked-0.105
 stro-0.103
anged-0.102
Token going
Token ID1016
Feature activation-3.05e+00

Pos logit contributions
arted+0.372
ur+0.367
ked+0.344
 stro+0.340
anged+0.336
Neg logit contributions
 best-0.339
 thing-0.326
 things-0.319
 world-0.315
 biggest-0.310
Token.
Token ID13
Feature activation-2.77e+00

Pos logit contributions
 best+0.124
 thing+0.122
 world+0.117
 biggest+0.115
 things+0.114
Neg logit contributions
arted-0.175
ur-0.171
ked-0.159
 stro-0.156
anged-0.156
Token'\n'
Token ID198
Feature activation+6.40e+00

Pos logit contributions
 best+0.125
 world+0.122
 lot+0.118
 things+0.115
 thing+0.115
Neg logit contributions
ked-0.148
arted-0.148
ur-0.146
 stro-0.139
anged-0.137
Token'\n'
Token ID198
Feature activation+6.40e+00

Pos logit contributions
arted+0.363
ked+0.358
ur+0.352
anged+0.337
oured+0.335
Neg logit contributions
 best-0.276
 world-0.270
 lot-0.268
 thing-0.262
 important-0.261
TokenThat
Token ID2504
Feature activation-1.03e+01

Pos logit contributions
arted+0.336
ur+0.329
ked+0.309
 stro+0.306
anged+0.301
Neg logit contributions
 best-0.299
 thing-0.291
 world-0.282
 things-0.280
 biggest-0.276
Token's
Token ID338
Feature activation-7.62e+00

Pos logit contributions
 biggest+0.301
 best+0.286
 lot+0.277
 things+0.267
 thing+0.260
Neg logit contributions
arted-0.643
ur-0.588
anged-0.584
 stro-0.574
 Joe-0.561
Token when
Token ID618
Feature activation+3.35e+00

Pos logit contributions
 best+0.415
 thing+0.406
 world+0.396
 things+0.393
 biggest+0.388
Neg logit contributions
arted-0.344
ur-0.336
ked-0.311
 stro-0.307
anged-0.302
Token the
Token ID262
Feature activation+1.72e+00

Pos logit contributions
arted+0.144
ur+0.142
 stro+0.131
ked+0.130
anged+0.129
Neg logit contributions
 best-0.173
 thing-0.172
 world-0.171
 biggest-0.169
 important-0.167
Token hardest
Token ID17612
Feature activation-5.05e+00

Pos logit contributions
arted+0.105
ur+0.103
 stro+0.099
ked+0.095
anged+0.095
Neg logit contributions
 best-0.051
 thing-0.050
 biggest-0.048
 important-0.048
 world-0.047
Token watch
Token ID2342
Feature activation+2.12e+00

Pos logit contributions
arted+0.249
ur+0.244
ked+0.228
 stro+0.226
anged+0.222
Neg logit contributions
 best-0.242
 thing-0.236
 world-0.229
 things-0.228
 biggest-0.224
Token the
Token ID262
Feature activation-3.23e-01

Pos logit contributions
arted+0.102
ur+0.099
ked+0.093
 stro+0.092
anged+0.089
Neg logit contributions
 best-0.105
 thing-0.102
 world-0.099
 important-0.097
 biggest-0.097
Token birds
Token ID10087
Feature activation-2.78e+00

Pos logit contributions
 thing+0.010
 best+0.010
iest+0.010
 important+0.009
 problem+0.009
Neg logit contributions
arted-0.019
ur-0.017
ked-0.016
 stro-0.016
 Sammy-0.016
Token who
Token ID508
Feature activation-1.80e+00

Pos logit contributions
 best+0.120
 thing+0.118
 important+0.112
 world+0.112
 things+0.111
Neg logit contributions
arted-0.153
ur-0.151
ked-0.142
 stro-0.142
anged-0.137
Token flew
Token ID13112
Feature activation+2.53e+00

Pos logit contributions
 best+0.079
 thing+0.076
 biggest+0.073
 world+0.073
 important+0.073
Neg logit contributions
arted-0.099
ur-0.097
 stro-0.092
ked-0.091
anged-0.090
Token around
Token ID1088
Feature activation+6.43e+00

Pos logit contributions
arted+0.113
ur+0.110
ked+0.102
 stro+0.100
anged+0.097
Neg logit contributions
 best-0.138
 thing-0.135
 world-0.131
 things-0.130
 biggest-0.129
Token.
Token ID13
Feature activation-4.94e+00

Pos logit contributions
arted+0.336
ur+0.328
ked+0.309
 stro+0.306
anged+0.299
Neg logit contributions
 best-0.297
 thing-0.285
 biggest-0.276
 world-0.275
 important-0.274
Token'\n'
Token ID198
Feature activation+4.20e+00

Pos logit contributions
 best+0.236
 world+0.231
 thing+0.228
 lot+0.223
 things+0.222
Neg logit contributions
arted-0.262
ur-0.253
ked-0.251
 stro-0.236
anged-0.233
Token'\n'
Token ID198
Feature activation+4.17e+00

Pos logit contributions
arted+0.208
ur+0.203
ked+0.194
 stro+0.188
anged+0.186
Neg logit contributions
 best-0.210
 thing-0.205
 world-0.202
 things-0.198
 lot-0.198
TokenOne
Token ID3198
Feature activation-9.17e+00

Pos logit contributions
arted+0.229
ur+0.226
ree+0.213
 stro+0.212
ked+0.203
Neg logit contributions
 best-0.174
 thing-0.164
 biggest-0.162
 things-0.160
 problem-0.154
Token day
Token ID1110
Feature activation-1.52e+01

Pos logit contributions
 best+0.141
 biggest+0.136
 things+0.123
 world+0.111
 thing+0.107
Neg logit contributions
ur-0.734
arted-0.715
 stro-0.680
ree-0.678
anged-0.675
Token interesting
Token ID3499
Feature activation-4.34e+00

Pos logit contributions
iest+0.177
 world+0.168
 problem+0.164
 thing+0.163
 puzzle+0.159
Neg logit contributions
ur-0.260
arted-0.257
 Sammy-0.251
 stro-0.244
 Joe-0.241
Token so
Token ID523
Feature activation-1.56e+00

Pos logit contributions
 lot+0.231
 best+0.231
 biggest+0.228
 world+0.227
 thing+0.224
Neg logit contributions
ur-0.171
arted-0.168
 stro-0.149
ked-0.142
anged-0.140
Token she
Token ID673
Feature activation-9.12e+00

Pos logit contributions
 world+0.060
 best+0.057
 thing+0.057
 problem+0.056
 biggest+0.054
Neg logit contributions
ur-0.088
arted-0.086
 stro-0.081
 Joe-0.077
 Sammy-0.076
Token picked
Token ID6497
Feature activation+1.09e+01

Pos logit contributions
 best+0.418
 thing+0.412
 important+0.407
 things+0.394
 lot+0.383
Neg logit contributions
arted-0.469
ur-0.445
anged-0.437
ree-0.421
ked-0.420
Token it
Token ID340
Feature activation-7.70e+00

Pos logit contributions
arted+0.619
ur+0.601
ked+0.571
 stro+0.563
oured+0.550
Neg logit contributions
 best-0.445
 thing-0.426
 world-0.422
 important-0.410
 lot-0.408
Token up
Token ID510
Feature activation+7.54e+00

Pos logit contributions
 best+0.382
 thing+0.372
 things+0.362
 world+0.354
 biggest+0.352
Neg logit contributions
arted-0.394
ur-0.384
ked-0.363
 stro-0.361
anged-0.357
Token.
Token ID13
Feature activation-3.12e+00

Pos logit contributions
arted+0.494
ur+0.477
anged+0.465
ked+0.460
oured+0.451
Neg logit contributions
 best-0.284
 thing-0.271
 things-0.264
 important-0.249
 lot-0.244
Token' '
Token ID220
Feature activation+3.64e+00

Pos logit contributions
 best+0.129
 things+0.128
 thing+0.127
 world+0.127
 lot+0.125
Neg logit contributions
arted-0.179
ked-0.175
ur-0.175
anged-0.169
oured-0.165
Token'\n'
Token ID198
Feature activation+6.02e+00

Pos logit contributions
arted+0.204
ur+0.199
ked+0.196
 stro+0.191
anged+0.189
Neg logit contributions
 best-0.164
 world-0.154
 things-0.151
 thing-0.151
 lot-0.149
Token'\n'
Token ID198
Feature activation+6.01e+00

Pos logit contributions
arted+0.382
ked+0.378
ur+0.368
anged+0.365
oured+0.354
Neg logit contributions
 lot-0.227
 best-0.225
 world-0.222
 things-0.212
 thing-0.211
TokenWhen
Token ID2215
Feature activation-1.61e+00

Pos logit contributions
arted+0.302
ur+0.296
ked+0.276
 stro+0.274
anged+0.268
Neg logit contributions
 best-0.296
 thing-0.288
 world-0.280
 things-0.278
 biggest-0.275
Token to
Token ID284
Feature activation+6.14e+00

Pos logit contributions
 best+0.058
 things+0.056
 thing+0.056
 important+0.054
 lot+0.053
Neg logit contributions
arted-0.052
ur-0.049
ked-0.047
anged-0.045
oured-0.045
Token help
Token ID1037
Feature activation+2.16e+00

Pos logit contributions
arted+0.393
ur+0.370
ked+0.368
anged+0.358
oured+0.348
Neg logit contributions
 thing-0.232
 best-0.228
 things-0.220
 lot-0.217
 world-0.211
Token him
Token ID683
Feature activation+8.70e+00

Pos logit contributions
arted+0.127
ur+0.120
ked+0.118
oured+0.117
anged+0.114
Neg logit contributions
 thing-0.096
 best-0.095
 things-0.095
 world-0.090
 lot-0.087
Token.
Token ID13
Feature activation-3.61e+00

Pos logit contributions
arted+0.589
ur+0.562
ked+0.551
anged+0.536
 stro+0.536
Neg logit contributions
 best-0.299
 thing-0.284
 things-0.274
 world-0.266
 biggest-0.261
Token She
Token ID1375
Feature activation-3.07e+00

Pos logit contributions
 things+0.126
 world+0.118
 thing+0.117
 best+0.116
 is+0.115
Neg logit contributions
arted-0.215
ked-0.214
ur-0.211
anged-0.206
oured-0.204
Token asked
Token ID1965
Feature activation+1.32e+01

Pos logit contributions
 best+0.134
 important+0.132
 lot+0.127
 thing+0.127
 things+0.124
Neg logit contributions
arted-0.163
ur-0.150
ree-0.150
anged-0.148
oured-0.143
Token her
Token ID607
Feature activation+2.70e+00

Pos logit contributions
arted+0.718
oured+0.669
ur+0.656
anged+0.651
ked+0.649
Neg logit contributions
 things-0.673
 best-0.660
 thing-0.654
 other-0.597
 good-0.594
Token mom
Token ID1995
Feature activation+4.01e+00

Pos logit contributions
arted+0.172
ur+0.162
ked+0.158
anged+0.155
oured+0.154
Neg logit contributions
 best-0.111
 thing-0.105
 things-0.102
 important-0.098
 biggest-0.096
Token if
Token ID611
Feature activation+7.55e+00

Pos logit contributions
arted+0.191
ur+0.183
ked+0.174
 stro+0.172
anged+0.168
Neg logit contributions
 best-0.214
 thing-0.206
 things-0.202
 world-0.200
 important-0.196
Token she
Token ID673
Feature activation-9.80e+00

Pos logit contributions
ur+0.433
arted+0.432
 stro+0.412
ked+0.409
anged+0.394
Neg logit contributions
 best-0.295
 thing-0.288
 world-0.288
 biggest-0.278
 lot-0.277
Token could
Token ID714
Feature activation-5.78e+00

Pos logit contributions
 best+0.380
 thing+0.368
 world+0.353
 things+0.351
 biggest+0.344
Neg logit contributions
arted-0.592
ur-0.580
ked-0.550
 stro-0.544
anged-0.538
Token dad
Token ID9955
Feature activation+6.29e+00

Pos logit contributions
iest+0.461
 thing+0.453
 best+0.437
 world+0.435
 biggest+0.434
Neg logit contributions
arted-0.395
ur-0.384
 stro-0.379
anged-0.366
oured-0.363
Token and
Token ID290
Feature activation+4.27e+00

Pos logit contributions
arted+0.339
ur+0.321
ree+0.304
ked+0.302
anged+0.296
Neg logit contributions
 best-0.307
 thing-0.283
 important-0.276
 things-0.276
 world-0.276
Token little
Token ID1310
Feature activation-2.94e+00

Pos logit contributions
arted+0.216
ur+0.212
ked+0.196
oured+0.193
ree+0.191
Neg logit contributions
 best-0.220
 thing-0.216
 things-0.213
 important-0.202
 world-0.200
Token brother
Token ID3956
Feature activation-5.29e-01

Pos logit contributions
 best+0.110
 thing+0.110
 world+0.109
 lot+0.107
 important+0.107
Neg logit contributions
arted-0.173
ur-0.168
ked-0.163
anged-0.160
 stro-0.159
Token were
Token ID547
Feature activation-7.27e+00

Pos logit contributions
 best+0.026
 thing+0.024
 things+0.024
 world+0.023
 lot+0.023
Neg logit contributions
arted-0.028
ur-0.027
ked-0.025
ree-0.025
anged-0.024
Token happy
Token ID3772
Feature activation-5.12e+00

Pos logit contributions
 best+0.224
 thing+0.201
 world+0.193
 things+0.191
 important+0.190
Neg logit contributions
arted-0.513
ur-0.503
ked-0.484
anged-0.477
 stro-0.476
Token to
Token ID284
Feature activation+5.05e+00

Pos logit contributions
 best+0.278
 thing+0.271
 world+0.265
 things+0.262
 biggest+0.261
Neg logit contributions
arted-0.229
ur-0.224
ked-0.207
 stro-0.206
anged-0.202
Token see
Token ID766
Feature activation+8.63e+00

Pos logit contributions
arted+0.307
ur+0.291
ked+0.283
anged+0.272
oured+0.272
Neg logit contributions
 best-0.208
 thing-0.202
 lot-0.195
 things-0.190
 world-0.189
Token the
Token ID262
Feature activation-2.57e-01

Pos logit contributions
arted+0.379
ur+0.367
 stro+0.341
ked+0.340
anged+0.326
Neg logit contributions
 best-0.463
 thing-0.452
 world-0.448
 biggest-0.437
 important-0.435
Token strawberries
Token ID36973
Feature activation-2.44e+00

Pos logit contributions
 best+0.008
 thing+0.008
iest+0.008
 puzzle+0.008
 universe+0.008
Neg logit contributions
arted-0.014
 stro-0.013
oured-0.012
ur-0.012
ked-0.012
Token.
Token ID13
Feature activation-4.76e+00

Pos logit contributions
 thing+0.109
 world+0.108
 problem+0.107
 best+0.106
 lot+0.106
Neg logit contributions
arted-0.117
ur-0.117
 stro-0.111
ked-0.108
oured-0.107
Token from
Token ID422
Feature activation+1.38e+01

Pos logit contributions
 thing+0.052
 things+0.050
 world+0.048
 best+0.047
 whole+0.045
Neg logit contributions
oured-0.058
 stro-0.055
arted-0.055
anged-0.055
olded-0.053
Token the
Token ID262
Feature activation+1.53e+00

Pos logit contributions
arted+0.684
ur+0.672
ked+0.628
 stro+0.619
anged+0.613
Neg logit contributions
 best-0.679
 thing-0.673
 things-0.650
 world-0.650
 lot-0.631
Token ground
Token ID2323
Feature activation-3.52e+00

Pos logit contributions
arted+0.083
 stro+0.077
ur+0.075
ree+0.074
oured+0.073
Neg logit contributions
iest-0.057
 best-0.056
 important-0.053
 thing-0.050
 game-0.048
Token.
Token ID13
Feature activation-2.93e+00

Pos logit contributions
 best+0.144
 thing+0.135
 biggest+0.132
 things+0.130
 important+0.130
Neg logit contributions
arted-0.207
ur-0.202
ked-0.191
 stro-0.189
anged-0.185
Token He
Token ID679
Feature activation+2.04e+00

Pos logit contributions
 thing+0.102
 things+0.097
 world+0.095
 lot+0.094
 best+0.092
Neg logit contributions
arted-0.190
anged-0.178
ur-0.173
ked-0.172
oured-0.172
Token looked
Token ID3114
Feature activation-3.89e+00

Pos logit contributions
arted+0.134
anged+0.127
oured+0.125
ur+0.124
ked+0.120
Neg logit contributions
 thing-0.064
 lot-0.062
 best-0.061
 rare-0.059
 things-0.059
Token down
Token ID866
Feature activation+8.60e+00

Pos logit contributions
 things+0.169
 best+0.169
 thing+0.167
 important+0.161
 lot+0.160
Neg logit contributions
anged-0.208
arted-0.208
ur-0.205
ree-0.199
ked-0.198
Token and
Token ID290
Feature activation+5.81e+00

Pos logit contributions
ur+0.433
arted+0.429
anged+0.406
ked+0.399
 stro+0.397
Neg logit contributions
 best-0.424
 thing-0.416
 things-0.407
 world-0.401
 lot-0.393
Token saw
Token ID2497
Feature activation+4.87e+00

Pos logit contributions
arted+0.370
ur+0.354
ree+0.351
anged+0.351
amb+0.347
Neg logit contributions
 things-0.191
 pretty-0.183
 thing-0.180
 rare-0.179
 beautiful-0.179
Token a
Token ID257
Feature activation-3.19e+00

Pos logit contributions
arted+0.290
ur+0.284
ked+0.269
 stro+0.266
anged+0.263
Neg logit contributions
 best-0.195
 thing-0.189
 things-0.182
 world-0.181
 biggest-0.178
Token small
Token ID1402
Feature activation-4.04e+00

Pos logit contributions
 best+0.106
 biggest+0.094
 things+0.094
 important+0.092
 thing+0.091
Neg logit contributions
 stro-0.190
ur-0.190
arted-0.186
ree-0.181
ked-0.180
Token End
Token ID5268
Feature activation-1.60e+01

Pos logit contributions
iest+0.096
 biggest+0.090
 world+0.084
 thing+0.083
ooky+0.083
Neg logit contributions
arted-0.183
 stro-0.180
ur-0.178
anged-0.170
 Sammy-0.169
Token.
Token ID13
Feature activation-3.73e+00

Pos logit contributions
 best+0.747
 biggest+0.718
 thing+0.704
 world+0.686
 things+0.677
Neg logit contributions
arted-0.792
ur-0.787
 stro-0.766
ree-0.732
ked-0.722
Token<|endoftext|>
Token ID50256
Feature activation+1.36e+00

Pos logit contributions
 best+0.162
 lot+0.157
 world+0.156
 thing+0.151
 things+0.145
Neg logit contributions
arted-0.210
ur-0.205
ked-0.200
anged-0.197
oured-0.188
TokenOnce
Token ID7454
Feature activation-1.06e+00

Pos logit contributions
arted+0.066
ur+0.064
ked+0.060
 stro+0.059
anged+0.058
Neg logit contributions
 best-0.068
 thing-0.066
 things-0.064
 world-0.064
 biggest-0.063
Token upon
Token ID2402
Feature activation-2.52e+00

Pos logit contributions
 best+0.054
 thing+0.052
 world+0.051
 things+0.050
 biggest+0.050
Neg logit contributions
arted-0.051
ur-0.050
ked-0.047
 stro-0.046
anged-0.045
Token a
Token ID257
Feature activation-3.99e+00

Pos logit contributions
 problem+0.124
 best+0.123
 thing+0.120
 world+0.118
 biggest+0.117
Neg logit contributions
 stro-0.111
ur-0.108
arted-0.103
ked-0.101
ree-0.095
Token time
Token ID640
Feature activation-7.59e+00

Pos logit contributions
 best+0.064
 thing+0.061
 things+0.057
 biggest+0.055
 world+0.054
Neg logit contributions
arted-0.322
ur-0.321
 stro-0.305
ked-0.304
anged-0.302
Token there
Token ID612
Feature activation-6.39e+00

Pos logit contributions
 best+0.336
 thing+0.324
 things+0.313
 world+0.310
 lot+0.304
Neg logit contributions
arted-0.427
ur-0.407
ked-0.395
 stro-0.392
anged-0.382
Token was
Token ID373
Feature activation-5.74e+00

Pos logit contributions
 best+0.222
 biggest+0.212
 important+0.207
 thing+0.206
 world+0.204
Neg logit contributions
arted-0.383
ur-0.380
 stro-0.359
anged-0.356
ked-0.354
Token a
Token ID257
Feature activation-4.09e+00

Pos logit contributions
 thing+0.235
 best+0.234
 world+0.229
 problem+0.226
 things+0.222
Neg logit contributions
arted-0.309
ur-0.306
 stro-0.294
ked-0.287
oured-0.277
Token boy
Token ID2933
Feature activation-9.52e+00

Pos logit contributions
iest+0.139
 best+0.133
 things+0.130
 biggest+0.129
 thing+0.124
Neg logit contributions
 stro-0.226
ur-0.225
ree-0.217
arted-0.213
ked-0.205
Token\n
Token ID198
Feature activation+5.66e+00

Pos logit contributions
arted+0.399
ur+0.385
ked+0.379
anged+0.378
oured+0.372
Neg logit contributions
 lot-0.193
 world-0.176
 thing-0.172
 things-0.171
 other-0.171
TokenThe
Token ID464
Feature activation-3.10e+00

Pos logit contributions
arted+0.359
ur+0.352
anged+0.333
ked+0.330
oured+0.327
Neg logit contributions
 thing-0.204
 best-0.199
 lot-0.195
 things-0.194
 world-0.193
Token man
Token ID582
Feature activation-7.11e-01

Pos logit contributions
 lot+0.113
 thing+0.113
 best+0.110
 world+0.104
 things+0.103
Neg logit contributions
arted-0.170
ur-0.164
ked-0.161
 stro-0.157
oured-0.153
Token smiled
Token ID13541
Feature activation+7.80e+00

Pos logit contributions
 best+0.033
 thing+0.032
 things+0.030
 game+0.029
 lot+0.029
Neg logit contributions
arted-0.038
ur-0.036
ree-0.035
anged-0.033
oured-0.032
Token and
Token ID290
Feature activation+5.25e+00

Pos logit contributions
arted+0.369
ur+0.362
ked+0.332
 stro+0.331
anged+0.330
Neg logit contributions
 best-0.411
 thing-0.400
 world-0.388
 things-0.387
 important-0.379
Token came
Token ID1625
Feature activation+2.14e+00

Pos logit contributions
arted+0.281
ur+0.263
ree+0.261
anged+0.259
pered+0.240
Neg logit contributions
 best-0.247
 thing-0.231
 things-0.227
 rare-0.223
 pretty-0.221
Token closer
Token ID5699
Feature activation+3.79e+00

Pos logit contributions
 stro+0.112
oured+0.111
arted+0.108
anged+0.108
olded+0.107
Neg logit contributions
 best-0.099
 things-0.095
 world-0.093
 thing-0.090
 biggest-0.085
Token.
Token ID13
Feature activation-3.61e+00

Pos logit contributions
arted+0.243
ur+0.237
ked+0.225
anged+0.223
 stro+0.223
Neg logit contributions
 best-0.135
 thing-0.134
 things-0.129
 world-0.127
 important-0.120
Token He
Token ID679
Feature activation+1.34e+00

Pos logit contributions
 thing+0.158
 best+0.158
 lot+0.157
 things+0.153
 important+0.150
Neg logit contributions
ur-0.191
arted-0.191
anged-0.181
ked-0.179
 stro-0.176
Token said
Token ID531
Feature activation+1.09e+01

Pos logit contributions
arted+0.076
ur+0.069
anged+0.067
ree+0.066
ked+0.065
Neg logit contributions
 best-0.059
 rare-0.055
 thing-0.055
 important-0.055
 special-0.054
Token "
Token ID366
Feature activation-1.54e+00

Pos logit contributions
arted+0.658
anged+0.600
ur+0.596
oured+0.583
ked+0.578
Neg logit contributions
 best-0.463
 thing-0.452
 things-0.438
 lot-0.417
 important-0.412
Token inside
Token ID2641
Feature activation+5.68e+00

Pos logit contributions
 best+0.142
 things+0.131
 thing+0.130
 biggest+0.121
 world+0.121
Neg logit contributions
arted-0.186
ur-0.182
ked-0.180
 stro-0.176
ree-0.176
Token,
Token ID11
Feature activation+2.12e+00

Pos logit contributions
arted+0.303
ur+0.296
ked+0.280
 stro+0.276
anged+0.271
Neg logit contributions
 best-0.262
 thing-0.256
 world-0.247
 things-0.247
 lot-0.240
Token he
Token ID339
Feature activation-5.49e+00

Pos logit contributions
arted+0.113
ur+0.111
ked+0.104
 stro+0.104
anged+0.102
Neg logit contributions
 best-0.095
 thing-0.092
 world-0.091
 things-0.088
 biggest-0.088
Token found
Token ID1043
Feature activation+4.09e+00

Pos logit contributions
 best+0.232
 thing+0.213
 lot+0.211
 things+0.211
 world+0.211
Neg logit contributions
arted-0.323
ur-0.307
ked-0.291
ree-0.286
anged-0.281
Token his
Token ID465
Feature activation+6.74e+00

Pos logit contributions
arted+0.232
ur+0.228
ked+0.215
 stro+0.213
anged+0.209
Neg logit contributions
 best-0.172
 thing-0.167
 world-0.163
 things-0.160
 important-0.158
Token mom
Token ID1995
Feature activation+3.67e+00

Pos logit contributions
arted+0.310
ur+0.302
oured+0.294
anged+0.293
 stro+0.290
Neg logit contributions
 world-0.273
 thing-0.267
 important-0.267
iest-0.264
 lot-0.264
Token.
Token ID13
Feature activation-4.07e+00

Pos logit contributions
arted+0.209
ur+0.205
ked+0.194
 stro+0.192
anged+0.189
Neg logit contributions
 best-0.155
 thing-0.150
 world-0.145
 things-0.144
 biggest-0.141
Token She
Token ID1375
Feature activation-3.51e+00

Pos logit contributions
 thing+0.183
 world+0.177
 lot+0.177
 things+0.174
 best+0.172
Neg logit contributions
arted-0.216
ked-0.211
ur-0.208
anged-0.197
oured-0.194
Token was
Token ID373
Feature activation-5.95e+00

Pos logit contributions
 best+0.130
 thing+0.120
 lot+0.118
 important+0.118
 things+0.117
Neg logit contributions
arted-0.226
ur-0.214
ree-0.205
ked-0.205
anged-0.200
Token feeling
Token ID4203
Feature activation-1.84e+00

Pos logit contributions
 best+0.211
 important+0.182
 thing+0.181
 biggest+0.176
 world+0.176
Neg logit contributions
arted-0.401
ur-0.389
anged-0.383
ked-0.380
ree-0.376
Token jealous
Token ID19354
Feature activation-2.29e+00

Pos logit contributions
 thing+0.069
 world+0.067
 game+0.065
 best+0.065
 things+0.064
Neg logit contributions
ur-0.105
 stro-0.104
arted-0.104
ked-0.096
oured-0.096
Token,
Token ID11
Feature activation+2.70e+00

Pos logit contributions
 best+0.446
 thing+0.440
 world+0.429
 lot+0.424
 things+0.424
Neg logit contributions
arted-0.459
ur-0.451
ked-0.421
 stro-0.408
anged-0.406
Token and
Token ID290
Feature activation+5.41e+00

Pos logit contributions
arted+0.123
ur+0.120
ked+0.111
 stro+0.110
anged+0.108
Neg logit contributions
 best-0.145
 thing-0.142
 world-0.138
 things-0.137
 biggest-0.135
Token he
Token ID339
Feature activation-4.91e+00

Pos logit contributions
arted+0.284
ur+0.277
ked+0.260
 stro+0.259
anged+0.253
Neg logit contributions
 best-0.242
 thing-0.237
 world-0.232
 things-0.226
 biggest-0.226
Token hoped
Token ID10719
Feature activation+6.58e+00

Pos logit contributions
 best+0.185
 thing+0.177
 world+0.171
 things+0.168
 important+0.166
Neg logit contributions
arted-0.304
ur-0.298
ked-0.283
anged-0.278
 stro-0.278
Token he
Token ID339
Feature activation-4.97e+00

Pos logit contributions
arted+0.355
ur+0.351
ked+0.324
 stro+0.324
anged+0.319
Neg logit contributions
 best-0.288
 thing-0.280
 world-0.274
 biggest-0.268
 things-0.266
Token would
Token ID561
Feature activation-1.12e+01

Pos logit contributions
 best+0.159
 thing+0.155
 things+0.147
 world+0.146
 biggest+0.145
Neg logit contributions
ur-0.326
arted-0.326
ked-0.309
 stro-0.309
anged-0.299
Token always
Token ID1464
Feature activation-3.28e+00

Pos logit contributions
 best+0.524
 thing+0.512
 things+0.494
 world+0.493
 biggest+0.485
Neg logit contributions
arted-0.581
ur-0.575
ked-0.534
 stro-0.533
anged-0.519
Token receive
Token ID3328
Feature activation+5.99e+00

Pos logit contributions
 thing+0.166
 best+0.165
 world+0.159
 things+0.159
 biggest+0.156
Neg logit contributions
ur-0.149
arted-0.148
 stro-0.139
ked-0.136
anged-0.133
Token special
Token ID2041
Feature activation-9.14e+00

Pos logit contributions
arted+0.329
ur+0.322
ked+0.304
 stro+0.303
anged+0.296
Neg logit contributions
 best-0.258
 thing-0.250
 world-0.244
 things-0.238
 biggest-0.237
Token things
Token ID1243
Feature activation-1.80e+01

Pos logit contributions
 best+0.269
 world+0.249
 thing+0.246
 biggest+0.241
 important+0.236
Neg logit contributions
arted-0.611
ur-0.606
 stro-0.586
ked-0.579
anged-0.575
Token like
Token ID588
Feature activation+4.49e+00

Pos logit contributions
 best+0.810
 thing+0.799
 world+0.778
 lot+0.764
 things+0.760
Neg logit contributions
ur-0.937
arted-0.934
ked-0.868
 stro-0.865
oured-0.840
Token "
Token ID366
Feature activation-8.72e-01

Pos logit contributions
arted+0.173
ur+0.165
anged+0.165
oured+0.163
ree+0.156
Neg logit contributions
 best-0.123
 thing-0.121
 things-0.117
 lot-0.115
 world-0.112
TokenMom
Token ID29252
Feature activation+1.49e+00

Pos logit contributions
 thing+0.035
 world+0.034
 best+0.034
 things+0.033
 whole+0.031
Neg logit contributions
ur-0.051
arted-0.050
anged-0.047
oured-0.047
ked-0.046
Tokenmy
Token ID1820
Feature activation+5.30e+00

Pos logit contributions
arted+0.073
ur+0.070
ked+0.067
 stro+0.066
anged+0.064
Neg logit contributions
 best-0.077
 thing-0.073
 world-0.072
 things-0.071
 biggest-0.070
Token,
Token ID11
Feature activation+3.09e+00

Pos logit contributions
arted+0.284
ur+0.277
ked+0.261
 stro+0.258
anged+0.254
Neg logit contributions
 best-0.242
 thing-0.235
 world-0.227
 things-0.226
 biggest-0.221
Token where
Token ID810
Feature activation+2.54e+00

Pos logit contributions
ur+0.156
arted+0.146
 stro+0.138
ked+0.137
oured+0.133
Neg logit contributions
 best-0.143
 world-0.140
 important-0.135
 biggest-0.134
 thing-0.134
Token are
Token ID389
Feature activation-9.78e+00

Pos logit contributions
ur+0.137
arted+0.129
 stro+0.128
olded+0.125
 Joe+0.123
Neg logit contributions
 world-0.091
 lot-0.087
 biggest-0.087
 thing-0.085
 best-0.085
Token you
Token ID345
Feature activation-7.37e+00

Pos logit contributions
 world+0.352
 thing+0.328
 best+0.314
 lot+0.309
 biggest+0.293
Neg logit contributions
ur-0.594
arted-0.584
ked-0.535
 stro-0.529
olded-0.515
Token?
Token ID30
Feature activation+1.61e+00

Pos logit contributions
 thing+0.327
 best+0.326
 world+0.313
 things+0.313
 lot+0.311
Neg logit contributions
arted-0.387
ur-0.385
ked-0.355
 stro-0.350
oured-0.346
Token Come
Token ID7911
Feature activation+6.39e+00

Pos logit contributions
arted+0.105
ur+0.099
ked+0.097
anged+0.094
oured+0.093
Neg logit contributions
 thing-0.054
 best-0.054
 world-0.051
 things-0.050
 biggest-0.049
Token back
Token ID736
Feature activation+4.05e+00

Pos logit contributions
arted+0.395
anged+0.366
ur+0.364
 stro+0.359
ked+0.357
Neg logit contributions
 best-0.267
 thing-0.259
 world-0.248
 things-0.237
 lot-0.235
Token!"
Token ID2474
Feature activation+2.61e+00

Pos logit contributions
arted+0.244
ked+0.235
anged+0.225
ur+0.225
 stro+0.225
Neg logit contributions
 thing-0.165
 best-0.164
 world-0.159
 things-0.153
 day-0.147
Token to
Token ID284
Feature activation+4.93e+00

Pos logit contributions
 best+0.041
 thing+0.040
 things+0.039
 world+0.039
 important+0.038
Neg logit contributions
arted-0.032
ur-0.031
ked-0.029
anged-0.028
 stro-0.028
Token resist
Token ID4180
Feature activation+6.21e+00

Pos logit contributions
arted+0.281
ur+0.272
ked+0.261
 stro+0.256
anged+0.255
Neg logit contributions
 best-0.210
 thing-0.204
 world-0.198
 things-0.195
 lot-0.192
Token and
Token ID290
Feature activation+3.95e+00

Pos logit contributions
arted+0.314
ur+0.306
 stro+0.286
ked+0.286
anged+0.279
Neg logit contributions
 best-0.298
 thing-0.290
 world-0.281
 biggest-0.277
 things-0.277
Token spend
Token ID4341
Feature activation-1.37e+00

Pos logit contributions
arted+0.214
ur+0.208
ked+0.197
 stro+0.194
anged+0.192
Neg logit contributions
 best-0.178
 thing-0.174
 things-0.167
 world-0.167
 biggest-0.163
Token one
Token ID530
Feature activation-7.98e+00

Pos logit contributions
 best+0.048
 thing+0.046
 world+0.045
 biggest+0.044
 things+0.043
Neg logit contributions
arted-0.087
ur-0.085
 stro-0.081
ked-0.081
anged-0.080
Token more
Token ID517
Feature activation-1.35e+01

Pos logit contributions
 best+0.190
 puzzle+0.187
 things+0.184
 lot+0.180
 world+0.178
Neg logit contributions
ur-0.524
arted-0.523
oured-0.479
run-0.478
 stro-0.477
Token night
Token ID1755
Feature activation-1.16e+01

Pos logit contributions
 best+0.276
 thing+0.261
 biggest+0.249
 world+0.239
 things+0.230
Neg logit contributions
arted-1.046
ur-1.037
 stro-0.995
anged-0.963
ked-0.962
Token in
Token ID287
Feature activation+5.29e-01

Pos logit contributions
 best+0.529
 thing+0.528
 lot+0.501
 things+0.495
 biggest+0.493
Neg logit contributions
ur-0.593
arted-0.592
 stro-0.558
ked-0.539
anged-0.536
Token the
Token ID262
Feature activation-5.50e-01

Pos logit contributions
arted+0.025
ur+0.024
 stro+0.022
ked+0.022
anged+0.022
Neg logit contributions
 best-0.027
 thing-0.026
 biggest-0.026
 world-0.026
 things-0.025
Token magical
Token ID10883
Feature activation-7.61e+00

Pos logit contributions
iest+0.022
 thing+0.018
kin+0.018
 puzzle+0.018
 best+0.018
Neg logit contributions
arted-0.029
ur-0.027
ree-0.025
 stro-0.025
 approached-0.024
Token forest
Token ID8222
Feature activation-5.17e+00

Pos logit contributions
 best+0.238
 biggest+0.228
 thing+0.201
 lot+0.199
 worst+0.197
Neg logit contributions
arted-0.468
ur-0.460
 stro-0.441
ree-0.426
anged-0.413
Token the
Token ID262
Feature activation+1.70e+00

Pos logit contributions
arted+0.510
ur+0.506
ked+0.468
 stro+0.464
anged+0.455
Neg logit contributions
 best-0.377
 thing-0.367
 world-0.356
 biggest-0.348
 things-0.342
Token mail
Token ID6920
Feature activation-2.79e+00

Pos logit contributions
arted+0.097
ked+0.091
 stro+0.091
ur+0.090
 Bill+0.087
Neg logit contributions
iest-0.047
 thing-0.045
 lot-0.044
 best-0.043
 important-0.041
Tokenman
Token ID805
Feature activation+3.43e+00

Pos logit contributions
 lot+0.100
 best+0.100
 world+0.097
 thing+0.093
 biggest+0.092
Neg logit contributions
ur-0.157
arted-0.146
ked-0.144
 stro-0.142
olded-0.142
Token that
Token ID326
Feature activation-1.86e+00

Pos logit contributions
ur+0.180
arted+0.168
 stro+0.166
ked+0.163
anged+0.158
Neg logit contributions
 best-0.149
 world-0.149
 thing-0.145
 lot-0.142
 biggest-0.139
Token there
Token ID612
Feature activation-5.24e+00

Pos logit contributions
 best+0.067
 lot+0.065
 thing+0.065
 world+0.065
 biggest+0.063
Neg logit contributions
ur-0.105
arted-0.103
 stro-0.098
ked-0.096
anged-0.095
Token is
Token ID318
Feature activation-1.34e+01

Pos logit contributions
 thing+0.162
 best+0.161
 lot+0.157
 biggest+0.153
 world+0.150
Neg logit contributions
ur-0.311
arted-0.289
 stro-0.287
anged-0.279
olded-0.275
Token mail
Token ID6920
Feature activation-2.77e+00

Pos logit contributions
 best+0.543
 thing+0.536
 world+0.533
 lot+0.504
 things+0.504
Neg logit contributions
ur-0.743
arted-0.734
 stro-0.690
ked-0.683
anged-0.662
Token inside
Token ID2641
Feature activation+6.87e+00

Pos logit contributions
 best+0.131
 lot+0.128
 thing+0.126
 world+0.124
 biggest+0.124
Neg logit contributions
ur-0.134
arted-0.129
ked-0.122
 stro-0.119
anged-0.116
Token the
Token ID262
Feature activation+1.71e+00

Pos logit contributions
ur+0.318
arted+0.314
 stro+0.294
oured+0.282
ked+0.279
Neg logit contributions
 best-0.322
 thing-0.308
 biggest-0.302
 world-0.296
 lot-0.295
Token mailbox
Token ID37282
Feature activation-4.02e+00

Pos logit contributions
arted+0.081
ur+0.077
ked+0.075
 stro+0.074
anged+0.074
Neg logit contributions
iest-0.067
 best-0.064
 thing-0.062
 lot-0.061
 important-0.060
Token,"
Token ID553
Feature activation+5.61e+00

Pos logit contributions
 best+0.188
 thing+0.185
 lot+0.181
 world+0.176
 biggest+0.175
Neg logit contributions
ur-0.193
arted-0.189
 stro-0.179
ked-0.177
oured-0.175
Token his
Token ID465
Feature activation+5.64e+00

Pos logit contributions
arted+0.256
ur+0.252
ked+0.235
 stro+0.233
anged+0.231
Neg logit contributions
 best-0.229
 thing-0.223
 world-0.215
 things-0.215
 biggest-0.211
Token be
Token ID307
Feature activation-2.62e+00

Pos logit contributions
ur+0.317
arted+0.315
ked+0.307
 stro+0.299
anged+0.291
Neg logit contributions
 world-0.214
 thing-0.213
 best-0.207
 important-0.200
 things-0.197
Tokenak
Token ID461
Feature activation+8.82e-01

Pos logit contributions
 best+0.117
 thing+0.112
 things+0.108
 important+0.108
 world+0.107
Neg logit contributions
arted-0.145
ur-0.140
ked-0.133
 stro-0.131
anged-0.130
Token and
Token ID290
Feature activation+3.59e+00

Pos logit contributions
arted+0.044
ur+0.043
 stro+0.040
anged+0.040
ked+0.039
Neg logit contributions
 best-0.044
 thing-0.042
 things-0.041
 world-0.040
 important-0.040
Token opened
Token ID4721
Feature activation+2.03e+00

Pos logit contributions
arted+0.200
ur+0.191
ree+0.179
anged+0.178
oured+0.177
Neg logit contributions
 thing-0.158
 best-0.158
 things-0.155
 lot-0.148
 important-0.144
Token it
Token ID340
Feature activation-9.89e+00

Pos logit contributions
arted+0.133
ur+0.133
anged+0.130
olded+0.125
oured+0.124
Neg logit contributions
 thing-0.071
 things-0.070
 best-0.067
 lot-0.063
 biggest-0.060
Token.
Token ID13
Feature activation-5.29e+00

Pos logit contributions
 best+0.390
 thing+0.374
 things+0.359
 world+0.359
 biggest+0.353
Neg logit contributions
arted-0.598
ur-0.585
ked-0.553
 stro-0.552
anged-0.549
Token Inside
Token ID14384
Feature activation+6.30e-01

Pos logit contributions
 thing+0.241
 lot+0.237
 best+0.237
 things+0.233
 world+0.229
Neg logit contributions
arted-0.276
ur-0.266
anged-0.266
ked-0.250
ree-0.245
Token the
Token ID262
Feature activation-8.89e-01

Pos logit contributions
arted+0.033
ur+0.032
ked+0.030
 stro+0.030
anged+0.029
Neg logit contributions
 best-0.029
 thing-0.028
 important-0.027
 world-0.027
 things-0.027
Token case
Token ID1339
Feature activation-1.61e+01

Pos logit contributions
 thing+0.023
 important+0.023
iest+0.022
time+0.021
 lot+0.020
Neg logit contributions
oured-0.043
 stro-0.043
ur-0.042
 aback-0.041
 hes-0.040
Token was
Token ID373
Feature activation-7.25e+00

Pos logit contributions
 best+0.768
 thing+0.762
 important+0.736
 world+0.735
 lot+0.730
Neg logit contributions
ur-0.792
arted-0.791
ked-0.732
 stro-0.724
anged-0.710