TOP ACTIVATIONS
MAX = 22.880

. At the end of the day , Lily
felt much better . The end . <|endoftext|> Once upon a
At the end of the special day they
care . The end . <|endoftext|> Once upon a
in the forest . The end . <|endoftext|> Once upon a
friend was hurt . The end . <|endoftext|> Once , there
again . The end . <|endoftext|> Once upon a
nothing worked . In the end they had a bright idea
had fun together . The end . <|endoftext|> Once upon a
very hard and in the end , the mirror looked just
. The end . <|endoftext|> Once upon a
They are happy . The end . <|endoftext|> Tom and Lily
again . The end . M oral
intelligent . The end . <|endoftext|> Once , there
meals . The end . <|endoftext|> Once upon a
save you from a bad end . <|endoftext|> Harry gazed out
. The end . <|endoftext|> The sun was
. The crack was the end of her adventures . <|endoftext|>
all day long . The end . <|endoftext|> Once upon a
all the animals . The end . <|endoftext|> Once upon a

MIN ACTIVATIONS
MIN = -30.546

said , " Ant if ree ze . Poison . Do
of a plan to unf ree ze the saf ari .
Now , I will unf ree ze you all !" He
so she could run fr ree ly . The
teacher told them to unf ree ze and sit back down
present ?" B ree said , " Well ,
the kids had their t amb our ines and drums ,
B amb i . B amb i loved to play and
. One day , B amb i saw his friend Th
j olly deer named B amb i . B amb i
in a trap . B amb i tried to help Th
songs and shaking her t amb our ine . Whenever she
the woods . They r amb led through the trees ,
. B amb i knew he had to
she went , the hand ker chief unw inding until she
let go of the hand ker chief and started climbing back
He took out a hand ker chief from his pocket and
him . As he cl amb ered , he foolish ly
delicious apples and sweet water mel ons . He also grabbed
. " L amb ? Like sheep ?" Mia

INTERVAL 9.523 - 22.880
CONTAINS 5.795%

went on the boat and had a great time sailing down
They open the box and take out the parts
stronger . They said they would hide the carrot nose and
closer look . She touched it and examined it with her
the fruit . How can we get it ?" Rod had

INTERVAL -3.833 - 9.523
CONTAINS 66.036%

need to eat something different today . Eggs and toast are
L ily squee zes his hand . " I will not
went , something new and amazing called out to her .
mom showed him a surprise . It was a big ,
are curious . They nod . The old man takes off

INTERVAL -17.190 - -3.833
CONTAINS 27.882%

hands together and tw irled around happily .
<|endoftext|> Tom and Lily went to the shop with Mom .
sweet and juicy . " Y um , y
and was never found . Her uncle felt very guilty for
L ily listened to the story and imagined she

INTERVAL -30.546 - -17.190
CONTAINS 0.287%

Hey Ch ir p ! I can
robber . Lily ch ases Sam around the trees and the
" Okay !" Sam said . He liked the
and disappointed . She sc olded Max for being greedy and
One day , Sam and Lucy went to the
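
Note on the interval boundaries: the four INTERVAL ranges above look consistent with equal-width bins spanning MIN to MAX, and the CONTAINS percentages appear to add up to 100%. This is an inference from the printed numbers, not something the dump states. The sketch below checks it; the constant and function names are illustrative, and the choice of four equal-width bins is an assumption.

```python
# Values copied from the MIN / MAX and INTERVAL headers above.
MIN_ACT = -30.546
MAX_ACT = 22.880
N_BINS = 4  # assumption: four equal-width bins between MIN and MAX

# Reported boundaries and shares, ordered from the lowest bin to the highest.
REPORTED_EDGES = [-30.546, -17.190, -3.833, 9.523, 22.880]
REPORTED_SHARES = [0.287, 27.882, 66.036, 5.795]  # percent


def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = equal_width_edges(MIN_ACT, MAX_ACT, N_BINS)

# Largest gap between a computed edge and the edge printed in the dump.
max_edge_error = max(abs(e - r) for e, r in zip(edges, REPORTED_EDGES))
total_share = sum(REPORTED_SHARES)

print("computed edges:", [round(e, 4) for e in edges])
print("max edge error:", max_edge_error)
print("total share (%):", total_share)
```

If the assumption holds, the largest edge error stays within the three-decimal rounding of the printed values, and the shares sum to roughly 100.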
Token.
Token ID13
Feature activation+4.47e+00

Pos logit contributions
arted+0.253
ur+0.253
ked+0.249
 stro+0.236
anged+0.225
Neg logit contributions
 world-0.283
 lot-0.279
 best-0.278
 thing-0.275
 problem-0.268
Token
Token ID198
Feature activation-3.43e+00

Pos logit contributions
 best+0.237
 world+0.236
 lot+0.235
 things+0.228
 thing+0.226
Neg logit contributions
arted-0.218
ur-0.214
ked-0.209
 stro-0.201
anged-0.198
Token
Token ID198
Feature activation-3.38e+00

Pos logit contributions
arted+0.184
ur+0.178
ked+0.174
 stro+0.165
anged+0.164
Neg logit contributions
 best-0.166
 world-0.165
 lot-0.164
 thing-0.159
 things-0.156
TokenAt
Token ID2953
Feature activation+2.23e+00

Pos logit contributions
arted+0.138
ur+0.131
ked+0.124
 stro+0.118
ree+0.112
Neg logit contributions
 best-0.202
 biggest-0.191
 world-0.191
 problem-0.189
 thing-0.187
Token the
Token ID262
Feature activation+4.47e-01

Pos logit contributions
 best+0.095
 world+0.091
 lot+0.090
 thing+0.089
 things+0.088
Neg logit contributions
arted-0.135
ur-0.131
ked-0.131
 stro-0.125
olded-0.122
Token end
Token ID886
Feature activation+2.07e+01

Pos logit contributions
iest+0.014
 puzzle+0.014
 things+0.014
 thing+0.014
 important+0.014
Neg logit contributions
arted-0.026
ur-0.024
ree-0.023
 stro-0.023
ked-0.022
Token of
Token ID286
Feature activation+5.81e+00

Pos logit contributions
 best+0.728
 lot+0.700
 world+0.696
 thing+0.691
 things+0.669
Neg logit contributions
arted-1.391
ked-1.340
ur-1.340
 stro-1.279
anged-1.256
Token the
Token ID262
Feature activation+5.09e-01

Pos logit contributions
 best+0.242
 lot+0.237
 thing+0.234
 day+0.229
 world+0.227
Neg logit contributions
ur-0.359
arted-0.358
ked-0.351
olded-0.334
anged-0.328
Token day
Token ID1110
Feature activation+1.55e+01

Pos logit contributions
 biggest+0.010
 world+0.009
 best+0.009
 thing+0.009
 puzzle+0.009
Neg logit contributions
arted-0.037
ur-0.036
anged-0.034
 stro-0.033
ree-0.033
Token,
Token ID11
Feature activation-7.75e-01

Pos logit contributions
 best+0.711
 world+0.693
 lot+0.679
 thing+0.673
 things+0.661
Neg logit contributions
arted-0.883
ur-0.856
ked-0.836
 stro-0.799
anged-0.783
Token Lily
Token ID20037
Feature activation-3.69e+00

Pos logit contributions
arted+0.029
ked+0.028
ur+0.028
 stro+0.026
anged+0.024
Neg logit contributions
 world-0.049
 best-0.047
 biggest-0.045
 thing-0.045
 problem-0.045
Token felt
Token ID2936
Feature activation+1.72e+00

Pos logit contributions
arted+0.368
ur+0.362
ked+0.343
 stro+0.341
anged+0.325
Neg logit contributions
 world-0.327
 best-0.313
 lot-0.308
 thing-0.306
 biggest-0.299
Token much
Token ID881
Feature activation+1.80e+01

Pos logit contributions
 world+0.063
 best+0.061
 thing+0.059
 lot+0.058
 problem+0.058
Neg logit contributions
arted-0.109
ur-0.109
 stro-0.104
ked-0.103
oured-0.097
Token better
Token ID1365
Feature activation+1.12e+01

Pos logit contributions
 world+0.570
 best+0.552
 thing+0.536
 lot+0.534
 things+0.514
Neg logit contributions
arted-1.221
ur-1.197
ked-1.162
 stro-1.161
anged-1.103
Token.
Token ID13
Feature activation+4.76e+00

Pos logit contributions
 lot+0.490
 best+0.485
 world+0.484
 biggest+0.468
 thing+0.466
Neg logit contributions
arted-0.615
ur-0.598
ked-0.590
 stro-0.577
anged-0.559
Token The
Token ID383
Feature activation-5.19e+00

Pos logit contributions
 best+0.259
 world+0.256
 lot+0.253
 thing+0.246
 things+0.243
Neg logit contributions
arted-0.228
ur-0.220
ked-0.215
anged-0.200
 stro-0.199
Token end
Token ID886
Feature activation+2.07e+01

Pos logit contributions
 stro+0.274
 pat+0.236
ur+0.234
arted+0.233
olded+0.230
Neg logit contributions
iest-0.149
 biggest-0.148
 tallest-0.145
 planet-0.143
 universe-0.139
Token.
Token ID13
Feature activation+4.86e+00

Pos logit contributions
 best+0.871
 world+0.835
 lot+0.824
 biggest+0.815
 thing+0.792
Neg logit contributions
arted-1.216
ur-1.211
ked-1.137
 stro-1.131
anged-1.104
Token<|endoftext|>
Token ID50256
Feature activation+9.02e-01

Pos logit contributions
 best+0.258
 world+0.253
 thing+0.249
 lot+0.249
 things+0.244
Neg logit contributions
arted-0.233
ur-0.227
ked-0.216
 stro-0.208
anged-0.204
TokenOnce
Token ID7454
Feature activation+3.08e+00

Pos logit contributions
 best+0.044
 world+0.044
 lot+0.043
 thing+0.042
 things+0.042
Neg logit contributions
arted-0.048
ur-0.047
ked-0.045
anged-0.043
 stro-0.043
Token upon
Token ID2402
Feature activation+5.36e+00

Pos logit contributions
 best+0.154
 world+0.152
 lot+0.149
 thing+0.149
 things+0.145
Neg logit contributions
arted-0.155
ur-0.152
ked-0.143
anged-0.137
 stro-0.137
Token a
Token ID257
Feature activation+5.30e+00

Pos logit contributions
 problem+0.258
 best+0.258
 world+0.254
 thing+0.245
 biggest+0.243
Neg logit contributions
ur-0.254
 stro-0.254
arted-0.248
ked-0.248
oured-0.217
Token
Token ID220
Feature activation+2.86e-01

Pos logit contributions
 best+0.251
 world+0.249
 lot+0.245
 thing+0.241
 things+0.236
Neg logit contributions
arted-0.215
ur-0.209
ked-0.202
 stro-0.190
anged-0.188
Token
Token ID198
Feature activation-3.17e+00

Pos logit contributions
 best+0.012
 world+0.011
 lot+0.011
 biggest+0.010
 whole+0.010
Neg logit contributions
arted-0.017
ked-0.017
ur-0.017
anged-0.016
 stro-0.016
Token
Token ID198
Feature activation-3.12e+00

Pos logit contributions
arted+0.185
ur+0.179
ked+0.178
anged+0.170
 stro+0.165
Neg logit contributions
 world-0.141
 best-0.139
 lot-0.139
 thing-0.130
 story-0.129
TokenAt
Token ID2953
Feature activation+2.51e+00

Pos logit contributions
arted+0.146
ur+0.141
 stro+0.132
ked+0.130
ree+0.120
Neg logit contributions
 best-0.166
 biggest-0.159
 problem-0.157
 thing-0.157
 things-0.156
Token the
Token ID262
Feature activation+5.09e-01

Pos logit contributions
 best+0.112
 world+0.109
 lot+0.107
 thing+0.106
 things+0.104
Neg logit contributions
arted-0.143
ur-0.140
ked-0.135
 stro-0.130
anged-0.128
Token end
Token ID886
Feature activation+2.07e+01

Pos logit contributions
 things+0.012
 puzzle+0.012
 thing+0.011
iest+0.011
 world+0.011
Neg logit contributions
ur-0.030
arted-0.029
oured-0.029
anged-0.027
 stro-0.027
Token of
Token ID286
Feature activation+5.92e+00

Pos logit contributions
 best+0.813
 world+0.798
 lot+0.778
 thing+0.771
 things+0.751
Neg logit contributions
arted-1.282
ur-1.256
ked-1.211
 stro-1.171
anged-1.161
Token the
Token ID262
Feature activation+5.84e-01

Pos logit contributions
 best+0.260
 thing+0.251
 lot+0.251
 world+0.251
 day+0.241
Neg logit contributions
arted-0.357
ur-0.352
ked-0.345
anged-0.321
 stro-0.320
Token special
Token ID2041
Feature activation+1.09e+01

Pos logit contributions
 world+0.011
 biggest+0.010
 thing+0.010
 lot+0.009
 best+0.009
Neg logit contributions
ur-0.041
arted-0.040
anged-0.039
oured-0.038
 stro-0.038
Token day
Token ID1110
Feature activation+1.56e+01

Pos logit contributions
 biggest+0.169
 best+0.167
 world+0.159
 lot+0.155
 puzzle+0.120
Neg logit contributions
ur-0.838
arted-0.808
anged-0.799
 stro-0.772
oured-0.755
Token they
Token ID484
Feature activation+9.56e+00

Pos logit contributions
 best+0.756
 world+0.747
 lot+0.738
 thing+0.731
 things+0.711
Neg logit contributions
arted-0.811
ur-0.793
ked-0.759
 stro-0.729
anged-0.724
Token care
Token ID1337
Feature activation+8.39e-01

Pos logit contributions
 world+0.065
 best+0.062
 thing+0.060
 problem+0.059
 lot+0.059
Neg logit contributions
ur-0.121
 stro-0.119
arted-0.117
oured-0.110
ked-0.108
Token.
Token ID13
Feature activation+4.48e+00

Pos logit contributions
 lot+0.041
 world+0.040
 best+0.040
 problem+0.040
 game+0.040
Neg logit contributions
ur-0.040
arted-0.039
 stro-0.036
ked-0.036
olded-0.034
Token
Token ID198
Feature activation-3.20e+00

Pos logit contributions
 world+0.252
 best+0.252
 lot+0.246
 thing+0.242
 things+0.239
Neg logit contributions
arted-0.202
ur-0.196
ked-0.194
 stro-0.180
anged-0.179
Token
Token ID198
Feature activation-3.15e+00

Pos logit contributions
arted+0.193
ked+0.187
ur+0.186
anged+0.177
 stro+0.173
Neg logit contributions
 world-0.135
 best-0.133
 lot-0.132
 thing-0.126
 things-0.125
TokenThe
Token ID464
Feature activation+4.08e+00

Pos logit contributions
arted+0.136
ur+0.133
 stro+0.121
ked+0.116
ree+0.113
Neg logit contributions
 best-0.182
 biggest-0.173
 thing-0.172
 problem-0.171
 world-0.169
Token end
Token ID886
Feature activation+2.07e+01

Pos logit contributions
iest+0.110
 biggest+0.100
 best+0.095
 thing+0.094
 lot+0.093
Neg logit contributions
arted-0.251
ur-0.211
 stro-0.209
 shrugged-0.207
pered-0.202
Token.
Token ID13
Feature activation+4.64e+00

Pos logit contributions
 best+0.887
 lot+0.850
 world+0.849
 biggest+0.837
 thing+0.809
Neg logit contributions
ur-1.187
arted-1.174
ked-1.091
 stro-1.090
anged-1.075
Token<|endoftext|>
Token ID50256
Feature activation+9.32e-01

Pos logit contributions
 best+0.232
 world+0.231
 lot+0.226
 thing+0.222
 things+0.217
Neg logit contributions
arted-0.241
ur-0.234
ked-0.227
anged-0.215
 stro-0.214
TokenOnce
Token ID7454
Feature activation+3.03e+00

Pos logit contributions
 best+0.048
 world+0.047
 lot+0.047
 thing+0.046
 things+0.045
Neg logit contributions
arted-0.046
ur-0.045
ked-0.043
 stro-0.041
anged-0.040
Token upon
Token ID2402
Feature activation+5.49e+00

Pos logit contributions
 best+0.153
 world+0.151
 lot+0.149
 thing+0.148
 biggest+0.145
Neg logit contributions
arted-0.151
ur-0.147
ked-0.138
anged-0.133
 stro-0.132
Token a
Token ID257
Feature activation+5.37e+00

Pos logit contributions
 problem+0.271
 best+0.267
 world+0.264
 thing+0.254
 biggest+0.252
Neg logit contributions
ur-0.254
 stro-0.253
arted-0.246
ked-0.246
ree-0.214
Token in
Token ID287
Feature activation-1.05e+00

Pos logit contributions
arted+0.044
ur+0.043
ked+0.042
 stro+0.041
anged+0.040
Neg logit contributions
 best-0.032
 world-0.031
 biggest-0.031
 lot-0.031
 thing-0.031
Token the
Token ID262
Feature activation+4.38e-01

Pos logit contributions
arted+0.044
ur+0.042
 stro+0.039
ked+0.039
anged+0.037
Neg logit contributions
 best-0.057
 thing-0.055
 biggest-0.055
 world-0.055
 lot-0.054
Token forest
Token ID8222
Feature activation+5.80e+00

Pos logit contributions
 thing+0.014
iest+0.014
 game+0.013
 task+0.013
 puzzle+0.013
Neg logit contributions
arted-0.026
ur-0.023
 stro-0.023
 Sammy-0.022
ked-0.021
Token.
Token ID13
Feature activation+4.66e+00

Pos logit contributions
 best+0.252
 lot+0.250
 thing+0.250
 world+0.245
 biggest+0.243
Neg logit contributions
arted-0.316
ur-0.314
ked-0.299
 stro-0.292
anged-0.283
Token The
Token ID383
Feature activation-5.18e+00

Pos logit contributions
 best+0.249
 world+0.246
 thing+0.241
 lot+0.240
 things+0.236
Neg logit contributions
arted-0.220
ur-0.214
ked-0.205
 stro-0.196
anged-0.192
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
arted+0.309
 stro+0.295
ur+0.293
ked+0.292
pered+0.266
Neg logit contributions
iest-0.135
 thing-0.129
 biggest-0.119
 task-0.119
 puzzle-0.117
Token.
Token ID13
Feature activation+4.76e+00

Pos logit contributions
 best+0.894
 world+0.879
 biggest+0.863
 lot+0.848
 thing+0.838
Neg logit contributions
ur-1.140
arted-1.127
ked-1.067
 stro-1.064
anged-1.026
Token<|endoftext|>
Token ID50256
Feature activation+8.93e-01

Pos logit contributions
 best+0.258
 world+0.253
 thing+0.251
 lot+0.248
 things+0.246
Neg logit contributions
arted-0.216
ur-0.211
ked-0.200
 stro-0.194
anged-0.187
TokenOnce
Token ID7454
Feature activation+3.21e+00

Pos logit contributions
 best+0.044
 world+0.044
 lot+0.043
 thing+0.042
 things+0.041
Neg logit contributions
arted-0.047
ur-0.046
ked-0.044
anged-0.042
 stro-0.042
Token upon
Token ID2402
Feature activation+5.63e+00

Pos logit contributions
 best+0.159
 world+0.157
 lot+0.153
 thing+0.153
 things+0.150
Neg logit contributions
arted-0.164
ur-0.160
ked-0.152
 stro-0.145
anged-0.145
Token a
Token ID257
Feature activation+5.43e+00

Pos logit contributions
 problem+0.274
 best+0.270
 world+0.266
 thing+0.260
 biggest+0.256
Neg logit contributions
 stro-0.262
ur-0.260
ked-0.259
arted-0.256
ree-0.223
Token friend
Token ID1545
Feature activation+1.70e+00

Pos logit contributions
 world+0.504
 best+0.501
 lot+0.486
 biggest+0.475
 thing+0.472
Neg logit contributions
arted-0.659
ur-0.642
ked-0.629
 stro-0.624
anged-0.604
Token was
Token ID373
Feature activation+7.47e+00

Pos logit contributions
 best+0.075
 world+0.073
 lot+0.072
 thing+0.071
 things+0.069
Neg logit contributions
arted-0.098
ur-0.096
ked-0.092
 stro-0.089
anged-0.088
Token hurt
Token ID5938
Feature activation-3.85e+00

Pos logit contributions
 best+0.319
 world+0.314
 lot+0.308
 thing+0.305
 things+0.297
Neg logit contributions
arted-0.441
ur-0.430
ked-0.415
 stro-0.400
anged-0.394
Token.
Token ID13
Feature activation+4.87e+00

Pos logit contributions
arted+0.220
ur+0.212
ked+0.206
 stro+0.195
anged+0.194
Neg logit contributions
 world-0.166
 best-0.166
 biggest-0.160
 lot-0.159
 thing-0.158
Token The
Token ID383
Feature activation-5.11e+00

Pos logit contributions
 lot+0.245
 best+0.244
 world+0.243
 things+0.234
 thing+0.231
Neg logit contributions
arted-0.254
ur-0.246
ked-0.241
oured-0.226
 stro-0.223
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 stro+0.303
arted+0.287
ked+0.286
ur+0.283
olded+0.273
Neg logit contributions
 biggest-0.145
 world-0.144
iest-0.135
 lot-0.129
 best-0.128
Token.
Token ID13
Feature activation+4.96e+00

Pos logit contributions
 best+0.828
 world+0.808
 lot+0.782
 thing+0.770
 biggest+0.766
Neg logit contributions
arted-1.256
ur-1.243
ked-1.189
 stro-1.159
anged-1.147
Token<|endoftext|>
Token ID50256
Feature activation+9.22e-01

Pos logit contributions
 best+0.245
 world+0.243
 lot+0.241
 thing+0.234
 things+0.230
Neg logit contributions
arted-0.261
ur-0.253
ked-0.245
anged-0.231
 stro-0.231
TokenOnce
Token ID7454
Feature activation+3.08e+00

Pos logit contributions
 world+0.045
 best+0.045
 lot+0.043
 thing+0.043
 things+0.042
Neg logit contributions
arted-0.050
ur-0.048
ked-0.047
anged-0.044
 stro-0.044
Token,
Token ID11
Feature activation-9.18e-01

Pos logit contributions
 best+0.153
 world+0.152
 lot+0.148
 thing+0.148
 things+0.144
Neg logit contributions
arted-0.155
ur-0.152
ked-0.143
anged-0.137
 stro-0.136
Token there
Token ID612
Feature activation+7.58e+00

Pos logit contributions
arted+0.048
ur+0.048
ked+0.046
anged+0.043
 stro+0.043
Neg logit contributions
 world-0.043
 best-0.041
 biggest-0.040
 lot-0.040
 thing-0.039
Token again
Token ID757
Feature activation-2.97e+00

Pos logit contributions
 lot+0.261
 world+0.260
 best+0.248
 biggest+0.245
 thing+0.241
Neg logit contributions
ur-0.239
arted-0.239
ked-0.230
 stro-0.212
oured-0.211
Token.
Token ID13
Feature activation+4.61e+00

Pos logit contributions
arted+0.170
ur+0.166
ked+0.160
 stro+0.156
anged+0.155
Neg logit contributions
 best-0.123
 lot-0.121
 world-0.120
 biggest-0.117
 thing-0.117
Token
Token ID198
Feature activation-3.08e+00

Pos logit contributions
 best+0.233
 world+0.232
 lot+0.226
 thing+0.222
 things+0.217
Neg logit contributions
arted-0.238
ur-0.230
ked-0.226
 stro-0.213
anged-0.211
Token
Token ID198
Feature activation-3.05e+00

Pos logit contributions
arted+0.157
ur+0.154
ked+0.145
 stro+0.142
anged+0.138
Neg logit contributions
 best-0.153
 world-0.149
 thing-0.147
 lot-0.146
 things-0.144
TokenThe
Token ID464
Feature activation+4.04e+00

Pos logit contributions
arted+0.136
ur+0.134
 stro+0.122
ked+0.117
olded+0.112
Neg logit contributions
 best-0.163
 biggest-0.158
 problem-0.157
 thing-0.156
 things-0.154
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 biggest+0.107
 world+0.098
 best+0.098
 lot+0.096
iest+0.095
Neg logit contributions
 stro-0.244
arted-0.235
ked-0.234
ur-0.224
anged-0.221
Token.
Token ID13
Feature activation+4.72e+00

Pos logit contributions
 best+0.889
 world+0.853
 lot+0.849
 biggest+0.844
 thing+0.811
Neg logit contributions
ur-1.175
arted-1.160
ked-1.085
 stro-1.082
anged-1.065
Token<|endoftext|>
Token ID50256
Feature activation+9.39e-01

Pos logit contributions
 best+0.234
 world+0.233
 lot+0.227
 thing+0.223
 things+0.217
Neg logit contributions
arted-0.248
ur-0.240
ked-0.234
 stro-0.221
anged-0.221
TokenOnce
Token ID7454
Feature activation+3.09e+00

Pos logit contributions
 best+0.050
 world+0.048
 lot+0.048
 thing+0.048
 things+0.047
Neg logit contributions
arted-0.045
ur-0.043
ked-0.042
 stro-0.040
anged-0.039
Token upon
Token ID2402
Feature activation+5.51e+00

Pos logit contributions
 best+0.153
 world+0.151
 lot+0.148
 thing+0.147
 things+0.143
Neg logit contributions
arted-0.158
ur-0.155
ked-0.147
 stro-0.140
anged-0.140
Token a
Token ID257
Feature activation+5.39e+00

Pos logit contributions
 problem+0.270
 best+0.266
 world+0.263
 thing+0.254
 biggest+0.251
Neg logit contributions
ur-0.257
 stro-0.255
arted-0.249
ked-0.248
oured-0.216
Token nothing
Token ID2147
Feature activation+6.87e+00

Pos logit contributions
arted+0.237
ur+0.230
ked+0.222
anged+0.214
 stro+0.209
Neg logit contributions
 world-0.217
 best-0.208
 lot-0.206
 biggest-0.204
 story-0.199
Token worked
Token ID3111
Feature activation+5.11e+00

Pos logit contributions
 world+0.332
 best+0.318
 lot+0.309
 biggest+0.308
 things+0.295
Neg logit contributions
ur-0.366
arted-0.363
ked-0.348
 stro-0.336
anged-0.328
Token.
Token ID13
Feature activation+4.44e+00

Pos logit contributions
 world+0.235
 lot+0.234
 biggest+0.233
 best+0.225
 thing+0.220
Neg logit contributions
arted-0.255
ur-0.248
ked-0.240
anged-0.225
ried-0.221
Token In
Token ID554
Feature activation-5.97e-01

Pos logit contributions
 world+0.241
 best+0.241
 lot+0.236
 thing+0.230
 things+0.229
Neg logit contributions
arted-0.207
ur-0.204
ked-0.196
 stro-0.189
anged-0.182
Token the
Token ID262
Feature activation+3.62e-01

Pos logit contributions
arted+0.025
ur+0.023
anged+0.022
ked+0.022
 stro+0.020
Neg logit contributions
 world-0.033
 biggest-0.032
 best-0.032
 lot-0.031
 thing-0.030
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
iest+0.012
iflower+0.010
ww+0.010
 comet+0.010
 thing+0.010
Neg logit contributions
arted-0.019
anged-0.018
 aback-0.018
ked-0.018
 Sammy-0.017
Token they
Token ID484
Feature activation+9.35e+00

Pos logit contributions
 best+1.051
 world+1.047
 lot+1.015
 thing+1.000
 biggest+0.995
Neg logit contributions
arted-1.010
ur-1.006
ked-0.933
anged-0.907
 stro-0.902
Token had
Token ID550
Feature activation+1.14e+01

Pos logit contributions
 best+0.439
 world+0.431
 lot+0.423
 thing+0.419
 things+0.410
Neg logit contributions
arted-0.511
ur-0.498
ked-0.479
 stro-0.460
anged-0.454
Token a
Token ID257
Feature activation+5.51e+00

Pos logit contributions
 best+0.575
 world+0.564
 lot+0.555
 thing+0.549
 things+0.540
Neg logit contributions
arted-0.588
ur-0.572
ked-0.550
 stro-0.525
anged-0.520
Token bright
Token ID6016
Feature activation+6.81e+00

Pos logit contributions
iest+0.145
 world+0.126
 things+0.125
 biggest+0.122
 thing+0.116
Neg logit contributions
ur-0.347
 stro-0.333
arted-0.325
ree-0.321
anged-0.316
Token idea
Token ID2126
Feature activation+9.09e+00

Pos logit contributions
 best+0.207
 biggest+0.198
 world+0.190
 lot+0.176
 things+0.175
Neg logit contributions
arted-0.419
ur-0.411
ked-0.393
anged-0.392
 stro-0.390
Token had
Token ID550
Feature activation+1.12e+01

Pos logit contributions
 best+0.240
 world+0.238
 lot+0.232
 thing+0.230
 things+0.227
Neg logit contributions
arted-0.188
ur-0.183
ked-0.174
 stro-0.167
anged-0.163
Token fun
Token ID1257
Feature activation+9.58e+00

Pos logit contributions
 best+0.419
 world+0.415
 lot+0.402
 thing+0.401
 things+0.386
Neg logit contributions
arted-0.714
ur-0.698
ked-0.677
 stro-0.656
anged-0.647
Token together
Token ID1978
Feature activation-8.40e-01

Pos logit contributions
 best+0.542
 world+0.536
 biggest+0.533
 lot+0.530
 thing+0.526
Neg logit contributions
arted-0.405
ur-0.394
ked-0.379
 stro-0.371
anged-0.366
Token.
Token ID13
Feature activation+4.63e+00

Pos logit contributions
arted+0.053
ur+0.052
ked+0.050
 stro+0.048
anged+0.048
Neg logit contributions
 best-0.033
 world-0.032
 lot-0.031
 thing-0.031
 things-0.030
Token The
Token ID383
Feature activation-5.29e+00

Pos logit contributions
 best+0.273
 lot+0.272
 world+0.271
 thing+0.262
 things+0.260
Neg logit contributions
arted-0.198
ur-0.193
ked-0.186
 stro-0.173
anged-0.172
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
arted+0.332
ur+0.313
anged+0.296
 stro+0.295
 Sammy+0.290
Neg logit contributions
 biggest-0.134
 best-0.125
iest-0.120
 lot-0.117
br-0.104
Token.
Token ID13
Feature activation+4.73e+00

Pos logit contributions
 best+0.905
 world+0.871
 biggest+0.862
 lot+0.857
 thing+0.818
Neg logit contributions
ur-1.147
arted-1.143
ked-1.078
 stro-1.065
anged-1.050
Token<|endoftext|>
Token ID50256
Feature activation+9.03e-01

Pos logit contributions
 best+0.251
 world+0.247
 thing+0.242
 lot+0.242
 things+0.237
Neg logit contributions
arted-0.226
ur-0.220
ked-0.210
 stro-0.202
anged-0.198
TokenOnce
Token ID7454
Feature activation+3.00e+00

Pos logit contributions
 best+0.045
 world+0.045
 lot+0.044
 thing+0.043
 things+0.042
Neg logit contributions
arted-0.047
ur-0.046
ked-0.044
 stro-0.042
anged-0.042
Token upon
Token ID2402
Feature activation+5.38e+00

Pos logit contributions
 best+0.149
 world+0.148
 lot+0.144
 thing+0.144
 things+0.141
Neg logit contributions
arted-0.152
ur-0.148
ked-0.140
anged-0.134
 stro-0.133
Token a
Token ID257
Feature activation+5.31e+00

Pos logit contributions
 problem+0.266
 best+0.263
 world+0.260
 thing+0.250
 biggest+0.250
Neg logit contributions
 stro-0.248
ur-0.244
ked-0.243
arted-0.240
ree-0.210
Token very
Token ID845
Feature activation+1.50e+01

Pos logit contributions
 best+0.127
 lot+0.118
 world+0.116
 thing+0.113
 things+0.111
Neg logit contributions
arted-0.373
ur-0.368
ked-0.357
 stro-0.350
anged-0.346
Token hard
Token ID1327
Feature activation+1.18e+01

Pos logit contributions
 world+0.350
 best+0.331
 thing+0.307
 lot+0.296
 biggest+0.293
Neg logit contributions
arted-1.167
ur-1.155
ked-1.124
 stro-1.082
anged-1.075
Token and
Token ID290
Feature activation-3.48e+00

Pos logit contributions
 best+0.602
 world+0.584
 lot+0.581
 things+0.574
 thing+0.572
Neg logit contributions
arted-0.604
ur-0.595
ked-0.568
 stro-0.563
anged-0.538
Token in
Token ID287
Feature activation-8.66e-01

Pos logit contributions
arted+0.186
ur+0.181
ked+0.173
 stro+0.166
anged+0.163
Neg logit contributions
 best-0.171
 world-0.165
 lot-0.164
 thing-0.164
 things-0.161
Token the
Token ID262
Feature activation+4.50e-01

Pos logit contributions
arted+0.044
ur+0.041
ked+0.041
anged+0.040
 stro+0.039
Neg logit contributions
 best-0.040
 world-0.040
 biggest-0.039
 lot-0.038
 thing-0.037
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
iest+0.017
 lot+0.016
 best+0.016
 important+0.016
ooky+0.015
Neg logit contributions
arted-0.020
anged-0.020
 stro-0.020
oured-0.018
ked-0.018
Token,
Token ID11
Feature activation-8.96e-01

Pos logit contributions
 best+0.925
 world+0.908
 lot+0.889
 thing+0.882
 things+0.861
Neg logit contributions
arted-1.169
ur-1.141
ked-1.099
 stro-1.057
anged-1.046
Token the
Token ID262
Feature activation+5.26e-01

Pos logit contributions
arted+0.051
ur+0.049
ked+0.047
 stro+0.046
anged+0.045
Neg logit contributions
 best-0.041
 world-0.040
 lot-0.039
 thing-0.039
 things-0.038
Token mirror
Token ID10162
Feature activation+2.80e+00

Pos logit contributions
 lot+0.020
 best+0.018
 world+0.018
 biggest+0.017
 planet+0.017
Neg logit contributions
arted-0.024
 stro-0.024
anged-0.024
 shrugged-0.023
ked-0.022
Token looked
Token ID3114
Feature activation+6.28e+00

Pos logit contributions
 world+0.112
 lot+0.109
 best+0.109
 biggest+0.104
 thing+0.103
Neg logit contributions
arted-0.163
ur-0.162
ked-0.161
 stro-0.157
anged-0.156
Token just
Token ID655
Feature activation-2.11e+00

Pos logit contributions
 world+0.247
 best+0.231
 thing+0.226
 story+0.224
 lot+0.218
Neg logit contributions
arted-0.381
ur-0.371
ked-0.366
 stro-0.350
oured-0.339
Token.
Token ID13
Feature activation+4.62e+00

Pos logit contributions
 world+0.691
 best+0.690
 lot+0.681
 thing+0.671
 biggest+0.658
Neg logit contributions
arted-0.900
ur-0.891
ked-0.828
 stro-0.822
anged-0.785
Token
Token ID220
Feature activation+3.19e-01

Pos logit contributions
 world+0.205
 lot+0.201
 best+0.200
 thing+0.185
 things+0.182
Neg logit contributions
ked-0.268
arted-0.267
ur-0.257
anged-0.247
 stro-0.235
Token
Token ID198
Feature activation-3.13e+00

Pos logit contributions
 best+0.014
 world+0.013
 lot+0.013
 biggest+0.012
 things+0.012
Neg logit contributions
arted-0.019
ked-0.019
ur-0.018
anged-0.018
 stro-0.017
Token
Token ID198
Feature activation-3.11e+00

Pos logit contributions
arted+0.191
ked+0.188
ur+0.183
anged+0.179
oured+0.170
Neg logit contributions
 world-0.130
 best-0.128
 lot-0.128
 story-0.120
 thing-0.117
TokenThe
Token ID464
Feature activation+3.94e+00

Pos logit contributions
arted+0.179
ur+0.175
ked+0.167
 stro+0.164
anged+0.159
Neg logit contributions
 best-0.136
 world-0.132
 thing-0.130
 lot-0.129
 things-0.127
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
iest+0.107
 biggest+0.107
 world+0.104
 best+0.100
 thing+0.098
Neg logit contributions
arted-0.227
 stro-0.223
ked-0.216
ur-0.213
oured-0.201
Token.
Token ID13
Feature activation+4.74e+00

Pos logit contributions
 best+0.891
 world+0.865
 biggest+0.856
 lot+0.853
 thing+0.821
Neg logit contributions
ur-1.182
arted-1.157
 stro-1.086
ked-1.074
anged-1.057
Token<|endoftext|>
Token ID50256
Feature activation+8.48e-01

Pos logit contributions
 world+0.226
 best+0.226
 lot+0.220
 thing+0.213
 things+0.206
Neg logit contributions
arted-0.261
ur-0.253
ked-0.250
anged-0.237
 stro-0.230
TokenOnce
Token ID7454
Feature activation+3.06e+00

Pos logit contributions
 best+0.039
 world+0.037
 lot+0.037
 thing+0.037
 things+0.036
Neg logit contributions
arted-0.047
ur-0.045
ked-0.044
 stro-0.042
anged-0.041
Token upon
Token ID2402
Feature activation+5.45e+00

Pos logit contributions
 best+0.153
 world+0.151
 lot+0.148
 thing+0.148
 things+0.144
Neg logit contributions
arted-0.154
ur-0.151
ked-0.142
anged-0.136
 stro-0.135
Token a
Token ID257
Feature activation+5.37e+00

Pos logit contributions
 problem+0.268
 best+0.263
 world+0.260
 thing+0.252
 biggest+0.249
Neg logit contributions
 stro-0.256
ur-0.254
arted-0.247
ked-0.246
oured-0.215
Token They
Token ID1119
Feature activation+3.93e+00

Pos logit contributions
 best+0.213
 lot+0.210
 world+0.208
 thing+0.204
 things+0.199
Neg logit contributions
arted-0.245
ur-0.240
ked-0.227
anged-0.217
oured-0.214
Token are
Token ID389
Feature activation+1.30e+01

Pos logit contributions
 world+0.174
 best+0.170
 thing+0.169
 things+0.163
 biggest+0.161
Neg logit contributions
ur-0.214
arted-0.210
ked-0.207
 stro-0.206
anged-0.191
Token happy
Token ID3772
Feature activation+5.95e+00

Pos logit contributions
 world+0.480
 lot+0.456
 thing+0.456
 best+0.444
 things+0.431
Neg logit contributions
arted-0.818
ur-0.796
ked-0.782
 stro-0.762
anged-0.727
Token.
Token ID13
Feature activation+4.51e+00

Pos logit contributions
 best+0.252
 world+0.246
 lot+0.245
 biggest+0.241
 thing+0.238
Neg logit contributions
arted-0.343
ur-0.335
ked-0.324
 stro-0.319
anged-0.310
Token The
Token ID383
Feature activation-5.27e+00

Pos logit contributions
 best+0.226
 world+0.221
 lot+0.220
 thing+0.216
 things+0.211
Neg logit contributions
arted-0.244
ur-0.238
ked-0.226
anged-0.214
 stro-0.213
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 stro+0.269
ked+0.269
arted+0.262
 pat+0.245
 Tom+0.239
Neg logit contributions
iest-0.153
iflower-0.142
 world-0.139
body-0.138
ooky-0.136
Token.
Token ID13
Feature activation+4.61e+00

Pos logit contributions
 best+0.949
 biggest+0.930
 world+0.921
 lot+0.921
 most+0.838
Neg logit contributions
ur-1.102
arted-1.084
ked-1.034
anged-1.024
 stro-1.021
Token<|endoftext|>
Token ID50256
Feature activation+8.83e-01

Pos logit contributions
 best+0.224
 world+0.219
 lot+0.219
 thing+0.212
 things+0.208
Neg logit contributions
arted-0.253
ur-0.247
ked-0.235
anged-0.224
 stro-0.221
TokenTom
Token ID13787
Feature activation-6.72e+00

Pos logit contributions
 best+0.044
 lot+0.042
 biggest+0.042
 important+0.042
 thing+0.042
Neg logit contributions
arted-0.042
ur-0.041
ked-0.039
 stro-0.038
anged-0.037
Token and
Token ID290
Feature activation-3.65e+00

Pos logit contributions
arted+0.519
ur+0.507
ked+0.497
 stro+0.478
anged+0.474
Neg logit contributions
 best-0.170
 world-0.165
 lot-0.156
 thing-0.155
 things-0.151
Token Lily
Token ID20037
Feature activation-3.83e+00

Pos logit contributions
arted+0.213
ur+0.205
ked+0.198
anged+0.190
 stro+0.187
Neg logit contributions
 best-0.164
 thing-0.160
 lot-0.160
 world-0.157
 things-0.155
Token again
Token ID757
Feature activation-3.08e+00

Pos logit contributions
 best+0.214
 world+0.211
 lot+0.207
 thing+0.206
 things+0.202
Neg logit contributions
arted-0.190
ur-0.184
ked-0.177
 stro-0.168
anged-0.166
Token.
Token ID13
Feature activation+4.80e+00

Pos logit contributions
arted+0.197
ur+0.192
ked+0.186
 stro+0.180
anged+0.178
Neg logit contributions
 best-0.117
 world-0.115
 lot-0.112
 thing-0.111
 things-0.108
Token\n
Token ID198
Feature activation-3.26e+00

Pos logit contributions
 world+0.207
 best+0.203
 lot+0.201
 thing+0.191
 things+0.189
Neg logit contributions
arted-0.294
ur-0.285
ked-0.281
oured-0.272
anged-0.260
Token\n
Token ID198
Feature activation-3.24e+00

Pos logit contributions
arted+0.204
ked+0.196
ur+0.195
anged+0.184
 stro+0.182
Neg logit contributions
 world-0.132
 best-0.131
 lot-0.130
 thing-0.125
 things-0.123
TokenThe
Token ID464
Feature activation+3.75e+00

Pos logit contributions
arted+0.196
ur+0.191
ked+0.185
 stro+0.178
anged+0.176
Neg logit contributions
 best-0.134
 world-0.132
 lot-0.130
 thing-0.129
 things-0.125
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 world+0.161
 thing+0.161
 biggest+0.160
 puzzle+0.160
 best+0.157
Neg logit contributions
arted-0.174
 stro-0.168
ur-0.166
ked-0.160
pered-0.148
Token.
Token ID13
Feature activation+4.94e+00

Pos logit contributions
 best+0.851
 world+0.822
 lot+0.809
 biggest+0.795
 thing+0.778
Neg logit contributions
arted-1.220
ur-1.215
ked-1.151
 stro-1.128
anged-1.121
Token\n
Token ID198
Feature activation-3.13e+00

Pos logit contributions
 world+0.203
 lot+0.199
 best+0.194
 story+0.185
 thing+0.184
Neg logit contributions
arted-0.311
ur-0.299
ked-0.295
anged-0.281
oured-0.277
Token\n
Token ID198
Feature activation-3.09e+00

Pos logit contributions
arted+0.183
ur+0.177
ked+0.173
anged+0.164
 stro+0.163
Neg logit contributions
 best-0.138
 world-0.137
 lot-0.134
 thing-0.132
 things-0.129
TokenM
Token ID44
Feature activation-2.03e+00

Pos logit contributions
arted+0.179
ur+0.175
ked+0.169
 stro+0.162
anged+0.161
Neg logit contributions
 best-0.134
 world-0.132
 lot-0.129
 thing-0.128
 things-0.125
Tokenoral
Token ID6864
Feature activation+1.28e+01

Pos logit contributions
arted+0.142
ur+0.141
ked+0.135
anged+0.132
 stro+0.131
Neg logit contributions
 best-0.061
 world-0.060
 lot-0.058
 thing-0.058
 biggest-0.056
Token intelligent
Token ID12661
Feature activation+4.09e+00

Pos logit contributions
 world+0.487
 thing+0.435
 best+0.425
 things+0.425
 puzzle+0.424
Neg logit contributions
ur-1.004
arted-0.978
ked-0.908
 stro-0.903
olded-0.885
Token.
Token ID13
Feature activation+4.52e+00

Pos logit contributions
 lot+0.200
 world+0.196
 best+0.195
 biggest+0.192
 thing+0.188
Neg logit contributions
arted-0.204
ur-0.197
ked-0.189
 stro-0.182
anged-0.176
Token\n
Token ID198
Feature activation-3.13e+00

Pos logit contributions
 world+0.240
 best+0.238
 lot+0.236
 thing+0.226
 things+0.225
Neg logit contributions
arted-0.216
ur-0.213
ked-0.210
anged-0.196
 stro-0.195
Token\n
Token ID198
Feature activation-3.11e+00

Pos logit contributions
arted+0.171
ur+0.166
ked+0.164
anged+0.155
 stro+0.153
Neg logit contributions
 world-0.148
 best-0.148
 lot-0.147
 thing-0.141
 things-0.139
TokenThe
Token ID464
Feature activation+4.06e+00

Pos logit contributions
arted+0.139
ur+0.131
 stro+0.124
ree+0.116
ked+0.115
Neg logit contributions
 best-0.175
 biggest-0.166
 problem-0.164
 thing-0.163
 things-0.159
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
iest+0.139
 world+0.136
 biggest+0.134
 lot+0.134
ooky+0.132
Neg logit contributions
 stro-0.199
arted-0.194
ked-0.177
ur-0.174
oured-0.172
Token.
Token ID13
Feature activation+4.68e+00

Pos logit contributions
 best+0.882
 world+0.850
 lot+0.836
 biggest+0.832
 thing+0.797
Neg logit contributions
ur-1.182
arted-1.174
ked-1.105
 stro-1.092
anged-1.077
Token<|endoftext|>
Token ID50256
Feature activation+9.26e-01

Pos logit contributions
 world+0.230
 best+0.229
 lot+0.225
 thing+0.219
 things+0.213
Neg logit contributions
arted-0.248
ur-0.240
ked-0.235
anged-0.222
 stro-0.220
TokenOnce
Token ID7454
Feature activation+3.00e+00

Pos logit contributions
 best+0.049
 world+0.047
 lot+0.047
 thing+0.047
 things+0.046
Neg logit contributions
arted-0.044
ur-0.042
ked-0.040
 stro-0.039
anged-0.038
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
 best+0.152
 world+0.150
 lot+0.147
 thing+0.147
 biggest+0.143
Neg logit contributions
arted-0.149
ur-0.146
ked-0.137
anged-0.132
 stro-0.130
Token there
Token ID612
Feature activation+7.48e+00

Pos logit contributions
arted+0.059
ur+0.059
ked+0.056
 stro+0.054
anged+0.054
Neg logit contributions
 world-0.045
 best-0.045
 lot-0.044
 biggest-0.043
 thing-0.042
Token meals
Token ID13840
Feature activation+5.61e+00

Pos logit contributions
 best+0.035
 biggest+0.033
 problem+0.033
 puzzle+0.032
 world+0.032
Neg logit contributions
arted-0.062
 stro-0.061
ur-0.060
ked-0.057
anged-0.056
Token.
Token ID13
Feature activation+4.49e+00

Pos logit contributions
 world+0.250
 lot+0.247
 best+0.245
 thing+0.238
 biggest+0.236
Neg logit contributions
ur-0.304
arted-0.300
ked-0.290
 stro-0.279
anged-0.269
Token\n
Token ID198
Feature activation-3.15e+00

Pos logit contributions
 best+0.255
 world+0.251
 lot+0.247
 thing+0.245
 things+0.241
Neg logit contributions
arted-0.202
ur-0.196
ked-0.187
 stro-0.178
anged-0.175
Token\n
Token ID198
Feature activation-3.13e+00

Pos logit contributions
arted+0.173
ur+0.168
ked+0.163
 stro+0.156
anged+0.155
Neg logit contributions
 best-0.147
 world-0.146
 lot-0.143
 thing-0.141
 things-0.138
TokenThe
Token ID464
Feature activation+3.99e+00

Pos logit contributions
arted+0.146
ur+0.142
 stro+0.130
ked+0.127
ree+0.120
Neg logit contributions
 best-0.167
 problem-0.159
 thing-0.158
 biggest-0.158
 things-0.156
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 puzzle+0.142
 biggest+0.140
 best+0.139
 game+0.134
 world+0.132
Neg logit contributions
 stro-0.225
arted-0.222
ur-0.220
ked-0.209
anged-0.198
Token.
Token ID13
Feature activation+4.60e+00

Pos logit contributions
 best+0.859
 world+0.835
 lot+0.823
 biggest+0.807
 thing+0.800
Neg logit contributions
ur-1.206
arted-1.197
ked-1.115
 stro-1.112
anged-1.089
Token<|endoftext|>
Token ID50256
Feature activation+8.63e-01

Pos logit contributions
 best+0.234
 world+0.231
 lot+0.226
 thing+0.223
 things+0.218
Neg logit contributions
arted-0.235
ur-0.228
ked-0.221
 stro-0.209
anged-0.208
TokenOnce
Token ID7454
Feature activation+3.08e+00

Pos logit contributions
 best+0.043
 world+0.043
 lot+0.042
 thing+0.041
 things+0.040
Neg logit contributions
arted-0.045
ur-0.043
ked-0.042
 stro-0.040
anged-0.040
Token upon
Token ID2402
Feature activation+5.49e+00

Pos logit contributions
 best+0.151
 world+0.148
 lot+0.145
 thing+0.144
 things+0.141
Neg logit contributions
arted-0.161
ur-0.157
ked-0.150
 stro-0.144
anged-0.142
Token a
Token ID257
Feature activation+5.40e+00

Pos logit contributions
 problem+0.263
 best+0.261
 world+0.259
 thing+0.250
 biggest+0.246
Neg logit contributions
ur-0.264
 stro-0.261
arted-0.257
ked-0.255
anged-0.224
Token save
Token ID3613
Feature activation-3.14e+00

Pos logit contributions
 best+0.582
 world+0.565
 thing+0.562
 lot+0.561
 things+0.549
Neg logit contributions
arted-0.748
ur-0.748
ked-0.700
 stro-0.684
anged-0.667
Token you
Token ID345
Feature activation+1.03e+01

Pos logit contributions
arted+0.185
ur+0.184
ked+0.172
 stro+0.168
anged+0.166
Neg logit contributions
 world-0.121
 best-0.120
 lot-0.118
 thing-0.114
 biggest-0.113
Token from
Token ID422
Feature activation-1.20e+01

Pos logit contributions
 best+0.459
 lot+0.458
 world+0.458
 thing+0.443
 biggest+0.435
Neg logit contributions
ur-0.547
arted-0.538
ked-0.510
 stro-0.508
anged-0.482
Token a
Token ID257
Feature activation+5.44e+00

Pos logit contributions
arted+0.637
ur+0.612
ked+0.578
 stro+0.574
anged+0.566
Neg logit contributions
 best-0.532
 world-0.523
 lot-0.522
 biggest-0.508
 thing-0.507
Token bad
Token ID2089
Feature activation+9.25e+00

Pos logit contributions
 best+0.180
 things+0.174
 biggest+0.168
 thing+0.164
 world+0.164
Neg logit contributions
ur-0.334
arted-0.321
 stro-0.313
oured-0.296
ree-0.292
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 best+0.269
 biggest+0.255
 world+0.246
 lot+0.241
 things+0.228
Neg logit contributions
ur-0.615
arted-0.614
 stro-0.572
ked-0.568
anged-0.567
Token.
Token ID13
Feature activation+4.72e+00

Pos logit contributions
 lot+0.892
 best+0.877
 world+0.856
 biggest+0.826
 thing+0.824
Neg logit contributions
arted-1.168
ur-1.162
ked-1.073
 stro-1.042
anged-1.037
Token<|endoftext|>
Token ID50256
Feature activation+7.93e-01

Pos logit contributions
 world+0.209
 best+0.205
 lot+0.198
 thing+0.192
 story+0.188
Neg logit contributions
arted-0.283
ked-0.276
ur-0.267
anged-0.256
oured-0.246
TokenHarry
Token ID18308
Feature activation-1.18e+01

Pos logit contributions
 best+0.037
 world+0.036
 lot+0.036
 thing+0.036
 things+0.035
Neg logit contributions
arted-0.043
ur-0.041
ked-0.040
 stro-0.038
anged-0.038
Token gazed
Token ID50255
Feature activation-3.48e+00

Pos logit contributions
arted+0.709
ur+0.687
ked+0.673
anged+0.635
 stro+0.625
Neg logit contributions
 best-0.490
 world-0.487
 lot-0.472
 thing-0.463
 things-0.459
Token out
Token ID503
Feature activation-2.96e+00

Pos logit contributions
arted+0.173
ur+0.170
ked+0.165
anged+0.161
 stro+0.159
Neg logit contributions
 best-0.180
 world-0.177
 thing-0.172
 things-0.172
 lot-0.171
Token.
Token ID13
Feature activation+4.73e+00

Pos logit contributions
 best+0.451
 world+0.438
 lot+0.432
 thing+0.420
 things+0.409
Neg logit contributions
arted-0.646
ur-0.627
ked-0.606
anged-0.585
 stro-0.571
Token
Token ID220
Feature activation+3.47e-01

Pos logit contributions
 best+0.244
 world+0.241
 lot+0.237
 thing+0.227
 things+0.225
Neg logit contributions
arted-0.240
ked-0.232
ur-0.231
anged-0.217
 stro-0.211
Token\n
Token ID198
Feature activation-3.08e+00

Pos logit contributions
 best+0.015
 world+0.014
 biggest+0.013
 lot+0.013
 whole+0.013
Neg logit contributions
arted-0.021
ked-0.021
ur-0.020
anged-0.020
 stro-0.020
Token\n
Token ID198
Feature activation-3.05e+00

Pos logit contributions
arted+0.187
ked+0.181
ur+0.177
anged+0.174
oured+0.162
Neg logit contributions
 lot-0.133
 world-0.131
 best-0.130
 whole-0.120
 story-0.119
TokenThe
Token ID464
Feature activation+4.04e+00

Pos logit contributions
arted+0.153
ur+0.152
 stro+0.143
ked+0.141
anged+0.130
Neg logit contributions
 best-0.154
 world-0.148
 thing-0.147
 biggest-0.145
 things-0.145
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 biggest+0.119
 thing+0.110
 world+0.109
 lot+0.108
 things+0.104
Neg logit contributions
 stro-0.233
arted-0.224
ur-0.219
ked-0.208
 hes-0.207
Token.
Token ID13
Feature activation+4.89e+00

Pos logit contributions
 best+0.902
 world+0.872
 biggest+0.871
 lot+0.861
 thing+0.828
Neg logit contributions
ur-1.164
arted-1.136
 stro-1.070
ked-1.056
anged-1.039
Token<|endoftext|>
Token ID50256
Feature activation+9.06e-01

Pos logit contributions
 best+0.247
 world+0.245
 lot+0.240
 thing+0.236
 things+0.231
Neg logit contributions
arted-0.251
ur-0.243
ked-0.235
anged-0.223
 stro-0.222
TokenThe
Token ID464
Feature activation+4.08e+00

Pos logit contributions
 best+0.043
 lot+0.041
 world+0.041
 thing+0.041
 things+0.040
Neg logit contributions
arted-0.048
ur-0.047
ked-0.045
 stro-0.043
anged-0.043
Token sun
Token ID4252
Feature activation+7.24e+00

Pos logit contributions
 best+0.124
 lot+0.120
 thing+0.119
 world+0.113
 problem+0.112
Neg logit contributions
arted-0.277
ur-0.264
ked-0.256
 stro-0.251
anged-0.244
Token was
Token ID373
Feature activation+7.55e+00

Pos logit contributions
 best+0.248
 world+0.241
 lot+0.236
 thing+0.231
 things+0.225
Neg logit contributions
arted-0.481
ur-0.469
ked-0.460
 stro-0.449
anged-0.441
Token.
Token ID13
Feature activation+4.68e+00

Pos logit contributions
arted+0.181
ur+0.176
ked+0.171
 stro+0.163
anged+0.162
Neg logit contributions
 best-0.125
 world-0.123
 lot-0.119
 thing-0.119
 things-0.116
Token The
Token ID383
Feature activation-5.30e+00

Pos logit contributions
 world+0.258
 best+0.255
 lot+0.249
 thing+0.249
 things+0.242
Neg logit contributions
arted-0.219
ur-0.211
ked-0.210
 stro-0.197
anged-0.193
Token crack
Token ID8469
Feature activation-8.49e-01

Pos logit contributions
ur+0.310
arted+0.294
anged+0.292
 stro+0.288
ked+0.283
Neg logit contributions
 best-0.158
 lot-0.153
iest-0.141
 biggest-0.141
 world-0.132
Token was
Token ID373
Feature activation+7.56e+00

Pos logit contributions
arted+0.049
ur+0.047
ked+0.047
 stro+0.044
anged+0.043
Neg logit contributions
 lot-0.036
 best-0.035
 world-0.034
 biggest-0.034
 things-0.032
Token the
Token ID262
Feature activation+4.25e-01

Pos logit contributions
 best+0.298
 world+0.290
 lot+0.284
 thing+0.281
 things+0.273
Neg logit contributions
arted-0.472
ur-0.462
ked-0.447
 stro-0.431
anged-0.429
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 lot+0.013
 best+0.012
 world+0.011
 things+0.011
 thing+0.011
Neg logit contributions
ur-0.025
arted-0.024
ked-0.024
 stro-0.023
anged-0.023
Token of
Token ID286
Feature activation+5.83e+00

Pos logit contributions
 best+0.946
 lot+0.935
 world+0.927
 biggest+0.904
 thing+0.882
Neg logit contributions
arted-1.117
ur-1.113
ked-1.028
 stro-0.999
anged-0.987
Token her
Token ID607
Feature activation-1.24e+00

Pos logit contributions
 best+0.275
 lot+0.269
 world+0.268
 thing+0.266
 things+0.262
Neg logit contributions
arted-0.320
ur-0.320
ked-0.308
anged-0.296
 stro-0.292
Token adventures
Token ID17545
Feature activation+2.78e+00

Pos logit contributions
ur+0.081
arted+0.079
anged+0.078
ked+0.076
 stro+0.075
Neg logit contributions
 best-0.038
 lot-0.038
 world-0.037
 thing-0.037
 biggest-0.036
Token.
Token ID13
Feature activation+4.91e+00

Pos logit contributions
 best+0.142
 lot+0.141
 world+0.139
 biggest+0.136
 thing+0.136
Neg logit contributions
arted-0.136
ur-0.133
ked-0.127
 stro-0.122
anged-0.120
Token<|endoftext|>
Token ID50256
Feature activation+1.03e+00

Pos logit contributions
 best+0.273
 world+0.272
 lot+0.266
 thing+0.263
 things+0.257
Neg logit contributions
arted-0.227
ur-0.220
ked-0.216
 stro-0.202
anged-0.201
Token all
Token ID477
Feature activation+4.17e+00

Pos logit contributions
 best+0.324
 world+0.321
 lot+0.317
 thing+0.316
 things+0.308
Neg logit contributions
arted-0.317
ur-0.308
ked-0.295
 stro-0.283
anged-0.276
Token day
Token ID1110
Feature activation+1.53e+01

Pos logit contributions
 best+0.070
 lot+0.067
 biggest+0.063
 problem+0.062
 world+0.060
Neg logit contributions
arted-0.334
ur-0.318
anged-0.302
ked-0.298
 stro-0.291
Token long
Token ID890
Feature activation+5.29e+00

Pos logit contributions
 best+0.494
 world+0.482
 lot+0.467
 thing+0.466
 things+0.448
Neg logit contributions
arted-1.049
ur-1.029
ked-0.998
 stro-0.968
anged-0.958
Token.
Token ID13
Feature activation+4.62e+00

Pos logit contributions
 best+0.207
 world+0.203
 lot+0.198
 thing+0.196
 things+0.190
Neg logit contributions
arted-0.329
ur-0.322
ked-0.311
 stro-0.301
anged-0.298
Token The
Token ID383
Feature activation-5.30e+00

Pos logit contributions
 lot+0.249
 best+0.247
 world+0.246
 thing+0.236
 things+0.235
Neg logit contributions
arted-0.224
ur-0.217
ked-0.215
anged-0.199
 stro-0.195
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
 stro+0.304
ked+0.303
arted+0.291
ur+0.281
 thanking+0.277
Neg logit contributions
iest-0.126
 biggest-0.124
 world-0.119
 problem-0.116
 strongest-0.111
Token.
Token ID13
Feature activation+4.70e+00

Pos logit contributions
 best+0.896
 world+0.867
 biggest+0.847
 lot+0.840
 thing+0.804
Neg logit contributions
ur-1.164
arted-1.156
ked-1.078
 stro-1.070
anged-1.056
Token<|endoftext|>
Token ID50256
Feature activation+8.11e-01

Pos logit contributions
 best+0.246
 world+0.242
 lot+0.237
 thing+0.236
 things+0.231
Neg logit contributions
arted-0.230
ur-0.224
ked-0.214
 stro-0.205
anged-0.202
TokenOnce
Token ID7454
Feature activation+2.95e+00

Pos logit contributions
 best+0.040
 world+0.040
 lot+0.039
 thing+0.039
 things+0.038
Neg logit contributions
arted-0.042
ur-0.041
ked-0.040
 stro-0.038
anged-0.038
Token upon
Token ID2402
Feature activation+5.24e+00

Pos logit contributions
 best+0.146
 world+0.145
 lot+0.141
 thing+0.141
 things+0.138
Neg logit contributions
arted-0.150
ur-0.146
ked-0.138
anged-0.133
 stro-0.132
Token a
Token ID257
Feature activation+5.32e+00

Pos logit contributions
 problem+0.257
 best+0.254
 world+0.250
 thing+0.241
 biggest+0.239
Neg logit contributions
 stro-0.244
ur-0.241
ked-0.240
arted-0.238
oured-0.207
Token all
Token ID477
Feature activation+4.17e+00

Pos logit contributions
arted+0.106
ur+0.103
ked+0.098
 stro+0.094
anged+0.092
Neg logit contributions
 best-0.103
 world-0.102
 thing-0.098
 lot-0.098
 biggest-0.097
Token the
Token ID262
Feature activation+3.16e-01

Pos logit contributions
 best+0.229
 world+0.223
 lot+0.222
 biggest+0.219
 problem+0.216
Neg logit contributions
arted-0.186
ur-0.178
ked-0.160
 stro-0.157
anged-0.154
Token animals
Token ID4695
Feature activation+7.92e+00

Pos logit contributions
 thing+0.009
 puzzle+0.009
 lot+0.009
 biggest+0.009
iest+0.009
Neg logit contributions
arted-0.018
ur-0.017
ked-0.017
 stro-0.016
anged-0.016
Token.
Token ID13
Feature activation+4.63e+00

Pos logit contributions
 best+0.356
 world+0.356
 lot+0.353
 thing+0.348
 biggest+0.343
Neg logit contributions
ur-0.425
arted-0.420
ked-0.399
 stro-0.392
anged-0.377
Token The
Token ID383
Feature activation-5.28e+00

Pos logit contributions
 world+0.253
 lot+0.252
 best+0.247
 thing+0.236
 day+0.236
Neg logit contributions
arted-0.223
ked-0.215
ur-0.213
anged-0.197
 stro-0.197
Token end
Token ID886
Feature activation+2.06e+01

Pos logit contributions
arted+0.277
ked+0.274
ur+0.261
 stro+0.251
anged+0.244
Neg logit contributions
 thing-0.181
iest-0.177
 lot-0.175
 task-0.172
 world-0.171
Token.
Token ID13
Feature activation+4.74e+00

Pos logit contributions
 best+0.909
 world+0.880
 biggest+0.879
 lot+0.855
 thing+0.825
Neg logit contributions
ur-1.134
arted-1.104
 stro-1.040
ked-1.032
anged-1.014
Token<|endoftext|>
Token ID50256
Feature activation+8.43e-01

Pos logit contributions
 best+0.232
 world+0.231
 lot+0.227
 thing+0.221
 things+0.216
Neg logit contributions
arted-0.253
ur-0.245
ked-0.239
 stro-0.225
anged-0.225
TokenOnce
Token ID7454
Feature activation+3.13e+00

Pos logit contributions
 best+0.043
 world+0.043
 lot+0.042
 thing+0.042
 things+0.041
Neg logit contributions
arted-0.042
ur-0.041
ked-0.039
 stro-0.037
anged-0.037
Token upon
Token ID2402
Feature activation+5.52e+00

Pos logit contributions
 best+0.155
 world+0.153
 lot+0.150
 thing+0.149
 things+0.146
Neg logit contributions
arted-0.160
ur-0.156
ked-0.149
 stro-0.142
anged-0.142
Token a
Token ID257
Feature activation+5.37e+00

Pos logit contributions
 problem+0.278
 best+0.271
 world+0.268
 thing+0.259
 biggest+0.256
Neg logit contributions
 stro-0.252
ur-0.248
ked-0.245
arted-0.243
oured-0.212
Token said
Token ID531
Feature activation-1.18e+01

Pos logit contributions
 best+0.215
 world+0.206
 lot+0.204
 thing+0.202
 things+0.196
Neg logit contributions
arted-0.391
ur-0.382
ked-0.368
anged-0.355
 stro-0.353
Token,
Token ID11
Feature activation-2.88e+00

Pos logit contributions
arted+0.674
ur+0.640
ked+0.621
anged+0.598
 stro+0.595
Neg logit contributions
 best-0.543
 world-0.519
 thing-0.519
 lot-0.519
 things-0.508
Token "
Token ID366
Feature activation+5.25e-01

Pos logit contributions
arted+0.195
ur+0.187
anged+0.180
oured+0.173
ked+0.172
Neg logit contributions
 lot-0.103
 best-0.101
 world-0.099
 things-0.097
 thing-0.096
TokenAnt
Token ID13217
Feature activation-5.19e+00

Pos logit contributions
 world+0.023
 best+0.022
 thing+0.022
 lot+0.022
 things+0.021
Neg logit contributions
arted-0.031
ur-0.031
ked-0.029
 stro-0.028
anged-0.028
Tokenif
Token ID361
Feature activation+4.97e+00

Pos logit contributions
ur+0.228
anged+0.210
arted+0.206
ked+0.204
 stro+0.197
Neg logit contributions
 world-0.292
 biggest-0.289
 best-0.288
 lot-0.279
 thing-0.276
Tokenree
Token ID631
Feature activation-3.05e+01

Pos logit contributions
 world+0.196
 best+0.182
 universe+0.180
 game+0.177
 thing+0.175
Neg logit contributions
ur-0.301
arted-0.291
ked-0.278
olded-0.271
anged-0.270
Tokenze
Token ID2736
Feature activation-1.08e+01

Pos logit contributions
arted+1.512
ur+1.462
ked+1.394
 stro+1.331
anged+1.325
Neg logit contributions
 best-1.583
 world-1.551
 lot-1.535
 thing-1.525
 things-1.498
Token.
Token ID13
Feature activation+2.83e+00

Pos logit contributions
arted+0.619
ur+0.618
ked+0.592
 stro+0.561
anged+0.547
Neg logit contributions
 world-0.447
 best-0.443
 biggest-0.418
 lot-0.409
 thing-0.407
Token Poison
Token ID19357
Feature activation-2.06e+00

Pos logit contributions
 thing+0.115
 best+0.114
 world+0.112
 lot+0.111
 things+0.109
Neg logit contributions
arted-0.176
ked-0.158
ur-0.157
anged-0.152
oured-0.148
Token.
Token ID13
Feature activation+2.92e+00

Pos logit contributions
ur+0.153
arted+0.153
ked+0.148
anged+0.141
 stro+0.140
Neg logit contributions
 best-0.052
 world-0.051
 lot-0.049
 biggest-0.047
 thing-0.045
Token Do
Token ID2141
Feature activation+5.44e-01

Pos logit contributions
 best+0.128
 thing+0.126
 world+0.126
 lot+0.126
 things+0.122
Neg logit contributions
arted-0.170
ur-0.153
ked-0.153
anged-0.146
oured-0.144
Token of
Token ID286
Feature activation+3.60e+00

Pos logit contributions
ur+0.312
arted+0.308
ked+0.284
 stro+0.275
anged+0.267
Neg logit contributions
 world-0.302
 best-0.300
 lot-0.294
 biggest-0.288
 thing-0.286
Token a
Token ID257
Feature activation+3.15e+00

Pos logit contributions
 world+0.173
 problem+0.169
 best+0.169
 lot+0.168
 thing+0.168
Neg logit contributions
arted-0.174
ur-0.169
ked-0.162
 stro-0.158
anged-0.146
Token plan
Token ID1410
Feature activation+7.50e+00

Pos logit contributions
iest+0.067
 things+0.067
 thing+0.065
 world+0.064
 biggest+0.063
Neg logit contributions
 stro-0.197
ur-0.196
arted-0.188
 pat-0.182
ree-0.178
Token to
Token ID284
Feature activation-6.98e+00

Pos logit contributions
 best+0.307
 world+0.301
 lot+0.295
 thing+0.290
 things+0.283
Neg logit contributions
arted-0.450
ur-0.442
ked-0.426
 stro-0.410
anged-0.406
Token unf
Token ID3684
Feature activation-1.25e+01

Pos logit contributions
arted+0.385
ur+0.375
ked+0.361
 stro+0.347
anged+0.343
Neg logit contributions
 best-0.323
 world-0.317
 lot-0.311
 thing-0.309
 things-0.302
Tokenree
Token ID631
Feature activation-3.05e+01

Pos logit contributions
arted+0.373
 stro+0.360
ged+0.332
ked+0.327
atted+0.310
Neg logit contributions
 world-0.791
 thing-0.790
 end-0.726
 story-0.709
 game-0.704
Tokenze
Token ID2736
Feature activation-1.09e+01

Pos logit contributions
arted+1.690
ur+1.609
ked+1.520
anged+1.508
 stro+1.454
Neg logit contributions
 best-1.411
 world-1.383
 things-1.367
 thing-1.355
 lot-1.350
Token the
Token ID262
Feature activation-1.78e+00

Pos logit contributions
arted+0.613
ur+0.597
ked+0.576
 stro+0.553
anged+0.548
Neg logit contributions
 best-0.490
 world-0.481
 lot-0.471
 thing-0.468
 things-0.457
Token saf
Token ID1932
Feature activation-1.13e+00

Pos logit contributions
arted+0.105
ur+0.104
ked+0.099
 stro+0.093
anged+0.092
Neg logit contributions
 best-0.065
 thing-0.064
 lot-0.060
 things-0.059
 world-0.059
Tokenari
Token ID2743
Feature activation-3.65e+00

Pos logit contributions
arted+0.050
ur+0.050
ked+0.047
 stro+0.043
anged+0.043
Neg logit contributions
 best-0.064
 world-0.062
 lot-0.062
 thing-0.062
 things-0.060
Token.
Token ID13
Feature activation+2.47e+00

Pos logit contributions
arted+0.207
ur+0.204
ked+0.196
 stro+0.188
anged+0.183
Neg logit contributions
 best-0.159
 lot-0.155
 world-0.152
 thing-0.149
 biggest-0.148
TokenNow
Token ID3844
Feature activation-3.25e-01

Pos logit contributions
 world+0.069
 best+0.069
 lot+0.066
 thing+0.066
 things+0.063
Neg logit contributions
arted-0.108
ur-0.107
ked-0.102
anged-0.099
 stro-0.097
Token,
Token ID11
Feature activation-2.04e+00

Pos logit contributions
arted+0.020
ur+0.020
ked+0.019
 stro+0.018
anged+0.018
Neg logit contributions
 best-0.013
 world-0.012
 lot-0.012
 thing-0.012
 biggest-0.011
Token I
Token ID314
Feature activation+8.26e+00

Pos logit contributions
arted+0.106
ur+0.105
ked+0.100
 stro+0.097
anged+0.094
Neg logit contributions
 best-0.098
 world-0.097
 lot-0.095
 thing-0.093
 things-0.091
Token will
Token ID481
Feature activation+1.15e+01

Pos logit contributions
 best+0.317
 world+0.313
 lot+0.302
 thing+0.301
 things+0.296
Neg logit contributions
arted-0.511
ur-0.508
ked-0.487
 stro-0.478
anged-0.468
Token unf
Token ID3684
Feature activation-1.25e+01

Pos logit contributions
 best+0.512
 world+0.503
 lot+0.494
 thing+0.489
 things+0.476
Neg logit contributions
arted-0.658
ur-0.635
ked-0.618
anged-0.590
 stro-0.586
Tokenree
Token ID631
Feature activation-3.04e+01

Pos logit contributions
arted+0.314
 stro+0.294
ked+0.290
ged+0.273
 inspected+0.253
Neg logit contributions
 thing-0.802
 world-0.796
 end-0.783
 story-0.754
 game-0.726
Tokenze
Token ID2736
Feature activation-1.08e+01

Pos logit contributions
arted+1.750
ur+1.700
ked+1.613
anged+1.579
 stro+1.555
Neg logit contributions
 best-1.331
 world-1.293
 lot-1.273
 thing-1.273
 things-1.260
Token you
Token ID345
Feature activation+8.25e+00

Pos logit contributions
arted+0.594
ur+0.581
ked+0.559
 stro+0.537
anged+0.527
Neg logit contributions
 best-0.498
 world-0.488
 lot-0.479
 thing-0.473
 things-0.462
Token all
Token ID477
Feature activation+2.42e+00

Pos logit contributions
 best+0.371
 world+0.367
 lot+0.363
 thing+0.360
 things+0.350
Neg logit contributions
arted-0.451
ur-0.446
ked-0.427
 stro-0.409
anged-0.398
Token!"
Token ID2474
Feature activation-1.63e+00

Pos logit contributions
 best+0.107
 lot+0.102
 world+0.102
 thing+0.100
 biggest+0.099
Neg logit contributions
arted-0.134
ur-0.131
ked-0.121
 stro-0.119
anged-0.115
Token He
Token ID679
Feature activation-2.49e+00

Pos logit contributions
arted+0.094
ked+0.089
ur+0.088
anged+0.085
oured+0.080
Neg logit contributions
 lot-0.070
 world-0.070
 things-0.068
 best-0.068
 day-0.067
Token so
Token ID523
Feature activation+2.05e+00

Pos logit contributions
arted+0.027
ur+0.026
ked+0.025
 stro+0.024
anged+0.023
Neg logit contributions
 world-0.028
 best-0.028
 lot-0.027
 thing-0.027
 things-0.026
Token she
Token ID673
Feature activation+8.20e+00

Pos logit contributions
 world+0.094
 problem+0.084
 story+0.083
 game+0.081
 best+0.080
Neg logit contributions
ur-0.115
arted-0.106
 stro-0.105
ked-0.102
anged-0.100
Token could
Token ID714
Feature activation+4.70e+00

Pos logit contributions
 world+0.331
 best+0.330
 thing+0.318
 lot+0.318
 things+0.307
Neg logit contributions
arted-0.493
ur-0.491
ked-0.471
 stro-0.461
anged-0.446
Token run
Token ID1057
Feature activation-1.76e+00

Pos logit contributions
 best+0.220
 world+0.216
 lot+0.211
 thing+0.211
 things+0.206
Neg logit contributions
arted-0.255
ur-0.250
ked-0.240
 stro-0.232
anged-0.227
Token fr
Token ID1216
Feature activation-2.42e+00

Pos logit contributions
arted+0.084
ur+0.082
ked+0.078
 stro+0.074
anged+0.073
Neg logit contributions
 best-0.094
 world-0.092
 lot-0.090
 thing-0.090
 things-0.088
Tokenree
Token ID631
Feature activation-3.02e+01

Pos logit contributions
arted+0.142
ur+0.138
ked+0.135
 stro+0.130
anged+0.128
Neg logit contributions
 best-0.103
 world-0.101
 lot-0.099
 thing-0.098
 things-0.097
Tokenly
Token ID306
Feature activation-5.51e+00

Pos logit contributions
arted+1.560
ur+1.517
ked+1.456
 stro+1.394
anged+1.382
Neg logit contributions
 best-1.497
 world-1.471
 lot-1.445
 thing-1.433
 things-1.405
Token.
Token ID13
Feature activation+2.58e+00

Pos logit contributions
arted+0.281
ur+0.275
ked+0.270
 stro+0.263
anged+0.245
Neg logit contributions
 best-0.261
 world-0.248
 biggest-0.247
 important-0.246
 lot-0.237
Token\n
Token ID198
Feature activation-5.03e+00

Pos logit contributions
 world+0.108
 lot+0.108
 best+0.103
 day+0.099
 things+0.098
Neg logit contributions
arted-0.159
ked-0.159
ur-0.153
anged-0.146
 stro-0.141
Token\n
Token ID198
Feature activation-4.99e+00

Pos logit contributions
arted+0.302
ked+0.296
ur+0.291
anged+0.276
 stro+0.272
Neg logit contributions
 world-0.223
 lot-0.220
 best-0.216
 thing-0.206
 story-0.202
TokenThe
Token ID464
Feature activation+2.05e+00

Pos logit contributions
arted+0.300
ur+0.295
 stro+0.280
ked+0.275
ree+0.269
Neg logit contributions
 best-0.202
 thing-0.186
 problem-0.186
 biggest-0.186
 world-0.184
Token teacher
Token ID4701
Feature activation-2.52e+00

Pos logit contributions
arted+0.025
ur+0.023
anged+0.022
olded+0.021
ked+0.021
Neg logit contributions
 world-0.025
 best-0.025
 lot-0.025
 thing-0.024
 biggest-0.024
Token told
Token ID1297
Feature activation-1.28e+01

Pos logit contributions
arted+0.134
ur+0.128
ked+0.122
anged+0.115
pered+0.112
Neg logit contributions
 best-0.126
 world-0.121
 thing-0.118
 lot-0.118
 things-0.117
Token them
Token ID606
Feature activation-3.06e+00

Pos logit contributions
arted+0.618
ur+0.609
ked+0.583
 stro+0.555
anged+0.554
Neg logit contributions
 best-0.676
 lot-0.670
 world-0.664
 thing-0.658
 things-0.656
Token to
Token ID284
Feature activation-5.54e+00

Pos logit contributions
ur+0.140
arted+0.140
ked+0.130
 stro+0.126
anged+0.121
Neg logit contributions
 world-0.167
 best-0.163
 lot-0.158
 thing-0.157
 biggest-0.157
Token unf
Token ID3684
Feature activation-1.10e+01

Pos logit contributions
arted+0.331
ur+0.320
ked+0.314
anged+0.299
 stro+0.298
Neg logit contributions
 best-0.234
 world-0.230
 lot-0.226
 thing-0.223
 things-0.217
Tokenree
Token ID631
Feature activation-2.89e+01

Pos logit contributions
arted+0.262
 stro+0.249
 Joe+0.242
 Sammy+0.240
ked+0.225
Neg logit contributions
 thing-0.736
 world-0.692
 game-0.661
 lot-0.659
 end-0.642
Tokenze
Token ID2736
Feature activation-9.30e+00

Pos logit contributions
arted+1.738
ur+1.672
ked+1.552
anged+1.548
 stro+1.500
Neg logit contributions
 best-1.215
 things-1.192
 thing-1.178
 world-1.164
 lot-1.157
Token and
Token ID290
Feature activation-4.24e+00

Pos logit contributions
arted+0.529
ur+0.516
ked+0.495
 stro+0.477
anged+0.475
Neg logit contributions
 best-0.417
 world-0.409
 lot-0.402
 thing-0.401
 things-0.393
Token sit
Token ID1650
Feature activation-4.60e+00

Pos logit contributions
arted+0.231
ur+0.223
ked+0.216
anged+0.205
 stro+0.204
Neg logit contributions
 best-0.200
 world-0.195
 lot-0.194
 thing-0.192
 things-0.189
Token back
Token ID736
Feature activation-1.91e+00

Pos logit contributions
arted+0.260
ur+0.257
ked+0.239
anged+0.233
 stro+0.232
Neg logit contributions
 best-0.215
 world-0.207
 thing-0.203
 lot-0.202
 things-0.200
Token down
Token ID866
Feature activation-6.86e+00

Pos logit contributions
arted+0.115
ur+0.112
ked+0.108
 stro+0.104
anged+0.103
Neg logit contributions
 best-0.081
 world-0.079
 lot-0.077
 thing-0.077
 things-0.076
Token present
Token ID1944
Feature activation+1.16e+01

Pos logit contributions
 best+0.326
 world+0.324
 lot+0.314
 biggest+0.312
 important+0.304
Neg logit contributions
ur-0.568
 stro-0.559
oured-0.534
ked-0.524
anged-0.497
Token?"
Token ID1701
Feature activation-7.82e-01

Pos logit contributions
 best+0.479
 lot+0.475
 world+0.475
 thing+0.457
 things+0.451
Neg logit contributions
arted-0.665
ur-0.662
ked-0.629
 stro-0.603
anged-0.596
Token\n
Token ID198
Feature activation-3.61e+00

Pos logit contributions
arted+0.033
ur+0.032
ked+0.029
anged+0.028
oured+0.028
Neg logit contributions
 best-0.048
 lot-0.047
 world-0.047
 things-0.046
 thing-0.045
Token\n
Token ID198
Feature activation-3.59e+00

Pos logit contributions
arted+0.218
ur+0.213
ked+0.210
anged+0.202
oured+0.199
Neg logit contributions
 lot-0.155
 best-0.154
 world-0.153
 things-0.141
 thing-0.140
TokenB
Token ID33
Feature activation-3.30e+00

Pos logit contributions
arted+0.185
ur+0.180
ked+0.172
anged+0.167
oured+0.162
Neg logit contributions
 best-0.181
 world-0.181
 lot-0.180
 thing-0.174
 things-0.172
Tokenree
Token ID631
Feature activation-2.87e+01

Pos logit contributions
arted+0.127
ur+0.120
ked+0.115
 stro+0.112
anged+0.107
Neg logit contributions
 world-0.202
 best-0.201
 lot-0.197
 thing-0.195
 things-0.191
Token said
Token ID531
Feature activation-1.01e+01

Pos logit contributions
arted+1.619
ur+1.427
anged+1.419
pered+1.323
ked+1.289
Neg logit contributions
 thing-1.218
 is-1.217
 best-1.204
 lot-1.184
 important-1.159
Token,
Token ID11
Feature activation-1.36e+00

Pos logit contributions
arted+0.628
ur+0.576
anged+0.575
ked+0.567
 stro+0.552
Neg logit contributions
 best-0.440
 thing-0.420
 lot-0.399
 things-0.397
 world-0.386
Token "
Token ID366
Feature activation+2.29e+00

Pos logit contributions
arted+0.087
anged+0.084
ur+0.083
oured+0.077
ked+0.077
Neg logit contributions
 best-0.054
 lot-0.052
 thing-0.051
 world-0.051
 things-0.048
TokenWell
Token ID5779
Feature activation+1.69e+00

Pos logit contributions
 best+0.087
 problem+0.080
 lot+0.080
 world+0.079
 things+0.078
Neg logit contributions
arted-0.148
ur-0.142
ked-0.142
 stro-0.140
anged-0.131
Token,
Token ID11
Feature activation-1.29e+00

Pos logit contributions
 best+0.082
 world+0.081
 lot+0.079
 thing+0.078
 things+0.077
Neg logit contributions
arted-0.090
ur-0.088
ked-0.084
 stro-0.080
anged-0.080
Token the
Token ID262
Feature activation-1.74e+00

Pos logit contributions
 best+0.165
 world+0.161
 lot+0.160
 thing+0.157
 things+0.155
Neg logit contributions
arted-0.183
ur-0.180
ked-0.174
anged-0.165
 stro-0.165
Token kids
Token ID3988
Feature activation+4.42e+00

Pos logit contributions
arted+0.115
ur+0.107
 stro+0.103
anged+0.102
ked+0.100
Neg logit contributions
 thing-0.047
 biggest-0.045
 lot-0.044
 best-0.042
 world-0.041
Token had
Token ID550
Feature activation+9.08e+00

Pos logit contributions
 best+0.222
 world+0.218
 lot+0.213
 thing+0.211
 things+0.208
Neg logit contributions
arted-0.226
ur-0.219
ked-0.211
 stro-0.201
anged-0.199
Token their
Token ID511
Feature activation-2.17e+00

Pos logit contributions
 best+0.408
 world+0.403
 lot+0.393
 thing+0.392
 things+0.381
Neg logit contributions
arted-0.508
ur-0.496
ked-0.477
 stro-0.461
anged-0.452
Token t
Token ID256
Feature activation-6.10e+00

Pos logit contributions
arted+0.144
ur+0.143
ked+0.138
 stro+0.137
anged+0.137
Neg logit contributions
 thing-0.059
 world-0.057
 biggest-0.053
 lot-0.052
 best-0.051
Tokenamb
Token ID4131
Feature activation-2.70e+01

Pos logit contributions
arted+0.297
ur+0.290
ked+0.276
anged+0.263
 stro+0.262
Neg logit contributions
 best-0.322
 world-0.316
 lot-0.312
 thing-0.310
 things-0.303
Tokenour
Token ID454
Feature activation-7.27e+00

Pos logit contributions
arted+1.113
ur+1.079
ked+1.039
anged+0.981
 stro+0.940
Neg logit contributions
 world-1.615
 best-1.606
 lot-1.602
 thing-1.588
 biggest-1.526
Tokenines
Token ID1127
Feature activation-3.06e+00

Pos logit contributions
arted+0.374
ur+0.366
ked+0.354
 stro+0.340
anged+0.336
Neg logit contributions
 best-0.348
 world-0.343
 lot-0.337
 thing-0.335
 biggest-0.329
Token and
Token ID290
Feature activation-5.58e+00

Pos logit contributions
arted+0.162
ur+0.159
ked+0.154
 stro+0.149
anged+0.143
Neg logit contributions
 best-0.138
 world-0.136
 lot-0.134
 thing-0.133
 biggest-0.133
Token drums
Token ID24730
Feature activation-3.09e+00

Pos logit contributions
arted+0.298
ur+0.292
ked+0.279
 stro+0.272
anged+0.266
Neg logit contributions
 world-0.259
 best-0.258
 lot-0.252
 thing-0.250
 things-0.242
Token,
Token ID11
Feature activation-3.10e+00

Pos logit contributions
arted+0.175
ur+0.171
ked+0.166
 stro+0.158
anged+0.156
Neg logit contributions
 best-0.135
 world-0.134
 lot-0.131
 thing-0.130
 things-0.127
Token B
Token ID347
Feature activation-1.16e+01

Pos logit contributions
arted+0.522
anged+0.490
ur+0.487
ked+0.485
 stro+0.479
Neg logit contributions
 best-0.884
 lot-0.866
 world-0.861
 things-0.843
 thing-0.838
Tokenamb
Token ID4131
Feature activation-2.69e+01

Pos logit contributions
arted+0.499
ur+0.461
ked+0.455
 stro+0.450
anged+0.437
Neg logit contributions
 world-0.683
 best-0.675
 lot-0.651
 thing-0.644
 biggest-0.632
Tokeni
Token ID72
Feature activation-1.17e+01

Pos logit contributions
arted+0.965
ur+0.928
ked+0.874
 stro+0.820
anged+0.804
Neg logit contributions
 best-1.767
 world-1.744
 lot-1.719
 thing-1.710
 things-1.685
Token.
Token ID13
Feature activation+2.09e+00

Pos logit contributions
ur+0.678
arted+0.670
ked+0.650
 stro+0.630
anged+0.614
Neg logit contributions
 best-0.479
 world-0.472
 lot-0.462
 thing-0.458
 problem-0.452
Token B
Token ID347
Feature activation-1.17e+01

Pos logit contributions
 best+0.134
 world+0.133
 lot+0.131
 thing+0.129
 things+0.128
Neg logit contributions
arted-0.078
ur-0.075
ked-0.072
 stro-0.068
anged-0.067
Tokenamb
Token ID4131
Feature activation-2.69e+01

Pos logit contributions
arted+0.229
ur+0.197
ked+0.193
 stro+0.184
anged+0.177
Neg logit contributions
 world-0.959
 best-0.951
 lot-0.922
 thing-0.921
 things-0.911
Tokeni
Token ID72
Feature activation-1.17e+01

Pos logit contributions
arted+0.935
ur+0.878
ked+0.838
 stro+0.803
anged+0.774
Neg logit contributions
 best-1.793
 world-1.759
 lot-1.724
 thing-1.724
 things-1.718
Token loved
Token ID6151
Feature activation+1.77e+00

Pos logit contributions
ur+0.686
arted+0.664
 stro+0.647
ked+0.643
anged+0.620
Neg logit contributions
 best-0.456
 problem-0.445
 thing-0.444
 lot-0.443
 world-0.440
Token to
Token ID284
Feature activation-7.14e+00

Pos logit contributions
 best+0.102
 world+0.101
 lot+0.100
 things+0.099
 thing+0.099
Neg logit contributions
arted-0.079
ur-0.076
ked-0.074
anged-0.070
 stro-0.069
Token play
Token ID711
Feature activation-2.76e+00

Pos logit contributions
arted+0.401
ur+0.388
ked+0.376
 stro+0.358
anged+0.358
Neg logit contributions
 best-0.327
 world-0.322
 lot-0.318
 thing-0.314
 things-0.307
Token and
Token ID290
Feature activation-5.90e+00

Pos logit contributions
arted+0.162
ur+0.156
anged+0.150
ked+0.149
 stro+0.144
Neg logit contributions
 best-0.125
 lot-0.123
 world-0.123
 things-0.120
 thing-0.120
Token.
Token ID13
Feature activation+2.13e+00

Pos logit contributions
 best+0.020
 world+0.020
 lot+0.019
 thing+0.019
 things+0.018
Neg logit contributions
arted-0.038
ur-0.037
ked-0.035
 stro-0.034
anged-0.034
Token One
Token ID1881
Feature activation+3.03e+00

Pos logit contributions
 best+0.104
 world+0.103
 lot+0.102
 thing+0.099
 things+0.098
Neg logit contributions
arted-0.114
ur-0.110
ked-0.107
 stro-0.102
anged-0.102
Token day
Token ID1110
Feature activation+1.31e+01

Pos logit contributions
 best+0.042
 world+0.040
 biggest+0.039
 lot+0.037
 things+0.037
Neg logit contributions
ur-0.250
arted-0.247
ked-0.232
anged-0.231
 stro-0.231
Token,
Token ID11
Feature activation-3.46e+00

Pos logit contributions
 best+0.580
 lot+0.572
 thing+0.552
 things+0.546
 world+0.542
Neg logit contributions
arted-0.766
ur-0.726
ked-0.699
 stro-0.677
anged-0.668
Token B
Token ID347
Feature activation-1.17e+01

Pos logit contributions
arted+0.141
ur+0.137
ked+0.132
 stro+0.125
anged+0.121
Neg logit contributions
 world-0.207
 best-0.207
 thing-0.199
 lot-0.198
 things-0.194
Tokenamb
Token ID4131
Feature activation-2.69e+01

Pos logit contributions
arted+0.243
ked+0.207
ur+0.207
 stro+0.200
anged+0.188
Neg logit contributions
 world-0.942
 best-0.934
 lot-0.905
 thing-0.902
 things-0.896
Tokeni
Token ID72
Feature activation-1.17e+01

Pos logit contributions
arted+0.947
ur+0.891
ked+0.847
 stro+0.820
anged+0.782
Neg logit contributions
 best-1.778
 world-1.736
 thing-1.714
 things-1.711
 lot-1.702
Token saw
Token ID2497
Feature activation-4.83e+00

Pos logit contributions
arted+0.653
ur+0.633
ked+0.611
 stro+0.579
anged+0.578
Neg logit contributions
 best-0.542
 world-0.536
 lot-0.517
 thing-0.511
 things-0.507
Token his
Token ID465
Feature activation-8.02e+00

Pos logit contributions
arted+0.324
ur+0.317
ked+0.307
anged+0.297
 stro+0.294
Neg logit contributions
 best-0.173
 lot-0.164
 world-0.164
 thing-0.163
 things-0.160
Token friend
Token ID1545
Feature activation-4.54e-01

Pos logit contributions
arted+0.494
ur+0.487
ked+0.469
 stro+0.450
anged+0.444
Neg logit contributions
 world-0.304
 best-0.301
 lot-0.298
 thing-0.295
 things-0.286
Token Th
Token ID536
Feature activation-1.55e+01

Pos logit contributions
arted+0.025
ked+0.023
ur+0.022
anged+0.022
 stro+0.021
Neg logit contributions
 best-0.024
 world-0.021
 thing-0.021
 things-0.021
 story-0.021
Token j
Token ID474
Feature activation-4.93e-02

Pos logit contributions
 best+0.090
 world+0.088
 things+0.086
 biggest+0.085
 thing+0.083
Neg logit contributions
ur-0.175
 stro-0.175
ked-0.172
arted-0.169
ree-0.162
Tokenolly
Token ID5098
Feature activation+2.78e+00

Pos logit contributions
ur+0.002
arted+0.002
anged+0.002
ree+0.002
ked+0.002
Neg logit contributions
 best-0.003
 biggest-0.003
 things-0.003
 world-0.003
 thing-0.003
Token deer
Token ID20096
Feature activation-3.28e+00

Pos logit contributions
 best+0.099
 world+0.094
 biggest+0.093
 lot+0.091
 things+0.088
Neg logit contributions
ur-0.175
arted-0.167
 stro-0.165
ked-0.164
ree-0.154
Token named
Token ID3706
Feature activation-1.36e+01

Pos logit contributions
ur+0.125
arted+0.124
ked+0.117
 stro+0.113
anged+0.108
Neg logit contributions
 best-0.204
 world-0.199
 lot-0.196
 biggest-0.194
 thing-0.194
Token B
Token ID347
Feature activation-1.16e+01

Pos logit contributions
arted+0.522
anged+0.490
ur+0.487
ked+0.485
 stro+0.479
Neg logit contributions
 best-0.884
 lot-0.866
 world-0.861
 things-0.843
 thing-0.838
Tokenamb
Token ID4131
Feature activation-2.69e+01

Pos logit contributions
arted+0.499
ur+0.461
ked+0.455
 stro+0.450
anged+0.437
Neg logit contributions
 world-0.683
 best-0.675
 lot-0.651
 thing-0.644
 biggest-0.632
Tokeni
Token ID72
Feature activation-1.17e+01

Pos logit contributions
arted+0.965
ur+0.928
ked+0.874
 stro+0.820
anged+0.804
Neg logit contributions
 best-1.767
 world-1.744
 lot-1.719
 thing-1.710
 things-1.685
Token.
Token ID13
Feature activation+2.09e+00

Pos logit contributions
ur+0.678
arted+0.670
ked+0.650
 stro+0.630
anged+0.614
Neg logit contributions
 best-0.479
 world-0.472
 lot-0.462
 thing-0.458
 problem-0.452
Token B
Token ID347
Feature activation-1.17e+01

Pos logit contributions
 best+0.134
 world+0.133
 lot+0.131
 thing+0.129
 things+0.128
Neg logit contributions
arted-0.078
ur-0.075
ked-0.072
 stro-0.068
anged-0.067
Tokenamb
Token ID4131
Feature activation-2.69e+01

Pos logit contributions
arted+0.229
ur+0.197
ked+0.193
 stro+0.184
anged+0.177
Neg logit contributions
 world-0.959
 best-0.951
 lot-0.922
 thing-0.921
 things-0.911
Tokeni
Token ID72
Feature activation-1.17e+01

Pos logit contributions
arted+0.935
ur+0.878
ked+0.838
 stro+0.803
anged+0.774
Neg logit contributions
 best-1.793
 world-1.759
 lot-1.724
 thing-1.724
 things-1.718
Token in
Token ID287
Feature activation-3.27e+00

Pos logit contributions
arted+0.286
ur+0.282
 stro+0.269
ked+0.266
anged+0.263
Neg logit contributions
 best-0.206
 thing-0.198
 world-0.196
 things-0.195
 lot-0.194
Token a
Token ID257
Feature activation+2.95e+00

Pos logit contributions
arted+0.179
ur+0.172
ked+0.167
 stro+0.164
anged+0.155
Neg logit contributions
 best-0.148
 world-0.143
 biggest-0.138
 thing-0.138
 lot-0.135
Token trap
Token ID12840
Feature activation+3.40e+00

Pos logit contributions
 best+0.101
iest+0.096
 things+0.088
 biggest+0.086
 most+0.084
Neg logit contributions
arted-0.166
ur-0.166
ree-0.160
 stro-0.158
ked-0.148
Token.
Token ID13
Feature activation+2.30e+00

Pos logit contributions
 best+0.133
 world+0.131
 lot+0.127
 thing+0.126
 things+0.123
Neg logit contributions
arted-0.212
ur-0.207
ked-0.200
 stro-0.193
anged-0.192
Token B
Token ID347
Feature activation-1.16e+01

Pos logit contributions
 lot+0.112
 world+0.106
 best+0.105
 things+0.103
 thing+0.099
Neg logit contributions
arted-0.128
ur-0.123
anged-0.119
ked-0.117
oured-0.111
Tokenamb
Token ID4131
Feature activation-2.68e+01

Pos logit contributions
arted+0.233
ked+0.196
ur+0.196
 stro+0.195
anged+0.178
Neg logit contributions
 world-0.935
 best-0.920
 thing-0.885
 things-0.880
 lot-0.880
Tokeni
Token ID72
Feature activation-1.16e+01

Pos logit contributions
arted+1.001
ur+0.935
 stro+0.885
ked+0.879
ree+0.826
Neg logit contributions
 best-1.643
 things-1.591
 world-1.572
 thing-1.567
 lot-1.545
Token tried
Token ID3088
Feature activation-4.10e+00

Pos logit contributions
arted+0.677
ur+0.659
ked+0.627
anged+0.611
 stro+0.601
Neg logit contributions
 best-0.512
 world-0.491
 lot-0.489
 thing-0.476
 things-0.475
Token to
Token ID284
Feature activation-6.99e+00

Pos logit contributions
arted+0.238
ked+0.226
ur+0.225
anged+0.224
oured+0.222
Neg logit contributions
 lot-0.193
 best-0.190
 things-0.185
 thing-0.177
 good-0.172
Token help
Token ID1037
Feature activation-3.15e+00

Pos logit contributions
arted+0.449
anged+0.426
ur+0.420
ked+0.419
oured+0.408
Neg logit contributions
 lot-0.269
 thing-0.268
 best-0.265
 things-0.260
 game-0.257
Token Th
Token ID536
Feature activation-1.55e+01

Pos logit contributions
arted+0.157
ur+0.146
ked+0.140
anged+0.136
 stro+0.135
Neg logit contributions
 lot-0.172
 best-0.168
 things-0.167
 thing-0.164
 world-0.162
Token songs
Token ID7259
Feature activation+2.94e+00

Pos logit contributions
arted+0.191
ur+0.186
anged+0.177
ked+0.177
 stro+0.172
Neg logit contributions
 best-0.131
 lot-0.131
 world-0.130
 things-0.128
 thing-0.127
Token and
Token ID290
Feature activation-5.86e+00

Pos logit contributions
 best+0.140
 world+0.135
 lot+0.134
 thing+0.133
 things+0.130
Neg logit contributions
arted-0.162
ur-0.156
ked-0.149
anged-0.146
 stro-0.145
Token shaking
Token ID17275
Feature activation-6.24e+00

Pos logit contributions
arted+0.286
ur+0.277
ked+0.267
 stro+0.258
anged+0.251
Neg logit contributions
 best-0.304
 world-0.299
 lot-0.294
 thing-0.293
 biggest-0.285
Token her
Token ID607
Feature activation-3.70e+00

Pos logit contributions
arted+0.345
ur+0.337
ked+0.324
 stro+0.312
anged+0.308
Neg logit contributions
 best-0.288
 world-0.282
 lot-0.277
 thing-0.274
 things-0.268
Token t
Token ID256
Feature activation-5.82e+00

Pos logit contributions
arted+0.179
ur+0.172
ked+0.172
 stro+0.163
anged+0.161
Neg logit contributions
 best-0.185
 world-0.182
 lot-0.181
 thing-0.179
 biggest-0.174
Tokenamb
Token ID4131
Feature activation-2.68e+01

Pos logit contributions
arted+0.286
ur+0.279
ked+0.268
 stro+0.260
anged+0.250
Neg logit contributions
 best-0.303
 world-0.300
 lot-0.292
 thing-0.291
 things-0.285
Tokenour
Token ID454
Feature activation-7.11e+00

Pos logit contributions
arted+1.043
ur+1.008
ked+0.958
 stro+0.894
anged+0.890
Neg logit contributions
 best-1.685
 world-1.669
 lot-1.648
 thing-1.637
 things-1.603
Tokenine
Token ID500
Feature activation-1.01e+01

Pos logit contributions
arted+0.410
ur+0.399
ked+0.390
 stro+0.372
anged+0.369
Neg logit contributions
 best-0.304
 world-0.297
 lot-0.292
 thing-0.288
 biggest-0.281
Token.
Token ID13
Feature activation+2.40e+00

Pos logit contributions
arted+0.539
ked+0.532
ur+0.530
 stro+0.506
anged+0.485
Neg logit contributions
 best-0.439
 lot-0.435
 world-0.433
 thing-0.432
 biggest-0.415
Token Whenever
Token ID21326
Feature activation-2.73e+00

Pos logit contributions
 world+0.114
 best+0.113
 lot+0.112
 thing+0.107
 things+0.106
Neg logit contributions
arted-0.133
ur-0.128
ked-0.127
anged-0.119
 stro-0.118
Token she
Token ID673
Feature activation+7.93e+00

Pos logit contributions
 stro+0.117
ur+0.115
arted+0.115
ked+0.107
anged+0.103
Neg logit contributions
 biggest-0.132
 world-0.130
 lot-0.129
 best-0.129
 important-0.126
Token the
Token ID262
Feature activation-1.54e+00

Pos logit contributions
 best+0.145
 world+0.137
 biggest+0.136
 lot+0.135
 thing+0.134
Neg logit contributions
arted-0.223
ur-0.216
ked-0.204
 stro-0.203
anged-0.194
Token woods
Token ID16479
Feature activation+5.58e+00

Pos logit contributions
arted+0.095
ur+0.095
 stro+0.083
pered+0.082
ked+0.081
Neg logit contributions
 best-0.042
 lot-0.041
 puzzle-0.040
 thing-0.040
iest-0.038
Token.
Token ID13
Feature activation+2.60e+00

Pos logit contributions
 best+0.256
 lot+0.253
 world+0.247
 thing+0.244
 biggest+0.242
Neg logit contributions
arted-0.301
ur-0.299
ked-0.284
 stro-0.273
anged-0.268
Token They
Token ID1119
Feature activation+2.23e+00

Pos logit contributions
 world+0.125
 lot+0.125
 best+0.119
 biggest+0.116
 thing+0.114
Neg logit contributions
arted-0.130
ked-0.128
anged-0.126
ur-0.124
 stro-0.121
Token r
Token ID374
Feature activation+1.26e-01

Pos logit contributions
 best+0.099
 lot+0.096
 world+0.088
 things+0.087
 game+0.087
Neg logit contributions
arted-0.128
ur-0.121
ree-0.119
ked-0.117
anged-0.116
Tokenamb
Token ID4131
Feature activation-2.68e+01

Pos logit contributions
 best+0.006
 world+0.006
 things+0.006
 biggest+0.006
 lot+0.005
Neg logit contributions
 stro-0.006
ked-0.005
ur-0.005
oured-0.005
olded-0.005
Tokenled
Token ID992
Feature activation-9.84e+00

Pos logit contributions
arted+1.026
ur+0.974
ked+0.937
 stro+0.912
anged+0.834
Neg logit contributions
 best-1.647
 world-1.614
 things-1.589
 thing-1.585
 lot-1.579
Token through
Token ID832
Feature activation-6.73e+00

Pos logit contributions
arted+0.492
ur+0.482
ked+0.455
 stro+0.442
anged+0.436
Neg logit contributions
 best-0.514
 world-0.499
 things-0.498
 thing-0.494
 lot-0.494
Token the
Token ID262
Feature activation-1.49e+00

Pos logit contributions
arted+0.324
ur+0.308
ked+0.300
 stro+0.288
anged+0.282
Neg logit contributions
 best-0.352
 world-0.338
 thing-0.331
 biggest-0.330
 lot-0.328
Token trees
Token ID7150
Feature activation-5.96e-01

Pos logit contributions
arted+0.096
ur+0.093
 stro+0.084
ked+0.083
anged+0.082
Neg logit contributions
 best-0.052
 thing-0.048
 puzzle-0.045
 biggest-0.044
 game-0.044
Token,
Token ID11
Feature activation-2.89e+00

Pos logit contributions
arted+0.031
ur+0.030
ked+0.029
anged+0.028
 stro+0.028
Neg logit contributions
 best-0.030
 world-0.029
 lot-0.028
 thing-0.028
 things-0.027
Token.
Token ID13
Feature activation+2.61e+00

Pos logit contributions
 best+0.448
 world+0.440
 lot+0.431
 thing+0.428
 things+0.417
Neg logit contributions
arted-0.539
ur-0.526
ked-0.506
 stro-0.485
anged-0.481
Token
Token ID220
Feature activation-1.74e+00

Pos logit contributions
 lot+0.117
 best+0.115
 world+0.111
 things+0.109
 biggest+0.106
Neg logit contributions
arted-0.152
ur-0.149
anged-0.142
ked-0.141
 stro-0.135
Token\n
Token ID198
Feature activation-5.13e+00

Pos logit contributions
arted+0.105
ur+0.101
ked+0.100
 stro+0.097
anged+0.096
Neg logit contributions
 best-0.075
 lot-0.073
 world-0.072
 things-0.068
 thing-0.067
Token\n
Token ID198
Feature activation-5.11e+00

Pos logit contributions
arted+0.319
ked+0.311
ur+0.306
anged+0.293
 stro+0.288
Neg logit contributions
 lot-0.217
 best-0.211
 world-0.209
 things-0.196
 thing-0.191
TokenB
Token ID33
Feature activation-4.86e+00

Pos logit contributions
arted+0.268
ur+0.261
ked+0.251
 stro+0.240
anged+0.238
Neg logit contributions
 best-0.251
 world-0.247
 lot-0.244
 thing-0.241
 things-0.236
Tokenamb
Token ID4131
Feature activation-2.66e+01

Pos logit contributions
arted+0.124
ked+0.109
 stro+0.106
ur+0.098
anged+0.092
Neg logit contributions
 world-0.353
 best-0.346
 lot-0.345
 thing-0.339
 things-0.331
Tokeni
Token ID72
Feature activation-1.13e+01

Pos logit contributions
arted+1.042
ur+0.959
ked+0.944
 stro+0.912
anged+0.860
Neg logit contributions
 best-1.588
 things-1.541
 thing-1.532
 world-1.519
 problem-1.482
Token knew
Token ID2993
Feature activation-1.44e+00

Pos logit contributions
arted+0.670
ur+0.657
ked+0.615
anged+0.603
oured+0.593
Neg logit contributions
 best-0.481
 world-0.465
 lot-0.456
 things-0.447
 thing-0.444
Token he
Token ID339
Feature activation+3.95e+00

Pos logit contributions
arted+0.093
ur+0.088
ked+0.087
anged+0.085
oured+0.082
Neg logit contributions
 best-0.057
 things-0.053
 lot-0.053
 thing-0.052
 world-0.051
Token had
Token ID550
Feature activation+9.40e+00

Pos logit contributions
 world+0.152
 best+0.151
 lot+0.147
 thing+0.144
 biggest+0.142
Neg logit contributions
arted-0.246
ur-0.244
ked-0.237
 stro-0.228
anged-0.225
Token to
Token ID284
Feature activation-6.61e+00

Pos logit contributions
 best+0.568
 world+0.559
 lot+0.551
 thing+0.548
 things+0.539
Neg logit contributions
arted-0.391
ur-0.377
ked-0.359
 stro-0.339
anged-0.334
Token she
Token ID673
Feature activation+8.15e+00

Pos logit contributions
arted+0.403
ur+0.399
ked+0.379
anged+0.368
 stro+0.366
Neg logit contributions
 best-0.306
 things-0.296
 thing-0.294
 lot-0.288
 world-0.278
Token went
Token ID1816
Feature activation-1.21e+00

Pos logit contributions
 best+0.397
 world+0.376
 lot+0.367
 thing+0.367
 things+0.365
Neg logit contributions
arted-0.440
ur-0.417
ked-0.394
anged-0.375
 stro-0.371
Token,
Token ID11
Feature activation-2.95e+00

Pos logit contributions
arted+0.065
ur+0.063
ked+0.060
 stro+0.058
anged+0.057
Neg logit contributions
 best-0.058
 world-0.057
 lot-0.056
 thing-0.055
 things-0.054
Token the
Token ID262
Feature activation-1.70e+00

Pos logit contributions
arted+0.158
ur+0.154
ked+0.148
anged+0.145
 stro+0.136
Neg logit contributions
 best-0.143
 world-0.140
 lot-0.136
 things-0.135
 thing-0.135
Token hand
Token ID1021
Feature activation-3.70e+00

Pos logit contributions
arted+0.099
ur+0.097
ked+0.097
 stro+0.096
oured+0.091
Neg logit contributions
 best-0.054
 important-0.050
 lot-0.050
 world-0.049
 story-0.049
Tokenker
Token ID6122
Feature activation-2.64e+01

Pos logit contributions
arted+0.186
ur+0.179
ked+0.171
 stro+0.164
anged+0.163
Neg logit contributions
 best-0.189
 world-0.184
 thing-0.181
 lot-0.181
 things-0.177
Tokenchief
Token ID17351
Feature activation-1.52e+01

Pos logit contributions
arted+1.066
ur+1.030
anged+0.867
ked+0.849
olded+0.841
Neg logit contributions
 best-1.670
 things-1.654
 thing-1.635
 lot-1.605
 biggest-1.575
Token unw
Token ID7379
Feature activation-1.47e+01

Pos logit contributions
arted+0.743
ur+0.722
ked+0.689
 stro+0.661
anged+0.656
Neg logit contributions
 best-0.792
 world-0.776
 lot-0.762
 thing-0.758
 things-0.743
Tokeninding
Token ID6020
Feature activation-6.61e+00

Pos logit contributions
arted+0.749
ur+0.725
 stro+0.714
anged+0.683
olded+0.676
Neg logit contributions
 thing-0.685
 best-0.685
 world-0.663
 lot-0.659
 things-0.647
Token until
Token ID1566
Feature activation-6.56e+00

Pos logit contributions
arted+0.327
ur+0.316
ked+0.301
 stro+0.288
anged+0.284
Neg logit contributions
 best-0.337
 world-0.335
 lot-0.327
 thing-0.325
 things-0.315
Token she
Token ID673
Feature activation+7.91e+00

Pos logit contributions
arted+0.368
ur+0.358
ked+0.345
 stro+0.335
anged+0.328
Neg logit contributions
 best-0.292
 world-0.291
 lot-0.280
 thing-0.278
 biggest-0.272
Token let
Token ID1309
Feature activation-7.11e+00

Pos logit contributions
arted+0.405
ur+0.373
ked+0.368
anged+0.348
ree+0.329
Neg logit contributions
 best-0.484
 lot-0.479
 thing-0.468
 world-0.466
 things-0.463
Token go
Token ID467
Feature activation+3.83e+00

Pos logit contributions
arted+0.360
ur+0.350
ked+0.334
 stro+0.320
anged+0.314
Neg logit contributions
 best-0.356
 world-0.354
 lot-0.346
 thing-0.342
 things-0.336
Token of
Token ID286
Feature activation+3.57e+00

Pos logit contributions
 best+0.155
 world+0.151
 lot+0.148
 thing+0.146
 things+0.144
Neg logit contributions
arted-0.234
ked-0.227
ur-0.223
 stro-0.216
anged-0.211
Token the
Token ID262
Feature activation-1.74e+00

Pos logit contributions
 best+0.173
 world+0.169
 lot+0.168
 thing+0.165
 things+0.164
Neg logit contributions
arted-0.190
ur-0.186
ked-0.180
anged-0.171
 stro-0.171
Token hand
Token ID1021
Feature activation-3.67e+00

Pos logit contributions
ur+0.090
arted+0.088
 stro+0.085
oured+0.081
ked+0.080
Neg logit contributions
 best-0.057
 lot-0.053
iest-0.053
 world-0.053
 important-0.052
Tokenker
Token ID6122
Feature activation-2.63e+01

Pos logit contributions
arted+0.192
ur+0.183
ked+0.180
 stro+0.171
anged+0.170
Neg logit contributions
 best-0.182
 world-0.177
 thing-0.174
 lot-0.172
 things-0.170
Tokenchief
Token ID17351
Feature activation-1.50e+01

Pos logit contributions
arted+1.102
ur+1.065
olded+0.914
ked+0.907
anged+0.896
Neg logit contributions
 best-1.642
 things-1.631
 thing-1.568
 lot-1.548
 biggest-1.547
Token and
Token ID290
Feature activation-5.46e+00

Pos logit contributions
arted+0.802
ur+0.781
ked+0.753
 stro+0.722
anged+0.715
Neg logit contributions
 best-0.721
 world-0.708
 lot-0.693
 thing-0.688
 things-0.673
Token started
Token ID2067
Feature activation+1.41e-01

Pos logit contributions
arted+0.295
ur+0.274
ked+0.272
anged+0.256
oured+0.240
Neg logit contributions
 best-0.269
 lot-0.259
 thing-0.259
 things-0.258
 world-0.252
Token climbing
Token ID14281
Feature activation-8.40e+00

Pos logit contributions
 best+0.007
 things+0.007
 lot+0.007
 world+0.007
 thing+0.007
Neg logit contributions
arted-0.007
ur-0.007
ked-0.007
anged-0.006
oured-0.006
Token back
Token ID736
Feature activation-3.74e+00

Pos logit contributions
arted+0.457
ur+0.445
ked+0.428
 stro+0.411
anged+0.406
Neg logit contributions
 best-0.395
 world-0.388
 lot-0.380
 thing-0.377
 things-0.368
Token He
Token ID679
Feature activation-2.55e+00

Pos logit contributions
 world+0.104
 thing+0.096
 best+0.095
 lot+0.093
 things+0.092
Neg logit contributions
ked-0.140
ur-0.130
anged-0.129
arted-0.129
oured-0.123
Token took
Token ID1718
Feature activation-5.24e+00

Pos logit contributions
arted+0.136
ur+0.128
ked+0.124
anged+0.121
ree+0.115
Neg logit contributions
 best-0.128
 lot-0.122
 important-0.119
 thing-0.118
 world-0.116
Token out
Token ID503
Feature activation-5.39e+00

Pos logit contributions
arted+0.307
ur+0.300
ked+0.295
 stro+0.282
anged+0.281
Neg logit contributions
 best-0.237
 thing-0.228
 things-0.225
 lot-0.223
 world-0.215
Token a
Token ID257
Feature activation+3.31e+00

Pos logit contributions
arted+0.309
ur+0.300
ked+0.294
anged+0.284
 stro+0.273
Neg logit contributions
 best-0.243
 lot-0.241
 things-0.240
 thing-0.238
 world-0.228
Token hand
Token ID1021
Feature activation-3.49e+00

Pos logit contributions
 best+0.127
 world+0.124
 biggest+0.115
 thing+0.115
 things+0.115
Neg logit contributions
ur-0.200
arted-0.199
 stro-0.190
ked-0.189
anged-0.181
Tokenker
Token ID6122
Feature activation-2.62e+01

Pos logit contributions
arted+0.218
ur+0.192
ked+0.190
anged+0.187
oured+0.181
Neg logit contributions
 thing-0.145
 best-0.137
 things-0.130
 world-0.128
 day-0.127
Tokenchief
Token ID17351
Feature activation-1.49e+01

Pos logit contributions
arted+1.398
ur+1.309
olded+1.122
anged+1.113
pered+1.106
Neg logit contributions
 things-1.398
 thing-1.371
 best-1.317
 lot-1.310
 is-1.235
Token from
Token ID422
Feature activation-1.41e+01

Pos logit contributions
arted+0.810
ur+0.769
anged+0.751
ked+0.750
 stro+0.741
Neg logit contributions
 best-0.727
 world-0.699
 thing-0.676
 lot-0.664
 things-0.654
Token his
Token ID465
Feature activation-7.60e+00

Pos logit contributions
ur+0.850
ked+0.824
arted+0.817
olded+0.801
anged+0.798
Neg logit contributions
 thing-0.602
 things-0.580
 day-0.580
 lot-0.561
 best-0.550
Token pocket
Token ID10000
Feature activation+5.43e-01

Pos logit contributions
arted+0.391
ur+0.386
ked+0.368
 stro+0.362
anged+0.354
Neg logit contributions
 best-0.361
 world-0.359
 lot-0.351
 thing-0.349
 things-0.339
Token and
Token ID290
Feature activation-5.48e+00

Pos logit contributions
 best+0.027
 world+0.026
 lot+0.026
 thing+0.025
 things+0.025
Neg logit contributions
arted-0.029
ur-0.027
ked-0.027
anged-0.026
 stro-0.025
Token him
Token ID683
Feature activation-8.59e+00

Pos logit contributions
 best+0.006
 world+0.005
 lot+0.005
 thing+0.005
 things+0.005
Neg logit contributions
arted-0.003
ur-0.003
ked-0.002
 stro-0.002
anged-0.002
Token.
Token ID13
Feature activation+2.72e+00

Pos logit contributions
arted+0.459
ur+0.444
ked+0.429
 stro+0.412
anged+0.406
Neg logit contributions
 best-0.416
 world-0.407
 lot-0.398
 thing-0.397
 things-0.389
Token As
Token ID1081
Feature activation-5.11e+00

Pos logit contributions
 world+0.111
 lot+0.108
 best+0.101
 things+0.100
 day+0.097
Neg logit contributions
arted-0.166
ked-0.165
ur-0.161
anged-0.155
 stro-0.154
Token he
Token ID339
Feature activation+4.28e+00

Pos logit contributions
arted+0.286
ur+0.280
ked+0.268
 stro+0.259
anged+0.256
Neg logit contributions
 best-0.232
 world-0.228
 lot-0.222
 thing-0.221
 things-0.215
Token cl
Token ID537
Feature activation-8.77e+00

Pos logit contributions
 best+0.223
 world+0.219
 lot+0.217
 thing+0.216
 things+0.211
Neg logit contributions
arted-0.211
ur-0.204
ked-0.194
 stro-0.184
anged-0.183
Tokenamb
Token ID4131
Feature activation-2.61e+01

Pos logit contributions
arted+0.406
ur+0.369
ked+0.348
 stro+0.331
ree+0.327
Neg logit contributions
 thing-0.439
 best-0.430
 things-0.402
 lot-0.402
 world-0.394
Tokenered
Token ID1068
Feature activation-9.69e+00

Pos logit contributions
arted+0.979
ur+0.941
ked+0.883
 stro+0.850
anged+0.802
Neg logit contributions
 best-1.665
 world-1.611
 thing-1.602
 things-1.592
 lot-1.587
Token,
Token ID11
Feature activation-2.69e+00

Pos logit contributions
arted+0.503
ur+0.479
ked+0.476
anged+0.468
 stro+0.461
Neg logit contributions
 best-0.486
 things-0.479
 thing-0.469
 lot-0.465
 world-0.459
Token he
Token ID339
Feature activation+4.32e+00

Pos logit contributions
arted+0.141
ur+0.138
ked+0.134
 stro+0.130
anged+0.126
Neg logit contributions
 world-0.128
 best-0.127
 lot-0.121
 thing-0.120
 biggest-0.118
Token foolish
Token ID19538
Feature activation+7.44e+00

Pos logit contributions
 best+0.215
 world+0.209
 lot+0.207
 thing+0.203
 things+0.201
Neg logit contributions
arted-0.226
ur-0.220
ked-0.207
anged-0.198
 stro-0.196
Tokenly
Token ID306
Feature activation-4.97e+00

Pos logit contributions
 best+0.363
 world+0.354
 lot+0.349
 thing+0.341
 biggest+0.337
Neg logit contributions
arted-0.394
ur-0.384
ked-0.371
anged-0.352
 stro-0.351
Token delicious
Token ID12625
Feature activation+5.14e+00

Pos logit contributions
arted+0.656
ur+0.637
ked+0.618
 stro+0.590
anged+0.579
Neg logit contributions
 world-0.490
 best-0.489
 lot-0.470
 thing-0.465
 things-0.449
Token apples
Token ID22514
Feature activation-8.95e-01

Pos logit contributions
 best+0.195
 world+0.191
 lot+0.186
 thing+0.184
 things+0.179
Neg logit contributions
arted-0.326
ur-0.319
ked-0.308
 stro-0.298
anged-0.295
Token and
Token ID290
Feature activation-5.50e+00

Pos logit contributions
arted+0.045
ur+0.044
ked+0.042
 stro+0.041
anged+0.041
Neg logit contributions
 best-0.046
 world-0.045
 thing-0.044
 lot-0.044
 things-0.044
Token sweet
Token ID6029
Feature activation+1.08e+00

Pos logit contributions
arted+0.303
ur+0.295
ked+0.284
 stro+0.272
anged+0.270
Neg logit contributions
 best-0.255
 world-0.249
 lot-0.246
 thing-0.244
 things-0.239
Token water
Token ID1660
Feature activation+2.78e+00

Pos logit contributions
 best+0.041
 world+0.040
 lot+0.038
 thing+0.038
 biggest+0.037
Neg logit contributions
arted-0.068
ur-0.066
ked-0.064
 stro-0.063
anged-0.062
Tokenmel
Token ID17694
Feature activation-2.60e+01

Pos logit contributions
 best+0.121
 world+0.117
 lot+0.115
 thing+0.115
 things+0.113
Neg logit contributions
arted-0.162
ur-0.157
ked-0.151
 stro-0.146
anged-0.145
Tokenons
Token ID684
Feature activation+2.75e+00

Pos logit contributions
arted+1.343
ur+1.291
ked+1.282
 stro+1.226
oured+1.174
Neg logit contributions
 best-1.246
 world-1.237
 thing-1.191
 biggest-1.182
 lot-1.166
Token.
Token ID13
Feature activation+2.58e+00

Pos logit contributions
 best+0.108
 world+0.105
 lot+0.102
 thing+0.102
 things+0.099
Neg logit contributions
arted-0.172
ur-0.167
ked-0.162
 stro-0.156
anged-0.155
Token He
Token ID679
Feature activation-2.65e+00

Pos logit contributions
 lot+0.130
 world+0.130
 best+0.127
 biggest+0.121
 things+0.121
Neg logit contributions
ked-0.129
arted-0.127
ur-0.127
anged-0.120
 stro-0.116
Token also
Token ID635
Feature activation+6.03e+00

Pos logit contributions
arted+0.157
ur+0.151
ked+0.147
anged+0.140
 stro+0.137
Neg logit contributions
 best-0.115
 lot-0.111
 world-0.109
 thing-0.108
 things-0.105
Token grabbed
Token ID13176
Feature activation-1.17e+01

Pos logit contributions
 best+0.303
 lot+0.296
 world+0.293
 thing+0.287
 things+0.282
Neg logit contributions
arted-0.312
ur-0.302
ked-0.291
anged-0.273
 stro-0.269
Token.
Token ID13
Feature activation+3.13e+00

Pos logit contributions
arted+0.833
ur+0.788
ked+0.756
anged+0.739
 stro+0.730
Neg logit contributions
 best-0.332
 thing-0.304
 things-0.294
 lot-0.279
 world-0.279
Token\n
Token ID198
Feature activation-4.71e+00

Pos logit contributions
 lot+0.108
 best+0.107
 thing+0.100
 is+0.098
 things+0.098
Neg logit contributions
ur-0.207
oured-0.206
ked-0.206
arted-0.206
anged-0.199
Token\n
Token ID198
Feature activation-4.65e+00

Pos logit contributions
ked+0.321
arted+0.318
ur+0.309
oured+0.302
anged+0.297
Neg logit contributions
 lot-0.179
 world-0.167
 best-0.167
 thing-0.160
 things-0.155
Token"
Token ID1
Feature activation+6.73e+00

Pos logit contributions
arted+0.268
ur+0.262
ked+0.254
 stro+0.241
anged+0.240
Neg logit contributions
 best-0.207
 world-0.203
 lot-0.201
 thing-0.200
 things-0.194
TokenL
Token ID43
Feature activation-2.37e+00

Pos logit contributions
 best+0.308
 world+0.299
 lot+0.286
 problem+0.285
 thing+0.285
Neg logit contributions
arted-0.376
ur-0.362
ked-0.359
 stro-0.356
anged-0.330
Tokenamb
Token ID4131
Feature activation-2.60e+01

Pos logit contributions
arted+0.132
ur+0.129
ked+0.124
 stro+0.120
anged+0.118
Neg logit contributions
 best-0.109
 world-0.107
 lot-0.105
 thing-0.104
 things-0.102
Token?
Token ID30
Feature activation-1.38e-01

Pos logit contributions
arted+1.128
ur+1.066
ked+1.011
 stro+0.979
anged+0.925
Neg logit contributions
 best-1.521
 thing-1.473
 world-1.468
 things-1.455
 lot-1.447
Token Like
Token ID4525
Feature activation-6.99e+00

Pos logit contributions
arted+0.008
ur+0.007
ked+0.007
anged+0.007
 stro+0.007
Neg logit contributions
 best-0.006
 world-0.006
 thing-0.006
 lot-0.006
 things-0.006
Token sheep
Token ID15900
Feature activation-4.46e+00

Pos logit contributions
ur+0.389
arted+0.380
 stro+0.369
ked+0.363
anged+0.341
Neg logit contributions
 world-0.301
 best-0.295
 biggest-0.280
 thing-0.279
 lot-0.272
Token?"
Token ID1701
Feature activation-1.50e+00

Pos logit contributions
arted+0.241
ur+0.237
ked+0.227
 stro+0.219
anged+0.215
Neg logit contributions
 best-0.208
 world-0.205
 lot-0.201
 thing-0.199
 things-0.194
Token Mia
Token ID32189
Feature activation-1.07e+01

Pos logit contributions
arted+0.074
ked+0.073
ur+0.073
oured+0.071
anged+0.064
Neg logit contributions
 best-0.081
 lot-0.079
 things-0.078
 thing-0.078
 day-0.075
Token went
Token ID1816
Feature activation+5.82e-01

Pos logit contributions
 best+0.200
 lot+0.199
 game+0.180
 day+0.177
 whole+0.175
Neg logit contributions
arted-0.197
ur-0.182
anged-0.178
ree-0.176
ked-0.174
Token on
Token ID319
Feature activation+1.78e+00

Pos logit contributions
 best+0.028
 lot+0.026
 things+0.026
 world+0.025
 thing+0.025
Neg logit contributions
arted-0.032
ked-0.031
 stro-0.031
ur-0.031
anged-0.030
Token the
Token ID262
Feature activation+1.96e-01

Pos logit contributions
 best+0.075
 lot+0.075
 world+0.072
 things+0.071
 thing+0.068
Neg logit contributions
arted-0.111
ur-0.111
ked-0.108
olded-0.103
anged-0.102
Token boat
Token ID8848
Feature activation+8.54e+00

Pos logit contributions
 world+0.007
 thing+0.006
 lot+0.006
 story+0.006
 problem+0.006
Neg logit contributions
arted-0.012
ked-0.011
ur-0.011
oured-0.010
 stro-0.010
Token and
Token ID290
Feature activation-3.57e+00

Pos logit contributions
 best+0.422
 world+0.406
 lot+0.387
 thing+0.383
 things+0.380
Neg logit contributions
arted-0.472
ur-0.447
ked-0.441
 stro-0.430
anged-0.417
Token had
Token ID550
Feature activation+1.12e+01

Pos logit contributions
arted+0.205
ur+0.195
ked+0.180
anged+0.179
pered+0.177
Neg logit contributions
 best-0.160
 lot-0.155
 day-0.154
 thing-0.152
 things-0.150
Token a
Token ID257
Feature activation+5.24e+00

Pos logit contributions
 world+0.413
 best+0.412
 thing+0.405
 lot+0.398
 things+0.385
Neg logit contributions
arted-0.704
ur-0.685
ked-0.663
 stro-0.645
anged-0.624
Token great
Token ID1049
Feature activation+1.10e+01

Pos logit contributions
 world+0.116
 things+0.111
 thing+0.109
iest+0.103
 biggest+0.099
Neg logit contributions
ur-0.361
 stro-0.359
ked-0.346
arted-0.345
oured-0.344
Token time
Token ID640
Feature activation+9.70e+00

Pos logit contributions
 biggest+0.230
 best+0.212
 world+0.202
 things+0.201
 thing+0.200
Neg logit contributions
arted-0.776
ur-0.772
ree-0.731
anged-0.729
 stro-0.717
Token sailing
Token ID28499
Feature activation+1.07e+00

Pos logit contributions
 best+0.429
 world+0.416
 lot+0.408
 thing+0.404
 things+0.399
Neg logit contributions
arted-0.561
ur-0.544
ked-0.527
 stro-0.504
anged-0.499
Token down
Token ID866
Feature activation-6.22e+00

Pos logit contributions
 best+0.051
 world+0.051
 lot+0.050
 thing+0.049
 biggest+0.048
Neg logit contributions
arted-0.055
ur-0.053
ked-0.052
 stro-0.049
anged-0.048
Token\n
Token ID198
Feature activation-4.52e+00

Pos logit contributions
 lot+0.123
 best+0.123
 thing+0.123
 things+0.120
 world+0.114
Neg logit contributions
ur-0.208
arted-0.201
oured-0.198
ked-0.196
 stro-0.195
Token\n
Token ID198
Feature activation-4.51e+00

Pos logit contributions
arted+0.291
ked+0.289
ur+0.279
oured+0.267
 stro+0.266
Neg logit contributions
 lot-0.184
 world-0.178
 best-0.175
 thing-0.167
 things-0.164
TokenThey
Token ID2990
Feature activation+4.29e+00

Pos logit contributions
arted+0.256
ur+0.250
ked+0.241
 stro+0.232
anged+0.230
Neg logit contributions
 best-0.203
 world-0.199
 lot-0.195
 thing-0.194
 things-0.189
Token open
Token ID1280
Feature activation-2.09e+00

Pos logit contributions
 best+0.202
 world+0.197
 lot+0.197
 thing+0.192
 things+0.189
Neg logit contributions
arted-0.233
ur-0.225
ked-0.218
 stro-0.210
anged-0.206
Token the
Token ID262
Feature activation-7.23e-01

Pos logit contributions
ur+0.123
arted+0.122
ked+0.115
 stro+0.114
anged+0.110
Neg logit contributions
 best-0.090
 things-0.089
 thing-0.089
 lot-0.088
 world-0.087
Token box
Token ID3091
Feature activation+1.65e+01

Pos logit contributions
arted+0.038
ked+0.037
oured+0.034
anged+0.034
ur+0.033
Neg logit contributions
 important-0.022
time-0.021
iest-0.020
body-0.020
 best-0.019
Token and
Token ID290
Feature activation-4.56e+00

Pos logit contributions
 best+0.888
 world+0.871
 lot+0.854
 thing+0.853
 things+0.839
Neg logit contributions
arted-0.789
ur-0.771
ked-0.734
 stro-0.708
anged-0.692
Token take
Token ID1011
Feature activation-1.81e+00

Pos logit contributions
arted+0.264
ur+0.253
ked+0.242
 stro+0.232
anged+0.230
Neg logit contributions
 best-0.206
 lot-0.199
 things-0.198
 thing-0.196
 world-0.195
Token out
Token ID503
Feature activation-4.08e+00

Pos logit contributions
arted+0.101
ur+0.098
ked+0.095
 stro+0.091
anged+0.090
Neg logit contributions
 best-0.083
 world-0.082
 lot-0.080
 thing-0.079
 things-0.077
Token the
Token ID262
Feature activation-6.34e-01

Pos logit contributions
arted+0.220
ur+0.215
ked+0.206
 stro+0.198
anged+0.196
Neg logit contributions
 best-0.190
 world-0.190
 thing-0.181
 lot-0.180
 biggest-0.177
Token parts
Token ID3354
Feature activation+7.27e+00

Pos logit contributions
ked+0.033
arted+0.031
anged+0.029
oured+0.028
ur+0.028
Neg logit contributions
 world-0.020
 important-0.020
time-0.018
 lot-0.018
 environment-0.017
Token stronger
Token ID7387
Feature activation-9.79e-01

Pos logit contributions
ur+0.295
arted+0.292
 stro+0.288
ked+0.287
anged+0.264
Neg logit contributions
 world-0.244
 lot-0.225
 best-0.221
 thing-0.219
 story-0.217
Token.
Token ID13
Feature activation+2.74e+00

Pos logit contributions
arted+0.058
ur+0.057
ked+0.056
 stro+0.054
anged+0.054
Neg logit contributions
 best-0.040
 world-0.039
 lot-0.039
 thing-0.037
 biggest-0.037
Token They
Token ID1119
Feature activation+2.02e+00

Pos logit contributions
 lot+0.120
 best+0.110
 things+0.110
 thing+0.105
 world+0.105
Neg logit contributions
arted-0.171
ur-0.170
oured-0.161
ked-0.158
ried-0.152
Token said
Token ID531
Feature activation-1.20e+01

Pos logit contributions
 best+0.100
 lot+0.098
 thing+0.092
 game+0.091
 things+0.089
Neg logit contributions
arted-0.109
ur-0.104
ked-0.096
ree-0.094
anged-0.094
Token they
Token ID484
Feature activation+6.91e+00

Pos logit contributions
arted+0.717
ur+0.695
ked+0.669
anged+0.643
 stro+0.640
Neg logit contributions
 best-0.516
 world-0.496
 lot-0.493
 thing-0.489
 things-0.481
Token would
Token ID561
Feature activation+1.10e+01

Pos logit contributions
 world+0.289
 thing+0.270
 best+0.270
 biggest+0.260
 lot+0.260
Neg logit contributions
ked-0.399
ur-0.399
 stro-0.397
arted-0.390
anged-0.374
Token hide
Token ID7808
Feature activation-5.22e+00

Pos logit contributions
 best+0.486
 world+0.477
 lot+0.470
 thing+0.463
 things+0.453
Neg logit contributions
arted-0.631
ur-0.611
ked-0.592
 stro-0.566
anged-0.564
Token the
Token ID262
Feature activation-1.86e+00

Pos logit contributions
arted+0.266
ur+0.256
ked+0.239
anged+0.235
 stro+0.233
Neg logit contributions
 best-0.266
 thing-0.266
 things-0.265
 lot-0.261
 world-0.259
Token carrot
Token ID40562
Feature activation-2.41e+00

Pos logit contributions
arted+0.098
ked+0.094
ur+0.092
 stro+0.090
anged+0.086
Neg logit contributions
 lot-0.065
 world-0.063
 best-0.063
 thing-0.062
 important-0.060
Token nose
Token ID9686
Feature activation-7.34e+00

Pos logit contributions
arted+0.121
ur+0.121
ked+0.117
 stro+0.110
anged+0.108
Neg logit contributions
 world-0.116
 lot-0.115
 best-0.115
 thing-0.111
 biggest-0.109
Token and
Token ID290
Feature activation-5.48e+00

Pos logit contributions
arted+0.356
ur+0.350
ked+0.340
 stro+0.320
anged+0.313
Neg logit contributions
 world-0.375
 best-0.375
 lot-0.370
 thing-0.362
 things-0.357
Token closer
Token ID5699
Feature activation-2.67e+00

Pos logit contributions
 best+0.254
 world+0.250
 lot+0.238
 thing+0.237
 things+0.236
Neg logit contributions
ur-0.232
arted-0.230
ked-0.216
 stro-0.209
anged-0.206
Token look
Token ID804
Feature activation+8.15e+00

Pos logit contributions
arted+0.179
ur+0.174
ked+0.170
 stro+0.165
anged+0.163
Neg logit contributions
 best-0.092
 world-0.090
 lot-0.088
 thing-0.088
 things-0.085
Token.
Token ID13
Feature activation+4.13e+00

Pos logit contributions
 world+0.364
 best+0.356
 lot+0.351
 thing+0.338
 biggest+0.334
Neg logit contributions
arted-0.454
ur-0.441
ked-0.424
 stro-0.407
anged-0.395
Token She
Token ID1375
Feature activation+3.21e+00

Pos logit contributions
 world+0.154
 lot+0.151
 best+0.148
 things+0.143
 day+0.140
Neg logit contributions
ked-0.269
ur-0.265
arted-0.259
anged-0.258
 stro-0.238
Token touched
Token ID12615
Feature activation-1.55e+01

Pos logit contributions
 best+0.152
 lot+0.151
 things+0.145
 thing+0.144
 important+0.144
Neg logit contributions
arted-0.163
ked-0.153
ur-0.151
anged-0.143
pered-0.140
Token it
Token ID340
Feature activation+9.68e+00

Pos logit contributions
arted+0.894
ur+0.876
ked+0.842
 stro+0.808
anged+0.804
Neg logit contributions
 best-0.690
 world-0.675
 lot-0.665
 thing-0.660
 things-0.645
Token and
Token ID290
Feature activation-3.77e+00

Pos logit contributions
 best+0.526
 world+0.515
 lot+0.508
 thing+0.506
 things+0.496
Neg logit contributions
arted-0.459
ur-0.446
ked-0.425
 stro-0.406
anged-0.403
Token examined
Token ID11068
Feature activation-1.60e+01

Pos logit contributions
arted+0.235
ur+0.231
ked+0.211
pered+0.206
anged+0.203
Neg logit contributions
 things-0.155
 best-0.154
 thing-0.152
 lot-0.148
 world-0.138
Token it
Token ID340
Feature activation+9.74e+00

Pos logit contributions
arted+0.996
ur+0.974
ked+0.942
 stro+0.910
anged+0.901
Neg logit contributions
 best-0.625
 world-0.611
 lot-0.597
 thing-0.591
 things-0.576
Token with
Token ID351
Feature activation-5.31e+00

Pos logit contributions
 world+0.518
 best+0.509
 lot+0.503
 thing+0.494
 things+0.481
Neg logit contributions
arted-0.452
ur-0.441
ked-0.424
 stro-0.403
anged-0.389
Token her
Token ID607
Feature activation-1.51e+00

Pos logit contributions
arted+0.302
ur+0.297
ked+0.285
anged+0.273
 stro+0.270
Neg logit contributions
 best-0.244
 lot-0.238
 world-0.234
 thing-0.232
 things-0.230
Token the
Token ID262
Feature activation+1.36e-01

Pos logit contributions
arted+0.132
ur+0.130
ked+0.125
 stro+0.120
anged+0.116
Neg logit contributions
 best-0.103
 world-0.103
 thing-0.098
 lot-0.097
 biggest-0.095
Token fruit
Token ID8234
Feature activation+2.80e+00

Pos logit contributions
body+0.006
 world+0.006
 best+0.006
 planet+0.006
 important+0.006
Neg logit contributions
arted-0.006
ur-0.005
ked-0.005
 Bob-0.005
 stro-0.005
Token.
Token ID13
Feature activation+4.50e+00

Pos logit contributions
 world+0.127
 best+0.124
 lot+0.123
 thing+0.119
 problem+0.117
Neg logit contributions
ur-0.146
arted-0.144
ked-0.133
oured-0.123
 stro-0.123
Token How
Token ID1374
Feature activation+4.80e+00

Pos logit contributions
 world+0.187
 best+0.185
 lot+0.184
 thing+0.184
 biggest+0.176
Neg logit contributions
arted-0.273
ked-0.253
ur-0.253
anged-0.250
 stro-0.237
Token can
Token ID460
Feature activation+1.32e+01

Pos logit contributions
 world+0.190
 best+0.169
 whole+0.162
 lot+0.162
 biggest+0.162
Neg logit contributions
ur-0.272
ree-0.239
ked-0.233
arted-0.230
 Dan-0.228
Token we
Token ID356
Feature activation+1.22e+01

Pos logit contributions
 best+0.435
 world+0.433
 lot+0.416
 thing+0.405
 things+0.388
Neg logit contributions
arted-0.892
ur-0.886
ked-0.853
 stro-0.828
anged-0.816
Token get
Token ID651
Feature activation+3.29e+00

Pos logit contributions
 best+0.592
 world+0.581
 thing+0.566
 lot+0.564
 things+0.554
Neg logit contributions
arted-0.636
ur-0.630
ked-0.596
 stro-0.576
anged-0.564
Token it
Token ID340
Feature activation+9.79e+00

Pos logit contributions
 best+0.140
 world+0.138
 lot+0.134
 thing+0.133
 things+0.130
Neg logit contributions
arted-0.194
ur-0.190
ked-0.183
 stro-0.177
anged-0.175
Token?"
Token ID1701
Feature activation-3.27e-01

Pos logit contributions
 best+0.407
 world+0.405
 lot+0.395
 thing+0.392
 things+0.381
Neg logit contributions
arted-0.571
ur-0.562
ked-0.538
 stro-0.516
anged-0.509
Token Rod
Token ID6882
Feature activation-6.40e+00

Pos logit contributions
arted+0.015
ur+0.015
ked+0.014
 stro+0.013
oured+0.013
Neg logit contributions
 lot-0.018
 best-0.018
 thing-0.018
 things-0.018
 world-0.018
Token had
Token ID550
Feature activation+1.13e+01

Pos logit contributions
arted+0.345
ur+0.325
anged+0.304
 stro+0.302
ked+0.301
Neg logit contributions
 best-0.319
 thing-0.307
 world-0.292
 lot-0.292
 things-0.290
Token need
Token ID761
Feature activation+8.26e+00

Pos logit contributions
 world+0.285
 best+0.274
 thing+0.261
 biggest+0.258
 things+0.248
Neg logit contributions
ur-0.471
arted-0.447
ked-0.444
 stro-0.440
anged-0.424
Token to
Token ID284
Feature activation-6.92e+00

Pos logit contributions
 world+0.461
 best+0.458
 thing+0.435
 biggest+0.433
 lot+0.427
Neg logit contributions
ur-0.356
arted-0.353
 stro-0.328
ked-0.320
anged-0.309
Token eat
Token ID4483
Feature activation-4.32e+00

Pos logit contributions
ur+0.355
arted+0.352
ked+0.330
 stro+0.327
anged+0.317
Neg logit contributions
 best-0.332
 world-0.328
 thing-0.316
 lot-0.315
 biggest-0.310
Token something
Token ID1223
Feature activation+8.17e+00

Pos logit contributions
ur+0.220
arted+0.218
 stro+0.207
ked+0.206
anged+0.194
Neg logit contributions
 world-0.195
 best-0.191
 biggest-0.181
 thing-0.180
 lot-0.179
Token different
Token ID1180
Feature activation+6.53e+00

Pos logit contributions
 world+0.325
 lot+0.287
 thing+0.284
 universe+0.284
 best+0.283
Neg logit contributions
ur-0.482
arted-0.447
 stro-0.432
olded-0.430
ked-0.414
Token today
Token ID1909
Feature activation+5.02e+00

Pos logit contributions
 best+0.303
 world+0.302
 lot+0.295
 thing+0.287
 biggest+0.283
Neg logit contributions
arted-0.345
ur-0.341
ked-0.322
 stro-0.314
anged-0.309
Token.
Token ID13
Feature activation+2.77e+00

Pos logit contributions
 lot+0.241
 world+0.236
 best+0.235
 biggest+0.230
 thing+0.227
Neg logit contributions
ur-0.232
arted-0.223
anged-0.204
 stro-0.204
ked-0.199
Token Eggs
Token ID40433
Feature activation+1.21e+00

Pos logit contributions
 best+0.120
 world+0.119
 lot+0.117
 thing+0.115
 things+0.111
Neg logit contributions
arted-0.163
ked-0.154
ur-0.152
anged-0.146
oured-0.141
Token and
Token ID290
Feature activation-5.37e+00

Pos logit contributions
 best+0.038
 world+0.038
 biggest+0.036
 lot+0.036
 thing+0.035
Neg logit contributions
ur-0.080
arted-0.079
ked-0.076
anged-0.074
 stro-0.074
Token toast
Token ID27805
Feature activation+1.73e+00

Pos logit contributions
ur+0.293
arted+0.290
 stro+0.288
ked+0.279
anged+0.272
Neg logit contributions
 world-0.233
 best-0.220
 thing-0.209
 biggest-0.207
 lot-0.204
Token are
Token ID389
Feature activation+1.13e+01

Pos logit contributions
 world+0.077
 best+0.073
 lot+0.073
 biggest+0.073
 thing+0.071
Neg logit contributions
ur-0.092
arted-0.088
ked-0.086
 stro-0.084
anged-0.082
TokenL
Token ID43
Feature activation-3.38e+00

Pos logit contributions
arted+0.305
ur+0.298
ked+0.286
 stro+0.277
anged+0.271
Neg logit contributions
 best-0.251
 world-0.245
 lot-0.240
 thing-0.239
 things-0.233
Tokenily
Token ID813
Feature activation-1.31e+00

Pos logit contributions
arted+0.185
ur+0.181
ked+0.173
 stro+0.166
anged+0.165
Neg logit contributions
 best-0.159
 world-0.156
 lot-0.154
 thing-0.152
 things-0.149
Token squee
Token ID13170
Feature activation-1.61e+01

Pos logit contributions
arted+0.074
ur+0.073
ked+0.070
anged+0.067
 stro+0.065
Neg logit contributions
 best-0.060
 world-0.057
 lot-0.056
 things-0.056
 thing-0.055
Tokenzes
Token ID12271
Feature activation-1.41e+01

Pos logit contributions
ur+0.751
arted+0.735
olded+0.621
ked+0.613
 stro+0.606
Neg logit contributions
 thing-0.914
 best-0.907
 things-0.899
 lot-0.880
 game-0.876
Token his
Token ID465
Feature activation-7.55e+00

Pos logit contributions
ur+0.806
arted+0.779
ked+0.714
oured+0.696
anged+0.689
Neg logit contributions
 things-0.669
 best-0.668
 thing-0.660
 lot-0.658
 world-0.604
Token hand
Token ID1021
Feature activation-3.23e+00

Pos logit contributions
arted+0.371
ked+0.367
ur+0.357
 stro+0.346
anged+0.338
Neg logit contributions
 world-0.372
 best-0.356
 lot-0.355
 thing-0.341
 important-0.339
Token.
Token ID13
Feature activation+2.56e+00

Pos logit contributions
arted+0.179
ur+0.174
ked+0.165
 stro+0.159
anged+0.159
Neg logit contributions
 best-0.151
 world-0.145
 thing-0.144
 lot-0.144
 things-0.141
Token "
Token ID366
Feature activation+4.31e-01

Pos logit contributions
 lot+0.116
 world+0.111
 best+0.109
 things+0.104
 thing+0.102
Neg logit contributions
arted-0.141
ur-0.140
ked-0.136
anged-0.135
ried-0.130
TokenI
Token ID40
Feature activation+8.90e+00

Pos logit contributions
 best+0.019
 world+0.018
 thing+0.018
 lot+0.018
 things+0.018
Neg logit contributions
arted-0.025
ur-0.024
ked-0.023
 stro-0.023
anged-0.022
Token will
Token ID481
Feature activation+1.08e+01

Pos logit contributions
 best+0.316
 lot+0.307
 world+0.300
 thing+0.297
 things+0.282
Neg logit contributions
arted-0.614
ked-0.572
ur-0.571
anged-0.539
 stro-0.532
Token not
Token ID407
Feature activation-1.38e+00

Pos logit contributions
 best+0.476
 lot+0.461
 world+0.457
 thing+0.450
 things+0.437
Neg logit contributions
arted-0.639
ur-0.609
ked-0.598
anged-0.571
 stro-0.562
Token went
Token ID1816
Feature activation-1.42e+00

Pos logit contributions
 best+0.372
 world+0.366
 lot+0.357
 thing+0.356
 things+0.348
Neg logit contributions
arted-0.435
ur-0.423
ked-0.409
 stro-0.390
anged-0.386
Token,
Token ID11
Feature activation-3.22e+00

Pos logit contributions
arted+0.069
ur+0.068
ked+0.062
 stro+0.060
anged+0.060
Neg logit contributions
 best-0.072
 lot-0.072
 world-0.070
 biggest-0.070
 thing-0.069
Token something
Token ID1223
Feature activation+7.98e+00

Pos logit contributions
arted+0.179
ur+0.172
ked+0.164
 stro+0.164
anged+0.156
Neg logit contributions
 best-0.135
 lot-0.134
 world-0.134
 thing-0.133
 problem-0.130
Token new
Token ID649
Feature activation+9.68e+00

Pos logit contributions
 world+0.262
 lot+0.245
 best+0.244
 biggest+0.231
 thing+0.230
Neg logit contributions
ur-0.533
arted-0.530
 stro-0.506
ked-0.496
anged-0.466
Token and
Token ID290
Feature activation-5.65e+00

Pos logit contributions
 best+0.486
 lot+0.486
 world+0.473
 biggest+0.466
 thing+0.458
Neg logit contributions
arted-0.479
ur-0.472
ked-0.444
 stro-0.433
anged-0.427
Token amazing
Token ID4998
Feature activation+6.81e+00

Pos logit contributions
arted+0.392
ur+0.380
 stro+0.368
ked+0.365
anged+0.347
Neg logit contributions
 world-0.162
 lot-0.156
 best-0.149
 thing-0.148
 story-0.143
Token called
Token ID1444
Feature activation-1.01e+01

Pos logit contributions
 best+0.310
 lot+0.306
 world+0.305
 biggest+0.296
 thing+0.287
Neg logit contributions
arted-0.378
ur-0.369
ked-0.357
 stro-0.343
anged-0.339
Token out
Token ID503
Feature activation-5.49e+00

Pos logit contributions
arted+0.492
ur+0.491
ked+0.463
 stro+0.441
anged+0.425
Neg logit contributions
 best-0.510
 world-0.507
 lot-0.494
 thing-0.488
 biggest-0.480
Token to
Token ID284
Feature activation-6.96e+00

Pos logit contributions
arted+0.294
ur+0.294
ked+0.272
 stro+0.262
anged+0.255
Neg logit contributions
 best-0.255
 world-0.251
 biggest-0.243
 lot-0.239
 thing-0.235
Token her
Token ID607
Feature activation-3.54e+00

Pos logit contributions
arted+0.341
ur+0.332
ked+0.317
 stro+0.303
anged+0.299
Neg logit contributions
 best-0.365
 world-0.359
 lot-0.353
 thing-0.351
 things-0.344
Token.
Token ID13
Feature activation+2.34e+00

Pos logit contributions
arted+0.212
ur+0.209
ked+0.201
 stro+0.195
anged+0.192
Neg logit contributions
 best-0.143
 world-0.140
 lot-0.138
 thing-0.137
 things-0.133
Token mom
Token ID1995
Feature activation-4.60e+00

Pos logit contributions
 world+0.391
 lot+0.382
 best+0.380
 thing+0.375
 things+0.366
Neg logit contributions
arted-0.477
ur-0.470
ked-0.457
anged-0.432
 stro-0.427
Token showed
Token ID3751
Feature activation-8.94e+00

Pos logit contributions
arted+0.238
ur+0.232
ked+0.224
anged+0.208
 stro+0.208
Neg logit contributions
 best-0.244
 world-0.231
 lot-0.228
 thing-0.225
 important-0.223
Token him
Token ID683
Feature activation-8.85e+00

Pos logit contributions
 stro+0.412
ked+0.395
ur+0.387
arted+0.383
oured+0.368
Neg logit contributions
 things-0.532
 thing-0.515
 lot-0.490
 best-0.486
 world-0.472
Token a
Token ID257
Feature activation+3.38e+00

Pos logit contributions
arted+0.416
ur+0.396
ked+0.386
 stro+0.366
anged+0.363
Neg logit contributions
 best-0.493
 thing-0.476
 lot-0.475
 things-0.474
 world-0.471
Token surprise
Token ID5975
Feature activation+8.36e-01

Pos logit contributions
 best+0.083
 world+0.078
 things+0.075
 biggest+0.068
 thing+0.068
Neg logit contributions
ur-0.229
arted-0.212
oured-0.209
 stro-0.209
ree-0.206
Token.
Token ID13
Feature activation+2.80e+00

Pos logit contributions
 best+0.039
 world+0.039
 lot+0.039
 biggest+0.038
 things+0.036
Neg logit contributions
ur-0.043
arted-0.042
ked-0.041
anged-0.039
 stro-0.039
Token It
Token ID632
Feature activation+6.19e+00

Pos logit contributions
 lot+0.137
 world+0.130
 thing+0.129
 things+0.128
 best+0.122
Neg logit contributions
arted-0.156
ur-0.152
ked-0.148
anged-0.138
 stro-0.138
Token was
Token ID373
Feature activation+5.59e+00

Pos logit contributions
 world+0.219
 lot+0.209
 best+0.208
 thing+0.200
 things+0.198
Neg logit contributions
ur-0.391
arted-0.390
ked-0.377
 stro-0.366
anged-0.358
Token a
Token ID257
Feature activation+3.47e+00

Pos logit contributions
 world+0.212
 best+0.212
 lot+0.201
 thing+0.201
 things+0.195
Neg logit contributions
arted-0.346
ur-0.342
ked-0.329
 stro-0.319
anged-0.313
Token big
Token ID1263
Feature activation+4.23e+00

Pos logit contributions
 best+0.090
 world+0.084
 things+0.083
 biggest+0.083
iest+0.083
Neg logit contributions
ur-0.218
 stro-0.210
ked-0.202
ree-0.200
arted-0.195
Token,
Token ID11
Feature activation-2.78e+00

Pos logit contributions
 best+0.165
 biggest+0.156
 world+0.155
 lot+0.151
 important+0.146
Neg logit contributions
ur-0.235
arted-0.228
ked-0.221
anged-0.218
 stro-0.217
Token are
Token ID389
Feature activation+1.26e+01

Pos logit contributions
arted+0.511
ur+0.491
ked+0.468
anged+0.453
 stro+0.444
Neg logit contributions
 best-0.509
 lot-0.493
 world-0.492
 thing-0.480
 things-0.474
Token curious
Token ID11040
Feature activation-1.77e+00

Pos logit contributions
 best+0.567
 world+0.504
 biggest+0.493
 thing+0.491
 lot+0.491
Neg logit contributions
arted-0.773
ur-0.766
ked-0.733
anged-0.730
oured-0.692
Token.
Token ID13
Feature activation+3.98e+00

Pos logit contributions
arted+0.097
ur+0.094
ked+0.092
 stro+0.089
anged+0.087
Neg logit contributions
 best-0.077
 lot-0.076
 world-0.075
 biggest-0.074
 thing-0.072
Token They
Token ID1119
Feature activation+3.46e+00

Pos logit contributions
 world+0.185
 best+0.185
 lot+0.184
 thing+0.183
 things+0.177
Neg logit contributions
arted-0.226
ur-0.220
ked-0.208
anged-0.204
oured-0.199
Token nod
Token ID18666
Feature activation-6.06e+00

Pos logit contributions
 best+0.165
 lot+0.162
 world+0.159
 thing+0.155
 things+0.152
Neg logit contributions
arted-0.189
ur-0.182
ked-0.175
anged-0.171
 stro-0.166
Token.
Token ID13
Feature activation+4.06e+00

Pos logit contributions
arted+0.377
ur+0.369
ked+0.357
 stro+0.345
anged+0.341
Neg logit contributions
 best-0.239
 world-0.234
 lot-0.228
 thing-0.226
 things-0.220
Token The
Token ID383
Feature activation-5.73e+00

Pos logit contributions
 best+0.187
 lot+0.187
 world+0.186
 thing+0.183
 things+0.178
Neg logit contributions
arted-0.233
ur-0.225
ked-0.215
anged-0.210
oured-0.204
Token old
Token ID1468
Feature activation+2.73e+00

Pos logit contributions
arted+0.319
ur+0.298
 stro+0.294
ked+0.292
anged+0.281
Neg logit contributions
 best-0.217
 lot-0.216
 world-0.196
 biggest-0.193
 thing-0.191
Token man
Token ID582
Feature activation+1.45e+00

Pos logit contributions
 best+0.090
 world+0.087
 lot+0.086
 thing+0.082
 biggest+0.081
Neg logit contributions
arted-0.184
ur-0.180
ked-0.176
 stro-0.171
anged-0.169
Token takes
Token ID2753
Feature activation-3.13e+00

Pos logit contributions
 best+0.076
 world+0.074
 lot+0.073
 thing+0.073
 things+0.071
Neg logit contributions
arted-0.071
ur-0.069
ked-0.066
 stro-0.063
anged-0.063
Token off
Token ID572
Feature activation-1.10e+01

Pos logit contributions
arted+0.172
ur+0.169
ked+0.160
 stro+0.155
anged+0.153
Neg logit contributions
 best-0.151
 thing-0.146
 lot-0.145
 things-0.144
 world-0.142
Token hands
Token ID2832
Feature activation-5.69e+00

Pos logit contributions
arted+0.172
ur+0.167
ked+0.159
 stro+0.151
anged+0.149
Neg logit contributions
 best-0.228
 world-0.223
 lot-0.221
 thing-0.219
 things-0.215
Token together
Token ID1978
Feature activation-3.32e+00

Pos logit contributions
arted+0.317
ur+0.301
anged+0.290
ked+0.283
 stro+0.273
Neg logit contributions
 best-0.275
 lot-0.257
 world-0.251
 thing-0.245
 whole-0.241
Token and
Token ID290
Feature activation-5.84e+00

Pos logit contributions
arted+0.182
ur+0.175
ked+0.168
anged+0.162
 stro+0.159
Neg logit contributions
 best-0.160
 lot-0.154
 world-0.154
 thing-0.152
 things-0.149
Token tw
Token ID665
Feature activation-4.62e+00

Pos logit contributions
arted+0.292
ur+0.286
ked+0.271
anged+0.262
ree+0.247
Neg logit contributions
 best-0.306
 lot-0.291
 thing-0.285
 things-0.284
 world-0.280
Tokenirled
Token ID49376
Feature activation-7.95e+00

Pos logit contributions
arted+0.254
ur+0.242
ked+0.239
anged+0.219
 stro+0.218
Neg logit contributions
 best-0.203
 lot-0.193
 things-0.185
 thing-0.182
 important-0.180
Token around
Token ID1088
Feature activation-9.15e+00

Pos logit contributions
arted+0.380
ur+0.375
ked+0.354
anged+0.342
 stro+0.339
Neg logit contributions
 best-0.429
 world-0.419
 lot-0.417
 thing-0.414
 things-0.408
Token happily
Token ID18177
Feature activation-1.23e+01

Pos logit contributions
arted+0.520
ur+0.510
ked+0.488
anged+0.470
 stro+0.468
Neg logit contributions
 best-0.411
 world-0.404
 lot-0.398
 thing-0.395
 things-0.387
Token.
Token ID13
Feature activation+2.16e+00

Pos logit contributions
arted+0.706
ur+0.700
ked+0.672
 stro+0.648
anged+0.635
Neg logit contributions
 best-0.517
 world-0.515
 lot-0.506
 thing-0.496
 biggest-0.489
Token
Token ID220
Feature activation-2.16e+00

Pos logit contributions
 lot+0.098
 world+0.097
 best+0.094
 day+0.089
 biggest+0.086
Neg logit contributions
ked-0.125
ur-0.120
arted-0.118
anged-0.114
 stro-0.111
Token\n
Token ID198
Feature activation-5.46e+00

Pos logit contributions
arted+0.131
ked+0.130
ur+0.126
anged+0.124
 stro+0.123
Neg logit contributions
 best-0.090
 world-0.086
 lot-0.084
 biggest-0.078
 things-0.078
Token\n
Token ID198
Feature activation-5.42e+00

Pos logit contributions
ked+0.330
arted+0.327
ur+0.321
anged+0.317
 stro+0.297
Neg logit contributions
 lot-0.234
 world-0.231
 best-0.226
 whole-0.213
 day-0.207
Token<|endoftext|>
Token ID50256
Feature activation-9.74e-01

Pos logit contributions
 best+0.171
 world+0.168
 lot+0.165
 thing+0.164
 things+0.161
Neg logit contributions
arted-0.157
ur-0.152
ked-0.146
 stro-0.139
anged-0.137
TokenTom
Token ID13787
Feature activation-8.55e+00

Pos logit contributions
arted+0.045
ur+0.044
ked+0.041
 stro+0.041
anged+0.039
Neg logit contributions
 best-0.050
 biggest-0.048
 lot-0.048
 thing-0.048
 important-0.048
Token and
Token ID290
Feature activation-5.38e+00

Pos logit contributions
arted+0.518
ur+0.502
ked+0.492
 stro+0.463
anged+0.457
Neg logit contributions
 best-0.369
 world-0.362
 thing-0.344
 lot-0.344
 things-0.337
Token Lily
Token ID20037
Feature activation-5.82e+00

Pos logit contributions
arted+0.318
ur+0.299
ked+0.294
anged+0.280
oured+0.278
Neg logit contributions
 best-0.244
 thing-0.240
 lot-0.239
 things-0.232
 world-0.227
Token went
Token ID1816
Feature activation-9.36e-01

Pos logit contributions
arted+0.334
ur+0.324
ked+0.313
 stro+0.297
anged+0.296
Neg logit contributions
 best-0.259
 world-0.254
 lot-0.249
 thing-0.246
 things-0.241
Token to
Token ID284
Feature activation-6.56e+00

Pos logit contributions
arted+0.047
ked+0.045
ur+0.044
 stro+0.044
anged+0.042
Neg logit contributions
 best-0.050
 world-0.047
 lot-0.047
 things-0.047
 thing-0.045
Token the
Token ID262
Feature activation-1.43e+00

Pos logit contributions
arted+0.414
ur+0.398
ked+0.397
anged+0.377
olded+0.371
Neg logit contributions
 lot-0.266
 best-0.265
 world-0.265
 things-0.257
 thing-0.250
Token shop
Token ID6128
Feature activation+5.70e+00

Pos logit contributions
arted+0.087
ur+0.086
 stro+0.081
ked+0.080
oured+0.079
Neg logit contributions
 best-0.046
 thing-0.045
 lot-0.044
 problem-0.043
 important-0.043
Token with
Token ID351
Feature activation-6.71e+00

Pos logit contributions
 best+0.247
 world+0.244
 thing+0.236
 things+0.232
 lot+0.231
Neg logit contributions
arted-0.342
ur-0.327
ked-0.319
 stro-0.306
anged-0.300
Token Mom
Token ID11254
Feature activation-8.30e+00

Pos logit contributions
arted+0.296
ur+0.288
ked+0.275
 stro+0.265
anged+0.258
Neg logit contributions
 world-0.376
 best-0.374
 thing-0.363
 lot-0.363
 things-0.352
Token.
Token ID13
Feature activation+3.04e+00

Pos logit contributions
arted+0.493
ur+0.482
ked+0.465
 stro+0.448
anged+0.444
Neg logit contributions
 best-0.349
 world-0.342
 lot-0.334
 thing-0.331
 things-0.323
Token sweet
Token ID6029
Feature activation+1.87e+00

Pos logit contributions
 world+0.374
 best+0.366
 thing+0.355
 lot+0.355
 things+0.339
Neg logit contributions
arted-0.794
ur-0.777
ked-0.755
 stro-0.734
anged-0.718
Token and
Token ID290
Feature activation-4.95e+00

Pos logit contributions
 best+0.101
 world+0.099
 lot+0.097
 thing+0.097
 things+0.096
Neg logit contributions
arted-0.090
ur-0.087
ked-0.083
 stro-0.079
anged-0.079
Token juicy
Token ID35075
Feature activation-9.80e-01

Pos logit contributions
ur+0.289
 stro+0.286
arted+0.284
ked+0.281
anged+0.259
Neg logit contributions
 world-0.205
 thing-0.179
 best-0.178
 story-0.174
 problem-0.173
Token.
Token ID13
Feature activation+3.17e+00

Pos logit contributions
arted+0.055
ur+0.053
ked+0.053
 stro+0.051
anged+0.049
Neg logit contributions
 world-0.042
 best-0.042
 lot-0.040
 thing-0.039
 biggest-0.039
Token
Token ID198
Feature activation-4.86e+00

Pos logit contributions
 lot+0.143
 best+0.139
 world+0.132
 thing+0.131
 things+0.130
Neg logit contributions
arted-0.195
ur-0.192
ked-0.181
anged-0.178
oured-0.174
Token
Token ID198
Feature activation-4.84e+00

Pos logit contributions
arted+0.313
ked+0.301
ur+0.299
anged+0.284
oured+0.278
Neg logit contributions
 lot-0.205
 world-0.193
 best-0.192
 thing-0.182
 things-0.180
Token"
Token ID1
Feature activation+6.41e+00

Pos logit contributions
arted+0.255
ur+0.248
ked+0.244
 stro+0.234
anged+0.225
Neg logit contributions
 best-0.233
 world-0.230
 thing-0.218
 biggest-0.218
 lot-0.217
TokenY
Token ID56
Feature activation+2.08e+00

Pos logit contributions
 world+0.318
 best+0.317
 problem+0.310
 things+0.304
 thing+0.302
Neg logit contributions
arted-0.314
ked-0.311
 stro-0.309
ur-0.296
anged-0.270
Tokenum
Token ID388
Feature activation-5.41e+00

Pos logit contributions
 best+0.105
 world+0.105
 thing+0.102
 lot+0.102
 things+0.100
Neg logit contributions
arted-0.105
ur-0.103
ked-0.097
anged-0.093
 stro-0.092
Token,
Token ID11
Feature activation-2.25e+00

Pos logit contributions
ur+0.268
arted+0.267
ked+0.249
 stro+0.240
anged+0.237
Neg logit contributions
 world-0.267
 best-0.261
 lot-0.261
 thing-0.253
 biggest-0.250
Token y
Token ID331
Feature activation+3.77e+00

Pos logit contributions
ur+0.130
 stro+0.126
arted+0.125
ked+0.125
oured+0.112
Neg logit contributions
 world-0.093
 best-0.082
 thing-0.081
 problem-0.079
 lot-0.078
Token and
Token ID290
Feature activation-5.69e+00

Pos logit contributions
arted+0.266
ur+0.260
ked+0.253
anged+0.238
 stro+0.237
Neg logit contributions
 world-0.233
 best-0.230
 lot-0.225
 thing-0.224
 things-0.221
Token was
Token ID373
Feature activation+5.18e+00

Pos logit contributions
arted+0.315
ur+0.305
ked+0.295
anged+0.279
 stro+0.276
Neg logit contributions
 best-0.267
 world-0.260
 lot-0.256
 thing-0.255
 things-0.252
Token never
Token ID1239
Feature activation-2.48e+00

Pos logit contributions
 best+0.192
 world+0.183
 lot+0.179
 thing+0.175
 things+0.169
Neg logit contributions
arted-0.340
ur-0.331
ked-0.326
anged-0.313
 stro-0.308
Token found
Token ID1043
Feature activation-4.98e+00

Pos logit contributions
arted+0.130
ur+0.127
ked+0.122
 stro+0.117
anged+0.116
Neg logit contributions
 best-0.122
 world-0.119
 lot-0.116
 thing-0.116
 things-0.114
Token.
Token ID13
Feature activation+2.43e+00

Pos logit contributions
arted+0.252
ur+0.246
ked+0.236
 stro+0.226
anged+0.224
Neg logit contributions
 best-0.255
 world-0.250
 lot-0.246
 thing-0.245
 things-0.240
Token Her
Token ID2332
Feature activation-6.40e+00

Pos logit contributions
 world+0.124
 lot+0.121
 best+0.118
 thing+0.114
 important+0.112
Neg logit contributions
arted-0.126
ked-0.126
ur-0.120
anged-0.114
 stro-0.113
Token uncle
Token ID7711
Feature activation-1.27e+01

Pos logit contributions
arted+0.266
ur+0.258
ked+0.248
 stro+0.238
anged+0.233
Neg logit contributions
 best-0.371
 world-0.371
 lot-0.366
 thing-0.363
 things-0.355
Token felt
Token ID2936
Feature activation-4.02e-01

Pos logit contributions
arted+0.815
ur+0.776
ked+0.748
anged+0.745
amb+0.709
Neg logit contributions
 best-0.506
 world-0.457
 thing-0.451
 lot-0.446
 things-0.440
Token very
Token ID845
Feature activation+1.29e+01

Pos logit contributions
ur+0.026
arted+0.026
 stro+0.025
ked+0.024
olded+0.023
Neg logit contributions
 world-0.013
 best-0.013
 thing-0.013
 game-0.012
 things-0.012
Token guilty
Token ID6717
Feature activation-1.09e+00

Pos logit contributions
 best+0.519
 world+0.515
 lot+0.499
 thing+0.496
 things+0.485
Neg logit contributions
arted-0.778
ur-0.768
ked-0.730
 stro-0.716
anged-0.699
Token for
Token ID329
Feature activation-4.34e-01

Pos logit contributions
arted+0.051
ur+0.050
ked+0.047
 stro+0.045
anged+0.044
Neg logit contributions
 best-0.059
 world-0.059
 lot-0.058
 thing-0.057
 biggest-0.056
Token
Token ID198
Feature activation-5.08e+00

Pos logit contributions
 world+0.050
 best+0.050
 lot+0.049
 thing+0.048
 things+0.048
Neg logit contributions
arted-0.068
ur-0.066
ked-0.064
oured-0.060
anged-0.060
Token
Token ID198
Feature activation-5.06e+00

Pos logit contributions
arted+0.292
ur+0.284
ked+0.280
anged+0.267
oured+0.260
Neg logit contributions
 world-0.232
 lot-0.232
 best-0.229
 thing-0.221
 things-0.219
TokenL
Token ID43
Feature activation-3.32e+00

Pos logit contributions
arted+0.273
ur+0.267
ked+0.254
 stro+0.248
anged+0.240
Neg logit contributions
 best-0.242
 world-0.236
 lot-0.229
 thing-0.229
 biggest-0.225
Tokenily
Token ID813
Feature activation-1.30e+00

Pos logit contributions
arted+0.173
ur+0.170
ked+0.161
 stro+0.153
anged+0.153
Neg logit contributions
 best-0.166
 world-0.162
 lot-0.161
 thing-0.157
 biggest-0.155
Token listened
Token ID16399
Feature activation-4.02e+00

Pos logit contributions
arted+0.081
ur+0.078
ked+0.074
anged+0.074
run+0.069
Neg logit contributions
 best-0.054
 world-0.052
 is-0.049
 lot-0.049
 things-0.048
Token to
Token ID284
Feature activation-7.03e+00

Pos logit contributions
arted+0.208
ur+0.192
ked+0.189
anged+0.188
 stro+0.179
Neg logit contributions
 best-0.204
 lot-0.201
 important-0.198
 thing-0.194
 world-0.193
Token the
Token ID262
Feature activation-1.82e+00

Pos logit contributions
ur+0.355
arted+0.354
ked+0.340
anged+0.328
 stro+0.320
Neg logit contributions
 best-0.369
 lot-0.361
 world-0.360
 things-0.354
 thing-0.349
Token story
Token ID1621
Feature activation+1.49e+01

Pos logit contributions
arted+0.096
ur+0.086
ked+0.086
 stro+0.078
pered+0.075
Neg logit contributions
 best-0.077
 thing-0.076
 world-0.076
 lot-0.074
 things-0.074
Token and
Token ID290
Feature activation-5.42e+00

Pos logit contributions
 best+0.766
 world+0.752
 lot+0.745
 biggest+0.737
 thing+0.727
Neg logit contributions
arted-0.713
ur-0.710
ked-0.679
 stro-0.649
anged-0.632
Token imagined
Token ID15758
Feature activation-1.20e+01

Pos logit contributions
arted+0.289
ur+0.274
ked+0.255
anged+0.249
oured+0.244
Neg logit contributions
 best-0.276
 thing-0.262
 things-0.261
 world-0.259
 lot-0.254
Token she
Token ID673
Feature activation+8.13e+00

Pos logit contributions
arted+0.630
ur+0.620
ked+0.595
 stro+0.578
anged+0.548
Neg logit contributions
 world-0.560
 lot-0.543
 best-0.541
 thing-0.538
 problem-0.518
Token�
Token ID95
Feature activation-2.35e+00

Pos logit contributions
arted+0.373
ur+0.355
ked+0.341
 stro+0.325
anged+0.324
Neg logit contributions
 best-0.439
 world-0.436
 lot-0.427
 thing-0.426
 things-0.417
Token€
Token ID26391
Feature activation-5.96e+00

Pos logit contributions
arted+0.110
ur+0.107
ked+0.102
 stro+0.097
anged+0.097
Neg logit contributions
 best-0.127
 world-0.127
 lot-0.123
 thing-0.123
 things-0.120
Token�
Token ID129
Feature activation-6.89e+00

Pos logit contributions
arted+0.297
ur+0.289
ked+0.277
 stro+0.264
anged+0.261
Neg logit contributions
 best-0.307
 world-0.302
 lot-0.297
 thing-0.294
 things-0.289
Token�
Token ID241
Feature activation-1.36e+00

Pos logit contributions
arted+0.332
anged+0.325
ur+0.324
ked+0.322
 stro+0.313
Neg logit contributions
 world-0.355
 best-0.337
 important-0.332
 lot-0.330
 thing-0.326
TokenHey
Token ID10814
Feature activation-6.12e+00

Pos logit contributions
arted+0.064
ur+0.063
ked+0.060
 stro+0.058
anged+0.056
Neg logit contributions
 best-0.073
 world-0.071
 lot-0.070
 thing-0.069
 things-0.068
Token Ch
Token ID609
Feature activation-1.75e+01

Pos logit contributions
ur+0.292
arted+0.282
ked+0.278
 stro+0.268
anged+0.249
Neg logit contributions
 world-0.313
 best-0.311
 biggest-0.301
 lot-0.297
 thing-0.296
Tokenir
Token ID343
Feature activation-1.51e+01

Pos logit contributions
arted+0.667
ur+0.629
ked+0.617
 stro+0.579
anged+0.565
Neg logit contributions
 best-1.090
 world-1.089
 lot-1.069
 thing-1.050
 things-1.028
Tokenp
Token ID79
Feature activation-1.30e+01

Pos logit contributions
arted+0.647
ur+0.625
ked+0.604
 stro+0.582
anged+0.579
Neg logit contributions
 best-0.872
 world-0.867
 thing-0.849
 lot-0.846
 things-0.830
Token!
Token ID0
Feature activation-9.88e-02

Pos logit contributions
arted+0.660
ur+0.657
ked+0.622
 stro+0.596
anged+0.584
Neg logit contributions
 best-0.649
 world-0.639
 lot-0.622
 thing-0.620
 things-0.612
Token I
Token ID314
Feature activation+8.26e+00

Pos logit contributions
arted+0.005
anged+0.005
ked+0.005
ur+0.005
oured+0.005
Neg logit contributions
 lot-0.005
 world-0.005
 best-0.004
 thing-0.004
 things-0.004
Token can
Token ID460
Feature activation+1.16e+01

Pos logit contributions
 best+0.301
 world+0.295
 lot+0.288
 thing+0.284
 things+0.273
Neg logit contributions
arted-0.538
ur-0.523
ked-0.510
 stro-0.489
anged-0.488
Token robber
Token ID29979
Feature activation-8.28e+00

Pos logit contributions
arted+0.090
ur+0.086
ked+0.083
 stro+0.076
ree+0.073
Neg logit contributions
 world-0.084
 lot-0.084
 thing-0.083
 things-0.083
 best-0.080
Token.
Token ID13
Feature activation+2.28e+00

Pos logit contributions
ur+0.445
arted+0.434
ked+0.418
 stro+0.408
anged+0.390
Neg logit contributions
 best-0.372
 world-0.369
 lot-0.367
 biggest-0.366
 thing-0.356
Token Lily
Token ID20037
Feature activation-5.89e+00

Pos logit contributions
 lot+0.113
 world+0.108
 best+0.107
 things+0.102
 thing+0.101
Neg logit contributions
ur-0.120
ked-0.118
arted-0.116
oured-0.115
 stro-0.112
Token ch
Token ID442
Feature activation-7.54e+00

Pos logit contributions
arted+0.460
ur+0.450
ked+0.440
anged+0.426
 stro+0.423
Neg logit contributions
 best-0.144
 world-0.137
 lot-0.131
 thing-0.126
 things-0.125
Tokenases
Token ID1386
Feature activation-1.64e+01

Pos logit contributions
 stro+0.285
ur+0.285
ked+0.273
arted+0.269
oured+0.259
Neg logit contributions
 best-0.456
 world-0.453
 things-0.453
 lot-0.452
 thing-0.448
Token Sam
Token ID3409
Feature activation-1.93e+01

Pos logit contributions
ur+0.712
arted+0.706
 stro+0.663
ked+0.659
anged+0.633
Neg logit contributions
 best-0.941
 things-0.938
 lot-0.934
 thing-0.929
 world-0.915
Token around
Token ID1088
Feature activation-8.71e+00

Pos logit contributions
arted+1.026
ur+0.993
ked+0.956
 stro+0.916
anged+0.911
Neg logit contributions
 best-0.936
 world-0.915
 lot-0.902
 thing-0.898
 things-0.882
Token the
Token ID262
Feature activation-2.00e+00

Pos logit contributions
arted+0.449
ur+0.442
ked+0.421
 stro+0.404
anged+0.401
Neg logit contributions
 best-0.434
 world-0.429
 lot-0.422
 thing-0.420
 things-0.413
Token trees
Token ID7150
Feature activation-1.11e+00

Pos logit contributions
arted+0.126
ur+0.115
ked+0.113
pered+0.110
 stro+0.109
Neg logit contributions
 best-0.069
 thing-0.062
 lot-0.062
iest-0.061
 biggest-0.061
Token and
Token ID290
Feature activation-5.90e+00

Pos logit contributions
arted+0.054
ur+0.053
ked+0.050
 stro+0.048
anged+0.048
Neg logit contributions
 best-0.059
 world-0.058
 lot-0.057
 thing-0.056
 things-0.055
Token the
Token ID262
Feature activation-2.12e+00

Pos logit contributions
arted+0.319
ur+0.312
ked+0.299
 stro+0.285
anged+0.285
Neg logit contributions
 best-0.281
 world-0.273
 lot-0.271
 thing-0.269
 things-0.266
Token
Token ID198
Feature activation-5.27e+00

Pos logit contributions
 lot+0.114
 best+0.111
 world+0.108
 things+0.104
 thing+0.102
Neg logit contributions
arted-0.152
ur-0.148
ked-0.143
anged-0.142
oured-0.133
Token
Token ID198
Feature activation-5.25e+00

Pos logit contributions
arted+0.333
ked+0.332
ur+0.321
oured+0.305
anged+0.302
Neg logit contributions
 lot-0.216
 world-0.216
 best-0.206
 things-0.199
 thing-0.195
Token"
Token ID1
Feature activation+5.93e+00

Pos logit contributions
arted+0.330
ur+0.323
ked+0.312
 stro+0.302
anged+0.299
Neg logit contributions
 best-0.206
 world-0.201
 lot-0.197
 thing-0.195
 things-0.189
TokenOkay
Token ID16454
Feature activation-3.04e-01

Pos logit contributions
 best+0.284
 world+0.275
 lot+0.264
 problem+0.263
 things+0.263
Neg logit contributions
arted-0.315
ked-0.305
ur-0.298
 stro-0.298
anged-0.277
Token!"
Token ID2474
Feature activation-1.90e+00

Pos logit contributions
arted+0.017
ur+0.017
ked+0.016
 stro+0.016
anged+0.015
Neg logit contributions
 best-0.014
 world-0.014
 thing-0.013
 lot-0.013
 things-0.013
Token Sam
Token ID3409
Feature activation-1.89e+01

Pos logit contributions
arted+0.096
ur+0.089
ked+0.083
oured+0.083
ried+0.081
Neg logit contributions
 lot-0.105
 things-0.104
 best-0.103
 thing-0.099
 day-0.096
Token said
Token ID531
Feature activation-1.19e+01

Pos logit contributions
arted+1.147
ur+1.073
ked+1.026
anged+1.019
pered+0.997
Neg logit contributions
 game-0.786
 best-0.779
 thing-0.721
 things-0.715
 is-0.707
Token.
Token ID13
Feature activation+2.56e+00

Pos logit contributions
arted+0.908
ur+0.854
ked+0.836
anged+0.815
 stro+0.814
Neg logit contributions
 best-0.343
 thing-0.314
 things-0.304
 lot-0.279
 world-0.271
Token He
Token ID679
Feature activation-2.64e+00

Pos logit contributions
 lot+0.110
 thing+0.108
 best+0.106
 things+0.105
 world+0.104
Neg logit contributions
arted-0.152
ur-0.151
oured-0.145
ked-0.145
anged-0.143
Token liked
Token ID8288
Feature activation+1.01e+00

Pos logit contributions
arted+0.137
ur+0.128
ked+0.119
anged+0.118
ree+0.113
Neg logit contributions
 best-0.132
 lot-0.129
 thing-0.127
 world-0.126
 game-0.122
Token the
Token ID262
Feature activation-1.68e+00

Pos logit contributions
 best+0.048
 world+0.047
 lot+0.046
 thing+0.046
 things+0.045
Neg logit contributions
arted-0.054
ur-0.053
ked-0.050
 stro-0.049
anged-0.048
Token and
Token ID290
Feature activation-4.80e+00

Pos logit contributions
arted+0.306
ur+0.299
ked+0.288
 stro+0.277
anged+0.273
Neg logit contributions
 best-0.285
 world-0.280
 thing-0.276
 things-0.274
 lot-0.269
Token disappointed
Token ID11679
Feature activation-4.53e+00

Pos logit contributions
arted+0.266
ur+0.262
ked+0.250
anged+0.241
oured+0.239
Neg logit contributions
 best-0.215
 lot-0.210
 things-0.205
 thing-0.204
 world-0.203
Token.
Token ID13
Feature activation+3.26e+00

Pos logit contributions
arted+0.298
ur+0.291
ked+0.282
 stro+0.272
anged+0.269
Neg logit contributions
 best-0.167
 world-0.161
 lot-0.159
 thing-0.158
 things-0.154
Token She
Token ID1375
Feature activation+2.12e+00

Pos logit contributions
 lot+0.143
 world+0.142
 best+0.140
 things+0.139
 thing+0.134
Neg logit contributions
ur-0.191
arted-0.189
ked-0.186
oured-0.180
anged-0.175
Token sc
Token ID629
Feature activation-9.51e+00

Pos logit contributions
 lot+0.102
 best+0.095
 important+0.094
 bad+0.092
 problem+0.091
Neg logit contributions
arted-0.117
ur-0.112
ked-0.106
anged-0.104
oured-0.103
Tokenolded
Token ID48959
Feature activation-2.03e+01

Pos logit contributions
arted+0.218
ked+0.217
 stro+0.184
red+0.174
 Patrick+0.164
Neg logit contributions
 best-0.609
 lot-0.602
 problem-0.582
 important-0.577
 biggest-0.555
Token Max
Token ID5436
Feature activation-1.11e+01

Pos logit contributions
arted+0.803
ur+0.794
ked+0.731
 stro+0.730
anged+0.714
Neg logit contributions
 best-1.279
 lot-1.277
 things-1.268
 thing-1.247
 world-1.239
Token for
Token ID329
Feature activation+1.93e-01

Pos logit contributions
arted+0.612
ur+0.582
ked+0.564
anged+0.548
 stro+0.545
Neg logit contributions
 best-0.546
 lot-0.515
 world-0.514
 things-0.509
 thing-0.507
Token being
Token ID852
Feature activation+5.94e+00

Pos logit contributions
 best+0.009
 lot+0.008
 things+0.008
 thing+0.008
 world+0.008
Neg logit contributions
arted-0.011
ur-0.011
ked-0.011
anged-0.010
 stro-0.010
Token greedy
Token ID31828
Feature activation-9.14e-01

Pos logit contributions
 best+0.162
 thing+0.145
 world+0.145
 lot+0.144
 things+0.140
Neg logit contributions
arted-0.449
ur-0.442
ked-0.430
anged-0.419
 stro-0.417
Token and
Token ID290
Feature activation-4.65e+00

Pos logit contributions
arted+0.038
ur+0.037
ked+0.036
 stro+0.034
anged+0.033
Neg logit contributions
 best-0.053
 world-0.053
 lot-0.052
 biggest-0.051
 thing-0.051
Token
Token ID198
Feature activation-5.49e+00

Pos logit contributions
 lot+0.098
 best+0.097
 world+0.095
 things+0.089
 day+0.088
Neg logit contributions
arted-0.156
ur-0.152
ked-0.152
anged-0.141
 stro-0.141
Token
Token ID198
Feature activation-5.45e+00

Pos logit contributions
arted+0.286
ur+0.277
ked+0.270
 stro+0.255
anged+0.253
Neg logit contributions
 best-0.274
 world-0.271
 lot-0.267
 thing-0.263
 things-0.257
TokenOne
Token ID3198
Feature activation+7.63e+00

Pos logit contributions
arted+0.352
ur+0.348
ked+0.331
 stro+0.328
anged+0.319
Neg logit contributions
 best-0.189
 world-0.184
 biggest-0.179
 thing-0.177
 things-0.172
Token day
Token ID1110
Feature activation+1.31e+01

Pos logit contributions
 biggest+0.138
 best+0.133
 world+0.133
 things+0.122
 problem+0.119
Neg logit contributions
ur-0.588
arted-0.567
anged-0.542
 stro-0.537
ree-0.534
Token,
Token ID11
Feature activation-3.51e+00

Pos logit contributions
 best+0.571
 lot+0.552
 thing+0.549
 things+0.537
 world+0.519
Neg logit contributions
arted-0.785
ur-0.750
ked-0.723
 stro-0.709
anged-0.684
Token Sam
Token ID3409
Feature activation-1.90e+01

Pos logit contributions
arted+0.194
ur+0.189
ked+0.180
 stro+0.174
anged+0.173
Neg logit contributions
 best-0.169
 lot-0.165
 thing-0.163
 things-0.160
 world-0.159
Token and
Token ID290
Feature activation-5.91e+00

Pos logit contributions
arted+1.074
ur+1.041
ked+1.018
anged+0.964
 stro+0.929
Neg logit contributions
 best-0.882
 world-0.881
 thing-0.836
 things-0.829
 lot-0.824
Token Lucy
Token ID22162
Feature activation-1.10e+01

Pos logit contributions
arted+0.280
ked+0.274
ur+0.268
anged+0.264
oured+0.254
Neg logit contributions
 best-0.331
 thing-0.327
 lot-0.323
 things-0.321
 world-0.312
Token went
Token ID1816
Feature activation-1.44e+00

Pos logit contributions
arted+0.686
ur+0.670
ked+0.641
anged+0.608
 stro+0.607
Neg logit contributions
 best-0.449
 world-0.423
 lot-0.420
 thing-0.399
 things-0.390
Token to
Token ID284
Feature activation-7.08e+00

Pos logit contributions
arted+0.075
ked+0.071
ur+0.070
 stro+0.068
anged+0.067
Neg logit contributions
 best-0.075
 lot-0.070
 things-0.070
 world-0.070
 thing-0.068
Token the
Token ID262
Feature activation-2.01e+00

Pos logit contributions
arted+0.466
ked+0.451
ur+0.448
olded+0.428
anged+0.423
Neg logit contributions
 lot-0.280
 best-0.274
 world-0.270
 things-0.268
 thing-0.260