TOP ACTIVATIONS
MAX = 15.540

. â Mon and Joy smiled and nodded
have some . Mon ths later , when El
come true . Mon ths later , Tim my
her . S aman tha knew that the par
to life . Mon ths passed and Daisy almost
. S aman tha was amazed and she
and flying . Mon ths later , Sarah passed
laughed and said , " Mon sters aren 't real ,
. He said , " Mon key , monkey , you
outside ." S aman tha looked at him and
come . Mon ths went by , and
my dear . â <|endoftext|> Once upon a time ,
said seriously . Pe ppa rushed back to her
her wise old friend . <|endoftext|> Once upon a time ,
can try ! â <|endoftext|> Billy was three years old
ured in the garden . <|endoftext|> Once upon a time ,
again full of joy . <|endoftext|> Once upon a time ,
touching me . â <|endoftext|> Once there was a girl
Y ay ! â <|endoftext|> Once upon a time ,
with a newfound respect . <|endoftext|> Once upon a time ,

MIN ACTIVATIONS
MIN = -17.826

Don 't yield to the temptation to mist reat and break
examine it with the magn ifying glass ?" "
cried , dropping the magn ifying glass . "
borrowed her mom 's magn ifying glass to take a closer
. Lily had a magn ifying glass that she found in
the blanket was too t attered to fix . But sue
When Daisy put the magn ifying glass down , the cater
She took out a magn ifying glass and looked very carefully
. They had a magn ifying glass and a small box
flashlight , and a magn ifying glass . As
and handed him the magn ifying glass . He
not to yield to his temptation . Knowing that cutting the
she needed to resist the temptation and leave it alone .
and paint and glue spl attered everywhere .
fingers were red and bl istered . He felt very scared
He pointed his magn ifying glass at different classmates and
Lily tried to resist the temptation , but she couldn 't
, handing him the magn ifying glass . She looked around
happy that she had resisted temptation and went on this big
the necklace and the magn ifying glass and told him that

INTERVAL 7.199 - 15.540
CONTAINS 5.934%

what you need ." M ama smiled and said
to bite them . L ily and Ben were
felt so wise ! But one day , while
. I 'm glad I met you ." Anna said .
your ball big again ." Lu cy and the

INTERVAL -1.143 - 7.199
CONTAINS 57.093%

At the end of the day , the daughter hugged her
excited . He ran around the store with it and gave
to the park and look for it again . <|endoftext|> Once
little boy grinned and said , " I was whist ling
" Help me , wise cat !" said the

INTERVAL -9.484 - -1.143
CONTAINS 35.801%

turned around and saw his papers . He was very shocked
Tim my felt sorry for taking George 's banana
ran together to the river . As soon as they reached
loved the warm water and bubbles . She spl ashed and
was so curious that he decided to go and see what

INTERVAL -17.826 - -9.484
CONTAINS 1.173%

but a friendship can mend . <|endoftext|> Tom and Lily went
looks ," her mom said . Lu cy smiled
sometimes , but not always . I like to try new
You are a good sister . I love you ." Ben
and gave her the ball . Anna threw the ball to
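
Note on the interval boundaries: the four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX, since (15.540 - (-17.826)) / 4 = 8.3415 and each listed boundary sits one such step from the last. The sketch below reproduces the boundaries under that assumption. The function name `equal_width_bins` is hypothetical and is not taken from the tool that produced this dump, and because MIN and MAX are themselves rounded to three decimals, the computed edges may differ from the listed ones in the last digit.

```python
def equal_width_bins(lo, hi, n):
    """Split [lo, hi] into n equal-width (left, right) intervals."""
    width = (hi - lo) / n
    edges = [lo + i * width for i in range(n + 1)]
    edges[-1] = hi  # pin the final edge to hi to avoid float drift
    return list(zip(edges[:-1], edges[1:]))


# MIN and MAX as reported in this dump (already rounded to 3 decimals).
bins = equal_width_bins(-17.826, 15.540, 4)

# Print from the highest interval down, matching the order used above.
for left, right in reversed(bins):
    print(f"INTERVAL {left:.3f} - {right:.3f}")
```

The CONTAINS percentages (5.934, 57.093, 35.801, 1.173) sum to 100.001, which is consistent with each being the share of sampled tokens falling in its bin, rounded independently.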
Token.
Token ID13
Feature activation-7.96e+00

Pos logit contributions
der+0.006
attered+0.006
ly+0.006
ishy+0.006
 dot+0.005
Neg logit contributions
Once-0.008
At-0.007
 -0.006
aman-0.006
ll-0.006
Tokenâ
Token ID22940
Feature activation+3.28e+00

Pos logit contributions
der+0.219
attered+0.193
ly+0.193
ishy+0.183
 because+0.180
Neg logit contributions
Once-0.261
aman-0.239
ll-0.237
At-0.235
Sab-0.234
Token
Token ID26391
Feature activation+4.66e+00

Pos logit contributions
Once+0.090
At+0.075
Sab+0.074
Pe+0.073
aman+0.072
Neg logit contributions
der-0.116
ly-0.116
attered-0.112
ishy-0.109
 textures-0.100
Token
Token ID198
Feature activation+1.36e+01

Pos logit contributions
Once+0.136
aman+0.119
Sab+0.116
At+0.116
Pe+0.114
Neg logit contributions
der-0.147
attered-0.143
ly-0.136
ishy-0.135
.-0.132
Token
Token ID198
Feature activation+1.35e+01

Pos logit contributions
aman+0.415
 salon+0.398
eks+0.380
tha+0.361
prise+0.358
Neg logit contributions
.-0.377
 loudly-0.303
der-0.296
!-0.276
 much-0.274
TokenMon
Token ID9069
Feature activation+1.55e+01

Pos logit contributions
aman+0.325
ma+0.320
ll+0.312
Sab+0.310
eks+0.307
Neg logit contributions
der-0.396
 loudly-0.372
attered-0.367
ly-0.355
.-0.346
Token and
Token ID290
Feature activation+1.56e+00

Pos logit contributions
Once+0.411
At+0.336
 +0.318
Pe+0.318
Sab+0.316
Neg logit contributions
der-0.572
ly-0.555
attered-0.535
ishy-0.520
 loudly-0.500
Token Joy
Token ID14087
Feature activation+3.48e+00

Pos logit contributions
Once+0.047
At+0.040
aman+0.039
Pe+0.038
Mary+0.038
Neg logit contributions
der-0.052
ly-0.050
attered-0.049
ishy-0.047
 textures-0.044
Token smiled
Token ID13541
Feature activation+2.88e+00

Pos logit contributions
Once+0.106
aman+0.091
Pe+0.091
At+0.089
Sab+0.088
Neg logit contributions
der-0.113
ly-0.109
attered-0.105
ishy-0.101
 loudly-0.095
Token and
Token ID290
Feature activation+1.47e+00

Pos logit contributions
Once+0.085
 +0.076
At+0.068
 Once+0.064
+0.063
Neg logit contributions
ishy-0.093
attered-0.092
working-0.084
 possession-0.079
 textures-0.079
Token nodded
Token ID14464
Feature activation+4.09e+00

Pos logit contributions
Once+0.038
At+0.032
 +0.031
aman+0.030
Mary+0.030
Neg logit contributions
der-0.056
attered-0.053
ly-0.053
ishy-0.052
 dot-0.048
Token have
Token ID423
Feature activation+8.39e-01

Pos logit contributions
der+0.043
ly+0.041
attered+0.041
ishy+0.040
 dot+0.036
Neg logit contributions
Once-0.045
At-0.039
 -0.037
aman-0.037
Mary-0.037
Token some
Token ID617
Feature activation-1.87e+00

Pos logit contributions
Once+0.031
At+0.027
 +0.026
Mary+0.026
aman+0.026
Neg logit contributions
der-0.023
ly-0.022
attered-0.022
ishy-0.021
 dot-0.019
Token.
Token ID13
Feature activation-8.06e+00

Pos logit contributions
der+0.051
attered+0.049
ly+0.045
ishy+0.043
 dot+0.043
Neg logit contributions
Once-0.067
At-0.061
 -0.057
n-0.055
ma-0.055
Token
Token ID198
Feature activation+1.32e+01

Pos logit contributions
der+0.244
attered+0.215
ly+0.211
ishy+0.202
.+0.198
Neg logit contributions
Once-0.265
ll-0.238
ma-0.235
aman-0.226
Pe-0.224
Token
Token ID198
Feature activation+1.32e+01

Pos logit contributions
aman+0.387
bur+0.366
 salon+0.347
ll+0.347
ma+0.341
Neg logit contributions
.-0.401
der-0.345
 loudly-0.324
 because-0.286
 much-0.285
TokenMon
Token ID9069
Feature activation+1.55e+01

Pos logit contributions
aman+0.312
ll+0.309
ma+0.300
bur+0.298
aren+0.296
Neg logit contributions
der-0.423
 loudly-0.391
.-0.378
attered-0.376
 dot-0.359
Tokenths
Token ID9998
Feature activation+2.61e+00

Pos logit contributions
Once+0.418
At+0.341
Pe+0.329
 +0.326
aman+0.311
Neg logit contributions
der-0.569
ly-0.548
attered-0.529
ishy-0.510
 loudly-0.493
Token later
Token ID1568
Feature activation+3.48e+00

Pos logit contributions
Once+0.081
Pe+0.067
Sab+0.067
aman+0.066
At+0.066
Neg logit contributions
ly-0.083
der-0.083
attered-0.079
ishy-0.076
 loudly-0.071
Token,
Token ID11
Feature activation+4.15e+00

Pos logit contributions
Once+0.126
At+0.110
Pe+0.109
Sab+0.108
aman+0.107
Neg logit contributions
der-0.095
ly-0.091
attered-0.086
ishy-0.082
 loudly-0.077
Token when
Token ID618
Feature activation-1.21e+00

Pos logit contributions
Once+0.117
At+0.097
aman+0.094
Pe+0.094
ll+0.092
Neg logit contributions
der-0.147
ly-0.140
attered-0.135
ishy-0.132
 loudly-0.125
Token El
Token ID2574
Feature activation+6.46e+00

Pos logit contributions
der+0.043
ly+0.041
attered+0.040
ishy+0.039
 dot+0.037
Neg logit contributions
Once-0.033
At-0.028
aman-0.027
Pe-0.027
Mary-0.027
Token come
Token ID1282
Feature activation+7.18e+00

Pos logit contributions
attered+0.000
der+0.000
ishy+0.000
ly+0.000
 textures+0.000
Neg logit contributions
Once-0.000
At-0.000
 -0.000
n-0.000
Sab-0.000
Token true
Token ID2081
Feature activation-4.38e-01

Pos logit contributions
Once+0.239
At+0.207
 +0.197
Sab+0.194
n+0.188
Neg logit contributions
der-0.236
attered-0.220
ishy-0.220
ly-0.204
 textures-0.192
Token.
Token ID13
Feature activation-7.87e+00

Pos logit contributions
attered+0.011
ishy+0.010
 dot+0.007
der+0.007
working+0.007
Neg logit contributions
Once-0.018
 -0.016
At-0.016
ll-0.015
aman-0.015
Token
Token ID198
Feature activation+1.35e+01

Pos logit contributions
der+0.295
attered+0.274
.+0.269
ly+0.264
 because+0.263
Neg logit contributions
Once-0.173
aman-0.163
Sab-0.158
Pe-0.155
igi-0.153
Token
Token ID198
Feature activation+1.34e+01

Pos logit contributions
aman+0.426
 salon+0.382
tha+0.364
 ha+0.360
aren+0.356
Neg logit contributions
.-0.404
 loudly-0.330
 because-0.311
der-0.310
!-0.304
TokenMon
Token ID9069
Feature activation+1.55e+01

Pos logit contributions
aman+0.316
aren+0.294
Sab+0.276
ma+0.275
igi+0.270
Neg logit contributions
der-0.444
 loudly-0.425
attered-0.417
.-0.398
ly-0.394
Tokenths
Token ID9998
Feature activation+2.71e+00

Pos logit contributions
Once+0.409
At+0.336
Pe+0.328
 +0.328
Sab+0.320
Neg logit contributions
der-0.557
ly-0.544
attered-0.518
ishy-0.494
 loudly-0.493
Token later
Token ID1568
Feature activation+3.67e+00

Pos logit contributions
Once+0.081
Sab+0.069
aman+0.068
Mary+0.068
Pe+0.067
Neg logit contributions
ly-0.089
der-0.086
attered-0.082
ishy-0.078
.-0.076
Token,
Token ID11
Feature activation+4.07e+00

Pos logit contributions
Once+0.131
At+0.115
aman+0.112
Sab+0.112
Pe+0.112
Neg logit contributions
der-0.103
ly-0.097
attered-0.093
ishy-0.088
 loudly-0.083
Token Tim
Token ID5045
Feature activation+6.93e+00

Pos logit contributions
Once+0.131
At+0.113
aman+0.110
Pe+0.109
 +0.108
Neg logit contributions
der-0.129
ly-0.122
attered-0.121
ishy-0.118
 dot-0.107
Tokenmy
Token ID1820
Feature activation+3.13e+00

Pos logit contributions
Once+0.160
At+0.129
aman+0.124
Pe+0.121
Mary+0.121
Neg logit contributions
der-0.282
ly-0.269
attered-0.268
ishy-0.262
 dot-0.244
Token her
Token ID607
Feature activation+3.23e-01

Pos logit contributions
Once+0.149
At+0.129
Pe+0.125
aman+0.124
Sab+0.124
Neg logit contributions
der-0.137
ly-0.129
attered-0.126
ishy-0.122
 textures-0.112
Token.
Token ID13
Feature activation-7.59e+00

Pos logit contributions
Once+0.015
 +0.013
At+0.013
aman+0.013
ma+0.013
Neg logit contributions
attered-0.005
der-0.005
ishy-0.005
ly-0.005
 textures-0.004
Token
Token ID198
Feature activation+1.36e+01

Pos logit contributions
der+0.250
.+0.223
 because+0.217
ly+0.212
attered+0.211
Neg logit contributions
ll-0.211
Once-0.205
ma-0.199
prise-0.192
fters-0.181
Token
Token ID198
Feature activation+1.36e+01

Pos logit contributions
 salon+0.441
illas+0.433
rob+0.427
ital+0.417
prise+0.413
Neg logit contributions
.-0.364
 loudly-0.265
der-0.259
!-0.246
 much-0.245
TokenS
Token ID50
Feature activation+1.04e+01

Pos logit contributions
ll+0.341
ma+0.315
OV+0.304
|+0.298
 meets+0.295
Neg logit contributions
der-0.401
 loudly-0.399
attered-0.375
ly-0.372
.-0.368
Tokenaman
Token ID10546
Feature activation+1.54e+01

Pos logit contributions
Once+0.158
 +0.120
Sab+0.108
Are+0.106
Br+0.101
Neg logit contributions
ly-0.503
der-0.461
ishy-0.443
attered-0.440
.-0.423
Tokentha
Token ID12898
Feature activation+7.44e+00

Pos logit contributions
Once+0.332
At+0.264
 +0.215
Pe+0.213
Ver+0.206
Neg logit contributions
ly-0.680
der-0.649
attered-0.624
 loudly-0.597
ishy-0.589
Token knew
Token ID2993
Feature activation+2.00e+00

Pos logit contributions
Once+0.222
At+0.188
aman+0.184
Pe+0.183
Sab+0.181
Neg logit contributions
der-0.251
ly-0.241
attered-0.234
ishy-0.226
 loudly-0.212
Token that
Token ID326
Feature activation-3.07e+00

Pos logit contributions
Once+0.074
At+0.066
Pe+0.063
aman+0.063
Sab+0.063
Neg logit contributions
der-0.054
ly-0.050
attered-0.047
ishy-0.045
 loudly-0.043
Token the
Token ID262
Feature activation+1.65e+00

Pos logit contributions
der+0.095
ly+0.089
attered+0.087
ishy+0.084
 textures+0.079
Neg logit contributions
Once-0.100
At-0.087
aman-0.084
Pe-0.084
Sab-0.083
Token par
Token ID1582
Feature activation+8.27e+00

Pos logit contributions
Once+0.061
At+0.057
aman+0.057
Mary+0.055
ll+0.055
Neg logit contributions
ly-0.039
der-0.038
 dot-0.034
.-0.033
ishy-0.033
Token to
Token ID284
Feature activation+3.53e+00

Pos logit contributions
der+0.024
attered+0.023
ishy+0.020
ly+0.019
 textures+0.016
Neg logit contributions
Once-0.072
At-0.065
 -0.064
ma-0.062
n-0.062
Token life
Token ID1204
Feature activation+8.45e-01

Pos logit contributions
Once+0.117
At+0.102
aman+0.100
Pe+0.098
Mary+0.097
Neg logit contributions
der-0.107
ly-0.101
attered-0.098
ishy-0.096
 dot-0.088
Token.
Token ID13
Feature activation-7.93e+00

Pos logit contributions
Once+0.038
ma+0.035
 +0.034
At+0.034
n+0.033
Neg logit contributions
attered-0.016
der-0.016
ishy-0.015
 dot-0.012
ly-0.012
Token
Token ID198
Feature activation+1.33e+01

Pos logit contributions
der+0.242
ly+0.226
attered+0.223
ishy+0.210
 loudly+0.200
Neg logit contributions
Once-0.255
At-0.222
aman-0.220
Pe-0.220
Sab-0.219
Token
Token ID198
Feature activation+1.33e+01

Pos logit contributions
 salon+0.386
aman+0.385
 ha+0.350
bur+0.348
eking+0.347
Neg logit contributions
.-0.399
 loudly-0.331
der-0.309
!-0.304
 because-0.300
TokenMon
Token ID9069
Feature activation+1.54e+01

Pos logit contributions
aman+0.297
ll+0.294
Sab+0.277
aren+0.276
ma+0.273
Neg logit contributions
der-0.439
ly-0.427
 loudly-0.424
attered-0.419
.-0.387
Tokenths
Token ID9998
Feature activation+2.59e+00

Pos logit contributions
Once+0.401
At+0.335
Pe+0.328
 +0.318
Sab+0.318
Neg logit contributions
der-0.566
ly-0.554
attered-0.529
ishy-0.504
 loudly-0.499
Token passed
Token ID3804
Feature activation+3.68e+00

Pos logit contributions
Once+0.075
Sab+0.064
Pe+0.062
Mary+0.062
aman+0.061
Neg logit contributions
ly-0.089
der-0.084
attered-0.081
ishy-0.077
.-0.075
Token and
Token ID290
Feature activation+1.39e+00

Pos logit contributions
Once+0.134
At+0.118
aman+0.115
Mary+0.113
Pe+0.112
Neg logit contributions
der-0.103
attered-0.094
ly-0.092
ishy-0.091
 dot-0.083
Token Daisy
Token ID40355
Feature activation+2.64e+00

Pos logit contributions
Once+0.044
Pe+0.039
Sab+0.038
At+0.038
aman+0.038
Neg logit contributions
der-0.043
ly-0.042
attered-0.039
ishy-0.037
 loudly-0.037
Token almost
Token ID2048
Feature activation+2.30e+00

Pos logit contributions
Once+0.079
At+0.067
Pe+0.066
aman+0.066
Mary+0.065
Neg logit contributions
der-0.088
ly-0.084
attered-0.082
ishy-0.079
 loudly-0.074
Token
Token ID26391
Feature activation+3.71e+00

Pos logit contributions
Once+0.064
At+0.052
Sab+0.051
Pe+0.050
aman+0.049
Neg logit contributions
der-0.100
ly-0.099
attered-0.096
ishy-0.095
 textures-0.087
Token.
Token ID13
Feature activation-8.46e+00

Pos logit contributions
Once+0.090
aman+0.080
At+0.077
Sab+0.075
Pe+0.074
Neg logit contributions
der-0.125
attered-0.124
.-0.123
ishy-0.120
ly-0.120
Token
Token ID198
Feature activation+1.30e+01

Pos logit contributions
der+0.212
ly+0.190
attered+0.185
ishy+0.179
 because+0.174
Neg logit contributions
Once-0.295
ll-0.276
aman-0.274
At-0.270
ma-0.263
Token
Token ID198
Feature activation+1.30e+01

Pos logit contributions
aman+0.328
ll+0.319
 salon+0.300
ma+0.288
bur+0.280
Neg logit contributions
.-0.410
der-0.362
 loudly-0.356
elly-0.335
 because-0.331
TokenS
Token ID50
Feature activation+1.01e+01

Pos logit contributions
ll+0.292
aman+0.286
ma+0.266
Once+0.257
OV+0.249
Neg logit contributions
der-0.410
 loudly-0.398
ly-0.395
attered-0.393
.-0.387
Tokenaman
Token ID10546
Feature activation+1.53e+01

Pos logit contributions
Once+0.246
At+0.210
Sab+0.203
Pe+0.203
 +0.198
Neg logit contributions
ly-0.388
der-0.367
ishy-0.352
attered-0.338
.-0.332
Tokentha
Token ID12898
Feature activation+7.26e+00

Pos logit contributions
Once+0.353
At+0.295
Pe+0.245
Sab+0.232
Ver+0.229
Neg logit contributions
ly-0.657
der-0.619
attered-0.593
.-0.571
ishy-0.568
Token was
Token ID373
Feature activation+3.46e+00

Pos logit contributions
Once+0.234
At+0.201
Pe+0.197
aman+0.196
Sab+0.194
Neg logit contributions
der-0.227
ly-0.217
attered-0.211
ishy-0.204
 loudly-0.188
Token amazed
Token ID24562
Feature activation+1.34e+00

Pos logit contributions
Once+0.136
At+0.125
aman+0.124
ll+0.124
Sab+0.122
Neg logit contributions
der-0.077
ly-0.070
attered-0.070
ishy-0.067
 loudly-0.066
Token and
Token ID290
Feature activation+1.33e+00

Pos logit contributions
Once+0.054
aman+0.050
ma+0.047
Mary+0.046
At+0.046
Neg logit contributions
attered-0.031
der-0.030
ishy-0.029
 dot-0.028
working-0.025
Token she
Token ID673
Feature activation-3.65e+00

Pos logit contributions
Once+0.042
At+0.036
aman+0.035
Pe+0.035
Sab+0.034
Neg logit contributions
der-0.043
ly-0.041
attered-0.040
ishy-0.039
 loudly-0.036
Token and
Token ID290
Feature activation+1.15e+00

Pos logit contributions
Once+0.031
At+0.027
ll+0.027
n+0.027
 +0.027
Neg logit contributions
attered-0.016
ishy-0.015
der-0.014
ly-0.012
 textures-0.012
Token flying
Token ID7348
Feature activation-2.53e+00

Pos logit contributions
Once+0.038
At+0.034
Sab+0.032
 +0.032
Mary+0.032
Neg logit contributions
der-0.034
attered-0.033
ly-0.033
ishy-0.032
 dot-0.029
Token.
Token ID13
Feature activation-8.14e+00

Pos logit contributions
der+0.067
attered+0.066
ishy+0.065
ly+0.060
 dot+0.054
Neg logit contributions
Once-0.089
At-0.082
 -0.079
Sab-0.077
Mary-0.076
Token
Token ID198
Feature activation+1.32e+01

Pos logit contributions
der+0.262
attered+0.238
 because+0.237
.+0.237
ly+0.232
Neg logit contributions
Once-0.215
igi-0.211
ll-0.210
Pe-0.204
Sab-0.198
Token
Token ID198
Feature activation+1.32e+01

Pos logit contributions
aman+0.409
 salon+0.380
bur+0.378
tha+0.376
illas+0.374
Neg logit contributions
.-0.372
 loudly-0.290
 because-0.277
!-0.276
der-0.275
TokenMon
Token ID9069
Feature activation+1.53e+01

Pos logit contributions
ll+0.291
igi+0.290
aman+0.287
bur+0.276
Sab+0.263
Neg logit contributions
der-0.426
 loudly-0.416
attered-0.411
ly-0.392
.-0.389
Tokenths
Token ID9998
Feature activation+2.41e+00

Pos logit contributions
Once+0.425
At+0.357
 +0.349
Pe+0.344
Sab+0.329
Neg logit contributions
der-0.531
ly-0.518
attered-0.488
 loudly-0.467
ishy-0.458
Token later
Token ID1568
Feature activation+3.64e+00

Pos logit contributions
Once+0.071
Sab+0.061
Pe+0.060
Mary+0.058
 +0.058
Neg logit contributions
ly-0.080
der-0.078
attered-0.076
ishy-0.071
.-0.068
Token,
Token ID11
Feature activation+4.14e+00

Pos logit contributions
Once+0.131
At+0.116
Pe+0.114
Sab+0.113
aman+0.112
Neg logit contributions
der-0.099
ly-0.094
attered-0.090
ishy-0.084
 loudly-0.080
Token Sarah
Token ID10490
Feature activation+2.45e+00

Pos logit contributions
Once+0.123
At+0.105
aman+0.102
Pe+0.101
Mary+0.100
Neg logit contributions
der-0.141
ly-0.133
attered-0.132
ishy-0.128
 dot-0.118
Token passed
Token ID3804
Feature activation+3.65e+00

Pos logit contributions
Once+0.073
At+0.062
aman+0.060
Pe+0.060
Sab+0.059
Neg logit contributions
der-0.083
ly-0.079
attered-0.077
ishy-0.075
 dot-0.069
Token laughed
Token ID13818
Feature activation+1.67e+00

Pos logit contributions
Once+0.066
Pe+0.064
Sab+0.063
igi+0.060
aman+0.060
Neg logit contributions
ly-0.063
der-0.063
.-0.058
 loudly-0.056
 because-0.054
Token and
Token ID290
Feature activation+1.31e+00

Pos logit contributions
Once+0.054
 +0.044
At+0.041
ma+0.036
<|endoftext|>+0.035
Neg logit contributions
attered-0.059
ishy-0.057
 textures-0.053
working-0.053
powers-0.052
Token said
Token ID531
Feature activation+6.30e+00

Pos logit contributions
Once+0.040
Sab+0.037
Pe+0.036
At+0.036
lor+0.036
Neg logit contributions
der-0.042
ly-0.037
.-0.037
 because-0.037
 loudly-0.035
Token,
Token ID11
Feature activation+3.83e+00

Pos logit contributions
Once+0.212
At+0.183
aman+0.178
Pe+0.177
 +0.177
Neg logit contributions
der-0.188
attered-0.180
ly-0.178
ishy-0.174
 dot-0.158
Token "
Token ID366
Feature activation+3.99e+00

Pos logit contributions
Once+0.149
aman+0.140
aren+0.137
|+0.137
igi+0.137
Neg logit contributions
der-0.098
ly-0.077
 because-0.076
 loudly-0.074
.-0.074
TokenMon
Token ID9069
Feature activation+1.53e+01

Pos logit contributions
Once+0.107
aman+0.092
Sab+0.091
 +0.088
At+0.088
Neg logit contributions
der-0.140
ly-0.132
attered-0.132
ishy-0.125
 textures-0.121
Tokensters
Token ID5937
Feature activation+3.95e-01

Pos logit contributions
Once+0.389
At+0.320
aman+0.309
Pe+0.305
Mary+0.303
Neg logit contributions
der-0.588
ly-0.560
attered-0.556
ishy-0.542
 dot-0.505
Token aren
Token ID3588
Feature activation+5.21e+00

Pos logit contributions
Once+0.010
At+0.008
ll+0.008
 +0.008
ma+0.008
Neg logit contributions
der-0.015
attered-0.014
ly-0.014
ishy-0.014
 dot-0.013
Token't
Token ID470
Feature activation+2.34e+00

Pos logit contributions
Once+0.110
aman+0.098
Pe+0.093
Mary+0.091
At+0.091
Neg logit contributions
der-0.209
ly-0.198
attered-0.187
ishy-0.182
.-0.181
Token real
Token ID1103
Feature activation-4.64e+00

Pos logit contributions
Once+0.087
At+0.072
Pe+0.072
n+0.072
 +0.072
Neg logit contributions
attered-0.067
der-0.066
ishy-0.060
ly-0.058
 dot-0.054
Token,
Token ID11
Feature activation+3.87e+00

Pos logit contributions
attered+0.168
ishy+0.154
der+0.142
 uneven+0.128
 dot+0.128
Neg logit contributions
Once-0.127
n-0.109
ll-0.104
 ha-0.100
At-0.099
Token.
Token ID13
Feature activation-8.32e+00

Pos logit contributions
attered+0.172
ishy+0.130
der+0.116
ly+0.110
 dot+0.104
Neg logit contributions
Once-0.321
n-0.282
At-0.281
 -0.280
Mary-0.268
Token He
Token ID679
Feature activation-3.15e+00

Pos logit contributions
der+0.191
ly+0.162
attered+0.161
ishy+0.153
 loudly+0.139
Neg logit contributions
Once-0.341
At-0.307
aman-0.307
ll-0.300
Sab-0.299
Token said
Token ID531
Feature activation+6.11e+00

Pos logit contributions
der+0.111
ly+0.106
attered+0.106
ishy+0.103
 dot+0.095
Neg logit contributions
Once-0.089
At-0.075
aman-0.071
Pe-0.071
Mary-0.071
Token,
Token ID11
Feature activation+3.87e+00

Pos logit contributions
Once+0.214
At+0.188
aman+0.185
Pe+0.182
Mary+0.181
Neg logit contributions
der-0.177
ly-0.166
attered-0.161
ishy-0.156
 loudly-0.143
Token "
Token ID366
Feature activation+3.82e+00

Pos logit contributions
Once+0.148
aman+0.140
Mary+0.134
aren+0.134
Pe+0.134
Neg logit contributions
der-0.102
ly-0.083
 loudly-0.079
.-0.078
 because-0.076
TokenMon
Token ID9069
Feature activation+1.53e+01

Pos logit contributions
Once+0.103
aman+0.087
At+0.086
Sab+0.083
Mary+0.082
Neg logit contributions
der-0.139
ly-0.132
attered-0.132
ishy-0.128
 dot-0.121
Tokenkey
Token ID2539
Feature activation+2.11e+00

Pos logit contributions
Once+0.383
At+0.316
aman+0.307
Pe+0.297
Sab+0.296
Neg logit contributions
der-0.582
ly-0.560
attered-0.550
ishy-0.539
 loudly-0.503
Token,
Token ID11
Feature activation+3.90e+00

Pos logit contributions
Once+0.051
 +0.045
ma+0.044
ll+0.042
n+0.041
Neg logit contributions
attered-0.081
der-0.079
ishy-0.075
ly-0.071
 dot-0.070
Token monkey
Token ID21657
Feature activation+3.36e+00

Pos logit contributions
Once+0.134
At+0.121
aman+0.118
Mary+0.116
Pe+0.114
Neg logit contributions
der-0.112
ly-0.105
ishy-0.096
attered-0.095
 dot-0.092
Token,
Token ID11
Feature activation+3.86e+00

Pos logit contributions
Once+0.096
ma+0.083
n+0.082
 +0.081
ll+0.080
Neg logit contributions
attered-0.119
der-0.114
ishy-0.110
ly-0.101
 textures-0.100
Token you
Token ID345
Feature activation+7.18e-01

Pos logit contributions
Once+0.126
At+0.111
aman+0.110
Mary+0.108
Pe+0.107
Neg logit contributions
der-0.118
ly-0.111
attered-0.104
ishy-0.104
 dot-0.097
Token outside
Token ID2354
Feature activation-5.91e-01

Pos logit contributions
der+0.024
attered+0.023
ishy+0.022
ly+0.020
 dot+0.020
Neg logit contributions
Once-0.031
ma-0.028
At-0.028
 -0.028
aman-0.028
Token."
Token ID526
Feature activation-2.71e+00

Pos logit contributions
der+0.015
attered+0.015
ishy+0.014
ly+0.013
 dot+0.012
Neg logit contributions
Once-0.022
ma-0.020
At-0.019
 -0.019
aman-0.018
Token
Token ID198
Feature activation+1.30e+01

Pos logit contributions
der+0.093
.+0.093
 because+0.087
ly+0.080
 loudly+0.080
Neg logit contributions
fters-0.066
Sab-0.065
Once-0.064
prise-0.063
ll-0.063
Token
Token ID198
Feature activation+1.31e+01

Pos logit contributions
fters+0.351
eks+0.348
Once+0.332
|+0.330
 salon+0.327
Neg logit contributions
.-0.405
der-0.359
 loudly-0.339
!-0.303
 because-0.301
TokenS
Token ID50
Feature activation+9.92e+00

Pos logit contributions
Once+0.251
ll+0.230
|+0.227
Sab+0.224
 +0.216
Neg logit contributions
der-0.502
 loudly-0.474
ly-0.467
attered-0.464
.-0.442
Tokenaman
Token ID10546
Feature activation+1.53e+01

Pos logit contributions
Once+0.149
Sab+0.119
 +0.105
Pe+0.101
At+0.096
Neg logit contributions
ly-0.480
der-0.470
attered-0.454
ishy-0.444
.-0.416
Tokentha
Token ID12898
Feature activation+7.34e+00

Pos logit contributions
Once+0.321
At+0.253
Sab+0.213
Pe+0.208
Ver+0.202
Neg logit contributions
ly-0.683
der-0.661
attered-0.638
ishy-0.602
.-0.596
Token looked
Token ID3114
Feature activation-1.79e+00

Pos logit contributions
Once+0.228
At+0.196
aman+0.190
Pe+0.189
Sab+0.187
Neg logit contributions
der-0.239
ly-0.226
attered-0.223
ishy-0.217
 dot-0.199
Token at
Token ID379
Feature activation+4.27e+00

Pos logit contributions
der+0.044
attered+0.040
 textures+0.038
 dot+0.038
ishy+0.037
Neg logit contributions
Once-0.068
 -0.061
At-0.060
aman-0.059
ma-0.056
Token him
Token ID683
Feature activation+3.15e+00

Pos logit contributions
Once+0.130
At+0.112
Sab+0.107
Pe+0.106
ll+0.105
Neg logit contributions
der-0.144
ly-0.137
attered-0.132
ishy-0.129
 textures-0.121
Token and
Token ID290
Feature activation+1.39e+00

Pos logit contributions
Once+0.112
 +0.101
At+0.095
aman+0.092
ma+0.092
Neg logit contributions
attered-0.088
der-0.087
ishy-0.086
ly-0.079
working-0.076
Token come
Token ID1282
Feature activation+6.51e+00

Pos logit contributions
Once+0.096
At+0.085
aman+0.082
Pe+0.081
Mary+0.081
Neg logit contributions
der-0.087
ly-0.082
attered-0.078
ishy-0.077
 loudly-0.072
Token.
Token ID13
Feature activation-8.39e+00

Pos logit contributions
Once+0.246
At+0.215
 +0.210
Sab+0.205
Pe+0.203
Neg logit contributions
der-0.184
attered-0.160
ishy-0.159
ly-0.155
 textures-0.143
Token
Token ID220
Feature activation+1.03e+01

Pos logit contributions
der+0.268
 because+0.246
.+0.244
attered+0.233
ly+0.226
Neg logit contributions
fters-0.218
ll-0.215
Once-0.213
ma-0.208
aman-0.207
Token
Token ID198
Feature activation+1.30e+01

Pos logit contributions
Once+0.259
aman+0.257
Pe+0.236
Sab+0.232
fters+0.229
Neg logit contributions
.-0.351
der-0.336
attered-0.317
 because-0.290
 loudly-0.286
Token
Token ID198
Feature activation+1.30e+01

Pos logit contributions
aman+0.479
 salon+0.446
use+0.430
tha+0.414
 ha+0.410
Neg logit contributions
.-0.305
 loudly-0.234
 He-0.222
 much-0.216
 They-0.196
TokenMon
Token ID9069
Feature activation+1.52e+01

Pos logit contributions
aman+0.368
aren+0.350
tha+0.338
ll+0.337
bur+0.326
Neg logit contributions
 loudly-0.323
der-0.300
.-0.279
 beautifully-0.265
 player-0.265
Tokenths
Token ID9998
Feature activation+2.40e+00

Pos logit contributions
Once+0.428
At+0.364
 +0.357
Pe+0.351
aman+0.342
Neg logit contributions
der-0.500
ly-0.496
attered-0.450
 loudly-0.448
.-0.433
Token went
Token ID1816
Feature activation+2.72e+00

Pos logit contributions
Once+0.074
Mary+0.064
aman+0.063
Girl+0.063
 +0.063
Neg logit contributions
ly-0.076
der-0.069
attered-0.068
.-0.066
working-0.063
Token by
Token ID416
Feature activation+5.67e+00

Pos logit contributions
Once+0.096
At+0.084
aman+0.081
Pe+0.081
Mary+0.080
Neg logit contributions
der-0.079
ly-0.073
attered-0.073
ishy-0.071
 dot-0.064
Token,
Token ID11
Feature activation+3.86e+00

Pos logit contributions
Once+0.215
At+0.189
aman+0.185
Pe+0.183
Mary+0.183
Neg logit contributions
der-0.146
attered-0.136
ly-0.136
ishy-0.131
 dot-0.116
Token and
Token ID290
Feature activation+1.28e+00

Pos logit contributions
Once+0.148
At+0.133
Sab+0.132
Pe+0.132
aman+0.131
Neg logit contributions
der-0.090
ly-0.085
 loudly-0.078
attered-0.077
 because-0.075
Token my
Token ID616
Feature activation+1.75e+00

Pos logit contributions
Once+0.135
At+0.119
aman+0.117
Pe+0.116
Sab+0.115
Neg logit contributions
der-0.126
ly-0.118
attered-0.111
ishy-0.110
 loudly-0.103
Token dear
Token ID13674
Feature activation-1.05e-01

Pos logit contributions
Once+0.059
 +0.053
ma+0.051
aman+0.049
At+0.049
Neg logit contributions
attered-0.050
der-0.049
ishy-0.045
ly-0.045
 dot-0.041
Token.
Token ID13
Feature activation-8.01e+00

Pos logit contributions
attered+0.002
der+0.002
ishy+0.002
 dot+0.002
ly+0.002
Neg logit contributions
Once-0.005
ma-0.004
 -0.004
Mary-0.004
aman-0.004
Tokenâ
Token ID22940
Feature activation+3.20e+00

Pos logit contributions
der+0.198
ly+0.177
attered+0.173
ishy+0.167
 dot+0.153
Neg logit contributions
Once-0.299
aman-0.272
At-0.270
Pe-0.263
Mary-0.263
Token
Token ID26391
Feature activation+4.54e+00

Pos logit contributions
Once+0.082
At+0.067
Sab+0.066
aman+0.065
Pe+0.065
Neg logit contributions
der-0.120
ly-0.118
attered-0.114
ishy-0.113
 textures-0.103
Token<|endoftext|>
Token ID50256
Feature activation+1.51e+01

Pos logit contributions
Once+0.119
aman+0.104
At+0.102
Sab+0.099
Pe+0.098
Neg logit contributions
der-0.156
attered-0.150
ly-0.147
ishy-0.145
.-0.141
TokenOnce
Token ID7454
Feature activation+1.31e+01

Pos logit contributions
aman+0.311
Once+0.291
Pe+0.270
Sab+0.266
igi+0.257
Neg logit contributions
.-0.614
der-0.570
ly-0.551
attered-0.512
ishy-0.509
Token upon
Token ID2402
Feature activation+8.48e+00

Pos logit contributions
Pe+0.366
Bet+0.310
Sab+0.309
Look+0.308
Wil+0.306
Neg logit contributions
.-0.491
ly-0.428
 because-0.425
der-0.424
ishy-0.409
Token a
Token ID257
Feature activation-4.03e+00

Pos logit contributions
Once+0.329
 +0.276
At+0.275
Mary+0.269
Pe+0.269
Neg logit contributions
der-0.236
attered-0.236
ishy-0.221
ly-0.220
 dot-0.190
Token time
Token ID640
Feature activation+5.08e+00

Pos logit contributions
der+0.123
ishy+0.120
ly+0.120
attered+0.114
 loudly+0.109
Neg logit contributions
Once-0.117
Sab-0.111
Pe-0.106
At-0.104
aman-0.104
Token,
Token ID11
Feature activation+4.07e+00

Pos logit contributions
Pe+0.251
fters+0.246
aman+0.243
Sab+0.242
Kids+0.242
Neg logit contributions
.-0.069
 because-0.056
!-0.043
ly-0.034
der-0.031
Token said
Token ID531
Feature activation+6.56e+00

Pos logit contributions
Once+0.108
At+0.090
Pe+0.087
aman+0.085
Sab+0.083
Neg logit contributions
der-0.172
ly-0.168
attered-0.159
ishy-0.155
 loudly-0.151
Token seriously
Token ID6411
Feature activation-4.28e-01

Pos logit contributions
Once+0.266
 +0.258
n+0.231
ma+0.231
aman+0.225
Neg logit contributions
ishy-0.142
attered-0.140
der-0.138
 dot-0.126
 textures-0.119
Token.
Token ID13
Feature activation-7.93e+00

Pos logit contributions
der+0.006
attered+0.006
ishy+0.005
ly+0.005
 textures+0.004
Neg logit contributions
Once-0.021
At-0.019
 -0.019
ll-0.019
aman-0.019
Token
Token ID198
Feature activation+1.37e+01

Pos logit contributions
der+0.251
.+0.246
 because+0.242
 They+0.214
 loudly+0.211
Neg logit contributions
prise-0.244
ma-0.227
reetings-0.218
fters-0.211
eks-0.211
Token
Token ID198
Feature activation+1.36e+01

Pos logit contributions
prise+0.507
eks+0.497
 salon+0.490
fters+0.483
aman+0.481
Neg logit contributions
.-0.372
 because-0.248
."-0.239
der-0.227
!-0.217
TokenPe
Token ID6435
Feature activation+1.51e+01

Pos logit contributions
ma+0.315
Once+0.294
aren+0.279
OV+0.276
ll+0.274
Neg logit contributions
der-0.475
ly-0.469
attered-0.451
 loudly-0.450
.-0.426
Tokenppa
Token ID44989
Feature activation+4.65e+00

Pos logit contributions
Once+0.449
 +0.395
At+0.386
Kids+0.370
|+0.365
Neg logit contributions
der-0.494
ly-0.478
attered-0.438
.-0.429
 loudly-0.422
Token rushed
Token ID13999
Feature activation+7.73e-01

Pos logit contributions
Once+0.151
At+0.131
Pe+0.130
aman+0.129
Sab+0.127
Neg logit contributions
der-0.143
ly-0.137
attered-0.133
ishy-0.127
 loudly-0.119
Token back
Token ID736
Feature activation+1.02e+00

Pos logit contributions
Once+0.033
aman+0.031
Pe+0.031
str+0.030
Sab+0.030
Neg logit contributions
ly-0.015
der-0.014
attered-0.013
 loudly-0.012
.-0.012
Token to
Token ID284
Feature activation+3.81e+00

Pos logit contributions
Once+0.044
Pe+0.040
aman+0.039
At+0.039
Mary+0.038
Neg logit contributions
ly-0.020
der-0.019
attered-0.018
ishy-0.017
.-0.017
Token her
Token ID607
Feature activation+5.44e-01

Pos logit contributions
Once+0.145
ll+0.140
At+0.135
igi+0.134
str+0.134
Neg logit contributions
.-0.086
 because-0.078
 loudly-0.075
der-0.073
ly-0.071
Token her
Token ID607
Feature activation+3.44e-01

Pos logit contributions
Once+0.076
At+0.065
aman+0.062
 +0.062
Mary+0.062
Neg logit contributions
der-0.089
attered-0.088
ishy-0.085
ly-0.085
 dot-0.078
Token wise
Token ID10787
Feature activation-3.36e+00

Pos logit contributions
Once+0.013
At+0.011
aman+0.011
Sab+0.011
 +0.011
Neg logit contributions
der-0.009
attered-0.009
ly-0.008
ishy-0.008
 dot-0.007
Token old
Token ID1468
Feature activation-3.36e+00

Pos logit contributions
attered+0.092
der+0.088
ishy+0.084
ly+0.076
 dot+0.071
Neg logit contributions
Once-0.119
 -0.106
aman-0.103
At-0.102
Sab-0.101
Token friend
Token ID1545
Feature activation-5.07e-01

Pos logit contributions
der+0.093
attered+0.091
ishy+0.086
ly+0.085
 dot+0.075
Neg logit contributions
Once-0.121
At-0.105
 -0.104
aman-0.103
Mary-0.102
Token.
Token ID13
Feature activation-7.50e+00

Pos logit contributions
der+0.013
attered+0.012
ishy+0.012
ly+0.011
 dot+0.010
Neg logit contributions
Once-0.020
At-0.018
 -0.018
ma-0.017
n-0.017
Token<|endoftext|>
Token ID50256
Feature activation+1.51e+01

Pos logit contributions
der+0.272
.+0.258
 because+0.251
attered+0.240
ly+0.234
Neg logit contributions
aman-0.168
Once-0.165
fters-0.161
Danny-0.161
Mary-0.155
TokenOnce
Token ID7454
Feature activation+1.32e+01

Pos logit contributions
aman+0.443
 salon+0.392
Sab+0.382
Pe+0.377
shadow+0.373
Neg logit contributions
.-0.555
der-0.429
!-0.426
 because-0.426
ly-0.412
Token upon
Token ID2402
Feature activation+8.32e+00

Pos logit contributions
Pe+0.310
Mary+0.278
Alice+0.274
Sab+0.271
aman+0.271
Neg logit contributions
.-0.512
der-0.471
ly-0.467
 because-0.454
ishy-0.442
Token a
Token ID257
Feature activation-4.08e+00

Pos logit contributions
Once+0.333
At+0.285
 +0.283
Pe+0.273
Mary+0.270
Neg logit contributions
der-0.219
attered-0.211
ly-0.201
ishy-0.200
 dot-0.171
Token time
Token ID640
Feature activation+5.00e+00

Pos logit contributions
der+0.116
ishy+0.115
ly+0.113
attered+0.107
.+0.106
Neg logit contributions
Sab-0.121
Once-0.121
Pe-0.113
aman-0.110
Girl-0.110
Token,
Token ID11
Feature activation+3.91e+00

Pos logit contributions
Pe+0.246
Kids+0.244
aman+0.242
Girl+0.240
Sab+0.238
Neg logit contributions
.-0.071
 because-0.058
!-0.043
ly-0.037
der-0.036
Token can
Token ID460
Feature activation+4.21e-01

Pos logit contributions
der+0.103
ishy+0.099
ly+0.098
attered+0.096
.+0.089
Neg logit contributions
Once-0.088
aman-0.077
Pe-0.075
Mary-0.074
Sab-0.073
Token try
Token ID1949
Feature activation-3.43e+00

Pos logit contributions
Once+0.008
At+0.006
 +0.005
aman+0.005
Pe+0.005
Neg logit contributions
der-0.019
ly-0.019
attered-0.019
ishy-0.018
 textures-0.017
Token!
Token ID0
Feature activation-6.70e+00

Pos logit contributions
der+0.090
attered+0.090
ly+0.086
ishy+0.084
 dot+0.075
Neg logit contributions
Once-0.128
At-0.112
 -0.111
n-0.108
aman-0.108
Tokenâ
Token ID22940
Feature activation+3.26e+00

Pos logit contributions
der+0.178
ly+0.166
attered+0.161
ishy+0.156
 dot+0.143
Neg logit contributions
Once-0.239
aman-0.213
At-0.213
Mary-0.207
ll-0.207
Token
Token ID26391
Feature activation+4.71e+00

Pos logit contributions
Once+0.079
At+0.065
aman+0.063
Sab+0.063
Pe+0.062
Neg logit contributions
der-0.126
ly-0.124
attered-0.121
ishy-0.119
 textures-0.110
Token<|endoftext|>
Token ID50256
Feature activation+1.50e+01

Pos logit contributions
Once+0.131
aman+0.115
At+0.114
Pe+0.109
Sab+0.109
Neg logit contributions
der-0.155
attered-0.148
ly-0.147
ishy-0.145
.-0.137
TokenBilly
Token ID49640
Feature activation+5.59e+00

Pos logit contributions
aman+0.293
Once+0.283
Pe+0.263
Sab+0.257
igi+0.246
Neg logit contributions
.-0.612
der-0.569
ly-0.561
attered-0.521
ishy-0.516
Token was
Token ID373
Feature activation+3.57e+00

Pos logit contributions
Once+0.207
At+0.182
aman+0.178
Pe+0.175
 +0.175
Neg logit contributions
der-0.150
ly-0.138
attered-0.137
ishy-0.132
 dot-0.119
Token three
Token ID1115
Feature activation+1.20e-01

Pos logit contributions
Once+0.133
Sab+0.124
aman+0.121
Pe+0.120
ll+0.120
Neg logit contributions
der-0.081
attered-0.081
ishy-0.078
ly-0.075
 loudly-0.074
Token years
Token ID812
Feature activation-6.50e+00

Pos logit contributions
Once+0.003
ma+0.003
 +0.003
At+0.003
Mary+0.003
Neg logit contributions
attered-0.004
der-0.004
 dot-0.004
ishy-0.004
working-0.004
Token old
Token ID1468
Feature activation-3.59e+00

Pos logit contributions
der+0.128
attered+0.121
ishy+0.113
ly+0.109
 dot+0.098
Neg logit contributions
Once-0.279
At-0.252
aman-0.251
 -0.246
ma-0.246
Tokenured
Token ID1522
Feature activation+6.88e-01

Pos logit contributions
Once+0.068
n+0.063
At+0.060
Sab+0.059
Pe+0.059
Neg logit contributions
der-0.079
attered-0.077
ishy-0.075
ly-0.073
 dot-0.071
Token in
Token ID287
Feature activation+2.98e+00

Pos logit contributions
Once+0.023
At+0.021
aman+0.020
 +0.020
Mary+0.020
Neg logit contributions
attered-0.019
der-0.019
ishy-0.019
ly-0.017
 dot-0.017
Token the
Token ID262
Feature activation+1.79e+00

Pos logit contributions
Once+0.091
At+0.077
 +0.075
aman+0.074
Mary+0.074
Neg logit contributions
der-0.102
attered-0.097
ly-0.097
ishy-0.093
 dot-0.084
Token garden
Token ID11376
Feature activation-2.21e-01

Pos logit contributions
Once+0.050
At+0.041
 +0.040
aman+0.039
Mary+0.037
Neg logit contributions
der-0.070
attered-0.065
ly-0.063
ishy-0.061
 dot-0.057
Token.
Token ID13
Feature activation-7.34e+00

Pos logit contributions
der+0.003
attered+0.003
ishy+0.003
ly+0.003
 dot+0.002
Neg logit contributions
Once-0.011
At-0.010
 -0.010
aman-0.009
ma-0.009
Token<|endoftext|>
Token ID50256
Feature activation+1.50e+01

Pos logit contributions
der+0.284
attered+0.265
ly+0.262
.+0.256
 because+0.254
Neg logit contributions
Once-0.155
Pe-0.139
aman-0.133
ll-0.132
At-0.128
TokenOnce
Token ID7454
Feature activation+1.32e+01

Pos logit contributions
aman+0.322
Once+0.294
Pe+0.285
Sab+0.278
shadow+0.272
Neg logit contributions
.-0.602
der-0.550
ly-0.526
attered-0.504
 because-0.501
Token upon
Token ID2402
Feature activation+8.42e+00

Pos logit contributions
Pe+0.313
Once+0.276
Look+0.267
Sab+0.261
Wil+0.257
Neg logit contributions
.-0.509
der-0.480
ly-0.475
 because-0.464
ishy-0.454
Token a
Token ID257
Feature activation-3.98e+00

Pos logit contributions
Once+0.320
At+0.267
 +0.266
Mary+0.257
Pe+0.255
Neg logit contributions
attered-0.249
der-0.249
ishy-0.235
ly-0.232
 dot-0.201
Token time
Token ID640
Feature activation+5.09e+00

Pos logit contributions
der+0.125
ly+0.121
ishy+0.120
attered+0.117
 loudly+0.110
Neg logit contributions
Once-0.114
Sab-0.106
Pe-0.102
aman-0.101
At-0.100
Token,
Token ID11
Feature activation+4.04e+00

Pos logit contributions
Pe+0.243
aman+0.242
fters+0.240
Kids+0.237
igi+0.237
Neg logit contributions
.-0.071
 because-0.063
!-0.044
ly-0.040
der-0.040
Token again
Token ID757
Feature activation+3.31e+00

Pos logit contributions
Once+0.299
At+0.264
aman+0.257
Mary+0.255
Pe+0.254
Neg logit contributions
der-0.190
attered-0.176
ly-0.176
ishy-0.168
 dot-0.149
Token full
Token ID1336
Feature activation-6.20e+00

Pos logit contributions
Once+0.164
At+0.148
 +0.146
aman+0.143
Mary+0.142
Neg logit contributions
der-0.048
attered-0.048
ishy-0.043
ly-0.043
 dot-0.034
Token of
Token ID286
Feature activation+1.15e-01

Pos logit contributions
der+0.144
attered+0.140
ishy+0.134
ly+0.128
 dot+0.117
Neg logit contributions
Once-0.252
At-0.222
aman-0.216
 -0.216
ma-0.213
Token joy
Token ID8716
Feature activation+7.55e-01

Pos logit contributions
Once+0.004
At+0.004
aman+0.003
 +0.003
Mary+0.003
Neg logit contributions
der-0.003
attered-0.003
ly-0.003
ishy-0.003
 dot-0.003
Token.
Token ID13
Feature activation-7.63e+00

Pos logit contributions
Once+0.032
ll+0.029
ma+0.029
aman+0.029
 +0.028
Neg logit contributions
ishy-0.014
attered-0.014
 dot-0.012
der-0.011
 obstacle-0.010
Token<|endoftext|>
Token ID50256
Feature activation+1.50e+01

Pos logit contributions
der+0.295
.+0.265
 because+0.264
attered+0.263
ly+0.257
Neg logit contributions
Once-0.160
ll-0.149
Sab-0.147
ma-0.145
igi-0.142
TokenOnce
Token ID7454
Feature activation+1.31e+01

Pos logit contributions
aman+0.337
Once+0.299
Sab+0.292
Pe+0.290
shadow+0.287
Neg logit contributions
.-0.602
der-0.533
ly-0.516
!-0.491
 because-0.486
Token upon
Token ID2402
Feature activation+8.45e+00

Pos logit contributions
Pe+0.352
Sab+0.301
Bet+0.300
Look+0.300
lor+0.300
Neg logit contributions
.-0.493
ly-0.437
der-0.436
 because-0.434
ishy-0.420
Token a
Token ID257
Feature activation-4.04e+00

Pos logit contributions
Once+0.328
At+0.277
 +0.275
Mary+0.269
Pe+0.267
Neg logit contributions
der-0.236
attered-0.235
ishy-0.220
ly-0.220
 dot-0.190
Token time
Token ID640
Feature activation+5.13e+00

Pos logit contributions
der+0.123
ly+0.120
ishy+0.120
attered+0.115
 loudly+0.109
Neg logit contributions
Once-0.117
Sab-0.111
Pe-0.106
aman-0.104
At-0.104
Token,
Token ID11
Feature activation+3.99e+00

Pos logit contributions
Pe+0.249
aman+0.247
fters+0.246
Sab+0.244
igi+0.242
Neg logit contributions
.-0.071
 because-0.061
!-0.045
ly-0.037
 loudly-0.035
Token touching
Token ID15241
Feature activation+1.63e-01

Pos logit contributions
Once+0.116
aman+0.107
At+0.105
ll+0.104
Sab+0.104
Neg logit contributions
der-0.094
ly-0.084
.-0.084
 because-0.083
 loudly-0.081
Token me
Token ID502
Feature activation+6.96e+00

Pos logit contributions
Once+0.005
 +0.004
At+0.004
aman+0.004
ma+0.004
Neg logit contributions
attered-0.005
der-0.005
ly-0.005
ishy-0.005
 dot-0.005
Token.
Token ID13
Feature activation-7.68e+00

Pos logit contributions
Once+0.282
 +0.248
At+0.247
ma+0.245
n+0.244
Neg logit contributions
der-0.164
attered-0.159
ly-0.146
ishy-0.146
 dot-0.133
Tokenâ
Token ID22940
Feature activation+3.42e+00

Pos logit contributions
der+0.201
ly+0.177
attered+0.175
ishy+0.169
 dot+0.157
Neg logit contributions
Once-0.276
At-0.249
aman-0.248
ll-0.244
Sab-0.242
Token
Token ID26391
Feature activation+4.87e+00

Pos logit contributions
Once+0.085
Sab+0.070
At+0.070
Pe+0.068
aman+0.067
Neg logit contributions
ly-0.129
der-0.128
attered-0.123
ishy-0.122
 textures-0.111
Token<|endoftext|>
Token ID50256
Feature activation+1.50e+01

Pos logit contributions
Once+0.105
aman+0.093
Sab+0.090
At+0.089
igi+0.088
Neg logit contributions
der-0.178
attered-0.173
.-0.172
ishy-0.170
ly-0.168
TokenOnce
Token ID7454
Feature activation+1.32e+01

Pos logit contributions
aman+0.301
Once+0.285
Pe+0.268
Sab+0.267
igi+0.254
Neg logit contributions
.-0.613
der-0.569
ly-0.553
attered-0.518
ishy-0.515
Token there
Token ID612
Feature activation+1.99e+00

Pos logit contributions
Pe+0.345
Sab+0.295
Once+0.292
Bet+0.284
Alice+0.283
Neg logit contributions
.-0.502
ly-0.453
der-0.448
 because-0.445
ishy-0.438
Token was
Token ID373
Feature activation+3.62e+00

Pos logit contributions
Sab+0.080
fters+0.077
Pe+0.077
igi+0.076
shadow+0.075
Neg logit contributions
.-0.046
 because-0.038
!-0.034
ly-0.033
 like-0.033
Token a
Token ID257
Feature activation-4.04e+00

Pos logit contributions
Once+0.146
At+0.126
 +0.122
Pe+0.122
Mary+0.120
Neg logit contributions
der-0.094
ly-0.086
attered-0.084
ishy-0.079
 dot-0.069
Token girl
Token ID2576
Feature activation-1.72e-01

Pos logit contributions
der+0.131
attered+0.126
ly+0.118
ishy+0.109
 textures+0.105
Neg logit contributions
Once-0.143
Pe-0.120
At-0.119
 -0.115
str-0.111
TokenY
Token ID56
Feature activation+6.68e-01

Pos logit contributions
der+0.030
ly+0.029
attered+0.029
ishy+0.028
 dot+0.026
Neg logit contributions
Once-0.020
At-0.016
Mary-0.016
Pe-0.016
 -0.015
Tokenay
Token ID323
Feature activation+4.13e+00

Pos logit contributions
Once+0.021
aman+0.020
ll+0.018
ma+0.018
At+0.018
Neg logit contributions
der-0.021
attered-0.019
 dot-0.019
 textures-0.018
ishy-0.018
Token!
Token ID0
Feature activation-6.63e+00

Pos logit contributions
Once+0.165
ma+0.148
 +0.146
n+0.144
Mary+0.144
Neg logit contributions
der-0.104
attered-0.098
ishy-0.092
ly-0.087
 textures-0.085
Tokenâ
Token ID22940
Feature activation+3.54e+00

Pos logit contributions
der+0.167
ly+0.151
attered+0.147
ishy+0.140
.+0.132
Neg logit contributions
Once-0.238
At-0.215
aman-0.213
ll-0.212
Pe-0.206
Token
Token ID26391
Feature activation+4.96e+00

Pos logit contributions
Once+0.088
At+0.073
Pe+0.070
Sab+0.069
aman+0.069
Neg logit contributions
der-0.135
ly-0.132
attered-0.129
ishy-0.127
 textures-0.117
Token<|endoftext|>
Token ID50256
Feature activation+1.50e+01

Pos logit contributions
Once+0.137
At+0.117
aman+0.116
Pe+0.114
Sab+0.111
Neg logit contributions
der-0.165
attered-0.157
ly-0.156
ishy-0.153
.-0.146
TokenOnce
Token ID7454
Feature activation+1.32e+01

Pos logit contributions
Once+0.263
aman+0.256
Pe+0.229
Sab+0.224
ll+0.217
Neg logit contributions
.-0.624
der-0.611
ly-0.596
attered-0.562
ishy-0.555
Token upon
Token ID2402
Feature activation+8.56e+00

Pos logit contributions
Pe+0.344
Once+0.291
Look+0.287
Sab+0.284
Wil+0.283
Neg logit contributions
.-0.504
ly-0.457
der-0.449
 because-0.446
ishy-0.433
Token a
Token ID257
Feature activation-3.94e+00

Pos logit contributions
Once+0.329
 +0.277
At+0.273
Mary+0.268
Pe+0.265
Neg logit contributions
der-0.247
attered-0.246
ishy-0.232
ly-0.229
 dot-0.200
Token time
Token ID640
Feature activation+5.15e+00

Pos logit contributions
der+0.132
ly+0.127
ishy+0.125
attered+0.124
 loudly+0.114
Neg logit contributions
Once-0.111
Sab-0.096
At-0.095
Pe-0.094
aman-0.094
Token,
Token ID11
Feature activation+4.13e+00

Pos logit contributions
Pe+0.247
fters+0.239
igi+0.238
Sab+0.237
At+0.235
Neg logit contributions
.-0.077
 because-0.064
!-0.051
ly-0.045
der-0.043
Token with
Token ID351
Feature activation+3.25e+00

Pos logit contributions
Once+0.135
At+0.119
 +0.118
ll+0.113
n+0.113
Neg logit contributions
der-0.079
attered-0.078
ishy-0.075
ly-0.072
 dot-0.064
Token a
Token ID257
Feature activation-4.11e+00

Pos logit contributions
Once+0.115
At+0.099
 +0.098
aman+0.095
Pe+0.094
Neg logit contributions
der-0.094
attered-0.090
ly-0.088
ishy-0.086
 dot-0.076
Token newfound
Token ID40000
Feature activation+1.98e+00

Pos logit contributions
der+0.104
ly+0.097
attered+0.095
ishy+0.092
 dot+0.083
Neg logit contributions
Once-0.156
At-0.138
aman-0.136
Pe-0.135
Sab-0.134
Token respect
Token ID2461
Feature activation-7.48e-02

Pos logit contributions
Once+0.073
At+0.064
aman+0.064
Pe+0.062
 +0.062
Neg logit contributions
der-0.053
attered-0.050
ly-0.048
ishy-0.047
 dot-0.042
Token.
Token ID13
Feature activation-7.33e+00

Pos logit contributions
der+0.002
attered+0.001
ishy+0.001
ly+0.001
 dot+0.001
Neg logit contributions
Once-0.003
 -0.003
ma-0.003
aman-0.003
At-0.003
Token<|endoftext|>
Token ID50256
Feature activation+1.50e+01

Pos logit contributions
der+0.296
attered+0.273
ly+0.266
.+0.263
 because+0.263
Neg logit contributions
Once-0.140
aman-0.136
Sab-0.122
Pe-0.122
At-0.120
TokenOnce
Token ID7454
Feature activation+1.31e+01

Pos logit contributions
aman+0.332
Once+0.304
Pe+0.295
Sab+0.291
shadow+0.276
Neg logit contributions
.-0.594
der-0.534
ly-0.516
attered-0.503
 because-0.489
Token upon
Token ID2402
Feature activation+8.23e+00

Pos logit contributions
Pe+0.312
Once+0.277
Sab+0.271
Look+0.264
Mary+0.260
Neg logit contributions
.-0.506
ly-0.475
der-0.471
 because-0.458
ishy-0.443
Token a
Token ID257
Feature activation-4.21e+00

Pos logit contributions
Once+0.320
At+0.271
 +0.268
Pe+0.259
Mary+0.258
Neg logit contributions
der-0.232
attered-0.230
ishy-0.219
ly-0.214
 dot-0.187
Token time
Token ID640
Feature activation+4.93e+00

Pos logit contributions
der+0.128
ly+0.125
ishy+0.123
attered+0.119
 loudly+0.113
Neg logit contributions
Once-0.122
Sab-0.116
Pe-0.111
aman-0.109
At-0.108
Token,
Token ID11
Feature activation+3.91e+00

Pos logit contributions
Pe+0.236
aman+0.234
Girl+0.232
igi+0.232
fters+0.230
Neg logit contributions
.-0.068
 because-0.061
ly-0.044
!-0.043
der-0.041
Token Don
Token ID2094
Feature activation-1.66e+00

Pos logit contributions
der+0.346
ly+0.317
attered+0.317
ishy+0.317
.+0.315
Neg logit contributions
Once-0.221
Sab-0.206
aman-0.203
Mary-0.201
Pe-0.197
Token't
Token ID470
Feature activation-2.10e+00

Pos logit contributions
der+0.065
ly+0.062
.+0.058
 because+0.056
attered+0.056
Neg logit contributions
Once-0.038
aman-0.034
At-0.033
Mary-0.032
Pe-0.032
Token yield
Token ID7800
Feature activation-4.33e+00

Pos logit contributions
der+0.070
attered+0.068
ishy+0.064
ly+0.062
 textures+0.056
Neg logit contributions
Once-0.067
n-0.054
At-0.054
 -0.053
Pe-0.052
Token to
Token ID284
Feature activation-8.34e-02

Pos logit contributions
der+0.108
attered+0.107
ishy+0.102
ly+0.099
 dot+0.092
Neg logit contributions
Once-0.162
 -0.142
At-0.141
ma-0.136
n-0.136
Token the
Token ID262
Feature activation-1.82e+00

Pos logit contributions
der+0.003
attered+0.003
ly+0.003
ishy+0.003
 dot+0.002
Neg logit contributions
Once-0.002
At-0.002
aman-0.002
 -0.002
Mary-0.002
Token temptation
Token ID29062
Feature activation-1.40e+01

Pos logit contributions
der+0.051
attered+0.048
ly+0.048
ishy+0.046
 dot+0.040
Neg logit contributions
Once-0.065
At-0.056
aman-0.056
 -0.054
Pe-0.054
Token to
Token ID284
Feature activation+7.73e-01

Pos logit contributions
der+0.301
attered+0.279
ly+0.273
ishy+0.270
 dot+0.233
Neg logit contributions
Once-0.583
At-0.522
aman-0.507
 -0.506
Mary-0.504
Token mist
Token ID4020
Feature activation-2.10e+00

Pos logit contributions
Once+0.028
At+0.024
Sab+0.024
aman+0.024
Pe+0.024
Neg logit contributions
der-0.021
ly-0.020
attered-0.019
ishy-0.019
 textures-0.017
Tokenreat
Token ID630
Feature activation+2.92e+00

Pos logit contributions
attered+0.052
ishy+0.051
der+0.051
 dot+0.047
 textures+0.047
Neg logit contributions
Once-0.070
ll-0.065
ma-0.064
aman-0.063
 -0.062
Token and
Token ID290
Feature activation-8.46e-01

Pos logit contributions
Once+0.101
At+0.090
 +0.087
n+0.085
Pe+0.084
Neg logit contributions
der-0.082
attered-0.080
ishy-0.078
ly-0.072
 textures-0.069
Token break
Token ID2270
Feature activation-2.89e+00

Pos logit contributions
der+0.026
ly+0.025
attered+0.024
ishy+0.024
 textures+0.022
Neg logit contributions
Once-0.028
At-0.024
aman-0.023
Pe-0.023
Sab-0.023
Token examine
Token ID10716
Feature activation+1.90e+00

Pos logit contributions
der+0.172
ly+0.163
attered+0.161
ishy+0.157
 dot+0.144
Neg logit contributions
Once-0.163
At-0.140
aman-0.136
Pe-0.135
Mary-0.134
Token it
Token ID340
Feature activation-1.56e+00

Pos logit contributions
Once+0.054
 +0.046
At+0.046
ma+0.045
Sab+0.044
Neg logit contributions
der-0.066
attered-0.064
ishy-0.061
ly-0.060
 textures-0.055
Token with
Token ID351
Feature activation+1.04e+00

Pos logit contributions
der+0.045
ly+0.044
ishy+0.039
attered+0.039
.+0.038
Neg logit contributions
Once-0.053
Pe-0.047
Mary-0.047
aman-0.047
At-0.046
Token the
Token ID262
Feature activation-5.31e-01

Pos logit contributions
Once+0.032
At+0.028
aman+0.027
Pe+0.027
Mary+0.027
Neg logit contributions
der-0.034
ly-0.033
attered-0.031
ishy-0.031
.-0.029
Token magn
Token ID7842
Feature activation+7.29e-01

Pos logit contributions
der+0.014
attered+0.013
ly+0.012
ishy+0.012
 loudly+0.010
Neg logit contributions
Once-0.021
At-0.018
aman-0.018
 -0.018
Pe-0.018
Tokenifying
Token ID4035
Feature activation-1.35e+01

Pos logit contributions
Once+0.032
At+0.028
aman+0.027
Sab+0.027
ll+0.027
Neg logit contributions
der-0.014
 textures-0.013
attered-0.013
ishy-0.012
ly-0.012
Token glass
Token ID5405
Feature activation-3.26e+00

Pos logit contributions
der+0.329
ly+0.309
attered+0.304
ishy+0.290
 textures+0.258
Neg logit contributions
Once-0.535
At-0.471
aman-0.460
 -0.460
Pe-0.458
Token?"
Token ID1701
Feature activation-2.70e-02

Pos logit contributions
der+0.108
attered+0.105
ishy+0.100
ly+0.100
 textures+0.092
Neg logit contributions
Once-0.097
At-0.082
ma-0.079
 -0.079
aman-0.078
Token
Token ID198
Feature activation+1.16e+01

Pos logit contributions
.+0.001
ly+0.001
 because+0.001
der+0.001
ishy+0.001
Neg logit contributions
aman-0.001
ll-0.001
lor-0.001
Once-0.001
igi-0.001
Token
Token ID198
Feature activation+1.16e+01

Pos logit contributions
aman+0.367
 salon+0.329
fters+0.322
aren+0.310
Once+0.308
Neg logit contributions
.-0.394
ly-0.300
 loudly-0.295
 because-0.291
der-0.288
Token"
Token ID1
Feature activation+1.85e+00

Pos logit contributions
Once+0.270
aman+0.244
Sab+0.227
ll+0.216
Mary+0.215
Neg logit contributions
ly-0.431
der-0.429
attered-0.425
ishy-0.407
 loudly-0.386
Token cried
Token ID16896
Feature activation+8.09e-01

Pos logit contributions
Once+0.059
At+0.049
aman+0.047
Pe+0.047
Sab+0.046
Neg logit contributions
der-0.101
ly-0.097
attered-0.095
ishy-0.093
 loudly-0.088
Token,
Token ID11
Feature activation+2.25e+00

Pos logit contributions
Once+0.032
 +0.029
Pe+0.027
Sab+0.027
 babys+0.026
Neg logit contributions
attered-0.019
 sizes-0.019
der-0.018
 possession-0.017
ishy-0.017
Token dropping
Token ID12047
Feature activation-4.46e+00

Pos logit contributions
Once+0.065
 +0.054
At+0.054
aman+0.052
Pe+0.052
Neg logit contributions
der-0.078
attered-0.076
ly-0.075
ishy-0.075
 textures-0.067
Token the
Token ID262
Feature activation-6.81e-01

Pos logit contributions
der+0.206
attered+0.198
ly+0.192
ishy+0.185
 dot+0.176
Neg logit contributions
Once-0.080
 -0.076
Pe-0.060
ma-0.059
Mary-0.058
Token magn
Token ID7842
Feature activation+8.11e-01

Pos logit contributions
der+0.020
attered+0.019
ly+0.018
ishy+0.017
lier+0.015
Neg logit contributions
Once-0.024
At-0.021
ma-0.020
aman-0.020
ll-0.020
Tokenifying
Token ID4035
Feature activation-1.35e+01

Pos logit contributions
Once+0.029
aman+0.025
lor+0.024
At+0.024
ma+0.024
Neg logit contributions
 textures-0.021
 sizes-0.020
der-0.019
 enjoyable-0.019
 easy-0.018
Token glass
Token ID5405
Feature activation-3.12e+00

Pos logit contributions
der+0.344
ly+0.331
attered+0.322
ishy+0.312
 textures+0.279
Neg logit contributions
Once-0.516
At-0.451
 -0.443
Pe-0.439
aman-0.438
Token.
Token ID13
Feature activation-9.97e+00

Pos logit contributions
attered+0.074
ishy+0.074
der+0.072
working+0.069
 textures+0.068
Neg logit contributions
Once-0.105
ma-0.098
 -0.090
At-0.088
aman-0.086
Token
Token ID198
Feature activation+1.13e+01

Pos logit contributions
der+0.236
.+0.210
 because+0.204
 loudly+0.197
 They+0.171
Neg logit contributions
ma-0.384
prise-0.382
igi-0.370
aman-0.366
ll-0.358
Token
Token ID198
Feature activation+1.13e+01

Pos logit contributions
aman+0.349
 salon+0.349
fters+0.337
ital+0.328
Once+0.326
Neg logit contributions
.-0.352
 loudly-0.280
der-0.279
 because-0.259
!-0.251
Token"
Token ID1
Feature activation+1.92e+00

Pos logit contributions
Once+0.279
aman+0.245
Sab+0.242
ll+0.234
 +0.227
Neg logit contributions
ly-0.402
der-0.399
attered-0.395
 loudly-0.370
ishy-0.365
Token borrowed
Token ID22546
Feature activation-2.04e+00

Pos logit contributions
der+0.138
ly+0.131
 loudly+0.121
attered+0.119
 sharp+0.114
Neg logit contributions
Once-0.163
aman-0.160
aren-0.148
Pe-0.147
Alice-0.144
Token her
Token ID607
Feature activation-1.32e+00

Pos logit contributions
der+0.064
attered+0.061
ly+0.059
ishy+0.058
 dot+0.052
Neg logit contributions
Once-0.067
At-0.057
 -0.057
Pe-0.055
Sab-0.055
Token mom
Token ID1995
Feature activation+1.52e+00

Pos logit contributions
der+0.037
attered+0.035
ly+0.034
ishy+0.033
 loudly+0.028
Neg logit contributions
Once-0.048
At-0.042
 -0.041
n-0.040
aman-0.040
Token's
Token ID338
Feature activation-2.14e+00

Pos logit contributions
Once+0.044
At+0.037
aman+0.036
 +0.035
Pe+0.035
Neg logit contributions
der-0.054
attered-0.050
ly-0.050
ishy-0.049
 dot-0.045
Token magn
Token ID7842
Feature activation+8.91e-01

Pos logit contributions
der+0.059
attered+0.056
ishy+0.054
ly+0.053
working+0.045
Neg logit contributions
Once-0.076
At-0.067
 -0.066
aman-0.064
n-0.063
Tokenifying
Token ID4035
Feature activation-1.32e+01

Pos logit contributions
Once+0.040
aman+0.035
At+0.034
ll+0.034
 +0.034
Neg logit contributions
der-0.016
 textures-0.014
ishy-0.014
ly-0.014
attered-0.014
Token glass
Token ID5405
Feature activation-3.04e+00

Pos logit contributions
der+0.326
ly+0.309
attered+0.303
ishy+0.290
 textures+0.258
Neg logit contributions
Once-0.516
At-0.453
aman-0.444
 -0.442
Pe-0.441
Token to
Token ID284
Feature activation+1.83e+00

Pos logit contributions
ishy+0.089
attered+0.089
der+0.088
working+0.083
 annoying+0.078
Neg logit contributions
Once-0.089
 -0.077
ma-0.077
At-0.074
ll-0.072
Token take
Token ID1011
Feature activation+3.07e-01

Pos logit contributions
Once+0.063
At+0.055
aman+0.054
Pe+0.054
Sab+0.053
Neg logit contributions
der-0.053
ly-0.050
attered-0.049
ishy-0.048
 textures-0.044
Token a
Token ID257
Feature activation-5.85e+00

Pos logit contributions
Once+0.009
 +0.008
At+0.008
n+0.008
Sab+0.008
Neg logit contributions
der-0.011
attered-0.010
ishy-0.010
ly-0.009
 dot-0.009
Token closer
Token ID5699
Feature activation-4.11e+00

Pos logit contributions
der+0.159
ly+0.151
attered+0.147
ishy+0.144
 dot+0.131
Neg logit contributions
Once-0.210
At-0.186
aman-0.182
Pe-0.180
Mary-0.180
Token.
Token ID13
Feature activation-8.88e+00

Pos logit contributions
attered+0.001
ishy+0.001
der+0.001
 textures+0.001
 obstacle+0.001
Neg logit contributions
Once-0.001
 -0.001
At-0.001
ll-0.001
ma-0.001
Token Lily
Token ID20037
Feature activation+2.98e+00

Pos logit contributions
der+0.235
ly+0.213
attered+0.212
ishy+0.209
.+0.186
Neg logit contributions
Once-0.321
aman-0.302
Sab-0.289
Mary-0.287
Pe-0.287
Token had
Token ID550
Feature activation-2.44e+00

Pos logit contributions
Once+0.098
aman+0.087
Pe+0.086
Sab+0.083
Mary+0.083
Neg logit contributions
der-0.089
ly-0.085
attered-0.083
ishy-0.080
 textures-0.075
Token a
Token ID257
Feature activation-5.61e+00

Pos logit contributions
der+0.052
ly+0.048
attered+0.047
ishy+0.044
 textures+0.039
Neg logit contributions
Once-0.103
At-0.092
aman-0.092
Pe-0.090
Mary-0.090
Token magn
Token ID7842
Feature activation+1.26e+00

Pos logit contributions
der+0.129
ly+0.119
attered+0.118
ishy+0.112
 textures+0.098
Neg logit contributions
Once-0.229
At-0.204
aman-0.199
Pe-0.198
Sab-0.197
Tokenifying
Token ID4035
Feature activation-1.30e+01

Pos logit contributions
Once+0.056
At+0.049
aman+0.048
ll+0.048
Sab+0.048
Neg logit contributions
der-0.023
 textures-0.020
attered-0.020
ishy-0.020
ly-0.019
Token glass
Token ID5405
Feature activation-2.86e+00

Pos logit contributions
der+0.288
ly+0.260
attered+0.258
ishy+0.249
 dot+0.224
Neg logit contributions
Once-0.534
At-0.482
aman-0.474
Mary-0.472
Sab-0.469
Token that
Token ID326
Feature activation-4.95e+00

Pos logit contributions
attered+0.086
ishy+0.086
der+0.081
 loudly+0.075
 textures+0.075
Neg logit contributions
Once-0.087
ma-0.073
At-0.071
 -0.069
ll-0.065
Token she
Token ID673
Feature activation-5.36e+00

Pos logit contributions
der+0.167
ly+0.159
attered+0.156
ishy+0.152
 textures+0.141
Neg logit contributions
Once-0.147
At-0.125
aman-0.124
Pe-0.123
Mary-0.121
Token found
Token ID1043
Feature activation+3.50e+00

Pos logit contributions
der+0.175
ly+0.162
attered+0.161
ishy+0.157
 dot+0.145
Neg logit contributions
Once-0.169
At-0.148
Sab-0.138
Mary-0.138
Pe-0.138
Token in
Token ID287
Feature activation+1.09e+00

Pos logit contributions
Once+0.139
At+0.125
 +0.123
ma+0.122
n+0.121
Neg logit contributions
der-0.079
attered-0.077
ishy-0.077
ly-0.069
 dot-0.066
Token the
Token ID262
Feature activation-6.55e-01

Pos logit contributions
Once+0.065
At+0.056
aman+0.056
Pe+0.055
Sab+0.054
Neg logit contributions
der-0.064
ly-0.061
attered-0.059
ishy-0.058
 loudly-0.054
Token blanket
Token ID18447
Feature activation-1.19e+00

Pos logit contributions
der+0.018
attered+0.017
ly+0.016
ishy+0.016
 loudly+0.014
Neg logit contributions
Once-0.024
At-0.021
ma-0.020
aman-0.020
 -0.020
Token was
Token ID373
Feature activation+1.69e+00

Pos logit contributions
der+0.039
ly+0.037
attered+0.037
ishy+0.036
 dot+0.033
Neg logit contributions
Once-0.036
At-0.031
aman-0.030
Pe-0.030
Mary-0.030
Token too
Token ID1165
Feature activation-6.50e+00

Pos logit contributions
Once+0.066
At+0.059
Sab+0.059
aman+0.058
Pe+0.058
Neg logit contributions
der-0.040
ly-0.037
attered-0.036
ishy-0.035
 loudly-0.032
Token t
Token ID256
Feature activation+2.18e+00

Pos logit contributions
der+0.184
attered+0.170
ly+0.167
 dot+0.147
 textures+0.146
Neg logit contributions
Once-0.237
At-0.214
str-0.206
 -0.203
ll-0.196
Tokenattered
Token ID10228
Feature activation-1.28e+01

Pos logit contributions
Once+0.074
ll+0.073
Pe+0.067
aman+0.067
ma+0.066
Neg logit contributions
der-0.059
 textures-0.054
ishy-0.053
 sizes-0.052
 floppy-0.051
Token to
Token ID284
Feature activation+2.18e+00

Pos logit contributions
der+0.306
attered+0.291
ishy+0.285
ly+0.280
 textures+0.251
Neg logit contributions
Once-0.495
At-0.436
aman-0.426
 -0.426
ll-0.414
Token fix
Token ID4259
Feature activation-8.90e-01

Pos logit contributions
Once+0.081
At+0.072
aman+0.071
Mary+0.071
Sab+0.070
Neg logit contributions
der-0.054
ly-0.052
ishy-0.049
attered-0.048
 textures-0.046
Token.
Token ID13
Feature activation-8.87e+00

Pos logit contributions
der+0.027
attered+0.027
ishy+0.026
ly+0.026
 dot+0.024
Neg logit contributions
Once-0.027
 -0.025
ma-0.024
n-0.023
aman-0.023
Token But
Token ID887
Feature activation+3.20e+00

Pos logit contributions
der+0.245
attered+0.217
ly+0.214
.+0.207
 because+0.206
Neg logit contributions
Once-0.289
ll-0.276
aman-0.264
ma-0.261
At-0.261
Token sue
Token ID20889
Feature activation+3.20e+00

Pos logit contributions
Once+0.118
Sab+0.106
At+0.106
Pe+0.106
Mary+0.104
Neg logit contributions
der-0.083
ly-0.076
 because-0.072
 loudly-0.070
attered-0.070
Token When
Token ID1649
Feature activation-1.12e+00

Pos logit contributions
der+0.314
ly+0.283
attered+0.279
.+0.265
 loudly+0.259
Neg logit contributions
Once-0.426
ll-0.393
aman-0.392
Pe-0.381
At-0.381
Token Daisy
Token ID40355
Feature activation+6.77e-02

Pos logit contributions
der+0.033
ly+0.031
attered+0.031
ishy+0.030
 loudly+0.027
Neg logit contributions
Once-0.038
At-0.033
aman-0.032
Pe-0.032
Sab-0.031
Token put
Token ID1234
Feature activation+5.89e-01

Pos logit contributions
Once+0.002
Pe+0.002
aman+0.002
Sab+0.002
Mary+0.002
Neg logit contributions
ly-0.002
der-0.002
attered-0.002
ishy-0.002
.-0.002
Token the
Token ID262
Feature activation-1.12e+00

Pos logit contributions
Once+0.021
At+0.019
Pe+0.019
aman+0.019
Sab+0.018
Neg logit contributions
der-0.016
ly-0.015
ishy-0.014
attered-0.014
 loudly-0.013
Token magn
Token ID7842
Feature activation+2.95e-01

Pos logit contributions
der+0.035
attered+0.033
ly+0.030
ishy+0.030
 loudly+0.027
Neg logit contributions
Once-0.036
ma-0.032
At-0.031
aman-0.031
 -0.031
Tokenifying
Token ID4035
Feature activation-1.28e+01

Pos logit contributions
Once+0.013
At+0.011
 +0.011
Sab+0.011
aman+0.010
Neg logit contributions
der-0.006
 textures-0.005
ishy-0.005
attered-0.005
 sizes-0.005
Token glass
Token ID5405
Feature activation-3.33e+00

Pos logit contributions
der+0.321
ly+0.296
attered+0.295
ishy+0.283
 dot+0.254
Neg logit contributions
Once-0.490
At-0.435
aman-0.425
Pe-0.423
Sab-0.422
Token down
Token ID866
Feature activation-2.77e+00

Pos logit contributions
der+0.099
attered+0.092
ly+0.090
ishy+0.089
 dot+0.081
Neg logit contributions
Once-0.114
At-0.098
 -0.096
aman-0.094
ma-0.094
Token,
Token ID11
Feature activation+2.51e+00

Pos logit contributions
der+0.078
ly+0.075
attered+0.072
ishy+0.070
 loudly+0.064
Neg logit contributions
Once-0.097
At-0.084
Pe-0.084
aman-0.084
Sab-0.083
Token the
Token ID262
Feature activation-3.54e-01

Pos logit contributions
Once+0.075
At+0.063
aman+0.061
Mary+0.060
Pe+0.060
Neg logit contributions
der-0.086
ly-0.081
attered-0.081
ishy-0.078
 dot-0.072
Token cater
Token ID20825
Feature activation-9.24e-01

Pos logit contributions
der+0.010
attered+0.009
ly+0.009
ishy+0.009
 loudly+0.008
Neg logit contributions
Once-0.013
At-0.011
 -0.011
aman-0.011
ma-0.011
Token She
Token ID1375
Feature activation-4.16e+00

Pos logit contributions
der+0.266
ishy+0.258
ly+0.255
attered+0.254
 dot+0.231
Neg logit contributions
Once-0.291
At-0.255
 -0.254
Sab-0.243
Pe-0.239
Token took
Token ID1718
Feature activation+3.11e+00

Pos logit contributions
der+0.135
ly+0.127
attered+0.125
ishy+0.120
 dot+0.112
Neg logit contributions
Once-0.129
aman-0.112
Pe-0.112
Mary-0.108
Sab-0.107
Token out
Token ID503
Feature activation+1.34e-02

Pos logit contributions
Once+0.112
At+0.098
Sab+0.094
 +0.094
Pe+0.094
Neg logit contributions
der-0.088
attered-0.081
ly-0.081
ishy-0.078
 dot-0.070
Token a
Token ID257
Feature activation-5.20e+00

Pos logit contributions
Once+0.000
At+0.000
Sab+0.000
Mary+0.000
 +0.000
Neg logit contributions
der-0.000
attered-0.000
ly-0.000
ishy-0.000
 dot-0.000
Token magn
Token ID7842
Feature activation+1.46e+00

Pos logit contributions
der+0.114
ly+0.110
attered+0.104
ishy+0.103
 dot+0.094
Neg logit contributions
Once-0.209
At-0.186
aman-0.186
Pe-0.185
Mary-0.185
Tokenifying
Token ID4035
Feature activation-1.26e+01

Pos logit contributions
Once+0.064
At+0.057
aman+0.056
Sab+0.055
ll+0.055
Neg logit contributions
der-0.027
ishy-0.024
 textures-0.024
attered-0.023
ly-0.023
Token glass
Token ID5405
Feature activation-2.45e+00

Pos logit contributions
der+0.285
ly+0.259
attered+0.256
ishy+0.245
 dot+0.220
Neg logit contributions
Once-0.517
At-0.463
aman-0.455
Mary-0.451
Pe-0.451
Token and
Token ID290
Feature activation+2.88e-01

Pos logit contributions
attered+0.070
der+0.069
ishy+0.069
ly+0.063
working+0.061
Neg logit contributions
Once-0.082
At-0.071
ma-0.067
 -0.067
aman-0.065
Token looked
Token ID3114
Feature activation-2.98e+00

Pos logit contributions
Once+0.009
At+0.008
 +0.007
aman+0.007
Mary+0.007
Neg logit contributions
der-0.010
attered-0.009
ly-0.009
ishy-0.009
 dot-0.008
Token very
Token ID845
Feature activation-6.09e+00

Pos logit contributions
der+0.087
attered+0.084
 dot+0.075
 textures+0.073
ishy+0.073
Neg logit contributions
Once-0.102
 -0.092
At-0.091
ma-0.079
aman-0.078
Token carefully
Token ID7773
Feature activation-3.06e+00

Pos logit contributions
der+0.181
attered+0.181
ly+0.165
 dot+0.164
ishy+0.150
Neg logit contributions
Once-0.190
 -0.176
At-0.169
ll-0.159
Els-0.151
Token.
Token ID13
Feature activation-8.63e+00

Pos logit contributions
attered+0.059
ishy+0.052
der+0.050
ly+0.046
 textures+0.038
Neg logit contributions
Once-0.171
At-0.156
 -0.152
ll-0.152
n-0.152
Token They
Token ID1119
Feature activation-4.31e+00

Pos logit contributions
der+0.217
attered+0.215
.+0.211
 because+0.203
ishy+0.198
Neg logit contributions
Once-0.294
lor-0.293
Sab-0.288
aman-0.285
ma-0.284
Token had
Token ID550
Feature activation-1.94e+00

Pos logit contributions
der+0.144
ly+0.138
attered+0.137
ishy+0.134
 textures+0.123
Neg logit contributions
Once-0.125
aman-0.108
Pe-0.106
At-0.106
Sab-0.105
Token a
Token ID257
Feature activation-5.19e+00

Pos logit contributions
der+0.046
ly+0.042
attered+0.042
ishy+0.040
 dot+0.035
Neg logit contributions
Once-0.080
At-0.071
Mary-0.069
 -0.069
Pe-0.069
Token magn
Token ID7842
Feature activation+1.54e+00

Pos logit contributions
der+0.120
ly+0.112
attered+0.111
ishy+0.109
 dot+0.098
Neg logit contributions
Once-0.203
At-0.181
aman-0.179
Mary-0.177
Sab-0.177
Tokenifying
Token ID4035
Feature activation-1.26e+01

Pos logit contributions
Once+0.070
At+0.063
aman+0.062
ll+0.062
Sab+0.061
Neg logit contributions
der-0.026
ly-0.022
attered-0.022
ishy-0.021
 textures-0.021
Token glass
Token ID5405
Feature activation-2.44e+00

Pos logit contributions
der+0.300
ly+0.272
attered+0.272
ishy+0.264
 dot+0.236
Neg logit contributions
Once-0.496
At-0.445
aman-0.438
Mary-0.434
Sab-0.434
Token and
Token ID290
Feature activation+3.53e-01

Pos logit contributions
der+0.071
attered+0.070
ishy+0.069
ly+0.064
 loudly+0.063
Neg logit contributions
Once-0.076
ma-0.066
At-0.065
ll-0.062
 -0.061
Token a
Token ID257
Feature activation-5.22e+00

Pos logit contributions
Once+0.012
At+0.011
 +0.010
n+0.010
Mary+0.010
Neg logit contributions
der-0.010
attered-0.010
ly-0.010
ishy-0.010
 dot-0.009
Token small
Token ID1402
Feature activation-4.92e+00

Pos logit contributions
der+0.150
attered+0.140
ly+0.139
ishy+0.130
 textures+0.116
Neg logit contributions
Once-0.189
At-0.164
 -0.160
aman-0.157
Pe-0.157
Token box
Token ID3091
Feature activation-7.00e-01

Pos logit contributions
der+0.169
attered+0.166
ly+0.149
ishy+0.147
liest+0.135
Neg logit contributions
Once-0.142
 -0.126
 ha-0.118
aman-0.113
At-0.112
Token flashlight
Token ID38371
Feature activation-4.93e+00

Pos logit contributions
der+0.169
attered+0.156
ly+0.156
ishy+0.148
 textures+0.131
Neg logit contributions
Once-0.235
At-0.205
aman-0.199
 -0.199
Pe-0.198
Token,
Token ID11
Feature activation+2.33e+00

Pos logit contributions
ishy+0.116
der+0.112
attered+0.111
ly+0.098
 textures+0.095
Neg logit contributions
Once-0.189
At-0.169
ma-0.166
 -0.165
aman-0.163
Token and
Token ID290
Feature activation-1.91e-01

Pos logit contributions
Once+0.093
At+0.083
aman+0.079
 +0.079
Mary+0.079
Neg logit contributions
der-0.056
attered-0.052
ly-0.051
ishy-0.050
 dot-0.044
Token a
Token ID257
Feature activation-5.88e+00

Pos logit contributions
der+0.005
attered+0.005
ly+0.005
ishy+0.005
 dot+0.004
Neg logit contributions
Once-0.007
At-0.006
 -0.006
Mary-0.006
Sab-0.006
Token magn
Token ID7842
Feature activation+1.08e+00

Pos logit contributions
der+0.164
attered+0.154
ly+0.153
ishy+0.146
 textures+0.127
Neg logit contributions
Once-0.215
At-0.187
 -0.184
aman-0.181
Pe-0.179
Tokenifying
Token ID4035
Feature activation-1.25e+01

Pos logit contributions
Once+0.043
At+0.038
aman+0.037
ll+0.037
Sab+0.037
Neg logit contributions
der-0.022
 textures-0.022
ishy-0.020
attered-0.020
 sizes-0.019
Token glass
Token ID5405
Feature activation-2.50e+00

Pos logit contributions
der+0.326
ly+0.309
attered+0.299
ishy+0.289
 textures+0.261
Neg logit contributions
Once-0.476
At-0.415
aman-0.407
 -0.406
Pe-0.400
Token.
Token ID13
Feature activation-9.15e+00

Pos logit contributions
der+0.051
attered+0.048
ishy+0.047
ly+0.041
working+0.039
Neg logit contributions
Once-0.102
ma-0.095
 -0.091
At-0.090
n-0.086
Token
Token ID198
Feature activation+1.22e+01

Pos logit contributions
der+0.226
ly+0.191
attered+0.187
.+0.179
 because+0.177
Neg logit contributions
Once-0.331
ll-0.328
aman-0.314
str-0.306
ma-0.306
Token
Token ID198
Feature activation+1.23e+01

Pos logit contributions
aman+0.375
ll+0.362
bur+0.358
 salon+0.351
eking+0.341
Neg logit contributions
.-0.349
 loudly-0.267
 They-0.242
elly-0.238
 because-0.238
TokenAs
Token ID1722
Feature activation+2.41e+00

Pos logit contributions
ll+0.326
aman+0.310
bur+0.300
Once+0.286
igi+0.281
Neg logit contributions
 loudly-0.347
der-0.342
attered-0.341
ly-0.332
.-0.324
Token and
Token ID290
Feature activation+2.32e-01

Pos logit contributions
attered+0.004
der+0.004
ishy+0.004
 dot+0.003
 textures+0.003
Neg logit contributions
Once-0.004
aman-0.004
 -0.004
At-0.004
ma-0.004
Token handed
Token ID10158
Feature activation+2.24e+00

Pos logit contributions
Once+0.007
aman+0.006
At+0.006
Pe+0.006
Sab+0.006
Neg logit contributions
der-0.008
ly-0.007
attered-0.007
ishy-0.007
 loudly-0.006
Token him
Token ID683
Feature activation+2.11e+00

Pos logit contributions
Once+0.071
At+0.062
Pe+0.061
aman+0.061
Sab+0.060
Neg logit contributions
der-0.069
ly-0.069
attered-0.064
ishy-0.062
.-0.059
Token the
Token ID262
Feature activation+2.02e-01

Pos logit contributions
Once+0.059
 +0.050
ma+0.050
At+0.049
n+0.048
Neg logit contributions
der-0.079
attered-0.075
ishy-0.072
ly-0.070
 dot-0.070
Token magn
Token ID7842
Feature activation+1.61e+00

Pos logit contributions
Once+0.008
Mary+0.008
aman+0.008
Danny+0.008
Michael+0.008
Neg logit contributions
ly-0.004
der-0.004
 dot-0.004
ishy-0.004
.-0.004
Tokenifying
Token ID4035
Feature activation-1.25e+01

Pos logit contributions
Once+0.065
At+0.057
Sab+0.056
lor+0.055
aman+0.055
Neg logit contributions
der-0.034
 textures-0.032
ishy-0.031
attered-0.030
 loudly-0.029
Token glass
Token ID5405
Feature activation-2.30e+00

Pos logit contributions
der+0.300
ly+0.283
attered+0.279
ishy+0.268
 textures+0.236
Neg logit contributions
Once-0.495
At-0.437
aman-0.427
 -0.426
Pe-0.425
Token.
Token ID13
Feature activation-9.27e+00

Pos logit contributions
attered+0.052
ishy+0.051
der+0.051
working+0.048
ly+0.044
Neg logit contributions
Once-0.086
 -0.077
ma-0.076
At-0.075
n-0.071
Token
Token ID198
Feature activation+1.21e+01

Pos logit contributions
der+0.277
.+0.237
ly+0.231
 loudly+0.220
 because+0.218
Neg logit contributions
aman-0.299
str-0.293
Once-0.291
igi-0.284
ma-0.284
Token
Token ID198
Feature activation+1.20e+01

Pos logit contributions
aman+0.421
rounded+0.389
shadow+0.388
prise+0.388
 salon+0.376
Neg logit contributions
.-0.354
 loudly-0.244
ly-0.234
!-0.224
der-0.223
TokenHe
Token ID1544
Feature activation-3.52e+00

Pos logit contributions
aman+0.309
n+0.299
str+0.293
aren+0.287
Wil+0.282
Neg logit contributions
ly-0.385
der-0.372
 loudly-0.354
.-0.350
attered-0.338
Token not
Token ID407
Feature activation-5.59e-02

Pos logit contributions
Once+0.005
At+0.004
 +0.004
aman+0.004
Mary+0.004
Neg logit contributions
der-0.005
attered-0.005
ly-0.005
ishy-0.005
 dot-0.005
Token to
Token ID284
Feature activation+2.28e+00

Pos logit contributions
der+0.002
attered+0.002
ishy+0.002
ly+0.002
 dot+0.002
Neg logit contributions
Once-0.002
 -0.001
At-0.001
n-0.001
Pe-0.001
Token yield
Token ID7800
Feature activation-1.72e+00

Pos logit contributions
Once+0.071
At+0.059
Pe+0.058
n+0.058
 +0.058
Neg logit contributions
der-0.079
attered-0.077
ishy-0.072
ly-0.071
 dot-0.066
Token to
Token ID284
Feature activation+2.21e+00

Pos logit contributions
attered+0.053
der+0.052
ishy+0.049
 dot+0.047
ly+0.047
Neg logit contributions
Once-0.056
 -0.049
At-0.048
aman-0.045
Mary-0.044
Token his
Token ID465
Feature activation-1.03e+00

Pos logit contributions
Once+0.066
At+0.056
 +0.055
aman+0.055
Mary+0.054
Neg logit contributions
der-0.075
attered-0.075
ishy-0.072
ly-0.069
 dot-0.066
Token temptation
Token ID29062
Feature activation-1.24e+01

Pos logit contributions
der+0.031
attered+0.029
ly+0.028
ishy+0.028
 dot+0.026
Neg logit contributions
Once-0.034
At-0.030
 -0.030
aman-0.030
ma-0.029
Token.
Token ID13
Feature activation-9.23e+00

Pos logit contributions
der+0.178
attered+0.174
ishy+0.169
ly+0.156
 dot+0.136
Neg logit contributions
Once-0.600
At-0.542
 -0.539
aman-0.530
Mary-0.528
Token Knowing
Token ID29154
Feature activation-2.95e+00

Pos logit contributions
der+0.287
ly+0.273
.+0.267
attered+0.266
 because+0.258
Neg logit contributions
Once-0.265
igi-0.254
Sab-0.253
Pe-0.247
ll-0.246
Token that
Token ID326
Feature activation-4.90e+00

Pos logit contributions
attered+0.094
ishy+0.091
der+0.090
 dot+0.086
ly+0.085
Neg logit contributions
Once-0.093
At-0.080
aman-0.080
Mary-0.078
 -0.078
Token cutting
Token ID7720
Feature activation-6.22e-01

Pos logit contributions
der+0.151
attered+0.144
ly+0.142
ishy+0.139
 dot+0.125
Neg logit contributions
Once-0.163
At-0.141
aman-0.137
 -0.135
Mary-0.134
Token the
Token ID262
Feature activation-1.22e-01

Pos logit contributions
der+0.018
attered+0.017
ly+0.017
ishy+0.016
 dot+0.015
Neg logit contributions
Once-0.021
At-0.019
 -0.018
aman-0.018
Pe-0.018
Token she
Token ID673
Feature activation-5.55e+00

Pos logit contributions
Once+0.166
At+0.145
aman+0.141
Pe+0.140
Sab+0.139
Neg logit contributions
der-0.146
ly-0.137
attered-0.134
ishy-0.130
 loudly-0.118
Token needed
Token ID2622
Feature activation-6.21e+00

Pos logit contributions
der+0.208
ly+0.196
ishy+0.195
attered+0.193
 dot+0.187
Neg logit contributions
Once-0.141
At-0.120
 -0.117
ll-0.116
ma-0.114
Token to
Token ID284
Feature activation+1.98e+00

Pos logit contributions
der+0.134
ly+0.126
attered+0.116
ishy+0.110
.+0.102
Neg logit contributions
Once-0.259
At-0.231
aman-0.228
Pe-0.226
Sab-0.226
Token resist
Token ID4180
Feature activation-1.99e+00

Pos logit contributions
Once+0.063
At+0.054
aman+0.052
Pe+0.052
 +0.052
Neg logit contributions
der-0.064
attered-0.061
ly-0.060
ishy-0.059
 dot-0.053
Token the
Token ID262
Feature activation-3.46e-01

Pos logit contributions
der+0.053
attered+0.050
ly+0.049
ishy+0.048
 dot+0.043
Neg logit contributions
Once-0.073
At-0.065
aman-0.063
 -0.063
Mary-0.062
Token temptation
Token ID29062
Feature activation-1.23e+01

Pos logit contributions
der+0.010
attered+0.010
ly+0.009
ishy+0.009
 dot+0.008
Neg logit contributions
Once-0.012
aman-0.010
At-0.010
ma-0.010
 -0.010
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
der+0.261
attered+0.241
ly+0.237
ishy+0.233
 dot+0.202
Neg logit contributions
Once-0.520
At-0.466
aman-0.454
 -0.451
Mary-0.450
Token leave
Token ID2666
Feature activation+1.26e+00

Pos logit contributions
Once+0.004
Sab+0.003
Pe+0.003
At+0.003
aman+0.003
Neg logit contributions
der-0.004
ly-0.004
attered-0.003
ishy-0.003
 loudly-0.003
Token it
Token ID340
Feature activation-7.18e-01

Pos logit contributions
Once+0.042
At+0.037
aman+0.036
Pe+0.035
Sab+0.035
Neg logit contributions
der-0.038
ly-0.036
attered-0.035
ishy-0.034
 dot-0.031
Token alone
Token ID3436
Feature activation+5.98e-02

Pos logit contributions
der+0.020
attered+0.018
ly+0.018
ishy+0.018
 dot+0.017
Neg logit contributions
Once-0.026
At-0.023
 -0.022
ll-0.022
n-0.022
Token.
Token ID13
Feature activation-8.95e+00

Pos logit contributions
Once+0.003
At+0.003
 +0.003
ma+0.003
ll+0.002
Neg logit contributions
der-0.001
attered-0.001
ishy-0.001
ly-0.001
 dot-0.001
Token and
Token ID290
Feature activation+7.63e-01

Pos logit contributions
der+0.102
ly+0.096
attered+0.096
ishy+0.094
 textures+0.086
Neg logit contributions
Once-0.083
At-0.071
aman-0.067
 -0.066
Sab-0.066
Token paint
Token ID7521
Feature activation-6.39e-01

Pos logit contributions
Once+0.028
At+0.024
 +0.024
Mary+0.024
aman+0.023
Neg logit contributions
der-0.021
attered-0.020
ly-0.019
ishy-0.019
 dot-0.016
Token and
Token ID290
Feature activation+7.16e-01

Pos logit contributions
der+0.024
ishy+0.023
ly+0.022
attered+0.022
 dot+0.020
Neg logit contributions
Once-0.016
At-0.014
 -0.013
ma-0.013
aman-0.012
Token glue
Token ID22749
Feature activation-3.55e+00

Pos logit contributions
Once+0.025
At+0.022
aman+0.021
Mary+0.021
 +0.021
Neg logit contributions
der-0.021
ly-0.019
attered-0.019
ishy-0.019
 dot-0.017
Token spl
Token ID4328
Feature activation+4.23e-01

Pos logit contributions
der+0.109
ly+0.102
attered+0.101
ishy+0.100
 dot+0.090
Neg logit contributions
Once-0.114
At-0.100
 -0.097
aman-0.095
ll-0.094
Tokenattered
Token ID10228
Feature activation-1.23e+01

Pos logit contributions
At+0.011
ll+0.011
Once+0.011
n+0.011
Ph+0.010
Neg logit contributions
der-0.013
 dot-0.012
 human-0.011
 enemy-0.011
 skull-0.011
Token everywhere
Token ID8347
Feature activation+1.68e+00

Pos logit contributions
der+0.313
ishy+0.305
attered+0.298
ly+0.287
working+0.269
Neg logit contributions
Once-0.425
 -0.384
At-0.378
ma-0.361
aman-0.350
Token.
Token ID13
Feature activation-8.43e+00

Pos logit contributions
Once+0.071
 +0.066
At+0.063
n+0.058
ll+0.058
Neg logit contributions
ishy-0.037
attered-0.035
ly-0.030
der-0.030
vious-0.027
Token
Token ID220
Feature activation+9.76e+00

Pos logit contributions
der+0.266
attered+0.237
ly+0.230
ishy+0.223
.+0.219
Neg logit contributions
Once-0.269
Sab-0.243
aman-0.237
Pe-0.236
ma-0.233
Token
Token ID198
Feature activation+1.27e+01

Pos logit contributions
Once+0.228
aman+0.226
Pe+0.215
Sab+0.205
igi+0.197
Neg logit contributions
der-0.360
.-0.358
attered-0.323
 because-0.316
ly-0.306
Token
Token ID198
Feature activation+1.27e+01

Pos logit contributions
aman+0.450
 salon+0.446
prise+0.421
 ha+0.409
eks+0.408
Neg logit contributions
.-0.338
 loudly-0.246
 because-0.237
 They-0.230
der-0.225
Token fingers
Token ID9353
Feature activation-9.55e-01

Pos logit contributions
der+0.006
ly+0.006
attered+0.006
ishy+0.006
working+0.005
Neg logit contributions
Once-0.007
ma-0.006
aman-0.006
 -0.006
At-0.006
Token were
Token ID547
Feature activation+4.82e+00

Pos logit contributions
der+0.037
ly+0.035
attered+0.035
ishy+0.035
 textures+0.032
Neg logit contributions
Once-0.023
At-0.020
 -0.019
n-0.017
ll-0.017
Token red
Token ID2266
Feature activation-4.54e+00

Pos logit contributions
Once+0.169
At+0.144
Pe+0.139
 +0.137
Wil+0.136
Neg logit contributions
ly-0.137
der-0.135
attered-0.127
ishy-0.123
working-0.115
Token and
Token ID290
Feature activation+8.10e-01

Pos logit contributions
ishy+0.164
 possession+0.163
working+0.161
powers+0.159
 efficient+0.157
Neg logit contributions
Once-0.105
At-0.080
 -0.079
ma-0.077
aman-0.074
Token bl
Token ID698
Feature activation-2.80e-01

Pos logit contributions
Once+0.032
 +0.028
At+0.028
Pe+0.027
ma+0.026
Neg logit contributions
der-0.020
ly-0.019
attered-0.018
ishy-0.017
 dot-0.015
Tokenistered
Token ID23187
Feature activation-1.22e+01

Pos logit contributions
 obstacle+0.007
 textures+0.007
der+0.007
 missile+0.006
 infant+0.006
Neg logit contributions
ll-0.009
Once-0.009
Pe-0.008
aj-0.008
Ver-0.008
Token.
Token ID13
Feature activation-8.84e+00

Pos logit contributions
working+0.277
 obstacle+0.274
attered+0.261
ishy+0.259
powers+0.257
Neg logit contributions
Once-0.419
ma-0.367
 -0.340
Are-0.339
m-0.336
Token He
Token ID679
Feature activation-3.39e+00

Pos logit contributions
der+0.130
attered+0.094
ly+0.093
.+0.081
ishy+0.081
Neg logit contributions
Once-0.431
aman-0.418
ll-0.411
At-0.399
ma-0.399
Token felt
Token ID2936
Feature activation-8.48e-01

Pos logit contributions
der+0.114
ly+0.108
attered+0.105
ishy+0.102
 loudly+0.096
Neg logit contributions
Once-0.102
aman-0.090
At-0.087
Sab-0.086
Mary-0.085
Token very
Token ID845
Feature activation-5.39e+00

Pos logit contributions
der+0.026
ly+0.025
attered+0.025
ishy+0.023
working+0.023
Neg logit contributions
Once-0.027
 -0.025
Pe-0.022
At-0.022
Are-0.021
Token scared
Token ID12008
Feature activation-6.08e+00

Pos logit contributions
der+0.167
attered+0.158
ly+0.156
 dot+0.150
 sizes+0.148
Neg logit contributions
Once-0.163
 -0.147
At-0.137
ll-0.133
Pe-0.132
Token
Token ID198
Feature activation+1.27e+01

Pos logit contributions
aman+0.385
 salon+0.347
ll+0.347
igi+0.328
Pe+0.323
Neg logit contributions
.-0.421
 loudly-0.366
der-0.324
 loud-0.321
ly-0.319
TokenHe
Token ID1544
Feature activation-2.65e+00

Pos logit contributions
aman+0.282
Once+0.268
ll+0.267
igi+0.239
OV+0.230
Neg logit contributions
 loudly-0.444
der-0.438
ly-0.436
.-0.422
attered-0.409
Token pointed
Token ID6235
Feature activation-6.44e-01

Pos logit contributions
der+0.084
ly+0.080
 loudly+0.071
.+0.069
attered+0.067
Neg logit contributions
Once-0.082
aman-0.078
Pe-0.077
igi-0.076
ll-0.074
Token his
Token ID465
Feature activation-5.68e-01

Pos logit contributions
der+0.019
attered+0.019
ishy+0.017
ly+0.016
 textures+0.016
Neg logit contributions
Once-0.022
 -0.020
At-0.019
aman-0.018
Mary-0.018
Token magn
Token ID7842
Feature activation+2.05e+00

Pos logit contributions
der+0.015
ly+0.014
attered+0.013
ishy+0.013
 dot+0.012
Neg logit contributions
Once-0.022
At-0.019
aman-0.019
Pe-0.019
Mary-0.018
Tokenifying
Token ID4035
Feature activation-1.22e+01

Pos logit contributions
Once+0.083
At+0.073
aman+0.072
ma+0.070
Sab+0.070
Neg logit contributions
der-0.044
 textures-0.041
 sizes-0.038
ishy-0.037
working-0.035
Token glass
Token ID5405
Feature activation-1.92e+00

Pos logit contributions
der+0.308
ly+0.289
attered+0.284
ishy+0.273
 textures+0.244
Neg logit contributions
Once-0.468
At-0.412
aman-0.404
 -0.399
Pe-0.399
Token at
Token ID379
Feature activation+3.73e+00

Pos logit contributions
attered+0.056
der+0.055
ishy+0.054
working+0.051
 textures+0.049
Neg logit contributions
Once-0.059
ma-0.052
 -0.050
At-0.049
aman-0.046
Token different
Token ID1180
Feature activation-4.55e+00

Pos logit contributions
Once+0.117
At+0.099
aman+0.097
 +0.097
Mary+0.096
Neg logit contributions
der-0.122
attered-0.116
ly-0.114
ishy-0.112
 dot-0.101
Token classmates
Token ID28999
Feature activation-7.56e-01

Pos logit contributions
attered+0.111
der+0.109
ishy+0.106
ly+0.089
 dot+0.085
Neg logit contributions
Once-0.171
 -0.154
ma-0.154
At-0.154
aman-0.151
Token and
Token ID290
Feature activation+5.50e-01

Pos logit contributions
attered+0.020
ishy+0.019
der+0.019
ly+0.017
 textures+0.016
Neg logit contributions
Once-0.029
 -0.025
At-0.025
ma-0.025
n-0.024
Token Lily
Token ID20037
Feature activation+2.79e+00

Pos logit contributions
der+0.274
 because+0.246
.+0.239
attered+0.231
ly+0.227
Neg logit contributions
ll-0.304
ma-0.292
Once-0.291
igi-0.288
prise-0.283
Token tried
Token ID3088
Feature activation-3.81e+00

Pos logit contributions
Once+0.087
Sab+0.078
aman+0.076
Pe+0.076
Mary+0.075
Neg logit contributions
der-0.088
ly-0.085
attered-0.082
ishy-0.077
 loudly-0.075
Token to
Token ID284
Feature activation+2.17e+00

Pos logit contributions
der+0.073
ly+0.062
attered+0.058
 because+0.058
 loudly+0.058
Neg logit contributions
Once-0.164
aman-0.152
Pe-0.150
At-0.150
Sab-0.149
Token resist
Token ID4180
Feature activation-1.83e+00

Pos logit contributions
Once+0.073
At+0.063
aman+0.061
Pe+0.061
Mary+0.060
Neg logit contributions
der-0.068
attered-0.064
ly-0.062
ishy-0.061
 dot-0.055
Token the
Token ID262
Feature activation-1.78e-01

Pos logit contributions
der+0.060
attered+0.057
ly+0.057
ishy+0.055
 dot+0.050
Neg logit contributions
Once-0.057
At-0.048
aman-0.047
 -0.046
Pe-0.046
Token temptation
Token ID29062
Feature activation-1.22e+01

Pos logit contributions
der+0.006
attered+0.006
lier+0.005
ly+0.005
liest+0.005
Neg logit contributions
Once-0.005
 babys-0.005
 -0.005
ma-0.005
At-0.004
Token,
Token ID11
Feature activation+2.80e+00

Pos logit contributions
der+0.291
ly+0.269
attered+0.265
ishy+0.255
 dot+0.225
Neg logit contributions
Once-0.482
At-0.428
aman-0.419
Pe-0.416
Mary-0.414
Token but
Token ID475
Feature activation+2.14e+00

Pos logit contributions
Once+0.106
Sab+0.096
aman+0.095
At+0.095
Mary+0.094
Neg logit contributions
der-0.072
ly-0.064
attered-0.060
 because-0.059
ishy-0.059
Token she
Token ID673
Feature activation-5.03e+00

Pos logit contributions
Once+0.073
Sab+0.064
At+0.064
Pe+0.064
aman+0.063
Neg logit contributions
der-0.062
attered-0.056
ly-0.056
ishy-0.053
 loudly-0.052
Token couldn
Token ID3521
Feature activation+4.06e-01

Pos logit contributions
der+0.157
attered+0.146
ly+0.144
ishy+0.139
 loudly+0.129
Neg logit contributions
Once-0.163
aman-0.141
Sab-0.141
At-0.140
Pe-0.138
Token't
Token ID470
Feature activation+1.48e+00

Pos logit contributions
Once+0.007
At+0.005
aman+0.004
 +0.004
Pe+0.004
Neg logit contributions
der-0.019
ly-0.019
attered-0.019
ishy-0.018
 textures-0.017
Token,
Token ID11
Feature activation+2.96e+00

Pos logit contributions
Once+0.184
 +0.177
n+0.154
ma+0.150
ll+0.149
Neg logit contributions
attered-0.175
ishy-0.169
der-0.157
 dot-0.154
 textures-0.147
Token handing
Token ID22786
Feature activation+3.16e-01

Pos logit contributions
Once+0.103
aman+0.094
At+0.093
Sab+0.092
Pe+0.090
Neg logit contributions
der-0.084
ly-0.076
attered-0.070
 loudly-0.069
ishy-0.068
Token him
Token ID683
Feature activation+1.99e+00

Pos logit contributions
Once+0.010
At+0.009
aman+0.008
Sab+0.008
Pe+0.008
Neg logit contributions
der-0.010
ly-0.010
attered-0.009
ishy-0.009
 loudly-0.008
Token the
Token ID262
Feature activation+1.90e-01

Pos logit contributions
Once+0.052
ma+0.045
 +0.044
Mary+0.040
At+0.040
Neg logit contributions
der-0.082
attered-0.076
 dot-0.072
ly-0.071
ishy-0.071
Token magn
Token ID7842
Feature activation+1.64e+00

Pos logit contributions
Once+0.007
At+0.006
aman+0.006
 +0.006
ma+0.006
Neg logit contributions
der-0.005
attered-0.005
ly-0.005
ishy-0.005
 loudly-0.004
Tokenifying
Token ID4035
Feature activation-1.21e+01

Pos logit contributions
Once+0.063
ll+0.054
lor+0.053
ma+0.053
At+0.052
Neg logit contributions
 textures-0.039
der-0.035
 sizes-0.035
ishy-0.034
attered-0.032
Token glass
Token ID5405
Feature activation-2.11e+00

Pos logit contributions
der+0.289
ly+0.269
attered+0.265
ishy+0.254
 textures+0.223
Neg logit contributions
Once-0.484
At-0.430
aman-0.421
Pe-0.417
Mary-0.415
Token.
Token ID13
Feature activation-8.68e+00

Pos logit contributions
attered+0.045
ishy+0.043
der+0.042
working+0.040
ly+0.037
Neg logit contributions
Once-0.082
ma-0.075
 -0.074
At-0.071
n-0.068
Token She
Token ID1375
Feature activation-3.98e+00

Pos logit contributions
der+0.265
.+0.239
attered+0.230
 because+0.226
 loudly+0.219
Neg logit contributions
Once-0.263
aman-0.251
igi-0.250
ma-0.244
Sab-0.243
Token looked
Token ID3114
Feature activation-2.64e+00

Pos logit contributions
der+0.138
ly+0.126
attered+0.121
ishy+0.116
 loudly+0.114
Neg logit contributions
Once-0.117
aman-0.102
Sab-0.100
Pe-0.099
Mary-0.098
Token around
Token ID1088
Feature activation+2.34e+00

Pos logit contributions
der+0.058
attered+0.053
ly+0.050
ishy+0.048
 textures+0.045
Neg logit contributions
Once-0.112
At-0.100
 -0.098
aman-0.096
ma-0.094
Token happy
Token ID3772
Feature activation-9.28e-01

Pos logit contributions
der+0.118
ly+0.111
attered+0.110
ishy+0.107
 loudly+0.096
Neg logit contributions
Once-0.155
At-0.136
aman-0.134
Pe-0.133
Sab-0.131
Token that
Token ID326
Feature activation-4.42e+00

Pos logit contributions
ishy+0.026
attered+0.025
der+0.023
ly+0.021
 dot+0.021
Neg logit contributions
Once-0.035
At-0.030
 -0.029
aman-0.029
Mary-0.029
Token she
Token ID673
Feature activation-4.83e+00

Pos logit contributions
der+0.146
ly+0.138
attered+0.136
ishy+0.132
 textures+0.122
Neg logit contributions
Once-0.135
At-0.116
aman-0.113
Pe-0.113
Sab-0.111
Token had
Token ID550
Feature activation-1.51e+00

Pos logit contributions
der+0.173
attered+0.162
ly+0.160
ishy+0.159
 textures+0.148
Neg logit contributions
Once-0.138
At-0.116
 -0.110
Mary-0.109
Sab-0.108
Token resisted
Token ID26643
Feature activation+1.28e-01

Pos logit contributions
der+0.049
ly+0.047
attered+0.046
ishy+0.044
 loudly+0.041
Neg logit contributions
Once-0.046
At-0.039
aman-0.039
Pe-0.039
Sab-0.038
Token temptation
Token ID29062
Feature activation-1.21e+01

Pos logit contributions
Once+0.004
At+0.004
aman+0.004
 +0.004
Mary+0.004
Neg logit contributions
der-0.004
attered-0.003
ly-0.003
ishy-0.003
 dot-0.003
Token and
Token ID290
Feature activation+3.20e-01

Pos logit contributions
der+0.281
attered+0.271
ishy+0.266
ly+0.258
 dot+0.237
Neg logit contributions
Once-0.483
At-0.429
 -0.419
aman-0.415
Mary-0.412
Token went
Token ID1816
Feature activation+1.84e+00

Pos logit contributions
Once+0.009
At+0.008
aman+0.007
Pe+0.007
Sab+0.007
Neg logit contributions
der-0.011
ly-0.011
attered-0.011
ishy-0.010
 textures-0.010
Token on
Token ID319
Feature activation+1.54e+00

Pos logit contributions
Once+0.065
At+0.057
 +0.056
Sab+0.055
Mary+0.054
Neg logit contributions
der-0.053
attered-0.049
ly-0.048
ishy-0.048
 textures-0.044
Token this
Token ID428
Feature activation+4.28e-01

Pos logit contributions
Once+0.049
At+0.041
 +0.040
Mary+0.039
aman+0.038
Neg logit contributions
der-0.052
attered-0.050
ly-0.050
ishy-0.050
 dot-0.045
Token big
Token ID1263
Feature activation-4.70e+00

Pos logit contributions
Once+0.015
At+0.013
aman+0.012
 +0.012
Mary+0.012
Neg logit contributions
der-0.013
attered-0.012
ly-0.012
ishy-0.012
 dot-0.011
Token the
Token ID262
Feature activation+6.39e-01

Pos logit contributions
Once+0.065
 +0.056
ma+0.055
At+0.053
Sab+0.051
Neg logit contributions
der-0.094
attered-0.092
ishy-0.090
ly-0.087
 dot-0.086
Token necklace
Token ID38987
Feature activation+1.63e+00

Pos logit contributions
Once+0.026
Mary+0.025
aman+0.025
Danny+0.024
ll+0.024
Neg logit contributions
der-0.013
ly-0.012
 dot-0.011
ishy-0.011
 textures-0.011
Token and
Token ID290
Feature activation+5.02e-01

Pos logit contributions
Once+0.050
ma+0.046
 +0.043
At+0.042
n+0.042
Neg logit contributions
ishy-0.053
attered-0.051
der-0.051
 textures-0.045
working-0.044
Token the
Token ID262
Feature activation+6.51e-01

Pos logit contributions
Once+0.014
aman+0.012
Pe+0.012
At+0.012
ll+0.012
Neg logit contributions
der-0.018
ly-0.017
attered-0.016
ishy-0.015
 textures-0.015
Token magn
Token ID7842
Feature activation+2.01e+00

Pos logit contributions
Once+0.025
At+0.023
aman+0.022
Mary+0.022
ll+0.022
Neg logit contributions
der-0.016
ly-0.015
attered-0.014
ishy-0.014
 dot-0.013
Tokenifying
Token ID4035
Feature activation-1.21e+01

Pos logit contributions
Once+0.084
At+0.074
Sab+0.073
aman+0.073
ll+0.072
Neg logit contributions
der-0.040
 textures-0.037
ishy-0.036
attered-0.035
ly-0.034
Token glass
Token ID5405
Feature activation-1.87e+00

Pos logit contributions
der+0.290
ly+0.274
attered+0.267
ishy+0.258
 textures+0.227
Neg logit contributions
Once-0.481
At-0.425
aman-0.415
Pe-0.414
 -0.413
Token and
Token ID290
Feature activation+4.90e-01

Pos logit contributions
ishy+0.036
attered+0.036
der+0.034
working+0.033
ly+0.030
Neg logit contributions
Once-0.073
ma-0.068
 -0.067
At-0.065
n-0.064
Token told
Token ID1297
Feature activation+2.96e+00

Pos logit contributions
Once+0.014
At+0.012
aman+0.012
Pe+0.012
Sab+0.011
Neg logit contributions
der-0.017
ly-0.016
attered-0.016
ishy-0.016
 textures-0.014
Token him
Token ID683
Feature activation+2.47e+00

Pos logit contributions
Once+0.065
 +0.054
Sab+0.049
n+0.048
Mary+0.048
Neg logit contributions
attered-0.126
der-0.119
ishy-0.118
ly-0.111
 dot-0.111
Token that
Token ID326
Feature activation-4.02e+00

Pos logit contributions
Once+0.063
 +0.061
n+0.056
Mary+0.053
ma+0.053
Neg logit contributions
attered-0.092
ishy-0.092
 dot-0.091
der-0.087
ly-0.080
Token what
Token ID644
Feature activation-3.13e+00

Pos logit contributions
Once+0.047
 +0.041
At+0.041
aman+0.040
Mary+0.039
Neg logit contributions
der-0.042
attered-0.041
ishy-0.039
ly-0.038
 dot-0.035
Token you
Token ID345
Feature activation-1.85e-01

Pos logit contributions
der+0.102
attered+0.098
ly+0.097
ishy+0.095
 dot+0.085
Neg logit contributions
Once-0.098
At-0.085
 -0.081
Mary-0.080
aman-0.079
Token need
Token ID761
Feature activation-6.99e+00

Pos logit contributions
der+0.005
attered+0.005
ly+0.005
ishy+0.005
 dot+0.004
Neg logit contributions
Once-0.007
At-0.006
 -0.005
ll-0.005
n-0.005
Token."
Token ID526
Feature activation-3.94e+00

Pos logit contributions
der+0.132
ly+0.124
.+0.108
attered+0.104
ishy+0.101
Neg logit contributions
Once-0.309
Pe-0.288
Sab-0.283
aman-0.283
Mary-0.281
Token
Token ID198
Feature activation+1.21e+01

Pos logit contributions
der+0.153
.+0.142
 because+0.137
ly+0.136
attered+0.135
Neg logit contributions
Once-0.079
Pe-0.076
Wil-0.073
Sab-0.072
aman-0.072
Token
Token ID198
Feature activation+1.20e+01

Pos logit contributions
 salon+0.381
aren+0.370
 inconvenience+0.361
aman+0.353
shadow+0.338
Neg logit contributions
.-0.345
der-0.305
 loudly-0.286
 each-0.276
 because-0.260
TokenM
Token ID44
Feature activation+4.78e+00

Pos logit contributions
aren+0.366
|+0.335
n+0.335
 inconvenience+0.331
Ph+0.330
Neg logit contributions
der-0.319
 loudly-0.291
ly-0.276
attered-0.268
.-0.257
Tokenama
Token ID1689
Feature activation+1.82e+00

Pos logit contributions
Once+0.112
At+0.090
aman+0.088
Mary+0.085
Pe+0.085
Neg logit contributions
der-0.194
ly-0.186
attered-0.185
ishy-0.180
 dot-0.169
Token smiled
Token ID13541
Feature activation+1.36e+00

Pos logit contributions
Once+0.053
Pe+0.046
At+0.045
aman+0.045
Sab+0.044
Neg logit contributions
der-0.062
ly-0.061
attered-0.056
ishy-0.054
 loudly-0.052
Token and
Token ID290
Feature activation-1.51e-01

Pos logit contributions
Once+0.044
 +0.037
At+0.034
aman+0.034
ma+0.031
Neg logit contributions
attered-0.044
ishy-0.042
working-0.040
der-0.039
 sizes-0.038
Token said
Token ID531
Feature activation+5.03e+00

Pos logit contributions
der+0.005
ly+0.005
attered+0.005
ishy+0.005
 loudly+0.005
Neg logit contributions
Once-0.004
Pe-0.004
At-0.004
Sab-0.004
aman-0.004
Token to
Token ID284
Feature activation+2.98e+00

Pos logit contributions
der+0.064
ly+0.061
attered+0.056
ishy+0.053
 textures+0.049
Neg logit contributions
Once-0.124
aman-0.112
At-0.111
Pe-0.108
Sab-0.108
Token bite
Token ID13197
Feature activation+8.98e-01

Pos logit contributions
Once+0.104
aman+0.093
At+0.093
Sab+0.089
Mary+0.089
Neg logit contributions
der-0.080
ly-0.077
attered-0.072
ishy-0.072
.-0.068
Token them
Token ID606
Feature activation+5.25e+00

Pos logit contributions
Once+0.018
At+0.014
 +0.014
Pe+0.013
Mary+0.013
Neg logit contributions
der-0.039
attered-0.038
ishy-0.037
ly-0.036
working-0.035
Token.
Token ID13
Feature activation-8.80e+00

Pos logit contributions
Once+0.210
 +0.197
ma+0.173
ll+0.170
At+0.168
Neg logit contributions
attered-0.145
ishy-0.130
working-0.121
der-0.118
ly-0.116
Token
Token ID198
Feature activation+1.28e+01

Pos logit contributions
der+0.210
ly+0.176
attered+0.168
 loudly+0.167
.+0.165
Neg logit contributions
Once-0.345
ll-0.327
aman-0.324
ma-0.318
At-0.307
Token
Token ID198
Feature activation+1.28e+01

Pos logit contributions
aman+0.429
 salon+0.382
Once+0.364
OV+0.364
aren+0.357
Neg logit contributions
.-0.398
 loudly-0.306
der-0.299
."-0.276
 loud-0.273
TokenL
Token ID43
Feature activation+8.43e+00

Pos logit contributions
aman+0.343
Once+0.337
ll+0.310
OV+0.295
n+0.290
Neg logit contributions
der-0.383
ly-0.375
 loudly-0.371
attered-0.371
.-0.366
Tokenily
Token ID813
Feature activation+1.83e+00

Pos logit contributions
Once+0.249
At+0.212
aman+0.206
 +0.203
Pe+0.202
Neg logit contributions
der-0.296
ly-0.281
attered-0.272
ishy-0.266
 dot-0.245
Token and
Token ID290
Feature activation+8.85e-01

Pos logit contributions
Once+0.085
aman+0.080
Sab+0.079
Pe+0.078
At+0.077
Neg logit contributions
der-0.030
ly-0.028
attered-0.023
.-0.022
 loudly-0.021
Token Ben
Token ID3932
Feature activation+4.18e+00

Pos logit contributions
Once+0.027
At+0.022
aman+0.022
Pe+0.022
Sab+0.022
Neg logit contributions
der-0.030
ly-0.029
attered-0.029
ishy-0.028
 dot-0.025
Token were
Token ID547
Feature activation+5.03e+00

Pos logit contributions
Once+0.120
aman+0.107
Sab+0.105
At+0.103
Pe+0.102
Neg logit contributions
der-0.143
ly-0.138
attered-0.130
ishy-0.128
.-0.123
Token felt
Token ID2936
Feature activation-1.29e+00

Pos logit contributions
der+0.121
ly+0.114
attered+0.113
ishy+0.109
 dot+0.101
Neg logit contributions
Once-0.112
At-0.095
aman-0.095
Pe-0.093
Mary-0.093
Token so
Token ID523
Feature activation-3.74e+00

Pos logit contributions
der+0.041
attered+0.039
ly+0.038
ishy+0.036
 dot+0.035
Neg logit contributions
Once-0.044
At-0.037
 -0.037
Pe-0.035
Mary-0.033
Token wise
Token ID10787
Feature activation-4.63e+00

Pos logit contributions
der+0.112
attered+0.102
ly+0.097
 dot+0.095
ishy+0.091
Neg logit contributions
Once-0.137
At-0.119
 -0.116
ll-0.110
Pe-0.109
Token!
Token ID0
Feature activation-7.77e+00

Pos logit contributions
ishy+0.170
attered+0.163
 dot+0.150
 trigger+0.142
 obstacle+0.141
Neg logit contributions
Once-0.127
 -0.106
ll-0.105
At-0.095
n-0.094
Token
Token ID198
Feature activation+1.26e+01

Pos logit contributions
der+0.299
attered+0.276
ly+0.273
.+0.266
 because+0.259
Neg logit contributions
Once-0.179
Mary-0.175
aman-0.172
ma-0.159
Sab-0.158
Token
Token ID198
Feature activation+1.25e+01

Pos logit contributions
aman+0.439
 salon+0.396
 ha+0.387
aren+0.385
prise+0.380
Neg logit contributions
.-0.386
 because-0.296
 loudly-0.284
der-0.279
!-0.274
TokenBut
Token ID1537
Feature activation+3.32e+00

Pos logit contributions
Once+0.252
aman+0.216
Mary+0.208
aren+0.202
Sab+0.200
Neg logit contributions
der-0.528
ly-0.507
attered-0.499
ishy-0.481
 loudly-0.464
Token one
Token ID530
Feature activation-4.57e+00

Pos logit contributions
Once+0.120
Mary+0.112
Sab+0.109
aman+0.108
At+0.106
Neg logit contributions
der-0.088
ly-0.083
attered-0.076
 because-0.074
ishy-0.072
Token day
Token ID1110
Feature activation+2.22e+00

Pos logit contributions
der+0.169
attered+0.161
ly+0.150
ishy+0.149
 textures+0.138
Neg logit contributions
Once-0.138
At-0.112
 -0.106
ma-0.105
Mary-0.102
Token,
Token ID11
Feature activation+2.98e+00

Pos logit contributions
Once+0.075
At+0.065
aman+0.063
Pe+0.063
Sab+0.062
Neg logit contributions
der-0.067
ly-0.063
attered-0.063
ishy-0.061
 dot-0.055
Token while
Token ID981
Feature activation-9.39e-01

Pos logit contributions
Once+0.100
At+0.086
 +0.083
Pe+0.082
aman+0.082
Neg logit contributions
der-0.092
attered-0.087
ly-0.086
ishy-0.085
 textures-0.076
Token.
Token ID13
Feature activation-8.69e+00

Pos logit contributions
attered+0.152
ishy+0.145
 solid+0.122
der+0.121
 dot+0.121
Neg logit contributions
Once-0.208
ma-0.192
 -0.179
ll-0.176
Alice-0.175
Token I
Token ID314
Feature activation-3.37e+00

Pos logit contributions
der+0.226
ly+0.205
attered+0.202
ishy+0.194
 dot+0.179
Neg logit contributions
Once-0.325
At-0.289
aman-0.287
Mary-0.279
ll-0.278
Token'm
Token ID1101
Feature activation+3.32e+00

Pos logit contributions
der+0.106
ly+0.100
attered+0.099
ishy+0.096
 dot+0.088
Neg logit contributions
Once-0.109
At-0.094
aman-0.092
Pe-0.091
Mary-0.090
Token glad
Token ID9675
Feature activation+7.58e-01

Pos logit contributions
Once+0.108
At+0.094
aman+0.091
Pe+0.090
Mary+0.090
Neg logit contributions
der-0.103
ly-0.097
attered-0.096
ishy-0.093
 dot-0.085
Token I
Token ID314
Feature activation-3.40e+00

Pos logit contributions
Once+0.023
At+0.019
aman+0.019
Pe+0.019
Mary+0.019
Neg logit contributions
der-0.026
ly-0.024
attered-0.024
ishy-0.023
 dot-0.021
Token met
Token ID1138
Feature activation+9.76e+00

Pos logit contributions
der+0.103
ly+0.099
attered+0.096
ishy+0.095
 dot+0.087
Neg logit contributions
Once-0.109
aman-0.099
Mary-0.096
Sab-0.094
Pe-0.094
Token you
Token ID345
Feature activation+6.79e-01

Pos logit contributions
Once+0.287
At+0.240
 +0.234
Mary+0.233
aman+0.231
Neg logit contributions
der-0.335
attered-0.323
ishy-0.310
ly-0.310
 dot-0.285
Token."
Token ID526
Feature activation-1.48e+00

Pos logit contributions
Once+0.028
 +0.025
At+0.025
ll+0.024
ma+0.024
Neg logit contributions
der-0.015
attered-0.015
ishy-0.014
ly-0.014
 dot-0.012
Token Anna
Token ID11735
Feature activation+3.64e+00

Pos logit contributions
.+0.051
ly+0.043
 because+0.043
der+0.043
 loudly+0.041
Neg logit contributions
aman-0.042
fters-0.041
Once-0.039
ll-0.038
igi-0.037
Token said
Token ID531
Feature activation+6.14e+00

Pos logit contributions
Once+0.112
aman+0.103
Pe+0.103
At+0.102
igi+0.101
Neg logit contributions
der-0.116
ly-0.113
.-0.106
 loudly-0.098
 because-0.097
Token.
Token ID13
Feature activation-8.46e+00

Pos logit contributions
Once+0.225
 +0.215
ma+0.188
n+0.187
At+0.186
Neg logit contributions
attered-0.167
ishy-0.162
der-0.161
 textures-0.142
 dot-0.140
Token your
Token ID534
Feature activation+1.30e-01

Pos logit contributions
der+0.064
attered+0.064
ishy+0.061
ly+0.060
 dot+0.054
Neg logit contributions
Once-0.054
 -0.046
At-0.045
n-0.044
Sab-0.044
Token ball
Token ID2613
Feature activation-6.13e-01

Pos logit contributions
Once+0.005
 +0.004
At+0.004
n+0.004
ma+0.004
Neg logit contributions
der-0.004
attered-0.004
ly-0.004
ishy-0.003
 loudly-0.003
Token big
Token ID1263
Feature activation-4.91e+00

Pos logit contributions
der+0.018
attered+0.018
ishy+0.016
ly+0.016
elly+0.014
Neg logit contributions
Once-0.021
At-0.018
 -0.018
n-0.017
ll-0.017
Token again
Token ID757
Feature activation+2.23e+00

Pos logit contributions
attered+0.158
ishy+0.132
ly+0.130
der+0.126
working+0.118
Neg logit contributions
Once-0.155
 -0.136
ma-0.122
At-0.117
 ha-0.116
Token."
Token ID526
Feature activation-3.48e+00

Pos logit contributions
Once+0.091
At+0.082
 +0.080
n+0.078
ll+0.078
Neg logit contributions
attered-0.053
der-0.049
ishy-0.047
ly-0.045
 dot-0.039
Token
Token ID198
Feature activation+1.28e+01

Pos logit contributions
der+0.128
.+0.125
ly+0.122
 because+0.118
attered+0.115
Neg logit contributions
Once-0.078
aman-0.071
fters-0.070
Sab-0.069
At-0.068
Token
Token ID198
Feature activation+1.29e+01

Pos logit contributions
aman+0.361
 salon+0.341
Once+0.332
fters+0.327
Pe+0.316
Neg logit contributions
.-0.427
der-0.342
 loudly-0.341
ly-0.338
!-0.316
TokenLu
Token ID25596
Feature activation+4.60e+00

Pos logit contributions
Once+0.305
aman+0.297
ll+0.279
 +0.270
OV+0.270
Neg logit contributions
der-0.419
ly-0.408
 loudly-0.401
attered-0.388
.-0.373
Tokency
Token ID948
Feature activation-1.92e+00

Pos logit contributions
Once+0.206
Sab+0.185
At+0.184
Mary+0.177
Pe+0.176
Neg logit contributions
ly-0.086
attered-0.082
der-0.077
ishy-0.074
 dot-0.060
Token and
Token ID290
Feature activation+1.03e+00

Pos logit contributions
der+0.064
ly+0.061
attered+0.058
ishy+0.055
 loudly+0.054
Neg logit contributions
Once-0.058
Pe-0.050
aman-0.050
Sab-0.049
At-0.049
Token the
Token ID262
Feature activation+8.22e-01

Pos logit contributions
Once+0.036
At+0.033
ll+0.033
Sab+0.032
aman+0.031
Neg logit contributions
der-0.027
ly-0.026
 loudly-0.023
.-0.023
 because-0.023
TokenAt
Token ID2953
Feature activation+9.15e+00

Pos logit contributions
ll+0.254
aman+0.249
OV+0.225
aren+0.221
bur+0.221
Neg logit contributions
 loudly-0.301
.-0.288
attered-0.285
der-0.281
 than-0.272
Token the
Token ID262
Feature activation-1.22e+00

Pos logit contributions
Once+0.334
Pe+0.301
At+0.297
Sab+0.296
Mary+0.285
Neg logit contributions
der-0.237
ly-0.217
.-0.199
 loudly-0.199
attered-0.198
Token end
Token ID886
Feature activation-2.87e-01

Pos logit contributions
der+0.041
ly+0.039
attered+0.038
ishy+0.037
 dot+0.035
Neg logit contributions
Once-0.036
At-0.031
aman-0.030
Pe-0.030
Sab-0.030
Token of
Token ID286
Feature activation-2.58e+00

Pos logit contributions
der+0.006
ly+0.006
attered+0.006
ishy+0.005
 dot+0.005
Neg logit contributions
Once-0.012
At-0.011
aman-0.011
Pe-0.011
Sab-0.010
Token the
Token ID262
Feature activation-8.74e-01

Pos logit contributions
.+0.050
der+0.048
 because+0.046
 orange+0.041
 loudly+0.041
Neg logit contributions
Once-0.104
ll-0.100
At-0.097
Sab-0.096
Pe-0.095
Token day
Token ID1110
Feature activation+1.20e+00

Pos logit contributions
der+0.025
ly+0.024
attered+0.023
ishy+0.023
 dot+0.021
Neg logit contributions
Once-0.030
At-0.026
Pe-0.026
aman-0.025
Sab-0.025
Token,
Token ID11
Feature activation+2.30e+00

Pos logit contributions
Once+0.043
At+0.037
aman+0.037
Pe+0.037
Sab+0.037
Neg logit contributions
der-0.033
ly-0.031
attered-0.030
ishy-0.029
 dot-0.027
Token the
Token ID262
Feature activation-4.37e-01

Pos logit contributions
Once+0.070
At+0.060
aman+0.059
Sab+0.058
Pe+0.058
Neg logit contributions
der-0.075
ly-0.071
attered-0.070
ishy-0.068
 loudly-0.064
Token daughter
Token ID4957
Feature activation-3.96e+00

Pos logit contributions
der+0.012
attered+0.012
ly+0.011
ishy+0.011
 textures+0.009
Neg logit contributions
Once-0.016
At-0.014
ma-0.013
Pe-0.013
 -0.013
Token hugged
Token ID41130
Feature activation+1.02e+00

Pos logit contributions
der+0.120
ly+0.115
attered+0.108
.+0.103
ishy+0.102
Neg logit contributions
Once-0.126
Pe-0.113
Sab-0.110
At-0.108
aman-0.108
Token her
Token ID607
Feature activation-1.37e+00

Pos logit contributions
Once+0.031
At+0.027
aman+0.026
Pe+0.026
Sab+0.026
Neg logit contributions
der-0.034
ly-0.032
attered-0.031
ishy-0.030
 dot-0.028
Token excited
Token ID6568
Feature activation-1.84e+00

Pos logit contributions
der+0.118
ly+0.112
attered+0.111
ishy+0.108
 loudly+0.099
Neg logit contributions
Once-0.119
At-0.103
aman-0.102
Sab-0.100
Pe-0.099
Token.
Token ID13
Feature activation-8.78e+00

Pos logit contributions
ishy+0.044
attered+0.042
 dot+0.040
der+0.038
 floppy+0.033
Neg logit contributions
Once-0.074
n-0.069
 -0.068
ma-0.067
ll-0.067
Token He
Token ID679
Feature activation-3.97e+00

Pos logit contributions
ishy+0.242
ly+0.241
der+0.238
attered+0.227
 dot+0.215
Neg logit contributions
Once-0.309
 -0.289
At-0.273
aman-0.258
Pe-0.250
Token ran
Token ID4966
Feature activation-1.25e+00

Pos logit contributions
der+0.141
ly+0.134
attered+0.132
ishy+0.129
 dot+0.119
Neg logit contributions
Once-0.112
At-0.094
aman-0.092
Pe-0.091
Sab-0.090
Token around
Token ID1088
Feature activation+1.94e+00

Pos logit contributions
der+0.034
attered+0.031
ly+0.031
ishy+0.030
 dot+0.027
Neg logit contributions
Once-0.047
At-0.041
 -0.040
Mary-0.039
aman-0.039
Token the
Token ID262
Feature activation+1.32e-01

Pos logit contributions
Once+0.067
aman+0.060
Sab+0.059
Pe+0.059
At+0.059
Neg logit contributions
der-0.055
ly-0.052
attered-0.048
ishy-0.047
 loudly-0.046
Token store
Token ID3650
Feature activation-7.52e-01

Pos logit contributions
Once+0.003
At+0.003
 +0.003
aman+0.003
Pe+0.003
Neg logit contributions
der-0.005
ly-0.005
attered-0.005
ishy-0.005
 textures-0.004
Token with
Token ID351
Feature activation+1.79e+00

Pos logit contributions
der+0.021
ishy+0.020
attered+0.020
ly+0.019
 dot+0.018
Neg logit contributions
Once-0.027
 -0.024
At-0.023
ll-0.022
ma-0.022
Token it
Token ID340
Feature activation-7.69e-01

Pos logit contributions
Once+0.052
 +0.045
At+0.042
ma+0.039
Mary+0.039
Neg logit contributions
der-0.067
attered-0.064
ly-0.063
ishy-0.061
 dot-0.055
Token and
Token ID290
Feature activation+2.23e-01

Pos logit contributions
attered+0.018
ishy+0.018
der+0.018
ly+0.016
 dot+0.016
Neg logit contributions
Once-0.031
 -0.028
At-0.027
ll-0.026
n-0.025
Token gave
Token ID2921
Feature activation+2.25e+00

Pos logit contributions
Once+0.007
aman+0.007
Sab+0.006
Pe+0.006
Mary+0.006
Neg logit contributions
der-0.007
ly-0.006
attered-0.006
ishy-0.006
 loudly-0.006
Token to
Token ID284
Feature activation+3.38e+00

Pos logit contributions
Once+0.022
At+0.019
aman+0.019
Pe+0.018
Mary+0.018
Neg logit contributions
der-0.018
ly-0.016
attered-0.016
ishy-0.016
 dot-0.014
Token the
Token ID262
Feature activation+1.13e+00

Pos logit contributions
Once+0.113
At+0.100
aman+0.098
Pe+0.096
Mary+0.096
Neg logit contributions
der-0.101
ly-0.095
attered-0.091
ishy-0.089
 loudly-0.084
Token park
Token ID3952
Feature activation+2.02e+00

Pos logit contributions
Once+0.037
aman+0.032
At+0.032
Mary+0.032
Pe+0.032
Neg logit contributions
der-0.034
ly-0.032
attered-0.031
ishy-0.030
 dot-0.029
Token and
Token ID290
Feature activation+1.19e+00

Pos logit contributions
Once+0.071
At+0.061
 +0.060
aman+0.059
Mary+0.058
Neg logit contributions
der-0.058
attered-0.056
ishy-0.055
ly-0.054
 textures-0.049
Token look
Token ID804
Feature activation-6.56e-01

Pos logit contributions
Once+0.041
aman+0.037
Pe+0.036
At+0.036
Mary+0.036
Neg logit contributions
der-0.033
ly-0.031
attered-0.029
ishy-0.028
 loudly-0.027
Token for
Token ID329
Feature activation+2.88e+00

Pos logit contributions
der+0.017
attered+0.016
 dot+0.015
 textures+0.015
ishy+0.014
Neg logit contributions
Once-0.025
 -0.022
At-0.022
aman-0.020
ma-0.020
Token it
Token ID340
Feature activation+4.15e-01

Pos logit contributions
Once+0.101
At+0.086
 +0.086
aman+0.085
Pe+0.084
Neg logit contributions
der-0.084
attered-0.081
ishy-0.076
ly-0.076
 dot-0.070
Token again
Token ID757
Feature activation+2.88e+00

Pos logit contributions
Once+0.015
At+0.013
ll+0.012
 +0.012
ma+0.012
Neg logit contributions
der-0.012
attered-0.012
ishy-0.011
ly-0.011
 dot-0.010
Token.
Token ID13
Feature activation-8.04e+00

Pos logit contributions
Once+0.113
At+0.099
 +0.096
Mary+0.095
aman+0.094
Neg logit contributions
der-0.073
attered-0.073
ishy-0.068
ly-0.067
 dot-0.061
Token<|endoftext|>
Token ID50256
Feature activation+1.46e+01

Pos logit contributions
der+0.235
attered+0.216
ly+0.206
ishy+0.197
 because+0.193
Neg logit contributions
Once-0.269
Sab-0.244
aman-0.240
Pe-0.239
Mary-0.238
TokenOnce
Token ID7454
Feature activation+1.29e+01

Pos logit contributions
aman+0.401
Sab+0.360
Pe+0.359
fters+0.357
shadow+0.354
Neg logit contributions
.-0.540
der-0.423
!-0.422
 because-0.413
ly-0.409
Token little
Token ID1310
Feature activation-2.11e+00

Pos logit contributions
Once+0.050
At+0.043
aman+0.042
 +0.041
ma+0.041
Neg logit contributions
der-0.038
attered-0.036
ly-0.034
ishy-0.034
 dot-0.030
Token boy
Token ID2933
Feature activation+2.26e+00

Pos logit contributions
der+0.065
attered+0.059
ishy+0.056
ly+0.055
 loudly+0.051
Neg logit contributions
Once-0.072
 -0.062
Pe-0.060
At-0.060
ma-0.059
Token grinned
Token ID36268
Feature activation+3.97e+00

Pos logit contributions
Once+0.069
Pe+0.064
aman+0.062
Sab+0.061
At+0.061
Neg logit contributions
der-0.072
ly-0.068
 loudly-0.063
.-0.062
attered-0.062
Token and
Token ID290
Feature activation+6.15e-01

Pos logit contributions
Once+0.112
 +0.090
ma+0.082
<|endoftext|>+0.081
+0.081
Neg logit contributions
attered-0.140
ishy-0.136
working-0.135
 sizes-0.134
powers-0.131
Token said
Token ID531
Feature activation+5.40e+00

Pos logit contributions
Once+0.016
At+0.013
aman+0.013
Pe+0.013
Sab+0.013
Neg logit contributions
der-0.023
ly-0.022
attered-0.022
ishy-0.021
 loudly-0.020
Token,
Token ID11
Feature activation+3.21e+00

Pos logit contributions
Once+0.163
 +0.141
At+0.136
Sab+0.132
aman+0.132
Neg logit contributions
der-0.176
attered-0.173
ishy-0.167
ly-0.163
 dot-0.157
Token "
Token ID366
Feature activation+3.12e+00

Pos logit contributions
Once+0.118
aman+0.108
At+0.108
Sab+0.106
Mary+0.106
Neg logit contributions
der-0.086
ly-0.074
 loudly-0.071
.-0.070
 because-0.068
TokenI
Token ID40
Feature activation-4.46e+00

Pos logit contributions
Once+0.089
At+0.075
aman+0.074
Pe+0.072
Sab+0.072
Neg logit contributions
der-0.109
ly-0.104
attered-0.103
ishy-0.100
 dot-0.093
Token was
Token ID373
Feature activation+2.30e+00

Pos logit contributions
der+0.172
ly+0.164
attered+0.162
ishy+0.156
 textures+0.147
Neg logit contributions
Once-0.115
At-0.095
aman-0.089
ll-0.088
 -0.088
Token whist
Token ID32262
Feature activation+4.23e+00

Pos logit contributions
Once+0.069
 +0.057
At+0.056
n+0.055
Mary+0.054
Neg logit contributions
der-0.080
attered-0.076
ly-0.076
ishy-0.072
 textures-0.068
Tokenling
Token ID1359
Feature activation-3.26e+00

Pos logit contributions
Once+0.138
ma+0.119
ll+0.115
Jimmy+0.112
n+0.112
Neg logit contributions
der-0.131
attered-0.118
 textures-0.117
 possession-0.106
 sizes-0.105
Token
Token ID198
Feature activation+1.27e+01

Pos logit contributions
der+0.187
ly+0.181
attered+0.170
.+0.170
 because+0.163
Neg logit contributions
Once-0.328
aman-0.315
ll-0.313
Sab-0.307
igi-0.304
Token
Token ID198
Feature activation+1.27e+01

Pos logit contributions
aman+0.459
 salon+0.410
fters+0.401
aren+0.385
 ha+0.385
Neg logit contributions
.-0.390
 because-0.285
 loudly-0.275
!-0.265
."-0.259
Token"
Token ID1
Feature activation+3.03e+00

Pos logit contributions
Once+0.276
aman+0.243
ll+0.230
Sab+0.226
ma+0.217
Neg logit contributions
der-0.491
ly-0.491
attered-0.479
ishy-0.463
 loudly-0.452
TokenHelp
Token ID22087
Feature activation+7.75e+00

Pos logit contributions
ll+0.095
aman+0.085
n+0.084
Once+0.081
ma+0.080
Neg logit contributions
 dot-0.088
ly-0.082
der-0.081
 loudly-0.076
attered-0.076
Token me
Token ID502
Feature activation+5.92e+00

Pos logit contributions
Once+0.288
At+0.266
aman+0.253
Pe+0.253
Sab+0.250
Neg logit contributions
ly-0.199
der-0.186
ishy-0.172
attered-0.169
.-0.164
Token,
Token ID11
Feature activation+3.12e+00

Pos logit contributions
Once+0.189
 +0.155
At+0.154
ll+0.152
aman+0.150
Neg logit contributions
der-0.198
attered-0.187
ishy-0.176
ly-0.175
 textures-0.166
Token wise
Token ID10787
Feature activation-4.55e+00

Pos logit contributions
Once+0.106
At+0.090
Pe+0.088
aman+0.087
Sab+0.087
Neg logit contributions
der-0.096
attered-0.091
ly-0.089
ishy-0.087
 textures-0.079
Token cat
Token ID3797
Feature activation+1.61e+00

Pos logit contributions
attered+0.159
ishy+0.150
der+0.139
ly+0.124
 textures+0.117
Neg logit contributions
Once-0.133
Sab-0.122
aman-0.119
Pe-0.118
 -0.117
Token!"
Token ID2474
Feature activation-3.43e-02

Pos logit contributions
Once+0.054
ma+0.046
 +0.046
ll+0.045
Sab+0.045
Neg logit contributions
der-0.049
attered-0.047
ishy-0.046
 textures-0.043
ly-0.042
Token said
Token ID531
Feature activation+5.06e+00

Pos logit contributions
der+0.001
ly+0.001
attered+0.001
ishy+0.001
 loudly+0.001
Neg logit contributions
Once-0.001
At-0.001
aman-0.001
Pe-0.001
Sab-0.001
Token the
Token ID262
Feature activation+2.95e-01

Pos logit contributions
Once+0.132
 +0.114
At+0.106
Pe+0.105
Mary+0.101
Neg logit contributions
der-0.197
attered-0.189
ishy-0.183
ly-0.181
 textures-0.166
Token turned
Token ID2900
Feature activation-1.30e+00

Pos logit contributions
Once+0.047
Pe+0.042
Sab+0.041
At+0.040
Wil+0.039
Neg logit contributions
der-0.057
ly-0.056
attered-0.049
ishy-0.048
.-0.048
Token around
Token ID1088
Feature activation+1.90e+00

Pos logit contributions
der+0.034
attered+0.031
ishy+0.029
ly+0.029
 textures+0.027
Neg logit contributions
Once-0.050
At-0.044
 -0.044
aman-0.042
ma-0.042
Token and
Token ID290
Feature activation+2.21e-01

Pos logit contributions
Once+0.080
At+0.071
aman+0.070
Pe+0.070
Sab+0.070
Neg logit contributions
der-0.041
ly-0.038
attered-0.037
ishy-0.035
 dot-0.031
Token saw
Token ID2497
Feature activation+4.41e+00

Pos logit contributions
Once+0.006
Pe+0.006
aman+0.005
Sab+0.005
igi+0.005
Neg logit contributions
ly-0.008
der-0.008
attered-0.007
 loudly-0.007
ishy-0.007
Token his
Token ID465
Feature activation-7.21e-01

Pos logit contributions
Once+0.140
At+0.120
aman+0.118
Pe+0.116
Mary+0.115
Neg logit contributions
der-0.141
attered-0.134
ly-0.133
ishy-0.130
 dot-0.118
Token papers
Token ID9473
Feature activation-5.84e+00

Pos logit contributions
der+0.021
attered+0.020
ishy+0.019
ly+0.019
 textures+0.017
Neg logit contributions
Once-0.023
ma-0.021
 -0.021
aman-0.021
At-0.020
Token.
Token ID13
Feature activation-9.09e+00

Pos logit contributions
attered+0.117
der+0.116
ishy+0.116
ly+0.110
 textures+0.096
Neg logit contributions
Once-0.252
At-0.223
 -0.218
ma-0.214
n-0.209
Token He
Token ID679
Feature activation-3.54e+00

Pos logit contributions
der+0.142
ly+0.115
attered+0.107
ishy+0.092
.+0.089
Neg logit contributions
Once-0.435
ll-0.411
aman-0.402
At-0.400
ma-0.394
Token was
Token ID373
Feature activation+2.35e+00

Pos logit contributions
der+0.106
ly+0.101
 loudly+0.092
attered+0.091
.+0.089
Neg logit contributions
Once-0.112
aman-0.111
Pe-0.102
Sab-0.102
igi-0.101
Token very
Token ID845
Feature activation-6.10e+00

Pos logit contributions
Once+0.094
aman+0.092
ll+0.088
At+0.088
Sab+0.086
Neg logit contributions
der-0.050
attered-0.044
 loudly-0.043
.-0.042
ly-0.042
Token shocked
Token ID11472
Feature activation-4.07e+00

Pos logit contributions
der+0.152
ly+0.144
attered+0.142
ishy+0.140
 loudly+0.130
Neg logit contributions
Once-0.233
aman-0.209
At-0.206
Sab-0.203
Pe-0.202
Token
Token ID198
Feature activation+1.33e+01

Pos logit contributions
der+0.277
.+0.272
 because+0.263
 loudly+0.258
ly+0.245
Neg logit contributions
prise-0.207
aman-0.190
Once-0.189
Sab-0.188
ma-0.187
Token
Token ID198
Feature activation+1.33e+01

Pos logit contributions
aman+0.461
 salon+0.457
fters+0.423
Pe+0.408
 ha+0.404
Neg logit contributions
.-0.403
 because-0.305
 loudly-0.298
!-0.274
der-0.274
TokenTim
Token ID14967
Feature activation+7.53e+00

Pos logit contributions
Once+0.280
aman+0.256
ll+0.238
Sab+0.233
aren+0.228
Neg logit contributions
der-0.526
ly-0.515
attered-0.500
 loudly-0.493
.-0.465
Tokenmy
Token ID1820
Feature activation+3.13e+00

Pos logit contributions
Once+0.184
At+0.154
Pe+0.154
aman+0.150
Sab+0.149
Neg logit contributions
der-0.286
ly-0.285
attered-0.276
ishy-0.267
 loudly-0.252
Token felt
Token ID2936
Feature activation-4.84e-01

Pos logit contributions
Once+0.110
aman+0.098
At+0.097
Pe+0.097
Sab+0.096
Neg logit contributions
der-0.087
ly-0.085
attered-0.080
ishy-0.075
 loudly-0.073
Token sorry
Token ID7926
Feature activation-2.30e+00

Pos logit contributions
der+0.014
attered+0.014
ly+0.013
ishy+0.013
 textures+0.013
Neg logit contributions
Once-0.017
 -0.015
At-0.014
Pe-0.014
<|endoftext|>-0.013
Token for
Token ID329
Feature activation+2.51e+00

Pos logit contributions
attered+0.075
ishy+0.072
 textures+0.065
 dot+0.064
 sizes+0.063
Neg logit contributions
Once-0.073
 -0.065
ll-0.060
At-0.057
<|endoftext|>-0.055
Token taking
Token ID2263
Feature activation-9.81e-02

Pos logit contributions
Once+0.088
At+0.081
Sab+0.078
aman+0.078
Pe+0.076
Neg logit contributions
der-0.070
ly-0.066
 because-0.060
 loudly-0.060
.-0.059
Token George
Token ID4502
Feature activation+2.78e+00

Pos logit contributions
der+0.003
attered+0.003
ishy+0.003
ly+0.003
 dot+0.003
Neg logit contributions
Once-0.003
 -0.003
At-0.003
Pe-0.002
n-0.002
Token's
Token ID338
Feature activation-7.00e-01

Pos logit contributions
Once+0.096
 +0.077
At+0.077
ma+0.077
ll+0.075
Neg logit contributions
der-0.091
ishy-0.081
attered-0.080
ly-0.075
 dot-0.073
Token banana
Token ID25996
Feature activation+4.19e-01

Pos logit contributions
der+0.018
ly+0.017
attered+0.017
ishy+0.016
 dot+0.014
Neg logit contributions
Once-0.026
At-0.023
aman-0.023
Pe-0.023
Sab-0.023
Token ran
Token ID4966
Feature activation-9.89e-02

Pos logit contributions
der+0.099
ly+0.090
attered+0.090
ishy+0.087
 dot+0.082
Neg logit contributions
Once-0.086
At-0.075
 -0.074
n-0.069
ll-0.068
Token together
Token ID1978
Feature activation-8.22e-01

Pos logit contributions
der+0.004
attered+0.003
ishy+0.003
 dot+0.003
 textures+0.003
Neg logit contributions
Once-0.003
 -0.003
At-0.003
n-0.002
Mary-0.002
Token to
Token ID284
Feature activation+3.38e+00

Pos logit contributions
der+0.022
attered+0.021
ly+0.020
ishy+0.020
 dot+0.017
Neg logit contributions
Once-0.032
At-0.028
 -0.028
ll-0.027
n-0.026
Token the
Token ID262
Feature activation+1.09e+00

Pos logit contributions
Once+0.105
At+0.090
aman+0.087
Pe+0.087
Mary+0.086
Neg logit contributions
der-0.111
ly-0.105
attered-0.104
ishy-0.101
 dot-0.092
Token river
Token ID7850
Feature activation+3.53e-01

Pos logit contributions
Once+0.035
At+0.030
 +0.029
aman+0.028
Pe+0.028
Neg logit contributions
der-0.036
attered-0.035
ly-0.035
ishy-0.034
 than-0.030
Token.
Token ID13
Feature activation-8.13e+00

Pos logit contributions
Once+0.017
 +0.016
At+0.016
n+0.015
ll+0.015
Neg logit contributions
der-0.005
ishy-0.005
attered-0.005
ly-0.005
 dot-0.004
Token As
Token ID1081
Feature activation+1.88e-01

Pos logit contributions
der+0.299
attered+0.279
ishy+0.268
ly+0.268
.+0.265
Neg logit contributions
Once-0.192
ma-0.183
Sab-0.173
aman-0.171
ll-0.169
Token soon
Token ID2582
Feature activation-1.56e+00

Pos logit contributions
Once+0.006
aman+0.006
At+0.006
Sab+0.005
Pe+0.005
Neg logit contributions
der-0.005
ly-0.005
attered-0.005
ishy-0.005
 loudly-0.004
Token as
Token ID355
Feature activation-2.98e+00

Pos logit contributions
der+0.024
ly+0.021
attered+0.021
ishy+0.020
 dot+0.016
Neg logit contributions
Once-0.075
At-0.068
aman-0.067
Pe-0.066
Mary-0.066
Token they
Token ID484
Feature activation-2.95e+00

Pos logit contributions
der+0.089
ly+0.084
attered+0.083
ishy+0.081
 dot+0.074
Neg logit contributions
Once-0.100
At-0.086
aman-0.085
Sab-0.083
Pe-0.083
Token reached
Token ID4251
Feature activation+8.02e+00

Pos logit contributions
der+0.103
ly+0.098
attered+0.097
ishy+0.096
 dot+0.088
Neg logit contributions
Once-0.082
aman-0.072
Sab-0.070
Pe-0.070
Alice-0.069
Token loved
Token ID6151
Feature activation-1.30e+00

Pos logit contributions
der+0.167
ly+0.159
attered+0.158
ishy+0.154
 dot+0.143
Neg logit contributions
Once-0.142
At-0.121
aman-0.115
Mary-0.113
 -0.113
Token the
Token ID262
Feature activation+2.63e-01

Pos logit contributions
der+0.043
attered+0.043
ly+0.042
ishy+0.042
 dot+0.038
Neg logit contributions
Once-0.039
At-0.033
 -0.033
aman-0.032
Mary-0.031
Token warm
Token ID5814
Feature activation-4.66e+00

Pos logit contributions
Once+0.009
 +0.008
ma+0.007
At+0.007
n+0.007
Neg logit contributions
attered-0.008
der-0.008
ly-0.007
ishy-0.007
 dot-0.006
Token water
Token ID1660
Feature activation-3.75e+00

Pos logit contributions
der+0.128
attered+0.126
aly+0.121
powers+0.115
ishy+0.114
Neg logit contributions
Once-0.143
 -0.134
ma-0.130
Kids-0.123
At-0.121
Token and
Token ID290
Feature activation+8.73e-01

Pos logit contributions
der+0.082
attered+0.081
ishy+0.077
 dot+0.075
powers+0.074
Neg logit contributions
Once-0.139
 -0.127
ma-0.124
ll-0.120
At-0.119
Token bubbles
Token ID25037
Feature activation-2.56e+00

Pos logit contributions
Once+0.026
At+0.023
 +0.022
n+0.021
Mary+0.021
Neg logit contributions
der-0.029
ly-0.028
attered-0.028
ishy-0.027
 dot-0.026
Token.
Token ID13
Feature activation-8.50e+00

Pos logit contributions
attered+0.048
ishy+0.046
der+0.043
ly+0.040
 dot+0.039
Neg logit contributions
Once-0.110
At-0.097
 -0.097
ll-0.096
n-0.094
Token She
Token ID1375
Feature activation-3.65e+00

Pos logit contributions
der+0.239
ly+0.225
attered+0.222
ishy+0.215
 dot+0.194
Neg logit contributions
Once-0.303
At-0.265
aman-0.259
Pe-0.256
Mary-0.255
Token spl
Token ID4328
Feature activation+4.37e-01

Pos logit contributions
der+0.118
ly+0.111
attered+0.110
ishy+0.106
 loudly+0.098
Neg logit contributions
Once-0.114
At-0.097
aman-0.097
Pe-0.096
Sab-0.095
Tokenashed
Token ID5263
Feature activation+1.10e+00

Pos logit contributions
Once+0.016
At+0.014
ll+0.014
aman+0.014
Pe+0.014
Neg logit contributions
der-0.012
ly-0.010
attered-0.010
 dot-0.009
ishy-0.009
Token and
Token ID290
Feature activation+1.02e+00

Pos logit contributions
Once+0.043
At+0.038
aman+0.037
Pe+0.037
Mary+0.037
Neg logit contributions
der-0.027
ly-0.025
attered-0.025
ishy-0.024
 dot-0.021
Token was
Token ID373
Feature activation+3.37e+00

Pos logit contributions
Once+0.046
At+0.039
aman+0.039
Pe+0.039
Mary+0.038
Neg logit contributions
der-0.050
ly-0.048
attered-0.047
ishy-0.045
 loudly-0.043
Token so
Token ID523
Feature activation-2.76e+00

Pos logit contributions
Once+0.144
aman+0.133
At+0.131
Pe+0.128
Sab+0.128
Neg logit contributions
der-0.067
ly-0.060
attered-0.060
ishy-0.056
 loudly-0.055
Token curious
Token ID11040
Feature activation-6.96e+00

Pos logit contributions
der+0.068
ly+0.064
attered+0.063
ishy+0.062
 loudly+0.060
Neg logit contributions
Once-0.104
aman-0.096
At-0.093
Pe-0.092
Sab-0.091
Token that
Token ID326
Feature activation-3.37e+00

Pos logit contributions
attered+0.257
ishy+0.249
der+0.219
istered+0.210
 dot+0.209
Neg logit contributions
Once-0.208
 -0.181
At-0.162
n-0.161
ma-0.160
Token he
Token ID339
Feature activation-2.83e+00

Pos logit contributions
der+0.112
attered+0.112
ishy+0.109
ly+0.106
 dot+0.095
Neg logit contributions
Once-0.110
 -0.094
At-0.093
Sab-0.087
Mary-0.086
Token decided
Token ID3066
Feature activation-2.80e+00

Pos logit contributions
der+0.097
attered+0.092
ishy+0.090
ly+0.090
 dot+0.082
Neg logit contributions
Once-0.084
At-0.073
Sab-0.068
 -0.068
Mary-0.068
Token to
Token ID284
Feature activation+3.70e+00

Pos logit contributions
der+0.070
attered+0.065
ly+0.065
ishy+0.063
 dot+0.056
Neg logit contributions
Once-0.109
At-0.096
aman-0.094
Pe-0.093
Mary-0.093
Token go
Token ID467
Feature activation+3.52e+00

Pos logit contributions
Once+0.122
At+0.105
aman+0.102
Pe+0.101
Mary+0.101
Neg logit contributions
der-0.115
ly-0.108
attered-0.108
ishy-0.104
 dot-0.094
Token and
Token ID290
Feature activation+1.39e+00

Pos logit contributions
Once+0.127
At+0.111
 +0.108
ll+0.106
Sab+0.105
Neg logit contributions
der-0.104
attered-0.096
ly-0.092
ishy-0.091
 dot-0.083
Token see
Token ID766
Feature activation+4.89e+00

Pos logit contributions
Once+0.042
At+0.036
aman+0.035
Pe+0.035
Mary+0.034
Neg logit contributions
der-0.046
ly-0.044
attered-0.044
ishy-0.042
 dot-0.039
Token what
Token ID644
Feature activation-1.91e+00

Pos logit contributions
Once+0.125
 +0.108
At+0.100
ma+0.097
Mary+0.094
Neg logit contributions
attered-0.196
der-0.185
ishy-0.185
ly-0.178
 dot-0.166
Token but
Token ID475
Feature activation+8.33e-01

Pos logit contributions
Once+0.074
At+0.065
aman+0.064
Pe+0.063
Mary+0.063
Neg logit contributions
der-0.054
ly-0.051
attered-0.050
ishy-0.048
 dot-0.044
Token a
Token ID257
Feature activation-6.34e+00

Pos logit contributions
Once+0.028
At+0.024
aman+0.024
Pe+0.024
Mary+0.024
Neg logit contributions
der-0.025
ly-0.023
attered-0.023
ishy-0.022
 dot-0.020
Token friendship
Token ID14738
Feature activation-1.72e+00

Pos logit contributions
der+0.170
ly+0.167
attered+0.165
ishy+0.158
 loudly+0.135
Neg logit contributions
Once-0.235
At-0.204
 -0.198
aman-0.196
Pe-0.196
Token can
Token ID460
Feature activation-2.18e+00

Pos logit contributions
attered+0.056
ishy+0.056
der+0.055
ly+0.053
 loudly+0.049
Neg logit contributions
Once-0.048
At-0.043
 -0.042
ma-0.042
ll-0.041
Token mend
Token ID47618
Feature activation-2.45e+00

Pos logit contributions
der+0.069
attered+0.068
ly+0.067
ishy+0.066
 dot+0.058
Neg logit contributions
Once-0.069
At-0.059
ll-0.056
 -0.056
Sab-0.056
Token.
Token ID13
Feature activation-1.03e+01

Pos logit contributions
attered+0.072
der+0.071
ishy+0.068
ly+0.064
 dot+0.061
Neg logit contributions
Once-0.073
ma-0.072
 -0.068
n-0.066
At-0.065
Token<|endoftext|>
Token ID50256
Feature activation+1.22e+01

Pos logit contributions
der+0.322
attered+0.289
ly+0.280
ishy+0.275
 dot+0.265
Neg logit contributions
Once-0.332
aman-0.295
At-0.291
ll-0.283
Mary-0.282
TokenTom
Token ID13787
Feature activation+7.36e+00

Pos logit contributions
Once+0.191
aman+0.177
Sab+0.158
Pe+0.155
Mary+0.155
Neg logit contributions
der-0.550
ly-0.530
.-0.522
attered-0.520
ishy-0.513
Token and
Token ID290
Feature activation-9.27e-01

Pos logit contributions
Once+0.229
At+0.195
aman+0.189
 +0.187
Mary+0.185
Neg logit contributions
der-0.242
ly-0.227
attered-0.225
ishy-0.218
 dot-0.202
Token Lily
Token ID20037
Feature activation+2.07e+00

Pos logit contributions
der+0.032
attered+0.029
ly+0.029
ishy+0.029
 dot+0.026
Neg logit contributions
Once-0.029
At-0.025
 -0.024
Pe-0.023
Mary-0.023
Token went
Token ID1816
Feature activation+4.31e-01

Pos logit contributions
Once+0.060
At+0.051
aman+0.048
 +0.048
ma+0.047
Neg logit contributions
der-0.074
ly-0.069
attered-0.068
ishy-0.066
 dot-0.062
Token looks
Token ID3073
Feature activation-5.57e+00

Pos logit contributions
der+0.048
attered+0.046
ly+0.044
ishy+0.043
 textures+0.039
Neg logit contributions
Once-0.047
At-0.041
 -0.039
ll-0.039
n-0.038
Token,"
Token ID553
Feature activation+3.29e+00

Pos logit contributions
der+0.144
attered+0.142
ishy+0.126
ly+0.122
 dot+0.117
Neg logit contributions
Once-0.208
 -0.181
At-0.178
aman-0.171
ll-0.170
Token her
Token ID607
Feature activation-1.49e+00

Pos logit contributions
Once+0.116
ll+0.115
aman+0.109
At+0.108
igi+0.105
Neg logit contributions
der-0.081
ly-0.079
.-0.077
 loudly-0.076
 because-0.070
Token mom
Token ID1995
Feature activation+1.43e+00

Pos logit contributions
der+0.048
ly+0.045
attered+0.045
ishy+0.043
 dot+0.040
Neg logit contributions
Once-0.047
At-0.041
aman-0.040
Pe-0.039
Mary-0.039
Token said
Token ID531
Feature activation+4.34e+00

Pos logit contributions
Pe+0.045
Once+0.043
At+0.041
igi+0.040
Sab+0.039
Neg logit contributions
ly-0.046
.-0.044
der-0.042
 loudly-0.037
 because-0.037
Token.
Token ID13
Feature activation-9.95e+00

Pos logit contributions
Once+0.204
 +0.195
ma+0.182
n+0.180
At+0.178
Neg logit contributions
der-0.067
ishy-0.064
attered-0.063
 textures-0.053
 dot-0.051
Token\n
Token ID198
Feature activation+1.16e+01

Pos logit contributions
der+0.286
.+0.265
 because+0.240
 loudly+0.237
ly+0.234
Neg logit contributions
ll-0.324
Once-0.323
fters-0.310
prise-0.309
igi-0.309
Token\n
Token ID198
Feature activation+1.18e+01

Pos logit contributions
ll+0.347
OV+0.341
bur+0.340
aman+0.338
fters+0.338
Neg logit contributions
.-0.352
 loudly-0.259
der-0.243
!-0.240
 They-0.226
TokenLu
Token ID25596
Feature activation+3.92e+00

Pos logit contributions
ll+0.320
OV+0.293
aman+0.285
Once+0.285
bur+0.282
Neg logit contributions
 loudly-0.330
der-0.316
.-0.313
 than-0.308
ly-0.305
Tokency
Token ID948
Feature activation-2.55e+00

Pos logit contributions
Once+0.172
Sab+0.151
At+0.151
Pe+0.149
Mary+0.146
Neg logit contributions
ly-0.077
der-0.075
attered-0.074
ishy-0.069
 textures-0.057
Token smiled
Token ID13541
Feature activation+1.57e+00

Pos logit contributions
der+0.085
ly+0.080
attered+0.074
 loudly+0.071
ishy+0.070
Neg logit contributions
Once-0.077
Pe-0.070
Sab-0.068
igi-0.067
Wil-0.066
Token sometimes
Token ID3360
Feature activation-3.96e+00

Pos logit contributions
Once+0.149
ma+0.130
ll+0.118
At+0.117
n+0.117
Neg logit contributions
attered-0.120
der-0.111
ishy-0.109
 textures-0.100
ly-0.096
Token,
Token ID11
Feature activation+2.89e+00

Pos logit contributions
attered+0.106
ishy+0.100
der+0.099
 textures+0.082
ly+0.080
Neg logit contributions
Once-0.154
 -0.140
At-0.128
ma-0.122
n-0.120
Token but
Token ID475
Feature activation+1.68e+00

Pos logit contributions
Once+0.106
At+0.094
aman+0.092
Pe+0.091
Mary+0.091
Neg logit contributions
der-0.078
ly-0.072
attered-0.071
ishy-0.068
 dot-0.062
Token not
Token ID407
Feature activation+2.49e-02

Pos logit contributions
Once+0.060
At+0.052
aman+0.052
Pe+0.051
Mary+0.051
Neg logit contributions
der-0.048
ly-0.044
attered-0.043
ishy-0.042
 textures-0.038
Token always
Token ID1464
Feature activation+4.13e-01

Pos logit contributions
Once+0.001
 +0.001
ma+0.001
At+0.001
n+0.001
Neg logit contributions
attered-0.001
der-0.001
ishy-0.001
ly-0.001
liest-0.001
Token.
Token ID13
Feature activation-9.65e+00

Pos logit contributions
Once+0.014
 +0.012
At+0.012
n+0.012
ma+0.012
Neg logit contributions
attered-0.013
der-0.012
ishy-0.011
ly-0.011
 dot-0.010
Token I
Token ID314
Feature activation-4.55e+00

Pos logit contributions
der+0.256
ly+0.253
attered+0.250
ishy+0.242
 textures+0.211
Neg logit contributions
Once-0.357
 -0.314
At-0.306
Sab-0.303
Pe-0.302
Token like
Token ID588
Feature activation-4.06e+00

Pos logit contributions
der+0.150
ly+0.143
attered+0.140
ishy+0.131
 textures+0.123
Neg logit contributions
Once-0.146
At-0.125
ll-0.119
 -0.118
n-0.117
Token to
Token ID284
Feature activation+2.28e+00

Pos logit contributions
der+0.116
ly+0.109
attered+0.109
ishy+0.105
 dot+0.093
Neg logit contributions
Once-0.143
At-0.125
aman-0.121
Pe-0.120
Sab-0.120
Token try
Token ID1949
Feature activation-4.75e+00

Pos logit contributions
Once+0.070
At+0.059
 +0.057
Pe+0.057
aman+0.056
Neg logit contributions
der-0.078
attered-0.075
ly-0.073
ishy-0.071
 textures-0.063
Token new
Token ID649
Feature activation-4.73e+00

Pos logit contributions
attered+0.137
der+0.136
ly+0.131
ishy+0.131
 dot+0.113
Neg logit contributions
Once-0.165
At-0.143
 -0.142
Sab-0.138
Mary-0.136
Token You
Token ID921
Feature activation-9.68e-01

Pos logit contributions
der+0.275
ly+0.259
attered+0.254
ishy+0.245
 textures+0.219
Neg logit contributions
Once-0.365
At-0.315
 -0.311
Pe-0.308
aman-0.307
Token are
Token ID389
Feature activation+3.82e+00

Pos logit contributions
der+0.031
ly+0.030
attered+0.029
ishy+0.029
 textures+0.026
Neg logit contributions
Once-0.030
aman-0.026
At-0.026
Mary-0.026
Pe-0.026
Token a
Token ID257
Feature activation-5.54e+00

Pos logit contributions
Once+0.145
At+0.124
 +0.122
Pe+0.121
Mary+0.120
Neg logit contributions
der-0.106
ly-0.099
attered-0.098
ishy-0.093
 dot-0.084
Token good
Token ID922
Feature activation-5.23e+00

Pos logit contributions
der+0.207
attered+0.206
ly+0.194
ishy+0.178
 textures+0.172
Neg logit contributions
Once-0.177
Pe-0.136
Fore-0.133
 ha-0.132
At-0.131
Token sister
Token ID6621
Feature activation-4.63e+00

Pos logit contributions
attered+0.188
der+0.178
ishy+0.164
ly+0.164
 sizes+0.162
Neg logit contributions
Once-0.151
 ha-0.129
ma-0.124
Pe-0.123
aman-0.118
Token.
Token ID13
Feature activation-9.73e+00

Pos logit contributions
attered+0.112
ishy+0.107
der+0.101
 textures+0.090
 dot+0.085
Neg logit contributions
Once-0.194
ma-0.179
 -0.167
Alice-0.159
ll-0.158
Token I
Token ID314
Feature activation-4.15e+00

Pos logit contributions
der+0.305
ly+0.295
attered+0.292
ishy+0.280
 loudly+0.256
Neg logit contributions
Once-0.319
 -0.274
At-0.266
Pe-0.265
Mary-0.259
Token love
Token ID1842
Feature activation-5.22e-01

Pos logit contributions
der+0.143
ly+0.135
attered+0.134
ishy+0.125
 dot+0.119
Neg logit contributions
Once-0.126
At-0.107
ll-0.103
 -0.103
n-0.101
Token you
Token ID345
Feature activation-5.02e-03

Pos logit contributions
der+0.019
attered+0.019
ly+0.019
ishy+0.018
 dot+0.017
Neg logit contributions
Once-0.014
At-0.012
aman-0.011
Pe-0.011
Mary-0.011
Token."
Token ID526
Feature activation-2.22e+00

Pos logit contributions
attered+0.000
der+0.000
ishy+0.000
 dot+0.000
ly+0.000
Neg logit contributions
Once-0.000
 -0.000
ma-0.000
ll-0.000
At-0.000
Token Ben
Token ID3932
Feature activation+3.85e+00

Pos logit contributions
der+0.071
ly+0.068
.+0.064
attered+0.063
ishy+0.062
Neg logit contributions
Once-0.068
aman-0.061
At-0.060
Pe-0.059
Mary-0.058
Token and
Token ID290
Feature activation-1.78e-01

Pos logit contributions
der+0.027
attered+0.025
ly+0.024
ishy+0.024
 dot+0.020
Neg logit contributions
Once-0.047
At-0.041
 -0.041
Sab-0.040
aman-0.040
Token gave
Token ID2921
Feature activation+1.88e+00

Pos logit contributions
der+0.006
ly+0.005
attered+0.005
ishy+0.005
 loudly+0.005
Neg logit contributions
Once-0.006
aman-0.005
Pe-0.005
At-0.005
Sab-0.005
Token her
Token ID607
Feature activation-1.21e+00

Pos logit contributions
Once+0.048
At+0.040
aman+0.039
Pe+0.038
Mary+0.037
Neg logit contributions
der-0.071
ly-0.069
attered-0.068
ishy-0.066
 textures-0.062
Token the
Token ID262
Feature activation-3.83e-01

Pos logit contributions
der+0.039
attered+0.037
ly+0.036
ishy+0.035
working+0.031
Neg logit contributions
Once-0.040
 -0.035
ma-0.034
At-0.034
Sab-0.033
Token ball
Token ID2613
Feature activation-1.56e+00

Pos logit contributions
der+0.011
ly+0.010
attered+0.010
ishy+0.010
 loudly+0.009
Neg logit contributions
Once-0.013
At-0.012
aman-0.011
 -0.011
Pe-0.011
Token.
Token ID13
Feature activation-9.73e+00

Pos logit contributions
der+0.040
ishy+0.035
attered+0.035
ening+0.031
working+0.031
Neg logit contributions
Once-0.059
 -0.055
ma-0.054
At-0.051
ossible-0.050
Token Anna
Token ID11735
Feature activation+1.97e+00

Pos logit contributions
der+0.269
.+0.230
ly+0.225
attered+0.219
 loudly+0.217
Neg logit contributions
Once-0.332
aman-0.327
ll-0.320
ma-0.310
str-0.305
Token threw
Token ID9617
Feature activation-1.01e-01

Pos logit contributions
Once+0.069
aman+0.063
Pe+0.061
Mary+0.060
Sab+0.060
Neg logit contributions
der-0.056
ly-0.052
attered-0.049
ishy-0.047
.-0.045
Token the
Token ID262
Feature activation-4.92e-01

Pos logit contributions
der+0.004
attered+0.003
ly+0.003
ishy+0.003
 textures+0.003
Neg logit contributions
Once-0.003
At-0.003
 -0.003
Pe-0.002
Mary-0.002
Token ball
Token ID2613
Feature activation-1.66e+00

Pos logit contributions
der+0.013
attered+0.012
ly+0.011
ishy+0.011
working+0.010
Neg logit contributions
Once-0.019
At-0.017
 -0.016
ma-0.016
Alice-0.016
Token to
Token ID284
Feature activation+1.94e+00

Pos logit contributions
der+0.052
attered+0.051
ishy+0.051
ly+0.047
 textures+0.046
Neg logit contributions
Once-0.053
At-0.046
 -0.045
ma-0.039
ll-0.039