TOP ACTIVATIONS
MAX = 10.148

upon a time there were two friends , Joe and Molly
upon a time there were two best friends . They were
upon a time there were two friends , Sam and Jack
a time , there were two friends . One was wise
! <|endoftext|> Once there were two sisters . One was named
end . <|endoftext|> One day two twins went for a walk
<|endoftext|> One day there were two friends , Joe and Adam
great . <|endoftext|> There were two best friends . One day
adventure ! <|endoftext|> One day two friends went for a walk
adventure . <|endoftext|> One day two friends , Jim and Sam
<|endoftext|> Sam and Sarah were two children from the same neighbourhood
a time , there were two friends . Their names were
upon a time there were two friends , Jake and Sam
upon a time there were two friends ; a young boy
day , Be a noticed two squirrel s running around outside
a time , there were two friends , Jack and Jill
. <|endoftext|> One day , two friends went to the river
upon a time there were two teams . The two teams
a time , there were two best friends , Car a
! <|endoftext|> One day , two friends wanted to play .

MIN ACTIVATIONS
MIN = -11.410

not eat too many ch ocol ates and his friend learned
was filled with delicious ch ocol ates . He was happy
colorful cand ies and ch ocol ates . She said to
find a box of ch ocol ates . "
't have all these ch ocol ates ! Now , go
she paid for the ch ocol ates and the two of
giving him too many ch ocol ates ." The king learned
box and share the ch ocol ates . They are happy
and her favourite , ch ocol ates ! She asked mummy
It was sweet , ch ocol at y and warm ,
, cup cakes and ch ocol ates .
son is eating our ch ocol ates !"
and they shared the ch ocol ates with everyone in the
a big box of ch ocol ates !
was a box of ch ocol ates ! Alice and her
a big box of ch ocol ates ! <|endoftext|> Once upon
looking at all the ch ocol ates , Lex i chose
park . When the ch ocol ates were all gone ,
was a box of ch ocol ates . " For me
started eating the delicious ch ocol ates . She was so

INTERVAL 4.759 - 10.148
CONTAINS 1.889%

Once upon a time , there was a tall tree .
He looked around , but all he could see were shadows
play another game . They agreed and went back to their
were making cookies , Lily 's mom my asked her to
envelope and says " It 's a card . I made

INTERVAL -0.631 - 4.759
CONTAINS 67.850%

When it was done , Polly was very excited .
eat food , especially candy and cookies . She would eat
It is so colorful . It can sing ." Mil a
Mitt ens with a big smile on her face .
and waved goodbye . The squash was so happy that someone

INTERVAL -6.020 - -0.631
CONTAINS 29.865%

so excited to see the bug 's tiny legs and eyes
everywhere , never happier than when he made jokes . <|endoftext|>
you can . You can play with your doll and your
he always asked his mom my for some . One day
day , Jacob wrote a letter to Jenny . He mailed

INTERVAL -11.410 - -6.020
CONTAINS 0.397%

scared and didn 't know what to do . But then
, Lucy 's tum my ac he was gone , and
O w , ow , ow ! It hurts !"
bully and said : " Why don 't you respect the
ring ? That feeling is worth more than any ancient treasure
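
NOTE ON INTERVALS
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. This is an inference from the printed numbers, not something the dashboard states. The sketch below recomputes the bin edges under that assumption and checks that the CONTAINS shares add up to roughly 100%. The names `interval_edges`, `MIN_ACT`, `MAX_ACT`, and `N_BINS` are illustrative and are not taken from the tool that produced this page.

```python
# Values copied from the MAX / MIN lines of the dashboard.
MIN_ACT = -11.410
MAX_ACT = 10.148
N_BINS = 4  # assumption: four equal-width intervals, as the listing suggests


def interval_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = interval_edges(MIN_ACT, MAX_ACT, N_BINS)

# Shares copied from the CONTAINS lines, ordered from highest to lowest interval.
shares = [1.889, 67.850, 29.865, 0.397]
total_share = sum(shares)

for lo, hi in zip(edges[:-1], edges[1:]):
    print(f"INTERVAL {lo:.3f} - {hi:.3f}")
print(f"total share: {total_share:.3f}%")
```

The recomputed edges agree with the printed boundaries (-6.020, -0.631, 4.759) to within rounding, which supports the equal-width reading but does not prove it.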
Token upon
Token ID2402
Feature activation+1.61e+00

Pos logit contributions
 ones+0.054
's+0.047
 color+0.047
oes+0.046
packs+0.046
Neg logit contributions
 relief-0.067
 Doctor-0.067
 stret-0.067
 ow-0.066
 awful-0.066
Token a
Token ID257
Feature activation+7.19e-01

Pos logit contributions
 ones+0.050
's+0.045
oes+0.044
 color+0.044
packs+0.043
Neg logit contributions
 relief-0.054
 Doctor-0.054
 stret-0.054
ores-0.054
 ow-0.053
Token time
Token ID640
Feature activation+3.75e+00

Pos logit contributions
 ones+0.015
's+0.013
oes+0.013
 color+0.013
packs+0.012
Neg logit contributions
 relief-0.032
 Doctor-0.031
ores-0.031
 ow-0.031
 stret-0.031
Token there
Token ID612
Feature activation+6.54e+00

Pos logit contributions
 ones+0.093
's+0.081
oes+0.078
 color+0.078
packs+0.077
Neg logit contributions
 relief-0.153
 Doctor-0.151
 stret-0.151
 awful-0.151
ores-0.150
Token were
Token ID547
Feature activation+1.84e+00

Pos logit contributions
 ones+0.137
's+0.125
 color+0.114
packs+0.109
 one+0.109
Neg logit contributions
 stret-0.297
 relief-0.297
 Doctor-0.292
 awful-0.292
paste-0.291
Token two
Token ID734
Feature activation+9.57e+00

Pos logit contributions
oes+0.054
 ones+0.050
packs+0.050
 color+0.044
 pads+0.043
Neg logit contributions
 ow-0.064
 awful-0.064
 Doctor-0.063
 horrible-0.062
 sorry-0.060
Token friends
Token ID2460
Feature activation+4.87e+00

Pos logit contributions
 ones+0.243
's+0.221
 color+0.199
 one+0.198
packs+0.186
Neg logit contributions
ores-0.405
 relief-0.401
 stret-0.399
paste-0.398
 Doctor-0.387
Token,
Token ID11
Feature activation+7.67e-01

Pos logit contributions
 ones+0.138
's+0.127
oes+0.120
 color+0.119
packs+0.119
Neg logit contributions
 relief-0.183
 Doctor-0.179
 stret-0.179
ores-0.178
 ow-0.178
Token Joe
Token ID5689
Feature activation+1.67e+00

Pos logit contributions
 ones+0.026
's+0.024
oes+0.023
 color+0.023
packs+0.023
Neg logit contributions
 relief-0.024
 stret-0.024
 Doctor-0.023
 ow-0.023
ores-0.023
Token and
Token ID290
Feature activation+1.05e+00

Pos logit contributions
 ones+0.044
oes+0.039
's+0.038
 color+0.038
packs+0.037
Neg logit contributions
 Doctor-0.064
 relief-0.064
ores-0.063
 stret-0.063
 ow-0.063
Token Molly
Token ID30236
Feature activation+4.16e-01

Pos logit contributions
 ones+0.035
's+0.032
 color+0.031
 one+0.031
oes+0.030
Neg logit contributions
 relief-0.035
 stret-0.034
 awful-0.034
ores-0.034
paste-0.034
Token upon
Token ID2402
Feature activation+1.61e+00

Pos logit contributions
 ones+0.028
's+0.026
oes+0.025
 color+0.025
packs+0.025
Neg logit contributions
 relief-0.023
 Doctor-0.023
ores-0.023
 stret-0.023
 ow-0.023
Token a
Token ID257
Feature activation+7.19e-01

Pos logit contributions
 ones+0.050
's+0.045
oes+0.044
 color+0.044
packs+0.043
Neg logit contributions
 relief-0.054
 Doctor-0.054
 stret-0.054
ores-0.054
 ow-0.053
Token time
Token ID640
Feature activation+3.75e+00

Pos logit contributions
 ones+0.015
's+0.013
oes+0.013
 color+0.013
packs+0.012
Neg logit contributions
 relief-0.032
 Doctor-0.031
ores-0.031
 ow-0.031
 stret-0.031
Token there
Token ID612
Feature activation+6.54e+00

Pos logit contributions
 ones+0.093
's+0.081
oes+0.078
 color+0.078
packs+0.077
Neg logit contributions
 relief-0.153
 Doctor-0.151
 stret-0.151
 awful-0.151
ores-0.150
Token were
Token ID547
Feature activation+1.84e+00

Pos logit contributions
 ones+0.137
's+0.125
 color+0.114
packs+0.109
 one+0.109
Neg logit contributions
 stret-0.297
 relief-0.297
 Doctor-0.292
 awful-0.292
paste-0.291
Token two
Token ID734
Feature activation+9.57e+00

Pos logit contributions
oes+0.054
 ones+0.050
packs+0.050
 color+0.044
 pads+0.043
Neg logit contributions
 ow-0.064
 awful-0.064
 Doctor-0.063
 horrible-0.062
 sorry-0.060
Token best
Token ID1266
Feature activation+5.06e+00

Pos logit contributions
 ones+0.243
's+0.221
 color+0.199
 one+0.198
packs+0.186
Neg logit contributions
ores-0.405
 relief-0.401
 stret-0.399
paste-0.398
 Doctor-0.387
Token friends
Token ID2460
Feature activation+4.81e+00

Pos logit contributions
 ones+0.125
's+0.107
oes+0.105
 color+0.104
packs+0.104
Neg logit contributions
 relief-0.208
 Doctor-0.206
 stret-0.206
ores-0.206
 awful-0.205
Token.
Token ID13
Feature activation+4.08e-01

Pos logit contributions
 ones+0.137
's+0.124
oes+0.120
 color+0.118
packs+0.118
Neg logit contributions
 relief-0.177
 Doctor-0.175
 stret-0.174
ores-0.174
 ow-0.173
Token They
Token ID1119
Feature activation+2.39e+00

Pos logit contributions
 ones+0.008
's+0.007
oes+0.007
 color+0.007
packs+0.007
Neg logit contributions
 Doctor-0.018
ores-0.018
 relief-0.018
 ow-0.017
 stret-0.017
Token were
Token ID547
Feature activation+1.19e+00

Pos logit contributions
 ones+0.049
's+0.043
 color+0.041
oes+0.040
packs+0.039
Neg logit contributions
 relief-0.107
 Doctor-0.107
 stret-0.106
ores-0.106
paste-0.105
Token upon
Token ID2402
Feature activation+1.61e+00

Pos logit contributions
 Doctor+0.045
ores+0.044
 relief+0.043
 stret+0.042
 awful+0.042
Neg logit contributions
 ones-0.135
's-0.125
 color-0.124
oes-0.124
packs-0.123
Token a
Token ID257
Feature activation+7.19e-01

Pos logit contributions
 ones+0.050
's+0.045
oes+0.044
 color+0.044
packs+0.043
Neg logit contributions
 relief-0.054
 Doctor-0.054
 stret-0.054
ores-0.054
 ow-0.053
Token time
Token ID640
Feature activation+3.75e+00

Pos logit contributions
 ones+0.015
's+0.013
oes+0.013
 color+0.013
packs+0.012
Neg logit contributions
 relief-0.032
 Doctor-0.031
ores-0.031
 ow-0.031
 stret-0.031
Token there
Token ID612
Feature activation+6.54e+00

Pos logit contributions
 ones+0.093
's+0.081
oes+0.078
 color+0.078
packs+0.077
Neg logit contributions
 relief-0.153
 Doctor-0.151
 stret-0.151
 awful-0.151
ores-0.150
Token were
Token ID547
Feature activation+1.84e+00

Pos logit contributions
 ones+0.137
's+0.125
 color+0.114
packs+0.109
 one+0.109
Neg logit contributions
 stret-0.297
 relief-0.297
 Doctor-0.292
 awful-0.292
paste-0.291
Token two
Token ID734
Feature activation+9.57e+00

Pos logit contributions
oes+0.054
 ones+0.050
packs+0.050
 color+0.044
 pads+0.043
Neg logit contributions
 ow-0.064
 awful-0.064
 Doctor-0.063
 horrible-0.062
 sorry-0.060
Token friends
Token ID2460
Feature activation+4.87e+00

Pos logit contributions
 ones+0.243
's+0.221
 color+0.199
 one+0.198
packs+0.186
Neg logit contributions
ores-0.405
 relief-0.401
 stret-0.399
paste-0.398
 Doctor-0.387
Token,
Token ID11
Feature activation+7.67e-01

Pos logit contributions
 ones+0.138
's+0.127
oes+0.120
 color+0.119
packs+0.119
Neg logit contributions
 relief-0.183
 Doctor-0.179
 stret-0.179
ores-0.178
 ow-0.178
Token Sam
Token ID3409
Feature activation+9.66e-01

Pos logit contributions
 ones+0.026
's+0.024
oes+0.023
 color+0.023
packs+0.023
Neg logit contributions
 relief-0.024
 stret-0.024
 Doctor-0.023
 ow-0.023
ores-0.023
Token and
Token ID290
Feature activation+2.67e-01

Pos logit contributions
 ones+0.023
's+0.021
oes+0.020
 color+0.020
packs+0.019
Neg logit contributions
 relief-0.040
 Doctor-0.039
ores-0.039
 ow-0.039
 stret-0.039
Token Jack
Token ID3619
Feature activation+2.01e+00

Pos logit contributions
 ones+0.009
's+0.008
 one+0.008
 color+0.008
oes+0.008
Neg logit contributions
 relief-0.008
 stret-0.008
ores-0.008
 awful-0.008
paste-0.008
Token a
Token ID257
Feature activation+2.69e-02

Pos logit contributions
 relief+0.013
 Doctor+0.013
 stret+0.012
ores+0.012
 ow+0.012
Neg logit contributions
 ones-0.012
's-0.011
oes-0.010
 color-0.010
packs-0.010
Token time
Token ID640
Feature activation+3.51e+00

Pos logit contributions
 ones+0.001
oes+0.001
packs+0.001
's+0.001
 color+0.001
Neg logit contributions
 relief-0.001
 ow-0.001
ores-0.001
 stret-0.001
 awful-0.001
Token,
Token ID11
Feature activation+1.12e+00

Pos logit contributions
 ones+0.104
oes+0.094
 color+0.090
packs+0.089
's+0.089
Neg logit contributions
 ow-0.123
 sorry-0.122
 Doctor-0.121
ores-0.119
 relief-0.119
Token there
Token ID612
Feature activation+6.74e+00

Pos logit contributions
 ones+0.031
's+0.027
oes+0.027
 color+0.026
packs+0.026
Neg logit contributions
 relief-0.042
 Doctor-0.042
ores-0.041
 ow-0.041
 stret-0.041
Token were
Token ID547
Feature activation+1.37e+00

Pos logit contributions
 ones+0.149
's+0.132
 color+0.125
oes+0.123
packs+0.120
Neg logit contributions
 relief-0.291
 stret-0.290
 Doctor-0.289
ores-0.288
paste-0.286
Token two
Token ID734
Feature activation+9.52e+00

Pos logit contributions
oes+0.045
 ones+0.041
packs+0.040
 pads+0.036
ci+0.036
Neg logit contributions
 ow-0.047
 awful-0.045
 Doctor-0.044
 sorry-0.043
ores-0.043
Token friends
Token ID2460
Feature activation+5.06e+00

Pos logit contributions
 ones+0.261
's+0.232
oes+0.228
 color+0.224
packs+0.222
Neg logit contributions
 relief-0.359
 Doctor-0.356
ores-0.354
 stret-0.352
 ow-0.351
Token.
Token ID13
Feature activation+1.20e+00

Pos logit contributions
 ones+0.179
oes+0.165
 color+0.157
packs+0.156
 pads+0.154
Neg logit contributions
 Doctor-0.150
 sorry-0.148
 ow-0.145
ores-0.145
 stret-0.144
Token One
Token ID1881
Feature activation+3.86e+00

Pos logit contributions
 ones+0.033
's+0.029
oes+0.029
 way+0.029
 pads+0.028
Neg logit contributions
 Doctor-0.042
 Dr-0.040
ores-0.040
 ow-0.040
lic-0.036
Token was
Token ID373
Feature activation+1.36e+00

Pos logit contributions
 ones+0.089
packs+0.074
oes+0.070
First+0.066
 pads+0.064
Neg logit contributions
ores-0.165
 relief-0.163
 Doctor-0.163
 ow-0.162
 awful-0.162
Token wise
Token ID10787
Feature activation+2.65e+00

Pos logit contributions
 ones+0.047
oes+0.043
's+0.042
packs+0.041
 color+0.041
Neg logit contributions
 Doctor-0.042
ores-0.041
 ow-0.041
 sorry-0.041
 relief-0.041
Token!
Token ID0
Feature activation+9.94e-01

Pos logit contributions
 stret+0.068
ache+0.064
coon+0.062
paste+0.059
 relief+0.059
Neg logit contributions
 one-0.038
 ones-0.038
 color-0.032
's-0.030
 wandered-0.029
Token<|endoftext|>
Token ID50256
Feature activation-4.32e+00

Pos logit contributions
 ones+0.024
's+0.023
oes+0.021
 color+0.020
packs+0.020
Neg logit contributions
 relief-0.041
 ow-0.040
 awful-0.040
 stret-0.040
 poisonous-0.040
TokenOnce
Token ID7454
Feature activation+1.15e-01

Pos logit contributions
 relief+0.172
 awful+0.172
 stret+0.171
 Doctor+0.168
ores+0.168
Neg logit contributions
 ones-0.111
's-0.103
oes-0.096
packs-0.095
 color-0.094
Token there
Token ID612
Feature activation+6.41e+00

Pos logit contributions
 ones+0.003
's+0.002
oes+0.002
 color+0.002
packs+0.002
Neg logit contributions
 relief-0.005
 Doctor-0.005
 stret-0.005
ores-0.005
 ow-0.005
Token were
Token ID547
Feature activation+1.60e+00

Pos logit contributions
 ones+0.167
oes+0.148
packs+0.142
's+0.139
 color+0.139
Neg logit contributions
 Doctor-0.245
 ow-0.244
 relief-0.242
ores-0.241
 awful-0.237
Token two
Token ID734
Feature activation+9.48e+00

Pos logit contributions
oes+0.055
 ones+0.053
packs+0.047
 pads+0.045
 color+0.045
Neg logit contributions
 ow-0.055
 awful-0.051
 sorry-0.051
 Doctor-0.049
ores-0.049
Token sisters
Token ID15153
Feature activation+4.24e+00

Pos logit contributions
 ones+0.261
oes+0.233
's+0.230
 color+0.224
packs+0.223
Neg logit contributions
 relief-0.352
 Doctor-0.352
 ow-0.351
 awful-0.348
 stret-0.347
Token.
Token ID13
Feature activation+8.25e-01

Pos logit contributions
 ones+0.150
oes+0.141
 color+0.132
packs+0.131
's+0.129
Neg logit contributions
 ow-0.120
 Doctor-0.120
 stret-0.119
 sorry-0.117
 relief-0.117
Token One
Token ID1881
Feature activation+3.61e+00

Pos logit contributions
 ones+0.027
oes+0.025
's+0.024
 traveled+0.023
 pads+0.023
Neg logit contributions
 ow-0.026
 Doctor-0.026
ores-0.025
 relief-0.025
 sorry-0.025
Token was
Token ID373
Feature activation+1.35e+00

Pos logit contributions
 ones+0.072
oes+0.058
packs+0.058
's+0.055
 color+0.052
Neg logit contributions
ores-0.165
 Doctor-0.163
 relief-0.163
 ow-0.161
 awful-0.161
Token named
Token ID3706
Feature activation+9.52e-04

Pos logit contributions
 ones+0.041
's+0.037
 color+0.036
 one+0.036
oes+0.035
Neg logit contributions
 relief-0.049
 stret-0.047
coon-0.047
 awful-0.047
paste-0.046
Token end
Token ID886
Feature activation-3.04e+00

Pos logit contributions
 ones+0.016
's+0.015
 one+0.014
 color+0.014
 way+0.013
Neg logit contributions
ores-0.023
 stret-0.023
paste-0.022
 Doctor-0.022
lic-0.022
Token.
Token ID13
Feature activation-3.55e-02

Pos logit contributions
 stret+0.155
paste+0.152
ocol+0.151
 sorry+0.151
coon+0.151
Neg logit contributions
's-0.049
 ones-0.039
 one-0.038
 color-0.031
 One-0.029
Token<|endoftext|>
Token ID50256
Feature activation-4.63e+00

Pos logit contributions
 stret+0.002
 ow+0.002
 awful+0.002
 relief+0.001
paste+0.001
Neg logit contributions
 ones-0.001
's-0.001
oes-0.001
 color-0.001
packs-0.001
TokenOne
Token ID3198
Feature activation+4.44e+00

Pos logit contributions
 relief+0.163
 Doctor+0.161
 stret+0.161
 awful+0.160
 ow+0.160
Neg logit contributions
 ones-0.138
's-0.125
oes-0.122
 color-0.120
packs-0.119
Token day
Token ID1110
Feature activation-1.17e-01

Pos logit contributions
 ones+0.058
's+0.043
oes+0.042
 color+0.039
packs+0.039
Neg logit contributions
 relief-0.232
 Doctor-0.231
ores-0.230
 ow-0.229
 awful-0.229
Token two
Token ID734
Feature activation+9.47e+00

Pos logit contributions
 ow+0.004
 sorry+0.004
 Doctor+0.004
ores+0.004
 relief+0.004
Neg logit contributions
 ones-0.003
oes-0.003
packs-0.003
 color-0.003
's-0.003
Token twins
Token ID20345
Feature activation+2.31e+00

Pos logit contributions
 ones+0.288
oes+0.263
's+0.253
packs+0.253
 color+0.252
Neg logit contributions
 ow-0.323
 Doctor-0.322
 relief-0.321
 awful-0.320
 stret-0.315
Token went
Token ID1816
Feature activation-7.57e-01

Pos logit contributions
 ones+0.058
oes+0.054
 color+0.049
packs+0.048
First+0.044
Neg logit contributions
 ow-0.088
 Doctor-0.087
 sorry-0.086
 relief-0.086
ores-0.085
Token for
Token ID329
Feature activation-5.60e+00

Pos logit contributions
 stret+0.025
ores+0.025
 Doctor+0.025
 sorry+0.025
 relief+0.024
Neg logit contributions
 ones-0.024
 color-0.021
oes-0.021
packs-0.020
's-0.020
Token a
Token ID257
Feature activation-3.88e-02

Pos logit contributions
 relief+0.195
paste+0.195
 Doctor+0.194
 awful+0.193
ores+0.193
Neg logit contributions
 ones-0.170
's-0.153
 color-0.149
packs-0.146
 one-0.146
Token walk
Token ID2513
Feature activation+1.07e+00

Pos logit contributions
 stret+0.002
 ow+0.002
 horrible+0.001
 Doctor+0.001
 poisonous+0.001
Neg logit contributions
 ones-0.001
packs-0.001
oes-0.001
 color-0.001
 pads-0.001
Token<|endoftext|>
Token ID50256
Feature activation-4.50e+00

Pos logit contributions
 stret+0.029
 ow+0.029
 awful+0.029
 relief+0.029
paste+0.029
Neg logit contributions
 ones-0.016
's-0.014
oes-0.013
 color-0.013
packs-0.013
TokenOne
Token ID3198
Feature activation+4.54e+00

Pos logit contributions
 relief+0.166
 stret+0.164
 awful+0.164
 Doctor+0.163
 ow+0.162
Neg logit contributions
 ones-0.129
's-0.117
oes-0.112
 color-0.111
packs-0.110
Token day
Token ID1110
Feature activation-1.22e-01

Pos logit contributions
 ones+0.055
's+0.043
oes+0.040
 color+0.039
packs+0.036
Neg logit contributions
 relief-0.241
 stret-0.241
 Doctor-0.240
paste-0.238
 ow-0.238
Token there
Token ID612
Feature activation+6.75e+00

Pos logit contributions
 sorry+0.004
 ow+0.004
 Doctor+0.004
 relief+0.004
ores+0.004
Neg logit contributions
 ones-0.004
oes-0.003
packs-0.003
 color-0.003
's-0.003
Token were
Token ID547
Feature activation+1.37e+00

Pos logit contributions
 ones+0.180
oes+0.162
packs+0.152
 color+0.147
's+0.144
Neg logit contributions
 relief-0.259
 ow-0.257
 Doctor-0.255
ores-0.255
 awful-0.254
Token two
Token ID734
Feature activation+9.47e+00

Pos logit contributions
oes+0.042
 ones+0.038
packs+0.036
 pads+0.033
 wander+0.033
Neg logit contributions
 awful-0.048
 sorry-0.048
 horrible-0.046
 Doctor-0.046
 ow-0.045
Token friends
Token ID2460
Feature activation+4.94e+00

Pos logit contributions
 ones+0.268
's+0.244
 color+0.230
oes+0.227
packs+0.226
Neg logit contributions
 relief-0.354
ores-0.352
 Doctor-0.348
 stret-0.348
paste-0.347
Token,
Token ID11
Feature activation+1.17e+00

Pos logit contributions
 ones+0.154
oes+0.138
 color+0.134
's+0.133
packs+0.132
Neg logit contributions
 Doctor-0.165
ores-0.163
 relief-0.163
 stret-0.162
 ow-0.162
Token Joe
Token ID5689
Feature activation+1.59e+00

Pos logit contributions
 ones+0.040
oes+0.036
's+0.036
 color+0.035
packs+0.035
Neg logit contributions
 Doctor-0.036
 relief-0.036
ores-0.035
 ow-0.035
 stret-0.035
Token and
Token ID290
Feature activation+1.17e+00

Pos logit contributions
 ones+0.044
oes+0.039
 color+0.038
packs+0.038
's+0.038
Neg logit contributions
 Doctor-0.058
 stret-0.057
 relief-0.057
ores-0.057
 ow-0.057
Token Adam
Token ID7244
Feature activation+1.44e+00

Pos logit contributions
 ones+0.042
oes+0.040
packs+0.038
's+0.038
 color+0.038
Neg logit contributions
 Doctor-0.033
 relief-0.032
 ow-0.032
 sorry-0.031
ores-0.031
Token great
Token ID1049
Feature activation-7.47e-01

Pos logit contributions
coon+0.136
 stret+0.135
 Doctor+0.127
paste+0.125
 explosion+0.124
Neg logit contributions
's-0.126
 ones-0.122
 color-0.118
 one-0.117
 way-0.107
Token.
Token ID13
Feature activation+1.74e-01

Pos logit contributions
 Doctor+0.028
 relief+0.028
 stret+0.028
ores+0.028
 ow+0.028
Neg logit contributions
 ones-0.021
's-0.019
 color-0.018
oes-0.018
packs-0.018
Token<|endoftext|>
Token ID50256
Feature activation-4.54e+00

Pos logit contributions
 ones+0.005
oes+0.004
 color+0.004
packs+0.004
's+0.004
Neg logit contributions
 relief-0.007
 Doctor-0.007
 ow-0.006
ores-0.006
 sorry-0.006
TokenThere
Token ID1858
Feature activation+3.49e+00

Pos logit contributions
 relief+0.169
 Doctor+0.168
ores+0.167
 ow+0.166
 stret+0.165
Neg logit contributions
 ones-0.124
oes-0.110
's-0.109
 color-0.107
packs-0.106
Token were
Token ID547
Feature activation+1.31e+00

Pos logit contributions
 ones+0.085
oes+0.083
packs+0.076
 color+0.073
's+0.072
Neg logit contributions
 ow-0.137
 Doctor-0.134
 relief-0.134
 awful-0.131
 stret-0.130
Token two
Token ID734
Feature activation+9.46e+00

Pos logit contributions
oes+0.042
 ones+0.040
packs+0.034
 necks+0.034
esty+0.034
Neg logit contributions
 ow-0.041
 awful-0.040
 sorry-0.039
 Doctor-0.039
 horrible-0.038
Token best
Token ID1266
Feature activation+4.87e+00

Pos logit contributions
 ones+0.257
's+0.234
 color+0.220
packs+0.215
oes+0.214
Neg logit contributions
 relief-0.366
ores-0.365
 Doctor-0.360
 stret-0.360
paste-0.360
Token friends
Token ID2460
Feature activation+5.02e+00

Pos logit contributions
 ones+0.121
's+0.105
packs+0.102
 color+0.101
oes+0.101
Neg logit contributions
 relief-0.199
ores-0.197
 Doctor-0.197
 awful-0.196
 stret-0.196
Token.
Token ID13
Feature activation+1.43e+00

Pos logit contributions
 ones+0.155
oes+0.141
 color+0.136
's+0.131
First+0.130
Neg logit contributions
ores-0.166
 Doctor-0.166
 relief-0.164
 ow-0.163
 sorry-0.163
Token One
Token ID1881
Feature activation+3.87e+00

Pos logit contributions
 ones+0.036
oes+0.032
 way+0.029
's+0.029
 traveled+0.028
Neg logit contributions
 Doctor-0.055
ores-0.055
 Dr-0.053
 relief-0.051
 sorry-0.050
Token day
Token ID1110
Feature activation-1.99e-01

Pos logit contributions
 ones+0.097
oes+0.081
packs+0.081
's+0.079
 color+0.076
Neg logit contributions
ores-0.154
 Doctor-0.153
 relief-0.153
 awful-0.151
 ow-0.151
Token adventure
Token ID8855
Feature activation+6.58e-01

Pos logit contributions
 relief+0.027
 stret+0.026
 ow+0.026
ores+0.026
 Doctor+0.026
Neg logit contributions
 ones-0.018
oes-0.016
's-0.016
packs-0.015
 color-0.015
Token!
Token ID0
Feature activation-6.47e-01

Pos logit contributions
 ones+0.019
's+0.017
oes+0.017
packs+0.016
 color+0.016
Neg logit contributions
 Doctor-0.024
 relief-0.024
 ow-0.024
 stret-0.024
ores-0.024
Token<|endoftext|>
Token ID50256
Feature activation-4.53e+00

Pos logit contributions
ores+0.022
 Doctor+0.022
 relief+0.022
 ow+0.021
 sorry+0.020
Neg logit contributions
 ones-0.021
oes-0.018
 color-0.018
's-0.018
packs-0.017
TokenOne
Token ID3198
Feature activation+4.41e+00

Pos logit contributions
 Doctor+0.164
 relief+0.164
ores+0.163
 ow+0.161
 stret+0.159
Neg logit contributions
 ones-0.127
oes-0.114
's-0.111
 color-0.110
packs-0.109
Token day
Token ID1110
Feature activation-1.51e-01

Pos logit contributions
 ones+0.062
's+0.047
oes+0.047
packs+0.044
 color+0.043
Neg logit contributions
 relief-0.226
 Doctor-0.225
ores-0.224
 ow-0.223
 awful-0.223
Token two
Token ID734
Feature activation+9.45e+00

Pos logit contributions
 sorry+0.005
 ow+0.005
 Doctor+0.005
 relief+0.005
ores+0.005
Neg logit contributions
 ones-0.005
oes-0.005
packs-0.004
 color-0.004
 pads-0.004
Token friends
Token ID2460
Feature activation+5.00e+00

Pos logit contributions
 ones+0.297
oes+0.280
packs+0.264
 color+0.261
's+0.257
Neg logit contributions
 ow-0.313
 awful-0.306
 Doctor-0.306
 relief-0.304
 sorry-0.297
Token went
Token ID1816
Feature activation-7.35e-01

Pos logit contributions
 ones+0.124
oes+0.114
 color+0.099
packs+0.096
 pads+0.091
Neg logit contributions
 ow-0.192
 Doctor-0.189
ores-0.187
 relief-0.186
coon-0.186
Token for
Token ID329
Feature activation-5.57e+00

Pos logit contributions
 sorry+0.020
ores+0.020
 stret+0.020
 Doctor+0.019
 poisonous+0.017
Neg logit contributions
 ones-0.027
 color-0.025
oes-0.025
First-0.023
packs-0.023
Token a
Token ID257
Feature activation-6.61e-02

Pos logit contributions
paste+0.197
 Doctor+0.196
 relief+0.196
 awful+0.195
ores+0.195
Neg logit contributions
 ones-0.166
's-0.148
 color-0.146
 one-0.143
packs-0.142
Token walk
Token ID2513
Feature activation+1.13e+00

Pos logit contributions
 stret+0.003
 ow+0.003
 horrible+0.002
 Doctor+0.002
 awful+0.002
Neg logit contributions
 ones-0.002
packs-0.002
oes-0.002
 color-0.001
ipt-0.001
Token adventure
Token ID8855
Feature activation+5.91e-01

Pos logit contributions
oes+0.059
 ones+0.058
First+0.056
packs+0.055
's+0.055
Neg logit contributions
 relief-0.058
 ow-0.057
 poisonous-0.055
 Doctor-0.054
 explosion-0.054
Token.
Token ID13
Feature activation-4.70e-02

Pos logit contributions
 ones+0.015
's+0.014
oes+0.013
 color+0.013
packs+0.013
Neg logit contributions
 relief-0.023
 Doctor-0.023
 ow-0.023
 stret-0.023
ores-0.023
Token<|endoftext|>
Token ID50256
Feature activation-4.54e+00

Pos logit contributions
 relief+0.002
 Doctor+0.002
 stret+0.002
 ow+0.002
ores+0.002
Neg logit contributions
 ones-0.001
's-0.001
oes-0.001
 color-0.001
packs-0.001
TokenOne
Token ID3198
Feature activation+4.43e+00

Pos logit contributions
 relief+0.152
 Doctor+0.152
ores+0.150
 ow+0.150
 stret+0.149
Neg logit contributions
 ones-0.142
's-0.127
oes-0.127
 color-0.124
packs-0.123
Token day
Token ID1110
Feature activation-1.69e-01

Pos logit contributions
 ones+0.054
's+0.041
oes+0.039
 color+0.038
packs+0.036
Neg logit contributions
 relief-0.235
 Doctor-0.234
 stret-0.234
ores-0.232
 ow-0.232
Token two
Token ID734
Feature activation+9.45e+00

Pos logit contributions
 sorry+0.006
 ow+0.006
 Doctor+0.006
 relief+0.006
ores+0.006
Neg logit contributions
 ones-0.005
oes-0.005
packs-0.004
 color-0.004
's-0.004
Token friends
Token ID2460
Feature activation+4.94e+00

Pos logit contributions
 ones+0.281
's+0.251
oes+0.250
 color+0.244
packs+0.242
Neg logit contributions
 relief-0.333
 Doctor-0.331
 ow-0.328
ores-0.328
 stret-0.327
Token,
Token ID11
Feature activation+1.16e+00

Pos logit contributions
 ones+0.132
oes+0.120
 color+0.110
packs+0.107
's+0.103
Neg logit contributions
 Doctor-0.182
 ow-0.182
ores-0.179
 relief-0.179
coon-0.177
Token Jim
Token ID5395
Feature activation+2.43e+00

Pos logit contributions
 ones+0.042
oes+0.038
's+0.037
packs+0.037
 color+0.037
Neg logit contributions
 Doctor-0.033
 relief-0.033
ores-0.032
 ow-0.032
 awful-0.032
Token and
Token ID290
Feature activation+1.17e+00

Pos logit contributions
 ones+0.066
oes+0.059
 color+0.057
's+0.056
packs+0.055
Neg logit contributions
 Doctor-0.090
 relief-0.090
ores-0.089
 ow-0.089
 stret-0.088
Token Sam
Token ID3409
Feature activation+1.63e+00

Pos logit contributions
 ones+0.041
oes+0.038
packs+0.037
's+0.036
 color+0.036
Neg logit contributions
 Doctor-0.034
 relief-0.034
 sorry-0.033
 ow-0.032
 awful-0.032
Token<|endoftext|>
Token ID50256
Feature activation-4.62e+00

Pos logit contributions
 ones+0.004
oes+0.003
 color+0.003
packs+0.003
's+0.003
Neg logit contributions
ores-0.005
 relief-0.005
 Doctor-0.005
 sorry-0.004
othe-0.004
TokenSam
Token ID16305
Feature activation+2.19e+00

Pos logit contributions
 Doctor+0.160
 relief+0.159
ores+0.159
 ow+0.157
 stret+0.154
Neg logit contributions
 ones-0.136
oes-0.124
 color-0.120
's-0.119
packs-0.118
Token and
Token ID290
Feature activation+1.11e+00

Pos logit contributions
 ones+0.040
's+0.035
oes+0.032
 color+0.032
packs+0.031
Neg logit contributions
 relief-0.103
 Doctor-0.102
 stret-0.102
ores-0.102
paste-0.102
Token Sarah
Token ID10490
Feature activation+2.31e+00

Pos logit contributions
 ones+0.044
oes+0.042
packs+0.040
 pads+0.040
First+0.039
Neg logit contributions
 relief-0.027
 Doctor-0.026
 sorry-0.026
 ow-0.025
 stret-0.023
Token were
Token ID547
Feature activation+1.33e+00

Pos logit contributions
 ones+0.051
oes+0.046
packs+0.042
 color+0.041
's+0.039
Neg logit contributions
 ow-0.097
 relief-0.096
 stret-0.095
ores-0.095
 awful-0.095
Token two
Token ID734
Feature activation+9.44e+00

Pos logit contributions
 ones+0.039
oes+0.036
 color+0.034
packs+0.033
's+0.033
Neg logit contributions
 Doctor-0.048
 sorry-0.047
ores-0.047
 relief-0.046
 ow-0.046
Token children
Token ID1751
Feature activation+2.73e+00

Pos logit contributions
 ones+0.276
oes+0.256
 color+0.242
's+0.242
packs+0.240
Neg logit contributions
 relief-0.331
 Doctor-0.330
 ow-0.330
 awful-0.327
 stret-0.326
Token from
Token ID422
Feature activation-3.14e+00

Pos logit contributions
 ones+0.092
oes+0.085
 color+0.082
packs+0.079
's+0.079
Neg logit contributions
 Doctor-0.083
ores-0.082
 sorry-0.081
 stret-0.081
 relief-0.080
Token the
Token ID262
Feature activation+6.82e-01

Pos logit contributions
paste+0.138
ache+0.133
lic+0.128
 stret+0.126
coon+0.124
Neg logit contributions
 ones-0.082
 one-0.077
 color-0.069
packs-0.068
's-0.065
Token same
Token ID976
Feature activation-7.00e-02

Pos logit contributions
 ones+0.018
oes+0.017
packs+0.016
's+0.016
 color+0.016
Neg logit contributions
 Doctor-0.026
 relief-0.025
 ow-0.025
 stret-0.025
 awful-0.025
Token neighbourhood
Token ID19495
Feature activation+1.62e+00

Pos logit contributions
 stret+0.003
 Doctor+0.003
 relief+0.003
ores+0.003
 ow+0.003
Neg logit contributions
 ones-0.002
oes-0.002
packs-0.001
's-0.001
 color-0.001
Token a
Token ID257
Feature activation-8.53e-02

Pos logit contributions
 relief+0.013
 Doctor+0.013
ores+0.012
 stret+0.012
 ow+0.012
Neg logit contributions
 ones-0.012
's-0.011
oes-0.011
 color-0.011
packs-0.011
Token time
Token ID640
Feature activation+3.45e+00

Pos logit contributions
 relief+0.003
 ow+0.003
 stret+0.003
 awful+0.003
ores+0.003
Neg logit contributions
 ones-0.002
oes-0.002
packs-0.002
 color-0.002
's-0.002
Token,
Token ID11
Feature activation+1.13e+00

Pos logit contributions
 ones+0.105
oes+0.096
 color+0.091
packs+0.089
's+0.089
Neg logit contributions
 ow-0.118
 sorry-0.118
 Doctor-0.116
ores-0.115
 relief-0.113
Token there
Token ID612
Feature activation+6.73e+00

Pos logit contributions
 ones+0.031
's+0.027
oes+0.027
 color+0.026
packs+0.026
Neg logit contributions
 relief-0.043
 Doctor-0.043
ores-0.043
 stret-0.043
 ow-0.042
Token were
Token ID547
Feature activation+1.42e+00

Pos logit contributions
 ones+0.154
's+0.135
oes+0.131
 color+0.129
packs+0.126
Neg logit contributions
 relief-0.284
 Doctor-0.282
 stret-0.280
ores-0.280
 ow-0.278
Token two
Token ID734
Feature activation+9.43e+00

Pos logit contributions
oes+0.045
 ones+0.040
packs+0.039
 pads+0.036
ci+0.035
Neg logit contributions
 ow-0.051
 awful-0.048
 Doctor-0.047
 sorry-0.047
 horrible-0.046
Token friends
Token ID2460
Feature activation+4.99e+00

Pos logit contributions
 ones+0.257
's+0.229
oes+0.224
 color+0.221
packs+0.218
Neg logit contributions
 relief-0.357
 Doctor-0.354
ores-0.352
 stret-0.351
 ow-0.350
Token.
Token ID13
Feature activation+1.30e+00

Pos logit contributions
 ones+0.169
oes+0.158
 color+0.149
packs+0.148
 pads+0.144
Neg logit contributions
 Doctor-0.155
 sorry-0.152
 ow-0.151
 relief-0.150
ores-0.150
Token Their
Token ID5334
Feature activation+3.86e+00

Pos logit contributions
 ones+0.035
oes+0.031
 pads+0.031
's+0.031
st+0.031
Neg logit contributions
 Doctor-0.045
ores-0.044
 ow-0.044
 Dr-0.044
 sorry-0.042
Token names
Token ID3891
Feature activation+3.33e+00

Pos logit contributions
 ones+0.089
oes+0.082
packs+0.077
's+0.073
 color+0.070
Neg logit contributions
 ow-0.159
 Doctor-0.158
 relief-0.158
 sorry-0.155
 awful-0.154
Token were
Token ID547
Feature activation+1.39e+00

Pos logit contributions
 ones+0.056
's+0.049
 color+0.045
packs+0.043
oes+0.042
Neg logit contributions
 stret-0.164
 relief-0.163
 Doctor-0.162
paste-0.160
ores-0.159
Token upon
Token ID2402
Feature activation-3.57e-01

Pos logit contributions
 Doctor+0.004
ores+0.004
 ow+0.004
 relief+0.004
 sorry+0.004
Neg logit contributions
 ones-0.003
's-0.002
oes-0.002
 color-0.002
packs-0.002
Token a
Token ID257
Feature activation-7.67e-02

Pos logit contributions
 relief+0.012
 Doctor+0.012
ores+0.012
 stret+0.012
 ow+0.012
Neg logit contributions
 ones-0.011
's-0.010
oes-0.010
 color-0.010
packs-0.010
Token time
Token ID640
Feature activation+3.43e+00

Pos logit contributions
 relief+0.003
 ow+0.003
ores+0.003
 stret+0.003
 Doctor+0.003
Neg logit contributions
 ones-0.002
oes-0.002
packs-0.002
's-0.002
 color-0.002
Token there
Token ID612
Feature activation+6.71e+00

Pos logit contributions
 ones+0.108
oes+0.099
 color+0.093
packs+0.092
's+0.090
Neg logit contributions
 sorry-0.116
 ow-0.115
 Doctor-0.113
ores-0.110
 relief-0.109
Token were
Token ID547
Feature activation+1.39e+00

Pos logit contributions
 ones+0.133
's+0.123
 color+0.114
oes+0.105
 one+0.103
Neg logit contributions
 stret-0.311
 relief-0.306
paste-0.304
 Doctor-0.304
ores-0.304
Token two
Token ID734
Feature activation+9.43e+00

Pos logit contributions
oes+0.048
 ones+0.040
packs+0.039
 pads+0.038
ci+0.036
Neg logit contributions
 ow-0.047
 Doctor-0.044
 awful-0.044
 sorry-0.044
 horrible-0.042
Token friends
Token ID2460
Feature activation+5.12e+00

Pos logit contributions
 ones+0.255
's+0.228
oes+0.220
 color+0.218
packs+0.215
Neg logit contributions
 relief-0.361
ores-0.357
 Doctor-0.356
 stret-0.354
paste-0.351
Token,
Token ID11
Feature activation+1.12e+00

Pos logit contributions
 ones+0.173
oes+0.159
 color+0.153
packs+0.151
's+0.148
Neg logit contributions
 Doctor-0.158
 sorry-0.154
ores-0.154
 ow-0.153
 stret-0.153
Token Jake
Token ID14757
Feature activation+8.01e-01

Pos logit contributions
 ones+0.042
oes+0.041
's+0.039
packs+0.038
 color+0.038
Neg logit contributions
 Doctor-0.029
 ow-0.028
ores-0.028
 relief-0.028
 sorry-0.027
Token and
Token ID290
Feature activation+1.12e+00

Pos logit contributions
 ones+0.026
oes+0.024
packs+0.023
First+0.022
 color+0.022
Neg logit contributions
 stret-0.025
 Doctor-0.025
 relief-0.025
ores-0.025
 ow-0.024
Token Sam
Token ID3409
Feature activation+1.64e+00

Pos logit contributions
 ones+0.044
oes+0.044
packs+0.042
's+0.041
 pads+0.040
Neg logit contributions
 Doctor-0.027
 sorry-0.026
 ow-0.025
 relief-0.025
 Dr-0.024
Token upon
Token ID2402
Feature activation-4.85e-01

Pos logit contributions
ores+0.004
 Doctor+0.004
 ow+0.004
 sorry+0.004
 relief+0.004
Neg logit contributions
 ones-0.003
 color-0.003
's-0.003
oes-0.003
packs-0.003
Token a
Token ID257
Feature activation-7.73e-02

Pos logit contributions
 relief+0.016
 Doctor+0.016
 stret+0.016
ores+0.016
 ow+0.016
Neg logit contributions
 ones-0.015
's-0.014
oes-0.014
 color-0.013
packs-0.013
Token time
Token ID640
Feature activation+3.43e+00

Pos logit contributions
 relief+0.003
 ow+0.003
 stret+0.003
ores+0.003
 awful+0.003
Neg logit contributions
 ones-0.002
oes-0.002
packs-0.002
 color-0.002
's-0.002
Token there
Token ID612
Feature activation+6.73e+00

Pos logit contributions
 ones+0.113
oes+0.104
 color+0.098
packs+0.097
 pads+0.096
Neg logit contributions
 sorry-0.111
 ow-0.109
 Doctor-0.106
ores-0.102
 relief-0.101
Token were
Token ID547
Feature activation+1.36e+00

Pos logit contributions
 ones+0.140
's+0.127
 color+0.119
oes+0.113
packs+0.110
Neg logit contributions
 stret-0.303
 relief-0.300
 Doctor-0.298
ores-0.298
paste-0.297
Token two
Token ID734
Feature activation+9.42e+00

Pos logit contributions
oes+0.048
 ones+0.042
packs+0.041
 pads+0.040
lets+0.038
Neg logit contributions
 ow-0.045
 awful-0.041
 sorry-0.041
 Doctor-0.040
 horrible-0.039
Token friends
Token ID2460
Feature activation+5.00e+00

Pos logit contributions
 ones+0.254
's+0.229
oes+0.217
 color+0.217
packs+0.213
Neg logit contributions
 relief-0.364
ores-0.360
 Doctor-0.357
 stret-0.356
paste-0.356
Token;
Token ID26
Feature activation-7.96e-01

Pos logit contributions
 ones+0.164
oes+0.148
 color+0.144
packs+0.143
's+0.141
Neg logit contributions
 Doctor-0.161
ores-0.158
 sorry-0.157
 stret-0.157
 relief-0.156
Token a
Token ID257
Feature activation-8.99e-02

Pos logit contributions
 Doctor+0.020
ores+0.019
 Dr+0.018
 sorry+0.018
 ow+0.017
Neg logit contributions
oes-0.030
 ones-0.030
 pads-0.029
j-0.029
packs-0.029
Token young
Token ID1862
Feature activation+2.79e+00

Pos logit contributions
paste+0.004
ores+0.003
 awful+0.003
 relief+0.003
coon+0.003
Neg logit contributions
 ones-0.003
 color-0.002
's-0.002
 one-0.002
 way-0.002
Token boy
Token ID2933
Feature activation-2.36e-01

Pos logit contributions
 ones+0.077
's+0.067
 color+0.065
oes+0.065
packs+0.063
Neg logit contributions
 relief-0.107
ores-0.107
 awful-0.106
 Doctor-0.105
 ow-0.105
Token day
Token ID1110
Feature activation-2.41e-01

Pos logit contributions
 ones+0.092
packs+0.080
lets+0.074
 necks+0.072
 pads+0.072
Neg logit contributions
ores-0.159
 Doctor-0.159
 relief-0.156
 ow-0.152
 awful-0.148
Token,
Token ID11
Feature activation+1.05e+00

Pos logit contributions
 relief+0.010
 Doctor+0.010
ores+0.010
 stret+0.010
 ow+0.010
Neg logit contributions
 ones-0.006
's-0.005
oes-0.005
 color-0.005
packs-0.005
Token Be
Token ID1355
Feature activation-8.33e-01

Pos logit contributions
 ones+0.027
's+0.023
 one+0.022
 color+0.022
oes+0.019
Neg logit contributions
 awful-0.046
 relief-0.046
ores-0.045
 stret-0.044
coon-0.044
Tokena
Token ID64
Feature activation-1.34e-01

Pos logit contributions
ores+0.024
 stret+0.024
 relief+0.024
paste+0.024
 sorry+0.024
Neg logit contributions
 ones-0.029
oes-0.028
packs-0.025
 color-0.025
's-0.025
Token noticed
Token ID6810
Feature activation-1.77e+00

Pos logit contributions
 Doctor+0.006
 relief+0.005
 ow+0.005
ores+0.005
 stret+0.005
Neg logit contributions
 ones-0.003
oes-0.003
 color-0.003
packs-0.003
's-0.003
Token two
Token ID734
Feature activation+9.42e+00

Pos logit contributions
 stret+0.070
 relief+0.069
ores+0.068
paste+0.066
affle+0.066
Neg logit contributions
 ones-0.053
 one-0.050
's-0.046
 color-0.045
 wandered-0.039
Token squirrel
Token ID33039
Feature activation-1.47e+00

Pos logit contributions
 ones+0.278
's+0.254
 color+0.240
 one+0.227
packs+0.227
Neg logit contributions
 relief-0.356
ores-0.356
 stret-0.347
 Doctor-0.346
paste-0.342
Tokens
Token ID82
Feature activation+6.14e+00

Pos logit contributions
coon+0.063
 poisonous+0.062
 ow+0.062
paste+0.061
ache+0.061
Neg logit contributions
 ones-0.028
 color-0.025
oes-0.025
packs-0.022
First-0.021
Token running
Token ID2491
Feature activation+5.85e-01

Pos logit contributions
 ones+0.146
's+0.134
 color+0.125
oes+0.123
packs+0.120
Neg logit contributions
 relief-0.268
ores-0.268
 stret-0.263
 Doctor-0.262
paste-0.258
Token around
Token ID1088
Feature activation+1.58e+00

Pos logit contributions
 ones+0.019
oes+0.017
's+0.017
 color+0.017
packs+0.017
Neg logit contributions
 relief-0.019
ores-0.019
 Doctor-0.019
 stret-0.018
 awful-0.018
Token outside
Token ID2354
Feature activation-7.43e-01

Pos logit contributions
 ones+0.040
 one+0.034
 color+0.031
's+0.031
packs+0.028
Neg logit contributions
paste-0.074
ache-0.072
 awful-0.071
lic-0.071
affle-0.070
Token a
Token ID257
Feature activation-7.68e-02

Pos logit contributions
 relief+0.012
 Doctor+0.012
ores+0.012
 stret+0.012
 ow+0.012
Neg logit contributions
 ones-0.012
's-0.011
oes-0.011
 color-0.010
packs-0.010
Token time
Token ID640
Feature activation+3.48e+00

Pos logit contributions
 relief+0.003
 ow+0.003
 stret+0.003
ores+0.003
 awful+0.003
Neg logit contributions
 ones-0.002
oes-0.002
packs-0.002
 color-0.002
's-0.002
Token,
Token ID11
Feature activation+1.09e+00

Pos logit contributions
 ones+0.105
oes+0.096
 color+0.091
packs+0.089
's+0.088
Neg logit contributions
 sorry-0.120
 ow-0.120
 Doctor-0.119
 relief-0.116
ores-0.116
Token there
Token ID612
Feature activation+6.72e+00

Pos logit contributions
 ones+0.029
's+0.026
oes+0.025
 color+0.025
packs+0.025
Neg logit contributions
 relief-0.042
 Doctor-0.042
ores-0.042
 stret-0.042
 ow-0.041
Token were
Token ID547
Feature activation+1.33e+00

Pos logit contributions
 ones+0.148
's+0.132
 color+0.125
oes+0.122
packs+0.118
Neg logit contributions
 relief-0.291
 stret-0.290
 Doctor-0.288
ores-0.288
paste-0.286
Token two
Token ID734
Feature activation+9.42e+00

Pos logit contributions
oes+0.044
 ones+0.038
packs+0.038
 pads+0.036
ci+0.035
Neg logit contributions
 ow-0.046
 awful-0.043
 Doctor-0.043
 sorry-0.043
 horrible-0.041
Token friends
Token ID2460
Feature activation+5.05e+00

Pos logit contributions
 ones+0.260
oes+0.230
's+0.230
 color+0.224
packs+0.222
Neg logit contributions
 relief-0.351
 Doctor-0.350
 ow-0.348
 stret-0.346
ores-0.346
Token,
Token ID11
Feature activation+1.06e+00

Pos logit contributions
 ones+0.174
oes+0.162
 color+0.152
packs+0.150
 pads+0.150
Neg logit contributions
 Doctor-0.154
 sorry-0.152
 ow-0.151
ores-0.148
 relief-0.147
Token Jack
Token ID3619
Feature activation+2.15e+00

Pos logit contributions
 ones+0.041
oes+0.039
's+0.037
packs+0.036
 color+0.036
Neg logit contributions
 Doctor-0.027
 ow-0.026
 relief-0.026
ores-0.026
 sorry-0.026
Token and
Token ID290
Feature activation+1.07e+00

Pos logit contributions
 ones+0.078
oes+0.074
First+0.071
packs+0.067
 color+0.067
Neg logit contributions
 Doctor-0.056
 stret-0.055
 poisonous-0.054
ores-0.054
coon-0.054
Token Jill
Token ID22772
Feature activation+2.85e+00

Pos logit contributions
 ones+0.042
oes+0.041
packs+0.039
's+0.038
 pads+0.037
Neg logit contributions
 Doctor-0.026
 sorry-0.026
 relief-0.025
 ow-0.025
 Dr-0.024
Token.
Token ID13
Feature activation-6.42e-02

Pos logit contributions
 stret+0.171
coon+0.166
 furious+0.161
 sorry+0.160
ocol+0.160
Neg logit contributions
's-0.053
 ones-0.044
 one-0.041
 color-0.040
packs-0.032
Token<|endoftext|>
Token ID50256
Feature activation-4.62e+00

Pos logit contributions
 relief+0.003
 stret+0.003
 Doctor+0.003
 ow+0.003
 awful+0.003
Neg logit contributions
 ones-0.002
's-0.001
oes-0.001
 color-0.001
packs-0.001
TokenOne
Token ID3198
Feature activation+4.39e+00

Pos logit contributions
 relief+0.154
 Doctor+0.153
ores+0.151
 ow+0.151
 stret+0.151
Neg logit contributions
 ones-0.146
's-0.131
oes-0.130
 color-0.128
packs-0.127
Token day
Token ID1110
Feature activation-2.84e-01

Pos logit contributions
 ones+0.058
's+0.043
oes+0.042
packs+0.039
 color+0.039
Neg logit contributions
 relief-0.228
 Doctor-0.227
ores-0.226
 ow-0.226
 awful-0.225
Token,
Token ID11
Feature activation+1.13e+00

Pos logit contributions
 Doctor+0.010
 ow+0.010
 relief+0.010
ores+0.010
 sorry+0.010
Neg logit contributions
 ones-0.008
oes-0.007
packs-0.007
's-0.007
 color-0.007
Token two
Token ID734
Feature activation+9.42e+00

Pos logit contributions
 ones+0.037
oes+0.034
's+0.034
packs+0.033
 color+0.033
Neg logit contributions
 relief-0.036
 ow-0.036
 Doctor-0.036
ores-0.036
 awful-0.035
Token friends
Token ID2460
Feature activation+5.00e+00

Pos logit contributions
 ones+0.295
oes+0.293
packs+0.272
 color+0.267
First+0.252
Neg logit contributions
 ow-0.315
 Doctor-0.303
 awful-0.302
 relief-0.300
 sorry-0.299
Token went
Token ID1816
Feature activation-7.49e-01

Pos logit contributions
oes+0.181
 ones+0.172
 pads+0.158
j+0.154
 color+0.154
Neg logit contributions
 ow-0.144
 sorry-0.140
 relief-0.139
 Doctor-0.138
 what-0.133
Token to
Token ID284
Feature activation+1.69e-01

Pos logit contributions
ores+0.020
 stret+0.020
 sorry+0.020
 Doctor+0.020
lic+0.018
Neg logit contributions
 ones-0.027
 color-0.025
oes-0.025
's-0.023
First-0.023
Token the
Token ID262
Feature activation+5.84e-01

Pos logit contributions
 ones+0.006
oes+0.005
's+0.005
packs+0.005
 color+0.005
Neg logit contributions
 Doctor-0.005
 relief-0.005
 stret-0.005
ores-0.005
 ow-0.005
Token river
Token ID7850
Feature activation-1.23e-01

Pos logit contributions
 ones+0.014
oes+0.013
packs+0.013
's+0.012
 color+0.012
Neg logit contributions
 Doctor-0.024
 ow-0.023
 stret-0.023
 awful-0.023
 relief-0.023
Token upon
Token ID2402
Feature activation-3.75e-01

Pos logit contributions
 relief+0.002
 Doctor+0.002
ores+0.002
 ow+0.002
 awful+0.002
Neg logit contributions
 ones-0.001
's-0.001
oes-0.001
 color-0.001
packs-0.001
Token a
Token ID257
Feature activation+6.33e-02

Pos logit contributions
 stret+0.014
 relief+0.014
 Doctor+0.014
ores+0.014
paste+0.014
Neg logit contributions
 ones-0.011
's-0.010
 color-0.009
 one-0.009
packs-0.009
Token time
Token ID640
Feature activation+3.47e+00

Pos logit contributions
 ones+0.001
's+0.001
oes+0.001
 color+0.001
packs+0.001
Neg logit contributions
 relief-0.003
 Doctor-0.003
ores-0.003
 stret-0.003
 ow-0.003
Token there
Token ID612
Feature activation+6.76e+00

Pos logit contributions
 ones+0.094
oes+0.083
's+0.081
 color+0.080
packs+0.079
Neg logit contributions
 Doctor-0.130
 ow-0.130
 relief-0.130
ores-0.129
 sorry-0.128
Token were
Token ID547
Feature activation+1.47e+00

Pos logit contributions
 ones+0.129
's+0.121
 color+0.112
 one+0.101
oes+0.101
Neg logit contributions
 stret-0.320
 relief-0.313
paste-0.313
 Doctor-0.311
ores-0.311
Token two
Token ID734
Feature activation+9.42e+00

Pos logit contributions
oes+0.046
 ones+0.040
packs+0.039
 pads+0.036
 wander+0.034
Neg logit contributions
 ow-0.055
 awful-0.052
 sorry-0.051
 Doctor-0.051
 horrible-0.050
Token teams
Token ID3466
Feature activation+5.83e+00

Pos logit contributions
 ones+0.248
's+0.224
 color+0.210
oes+0.205
packs+0.204
Neg logit contributions
 relief-0.373
ores-0.372
paste-0.366
 Doctor-0.364
 stret-0.364
Token.
Token ID13
Feature activation+1.27e+00

Pos logit contributions
 ones+0.166
oes+0.147
's+0.147
 color+0.143
packs+0.142
Neg logit contributions
 relief-0.211
 Doctor-0.210
ores-0.209
 ow-0.208
 stret-0.208
Token The
Token ID383
Feature activation+1.32e+00

Pos logit contributions
 ones+0.029
oes+0.026
's+0.025
packs+0.025
 pads+0.023
Neg logit contributions
 Doctor-0.052
ores-0.050
 relief-0.049
 Dr-0.049
 ow-0.049
Token two
Token ID734
Feature activation+9.40e+00

Pos logit contributions
 ones+0.035
packs+0.034
oes+0.034
 color+0.031
's+0.031
Neg logit contributions
 Doctor-0.047
 sorry-0.047
 relief-0.047
 awful-0.046
 ow-0.046
Token teams
Token ID3466
Feature activation+5.81e+00

Pos logit contributions
 ones+0.275
oes+0.269
packs+0.255
 color+0.243
's+0.231
Neg logit contributions
 sorry-0.312
 ow-0.311
 awful-0.310
 Doctor-0.304
 relief-0.302
Token a
Token ID257
Feature activation-3.44e-02

Pos logit contributions
 relief+0.015
 Doctor+0.015
 stret+0.015
ores+0.015
 ow+0.015
Neg logit contributions
 ones-0.014
's-0.013
oes-0.013
 color-0.013
packs-0.012
Token time
Token ID640
Feature activation+3.46e+00

Pos logit contributions
 relief+0.001
 ow+0.001
ores+0.001
 stret+0.001
 awful+0.001
Neg logit contributions
 ones-0.001
oes-0.001
packs-0.001
 color-0.001
's-0.001
Token,
Token ID11
Feature activation+1.19e+00

Pos logit contributions
 ones+0.113
oes+0.105
 color+0.098
packs+0.098
 pads+0.097
Neg logit contributions
 sorry-0.112
 ow-0.111
 Doctor-0.107
ores-0.105
 relief-0.103
Token there
Token ID612
Feature activation+6.75e+00

Pos logit contributions
 ones+0.033
oes+0.029
's+0.029
 color+0.028
packs+0.028
Neg logit contributions
 relief-0.044
 Doctor-0.044
 ow-0.044
ores-0.044
 awful-0.043
Token were
Token ID547
Feature activation+1.40e+00

Pos logit contributions
 ones+0.166
oes+0.146
's+0.142
packs+0.140
 color+0.138
Neg logit contributions
 relief-0.271
 ow-0.269
 Doctor-0.269
ores-0.267
 awful-0.265
Token two
Token ID734
Feature activation+9.41e+00

Pos logit contributions
oes+0.045
packs+0.040
 ones+0.040
 pads+0.036
ci+0.035
Neg logit contributions
 ow-0.050
 awful-0.048
 sorry-0.046
 Doctor-0.046
 horrible-0.046
Token best
Token ID1266
Feature activation+4.95e+00

Pos logit contributions
 ones+0.259
's+0.229
oes+0.228
 color+0.222
packs+0.221
Neg logit contributions
 relief-0.352
 Doctor-0.351
 ow-0.348
ores-0.347
 stret-0.347
Token friends
Token ID2460
Feature activation+5.07e+00

Pos logit contributions
 ones+0.131
's+0.115
oes+0.114
 color+0.111
packs+0.110
Neg logit contributions
 relief-0.192
 Doctor-0.191
ores-0.189
 ow-0.189
 stret-0.189
Token,
Token ID11
Feature activation+1.15e+00

Pos logit contributions
 ones+0.182
oes+0.175
First+0.165
 pads+0.165
 color+0.162
Neg logit contributions
 sorry-0.142
 Doctor-0.138
 ow-0.138
ores-0.136
 relief-0.133
Token Car
Token ID1879
Feature activation-1.40e+00

Pos logit contributions
 ones+0.044
oes+0.042
's+0.040
packs+0.039
 color+0.039
Neg logit contributions
 Doctor-0.030
 relief-0.029
 ow-0.029
ores-0.028
 sorry-0.028
Tokena
Token ID64
Feature activation-2.55e-01

Pos logit contributions
ache+0.040
paste+0.038
 stret+0.038
 Doctor+0.038
 ow+0.037
Neg logit contributions
 ones-0.047
oes-0.047
 color-0.045
First-0.044
 way-0.041
Token!
Token ID0
Feature activation-1.19e+00

Pos logit contributions
paste+0.026
lic+0.026
ores+0.026
 stret+0.026
 Doctor+0.026
Neg logit contributions
 ones-0.015
's-0.014
 one-0.014
 color-0.012
 way-0.012
Token<|endoftext|>
Token ID50256
Feature activation-4.62e+00

Pos logit contributions
 awful+0.050
 ow+0.050
 stret+0.050
 poisonous+0.050
coon+0.049
Neg logit contributions
 ones-0.027
's-0.026
packs-0.024
oes-0.024
 One-0.023
TokenOne
Token ID3198
Feature activation+4.32e+00

Pos logit contributions
 relief+0.163
 stret+0.161
 awful+0.161
 Doctor+0.160
 ow+0.159
Neg logit contributions
 ones-0.141
's-0.128
oes-0.123
 color-0.121
packs-0.121
Token day
Token ID1110
Feature activation-1.81e-01

Pos logit contributions
 ones+0.052
's+0.039
oes+0.037
 color+0.036
packs+0.034
Neg logit contributions
 relief-0.230
 Doctor-0.229
 stret-0.228
ores-0.227
 ow-0.227
Token,
Token ID11
Feature activation+1.15e+00

Pos logit contributions
 ow+0.007
 Doctor+0.007
 relief+0.007
ores+0.007
 sorry+0.007
Neg logit contributions
 ones-0.005
oes-0.005
 color-0.004
packs-0.004
's-0.004
Token two
Token ID734
Feature activation+9.41e+00

Pos logit contributions
 ones+0.036
's+0.032
 color+0.032
oes+0.031
packs+0.031
Neg logit contributions
 relief-0.039
 Doctor-0.039
 stret-0.039
ores-0.038
paste-0.038
Token friends
Token ID2460
Feature activation+5.03e+00

Pos logit contributions
 ones+0.291
oes+0.287
packs+0.266
 color+0.262
First+0.248
Neg logit contributions
 ow-0.317
 Doctor-0.307
 awful-0.306
 relief-0.305
 sorry-0.303
Token wanted
Token ID2227
Feature activation+9.01e-01

Pos logit contributions
oes+0.185
 ones+0.176
 pads+0.164
j+0.162
 color+0.157
Neg logit contributions
 ow-0.140
 sorry-0.137
 relief-0.134
 Doctor-0.132
 what-0.132
Token to
Token ID284
Feature activation+1.55e-01

Pos logit contributions
 ones+0.023
oes+0.021
packs+0.020
 color+0.019
's+0.019
Neg logit contributions
 Doctor-0.035
 awful-0.034
ores-0.034
 relief-0.034
 stret-0.034
Token play
Token ID711
Feature activation-9.09e-01

Pos logit contributions
 ones+0.004
oes+0.003
packs+0.003
's+0.003
 color+0.003
Neg logit contributions
 Doctor-0.006
 relief-0.006
 stret-0.006
ores-0.006
 ow-0.006
Token.
Token ID13
Feature activation+1.38e+00

Pos logit contributions
 relief+0.037
 ow+0.037
ache+0.037
paste+0.037
 awful+0.037
Neg logit contributions
 ones-0.024
's-0.022
 one-0.021
 color-0.020
packs-0.018
Token not
Token ID407
Feature activation-6.87e-01

Pos logit contributions
 relief+0.031
 Doctor+0.030
 sorry+0.030
 awful+0.030
ores+0.030
Neg logit contributions
 ones-0.024
packs-0.023
oes-0.022
's-0.021
 color-0.020
Token eat
Token ID4483
Feature activation-4.08e+00

Pos logit contributions
 relief+0.026
 Doctor+0.025
 ow+0.025
ores+0.025
 awful+0.025
Neg logit contributions
 ones-0.019
's-0.017
packs-0.016
oes-0.016
 color-0.016
Token too
Token ID1165
Feature activation-5.29e-01

Pos logit contributions
 relief+0.168
lic+0.164
coon+0.163
 stret+0.159
ores+0.158
Neg logit contributions
 ones-0.129
 one-0.107
 color-0.103
's-0.102
packs-0.091
Token many
Token ID867
Feature activation-8.10e-01

Pos logit contributions
 stret+0.015
 relief+0.015
 Doctor+0.014
paste+0.014
 ow+0.014
Neg logit contributions
 ones-0.020
's-0.018
oes-0.018
 color-0.018
packs-0.018
Token ch
Token ID442
Feature activation-2.59e+00

Pos logit contributions
ores+0.035
 stret+0.034
 relief+0.034
coon+0.034
 Doctor+0.034
Neg logit contributions
 ones-0.022
 color-0.018
's-0.018
 one-0.017
packs-0.016
Tokenocol
Token ID4668
Feature activation-1.14e+01

Pos logit contributions
 relief+0.053
 stret+0.051
 Doctor+0.049
 awful+0.047
coon+0.046
Neg logit contributions
 ones-0.124
oes-0.118
packs-0.117
 color-0.112
's-0.110
Tokenates
Token ID689
Feature activation+5.46e-01

Pos logit contributions
paste+0.279
ores+0.270
coon+0.265
ache+0.258
ocol+0.253
Neg logit contributions
 ones-0.448
's-0.434
 color-0.420
 way-0.417
 one-0.405
Token and
Token ID290
Feature activation-4.23e-01

Pos logit contributions
 ones+0.015
's+0.014
 color+0.012
 one+0.012
oes+0.012
Neg logit contributions
 relief-0.023
 stret-0.022
 Doctor-0.022
 ow-0.022
 explosion-0.021
Token his
Token ID465
Feature activation-3.58e-01

Pos logit contributions
 stret+0.016
coon+0.016
ores+0.016
lic+0.016
 Doctor+0.015
Neg logit contributions
 ones-0.012
's-0.012
 color-0.011
 one-0.011
oes-0.010
Token friend
Token ID1545
Feature activation+2.62e+00

Pos logit contributions
ores+0.015
lic+0.015
coon+0.015
paste+0.014
 awful+0.014
Neg logit contributions
 ones-0.011
 color-0.009
 one-0.009
 way-0.008
's-0.008
Token learned
Token ID4499
Feature activation+1.58e+00

Pos logit contributions
 ones+0.054
's+0.053
 one+0.043
 color+0.042
packs+0.039
Neg logit contributions
 stret-0.124
ores-0.122
 relief-0.121
coon-0.119
 ow-0.119
Token was
Token ID373
Feature activation+7.92e-01

Pos logit contributions
 stret+0.013
 Doctor+0.013
 relief+0.012
 ow+0.012
ores+0.012
Neg logit contributions
's-0.007
 ones-0.007
 one-0.007
 color-0.006
 way-0.005
Token filled
Token ID5901
Feature activation+1.18e+00

Pos logit contributions
 ones+0.023
's+0.022
 one+0.022
 color+0.021
 way+0.019
Neg logit contributions
 stret-0.029
paste-0.029
coon-0.029
 ow-0.029
 Doctor-0.029
Token with
Token ID351
Feature activation+6.51e-01

Pos logit contributions
 ones+0.031
's+0.029
 color+0.027
 one+0.026
oes+0.026
Neg logit contributions
 ow-0.047
 relief-0.046
 Doctor-0.046
 stret-0.046
paste-0.045
Token delicious
Token ID12625
Feature activation-2.57e+00

Pos logit contributions
 ones+0.021
 color+0.019
 one+0.019
's+0.018
 way+0.016
Neg logit contributions
coon-0.025
paste-0.024
 stret-0.024
lic-0.024
ache-0.024
Token ch
Token ID442
Feature activation-2.30e+00

Pos logit contributions
ores+0.104
coon+0.101
othe+0.098
 Doctor+0.098
 stret+0.098
Neg logit contributions
 ones-0.080
 color-0.067
's-0.065
 one-0.063
 way-0.058
Tokenocol
Token ID4668
Feature activation-1.12e+01

Pos logit contributions
 Doctor+0.056
 stret+0.053
 relief+0.051
 Dr+0.050
 awful+0.049
Neg logit contributions
 ones-0.099
packs-0.091
oes-0.091
 color-0.090
's-0.088
Tokenates
Token ID689
Feature activation+8.50e-01

Pos logit contributions
paste+0.268
ores+0.258
coon+0.255
ocol+0.247
 ow+0.244
Neg logit contributions
 ones-0.461
's-0.435
 way-0.429
 color-0.425
 one-0.417
Token.
Token ID13
Feature activation-2.50e-01

Pos logit contributions
 ones+0.023
's+0.022
packs+0.020
 color+0.020
oes+0.020
Neg logit contributions
 relief-0.033
 Doctor-0.033
 ow-0.033
 stret-0.032
 awful-0.032
Token He
Token ID679
Feature activation-2.51e+00

Pos logit contributions
 ow+0.011
 awful+0.011
 stret+0.010
 relief+0.010
paste+0.010
Neg logit contributions
 ones-0.006
's-0.005
oes-0.005
 one-0.005
 color-0.005
Token was
Token ID373
Feature activation+7.14e-01

Pos logit contributions
ores+0.114
 stret+0.114
 Doctor+0.113
 relief+0.112
 awful+0.111
Neg logit contributions
's-0.056
 ones-0.048
 color-0.045
 one-0.045
 traveled-0.044
Token happy
Token ID3772
Feature activation+3.22e-01

Pos logit contributions
's+0.021
 one+0.019
 ones+0.018
 color+0.018
 way+0.017
Neg logit contributions
coon-0.028
 stret-0.027
paste-0.027
 ow-0.026
ache-0.026
Token colorful
Token ID20239
Feature activation+2.03e+00

Pos logit contributions
ores+0.012
 relief+0.011
 Doctor+0.011
paste+0.011
 stret+0.011
Neg logit contributions
 ones-0.010
's-0.009
 color-0.008
packs-0.008
 one-0.008
Token cand
Token ID2658
Feature activation-3.97e-01

Pos logit contributions
 ones+0.053
's+0.046
oes+0.044
 color+0.044
packs+0.044
Neg logit contributions
 relief-0.081
 Doctor-0.080
ores-0.080
 ow-0.079
 stret-0.079
Tokenies
Token ID444
Feature activation+1.39e+00

Pos logit contributions
ores+0.015
 ow+0.015
paste+0.015
 Doctor+0.015
 relief+0.015
Neg logit contributions
 ones-0.010
oes-0.009
's-0.009
 color-0.009
packs-0.008
Token and
Token ID290
Feature activation+6.44e-01

Pos logit contributions
 ones+0.040
's+0.037
 color+0.035
packs+0.034
 one+0.034
Neg logit contributions
 relief-0.053
 Doctor-0.052
 stret-0.051
ores-0.051
 ow-0.051
Token ch
Token ID442
Feature activation-2.23e+00

Pos logit contributions
 ones+0.017
 color+0.015
's+0.015
 one+0.015
packs+0.013
Neg logit contributions
ores-0.027
 relief-0.026
 stret-0.026
coon-0.026
 Doctor-0.026
Tokenocol
Token ID4668
Feature activation-1.12e+01

Pos logit contributions
 Doctor+0.051
 relief+0.049
 stret+0.048
 awful+0.047
coon+0.046
Neg logit contributions
 ones-0.099
packs-0.095
oes-0.092
 color-0.089
's-0.089
Tokenates
Token ID689
Feature activation+9.35e-01

Pos logit contributions
paste+0.259
ores+0.246
coon+0.239
ocol+0.237
 ow+0.235
Neg logit contributions
 ones-0.457
's-0.453
 way-0.439
 color-0.432
oes-0.421
Token.
Token ID13
Feature activation-2.18e-02

Pos logit contributions
 ones+0.023
's+0.021
 color+0.020
packs+0.020
oes+0.020
Neg logit contributions
 relief-0.039
 Doctor-0.039
 stret-0.038
 ow-0.038
 awful-0.037
Token She
Token ID1375
Feature activation-2.49e+00

Pos logit contributions
 Doctor+0.001
 relief+0.001
ores+0.001
 ow+0.001
 stret+0.001
Neg logit contributions
 ones-0.001
's-0.000
oes-0.000
 color-0.000
packs-0.000
Token said
Token ID531
Feature activation+1.89e-01

Pos logit contributions
 relief+0.112
ores+0.112
 stret+0.111
 Doctor+0.111
paste+0.109
Neg logit contributions
 ones-0.051
's-0.049
 color-0.045
 one-0.042
oes-0.040
Token to
Token ID284
Feature activation-1.85e-01

Pos logit contributions
 ones+0.004
's+0.004
 one+0.003
 color+0.003
oes+0.003
Neg logit contributions
 relief-0.008
ores-0.008
 stret-0.008
coon-0.008
 awful-0.008
Token find
Token ID1064
Feature activation-1.25e+00

Pos logit contributions
 ones+0.028
's+0.026
 color+0.025
oes+0.024
 one+0.024
Neg logit contributions
 awful-0.043
 relief-0.043
 Doctor-0.043
coon-0.043
 ow-0.043
Token a
Token ID257
Feature activation-1.30e+00

Pos logit contributions
 stret+0.054
ache+0.053
coon+0.053
 relief+0.052
 Doctor+0.051
Neg logit contributions
 ones-0.033
 one-0.029
's-0.028
 color-0.026
packs-0.025
Token box
Token ID3091
Feature activation+1.96e+00

Pos logit contributions
ores+0.057
 awful+0.054
coon+0.052
 relief+0.052
paste+0.051
Neg logit contributions
 ones-0.036
 color-0.035
 way-0.032
's-0.030
 one-0.029
Token of
Token ID286
Feature activation-2.06e+00

Pos logit contributions
 ones+0.063
's+0.058
oes+0.057
 color+0.056
packs+0.055
Neg logit contributions
 relief-0.064
 Doctor-0.064
 stret-0.063
ores-0.063
 ow-0.063
Token ch
Token ID442
Feature activation-2.50e+00

Pos logit contributions
ores+0.094
ache+0.093
coon+0.090
 Doctor+0.088
lic+0.087
Neg logit contributions
 ones-0.057
 color-0.052
 one-0.046
's-0.042
packs-0.039
Tokenocol
Token ID4668
Feature activation-1.11e+01

Pos logit contributions
 Doctor+0.067
 stret+0.066
 relief+0.065
 awful+0.063
 ow+0.062
Neg logit contributions
 ones-0.103
packs-0.098
oes-0.095
 color-0.092
's-0.091
Tokenates
Token ID689
Feature activation+1.01e+00

Pos logit contributions
paste+0.253
ores+0.232
ocol+0.230
coon+0.227
 ow+0.216
Neg logit contributions
 ones-0.464
's-0.453
 way-0.452
 one-0.441
 One-0.436
Token.
Token ID13
Feature activation-8.75e-01

Pos logit contributions
 ones+0.026
's+0.023
oes+0.023
 color+0.022
packs+0.022
Neg logit contributions
 relief-0.040
 Doctor-0.040
 stret-0.039
 ow-0.039
 awful-0.039
Token\n
Token ID198
Feature activation-8.28e-01

Pos logit contributions
 relief+0.040
 ow+0.039
 awful+0.039
 stret+0.039
 Doctor+0.039
Neg logit contributions
 ones-0.018
's-0.015
oes-0.014
 color-0.014
packs-0.014
Token\n
Token ID198
Feature activation-8.14e-01

Pos logit contributions
 relief+0.038
 ow+0.038
 awful+0.038
paste+0.038
 stret+0.038
Neg logit contributions
 ones-0.016
's-0.015
oes-0.013
packs-0.013
 color-0.013
Token"
Token ID1
Feature activation+3.36e+00

Pos logit contributions
ores+0.029
 Doctor+0.029
 relief+0.029
 stret+0.028
paste+0.028
Neg logit contributions
 ones-0.024
oes-0.022
 color-0.021
packs-0.020
's-0.020
Token't
Token ID470
Feature activation+1.93e+00

Pos logit contributions
 relief+0.002
 stret+0.002
ores+0.002
paste+0.002
 awful+0.002
Neg logit contributions
 ones-0.001
's-0.001
 color-0.001
oes-0.001
 one-0.001
Token have
Token ID423
Feature activation+1.32e+00

Pos logit contributions
 ones+0.048
's+0.046
 color+0.045
oes+0.043
 one+0.042
Neg logit contributions
ores-0.077
 Doctor-0.076
 stret-0.076
 relief-0.076
paste-0.076
Token all
Token ID477
Feature activation+5.29e+00

Pos logit contributions
 ones+0.037
's+0.033
 color+0.032
oes+0.031
packs+0.031
Neg logit contributions
 relief-0.050
 Doctor-0.049
ores-0.049
 stret-0.049
coon-0.048
Token these
Token ID777
Feature activation+2.95e+00

Pos logit contributions
 ones+0.170
's+0.154
oes+0.151
 color+0.149
packs+0.148
Neg logit contributions
 relief-0.176
 Doctor-0.175
ores-0.173
 stret-0.173
 ow-0.173
Token ch
Token ID442
Feature activation-2.38e+00

Pos logit contributions
 ones+0.088
's+0.076
 color+0.073
 one+0.071
packs+0.067
Neg logit contributions
coon-0.117
 relief-0.116
ores-0.114
 til-0.113
 Doctor-0.112
Tokenocol
Token ID4668
Feature activation-1.11e+01

Pos logit contributions
 relief+0.032
 Doctor+0.031
 awful+0.030
 stret+0.029
 ow+0.029
Neg logit contributions
 ones-0.128
oes-0.120
packs-0.120
 color-0.117
's-0.116
Tokenates
Token ID689
Feature activation+9.24e-01

Pos logit contributions
paste+0.264
ores+0.258
ocol+0.248
coon+0.244
ache+0.234
Neg logit contributions
 ones-0.450
's-0.441
 way-0.428
 color-0.418
 one-0.413
Token!
Token ID0
Feature activation-1.17e+00

Pos logit contributions
 ones+0.027
's+0.025
 color+0.023
oes+0.023
packs+0.023
Neg logit contributions
 relief-0.035
 Doctor-0.035
 stret-0.034
 ow-0.034
coon-0.033
Token Now
Token ID2735
Feature activation-3.99e-01

Pos logit contributions
 Doctor+0.040
 relief+0.038
ores+0.038
 sorry+0.038
paste+0.037
Neg logit contributions
 ones-0.038
packs-0.034
oes-0.033
 color-0.033
's-0.032
Token,
Token ID11
Feature activation+3.76e-02

Pos logit contributions
ores+0.015
 stret+0.015
 relief+0.015
 Doctor+0.015
coon+0.015
Neg logit contributions
 ones-0.011
's-0.010
 color-0.010
 one-0.009
oes-0.009
Token go
Token ID467
Feature activation-2.63e+00

Pos logit contributions
 ones+0.001
's+0.001
 color+0.001
 one+0.001
oes+0.001
Neg logit contributions
 stret-0.001
 ow-0.001
ores-0.001
coon-0.001
 relief-0.001
Token she
Token ID673
Feature activation-9.58e-01

Pos logit contributions
coon+0.006
 stret+0.006
ores+0.006
 ow+0.006
 awful+0.006
Neg logit contributions
 ones-0.004
's-0.003
 one-0.003
 color-0.003
packs-0.003
Token paid
Token ID3432
Feature activation-1.98e+00

Pos logit contributions
 relief+0.045
ores+0.045
 ow+0.045
 awful+0.045
ocol+0.045
Neg logit contributions
's-0.021
 wandered-0.016
 one-0.016
 traveled-0.016
 decided-0.015
Token for
Token ID329
Feature activation-5.92e+00

Pos logit contributions
 ow+0.089
coon+0.085
arter+0.083
 explosion+0.083
lic+0.082
Neg logit contributions
 one-0.050
's-0.049
 ones-0.043
 way-0.037
 first-0.035
Token the
Token ID262
Feature activation-1.68e-01

Pos logit contributions
lic+0.264
paste+0.251
coon+0.251
ache+0.248
affle+0.241
Neg logit contributions
 ones-0.155
 one-0.153
 color-0.129
 way-0.121
's-0.114
Token ch
Token ID442
Feature activation-2.31e+00

Pos logit contributions
ores+0.007
othe+0.006
lic+0.006
 til+0.006
coon+0.006
Neg logit contributions
 ones-0.006
 one-0.005
 color-0.005
 way-0.005
's-0.004
Tokenocol
Token ID4668
Feature activation-1.11e+01

Pos logit contributions
 relief+0.041
 awful+0.041
 horrible+0.039
 ow+0.037
 furious+0.037
Neg logit contributions
 ones-0.116
oes-0.107
 color-0.104
 one-0.104
packs-0.100
Tokenates
Token ID689
Feature activation+9.65e-01

Pos logit contributions
paste+0.297
ores+0.289
coon+0.279
 Doctor+0.277
 ow+0.272
Neg logit contributions
 ones-0.422
's-0.401
 color-0.386
 way-0.380
 one-0.374
Token and
Token ID290
Feature activation+4.49e-01

Pos logit contributions
 ones+0.025
's+0.024
oes+0.022
packs+0.022
 one+0.022
Neg logit contributions
 relief-0.039
 ow-0.038
 Doctor-0.038
 stret-0.037
 awful-0.037
Token the
Token ID262
Feature activation-1.89e-01

Pos logit contributions
's+0.011
 ones+0.011
 one+0.010
 color+0.010
 traveled+0.009
Neg logit contributions
coon-0.020
 awful-0.019
 ow-0.019
affle-0.019
ores-0.019
Token two
Token ID734
Feature activation+8.60e+00

Pos logit contributions
ores+0.008
othe+0.008
lic+0.007
 til+0.007
paste+0.007
Neg logit contributions
 ones-0.006
 one-0.005
 color-0.005
 way-0.005
's-0.005
Token of
Token ID286
Feature activation-1.78e+00

Pos logit contributions
 ones+0.272
's+0.255
 one+0.234
 color+0.228
packs+0.225
Neg logit contributions
ores-0.308
coon-0.303
 relief-0.302
 Doctor-0.300
 stret-0.296
Token giving
Token ID3501
Feature activation+1.47e+00

Pos logit contributions
paste+0.011
lic+0.011
ores+0.011
 relief+0.011
 stret+0.010
Neg logit contributions
 ones-0.010
 color-0.009
 one-0.009
's-0.009
 way-0.008
Token him
Token ID683
Feature activation-3.04e+00

Pos logit contributions
 ones+0.047
's+0.044
 one+0.042
 color+0.038
oes+0.037
Neg logit contributions
 stret-0.056
coon-0.054
ores-0.054
 ow-0.053
 relief-0.053
Token too
Token ID1165
Feature activation-1.84e-01

Pos logit contributions
coon+0.122
 stret+0.121
lic+0.120
 relief+0.120
 Doctor+0.119
Neg logit contributions
 ones-0.085
's-0.079
 one-0.078
 color-0.071
packs-0.067
Token many
Token ID867
Feature activation-5.63e-01

Pos logit contributions
 relief+0.005
 Doctor+0.005
ores+0.005
 ow+0.005
 awful+0.005
Neg logit contributions
 ones-0.007
's-0.007
oes-0.007
 color-0.006
packs-0.006
Token ch
Token ID442
Feature activation-2.42e+00

Pos logit contributions
ores+0.023
 stret+0.023
coon+0.023
 Doctor+0.022
lic+0.022
Neg logit contributions
 ones-0.016
 color-0.013
 one-0.013
's-0.013
packs-0.012
Tokenocol
Token ID4668
Feature activation-1.11e+01

Pos logit contributions
 stret+0.073
 Doctor+0.071
 relief+0.070
 awful+0.067
coon+0.066
Neg logit contributions
 ones-0.091
oes-0.086
packs-0.086
 color-0.081
 one-0.078
Tokenates
Token ID689
Feature activation+7.47e-01

Pos logit contributions
paste+0.265
ores+0.261
coon+0.252
ocol+0.252
ache+0.247
Neg logit contributions
 ones-0.441
's-0.425
 color-0.413
 way-0.409
 one-0.395
Token."
Token ID526
Feature activation+1.17e+00

Pos logit contributions
 ones+0.019
's+0.017
 one+0.016
 color+0.016
oes+0.016
Neg logit contributions
 relief-0.032
 Doctor-0.031
 stret-0.031
 ow-0.031
coon-0.030
Token The
Token ID383
Feature activation+3.78e-01

Pos logit contributions
 ones+0.025
's+0.024
 one+0.021
oes+0.021
 color+0.021
Neg logit contributions
 awful-0.051
 ow-0.051
 relief-0.051
paste-0.051
ores-0.050
Token king
Token ID5822
Feature activation-2.43e+00

Pos logit contributions
 ones+0.011
 one+0.010
 color+0.009
's+0.009
 way+0.008
Neg logit contributions
ores-0.016
paste-0.015
othe-0.015
 stret-0.014
 relief-0.014
Token learned
Token ID4499
Feature activation+1.78e+00

Pos logit contributions
ores+0.107
 relief+0.106
 stret+0.106
 ow+0.102
 Doctor+0.102
Neg logit contributions
's-0.059
 ones-0.053
 color-0.049
 one-0.049
oes-0.046
Token box
Token ID3091
Feature activation+2.00e+00

Pos logit contributions
ores+0.023
 Doctor+0.022
 stret+0.022
 relief+0.022
 awful+0.021
Neg logit contributions
 ones-0.012
's-0.011
 color-0.010
packs-0.009
oes-0.009
Token and
Token ID290
Feature activation-1.89e-01

Pos logit contributions
 ones+0.073
oes+0.068
ety+0.068
First+0.065
 wander+0.064
Neg logit contributions
 deeply-0.050
ores-0.050
 shocked-0.049
 relief-0.048
 awful-0.048
Token share
Token ID2648
Feature activation+1.79e+00

Pos logit contributions
coon+0.008
 awful+0.008
 ow+0.008
paste+0.008
 poisonous+0.008
Neg logit contributions
 ones-0.005
's-0.005
 one-0.004
 color-0.004
packs-0.004
Token the
Token ID262
Feature activation-5.19e-01

Pos logit contributions
 ones+0.058
's+0.053
 color+0.051
packs+0.050
oes+0.050
Neg logit contributions
 relief-0.060
 Doctor-0.059
 ow-0.059
 stret-0.059
 awful-0.059
Token ch
Token ID442
Feature activation-2.43e+00

Pos logit contributions
ores+0.022
 Doctor+0.020
paste+0.020
 awful+0.020
 stret+0.020
Neg logit contributions
 ones-0.015
 color-0.013
's-0.013
 one-0.012
 way-0.011
Tokenocol
Token ID4668
Feature activation-1.11e+01

Pos logit contributions
 Doctor+0.040
 relief+0.039
 stret+0.039
 awful+0.038
 ow+0.038
Neg logit contributions
 ones-0.123
oes-0.115
 color-0.113
packs-0.113
's-0.112
Tokenates
Token ID689
Feature activation+1.07e+00

Pos logit contributions
paste+0.263
ores+0.251
coon+0.244
ocol+0.237
affle+0.229
Neg logit contributions
 ones-0.452
's-0.442
 way-0.436
 one-0.422
 color-0.417
Token.
Token ID13
Feature activation-8.79e-01

Pos logit contributions
 ones+0.031
oes+0.029
 color+0.027
packs+0.027
's+0.027
Neg logit contributions
ores-0.037
paste-0.037
 Doctor-0.036
 awful-0.036
 relief-0.036
Token They
Token ID1119
Feature activation+1.01e+00

Pos logit contributions
 relief+0.042
 Doctor+0.042
ores+0.042
 stret+0.042
 ow+0.041
Neg logit contributions
 ones-0.015
's-0.012
oes-0.012
 color-0.012
packs-0.012
Token are
Token ID389
Feature activation+3.59e-01

Pos logit contributions
 ones+0.024
's+0.023
 color+0.022
 one+0.020
oes+0.020
Neg logit contributions
 Doctor-0.042
 awful-0.042
coon-0.041
ores-0.041
 relief-0.041
Token happy
Token ID3772
Feature activation+2.64e-01

Pos logit contributions
 ones+0.011
 color+0.011
's+0.011
 one+0.010
packs+0.010
Neg logit contributions
coon-0.013
 stret-0.013
ache-0.012
paste-0.012
 Doctor-0.012
Token and
Token ID290
Feature activation+8.54e-01

Pos logit contributions
 ones+0.050
's+0.046
oes+0.044
 color+0.044
packs+0.043
Neg logit contributions
 relief-0.070
 Doctor-0.069
 stret-0.068
ores-0.068
 ow-0.068
Token her
Token ID607
Feature activation-7.13e-02

Pos logit contributions
 ones+0.028
 color+0.026
's+0.025
 one+0.025
 way+0.022
Neg logit contributions
coon-0.031
ores-0.030
 stret-0.030
 awful-0.029
 relief-0.029
Token favourite
Token ID12507
Feature activation+3.32e+00

Pos logit contributions
coon+0.003
ores+0.003
paste+0.003
lic+0.003
 stret+0.003
Neg logit contributions
 ones-0.002
 color-0.002
 way-0.002
 one-0.002
's-0.002
Token,
Token ID11
Feature activation+7.09e-01

Pos logit contributions
 ones+0.106
 color+0.094
's+0.091
 one+0.090
 way+0.082
Neg logit contributions
ores-0.123
 Doctor-0.120
coon-0.119
 stret-0.118
paste-0.118
Token ch
Token ID442
Feature activation-2.02e+00

Pos logit contributions
 ones+0.024
's+0.022
 one+0.022
 color+0.022
 way+0.019
Neg logit contributions
coon-0.025
 relief-0.024
 explosion-0.024
ores-0.024
othe-0.024
Tokenocol
Token ID4668
Feature activation-1.11e+01

Pos logit contributions
 Doctor+0.040
 relief+0.039
 awful+0.039
 stret+0.039
 horrible+0.036
Neg logit contributions
 ones-0.094
packs-0.089
oes-0.088
 color-0.086
 one-0.084
Tokenates
Token ID689
Feature activation+1.18e+00

Pos logit contributions
paste+0.251
ores+0.235
coon+0.228
 ow+0.226
ocol+0.222
Neg logit contributions
 ones-0.468
's-0.456
 way-0.446
 one-0.431
 color-0.426
Token!
Token ID0
Feature activation-4.17e-01

Pos logit contributions
 ones+0.032
's+0.030
 color+0.028
 one+0.027
packs+0.027
Neg logit contributions
 relief-0.049
 Doctor-0.048
 stret-0.047
 ow-0.046
 awful-0.045
Token She
Token ID1375
Feature activation-2.35e+00

Pos logit contributions
 Doctor+0.015
ores+0.015
 relief+0.015
 sorry+0.014
 ow+0.014
Neg logit contributions
 ones-0.012
oes-0.011
packs-0.010
 color-0.010
's-0.010
Token asked
Token ID1965
Feature activation+3.62e+00

Pos logit contributions
 relief+0.106
 stret+0.105
ores+0.105
coon+0.104
 Doctor+0.103
Neg logit contributions
's-0.051
 ones-0.049
 color-0.044
 one-0.041
 traveled-0.039
Token mummy
Token ID38567
Feature activation+2.78e-01

Pos logit contributions
 ones+0.091
 one+0.088
's+0.084
 color+0.082
 One+0.067
Neg logit contributions
 stret-0.165
coon-0.158
 awful-0.157
ache-0.156
paste-0.156
Token It
Token ID632
Feature activation-2.87e-01

Pos logit contributions
 relief+0.011
 Doctor+0.011
ores+0.011
 ow+0.010
 stret+0.010
Neg logit contributions
 ones-0.010
's-0.009
oes-0.009
 color-0.008
packs-0.008
Token was
Token ID373
Feature activation+5.11e-01

Pos logit contributions
 stret+0.015
ocol+0.014
coon+0.014
 Doctor+0.014
 relief+0.013
Neg logit contributions
's-0.006
 ones-0.005
 color-0.005
 one-0.005
 way-0.004
Token sweet
Token ID6029
Feature activation-8.26e-02

Pos logit contributions
 color+0.014
 one+0.014
 ones+0.013
's+0.012
 way+0.012
Neg logit contributions
coon-0.023
 stret-0.023
oke-0.021
ache-0.021
lic-0.020
Token,
Token ID11
Feature activation+1.84e-01

Pos logit contributions
 stret+0.004
coon+0.004
 Doctor+0.004
ores+0.004
 awful+0.004
Neg logit contributions
 ones-0.002
's-0.002
 color-0.002
 one-0.002
oes-0.001
Token ch
Token ID442
Feature activation-2.39e+00

Pos logit contributions
 ones+0.006
 color+0.006
's+0.006
 one+0.006
 way+0.005
Neg logit contributions
coon-0.006
 stret-0.006
 Doctor-0.006
lic-0.006
ache-0.006
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 stret+0.100
 Doctor+0.097
 Dr+0.090
coon+0.090
 brand+0.089
Neg logit contributions
 ones-0.056
oes-0.055
 color-0.053
y-0.051
 one-0.049
Tokenat
Token ID265
Feature activation-3.02e+00

Pos logit contributions
paste+0.291
ores+0.284
 ow+0.279
coon+0.274
 Doctor+0.263
Neg logit contributions
 ones-0.429
's-0.407
 way-0.389
 one-0.384
oes-0.384
Tokeny
Token ID88
Feature activation+9.74e-01

Pos logit contributions
 relief+0.117
 stret+0.114
coon+0.112
 Doctor+0.112
 sorry+0.109
Neg logit contributions
's-0.088
 ones-0.085
 color-0.079
 one-0.077
oes-0.073
Token and
Token ID290
Feature activation+3.51e-01

Pos logit contributions
 ones+0.023
's+0.022
 color+0.020
 one+0.020
 way+0.017
Neg logit contributions
 stret-0.043
 Doctor-0.043
 relief-0.043
coon-0.043
ores-0.042
Token warm
Token ID5814
Feature activation+3.34e-01

Pos logit contributions
 ones+0.011
 one+0.010
 color+0.010
's+0.010
 way+0.008
Neg logit contributions
 stret-0.014
coon-0.014
affle-0.013
 x-0.013
ache-0.012
Token,
Token ID11
Feature activation+1.74e-01

Pos logit contributions
 ones+0.009
's+0.008
 color+0.008
 one+0.008
oes+0.007
Neg logit contributions
 Doctor-0.013
 stret-0.013
coon-0.013
 awful-0.013
ores-0.013
Token,
Token ID11
Feature activation+3.42e-01

Pos logit contributions
 ones+0.043
's+0.038
oes+0.037
 color+0.036
packs+0.036
Neg logit contributions
 relief-0.060
 Doctor-0.060
 stret-0.059
ores-0.059
 ow-0.059
Token cup
Token ID6508
Feature activation-3.13e+00

Pos logit contributions
 ones+0.011
's+0.009
 color+0.009
 one+0.009
packs+0.009
Neg logit contributions
 relief-0.012
ores-0.012
coon-0.012
 ow-0.012
 Doctor-0.012
Tokencakes
Token ID37263
Feature activation-6.01e-01

Pos logit contributions
 relief+0.130
 Doctor+0.119
 til+0.118
 sorry+0.118
 stret+0.117
Neg logit contributions
 ones-0.092
's-0.081
 color-0.078
 one-0.075
oes-0.075
Token and
Token ID290
Feature activation+1.71e-01

Pos logit contributions
 relief+0.025
 Doctor+0.025
ores+0.025
 awful+0.025
 ow+0.025
Neg logit contributions
 ones-0.015
's-0.013
 color-0.012
oes-0.012
 one-0.012
Token ch
Token ID442
Feature activation-2.24e+00

Pos logit contributions
 ones+0.006
's+0.005
 color+0.005
 one+0.005
oes+0.005
Neg logit contributions
 relief-0.006
ores-0.006
coon-0.006
 Doctor-0.006
 awful-0.006
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 Doctor+0.044
 relief+0.042
 awful+0.040
coon+0.040
 horrible+0.039
Neg logit contributions
 ones-0.109
packs-0.104
oes-0.100
 color-0.098
 one-0.096
Tokenates
Token ID689
Feature activation+8.55e-01

Pos logit contributions
paste+0.275
ores+0.266
ocol+0.254
coon+0.254
 stret+0.251
Neg logit contributions
 ones-0.436
's-0.426
 way-0.410
 color-0.408
oes-0.402
Token.
Token ID13
Feature activation-3.40e-01

Pos logit contributions
 ones+0.021
's+0.019
 color+0.018
 one+0.018
packs+0.017
Neg logit contributions
 relief-0.037
 Doctor-0.037
 ow-0.036
 stret-0.035
 awful-0.035
Token
Token ID220
Feature activation+2.18e+00

Pos logit contributions
 relief+0.013
 ow+0.013
 awful+0.013
paste+0.013
ores+0.013
Neg logit contributions
 ones-0.009
's-0.008
oes-0.008
 color-0.007
packs-0.007
Token
Token ID198
Feature activation+9.49e-02

Pos logit contributions
 ones+0.041
's+0.040
packs+0.037
oes+0.037
 color+0.033
Neg logit contributions
 relief-0.105
 ow-0.102
 awful-0.101
 explosion-0.099
coon-0.099
Token
Token ID198
Feature activation+8.13e-02

Pos logit contributions
 ones+0.002
's+0.002
oes+0.002
 color+0.002
packs+0.002
Neg logit contributions
 relief-0.004
 ow-0.004
 awful-0.004
paste-0.004
ores-0.004
Token son
Token ID3367
Feature activation+8.16e-01

Pos logit contributions
ores+0.011
coon+0.011
lic+0.010
othe+0.010
 awful+0.010
Neg logit contributions
 ones-0.008
 color-0.007
 one-0.007
 way-0.007
's-0.007
Token is
Token ID318
Feature activation+6.87e-01

Pos logit contributions
's+0.020
 ones+0.020
 one+0.017
 color+0.017
 way+0.016
Neg logit contributions
 relief-0.035
 stret-0.035
ores-0.035
 awful-0.034
coon-0.034
Token eating
Token ID6600
Feature activation-3.27e+00

Pos logit contributions
 one+0.021
's+0.020
 ones+0.020
 color+0.018
 way+0.018
Neg logit contributions
coon-0.026
lic-0.026
 ow-0.025
paste-0.025
 stret-0.025
Token our
Token ID674
Feature activation+1.72e+00

Pos logit contributions
lic+0.135
coon+0.134
 relief+0.131
 stret+0.130
othe+0.129
Neg logit contributions
 ones-0.098
 one-0.091
's-0.084
 color-0.082
 way-0.067
Token ch
Token ID442
Feature activation-2.34e+00

Pos logit contributions
 ones+0.059
's+0.054
 color+0.051
 one+0.050
 way+0.048
Neg logit contributions
ores-0.061
coon-0.059
 Doctor-0.059
paste-0.058
 relief-0.058
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 relief+0.039
 Doctor+0.038
 awful+0.038
 horrible+0.036
 stret+0.036
Neg logit contributions
 ones-0.121
oes-0.113
 color-0.112
packs-0.111
 one-0.106
Tokenates
Token ID689
Feature activation+9.47e-01

Pos logit contributions
paste+0.274
ores+0.259
coon+0.248
ocol+0.243
ache+0.242
Neg logit contributions
 ones-0.446
's-0.423
 way-0.407
 color-0.407
 one-0.402
Token!"
Token ID2474
Feature activation+6.18e-02

Pos logit contributions
 ones+0.027
's+0.025
 color+0.024
oes+0.024
packs+0.023
Neg logit contributions
 relief-0.036
 Doctor-0.035
 ow-0.035
 stret-0.035
 awful-0.034
Token
Token ID220
Feature activation+2.00e+00

Pos logit contributions
's+0.001
 ones+0.001
 one+0.001
 color+0.001
 One+0.001
Neg logit contributions
coon-0.003
 ow-0.003
 poisonous-0.003
 awful-0.003
paste-0.003
Token
Token ID198
Feature activation-1.78e-01

Pos logit contributions
's+0.034
 ones+0.030
oes+0.027
packs+0.026
 One+0.025
Neg logit contributions
 relief-0.100
 ow-0.099
 awful-0.097
coon-0.097
 poisonous-0.096
Token
Token ID198
Feature activation-1.65e-01

Pos logit contributions
 explosion+0.008
 ow+0.008
 relief+0.007
 awful+0.007
paste+0.007
Neg logit contributions
's-0.004
First-0.004
One-0.004
 ones-0.004
Today-0.003
Token and
Token ID290
Feature activation+6.05e-01

Pos logit contributions
 ones+0.118
's+0.112
 one+0.102
 color+0.100
oes+0.093
Neg logit contributions
 ow-0.233
ores-0.230
 stret-0.229
 Doctor-0.229
 relief-0.229
Token they
Token ID484
Feature activation+3.47e+00

Pos logit contributions
 ones+0.013
's+0.013
 one+0.012
 color+0.011
oes+0.011
Neg logit contributions
 ow-0.026
 stret-0.026
 relief-0.026
 awful-0.026
coon-0.025
Token shared
Token ID4888
Feature activation+1.95e+00

Pos logit contributions
 ones+0.077
's+0.073
 color+0.068
oes+0.065
 one+0.064
Neg logit contributions
 Doctor-0.148
 relief-0.147
 stret-0.147
 ow-0.145
ores-0.145
Token the
Token ID262
Feature activation-2.04e-01

Pos logit contributions
 ones+0.054
 one+0.049
's+0.048
 color+0.048
packs+0.043
Neg logit contributions
coon-0.083
paste-0.080
ache-0.079
 ow-0.079
 stret-0.078
Token ch
Token ID442
Feature activation-2.31e+00

Pos logit contributions
ores+0.006
paste+0.006
 relief+0.006
 stret+0.006
 Doctor+0.006
Neg logit contributions
 ones-0.008
's-0.007
 color-0.007
 one-0.007
oes-0.007
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 relief+0.038
 ow+0.038
 awful+0.036
 Doctor+0.036
 stret+0.035
Neg logit contributions
 ones-0.115
oes-0.110
 color-0.108
 one-0.104
's-0.103
Tokenates
Token ID689
Feature activation+1.16e+00

Pos logit contributions
paste+0.284
ores+0.283
coon+0.266
affle+0.256
 Doctor+0.256
Neg logit contributions
 ones-0.433
's-0.417
 way-0.398
 color-0.391
 one-0.389
Token with
Token ID351
Feature activation+6.00e-01

Pos logit contributions
 ones+0.032
oes+0.028
's+0.027
 color+0.027
packs+0.027
Neg logit contributions
ores-0.043
 relief-0.043
 Doctor-0.043
 stret-0.042
 awful-0.042
Token everyone
Token ID2506
Feature activation+7.44e-01

Pos logit contributions
 ones+0.017
's+0.016
 one+0.015
 color+0.015
packs+0.014
Neg logit contributions
paste-0.022
 stret-0.022
ache-0.022
 ow-0.022
coon-0.022
Token in
Token ID287
Feature activation+1.35e+00

Pos logit contributions
 ones+0.020
's+0.020
packs+0.017
 color+0.017
 one+0.017
Neg logit contributions
 relief-0.029
 stret-0.029
 Doctor-0.029
 ow-0.029
coon-0.028
Token the
Token ID262
Feature activation-1.65e-01

Pos logit contributions
 ones+0.040
 color+0.036
 one+0.035
packs+0.035
's+0.034
Neg logit contributions
paste-0.051
 stret-0.051
ache-0.050
 awful-0.050
coon-0.050
Token a
Token ID257
Feature activation-6.22e-01

Pos logit contributions
coon+0.015
 stret+0.015
 ow+0.015
ores+0.015
 relief+0.015
Neg logit contributions
 ones-0.011
's-0.010
 one-0.010
 color-0.010
packs-0.009
Token big
Token ID1263
Feature activation-3.60e+00

Pos logit contributions
ores+0.030
othe+0.026
coon+0.025
eling+0.025
ocol+0.025
Neg logit contributions
 color-0.017
 ones-0.016
 way-0.015
 one-0.014
's-0.014
Token box
Token ID3091
Feature activation+2.18e+00

Pos logit contributions
ores+0.175
lic+0.164
paste+0.161
othe+0.158
coon+0.157
Neg logit contributions
 ones-0.088
 color-0.080
 one-0.079
's-0.071
 way-0.070
Token of
Token ID286
Feature activation-1.67e+00

Pos logit contributions
 ones+0.066
's+0.065
 color+0.061
 one+0.060
oes+0.058
Neg logit contributions
 Doctor-0.078
 relief-0.077
 stret-0.076
ores-0.075
 awful-0.074
Token ch
Token ID442
Feature activation-2.16e+00

Pos logit contributions
ores+0.073
lic+0.068
ache+0.067
paste+0.067
 Doctor+0.066
Neg logit contributions
 ones-0.048
 color-0.044
 one-0.040
's-0.038
packs-0.035
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 Doctor+0.029
 stret+0.029
 relief+0.027
 awful+0.025
 ow+0.025
Neg logit contributions
 ones-0.115
packs-0.109
oes-0.107
's-0.106
 color-0.106
Tokenates
Token ID689
Feature activation+1.36e+00

Pos logit contributions
paste+0.247
ocol+0.232
coon+0.224
ores+0.221
 ow+0.221
Neg logit contributions
 ones-0.469
's-0.454
 way-0.449
 one-0.439
 One-0.432
Token!
Token ID0
Feature activation-2.34e-01

Pos logit contributions
 ones+0.043
's+0.041
 color+0.038
packs+0.038
 one+0.038
Neg logit contributions
 relief-0.047
 Doctor-0.047
 stret-0.046
 ow-0.045
ores-0.045
Token
Token ID220
Feature activation+2.23e+00

Pos logit contributions
 Doctor+0.009
 relief+0.009
ores+0.009
 ow+0.009
 stret+0.009
Neg logit contributions
 ones-0.006
oes-0.005
 color-0.005
's-0.005
packs-0.005
Token
Token ID198
Feature activation+2.81e-01

Pos logit contributions
's+0.038
 ones+0.036
packs+0.032
oes+0.031
 One+0.030
Neg logit contributions
 relief-0.111
 ow-0.109
 explosion-0.109
 awful-0.108
 stret-0.107
Token
Token ID198
Feature activation+2.63e-01

Pos logit contributions
 ones+0.006
's+0.005
oes+0.005
 color+0.005
packs+0.005
Neg logit contributions
 relief-0.013
 awful-0.012
 ow-0.012
paste-0.012
 stret-0.012
Token was
Token ID373
Feature activation+1.05e+00

Pos logit contributions
 stret+0.015
 Doctor+0.015
 relief+0.015
paste+0.015
 ow+0.015
Neg logit contributions
 ones-0.009
's-0.008
 color-0.008
 one-0.007
packs-0.007
Token a
Token ID257
Feature activation-6.45e-01

Pos logit contributions
 one+0.023
's+0.020
 color+0.019
 ones+0.018
 way+0.015
Neg logit contributions
coon-0.052
affle-0.051
ache-0.049
 stret-0.049
oke-0.048
Token box
Token ID3091
Feature activation+2.11e+00

Pos logit contributions
ores+0.029
coon+0.026
paste+0.025
eling+0.025
ocol+0.025
Neg logit contributions
 color-0.018
 ones-0.017
 way-0.017
 one-0.016
's-0.016
Token of
Token ID286
Feature activation-1.62e+00

Pos logit contributions
 ones+0.061
's+0.057
 color+0.054
oes+0.054
packs+0.053
Neg logit contributions
 Doctor-0.076
 relief-0.076
 ow-0.075
ores-0.074
 stret-0.074
Token ch
Token ID442
Feature activation-2.14e+00

Pos logit contributions
ores+0.068
ache+0.065
 stret+0.065
coon+0.065
paste+0.064
Neg logit contributions
 ones-0.047
 color-0.044
 one-0.040
's-0.039
 way-0.034
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 stret+0.033
 Doctor+0.033
 relief+0.032
 ow+0.030
coon+0.029
Neg logit contributions
 ones-0.108
packs-0.102
oes-0.101
's-0.100
 color-0.099
Tokenates
Token ID689
Feature activation+1.27e+00

Pos logit contributions
paste+0.260
coon+0.246
ocol+0.244
ores+0.239
 ow+0.232
Neg logit contributions
 ones-0.454
's-0.432
 way-0.431
 one-0.419
 color-0.417
Token!
Token ID0
Feature activation-4.74e-01

Pos logit contributions
 ones+0.034
's+0.031
 color+0.029
packs+0.029
oes+0.029
Neg logit contributions
 relief-0.051
 Doctor-0.050
 ow-0.049
 stret-0.049
ores-0.049
Token Alice
Token ID14862
Feature activation+1.62e+00

Pos logit contributions
 relief+0.016
 Doctor+0.016
 stret+0.016
ores+0.016
 ow+0.016
Neg logit contributions
 ones-0.015
oes-0.014
 color-0.013
's-0.013
packs-0.013
Token and
Token ID290
Feature activation+7.54e-01

Pos logit contributions
 ones+0.037
's+0.036
 color+0.032
oes+0.030
 one+0.030
Neg logit contributions
 relief-0.069
ores-0.069
 stret-0.068
 Doctor-0.068
 awful-0.068
Token her
Token ID607
Feature activation-8.46e-02

Pos logit contributions
 ones+0.022
oes+0.021
's+0.020
packs+0.019
 color+0.019
Neg logit contributions
 relief-0.027
 Doctor-0.027
 ow-0.026
 stret-0.026
 sorry-0.026
Token a
Token ID257
Feature activation-7.50e-01

Pos logit contributions
 ones+0.017
 one+0.016
's+0.014
 color+0.014
packs+0.013
Neg logit contributions
 stret-0.026
coon-0.025
lic-0.025
paste-0.025
ores-0.025
Token big
Token ID1263
Feature activation-3.66e+00

Pos logit contributions
ores+0.031
coon+0.027
paste+0.027
 ow+0.027
ocol+0.027
Neg logit contributions
 way-0.022
 ones-0.021
's-0.021
 color-0.020
 one-0.020
Token box
Token ID3091
Feature activation+2.11e+00

Pos logit contributions
ores+0.154
paste+0.147
lic+0.145
 awful+0.140
coon+0.139
Neg logit contributions
 ones-0.100
 one-0.094
 color-0.091
 way-0.089
's-0.087
Token of
Token ID286
Feature activation-1.72e+00

Pos logit contributions
 ones+0.065
's+0.060
oes+0.058
 color+0.057
packs+0.056
Neg logit contributions
 relief-0.073
 Doctor-0.073
 ow-0.072
ores-0.071
 stret-0.071
Token ch
Token ID442
Feature activation-2.18e+00

Pos logit contributions
ores+0.074
ache+0.071
lic+0.071
paste+0.069
coon+0.069
Neg logit contributions
 ones-0.048
 color-0.044
 one-0.041
's-0.038
packs-0.036
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 Doctor+0.039
 stret+0.036
 relief+0.036
 ow+0.035
 awful+0.034
Neg logit contributions
 ones-0.106
packs-0.102
oes-0.100
's-0.098
 color-0.096
Tokenates
Token ID689
Feature activation+1.12e+00

Pos logit contributions
paste+0.237
ocol+0.226
coon+0.218
ores+0.214
 ow+0.205
Neg logit contributions
 ones-0.478
's-0.466
 way-0.459
 color-0.449
 one-0.448
Token!
Token ID0
Feature activation-1.03e+00

Pos logit contributions
 ones+0.030
's+0.028
oes+0.026
packs+0.026
 color+0.026
Neg logit contributions
 relief-0.044
 Doctor-0.043
 ow-0.043
 stret-0.043
 awful-0.042
Token<|endoftext|>
Token ID50256
Feature activation-4.64e+00

Pos logit contributions
 awful+0.048
 ow+0.048
 stret+0.048
coon+0.047
paste+0.047
Neg logit contributions
 ones-0.019
's-0.018
oes-0.016
packs-0.016
 one-0.015
TokenOnce
Token ID7454
Feature activation-2.48e-01

Pos logit contributions
 relief+0.155
 Doctor+0.153
ores+0.152
 stret+0.152
 ow+0.151
Neg logit contributions
 ones-0.147
's-0.133
oes-0.131
 color-0.129
packs-0.128
Token upon
Token ID2402
Feature activation-6.59e-01

Pos logit contributions
 relief+0.010
 Doctor+0.010
ores+0.010
 ow+0.010
 stret+0.010
Neg logit contributions
 ones-0.006
's-0.005
oes-0.005
 color-0.005
packs-0.005
Token looking
Token ID2045
Feature activation-1.94e+00

Pos logit contributions
 relief+0.056
 awful+0.056
coon+0.056
 stret+0.056
paste+0.055
Neg logit contributions
 ones-0.042
 one-0.038
 color-0.036
's-0.035
packs-0.033
Token at
Token ID379
Feature activation+2.01e+00

Pos logit contributions
coon+0.085
 ow+0.081
 stret+0.080
 explosion+0.080
paste+0.078
Neg logit contributions
 ones-0.050
's-0.047
 one-0.046
packs-0.040
oes-0.037
Token all
Token ID477
Feature activation+5.48e+00

Pos logit contributions
 ones+0.063
's+0.057
oes+0.056
 color+0.056
packs+0.054
Neg logit contributions
 relief-0.069
 stret-0.068
 Doctor-0.068
ores-0.067
 awful-0.067
Token the
Token ID262
Feature activation+2.02e-01

Pos logit contributions
 ones+0.196
oes+0.182
's+0.177
packs+0.175
 color+0.173
Neg logit contributions
 Doctor-0.157
 relief-0.154
ores-0.154
 awful-0.152
 ow-0.148
Token ch
Token ID442
Feature activation-2.03e+00

Pos logit contributions
 ones+0.007
 color+0.006
's+0.006
 one+0.005
 way+0.005
Neg logit contributions
ores-0.008
 relief-0.007
coon-0.007
paste-0.007
 sorry-0.007
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 relief+0.033
coon+0.031
 stret+0.031
 awful+0.029
 horrible+0.029
Neg logit contributions
 ones-0.104
oes-0.097
packs-0.096
 color-0.094
's-0.093
Tokenates
Token ID689
Feature activation+1.22e+00

Pos logit contributions
ores+0.245
paste+0.245
ocol+0.227
coon+0.220
affle+0.216
Neg logit contributions
 ones-0.462
's-0.451
 way-0.441
 one-0.426
 color-0.425
Token,
Token ID11
Feature activation+6.56e-01

Pos logit contributions
 ones+0.030
's+0.028
 color+0.026
 one+0.025
oes+0.025
Neg logit contributions
 relief-0.052
 stret-0.051
 Doctor-0.051
 ow-0.050
coon-0.050
Token Lex
Token ID17210
Feature activation-2.21e+00

Pos logit contributions
 ones+0.022
oes+0.022
packs+0.021
's+0.021
 color+0.020
Neg logit contributions
 Doctor-0.020
 relief-0.019
 ow-0.019
ores-0.019
 awful-0.019
Tokeni
Token ID72
Feature activation-1.72e+00

Pos logit contributions
 ow+0.074
 Doctor+0.074
 relief+0.073
 poisonous+0.073
ores+0.073
Neg logit contributions
 ones-0.064
oes-0.059
 color-0.058
packs-0.058
First-0.056
Token chose
Token ID7690
Feature activation+3.82e+00

Pos logit contributions
 relief+0.078
ores+0.076
 stret+0.075
 Doctor+0.074
 ow+0.074
Neg logit contributions
's-0.038
 ones-0.037
 color-0.031
 one-0.031
oes-0.029
Token park
Token ID3952
Feature activation+3.69e-01

Pos logit contributions
paste+0.008
ores+0.008
 Doctor+0.008
ache+0.008
coon+0.008
Neg logit contributions
 ones-0.003
's-0.003
 color-0.003
 one-0.003
 way-0.002
Token.
Token ID13
Feature activation-1.11e-01

Pos logit contributions
 ones+0.009
's+0.008
 color+0.008
packs+0.008
oes+0.008
Neg logit contributions
 relief-0.015
 Doctor-0.015
 ow-0.015
 stret-0.015
ores-0.015
Token When
Token ID1649
Feature activation-3.09e+00

Pos logit contributions
 relief+0.004
 Doctor+0.004
ores+0.004
 stret+0.004
 awful+0.004
Neg logit contributions
 ones-0.003
's-0.003
oes-0.003
 color-0.003
packs-0.003
Token the
Token ID262
Feature activation-1.32e-01

Pos logit contributions
 stret+0.121
 relief+0.121
 Doctor+0.120
 ow+0.120
ores+0.119
Neg logit contributions
 ones-0.081
's-0.072
 color-0.069
oes-0.069
packs-0.068
Token ch
Token ID442
Feature activation-2.28e+00

Pos logit contributions
ores+0.004
 relief+0.004
 stret+0.004
 Doctor+0.004
paste+0.004
Neg logit contributions
 ones-0.005
's-0.004
 color-0.004
oes-0.004
 one-0.004
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 relief+0.045
 stret+0.044
 awful+0.042
 Doctor+0.042
 horrible+0.042
Neg logit contributions
 ones-0.105
oes-0.102
 color-0.098
 one-0.096
packs-0.095
Tokenates
Token ID689
Feature activation+1.16e+00

Pos logit contributions
ores+0.285
paste+0.282
 ow+0.269
coon+0.267
 Doctor+0.258
Neg logit contributions
 ones-0.426
's-0.410
 way-0.388
 one-0.382
 color-0.382
Token were
Token ID547
Feature activation+7.46e-01

Pos logit contributions
's+0.022
 ones+0.022
 color+0.019
 one+0.018
oes+0.017
Neg logit contributions
 stret-0.054
 relief-0.054
 Doctor-0.054
ores-0.053
 awful-0.053
Token all
Token ID477
Feature activation+5.29e+00

Pos logit contributions
's+0.020
 ones+0.020
 one+0.020
 color+0.020
 way+0.019
Neg logit contributions
coon-0.030
paste-0.028
 stret-0.028
 Doctor-0.028
 relief-0.027
Token gone
Token ID3750
Feature activation-2.93e+00

Pos logit contributions
 ones+0.159
's+0.155
 color+0.147
 one+0.141
oes+0.138
Neg logit contributions
coon-0.189
 Doctor-0.188
 stret-0.188
paste-0.187
 relief-0.183
Token,
Token ID11
Feature activation+2.89e-01

Pos logit contributions
 Doctor+0.141
paste+0.141
coon+0.141
ores+0.140
 relief+0.139
Neg logit contributions
 ones-0.056
's-0.049
 one-0.045
packs-0.042
 color-0.041
Token was
Token ID373
Feature activation+1.16e+00

Pos logit contributions
 ones+0.062
's+0.055
oes+0.055
 color+0.054
packs+0.053
Neg logit contributions
 relief-0.073
 Doctor-0.073
ores-0.072
 ow-0.072
 stret-0.072
Token a
Token ID257
Feature activation-6.34e-01

Pos logit contributions
 ones+0.026
 one+0.026
 color+0.024
's+0.023
 way+0.019
Neg logit contributions
 stret-0.053
affle-0.053
coon-0.053
ache-0.051
ocol-0.050
Token box
Token ID3091
Feature activation+2.13e+00

Pos logit contributions
ores+0.028
paste+0.025
coon+0.025
ocol+0.024
othe+0.024
Neg logit contributions
 color-0.019
 way-0.018
 ones-0.017
 one-0.017
's-0.016
Token of
Token ID286
Feature activation-1.40e+00

Pos logit contributions
 ones+0.064
's+0.061
 color+0.058
 one+0.057
oes+0.056
Neg logit contributions
 relief-0.076
 Doctor-0.076
 stret-0.074
 ow-0.074
ores-0.074
Token ch
Token ID442
Feature activation-2.17e+00

Pos logit contributions
ores+0.061
ache+0.060
coon+0.057
paste+0.057
lic+0.057
Neg logit contributions
 ones-0.038
 color-0.036
 one-0.032
's-0.029
packs-0.027
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 Doctor+0.047
 stret+0.045
 relief+0.043
 ow+0.041
coon+0.041
Neg logit contributions
 ones-0.099
packs-0.094
oes-0.092
's-0.090
 color-0.090
Tokenates
Token ID689
Feature activation+1.14e+00

Pos logit contributions
paste+0.264
ocol+0.245
ores+0.243
coon+0.238
 ow+0.230
Neg logit contributions
 ones-0.456
's-0.437
 way-0.434
 color-0.420
 one-0.419
Token.
Token ID13
Feature activation+8.67e-02

Pos logit contributions
 ones+0.032
's+0.030
 color+0.028
 one+0.028
packs+0.027
Neg logit contributions
 relief-0.045
 Doctor-0.044
 stret-0.043
 ow-0.043
ores-0.042
Token "
Token ID366
Feature activation+2.11e+00

Pos logit contributions
 ones+0.002
's+0.002
 One+0.002
 one+0.002
oes+0.001
Neg logit contributions
 ow-0.004
 poisonous-0.004
 awful-0.004
 relief-0.004
ache-0.004
TokenFor
Token ID1890
Feature activation-2.00e+00

Pos logit contributions
 ones+0.061
's+0.057
packs+0.057
oes+0.057
 way+0.055
Neg logit contributions
paste-0.071
 relief-0.071
coon-0.070
 Doctor-0.069
ores-0.069
Token me
Token ID502
Feature activation-4.53e+00

Pos logit contributions
 stret+0.082
ache+0.080
paste+0.080
coon+0.077
ores+0.077
Neg logit contributions
 ones-0.054
 one-0.054
 color-0.052
's-0.049
 way-0.045
Token started
Token ID2067
Feature activation+2.76e+00

Pos logit contributions
 ones+0.018
's+0.016
 color+0.015
packs+0.015
oes+0.015
Neg logit contributions
 relief-0.028
 awful-0.028
 ow-0.028
 Doctor-0.027
 stret-0.027
Token eating
Token ID6600
Feature activation-3.20e+00

Pos logit contributions
 ones+0.071
oes+0.062
's+0.062
 color+0.060
packs+0.059
Neg logit contributions
 relief-0.108
 Doctor-0.108
ores-0.107
 stret-0.107
 ow-0.107
Token the
Token ID262
Feature activation+1.79e-01

Pos logit contributions
coon+0.151
ache+0.143
 relief+0.139
 explosion+0.139
lic+0.138
Neg logit contributions
 ones-0.089
 one-0.079
 color-0.069
's-0.068
packs-0.059
Token delicious
Token ID12625
Feature activation-2.61e+00

Pos logit contributions
 ones+0.007
's+0.006
 one+0.006
 color+0.006
 way+0.005
Neg logit contributions
ores-0.006
 relief-0.006
 Doctor-0.006
coon-0.006
 sorry-0.006
Token ch
Token ID442
Feature activation-2.15e+00

Pos logit contributions
ores+0.097
coon+0.093
 sorry+0.091
 Doctor+0.090
 relief+0.090
Neg logit contributions
 ones-0.093
 color-0.078
's-0.077
 one-0.076
 way-0.071
Tokenocol
Token ID4668
Feature activation-1.10e+01

Pos logit contributions
 relief+0.036
 Doctor+0.033
 stret+0.032
 awful+0.032
 ow+0.031
Neg logit contributions
 ones-0.110
oes-0.102
packs-0.100
 color-0.099
's-0.098
Tokenates
Token ID689
Feature activation+1.23e+00

Pos logit contributions
paste+0.264
ores+0.263
coon+0.247
ache+0.234
ocol+0.232
Neg logit contributions
 ones-0.455
's-0.429
 way-0.420
 color-0.417
 one-0.415
Token.
Token ID13
Feature activation+2.09e-01

Pos logit contributions
 ones+0.033
's+0.030
oes+0.028
 color+0.028
packs+0.028
Neg logit contributions
 relief-0.049
 Doctor-0.048
 ow-0.048
 stret-0.047
ores-0.047
Token She
Token ID1375
Feature activation-2.53e+00

Pos logit contributions
 ones+0.005
oes+0.005
 color+0.004
's+0.004
packs+0.004
Neg logit contributions
 relief-0.008
 Doctor-0.008
ores-0.008
 stret-0.008
 sorry-0.008
Token was
Token ID373
Feature activation+8.73e-01

Pos logit contributions
 stret+0.121
 relief+0.121
ores+0.119
 sorry+0.118
coon+0.118
Neg logit contributions
's-0.054
 ones-0.046
 color-0.043
 one-0.041
 traveled-0.038
Token so
Token ID523
Feature activation-4.70e-01

Pos logit contributions
's+0.021
 one+0.019
 color+0.017
 way+0.016
 ones+0.016
Neg logit contributions
coon-0.042
 stret-0.040
affle-0.038
paste-0.038
ache-0.038
TokenOnce
Token ID7454
Feature activation-3.40e-01

Pos logit contributions
 Doctor+0.179
ores+0.178
 relief+0.178
 ow+0.175
 stret+0.174
Neg logit contributions
 ones-0.137
oes-0.122
's-0.119
 color-0.118
packs-0.117
Token upon
Token ID2402
Feature activation-6.64e-01

Pos logit contributions
ores+0.012
 Doctor+0.012
 ow+0.012
 sorry+0.012
 relief+0.012
Neg logit contributions
 ones-0.010
 color-0.008
's-0.008
oes-0.008
packs-0.008
Token a
Token ID257
Feature activation-2.71e-01

Pos logit contributions
 relief+0.020
 ow+0.020
 Doctor+0.020
ores+0.020
 awful+0.019
Neg logit contributions
 ones-0.023
oes-0.022
packs-0.021
's-0.020
 color-0.020
Token time
Token ID640
Feature activation+3.21e+00

Pos logit contributions
 ow+0.010
 relief+0.010
ores+0.010
 stret+0.010
 awful+0.010
Neg logit contributions
 ones-0.008
oes-0.007
packs-0.007
 color-0.007
's-0.006
Token,
Token ID11
Feature activation+8.28e-01

Pos logit contributions
 ones+0.116
oes+0.108
 pads+0.104
First+0.102
 traveled+0.102
Neg logit contributions
 sorry-0.096
 ow-0.094
 Doctor-0.087
ores-0.085
othe-0.084
Token there
Token ID612
Feature activation+6.56e+00

Pos logit contributions
 ones+0.024
's+0.021
oes+0.021
 color+0.021
packs+0.021
Neg logit contributions
 relief-0.030
 Doctor-0.030
ores-0.029
 ow-0.029
 stret-0.029
Token was
Token ID373
Feature activation+1.13e+00

Pos logit contributions
 ones+0.178
oes+0.160
packs+0.153
's+0.146
 color+0.145
Neg logit contributions
 ow-0.247
 relief-0.242
 Doctor-0.242
ores-0.239
 awful-0.236
Token a
Token ID257
Feature activation-2.98e-01

Pos logit contributions
 ones+0.043
oes+0.042
packs+0.038
's+0.038
 color+0.037
Neg logit contributions
 Doctor-0.031
ores-0.031
 ow-0.031
 relief-0.030
 awful-0.030
Token tall
Token ID7331
Feature activation-1.48e+00

Pos logit contributions
paste+0.012
ores+0.012
 awful+0.012
othe+0.012
 relief+0.012
Neg logit contributions
 ones-0.008
 color-0.008
's-0.007
 one-0.007
 way-0.006
Token tree
Token ID5509
Feature activation-4.27e-01

Pos logit contributions
ores+0.060
paste+0.060
 awful+0.058
 relief+0.058
othe+0.057
Neg logit contributions
 ones-0.042
 color-0.038
 one-0.037
's-0.036
 way-0.033
Token.
Token ID13
Feature activation+1.12e+00

Pos logit contributions
ores+0.012
 ow+0.012
 Doctor+0.012
 poisonous+0.012
 awful+0.012
Neg logit contributions
 ones-0.015
packs-0.013
oes-0.013
First-0.013
 color-0.013
Token He
Token ID679
Feature activation-1.97e+00

Pos logit contributions
 ones+0.013
's+0.012
oes+0.011
 one+0.011
 color+0.011
Neg logit contributions
 relief-0.023
 awful-0.023
 ow-0.023
 stret-0.023
paste-0.023
Token looked
Token ID3114
Feature activation+8.21e-01

Pos logit contributions
ores+0.091
paste+0.089
 relief+0.088
coon+0.088
 stret+0.088
Neg logit contributions
's-0.044
 ones-0.037
 wandered-0.036
 one-0.035
 traveled-0.033
Token around
Token ID1088
Feature activation+1.75e+00

Pos logit contributions
's+0.019
 one+0.018
 ones+0.018
 way+0.014
 color+0.013
Neg logit contributions
paste-0.038
coon-0.038
ache-0.036
 Doctor-0.036
 stret-0.036
Token,
Token ID11
Feature activation+5.10e-01

Pos logit contributions
 ones+0.032
 one+0.031
's+0.026
 color+0.023
 way+0.020
Neg logit contributions
paste-0.095
ache-0.092
affle-0.092
lic-0.091
coon-0.091
Token but
Token ID475
Feature activation+9.31e-01

Pos logit contributions
 ones+0.013
's+0.012
 one+0.012
 color+0.011
oes+0.010
Neg logit contributions
paste-0.021
coon-0.020
 relief-0.020
 ow-0.020
 Doctor-0.020
Token all
Token ID477
Feature activation+5.78e+00

Pos logit contributions
's+0.025
 one+0.025
 ones+0.025
 color+0.021
 wandered+0.020
Neg logit contributions
 stret-0.039
lic-0.036
coon-0.036
affle-0.036
 relief-0.035
Token he
Token ID339
Feature activation-1.06e+00

Pos logit contributions
's+0.161
 ones+0.160
 one+0.149
 way+0.141
 color+0.138
Neg logit contributions
 stret-0.232
paste-0.225
coon-0.218
ache-0.218
lic-0.216
Token could
Token ID714
Feature activation+1.07e+00

Pos logit contributions
 stret+0.052
 relief+0.049
ores+0.048
ocol+0.047
 Doctor+0.047
Neg logit contributions
's-0.023
 one-0.019
 ones-0.019
 wandered-0.019
 decided-0.018
Token see
Token ID766
Feature activation-2.76e+00

Pos logit contributions
's+0.029
 ones+0.028
 one+0.027
 color+0.026
oes+0.026
Neg logit contributions
paste-0.041
 relief-0.041
ocol-0.040
 stret-0.040
coon-0.040
Token were
Token ID547
Feature activation+1.00e+00

Pos logit contributions
 stret+0.114
affle+0.112
ores+0.111
 relief+0.110
paste+0.109
Neg logit contributions
 ones-0.084
 one-0.075
's-0.074
 color-0.068
 way-0.063
Token shadows
Token ID16187
Feature activation-2.20e+00

Pos logit contributions
 ones+0.031
 one+0.030
's+0.029
 way+0.027
 color+0.027
Neg logit contributions
affle-0.037
 stret-0.037
coon-0.036
ocol-0.035
paste-0.035
Token play
Token ID711
Feature activation-1.50e+00

Pos logit contributions
ores+0.015
 ow+0.015
 relief+0.015
 awful+0.015
 Doctor+0.015
Neg logit contributions
 ones-0.011
's-0.010
 color-0.010
oes-0.010
packs-0.009
Token another
Token ID1194
Feature activation+2.30e-01

Pos logit contributions
 ow+0.072
ache+0.070
 Doctor+0.067
 ogre+0.067
 cavity+0.067
Neg logit contributions
 ones-0.034
 one-0.031
's-0.030
 color-0.026
Let-0.020
Token game
Token ID983
Feature activation+2.68e+00

Pos logit contributions
 ones+0.003
oes+0.003
's+0.003
packs+0.003
 color+0.003
Neg logit contributions
 relief-0.011
 Doctor-0.011
 stret-0.011
 ow-0.011
 awful-0.011
Token.
Token ID13
Feature activation-1.08e+00

Pos logit contributions
 ones+0.073
's+0.065
 color+0.061
oes+0.061
 one+0.061
Neg logit contributions
 relief-0.106
 ow-0.105
 stret-0.105
 Doctor-0.104
 awful-0.103
Token They
Token ID1119
Feature activation+1.11e+00

Pos logit contributions
 relief+0.048
 stret+0.047
 awful+0.047
 ow+0.047
 Doctor+0.047
Neg logit contributions
 ones-0.023
's-0.019
oes-0.019
 color-0.019
packs-0.018
Token agreed
Token ID4987
Feature activation+4.93e+00

Pos logit contributions
 ones+0.029
oes+0.026
's+0.025
packs+0.025
 color+0.024
Neg logit contributions
 relief-0.043
 Doctor-0.043
ores-0.042
 awful-0.042
 ow-0.042
Token and
Token ID290
Feature activation-3.98e-01

Pos logit contributions
 ones+0.113
's+0.107
 color+0.097
 one+0.096
oes+0.091
Neg logit contributions
 ow-0.208
 relief-0.208
 Doctor-0.207
coon-0.207
 stret-0.205
Token went
Token ID1816
Feature activation-1.94e+00

Pos logit contributions
 relief+0.015
 Doctor+0.015
ores+0.015
 ow+0.014
 awful+0.014
Neg logit contributions
 ones-0.011
oes-0.010
's-0.010
packs-0.010
 color-0.010
Token back
Token ID736
Feature activation-2.29e+00

Pos logit contributions
ores+0.074
 Doctor+0.074
 stret+0.074
 relief+0.073
 sorry+0.072
Neg logit contributions
 ones-0.051
oes-0.046
 color-0.045
's-0.045
packs-0.043
Token to
Token ID284
Feature activation-5.13e-01

Pos logit contributions
 Doctor+0.097
ores+0.096
 stret+0.095
 relief+0.095
 ow+0.093
Neg logit contributions
 ones-0.052
oes-0.044
 color-0.044
's-0.043
packs-0.041
Token their
Token ID511
Feature activation+1.55e+00

Pos logit contributions
paste+0.023
 awful+0.022
lic+0.022
coon+0.021
ache+0.021
Neg logit contributions
 ones-0.012
's-0.011
 color-0.011
 one-0.011
packs-0.010
Token were
Token ID547
Feature activation+1.07e+00

Pos logit contributions
 ones+0.084
oes+0.073
's+0.070
packs+0.068
 color+0.066
Neg logit contributions
 relief-0.154
ores-0.152
 ow-0.152
 Doctor-0.152
 stret-0.150
Token making
Token ID1642
Feature activation-4.44e-01

Pos logit contributions
's+0.031
 ones+0.030
 one+0.029
 color+0.029
packs+0.028
Neg logit contributions
coon-0.039
 relief-0.039
 awful-0.037
paste-0.037
 horrible-0.036
Token cookies
Token ID14746
Feature activation-2.33e-01

Pos logit contributions
coon+0.019
lic+0.017
ache+0.017
paste+0.016
 poisonous+0.016
Neg logit contributions
 ones-0.013
's-0.013
 one-0.012
 color-0.012
 way-0.011
Token,
Token ID11
Feature activation+5.55e-02

Pos logit contributions
 relief+0.012
coon+0.011
 ow+0.011
 Doctor+0.011
 stret+0.011
Neg logit contributions
 ones-0.004
's-0.004
 color-0.004
 one-0.003
oes-0.003
Token Lily
Token ID20037
Feature activation+3.71e-01

Pos logit contributions
 ones+0.002
 one+0.002
 color+0.001
's+0.001
packs+0.001
Neg logit contributions
ache-0.002
 relief-0.002
coon-0.002
 stret-0.002
 ow-0.002
Token's
Token ID338
Feature activation+6.80e+00

Pos logit contributions
 ones+0.007
's+0.006
 color+0.005
oes+0.005
packs+0.005
Neg logit contributions
 relief-0.018
ores-0.017
 awful-0.017
 stret-0.017
 Doctor-0.017
Token mom
Token ID1995
Feature activation+5.57e-01

Pos logit contributions
 ones+0.182
's+0.172
 color+0.168
 one+0.162
 way+0.152
Neg logit contributions
ores-0.286
lic-0.282
coon-0.281
othe-0.268
paste-0.265
Tokenmy
Token ID1820
Feature activation-1.56e+00

Pos logit contributions
's+0.012
 ones+0.011
 color+0.010
oes+0.009
 one+0.009
Neg logit contributions
 relief-0.026
ores-0.025
 awful-0.025
 ow-0.025
 stret-0.025
Token asked
Token ID1965
Feature activation+3.17e+00

Pos logit contributions
 relief+0.075
 awful+0.071
 ow+0.070
ores+0.070
ocol+0.069
Neg logit contributions
's-0.035
 ones-0.030
 color-0.027
 one-0.027
 wandered-0.024
Token her
Token ID607
Feature activation-6.13e-01

Pos logit contributions
 ones+0.067
 one+0.060
's+0.060
 color+0.052
packs+0.042
Neg logit contributions
 relief-0.152
paste-0.151
affle-0.150
 awful-0.149
 ow-0.149
Token to
Token ID284
Feature activation-2.70e-01

Pos logit contributions
 ow+0.030
 relief+0.030
coon+0.030
 awful+0.029
 explosion+0.029
Neg logit contributions
's-0.013
 one-0.012
 ones-0.012
 color-0.010
packs-0.009
Token envelope
Token ID22878
Feature activation-2.58e+00

Pos logit contributions
 relief+0.016
 Doctor+0.016
 stret+0.016
 ow+0.015
 awful+0.015
Neg logit contributions
 ones-0.015
oes-0.013
's-0.013
packs-0.013
 color-0.013
Token and
Token ID290
Feature activation-2.31e-01

Pos logit contributions
 relief+0.093
 Doctor+0.092
ores+0.092
paste+0.092
 awful+0.091
Neg logit contributions
 ones-0.075
oes-0.066
's-0.064
packs-0.063
 color-0.062
Token says
Token ID1139
Feature activation-8.64e-01

Pos logit contributions
coon+0.009
ores+0.009
 stret+0.008
 awful+0.008
lic+0.008
Neg logit contributions
 ones-0.007
's-0.007
 one-0.006
 color-0.006
packs-0.006
Token "
Token ID366
Feature activation+1.32e+00

Pos logit contributions
 relief+0.027
 Doctor+0.026
 stret+0.026
ores+0.026
 awful+0.026
Neg logit contributions
 ones-0.030
's-0.027
oes-0.027
 color-0.027
packs-0.026
TokenIt
Token ID1026
Feature activation-6.94e-01

Pos logit contributions
 ones+0.040
 way+0.038
packs+0.037
oes+0.037
's+0.036
Neg logit contributions
paste-0.044
coon-0.042
ores-0.042
 relief-0.041
 ow-0.040
Token's
Token ID338
Feature activation+6.62e+00

Pos logit contributions
 Doctor+0.028
 ow+0.028
paste+0.028
 awful+0.028
ores+0.028
Neg logit contributions
 ones-0.016
oes-0.014
packs-0.014
 color-0.013
First-0.013
Token a
Token ID257
Feature activation-1.32e+00

Pos logit contributions
 color+0.183
's+0.181
 ones+0.175
 one+0.171
packs+0.154
Neg logit contributions
 stret-0.265
coon-0.262
ache-0.253
 Doctor-0.249
 explosion-0.248
Token card
Token ID2657
Feature activation+2.38e-01

Pos logit contributions
ores+0.050
 Doctor+0.048
 relief+0.047
paste+0.047
 awful+0.047
Neg logit contributions
 ones-0.040
 color-0.036
's-0.035
 one-0.034
 way-0.032
Token.
Token ID13
Feature activation-9.84e-01

Pos logit contributions
 ones+0.006
's+0.006
 color+0.006
 one+0.005
oes+0.005
Neg logit contributions
 relief-0.010
 stret-0.010
 awful-0.010
 ow-0.009
 horrible-0.009
Token I
Token ID314
Feature activation+4.30e-01

Pos logit contributions
 relief+0.032
 Doctor+0.032
ores+0.032
 ow+0.031
paste+0.031
Neg logit contributions
 ones-0.032
's-0.029
oes-0.029
packs-0.028
 color-0.028
Token made
Token ID925
Feature activation+1.55e+00

Pos logit contributions
 ones+0.011
's+0.010
 color+0.010
oes+0.009
 one+0.009
Neg logit contributions
 Doctor-0.018
 stret-0.017
 relief-0.017
 ow-0.017
 awful-0.017
Token\n
Token ID198
Feature activation+2.74e-01

Pos logit contributions
 ones+0.009
oes+0.008
packs+0.008
 color+0.008
's+0.007
Neg logit contributions
 Doctor-0.010
ores-0.010
 relief-0.009
 ow-0.009
 stret-0.009
TokenWhen
Token ID2215
Feature activation-1.87e+00

Pos logit contributions
packs+0.008
 ones+0.007
oes+0.007
ci+0.007
re+0.006
Neg logit contributions
 awful-0.010
 horrible-0.010
 Doctor-0.009
lic-0.009
 poisonous-0.009
Token it
Token ID340
Feature activation-2.46e-01

Pos logit contributions
 relief+0.054
 sorry+0.052
 Doctor+0.052
 ow+0.049
 horrible+0.048
Neg logit contributions
oes-0.071
packs-0.067
 ones-0.064
j-0.064
's-0.061
Token was
Token ID373
Feature activation+1.02e+00

Pos logit contributions
 stret+0.011
 relief+0.011
 Doctor+0.011
coon+0.011
ores+0.011
Neg logit contributions
 ones-0.005
's-0.005
 color-0.004
oes-0.004
 one-0.004
Token done
Token ID1760
Feature activation-1.63e+00

Pos logit contributions
 ones+0.027
 color+0.026
 one+0.025
's+0.023
oes+0.023
Neg logit contributions
coon-0.040
paste-0.039
 stret-0.039
ache-0.038
 relief-0.038
Token,
Token ID11
Feature activation+4.70e-01

Pos logit contributions
 relief+0.082
coon+0.081
 stret+0.081
paste+0.078
 awful+0.077
Neg logit contributions
 ones-0.030
 one-0.026
 color-0.026
's-0.026
oes-0.018
Token Polly
Token ID36898
Feature activation+1.07e+00

Pos logit contributions
 ones+0.015
oes+0.014
packs+0.014
's+0.014
 color+0.014
Neg logit contributions
 relief-0.015
 Doctor-0.015
 ow-0.015
 awful-0.015
ores-0.015
Token was
Token ID373
Feature activation+9.57e-01

Pos logit contributions
 ones+0.023
's+0.019
oes+0.019
 color+0.019
packs+0.019
Neg logit contributions
 relief-0.047
 Doctor-0.046
ores-0.046
 stret-0.046
 ow-0.046
Token very
Token ID845
Feature activation-3.86e+00

Pos logit contributions
's+0.026
 color+0.025
 one+0.025
 ones+0.023
 way+0.022
Neg logit contributions
coon-0.039
 stret-0.037
paste-0.037
 relief-0.036
ache-0.035
Token excited
Token ID6568
Feature activation+3.70e+00

Pos logit contributions
coon+0.126
 stret+0.124
paste+0.118
 relief+0.115
 ow+0.115
Neg logit contributions
 ones-0.130
's-0.129
 color-0.127
 one-0.119
oes-0.117
Token.
Token ID13
Feature activation+5.42e-01

Pos logit contributions
 ones+0.101
oes+0.090
's+0.089
packs+0.088
 color+0.087
Neg logit contributions
 Doctor-0.137
 relief-0.137
ores-0.136
 ow-0.136
 awful-0.136
Token eat
Token ID4483
Feature activation-3.63e+00

Pos logit contributions
 Doctor+0.005
 relief+0.005
 ow+0.005
 stret+0.005
ores+0.005
Neg logit contributions
 ones-0.003
oes-0.003
's-0.003
packs-0.003
 color-0.003
Token food
Token ID2057
Feature activation-2.02e+00

Pos logit contributions
 relief+0.124
coon+0.119
ores+0.115
 stret+0.114
paste+0.114
Neg logit contributions
 ones-0.131
 color-0.119
's-0.118
 one-0.113
packs-0.099
Token,
Token ID11
Feature activation+6.12e-01

Pos logit contributions
 relief+0.088
 stret+0.087
coon+0.084
 explosion+0.084
 ow+0.084
Neg logit contributions
 ones-0.051
's-0.045
 color-0.044
 one-0.042
oes-0.040
Token especially
Token ID2592
Feature activation+4.32e-01

Pos logit contributions
 ones+0.018
oes+0.016
's+0.016
packs+0.016
 color+0.015
Neg logit contributions
 Doctor-0.022
 ow-0.022
 relief-0.022
ores-0.021
 stret-0.021
Token candy
Token ID18550
Feature activation-3.26e+00

Pos logit contributions
 ones+0.014
's+0.012
 color+0.012
oes+0.012
packs+0.012
Neg logit contributions
 relief-0.015
 stret-0.015
 Doctor-0.014
ores-0.014
 awful-0.014
Token and
Token ID290
Feature activation+6.98e-01

Pos logit contributions
 relief+0.124
 Doctor+0.123
ores+0.122
 stret+0.121
 ow+0.121
Neg logit contributions
 ones-0.089
's-0.079
oes-0.077
 color-0.076
packs-0.075
Token cookies
Token ID14746
Feature activation+6.89e-02

Pos logit contributions
 ones+0.025
's+0.022
oes+0.022
 color+0.022
packs+0.022
Neg logit contributions
 relief-0.021
 Doctor-0.021
ores-0.021
 stret-0.021
 ow-0.021
Token.
Token ID13
Feature activation+8.32e-01

Pos logit contributions
 ones+0.002
's+0.002
 color+0.001
oes+0.001
packs+0.001
Neg logit contributions
 relief-0.003
 Doctor-0.003
 stret-0.003
 ow-0.003
 awful-0.003
Token She
Token ID1375
Feature activation-2.35e+00

Pos logit contributions
 traveled+0.017
packs+0.017
 travelled+0.017
 ones+0.016
 ventured+0.015
Neg logit contributions
 Doctor-0.035
 Dr-0.035
 ow-0.033
ores-0.032
 sorry-0.032
Token would
Token ID561
Feature activation+9.53e-01

Pos logit contributions
 Doctor+0.089
 ow+0.089
 awful+0.086
 poisonous+0.085
ores+0.084
Neg logit contributions
 ones-0.062
oes-0.060
packs-0.059
 color-0.049
First-0.048
Token eat
Token ID4483
Feature activation-3.67e+00

Pos logit contributions
 ones+0.028
's+0.026
 color+0.025
oes+0.025
 one+0.024
Neg logit contributions
 relief-0.034
 Doctor-0.034
ores-0.033
 stret-0.033
paste-0.033
Token It
Token ID632
Feature activation+4.20e-01

Pos logit contributions
 ones+0.002
's+0.002
oes+0.002
 color+0.002
packs+0.002
Neg logit contributions
 relief-0.002
 stret-0.002
 Doctor-0.002
 ow-0.002
 awful-0.002
Token is
Token ID318
Feature activation+6.05e-01

Pos logit contributions
's+0.010
 ones+0.009
 one+0.008
 color+0.008
oes+0.007
Neg logit contributions
 relief-0.019
 stret-0.019
ores-0.018
 Doctor-0.018
coon-0.018
Token so
Token ID523
Feature activation+8.45e-03

Pos logit contributions
 one+0.016
 color+0.016
 ones+0.015
's+0.014
 way+0.013
Neg logit contributions
coon-0.026
 stret-0.025
paste-0.025
 relief-0.024
lic-0.024
Token colorful
Token ID20239
Feature activation+1.86e+00

Pos logit contributions
 color+0.000
 ones+0.000
's+0.000
 one+0.000
oes+0.000
Neg logit contributions
 stret-0.000
coon-0.000
 Doctor-0.000
paste-0.000
 relief-0.000
Token.
Token ID13
Feature activation+1.07e-02

Pos logit contributions
 ones+0.048
's+0.042
 one+0.039
 color+0.039
oes+0.036
Neg logit contributions
 relief-0.079
 stret-0.079
 Doctor-0.077
 awful-0.076
 ow-0.075
Token It
Token ID632
Feature activation+1.07e-01

Pos logit contributions
 ones+0.000
's+0.000
oes+0.000
 color+0.000
packs+0.000
Neg logit contributions
 relief-0.000
 stret-0.000
 awful-0.000
 ow-0.000
 Doctor-0.000
Token can
Token ID460
Feature activation+1.80e-01

Pos logit contributions
 ones+0.003
's+0.002
 color+0.002
 one+0.002
oes+0.002
Neg logit contributions
 relief-0.005
 stret-0.005
 Doctor-0.005
ores-0.005
coon-0.005
Token sing
Token ID1702
Feature activation+1.50e+00

Pos logit contributions
 ones+0.004
 one+0.004
 color+0.004
's+0.004
oes+0.004
Neg logit contributions
 relief-0.008
 Doctor-0.008
 stret-0.008
ores-0.007
paste-0.007
Token."
Token ID526
Feature activation+1.05e+00

Pos logit contributions
 ones+0.034
's+0.034
 color+0.031
 one+0.031
packs+0.025
Neg logit contributions
 stret-0.070
 Doctor-0.068
 relief-0.068
affle-0.067
 poisonous-0.066
Token Mil
Token ID4460
Feature activation-1.46e+00

Pos logit contributions
 ones+0.025
's+0.023
 one+0.021
 color+0.021
oes+0.021
Neg logit contributions
 awful-0.044
 ow-0.043
paste-0.043
 stret-0.043
 relief-0.043
Tokena
Token ID64
Feature activation-1.16e+00

Pos logit contributions
 relief+0.055
 Doctor+0.054
 stret+0.054
 ow+0.053
 awful+0.053
Neg logit contributions
 ones-0.040
's-0.038
oes-0.037
 color-0.036
packs-0.035
Token Mitt
Token ID16627
Feature activation-5.96e+00

Pos logit contributions
ache+0.077
lic+0.076
oke+0.074
paste+0.074
 stret+0.073
Neg logit contributions
 ones-0.053
 one-0.051
 color-0.048
's-0.044
 way-0.039
Tokenens
Token ID641
Feature activation-2.30e+00

Pos logit contributions
 relief+0.229
 stret+0.223
 awful+0.222
ores+0.221
 sorry+0.218
Neg logit contributions
 ones-0.173
's-0.155
oes-0.144
 color-0.136
 one-0.136
Token with
Token ID351
Feature activation+5.72e-01

Pos logit contributions
 relief+0.110
ores+0.107
 stret+0.105
ocol+0.102
lic+0.101
Neg logit contributions
's-0.057
 ones-0.050
 one-0.045
 color-0.040
 pads-0.036
Token a
Token ID257
Feature activation-1.11e+00

Pos logit contributions
 ones+0.014
 one+0.013
 color+0.012
's+0.011
 One+0.010
Neg logit contributions
 stret-0.027
coon-0.025
 relief-0.025
ache-0.025
lic-0.025
Token big
Token ID1263
Feature activation-3.88e+00

Pos logit contributions
ores+0.048
coon+0.044
 awful+0.043
 relief+0.043
 Doctor+0.043
Neg logit contributions
 ones-0.030
 color-0.028
's-0.028
 one-0.026
 way-0.025
Token smile
Token ID8212
Feature activation+3.77e-01

Pos logit contributions
ores+0.164
 stret+0.156
coon+0.155
 Doctor+0.155
 awful+0.155
Neg logit contributions
 ones-0.102
 color-0.091
's-0.090
 one-0.087
 way-0.079
Token on
Token ID319
Feature activation+1.56e+00

Pos logit contributions
 ones+0.010
's+0.009
 color+0.009
 one+0.009
oes+0.009
Neg logit contributions
 relief-0.015
 stret-0.014
 awful-0.014
 Doctor-0.014
ores-0.014
Token her
Token ID607
Feature activation-5.06e-01

Pos logit contributions
 ones+0.034
's+0.031
 one+0.030
 color+0.030
 One+0.027
Neg logit contributions
 stret-0.067
 relief-0.067
coon-0.066
ores-0.066
 explosion-0.066
Token face
Token ID1986
Feature activation-1.35e+00

Pos logit contributions
ores+0.016
 Doctor+0.015
 stret+0.015
coon+0.014
 relief+0.014
Neg logit contributions
's-0.018
 ones-0.018
 color-0.017
 one-0.016
oes-0.016
Token.
Token ID13
Feature activation+7.74e-02

Pos logit contributions
 relief+0.058
 stret+0.057
 Doctor+0.056
 ow+0.056
 awful+0.056
Neg logit contributions
 ones-0.032
's-0.029
 color-0.028
 one-0.027
oes-0.026
Token\n
Token ID198
Feature activation-1.61e-01

Pos logit contributions
 ones+0.002
's+0.002
oes+0.002
 color+0.001
 one+0.001
Neg logit contributions
 relief-0.003
 stret-0.003
 awful-0.003
coon-0.003
 ow-0.003
Token and
Token ID290
Feature activation+6.74e-01

Pos logit contributions
 ones+0.003
's+0.003
 one+0.003
 color+0.003
oes+0.003
Neg logit contributions
 ow-0.006
 relief-0.006
 awful-0.006
 stret-0.006
 Doctor-0.006
Token waved
Token ID26834
Feature activation+3.41e+00

Pos logit contributions
 ones+0.018
's+0.018
 one+0.017
 color+0.017
packs+0.016
Neg logit contributions
coon-0.026
 stret-0.026
ores-0.026
 awful-0.025
ache-0.025
Token goodbye
Token ID24829
Feature activation+1.28e+00

Pos logit contributions
 ones+0.098
's+0.088
 one+0.086
 color+0.083
packs+0.080
Neg logit contributions
coon-0.128
 Doctor-0.128
 ow-0.125
 stret-0.125
ores-0.124
Token.
Token ID13
Feature activation-1.45e-01

Pos logit contributions
 ones+0.031
's+0.028
oes+0.026
 color+0.026
 one+0.026
Neg logit contributions
 Doctor-0.052
 relief-0.052
 ow-0.052
ores-0.051
 stret-0.051
Token The
Token ID383
Feature activation+4.73e-01

Pos logit contributions
 relief+0.006
 Doctor+0.006
ores+0.006
 ow+0.006
 stret+0.006
Neg logit contributions
 ones-0.004
oes-0.003
's-0.003
 color-0.003
packs-0.003
Token squash
Token ID34613
Feature activation+1.76e+00

Pos logit contributions
 ones+0.015
's+0.013
 color+0.013
oes+0.013
 one+0.013
Neg logit contributions
 stret-0.016
ores-0.016
 Doctor-0.016
paste-0.016
 relief-0.016
Token was
Token ID373
Feature activation+8.19e-01

Pos logit contributions
 ones+0.044
's+0.040
 color+0.037
oes+0.037
packs+0.036
Neg logit contributions
 Doctor-0.071
ores-0.071
 relief-0.070
 stret-0.070
 ow-0.070
Token so
Token ID523
Feature activation-3.73e-01

Pos logit contributions
's+0.024
 one+0.024
 ones+0.023
 color+0.022
 way+0.021
Neg logit contributions
coon-0.031
paste-0.030
 Doctor-0.030
 stret-0.030
affle-0.030
Token happy
Token ID3772
Feature activation+4.78e-01

Pos logit contributions
 stret+0.013
coon+0.013
 Doctor+0.012
paste+0.012
affle+0.011
Neg logit contributions
's-0.012
 color-0.012
 ones-0.011
 one-0.011
packs-0.011
Token that
Token ID326
Feature activation-3.82e+00

Pos logit contributions
 ones+0.015
's+0.014
 one+0.013
 color+0.013
oes+0.012
Neg logit contributions
 stret-0.017
 relief-0.017
 Doctor-0.017
 ow-0.017
coon-0.017
Token someone
Token ID2130
Feature activation-1.72e+00

Pos logit contributions
 stret+0.148
 ow+0.143
coon+0.142
ores+0.142
 relief+0.142
Neg logit contributions
 ones-0.107
's-0.097
 color-0.093
 one-0.091
oes-0.089
Token so
Token ID523
Feature activation-3.81e-01

Pos logit contributions
's+0.022
 one+0.020
 ones+0.019
 color+0.019
 way+0.017
Neg logit contributions
 stret-0.031
coon-0.030
paste-0.030
ocol-0.029
ache-0.029
Token excited
Token ID6568
Feature activation+3.70e+00

Pos logit contributions
 stret+0.016
coon+0.016
paste+0.015
 Doctor+0.015
 explosion+0.015
Neg logit contributions
's-0.010
 color-0.009
 ones-0.009
 one-0.008
packs-0.008
Token to
Token ID284
Feature activation-3.91e-02

Pos logit contributions
 ones+0.108
's+0.099
oes+0.093
 one+0.092
 color+0.092
Neg logit contributions
 relief-0.141
 stret-0.138
 Doctor-0.136
ores-0.135
paste-0.135
Token see
Token ID766
Feature activation-2.98e+00

Pos logit contributions
 relief+0.001
 Doctor+0.001
ores+0.001
 ow+0.001
 awful+0.001
Neg logit contributions
 ones-0.001
oes-0.001
packs-0.001
's-0.001
 color-0.001
Token the
Token ID262
Feature activation-8.28e-03

Pos logit contributions
lic+0.122
 stret+0.122
paste+0.121
ache+0.118
affle+0.116
Neg logit contributions
 one-0.091
's-0.085
 ones-0.084
 color-0.076
 way-0.064
Token bug
Token ID5434
Feature activation-1.20e+00

Pos logit contributions
 stret+0.000
ores+0.000
 Doctor+0.000
 relief+0.000
paste+0.000
Neg logit contributions
 ones-0.000
 color-0.000
's-0.000
oes-0.000
 one-0.000
Token's
Token ID338
Feature activation+7.13e+00

Pos logit contributions
 stret+0.049
 relief+0.049
 Doctor+0.049
ores+0.048
paste+0.047
Neg logit contributions
's-0.032
 ones-0.031
 one-0.028
 color-0.027
packs-0.025
Token tiny
Token ID7009
Feature activation+6.42e-02

Pos logit contributions
 ones+0.235
's+0.214
 color+0.212
oes+0.209
packs+0.202
Neg logit contributions
 stret-0.238
 Doctor-0.234
ores-0.233
 relief-0.232
paste-0.230
Token legs
Token ID7405
Feature activation+4.43e+00

Pos logit contributions
 ones+0.002
's+0.002
oes+0.002
 color+0.002
packs+0.002
Neg logit contributions
 relief-0.002
 Doctor-0.002
ores-0.002
 stret-0.002
 ow-0.002
Token and
Token ID290
Feature activation+6.72e-01

Pos logit contributions
 ones+0.122
's+0.111
oes+0.107
 color+0.107
packs+0.103
Neg logit contributions
 relief-0.167
 Doctor-0.166
 stret-0.164
ores-0.164
 ow-0.162
Token eyes
Token ID2951
Feature activation-4.35e-01

Pos logit contributions
 ones+0.018
 color+0.017
's+0.016
 one+0.016
 way+0.015
Neg logit contributions
 stret-0.027
coon-0.027
ores-0.027
 awful-0.026
ache-0.026
Token everywhere
Token ID8347
Feature activation-4.21e+00

Pos logit contributions
 ones+0.038
's+0.034
 one+0.034
 color+0.032
 way+0.028
Neg logit contributions
paste-0.061
 ow-0.059
lic-0.059
ache-0.059
 stret-0.059
Token,
Token ID11
Feature activation+6.39e-01

Pos logit contributions
 stret+0.192
 relief+0.190
 Doctor+0.188
affle+0.185
ores+0.183
Neg logit contributions
's-0.095
 ones-0.091
 one-0.087
 color-0.081
 way-0.062
Token never
Token ID1239
Feature activation-2.66e-01

Pos logit contributions
 ones+0.019
's+0.017
 color+0.016
 one+0.016
oes+0.016
Neg logit contributions
 ow-0.023
paste-0.023
 Doctor-0.023
 relief-0.023
 stret-0.023
Token happier
Token ID23030
Feature activation-1.11e+00

Pos logit contributions
 relief+0.010
 Doctor+0.010
ores+0.010
 stret+0.010
 ow+0.009
Neg logit contributions
 ones-0.008
's-0.007
oes-0.007
 color-0.007
packs-0.007
Token than
Token ID621
Feature activation-3.46e+00

Pos logit contributions
 stret+0.046
 Doctor+0.045
ores+0.045
coon+0.045
paste+0.044
Neg logit contributions
 ones-0.030
's-0.026
 color-0.025
 one-0.024
packs-0.024
Token when
Token ID618
Feature activation-1.46e+00

Pos logit contributions
 stret+0.125
 relief+0.122
paste+0.122
 Doctor+0.122
ores+0.122
Neg logit contributions
 ones-0.105
 color-0.094
's-0.093
 one-0.092
oes-0.091
Token he
Token ID339
Feature activation-7.00e-01

Pos logit contributions
 stret+0.053
 relief+0.053
 Doctor+0.052
paste+0.052
ores+0.052
Neg logit contributions
 ones-0.045
's-0.040
 one-0.038
 color-0.038
packs-0.037
Token made
Token ID925
Feature activation+1.91e+00

Pos logit contributions
 stret+0.033
 Doctor+0.033
ores+0.032
 relief+0.032
paste+0.032
Neg logit contributions
's-0.015
 ones-0.014
 one-0.012
 color-0.012
 wandered-0.011
Token jokes
Token ID14532
Feature activation+1.85e+00

Pos logit contributions
 ones+0.056
's+0.056
 one+0.054
 color+0.049
 way+0.046
Neg logit contributions
 stret-0.073
coon-0.073
paste-0.073
 ow-0.071
lic-0.070
Token.
Token ID13
Feature activation+1.28e-01

Pos logit contributions
's+0.043
 ones+0.043
 one+0.038
 color+0.037
oes+0.033
Neg logit contributions
 relief-0.083
 stret-0.081
 Doctor-0.081
 ow-0.081
coon-0.080
Token<|endoftext|>
Token ID50256
Feature activation-4.57e+00

Pos logit contributions
 ones+0.003
's+0.003
oes+0.003
 color+0.002
packs+0.002
Neg logit contributions
 relief-0.005
 Doctor-0.005
ores-0.005
 ow-0.005
 stret-0.005
Token you
Token ID345
Feature activation-3.80e-01

Pos logit contributions
 stret+0.007
ores+0.007
coon+0.007
ache+0.007
 til+0.007
Neg logit contributions
 ones-0.007
's-0.006
 color-0.006
 one-0.006
oes-0.005
Token can
Token ID460
Feature activation-2.48e-01

Pos logit contributions
 relief+0.013
 Doctor+0.013
 stret+0.013
ores+0.013
 ow+0.013
Neg logit contributions
 ones-0.012
's-0.010
oes-0.010
 color-0.010
packs-0.010
Token.
Token ID13
Feature activation-9.87e-01

Pos logit contributions
 sorry+0.008
ores+0.008
 relief+0.007
 Doctor+0.007
 ow+0.007
Neg logit contributions
 ones-0.008
packs-0.008
oes-0.007
 pads-0.007
 traveled-0.007
Token You
Token ID921
Feature activation-3.63e-01

Pos logit contributions
 Doctor+0.033
 relief+0.033
 sorry+0.032
ores+0.031
paste+0.030
Neg logit contributions
 ones-0.033
packs-0.031
's-0.028
 color-0.028
 traveled-0.027
Token can
Token ID460
Feature activation-2.60e-01

Pos logit contributions
 relief+0.011
 sorry+0.011
 Doctor+0.011
 ow+0.011
 awful+0.011
Neg logit contributions
 ones-0.012
oes-0.011
packs-0.011
 color-0.010
's-0.010
Token play
Token ID711
Feature activation-1.50e+00

Pos logit contributions
 Doctor+0.009
ores+0.009
 relief+0.009
 ow+0.009
 sorry+0.009
Neg logit contributions
 ones-0.007
packs-0.007
oes-0.006
's-0.006
 pads-0.006
Token with
Token ID351
Feature activation+7.52e-02

Pos logit contributions
ache+0.073
othe+0.070
 awful+0.070
 relief+0.069
 ow+0.069
Neg logit contributions
 ones-0.034
's-0.032
 one-0.030
 color-0.028
One-0.021
Token your
Token ID534
Feature activation-5.60e-01

Pos logit contributions
 ones+0.002
's+0.002
 color+0.002
 one+0.002
packs+0.002
Neg logit contributions
 relief-0.003
ores-0.003
lic-0.003
 stret-0.003
 Doctor-0.003
Token doll
Token ID3654
Feature activation+7.53e-01

Pos logit contributions
ores+0.026
 relief+0.026
 Doctor+0.026
 awful+0.026
paste+0.026
Neg logit contributions
 ones-0.012
 color-0.010
's-0.009
packs-0.009
 one-0.009
Token and
Token ID290
Feature activation-3.30e-01

Pos logit contributions
 ones+0.020
's+0.018
oes+0.018
 color+0.017
packs+0.017
Neg logit contributions
 relief-0.029
 Doctor-0.029
ores-0.029
 ow-0.028
 stret-0.028
Token your
Token ID534
Feature activation-5.99e-01

Pos logit contributions
lic+0.014
ores+0.014
 awful+0.014
coon+0.014
 horrible+0.014
Neg logit contributions
 ones-0.009
's-0.008
 color-0.008
 one-0.007
packs-0.007
Token he
Token ID339
Feature activation-9.63e-01

Pos logit contributions
oes+0.010
 ones+0.010
packs+0.010
's+0.009
First+0.009
Neg logit contributions
 Doctor-0.012
 ow-0.012
 relief-0.011
ores-0.011
 sorry-0.011
Token always
Token ID1464
Feature activation+6.81e-01

Pos logit contributions
 Doctor+0.038
 relief+0.038
 ow+0.037
ores+0.037
 stret+0.037
Neg logit contributions
 ones-0.025
oes-0.022
's-0.021
packs-0.021
 color-0.021
Token asked
Token ID1965
Feature activation+3.28e+00

Pos logit contributions
 ones+0.016
oes+0.015
packs+0.014
's+0.014
 color+0.013
Neg logit contributions
 relief-0.028
 Doctor-0.028
 ow-0.027
ores-0.027
 awful-0.027
Token his
Token ID465
Feature activation+4.03e-01

Pos logit contributions
 ones+0.096
 one+0.085
 color+0.085
's+0.081
 One+0.072
Neg logit contributions
 relief-0.133
 stret-0.132
coon-0.132
 awful-0.130
paste-0.130
Token mom
Token ID1995
Feature activation+6.12e-01

Pos logit contributions
 ones+0.011
 color+0.010
's+0.010
 one+0.009
oes+0.009
Neg logit contributions
ores-0.016
coon-0.016
 awful-0.016
 relief-0.016
 stret-0.016
Tokenmy
Token ID1820
Feature activation-1.64e+00

Pos logit contributions
 ones+0.015
packs+0.013
oes+0.013
's+0.012
 traveled+0.012
Neg logit contributions
 Doctor-0.024
ores-0.024
 awful-0.023
 sorry-0.023
 ow-0.023
Token for
Token ID329
Feature activation-6.08e+00

Pos logit contributions
 relief+0.062
 ow+0.060
 stret+0.060
 awful+0.059
coon+0.059
Neg logit contributions
 ones-0.049
's-0.044
oes-0.043
 color-0.042
packs-0.041
Token some
Token ID617
Feature activation+3.70e+00

Pos logit contributions
 relief+0.171
 Doctor+0.169
ores+0.168
 stret+0.168
 ow+0.167
Neg logit contributions
 ones-0.226
's-0.207
oes-0.203
 color-0.202
packs-0.199
Token.
Token ID13
Feature activation+4.71e-01

Pos logit contributions
 ones+0.140
's+0.128
 color+0.126
oes+0.125
 one+0.122
Neg logit contributions
 relief-0.104
ores-0.102
 Doctor-0.101
 stret-0.101
coon-0.101
Token One
Token ID1881
Feature activation+3.32e+00

Pos logit contributions
 ones+0.010
packs+0.010
 traveled+0.009
 pads+0.009
 travelled+0.009
Neg logit contributions
 Doctor-0.020
ores-0.019
 Dr-0.019
 stret-0.018
 sorry-0.018
Token day
Token ID1110
Feature activation-7.25e-01

Pos logit contributions
 ones+0.064
packs+0.056
oes+0.048
 pads+0.048
First+0.046
Neg logit contributions
 Doctor-0.156
 ow-0.151
ores-0.150
 relief-0.150
 awful-0.149
Token day
Token ID1110
Feature activation-3.93e-01

Pos logit contributions
 ones+0.071
packs+0.056
oes+0.055
's+0.053
 color+0.047
Neg logit contributions
ores-0.195
 Doctor-0.195
 relief-0.194
 ow-0.192
 awful-0.191
Token,
Token ID11
Feature activation+6.78e-01

Pos logit contributions
 Doctor+0.013
ores+0.013
 ow+0.013
 sorry+0.013
 relief+0.013
Neg logit contributions
 ones-0.012
oes-0.011
 pads-0.011
packs-0.010
 color-0.010
Token Jacob
Token ID12806
Feature activation+1.02e+00

Pos logit contributions
 ones+0.021
oes+0.020
's+0.019
packs+0.019
 pads+0.019
Neg logit contributions
 Doctor-0.022
ores-0.022
 relief-0.022
 ow-0.021
 sorry-0.021
Token wrote
Token ID2630
Feature activation+1.04e+00

Pos logit contributions
 ones+0.029
oes+0.027
 color+0.024
packs+0.024
First+0.023
Neg logit contributions
 Doctor-0.036
 ow-0.035
ores-0.035
 relief-0.035
ache-0.034
Token a
Token ID257
Feature activation-5.47e-01

Pos logit contributions
's+0.026
 one+0.026
 ones+0.025
 color+0.023
packs+0.019
Neg logit contributions
 relief-0.046
ocol-0.045
 explosion-0.044
 ow-0.044
 stret-0.043
Token letter
Token ID3850
Feature activation-7.81e-01

Pos logit contributions
ores+0.023
paste+0.021
coon+0.021
 relief+0.021
 ow+0.021
Neg logit contributions
 ones-0.015
 color-0.015
's-0.015
 one-0.014
 way-0.014
Token to
Token ID284
Feature activation+3.27e-02

Pos logit contributions
 relief+0.035
 ow+0.034
 stret+0.034
 awful+0.034
ores+0.034
Neg logit contributions
 ones-0.017
's-0.016
 color-0.015
 one-0.015
oes-0.015
Token Jenny
Token ID28437
Feature activation+9.76e-01

Pos logit contributions
 color+0.001
 ones+0.001
 one+0.001
's+0.001
oes+0.001
Neg logit contributions
paste-0.001
ache-0.001
 stret-0.001
 poisonous-0.001
 awful-0.001
Token.
Token ID13
Feature activation+5.88e-01

Pos logit contributions
 ones+0.021
's+0.019
oes+0.017
 color+0.017
packs+0.017
Neg logit contributions
 relief-0.043
ores-0.043
 Doctor-0.043
 ow-0.043
 awful-0.042
Token He
Token ID679
Feature activation-1.96e+00

Pos logit contributions
 ones+0.018
's+0.016
oes+0.016
 color+0.016
packs+0.016
Neg logit contributions
 relief-0.020
 stret-0.020
 ow-0.020
 awful-0.020
 Doctor-0.020
Token mailed
Token ID37789
Feature activation-3.81e-01

Pos logit contributions
ores+0.092
 relief+0.091
 stret+0.091
 Doctor+0.089
paste+0.088
Neg logit contributions
's-0.040
 ones-0.038
 color-0.035
 one-0.032
packs-0.030
Token scared
Token ID12008
Feature activation-2.42e-01

Pos logit contributions
's+0.035
 one+0.032
 color+0.030
 ones+0.029
 way+0.029
Neg logit contributions
paste-0.039
 stret-0.038
ores-0.038
ocol-0.037
 relief-0.037
Token and
Token ID290
Feature activation+8.01e-01

Pos logit contributions
 stret+0.011
 relief+0.011
paste+0.011
ores+0.011
coon+0.011
Neg logit contributions
 ones-0.005
's-0.005
 one-0.004
oes-0.004
 color-0.004
Token didn
Token ID1422
Feature activation-2.03e+00

Pos logit contributions
 ones+0.015
's+0.015
 one+0.014
 wandered+0.013
 color+0.013
Neg logit contributions
 stret-0.037
paste-0.037
ores-0.036
 Doctor-0.036
 relief-0.036
Token't
Token ID470
Feature activation+2.25e+00

Pos logit contributions
 relief+0.082
 stret+0.081
 Doctor+0.079
 ow+0.079
paste+0.079
Neg logit contributions
 ones-0.047
's-0.047
oes-0.043
 one-0.042
 color-0.041
Token know
Token ID760
Feature activation-1.33e+00

Pos logit contributions
 ones+0.081
's+0.073
oes+0.073
packs+0.072
 color+0.071
Neg logit contributions
 relief-0.066
 Doctor-0.065
 ow-0.065
ores-0.065
 stret-0.064
Token what
Token ID644
Feature activation-6.38e+00

Pos logit contributions
 stret+0.033
 relief+0.032
ores+0.031
affle+0.031
paste+0.031
Neg logit contributions
 ones-0.057
 one-0.056
's-0.056
 color-0.053
 way-0.049
Token to
Token ID284
Feature activation-1.58e-03

Pos logit contributions
 relief+0.310
 stret+0.308
ores+0.306
 Doctor+0.302
 awful+0.301
Neg logit contributions
 ones-0.115
's-0.102
 color-0.091
oes-0.090
 one-0.085
Token do
Token ID466
Feature activation-3.10e+00

Pos logit contributions
 relief+0.000
 Doctor+0.000
ores+0.000
 stret+0.000
 awful+0.000
Neg logit contributions
 ones-0.000
's-0.000
oes-0.000
 color-0.000
packs-0.000
Token.
Token ID13
Feature activation+6.13e-01

Pos logit contributions
 awful+0.102
 relief+0.101
 ow+0.100
 Doctor+0.100
ores+0.100
Neg logit contributions
 ones-0.095
oes-0.089
packs-0.086
 color-0.083
's-0.081
Token But
Token ID887
Feature activation+1.08e+00

Pos logit contributions
 ones+0.015
's+0.015
 one+0.013
oes+0.013
 color+0.013
Neg logit contributions
 stret-0.026
ores-0.025
 relief-0.025
coon-0.025
paste-0.025
Token then
Token ID788
Feature activation-2.46e+00

Pos logit contributions
 ones+0.039
's+0.037
 one+0.036
 color+0.035
oes+0.034
Neg logit contributions
 stret-0.033
ores-0.032
paste-0.032
 relief-0.031
coon-0.031
Token,
Token ID11
Feature activation+1.88e-01

Pos logit contributions
 stret+0.009
ores+0.009
coon+0.009
 relief+0.009
 awful+0.009
Neg logit contributions
 ones-0.007
's-0.006
 one-0.005
 color-0.005
packs-0.005
Token Lucy
Token ID22162
Feature activation+7.17e-01

Pos logit contributions
 ones+0.006
 one+0.005
 color+0.005
's+0.005
 way+0.004
Neg logit contributions
coon-0.008
affle-0.008
lic-0.007
 stret-0.007
ache-0.007
Token's
Token ID338
Feature activation+6.78e+00

Pos logit contributions
's+0.015
 ones+0.012
 one+0.010
 wandered+0.010
 color+0.010
Neg logit contributions
 relief-0.035
 stret-0.034
ores-0.034
 awful-0.034
 Doctor-0.033
Token tum
Token ID11814
Feature activation-5.28e+00

Pos logit contributions
 ones+0.221
's+0.214
 one+0.210
 color+0.208
 way+0.205
Neg logit contributions
lic-0.259
coon-0.249
ores-0.242
oke-0.225
 amused-0.221
Tokenmy
Token ID1820
Feature activation-1.65e+00

Pos logit contributions
 relief+0.279
 Doctor+0.276
 stret+0.276
ores+0.275
 ow+0.274
Neg logit contributions
 ones-0.067
's-0.051
oes-0.050
 color-0.047
packs-0.045
Token ac
Token ID936
Feature activation-6.54e+00

Pos logit contributions
ores+0.078
ocol+0.078
 stret+0.077
coon+0.077
 relief+0.076
Neg logit contributions
's-0.041
 ones-0.030
 one-0.028
 color-0.026
 way-0.025
Tokenhe
Token ID258
Feature activation-4.07e+00

Pos logit contributions
 relief+0.210
 sorry+0.207
ores+0.195
 stret+0.195
 amused+0.195
Neg logit contributions
's-0.224
 ones-0.219
oes-0.201
c-0.198
st-0.191
Token was
Token ID373
Feature activation+3.50e-01

Pos logit contributions
ores+0.186
lic+0.185
 relief+0.182
ocol+0.178
 stret+0.178
Neg logit contributions
's-0.091
 ones-0.088
 one-0.084
 color-0.081
 way-0.070
Token gone
Token ID3750
Feature activation-3.06e+00

Pos logit contributions
's+0.010
 one+0.010
 color+0.010
 way+0.009
 ones+0.009
Neg logit contributions
coon-0.015
oke-0.013
lic-0.013
 stret-0.013
affle-0.013
Token,
Token ID11
Feature activation+7.24e-02

Pos logit contributions
 Doctor+0.134
 relief+0.134
coon+0.133
 stret+0.133
ores+0.132
Neg logit contributions
 ones-0.070
's-0.065
 one-0.060
 color-0.059
packs-0.056
Token and
Token ID290
Feature activation+2.95e-01

Pos logit contributions
 ones+0.002
's+0.002
 one+0.002
 color+0.001
 way+0.001
Neg logit contributions
affle-0.003
coon-0.003
ores-0.003
ocol-0.003
lic-0.003
TokenO
Token ID46
Feature activation-4.08e+00

Pos logit contributions
 ones+0.046
oes+0.042
's+0.042
packs+0.041
 color+0.041
Neg logit contributions
 relief-0.050
 Doctor-0.050
 ow-0.050
 stret-0.049
ores-0.049
Tokenw
Token ID86
Feature activation-1.17e+00

Pos logit contributions
 relief+0.173
 Doctor+0.171
 stret+0.171
ores+0.169
 awful+0.168
Neg logit contributions
 ones-0.095
's-0.084
oes-0.083
packs-0.078
 color-0.078
Token,
Token ID11
Feature activation-4.54e-01

Pos logit contributions
 relief+0.057
ores+0.056
 stret+0.055
 awful+0.054
 explosion+0.054
Neg logit contributions
's-0.025
 ones-0.020
 color-0.018
 one-0.018
oes-0.016
Token ow
Token ID12334
Feature activation-9.73e+00

Pos logit contributions
lic+0.012
rig+0.012
ores+0.011
 til+0.010
 accomplishment+0.010
Neg logit contributions
's-0.019
 one-0.019
 ones-0.017
 color-0.017
 way-0.015
Token,
Token ID11
Feature activation-5.17e-01

Pos logit contributions
 relief+0.468
 stret+0.462
ores+0.445
 awful+0.445
 Doctor+0.436
Neg logit contributions
 ones-0.200
's-0.186
 one-0.161
oes-0.157
 color-0.153
Token ow
Token ID12334
Feature activation-9.76e+00

Pos logit contributions
lic+0.015
rig+0.013
ores+0.013
 til+0.012
plug+0.012
Neg logit contributions
's-0.022
 one-0.021
 ones-0.020
 color-0.019
 way-0.017
Token!
Token ID0
Feature activation-1.88e+00

Pos logit contributions
 stret+0.407
 relief+0.403
ores+0.372
 awful+0.369
 Doctor+0.368
Neg logit contributions
 ones-0.269
's-0.258
 one-0.233
oes-0.218
 color-0.216
Token It
Token ID632
Feature activation-2.99e-01

Pos logit contributions
 stret+0.074
 relief+0.074
ores+0.074
 awful+0.072
 poisonous+0.071
Neg logit contributions
's-0.048
 ones-0.046
 color-0.042
oes-0.041
 one-0.040
Token hurts
Token ID20406
Feature activation-1.39e+00

Pos logit contributions
 relief+0.010
ores+0.010
 Doctor+0.010
 stret+0.010
 awful+0.009
Neg logit contributions
 ones-0.010
's-0.009
oes-0.009
 color-0.008
packs-0.008
Token!"
Token ID2474
Feature activation-8.01e-01

Pos logit contributions
 stret+0.057
 til+0.055
paste+0.053
affle+0.053
 amused+0.053
Neg logit contributions
's-0.039
 ones-0.038
 one-0.036
 color-0.035
y-0.029
Token\n
Token ID198
Feature activation-8.33e-01

Pos logit contributions
coon+0.042
 relief+0.042
ocol+0.042
paste+0.041
 awful+0.041
Neg logit contributions
 ones-0.010
's-0.010
 one-0.009
 One-0.009
 color-0.006
Token bully
Token ID27410
Feature activation-1.92e+00

Pos logit contributions
ores+0.007
paste+0.007
ocol+0.007
lic+0.007
 til+0.007
Neg logit contributions
 ones-0.005
's-0.004
 color-0.004
 one-0.004
 way-0.004
Token and
Token ID290
Feature activation+2.17e-01

Pos logit contributions
 relief+0.082
 ow+0.082
ores+0.081
 awful+0.081
 Doctor+0.079
Neg logit contributions
 ones-0.046
's-0.043
 one-0.038
oes-0.037
 color-0.037
Token said
Token ID531
Feature activation+4.25e-02

Pos logit contributions
's+0.005
 ones+0.005
 one+0.004
 color+0.004
 way+0.004
Neg logit contributions
 ow-0.010
ores-0.010
paste-0.010
ocol-0.009
 stret-0.009
Token:
Token ID25
Feature activation-1.68e-01

Pos logit contributions
's+0.001
 ones+0.001
 one+0.001
 color+0.001
 One+0.001
Neg logit contributions
ores-0.002
 stret-0.002
 relief-0.002
coon-0.002
 ow-0.002
Token "
Token ID366
Feature activation+1.92e+00

Pos logit contributions
 relief+0.006
 ow+0.006
paste+0.006
 awful+0.006
 stret+0.006
Neg logit contributions
 ones-0.005
's-0.004
 one-0.004
 color-0.004
oes-0.004
TokenWhy
Token ID5195
Feature activation-6.65e+00

Pos logit contributions
 ones+0.058
packs+0.054
oes+0.052
's+0.052
 color+0.051
Neg logit contributions
 Doctor-0.066
 relief-0.065
 stret-0.065
 awful-0.064
coon-0.064
Token don
Token ID836
Feature activation-2.34e+00

Pos logit contributions
 stret+0.251
 relief+0.250
ores+0.239
paste+0.239
 poisonous+0.237
Neg logit contributions
's-0.192
 ones-0.191
 color-0.182
oes-0.168
 one-0.168
Token't
Token ID470
Feature activation+2.13e+00

Pos logit contributions
 relief+0.088
 stret+0.086
 Doctor+0.086
ores+0.086
 ow+0.085
Neg logit contributions
 ones-0.064
's-0.059
oes-0.056
 color-0.055
packs-0.053
Token you
Token ID345
Feature activation-9.48e-02

Pos logit contributions
 ones+0.059
's+0.056
 color+0.052
oes+0.051
 one+0.050
Neg logit contributions
ores-0.082
 relief-0.081
 stret-0.081
paste-0.080
 Doctor-0.079
Token respect
Token ID2461
Feature activation-1.58e+00

Pos logit contributions
 relief+0.004
paste+0.003
 stret+0.003
ores+0.003
 Doctor+0.003
Neg logit contributions
's-0.003
 ones-0.003
 color-0.003
oes-0.002
 one-0.002
Token the
Token ID262
Feature activation-2.55e-01

Pos logit contributions
 relief+0.063
 stret+0.063
paste+0.063
lic+0.062
ores+0.062
Neg logit contributions
 ones-0.047
 color-0.043
's-0.043
 one-0.041
oes-0.038
Token ring
Token ID5858
Feature activation-1.49e+00

Pos logit contributions
ores+0.043
 Doctor+0.040
 awful+0.039
 relief+0.038
othe+0.038
Neg logit contributions
 ones-0.035
 color-0.032
 one-0.030
's-0.028
 way-0.028
Token?
Token ID30
Feature activation-1.23e+00

Pos logit contributions
 relief+0.059
 stret+0.058
 Doctor+0.057
ores+0.056
 ow+0.056
Neg logit contributions
 ones-0.041
's-0.041
 color-0.036
 one-0.036
oes-0.036
Token That
Token ID1320
Feature activation+4.63e-01

Pos logit contributions
 relief+0.046
 Doctor+0.046
ores+0.046
 stret+0.045
 ow+0.045
Neg logit contributions
 ones-0.034
's-0.030
oes-0.029
 color-0.029
packs-0.029
Token feeling
Token ID4203
Feature activation-4.51e+00

Pos logit contributions
 ones+0.009
's+0.007
oes+0.007
 color+0.007
packs+0.007
Neg logit contributions
 relief-0.021
 Doctor-0.021
ores-0.021
 stret-0.021
 ow-0.021
Token is
Token ID318
Feature activation+6.83e-01

Pos logit contributions
 stret+0.163
 Doctor+0.154
coon+0.150
paste+0.148
ores+0.148
Neg logit contributions
 ones-0.145
's-0.141
 one-0.130
 color-0.129
oes-0.116
Token worth
Token ID2861
Feature activation-7.30e+00

Pos logit contributions
 one+0.020
's+0.020
 ones+0.020
 color+0.019
 way+0.017
Neg logit contributions
 stret-0.027
coon-0.026
paste-0.025
 ow-0.025
affle-0.025
Token more
Token ID517
Feature activation+1.00e+00

Pos logit contributions
 stret+0.254
ores+0.246
 Doctor+0.245
paste+0.243
 ow+0.242
Neg logit contributions
 ones-0.237
's-0.218
 one-0.215
 color-0.204
packs-0.189
Token than
Token ID621
Feature activation-3.51e+00

Pos logit contributions
 ones+0.028
's+0.025
 color+0.024
 one+0.024
 way+0.023
Neg logit contributions
 Doctor-0.039
coon-0.039
ores-0.039
 stret-0.039
 ow-0.038
Token any
Token ID597
Feature activation-7.55e-01

Pos logit contributions
 stret+0.136
 relief+0.131
 Doctor+0.131
 ow+0.131
ores+0.131
Neg logit contributions
 ones-0.103
 one-0.090
 color-0.090
's-0.088
oes-0.086
Token ancient
Token ID6156
Feature activation-5.65e-02

Pos logit contributions
ores+0.029
 stret+0.028
 Doctor+0.028
coon+0.027
paste+0.027
Neg logit contributions
 ones-0.023
's-0.021
 color-0.020
 one-0.020
 way-0.019
Token treasure
Token ID14068
Feature activation-1.38e+00

Pos logit contributions
ores+0.002
 Doctor+0.002
 relief+0.002
 stret+0.002
 awful+0.002
Neg logit contributions
 ones-0.002
's-0.001
 color-0.001
 one-0.001
packs-0.001