TOP ACTIVATIONS
MAX = 11.303

write it . --- Bee and Ant
very much . <|endoftext|> Once upon a time , there was
the rain ." <|endoftext|> Once upon a time , there was
y anymore . <|endoftext|> Once upon a time , there was
to her . <|endoftext|> Once upon a time there was a
it forever . <|endoftext|> Once upon a time there were two
to them . <|endoftext|> Once upon a time , there was
to it . <|endoftext|> Once upon a time there was a
swing together . <|endoftext|> Once upon a time , there was
its creatures . <|endoftext|> Once upon a time , there was
the butterfly . <|endoftext|> Once upon a time there was a
scary dogs . <|endoftext|> Once upon a time , there was
his favorite . <|endoftext|> Once upon a time , there was
the house . <|endoftext|> Once upon a time , there was
the park . <|endoftext|> Once upon a time , there was
the park . <|endoftext|> Once upon a time , there was
a friend . <|endoftext|> Once upon a time , there was
the rabbit . <|endoftext|> Once upon a time , there was
this again !" <|endoftext|> Once upon a time , there was
y days . <|endoftext|> Once upon a time , there was

MIN ACTIVATIONS
MIN = -12.682

clown s juggling and ac rob ats doing flips . But
thought they looked like ac rob ats . She enjoyed the
of fun watching the ac rob ats and clown s .
elephants , lions and ac rob ats . They are amazed
clown s and the ac rob ats and the lions and
saw clown s and ac rob ats and lions and elephants
the animals , the ac rob ats and the clown s
very still so the ha ird ress er could do a
his mom and the ha ird ress er . " I
She twisted it around her wrists like brace lets .
happy and thanked the ha ird ress er . As they
in and let the ha ird ress er go to work
the chair and the ha ird ress er began to cut
called it a " we ird rock ." Tim my got
am ready !" The ha ird ress er put a cape
chair and allowed the ha ird ress er to give her
his mom left the ha ird ress er , happy with
tied the balloons to their wrists . L ily
forgot the lion and his mercy . He always tried to
like it and the ha ird ress er said it was

INTERVAL 5.307 - 11.303
CONTAINS 0.683%

and thought about the nice guard . She was very glad
, Ok ! She
The end . <|endoftext|> Once upon a time , there was
mean cat . <|endoftext|> Once upon a time , there was
modern wonder ! <|endoftext|> Once upon a time , there was

INTERVAL -0.689 - 5.307
CONTAINS 63.866%

, and a sandbox . Where is Ben ?
and Mia . We want you to be good . We
Mom says , " You have to wash . You
to the bunny , " Fl opsy , why did you
to be careful and not get too close to the fence

INTERVAL -6.685 - -0.689
CONTAINS 35.188%

to swim in the big pond , and the lion felt
watched as the wind blew away all his carefully planted wheat
amas . Then he sn ugg led up in his bed
the bag . The little girl decided to take
's mom took him to the store . She said ,

INTERVAL -12.682 - -6.685
CONTAINS 0.261%

" Oh no ," said Lily . " I
, smooth butter . The butter looked all around , seeing
try to say the words ," the monkey said .
to help the ant . Together , they moved more and
" But I like it ," he said .
Token write
Token ID3551
Feature activation-2.25e+00

Pos logit contributions
 wrists+0.054
 chip+0.052
 depends+0.051
 will+0.050
 sa+0.050
Neg logit contributions
ires-0.051
 afar-0.050
ear-0.049
eway-0.048
 underground-0.047
Token it
Token ID340
Feature activation+2.14e+00

Pos logit contributions
 wrists+0.089
 chip+0.085
 will+0.084
tails+0.083
 sa+0.082
Neg logit contributions
ires-0.086
 afar-0.080
err-0.079
ear-0.078
eway-0.077
Token.
Token ID13
Feature activation+1.01e+00

Pos logit contributions
ires+0.075
ear+0.068
err+0.068
 afar+0.067
eway+0.065
Neg logit contributions
 wrists-0.094
 chip-0.089
tails-0.088
 sa-0.086
girls-0.085
Token
Token ID198
Feature activation+2.13e-01

Pos logit contributions
ires+0.041
 afar+0.039
err+0.038
ear+0.038
eway+0.037
Neg logit contributions
 wrists-0.038
tails-0.036
 will-0.036
 chip-0.036
 sa-0.035
Token
Token ID198
Feature activation+1.69e-01

Pos logit contributions
ires+0.011
 afar+0.011
err+0.010
uttering+0.010
ging+0.010
Neg logit contributions
 will-0.005
 chip-0.005
 wrists-0.005
 depends-0.005
 says-0.005
Token---
Token ID6329
Feature activation+9.21e+00

Pos logit contributions
ires+0.008
 afar+0.008
err+0.007
ening+0.007
ied+0.007
Neg logit contributions
Together-0.005
 will-0.005
 chip-0.005
 wrists-0.005
-0.005
Token
Token ID198
Feature activation+1.20e-01

Pos logit contributions
ocol+0.571
ires+0.563
ening+0.562
ied+0.559
 afar+0.549
Neg logit contributions
 will-0.171
-0.148
 -0.141
 says-0.135
 said-0.131
Token
Token ID198
Feature activation+9.61e-02

Pos logit contributions
ires+0.006
 afar+0.006
ging+0.006
ening+0.006
ied+0.005
Neg logit contributions
 will-0.004
Together-0.003
 chip-0.003
 Together-0.003
-0.003
TokenBee
Token ID49512
Feature activation-2.84e+00

Pos logit contributions
ires+0.005
 afar+0.005
ening+0.005
ging+0.005
ied+0.005
Neg logit contributions
Together-0.003
 will-0.003
-0.002
 chip-0.002
They-0.002
Token and
Token ID290
Feature activation+1.41e+00

Pos logit contributions
 wrists+0.103
 chip+0.097
 will+0.096
tails+0.095
 sa+0.094
Neg logit contributions
ires-0.117
 afar-0.110
err-0.109
ear-0.108
eway-0.105
Token Ant
Token ID3738
Feature activation-8.06e-01

Pos logit contributions
ires+0.061
 afar+0.057
err+0.057
ear+0.056
eway+0.056
Neg logit contributions
 wrists-0.049
 chip-0.046
 will-0.046
tails-0.045
 sa-0.045
Token very
Token ID845
Feature activation+8.02e-01

Pos logit contributions
 chip+0.026
 said+0.025
 will+0.024
 say+0.021
 would+0.021
Neg logit contributions
ires-0.042
ocol-0.041
 afar-0.040
ging-0.040
eway-0.039
Token much
Token ID881
Feature activation+1.29e+00

Pos logit contributions
ires+0.030
 afar+0.028
err+0.028
eway+0.027
ear+0.027
Neg logit contributions
 wrists-0.032
 will-0.031
 chip-0.031
 sa-0.030
 depends-0.030
Token.
Token ID13
Feature activation+2.25e+00

Pos logit contributions
ires+0.047
 afar+0.044
 underground+0.044
ear+0.042
err+0.042
Neg logit contributions
 wrists-0.054
tails-0.053
girls-0.050
 chip-0.049
 depends-0.049
Token<|endoftext|>
Token ID50256
Feature activation+4.79e+00

Pos logit contributions
ires+0.101
 afar+0.096
ear+0.094
err+0.094
eway+0.094
Neg logit contributions
 wrists-0.073
 will-0.069
 chip-0.068
tails-0.067
 sa-0.067
TokenOnce
Token ID7454
Feature activation+2.04e+00

Pos logit contributions
ires+0.256
 afar+0.247
uttering+0.239
ear+0.238
err+0.236
Neg logit contributions
 will-0.118
 chip-0.109
 said-0.106
 says-0.105
 wrists-0.105
Token upon
Token ID2402
Feature activation+9.02e+00

Pos logit contributions
 afar+0.054
ear+0.049
vised+0.048
err+0.046
foot+0.046
Neg logit contributions
 said-0.111
 -0.110
 will-0.108
 says-0.105
-0.100
Token a
Token ID257
Feature activation+4.60e-01

Pos logit contributions
ires+0.522
ied+0.504
ging+0.504
ening+0.502
ying+0.491
Neg logit contributions
 said-0.195
 will-0.186
 chip-0.176
 sa-0.168
 -0.164
Token time
Token ID640
Feature activation+3.70e+00

Pos logit contributions
ires+0.014
eway+0.013
ges+0.013
 afar+0.013
ied+0.012
Neg logit contributions
 will-0.022
 chip-0.021
 wrists-0.021
-0.021
 sa-0.020
Token,
Token ID11
Feature activation+8.43e-01

Pos logit contributions
 afar+0.251
ires+0.238
err+0.234
ear+0.231
ening+0.227
Neg logit contributions
 -0.048
 said-0.045
 chip-0.044
 will-0.044
 Together-0.043
Token there
Token ID612
Feature activation+4.00e+00

Pos logit contributions
ear+0.034
ires+0.033
 afar+0.033
err+0.032
eway+0.032
Neg logit contributions
 will-0.031
 said-0.030
 chip-0.029
-0.028
 -0.028
Token was
Token ID373
Feature activation+2.05e+00

Pos logit contributions
ires+0.221
 afar+0.215
ear+0.210
err+0.209
uttering+0.207
Neg logit contributions
 will-0.101
 said-0.091
 chip-0.089
 says-0.083
 -0.080
Token the
Token ID262
Feature activation+2.75e+00

Pos logit contributions
ires+0.020
 afar+0.019
err+0.018
ear+0.018
eway+0.018
Neg logit contributions
 wrists-0.025
 will-0.023
 chip-0.023
tails-0.023
 depends-0.023
Token rain
Token ID6290
Feature activation+5.48e+00

Pos logit contributions
ires+0.115
 afar+0.108
err+0.106
ear+0.105
eway+0.105
Neg logit contributions
 wrists-0.098
 chip-0.093
 will-0.093
tails-0.091
 sa-0.091
Token."
Token ID526
Feature activation-1.49e+00

Pos logit contributions
ires+0.189
 afar+0.182
eway+0.180
 underground+0.180
err+0.179
Neg logit contributions
 wrists-0.225
 sa-0.209
girls-0.208
tails-0.208
 chip-0.208
Token<|endoftext|>
Token ID50256
Feature activation+4.87e+00

Pos logit contributions
 wrists+0.053
 will+0.051
 depends+0.050
tails+0.050
 sa+0.049
Neg logit contributions
ires-0.061
 afar-0.058
ear-0.057
ges-0.057
 upon-0.057
TokenOnce
Token ID7454
Feature activation+2.05e+00

Pos logit contributions
ires+0.258
 afar+0.246
uttering+0.240
ear+0.236
err+0.235
Neg logit contributions
 will-0.125
 said-0.115
-0.114
 says-0.114
 -0.112
Token upon
Token ID2402
Feature activation+9.02e+00

Pos logit contributions
 afar+0.052
vised+0.048
ear+0.048
foot+0.047
err+0.045
Neg logit contributions
 -0.114
 said-0.113
 will-0.108
 says-0.107
-0.102
Token a
Token ID257
Feature activation+4.60e-01

Pos logit contributions
ires+0.534
ging+0.526
ied+0.517
ening+0.517
ying+0.507
Neg logit contributions
 said-0.188
 will-0.172
 -0.157
 chip-0.157
 kissed-0.151
Token time
Token ID640
Feature activation+3.83e+00

Pos logit contributions
ires+0.014
eway+0.013
 afar+0.013
ges+0.012
ging+0.012
Neg logit contributions
 will-0.022
 chip-0.021
 wrists-0.021
 sa-0.021
-0.021
Token,
Token ID11
Feature activation+7.46e-01

Pos logit contributions
 afar+0.281
ires+0.265
err+0.258
ear+0.256
ening+0.254
Neg logit contributions
 -0.046
 said-0.036
 Together-0.032
 will-0.029
 they-0.028
Token there
Token ID612
Feature activation+3.98e+00

Pos logit contributions
ires+0.031
ear+0.030
 afar+0.029
err+0.029
ocol+0.028
Neg logit contributions
 will-0.027
 said-0.026
 chip-0.025
 -0.025
-0.024
Token was
Token ID373
Feature activation+2.03e+00

Pos logit contributions
ires+0.232
 afar+0.225
ear+0.222
uttering+0.219
err+0.218
Neg logit contributions
 will-0.093
 said-0.088
 chip-0.078
 -0.077
 says-0.075
Tokeny
Token ID88
Feature activation+5.85e+00

Pos logit contributions
ires+0.124
 afar+0.115
err+0.113
ear+0.112
ges+0.110
Neg logit contributions
 wrists-0.137
 chip-0.130
 will-0.129
tails-0.127
 sa-0.127
Token anymore
Token ID7471
Feature activation+6.36e+00

Pos logit contributions
ires+0.225
 afar+0.212
err+0.209
ear+0.206
eway+0.204
Neg logit contributions
 wrists-0.225
 chip-0.215
 will-0.212
tails-0.209
 sa-0.209
Token.
Token ID13
Feature activation+2.31e+00

Pos logit contributions
ires+0.266
 afar+0.250
err+0.248
ear+0.243
eway+0.242
Neg logit contributions
 wrists-0.227
 will-0.218
 chip-0.215
 sa-0.209
tails-0.209
Token<|endoftext|>
Token ID50256
Feature activation+4.87e+00

Pos logit contributions
ires+0.106
 afar+0.101
err+0.099
ear+0.099
eway+0.098
Neg logit contributions
 wrists-0.073
 will-0.068
 chip-0.068
tails-0.067
 sa-0.067
TokenOnce
Token ID7454
Feature activation+2.06e+00

Pos logit contributions
ires+0.258
 afar+0.247
uttering+0.240
ear+0.238
err+0.234
Neg logit contributions
 will-0.125
-0.114
 -0.114
 said-0.113
 chip-0.112
Token upon
Token ID2402
Feature activation+9.02e+00

Pos logit contributions
 afar+0.057
ear+0.054
vised+0.054
foot+0.052
ant+0.052
Neg logit contributions
 -0.108
 said-0.108
 will-0.104
 says-0.101
-0.097
Token a
Token ID257
Feature activation+5.30e-01

Pos logit contributions
ires+0.545
ied+0.534
ging+0.533
ening+0.529
ying+0.523
Neg logit contributions
 said-0.173
 will-0.162
 chip-0.147
 -0.145
 kissed-0.139
Token time
Token ID640
Feature activation+3.86e+00

Pos logit contributions
ires+0.017
eway+0.017
ges+0.016
ied+0.016
 afar+0.015
Neg logit contributions
 will-0.024
-0.023
 chip-0.023
 wrists-0.023
 sa-0.023
Token,
Token ID11
Feature activation+8.64e-01

Pos logit contributions
 afar+0.276
ires+0.261
err+0.256
ear+0.252
ening+0.249
Neg logit contributions
 -0.044
 said-0.036
 Together-0.035
 will-0.033
 chip-0.032
Token there
Token ID612
Feature activation+3.94e+00

Pos logit contributions
ear+0.036
ires+0.035
eway+0.035
ocol+0.034
err+0.034
Neg logit contributions
 will-0.031
 said-0.029
 chip-0.028
 -0.028
-0.028
Token was
Token ID373
Feature activation+2.16e+00

Pos logit contributions
ires+0.227
 afar+0.221
ear+0.216
uttering+0.215
err+0.214
Neg logit contributions
 will-0.094
 said-0.085
 chip-0.079
 -0.076
 says-0.076
Token to
Token ID284
Feature activation+1.93e+00

Pos logit contributions
 wrists+0.008
 will+0.007
 sa+0.007
 chip+0.007
tails+0.007
Neg logit contributions
ires-0.010
 afar-0.010
err-0.010
uttering-0.010
eway-0.009
Token her
Token ID607
Feature activation+4.94e+00

Pos logit contributions
 afar+0.078
ires+0.078
eway+0.073
ear+0.073
 underground+0.072
Neg logit contributions
 wrists-0.068
 chip-0.065
 depends-0.064
 will-0.063
tails-0.063
Token.
Token ID13
Feature activation+2.51e+00

Pos logit contributions
ires+0.197
 afar+0.188
 behind+0.179
 upon+0.179
 underground+0.178
Neg logit contributions
 wrists-0.173
 chip-0.170
 depends-0.168
Together-0.168
 will-0.166
Token<|endoftext|>
Token ID50256
Feature activation+4.80e+00

Pos logit contributions
ires+0.119
 afar+0.112
err+0.111
ear+0.111
eway+0.110
Neg logit contributions
 wrists-0.075
 will-0.071
 chip-0.070
 sa-0.070
tails-0.069
TokenOnce
Token ID7454
Feature activation+2.02e+00

Pos logit contributions
ires+0.255
 afar+0.245
uttering+0.238
ear+0.237
err+0.236
Neg logit contributions
 will-0.118
 chip-0.110
 wrists-0.109
 Together-0.105
 says-0.105
Token upon
Token ID2402
Feature activation+9.00e+00

Pos logit contributions
 afar+0.053
ear+0.047
vised+0.046
foot+0.045
err+0.045
Neg logit contributions
 -0.112
 said-0.111
 will-0.109
 says-0.105
-0.102
Token a
Token ID257
Feature activation+4.86e-01

Pos logit contributions
ires+0.520
ied+0.502
ging+0.500
ening+0.497
ying+0.487
Neg logit contributions
 said-0.197
 will-0.192
 chip-0.178
 -0.171
 sa-0.170
Token time
Token ID640
Feature activation+3.76e+00

Pos logit contributions
ires+0.015
eway+0.014
ges+0.013
 afar+0.013
ging+0.013
Neg logit contributions
 will-0.023
 chip-0.022
 wrists-0.022
-0.022
 sa-0.022
Token there
Token ID612
Feature activation+4.10e+00

Pos logit contributions
 afar+0.263
ires+0.247
err+0.243
ear+0.241
away+0.235
Neg logit contributions
 -0.049
 said-0.043
 will-0.042
 Together-0.042
 chip-0.040
Token was
Token ID373
Feature activation+2.10e+00

Pos logit contributions
ires+0.221
 afar+0.213
ear+0.210
err+0.208
uttering+0.206
Neg logit contributions
 will-0.108
 said-0.101
 chip-0.099
 says-0.091
 wrists-0.090
Token a
Token ID257
Feature activation+1.76e-01

Pos logit contributions
ires+0.111
eway+0.109
ging+0.106
ening+0.106
ant+0.104
Neg logit contributions
 chip-0.052
 sa-0.050
 said-0.050
 wrists-0.046
 piping-0.046
Token it
Token ID340
Feature activation+5.35e+00

Pos logit contributions
ires+0.102
 afar+0.096
err+0.094
ear+0.093
eway+0.092
Neg logit contributions
 wrists-0.101
 will-0.097
 chip-0.097
 sa-0.095
tails-0.094
Token forever
Token ID8097
Feature activation-2.05e+00

Pos logit contributions
ires+0.226
 afar+0.214
err+0.211
ear+0.210
eway+0.208
Neg logit contributions
 wrists-0.185
 chip-0.176
tails-0.174
girls-0.169
 sa-0.167
Token.
Token ID13
Feature activation+1.70e+00

Pos logit contributions
 wrists+0.076
 chip+0.074
tails+0.071
 sa+0.070
girls+0.070
Neg logit contributions
ires-0.082
 afar-0.078
ear-0.076
err-0.075
eway-0.074
Token<|endoftext|>
Token ID50256
Feature activation+4.71e+00

Pos logit contributions
ires+0.078
 afar+0.074
ear+0.073
err+0.073
eway+0.072
Neg logit contributions
 wrists-0.054
 will-0.050
 chip-0.050
 sa-0.050
tails-0.050
TokenOnce
Token ID7454
Feature activation+1.95e+00

Pos logit contributions
ires+0.241
 afar+0.230
uttering+0.223
ear+0.223
err+0.221
Neg logit contributions
 will-0.126
 chip-0.117
 wrists-0.116
 said-0.114
-0.114
Token upon
Token ID2402
Feature activation+8.99e+00

Pos logit contributions
 afar+0.047
ear+0.043
err+0.042
vised+0.042
foot+0.042
Neg logit contributions
 -0.108
 said-0.108
 will-0.106
 says-0.102
-0.098
Token a
Token ID257
Feature activation+5.76e-01

Pos logit contributions
ires+0.520
ging+0.508
ied+0.501
ening+0.499
ying+0.488
Neg logit contributions
 said-0.194
 will-0.184
 chip-0.170
 sa-0.164
 -0.162
Token time
Token ID640
Feature activation+3.71e+00

Pos logit contributions
ires+0.018
eway+0.017
ges+0.017
 afar+0.016
ging+0.016
Neg logit contributions
 will-0.027
 chip-0.026
-0.026
 wrists-0.026
 sa-0.025
Token there
Token ID612
Feature activation+4.13e+00

Pos logit contributions
 afar+0.264
ires+0.250
err+0.245
ear+0.243
ening+0.237
Neg logit contributions
 -0.046
 will-0.038
 said-0.037
 Together-0.034
 chip-0.033
Token were
Token ID547
Feature activation-6.81e-01

Pos logit contributions
ires+0.229
 afar+0.220
ear+0.217
err+0.216
uttering+0.212
Neg logit contributions
 will-0.106
 said-0.099
 chip-0.093
 -0.088
 says-0.088
Token two
Token ID734
Feature activation-2.97e+00

Pos logit contributions
 wrists+0.023
 chip+0.022
 will+0.021
tails+0.021
 sa+0.021
Neg logit contributions
ires-0.030
 afar-0.028
err-0.028
ear-0.028
eway-0.028
Token to
Token ID284
Feature activation+1.93e+00

Pos logit contributions
ires+0.183
 afar+0.170
err+0.165
ear+0.165
foot+0.165
Neg logit contributions
 wrists-0.200
 will-0.193
 depends-0.189
 chip-0.188
tails-0.187
Token them
Token ID606
Feature activation+4.27e+00

Pos logit contributions
 afar+0.078
ires+0.078
eway+0.073
ear+0.073
 underground+0.072
Neg logit contributions
 wrists-0.068
 chip-0.065
 depends-0.064
 will-0.063
tails-0.063
Token.
Token ID13
Feature activation+2.51e+00

Pos logit contributions
ires+0.135
 afar+0.129
 behind+0.126
ear+0.124
 upon+0.124
Neg logit contributions
 wrists-0.186
 chip-0.181
 depends-0.178
tails-0.177
 will-0.176
Token<|endoftext|>
Token ID50256
Feature activation+4.79e+00

Pos logit contributions
ires+0.123
 afar+0.116
ear+0.114
err+0.114
eway+0.113
Neg logit contributions
 wrists-0.072
 will-0.067
tails-0.067
 sa-0.067
 chip-0.066
TokenOnce
Token ID7454
Feature activation+2.01e+00

Pos logit contributions
ires+0.258
 afar+0.250
uttering+0.242
ear+0.240
err+0.239
Neg logit contributions
 will-0.116
 chip-0.106
 Together-0.103
-0.103
 says-0.103
Token upon
Token ID2402
Feature activation+8.99e+00

Pos logit contributions
 afar+0.049
ear+0.044
vised+0.042
err+0.042
foot+0.042
Neg logit contributions
 -0.112
 said-0.112
 will-0.111
 says-0.107
-0.103
Token a
Token ID257
Feature activation+4.83e-01

Pos logit contributions
ires+0.522
ied+0.507
ging+0.507
ening+0.503
ying+0.495
Neg logit contributions
 said-0.193
 will-0.186
 chip-0.173
 -0.165
 sa-0.164
Token time
Token ID640
Feature activation+3.76e+00

Pos logit contributions
ires+0.014
eway+0.013
 afar+0.013
ges+0.013
ging+0.012
Neg logit contributions
 will-0.023
 chip-0.023
 wrists-0.023
 sa-0.022
-0.022
Token,
Token ID11
Feature activation+8.54e-01

Pos logit contributions
 afar+0.265
ires+0.248
err+0.244
ear+0.243
away+0.236
Neg logit contributions
 -0.048
 will-0.042
 said-0.042
 Together-0.041
 chip-0.038
Token there
Token ID612
Feature activation+4.01e+00

Pos logit contributions
ear+0.033
ires+0.032
 afar+0.032
eway+0.032
err+0.031
Neg logit contributions
 will-0.033
 said-0.032
 chip-0.031
 wrists-0.030
 -0.029
Token was
Token ID373
Feature activation+2.07e+00

Pos logit contributions
ires+0.220
 afar+0.214
ear+0.210
err+0.208
uttering+0.206
Neg logit contributions
 will-0.102
 said-0.093
 chip-0.091
 says-0.084
 -0.083
Token to
Token ID284
Feature activation+1.93e+00

Pos logit contributions
ires+0.191
 afar+0.181
ear+0.176
err+0.175
uttering+0.175
Neg logit contributions
 will-0.113
 wrists-0.111
 chip-0.109
 sa-0.105
 depends-0.105
Token it
Token ID340
Feature activation+4.39e+00

Pos logit contributions
 afar+0.078
ires+0.078
eway+0.073
ear+0.073
 underground+0.072
Neg logit contributions
 wrists-0.068
 chip-0.065
 depends-0.064
 will-0.063
tails-0.063
Token.
Token ID13
Feature activation+2.25e+00

Pos logit contributions
ires+0.145
 afar+0.138
 underground+0.135
ear+0.135
eway+0.134
Neg logit contributions
 wrists-0.190
 chip-0.181
tails-0.180
girls-0.178
 sa-0.176
Token<|endoftext|>
Token ID50256
Feature activation+4.64e+00

Pos logit contributions
ires+0.105
 afar+0.099
ear+0.098
err+0.098
eway+0.097
Neg logit contributions
 wrists-0.070
 will-0.066
 chip-0.065
tails-0.065
 sa-0.064
TokenOnce
Token ID7454
Feature activation+1.95e+00

Pos logit contributions
ires+0.250
 afar+0.241
uttering+0.234
ear+0.232
err+0.231
Neg logit contributions
 will-0.112
 chip-0.104
 wrists-0.101
 Together-0.101
 said-0.100
Token upon
Token ID2402
Feature activation+8.97e+00

Pos logit contributions
 afar+0.046
ear+0.041
err+0.040
ires+0.039
foot+0.039
Neg logit contributions
 -0.111
 said-0.109
 will-0.109
 says-0.104
-0.101
Token a
Token ID257
Feature activation+4.32e-01

Pos logit contributions
ires+0.524
ging+0.510
ied+0.506
ening+0.503
ying+0.495
Neg logit contributions
 said-0.192
 will-0.182
 chip-0.169
 -0.166
 sa-0.161
Token time
Token ID640
Feature activation+3.72e+00

Pos logit contributions
ires+0.013
eway+0.012
 afar+0.012
ges+0.012
ging+0.012
Neg logit contributions
 will-0.020
 chip-0.020
 wrists-0.020
-0.019
 sa-0.019
Token there
Token ID612
Feature activation+4.10e+00

Pos logit contributions
 afar+0.265
ires+0.250
err+0.245
ear+0.243
away+0.237
Neg logit contributions
 -0.047
 said-0.039
 will-0.038
 Together-0.038
 chip-0.036
Token was
Token ID373
Feature activation+2.04e+00

Pos logit contributions
ires+0.221
 afar+0.214
ear+0.210
err+0.208
uttering+0.206
Neg logit contributions
 will-0.107
 said-0.100
 chip-0.098
 says-0.091
 -0.089
Token a
Token ID257
Feature activation+1.34e-01

Pos logit contributions
ires+0.107
eway+0.105
ging+0.103
ening+0.102
ant+0.101
Neg logit contributions
 chip-0.050
 said-0.049
 sa-0.048
 piping-0.045
 wrists-0.044
Token swing
Token ID9628
Feature activation+1.45e+00

Pos logit contributions
ires+0.181
eway+0.174
err+0.171
ear+0.170
ening+0.168
Neg logit contributions
 chip-0.106
 said-0.102
 will-0.099
 sa-0.097
 wrists-0.096
Token together
Token ID1978
Feature activation-2.08e+00

Pos logit contributions
ires+0.051
 afar+0.048
ear+0.048
err+0.047
eway+0.047
Neg logit contributions
 wrists-0.060
 chip-0.058
 sa-0.057
tails-0.057
 depends-0.056
Token.
Token ID13
Feature activation+2.09e+00

Pos logit contributions
 wrists+0.087
 chip+0.086
tails+0.084
 sa+0.083
 depends+0.083
Neg logit contributions
ires-0.071
ear-0.066
 afar-0.066
eway-0.065
ges-0.064
Token<|endoftext|>
Token ID50256
Feature activation+4.82e+00

Pos logit contributions
ires+0.103
 afar+0.098
err+0.096
ear+0.096
eway+0.096
Neg logit contributions
 wrists-0.059
 will-0.055
 chip-0.055
 sa-0.055
tails-0.054
TokenOnce
Token ID7454
Feature activation+2.03e+00

Pos logit contributions
ires+0.272
 afar+0.264
uttering+0.257
ear+0.252
ging+0.251
Neg logit contributions
 will-0.107
 -0.097
-0.097
 said-0.095
 says-0.095
Token upon
Token ID2402
Feature activation+8.97e+00

Pos logit contributions
 afar+0.057
ear+0.052
vised+0.052
foot+0.050
err+0.049
Neg logit contributions
 -0.107
 said-0.107
 will-0.104
 says-0.100
-0.097
Token a
Token ID257
Feature activation+5.41e-01

Pos logit contributions
ires+0.528
ening+0.518
ging+0.518
ied+0.516
ying+0.506
Neg logit contributions
 said-0.183
 will-0.174
 chip-0.161
 -0.152
 rhy-0.151
Token time
Token ID640
Feature activation+3.80e+00

Pos logit contributions
ires+0.016
eway+0.015
ges+0.015
 afar+0.015
ging+0.015
Neg logit contributions
 will-0.026
 chip-0.025
 wrists-0.025
-0.025
 sa-0.024
Token,
Token ID11
Feature activation+8.30e-01

Pos logit contributions
 afar+0.261
ires+0.245
ear+0.241
err+0.241
ening+0.237
Neg logit contributions
 -0.050
 will-0.047
 said-0.046
 Together-0.042
 chip-0.040
Token there
Token ID612
Feature activation+3.97e+00

Pos logit contributions
ear+0.033
eway+0.032
ires+0.032
 afar+0.032
err+0.031
Neg logit contributions
 will-0.031
 said-0.030
 chip-0.028
 wrists-0.028
-0.028
Token was
Token ID373
Feature activation+2.08e+00

Pos logit contributions
ires+0.223
 afar+0.218
ear+0.212
err+0.211
vised+0.210
Neg logit contributions
 will-0.099
 said-0.089
 chip-0.082
 says-0.080
 -0.078
Token its
Token ID663
Feature activation+3.86e+00

Pos logit contributions
ires+0.055
ear+0.054
err+0.053
 afar+0.053
earing+0.052
Neg logit contributions
 wrists-0.050
 chip-0.045
girls-0.044
tails-0.043
 sa-0.043
Token creatures
Token ID8109
Feature activation+5.73e+00

Pos logit contributions
ires+0.170
 afar+0.159
err+0.158
ear+0.156
eway+0.155
Neg logit contributions
 wrists-0.130
 will-0.123
 chip-0.122
tails-0.119
 sa-0.119
Token.
Token ID13
Feature activation+2.06e+00

Pos logit contributions
ires+0.205
 afar+0.195
ear+0.190
err+0.187
eway+0.187
Neg logit contributions
 wrists-0.233
 chip-0.226
tails-0.219
 sa-0.216
 depends-0.213
Token<|endoftext|>
Token ID50256
Feature activation+4.71e+00

Pos logit contributions
ires+0.097
 afar+0.092
err+0.090
ear+0.090
eway+0.089
Neg logit contributions
 wrists-0.063
 will-0.059
 chip-0.059
 sa-0.058
tails-0.058
TokenOnce
Token ID7454
Feature activation+1.97e+00

Pos logit contributions
ires+0.253
 afar+0.242
uttering+0.236
ear+0.234
err+0.233
Neg logit contributions
 will-0.114
 chip-0.104
 said-0.104
-0.104
 says-0.102
Token upon
Token ID2402
Feature activation+8.96e+00

Pos logit contributions
 afar+0.051
ear+0.048
vised+0.046
foot+0.046
err+0.045
Neg logit contributions
 -0.108
 said-0.107
 will-0.104
 says-0.100
-0.097
Token a
Token ID257
Feature activation+4.18e-01

Pos logit contributions
ires+0.522
ging+0.505
ied+0.505
ening+0.499
ying+0.491
Neg logit contributions
 said-0.191
 will-0.184
 chip-0.171
 sa-0.163
 -0.160
Token time
Token ID640
Feature activation+3.66e+00

Pos logit contributions
ires+0.013
eway+0.012
ges+0.012
 afar+0.012
ging+0.011
Neg logit contributions
 will-0.020
 chip-0.019
 wrists-0.019
-0.019
 sa-0.018
Token,
Token ID11
Feature activation+7.16e-01

Pos logit contributions
 afar+0.260
ires+0.246
err+0.241
ear+0.240
ening+0.233
Neg logit contributions
 -0.046
 will-0.038
 said-0.038
 Together-0.035
 chip-0.034
Token there
Token ID612
Feature activation+3.94e+00

Pos logit contributions
ear+0.027
ires+0.027
 afar+0.026
eway+0.026
err+0.026
Neg logit contributions
 will-0.027
 said-0.026
 chip-0.026
 wrists-0.026
 -0.025
Token was
Token ID373
Feature activation+2.06e+00

Pos logit contributions
ires+0.220
 afar+0.213
ear+0.209
err+0.207
uttering+0.206
Neg logit contributions
 will-0.098
 said-0.090
 chip-0.085
 -0.080
 says-0.080
Token the
Token ID262
Feature activation+2.75e+00

Pos logit contributions
 earlier+0.041
 outside+0.040
 behind+0.040
ied+0.039
 when+0.039
Neg logit contributions
 seventeen-0.043
vals-0.043
girls-0.042
rob-0.040
 nineteen-0.040
Token butterfly
Token ID35113
Feature activation+5.68e+00

Pos logit contributions
ires+0.115
 afar+0.108
err+0.106
ear+0.105
eway+0.105
Neg logit contributions
 wrists-0.098
 chip-0.093
 will-0.093
tails-0.091
 sa-0.091
Token.
Token ID13
Feature activation+2.41e+00

Pos logit contributions
ires+0.203
 afar+0.202
ear+0.202
 upon+0.196
 behind+0.196
Neg logit contributions
 wrists-0.217
 chip-0.211
 sa-0.202
tails-0.199
girls-0.197
Token<|endoftext|>
Token ID50256
Feature activation+4.94e+00

Pos logit contributions
ires+0.109
 afar+0.103
ear+0.102
err+0.102
eway+0.100
Neg logit contributions
 wrists-0.077
 will-0.073
 chip-0.072
 sa-0.072
tails-0.072
TokenOnce
Token ID7454
Feature activation+2.04e+00

Pos logit contributions
ires+0.271
 afar+0.261
uttering+0.254
ear+0.252
err+0.250
Neg logit contributions
 will-0.116
 -0.105
 said-0.104
-0.104
 chip-0.103
Token upon
Token ID2402
Feature activation+8.95e+00

Pos logit contributions
 afar+0.056
ear+0.051
vised+0.050
foot+0.049
err+0.049
Neg logit contributions
 -0.112
 said-0.109
 will-0.106
 says-0.102
-0.099
Token a
Token ID257
Feature activation+4.98e-01

Pos logit contributions
ires+0.531
ging+0.518
ied+0.514
ening+0.510
ying+0.504
Neg logit contributions
 said-0.184
 will-0.174
 chip-0.159
 -0.156
 sa-0.150
Token time
Token ID640
Feature activation+3.77e+00

Pos logit contributions
ires+0.015
eway+0.015
ges+0.014
 afar+0.014
ging+0.014
Neg logit contributions
 will-0.023
 chip-0.023
 wrists-0.022
-0.022
 sa-0.022
Token there
Token ID612
Feature activation+4.06e+00

Pos logit contributions
 afar+0.265
ires+0.252
err+0.245
ear+0.244
ening+0.239
Neg logit contributions
 -0.049
 will-0.041
 said-0.040
 Together-0.038
 chip-0.037
Token was
Token ID373
Feature activation+2.12e+00

Pos logit contributions
ires+0.225
 afar+0.217
ear+0.214
err+0.212
uttering+0.209
Neg logit contributions
 will-0.103
 said-0.097
 chip-0.091
 -0.087
 says-0.086
Token a
Token ID257
Feature activation+2.00e-01

Pos logit contributions
ires+0.112
eway+0.112
ging+0.109
ening+0.108
ant+0.108
Neg logit contributions
 chip-0.050
 said-0.049
 sa-0.047
 butter-0.044
 piping-0.044
Token scary
Token ID14343
Feature activation+6.75e+00

Pos logit contributions
 wrists+0.011
girls+0.010
tails+0.010
 chip+0.009
 depends+0.009
Neg logit contributions
 afar-0.015
ires-0.015
err-0.014
ear-0.014
 underground-0.014
Token dogs
Token ID6844
Feature activation+3.43e+00

Pos logit contributions
ires+0.271
 afar+0.252
err+0.249
ear+0.246
eway+0.244
Neg logit contributions
 wrists-0.253
 will-0.242
 chip-0.242
 sa-0.235
tails-0.234
Token.
Token ID13
Feature activation+2.28e+00

Pos logit contributions
ires+0.117
 afar+0.114
ges+0.109
ear+0.109
 underground+0.108
Neg logit contributions
 wrists-0.144
 chip-0.140
 depends-0.133
tails-0.132
 sa-0.131
Token<|endoftext|>
Token ID50256
Feature activation+4.75e+00

Pos logit contributions
ires+0.106
 afar+0.101
err+0.099
ear+0.099
eway+0.098
Neg logit contributions
 wrists-0.071
 will-0.066
 chip-0.065
 sa-0.065
tails-0.065
TokenOnce
Token ID7454
Feature activation+1.98e+00

Pos logit contributions
ires+0.266
 afar+0.256
uttering+0.249
ear+0.244
ging+0.243
Neg logit contributions
 will-0.110
 -0.103
-0.102
 said-0.099
 says-0.097
Token upon
Token ID2402
Feature activation+8.95e+00

Pos logit contributions
 afar+0.057
vised+0.053
ear+0.053
foot+0.051
ant+0.050
Neg logit contributions
 -0.104
 said-0.102
 will-0.100
 says-0.096
-0.093
Token a
Token ID257
Feature activation+3.36e-01

Pos logit contributions
ires+0.537
ied+0.526
ging+0.525
ening+0.522
ying+0.517
Neg logit contributions
 said-0.177
 will-0.166
 chip-0.150
 -0.145
 nodded-0.143
Token time
Token ID640
Feature activation+3.78e+00

Pos logit contributions
ires+0.010
eway+0.010
ges+0.009
 afar+0.009
ging+0.009
Neg logit contributions
 will-0.016
 wrists-0.015
 chip-0.015
-0.015
 sa-0.015
Token,
Token ID11
Feature activation+6.97e-01

Pos logit contributions
 afar+0.262
ires+0.247
err+0.243
ear+0.241
ening+0.238
Neg logit contributions
 -0.050
 said-0.044
 will-0.041
 Together-0.041
 chip-0.038
Token there
Token ID612
Feature activation+3.91e+00

Pos logit contributions
ear+0.028
ires+0.027
eway+0.027
err+0.027
 afar+0.027
Neg logit contributions
 will-0.026
 said-0.025
 chip-0.024
 -0.023
 wrists-0.023
Token was
Token ID373
Feature activation+2.04e+00

Pos logit contributions
ires+0.219
 afar+0.214
ear+0.209
err+0.208
uttering+0.205
Neg logit contributions
 will-0.097
 said-0.088
 chip-0.082
 says-0.078
 -0.078
Token his
Token ID465
Feature activation+6.68e+00

Pos logit contributions
ires+0.056
 afar+0.053
err+0.052
ear+0.052
eway+0.051
Neg logit contributions
 wrists-0.039
 will-0.037
 chip-0.037
tails-0.036
 sa-0.036
Token favorite
Token ID4004
Feature activation+5.86e+00

Pos logit contributions
ires+0.294
 afar+0.276
err+0.274
ear+0.272
eway+0.270
Neg logit contributions
 wrists-0.224
 will-0.212
 chip-0.211
 sa-0.205
tails-0.205
Token.
Token ID13
Feature activation+2.51e+00

Pos logit contributions
ires+0.231
 afar+0.216
err+0.213
ear+0.210
eway+0.209
Neg logit contributions
 wrists-0.224
 chip-0.213
 will-0.213
 sa-0.208
tails-0.208
Token<|endoftext|>
Token ID50256
Feature activation+4.79e+00

Pos logit contributions
ires+0.118
 afar+0.111
err+0.110
ear+0.110
eway+0.109
Neg logit contributions
 wrists-0.077
 will-0.073
 chip-0.071
tails-0.071
 sa-0.070
TokenOnce
Token ID7454
Feature activation+1.99e+00

Pos logit contributions
ires+0.261
 afar+0.251
uttering+0.245
err+0.242
ear+0.242
Neg logit contributions
 will-0.113
 chip-0.102
 said-0.101
 says-0.101
-0.100
Token upon
Token ID2402
Feature activation+8.94e+00

Pos logit contributions
 afar+0.054
ear+0.049
vised+0.048
err+0.047
foot+0.047
Neg logit contributions
 -0.108
 said-0.108
 will-0.105
 says-0.102
-0.098
Token a
Token ID257
Feature activation+4.59e-01

Pos logit contributions
ires+0.525
ied+0.509
ging+0.504
ening+0.503
ying+0.497
Neg logit contributions
 said-0.189
 will-0.181
 chip-0.169
 -0.163
 sa-0.160
Token time
Token ID640
Feature activation+3.73e+00

Pos logit contributions
ires+0.013
eway+0.013
 afar+0.012
ges+0.012
ging+0.012
Neg logit contributions
 will-0.022
 wrists-0.022
 chip-0.022
-0.021
 sa-0.021
Token,
Token ID11
Feature activation+8.05e-01

Pos logit contributions
 afar+0.265
ires+0.249
err+0.245
ear+0.241
ening+0.237
Neg logit contributions
 -0.046
 will-0.040
 said-0.040
 Together-0.037
 chip-0.036
Token there
Token ID612
Feature activation+3.99e+00

Pos logit contributions
ear+0.031
ires+0.031
 afar+0.030
err+0.030
eway+0.030
Neg logit contributions
 will-0.031
 said-0.029
 chip-0.029
 wrists-0.028
-0.027
Token was
Token ID373
Feature activation+2.06e+00

Pos logit contributions
ires+0.222
 afar+0.216
ear+0.211
err+0.209
uttering+0.209
Neg logit contributions
 will-0.100
 said-0.092
 chip-0.088
 says-0.083
 -0.082
Token the
Token ID262
Feature activation+2.75e+00

Pos logit contributions
ires+0.065
uttering+0.060
err+0.060
ear+0.060
 afar+0.060
Neg logit contributions
 wrists-0.050
 chip-0.050
 sa-0.050
 will-0.049
 ra-0.047
Token house
Token ID2156
Feature activation+1.18e+00

Pos logit contributions
ires+0.115
 afar+0.108
err+0.106
ear+0.105
eway+0.105
Neg logit contributions
 wrists-0.098
 chip-0.093
 will-0.093
tails-0.091
 sa-0.091
Token.
Token ID13
Feature activation+2.25e+00

Pos logit contributions
ires+0.043
 afar+0.040
eway+0.039
ear+0.039
 underground+0.039
Neg logit contributions
 wrists-0.049
 chip-0.046
tails-0.046
girls-0.045
 sa-0.044
Token<|endoftext|>
Token ID50256
Feature activation+4.80e+00

Pos logit contributions
ires+0.108
 afar+0.102
err+0.101
ear+0.100
eway+0.099
Neg logit contributions
 wrists-0.067
 will-0.063
 chip-0.063
tails-0.061
 sa-0.061
TokenOnce
Token ID7454
Feature activation+1.92e+00

Pos logit contributions
ires+0.271
 afar+0.263
uttering+0.256
ear+0.252
ging+0.250
Neg logit contributions
 will-0.109
 -0.101
-0.098
 said-0.097
 says-0.096
Token upon
Token ID2402
Feature activation+8.94e+00

Pos logit contributions
 afar+0.055
ear+0.050
vised+0.049
foot+0.047
err+0.047
Neg logit contributions
 -0.104
 said-0.101
 will-0.098
 says-0.095
-0.091
Token a
Token ID257
Feature activation+4.14e-01

Pos logit contributions
ires+0.534
ging+0.521
ening+0.518
ied+0.518
ying+0.508
Neg logit contributions
 said-0.181
 will-0.169
 chip-0.155
 -0.150
 kissed-0.146
Token time
Token ID640
Feature activation+3.70e+00

Pos logit contributions
ires+0.013
eway+0.012
ges+0.012
 afar+0.012
ging+0.012
Neg logit contributions
 will-0.019
 chip-0.019
 wrists-0.018
-0.018
 sa-0.018
Token,
Token ID11
Feature activation+7.89e-01

Pos logit contributions
 afar+0.263
ires+0.247
err+0.242
ear+0.242
ening+0.237
Neg logit contributions
 -0.047
 said-0.039
 will-0.038
 Together-0.036
 chip-0.034
Token there
Token ID612
Feature activation+3.98e+00

Pos logit contributions
ear+0.032
ires+0.031
 afar+0.030
eway+0.030
err+0.030
Neg logit contributions
 will-0.029
 said-0.028
 chip-0.027
 wrists-0.026
 -0.026
Token was
Token ID373
Feature activation+2.05e+00

Pos logit contributions
ires+0.228
 afar+0.222
ear+0.218
err+0.215
uttering+0.214
Neg logit contributions
 will-0.095
 said-0.088
 chip-0.079
 -0.078
 says-0.076
Token the
Token ID262
Feature activation+2.75e+00

Pos logit contributions
ires+0.085
 afar+0.078
 underground+0.077
err+0.077
ear+0.076
Neg logit contributions
 wrists-0.091
tails-0.083
 will-0.082
 depends-0.082
 chip-0.081
Token park
Token ID3952
Feature activation+4.03e+00

Pos logit contributions
ires+0.115
 afar+0.108
err+0.106
ear+0.105
eway+0.105
Neg logit contributions
 wrists-0.098
 chip-0.093
 will-0.093
tails-0.091
 sa-0.091
Token.
Token ID13
Feature activation+2.54e+00

Pos logit contributions
ires+0.144
 afar+0.137
eway+0.134
ear+0.134
err+0.133
Neg logit contributions
 wrists-0.164
 chip-0.158
tails-0.154
 depends-0.153
 sa-0.153
Token<|endoftext|>
Token ID50256
Feature activation+4.84e+00

Pos logit contributions
ires+0.127
 afar+0.120
err+0.119
uttering+0.118
ear+0.117
Neg logit contributions
 wrists-0.069
 chip-0.065
 will-0.065
tails-0.061
 depends-0.061
TokenOnce
Token ID7454
Feature activation+1.96e+00

Pos logit contributions
ires+0.275
 afar+0.268
uttering+0.261
ear+0.256
ging+0.256
Neg logit contributions
 will-0.107
 -0.099
-0.095
 said-0.094
 says-0.094
Token upon
Token ID2402
Feature activation+8.94e+00

Pos logit contributions
 afar+0.055
ear+0.050
vised+0.050
foot+0.048
err+0.047
Neg logit contributions
 -0.106
 said-0.104
 will-0.101
 says-0.097
-0.093
Token a
Token ID257
Feature activation+4.41e-01

Pos logit contributions
ires+0.529
ging+0.514
ied+0.513
ening+0.513
ying+0.503
Neg logit contributions
 said-0.184
 will-0.175
 chip-0.160
 -0.155
 sa-0.151
Token time
Token ID640
Feature activation+3.74e+00

Pos logit contributions
ires+0.013
eway+0.013
ges+0.012
 afar+0.012
ging+0.012
Neg logit contributions
 will-0.021
 wrists-0.020
 chip-0.020
-0.020
 sa-0.020
Token,
Token ID11
Feature activation+8.10e-01

Pos logit contributions
 afar+0.263
ires+0.247
ear+0.242
err+0.242
ening+0.237
Neg logit contributions
 -0.050
 said-0.042
 will-0.042
 Together-0.039
 chip-0.037
Token there
Token ID612
Feature activation+4.00e+00

Pos logit contributions
ear+0.032
ires+0.031
 afar+0.031
eway+0.031
err+0.030
Neg logit contributions
 will-0.031
 said-0.029
 chip-0.028
 wrists-0.028
 -0.027
Token was
Token ID373
Feature activation+2.08e+00

Pos logit contributions
ires+0.228
 afar+0.221
ear+0.218
err+0.215
uttering+0.213
Neg logit contributions
 will-0.097
 said-0.089
 chip-0.081
 -0.080
 says-0.078
Token the
Token ID262
Feature activation+2.75e+00

Pos logit contributions
earing+0.030
ear+0.029
err+0.029
ires+0.029
 afar+0.028
Neg logit contributions
 wrists-0.033
girls-0.033
tails-0.033
 chip-0.032
 depends-0.032
Token park
Token ID3952
Feature activation+4.03e+00

Pos logit contributions
ires+0.115
 afar+0.108
err+0.106
ear+0.105
eway+0.105
Neg logit contributions
 wrists-0.098
 chip-0.093
 will-0.093
tails-0.091
 sa-0.091
Token.
Token ID13
Feature activation+2.54e+00

Pos logit contributions
ires+0.144
 afar+0.137
eway+0.134
ear+0.134
err+0.133
Neg logit contributions
 wrists-0.164
 chip-0.158
tails-0.154
 depends-0.153
 sa-0.153
Token<|endoftext|>
Token ID50256
Feature activation+4.84e+00

Pos logit contributions
ires+0.127
 afar+0.120
err+0.119
uttering+0.118
ear+0.117
Neg logit contributions
 wrists-0.069
 chip-0.065
 will-0.065
tails-0.061
 depends-0.061
TokenOnce
Token ID7454
Feature activation+1.96e+00

Pos logit contributions
ires+0.275
 afar+0.268
uttering+0.261
ear+0.256
ging+0.256
Neg logit contributions
 will-0.107
 -0.099
-0.095
 said-0.094
 says-0.094
Token upon
Token ID2402
Feature activation+8.94e+00

Pos logit contributions
 afar+0.055
ear+0.050
vised+0.050
foot+0.048
err+0.047
Neg logit contributions
 -0.106
 said-0.104
 will-0.101
 says-0.097
-0.093
Token a
Token ID257
Feature activation+4.41e-01

Pos logit contributions
ires+0.529
ging+0.514
ied+0.513
ening+0.513
ying+0.503
Neg logit contributions
 said-0.184
 will-0.175
 chip-0.160
 -0.155
 sa-0.151
Token time
Token ID640
Feature activation+3.74e+00

Pos logit contributions
ires+0.013
eway+0.013
ges+0.012
 afar+0.012
ging+0.012
Neg logit contributions
 will-0.021
 wrists-0.020
 chip-0.020
-0.020
 sa-0.020
Token,
Token ID11
Feature activation+8.10e-01

Pos logit contributions
 afar+0.263
ires+0.247
ear+0.242
err+0.242
ening+0.237
Neg logit contributions
 -0.050
 said-0.042
 will-0.042
 Together-0.039
 chip-0.037
Token there
Token ID612
Feature activation+4.00e+00

Pos logit contributions
ear+0.032
ires+0.031
 afar+0.031
eway+0.031
err+0.030
Neg logit contributions
 will-0.031
 said-0.029
 chip-0.028
 wrists-0.028
 -0.027
Token was
Token ID373
Feature activation+2.08e+00

Pos logit contributions
ires+0.228
 afar+0.221
ear+0.218
err+0.215
uttering+0.213
Neg logit contributions
 will-0.097
 said-0.089
 chip-0.081
 -0.080
 says-0.078
Token a
Token ID257
Feature activation+3.27e+00

Pos logit contributions
 wrists+0.086
 chip+0.084
 sa+0.072
 ra+0.071
tails+0.070
Neg logit contributions
ires-0.194
 afar-0.189
 upon-0.188
 underground-0.186
ear-0.185
Token friend
Token ID1545
Feature activation+2.59e+00

Pos logit contributions
ires+0.121
 afar+0.116
err+0.113
eway+0.110
ear+0.110
Neg logit contributions
 wrists-0.131
 chip-0.123
 will-0.122
tails-0.122
 depends-0.120
Token.
Token ID13
Feature activation+2.33e+00

Pos logit contributions
ear+0.093
 upon+0.093
 afar+0.093
 behind+0.093
ges+0.092
Neg logit contributions
 wrists-0.100
 chip-0.097
 sa-0.090
 depends-0.090
 ra-0.090
Token<|endoftext|>
Token ID50256
Feature activation+4.82e+00

Pos logit contributions
ires+0.109
 afar+0.103
ear+0.102
err+0.102
eway+0.101
Neg logit contributions
 wrists-0.072
 will-0.067
 sa-0.067
 chip-0.066
tails-0.066
TokenOnce
Token ID7454
Feature activation+1.99e+00

Pos logit contributions
ires+0.263
 afar+0.254
uttering+0.246
ear+0.244
err+0.242
Neg logit contributions
 will-0.115
 chip-0.103
 said-0.102
 says-0.102
 Together-0.102
Token upon
Token ID2402
Feature activation+8.94e+00

Pos logit contributions
 afar+0.053
ear+0.046
err+0.045
vised+0.044
ires+0.044
Neg logit contributions
 -0.111
 said-0.109
 will-0.107
 says-0.103
 Together-0.100
Token a
Token ID257
Feature activation+4.60e-01

Pos logit contributions
ires+0.533
ied+0.516
ging+0.515
ening+0.511
ying+0.504
Neg logit contributions
 said-0.185
 will-0.175
 -0.159
 chip-0.157
 kissed-0.150
Token time
Token ID640
Feature activation+3.71e+00

Pos logit contributions
ires+0.014
eway+0.013
 afar+0.013
ges+0.013
ging+0.012
Neg logit contributions
 will-0.022
 wrists-0.021
 chip-0.021
-0.021
 sa-0.021
Token,
Token ID11
Feature activation+8.33e-01

Pos logit contributions
 afar+0.260
ires+0.247
err+0.242
ear+0.241
ening+0.234
Neg logit contributions
 -0.051
 will-0.042
 said-0.042
 Together-0.039
 chip-0.037
Token there
Token ID612
Feature activation+3.99e+00

Pos logit contributions
ear+0.033
ires+0.033
err+0.032
 afar+0.032
eway+0.031
Neg logit contributions
 will-0.031
 said-0.030
 -0.029
 chip-0.029
-0.028
Token was
Token ID373
Feature activation+2.07e+00

Pos logit contributions
ires+0.229
 afar+0.220
ear+0.218
err+0.216
uttering+0.215
Neg logit contributions
 will-0.096
 said-0.088
 chip-0.081
 -0.080
 says-0.077
Token the
Token ID262
Feature activation+2.75e+00

Pos logit contributions
 wrists+0.037
 chip+0.037
 sa+0.033
girls+0.033
 ra+0.033
Neg logit contributions
ires-0.042
ear-0.041
 upon-0.040
 afar-0.040
err-0.040
Token rabbit
Token ID22746
Feature activation+4.49e+00

Pos logit contributions
ires+0.115
 afar+0.108
err+0.106
ear+0.105
eway+0.105
Neg logit contributions
 wrists-0.098
 chip-0.093
 will-0.093
tails-0.091
 sa-0.091
Token.
Token ID13
Feature activation+2.35e+00

Pos logit contributions
ires+0.178
 afar+0.176
 upon+0.174
ear+0.174
ges+0.171
Neg logit contributions
 wrists-0.159
 chip-0.149
tails-0.142
 depends-0.140
 sa-0.140
Token<|endoftext|>
Token ID50256
Feature activation+4.86e+00

Pos logit contributions
ires+0.106
 afar+0.101
ear+0.099
err+0.099
eway+0.098
Neg logit contributions
 wrists-0.076
 will-0.072
 chip-0.070
tails-0.070
 sa-0.070
TokenOnce
Token ID7454
Feature activation+1.97e+00

Pos logit contributions
ires+0.268
 afar+0.257
uttering+0.251
ear+0.248
err+0.245
Neg logit contributions
 will-0.115
 -0.106
-0.104
 said-0.103
 Together-0.103
Token upon
Token ID2402
Feature activation+8.94e+00

Pos logit contributions
 afar+0.053
ear+0.048
vised+0.046
err+0.046
ires+0.046
Neg logit contributions
 -0.109
 said-0.106
 will-0.103
 says-0.099
-0.097
Token a
Token ID257
Feature activation+4.48e-01

Pos logit contributions
ires+0.538
ging+0.524
ied+0.522
ening+0.517
ying+0.513
Neg logit contributions
 said-0.178
 will-0.165
 chip-0.152
 -0.151
 sa-0.144
Token time
Token ID640
Feature activation+3.72e+00

Pos logit contributions
ires+0.014
eway+0.013
ges+0.012
 afar+0.012
ging+0.012
Neg logit contributions
 will-0.021
 chip-0.021
 wrists-0.020
-0.020
 sa-0.020
Token,
Token ID11
Feature activation+8.37e-01

Pos logit contributions
 afar+0.263
ires+0.251
err+0.244
ear+0.243
ening+0.237
Neg logit contributions
 -0.048
 said-0.038
 will-0.037
 Together-0.036
 chip-0.035
Token there
Token ID612
Feature activation+3.98e+00

Pos logit contributions
ear+0.033
ires+0.033
 afar+0.032
eway+0.032
err+0.032
Neg logit contributions
 will-0.031
 said-0.030
 chip-0.029
 -0.029
 wrists-0.028
Token was
Token ID373
Feature activation+2.08e+00

Pos logit contributions
ires+0.226
 afar+0.218
ear+0.215
err+0.212
uttering+0.212
Neg logit contributions
 will-0.096
 said-0.089
 chip-0.082
 -0.080
 says-0.078
Token this
Token ID428
Feature activation+5.95e+00

Pos logit contributions
ires+0.038
err+0.036
uttering+0.036
eway+0.036
ear+0.034
Neg logit contributions
 wrists-0.039
 will-0.038
 chip-0.036
 sa-0.036
tails-0.036
Token again
Token ID757
Feature activation+6.99e-01

Pos logit contributions
ires+0.234
 afar+0.221
eway+0.218
err+0.217
 upon+0.214
Neg logit contributions
 wrists-0.222
tails-0.214
girls-0.211
 chip-0.204
Together-0.204
Token!"
Token ID2474
Feature activation-2.32e+00

Pos logit contributions
ires+0.027
 afar+0.026
ear+0.025
err+0.024
 upon+0.024
Neg logit contributions
 wrists-0.027
 chip-0.026
tails-0.026
 sa-0.025
girls-0.025
Token<|endoftext|>
Token ID50256
Feature activation+4.65e+00

Pos logit contributions
 wrists+0.081
tails+0.079
 will+0.078
 depends+0.077
 sa+0.075
Neg logit contributions
ires-0.094
 afar-0.090
ear-0.090
 upon-0.088
ges-0.088
TokenOnce
Token ID7454
Feature activation+1.92e+00

Pos logit contributions
ires+0.223
 afar+0.212
uttering+0.207
ear+0.203
err+0.203
Neg logit contributions
 will-0.139
 wrists-0.131
 chip-0.131
-0.129
 says-0.129
Token upon
Token ID2402
Feature activation+8.93e+00

Pos logit contributions
 afar+0.041
foot+0.037
vised+0.037
ear+0.037
err+0.036
Neg logit contributions
 said-0.111
 -0.109
 will-0.108
 says-0.107
-0.101
Token a
Token ID257
Feature activation+5.30e-01

Pos logit contributions
ires+0.498
ging+0.484
ied+0.477
ening+0.470
ying+0.463
Neg logit contributions
 said-0.209
 will-0.196
 chip-0.187
 -0.181
 sa-0.180
Token time
Token ID640
Feature activation+3.73e+00

Pos logit contributions
ires+0.015
 afar+0.014
eway+0.014
ges+0.013
ging+0.013
Neg logit contributions
 will-0.026
 chip-0.026
 wrists-0.025
 sa-0.025
-0.025
Token,
Token ID11
Feature activation+7.05e-01

Pos logit contributions
 afar+0.278
ires+0.266
err+0.259
ear+0.252
away+0.250
Neg logit contributions
 -0.042
 said-0.033
 they-0.032
 will-0.028
 Together-0.027
Token there
Token ID612
Feature activation+3.93e+00

Pos logit contributions
ires+0.028
ear+0.027
 afar+0.027
oken+0.026
err+0.026
Neg logit contributions
 will-0.027
 said-0.026
 -0.025
 chip-0.025
-0.024
Token was
Token ID373
Feature activation+2.00e+00

Pos logit contributions
ires+0.223
 afar+0.217
ear+0.212
uttering+0.211
err+0.211
Neg logit contributions
 will-0.095
 said-0.089
 chip-0.082
 says-0.078
 -0.077
Tokeny
Token ID88
Feature activation+7.05e+00

Pos logit contributions
 wrists+0.010
 will+0.009
 sa+0.009
 depends+0.009
 chip+0.009
Neg logit contributions
ires-0.019
eway-0.018
 afar-0.018
ges-0.018
err-0.018
Token days
Token ID1528
Feature activation-1.12e-01

Pos logit contributions
ires+0.273
 afar+0.257
err+0.253
ear+0.249
eway+0.247
Neg logit contributions
 wrists-0.270
 chip-0.258
 will-0.255
tails-0.250
 depends-0.250
Token.
Token ID13
Feature activation+2.36e+00

Pos logit contributions
 wrists+0.004
 chip+0.004
 depends+0.004
 will+0.004
tails+0.004
Neg logit contributions
ires-0.004
 afar-0.004
ear-0.004
ges-0.004
 underground-0.004
Token<|endoftext|>
Token ID50256
Feature activation+4.69e+00

Pos logit contributions
ires+0.112
 afar+0.106
err+0.105
ear+0.103
eway+0.103
Neg logit contributions
 wrists-0.071
 chip-0.067
 will-0.066
tails-0.064
 sa-0.064
TokenOnce
Token ID7454
Feature activation+1.91e+00

Pos logit contributions
ires+0.266
 afar+0.257
uttering+0.250
ear+0.247
ging+0.244
Neg logit contributions
 will-0.103
 -0.099
-0.094
 said-0.094
 Together-0.093
Token upon
Token ID2402
Feature activation+8.93e+00

Pos logit contributions
 afar+0.051
ear+0.047
vised+0.046
err+0.045
foot+0.045
Neg logit contributions
 -0.103
 said-0.102
 will-0.099
 says-0.096
-0.093
Token a
Token ID257
Feature activation+5.08e-01

Pos logit contributions
ires+0.533
ging+0.516
ied+0.516
ening+0.513
ying+0.506
Neg logit contributions
 said-0.181
 will-0.170
 chip-0.156
 -0.152
 sa-0.148
Token time
Token ID640
Feature activation+3.71e+00

Pos logit contributions
ires+0.016
eway+0.015
ges+0.015
 afar+0.014
ied+0.014
Neg logit contributions
 will-0.024
 chip-0.023
-0.023
 wrists-0.023
 sa-0.022
Token,
Token ID11
Feature activation+8.65e-01

Pos logit contributions
 afar+0.263
ires+0.249
err+0.244
ear+0.241
ening+0.237
Neg logit contributions
 -0.045
 said-0.038
 Together-0.035
 will-0.035
 chip-0.034
Token there
Token ID612
Feature activation+3.97e+00

Pos logit contributions
ear+0.035
ires+0.035
ocol+0.034
eway+0.034
err+0.034
Neg logit contributions
 will-0.031
 said-0.029
 chip-0.029
 -0.028
-0.028
Token was
Token ID373
Feature activation+2.10e+00

Pos logit contributions
ires+0.228
 afar+0.222
ear+0.217
err+0.216
uttering+0.214
Neg logit contributions
 will-0.094
 said-0.087
 chip-0.080
 -0.077
 says-0.076
Token clown
Token ID25573
Feature activation+2.33e+00

Pos logit contributions
ires+0.120
err+0.112
 afar+0.112
ear+0.110
eway+0.109
Neg logit contributions
 wrists-0.106
 chip-0.101
 will-0.101
 sa-0.098
tails-0.098
Tokens
Token ID82
Feature activation+3.74e-01

Pos logit contributions
ires+0.128
 afar+0.122
err+0.121
ear+0.119
eway+0.119
Neg logit contributions
 wrists-0.053
 chip-0.050
 will-0.049
tails-0.048
 sa-0.048
Token juggling
Token ID48097
Feature activation+1.53e+00

Pos logit contributions
ires+0.015
 afar+0.014
err+0.013
eway+0.013
uttering+0.013
Neg logit contributions
 wrists-0.014
 will-0.014
 chip-0.014
 depends-0.014
tails-0.013
Token and
Token ID290
Feature activation+8.09e-01

Pos logit contributions
ires+0.064
 afar+0.060
err+0.059
ear+0.058
eway+0.058
Neg logit contributions
 wrists-0.056
 chip-0.053
 will-0.053
 sa-0.052
tails-0.051
Token ac
Token ID936
Feature activation-1.03e+00

Pos logit contributions
ges+0.026
 afar+0.025
foot+0.025
ires+0.024
 behind+0.024
Neg logit contributions
 depends-0.034
 will-0.033
 wrists-0.033
Together-0.032
 chip-0.031
Tokenrob
Token ID22609
Feature activation-1.27e+01

Pos logit contributions
 wrists+0.018
 depends+0.015
 chip+0.015
 will+0.015
 sa+0.014
Neg logit contributions
ires-0.062
 afar-0.059
ear-0.059
err-0.059
ges-0.059
Tokenats
Token ID1381
Feature activation-6.02e-01

Pos logit contributions
 wrists+0.426
tails+0.396
 will+0.382
 sa+0.373
 says+0.370
Neg logit contributions
 afar-0.548
ires-0.547
err-0.533
eway-0.531
 underground-0.529
Token doing
Token ID1804
Feature activation-7.62e-01

Pos logit contributions
 wrists+0.024
 chip+0.023
 will+0.023
 depends+0.023
tails+0.022
Neg logit contributions
ires-0.022
 afar-0.021
err-0.020
ear-0.020
eway-0.020
Token flips
Token ID45971
Feature activation+2.49e+00

Pos logit contributions
 wrists+0.031
 chip+0.029
 will+0.029
 depends+0.029
tails+0.029
Neg logit contributions
ires-0.028
 afar-0.026
err-0.025
eway-0.025
ear-0.025
Token.
Token ID13
Feature activation+1.70e-01

Pos logit contributions
ires+0.105
 afar+0.098
err+0.096
uttering+0.093
earing+0.093
Neg logit contributions
 will-0.087
 chip-0.087
 wrists-0.087
 sa-0.085
tails-0.084
Token But
Token ID887
Feature activation-2.01e-01

Pos logit contributions
ires+0.008
 afar+0.007
err+0.007
ear+0.007
ges+0.007
Neg logit contributions
 wrists-0.005
 will-0.005
tails-0.005
 sa-0.005
 depends-0.005
Token thought
Token ID1807
Feature activation+2.01e+00

Pos logit contributions
 afar+0.047
ear+0.047
ires+0.046
earing+0.046
 upon+0.046
Neg logit contributions
 chip-0.046
 wrists-0.044
girls-0.044
 sa-0.043
 depends-0.043
Token they
Token ID484
Feature activation-3.71e+00

Pos logit contributions
err+0.087
ires+0.085
 afar+0.084
eway+0.081
uttering+0.079
Neg logit contributions
 will-0.069
 wrists-0.068
 chip-0.064
 rhy-0.064
 says-0.064
Token looked
Token ID3114
Feature activation-1.59e+00

Pos logit contributions
 chip+0.108
 wrists+0.100
girls+0.094
 sa+0.094
 ra+0.090
Neg logit contributions
 afar-0.176
 upon-0.175
 microscopic-0.175
 outside-0.172
 behind-0.172
Token like
Token ID588
Feature activation+4.08e+00

Pos logit contributions
 will+0.049
 wrists+0.045
 scar+0.043
 sa+0.042
 said+0.041
Neg logit contributions
ires-0.078
eway-0.078
ges-0.076
err-0.075
uttering-0.073
Token ac
Token ID936
Feature activation-6.92e-01

Pos logit contributions
ires+0.164
 afar+0.154
err+0.152
ear+0.150
eway+0.149
Neg logit contributions
 wrists-0.153
 will-0.145
 chip-0.145
tails-0.142
 sa-0.142
Tokenrob
Token ID22609
Feature activation-1.21e+01

Pos logit contributions
 wrists+0.016
 chip+0.013
 depends+0.013
 will+0.013
 sa+0.013
Neg logit contributions
ires-0.038
 afar-0.037
ear-0.036
err-0.036
ges-0.035
Tokenats
Token ID1381
Feature activation-1.88e-01

Pos logit contributions
 wrists+0.384
tails+0.351
 sa+0.339
 will+0.336
 says+0.334
Neg logit contributions
eway-0.548
 afar-0.548
 underground-0.536
err-0.532
ires-0.526
Token.
Token ID13
Feature activation+3.24e-01

Pos logit contributions
 wrists+0.009
 depends+0.008
 chip+0.008
Together+0.008
tails+0.008
Neg logit contributions
ires-0.006
 behind-0.005
ges-0.005
 outside-0.005
 afar-0.005
Token She
Token ID1375
Feature activation-2.56e+00

Pos logit contributions
ires+0.016
 afar+0.015
err+0.015
ear+0.015
eway+0.014
Neg logit contributions
 wrists-0.009
 chip-0.009
 will-0.009
 sa-0.009
tails-0.009
Token enjoyed
Token ID8359
Feature activation-1.25e+00

Pos logit contributions
 chip+0.086
 wrists+0.084
girls+0.083
tails+0.082
 depends+0.081
Neg logit contributions
ires-0.105
 afar-0.104
ear-0.102
err-0.101
 upon-0.100
Token the
Token ID262
Feature activation-1.68e+00

Pos logit contributions
 wrists+0.057
 chip+0.055
tails+0.054
 sa+0.053
 will+0.053
Neg logit contributions
ires-0.042
err-0.040
eway-0.038
ear-0.038
earing-0.037
Token of
Token ID286
Feature activation-5.40e-01

Pos logit contributions
 will+0.072
 chip+0.071
 sa+0.069
 said+0.068
 wrists+0.068
Neg logit contributions
ires-0.063
 afar-0.062
err-0.060
uttering-0.057
ging-0.057
Token fun
Token ID1257
Feature activation+4.61e-02

Pos logit contributions
 wrists+0.020
 chip+0.020
tails+0.020
 will+0.019
 depends+0.019
Neg logit contributions
ires-0.021
 afar-0.021
ear-0.020
 underground-0.019
err-0.019
Token watching
Token ID4964
Feature activation+3.00e+00

Pos logit contributions
ires+0.002
 afar+0.002
err+0.002
ear+0.002
eway+0.002
Neg logit contributions
 wrists-0.002
 will-0.002
 chip-0.002
 sa-0.002
tails-0.002
Token the
Token ID262
Feature activation-1.71e+00

Pos logit contributions
ires+0.119
 afar+0.111
err+0.111
uttering+0.110
eway+0.108
Neg logit contributions
 wrists-0.114
 chip-0.109
 will-0.108
 sa-0.107
tails-0.106
Token ac
Token ID936
Feature activation-7.50e-01

Pos logit contributions
 wrists+0.055
 will+0.053
 chip+0.053
 sa+0.052
 depends+0.051
Neg logit contributions
ires-0.077
 afar-0.073
err-0.072
ear-0.071
eway-0.071
Tokenrob
Token ID22609
Feature activation-1.20e+01

Pos logit contributions
 wrists+0.006
 chip+0.006
 will+0.006
 sa+0.005
tails+0.005
Neg logit contributions
ires-0.051
 afar-0.050
err-0.049
ear-0.049
eway-0.049
Tokenats
Token ID1381
Feature activation-1.85e-01

Pos logit contributions
 wrists+0.383
tails+0.366
 sa+0.363
 will+0.358
girls+0.353
Neg logit contributions
 afar-0.529
ires-0.526
err-0.515
eway-0.510
foot-0.505
Token and
Token ID290
Feature activation+1.02e+00

Pos logit contributions
 wrists+0.008
 depends+0.007
 chip+0.007
tails+0.007
 will+0.007
Neg logit contributions
ires-0.006
ges-0.006
ear-0.006
 afar-0.006
 behind-0.006
Token clown
Token ID25573
Feature activation+2.81e+00

Pos logit contributions
 afar+0.036
ires+0.035
 behind+0.034
 underground+0.033
 upon+0.033
Neg logit contributions
-0.038
 wrists-0.038
Together-0.038
 will-0.038
 depends-0.037
Tokens
Token ID82
Feature activation+5.64e-01

Pos logit contributions
ires+0.128
ges+0.124
ear+0.123
 upon+0.121
ied+0.121
Neg logit contributions
 wrists-0.085
 depends-0.080
 chip-0.078
 ra-0.075
 will-0.075
Token.
Token ID13
Feature activation+1.96e-01

Pos logit contributions
ires+0.020
 afar+0.019
eway+0.018
err+0.018
ear+0.018
Neg logit contributions
 wrists-0.023
 chip-0.022
 depends-0.022
 will-0.022
tails-0.022
Token elephants
Token ID25368
Feature activation-1.04e+00

Pos logit contributions
ires+0.000
 afar+0.000
ges+0.000
 behind+0.000
err+0.000
Neg logit contributions
 wrists-0.000
 depends-0.000
 will-0.000
 chip-0.000
Together-0.000
Token,
Token ID11
Feature activation+3.51e-03

Pos logit contributions
 wrists+0.039
 depends+0.037
 will+0.036
 chip+0.036
Together+0.035
Neg logit contributions
ires-0.041
ges-0.038
 afar-0.038
 underground-0.037
 behind-0.037
Token lions
Token ID31420
Feature activation-1.07e+00

Pos logit contributions
ires+0.000
 afar+0.000
err+0.000
ear+0.000
eway+0.000
Neg logit contributions
 wrists-0.000
 chip-0.000
 will-0.000
 depends-0.000
tails-0.000
Token and
Token ID290
Feature activation+1.09e+00

Pos logit contributions
 wrists+0.036
 will+0.033
 chip+0.033
 depends+0.033
 sa+0.032
Neg logit contributions
ires-0.047
 afar-0.045
err-0.044
ear-0.043
eway-0.043
Token ac
Token ID936
Feature activation-8.02e-01

Pos logit contributions
 afar+0.039
ires+0.038
 underground+0.038
foot+0.037
 behind+0.036
Neg logit contributions
 wrists-0.041
 Together-0.040
 depends-0.040
Together-0.039
 chip-0.039
Tokenrob
Token ID22609
Feature activation-1.19e+01

Pos logit contributions
 wrists+0.016
 depends+0.014
 chip+0.013
 will+0.013
 sa+0.013
Neg logit contributions
ires-0.046
 afar-0.044
ges-0.043
ear-0.043
 underground-0.043
Tokenats
Token ID1381
Feature activation-1.41e-01

Pos logit contributions
 wrists+0.395
tails+0.372
 will+0.361
 says+0.349
girls+0.348
Neg logit contributions
ires-0.524
 afar-0.513
err-0.507
eway-0.501
 underground-0.496
Token.
Token ID13
Feature activation+4.96e-01

Pos logit contributions
 wrists+0.006
 chip+0.005
 will+0.005
 depends+0.005
 sa+0.005
Neg logit contributions
ires-0.005
 afar-0.005
ges-0.005
ear-0.005
eway-0.005
Token They
Token ID1119
Feature activation-4.22e+00

Pos logit contributions
ires+0.027
 afar+0.026
err+0.025
ear+0.025
ges+0.024
Neg logit contributions
tails-0.012
 wrists-0.012
 depends-0.011
 will-0.011
 sa-0.011
Token are
Token ID389
Feature activation-1.40e+00

Pos logit contributions
 chip+0.153
 wrists+0.142
 sa+0.138
 depends+0.134
 Together+0.131
Neg logit contributions
 upon-0.171
 afar-0.170
 behind-0.166
ires-0.162
ges-0.157
Token amazed
Token ID24562
Feature activation+1.22e+00

Pos logit contributions
 wrists+0.057
 depends+0.055
+0.055
 will+0.054
 chip+0.053
Neg logit contributions
ires-0.051
 behind-0.048
 underground-0.047
 afar-0.046
 upon-0.045
Token clown
Token ID25573
Feature activation+2.92e+00

Pos logit contributions
 wrists+0.049
 will+0.047
 chip+0.047
 sa+0.046
 depends+0.045
Neg logit contributions
ires-0.076
 afar-0.071
ear-0.071
err-0.071
eway-0.070
Tokens
Token ID82
Feature activation+7.73e-01

Pos logit contributions
ires+0.117
 afar+0.109
err+0.107
ear+0.106
ges+0.106
Neg logit contributions
 wrists-0.109
 chip-0.102
 will-0.101
 depends-0.101
 sa-0.099
Token and
Token ID290
Feature activation+1.37e+00

Pos logit contributions
ires+0.030
err+0.028
 afar+0.027
ear+0.027
eway+0.027
Neg logit contributions
 wrists-0.031
 will-0.029
 chip-0.029
tails-0.029
 sa-0.028
Token the
Token ID262
Feature activation-1.58e+00

Pos logit contributions
 behind+0.044
 afar+0.041
 nowhere+0.039
 upon+0.039
ges+0.039
Neg logit contributions
Together-0.060
 chip-0.058
 Together-0.057
 depends-0.057
-0.056
Token ac
Token ID936
Feature activation-7.17e-01

Pos logit contributions
 wrists+0.054
 chip+0.051
tails+0.051
 will+0.051
 depends+0.050
Neg logit contributions
ires-0.067
 afar-0.065
err-0.062
eway-0.061
 underground-0.060
Tokenrob
Token ID22609
Feature activation-1.19e+01

Pos logit contributions
 wrists+0.014
 chip+0.012
 depends+0.011
 will+0.011
 sa+0.011
Neg logit contributions
ires-0.042
 afar-0.040
err-0.040
ear-0.039
eway-0.039
Tokenats
Token ID1381
Feature activation+2.15e-02

Pos logit contributions
 wrists+0.369
tails+0.357
 will+0.338
girls+0.327
 says+0.326
Neg logit contributions
ires-0.536
 afar-0.533
err-0.523
eway-0.513
 underground-0.512
Token and
Token ID290
Feature activation+1.41e+00

Pos logit contributions
ires+0.001
 afar+0.001
ges+0.001
ear+0.001
eway+0.001
Neg logit contributions
 wrists-0.001
 chip-0.001
 depends-0.001
 will-0.001
tails-0.001
Token the
Token ID262
Feature activation-1.55e+00

Pos logit contributions
 behind+0.036
 afar+0.034
 upon+0.031
ges+0.030
 underground+0.030
Neg logit contributions
Together-0.070
 Together-0.067
-0.066
 will-0.065
 depends-0.064
Token lions
Token ID31420
Feature activation-9.32e-01

Pos logit contributions
 Together+0.054
Together+0.054
 wrists+0.053
tails+0.051
 chip+0.050
Neg logit contributions
 afar-0.062
ires-0.060
 underground-0.057
err-0.057
 upon-0.056
Token and
Token ID290
Feature activation+1.42e+00

Pos logit contributions
 wrists+0.039
 depends+0.037
 chip+0.037
 will+0.036
Together+0.036
Neg logit contributions
ires-0.032
ges-0.030
 afar-0.030
 behind-0.029
 underground-0.029
Token saw
Token ID2497
Feature activation+3.44e+00

Pos logit contributions
 wrists+0.029
 chip+0.027
tails+0.027
 depends+0.026
 will+0.026
Neg logit contributions
ires-0.038
 afar-0.037
err-0.036
ear-0.035
 upon-0.035
Token clown
Token ID25573
Feature activation+3.06e+00

Pos logit contributions
ires+0.147
err+0.143
ear+0.140
uttering+0.139
 afar+0.138
Neg logit contributions
 wrists-0.119
 chip-0.116
 sa-0.115
 will-0.113
 says-0.111
Tokens
Token ID82
Feature activation+8.62e-01

Pos logit contributions
ires+0.170
 afar+0.163
err+0.161
ear+0.159
eway+0.159
Neg logit contributions
 wrists-0.068
 chip-0.064
 will-0.064
tails-0.062
 sa-0.062
Token and
Token ID290
Feature activation+1.46e+00

Pos logit contributions
ires+0.035
 afar+0.033
err+0.032
ear+0.032
eway+0.031
Neg logit contributions
 wrists-0.032
 will-0.031
 chip-0.031
 sa-0.030
tails-0.030
Token ac
Token ID936
Feature activation-6.79e-01

Pos logit contributions
 afar+0.053
 behind+0.052
ges+0.051
ires+0.051
 underground+0.050
Neg logit contributions
Together-0.054
 depends-0.054
 wrists-0.052
 will-0.052
 Together-0.051
Tokenrob
Token ID22609
Feature activation-1.17e+01

Pos logit contributions
 wrists+0.016
 chip+0.015
 depends+0.014
 will+0.014
 rhy+0.013
Neg logit contributions
ires-0.037
 afar-0.035
ges-0.034
err-0.034
ear-0.034
Tokenats
Token ID1381
Feature activation-3.68e-02

Pos logit contributions
 wrists+0.374
tails+0.352
 will+0.334
girls+0.329
 says+0.327
Neg logit contributions
ires-0.525
 afar-0.518
err-0.513
eway-0.512
 underground-0.505
Token and
Token ID290
Feature activation+1.48e+00

Pos logit contributions
 wrists+0.001
 will+0.001
 chip+0.001
 sa+0.001
tails+0.001
Neg logit contributions
ires-0.001
 afar-0.001
err-0.001
ear-0.001
eway-0.001
Token lions
Token ID31420
Feature activation-8.86e-01

Pos logit contributions
 afar+0.054
 behind+0.051
ires+0.051
 underground+0.051
ges+0.050
Neg logit contributions
Together-0.055
 depends-0.053
 Together-0.053
 wrists-0.052
 will-0.051
Token and
Token ID290
Feature activation+1.49e+00

Pos logit contributions
 wrists+0.036
 chip+0.034
 depends+0.034
 will+0.034
tails+0.033
Neg logit contributions
ires-0.032
 afar-0.030
ges-0.029
ear-0.029
err-0.029
Token elephants
Token ID25368
Feature activation-9.39e-01

Pos logit contributions
 afar+0.057
ires+0.056
 underground+0.054
ges+0.054
 behind+0.053
Neg logit contributions
 wrists-0.053
 depends-0.051
Together-0.051
 will-0.051
 Together-0.050
Token the
Token ID262
Feature activation-1.02e+00

Pos logit contributions
ires+0.043
 afar+0.039
 behind+0.038
 underground+0.037
ear+0.037
Neg logit contributions
 wrists-0.055
tails-0.055
 chip-0.053
 Together-0.052
 depends-0.051
Token animals
Token ID4695
Feature activation+2.53e+00

Pos logit contributions
 wrists+0.039
 will+0.037
 chip+0.037
 sa+0.036
tails+0.036
Neg logit contributions
ires-0.040
 afar-0.038
err-0.037
ear-0.037
eway-0.037
Token,
Token ID11
Feature activation+7.91e-01

Pos logit contributions
ires+0.080
 underground+0.072
 afar+0.071
 behind+0.071
ges+0.070
Neg logit contributions
 wrists-0.119
 chip-0.111
 depends-0.110
tails-0.109
 will-0.106
Token the
Token ID262
Feature activation-1.01e+00

Pos logit contributions
ires+0.027
 afar+0.027
 upon+0.025
ear+0.025
err+0.025
Neg logit contributions
 wrists-0.033
tails-0.031
 depends-0.031
 chip-0.031
 will-0.031
Token ac
Token ID936
Feature activation-3.05e-01

Pos logit contributions
 wrists+0.040
 chip+0.038
 will+0.037
tails+0.037
 depends+0.037
Neg logit contributions
ires-0.039
 afar-0.036
err-0.036
eway-0.035
ear-0.034
Tokenrob
Token ID22609
Feature activation-1.16e+01

Pos logit contributions
 wrists+0.005
 will+0.004
 depends+0.004
 chip+0.004
tails+0.004
Neg logit contributions
ires-0.019
 afar-0.018
err-0.018
ear-0.018
eway-0.017
Tokenats
Token ID1381
Feature activation+2.47e-01

Pos logit contributions
 wrists+0.348
tails+0.333
 will+0.326
 sa+0.323
girls+0.315
Neg logit contributions
 afar-0.532
err-0.527
ires-0.524
eway-0.520
foot-0.508
Token and
Token ID290
Feature activation+1.77e+00

Pos logit contributions
ires+0.008
ges+0.008
 behind+0.008
 outside+0.008
ging+0.008
Neg logit contributions
 wrists-0.011
 depends-0.010
tails-0.009
 will-0.009
 Amen-0.009
Token the
Token ID262
Feature activation-1.02e+00

Pos logit contributions
 afar+0.048
 behind+0.048
 upon+0.047
ires+0.046
 outside+0.043
Neg logit contributions
 Together-0.081
Together-0.080
 wrists-0.078
 depends-0.078
-0.076
Token clown
Token ID25573
Feature activation+3.21e+00

Pos logit contributions
 Together+0.042
Together+0.040
 wrists+0.040
tails+0.039
 chip+0.038
Neg logit contributions
ires-0.035
 afar-0.034
 upon-0.033
err-0.032
 underground-0.032
Tokens
Token ID82
Feature activation+1.52e+00

Pos logit contributions
ires+0.140
ges+0.136
ied+0.131
 behind+0.131
 outside+0.130
Neg logit contributions
 wrists-0.106
 depends-0.101
 chip-0.093
Together-0.092
 will-0.091
Token very
Token ID845
Feature activation-2.33e+00

Pos logit contributions
ires+0.033
 afar+0.031
err+0.030
ear+0.030
eway+0.030
Neg logit contributions
 wrists-0.042
 will-0.040
 chip-0.040
 sa-0.039
tails-0.039
Token still
Token ID991
Feature activation+6.12e-01

Pos logit contributions
 wrists+0.096
tails+0.094
 will+0.093
girls+0.093
 depends+0.092
Neg logit contributions
ires-0.086
 afar-0.079
ear-0.076
err-0.075
uttering-0.075
Token so
Token ID523
Feature activation-6.24e+00

Pos logit contributions
ires+0.021
 afar+0.020
err+0.019
ear+0.019
uttering+0.018
Neg logit contributions
 wrists-0.026
 will-0.025
 chip-0.025
tails-0.025
 depends-0.025
Token the
Token ID262
Feature activation-1.80e+00

Pos logit contributions
 wrists+0.260
tails+0.247
 will+0.245
 depends+0.244
 chip+0.243
Neg logit contributions
ires-0.231
 afar-0.214
ear-0.208
err-0.206
uttering-0.203
Token ha
Token ID387
Feature activation-6.05e+00

Pos logit contributions
 wrists+0.055
tails+0.051
 chip+0.050
 will+0.050
 depends+0.050
Neg logit contributions
ires-0.083
 afar-0.080
err-0.079
ear-0.077
eway-0.076
Tokenird
Token ID1447
Feature activation-1.03e+01

Pos logit contributions
 wrists+0.107
 chip+0.094
 depends+0.093
 will+0.090
 sa+0.090
Neg logit contributions
ires-0.358
ear-0.347
ges-0.346
 afar-0.342
uttering-0.341
Tokenress
Token ID601
Feature activation-3.97e+00

Pos logit contributions
 wrists+0.388
 will+0.364
 depends+0.364
 chip+0.364
 sa+0.363
Neg logit contributions
ires-0.411
ear-0.382
ges-0.378
 afar-0.377
err-0.377
Tokener
Token ID263
Feature activation-1.29e+00

Pos logit contributions
 wrists+0.203
 will+0.192
 depends+0.191
 chip+0.190
tails+0.190
Neg logit contributions
ires-0.105
ear-0.095
 afar-0.094
err-0.094
eway-0.091
Token could
Token ID714
Feature activation-2.26e+00

Pos logit contributions
 wrists+0.041
 depends+0.039
 chip+0.038
girls+0.038
 sa+0.037
Neg logit contributions
ires-0.054
ied-0.053
ges-0.053
 afar-0.053
ear-0.053
Token do
Token ID466
Feature activation-2.15e+00

Pos logit contributions
 wrists+0.082
 depends+0.080
girls+0.077
 sa+0.075
tails+0.073
Neg logit contributions
 afar-0.088
ires-0.085
 outside-0.083
ear-0.083
 behind-0.083
Token a
Token ID257
Feature activation-8.47e-01

Pos logit contributions
 wrists+0.095
 chip+0.091
 will+0.090
tails+0.089
 depends+0.089
Neg logit contributions
ires-0.071
 afar-0.066
ear-0.064
eway-0.063
err-0.063
Token his
Token ID465
Feature activation+2.35e+00

Pos logit contributions
 wrists+0.154
 will+0.152
 depends+0.151
tails+0.151
 Claus+0.146
Neg logit contributions
ires-0.056
 afar-0.055
foot-0.051
ear-0.050
eway-0.048
Token mom
Token ID1995
Feature activation-1.52e+00

Pos logit contributions
ires+0.119
eway+0.114
ges+0.114
 afar+0.113
ear+0.112
Neg logit contributions
 wrists-0.065
 chip-0.063
 will-0.060
 sa-0.060
 depends-0.057
Token and
Token ID290
Feature activation+7.74e-01

Pos logit contributions
 wrists+0.067
 chip+0.063
tails+0.062
 depends+0.062
 will+0.062
Neg logit contributions
ires-0.051
 afar-0.046
ear-0.046
err-0.046
 upon-0.045
Token the
Token ID262
Feature activation-1.44e+00

Pos logit contributions
 outside+0.028
 behind+0.027
 when+0.027
ied+0.027
 earlier+0.026
Neg logit contributions
 seventeen-0.030
 wrists-0.028
girls-0.028
vals-0.028
Crunch-0.028
Token ha
Token ID387
Feature activation-5.99e+00

Pos logit contributions
 wrists+0.050
 chip+0.048
 will+0.048
 sa+0.046
 depends+0.045
Neg logit contributions
ires-0.064
ear-0.059
eway-0.058
err-0.058
 afar-0.058
Tokenird
Token ID1447
Feature activation-1.02e+01

Pos logit contributions
 wrists+0.131
 depends+0.120
 chip+0.119
 will+0.115
 sa+0.113
Neg logit contributions
ires-0.329
ear-0.323
ges-0.321
uttering-0.318
ied-0.318
Tokenress
Token ID601
Feature activation-3.86e+00

Pos logit contributions
 wrists+0.424
 depends+0.415
 sa+0.403
 will+0.401
 rhy+0.400
Neg logit contributions
ires-0.365
 upon-0.350
ges-0.349
ear-0.343
ied-0.343
Tokener
Token ID263
Feature activation-1.20e+00

Pos logit contributions
 wrists+0.204
 depends+0.194
 will+0.193
tails+0.191
 chip+0.191
Neg logit contributions
ires-0.095
ear-0.087
ied-0.086
err-0.082
 afar-0.082
Token.
Token ID13
Feature activation+3.34e-02

Pos logit contributions
 wrists+0.049
 depends+0.048
 chip+0.046
 rhy+0.046
girls+0.045
Neg logit contributions
ied-0.041
ires-0.041
 upon-0.040
ges-0.040
ear-0.039
Token "
Token ID366
Feature activation+2.19e+00

Pos logit contributions
ires+0.002
 afar+0.001
err+0.001
ear+0.001
eway+0.001
Neg logit contributions
 wrists-0.001
 will-0.001
 chip-0.001
tails-0.001
 sa-0.001
TokenI
Token ID40
Feature activation+1.98e-01

Pos logit contributions
ires+0.098
eway+0.091
 afar+0.090
err+0.089
uttering+0.089
Neg logit contributions
 wrists-0.072
 will-0.071
 chip-0.070
 sa-0.067
 depends-0.066
Token She
Token ID1375
Feature activation-3.10e+00

Pos logit contributions
 wrists+0.006
 will+0.006
 chip+0.006
tails+0.006
 sa+0.006
Neg logit contributions
ires-0.011
 afar-0.010
err-0.010
ear-0.010
eway-0.010
Token twisted
Token ID19074
Feature activation+1.57e-01

Pos logit contributions
 chip+0.105
girls+0.105
tails+0.103
 depends+0.101
 wrists+0.100
Neg logit contributions
 afar-0.121
ires-0.120
ear-0.120
ied-0.117
 behind-0.116
Token it
Token ID340
Feature activation+1.51e+00

Pos logit contributions
ires+0.007
err+0.006
uttering+0.006
eway+0.006
 afar+0.006
Neg logit contributions
 wrists-0.006
 will-0.005
 chip-0.005
 said-0.005
 sa-0.005
Token around
Token ID1088
Feature activation+3.67e+00

Pos logit contributions
ires+0.057
 afar+0.054
err+0.052
eway+0.051
ear+0.051
Neg logit contributions
 wrists-0.060
 will-0.060
 chip-0.058
 sa-0.056
tails-0.056
Token her
Token ID607
Feature activation+2.02e+00

Pos logit contributions
ires+0.173
err+0.161
uttering+0.159
eway+0.157
ear+0.155
Neg logit contributions
 wrists-0.121
 will-0.116
 chip-0.112
 said-0.109
 sa-0.107
Token wrists
Token ID38163
Feature activation-1.02e+01

Pos logit contributions
ires+0.099
 afar+0.095
err+0.092
ear+0.091
 underground+0.091
Neg logit contributions
 will-0.053
 wrists-0.053
 chip-0.052
 depends-0.052
 sa-0.051
Token like
Token ID588
Feature activation+3.53e+00

Pos logit contributions
 wrists+0.403
 chip+0.384
 will+0.383
tails+0.374
 sa+0.373
Neg logit contributions
ires-0.392
 afar-0.366
err-0.361
ear-0.354
eway-0.354
Token brace
Token ID22581
Feature activation-5.25e+00

Pos logit contributions
ires+0.137
 afar+0.127
err+0.126
ear+0.124
eway+0.123
Neg logit contributions
 wrists-0.138
 chip-0.132
 will-0.131
 sa-0.128
tails-0.128
Tokenlets
Token ID5289
Feature activation-3.49e+00

Pos logit contributions
 wrists+0.091
 will+0.080
tails+0.079
 chip+0.076
 sa+0.071
Neg logit contributions
ires-0.320
 afar-0.308
err-0.304
eway-0.303
ear-0.297
Token.
Token ID13
Feature activation-1.80e-01

Pos logit contributions
 wrists+0.148
 depends+0.143
 chip+0.141
girls+0.138
 sa+0.137
Neg logit contributions
ires-0.117
 underground-0.109
ges-0.109
ear-0.108
 afar-0.108
Token
Token ID220
Feature activation-6.45e+00

Pos logit contributions
 wrists+0.003
 will+0.003
 sa+0.003
 chip+0.003
 depends+0.003
Neg logit contributions
ires-0.011
 afar-0.010
err-0.010
ear-0.010
ges-0.010
Token happy
Token ID3772
Feature activation+2.18e+00

Pos logit contributions
 wrists+0.085
 chip+0.082
 will+0.081
 sa+0.079
tails+0.079
Neg logit contributions
ires-0.081
 afar-0.076
err-0.075
ear-0.073
eway-0.073
Token and
Token ID290
Feature activation+8.50e-01

Pos logit contributions
ires+0.090
 afar+0.084
err+0.084
eway+0.082
uttering+0.081
Neg logit contributions
 wrists-0.080
 chip-0.078
 will-0.078
 sa-0.076
 said-0.075
Token thanked
Token ID26280
Feature activation-2.75e+00

Pos logit contributions
ear+0.034
ires+0.033
 behind+0.033
earing+0.033
 afar+0.032
Neg logit contributions
 wrists-0.029
girls-0.029
tails-0.027
 ra-0.027
 depends-0.027
Token the
Token ID262
Feature activation-1.76e+00

Pos logit contributions
 wrists+0.128
 will+0.127
tails+0.126
 depends+0.123
girls+0.119
Neg logit contributions
ires-0.082
 afar-0.079
ear-0.076
eway-0.075
err-0.075
Token ha
Token ID387
Feature activation-5.91e+00

Pos logit contributions
 wrists+0.046
 chip+0.044
 will+0.043
 sa+0.042
tails+0.041
Neg logit contributions
ires-0.091
 afar-0.086
err-0.085
ear-0.084
eway-0.084
Tokenird
Token ID1447
Feature activation-1.02e+01

Pos logit contributions
 wrists+0.132
 depends+0.119
 will+0.110
 sa+0.109
 chip+0.108
Neg logit contributions
ires-0.321
ear-0.320
ges-0.316
ied-0.310
and-0.309
Tokenress
Token ID601
Feature activation-3.82e+00

Pos logit contributions
 wrists+0.406
 depends+0.382
 will+0.378
 sa+0.378
 chip+0.370
Neg logit contributions
ires-0.390
ear-0.364
ges-0.359
err-0.356
uttering-0.355
Tokener
Token ID263
Feature activation-1.11e+00

Pos logit contributions
 wrists+0.196
 will+0.186
 depends+0.184
tails+0.183
 chip+0.183
Neg logit contributions
ires-0.102
ear-0.091
err-0.091
 afar-0.090
eway-0.087
Token.
Token ID13
Feature activation-2.93e-02

Pos logit contributions
 wrists+0.046
 depends+0.043
 will+0.043
 chip+0.043
 sa+0.042
Neg logit contributions
ires-0.040
ges-0.037
 afar-0.037
ear-0.037
 upon-0.036
Token As
Token ID1081
Feature activation+3.09e+00

Pos logit contributions
 wrists+0.001
 will+0.001
tails+0.001
 chip+0.001
 sa+0.001
Neg logit contributions
ires-0.001
 afar-0.001
err-0.001
ear-0.001
eway-0.001
Token they
Token ID484
Feature activation-4.00e+00

Pos logit contributions
ires+0.137
 afar+0.129
err+0.128
ear+0.126
eway+0.126
Neg logit contributions
 wrists-0.104
 chip-0.099
 will-0.099
 sa-0.096
tails-0.096
Token in
Token ID287
Feature activation-1.83e-01

Pos logit contributions
ires+0.013
 afar+0.012
arter+0.011
err+0.011
oken+0.011
Neg logit contributions
 will-0.009
 said-0.009
 -0.008
 sa-0.008
 Together-0.008
Token and
Token ID290
Feature activation+5.49e-01

Pos logit contributions
 wrists+0.007
 will+0.007
 chip+0.007
 sa+0.007
tails+0.007
Neg logit contributions
ires-0.007
 afar-0.007
err-0.007
ear-0.007
eway-0.006
Token let
Token ID1309
Feature activation-3.28e+00

Pos logit contributions
ires+0.022
 afar+0.021
ear+0.021
err+0.021
ges+0.020
Neg logit contributions
 wrists-0.020
 chip-0.019
 will-0.019
 depends-0.018
tails-0.018
Token the
Token ID262
Feature activation-1.84e+00

Pos logit contributions
 wrists+0.149
 chip+0.141
 will+0.140
 sa+0.139
tails+0.139
Neg logit contributions
ires-0.106
 afar-0.099
err-0.097
ear-0.096
eway-0.095
Token ha
Token ID387
Feature activation-5.95e+00

Pos logit contributions
 wrists+0.060
girls+0.060
 depends+0.057
tails+0.056
 Together+0.056
Neg logit contributions
ires-0.077
err-0.074
 afar-0.074
 upon-0.072
foot-0.071
Tokenird
Token ID1447
Feature activation-1.02e+01

Pos logit contributions
 wrists+0.096
 depends+0.085
 chip+0.083
 sa+0.080
 will+0.077
Neg logit contributions
ires-0.359
ear-0.355
ges-0.352
uttering-0.345
 upon-0.345
Tokenress
Token ID601
Feature activation-3.83e+00

Pos logit contributions
 wrists+0.375
 chip+0.353
 will+0.352
 sa+0.347
 depends+0.346
Neg logit contributions
ires-0.417
 afar-0.386
err-0.385
ear-0.384
eway-0.381
Tokener
Token ID263
Feature activation-1.18e+00

Pos logit contributions
 wrists+0.170
 chip+0.166
 will+0.164
tails+0.162
 sa+0.159
Neg logit contributions
ires-0.128
 afar-0.120
err-0.116
uttering-0.113
ear-0.112
Token go
Token ID467
Feature activation-1.05e+00

Pos logit contributions
 wrists+0.045
 chip+0.043
 depends+0.042
 sa+0.042
 will+0.042
Neg logit contributions
ires-0.046
 afar-0.043
ges-0.043
ear-0.043
eway-0.042
Token to
Token ID284
Feature activation-1.72e+00

Pos logit contributions
 wrists+0.039
 will+0.038
 chip+0.038
tails+0.037
 sa+0.036
Neg logit contributions
ires-0.043
 afar-0.040
err-0.040
eway-0.038
ear-0.038
Token work
Token ID670
Feature activation-9.94e-01

Pos logit contributions
 wrists+0.068
 will+0.065
 chip+0.064
tails+0.064
 depends+0.064
Neg logit contributions
ires-0.064
 afar-0.062
ear-0.059
err-0.059
eway-0.058
Token the
Token ID262
Feature activation-1.69e+00

Pos logit contributions
 wrists+0.139
 will+0.134
 depends+0.134
tails+0.132
 chip+0.132
Neg logit contributions
ires-0.128
 afar-0.122
err-0.117
ear-0.115
ges-0.115
Token chair
Token ID5118
Feature activation-2.87e+00

Pos logit contributions
 wrists+0.050
 will+0.048
 chip+0.047
tails+0.047
 depends+0.046
Neg logit contributions
ires-0.079
 afar-0.076
err-0.075
ear-0.074
eway-0.073
Token and
Token ID290
Feature activation+7.49e-01

Pos logit contributions
 wrists+0.139
 depends+0.138
girls+0.135
 will+0.134
tails+0.133
Neg logit contributions
 behind-0.076
ires-0.074
 outside-0.074
 underground-0.074
 afar-0.071
Token the
Token ID262
Feature activation-1.69e+00

Pos logit contributions
 afar+0.029
 upon+0.029
 behind+0.029
foot+0.029
 outside+0.029
Neg logit contributions
girls-0.027
 depends-0.027
 Amen-0.025
 chip-0.024
tails-0.024
Token ha
Token ID387
Feature activation-5.98e+00

Pos logit contributions
 wrists+0.045
 will+0.041
 chip+0.041
tails+0.041
 depends+0.040
Neg logit contributions
ires-0.085
 afar-0.082
err-0.081
ear-0.080
eway-0.079
Tokenird
Token ID1447
Feature activation-1.01e+01

Pos logit contributions
 wrists+0.095
 depends+0.089
 chip+0.088
 will+0.082
 sa+0.082
Neg logit contributions
ires-0.358
ges-0.352
ear-0.351
uttering-0.346
ied-0.344
Tokenress
Token ID601
Feature activation-3.88e+00

Pos logit contributions
 wrists+0.398
 depends+0.386
 sa+0.379
 will+0.378
 chip+0.377
Neg logit contributions
ires-0.381
ges-0.358
ear-0.357
 upon-0.353
uttering-0.351
Tokener
Token ID263
Feature activation-1.18e+00

Pos logit contributions
 wrists+0.198
 depends+0.192
 will+0.192
 sa+0.188
 chip+0.188
Neg logit contributions
ires-0.099
ear-0.090
 afar-0.089
err-0.088
ied-0.088
Token began
Token ID2540
Feature activation+1.21e-01

Pos logit contributions
 depends+0.039
 wrists+0.035
 Amen+0.035
girls+0.035
 rhy+0.035
Neg logit contributions
ied-0.051
 upon-0.050
ges-0.049
ging-0.048
earing-0.048
Token to
Token ID284
Feature activation-1.66e+00

Pos logit contributions
ires+0.004
ges+0.004
foot+0.004
uttering+0.003
 upon+0.003
Neg logit contributions
 depends-0.005
 wrists-0.005
 will-0.005
girls-0.005
-0.005
Token cut
Token ID2005
Feature activation-4.54e+00

Pos logit contributions
girls+0.064
 depends+0.061
 wrists+0.060
 chip+0.058
tails+0.057
Neg logit contributions
 afar-0.062
foot-0.058
 outside-0.057
 upon-0.056
 outdoors-0.056
Token called
Token ID1444
Feature activation-2.94e+00

Pos logit contributions
ires+0.032
 afar+0.031
ear+0.029
 underground+0.029
err+0.029
Neg logit contributions
 wrists-0.031
 chip-0.029
tails-0.029
 will-0.028
 depends-0.028
Token it
Token ID340
Feature activation+1.90e+00

Pos logit contributions
 will+0.106
 wrists+0.105
 chip+0.104
 sa+0.101
 depends+0.099
Neg logit contributions
ires-0.123
uttering-0.115
err-0.114
 afar-0.112
ear-0.109
Token a
Token ID257
Feature activation-7.13e-01

Pos logit contributions
ires+0.088
 afar+0.083
err+0.082
ear+0.081
eway+0.081
Neg logit contributions
 wrists-0.060
 chip-0.056
 will-0.055
tails-0.054
 sa-0.054
Token "
Token ID366
Feature activation+2.15e+00

Pos logit contributions
 wrists+0.029
girls+0.028
tails+0.028
Together+0.026
 chip+0.025
Neg logit contributions
ires-0.025
err-0.024
 underground-0.024
 afar-0.024
 upon-0.023
Tokenwe
Token ID732
Feature activation+7.45e-01

Pos logit contributions
ires+0.086
 afar+0.080
err+0.078
eway+0.078
uttering+0.077
Neg logit contributions
 wrists-0.079
 will-0.079
 chip-0.079
 sa-0.075
 depends-0.075
Tokenird
Token ID1447
Feature activation-1.01e+01

Pos logit contributions
ires+0.033
 afar+0.031
ear+0.031
ges+0.031
ied+0.030
Neg logit contributions
 wrists-0.025
 chip-0.024
 sa-0.022
tails-0.022
 depends-0.022
Token rock
Token ID3881
Feature activation+1.35e+00

Pos logit contributions
 wrists+0.369
 chip+0.344
 will+0.342
 sa+0.341
 depends+0.336
Neg logit contributions
ires-0.419
ear-0.387
 afar-0.387
err-0.386
eway-0.384
Token."
Token ID526
Feature activation-3.82e+00

Pos logit contributions
ires+0.055
 afar+0.052
ear+0.051
err+0.050
ges+0.050
Neg logit contributions
 wrists-0.050
 chip-0.046
tails-0.045
 sa-0.045
 will-0.045
Token Tim
Token ID5045
Feature activation+3.12e+00

Pos logit contributions
 wrists+0.146
 will+0.139
tails+0.136
 sa+0.134
 depends+0.132
Neg logit contributions
ires-0.151
 afar-0.144
err-0.143
ear-0.140
eway-0.139
Tokenmy
Token ID1820
Feature activation+8.23e-01

Pos logit contributions
err+0.098
ires+0.097
ear+0.093
 afar+0.093
earing+0.092
Neg logit contributions
 wrists-0.141
 chip-0.133
 sa-0.130
 depends-0.128
 ra-0.128
Token got
Token ID1392
Feature activation+8.48e-01

Pos logit contributions
earing+0.029
err+0.028
 afar+0.027
ear+0.027
ied+0.027
Neg logit contributions
 wrists-0.033
girls-0.032
 piping-0.031
 depends-0.031
 sle-0.031
Token am
Token ID716
Feature activation-9.28e-01

Pos logit contributions
 wrists+0.004
 chip+0.003
 sa+0.003
tails+0.003
girls+0.003
Neg logit contributions
 afar-0.006
ires-0.006
ear-0.006
err-0.006
 upon-0.006
Token ready
Token ID3492
Feature activation-2.39e+00

Pos logit contributions
 wrists+0.041
 chip+0.037
Together+0.037
 depends+0.037
tails+0.036
Neg logit contributions
ires-0.031
 underground-0.030
 here-0.030
 outside-0.030
 behind-0.029
Token!"
Token ID2474
Feature activation-4.06e+00

Pos logit contributions
 wrists+0.095
 chip+0.089
tails+0.088
 depends+0.087
 sa+0.086
Neg logit contributions
ires-0.091
ear-0.086
 afar-0.085
 upon-0.083
eway-0.083
Token The
Token ID383
Feature activation-1.37e+00

Pos logit contributions
 depends+0.159
 will+0.159
 wrists+0.153
tails+0.153
 merry+0.145
Neg logit contributions
ges-0.149
ear-0.146
ires-0.145
earing-0.143
 upon-0.142
Token ha
Token ID387
Feature activation-5.99e+00

Pos logit contributions
 wrists+0.051
 chip+0.050
 will+0.049
 sa+0.048
 depends+0.047
Neg logit contributions
ires-0.055
 afar-0.051
ear-0.051
err-0.051
eway-0.050
Tokenird
Token ID1447
Feature activation-1.01e+01

Pos logit contributions
 wrists+0.187
 depends+0.174
 chip+0.173
 will+0.170
 sa+0.170
Neg logit contributions
ires-0.274
ear-0.266
ges-0.265
uttering-0.259
ied-0.259
Tokenress
Token ID601
Feature activation-3.84e+00

Pos logit contributions
 wrists+0.420
 depends+0.408
 sa+0.401
 rhy+0.389
 will+0.388
Neg logit contributions
ires-0.354
ges-0.343
 upon-0.342
ear-0.341
eway-0.333
Tokener
Token ID263
Feature activation-1.13e+00

Pos logit contributions
 wrists+0.202
 depends+0.196
 will+0.193
tails+0.192
 sa+0.191
Neg logit contributions
ires-0.090
ied-0.086
ear-0.084
err-0.083
ges-0.080
Token put
Token ID1234
Feature activation-3.16e+00

Pos logit contributions
 depends+0.034
 wrists+0.033
girls+0.031
 will+0.031
 sa+0.029
Neg logit contributions
ied-0.051
ges-0.049
 upon-0.049
earing-0.048
ires-0.048
Token a
Token ID257
Feature activation-7.19e-01

Pos logit contributions
 wrists+0.129
 chip+0.122
tails+0.122
 depends+0.121
 will+0.121
Neg logit contributions
ires-0.114
 afar-0.108
err-0.104
ges-0.104
ied-0.102
Token cape
Token ID36023
Feature activation-2.90e+00

Pos logit contributions
girls+0.027
 wrists+0.027
 depends+0.025
tails+0.025
 Together+0.025
Neg logit contributions
err-0.026
ires-0.026
 upon-0.026
 afar-0.026
eh-0.026
Token chair
Token ID5118
Feature activation-2.65e+00

Pos logit contributions
 wrists+0.052
 will+0.049
 chip+0.049
tails+0.049
 depends+0.048
Neg logit contributions
ires-0.062
 afar-0.059
err-0.058
 underground-0.057
ear-0.056
Token and
Token ID290
Feature activation+9.25e-01

Pos logit contributions
 wrists+0.115
tails+0.105
girls+0.103
 chip+0.102
 sa+0.102
Neg logit contributions
ires-0.090
 upon-0.089
 underground-0.087
 behind-0.086
 afar-0.086
Token allowed
Token ID3142
Feature activation-6.32e-01

Pos logit contributions
 afar+0.040
ires+0.039
 upon+0.038
ear+0.037
 behind+0.037
Neg logit contributions
 wrists-0.030
 chip-0.029
tails-0.028
girls-0.027
 depends-0.027
Token the
Token ID262
Feature activation-1.59e+00

Pos logit contributions
 wrists+0.027
 will+0.026
 chip+0.026
tails+0.025
 sa+0.025
Neg logit contributions
ires-0.022
 afar-0.021
err-0.020
uttering-0.020
ear-0.020
Token ha
Token ID387
Feature activation-5.89e+00

Pos logit contributions
 wrists+0.048
 will+0.048
 chip+0.047
 sa+0.046
 depends+0.044
Neg logit contributions
ires-0.077
ear-0.071
 afar-0.071
err-0.071
eway-0.070
Tokenird
Token ID1447
Feature activation-1.01e+01

Pos logit contributions
 wrists+0.136
 chip+0.121
 depends+0.120
 sa+0.118
 will+0.115
Neg logit contributions
ires-0.318
ear-0.309
ges-0.307
eway-0.302
 upon-0.299
Tokenress
Token ID601
Feature activation-3.77e+00

Pos logit contributions
 wrists+0.401
 sa+0.376
 depends+0.375
 chip+0.371
 will+0.369
Neg logit contributions
ires-0.384
ear-0.354
ges-0.352
 upon-0.352
eway-0.351
Tokener
Token ID263
Feature activation-1.06e+00

Pos logit contributions
 wrists+0.187
 depends+0.177
 sa+0.176
 will+0.175
 chip+0.175
Neg logit contributions
ires-0.104
ear-0.095
err-0.093
ied-0.093
 afar-0.093
Token to
Token ID284
Feature activation-1.65e+00

Pos logit contributions
 wrists+0.041
 chip+0.038
 sa+0.038
 depends+0.038
tails+0.037
Neg logit contributions
ires-0.041
 afar-0.038
ear-0.038
ges-0.038
eway-0.037
Token give
Token ID1577
Feature activation-1.17e+00

Pos logit contributions
 wrists+0.061
 depends+0.058
tails+0.056
girls+0.056
 chip+0.055
Neg logit contributions
 afar-0.065
ires-0.063
ear-0.062
 underground-0.060
 behind-0.059
Token her
Token ID607
Feature activation+2.27e+00

Pos logit contributions
 wrists+0.052
 will+0.052
 chip+0.050
tails+0.049
 sa+0.049
Neg logit contributions
ires-0.039
err-0.036
 afar-0.036
uttering-0.035
eway-0.035
Token his
Token ID465
Feature activation+2.09e+00

Pos logit contributions
err+0.016
ires+0.016
ges+0.016
ied+0.015
 behind+0.015
Neg logit contributions
 wrists-0.053
 will-0.052
girls-0.051
tails-0.051
 depends-0.050
Token mom
Token ID1995
Feature activation-1.90e+00

Pos logit contributions
ires+0.100
 afar+0.094
err+0.094
ear+0.092
uttering+0.091
Neg logit contributions
 wrists-0.062
 will-0.058
 chip-0.057
 depends-0.057
tails-0.057
Token left
Token ID1364
Feature activation-3.42e-01

Pos logit contributions
 depends+0.083
girls+0.081
 wrists+0.076
 Amen+0.075
 chip+0.075
Neg logit contributions
 outside-0.069
ied-0.065
 behind-0.060
 upon-0.059
pped-0.058
Token the
Token ID262
Feature activation-1.63e+00

Pos logit contributions
 wrists+0.014
 will+0.013
 chip+0.013
tails+0.013
 depends+0.013
Neg logit contributions
ires-0.013
 afar-0.012
err-0.012
ges-0.012
ear-0.011
Token ha
Token ID387
Feature activation-5.76e+00

Pos logit contributions
 wrists+0.041
 will+0.039
 chip+0.038
 sa+0.037
 depends+0.035
Neg logit contributions
ires-0.088
eway-0.083
 afar-0.082
err-0.082
ear-0.082
Tokenird
Token ID1447
Feature activation-1.01e+01

Pos logit contributions
 wrists+0.120
 depends+0.118
 chip+0.114
 sa+0.107
 will+0.104
Neg logit contributions
ear-0.314
ges-0.314
and-0.308
ires-0.308
 upon-0.305
Tokenress
Token ID601
Feature activation-3.73e+00

Pos logit contributions
 wrists+0.410
 depends+0.397
 sa+0.390
 will+0.386
 chip+0.382
Neg logit contributions
ires-0.365
ges-0.346
ear-0.346
 upon-0.344
uttering-0.338
Tokener
Token ID263
Feature activation-1.09e+00

Pos logit contributions
 wrists+0.209
 depends+0.203
 will+0.199
tails+0.197
 sa+0.196
Neg logit contributions
ires-0.074
ear-0.070
 behind-0.069
ied-0.068
ges-0.065
Token,
Token ID11
Feature activation-9.48e-02

Pos logit contributions
 wrists+0.049
 depends+0.049
Together+0.047
 rhy+0.046
 will+0.046
Neg logit contributions
ges-0.033
 upon-0.032
ied-0.031
uttering-0.031
 outside-0.031
Token happy
Token ID3772
Feature activation+2.34e+00

Pos logit contributions
tails+0.004
girls+0.004
 wrists+0.004
 depends+0.004
 will+0.004
Neg logit contributions
ires-0.003
 behind-0.003
 upon-0.003
 afar-0.003
 outside-0.003
Token with
Token ID351
Feature activation+7.65e-01

Pos logit contributions
ires+0.076
ear+0.069
 underground+0.068
err+0.067
 afar+0.066
Neg logit contributions
 wrists-0.109
tails-0.104
 depends-0.100
 will-0.099
girls-0.099
Token tied
Token ID8165
Feature activation+2.77e-01

Pos logit contributions
 behind+0.037
 outside+0.034
ires+0.034
 nowhere+0.033
overed+0.033
Neg logit contributions
tails-0.029
 Amen-0.027
girls-0.026
 depends-0.025
tail-0.024
Token the
Token ID262
Feature activation-1.63e+00

Pos logit contributions
ires+0.011
 afar+0.010
err+0.010
ear+0.010
eway+0.010
Neg logit contributions
 wrists-0.011
 chip-0.010
 will-0.010
 sa-0.010
tails-0.010
Token balloons
Token ID35485
Feature activation-3.91e+00

Pos logit contributions
 wrists+0.039
 chip+0.037
 will+0.036
tails+0.036
 sa+0.035
Neg logit contributions
ires-0.086
 afar-0.082
err-0.081
ear-0.080
eway-0.080
Token to
Token ID284
Feature activation-1.66e+00

Pos logit contributions
 wrists+0.171
 chip+0.165
 will+0.163
tails+0.162
 depends+0.162
Neg logit contributions
ires-0.131
 afar-0.120
err-0.118
ear-0.118
eway-0.116
Token their
Token ID511
Feature activation-1.53e+00

Pos logit contributions
 wrists+0.068
 depends+0.067
tails+0.066
 will+0.066
 Together+0.064
Neg logit contributions
ires-0.058
 afar-0.058
ear-0.053
 behind-0.052
ges-0.052
Token wrists
Token ID38163
Feature activation-1.00e+01

Pos logit contributions
Together+0.045
tails+0.043
 Together+0.043
 depends+0.043
 will+0.042
Neg logit contributions
ires-0.070
 afar-0.069
 underground-0.064
err-0.064
 behind-0.064
Token.
Token ID13
Feature activation+2.33e-01

Pos logit contributions
 depends+0.428
 wrists+0.423
tails+0.412
 chip+0.407
 will+0.405
Neg logit contributions
ires-0.336
ear-0.308
 afar-0.305
ges-0.302
uttering-0.301
Token
Token ID198
Feature activation-6.93e-01

Pos logit contributions
ires+0.010
 afar+0.009
ear+0.008
err+0.008
 upon+0.008
Neg logit contributions
tails-0.009
 wrists-0.009
 will-0.008
 depends-0.008
girls-0.008
Token
Token ID198
Feature activation-6.74e-01

Pos logit contributions
 wrists+0.023
 will+0.021
 chip+0.021
tails+0.021
 sa+0.021
Neg logit contributions
ires-0.031
 afar-0.029
err-0.029
ear-0.029
eway-0.029
TokenL
Token ID43
Feature activation+6.81e-01

Pos logit contributions
 wrists+0.026
 chip+0.025
 will+0.025
tails+0.025
 sa+0.025
Neg logit contributions
ires-0.026
 afar-0.024
err-0.024
ear-0.024
eway-0.024
Tokenily
Token ID813
Feature activation+2.06e+00

Pos logit contributions
 afar+0.029
foot+0.025
 freely+0.024
ging+0.024
 comparison+0.023
Neg logit contributions
ird-0.028
 says-0.026
 will-0.025
 said-0.024
 kissed-0.024
Token forgot
Token ID16453
Feature activation-1.21e+00

Pos logit contributions
 wrists+0.009
 chip+0.009
Together+0.008
girls+0.008
tails+0.008
Neg logit contributions
ear-0.008
 afar-0.008
 underground-0.008
 upon-0.008
 microscopic-0.008
Token the
Token ID262
Feature activation-1.78e+00

Pos logit contributions
 will+0.047
 wrists+0.045
 sa+0.045
 chip+0.043
 said+0.043
Neg logit contributions
ires-0.048
err-0.045
uttering-0.045
 afar-0.044
eway-0.043
Token lion
Token ID18744
Feature activation+2.84e+00

Pos logit contributions
 will+0.068
 chip+0.065
 sa+0.065
 wrists+0.065
 depends+0.064
Neg logit contributions
ires-0.074
ges-0.069
ear-0.069
foot-0.066
eway-0.066
Token and
Token ID290
Feature activation+7.74e-01

Pos logit contributions
ires+0.118
 afar+0.109
err+0.108
eway+0.106
uttering+0.106
Neg logit contributions
 wrists-0.104
 will-0.101
 chip-0.100
 sa-0.098
tails-0.097
Token his
Token ID465
Feature activation+2.00e+00

Pos logit contributions
ires+0.026
 afar+0.025
 underground+0.024
ear+0.024
err+0.023
Neg logit contributions
 wrists-0.033
 chip-0.032
tails-0.031
Together-0.031
 sa-0.030
Token mercy
Token ID17703
Feature activation-9.99e+00

Pos logit contributions
ires+0.082
ear+0.076
ges+0.076
eway+0.075
err+0.075
Neg logit contributions
 wrists-0.075
 will-0.073
 chip-0.071
 sa-0.070
 depends-0.070
Token.
Token ID13
Feature activation-8.24e-02

Pos logit contributions
 wrists+0.363
 will+0.358
 chip+0.349
 sa+0.341
tails+0.338
Neg logit contributions
ires-0.413
err-0.378
 afar-0.378
eway-0.367
ear-0.365
Token He
Token ID679
Feature activation-1.49e+00

Pos logit contributions
 wrists+0.002
 will+0.002
 chip+0.002
 depends+0.002
 Together+0.002
Neg logit contributions
ires-0.004
 afar-0.004
uttering-0.004
ied-0.004
err-0.004
Token always
Token ID1464
Feature activation-2.33e+00

Pos logit contributions
 wrists+0.051
 chip+0.050
tails+0.047
 piping+0.046
 sa+0.044
Neg logit contributions
 afar-0.063
ires-0.061
err-0.060
 underground-0.059
ear-0.058
Token tried
Token ID3088
Feature activation-7.87e-01

Pos logit contributions
 chip+0.096
 piping+0.086
 wrists+0.084
 panc+0.083
brush+0.083
Neg logit contributions
 underground-0.088
 afar-0.088
 alone-0.084
 behind-0.083
 nowhere-0.080
Token to
Token ID284
Feature activation-1.84e+00

Pos logit contributions
 wrists+0.031
 will+0.030
 chip+0.030
 sa+0.029
tails+0.029
Neg logit contributions
ires-0.030
 afar-0.028
err-0.028
ear-0.027
eway-0.027
Token like
Token ID588
Feature activation+4.37e+00

Pos logit contributions
ires+0.108
 afar+0.103
ear+0.102
err+0.101
eway+0.100
Neg logit contributions
 wrists-0.085
 chip-0.078
 will-0.077
 sa-0.076
tails-0.075
Token it
Token ID340
Feature activation+2.31e+00

Pos logit contributions
ires+0.190
err+0.182
uttering+0.181
 afar+0.174
eway+0.173
Neg logit contributions
 sa-0.152
 will-0.150
 chip-0.150
 wrists-0.150
 ra-0.141
Token and
Token ID290
Feature activation+1.12e+00

Pos logit contributions
ires+0.095
 afar+0.089
err+0.088
ear+0.087
eway+0.086
Neg logit contributions
 wrists-0.084
 will-0.082
 chip-0.081
 sa-0.079
tails-0.078
Token the
Token ID262
Feature activation-1.26e+00

Pos logit contributions
 afar+0.035
ear+0.034
ires+0.033
 behind+0.033
 underground+0.033
Neg logit contributions
 wrists-0.048
 chip-0.044
tails-0.044
 piping-0.044
girls-0.043
Token ha
Token ID387
Feature activation-5.81e+00

Pos logit contributions
 wrists+0.040
tails+0.037
 Together+0.037
girls+0.036
Together+0.036
Neg logit contributions
ires-0.054
 afar-0.054
err-0.052
 upon-0.051
 underground-0.051
Tokenird
Token ID1447
Feature activation-9.99e+00

Pos logit contributions
 wrists+0.107
 chip+0.096
 depends+0.091
 will+0.091
 sa+0.090
Neg logit contributions
ires-0.340
ear-0.327
ges-0.326
 afar-0.324
err-0.322
Tokenress
Token ID601
Feature activation-3.67e+00

Pos logit contributions
 wrists+0.378
 chip+0.356
 depends+0.355
 sa+0.354
 will+0.354
Neg logit contributions
ires-0.395
ear-0.367
ges-0.365
err-0.365
 afar-0.364
Tokener
Token ID263
Feature activation-9.00e-01

Pos logit contributions
 wrists+0.192
 depends+0.181
 will+0.181
tails+0.180
 sa+0.179
Neg logit contributions
ires-0.091
ear-0.082
ied-0.082
err-0.082
 afar-0.081
Token said
Token ID531
Feature activation-3.98e+00

Pos logit contributions
 wrists+0.029
 chip+0.027
 depends+0.027
 sa+0.026
girls+0.026
Neg logit contributions
ires-0.039
ied-0.038
 upon-0.038
 afar-0.037
ges-0.037
Token it
Token ID340
Feature activation+2.11e+00

Pos logit contributions
 wrists+0.148
 chip+0.138
tails+0.138
 will+0.137
 depends+0.136
Neg logit contributions
ires-0.160
 afar-0.150
ear-0.148
err-0.147
eway-0.145
Token was
Token ID373
Feature activation+1.47e+00

Pos logit contributions
ires+0.087
 afar+0.084
err+0.084
ear+0.082
eway+0.082
Neg logit contributions
 wrists-0.074
 chip-0.068
 sa-0.066
tails-0.066
girls-0.066
Token and
Token ID290
Feature activation+1.25e+00

Pos logit contributions
ires+0.035
 afar+0.032
err+0.032
eway+0.032
ear+0.031
Neg logit contributions
 wrists-0.039
 will-0.038
 chip-0.037
 sa-0.037
tails-0.036
Token thought
Token ID1807
Feature activation+2.06e+00

Pos logit contributions
 behind+0.054
 afar+0.052
 nowhere+0.051
ear+0.051
 upon+0.050
Neg logit contributions
girls-0.040
 piping-0.037
 depends-0.037
 chip-0.036
 seventeen-0.036
Token about
Token ID546
Feature activation+2.55e+00

Pos logit contributions
 afar+0.080
err+0.080
ires+0.080
eway+0.074
uttering+0.073
Neg logit contributions
 will-0.076
 wrists-0.075
 chip-0.074
 sa-0.072
 said-0.071
Token the
Token ID262
Feature activation-1.77e+00

Pos logit contributions
ires+0.078
 afar+0.072
err+0.069
ear+0.069
eway+0.068
Neg logit contributions
 wrists-0.120
 chip-0.115
 will-0.115
tails-0.113
 depends-0.112
Token nice
Token ID3621
Feature activation-1.02e-01

Pos logit contributions
 will+0.059
 zo+0.056
 merry+0.052
 said+0.051
 depends+0.050
Neg logit contributions
ges-0.089
ires-0.087
ied-0.087
ayed-0.086
pped-0.086
Token guard
Token ID4860
Feature activation+5.62e+00

Pos logit contributions
 will+0.003
 said+0.003
 says+0.003
 chip+0.003
 nodded+0.003
Neg logit contributions
ires-0.005
 afar-0.005
pped-0.005
uttering-0.005
ayed-0.005
Token.
Token ID13
Feature activation+3.07e-01

Pos logit contributions
ires+0.205
 afar+0.191
err+0.187
ear+0.187
eway+0.183
Neg logit contributions
 wrists-0.229
 chip-0.215
tails-0.213
 will-0.212
 sa-0.210
Token She
Token ID1375
Feature activation-2.63e+00

Pos logit contributions
ires+0.015
 afar+0.014
err+0.014
ear+0.014
eway+0.013
Neg logit contributions
 wrists-0.009
 chip-0.009
 will-0.009
tails-0.009
 sa-0.009
Token was
Token ID373
Feature activation+1.29e+00

Pos logit contributions
 wrists+0.095
 chip+0.094
tails+0.093
girls+0.092
 piping+0.090
Neg logit contributions
 afar-0.098
 nowhere-0.098
ear-0.097
err-0.097
 upon-0.096
Token very
Token ID845
Feature activation-1.95e+00

Pos logit contributions
ires+0.051
 afar+0.049
ear+0.047
err+0.047
uttering+0.047
Neg logit contributions
 wrists-0.050
 chip-0.047
 will-0.046
tails-0.046
 depends-0.045
Token glad
Token ID9675
Feature activation+2.42e+00

Pos logit contributions
 wrists+0.067
 will+0.066
 chip+0.065
 sa+0.063
 said+0.063
Neg logit contributions
ires-0.083
err-0.078
 afar-0.078
eway-0.078
ear-0.075
Token,
Token ID11
Feature activation+8.91e-02

Pos logit contributions
 wrists+0.146
 depends+0.142
girls+0.140
tails+0.138
 chip+0.133
Neg logit contributions
ires-0.139
ear-0.135
ges-0.133
ied-0.132
foot-0.129
Token
Token ID6184
Feature activation-2.97e-01

Pos logit contributions
ires+0.003
ear+0.003
ges+0.003
 upon+0.003
earing+0.003
Neg logit contributions
 wrists-0.004
tails-0.004
girls-0.004
 depends-0.004
 sa-0.003
Token
Token ID95
Feature activation-8.90e-01

Pos logit contributions
 wrists+0.011
 will+0.010
 sa+0.010
tails+0.010
 chip+0.010
Neg logit contributions
ires-0.012
 afar-0.011
ear-0.011
uttering-0.011
foot-0.011
Token
Token ID26391
Feature activation-1.89e+00

Pos logit contributions
 wrists+0.035
 sa+0.034
 chip+0.033
tails+0.033
 will+0.033
Neg logit contributions
ires-0.034
 afar-0.032
ear-0.032
err-0.032
eway-0.031
Token
Token ID129
Feature activation-3.00e+00

Pos logit contributions
 sa+0.063
 wrists+0.062
 chip+0.059
 ra+0.056
 depends+0.056
Neg logit contributions
ires-0.078
ear-0.078
 upon-0.077
foot-0.077
earing-0.076
Token
Token ID241
Feature activation+6.61e+00

Pos logit contributions
 wrists+0.133
 chip+0.125
tails+0.124
 will+0.123
 depends+0.121
Neg logit contributions
ires-0.100
err-0.093
ges-0.092
eway-0.091
uttering-0.091
TokenOk
Token ID18690
Feature activation-1.42e+00

Pos logit contributions
ires+0.253
 afar+0.237
err+0.233
ear+0.231
eway+0.228
Neg logit contributions
 wrists-0.260
 chip-0.247
 will-0.246
tails-0.242
 sa-0.241
Token!
Token ID0
Feature activation+1.67e+00

Pos logit contributions
 wrists+0.065
girls+0.064
tails+0.063
 depends+0.062
 sa+0.060
Neg logit contributions
 here-0.042
 upon-0.041
ear-0.040
all-0.040
ging-0.039
Tokenâ
Token ID22940
Feature activation-8.71e-01

Pos logit contributions
ires+0.056
ear+0.055
foot+0.053
 upon+0.052
ges+0.052
Neg logit contributions
 wrists-0.074
tails-0.069
 sa-0.069
 ra-0.066
 depends-0.065
Token
Token ID26391
Feature activation-1.84e+00

Pos logit contributions
 wrists+0.037
 sa+0.035
tails+0.035
girls+0.034
 chip+0.034
Neg logit contributions
ires-0.031
ear-0.030
earing-0.030
err-0.029
 afar-0.029
Token She
Token ID1375
Feature activation-2.16e+00

Pos logit contributions
 wrists+0.062
 sa+0.061
 chip+0.057
 ra+0.055
 depends+0.055
Neg logit contributions
ear-0.077
foot-0.076
 upon-0.075
ires-0.074
earing-0.074
Token The
Token ID383
Feature activation-1.59e+00

Pos logit contributions
ires+0.001
 afar+0.001
uttering+0.001
eway+0.001
ging+0.001
Neg logit contributions
 chip-0.001
 will-0.001
 Together-0.001
 depends-0.001
 wrists-0.001
Token end
Token ID886
Feature activation+8.04e-01

Pos logit contributions
 will+0.056
 sa+0.052
 chip+0.052
 said+0.051
 depends+0.051
Neg logit contributions
ges-0.072
ires-0.072
ear-0.071
earing-0.067
uttering-0.067
Token.
Token ID13
Feature activation+5.46e-02

Pos logit contributions
 afar+0.040
ires+0.040
uttering+0.038
ges+0.038
ging+0.038
Neg logit contributions
 will-0.023
 chip-0.022
 says-0.021
 said-0.021
 Together-0.020
Token<|endoftext|>
Token ID50256
Feature activation+3.77e+00

Pos logit contributions
ires+0.003
 afar+0.002
err+0.002
ear+0.002
eway+0.002
Neg logit contributions
 wrists-0.002
 will-0.002
 chip-0.002
tails-0.002
 sa-0.002
TokenOnce
Token ID7454
Feature activation+1.44e+00

Pos logit contributions
ires+0.193
 afar+0.185
ear+0.180
err+0.179
uttering+0.178
Neg logit contributions
 wrists-0.096
 will-0.096
 chip-0.094
 sa-0.089
 depends-0.088
Token upon
Token ID2402
Feature activation+8.03e+00

Pos logit contributions
 afar+0.035
err+0.029
ear+0.029
ires+0.028
eway+0.028
Neg logit contributions
 said-0.081
 will-0.081
 says-0.079
 -0.079
 Together-0.076
Token a
Token ID257
Feature activation+4.14e-02

Pos logit contributions
ires+0.446
ging+0.424
ening+0.418
ied+0.417
ocol+0.411
Neg logit contributions
 will-0.198
 said-0.194
 -0.184
-0.182
 chip-0.177
Token time
Token ID640
Feature activation+3.04e+00

Pos logit contributions
ires+0.001
ges+0.001
eway+0.001
ging+0.001
 afar+0.001
Neg logit contributions
 will-0.002
-0.002
 chip-0.002
 said-0.002
 wrists-0.002
Token,
Token ID11
Feature activation+6.86e-01

Pos logit contributions
 afar+0.219
ires+0.212
err+0.204
ear+0.200
away+0.198
Neg logit contributions
 -0.037
 will-0.031
 said-0.028
 Together-0.027
 says-0.024
Token there
Token ID612
Feature activation+3.46e+00

Pos logit contributions
ear+0.033
ires+0.033
 afar+0.033
err+0.032
ocol+0.031
Neg logit contributions
 will-0.023
 -0.022
 said-0.022
 they-0.020
-0.020
Token was
Token ID373
Feature activation+2.11e+00

Pos logit contributions
ires+0.192
 afar+0.186
uttering+0.182
ear+0.181
err+0.179
Neg logit contributions
 will-0.088
 chip-0.080
 said-0.079
 says-0.075
 -0.073
Token mean
Token ID1612
Feature activation+8.17e-01

Pos logit contributions
 chip+0.062
 will+0.061
 sa+0.058
 depends+0.057
 wrists+0.057
Neg logit contributions
ires-0.069
ges-0.066
eway-0.065
ear-0.064
ied-0.063
Token cat
Token ID3797
Feature activation+1.25e+00

Pos logit contributions
ires+0.038
eway+0.037
ied+0.036
ges+0.035
foot+0.035
Neg logit contributions
 chip-0.027
 will-0.026
 said-0.024
 depends-0.024
 sa-0.024
Token.
Token ID13
Feature activation+3.25e-01

Pos logit contributions
ires+0.044
 afar+0.042
ear+0.042
 underground+0.041
ges+0.041
Neg logit contributions
 wrists-0.052
 chip-0.049
 sa-0.048
 depends-0.047
 will-0.047
Token<|endoftext|>
Token ID50256
Feature activation+4.06e+00

Pos logit contributions
ires+0.013
err+0.012
ges+0.012
ear+0.012
 afar+0.012
Neg logit contributions
 wrists-0.013
 sa-0.012
tails-0.012
 will-0.012
 scar-0.012
TokenOnce
Token ID7454
Feature activation+1.71e+00

Pos logit contributions
ires+0.201
 afar+0.191
err+0.187
ear+0.187
eway+0.184
Neg logit contributions
 wrists-0.112
 will-0.109
 chip-0.107
 sa-0.103
tails-0.102
Token upon
Token ID2402
Feature activation+8.28e+00

Pos logit contributions
 afar+0.045
ear+0.040
ires+0.038
err+0.038
vised+0.038
Neg logit contributions
 said-0.093
 will-0.093
 -0.091
 says-0.091
 Together-0.086
Token a
Token ID257
Feature activation+2.45e-01

Pos logit contributions
ires+0.457
ging+0.431
ied+0.425
ening+0.423
ocol+0.422
Neg logit contributions
 will-0.213
 said-0.201
 -0.186
-0.183
 chip-0.182
Token time
Token ID640
Feature activation+3.21e+00

Pos logit contributions
ires+0.008
ges+0.007
eway+0.007
ging+0.007
ied+0.007
Neg logit contributions
 will-0.011
-0.011
 chip-0.011
 said-0.011
Together-0.010
Token,
Token ID11
Feature activation+1.02e+00

Pos logit contributions
 afar+0.233
ires+0.226
ear+0.220
err+0.216
ening+0.213
Neg logit contributions
 -0.039
 said-0.029
 will-0.028
 Together-0.026
 they-0.024
Token there
Token ID612
Feature activation+3.60e+00

Pos logit contributions
ear+0.047
ires+0.045
 afar+0.045
err+0.044
vised+0.043
Neg logit contributions
 will-0.037
 said-0.035
 -0.033
-0.031
 says-0.031
Token was
Token ID373
Feature activation+2.35e+00

Pos logit contributions
ires+0.212
 afar+0.203
uttering+0.201
ear+0.200
ging+0.196
Neg logit contributions
 will-0.085
 said-0.077
 chip-0.071
 -0.070
 says-0.068
Token modern
Token ID3660
Feature activation+1.95e+00

Pos logit contributions
 wrists+0.043
 will+0.042
 chip+0.042
 sa+0.041
tails+0.040
Neg logit contributions
ires-0.037
 afar-0.034
err-0.034
ear-0.034
eway-0.033
Token wonder
Token ID4240
Feature activation+5.99e-01

Pos logit contributions
ires+0.088
ges+0.078
eway+0.078
ied+0.078
err+0.076
Neg logit contributions
 wrists-0.068
 will-0.068
 chip-0.068
 sa-0.064
 depends-0.062
Token!
Token ID0
Feature activation+8.38e-01

Pos logit contributions
ires+0.023
 afar+0.021
err+0.021
ear+0.021
eway+0.021
Neg logit contributions
 wrists-0.024
 chip-0.022
 will-0.022
tails-0.022
 sa-0.022
Token<|endoftext|>
Token ID50256
Feature activation+3.62e+00

Pos logit contributions
ires+0.035
err+0.033
ear+0.032
ges+0.032
earing+0.032
Neg logit contributions
 wrists-0.031
 sa-0.031
 scar-0.029
 chip-0.029
 ra-0.029
TokenOnce
Token ID7454
Feature activation+1.15e+00

Pos logit contributions
ires+0.151
 afar+0.142
err+0.138
ear+0.138
uttering+0.136
Neg logit contributions
 wrists-0.128
 will-0.127
 chip-0.124
 sa-0.120
 depends-0.120
Token upon
Token ID2402
Feature activation+7.71e+00

Pos logit contributions
 afar+0.026
ear+0.023
err+0.022
ires+0.022
vised+0.021
Neg logit contributions
 said-0.067
 will-0.066
 -0.066
 says-0.065
 Together-0.062
Token a
Token ID257
Feature activation-1.70e-01

Pos logit contributions
ires+0.444
ging+0.417
ied+0.416
ishing+0.410
ening+0.408
Neg logit contributions
 will-0.177
 said-0.173
 -0.160
 kissed-0.150
 chip-0.148
Token time
Token ID640
Feature activation+2.81e+00

Pos logit contributions
 will+0.007
+0.007
 chip+0.007
 said+0.007
Together+0.007
Neg logit contributions
ires-0.006
ied-0.006
ges-0.006
eway-0.006
ging-0.005
Token,
Token ID11
Feature activation+4.86e-01

Pos logit contributions
 afar+0.206
ires+0.203
err+0.196
ear+0.194
ishing+0.190
Neg logit contributions
 -0.035
 they-0.027
 said-0.025
 will-0.022
 Together-0.019
Token there
Token ID612
Feature activation+3.28e+00

Pos logit contributions
ires+0.024
ear+0.024
err+0.023
ayed+0.023
 afar+0.022
Neg logit contributions
 -0.015
 will-0.015
 said-0.014
 they-0.014
 looked-0.013
Token was
Token ID373
Feature activation+1.83e+00

Pos logit contributions
ires+0.186
 afar+0.176
uttering+0.176
ear+0.175
err+0.172
Neg logit contributions
 will-0.080
 said-0.074
 chip-0.071
 -0.068
 says-0.067
Token,
Token ID11
Feature activation+8.99e-02

Pos logit contributions
 wrists+0.063
tails+0.059
 sa+0.059
 depends+0.059
 chip+0.058
Neg logit contributions
ires-0.061
 underground-0.055
 behind-0.055
ges-0.055
eway-0.055
Token and
Token ID290
Feature activation+1.20e+00

Pos logit contributions
ires+0.004
err+0.003
 afar+0.003
ear+0.003
eway+0.003
Neg logit contributions
 wrists-0.003
 will-0.003
 chip-0.003
 sa-0.003
tails-0.003
Token a
Token ID257
Feature activation-6.66e-01

Pos logit contributions
ires+0.043
 afar+0.041
ear+0.039
err+0.039
foot+0.038
Neg logit contributions
 wrists-0.049
 chip-0.048
tails-0.047
 will-0.046
 sa-0.046
Token sandbox
Token ID35204
Feature activation-3.80e+00

Pos logit contributions
 wrists+0.023
girls+0.023
 depends+0.022
 Together+0.022
Together+0.022
Neg logit contributions
 upon-0.025
 underground-0.025
err-0.025
 afar-0.025
uttering-0.024
Token.
Token ID13
Feature activation+6.12e-01

Pos logit contributions
 wrists+0.164
 depends+0.159
tails+0.159
 chip+0.158
 sa+0.157
Neg logit contributions
ires-0.126
 underground-0.114
 behind-0.113
eway-0.111
err-0.111
Token Where
Token ID6350
Feature activation+2.25e+00

Pos logit contributions
ires+0.028
 afar+0.027
err+0.026
ear+0.026
ges+0.026
Neg logit contributions
 wrists-0.019
 will-0.019
tails-0.018
 chip-0.018
 sa-0.018
Token is
Token ID318
Feature activation+1.35e+00

Pos logit contributions
ires+0.121
 afar+0.116
err+0.114
ear+0.113
eway+0.112
Neg logit contributions
 wrists-0.054
 will-0.052
 chip-0.050
 depends-0.048
tails-0.048
Token Ben
Token ID3932
Feature activation-6.07e-01

Pos logit contributions
ires+0.052
 afar+0.049
err+0.047
foot+0.047
ear+0.047
Neg logit contributions
 wrists-0.052
 will-0.050
 chip-0.049
 depends-0.049
tails-0.049
Token?
Token ID30
Feature activation+8.53e-01

Pos logit contributions
 will+0.023
 wrists+0.023
 chip+0.022
 depends+0.022
 sa+0.022
Neg logit contributions
ires-0.024
err-0.022
 afar-0.022
eway-0.022
uttering-0.021
Token
Token ID198
Feature activation-3.27e-01

Pos logit contributions
ires+0.037
 afar+0.035
err+0.034
ges+0.034
ear+0.034
Neg logit contributions
 wrists-0.029
tails-0.027
 sa-0.027
 chip-0.027
 will-0.027
Token
Token ID198
Feature activation-3.19e-01

Pos logit contributions
 will+0.007
 says+0.007
 chip+0.006
 said+0.006
 wrists+0.006
Neg logit contributions
ires-0.018
 afar-0.018
err-0.018
uttering-0.017
eway-0.017
Token and
Token ID290
Feature activation+1.38e+00

Pos logit contributions
 earlier+0.024
 outside+0.022
ear+0.022
 upon+0.022
 here+0.022
Neg logit contributions
 wrists-0.047
 rhy-0.042
tails-0.041
Together-0.041
 piping-0.041
Token Mia
Token ID32189
Feature activation-4.24e-01

Pos logit contributions
ires+0.035
 afar+0.035
 here+0.035
ear+0.034
foot+0.032
Neg logit contributions
 wrists-0.063
tails-0.063
Together-0.061
 slot-0.059
 Together-0.059
Token.
Token ID13
Feature activation+9.59e-01

Pos logit contributions
 wrists+0.022
 chip+0.020
Together+0.020
tails+0.019
 Fer+0.019
Neg logit contributions
 upon-0.011
 outside-0.011
 earlier-0.010
 underground-0.010
ear-0.010
Token We
Token ID775
Feature activation-1.15e+00

Pos logit contributions
ires+0.034
ear+0.031
 afar+0.031
ges+0.030
err+0.030
Neg logit contributions
 wrists-0.042
tails-0.040
 sa-0.040
 scar-0.037
 chip-0.037
Token want
Token ID765
Feature activation+4.94e+00

Pos logit contributions
 wrists+0.051
 chip+0.050
tails+0.045
 sle+0.045
Together+0.044
Neg logit contributions
 afar-0.038
 upon-0.036
ear-0.036
 underground-0.035
 outside-0.034
Token you
Token ID345
Feature activation+2.17e+00

Pos logit contributions
ires+0.138
 afar+0.134
 outside+0.129
ear+0.129
foot+0.128
Neg logit contributions
 wrists-0.228
tails-0.227
Together-0.222
 depends-0.216
 Together-0.216
Token to
Token ID284
Feature activation-1.51e+00

Pos logit contributions
ires+0.068
 upon+0.067
ear+0.067
 afar+0.065
 underground+0.064
Neg logit contributions
 wrists-0.095
 chip-0.092
 depends-0.090
tails-0.090
girls-0.088
Token be
Token ID307
Feature activation-1.88e+00

Pos logit contributions
 depends+0.065
 wrists+0.065
rob+0.061
 Together+0.060
 slot+0.060
Neg logit contributions
 afar-0.049
ear-0.046
 underground-0.045
 outside-0.044
away-0.043
Token good
Token ID922
Feature activation-2.46e+00

Pos logit contributions
 wrists+0.077
 chip+0.074
 will+0.072
tails+0.071
 depends+0.071
Neg logit contributions
ires-0.070
 afar-0.066
ear-0.064
err-0.064
eway-0.062
Token.
Token ID13
Feature activation+5.58e-01

Pos logit contributions
 wrists+0.113
tails+0.107
 depends+0.103
Together+0.103
 chip+0.103
Neg logit contributions
ires-0.079
ear-0.074
 outside-0.073
 underground-0.073
 afar-0.072
Token We
Token ID775
Feature activation-1.40e+00

Pos logit contributions
ires+0.021
 afar+0.020
ear+0.019
foot+0.019
err+0.018
Neg logit contributions
 wrists-0.023
tails-0.022
 sa-0.022
 chip-0.021
 will-0.021
Token
Token ID198
Feature activation-5.07e-01

Pos logit contributions
 will+0.011
 chip+0.011
 wrists+0.011
 said+0.010
 says+0.010
Neg logit contributions
ires-0.026
 afar-0.025
err-0.025
uttering-0.025
ening-0.025
TokenMom
Token ID29252
Feature activation-1.20e+00

Pos logit contributions
 wrists+0.016
 chip+0.016
 will+0.016
tails+0.015
 sa+0.015
Neg logit contributions
ires-0.023
 afar-0.022
err-0.022
eway-0.021
ear-0.021
Token says
Token ID1139
Feature activation-4.37e+00

Pos logit contributions
 wrists+0.049
 chip+0.044
tails+0.044
 spe+0.042
 rhy+0.041
Neg logit contributions
ires-0.045
 upon-0.043
 behind-0.042
 outside-0.042
ied-0.041
Token,
Token ID11
Feature activation-1.54e-02

Pos logit contributions
 wrists+0.160
tails+0.149
 depends+0.147
 chip+0.145
girls+0.143
Neg logit contributions
ires-0.179
 afar-0.167
ges-0.166
ear-0.164
uttering-0.163
Token "
Token ID366
Feature activation+2.11e+00

Pos logit contributions
 wrists+0.001
tails+0.001
girls+0.001
 depends+0.001
rob+0.001
Neg logit contributions
 upon-0.000
ires-0.000
ges-0.000
foot-0.000
 outside-0.000
TokenYou
Token ID1639
Feature activation+1.27e+00

Pos logit contributions
ires+0.079
 afar+0.074
err+0.073
ear+0.071
uttering+0.070
Neg logit contributions
 wrists-0.086
 chip-0.081
 will-0.081
tails-0.081
 sa-0.080
Token have
Token ID423
Feature activation-9.30e-01

Pos logit contributions
 upon+0.056
 underground+0.055
ires+0.055
 afar+0.054
ear+0.053
Neg logit contributions
 wrists-0.044
 chip-0.039
Together-0.036
tails-0.036
rob-0.036
Token to
Token ID284
Feature activation-1.58e+00

Pos logit contributions
 wrists+0.045
tails+0.043
Together+0.042
 chip+0.042
girls+0.042
Neg logit contributions
 afar-0.025
 underground-0.025
ires-0.025
ear-0.025
 upon-0.025
Token wash
Token ID13502
Feature activation-3.94e+00

Pos logit contributions
 wrists+0.055
 depends+0.050
girls+0.048
tails+0.048
 chip+0.048
Neg logit contributions
ear-0.066
 afar-0.066
ires-0.065
 underground-0.064
foot-0.063
Token.
Token ID13
Feature activation+6.93e-01

Pos logit contributions
 wrists+0.145
 will+0.143
 chip+0.135
 sa+0.135
 says+0.129
Neg logit contributions
ires-0.162
uttering-0.159
err-0.158
eway-0.155
 afar-0.152
Token You
Token ID921
Feature activation+1.74e+00

Pos logit contributions
ires+0.028
err+0.027
eway+0.027
uttering+0.027
 afar+0.026
Neg logit contributions
 wrists-0.025
 chip-0.024
 will-0.024
 depends-0.023
 sa-0.023
Token to
Token ID284
Feature activation-1.59e+00

Pos logit contributions
 wrists+0.133
 chip+0.125
 will+0.124
tails+0.122
 depends+0.122
Neg logit contributions
ires-0.177
 afar-0.165
err-0.164
ear-0.164
eway-0.162
Token the
Token ID262
Feature activation-1.73e+00

Pos logit contributions
 wrists+0.059
 chip+0.057
 will+0.056
 sa+0.055
tails+0.055
Neg logit contributions
ires-0.066
err-0.062
uttering-0.061
eway-0.061
ear-0.061
Token bunny
Token ID44915
Feature activation+1.48e+00

Pos logit contributions
 wrists+0.067
 will+0.062
tails+0.062
 depends+0.062
girls+0.062
Neg logit contributions
ires-0.067
 afar-0.063
err-0.063
 underground-0.060
ear-0.059
Token,
Token ID11
Feature activation-1.42e-01

Pos logit contributions
ires+0.069
 afar+0.065
err+0.064
ear+0.063
eway+0.062
Neg logit contributions
 wrists-0.046
 will-0.045
 chip-0.044
tails-0.043
 sa-0.043
Token "
Token ID366
Feature activation+2.24e+00

Pos logit contributions
 wrists+0.004
 chip+0.004
 will+0.004
 sa+0.004
 said+0.004
Neg logit contributions
ires-0.007
 afar-0.007
err-0.007
ear-0.007
eway-0.007
TokenFl
Token ID7414
Feature activation-1.41e-01

Pos logit contributions
ires+0.102
eway+0.093
uttering+0.090
 underground+0.089
ear+0.088
Neg logit contributions
 will-0.080
 chip-0.080
 wrists-0.080
 sa-0.077
-0.073
Tokenopsy
Token ID44522
Feature activation-6.12e-01

Pos logit contributions
 wrists+0.005
 chip+0.005
 will+0.005
tails+0.005
 sa+0.005
Neg logit contributions
ires-0.006
eway-0.006
 afar-0.005
err-0.005
ear-0.005
Token,
Token ID11
Feature activation-5.37e-02

Pos logit contributions
 wrists+0.022
 chip+0.020
 sa+0.020
 depends+0.019
tails+0.019
Neg logit contributions
ires-0.025
 afar-0.024
ear-0.024
err-0.024
ges-0.023
Token why
Token ID1521
Feature activation+2.05e+00

Pos logit contributions
 will+0.002
 wrists+0.002
 chip+0.002
 sa+0.001
 depends+0.001
Neg logit contributions
ires-0.003
 afar-0.002
err-0.002
uttering-0.002
eway-0.002
Token did
Token ID750
Feature activation-2.22e+00

Pos logit contributions
ires+0.096
 upon+0.091
 afar+0.090
err+0.089
ear+0.088
Neg logit contributions
 wrists-0.064
 chip-0.058
tails-0.056
girls-0.056
 Together-0.055
Token you
Token ID345
Feature activation+2.00e+00

Pos logit contributions
 wrists+0.107
 depends+0.101
 chip+0.100
 will+0.100
 Together+0.099
Neg logit contributions
ires-0.062
 afar-0.056
ear-0.054
foot-0.054
 underground-0.052
Token to
Token ID284
Feature activation-1.65e+00

Pos logit contributions
ires+0.086
 afar+0.080
err+0.078
eway+0.078
ear+0.078
Neg logit contributions
 wrists-0.092
 chip-0.087
 will-0.086
 sa-0.085
 depends-0.085
Token be
Token ID307
Feature activation-1.99e+00

Pos logit contributions
 wrists+0.054
 depends+0.050
 chip+0.049
 sa+0.049
girls+0.049
Neg logit contributions
 afar-0.073
ires-0.071
eway-0.069
ear-0.069
 underground-0.068
Token careful
Token ID8161
Feature activation-1.27e+00

Pos logit contributions
 wrists+0.077
 will+0.075
 chip+0.073
 sa+0.073
tails+0.072
Neg logit contributions
ires-0.079
err-0.072
eway-0.071
 afar-0.071
ear-0.070
Token and
Token ID290
Feature activation+6.07e-01

Pos logit contributions
 chip+0.044
 wrists+0.044
 will+0.043
tails+0.042
 sa+0.041
Neg logit contributions
ires-0.054
 afar-0.052
err-0.051
ear-0.048
uttering-0.048
Token not
Token ID407
Feature activation+1.61e+00

Pos logit contributions
 afar+0.024
ires+0.024
ear+0.023
 underground+0.023
 upon+0.022
Neg logit contributions
 wrists-0.023
 chip-0.021
 piping-0.020
girls-0.020
 sa-0.020
Token get
Token ID651
Feature activation+2.93e-01

Pos logit contributions
 afar+0.054
ires+0.053
ear+0.053
foot+0.052
 underground+0.052
Neg logit contributions
 wrists-0.068
 depends-0.062
Together-0.061
rob-0.061
girls-0.061
Token too
Token ID1165
Feature activation-1.89e+00

Pos logit contributions
ires+0.012
eway+0.011
err+0.011
ear+0.011
uttering+0.011
Neg logit contributions
 will-0.011
 wrists-0.011
 sa-0.011
 chip-0.010
tails-0.010
Token close
Token ID1969
Feature activation+2.81e-01

Pos logit contributions
 will+0.059
 sa+0.057
 chip+0.056
tails+0.055
 said+0.055
Neg logit contributions
ires-0.092
eway-0.086
err-0.084
 afar-0.082
overed-0.079
Token to
Token ID284
Feature activation-1.71e+00

Pos logit contributions
ires+0.011
 afar+0.010
err+0.010
ear+0.010
eway+0.010
Neg logit contributions
 wrists-0.011
 chip-0.010
 will-0.010
tails-0.010
 sa-0.010
Token the
Token ID262
Feature activation-1.93e+00

Pos logit contributions
 wrists+0.067
 chip+0.065
 will+0.065
 sa+0.064
tails+0.064
Neg logit contributions
ires-0.066
err-0.062
uttering-0.061
eway-0.059
ear-0.059
Token fence
Token ID13990
Feature activation+3.10e-01

Pos logit contributions
 wrists+0.072
 chip+0.068
 will+0.068
tails+0.067
 sa+0.067
Neg logit contributions
ires-0.078
 afar-0.074
err-0.073
ear-0.072
eway-0.071
Token to
Token ID284
Feature activation-1.86e+00

Pos logit contributions
ires+0.076
 afar+0.072
err+0.071
eway+0.069
ear+0.069
Neg logit contributions
 wrists-0.072
 will-0.070
 chip-0.070
 sa-0.068
 depends-0.067
Token swim
Token ID9422
Feature activation+1.92e+00

Pos logit contributions
 wrists+0.075
 sa+0.067
 chip+0.065
tails+0.064
 cones+0.063
Neg logit contributions
ear-0.073
 afar-0.072
ires-0.071
 underground-0.071
eway-0.071
Token in
Token ID287
Feature activation-4.05e-01

Pos logit contributions
ires+0.070
 afar+0.064
err+0.063
ear+0.062
eway+0.061
Neg logit contributions
 wrists-0.080
 will-0.079
 chip-0.077
 sa-0.075
tails-0.075
Token the
Token ID262
Feature activation-1.89e+00

Pos logit contributions
 wrists+0.013
 will+0.013
 chip+0.013
 sa+0.012
 depends+0.012
Neg logit contributions
ires-0.019
err-0.017
uttering-0.017
 afar-0.017
ear-0.017
Token big
Token ID1263
Feature activation-7.85e-01

Pos logit contributions
 wrists+0.059
 will+0.056
 chip+0.056
 sa+0.054
tails+0.054
Neg logit contributions
ires-0.088
 afar-0.083
err-0.083
ear-0.082
eway-0.081
Token pond
Token ID16723
Feature activation-1.42e+00

Pos logit contributions
 wrists+0.031
tails+0.030
Together+0.029
 chip+0.029
 Claus+0.028
Neg logit contributions
 underground-0.029
 afar-0.028
ires-0.028
ening-0.026
away-0.026
Token,
Token ID11
Feature activation-4.50e-01

Pos logit contributions
 wrists+0.057
 chip+0.054
 will+0.054
tails+0.053
 depends+0.053
Neg logit contributions
ires-0.052
 afar-0.049
ear-0.048
err-0.048
eway-0.047
Token and
Token ID290
Feature activation+5.62e-01

Pos logit contributions
 wrists+0.016
 will+0.016
 chip+0.016
 sa+0.015
 depends+0.015
Neg logit contributions
ires-0.019
 afar-0.017
err-0.017
eway-0.017
ear-0.017
Token the
Token ID262
Feature activation-1.80e+00

Pos logit contributions
ires+0.020
 afar+0.020
ear+0.020
 behind+0.020
 underground+0.019
Neg logit contributions
 wrists-0.023
tails-0.021
 depends-0.020
 chip-0.020
 rhy-0.020
Token lion
Token ID18744
Feature activation+2.92e+00

Pos logit contributions
 wrists+0.063
 chip+0.060
 will+0.060
 sa+0.058
tails+0.058
Neg logit contributions
ires-0.078
 afar-0.073
err-0.072
ear-0.071
eway-0.071
Token felt
Token ID2936
Feature activation-3.82e+00

Pos logit contributions
ires+0.116
 afar+0.115
ear+0.115
 underground+0.114
ied+0.113
Neg logit contributions
 wrists-0.100
 chip-0.096
tails-0.093
 depends-0.090
 piping-0.088
Token watched
Token ID7342
Feature activation-1.74e+00

Pos logit contributions
ires+0.003
 afar+0.003
ied+0.003
 behind+0.003
err+0.003
Neg logit contributions
 wrists-0.003
tails-0.003
 chip-0.003
girls-0.002
 depends-0.002
Token as
Token ID355
Feature activation+2.38e+00

Pos logit contributions
 chip+0.065
 wrists+0.064
 sa+0.063
 will+0.061
 ra+0.060
Neg logit contributions
ires-0.069
err-0.066
 afar-0.065
uttering-0.063
eway-0.062
Token the
Token ID262
Feature activation-1.53e+00

Pos logit contributions
ires+0.091
 afar+0.087
err+0.084
eway+0.083
ear+0.082
Neg logit contributions
 wrists-0.094
tails-0.088
 will-0.087
 chip-0.087
 depends-0.086
Token wind
Token ID2344
Feature activation-2.65e+00

Pos logit contributions
 wrists+0.058
 will+0.054
 chip+0.054
tails+0.053
 depends+0.053
Neg logit contributions
ires-0.061
 afar-0.058
err-0.057
ear-0.056
eway-0.055
Token blew
Token ID17948
Feature activation-2.03e+00

Pos logit contributions
 wrists+0.083
 chip+0.076
 depends+0.076
tails+0.076
 will+0.073
Neg logit contributions
ires-0.118
 afar-0.114
ges-0.114
ied-0.113
ging-0.113
Token away
Token ID1497
Feature activation-1.26e+00

Pos logit contributions
 wrists+0.085
 chip+0.079
 will+0.079
tails+0.079
 depends+0.079
Neg logit contributions
ires-0.073
 afar-0.069
err-0.067
ear-0.066
eway-0.066
Token all
Token ID477
Feature activation-3.07e+00

Pos logit contributions
 wrists+0.056
 depends+0.053
tails+0.052
 will+0.052
girls+0.052
Neg logit contributions
ires-0.041
eway-0.039
 afar-0.038
 underground-0.038
 behind-0.038
Token his
Token ID465
Feature activation+2.16e+00

Pos logit contributions
 wrists+0.118
 will+0.116
 chip+0.115
 sa+0.111
 depends+0.110
Neg logit contributions
ires-0.120
 afar-0.112
err-0.110
ear-0.108
uttering-0.108
Token carefully
Token ID7773
Feature activation-2.31e+00

Pos logit contributions
ires+0.093
eway+0.089
err+0.088
ear+0.086
 afar+0.086
Neg logit contributions
 wrists-0.077
 will-0.074
 sa-0.072
 chip-0.071
tails-0.070
Token planted
Token ID18969
Feature activation+2.81e-01

Pos logit contributions
 wrists+0.094
 chip+0.089
tails+0.089
 depends+0.089
 will+0.089
Neg logit contributions
ires-0.082
 afar-0.078
ear-0.076
ges-0.075
err-0.075
Token wheat
Token ID17135
Feature activation-3.03e+00

Pos logit contributions
ires+0.010
 afar+0.010
err+0.010
ear+0.009
eway+0.009
Neg logit contributions
 wrists-0.011
 chip-0.011
 will-0.011
tails-0.011
 sa-0.011
Tokenamas
Token ID17485
Feature activation-3.96e+00

Pos logit contributions
 will+0.101
 depends+0.101
 wrists+0.098
 says+0.094
 chip+0.093
Neg logit contributions
ires-0.101
err-0.093
foot-0.091
ear-0.090
uttering-0.089
Token.
Token ID13
Feature activation+7.42e-01

Pos logit contributions
 wrists+0.192
 depends+0.183
tails+0.173
 chip+0.172
girls+0.171
Neg logit contributions
 outside-0.114
 behind-0.113
ires-0.113
 underground-0.111
ear-0.110
Token Then
Token ID3244
Feature activation-2.90e+00

Pos logit contributions
ires+0.037
 afar+0.037
uttering+0.037
ging+0.037
ges+0.036
Neg logit contributions
 wrists-0.019
 Together-0.019
 chip-0.018
 sa-0.017
 will-0.017
Token he
Token ID339
Feature activation+6.86e-01

Pos logit contributions
 wrists+0.144
girls+0.135
 depends+0.133
tails+0.130
 chip+0.127
Neg logit contributions
foot-0.080
 upon-0.078
 underground-0.077
ear-0.077
ires-0.077
Token sn
Token ID3013
Feature activation-3.66e+00

Pos logit contributions
 upon+0.029
eh+0.029
ear+0.028
 afar+0.028
 outside+0.028
Neg logit contributions
 wrists-0.022
 chip-0.021
girls-0.021
 depends-0.021
tails-0.020
Tokenugg
Token ID6837
Feature activation-1.24e+00

Pos logit contributions
 will+0.124
 wrists+0.122
 chip+0.119
 Together+0.118
+0.117
Neg logit contributions
ires-0.158
err-0.151
eway-0.151
 afar-0.150
 microscopic-0.142
Tokenled
Token ID992
Feature activation+3.59e+00

Pos logit contributions
 wrists+0.074
 chip+0.072
 will+0.071
tails+0.071
 depends+0.070
Neg logit contributions
ires-0.023
 afar-0.019
err-0.019
ear-0.018
eway-0.018
Token up
Token ID510
Feature activation+4.86e+00

Pos logit contributions
ires+0.128
uttering+0.115
eway+0.113
err+0.113
ierce+0.110
Neg logit contributions
 sa-0.164
 will-0.159
 wrists-0.155
 scar-0.154
 said-0.150
Token in
Token ID287
Feature activation+3.35e-01

Pos logit contributions
ires+0.173
 afar+0.159
err+0.156
eway+0.154
ear+0.154
Neg logit contributions
 wrists-0.207
 chip-0.201
 will-0.199
 sa-0.199
tails-0.193
Token his
Token ID465
Feature activation+2.51e+00

Pos logit contributions
ires+0.013
 afar+0.012
foot+0.012
ear+0.012
err+0.012
Neg logit contributions
 wrists-0.013
tails-0.012
 depends-0.012
 will-0.012
 chip-0.012
Token bed
Token ID3996
Feature activation-5.40e+00

Pos logit contributions
ires+0.115
 afar+0.109
err+0.108
 underground+0.108
foot+0.107
Neg logit contributions
 wrists-0.076
 depends-0.073
 will-0.072
girls-0.072
 chip-0.072
Token the
Token ID262
Feature activation-1.21e+00

Pos logit contributions
ires+0.036
err+0.034
eway+0.033
 afar+0.033
uttering+0.033
Neg logit contributions
 wrists-0.047
 will-0.045
 sa-0.044
 chip-0.044
tails-0.044
Token bag
Token ID6131
Feature activation-4.52e+00

Pos logit contributions
 wrists+0.033
 Together+0.031
tails+0.030
girls+0.030
Together+0.029
Neg logit contributions
ires-0.059
 upon-0.056
 underground-0.056
 afar-0.056
err-0.055
Token.
Token ID13
Feature activation+1.08e+00

Pos logit contributions
 wrists+0.185
 chip+0.174
 will+0.172
 depends+0.172
tails+0.170
Neg logit contributions
ires-0.167
ear-0.152
 afar-0.152
err-0.152
eway-0.151
Token
Token ID198
Feature activation+5.36e-01

Pos logit contributions
ires+0.051
 afar+0.048
err+0.047
ear+0.047
eway+0.046
Neg logit contributions
 wrists-0.033
 sa-0.031
 chip-0.031
 will-0.031
tails-0.030
Token
Token ID198
Feature activation+4.87e-01

Pos logit contributions
ires+0.027
 afar+0.026
err+0.025
ear+0.025
uttering+0.025
Neg logit contributions
 wrists-0.014
 will-0.014
 chip-0.014
tails-0.013
 depends-0.013
TokenThe
Token ID464
Feature activation-8.80e-01

Pos logit contributions
ires+0.024
 afar+0.023
err+0.022
ges+0.022
ening+0.022
Neg logit contributions
Together-0.015
 will-0.014
 chip-0.014
 wrists-0.014
tails-0.014
Token little
Token ID1310
Feature activation+3.33e+00

Pos logit contributions
 sa+0.035
 will+0.035
 chip+0.035
 scar+0.033
 said+0.033
Neg logit contributions
eway-0.036
 afar-0.035
ges-0.035
ires-0.035
ear-0.034
Token girl
Token ID2576
Feature activation-8.31e-01

Pos logit contributions
ires+0.113
 afar+0.107
err+0.106
 upon+0.106
 underground+0.106
Neg logit contributions
 wrists-0.139
girls-0.138
tails-0.133
 chip-0.132
 depends-0.130
Token decided
Token ID3066
Feature activation+9.58e-01

Pos logit contributions
 wrists+0.029
 chip+0.027
girls+0.027
tails+0.026
 depends+0.026
Neg logit contributions
ear-0.033
ires-0.033
earing-0.033
ied-0.033
 underground-0.032
Token to
Token ID284
Feature activation-1.09e+00

Pos logit contributions
ires+0.038
 afar+0.036
err+0.035
ear+0.035
eway+0.035
Neg logit contributions
 wrists-0.037
 chip-0.035
 will-0.035
tails-0.034
 sa-0.034
Token take
Token ID1011
Feature activation-2.73e-01

Pos logit contributions
 wrists+0.042
girls+0.039
 depends+0.038
 chip+0.038
rob+0.038
Neg logit contributions
ear-0.041
ires-0.041
foot-0.040
eway-0.040
 outside-0.039
Token's
Token ID338
Feature activation-3.59e-01

Pos logit contributions
ires+0.061
eway+0.057
 afar+0.056
uttering+0.056
err+0.056
Neg logit contributions
 will-0.023
 said-0.023
 sa-0.022
 wrists-0.022
 chip-0.022
Token mom
Token ID1995
Feature activation-1.45e+00

Pos logit contributions
 chip+0.009
 will+0.009
 sa+0.009
 wrists+0.008
 cones+0.008
Neg logit contributions
ires-0.019
eway-0.019
 afar-0.019
ear-0.019
ges-0.018
Token took
Token ID1718
Feature activation-1.28e+00

Pos logit contributions
 wrists+0.053
girls+0.043
tails+0.042
 merry+0.040
 cones+0.040
Neg logit contributions
 upon-0.066
ied-0.064
 outside-0.063
ires-0.062
earing-0.062
Token him
Token ID683
Feature activation+5.21e+00

Pos logit contributions
 sa+0.058
 chip+0.056
 said+0.055
 says+0.054
 will+0.054
Neg logit contributions
ires-0.045
uttering-0.042
eway-0.041
err-0.040
 microscopic-0.039
Token to
Token ID284
Feature activation-1.50e+00

Pos logit contributions
ires+0.290
 afar+0.269
ocol+0.265
err+0.264
eway+0.263
Neg logit contributions
 will-0.136
 sa-0.129
tails-0.127
 said-0.122
 chip-0.116
Token the
Token ID262
Feature activation-1.54e+00

Pos logit contributions
 wrists+0.058
 will+0.055
 chip+0.054
tails+0.053
 depends+0.053
Neg logit contributions
ires-0.059
 afar-0.056
err-0.054
ear-0.054
eway-0.053
Token store
Token ID3650
Feature activation-1.83e+00

Pos logit contributions
 wrists+0.056
 chip+0.054
 will+0.053
 sa+0.052
tails+0.052
Neg logit contributions
ires-0.064
 afar-0.060
err-0.059
ear-0.058
eway-0.058
Token.
Token ID13
Feature activation+3.05e-01

Pos logit contributions
 wrists+0.079
 will+0.074
 depends+0.074
 chip+0.073
tails+0.073
Neg logit contributions
ires-0.064
 afar-0.059
 underground-0.058
ear-0.058
uttering-0.058
Token She
Token ID1375
Feature activation-2.30e+00

Pos logit contributions
ires+0.013
 afar+0.012
err+0.012
ear+0.012
eway+0.012
Neg logit contributions
 wrists-0.011
 will-0.010
 chip-0.010
tails-0.010
 sa-0.010
Token said
Token ID531
Feature activation-3.77e+00

Pos logit contributions
 wrists+0.086
girls+0.082
tails+0.081
 rhy+0.078
 depends+0.078
Neg logit contributions
 upon-0.089
ires-0.086
ied-0.085
 nowhere-0.084
 behind-0.084
Token,
Token ID11
Feature activation-1.49e-02

Pos logit contributions
 wrists+0.138
tails+0.130
 depends+0.130
 will+0.128
girls+0.127
Neg logit contributions
ires-0.154
ear-0.143
ges-0.142
 afar-0.142
ied-0.140
Token
Token ID198
Feature activation-3.42e-01

Pos logit contributions
 wrists+0.125
 will+0.121
tails+0.119
 depends+0.118
 sa+0.116
Neg logit contributions
ires-0.137
 afar-0.133
ear-0.132
err-0.129
ges-0.128
Token
Token ID198
Feature activation-3.21e-01

Pos logit contributions
 wrists+0.010
 chip+0.010
 will+0.010
tails+0.009
 depends+0.009
Neg logit contributions
ires-0.016
 afar-0.015
err-0.015
uttering-0.015
ear-0.015
Token"
Token ID1
Feature activation+2.96e+00

Pos logit contributions
Together+0.011
 chip+0.010
 will+0.010
 wrists+0.010
tails+0.010
Neg logit contributions
ires-0.015
ening-0.013
 afar-0.013
uttering-0.013
err-0.013
TokenOh
Token ID5812
Feature activation-9.72e-01

Pos logit contributions
ires+0.120
 afar+0.112
err+0.111
ear+0.109
eway+0.109
Neg logit contributions
 wrists-0.111
 chip-0.105
 will-0.105
tails-0.102
 sa-0.102
Token no
Token ID645
Feature activation-3.30e+00

Pos logit contributions
 wrists+0.038
girls+0.035
Together+0.034
 sa+0.034
 chip+0.034
Neg logit contributions
 upon-0.037
ear-0.036
earing-0.036
 here-0.036
eway-0.035
Token,"
Token ID553
Feature activation-6.72e+00

Pos logit contributions
 wrists+0.118
tails+0.106
 will+0.106
 chip+0.106
 depends+0.105
Neg logit contributions
ires-0.143
 afar-0.136
ear-0.134
err-0.133
eway-0.131
Token said
Token ID531
Feature activation-3.62e+00

Pos logit contributions
 wrists+0.216
 will+0.210
tails+0.210
girls+0.202
 depends+0.202
Neg logit contributions
 afar-0.283
 behind-0.273
ires-0.262
 upon-0.261
ear-0.257
Token Lily
Token ID20037
Feature activation+1.45e+00

Pos logit contributions
 wrists+0.155
 will+0.153
 depends+0.150
tails+0.145
Together+0.144
Neg logit contributions
ires-0.117
 afar-0.114
ges-0.113
 behind-0.110
err-0.110
Token.
Token ID13
Feature activation+6.54e-01

Pos logit contributions
 behind+0.040
earing+0.039
 earlier+0.036
 outside+0.036
 upon+0.036
Neg logit contributions
 wrists-0.070
 depends-0.069
rob-0.069
 Fer-0.068
 ra-0.068
Token "
Token ID366
Feature activation+2.42e+00

Pos logit contributions
ires+0.027
err+0.025
 afar+0.025
ges+0.024
eway+0.024
Neg logit contributions
 wrists-0.024
 will-0.023
 sa-0.023
tails-0.023
 depends-0.022
TokenI
Token ID40
Feature activation+5.49e-01

Pos logit contributions
ires+0.124
ied+0.119
ening+0.118
 underground+0.117
eway+0.117
Neg logit contributions
 chip-0.060
 will-0.059
 wrists-0.058
Together-0.055
They-0.054
Token,
Token ID11
Feature activation+8.47e-01

Pos logit contributions
ires+0.121
 afar+0.113
eway+0.111
err+0.110
ear+0.109
Neg logit contributions
 chip-0.101
 wrists-0.100
 sa-0.097
 will-0.096
 depends-0.092
Token smooth
Token ID7209
Feature activation-9.46e-01

Pos logit contributions
ires+0.036
 afar+0.034
eway+0.033
ear+0.033
err+0.033
Neg logit contributions
 chip-0.029
 wrists-0.029
 sa-0.028
 will-0.028
 said-0.028
Token butter
Token ID9215
Feature activation-7.72e+00

Pos logit contributions
 chip+0.030
 wrists+0.030
 sa+0.029
 will+0.028
 ra+0.027
Neg logit contributions
ires-0.045
eway-0.042
ear-0.041
 afar-0.041
err-0.041
Token.
Token ID13
Feature activation+1.31e+00

Pos logit contributions
 wrists+0.320
 depends+0.300
 will+0.298
 chip+0.297
tails+0.293
Neg logit contributions
ires-0.278
err-0.254
 afar-0.252
 underground-0.252
uttering-0.251
Token The
Token ID383
Feature activation-7.63e-01

Pos logit contributions
ires+0.056
 afar+0.052
err+0.052
ear+0.051
eway+0.051
Neg logit contributions
 wrists-0.047
 chip-0.045
 will-0.044
 sa-0.043
tails-0.043
Token butter
Token ID9215
Feature activation-7.73e+00

Pos logit contributions
 wrists+0.016
tails+0.015
girls+0.014
 chip+0.014
 will+0.014
Neg logit contributions
ires-0.043
 afar-0.041
err-0.041
ear-0.040
uttering-0.040
Token looked
Token ID3114
Feature activation-1.06e+00

Pos logit contributions
 wrists+0.302
girls+0.291
 chip+0.285
tails+0.283
 depends+0.278
Neg logit contributions
ires-0.292
ied-0.274
 underground-0.267
err-0.264
 upon-0.262
Token all
Token ID477
Feature activation-2.37e+00

Pos logit contributions
 will+0.026
 sa+0.025
 wrists+0.025
 chip+0.023
 said+0.023
Neg logit contributions
eway-0.056
ires-0.055
err-0.055
 afar-0.053
ges-0.053
Token around
Token ID1088
Feature activation+4.40e+00

Pos logit contributions
 wrists+0.118
 will+0.115
 chip+0.114
 sa+0.112
 depends+0.111
Neg logit contributions
ires-0.066
 afar-0.061
err-0.059
eway-0.058
ear-0.057
Token,
Token ID11
Feature activation+7.39e-01

Pos logit contributions
err+0.213
eway+0.204
ayed+0.203
uttering+0.200
ires+0.199
Neg logit contributions
 will-0.128
 said-0.119
 -0.116
-0.113
 says-0.112
Token seeing
Token ID4379
Feature activation+2.83e+00

Pos logit contributions
ires+0.032
err+0.031
eway+0.031
ear+0.030
 afar+0.030
Neg logit contributions
 wrists-0.025
 will-0.024
 chip-0.024
 sa-0.023
 said-0.023
Token try
Token ID1949
Feature activation-9.63e-01

Pos logit contributions
ires+0.039
 afar+0.038
ear+0.035
 underground+0.035
 upon+0.034
Neg logit contributions
 wrists-0.038
 chip-0.038
tails-0.036
Together-0.036
 sa-0.036
Token to
Token ID284
Feature activation-1.64e+00

Pos logit contributions
 wrists+0.037
 chip+0.036
 will+0.035
tails+0.035
 sa+0.035
Neg logit contributions
ires-0.037
 afar-0.035
err-0.034
ear-0.034
eway-0.034
Token say
Token ID910
Feature activation-3.99e+00

Pos logit contributions
 wrists+0.051
 chip+0.048
 sa+0.047
 will+0.047
 depends+0.047
Neg logit contributions
ires-0.075
 afar-0.073
ear-0.071
err-0.069
eway-0.069
Token the
Token ID262
Feature activation-1.78e+00

Pos logit contributions
 wrists+0.181
 chip+0.178
tails+0.174
Together+0.172
 depends+0.170
Neg logit contributions
ires-0.123
 afar-0.116
ear-0.115
eway-0.107
 underground-0.105
Token words
Token ID2456
Feature activation-2.26e+00

Pos logit contributions
 wrists+0.073
tails+0.070
 chip+0.068
 Together+0.066
 kissed+0.066
Neg logit contributions
ires-0.065
 afar-0.062
err-0.061
 underground-0.059
eway-0.057
Token,"
Token ID553
Feature activation-6.96e+00

Pos logit contributions
 wrists+0.087
tails+0.081
 chip+0.079
girls+0.078
 depends+0.077
Neg logit contributions
ires-0.091
ear-0.087
 behind-0.085
 upon-0.084
 outside-0.083
Token the
Token ID262
Feature activation-1.78e+00

Pos logit contributions
 wrists+0.253
tails+0.251
girls+0.244
 will+0.244
 sle+0.232
Neg logit contributions
 afar-0.269
ires-0.258
 behind-0.247
 nowhere-0.237
 outside-0.236
Token monkey
Token ID21657
Feature activation+2.18e+00

Pos logit contributions
 wrists+0.065
 chip+0.063
 will+0.062
 sa+0.061
 depends+0.060
Neg logit contributions
ires-0.074
 afar-0.069
err-0.068
ear-0.068
eway-0.068
Token said
Token ID531
Feature activation-3.93e+00

Pos logit contributions
ires+0.131
 afar+0.128
ear+0.127
err+0.127
earing+0.126
Neg logit contributions
 wrists-0.036
 chip-0.033
 sa-0.031
girls-0.029
 ra-0.029
Token.
Token ID13
Feature activation+1.59e-01

Pos logit contributions
 wrists+0.161
 will+0.153
 chip+0.152
tails+0.150
 depends+0.150
Neg logit contributions
ires-0.143
 afar-0.133
ear-0.131
err-0.130
eway-0.129
Token
Token ID220
Feature activation-6.10e+00

Pos logit contributions
ires+0.006
err+0.005
 upon+0.005
ear+0.005
ges+0.005
Neg logit contributions
tails-0.007
 wrists-0.007
 will-0.006
 scar-0.006
 depends-0.006
Token to
Token ID284
Feature activation-1.50e+00

Pos logit contributions
ires+0.005
 afar+0.005
err+0.005
ear+0.005
eway+0.005
Neg logit contributions
 wrists-0.006
 will-0.006
 chip-0.006
 depends-0.005
tails-0.005
Token help
Token ID1037
Feature activation-3.31e+00

Pos logit contributions
 wrists+0.061
 kissed+0.056
girls+0.056
 piping+0.056
 sa+0.054
Neg logit contributions
 afar-0.056
 underground-0.052
 freely-0.049
 upon-0.049
eway-0.049
Token the
Token ID262
Feature activation-1.83e+00

Pos logit contributions
 wrists+0.132
 chip+0.127
 will+0.126
 sa+0.123
tails+0.123
Neg logit contributions
ires-0.127
 afar-0.118
err-0.117
ear-0.115
eway-0.114
Token ant
Token ID1885
Feature activation+3.10e+00

Pos logit contributions
 chip+0.080
 will+0.075
 sa+0.075
 depends+0.071
 cones+0.071
Neg logit contributions
eway-0.069
ear-0.067
ires-0.067
ges-0.065
ied-0.064
Token.
Token ID13
Feature activation+1.82e-01

Pos logit contributions
ires+0.120
 underground+0.116
 afar+0.115
eway+0.113
ear+0.112
Neg logit contributions
 wrists-0.118
 chip-0.108
 sa-0.105
 depends-0.103
 will-0.103
Token Together
Token ID17083
Feature activation-7.47e+00

Pos logit contributions
ires+0.008
ear+0.007
 afar+0.007
err+0.007
ges+0.007
Neg logit contributions
 wrists-0.007
tails-0.006
 will-0.006
 scar-0.006
 sa-0.006
Token,
Token ID11
Feature activation-1.25e-01

Pos logit contributions
 wrists+0.370
 depends+0.344
tails+0.330
girls+0.325
 Claus+0.319
Neg logit contributions
 underground-0.215
ires-0.215
foot-0.209
 outside-0.208
ied-0.207
Token they
Token ID484
Feature activation-3.81e+00

Pos logit contributions
 wrists+0.003
tails+0.003
 depends+0.003
 will+0.003
 rhy+0.003
Neg logit contributions
ires-0.006
 afar-0.006
 underground-0.006
 behind-0.006
ges-0.006
Token moved
Token ID3888
Feature activation+1.80e-01

Pos logit contributions
 wrists+0.140
 chip+0.136
 depends+0.133
+0.131
 piping+0.130
Neg logit contributions
 behind-0.147
ied-0.146
 underground-0.145
 outside-0.144
ges-0.143
Token more
Token ID517
Feature activation-5.73e+00

Pos logit contributions
ires+0.008
err+0.007
 afar+0.007
ear+0.007
uttering+0.007
Neg logit contributions
 wrists-0.007
 will-0.006
 chip-0.006
 sa-0.006
tails-0.006
Token and
Token ID290
Feature activation+9.29e-01

Pos logit contributions
 will+0.196
 wrists+0.185
 sa+0.182
 scar+0.180
 said+0.176
Neg logit contributions
ires-0.253
 afar-0.245
err-0.242
uttering-0.236
eway-0.227
Token "
Token ID366
Feature activation+2.25e+00

Pos logit contributions
ires+0.016
 afar+0.015
err+0.014
ear+0.014
eway+0.014
Neg logit contributions
 wrists-0.011
 chip-0.010
 will-0.010
tails-0.010
 sa-0.010
TokenBut
Token ID1537
Feature activation+1.60e-01

Pos logit contributions
ires+0.113
 underground+0.110
eway+0.109
ied+0.108
ocol+0.106
Neg logit contributions
Together-0.060
 will-0.059
 chip-0.056
 wrists-0.054
They-0.053
Token I
Token ID314
Feature activation-5.13e-01

Pos logit contributions
ires+0.008
err+0.007
uttering+0.007
 afar+0.007
eway+0.007
Neg logit contributions
 will-0.005
 wrists-0.005
 chip-0.005
 sa-0.005
 depends-0.005
Token like
Token ID588
Feature activation+4.08e+00

Pos logit contributions
 will+0.017
 wrists+0.015
 said+0.015
 depends+0.015
 chip+0.015
Neg logit contributions
ires-0.024
eway-0.022
err-0.022
 afar-0.022
uttering-0.022
Token it
Token ID340
Feature activation+1.86e+00

Pos logit contributions
ires+0.162
err+0.154
eway+0.151
uttering+0.150
 afar+0.149
Neg logit contributions
 wrists-0.158
 chip-0.156
 will-0.154
 sa-0.154
 depends-0.145
Token,"
Token ID553
Feature activation-7.02e+00

Pos logit contributions
ires+0.071
 afar+0.066
ear+0.065
 underground+0.065
err+0.064
Neg logit contributions
 wrists-0.074
tails-0.066
 depends-0.066
girls-0.065
 chip-0.065
Token he
Token ID339
Feature activation+4.29e-01

Pos logit contributions
 wrists+0.355
 will+0.330
tails+0.327
girls+0.326
 depends+0.321
Neg logit contributions
 afar-0.188
ires-0.181
foot-0.175
 upon-0.171
 behind-0.167
Token said
Token ID531
Feature activation-3.78e+00

Pos logit contributions
earing+0.022
err+0.022
eh+0.022
ear+0.022
ied+0.021
Neg logit contributions
 piping-0.009
 wrists-0.009
 will-0.009
 sle-0.009
 depends-0.009
Token.
Token ID13
Feature activation+3.66e-01

Pos logit contributions
 wrists+0.146
 will+0.139
 chip+0.138
tails+0.136
 sa+0.135
Neg logit contributions
ires-0.147
 afar-0.138
err-0.136
ear-0.135
eway-0.133
Token
Token ID198
Feature activation-5.76e-01

Pos logit contributions
ires+0.015
err+0.014
 upon+0.013
 afar+0.013
foot+0.013
Neg logit contributions
 wrists-0.014
tails-0.013
 will-0.013
 depends-0.013
girls-0.012
Token
Token ID198
Feature activation-5.67e-01

Pos logit contributions
 chip+0.012
Together+0.011
 Together+0.011
 said+0.011
 will+0.011
Neg logit contributions
ires-0.033
 afar-0.032
uttering-0.031
err-0.031
ges-0.030