TOP ACTIVATIONS
MAX = 13.225

make the waiter drop the pizz as . They said sorry
not played with the ash tr ay . They wished they
the door and give the pizz as to the people ."
. Let 's put the pizz as in the oven and
will not touch the ash tr ay again ," Tim says
you go , enjoy your pizz as ." "
. We can share the mel on ." Tom said .
had learned to appreciate the therm ometer . <|endoftext|> John lay
and the crabs and the seaw eed . Sam
wanted to keep the ash tr ay . So he tried
not play with the ash tr ay . It is dangerous
, you should have the therm ometer in your mouth .
should never touch an ash tr ay ." S
friends and always shared their jelly . <|endoftext|> L ily and
was able to share the mel on with his friends and
. We will share the oct opus together . Let 's
should have shared my l oll ipop with you . It
were brave and kept the therm ometer in your mouth !"
he decided to remove the seaw eed from the water so
should not tease the ost ric hes . They are not

MIN ACTIVATIONS
MIN = -12.159

." <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> Anna and Ben are playing with
<|endoftext|> Once upon a time on a bright sunny day there
more squash ." <|endoftext|> <|endoftext|> <|endoftext|> L ily and Sam went
L ily and Sam went on a saf ari with their
that much more . <|endoftext|> Amy woke up to a sunny
." <|endoftext|> <|endoftext|> <|endoftext|> L ily and Sam went on a
love ." <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> Anna and Ben are playing
, pretty chamber . <|endoftext|> Amy was walking through the woods
! <|endoftext|> One day , on a hot summer 's day
. <|endoftext|> One day , on a cold winter morning ,
Once upon a time , on a clean Tuesday morning ,
wearing her cap . <|endoftext|> John drove his car on the
. <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> L ila and Ben are
next one . <|endoftext|> <|endoftext|> Tim and Jen are twins .
whole way home . <|endoftext|> Amy was a very brave girl
Once upon a time , in a small village , there
Once upon a time , in a small village , there
Once upon a time , in a lively little town ,
Once upon a time , in a small village there lived
Once upon a time , in a big park , there

INTERVAL 6.879 - 13.225
CONTAINS 0.546%

because there was no more rice left . S
y again after accepting the stone and it made her feel
I used to sail the seas and look for treasure .
sparkling wand and a shiny crown . With these items ,
money to pay for the food . Dad was so relieved

INTERVAL 0.533 - 6.879
CONTAINS 45.529%

The wolf came closer and closer , and Benny knew
on the branch , practicing her balancing skills . Other animals
and had lots of fun . From that day
Whenever the tree stretches his branches in the morning , he
and , with a big smile on his face , he

INTERVAL -5.813 - 0.533
CONTAINS 49.494%

John put them in his large pocket and then
, George was walking near a lake and he saw a
a cold lizard . She was so excited ! She wanted
there lived a large dragon in a forest . He had
it herself . Mrs . Smith was so surprised and touched

INTERVAL -12.159 - -5.813
CONTAINS 4.430%

garden . Katy 's dad took out a microscope . He
there was a girl named Alice . She wanted to go
<|endoftext|> Once upon a time , there was a happy dog
<|endoftext|> Mike y was walking through the park . He noticed
. <|endoftext|> Once upon a time , there was a graceful
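
The four INTERVAL ranges above are consistent with splitting the span from MIN (-12.159) to MAX (13.225) into four equal-width buckets, and each CONTAINS figure reads as the share of token activations falling in that bucket. The sketch below reproduces the bucket edges under that assumption. The helper names `equal_width_bins` and `bucket_shares` are hypothetical and are not taken from the tool that produced this dump, and `toy_activations` is made-up illustration data, not values from the dump.

```python
def equal_width_bins(lo, hi, n=4):
    """Split [lo, hi] into n equal-width (start, end) buckets, lowest first."""
    width = (hi - lo) / n
    edges = [lo + i * width for i in range(n)] + [hi]  # pin the last edge to hi
    return [(edges[i], edges[i + 1]) for i in range(n)]


def bucket_shares(values, bins):
    """Return the percentage of values in each bucket (last bucket is closed on the right)."""
    counts = [0] * len(bins)
    for v in values:
        for i, (start, end) in enumerate(bins):
            is_last = i == len(bins) - 1
            if start <= v < end or (is_last and v == end):
                counts[i] += 1
                break
    total = len(values)
    return [100.0 * c / total for c in counts]


# MIN and MAX as reported in the dump.
bins = equal_width_bins(-12.159, 13.225)

# Print highest bucket first, matching the order used in the dump.
for start, end in reversed(bins):
    print(f"INTERVAL {start:.3f} - {end:.3f}")

# Hypothetical activations, only to show how a CONTAINS percentage could be computed.
toy_activations = [-9.08, -3.27, -0.43, 0.6, 3.19, 4.09, 12.6, 13.225]
shares = bucket_shares(toy_activations, bins)
```

With the reported MIN and MAX, the bucket width is (13.225 - (-12.159)) / 4 = 6.346, which gives interior edges at -5.813, 0.533, and 6.879, matching the INTERVAL lines.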
Token make
Token ID787
Feature activation-4.31e-01

Pos logit contributions
uts+0.045
ding+0.041
vere+0.041
ometer+0.039
othe+0.038
Neg logit contributions
en-0.036
 me-0.035
ere-0.035
ened-0.034
el-0.032
Token the
Token ID262
Feature activation+4.09e+00

Pos logit contributions
en+0.006
ere+0.005
 me+0.005
ened+0.005
el+0.005
Neg logit contributions
uts-0.014
ding-0.013
vere-0.013
ache-0.013
othe-0.013
Token waiter
Token ID46612
Feature activation+4.83e+00

Pos logit contributions
uts+0.104
vere+0.091
othe+0.091
ding+0.088
ometer+0.084
Neg logit contributions
ere-0.085
 me-0.084
en-0.083
ened-0.081
 colored-0.079
Token drop
Token ID4268
Feature activation-3.45e-01

Pos logit contributions
uts+0.138
vere+0.129
ding+0.129
ometer+0.126
EEP+0.124
Neg logit contributions
en-0.077
ened-0.071
 me-0.067
el-0.066
 two-0.063
Token the
Token ID262
Feature activation+4.15e+00

Pos logit contributions
en+0.004
ened+0.004
ere+0.004
 me+0.003
el+0.003
Neg logit contributions
uts-0.012
vere-0.011
ding-0.011
othe-0.011
ache-0.010
Token pizz
Token ID38748
Feature activation+1.32e+01

Pos logit contributions
uts+0.095
vere+0.083
othe+0.082
ding+0.081
ache+0.077
Neg logit contributions
ere-0.095
en-0.093
ened-0.093
 me-0.088
 colored-0.086
Tokenas
Token ID292
Feature activation+5.42e+00

Pos logit contributions
othe+0.292
vere+0.270
 bleed+0.269
 numb+0.264
uts+0.256
Neg logit contributions
al-0.250
el-0.248
ere-0.243
en-0.240
 two-0.237
Token.
Token ID13
Feature activation+3.19e+00

Pos logit contributions
uts+0.161
vere+0.154
ometer+0.145
othe+0.145
ding+0.144
Neg logit contributions
en-0.087
ened-0.079
el-0.075
 me-0.073
ere-0.066
Token They
Token ID1119
Feature activation+6.00e-01

Pos logit contributions
vere+0.091
uts+0.091
ding+0.089
ache+0.088
othe+0.083
Neg logit contributions
ere-0.046
 me-0.045
en-0.043
ened-0.042
 Every-0.041
Token said
Token ID531
Feature activation-8.99e-01

Pos logit contributions
uts+0.018
vere+0.017
othe+0.017
ding+0.017
ache+0.016
Neg logit contributions
ere-0.009
en-0.009
ened-0.008
 me-0.007
el-0.007
Token sorry
Token ID7926
Feature activation-7.23e-01

Pos logit contributions
ere+0.023
ened+0.023
en+0.023
 me+0.022
 colored+0.021
Neg logit contributions
uts-0.017
ding-0.015
vere-0.015
othe-0.014
ometer-0.014
Token not
Token ID407
Feature activation+3.64e+00

Pos logit contributions
uts+0.038
vere+0.034
ding+0.034
othe+0.034
ometer+0.033
Neg logit contributions
ere-0.028
en-0.027
ened-0.027
 me-0.026
 colored-0.025
Token played
Token ID2826
Feature activation+1.50e+00

Pos logit contributions
uts+0.103
othe+0.092
vere+0.092
ding+0.092
ache+0.089
Neg logit contributions
ere-0.063
en-0.061
ened-0.060
 me-0.054
el-0.052
Token with
Token ID351
Feature activation+4.05e-01

Pos logit contributions
uts+0.044
ding+0.041
vere+0.041
othe+0.040
ometer+0.039
Neg logit contributions
en-0.023
ere-0.023
 me-0.022
ened-0.022
 colored-0.021
Token the
Token ID262
Feature activation+4.29e+00

Pos logit contributions
uts+0.013
vere+0.012
ding+0.012
othe+0.012
ache+0.011
Neg logit contributions
ere-0.006
en-0.005
ened-0.005
 me-0.005
 colored-0.005
Token ash
Token ID12530
Feature activation+9.35e+00

Pos logit contributions
uts+0.126
vere+0.113
ding+0.112
othe+0.111
ache+0.107
Neg logit contributions
ere-0.071
en-0.069
ened-0.068
 me-0.064
 colored-0.062
Tokentr
Token ID2213
Feature activation+1.28e+01

Pos logit contributions
vere+0.148
uts+0.145
othe+0.126
zel+0.112
vern+0.110
Neg logit contributions
en-0.268
 me-0.267
el-0.259
 colored-0.259
ened-0.257
Tokenay
Token ID323
Feature activation+4.30e+00

Pos logit contributions
uts+0.392
vere+0.365
othe+0.355
ache+0.350
ages+0.346
Neg logit contributions
el-0.192
ened-0.173
en-0.172
ere-0.168
 me-0.151
Token.
Token ID13
Feature activation+3.45e+00

Pos logit contributions
uts+0.121
vere+0.117
othe+0.108
 numb+0.104
uer+0.103
Neg logit contributions
en-0.075
ened-0.074
el-0.072
ere-0.069
 me-0.068
Token They
Token ID1119
Feature activation+9.58e-01

Pos logit contributions
uts+0.089
vere+0.083
ding+0.082
ache+0.081
othe+0.079
Neg logit contributions
ere-0.064
en-0.063
ened-0.062
 me-0.060
 colored-0.056
Token wished
Token ID16555
Feature activation+8.01e-01

Pos logit contributions
uts+0.029
vere+0.027
ding+0.027
othe+0.026
ometer+0.025
Neg logit contributions
ere-0.014
ened-0.014
en-0.014
 me-0.013
 colored-0.012
Token they
Token ID484
Feature activation+3.83e+00

Pos logit contributions
uts+0.017
ding+0.015
vere+0.015
ache+0.015
ometer+0.014
Neg logit contributions
ened-0.020
en-0.020
 me-0.019
ere-0.018
el-0.018
Token the
Token ID262
Feature activation+1.37e+00

Pos logit contributions
ere+0.053
ened+0.050
en+0.050
 colored+0.040
eled+0.040
Neg logit contributions
uts-0.166
ding-0.154
othe-0.152
vere-0.152
ometer-0.147
Token door
Token ID3420
Feature activation+3.00e+00

Pos logit contributions
uts+0.047
othe+0.043
vere+0.041
ding+0.040
ache+0.039
Neg logit contributions
 me-0.018
ere-0.015
en-0.015
ened-0.015
 two-0.014
Token and
Token ID290
Feature activation+2.01e+00

Pos logit contributions
uts+0.082
vere+0.075
othe+0.074
ding+0.072
ache+0.071
Neg logit contributions
en-0.054
ere-0.052
ened-0.051
 me-0.050
el-0.047
Token give
Token ID1577
Feature activation-2.43e+00

Pos logit contributions
uts+0.058
vere+0.052
ding+0.051
othe+0.051
ache+0.049
Neg logit contributions
ere-0.034
en-0.033
ened-0.033
 me-0.030
 colored-0.029
Token the
Token ID262
Feature activation+3.18e+00

Pos logit contributions
ere+0.036
ened+0.035
en+0.035
 colored+0.030
efully+0.028
Neg logit contributions
uts-0.071
ding-0.066
othe-0.065
vere-0.064
ometer-0.062
Token pizz
Token ID38748
Feature activation+1.26e+01

Pos logit contributions
uts+0.094
othe+0.084
ding+0.081
vere+0.081
 numb+0.076
Neg logit contributions
ere-0.053
 me-0.053
ened-0.049
en-0.049
 colored-0.049
Tokenas
Token ID292
Feature activation+4.69e+00

Pos logit contributions
othe+0.313
 bleed+0.302
 numb+0.290
vere+0.286
ding+0.276
Neg logit contributions
ere-0.219
al-0.212
 two-0.203
el-0.199
ac-0.194
Token to
Token ID284
Feature activation-5.55e-01

Pos logit contributions
uts+0.131
vere+0.120
othe+0.118
ding+0.117
ache+0.111
Neg logit contributions
ere-0.081
en-0.080
ened-0.078
 me-0.074
 colored-0.069
Token the
Token ID262
Feature activation+1.89e+00

Pos logit contributions
 me+0.007
ere+0.006
en+0.006
ened+0.006
el+0.005
Neg logit contributions
uts-0.019
vere-0.017
ding-0.017
ometer-0.017
othe-0.017
Token people
Token ID661
Feature activation+3.09e+00

Pos logit contributions
uts+0.057
othe+0.051
vere+0.049
ding+0.049
 numb+0.046
Neg logit contributions
 me-0.032
ere-0.029
 colored-0.028
en-0.028
 two-0.028
Token."
Token ID526
Feature activation+1.01e+00

Pos logit contributions
uts+0.082
vere+0.075
othe+0.075
ding+0.072
ometer+0.069
Neg logit contributions
en-0.057
ere-0.057
ened-0.054
 me-0.054
 colored-0.050
Token.
Token ID13
Feature activation+3.60e+00

Pos logit contributions
ere+0.059
ened+0.056
en+0.056
eled+0.052
 me+0.051
Neg logit contributions
uts-0.071
vere-0.061
ding-0.060
othe-0.059
ache-0.058
Token Let
Token ID3914
Feature activation-4.22e+00

Pos logit contributions
uts+0.087
vere+0.076
ding+0.076
othe+0.076
ache+0.072
Neg logit contributions
ere-0.077
en-0.076
ened-0.075
 me-0.070
 colored-0.068
Token's
Token ID338
Feature activation-2.54e-01

Pos logit contributions
ere+0.091
ened+0.089
en+0.083
eled+0.078
 colored+0.077
Neg logit contributions
uts-0.094
ding-0.085
vere-0.082
ache-0.082
othe-0.080
Token put
Token ID1234
Feature activation-2.37e+00

Pos logit contributions
ened+0.006
ere+0.005
en+0.005
eled+0.005
een+0.005
Neg logit contributions
uts-0.006
ding-0.005
othe-0.005
vere-0.005
 explode-0.005
Token the
Token ID262
Feature activation+3.36e+00

Pos logit contributions
ere+0.035
en+0.035
ened+0.034
 me+0.030
 colored+0.029
Neg logit contributions
uts-0.072
vere-0.065
ding-0.065
othe-0.065
ache-0.062
Token pizz
Token ID38748
Feature activation+1.26e+01

Pos logit contributions
uts+0.082
othe+0.071
vere+0.070
ding+0.069
ache+0.065
Neg logit contributions
ere-0.073
ened-0.072
en-0.072
 me-0.070
 colored-0.068
Tokenas
Token ID292
Feature activation+4.77e+00

Pos logit contributions
othe+0.298
uts+0.283
 bleed+0.280
 numb+0.275
vere+0.271
Neg logit contributions
ere-0.247
al-0.233
en-0.223
efully-0.221
el-0.221
Token in
Token ID287
Feature activation-1.47e+00

Pos logit contributions
uts+0.137
vere+0.124
othe+0.123
ding+0.121
ache+0.117
Neg logit contributions
ere-0.080
en-0.079
ened-0.078
 me-0.071
 colored-0.069
Token the
Token ID262
Feature activation+2.69e+00

Pos logit contributions
ere+0.018
ened+0.018
en+0.017
eled+0.014
ered+0.014
Neg logit contributions
uts-0.047
ding-0.044
vere-0.043
othe-0.043
ache-0.041
Token oven
Token ID14361
Feature activation+4.10e+00

Pos logit contributions
uts+0.080
vere+0.071
othe+0.071
ding+0.070
ache+0.067
Neg logit contributions
ere-0.044
en-0.043
ened-0.042
 me-0.040
 colored-0.038
Token and
Token ID290
Feature activation+2.70e+00

Pos logit contributions
uts+0.119
vere+0.108
othe+0.107
ding+0.106
ache+0.102
Neg logit contributions
ere-0.067
en-0.066
ened-0.065
 me-0.060
 colored-0.058
Token will
Token ID481
Feature activation+1.96e+00

Pos logit contributions
uts+0.086
vere+0.079
ding+0.077
othe+0.075
ometer+0.072
Neg logit contributions
ere-0.044
en-0.043
ened-0.042
 colored-0.040
 me-0.039
Token not
Token ID407
Feature activation+2.50e+00

Pos logit contributions
uts+0.053
ding+0.049
vere+0.047
ometer+0.045
othe+0.044
Neg logit contributions
 me-0.038
en-0.036
ere-0.035
 colored-0.033
ened-0.032
Token touch
Token ID3638
Feature activation-1.48e+00

Pos logit contributions
uts+0.073
ding+0.068
vere+0.068
ometer+0.066
othe+0.064
Neg logit contributions
 me-0.044
en-0.042
ere-0.039
 colored-0.036
el-0.036
Token the
Token ID262
Feature activation+4.48e+00

Pos logit contributions
en+0.018
ere+0.018
ened+0.017
 me+0.016
el+0.015
Neg logit contributions
uts-0.049
vere-0.045
ding-0.045
othe-0.045
ometer-0.043
Token ash
Token ID12530
Feature activation+9.71e+00

Pos logit contributions
uts+0.142
vere+0.127
othe+0.124
ding+0.122
ache+0.115
Neg logit contributions
ere-0.066
 me-0.065
ened-0.064
 colored-0.064
en-0.064
Tokentr
Token ID2213
Feature activation+1.26e+01

Pos logit contributions
uts+0.145
vere+0.144
othe+0.126
vern+0.110
 numb+0.105
Neg logit contributions
en-0.293
 me-0.289
el-0.284
al-0.278
ened-0.273
Tokenay
Token ID323
Feature activation+4.15e+00

Pos logit contributions
uts+0.390
vere+0.357
othe+0.349
 villains+0.343
ache+0.342
Neg logit contributions
el-0.193
en-0.171
ened-0.165
ere-0.161
 me-0.153
Token again
Token ID757
Feature activation+3.60e+00

Pos logit contributions
uts+0.118
vere+0.111
othe+0.105
 numb+0.100
ding+0.099
Neg logit contributions
en-0.074
ened-0.072
el-0.070
 me-0.068
ere-0.067
Token,"
Token ID553
Feature activation+1.61e+00

Pos logit contributions
uts+0.094
vere+0.085
ding+0.084
othe+0.083
ometer+0.080
Neg logit contributions
en-0.069
ere-0.068
ened-0.067
 me-0.065
el-0.061
Token Tim
Token ID5045
Feature activation-3.27e+00

Pos logit contributions
uts+0.039
ding+0.035
vere+0.034
ache+0.034
ometer+0.033
Neg logit contributions
ere-0.034
ened-0.033
 me-0.033
en-0.033
 colored-0.031
Token says
Token ID1139
Feature activation-7.98e-01

Pos logit contributions
ere+0.059
ened+0.057
en+0.057
 colored+0.050
el+0.049
Neg logit contributions
uts-0.090
vere-0.080
othe-0.080
ding-0.079
ache-0.077
Token you
Token ID345
Feature activation+1.77e+00

Pos logit contributions
ened+0.032
ere+0.031
en+0.031
eled+0.027
 colored+0.026
Neg logit contributions
uts-0.058
vere-0.052
othe-0.050
ding-0.049
ache-0.048
Token go
Token ID467
Feature activation-1.84e-01

Pos logit contributions
uts+0.041
vere+0.037
othe+0.036
ding+0.036
ometer+0.034
Neg logit contributions
ere-0.039
ened-0.038
en-0.037
eled-0.034
 colored-0.033
Token,
Token ID11
Feature activation+1.40e+00

Pos logit contributions
ere+0.003
en+0.003
ened+0.003
 me+0.002
 colored+0.002
Neg logit contributions
uts-0.006
ding-0.005
vere-0.005
othe-0.005
ache-0.005
Token enjoy
Token ID2883
Feature activation-3.30e+00

Pos logit contributions
uts+0.035
vere+0.032
ding+0.031
othe+0.031
ometer+0.030
Neg logit contributions
ere-0.028
en-0.028
ened-0.027
 me-0.026
 colored-0.025
Token your
Token ID534
Feature activation+3.19e+00

Pos logit contributions
ened+0.053
ere+0.053
en+0.052
 colored+0.044
el+0.044
Neg logit contributions
uts-0.096
vere-0.087
ding-0.087
othe-0.086
ache-0.082
Token pizz
Token ID38748
Feature activation+1.25e+01

Pos logit contributions
uts+0.066
vere+0.056
othe+0.055
ding+0.054
ache+0.052
Neg logit contributions
ere-0.081
ened-0.079
en-0.079
 me-0.076
 colored-0.075
Tokenas
Token ID292
Feature activation+5.13e+00

Pos logit contributions
 bleed+0.334
othe+0.328
 numb+0.315
vere+0.295
 scree+0.294
Neg logit contributions
ere-0.202
ac-0.184
al-0.178
el-0.171
en-0.167
Token."
Token ID526
Feature activation+2.25e+00

Pos logit contributions
uts+0.145
vere+0.142
othe+0.140
ding+0.131
ometer+0.130
Neg logit contributions
en-0.084
 me-0.078
ere-0.078
ened-0.072
el-0.070
Token
Token ID198
Feature activation+4.47e+00

Pos logit contributions
uts+0.063
ache+0.060
vere+0.059
ding+0.059
othe+0.057
Neg logit contributions
ere-0.035
 me-0.035
en-0.032
ened-0.030
 two-0.029
Token
Token ID198
Feature activation+3.91e+00

Pos logit contributions
uts+0.120
vere+0.110
ding+0.108
ache+0.108
othe+0.106
Neg logit contributions
ere-0.081
 me-0.079
en-0.077
ened-0.071
el-0.068
Token"
Token ID1
Feature activation+7.08e-01

Pos logit contributions
uts+0.094
vere+0.083
othe+0.083
ding+0.082
ometer+0.078
Neg logit contributions
ere-0.085
en-0.083
ened-0.081
 me-0.078
 colored-0.075
Token.
Token ID13
Feature activation+4.60e+00

Pos logit contributions
en+0.012
ere+0.012
ened+0.012
 me+0.011
 colored+0.011
Neg logit contributions
uts-0.018
othe-0.016
ding-0.016
vere-0.016
ometer-0.015
Token We
Token ID775
Feature activation+1.73e+00

Pos logit contributions
uts+0.104
vere+0.091
othe+0.090
ding+0.089
ache+0.084
Neg logit contributions
ere-0.106
en-0.105
ened-0.104
 me-0.096
 colored-0.094
Token can
Token ID460
Feature activation+2.38e+00

Pos logit contributions
uts+0.046
othe+0.041
vere+0.041
ding+0.041
ache+0.039
Neg logit contributions
ere-0.032
en-0.031
ened-0.031
 me-0.028
 colored-0.027
Token share
Token ID2648
Feature activation-2.47e+00

Pos logit contributions
uts+0.063
ding+0.057
vere+0.056
othe+0.055
ache+0.054
Neg logit contributions
en-0.044
ere-0.044
ened-0.043
 me-0.042
el-0.039
Token the
Token ID262
Feature activation+5.65e+00

Pos logit contributions
ere+0.025
en+0.024
ened+0.023
 me+0.021
 colored+0.019
Neg logit contributions
uts-0.088
vere-0.081
ding-0.080
othe-0.080
ache-0.078
Token mel
Token ID7758
Feature activation+1.24e+01

Pos logit contributions
uts+0.129
vere+0.113
ding+0.112
othe+0.112
ache+0.105
Neg logit contributions
ere-0.128
en-0.126
ened-0.125
 me-0.117
 colored-0.114
Tokenon
Token ID261
Feature activation+9.59e+00

Pos logit contributions
vere+0.340
othe+0.336
 numb+0.335
 bleed+0.330
uts+0.323
Neg logit contributions
ere-0.203
el-0.198
en-0.180
 me-0.177
al-0.174
Token."
Token ID526
Feature activation+1.67e+00

Pos logit contributions
vere+0.286
ding+0.276
othe+0.273
uts+0.258
ache+0.255
Neg logit contributions
en-0.122
 me-0.121
ie-0.119
 colored-0.113
el-0.112
Token Tom
Token ID4186
Feature activation-3.71e+00

Pos logit contributions
uts+0.055
ache+0.053
vere+0.052
ding+0.050
ometer+0.050
Neg logit contributions
 me-0.021
ere-0.019
ened-0.019
 her-0.017
 the-0.017
Token said
Token ID531
Feature activation-2.37e+00

Pos logit contributions
ere+0.074
en+0.072
ened+0.071
el+0.063
 colored+0.062
Neg logit contributions
uts-0.095
othe-0.083
ding-0.083
vere-0.083
ache-0.081
Token.
Token ID13
Feature activation+2.84e+00

Pos logit contributions
ened+0.035
en+0.032
 me+0.029
ered+0.028
ere+0.028
Neg logit contributions
uts-0.070
ding-0.069
vere-0.064
tr-0.063
ache-0.063
Token had
Token ID550
Feature activation+2.05e-01

Pos logit contributions
uts+0.144
vere+0.130
ding+0.129
othe+0.126
ache+0.124
Neg logit contributions
ened-0.084
ere-0.083
en-0.083
 me-0.076
 colored-0.073
Token learned
Token ID4499
Feature activation+5.13e-01

Pos logit contributions
uts+0.006
vere+0.005
ding+0.005
othe+0.005
ache+0.005
Neg logit contributions
ere-0.003
en-0.003
ened-0.003
 me-0.003
 colored-0.003
Token to
Token ID284
Feature activation+1.12e+00

Pos logit contributions
uts+0.016
ding+0.015
vere+0.015
othe+0.015
ache+0.014
Neg logit contributions
en-0.007
ere-0.007
ened-0.007
 me-0.006
 colored-0.006
Token appreciate
Token ID9144
Feature activation-2.27e+00

Pos logit contributions
uts+0.030
vere+0.026
ding+0.026
othe+0.026
ache+0.025
Neg logit contributions
ere-0.021
en-0.021
ened-0.021
 me-0.019
 colored-0.018
Token the
Token ID262
Feature activation+4.40e+00

Pos logit contributions
ere+0.029
en+0.028
ened+0.028
 me+0.024
 colored+0.023
Neg logit contributions
uts-0.075
vere-0.068
othe-0.068
ding-0.068
ache-0.065
Token therm
Token ID21969
Feature activation+1.23e+01

Pos logit contributions
uts+0.118
ding+0.102
vere+0.101
othe+0.101
ometer+0.094
Neg logit contributions
ere-0.088
en-0.085
ened-0.085
 me-0.082
 colored-0.078
Tokenometer
Token ID15635
Feature activation+8.44e+00

Pos logit contributions
uts+0.132
vere+0.125
 explode+0.113
ding+0.106
othe+0.100
Neg logit contributions
el-0.382
ere-0.381
ened-0.367
en-0.363
 me-0.356
Token.
Token ID13
Feature activation+3.79e+00

Pos logit contributions
uts+0.241
vere+0.229
ding+0.223
othe+0.219
ometer+0.210
Neg logit contributions
en-0.138
ened-0.132
ere-0.122
 me-0.122
 colored-0.116
Token<|endoftext|>
Token ID50256
Feature activation-8.49e+00

Pos logit contributions
uts+0.097
vere+0.088
ding+0.086
ache+0.085
othe+0.084
Neg logit contributions
ere-0.076
en-0.074
ened-0.072
 me-0.069
el-0.066
TokenJohn
Token ID7554
Feature activation-9.08e+00

Pos logit contributions
ened+0.166
en+0.162
ere+0.162
 colored+0.145
ered+0.142
Neg logit contributions
uts-0.218
ding-0.198
vere-0.192
othe-0.188
ometer-0.180
Token lay
Token ID3830
Feature activation-5.32e+00

Pos logit contributions
ere+0.160
efully+0.148
ened+0.146
 forces+0.143
eful+0.141
Neg logit contributions
uts-0.221
ding-0.210
ache-0.206
ometer-0.189
ages-0.186
Token and
Token ID290
Feature activation+4.38e+00

Pos logit contributions
uts+0.214
vere+0.200
othe+0.194
ding+0.193
ometer+0.189
Neg logit contributions
en-0.106
ened-0.098
ere-0.097
 me-0.093
 colored-0.088
Token the
Token ID262
Feature activation+2.98e+00

Pos logit contributions
uts+0.149
othe+0.138
vere+0.136
ding+0.135
ache+0.131
Neg logit contributions
ere-0.050
en-0.046
ened-0.045
el-0.037
 me-0.036
Token crabs
Token ID49172
Feature activation+5.82e+00

Pos logit contributions
uts+0.095
othe+0.084
vere+0.084
ding+0.084
ache+0.080
Neg logit contributions
ere-0.043
en-0.042
ened-0.041
 me-0.041
 colored-0.038
Token and
Token ID290
Feature activation+4.14e+00

Pos logit contributions
uts+0.161
vere+0.153
othe+0.148
ometer+0.147
ding+0.144
Neg logit contributions
en-0.099
ened-0.094
ere-0.091
 me-0.089
 colored-0.087
Token the
Token ID262
Feature activation+3.20e+00

Pos logit contributions
uts+0.145
othe+0.135
vere+0.133
ding+0.133
ache+0.129
Neg logit contributions
ere-0.043
en-0.039
ened-0.038
el-0.031
 me-0.030
Token seaw
Token ID32085
Feature activation+1.22e+01

Pos logit contributions
uts+0.101
othe+0.089
ding+0.089
vere+0.089
ometer+0.085
Neg logit contributions
ere-0.047
en-0.046
ened-0.045
 me-0.044
 colored-0.041
Tokeneed
Token ID2308
Feature activation+6.72e+00

Pos logit contributions
uts+0.223
othe+0.202
vere+0.200
ding+0.185
whe+0.179
Neg logit contributions
en-0.329
ere-0.325
ened-0.310
 me-0.309
 colored-0.292
Token.
Token ID13
Feature activation+3.36e+00

Pos logit contributions
uts+0.181
vere+0.174
othe+0.167
ding+0.164
ometer+0.159
Neg logit contributions
en-0.122
ened-0.118
ere-0.111
 colored-0.107
 me-0.106
Token
Token ID198
Feature activation+4.57e+00

Pos logit contributions
uts+0.090
ache+0.086
ding+0.085
vere+0.085
othe+0.080
Neg logit contributions
ere-0.059
en-0.056
 me-0.054
ened-0.054
 colored-0.050
Token
Token ID198
Feature activation+3.77e+00

Pos logit contributions
uts+0.114
vere+0.102
ding+0.101
othe+0.098
ache+0.098
Neg logit contributions
ere-0.094
en-0.091
ened-0.087
 me-0.087
 colored-0.082
TokenSam
Token ID16305
Feature activation-3.78e+00

Pos logit contributions
uts+0.089
vere+0.078
ding+0.078
othe+0.077
ache+0.074
Neg logit contributions
ere-0.083
en-0.082
ened-0.080
 me-0.076
 colored-0.074
Token wanted
Token ID2227
Feature activation-1.94e+00

Pos logit contributions
ere+0.006
en+0.006
ened+0.005
el+0.005
 me+0.005
Neg logit contributions
uts-0.010
othe-0.009
ding-0.009
vere-0.009
ache-0.009
Token to
Token ID284
Feature activation-1.00e+00

Pos logit contributions
ere+0.036
en+0.036
ened+0.036
 me+0.033
 colored+0.032
Neg logit contributions
uts-0.052
vere-0.047
ding-0.046
othe-0.046
ache-0.044
Token keep
Token ID1394
Feature activation-8.25e-01

Pos logit contributions
en+0.017
ere+0.017
ened+0.016
 me+0.016
 colored+0.015
Neg logit contributions
uts-0.029
ding-0.026
vere-0.026
othe-0.025
ometer-0.025
Token the
Token ID262
Feature activation+4.85e+00

Pos logit contributions
ere+0.012
en+0.011
ened+0.011
 me+0.010
 colored+0.009
Neg logit contributions
uts-0.026
vere-0.024
othe-0.023
ding-0.023
ache-0.022
Token ash
Token ID12530
Feature activation+9.51e+00

Pos logit contributions
uts+0.133
vere+0.119
othe+0.119
ding+0.119
ache+0.113
Neg logit contributions
ere-0.087
en-0.086
ened-0.085
 me-0.076
 colored-0.074
Tokentr
Token ID2213
Feature activation+1.22e+01

Pos logit contributions
uts+0.204
vere+0.197
othe+0.177
ding+0.173
zel+0.163
Neg logit contributions
en-0.211
el-0.210
 me-0.208
ened-0.206
ered-0.195
Tokenay
Token ID323
Feature activation+3.89e+00

Pos logit contributions
uts+0.375
vere+0.345
ache+0.341
othe+0.337
 villains+0.337
Neg logit contributions
el-0.178
ere-0.150
ened-0.149
en-0.148
 me-0.135
Token.
Token ID13
Feature activation+2.90e+00

Pos logit contributions
uts+0.115
vere+0.106
othe+0.103
ding+0.100
 numb+0.097
Neg logit contributions
ened-0.061
en-0.061
ere-0.057
el-0.056
 me-0.056
Token So
Token ID1406
Feature activation+1.55e-01

Pos logit contributions
uts+0.080
vere+0.072
ding+0.071
othe+0.071
ache+0.069
Neg logit contributions
ere-0.052
en-0.051
ened-0.051
 me-0.047
el-0.045
Token he
Token ID339
Feature activation-9.56e-01

Pos logit contributions
uts+0.005
vere+0.005
ding+0.005
othe+0.005
ache+0.005
Neg logit contributions
ere-0.002
en-0.002
ened-0.002
 me-0.002
el-0.002
Token tried
Token ID3088
Feature activation-1.87e+00

Pos logit contributions
ere+0.016
en+0.015
ened+0.015
el+0.013
een+0.013
Neg logit contributions
uts-0.027
ache-0.024
ding-0.024
othe-0.023
ometer-0.023
Token not
Token ID407
Feature activation+9.59e-01

Pos logit contributions
ere+0.005
ened+0.004
en+0.004
 me+0.004
 colored+0.004
Neg logit contributions
uts-0.003
vere-0.003
othe-0.003
ding-0.003
ache-0.003
Token play
Token ID711
Feature activation+8.54e-01

Pos logit contributions
uts+0.027
vere+0.025
ding+0.024
othe+0.024
ometer+0.023
Neg logit contributions
en-0.017
ere-0.016
 me-0.016
ened-0.015
 colored-0.015
Token with
Token ID351
Feature activation+2.52e-02

Pos logit contributions
uts+0.024
ding+0.022
othe+0.021
vere+0.021
ometer+0.021
Neg logit contributions
ere-0.015
en-0.015
 me-0.015
ened-0.013
 colored-0.013
Token the
Token ID262
Feature activation+4.23e+00

Pos logit contributions
uts+0.001
othe+0.001
vere+0.001
ache+0.001
ding+0.001
Neg logit contributions
ened-0.000
en-0.000
ere-0.000
efully-0.000
eled-0.000
Token ash
Token ID12530
Feature activation+9.21e+00

Pos logit contributions
uts+0.132
vere+0.118
ding+0.116
othe+0.116
ache+0.109
Neg logit contributions
ere-0.064
 me-0.061
en-0.060
 colored-0.059
ened-0.059
Tokentr
Token ID2213
Feature activation+1.21e+01

Pos logit contributions
uts+0.142
vere+0.132
othe+0.130
 numb+0.110
ding+0.110
Neg logit contributions
 me-0.275
en-0.270
el-0.255
ere-0.254
 colored-0.254
Tokenay
Token ID323
Feature activation+3.92e+00

Pos logit contributions
uts+0.381
vere+0.352
othe+0.352
ache+0.339
 numb+0.335
Neg logit contributions
el-0.173
en-0.156
ere-0.151
 me-0.144
ened-0.144
Token.
Token ID13
Feature activation+4.10e+00

Pos logit contributions
uts+0.111
vere+0.103
othe+0.102
ding+0.097
 numb+0.095
Neg logit contributions
en-0.068
ere-0.066
ened-0.066
 me-0.064
el-0.062
Token It
Token ID632
Feature activation+4.11e+00

Pos logit contributions
uts+0.112
vere+0.101
ding+0.101
othe+0.099
ache+0.096
Neg logit contributions
ere-0.074
en-0.072
ened-0.071
 me-0.066
el-0.064
Token is
Token ID318
Feature activation+4.81e+00

Pos logit contributions
uts+0.135
vere+0.124
ding+0.123
othe+0.122
ache+0.116
Neg logit contributions
en-0.051
ere-0.051
ened-0.050
 me-0.047
 colored-0.044
Token dangerous
Token ID4923
Feature activation+1.28e+00

Pos logit contributions
uts+0.117
ding+0.104
vere+0.101
othe+0.096
whe+0.091
Neg logit contributions
 me-0.108
en-0.101
ere-0.101
ened-0.100
 colored-0.098
Token,
Token ID11
Feature activation+3.59e+00

Pos logit contributions
ere+0.021
ened+0.020
en+0.020
eled+0.018
efully+0.018
Neg logit contributions
uts-0.038
ding-0.034
vere-0.033
ache-0.033
othe-0.033
Token you
Token ID345
Feature activation+1.65e+00

Pos logit contributions
uts+0.095
ding+0.085
vere+0.085
othe+0.083
ometer+0.080
Neg logit contributions
ere-0.067
en-0.067
 me-0.066
ened-0.065
 colored-0.059
Token should
Token ID815
Feature activation-1.24e-01

Pos logit contributions
uts+0.043
vere+0.038
ding+0.038
othe+0.038
ache+0.037
Neg logit contributions
ere-0.032
en-0.031
ened-0.031
 me-0.028
el-0.027
Token have
Token ID423
Feature activation-7.24e-01

Pos logit contributions
ened+0.003
ere+0.003
en+0.003
eled+0.003
 colored+0.003
Neg logit contributions
uts-0.002
othe-0.002
vere-0.002
ding-0.002
ache-0.002
Token the
Token ID262
Feature activation+4.30e+00

Pos logit contributions
ere+0.012
en+0.012
ened+0.012
el+0.010
 me+0.010
Neg logit contributions
uts-0.021
vere-0.019
ding-0.018
othe-0.018
ache-0.018
Token therm
Token ID21969
Feature activation+1.21e+01

Pos logit contributions
uts+0.125
othe+0.110
ding+0.108
vere+0.107
ache+0.102
Neg logit contributions
ere-0.073
en-0.071
ened-0.071
 me-0.071
 colored-0.067
Tokenometer
Token ID15635
Feature activation+6.76e+00

Pos logit contributions
vere+0.140
uts+0.132
whe+0.130
wheel+0.119
othe+0.118
Neg logit contributions
el-0.368
en-0.349
ere-0.347
al-0.342
ened-0.338
Token in
Token ID287
Feature activation+1.24e-01

Pos logit contributions
uts+0.193
vere+0.188
othe+0.181
ding+0.180
ometer+0.166
Neg logit contributions
en-0.108
 me-0.102
ere-0.097
ened-0.096
 colored-0.089
Token your
Token ID534
Feature activation+2.67e+00

Pos logit contributions
uts+0.004
othe+0.004
vere+0.003
ding+0.003
 numb+0.003
Neg logit contributions
ened-0.002
ere-0.002
en-0.002
efully-0.001
eled-0.001
Token mouth
Token ID5422
Feature activation+5.69e+00

Pos logit contributions
uts+0.079
vere+0.071
ding+0.070
othe+0.070
ache+0.067
Neg logit contributions
ere-0.044
en-0.043
ened-0.042
 me-0.039
 colored-0.037
Token.
Token ID13
Feature activation+4.17e+00

Pos logit contributions
uts+0.153
vere+0.146
othe+0.142
ding+0.138
ometer+0.132
Neg logit contributions
en-0.100
ere-0.096
 me-0.095
ened-0.093
el-0.086
Token should
Token ID815
Feature activation+9.12e-01

Pos logit contributions
uts+0.082
vere+0.075
ding+0.074
othe+0.073
ache+0.069
Neg logit contributions
ere-0.049
en-0.049
ened-0.048
 me-0.044
 colored-0.043
Token never
Token ID1239
Feature activation+6.81e-01

Pos logit contributions
uts+0.022
vere+0.020
ding+0.020
othe+0.019
ache+0.018
Neg logit contributions
ere-0.019
en-0.019
ened-0.019
 me-0.018
 colored-0.017
Token touch
Token ID3638
Feature activation-7.96e-01

Pos logit contributions
uts+0.019
vere+0.017
ding+0.017
othe+0.017
ometer+0.016
Neg logit contributions
ere-0.012
en-0.012
ened-0.011
 me-0.011
 colored-0.010
Token an
Token ID281
Feature activation+5.05e+00

Pos logit contributions
ere+0.013
en+0.013
ened+0.013
 me+0.012
 colored+0.011
Neg logit contributions
uts-0.023
vere-0.020
othe-0.020
ding-0.020
ache-0.019
Token ash
Token ID12530
Feature activation+8.77e+00

Pos logit contributions
uts+0.119
ache+0.107
othe+0.106
ding+0.105
ometer+0.104
Neg logit contributions
ere-0.104
ened-0.101
en-0.097
efully-0.091
een-0.088
Tokentr
Token ID2213
Feature activation+1.21e+01

Pos logit contributions
uts+0.123
vere+0.116
othe+0.108
ding+0.089
vern+0.082
Neg logit contributions
en-0.278
 me-0.275
al-0.265
el-0.260
ened-0.258
Tokenay
Token ID323
Feature activation+3.93e+00

Pos logit contributions
uts+0.371
vere+0.338
othe+0.336
ages+0.327
ache+0.322
Neg logit contributions
el-0.182
en-0.170
ened-0.163
ere-0.155
 me-0.152
Token."
Token ID526
Feature activation+2.41e+00

Pos logit contributions
uts+0.106
vere+0.099
othe+0.096
 numb+0.090
ding+0.089
Neg logit contributions
en-0.075
ened-0.070
 me-0.069
ere-0.067
el-0.066
Token\n
Token ID198
Feature activation+4.71e+00

Pos logit contributions
uts+0.061
ding+0.056
vere+0.056
ache+0.055
othe+0.054
Neg logit contributions
ere-0.046
en-0.044
 me-0.044
ened-0.043
el-0.040
Token\n
Token ID198
Feature activation+4.11e+00

Pos logit contributions
uts+0.128
ding+0.120
vere+0.118
ache+0.117
othe+0.113
Neg logit contributions
ere-0.081
 me-0.079
en-0.074
el-0.068
ened-0.066
TokenS
Token ID50
Feature activation-1.52e+00

Pos logit contributions
uts+0.100
vere+0.091
ding+0.090
othe+0.090
ache+0.086
Neg logit contributions
ere-0.087
en-0.081
 me-0.081
ened-0.075
el-0.074
Token friends
Token ID2460
Feature activation+1.78e+00

Pos logit contributions
en+0.007
ere+0.007
ened+0.007
 me+0.006
 colored+0.006
Neg logit contributions
uts-0.019
vere-0.018
ding-0.018
othe-0.018
ometer-0.017
Token and
Token ID290
Feature activation+4.76e+00

Pos logit contributions
uts+0.060
othe+0.056
vere+0.056
ding+0.056
ometer+0.054
Neg logit contributions
en-0.021
ened-0.019
ere-0.019
 me-0.018
el-0.016
Token always
Token ID1464
Feature activation+6.33e-01

Pos logit contributions
uts+0.153
vere+0.145
ding+0.143
othe+0.135
ache+0.134
Neg logit contributions
ened-0.062
en-0.060
ere-0.059
 colored-0.057
 me-0.055
Token shared
Token ID4888
Feature activation-2.13e+00

Pos logit contributions
uts+0.017
othe+0.016
vere+0.015
ding+0.015
ache+0.015
Neg logit contributions
ere-0.011
en-0.011
ened-0.010
el-0.010
 me-0.009
Token their
Token ID511
Feature activation+5.96e+00

Pos logit contributions
en+0.031
ere+0.030
ened+0.029
 me+0.029
 colored+0.027
Neg logit contributions
uts-0.066
vere-0.062
ding-0.061
othe-0.059
ometer-0.058
Token jelly
Token ID27644
Feature activation+1.21e+01

Pos logit contributions
uts+0.140
vere+0.122
ding+0.121
othe+0.121
ache+0.115
Neg logit contributions
ere-0.128
en-0.128
ened-0.126
 me-0.111
el-0.110
Token.
Token ID13
Feature activation+3.66e+00

Pos logit contributions
vere+0.354
ding+0.333
othe+0.320
 bleed+0.318
uts+0.313
Neg logit contributions
en-0.167
 me-0.166
ie-0.160
ere-0.154
 colored-0.148
Token<|endoftext|>
Token ID50256
Feature activation-8.48e+00

Pos logit contributions
uts+0.100
vere+0.097
ache+0.095
ding+0.094
othe+0.089
Neg logit contributions
ere-0.062
en-0.060
 me-0.058
ened-0.056
 colored-0.055
TokenL
Token ID43
Feature activation-7.83e+00

Pos logit contributions
ened+0.176
ere+0.176
en+0.175
 me+0.157
 colored+0.156
Neg logit contributions
uts-0.208
ding-0.184
vere-0.183
othe-0.180
ometer-0.169
Tokenily
Token ID813
Feature activation-9.09e+00

Pos logit contributions
ened+0.177
ere+0.154
en+0.149
ised+0.146
efully+0.145
Neg logit contributions
uts-0.178
t-0.153
ding-0.149
EEP-0.141
othe-0.141
Token and
Token ID290
Feature activation-4.65e+00

Pos logit contributions
ere+0.183
efully+0.178
ened+0.175
ered+0.173
els+0.169
Neg logit contributions
uts-0.203
ding-0.187
ache-0.178
ages-0.164
EEP-0.159
Token was
Token ID373
Feature activation+2.33e+00

Pos logit contributions
uts+0.065
ding+0.058
vere+0.057
othe+0.056
ache+0.056
Neg logit contributions
ere-0.042
en-0.042
ened-0.041
el-0.037
 me-0.036
Token able
Token ID1498
Feature activation+4.02e-01

Pos logit contributions
uts+0.055
ding+0.048
vere+0.048
othe+0.048
ometer+0.046
Neg logit contributions
ere-0.051
en-0.050
ened-0.050
 me-0.048
 colored-0.046
Token to
Token ID284
Feature activation+6.47e-01

Pos logit contributions
uts+0.012
ding+0.011
vere+0.011
othe+0.010
ometer+0.010
Neg logit contributions
ened-0.006
ere-0.006
 me-0.006
en-0.006
 colored-0.005
Token share
Token ID2648
Feature activation-1.86e+00

Pos logit contributions
uts+0.017
vere+0.015
othe+0.015
ding+0.015
ache+0.014
Neg logit contributions
ere-0.012
en-0.012
ened-0.012
 me-0.011
 colored-0.011
Token the
Token ID262
Feature activation+5.15e+00

Pos logit contributions
en+0.021
ere+0.021
ened+0.020
 me+0.019
 colored+0.017
Neg logit contributions
uts-0.064
vere-0.060
ding-0.059
othe-0.059
ometer-0.058
Token mel
Token ID7758
Feature activation+1.19e+01

Pos logit contributions
uts+0.122
vere+0.107
othe+0.106
ding+0.105
ache+0.100
Neg logit contributions
ere-0.114
en-0.111
ened-0.111
 me-0.104
 colored-0.102
Tokenon
Token ID261
Feature activation+8.90e+00

Pos logit contributions
vere+0.334
othe+0.334
 bleed+0.333
 numb+0.327
ometer+0.321
Neg logit contributions
ere-0.188
el-0.181
en-0.171
 colored-0.160
 me-0.156
Token with
Token ID351
Feature activation+5.52e-01

Pos logit contributions
vere+0.271
uts+0.255
othe+0.254
ding+0.254
ache+0.252
Neg logit contributions
en-0.126
 me-0.115
el-0.108
ere-0.107
ie-0.104
Token his
Token ID465
Feature activation+1.50e+00

Pos logit contributions
uts+0.018
ding+0.017
vere+0.017
ometer+0.016
ache+0.016
Neg logit contributions
en-0.007
ere-0.007
 me-0.007
ened-0.007
 colored-0.006
Token friends
Token ID2460
Feature activation+1.11e+00

Pos logit contributions
uts+0.050
ding+0.048
vere+0.046
othe+0.045
ache+0.044
Neg logit contributions
ere-0.018
en-0.017
ened-0.017
 me-0.017
el-0.015
Token and
Token ID290
Feature activation+3.28e+00

Pos logit contributions
uts+0.031
vere+0.029
othe+0.028
ding+0.028
ometer+0.027
Neg logit contributions
en-0.019
ere-0.018
ened-0.018
 me-0.017
 colored-0.016
Token.
Token ID13
Feature activation+4.00e+00

Pos logit contributions
uts+0.103
othe+0.097
vere+0.096
ding+0.092
 bleed+0.088
Neg logit contributions
en-0.078
ere-0.077
ened-0.075
 me-0.074
 colored-0.069
Token We
Token ID775
Feature activation+4.56e-01

Pos logit contributions
uts+0.107
ding+0.096
vere+0.095
othe+0.094
ache+0.094
Neg logit contributions
ere-0.073
en-0.072
 me-0.070
ened-0.068
el-0.063
Token will
Token ID481
Feature activation+8.89e-01

Pos logit contributions
uts+0.011
vere+0.010
ding+0.010
othe+0.010
ometer+0.009
Neg logit contributions
ere-0.009
ened-0.009
en-0.009
eled-0.008
 colored-0.008
Token share
Token ID2648
Feature activation-2.56e+00

Pos logit contributions
uts+0.022
vere+0.020
othe+0.020
ding+0.020
ache+0.019
Neg logit contributions
ere-0.018
ened-0.018
en-0.017
eled-0.015
el-0.015
Token the
Token ID262
Feature activation+5.23e+00

Pos logit contributions
en+0.037
ere+0.037
ened+0.036
 me+0.034
 colored+0.032
Neg logit contributions
uts-0.079
vere-0.072
othe-0.071
ding-0.071
ache-0.069
Token oct
Token ID19318
Feature activation+1.19e+01

Pos logit contributions
uts+0.146
vere+0.132
ding+0.132
othe+0.130
ache+0.125
Neg logit contributions
ere-0.090
en-0.088
ened-0.088
 me-0.076
el-0.075
Tokenopus
Token ID25790
Feature activation+7.84e+00

Pos logit contributions
 bleed+0.308
othe+0.306
 numb+0.303
vere+0.265
istered+0.258
Neg logit contributions
ere-0.220
 me-0.202
en-0.189
al-0.186
el-0.181
Token together
Token ID1978
Feature activation+1.59e+00

Pos logit contributions
uts+0.227
vere+0.226
othe+0.220
ding+0.215
ache+0.207
Neg logit contributions
en-0.119
ere-0.111
 me-0.111
ened-0.103
 colored-0.102
Token.
Token ID13
Feature activation+3.83e+00

Pos logit contributions
uts+0.041
ding+0.036
vere+0.036
othe+0.035
ache+0.034
Neg logit contributions
ere-0.032
ened-0.031
en-0.030
 me-0.027
 colored-0.027
Token Let
Token ID3914
Feature activation-3.90e+00

Pos logit contributions
uts+0.098
vere+0.086
ding+0.086
othe+0.085
ache+0.085
Neg logit contributions
en-0.075
ere-0.074
 me-0.073
ened-0.071
efully-0.067
Token's
Token ID338
Feature activation+1.03e-02

Pos logit contributions
ere+0.074
ened+0.073
en+0.071
 colored+0.065
eled+0.063
Neg logit contributions
uts-0.098
ding-0.090
vere-0.089
othe-0.087
ache-0.086
Token should
Token ID815
Feature activation+5.68e-01

Pos logit contributions
uts+0.058
vere+0.053
ding+0.052
othe+0.050
ache+0.048
Neg logit contributions
en-0.039
ere-0.039
ened-0.037
 colored-0.035
 me-0.035
Token have
Token ID423
Feature activation-4.58e-02

Pos logit contributions
uts+0.013
vere+0.011
othe+0.011
ding+0.011
ache+0.010
Neg logit contributions
ere-0.013
ened-0.013
en-0.013
eled-0.011
 colored-0.011
Token shared
Token ID4888
Feature activation-1.84e+00

Pos logit contributions
ere+0.001
en+0.001
ened+0.001
el+0.001
 me+0.001
Neg logit contributions
uts-0.001
vere-0.001
othe-0.001
ding-0.001
ache-0.001
Token my
Token ID616
Feature activation+5.75e+00

Pos logit contributions
en+0.026
ere+0.026
ened+0.025
 me+0.025
 colored+0.022
Neg logit contributions
uts-0.057
vere-0.053
ding-0.052
othe-0.051
ache-0.050
Token l
Token ID300
Feature activation+1.05e+01

Pos logit contributions
uts+0.119
vere+0.105
ding+0.102
othe+0.101
ache+0.099
Neg logit contributions
en-0.139
ere-0.137
ened-0.137
el-0.121
efully-0.120
Tokenoll
Token ID692
Feature activation+1.19e+01

Pos logit contributions
uts+0.206
ding+0.197
othe+0.192
vere+0.189
 explode+0.183
Neg logit contributions
ere-0.255
en-0.241
 me-0.234
ened-0.231
al-0.230
Tokenipop
Token ID42800
Feature activation+1.03e+01

Pos logit contributions
uts+0.254
vere+0.242
ding+0.229
othe+0.223
 explode+0.216
Neg logit contributions
ere-0.273
ened-0.267
en-0.266
 me-0.253
el-0.252
Token with
Token ID351
Feature activation+1.67e+00

Pos logit contributions
vere+0.309
othe+0.301
ding+0.301
 bleed+0.290
 explode+0.290
Neg logit contributions
 me-0.130
en-0.115
al-0.103
 colored-0.100
ere-0.100
Token you
Token ID345
Feature activation+5.11e+00

Pos logit contributions
uts+0.044
vere+0.040
ding+0.039
othe+0.039
ache+0.037
Neg logit contributions
ere-0.032
en-0.031
ened-0.031
 me-0.028
 colored-0.027
Token.
Token ID13
Feature activation+4.57e+00

Pos logit contributions
uts+0.119
vere+0.104
ding+0.103
othe+0.103
ache+0.097
Neg logit contributions
ere-0.115
en-0.113
ened-0.112
 me-0.103
 colored-0.101
Token It
Token ID632
Feature activation+4.51e+00

Pos logit contributions
uts+0.105
vere+0.093
ding+0.092
othe+0.091
ache+0.087
Neg logit contributions
ere-0.102
en-0.101
ened-0.100
 me-0.094
 colored-0.090
Token were
Token ID547
Feature activation+1.27e+00

Pos logit contributions
uts+0.036
vere+0.032
ache+0.032
othe+0.032
ding+0.032
Neg logit contributions
ened-0.034
ere-0.033
en-0.033
eled-0.031
el-0.030
Token brave
Token ID14802
Feature activation-1.11e+00

Pos logit contributions
uts+0.027
vere+0.025
othe+0.024
ding+0.024
ache+0.022
Neg logit contributions
ened-0.030
en-0.029
ere-0.029
el-0.026
eled-0.026
Token and
Token ID290
Feature activation+5.50e+00

Pos logit contributions
ere+0.021
ened+0.019
en+0.019
ered+0.019
el+0.019
Neg logit contributions
uts-0.029
ache-0.025
vere-0.025
ding-0.024
othe-0.024
Token kept
Token ID4030
Feature activation-1.68e+00

Pos logit contributions
uts+0.132
vere+0.116
ding+0.115
othe+0.114
ache+0.109
Neg logit contributions
ere-0.118
en-0.116
ened-0.116
 me-0.108
 colored-0.105
Token the
Token ID262
Feature activation+4.76e+00

Pos logit contributions
ere+0.028
en+0.027
ened+0.027
 colored+0.023
 me+0.023
Neg logit contributions
uts-0.048
vere-0.044
othe-0.044
ding-0.043
ache-0.041
Token therm
Token ID21969
Feature activation+1.19e+01

Pos logit contributions
uts+0.116
vere+0.099
ding+0.097
othe+0.097
ache+0.089
Neg logit contributions
ere-0.107
en-0.105
ened-0.105
 me-0.102
 colored-0.097
Tokenometer
Token ID15635
Feature activation+7.98e+00

Pos logit contributions
uts+0.137
vere+0.121
 explode+0.112
 villains+0.105
whe+0.105
Neg logit contributions
el-0.364
ere-0.355
al-0.340
 me-0.335
ened-0.333
Token in
Token ID287
Feature activation+2.46e-01

Pos logit contributions
uts+0.216
vere+0.195
othe+0.193
ding+0.191
ache+0.181
Neg logit contributions
en-0.144
ere-0.139
ened-0.137
 me-0.136
el-0.124
Token your
Token ID534
Feature activation+2.18e+00

Pos logit contributions
uts+0.007
othe+0.007
vere+0.007
ding+0.007
 numb+0.006
Neg logit contributions
ened-0.004
ere-0.004
en-0.003
 colored-0.003
eled-0.003
Token mouth
Token ID5422
Feature activation+5.58e+00

Pos logit contributions
uts+0.063
vere+0.056
ding+0.056
othe+0.055
ache+0.053
Neg logit contributions
ere-0.038
en-0.037
ened-0.037
 me-0.035
el-0.032
Token!"
Token ID2474
Feature activation+1.65e+00

Pos logit contributions
uts+0.134
vere+0.120
othe+0.119
ding+0.118
ache+0.112
Neg logit contributions
ere-0.117
en-0.116
ened-0.114
 me-0.108
 colored-0.104
Token he
Token ID339
Feature activation-1.58e+00

Pos logit contributions
ened+0.012
 me+0.012
 colored+0.012
 the+0.012
ere+0.012
Neg logit contributions
uts-0.014
vere-0.013
ache-0.013
ometer-0.012
ding-0.012
Token decided
Token ID3066
Feature activation-1.77e+00

Pos logit contributions
ere+0.025
en+0.024
ened+0.024
el+0.021
een+0.019
Neg logit contributions
uts-0.046
ding-0.042
ache-0.041
vere-0.040
othe-0.040
Token to
Token ID284
Feature activation-4.13e+00

Pos logit contributions
ened+0.031
en+0.031
 me+0.030
ere+0.030
 colored+0.027
Neg logit contributions
uts-0.048
vere-0.045
othe-0.043
ding-0.043
ometer-0.041
Token remove
Token ID4781
Feature activation-4.76e+00

Pos logit contributions
ere+0.078
ened+0.077
en+0.076
 me+0.068
 colored+0.068
Neg logit contributions
uts-0.108
othe-0.097
vere-0.096
ding-0.096
ache-0.091
Token the
Token ID262
Feature activation+2.12e+00

Pos logit contributions
ere+0.054
en+0.049
ened+0.049
 me+0.040
 colored+0.039
Neg logit contributions
uts-0.160
othe-0.149
ding-0.147
vere-0.146
ometer-0.140
Token seaw
Token ID32085
Feature activation+1.19e+01

Pos logit contributions
uts+0.058
vere+0.052
ding+0.051
othe+0.051
ache+0.049
Neg logit contributions
ere-0.039
en-0.038
ened-0.038
 me-0.034
 colored-0.034
Tokeneed
Token ID2308
Feature activation+6.85e+00

Pos logit contributions
uts+0.215
othe+0.207
vere+0.197
 bleed+0.188
ding+0.178
Neg logit contributions
en-0.313
ere-0.306
 me-0.302
ened-0.288
al-0.281
Token from
Token ID422
Feature activation-2.84e+00

Pos logit contributions
uts+0.204
vere+0.195
othe+0.188
ding+0.187
ache+0.181
Neg logit contributions
en-0.106
ened-0.099
ere-0.093
 me-0.092
el-0.089
Token the
Token ID262
Feature activation+1.11e+00

Pos logit contributions
ere+0.038
en+0.035
ened+0.035
 me+0.029
efully+0.029
Neg logit contributions
uts-0.090
othe-0.084
ding-0.084
vere-0.082
ometer-0.079
Token water
Token ID1660
Feature activation+4.27e+00

Pos logit contributions
uts+0.033
othe+0.030
vere+0.029
ding+0.029
ache+0.028
Neg logit contributions
 me-0.018
en-0.018
ere-0.017
ened-0.017
 colored-0.016
Token so
Token ID523
Feature activation+4.02e+00

Pos logit contributions
uts+0.124
vere+0.115
othe+0.112
ding+0.111
ache+0.108
Neg logit contributions
en-0.070
ere-0.068
ened-0.067
 me-0.063
el-0.060
Token should
Token ID815
Feature activation+3.63e-01

Pos logit contributions
uts+0.064
vere+0.056
othe+0.056
ding+0.056
ache+0.055
Neg logit contributions
ere-0.049
en-0.048
ened-0.048
 me-0.043
el-0.042
Token not
Token ID407
Feature activation+1.26e+00

Pos logit contributions
uts+0.008
vere+0.007
othe+0.007
ding+0.007
ache+0.007
Neg logit contributions
ere-0.008
en-0.008
ened-0.008
 me-0.008
 colored-0.007
Token tease
Token ID34691
Feature activation-2.84e+00

Pos logit contributions
uts+0.036
vere+0.033
ding+0.032
othe+0.032
ometer+0.031
Neg logit contributions
en-0.022
ere-0.021
 me-0.020
ened-0.020
 colored-0.019
Token the
Token ID262
Feature activation+3.79e+00

Pos logit contributions
ere+0.044
en+0.044
ened+0.043
 me+0.040
el+0.037
Neg logit contributions
uts-0.085
vere-0.077
ding-0.077
othe-0.076
ache-0.073
Token ost
Token ID23619
Feature activation+6.79e+00

Pos logit contributions
uts+0.102
vere+0.087
othe+0.087
ding+0.086
ache+0.080
Neg logit contributions
ere-0.075
 me-0.073
en-0.073
ened-0.069
 colored-0.068
Tokenric
Token ID1173
Feature activation+1.19e+01

Pos logit contributions
uts+0.162
vere+0.151
othe+0.146
ding+0.143
ometer+0.134
Neg logit contributions
en-0.145
ere-0.144
ened-0.135
el-0.132
 me-0.129
Tokenhes
Token ID956
Feature activation+2.61e+00

Pos logit contributions
uts+0.204
vere+0.180
ding+0.177
othe+0.177
 numb+0.174
Neg logit contributions
ere-0.326
en-0.308
 me-0.302
ened-0.293
 colored-0.292
Token.
Token ID13
Feature activation+4.67e+00

Pos logit contributions
uts+0.074
vere+0.068
ding+0.067
othe+0.065
ache+0.063
Neg logit contributions
en-0.045
 me-0.043
ere-0.042
ened-0.042
el-0.039
Token They
Token ID1119
Feature activation+2.28e+00

Pos logit contributions
uts+0.118
ding+0.108
vere+0.107
othe+0.105
ache+0.102
Neg logit contributions
ere-0.092
en-0.091
ened-0.088
 me-0.084
el-0.083
Token are
Token ID389
Feature activation+3.69e+00

Pos logit contributions
uts+0.067
ding+0.061
vere+0.061
othe+0.060
ometer+0.058
Neg logit contributions
ere-0.036
en-0.036
ened-0.035
 me-0.033
 colored-0.031
Token not
Token ID407
Feature activation+4.30e+00

Pos logit contributions
uts+0.087
ding+0.082
vere+0.077
othe+0.071
ometer+0.071
Neg logit contributions
 me-0.082
en-0.079
ere-0.079
 colored-0.076
ened-0.076
Token."
Token ID526
Feature activation+1.35e+00

Pos logit contributions
en+0.000
ere+0.000
ened+0.000
 me+0.000
 colored+0.000
Neg logit contributions
uts-0.000
vere-0.000
ding-0.000
othe-0.000
ache-0.000
Token<|endoftext|>
Token ID50256
Feature activation-8.56e+00

Pos logit contributions
uts+0.039
vere+0.037
ding+0.036
ache+0.036
 numb+0.033
Neg logit contributions
 me-0.022
ere-0.022
en-0.020
ened-0.020
el-0.019
Token<|endoftext|>
Token ID50256
Feature activation-1.03e+01

Pos logit contributions
ened+0.185
en+0.182
ere+0.181
 colored+0.162
ered+0.159
Neg logit contributions
uts-0.205
ding-0.180
vere-0.178
othe-0.175
ometer-0.162
Token<|endoftext|>
Token ID50256
Feature activation-1.10e+01

Pos logit contributions
ened+0.230
en+0.222
ere+0.219
ered+0.199
 colored+0.198
Neg logit contributions
uts-0.238
ding-0.213
vere-0.206
othe-0.202
whe-0.187
Token<|endoftext|>
Token ID50256
Feature activation-1.13e+01

Pos logit contributions
ened+0.238
en+0.226
ere+0.221
ered+0.205
 colored+0.200
Neg logit contributions
uts-0.262
ding-0.240
vere-0.230
othe-0.225
whe-0.213
TokenAnna
Token ID31160
Feature activation-1.18e+01

Pos logit contributions
ened+0.238
en+0.224
ere+0.218
ered+0.204
 colored+0.197
Neg logit contributions
uts-0.275
ding-0.255
vere-0.241
othe-0.236
whe-0.227
Token and
Token ID290
Feature activation-8.26e+00

Pos logit contributions
ened+0.269
ere+0.262
en+0.247
ered+0.247
efully+0.243
Neg logit contributions
uts-0.252
ding-0.216
othe-0.211
ache-0.208
ometer-0.198
Token Ben
Token ID3932
Feature activation-1.06e+01

Pos logit contributions
en+0.119
ened+0.118
ere+0.114
 colored+0.094
efully+0.091
Neg logit contributions
uts-0.251
othe-0.241
ding-0.235
vere-0.224
 bleed-0.218
Token are
Token ID389
Feature activation-8.74e+00

Pos logit contributions
ened+0.216
ere+0.212
en+0.207
efully+0.187
ered+0.186
Neg logit contributions
uts-0.257
ding-0.230
othe-0.222
ache-0.211
vere-0.208
Token playing
Token ID2712
Feature activation-6.20e+00

Pos logit contributions
en+0.168
ened+0.168
ere+0.161
 colored+0.142
 me+0.139
Neg logit contributions
uts-0.234
vere-0.210
ding-0.209
othe-0.206
ometer-0.192
Token with
Token ID351
Feature activation-6.78e+00

Pos logit contributions
ened+0.120
en+0.115
ere+0.114
eled+0.100
ered+0.099
Neg logit contributions
uts-0.155
vere-0.143
othe-0.140
ding-0.139
balls-0.134
Token<|endoftext|>
Token ID50256
Feature activation-8.39e+00

Pos logit contributions
uts+0.076
vere+0.070
ache+0.069
ding+0.068
othe+0.067
Neg logit contributions
ere-0.053
en-0.051
ened-0.051
 me-0.049
 colored-0.045
TokenOnce
Token ID7454
Feature activation-7.29e+00

Pos logit contributions
ened+0.187
en+0.185
ere+0.185
 colored+0.167
 me+0.166
Neg logit contributions
uts-0.193
ding-0.171
vere-0.167
othe-0.162
ometer-0.154
Token upon
Token ID2402
Feature activation-7.19e+00

Pos logit contributions
en+0.106
ened+0.102
ere+0.102
 colored+0.091
efully+0.089
Neg logit contributions
uts-0.226
othe-0.211
vere-0.205
ding-0.204
ometer-0.197
Token a
Token ID257
Feature activation-5.64e+00

Pos logit contributions
ere+0.087
en+0.085
ened+0.084
 me+0.072
 colored+0.069
Neg logit contributions
uts-0.239
vere-0.219
ding-0.218
othe-0.217
ache-0.209
Token time
Token ID640
Feature activation-7.69e+00

Pos logit contributions
ere+0.071
en+0.071
ened+0.071
 me+0.059
 colored+0.058
Neg logit contributions
uts-0.182
vere-0.167
ding-0.167
othe-0.166
ache-0.160
Token on
Token ID319
Feature activation-1.18e+01

Pos logit contributions
en+0.120
ened+0.118
ere+0.116
eled+0.102
 colored+0.102
Neg logit contributions
uts-0.239
vere-0.215
othe-0.213
ding-0.212
t-0.201
Token a
Token ID257
Feature activation-3.25e+00

Pos logit contributions
en+0.231
eled+0.214
ened+0.212
efully+0.211
 angles+0.205
Neg logit contributions
uts-0.289
othe-0.284
ggy-0.259
vere-0.254
ding-0.253
Token bright
Token ID6016
Feature activation-4.39e+00

Pos logit contributions
ere+0.047
en+0.046
ened+0.044
 me+0.043
 colored+0.040
Neg logit contributions
uts-0.102
vere-0.092
othe-0.091
ding-0.089
ometer-0.087
Token sunny
Token ID27737
Feature activation-3.19e+00

Pos logit contributions
efully+0.077
ere+0.076
ened+0.075
ered+0.075
en+0.074
Neg logit contributions
ding-0.102
uts-0.102
othe-0.093
ometer-0.089
ache-0.088
Token day
Token ID1110
Feature activation-3.73e+00

Pos logit contributions
ened+0.074
ere+0.072
en+0.071
efully+0.069
ered+0.068
Neg logit contributions
uts-0.063
ding-0.062
othe-0.056
tr-0.052
vere-0.051
Token there
Token ID612
Feature activation-5.81e+00

Pos logit contributions
ened+0.051
ere+0.051
en+0.051
efully+0.045
eled+0.042
Neg logit contributions
uts-0.116
ding-0.109
othe-0.106
vere-0.103
ometer-0.098
Token more
Token ID517
Feature activation+5.01e+00

Pos logit contributions
 me+0.013
en+0.012
ere+0.012
ened+0.012
 colored+0.011
Neg logit contributions
uts-0.021
vere-0.020
ding-0.020
othe-0.019
ometer-0.019
Token squash
Token ID34613
Feature activation+6.19e+00

Pos logit contributions
uts+0.126
vere+0.112
othe+0.111
ding+0.110
ache+0.105
Neg logit contributions
ere-0.102
en-0.101
ened-0.100
 me-0.091
el-0.089
Token."
Token ID526
Feature activation+2.22e+00

Pos logit contributions
uts+0.163
vere+0.157
othe+0.154
ding+0.153
ometer+0.147
Neg logit contributions
en-0.110
ere-0.110
 me-0.106
ened-0.102
 colored-0.097
Token<|endoftext|>
Token ID50256
Feature activation-8.57e+00

Pos logit contributions
uts+0.064
ding+0.061
ache+0.061
vere+0.060
othe+0.057
Neg logit contributions
ere-0.034
 me-0.034
en-0.031
ened-0.030
 two-0.029
Token<|endoftext|>
Token ID50256
Feature activation-1.07e+01

Pos logit contributions
ened+0.142
en+0.137
ere+0.134
ered+0.119
 colored+0.118
Neg logit contributions
uts-0.251
vere-0.225
ding-0.224
othe-0.217
ometer-0.207
Token<|endoftext|>
Token ID50256
Feature activation-1.16e+01

Pos logit contributions
ened+0.197
en+0.185
ere+0.179
ered+0.168
eled+0.161
Neg logit contributions
uts-0.292
ding-0.265
vere-0.261
othe-0.252
whe-0.243
TokenL
Token ID43
Feature activation-1.04e+01

Pos logit contributions
ened+0.226
en+0.207
ere+0.200
ered+0.191
eled+0.182
Neg logit contributions
uts-0.305
ding-0.281
vere-0.272
othe-0.263
whe-0.257
Tokenily
Token ID813
Feature activation-1.14e+01

Pos logit contributions
ened+0.217
en+0.183
ere+0.175
efully+0.172
ised+0.168
Neg logit contributions
uts-0.269
ding-0.216
othe-0.216
t-0.213
EEP-0.213
Token and
Token ID290
Feature activation-8.06e+00

Pos logit contributions
ened+0.277
ered+0.271
ere+0.269
efully+0.258
en+0.253
Neg logit contributions
uts-0.206
ding-0.173
ache-0.170
othe-0.168
ages-0.159
Token Sam
Token ID3409
Feature activation-1.08e+01

Pos logit contributions
en+0.127
ened+0.122
ere+0.121
een+0.098
 colored+0.098
Neg logit contributions
uts-0.235
othe-0.227
ding-0.213
vere-0.209
 bleed-0.209
Token went
Token ID1816
Feature activation-9.89e+00

Pos logit contributions
ere+0.217
ened+0.216
en+0.206
efully+0.196
ered+0.192
Neg logit contributions
uts-0.234
ding-0.216
othe-0.205
ache-0.200
 bleed-0.188
TokenL
Token ID43
Feature activation-1.04e+01

Pos logit contributions
ened+0.226
en+0.207
ere+0.200
ered+0.191
eled+0.182
Neg logit contributions
uts-0.305
ding-0.281
vere-0.272
othe-0.263
whe-0.257
Tokenily
Token ID813
Feature activation-1.14e+01

Pos logit contributions
ened+0.217
en+0.183
ere+0.175
efully+0.172
ised+0.168
Neg logit contributions
uts-0.269
ding-0.216
othe-0.216
t-0.213
EEP-0.213
Token and
Token ID290
Feature activation-8.06e+00

Pos logit contributions
ened+0.277
ered+0.271
ere+0.269
efully+0.258
en+0.253
Neg logit contributions
uts-0.206
ding-0.173
ache-0.170
othe-0.168
ages-0.159
Token Sam
Token ID3409
Feature activation-1.08e+01

Pos logit contributions
en+0.127
ened+0.122
ere+0.121
een+0.098
 colored+0.098
Neg logit contributions
uts-0.235
othe-0.227
ding-0.213
vere-0.209
 bleed-0.209
Token went
Token ID1816
Feature activation-9.89e+00

Pos logit contributions
ere+0.217
ened+0.216
en+0.206
efully+0.196
ered+0.192
Neg logit contributions
uts-0.234
ding-0.216
othe-0.205
ache-0.200
 bleed-0.188
Token on
Token ID319
Feature activation-1.16e+01

Pos logit contributions
ere+0.203
en+0.200
ened+0.199
 me+0.183
 colored+0.179
Neg logit contributions
uts-0.245
vere-0.218
ding-0.215
othe-0.215
ache-0.205
Token a
Token ID257
Feature activation-3.53e+00

Pos logit contributions
ened+0.225
en+0.221
ere+0.215
eled+0.194
al+0.193
Neg logit contributions
uts-0.295
ding-0.269
othe-0.266
vere-0.257
 much-0.244
Token saf
Token ID1932
Feature activation+2.28e-01

Pos logit contributions
ere+0.075
ened+0.075
en+0.075
 colored+0.065
el+0.065
Neg logit contributions
uts-0.083
ding-0.076
vere-0.074
othe-0.074
ache-0.070
Tokenari
Token ID2743
Feature activation-5.19e+00

Pos logit contributions
 bleed+0.007
uts+0.007
othe+0.006
ache+0.006
paste+0.006
Neg logit contributions
ere-0.004
 me-0.003
el-0.003
en-0.003
 colored-0.003
Token with
Token ID351
Feature activation-6.59e+00

Pos logit contributions
en+0.094
ere+0.093
ened+0.088
 colored+0.085
 me+0.084
Neg logit contributions
uts-0.142
vere-0.130
othe-0.127
ache-0.126
ding-0.125
Token their
Token ID511
Feature activation-4.19e+00

Pos logit contributions
en+0.144
ened+0.137
ere+0.135
al+0.120
 colored+0.115
Neg logit contributions
uts-0.159
othe-0.145
vere-0.140
ding-0.132
 explode-0.129
Token that
Token ID326
Feature activation+3.89e+00

Pos logit contributions
uts+0.177
vere+0.169
ometer+0.168
othe+0.167
ding+0.159
Neg logit contributions
en-0.108
 me-0.106
ened-0.100
ere-0.097
 colored-0.095
Token much
Token ID881
Feature activation+3.75e+00

Pos logit contributions
uts+0.116
vere+0.110
ding+0.104
ache+0.102
ometer+0.102
Neg logit contributions
ened-0.059
 me-0.059
ere-0.058
en-0.057
 colored-0.055
Token more
Token ID517
Feature activation+5.82e+00

Pos logit contributions
uts+0.098
vere+0.091
ding+0.089
othe+0.087
ometer+0.084
Neg logit contributions
en-0.072
ened-0.071
ere-0.070
 me-0.068
 colored-0.065
Token.
Token ID13
Feature activation+2.63e+00

Pos logit contributions
uts+0.149
ding+0.140
vere+0.139
othe+0.134
ometer+0.133
Neg logit contributions
en-0.112
ere-0.110
ened-0.110
 me-0.106
 colored-0.104
Token<|endoftext|>
Token ID50256
Feature activation-8.13e+00

Pos logit contributions
uts+0.070
vere+0.064
ding+0.062
othe+0.062
ache+0.060
Neg logit contributions
ere-0.049
en-0.048
ened-0.047
 me-0.045
el-0.042
TokenAmy
Token ID40797
Feature activation-1.15e+01

Pos logit contributions
ened+0.189
en+0.187
ere+0.186
 colored+0.169
 me+0.168
Neg logit contributions
uts-0.179
ding-0.157
vere-0.154
othe-0.152
ometer-0.141
Token woke
Token ID19092
Feature activation-9.01e+00

Pos logit contributions
ere+0.229
ened+0.227
efully+0.223
ptions+0.216
en+0.216
Neg logit contributions
uts-0.240
ding-0.226
ache-0.222
othe-0.207
EEP-0.195
Token up
Token ID510
Feature activation-7.67e+00

Pos logit contributions
ere+0.101
en+0.097
ened+0.097
 colored+0.078
 me+0.078
Neg logit contributions
uts-0.309
othe-0.282
vere-0.281
ding-0.281
ache-0.270
Token to
Token ID284
Feature activation-6.27e+00

Pos logit contributions
ere+0.134
en+0.132
ened+0.130
 me+0.118
 colored+0.115
Neg logit contributions
uts-0.214
vere-0.192
ding-0.191
othe-0.191
ache-0.182
Token a
Token ID257
Feature activation-1.27e+00

Pos logit contributions
ere+0.100
ened+0.098
eled+0.088
en+0.087
efully+0.085
Neg logit contributions
uts-0.162
othe-0.159
vere-0.150
ding-0.149
 explode-0.148
Token sunny
Token ID27737
Feature activation-1.90e+00

Pos logit contributions
 me+0.021
ere+0.020
en+0.020
ened+0.018
 colored+0.018
Neg logit contributions
uts-0.039
othe-0.034
vere-0.034
ding-0.033
ages-0.032
Token."
Token ID526
Feature activation+2.22e+00

Pos logit contributions
uts+0.163
vere+0.157
othe+0.154
ding+0.153
ometer+0.147
Neg logit contributions
en-0.110
ere-0.110
 me-0.106
ened-0.102
 colored-0.097
Token<|endoftext|>
Token ID50256
Feature activation-8.57e+00

Pos logit contributions
uts+0.064
ding+0.061
ache+0.061
vere+0.060
othe+0.057
Neg logit contributions
ere-0.034
 me-0.034
en-0.031
ened-0.030
 two-0.029
Token<|endoftext|>
Token ID50256
Feature activation-1.07e+01

Pos logit contributions
ened+0.142
en+0.137
ere+0.134
ered+0.119
 colored+0.118
Neg logit contributions
uts-0.251
vere-0.225
ding-0.224
othe-0.217
ometer-0.207
Token<|endoftext|>
Token ID50256
Feature activation-1.16e+01

Pos logit contributions
ened+0.197
en+0.185
ere+0.179
ered+0.168
eled+0.161
Neg logit contributions
uts-0.292
ding-0.265
vere-0.261
othe-0.252
whe-0.243
TokenL
Token ID43
Feature activation-1.04e+01

Pos logit contributions
ened+0.226
en+0.207
ere+0.200
ered+0.191
eled+0.182
Neg logit contributions
uts-0.305
ding-0.281
vere-0.272
othe-0.263
whe-0.257
Tokenily
Token ID813
Feature activation-1.14e+01

Pos logit contributions
ened+0.217
en+0.183
ere+0.175
efully+0.172
ised+0.168
Neg logit contributions
uts-0.269
ding-0.216
othe-0.216
t-0.213
EEP-0.213
Token and
Token ID290
Feature activation-8.06e+00

Pos logit contributions
ened+0.277
ered+0.271
ere+0.269
efully+0.258
en+0.253
Neg logit contributions
uts-0.206
ding-0.173
ache-0.170
othe-0.168
ages-0.159
Token Sam
Token ID3409
Feature activation-1.08e+01

Pos logit contributions
en+0.127
ened+0.122
ere+0.121
een+0.098
 colored+0.098
Neg logit contributions
uts-0.235
othe-0.227
ding-0.213
vere-0.209
 bleed-0.209
Token went
Token ID1816
Feature activation-9.89e+00

Pos logit contributions
ere+0.217
ened+0.216
en+0.206
efully+0.196
ered+0.192
Neg logit contributions
uts-0.234
ding-0.216
othe-0.205
ache-0.200
 bleed-0.188
Token on
Token ID319
Feature activation-1.16e+01

Pos logit contributions
ere+0.203
en+0.200
ened+0.199
 me+0.183
 colored+0.179
Neg logit contributions
uts-0.245
vere-0.218
ding-0.215
othe-0.215
ache-0.205
Token a
Token ID257
Feature activation-3.53e+00

Pos logit contributions
ened+0.225
en+0.221
ere+0.215
eled+0.194
al+0.193
Neg logit contributions
uts-0.295
ding-0.269
othe-0.266
vere-0.257
 much-0.244
Token love
Token ID1842
Feature activation-6.36e-03

Pos logit contributions
uts+0.035
ding+0.032
vere+0.032
othe+0.031
ometer+0.030
Neg logit contributions
ere-0.019
en-0.019
ened-0.018
 me-0.018
 colored-0.017
Token."
Token ID526
Feature activation+1.35e+00

Pos logit contributions
en+0.000
ere+0.000
ened+0.000
 me+0.000
 colored+0.000
Neg logit contributions
uts-0.000
vere-0.000
ding-0.000
othe-0.000
ache-0.000
Token<|endoftext|>
Token ID50256
Feature activation-8.56e+00

Pos logit contributions
uts+0.039
vere+0.037
ding+0.036
ache+0.036
 numb+0.033
Neg logit contributions
 me-0.022
ere-0.022
en-0.020
ened-0.020
el-0.019
Token<|endoftext|>
Token ID50256
Feature activation-1.03e+01

Pos logit contributions
ened+0.185
en+0.182
ere+0.181
 colored+0.162
ered+0.159
Neg logit contributions
uts-0.205
ding-0.180
vere-0.178
othe-0.175
ometer-0.162
Token<|endoftext|>
Token ID50256
Feature activation-1.10e+01

Pos logit contributions
ened+0.230
en+0.222
ere+0.219
ered+0.199
 colored+0.198
Neg logit contributions
uts-0.238
ding-0.213
vere-0.206
othe-0.202
whe-0.187
Token<|endoftext|>
Token ID50256
Feature activation-1.13e+01

Pos logit contributions
ened+0.238
en+0.226
ere+0.221
ered+0.205
 colored+0.200
Neg logit contributions
uts-0.262
ding-0.240
vere-0.230
othe-0.225
whe-0.213
TokenAnna
Token ID31160
Feature activation-1.18e+01

Pos logit contributions
ened+0.238
en+0.224
ere+0.218
ered+0.204
 colored+0.197
Neg logit contributions
uts-0.275
ding-0.255
vere-0.241
othe-0.236
whe-0.227
Token and
Token ID290
Feature activation-8.26e+00

Pos logit contributions
ened+0.269
ere+0.262
en+0.247
ered+0.247
efully+0.243
Neg logit contributions
uts-0.252
ding-0.216
othe-0.211
ache-0.208
ometer-0.198
Token Ben
Token ID3932
Feature activation-1.06e+01

Pos logit contributions
en+0.119
ened+0.118
ere+0.114
 colored+0.094
efully+0.091
Neg logit contributions
uts-0.251
othe-0.241
ding-0.235
vere-0.224
 bleed-0.218
Token are
Token ID389
Feature activation-8.74e+00

Pos logit contributions
ened+0.216
ere+0.212
en+0.207
efully+0.187
ered+0.186
Neg logit contributions
uts-0.257
ding-0.230
othe-0.222
ache-0.211
vere-0.208
Token playing
Token ID2712
Feature activation-6.20e+00

Pos logit contributions
en+0.168
ened+0.168
ere+0.161
 colored+0.142
 me+0.139
Neg logit contributions
uts-0.234
vere-0.210
ding-0.209
othe-0.206
ometer-0.192
Token,
Token ID11
Feature activation+5.36e+00

Pos logit contributions
uts+0.002
vere+0.001
ding+0.001
othe+0.001
ache+0.001
Neg logit contributions
ere-0.001
en-0.001
ened-0.001
 me-0.001
 colored-0.001
Token pretty
Token ID2495
Feature activation-4.26e-01

Pos logit contributions
uts+0.157
vere+0.145
ding+0.143
othe+0.141
ometer+0.138
Neg logit contributions
 me-0.085
ened-0.082
en-0.082
ere-0.081
 colored-0.076
Token chamber
Token ID11847
Feature activation+3.00e+00

Pos logit contributions
ere+0.008
en+0.007
ened+0.007
efully+0.007
 me+0.007
Neg logit contributions
uts-0.011
ding-0.010
vere-0.010
othe-0.010
ache-0.010
Token.
Token ID13
Feature activation+2.34e+00

Pos logit contributions
uts+0.078
vere+0.071
othe+0.071
ding+0.070
ache+0.066
Neg logit contributions
ere-0.056
en-0.056
ened-0.055
 me-0.052
 colored-0.050
Token<|endoftext|>
Token ID50256
Feature activation-8.59e+00

Pos logit contributions
uts+0.062
vere+0.056
ding+0.055
othe+0.055
ache+0.054
Neg logit contributions
ere-0.044
en-0.042
ened-0.041
 me-0.041
el-0.037
TokenAmy
Token ID40797
Feature activation-1.13e+01

Pos logit contributions
ened+0.194
en+0.189
ere+0.186
 colored+0.171
ered+0.168
Neg logit contributions
uts-0.196
ding-0.174
vere-0.168
othe-0.165
ometer-0.155
Token was
Token ID373
Feature activation-7.78e+00

Pos logit contributions
ere+0.225
efully+0.224
ened+0.224
ptions+0.221
en+0.215
Neg logit contributions
uts-0.230
ache-0.211
ding-0.210
tr-0.188
ages-0.188
Token walking
Token ID6155
Feature activation-7.35e+00

Pos logit contributions
en+0.156
ened+0.154
ere+0.146
al+0.129
een+0.128
Neg logit contributions
uts-0.191
vere-0.176
ding-0.176
othe-0.173
 bleed-0.159
Token through
Token ID832
Feature activation-1.00e+01

Pos logit contributions
ere+0.151
ened+0.143
en+0.143
 colored+0.128
ered+0.127
Neg logit contributions
uts-0.172
ding-0.163
othe-0.154
ometer-0.149
vere-0.149
Token the
Token ID262
Feature activation-2.77e+00

Pos logit contributions
ened+0.192
ere+0.189
eled+0.180
en+0.180
ished+0.175
Neg logit contributions
uts-0.211
 much-0.201
othe-0.186
vere-0.184
whe-0.181
Token woods
Token ID16479
Feature activation-1.49e+00

Pos logit contributions
ened+0.049
en+0.044
ere+0.043
ered+0.043
eled+0.039
Neg logit contributions
uts-0.073
ding-0.070
vere-0.065
tr-0.064
ometer-0.063
Token!
Token ID0
Feature activation+4.09e+00

Pos logit contributions
ere+0.014
en+0.014
ened+0.014
 me+0.012
 colored+0.012
Neg logit contributions
uts-0.033
vere-0.030
ding-0.030
othe-0.030
ache-0.029
Token<|endoftext|>
Token ID50256
Feature activation-8.30e+00

Pos logit contributions
uts+0.124
ache+0.118
ding+0.118
 numb+0.115
vere+0.115
Neg logit contributions
ere-0.057
 me-0.054
 Every-0.051
 Everyday-0.051
 two-0.048
TokenOne
Token ID3198
Feature activation-4.69e+00

Pos logit contributions
ened+0.151
ere+0.149
en+0.149
 colored+0.133
ered+0.129
Neg logit contributions
uts-0.223
ding-0.203
vere-0.197
othe-0.193
ometer-0.186
Token day
Token ID1110
Feature activation-5.84e+00

Pos logit contributions
en+0.103
ere+0.103
ened+0.102
al+0.090
efully+0.090
Neg logit contributions
uts-0.106
ding-0.096
othe-0.095
vere-0.091
ometer-0.087
Token,
Token ID11
Feature activation-5.58e+00

Pos logit contributions
en+0.091
ere+0.088
ened+0.084
efully+0.075
 colored+0.073
Neg logit contributions
uts-0.179
ding-0.163
othe-0.160
vere-0.158
t-0.152
Token on
Token ID319
Feature activation-1.13e+01

Pos logit contributions
en+0.097
ere+0.094
ened+0.091
el+0.080
els+0.080
Neg logit contributions
uts-0.160
ding-0.147
othe-0.143
vere-0.139
t-0.134
Token a
Token ID257
Feature activation-1.95e+00

Pos logit contributions
en+0.216
ened+0.198
ere+0.198
eled+0.192
efully+0.191
Neg logit contributions
uts-0.287
othe-0.276
ding-0.254
vere-0.249
balls-0.242
Token hot
Token ID3024
Feature activation-1.08e+00

Pos logit contributions
 me+0.038
ere+0.037
en+0.037
ened+0.035
 colored+0.034
Neg logit contributions
uts-0.053
vere-0.046
othe-0.046
ometer-0.044
ache-0.043
Token summer
Token ID3931
Feature activation-1.22e+00

Pos logit contributions
ere+0.021
ened+0.021
en+0.021
efully+0.018
ered+0.018
Neg logit contributions
uts-0.027
ding-0.025
othe-0.024
vere-0.024
ache-0.023
Token's
Token ID338
Feature activation-1.11e+00

Pos logit contributions
ere+0.022
en+0.022
ened+0.022
 me+0.020
 colored+0.019
Neg logit contributions
uts-0.033
vere-0.030
ding-0.029
othe-0.029
ache-0.028
Token day
Token ID1110
Feature activation-2.20e+00

Pos logit contributions
ere+0.019
en+0.019
ened+0.018
 me+0.018
el+0.017
Neg logit contributions
uts-0.032
vere-0.029
othe-0.028
ding-0.028
ache-0.027
Token.
Token ID13
Feature activation+3.01e+00

Pos logit contributions
en+0.051
ere+0.050
ened+0.049
 me+0.046
 colored+0.042
Neg logit contributions
uts-0.103
ding-0.093
vere-0.093
othe-0.091
ache-0.090
Token<|endoftext|>
Token ID50256
Feature activation-8.30e+00

Pos logit contributions
uts+0.084
vere+0.079
ache+0.078
ding+0.075
othe+0.074
Neg logit contributions
ere-0.053
en-0.050
 me-0.048
ened-0.048
el-0.044
TokenOne
Token ID3198
Feature activation-4.79e+00

Pos logit contributions
ened+0.190
en+0.186
ere+0.186
 colored+0.169
 me+0.166
Neg logit contributions
uts-0.187
ding-0.165
vere-0.161
othe-0.157
ometer-0.148
Token day
Token ID1110
Feature activation-5.75e+00

Pos logit contributions
en+0.108
ened+0.108
ere+0.107
al+0.095
efully+0.094
Neg logit contributions
uts-0.105
ding-0.095
othe-0.095
vere-0.091
ache-0.086
Token,
Token ID11
Feature activation-6.20e+00

Pos logit contributions
en+0.096
ere+0.092
ened+0.090
efully+0.081
eled+0.079
Neg logit contributions
uts-0.170
othe-0.154
ding-0.154
vere-0.150
t-0.146
Token on
Token ID319
Feature activation-1.12e+01

Pos logit contributions
en+0.129
ere+0.121
ened+0.121
els+0.109
al+0.102
Neg logit contributions
uts-0.152
ding-0.144
othe-0.141
t-0.136
 much-0.131
Token a
Token ID257
Feature activation-2.08e+00

Pos logit contributions
en+0.229
ened+0.214
 angles+0.205
ere+0.204
eled+0.203
Neg logit contributions
uts-0.264
othe-0.262
ggy-0.240
 much-0.238
ves-0.233
Token cold
Token ID4692
Feature activation-1.59e+00

Pos logit contributions
 me+0.031
ere+0.031
en+0.030
ened+0.028
 colored+0.027
Neg logit contributions
uts-0.066
vere-0.059
othe-0.058
ometer-0.055
ache-0.055
Token winter
Token ID7374
Feature activation-8.22e-01

Pos logit contributions
ere+0.029
en+0.028
ened+0.028
 me+0.026
 colored+0.025
Neg logit contributions
uts-0.044
vere-0.039
othe-0.039
ding-0.039
ache-0.037
Token morning
Token ID3329
Feature activation-2.30e+00

Pos logit contributions
ere+0.014
en+0.014
ened+0.013
 me+0.013
 colored+0.012
Neg logit contributions
uts-0.024
vere-0.022
othe-0.021
ding-0.021
ache-0.020
Token,
Token ID11
Feature activation-3.23e+00

Pos logit contributions
ere+0.032
en+0.031
ened+0.031
 me+0.027
 colored+0.026
Neg logit contributions
uts-0.072
ding-0.066
othe-0.065
vere-0.065
ache-0.063
TokenOnce
Token ID7454
Feature activation-7.25e+00

Pos logit contributions
ened+0.207
ere+0.206
en+0.205
 me+0.186
 colored+0.186
Neg logit contributions
uts-0.184
ding-0.160
vere-0.159
othe-0.156
ometer-0.144
Token upon
Token ID2402
Feature activation-7.15e+00

Pos logit contributions
en+0.105
ened+0.102
ere+0.102
 colored+0.090
efully+0.089
Neg logit contributions
uts-0.224
othe-0.209
vere-0.203
ding-0.202
ometer-0.194
Token a
Token ID257
Feature activation-5.63e+00

Pos logit contributions
ere+0.085
en+0.083
ened+0.082
 me+0.070
 colored+0.067
Neg logit contributions
uts-0.239
vere-0.219
ding-0.218
othe-0.218
ache-0.210
Token time
Token ID640
Feature activation-7.56e+00

Pos logit contributions
ere+0.071
en+0.071
ened+0.071
 me+0.059
 colored+0.058
Neg logit contributions
uts-0.183
vere-0.167
ding-0.167
othe-0.166
ache-0.160
Token,
Token ID11
Feature activation-5.98e+00

Pos logit contributions
en+0.131
ened+0.128
ere+0.124
eled+0.113
efully+0.113
Neg logit contributions
uts-0.226
vere-0.201
othe-0.201
ding-0.198
t-0.191
Token on
Token ID319
Feature activation-1.12e+01

Pos logit contributions
en+0.130
ened+0.122
ere+0.120
al+0.102
efully+0.101
Neg logit contributions
othe-0.139
uts-0.138
t-0.131
ding-0.130
 much-0.130
Token a
Token ID257
Feature activation-2.93e+00

Pos logit contributions
en+0.236
eled+0.222
ened+0.222
 angles+0.211
efully+0.210
Neg logit contributions
uts-0.265
othe-0.248
ggy-0.234
balls-0.228
ding-0.226
Token clean
Token ID3424
Feature activation-4.17e+00

Pos logit contributions
ere+0.042
en+0.040
ened+0.039
 me+0.037
 colored+0.035
Neg logit contributions
uts-0.092
vere-0.084
othe-0.083
ding-0.082
ache-0.080
Token Tuesday
Token ID3431
Feature activation-3.92e+00

Pos logit contributions
ened+0.080
en+0.078
ere+0.076
efully+0.075
ered+0.074
Neg logit contributions
ding-0.096
uts-0.095
ometer-0.089
ache-0.086
othe-0.086
Token morning
Token ID3329
Feature activation-2.96e+00

Pos logit contributions
ere+0.051
ened+0.050
en+0.050
ered+0.041
efully+0.041
Neg logit contributions
uts-0.126
ding-0.117
othe-0.112
vere-0.112
ometer-0.109
Token,
Token ID11
Feature activation-3.45e+00

Pos logit contributions
ere+0.037
en+0.036
ened+0.035
 me+0.029
 colored+0.029
Neg logit contributions
uts-0.098
ding-0.090
othe-0.089
vere-0.089
ache-0.085
Token wearing
Token ID5762
Feature activation-4.64e+00

Pos logit contributions
uts+0.004
othe+0.004
vere+0.004
ding+0.004
ache+0.004
Neg logit contributions
ere-0.004
en-0.004
ened-0.004
 me-0.004
 colored-0.003
Token her
Token ID607
Feature activation+6.48e-01

Pos logit contributions
ere+0.070
en+0.068
ened+0.066
 me+0.058
el+0.057
Neg logit contributions
uts-0.140
vere-0.127
ding-0.126
othe-0.126
ache-0.121
Token cap
Token ID1451
Feature activation+4.14e+00

Pos logit contributions
uts+0.018
vere+0.016
othe+0.016
ding+0.016
ache+0.015
Neg logit contributions
ere-0.012
ened-0.011
en-0.011
 me-0.011
 colored-0.010
Token.
Token ID13
Feature activation+7.54e-01

Pos logit contributions
uts+0.110
vere+0.101
othe+0.100
ding+0.099
ache+0.094
Neg logit contributions
ere-0.075
en-0.074
ened-0.074
 me-0.068
 colored-0.066
Token<|endoftext|>
Token ID50256
Feature activation-9.20e+00

Pos logit contributions
uts+0.020
vere+0.018
ding+0.018
othe+0.018
ache+0.017
Neg logit contributions
ere-0.014
en-0.014
ened-0.014
 me-0.012
 colored-0.012
TokenJohn
Token ID7554
Feature activation-1.11e+01

Pos logit contributions
ened+0.220
en+0.215
ere+0.212
 colored+0.196
ered+0.194
Neg logit contributions
uts-0.199
ding-0.174
vere-0.168
othe-0.166
ometer-0.155
Token drove
Token ID10357
Feature activation-8.56e+00

Pos logit contributions
ere+0.209
ened+0.196
 forces+0.193
efully+0.192
eful+0.187
Neg logit contributions
ding-0.253
uts-0.252
ache-0.234
othe-0.216
ometer-0.213
Token his
Token ID465
Feature activation-5.65e+00

Pos logit contributions
ere+0.155
en+0.151
ened+0.150
 colored+0.133
el+0.131
Neg logit contributions
uts-0.233
ding-0.209
vere-0.206
othe-0.205
ometer-0.194
Token car
Token ID1097
Feature activation-1.75e+00

Pos logit contributions
ere+0.117
en+0.116
ened+0.115
 me+0.105
 colored+0.103
Neg logit contributions
uts-0.139
vere-0.123
ding-0.123
othe-0.122
ache-0.116
Token on
Token ID319
Feature activation-8.94e+00

Pos logit contributions
ere+0.029
en+0.029
ened+0.028
 me+0.025
 colored+0.025
Neg logit contributions
uts-0.050
ding-0.046
vere-0.045
othe-0.044
ache-0.043
Token the
Token ID262
Feature activation-2.93e+00

Pos logit contributions
ere+0.123
ened+0.123
en+0.118
eled+0.112
al+0.106
Neg logit contributions
uts-0.266
ding-0.246
othe-0.237
vere-0.235
 numb-0.226
Token.
Token ID13
Feature activation+2.84e+00

Pos logit contributions
ding+0.090
uts+0.090
vere+0.087
ache+0.085
 bleed+0.083
Neg logit contributions
 me-0.033
en-0.030
ere-0.029
 colored-0.027
ened-0.025
Token<|endoftext|>
Token ID50256
Feature activation-6.82e+00

Pos logit contributions
uts+0.084
ache+0.083
vere+0.081
ding+0.080
 numb+0.077
Neg logit contributions
ere-0.039
 me-0.037
 Every-0.036
 Everyday-0.034
en-0.034
Token<|endoftext|>
Token ID50256
Feature activation-9.08e+00

Pos logit contributions
ened+0.107
en+0.105
ere+0.105
 me+0.090
 colored+0.089
Neg logit contributions
uts-0.203
ding-0.183
vere-0.182
othe-0.178
ometer-0.170
Token<|endoftext|>
Token ID50256
Feature activation-1.02e+01

Pos logit contributions
ened+0.166
en+0.161
ere+0.159
ered+0.140
 colored+0.140
Neg logit contributions
uts-0.248
ding-0.223
vere-0.221
othe-0.215
ache-0.202
Token<|endoftext|>
Token ID50256
Feature activation-1.08e+01

Pos logit contributions
ened+0.198
en+0.189
ere+0.186
ered+0.168
 colored+0.165
Neg logit contributions
uts-0.267
ding-0.242
vere-0.237
othe-0.231
whe-0.219
Token<|endoftext|>
Token ID50256
Feature activation-1.11e+01

Pos logit contributions
ened+0.214
en+0.201
ere+0.197
ered+0.181
 colored+0.176
Neg logit contributions
uts-0.277
ding-0.253
vere-0.246
othe-0.239
whe-0.229
TokenL
Token ID43
Feature activation-1.07e+01

Pos logit contributions
ened+0.222
en+0.207
ere+0.203
ered+0.188
 colored+0.181
Neg logit contributions
uts-0.283
ding-0.259
vere-0.250
othe-0.243
whe-0.234
Tokenila
Token ID10102
Feature activation-1.01e+01

Pos logit contributions
ened+0.223
en+0.189
efully+0.180
ere+0.176
ised+0.173
Neg logit contributions
uts-0.269
ding-0.222
othe-0.218
EEP-0.214
t-0.214
Token and
Token ID290
Feature activation-8.47e+00

Pos logit contributions
ened+0.215
ere+0.213
en+0.204
efully+0.196
ered+0.192
Neg logit contributions
uts-0.228
ding-0.197
othe-0.194
ache-0.189
ages-0.178
Token Ben
Token ID3932
Feature activation-1.07e+01

Pos logit contributions
en+0.126
ened+0.124
ere+0.123
 colored+0.101
efully+0.098
Neg logit contributions
uts-0.258
othe-0.241
ding-0.237
vere-0.228
 explode-0.221
Token are
Token ID389
Feature activation-9.10e+00

Pos logit contributions
ened+0.217
ere+0.215
en+0.206
efully+0.190
ered+0.187
Neg logit contributions
uts-0.255
ding-0.226
othe-0.218
ache-0.209
vere-0.202
Token next
Token ID1306
Feature activation+2.45e+00

Pos logit contributions
uts+0.105
ding+0.094
othe+0.092
vere+0.092
ache+0.087
Neg logit contributions
ere-0.054
en-0.053
ened-0.052
 me-0.052
 colored-0.049
Token one
Token ID530
Feature activation+5.53e+00

Pos logit contributions
uts+0.067
vere+0.060
ding+0.060
othe+0.060
ache+0.057
Neg logit contributions
ere-0.044
en-0.043
ened-0.043
 me-0.039
 colored-0.038
Token.
Token ID13
Feature activation+3.09e+00

Pos logit contributions
uts+0.152
vere+0.137
ding+0.136
othe+0.132
ometer+0.124
Neg logit contributions
en-0.101
ere-0.099
ened-0.098
 me-0.097
 colored-0.090
Token<|endoftext|>
Token ID50256
Feature activation-8.37e+00

Pos logit contributions
uts+0.084
ding+0.077
vere+0.076
ache+0.075
othe+0.074
Neg logit contributions
ere-0.055
en-0.053
ened-0.052
 me-0.051
el-0.047
Token<|endoftext|>
Token ID50256
Feature activation-1.04e+01

Pos logit contributions
ened+0.175
en+0.172
ere+0.171
 colored+0.152
ered+0.151
Neg logit contributions
uts-0.207
ding-0.184
vere-0.181
othe-0.176
ometer-0.167
TokenTim
Token ID14967
Feature activation-1.11e+01

Pos logit contributions
ened+0.223
en+0.215
ere+0.209
ered+0.195
eled+0.189
Neg logit contributions
uts-0.250
ding-0.227
vere-0.218
othe-0.212
whe-0.200
Token and
Token ID290
Feature activation-5.57e+00

Pos logit contributions
ised+0.272
efully+0.265
 forces+0.263
ered+0.262
ened+0.262
Neg logit contributions
ding-0.158
ache-0.154
ongo-0.152
uts-0.150
tr-0.140
Token Jen
Token ID13364
Feature activation-9.35e+00

Pos logit contributions
ere+0.077
en+0.076
ened+0.074
 me+0.064
 colored+0.062
Neg logit contributions
uts-0.177
othe-0.161
vere-0.161
ding-0.159
ache-0.153
Token are
Token ID389
Feature activation-6.43e+00

Pos logit contributions
ere+0.191
ened+0.186
en+0.181
efully+0.168
 me+0.166
Neg logit contributions
uts-0.214
ding-0.201
othe-0.184
ache-0.184
 bleed-0.178
Token twins
Token ID20345
Feature activation-5.11e+00

Pos logit contributions
en+0.124
ened+0.123
ere+0.119
 colored+0.105
 me+0.103
Neg logit contributions
uts-0.173
vere-0.155
ding-0.152
othe-0.152
ometer-0.142
Token.
Token ID13
Feature activation-2.92e+00

Pos logit contributions
en+0.083
 me+0.081
ere+0.080
ened+0.076
el+0.073
Neg logit contributions
uts-0.150
vere-0.136
othe-0.134
ding-0.133
ometer-0.133
Token whole
Token ID2187
Feature activation+1.58e+00

Pos logit contributions
uts+0.067
othe+0.059
ding+0.058
vere+0.058
ometer+0.055
Neg logit contributions
ere-0.034
ened-0.033
en-0.033
 me-0.032
 colored-0.030
Token way
Token ID835
Feature activation+3.61e-01

Pos logit contributions
uts+0.044
vere+0.039
othe+0.039
ding+0.039
ache+0.038
Neg logit contributions
ere-0.028
en-0.027
ened-0.027
 me-0.024
 colored-0.024
Token home
Token ID1363
Feature activation+2.24e-01

Pos logit contributions
uts+0.011
vere+0.010
ding+0.010
ache+0.010
othe+0.009
Neg logit contributions
en-0.006
ened-0.006
ere-0.005
 me-0.005
el-0.005
Token.
Token ID13
Feature activation+2.16e+00

Pos logit contributions
uts+0.007
vere+0.006
ding+0.006
othe+0.006
ache+0.006
Neg logit contributions
en-0.003
ere-0.003
ened-0.003
 me-0.003
 colored-0.003
Token<|endoftext|>
Token ID50256
Feature activation-8.26e+00

Pos logit contributions
uts+0.051
othe+0.044
ding+0.043
vere+0.043
ometer+0.041
Neg logit contributions
ened-0.048
en-0.048
ere-0.048
 colored-0.044
el-0.042
TokenAmy
Token ID40797
Feature activation-1.11e+01

Pos logit contributions
ened+0.176
en+0.171
ere+0.168
 colored+0.152
eled+0.150
Neg logit contributions
uts-0.202
ding-0.179
vere-0.174
othe-0.169
ometer-0.161
Token was
Token ID373
Feature activation-7.18e+00

Pos logit contributions
ere+0.228
efully+0.225
ened+0.225
ptions+0.218
ised+0.218
Neg logit contributions
uts-0.221
ding-0.205
ache-0.201
othe-0.178
ages-0.178
Token a
Token ID257
Feature activation-1.77e+00

Pos logit contributions
en+0.135
ened+0.134
ere+0.130
 colored+0.113
el+0.112
Neg logit contributions
uts-0.188
vere-0.172
ding-0.171
othe-0.170
ometer-0.157
Token very
Token ID845
Feature activation+1.46e+00

Pos logit contributions
ere+0.028
en+0.026
ened+0.026
 me+0.025
 colored+0.023
Neg logit contributions
uts-0.054
vere-0.049
othe-0.048
ding-0.047
ache-0.045
Token brave
Token ID14802
Feature activation-4.06e+00

Pos logit contributions
uts+0.031
vere+0.027
othe+0.027
ding+0.027
ache+0.025
Neg logit contributions
en-0.035
ere-0.035
ened-0.035
 colored-0.031
 me-0.031
Token girl
Token ID2576
Feature activation-3.83e+00

Pos logit contributions
ere+0.069
ened+0.067
en+0.067
ered+0.059
 colored+0.059
Neg logit contributions
uts-0.111
ding-0.101
vere-0.100
othe-0.100
ache-0.099
TokenOnce
Token ID7454
Feature activation-7.26e+00

Pos logit contributions
ened+0.216
en+0.212
ere+0.206
 colored+0.191
ered+0.189
Neg logit contributions
uts-0.199
ding-0.174
othe-0.167
vere-0.167
ometer-0.156
Token upon
Token ID2402
Feature activation-8.27e+00

Pos logit contributions
en+0.107
ere+0.105
ened+0.105
 colored+0.094
efully+0.092
Neg logit contributions
uts-0.221
othe-0.205
ding-0.200
vere-0.200
ometer-0.192
Token a
Token ID257
Feature activation-6.38e+00

Pos logit contributions
ere+0.109
en+0.107
ened+0.106
 me+0.089
el+0.086
Neg logit contributions
uts-0.268
vere-0.243
ding-0.242
othe-0.240
ache-0.232
Token time
Token ID640
Feature activation-8.26e+00

Pos logit contributions
ened+0.082
en+0.081
ere+0.081
 colored+0.067
 me+0.067
Neg logit contributions
uts-0.205
vere-0.188
ding-0.188
othe-0.186
ache-0.180
Token,
Token ID11
Feature activation-7.03e+00

Pos logit contributions
ened+0.137
en+0.135
ere+0.130
eled+0.117
efully+0.115
Neg logit contributions
uts-0.255
vere-0.224
ding-0.223
othe-0.221
t-0.210
Token in
Token ID287
Feature activation-1.11e+01

Pos logit contributions
ened+0.130
en+0.127
ere+0.121
 types+0.104
efully+0.102
Neg logit contributions
uts-0.183
ding-0.179
othe-0.168
ometer-0.166
vere-0.165
Token a
Token ID257
Feature activation-4.41e+00

Pos logit contributions
ened+0.218
en+0.218
ere+0.203
eled+0.193
ered+0.191
Neg logit contributions
uts-0.267
othe-0.241
ding-0.238
vere-0.233
 much-0.227
Token small
Token ID1402
Feature activation-6.31e+00

Pos logit contributions
ened+0.103
en+0.097
ered+0.094
ere+0.092
els+0.084
Neg logit contributions
uts-0.093
ding-0.090
vere-0.083
ache-0.081
othe-0.079
Token village
Token ID7404
Feature activation-5.06e+00

Pos logit contributions
ered+0.126
ened+0.124
ere+0.113
en+0.112
efully+0.108
Neg logit contributions
uts-0.144
ding-0.137
ometer-0.133
ache-0.127
 numb-0.123
Token,
Token ID11
Feature activation-4.77e+00

Pos logit contributions
ened+0.081
en+0.079
ere+0.075
efully+0.071
eled+0.066
Neg logit contributions
uts-0.150
othe-0.136
vere-0.133
ding-0.132
ometer-0.126
Token there
Token ID612
Feature activation-6.19e+00

Pos logit contributions
ened+0.113
en+0.112
ere+0.103
ited+0.088
eled+0.088
Neg logit contributions
 much-0.099
uts-0.099
othe-0.098
vere-0.098
ometer-0.096
TokenOnce
Token ID7454
Feature activation-7.26e+00

Pos logit contributions
ened+0.216
en+0.212
ere+0.206
 colored+0.191
ered+0.189
Neg logit contributions
uts-0.199
ding-0.174
othe-0.167
vere-0.167
ometer-0.156
Token upon
Token ID2402
Feature activation-8.27e+00

Pos logit contributions
en+0.107
ere+0.105
ened+0.105
 colored+0.094
efully+0.092
Neg logit contributions
uts-0.221
othe-0.205
ding-0.200
vere-0.200
ometer-0.192
Token a
Token ID257
Feature activation-6.38e+00

Pos logit contributions
ere+0.109
en+0.107
ened+0.106
 me+0.089
el+0.086
Neg logit contributions
uts-0.268
vere-0.243
ding-0.242
othe-0.240
ache-0.232
Token time
Token ID640
Feature activation-8.26e+00

Pos logit contributions
ened+0.082
en+0.081
ere+0.081
 colored+0.067
 me+0.067
Neg logit contributions
uts-0.205
vere-0.188
ding-0.188
othe-0.186
ache-0.180
Token,
Token ID11
Feature activation-7.03e+00

Pos logit contributions
ened+0.137
en+0.135
ere+0.130
eled+0.117
efully+0.115
Neg logit contributions
uts-0.255
vere-0.224
ding-0.223
othe-0.221
t-0.210
Token in
Token ID287
Feature activation-1.11e+01

Pos logit contributions
ened+0.130
en+0.127
ere+0.121
 types+0.104
efully+0.102
Neg logit contributions
uts-0.183
ding-0.179
othe-0.168
ometer-0.166
vere-0.165
Token a
Token ID257
Feature activation-4.41e+00

Pos logit contributions
ened+0.218
en+0.218
ere+0.203
eled+0.193
ered+0.191
Neg logit contributions
uts-0.267
othe-0.241
ding-0.238
vere-0.233
 much-0.227
Token small
Token ID1402
Feature activation-6.31e+00

Pos logit contributions
ened+0.103
en+0.097
ered+0.094
ere+0.092
els+0.084
Neg logit contributions
uts-0.093
ding-0.090
vere-0.083
ache-0.081
othe-0.079
Token village
Token ID7404
Feature activation-5.06e+00

Pos logit contributions
ered+0.126
ened+0.124
ere+0.113
en+0.112
efully+0.108
Neg logit contributions
uts-0.144
ding-0.137
ometer-0.133
ache-0.127
 numb-0.123
Token,
Token ID11
Feature activation-4.77e+00

Pos logit contributions
ened+0.081
en+0.079
ere+0.075
efully+0.071
eled+0.066
Neg logit contributions
uts-0.150
othe-0.136
vere-0.133
ding-0.132
ometer-0.126
Token there
Token ID612
Feature activation-6.19e+00

Pos logit contributions
ened+0.113
en+0.112
ere+0.103
ited+0.088
eled+0.088
Neg logit contributions
 much-0.099
uts-0.099
othe-0.098
vere-0.098
ometer-0.096
TokenOnce
Token ID7454
Feature activation-7.26e+00

Pos logit contributions
ened+0.216
en+0.212
ere+0.206
 colored+0.191
ered+0.189
Neg logit contributions
uts-0.199
ding-0.174
othe-0.167
vere-0.167
ometer-0.156
Token upon
Token ID2402
Feature activation-8.27e+00

Pos logit contributions
en+0.107
ere+0.105
ened+0.105
 colored+0.094
efully+0.092
Neg logit contributions
uts-0.221
othe-0.205
ding-0.200
vere-0.200
ometer-0.192
Token a
Token ID257
Feature activation-6.38e+00

Pos logit contributions
ere+0.109
en+0.107
ened+0.106
 me+0.089
el+0.086
Neg logit contributions
uts-0.268
vere-0.243
ding-0.242
othe-0.240
ache-0.232
Token time
Token ID640
Feature activation-8.26e+00

Pos logit contributions
ened+0.082
en+0.081
ere+0.081
 colored+0.067
 me+0.067
Neg logit contributions
uts-0.205
vere-0.188
ding-0.188
othe-0.186
ache-0.180
Token,
Token ID11
Feature activation-7.03e+00

Pos logit contributions
ened+0.137
en+0.135
ere+0.130
eled+0.117
efully+0.115
Neg logit contributions
uts-0.255
vere-0.224
ding-0.223
othe-0.221
t-0.210
Token in
Token ID287
Feature activation-1.11e+01

Pos logit contributions
ened+0.130
en+0.127
ere+0.121
 types+0.104
efully+0.102
Neg logit contributions
uts-0.183
ding-0.179
othe-0.168
ometer-0.166
vere-0.165
Token a
Token ID257
Feature activation-4.41e+00

Pos logit contributions
ened+0.218
en+0.218
ere+0.203
eled+0.193
ered+0.191
Neg logit contributions
uts-0.267
othe-0.241
ding-0.238
vere-0.233
 much-0.227
Token lively
Token ID29696
Feature activation-6.79e+00

Pos logit contributions
ened+0.103
en+0.097
ered+0.094
ere+0.092
els+0.084
Neg logit contributions
uts-0.093
ding-0.090
vere-0.083
ache-0.081
othe-0.079
Token little
Token ID1310
Feature activation-6.14e+00

Pos logit contributions
ened+0.114
ered+0.110
ere+0.103
en+0.101
efully+0.100
Neg logit contributions
uts-0.175
ding-0.168
ache-0.156
ometer-0.156
 cho-0.152
Token town
Token ID3240
Feature activation-4.28e+00

Pos logit contributions
ened+0.104
ere+0.102
en+0.100
ered+0.094
 colored+0.088
Neg logit contributions
uts-0.170
ding-0.158
vere-0.151
othe-0.150
ometer-0.147
Token,
Token ID11
Feature activation-3.83e+00

Pos logit contributions
ened+0.064
ere+0.062
en+0.062
efully+0.054
eled+0.050
Neg logit contributions
uts-0.131
ding-0.118
othe-0.117
vere-0.116
ometer-0.112
TokenOnce
Token ID7454
Feature activation-7.26e+00

Pos logit contributions
ened+0.216
en+0.212
ere+0.206
 colored+0.191
ered+0.189
Neg logit contributions
uts-0.199
ding-0.174
othe-0.167
vere-0.167
ometer-0.156
Token upon
Token ID2402
Feature activation-8.27e+00

Pos logit contributions
en+0.107
ere+0.105
ened+0.105
 colored+0.094
efully+0.092
Neg logit contributions
uts-0.221
othe-0.205
ding-0.200
vere-0.200
ometer-0.192
Token a
Token ID257
Feature activation-6.38e+00

Pos logit contributions
ere+0.109
en+0.107
ened+0.106
 me+0.089
el+0.086
Neg logit contributions
uts-0.268
vere-0.243
ding-0.242
othe-0.240
ache-0.232
Token time
Token ID640
Feature activation-8.26e+00

Pos logit contributions
ened+0.082
en+0.081
ere+0.081
 colored+0.067
 me+0.067
Neg logit contributions
uts-0.205
vere-0.188
ding-0.188
othe-0.186
ache-0.180
Token,
Token ID11
Feature activation-7.03e+00

Pos logit contributions
ened+0.137
en+0.135
ere+0.130
eled+0.117
efully+0.115
Neg logit contributions
uts-0.255
vere-0.224
ding-0.223
othe-0.221
t-0.210
Token in
Token ID287
Feature activation-1.11e+01

Pos logit contributions
ened+0.130
en+0.127
ere+0.121
 types+0.104
efully+0.102
Neg logit contributions
uts-0.183
ding-0.179
othe-0.168
ometer-0.166
vere-0.165
Token a
Token ID257
Feature activation-4.41e+00

Pos logit contributions
ened+0.218
en+0.218
ere+0.203
eled+0.193
ered+0.191
Neg logit contributions
uts-0.267
othe-0.241
ding-0.238
vere-0.233
 much-0.227
Token small
Token ID1402
Feature activation-6.31e+00

Pos logit contributions
ened+0.103
en+0.097
ered+0.094
ere+0.092
els+0.084
Neg logit contributions
uts-0.093
ding-0.090
vere-0.083
ache-0.081
othe-0.079
Token village
Token ID7404
Feature activation-5.06e+00

Pos logit contributions
ered+0.126
ened+0.124
ere+0.113
en+0.112
efully+0.108
Neg logit contributions
uts-0.144
ding-0.137
ometer-0.133
ache-0.127
 numb-0.123
Token there
Token ID612
Feature activation-6.04e+00

Pos logit contributions
ened+0.081
en+0.079
ere+0.075
efully+0.071
eled+0.066
Neg logit contributions
uts-0.150
othe-0.136
vere-0.133
ding-0.132
ometer-0.126
Token lived
Token ID5615
Feature activation-7.20e+00

Pos logit contributions
ened+0.132
ere+0.121
en+0.120
efully+0.115
arted+0.108
Neg logit contributions
uts-0.136
ding-0.119
othe-0.117
whe-0.110
t-0.109
TokenOnce
Token ID7454
Feature activation-7.26e+00

Pos logit contributions
ened+0.216
en+0.212
ere+0.206
 colored+0.191
ered+0.189
Neg logit contributions
uts-0.199
ding-0.174
othe-0.167
vere-0.167
ometer-0.156
Token upon
Token ID2402
Feature activation-8.27e+00

Pos logit contributions
en+0.107
ere+0.105
ened+0.105
 colored+0.094
efully+0.092
Neg logit contributions
uts-0.221
othe-0.205
ding-0.200
vere-0.200
ometer-0.192
Token a
Token ID257
Feature activation-6.38e+00

Pos logit contributions
ere+0.109
en+0.107
ened+0.106
 me+0.089
el+0.086
Neg logit contributions
uts-0.268
vere-0.243
ding-0.242
othe-0.240
ache-0.232
Token time
Token ID640
Feature activation-8.26e+00

Pos logit contributions
ened+0.082
en+0.081
ere+0.081
 colored+0.067
 me+0.067
Neg logit contributions
uts-0.205
vere-0.188
ding-0.188
othe-0.186
ache-0.180
Token,
Token ID11
Feature activation-7.03e+00

Pos logit contributions
ened+0.137
en+0.135
ere+0.130
eled+0.117
efully+0.115
Neg logit contributions
uts-0.255
vere-0.224
ding-0.223
othe-0.221
t-0.210
Token in
Token ID287
Feature activation-1.11e+01

Pos logit contributions
ened+0.130
en+0.127
ere+0.121
 types+0.104
efully+0.102
Neg logit contributions
uts-0.183
ding-0.179
othe-0.168
ometer-0.166
vere-0.165
Token a
Token ID257
Feature activation-4.41e+00

Pos logit contributions
ened+0.218
en+0.218
ere+0.203
eled+0.193
ered+0.191
Neg logit contributions
uts-0.267
othe-0.241
ding-0.238
vere-0.233
 much-0.227
Token big
Token ID1263
Feature activation-5.69e+00

Pos logit contributions
ened+0.103
en+0.097
ered+0.094
ere+0.092
els+0.084
Neg logit contributions
uts-0.093
ding-0.090
vere-0.083
ache-0.081
othe-0.079
Token park
Token ID3952
Feature activation-4.85e+00

Pos logit contributions
ered+0.121
ened+0.117
efully+0.111
en+0.109
ere+0.107
Neg logit contributions
ding-0.111
uts-0.108
 cho-0.105
 rider-0.102
ache-0.102
Token,
Token ID11
Feature activation-4.00e+00

Pos logit contributions
ened+0.086
en+0.085
ere+0.083
efully+0.073
eled+0.072
Neg logit contributions
uts-0.139
ding-0.124
othe-0.124
vere-0.122
t-0.116
Token there
Token ID612
Feature activation-5.96e+00

Pos logit contributions
ened+0.090
en+0.090
ere+0.082
eled+0.071
ited+0.069
Neg logit contributions
uts-0.091
vere-0.087
 much-0.086
ding-0.085
othe-0.083
Token because
Token ID780
Feature activation+3.97e+00

Pos logit contributions
en+0.024
 me+0.023
ened+0.021
ere+0.019
el+0.018
Neg logit contributions
uts-0.064
vere-0.063
ding-0.062
zel-0.059
tr-0.059
Token there
Token ID612
Feature activation+1.50e+00

Pos logit contributions
uts+0.097
vere+0.086
ding+0.086
ache+0.083
ometer+0.082
Neg logit contributions
ere-0.082
ened-0.081
en-0.081
 me-0.079
el-0.075
Token was
Token ID373
Feature activation+1.73e+00

Pos logit contributions
uts+0.049
vere+0.044
othe+0.044
ding+0.044
ache+0.042
Neg logit contributions
ere-0.019
en-0.019
ened-0.019
 me-0.016
 colored-0.016
Token no
Token ID645
Feature activation+1.06e+00

Pos logit contributions
uts+0.043
othe+0.039
vere+0.038
ding+0.038
ache+0.037
Neg logit contributions
ere-0.035
en-0.034
ened-0.033
el-0.030
efully-0.030
Token more
Token ID517
Feature activation+5.43e+00

Pos logit contributions
uts+0.027
vere+0.024
othe+0.024
ding+0.024
ache+0.023
Neg logit contributions
ere-0.021
en-0.020
ened-0.020
 me-0.018
 colored-0.018
Token rice
Token ID11464
Feature activation+7.76e+00

Pos logit contributions
uts+0.143
vere+0.127
othe+0.126
ding+0.126
ache+0.120
Neg logit contributions
ere-0.104
en-0.103
ened-0.101
 me-0.093
 colored-0.090
Token left
Token ID1364
Feature activation+8.67e-01

Pos logit contributions
uts+0.221
vere+0.206
othe+0.201
ding+0.200
ometer+0.193
Neg logit contributions
ere-0.126
en-0.126
ened-0.125
 me-0.113
 colored-0.110
Token.
Token ID13
Feature activation+2.35e+00

Pos logit contributions
uts+0.022
vere+0.020
othe+0.020
ding+0.020
ache+0.019
Neg logit contributions
ere-0.017
en-0.017
ened-0.016
 me-0.015
 colored-0.015
Token\n
Token ID198
Feature activation+3.55e+00

Pos logit contributions
uts+0.061
vere+0.056
ding+0.055
othe+0.053
ache+0.053
Neg logit contributions
ere-0.045
en-0.043
ened-0.043
 me-0.041
el-0.039
Token\n
Token ID198
Feature activation+2.89e+00

Pos logit contributions
uts+0.092
vere+0.083
ding+0.082
ache+0.080
othe+0.079
Neg logit contributions
ere-0.068
en-0.064
 me-0.063
ened-0.063
 colored-0.059
TokenS
Token ID50
Feature activation-1.93e+00

Pos logit contributions
uts+0.070
vere+0.062
ding+0.061
othe+0.060
ache+0.058
Neg logit contributions
ere-0.062
en-0.060
ened-0.059
 me-0.057
 colored-0.055
Tokeny
Token ID88
Feature activation+3.80e+00

Pos logit contributions
uts+0.046
ding+0.041
othe+0.040
ache+0.040
vere+0.040
Neg logit contributions
ere-0.021
ened-0.019
en-0.019
 me-0.018
 colored-0.017
Token again
Token ID757
Feature activation+2.09e+00

Pos logit contributions
uts+0.101
vere+0.091
ding+0.090
othe+0.089
ometer+0.086
Neg logit contributions
en-0.072
ened-0.071
ere-0.070
 me-0.064
 colored-0.062
Token after
Token ID706
Feature activation+4.93e-01

Pos logit contributions
uts+0.059
vere+0.054
ding+0.054
othe+0.053
ometer+0.052
Neg logit contributions
en-0.036
ere-0.035
ened-0.035
 me-0.032
 colored-0.031
Token accepting
Token ID12598
Feature activation-1.11e+00

Pos logit contributions
uts+0.015
othe+0.014
vere+0.013
ding+0.013
ache+0.012
Neg logit contributions
ere-0.008
ened-0.007
en-0.007
eled-0.006
 colored-0.006
Token the
Token ID262
Feature activation+3.13e+00

Pos logit contributions
ere+0.015
en+0.014
ened+0.014
 colored+0.012
 me+0.012
Neg logit contributions
uts-0.035
othe-0.032
vere-0.032
ding-0.031
ache-0.030
Token stone
Token ID7815
Feature activation+7.45e+00

Pos logit contributions
uts+0.080
vere+0.074
othe+0.072
ding+0.071
ache+0.070
Neg logit contributions
ere-0.060
en-0.059
ened-0.057
efully-0.050
el-0.050
Token and
Token ID290
Feature activation+4.81e+00

Pos logit contributions
uts+0.207
ding+0.192
vere+0.191
othe+0.186
ometer+0.179
Neg logit contributions
en-0.130
ened-0.127
ere-0.121
 me-0.119
el-0.110
Token it
Token ID340
Feature activation+4.25e+00

Pos logit contributions
uts+0.136
othe+0.124
vere+0.122
ding+0.120
ache+0.116
Neg logit contributions
ere-0.082
en-0.078
ened-0.076
el-0.068
 me-0.068
Token made
Token ID925
Feature activation-2.00e-01

Pos logit contributions
uts+0.122
vere+0.110
othe+0.110
ding+0.108
ache+0.105
Neg logit contributions
ere-0.071
en-0.068
ened-0.067
 me-0.059
el-0.059
Token her
Token ID607
Feature activation+2.42e+00

Pos logit contributions
en+0.001
ere+0.001
ened+0.001
 me+0.001
el+0.001
Neg logit contributions
uts-0.008
ding-0.007
vere-0.007
othe-0.007
ache-0.007
Token feel
Token ID1254
Feature activation+3.55e+00

Pos logit contributions
uts+0.066
ding+0.062
vere+0.060
othe+0.056
ache+0.056
Neg logit contributions
en-0.044
ened-0.044
ere-0.040
 me-0.040
el-0.038
Token I
Token ID314
Feature activation+1.01e+00

Pos logit contributions
uts+0.059
vere+0.052
othe+0.052
ding+0.052
ache+0.050
Neg logit contributions
ere-0.052
en-0.051
ened-0.050
 me-0.047
 colored-0.045
Token used
Token ID973
Feature activation-1.94e+00

Pos logit contributions
uts+0.023
ding+0.020
vere+0.020
othe+0.020
ache+0.019
Neg logit contributions
ere-0.023
ened-0.022
en-0.022
 me-0.021
el-0.020
Token to
Token ID284
Feature activation-9.99e-01

Pos logit contributions
ere+0.030
en+0.030
ened+0.029
 me+0.025
 colored+0.025
Neg logit contributions
uts-0.058
vere-0.052
othe-0.052
ding-0.052
ache-0.049
Token sail
Token ID14936
Feature activation+1.38e+00

Pos logit contributions
ere+0.019
en+0.019
ened+0.019
 me+0.017
 colored+0.017
Neg logit contributions
uts-0.026
vere-0.023
ding-0.023
othe-0.023
ache-0.022
Token the
Token ID262
Feature activation+2.03e+00

Pos logit contributions
uts+0.038
vere+0.035
ding+0.034
othe+0.034
ache+0.033
Neg logit contributions
ere-0.024
en-0.023
ened-0.023
 me-0.021
 colored-0.020
Token seas
Token ID21547
Feature activation+8.04e+00

Pos logit contributions
uts+0.053
vere+0.047
othe+0.047
ding+0.047
ache+0.044
Neg logit contributions
ere-0.040
en-0.039
ened-0.039
 me-0.036
 colored-0.035
Token and
Token ID290
Feature activation+3.51e+00

Pos logit contributions
uts+0.188
othe+0.175
vere+0.174
ding+0.168
ache+0.161
Neg logit contributions
ere-0.170
en-0.169
ened-0.161
 me-0.160
 colored-0.149
Token look
Token ID804
Feature activation+1.48e+00

Pos logit contributions
uts+0.098
vere+0.088
ding+0.087
othe+0.087
ache+0.084
Neg logit contributions
ere-0.061
en-0.060
ened-0.059
 me-0.054
 colored-0.053
Token for
Token ID329
Feature activation-2.72e+00

Pos logit contributions
uts+0.040
vere+0.035
othe+0.035
ding+0.035
ache+0.034
Neg logit contributions
ere-0.027
en-0.027
ened-0.027
 me-0.025
 colored-0.024
Token treasure
Token ID14068
Feature activation+4.16e+00

Pos logit contributions
ere+0.047
ened+0.046
en+0.046
el+0.039
 colored+0.039
Neg logit contributions
uts-0.076
vere-0.070
othe-0.068
ding-0.067
ache-0.065
Token.
Token ID13
Feature activation+2.36e+00

Pos logit contributions
uts+0.104
othe+0.098
vere+0.095
ding+0.094
 bleed+0.090
Neg logit contributions
en-0.082
ere-0.081
 me-0.078
ened-0.077
 colored-0.071
Token sparkling
Token ID39072
Feature activation+1.04e+00

Pos logit contributions
uts+0.097
othe+0.087
ding+0.085
vere+0.085
ometer+0.079
Neg logit contributions
ere-0.056
 me-0.055
ened-0.055
en-0.054
 colored-0.052
Token wand
Token ID11569
Feature activation+3.76e+00

Pos logit contributions
uts+0.023
ding+0.020
vere+0.020
othe+0.019
ometer+0.019
Neg logit contributions
ere-0.023
en-0.022
ened-0.022
efully-0.021
ered-0.020
Token and
Token ID290
Feature activation+3.60e+00

Pos logit contributions
uts+0.100
ding+0.088
vere+0.087
othe+0.087
ache+0.084
Neg logit contributions
ere-0.073
en-0.070
ened-0.069
 me-0.064
efully-0.061
Token a
Token ID257
Feature activation+4.10e+00

Pos logit contributions
uts+0.105
othe+0.094
vere+0.094
ding+0.093
ache+0.089
Neg logit contributions
ere-0.060
en-0.058
ened-0.055
el-0.050
 me-0.048
Token shiny
Token ID22441
Feature activation+1.84e-01

Pos logit contributions
uts+0.124
ding+0.110
othe+0.109
vere+0.106
whe+0.100
Neg logit contributions
 me-0.066
ened-0.062
 colored-0.061
en-0.060
ere-0.059
Token crown
Token ID12389
Feature activation+7.13e+00

Pos logit contributions
uts+0.004
vere+0.004
othe+0.004
ding+0.004
ometer+0.004
Neg logit contributions
ere-0.004
en-0.004
ened-0.004
efully-0.004
eled-0.003
Token.
Token ID13
Feature activation+1.82e+00

Pos logit contributions
uts+0.187
vere+0.186
othe+0.180
ding+0.176
 numb+0.168
Neg logit contributions
ened-0.122
en-0.119
 colored-0.110
ere-0.109
el-0.106
Token With
Token ID2080
Feature activation-4.29e+00

Pos logit contributions
uts+0.044
othe+0.038
ding+0.038
vere+0.038
ometer+0.037
Neg logit contributions
en-0.039
ere-0.039
ened-0.038
 colored-0.035
el-0.034
Token these
Token ID777
Feature activation+6.76e-01

Pos logit contributions
ere+0.077
en+0.076
een+0.069
urious+0.069
ened+0.065
Neg logit contributions
uts-0.112
othe-0.107
vere-0.099
 much-0.096
 numb-0.093
Token items
Token ID3709
Feature activation+3.13e+00

Pos logit contributions
uts+0.016
vere+0.015
othe+0.015
ache+0.014
ding+0.013
Neg logit contributions
ere-0.013
en-0.013
eled-0.012
efully-0.012
een-0.012
Token,
Token ID11
Feature activation+2.12e+00

Pos logit contributions
uts+0.080
vere+0.071
othe+0.071
ding+0.070
ache+0.066
Neg logit contributions
ere-0.063
en-0.062
ened-0.060
 me-0.056
 colored-0.054
Token money
Token ID1637
Feature activation+1.67e+00

Pos logit contributions
ere+0.007
en+0.007
ened+0.007
 me+0.006
 colored+0.006
Neg logit contributions
uts-0.009
vere-0.008
ding-0.008
othe-0.008
ache-0.008
Token to
Token ID284
Feature activation-1.39e+00

Pos logit contributions
uts+0.044
vere+0.040
othe+0.040
ding+0.039
ache+0.038
Neg logit contributions
ere-0.031
ened-0.031
en-0.031
 me-0.028
 colored-0.027
Token pay
Token ID1414
Feature activation-2.63e-01

Pos logit contributions
ere+0.028
en+0.027
ened+0.027
 me+0.025
 colored+0.024
Neg logit contributions
uts-0.036
vere-0.032
ding-0.031
othe-0.031
ache-0.030
Token for
Token ID329
Feature activation-2.00e+00

Pos logit contributions
ere+0.005
en+0.005
ened+0.005
 me+0.004
el+0.004
Neg logit contributions
uts-0.008
vere-0.007
ding-0.007
othe-0.006
ache-0.006
Token the
Token ID262
Feature activation+3.64e+00

Pos logit contributions
ere+0.018
en+0.018
ened+0.018
 me+0.016
el+0.014
Neg logit contributions
uts-0.074
vere-0.067
ding-0.066
othe-0.066
ometer-0.065
Token food
Token ID2057
Feature activation+7.53e+00

Pos logit contributions
uts+0.094
vere+0.083
othe+0.083
ding+0.082
ache+0.078
Neg logit contributions
ere-0.072
ened-0.071
en-0.070
 me-0.066
 colored-0.064
Token.
Token ID13
Feature activation+3.14e+00

Pos logit contributions
uts+0.205
vere+0.198
othe+0.190
 bleed+0.186
ding+0.184
Neg logit contributions
en-0.121
ened-0.120
ere-0.119
 me-0.119
 colored-0.105
Token Dad
Token ID17415
Feature activation-2.48e+00

Pos logit contributions
uts+0.084
vere+0.077
othe+0.075
ding+0.074
ache+0.073
Neg logit contributions
ere-0.058
en-0.057
ened-0.057
 me-0.052
el-0.050
Token was
Token ID373
Feature activation-3.23e-01

Pos logit contributions
ere+0.040
en+0.039
ened+0.039
 me+0.033
el+0.033
Neg logit contributions
uts-0.071
ding-0.065
othe-0.063
ache-0.063
vere-0.062
Token so
Token ID523
Feature activation+4.01e+00

Pos logit contributions
ere+0.007
en+0.007
ened+0.007
 me+0.006
 colored+0.006
Neg logit contributions
uts-0.008
vere-0.007
ding-0.007
othe-0.007
ache-0.006
Token relieved
Token ID22598
Feature activation-2.60e+00

Pos logit contributions
uts+0.085
ding+0.073
vere+0.072
othe+0.072
ometer+0.071
Neg logit contributions
ere-0.095
ened-0.090
 me-0.088
en-0.085
 colored-0.085
Token\n
Token ID198
Feature activation+2.26e+00

Pos logit contributions
uts+0.076
vere+0.068
ding+0.067
ache+0.067
othe+0.066
Neg logit contributions
ere-0.057
en-0.054
 me-0.053
ened-0.052
el-0.048
TokenThe
Token ID464
Feature activation-1.72e+00

Pos logit contributions
uts+0.045
vere+0.040
othe+0.039
ding+0.039
ache+0.037
Neg logit contributions
en-0.058
ened-0.058
ere-0.057
el-0.052
 colored-0.052
Token wolf
Token ID17481
Feature activation-6.24e-01

Pos logit contributions
ere+0.038
en+0.038
ened+0.037
 me+0.035
 colored+0.034
Neg logit contributions
uts-0.040
vere-0.035
ding-0.035
othe-0.035
ache-0.033
Token came
Token ID1625
Feature activation-4.11e+00

Pos logit contributions
ere+0.009
en+0.009
ened+0.009
 colored+0.007
el+0.007
Neg logit contributions
uts-0.019
othe-0.017
vere-0.017
ding-0.017
ache-0.017
Token closer
Token ID5699
Feature activation-3.55e+00

Pos logit contributions
ere+0.084
ened+0.081
en+0.079
 colored+0.072
el+0.069
Neg logit contributions
uts-0.103
vere-0.093
othe-0.092
ding-0.090
ometer-0.085
Token and
Token ID290
Feature activation+1.15e+00

Pos logit contributions
ere+0.061
en+0.058
ened+0.058
 me+0.052
 colored+0.051
Neg logit contributions
uts-0.101
othe-0.090
ding-0.090
vere-0.090
ache-0.086
Token closer
Token ID5699
Feature activation-2.86e+00

Pos logit contributions
uts+0.034
othe+0.032
ding+0.032
vere+0.031
ache+0.030
Neg logit contributions
ere-0.018
en-0.017
ened-0.017
el-0.015
 colored-0.014
Token,
Token ID11
Feature activation+1.49e+00

Pos logit contributions
ere+0.057
en+0.055
ened+0.055
 colored+0.050
 me+0.050
Neg logit contributions
uts-0.073
othe-0.065
vere-0.065
ding-0.064
ache-0.061
Token and
Token ID290
Feature activation+1.73e+00

Pos logit contributions
uts+0.041
vere+0.037
ding+0.036
othe+0.036
ometer+0.035
Neg logit contributions
en-0.026
ened-0.026
 me-0.026
ere-0.026
 colored-0.023
Token Benny
Token ID44275
Feature activation-2.85e+00

Pos logit contributions
uts+0.045
othe+0.042
ding+0.041
vere+0.041
ache+0.039
Neg logit contributions
ere-0.033
en-0.032
ened-0.031
el-0.028
 colored-0.028
Token knew
Token ID2993
Feature activation-6.20e-01

Pos logit contributions
ere+0.049
en+0.046
ened+0.046
el+0.042
 colored+0.041
Neg logit contributions
uts-0.078
ache-0.072
ding-0.070
othe-0.070
vere-0.069
Token on
Token ID319
Feature activation-5.55e+00

Pos logit contributions
ere+0.042
en+0.040
ened+0.039
el+0.034
eled+0.034
Neg logit contributions
uts-0.083
vere-0.076
ding-0.076
othe-0.076
ache-0.070
Token the
Token ID262
Feature activation+8.41e-01

Pos logit contributions
ere+0.063
ened+0.058
en+0.057
eled+0.048
 colored+0.046
Neg logit contributions
uts-0.186
othe-0.173
vere-0.172
ding-0.169
 numb-0.161
Token branch
Token ID8478
Feature activation+3.37e+00

Pos logit contributions
uts+0.027
othe+0.024
vere+0.023
ding+0.023
ometer+0.023
Neg logit contributions
 me-0.013
ere-0.012
en-0.012
ened-0.012
 colored-0.011
Token,
Token ID11
Feature activation+2.34e+00

Pos logit contributions
uts+0.109
vere+0.100
ding+0.099
othe+0.098
ache+0.096
Neg logit contributions
ere-0.043
en-0.043
ened-0.042
 me-0.038
 colored-0.035
Token practicing
Token ID18207
Feature activation-1.30e+00

Pos logit contributions
uts+0.062
vere+0.055
ding+0.054
othe+0.054
ache+0.052
Neg logit contributions
ere-0.045
en-0.044
ened-0.044
 me-0.040
 colored-0.039
Token her
Token ID607
Feature activation+1.15e+00

Pos logit contributions
ere+0.019
en+0.019
ened+0.018
 me+0.016
 colored+0.016
Neg logit contributions
uts-0.040
vere-0.037
ding-0.037
othe-0.036
ache-0.035
Token balancing
Token ID22486
Feature activation+2.27e+00

Pos logit contributions
uts+0.030
ding+0.028
othe+0.027
vere+0.026
ache+0.025
Neg logit contributions
 me-0.022
en-0.022
ened-0.022
ere-0.021
 colored-0.019
Token skills
Token ID4678
Feature activation+4.33e+00

Pos logit contributions
uts+0.062
ding+0.056
othe+0.056
vere+0.056
ache+0.053
Neg logit contributions
en-0.041
ere-0.040
ened-0.040
 me-0.038
 colored-0.036
Token.
Token ID13
Feature activation+2.36e+00

Pos logit contributions
uts+0.121
othe+0.112
vere+0.110
ache+0.109
ding+0.109
Neg logit contributions
en-0.071
ened-0.067
ere-0.067
 me-0.065
 colored-0.059
Token Other
Token ID3819
Feature activation-1.90e+00

Pos logit contributions
uts+0.065
othe+0.058
ding+0.058
vere+0.058
ometer+0.055
Neg logit contributions
ere-0.042
en-0.042
ened-0.041
 me-0.037
 colored-0.037
Token animals
Token ID4695
Feature activation+1.00e+00

Pos logit contributions
ere+0.032
ened+0.031
en+0.031
 colored+0.027
 me+0.027
Neg logit contributions
uts-0.053
vere-0.048
othe-0.048
ding-0.046
ache-0.045
Token and
Token ID290
Feature activation+3.30e+00

Pos logit contributions
uts+0.006
vere+0.006
othe+0.006
ometer+0.006
ding+0.006
Neg logit contributions
en-0.003
ened-0.002
ere-0.002
 me-0.002
 colored-0.002
Token had
Token ID550
Feature activation-1.19e+00

Pos logit contributions
uts+0.097
vere+0.088
ding+0.085
othe+0.084
ometer+0.083
Neg logit contributions
en-0.055
ened-0.055
ere-0.053
 me-0.049
 colored-0.049
Token lots
Token ID6041
Feature activation+6.86e-01

Pos logit contributions
ere+0.026
en+0.026
ened+0.026
el+0.024
eled+0.023
Neg logit contributions
uts-0.028
ding-0.024
vere-0.024
othe-0.023
ache-0.022
Token of
Token ID286
Feature activation+2.20e-01

Pos logit contributions
uts+0.020
ometer+0.020
vere+0.019
othe+0.019
ding+0.019
Neg logit contributions
ere-0.010
en-0.010
ened-0.010
 me-0.009
 colored-0.009
Token fun
Token ID1257
Feature activation+1.12e+00

Pos logit contributions
uts+0.005
ding+0.004
othe+0.004
ometer+0.004
ache+0.004
Neg logit contributions
ere-0.005
en-0.005
ened-0.005
 me-0.005
 colored-0.005
Token.
Token ID13
Feature activation+2.93e+00

Pos logit contributions
uts+0.031
ometer+0.029
ache+0.029
vere+0.029
ding+0.029
Neg logit contributions
en-0.020
ere-0.019
ened-0.018
 colored-0.018
 me-0.017
Token\n
Token ID198
Feature activation+4.10e+00

Pos logit contributions
uts+0.087
vere+0.083
ache+0.082
 numb+0.079
ding+0.078
Neg logit contributions
ere-0.043
 me-0.043
ened-0.041
en-0.040
 Every-0.038
Token\n
Token ID198
Feature activation+3.47e+00

Pos logit contributions
uts+0.113
ache+0.103
vere+0.103
ding+0.100
 numb+0.099
Neg logit contributions
ere-0.070
 me-0.066
en-0.064
ened-0.060
 colored-0.058
TokenFrom
Token ID4863
Feature activation-3.37e+00

Pos logit contributions
uts+0.090
ometer+0.080
othe+0.080
vere+0.078
ache+0.077
Neg logit contributions
ere-0.066
 me-0.065
en-0.062
ened-0.059
 colored-0.058
Token that
Token ID326
Feature activation-1.86e+00

Pos logit contributions
ened+0.084
en+0.082
ere+0.082
eled+0.074
een+0.073
Neg logit contributions
uts-0.065
othe-0.061
ding-0.060
vere-0.055
ache-0.053
Token day
Token ID1110
Feature activation-1.41e+00

Pos logit contributions
ere+0.034
en+0.033
ened+0.033
 me+0.030
 colored+0.029
Neg logit contributions
uts-0.052
vere-0.047
ding-0.046
othe-0.046
ache-0.044
Token Whenever
Token ID21326
Feature activation-1.56e+00

Pos logit contributions
uts+0.091
vere+0.086
ache+0.083
ding+0.082
othe+0.081
Neg logit contributions
ere-0.045
ened-0.044
en-0.043
 me-0.042
 Everyday-0.037
Token the
Token ID262
Feature activation+2.60e-01

Pos logit contributions
ere+0.034
en+0.034
ened+0.033
 me+0.031
 colored+0.030
Neg logit contributions
uts-0.037
vere-0.032
othe-0.032
ding-0.032
ache-0.030
Token tree
Token ID5509
Feature activation+1.51e+00

Pos logit contributions
uts+0.007
vere+0.006
ding+0.006
othe+0.006
ache+0.006
Neg logit contributions
ere-0.005
en-0.005
ened-0.005
 me-0.005
 colored-0.005
Token stretches
Token ID23687
Feature activation-1.51e-01

Pos logit contributions
uts+0.045
vere+0.041
ding+0.041
othe+0.041
ache+0.039
Neg logit contributions
ere-0.023
en-0.023
ened-0.022
 me-0.019
 colored-0.019
Token his
Token ID465
Feature activation+1.62e+00

Pos logit contributions
ere+0.002
en+0.002
ened+0.002
 me+0.002
 colored+0.002
Neg logit contributions
uts-0.005
vere-0.004
ding-0.004
othe-0.004
ache-0.004
Token branches
Token ID13737
Feature activation+3.94e+00

Pos logit contributions
uts+0.044
ding+0.040
vere+0.040
othe+0.040
ache+0.037
Neg logit contributions
ere-0.029
ened-0.028
en-0.028
 me-0.027
 colored-0.025
Token in
Token ID287
Feature activation-1.81e+00

Pos logit contributions
uts+0.119
vere+0.108
ding+0.107
othe+0.107
ache+0.103
Neg logit contributions
ere-0.059
en-0.058
ened-0.058
 me-0.052
 colored-0.050
Token the
Token ID262
Feature activation+1.21e+00

Pos logit contributions
ere+0.027
en+0.025
ened+0.024
 me+0.021
 colored+0.021
Neg logit contributions
uts-0.055
othe-0.051
vere-0.050
ding-0.049
 numb-0.047
Token morning
Token ID3329
Feature activation+2.50e+00

Pos logit contributions
uts+0.034
vere+0.031
ding+0.031
othe+0.031
ache+0.029
Neg logit contributions
ere-0.021
en-0.020
ened-0.020
 me-0.018
 colored-0.018
Token,
Token ID11
Feature activation+2.54e+00

Pos logit contributions
uts+0.084
vere+0.078
othe+0.077
ding+0.076
ometer+0.074
Neg logit contributions
ere-0.029
en-0.029
ened-0.029
 me-0.025
 colored-0.023
Token he
Token ID339
Feature activation+1.76e-01

Pos logit contributions
uts+0.077
vere+0.068
ding+0.068
ache+0.066
ometer+0.065
Neg logit contributions
ened-0.041
ere-0.040
en-0.040
 me-0.038
 colored-0.034
Token and
Token ID290
Feature activation+3.28e+00

Pos logit contributions
uts+0.062
vere+0.060
ding+0.060
ometer+0.058
 numb+0.058
Neg logit contributions
ened-0.022
en-0.021
 me-0.017
ere-0.016
ered-0.016
Token,
Token ID11
Feature activation+2.42e+00

Pos logit contributions
uts+0.091
vere+0.081
ding+0.081
othe+0.081
ache+0.077
Neg logit contributions
ere-0.058
en-0.057
ened-0.056
 me-0.051
 colored-0.050
Token with
Token ID351
Feature activation-3.11e+00

Pos logit contributions
uts+0.070
vere+0.064
ding+0.062
othe+0.061
ometer+0.061
Neg logit contributions
ened-0.040
en-0.040
ere-0.039
 me-0.038
 colored-0.034
Token a
Token ID257
Feature activation+3.99e+00

Pos logit contributions
ere+0.053
en+0.053
ened+0.048
el+0.045
 me+0.044
Neg logit contributions
othe-0.082
uts-0.081
vere-0.076
 numb-0.071
ding-0.070
Token big
Token ID1263
Feature activation+8.90e-01

Pos logit contributions
uts+0.118
ding+0.103
othe+0.098
vere+0.097
ometer+0.094
Neg logit contributions
 me-0.064
ere-0.064
ened-0.062
 two-0.059
en-0.058
Token smile
Token ID8212
Feature activation+1.43e+00

Pos logit contributions
uts+0.021
vere+0.019
ding+0.018
othe+0.018
ache+0.017
Neg logit contributions
ere-0.019
en-0.019
ened-0.019
 me-0.018
 colored-0.017
Token on
Token ID319
Feature activation-2.31e+00

Pos logit contributions
uts+0.041
vere+0.039
ding+0.038
othe+0.037
ometer+0.035
Neg logit contributions
ened-0.023
en-0.023
ere-0.022
 colored-0.020
el-0.020
Token his
Token ID465
Feature activation+2.45e+00

Pos logit contributions
ere+0.044
en+0.042
ened+0.040
 colored+0.037
 me+0.037
Neg logit contributions
uts-0.060
othe-0.057
vere-0.053
ding-0.051
ometer-0.049
Token face
Token ID1986
Feature activation+3.35e+00

Pos logit contributions
uts+0.069
ding+0.065
othe+0.061
vere+0.060
ometer+0.058
Neg logit contributions
ened-0.042
 me-0.042
en-0.041
ere-0.039
ered-0.036
Token,
Token ID11
Feature activation+3.96e+00

Pos logit contributions
uts+0.106
vere+0.099
ding+0.098
othe+0.095
ometer+0.093
Neg logit contributions
ened-0.045
en-0.044
ere-0.042
 me-0.037
el-0.037
Token he
Token ID339
Feature activation+8.04e-01

Pos logit contributions
uts+0.098
vere+0.087
ding+0.087
ometer+0.083
othe+0.082
Neg logit contributions
ened-0.082
en-0.081
ere-0.080
 me-0.077
 colored-0.074
Token\n
Token ID198
Feature activation+3.14e+00

Pos logit contributions
othe+0.012
uts+0.012
 numb+0.011
vere+0.011
ache+0.011
Neg logit contributions
ere-0.013
 me-0.012
en-0.012
ened-0.011
een-0.011
Token\n
Token ID198
Feature activation+2.40e+00

Pos logit contributions
uts+0.080
othe+0.071
vere+0.071
ache+0.071
ding+0.071
Neg logit contributions
ere-0.063
en-0.057
 me-0.057
ened-0.054
el-0.051
TokenJohn
Token ID7554
Feature activation-5.53e+00

Pos logit contributions
uts+0.059
othe+0.053
vere+0.052
ding+0.051
ometer+0.051
Neg logit contributions
ere-0.051
en-0.047
 me-0.046
ened-0.045
 colored-0.043
Token put
Token ID1234
Feature activation-3.24e+00

Pos logit contributions
ere+0.091
ened+0.085
en+0.082
 forces+0.079
efully+0.076
Neg logit contributions
uts-0.141
ding-0.134
ache-0.134
ometer-0.124
othe-0.124
Token them
Token ID606
Feature activation+2.78e+00

Pos logit contributions
en+0.045
ere+0.044
ened+0.041
urious+0.038
efully+0.037
Neg logit contributions
uts-0.095
ding-0.091
othe-0.090
vere-0.088
ometer-0.085
Token in
Token ID287
Feature activation-2.95e+00

Pos logit contributions
uts+0.077
vere+0.070
ding+0.069
othe+0.069
ache+0.066
Neg logit contributions
ere-0.049
en-0.048
ened-0.047
 me-0.042
 colored-0.041
Token his
Token ID465
Feature activation+3.79e-01

Pos logit contributions
ere+0.042
en+0.041
ened+0.040
efully+0.033
 me+0.033
Neg logit contributions
uts-0.089
othe-0.083
vere-0.082
ding-0.082
 numb-0.078
Token large
Token ID1588
Feature activation-7.38e-01

Pos logit contributions
uts+0.010
vere+0.009
ding+0.009
othe+0.009
ometer+0.008
Neg logit contributions
ere-0.007
en-0.007
ened-0.007
 me-0.006
 colored-0.006
Token pocket
Token ID10000
Feature activation+2.81e+00

Pos logit contributions
en+0.016
ere+0.016
ened+0.015
efully+0.015
ered+0.015
Neg logit contributions
uts-0.014
ometer-0.013
vere-0.013
ding-0.013
ache-0.013
Token and
Token ID290
Feature activation+2.78e+00

Pos logit contributions
uts+0.080
vere+0.073
othe+0.072
ding+0.072
ache+0.069
Neg logit contributions
ere-0.047
en-0.046
ened-0.045
 me-0.041
 colored-0.040
Token then
Token ID788
Feature activation-2.17e+00

Pos logit contributions
uts+0.081
ding+0.073
vere+0.073
othe+0.073
ache+0.070
Neg logit contributions
ere-0.045
en-0.044
ened-0.043
 me-0.038
el-0.037
Token,
Token ID11
Feature activation-1.73e+00

Pos logit contributions
ere+0.062
en+0.060
ened+0.055
el+0.052
eled+0.052
Neg logit contributions
uts-0.106
othe-0.101
ding-0.099
vere-0.096
ometer-0.091
Token George
Token ID4502
Feature activation-5.87e+00

Pos logit contributions
ere+0.028
ened+0.028
en+0.028
 me+0.026
 colored+0.025
Neg logit contributions
uts-0.050
vere-0.045
ding-0.044
othe-0.044
ache-0.043
Token was
Token ID373
Feature activation-3.67e+00

Pos logit contributions
ere+0.102
ened+0.095
en+0.089
els+0.084
 colored+0.083
Neg logit contributions
uts-0.159
ache-0.146
ding-0.143
ages-0.135
othe-0.134
Token walking
Token ID6155
Feature activation-5.86e+00

Pos logit contributions
ere+0.067
en+0.065
ened+0.065
 me+0.061
 colored+0.058
Neg logit contributions
uts-0.100
vere-0.089
ding-0.089
othe-0.088
ometer-0.084
Token near
Token ID1474
Feature activation-7.51e+00

Pos logit contributions
ere+0.098
en+0.096
ened+0.095
 me+0.085
 colored+0.083
Neg logit contributions
uts-0.169
vere-0.153
ding-0.152
othe-0.152
ache-0.145
Token a
Token ID257
Feature activation+2.76e-01

Pos logit contributions
ened+0.126
ere+0.124
en+0.119
al+0.108
efully+0.108
Neg logit contributions
uts-0.187
othe-0.180
 explode-0.175
ding-0.173
vere-0.167
Token lake
Token ID13546
Feature activation-9.40e-01

Pos logit contributions
uts+0.008
vere+0.007
othe+0.007
ding+0.007
ometer+0.006
Neg logit contributions
ere-0.005
en-0.005
 me-0.005
ened-0.005
 colored-0.004
Token and
Token ID290
Feature activation+4.12e-01

Pos logit contributions
ere+0.018
en+0.017
ened+0.017
 me+0.016
 colored+0.015
Neg logit contributions
uts-0.025
vere-0.022
ding-0.022
othe-0.022
ache-0.021
Token he
Token ID339
Feature activation-2.07e+00

Pos logit contributions
uts+0.006
ding+0.005
othe+0.005
vere+0.005
ache+0.005
Neg logit contributions
ere-0.012
en-0.012
ened-0.012
el-0.012
 colored-0.011
Token saw
Token ID2497
Feature activation-7.00e+00

Pos logit contributions
ere+0.029
en+0.028
ened+0.027
el+0.023
 colored+0.023
Neg logit contributions
uts-0.064
ding-0.060
vere-0.058
othe-0.058
ache-0.057
Token a
Token ID257
Feature activation+1.88e+00

Pos logit contributions
en+0.118
ere+0.116
ened+0.114
el+0.095
 colored+0.092
Neg logit contributions
uts-0.202
othe-0.186
vere-0.181
ding-0.179
ache-0.175
Token a
Token ID257
Feature activation+3.60e-01

Pos logit contributions
en+0.166
ened+0.165
ere+0.155
een+0.137
urious+0.137
Neg logit contributions
uts-0.186
othe-0.172
ache-0.158
 bleed-0.157
ding-0.157
Token cold
Token ID4692
Feature activation+3.39e-01

Pos logit contributions
uts+0.011
othe+0.010
vere+0.010
ding+0.010
whe+0.009
Neg logit contributions
 me-0.005
ere-0.005
en-0.005
 colored-0.005
ened-0.005
Token lizard
Token ID42406
Feature activation+8.43e-01

Pos logit contributions
uts+0.009
vere+0.008
ding+0.008
othe+0.008
ache+0.008
Neg logit contributions
ere-0.006
en-0.006
ened-0.006
 me-0.006
 colored-0.005
Token.
Token ID13
Feature activation-8.51e-01

Pos logit contributions
uts+0.023
vere+0.021
ding+0.020
othe+0.020
ache+0.019
Neg logit contributions
ere-0.015
en-0.015
ened-0.015
 me-0.014
 colored-0.013
Token She
Token ID1375
Feature activation-2.67e+00

Pos logit contributions
ere+0.018
en+0.018
ened+0.017
eled+0.016
 colored+0.016
Neg logit contributions
uts-0.021
othe-0.018
ding-0.018
ometer-0.018
vere-0.017
Token was
Token ID373
Feature activation-1.78e+00

Pos logit contributions
en+0.053
ere+0.051
ened+0.048
eful+0.045
 forces+0.045
Neg logit contributions
uts-0.068
othe-0.060
ache-0.057
ding-0.057
 bleed-0.056
Token so
Token ID523
Feature activation+2.07e+00

Pos logit contributions
en+0.050
ere+0.050
ened+0.049
 colored+0.044
een+0.044
Neg logit contributions
uts-0.032
othe-0.027
vere-0.027
ding-0.025
 bleed-0.025
Token excited
Token ID6568
Feature activation-4.24e+00

Pos logit contributions
uts+0.046
othe+0.039
vere+0.039
ding+0.038
ometer+0.036
Neg logit contributions
en-0.051
ere-0.050
ened-0.049
 colored-0.044
 me-0.044
Token!
Token ID0
Feature activation+2.28e-01

Pos logit contributions
ere+0.084
en+0.083
ened+0.082
 me+0.076
 colored+0.074
Neg logit contributions
uts-0.109
vere-0.097
ding-0.096
othe-0.096
ache-0.091
Token She
Token ID1375
Feature activation-2.11e+00

Pos logit contributions
uts+0.005
ding+0.005
othe+0.005
ometer+0.005
vere+0.004
Neg logit contributions
en-0.005
ened-0.005
ere-0.005
eled-0.005
 colored-0.005
Token wanted
Token ID2227
Feature activation-3.73e+00

Pos logit contributions
en+0.038
ere+0.037
ened+0.035
el+0.033
eful+0.032
Neg logit contributions
uts-0.056
ding-0.049
othe-0.049
ache-0.047
zel-0.046
Token there
Token ID612
Feature activation-7.49e+00

Pos logit contributions
en+0.124
ere+0.120
ened+0.118
efully+0.102
al+0.101
Neg logit contributions
othe-0.136
uts-0.133
 much-0.130
t-0.125
ding-0.124
Token lived
Token ID5615
Feature activation-7.64e+00

Pos logit contributions
ere+0.166
en+0.159
ened+0.159
efully+0.153
els+0.146
Neg logit contributions
uts-0.157
othe-0.147
ding-0.141
t-0.137
whe-0.132
Token a
Token ID257
Feature activation-1.26e+00

Pos logit contributions
ened+0.192
en+0.176
ening+0.150
ere+0.149
 angles+0.146
Neg logit contributions
uts-0.167
vere-0.157
 much-0.147
ding-0.145
whe-0.143
Token large
Token ID1588
Feature activation-3.87e+00

Pos logit contributions
 me+0.011
ere+0.011
 colored+0.009
en+0.009
 three+0.007
Neg logit contributions
uts-0.049
vere-0.044
othe-0.043
ding-0.042
ometer-0.041
Token dragon
Token ID10441
Feature activation-9.18e-01

Pos logit contributions
ere+0.074
ened+0.074
en+0.073
ered+0.065
 me+0.064
Neg logit contributions
uts-0.098
ding-0.089
vere-0.088
othe-0.087
ache-0.085
Token in
Token ID287
Feature activation-5.76e+00

Pos logit contributions
ere+0.014
en+0.014
ened+0.014
 me+0.013
 colored+0.012
Neg logit contributions
uts-0.027
vere-0.025
ding-0.025
othe-0.025
ache-0.024
Token a
Token ID257
Feature activation-2.86e-01

Pos logit contributions
ere+0.086
ened+0.084
en+0.083
eled+0.069
efully+0.067
Neg logit contributions
uts-0.168
othe-0.157
 explode-0.153
vere-0.153
ding-0.151
Token forest
Token ID8222
Feature activation-2.71e+00

Pos logit contributions
ened+0.006
ere+0.006
en+0.005
 colored+0.005
ered+0.005
Neg logit contributions
uts-0.007
vere-0.007
ding-0.007
othe-0.006
ache-0.006
Token.
Token ID13
Feature activation-8.88e-01

Pos logit contributions
ere+0.055
en+0.054
ened+0.054
 me+0.050
 colored+0.049
Neg logit contributions
uts-0.068
vere-0.060
ding-0.060
othe-0.059
ache-0.057
Token He
Token ID679
Feature activation-2.77e+00

Pos logit contributions
ere+0.017
en+0.017
ened+0.016
eled+0.015
 me+0.015
Neg logit contributions
uts-0.023
othe-0.021
ding-0.020
vere-0.020
ometer-0.019
Token had
Token ID550
Feature activation-2.89e+00

Pos logit contributions
ere+0.041
en+0.040
ened+0.039
el+0.034
eled+0.033
Neg logit contributions
uts-0.084
ding-0.076
vere-0.076
othe-0.075
ache-0.074
Token it
Token ID340
Feature activation+5.26e+00

Pos logit contributions
en+0.047
ere+0.046
ened+0.045
 me+0.043
 colored+0.040
Neg logit contributions
uts-0.089
vere-0.082
ding-0.080
othe-0.078
ache-0.077
Token herself
Token ID5223
Feature activation+2.54e+00

Pos logit contributions
uts+0.142
vere+0.131
othe+0.129
ding+0.127
ometer+0.124
Neg logit contributions
en-0.096
ened-0.094
ere-0.090
 me-0.088
 colored-0.085
Token.
Token ID13
Feature activation+2.63e+00

Pos logit contributions
uts+0.066
vere+0.060
othe+0.059
ding+0.059
ache+0.056
Neg logit contributions
en-0.048
ened-0.048
ere-0.048
 me-0.044
 colored-0.043
Token Mrs
Token ID9074
Feature activation-1.06e+00

Pos logit contributions
uts+0.080
vere+0.074
ding+0.072
othe+0.072
ache+0.071
Neg logit contributions
ere-0.039
en-0.038
ened-0.038
 me-0.035
 colored-0.032
Token.
Token ID13
Feature activation+2.86e+00

Pos logit contributions
ere+0.023
ened+0.022
en+0.022
 me+0.020
eled+0.020
Neg logit contributions
uts-0.025
othe-0.022
vere-0.021
ding-0.021
ache-0.021
Token Smith
Token ID4176
Feature activation-2.39e+00

Pos logit contributions
uts+0.062
vere+0.053
ding+0.053
othe+0.053
ache+0.049
Neg logit contributions
ere-0.068
en-0.067
ened-0.067
 me-0.062
 colored-0.061
Token was
Token ID373
Feature activation-7.83e-01

Pos logit contributions
ere+0.037
ened+0.036
en+0.036
 me+0.031
el+0.031
Neg logit contributions
uts-0.070
othe-0.063
ding-0.062
ache-0.062
vere-0.061
Token so
Token ID523
Feature activation+3.13e+00

Pos logit contributions
ere+0.019
en+0.019
ened+0.018
el+0.017
 colored+0.017
Neg logit contributions
uts-0.017
vere-0.015
othe-0.015
ding-0.015
ache-0.014
Token surprised
Token ID6655
Feature activation-3.04e+00

Pos logit contributions
uts+0.061
ding+0.052
vere+0.051
othe+0.051
ometer+0.048
Neg logit contributions
ere-0.082
ened-0.079
en-0.079
 me-0.076
 colored-0.074
Token and
Token ID290
Feature activation+3.79e+00

Pos logit contributions
ere+0.057
en+0.056
ened+0.056
 colored+0.050
 me+0.049
Neg logit contributions
uts-0.081
othe-0.071
vere-0.071
ding-0.071
ache-0.067
Token touched
Token ID12615
Feature activation-3.24e+00

Pos logit contributions
uts+0.108
vere+0.097
ding+0.096
othe+0.093
ometer+0.092
Neg logit contributions
ened-0.064
ere-0.064
en-0.063
 me-0.060
 colored-0.056
Token garden
Token ID11376
Feature activation-2.31e+00

Pos logit contributions
ened+0.048
en+0.047
ere+0.047
ered+0.043
efully+0.038
Neg logit contributions
uts-0.074
ding-0.074
tr-0.068
vere-0.068
othe-0.068
Token.
Token ID13
Feature activation-6.98e-01

Pos logit contributions
ere+0.048
en+0.047
ened+0.047
 me+0.043
 colored+0.042
Neg logit contributions
uts-0.057
ding-0.050
vere-0.050
othe-0.050
ache-0.047
Token Katy
Token ID36715
Feature activation-6.08e+00

Pos logit contributions
ere+0.017
en+0.017
ened+0.016
el+0.016
 me+0.015
Neg logit contributions
uts-0.014
ding-0.013
othe-0.011
ometer-0.011
vere-0.011
Token's
Token ID338
Feature activation-2.52e+00

Pos logit contributions
ere+0.109
ened+0.106
en+0.105
efully+0.097
een+0.092
Neg logit contributions
uts-0.158
ding-0.145
ache-0.135
ometer-0.131
othe-0.131
Token dad
Token ID9955
Feature activation-3.63e+00

Pos logit contributions
en+0.059
ere+0.056
ened+0.056
ered+0.050
efully+0.049
Neg logit contributions
uts-0.056
ding-0.050
vere-0.049
ometer-0.048
othe-0.048
Token took
Token ID1718
Feature activation-7.07e+00

Pos logit contributions
ere+0.060
en+0.058
ened+0.057
efully+0.051
 me+0.050
Neg logit contributions
uts-0.103
ding-0.096
othe-0.092
vere-0.091
ache-0.090
Token out
Token ID503
Feature activation-2.30e+00

Pos logit contributions
ere+0.079
en+0.078
ened+0.073
 colored+0.060
 me+0.060
Neg logit contributions
uts-0.231
ding-0.214
othe-0.214
vere-0.211
ache-0.201
Token a
Token ID257
Feature activation+2.11e+00

Pos logit contributions
en+0.038
ere+0.037
ened+0.036
eled+0.032
 me+0.032
Neg logit contributions
uts-0.061
othe-0.056
vere-0.055
ding-0.054
 numb-0.052
Token microscope
Token ID36396
Feature activation+1.25e+00

Pos logit contributions
uts+0.061
vere+0.053
othe+0.053
ding+0.051
ache+0.049
Neg logit contributions
ere-0.037
 me-0.035
ened-0.035
en-0.035
 colored-0.033
Token.
Token ID13
Feature activation-3.29e-01

Pos logit contributions
uts+0.036
vere+0.034
othe+0.034
ding+0.033
 bleed+0.031
Neg logit contributions
en-0.020
ened-0.019
ere-0.018
 me-0.017
 colored-0.017
Token He
Token ID679
Feature activation-2.38e+00

Pos logit contributions
ere+0.008
en+0.008
ened+0.007
el+0.007
 colored+0.007
Neg logit contributions
uts-0.007
ding-0.006
ometer-0.005
othe-0.005
vere-0.005
Token there
Token ID612
Feature activation-7.52e+00

Pos logit contributions
en+0.125
ened+0.123
ere+0.121
efully+0.108
eled+0.107
Neg logit contributions
uts-0.242
othe-0.217
vere-0.216
ding-0.213
t-0.204
Token was
Token ID373
Feature activation-6.28e+00

Pos logit contributions
ere+0.162
en+0.155
ened+0.152
efully+0.149
 forces+0.139
Neg logit contributions
uts-0.164
othe-0.151
ding-0.149
t-0.142
whe-0.138
Token a
Token ID257
Feature activation-1.89e+00

Pos logit contributions
ened+0.115
en+0.113
ere+0.109
ered+0.096
 me+0.091
Neg logit contributions
uts-0.170
ding-0.155
othe-0.149
vere-0.147
 numb-0.141
Token girl
Token ID2576
Feature activation-5.62e+00

Pos logit contributions
 me+0.013
ere+0.011
 colored+0.008
en+0.008
 three+0.008
Neg logit contributions
uts-0.078
vere-0.070
othe-0.069
ding-0.066
 bleed-0.065
Token named
Token ID3706
Feature activation-6.39e+00

Pos logit contributions
ere+0.097
ened+0.097
en+0.095
efully+0.082
 me+0.082
Neg logit contributions
uts-0.155
othe-0.141
ding-0.138
vere-0.138
ache-0.134
Token Alice
Token ID14862
Feature activation-7.83e+00

Pos logit contributions
ened+0.129
en+0.129
ere+0.126
eled+0.115
el+0.111
Neg logit contributions
uts-0.166
othe-0.148
ding-0.146
vere-0.145
ometer-0.136
Token.
Token ID13
Feature activation-8.47e-01

Pos logit contributions
ere+0.178
en+0.174
ened+0.174
 me+0.159
 colored+0.155
Neg logit contributions
uts-0.179
othe-0.155
ding-0.155
vere-0.155
ache-0.148
Token She
Token ID1375
Feature activation-3.31e+00

Pos logit contributions
ere+0.017
en+0.016
ened+0.016
el+0.015
eled+0.014
Neg logit contributions
uts-0.022
ding-0.020
vere-0.019
othe-0.019
ometer-0.018
Token wanted
Token ID2227
Feature activation-4.68e+00

Pos logit contributions
en+0.057
ere+0.057
ened+0.056
el+0.049
efully+0.048
Neg logit contributions
uts-0.093
ding-0.085
othe-0.082
vere-0.081
ache-0.080
Token to
Token ID284
Feature activation-3.52e+00

Pos logit contributions
ere+0.093
en+0.092
ened+0.091
 me+0.086
 colored+0.083
Neg logit contributions
uts-0.119
vere-0.107
othe-0.105
ding-0.105
ache-0.100
Token go
Token ID467
Feature activation-3.67e+00

Pos logit contributions
ere+0.067
en+0.066
ened+0.065
 me+0.059
 colored+0.058
Neg logit contributions
uts-0.094
vere-0.084
ding-0.083
othe-0.083
ache-0.079
Token<|endoftext|>
Token ID50256
Feature activation-8.95e+00

Pos logit contributions
uts+0.063
vere+0.057
ding+0.056
othe+0.056
ache+0.054
Neg logit contributions
ere-0.044
en-0.043
ened-0.043
 me-0.040
 colored-0.037
TokenOnce
Token ID7454
Feature activation-7.48e+00

Pos logit contributions
ere+0.203
en+0.201
ened+0.200
 me+0.184
 colored+0.181
Neg logit contributions
uts-0.203
vere-0.177
ding-0.176
othe-0.175
ache-0.164
Token upon
Token ID2402
Feature activation-7.16e+00

Pos logit contributions
en+0.109
ere+0.108
ened+0.107
 colored+0.093
efully+0.092
Neg logit contributions
uts-0.230
othe-0.214
ding-0.208
vere-0.207
ometer-0.199
Token a
Token ID257
Feature activation-5.74e+00

Pos logit contributions
ere+0.083
en+0.081
ened+0.079
 me+0.069
 colored+0.066
Neg logit contributions
uts-0.242
vere-0.222
ding-0.220
othe-0.220
ache-0.212
Token time
Token ID640
Feature activation-7.93e+00

Pos logit contributions
ere+0.074
en+0.073
ened+0.073
 me+0.061
 colored+0.060
Neg logit contributions
uts-0.185
ding-0.169
vere-0.169
othe-0.168
ache-0.162
Token,
Token ID11
Feature activation-6.10e+00

Pos logit contributions
en+0.136
ened+0.133
ere+0.130
eled+0.117
efully+0.116
Neg logit contributions
uts-0.239
vere-0.210
othe-0.210
ding-0.209
t-0.202
Token there
Token ID612
Feature activation-7.25e+00

Pos logit contributions
en+0.134
ere+0.125
ened+0.124
al+0.107
efully+0.106
Neg logit contributions
othe-0.139
 much-0.135
uts-0.134
t-0.131
ding-0.131
Token was
Token ID373
Feature activation-6.07e+00

Pos logit contributions
ere+0.157
ened+0.154
en+0.154
efully+0.146
al+0.139
Neg logit contributions
uts-0.159
othe-0.145
ding-0.144
t-0.138
whe-0.129
Token a
Token ID257
Feature activation-1.72e+00

Pos logit contributions
en+0.120
ened+0.119
ere+0.112
ered+0.102
al+0.096
Neg logit contributions
uts-0.157
ding-0.142
othe-0.135
 much-0.132
vere-0.132
Token happy
Token ID3772
Feature activation-4.28e+00

Pos logit contributions
 me+0.013
ere+0.012
 colored+0.009
en+0.009
ened+0.008
Neg logit contributions
uts-0.069
othe-0.062
vere-0.062
ding-0.059
 bleed-0.058
Token dog
Token ID3290
Feature activation-2.80e+00

Pos logit contributions
ere+0.060
 me+0.057
en+0.057
ened+0.053
 colored+0.051
Neg logit contributions
uts-0.139
vere-0.130
othe-0.125
ding-0.122
ache-0.117
Token<|endoftext|>
Token ID50256
Feature activation-8.48e+00

Pos logit contributions
uts+0.104
ache+0.101
vere+0.099
 numb+0.098
 bleed+0.097
Neg logit contributions
ere-0.059
 me-0.056
 Everyday-0.055
ened-0.055
 Every-0.055
TokenMike
Token ID16073
Feature activation-8.77e+00

Pos logit contributions
ened+0.151
en+0.148
ere+0.147
 colored+0.131
ered+0.129
Neg logit contributions
uts-0.231
ding-0.212
vere-0.205
othe-0.199
ometer-0.192
Tokeny
Token ID88
Feature activation-7.94e+00

Pos logit contributions
ere+0.149
ened+0.142
efully+0.141
 forces+0.137
en+0.133
Neg logit contributions
ding-0.212
uts-0.211
ache-0.186
ometer-0.176
othe-0.174
Token was
Token ID373
Feature activation-5.53e+00

Pos logit contributions
ere+0.140
ened+0.130
en+0.127
efully+0.121
els+0.121
Neg logit contributions
uts-0.208
ding-0.190
ache-0.176
othe-0.174
ometer-0.167
Token walking
Token ID6155
Feature activation-6.74e+00

Pos logit contributions
en+0.113
ened+0.112
ere+0.112
al+0.097
el+0.096
Neg logit contributions
uts-0.131
ding-0.122
vere-0.120
othe-0.117
 numb-0.110
Token through
Token ID832
Feature activation-9.81e+00

Pos logit contributions
ere+0.119
ened+0.112
en+0.112
 me+0.098
 colored+0.097
Neg logit contributions
uts-0.183
ding-0.171
vere-0.167
othe-0.165
ometer-0.157
Token the
Token ID262
Feature activation-2.67e+00

Pos logit contributions
ere+0.179
ened+0.169
en+0.163
ubby+0.157
al+0.157
Neg logit contributions
uts-0.212
 much-0.209
ding-0.205
whe-0.199
othe-0.198
Token park
Token ID3952
Feature activation-2.96e+00

Pos logit contributions
ened+0.049
ere+0.044
en+0.044
ered+0.044
eled+0.038
Neg logit contributions
ding-0.067
uts-0.067
vere-0.060
whe-0.059
tr-0.058
Token.
Token ID13
Feature activation-7.23e-01

Pos logit contributions
ere+0.055
en+0.054
ened+0.054
 me+0.048
 colored+0.047
Neg logit contributions
uts-0.079
vere-0.071
ding-0.071
othe-0.070
ache-0.067
Token He
Token ID679
Feature activation-3.13e+00

Pos logit contributions
ere+0.020
en+0.019
ened+0.019
el+0.018
al+0.018
Neg logit contributions
uts-0.013
ding-0.012
vere-0.011
othe-0.011
ometer-0.010
Token noticed
Token ID6810
Feature activation-7.48e+00

Pos logit contributions
ere+0.044
en+0.041
ened+0.040
efully+0.036
el+0.035
Neg logit contributions
ding-0.092
uts-0.091
vere-0.084
othe-0.083
t-0.080
Token.
Token ID13
Feature activation+3.49e+00

Pos logit contributions
uts+0.030
vere+0.028
ding+0.027
othe+0.027
ache+0.026
Neg logit contributions
ere-0.019
en-0.019
ened-0.019
 me-0.018
 colored-0.017
Token<|endoftext|>
Token ID50256
Feature activation-7.95e+00

Pos logit contributions
uts+0.097
vere+0.092
ache+0.091
ding+0.089
othe+0.087
Neg logit contributions
ere-0.060
en-0.058
ened-0.056
 me-0.055
el-0.051
TokenOnce
Token ID7454
Feature activation-7.00e+00

Pos logit contributions
ened+0.163
ere+0.161
en+0.161
 colored+0.144
 me+0.143
Neg logit contributions
uts-0.196
ding-0.175
vere-0.171
othe-0.167
ometer-0.159
Token upon
Token ID2402
Feature activation-6.74e+00

Pos logit contributions
en+0.098
ere+0.095
ened+0.095
 colored+0.084
efully+0.081
Neg logit contributions
uts-0.219
othe-0.205
vere-0.199
ding-0.199
ometer-0.191
Token a
Token ID257
Feature activation-5.55e+00

Pos logit contributions
ere+0.077
en+0.075
ened+0.074
 me+0.065
 colored+0.062
Neg logit contributions
uts-0.228
vere-0.209
ding-0.207
othe-0.207
ache-0.200
Token time
Token ID640
Feature activation-7.65e+00

Pos logit contributions
ened+0.071
ere+0.071
en+0.071
 me+0.059
 colored+0.058
Neg logit contributions
uts-0.177
vere-0.163
ding-0.163
othe-0.161
ache-0.156
Token,
Token ID11
Feature activation-5.63e+00

Pos logit contributions
en+0.120
ened+0.119
ere+0.117
eled+0.104
efully+0.103
Neg logit contributions
uts-0.236
vere-0.212
othe-0.211
ding-0.209
t-0.199
Token there
Token ID612
Feature activation-6.94e+00

Pos logit contributions
en+0.110
ened+0.106
ere+0.105
al+0.087
 types+0.085
Neg logit contributions
uts-0.138
othe-0.135
ding-0.132
vere-0.127
 much-0.127
Token was
Token ID373
Feature activation-5.87e+00

Pos logit contributions
ere+0.151
ened+0.146
en+0.145
efully+0.138
al+0.131
Neg logit contributions
uts-0.151
othe-0.138
ding-0.136
t-0.131
whe-0.125
Token a
Token ID257
Feature activation-1.75e+00

Pos logit contributions
ened+0.109
en+0.108
ere+0.103
ered+0.091
al+0.087
Neg logit contributions
uts-0.156
ding-0.143
othe-0.137
vere-0.135
 much-0.130
Token graceful
Token ID44363
Feature activation-4.09e+00

Pos logit contributions
 me+0.013
ere+0.012
en+0.010
 colored+0.009
ened+0.008
Neg logit contributions
uts-0.070
vere-0.063
othe-0.063
ding-0.060
 bleed-0.059