TOP ACTIVATIONS
MAX = 0.241

't know the rh in oc eros was wild . Can
am taking the rh in oc eros to a new home
give to the rh in oc eros . Lily walked to
Did you know that av oc ados are very healthy for
that the bad rh in oc eros was now nice .
he would take the av oc ados and scatter them around
it to the rh in oc eros . When he tasted
She loved to eat av oc ados . One day ,
She loved to eat av oc ados because they were y
't touch the rh in oc eros ! It might bite
couldn 't find any av oc ados . Finally , she
" The rh in oc eros is in there ,
and all the other av oc ados were very happy too
s wrong , rh in oc eros ? ” said
," said the rh in oc eros . The
I will buy more av oc ados tomorrow . You can
, " I love av oc ados ! They are so
They find the rh in oc eros . They are gentle
They saw the rh in oc eros . They thanked Mom
turned out that the av oc ados were bad and made

MIN ACTIVATIONS
MIN = -0.122

a big forest , there was a log . The log
, a small c actus was very frightened . He wanted
a big forest , there was a group of animal friends
and sunny day , Jimmy was so excited . He and
a small park , there was a broken swing . It
a big park , there was a loyal pigeon . The
a small town , there was a very old mine .
One day , a girl was tying a balloon to her
a small town , there was a judge . The judge
<|endoftext|> One day , Toby was playing in the backyard with
a small town , there was a powerful girl named Lily
a small town , there was a depend able red car
a small garden , there was a little cater pillar .
On Wednesday , Sus ie was excited . She was going
a small town , there was a girl named Amy .
a big forest , there was a poor little rabbit .
the park . As she looked around , she noticed a
<|endoftext|> One day , Pete was playing in his room .
a small town , there was a thoughtful girl named Lucy
a peaceful town , there was a pretty bridge . The

INTERVAL 0.150 - 0.241
CONTAINS 0.022%

s wrong , rh in oc eros ? ” said
that the bad rh in oc eros was now nice .
and all the other av oc ados were very happy too
give to the rh in oc eros . Lily walked to
am taking the rh in oc eros to a new home

INTERVAL 0.060 - 0.150
CONTAINS 4.253%

Sam . She was three years old . Sam was very
to decor ate the tree and make it pretty . They
his brother Sandy to be obedient from then on .
pillar w igg led inside and felt safe .
can make all the beautiful pictures without worrying your cr ay

INTERVAL -0.031 - 0.060
CONTAINS 78.809%

was a delicious red apple . He smiled .
From that day on , Tim was known as
." She pulled her arm back and stayed with the cow
ly , Lily brought the sauce home and showed it to
. We won 't take your things without asking ."

INTERVAL -0.122 - -0.031
CONTAINS 16.916%

. The reliable boy and his friends kept playing
a bee . The bee was very independent and she liked
a forest . Everywhere they looked they saw trees . Then
time , in a big tree , there was a little
find his way back home . The fire was getting bigger
Token't
Token ID470
Feature activation+1.10e-01

Pos logit contributions
reating+0.009
ites+0.009
ishes+0.009
orns+0.009
oes+0.009
Neg logit contributions
ider-0.016
 slightly-0.015
Dis-0.015
 softened-0.015
Luckily-0.015
Token know
Token ID760
Feature activation+1.06e-01

Pos logit contributions
ites+0.165
reating+0.165
orns+0.165
opl+0.164
ishes+0.163
Neg logit contributions
ider-0.166
 slightly-0.162
 softened-0.161
Luckily-0.157
 horrible-0.150
Token the
Token ID262
Feature activation+9.54e-02

Pos logit contributions
ites+0.141
opl+0.141
reating+0.138
orns+0.138
oes+0.136
Neg logit contributions
 slightly-0.176
ider-0.175
 softened-0.171
Dis-0.165
 horrible-0.164
Token rh
Token ID9529
Feature activation+1.31e-01

Pos logit contributions
reating+0.155
ites+0.150
opl+0.149
orns+0.148
oes+0.147
Neg logit contributions
ider-0.136
 slightly-0.129
 softened-0.127
Dis-0.127
Luckily-0.121
Tokenin
Token ID259
Feature activation+4.32e-02

Pos logit contributions
reating+0.225
fl+0.220
opl+0.220
oes+0.214
mel+0.209
Neg logit contributions
ider-0.181
 slightly-0.156
 softened-0.156
Luckily-0.149
Dis-0.148
Tokenoc
Token ID420
Feature activation+1.91e-01

Pos logit contributions
reating+0.045
ites+0.042
orns+0.040
opl+0.039
ishes+0.039
Neg logit contributions
ider-0.088
 slightly-0.086
 softened-0.086
Dis-0.085
 horrible-0.083
Tokeneros
Token ID27498
Feature activation+5.98e-02

Pos logit contributions
reating+0.276
ites+0.255
oes+0.255
uggle+0.251
 whenever+0.250
Neg logit contributions
ider-0.304
 softened-0.266
 slightly-0.261
Dis-0.253
 shattered-0.247
Token was
Token ID373
Feature activation-2.35e-02

Pos logit contributions
reating+0.103
ites+0.100
opl+0.099
orns+0.099
ie+0.099
Neg logit contributions
ider-0.079
 slightly-0.073
Dis-0.073
 softened-0.071
Luckily-0.069
Token wild
Token ID4295
Feature activation+7.23e-02

Pos logit contributions
ider+0.029
Dis+0.027
 softened+0.026
 slightly+0.026
Conf+0.025
Neg logit contributions
reating-0.043
opl-0.041
orns-0.041
ites-0.040
bur-0.040
Token.
Token ID13
Feature activation+2.78e-02

Pos logit contributions
ites+0.118
reating+0.116
opl+0.115
bur+0.114
orns+0.111
Neg logit contributions
ider-0.100
 slightly-0.096
 softened-0.092
 horrible-0.090
 strange-0.090
Token Can
Token ID1680
Feature activation+7.55e-02

Pos logit contributions
reating+0.042
ites+0.041
opl+0.041
oes+0.040
orns+0.040
Neg logit contributions
ider-0.043
 softened-0.040
 slightly-0.040
Dis-0.040
Luckily-0.037
Token am
Token ID716
Feature activation+3.55e-03

Pos logit contributions
reating+0.099
ites+0.096
opl+0.095
orns+0.094
oes+0.094
Neg logit contributions
ider-0.141
 slightly-0.134
Dis-0.133
 softened-0.133
Luckily-0.128
Token taking
Token ID2263
Feature activation+6.89e-02

Pos logit contributions
reating+0.006
ites+0.006
opl+0.006
orns+0.006
oes+0.006
Neg logit contributions
ider-0.005
 slightly-0.004
 softened-0.004
Dis-0.004
Luckily-0.004
Token the
Token ID262
Feature activation+9.44e-02

Pos logit contributions
reating+0.103
oes+0.100
opl+0.099
ites+0.098
ishes+0.097
Neg logit contributions
 slightly-0.103
ider-0.103
Dis-0.100
 softened-0.100
 horrible-0.096
Token rh
Token ID9529
Feature activation+1.16e-01

Pos logit contributions
reating+0.141
ites+0.137
opl+0.136
orns+0.135
ishes+0.133
Neg logit contributions
ider-0.145
 slightly-0.138
 softened-0.135
Dis-0.134
 horrible-0.131
Tokenin
Token ID259
Feature activation+3.71e-02

Pos logit contributions
fl+0.213
reating+0.213
opl+0.205
mel+0.203
oes+0.202
Neg logit contributions
ider-0.142
 badly-0.118
Luckily-0.117
 slightly-0.116
Dis-0.113
Tokenoc
Token ID420
Feature activation+1.81e-01

Pos logit contributions
reating+0.049
ishes+0.045
angles+0.044
orns+0.044
ites+0.043
Neg logit contributions
ider-0.062
A-0.058
 slightly-0.058
Out-0.056
 horrible-0.056
Tokeneros
Token ID27498
Feature activation+5.83e-02

Pos logit contributions
reating+0.286
 parties+0.261
 whenever+0.255
oes+0.251
uggle+0.247
Neg logit contributions
ider-0.275
 slightly-0.230
 softened-0.230
Dis-0.224
nown-0.220
Token to
Token ID284
Feature activation+5.30e-02

Pos logit contributions
reating+0.089
opl+0.084
orns+0.084
ites+0.083
ishes+0.083
Neg logit contributions
ider-0.086
 slightly-0.080
 softened-0.078
Luckily-0.077
 horrible-0.076
Token a
Token ID257
Feature activation+6.06e-03

Pos logit contributions
ites+0.081
ishes+0.079
fl+0.077
oes+0.077
orns+0.077
Neg logit contributions
 slightly-0.080
 strange-0.076
 horrible-0.076
Luckily-0.074
Dis-0.074
Token new
Token ID649
Feature activation+4.68e-02

Pos logit contributions
reating+0.010
opl+0.010
ites+0.010
oes+0.010
orns+0.010
Neg logit contributions
ider-0.008
 softened-0.008
Dis-0.008
 slightly-0.008
Luckily-0.007
Token home
Token ID1363
Feature activation+2.72e-02

Pos logit contributions
reating+0.062
ites+0.061
orns+0.061
ishes+0.059
oes+0.059
Neg logit contributions
ider-0.077
 slightly-0.077
Dis-0.074
 softened-0.072
Luckily-0.072
Token give
Token ID1577
Feature activation+1.02e-01

Pos logit contributions
reating+0.138
ites+0.136
ishes+0.133
opl+0.131
fl+0.131
Neg logit contributions
 slightly-0.131
 strange-0.126
 softened-0.126
 ripped-0.121
Luckily-0.121
Token to
Token ID284
Feature activation+9.67e-02

Pos logit contributions
opl+0.145
reating+0.136
ites+0.125
oes+0.124
ric+0.120
Neg logit contributions
 softened-0.167
 slightly-0.166
 strange-0.160
 ripped-0.158
 horrible-0.157
Token the
Token ID262
Feature activation+8.74e-02

Pos logit contributions
ites+0.155
pp+0.154
oes+0.152
fl+0.150
opl+0.147
Neg logit contributions
 softened-0.129
 strange-0.127
 slightly-0.125
 horrible-0.121
 sharp-0.119
Token rh
Token ID9529
Feature activation+1.17e-01

Pos logit contributions
reating+0.122
ites+0.117
opl+0.117
orns+0.115
oes+0.115
Neg logit contributions
ider-0.145
 slightly-0.138
 softened-0.137
Dis-0.136
Luckily-0.131
Tokenin
Token ID259
Feature activation+3.58e-02

Pos logit contributions
fl+0.214
reating+0.209
opl+0.207
mel+0.203
oes+0.202
Neg logit contributions
ider-0.146
 softened-0.127
Luckily-0.125
 slightly-0.124
Dis-0.121
Tokenoc
Token ID420
Feature activation+1.77e-01

Pos logit contributions
reating+0.049
angles+0.045
orns+0.044
ites+0.044
bur+0.044
Neg logit contributions
A-0.055
 slightly-0.055
ider-0.054
 Two-0.054
 A-0.054
Tokeneros
Token ID27498
Feature activation+5.89e-02

Pos logit contributions
reating+0.259
 whenever+0.241
oes+0.240
ites+0.237
uggle+0.235
Neg logit contributions
ider-0.273
 softened-0.242
 slightly-0.236
Dis-0.234
 shattered-0.222
Token.
Token ID13
Feature activation+2.95e-02

Pos logit contributions
reating+0.094
opl+0.091
ites+0.089
orns+0.088
bur+0.088
Neg logit contributions
ider-0.083
 softened-0.082
 slightly-0.081
Dis-0.076
Luckily-0.075
Token Lily
Token ID20037
Feature activation+4.89e-02

Pos logit contributions
reating+0.049
ites+0.047
opl+0.047
orns+0.047
ishes+0.046
Neg logit contributions
ider-0.041
 slightly-0.039
Dis-0.039
 softened-0.039
Luckily-0.037
Token walked
Token ID6807
Feature activation-1.45e-02

Pos logit contributions
reating+0.089
ites+0.085
opl+0.084
oes+0.084
orns+0.083
Neg logit contributions
ider-0.062
 softened-0.059
 slightly-0.058
Dis-0.058
Luckily-0.055
Token to
Token ID284
Feature activation+4.92e-02

Pos logit contributions
ider+0.026
 slightly+0.024
 softened+0.024
Dis+0.024
Luckily+0.023
Neg logit contributions
reating-0.019
ites-0.018
opl-0.018
orns-0.018
oes-0.018
Token Did
Token ID7731
Feature activation+6.91e-02

Pos logit contributions
reating+0.063
orns+0.059
ites+0.058
ie+0.058
ishes+0.058
Neg logit contributions
ider-0.051
Dis-0.047
 slightly-0.046
 softened-0.045
Luckily-0.045
Token you
Token ID345
Feature activation+5.86e-02

Pos logit contributions
opl+0.082
reating+0.081
orns+0.079
ites+0.078
mel+0.075
Neg logit contributions
ider-0.125
 slightly-0.123
 softened-0.122
 strange-0.122
Luckily-0.121
Token know
Token ID760
Feature activation+8.45e-02

Pos logit contributions
reating+0.084
ites+0.081
opl+0.080
orns+0.080
ishes+0.079
Neg logit contributions
ider-0.095
 slightly-0.091
 softened-0.091
Dis-0.089
Luckily-0.088
Token that
Token ID326
Feature activation+8.98e-02

Pos logit contributions
reating+0.099
opl+0.098
ites+0.098
orns+0.097
oes+0.095
Neg logit contributions
ider-0.153
 slightly-0.150
 softened-0.148
Dis-0.144
 horrible-0.141
Token av
Token ID1196
Feature activation+1.02e-01

Pos logit contributions
reating+0.123
ites+0.122
opl+0.120
orns+0.120
ishes+0.118
Neg logit contributions
ider-0.144
 slightly-0.144
 softened-0.143
 horrible-0.139
Dis-0.138
Tokenoc
Token ID420
Feature activation+1.76e-01

Pos logit contributions
reating+0.142
line+0.126
uggle+0.123
oes+0.122
 farms+0.118
Neg logit contributions
Dis-0.161
 softened-0.154
 spat-0.152
Luckily-0.151
 slightly-0.151
Tokenados
Token ID22484
Feature activation+9.63e-02

Pos logit contributions
reating+0.229
uggle+0.228
oes+0.220
line+0.219
ric+0.215
Neg logit contributions
ider-0.273
 softened-0.253
nown-0.248
Dis-0.245
 horrible-0.239
Token are
Token ID389
Feature activation+2.06e-02

Pos logit contributions
reating+0.147
ites+0.139
opl+0.137
orns+0.136
ans+0.132
Neg logit contributions
ider-0.146
 softened-0.142
 slightly-0.141
 ripped-0.136
Dis-0.133
Token very
Token ID845
Feature activation+1.95e-02

Pos logit contributions
reating+0.038
ites+0.037
opl+0.036
orns+0.036
oes+0.036
Neg logit contributions
ider-0.025
 slightly-0.024
 softened-0.024
Dis-0.023
Luckily-0.022
Token healthy
Token ID5448
Feature activation+7.92e-02

Pos logit contributions
reating+0.040
ites+0.039
opl+0.039
orns+0.039
oes+0.039
Neg logit contributions
ider-0.019
 slightly-0.018
 softened-0.018
Dis-0.018
Luckily-0.016
Token for
Token ID329
Feature activation+7.17e-02

Pos logit contributions
ites+0.128
opl+0.127
ric+0.125
reating+0.123
oes+0.120
Neg logit contributions
ider-0.112
 slightly-0.100
 softened-0.097
 strange-0.096
 spat-0.096
Token that
Token ID326
Feature activation+9.00e-02

Pos logit contributions
orns+0.130
opl+0.130
ishes+0.129
bur+0.126
reating+0.126
Neg logit contributions
ider-0.111
 slightly-0.107
 softened-0.104
Luckily-0.098
 strange-0.097
Token the
Token ID262
Feature activation+7.44e-02

Pos logit contributions
orns+0.121
ishes+0.119
ites+0.117
reating+0.116
opl+0.116
Neg logit contributions
 slightly-0.153
 softened-0.149
ider-0.148
Dis-0.145
 horrible-0.143
Token bad
Token ID2089
Feature activation+7.25e-02

Pos logit contributions
reating+0.109
ites+0.105
opl+0.104
fl+0.103
oes+0.102
Neg logit contributions
ider-0.118
Dis-0.111
 slightly-0.111
 softened-0.110
Luckily-0.106
Token rh
Token ID9529
Feature activation+1.09e-01

Pos logit contributions
reating+0.095
ites+0.093
orns+0.091
opl+0.090
ishes+0.090
Neg logit contributions
ider-0.125
 slightly-0.120
 softened-0.120
Dis-0.118
 horrible-0.115
Tokenin
Token ID259
Feature activation+2.80e-02

Pos logit contributions
reating+0.171
opl+0.164
oes+0.161
fl+0.161
ishes+0.157
Neg logit contributions
ider-0.167
 softened-0.154
 slightly-0.153
Dis-0.149
Luckily-0.146
Tokenoc
Token ID420
Feature activation+1.74e-01

Pos logit contributions
reating+0.027
ites+0.025
orns+0.024
oes+0.024
opl+0.024
Neg logit contributions
ider-0.059
 softened-0.057
 slightly-0.057
Dis-0.056
Luckily-0.055
Tokeneros
Token ID27498
Feature activation+5.32e-02

Pos logit contributions
reating+0.249
ites+0.227
oes+0.226
uggle+0.220
bur+0.219
Neg logit contributions
ider-0.295
 softened-0.262
 slightly-0.250
Dis-0.246
 shattered-0.239
Token was
Token ID373
Feature activation-2.04e-02

Pos logit contributions
reating+0.095
ites+0.093
opl+0.093
orns+0.092
ishes+0.091
Neg logit contributions
ider-0.066
 slightly-0.061
Dis-0.061
 softened-0.059
Conf-0.056
Token now
Token ID783
Feature activation+3.16e-02

Pos logit contributions
ider+0.026
 slightly+0.026
 softened+0.026
Dis+0.024
Luckily+0.024
Neg logit contributions
reating-0.036
ites-0.035
oes-0.035
orns-0.035
ishes-0.035
Token nice
Token ID3621
Feature activation+5.53e-02

Pos logit contributions
reating+0.051
opl+0.049
ites+0.049
orns+0.048
oes+0.048
Neg logit contributions
ider-0.046
 slightly-0.043
Dis-0.043
 softened-0.043
Luckily-0.041
Token.
Token ID13
Feature activation+1.88e-02

Pos logit contributions
ites+0.086
ans+0.084
opl+0.084
reating+0.083
orns+0.082
Neg logit contributions
 slightly-0.078
ider-0.077
 softened-0.077
 horrible-0.073
 ripped-0.072
Token he
Token ID339
Feature activation+4.67e-02

Pos logit contributions
reating+0.158
opl+0.154
ites+0.152
orns+0.150
oes+0.149
Neg logit contributions
ider-0.187
 slightly-0.182
 softened-0.180
Dis-0.176
Luckily-0.172
Token would
Token ID561
Feature activation+2.53e-02

Pos logit contributions
reating+0.077
ites+0.074
opl+0.074
orns+0.073
oes+0.073
Neg logit contributions
ider-0.066
 slightly-0.062
Dis-0.061
 softened-0.061
Luckily-0.058
Token take
Token ID1011
Feature activation+4.51e-02

Pos logit contributions
reating+0.039
orns+0.039
ites+0.038
ishes+0.037
opl+0.036
Neg logit contributions
 slightly-0.038
ider-0.037
 softened-0.037
Luckily-0.035
Dis-0.035
Token the
Token ID262
Feature activation+8.02e-02

Pos logit contributions
reating+0.071
oes+0.070
ites+0.068
opl+0.068
ishes+0.066
Neg logit contributions
 slightly-0.065
ider-0.063
 softened-0.062
Dis-0.062
 horrible-0.060
Token av
Token ID1196
Feature activation+9.97e-02

Pos logit contributions
reating+0.123
ites+0.119
opl+0.118
orns+0.117
oes+0.115
Neg logit contributions
ider-0.121
 slightly-0.116
 softened-0.114
Dis-0.113
 horrible-0.108
Tokenoc
Token ID420
Feature activation+1.73e-01

Pos logit contributions
reating+0.143
ric+0.131
line+0.125
 parties+0.124
 supervision+0.120
Neg logit contributions
Dis-0.147
Out-0.142
Just-0.141
Conf-0.139
A-0.133
Tokenados
Token ID22484
Feature activation+7.90e-02

Pos logit contributions
reating+0.231
uggle+0.222
 parties+0.218
 whenever+0.203
oes+0.200
Neg logit contributions
ider-0.278
Dis-0.254
 slightly-0.250
nown-0.248
 softened-0.245
Token and
Token ID290
Feature activation+8.03e-02

Pos logit contributions
opl+0.139
reating+0.130
ric+0.129
ans+0.125
orns+0.125
Neg logit contributions
 slightly-0.101
ider-0.100
Out-0.087
A-0.086
 softened-0.084
Token scatter
Token ID41058
Feature activation+3.92e-02

Pos logit contributions
reating+0.127
opl+0.126
ites+0.123
uggle+0.122
orns+0.120
Neg logit contributions
 softened-0.117
 slightly-0.117
 spat-0.111
Dis-0.110
 strange-0.110
Token them
Token ID606
Feature activation+4.64e-02

Pos logit contributions
opl+0.048
reating+0.048
oes+0.046
ric+0.046
uggle+0.044
Neg logit contributions
 slightly-0.072
 strange-0.069
 horrible-0.067
 softened-0.067
 ripped-0.064
Token around
Token ID1088
Feature activation+5.53e-02

Pos logit contributions
reating+0.077
opl+0.073
ites+0.073
orns+0.072
oes+0.072
Neg logit contributions
ider-0.066
 slightly-0.063
 softened-0.062
Dis-0.061
Luckily-0.058
Token it
Token ID340
Feature activation+7.73e-02

Pos logit contributions
opl+0.148
reating+0.141
ites+0.137
oes+0.136
orns+0.129
Neg logit contributions
 softened-0.174
 slightly-0.173
 strange-0.165
 horrible-0.164
 ripped-0.163
Token to
Token ID284
Feature activation+7.64e-02

Pos logit contributions
opl+0.113
reating+0.112
ites+0.109
orns+0.107
oes+0.105
Neg logit contributions
ider-0.119
 slightly-0.118
 softened-0.116
Dis-0.112
 horrible-0.111
Token the
Token ID262
Feature activation+7.54e-02

Pos logit contributions
ites+0.114
reating+0.108
oes+0.108
ishes+0.108
orns+0.107
Neg logit contributions
 slightly-0.116
 softened-0.113
 horrible-0.110
 strange-0.109
ider-0.108
Token rh
Token ID9529
Feature activation+1.07e-01

Pos logit contributions
reating+0.105
opl+0.100
ites+0.099
fl+0.097
orns+0.097
Neg logit contributions
ider-0.126
Dis-0.120
 softened-0.119
 slightly-0.119
Luckily-0.114
Tokenin
Token ID259
Feature activation+2.43e-02

Pos logit contributions
fl+0.185
reating+0.184
oes+0.177
opl+0.176
mel+0.176
Neg logit contributions
ider-0.144
 softened-0.131
 slightly-0.129
Luckily-0.128
Dis-0.125
Tokenoc
Token ID420
Feature activation+1.71e-01

Pos logit contributions
reating+0.027
orns+0.026
ites+0.025
ishes+0.024
bur+0.024
Neg logit contributions
ider-0.048
 slightly-0.045
 softened-0.045
Luckily-0.044
Dis-0.044
Tokeneros
Token ID27498
Feature activation+4.96e-02

Pos logit contributions
reating+0.261
 whenever+0.240
oes+0.240
ites+0.239
 parties+0.238
Neg logit contributions
ider-0.256
 softened-0.232
 slightly-0.224
Dis-0.218
 shattered-0.213
Token.
Token ID13
Feature activation+2.76e-02

Pos logit contributions
reating+0.081
orns+0.077
opl+0.077
ites+0.077
bur+0.075
Neg logit contributions
ider-0.071
 softened-0.069
 slightly-0.068
Dis-0.064
Luckily-0.063
Token When
Token ID1649
Feature activation+6.93e-02

Pos logit contributions
reating+0.045
ites+0.043
opl+0.043
orns+0.043
ishes+0.042
Neg logit contributions
ider-0.040
 slightly-0.038
 softened-0.037
Dis-0.037
Luckily-0.036
Token he
Token ID339
Feature activation+1.40e-02

Pos logit contributions
reating+0.079
opl+0.076
ites+0.076
orns+0.075
oes+0.075
Neg logit contributions
ider-0.132
 slightly-0.128
 softened-0.128
Dis-0.126
Luckily-0.122
Token tasted
Token ID29187
Feature activation+1.19e-02

Pos logit contributions
reating+0.024
ites+0.024
opl+0.023
orns+0.023
ie+0.023
Neg logit contributions
ider-0.019
 slightly-0.017
Dis-0.017
 softened-0.016
Luckily-0.016
Token She
Token ID1375
Feature activation+3.59e-02

Pos logit contributions
reating+0.041
ites+0.040
opl+0.040
orns+0.039
oes+0.039
Neg logit contributions
ider-0.036
 slightly-0.034
 softened-0.033
Dis-0.033
Luckily-0.032
Token loved
Token ID6151
Feature activation+4.25e-02

Pos logit contributions
ie+0.054
fl+0.054
opl+0.053
 whenever+0.052
ites+0.052
Neg logit contributions
ider-0.057
 slightly-0.051
Dis-0.050
 softened-0.047
Conf-0.047
Token to
Token ID284
Feature activation+4.14e-02

Pos logit contributions
boat+0.076
pp+0.073
boats+0.073
ites+0.072
opl+0.071
Neg logit contributions
 strange-0.051
 sharp-0.050
 curly-0.048
 tiny-0.047
 slightly-0.046
Token eat
Token ID4483
Feature activation+4.68e-02

Pos logit contributions
reating+0.057
ites+0.056
orns+0.054
ishes+0.054
opl+0.053
Neg logit contributions
ider-0.067
 softened-0.067
 slightly-0.067
Dis-0.064
Luckily-0.064
Token av
Token ID1196
Feature activation+8.57e-02

Pos logit contributions
oes+0.081
opl+0.079
ites+0.079
reating+0.079
ric+0.075
Neg logit contributions
 strange-0.059
 slightly-0.058
 spat-0.058
 horrible-0.057
 softened-0.057
Tokenoc
Token ID420
Feature activation+1.68e-01

Pos logit contributions
reating+0.113
line+0.109
ric+0.105
oes+0.103
aro+0.101
Neg logit contributions
Dis-0.132
Just-0.129
A-0.126
Conf-0.123
Out-0.123
Tokenados
Token ID22484
Feature activation+7.92e-02

Pos logit contributions
boat+0.220
reating+0.218
oes+0.216
line+0.214
uggle+0.209
Neg logit contributions
ider-0.257
 softened-0.251
Dis-0.247
Conf-0.231
 NOT-0.230
Token.
Token ID13
Feature activation+1.60e-02

Pos logit contributions
opl+0.148
ric+0.145
reating+0.141
ites+0.136
emaker+0.135
Neg logit contributions
ider-0.086
 slightly-0.083
A-0.077
 strange-0.073
Just-0.072
Token One
Token ID1881
Feature activation+2.93e-02

Pos logit contributions
reating+0.034
ites+0.034
ishes+0.032
opl+0.032
ans+0.032
Neg logit contributions
ider-0.014
 slightly-0.014
Dis-0.014
 softened-0.013
Luckily-0.013
Token day
Token ID1110
Feature activation+2.23e-02

Pos logit contributions
reating+0.034
ites+0.033
ishes+0.031
girls+0.030
opl+0.030
Neg logit contributions
 strange-0.054
 slightly-0.054
Dis-0.054
A-0.053
ider-0.053
Token,
Token ID11
Feature activation+2.23e-02

Pos logit contributions
reating+0.020
 whenever+0.020
ie+0.019
orns+0.019
ites+0.019
Neg logit contributions
ider-0.048
 slightly-0.045
Dis-0.045
 softened-0.045
 peculiar-0.044
Token She
Token ID1375
Feature activation+3.30e-02

Pos logit contributions
reating+0.037
ites+0.035
opl+0.035
orns+0.035
oes+0.035
Neg logit contributions
ider-0.034
 slightly-0.032
 softened-0.032
Dis-0.031
Luckily-0.030
Token loved
Token ID6151
Feature activation+3.84e-02

Pos logit contributions
ie+0.047
 whenever+0.045
fl+0.044
ites+0.042
bur+0.042
Neg logit contributions
ider-0.058
 slightly-0.051
Dis-0.050
 horrible-0.048
istered-0.048
Token to
Token ID284
Feature activation+3.89e-02

Pos logit contributions
boat+0.066
opl+0.065
pp+0.065
boats+0.064
ites+0.063
Neg logit contributions
 strange-0.048
 sharp-0.046
 curly-0.044
 slightly-0.044
 tiny-0.044
Token eat
Token ID4483
Feature activation+4.35e-02

Pos logit contributions
reating+0.050
ites+0.049
orns+0.048
ishes+0.047
opl+0.047
Neg logit contributions
ider-0.067
 slightly-0.066
 softened-0.066
Dis-0.064
Luckily-0.063
Token av
Token ID1196
Feature activation+7.94e-02

Pos logit contributions
oes+0.074
opl+0.073
ites+0.073
reating+0.073
ric+0.069
Neg logit contributions
 slightly-0.055
 strange-0.055
 softened-0.054
 horrible-0.053
 spat-0.053
Tokenoc
Token ID420
Feature activation+1.62e-01

Pos logit contributions
reating+0.104
line+0.101
ric+0.098
oes+0.093
aro+0.092
Neg logit contributions
Dis-0.122
Just-0.118
A-0.118
Out-0.114
Conf-0.114
Tokenados
Token ID22484
Feature activation+7.58e-02

Pos logit contributions
boat+0.208
reating+0.207
oes+0.205
line+0.201
uggle+0.196
Neg logit contributions
ider-0.256
 softened-0.249
Dis-0.243
 horrible-0.229
 slightly-0.228
Token because
Token ID780
Feature activation+6.57e-02

Pos logit contributions
opl+0.143
ric+0.140
reating+0.135
ites+0.129
ans+0.129
Neg logit contributions
ider-0.082
 slightly-0.081
A-0.073
 strange-0.072
 really-0.070
Token they
Token ID484
Feature activation+5.60e-02

Pos logit contributions
ans+0.107
opl+0.105
ric+0.102
ites+0.100
uggle+0.099
Neg logit contributions
 slightly-0.097
 horrible-0.092
 strange-0.091
Dis-0.090
A-0.090
Token were
Token ID547
Feature activation-1.45e-02

Pos logit contributions
reating+0.102
ites+0.098
opl+0.098
oes+0.096
orns+0.094
Neg logit contributions
 softened-0.069
 slightly-0.068
ider-0.067
Dis-0.066
 ripped-0.064
Token y
Token ID331
Feature activation+5.42e-02

Pos logit contributions
 slightly+0.012
 strange+0.012
 horrible+0.011
 softened+0.011
 sharp+0.011
Neg logit contributions
oes-0.032
reating-0.032
uggle-0.032
ites-0.032
opl-0.032
Token't
Token ID470
Feature activation+6.58e-02

Pos logit contributions
ishes+0.103
reating+0.102
ites+0.100
orns+0.099
opl+0.098
Neg logit contributions
ider-0.068
Dis-0.068
Luckily-0.067
 softened-0.065
 slightly-0.065
Token touch
Token ID3638
Feature activation+5.01e-02

Pos logit contributions
reating+0.088
opl+0.086
ites+0.085
orns+0.085
ishes+0.084
Neg logit contributions
ider-0.113
 slightly-0.109
 softened-0.108
Dis-0.107
Luckily-0.105
Token the
Token ID262
Feature activation+7.51e-02

Pos logit contributions
oes+0.072
reating+0.069
ites+0.069
uggle+0.067
pp+0.067
Neg logit contributions
 horrible-0.077
 slightly-0.077
 strange-0.077
A-0.072
 softened-0.072
Token rh
Token ID9529
Feature activation+9.64e-02

Pos logit contributions
reating+0.112
ites+0.109
opl+0.108
orns+0.107
oes+0.106
Neg logit contributions
ider-0.115
 slightly-0.111
 softened-0.108
Dis-0.107
 horrible-0.104
Tokenin
Token ID259
Feature activation+2.00e-02

Pos logit contributions
reating+0.175
fl+0.171
oes+0.168
opl+0.167
mel+0.166
Neg logit contributions
ider-0.121
 softened-0.100
 badly-0.099
Luckily-0.098
 slightly-0.098
Tokenoc
Token ID420
Feature activation+1.59e-01

Pos logit contributions
reating+0.025
ites+0.023
orns+0.022
oes+0.021
ishes+0.021
Neg logit contributions
ider-0.036
 slightly-0.034
 horrible-0.033
 shattered-0.033
 spat-0.033
Tokeneros
Token ID27498
Feature activation+4.83e-02

Pos logit contributions
reating+0.247
 whenever+0.225
ites+0.225
oes+0.224
 parties+0.223
Neg logit contributions
ider-0.243
 softened-0.210
Dis-0.203
 slightly-0.203
nown-0.201
Token!
Token ID0
Feature activation+1.83e-02

Pos logit contributions
reating+0.073
ites+0.071
orns+0.070
opl+0.069
oes+0.069
Neg logit contributions
ider-0.074
 slightly-0.069
 softened-0.067
 horrible-0.066
Dis-0.065
Token It
Token ID632
Feature activation+3.40e-02

Pos logit contributions
reating+0.029
opl+0.028
oes+0.028
ites+0.028
orns+0.027
Neg logit contributions
ider-0.026
 softened-0.025
 slightly-0.025
Dis-0.025
 horrible-0.023
Token might
Token ID1244
Feature activation+2.25e-02

Pos logit contributions
reating+0.058
ites+0.053
orns+0.051
opl+0.051
mel+0.050
Neg logit contributions
ider-0.050
 softened-0.049
 slightly-0.048
 ripped-0.048
 horrible-0.047
Token bite
Token ID13197
Feature activation+6.38e-02

Pos logit contributions
ites+0.046
ishes+0.046
reating+0.045
orns+0.044
opl+0.043
Neg logit contributions
 slightly-0.023
 horrible-0.023
 ripped-0.023
ider-0.023
 strange-0.023
Token couldn
Token ID3521
Feature activation+6.96e-03

Pos logit contributions
reating+0.046
opl+0.044
ites+0.044
oes+0.044
orns+0.043
Neg logit contributions
ider-0.043
Dis-0.040
 softened-0.040
 slightly-0.040
Luckily-0.038
Token't
Token ID470
Feature activation+3.57e-02

Pos logit contributions
reating+0.012
ishes+0.012
ites+0.011
oes+0.011
orns+0.011
Neg logit contributions
ider-0.010
 slightly-0.009
Dis-0.009
 softened-0.009
Luckily-0.009
Token find
Token ID1064
Feature activation+2.05e-02

Pos logit contributions
reating+0.045
ites+0.045
opl+0.043
ishes+0.043
orns+0.043
Neg logit contributions
ider-0.064
 slightly-0.063
Dis-0.061
 softened-0.061
Luckily-0.060
Token any
Token ID597
Feature activation+4.97e-02

Pos logit contributions
ites+0.029
reating+0.029
opl+0.029
oes+0.027
ans+0.027
Neg logit contributions
 slightly-0.034
ider-0.031
 softened-0.031
 horrible-0.030
 strange-0.030
Token av
Token ID1196
Feature activation+8.05e-02

Pos logit contributions
reating+0.070
orns+0.066
ites+0.066
opl+0.066
oes+0.065
Neg logit contributions
ider-0.084
Dis-0.080
 softened-0.079
 slightly-0.078
Luckily-0.076
Tokenoc
Token ID420
Feature activation+1.59e-01

Pos logit contributions
reating+0.114
uggle+0.101
oes+0.098
 parties+0.098
ric+0.097
Neg logit contributions
Dis-0.126
 slightly-0.121
Just-0.120
Conf-0.119
Out-0.115
Tokenados
Token ID22484
Feature activation+6.85e-02

Pos logit contributions
reating+0.222
uggle+0.216
oes+0.206
ric+0.198
ie+0.196
Neg logit contributions
ider-0.246
Dis-0.229
nown-0.225
 softened-0.224
 slightly-0.220
Token.
Token ID13
Feature activation+6.13e-03

Pos logit contributions
opl+0.123
reating+0.121
ites+0.120
ans+0.116
uggle+0.116
Neg logit contributions
ider-0.087
 slightly-0.083
 softened-0.075
 ripped-0.074
 interrupted-0.072
Token Finally
Token ID9461
Feature activation+4.62e-02

Pos logit contributions
reating+0.008
orns+0.008
ites+0.008
opl+0.008
oes+0.008
Neg logit contributions
ider-0.010
 softened-0.009
 slightly-0.009
Dis-0.009
 horrible-0.009
Token,
Token ID11
Feature activation+2.05e-02

Pos logit contributions
reating+0.044
ie+0.043
 whenever+0.043
orns+0.042
ites+0.041
Neg logit contributions
ider-0.095
Dis-0.090
 slightly-0.089
 softened-0.088
Conf-0.087
Token she
Token ID673
Feature activation+9.90e-03

Pos logit contributions
reating+0.027
opl+0.027
ites+0.026
orns+0.026
oes+0.026
Neg logit contributions
ider-0.036
 slightly-0.034
 softened-0.034
Dis-0.034
Luckily-0.033
Token\n
Token ID198
Feature activation+3.50e-02

Pos logit contributions
ishes+0.075
reating+0.075
fl+0.072
oes+0.070
ites+0.070
Neg logit contributions
 slightly-0.063
Dis-0.063
Luckily-0.062
ider-0.061
Conf-0.060
Token"
Token ID1
Feature activation+6.91e-02

Pos logit contributions
reating+0.060
ites+0.058
opl+0.058
orns+0.058
ishes+0.057
Neg logit contributions
ider-0.047
 slightly-0.044
Dis-0.044
 softened-0.044
Luckily-0.042
TokenThe
Token ID464
Feature activation+6.48e-02

Pos logit contributions
reating+0.118
ites+0.117
fl+0.114
bur+0.113
oes+0.112
Neg logit contributions
ider-0.089
 softened-0.085
 slightly-0.084
 horrible-0.083
 peculiar-0.080
Token rh
Token ID9529
Feature activation+9.40e-02

Pos logit contributions
reating+0.090
opl+0.086
ites+0.086
orns+0.084
oes+0.083
Neg logit contributions
ider-0.108
 softened-0.104
Dis-0.102
 slightly-0.102
Luckily-0.098
Tokenin
Token ID259
Feature activation+1.50e-02

Pos logit contributions
reating+0.159
opl+0.154
fl+0.152
oes+0.149
mel+0.148
Neg logit contributions
ider-0.135
 softened-0.119
 slightly-0.119
Dis-0.117
Luckily-0.115
Tokenoc
Token ID420
Feature activation+1.59e-01

Pos logit contributions
reating+0.015
orns+0.013
ites+0.013
opl+0.013
ishes+0.013
Neg logit contributions
ider-0.032
 slightly-0.031
 softened-0.030
Dis-0.030
Luckily-0.030
Tokeneros
Token ID27498
Feature activation+3.84e-02

Pos logit contributions
reating+0.242
 whenever+0.226
ites+0.224
oes+0.221
uggle+0.220
Neg logit contributions
ider-0.247
 softened-0.214
 slightly-0.211
Dis-0.209
 shattered-0.199
Token is
Token ID318
Feature activation-2.16e-02

Pos logit contributions
reating+0.052
ites+0.050
opl+0.050
orns+0.049
oes+0.049
Neg logit contributions
ider-0.065
 slightly-0.062
 softened-0.061
Dis-0.061
Luckily-0.059
Token in
Token ID287
Feature activation+7.96e-03

Pos logit contributions
ider+0.030
Dis+0.028
 softened+0.027
 slightly+0.027
Conf+0.026
Neg logit contributions
reating-0.037
opl-0.036
ites-0.035
oes-0.035
orns-0.035
Token there
Token ID612
Feature activation+3.34e-02

Pos logit contributions
oes+0.011
reating+0.011
ites+0.011
ishes+0.011
fl+0.011
Neg logit contributions
 slightly-0.013
ider-0.013
Dis-0.012
 softened-0.012
 horrible-0.012
Token,
Token ID11
Feature activation+3.98e-02

Pos logit contributions
reating+0.046
ites+0.044
opl+0.044
orns+0.043
ishes+0.043
Neg logit contributions
ider-0.057
 slightly-0.054
 softened-0.053
Dis-0.053
Luckily-0.051
Token and
Token ID290
Feature activation+7.79e-02

Pos logit contributions
orns+0.061
reating+0.061
oes+0.060
opl+0.060
ites+0.060
Neg logit contributions
 slightly-0.051
 softened-0.048
ider-0.047
Dis-0.046
Luckily-0.045
Token all
Token ID477
Feature activation+7.65e-02

Pos logit contributions
opl+0.123
reating+0.121
orns+0.119
ites+0.118
oes+0.117
Neg logit contributions
 slightly-0.116
 softened-0.116
ider-0.109
 ripped-0.109
 spat-0.109
Token the
Token ID262
Feature activation+5.99e-02

Pos logit contributions
reating+0.122
bur+0.116
orns+0.116
oes+0.116
ans+0.115
Neg logit contributions
 softened-0.112
 shattered-0.105
 slightly-0.105
 strange-0.103
 horrible-0.102
Token other
Token ID584
Feature activation+5.10e-02

Pos logit contributions
reating+0.079
ites+0.076
opl+0.075
orns+0.074
oes+0.074
Neg logit contributions
ider-0.105
 slightly-0.100
 softened-0.099
Dis-0.099
Luckily-0.095
Token av
Token ID1196
Feature activation+7.20e-02

Pos logit contributions
reating+0.064
ites+0.061
opl+0.060
oes+0.059
orns+0.059
Neg logit contributions
ider-0.093
 slightly-0.089
 softened-0.088
Dis-0.088
Luckily-0.085
Tokenoc
Token ID420
Feature activation+1.58e-01

Pos logit contributions
reating+0.118
oes+0.111
line+0.108
ric+0.104
 festivals+0.102
Neg logit contributions
Dis-0.100
Just-0.093
Luckily-0.093
Conf-0.092
 softened-0.090
Tokenados
Token ID22484
Feature activation+5.87e-02

Pos logit contributions
reating+0.208
 whenever+0.189
oes+0.187
uggle+0.186
line+0.180
Neg logit contributions
ider-0.276
 softened-0.252
Dis-0.246
nown-0.238
 slightly-0.238
Token were
Token ID547
Feature activation-1.22e-02

Pos logit contributions
reating+0.108
opl+0.104
oes+0.102
orns+0.099
ites+0.098
Neg logit contributions
ider-0.072
 softened-0.071
 slightly-0.068
 ripped-0.065
Luckily-0.064
Token very
Token ID845
Feature activation-2.09e-04

Pos logit contributions
 slightly+0.015
 softened+0.014
ider+0.014
 horrible+0.013
Dis+0.013
Neg logit contributions
reating-0.023
oes-0.023
ites-0.022
opl-0.022
orns-0.021
Token happy
Token ID3772
Feature activation+6.69e-02

Pos logit contributions
ider+0.000
 slightly+0.000
Dis+0.000
 softened+0.000
Luckily+0.000
Neg logit contributions
reating-0.000
ites-0.000
opl-0.000
oes-0.000
orns-0.000
Token too
Token ID1165
Feature activation+4.88e-02

Pos logit contributions
opl+0.115
reating+0.114
ites+0.113
oes+0.113
bur+0.113
Neg logit contributions
ider-0.089
 slightly-0.084
 softened-0.082
 horrible-0.076
Dis-0.076
Tokens
Token ID82
Feature activation+3.07e-02

Pos logit contributions
reating+0.095
ites+0.090
orns+0.090
ishes+0.090
opl+0.088
Neg logit contributions
ider-0.110
 softened-0.105
 slightly-0.105
Dis-0.104
Luckily-0.100
Token wrong
Token ID2642
Feature activation+2.07e-02

Pos logit contributions
reating+0.048
opl+0.047
ites+0.047
oes+0.046
orns+0.046
Neg logit contributions
ider-0.046
 slightly-0.044
Dis-0.043
 softened-0.043
Luckily-0.041
Token,
Token ID11
Feature activation+3.08e-02

Pos logit contributions
reating+0.025
fl+0.025
opl+0.024
ites+0.024
ie+0.024
Neg logit contributions
ider-0.038
Dis-0.036
 softened-0.036
 slightly-0.035
Luckily-0.034
Token rh
Token ID9529
Feature activation+8.76e-02

Pos logit contributions
reating+0.048
ites+0.047
opl+0.047
orns+0.046
oes+0.046
Neg logit contributions
ider-0.045
 slightly-0.043
 softened-0.043
Dis-0.043
Luckily-0.041
Tokenin
Token ID259
Feature activation+1.13e-02

Pos logit contributions
reating+0.122
ites+0.116
opl+0.116
orns+0.115
oes+0.114
Neg logit contributions
ider-0.147
 slightly-0.140
 softened-0.139
Dis-0.139
Luckily-0.133
Tokenoc
Token ID420
Feature activation+1.57e-01

Pos logit contributions
reating+0.014
opl+0.014
ites+0.014
oes+0.014
ie+0.014
Neg logit contributions
ider-0.020
 slightly-0.019
Dis-0.019
 softened-0.019
Luckily-0.018
Tokeneros
Token ID27498
Feature activation+3.73e-02

Pos logit contributions
reating+0.202
ites+0.187
oes+0.187
 whenever+0.185
uggle+0.184
Neg logit contributions
ider-0.280
 softened-0.254
Dis-0.250
 slightly-0.247
 horrible-0.236
Token?
Token ID30
Feature activation-4.52e-03

Pos logit contributions
reating+0.042
opl+0.042
ites+0.041
ie+0.041
orns+0.040
Neg logit contributions
ider-0.071
Dis-0.069
 slightly-0.068
 softened-0.066
Luckily-0.065
Tokenâ
Token ID22940
Feature activation+4.48e-02

Pos logit contributions
ider+0.007
 softened+0.007
 slightly+0.006
Dis+0.006
 horrible+0.006
Neg logit contributions
reating-0.007
ites-0.007
opl-0.007
oes-0.006
orns-0.006
Token
Token ID26391
Feature activation-1.19e-02

Pos logit contributions
opl+0.064
reating+0.061
oes+0.061
orns+0.060
ites+0.059
Neg logit contributions
ider-0.074
 slightly-0.072
Dis-0.070
 softened-0.070
Luckily-0.067
Token said
Token ID531
Feature activation+4.17e-02

Pos logit contributions
ider+0.017
Luckily+0.017
 slightly+0.016
Dis+0.016
 softened+0.016
Neg logit contributions
reating-0.022
ishes-0.019
ie-0.019
oes-0.019
orns-0.018
Token,"
Token ID553
Feature activation+6.80e-02

Pos logit contributions
reating+0.087
ites+0.084
opl+0.083
oes+0.083
orns+0.083
Neg logit contributions
ider-0.112
 slightly-0.110
 softened-0.107
Dis-0.106
 horrible-0.102
Token said
Token ID531
Feature activation+5.30e-02

Pos logit contributions
reating+0.098
ites+0.094
opl+0.093
orns+0.092
oes+0.092
Neg logit contributions
ider-0.111
 slightly-0.105
Dis-0.104
 softened-0.104
Luckily-0.099
Token the
Token ID262
Feature activation+5.53e-02

Pos logit contributions
ie+0.057
 whenever+0.056
reating+0.056
mel+0.055
fl+0.054
Neg logit contributions
ider-0.104
Dis-0.096
 slightly-0.093
Luckily-0.091
 peculiar-0.091
Token rh
Token ID9529
Feature activation+8.65e-02

Pos logit contributions
reating+0.073
fl+0.069
oes+0.069
ites+0.069
ie+0.069
Neg logit contributions
ider-0.092
Dis-0.091
 softened-0.088
 slightly-0.088
Luckily-0.085
Tokenin
Token ID259
Feature activation+4.99e-03

Pos logit contributions
reating+0.140
opl+0.135
fl+0.134
oes+0.132
ishes+0.131
Neg logit contributions
ider-0.130
 softened-0.118
 slightly-0.118
Dis-0.116
Luckily-0.114
Tokenoc
Token ID420
Feature activation+1.56e-01

Pos logit contributions
reating+0.006
ites+0.006
opl+0.006
orns+0.006
oes+0.006
Neg logit contributions
ider-0.009
 slightly-0.009
 softened-0.009
Dis-0.009
Luckily-0.008
Tokeneros
Token ID27498
Feature activation+3.06e-02

Pos logit contributions
reating+0.234
ites+0.220
 whenever+0.219
oes+0.214
 parties+0.209
Neg logit contributions
ider-0.249
 softened-0.212
 slightly-0.210
Dis-0.208
nown-0.198
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
ie+0.037
 whenever+0.033
fl+0.032
cut+0.031
oes+0.031
Neg logit contributions
ider-0.057
Dis-0.057
 slightly-0.053
Out-0.052
 horrible-0.052
Token\n
Token ID198
Feature activation+2.43e-02

Pos logit contributions
oes+0.017
mel+0.017
 whenever+0.017
ites+0.017
opl+0.017
Neg logit contributions
ider-0.027
Dis-0.025
 softened-0.025
 horrible-0.025
 peculiar-0.025
Token\n
Token ID198
Feature activation+1.77e-02

Pos logit contributions
reating+0.032
ites+0.031
opl+0.031
orns+0.030
oes+0.030
Neg logit contributions
ider-0.042
 slightly-0.040
 softened-0.040
Dis-0.040
Luckily-0.038
TokenThe
Token ID464
Feature activation+3.71e-02

Pos logit contributions
ites+0.026
reating+0.026
opl+0.025
orns+0.025
oes+0.025
Neg logit contributions
ider-0.027
 softened-0.025
 slightly-0.025
 horrible-0.024
Dis-0.024
TokenI
Token ID40
Feature activation+2.10e-02

Pos logit contributions
reating+0.049
opl+0.048
ishes+0.047
orns+0.047
ites+0.047
Neg logit contributions
ider-0.027
Dis-0.026
 slightly-0.025
 softened-0.025
Luckily-0.024
Token will
Token ID481
Feature activation+2.87e-02

Pos logit contributions
reating+0.024
ites+0.024
oes+0.023
opl+0.023
ishes+0.023
Neg logit contributions
ider-0.040
 slightly-0.038
Dis-0.037
 softened-0.037
Conf-0.035
Token buy
Token ID2822
Feature activation+5.63e-02

Pos logit contributions
reating+0.034
ites+0.032
opl+0.032
oes+0.032
ishes+0.030
Neg logit contributions
ider-0.054
Dis-0.052
 slightly-0.051
 softened-0.051
Conf-0.049
Token more
Token ID517
Feature activation+7.15e-02

Pos logit contributions
reating+0.065
opl+0.064
oes+0.064
ites+0.062
orns+0.060
Neg logit contributions
ider-0.102
 slightly-0.102
 softened-0.100
Dis-0.100
 horrible-0.098
Token av
Token ID1196
Feature activation+6.99e-02

Pos logit contributions
bur+0.109
oes+0.108
ric+0.106
reating+0.106
opl+0.105
Neg logit contributions
 slightly-0.103
ider-0.100
 strange-0.099
 softened-0.095
 peculiar-0.094
Tokenoc
Token ID420
Feature activation+1.56e-01

Pos logit contributions
ric+0.114
reating+0.108
oes+0.102
line+0.100
fl+0.098
Neg logit contributions
Dis-0.094
Just-0.087
Conf-0.086
Out-0.084
ider-0.084
Tokenados
Token ID22484
Feature activation+6.94e-02

Pos logit contributions
reating+0.205
oes+0.195
ric+0.194
uggle+0.189
bur+0.186
Neg logit contributions
ider-0.267
Dis-0.242
 softened-0.237
nown-0.229
 slightly-0.224
Token tomorrow
Token ID9439
Feature activation+4.95e-02

Pos logit contributions
ric+0.126
opl+0.125
ans+0.119
uggle+0.116
orns+0.116
Neg logit contributions
ider-0.092
 slightly-0.075
A-0.066
 ripped-0.065
 badly-0.065
Token.
Token ID13
Feature activation+1.41e-02

Pos logit contributions
reating+0.080
orns+0.077
ites+0.077
opl+0.076
ishes+0.075
Neg logit contributions
ider-0.073
 slightly-0.067
Dis-0.063
 ripped-0.060
 spat-0.060
Token You
Token ID921
Feature activation+3.89e-02

Pos logit contributions
reating+0.023
ites+0.022
opl+0.022
orns+0.022
ishes+0.021
Neg logit contributions
ider-0.021
 slightly-0.019
Dis-0.019
 softened-0.019
Luckily-0.018
Token can
Token ID460
Feature activation+2.61e-02

Pos logit contributions
reating+0.052
opl+0.050
ites+0.050
orns+0.049
oes+0.048
Neg logit contributions
ider-0.067
 slightly-0.065
 softened-0.064
Dis-0.064
Luckily-0.062
Token,
Token ID11
Feature activation+4.40e-02

Pos logit contributions
 whenever+0.042
reating+0.040
ie+0.039
mel+0.038
ites+0.037
Neg logit contributions
ider-0.082
Dis-0.077
 slightly-0.076
Out-0.073
Conf-0.073
Token "
Token ID366
Feature activation+2.27e-02

Pos logit contributions
reating+0.056
 whenever+0.056
mel+0.053
fl+0.053
oes+0.053
Neg logit contributions
ider-0.076
Dis-0.069
 slightly-0.069
 softened-0.067
istered-0.067
TokenI
Token ID40
Feature activation+1.48e-02

Pos logit contributions
reating+0.046
opl+0.045
ites+0.045
ishes+0.044
orns+0.044
Neg logit contributions
ider-0.024
Dis-0.023
 slightly-0.022
 softened-0.022
Luckily-0.021
Token love
Token ID1842
Feature activation+5.82e-02

Pos logit contributions
reating+0.018
oes+0.018
ites+0.018
ie+0.018
opl+0.017
Neg logit contributions
ider-0.027
 slightly-0.025
Dis-0.025
 softened-0.024
Conf-0.024
Token av
Token ID1196
Feature activation+7.67e-02

Pos logit contributions
ites+0.090
boat+0.087
oes+0.086
uggle+0.086
ans+0.086
Neg logit contributions
 horrible-0.083
 softened-0.082
 strange-0.081
 slightly-0.079
 sharp-0.079
Tokenoc
Token ID420
Feature activation+1.53e-01

Pos logit contributions
reating+0.110
oes+0.102
ric+0.100
line+0.100
fl+0.094
Neg logit contributions
Dis-0.118
 softened-0.110
Conf-0.108
Luckily-0.108
Just-0.108
Tokenados
Token ID22484
Feature activation+7.85e-02

Pos logit contributions
reating+0.202
oes+0.191
ric+0.189
boat+0.184
uggle+0.184
Neg logit contributions
ider-0.251
 softened-0.242
Dis-0.236
nown-0.225
 horrible-0.222
Token!
Token ID0
Feature activation+1.03e-02

Pos logit contributions
opl+0.141
ric+0.140
ans+0.139
ites+0.138
uggle+0.135
Neg logit contributions
ider-0.098
 slightly-0.079
 softened-0.077
 horrible-0.076
 strange-0.075
Token They
Token ID1119
Feature activation+5.13e-02

Pos logit contributions
reating+0.017
ites+0.016
orns+0.016
opl+0.016
ishes+0.016
Neg logit contributions
ider-0.015
 slightly-0.014
Dis-0.014
 softened-0.014
Luckily-0.013
Token are
Token ID389
Feature activation-1.20e-02

Pos logit contributions
reating+0.072
opl+0.068
ites+0.065
orns+0.064
oes+0.062
Neg logit contributions
 softened-0.086
ider-0.084
 slightly-0.083
 ripped-0.083
 horrible-0.082
Token so
Token ID523
Feature activation+2.06e-02

Pos logit contributions
 slightly+0.011
 horrible+0.011
 strange+0.010
 softened+0.010
ider+0.010
Neg logit contributions
ites-0.026
reating-0.026
oes-0.025
uggle-0.025
opl-0.025
Token They
Token ID1119
Feature activation+6.28e-02

Pos logit contributions
reating+0.046
opl+0.043
orns+0.043
ites+0.043
ishes+0.043
Neg logit contributions
ider-0.038
 slightly-0.036
Dis-0.035
 softened-0.035
Luckily-0.034
Token find
Token ID1064
Feature activation+1.89e-02

Pos logit contributions
reating+0.101
opl+0.095
orns+0.094
ites+0.093
oes+0.093
Neg logit contributions
ider-0.093
 softened-0.090
 slightly-0.090
Dis-0.087
 ripped-0.085
Token the
Token ID262
Feature activation+5.17e-02

Pos logit contributions
reating+0.037
opl+0.036
ites+0.035
oes+0.035
orns+0.035
Neg logit contributions
ider-0.021
 slightly-0.020
 softened-0.020
Dis-0.019
 horrible-0.019
Token rh
Token ID9529
Feature activation+8.34e-02

Pos logit contributions
reating+0.081
ites+0.079
opl+0.078
orns+0.077
oes+0.076
Neg logit contributions
ider-0.077
 slightly-0.073
 softened-0.072
Dis-0.071
 horrible-0.069
Tokenin
Token ID259
Feature activation+7.56e-03

Pos logit contributions
reating+0.148
opl+0.144
fl+0.144
oes+0.140
mel+0.139
Neg logit contributions
ider-0.111
 slightly-0.097
 softened-0.096
Dis-0.093
Luckily-0.093
Tokenoc
Token ID420
Feature activation+1.52e-01

Pos logit contributions
reating+0.012
opl+0.010
oes+0.010
 parties+0.010
orns+0.009
Neg logit contributions
 slightly-0.012
 strange-0.012
ider-0.012
 shattered-0.011
 sharp-0.011
Tokeneros
Token ID27498
Feature activation+3.62e-02

Pos logit contributions
reating+0.235
 whenever+0.212
oes+0.209
ites+0.206
ie+0.205
Neg logit contributions
ider-0.238
 softened-0.204
 slightly-0.202
Dis-0.199
 shattered-0.192
Token.
Token ID13
Feature activation+1.61e-02

Pos logit contributions
reating+0.060
opl+0.058
ites+0.056
orns+0.055
oes+0.055
Neg logit contributions
ider-0.051
 slightly-0.050
 softened-0.049
Dis-0.047
 ripped-0.046
Token They
Token ID1119
Feature activation+4.96e-02

Pos logit contributions
reating+0.023
ites+0.022
opl+0.022
orns+0.022
oes+0.021
Neg logit contributions
ider-0.026
 slightly-0.025
 softened-0.025
Dis-0.025
Luckily-0.024
Token are
Token ID389
Feature activation-1.41e-02

Pos logit contributions
reating+0.073
opl+0.069
ites+0.069
orns+0.068
oes+0.068
Neg logit contributions
ider-0.079
 slightly-0.077
 softened-0.077
Dis-0.075
Luckily-0.072
Token gentle
Token ID10296
Feature activation+6.77e-02

Pos logit contributions
ider+0.018
Dis+0.017
 softened+0.016
Conf+0.016
 slightly+0.016
Neg logit contributions
reating-0.025
ites-0.025
ishes-0.024
opl-0.024
mel-0.024
Token They
Token ID1119
Feature activation+7.43e-02

Pos logit contributions
reating+0.068
ites+0.065
orns+0.064
ishes+0.064
opl+0.064
Neg logit contributions
ider-0.060
 slightly-0.057
Dis-0.055
 softened-0.055
Luckily-0.053
Token saw
Token ID2497
Feature activation+2.75e-02

Pos logit contributions
ites+0.107
bur+0.106
ie+0.106
ishes+0.105
mel+0.105
Neg logit contributions
ider-0.122
Dis-0.109
 slightly-0.107
 peculiar-0.102
Conf-0.102
Token the
Token ID262
Feature activation+5.31e-02

Pos logit contributions
reating+0.044
opl+0.044
ites+0.043
orns+0.043
oes+0.042
Neg logit contributions
ider-0.038
 softened-0.038
 slightly-0.038
Dis-0.036
 horrible-0.036
Token rh
Token ID9529
Feature activation+8.08e-02

Pos logit contributions
reating+0.084
ites+0.081
opl+0.081
orns+0.080
oes+0.080
Neg logit contributions
ider-0.078
 slightly-0.074
 softened-0.073
Dis-0.073
Luckily-0.070
Tokenin
Token ID259
Feature activation+3.50e-03

Pos logit contributions
reating+0.137
fl+0.132
opl+0.132
oes+0.131
mel+0.129
Neg logit contributions
ider-0.112
 softened-0.102
 slightly-0.099
Luckily-0.097
Dis-0.096
Tokenoc
Token ID420
Feature activation+1.51e-01

Pos logit contributions
reating+0.003
ites+0.003
oes+0.003
orns+0.003
opl+0.003
Neg logit contributions
ider-0.007
 slightly-0.007
 softened-0.007
Dis-0.007
 horrible-0.007
Tokeneros
Token ID27498
Feature activation+3.06e-02

Pos logit contributions
reating+0.233
oes+0.209
 whenever+0.208
ites+0.207
 parties+0.203
Neg logit contributions
ider-0.234
 softened-0.211
 slightly-0.203
Dis-0.197
 shattered-0.196
Token.
Token ID13
Feature activation+2.35e-02

Pos logit contributions
reating+0.045
ites+0.043
opl+0.043
orns+0.042
oes+0.042
Neg logit contributions
ider-0.049
 slightly-0.047
 softened-0.046
Dis-0.045
Luckily-0.044
Token They
Token ID1119
Feature activation+5.63e-02

Pos logit contributions
reating+0.031
ites+0.031
opl+0.030
oes+0.030
orns+0.029
Neg logit contributions
ider-0.041
 softened-0.038
Dis-0.038
 slightly-0.037
Luckily-0.036
Token thanked
Token ID26280
Feature activation+9.68e-02

Pos logit contributions
ishes+0.079
ites+0.079
bur+0.078
ie+0.077
mel+0.077
Neg logit contributions
ider-0.096
Dis-0.086
 slightly-0.080
 peculiar-0.078
Conf-0.077
Token Mom
Token ID11254
Feature activation+1.01e-01

Pos logit contributions
opl+0.148
orns+0.141
ites+0.138
reating+0.137
oes+0.133
Neg logit contributions
 slightly-0.148
 softened-0.147
Dis-0.141
A-0.140
 strange-0.140
Token turned
Token ID2900
Feature activation+5.60e-04

Pos logit contributions
ie+0.045
fl+0.043
orns+0.042
 whenever+0.042
reating+0.042
Neg logit contributions
ider-0.034
 slightly-0.031
Dis-0.030
Conf-0.028
 peculiar-0.028
Token out
Token ID503
Feature activation+9.97e-03

Pos logit contributions
reating+0.001
ie+0.001
ishes+0.001
orns+0.001
mel+0.001
Neg logit contributions
ider-0.001
Dis-0.001
Luckily-0.001
 slightly-0.001
Conf-0.001
Token that
Token ID326
Feature activation+5.45e-02

Pos logit contributions
 whenever+0.009
ie+0.009
reating+0.009
ites+0.008
mel+0.008
Neg logit contributions
ider-0.022
Dis-0.021
Luckily-0.020
 softened-0.019
 slightly-0.019
Token the
Token ID262
Feature activation+4.05e-02

Pos logit contributions
reating+0.067
opl+0.065
ites+0.065
orns+0.063
oes+0.063
Neg logit contributions
ider-0.098
 slightly-0.096
 softened-0.096
Dis-0.094
 horrible-0.091
Token av
Token ID1196
Feature activation+6.17e-02

Pos logit contributions
reating+0.053
ites+0.050
opl+0.050
orns+0.050
fl+0.050
Neg logit contributions
ider-0.071
 slightly-0.067
Dis-0.067
 softened-0.067
Luckily-0.064
Tokenoc
Token ID420
Feature activation+1.50e-01

Pos logit contributions
reating+0.091
oes+0.080
 parties+0.078
mel+0.077
line+0.075
Neg logit contributions
Dis-0.101
Conf-0.093
 softened-0.093
Just-0.093
 slightly-0.093
Tokenados
Token ID22484
Feature activation+4.53e-02

Pos logit contributions
reating+0.205
 whenever+0.190
oes+0.188
 parties+0.187
uggle+0.185
Neg logit contributions
ider-0.246
 softened-0.228
nown-0.221
Dis-0.221
 slightly-0.218
Token were
Token ID547
Feature activation-2.12e-02

Pos logit contributions
reating+0.077
ites+0.075
opl+0.074
orns+0.074
oes+0.073
Neg logit contributions
ider-0.061
 slightly-0.058
Dis-0.057
 softened-0.057
Luckily-0.054
Token bad
Token ID2089
Feature activation+5.54e-02

Pos logit contributions
ider+0.025
Dis+0.023
 slightly+0.022
 softened+0.022
Conf+0.022
Neg logit contributions
reating-0.039
orns-0.038
ishes-0.038
opl-0.038
mel-0.038
Token and
Token ID290
Feature activation+5.56e-02

Pos logit contributions
reating+0.077
ites+0.077
opl+0.077
bur+0.074
oes+0.074
Neg logit contributions
ider-0.091
 softened-0.087
 slightly-0.086
Dis-0.083
 horrible-0.083
Token made
Token ID925
Feature activation+3.48e-02

Pos logit contributions
reating+0.102
opl+0.101
ites+0.100
oes+0.098
orns+0.098
Neg logit contributions
 softened-0.067
 slightly-0.067
ider-0.066
 horrible-0.063
Dis-0.062
Token a
Token ID257
Feature activation-8.84e-02

Pos logit contributions
 slightly+0.043
Dis+0.040
ider+0.040
 softened+0.037
Luckily+0.037
Neg logit contributions
reating-0.162
ites-0.156
oes-0.156
opl-0.155
ishes-0.155
Token big
Token ID1263
Feature activation-6.81e-02

Pos logit contributions
ider+0.071
 slightly+0.068
Dis+0.067
 softened+0.064
Luckily+0.062
Neg logit contributions
reating-0.201
ites-0.193
orns-0.192
opl-0.192
oes-0.192
Token forest
Token ID8222
Feature activation-8.56e-02

Pos logit contributions
 strange+0.088
 slightly+0.087
Dis+0.084
 ripped+0.084
 horrible+0.083
Neg logit contributions
reating-0.120
oes-0.111
orns-0.108
mel-0.107
ishes-0.106
Token,
Token ID11
Feature activation-5.57e-02

Pos logit contributions
ider+0.153
 slightly+0.146
 softened+0.144
Dis+0.144
Luckily+0.139
Neg logit contributions
reating-0.111
ites-0.106
opl-0.105
orns-0.104
oes-0.103
Token there
Token ID612
Feature activation-4.89e-02

Pos logit contributions
ider+0.099
 softened+0.094
Dis+0.091
ED+0.089
 slightly+0.088
Neg logit contributions
ie-0.071
oes-0.069
fl-0.068
ites-0.068
 whenever-0.067
Token was
Token ID373
Feature activation-1.22e-01

Pos logit contributions
ider+0.065
 slightly+0.060
 softened+0.060
Dis+0.059
Luckily+0.056
Neg logit contributions
reating-0.086
ites-0.083
opl-0.083
orns-0.082
oes-0.081
Token a
Token ID257
Feature activation-1.00e-01

Pos logit contributions
ider+0.249
 softened+0.221
istered+0.213
ED+0.204
Dis+0.200
Neg logit contributions
 whenever-0.153
fl-0.134
orns-0.133
reating-0.130
 nature-0.122
Token log
Token ID2604
Feature activation-8.42e-02

Pos logit contributions
ider+0.161
 softened+0.151
Dis+0.137
 slightly+0.136
ified+0.135
Neg logit contributions
ie-0.145
 whenever-0.137
 ostr-0.136
fl-0.136
 pump-0.135
Token.
Token ID13
Feature activation-5.65e-02

Pos logit contributions
ider+0.139
Dis+0.136
 softened+0.135
 slightly+0.134
Conf+0.128
Neg logit contributions
reating-0.113
opl-0.111
ites-0.111
oes-0.111
orns-0.109
Token The
Token ID383
Feature activation-3.06e-02

Pos logit contributions
ider+0.105
 softened+0.097
istered+0.094
 horrible+0.093
Dis+0.093
Neg logit contributions
oes-0.071
opl-0.071
ites-0.069
orns-0.067
reating-0.065
Token log
Token ID2604
Feature activation-7.21e-02

Pos logit contributions
ider+0.055
 slightly+0.052
Dis+0.052
 softened+0.052
Luckily+0.050
Neg logit contributions
reating-0.040
ites-0.038
opl-0.037
orns-0.037
ishes-0.036
Token,
Token ID11
Feature activation-4.96e-02

Pos logit contributions
ider+0.120
 slightly+0.115
 softened+0.114
Luckily+0.110
Dis+0.110
Neg logit contributions
reating-0.114
orns-0.105
ites-0.104
opl-0.103
ishes-0.103
Token a
Token ID257
Feature activation-9.33e-02

Pos logit contributions
ider+0.072
 softened+0.067
 slightly+0.067
Dis+0.067
Luckily+0.063
Neg logit contributions
reating-0.079
ites-0.077
oes-0.076
orns-0.076
opl-0.076
Token small
Token ID1402
Feature activation-6.99e-02

Pos logit contributions
ider+0.107
 softened+0.096
 slightly+0.093
Dis+0.092
Luckily+0.085
Neg logit contributions
reating-0.176
ites-0.175
ie-0.173
fl-0.173
opl-0.171
Token c
Token ID269
Feature activation-7.39e-02

Pos logit contributions
ider+0.114
 slightly+0.110
 softened+0.107
Dis+0.106
Luckily+0.102
Neg logit contributions
reating-0.100
opl-0.096
ites-0.094
orns-0.094
oes-0.092
Tokenactus
Token ID34144
Feature activation-8.32e-02

Pos logit contributions
ider+0.131
 slightly+0.126
 softened+0.125
Dis+0.125
Luckily+0.119
Neg logit contributions
reating-0.095
ites-0.091
opl-0.091
orns-0.090
oes-0.089
Token was
Token ID373
Feature activation-1.20e-01

Pos logit contributions
ider+0.126
 slightly+0.120
Dis+0.120
 softened+0.118
 horrible+0.112
Neg logit contributions
ites-0.126
orns-0.124
reating-0.124
ie-0.123
opl-0.123
Token very
Token ID845
Feature activation-8.85e-02

Pos logit contributions
Dis+0.185
ider+0.184
Conf+0.173
 softened+0.167
ED+0.159
Neg logit contributions
ie-0.178
mel-0.177
bur-0.172
opl-0.172
 whenever-0.169
Token frightened
Token ID24776
Feature activation-3.70e-02

Pos logit contributions
ider+0.122
Dis+0.121
Conf+0.118
 softened+0.116
Just+0.112
Neg logit contributions
ie-0.143
mel-0.138
oes-0.133
ites-0.133
ans-0.132
Token.
Token ID13
Feature activation-5.50e-02

Pos logit contributions
ider+0.066
Dis+0.064
 softened+0.063
 slightly+0.062
Luckily+0.060
Neg logit contributions
reating-0.049
ie-0.046
oes-0.045
ites-0.045
orns-0.045
Token He
Token ID679
Feature activation-4.54e-02

Pos logit contributions
ider+0.106
 softened+0.099
istered+0.099
 horrible+0.094
 shattered+0.093
Neg logit contributions
oes-0.068
opl-0.062
 whenever-0.061
orns-0.055
uggle-0.055
Token wanted
Token ID2227
Feature activation-4.75e-02

Pos logit contributions
ider+0.084
 widening+0.073
Conf+0.072
istered+0.072
Dis+0.071
Neg logit contributions
ie-0.068
fl-0.057
uses-0.054
 whenever-0.053
orns-0.051
Token a
Token ID257
Feature activation-8.62e-02

Pos logit contributions
 slightly+0.036
Dis+0.033
A+0.032
 strange+0.031
Luckily+0.030
Neg logit contributions
reating-0.164
oes-0.160
opl-0.159
ites-0.157
ishes-0.156
Token big
Token ID1263
Feature activation-6.43e-02

Pos logit contributions
Dis+0.062
 slightly+0.060
Luckily+0.058
ider+0.058
 strange+0.057
Neg logit contributions
reating-0.206
oes-0.196
ishes-0.194
mel-0.194
ites-0.194
Token forest
Token ID8222
Feature activation-8.28e-02

Pos logit contributions
 strange+0.081
 ripped+0.076
 slightly+0.075
 horrible+0.075
 tiny+0.074
Neg logit contributions
reating-0.117
oes-0.109
pp-0.105
mel-0.105
orns-0.105
Token,
Token ID11
Feature activation-5.33e-02

Pos logit contributions
ider+0.130
 slightly+0.124
 softened+0.119
Luckily+0.118
Dis+0.116
Neg logit contributions
reating-0.133
opl-0.123
ishes-0.122
ites-0.121
orns-0.121
Token there
Token ID612
Feature activation-4.70e-02

Pos logit contributions
ider+0.091
 softened+0.085
Dis+0.084
 slightly+0.082
Conf+0.079
Neg logit contributions
ie-0.071
ites-0.069
fl-0.069
oes-0.069
 whenever-0.068
Token was
Token ID373
Feature activation-1.20e-01

Pos logit contributions
ider+0.060
 slightly+0.056
 softened+0.056
Dis+0.056
Luckily+0.053
Neg logit contributions
reating-0.084
ites-0.081
opl-0.081
orns-0.080
oes-0.080
Token a
Token ID257
Feature activation-9.92e-02

Pos logit contributions
ider+0.234
 softened+0.203
istered+0.194
Dis+0.188
ED+0.186
Neg logit contributions
 whenever-0.161
orns-0.147
fl-0.147
reating-0.146
ie-0.135
Token group
Token ID1448
Feature activation-6.22e-02

Pos logit contributions
ider+0.161
 softened+0.150
 slightly+0.136
Dis+0.136
 narrowed+0.131
Neg logit contributions
ie-0.144
 whenever-0.134
 ostr-0.134
fl-0.133
 farm-0.132
Token of
Token ID286
Feature activation-2.85e-02

Pos logit contributions
ider+0.111
 slightly+0.106
 softened+0.104
Luckily+0.102
Dis+0.102
Neg logit contributions
reating-0.081
bur-0.076
opl-0.076
ishes-0.076
fl-0.074
Token animal
Token ID5044
Feature activation-5.52e-02

Pos logit contributions
ider+0.062
 softened+0.058
 slightly+0.058
Dis+0.057
Luckily+0.056
Neg logit contributions
ites-0.027
reating-0.027
orns-0.027
ishes-0.026
fl-0.025
Token friends
Token ID2460
Feature activation-1.38e-02

Pos logit contributions
ider+0.100
Dis+0.099
 slightly+0.097
 softened+0.096
Conf+0.092
Neg logit contributions
ites-0.068
ie-0.066
fl-0.065
orns-0.065
reating-0.064
Token and
Token ID290
Feature activation-1.86e-02

Pos logit contributions
 strange+0.111
 peculiar+0.106
 softened+0.104
 slightly+0.103
Luckily+0.102
Neg logit contributions
reating-0.109
ishes-0.102
girls-0.102
orns-0.099
mel-0.095
Token sunny
Token ID27737
Feature activation-7.01e-02

Pos logit contributions
ider+0.032
 softened+0.029
 slightly+0.028
Dis+0.028
Luckily+0.027
Neg logit contributions
reating-0.026
ites-0.025
ie-0.025
fl-0.025
opl-0.024
Token day
Token ID1110
Feature activation-6.25e-02

Pos logit contributions
A+0.107
Luckily+0.107
Just+0.104
 strange+0.103
 sudden+0.102
Neg logit contributions
ishes-0.106
reating-0.105
girls-0.096
pp-0.095
angles-0.091
Token,
Token ID11
Feature activation-5.59e-02

Pos logit contributions
ider+0.134
 slightly+0.124
Dis+0.124
 softened+0.122
Luckily+0.117
Neg logit contributions
ie-0.059
 whenever-0.058
reating-0.057
fl-0.055
ites-0.055
Token Jimmy
Token ID12963
Feature activation-6.37e-02

Pos logit contributions
ider+0.101
 softened+0.091
Dis+0.090
 slightly+0.089
Conf+0.087
Neg logit contributions
 whenever-0.072
ie-0.068
fl-0.068
reating-0.068
ites-0.065
Token was
Token ID373
Feature activation-1.20e-01

Pos logit contributions
ider+0.086
 slightly+0.080
Dis+0.079
 softened+0.078
Luckily+0.074
Neg logit contributions
reating-0.110
ites-0.106
opl-0.106
orns-0.105
oes-0.105
Token so
Token ID523
Feature activation-4.54e-02

Pos logit contributions
ider+0.187
Dis+0.176
 slightly+0.175
 softened+0.175
Luckily+0.166
Neg logit contributions
reating-0.181
ites-0.173
opl-0.173
orns-0.170
oes-0.169
Token excited
Token ID6568
Feature activation-1.89e-02

Pos logit contributions
ider+0.052
 slightly+0.048
 softened+0.048
Dis+0.048
Luckily+0.045
Neg logit contributions
reating-0.087
ites-0.085
opl-0.084
orns-0.083
oes-0.083
Token.
Token ID13
Feature activation-4.44e-02

Pos logit contributions
ider+0.028
 slightly+0.027
 softened+0.026
Dis+0.026
Luckily+0.025
Neg logit contributions
ites-0.029
reating-0.029
opl-0.029
ishes-0.028
orns-0.028
Token He
Token ID679
Feature activation-3.82e-02

Pos logit contributions
ider+0.068
 softened+0.064
Dis+0.063
 slightly+0.061
Luckily+0.060
Neg logit contributions
reating-0.069
oes-0.066
 whenever-0.065
opl-0.065
mel-0.063
Token and
Token ID290
Feature activation+1.21e-02

Pos logit contributions
ider+0.062
 slightly+0.054
Dis+0.053
istered+0.051
 softened+0.051
Neg logit contributions
ie-0.057
 whenever-0.055
ites-0.054
ishes-0.053
ans-0.053
Token a
Token ID257
Feature activation-8.41e-02

Pos logit contributions
 slightly+0.037
Dis+0.035
A+0.033
 strange+0.033
ider+0.032
Neg logit contributions
reating-0.159
oes-0.154
opl-0.153
ishes-0.153
ites-0.153
Token small
Token ID1402
Feature activation-6.96e-02

Pos logit contributions
ider+0.064
Dis+0.064
 slightly+0.063
Luckily+0.059
 softened+0.059
Neg logit contributions
reating-0.193
ites-0.184
orns-0.184
oes-0.184
ishes-0.183
Token park
Token ID3952
Feature activation-7.16e-02

Pos logit contributions
ider+0.103
 slightly+0.102
Dis+0.100
 softened+0.098
 strange+0.096
Neg logit contributions
reating-0.106
opl-0.101
orns-0.099
ites-0.099
oes-0.099
Token,
Token ID11
Feature activation-5.48e-02

Pos logit contributions
ider+0.130
 slightly+0.124
 softened+0.124
Dis+0.124
Luckily+0.119
Neg logit contributions
reating-0.089
ites-0.086
opl-0.085
orns-0.084
oes-0.084
Token there
Token ID612
Feature activation-4.48e-02

Pos logit contributions
ider+0.100
 softened+0.096
Dis+0.091
 widening+0.090
 horrible+0.089
Neg logit contributions
ites-0.067
ie-0.066
oes-0.065
 whenever-0.064
fl-0.063
Token was
Token ID373
Feature activation-1.20e-01

Pos logit contributions
ider+0.061
 slightly+0.057
 softened+0.056
Dis+0.056
Luckily+0.053
Neg logit contributions
reating-0.077
ites-0.074
opl-0.074
orns-0.073
oes-0.073
Token a
Token ID257
Feature activation-9.73e-02

Pos logit contributions
ider+0.221
 softened+0.199
istered+0.191
Dis+0.183
 slightly+0.180
Neg logit contributions
 whenever-0.165
reating-0.161
orns-0.153
fl-0.150
ites-0.141
Token broken
Token ID5445
Feature activation-6.90e-02

Pos logit contributions
ider+0.161
 softened+0.155
ified+0.142
Dis+0.140
 slightly+0.139
Neg logit contributions
ie-0.129
 ostr-0.126
 pump-0.126
ites-0.125
 whenever-0.125
Token swing
Token ID9628
Feature activation-6.31e-02

Pos logit contributions
ider+0.128
Dis+0.123
 softened+0.123
 slightly+0.119
Luckily+0.118
Neg logit contributions
reating-0.086
opl-0.082
ie-0.080
orns-0.080
oes-0.080
Token.
Token ID13
Feature activation-5.61e-02

Pos logit contributions
ider+0.089
 slightly+0.082
 softened+0.079
Dis+0.077
 ripped+0.077
Neg logit contributions
reating-0.105
ishes-0.102
ites-0.101
fl-0.100
orns-0.100
Token It
Token ID632
Feature activation-4.75e-02

Pos logit contributions
ider+0.105
 softened+0.098
istered+0.095
Dis+0.093
 horrible+0.093
Neg logit contributions
opl-0.071
oes-0.070
reating-0.067
ites-0.066
orns-0.063
Token a
Token ID257
Feature activation-8.51e-02

Pos logit contributions
 slightly+0.036
Dis+0.033
A+0.033
 strange+0.032
Luckily+0.031
Neg logit contributions
reating-0.158
oes-0.153
ites-0.151
opl-0.151
ishes-0.150
Token big
Token ID1263
Feature activation-6.45e-02

Pos logit contributions
Dis+0.062
 slightly+0.062
ider+0.062
Luckily+0.058
 softened+0.057
Neg logit contributions
reating-0.199
ites-0.189
oes-0.189
ishes-0.189
orns-0.189
Token park
Token ID3952
Feature activation-7.29e-02

Pos logit contributions
 strange+0.086
 slightly+0.083
 ripped+0.081
Dis+0.081
 horrible+0.080
Neg logit contributions
reating-0.111
oes-0.102
orns-0.100
mel-0.099
ishes-0.099
Token,
Token ID11
Feature activation-5.48e-02

Pos logit contributions
ider+0.140
Dis+0.135
 softened+0.135
 slightly+0.134
Conf+0.128
Neg logit contributions
reating-0.080
ites-0.079
oes-0.078
opl-0.078
orns-0.076
Token there
Token ID612
Feature activation-4.54e-02

Pos logit contributions
ider+0.104
 softened+0.099
Dis+0.093
ED+0.093
istered+0.093
Neg logit contributions
ie-0.064
ites-0.062
 whenever-0.061
oes-0.061
fl-0.060
Token was
Token ID373
Feature activation-1.20e-01

Pos logit contributions
ider+0.061
 slightly+0.057
 softened+0.056
Dis+0.056
Luckily+0.053
Neg logit contributions
reating-0.078
opl-0.076
ites-0.076
orns-0.075
oes-0.075
Token a
Token ID257
Feature activation-9.87e-02

Pos logit contributions
ider+0.228
 softened+0.200
istered+0.193
 slightly+0.183
Dis+0.183
Neg logit contributions
 whenever-0.163
reating-0.159
orns-0.153
fl-0.146
bur-0.139
Token loyal
Token ID9112
Feature activation-5.49e-02

Pos logit contributions
ider+0.165
 softened+0.157
ified+0.142
Dis+0.142
 slightly+0.142
Neg logit contributions
ie-0.131
 pump-0.127
ites-0.127
 ostr-0.126
 whenever-0.126
Token pigeon
Token ID46387
Feature activation-4.89e-02

Pos logit contributions
ider+0.085
 slightly+0.082
Dis+0.080
 softened+0.079
Luckily+0.077
Neg logit contributions
reating-0.080
opl-0.078
ites-0.077
orns-0.076
ishes-0.076
Token.
Token ID13
Feature activation-5.10e-02

Pos logit contributions
ider+0.060
 slightly+0.055
 softened+0.053
Luckily+0.052
Dis+0.051
Neg logit contributions
reating-0.092
ishes-0.087
opl-0.085
ites-0.085
bur-0.085
Token The
Token ID383
Feature activation-2.56e-02

Pos logit contributions
ider+0.076
 slightly+0.071
 softened+0.071
Dis+0.070
Luckily+0.067
Neg logit contributions
reating-0.081
opl-0.079
ites-0.078
orns-0.077
oes-0.077
Token a
Token ID257
Feature activation-8.60e-02

Pos logit contributions
 slightly+0.039
Dis+0.035
ider+0.035
 strange+0.034
 softened+0.034
Neg logit contributions
reating-0.160
opl-0.155
ites-0.155
oes-0.155
ishes-0.153
Token small
Token ID1402
Feature activation-7.21e-02

Pos logit contributions
ider+0.068
Dis+0.068
 slightly+0.068
Luckily+0.063
 softened+0.063
Neg logit contributions
reating-0.196
ites-0.186
oes-0.186
ishes-0.185
orns-0.185
Token town
Token ID3240
Feature activation-7.29e-02

Pos logit contributions
 slightly+0.099
ider+0.098
Dis+0.096
 strange+0.094
 softened+0.093
Neg logit contributions
reating-0.118
opl-0.113
oes-0.110
orns-0.109
mel-0.108
Token,
Token ID11
Feature activation-5.10e-02

Pos logit contributions
ider+0.118
 slightly+0.113
 softened+0.108
Dis+0.107
Luckily+0.106
Neg logit contributions
reating-0.109
orns-0.104
opl-0.103
ishes-0.102
bur-0.100
Token there
Token ID612
Feature activation-4.64e-02

Pos logit contributions
ider+0.092
 softened+0.089
Dis+0.086
ED+0.084
 widening+0.084
Neg logit contributions
ie-0.063
oes-0.061
 whenever-0.060
ites-0.059
fl-0.059
Token was
Token ID373
Feature activation-1.19e-01

Pos logit contributions
ider+0.060
 slightly+0.056
 softened+0.055
Dis+0.055
Luckily+0.052
Neg logit contributions
reating-0.083
ites-0.080
opl-0.080
orns-0.079
oes-0.078
Token a
Token ID257
Feature activation-9.78e-02

Pos logit contributions
ider+0.240
 softened+0.212
istered+0.205
Dis+0.198
Conf+0.196
Neg logit contributions
 whenever-0.155
reating-0.146
orns-0.130
 parties-0.127
fl-0.127
Token very
Token ID845
Feature activation-1.09e-01

Pos logit contributions
ider+0.165
 softened+0.159
Dis+0.145
ified+0.145
 narrowed+0.143
Neg logit contributions
ie-0.124
 whenever-0.123
 pump-0.122
 ostr-0.122
 farm-0.120
Token old
Token ID1468
Feature activation-5.16e-02

Pos logit contributions
 softened+0.172
ified+0.165
ider+0.156
Conf+0.156
 interrupted+0.155
Neg logit contributions
ie-0.150
 day-0.149
ites-0.149
igator-0.148
uggle-0.142
Token mine
Token ID6164
Feature activation-7.00e-02

Pos logit contributions
ider+0.094
 slightly+0.090
 softened+0.089
Dis+0.089
Luckily+0.086
Neg logit contributions
reating-0.064
ites-0.061
opl-0.061
orns-0.060
oes-0.060
Token.
Token ID13
Feature activation-5.41e-02

Pos logit contributions
 softened+0.140
Dis+0.139
 horrible+0.131
 slightly+0.131
Conf+0.131
Neg logit contributions
oes-0.075
ie-0.075
fl-0.074
ites-0.071
 whenever-0.071
TokenOne
Token ID3198
Feature activation-2.94e-02

Pos logit contributions
ider+0.016
 slightly+0.015
 softened+0.015
Dis+0.015
Luckily+0.014
Neg logit contributions
reating-0.011
ites-0.011
opl-0.011
orns-0.011
oes-0.010
Token day
Token ID1110
Feature activation-4.19e-02

Pos logit contributions
ider+0.071
 slightly+0.068
 softened+0.068
Dis+0.068
Luckily+0.065
Neg logit contributions
reating-0.020
ites-0.018
opl-0.018
oes-0.018
orns-0.017
Token,
Token ID11
Feature activation-3.70e-02

Pos logit contributions
ider+0.093
 slightly+0.091
Dis+0.090
 softened+0.089
 peculiar+0.087
Neg logit contributions
ie-0.034
 whenever-0.032
fl-0.029
ites-0.029
oes-0.027
Token a
Token ID257
Feature activation-9.03e-02

Pos logit contributions
ider+0.067
 softened+0.062
istered+0.059
Dis+0.059
 uneven+0.058
Neg logit contributions
 whenever-0.051
ie-0.045
 them-0.043
 wherever-0.041
fl-0.041
Token girl
Token ID2576
Feature activation-7.10e-02

Pos logit contributions
ider+0.144
 softened+0.135
 slightly+0.129
Dis+0.123
 shattered+0.123
Neg logit contributions
ie-0.126
 whenever-0.125
oes-0.122
uggle-0.120
ites-0.120
Token was
Token ID373
Feature activation-1.19e-01

Pos logit contributions
 softened+0.079
ider+0.079
 slightly+0.078
 ripped+0.077
Dis+0.077
Neg logit contributions
reating-0.136
opl-0.127
oes-0.126
ishes-0.125
ites-0.125
Token tying
Token ID28379
Feature activation-3.27e-02

Pos logit contributions
ider+0.163
 slightly+0.153
 softened+0.152
Dis+0.151
Luckily+0.143
Neg logit contributions
reating-0.203
ites-0.196
opl-0.195
orns-0.193
oes-0.192
Token a
Token ID257
Feature activation-7.01e-02

Pos logit contributions
 slightly+0.050
ider+0.050
 softened+0.048
Dis+0.048
 ripped+0.046
Neg logit contributions
reating-0.049
opl-0.048
oes-0.047
ites-0.047
ishes-0.046
Token balloon
Token ID21190
Feature activation-5.16e-02

Pos logit contributions
ider+0.109
 softened+0.100
Dis+0.096
 slightly+0.096
Luckily+0.094
Neg logit contributions
reating-0.109
ites-0.106
opl-0.102
ie-0.102
oes-0.102
Token to
Token ID284
Feature activation-1.64e-02

Pos logit contributions
ider+0.071
 slightly+0.067
 ripped+0.062
 softened+0.061
Dis+0.061
Neg logit contributions
reating-0.090
opl-0.086
ites-0.086
ishes-0.086
orns-0.085
Token her
Token ID607
Feature activation-4.85e-02

Pos logit contributions
 slightly+0.024
ider+0.023
Dis+0.023
 softened+0.022
 ripped+0.022
Neg logit contributions
ites-0.025
ishes-0.025
reating-0.025
fl-0.025
opl-0.025
Token a
Token ID257
Feature activation-8.73e-02

Pos logit contributions
 slightly+0.033
A+0.031
 strange+0.029
Dis+0.028
Luckily+0.027
Neg logit contributions
reating-0.167
ites-0.163
oes-0.162
ishes-0.161
opl-0.159
Token small
Token ID1402
Feature activation-7.14e-02

Pos logit contributions
ider+0.069
 slightly+0.066
Dis+0.066
 softened+0.063
Luckily+0.062
Neg logit contributions
reating-0.199
ites-0.191
oes-0.190
orns-0.190
opl-0.189
Token town
Token ID3240
Feature activation-7.38e-02

Pos logit contributions
 slightly+0.097
ider+0.097
Dis+0.094
 strange+0.093
 softened+0.092
Neg logit contributions
reating-0.117
opl-0.111
oes-0.108
orns-0.108
ites-0.107
Token,
Token ID11
Feature activation-5.32e-02

Pos logit contributions
ider+0.120
 slightly+0.115
 softened+0.111
Dis+0.110
Luckily+0.108
Neg logit contributions
reating-0.109
orns-0.104
opl-0.103
ishes-0.102
ites-0.100
Token there
Token ID612
Feature activation-4.78e-02

Pos logit contributions
ider+0.103
 softened+0.098
ED+0.095
Dis+0.095
 widening+0.095
Neg logit contributions
ie-0.059
 whenever-0.057
oes-0.056
ites-0.055
fl-0.054
Token was
Token ID373
Feature activation-1.19e-01

Pos logit contributions
ider+0.068
 slightly+0.062
 softened+0.062
Dis+0.062
Luckily+0.058
Neg logit contributions
reating-0.079
opl-0.077
ites-0.077
orns-0.076
oes-0.076
Token a
Token ID257
Feature activation-9.84e-02

Pos logit contributions
ider+0.237
 softened+0.211
istered+0.207
Dis+0.198
ED+0.195
Neg logit contributions
 whenever-0.156
reating-0.148
orns-0.132
 parties-0.127
fl-0.127
Token judge
Token ID5052
Feature activation-6.07e-02

Pos logit contributions
ider+0.172
 softened+0.163
ified+0.153
Dis+0.151
 narrowed+0.150
Neg logit contributions
ie-0.120
 pump-0.119
 ostr-0.118
 whenever-0.116
 farm-0.115
Token.
Token ID13
Feature activation-5.27e-02

Pos logit contributions
ider+0.095
 slightly+0.090
 softened+0.089
Dis+0.088
Luckily+0.085
Neg logit contributions
reating-0.092
ites-0.088
opl-0.088
orns-0.087
oes-0.086
Token The
Token ID383
Feature activation-2.56e-02

Pos logit contributions
ider+0.109
Dis+0.098
 softened+0.097
istered+0.096
Luckily+0.094
Neg logit contributions
oes-0.058
opl-0.058
reating-0.053
 whenever-0.051
mel-0.050
Token judge
Token ID5052
Feature activation-4.80e-02

Pos logit contributions
ider+0.053
 softened+0.053
Dis+0.052
 slightly+0.052
Conf+0.051
Neg logit contributions
reating-0.024
oes-0.023
opl-0.023
mel-0.023
orns-0.022
Token<|endoftext|>
Token ID50256
Feature activation-7.46e-03

Pos logit contributions
ider+0.061
 slightly+0.058
Dis+0.057
 softened+0.056
Luckily+0.054
Neg logit contributions
reating-0.073
ites-0.070
opl-0.070
orns-0.070
ishes-0.069
TokenOne
Token ID3198
Feature activation-3.23e-02

Pos logit contributions
ider+0.012
 slightly+0.011
Dis+0.011
 softened+0.011
Luckily+0.011
Neg logit contributions
reating-0.011
ites-0.010
opl-0.010
ishes-0.010
oes-0.010
Token day
Token ID1110
Feature activation-4.14e-02

Pos logit contributions
ider+0.086
 slightly+0.081
Dis+0.081
 softened+0.081
Conf+0.078
Neg logit contributions
reating-0.015
ie-0.015
oes-0.014
opl-0.014
ites-0.013
Token,
Token ID11
Feature activation-3.61e-02

Pos logit contributions
ider+0.091
 slightly+0.089
Dis+0.089
 softened+0.088
 peculiar+0.086
Neg logit contributions
ie-0.033
 whenever-0.032
fl-0.029
reating-0.028
ites-0.028
Token Toby
Token ID37578
Feature activation-5.52e-02

Pos logit contributions
ider+0.061
 softened+0.056
Dis+0.053
 horrible+0.053
 peculiar+0.052
Neg logit contributions
 whenever-0.052
ie-0.047
fl-0.045
 them-0.044
orns-0.043
Token was
Token ID373
Feature activation-1.19e-01

Pos logit contributions
ider+0.070
 slightly+0.067
 softened+0.067
Dis+0.066
Luckily+0.063
Neg logit contributions
reating-0.099
opl-0.095
ites-0.095
oes-0.093
orns-0.093
Token playing
Token ID2712
Feature activation-7.89e-02

Pos logit contributions
ider+0.168
 slightly+0.159
 softened+0.156
Dis+0.156
Luckily+0.149
Neg logit contributions
reating-0.199
ites-0.191
opl-0.190
orns-0.188
oes-0.188
Token in
Token ID287
Feature activation-5.94e-02

Pos logit contributions
ider+0.125
 slightly+0.118
 softened+0.117
Dis+0.117
 horrible+0.111
Neg logit contributions
reating-0.115
orns-0.112
ites-0.111
 whenever-0.109
ie-0.108
Token the
Token ID262
Feature activation-3.47e-02

Pos logit contributions
 slightly+0.084
A+0.083
 strange+0.080
Dis+0.078
 ripped+0.078
Neg logit contributions
ishes-0.098
oes-0.095
pp-0.095
ites-0.093
ans-0.092
Token backyard
Token ID24296
Feature activation-6.98e-02

Pos logit contributions
ider+0.052
 slightly+0.051
Dis+0.051
 softened+0.050
Luckily+0.048
Neg logit contributions
reating-0.052
ites-0.050
ishes-0.050
oes-0.050
opl-0.049
Token with
Token ID351
Feature activation-1.17e-02

Pos logit contributions
ider+0.103
 slightly+0.097
 softened+0.090
Luckily+0.090
Dis+0.089
Neg logit contributions
reating-0.115
ishes-0.113
ites-0.113
opl-0.110
orns-0.109
Token a
Token ID257
Feature activation-8.54e-02

Pos logit contributions
 slightly+0.036
Dis+0.033
A+0.032
 strange+0.031
Luckily+0.030
Neg logit contributions
reating-0.165
oes-0.159
ites-0.159
opl-0.158
ishes-0.157
Token small
Token ID1402
Feature activation-7.05e-02

Pos logit contributions
Dis+0.060
 slightly+0.059
Luckily+0.056
ider+0.056
 strange+0.056
Neg logit contributions
reating-0.206
oes-0.195
ishes-0.194
mel-0.193
ites-0.193
Token town
Token ID3240
Feature activation-7.26e-02

Pos logit contributions
 slightly+0.090
 strange+0.088
ider+0.086
Dis+0.086
 ripped+0.084
Neg logit contributions
reating-0.123
opl-0.117
oes-0.113
mel-0.113
orns-0.113
Token,
Token ID11
Feature activation-5.15e-02

Pos logit contributions
ider+0.115
 slightly+0.110
 softened+0.105
Dis+0.104
Luckily+0.104
Neg logit contributions
reating-0.112
orns-0.106
opl-0.105
ishes-0.104
bur-0.103
Token there
Token ID612
Feature activation-4.62e-02

Pos logit contributions
ider+0.096
 softened+0.093
ED+0.088
Dis+0.088
 widening+0.088
Neg logit contributions
ie-0.060
 whenever-0.058
oes-0.058
ites-0.057
fl-0.057
Token was
Token ID373
Feature activation-1.19e-01

Pos logit contributions
ider+0.057
 slightly+0.054
 softened+0.054
Dis+0.054
Luckily+0.051
Neg logit contributions
reating-0.084
ites-0.082
opl-0.081
orns-0.080
ishes-0.080
Token a
Token ID257
Feature activation-9.74e-02

Pos logit contributions
ider+0.222
 softened+0.200
istered+0.190
Dis+0.184
Conf+0.182
Neg logit contributions
 whenever-0.166
reating-0.157
orns-0.145
fl-0.143
 parties-0.139
Token powerful
Token ID3665
Feature activation-6.47e-02

Pos logit contributions
ider+0.158
 softened+0.151
Dis+0.139
 slightly+0.138
ified+0.137
Neg logit contributions
ie-0.130
 whenever-0.129
 pump-0.127
ites-0.125
 ostr-0.124
Token girl
Token ID2576
Feature activation-6.55e-02

Pos logit contributions
ider+0.119
 softened+0.115
Dis+0.115
 slightly+0.114
Luckily+0.110
Neg logit contributions
reating-0.079
opl-0.076
ites-0.075
orns-0.072
ie-0.072
Token named
Token ID3706
Feature activation-1.33e-02

Pos logit contributions
ider+0.072
 slightly+0.066
 softened+0.061
Luckily+0.061
 ripped+0.061
Neg logit contributions
reating-0.133
opl-0.126
ishes-0.126
orns-0.123
bur-0.122
Token Lily
Token ID20037
Feature activation-2.51e-02

Pos logit contributions
ider+0.024
 softened+0.024
 slightly+0.023
Dis+0.022
 horrible+0.022
Neg logit contributions
reating-0.016
oes-0.016
ites-0.016
fl-0.016
 whenever-0.015
Token a
Token ID257
Feature activation-8.59e-02

Pos logit contributions
 slightly+0.031
A+0.030
 strange+0.028
Dis+0.026
Luckily+0.025
Neg logit contributions
reating-0.169
oes-0.165
ites-0.164
ishes-0.162
opl-0.162
Token small
Token ID1402
Feature activation-7.24e-02

Pos logit contributions
ider+0.061
 slightly+0.061
Dis+0.061
 softened+0.056
Luckily+0.056
Neg logit contributions
reating-0.202
ites-0.193
oes-0.192
orns-0.192
ishes-0.191
Token town
Token ID3240
Feature activation-7.28e-02

Pos logit contributions
ider+0.106
 slightly+0.105
Dis+0.102
 softened+0.101
 strange+0.098
Neg logit contributions
reating-0.112
opl-0.107
orns-0.105
oes-0.104
ites-0.104
Token,
Token ID11
Feature activation-5.21e-02

Pos logit contributions
ider+0.128
 slightly+0.122
 softened+0.120
Dis+0.120
Luckily+0.116
Neg logit contributions
reating-0.097
opl-0.091
ites-0.091
orns-0.091
ishes-0.089
Token there
Token ID612
Feature activation-4.64e-02

Pos logit contributions
ider+0.103
 softened+0.099
 widening+0.097
ED+0.096
Dis+0.096
Neg logit contributions
 whenever-0.055
ie-0.053
ites-0.051
oes-0.050
fl-0.048
Token was
Token ID373
Feature activation-1.18e-01

Pos logit contributions
ider+0.069
 slightly+0.063
 softened+0.062
Dis+0.062
Luckily+0.059
Neg logit contributions
reating-0.074
opl-0.072
orns-0.071
ie-0.071
ites-0.071
Token a
Token ID257
Feature activation-9.78e-02

Pos logit contributions
ider+0.248
istered+0.222
 softened+0.222
Dis+0.208
Conf+0.206
Neg logit contributions
 whenever-0.147
reating-0.137
 parties-0.120
 day-0.119
 farms-0.116
Token depend
Token ID4745
Feature activation-5.06e-02

Pos logit contributions
ider+0.172
 softened+0.163
ified+0.151
Dis+0.150
 narrowed+0.149
Neg logit contributions
ie-0.119
 pump-0.119
 ostr-0.118
 farm-0.117
 whenever-0.117
Tokenable
Token ID540
Feature activation-5.72e-02

Pos logit contributions
ider+0.081
 slightly+0.077
 softened+0.075
Dis+0.075
 ripped+0.073
Neg logit contributions
reating-0.072
ishes-0.067
fl-0.067
opl-0.066
ric-0.066
Token red
Token ID2266
Feature activation-5.21e-02

Pos logit contributions
ider+0.108
 softened+0.105
Dis+0.104
 slightly+0.102
Luckily+0.100
Neg logit contributions
reating-0.068
opl-0.063
ites-0.063
ie-0.062
mel-0.062
Token car
Token ID1097
Feature activation-5.44e-02

Pos logit contributions
ider+0.096
 softened+0.093
Dis+0.092
 slightly+0.092
Luckily+0.089
Neg logit contributions
reating-0.065
opl-0.060
oes-0.060
ites-0.060
orns-0.059
Token a
Token ID257
Feature activation-8.48e-02

Pos logit contributions
 slightly+0.037
Dis+0.034
 strange+0.033
A+0.032
 softened+0.032
Neg logit contributions
reating-0.161
oes-0.156
ites-0.155
opl-0.154
ishes-0.153
Token small
Token ID1402
Feature activation-7.10e-02

Pos logit contributions
ider+0.068
 slightly+0.065
Dis+0.065
 softened+0.062
Luckily+0.060
Neg logit contributions
reating-0.193
ites-0.185
orns-0.184
oes-0.184
opl-0.184
Token garden
Token ID11376
Feature activation-7.80e-02

Pos logit contributions
ider+0.100
 slightly+0.100
Dis+0.097
 strange+0.095
 softened+0.095
Neg logit contributions
reating-0.114
opl-0.108
orns-0.106
oes-0.105
ites-0.104
Token,
Token ID11
Feature activation-5.13e-02

Pos logit contributions
ider+0.110
 slightly+0.104
Luckily+0.102
 softened+0.101
A+0.098
Neg logit contributions
reating-0.141
ishes-0.128
orns-0.127
opl-0.127
ites-0.124
Token there
Token ID612
Feature activation-4.51e-02

Pos logit contributions
ider+0.090
 softened+0.084
Dis+0.082
 slightly+0.080
ED+0.079
Neg logit contributions
ie-0.068
fl-0.065
ites-0.064
 whenever-0.064
oes-0.064
Token was
Token ID373
Feature activation-1.18e-01

Pos logit contributions
ider+0.051
 slightly+0.050
 softened+0.049
Dis+0.049
Luckily+0.046
Neg logit contributions
reating-0.086
ites-0.083
ishes-0.082
opl-0.082
orns-0.081
Token a
Token ID257
Feature activation-9.65e-02

Pos logit contributions
ider+0.232
 softened+0.202
ED+0.195
istered+0.194
Dis+0.192
Neg logit contributions
 whenever-0.154
reating-0.143
orns-0.139
fl-0.138
ie-0.130
Token little
Token ID1310
Feature activation-5.31e-02

Pos logit contributions
ider+0.156
 softened+0.147
ified+0.137
Dis+0.136
 slightly+0.136
Neg logit contributions
ie-0.135
 pump-0.128
 ostr-0.127
 whenever-0.126
fl-0.125
Token cater
Token ID20825
Feature activation-5.84e-02

Pos logit contributions
ider+0.099
 softened+0.094
 slightly+0.094
Dis+0.094
Luckily+0.091
Neg logit contributions
reating-0.063
ites-0.062
ishes-0.061
ie-0.061
orns-0.060
Tokenpillar
Token ID41643
Feature activation-7.68e-02

Pos logit contributions
ider+0.069
 slightly+0.061
Luckily+0.061
Dis+0.061
 softened+0.060
Neg logit contributions
reating-0.111
ishes-0.106
bur-0.105
oes-0.103
ric-0.101
Token.
Token ID13
Feature activation-4.86e-02

Pos logit contributions
ider+0.110
 slightly+0.102
 softened+0.100
Dis+0.098
Luckily+0.097
Neg logit contributions
reating-0.129
opl-0.122
ites-0.121
ishes-0.120
oes-0.119
TokenOn
Token ID2202
Feature activation-1.75e-02

Pos logit contributions
ider+0.009
 slightly+0.009
Dis+0.009
 softened+0.009
Luckily+0.008
Neg logit contributions
reating-0.008
ites-0.007
opl-0.007
ishes-0.007
oes-0.007
Token Wednesday
Token ID3583
Feature activation-5.42e-02

Pos logit contributions
A+0.022
 slightly+0.022
 strange+0.021
 softened+0.021
Luckily+0.021
Neg logit contributions
ishes-0.033
ites-0.031
oes-0.031
reating-0.031
pp-0.030
Token,
Token ID11
Feature activation-4.60e-02

Pos logit contributions
ider+0.079
 softened+0.077
 slightly+0.076
Dis+0.075
Luckily+0.074
Neg logit contributions
reating-0.085
opl-0.082
ishes-0.081
orns-0.081
ites-0.081
Token Sus
Token ID8932
Feature activation-3.68e-02

Pos logit contributions
ider+0.083
 softened+0.077
Dis+0.076
 slightly+0.075
istered+0.071
Neg logit contributions
 whenever-0.059
ie-0.057
ites-0.056
oes-0.055
reating-0.055
Tokenie
Token ID494
Feature activation-5.99e-02

Pos logit contributions
 softened+0.085
 ripped+0.083
 shattered+0.082
Luckily+0.081
Dis+0.081
Neg logit contributions
reating-0.021
opl-0.021
ites-0.020
mel-0.020
orns-0.018
Token was
Token ID373
Feature activation-1.18e-01

Pos logit contributions
 softened+0.063
Dis+0.060
Luckily+0.060
 slightly+0.060
ider+0.060
Neg logit contributions
reating-0.122
oes-0.113
ishes-0.112
ites-0.111
orns-0.109
Token excited
Token ID6568
Feature activation-2.38e-02

Pos logit contributions
ider+0.178
Dis+0.167
 softened+0.160
 slightly+0.156
Conf+0.154
Neg logit contributions
reating-0.185
opl-0.181
 whenever-0.178
orns-0.176
bur-0.175
Token.
Token ID13
Feature activation-4.46e-02

Pos logit contributions
ider+0.041
 slightly+0.039
 softened+0.038
Dis+0.038
Luckily+0.037
Neg logit contributions
ites-0.032
opl-0.031
orns-0.031
ishes-0.031
reating-0.031
Token She
Token ID1375
Feature activation-2.74e-02

Pos logit contributions
ider+0.069
 softened+0.065
Dis+0.063
 slightly+0.063
Luckily+0.061
Neg logit contributions
reating-0.070
oes-0.067
opl-0.066
ites-0.065
fl-0.065
Token was
Token ID373
Feature activation-1.04e-01

Pos logit contributions
ider+0.047
 slightly+0.041
 horrible+0.041
 softened+0.040
Dis+0.040
Neg logit contributions
ie-0.039
 whenever-0.038
fl-0.036
ites-0.036
bur-0.035
Token going
Token ID1016
Feature activation-8.05e-02

Pos logit contributions
ider+0.171
 slightly+0.166
 softened+0.161
Dis+0.159
Luckily+0.154
Neg logit contributions
reating-0.148
ites-0.145
orns-0.145
opl-0.144
ishes-0.142
Token a
Token ID257
Feature activation-8.70e-02

Pos logit contributions
 slightly+0.040
Dis+0.036
ider+0.035
A+0.034
 softened+0.034
Neg logit contributions
reating-0.158
oes-0.153
opl-0.153
ites-0.152
ishes-0.151
Token small
Token ID1402
Feature activation-7.13e-02

Pos logit contributions
ider+0.078
 slightly+0.074
Dis+0.073
 softened+0.071
Luckily+0.068
Neg logit contributions
reating-0.189
ites-0.182
opl-0.181
orns-0.180
oes-0.180
Token town
Token ID3240
Feature activation-7.27e-02

Pos logit contributions
ider+0.099
 slightly+0.099
Dis+0.096
 strange+0.094
 softened+0.094
Neg logit contributions
reating-0.115
opl-0.110
oes-0.107
orns-0.106
mel-0.106
Token,
Token ID11
Feature activation-5.24e-02

Pos logit contributions
ider+0.122
 slightly+0.116
 softened+0.113
Dis+0.113
Luckily+0.110
Neg logit contributions
reating-0.104
orns-0.098
opl-0.098
ishes-0.096
ites-0.096
Token there
Token ID612
Feature activation-4.60e-02

Pos logit contributions
ider+0.100
 softened+0.095
Dis+0.091
ED+0.091
 widening+0.091
Neg logit contributions
ie-0.061
 whenever-0.059
oes-0.057
fl-0.057
ites-0.056
Token was
Token ID373
Feature activation-1.18e-01

Pos logit contributions
ider+0.062
 slightly+0.057
 softened+0.056
Dis+0.056
Luckily+0.053
Neg logit contributions
reating-0.080
opl-0.077
ites-0.077
orns-0.077
oes-0.076
Token a
Token ID257
Feature activation-9.75e-02

Pos logit contributions
ider+0.235
 softened+0.207
istered+0.202
Dis+0.192
Conf+0.190
Neg logit contributions
 whenever-0.155
reating-0.147
orns-0.133
fl-0.131
 ones-0.126
Token girl
Token ID2576
Feature activation-7.11e-02

Pos logit contributions
ider+0.165
 softened+0.157
ified+0.145
Dis+0.145
 narrowed+0.142
Neg logit contributions
ie-0.125
 pump-0.123
 whenever-0.122
 ostr-0.122
 farm-0.121
Token named
Token ID3706
Feature activation-1.45e-02

Pos logit contributions
ider+0.093
 slightly+0.088
 softened+0.085
Dis+0.083
Luckily+0.083
Neg logit contributions
reating-0.128
opl-0.121
ishes-0.120
ites-0.119
orns-0.119
Token Amy
Token ID14235
Feature activation-3.52e-02

Pos logit contributions
 softened+0.028
ider+0.028
 slightly+0.027
Dis+0.026
 horrible+0.026
Neg logit contributions
oes-0.016
reating-0.016
ie-0.015
fl-0.015
 whenever-0.015
Token.
Token ID13
Feature activation-4.30e-02

Pos logit contributions
ider+0.044
 slightly+0.043
 softened+0.042
Dis+0.041
Luckily+0.040
Neg logit contributions
reating-0.064
ites-0.061
opl-0.060
ishes-0.060
orns-0.059
Token a
Token ID257
Feature activation-8.64e-02

Pos logit contributions
 slightly+0.042
Dis+0.040
ider+0.039
 softened+0.037
A+0.036
Neg logit contributions
reating-0.158
ites-0.152
oes-0.151
opl-0.151
ishes-0.151
Token big
Token ID1263
Feature activation-6.53e-02

Pos logit contributions
ider+0.070
 slightly+0.067
Dis+0.066
 softened+0.064
Luckily+0.061
Neg logit contributions
reating-0.195
ites-0.187
opl-0.187
orns-0.186
oes-0.186
Token forest
Token ID8222
Feature activation-8.32e-02

Pos logit contributions
 strange+0.085
 slightly+0.083
 ripped+0.080
 horrible+0.079
Dis+0.079
Neg logit contributions
reating-0.115
oes-0.105
orns-0.103
mel-0.102
ites-0.102
Token,
Token ID11
Feature activation-5.28e-02

Pos logit contributions
ider+0.153
Dis+0.147
 slightly+0.147
 softened+0.146
Luckily+0.140
Neg logit contributions
reating-0.100
ites-0.097
opl-0.096
oes-0.096
orns-0.095
Token there
Token ID612
Feature activation-4.73e-02

Pos logit contributions
ider+0.096
 softened+0.091
Dis+0.088
ED+0.086
 widening+0.086
Neg logit contributions
ie-0.066
oes-0.064
ites-0.063
fl-0.062
 whenever-0.062
Token was
Token ID373
Feature activation-1.18e-01

Pos logit contributions
ider+0.067
 slightly+0.062
 softened+0.062
Dis+0.062
Luckily+0.059
Neg logit contributions
reating-0.078
opl-0.076
ites-0.076
orns-0.075
oes-0.075
Token a
Token ID257
Feature activation-9.73e-02

Pos logit contributions
ider+0.240
 softened+0.216
istered+0.214
ED+0.199
Dis+0.196
Neg logit contributions
 whenever-0.146
orns-0.128
fl-0.127
reating-0.125
 farms-0.119
Token poor
Token ID3595
Feature activation-5.76e-02

Pos logit contributions
ider+0.163
 softened+0.152
Dis+0.138
ified+0.138
 slightly+0.138
Neg logit contributions
ie-0.134
 ostr-0.128
 pump-0.127
 whenever-0.126
 farm-0.125
Token little
Token ID1310
Feature activation-4.38e-02

Pos logit contributions
ider+0.093
Dis+0.089
 softened+0.088
 slightly+0.085
Conf+0.083
Neg logit contributions
ie-0.090
fl-0.082
reating-0.082
mel-0.080
ishes-0.080
Token rabbit
Token ID22746
Feature activation-5.79e-02

Pos logit contributions
ider+0.083
Dis+0.079
 softened+0.079
 slightly+0.079
Luckily+0.077
Neg logit contributions
ie-0.050
reating-0.049
ishes-0.049
fl-0.049
ites-0.048
Token.
Token ID13
Feature activation-4.98e-02

Pos logit contributions
ider+0.078
 slightly+0.072
 softened+0.070
Luckily+0.068
Dis+0.068
Neg logit contributions
reating-0.103
opl-0.097
ites-0.097
ishes-0.095
oes-0.094
Token the
Token ID262
Feature activation-3.69e-02

Pos logit contributions
 slightly+0.098
 strange+0.094
Dis+0.093
A+0.092
 softened+0.090
Neg logit contributions
ishes-0.090
ites-0.089
oes-0.088
opl-0.088
reating-0.088
Token park
Token ID3952
Feature activation-7.00e-02

Pos logit contributions
ider+0.072
 softened+0.067
 slightly+0.066
Dis+0.066
Luckily+0.063
Neg logit contributions
reating-0.045
opl-0.042
ites-0.042
orns-0.041
fl-0.041
Token.
Token ID13
Feature activation-5.06e-02

Pos logit contributions
ider+0.112
 slightly+0.107
 softened+0.104
Dis+0.104
Luckily+0.101
Neg logit contributions
reating-0.105
ites-0.100
opl-0.100
orns-0.099
ishes-0.099
Token As
Token ID1081
Feature activation-3.52e-02

Pos logit contributions
ider+0.086
 softened+0.080
Dis+0.077
 slightly+0.076
 horrible+0.076
Neg logit contributions
reating-0.071
oes-0.071
 whenever-0.068
fl-0.068
orns-0.067
Token she
Token ID673
Feature activation-5.37e-02

Pos logit contributions
 slightly+0.051
A+0.050
ider+0.050
Dis+0.049
 strange+0.047
Neg logit contributions
ishes-0.057
opl-0.057
ans-0.055
reating-0.055
ric-0.054
Token looked
Token ID3114
Feature activation-1.18e-01

Pos logit contributions
ider+0.076
 slightly+0.070
Dis+0.069
 horrible+0.068
 softened+0.067
Neg logit contributions
ites-0.089
ie-0.089
fl-0.088
orns-0.087
oes-0.085
Token around
Token ID1088
Feature activation-4.43e-02

Pos logit contributions
ider+0.172
 slightly+0.165
 softened+0.162
Dis+0.156
 strange+0.150
Neg logit contributions
reating-0.192
opl-0.189
ites-0.181
orns-0.179
oes-0.178
Token,
Token ID11
Feature activation-4.48e-02

Pos logit contributions
 slightly+0.061
ider+0.059
 softened+0.057
 strange+0.057
Dis+0.055
Neg logit contributions
reating-0.075
opl-0.074
ites-0.073
ans-0.073
ishes-0.072
Token she
Token ID673
Feature activation-4.66e-02

Pos logit contributions
ider+0.089
 softened+0.084
 slightly+0.084
Dis+0.084
Luckily+0.081
Neg logit contributions
reating-0.048
ites-0.045
orns-0.045
oes-0.044
opl-0.044
Token noticed
Token ID6810
Feature activation-5.72e-02

Pos logit contributions
ider+0.066
Dis+0.059
 slightly+0.059
 softened+0.057
Conf+0.056
Neg logit contributions
ie-0.076
ites-0.075
orns-0.074
fl-0.074
oes-0.073
Token a
Token ID257
Feature activation-9.23e-02

Pos logit contributions
ider+0.087
Dis+0.076
 slightly+0.072
 softened+0.071
istered+0.070
Neg logit contributions
reating-0.099
orns-0.093
ie-0.093
fl-0.091
 whenever-0.090
Token<|endoftext|>
Token ID50256
Feature activation-1.08e-02

Pos logit contributions
ider+0.073
 slightly+0.068
 softened+0.068
Dis+0.067
Luckily+0.064
Neg logit contributions
reating-0.082
opl-0.080
ites-0.079
oes-0.079
orns-0.079
TokenOne
Token ID3198
Feature activation-3.18e-02

Pos logit contributions
ider+0.017
 slightly+0.017
 softened+0.017
Dis+0.017
Luckily+0.016
Neg logit contributions
reating-0.015
ites-0.015
opl-0.015
ishes-0.014
oes-0.014
Token day
Token ID1110
Feature activation-4.35e-02

Pos logit contributions
ider+0.084
 slightly+0.079
Dis+0.079
 softened+0.079
Luckily+0.076
Neg logit contributions
reating-0.015
oes-0.015
opl-0.015
ie-0.014
ites-0.014
Token,
Token ID11
Feature activation-3.62e-02

Pos logit contributions
ider+0.098
 slightly+0.095
Dis+0.095
 softened+0.093
 peculiar+0.091
Neg logit contributions
ie-0.032
 whenever-0.032
fl-0.028
ites-0.026
oes-0.026
Token Pete
Token ID17542
Feature activation-5.26e-02

Pos logit contributions
ider+0.065
 softened+0.058
Dis+0.056
istered+0.055
 horrible+0.053
Neg logit contributions
 whenever-0.050
ie-0.044
 them-0.042
fl-0.041
 wherever-0.041
Token was
Token ID373
Feature activation-1.18e-01

Pos logit contributions
 softened+0.052
Dis+0.051
 slightly+0.051
ider+0.050
 ripped+0.049
Neg logit contributions
reating-0.109
opl-0.103
ites-0.103
oes-0.101
orns-0.099
Token playing
Token ID2712
Feature activation-8.00e-02

Pos logit contributions
ider+0.181
 softened+0.171
Dis+0.170
 slightly+0.169
Conf+0.161
Neg logit contributions
reating-0.178
opl-0.173
ites-0.172
orns-0.171
ishes-0.167
Token in
Token ID287
Feature activation-5.96e-02

Pos logit contributions
ider+0.132
 slightly+0.124
 softened+0.123
Dis+0.123
 horrible+0.117
Neg logit contributions
reating-0.110
orns-0.108
 whenever-0.107
ites-0.107
ie-0.105
Token his
Token ID465
Feature activation-6.33e-02

Pos logit contributions
 slightly+0.083
A+0.082
 strange+0.079
Dis+0.076
 ripped+0.076
Neg logit contributions
ishes-0.101
oes-0.098
pp-0.098
ites-0.095
reating-0.094
Token room
Token ID2119
Feature activation-7.42e-02

Pos logit contributions
ider+0.093
 slightly+0.087
 softened+0.087
Dis+0.086
Luckily+0.082
Neg logit contributions
reating-0.102
ites-0.098
opl-0.098
orns-0.097
oes-0.096
Token.
Token ID13
Feature activation-4.98e-02

Pos logit contributions
ider+0.137
Dis+0.137
 softened+0.136
 slightly+0.133
 horrible+0.129
Neg logit contributions
reating-0.086
ie-0.080
 whenever-0.080
opl-0.080
oes-0.079
Token a
Token ID257
Feature activation-8.40e-02

Pos logit contributions
 slightly+0.038
Dis+0.035
ider+0.034
A+0.033
 strange+0.033
Neg logit contributions
reating-0.158
ites-0.152
oes-0.152
opl-0.152
ishes-0.151
Token small
Token ID1402
Feature activation-7.15e-02

Pos logit contributions
ider+0.065
 slightly+0.064
Dis+0.064
 softened+0.060
Luckily+0.059
Neg logit contributions
reating-0.193
ites-0.184
oes-0.183
orns-0.183
ishes-0.183
Token town
Token ID3240
Feature activation-7.20e-02

Pos logit contributions
ider+0.101
 slightly+0.101
Dis+0.098
 softened+0.096
 strange+0.095
Neg logit contributions
reating-0.114
opl-0.108
oes-0.106
orns-0.105
ites-0.105
Token,
Token ID11
Feature activation-5.03e-02

Pos logit contributions
ider+0.123
 slightly+0.118
 softened+0.115
Dis+0.115
Luckily+0.112
Neg logit contributions
reating-0.100
opl-0.094
orns-0.094
ites-0.093
ishes-0.093
Token there
Token ID612
Feature activation-4.57e-02

Pos logit contributions
ider+0.096
 softened+0.092
ED+0.089
Dis+0.089
 widening+0.088
Neg logit contributions
ie-0.058
oes-0.056
 whenever-0.056
ites-0.054
fl-0.053
Token was
Token ID373
Feature activation-1.18e-01

Pos logit contributions
ider+0.063
 slightly+0.058
 softened+0.057
Dis+0.057
Luckily+0.054
Neg logit contributions
reating-0.078
opl-0.076
ites-0.075
orns-0.075
oes-0.074
Token a
Token ID257
Feature activation-9.64e-02

Pos logit contributions
ider+0.240
 softened+0.215
istered+0.211
Dis+0.201
ED+0.200
Neg logit contributions
 whenever-0.151
reating-0.139
 day-0.122
orns-0.122
 parties-0.121
Token thoughtful
Token ID22677
Feature activation-4.79e-02

Pos logit contributions
ider+0.164
 softened+0.156
ified+0.146
Dis+0.143
 narrowed+0.142
Neg logit contributions
ie-0.122
 pump-0.122
 whenever-0.120
 ostr-0.120
 farm-0.119
Token girl
Token ID2576
Feature activation-6.21e-02

Pos logit contributions
ider+0.081
 softened+0.077
Dis+0.076
 slightly+0.075
Luckily+0.073
Neg logit contributions
reating-0.067
ie-0.064
opl-0.063
ites-0.062
mel-0.062
Token named
Token ID3706
Feature activation-1.41e-02

Pos logit contributions
ider+0.072
 slightly+0.068
 softened+0.064
Luckily+0.063
Dis+0.062
Neg logit contributions
reating-0.120
opl-0.114
ishes-0.113
orns-0.111
ites-0.111
Token Lucy
Token ID22162
Feature activation-3.07e-02

Pos logit contributions
 softened+0.028
ider+0.028
 slightly+0.027
 horrible+0.026
Dis+0.026
Neg logit contributions
oes-0.015
reating-0.014
 whenever-0.014
fl-0.014
ie-0.014
Token a
Token ID257
Feature activation-8.55e-02

Pos logit contributions
 slightly+0.039
Dis+0.036
A+0.034
ider+0.033
Luckily+0.033
Neg logit contributions
reating-0.163
oes-0.158
ites-0.157
opl-0.157
ishes-0.154
Token peaceful
Token ID12309
Feature activation-5.99e-02

Pos logit contributions
Dis+0.064
 slightly+0.063
ider+0.061
Luckily+0.060
 strange+0.058
Neg logit contributions
reating-0.200
oes-0.190
ites-0.189
ishes-0.189
mel-0.189
Token town
Token ID3240
Feature activation-7.11e-02

Pos logit contributions
ider+0.078
 slightly+0.077
Dis+0.074
 softened+0.074
Luckily+0.073
Neg logit contributions
reating-0.103
ites-0.098
opl-0.098
orns-0.098
ishes-0.097
Token,
Token ID11
Feature activation-5.33e-02

Pos logit contributions
ider+0.115
 slightly+0.109
Dis+0.104
 softened+0.104
Luckily+0.103
Neg logit contributions
reating-0.107
orns-0.101
opl-0.101
ishes-0.100
bur-0.099
Token there
Token ID612
Feature activation-4.49e-02

Pos logit contributions
ider+0.101
 softened+0.098
 widening+0.092
Dis+0.092
ED+0.092
Neg logit contributions
ie-0.064
 whenever-0.060
oes-0.060
fl-0.060
ites-0.059
Token was
Token ID373
Feature activation-1.18e-01

Pos logit contributions
ider+0.060
 slightly+0.056
 softened+0.055
Dis+0.055
Luckily+0.052
Neg logit contributions
reating-0.078
ites-0.076
opl-0.075
orns-0.075
oes-0.074
Token a
Token ID257
Feature activation-9.72e-02

Pos logit contributions
ider+0.238
 softened+0.213
istered+0.209
 secretive+0.195
Dis+0.195
Neg logit contributions
 whenever-0.154
reating-0.143
orns-0.129
 parties-0.128
fl-0.127
Token pretty
Token ID2495
Feature activation-6.66e-02

Pos logit contributions
ider+0.162
 softened+0.153
ified+0.143
Dis+0.141
 narrowed+0.141
Neg logit contributions
ie-0.128
 pump-0.125
 whenever-0.124
 ostr-0.123
 farm-0.121
Token bridge
Token ID7696
Feature activation-6.40e-02

Pos logit contributions
ider+0.117
 softened+0.112
Dis+0.111
 slightly+0.111
Luckily+0.108
Neg logit contributions
reating-0.088
ites-0.083
opl-0.083
ie-0.082
fl-0.081
Token.
Token ID13
Feature activation-5.63e-02

Pos logit contributions
ider+0.087
 slightly+0.079
 softened+0.073
Luckily+0.072
 shattered+0.072
Neg logit contributions
reating-0.117
ishes-0.112
orns-0.109
bur-0.108
ites-0.107
Token The
Token ID383
Feature activation-2.81e-02

Pos logit contributions
ider+0.092
 softened+0.086
Dis+0.084
 slightly+0.082
 horrible+0.081
Neg logit contributions
opl-0.081
ites-0.081
oes-0.080
orns-0.080
reating-0.079
Tokens
Token ID82
Feature activation+3.07e-02

Pos logit contributions
reating+0.095
ites+0.090
orns+0.090
ishes+0.090
opl+0.088
Neg logit contributions
ider-0.110
 softened-0.105
 slightly-0.105
Dis-0.104
Luckily-0.100
Token wrong
Token ID2642
Feature activation+2.07e-02

Pos logit contributions
reating+0.048
opl+0.047
ites+0.047
oes+0.046
orns+0.046
Neg logit contributions
ider-0.046
 slightly-0.044
Dis-0.043
 softened-0.043
Luckily-0.041
Token,
Token ID11
Feature activation+3.08e-02

Pos logit contributions
reating+0.025
fl+0.025
opl+0.024
ites+0.024
ie+0.024
Neg logit contributions
ider-0.038
Dis-0.036
 softened-0.036
 slightly-0.035
Luckily-0.034
Token rh
Token ID9529
Feature activation+8.76e-02

Pos logit contributions
reating+0.048
ites+0.047
opl+0.047
orns+0.046
oes+0.046
Neg logit contributions
ider-0.045
 slightly-0.043
 softened-0.043
Dis-0.043
Luckily-0.041
Tokenin
Token ID259
Feature activation+1.13e-02

Pos logit contributions
reating+0.122
ites+0.116
opl+0.116
orns+0.115
oes+0.114
Neg logit contributions
ider-0.147
 slightly-0.140
 softened-0.139
Dis-0.139
Luckily-0.133
Tokenoc
Token ID420
Feature activation+1.57e-01

Pos logit contributions
reating+0.014
opl+0.014
ites+0.014
oes+0.014
ie+0.014
Neg logit contributions
ider-0.020
 slightly-0.019
Dis-0.019
 softened-0.019
Luckily-0.018
Tokeneros
Token ID27498
Feature activation+3.73e-02

Pos logit contributions
reating+0.202
ites+0.187
oes+0.187
 whenever+0.185
uggle+0.184
Neg logit contributions
ider-0.280
 softened-0.254
Dis-0.250
 slightly-0.247
 horrible-0.236
Token?
Token ID30
Feature activation-4.52e-03

Pos logit contributions
reating+0.042
opl+0.042
ites+0.041
ie+0.041
orns+0.040
Neg logit contributions
ider-0.071
Dis-0.069
 slightly-0.068
 softened-0.066
Luckily-0.065
Tokenâ
Token ID22940
Feature activation+4.48e-02

Pos logit contributions
ider+0.007
 softened+0.007
 slightly+0.006
Dis+0.006
 horrible+0.006
Neg logit contributions
reating-0.007
ites-0.007
opl-0.007
oes-0.006
orns-0.006
Token
Token ID26391
Feature activation-1.19e-02

Pos logit contributions
opl+0.064
reating+0.061
oes+0.061
orns+0.060
ites+0.059
Neg logit contributions
ider-0.074
 slightly-0.072
Dis-0.070
 softened-0.070
Luckily-0.067
Token said
Token ID531
Feature activation+4.17e-02

Pos logit contributions
ider+0.017
Luckily+0.017
 slightly+0.016
Dis+0.016
 softened+0.016
Neg logit contributions
reating-0.022
ishes-0.019
ie-0.019
oes-0.019
orns-0.018
Token that
Token ID326
Feature activation+9.00e-02

Pos logit contributions
orns+0.130
opl+0.130
ishes+0.129
bur+0.126
reating+0.126
Neg logit contributions
ider-0.111
 slightly-0.107
 softened-0.104
Luckily-0.098
 strange-0.097
Token the
Token ID262
Feature activation+7.44e-02

Pos logit contributions
orns+0.121
ishes+0.119
ites+0.117
reating+0.116
opl+0.116
Neg logit contributions
 slightly-0.153
 softened-0.149
ider-0.148
Dis-0.145
 horrible-0.143
Token bad
Token ID2089
Feature activation+7.25e-02

Pos logit contributions
reating+0.109
ites+0.105
opl+0.104
fl+0.103
oes+0.102
Neg logit contributions
ider-0.118
Dis-0.111
 slightly-0.111
 softened-0.110
Luckily-0.106
Token rh
Token ID9529
Feature activation+1.09e-01

Pos logit contributions
reating+0.095
ites+0.093
orns+0.091
opl+0.090
ishes+0.090
Neg logit contributions
ider-0.125
 slightly-0.120
 softened-0.120
Dis-0.118
 horrible-0.115
Tokenin
Token ID259
Feature activation+2.80e-02

Pos logit contributions
reating+0.171
opl+0.164
oes+0.161
fl+0.161
ishes+0.157
Neg logit contributions
ider-0.167
 softened-0.154
 slightly-0.153
Dis-0.149
Luckily-0.146
Tokenoc
Token ID420
Feature activation+1.74e-01

Pos logit contributions
reating+0.027
ites+0.025
orns+0.024
oes+0.024
opl+0.024
Neg logit contributions
ider-0.059
 softened-0.057
 slightly-0.057
Dis-0.056
Luckily-0.055
Tokeneros
Token ID27498
Feature activation+5.32e-02

Pos logit contributions
reating+0.249
ites+0.227
oes+0.226
uggle+0.220
bur+0.219
Neg logit contributions
ider-0.295
 softened-0.262
 slightly-0.250
Dis-0.246
 shattered-0.239
Token was
Token ID373
Feature activation-2.04e-02

Pos logit contributions
reating+0.095
ites+0.093
opl+0.093
orns+0.092
ishes+0.091
Neg logit contributions
ider-0.066
 slightly-0.061
Dis-0.061
 softened-0.059
Conf-0.056
Token now
Token ID783
Feature activation+3.16e-02

Pos logit contributions
ider+0.026
 slightly+0.026
 softened+0.026
Dis+0.024
Luckily+0.024
Neg logit contributions
reating-0.036
ites-0.035
oes-0.035
orns-0.035
ishes-0.035
Token nice
Token ID3621
Feature activation+5.53e-02

Pos logit contributions
reating+0.051
opl+0.049
ites+0.049
orns+0.048
oes+0.048
Neg logit contributions
ider-0.046
 slightly-0.043
Dis-0.043
 softened-0.043
Luckily-0.041
Token.
Token ID13
Feature activation+1.88e-02

Pos logit contributions
ites+0.086
ans+0.084
opl+0.084
reating+0.083
orns+0.082
Neg logit contributions
 slightly-0.078
ider-0.077
 softened-0.077
 horrible-0.073
 ripped-0.072
Token and
Token ID290
Feature activation+7.79e-02

Pos logit contributions
orns+0.061
reating+0.061
oes+0.060
opl+0.060
ites+0.060
Neg logit contributions
 slightly-0.051
 softened-0.048
ider-0.047
Dis-0.046
Luckily-0.045
Token all
Token ID477
Feature activation+7.65e-02

Pos logit contributions
opl+0.123
reating+0.121
orns+0.119
ites+0.118
oes+0.117
Neg logit contributions
 slightly-0.116
 softened-0.116
ider-0.109
 ripped-0.109
 spat-0.109
Token the
Token ID262
Feature activation+5.99e-02

Pos logit contributions
reating+0.122
bur+0.116
orns+0.116
oes+0.116
ans+0.115
Neg logit contributions
 softened-0.112
 shattered-0.105
 slightly-0.105
 strange-0.103
 horrible-0.102
Token other
Token ID584
Feature activation+5.10e-02

Pos logit contributions
reating+0.079
ites+0.076
opl+0.075
orns+0.074
oes+0.074
Neg logit contributions
ider-0.105
 slightly-0.100
 softened-0.099
Dis-0.099
Luckily-0.095
Token av
Token ID1196
Feature activation+7.20e-02

Pos logit contributions
reating+0.064
ites+0.061
opl+0.060
oes+0.059
orns+0.059
Neg logit contributions
ider-0.093
 slightly-0.089
 softened-0.088
Dis-0.088
Luckily-0.085
Tokenoc
Token ID420
Feature activation+1.58e-01

Pos logit contributions
reating+0.118
oes+0.111
line+0.108
ric+0.104
 festivals+0.102
Neg logit contributions
Dis-0.100
Just-0.093
Luckily-0.093
Conf-0.092
 softened-0.090
Tokenados
Token ID22484
Feature activation+5.87e-02

Pos logit contributions
reating+0.208
 whenever+0.189
oes+0.187
uggle+0.186
line+0.180
Neg logit contributions
ider-0.276
 softened-0.252
Dis-0.246
nown-0.238
 slightly-0.238
Token were
Token ID547
Feature activation-1.22e-02

Pos logit contributions
reating+0.108
opl+0.104
oes+0.102
orns+0.099
ites+0.098
Neg logit contributions
ider-0.072
 softened-0.071
 slightly-0.068
 ripped-0.065
Luckily-0.064
Token very
Token ID845
Feature activation-2.09e-04

Pos logit contributions
 slightly+0.015
 softened+0.014
ider+0.014
 horrible+0.013
Dis+0.013
Neg logit contributions
reating-0.023
oes-0.023
ites-0.022
opl-0.022
orns-0.021
Token happy
Token ID3772
Feature activation+6.69e-02

Pos logit contributions
ider+0.000
 slightly+0.000
Dis+0.000
 softened+0.000
Luckily+0.000
Neg logit contributions
reating-0.000
ites-0.000
opl-0.000
oes-0.000
orns-0.000
Token too
Token ID1165
Feature activation+4.88e-02

Pos logit contributions
opl+0.115
reating+0.114
ites+0.113
oes+0.113
bur+0.113
Neg logit contributions
ider-0.089
 slightly-0.084
 softened-0.082
 horrible-0.076
Dis-0.076
Token give
Token ID1577
Feature activation+1.02e-01

Pos logit contributions
reating+0.138
ites+0.136
ishes+0.133
opl+0.131
fl+0.131
Neg logit contributions
 slightly-0.131
 strange-0.126
 softened-0.126
 ripped-0.121
Luckily-0.121
Token to
Token ID284
Feature activation+9.67e-02

Pos logit contributions
opl+0.145
reating+0.136
ites+0.125
oes+0.124
ric+0.120
Neg logit contributions
 softened-0.167
 slightly-0.166
 strange-0.160
 ripped-0.158
 horrible-0.157
Token the
Token ID262
Feature activation+8.74e-02

Pos logit contributions
ites+0.155
pp+0.154
oes+0.152
fl+0.150
opl+0.147
Neg logit contributions
 softened-0.129
 strange-0.127
 slightly-0.125
 horrible-0.121
 sharp-0.119
Token rh
Token ID9529
Feature activation+1.17e-01

Pos logit contributions
reating+0.122
ites+0.117
opl+0.117
orns+0.115
oes+0.115
Neg logit contributions
ider-0.145
 slightly-0.138
 softened-0.137
Dis-0.136
Luckily-0.131
Tokenin
Token ID259
Feature activation+3.58e-02

Pos logit contributions
fl+0.214
reating+0.209
opl+0.207
mel+0.203
oes+0.202
Neg logit contributions
ider-0.146
 softened-0.127
Luckily-0.125
 slightly-0.124
Dis-0.121
Tokenoc
Token ID420
Feature activation+1.77e-01

Pos logit contributions
reating+0.049
angles+0.045
orns+0.044
ites+0.044
bur+0.044
Neg logit contributions
A-0.055
 slightly-0.055
ider-0.054
 Two-0.054
 A-0.054
Tokeneros
Token ID27498
Feature activation+5.89e-02

Pos logit contributions
reating+0.259
 whenever+0.241
oes+0.240
ites+0.237
uggle+0.235
Neg logit contributions
ider-0.273
 softened-0.242
 slightly-0.236
Dis-0.234
 shattered-0.222
Token.
Token ID13
Feature activation+2.95e-02

Pos logit contributions
reating+0.094
opl+0.091
ites+0.089
orns+0.088
bur+0.088
Neg logit contributions
ider-0.083
 softened-0.082
 slightly-0.081
Dis-0.076
Luckily-0.075
Token Lily
Token ID20037
Feature activation+4.89e-02

Pos logit contributions
reating+0.049
ites+0.047
opl+0.047
orns+0.047
ishes+0.046
Neg logit contributions
ider-0.041
 slightly-0.039
Dis-0.039
 softened-0.039
Luckily-0.037
Token walked
Token ID6807
Feature activation-1.45e-02

Pos logit contributions
reating+0.089
ites+0.085
opl+0.084
oes+0.084
orns+0.083
Neg logit contributions
ider-0.062
 softened-0.059
 slightly-0.058
Dis-0.058
Luckily-0.055
Token to
Token ID284
Feature activation+4.92e-02

Pos logit contributions
ider+0.026
 slightly+0.024
 softened+0.024
Dis+0.024
Luckily+0.023
Neg logit contributions
reating-0.019
ites-0.018
opl-0.018
orns-0.018
oes-0.018
Token am
Token ID716
Feature activation+3.55e-03

Pos logit contributions
reating+0.099
ites+0.096
opl+0.095
orns+0.094
oes+0.094
Neg logit contributions
ider-0.141
 slightly-0.134
Dis-0.133
 softened-0.133
Luckily-0.128
Token taking
Token ID2263
Feature activation+6.89e-02

Pos logit contributions
reating+0.006
ites+0.006
opl+0.006
orns+0.006
oes+0.006
Neg logit contributions
ider-0.005
 slightly-0.004
 softened-0.004
Dis-0.004
Luckily-0.004
Token the
Token ID262
Feature activation+9.44e-02

Pos logit contributions
reating+0.103
oes+0.100
opl+0.099
ites+0.098
ishes+0.097
Neg logit contributions
 slightly-0.103
ider-0.103
Dis-0.100
 softened-0.100
 horrible-0.096
Token rh
Token ID9529
Feature activation+1.16e-01

Pos logit contributions
reating+0.141
ites+0.137
opl+0.136
orns+0.135
ishes+0.133
Neg logit contributions
ider-0.145
 slightly-0.138
 softened-0.135
Dis-0.134
 horrible-0.131
Tokenin
Token ID259
Feature activation+3.71e-02

Pos logit contributions
fl+0.213
reating+0.213
opl+0.205
mel+0.203
oes+0.202
Neg logit contributions
ider-0.142
 badly-0.118
Luckily-0.117
 slightly-0.116
Dis-0.113
Tokenoc
Token ID420
Feature activation+1.81e-01

Pos logit contributions
reating+0.049
ishes+0.045
angles+0.044
orns+0.044
ites+0.043
Neg logit contributions
ider-0.062
A-0.058
 slightly-0.058
Out-0.056
 horrible-0.056
Tokeneros
Token ID27498
Feature activation+5.83e-02

Pos logit contributions
reating+0.286
 parties+0.261
 whenever+0.255
oes+0.251
uggle+0.247
Neg logit contributions
ider-0.275
 slightly-0.230
 softened-0.230
Dis-0.224
nown-0.220
Token to
Token ID284
Feature activation+5.30e-02

Pos logit contributions
reating+0.089
opl+0.084
orns+0.084
ites+0.083
ishes+0.083
Neg logit contributions
ider-0.086
 slightly-0.080
 softened-0.078
Luckily-0.077
 horrible-0.076
Token a
Token ID257
Feature activation+6.06e-03

Pos logit contributions
ites+0.081
ishes+0.079
fl+0.077
oes+0.077
orns+0.077
Neg logit contributions
 slightly-0.080
 strange-0.076
 horrible-0.076
Luckily-0.074
Dis-0.074
Token new
Token ID649
Feature activation+4.68e-02

Pos logit contributions
reating+0.010
opl+0.010
ites+0.010
oes+0.010
orns+0.010
Neg logit contributions
ider-0.008
 softened-0.008
Dis-0.008
 slightly-0.008
Luckily-0.007
Token home
Token ID1363
Feature activation+2.72e-02

Pos logit contributions
reating+0.062
ites+0.061
orns+0.061
ishes+0.059
oes+0.059
Neg logit contributions
ider-0.077
 slightly-0.077
Dis-0.074
 softened-0.072
Luckily-0.072
Token Sam
Token ID3409
Feature activation+5.76e-02

Pos logit contributions
opl+0.151
reating+0.150
ites+0.147
orns+0.145
ishes+0.142
Neg logit contributions
ider-0.116
 slightly-0.113
 softened-0.111
Dis-0.108
 ripped-0.104
Token.
Token ID13
Feature activation+2.45e-02

Pos logit contributions
reating+0.099
ites+0.097
opl+0.095
orns+0.094
oes+0.093
Neg logit contributions
 softened-0.076
ider-0.076
 slightly-0.074
 ripped-0.069
Dis-0.069
Token She
Token ID1375
Feature activation+3.20e-02

Pos logit contributions
reating+0.042
ites+0.040
ishes+0.040
orns+0.040
opl+0.039
Neg logit contributions
ider-0.033
 slightly-0.032
 softened-0.031
Dis-0.031
Luckily-0.029
Token was
Token ID373
Feature activation-4.41e-02

Pos logit contributions
ie+0.051
fl+0.049
ites+0.049
 whenever+0.048
opl+0.048
Neg logit contributions
ider-0.048
Dis-0.043
 slightly-0.043
 horrible-0.040
 softened-0.040
Token three
Token ID1115
Feature activation+1.13e-02

Pos logit contributions
ider+0.061
Dis+0.057
 softened+0.056
 slightly+0.054
Conf+0.054
Neg logit contributions
reating-0.075
opl-0.071
ites-0.071
bur-0.070
fl-0.070
Token years
Token ID812
Feature activation+6.29e-02

Pos logit contributions
reating+0.022
ites+0.021
opl+0.021
orns+0.021
oes+0.021
Neg logit contributions
ider-0.013
 slightly-0.012
 softened-0.012
Dis-0.012
Luckily-0.011
Token old
Token ID1468
Feature activation+7.63e-02

Pos logit contributions
reating+0.119
orns+0.115
opl+0.115
ites+0.115
ishes+0.114
Neg logit contributions
ider-0.073
 slightly-0.070
 softened-0.066
Dis-0.065
Luckily-0.062
Token.
Token ID13
Feature activation+4.21e-03

Pos logit contributions
reating+0.094
ites+0.089
orns+0.088
fl+0.088
opl+0.088
Neg logit contributions
ider-0.140
Dis-0.134
 slightly-0.133
 softened-0.132
Luckily-0.129
Token Sam
Token ID3409
Feature activation+2.55e-02

Pos logit contributions
reating+0.007
ites+0.007
ishes+0.007
orns+0.006
opl+0.006
Neg logit contributions
ider-0.006
 slightly-0.006
Dis-0.006
 softened-0.006
Luckily-0.005
Token was
Token ID373
Feature activation-5.41e-02

Pos logit contributions
ie+0.041
fl+0.040
ites+0.039
 whenever+0.039
ishes+0.039
Neg logit contributions
ider-0.038
 slightly-0.034
Dis-0.034
 softened-0.033
 horrible-0.033
Token very
Token ID845
Feature activation-2.15e-02

Pos logit contributions
ider+0.074
Dis+0.069
 softened+0.067
 slightly+0.065
Conf+0.064
Neg logit contributions
reating-0.091
opl-0.090
fl-0.088
ites-0.087
ie-0.087
Token to
Token ID284
Feature activation+4.14e-02

Pos logit contributions
reating+0.018
ites+0.017
oes+0.017
orns+0.016
opl+0.016
Neg logit contributions
ider-0.034
 slightly-0.033
Dis-0.032
 softened-0.032
Luckily-0.031
Token decor
Token ID11705
Feature activation+4.54e-02

Pos logit contributions
reating+0.066
orns+0.065
ishes+0.065
opl+0.064
ites+0.063
Neg logit contributions
 softened-0.060
 slightly-0.059
Luckily-0.059
ider-0.059
 ripped-0.058
Tokenate
Token ID378
Feature activation+4.57e-02

Pos logit contributions
reating+0.066
oes+0.066
opl+0.063
mel+0.063
orns+0.062
Neg logit contributions
ider-0.068
 softened-0.068
 slightly-0.067
Dis-0.065
Luckily-0.063
Token the
Token ID262
Feature activation+2.93e-02

Pos logit contributions
opl+0.069
oes+0.065
orns+0.064
ites+0.062
reating+0.062
Neg logit contributions
 slightly-0.071
 strange-0.068
 softened-0.066
 horrible-0.065
 ripped-0.063
Token tree
Token ID5509
Feature activation+3.24e-03

Pos logit contributions
reating+0.031
opl+0.030
fl+0.030
orns+0.029
ites+0.029
Neg logit contributions
ider-0.060
 softened-0.059
Dis-0.057
Luckily-0.056
 slightly-0.056
Token and
Token ID290
Feature activation+6.09e-02

Pos logit contributions
orns+0.006
opl+0.006
reating+0.006
bur+0.006
ites+0.006
Neg logit contributions
 slightly-0.004
ider-0.004
 strange-0.004
 ripped-0.004
 softened-0.004
Token make
Token ID787
Feature activation+3.52e-02

Pos logit contributions
opl+0.097
orns+0.094
ites+0.090
reating+0.088
bur+0.088
Neg logit contributions
 slightly-0.093
 softened-0.093
 ripped-0.091
ider-0.088
 shattered-0.087
Token it
Token ID340
Feature activation+2.25e-02

Pos logit contributions
opl+0.064
orns+0.063
oes+0.063
ites+0.062
reating+0.062
Neg logit contributions
 slightly-0.043
 strange-0.042
 horrible-0.041
 softened-0.040
Dis-0.040
Token pretty
Token ID2495
Feature activation+3.84e-02

Pos logit contributions
reating+0.039
opl+0.039
orns+0.039
ites+0.039
ishes+0.037
Neg logit contributions
ider-0.029
 slightly-0.028
 softened-0.028
Dis-0.027
 ripped-0.026
Token.
Token ID13
Feature activation-4.16e-03

Pos logit contributions
opl+0.070
ites+0.069
oes+0.069
orns+0.068
bur+0.068
Neg logit contributions
 slightly-0.046
ider-0.044
 strange-0.042
 softened-0.042
 horrible-0.040
Token They
Token ID1119
Feature activation+3.33e-02

Pos logit contributions
ider+0.008
 softened+0.008
 slightly+0.008
Dis+0.008
 peculiar+0.008
Neg logit contributions
ites-0.004
reating-0.004
 whenever-0.004
oes-0.004
opl-0.004
Token his
Token ID465
Feature activation+3.61e-02

Pos logit contributions
ites+0.118
opl+0.114
oes+0.112
orns+0.112
pp+0.109
Neg logit contributions
 softened-0.157
 slightly-0.156
 strange-0.154
 horrible-0.153
Dis-0.151
Token brother
Token ID3956
Feature activation+6.06e-02

Pos logit contributions
reating+0.053
opl+0.052
orns+0.052
ites+0.051
oes+0.051
Neg logit contributions
ider-0.056
 slightly-0.055
 softened-0.054
Dis-0.054
Luckily-0.053
Token Sandy
Token ID17581
Feature activation+5.04e-02

Pos logit contributions
oes+0.105
ites+0.105
opl+0.103
bur+0.102
reating+0.102
Neg logit contributions
 softened-0.075
 slightly-0.073
 strange-0.071
 horrible-0.071
Dis-0.069
Token to
Token ID284
Feature activation+5.05e-02

Pos logit contributions
opl+0.077
reating+0.076
ites+0.076
oes+0.075
orns+0.074
Neg logit contributions
 softened-0.073
ider-0.070
 slightly-0.070
 strange-0.069
Dis-0.068
Token be
Token ID307
Feature activation-1.57e-02

Pos logit contributions
ites+0.079
reating+0.078
ishes+0.077
orns+0.077
fl+0.074
Neg logit contributions
 softened-0.073
 slightly-0.073
ider-0.071
Luckily-0.071
Dis-0.069
Token obedient
Token ID41652
Feature activation+6.15e-02

Pos logit contributions
ider+0.018
 slightly+0.017
 softened+0.017
Dis+0.017
Luckily+0.016
Neg logit contributions
reating-0.030
ites-0.029
opl-0.029
orns-0.029
oes-0.029
Token from
Token ID422
Feature activation+5.79e-02

Pos logit contributions
reating+0.098
opl+0.098
ites+0.097
orns+0.095
oes+0.094
Neg logit contributions
ider-0.091
 slightly-0.086
 softened-0.084
Luckily-0.080
Dis-0.080
Token then
Token ID788
Feature activation+3.37e-02

Pos logit contributions
reating+0.081
ites+0.079
opl+0.077
orns+0.077
oes+0.077
Neg logit contributions
ider-0.096
 slightly-0.092
 softened-0.090
Dis-0.090
Luckily-0.087
Token on
Token ID319
Feature activation+2.68e-02

Pos logit contributions
reating+0.052
opl+0.052
ites+0.051
orns+0.050
oes+0.050
Neg logit contributions
ider-0.051
 softened-0.047
 slightly-0.047
Dis-0.047
Luckily-0.045
Token.
Token ID13
Feature activation+4.35e-03

Pos logit contributions
reating+0.037
opl+0.036
ites+0.036
orns+0.035
oes+0.035
Neg logit contributions
ider-0.045
 slightly-0.043
Dis-0.042
 softened-0.042
Luckily-0.041
Token
Token ID220
Feature activation+4.00e-03

Pos logit contributions
reating+0.007
ites+0.007
ishes+0.007
orns+0.007
ie+0.007
Neg logit contributions
ider-0.006
 slightly-0.006
Dis-0.006
 softened-0.006
Conf-0.006
Tokenpillar
Token ID41643
Feature activation+3.49e-02

Pos logit contributions
reating+0.134
ites+0.127
opl+0.126
oes+0.126
ishes+0.125
Neg logit contributions
ider-0.120
 softened-0.112
 slightly-0.112
Dis-0.111
Luckily-0.107
Token w
Token ID266
Feature activation+4.04e-02

Pos logit contributions
ie+0.059
ites+0.057
opl+0.057
orns+0.056
fl+0.056
Neg logit contributions
ider-0.047
 slightly-0.044
Dis-0.043
Conf-0.040
 softened-0.039
Tokenigg
Token ID6950
Feature activation+7.67e-02

Pos logit contributions
orns+0.066
ie+0.062
opl+0.062
ites+0.061
ishes+0.061
Neg logit contributions
 slightly-0.058
ider-0.058
Dis-0.056
 softened-0.055
 interrupted-0.054
Tokenled
Token ID992
Feature activation+3.28e-02

Pos logit contributions
reating+0.162
opl+0.154
ites+0.150
bur+0.149
oes+0.148
Neg logit contributions
ider-0.077
 softened-0.073
Dis-0.070
 slightly-0.069
 horrible-0.066
Token inside
Token ID2641
Feature activation+3.86e-02

Pos logit contributions
reating+0.049
ites+0.047
opl+0.047
orns+0.046
oes+0.046
Neg logit contributions
ider-0.052
 slightly-0.049
 softened-0.049
Dis-0.049
Luckily-0.047
Token and
Token ID290
Feature activation+7.97e-02

Pos logit contributions
reating+0.052
ites+0.050
opl+0.050
orns+0.049
ishes+0.049
Neg logit contributions
ider-0.065
 slightly-0.064
 softened-0.064
Dis-0.062
Luckily-0.060
Token felt
Token ID2936
Feature activation-6.60e-03

Pos logit contributions
reating+0.143
ites+0.138
opl+0.138
orns+0.137
oes+0.136
Neg logit contributions
ider-0.101
 slightly-0.095
Dis-0.093
 softened-0.093
Luckily-0.088
Token safe
Token ID3338
Feature activation+7.35e-02

Pos logit contributions
ider+0.008
 slightly+0.008
Dis+0.008
 softened+0.008
Luckily+0.007
Neg logit contributions
reating-0.012
ites-0.012
opl-0.012
orns-0.011
oes-0.011
Token.
Token ID13
Feature activation+1.80e-02

Pos logit contributions
reating+0.123
ites+0.122
opl+0.120
oes+0.118
orns+0.116
Neg logit contributions
ider-0.105
 slightly-0.101
 softened-0.098
 ripped-0.090
Dis-0.089
Token
Token ID198
Feature activation+2.58e-02

Pos logit contributions
reating+0.033
ites+0.031
opl+0.030
ishes+0.030
orns+0.030
Neg logit contributions
ider-0.023
 slightly-0.022
Dis-0.022
 softened-0.022
Luckily-0.021
Token
Token ID198
Feature activation+1.96e-02

Pos logit contributions
reating+0.041
ishes+0.040
oes+0.039
ites+0.039
opl+0.039
Neg logit contributions
ider-0.037
Dis-0.037
 slightly-0.037
Luckily-0.036
 softened-0.036
Token can
Token ID460
Feature activation+3.19e-02

Pos logit contributions
reating+0.051
ites+0.048
opl+0.048
orns+0.048
oes+0.047
Neg logit contributions
ider-0.075
 slightly-0.072
 softened-0.071
Dis-0.071
Luckily-0.069
Token make
Token ID787
Feature activation+5.70e-02

Pos logit contributions
reating+0.049
orns+0.048
ishes+0.047
ites+0.046
opl+0.045
Neg logit contributions
ider-0.048
 softened-0.046
 slightly-0.046
Luckily-0.045
Dis-0.044
Token all
Token ID477
Feature activation+7.51e-02

Pos logit contributions
oes+0.103
ites+0.097
opl+0.095
bur+0.094
reating+0.094
Neg logit contributions
 strange-0.073
 horrible-0.072
Dis-0.069
 slightly-0.068
 peculiar-0.066
Token the
Token ID262
Feature activation+6.00e-02

Pos logit contributions
bur+0.125
ishes+0.124
girls+0.118
ans+0.118
pter+0.118
Neg logit contributions
 strange-0.106
 horrible-0.097
 shattered-0.097
 tiny-0.096
 softened-0.094
Token beautiful
Token ID4950
Feature activation+5.86e-02

Pos logit contributions
reating+0.092
ites+0.089
oes+0.088
orns+0.087
opl+0.087
Neg logit contributions
ider-0.086
 slightly-0.084
 horrible-0.083
 softened-0.083
Dis-0.083
Token pictures
Token ID5986
Feature activation+7.88e-02

Pos logit contributions
bur+0.090
reating+0.087
oes+0.087
orns+0.083
pp+0.083
Neg logit contributions
 strange-0.084
 slightly-0.080
 softened-0.080
 peculiar-0.079
 horrible-0.077
Token without
Token ID1231
Feature activation+7.57e-02

Pos logit contributions
orns+0.126
ans+0.126
ishes+0.123
opl+0.122
bur+0.121
Neg logit contributions
ider-0.113
 slightly-0.110
 softened-0.101
 ripped-0.099
 shattered-0.098
Token worrying
Token ID18916
Feature activation+4.91e-02

Pos logit contributions
reating+0.108
ites+0.105
opl+0.104
orns+0.103
oes+0.103
Neg logit contributions
ider-0.122
 slightly-0.117
 softened-0.116
Dis-0.115
Luckily-0.110
Token your
Token ID534
Feature activation+5.55e-02

Pos logit contributions
ites+0.071
orns+0.070
ishes+0.070
reating+0.069
bur+0.069
Neg logit contributions
ider-0.075
 slightly-0.075
 softened-0.072
 horrible-0.070
 strange-0.069
Token cr
Token ID1067
Feature activation+5.40e-02

Pos logit contributions
reating+0.092
orns+0.089
ites+0.087
opl+0.086
bur+0.084
Neg logit contributions
ider-0.079
 slightly-0.077
 softened-0.074
Dis-0.073
 horrible-0.072
Tokenay
Token ID323
Feature activation+7.63e-02

Pos logit contributions
ishes+0.088
reating+0.087
fl+0.086
ites+0.085
orns+0.084
Neg logit contributions
ider-0.074
 slightly-0.068
Luckily-0.067
 softened-0.065
Dis-0.065
Token was
Token ID373
Feature activation-6.17e-02

Pos logit contributions
ider+0.009
Dis+0.008
 slightly+0.008
Conf+0.008
Luckily+0.008
Neg logit contributions
ie-0.008
 whenever-0.008
fl-0.008
orns-0.008
opl-0.008
Token a
Token ID257
Feature activation-4.17e-02

Pos logit contributions
ider+0.093
Dis+0.086
Conf+0.082
Luckily+0.078
Just+0.077
Neg logit contributions
reating-0.101
 whenever-0.097
bur-0.095
mel-0.095
opl-0.094
Token delicious
Token ID12625
Feature activation-6.89e-03

Pos logit contributions
ider+0.064
 softened+0.057
Dis+0.055
Luckily+0.054
Just+0.053
Neg logit contributions
reating-0.064
 whenever-0.059
opl-0.058
mel-0.058
fl-0.058
Token red
Token ID2266
Feature activation+1.68e-03

Pos logit contributions
ider+0.012
 softened+0.012
Dis+0.012
 slightly+0.012
Luckily+0.012
Neg logit contributions
reating-0.009
ie-0.008
opl-0.008
fl-0.008
mel-0.008
Token apple
Token ID17180
Feature activation+8.88e-03

Pos logit contributions
reating+0.002
oes+0.002
opl+0.002
ie+0.002
fl+0.002
Neg logit contributions
ider-0.003
 softened-0.003
Dis-0.003
Luckily-0.003
 slightly-0.003
Token.
Token ID13
Feature activation-1.72e-02

Pos logit contributions
reating+0.014
ites+0.014
opl+0.014
ishes+0.014
oes+0.014
Neg logit contributions
ider-0.013
 slightly-0.012
 softened-0.012
Dis-0.012
 horrible-0.011
Token He
Token ID679
Feature activation-3.94e-03

Pos logit contributions
ider+0.022
 slightly+0.021
Dis+0.020
 softened+0.020
Luckily+0.019
Neg logit contributions
reating-0.031
ites-0.030
opl-0.030
orns-0.030
oes-0.029
Token smiled
Token ID13541
Feature activation-1.24e-02

Pos logit contributions
ider+0.006
 slightly+0.005
istered+0.005
Dis+0.005
 horrible+0.005
Neg logit contributions
ie-0.006
 whenever-0.006
ites-0.006
ishes-0.006
mel-0.006
Token.
Token ID13
Feature activation-1.86e-02

Pos logit contributions
ider+0.025
Dis+0.024
 softened+0.023
 slightly+0.023
Luckily+0.022
Neg logit contributions
reating-0.013
 whenever-0.013
fl-0.013
ites-0.012
ishes-0.012
Token
Token ID198
Feature activation+2.83e-03

Pos logit contributions
ider+0.028
 slightly+0.026
 softened+0.026
Dis+0.026
Luckily+0.025
Neg logit contributions
reating-0.030
ites-0.029
opl-0.028
orns-0.028
oes-0.028
Token
Token ID198
Feature activation-3.81e-03

Pos logit contributions
reating+0.005
ishes+0.004
ites+0.004
oes+0.004
orns+0.004
Neg logit contributions
ider-0.004
 slightly-0.004
Dis-0.004
 softened-0.004
Luckily-0.004
Token
Token ID198
Feature activation+5.58e-02

Pos logit contributions
reating+0.126
ishes+0.120
ites+0.117
orns+0.115
fl+0.113
Neg logit contributions
Conf-0.075
 softened-0.074
Luckily-0.073
 slightly-0.073
ider-0.071
Token
Token ID198
Feature activation+4.19e-02

Pos logit contributions
reating+0.084
ishes+0.080
ites+0.079
opl+0.079
orns+0.078
Neg logit contributions
ider-0.088
 slightly-0.084
Dis-0.084
 softened-0.084
Luckily-0.082
TokenFrom
Token ID4863
Feature activation+9.50e-02

Pos logit contributions
reating+0.075
ites+0.074
opl+0.073
orns+0.072
oes+0.071
Neg logit contributions
ider-0.053
 slightly-0.049
 softened-0.049
Dis-0.048
 horrible-0.045
Token that
Token ID326
Feature activation+9.96e-02

Pos logit contributions
reating+0.128
ites+0.123
opl+0.122
orns+0.120
oes+0.119
Neg logit contributions
ider-0.164
 slightly-0.157
Dis-0.155
 softened-0.155
Luckily-0.148
Token day
Token ID1110
Feature activation+4.63e-02

Pos logit contributions
reating+0.120
ishes+0.120
ites+0.115
ans+0.114
orns+0.112
Neg logit contributions
 horrible-0.179
Dis-0.179
 softened-0.176
 shattered-0.175
 strange-0.175
Token on
Token ID319
Feature activation+3.66e-02

Pos logit contributions
ites+0.065
opl+0.063
reating+0.063
orns+0.062
 whenever+0.061
Neg logit contributions
ider-0.076
 slightly-0.072
 softened-0.071
 peculiar-0.071
Dis-0.071
Token,
Token ID11
Feature activation+3.21e-02

Pos logit contributions
reating+0.060
oes+0.055
ites+0.055
ishes+0.055
ans+0.055
Neg logit contributions
ider-0.054
 slightly-0.054
 softened-0.054
Dis-0.052
 horrible-0.051
Token Tim
Token ID5045
Feature activation+4.56e-02

Pos logit contributions
opl+0.060
oes+0.058
reating+0.058
ans+0.056
orns+0.056
Neg logit contributions
 slightly-0.035
Dis-0.035
 softened-0.034
ider-0.033
 strange-0.033
Token was
Token ID373
Feature activation-4.45e-02

Pos logit contributions
reating+0.044
ites+0.042
uggle+0.040
opl+0.038
oes+0.038
Neg logit contributions
 softened-0.092
Dis-0.089
 ripped-0.089
ider-0.088
Luckily-0.088
Token known
Token ID1900
Feature activation+4.01e-02

Pos logit contributions
 softened+0.054
 slightly+0.054
ider+0.053
Luckily+0.053
Dis+0.051
Neg logit contributions
reating-0.081
oes-0.080
ishes-0.079
ites-0.079
orns-0.078
Token as
Token ID355
Feature activation+6.54e-02

Pos logit contributions
reating+0.058
ites+0.057
opl+0.057
orns+0.057
fl+0.056
Neg logit contributions
ider-0.065
 slightly-0.064
 softened-0.060
Dis-0.059
Luckily-0.059
Token."
Token ID526
Feature activation+2.98e-02

Pos logit contributions
opl+0.062
reating+0.062
ites+0.061
ishes+0.061
fl+0.059
Neg logit contributions
 slightly-0.083
ider-0.083
A-0.079
Dis-0.079
 softened-0.078
Token She
Token ID1375
Feature activation+2.05e-02

Pos logit contributions
reating+0.054
opl+0.050
ites+0.050
ishes+0.049
orns+0.049
Neg logit contributions
ider-0.040
 slightly-0.039
Dis-0.038
 softened-0.038
Luckily-0.038
Token pulled
Token ID5954
Feature activation+1.14e-02

Pos logit contributions
ie+0.029
ishes+0.028
bur+0.028
ans+0.028
ites+0.027
Neg logit contributions
ider-0.032
 slightly-0.029
istered-0.029
 peculiar-0.029
 horrible-0.028
Token her
Token ID607
Feature activation+1.45e-02

Pos logit contributions
reating+0.016
opl+0.015
ites+0.015
orns+0.015
oes+0.015
Neg logit contributions
ider-0.019
 slightly-0.019
Dis-0.018
 softened-0.018
Conf-0.017
Token arm
Token ID3211
Feature activation+1.27e-02

Pos logit contributions
reating+0.030
orns+0.029
opl+0.029
ites+0.029
ishes+0.027
Neg logit contributions
 slightly-0.016
ider-0.015
 ripped-0.014
 softened-0.014
 strange-0.014
Token back
Token ID736
Feature activation-4.04e-04

Pos logit contributions
reating+0.016
ites+0.015
opl+0.015
orns+0.015
oes+0.015
Neg logit contributions
ider-0.023
Dis-0.022
 softened-0.022
 slightly-0.021
Luckily-0.021
Token and
Token ID290
Feature activation+6.13e-02

Pos logit contributions
 slightly+0.001
ider+0.001
 softened+0.001
Dis+0.001
 ripped+0.001
Neg logit contributions
reating-0.001
opl-0.001
ites-0.001
orns-0.001
ishes-0.001
Token stayed
Token ID9658
Feature activation-1.56e-02

Pos logit contributions
reating+0.111
ites+0.108
opl+0.107
orns+0.106
oes+0.106
Neg logit contributions
ider-0.077
 slightly-0.073
 softened-0.072
Dis-0.071
Luckily-0.067
Token with
Token ID351
Feature activation+4.65e-02

Pos logit contributions
Dis+0.025
ider+0.025
Conf+0.023
 softened+0.023
Luckily+0.022
Neg logit contributions
ites-0.021
ie-0.021
bur-0.021
ishes-0.021
reating-0.021
Token the
Token ID262
Feature activation+4.18e-02

Pos logit contributions
ites+0.074
oes+0.072
fl+0.071
opl+0.070
pp+0.070
Neg logit contributions
 slightly-0.072
 horrible-0.067
 strange-0.067
 softened-0.066
Dis-0.065
Token cow
Token ID9875
Feature activation+3.41e-02

Pos logit contributions
reating+0.053
ites+0.053
orns+0.051
opl+0.051
ishes+0.051
Neg logit contributions
ider-0.075
 slightly-0.071
Dis-0.070
 softened-0.070
 horrible-0.068
Tokenly
Token ID306
Feature activation-5.28e-03

Pos logit contributions
opl+0.082
reating+0.082
ites+0.081
orns+0.079
bur+0.078
Neg logit contributions
ider-0.045
 softened-0.044
 slightly-0.042
Dis-0.041
 horrible-0.040
Token,
Token ID11
Feature activation+2.38e-03

Pos logit contributions
ider+0.010
 slightly+0.010
Dis+0.010
 softened+0.010
 horrible+0.009
Neg logit contributions
reating-0.006
ites-0.006
ie-0.005
orns-0.005
opl-0.005
Token Lily
Token ID20037
Feature activation+1.41e-02

Pos logit contributions
 whenever+0.003
reating+0.003
mel+0.002
ie+0.002
fl+0.002
Neg logit contributions
ider-0.005
Dis-0.004
 softened-0.004
 slightly-0.004
Luckily-0.004
Token brought
Token ID3181
Feature activation+3.66e-03

Pos logit contributions
ie+0.023
fl+0.022
ishes+0.022
ites+0.021
 whenever+0.021
Neg logit contributions
ider-0.021
 slightly-0.020
 horrible-0.019
Dis-0.018
 peculiar-0.017
Token the
Token ID262
Feature activation+2.58e-02

Pos logit contributions
opl+0.006
reating+0.006
ric+0.006
oes+0.006
ites+0.006
Neg logit contributions
 slightly-0.005
 softened-0.005
 strange-0.005
Dis-0.005
 horrible-0.005
Token sauce
Token ID10746
Feature activation-1.75e-03

Pos logit contributions
reating+0.040
oes+0.039
orns+0.038
opl+0.038
fl+0.038
Neg logit contributions
ider-0.039
 slightly-0.038
Dis-0.037
 softened-0.036
Luckily-0.035
Token home
Token ID1363
Feature activation-3.04e-04

Pos logit contributions
ider+0.003
 slightly+0.003
 softened+0.002
 ripped+0.002
 shattered+0.002
Neg logit contributions
opl-0.003
reating-0.003
ric-0.003
ites-0.003
orns-0.003
Token and
Token ID290
Feature activation+5.64e-02

Pos logit contributions
ider+0.000
 slightly+0.000
 softened+0.000
Dis+0.000
Luckily+0.000
Neg logit contributions
reating-0.000
opl-0.000
ites-0.000
orns-0.000
ishes-0.000
Token showed
Token ID3751
Feature activation+3.64e-02

Pos logit contributions
reating+0.103
ites+0.100
opl+0.099
orns+0.099
ishes+0.098
Neg logit contributions
ider-0.070
 slightly-0.065
Dis-0.064
 softened-0.064
Luckily-0.060
Token it
Token ID340
Feature activation+1.76e-02

Pos logit contributions
opl+0.053
oes+0.053
ric+0.053
ites+0.053
bur+0.052
Neg logit contributions
 slightly-0.057
 softened-0.055
 strange-0.055
 horrible-0.054
Dis-0.054
Token to
Token ID284
Feature activation+3.43e-02

Pos logit contributions
reating+0.019
ites+0.018
oes+0.017
opl+0.017
orns+0.017
Neg logit contributions
ider-0.036
 slightly-0.034
Dis-0.034
 softened-0.034
Luckily-0.033
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
reating+0.025
ites+0.024
opl+0.024
orns+0.023
oes+0.023
Neg logit contributions
ider-0.029
 slightly-0.028
 softened-0.027
Dis-0.027
Luckily-0.026
Token We
Token ID775
Feature activation+3.60e-02

Pos logit contributions
ider+0.025
 slightly+0.024
Dis+0.023
 softened+0.023
Luckily+0.022
Neg logit contributions
reating-0.027
ie-0.027
orns-0.026
ites-0.026
ishes-0.026
Token won
Token ID1839
Feature activation+8.24e-03

Pos logit contributions
reating+0.044
opl+0.041
ites+0.041
orns+0.041
oes+0.040
Neg logit contributions
ider-0.067
 softened-0.064
 slightly-0.064
Dis-0.064
Luckily-0.062
Token't
Token ID470
Feature activation+3.60e-02

Pos logit contributions
reating+0.011
opl+0.011
ites+0.011
orns+0.010
fl+0.010
Neg logit contributions
ider-0.014
 slightly-0.014
 softened-0.014
Dis-0.014
Conf-0.013
Token take
Token ID1011
Feature activation+5.75e-03

Pos logit contributions
reating+0.051
oes+0.048
ites+0.048
opl+0.048
orns+0.047
Neg logit contributions
ider-0.060
Dis-0.057
 slightly-0.057
 softened-0.056
Conf-0.054
Token your
Token ID534
Feature activation+2.65e-02

Pos logit contributions
opl+0.008
reating+0.008
oes+0.008
mel+0.007
bur+0.007
Neg logit contributions
 slightly-0.009
 strange-0.009
ider-0.009
 softened-0.009
 horrible-0.009
Token things
Token ID1243
Feature activation+3.24e-02

Pos logit contributions
reating+0.042
opl+0.040
orns+0.040
ites+0.039
ishes+0.039
Neg logit contributions
ider-0.039
 slightly-0.038
 softened-0.037
Dis-0.037
 ripped-0.036
Token without
Token ID1231
Feature activation+4.59e-02

Pos logit contributions
fl+0.057
opl+0.056
ric+0.056
orns+0.056
ishes+0.055
Neg logit contributions
ider-0.045
 slightly-0.042
 ripped-0.039
 badly-0.039
 horrible-0.038
Token asking
Token ID4737
Feature activation+4.23e-02

Pos logit contributions
reating+0.066
ites+0.063
opl+0.062
orns+0.062
oes+0.061
Neg logit contributions
ider-0.076
 slightly-0.072
Dis-0.071
 softened-0.071
Luckily-0.068
Token."
Token ID526
Feature activation-4.20e-03

Pos logit contributions
reating+0.052
ites+0.050
opl+0.050
orns+0.049
oes+0.048
Neg logit contributions
ider-0.078
 slightly-0.074
 softened-0.073
Dis-0.073
Luckily-0.070
Token
Token ID198
Feature activation-6.22e-03

Pos logit contributions
ider+0.007
 slightly+0.006
 softened+0.006
Dis+0.006
Luckily+0.006
Neg logit contributions
reating-0.006
opl-0.006
orns-0.006
ites-0.006
ishes-0.006
Token.
Token ID13
Feature activation-3.75e-02

Pos logit contributions
orns+0.059
ites+0.058
ishes+0.057
oes+0.057
opl+0.057
Neg logit contributions
 slightly-0.047
 softened-0.044
 strange-0.042
ider-0.042
Conf-0.042
Token
Token ID198
Feature activation-1.57e-02

Pos logit contributions
Dis+0.046
ider+0.046
 slightly+0.046
Conf+0.045
Luckily+0.045
Neg logit contributions
reating-0.069
ie-0.064
ishes-0.063
ites-0.063
bur-0.061
Token
Token ID198
Feature activation-2.07e-02

Pos logit contributions
Luckily+0.018
Dis+0.018
 slightly+0.017
 softened+0.017
ider+0.017
Neg logit contributions
ishes-0.030
reating-0.030
oes-0.029
ites-0.028
orns-0.028
TokenThe
Token ID464
Feature activation+2.64e-03

Pos logit contributions
ider+0.023
Dis+0.022
 slightly+0.022
 softened+0.021
Luckily+0.021
Neg logit contributions
reating-0.040
ishes-0.040
oes-0.039
ites-0.039
opl-0.039
Token reliable
Token ID9314
Feature activation+1.27e-03

Pos logit contributions
reating+0.002
fl+0.002
ie+0.002
ites+0.002
oes+0.002
Neg logit contributions
ider-0.006
 slightly-0.006
 softened-0.005
istered-0.005
Dis-0.005
Token boy
Token ID2933
Feature activation-4.46e-02

Pos logit contributions
reating+0.001
fl+0.001
ites+0.001
oes+0.001
ie+0.001
Neg logit contributions
ider-0.002
 slightly-0.002
Dis-0.002
 softened-0.002
Conf-0.002
Token and
Token ID290
Feature activation+3.02e-02

Pos logit contributions
ider+0.068
 slightly+0.063
Dis+0.062
 softened+0.061
 horrible+0.058
Neg logit contributions
ites-0.068
reating-0.068
opl-0.067
orns-0.066
ishes-0.066
Token his
Token ID465
Feature activation-2.92e-02

Pos logit contributions
reating+0.054
opl+0.052
uggle+0.050
oes+0.049
bur+0.049
Neg logit contributions
 softened-0.038
Dis-0.038
 slightly-0.037
 ripped-0.036
 strange-0.036
Token friends
Token ID2460
Feature activation-2.79e-03

Pos logit contributions
ider+0.060
 slightly+0.058
Dis+0.057
 softened+0.057
Conf+0.055
Neg logit contributions
reating-0.029
opl-0.029
ites-0.028
ishes-0.028
orns-0.028
Token kept
Token ID4030
Feature activation-3.62e-02

Pos logit contributions
ider+0.004
 slightly+0.004
Dis+0.004
 softened+0.004
Luckily+0.003
Neg logit contributions
reating-0.005
ites-0.005
opl-0.004
ishes-0.004
orns-0.004
Token playing
Token ID2712
Feature activation-4.10e-02

Pos logit contributions
 slightly+0.053
 softened+0.052
ider+0.052
Dis+0.051
Luckily+0.048
Neg logit contributions
reating-0.057
oes-0.055
opl-0.054
orns-0.053
ites-0.053
Token a
Token ID257
Feature activation-4.78e-02

Pos logit contributions
ider+0.072
 slightly+0.065
 softened+0.065
Dis+0.064
Luckily+0.059
Neg logit contributions
reating-0.134
ites-0.130
orns-0.130
opl-0.130
fl-0.127
Token bee
Token ID20697
Feature activation-1.84e-02

Pos logit contributions
ider+0.068
 softened+0.065
 slightly+0.063
Dis+0.060
 shattered+0.058
Neg logit contributions
ie-0.077
ites-0.073
 whenever-0.072
fl-0.071
bur-0.071
Token.
Token ID13
Feature activation-4.27e-03

Pos logit contributions
ider+0.022
Luckily+0.020
 softened+0.020
 slightly+0.020
Just+0.019
Neg logit contributions
reating-0.036
ishes-0.033
ites-0.033
oes-0.032
opl-0.032
Token The
Token ID383
Feature activation+1.95e-02

Pos logit contributions
ider+0.007
 softened+0.007
Dis+0.007
 slightly+0.007
Luckily+0.007
Neg logit contributions
opl-0.006
orns-0.006
reating-0.006
oes-0.006
fl-0.005
Token bee
Token ID20697
Feature activation-1.27e-02

Pos logit contributions
reating+0.030
opl+0.029
ites+0.029
orns+0.028
ishes+0.027
Neg logit contributions
ider-0.031
 slightly-0.029
Dis-0.029
 softened-0.029
Luckily-0.028
Token was
Token ID373
Feature activation-6.63e-02

Pos logit contributions
 softened+0.012
ider+0.012
 slightly+0.011
Luckily+0.011
Dis+0.011
Neg logit contributions
reating-0.028
oes-0.025
opl-0.025
ites-0.025
ishes-0.025
Token very
Token ID845
Feature activation-3.29e-02

Pos logit contributions
ider+0.068
 slightly+0.063
 softened+0.062
Dis+0.062
Luckily+0.058
Neg logit contributions
reating-0.136
ites-0.132
opl-0.131
orns-0.130
oes-0.129
Token independent
Token ID4795
Feature activation+3.00e-02

Pos logit contributions
ider+0.034
 slightly+0.032
Dis+0.031
 softened+0.031
Luckily+0.029
Neg logit contributions
reating-0.066
ites-0.065
opl-0.064
oes-0.064
orns-0.064
Token and
Token ID290
Feature activation+3.46e-02

Pos logit contributions
reating+0.045
ites+0.043
opl+0.043
oes+0.043
orns+0.042
Neg logit contributions
ider-0.047
 slightly-0.045
 softened-0.044
Dis-0.044
Luckily-0.042
Token she
Token ID673
Feature activation+1.34e-02

Pos logit contributions
reating+0.076
opl+0.071
ites+0.071
orns+0.070
oes+0.069
Neg logit contributions
 slightly-0.035
 softened-0.032
Dis-0.032
ider-0.032
 strange-0.031
Token liked
Token ID8288
Feature activation-8.46e-04

Pos logit contributions
reating+0.026
ites+0.025
opl+0.025
orns+0.024
oes+0.024
Neg logit contributions
ider-0.016
 slightly-0.014
 softened-0.014
Dis-0.014
Luckily-0.013
Token a
Token ID257
Feature activation-3.88e-02

Pos logit contributions
 slightly+0.027
 strange+0.025
Dis+0.024
A+0.024
 softened+0.023
Neg logit contributions
pp-0.025
ishes-0.025
ans-0.025
opl-0.025
orns-0.025
Token forest
Token ID8222
Feature activation-3.94e-02

Pos logit contributions
ider+0.061
 softened+0.054
Dis+0.052
 slightly+0.052
Luckily+0.050
Neg logit contributions
ites-0.061
reating-0.060
opl-0.059
ie-0.058
uggle-0.057
Token.
Token ID13
Feature activation-1.55e-02

Pos logit contributions
ider+0.056
 slightly+0.053
 softened+0.049
Luckily+0.048
Dis+0.048
Neg logit contributions
reating-0.069
opl-0.066
orns-0.065
ishes-0.065
ites-0.064
Token Everywhere
Token ID48201
Feature activation+1.79e-02

Pos logit contributions
ider+0.027
 softened+0.026
 slightly+0.025
Dis+0.025
Luckily+0.025
Neg logit contributions
oes-0.019
 whenever-0.019
reating-0.019
ishes-0.019
ites-0.018
Token they
Token ID484
Feature activation+1.60e-02

Pos logit contributions
reating+0.024
opl+0.024
ric+0.022
orns+0.021
oes+0.021
Neg logit contributions
ider-0.032
 softened-0.031
Dis-0.031
 slightly-0.031
 strange-0.030
Token looked
Token ID3114
Feature activation-7.08e-02

Pos logit contributions
reating+0.030
opl+0.029
ites+0.029
orns+0.029
oes+0.028
Neg logit contributions
ider-0.019
 slightly-0.018
 softened-0.018
Dis-0.018
Luckily-0.017
Token they
Token ID484
Feature activation+7.57e-03

Pos logit contributions
 strange+0.087
 slightly+0.086
 softened+0.085
ider+0.084
 peculiar+0.080
Neg logit contributions
reating-0.138
opl-0.130
oes-0.123
ric-0.121
bur-0.121
Token saw
Token ID2497
Feature activation-2.84e-02

Pos logit contributions
reating+0.015
opl+0.014
ites+0.014
orns+0.014
oes+0.014
Neg logit contributions
ider-0.008
 slightly-0.008
 softened-0.008
Dis-0.008
Luckily-0.007
Token trees
Token ID7150
Feature activation+1.81e-02

Pos logit contributions
ider+0.047
Dis+0.042
 slightly+0.042
 softened+0.042
Luckily+0.040
Neg logit contributions
reating-0.042
orns-0.041
fl-0.041
ites-0.040
opl-0.040
Token.
Token ID13
Feature activation-2.21e-02

Pos logit contributions
reating+0.028
ites+0.027
opl+0.027
ishes+0.027
orns+0.026
Neg logit contributions
 slightly-0.027
ider-0.026
 softened-0.026
 strange-0.024
 widening-0.024
Token Then
Token ID3244
Feature activation+2.17e-03

Pos logit contributions
ider+0.036
 softened+0.034
 slightly+0.034
Dis+0.033
Luckily+0.032
Neg logit contributions
 whenever-0.030
oes-0.030
ites-0.029
reating-0.029
ishes-0.029
Token time
Token ID640
Feature activation-2.42e-02

Pos logit contributions
ider+0.069
 slightly+0.065
 softened+0.064
Dis+0.064
Luckily+0.061
Neg logit contributions
reating-0.084
ites-0.081
opl-0.081
orns-0.080
oes-0.079
Token,
Token ID11
Feature activation-3.59e-02

Pos logit contributions
ider+0.032
 slightly+0.031
Dis+0.029
 softened+0.028
Conf+0.028
Neg logit contributions
reating-0.045
ishes-0.043
orns-0.043
ans-0.042
opl-0.042
Token in
Token ID287
Feature activation-5.25e-02

Pos logit contributions
ider+0.062
 softened+0.060
Dis+0.058
 slightly+0.057
 widening+0.055
Neg logit contributions
ie-0.046
 whenever-0.045
ites-0.044
oes-0.044
fl-0.044
Token a
Token ID257
Feature activation-7.50e-02

Pos logit contributions
 slightly+0.035
Dis+0.032
ider+0.032
 softened+0.030
 strange+0.030
Neg logit contributions
reating-0.129
oes-0.126
ites-0.125
opl-0.125
ishes-0.124
Token big
Token ID1263
Feature activation-5.48e-02

Pos logit contributions
ider+0.062
Dis+0.060
 slightly+0.060
 softened+0.057
Luckily+0.055
Neg logit contributions
reating-0.169
ites-0.161
oes-0.161
opl-0.161
orns-0.161
Token tree
Token ID5509
Feature activation-6.52e-02

Pos logit contributions
 strange+0.070
 slightly+0.068
 ripped+0.067
 horrible+0.066
 softened+0.065
Neg logit contributions
reating-0.098
oes-0.091
orns-0.088
fl-0.087
mel-0.086
Token,
Token ID11
Feature activation-4.21e-02

Pos logit contributions
ider+0.108
 slightly+0.102
 softened+0.101
Dis+0.099
Luckily+0.098
Neg logit contributions
reating-0.096
ites-0.088
opl-0.088
ishes-0.088
orns-0.088
Token there
Token ID612
Feature activation-3.38e-02

Pos logit contributions
ider+0.071
 softened+0.067
Dis+0.065
 slightly+0.064
 horrible+0.062
Neg logit contributions
ie-0.056
ites-0.056
fl-0.054
oes-0.054
 whenever-0.054
Token was
Token ID373
Feature activation-1.09e-01

Pos logit contributions
ider+0.043
 slightly+0.040
 softened+0.040
Dis+0.040
Luckily+0.038
Neg logit contributions
reating-0.061
ites-0.059
opl-0.059
orns-0.058
oes-0.058
Token a
Token ID257
Feature activation-8.63e-02

Pos logit contributions
ider+0.219
 softened+0.192
istered+0.189
ED+0.180
Conf+0.177
Neg logit contributions
 whenever-0.139
reating-0.128
orns-0.122
fl-0.120
 farms-0.113
Token little
Token ID1310
Feature activation-4.18e-02

Pos logit contributions
ider+0.143
 softened+0.135
Dis+0.123
ified+0.122
 slightly+0.121
Neg logit contributions
ie-0.116
 ostr-0.113
 pump-0.111
 farm-0.110
 whenever-0.109
Token find
Token ID1064
Feature activation-3.88e-02

Pos logit contributions
ider+0.034
 softened+0.032
Dis+0.032
 slightly+0.032
Conf+0.031
Neg logit contributions
reating-0.019
oes-0.017
opl-0.017
orns-0.017
ites-0.017
Token his
Token ID465
Feature activation-2.62e-02

Pos logit contributions
 slightly+0.060
 strange+0.056
 horrible+0.056
 softened+0.055
 shattered+0.054
Neg logit contributions
ites-0.060
reating-0.060
opl-0.058
pp-0.056
fl-0.056
Token way
Token ID835
Feature activation+8.53e-03

Pos logit contributions
ider+0.051
 slightly+0.049
 softened+0.048
Dis+0.048
Luckily+0.047
Neg logit contributions
reating-0.030
ites-0.028
opl-0.028
orns-0.028
oes-0.028
Token back
Token ID736
Feature activation-3.81e-02

Pos logit contributions
opl+0.015
reating+0.015
orns+0.015
ishes+0.015
ites+0.014
Neg logit contributions
 slightly-0.011
ider-0.010
 ripped-0.010
 shattered-0.010
Luckily-0.010
Token home
Token ID1363
Feature activation-3.13e-02

Pos logit contributions
 slightly+0.057
ider+0.052
 strange+0.048
Luckily+0.048
 softened+0.048
Neg logit contributions
reating-0.061
opl-0.060
ites-0.057
ans-0.055
ishes-0.055
Token.
Token ID13
Feature activation-4.66e-02

Pos logit contributions
ider+0.048
 slightly+0.046
 softened+0.045
Dis+0.044
Luckily+0.043
Neg logit contributions
reating-0.048
ites-0.047
opl-0.046
orns-0.046
ishes-0.045
Token The
Token ID383
Feature activation-9.67e-03

Pos logit contributions
ider+0.096
 softened+0.087
istered+0.086
Dis+0.086
 slightly+0.084
Neg logit contributions
 whenever-0.048
oes-0.047
opl-0.044
fl-0.043
bank-0.042
Token fire
Token ID2046
Feature activation-3.21e-02

Pos logit contributions
ider+0.016
Dis+0.016
 softened+0.015
 slightly+0.015
Conf+0.015
Neg logit contributions
fl-0.013
reating-0.013
opl-0.013
oes-0.013
orns-0.012
Token was
Token ID373
Feature activation-1.00e-01

Pos logit contributions
ider+0.048
Dis+0.045
 slightly+0.045
Conf+0.043
 softened+0.043
Neg logit contributions
fl-0.049
oes-0.049
ie-0.049
reating-0.049
ishes-0.048
Token getting
Token ID1972
Feature activation-7.00e-02

Pos logit contributions
ider+0.144
Dis+0.134
Conf+0.124
ED+0.116
Just+0.114
Neg logit contributions
bur-0.167
reating-0.165
mel-0.162
ie-0.158
 whenever-0.157
Token bigger
Token ID5749
Feature activation-2.30e-02

Pos logit contributions
ider+0.103
Dis+0.091
Conf+0.087
ED+0.085
 Looking+0.082
Neg logit contributions
bur-0.119
reating-0.117
ie-0.115
ishes-0.114
 whenever-0.112