TOP ACTIVATIONS
MAX = 11.707

upon a time , there arrived a boy to a stable
big hug . <|endoftext|> Henry awoke and jumped out of bed
day , a big box arrived in the mail . Inside
On a arrived on the first day ,
deep backyard . In the backyard there was an ash tr
. Once they arrived , Mike y was so
. In the woods , Buddy met a little
, in a very deep forest , a little girl named
time , in a deep forest , there lived a little
time , in a big forest , there was a little
One day , Billy arrived in the garden and noticed
He walked through woods , he sailed on the
girl named Sophie . She arrived home from school one day
time , in a big forest , there lived a lion
. When Tim arrived , his grandma gave him
. As Sara arrived to school , she could
When they arrived , Jack was so excited
time , in a big forest , there was a little
time , in a small forest , there lived a little
Suddenly , Pete awoke to some loud noises .

MIN ACTIVATIONS
MIN = -16.507

, how can we help Mitt ens feel fine again ?"
can always make the conscious choice to take control of our
" Thank you for sharing Mitt ens with me . I
was always better than being boss y . And he dreamed
always remembered to say " please " and " thank you
the zo oke ep ers forg ave Tom and said ,
that he wouldn 't scare Mitt ens again . The end
You have to say ' please ' and ' thank you
am taking the rh in oc eros to a new home
love your ch ubby cat Mitt ens ," Sarah said .
wished she had never let Mitt ens run away . <|endoftext|>
hug him and say " we love you , Max ".
about who has the sh ini est gem or the best
't like spiders or cob we bs . But I like
she had tried to fit Mitt ens in the jar .
't know the rh in oc eros was wild . Can
the tree and unt ied Mitt ens from the branches .
" Thank you very much Mrs Berry ." Mrs Berry then
each other and said " we did a great job !
, " We can take Mitt ens to the vet .

INTERVAL 4.654 - 11.707
CONTAINS 6.976%

a time , there was a seal named Sammy . Sammy
upon a time , there was a little fish named Fin
. She was so sleepy - she quit trying to keep
great holiday . <|endoftext|> Once there was a little girl called
mom . From then on , Jack was always careful with

INTERVAL -2.400 - 4.654
CONTAINS 72.385%

. The little boy grabbed a bat and a baseball and
big crack in the wall . It was too big to
fast . He laughed and said , " Whe e !
started to cry . She was wet and sticky and scared
surface . It was so much fun ! At

INTERVAL -9.453 - -2.400
CONTAINS 20.578%

. He looks happy , but he smells bad .
usually did not think of others . He rarely stopped to
isy thanked God for her special day and smiled all the
and flowers . " Mom , can I put
's okay to worry , F uzz y . But it

INTERVAL -16.507 - -9.453
CONTAINS 0.061%

we were trying to be le op ards ," Lily said
he wanted to give his mom ma a special hug .
. Sam can be the boss first , then Mia can
. They did not think Mom would notice .
more sa us ages for Mrs Smith . He quickly returned
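
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The sketch below assumes that binning (the dashboard's own code is not shown in this dump) and checks that the resulting edges match the printed ones to within rounding. The name `interval_edges` and the default of four bins are illustrative assumptions.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX exactly as printed in the dump.
edges = interval_edges(-16.507, 11.707)

# Pair consecutive edges into (low, high) intervals, highest interval first,
# which is the order the dump lists them in.
intervals = [(edges[i], edges[i + 1]) for i in reversed(range(len(edges) - 1))]
for low, high in intervals:
    print(f"INTERVAL {low:.3f} - {high:.3f}")

# The CONTAINS percentages printed for the four intervals, same order.
shares = [6.976, 72.385, 20.578, 0.061]
total = sum(shares)
print(f"total share: {total:.3f}%")
```

Under this assumption each bin is about 7.05 activation units wide, and the printed shares account for all tokens, with most activations falling in the -2.400 to 4.654 interval.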
Token upon
Token ID2402
Feature activation+6.09e+00

Pos logit contributions
 Inside+0.116
 Across+0.099
 Outside+0.097
Inside+0.085
 On+0.085
Neg logit contributions
 harm-0.341
 rid-0.310
 decisions-0.305
 disl-0.300
 choices-0.295
Token a
Token ID257
Feature activation+5.68e+00

Pos logit contributions
 Inside+0.149
 Across+0.137
 Outside+0.135
lic+0.128
 On+0.118
Neg logit contributions
 harm-0.302
 rid-0.270
 decisions-0.261
 disl-0.256
 wounds-0.255
Token time
Token ID640
Feature activation+5.02e+00

Pos logit contributions
 Inside+0.196
 Across+0.175
 Outside+0.169
lic+0.168
Over+0.160
Neg logit contributions
 harm-0.237
 impact-0.200
 decisions-0.191
 disl-0.189
 rid-0.188
Token,
Token ID11
Feature activation+6.89e+00

Pos logit contributions
 Across+0.152
lic+0.150
 Inside+0.150
shop+0.148
Under+0.142
Neg logit contributions
 harm-0.188
 disl-0.168
 decisions-0.155
 rid-0.152
 supervision-0.151
Token there
Token ID612
Feature activation+7.39e+00

Pos logit contributions
shop+0.231
lic+0.229
 Inside+0.213
 Across+0.205
 Together+0.204
Neg logit contributions
 harm-0.230
 rid-0.204
edient-0.190
 possible-0.180
maker-0.175
Token arrived
Token ID5284
Feature activation+1.17e+01

Pos logit contributions
 Inside+0.228
lic+0.223
 Across+0.216
 Outside+0.212
Inside+0.211
Neg logit contributions
 harm-0.312
 rid-0.258
 decisions-0.253
 disl-0.247
 impact-0.241
Token a
Token ID257
Feature activation+6.15e+00

Pos logit contributions
lic+0.420
shop+0.397
 Across+0.386
Over+0.379
 Inside+0.357
Neg logit contributions
 harm-0.370
 decisions-0.367
 rid-0.366
 attention-0.349
 choices-0.345
Token boy
Token ID2933
Feature activation+1.07e+00

Pos logit contributions
 Inside+0.171
 Across+0.156
 Outside+0.154
 On+0.143
Inside+0.138
Neg logit contributions
 harm-0.289
 rid-0.255
 decisions-0.248
 impact-0.246
 disl-0.243
Token to
Token ID284
Feature activation+9.31e-01

Pos logit contributions
 Inside+0.031
 Across+0.029
 Outside+0.028
lic+0.027
Inside+0.026
Neg logit contributions
 harm-0.051
 rid-0.044
 disl-0.042
 decisions-0.042
 supervision-0.041
Token a
Token ID257
Feature activation+5.15e+00

Pos logit contributions
 Inside+0.031
 Across+0.028
 Outside+0.028
Inside+0.026
 On+0.026
Neg logit contributions
 harm-0.040
 rid-0.035
 decisions-0.035
 disl-0.033
 choices-0.033
Token stable
Token ID8245
Feature activation+5.87e+00

Pos logit contributions
 Inside+0.152
 Across+0.138
 Outside+0.137
 On+0.128
Inside+0.127
Neg logit contributions
 harm-0.236
 rid-0.207
 decisions-0.202
 disl-0.201
 impact-0.198
Token big
Token ID1263
Feature activation+9.75e-02

Pos logit contributions
 Inside+0.062
 Across+0.057
 Outside+0.057
 On+0.052
Over+0.052
Neg logit contributions
 harm-0.099
 rid-0.091
 decisions-0.086
 disl-0.085
 impact-0.083
Token hug
Token ID16225
Feature activation-2.11e+00

Pos logit contributions
 Inside+0.004
 Across+0.004
 Outside+0.004
Inside+0.003
 On+0.003
Neg logit contributions
 harm-0.003
 rid-0.003
 decisions-0.003
 disl-0.003
 choices-0.003
Token.
Token ID13
Feature activation-1.37e-01

Pos logit contributions
 harm+0.099
 rid+0.087
 disl+0.087
 decisions+0.086
 wounds+0.085
Neg logit contributions
 Inside-0.063
 Outside-0.056
 Across-0.055
 Suddenly-0.052
 On-0.052
Token<|endoftext|>
Token ID50256
Feature activation+1.64e-01

Pos logit contributions
 harm+0.006
 rid+0.006
 disl+0.006
 decisions+0.006
 impact+0.005
Neg logit contributions
 Inside-0.003
 Across-0.003
Over-0.003
 Outside-0.003
lic-0.003
TokenHenry
Token ID32476
Feature activation+3.25e+00

Pos logit contributions
 Inside+0.004
 Across+0.003
shop+0.003
 Outside+0.003
Over+0.003
Neg logit contributions
 harm-0.008
 rid-0.007
keeping-0.007
 decisions-0.007
 conclusions-0.006
Token awoke
Token ID43363
Feature activation+1.12e+01

Pos logit contributions
 Inside+0.089
 Across+0.082
 Outside+0.081
lic+0.076
Over+0.076
Neg logit contributions
 harm-0.154
 rid-0.137
 decisions-0.133
 disl-0.132
 supervision-0.128
Token and
Token ID290
Feature activation+7.34e-01

Pos logit contributions
 Inside+0.380
 Across+0.358
 Outside+0.350
Over+0.339
lic+0.337
Neg logit contributions
 harm-0.454
 rid-0.399
 decisions-0.387
 disl-0.381
 wounds-0.369
Token jumped
Token ID11687
Feature activation+6.27e+00

Pos logit contributions
 Inside+0.021
 Across+0.019
 Outside+0.019
 Suddenly+0.018
 On+0.018
Neg logit contributions
 harm-0.035
 rid-0.031
 decisions-0.030
 wounds-0.029
 supervision-0.029
Token out
Token ID503
Feature activation+5.71e+00

Pos logit contributions
 Inside+0.192
 Across+0.180
 Outside+0.176
Over+0.169
lic+0.166
Neg logit contributions
 harm-0.270
 rid-0.245
 decisions-0.234
 disl-0.227
 choices-0.222
Token of
Token ID286
Feature activation+1.15e+00

Pos logit contributions
 Inside+0.186
 Across+0.169
 Outside+0.168
 On+0.157
lic+0.156
Neg logit contributions
 harm-0.250
 rid-0.219
 decisions-0.215
 disl-0.212
 supervision-0.208
Token bed
Token ID3996
Feature activation+6.98e+00

Pos logit contributions
 Inside+0.033
 Outside+0.029
 Across+0.029
Inside+0.027
 On+0.027
Neg logit contributions
 harm-0.055
 rid-0.049
 decisions-0.048
 disl-0.047
 choices-0.046
Token day
Token ID1110
Feature activation+7.29e+00

Pos logit contributions
 Inside+0.187
 Across+0.174
 Outside+0.172
lic+0.165
Inside+0.164
Neg logit contributions
 harm-0.201
 rid-0.171
 decisions-0.169
 disl-0.164
 choices-0.161
Token,
Token ID11
Feature activation+7.38e+00

Pos logit contributions
 Inside+0.211
lic+0.210
shop+0.201
 Across+0.198
 Outside+0.191
Neg logit contributions
 harm-0.293
 rid-0.265
 disl-0.250
 decisions-0.248
 attention-0.241
Token a
Token ID257
Feature activation+5.96e+00

Pos logit contributions
lic+0.238
shop+0.227
 Across+0.207
 Inside+0.207
 Together+0.207
Neg logit contributions
 harm-0.264
 rid-0.243
 disl-0.220
 theirs-0.198
keeping-0.197
Token big
Token ID1263
Feature activation+3.04e+00

Pos logit contributions
 Inside+0.177
 Across+0.160
 Outside+0.156
 On+0.148
Inside+0.146
Neg logit contributions
 harm-0.266
 rid-0.235
 disl-0.226
 impact-0.223
 decisions-0.217
Token box
Token ID3091
Feature activation+7.30e+00

Pos logit contributions
 Inside+0.109
 Across+0.101
 Outside+0.100
lic+0.097
Inside+0.097
Neg logit contributions
 harm-0.122
 rid-0.103
 decisions-0.102
 impact-0.100
 disl-0.100
Token arrived
Token ID5284
Feature activation+1.12e+01

Pos logit contributions
 Inside+0.235
 Outside+0.222
 Across+0.222
Inside+0.221
lic+0.217
Neg logit contributions
 harm-0.292
 rid-0.235
 impact-0.222
edient-0.218
 affect-0.216
Token in
Token ID287
Feature activation+5.57e+00

Pos logit contributions
 Inside+0.329
 Across+0.315
lic+0.311
 Outside+0.309
Over+0.292
Neg logit contributions
 harm-0.484
 rid-0.438
 decisions-0.432
 choices-0.417
 wounds-0.403
Token the
Token ID262
Feature activation+3.46e+00

Pos logit contributions
 Inside+0.202
 Across+0.189
 Outside+0.187
lic+0.179
Over+0.176
Neg logit contributions
 harm-0.220
 rid-0.190
 decisions-0.186
 disl-0.182
 choices-0.178
Token mail
Token ID6920
Feature activation+2.69e+00

Pos logit contributions
lic+0.104
 Inside+0.097
 Across+0.093
 Outside+0.087
 On+0.085
Neg logit contributions
 harm-0.148
 rid-0.132
 decisions-0.130
 choices-0.126
 impact-0.125
Token.
Token ID13
Feature activation+1.92e+00

Pos logit contributions
 Inside+0.105
 Across+0.100
 Outside+0.098
lic+0.095
Over+0.095
Neg logit contributions
 harm-0.094
 rid-0.080
 decisions-0.078
 choices-0.075
 supervision-0.074
Token Inside
Token ID14384
Feature activation+8.06e+00

Pos logit contributions
ippy+0.054
lic+0.052
Over+0.048
zag+0.048
pit+0.047
Neg logit contributions
 harm-0.065
keeping-0.063
 disl-0.062
 decisions-0.061
 rid-0.061
Token " "
Token ID220
Feature activation+2.55e+00

Pos logit contributions
lic+0.027
ithering+0.025
Over+0.025
 Across+0.025
ider+0.024
Neg logit contributions
 harm-0.030
 rid-0.028
 decisions-0.027
keeping-0.027
 disl-0.027
Token "\n"
Token ID198
Feature activation+3.16e+00

Pos logit contributions
 Inside+0.054
 Across+0.048
 Outside+0.047
lic+0.042
Inside+0.042
Neg logit contributions
 harm-0.137
 rid-0.124
 decisions-0.122
 disl-0.120
 choices-0.118
Token "\n"
Token ID198
Feature activation+3.76e+00

Pos logit contributions
 Inside+0.073
 Across+0.068
 Outside+0.065
lic+0.060
Over+0.059
Neg logit contributions
 harm-0.161
 rid-0.144
 decisions-0.140
 disl-0.137
 wounds-0.135
TokenOn
Token ID2202
Feature activation+7.02e+00

Pos logit contributions
 Inside+0.088
 Across+0.086
 Outside+0.078
 Together+0.074
lic+0.074
Neg logit contributions
 harm-0.169
 conclusions-0.147
 rid-0.145
 disl-0.145
 affect-0.144
Tokena
Token ID64
Feature activation+3.16e+00

Pos logit contributions
lic+0.288
 Across+0.287
 Inside+0.283
 Outside+0.278
Under+0.268
Neg logit contributions
 harm-0.222
 rid-0.178
 decisions-0.175
 supervision-0.165
 choices-0.165
Token arrived
Token ID5284
Feature activation+1.11e+01

Pos logit contributions
 Inside+0.117
 Across+0.108
 Outside+0.107
lic+0.106
Over+0.103
Neg logit contributions
 harm-0.120
 decisions-0.099
 rid-0.098
 supervision-0.094
 choices-0.092
Token on
Token ID319
Feature activation+4.10e+00

Pos logit contributions
lic+0.344
 Inside+0.316
 Across+0.314
 Outside+0.300
Over+0.296
Neg logit contributions
 harm-0.464
 decisions-0.430
 attention-0.420
 choices-0.419
 rid-0.417
Token the
Token ID262
Feature activation+1.91e+00

Pos logit contributions
lic+0.154
 Inside+0.152
 Across+0.149
 Outside+0.146
Over+0.141
Neg logit contributions
 harm-0.148
 decisions-0.123
 rid-0.123
 choices-0.116
 disl-0.113
Token first
Token ID717
Feature activation+2.52e+00

Pos logit contributions
 Inside+0.061
 Across+0.056
 Outside+0.055
lic+0.055
Over+0.052
Neg logit contributions
 harm-0.083
 rid-0.071
 decisions-0.071
 choices-0.069
 disl-0.068
Token day
Token ID1110
Feature activation+5.30e+00

Pos logit contributions
 Inside+0.077
 Across+0.071
 Outside+0.069
lic+0.068
Over+0.066
Neg logit contributions
 harm-0.112
 decisions-0.097
 rid-0.095
 choices-0.092
 impact-0.092
Token,
Token ID11
Feature activation+6.14e+00

Pos logit contributions
 Inside+0.143
 Across+0.134
 Outside+0.127
lic+0.127
Over+0.126
Neg logit contributions
 harm-0.245
 decisions-0.219
 rid-0.216
 choices-0.209
 supervision-0.205
Token deep
Token ID2769
Feature activation+5.46e+00

Pos logit contributions
 Inside+0.150
 Across+0.135
 Outside+0.135
 On+0.126
Inside+0.124
Neg logit contributions
 harm-0.219
 rid-0.194
 decisions-0.193
 disl-0.190
 wounds-0.186
Token backyard
Token ID24296
Feature activation+1.02e+01

Pos logit contributions
 Inside+0.172
lic+0.166
Over+0.164
 Across+0.159
Inside+0.158
Neg logit contributions
 harm-0.234
 impact-0.201
 decisions-0.198
 rid-0.192
 disl-0.190
Token.
Token ID13
Feature activation+2.50e+00

Pos logit contributions
 Inside+0.437
 Across+0.418
 Outside+0.415
Inside+0.409
lic+0.403
Neg logit contributions
 harm-0.311
 decisions-0.245
 rid-0.243
 choices-0.239
 impact-0.234
Token In
Token ID554
Feature activation+7.03e+00

Pos logit contributions
lic+0.075
ippy+0.071
Over+0.070
uff+0.065
ithering+0.064
Neg logit contributions
 harm-0.088
 disl-0.079
keeping-0.074
 decisions-0.074
 impact-0.072
Token the
Token ID262
Feature activation+3.79e+00

Pos logit contributions
lic+0.307
 Inside+0.290
 Across+0.290
 Outside+0.286
shop+0.273
Neg logit contributions
 harm-0.213
 rid-0.174
 decisions-0.172
 disl-0.171
 choices-0.166
Token backyard
Token ID24296
Feature activation+1.10e+01

Pos logit contributions
 Inside+0.113
 Across+0.103
lic+0.101
 Outside+0.100
Inside+0.096
Neg logit contributions
 harm-0.169
 decisions-0.151
 rid-0.150
 choices-0.148
 wounds-0.143
Token there
Token ID612
Feature activation+7.16e+00

Pos logit contributions
lic+0.317
 Across+0.293
 Inside+0.291
Under+0.289
Inside+0.286
Neg logit contributions
 harm-0.474
 decisions-0.416
 choices-0.415
 disl-0.405
 impact-0.397
Token was
Token ID373
Feature activation+6.27e+00

Pos logit contributions
lic+0.204
 Inside+0.194
 Across+0.187
Inside+0.181
 Outside+0.181
Neg logit contributions
 harm-0.313
 rid-0.266
 decisions-0.264
 choices-0.261
 disl-0.261
Token an
Token ID281
Feature activation+2.16e+00

Pos logit contributions
 Inside+0.169
 Outside+0.166
 Across+0.165
lic+0.163
 Along+0.153
Neg logit contributions
 harm-0.259
 rid-0.240
 ger-0.227
 decisions-0.225
 choices-0.221
Token ash
Token ID12530
Feature activation+4.03e+00

Pos logit contributions
 Inside+0.080
 Across+0.074
 Outside+0.074
 On+0.069
Inside+0.069
Neg logit contributions
 harm-0.083
 rid-0.073
 decisions-0.070
 disl-0.069
 supervision-0.068
Tokentr
Token ID2213
Feature activation+3.75e+00

Pos logit contributions
 Inside+0.098
 Across+0.086
 Outside+0.085
Inside+0.082
Over+0.081
Neg logit contributions
 harm-0.204
 rid-0.179
 decisions-0.175
 choices-0.173
keeping-0.169
Token.
Token ID13
Feature activation+7.40e-01

Pos logit contributions
 Inside+0.134
 Outside+0.117
On+0.116
 Across+0.113
Inside+0.112
Neg logit contributions
 harm-0.207
 disl-0.187
 rid-0.181
 decisions-0.179
 supervision-0.179
Token "\n"
Token ID198
Feature activation+3.15e+00

Pos logit contributions
lic+0.021
shop+0.021
Over+0.020
 Across+0.019
P+0.019
Neg logit contributions
 harm-0.024
 disl-0.023
 rid-0.023
 decisions-0.023
keeping-0.022
Token "\n"
Token ID198
Feature activation+3.72e+00

Pos logit contributions
 Across+0.079
 Inside+0.078
 Outside+0.073
shop+0.069
lic+0.068
Neg logit contributions
 harm-0.146
 rid-0.129
 decisions-0.123
 disl-0.120
 wounds-0.119
TokenOnce
Token ID7454
Feature activation+6.20e+00

Pos logit contributions
 Across+0.081
 Inside+0.079
 Outside+0.073
shop+0.072
�+0.071
Neg logit contributions
 harm-0.165
 rid-0.148
 disl-0.146
 decisions-0.145
edient-0.138
Token they
Token ID484
Feature activation+2.80e+00

Pos logit contributions
 Inside+0.184
 Across+0.184
 Outside+0.177
lic+0.176
shop+0.169
Neg logit contributions
 harm-0.250
 rid-0.233
 decisions-0.226
 choices-0.224
 attention-0.218
Token arrived
Token ID5284
Feature activation+1.10e+01

Pos logit contributions
 Across+0.068
 Inside+0.067
lic+0.067
 Outside+0.066
shop+0.064
Neg logit contributions
 harm-0.128
 rid-0.116
 permission-0.106
 impact-0.106
 benefit-0.102
Token,
Token ID11
Feature activation+6.67e+00

Pos logit contributions
lic+0.383
 Across+0.349
shop+0.344
 Inside+0.343
Inside+0.337
Neg logit contributions
 harm-0.400
 decisions-0.384
 choices-0.377
 rid-0.377
 attention-0.375
Token Mike
Token ID4995
Feature activation-1.67e-01

Pos logit contributions
shop+0.258
lic+0.256
 Across+0.253
 Inside+0.246
 Outside+0.245
Neg logit contributions
 harm-0.191
 rid-0.188
 decisions-0.172
 choices-0.171
 attention-0.160
Tokeny
Token ID88
Feature activation+4.98e+00

Pos logit contributions
 harm+0.008
 rid+0.007
 decisions+0.006
 disl+0.006
 supervision+0.006
Neg logit contributions
 Inside-0.005
 Outside-0.005
 Across-0.005
Over-0.004
Inside-0.004
Token was
Token ID373
Feature activation+4.80e+00

Pos logit contributions
 Inside+0.129
 Across+0.116
 Outside+0.116
lic+0.107
 On+0.106
Neg logit contributions
 harm-0.243
 rid-0.216
 decisions-0.213
 disl-0.206
 wounds-0.204
Token so
Token ID523
Feature activation-2.89e-01

Pos logit contributions
 Inside+0.167
 Across+0.155
 Outside+0.153
Inside+0.148
Over+0.148
Neg logit contributions
 harm-0.192
 rid-0.173
 decisions-0.163
 wounds-0.154
 disl-0.153
Token.
Token ID13
Feature activation+1.37e+00

Pos logit contributions
 Inside+0.305
 Outside+0.272
 Across+0.270
On+0.261
 On+0.260
Neg logit contributions
 harm-0.423
 rid-0.381
 disl-0.379
 decisions-0.378
 supervision-0.368
Token "\n"
Token ID198
Feature activation+3.76e+00

Pos logit contributions
shop+0.044
pit+0.042
lic+0.040
Over+0.039
Along+0.038
Neg logit contributions
 harm-0.049
 rid-0.043
 disl-0.040
 affect-0.037
 decisions-0.036
Token "\n"
Token ID198
Feature activation+4.23e+00

Pos logit contributions
 Across+0.098
 Inside+0.098
 Outside+0.092
shop+0.090
Over+0.086
Neg logit contributions
 harm-0.171
 rid-0.148
 decisions-0.136
 disl-0.134
 wounds-0.134
TokenIn
Token ID818
Feature activation+7.11e+00

Pos logit contributions
pit+0.135
shop+0.130
uff+0.122
 Inside+0.120
oss+0.120
Neg logit contributions
 harm-0.153
 rid-0.123
 affect-0.116
edient-0.111
 disl-0.104
Token the
Token ID262
Feature activation+3.75e+00

Pos logit contributions
lic+0.370
 Across+0.327
shop+0.320
 Outside+0.315
Under+0.314
Neg logit contributions
 harm-0.189
 rid-0.133
 decisions-0.121
 supervision-0.117
 choices-0.116
Token woods
Token ID16479
Feature activation+1.09e+01

Pos logit contributions
lic+0.122
 Inside+0.112
 Across+0.101
shop+0.097
uttered+0.096
Neg logit contributions
 harm-0.158
 decisions-0.136
 impact-0.133
 choices-0.132
 difference-0.130
Token,
Token ID11
Feature activation+6.53e+00

Pos logit contributions
 Inside+0.282
 Across+0.272
lic+0.266
 Outside+0.257
shop+0.249
Neg logit contributions
 harm-0.532
 rid-0.456
 decisions-0.453
 choices-0.441
 disl-0.426
Token Buddy
Token ID36896
Feature activation+2.26e+00

Pos logit contributions
shop+0.251
lic+0.232
 Inside+0.230
 Across+0.227
 Outside+0.226
Neg logit contributions
 harm-0.230
 rid-0.196
 attention-0.167
 choices-0.163
 wounds-0.161
Token met
Token ID1138
Feature activation+5.22e+00

Pos logit contributions
 Inside+0.062
 Across+0.057
 Outside+0.055
Over+0.053
lic+0.053
Neg logit contributions
 harm-0.108
 rid-0.095
 decisions-0.089
 disl-0.088
 wounds-0.087
Token a
Token ID257
Feature activation+4.55e+00

Pos logit contributions
lic+0.149
 Inside+0.148
�+0.147
shop+0.147
 Across+0.147
Neg logit contributions
 harm-0.217
 rid-0.193
 decisions-0.181
 kindness-0.180
 attention-0.179
Token little
Token ID1310
Feature activation+1.83e+00

Pos logit contributions
 Inside+0.159
 Across+0.151
 Outside+0.148
 On+0.140
 Along+0.136
Neg logit contributions
 harm-0.182
 rid-0.154
 disl-0.148
 impact-0.146
 decisions-0.145
Token,
Token ID11
Feature activation+7.48e+00

Pos logit contributions
lic+0.168
 Across+0.167
 Inside+0.166
shop+0.163
Under+0.158
Neg logit contributions
 harm-0.224
 disl-0.199
 decisions-0.190
 rid-0.185
 supervision-0.183
Token in
Token ID287
Feature activation+6.99e+00

Pos logit contributions
lic+0.241
shop+0.237
 Inside+0.218
 Across+0.212
Inside+0.211
Neg logit contributions
 harm-0.264
 rid-0.228
edient-0.211
 possible-0.205
 theirs-0.199
Token a
Token ID257
Feature activation+7.81e+00

Pos logit contributions
lic+0.192
 Inside+0.182
 Across+0.181
 Outside+0.174
shop+0.166
Neg logit contributions
 harm-0.312
 rid-0.275
 decisions-0.262
 wounds-0.258
 disl-0.255
Token very
Token ID845
Feature activation+4.00e+00

Pos logit contributions
 Inside+0.247
lic+0.233
 Across+0.219
 Outside+0.209
 On+0.200
Neg logit contributions
 harm-0.341
 disl-0.289
 conclusions-0.283
 rid-0.279
 impact-0.278
Token deep
Token ID2769
Feature activation+7.66e+00

Pos logit contributions
 Inside+0.120
 Across+0.109
 Outside+0.107
lic+0.103
Inside+0.101
Neg logit contributions
 harm-0.181
 rid-0.158
 decisions-0.156
 disl-0.155
 choices-0.150
Token forest
Token ID8222
Feature activation+1.09e+01

Pos logit contributions
 Inside+0.235
lic+0.228
Over+0.220
 Across+0.219
Inside+0.212
Neg logit contributions
 harm-0.346
 decisions-0.285
 rid-0.283
 impact-0.282
 disl-0.276
Token,
Token ID11
Feature activation+7.10e+00

Pos logit contributions
 Inside+0.302
 Across+0.281
lic+0.279
 Outside+0.275
shop+0.269
Neg logit contributions
 harm-0.511
 decisions-0.428
 rid-0.420
 disl-0.415
 choices-0.414
Token a
Token ID257
Feature activation+6.56e+00

Pos logit contributions
lic+0.210
shop+0.198
 Across+0.186
 Outside+0.183
 Inside+0.181
Neg logit contributions
 harm-0.282
 rid-0.247
 decisions-0.235
 disl-0.233
 choices-0.231
Token little
Token ID1310
Feature activation+3.26e+00

Pos logit contributions
 Inside+0.190
 Across+0.173
 Outside+0.167
 On+0.160
shop+0.154
Neg logit contributions
 harm-0.300
 rid-0.266
 disl-0.256
 impact-0.253
 decisions-0.252
Token girl
Token ID2576
Feature activation+1.70e+00

Pos logit contributions
 Inside+0.127
 Across+0.120
 Outside+0.117
Inside+0.116
lic+0.114
Neg logit contributions
 harm-0.118
 rid-0.100
 decisions-0.097
 impact-0.095
 choices-0.093
Token named
Token ID3706
Feature activation-1.91e+00

Pos logit contributions
 Inside+0.051
 Across+0.048
lic+0.047
 Outside+0.046
Inside+0.045
Neg logit contributions
 harm-0.077
 rid-0.065
 disl-0.063
 decisions-0.061
 impact-0.060
Token time
Token ID640
Feature activation+5.39e+00

Pos logit contributions
 Inside+0.209
 Across+0.189
 Outside+0.183
lic+0.182
Over+0.174
Neg logit contributions
 harm-0.254
 impact-0.214
 disl-0.208
 decisions-0.206
 rid-0.203
Token,
Token ID11
Feature activation+7.32e+00

Pos logit contributions
lic+0.167
 Inside+0.164
 Across+0.163
shop+0.159
 Outside+0.155
Neg logit contributions
 harm-0.203
 disl-0.177
 decisions-0.170
 rid-0.164
 choices-0.163
Token in
Token ID287
Feature activation+6.66e+00

Pos logit contributions
shop+0.261
lic+0.258
 Inside+0.229
boards+0.228
 Outside+0.227
Neg logit contributions
 harm-0.236
 rid-0.204
edient-0.185
 possible-0.176
 attention-0.171
Token a
Token ID257
Feature activation+7.51e+00

Pos logit contributions
lic+0.180
 Inside+0.176
 Across+0.172
 Outside+0.168
shop+0.158
Neg logit contributions
 harm-0.296
 rid-0.263
 decisions-0.247
 wounds-0.246
 disl-0.242
Token deep
Token ID2769
Feature activation+7.62e+00

Pos logit contributions
 Inside+0.232
lic+0.218
 Across+0.207
 Outside+0.200
 On+0.186
Neg logit contributions
 harm-0.328
 disl-0.279
 rid-0.273
 impact-0.270
 wounds-0.269
Token forest
Token ID8222
Feature activation+1.09e+01

Pos logit contributions
lic+0.231
 Inside+0.230
Over+0.224
 Across+0.221
Inside+0.211
Neg logit contributions
 harm-0.333
 impact-0.273
 decisions-0.271
 rid-0.268
 disl-0.267
Token,
Token ID11
Feature activation+7.19e+00

Pos logit contributions
 Inside+0.299
lic+0.284
 Across+0.283
 Outside+0.274
shop+0.268
Neg logit contributions
 harm-0.508
 decisions-0.420
 rid-0.415
 disl-0.412
 choices-0.406
Token there
Token ID612
Feature activation+7.75e+00

Pos logit contributions
shop+0.228
lic+0.227
 Across+0.204
 Outside+0.202
ippy+0.198
Neg logit contributions
 harm-0.256
 rid-0.223
 choices-0.205
edient-0.204
 decisions-0.203
Token lived
Token ID5615
Feature activation+9.43e+00

Pos logit contributions
lic+0.283
 Inside+0.266
 Along+0.254
Inside+0.253
 Across+0.251
Neg logit contributions
 harm-0.281
 happen-0.236
 disl-0.222
 decisions-0.217
 rid-0.214
Token a
Token ID257
Feature activation+5.35e+00

Pos logit contributions
 Across+0.292
lic+0.285
shop+0.280
 Outside+0.273
�+0.269
Neg logit contributions
 rid-0.326
 harm-0.315
 permission-0.264
 decisions-0.257
 kindness-0.251
Token little
Token ID1310
Feature activation+1.99e+00

Pos logit contributions
 Inside+0.166
 Across+0.157
 Outside+0.153
 On+0.141
shop+0.138
Neg logit contributions
 harm-0.231
 rid-0.200
 disl-0.193
 impact-0.190
 decisions-0.189
Token time
Token ID640
Feature activation+5.52e+00

Pos logit contributions
 Inside+0.217
 Across+0.198
lic+0.191
 Outside+0.190
Over+0.179
Neg logit contributions
 harm-0.248
 impact-0.206
 disl-0.197
 decisions-0.194
 rid-0.193
Token,
Token ID11
Feature activation+7.36e+00

Pos logit contributions
 Across+0.168
lic+0.168
 Inside+0.168
shop+0.166
Under+0.158
Neg logit contributions
 harm-0.208
 disl-0.181
 decisions-0.170
 rid-0.169
 choices-0.165
Token in
Token ID287
Feature activation+6.84e+00

Pos logit contributions
shop+0.255
lic+0.244
 Inside+0.227
 Across+0.224
 Along+0.224
Neg logit contributions
 harm-0.247
 rid-0.215
edient-0.200
 possible-0.183
 attention-0.177
Token a
Token ID257
Feature activation+7.50e+00

Pos logit contributions
lic+0.196
 Across+0.185
 Inside+0.183
 Outside+0.177
shop+0.169
Neg logit contributions
 harm-0.299
 rid-0.263
 decisions-0.244
 disl-0.242
 wounds-0.241
Token big
Token ID1263
Feature activation+4.65e+00

Pos logit contributions
 Inside+0.236
lic+0.232
 Across+0.213
 Outside+0.204
 Together+0.195
Neg logit contributions
 harm-0.320
 disl-0.270
 conclusions-0.260
 wounds-0.260
 rid-0.260
Token forest
Token ID8222
Feature activation+1.09e+01

Pos logit contributions
 Inside+0.134
lic+0.128
 Across+0.124
 Outside+0.121
Inside+0.119
Neg logit contributions
 harm-0.217
 decisions-0.186
 impact-0.184
 disl-0.181
 rid-0.180
Token,
Token ID11
Feature activation+7.16e+00

Pos logit contributions
 Inside+0.304
lic+0.303
 Across+0.298
 Outside+0.287
shop+0.285
Neg logit contributions
 harm-0.492
 decisions-0.397
 disl-0.393
 choices-0.392
 rid-0.390
Token there
Token ID612
Feature activation+7.82e+00

Pos logit contributions
shop+0.219
lic+0.219
 Across+0.200
�+0.200
 Outside+0.198
Neg logit contributions
 harm-0.266
 rid-0.231
 choices-0.217
edient-0.215
 happen-0.211
Token was
Token ID373
Feature activation+6.86e+00

Pos logit contributions
lic+0.285
 Inside+0.257
 Across+0.251
 Along+0.249
Along+0.247
Neg logit contributions
 harm-0.289
 happen-0.242
 disl-0.227
 decisions-0.221
 rid-0.221
Token a
Token ID257
Feature activation+5.46e+00

Pos logit contributions
lic+0.236
 Across+0.226
oped+0.221
 Outside+0.219
pit+0.218
Neg logit contributions
 harm-0.210
 rid-0.184
 ger-0.178
 kindness-0.158
 disl-0.151
Token little
Token ID1310
Feature activation+2.14e+00

Pos logit contributions
 Inside+0.172
 Across+0.162
 Outside+0.158
 On+0.147
 Along+0.142
Neg logit contributions
 harm-0.233
 rid-0.199
 impact-0.195
 disl-0.194
 difference-0.192
Token "\n"
Token ID198
Feature activation+3.59e+00

Pos logit contributions
 Inside+0.080
 Across+0.079
 Outside+0.075
lic+0.072
shop+0.071
Neg logit contributions
 harm-0.142
 rid-0.120
 disl-0.115
 decisions-0.114
 impact-0.113
TokenOne
Token ID3198
Feature activation+4.45e+00

Pos logit contributions
 Inside+0.082
 Across+0.075
lic+0.075
 Outside+0.073
shop+0.067
Neg logit contributions
 harm-0.178
 disl-0.152
 impact-0.146
 rid-0.146
 decisions-0.145
Token day
Token ID1110
Feature activation+6.93e+00

Pos logit contributions
 Inside+0.191
 Across+0.184
lic+0.180
shop+0.176
 Outside+0.175
Neg logit contributions
 harm-0.136
 rid-0.106
 decisions-0.101
 attention-0.100
 choices-0.096
Token,
Token ID11
Feature activation+6.84e+00

Pos logit contributions
shop+0.222
lic+0.214
 Inside+0.201
pit+0.199
 Across+0.197
Neg logit contributions
 harm-0.273
 rid-0.236
 disl-0.214
 attention-0.212
 decisions-0.211
Token Billy
Token ID15890
Feature activation-2.59e-01

Pos logit contributions
 Inside+0.275
 Across+0.268
lic+0.264
 Outside+0.263
shop+0.260
Neg logit contributions
 harm-0.224
 rid-0.192
 decisions-0.173
 disl-0.170
 choices-0.169
Token arrived
Token ID5284
Feature activation+1.09e+01

Pos logit contributions
 harm+0.011
 rid+0.010
keeping+0.009
 decisions+0.009
 disl+0.009
Neg logit contributions
 Inside-0.008
 Across-0.007
 Outside-0.007
lic-0.007
Over-0.007
Token in
Token ID287
Feature activation+5.01e+00

Pos logit contributions
lic+0.350
 Inside+0.346
 Across+0.338
Over+0.331
 Outside+0.327
Neg logit contributions
 harm-0.426
 rid-0.393
 decisions-0.389
 choices-0.366
 attention-0.363
Token the
Token ID262
Feature activation+3.00e+00

Pos logit contributions
 Inside+0.185
 Across+0.178
lic+0.176
 Outside+0.174
Over+0.169
Neg logit contributions
 harm-0.190
 rid-0.162
 decisions-0.157
 disl-0.149
 choices-0.148
Token garden
Token ID11376
Feature activation+9.14e+00

Pos logit contributions
lic+0.091
 Inside+0.090
 Across+0.082
 Outside+0.077
 On+0.077
Neg logit contributions
 harm-0.123
 rid-0.113
 decisions-0.110
 conclusions-0.106
 difference-0.105
Token and
Token ID290
Feature activation-6.41e-01

Pos logit contributions
 Inside+0.327
 Across+0.311
lic+0.305
 Outside+0.303
Over+0.294
Neg logit contributions
 harm-0.345
 rid-0.299
 decisions-0.295
 choices-0.280
 disl-0.276
Token noticed
Token ID6810
Feature activation+4.87e+00

Pos logit contributions
 harm+0.028
 rid+0.026
 decisions+0.024
 disl+0.024
 wounds+0.024
Neg logit contributions
 Inside-0.018
 Across-0.017
 Outside-0.017
lic-0.016
 On-0.016
Token "\n"
Token ID198
Feature activation+2.90e+00

Pos logit contributions
lic+0.015
Over+0.014
 Across+0.014
 Inside+0.014
shop+0.014
Neg logit contributions
 harm-0.024
 decisions-0.022
 rid-0.022
 disl-0.021
 choices-0.021
Token "\n"
Token ID198
Feature activation+3.44e+00

Pos logit contributions
 Inside+0.071
 Across+0.069
 Outside+0.066
Over+0.060
lic+0.059
Neg logit contributions
 harm-0.141
 rid-0.123
 decisions-0.119
 disl-0.116
 wounds-0.115
TokenHe
Token ID1544
Feature activation+5.20e+00

Pos logit contributions
 Inside+0.091
 Across+0.082
 Outside+0.079
lic+0.073
 Together+0.072
Neg logit contributions
 harm-0.150
 conclusions-0.124
 rid-0.123
 disl-0.122
 decisions-0.122
Token walked
Token ID6807
Feature activation+5.61e+00

Pos logit contributions
 Inside+0.151
 Across+0.146
 Outside+0.146
lic+0.138
 On+0.131
Neg logit contributions
 harm-0.228
 rid-0.194
 wounds-0.182
 permission-0.172
 decisions-0.171
Token through
Token ID832
Feature activation+7.25e+00

Pos logit contributions
 Across+0.214
 Inside+0.212
 Outside+0.203
Over+0.202
lic+0.198
Neg logit contributions
 harm-0.182
 rid-0.165
 attention-0.155
 decisions-0.147
 choices-0.145
Token woods
Token ID16479
Feature activation+1.09e+01

Pos logit contributions
 Inside+0.287
 Across+0.276
 Outside+0.275
lic+0.274
Over+0.260
Neg logit contributions
 harm-0.256
 rid-0.210
 decisions-0.209
 choices-0.196
 wounds-0.195
Token,
Token ID11
Feature activation+5.64e+00

Pos logit contributions
 Inside+0.297
 Across+0.275
 Outside+0.268
lic+0.257
Over+0.249
Neg logit contributions
 harm-0.527
 rid-0.464
 decisions-0.455
 disl-0.439
 choices-0.439
Token he
Token ID339
Feature activation+4.44e+00

Pos logit contributions
 Inside+0.172
 Across+0.159
 Outside+0.156
lic+0.150
Over+0.146
Neg logit contributions
 harm-0.249
 rid-0.223
 decisions-0.210
 wounds-0.207
 disl-0.206
Token sailed
Token ID33835
Feature activation+5.47e+00

Pos logit contributions
 Inside+0.128
 Across+0.117
 Outside+0.114
lic+0.110
 On+0.109
Neg logit contributions
 harm-0.202
 rid-0.182
 wounds-0.162
 decisions-0.160
 supervision-0.160
Token on
Token ID319
Feature activation+2.82e+00

Pos logit contributions
 Inside+0.172
 Outside+0.155
 Across+0.154
On+0.146
Inside+0.145
Neg logit contributions
 harm-0.247
 disl-0.216
 rid-0.216
 decisions-0.214
 supervision-0.211
Token the
Token ID262
Feature activation+6.90e-01

Pos logit contributions
 Inside+0.080
 Across+0.075
 Outside+0.074
lic+0.072
Over+0.069
Neg logit contributions
 harm-0.132
 rid-0.115
 decisions-0.113
 wounds-0.109
 choices-0.108
Token girl
Token ID2576
Feature activation+1.16e-01

Pos logit contributions
 Inside+0.129
 Across+0.117
 Outside+0.113
 On+0.104
lic+0.102
Neg logit contributions
 harm-0.235
 rid-0.208
 disl-0.201
 impact-0.199
 decisions-0.198
Token named
Token ID3706
Feature activation-2.77e+00

Pos logit contributions
 Inside+0.003
 Outside+0.002
 Across+0.002
 On+0.002
On+0.002
Neg logit contributions
 harm-0.006
 rid-0.006
 decisions-0.006
 disl-0.005
 choices-0.005
Token Sophie
Token ID35331
Feature activation-9.55e-01

Pos logit contributions
 harm+0.123
 rid+0.110
 disl+0.105
 decisions+0.105
 supervision+0.102
Neg logit contributions
 Inside-0.084
 Across-0.079
 Outside-0.077
Over-0.073
lic-0.073
Token.
Token ID13
Feature activation+1.19e+00

Pos logit contributions
 harm+0.044
 decisions+0.042
 rid+0.040
 choices+0.039
 disl+0.039
Neg logit contributions
 Inside-0.029
 Outside-0.025
 Across-0.024
On-0.024
 On-0.024
Token She
Token ID1375
Feature activation+3.87e+00

Pos logit contributions
lic+0.027
 Inside+0.026
 Across+0.026
Over+0.025
 Outside+0.025
Neg logit contributions
 harm-0.054
 disl-0.049
 impact-0.049
 decisions-0.049
 rid-0.048
Token arrived
Token ID5284
Feature activation+1.09e+01

Pos logit contributions
 Inside+0.128
 Across+0.119
 Outside+0.116
shop+0.110
lic+0.110
Neg logit contributions
 harm-0.162
 rid-0.140
 disl-0.138
 affect-0.128
 wounds-0.127
Token home
Token ID1363
Feature activation+5.76e+00

Pos logit contributions
lic+0.358
 Inside+0.354
 Across+0.349
 Outside+0.336
Inside+0.331
Neg logit contributions
 harm-0.403
 decisions-0.369
 rid-0.368
 attention-0.356
 choices-0.355
Token from
Token ID422
Feature activation+1.96e+00

Pos logit contributions
 Inside+0.198
 Across+0.189
 Outside+0.184
lic+0.176
Inside+0.174
Neg logit contributions
 harm-0.235
 rid-0.205
 decisions-0.197
 choices-0.192
 wounds-0.188
Token school
Token ID1524
Feature activation+5.39e+00

Pos logit contributions
 Inside+0.065
 Across+0.059
 Outside+0.059
Inside+0.055
 On+0.055
Neg logit contributions
 harm-0.084
 rid-0.074
 decisions-0.073
 disl-0.071
 choices-0.069
Token one
Token ID530
Feature activation+1.73e+00

Pos logit contributions
 Inside+0.164
 Across+0.156
lic+0.155
 Outside+0.150
Over+0.146
Neg logit contributions
 harm-0.229
 decisions-0.196
 choices-0.192
 supervision-0.192
 rid-0.191
Token day
Token ID1110
Feature activation+4.84e+00

Pos logit contributions
 Inside+0.063
 Across+0.059
 Outside+0.058
lic+0.057
Inside+0.055
Neg logit contributions
 harm-0.068
 rid-0.058
 decisions-0.058
 choices-0.056
 impact-0.055
Token time
Token ID640
Feature activation+5.46e+00

Pos logit contributions
 Inside+0.213
 Across+0.193
lic+0.188
 Outside+0.185
Over+0.175
Neg logit contributions
 harm-0.249
 impact-0.206
 disl-0.200
 decisions-0.196
 rid-0.196
Token,
Token ID11
Feature activation+7.28e+00

Pos logit contributions
lic+0.170
 Across+0.170
 Inside+0.169
shop+0.167
Under+0.158
Neg logit contributions
 harm-0.202
 disl-0.168
 rid-0.160
 decisions-0.157
 attention-0.155
Token in
Token ID287
Feature activation+6.86e+00

Pos logit contributions
shop+0.257
lic+0.241
�+0.229
 Across+0.226
 Inside+0.226
Neg logit contributions
 harm-0.244
 rid-0.213
edient-0.192
 attention-0.184
 possible-0.174
Token a
Token ID257
Feature activation+7.59e+00

Pos logit contributions
lic+0.201
 Across+0.189
 Inside+0.187
 Outside+0.181
shop+0.179
Neg logit contributions
 harm-0.293
 rid-0.260
 wounds-0.238
 disl-0.235
 decisions-0.234
Token big
Token ID1263
Feature activation+4.62e+00

Pos logit contributions
 Inside+0.243
lic+0.241
 Across+0.221
 Outside+0.211
 Together+0.205
Neg logit contributions
 harm-0.318
 disl-0.267
 rid-0.258
 wounds-0.257
 conclusions-0.254
Token forest
Token ID8222
Feature activation+1.09e+01

Pos logit contributions
lic+0.140
 Inside+0.140
 Across+0.130
Inside+0.127
 Outside+0.127
Neg logit contributions
 harm-0.206
 impact-0.175
 decisions-0.175
 disl-0.172
 choices-0.170
Token,
Token ID11
Feature activation+7.07e+00

Pos logit contributions
 Inside+0.314
lic+0.313
 Across+0.308
shop+0.300
 Outside+0.295
Neg logit contributions
 harm-0.478
 choices-0.378
 decisions-0.378
 rid-0.376
 disl-0.375
Token there
Token ID612
Feature activation+7.84e+00

Pos logit contributions
shop+0.213
lic+0.207
�+0.194
 Across+0.194
 Outside+0.189
Neg logit contributions
 harm-0.273
 rid-0.239
edient-0.220
 choices-0.218
 attention-0.216
Token lived
Token ID5615
Feature activation+9.45e+00

Pos logit contributions
lic+0.294
 Inside+0.265
 Across+0.259
Along+0.259
 Along+0.257
Neg logit contributions
 harm-0.278
 happen-0.237
 disl-0.216
 affect-0.205
 rid-0.205
Token a
Token ID257
Feature activation+5.19e+00

Pos logit contributions
 Across+0.286
lic+0.280
shop+0.275
 Outside+0.267
�+0.264
Neg logit contributions
 rid-0.332
 harm-0.331
 permission-0.266
 decisions-0.265
 kindness-0.260
Token lion
Token ID18744
Feature activation+1.69e+00

Pos logit contributions
 Inside+0.170
 Across+0.163
 Outside+0.158
 On+0.146
shop+0.146
Neg logit contributions
 harm-0.213
 rid-0.182
 disl-0.177
 impact-0.172
 decisions-0.169
Token.
Token ID13
Feature activation+1.15e+00

Pos logit contributions
 Inside+0.091
 Across+0.084
 Outside+0.084
 On+0.079
On+0.078
Neg logit contributions
 harm-0.099
 rid-0.087
 decisions-0.084
 disl-0.084
 wounds-0.081
Token\n
Token ID198
Feature activation+3.62e+00

Pos logit contributions
Over+0.029
 Across+0.029
lic+0.028
 Inside+0.028
shop+0.027
Neg logit contributions
 harm-0.047
 disl-0.044
 rid-0.043
 decisions-0.041
 supervision-0.040
Token\n
Token ID198
Feature activation+4.15e+00

Pos logit contributions
 Inside+0.088
 Across+0.086
 Outside+0.081
Over+0.075
lic+0.074
Neg logit contributions
 harm-0.175
 rid-0.155
 decisions-0.148
 disl-0.146
 wounds-0.143
TokenWhen
Token ID2215
Feature activation+5.91e+00

Pos logit contributions
 Inside+0.128
 Across+0.124
 Outside+0.117
shop+0.113
lic+0.111
Neg logit contributions
 harm-0.158
 disl-0.138
 rid-0.133
grand-0.127
edient-0.123
Token Tim
Token ID5045
Feature activation+2.53e-01

Pos logit contributions
 Inside+0.205
 Across+0.198
lic+0.197
 Outside+0.194
shop+0.188
Neg logit contributions
 harm-0.232
 rid-0.203
 decisions-0.194
 choices-0.193
 disl-0.189
Token arrived
Token ID5284
Feature activation+1.09e+01

Pos logit contributions
 Inside+0.008
lic+0.008
Over+0.008
 Across+0.008
 Outside+0.007
Neg logit contributions
 harm-0.010
pa-0.007
 rid-0.007
grand-0.007
 supervision-0.007
Token,
Token ID11
Feature activation+6.37e+00

Pos logit contributions
lic+0.386
 Inside+0.361
 Across+0.359
shop+0.350
Inside+0.349
Neg logit contributions
 harm-0.396
 attention-0.364
 decisions-0.357
 choices-0.356
 rid-0.356
Token his
Token ID465
Feature activation+4.42e+00

Pos logit contributions
 Inside+0.219
 Across+0.209
 Outside+0.207
lic+0.206
shop+0.199
Neg logit contributions
 harm-0.245
 rid-0.222
 decisions-0.205
 disl-0.205
 choices-0.205
Token grandma
Token ID49890
Feature activation-3.10e+00

Pos logit contributions
 Inside+0.191
 Across+0.181
 Outside+0.178
Inside+0.175
lic+0.174
Neg logit contributions
 harm-0.142
 rid-0.117
 decisions-0.116
 choices-0.113
 wounds-0.111
Token gave
Token ID2921
Feature activation-1.46e+00

Pos logit contributions
 harm+0.139
 rid+0.120
 decisions+0.115
 supervision+0.114
 disl+0.114
Neg logit contributions
 Inside-0.088
 Across-0.084
 Outside-0.082
Over-0.078
Inside-0.076
Token him
Token ID683
Feature activation+4.79e-01

Pos logit contributions
 harm+0.052
 disl+0.044
 affect+0.044
 decisions+0.043
 rid+0.043
Neg logit contributions
 Inside-0.060
 Outside-0.054
 Across-0.054
 Suddenly-0.052
 On-0.052
Token.
Token ID13
Feature activation+9.92e-01

Pos logit contributions
 Inside+0.145
 Across+0.135
 Outside+0.134
Inside+0.127
lic+0.127
Neg logit contributions
 harm-0.145
 rid-0.124
 decisions-0.122
 disl-0.118
 choices-0.116
Token\n
Token ID198
Feature activation+3.58e+00

Pos logit contributions
lic+0.028
Over+0.024
 Across+0.023
Under+0.022
 Inside+0.022
Neg logit contributions
 harm-0.039
 disl-0.036
 decisions-0.036
 impact-0.034
 choices-0.034
Token\n
Token ID198
Feature activation+4.04e+00

Pos logit contributions
 Inside+0.087
 Across+0.084
 Outside+0.080
lic+0.076
Over+0.072
Neg logit contributions
 harm-0.174
 rid-0.152
 decisions-0.148
 disl-0.144
 wounds-0.143
TokenAs
Token ID1722
Feature activation+6.97e+00

Pos logit contributions
 Inside+0.107
 Across+0.105
lic+0.103
 Outside+0.097
 Together+0.089
Neg logit contributions
 harm-0.165
 conclusions-0.141
grand-0.139
 disl-0.134
 decisions-0.133
Token Sara
Token ID24799
Feature activation-5.16e-02

Pos logit contributions
 Across+0.252
lic+0.252
 Inside+0.252
 Outside+0.241
shop+0.231
Neg logit contributions
 harm-0.257
 rid-0.221
 attention-0.203
 decisions-0.201
 choices-0.200
Token arrived
Token ID5284
Feature activation+1.08e+01

Pos logit contributions
 harm+0.002
 disl+0.002
 decisions+0.002
 affect+0.002
 rid+0.002
Neg logit contributions
 Inside-0.002
lic-0.002
 Across-0.001
 Outside-0.001
Over-0.001
Token to
Token ID284
Feature activation+1.17e+00

Pos logit contributions
lic+0.391
 Inside+0.362
 Across+0.348
Inside+0.345
Under+0.335
Neg logit contributions
 harm-0.422
 attention-0.376
 decisions-0.374
 choices-0.364
 rid-0.361
Token school
Token ID1524
Feature activation+5.51e+00

Pos logit contributions
 Inside+0.037
 Outside+0.033
 Across+0.032
 On+0.031
On+0.030
Neg logit contributions
 harm-0.052
 rid-0.047
 decisions-0.046
 disl-0.045
 wounds-0.045
Token,
Token ID11
Feature activation+5.94e+00

Pos logit contributions
 Inside+0.147
lic+0.143
 Across+0.139
Over+0.135
 Outside+0.131
Neg logit contributions
 harm-0.245
 decisions-0.215
 choices-0.212
 supervision-0.207
 rid-0.203
Token she
Token ID673
Feature activation+3.94e+00

Pos logit contributions
lic+0.225
 Inside+0.220
 Across+0.213
 Outside+0.209
shop+0.205
Neg logit contributions
 harm-0.198
 rid-0.183
 choices-0.164
 decisions-0.164
 attention-0.161
Token could
Token ID714
Feature activation+2.15e+00

Pos logit contributions
 Inside+0.095
 Across+0.084
 Outside+0.083
lic+0.077
 On+0.076
Neg logit contributions
 harm-0.202
 rid-0.181
 decisions-0.176
 disl-0.175
 wounds-0.171
Token
Token ID220
Feature activation+2.18e+00

Pos logit contributions
 Inside+0.012
 Across+0.011
 Outside+0.011
 On+0.010
Inside+0.010
Neg logit contributions
 harm-0.019
 rid-0.017
 decisions-0.016
 disl-0.016
 wounds-0.016
Token\n
Token ID198
Feature activation+2.58e+00

Pos logit contributions
 Inside+0.044
 Across+0.037
 Outside+0.037
 On+0.033
Inside+0.033
Neg logit contributions
 harm-0.124
 rid-0.112
 decisions-0.109
 disl-0.108
 wounds-0.107
Token\n
Token ID198
Feature activation+3.20e+00

Pos logit contributions
 Inside+0.057
 Across+0.050
 Outside+0.049
lic+0.044
Over+0.044
Neg logit contributions
 harm-0.137
 rid-0.124
 decisions-0.121
 disl-0.120
 wounds-0.117
TokenWhen
Token ID2215
Feature activation+5.63e+00

Pos logit contributions
 Inside+0.084
 Across+0.075
 Outside+0.075
lic+0.068
Inside+0.068
Neg logit contributions
 harm-0.159
 rid-0.142
 decisions-0.140
 disl-0.137
 choices-0.134
Token they
Token ID484
Feature activation+2.62e+00

Pos logit contributions
 Inside+0.202
 Outside+0.183
 Across+0.183
Inside+0.173
 On+0.172
Neg logit contributions
 harm-0.229
 rid-0.198
 decisions-0.193
 disl-0.191
 wounds-0.188
Token arrived
Token ID5284
Feature activation+1.08e+01

Pos logit contributions
 Across+0.061
 Inside+0.061
 Outside+0.059
lic+0.056
 Suddenly+0.055
Neg logit contributions
 harm-0.125
 rid-0.113
 disl-0.105
 supervision-0.103
 permission-0.103
Token,
Token ID11
Feature activation+6.32e+00

Pos logit contributions
 Inside+0.269
 Across+0.248
 Outside+0.241
lic+0.234
Inside+0.227
Neg logit contributions
 harm-0.533
 rid-0.486
 decisions-0.483
 choices-0.467
 wounds-0.463
Token Jack
Token ID3619
Feature activation+1.68e+00

Pos logit contributions
 Inside+0.203
 Across+0.182
 Outside+0.181
 On+0.168
Inside+0.168
Neg logit contributions
 harm-0.284
 rid-0.246
 decisions-0.240
 disl-0.240
 wounds-0.236
Token was
Token ID373
Feature activation+4.96e+00

Pos logit contributions
 Inside+0.047
 Across+0.043
 Outside+0.043
Over+0.040
lic+0.039
Neg logit contributions
 harm-0.078
 rid-0.070
 decisions-0.068
 disl-0.067
 supervision-0.066
Token so
Token ID523
Feature activation+1.12e-01

Pos logit contributions
 Inside+0.157
 Across+0.144
 Outside+0.143
lic+0.133
Inside+0.132
Neg logit contributions
 harm-0.219
 rid-0.192
 decisions-0.189
 disl-0.186
 choices-0.182
Token excited
Token ID6568
Feature activation+5.79e-01

Pos logit contributions
 Inside+0.004
 Across+0.003
 Outside+0.003
lic+0.003
Inside+0.003
Neg logit contributions
 harm-0.005
 rid-0.004
 decisions-0.004
 disl-0.004
 supervision-0.004
Token time
Token ID640
Feature activation+5.38e+00

Pos logit contributions
 Inside+0.214
 Across+0.197
lic+0.191
 Outside+0.188
Over+0.180
Neg logit contributions
 harm-0.233
 impact-0.193
 disl-0.183
 decisions-0.179
 rid-0.177
Token,
Token ID11
Feature activation+7.17e+00

Pos logit contributions
 Across+0.165
lic+0.165
shop+0.163
 Inside+0.163
 Outside+0.155
Neg logit contributions
 harm-0.201
 disl-0.174
 decisions-0.165
 rid-0.164
 choices-0.161
Token in
Token ID287
Feature activation+6.60e+00

Pos logit contributions
shop+0.242
lic+0.234
 Across+0.214
 Inside+0.213
 Outside+0.213
Neg logit contributions
 harm-0.251
 rid-0.217
edient-0.202
 theirs-0.184
 possible-0.180
Token a
Token ID257
Feature activation+7.30e+00

Pos logit contributions
lic+0.184
 Across+0.177
 Inside+0.173
 Outside+0.169
shop+0.160
Neg logit contributions
 harm-0.291
 rid-0.258
 decisions-0.239
 wounds-0.239
 disl-0.239
Token big
Token ID1263
Feature activation+4.46e+00

Pos logit contributions
 Inside+0.226
lic+0.220
 Across+0.206
 Outside+0.197
 Together+0.187
Neg logit contributions
 harm-0.313
 disl-0.265
 wounds-0.256
 rid-0.256
 impact-0.255
Token forest
Token ID8222
Feature activation+1.08e+01

Pos logit contributions
 Inside+0.129
lic+0.118
 Across+0.118
 Outside+0.116
Inside+0.112
Neg logit contributions
 harm-0.208
 decisions-0.179
 rid-0.176
 impact-0.176
 disl-0.175
Token,
Token ID11
Feature activation+7.02e+00

Pos logit contributions
 Inside+0.299
lic+0.297
 Across+0.293
 Outside+0.283
shop+0.279
Neg logit contributions
 harm-0.492
 decisions-0.400
 choices-0.395
 disl-0.394
 rid-0.392
Token there
Token ID612
Feature activation+7.68e+00

Pos logit contributions
lic+0.213
shop+0.212
�+0.194
 Across+0.194
 Outside+0.193
Neg logit contributions
 harm-0.265
 rid-0.229
 choices-0.217
edient-0.215
 decisions-0.211
Token was
Token ID373
Feature activation+6.86e+00

Pos logit contributions
lic+0.268
 Inside+0.245
 Across+0.239
 Along+0.234
 Outside+0.233
Neg logit contributions
 harm-0.296
 happen-0.247
 disl-0.236
 rid-0.232
 decisions-0.231
Token a
Token ID257
Feature activation+5.37e+00

Pos logit contributions
lic+0.238
 Across+0.227
oped+0.226
uff+0.223
pit+0.222
Neg logit contributions
 harm-0.202
 rid-0.177
 ger-0.174
 kindness-0.153
 decisions-0.146
Token little
Token ID1310
Feature activation+2.10e+00

Pos logit contributions
 Inside+0.167
 Across+0.157
 Outside+0.153
 On+0.142
lic+0.137
Neg logit contributions
 harm-0.232
 rid-0.199
 impact-0.194
 disl-0.193
 difference-0.191
Token time
Token ID640
Feature activation+5.36e+00

Pos logit contributions
 Inside+0.203
 Across+0.188
 Outside+0.180
lic+0.179
Over+0.169
Neg logit contributions
 harm-0.255
 impact-0.213
 disl-0.206
 decisions-0.203
 rid-0.203
Token,
Token ID11
Feature activation+7.20e+00

Pos logit contributions
lic+0.163
 Inside+0.162
 Across+0.161
shop+0.157
 Outside+0.152
Neg logit contributions
 harm-0.208
 disl-0.178
 decisions-0.171
 rid-0.169
 choices-0.169
Token in
Token ID287
Feature activation+6.68e+00

Pos logit contributions
shop+0.241
lic+0.228
 Across+0.216
 Along+0.215
 Inside+0.213
Neg logit contributions
 harm-0.262
 rid-0.227
edient-0.200
 attention-0.198
 happen-0.194
Token a
Token ID257
Feature activation+7.46e+00

Pos logit contributions
lic+0.184
 Across+0.177
 Inside+0.176
 Outside+0.168
shop+0.162
Neg logit contributions
 harm-0.300
 rid-0.263
 disl-0.248
 wounds-0.245
 decisions-0.245
Token small
Token ID1402
Feature activation+5.57e+00

Pos logit contributions
 Inside+0.230
lic+0.228
 Across+0.214
 Outside+0.203
 Together+0.195
Neg logit contributions
 harm-0.319
 disl-0.271
 rid-0.261
 conclusions-0.256
 wounds-0.256
Token forest
Token ID8222
Feature activation+1.08e+01

Pos logit contributions
lic+0.222
 Inside+0.208
 Across+0.205
 Outside+0.196
Over+0.194
Neg logit contributions
 harm-0.201
 decisions-0.164
 difference-0.157
 choices-0.155
 impact-0.154
Token,
Token ID11
Feature activation+7.00e+00

Pos logit contributions
lic+0.325
 Across+0.315
 Inside+0.314
shop+0.305
 Outside+0.300
Neg logit contributions
 harm-0.470
 choices-0.370
 decisions-0.369
 rid-0.365
 disl-0.365
Token there
Token ID612
Feature activation+7.60e+00

Pos logit contributions
shop+0.216
lic+0.214
�+0.197
 Across+0.196
 Outside+0.192
Neg logit contributions
 harm-0.264
 rid-0.228
 attention-0.208
edient-0.208
 happen-0.208
Token lived
Token ID5615
Feature activation+9.39e+00

Pos logit contributions
lic+0.277
 Inside+0.248
 Across+0.247
Along+0.246
 Along+0.245
Neg logit contributions
 harm-0.274
 happen-0.235
 disl-0.211
 rid-0.211
 affect-0.207
Token a
Token ID257
Feature activation+5.16e+00

Pos logit contributions
 Across+0.291
lic+0.290
shop+0.280
cases+0.271
 Outside+0.270
Neg logit contributions
 harm-0.324
 rid-0.323
 permission-0.263
 kindness-0.259
 wounds-0.256
Token little
Token ID1310
Feature activation+1.71e+00

Pos logit contributions
 Inside+0.167
 Across+0.162
 Outside+0.156
shop+0.146
 On+0.145
Neg logit contributions
 harm-0.212
 rid-0.181
 disl-0.176
 impact-0.170
 affect-0.168
Token\n
Token ID198
Feature activation+2.93e+00

Pos logit contributions
 Inside+0.020
lic+0.019
 Across+0.019
Over+0.018
 Outside+0.018
Neg logit contributions
 harm-0.038
 rid-0.036
 decisions-0.035
 disl-0.034
 impact-0.033
Token\n
Token ID198
Feature activation+3.41e+00

Pos logit contributions
 Inside+0.073
 Across+0.068
 Outside+0.066
lic+0.059
Over+0.059
Neg logit contributions
 harm-0.143
 rid-0.129
 decisions-0.124
 disl-0.124
 wounds-0.121
TokenSuddenly
Token ID38582
Feature activation+8.05e+00

Pos logit contributions
 Inside+0.085
 Across+0.077
 Outside+0.075
 On+0.068
shop+0.067
Neg logit contributions
 harm-0.163
 rid-0.150
 disl-0.145
 decisions-0.141
 conclusions-0.140
Token,
Token ID11
Feature activation+6.85e+00

Pos logit contributions
 Inside+0.194
 Across+0.184
 Outside+0.179
lic+0.176
shop+0.173
Neg logit contributions
 harm-0.389
 rid-0.363
 decisions-0.351
 disl-0.334
 choices-0.333
Token Pete
Token ID17542
Feature activation+1.46e+00

Pos logit contributions
 Inside+0.231
 Across+0.216
 Outside+0.214
lic+0.203
Inside+0.200
Neg logit contributions
 harm-0.280
 rid-0.254
 decisions-0.242
 disl-0.234
 choices-0.230
Token awoke
Token ID43363
Feature activation+1.08e+01

Pos logit contributions
 Inside+0.040
 Across+0.036
 Outside+0.036
 On+0.033
Over+0.033
Neg logit contributions
 harm-0.070
 rid-0.062
 decisions-0.060
 disl-0.059
 wounds-0.058
Token to
Token ID284
Feature activation+1.03e+00

Pos logit contributions
 Inside+0.361
 Across+0.336
 Outside+0.330
Over+0.315
lic+0.313
Neg logit contributions
 harm-0.452
 rid-0.399
 decisions-0.386
 disl-0.378
 wounds-0.372
Token some
Token ID617
Feature activation+2.07e+00

Pos logit contributions
 Inside+0.034
 Outside+0.030
 Across+0.030
 Suddenly+0.029
 On+0.029
Neg logit contributions
 harm-0.044
 decisions-0.038
 rid-0.038
 supervision-0.038
 choices-0.038
Token loud
Token ID7812
Feature activation-2.01e-02

Pos logit contributions
 Inside+0.069
 Across+0.064
 Outside+0.063
Over+0.060
On+0.060
Neg logit contributions
 harm-0.084
 disl-0.075
 rid-0.074
 decisions-0.072
 supervision-0.071
Token noises
Token ID26782
Feature activation+4.05e+00

Pos logit contributions
 harm+0.001
 rid+0.001
 decisions+0.001
 disl+0.001
 choices+0.001
Neg logit contributions
 Inside-0.001
 Across-0.001
 Outside-0.001
Inside-0.001
lic-0.001
Token.
Token ID13
Feature activation+6.79e-01

Pos logit contributions
 Inside+0.138
 Across+0.127
 Outside+0.126
Inside+0.119
 On+0.118
Neg logit contributions
 harm-0.168
 rid-0.146
 decisions-0.142
 disl-0.138
 supervision-0.135
Token,
Token ID11
Feature activation+1.54e+00

Pos logit contributions
 harm+0.175
 rid+0.160
 decisions+0.157
 disl+0.154
 wounds+0.154
Neg logit contributions
 Inside-0.049
 Across-0.040
 Outside-0.040
 On-0.034
 Suddenly-0.034
Token how
Token ID703
Feature activation-3.00e+00

Pos logit contributions
 Inside+0.050
 Across+0.044
 Outside+0.044
Inside+0.041
 On+0.040
Neg logit contributions
 harm-0.070
 rid-0.061
 decisions-0.060
 disl-0.058
 wounds-0.058
Token can
Token ID460
Feature activation-4.49e+00

Pos logit contributions
 harm+0.150
 rid+0.132
 wounds+0.129
 decisions+0.128
 supervision+0.128
Neg logit contributions
 Inside-0.085
 Outside-0.072
 Across-0.071
 On-0.069
lic-0.068
Token we
Token ID356
Feature activation-8.65e+00

Pos logit contributions
 harm+0.177
 decisions+0.161
 impact+0.158
 disl+0.156
 rid+0.155
Neg logit contributions
 Inside-0.163
 Across-0.143
 Outside-0.141
 On-0.137
Inside-0.135
Token help
Token ID1037
Feature activation-6.68e+00

Pos logit contributions
 harm+0.363
 decisions+0.323
 rid+0.317
 disl+0.315
 wounds+0.311
Neg logit contributions
 Inside-0.293
 Outside-0.263
 Across-0.261
 On-0.252
Inside-0.250
Token Mitt
Token ID16627
Feature activation-1.37e+01

Pos logit contributions
 harm+0.260
 disl+0.247
 decisions+0.233
 impact+0.231
 rid+0.231
Neg logit contributions
 Inside-0.234
 Outside-0.206
 Across-0.198
Inside-0.198
On-0.195
Tokenens
Token ID641
Feature activation-6.07e+00

Pos logit contributions
 harm+0.450
 rid+0.381
 decisions+0.368
 disl+0.361
 wounds+0.348
Neg logit contributions
 Inside-0.584
 Across-0.548
 Outside-0.544
lic-0.519
 On-0.519
Token feel
Token ID1254
Feature activation-2.51e+00

Pos logit contributions
 harm+0.243
 rid+0.212
 decisions+0.208
 disl+0.205
 choices+0.199
Neg logit contributions
 Inside-0.215
 Across-0.199
 Outside-0.197
 On-0.186
Inside-0.184
Token fine
Token ID3734
Feature activation-2.76e+00

Pos logit contributions
 harm+0.099
 disl+0.091
keeping+0.091
 supervision+0.091
 wounds+0.090
Neg logit contributions
 Inside-0.085
 Outside-0.074
 Across-0.073
 Suddenly-0.072
 On-0.072
Token again
Token ID757
Feature activation-1.28e+00

Pos logit contributions
 harm+0.099
 disl+0.084
 rid+0.084
 decisions+0.082
 supervision+0.080
Neg logit contributions
 Inside-0.110
 Across-0.101
 Outside-0.101
 On-0.095
 Suddenly-0.095
Token?"
Token ID1701
Feature activation-1.39e+00

Pos logit contributions
 harm+0.043
 rid+0.038
 disl+0.037
 decisions+0.036
 supervision+0.036
Neg logit contributions
 Inside-0.052
 Outside-0.048
 Across-0.048
On-0.046
Inside-0.046
Token can
Token ID460
Feature activation-4.52e+00

Pos logit contributions
 harm+0.279
 decisions+0.250
 wounds+0.244
 disl+0.243
 choices+0.242
Neg logit contributions
 Inside-0.191
 Across-0.168
 Outside-0.168
 On-0.159
Inside-0.158
Token always
Token ID1464
Feature activation-4.27e+00

Pos logit contributions
 harm+0.192
 decisions+0.170
 rid+0.170
 supervision+0.165
 disl+0.165
Neg logit contributions
 Inside-0.149
 Outside-0.134
 Across-0.132
 On-0.128
Inside-0.126
Token make
Token ID787
Feature activation-6.54e+00

Pos logit contributions
 harm+0.172
 rid+0.155
 decisions+0.151
 disl+0.148
 choices+0.145
Neg logit contributions
 Inside-0.142
 Outside-0.131
 Across-0.128
Inside-0.124
On-0.123
Token the
Token ID262
Feature activation-5.70e+00

Pos logit contributions
 harm+0.293
 affect+0.272
keeping+0.259
 disl+0.258
edient+0.257
Neg logit contributions
 Inside-0.164
 two-0.146
 Suddenly-0.142
 Outside-0.138
 Across-0.136
Token conscious
Token ID6921
Feature activation-5.83e+00

Pos logit contributions
 harm+0.255
 disl+0.238
 rid+0.229
keeping+0.223
 affect+0.218
Neg logit contributions
 Inside-0.130
 Across-0.127
 Two-0.121
 Outside-0.121
Over-0.119
Token choice
Token ID3572
Feature activation-1.34e+01

Pos logit contributions
 harm+0.257
 rid+0.253
 disl+0.244
 wounds+0.236
 supervision+0.236
Neg logit contributions
 Inside-0.143
 Across-0.126
 Outside-0.122
 Two-0.121
 On-0.118
Token to
Token ID284
Feature activation-7.83e+00

Pos logit contributions
 harm+0.642
 rid+0.602
 disl+0.589
 wounds+0.586
 impact+0.571
Neg logit contributions
 Inside-0.365
 Outside-0.287
Inside-0.277
 Across-0.271
,-0.262
Token take
Token ID1011
Feature activation-5.28e+00

Pos logit contributions
 harm+0.301
 decisions+0.272
 rid+0.269
 wounds+0.261
 disl+0.260
Neg logit contributions
 Inside-0.278
 Outside-0.253
 Across-0.246
Inside-0.243
On-0.238
Token control
Token ID1630
Feature activation-3.77e+00

Pos logit contributions
 harm+0.179
 affect+0.168
 disl+0.164
 impact+0.151
edient+0.150
Neg logit contributions
 Inside-0.200
 Outside-0.178
 Suddenly-0.173
 Across-0.169
Inside-0.169
Token of
Token ID286
Feature activation-5.95e+00

Pos logit contributions
 harm+0.174
 disl+0.166
 impact+0.164
 apology+0.158
 rid+0.157
Neg logit contributions
 Inside-0.079
On-0.064
 Outside-0.063
,-0.062
Inside-0.061
Token our
Token ID674
Feature activation-4.24e+00

Pos logit contributions
 harm+0.229
 affect+0.219
 disl+0.218
edient+0.211
 impact+0.211
Neg logit contributions
 Inside-0.179
 Outside-0.146
Inside-0.143
 two-0.141
 Across-0.135
Token "
Token ID366
Feature activation-4.64e+00

Pos logit contributions
 Inside+0.062
 Across+0.058
 Outside+0.056
Over+0.053
shop+0.051
Neg logit contributions
 harm-0.147
 rid-0.135
 disl-0.130
 decisions-0.129
 wounds-0.127
TokenThank
Token ID10449
Feature activation-5.05e+00

Pos logit contributions
 harm+0.232
 decisions+0.213
 choices+0.202
 rid+0.200
 supervision+0.200
Neg logit contributions
 Inside-0.121
On-0.115
Over-0.114
Inside-0.111
 Across-0.111
Token you
Token ID345
Feature activation-6.80e+00

Pos logit contributions
 harm+0.154
 decisions+0.125
 rid+0.124
 disl+0.124
 impact+0.119
Neg logit contributions
 Inside-0.230
 Outside-0.213
 Across-0.211
Inside-0.206
Over-0.205
Token for
Token ID329
Feature activation-5.78e+00

Pos logit contributions
 harm+0.284
 rid+0.245
 disl+0.240
 decisions+0.239
 wounds+0.231
Neg logit contributions
 Inside-0.239
 Across-0.216
 Outside-0.216
lic-0.205
 On-0.203
Token sharing
Token ID7373
Feature activation-4.56e+00

Pos logit contributions
 harm+0.194
 rid+0.164
 decisions+0.157
 disl+0.154
 supervision+0.150
Neg logit contributions
 Inside-0.245
 Across-0.232
 Outside-0.230
 On-0.219
lic-0.218
Token Mitt
Token ID16627
Feature activation-1.33e+01

Pos logit contributions
 harm+0.222
 impact+0.195
 decisions+0.194
 affect+0.192
 supervision+0.189
Neg logit contributions
 Inside-0.123
 A-0.096
 Outside-0.096
 Suddenly-0.095
Inside-0.094
Tokenens
Token ID641
Feature activation-6.01e+00

Pos logit contributions
 harm+0.482
 rid+0.420
 decisions+0.401
 wounds+0.395
 disl+0.393
Neg logit contributions
 Inside-0.502
 Outside-0.467
 Across-0.467
lic-0.462
 On-0.445
Token with
Token ID351
Feature activation-2.98e+00

Pos logit contributions
 harm+0.268
 disl+0.239
 rid+0.238
 conclusions+0.237
 impact+0.231
Neg logit contributions
 Inside-0.178
 Across-0.157
 Outside-0.155
 Suddenly-0.149
 On-0.146
Token me
Token ID502
Feature activation-3.59e+00

Pos logit contributions
 harm+0.130
 decisions+0.120
 disl+0.115
 impact+0.115
 wounds+0.115
Neg logit contributions
 Inside-0.090
 Outside-0.078
 Across-0.076
 Suddenly-0.073
 On-0.073
Token.
Token ID13
Feature activation-1.72e+00

Pos logit contributions
 harm+0.122
 rid+0.104
 decisions+0.100
 disl+0.100
 impact+0.097
Neg logit contributions
 Inside-0.151
 Outside-0.138
 Across-0.138
 On-0.131
Inside-0.130
Token I
Token ID314
Feature activation-3.48e+00

Pos logit contributions
 harm+0.071
 rid+0.062
 decisions+0.061
 disl+0.060
 choices+0.058
Neg logit contributions
 Inside-0.058
 Across-0.054
 Outside-0.053
Inside-0.050
Over-0.050
Token was
Token ID373
Feature activation-1.69e+00

Pos logit contributions
 harm+0.418
 supervision+0.396
 rid+0.393
 wounds+0.390
 decisions+0.387
Neg logit contributions
 Inside-0.134
 looked-0.101
 Across-0.098
On-0.095
 Outside-0.087
Token always
Token ID1464
Feature activation-5.11e+00

Pos logit contributions
 harm+0.075
keeping+0.073
 conclusions+0.071
 affect+0.070
 supervision+0.069
Neg logit contributions
 Inside-0.044
 Suddenly-0.037
 two-0.036
 a-0.036
On-0.036
Token better
Token ID1365
Feature activation-6.83e+00

Pos logit contributions
 harm+0.216
 conclusions+0.194
 rid+0.190
 wounds+0.189
 disl+0.186
Neg logit contributions
 Inside-0.159
 Across-0.141
 Outside-0.137
 Suddenly-0.134
On-0.132
Token than
Token ID621
Feature activation-7.65e+00

Pos logit contributions
 harm+0.267
 rid+0.251
 disl+0.245
 conclusions+0.241
 wounds+0.234
Neg logit contributions
 Inside-0.210
 Suddenly-0.184
 Outside-0.181
 Across-0.180
On-0.179
Token being
Token ID852
Feature activation-7.29e+00

Pos logit contributions
 harm+0.281
 decisions+0.264
 disl+0.256
 rid+0.256
 wounds+0.255
Neg logit contributions
 Inside-0.265
 Across-0.229
 Outside-0.226
 Suddenly-0.217
On-0.216
Token boss
Token ID6478
Feature activation-1.31e+01

Pos logit contributions
 harm+0.300
 conclusions+0.274
 wounds+0.270
 decisions+0.265
 rid+0.259
Neg logit contributions
 Inside-0.225
 Across-0.204
 Outside-0.194
 Suddenly-0.190
lic-0.189
Tokeny
Token ID88
Feature activation-3.42e+00

Pos logit contributions
 harm+0.701
 wounds+0.661
 rid+0.658
 disl+0.645
 decisions+0.641
Neg logit contributions
 Inside-0.243
 Across-0.216
 Outside-0.204
 Suddenly-0.189
lic-0.185
Token.
Token ID13
Feature activation-2.28e+00

Pos logit contributions
 harm+0.151
 disl+0.141
 rid+0.140
 wounds+0.139
 impact+0.132
Neg logit contributions
 Inside-0.087
 Across-0.080
 Outside-0.077
 Suddenly-0.075
 On-0.071
Token And
Token ID843
Feature activation-2.98e+00

Pos logit contributions
 harm+0.129
 rid+0.118
 decisions+0.115
 wounds+0.112
 choices+0.112
Neg logit contributions
 Inside-0.047
 Outside-0.039
 Across-0.039
 On-0.036
 Suddenly-0.036
Token he
Token ID339
Feature activation+1.13e+00

Pos logit contributions
 harm+0.143
 rid+0.126
 wounds+0.125
 affect+0.123
 conclusions+0.123
Neg logit contributions
 Inside-0.084
 Suddenly-0.070
 Outside-0.066
On-0.064
 On-0.064
Token dreamed
Token ID27947
Feature activation-3.22e-01

Pos logit contributions
 Inside+0.028
 Across+0.024
 Suddenly+0.023
Over+0.023
 Outside+0.023
Neg logit contributions
 harm-0.056
 decisions-0.052
 rid-0.051
 supervision-0.051
 choices-0.050
Token always
Token ID1464
Feature activation-5.47e+00

Pos logit contributions
 harm+0.117
 decisions+0.105
 supervision+0.101
 disl+0.101
 choices+0.099
Neg logit contributions
 Inside-0.094
 Across-0.086
 Outside-0.086
lic-0.083
 Suddenly-0.081
Token remembered
Token ID12086
Feature activation-1.30e+00

Pos logit contributions
 harm+0.219
 rid+0.196
 decisions+0.191
 wounds+0.190
 disl+0.189
Neg logit contributions
 Inside-0.178
 Across-0.162
 Outside-0.162
 Suddenly-0.155
On-0.150
Token to
Token ID284
Feature activation-5.57e+00

Pos logit contributions
 harm+0.059
 rid+0.054
 decisions+0.052
 wounds+0.051
 affect+0.051
Neg logit contributions
 Inside-0.036
 Outside-0.031
 Across-0.031
 Suddenly-0.030
On-0.029
Token say
Token ID910
Feature activation-6.15e+00

Pos logit contributions
 harm+0.219
 rid+0.195
 decisions+0.192
 wounds+0.185
 disl+0.185
Neg logit contributions
 Inside-0.193
 Outside-0.178
 Across-0.175
Inside-0.165
 On-0.165
Token "
Token ID366
Feature activation-8.47e+00

Pos logit contributions
 harm+0.285
 rid+0.250
 decisions+0.240
 disl+0.239
 impact+0.238
Neg logit contributions
 Inside-0.181
 Outside-0.164
 Across-0.163
 Suddenly-0.150
 On-0.145
Tokenplease
Token ID29688
Feature activation-1.30e+01

Pos logit contributions
 harm+0.330
 disl+0.305
 wounds+0.300
 decisions+0.298
 supervision+0.297
Neg logit contributions
 Inside-0.280
 Across-0.267
On-0.266
 Outside-0.262
Over-0.260
Token"
Token ID1
Feature activation-9.76e+00

Pos logit contributions
 harm+0.618
 rid+0.548
 disl+0.542
 decisions+0.540
 wounds+0.534
Neg logit contributions
 Inside-0.365
 Outside-0.327
 Across-0.326
lic-0.303
Inside-0.303
Token and
Token ID290
Feature activation-6.18e+00

Pos logit contributions
 harm+0.491
 disl+0.451
 rid+0.428
 supervision+0.426
 decisions+0.424
Neg logit contributions
 Inside-0.266
 Across-0.231
On-0.227
 Outside-0.223
Inside-0.218
Token "
Token ID366
Feature activation-6.80e+00

Pos logit contributions
 harm+0.302
 rid+0.266
 decisions+0.264
 impact+0.256
 choices+0.254
Neg logit contributions
 Inside-0.174
 Across-0.152
 Outside-0.152
 Suddenly-0.141
Inside-0.139
Tokenthank
Token ID40716
Feature activation-6.77e+00

Pos logit contributions
 harm+0.306
 decisions+0.271
 rid+0.267
 supervision+0.264
 disl+0.264
Neg logit contributions
 Inside-0.204
On-0.191
 Across-0.190
 Outside-0.188
Inside-0.183
Token you
Token ID345
Feature activation-8.90e+00

Pos logit contributions
 harm+0.253
 affect+0.224
 rid+0.215
 wounds+0.215
 disl+0.210
Neg logit contributions
 Inside-0.262
 Outside-0.231
Inside-0.223
 Across-0.222
 Two-0.221
Token the
Token ID262
Feature activation-2.40e+00

Pos logit contributions
 harm+0.123
 rid+0.113
 decisions+0.112
 disl+0.110
 supervision+0.109
Neg logit contributions
 Inside-0.081
 Outside-0.072
 Across-0.070
On-0.069
Inside-0.068
Token zo
Token ID40565
Feature activation-7.36e+00

Pos logit contributions
 harm+0.111
 rid+0.100
 disl+0.098
 decisions+0.096
 affect+0.094
Neg logit contributions
 Inside-0.065
 Across-0.059
 Outside-0.058
 Suddenly-0.055
On-0.054
Tokenoke
Token ID2088
Feature activation-9.84e+00

Pos logit contributions
 harm+0.287
 rid+0.251
 decisions+0.242
 disl+0.242
 wounds+0.238
Neg logit contributions
 Inside-0.268
 Across-0.247
 Outside-0.246
lic-0.241
Inside-0.230
Tokenep
Token ID538
Feature activation-2.71e-01

Pos logit contributions
 harm+0.392
 rid+0.344
 disl+0.338
 decisions+0.330
 impact+0.328
Neg logit contributions
 Inside-0.336
 Across-0.314
 Outside-0.306
lic-0.301
Over-0.294
Tokeners
Token ID364
Feature activation-8.62e-01

Pos logit contributions
 harm+0.013
 rid+0.012
 decisions+0.012
 disl+0.012
 wounds+0.011
Neg logit contributions
 Inside-0.007
 Across-0.006
 Outside-0.006
lic-0.006
Inside-0.006
Token forg
Token ID5784
Feature activation-1.30e+01

Pos logit contributions
 harm+0.040
 rid+0.036
 decisions+0.035
 disl+0.035
 choices+0.034
Neg logit contributions
 Inside-0.025
 Across-0.022
 Outside-0.022
Inside-0.020
lic-0.020
Tokenave
Token ID1015
Feature activation-3.83e+00

Pos logit contributions
 harm+0.587
 wounds+0.535
 decisions+0.514
 rid+0.514
 supervision+0.497
Neg logit contributions
 Inside-0.379
lic-0.371
 Across-0.350
 Outside-0.345
 Suddenly-0.337
Token Tom
Token ID4186
Feature activation-6.20e+00

Pos logit contributions
 harm+0.156
 rid+0.141
 disl+0.140
 impact+0.138
 affect+0.136
Neg logit contributions
 Inside-0.122
 Outside-0.106
 Across-0.103
 On-0.103
 Suddenly-0.102
Token and
Token ID290
Feature activation-6.41e+00

Pos logit contributions
 harm+0.294
 disl+0.272
 rid+0.270
 conclusions+0.256
 wounds+0.254
Neg logit contributions
 Inside-0.156
 Suddenly-0.140
 Outside-0.138
 Across-0.138
 On-0.134
Token said
Token ID531
Feature activation-3.27e+00

Pos logit contributions
 harm+0.299
 decisions+0.265
 rid+0.260
 impact+0.257
 conclusions+0.256
Neg logit contributions
 Inside-0.186
 Across-0.158
 Outside-0.157
 Suddenly-0.152
lic-0.151
Token,
Token ID11
Feature activation-8.86e-01

Pos logit contributions
 harm+0.175
 rid+0.157
 decisions+0.154
 wounds+0.151
 disl+0.150
Neg logit contributions
 Inside-0.074
 Across-0.063
 Outside-0.063
 Suddenly-0.057
lic-0.057
Token that
Token ID326
Feature activation-1.53e+00

Pos logit contributions
 harm+0.123
 rid+0.109
 decisions+0.108
 disl+0.106
 wounds+0.104
Neg logit contributions
 Inside-0.080
 Outside-0.072
 Across-0.072
Over-0.067
 On-0.067
Token he
Token ID339
Feature activation+5.23e-01

Pos logit contributions
 harm+0.071
 rid+0.064
 disl+0.064
 wounds+0.064
 decisions+0.063
Neg logit contributions
 Inside-0.043
 Outside-0.037
 Across-0.037
 Suddenly-0.036
Over-0.035
Token wouldn
Token ID3636
Feature activation-7.09e+00

Pos logit contributions
 Inside+0.014
 Across+0.012
 Outside+0.012
 Suddenly+0.012
Over+0.012
Neg logit contributions
 harm-0.025
 decisions-0.023
 disl-0.022
 rid-0.022
 conclusions-0.022
Token't
Token ID470
Feature activation-6.17e+00

Pos logit contributions
 harm+0.354
 decisions+0.333
 choices+0.332
 disl+0.330
 wounds+0.325
Neg logit contributions
 Inside-0.170
 Outside-0.149
 Across-0.149
 On-0.145
On-0.138
Token scare
Token ID19437
Feature activation-5.59e+00

Pos logit contributions
 harm+0.224
 decisions+0.223
 wounds+0.218
 disl+0.215
 rid+0.214
Neg logit contributions
 Inside-0.202
 Outside-0.185
On-0.185
 Across-0.181
 On-0.180
Token Mitt
Token ID16627
Feature activation-1.29e+01

Pos logit contributions
 disl+0.212
 harm+0.201
 affect+0.196
 impact+0.194
 benefit+0.191
Neg logit contributions
 Inside-0.174
 Suddenly-0.163
 Outside-0.159
 Across-0.156
 A-0.151
Tokenens
Token ID641
Feature activation-5.15e+00

Pos logit contributions
 harm+0.393
 rid+0.324
 decisions+0.315
 disl+0.305
 choices+0.297
Neg logit contributions
 Inside-0.584
 Across-0.548
 Outside-0.544
Inside-0.521
Over-0.519
Token again
Token ID757
Feature activation-2.23e+00

Pos logit contributions
 harm+0.191
 disl+0.171
 rid+0.167
 decisions+0.164
 wounds+0.160
Neg logit contributions
 Inside-0.187
 Across-0.176
 Outside-0.174
On-0.165
 Suddenly-0.165
Token.
Token ID13
Feature activation-1.10e+00

Pos logit contributions
 harm+0.096
 disl+0.090
 rid+0.089
 decisions+0.087
 wounds+0.086
Neg logit contributions
 Inside-0.065
 Outside-0.058
On-0.057
 Across-0.057
 Suddenly-0.055
Token The
Token ID383
Feature activation-5.56e-01

Pos logit contributions
 harm+0.063
 rid+0.057
 decisions+0.056
 wounds+0.055
 disl+0.055
Neg logit contributions
 Inside-0.022
 Outside-0.018
 Across-0.018
 On-0.017
 Suddenly-0.016
Token end
Token ID886
Feature activation+1.47e+00

Pos logit contributions
 harm+0.019
 rid+0.017
 decisions+0.016
 disl+0.016
 supervision+0.015
Neg logit contributions
 Inside-0.023
 Across-0.021
 Outside-0.021
On-0.020
Over-0.020
Token You
Token ID921
Feature activation-7.40e+00

Pos logit contributions
 harm+0.090
 rid+0.079
 decisions+0.077
 disl+0.076
 choices+0.074
Neg logit contributions
 Inside-0.064
 Across-0.058
 Outside-0.058
 On-0.054
Inside-0.054
Token have
Token ID423
Feature activation-4.32e+00

Pos logit contributions
 harm+0.326
 decisions+0.291
 wounds+0.285
 disl+0.284
 rid+0.283
Neg logit contributions
 Inside-0.232
 Across-0.205
 Outside-0.205
Inside-0.194
 On-0.193
Token to
Token ID284
Feature activation-6.10e+00

Pos logit contributions
 harm+0.198
 decisions+0.177
 disl+0.175
 rid+0.175
 impact+0.172
Neg logit contributions
 Inside-0.126
 Across-0.108
 Outside-0.106
Inside-0.102
On-0.099
Token say
Token ID910
Feature activation-6.47e+00

Pos logit contributions
 harm+0.232
 rid+0.207
 decisions+0.203
 wounds+0.194
 choices+0.192
Neg logit contributions
 Inside-0.216
 Outside-0.199
Inside-0.191
 Across-0.190
On-0.189
Token '
Token ID705
Feature activation-1.07e+01

Pos logit contributions
 harm+0.278
keeping+0.245
 rid+0.240
 decisions+0.234
 affect+0.233
Neg logit contributions
 Inside-0.210
 Outside-0.189
 Across-0.183
 Suddenly-0.177
 Two-0.172
Tokenplease
Token ID29688
Feature activation-1.29e+01

Pos logit contributions
 harm+0.450
 decisions+0.399
 rid+0.391
 wounds+0.389
 disl+0.387
Neg logit contributions
 Inside-0.352
 Across-0.328
 Outside-0.321
On-0.312
Inside-0.311
Token'
Token ID6
Feature activation-6.42e+00

Pos logit contributions
 harm+0.595
 rid+0.528
 disl+0.527
 wounds+0.515
 decisions+0.515
Neg logit contributions
 Inside-0.377
 Outside-0.338
 Across-0.337
Inside-0.317
 Suddenly-0.312
Token and
Token ID290
Feature activation-6.95e+00

Pos logit contributions
 harm+0.294
 rid+0.257
 disl+0.252
 affect+0.244
 wounds+0.243
Neg logit contributions
 Inside-0.194
 Outside-0.170
 Across-0.166
 Suddenly-0.166
On-0.164
Token '
Token ID705
Feature activation-8.99e+00

Pos logit contributions
 harm+0.310
 rid+0.267
 decisions+0.264
 choices+0.258
 impact+0.253
Neg logit contributions
 Inside-0.229
 Outside-0.200
 Across-0.197
Inside-0.191
 Suddenly-0.186
Tokenthank
Token ID40716
Feature activation-6.38e+00

Pos logit contributions
 harm+0.385
 decisions+0.340
 wounds+0.333
 rid+0.332
 choices+0.329
Neg logit contributions
 Inside-0.292
 Across-0.272
 Outside-0.266
On-0.261
Inside-0.259
Token you
Token ID345
Feature activation-8.69e+00

Pos logit contributions
 harm+0.239
 rid+0.206
 affect+0.204
 disl+0.203
 wounds+0.203
Neg logit contributions
 Inside-0.241
 Outside-0.215
 Two-0.212
 Across-0.210
Inside-0.209
Token am
Token ID716
Feature activation-2.59e+00

Pos logit contributions
 harm+0.329
 decisions+0.298
 wounds+0.290
 supervision+0.289
 choices+0.289
Neg logit contributions
 Inside-0.249
 Across-0.223
 Outside-0.223
 On-0.216
On-0.215
Token taking
Token ID2263
Feature activation-7.28e+00

Pos logit contributions
 harm+0.118
 decisions+0.106
 conclusions+0.105
keeping+0.105
 affect+0.103
Neg logit contributions
 Inside-0.077
 Outside-0.066
 Across-0.065
 Suddenly-0.062
lic-0.060
Token the
Token ID262
Feature activation-5.83e+00

Pos logit contributions
 harm+0.318
 affect+0.289
 impact+0.282
 disl+0.279
 rid+0.275
Neg logit contributions
 Inside-0.221
 Outside-0.194
 Across-0.187
 Suddenly-0.182
Inside-0.180
Token rh
Token ID9529
Feature activation-6.79e+00

Pos logit contributions
 harm+0.259
 disl+0.238
 rid+0.233
 decisions+0.228
 wounds+0.225
Neg logit contributions
 Inside-0.160
 Across-0.147
 Outside-0.145
Under-0.134
On-0.133
Tokenin
Token ID259
Feature activation-4.49e+00

Pos logit contributions
 harm+0.307
 disl+0.302
 rid+0.284
 supervision+0.281
 decisions+0.275
Neg logit contributions
 Inside-0.166
Over-0.156
lic-0.154
 Across-0.152
On-0.150
Tokenoc
Token ID420
Feature activation-1.28e+01

Pos logit contributions
 harm+0.188
 disl+0.180
 wounds+0.175
 supervision+0.173
 rid+0.172
Neg logit contributions
 Inside-0.132
 Across-0.120
 Outside-0.120
 On-0.117
 Along-0.115
Tokeneros
Token ID27498
Feature activation-3.48e+00

Pos logit contributions
 harm+0.565
 rid+0.535
 disl+0.519
 affect+0.511
 conclusions+0.492
Neg logit contributions
 Inside-0.310
 Across-0.279
ippy-0.267
 Outside-0.265
 Suddenly-0.265
Token to
Token ID284
Feature activation-3.50e+00

Pos logit contributions
 harm+0.145
 disl+0.138
 wounds+0.133
 rid+0.130
 impact+0.129
Neg logit contributions
 Inside-0.100
 Across-0.092
 Outside-0.090
On-0.086
 On-0.084
Token a
Token ID257
Feature activation+1.47e+00

Pos logit contributions
 harm+0.144
 rid+0.130
 decisions+0.129
 disl+0.129
 wounds+0.128
Neg logit contributions
 Inside-0.113
 Outside-0.098
 Across-0.096
Inside-0.095
On-0.094
Token new
Token ID649
Feature activation-1.65e+00

Pos logit contributions
 Inside+0.041
 Across+0.037
 Outside+0.037
On+0.034
Inside+0.034
Neg logit contributions
 harm-0.069
 rid-0.062
 decisions-0.061
 disl-0.060
 choices-0.059
Token home
Token ID1363
Feature activation+2.65e+00

Pos logit contributions
 harm+0.072
 rid+0.067
 disl+0.065
 supervision+0.064
 decisions+0.063
Neg logit contributions
 Inside-0.050
 Outside-0.045
 Across-0.044
On-0.043
 Suddenly-0.042
Token love
Token ID1842
Feature activation-5.12e+00

Pos logit contributions
 harm+0.169
 rid+0.147
 decisions+0.145
 disl+0.141
 wounds+0.138
Neg logit contributions
 Inside-0.147
 Across-0.135
 Outside-0.135
Inside-0.127
lic-0.126
Token your
Token ID534
Feature activation-4.58e+00

Pos logit contributions
 harm+0.209
 decisions+0.198
 rid+0.198
 impact+0.195
 affect+0.191
Neg logit contributions
 Inside-0.170
 Outside-0.142
 Across-0.134
Inside-0.134
 A-0.131
Token ch
Token ID442
Feature activation-3.76e+00

Pos logit contributions
 harm+0.208
 rid+0.186
 decisions+0.183
 supervision+0.177
 conclusions+0.176
Neg logit contributions
 Inside-0.137
 Outside-0.125
 Across-0.122
 On-0.111
 Suddenly-0.111
Tokenubby
Token ID38393
Feature activation-5.10e+00

Pos logit contributions
 harm+0.171
 rid+0.155
 decisions+0.149
 disl+0.148
 wounds+0.144
Neg logit contributions
 Inside-0.109
 Outside-0.099
 Across-0.099
lic-0.091
 On-0.090
Token cat
Token ID3797
Feature activation-6.69e+00

Pos logit contributions
 harm+0.224
 rid+0.200
 decisions+0.194
 disl+0.190
 supervision+0.187
Neg logit contributions
 Inside-0.158
 Across-0.146
 Outside-0.145
 Suddenly-0.133
 On-0.133
Token Mitt
Token ID16627
Feature activation-1.28e+01

Pos logit contributions
 harm+0.305
 rid+0.278
 conclusions+0.267
 disl+0.264
 decisions+0.263
Neg logit contributions
 Inside-0.191
 Across-0.170
 Outside-0.166
 Suddenly-0.163
lic-0.159
Tokenens
Token ID641
Feature activation-5.23e+00

Pos logit contributions
 harm+0.475
 rid+0.418
 decisions+0.401
 wounds+0.400
 disl+0.381
Neg logit contributions
 Inside-0.485
 Outside-0.450
 Across-0.448
lic-0.438
 On-0.428
Token,"
Token ID553
Feature activation-2.56e+00

Pos logit contributions
 harm+0.200
 rid+0.176
 disl+0.169
 decisions+0.166
 wounds+0.162
Neg logit contributions
 Inside-0.195
 Across-0.179
 Outside-0.176
 Suddenly-0.169
 On-0.169
Token Sarah
Token ID10490
Feature activation-2.79e+00

Pos logit contributions
 harm+0.124
 rid+0.111
 decisions+0.109
 disl+0.106
 wounds+0.105
Neg logit contributions
 Inside-0.073
 Across-0.063
 Outside-0.063
 On-0.059
Inside-0.059
Token said
Token ID531
Feature activation-8.19e-01

Pos logit contributions
 harm+0.136
 rid+0.121
 decisions+0.120
 disl+0.117
 choices+0.115
Neg logit contributions
 Inside-0.076
 Across-0.068
 Outside-0.067
 On-0.062
Inside-0.062
Token.
Token ID13
Feature activation-1.27e+00

Pos logit contributions
 harm+0.038
 decisions+0.034
 disl+0.033
 rid+0.033
 conclusions+0.033
Neg logit contributions
 Inside-0.024
 Outside-0.021
 Across-0.021
 Suddenly-0.020
 On-0.020
Token wished
Token ID16555
Feature activation-2.83e+00

Pos logit contributions
 harm+0.111
 decisions+0.100
 rid+0.098
 disl+0.095
 impact+0.095
Neg logit contributions
 Inside-0.094
 Across-0.085
 Outside-0.084
 Suddenly-0.081
 On-0.080
Token she
Token ID673
Feature activation-2.67e+00

Pos logit contributions
 harm+0.154
 decisions+0.141
 disl+0.141
 rid+0.139
keeping+0.137
Neg logit contributions
 Inside-0.063
 Outside-0.050
 Suddenly-0.049
 Across-0.048
 On-0.046
Token had
Token ID550
Feature activation-1.18e+00

Pos logit contributions
 harm+0.132
 rid+0.118
 decisions+0.115
 disl+0.113
 wounds+0.112
Neg logit contributions
 Inside-0.070
 Across-0.063
 Outside-0.062
lic-0.057
 On-0.057
Token never
Token ID1239
Feature activation-4.38e+00

Pos logit contributions
 harm+0.051
 decisions+0.045
 rid+0.045
 disl+0.043
 choices+0.043
Neg logit contributions
 Inside-0.038
 Across-0.035
 Outside-0.034
Inside-0.032
Over-0.032
Token let
Token ID1309
Feature activation-6.83e+00

Pos logit contributions
 harm+0.189
 rid+0.159
 disl+0.148
 supervision+0.148
 decisions+0.147
Neg logit contributions
 Inside-0.148
 Outside-0.138
 Across-0.137
 On-0.133
lic-0.130
Token Mitt
Token ID16627
Feature activation-1.27e+01

Pos logit contributions
 harm+0.292
 disl+0.285
 decisions+0.269
 wounds+0.269
 impact+0.266
Neg logit contributions
 Inside-0.205
 Outside-0.183
 Across-0.180
 On-0.171
 Suddenly-0.169
Tokenens
Token ID641
Feature activation-5.10e+00

Pos logit contributions
 harm+0.355
 rid+0.285
 decisions+0.269
 disl+0.262
 supervision+0.259
Neg logit contributions
 Inside-0.613
 Across-0.577
 Outside-0.571
Over-0.548
On-0.547
Token run
Token ID1057
Feature activation-3.58e+00

Pos logit contributions
 harm+0.200
 decisions+0.180
 disl+0.179
 rid+0.174
 choices+0.174
Neg logit contributions
 Inside-0.174
 Across-0.162
 Outside-0.160
 On-0.151
On-0.150
Token away
Token ID1497
Feature activation+1.96e+00

Pos logit contributions
 disl+0.141
 harm+0.141
 decisions+0.137
 supervision+0.132
 choices+0.131
Neg logit contributions
 Inside-0.113
 Outside-0.101
 Across-0.099
 Suddenly-0.097
On-0.097
Token.
Token ID13
Feature activation-7.75e-01

Pos logit contributions
 Inside+0.053
On+0.045
 Outside+0.043
 Across+0.043
 On+0.042
Neg logit contributions
 harm-0.089
 disl-0.087
 supervision-0.084
 decisions-0.081
 impact-0.081
Token<|endoftext|>
Token ID50256
Feature activation+1.15e-01

Pos logit contributions
 harm+0.043
 rid+0.039
 decisions+0.038
 disl+0.037
 choices+0.037
Neg logit contributions
 Inside-0.016
 Across-0.014
 Outside-0.014
 On-0.013
Inside-0.012
Token hug
Token ID16225
Feature activation-3.07e+00

Pos logit contributions
 Inside+0.023
On+0.020
 On+0.020
Over+0.019
 Outside+0.019
Neg logit contributions
 harm-0.032
 decisions-0.030
keeping-0.029
 supervision-0.028
 rid-0.028
Token him
Token ID683
Feature activation-9.16e-01

Pos logit contributions
 harm+0.154
 decisions+0.139
 disl+0.139
 rid+0.139
 supervision+0.137
Neg logit contributions
 Inside-0.076
 Suddenly-0.068
 Outside-0.064
 While-0.064
 On-0.063
Token and
Token ID290
Feature activation-4.07e+00

Pos logit contributions
 harm+0.043
 rid+0.038
 disl+0.037
 decisions+0.037
 supervision+0.036
Neg logit contributions
 Inside-0.027
 Across-0.024
 Outside-0.024
 On-0.023
On-0.023
Token say
Token ID910
Feature activation-3.88e+00

Pos logit contributions
 harm+0.152
 rid+0.134
 decisions+0.132
edient+0.125
keeping+0.124
Neg logit contributions
 Inside-0.152
 Outside-0.133
On-0.131
 On-0.131
 Together-0.129
Token "
Token ID366
Feature activation-7.54e+00

Pos logit contributions
 harm+0.214
 rid+0.192
 decisions+0.189
 disl+0.184
 wounds+0.184
Neg logit contributions
 Inside-0.084
 Across-0.071
 Outside-0.071
lic-0.064
Inside-0.063
Tokenwe
Token ID732
Feature activation-1.25e+01

Pos logit contributions
 harm+0.318
 decisions+0.288
 wounds+0.280
 supervision+0.279
 disl+0.273
Neg logit contributions
 Inside-0.242
Over-0.234
On-0.231
 Outside-0.225
 Across-0.225
Token love
Token ID1842
Feature activation-7.09e+00

Pos logit contributions
 harm+0.526
 decisions+0.449
 wounds+0.445
 disl+0.444
 rid+0.441
Neg logit contributions
 Inside-0.434
 Across-0.389
 Outside-0.389
lic-0.369
 On-0.369
Token you
Token ID345
Feature activation-8.92e+00

Pos logit contributions
 harm+0.257
 rid+0.233
 decisions+0.224
 impact+0.224
 disl+0.221
Neg logit contributions
 Inside-0.277
 Outside-0.239
 Suddenly-0.231
 On-0.231
Inside-0.229
Token,
Token ID11
Feature activation-2.87e-01

Pos logit contributions
 harm+0.450
 rid+0.400
 disl+0.394
 decisions+0.388
 wounds+0.381
Neg logit contributions
 Inside-0.236
 Outside-0.205
 Across-0.204
 On-0.189
lic-0.189
Token Max
Token ID5436
Feature activation-3.68e+00

Pos logit contributions
 harm+0.015
 rid+0.014
keeping+0.014
 decisions+0.014
 disl+0.013
Neg logit contributions
 Inside-0.007
 Outside-0.005
 Across-0.005
 On-0.005
Inside-0.005
Token".
Token ID1911
Feature activation-1.85e+00

Pos logit contributions
 harm+0.166
 rid+0.151
 disl+0.146
 conclusions+0.144
 decisions+0.144
Neg logit contributions
 Inside-0.107
 Across-0.094
 Outside-0.093
 Suddenly-0.092
 Together-0.092
Token about
Token ID546
Feature activation-1.95e+00

Pos logit contributions
 harm+0.249
 rid+0.223
 disl+0.219
 decisions+0.218
 wounds+0.217
Neg logit contributions
 Inside-0.191
 Across-0.171
 Outside-0.169
On-0.161
Inside-0.160
Token who
Token ID508
Feature activation-4.26e+00

Pos logit contributions
 harm+0.077
 rid+0.073
 wounds+0.068
 decisions+0.067
 supervision+0.067
Neg logit contributions
 Inside-0.062
 Across-0.056
 Outside-0.055
 Suddenly-0.052
On-0.051
Token has
Token ID468
Feature activation-9.09e-01

Pos logit contributions
 harm+0.168
 wounds+0.158
 rid+0.157
keeping+0.156
 decisions+0.155
Neg logit contributions
 Inside-0.134
 Across-0.120
 Outside-0.118
 Suddenly-0.111
 On-0.110
Token the
Token ID262
Feature activation-3.62e+00

Pos logit contributions
 harm+0.042
 rid+0.038
 disl+0.038
 wounds+0.037
 decisions+0.037
Neg logit contributions
 Inside-0.024
 Across-0.021
 Outside-0.020
 Suddenly-0.019
Over-0.019
Token sh
Token ID427
Feature activation-3.22e+00

Pos logit contributions
 harm+0.155
 disl+0.150
keeping+0.144
 rid+0.139
 wounds+0.136
Neg logit contributions
 Inside-0.093
 Across-0.087
 Outside-0.086
 Suddenly-0.081
 Two-0.079
Tokenini
Token ID5362
Feature activation-1.25e+01

Pos logit contributions
 harm+0.102
 rid+0.095
 supervision+0.093
 wounds+0.091
 decisions+0.090
Neg logit contributions
lic-0.121
 Outside-0.116
 Across-0.115
 Inside-0.115
 Suddenly-0.107
Tokenest
Token ID395
Feature activation-2.18e+00

Pos logit contributions
 harm+0.559
 wounds+0.532
 disl+0.509
keeping+0.506
 supervision+0.499
Neg logit contributions
 Suddenly-0.286
 Outside-0.277
 Two-0.268
 Across-0.267
 Inside-0.257
Token gem
Token ID16840
Feature activation+4.90e-01

Pos logit contributions
 harm+0.094
 disl+0.085
 rid+0.085
 supervision+0.082
 decisions+0.081
Neg logit contributions
 Inside-0.060
 Outside-0.056
 Across-0.055
 Suddenly-0.052
lic-0.051
Token or
Token ID393
Feature activation-5.30e+00

Pos logit contributions
 Inside+0.012
 Across+0.010
 Outside+0.010
Inside+0.009
lic+0.009
Neg logit contributions
 harm-0.024
 decisions-0.022
 wounds-0.022
 rid-0.022
 conclusions-0.022
Token the
Token ID262
Feature activation-2.34e+00

Pos logit contributions
 harm+0.224
 rid+0.200
 decisions+0.197
 impact+0.195
edient+0.192
Neg logit contributions
 Inside-0.159
 Across-0.137
 Outside-0.134
 Suddenly-0.129
 A-0.127
Token best
Token ID1266
Feature activation-3.37e+00

Pos logit contributions
 harm+0.097
 disl+0.090
 decisions+0.086
 rid+0.086
 supervision+0.085
Neg logit contributions
 Inside-0.057
 two-0.056
 Across-0.054
 Suddenly-0.054
 Outside-0.053
Token't
Token ID470
Feature activation-7.02e+00

Pos logit contributions
 harm+0.372
 rid+0.334
 decisions+0.328
 disl+0.322
 supervision+0.316
Neg logit contributions
 Inside-0.232
 Across-0.209
 Outside-0.207
Inside-0.193
 On-0.193
Token like
Token ID588
Feature activation-4.80e+00

Pos logit contributions
 wounds+0.243
 supervision+0.235
 harm+0.234
 rid+0.234
 decisions+0.226
Neg logit contributions
 Inside-0.233
 Suddenly-0.218
 On-0.216
 Outside-0.214
 Two-0.210
Token spiders
Token ID26120
Feature activation-4.83e+00

Pos logit contributions
edient+0.198
 harm+0.194
 impact+0.191
 rid+0.189
keeping+0.187
Neg logit contributions
 Inside-0.135
 Outside-0.114
 Across-0.111
 A-0.107
 Suddenly-0.105
Token or
Token ID393
Feature activation-7.01e+00

Pos logit contributions
 harm+0.214
 rid+0.209
 decisions+0.201
 disl+0.197
keeping+0.191
Neg logit contributions
 Inside-0.120
 Across-0.105
 Outside-0.105
On-0.102
 Suddenly-0.101
Token cob
Token ID22843
Feature activation-2.45e+00

Pos logit contributions
 harm+0.251
 decisions+0.236
keeping+0.227
 rid+0.226
 impact+0.223
Neg logit contributions
 Inside-0.235
 Across-0.217
 Outside-0.210
Over-0.204
Under-0.203
Tokenwe
Token ID732
Feature activation-1.25e+01

Pos logit contributions
 harm+0.065
 rid+0.053
 decisions+0.053
 impact+0.050
 disl+0.049
Neg logit contributions
 Inside-0.116
 Outside-0.110
 Across-0.109
lic-0.105
Inside-0.104
Tokenbs
Token ID1443
Feature activation-5.41e+00

Pos logit contributions
 harm+0.507
 decisions+0.468
 wounds+0.447
 choices+0.445
 rid+0.440
Neg logit contributions
 Inside-0.412
 Across-0.386
 Outside-0.381
lic-0.365
 On-0.357
Token.
Token ID13
Feature activation-2.80e+00

Pos logit contributions
 harm+0.240
 disl+0.234
 decisions+0.226
 rid+0.226
 impact+0.223
Neg logit contributions
 Inside-0.142
 Outside-0.123
 Across-0.123
 On-0.115
On-0.115
Token But
Token ID887
Feature activation-3.61e+00

Pos logit contributions
 harm+0.135
 rid+0.119
 decisions+0.117
 disl+0.113
 wounds+0.112
Neg logit contributions
 Inside-0.080
 Outside-0.071
 Across-0.071
 On-0.066
Inside-0.065
Token I
Token ID314
Feature activation-5.17e+00

Pos logit contributions
 harm+0.124
 rid+0.105
 decisions+0.102
 disl+0.099
 wounds+0.096
Neg logit contributions
 Inside-0.150
 Across-0.140
 Outside-0.139
Inside-0.132
 On-0.131
Token like
Token ID588
Feature activation-3.35e+00

Pos logit contributions
 harm+0.228
 decisions+0.221
keeping+0.216
 supervision+0.215
 choices+0.212
Neg logit contributions
 Inside-0.153
 Suddenly-0.138
 On-0.138
On-0.135
 Outside-0.132
Token she
Token ID673
Feature activation-2.32e+00

Pos logit contributions
 harm+0.207
 rid+0.193
 wounds+0.191
 decisions+0.188
 supervision+0.188
Neg logit contributions
 Inside-0.083
 Across-0.069
 Outside-0.068
 Suddenly-0.067
 On-0.065
Token had
Token ID550
Feature activation-4.41e-01

Pos logit contributions
 harm+0.124
 decisions+0.118
 supervision+0.114
 choices+0.113
 impact+0.112
Neg logit contributions
 Inside-0.046
 Across-0.041
lic-0.040
 Outside-0.039
 Suddenly-0.039
Token tried
Token ID3088
Feature activation-3.63e-01

Pos logit contributions
 harm+0.020
 decisions+0.019
 choices+0.018
 rid+0.018
 impact+0.018
Neg logit contributions
 Inside-0.012
 Across-0.011
 Outside-0.011
Over-0.010
On-0.010
Token to
Token ID284
Feature activation-5.15e+00

Pos logit contributions
 harm+0.017
 rid+0.015
 impact+0.015
 decisions+0.015
 disl+0.015
Neg logit contributions
 Inside-0.009
 Outside-0.008
 Suddenly-0.008
 Across-0.008
On-0.008
Token fit
Token ID4197
Feature activation-1.52e+00

Pos logit contributions
 harm+0.194
 decisions+0.186
 supervision+0.177
 choices+0.176
 disl+0.175
Neg logit contributions
 Inside-0.162
 Outside-0.149
Inside-0.146
On-0.145
 Across-0.142
Token Mitt
Token ID16627
Feature activation-1.24e+01

Pos logit contributions
 harm+0.070
 disl+0.065
 affect+0.064
 impact+0.063
 benefit+0.061
Neg logit contributions
 Inside-0.042
 Outside-0.036
 Across-0.034
 Suddenly-0.033
Inside-0.032
Tokenens
Token ID641
Feature activation-4.78e+00

Pos logit contributions
 harm+0.416
 rid+0.350
 decisions+0.344
 disl+0.337
 wounds+0.328
Neg logit contributions
 Inside-0.514
 Across-0.484
 Outside-0.482
lic-0.462
 On-0.458
Token in
Token ID287
Feature activation+1.44e+00

Pos logit contributions
 harm+0.208
 disl+0.184
 rid+0.183
 decisions+0.182
 impact+0.174
Neg logit contributions
 Inside-0.151
 Across-0.138
 Outside-0.136
On-0.128
 On-0.127
Token the
Token ID262
Feature activation+2.14e-01

Pos logit contributions
 Inside+0.042
 Outside+0.036
 Across+0.036
 On+0.035
On+0.035
Neg logit contributions
 harm-0.066
 disl-0.060
 decisions-0.059
 rid-0.059
 affect-0.057
Token jar
Token ID17379
Feature activation+5.50e-01

Pos logit contributions
 Inside+0.006
 Across+0.005
 Outside+0.005
 On+0.004
Over+0.004
Neg logit contributions
 harm-0.011
 rid-0.010
 decisions-0.009
 disl-0.009
 supervision-0.009
Token.
Token ID13
Feature activation-8.93e-01

Pos logit contributions
 Inside+0.018
 Across+0.016
 Outside+0.016
On+0.015
Inside+0.015
Neg logit contributions
 harm-0.024
 rid-0.021
 disl-0.021
 decisions-0.021
 wounds-0.020
Token't
Token ID470
Feature activation-9.57e+00

Pos logit contributions
 harm+0.038
 rid+0.034
 decisions+0.033
 disl+0.033
 choices+0.032
Neg logit contributions
 Inside-0.018
 Across-0.016
 Outside-0.016
Inside-0.014
lic-0.014
Token know
Token ID760
Feature activation-4.15e+00

Pos logit contributions
 wounds+0.331
 harm+0.330
 rid+0.325
 decisions+0.323
 disl+0.308
Neg logit contributions
 Inside-0.314
 Outside-0.285
 Across-0.284
On-0.271
 Suddenly-0.268
Token the
Token ID262
Feature activation-5.14e+00

Pos logit contributions
 harm+0.182
 rid+0.168
 disl+0.158
 decisions+0.158
 impact+0.157
Neg logit contributions
 Inside-0.129
 Across-0.112
 Outside-0.111
Inside-0.102
 Suddenly-0.102
Token rh
Token ID9529
Feature activation-6.45e+00

Pos logit contributions
 harm+0.217
 disl+0.199
 rid+0.198
 supervision+0.188
 affect+0.187
Neg logit contributions
 Inside-0.145
 Across-0.136
 Outside-0.131
 Suddenly-0.122
 Two-0.122
Tokenin
Token ID259
Feature activation-4.13e+00

Pos logit contributions
 harm+0.286
 disl+0.280
 rid+0.265
 decisions+0.260
 supervision+0.259
Neg logit contributions
 Inside-0.165
Over-0.153
 Across-0.152
lic-0.147
On-0.146
Tokenoc
Token ID420
Feature activation-1.23e+01

Pos logit contributions
 harm+0.167
 rid+0.151
 wounds+0.148
 disl+0.146
 decisions+0.145
Neg logit contributions
 Inside-0.136
 Across-0.124
 Outside-0.121
 Suddenly-0.119
 On-0.116
Tokeneros
Token ID27498
Feature activation-2.49e+00

Pos logit contributions
 harm+0.529
 rid+0.499
 affect+0.485
 disl+0.483
 conclusions+0.459
Neg logit contributions
 Inside-0.302
 Across-0.267
 Suddenly-0.259
ippy-0.251
und-0.251
Token was
Token ID373
Feature activation+2.00e+00

Pos logit contributions
 harm+0.136
 rid+0.127
 decisions+0.124
 disl+0.123
 wounds+0.123
Neg logit contributions
 Inside-0.048
 Across-0.042
 Outside-0.039
 Suddenly-0.036
Over-0.035
Token wild
Token ID4295
Feature activation-2.71e+00

Pos logit contributions
 Inside+0.058
 Across+0.053
 Outside+0.051
On+0.049
 Suddenly+0.049
Neg logit contributions
 harm-0.088
 disl-0.082
 decisions-0.081
 rid-0.080
 conclusions-0.080
Token.
Token ID13
Feature activation-9.89e-01

Pos logit contributions
 harm+0.120
 rid+0.115
 disl+0.108
 wounds+0.107
 supervision+0.106
Neg logit contributions
 Inside-0.076
 Across-0.069
 Outside-0.068
 Suddenly-0.065
 On-0.064
Token Can
Token ID1680
Feature activation-6.53e+00

Pos logit contributions
 harm+0.049
 rid+0.044
 decisions+0.044
 disl+0.044
 supervision+0.042
Neg logit contributions
 Inside-0.023
 Across-0.021
 Outside-0.021
lic-0.020
Over-0.020
Token the
Token ID262
Feature activation-1.09e+00

Pos logit contributions
 Inside+0.125
 Across+0.110
 Outside+0.110
 On+0.105
On+0.104
Neg logit contributions
 harm-0.171
 rid-0.151
 decisions-0.150
 disl-0.149
 supervision-0.146
Token tree
Token ID5509
Feature activation+3.19e+00

Pos logit contributions
 harm+0.054
 disl+0.050
 rid+0.050
 decisions+0.049
 supervision+0.047
Neg logit contributions
 Inside-0.028
 Across-0.024
 Outside-0.024
Inside-0.022
On-0.022
Token and
Token ID290
Feature activation-2.06e+00

Pos logit contributions
 Inside+0.108
 Across+0.103
 Outside+0.101
lic+0.097
Over+0.096
Neg logit contributions
 harm-0.129
 rid-0.111
 decisions-0.110
 choices-0.105
 supervision-0.105
Token unt
Token ID1418
Feature activation-3.83e+00

Pos logit contributions
 harm+0.102
 rid+0.093
 decisions+0.088
 disl+0.087
 wounds+0.087
Neg logit contributions
 Inside-0.052
 Across-0.048
 Outside-0.047
lic-0.044
 On-0.043
Tokenied
Token ID798
Feature activation-2.17e-01

Pos logit contributions
 harm+0.161
 disl+0.145
 rid+0.140
 affect+0.138
 impact+0.137
Neg logit contributions
 Inside-0.128
 Outside-0.112
 Across-0.111
Over-0.108
 On-0.106
Token Mitt
Token ID16627
Feature activation-1.23e+01

Pos logit contributions
 harm+0.009
 affect+0.008
 disl+0.008
 impact+0.008
 decisions+0.008
Neg logit contributions
 Inside-0.007
 Across-0.006
 Outside-0.006
 Suddenly-0.006
 On-0.006
Tokenens
Token ID641
Feature activation-5.29e+00

Pos logit contributions
 harm+0.375
 rid+0.311
 decisions+0.301
 disl+0.291
 choices+0.283
Neg logit contributions
 Inside-0.558
 Across-0.523
 Outside-0.520
Inside-0.497
lic-0.495
Token from
Token ID422
Feature activation-1.69e+00

Pos logit contributions
 harm+0.220
 rid+0.193
 decisions+0.189
 disl+0.188
 wounds+0.181
Neg logit contributions
 Inside-0.180
 Across-0.164
 Outside-0.163
 On-0.154
On-0.152
Token the
Token ID262
Feature activation-1.72e+00

Pos logit contributions
 harm+0.072
 decisions+0.068
 disl+0.067
 affect+0.066
 rid+0.066
Neg logit contributions
 Inside-0.051
 Outside-0.044
 Across-0.044
 Suddenly-0.043
 On-0.042
Token branches
Token ID13737
Feature activation+1.86e+00

Pos logit contributions
 harm+0.077
 disl+0.070
 rid+0.070
 decisions+0.068
 supervision+0.065
Neg logit contributions
 Inside-0.047
 Across-0.044
 Outside-0.043
 Suddenly-0.041
Over-0.040
Token.
Token ID13
Feature activation-7.14e-01

Pos logit contributions
 Inside+0.061
 Across+0.055
 Outside+0.054
On+0.052
 On+0.052
Neg logit contributions
 harm-0.079
 disl-0.071
 rid-0.070
 decisions-0.069
 wounds-0.067
Token "
Token ID366
Feature activation-6.56e+00

Pos logit contributions
 harm+0.041
 rid+0.038
 decisions+0.037
 disl+0.037
 choices+0.036
Neg logit contributions
 Inside-0.020
 Across-0.018
 Outside-0.018
On-0.016
Inside-0.016
TokenThank
Token ID10449
Feature activation-6.63e+00

Pos logit contributions
 harm+0.280
 decisions+0.247
 wounds+0.246
 supervision+0.234
 choices+0.233
Neg logit contributions
 Inside-0.226
On-0.205
 Across-0.204
Inside-0.204
 Outside-0.204
Token you
Token ID345
Feature activation-7.97e+00

Pos logit contributions
 harm+0.198
 rid+0.162
 decisions+0.156
 disl+0.155
 wounds+0.152
Neg logit contributions
 Inside-0.309
 Outside-0.285
 Across-0.284
Inside-0.276
 On-0.272
Token very
Token ID845
Feature activation-4.17e+00

Pos logit contributions
 harm+0.395
 disl+0.344
 rid+0.342
 affect+0.332
 wounds+0.331
Neg logit contributions
 Inside-0.231
 Outside-0.198
 Across-0.196
lic-0.190
 On-0.186
Token much
Token ID881
Feature activation-5.34e+00

Pos logit contributions
 harm+0.197
 rid+0.174
 decisions+0.168
 supervision+0.166
 disl+0.166
Neg logit contributions
 Inside-0.124
 Across-0.109
 Outside-0.108
Inside-0.102
lic-0.102
Token Mrs
Token ID9074
Feature activation-1.23e+01

Pos logit contributions
 harm+0.217
 rid+0.194
 disl+0.182
 wounds+0.181
 supervision+0.179
Neg logit contributions
 Inside-0.193
 Outside-0.170
 Across-0.168
 On-0.165
Inside-0.162
Token Berry
Token ID20165
Feature activation-6.31e+00

Pos logit contributions
 harm+0.523
 rid+0.468
 disl+0.456
 decisions+0.445
 wounds+0.439
Neg logit contributions
 Inside-0.412
 Across-0.367
 Outside-0.366
 On-0.346
lic-0.337
Token."
Token ID526
Feature activation-3.26e+00

Pos logit contributions
 harm+0.247
 rid+0.214
 wounds+0.206
 disl+0.204
 decisions+0.202
Neg logit contributions
 Inside-0.235
 Across-0.211
 Outside-0.208
lic-0.201
 On-0.200
Token Mrs
Token ID9074
Feature activation-8.53e+00

Pos logit contributions
 harm+0.158
 rid+0.142
 decisions+0.140
 disl+0.138
 choices+0.136
Neg logit contributions
 Inside-0.082
 Across-0.076
 Outside-0.074
lic-0.070
Over-0.070
Token Berry
Token ID20165
Feature activation-2.94e+00

Pos logit contributions
 harm+0.320
 rid+0.276
 decisions+0.271
 choices+0.261
 disl+0.259
Neg logit contributions
 Inside-0.322
 Across-0.302
 Outside-0.299
Inside-0.286
lic-0.283
Token then
Token ID788
Feature activation+1.76e+00

Pos logit contributions
 harm+0.138
 rid+0.123
 decisions+0.121
 disl+0.119
 choices+0.117
Neg logit contributions
 Inside-0.083
 Across-0.075
 Outside-0.075
Inside-0.069
lic-0.068
Token each
Token ID1123
Feature activation-5.56e+00

Pos logit contributions
 harm+0.100
 rid+0.089
 disl+0.086
 affect+0.084
 decisions+0.083
Neg logit contributions
 Inside-0.076
 Outside-0.069
 Suddenly-0.067
 Across-0.067
 On-0.066
Token other
Token ID584
Feature activation-5.18e+00

Pos logit contributions
 harm+0.212
 disl+0.191
 rid+0.182
 decisions+0.179
 supervision+0.175
Neg logit contributions
 Inside-0.210
 Outside-0.192
 Across-0.191
On-0.184
 On-0.184
Token and
Token ID290
Feature activation-3.40e+00

Pos logit contributions
 harm+0.254
 rid+0.225
 disl+0.222
 affect+0.215
 decisions+0.215
Neg logit contributions
 Inside-0.142
 Outside-0.124
 Across-0.123
 Suddenly-0.115
On-0.115
Token said
Token ID531
Feature activation-1.59e+00

Pos logit contributions
 harm+0.163
 rid+0.143
 decisions+0.142
 disl+0.140
 affect+0.138
Neg logit contributions
 Inside-0.097
 Outside-0.083
 Across-0.083
 Suddenly-0.078
Over-0.078
Token "
Token ID366
Feature activation-7.24e+00

Pos logit contributions
 harm+0.079
 rid+0.071
 decisions+0.070
 disl+0.069
 choices+0.068
Neg logit contributions
 Inside-0.040
 Across-0.036
 Outside-0.036
Inside-0.033
 On-0.033
Tokenwe
Token ID732
Feature activation-1.22e+01

Pos logit contributions
 harm+0.331
 decisions+0.296
 wounds+0.292
 choices+0.284
 supervision+0.284
Neg logit contributions
 Inside-0.221
 Across-0.202
On-0.201
 Outside-0.200
Inside-0.198
Token did
Token ID750
Feature activation-2.89e-02

Pos logit contributions
 harm+0.529
 decisions+0.453
 wounds+0.449
 rid+0.446
 disl+0.442
Neg logit contributions
 Inside-0.409
 Across-0.368
 Outside-0.364
lic-0.351
Inside-0.343
Token a
Token ID257
Feature activation-1.53e+00

Pos logit contributions
 harm+0.001
 rid+0.001
 affect+0.001
 wounds+0.001
 supervision+0.001
Neg logit contributions
 Inside-0.001
 Outside-0.001
 Suddenly-0.001
 Across-0.001
Inside-0.001
Token great
Token ID1049
Feature activation-4.62e+00

Pos logit contributions
 harm+0.068
 rid+0.060
 decisions+0.059
 disl+0.058
 choices+0.056
Neg logit contributions
 Inside-0.048
 Across-0.044
 Outside-0.044
Inside-0.041
lic-0.041
Token job
Token ID1693
Feature activation-2.12e+00

Pos logit contributions
 harm+0.170
 decisions+0.148
 rid+0.145
 choices+0.141
 impact+0.138
Neg logit contributions
 Inside-0.180
 Across-0.169
 Outside-0.166
lic-0.160
Inside-0.160
Token!
Token ID0
Feature activation-2.87e+00

Pos logit contributions
 harm+0.084
 rid+0.074
 wounds+0.070
 disl+0.070
 conclusions+0.069
Neg logit contributions
 Inside-0.077
 Outside-0.070
 Across-0.069
Inside-0.065
lic-0.065
Token,
Token ID11
Feature activation+3.09e+00

Pos logit contributions
 Inside+0.066
 Across+0.059
 Outside+0.058
Inside+0.053
On+0.053
Neg logit contributions
 harm-0.166
 rid-0.152
 decisions-0.148
 disl-0.146
 choices-0.144
Token "
Token ID366
Feature activation-4.65e+00

Pos logit contributions
 Inside+0.076
 Across+0.071
 Outside+0.069
Over+0.068
shop+0.066
Neg logit contributions
 harm-0.144
 rid-0.134
 disl-0.130
 supervision-0.124
 wounds-0.123
TokenWe
Token ID1135
Feature activation-5.70e+00

Pos logit contributions
 harm+0.209
 decisions+0.182
 rid+0.180
 disl+0.174
 choices+0.174
Neg logit contributions
 Inside-0.149
 Outside-0.133
 Across-0.133
On-0.130
Inside-0.129
Token can
Token ID460
Feature activation-2.38e+00

Pos logit contributions
 harm+0.238
 wounds+0.210
 disl+0.207
 decisions+0.207
 rid+0.204
Neg logit contributions
 Inside-0.196
 Outside-0.170
 Across-0.169
 On-0.166
 Suddenly-0.162
Token take
Token ID1011
Feature activation-3.04e+00

Pos logit contributions
 harm+0.083
 rid+0.070
 decisions+0.067
 disl+0.066
 wounds+0.064
Neg logit contributions
 Inside-0.098
 Across-0.092
 Outside-0.091
lic-0.087
Over-0.086
Token Mitt
Token ID16627
Feature activation-1.22e+01

Pos logit contributions
 harm+0.123
 disl+0.107
 rid+0.105
 decisions+0.105
 affect+0.104
Neg logit contributions
 Inside-0.106
 Outside-0.095
 Across-0.093
 Suddenly-0.090
 On-0.089
Tokenens
Token ID641
Feature activation-4.90e+00

Pos logit contributions
 harm+0.392
 rid+0.330
 decisions+0.320
 disl+0.312
 wounds+0.302
Neg logit contributions
 Inside-0.527
 Across-0.495
 Outside-0.493
lic-0.470
 On-0.469
Token to
Token ID284
Feature activation-3.41e+00

Pos logit contributions
 harm+0.214
 disl+0.191
 rid+0.186
 decisions+0.184
 impact+0.181
Neg logit contributions
 Inside-0.154
 Across-0.140
 Outside-0.139
 On-0.129
On-0.129
Token the
Token ID262
Feature activation-1.82e+00

Pos logit contributions
 harm+0.151
 decisions+0.137
 disl+0.136
 rid+0.136
 choices+0.134
Neg logit contributions
 Inside-0.102
 Outside-0.087
 Across-0.085
 On-0.084
Inside-0.084
Token vet
Token ID20202
Feature activation-4.60e+00

Pos logit contributions
 harm+0.078
 rid+0.068
 decisions+0.066
 disl+0.065
 wounds+0.064
Neg logit contributions
 Inside-0.061
 Across-0.056
 Outside-0.055
lic-0.052
Inside-0.052
Token.
Token ID13
Feature activation-1.39e+00

Pos logit contributions
 harm+0.214
 disl+0.191
 rid+0.191
 decisions+0.188
 conclusions+0.183
Neg logit contributions
 Inside-0.136
 Outside-0.119
 Across-0.117
 Suddenly-0.114
 On-0.111
Token a
Token ID257
Feature activation+6.04e+00

Pos logit contributions
 Inside+0.149
 Across+0.136
 Outside+0.134
lic+0.124
 On+0.116
Neg logit contributions
 harm-0.340
 rid-0.306
 decisions-0.296
 disl-0.293
 wounds-0.289
Token time
Token ID640
Feature activation+5.41e+00

Pos logit contributions
 Inside+0.204
 Across+0.185
 Outside+0.179
lic+0.176
Over+0.170
Neg logit contributions
 harm-0.254
 impact-0.215
 disl-0.207
 decisions-0.206
 rid-0.203
Token,
Token ID11
Feature activation+7.21e+00

Pos logit contributions
lic+0.154
 Inside+0.154
 Across+0.153
shop+0.147
 Outside+0.146
Neg logit contributions
 harm-0.219
 disl-0.194
 decisions-0.188
 rid-0.182
 supervision-0.181
Token there
Token ID612
Feature activation+7.59e+00

Pos logit contributions
lic+0.249
shop+0.248
 Inside+0.224
 Outside+0.222
 Together+0.219
Neg logit contributions
 harm-0.231
 rid-0.204
edient-0.183
 theirs-0.173
 possible-0.173
Token was
Token ID373
Feature activation+7.22e+00

Pos logit contributions
 Inside+0.227
lic+0.220
 Across+0.217
 Outside+0.213
Inside+0.210
Neg logit contributions
 harm-0.327
 rid-0.274
 decisions-0.267
 disl-0.262
 impact-0.258
Token a
Token ID257
Feature activation+5.82e+00

Pos logit contributions
 Inside+0.192
lic+0.191
 Across+0.186
 Outside+0.183
Inside+0.165
Neg logit contributions
 harm-0.317
 rid-0.286
 decisions-0.271
 disl-0.270
 choices-0.264
Token seal
Token ID13810
Feature activation+3.33e+00

Pos logit contributions
 Inside+0.165
 Across+0.150
 Outside+0.147
 On+0.137
Inside+0.133
Neg logit contributions
 harm-0.271
 rid-0.240
 disl-0.232
 impact-0.230
 decisions-0.229
Token named
Token ID3706
Feature activation-2.12e+00

Pos logit contributions
 Inside+0.087
 Across+0.078
 Outside+0.077
Inside+0.070
lic+0.070
Neg logit contributions
 harm-0.166
 rid-0.148
 decisions-0.145
 disl-0.143
 choices-0.140
Token Sammy
Token ID44067
Feature activation-7.61e-01

Pos logit contributions
 harm+0.079
 rid+0.070
 disl+0.065
 decisions+0.063
 choices+0.061
Neg logit contributions
 Across-0.073
 Inside-0.071
 Outside-0.069
Over-0.068
shop-0.068
Token.
Token ID13
Feature activation+1.72e+00

Pos logit contributions
 harm+0.034
 decisions+0.033
 rid+0.032
 choices+0.031
 wounds+0.030
Neg logit contributions
 Inside-0.023
 Outside-0.020
On-0.020
 On-0.019
 Across-0.019
Token Sammy
Token ID44067
Feature activation-9.28e-02

Pos logit contributions
shop+0.068
 Across+0.068
 Inside+0.067
Over+0.066
 Outside+0.065
Neg logit contributions
 harm-0.051
 rid-0.044
 disl-0.042
 impact-0.040
 decisions-0.040
Token upon
Token ID2402
Feature activation+5.96e+00

Pos logit contributions
 Inside+0.121
 Across+0.106
 Outside+0.104
lic+0.093
Inside+0.093
Neg logit contributions
 harm-0.334
 rid-0.302
 decisions-0.297
 disl-0.292
 choices-0.288
Token a
Token ID257
Feature activation+5.50e+00

Pos logit contributions
 Inside+0.141
 Across+0.128
 Outside+0.127
lic+0.119
shop+0.109
Neg logit contributions
 harm-0.299
 rid-0.269
 decisions-0.258
 disl-0.257
 wounds-0.254
Token time
Token ID640
Feature activation+4.65e+00

Pos logit contributions
 Inside+0.188
 Across+0.172
lic+0.166
 Outside+0.166
Over+0.158
Neg logit contributions
 harm-0.224
 impact-0.188
 disl-0.182
 decisions-0.180
 rid-0.176
Token,
Token ID11
Feature activation+6.72e+00

Pos logit contributions
lic+0.146
 Across+0.140
 Inside+0.140
shop+0.137
 Outside+0.135
Neg logit contributions
 harm-0.173
 disl-0.150
 decisions-0.145
 choices-0.139
 supervision-0.139
Token there
Token ID612
Feature activation+6.89e+00

Pos logit contributions
lic+0.241
shop+0.234
 Outside+0.211
 Across+0.206
Inside+0.205
Neg logit contributions
 harm-0.216
 rid-0.187
edient-0.165
 theirs-0.164
 possible-0.160
Token was
Token ID373
Feature activation+6.44e+00

Pos logit contributions
 Inside+0.220
lic+0.218
 Across+0.210
Inside+0.207
 Outside+0.205
Neg logit contributions
 harm-0.279
 rid-0.227
 disl-0.220
 happen-0.220
 decisions-0.219
Token a
Token ID257
Feature activation+5.17e+00

Pos logit contributions
lic+0.183
 Inside+0.180
 Across+0.174
 Outside+0.172
Inside+0.160
Neg logit contributions
 harm-0.266
 rid-0.239
 decisions-0.225
 disl-0.223
 choices-0.219
Token little
Token ID1310
Feature activation+1.57e+00

Pos logit contributions
 Inside+0.154
 Across+0.141
 Outside+0.139
 On+0.129
Inside+0.127
Neg logit contributions
 harm-0.230
 rid-0.203
 disl-0.196
 impact-0.194
 wounds-0.193
Token fish
Token ID5916
Feature activation+1.94e+00

Pos logit contributions
 Inside+0.060
 Across+0.056
 Outside+0.056
Inside+0.053
 On+0.053
Neg logit contributions
 harm-0.058
 rid-0.050
 decisions-0.049
 choices-0.047
 impact-0.047
Token named
Token ID3706
Feature activation-2.61e+00

Pos logit contributions
 Inside+0.054
 Across+0.049
 Outside+0.048
Inside+0.045
lic+0.044
Neg logit contributions
 harm-0.094
 rid-0.083
 decisions-0.080
 disl-0.079
 impact-0.078
Token Fin
Token ID4463
Feature activation-6.58e-01

Pos logit contributions
 harm+0.094
 rid+0.084
 disl+0.078
 decisions+0.076
 choices+0.073
Neg logit contributions
 Inside-0.098
 Across-0.095
 Outside-0.093
Over-0.090
shop-0.089
Token.
Token ID13
Feature activation+1.43e-01

Pos logit contributions
 harm+0.052
 disl+0.049
 rid+0.047
 decisions+0.045
 wounds+0.044
Neg logit contributions
 Inside-0.036
 Across-0.032
 Outside-0.032
 On-0.030
 Suddenly-0.030
Token She
Token ID1375
Feature activation+3.04e+00

Pos logit contributions
 Inside+0.003
 Across+0.003
lic+0.003
Over+0.003
shop+0.003
Neg logit contributions
 harm-0.006
 rid-0.006
 decisions-0.006
 disl-0.006
 choices-0.006
Token was
Token ID373
Feature activation+4.12e+00

Pos logit contributions
 Inside+0.089
 Across+0.082
 Outside+0.081
lic+0.075
 On+0.074
Neg logit contributions
 harm-0.140
 rid-0.125
 decisions-0.120
 wounds-0.118
 disl-0.118
Token so
Token ID523
Feature activation-6.47e-01

Pos logit contributions
 Inside+0.141
 Across+0.129
 Outside+0.128
Inside+0.120
lic+0.120
Neg logit contributions
 harm-0.172
 rid-0.150
 decisions-0.148
 disl-0.145
 choices-0.142
Token sleepy
Token ID42023
Feature activation-5.96e-01

Pos logit contributions
 harm+0.024
 rid+0.020
 decisions+0.020
 disl+0.019
 choices+0.019
Neg logit contributions
 Inside-0.026
 Across-0.024
 Outside-0.024
Inside-0.022
lic-0.022
Token -
Token ID532
Feature activation+4.73e+00

Pos logit contributions
 harm+0.027
 rid+0.024
 decisions+0.024
 disl+0.024
 wounds+0.023
Neg logit contributions
 Inside-0.018
 Outside-0.016
 Across-0.016
Inside-0.015
lic-0.015
Token she
Token ID673
Feature activation+2.54e+00

Pos logit contributions
 Inside+0.136
 Outside+0.119
 Across+0.116
On+0.111
Inside+0.111
Neg logit contributions
 harm-0.230
 rid-0.203
 decisions-0.201
 disl-0.200
 wounds-0.195
Token quit
Token ID11238
Feature activation+1.72e+00

Pos logit contributions
 Inside+0.068
 Across+0.060
 Outside+0.060
Inside+0.056
On+0.056
Neg logit contributions
 harm-0.124
 rid-0.111
 decisions-0.111
 disl-0.109
 choices-0.107
Token trying
Token ID2111
Feature activation+1.12e-02

Pos logit contributions
 Inside+0.050
 Outside+0.045
 Across+0.044
 Suddenly+0.042
On+0.041
Neg logit contributions
 harm-0.079
 rid-0.070
 decisions-0.069
 disl-0.068
 impact-0.067
Token to
Token ID284
Feature activation-2.95e+00

Pos logit contributions
 Inside+0.000
 Outside+0.000
On+0.000
 Across+0.000
Inside+0.000
Neg logit contributions
 harm-0.001
 disl-0.000
 rid-0.000
 supervision-0.000
 affect-0.000
Token keep
Token ID1394
Feature activation-3.32e+00

Pos logit contributions
 harm+0.120
 decisions+0.106
 rid+0.105
 disl+0.105
 choices+0.102
Neg logit contributions
 Inside-0.099
 Outside-0.092
 Across-0.089
Inside-0.087
On-0.086
Token great
Token ID1049
Feature activation-2.91e+00

Pos logit contributions
 Inside+0.020
 Outside+0.019
On+0.018
 Across+0.018
Inside+0.017
Neg logit contributions
 harm-0.036
 decisions-0.033
 wounds-0.033
 supervision-0.032
 disl-0.032
Token holiday
Token ID9912
Feature activation+2.07e+00

Pos logit contributions
 harm+0.132
 rid+0.123
 supervision+0.122
 wounds+0.121
 disl+0.118
Neg logit contributions
 Inside-0.081
 Outside-0.074
 Across-0.072
On-0.070
 On-0.070
Token.
Token ID13
Feature activation+1.17e-01

Pos logit contributions
 Inside+0.060
 Outside+0.053
 Across+0.051
On+0.051
shop+0.050
Neg logit contributions
 harm-0.093
 disl-0.087
 wounds-0.087
 conclusions-0.085
 rid-0.085
Token<|endoftext|>
Token ID50256
Feature activation+8.57e-02

Pos logit contributions
 Inside+0.002
 Across+0.002
 Outside+0.002
Inside+0.002
lic+0.002
Neg logit contributions
 harm-0.006
 rid-0.006
 decisions-0.006
 disl-0.006
 choices-0.005
TokenOnce
Token ID7454
Feature activation+6.38e+00

Pos logit contributions
 Inside+0.002
 Across+0.002
 Outside+0.002
lic+0.002
shop+0.002
Neg logit contributions
 harm-0.004
 rid-0.004
keeping-0.004
 decisions-0.004
 choices-0.003
Token there
Token ID612
Feature activation+6.83e+00

Pos logit contributions
 Inside+0.117
 Across+0.097
 Outside+0.097
 On+0.085
Inside+0.083
Neg logit contributions
 harm-0.368
 rid-0.336
 decisions-0.331
 disl-0.326
 wounds-0.322
Token was
Token ID373
Feature activation+6.88e+00

Pos logit contributions
 Inside+0.149
 Across+0.133
 Outside+0.130
lic+0.121
Inside+0.120
Neg logit contributions
 harm-0.364
 rid-0.328
 decisions-0.321
 disl-0.317
 choices-0.311
Token a
Token ID257
Feature activation+5.55e+00

Pos logit contributions
 Inside+0.159
 Across+0.146
lic+0.143
 Outside+0.142
Inside+0.129
Neg logit contributions
 harm-0.350
 rid-0.315
 decisions-0.309
 disl-0.304
 choices-0.299
Token little
Token ID1310
Feature activation+2.47e+00

Pos logit contributions
 Inside+0.154
 Across+0.140
 Outside+0.136
 On+0.124
lic+0.122
Neg logit contributions
 harm-0.262
 rid-0.231
 disl-0.223
 impact-0.222
 decisions-0.219
Token girl
Token ID2576
Feature activation+8.99e-01

Pos logit contributions
 Inside+0.091
 Across+0.086
 Outside+0.084
Inside+0.080
lic+0.080
Neg logit contributions
 harm-0.096
 rid-0.082
 decisions-0.080
 disl-0.078
 choices-0.077
Token called
Token ID1444
Feature activation+2.66e-01

Pos logit contributions
 Inside+0.020
 Outside+0.018
 Across+0.018
 On+0.016
On+0.016
Neg logit contributions
 harm-0.048
 rid-0.043
 decisions-0.043
 disl-0.042
 choices-0.041
Token mom
Token ID1995
Feature activation-5.94e+00

Pos logit contributions
 Inside+0.085
 Outside+0.074
 Across+0.074
On+0.069
 On+0.068
Neg logit contributions
 harm-0.127
 rid-0.117
 disl-0.112
 decisions-0.111
 supervision-0.110
Token.
Token ID13
Feature activation+2.90e-01

Pos logit contributions
 harm+0.269
 rid+0.245
 disl+0.245
 decisions+0.237
 affect+0.230
Neg logit contributions
 Inside-0.186
 Outside-0.163
 Across-0.160
 On-0.156
On-0.154
Token From
Token ID3574
Feature activation-4.52e-01

Pos logit contributions
 Inside+0.007
lic+0.007
 Across+0.007
Over+0.006
 Outside+0.006
Neg logit contributions
 harm-0.013
 rid-0.012
 disl-0.012
 decisions-0.012
 supervision-0.011
Token then
Token ID788
Feature activation+2.97e+00

Pos logit contributions
 harm+0.019
 rid+0.017
 decisions+0.017
 disl+0.016
 wounds+0.016
Neg logit contributions
 Inside-0.015
 Across-0.014
 Outside-0.014
lic-0.013
Inside-0.013
Token on
Token ID319
Feature activation+2.81e+00

Pos logit contributions
lic+0.101
shop+0.100
 Inside+0.097
 Across+0.097
 Outside+0.093
Neg logit contributions
 harm-0.107
 rid-0.092
 decisions-0.088
 attention-0.086
 choices-0.084
Token,
Token ID11
Feature activation+4.71e+00

Pos logit contributions
 Inside+0.062
 Across+0.056
 Outside+0.055
lic+0.055
Inside+0.051
Neg logit contributions
 harm-0.148
 rid-0.132
 decisions-0.131
 choices-0.127
 disl-0.127
Token Jack
Token ID3619
Feature activation+5.62e-01

Pos logit contributions
 Inside+0.112
 Across+0.092
 Outside+0.092
 Suddenly+0.090
 On+0.089
Neg logit contributions
 harm-0.254
 wounds-0.231
 disl-0.226
 decisions-0.224
 rid-0.223
Token was
Token ID373
Feature activation+4.21e+00

Pos logit contributions
 Inside+0.016
 Outside+0.014
 Across+0.014
 Suddenly+0.013
On+0.013
Neg logit contributions
 harm-0.027
 rid-0.024
 decisions-0.024
 disl-0.024
 supervision-0.023
Token always
Token ID1464
Feature activation-3.46e-01

Pos logit contributions
 Inside+0.146
 Across+0.134
 Outside+0.133
 Suddenly+0.126
On+0.125
Neg logit contributions
 harm-0.171
 rid-0.148
 decisions-0.148
 disl-0.147
 choices-0.142
Token careful
Token ID8161
Feature activation-3.80e+00

Pos logit contributions
 harm+0.016
 rid+0.014
 decisions+0.013
 disl+0.013
 wounds+0.013
Neg logit contributions
 Inside-0.011
 Across-0.010
 Outside-0.010
Inside-0.009
On-0.009
Token with
Token ID351
Feature activation-1.78e+00

Pos logit contributions
 harm+0.180
 rid+0.166
 disl+0.166
 conclusions+0.163
 impact+0.160
Neg logit contributions
 Inside-0.101
 Outside-0.092
 Suddenly-0.089
 Across-0.087
On-0.084
Token.
Token ID13
Feature activation+4.95e-01

Pos logit contributions
 Inside+0.080
 Outside+0.067
On+0.067
 Suddenly+0.066
 Across+0.065
Neg logit contributions
 harm-0.137
 rid-0.124
 disl-0.122
 decisions-0.121
 conclusions-0.121
Token The
Token ID383
Feature activation+6.82e-01

Pos logit contributions
Over+0.013
lic+0.013
shop+0.012
 Across+0.012
Inside+0.012
Neg logit contributions
 harm-0.019
 decisions-0.018
 rid-0.018
 disl-0.018
 choices-0.017
Token little
Token ID1310
Feature activation+1.72e-01

Pos logit contributions
 Inside+0.015
 Across+0.013
 Outside+0.013
Inside+0.012
lic+0.012
Neg logit contributions
 harm-0.037
 rid-0.033
 decisions-0.033
 disl-0.032
 choices-0.032
Token boy
Token ID2933
Feature activation-9.86e-01

Pos logit contributions
 Inside+0.005
 Across+0.004
 Outside+0.004
 Suddenly+0.004
Over+0.004
Neg logit contributions
 harm-0.009
 rid-0.008
 decisions-0.007
 disl-0.007
 supervision-0.007
Token grabbed
Token ID13176
Feature activation+3.78e+00

Pos logit contributions
 harm+0.047
 rid+0.042
 decisions+0.041
 disl+0.041
 choices+0.040
Neg logit contributions
 Inside-0.027
 Across-0.024
 Outside-0.024
Inside-0.022
lic-0.022
Token a
Token ID257
Feature activation+3.06e+00

Pos logit contributions
 Inside+0.117
 Across+0.105
 Outside+0.104
 On+0.098
Inside+0.096
Neg logit contributions
 harm-0.171
 rid-0.149
 decisions-0.149
 disl-0.147
 impact-0.142
Token bat
Token ID7365
Feature activation-5.81e-01

Pos logit contributions
 Inside+0.093
 Outside+0.086
 Across+0.085
lic+0.079
 On+0.079
Neg logit contributions
 harm-0.135
 rid-0.121
 decisions-0.117
 disl-0.115
 impact-0.112
Token and
Token ID290
Feature activation-1.52e+00

Pos logit contributions
 harm+0.028
 disl+0.025
 decisions+0.025
 rid+0.025
 wounds+0.025
Neg logit contributions
 Inside-0.016
 Outside-0.014
 Across-0.014
 On-0.013
On-0.013
Token a
Token ID257
Feature activation+3.64e+00

Pos logit contributions
 harm+0.076
 rid+0.068
 decisions+0.067
 disl+0.066
 choices+0.065
Neg logit contributions
 Inside-0.039
 Across-0.034
 Outside-0.034
Inside-0.031
lic-0.031
Token baseball
Token ID9283
Feature activation+2.50e+00

Pos logit contributions
 Inside+0.117
 Across+0.106
 Outside+0.106
 On+0.099
lic+0.097
Neg logit contributions
 harm-0.156
 rid-0.141
 decisions-0.135
 disl-0.132
 supervision-0.130
Token and
Token ID290
Feature activation-1.47e+00

Pos logit contributions
 Inside+0.070
 Across+0.062
 Outside+0.062
Over+0.058
On+0.057
Neg logit contributions
 harm-0.118
 decisions-0.105
 rid-0.104
 disl-0.104
 wounds-0.100
Token big
Token ID1263
Feature activation+2.69e+00

Pos logit contributions
 Inside+0.153
 Across+0.138
 Outside+0.134
 On+0.122
Inside+0.120
Neg logit contributions
 harm-0.286
 rid-0.254
 disl-0.245
 impact-0.242
 decisions-0.242
Token crack
Token ID8469
Feature activation+5.20e+00

Pos logit contributions
 Inside+0.079
 Across+0.072
 Outside+0.071
 On+0.066
Inside+0.066
Neg logit contributions
 harm-0.124
 rid-0.111
 decisions-0.108
 disl-0.106
 supervision-0.104
Token in
Token ID287
Feature activation+4.87e+00

Pos logit contributions
 Inside+0.175
 Across+0.160
 Outside+0.159
Inside+0.150
 On+0.148
Neg logit contributions
 harm-0.219
 rid-0.190
 decisions-0.186
 disl-0.181
 choices-0.178
Token the
Token ID262
Feature activation+3.10e+00

Pos logit contributions
 Inside+0.172
 Across+0.158
 Outside+0.157
Inside+0.147
 On+0.147
Neg logit contributions
 harm-0.199
 rid-0.174
 decisions-0.169
 disl-0.166
 supervision-0.162
Token wall
Token ID3355
Feature activation+6.15e+00

Pos logit contributions
 Inside+0.098
 Across+0.089
 Outside+0.087
lic+0.086
Inside+0.083
Neg logit contributions
 harm-0.137
 rid-0.120
 decisions-0.118
 disl-0.113
 choices-0.113
Token.
Token ID13
Feature activation+2.40e+00

Pos logit contributions
 Inside+0.239
 Across+0.222
 Outside+0.220
lic+0.213
Inside+0.212
Neg logit contributions
 harm-0.223
 rid-0.184
 decisions-0.180
 disl-0.175
 choices-0.171
Token It
Token ID632
Feature activation+3.85e+00

Pos logit contributions
lic+0.069
Over+0.065
 Across+0.063
shop+0.062
 Inside+0.062
Neg logit contributions
 harm-0.095
 disl-0.087
 decisions-0.083
 rid-0.082
 impact-0.080
Token was
Token ID373
Feature activation+5.13e+00

Pos logit contributions
 Inside+0.089
 Across+0.080
 Outside+0.079
lic+0.072
Inside+0.071
Neg logit contributions
 harm-0.201
 rid-0.180
 disl-0.172
 decisions-0.172
 wounds-0.168
Token too
Token ID1165
Feature activation-1.66e+00

Pos logit contributions
 Inside+0.147
 Across+0.141
 Outside+0.139
Over+0.133
lic+0.133
Neg logit contributions
 harm-0.231
 rid-0.206
 decisions-0.196
 choices-0.190
 wounds-0.187
Token big
Token ID1263
Feature activation+2.49e-02

Pos logit contributions
 harm+0.075
 rid+0.067
 decisions+0.064
 disl+0.063
 choices+0.062
Neg logit contributions
 Inside-0.048
 Across-0.045
 Outside-0.045
Over-0.042
shop-0.041
Token to
Token ID284
Feature activation-5.18e-01

Pos logit contributions
 Inside+0.001
 Outside+0.001
 Across+0.001
 On+0.001
 Suddenly+0.001
Neg logit contributions
 harm-0.001
 rid-0.001
 disl-0.001
 wounds-0.001
 decisions-0.001
Token fast
Token ID3049
Feature activation+2.74e-01

Pos logit contributions
 Inside+0.207
 Outside+0.184
 Across+0.184
On+0.177
Inside+0.175
Neg logit contributions
 harm-0.271
 disl-0.239
 rid-0.238
 decisions-0.238
 supervision-0.234
Token.
Token ID13
Feature activation+1.33e-01

Pos logit contributions
 Inside+0.009
 Across+0.008
 Outside+0.008
 Suddenly+0.007
 On+0.007
Neg logit contributions
 harm-0.012
 rid-0.011
 disl-0.010
 decisions-0.010
 supervision-0.010
Token He
Token ID679
Feature activation+3.52e+00

Pos logit contributions
lic+0.003
 Inside+0.003
Over+0.003
 Across+0.003
shop+0.003
Neg logit contributions
 harm-0.006
 rid-0.005
 disl-0.005
 decisions-0.005
 wounds-0.005
Token laughed
Token ID13818
Feature activation+2.76e+00

Pos logit contributions
 Inside+0.078
 Across+0.068
 Outside+0.067
 Suddenly+0.063
On+0.063
Neg logit contributions
 harm-0.187
 decisions-0.172
 rid-0.169
 choices-0.166
 disl-0.166
Token and
Token ID290
Feature activation-2.58e+00

Pos logit contributions
 Inside+0.088
 Across+0.081
 Outside+0.080
Inside+0.074
lic+0.074
Neg logit contributions
 harm-0.121
 rid-0.107
 decisions-0.104
 disl-0.102
 choices-0.100
Token said
Token ID531
Feature activation-4.15e-01

Pos logit contributions
 harm+0.110
 rid+0.099
 disl+0.095
 decisions+0.092
 wounds+0.091
Neg logit contributions
 Inside-0.084
 Across-0.079
 Outside-0.077
 On-0.073
Over-0.071
Token,
Token ID11
Feature activation+2.01e+00

Pos logit contributions
 harm+0.023
 rid+0.021
 decisions+0.020
 disl+0.020
 choices+0.020
Neg logit contributions
 Inside-0.009
 Across-0.008
 Outside-0.007
Inside-0.007
 On-0.007
Token "
Token ID366
Feature activation-5.55e+00

Pos logit contributions
 Inside+0.041
 Across+0.036
 Outside+0.036
Over+0.032
Inside+0.032
Neg logit contributions
 harm-0.108
 rid-0.099
 decisions-0.096
 disl-0.096
 supervision-0.094
TokenWhe
Token ID10842
Feature activation-1.28e+00

Pos logit contributions
 harm+0.262
 decisions+0.232
 rid+0.231
 wounds+0.223
 supervision+0.223
Neg logit contributions
 Inside-0.159
 Across-0.147
 Outside-0.146
Over-0.139
On-0.138
Tokene
Token ID68
Feature activation-2.60e+00

Pos logit contributions
 harm+0.036
 rid+0.029
 decisions+0.027
 impact+0.026
 choices+0.025
Neg logit contributions
 Inside-0.061
 Across-0.057
 Outside-0.056
Inside-0.055
 On-0.054
Token!
Token ID0
Feature activation-1.95e+00

Pos logit contributions
 harm+0.095
 rid+0.081
 decisions+0.079
 disl+0.077
 choices+0.075
Neg logit contributions
 Inside-0.102
 Across-0.095
 Outside-0.095
Inside-0.090
lic-0.090
Token started
Token ID2067
Feature activation+6.16e+00

Pos logit contributions
 Inside+0.002
 Across+0.002
 Outside+0.002
 On+0.002
Inside+0.002
Neg logit contributions
 harm-0.004
 rid-0.003
 decisions-0.003
 disl-0.003
 wounds-0.003
Token to
Token ID284
Feature activation+2.68e-01

Pos logit contributions
 Inside+0.211
 Across+0.193
 Outside+0.192
 On+0.180
On+0.180
Neg logit contributions
 harm-0.255
 rid-0.223
 decisions-0.220
 disl-0.215
 choices-0.210
Token cry
Token ID3960
Feature activation-1.18e-01

Pos logit contributions
 Inside+0.011
 Across+0.010
 Outside+0.010
 On+0.010
Inside+0.010
Neg logit contributions
 harm-0.009
 rid-0.008
 decisions-0.008
 disl-0.007
 choices-0.007
Token.
Token ID13
Feature activation+5.59e-01

Pos logit contributions
 harm+0.005
 rid+0.004
 decisions+0.004
 choices+0.004
 wounds+0.004
Neg logit contributions
 Inside-0.004
 Across-0.004
 Outside-0.004
Inside-0.004
lic-0.004
Token She
Token ID1375
Feature activation+2.89e+00

Pos logit contributions
lic+0.013
Over+0.012
 Inside+0.012
Inside+0.012
On+0.012
Neg logit contributions
 harm-0.026
 disl-0.024
 decisions-0.023
 rid-0.023
 mistakes-0.023
Token was
Token ID373
Feature activation+4.38e+00

Pos logit contributions
 Inside+0.101
 Across+0.095
 Outside+0.095
shop+0.090
 Along+0.088
Neg logit contributions
 harm-0.118
 rid-0.100
 wounds-0.099
 disl-0.093
 admit-0.088
Token wet
Token ID9583
Feature activation+1.88e-01

Pos logit contributions
 Inside+0.174
 Across+0.162
 Outside+0.160
Inside+0.155
 On+0.154
Neg logit contributions
 harm-0.161
 rid-0.138
 decisions-0.130
 wounds-0.129
 disl-0.126
Token and
Token ID290
Feature activation-2.33e+00

Pos logit contributions
 Inside+0.006
 Across+0.005
 Outside+0.005
Inside+0.005
lic+0.005
Neg logit contributions
 harm-0.008
 rid-0.007
 decisions-0.007
 disl-0.007
 choices-0.007
Token sticky
Token ID23408
Feature activation-9.48e-01

Pos logit contributions
 harm+0.094
 decisions+0.085
 rid+0.080
 impact+0.080
 disl+0.078
Neg logit contributions
 Inside-0.081
 Across-0.073
 Outside-0.072
 Suddenly-0.069
On-0.069
Token and
Token ID290
Feature activation-2.43e+00

Pos logit contributions
 harm+0.040
 rid+0.036
 decisions+0.035
 disl+0.034
 supervision+0.033
Neg logit contributions
 Inside-0.031
 Across-0.029
 Outside-0.028
lic-0.026
Inside-0.026
Token scared
Token ID12008
Feature activation-2.22e+00

Pos logit contributions
 harm+0.107
 decisions+0.096
 rid+0.093
 impact+0.092
 disl+0.091
Neg logit contributions
 Inside-0.075
 Across-0.068
 Outside-0.067
On-0.063
Over-0.063
Token surface
Token ID4417
Feature activation+6.51e+00

Pos logit contributions
 harm+0.020
 disl+0.019
 supervision+0.018
 rid+0.018
 decisions+0.018
Neg logit contributions
 Inside-0.007
 Outside-0.007
 Across-0.007
Over-0.006
 Suddenly-0.006
Token.
Token ID13
Feature activation+2.64e-01

Pos logit contributions
 Inside+0.205
 Outside+0.184
 Across+0.183
On+0.176
lic+0.175
Neg logit contributions
 harm-0.287
 disl-0.254
 rid-0.251
 decisions-0.248
 wounds-0.245
Token It
Token ID632
Feature activation+2.83e+00

Pos logit contributions
 Inside+0.006
 Across+0.006
 Outside+0.005
lic+0.005
Over+0.005
Neg logit contributions
 harm-0.013
 rid-0.012
 decisions-0.012
 disl-0.011
 supervision-0.011
Token was
Token ID373
Feature activation+4.13e+00

Pos logit contributions
 Inside+0.060
 Across+0.052
 Outside+0.051
lic+0.046
Over+0.046
Neg logit contributions
 harm-0.154
 rid-0.139
 decisions-0.138
 disl-0.136
 supervision-0.134
Token so
Token ID523
Feature activation-5.54e-01

Pos logit contributions
 Inside+0.117
 Across+0.104
 Outside+0.104
On+0.096
 Suddenly+0.096
Neg logit contributions
 harm-0.197
 rid-0.175
 decisions-0.173
 disl-0.171
 wounds-0.169
Token much
Token ID881
Feature activation-1.31e+00

Pos logit contributions
 harm+0.024
 rid+0.021
 decisions+0.021
 disl+0.020
 choices+0.020
Neg logit contributions
 Inside-0.018
 Across-0.017
 Outside-0.016
 On-0.015
Inside-0.015
Token fun
Token ID1257
Feature activation-8.43e-01

Pos logit contributions
 harm+0.058
 rid+0.052
 decisions+0.051
 disl+0.051
 supervision+0.051
Neg logit contributions
 Inside-0.041
 Outside-0.037
 Across-0.036
Inside-0.035
On-0.035
Token!
Token ID0
Feature activation+2.40e-01

Pos logit contributions
 harm+0.038
 rid+0.035
 wounds+0.034
 decisions+0.033
 disl+0.033
Neg logit contributions
 Inside-0.025
 Outside-0.023
 Across-0.022
Inside-0.021
 Suddenly-0.021
Token
Token ID198
Feature activation+2.43e+00

Pos logit contributions
 Inside+0.005
 Across+0.005
 Outside+0.005
lic+0.004
Inside+0.004
Neg logit contributions
 harm-0.013
 rid-0.011
 decisions-0.011
 disl-0.011
 supervision-0.011
Token
Token ID198
Feature activation+2.94e+00

Pos logit contributions
 Inside+0.053
 Across+0.047
 Outside+0.046
lic+0.041
Inside+0.041
Neg logit contributions
 harm-0.130
 rid-0.118
 decisions-0.116
 disl-0.114
 wounds-0.112
TokenAt
Token ID2953
Feature activation+5.10e+00

Pos logit contributions
 Inside+0.073
 Across+0.065
 Outside+0.064
 On+0.058
lic+0.058
Neg logit contributions
 harm-0.148
 rid-0.133
 decisions-0.130
 disl-0.129
 wounds-0.126
Token.
Token ID13
Feature activation-3.04e-01

Pos logit contributions
 harm+0.005
 rid+0.004
 decisions+0.004
 disl+0.004
 choices+0.004
Neg logit contributions
 Inside-0.004
 Across-0.004
 Outside-0.004
lic-0.004
Inside-0.004
Token He
Token ID679
Feature activation+2.47e+00

Pos logit contributions
 harm+0.008
 wounds+0.007
 choices+0.007
 disl+0.007
 decisions+0.007
Neg logit contributions
shop-0.012
Over-0.011
 Across-0.011
lic-0.011
Along-0.011
Token looks
Token ID3073
Feature activation+4.53e+00

Pos logit contributions
 Inside+0.090
 Across+0.085
 Outside+0.085
lic+0.083
Over+0.081
Neg logit contributions
 harm-0.094
 rid-0.080
 wounds-0.079
 disl-0.078
 decisions-0.078
Token happy
Token ID3772
Feature activation-2.75e+00

Pos logit contributions
 Inside+0.156
 Across+0.144
 Outside+0.143
Inside+0.134
lic+0.134
Neg logit contributions
 harm-0.188
 rid-0.165
 decisions-0.160
 disl-0.157
 wounds-0.153
Token,
Token ID11
Feature activation+2.48e+00

Pos logit contributions
 harm+0.126
 rid+0.112
 decisions+0.110
 disl+0.108
 impact+0.105
Neg logit contributions
 Inside-0.084
 Across-0.075
 Outside-0.075
On-0.069
Inside-0.069
Token but
Token ID475
Feature activation-3.56e+00

Pos logit contributions
 Inside+0.076
 Outside+0.067
 Across+0.067
 Suddenly+0.065
On+0.063
Neg logit contributions
 harm-0.115
 rid-0.102
 decisions-0.101
 disl-0.097
 impact-0.097
Token he
Token ID339
Feature activation+1.14e+00

Pos logit contributions
 harm+0.144
 rid+0.122
 choices+0.121
 decisions+0.121
 wounds+0.118
Neg logit contributions
 Inside-0.119
 Across-0.117
lic-0.116
 Outside-0.114
shop-0.106
Token smells
Token ID25760
Feature activation-9.90e-01

Pos logit contributions
 Inside+0.037
 Across+0.034
 Outside+0.034
Inside+0.032
lic+0.032
Neg logit contributions
 harm-0.049
 rid-0.043
 decisions-0.042
 disl-0.041
 choices-0.040
Token bad
Token ID2089
Feature activation-5.85e+00

Pos logit contributions
 harm+0.039
 rid+0.034
 decisions+0.033
 disl+0.033
 supervision+0.032
Neg logit contributions
 Inside-0.036
 Across-0.033
 Outside-0.033
 On-0.031
Inside-0.031
Token.
Token ID13
Feature activation-7.65e-01

Pos logit contributions
 harm+0.235
 rid+0.204
 decisions+0.199
 disl+0.197
 impact+0.192
Neg logit contributions
 Inside-0.209
 Across-0.192
 Outside-0.191
 On-0.179
lic-0.178
Token
Token ID198
Feature activation+1.13e+00

Pos logit contributions
 harm+0.016
 wounds+0.016
 decisions+0.016
 choices+0.016
 rid+0.015
Neg logit contributions
shop-0.031
Over-0.030
lic-0.029
 Across-0.029
 Outside-0.028
Token usually
Token ID3221
Feature activation+8.86e-01

Pos logit contributions
 harm+0.120
 rid+0.105
 decisions+0.102
 disl+0.101
 choices+0.098
Neg logit contributions
 Inside-0.093
 Across-0.085
 Outside-0.084
 On-0.079
Inside-0.079
Token did
Token ID750
Feature activation+3.12e+00

Pos logit contributions
 Inside+0.027
 Across+0.025
 Outside+0.025
Inside+0.023
Over+0.023
Neg logit contributions
 harm-0.039
 decisions-0.035
 rid-0.034
 wounds-0.033
 disl-0.033
Token not
Token ID407
Feature activation-4.58e+00

Pos logit contributions
 Inside+0.111
 Across+0.102
 Outside+0.102
Inside+0.096
On+0.095
Neg logit contributions
 harm-0.124
 rid-0.109
 decisions-0.106
 disl-0.104
 supervision-0.102
Token think
Token ID892
Feature activation-2.59e+00

Pos logit contributions
 harm+0.159
 decisions+0.155
 wounds+0.154
 supervision+0.151
 choices+0.146
Neg logit contributions
 Inside-0.162
 Outside-0.152
 Across-0.150
On-0.146
 On-0.145
Token of
Token ID286
Feature activation-4.48e+00

Pos logit contributions
 harm+0.119
 rid+0.108
 disl+0.107
 wounds+0.106
keeping+0.105
Neg logit contributions
 Inside-0.078
 Outside-0.069
 Across-0.069
 On-0.064
lic-0.064
Token others
Token ID1854
Feature activation-5.58e+00

Pos logit contributions
 harm+0.174
 rid+0.155
 decisions+0.149
 disl+0.147
 wounds+0.143
Neg logit contributions
 Inside-0.161
 Across-0.146
 Outside-0.146
Inside-0.136
 On-0.136
Token.
Token ID13
Feature activation+4.24e-01

Pos logit contributions
 harm+0.241
 rid+0.223
 disl+0.213
 impact+0.212
 decisions+0.212
Neg logit contributions
 Inside-0.175
 Outside-0.156
 Across-0.155
 On-0.147
On-0.145
Token He
Token ID679
Feature activation+3.86e+00

Pos logit contributions
shop+0.011
lic+0.011
 Across+0.011
Under+0.010
Inside+0.010
Neg logit contributions
 harm-0.017
 rid-0.016
 decisions-0.016
 choices-0.015
 disl-0.015
Token rarely
Token ID8365
Feature activation-2.10e+00

Pos logit contributions
 Inside+0.109
 Across+0.099
 Outside+0.099
Inside+0.091
lic+0.090
Neg logit contributions
 harm-0.185
 rid-0.161
 disl-0.156
 wounds-0.151
 decisions-0.150
Token stopped
Token ID5025
Feature activation+4.77e+00

Pos logit contributions
 harm+0.100
 rid+0.090
 decisions+0.089
 disl+0.087
 wounds+0.086
Neg logit contributions
 Inside-0.057
 Across-0.051
 Outside-0.051
lic-0.047
Inside-0.047
Token to
Token ID284
Feature activation-1.83e+00

Pos logit contributions
 Inside+0.121
 Across+0.110
 Outside+0.110
On+0.107
 Suddenly+0.105
Neg logit contributions
 harm-0.210
 disl-0.198
keeping-0.197
 rid-0.196
 supervision-0.196
Tokenisy
Token ID13560
Feature activation+3.23e+00

Pos logit contributions
 harm+0.005
 supervision+0.004
 rid+0.004
 wounds+0.004
 decisions+0.004
Neg logit contributions
 Inside-0.003
lic-0.003
On-0.003
Inside-0.003
 Suddenly-0.003
Token thanked
Token ID26280
Feature activation-1.11e+00

Pos logit contributions
 Inside+0.084
 Across+0.073
 Outside+0.073
 Suddenly+0.069
Inside+0.067
Neg logit contributions
 harm-0.161
 rid-0.145
 decisions-0.142
 disl-0.139
 supervision-0.138
Token God
Token ID1793
Feature activation-2.93e+00

Pos logit contributions
 harm+0.055
 supervision+0.053
 disl+0.051
 affect+0.051
 decisions+0.051
Neg logit contributions
 Inside-0.029
 On-0.023
 Outside-0.022
 Suddenly-0.021
 A-0.021
Token for
Token ID329
Feature activation-3.76e+00

Pos logit contributions
 harm+0.118
 disl+0.103
 rid+0.103
 decisions+0.099
 affect+0.095
Neg logit contributions
 Inside-0.106
 Outside-0.095
 Across-0.095
 On-0.090
On-0.089
Token her
Token ID607
Feature activation-4.07e-01

Pos logit contributions
 harm+0.155
 decisions+0.136
 disl+0.135
 rid+0.134
 affect+0.130
Neg logit contributions
 Inside-0.129
 Outside-0.113
 Across-0.111
Inside-0.107
On-0.107
Token special
Token ID2041
Feature activation-2.89e+00

Pos logit contributions
 harm+0.016
 rid+0.014
 conclusions+0.014
 disl+0.014
keeping+0.014
Neg logit contributions
 Inside-0.014
 Outside-0.012
 Across-0.012
 A-0.012
 On-0.012
Token day
Token ID1110
Feature activation+2.32e+00

Pos logit contributions
 harm+0.114
 rid+0.102
 disl+0.099
 decisions+0.099
 wounds+0.096
Neg logit contributions
 Inside-0.098
 Outside-0.090
 Across-0.088
 On-0.082
shop-0.082
Token and
Token ID290
Feature activation-1.83e+00

Pos logit contributions
 Inside+0.065
 Outside+0.056
 Across+0.054
 On+0.053
 Suddenly+0.052
Neg logit contributions
 harm-0.112
 disl-0.106
 rid-0.102
 wounds-0.099
 conclusions-0.097
Token smiled
Token ID13541
Feature activation+1.96e+00

Pos logit contributions
 harm+0.084
 rid+0.075
 decisions+0.073
 disl+0.072
 wounds+0.071
Neg logit contributions
 Inside-0.055
 Across-0.049
 Outside-0.048
Over-0.045
Inside-0.045
Token all
Token ID477
Feature activation-1.50e+00

Pos logit contributions
 Inside+0.067
 Across+0.062
 Outside+0.061
 On+0.057
lic+0.057
Neg logit contributions
 harm-0.082
 rid-0.072
 decisions-0.070
 disl-0.069
 supervision-0.067
Token the
Token ID262
Feature activation-1.14e+00

Pos logit contributions
 harm+0.067
 wounds+0.060
 rid+0.059
 decisions+0.058
 affect+0.058
Neg logit contributions
 Inside-0.048
 Outside-0.043
 Across-0.041
 On-0.040
 Suddenly-0.040
Token and
Token ID290
Feature activation-2.45e+00

Pos logit contributions
 Inside+0.094
 Across+0.087
 Outside+0.086
lic+0.080
Inside+0.080
Neg logit contributions
 harm-0.120
 rid-0.105
 decisions-0.103
 disl-0.101
 choices-0.099
Token flowers
Token ID12734
Feature activation+1.70e+00

Pos logit contributions
 harm+0.104
 rid+0.092
 decisions+0.088
 disl+0.087
 impact+0.086
Neg logit contributions
 Inside-0.080
 Across-0.072
 Outside-0.072
Inside-0.067
 Suddenly-0.067
Token.
Token ID13
Feature activation-2.34e-01

Pos logit contributions
 Inside+0.056
 Across+0.051
 Outside+0.051
Inside+0.047
lic+0.047
Neg logit contributions
 harm-0.073
 rid-0.064
 decisions-0.063
 disl-0.062
 choices-0.060
Token
Token ID198
Feature activation+1.70e+00

Pos logit contributions
 harm+0.011
 rid+0.010
 decisions+0.010
 disl+0.010
 supervision+0.010
Neg logit contributions
 Inside-0.006
 Across-0.005
 Outside-0.005
Over-0.005
lic-0.005
Token
Token ID198
Feature activation+2.27e+00

Pos logit contributions
 Inside+0.039
 Across+0.037
 Outside+0.035
lic+0.032
Over+0.032
Neg logit contributions
 harm-0.087
 rid-0.078
 decisions-0.076
 disl-0.075
 wounds-0.074
Token"
Token ID1
Feature activation-5.14e+00

Pos logit contributions
 Inside+0.056
 Across+0.053
 Outside+0.049
lic+0.049
 On+0.044
Neg logit contributions
 harm-0.110
 disl-0.097
 conclusions-0.096
 rid-0.095
 decisions-0.094
TokenMom
Token ID29252
Feature activation-5.76e+00

Pos logit contributions
 harm+0.226
 rid+0.207
 disl+0.195
 conclusions+0.193
 decisions+0.187
Neg logit contributions
 Inside-0.151
 Across-0.137
 Outside-0.133
 On-0.128
lic-0.125
Token,
Token ID11
Feature activation+2.03e+00

Pos logit contributions
 harm+0.303
 rid+0.272
 decisions+0.269
 disl+0.263
 supervision+0.262
Neg logit contributions
 Inside-0.132
 Across-0.118
 Outside-0.116
Over-0.107
Inside-0.105
Token can
Token ID460
Feature activation-1.83e+00

Pos logit contributions
 Inside+0.069
 Across+0.063
 Outside+0.063
lic+0.059
Inside+0.059
Neg logit contributions
 harm-0.085
 rid-0.075
 decisions-0.073
 disl-0.072
 supervision-0.070
Token I
Token ID314
Feature activation-5.22e+00

Pos logit contributions
 harm+0.070
 decisions+0.060
 rid+0.059
 wounds+0.056
 impact+0.056
Neg logit contributions
 Inside-0.071
 Outside-0.064
 Across-0.064
 On-0.061
Inside-0.061
Token put
Token ID1234
Feature activation-1.98e+00

Pos logit contributions
 harm+0.216
 decisions+0.190
 rid+0.186
 disl+0.183
 choices+0.182
Neg logit contributions
 Inside-0.182
 Outside-0.165
 Across-0.164
Inside-0.156
 On-0.156
Token's
Token ID338
Feature activation-2.63e+00

Pos logit contributions
 harm+0.033
 rid+0.030
 decisions+0.029
 disl+0.028
 wounds+0.028
Neg logit contributions
 Inside-0.023
 Across-0.021
 Outside-0.021
lic-0.020
Over-0.020
Token okay
Token ID8788
Feature activation-5.03e+00

Pos logit contributions
 harm+0.101
 rid+0.090
 disl+0.088
 affect+0.087
 decisions+0.086
Neg logit contributions
 Inside-0.096
 Outside-0.088
 Across-0.087
On-0.083
 On-0.083
Token to
Token ID284
Feature activation-4.64e+00

Pos logit contributions
 harm+0.273
 rid+0.258
 disl+0.249
 affect+0.247
 decisions+0.247
Neg logit contributions
 Inside-0.097
 Outside-0.081
 Across-0.077
 Suddenly-0.074
 On-0.069
Token worry
Token ID5490
Feature activation-4.62e+00

Pos logit contributions
 harm+0.181
 decisions+0.167
 rid+0.165
 wounds+0.164
 supervision+0.161
Neg logit contributions
 Inside-0.151
 Outside-0.142
 Across-0.136
On-0.134
 Suddenly-0.133
Token,
Token ID11
Feature activation+4.68e-01

Pos logit contributions
 harm+0.246
 rid+0.230
 disl+0.219
 affect+0.215
 supervision+0.215
Neg logit contributions
 Inside-0.102
 Outside-0.087
 Across-0.084
 Suddenly-0.077
 On-0.077
Token F
Token ID376
Feature activation-4.12e+00

Pos logit contributions
 Inside+0.012
 Outside+0.010
 Suddenly+0.010
 Across+0.010
 Two+0.010
Neg logit contributions
 harm-0.022
 rid-0.020
 affect-0.019
keeping-0.019
edient-0.019
Tokenuzz
Token ID4715
Feature activation-6.49e-01

Pos logit contributions
 harm+0.176
 rid+0.156
 decisions+0.152
 disl+0.147
 choices+0.146
Neg logit contributions
 Inside-0.137
 Outside-0.125
 Across-0.124
 On-0.117
lic-0.116
Tokeny
Token ID88
Feature activation+1.80e+00

Pos logit contributions
 harm+0.030
 rid+0.027
 wounds+0.027
 decisions+0.027
 disl+0.026
Neg logit contributions
 Inside-0.017
 Across-0.016
 Outside-0.016
lic-0.015
 On-0.014
Token.
Token ID13
Feature activation-1.76e+00

Pos logit contributions
 Inside+0.055
 Outside+0.050
 Across+0.050
 Suddenly+0.048
 On+0.047
Neg logit contributions
 harm-0.077
 rid-0.070
 disl-0.068
 wounds-0.068
 conclusions-0.065
Token But
Token ID887
Feature activation-1.84e+00

Pos logit contributions
 harm+0.068
 rid+0.059
 decisions+0.057
 disl+0.056
 wounds+0.055
Neg logit contributions
 Inside-0.066
 Across-0.061
 Outside-0.061
 On-0.057
Inside-0.057
Token it
Token ID340
Feature activation+1.28e+00

Pos logit contributions
 harm+0.067
 rid+0.057
 decisions+0.056
 disl+0.055
 wounds+0.054
Neg logit contributions
 Inside-0.073
 Outside-0.066
 Across-0.066
Inside-0.062
 On-0.062
Token we
Token ID356
Feature activation-8.01e+00

Pos logit contributions
 Inside+0.095
 Across+0.083
 Outside+0.082
 Suddenly+0.078
On+0.076
Neg logit contributions
 harm-0.172
 rid-0.153
 decisions-0.150
 impact-0.145
 disl-0.145
Token were
Token ID547
Feature activation+1.04e+00

Pos logit contributions
 harm+0.370
 wounds+0.340
 disl+0.336
 decisions+0.335
 supervision+0.329
Neg logit contributions
 Inside-0.224
 Across-0.193
 Outside-0.190
 On-0.185
 Suddenly-0.185
Token trying
Token ID2111
Feature activation-3.61e+00

Pos logit contributions
 Inside+0.027
 Across+0.024
 Outside+0.024
 Suddenly+0.024
lic+0.022
Neg logit contributions
 harm-0.048
 affect-0.044
 disl-0.044
 conclusions-0.043
 rid-0.042
Token to
Token ID284
Feature activation-7.32e+00

Pos logit contributions
 harm+0.177
 rid+0.163
 impact+0.161
keeping+0.158
 supervision+0.157
Neg logit contributions
 Inside-0.092
On-0.073
Inside-0.072
 Outside-0.071
 Across-0.070
Token be
Token ID307
Feature activation-3.89e+00

Pos logit contributions
 harm+0.277
 decisions+0.262
 disl+0.259
 choices+0.257
 supervision+0.252
Neg logit contributions
 Inside-0.237
Inside-0.216
On-0.212
 Outside-0.211
 Across-0.204
Token le
Token ID443
Feature activation-9.95e+00

Pos logit contributions
 harm+0.175
 conclusions+0.158
 affect+0.156
 supervision+0.154
keeping+0.154
Neg logit contributions
 Inside-0.103
 Across-0.093
 Outside-0.091
 Suddenly-0.089
Inside-0.085
Tokenop
Token ID404
Feature activation-7.81e+00

Pos logit contributions
 harm+0.433
 rid+0.381
 decisions+0.374
 disl+0.366
 choices+0.358
Neg logit contributions
 Inside-0.321
 Across-0.294
 Outside-0.291
lic-0.273
Inside-0.272
Tokenards
Token ID1371
Feature activation-5.85e+00

Pos logit contributions
 harm+0.329
 impact+0.291
 disl+0.290
 decisions+0.274
 rid+0.272
Neg logit contributions
 Inside-0.229
lic-0.209
 Across-0.204
 Outside-0.204
 Together-0.197
Token,"
Token ID553
Feature activation-3.53e+00

Pos logit contributions
 harm+0.237
 rid+0.208
 disl+0.205
 decisions+0.203
 choices+0.195
Neg logit contributions
 Inside-0.205
 Across-0.190
 Outside-0.188
 On-0.178
On-0.178
Token Lily
Token ID20037
Feature activation-3.64e+00

Pos logit contributions
 harm+0.177
 decisions+0.157
 rid+0.153
 affect+0.153
 impact+0.152
Neg logit contributions
 Inside-0.094
 Suddenly-0.079
 Outside-0.078
 Across-0.077
 On-0.075
Token said
Token ID531
Feature activation-8.53e-01

Pos logit contributions
 harm+0.184
 rid+0.165
 decisions+0.161
 disl+0.158
 impact+0.157
Neg logit contributions
 Inside-0.094
 Across-0.081
 Outside-0.081
 Suddenly-0.078
 On-0.078
Token he
Token ID339
Feature activation+1.99e+00

Pos logit contributions
 Inside+0.048
 Across+0.040
 Outside+0.040
 Suddenly+0.039
Inside+0.038
Neg logit contributions
 harm-0.079
 rid-0.073
 decisions-0.070
 impact-0.070
 affect-0.069
Token wanted
Token ID2227
Feature activation-2.62e+00

Pos logit contributions
 Inside+0.047
 Across+0.041
 Outside+0.040
 Suddenly+0.038
 On+0.037
Neg logit contributions
 harm-0.103
 decisions-0.095
 rid-0.094
 supervision-0.093
 wounds-0.092
Token to
Token ID284
Feature activation-3.71e+00

Pos logit contributions
 harm+0.117
 rid+0.105
 decisions+0.104
 disl+0.103
 affect+0.102
Neg logit contributions
 Inside-0.082
 Outside-0.072
 Across-0.071
Inside-0.067
On-0.066
Token give
Token ID1577
Feature activation-4.25e+00

Pos logit contributions
 harm+0.165
 rid+0.147
 decisions+0.147
 wounds+0.145
 supervision+0.145
Neg logit contributions
 Inside-0.113
 Outside-0.102
 Across-0.099
 On-0.095
Inside-0.095
Token his
Token ID465
Feature activation-1.98e+00

Pos logit contributions
 harm+0.209
 affect+0.191
 disl+0.188
 impact+0.186
 decisions+0.183
Neg logit contributions
 Inside-0.116
 Outside-0.100
 Across-0.098
 Suddenly-0.092
Over-0.090
Token mom
Token ID1995
Feature activation-9.71e+00

Pos logit contributions
 harm+0.095
 rid+0.083
 decisions+0.083
 impact+0.081
 disl+0.081
Neg logit contributions
 Inside-0.054
 Across-0.048
 Outside-0.047
Over-0.042
 On-0.042
Tokenma
Token ID2611
Feature activation-7.21e+00

Pos logit contributions
 harm+0.566
 disl+0.508
 affect+0.503
 rid+0.501
 wounds+0.495
Neg logit contributions
 Inside-0.187
 Outside-0.148
 Across-0.146
 On-0.134
On-0.134
Token a
Token ID257
Feature activation+1.50e-02

Pos logit contributions
 harm+0.436
 disl+0.390
 affect+0.388
 rid+0.387
 impact+0.383
Neg logit contributions
 Inside-0.121
 Across-0.092
 Outside-0.091
 On-0.079
On-0.079
Token special
Token ID2041
Feature activation-3.13e+00

Pos logit contributions
 Inside+0.001
 Across+0.001
 Outside+0.000
Inside+0.000
lic+0.000
Neg logit contributions
 harm-0.001
 rid-0.001
 decisions-0.000
 disl-0.000
 impact-0.000
Token hug
Token ID16225
Feature activation-4.25e+00

Pos logit contributions
 harm+0.124
 rid+0.107
 decisions+0.105
 disl+0.102
 choices+0.100
Neg logit contributions
 Inside-0.112
 Across-0.104
 Outside-0.103
lic-0.099
Inside-0.098
Token.
Token ID13
Feature activation-2.00e-01

Pos logit contributions
 harm+0.192
 disl+0.176
 rid+0.171
 wounds+0.169
 decisions+0.169
Neg logit contributions
 Inside-0.131
 Outside-0.116
 Across-0.115
 On-0.111
 Suddenly-0.107
Token.
Token ID13
Feature activation-1.78e+00

Pos logit contributions
 Inside+0.037
 Outside+0.032
Inside+0.031
On+0.031
 Across+0.030
Neg logit contributions
 harm-0.058
 rid-0.055
 conclusions-0.053
 decisions-0.052
 impact-0.052
Token Sam
Token ID3409
Feature activation-2.74e+00

Pos logit contributions
 harm+0.068
 rid+0.058
 decisions+0.057
 disl+0.055
 wounds+0.054
Neg logit contributions
 Inside-0.069
 Outside-0.063
 Across-0.063
 On-0.060
Inside-0.059
Token can
Token ID460
Feature activation-2.38e+00

Pos logit contributions
 harm+0.137
 rid+0.126
 decisions+0.123
 conclusions+0.122
 disl+0.120
Neg logit contributions
 Inside-0.068
 Across-0.057
 Outside-0.057
 Suddenly-0.055
 On-0.054
Token be
Token ID307
Feature activation-1.45e+00

Pos logit contributions
 harm+0.100
 decisions+0.097
 rid+0.091
 disl+0.091
keeping+0.091
Neg logit contributions
 Inside-0.076
 Outside-0.066
On-0.065
Inside-0.064
 On-0.064
Token the
Token ID262
Feature activation-3.13e+00

Pos logit contributions
 harm+0.070
 affect+0.064
 conclusions+0.064
 decisions+0.062
keeping+0.062
Neg logit contributions
 Inside-0.036
 Across-0.031
 Outside-0.031
 Suddenly-0.030
On-0.030
Token boss
Token ID6478
Feature activation-9.98e+00

Pos logit contributions
 harm+0.139
 disl+0.131
keeping+0.128
 conclusions+0.124
 rid+0.122
Neg logit contributions
 Inside-0.076
 Outside-0.075
 Across-0.074
 Two-0.073
On-0.073
Token first
Token ID717
Feature activation-2.44e+00

Pos logit contributions
 harm+0.470
 rid+0.441
 wounds+0.435
 disl+0.425
 decisions+0.417
Neg logit contributions
 Inside-0.260
 Across-0.224
 Outside-0.220
lic-0.218
 Two-0.212
Token,
Token ID11
Feature activation+1.38e+00

Pos logit contributions
 harm+0.098
 rid+0.093
 disl+0.091
 wounds+0.088
 decisions+0.086
Neg logit contributions
 Inside-0.078
 Across-0.072
 Outside-0.071
 Two-0.068
On-0.068
Token then
Token ID788
Feature activation+1.49e+00

Pos logit contributions
 Inside+0.036
 Suddenly+0.030
 Outside+0.030
On+0.029
 Across+0.029
Neg logit contributions
 harm-0.069
keeping-0.061
 rid-0.061
 decisions-0.060
 disl-0.058
Token Mia
Token ID32189
Feature activation-1.35e+00

Pos logit contributions
 Inside+0.051
 Suddenly+0.044
 Outside+0.044
 Across+0.043
Inside+0.043
Neg logit contributions
 harm-0.060
 rid-0.054
keeping-0.053
 disl-0.052
 impact-0.051
Token can
Token ID460
Feature activation-1.91e+00

Pos logit contributions
 harm+0.057
 rid+0.053
 conclusions+0.052
 decisions+0.052
 disl+0.052
Neg logit contributions
 Inside-0.041
 Outside-0.036
 Across-0.035
 On-0.034
 Together-0.034
Token.
Token ID13
Feature activation-8.39e-01

Pos logit contributions
 harm+0.113
 rid+0.103
 supervision+0.102
 affect+0.101
 disl+0.101
Neg logit contributions
 Inside-0.063
 Outside-0.051
 Together-0.051
On-0.050
 Across-0.049
Token They
Token ID1119
Feature activation+6.54e-01

Pos logit contributions
 harm+0.041
 rid+0.038
 disl+0.037
 decisions+0.037
 choices+0.036
Neg logit contributions
 Inside-0.019
 Across-0.018
 Outside-0.017
lic-0.017
Over-0.017
Token did
Token ID750
Feature activation+3.05e+00

Pos logit contributions
 Inside+0.017
 Across+0.015
 Outside+0.015
Inside+0.014
On+0.014
Neg logit contributions
 harm-0.032
 decisions-0.029
 rid-0.029
 disl-0.028
 choices-0.028
Token not
Token ID407
Feature activation-5.30e+00

Pos logit contributions
 Inside+0.088
On+0.075
 Outside+0.074
 Across+0.073
Inside+0.073
Neg logit contributions
 harm-0.144
 supervision-0.128
keeping-0.127
 rid-0.127
 decisions-0.125
Token think
Token ID892
Feature activation-2.94e+00

Pos logit contributions
 decisions+0.188
 wounds+0.188
 harm+0.183
 supervision+0.179
 mistakes+0.172
Neg logit contributions
 Inside-0.164
 Outside-0.150
 Two-0.149
 Across-0.144
 On-0.144
Token Mom
Token ID11254
Feature activation-9.48e+00

Pos logit contributions
 harm+0.143
keeping+0.142
 rid+0.133
 affect+0.133
 disl+0.131
Neg logit contributions
 Inside-0.077
 Across-0.064
 Outside-0.064
 On-0.060
On-0.057
Token would
Token ID561
Feature activation-3.11e+00

Pos logit contributions
 harm+0.440
 rid+0.420
 decisions+0.411
 disl+0.410
 conclusions+0.391
Neg logit contributions
 Inside-0.248
 Across-0.221
 Outside-0.218
On-0.210
 On-0.210
Token notice
Token ID4003
Feature activation+6.51e-01

Pos logit contributions
 decisions+0.130
 harm+0.130
 supervision+0.123
 wounds+0.121
keeping+0.121
Neg logit contributions
 Inside-0.087
On-0.083
Inside-0.078
 Outside-0.073
 On-0.072
Token.
Token ID13
Feature activation-6.77e-01

Pos logit contributions
 Inside+0.018
 Outside+0.015
 Across+0.015
 Suddenly+0.014
On+0.014
Neg logit contributions
 harm-0.029
 rid-0.027
 affect-0.027
 impact-0.027
 decisions-0.026
Token
Token ID198
Feature activation+1.06e+00

Pos logit contributions
 harm+0.029
 disl+0.027
 rid+0.027
 decisions+0.026
 wounds+0.025
Neg logit contributions
lic-0.017
 Across-0.017
 Inside-0.016
Over-0.016
shop-0.016
Token
Token ID198
Feature activation+1.72e+00

Pos logit contributions
 Inside+0.024
 Across+0.021
 Outside+0.021
Inside+0.019
 On+0.019
Neg logit contributions
 harm-0.056
 rid-0.051
 decisions-0.050
 disl-0.049
 choices-0.048
Token more
Token ID517
Feature activation-2.07e+00

Pos logit contributions
 Inside+0.014
 Across+0.013
 Outside+0.012
On+0.012
Inside+0.012
Neg logit contributions
 harm-0.020
 disl-0.017
 rid-0.017
 decisions-0.017
keeping-0.017
Token sa
Token ID473
Feature activation-2.96e+00

Pos logit contributions
 harm+0.094
 rid+0.082
 disl+0.081
keeping+0.080
 decisions+0.080
Neg logit contributions
 Inside-0.064
 Across-0.055
 Outside-0.054
 On-0.053
On-0.052
Tokenus
Token ID385
Feature activation-3.32e+00

Pos logit contributions
 harm+0.125
 disl+0.107
 rid+0.107
 decisions+0.105
 impact+0.103
Neg logit contributions
 Inside-0.099
 Across-0.088
 Outside-0.088
lic-0.085
Inside-0.085
Tokenages
Token ID1095
Feature activation-1.43e+00

Pos logit contributions
 harm+0.148
 impact+0.129
 conclusions+0.128
 affect+0.124
 disl+0.122
Neg logit contributions
 Inside-0.100
lic-0.088
 Across-0.086
 Outside-0.086
Inside-0.084
Token for
Token ID329
Feature activation-4.57e+00

Pos logit contributions
 harm+0.068
 disl+0.060
 conclusions+0.059
 impact+0.059
 decisions+0.058
Neg logit contributions
 Inside-0.043
 Outside-0.035
 Across-0.034
Inside-0.034
lic-0.033
Token Mrs
Token ID9074
Feature activation-1.04e+01

Pos logit contributions
 harm+0.179
 rid+0.156
 decisions+0.154
 disl+0.152
 affect+0.147
Neg logit contributions
 Inside-0.167
 Across-0.150
 Outside-0.149
Inside-0.142
On-0.141
Token Smith
Token ID4176
Feature activation-4.71e+00

Pos logit contributions
 harm+0.447
 rid+0.408
 decisions+0.382
 disl+0.381
 affect+0.375
Neg logit contributions
 Inside-0.345
 Outside-0.306
 Across-0.305
 On-0.289
lic-0.278
Token.
Token ID13
Feature activation-9.80e-01

Pos logit contributions
 harm+0.208
 rid+0.189
 disl+0.186
 wounds+0.181
 decisions+0.179
Neg logit contributions
 Inside-0.144
 Across-0.124
 Outside-0.124
 On-0.119
On-0.116
Token He
Token ID679
Feature activation+2.60e+00

Pos logit contributions
 harm+0.034
 rid+0.031
 disl+0.031
 decisions+0.030
keeping+0.030
Neg logit contributions
Over-0.031
 Across-0.030
lic-0.030
 Inside-0.030
Under-0.029
Token quickly
Token ID2952
Feature activation+1.36e+00

Pos logit contributions
 Inside+0.070
 Across+0.063
 Outside+0.063
Inside+0.058
lic+0.057
Neg logit contributions
 harm-0.126
 rid-0.112
 decisions-0.111
 disl-0.108
 choices-0.106
Token returned
Token ID4504
Feature activation+2.73e+00

Pos logit contributions
 Inside+0.039
 Across+0.036
 Outside+0.036
Over+0.033
 On+0.033
Neg logit contributions
 harm-0.063
 rid-0.056
 decisions-0.054
 disl-0.054
 supervision-0.053