TOP ACTIVATIONS
MAX = 0.294

flew down to the river bank and looked for the w
a raft near the river bank and she thought it was
elephant went to the river bank and he noticed something strange
Then she followed the river bank until she could see her
pulled herself onto the river bank , and waved hello to
he got to the river bank he was very amazed .
Stop cutting !" The wood cut ters heard them and stopped
to seat on the river bank and watch the other animals
appeared and destroyed the river bank ! Water was everywhere and
of sight from the river bank . The little
ran down to the river bank . When
to go to the river bank . She found a kind
to crawl along the river bank . As the
away in her pig gy bank every week . This way
John quickly scanned the river bank to see if anything was
back down on the river bank . He continued
man standing by the river bank . The old man showed
and grabbed his pig gy bank and brought it back to
and ran to the river bank . Charlie
Standing on the river bank was a duck . The

MIN ACTIVATIONS
MIN = -0.315

was different from the flash lights that other kids had .
, they saw many flash lights . Some were big ,
big box full of flash lights , so L ola was
liked to play with flash lights in the dark . They
. They find some flash lights and tape them to their
the tent with their flash lights and their t eddy bears
and picked up the nap kin . She put it in
it up in a nap kin . This sandwich was so
playful dog and the nap kin , and they both shared
how to use a nap kin to cut her food .
her dress with a nap kin , but it didn 't
grateful for the special nap kin . Tom
my mixed the beautiful nap kin with colorful water . The
waved goodbye as the nap kin slowly continued its journey up
colorful water . The nap kin began to change color and
contest and got his nap kin back . From that day
ished it with a nap kin . Tim my felt bad
she even put a nap kin on her head too .
't want a new nap kin , she wanted her old
Max gave back the nap kin and Lily felt happy again

INTERVAL 0.141 - 0.294
CONTAINS 1.812%

dress . They ran downstairs and hugged Mom .
. He wanted to make his own story
his basket . After they both had filled their
, there was a little dog named Spot . Spot was
walk . As she was walking , she started to sh

INTERVAL -0.011 - 0.141
CONTAINS 53.609%

Once upon a time , there was a very serious girl
bad duck !" Mom says . " She is not food
a frog hopping around and he wanted to say hello .
it was a set of drums . He quickly started to
made John think twice and he finally accepted to help .

INTERVAL -0.163 - -0.011
CONTAINS 43.650%

the wish you would like ?" Lu cy was
thing and got on a seat . He put on his
hugged the pillow tight . She was so happy to have
a good night and sweet dreams ." L ily
fun ! We like swinging !" Dad watches them

INTERVAL -0.315 - -0.163
CONTAINS 0.929%

She couldn 't help but smile back .
Yes , honey , you look very nice !"
, but it kept coming back . He stopped walking to
The driver smiled and waved back as Joe flew off into
, until Lily started to feel dizz y .
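
The interval bounds listed above look consistent with four equal-width bins spanning MIN to MAX, and the CONTAINS percentages add up to 100. This is an inference from the numbers, not something the dump states. The short Python check below is a sketch under that assumption; the variable names are illustrative and do not come from the tool that produced this output.

```python
# Values copied from the MAX / MIN / INTERVAL / CONTAINS lines above.
act_min, act_max = -0.315, 0.294
n_bins = 4  # assumption: four equal-width intervals

# Equal-width bin edges from MIN to MAX.
width = (act_max - act_min) / n_bins
edges = [act_min + i * width for i in range(n_bins + 1)]

# Interval bounds as displayed, lowest first.
displayed = [-0.315, -0.163, -0.011, 0.141, 0.294]

# Largest difference between a computed edge and the displayed bound.
max_gap = max(abs(e - d) for e, d in zip(edges, displayed))

# CONTAINS percentages in the order listed (highest interval first).
contains = [1.812, 53.609, 43.650, 0.929]
total = sum(contains)

print("computed edges:", [round(e, 4) for e in edges])
print("largest gap to displayed bounds:", max_gap)
print("sum of CONTAINS percentages:", total)
```

The computed edges agree with the displayed bounds to within the three-decimal rounding of MIN and MAX, so small differences in the last digit are expected.
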
Token flew
Token ID13112
Feature activation+1.01e-01

Pos logit contributions
 asking+0.081
cut+0.070
 trying+0.064
 causing+0.064
 bulldo+0.063
Neg logit contributions
ill-0.120
aney-0.105
ola-0.105
kin-0.104
 Kate-0.103
Token down
Token ID866
Feature activation-9.11e-02

Pos logit contributions
 asking+0.158
 le+0.135
 trying+0.133
 challenged+0.133
 swoop+0.132
Neg logit contributions
ill-0.192
igsaw-0.170
aney-0.165
kin-0.163
ushes-0.161
Token to
Token ID284
Feature activation+1.09e-01

Pos logit contributions
ill+0.201
igsaw+0.185
aney+0.177
kin+0.176
etti+0.173
Neg logit contributions
 asking-0.113
 swoop-0.092
 le-0.091
 challenged-0.091
 trying-0.091
Token the
Token ID262
Feature activation+1.80e-02

Pos logit contributions
 asking+0.142
 le+0.117
 z+0.115
 swoop+0.115
 trying+0.115
Neg logit contributions
ill-0.235
igsaw-0.220
kin-0.214
aney-0.213
etti-0.210
Token river
Token ID7850
Feature activation+1.57e-01

Pos logit contributions
 asking+0.026
 challenged+0.022
 swoop+0.021
 trying+0.021
 chased+0.020
Neg logit contributions
ill-0.034
igsaw-0.032
kin-0.031
aney-0.030
ifying-0.030
Tokenbank
Token ID17796
Feature activation+2.87e-01

Pos logit contributions
 asking+0.225
 challenged+0.196
have+0.191
 le+0.191
 bulldo+0.190
Neg logit contributions
ill-0.312
aney-0.252
kin-0.250
igsaw-0.248
cake-0.243
Token and
Token ID290
Feature activation+1.10e-02

Pos logit contributions
 asking+0.395
 swoop+0.349
 challenged+0.345
 le+0.342
 bulldo+0.332
Neg logit contributions
ill-0.589
kin-0.504
aney-0.504
ushes-0.488
igsaw-0.483
Token looked
Token ID3114
Feature activation-8.11e-02

Pos logit contributions
 asking+0.012
 ult+0.010
have+0.010
cut+0.010
 le+0.010
Neg logit contributions
ill-0.023
 Jill-0.020
etti-0.020
kin-0.020
 Kate-0.020
Token for
Token ID329
Feature activation-3.02e-02

Pos logit contributions
ill+0.157
igsaw+0.139
aney+0.137
ushes+0.137
kin+0.134
Neg logit contributions
 asking-0.114
 swoop-0.098
 challenged-0.096
 le-0.096
 gir-0.096
Token the
Token ID262
Feature activation+1.37e-02

Pos logit contributions
ill+0.059
igsaw+0.055
aney+0.052
kin+0.052
etti+0.051
Neg logit contributions
 asking-0.044
 swoop-0.037
 trying-0.037
 le-0.037
 challenged-0.037
Token w
Token ID266
Feature activation+3.96e-02

Pos logit contributions
 asking+0.020
 challenged+0.017
 chased+0.016
 swoop+0.016
have+0.016
Neg logit contributions
ill-0.024
kin-0.023
igsaw-0.022
ifying-0.021
 pleasure-0.021
Token a
Token ID257
Feature activation+9.98e-03

Pos logit contributions
 asking+0.190
 swoop+0.172
 challenged+0.170
 gir+0.163
cut+0.162
Neg logit contributions
ill-0.179
igsaw-0.157
aney-0.147
kin-0.147
ola-0.144
Token raft
Token ID31812
Feature activation+8.91e-02

Pos logit contributions
 asking+0.013
 challenged+0.011
 trying+0.011
 swoop+0.011
 le+0.010
Neg logit contributions
ill-0.021
igsaw-0.019
kin-0.019
aney-0.018
etti-0.018
Token near
Token ID1474
Feature activation+4.66e-02

Pos logit contributions
 asking+0.141
emy+0.140
have+0.133
 requests+0.133
 demands+0.128
Neg logit contributions
ill-0.145
he-0.129
work-0.124
ifying-0.122
cake-0.122
Token the
Token ID262
Feature activation+1.72e-02

Pos logit contributions
 asking+0.077
 challenged+0.072
 gir+0.070
cut+0.070
 bulldo+0.070
Neg logit contributions
ill-0.079
igsaw-0.069
aney-0.066
etti-0.064
ushes-0.062
Token river
Token ID7850
Feature activation+1.56e-01

Pos logit contributions
 asking+0.023
 challenged+0.018
 trying+0.018
 swoop+0.018
 chased+0.018
Neg logit contributions
ill-0.035
igsaw-0.033
kin-0.031
etti-0.031
ifying-0.031
Tokenbank
Token ID17796
Feature activation+2.86e-01

Pos logit contributions
 asking+0.262
 cowboy+0.256
 walks+0.243
emy+0.242
have+0.234
Neg logit contributions
ill-0.262
aney-0.188
work-0.187
cake-0.187
kin-0.186
Token and
Token ID290
Feature activation+1.11e-02

Pos logit contributions
 asking+0.434
 swoop+0.414
 bulldo+0.405
 challenged+0.403
 gir+0.401
Neg logit contributions
ill-0.538
aney-0.430
kin-0.427
cake-0.398
ushes-0.394
Token she
Token ID673
Feature activation+3.46e-02

Pos logit contributions
 bulldo+0.014
 ult+0.014
 asking+0.014
cut+0.014
agons+0.014
Neg logit contributions
ill-0.019
 Jill-0.017
 Jane-0.017
 Something-0.016
 Kate-0.016
Token thought
Token ID1807
Feature activation+1.63e-01

Pos logit contributions
 asking+0.045
 traps+0.042
cut+0.040
 Street+0.040
 sneaking+0.040
Neg logit contributions
ill-0.064
ola-0.058
aney-0.056
 Kate-0.055
 Jill-0.054
Token it
Token ID340
Feature activation-6.77e-02

Pos logit contributions
 asking+0.282
 challenged+0.247
 swoop+0.245
 gir+0.240
cut+0.239
Neg logit contributions
ill-0.278
kin-0.243
aney-0.240
igsaw-0.239
ola-0.227
Token was
Token ID373
Feature activation-2.01e-02

Pos logit contributions
ill+0.125
igsaw+0.109
etti+0.108
kin+0.108
aney+0.107
Neg logit contributions
 asking-0.103
 swoop-0.085
 le-0.085
 ostr-0.084
 challenged-0.083
Token elephant
Token ID20950
Feature activation+1.42e-01

Pos logit contributions
 asking+0.044
 challenged+0.038
 swoop+0.038
 trying+0.036
 chased+0.036
Neg logit contributions
ill-0.057
igsaw-0.054
kin-0.054
ola-0.050
aney-0.050
Token went
Token ID1816
Feature activation+5.68e-02

Pos logit contributions
 asking+0.208
cut+0.170
 trying+0.170
 le+0.169
 trucks+0.169
Neg logit contributions
ill-0.277
aire-0.225
aney-0.223
kin-0.215
cake-0.212
Token to
Token ID284
Feature activation+1.11e-01

Pos logit contributions
 asking+0.079
 swoop+0.073
 le+0.072
 gir+0.071
 ostr+0.071
Neg logit contributions
ill-0.108
igsaw-0.098
cake-0.090
aney-0.089
ola-0.089
Token the
Token ID262
Feature activation+1.79e-02

Pos logit contributions
 asking+0.168
 swoop+0.147
 le+0.146
 challenged+0.144
 gir+0.143
Neg logit contributions
ill-0.206
igsaw-0.189
aney-0.175
etti-0.174
kin-0.173
Token river
Token ID7850
Feature activation+1.55e-01

Pos logit contributions
 asking+0.025
 challenged+0.022
 swoop+0.021
 trying+0.020
 chased+0.020
Neg logit contributions
ill-0.034
igsaw-0.033
ifying-0.030
cake-0.030
etti-0.030
Tokenbank
Token ID17796
Feature activation+2.86e-01

Pos logit contributions
 asking+0.229
 swoop+0.206
 ostr+0.202
 le+0.201
have+0.199
Neg logit contributions
ill-0.298
aney-0.230
igsaw-0.229
kin-0.225
etti-0.223
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
 asking+0.399
 swoop+0.369
 le+0.354
 challenged+0.354
 ostr+0.350
Neg logit contributions
ill-0.582
aney-0.489
kin-0.483
ushes-0.465
igsaw-0.462
Token he
Token ID339
Feature activation+7.76e-02

Pos logit contributions
 asking+0.018
 ult+0.016
cut+0.015
 gir+0.015
have+0.015
Neg logit contributions
ill-0.033
kin-0.028
 Jill-0.028
etti-0.028
ola-0.028
Token noticed
Token ID6810
Feature activation+6.96e-02

Pos logit contributions
 asking+0.110
cut+0.090
 trucks+0.089
 ostr+0.088
 trying+0.088
Neg logit contributions
ill-0.147
kin-0.128
ola-0.126
aney-0.121
etti-0.118
Token something
Token ID1223
Feature activation-9.49e-02

Pos logit contributions
 asking+0.119
 swoop+0.107
 challenged+0.104
 gir+0.100
 trying+0.100
Neg logit contributions
ill-0.115
igsaw-0.098
kin-0.095
aney-0.092
etti-0.092
Token strange
Token ID6283
Feature activation+3.47e-02

Pos logit contributions
ill+0.181
igsaw+0.166
kin+0.162
aney+0.158
ola+0.154
Neg logit contributions
 asking-0.137
 swoop-0.116
 challenged-0.115
 gir-0.113
 chased-0.110
Token Then
Token ID3244
Feature activation-3.86e-02

Pos logit contributions
ill+0.026
kin+0.026
igsaw+0.024
etti+0.023
aney+0.023
Neg logit contributions
 asking-0.024
 le-0.021
 trying-0.020
 challenged-0.020
 z-0.020
Token she
Token ID673
Feature activation+3.05e-02

Pos logit contributions
ill+0.076
igsaw+0.070
aney+0.067
kin+0.067
etti+0.066
Neg logit contributions
 asking-0.057
 swoop-0.048
 le-0.048
 trying-0.048
 challenged-0.047
Token followed
Token ID3940
Feature activation+1.14e-01

Pos logit contributions
 asking+0.039
cut+0.032
 trying+0.032
 trucks+0.030
 le+0.030
Neg logit contributions
ill-0.066
kin-0.058
aney-0.056
ola-0.056
etti-0.054
Token the
Token ID262
Feature activation+8.72e-03

Pos logit contributions
 asking+0.158
 trying+0.131
 le+0.131
 swoop+0.128
 z+0.126
Neg logit contributions
ill-0.235
igsaw-0.218
aney-0.211
kin-0.210
ushes-0.207
Token river
Token ID7850
Feature activation+1.52e-01

Pos logit contributions
 asking+0.012
 challenged+0.010
 swoop+0.010
 chased+0.010
have+0.010
Neg logit contributions
ill-0.016
igsaw-0.015
kin-0.015
ifying-0.014
aire-0.014
Tokenbank
Token ID17796
Feature activation+2.80e-01

Pos logit contributions
 asking+0.216
have+0.202
 challenged+0.194
 ostr+0.192
 swoop+0.191
Neg logit contributions
ill-0.292
igsaw-0.240
aney-0.224
etti-0.223
cake-0.221
Token until
Token ID1566
Feature activation+4.39e-02

Pos logit contributions
 asking+0.372
 swoop+0.324
 challenged+0.322
 le+0.318
 ostr+0.306
Neg logit contributions
ill-0.590
igsaw-0.511
aney-0.507
kin-0.500
ushes-0.496
Token she
Token ID673
Feature activation+2.92e-02

Pos logit contributions
 asking+0.058
 trying+0.050
 le+0.048
 swoop+0.048
 going+0.047
Neg logit contributions
ill-0.094
igsaw-0.089
aney-0.087
ushes-0.086
etti-0.084
Token could
Token ID714
Feature activation-1.29e-02

Pos logit contributions
 asking+0.034
 swoop+0.028
 le+0.028
 challenged+0.027
 trying+0.027
Neg logit contributions
ill-0.065
igsaw-0.063
aney-0.059
kin-0.059
etti-0.059
Token see
Token ID766
Feature activation-7.37e-02

Pos logit contributions
ill+0.024
igsaw+0.023
aney+0.022
ushes+0.021
kin+0.021
Neg logit contributions
 asking-0.020
 trying-0.017
 swoop-0.017
 challenged-0.017
 going-0.017
Token her
Token ID607
Feature activation-3.18e-02

Pos logit contributions
ill+0.150
igsaw+0.139
aney+0.133
kin+0.133
etti+0.131
Neg logit contributions
 asking-0.104
 le-0.087
 trying-0.087
 swoop-0.087
 challenged-0.085
Token pulled
Token ID5954
Feature activation+1.11e-01

Pos logit contributions
ill+0.033
kin+0.029
aney+0.028
ola+0.028
etti+0.026
Neg logit contributions
 asking-0.024
 trying-0.020
cut-0.020
 le-0.019
 trucks-0.019
Token herself
Token ID5223
Feature activation-9.14e-02

Pos logit contributions
 asking+0.151
 trying+0.127
 le+0.126
 swoop+0.125
 z+0.124
Neg logit contributions
ill-0.232
igsaw-0.219
aney-0.209
kin-0.208
etti-0.208
Token onto
Token ID4291
Feature activation-7.47e-02

Pos logit contributions
ill+0.172
igsaw+0.155
kin+0.150
aney+0.149
etti+0.146
Neg logit contributions
 asking-0.142
 le-0.122
 swoop-0.121
 challenged-0.120
 trying-0.120
Token the
Token ID262
Feature activation+1.44e-02

Pos logit contributions
ill+0.149
igsaw+0.137
kin+0.130
aney+0.130
etti+0.128
Neg logit contributions
 asking-0.107
 swoop-0.090
 challenged-0.090
 trying-0.089
 le-0.089
Token river
Token ID7850
Feature activation+1.49e-01

Pos logit contributions
 asking+0.022
 challenged+0.018
 swoop+0.018
 chased+0.017
 telling+0.017
Neg logit contributions
ill-0.025
kin-0.024
igsaw-0.023
etti-0.023
ifying-0.022
Tokenbank
Token ID17796
Feature activation+2.78e-01

Pos logit contributions
 asking+0.212
 challenged+0.188
 swoop+0.183
have+0.180
 le+0.176
Neg logit contributions
ill-0.302
igsaw-0.259
kin-0.248
aney-0.248
etti-0.241
Token,
Token ID11
Feature activation-3.08e-02

Pos logit contributions
 asking+0.336
 swoop+0.278
 le+0.275
 challenged+0.273
 trying+0.270
Neg logit contributions
ill-0.618
igsaw-0.564
aney-0.547
kin-0.544
etti-0.536
Token and
Token ID290
Feature activation+1.03e-02

Pos logit contributions
ill+0.062
kin+0.052
aney+0.051
etti+0.050
igsaw+0.049
Neg logit contributions
 asking-0.038
 challenged-0.034
 le-0.034
 bulldo-0.034
 ostr-0.034
Token waved
Token ID26834
Feature activation-2.45e-02

Pos logit contributions
 asking+0.014
 le+0.012
 swoop+0.012
 challenged+0.012
 trying+0.012
Neg logit contributions
ill-0.021
igsaw-0.018
kin-0.018
aney-0.018
etti-0.018
Token hello
Token ID23748
Feature activation+3.77e-02

Pos logit contributions
ill+0.051
igsaw+0.048
aney+0.045
kin+0.045
etti+0.045
Neg logit contributions
 asking-0.034
 trying-0.028
 swoop-0.028
 le-0.028
 z-0.028
Token to
Token ID284
Feature activation+1.02e-01

Pos logit contributions
 asking+0.041
 trying+0.033
 le+0.032
 swoop+0.032
 z+0.032
Neg logit contributions
ill-0.087
igsaw-0.085
kin-0.081
aney-0.080
etti-0.078
Token he
Token ID339
Feature activation+7.07e-02

Pos logit contributions
 asking+0.148
 swoop+0.127
 ostr+0.125
cut+0.124
 gir+0.123
Neg logit contributions
ill-0.197
igsaw-0.186
kin-0.182
etti-0.173
ushes-0.169
Token got
Token ID1392
Feature activation+3.14e-03

Pos logit contributions
 asking+0.090
 swoop+0.075
 le+0.075
 trying+0.074
 challenged+0.073
Neg logit contributions
ill-0.153
igsaw-0.143
aney-0.136
kin-0.136
etti-0.135
Token to
Token ID284
Feature activation+1.05e-01

Pos logit contributions
 asking+0.003
 trying+0.002
 going+0.002
 swoop+0.002
 le+0.002
Neg logit contributions
igsaw-0.008
ill-0.008
kin-0.008
aney-0.007
etti-0.007
Token the
Token ID262
Feature activation+1.66e-02

Pos logit contributions
 asking+0.149
 swoop+0.125
 le+0.124
 trying+0.124
 challenged+0.123
Neg logit contributions
ill-0.210
igsaw-0.193
aney-0.183
kin-0.182
etti-0.182
Token river
Token ID7850
Feature activation+1.51e-01

Pos logit contributions
 asking+0.025
 swoop+0.020
 chased+0.020
 trying+0.019
 telling+0.019
Neg logit contributions
ill-0.031
igsaw-0.028
 Something-0.027
ifying-0.027
etti-0.026
Tokenbank
Token ID17796
Feature activation+2.78e-01

Pos logit contributions
 asking+0.232
 swoop+0.204
 le+0.202
 gir+0.201
have+0.199
Neg logit contributions
ill-0.282
igsaw-0.227
aney-0.224
etti-0.221
ola-0.218
Token he
Token ID339
Feature activation+7.10e-02

Pos logit contributions
 asking+0.395
 swoop+0.342
 le+0.335
 challenged+0.330
 gir+0.328
Neg logit contributions
ill-0.560
igsaw-0.493
aney-0.480
kin-0.473
etti-0.466
Token was
Token ID373
Feature activation-2.32e-02

Pos logit contributions
 asking+0.094
cut+0.078
 gir+0.074
 trying+0.073
 trucks+0.073
Neg logit contributions
ill-0.151
ola-0.126
aney-0.125
etti-0.124
 Something-0.123
Token very
Token ID845
Feature activation-2.58e-02

Pos logit contributions
ill+0.046
igsaw+0.041
aney+0.040
kin+0.040
etti+0.040
Neg logit contributions
 asking-0.034
 le-0.029
 swoop-0.028
 challenged-0.028
 trying-0.028
Token amazed
Token ID24562
Feature activation-6.28e-02

Pos logit contributions
ill+0.056
igsaw+0.053
aney+0.050
etti+0.050
kin+0.050
Neg logit contributions
 asking-0.033
 trying-0.028
 swoop-0.028
 le-0.027
 challenged-0.027
Token.
Token ID13
Feature activation-1.19e-02

Pos logit contributions
ill+0.123
igsaw+0.102
etti+0.098
aney+0.095
ola+0.095
Neg logit contributions
 asking-0.091
 swoop-0.086
cut-0.083
 chased-0.082
 gir-0.082
Token Stop
Token ID13707
Feature activation+1.48e-02

Pos logit contributions
ill+0.088
igsaw+0.077
aney+0.074
kin+0.071
etti+0.071
Neg logit contributions
 asking-0.115
 swoop-0.102
 trying-0.101
 le-0.101
 challenged-0.100
Token cutting
Token ID7720
Feature activation+1.39e-01

Pos logit contributions
 asking+0.022
 gir+0.021
 swoop+0.021
 ostr+0.020
cut+0.020
Neg logit contributions
ill-0.025
 Kate-0.021
etti-0.021
igsaw-0.021
ola-0.021
Token!"
Token ID2474
Feature activation-4.35e-02

Pos logit contributions
 asking+0.167
 le+0.142
 trying+0.138
 z+0.134
 trucks+0.133
Neg logit contributions
ill-0.301
igsaw-0.290
aney-0.283
etti-0.274
kin-0.272
Token The
Token ID383
Feature activation+4.17e-02

Pos logit contributions
ill+0.069
igsaw+0.061
aney+0.058
kin+0.057
etti+0.056
Neg logit contributions
 asking-0.080
 swoop-0.071
 trying-0.070
 le-0.070
 challenged-0.069
Token wood
Token ID4898
Feature activation+1.47e-01

Pos logit contributions
 asking+0.058
 challenged+0.050
 trying+0.047
 swoop+0.047
 going+0.047
Neg logit contributions
ill-0.079
igsaw-0.077
kin-0.076
aney-0.071
ola-0.070
Tokencut
Token ID8968
Feature activation+2.78e-01

Pos logit contributions
 asking+0.210
 le+0.174
 challenged+0.172
 going+0.170
 gir+0.168
Neg logit contributions
ill-0.295
kin-0.273
aney-0.251
igsaw-0.250
cake-0.248
Tokenters
Token ID1010
Feature activation-2.20e-02

Pos logit contributions
 asking+0.456
 le+0.408
 gir+0.398
 challenged+0.390
 walks+0.388
Neg logit contributions
ill-0.478
kin-0.419
etti-0.386
ola-0.379
aney-0.374
Token heard
Token ID2982
Feature activation+7.52e-02

Pos logit contributions
ill+0.044
aney+0.038
kin+0.038
ola+0.036
ifying+0.035
Neg logit contributions
 asking-0.031
 le-0.027
 gir-0.027
 ostr-0.026
 trucks-0.025
Token them
Token ID606
Feature activation-4.71e-02

Pos logit contributions
 asking+0.114
 challenged+0.099
 swoop+0.099
 trying+0.098
 le+0.097
Neg logit contributions
ill-0.144
igsaw-0.129
aney-0.124
etti-0.121
kin-0.119
Token and
Token ID290
Feature activation+2.09e-02

Pos logit contributions
ill+0.089
aney+0.073
etti+0.071
ifying+0.067
 grin+0.066
Neg logit contributions
 asking-0.069
 le-0.065
 gir-0.064
 ostr-0.064
 challenged-0.062
Token stopped
Token ID5025
Feature activation+5.79e-02

Pos logit contributions
 asking+0.026
have+0.023
 le+0.022
 ostr+0.022
 challenged+0.022
Neg logit contributions
ill-0.042
etti-0.036
 Kate-0.036
 Something-0.035
aney-0.035
Token to
Token ID284
Feature activation+9.52e-02

Pos logit contributions
cut+0.068
perate+0.067
ep+0.066
 ult+0.066
have+0.066
Neg logit contributions
ill-0.063
igsaw-0.062
 Jane-0.061
 Jill-0.058
 Claire-0.057
Token seat
Token ID5852
Feature activation-6.29e-02

Pos logit contributions
 asking+0.152
 trying+0.127
 challenged+0.127
 le+0.124
 causing+0.123
Neg logit contributions
ill-0.170
igsaw-0.154
kin-0.148
etti-0.144
aney-0.143
Token on
Token ID319
Feature activation-6.54e-02

Pos logit contributions
ill+0.142
igsaw+0.135
aney+0.128
etti+0.127
kin+0.126
Neg logit contributions
 asking-0.075
 le-0.064
 sc-0.063
 z-0.062
 trying-0.061
Token the
Token ID262
Feature activation+3.29e-03

Pos logit contributions
ill+0.119
igsaw+0.110
aney+0.103
etti+0.101
kin+0.101
Neg logit contributions
 asking-0.102
 challenged-0.091
 swoop-0.088
cut-0.085
 trying-0.085
Token river
Token ID7850
Feature activation+1.48e-01

Pos logit contributions
 asking+0.005
 challenged+0.004
 warned+0.004
 trying+0.004
 telling+0.004
Neg logit contributions
ill-0.005
aney-0.005
kin-0.005
igsaw-0.005
cake-0.005
Tokenbank
Token ID17796
Feature activation+2.77e-01

Pos logit contributions
 asking+0.221
 challenged+0.194
 swoop+0.188
 bulldo+0.188
 le+0.186
Neg logit contributions
ill-0.282
igsaw-0.236
kin-0.235
aney-0.234
ushes-0.228
Token and
Token ID290
Feature activation-4.15e-03

Pos logit contributions
 asking+0.360
 swoop+0.307
 challenged+0.300
 le+0.300
 trying+0.293
Neg logit contributions
ill-0.590
igsaw-0.530
aney-0.521
kin-0.519
ushes-0.508
Token watch
Token ID2342
Feature activation+6.38e-03

Pos logit contributions
ill+0.008
 Kate+0.006
 Jill+0.006
aney+0.006
igsaw+0.006
Neg logit contributions
 asking-0.005
have-0.005
ep-0.005
 ult-0.005
 bulldo-0.005
Token the
Token ID262
Feature activation+6.04e-03

Pos logit contributions
 asking+0.009
 challenged+0.008
 swoop+0.008
 trying+0.008
 le+0.008
Neg logit contributions
ill-0.012
igsaw-0.011
aney-0.010
etti-0.010
ushes-0.010
Token other
Token ID584
Feature activation+9.37e-02

Pos logit contributions
 asking+0.009
perate+0.007
 challenged+0.007
 begged+0.007
 chased+0.007
Neg logit contributions
ill-0.010
ifying-0.009
kin-0.009
ushes-0.009
work-0.009
Token animals
Token ID4695
Feature activation+1.07e-01

Pos logit contributions
 asking+0.102
 trying+0.083
 challenged+0.081
 swoop+0.080
cut+0.079
Neg logit contributions
ill-0.204
aney-0.178
cy-0.173
ola-0.172
aire-0.171
Token appeared
Token ID4120
Feature activation+5.80e-02

Pos logit contributions
ill+0.173
kin+0.148
ifying+0.144
etti+0.140
ola+0.139
Neg logit contributions
 asking-0.151
have-0.129
 trucks-0.126
 le-0.122
 horses-0.120
Token and
Token ID290
Feature activation+6.83e-03

Pos logit contributions
 asking+0.082
 swoop+0.069
 le+0.069
 challenged+0.069
 gir+0.067
Neg logit contributions
ill-0.119
igsaw-0.105
kin-0.102
etti-0.101
aney-0.100
Token destroyed
Token ID6572
Feature activation+4.85e-02

Pos logit contributions
 asking+0.008
 trying+0.006
cut+0.006
have+0.006
 le+0.006
Neg logit contributions
ill-0.015
etti-0.013
igsaw-0.013
aney-0.013
kin-0.013
Token the
Token ID262
Feature activation+1.39e-02

Pos logit contributions
 asking+0.069
 swoop+0.058
 le+0.058
 trying+0.058
 challenged+0.057
Neg logit contributions
ill-0.097
igsaw-0.090
aney-0.086
kin-0.085
etti-0.084
Token river
Token ID7850
Feature activation+1.49e-01

Pos logit contributions
 asking+0.023
 challenged+0.020
perate+0.020
 swoop+0.019
have+0.019
Neg logit contributions
kin-0.021
ill-0.019
-0.017
aire-0.017
igsaw-0.017
Tokenbank
Token ID17796
Feature activation+2.77e-01

Pos logit contributions
 asking+0.254
 challenged+0.224
have+0.221
 ostr+0.221
 swoop+0.218
Neg logit contributions
ill-0.251
igsaw-0.202
etti-0.200
kin-0.198
cake-0.197
Token!
Token ID0
Feature activation-5.94e-02

Pos logit contributions
 asking+0.425
 swoop+0.391
 ostr+0.380
 challenged+0.374
 gir+0.372
Neg logit contributions
ill-0.516
aney-0.417
kin-0.416
etti-0.415
cake-0.403
Token Water
Token ID5638
Feature activation+1.39e-02

Pos logit contributions
ill+0.115
kin+0.107
ushes+0.102
igsaw+0.100
aney+0.100
Neg logit contributions
 asking-0.095
 le-0.083
 trying-0.082
 challenged-0.080
 chased-0.077
Token was
Token ID373
Feature activation-2.57e-02

Pos logit contributions
 asking+0.021
have+0.018
 gir+0.017
 le+0.017
 trying+0.016
Neg logit contributions
ill-0.027
etti-0.023
kin-0.023
aney-0.022
cake-0.022
Token everywhere
Token ID8347
Feature activation+6.67e-02

Pos logit contributions
ill+0.051
kin+0.047
igsaw+0.046
etti+0.045
ifying+0.045
Neg logit contributions
 asking-0.033
 gir-0.029
 challenged-0.028
 le-0.028
 swoop-0.027
Token and
Token ID290
Feature activation+7.97e-03

Pos logit contributions
 asking+0.092
 challenged+0.079
 swoop+0.079
 le+0.077
 gir+0.077
Neg logit contributions
ill-0.137
igsaw-0.122
kin-0.119
etti-0.117
aney-0.117
Token of
Token ID286
Feature activation-1.26e-01

Pos logit contributions
ill+0.170
ifying+0.147
igsaw+0.140
aps+0.133
 Kate+0.127
Neg logit contributions
ans-0.272
 asking-0.270
 ostr-0.269
 swoop-0.268
 challenged-0.268
Token sight
Token ID6504
Feature activation-3.44e-02

Pos logit contributions
ill+0.238
igsaw+0.220
kin+0.207
aney+0.204
ushes+0.202
Neg logit contributions
 asking-0.189
 swoop-0.163
 challenged-0.163
 le-0.161
 trying-0.159
Token from
Token ID422
Feature activation-9.62e-02

Pos logit contributions
ill+0.070
igsaw+0.063
aney+0.060
kin+0.060
etti+0.059
Neg logit contributions
 asking-0.049
 swoop-0.042
 challenged-0.041
 le-0.041
 trying-0.040
Token the
Token ID262
Feature activation+8.64e-03

Pos logit contributions
ill+0.190
igsaw+0.175
kin+0.166
aney+0.164
etti+0.162
Neg logit contributions
 asking-0.137
 swoop-0.117
 challenged-0.117
 le-0.116
 trying-0.114
Token river
Token ID7850
Feature activation+1.48e-01

Pos logit contributions
 asking+0.012
 challenged+0.010
 swoop+0.010
 chased+0.010
 trying+0.010
Neg logit contributions
ill-0.016
igsaw-0.015
ifying-0.014
kin-0.014
ola-0.014
Tokenbank
Token ID17796
Feature activation+2.76e-01

Pos logit contributions
 asking+0.221
have+0.202
 swoop+0.200
 challenged+0.198
 ostr+0.193
Neg logit contributions
ill-0.271
igsaw-0.214
aney-0.209
kin-0.209
etti-0.206
Token.
Token ID13
Feature activation-2.33e-02

Pos logit contributions
 asking+0.392
 swoop+0.362
 challenged+0.355
 le+0.342
 bulldo+0.340
Neg logit contributions
ill-0.547
aney-0.461
igsaw-0.452
kin-0.449
etti-0.437
Token\n
Token ID198
Feature activation-4.45e-02

Pos logit contributions
ill+0.041
kin+0.038
igsaw+0.037
aney+0.035
etti+0.035
Neg logit contributions
 asking-0.040
 le-0.035
 trying-0.034
 challenged-0.033
 swoop-0.033
Token\n
Token ID198
Feature activation-4.36e-02

Pos logit contributions
ill+0.062
igsaw+0.053
aney+0.052
ola+0.049
ushes+0.049
Neg logit contributions
 asking-0.091
 gir-0.082
 trucks-0.081
 trying-0.081
 le-0.081
TokenThe
Token ID464
Feature activation+8.69e-03

Pos logit contributions
ill+0.055
aney+0.046
igsaw+0.045
ola+0.043
 Something+0.043
Neg logit contributions
 asking-0.096
 trucks-0.092
 going-0.085
 ro-0.085
 trying-0.084
Token little
Token ID1310
Feature activation+1.17e-01

Pos logit contributions
 asking+0.011
 yells+0.010
 tries+0.009
 sneaking+0.009
 chased+0.009
Neg logit contributions
kin-0.015
igsaw-0.015
ill-0.015
-0.014
ola-0.014
Token ran
Token ID4966
Feature activation+1.41e-01

Pos logit contributions
 asking+0.040
 sneaking+0.035
 trying+0.033
 telling+0.032
 demanding+0.032
Neg logit contributions
ill-0.054
kin-0.050
ola-0.049
etti-0.046
aney-0.046
Token down
Token ID866
Feature activation-1.02e-01

Pos logit contributions
 asking+0.221
 gir+0.197
cut+0.195
 swoop+0.192
 challenged+0.191
Neg logit contributions
ill-0.251
igsaw-0.220
etti-0.210
aney-0.206
ola-0.204
Token to
Token ID284
Feature activation+1.00e-01

Pos logit contributions
ill+0.193
igsaw+0.163
ola+0.153
aney+0.151
ushes+0.150
Neg logit contributions
 asking-0.156
have-0.140
 challenged-0.139
 gir-0.134
 swoop-0.133
Token the
Token ID262
Feature activation+6.95e-03

Pos logit contributions
 asking+0.159
 gir+0.143
cut+0.138
 challenged+0.137
 le+0.137
Neg logit contributions
ill-0.171
igsaw-0.148
etti-0.144
ola-0.143
aney-0.140
Token river
Token ID7850
Feature activation+1.50e-01

Pos logit contributions
 asking+0.012
 sneaking+0.010
 begging+0.010
 telling+0.010
 chased+0.010
Neg logit contributions
kin-0.009
ifying-0.009
ill-0.009
etti-0.008
work-0.008
Tokenbank
Token ID17796
Feature activation+2.75e-01

Pos logit contributions
 asking+0.241
 cowboy+0.237
emy+0.234
have+0.231
 chased+0.222
Neg logit contributions
ill-0.253
aps-0.187
etti-0.184
work-0.183
aney-0.182
Token.
Token ID13
Feature activation-3.35e-02

Pos logit contributions
 asking+0.400
 swoop+0.379
 challenged+0.374
 bulldo+0.370
 le+0.367
Neg logit contributions
ill-0.533
aney-0.441
kin-0.431
ushes-0.422
etti-0.414
Token
Token ID220
Feature activation-3.89e-02

Pos logit contributions
ill+0.049
igsaw+0.044
aney+0.041
ushes+0.040
etti+0.040
Neg logit contributions
 asking-0.065
 swoop-0.058
 trying-0.057
 gir-0.057
 challenged-0.057
Token\n
Token ID198
Feature activation-4.61e-02

Pos logit contributions
ill+0.071
igsaw+0.070
kin+0.066
etti+0.065
aney+0.063
Neg logit contributions
 asking-0.059
 trying-0.050
 le-0.049
 challenged-0.049
 z-0.048
Token\n
Token ID198
Feature activation-4.46e-02

Pos logit contributions
ill+0.074
igsaw+0.066
aney+0.063
kin+0.062
ushes+0.060
Neg logit contributions
 asking-0.085
 swoop-0.074
 trying-0.074
 le-0.074
 challenged-0.073
TokenWhen
Token ID2215
Feature activation+9.97e-02

Pos logit contributions
ill+0.070
igsaw+0.061
aney+0.059
 Something+0.059
ola+0.058
Neg logit contributions
 asking-0.079
 trucks-0.073
 ostr-0.070
 trying-0.070
 going-0.069
Token to
Token ID284
Feature activation+1.04e-01

Pos logit contributions
cut+0.158
 bulldo+0.147
ep+0.146
cod+0.144
 trucks+0.143
Neg logit contributions
ill-0.138
 Jill-0.122
 Jane-0.116
cake-0.115
 Something-0.112
Token go
Token ID467
Feature activation-4.22e-02

Pos logit contributions
 asking+0.155
 le+0.130
 trying+0.129
 swoop+0.127
 challenged+0.126
Neg logit contributions
ill-0.201
igsaw-0.180
kin-0.175
aney-0.172
etti-0.171
Token to
Token ID284
Feature activation+1.04e-01

Pos logit contributions
ill+0.084
igsaw+0.076
etti+0.071
kin+0.071
aney+0.071
Neg logit contributions
 asking-0.059
 swoop-0.051
 le-0.050
cut-0.050
 challenged-0.049
Token the
Token ID262
Feature activation+1.61e-02

Pos logit contributions
 asking+0.161
cut+0.148
 swoop+0.145
 challenged+0.141
 chased+0.140
Neg logit contributions
ill-0.181
igsaw-0.169
etti-0.152
aney-0.148
kin-0.146
Token river
Token ID7850
Feature activation+1.48e-01

Pos logit contributions
 asking+0.022
 swoop+0.018
 challenged+0.018
cut+0.018
 trying+0.017
Neg logit contributions
ill-0.033
igsaw-0.031
aney-0.029
etti-0.029
cake-0.028
Tokenbank
Token ID17796
Feature activation+2.75e-01

Pos logit contributions
 asking+0.222
 swoop+0.204
 ostr+0.198
 yells+0.195
 le+0.195
Neg logit contributions
ill-0.276
igsaw-0.207
aps-0.207
etti-0.202
work-0.199
Token.
Token ID13
Feature activation-8.45e-03

Pos logit contributions
 asking+0.390
 swoop+0.356
 challenged+0.343
 le+0.340
 ostr+0.340
Neg logit contributions
ill-0.554
aney-0.454
igsaw-0.451
kin-0.451
ushes-0.441
Token She
Token ID1375
Feature activation-1.41e-02

Pos logit contributions
ill+0.013
igsaw+0.011
aney+0.010
ushes+0.010
etti+0.010
Neg logit contributions
 asking-0.016
 swoop-0.015
 trying-0.015
 le-0.014
 challenged-0.014
Token found
Token ID1043
Feature activation-4.61e-03

Pos logit contributions
ill+0.026
kin+0.023
ola+0.023
cy+0.022
aney+0.021
Neg logit contributions
 asking-0.019
 trying-0.016
 sneaking-0.016
cut-0.016
 causing-0.016
Token a
Token ID257
Feature activation+7.55e-03

Pos logit contributions
ill+0.008
igsaw+0.007
aney+0.007
ushes+0.007
kin+0.007
Neg logit contributions
 asking-0.007
 challenged-0.006
 swoop-0.006
 trying-0.006
 le-0.006
Token kind
Token ID1611
Feature activation-1.04e-02

Pos logit contributions
 asking+0.011
 challenged+0.009
 trying+0.009
 chased+0.009
 swoop+0.009
Neg logit contributions
ill-0.014
igsaw-0.013
kin-0.013
aney-0.012
etti-0.012
Token to
Token ID284
Feature activation+1.01e-01

Pos logit contributions
ill+0.026
 Something+0.022
ushes+0.022
aney+0.021
 Jill+0.021
Neg logit contributions
cut-0.024
 asking-0.024
have-0.021
 swoop-0.021
 bulldo-0.020
Token crawl
Token ID27318
Feature activation+4.81e-02

Pos logit contributions
 asking+0.159
 challenged+0.135
 gir+0.134
 le+0.131
 trying+0.131
Neg logit contributions
ill-0.184
kin-0.169
igsaw-0.153
aney-0.148
etti-0.147
Token along
Token ID1863
Feature activation-1.81e-02

Pos logit contributions
 asking+0.072
 swoop+0.060
 le+0.060
 challenged+0.060
 trying+0.060
Neg logit contributions
ill-0.094
igsaw-0.086
kin-0.082
aney-0.082
etti-0.081
Token the
Token ID262
Feature activation+1.00e-02

Pos logit contributions
ill+0.030
igsaw+0.026
 Kate+0.025
kin+0.025
etti+0.024
Neg logit contributions
 asking-0.029
 ostr-0.028
 swoop-0.028
 challenged-0.027
have-0.027
Token river
Token ID7850
Feature activation+1.47e-01

Pos logit contributions
 asking+0.016
 begging+0.014
cut+0.014
 swoop+0.014
 challenged+0.014
Neg logit contributions
igsaw-0.014
ill-0.014
kin-0.013
aire-0.012
tub-0.012
Tokenbank
Token ID17796
Feature activation+2.75e-01

Pos logit contributions
 asking+0.246
 cowboy+0.242
emy+0.240
have+0.236
 bulldo+0.235
Neg logit contributions
ill-0.227
kin-0.159
aps-0.154
aney-0.153
work-0.150
Token.
Token ID13
Feature activation-2.31e-02

Pos logit contributions
 asking+0.438
 swoop+0.432
 bulldo+0.427
 ostr+0.422
 le+0.410
Neg logit contributions
ill-0.479
kin-0.378
aney-0.365
ushes-0.358
aps-0.349
Token
Token ID198
Feature activation-3.77e-02

Pos logit contributions
ill+0.038
igsaw+0.035
kin+0.033
aney+0.033
etti+0.032
Neg logit contributions
 asking-0.041
 le-0.036
 trying-0.036
 swoop-0.036
 challenged-0.035
Token
Token ID198
Feature activation-3.70e-02

Pos logit contributions
ill+0.060
igsaw+0.054
aney+0.051
kin+0.050
etti+0.049
Neg logit contributions
 asking-0.069
 swoop-0.061
 le-0.061
 trying-0.061
 challenged-0.060
TokenAs
Token ID1722
Feature activation+1.36e-01

Pos logit contributions
ill+0.050
aney+0.043
 Something+0.041
aire+0.040
Kate+0.039
Neg logit contributions
 asking-0.076
 trucks-0.067
 going-0.067
 sc-0.065
 ostr-0.065
Token the
Token ID262
Feature activation+1.10e-02

Pos logit contributions
 asking+0.199
 swoop+0.167
 le+0.166
 trying+0.166
 challenged+0.165
Neg logit contributions
ill-0.271
igsaw-0.249
kin-0.239
aney-0.237
etti-0.235
Token away
Token ID1497
Feature activation-8.36e-02

Pos logit contributions
 asking+0.077
 swoop+0.064
 le+0.064
 trying+0.063
 challenged+0.063
Neg logit contributions
ill-0.125
igsaw-0.115
kin-0.111
aney-0.111
etti-0.109
Token in
Token ID287
Feature activation-9.91e-02

Pos logit contributions
ill+0.187
igsaw+0.177
aney+0.170
etti+0.169
kin+0.167
Neg logit contributions
 asking-0.099
 trying-0.080
 le-0.078
 swoop-0.076
 sc-0.075
Token her
Token ID607
Feature activation-3.11e-02

Pos logit contributions
ill+0.217
etti+0.204
aney+0.203
igsaw+0.201
cake+0.200
Neg logit contributions
 asking-0.129
 trying-0.103
 z-0.097
 sc-0.097
 going-0.096
Token pig
Token ID12967
Feature activation+1.29e-01

Pos logit contributions
ill+0.059
aney+0.053
igsaw+0.052
etti+0.051
ushes+0.050
Neg logit contributions
 asking-0.049
 trying-0.043
 le-0.042
 sc-0.042
 z-0.042
Tokengy
Token ID1360
Feature activation+7.44e-02

Pos logit contributions
 asking+0.191
 swoop+0.174
 challenged+0.168
 le+0.164
 chased+0.163
Neg logit contributions
ill-0.239
kin-0.210
igsaw-0.205
etti-0.198
aire-0.194
Tokenbank
Token ID17796
Feature activation+2.75e-01

Pos logit contributions
 asking+0.092
 swoop+0.080
 challenged+0.077
 trying+0.073
 going+0.071
Neg logit contributions
ill-0.147
igsaw-0.142
kin-0.134
ifying-0.132
aire-0.132
Token every
Token ID790
Feature activation-2.75e-02

Pos logit contributions
 asking+0.382
 swoop+0.353
 le+0.339
 challenged+0.337
 gir+0.329
Neg logit contributions
ill-0.559
aney-0.481
kin-0.479
igsaw-0.477
ushes-0.466
Token week
Token ID1285
Feature activation+1.44e-01

Pos logit contributions
ill+0.052
igsaw+0.049
kin+0.049
aney+0.046
ola+0.045
Neg logit contributions
 asking-0.037
 gir-0.034
cut-0.034
 trucks-0.033
 le-0.033
Token.
Token ID13
Feature activation-2.37e-02

Pos logit contributions
 asking+0.192
 swoop+0.167
 challenged+0.163
 le+0.163
 trying+0.160
Neg logit contributions
ill-0.298
igsaw-0.272
kin-0.263
aney-0.262
etti-0.256
Token This
Token ID770
Feature activation+2.67e-02

Pos logit contributions
ill+0.045
kin+0.045
etti+0.044
igsaw+0.043
ifying+0.041
Neg logit contributions
 asking-0.033
 le-0.027
 trying-0.026
 challenged-0.026
 asked-0.025
Token way
Token ID835
Feature activation+8.96e-02

Pos logit contributions
 asking+0.033
 swoop+0.029
 le+0.027
 gir+0.026
 trying+0.026
Neg logit contributions
ill-0.057
igsaw-0.054
kin-0.053
aney-0.050
ola-0.050
Token John
Token ID1757
Feature activation-4.00e-02

Pos logit contributions
ill+0.018
kin+0.018
igsaw+0.017
ola+0.016
ushes+0.016
Neg logit contributions
 asking-0.017
 trying-0.015
 challenged-0.014
 going-0.014
 le-0.014
Token quickly
Token ID2952
Feature activation+2.46e-02

Pos logit contributions
ill+0.074
aney+0.062
etti+0.059
ola+0.059
aire+0.057
Neg logit contributions
 asking-0.062
 le-0.055
 trucks-0.054
 going-0.053
cut-0.053
Token scanned
Token ID28660
Feature activation-3.10e-02

Pos logit contributions
 asking+0.030
 trucks+0.027
cut+0.026
 trying+0.025
have+0.025
Neg logit contributions
ill-0.052
igsaw-0.043
aney-0.043
kin-0.043
ola-0.042
Token the
Token ID262
Feature activation+1.62e-02

Pos logit contributions
ill+0.071
igsaw+0.068
aney+0.066
kin+0.065
etti+0.065
Neg logit contributions
 asking-0.037
 trying-0.029
 z-0.029
 sc-0.028
 swoop-0.028
Token river
Token ID7850
Feature activation+1.47e-01

Pos logit contributions
 asking+0.024
 challenged+0.021
 swoop+0.021
cut+0.021
 le+0.020
Neg logit contributions
ill-0.029
igsaw-0.027
ushes-0.026
ifying-0.026
kin-0.025
Tokenbank
Token ID17796
Feature activation+2.75e-01

Pos logit contributions
 asking+0.200
 challenged+0.176
 swoop+0.175
 le+0.172
have+0.171
Neg logit contributions
ill-0.300
igsaw-0.262
kin-0.250
etti-0.249
aney-0.249
Token to
Token ID284
Feature activation+1.03e-01

Pos logit contributions
 asking+0.352
 swoop+0.305
 le+0.298
 challenged+0.297
 ostr+0.284
Neg logit contributions
ill-0.589
igsaw-0.519
aney-0.510
kin-0.504
etti-0.500
Token see
Token ID766
Feature activation-7.08e-02

Pos logit contributions
 asking+0.135
 trying+0.112
 z+0.110
 sc+0.107
 going+0.106
Neg logit contributions
ill-0.227
igsaw-0.218
aney-0.215
kin-0.209
etti-0.207
Token if
Token ID611
Feature activation+6.69e-02

Pos logit contributions
ill+0.139
igsaw+0.125
aney+0.120
kin+0.120
ushes+0.119
Neg logit contributions
 asking-0.103
 swoop-0.088
 le-0.087
 challenged-0.087
 trying-0.085
Token anything
Token ID1997
Feature activation-1.14e-01

Pos logit contributions
 asking+0.090
 le+0.076
 swoop+0.076
 challenged+0.073
 gir+0.073
Neg logit contributions
ill-0.136
igsaw-0.124
kin-0.119
aney-0.118
etti-0.116
Token was
Token ID373
Feature activation-2.15e-02

Pos logit contributions
ill+0.218
igsaw+0.195
kin+0.188
aney+0.187
ola+0.182
Neg logit contributions
 asking-0.170
 swoop-0.142
 gir-0.141
 le-0.141
 challenged-0.139
Token back
Token ID736
Feature activation-2.58e-01

Pos logit contributions
 asking+0.150
 trying+0.130
 sc+0.123
 z+0.122
 le+0.121
Neg logit contributions
igsaw-0.296
ill-0.296
aney-0.292
etti-0.279
ushes-0.268
Token down
Token ID866
Feature activation-9.96e-02

Pos logit contributions
ill+0.501
igsaw+0.443
ushes+0.425
etti+0.420
kin+0.414
Neg logit contributions
 asking-0.376
 swoop-0.331
 challenged-0.327
 le-0.322
 gir-0.321
Token on
Token ID319
Feature activation-5.45e-02

Pos logit contributions
ill+0.215
igsaw+0.200
aney+0.191
kin+0.191
etti+0.188
Neg logit contributions
 asking-0.128
 swoop-0.105
 le-0.105
 trying-0.105
 challenged-0.103
Token the
Token ID262
Feature activation+1.32e-02

Pos logit contributions
ill+0.110
igsaw+0.101
kin+0.096
aney+0.095
etti+0.095
Neg logit contributions
 asking-0.078
 swoop-0.065
 challenged-0.065
 trying-0.064
 le-0.064
Token river
Token ID7850
Feature activation+1.47e-01

Pos logit contributions
 asking+0.018
 challenged+0.014
have+0.014
 trying+0.014
 swoop+0.014
Neg logit contributions
ill-0.027
igsaw-0.024
kin-0.023
etti-0.023
aney-0.023
Tokenbank
Token ID17796
Feature activation+2.75e-01

Pos logit contributions
 asking+0.218
have+0.194
 challenged+0.191
 swoop+0.189
 ostr+0.185
Neg logit contributions
ill-0.282
igsaw-0.222
kin-0.221
aney-0.220
etti-0.219
Token.
Token ID13
Feature activation-1.19e-02

Pos logit contributions
 asking+0.391
 swoop+0.357
 challenged+0.345
 le+0.343
 ostr+0.341
Neg logit contributions
ill-0.547
aney-0.455
kin-0.447
igsaw-0.438
etti-0.436
Token
Token ID198
Feature activation-3.32e-02

Pos logit contributions
ill+0.023
kin+0.022
igsaw+0.021
aney+0.020
etti+0.020
Neg logit contributions
 asking-0.018
 le-0.016
 challenged-0.015
 trying-0.015
 z-0.015
Token
Token ID198
Feature activation-3.27e-02

Pos logit contributions
ill+0.053
igsaw+0.047
aney+0.045
kin+0.044
ushes+0.043
Neg logit contributions
 asking-0.062
 swoop-0.054
 trying-0.054
 le-0.054
 challenged-0.053
TokenHe
Token ID1544
Feature activation+3.15e-02

Pos logit contributions
ill+0.046
igsaw+0.040
aney+0.037
cake+0.037
kin+0.036
Neg logit contributions
 asking-0.069
 trying-0.060
 ostr-0.060
 going-0.059
 trucks-0.059
Token continued
Token ID3767
Feature activation-8.96e-02

Pos logit contributions
 asking+0.043
cut+0.035
 trying+0.035
 going+0.034
 trucks+0.033
Neg logit contributions
ill-0.064
kin-0.055
ola-0.055
aney-0.054
etti-0.052
Token man
Token ID582
Feature activation+1.10e-01

Pos logit contributions
 asking+0.065
 challenged+0.053
 chased+0.051
 swoop+0.051
have+0.051
Neg logit contributions
ill-0.112
kin-0.104
igsaw-0.104
etti-0.099
aney-0.096
Token standing
Token ID5055
Feature activation+9.64e-02

Pos logit contributions
 asking+0.137
 gir+0.136
 challenged+0.134
 le+0.132
 trucks+0.131
Neg logit contributions
ill-0.209
aps-0.180
etti-0.180
ifying-0.170
aney-0.166
Token by
Token ID416
Feature activation+4.17e-02

Pos logit contributions
 asking+0.135
 swoop+0.114
 le+0.114
 challenged+0.113
 gir+0.110
Neg logit contributions
ill-0.196
igsaw-0.177
kin-0.173
ushes-0.169
aney-0.169
Token the
Token ID262
Feature activation+1.68e-02

Pos logit contributions
 asking+0.062
 challenged+0.057
 swoop+0.056
 gir+0.055
 le+0.055
Neg logit contributions
ill-0.076
igsaw-0.066
kin-0.064
etti-0.063
aney-0.062
Token river
Token ID7850
Feature activation+1.47e-01

Pos logit contributions
 asking+0.026
 challenged+0.022
 chased+0.022
have+0.022
 swoop+0.021
Neg logit contributions
ill-0.027
ifying-0.026
igsaw-0.025
kin-0.024
etti-0.024
Tokenbank
Token ID17796
Feature activation+2.75e-01

Pos logit contributions
 asking+0.246
emy+0.235
have+0.234
 cowboy+0.234
 challenged+0.226
Neg logit contributions
ill-0.228
work-0.172
etti-0.171
cake-0.171
aps-0.168
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
 asking+0.440
 swoop+0.432
 challenged+0.419
 ostr+0.417
 gir+0.410
Neg logit contributions
ill-0.474
aney-0.360
kin-0.354
etti-0.346
aps-0.344
Token The
Token ID383
Feature activation+2.99e-02

Pos logit contributions
ill+0.022
igsaw+0.020
aney+0.018
etti+0.018
kin+0.018
Neg logit contributions
 asking-0.027
 swoop-0.024
 trying-0.023
 le-0.023
 challenged-0.023
Token old
Token ID1468
Feature activation+5.44e-02

Pos logit contributions
 asking+0.045
 sneaking+0.041
 chased+0.041
 swoop+0.040
perate+0.039
Neg logit contributions
ill-0.046
kin-0.044
igsaw-0.044
ushes-0.041
pour-0.040
Token man
Token ID582
Feature activation+1.10e-01

Pos logit contributions
 sneaking+0.082
 asking+0.081
have+0.076
 causing+0.075
 chased+0.075
Neg logit contributions
ill-0.078
kin-0.072
 globe-0.066
 Jill-0.063
aps-0.062
Token showed
Token ID3751
Feature activation-9.96e-03

Pos logit contributions
 asking+0.165
 going+0.141
have+0.139
 trucks+0.137
 horses+0.136
Neg logit contributions
ill-0.186
etti-0.158
aney-0.157
kin-0.154
cake-0.154
Token and
Token ID290
Feature activation+1.05e-02

Pos logit contributions
ill+0.225
igsaw+0.206
kin+0.197
aney+0.197
ushes+0.194
Neg logit contributions
 asking-0.142
 swoop-0.127
 le-0.125
 challenged-0.121
 gir-0.120
Token grabbed
Token ID13176
Feature activation+9.84e-02

Pos logit contributions
 asking+0.010
 le+0.009
 swoop+0.009
 ostr+0.009
 gir+0.009
Neg logit contributions
ill-0.023
igsaw-0.021
etti-0.021
kin-0.021
 Kate-0.020
Token his
Token ID465
Feature activation+7.76e-03

Pos logit contributions
 asking+0.118
 le+0.095
 trying+0.094
 z+0.092
 sc+0.091
Neg logit contributions
ill-0.235
igsaw-0.214
aney-0.210
kin-0.207
etti-0.205
Token pig
Token ID12967
Feature activation+1.29e-01

Pos logit contributions
 asking+0.011
 swoop+0.010
 challenged+0.010
cut+0.009
 chased+0.009
Neg logit contributions
igsaw-0.014
ill-0.014
kin-0.013
ushes-0.013
ifying-0.013
Tokengy
Token ID1360
Feature activation+7.30e-02

Pos logit contributions
 asking+0.212
 challenged+0.193
 swoop+0.186
 le+0.182
have+0.181
Neg logit contributions
ill-0.217
igsaw-0.194
etti-0.185
kin-0.180
ushes-0.179
Tokenbank
Token ID17796
Feature activation+2.74e-01

Pos logit contributions
 asking+0.085
 challenged+0.075
 swoop+0.074
have+0.069
emy+0.066
Neg logit contributions
igsaw-0.146
ill-0.143
ifying-0.139
etti-0.134
aney-0.132
Token and
Token ID290
Feature activation+1.05e-02

Pos logit contributions
 asking+0.372
 swoop+0.335
 challenged+0.325
 le+0.325
 ostr+0.315
Neg logit contributions
ill-0.566
igsaw-0.490
aney-0.487
kin-0.484
ushes-0.481
Token brought
Token ID3181
Feature activation+1.17e-02

Pos logit contributions
 asking+0.012
 swoop+0.010
 le+0.010
 ostr+0.010
 gir+0.010
Neg logit contributions
ill-0.021
kin-0.019
etti-0.019
igsaw-0.019
 Kate-0.018
Token it
Token ID340
Feature activation-6.65e-02

Pos logit contributions
 asking+0.016
 trying+0.013
 le+0.013
 swoop+0.013
 z+0.012
Neg logit contributions
ill-0.025
igsaw-0.023
kin-0.022
aney-0.022
etti-0.022
Token back
Token ID736
Feature activation-2.58e-01

Pos logit contributions
ill+0.103
igsaw+0.091
ushes+0.085
kin+0.085
aney+0.083
Neg logit contributions
 asking-0.123
 le-0.111
 challenged-0.111
 gir-0.110
 swoop-0.109
Token to
Token ID284
Feature activation+1.02e-01

Pos logit contributions
ill+0.542
igsaw+0.497
aney+0.475
kin+0.473
etti+0.467
Neg logit contributions
 asking-0.342
 swoop-0.290
 le-0.289
 challenged-0.284
 trying-0.283
Token and
Token ID290
Feature activation+5.53e-03

Pos logit contributions
ill+0.073
igsaw+0.070
aney+0.068
etti+0.067
ushes+0.066
Neg logit contributions
 asking-0.033
 le-0.027
 trying-0.026
 z-0.026
 sc-0.026
Token ran
Token ID4966
Feature activation+1.41e-01

Pos logit contributions
 asking+0.007
 swoop+0.006
 trying+0.006
cut+0.006
 le+0.006
Neg logit contributions
ill-0.012
igsaw-0.011
kin-0.011
aney-0.010
etti-0.010
Token to
Token ID284
Feature activation+1.01e-01

Pos logit contributions
 asking+0.211
 swoop+0.180
 le+0.178
 trying+0.177
 challenged+0.176
Neg logit contributions
ill-0.273
igsaw-0.252
aney-0.239
kin-0.238
etti-0.235
Token the
Token ID262
Feature activation+1.29e-02

Pos logit contributions
 asking+0.149
 swoop+0.137
 gir+0.134
cut+0.132
 ostr+0.130
Neg logit contributions
ill-0.189
igsaw-0.174
etti-0.161
kin-0.160
ola-0.159
Token river
Token ID7850
Feature activation+1.47e-01

Pos logit contributions
 asking+0.019
 swoop+0.016
 challenged+0.016
 chased+0.016
 trying+0.015
Neg logit contributions
ill-0.024
igsaw-0.021
kin-0.021
ifying-0.021
ola-0.020
Tokenbank
Token ID17796
Feature activation+2.74e-01

Pos logit contributions
 asking+0.220
 swoop+0.201
have+0.199
 ostr+0.197
 challenged+0.192
Neg logit contributions
ill-0.270
igsaw-0.215
etti-0.208
kin-0.206
aps-0.206
Token.
Token ID13
Feature activation-1.43e-02

Pos logit contributions
 asking+0.391
 swoop+0.357
 challenged+0.348
 ostr+0.344
 le+0.342
Neg logit contributions
ill-0.543
aney-0.455
kin-0.452
igsaw-0.450
ushes-0.446
Token
Token ID220
Feature activation-3.38e-02

Pos logit contributions
ill+0.029
kin+0.027
igsaw+0.026
aney+0.026
ola+0.025
Neg logit contributions
 asking-0.022
 trying-0.018
 challenged-0.018
 going-0.017
 telling-0.017
Token
Token ID198
Feature activation-3.91e-02

Pos logit contributions
ill+0.061
igsaw+0.059
kin+0.057
aney+0.055
etti+0.055
Neg logit contributions
 asking-0.053
 trying-0.045
 challenged-0.044
 le-0.043
 going-0.043
Token
Token ID198
Feature activation-3.88e-02

Pos logit contributions
ill+0.064
igsaw+0.058
aney+0.054
kin+0.054
etti+0.053
Neg logit contributions
 asking-0.070
 swoop-0.061
 le-0.061
 trying-0.061
 challenged-0.061
TokenCharlie
Token ID37136
Feature activation+7.98e-02

Pos logit contributions
ill+0.072
igsaw+0.065
aney+0.062
ushes+0.061
kin+0.061
Neg logit contributions
 asking-0.060
 le-0.054
 ostr-0.053
 gir-0.051
 swoop-0.051
Token
Token ID198
Feature activation-3.43e-02

Pos logit contributions
ill+0.057
igsaw+0.051
aney+0.048
kin+0.048
etti+0.047
Neg logit contributions
 asking-0.063
 swoop-0.055
 le-0.055
 trying-0.055
 challenged-0.055
TokenStanding
Token ID44196
Feature activation+1.15e-01

Pos logit contributions
ill+0.047
igsaw+0.043
aney+0.040
cake+0.039
 Something+0.038
Neg logit contributions
 asking-0.072
 ostr-0.063
 going-0.062
 trying-0.062
 trucks-0.062
Token on
Token ID319
Feature activation-5.31e-02

Pos logit contributions
 asking+0.148
 swoop+0.122
 le+0.122
 trying+0.121
 challenged+0.120
Neg logit contributions
ill-0.246
igsaw-0.228
aney-0.218
kin-0.218
etti-0.215
Token the
Token ID262
Feature activation+1.57e-02

Pos logit contributions
ill+0.111
igsaw+0.102
aney+0.098
kin+0.098
etti+0.097
Neg logit contributions
 asking-0.072
 le-0.060
 swoop-0.060
 trying-0.060
 challenged-0.059
Token river
Token ID7850
Feature activation+1.47e-01

Pos logit contributions
 asking+0.022
 swoop+0.019
 chased+0.018
 challenged+0.017
have+0.017
Neg logit contributions
ill-0.029
igsaw-0.028
kin-0.026
etti-0.026
ola-0.025
Tokenbank
Token ID17796
Feature activation+2.74e-01

Pos logit contributions
 asking+0.211
 le+0.186
 swoop+0.183
have+0.183
 gir+0.182
Neg logit contributions
ill-0.283
aney-0.232
igsaw-0.228
kin-0.227
etti-0.224
Token was
Token ID373
Feature activation-2.41e-02

Pos logit contributions
 asking+0.390
 swoop+0.336
 le+0.327
 challenged+0.321
 trying+0.321
Neg logit contributions
ill-0.547
igsaw-0.480
aney-0.471
kin-0.463
etti-0.456
Token a
Token ID257
Feature activation+6.01e-03

Pos logit contributions
ill+0.045
igsaw+0.039
kin+0.039
aney+0.038
ushes+0.038
Neg logit contributions
 asking-0.037
 swoop-0.032
 challenged-0.032
 le-0.032
 gir-0.031
Token duck
Token ID22045
Feature activation+4.98e-02

Pos logit contributions
 asking+0.008
 swoop+0.006
 challenged+0.006
 trying+0.006
 le+0.006
Neg logit contributions
ill-0.013
igsaw-0.012
kin-0.011
aney-0.011
etti-0.011
Token.
Token ID13
Feature activation-9.69e-03

Pos logit contributions
 asking+0.075
have+0.068
 trucks+0.067
 challenged+0.066
 ult+0.065
Neg logit contributions
ill-0.087
etti-0.076
ifying-0.073
kin-0.073
igsaw-0.070
Token The
Token ID383
Feature activation+2.97e-02

Pos logit contributions
ill+0.014
igsaw+0.013
aney+0.012
kin+0.011
etti+0.011
Neg logit contributions
 asking-0.019
 swoop-0.017
 trying-0.017
 le-0.017
 challenged-0.017
Token was
Token ID373
Feature activation-2.53e-02

Pos logit contributions
ill+0.205
kin+0.182
igsaw+0.179
etti+0.177
ushes+0.175
Neg logit contributions
 asking-0.165
 le-0.139
 trying-0.138
 swoop-0.136
 gir-0.135
Token different
Token ID1180
Feature activation+2.51e-02

Pos logit contributions
ill+0.041
kin+0.038
igsaw+0.036
ushes+0.035
etti+0.034
Neg logit contributions
 asking-0.041
 challenged-0.038
 swoop-0.036
 chased-0.036
have-0.036
Token from
Token ID422
Feature activation-9.35e-02

Pos logit contributions
 asking+0.037
 swoop+0.032
 le+0.031
 challenged+0.031
 trying+0.031
Neg logit contributions
ill-0.049
igsaw-0.045
aney-0.043
kin-0.043
etti-0.043
Token the
Token ID262
Feature activation+1.56e-02

Pos logit contributions
ill+0.172
igsaw+0.161
kin+0.150
etti+0.150
aney+0.146
Neg logit contributions
 asking-0.135
 challenged-0.124
 swoop-0.123
 le-0.118
 chased-0.118
Token flash
Token ID7644
Feature activation-1.15e-01

Pos logit contributions
 asking+0.026
 swoop+0.024
 challenged+0.024
 le+0.022
 chased+0.022
Neg logit contributions
ill-0.025
ifying-0.023
igsaw-0.022
kin-0.022
 Something-0.021
Tokenlights
Token ID8091
Feature activation-3.15e-01

Pos logit contributions
ill+0.227
igsaw+0.215
aney+0.206
ushes+0.202
kin+0.201
Neg logit contributions
 asking-0.158
 trying-0.134
 le-0.133
 z-0.133
 swoop-0.132
Token that
Token ID326
Feature activation+4.34e-02

Pos logit contributions
ill+0.661
igsaw+0.598
kin+0.578
aney+0.576
etti+0.575
Neg logit contributions
 asking-0.414
 swoop-0.349
 le-0.346
 challenged-0.342
 trying-0.334
Token other
Token ID584
Feature activation+9.97e-02

Pos logit contributions
 asking+0.057
 trying+0.047
 going+0.046
 swoop+0.046
 z+0.046
Neg logit contributions
ill-0.094
igsaw-0.090
aney-0.086
ushes-0.083
etti-0.083
Token kids
Token ID3988
Feature activation+3.03e-02

Pos logit contributions
 asking+0.128
 trying+0.105
 le+0.105
 swoop+0.104
 challenged+0.102
Neg logit contributions
ill-0.216
igsaw-0.201
aney-0.193
kin-0.192
etti-0.190
Token had
Token ID550
Feature activation-1.24e-02

Pos logit contributions
 asking+0.045
 le+0.040
 swoop+0.039
 gir+0.039
 ostr+0.038
Neg logit contributions
ill-0.058
kin-0.050
aney-0.049
etti-0.049
igsaw-0.048
Token.
Token ID13
Feature activation-1.53e-02

Pos logit contributions
ill+0.024
ushes+0.021
etti+0.021
kin+0.020
aney+0.020
Neg logit contributions
 asking-0.016
cut-0.015
 le-0.015
 ostr-0.014
 gir-0.014
Token,
Token ID11
Feature activation-2.68e-02

Pos logit contributions
 asking+0.124
 trying+0.101
 le+0.101
 swoop+0.100
 challenged+0.099
Neg logit contributions
ill-0.209
igsaw-0.196
kin-0.187
aney-0.187
etti-0.184
Token they
Token ID484
Feature activation+5.73e-03

Pos logit contributions
ill+0.055
igsaw+0.055
aney+0.050
etti+0.049
kin+0.048
Neg logit contributions
 asking-0.037
 le-0.031
 trying-0.031
 going-0.031
 z-0.031
Token saw
Token ID2497
Feature activation+1.05e-01

Pos logit contributions
 asking+0.006
 le+0.004
 swoop+0.004
 trying+0.004
 challenged+0.004
Neg logit contributions
ill-0.014
igsaw-0.013
kin-0.013
aney-0.013
etti-0.013
Token many
Token ID867
Feature activation+8.45e-02

Pos logit contributions
 asking+0.147
 le+0.125
 trying+0.123
 swoop+0.120
 challenged+0.119
Neg logit contributions
ill-0.216
igsaw-0.201
kin-0.191
aney-0.190
etti-0.188
Token flash
Token ID7644
Feature activation-1.14e-01

Pos logit contributions
 asking+0.100
 swoop+0.088
 challenged+0.083
 trying+0.080
 chased+0.078
Neg logit contributions
ill-0.182
igsaw-0.171
ushes-0.165
aney-0.164
kin-0.163
Tokenlights
Token ID8091
Feature activation-3.15e-01

Pos logit contributions
ill+0.194
igsaw+0.190
aney+0.179
aire+0.173
kin+0.173
Neg logit contributions
 asking-0.185
 z-0.166
 animals-0.165
 trucks-0.164
 trying-0.163
Token.
Token ID13
Feature activation-7.50e-03

Pos logit contributions
ill+0.603
etti+0.525
igsaw+0.520
aney+0.503
cake+0.501
Neg logit contributions
 asking-0.447
 swoop-0.393
 le-0.391
 challenged-0.389
cut-0.389
Token Some
Token ID2773
Feature activation+8.29e-02

Pos logit contributions
ill+0.014
kin+0.013
igsaw+0.013
ola+0.012
etti+0.012
Neg logit contributions
 asking-0.012
 le-0.011
 trucks-0.010
 challenged-0.010
 gir-0.010
Token were
Token ID547
Feature activation-9.23e-02

Pos logit contributions
 asking+0.136
 swoop+0.117
 trying+0.116
 le+0.116
 challenged+0.115
Neg logit contributions
ill-0.149
igsaw-0.135
aney-0.129
kin-0.128
etti-0.126
Token big
Token ID1263
Feature activation-6.81e-03

Pos logit contributions
ill+0.185
igsaw+0.165
ushes+0.164
aney+0.163
kin+0.162
Neg logit contributions
 asking-0.123
 challenged-0.107
 swoop-0.105
 le-0.104
cut-0.102
Token,
Token ID11
Feature activation-2.62e-02

Pos logit contributions
ill+0.014
igsaw+0.013
etti+0.012
kin+0.012
ushes+0.012
Neg logit contributions
 asking-0.009
 swoop-0.008
 challenged-0.008
cut-0.007
 le-0.007
Token big
Token ID1263
Feature activation-1.06e-02

Pos logit contributions
 asking+0.012
 le+0.010
 trying+0.010
 swoop+0.010
 challenged+0.010
Neg logit contributions
ill-0.015
igsaw-0.013
aney-0.013
kin-0.013
etti-0.013
Token box
Token ID3091
Feature activation-4.40e-02

Pos logit contributions
ill+0.019
igsaw+0.017
aney+0.017
kin+0.016
etti+0.016
Neg logit contributions
 asking-0.017
 trying-0.015
 le-0.015
 swoop-0.015
 z-0.015
Token full
Token ID1336
Feature activation-1.34e-01

Pos logit contributions
ill+0.079
igsaw+0.072
aney+0.068
kin+0.068
etti+0.067
Neg logit contributions
 asking-0.072
 swoop-0.062
 le-0.062
 trying-0.062
 challenged-0.061
Token of
Token ID286
Feature activation-1.21e-01

Pos logit contributions
ill+0.207
igsaw+0.185
kin+0.175
aney+0.175
etti+0.170
Neg logit contributions
 asking-0.251
 swoop-0.222
 le-0.220
 trying-0.219
 challenged-0.219
Token flash
Token ID7644
Feature activation-1.14e-01

Pos logit contributions
ill+0.228
igsaw+0.214
ushes+0.205
kin+0.205
aney+0.202
Neg logit contributions
 asking-0.173
 challenged-0.152
 swoop-0.151
 le-0.143
 trying-0.143
Tokenlights
Token ID8091
Feature activation-3.14e-01

Pos logit contributions
ill+0.247
igsaw+0.236
aney+0.227
kin+0.224
etti+0.219
Neg logit contributions
 asking-0.138
 trying-0.114
 z-0.112
 le-0.112
 ro-0.109
Token,
Token ID11
Feature activation-3.21e-02

Pos logit contributions
ill+0.652
igsaw+0.591
etti+0.569
aney+0.562
kin+0.557
Neg logit contributions
 asking-0.410
 swoop-0.356
 le-0.353
 challenged-0.350
 gir-0.337
Token so
Token ID523
Feature activation-3.82e-02

Pos logit contributions
ill+0.069
igsaw+0.062
kin+0.060
etti+0.060
aney+0.060
Neg logit contributions
 asking-0.040
 swoop-0.034
 challenged-0.034
 le-0.033
 trying-0.032
Token L
Token ID406
Feature activation+1.95e-02

Pos logit contributions
ill+0.065
igsaw+0.059
aney+0.056
kin+0.055
etti+0.054
Neg logit contributions
 asking-0.067
 le-0.058
 trying-0.058
 swoop-0.058
 challenged-0.057
Tokenola
Token ID5708
Feature activation-2.26e-01

Pos logit contributions
 asking+0.050
 trying+0.046
 z+0.046
 challenged+0.046
 le+0.046
Neg logit contributions
igsaw-0.011
etti-0.010
ifying-0.010
ushes-0.010
ill-0.010
Token was
Token ID373
Feature activation-2.32e-02

Pos logit contributions
ill+0.462
kin+0.408
aney+0.404
etti+0.398
igsaw+0.394
Neg logit contributions
 asking-0.302
 swoop-0.249
 le-0.249
 trying-0.245
 trucks-0.243
Token liked
Token ID8288
Feature activation+5.69e-02

Pos logit contributions
 asking+0.026
 trying+0.021
 le+0.021
 swoop+0.021
 telling+0.020
Neg logit contributions
ill-0.041
aney-0.036
kin-0.036
igsaw-0.036
ushes-0.035
Token to
Token ID284
Feature activation+1.03e-01

Pos logit contributions
cut+0.090
have+0.087
 swoop+0.086
cod+0.085
eh+0.085
Neg logit contributions
ill-0.076
 Kate-0.075
igsaw-0.072
 Jill-0.070
ushes-0.068
Token play
Token ID711
Feature activation+2.06e-02

Pos logit contributions
 asking+0.140
 swoop+0.119
 le+0.119
 trying+0.117
 sc+0.116
Neg logit contributions
ill-0.214
igsaw-0.198
aney-0.191
kin-0.187
ushes-0.186
Token with
Token ID351
Feature activation-1.01e-01

Pos logit contributions
 asking+0.032
 swoop+0.027
 le+0.027
 trying+0.027
 challenged+0.026
Neg logit contributions
ill-0.039
igsaw-0.036
kin-0.034
aney-0.034
etti-0.034
Token flash
Token ID7644
Feature activation-1.17e-01

Pos logit contributions
ill+0.197
igsaw+0.182
aney+0.174
kin+0.170
etti+0.168
Neg logit contributions
 asking-0.153
 le-0.131
 trying-0.130
 swoop-0.128
 z-0.127
Tokenlights
Token ID8091
Feature activation-3.13e-01

Pos logit contributions
ill+0.105
igsaw+0.101
aney+0.090
+0.082
aire+0.082
Neg logit contributions
 asking-0.284
 trucks-0.270
 z-0.266
 animals-0.266
 fl-0.262
Token in
Token ID287
Feature activation-9.31e-02

Pos logit contributions
ill+0.641
igsaw+0.572
kin+0.556
aney+0.553
etti+0.548
Neg logit contributions
 asking-0.428
 swoop-0.374
 le-0.366
 challenged-0.363
 trying-0.353
Token the
Token ID262
Feature activation+1.35e-02

Pos logit contributions
ill+0.216
igsaw+0.209
aney+0.205
cake+0.205
kin+0.199
Neg logit contributions
 asking-0.106
 z-0.089
 trying-0.085
 sc-0.083
 fl-0.082
Token dark
Token ID3223
Feature activation+5.13e-03

Pos logit contributions
 asking+0.023
 challenged+0.021
 swoop+0.020
 chased+0.020
 trying+0.020
Neg logit contributions
ill-0.022
igsaw-0.021
ifying-0.019
kin-0.019
ushes-0.019
Token.
Token ID13
Feature activation-7.61e-03

Pos logit contributions
 asking+0.005
 z+0.004
 le+0.004
 trucks+0.004
 trying+0.004
Neg logit contributions
ill-0.012
igsaw-0.011
aney-0.011
kin-0.011
ushes-0.011
Token They
Token ID1119
Feature activation+1.88e-02

Pos logit contributions
ill+0.014
igsaw+0.014
kin+0.014
etti+0.013
cake+0.013
Neg logit contributions
 asking-0.012
 le-0.010
 trying-0.010
 challenged-0.010
 z-0.009
Token.
Token ID13
Feature activation-8.66e-03

Pos logit contributions
 swoop+0.051
 gir+0.050
 asking+0.050
cut+0.048
 ostr+0.048
Neg logit contributions
ill-0.042
kin-0.033
 Kate-0.032
ola-0.031
etti-0.031
Token They
Token ID1119
Feature activation+2.18e-02

Pos logit contributions
ill+0.016
igsaw+0.016
kin+0.015
etti+0.015
aney+0.014
Neg logit contributions
 asking-0.013
 le-0.012
 challenged-0.011
 trying-0.011
 asked-0.011
Token find
Token ID1064
Feature activation-9.51e-03

Pos logit contributions
 asking+0.035
 le+0.030
cut+0.030
 causing+0.029
 trying+0.029
Neg logit contributions
ill-0.039
igsaw-0.033
aney-0.032
ola-0.032
aire-0.032
Token some
Token ID617
Feature activation-5.15e-02

Pos logit contributions
ill+0.022
igsaw+0.021
aney+0.020
etti+0.019
cake+0.019
Neg logit contributions
 asking-0.012
 le-0.010
 z-0.010
 trying-0.010
 fl-0.010
Token flash
Token ID7644
Feature activation-1.12e-01

Pos logit contributions
ill+0.098
igsaw+0.091
kin+0.090
ushes+0.086
aney+0.085
Neg logit contributions
 asking-0.074
 swoop-0.065
 challenged-0.063
 trying-0.061
 gir-0.060
Tokenlights
Token ID8091
Feature activation-3.11e-01

Pos logit contributions
igsaw+0.134
ill+0.128
+0.118
aney+0.116
aire+0.116
Neg logit contributions
 asking-0.232
 fl-0.222
 z-0.221
 trucks-0.218
 sc-0.215
Token and
Token ID290
Feature activation+1.48e-02

Pos logit contributions
ill+0.649
igsaw+0.573
kin+0.561
aney+0.558
etti+0.556
Neg logit contributions
 asking-0.415
 le-0.350
 swoop-0.344
 challenged-0.343
 gir-0.341
Token tape
Token ID9154
Feature activation-3.05e-02

Pos logit contributions
 asking+0.021
 swoop+0.018
 le+0.018
 trying+0.018
 challenged+0.018
Neg logit contributions
ill-0.029
igsaw-0.027
kin-0.026
aney-0.025
etti-0.025
Token them
Token ID606
Feature activation-4.50e-02

Pos logit contributions
ill+0.061
kin+0.054
ushes+0.052
etti+0.052
igsaw+0.052
Neg logit contributions
 asking-0.044
 challenged-0.037
 le-0.037
 swoop-0.037
 gir-0.037
Token to
Token ID284
Feature activation+1.05e-01

Pos logit contributions
ill+0.084
igsaw+0.074
ushes+0.073
kin+0.072
etti+0.072
Neg logit contributions
 asking-0.070
 le-0.060
 challenged-0.059
 gir-0.058
 swoop-0.058
Token their
Token ID511
Feature activation-2.34e-02

Pos logit contributions
 asking+0.138
 z+0.125
 fl+0.121
 swoop+0.121
 sc+0.117
Neg logit contributions
ill-0.224
igsaw-0.220
aney-0.214
cake-0.203
ushes-0.200
Token the
Token ID262
Feature activation+2.35e-02

Pos logit contributions
ill+0.202
igsaw+0.190
kin+0.186
aney+0.185
cake+0.184
Neg logit contributions
 asking-0.111
 z-0.090
 trying-0.089
 le-0.086
 sc-0.085
Token tent
Token ID11105
Feature activation+1.79e-02

Pos logit contributions
 asking+0.039
 swoop+0.036
 challenged+0.035
cut+0.033
 le+0.033
Neg logit contributions
ill-0.039
igsaw-0.037
ushes-0.035
kin-0.035
ifying-0.034
Token with
Token ID351
Feature activation-9.21e-02

Pos logit contributions
 asking+0.021
 swoop+0.017
 le+0.017
 trying+0.017
 challenged+0.017
Neg logit contributions
ill-0.040
igsaw-0.037
aney-0.036
kin-0.036
etti-0.035
Token their
Token ID511
Feature activation-2.38e-02

Pos logit contributions
ill+0.179
igsaw+0.170
cake+0.159
aney+0.159
aire+0.155
Neg logit contributions
 asking-0.140
 le-0.130
 z-0.128
 sc-0.126
 trucks-0.125
Token flash
Token ID7644
Feature activation-1.10e-01

Pos logit contributions
ill+0.041
igsaw+0.038
kin+0.038
ushes+0.037
aney+0.036
Neg logit contributions
 asking-0.039
 challenged-0.035
 swoop-0.034
 trying-0.033
cut-0.032
Tokenlights
Token ID8091
Feature activation-3.10e-01

Pos logit contributions
igsaw+0.107
ill+0.106
aney+0.091
 Something+0.091
aire+0.091
Neg logit contributions
 asking-0.251
 z-0.236
 fl-0.236
 sc-0.234
 ro-0.231
Token and
Token ID290
Feature activation+1.71e-02

Pos logit contributions
ill+0.661
igsaw+0.600
aney+0.579
kin+0.578
etti+0.574
Neg logit contributions
 asking-0.399
 le-0.336
 swoop-0.336
 challenged-0.332
 trying-0.324
Token their
Token ID511
Feature activation-2.31e-02

Pos logit contributions
 asking+0.025
 trying+0.022
 swoop+0.021
 le+0.021
 z+0.021
Neg logit contributions
ill-0.034
igsaw-0.031
aney-0.030
kin-0.029
etti-0.029
Token t
Token ID256
Feature activation-4.49e-02

Pos logit contributions
ill+0.038
kin+0.036
igsaw+0.036
ushes+0.035
aney+0.034
Neg logit contributions
 asking-0.039
 challenged-0.035
 swoop-0.034
 trying-0.033
 le-0.033
Tokeneddy
Token ID21874
Feature activation-1.92e-01

Pos logit contributions
igsaw+0.054
ill+0.054
kin+0.048
aney+0.046
 Something+0.045
Neg logit contributions
 asking-0.089
ep-0.084
 trucks-0.083
 z-0.082
 sc-0.082
Token bears
Token ID13062
Feature activation+5.17e-02

Pos logit contributions
ill+0.398
kin+0.383
etti+0.372
igsaw+0.363
ushes+0.361
Neg logit contributions
 asking-0.224
 challenged-0.194
 swoop-0.183
 le-0.181
 trying-0.177
Token and
Token ID290
Feature activation-6.64e-04

Pos logit contributions
ill+0.213
igsaw+0.198
kin+0.190
aney+0.189
etti+0.186
Neg logit contributions
 asking-0.124
 swoop-0.103
 le-0.101
 challenged-0.101
 trying-0.101
Token picked
Token ID6497
Feature activation+1.31e-01

Pos logit contributions
ill+0.001
igsaw+0.001
kin+0.001
aney+0.001
etti+0.001
Neg logit contributions
 asking-0.001
 swoop-0.001
 le-0.001
 trying-0.001
 challenged-0.001
Token up
Token ID510
Feature activation-9.91e-02

Pos logit contributions
 asking+0.136
 le+0.113
 trying+0.110
 z+0.106
 fl+0.103
Neg logit contributions
ill-0.323
igsaw-0.299
aney-0.297
etti-0.293
kin-0.292
Token the
Token ID262
Feature activation+7.06e-03

Pos logit contributions
ill+0.211
igsaw+0.194
aney+0.190
etti+0.187
kin+0.187
Neg logit contributions
 asking-0.131
 le-0.112
 trying-0.109
 z-0.106
 sc-0.105
Token nap
Token ID25422
Feature activation-6.22e-02

Pos logit contributions
 asking+0.012
 trying+0.011
 le+0.011
 swoop+0.010
 z+0.010
Neg logit contributions
ill-0.012
igsaw-0.011
aney-0.011
etti-0.010
kin-0.010
Tokenkin
Token ID5116
Feature activation-3.03e-01

Pos logit contributions
ill+0.035
igsaw+0.030
etti+0.026
aney+0.025
ifying+0.021
Neg logit contributions
 asking-0.167
 trying-0.158
 le-0.157
 fl-0.155
 z-0.155
Token.
Token ID13
Feature activation-2.63e-02

Pos logit contributions
ill+0.651
igsaw+0.618
aney+0.587
etti+0.585
kin+0.581
Neg logit contributions
 asking-0.377
 trying-0.313
 le-0.308
 swoop-0.304
 z-0.296
Token She
Token ID1375
Feature activation-2.31e-02

Pos logit contributions
ill+0.052
kin+0.050
igsaw+0.047
aney+0.046
ifying+0.045
Neg logit contributions
 asking-0.040
 le-0.033
 trying-0.032
 challenged-0.030
 z-0.030
Token put
Token ID1234
Feature activation-1.14e-02

Pos logit contributions
ill+0.049
igsaw+0.045
aney+0.043
kin+0.043
etti+0.043
Neg logit contributions
 asking-0.031
 swoop-0.026
 le-0.026
 trying-0.026
 challenged-0.025
Token it
Token ID340
Feature activation-7.20e-02

Pos logit contributions
ill+0.029
igsaw+0.028
aney+0.028
etti+0.026
aire+0.026
Neg logit contributions
 asking-0.011
 trying-0.009
 le-0.009
 z-0.009
 fl-0.009
Token in
Token ID287
Feature activation-9.69e-02

Pos logit contributions
ill+0.144
igsaw+0.135
aney+0.129
etti+0.126
kin+0.125
Neg logit contributions
 asking-0.102
 trying-0.088
 swoop-0.087
 le-0.085
 z-0.085
Token it
Token ID340
Feature activation-7.34e-02

Pos logit contributions
ill+0.150
igsaw+0.140
aney+0.135
kin+0.132
etti+0.131
Neg logit contributions
 asking-0.092
 le-0.077
 trying-0.077
 swoop-0.075
 z-0.073
Token up
Token ID510
Feature activation-1.01e-01

Pos logit contributions
ill+0.135
igsaw+0.124
kin+0.120
ushes+0.117
aney+0.117
Neg logit contributions
 asking-0.115
 le-0.099
 ostr-0.099
 challenged-0.098
 swoop-0.098
Token in
Token ID287
Feature activation-9.77e-02

Pos logit contributions
ill+0.212
igsaw+0.198
aney+0.190
etti+0.187
kin+0.187
Neg logit contributions
 asking-0.133
 trying-0.112
 le-0.110
 swoop-0.110
 z-0.106
Token a
Token ID257
Feature activation+1.38e-03

Pos logit contributions
ill+0.208
igsaw+0.192
aney+0.188
kin+0.185
etti+0.184
Neg logit contributions
 asking-0.131
 trying-0.108
 le-0.107
 swoop-0.105
 z-0.103
Token nap
Token ID25422
Feature activation-5.92e-02

Pos logit contributions
 asking+0.002
 challenged+0.002
 swoop+0.002
 trying+0.002
 le+0.002
Neg logit contributions
ill-0.002
igsaw-0.002
kin-0.002
aney-0.002
etti-0.002
Tokenkin
Token ID5116
Feature activation-3.01e-01

Pos logit contributions
ill+0.032
igsaw+0.023
aney+0.019
etti+0.017
ushes+0.016
Neg logit contributions
 asking-0.171
 trying-0.157
 le-0.157
 swoop-0.157
 challenged-0.155
Token.
Token ID13
Feature activation-2.37e-02

Pos logit contributions
ill+0.595
kin+0.535
igsaw+0.511
cake+0.495
aney+0.494
Neg logit contributions
 asking-0.446
 swoop-0.400
 challenged-0.393
 le-0.381
 ostr-0.379
Token This
Token ID770
Feature activation+2.61e-02

Pos logit contributions
ill+0.036
kin+0.033
igsaw+0.032
aney+0.031
ushes+0.030
Neg logit contributions
 asking-0.045
 trying-0.040
 le-0.039
 challenged-0.039
 swoop-0.038
Token sandwich
Token ID20433
Feature activation-1.06e-01

Pos logit contributions
 asking+0.034
 swoop+0.028
 challenged+0.027
 trying+0.027
cut+0.027
Neg logit contributions
ill-0.056
igsaw-0.051
kin-0.051
aney-0.049
ushes-0.048
Token was
Token ID373
Feature activation-2.67e-02

Pos logit contributions
ill+0.236
igsaw+0.225
aney+0.213
etti+0.209
ushes+0.209
Neg logit contributions
 asking-0.126
 le-0.106
 trying-0.106
 swoop-0.105
 sc-0.102
Token so
Token ID523
Feature activation-4.40e-02

Pos logit contributions
ill+0.050
igsaw+0.045
kin+0.045
aney+0.044
ushes+0.043
Neg logit contributions
 asking-0.041
 challenged-0.035
 swoop-0.035
 le-0.034
 trying-0.034
Token playful
Token ID34264
Feature activation+6.05e-02

Pos logit contributions
 asking+0.013
 swoop+0.011
 challenged+0.011
 chased+0.010
 begged+0.010
Neg logit contributions
igsaw-0.014
ill-0.014
kin-0.014
aney-0.013
ushes-0.013
Token dog
Token ID3290
Feature activation+1.32e-01

Pos logit contributions
 asking+0.074
 swoop+0.063
 challenged+0.061
cut+0.059
 trying+0.059
Neg logit contributions
ill-0.129
igsaw-0.123
kin-0.117
aney-0.115
etti-0.114
Token and
Token ID290
Feature activation+1.53e-03

Pos logit contributions
 asking+0.190
 swoop+0.174
 challenged+0.168
cut+0.165
 bulldo+0.162
Neg logit contributions
ill-0.246
igsaw-0.215
kin-0.212
aney-0.207
etti-0.206
Token the
Token ID262
Feature activation+8.40e-03

Pos logit contributions
 asking+0.002
 swoop+0.002
 le+0.002
 challenged+0.002
 trying+0.002
Neg logit contributions
ill-0.003
igsaw-0.003
kin-0.003
aney-0.003
ushes-0.003
Token nap
Token ID25422
Feature activation-5.96e-02

Pos logit contributions
 asking+0.012
 challenged+0.010
 swoop+0.010
cut+0.009
 trying+0.009
Neg logit contributions
ill-0.016
igsaw-0.016
kin-0.015
aney-0.014
ushes-0.014
Tokenkin
Token ID5116
Feature activation-3.00e-01

Pos logit contributions
ill+0.073
igsaw+0.066
aney+0.060
etti+0.059
ushes+0.058
Neg logit contributions
 asking-0.129
 trying-0.117
 le-0.116
 swoop-0.114
 z-0.114
Token,
Token ID11
Feature activation-3.79e-02

Pos logit contributions
ill+0.602
igsaw+0.537
kin+0.535
aney+0.526
etti+0.510
Neg logit contributions
 asking-0.429
 le-0.362
 swoop-0.360
 challenged-0.360
 trying-0.351
Token and
Token ID290
Feature activation+3.30e-03

Pos logit contributions
ill+0.082
igsaw+0.075
aney+0.072
kin+0.072
etti+0.071
Neg logit contributions
 asking-0.048
 swoop-0.040
 le-0.039
 challenged-0.039
 trying-0.038
Token they
Token ID484
Feature activation-3.60e-03

Pos logit contributions
 asking+0.005
 swoop+0.004
 le+0.004
 challenged+0.004
 trying+0.004
Neg logit contributions
ill-0.006
igsaw-0.006
kin-0.006
aney-0.005
ushes-0.005
Token both
Token ID1111
Feature activation-1.14e-01

Pos logit contributions
ill+0.007
igsaw+0.006
aney+0.006
kin+0.006
etti+0.006
Neg logit contributions
 asking-0.005
 le-0.005
 trying-0.005
 swoop-0.005
 challenged-0.005
Token shared
Token ID4888
Feature activation-9.59e-02

Pos logit contributions
ill+0.236
igsaw+0.222
aney+0.209
kin+0.208
etti+0.208
Neg logit contributions
 asking-0.155
 swoop-0.132
 le-0.130
 trying-0.129
 challenged-0.128
Token how
Token ID703
Feature activation-3.13e-02

Pos logit contributions
 asking+0.067
 trying+0.057
 le+0.056
 going+0.055
 sc+0.054
Neg logit contributions
ill-0.108
igsaw-0.100
aney-0.097
kin-0.097
etti-0.097
Token to
Token ID284
Feature activation+9.76e-02

Pos logit contributions
ill+0.069
igsaw+0.065
aney+0.062
kin+0.062
ushes+0.061
Neg logit contributions
 asking-0.038
 swoop-0.031
 le-0.030
 trying-0.030
 challenged-0.030
Token use
Token ID779
Feature activation+6.73e-03

Pos logit contributions
 asking+0.138
 swoop+0.118
 le+0.118
 trying+0.117
 z+0.115
Neg logit contributions
ill-0.197
igsaw-0.184
aney-0.177
ushes-0.173
kin-0.171
Token a
Token ID257
Feature activation-1.18e-04

Pos logit contributions
 asking+0.009
 trying+0.007
 le+0.007
 fl+0.007
 z+0.007
Neg logit contributions
ill-0.015
aney-0.014
igsaw-0.014
etti-0.014
kin-0.014
Token nap
Token ID25422
Feature activation-5.97e-02

Pos logit contributions
ill+0.000
aney+0.000
igsaw+0.000
etti+0.000
ushes+0.000
Neg logit contributions
 asking-0.000
 le-0.000
 trying-0.000
 z-0.000
 fl-0.000
Tokenkin
Token ID5116
Feature activation-3.00e-01

Pos logit contributions
ill+0.061
igsaw+0.058
aney+0.054
etti+0.052
ushes+0.050
Neg logit contributions
 asking-0.138
 trying-0.129
 fl-0.125
 z-0.124
 le-0.123
Token to
Token ID284
Feature activation+9.79e-02

Pos logit contributions
ill+0.666
igsaw+0.628
aney+0.602
kin+0.596
etti+0.596
Neg logit contributions
 asking-0.355
 trying-0.288
 le-0.285
 swoop-0.284
 challenged-0.277
Token cut
Token ID2005
Feature activation+2.37e-02

Pos logit contributions
 asking+0.130
 z+0.114
 fl+0.114
 swoop+0.113
 trying+0.113
Neg logit contributions
ill-0.205
igsaw-0.196
aney-0.191
ushes-0.186
etti-0.180
Token her
Token ID607
Feature activation-3.25e-02

Pos logit contributions
 asking+0.027
 le+0.024
 trying+0.023
 z+0.022
 fl+0.022
Neg logit contributions
ill-0.053
igsaw-0.051
aney-0.050
etti-0.049
aire-0.046
Token food
Token ID2057
Feature activation-3.80e-02

Pos logit contributions
ill+0.068
igsaw+0.061
aney+0.061
etti+0.057
ola+0.057
Neg logit contributions
 asking-0.045
 fl-0.040
 trying-0.040
 le-0.039
 z-0.039
Token.
Token ID13
Feature activation-2.62e-02

Pos logit contributions
ill+0.083
igsaw+0.079
aney+0.076
etti+0.075
ushes+0.074
Neg logit contributions
 asking-0.046
 trying-0.038
 le-0.037
 swoop-0.037
 z-0.037
Token her
Token ID607
Feature activation-3.24e-02

Pos logit contributions
ill+0.297
igsaw+0.284
etti+0.272
ushes+0.270
aney+0.270
Neg logit contributions
 asking-0.166
 trying-0.138
 le-0.133
 sc-0.130
 z-0.128
Token dress
Token ID6576
Feature activation-1.70e-01

Pos logit contributions
ill+0.056
igsaw+0.054
kin+0.052
aney+0.050
etti+0.049
Neg logit contributions
 asking-0.052
 swoop-0.048
 challenged-0.046
 chased-0.044
 le-0.044
Token with
Token ID351
Feature activation-1.08e-01

Pos logit contributions
ill+0.394
igsaw+0.380
etti+0.364
aney+0.364
ushes+0.361
Neg logit contributions
 asking-0.177
 trying-0.144
 sc-0.140
 le-0.139
 z-0.134
Token a
Token ID257
Feature activation-5.56e-04

Pos logit contributions
ill+0.226
igsaw+0.212
aney+0.204
etti+0.201
kin+0.197
Neg logit contributions
 asking-0.149
 trying-0.128
 le-0.126
 sc-0.121
 swoop-0.120
Token nap
Token ID25422
Feature activation-6.01e-02

Pos logit contributions
ill+0.001
igsaw+0.001
aney+0.001
etti+0.001
kin+0.001
Neg logit contributions
 asking-0.001
 trying-0.001
 le-0.001
 swoop-0.001
 z-0.001
Tokenkin
Token ID5116
Feature activation-3.00e-01

Pos logit contributions
ill+0.034
igsaw+0.028
aney+0.023
etti+0.021
ushes+0.021
Neg logit contributions
 asking-0.171
 trying-0.159
 le-0.156
 z-0.155
 sc-0.155
Token,
Token ID11
Feature activation-3.86e-02

Pos logit contributions
ill+0.640
igsaw+0.574
kin+0.569
aney+0.555
etti+0.541
Neg logit contributions
 asking-0.395
 swoop-0.336
 challenged-0.331
 le-0.326
 trying-0.317
Token but
Token ID475
Feature activation+5.70e-02

Pos logit contributions
ill+0.069
kin+0.058
aney+0.057
igsaw+0.056
cake+0.055
Neg logit contributions
 asking-0.055
 swoop-0.054
 challenged-0.053
 gir-0.052
 ostr-0.052
Token it
Token ID340
Feature activation-7.31e-02

Pos logit contributions
 asking+0.076
 trying+0.065
 le+0.062
 z+0.060
 going+0.060
Neg logit contributions
ill-0.124
igsaw-0.120
aney-0.114
etti-0.112
ushes-0.111
Token didn
Token ID1422
Feature activation+1.36e-01

Pos logit contributions
ill+0.149
kin+0.136
igsaw+0.130
aney+0.128
ola+0.125
Neg logit contributions
 asking-0.104
 gir-0.083
 swoop-0.083
 challenged-0.083
 trying-0.083
Token't
Token ID470
Feature activation+5.01e-02

Pos logit contributions
 asking+0.179
 trying+0.154
 le+0.153
 going+0.150
 swoop+0.149
Neg logit contributions
ill-0.283
igsaw-0.261
aney-0.254
kin-0.252
ushes-0.249
Token grateful
Token ID14066
Feature activation-1.63e-02

Pos logit contributions
ill+0.079
igsaw+0.073
aney+0.069
kin+0.069
etti+0.068
Neg logit contributions
 asking-0.063
 trying-0.054
 le-0.054
 swoop-0.054
 challenged-0.053
Token for
Token ID329
Feature activation-3.42e-02

Pos logit contributions
ill+0.034
igsaw+0.031
kin+0.030
aney+0.029
cake+0.029
Neg logit contributions
 asking-0.021
 swoop-0.019
 challenged-0.018
 chased-0.018
 le-0.017
Token the
Token ID262
Feature activation+8.37e-03

Pos logit contributions
ill+0.082
igsaw+0.076
aney+0.075
aire+0.074
ushes+0.074
Neg logit contributions
 asking-0.041
 trying-0.033
 telling-0.031
 sc-0.031
 le-0.031
Token special
Token ID2041
Feature activation-5.66e-02

Pos logit contributions
 asking+0.015
 swoop+0.013
 challenged+0.013
 chased+0.012
 trying+0.012
Neg logit contributions
ill-0.014
igsaw-0.013
kin-0.012
aney-0.012
cake-0.012
Token nap
Token ID25422
Feature activation-5.82e-02

Pos logit contributions
ill+0.091
igsaw+0.082
aney+0.077
kin+0.077
etti+0.076
Neg logit contributions
 asking-0.104
 le-0.091
 swoop-0.091
 trying-0.091
 challenged-0.090
Tokenkin
Token ID5116
Feature activation-3.00e-01

Pos logit contributions
ill+0.058
igsaw+0.050
aney+0.046
ushes+0.046
etti+0.045
Neg logit contributions
 asking-0.138
 trying-0.128
 le-0.126
 z-0.124
 ro-0.124
Token.
Token ID13
Feature activation-2.40e-02

Pos logit contributions
ill+0.672
igsaw+0.627
aney+0.602
kin+0.599
etti+0.594
Neg logit contributions
 asking-0.355
 trying-0.287
 le-0.286
 swoop-0.286
 challenged-0.279
Token
Token ID220
Feature activation-3.72e-02

Pos logit contributions
ill+0.042
igsaw+0.039
kin+0.038
aney+0.037
etti+0.036
Neg logit contributions
 asking-0.041
 le-0.035
 trying-0.035
 challenged-0.034
 swoop-0.034
Token
Token ID198
Feature activation-4.14e-02

Pos logit contributions
ill+0.069
igsaw+0.067
kin+0.065
etti+0.062
aney+0.062
Neg logit contributions
 asking-0.054
 trying-0.045
 le-0.045
 challenged-0.044
 z-0.044
Token
Token ID198
Feature activation-4.05e-02

Pos logit contributions
ill+0.070
igsaw+0.065
kin+0.061
aney+0.060
etti+0.060
Neg logit contributions
 asking-0.072
 le-0.062
 swoop-0.062
 trying-0.062
 challenged-0.061
TokenTom
Token ID13787
Feature activation+8.14e-02

Pos logit contributions
ill+0.068
igsaw+0.061
aney+0.059
kin+0.058
cake+0.057
Neg logit contributions
 asking-0.072
 trying-0.062
 ro-0.062
 swoop-0.062
 le-0.062
Tokenmy
Token ID1820
Feature activation+2.01e-02

Pos logit contributions
ill+0.230
aney+0.197
igsaw+0.197
etti+0.193
kin+0.189
Neg logit contributions
 asking-0.166
 swoop-0.143
 le-0.143
 challenged-0.141
 trying-0.141
Token mixed
Token ID7668
Feature activation-4.16e-02

Pos logit contributions
 asking+0.028
 swoop+0.023
 le+0.023
 challenged+0.023
 ostr+0.023
Neg logit contributions
ill-0.042
etti-0.036
aney-0.036
igsaw-0.036
kin-0.035
Token the
Token ID262
Feature activation+1.08e-02

Pos logit contributions
ill+0.104
igsaw+0.100
aney+0.098
ifying+0.098
cake+0.097
Neg logit contributions
 asking-0.036
 le-0.033
 z-0.030
 fl-0.030
 ro-0.028
Token beautiful
Token ID4950
Feature activation-1.15e-01

Pos logit contributions
 asking+0.020
 swoop+0.017
 le+0.017
 trying+0.017
 challenged+0.017
Neg logit contributions
ill-0.018
igsaw-0.016
kin-0.015
aney-0.015
etti-0.015
Token nap
Token ID25422
Feature activation-5.81e-02

Pos logit contributions
ill+0.217
igsaw+0.197
aney+0.190
kin+0.185
etti+0.185
Neg logit contributions
 asking-0.180
 le-0.155
 trying-0.154
 swoop-0.153
 z-0.150
Tokenkin
Token ID5116
Feature activation-3.00e-01

Pos logit contributions
ill+0.042
igsaw+0.035
aney+0.033
etti+0.032
ushes+0.030
Neg logit contributions
 asking-0.147
 trying-0.139
 fl-0.137
 le-0.137
 z-0.135
Token with
Token ID351
Feature activation-1.07e-01

Pos logit contributions
ill+0.662
igsaw+0.655
etti+0.628
ushes+0.614
aney+0.614
Neg logit contributions
 asking-0.310
 trying-0.268
 le-0.251
 sc-0.250
 fl-0.249
Token colorful
Token ID20239
Feature activation-6.94e-02

Pos logit contributions
ill+0.228
igsaw+0.219
kin+0.209
aney+0.208
etti+0.202
Neg logit contributions
 asking-0.139
 le-0.122
 trying-0.120
 z-0.116
 fl-0.115
Token water
Token ID1660
Feature activation+1.00e-02

Pos logit contributions
ill+0.131
igsaw+0.120
kin+0.114
aney+0.114
etti+0.113
Neg logit contributions
 asking-0.107
 swoop-0.091
 trying-0.091
 le-0.090
 challenged-0.090
Token.
Token ID13
Feature activation-2.96e-02

Pos logit contributions
 asking+0.013
 swoop+0.010
 trying+0.010
 le+0.010
 challenged+0.010
Neg logit contributions
ill-0.022
igsaw-0.020
kin-0.019
aney-0.019
etti-0.019
Token The
Token ID383
Feature activation+2.62e-02

Pos logit contributions
ill+0.056
kin+0.055
igsaw+0.050
aney+0.048
etti+0.048
Neg logit contributions
 asking-0.046
 le-0.038
 trying-0.038
 telling-0.036
 going-0.036
Token waved
Token ID26834
Feature activation-2.92e-02

Pos logit contributions
 asking+0.123
cut+0.111
 causing+0.109
 trucks+0.105
 le+0.105
Neg logit contributions
ill-0.157
aney-0.129
ola-0.123
cake-0.121
aire-0.121
Token goodbye
Token ID24829
Feature activation-8.15e-02

Pos logit contributions
ill+0.050
igsaw+0.046
aney+0.044
kin+0.044
etti+0.043
Neg logit contributions
 asking-0.050
 le-0.044
 swoop-0.044
 trying-0.044
 challenged-0.043
Token as
Token ID355
Feature activation-2.03e-02

Pos logit contributions
ill+0.177
igsaw+0.163
etti+0.156
aney+0.156
kin+0.155
Neg logit contributions
 asking-0.098
 swoop-0.081
 challenged-0.081
 le-0.080
 trying-0.079
Token the
Token ID262
Feature activation+8.79e-03

Pos logit contributions
ill+0.050
igsaw+0.047
aney+0.046
etti+0.045
ushes+0.045
Neg logit contributions
 asking-0.021
 le-0.017
 z-0.017
 trying-0.016
 swoop-0.016
Token nap
Token ID25422
Feature activation-5.78e-02

Pos logit contributions
 asking+0.015
 challenged+0.013
 swoop+0.013
cut+0.012
 chased+0.012
Neg logit contributions
ill-0.015
igsaw-0.014
kin-0.014
cake-0.013
ushes-0.013
Tokenkin
Token ID5116
Feature activation-3.00e-01

Pos logit contributions
ill+0.048
igsaw+0.041
aney+0.036
ushes+0.034
etti+0.034
Neg logit contributions
 asking-0.149
 trying-0.137
 le-0.136
 swoop-0.135
 challenged-0.134
Token slowly
Token ID6364
Feature activation+4.57e-03

Pos logit contributions
ill+0.586
kin+0.532
aney+0.499
cake+0.495
igsaw+0.492
Neg logit contributions
 asking-0.446
 le-0.381
 trying-0.370
 challenged-0.364
 trucks-0.362
Token continued
Token ID3767
Feature activation-9.24e-02

Pos logit contributions
 asking+0.006
have+0.006
 trucks+0.005
cod+0.005
 challenged+0.005
Neg logit contributions
ill-0.009
kin-0.008
cake-0.008
igsaw-0.008
etti-0.007
Token its
Token ID663
Feature activation-6.50e-02

Pos logit contributions
ill+0.168
igsaw+0.158
kin+0.153
etti+0.146
ushes+0.143
Neg logit contributions
 asking-0.122
 challenged-0.118
 ostr-0.118
have-0.116
 le-0.115
Token journey
Token ID7002
Feature activation+8.02e-02

Pos logit contributions
ill+0.109
igsaw+0.107
kin+0.099
ushes+0.093
etti+0.089
Neg logit contributions
 asking-0.104
 challenged-0.101
have-0.091
cut-0.090
 chased-0.089
Token up
Token ID510
Feature activation-9.86e-02

Pos logit contributions
 asking+0.106
 le+0.088
 swoop+0.088
 challenged+0.087
 trying+0.087
Neg logit contributions
ill-0.169
igsaw-0.155
kin-0.149
aney-0.148
etti-0.146
Token colorful
Token ID20239
Feature activation-6.94e-02

Pos logit contributions
ill+0.228
igsaw+0.219
kin+0.209
aney+0.208
etti+0.202
Neg logit contributions
 asking-0.139
 le-0.122
 trying-0.120
 z-0.116
 fl-0.115
Token water
Token ID1660
Feature activation+1.00e-02

Pos logit contributions
ill+0.131
igsaw+0.120
kin+0.114
aney+0.114
etti+0.113
Neg logit contributions
 asking-0.107
 swoop-0.091
 trying-0.091
 le-0.090
 challenged-0.090
Token.
Token ID13
Feature activation-2.96e-02

Pos logit contributions
 asking+0.013
 swoop+0.010
 trying+0.010
 le+0.010
 challenged+0.010
Neg logit contributions
ill-0.022
igsaw-0.020
kin-0.019
aney-0.019
etti-0.019
Token The
Token ID383
Feature activation+2.62e-02

Pos logit contributions
ill+0.056
kin+0.055
igsaw+0.050
aney+0.048
etti+0.048
Neg logit contributions
 asking-0.046
 le-0.038
 trying-0.038
 telling-0.036
 going-0.036
Token nap
Token ID25422
Feature activation-5.76e-02

Pos logit contributions
 asking+0.041
 swoop+0.036
 challenged+0.035
 trying+0.034
 le+0.034
Neg logit contributions
ill-0.049
igsaw-0.045
kin-0.045
aney-0.042
ushes-0.042
Tokenkin
Token ID5116
Feature activation-2.99e-01

Pos logit contributions
ill+0.061
igsaw+0.060
aney+0.053
ifying+0.051
etti+0.050
Neg logit contributions
 asking-0.125
 trying-0.119
 fl-0.119
 ro-0.118
 z-0.117
Token began
Token ID2540
Feature activation+1.29e-03

Pos logit contributions
ill+0.610
igsaw+0.576
aney+0.540
etti+0.536
ushes+0.529
Neg logit contributions
 asking-0.409
 trying-0.349
 swoop-0.348
 le-0.346
 z-0.336
Token to
Token ID284
Feature activation+9.96e-02

Pos logit contributions
 asking+0.001
 challenged+0.001
 swoop+0.001
 le+0.001
cut+0.001
Neg logit contributions
ill-0.003
igsaw-0.003
aney-0.003
kin-0.003
etti-0.003
Token change
Token ID1487
Feature activation-1.70e-01

Pos logit contributions
 asking+0.165
 swoop+0.147
 trying+0.145
 z+0.144
 fl+0.143
Neg logit contributions
ill-0.174
igsaw-0.166
aney-0.157
ushes-0.155
kin-0.151
Token color
Token ID3124
Feature activation-1.28e-01

Pos logit contributions
ill+0.386
igsaw+0.373
aney+0.360
etti+0.356
cake+0.354
Neg logit contributions
 asking-0.192
 trying-0.157
 fl-0.151
 le-0.150
 z-0.149
Token and
Token ID290
Feature activation+1.62e-03

Pos logit contributions
ill+0.283
igsaw+0.272
aney+0.262
etti+0.255
kin+0.254
Neg logit contributions
 asking-0.142
 le-0.119
 trying-0.118
 z-0.117
 fl-0.115
Token contest
Token ID8414
Feature activation-2.47e-02

Pos logit contributions
 asking+0.019
 challenged+0.016
 yells+0.016
 ult+0.016
 begged+0.015
Neg logit contributions
igsaw-0.017
kin-0.017
ill-0.016
cake-0.015
aire-0.015
Token and
Token ID290
Feature activation+6.04e-03

Pos logit contributions
ill+0.056
igsaw+0.052
kin+0.051
aney+0.050
etti+0.050
Neg logit contributions
 asking-0.028
 trying-0.023
 le-0.023
 swoop-0.022
 challenged-0.022
Token got
Token ID1392
Feature activation-2.04e-03

Pos logit contributions
 asking+0.008
 trying+0.007
 swoop+0.007
 le+0.007
 challenged+0.007
Neg logit contributions
ill-0.012
igsaw-0.011
kin-0.011
aney-0.011
etti-0.011
Token his
Token ID465
Feature activation+6.31e-03

Pos logit contributions
ill+0.004
igsaw+0.003
ushes+0.003
aney+0.003
kin+0.003
Neg logit contributions
 asking-0.003
 swoop-0.003
 challenged-0.003
 gir-0.003
cut-0.003
Token nap
Token ID25422
Feature activation-5.85e-02

Pos logit contributions
 asking+0.010
 swoop+0.009
 challenged+0.009
 trying+0.009
 le+0.009
Neg logit contributions
ill-0.011
igsaw-0.010
kin-0.010
aney-0.010
ushes-0.009
Tokenkin
Token ID5116
Feature activation-2.99e-01

Pos logit contributions
ill+0.070
igsaw+0.062
aney+0.058
etti+0.056
ushes+0.055
Neg logit contributions
 asking-0.129
 trying-0.117
 le-0.115
 swoop-0.115
 z-0.114
Token back
Token ID736
Feature activation-2.61e-01

Pos logit contributions
ill+0.556
igsaw+0.511
aney+0.486
kin+0.484
etti+0.478
Neg logit contributions
 asking-0.468
 swoop-0.400
 le-0.399
 trying-0.399
 challenged-0.393
Token.
Token ID13
Feature activation-1.50e-02

Pos logit contributions
ill+0.544
igsaw+0.500
aney+0.477
kin+0.477
etti+0.470
Neg logit contributions
 asking-0.352
 swoop-0.296
 le-0.296
 challenged-0.291
 trying-0.290
Token From
Token ID3574
Feature activation-2.80e-03

Pos logit contributions
ill+0.028
kin+0.026
aney+0.025
igsaw+0.025
etti+0.024
Neg logit contributions
 asking-0.023
 le-0.021
 trying-0.020
 challenged-0.019
 z-0.019
Token that
Token ID326
Feature activation+3.84e-02

Pos logit contributions
ill+0.006
kin+0.005
igsaw+0.005
ola+0.005
etti+0.005
Neg logit contributions
 asking-0.004
 challenged-0.003
 ro-0.003
 gir-0.003
 swoop-0.003
Token day
Token ID1110
Feature activation+9.00e-02

Pos logit contributions
 asking+0.052
 swoop+0.045
 le+0.044
 challenged+0.044
 ostr+0.042
Neg logit contributions
ill-0.075
igsaw-0.073
kin-0.071
aire-0.065
ola-0.064
Tokenished
Token ID1348
Feature activation-4.51e-02

Pos logit contributions
ill+0.014
ushes+0.012
igsaw+0.012
aney+0.012
kin+0.011
Neg logit contributions
 asking-0.016
cut-0.014
 trying-0.014
 gir-0.014
 ostr-0.013
Token it
Token ID340
Feature activation-7.52e-02

Pos logit contributions
ill+0.087
igsaw+0.077
kin+0.075
aney+0.074
etti+0.073
Neg logit contributions
 asking-0.068
 challenged-0.059
 swoop-0.058
 trying-0.057
 le-0.057
Token with
Token ID351
Feature activation-1.09e-01

Pos logit contributions
ill+0.124
kin+0.104
igsaw+0.099
ushes+0.096
cake+0.096
Neg logit contributions
 asking-0.127
 ostr-0.123
 bulldo-0.116
 challenged-0.115
 gir-0.115
Token a
Token ID257
Feature activation+1.37e-04

Pos logit contributions
ill+0.215
igsaw+0.194
kin+0.187
aney+0.184
ushes+0.184
Neg logit contributions
 asking-0.159
 challenged-0.136
 swoop-0.131
 ostr-0.130
cut-0.130
Token nap
Token ID25422
Feature activation-6.00e-02

Pos logit contributions
 asking+0.000
 trying+0.000
 swoop+0.000
 le+0.000
 challenged+0.000
Neg logit contributions
ill-0.000
igsaw-0.000
kin-0.000
aney-0.000
etti-0.000
Tokenkin
Token ID5116
Feature activation-2.99e-01

Pos logit contributions
ill+0.018
igsaw+0.009
aney+0.004
kin+0.003
etti+0.002
Neg logit contributions
 asking-0.188
 swoop-0.175
 le-0.175
 trying-0.174
 challenged-0.173
Token.
Token ID13
Feature activation-3.77e-02

Pos logit contributions
ill+0.567
kin+0.504
cake+0.464
igsaw+0.447
etti+0.445
Neg logit contributions
 asking-0.452
have-0.416
 challenged-0.410
 swoop-0.409
 ostr-0.401
Token Tim
Token ID5045
Feature activation+4.96e-02

Pos logit contributions
ill+0.059
igsaw+0.053
etti+0.050
aney+0.050
cake+0.049
Neg logit contributions
 asking-0.068
 swoop-0.063
 fl-0.060
 ostr-0.059
 ro-0.059
Tokenmy
Token ID1820
Feature activation+2.20e-02

Pos logit contributions
 asking+0.101
cut+0.100
 le+0.098
 gir+0.096
 bulldo+0.095
Neg logit contributions
ill-0.071
cy-0.050
aney-0.050
ola-0.043
 Kate-0.042
Token felt
Token ID2936
Feature activation-7.98e-02

Pos logit contributions
 asking+0.036
 trucks+0.032
cut+0.032
 hunters+0.032
 going+0.032
Neg logit contributions
ill-0.033
kin-0.029
aney-0.028
 Kate-0.027
aire-0.026
Token bad
Token ID2089
Feature activation+6.85e-02

Pos logit contributions
ill+0.161
igsaw+0.149
kin+0.144
aney+0.142
ushes+0.139
Neg logit contributions
 asking-0.110
 swoop-0.092
 challenged-0.090
 le-0.090
 gir-0.089
Token she
Token ID673
Feature activation+3.02e-02

Pos logit contributions
 asking+0.002
 swoop+0.002
 le+0.002
 challenged+0.002
 gir+0.002
Neg logit contributions
ill-0.004
igsaw-0.003
kin-0.003
aney-0.003
ushes-0.003
Token even
Token ID772
Feature activation-1.58e-01

Pos logit contributions
 asking+0.041
 swoop+0.033
 le+0.032
 gir+0.032
 trying+0.032
Neg logit contributions
ill-0.062
kin-0.054
igsaw-0.054
aney-0.053
ushes-0.053
Token put
Token ID1234
Feature activation-9.04e-03

Pos logit contributions
ill+0.320
kin+0.283
igsaw+0.276
aney+0.270
ola+0.268
Neg logit contributions
 asking-0.204
cut-0.171
 ostr-0.170
 bulldo-0.169
 trucks-0.166
Token a
Token ID257
Feature activation+2.45e-03

Pos logit contributions
ill+0.021
aney+0.020
igsaw+0.020
etti+0.019
kin+0.018
Neg logit contributions
 asking-0.011
 trying-0.009
 le-0.009
 sc-0.009
 z-0.008
Token nap
Token ID25422
Feature activation-5.82e-02

Pos logit contributions
 asking+0.003
 le+0.003
 swoop+0.003
 trying+0.003
 z+0.003
Neg logit contributions
ill-0.005
igsaw-0.005
aney-0.005
etti-0.004
kin-0.004
Tokenkin
Token ID5116
Feature activation-2.99e-01

Pos logit contributions
ill+0.054
igsaw+0.050
aney+0.046
etti+0.044
ushes+0.041
Neg logit contributions
 asking-0.142
 trying-0.133
 fl-0.128
 z-0.127
 telling-0.126
Token on
Token ID319
Feature activation-5.74e-02

Pos logit contributions
ill+0.566
igsaw+0.513
kin+0.498
aney+0.490
etti+0.479
Neg logit contributions
 asking-0.463
 le-0.398
 swoop-0.395
 challenged-0.391
 trying-0.390
Token her
Token ID607
Feature activation-2.78e-02

Pos logit contributions
ill+0.129
aney+0.120
ushes+0.120
igsaw+0.118
etti+0.116
Neg logit contributions
 asking-0.069
 trying-0.058
 le-0.056
 z-0.054
 sc-0.053
Token head
Token ID1182
Feature activation-5.75e-02

Pos logit contributions
ill+0.053
igsaw+0.047
aney+0.046
etti+0.044
ushes+0.044
Neg logit contributions
 asking-0.044
 trying-0.038
 z-0.038
 fl-0.038
 sc-0.038
Token too
Token ID1165
Feature activation-1.15e-01

Pos logit contributions
ill+0.123
igsaw+0.113
kin+0.109
aney+0.109
etti+0.107
Neg logit contributions
 asking-0.074
 swoop-0.062
 le-0.061
 trying-0.061
 challenged-0.060
Token.
Token ID13
Feature activation-2.44e-02

Pos logit contributions
ill+0.246
igsaw+0.227
kin+0.217
aney+0.217
etti+0.214
Neg logit contributions
 asking-0.151
 swoop-0.126
 le-0.124
 challenged-0.123
 trying-0.123
Token't
Token ID470
Feature activation+4.91e-02

Pos logit contributions
 asking+0.155
 trying+0.136
 going+0.136
 le+0.131
 sc+0.129
Neg logit contributions
ill-0.308
aney-0.280
ushes-0.279
kin-0.278
igsaw-0.278
Token want
Token ID765
Feature activation+7.00e-02

Pos logit contributions
 asking+0.059
 trying+0.048
 challenged+0.048
 going+0.047
 swoop+0.046
Neg logit contributions
ill-0.108
igsaw-0.107
etti-0.105
ushes-0.103
aney-0.102
Token a
Token ID257
Feature activation+1.05e-03

Pos logit contributions
 asking+0.072
 trying+0.056
 le+0.055
 going+0.051
 sc+0.051
Neg logit contributions
ill-0.173
igsaw-0.162
kin-0.159
aney-0.158
etti-0.156
Token new
Token ID649
Feature activation-5.74e-02

Pos logit contributions
 asking+0.002
 z+0.001
 le+0.001
 trying+0.001
 fl+0.001
Neg logit contributions
ill-0.002
etti-0.002
igsaw-0.002
aney-0.002
ushes-0.002
Token nap
Token ID25422
Feature activation-6.03e-02

Pos logit contributions
ill+0.111
igsaw+0.101
aney+0.100
etti+0.098
ifying+0.096
Neg logit contributions
 asking-0.087
 z-0.076
 trying-0.076
 le-0.076
 fl-0.075
Tokenkin
Token ID5116
Feature activation-2.99e-01

Pos logit contributions
ill+0.122
igsaw+0.121
ushes+0.115
aney+0.114
etti+0.114
Neg logit contributions
 asking-0.080
 trying-0.071
 going-0.065
 le-0.064
 z-0.063
Token,
Token ID11
Feature activation-4.14e-02

Pos logit contributions
ill+0.643
igsaw+0.588
kin+0.569
aney+0.567
etti+0.555
Neg logit contributions
 asking-0.386
 swoop-0.321
 le-0.316
 challenged-0.313
 trying-0.313
Token she
Token ID673
Feature activation+2.90e-02

Pos logit contributions
ill+0.080
igsaw+0.070
aney+0.069
kin+0.068
etti+0.066
Neg logit contributions
 asking-0.060
 swoop-0.054
 challenged-0.052
cut-0.051
 le-0.051
Token wanted
Token ID2227
Feature activation+2.10e-01

Pos logit contributions
 asking+0.041
cut+0.033
 trucks+0.032
 gir+0.032
 ostr+0.032
Neg logit contributions
ill-0.057
kin-0.054
aney-0.050
ola-0.049
cake-0.048
Token her
Token ID607
Feature activation-3.04e-02

Pos logit contributions
 asking+0.256
 le+0.207
 trying+0.207
 swoop+0.206
 challenged+0.201
Neg logit contributions
ill-0.469
igsaw-0.437
aney-0.420
kin-0.419
etti-0.414
Token old
Token ID1468
Feature activation+5.05e-02

Pos logit contributions
ill+0.056
igsaw+0.052
aney+0.049
etti+0.048
kin+0.048
Neg logit contributions
 asking-0.049
 trying-0.042
 le-0.042
 swoop-0.041
 challenged-0.041
Token Max
Token ID5436
Feature activation+8.62e-02

Pos logit contributions
ill+0.082
igsaw+0.076
aney+0.074
ushes+0.072
etti+0.071
Neg logit contributions
 asking-0.056
 trying-0.048
 le-0.046
 going-0.045
 z-0.045
Token gave
Token ID2921
Feature activation-4.94e-02

Pos logit contributions
 asking+0.125
cut+0.122
 trucks+0.113
 ostr+0.105
 gir+0.105
Neg logit contributions
ill-0.173
aney-0.140
kin-0.138
aire-0.137
cake-0.136
Token back
Token ID736
Feature activation-2.62e-01

Pos logit contributions
ill+0.102
igsaw+0.095
aney+0.092
etti+0.091
kin+0.090
Neg logit contributions
 asking-0.071
 trying-0.059
 le-0.058
 z-0.057
 sc-0.057
Token the
Token ID262
Feature activation+1.05e-02

Pos logit contributions
ill+0.574
igsaw+0.533
kin+0.530
etti+0.520
aney+0.517
Neg logit contributions
 asking-0.337
 le-0.285
 trying-0.268
 sc-0.267
 z-0.267
Token nap
Token ID25422
Feature activation-5.92e-02

Pos logit contributions
 asking+0.018
 swoop+0.016
 challenged+0.015
 le+0.015
 trying+0.015
Neg logit contributions
ill-0.018
igsaw-0.017
kin-0.016
aney-0.016
ushes-0.015
Tokenkin
Token ID5116
Feature activation-2.99e-01

Pos logit contributions
ill+0.032
igsaw+0.027
etti+0.023
aney+0.022
ifying+0.021
Neg logit contributions
 asking-0.162
 trying-0.153
 z-0.151
 fl-0.150
 le-0.150
Token and
Token ID290
Feature activation+4.23e-04

Pos logit contributions
ill+0.649
igsaw+0.630
etti+0.605
aney+0.593
ushes+0.593
Neg logit contributions
 asking-0.347
 trying-0.295
 sc-0.292
 le-0.287
 swoop-0.280
Token Lily
Token ID20037
Feature activation-4.69e-03

Pos logit contributions
 asking+0.001
 swoop+0.001
cut+0.001
 le+0.001
 challenged+0.001
Neg logit contributions
ill-0.001
kin-0.001
igsaw-0.001
etti-0.001
aney-0.001
Token felt
Token ID2936
Feature activation-7.83e-02

Pos logit contributions
ill+0.009
kin+0.008
aney+0.007
etti+0.007
ola+0.007
Neg logit contributions
 asking-0.007
cut-0.006
 le-0.006
 swoop-0.006
 trying-0.006
Token happy
Token ID3772
Feature activation-4.78e-02

Pos logit contributions
ill+0.169
igsaw+0.157
etti+0.152
aney+0.150
ushes+0.143
Neg logit contributions
 asking-0.106
 trying-0.097
 going-0.094
 le-0.093
 sc-0.090
Token again
Token ID757
Feature activation-2.15e-01

Pos logit contributions
ill+0.096
igsaw+0.089
aney+0.086
kin+0.085
etti+0.084
Neg logit contributions
 asking-0.070
 trying-0.059
 le-0.058
 swoop-0.057
 challenged-0.056
Token dress
Token ID6576
Feature activation-1.65e-01

Pos logit contributions
ill+0.048
kin+0.044
igsaw+0.044
ushes+0.042
etti+0.041
Neg logit contributions
 asking-0.036
 challenged-0.031
 swoop-0.031
cut-0.029
 le-0.029
Token.
Token ID13
Feature activation-2.67e-03

Pos logit contributions
ill+0.360
igsaw+0.328
kin+0.321
aney+0.317
etti+0.315
Neg logit contributions
 asking-0.207
 swoop-0.170
 challenged-0.169
 le-0.168
 trying-0.167
Token
Token ID198
Feature activation-2.60e-02

Pos logit contributions
kin+0.006
igsaw+0.006
etti+0.005
ifying+0.005
aney+0.005
Neg logit contributions
 asking-0.003
 trying-0.003
 le-0.003
 asked-0.002
 going-0.002
Token
Token ID198
Feature activation-2.53e-02

Pos logit contributions
ill+0.045
igsaw+0.042
kin+0.040
aney+0.039
etti+0.039
Neg logit contributions
 asking-0.044
 le-0.038
 trying-0.037
 swoop-0.037
 z-0.037
TokenThey
Token ID2990
Feature activation-1.22e-02

Pos logit contributions
ill+0.051
igsaw+0.045
kin+0.044
aney+0.044
ushes+0.043
Neg logit contributions
 asking-0.037
 swoop-0.032
 le-0.032
 trying-0.031
 ro-0.031
Token ran
Token ID4966
Feature activation+1.45e-01

Pos logit contributions
ill+0.028
igsaw+0.025
aney+0.025
kin+0.025
etti+0.024
Neg logit contributions
 asking-0.014
 trying-0.012
 le-0.012
 swoop-0.011
 challenged-0.011
Token downstairs
Token ID34624
Feature activation-1.49e-01

Pos logit contributions
 asking+0.175
 trying+0.142
 le+0.141
 swoop+0.140
 challenged+0.137
Neg logit contributions
ill-0.325
igsaw-0.303
kin-0.292
aney-0.291
etti-0.287
Token and
Token ID290
Feature activation+1.84e-02

Pos logit contributions
ill+0.330
igsaw+0.303
kin+0.292
aney+0.292
etti+0.288
Neg logit contributions
 asking-0.175
 swoop-0.154
 le-0.148
 challenged-0.145
 gir-0.142
Token hugged
Token ID41130
Feature activation-9.05e-02

Pos logit contributions
 asking+0.021
 trying+0.017
 swoop+0.016
 z+0.016
 chased+0.016
Neg logit contributions
ill-0.043
igsaw-0.041
aney-0.039
kin-0.039
etti-0.038
Token Mom
Token ID11254
Feature activation-1.49e-01

Pos logit contributions
igsaw+0.194
aney+0.175
cake+0.172
+0.171
ill+0.165
Neg logit contributions
 animals-0.132
 asking-0.130
 z-0.120
 trucks-0.115
 sc-0.115
Token.
Token ID13
Feature activation+8.87e-04

Pos logit contributions
ill+0.341
igsaw+0.328
kin+0.313
aney+0.308
etti+0.307
Neg logit contributions
 asking-0.168
 trying-0.132
 le-0.132
 z-0.128
 swoop-0.127
Token.
Token ID13
Feature activation-1.36e-02

Pos logit contributions
 asking+0.032
 swoop+0.031
cut+0.030
 challenged+0.029
 le+0.029
Neg logit contributions
ill-0.037
igsaw-0.033
etti-0.031
ifying-0.030
cake-0.030
Token
Token ID220
Feature activation-3.47e-02

Pos logit contributions
ill+0.025
igsaw+0.023
kin+0.023
aney+0.022
etti+0.022
Neg logit contributions
 asking-0.022
 le-0.019
 challenged-0.019
 trying-0.019
 z-0.018
Token
Token ID198
Feature activation-3.56e-02

Pos logit contributions
ill+0.070
igsaw+0.069
kin+0.067
ifying+0.065
aney+0.064
Neg logit contributions
 asking-0.046
 trying-0.038
 z-0.036
 challenged-0.036
 going-0.036
Token
Token ID198
Feature activation-3.53e-02

Pos logit contributions
ill+0.053
igsaw+0.046
aney+0.044
kin+0.043
ushes+0.042
Neg logit contributions
 asking-0.070
 le-0.062
 trying-0.061
 swoop-0.061
 ostr-0.060
TokenHe
Token ID1544
Feature activation+3.02e-02

Pos logit contributions
ill+0.052
igsaw+0.044
cake+0.043
he+0.043
 Something+0.042
Neg logit contributions
 asking-0.068
 ostr-0.060
 trying-0.059
 le-0.058
 sc-0.057
Token wanted
Token ID2227
Feature activation+2.14e-01

Pos logit contributions
 asking+0.042
cut+0.038
 trying+0.035
 le+0.034
 trucks+0.034
Neg logit contributions
ill-0.057
etti-0.048
kin-0.047
aire-0.047
aney-0.046
Token to
Token ID284
Feature activation+1.03e-01

Pos logit contributions
 asking+0.226
 le+0.176
 swoop+0.176
 trying+0.176
 challenged+0.172
Neg logit contributions
ill-0.513
igsaw-0.480
aney-0.462
kin-0.462
etti-0.456
Token make
Token ID787
Feature activation-3.85e-02

Pos logit contributions
 asking+0.138
 swoop+0.117
 trying+0.115
 z+0.114
 sc+0.113
Neg logit contributions
ill-0.217
igsaw-0.205
aney-0.197
kin-0.195
ushes-0.194
Token his
Token ID465
Feature activation+6.81e-03

Pos logit contributions
ill+0.070
igsaw+0.064
ushes+0.061
etti+0.061
kin+0.060
Neg logit contributions
 asking-0.058
 swoop-0.052
 le-0.051
cut-0.050
 challenged-0.050
Token own
Token ID898
Feature activation+7.62e-03

Pos logit contributions
 asking+0.010
 swoop+0.008
 trying+0.008
 le+0.008
 challenged+0.008
Neg logit contributions
ill-0.014
igsaw-0.012
aney-0.012
kin-0.012
etti-0.012
Token story
Token ID1621
Feature activation-1.08e-02

Pos logit contributions
 asking+0.013
 le+0.012
 swoop+0.012
 challenged+0.012
 trying+0.011
Neg logit contributions
ill-0.013
igsaw-0.012
ushes-0.011
kin-0.011
aney-0.011
Token his
Token ID465
Feature activation+1.23e-02

Pos logit contributions
ill+0.198
igsaw+0.191
aney+0.186
kin+0.179
etti+0.179
Neg logit contributions
 asking-0.107
 trying-0.085
 le-0.083
 z-0.082
 swoop-0.081
Token basket
Token ID7988
Feature activation-7.99e-02

Pos logit contributions
 asking+0.020
 le+0.018
 trying+0.018
 swoop+0.018
 challenged+0.017
Neg logit contributions
ill-0.022
igsaw-0.020
aney-0.019
kin-0.019
etti-0.019
Token.
Token ID13
Feature activation-1.57e-02

Pos logit contributions
ill+0.165
igsaw+0.144
kin+0.144
aney+0.141
etti+0.139
Neg logit contributions
 asking-0.111
 swoop-0.096
 challenged-0.094
 le-0.094
 trying-0.092
Token
Token ID198
Feature activation-2.99e-02

Pos logit contributions
ill+0.029
igsaw+0.026
kin+0.025
aney+0.025
etti+0.024
Neg logit contributions
 asking-0.026
 le-0.023
 trying-0.022
 challenged-0.022
 going-0.022
Token
Token ID198
Feature activation-2.98e-02

Pos logit contributions
ill+0.051
igsaw+0.046
aney+0.044
kin+0.043
etti+0.043
Neg logit contributions
 asking-0.052
 swoop-0.045
 trying-0.045
 le-0.045
 challenged-0.045
TokenAfter
Token ID3260
Feature activation+1.45e-01

Pos logit contributions
ill+0.055
igsaw+0.048
aney+0.048
cake+0.048
kin+0.048
Neg logit contributions
 asking-0.048
 trying-0.042
 gir-0.040
 ro-0.039
 swoop-0.039
Token they
Token ID484
Feature activation+4.97e-03

Pos logit contributions
 asking+0.210
 swoop+0.177
 le+0.177
 trying+0.177
 challenged+0.174
Neg logit contributions
ill-0.290
igsaw-0.268
aney-0.256
kin-0.255
etti-0.252
Token both
Token ID1111
Feature activation-1.05e-01

Pos logit contributions
 asking+0.006
 sc+0.006
 le+0.006
 swoop+0.006
 z+0.006
Neg logit contributions
igsaw-0.010
ill-0.010
etti-0.009
ushes-0.009
aps-0.009
Token had
Token ID550
Feature activation-9.96e-03

Pos logit contributions
ill+0.217
igsaw+0.207
kin+0.196
etti+0.195
ushes+0.193
Neg logit contributions
 asking-0.140
 swoop-0.119
 le-0.119
 trying-0.116
 z-0.114
Token filled
Token ID5901
Feature activation-1.45e-01

Pos logit contributions
ill+0.021
igsaw+0.020
kin+0.019
aney+0.019
etti+0.019
Neg logit contributions
 asking-0.013
 swoop-0.011
 le-0.011
 trying-0.011
 challenged-0.010
Token their
Token ID511
Feature activation-2.94e-02

Pos logit contributions
ill+0.334
igsaw+0.310
aney+0.302
ifying+0.292
kin+0.292
Neg logit contributions
 asking-0.182
 trying-0.153
 le-0.152
 z-0.142
 fl-0.142
Token,
Token ID11
Feature activation-2.58e-02

Pos logit contributions
 asking+0.085
 trying+0.070
 le+0.067
 z+0.066
 going+0.065
Neg logit contributions
ill-0.169
igsaw-0.160
ushes-0.154
etti-0.153
aney-0.153
Token there
Token ID612
Feature activation+4.61e-04

Pos logit contributions
ill+0.057
igsaw+0.054
aney+0.051
etti+0.051
kin+0.051
Neg logit contributions
 asking-0.031
 trying-0.026
 swoop-0.026
 le-0.026
 challenged-0.025
Token was
Token ID373
Feature activation-2.00e-02

Pos logit contributions
 asking+0.000
 z+0.000
 trying+0.000
 le+0.000
 sc+0.000
Neg logit contributions
igsaw-0.001
ill-0.001
ushes-0.001
ifying-0.001
etti-0.001
Token a
Token ID257
Feature activation+1.08e-02

Pos logit contributions
ill+0.044
igsaw+0.041
kin+0.039
aney+0.039
ushes+0.039
Neg logit contributions
 asking-0.025
 le-0.021
 trying-0.021
 going-0.021
 z-0.020
Token little
Token ID1310
Feature activation+1.26e-01

Pos logit contributions
 asking+0.011
 le+0.009
 swoop+0.009
 trying+0.009
 challenged+0.008
Neg logit contributions
ill-0.026
igsaw-0.024
aney-0.023
kin-0.023
etti-0.023
Token dog
Token ID3290
Feature activation+1.43e-01

Pos logit contributions
 asking+0.141
 le+0.136
 gir+0.128
 z+0.125
 ostr+0.123
Neg logit contributions
ill-0.295
aney-0.266
igsaw-0.265
ushes-0.263
etti-0.259
Token named
Token ID3706
Feature activation+1.10e-01

Pos logit contributions
 asking+0.186
 swoop+0.157
have+0.156
 challenged+0.156
 ro+0.148
Neg logit contributions
ill-0.282
kin-0.254
etti-0.248
igsaw-0.244
aps-0.238
Token Spot
Token ID15899
Feature activation+1.15e-01

Pos logit contributions
 asking+0.171
 swoop+0.148
 trying+0.145
 challenged+0.144
 le+0.142
Neg logit contributions
ill-0.205
igsaw-0.188
aney-0.178
kin-0.177
etti-0.177
Token.
Token ID13
Feature activation-3.08e-03

Pos logit contributions
 asking+0.171
 challenged+0.145
 swoop+0.143
 trying+0.142
 le+0.141
Neg logit contributions
ill-0.225
kin-0.191
aney-0.189
igsaw-0.187
etti-0.185
Token Spot
Token ID15899
Feature activation+1.16e-01

Pos logit contributions
ill+0.007
igsaw+0.006
kin+0.006
aney+0.006
ola+0.006
Neg logit contributions
 asking-0.004
 le-0.003
 trying-0.003
 swoop-0.003
 challenged-0.003
Token was
Token ID373
Feature activation-1.96e-02

Pos logit contributions
 asking+0.156
 swoop+0.130
 trying+0.129
 le+0.129
 challenged+0.127
Neg logit contributions
ill-0.240
igsaw-0.221
kin-0.212
aney-0.212
etti-0.209
Token walk
Token ID2513
Feature activation+1.39e-01

Pos logit contributions
 asking+0.007
cut+0.006
 challenged+0.005
 trying+0.005
 causing+0.005
Neg logit contributions
igsaw-0.011
ill-0.011
aney-0.010
aire-0.010
kin-0.010
Token.
Token ID13
Feature activation-1.43e-02

Pos logit contributions
 asking+0.211
 swoop+0.183
 le+0.183
 challenged+0.181
 gir+0.177
Neg logit contributions
ill-0.267
igsaw-0.230
kin-0.227
aney-0.223
ola-0.219
Token As
Token ID1081
Feature activation+9.92e-02

Pos logit contributions
ill+0.022
igsaw+0.020
kin+0.019
aney+0.019
etti+0.019
Neg logit contributions
 asking-0.027
 le-0.024
 trying-0.024
 swoop-0.024
 challenged-0.023
Token she
Token ID673
Feature activation+3.10e-02

Pos logit contributions
 asking+0.164
 swoop+0.144
 challenged+0.144
cut+0.142
 le+0.142
Neg logit contributions
ill-0.173
igsaw-0.156
kin-0.147
aney-0.146
ola-0.143
Token was
Token ID373
Feature activation-2.38e-02

Pos logit contributions
 asking+0.042
cut+0.036
 trying+0.035
 trucks+0.035
 gir+0.034
Neg logit contributions
ill-0.062
kin-0.056
ola-0.055
igsaw-0.054
aney-0.054
Token walking
Token ID6155
Feature activation+1.78e-01

Pos logit contributions
ill+0.044
aney+0.038
ola+0.038
igsaw+0.038
 Kate+0.038
Neg logit contributions
cut-0.029
 trucks-0.027
cod-0.025
 gir-0.025
 wants-0.025
Token,
Token ID11
Feature activation-4.12e-02

Pos logit contributions
 asking+0.231
 swoop+0.192
 trying+0.191
 le+0.191
 challenged+0.186
Neg logit contributions
ill-0.380
igsaw-0.356
aney-0.342
kin-0.339
etti-0.337
Token she
Token ID673
Feature activation+2.88e-02

Pos logit contributions
ill+0.081
igsaw+0.074
aney+0.071
kin+0.071
ushes+0.070
Neg logit contributions
 asking-0.061
 trying-0.052
 le-0.052
 swoop-0.051
 challenged-0.050
Token started
Token ID2067
Feature activation+4.13e-02

Pos logit contributions
 asking+0.041
cut+0.040
 trucks+0.039
 le+0.037
 gir+0.036
Neg logit contributions
ill-0.055
ola-0.047
kin-0.044
aire-0.043
aney-0.042
Token to
Token ID284
Feature activation+1.00e-01

Pos logit contributions
cut+0.067
cod+0.060
 gir+0.057
 swoop+0.056
 trucks+0.056
Neg logit contributions
ill-0.069
aps-0.055
 Kate-0.054
 Jill-0.054
 Julia-0.054
Token sh
Token ID427
Feature activation+1.60e-02

Pos logit contributions
 asking+0.185
cut+0.168
 le+0.164
 gir+0.162
 swoop+0.161
Neg logit contributions
ill-0.153
igsaw-0.125
aney-0.118
 Something-0.118
kin-0.114
TokenOnce
Token ID7454
Feature activation+3.99e-02

Pos logit contributions
 asking+0.084
 z+0.072
 trying+0.071
 ro+0.070
 le+0.070
Neg logit contributions
ill-0.143
igsaw-0.143
aney-0.136
kin-0.135
ushes-0.134
Token upon
Token ID2402
Feature activation+7.33e-02

Pos logit contributions
 asking+0.054
 trying+0.045
 le+0.045
 swoop+0.045
 z+0.044
Neg logit contributions
ill-0.081
igsaw-0.076
etti-0.073
kin-0.072
aney-0.072
Token a
Token ID257
Feature activation+5.45e-04

Pos logit contributions
 asking+0.102
 trying+0.087
 le+0.086
 swoop+0.086
 challenged+0.084
Neg logit contributions
ill-0.148
igsaw-0.138
aney-0.131
kin-0.131
etti-0.130
Token time
Token ID640
Feature activation+6.56e-02

Pos logit contributions
 asking+0.001
 swoop+0.001
 challenged+0.001
 le+0.001
 trying+0.001
Neg logit contributions
ill-0.001
igsaw-0.001
aney-0.001
kin-0.001
ushes-0.001
Token,
Token ID11
Feature activation-3.50e-02

Pos logit contributions
 asking+0.074
 trying+0.061
 le+0.059
 z+0.058
 swoop+0.057
Neg logit contributions
ill-0.149
igsaw-0.140
ushes-0.136
aney-0.135
kin-0.135
Token there
Token ID612
Feature activation-6.83e-03

Pos logit contributions
ill+0.075
igsaw+0.069
aney+0.066
kin+0.066
etti+0.065
Neg logit contributions
 asking-0.046
 swoop-0.038
 le-0.038
 trying-0.037
 challenged-0.037
Token was
Token ID373
Feature activation-2.88e-02

Pos logit contributions
igsaw+0.018
ill+0.017
ushes+0.017
ifying+0.016
etti+0.016
Neg logit contributions
 asking-0.005
 z-0.005
 trying-0.005
 le-0.004
 sc-0.004
Token a
Token ID257
Feature activation+1.51e-03

Pos logit contributions
ill+0.058
igsaw+0.053
aney+0.051
kin+0.051
etti+0.050
Neg logit contributions
 asking-0.041
 le-0.035
 trying-0.035
 swoop-0.035
 challenged-0.034
Token very
Token ID845
Feature activation-3.24e-02

Pos logit contributions
 asking+0.001
 swoop+0.001
 trying+0.001
 le+0.001
 challenged+0.001
Neg logit contributions
ill-0.004
igsaw-0.003
kin-0.003
aney-0.003
etti-0.003
Token serious
Token ID2726
Feature activation+5.30e-02

Pos logit contributions
ill+0.061
igsaw+0.061
kin+0.059
aney+0.055
ola+0.055
Neg logit contributions
 asking-0.047
 challenged-0.039
 swoop-0.039
 trying-0.038
cut-0.038
Token girl
Token ID2576
Feature activation-3.20e-03

Pos logit contributions
 asking+0.064
 swoop+0.051
 challenged+0.050
 trying+0.050
 chased+0.048
Neg logit contributions
ill-0.115
igsaw-0.111
kin-0.106
aney-0.103
etti-0.102
Token bad
Token ID2089
Feature activation+7.44e-02

Pos logit contributions
 asking+0.053
cut+0.050
 gir+0.046
 ostr+0.046
 swoop+0.046
Neg logit contributions
ill-0.039
aney-0.031
ifying-0.030
ushes-0.030
ola-0.030
Token duck
Token ID22045
Feature activation+5.07e-02

Pos logit contributions
 asking+0.106
 trying+0.089
cut+0.089
 swoop+0.089
 challenged+0.087
Neg logit contributions
ill-0.145
kin-0.136
igsaw-0.132
ola-0.126
aire-0.126
Token!"
Token ID2474
Feature activation-4.23e-02

Pos logit contributions
 asking+0.085
 challenged+0.071
 gir+0.069
 ult+0.068
 trying+0.067
Neg logit contributions
ill-0.090
kin-0.081
igsaw-0.073
aney-0.072
aire-0.072
Token Mom
Token ID11254
Feature activation-1.46e-01

Pos logit contributions
ill+0.063
igsaw+0.059
etti+0.054
kin+0.053
aney+0.053
Neg logit contributions
 asking-0.086
 le-0.077
 trying-0.076
 sc-0.075
 z-0.075
Token says
Token ID1139
Feature activation+1.14e-01

Pos logit contributions
ill+0.279
aney+0.232
 Kate+0.228
aps+0.226
ola+0.217
Neg logit contributions
 cowboy-0.192
 gir-0.191
 asking-0.189
 swoop-0.189
working-0.184
Token.
Token ID13
Feature activation+1.05e-03

Pos logit contributions
 asking+0.155
 swoop+0.139
 challenged+0.136
 le+0.135
 gir+0.134
Neg logit contributions
ill-0.233
igsaw-0.206
aney-0.205
kin-0.203
etti-0.198
Token "
Token ID366
Feature activation+8.68e-02

Pos logit contributions
 asking+0.001
 le+0.001
 trying+0.001
 sc+0.001
 challenged+0.001
Neg logit contributions
ill-0.002
igsaw-0.002
etti-0.002
kin-0.002
aney-0.002
TokenShe
Token ID3347
Feature activation+1.79e-02

Pos logit contributions
 asking+0.160
 trying+0.138
 challenged+0.137
 swoop+0.136
 le+0.136
Neg logit contributions
ill-0.138
igsaw-0.119
aney-0.115
kin-0.115
ushes-0.112
Token is
Token ID318
Feature activation-3.40e-02

Pos logit contributions
 asking+0.033
Later+0.028
cut+0.028
 challenged+0.028
 approaching+0.026
Neg logit contributions
cy-0.020
etti-0.020
-0.019
ill-0.019
ushes-0.019
Token not
Token ID407
Feature activation+6.26e-02

Pos logit contributions
ill+0.063
kin+0.060
igsaw+0.057
aney+0.056
ifying+0.055
Neg logit contributions
 asking-0.041
cut-0.040
 challenged-0.039
 swoop-0.038
 gir-0.037
Token food
Token ID2057
Feature activation-3.28e-02

Pos logit contributions
 asking+0.090
cut+0.082
 swoop+0.079
 challenged+0.079
 gir+0.079
Neg logit contributions
ill-0.113
kin-0.101
aney-0.099
igsaw-0.099
ushes-0.094
Token a
Token ID257
Feature activation+7.85e-03

Pos logit contributions
 asking+0.165
 swoop+0.141
 trying+0.139
 challenged+0.138
 le+0.138
Neg logit contributions
ill-0.203
igsaw-0.183
aney-0.176
kin-0.176
etti-0.172
Token frog
Token ID21264
Feature activation+4.48e-02

Pos logit contributions
 asking+0.009
 le+0.008
 swoop+0.008
 z+0.008
 trying+0.008
Neg logit contributions
ill-0.018
igsaw-0.017
aney-0.016
ushes-0.016
etti-0.016
Token hopping
Token ID47153
Feature activation+1.59e-01

Pos logit contributions
 asking+0.069
have+0.062
 trucks+0.061
 challenged+0.061
 begged+0.059
Neg logit contributions
ill-0.076
aps-0.066
kin-0.064
etti-0.063
ifying-0.060
Token around
Token ID1088
Feature activation-6.55e-02

Pos logit contributions
 asking+0.312
 Street+0.265
have+0.261
 doctors+0.260
 gir+0.258
Neg logit contributions
ill-0.236
ifying-0.189
igsaw-0.184
cy-0.184
he-0.176
Token and
Token ID290
Feature activation+1.62e-02

Pos logit contributions
ill+0.109
 Jill+0.086
 Jane+0.082
 Kate+0.082
kin+0.079
Neg logit contributions
 asking-0.104
have-0.104
 ostr-0.103
 gir-0.102
cut-0.100
Token he
Token ID339
Feature activation+7.19e-02

Pos logit contributions
ep+0.019
 ult+0.019
cut+0.019
 tractor+0.019
ans+0.018
Neg logit contributions
ill-0.028
 Jill-0.027
 Kate-0.026
 Something-0.026
 Jane-0.025
Token wanted
Token ID2227
Feature activation+2.17e-01

Pos logit contributions
 asking+0.098
cut+0.090
 trucks+0.084
have+0.080
 bulldo+0.080
Neg logit contributions
ill-0.134
 Something-0.116
ola-0.114
aney-0.113
kin-0.113
Token to
Token ID284
Feature activation+1.06e-01

Pos logit contributions
 asking+0.317
cut+0.315
ep+0.297
 swoop+0.282
 bulldo+0.281
Neg logit contributions
ill-0.383
 Jill-0.320
aney-0.316
 Something-0.313
cake-0.304
Token say
Token ID910
Feature activation+5.76e-02

Pos logit contributions
 asking+0.158
 trying+0.133
 le+0.129
 swoop+0.129
 gir+0.128
Neg logit contributions
ill-0.206
igsaw-0.185
aney-0.179
kin-0.175
etti-0.172
Token hello
Token ID23748
Feature activation+4.01e-02

Pos logit contributions
 asking+0.082
 swoop+0.068
 trying+0.068
 le+0.068
 challenged+0.067
Neg logit contributions
ill-0.115
igsaw-0.106
aney-0.101
kin-0.101
etti-0.099
Token.
Token ID13
Feature activation-6.44e-03

Pos logit contributions
 asking+0.045
 le+0.036
 swoop+0.036
 trying+0.035
 challenged+0.035
Neg logit contributions
ill-0.093
igsaw-0.090
kin-0.085
aney-0.085
ushes-0.083
Token it
Token ID340
Feature activation-7.35e-02

Pos logit contributions
 asking+0.004
 z+0.003
 challenged+0.003
 sc+0.003
 trying+0.003
Neg logit contributions
ill-0.009
igsaw-0.009
aney-0.008
kin-0.008
ushes-0.008
Token was
Token ID373
Feature activation-3.07e-02

Pos logit contributions
ill+0.168
igsaw+0.161
aney+0.153
ushes+0.149
kin+0.147
Neg logit contributions
 asking-0.088
 z-0.072
 trying-0.071
 le-0.071
 swoop-0.070
Token a
Token ID257
Feature activation-1.06e-03

Pos logit contributions
ill+0.061
igsaw+0.057
aney+0.054
kin+0.054
etti+0.053
Neg logit contributions
 asking-0.045
 trying-0.038
 le-0.037
 swoop-0.037
 z-0.037
Token set
Token ID900
Feature activation-6.54e-03

Pos logit contributions
ill+0.002
igsaw+0.002
aney+0.002
ushes+0.002
etti+0.002
Neg logit contributions
 asking-0.002
 le-0.001
 trying-0.001
 z-0.001
 swoop-0.001
Token of
Token ID286
Feature activation-1.27e-01

Pos logit contributions
ill+0.012
igsaw+0.011
kin+0.010
etti+0.010
aney+0.010
Neg logit contributions
 asking-0.011
 swoop-0.010
 challenged-0.009
 le-0.009
 chased-0.009
Token drums
Token ID24730
Feature activation+5.17e-02

Pos logit contributions
ill+0.216
igsaw+0.206
kin+0.194
etti+0.189
ushes+0.189
Neg logit contributions
 asking-0.201
 swoop-0.180
 challenged-0.179
 chased-0.173
 le-0.172
Token.
Token ID13
Feature activation-1.80e-02

Pos logit contributions
 asking+0.078
 swoop+0.069
cut+0.069
 le+0.067
 challenged+0.066
Neg logit contributions
ill-0.096
etti-0.087
igsaw-0.082
ifying-0.079
cake-0.078
Token He
Token ID679
Feature activation+2.41e-03

Pos logit contributions
ill+0.033
kin+0.032
aney+0.030
igsaw+0.030
etti+0.029
Neg logit contributions
 asking-0.029
 trying-0.024
 challenged-0.024
 le-0.023
 asked-0.023
Token quickly
Token ID2952
Feature activation+1.59e-02

Pos logit contributions
 asking+0.003
cut+0.003
 le+0.003
 trying+0.003
 swoop+0.003
Neg logit contributions
ill-0.005
kin-0.004
etti-0.004
aney-0.004
igsaw-0.004
Token started
Token ID2067
Feature activation+3.80e-02

Pos logit contributions
 asking+0.021
cut+0.019
 trying+0.018
 sneaking+0.017
 ostr+0.017
Neg logit contributions
ill-0.032
etti-0.028
aney-0.028
igsaw-0.028
kin-0.027
Token to
Token ID284
Feature activation+9.89e-02

Pos logit contributions
cut+0.054
 swoop+0.048
 ostr+0.047
 asking+0.046
cod+0.046
Neg logit contributions
ill-0.068
igsaw-0.061
etti-0.060
 Kate-0.059
aps-0.058
Token made
Token ID925
Feature activation-1.33e-02

Pos logit contributions
ill+0.044
igsaw+0.041
kin+0.039
aney+0.039
ushes+0.038
Neg logit contributions
 asking-0.026
 swoop-0.023
 le-0.022
 trying-0.021
 gir-0.021
Token John
Token ID1757
Feature activation-3.97e-02

Pos logit contributions
ill+0.027
igsaw+0.025
kin+0.024
etti+0.024
aney+0.023
Neg logit contributions
 asking-0.020
 trying-0.016
 going-0.016
 sc-0.016
 z-0.016
Token think
Token ID892
Feature activation+8.99e-02

Pos logit contributions
ill+0.067
igsaw+0.061
aney+0.058
kin+0.057
etti+0.056
Neg logit contributions
 asking-0.069
 swoop-0.060
 le-0.060
 trying-0.060
 challenged-0.059
Token twice
Token ID5403
Feature activation+1.18e-02

Pos logit contributions
 asking+0.119
 swoop+0.098
 le+0.098
 trying+0.098
 challenged+0.096
Neg logit contributions
ill-0.189
igsaw-0.175
aney-0.168
kin-0.168
etti-0.166
Token and
Token ID290
Feature activation+1.16e-02

Pos logit contributions
 asking+0.016
 swoop+0.014
 le+0.013
 challenged+0.013
 gir+0.013
Neg logit contributions
ill-0.024
igsaw-0.022
kin-0.021
aney-0.021
etti-0.021
Token he
Token ID339
Feature activation+7.12e-02

Pos logit contributions
 asking+0.015
 le+0.013
 swoop+0.013
 challenged+0.013
 gir+0.013
Neg logit contributions
ill-0.024
igsaw-0.022
aney-0.021
kin-0.021
etti-0.020
Token finally
Token ID3443
Feature activation-4.81e-02

Pos logit contributions
 asking+0.096
 le+0.079
 trying+0.078
 swoop+0.077
 challenged+0.075
Neg logit contributions
ill-0.148
kin-0.131
igsaw-0.131
aney-0.129
ushes-0.127
Token accepted
Token ID6292
Feature activation-4.37e-02

Pos logit contributions
ill+0.103
igsaw+0.096
aney+0.092
kin+0.091
etti+0.091
Neg logit contributions
 asking-0.062
 swoop-0.051
 le-0.051
 trying-0.051
 challenged-0.050
Token to
Token ID284
Feature activation+1.02e-01

Pos logit contributions
ill+0.093
igsaw+0.087
aney+0.083
kin+0.083
etti+0.083
Neg logit contributions
 asking-0.057
 trying-0.047
 le-0.047
 swoop-0.046
 challenged-0.045
Token help
Token ID1037
Feature activation+1.53e-02

Pos logit contributions
 asking+0.150
 trying+0.126
 swoop+0.126
 le+0.125
 challenged+0.123
Neg logit contributions
ill-0.203
igsaw-0.187
aney-0.179
kin-0.178
etti-0.176
Token.
Token ID13
Feature activation-5.58e-03

Pos logit contributions
 asking+0.019
 trying+0.015
 le+0.015
 swoop+0.014
 sc+0.014
Neg logit contributions
ill-0.034
igsaw-0.032
kin-0.031
aney-0.031
etti-0.031
Token the
Token ID262
Feature activation+1.86e-02

Pos logit contributions
ill+0.064
igsaw+0.061
kin+0.059
etti+0.057
ola+0.057
Neg logit contributions
 asking-0.062
 challenged-0.057
 swoop-0.056
 chased-0.056
 bulldo-0.054
Token wish
Token ID4601
Feature activation-6.88e-02

Pos logit contributions
 asking+0.030
 swoop+0.026
 challenged+0.026
 chased+0.026
 le+0.026
Neg logit contributions
ill-0.032
igsaw-0.029
kin-0.029
aney-0.028
ushes-0.028
Token you
Token ID345
Feature activation+2.57e-02

Pos logit contributions
ill+0.135
igsaw+0.125
aney+0.119
kin+0.117
etti+0.117
Neg logit contributions
 asking-0.101
 trying-0.086
 swoop-0.085
 le-0.085
 challenged-0.084
Token would
Token ID561
Feature activation+6.87e-02

Pos logit contributions
 asking+0.041
 le+0.035
 swoop+0.035
 trying+0.035
 challenged+0.034
Neg logit contributions
ill-0.048
igsaw-0.043
aney-0.041
kin-0.041
etti-0.040
Token like
Token ID588
Feature activation-6.68e-02

Pos logit contributions
 asking+0.109
 le+0.097
 gir+0.094
 challenged+0.093
 trucks+0.093
Neg logit contributions
ill-0.125
igsaw-0.109
kin-0.107
aney-0.104
etti-0.102
Token?"
Token ID1701
Feature activation-1.02e-01

Pos logit contributions
ill+0.124
igsaw+0.109
kin+0.105
etti+0.104
ola+0.104
Neg logit contributions
 asking-0.095
 swoop-0.087
 challenged-0.085
 chased-0.085
 le-0.084
Token
Token ID198
Feature activation-3.07e-02

Pos logit contributions
ill+0.179
igsaw+0.164
kin+0.160
aney+0.154
ushes+0.154
Neg logit contributions
 asking-0.177
 le-0.153
 trying-0.152
 swoop-0.151
 challenged-0.149
Token
Token ID198
Feature activation-2.94e-02

Pos logit contributions
ill+0.057
igsaw+0.053
kin+0.051
etti+0.050
ifying+0.049
Neg logit contributions
 asking-0.048
 le-0.041
 swoop-0.041
 z-0.040
 trying-0.040
TokenLu
Token ID25596
Feature activation-1.47e-02

Pos logit contributions
ill+0.049
igsaw+0.044
kin+0.042
etti+0.041
aney+0.041
Neg logit contributions
 asking-0.052
 swoop-0.046
 le-0.046
 challenged-0.045
 trying-0.045
Tokency
Token ID948
Feature activation-1.91e-01

Pos logit contributions
igsaw+0.014
ill+0.013
kin+0.013
lights+0.011
etti+0.011
Neg logit contributions
 asking-0.034
 swoop-0.032
 le-0.032
 chased-0.031
 fl-0.031
Token was
Token ID373
Feature activation-1.35e-02

Pos logit contributions
ill+0.339
aney+0.291
kin+0.282
igsaw+0.278
etti+0.277
Neg logit contributions
 asking-0.310
 trucks-0.271
 le-0.270
 going-0.269
 bulldo-0.266
Token thing
Token ID1517
Feature activation-5.18e-02

Pos logit contributions
ill+0.060
igsaw+0.055
aney+0.052
kin+0.052
etti+0.051
Neg logit contributions
 asking-0.049
 swoop-0.042
 le-0.042
 trying-0.042
 challenged-0.041
Token and
Token ID290
Feature activation+1.46e-02

Pos logit contributions
ill+0.116
igsaw+0.108
aney+0.104
kin+0.103
etti+0.102
Neg logit contributions
 asking-0.062
 swoop-0.050
 le-0.050
 trying-0.049
 challenged-0.049
Token got
Token ID1392
Feature activation+4.58e-04

Pos logit contributions
 asking+0.016
 swoop+0.013
 trying+0.013
cut+0.013
 le+0.013
Neg logit contributions
ill-0.033
igsaw-0.030
kin-0.029
aney-0.029
ushes-0.029
Token on
Token ID319
Feature activation-5.57e-02

Pos logit contributions
 asking+0.001
 gir+0.001
 swoop+0.001
cut+0.001
 challenged+0.001
Neg logit contributions
ill-0.001
ushes-0.001
ola-0.001
aney-0.001
igsaw-0.001
Token a
Token ID257
Feature activation+6.56e-03

Pos logit contributions
ill+0.094
igsaw+0.085
etti+0.082
kin+0.078
ushes+0.077
Neg logit contributions
 asking-0.091
 challenged-0.083
 swoop-0.081
cut-0.080
 chased-0.079
Token seat
Token ID5852
Feature activation-5.66e-02

Pos logit contributions
 asking+0.009
 swoop+0.008
 trying+0.008
 le+0.008
 challenged+0.008
Neg logit contributions
ill-0.013
igsaw-0.012
aney-0.012
kin-0.012
etti-0.011
Token.
Token ID13
Feature activation-5.94e-03

Pos logit contributions
ill+0.119
igsaw+0.106
kin+0.105
aney+0.103
etti+0.102
Neg logit contributions
 asking-0.077
 swoop-0.065
 challenged-0.064
 le-0.064
 trying-0.063
Token He
Token ID679
Feature activation+1.25e-02

Pos logit contributions
ill+0.011
igsaw+0.011
kin+0.011
aney+0.010
etti+0.010
Neg logit contributions
 asking-0.008
 le-0.008
 z-0.007
 trying-0.007
 challenged-0.007
Token put
Token ID1234
Feature activation-8.34e-03

Pos logit contributions
 asking+0.015
 trying+0.012
 swoop+0.012
 le+0.012
 challenged+0.012
Neg logit contributions
ill-0.028
igsaw-0.025
kin-0.025
aney-0.025
ushes-0.024
Token on
Token ID319
Feature activation-5.57e-02

Pos logit contributions
ill+0.020
igsaw+0.019
aney+0.018
aire+0.017
etti+0.017
Neg logit contributions
 asking-0.010
 z-0.008
 le-0.008
 trying-0.008
 swoop-0.008
Token his
Token ID465
Feature activation+1.18e-02

Pos logit contributions
ill+0.107
igsaw+0.098
etti+0.093
aney+0.092
ushes+0.092
Neg logit contributions
 asking-0.083
 challenged-0.071
 swoop-0.070
cut-0.068
 trying-0.068
Token hugged
Token ID41130
Feature activation-9.08e-02

Pos logit contributions
ill+0.001
kin+0.001
aire+0.001
aney+0.001
ola+0.001
Neg logit contributions
 asking-0.001
cut-0.001
 trucks-0.001
 happening-0.001
 causing-0.001
Token the
Token ID262
Feature activation+1.46e-02

Pos logit contributions
ill+0.208
igsaw+0.204
aney+0.193
etti+0.191
kin+0.190
Neg logit contributions
 asking-0.112
 trying-0.089
 le-0.089
 z-0.089
 sc-0.087
Token pillow
Token ID28774
Feature activation-4.99e-02

Pos logit contributions
 asking+0.029
 le+0.026
 swoop+0.026
 challenged+0.026
 trying+0.025
Neg logit contributions
ill-0.020
igsaw-0.018
kin-0.018
ushes-0.016
etti-0.016
Token tight
Token ID5381
Feature activation+4.18e-02

Pos logit contributions
ill+0.100
igsaw+0.092
aney+0.088
kin+0.088
etti+0.087
Neg logit contributions
 asking-0.071
 trying-0.060
 swoop-0.060
 le-0.060
 challenged-0.058
Token.
Token ID13
Feature activation-1.80e-02

Pos logit contributions
 asking+0.057
 swoop+0.049
 challenged+0.048
 le+0.048
 trying+0.046
Neg logit contributions
ill-0.086
igsaw-0.078
kin-0.076
aney-0.074
etti-0.074
Token She
Token ID1375
Feature activation-1.67e-02

Pos logit contributions
ill+0.031
kin+0.030
igsaw+0.029
aney+0.028
etti+0.027
Neg logit contributions
 asking-0.031
 trying-0.026
 le-0.026
 challenged-0.025
 going-0.025
Token was
Token ID373
Feature activation-2.45e-02

Pos logit contributions
ill+0.033
kin+0.030
aney+0.029
etti+0.029
igsaw+0.028
Neg logit contributions
 asking-0.024
 trying-0.019
cut-0.019
 challenged-0.019
 trucks-0.018
Token so
Token ID523
Feature activation-4.14e-02

Pos logit contributions
ill+0.051
igsaw+0.048
aney+0.045
kin+0.045
etti+0.045
Neg logit contributions
 asking-0.034
 trying-0.029
 swoop-0.028
 le-0.028
 z-0.027
Token happy
Token ID3772
Feature activation-4.24e-02

Pos logit contributions
ill+0.095
igsaw+0.093
etti+0.090
ifying+0.084
aney+0.083
Neg logit contributions
 asking-0.047
 z-0.041
 trying-0.041
 sc-0.041
 swoop-0.040
Token to
Token ID284
Feature activation+1.02e-01

Pos logit contributions
ill+0.086
igsaw+0.075
etti+0.074
cake+0.072
aney+0.071
Neg logit contributions
 asking-0.055
 swoop-0.051
cut-0.050
 challenged-0.050
 bulldo-0.048
Token have
Token ID423
Feature activation-8.42e-02

Pos logit contributions
 asking+0.143
 swoop+0.129
 z+0.129
 sc+0.126
 going+0.125
Neg logit contributions
igsaw-0.201
ill-0.201
ushes-0.194
aney-0.193
etti-0.187
Token a
Token ID257
Feature activation+6.03e-03

Pos logit contributions
 asking+0.004
 swoop+0.004
 le+0.004
 trying+0.004
 challenged+0.004
Neg logit contributions
ill-0.007
igsaw-0.006
aney-0.006
kin-0.006
etti-0.006
Token good
Token ID922
Feature activation-6.79e-02

Pos logit contributions
 asking+0.010
 challenged+0.008
 swoop+0.008
 trying+0.008
 le+0.008
Neg logit contributions
ill-0.010
igsaw-0.010
kin-0.009
aney-0.009
etti-0.009
Token night
Token ID1755
Feature activation+1.09e-02

Pos logit contributions
ill+0.124
kin+0.117
igsaw+0.117
aney+0.107
aire+0.106
Neg logit contributions
 asking-0.100
 swoop-0.088
 challenged-0.087
 trying-0.085
 le-0.084
Token and
Token ID290
Feature activation+2.00e-02

Pos logit contributions
 asking+0.015
 swoop+0.013
 le+0.013
 trying+0.013
 challenged+0.013
Neg logit contributions
ill-0.022
igsaw-0.020
kin-0.019
aney-0.019
etti-0.019
Token sweet
Token ID6029
Feature activation-6.61e-02

Pos logit contributions
 asking+0.029
 trying+0.025
 le+0.024
 swoop+0.024
 challenged+0.024
Neg logit contributions
ill-0.040
igsaw-0.037
aney-0.036
etti-0.035
kin-0.035
Token dreams
Token ID10625
Feature activation-9.50e-02

Pos logit contributions
ill+0.112
igsaw+0.099
kin+0.099
aney+0.095
aire+0.092
Neg logit contributions
 asking-0.107
 swoop-0.098
 challenged-0.097
 chased-0.095
cut-0.094
Token."
Token ID526
Feature activation-2.80e-02

Pos logit contributions
ill+0.205
igsaw+0.194
aney+0.184
kin+0.182
etti+0.181
Neg logit contributions
 asking-0.121
 trying-0.100
 swoop-0.097
 le-0.097
 going-0.096
Token
Token ID198
Feature activation-2.66e-02

Pos logit contributions
ill+0.053
igsaw+0.050
etti+0.049
kin+0.048
aney+0.047
Neg logit contributions
 asking-0.045
 le-0.038
 trying-0.037
 asked-0.036
 z-0.036
Token
Token ID198
Feature activation-2.57e-02

Pos logit contributions
ill+0.048
igsaw+0.045
aney+0.042
kin+0.042
etti+0.042
Neg logit contributions
 asking-0.043
 swoop-0.037
 le-0.037
 trying-0.037
 challenged-0.036
TokenL
Token ID43
Feature activation+3.26e-02

Pos logit contributions
ill+0.041
igsaw+0.037
kin+0.035
aney+0.035
ushes+0.034
Neg logit contributions
 asking-0.048
 le-0.042
 trying-0.042
 swoop-0.042
 challenged-0.041
Tokenily
Token ID813
Feature activation-5.84e-02

Pos logit contributions
 asking+0.048
 le+0.041
 accident+0.041
 trying+0.041
 z+0.040
Neg logit contributions
igsaw-0.058
ifying-0.056
etti-0.055
ill-0.055
ushes-0.052
Token fun
Token ID1257
Feature activation+1.16e-02

Pos logit contributions
ill+0.070
igsaw+0.065
aney+0.062
kin+0.062
etti+0.061
Neg logit contributions
 asking-0.048
 swoop-0.040
 le-0.040
 trying-0.040
 challenged-0.039
Token!
Token ID0
Feature activation-4.23e-02

Pos logit contributions
 asking+0.017
 swoop+0.015
 challenged+0.015
cut+0.015
 le+0.014
Neg logit contributions
ill-0.022
igsaw-0.020
kin-0.020
aney-0.019
etti-0.019
Token We
Token ID775
Feature activation-6.96e-02

Pos logit contributions
ill+0.078
igsaw+0.075
kin+0.071
aney+0.070
ushes+0.069
Neg logit contributions
 asking-0.063
 le-0.058
 challenged-0.056
 trying-0.055
 wants-0.054
Token like
Token ID588
Feature activation-6.67e-02

Pos logit contributions
ill+0.112
igsaw+0.101
aney+0.096
kin+0.095
etti+0.094
Neg logit contributions
 asking-0.127
 swoop-0.111
 le-0.111
 trying-0.111
 challenged-0.110
Token swinging
Token ID25635
Feature activation+7.71e-02

Pos logit contributions
ill+0.095
igsaw+0.088
etti+0.087
ushes+0.082
aps+0.081
Neg logit contributions
 asking-0.107
cut-0.103
 challenged-0.101
 swoop-0.099
 ult-0.097
Token!"
Token ID2474
Feature activation-3.98e-02

Pos logit contributions
 asking+0.100
 le+0.084
 swoop+0.084
 trying+0.083
 challenged+0.081
Neg logit contributions
ill-0.164
igsaw-0.153
aney-0.146
kin-0.145
etti-0.144
Token
Token ID198
Feature activation-1.99e-02

Pos logit contributions
ill+0.078
igsaw+0.077
kin+0.075
etti+0.070
aney+0.068
Neg logit contributions
 asking-0.059
 le-0.055
 z-0.050
 trying-0.049
 sc-0.049
Token
Token ID198
Feature activation-1.91e-02

Pos logit contributions
ill+0.037
igsaw+0.034
aney+0.032
kin+0.032
etti+0.032
Neg logit contributions
 asking-0.031
 le-0.027
 swoop-0.027
 trying-0.027
 challenged-0.026
TokenDad
Token ID46270
Feature activation-6.87e-02

Pos logit contributions
ill+0.030
igsaw+0.026
aney+0.026
kin+0.026
ushes+0.025
Neg logit contributions
 asking-0.035
 trying-0.031
 swoop-0.030
 challenged-0.030
 le-0.030
Token watches
Token ID16860
Feature activation+9.19e-02

Pos logit contributions
ill+0.140
igsaw+0.122
aney+0.122
etti+0.119
ushes+0.118
Neg logit contributions
 asking-0.094
 swoop-0.081
 challenged-0.080
cut-0.079
 trying-0.078
Token them
Token ID606
Feature activation-4.68e-02

Pos logit contributions
 asking+0.140
 swoop+0.123
 trying+0.121
 le+0.120
 challenged+0.120
Neg logit contributions
ill-0.174
igsaw-0.155
aney-0.153
kin-0.149
etti-0.147
Token She
Token ID1375
Feature activation-1.28e-02

Pos logit contributions
ill+0.022
kin+0.020
aney+0.020
igsaw+0.019
etti+0.018
Neg logit contributions
 asking-0.023
 le-0.019
 trying-0.019
 z-0.019
 challenged-0.019
Token couldn
Token ID3521
Feature activation+1.10e-01

Pos logit contributions
ill+0.028
igsaw+0.025
aney+0.025
kin+0.024
etti+0.024
Neg logit contributions
 asking-0.016
 trying-0.013
 le-0.013
 swoop-0.013
cut-0.013
Token't
Token ID470
Feature activation+5.47e-02

Pos logit contributions
 asking+0.093
 trying+0.087
 going+0.086
 le+0.082
 sc+0.080
Neg logit contributions
ill-0.263
ushes-0.258
kin-0.257
igsaw-0.252
aney-0.250
Token help
Token ID1037
Feature activation+1.57e-02

Pos logit contributions
 asking+0.093
 sc+0.086
 trying+0.085
 z+0.084
 fl+0.083
Neg logit contributions
ushes-0.093
igsaw-0.088
ill-0.087
etti-0.084
kin-0.082
Token but
Token ID475
Feature activation+6.28e-02

Pos logit contributions
 asking+0.018
 trying+0.015
 le+0.014
 sc+0.014
 z+0.014
Neg logit contributions
ill-0.036
igsaw-0.035
ushes-0.034
aney-0.034
kin-0.033
Token smile
Token ID8212
Feature activation-2.56e-01

Pos logit contributions
 asking+0.113
 trying+0.107
 sc+0.102
 fl+0.101
 telling+0.100
Neg logit contributions
ill-0.104
etti-0.102
igsaw-0.098
kin-0.098
ushes-0.097
Token back
Token ID736
Feature activation-2.56e-01

Pos logit contributions
ill+0.551
igsaw+0.506
kin+0.487
aney+0.486
etti+0.481
Neg logit contributions
 asking-0.324
 swoop-0.269
 le-0.268
 trying-0.264
 challenged-0.263
Token.
Token ID13
Feature activation-1.22e-02

Pos logit contributions
ill+0.552
igsaw+0.510
aney+0.485
ushes+0.481
kin+0.481
Neg logit contributions
 asking-0.320
 swoop-0.278
 challenged-0.269
 le-0.264
 trying-0.260
Token
Token ID220
Feature activation-3.24e-02

Pos logit contributions
ill+0.023
kin+0.021
aney+0.020
igsaw+0.020
etti+0.019
Neg logit contributions
 asking-0.020
 trying-0.017
 le-0.017
 z-0.016
 challenged-0.016
Token
Token ID198
Feature activation-3.27e-02

Pos logit contributions
ill+0.063
igsaw+0.062
kin+0.061
etti+0.058
aney+0.057
Neg logit contributions
 asking-0.045
 trying-0.038
 going-0.036
 z-0.036
 le-0.036
Token
Token ID198
Feature activation-3.23e-02

Pos logit contributions
ill+0.049
igsaw+0.044
aney+0.041
kin+0.040
ushes+0.040
Neg logit contributions
 asking-0.063
 swoop-0.056
 le-0.056
 trying-0.055
 gir-0.055
TokenYes
Token ID5297
Feature activation-8.65e-02

Pos logit contributions
 asking+0.162
 le+0.144
 swoop+0.143
 challenged+0.143
 trying+0.142
Neg logit contributions
ill-0.102
igsaw-0.100
kin-0.094
etti-0.093
aney-0.085
Token,
Token ID11
Feature activation-3.05e-02

Pos logit contributions
ill+0.190
igsaw+0.178
kin+0.170
aney+0.169
ushes+0.168
Neg logit contributions
 asking-0.108
 trying-0.088
 le-0.088
 swoop-0.086
 z-0.084
Token honey
Token ID12498
Feature activation-7.01e-03

Pos logit contributions
ill+0.057
igsaw+0.051
aney+0.049
kin+0.049
etti+0.048
Neg logit contributions
 asking-0.048
 swoop-0.042
 challenged-0.041
 le-0.041
 trying-0.041
Token,
Token ID11
Feature activation-3.00e-02

Pos logit contributions
ill+0.016
igsaw+0.016
ushes+0.015
aney+0.014
etti+0.014
Neg logit contributions
 asking-0.008
 le-0.006
 trying-0.006
 z-0.006
 going-0.006
Token you
Token ID345
Feature activation+2.09e-02

Pos logit contributions
ill+0.057
igsaw+0.050
aney+0.050
kin+0.049
ola+0.048
Neg logit contributions
 asking-0.046
 swoop-0.041
 challenged-0.040
 le-0.039
 trying-0.039
Token look
Token ID804
Feature activation-1.78e-01

Pos logit contributions
 asking+0.025
 trying+0.023
 going+0.023
 z+0.022
 swoop+0.022
Neg logit contributions
igsaw-0.043
ill-0.043
etti-0.040
aney-0.040
ushes-0.039
Token very
Token ID845
Feature activation-2.87e-02

Pos logit contributions
ill+0.363
igsaw+0.335
kin+0.321
aney+0.319
etti+0.317
Neg logit contributions
 asking-0.251
 trying-0.211
 le-0.209
 swoop-0.205
 z-0.203
Token nice
Token ID3621
Feature activation-8.53e-02

Pos logit contributions
ill+0.062
igsaw+0.058
etti+0.056
kin+0.055
aney+0.055
Neg logit contributions
 asking-0.037
 trying-0.032
 le-0.031
 sc-0.030
 z-0.030
Token!"
Token ID2474
Feature activation-5.67e-02

Pos logit contributions
ill+0.186
igsaw+0.174
aney+0.166
kin+0.166
etti+0.164
Neg logit contributions
 asking-0.107
 trying-0.087
 le-0.087
 swoop-0.085
 challenged-0.085
Token
Token ID198
Feature activation-3.21e-02

Pos logit contributions
ill+0.092
igsaw+0.087
kin+0.082
aney+0.080
etti+0.079
Neg logit contributions
 asking-0.102
 le-0.089
 trying-0.087
 challenged-0.086
 swoop-0.086
Token
Token ID198
Feature activation-2.92e-02

Pos logit contributions
ill+0.053
igsaw+0.048
aney+0.046
kin+0.046
etti+0.045
Neg logit contributions
 asking-0.057
 swoop-0.050
 le-0.050
 trying-0.050
 challenged-0.049
Token,
Token ID11
Feature activation-2.77e-02

Pos logit contributions
ill+0.102
igsaw+0.086
kin+0.084
etti+0.076
 Kate+0.076
Neg logit contributions
 asking-0.106
 swoop-0.103
 ostr-0.100
 challenged-0.098
have-0.097
Token but
Token ID475
Feature activation+6.56e-02

Pos logit contributions
ill+0.041
 Something+0.034
 Jill+0.033
 Kate+0.032
aney+0.031
Neg logit contributions
 swoop-0.044
 asking-0.043
 gir-0.042
 challenged-0.042
cut-0.042
Token it
Token ID340
Feature activation-6.42e-02

Pos logit contributions
 asking+0.091
 swoop+0.076
 le+0.076
 trying+0.075
 challenged+0.075
Neg logit contributions
ill-0.134
igsaw-0.123
kin-0.120
aney-0.118
etti-0.116
Token kept
Token ID4030
Feature activation-3.73e-02

Pos logit contributions
ill+0.107
kin+0.105
cake+0.085
ola+0.084
aney+0.082
Neg logit contributions
 asking-0.110
 Street-0.087
 gir-0.085
 trying-0.085
 telling-0.085
Token coming
Token ID2406
Feature activation+9.48e-02

Pos logit contributions
ill+0.058
ifying+0.051
kin+0.050
+0.050
 perfectly+0.048
Neg logit contributions
 ostr-0.047
 challenged-0.047
 gir-0.047
 Street-0.046
ans-0.044
Token back
Token ID736
Feature activation-2.53e-01

Pos logit contributions
 asking+0.160
 challenged+0.144
 gir+0.139
 chased+0.138
 le+0.137
Neg logit contributions
ill-0.165
etti-0.131
ifying-0.131
ushes-0.131
kin-0.128
Token.
Token ID13
Feature activation-8.99e-03

Pos logit contributions
ill+0.543
igsaw+0.499
aney+0.478
kin+0.477
etti+0.471
Neg logit contributions
 asking-0.328
 swoop-0.273
 le-0.272
 trying-0.270
 challenged-0.267
Token He
Token ID679
Feature activation+7.45e-03

Pos logit contributions
ill+0.014
igsaw+0.013
aney+0.012
etti+0.012
ushes+0.011
Neg logit contributions
 asking-0.017
 swoop-0.015
 trying-0.015
 le-0.014
 challenged-0.014
Token stopped
Token ID5025
Feature activation+5.89e-02

Pos logit contributions
 asking+0.011
cut+0.009
 going+0.009
 trucks+0.009
 trying+0.009
Neg logit contributions
ill-0.012
kin-0.012
ola-0.011
aney-0.010
 Kate-0.010
Token walking
Token ID6155
Feature activation+1.80e-01

Pos logit contributions
 gir+0.077
 swoop+0.077
cut+0.075
 challenged+0.074
 asking+0.072
Neg logit contributions
ill-0.105
aps-0.087
igsaw-0.082
 Kate-0.080
 Jill-0.079
Token to
Token ID284
Feature activation+1.05e-01

Pos logit contributions
 asking+0.244
 swoop+0.205
 le+0.204
 challenged+0.202
 trying+0.200
Neg logit contributions
ill-0.377
igsaw-0.339
aney-0.324
kin-0.323
ushes-0.321
TokenThe
Token ID464
Feature activation+6.46e-03

Pos logit contributions
ill+0.066
igsaw+0.055
aney+0.054
ushes+0.052
kin+0.052
Neg logit contributions
 asking-0.077
 le-0.067
 trying-0.067
 ostr-0.065
 gir-0.065
Token driver
Token ID4639
Feature activation+4.15e-02

Pos logit contributions
 asking+0.010
perate+0.010
 sneaking+0.009
 challenged+0.009
 chased+0.009
Neg logit contributions
ay-0.008
kin-0.008
ushes-0.008
ill-0.008
 pleasure-0.008
Token smiled
Token ID13541
Feature activation-1.22e-01

Pos logit contributions
 asking+0.060
 le+0.051
 trying+0.050
 swoop+0.049
 going+0.048
Neg logit contributions
ill-0.081
aney-0.070
kin-0.070
etti-0.069
igsaw-0.069
Token and
Token ID290
Feature activation+2.15e-03

Pos logit contributions
ill+0.281
igsaw+0.266
aney+0.255
kin+0.254
etti+0.249
Neg logit contributions
 asking-0.139
 trying-0.112
 swoop-0.110
 le-0.109
 challenged-0.108
Token waved
Token ID26834
Feature activation-2.94e-02

Pos logit contributions
 asking+0.002
 swoop+0.002
 z+0.002
 le+0.002
 challenged+0.002
Neg logit contributions
ill-0.005
igsaw-0.005
aney-0.005
kin-0.005
ushes-0.005
Token back
Token ID736
Feature activation-2.59e-01

Pos logit contributions
ill+0.052
igsaw+0.048
aney+0.045
kin+0.045
etti+0.044
Neg logit contributions
 asking-0.049
 swoop-0.042
 le-0.042
 trying-0.042
 challenged-0.041
Token as
Token ID355
Feature activation-1.86e-02

Pos logit contributions
ill+0.485
igsaw+0.453
ushes+0.423
ifying+0.422
etti+0.418
Neg logit contributions
 asking-0.369
 swoop-0.342
 challenged-0.333
 le-0.323
 chased-0.322
Token Joe
Token ID5689
Feature activation-3.14e-02

Pos logit contributions
ill+0.041
igsaw+0.039
aney+0.037
kin+0.037
etti+0.037
Neg logit contributions
 asking-0.023
 trying-0.019
 le-0.019
 swoop-0.019
 z-0.018
Token flew
Token ID13112
Feature activation+9.62e-02

Pos logit contributions
ill+0.063
aney+0.052
etti+0.052
kin+0.051
ola+0.051
Neg logit contributions
 asking-0.043
cut-0.037
 trucks-0.037
 le-0.037
 trying-0.036
Token off
Token ID572
Feature activation-1.34e-01

Pos logit contributions
 asking+0.155
 le+0.134
 challenged+0.132
 trying+0.132
 swoop+0.130
Neg logit contributions
ill-0.176
igsaw-0.156
kin-0.148
aney-0.148
ushes-0.147
Token into
Token ID656
Feature activation-9.52e-02

Pos logit contributions
ill+0.259
igsaw+0.237
etti+0.223
kin+0.222
aney+0.219
Neg logit contributions
 asking-0.191
 challenged-0.166
 swoop-0.165
 le-0.164
 ostr-0.160
Token,
Token ID11
Feature activation-3.91e-02

Pos logit contributions
ill+0.210
igsaw+0.194
aney+0.186
kin+0.186
etti+0.183
Neg logit contributions
 asking-0.132
 swoop-0.110
 le-0.109
 trying-0.109
 challenged-0.107
Token until
Token ID1566
Feature activation+4.54e-02

Pos logit contributions
ill+0.077
igsaw+0.069
aney+0.067
kin+0.066
etti+0.066
Neg logit contributions
 asking-0.055
 le-0.047
 swoop-0.047
 challenged-0.047
 trying-0.046
Token Lily
Token ID20037
Feature activation+2.54e-03

Pos logit contributions
 asking+0.059
 trying+0.049
 swoop+0.049
 le+0.048
 z+0.047
Neg logit contributions
ill-0.098
igsaw-0.092
aney-0.088
ushes-0.087
kin-0.087
Token started
Token ID2067
Feature activation+4.12e-02

Pos logit contributions
 asking+0.003
cut+0.003
 le+0.003
 swoop+0.003
 trucks+0.003
Neg logit contributions
ill-0.006
aney-0.005
igsaw-0.005
ola-0.004
etti-0.004
Token to
Token ID284
Feature activation+9.99e-02

Pos logit contributions
 asking+0.049
cut+0.047
 gir+0.046
 ostr+0.045
 swoop+0.045
Neg logit contributions
ill-0.085
igsaw-0.074
aney-0.072
ushes-0.071
aps-0.070
Token feel
Token ID1254
Feature activation-1.65e-01

Pos logit contributions
 asking+0.171
 le+0.152
 gir+0.148
cut+0.147
 challenged+0.146
Neg logit contributions
ill-0.167
igsaw-0.143
aney-0.137
etti-0.135
kin-0.133
Token dizz
Token ID29527
Feature activation-3.53e-02

Pos logit contributions
ill+0.327
igsaw+0.303
aney+0.290
kin+0.287
etti+0.287
Neg logit contributions
 asking-0.241
 trying-0.206
 le-0.204
 swoop-0.202
 challenged-0.198
Tokeny
Token ID88
Feature activation+2.78e-02

Pos logit contributions
ill+0.064
igsaw+0.063
kin+0.059
etti+0.059
ushes+0.059
Neg logit contributions
 asking-0.054
 trying-0.046
 swoop-0.045
 le-0.045
 going-0.044
Token.
Token ID13
Feature activation-1.76e-02

Pos logit contributions
 asking+0.040
 swoop+0.036
 challenged+0.036
 le+0.035
 gir+0.034
Neg logit contributions
ill-0.054
igsaw-0.048
kin-0.047
aney-0.045
etti-0.045
Token
Token ID198
Feature activation-3.22e-02

Pos logit contributions
ill+0.027
igsaw+0.024
aney+0.022
etti+0.022
kin+0.022
Neg logit contributions
 asking-0.033
 swoop-0.030
 trying-0.030
 le-0.029
 challenged-0.029
Token
Token ID198
Feature activation-2.89e-02

Pos logit contributions
ill+0.049
igsaw+0.043
aney+0.041
ushes+0.040
kin+0.040
Neg logit contributions
 asking-0.062
 swoop-0.055
 trying-0.054
 le-0.054
 gir-0.054