TOP ACTIVATIONS
MAX = 0.199

bird what it looked like and where it came from .
S ara felt scared and sorry . She took off
Tom looked around and saw a key on the
Ben looked at the coin and then at Lily . He
looked at the big box and the bottle on the floor
?" Anna looked up and stared at her mummy feeling
They looked around and found a long stick on
He looked around the room and saw some things they could
Ben felt sorry and scared . He said ,
and they saw the rain and the lightning . They were
whistle . He looked up and saw the bird in a
Mom looked at the mess and sighed . She picked up
Anna . She felt sad and angry . Where was Anna
They looked around and saw a distant man sitting
face . She looked sad and worried . He didn 't
dance show coming up , and anyone could join . Jenny
saw a bird 's nest and wanted to take a closer
not know grandma very well and he was afraid of her
." They searched and searched and finally found Mitt
They searched and searched and finally found Mitt ens hiding

MIN ACTIVATIONS
MIN = -0.280

sign ' happy ', ' birth day ', and ' you
should be compassionate . Comp assion ate means you care about
Grand ma 's words fore shadow ed a happy future for
L ily had fore shadow ed the danger with her
lucky . Fore shadow ing or setup and payoff
. " Mar riage is when two people love
and wait . The fore shadow ing of the dull clouds
The wave had fore shadow ed L ola 's fate
, but it was fore shadow ed in the beginning when
her beloved sock . Fore shadow ing her bad ending ,
, her mum was fore shadow ing ... the next day
by saying , " Mar riage is when two people who
. She had been fore shadow ed about the power of
heavy bag had been fore shadow ed , and that setup
sky before they walked fore shadow ed the down pour that
cat said , " Mar riage is when two animals love
was the setup that fore shadow ed the bad ending .
. He had been fore shadow ed about this bad ending
that it was a ' shadow ' of her own shape
girl remembered the earlier fore shadow ing , and it saved

INTERVAL 0.079 - 0.199
CONTAINS 17.969%

so happy there . One day , Peter peek ed into
She stepped through the gate , excited to see what was
too quick for him , and he couldn 't find her
figure dreamed of soaring higher than any other figure in the
. One day , they see a big cable

INTERVAL -0.041 - 0.079
CONTAINS 49.347%

. It had some food that was all black , and
to be so friendly and thanked her in return .
a sh aw l . Then , she began to roll
affe ! But right at the end ,
Lily went to a dance competition where she saw a famous

INTERVAL -0.160 - -0.041
CONTAINS 32.415%

bright and makes her eyes hurt . Mom hears
her incredible m ittens . <|endoftext|> Once upon a time ,
it is time to clean your room . I will watch
see if this works ". The duck and the
out of mom my 's pocket . It

INTERVAL -0.280 - -0.160
CONTAINS 0.269%

and independent . <|endoftext|> Once upon a time , there was
very cheerful . <|endoftext|> Once upon a time , there was
spent together . <|endoftext|> Once upon a time , there was
garden bloom . <|endoftext|> Once upon a time , there was
next adventure ! <|endoftext|> Once upon a time there was a
Token bird
Token ID6512
Feature activation+6.12e-02

Pos logit contributions
 biggest+0.213
 two+0.198
 smallest+0.195
ceed+0.194
 exact+0.193
Neg logit contributions
ho-0.240
anged-0.219
 throats-0.216
olph-0.200
keeping-0.196
Token what
Token ID644
Feature activation-3.63e-02

Pos logit contributions
ho+0.103
 throats+0.100
keeping+0.093
anged+0.092
olph+0.092
Neg logit contributions
 biggest-0.104
 two-0.099
 best-0.098
ate-0.097
 teaches-0.096
Token it
Token ID340
Feature activation+3.14e-02

Pos logit contributions
 teaches+0.063
 two+0.061
ceed+0.059
 smallest+0.059
 biggest+0.059
Neg logit contributions
ho-0.063
anged-0.058
 throats-0.058
keeping-0.053
owing-0.052
Token looked
Token ID3114
Feature activation-4.07e-02

Pos logit contributions
ho+0.059
 throats+0.049
anged+0.047
olph+0.046
keeping+0.043
Neg logit contributions
ceed-0.053
 biggest-0.053
 exact-0.050
ate-0.049
 teaches-0.049
Token like
Token ID588
Feature activation+2.17e-02

Pos logit contributions
 biggest+0.075
 exact+0.071
ceed+0.070
 teaches+0.068
ate+0.068
Neg logit contributions
ho-0.069
 throats-0.060
anged-0.057
olph-0.056
keeping-0.054
Token and
Token ID290
Feature activation+1.99e-01

Pos logit contributions
ho+0.042
anged+0.037
 throats+0.037
olph+0.035
keeping+0.033
Neg logit contributions
 biggest-0.034
 two-0.032
 teaches-0.032
ceed-0.031
 most-0.031
Token where
Token ID810
Feature activation-6.96e-02

Pos logit contributions
ho+0.354
 throats+0.307
anged+0.295
olph+0.292
keeping+0.286
Neg logit contributions
 teaches-0.325
 two-0.323
 biggest-0.323
ceed-0.317
 exact-0.308
Token it
Token ID340
Feature activation+3.23e-02

Pos logit contributions
 two+0.132
 biggest+0.132
 teaches+0.125
ceed+0.125
 best+0.124
Neg logit contributions
ho-0.106
anged-0.095
 throats-0.094
olph-0.089
owing-0.086
Token came
Token ID1625
Feature activation-1.81e-02

Pos logit contributions
ho+0.057
 throats+0.050
anged+0.047
keeping+0.046
olph+0.046
Neg logit contributions
 biggest-0.054
ceed-0.054
 teaches-0.053
 exact-0.052
ate-0.052
Token from
Token ID422
Feature activation+2.20e-02

Pos logit contributions
 biggest+0.030
 two+0.029
ceed+0.028
 exact+0.028
 best+0.028
Neg logit contributions
ho-0.031
 throats-0.030
anged-0.028
olph-0.027
keeping-0.025
Token.
Token ID13
Feature activation+6.32e-03

Pos logit contributions
ho+0.040
 throats+0.035
anged+0.033
olph+0.032
keeping+0.030
Neg logit contributions
 biggest-0.037
ceed-0.036
 exact-0.035
ate-0.035
 teaches-0.034
Token\n
Token ID198
Feature activation-1.29e-01

Pos logit contributions
 biggest+0.224
 exact+0.218
 nic+0.214
ceed+0.213
 most+0.199
Neg logit contributions
ho-0.243
olph-0.203
 he-0.196
 throats-0.196
keeping-0.186
TokenS
Token ID50
Feature activation+1.71e-02

Pos logit contributions
 biggest+0.247
ceed+0.239
 exact+0.228
 teaches+0.224
 nic+0.224
Neg logit contributions
ho-0.225
 throats-0.179
olph-0.173
anged-0.164
keeping-0.159
Tokenara
Token ID3301
Feature activation+3.87e-02

Pos logit contributions
ho+0.038
 throats+0.033
anged+0.031
olph+0.031
keeping+0.029
Neg logit contributions
 biggest-0.024
ceed-0.023
 exact-0.022
 teaches-0.021
 smallest-0.021
Token felt
Token ID2936
Feature activation-1.11e-01

Pos logit contributions
ho+0.069
 throats+0.059
anged+0.057
keeping+0.056
olph+0.053
Neg logit contributions
ceed-0.065
 two-0.064
 teaches-0.063
ate-0.062
 biggest-0.061
Token scared
Token ID12008
Feature activation-2.59e-02

Pos logit contributions
 biggest+0.191
ceed+0.189
 exact+0.181
 teaches+0.179
ate+0.175
Neg logit contributions
ho-0.204
 throats-0.172
anged-0.165
olph-0.160
keeping-0.156
Token and
Token ID290
Feature activation+1.99e-01

Pos logit contributions
 two+0.038
 biggest+0.035
 best+0.034
 most+0.032
 teaches+0.032
Neg logit contributions
ho-0.058
 throats-0.048
keeping-0.048
anged-0.047
olph-0.047
Token sorry
Token ID7926
Feature activation+2.80e-02

Pos logit contributions
ho+0.371
 throats+0.314
anged+0.307
keeping+0.307
olph+0.306
Neg logit contributions
 biggest-0.326
ceed-0.326
 teaches-0.308
 exact-0.308
 two-0.306
Token.
Token ID13
Feature activation+6.39e-03

Pos logit contributions
ho+0.058
olph+0.049
keeping+0.048
 throats+0.048
anged+0.048
Neg logit contributions
 biggest-0.040
ceed-0.039
 two-0.039
 teaches-0.038
 best-0.037
Token She
Token ID1375
Feature activation+7.48e-02

Pos logit contributions
ho+0.011
 throats+0.009
anged+0.008
olph+0.008
keeping+0.008
Neg logit contributions
ceed-0.012
 biggest-0.012
 exact-0.011
 teaches-0.011
hest-0.011
Token took
Token ID1718
Feature activation-1.17e-01

Pos logit contributions
ho+0.121
 throats+0.101
anged+0.096
olph+0.092
keeping+0.091
Neg logit contributions
ceed-0.142
 biggest-0.141
 teaches-0.136
 exact-0.136
ate-0.132
Token off
Token ID572
Feature activation+7.39e-02

Pos logit contributions
 biggest+0.202
ceed+0.201
 exact+0.191
 teaches+0.188
ate+0.188
Neg logit contributions
ho-0.212
 throats-0.180
anged-0.170
olph-0.165
keeping-0.160
Token\n
Token ID198
Feature activation-1.25e-01

Pos logit contributions
 biggest+0.071
ceed+0.069
 exact+0.066
 two+0.065
 teaches+0.064
Neg logit contributions
ho-0.146
 throats-0.129
anged-0.123
olph-0.119
keeping-0.119
Token\n
Token ID198
Feature activation-1.26e-01

Pos logit contributions
 biggest+0.216
 exact+0.215
ceed+0.205
 nic+0.203
 smallest+0.194
Neg logit contributions
ho-0.224
 he-0.195
olph-0.195
 him-0.187
anged-0.184
TokenTom
Token ID13787
Feature activation+5.76e-02

Pos logit contributions
 biggest+0.170
ceed+0.162
 exact+0.157
 teaches+0.154
 smallest+0.146
Neg logit contributions
ho-0.279
 throats-0.241
anged-0.233
olph-0.233
keeping-0.222
Token looked
Token ID3114
Feature activation-4.16e-02

Pos logit contributions
ho+0.094
 throats+0.079
anged+0.075
olph+0.072
keeping+0.069
Neg logit contributions
ceed-0.108
 biggest-0.108
 exact-0.104
 teaches-0.102
ate-0.101
Token around
Token ID1088
Feature activation+3.66e-02

Pos logit contributions
 biggest+0.083
ceed+0.080
 exact+0.080
 teaches+0.078
 two+0.077
Neg logit contributions
ho-0.065
 throats-0.054
anged-0.050
keeping-0.049
olph-0.049
Token and
Token ID290
Feature activation+1.99e-01

Pos logit contributions
ho+0.075
 throats+0.066
anged+0.065
keeping+0.065
olph+0.062
Neg logit contributions
 two-0.055
 exact-0.049
 most-0.049
 biggest-0.049
 best-0.047
Token saw
Token ID2497
Feature activation-1.13e-01

Pos logit contributions
ho+0.369
 throats+0.322
keeping+0.302
olph+0.294
anged+0.292
Neg logit contributions
 biggest-0.327
 exact-0.317
ceed-0.310
 two-0.307
 most-0.302
Token a
Token ID257
Feature activation-1.14e-01

Pos logit contributions
ceed+0.158
 biggest+0.152
 exact+0.149
 teaches+0.147
ate+0.144
Neg logit contributions
ho-0.237
 throats-0.209
anged-0.198
olph-0.195
 he-0.187
Token key
Token ID1994
Feature activation+2.66e-02

Pos logit contributions
 biggest+0.145
ceed+0.143
 exact+0.136
 teaches+0.131
ate+0.129
Neg logit contributions
ho-0.255
 throats-0.227
anged-0.220
olph-0.212
keeping-0.208
Token on
Token ID319
Feature activation+1.79e-02

Pos logit contributions
ho+0.044
 throats+0.039
anged+0.036
olph+0.035
keeping+0.034
Neg logit contributions
ceed-0.048
 biggest-0.048
 exact-0.047
 teaches-0.045
ate-0.045
Token the
Token ID262
Feature activation-1.27e-01

Pos logit contributions
ho+0.036
 throats+0.032
anged+0.030
olph+0.029
keeping+0.028
Neg logit contributions
ceed-0.026
 biggest-0.026
 exact-0.025
 teaches-0.025
ate-0.024
TokenBen
Token ID11696
Feature activation+6.19e-02

Pos logit contributions
 biggest+0.200
ceed+0.186
 exact+0.185
 nic+0.177
 teaches+0.176
Neg logit contributions
ho-0.254
 throats-0.217
olph-0.211
anged-0.202
keeping-0.200
Token looked
Token ID3114
Feature activation-4.10e-02

Pos logit contributions
ho+0.092
 throats+0.076
anged+0.070
olph+0.069
keeping+0.064
Neg logit contributions
 biggest-0.126
ceed-0.125
 exact-0.121
ate-0.117
 teaches-0.117
Token at
Token ID379
Feature activation+8.06e-03

Pos logit contributions
 biggest+0.075
 exact+0.070
 two+0.069
ceed+0.068
 most+0.068
Neg logit contributions
ho-0.072
 throats-0.062
keeping-0.060
anged-0.058
olph-0.057
Token the
Token ID262
Feature activation-1.29e-01

Pos logit contributions
ho+0.012
 throats+0.011
olph+0.010
anged+0.010
keeping+0.009
Neg logit contributions
ceed-0.016
 biggest-0.016
 exact-0.015
 teaches-0.015
hest-0.014
Token coin
Token ID10752
Feature activation+3.21e-02

Pos logit contributions
ceed+0.253
 biggest+0.250
 exact+0.241
 teaches+0.238
ate+0.236
Neg logit contributions
ho-0.203
 throats-0.169
anged-0.157
olph-0.153
keeping-0.145
Token and
Token ID290
Feature activation+1.99e-01

Pos logit contributions
ho+0.059
anged+0.057
 throats+0.056
olph+0.054
keeping+0.052
Neg logit contributions
 two-0.051
 biggest-0.050
 best-0.048
 exact-0.047
ate-0.047
Token then
Token ID788
Feature activation+1.09e-01

Pos logit contributions
ho+0.339
 throats+0.286
anged+0.271
olph+0.261
keeping+0.260
Neg logit contributions
 biggest-0.358
ceed-0.353
 exact-0.343
 teaches-0.339
ate-0.336
Token at
Token ID379
Feature activation+7.85e-03

Pos logit contributions
ho+0.191
 throats+0.166
keeping+0.152
anged+0.151
olph+0.147
Neg logit contributions
 biggest-0.189
ceed-0.187
 two-0.181
 teaches-0.180
 exact-0.179
Token Lily
Token ID20037
Feature activation+2.88e-03

Pos logit contributions
ho+0.014
 throats+0.012
olph+0.011
anged+0.011
keeping+0.011
Neg logit contributions
ceed-0.014
 biggest-0.014
 exact-0.013
 teaches-0.013
hest-0.013
Token.
Token ID13
Feature activation+8.99e-03

Pos logit contributions
ho+0.006
 throats+0.005
anged+0.005
olph+0.005
keeping+0.005
Neg logit contributions
 exact-0.004
ceed-0.004
 two-0.004
 teaches-0.004
 biggest-0.004
Token He
Token ID679
Feature activation+8.71e-02

Pos logit contributions
ho+0.013
 throats+0.011
anged+0.010
olph+0.010
 he+0.010
Neg logit contributions
ceed-0.018
 biggest-0.018
 exact-0.017
 teaches-0.016
 smallest-0.016
Token looked
Token ID3114
Feature activation-4.22e-02

Pos logit contributions
ho+0.123
 throats+0.104
anged+0.097
olph+0.094
keeping+0.092
Neg logit contributions
ceed-0.142
 biggest-0.141
 exact-0.136
 teaches-0.135
ate-0.133
Token at
Token ID379
Feature activation+8.02e-03

Pos logit contributions
 biggest+0.073
 most+0.068
 exact+0.068
 two+0.067
 best+0.067
Neg logit contributions
ho-0.079
keeping-0.073
 throats-0.069
olph-0.066
anged-0.065
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
ho+0.014
 throats+0.012
anged+0.012
olph+0.011
keeping+0.011
Neg logit contributions
ceed-0.014
 biggest-0.014
 exact-0.013
 teaches-0.013
ate-0.013
Token big
Token ID1263
Feature activation-5.43e-02

Pos logit contributions
ceed+0.236
ate+0.222
Each+0.220
 biggest+0.210
hest+0.209
Neg logit contributions
ho-0.197
 throats-0.170
olph-0.137
 tongues-0.132
 coughing-0.130
Token box
Token ID3091
Feature activation+9.91e-03

Pos logit contributions
ceed+0.085
 biggest+0.082
 exact+0.079
ate+0.077
 teaches+0.077
Neg logit contributions
ho-0.107
 throats-0.093
anged-0.088
olph-0.086
keeping-0.082
Token and
Token ID290
Feature activation+1.99e-01

Pos logit contributions
ho+0.015
 throats+0.013
olph+0.013
anged+0.013
keeping+0.012
Neg logit contributions
 biggest-0.019
 exact-0.019
ceed-0.018
 teaches-0.018
 two-0.018
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
ho+0.329
 throats+0.277
anged+0.262
olph+0.252
keeping+0.242
Neg logit contributions
ceed-0.369
 biggest-0.368
 exact-0.353
 teaches-0.348
ate-0.345
Token bottle
Token ID9294
Feature activation-3.43e-02

Pos logit contributions
ceed+0.201
ate+0.192
 biggest+0.185
 teaches+0.183
hest+0.182
Neg logit contributions
ho-0.239
 throats-0.211
olph-0.187
anged-0.183
keeping-0.180
Token on
Token ID319
Feature activation+1.82e-02

Pos logit contributions
 biggest+0.058
ceed+0.057
 teaches+0.056
 exact+0.055
 two+0.054
Neg logit contributions
ho-0.060
 throats-0.053
anged-0.050
olph-0.050
keeping-0.049
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
ho+0.032
 throats+0.027
anged+0.025
olph+0.025
keeping+0.024
Neg logit contributions
ceed-0.032
 biggest-0.032
 exact-0.030
 teaches-0.030
ate-0.029
Token floor
Token ID4314
Feature activation+1.57e-02

Pos logit contributions
ceed+0.185
 biggest+0.177
ate+0.172
 exact+0.171
 teaches+0.170
Neg logit contributions
ho-0.256
 throats-0.223
anged-0.206
olph-0.203
keeping-0.201
Token?"
Token ID1701
Feature activation-6.46e-02

Pos logit contributions
ho+0.003
 throats+0.003
anged+0.003
olph+0.003
keeping+0.003
Neg logit contributions
 biggest-0.003
ceed-0.003
 exact-0.003
 teaches-0.003
ate-0.003
Token\n
Token ID198
Feature activation-1.28e-01

Pos logit contributions
ceed+0.095
 biggest+0.094
 exact+0.089
ate+0.086
hest+0.085
Neg logit contributions
ho-0.130
 throats-0.119
olph-0.111
anged-0.109
keeping-0.102
TokenAnna
Token ID31160
Feature activation+4.30e-02

Pos logit contributions
 biggest+0.266
 exact+0.259
 coolest+0.241
 nic+0.240
 prett+0.237
Neg logit contributions
ho-0.161
olph-0.158
 him-0.148
 he-0.145
 to-0.138
Token looked
Token ID3114
Feature activation-4.33e-02

Pos logit contributions
ho+0.071
 throats+0.059
anged+0.056
olph+0.054
keeping+0.051
Neg logit contributions
ceed-0.080
 biggest-0.080
 exact-0.077
 teaches-0.075
ate-0.075
Token up
Token ID510
Feature activation-2.37e-02

Pos logit contributions
 biggest+0.074
 exact+0.071
 two+0.069
 most+0.068
 nic+0.068
Neg logit contributions
ho-0.080
 throats-0.070
keeping-0.070
olph-0.066
anged-0.065
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 two+0.040
 biggest+0.038
 most+0.037
 exact+0.036
 best+0.036
Neg logit contributions
ho-0.043
keeping-0.042
 throats-0.040
olph-0.039
anged-0.037
Token stared
Token ID18484
Feature activation-1.40e-01

Pos logit contributions
ho+0.333
 throats+0.283
anged+0.266
olph+0.257
keeping+0.251
Neg logit contributions
 biggest-0.362
ceed-0.361
 exact-0.348
 teaches-0.340
ate-0.339
Token at
Token ID379
Feature activation+6.07e-03

Pos logit contributions
 biggest+0.254
 exact+0.236
 two+0.236
 best+0.229
 most+0.228
Neg logit contributions
ho-0.253
 throats-0.216
keeping-0.210
olph-0.209
anged-0.203
Token her
Token ID607
Feature activation-2.82e-02

Pos logit contributions
ho+0.011
 throats+0.009
anged+0.009
olph+0.009
keeping+0.008
Neg logit contributions
ceed-0.010
 biggest-0.010
 exact-0.010
 teaches-0.010
ate-0.010
Token mummy
Token ID38567
Feature activation+1.34e-01

Pos logit contributions
 biggest+0.050
ceed+0.050
 exact+0.048
 teaches+0.047
ate+0.047
Neg logit contributions
ho-0.050
 throats-0.042
anged-0.040
olph-0.039
keeping-0.037
Token feeling
Token ID4203
Feature activation-5.89e-02

Pos logit contributions
ho+0.269
 throats+0.246
olph+0.232
anged+0.228
keeping+0.220
Neg logit contributions
 exact-0.194
 two-0.192
 biggest-0.185
ceed-0.182
 teaches-0.180
Token\n
Token ID198
Feature activation-1.24e-01

Pos logit contributions
ceed+0.032
 biggest+0.031
 exact+0.029
 teaches+0.027
 smallest+0.027
Neg logit contributions
ho-0.059
 throats-0.054
anged-0.051
olph-0.051
keeping-0.049
Token\n
Token ID198
Feature activation-1.24e-01

Pos logit contributions
 exact+0.232
 biggest+0.231
ceed+0.223
 nic+0.220
 smallest+0.211
Neg logit contributions
ho-0.200
 he-0.179
olph-0.175
 throats-0.167
 him-0.160
TokenThey
Token ID2990
Feature activation+6.90e-02

Pos logit contributions
 biggest+0.200
ceed+0.192
 exact+0.184
 nic+0.174
 teaches+0.173
Neg logit contributions
ho-0.244
 throats-0.210
olph-0.201
anged-0.196
keeping-0.188
Token looked
Token ID3114
Feature activation-4.00e-02

Pos logit contributions
ho+0.110
 throats+0.095
keeping+0.087
anged+0.087
olph+0.085
Neg logit contributions
ceed-0.129
 biggest-0.127
 exact-0.124
 teaches-0.123
 two-0.123
Token around
Token ID1088
Feature activation+3.58e-02

Pos logit contributions
 biggest+0.069
 two+0.064
 best+0.063
 exact+0.063
 most+0.063
Neg logit contributions
ho-0.077
keeping-0.068
 throats-0.066
olph-0.063
anged-0.061
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.067
anged+0.058
keeping+0.058
 throats+0.057
olph+0.055
Neg logit contributions
 two-0.059
 biggest-0.056
 most-0.055
 teaches-0.054
 exact-0.054
Token found
Token ID1043
Feature activation-1.02e-01

Pos logit contributions
ho+0.411
keeping+0.360
 throats+0.360
olph+0.345
anged+0.334
Neg logit contributions
 biggest-0.270
 two-0.268
 exact-0.267
 most-0.265
 nic-0.258
Token a
Token ID257
Feature activation-1.14e-01

Pos logit contributions
ceed+0.149
 exact+0.140
 teaches+0.136
icious+0.136
hest+0.135
Neg logit contributions
ho-0.207
 throats-0.176
anged-0.173
 he-0.165
ged-0.160
Token long
Token ID890
Feature activation-1.36e-03

Pos logit contributions
 biggest+0.157
ceed+0.153
 exact+0.148
 teaches+0.142
ate+0.140
Neg logit contributions
ho-0.244
 throats-0.216
anged-0.210
olph-0.202
keeping-0.198
Token stick
Token ID4859
Feature activation+2.27e-02

Pos logit contributions
ceed+0.002
 biggest+0.002
 exact+0.002
 teaches+0.002
 two+0.002
Neg logit contributions
ho-0.003
 throats-0.002
anged-0.002
keeping-0.002
olph-0.002
Token on
Token ID319
Feature activation+1.81e-02

Pos logit contributions
ho+0.041
 throats+0.038
anged+0.035
olph+0.034
keeping+0.033
Neg logit contributions
ceed-0.035
 biggest-0.035
 exact-0.035
 teaches-0.034
 two-0.034
Token He
Token ID679
Feature activation+8.67e-02

Pos logit contributions
ho+0.008
 throats+0.006
anged+0.006
 he+0.006
olph+0.005
Neg logit contributions
ceed-0.015
 biggest-0.014
 exact-0.014
hest-0.014
ate-0.014
Token looked
Token ID3114
Feature activation-4.22e-02

Pos logit contributions
ho+0.136
 throats+0.110
anged+0.105
olph+0.103
ged+0.090
Neg logit contributions
 biggest-0.171
ceed-0.170
 exact-0.164
ate-0.158
 teaches-0.157
Token around
Token ID1088
Feature activation+3.60e-02

Pos logit contributions
 biggest+0.083
ceed+0.081
 exact+0.079
 teaches+0.078
ate+0.077
Neg logit contributions
ho-0.067
 throats-0.056
anged-0.052
olph-0.051
keeping-0.050
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
ho+0.068
anged+0.058
 throats+0.058
keeping+0.057
olph+0.055
Neg logit contributions
 two-0.058
 biggest-0.056
 teaches-0.054
 exact-0.054
 most-0.054
Token room
Token ID2119
Feature activation+8.22e-02

Pos logit contributions
ceed+0.184
 biggest+0.174
 exact+0.168
 teaches+0.165
ate+0.162
Neg logit contributions
ho-0.254
 throats-0.224
anged-0.210
olph-0.208
keeping-0.202
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.142
 throats+0.122
anged+0.119
olph+0.119
keeping+0.112
Neg logit contributions
 biggest-0.145
 two-0.140
 exact-0.139
 teaches-0.139
 most-0.137
Token saw
Token ID2497
Feature activation-1.13e-01

Pos logit contributions
ho+0.370
 throats+0.316
keeping+0.296
olph+0.293
anged+0.292
Neg logit contributions
 biggest-0.328
ceed-0.313
 exact-0.312
 two-0.309
 teaches-0.306
Token some
Token ID617
Feature activation-1.15e-01

Pos logit contributions
ceed+0.143
 biggest+0.138
 exact+0.135
 teaches+0.130
ate+0.128
Neg logit contributions
ho-0.251
 throats-0.224
anged-0.214
olph-0.209
keeping-0.201
Token things
Token ID1243
Feature activation+7.84e-02

Pos logit contributions
ceed+0.177
 biggest+0.167
 exact+0.163
ate+0.161
 teaches+0.157
Neg logit contributions
ho-0.221
 throats-0.199
anged-0.187
olph-0.180
keeping-0.171
Token they
Token ID484
Feature activation+8.38e-02

Pos logit contributions
ho+0.157
 throats+0.141
olph+0.133
keeping+0.132
anged+0.128
Neg logit contributions
 two-0.117
 teaches-0.115
 biggest-0.114
 best-0.113
 exact-0.112
Token could
Token ID714
Feature activation+2.83e-02

Pos logit contributions
ho+0.152
 throats+0.132
keeping+0.124
anged+0.124
olph+0.122
Neg logit contributions
 biggest-0.137
 teaches-0.135
 exact-0.132
 two-0.132
ceed-0.131
Token\n
Token ID198
Feature activation-1.28e-01

Pos logit contributions
 biggest+0.043
ceed+0.041
 two+0.041
ate+0.039
 exact+0.039
Neg logit contributions
ho-0.175
 throats-0.160
anged-0.157
keeping-0.152
olph-0.147
Token\n
Token ID198
Feature activation-1.28e-01

Pos logit contributions
 biggest+0.209
 exact+0.200
ceed+0.198
 nic+0.188
 most+0.183
Neg logit contributions
ho-0.250
olph-0.216
 throats-0.206
 he-0.202
 him-0.193
TokenBen
Token ID11696
Feature activation+6.15e-02

Pos logit contributions
 biggest+0.194
ceed+0.191
 exact+0.182
 teaches+0.178
ate+0.172
Neg logit contributions
ho-0.261
 throats-0.224
anged-0.213
olph-0.212
keeping-0.202
Token felt
Token ID2936
Feature activation-1.12e-01

Pos logit contributions
ho+0.102
 throats+0.087
anged+0.083
keeping+0.078
olph+0.077
Neg logit contributions
ceed-0.113
 teaches-0.110
 biggest-0.109
 exact-0.107
ate-0.107
Token sorry
Token ID7926
Feature activation+3.16e-02

Pos logit contributions
ceed+0.216
 biggest+0.209
 exact+0.206
ate+0.205
hest+0.198
Neg logit contributions
ho-0.168
 throats-0.147
anged-0.133
olph-0.128
 he-0.121
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.059
 throats+0.050
keeping+0.048
olph+0.047
anged+0.047
Neg logit contributions
 biggest-0.053
ceed-0.052
 teaches-0.051
 two-0.050
 exact-0.049
Token scared
Token ID12008
Feature activation-2.50e-02

Pos logit contributions
ho+0.378
 throats+0.327
anged+0.312
keeping+0.306
olph+0.303
Neg logit contributions
 biggest-0.320
ceed-0.317
 teaches-0.304
 exact-0.302
 two-0.291
Token.
Token ID13
Feature activation+5.56e-03

Pos logit contributions
 two+0.032
 biggest+0.030
 teaches+0.030
ceed+0.029
 best+0.029
Neg logit contributions
ho-0.056
keeping-0.050
 throats-0.049
anged-0.047
olph-0.046
Token He
Token ID679
Feature activation+8.56e-02

Pos logit contributions
ho+0.010
 throats+0.008
anged+0.008
olph+0.007
keeping+0.007
Neg logit contributions
 biggest-0.010
ceed-0.010
 exact-0.010
 teaches-0.010
ate-0.010
Token said
Token ID531
Feature activation-6.86e-02

Pos logit contributions
ho+0.145
 throats+0.123
anged+0.117
olph+0.112
keeping+0.111
Neg logit contributions
ceed-0.155
 biggest-0.154
 teaches-0.148
 exact-0.148
ate-0.144
Token,
Token ID11
Feature activation+1.47e-01

Pos logit contributions
 two+0.105
 best+0.102
 biggest+0.099
ceed+0.097
 teaches+0.094
Neg logit contributions
keeping-0.133
ho-0.123
 throats-0.120
anged-0.116
gie-0.105
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.080
olph+0.074
keeping+0.072
anged+0.071
 throats+0.069
Neg logit contributions
 biggest-0.088
 two-0.086
 most-0.084
 exact-0.083
 best-0.083
Token they
Token ID484
Feature activation+8.26e-02

Pos logit contributions
ho+0.371
 throats+0.314
olph+0.307
keeping+0.307
anged+0.302
Neg logit contributions
 two-0.308
 biggest-0.308
 most-0.296
 exact-0.293
 teaches-0.292
Token saw
Token ID2497
Feature activation-1.14e-01

Pos logit contributions
ho+0.133
 throats+0.112
anged+0.105
olph+0.101
keeping+0.098
Neg logit contributions
ceed-0.156
 biggest-0.156
 exact-0.150
 teaches-0.148
ate-0.146
Token the
Token ID262
Feature activation-1.24e-01

Pos logit contributions
ceed+0.169
 biggest+0.168
 exact+0.161
 teaches+0.157
ate+0.155
Neg logit contributions
ho-0.233
 throats-0.203
anged-0.193
olph-0.188
keeping-0.182
Token rain
Token ID6290
Feature activation+5.69e-02

Pos logit contributions
ceed+0.167
 biggest+0.166
 exact+0.157
 teaches+0.154
ate+0.152
Neg logit contributions
ho-0.269
 throats-0.237
anged-0.227
olph-0.221
keeping-0.215
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.097
 throats+0.084
olph+0.082
anged+0.081
keeping+0.075
Neg logit contributions
 biggest-0.100
ceed-0.098
 exact-0.095
 teaches-0.093
ate-0.093
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
ho+0.438
anged+0.382
olph+0.377
 throats+0.375
keeping+0.362
Neg logit contributions
 biggest-0.253
 two-0.244
 teaches-0.239
ceed-0.235
 most-0.233
Token lightning
Token ID14357
Feature activation+2.85e-02

Pos logit contributions
 biggest+0.185
ceed+0.181
 exact+0.173
 teaches+0.169
ate+0.166
Neg logit contributions
ho-0.253
 throats-0.220
anged-0.214
olph-0.207
keeping-0.199
Token.
Token ID13
Feature activation+8.77e-03

Pos logit contributions
ho+0.054
 throats+0.047
olph+0.046
anged+0.046
keeping+0.044
Neg logit contributions
 biggest-0.044
ceed-0.044
 teaches-0.043
 exact-0.042
 two-0.041
Token They
Token ID1119
Feature activation+7.32e-02

Pos logit contributions
ho+0.015
 throats+0.013
anged+0.012
olph+0.011
keeping+0.011
Neg logit contributions
ceed-0.016
 biggest-0.016
 exact-0.015
hest-0.015
ate-0.015
Token were
Token ID547
Feature activation-6.93e-02

Pos logit contributions
ho+0.122
 throats+0.103
anged+0.097
olph+0.094
keeping+0.091
Neg logit contributions
ceed-0.134
 biggest-0.134
 exact-0.129
 teaches-0.127
ate-0.126
Token whistle
Token ID16121
Feature activation-1.21e-02

Pos logit contributions
ho+0.120
 throats+0.108
anged+0.101
olph+0.100
keeping+0.098
Neg logit contributions
ceed-0.098
 biggest-0.095
 exact-0.092
 teaches-0.089
ate-0.089
Token.
Token ID13
Feature activation+8.99e-03

Pos logit contributions
ceed+0.023
 biggest+0.023
 exact+0.022
 teaches+0.022
ate+0.022
Neg logit contributions
ho-0.019
 throats-0.016
anged-0.015
olph-0.014
keeping-0.014
Token He
Token ID679
Feature activation+8.51e-02

Pos logit contributions
ho+0.018
 throats+0.016
anged+0.015
olph+0.015
keeping+0.015
Neg logit contributions
ceed-0.013
 biggest-0.013
 teaches-0.012
 exact-0.012
ate-0.012
Token looked
Token ID3114
Feature activation-4.34e-02

Pos logit contributions
ho+0.139
 throats+0.121
anged+0.112
keeping+0.108
olph+0.106
Neg logit contributions
ceed-0.158
 biggest-0.155
 teaches-0.150
 exact-0.148
ate-0.148
Token up
Token ID510
Feature activation-2.29e-02

Pos logit contributions
 biggest+0.061
 most+0.056
 best+0.055
 same+0.054
 two+0.054
Neg logit contributions
ho-0.094
keeping-0.089
 throats-0.088
anged-0.085
olph-0.081
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 two+0.039
 biggest+0.036
 most+0.035
 same+0.034
 exact+0.034
Neg logit contributions
ho-0.042
keeping-0.040
 throats-0.040
anged-0.037
olph-0.037
Token saw
Token ID2497
Feature activation-1.17e-01

Pos logit contributions
ho+0.357
 throats+0.320
anged+0.299
keeping+0.292
olph+0.286
Neg logit contributions
 biggest-0.333
ceed-0.317
 exact-0.314
 two-0.309
 most-0.305
Token the
Token ID262
Feature activation-1.29e-01

Pos logit contributions
ceed+0.168
 biggest+0.166
 exact+0.161
 teaches+0.157
ate+0.155
Neg logit contributions
ho-0.242
 throats-0.210
anged-0.201
olph-0.196
keeping-0.189
Token bird
Token ID6512
Feature activation+6.29e-02

Pos logit contributions
ceed+0.235
 biggest+0.224
 teaches+0.223
 exact+0.221
ate+0.221
Neg logit contributions
ho-0.220
 throats-0.183
olph-0.167
anged-0.166
keeping-0.157
Token in
Token ID287
Feature activation+3.34e-03

Pos logit contributions
ho+0.130
 throats+0.122
anged+0.117
keeping+0.116
olph+0.115
Neg logit contributions
 biggest-0.085
 exact-0.082
 two-0.081
ceed-0.079
 most-0.077
Token a
Token ID257
Feature activation-1.17e-01

Pos logit contributions
ho+0.006
 throats+0.005
olph+0.005
keeping+0.005
anged+0.005
Neg logit contributions
 biggest-0.006
ceed-0.006
 exact-0.005
 teaches-0.005
ate-0.005
TokenMom
Token ID29252
Feature activation+1.09e-01

Pos logit contributions
ceed+0.151
 biggest+0.145
ate+0.139
 exact+0.139
 teaches+0.136
Neg logit contributions
ho-0.301
 throats-0.268
anged-0.259
olph-0.249
keeping-0.245
Token looked
Token ID3114
Feature activation-4.36e-02

Pos logit contributions
ho+0.172
 throats+0.150
keeping+0.137
anged+0.136
olph+0.129
Neg logit contributions
ceed-0.208
 biggest-0.201
 teaches-0.200
 two-0.199
 exact-0.196
Token at
Token ID379
Feature activation+6.87e-03

Pos logit contributions
 biggest+0.089
ceed+0.088
 exact+0.086
 teaches+0.084
ate+0.083
Neg logit contributions
ho-0.065
 throats-0.053
anged-0.050
olph-0.048
keeping-0.047
Token the
Token ID262
Feature activation-1.28e-01

Pos logit contributions
ho+0.013
 throats+0.011
anged+0.011
keeping+0.010
olph+0.010
Neg logit contributions
 biggest-0.011
 two-0.010
ceed-0.010
 exact-0.010
ate-0.010
Token mess
Token ID2085
Feature activation-3.46e-03

Pos logit contributions
 biggest+0.217
ceed+0.216
 exact+0.208
 teaches+0.204
ate+0.200
Neg logit contributions
ho-0.233
 throats-0.199
anged-0.190
olph-0.183
keeping-0.177
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 two+0.006
 teaches+0.006
ceed+0.006
 best+0.005
 exact+0.005
Neg logit contributions
ho-0.006
 throats-0.005
anged-0.005
keeping-0.005
olph-0.005
Token sighed
Token ID21893
Feature activation-2.86e-02

Pos logit contributions
ho+0.358
keeping+0.318
 throats+0.310
anged+0.285
olph+0.284
Neg logit contributions
 two-0.328
 teaches-0.316
ceed-0.309
 biggest-0.308
 exact-0.307
Token.
Token ID13
Feature activation+5.61e-03

Pos logit contributions
 two+0.036
 teaches+0.035
 exact+0.034
ceed+0.034
ate+0.033
Neg logit contributions
ho-0.063
anged-0.054
 throats-0.053
keeping-0.052
olph-0.051
Token She
Token ID1375
Feature activation+7.41e-02

Pos logit contributions
ho+0.011
 throats+0.009
anged+0.009
olph+0.009
keeping+0.008
Neg logit contributions
ceed-0.009
 biggest-0.009
 exact-0.009
 teaches-0.008
ate-0.008
Token picked
Token ID6497
Feature activation-9.13e-02

Pos logit contributions
ho+0.125
 throats+0.105
anged+0.100
olph+0.096
keeping+0.091
Neg logit contributions
ceed-0.136
 biggest-0.136
 exact-0.130
 teaches-0.127
ate-0.127
Token up
Token ID510
Feature activation-2.61e-02

Pos logit contributions
ceed+0.140
 biggest+0.139
 exact+0.133
 teaches+0.130
ate+0.128
Neg logit contributions
ho-0.181
 throats-0.157
anged-0.150
olph-0.145
keeping-0.141
Token Anna
Token ID11735
Feature activation+2.33e-02

Pos logit contributions
ho+0.066
 throats+0.058
olph+0.057
anged+0.057
gie+0.053
Neg logit contributions
 biggest-0.057
 two-0.055
 best-0.055
 most-0.053
 exact-0.052
Token.
Token ID13
Feature activation+8.13e-03

Pos logit contributions
ho+0.045
 throats+0.040
olph+0.037
anged+0.037
keeping+0.037
Neg logit contributions
 biggest-0.034
 exact-0.034
 teaches-0.034
ceed-0.034
 two-0.033
Token She
Token ID1375
Feature activation+7.76e-02

Pos logit contributions
ho+0.013
 throats+0.012
anged+0.010
olph+0.010
keeping+0.010
Neg logit contributions
ceed-0.015
 biggest-0.015
 exact-0.015
ate-0.014
 teaches-0.014
Token felt
Token ID2936
Feature activation-1.13e-01

Pos logit contributions
ho+0.150
keeping+0.135
 throats+0.134
anged+0.126
olph+0.120
Neg logit contributions
 two-0.114
 teaches-0.109
 biggest-0.108
ceed-0.108
 smallest-0.107
Token sad
Token ID6507
Feature activation-2.56e-02

Pos logit contributions
 biggest+0.185
 exact+0.171
ceed+0.171
 smallest+0.170
 two+0.170
Neg logit contributions
ho-0.227
anged-0.187
 throats-0.184
olph-0.181
keeping-0.181
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 biggest+0.041
 two+0.041
 best+0.040
 teaches+0.039
 most+0.039
Neg logit contributions
ho-0.050
 throats-0.041
keeping-0.041
olph-0.040
anged-0.040
Token angry
Token ID7954
Feature activation-1.22e-02

Pos logit contributions
ho+0.364
keeping+0.319
olph+0.313
anged+0.305
 throats+0.300
Neg logit contributions
 two-0.310
 biggest-0.309
ceed-0.305
 teaches-0.302
 exact-0.298
Token.
Token ID13
Feature activation+7.53e-03

Pos logit contributions
 biggest+0.017
 two+0.016
ceed+0.015
 best+0.015
 teaches+0.015
Neg logit contributions
ho-0.027
 throats-0.023
olph-0.022
anged-0.022
keeping-0.021
Token Where
Token ID6350
Feature activation-7.83e-02

Pos logit contributions
ho+0.014
 throats+0.012
olph+0.011
anged+0.011
keeping+0.010
Neg logit contributions
ceed-0.013
 biggest-0.012
 exact-0.012
 teaches-0.012
ate-0.012
Token was
Token ID373
Feature activation-6.60e-02

Pos logit contributions
 exact+0.155
 biggest+0.155
ceed+0.152
ate+0.145
 smallest+0.142
Neg logit contributions
ho-0.120
anged-0.090
 he-0.088
 throats-0.088
olph-0.087
Token Anna
Token ID11735
Feature activation+2.31e-02

Pos logit contributions
 biggest+0.095
ceed+0.093
 teaches+0.088
 exact+0.088
 two+0.088
Neg logit contributions
ho-0.138
 throats-0.120
anged-0.117
olph-0.110
keeping-0.109
Token
Token ID198
Feature activation-1.26e-01

Pos logit contributions
ceed+0.036
 biggest+0.034
 exact+0.033
 smallest+0.031
hest+0.031
Neg logit contributions
ho-0.054
 throats-0.049
olph-0.047
anged-0.046
 he-0.046
Token
Token ID198
Feature activation-1.26e-01

Pos logit contributions
 exact+0.233
 biggest+0.230
 nic+0.222
ceed+0.221
 smallest+0.215
Neg logit contributions
ho-0.210
 he-0.181
olph-0.175
 throats-0.168
 him-0.167
TokenThey
Token ID2990
Feature activation+6.95e-02

Pos logit contributions
 biggest+0.212
ceed+0.198
 exact+0.194
 nic+0.186
 smallest+0.184
Neg logit contributions
ho-0.243
 throats-0.199
olph-0.196
anged-0.188
keeping-0.186
Token looked
Token ID3114
Feature activation-4.04e-02

Pos logit contributions
ho+0.106
 throats+0.090
olph+0.083
keeping+0.083
anged+0.083
Neg logit contributions
ceed-0.133
 biggest-0.131
 two-0.129
 exact-0.128
 teaches-0.126
Token around
Token ID1088
Feature activation+3.62e-02

Pos logit contributions
 biggest+0.070
 two+0.065
 best+0.065
 most+0.065
 exact+0.064
Neg logit contributions
ho-0.076
keeping-0.067
 throats-0.065
olph-0.062
anged-0.060
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.065
keeping+0.056
 throats+0.056
anged+0.056
olph+0.053
Neg logit contributions
 two-0.062
 biggest-0.058
 most-0.057
 teaches-0.057
 exact-0.056
Token saw
Token ID2497
Feature activation-1.11e-01

Pos logit contributions
ho+0.404
keeping+0.360
 throats+0.359
olph+0.344
anged+0.330
Neg logit contributions
 two-0.270
 biggest-0.269
 exact-0.268
 most-0.266
 nic-0.264
Token a
Token ID257
Feature activation-1.14e-01

Pos logit contributions
 biggest+0.120
ceed+0.119
 exact+0.110
 teaches+0.108
ate+0.106
Neg logit contributions
ho-0.270
 throats-0.241
anged-0.233
olph-0.228
keeping-0.222
Token distant
Token ID12899
Feature activation-1.47e-02

Pos logit contributions
 biggest+0.144
ceed+0.133
 exact+0.131
 two+0.129
 smallest+0.124
Neg logit contributions
ho-0.251
 throats-0.230
anged-0.230
keeping-0.217
olph-0.215
Token man
Token ID582
Feature activation+1.01e-01

Pos logit contributions
 biggest+0.019
 exact+0.019
ate+0.018
 two+0.018
 smallest+0.017
Neg logit contributions
ho-0.030
anged-0.028
 throats-0.028
olph-0.027
ged-0.026
Token sitting
Token ID5586
Feature activation-7.89e-02

Pos logit contributions
ho+0.203
 throats+0.192
olph+0.166
keeping+0.164
 tongues+0.162
Neg logit contributions
 exact-0.140
 two-0.137
ate-0.125
ceed-0.123
 biggest-0.122
Token face
Token ID1986
Feature activation+4.93e-02

Pos logit contributions
 two+0.071
 biggest+0.071
 exact+0.070
ceed+0.067
 teaches+0.067
Neg logit contributions
ho-0.103
 throats-0.086
anged-0.085
olph-0.083
keeping-0.078
Token.
Token ID13
Feature activation+7.99e-03

Pos logit contributions
ho+0.094
keeping+0.083
 throats+0.079
olph+0.078
anged+0.078
Neg logit contributions
 two-0.081
 biggest-0.077
 best-0.074
 teaches-0.073
 most-0.073
Token She
Token ID1375
Feature activation+7.61e-02

Pos logit contributions
ho+0.015
 throats+0.013
anged+0.013
keeping+0.012
olph+0.012
Neg logit contributions
 biggest-0.013
ceed-0.012
 teaches-0.012
 exact-0.012
 two-0.012
Token looked
Token ID3114
Feature activation-4.31e-02

Pos logit contributions
ho+0.116
 throats+0.094
anged+0.089
olph+0.087
ged+0.077
Neg logit contributions
ceed-0.155
 biggest-0.154
 exact-0.148
ate-0.143
hest-0.142
Token sad
Token ID6507
Feature activation-2.31e-02

Pos logit contributions
 biggest+0.080
 exact+0.075
 two+0.073
 most+0.073
 teaches+0.072
Neg logit contributions
ho-0.076
keeping-0.065
 throats-0.064
anged-0.060
olph-0.059
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 two+0.032
 best+0.031
 biggest+0.030
 most+0.029
 teaches+0.028
Neg logit contributions
ho-0.051
keeping-0.046
olph-0.045
anged-0.043
 throats-0.041
Token worried
Token ID7960
Feature activation+2.98e-02

Pos logit contributions
ho+0.349
 throats+0.281
keeping+0.277
anged+0.274
olph+0.271
Neg logit contributions
 biggest-0.349
ceed-0.344
 exact-0.339
 teaches-0.338
 two-0.334
Token.
Token ID13
Feature activation+8.48e-03

Pos logit contributions
ho+0.070
keeping+0.062
olph+0.061
anged+0.061
 throats+0.059
Neg logit contributions
 teaches-0.034
 two-0.034
 best-0.032
 biggest-0.031
ceed-0.031
Token He
Token ID679
Feature activation+8.61e-02

Pos logit contributions
ho+0.016
 throats+0.014
anged+0.013
keeping+0.013
olph+0.013
Neg logit contributions
 biggest-0.014
ceed-0.013
 exact-0.013
 two-0.013
 teaches-0.013
Token didn
Token ID1422
Feature activation-4.76e-02

Pos logit contributions
ho+0.150
 throats+0.127
keeping+0.122
anged+0.121
olph+0.116
Neg logit contributions
ceed-0.149
 biggest-0.148
 teaches-0.147
 exact-0.144
 two-0.142
Token't
Token ID470
Feature activation+2.39e-02

Pos logit contributions
 biggest+0.085
 exact+0.083
ceed+0.081
 nic+0.079
 smallest+0.078
Neg logit contributions
ho-0.079
 throats-0.065
olph-0.062
anged-0.061
 he-0.059
Token dance
Token ID9280
Feature activation-2.33e-02

Pos logit contributions
ceed+0.162
 biggest+0.156
 exact+0.150
ate+0.149
 teaches+0.147
Neg logit contributions
ho-0.241
 throats-0.208
anged-0.195
olph-0.194
keeping-0.186
Token show
Token ID905
Feature activation-4.58e-02

Pos logit contributions
ceed+0.037
 biggest+0.036
 exact+0.035
ate+0.034
 teaches+0.033
Neg logit contributions
ho-0.045
 throats-0.038
anged-0.036
olph-0.035
keeping-0.034
Token coming
Token ID2406
Feature activation+4.17e-02

Pos logit contributions
ceed+0.091
 biggest+0.090
 exact+0.088
ate+0.085
 teaches+0.084
Neg logit contributions
ho-0.069
 throats-0.057
anged-0.054
olph-0.051
keeping-0.050
Token up
Token ID510
Feature activation-2.22e-02

Pos logit contributions
ho+0.061
 throats+0.050
anged+0.047
olph+0.045
keeping+0.043
Neg logit contributions
ceed-0.085
 biggest-0.085
 exact-0.082
 teaches-0.081
ate-0.080
Token,
Token ID11
Feature activation+1.49e-01

Pos logit contributions
 biggest+0.043
ceed+0.043
 exact+0.042
 teaches+0.041
ate+0.041
Neg logit contributions
ho-0.035
 throats-0.029
anged-0.028
olph-0.027
keeping-0.025
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.241
 throats+0.202
anged+0.197
olph+0.192
keeping+0.178
Neg logit contributions
 biggest-0.285
 exact-0.277
ceed-0.276
 teaches-0.271
 most-0.268
Token anyone
Token ID2687
Feature activation-6.37e-03

Pos logit contributions
ho+0.318
 throats+0.270
anged+0.252
olph+0.250
keeping+0.235
Neg logit contributions
 biggest-0.377
ceed-0.372
 exact-0.361
 teaches-0.360
 two-0.351
Token could
Token ID714
Feature activation+2.94e-02

Pos logit contributions
 teaches+0.011
ceed+0.010
 biggest+0.010
 exact+0.010
ate+0.010
Neg logit contributions
ho-0.011
 throats-0.010
anged-0.009
keeping-0.009
olph-0.009
Token join
Token ID4654
Feature activation-8.64e-02

Pos logit contributions
ho+0.049
 throats+0.040
anged+0.037
olph+0.036
keeping+0.034
Neg logit contributions
ceed-0.054
 biggest-0.053
 exact-0.051
ate-0.049
 teaches-0.049
Token.
Token ID13
Feature activation+7.30e-03

Pos logit contributions
ceed+0.158
 biggest+0.157
 exact+0.150
 teaches+0.150
ate+0.149
Neg logit contributions
ho-0.146
 throats-0.123
anged-0.116
olph-0.112
keeping-0.106
Token Jenny
Token ID28437
Feature activation+3.10e-02

Pos logit contributions
ho+0.008
 throats+0.007
anged+0.007
 he+0.007
 after+0.006
Neg logit contributions
ceed-0.016
 biggest-0.016
hest-0.015
 smallest-0.015
 exact-0.015
Token saw
Token ID2497
Feature activation-1.15e-01

Pos logit contributions
ho+0.005
 throats+0.004
anged+0.004
olph+0.004
keeping+0.004
Neg logit contributions
ceed-0.005
 biggest-0.005
 exact-0.005
 teaches-0.005
ate-0.005
Token a
Token ID257
Feature activation-1.15e-01

Pos logit contributions
 biggest+0.167
ceed+0.164
 exact+0.155
 teaches+0.152
ate+0.151
Neg logit contributions
ho-0.241
 throats-0.212
anged-0.203
olph-0.196
keeping-0.191
Token bird
Token ID6512
Feature activation+6.31e-02

Pos logit contributions
ceed+0.153
 biggest+0.152
 exact+0.144
 teaches+0.141
ate+0.139
Neg logit contributions
ho-0.250
 throats-0.220
anged-0.211
olph-0.206
keeping-0.200
Token's
Token ID338
Feature activation-4.27e-02

Pos logit contributions
ho+0.113
 throats+0.112
keeping+0.105
anged+0.104
olph+0.100
Neg logit contributions
 two-0.094
 exact-0.094
 biggest-0.092
 most-0.090
ceed-0.088
Token nest
Token ID16343
Feature activation+6.18e-02

Pos logit contributions
 biggest+0.073
ceed+0.071
 exact+0.069
 teaches+0.067
ate+0.066
Neg logit contributions
ho-0.077
 throats-0.067
anged-0.064
olph-0.062
keeping-0.059
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.094
 throats+0.084
anged+0.082
olph+0.079
keeping+0.071
Neg logit contributions
 biggest-0.119
 exact-0.115
ceed-0.115
ate-0.112
 most-0.112
Token wanted
Token ID2227
Feature activation-6.35e-02

Pos logit contributions
ho+0.341
 throats+0.295
anged+0.282
keeping+0.270
olph+0.263
Neg logit contributions
 two-0.346
ceed-0.339
 biggest-0.337
 teaches-0.327
ate-0.321
Token to
Token ID284
Feature activation+1.16e-01

Pos logit contributions
 two+0.128
 biggest+0.126
 most+0.116
 best+0.115
 smallest+0.110
Neg logit contributions
ho-0.100
anged-0.091
 throats-0.087
olph-0.082
keeping-0.078
Token take
Token ID1011
Feature activation-1.16e-01

Pos logit contributions
ho+0.203
 throats+0.174
anged+0.165
olph+0.159
keeping+0.153
Neg logit contributions
ceed-0.204
 biggest-0.204
 exact-0.195
 teaches-0.192
ate-0.191
Token a
Token ID257
Feature activation-1.15e-01

Pos logit contributions
 biggest+0.187
ceed+0.180
 two+0.180
ate+0.170
 exact+0.168
Neg logit contributions
ho-0.233
 throats-0.195
anged-0.190
olph-0.184
keeping-0.173
Token closer
Token ID5699
Feature activation+1.44e-02

Pos logit contributions
 biggest+0.205
ceed+0.191
 exact+0.189
 two+0.184
 most+0.179
Neg logit contributions
ho-0.200
 throats-0.180
anged-0.170
keeping-0.163
olph-0.162
Token not
Token ID407
Feature activation+3.81e-02

Pos logit contributions
 biggest+0.034
ceed+0.034
 exact+0.033
 teaches+0.032
ate+0.032
Neg logit contributions
ho-0.048
 throats-0.041
anged-0.040
olph-0.038
keeping-0.037
Token know
Token ID760
Feature activation-3.23e-02

Pos logit contributions
ho+0.075
 throats+0.064
anged+0.061
olph+0.058
keeping+0.055
Neg logit contributions
ceed-0.061
 teaches-0.060
 biggest-0.060
 exact-0.059
 two-0.057
Token grandma
Token ID49890
Feature activation+1.01e-01

Pos logit contributions
 biggest+0.051
ceed+0.048
 two+0.048
 teaches+0.048
 exact+0.047
Neg logit contributions
ho-0.065
 throats-0.055
anged-0.054
olph-0.051
keeping-0.050
Token very
Token ID845
Feature activation+8.88e-02

Pos logit contributions
ho+0.199
 throats+0.181
anged+0.167
olph+0.157
keeping+0.157
Neg logit contributions
 teaches-0.152
 biggest-0.149
ceed-0.146
 exact-0.144
 best-0.143
Token well
Token ID880
Feature activation+2.09e-02

Pos logit contributions
ho+0.192
 throats+0.168
anged+0.164
olph+0.156
keeping+0.154
Neg logit contributions
 biggest-0.124
ceed-0.123
 teaches-0.117
 exact-0.117
ate-0.114
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
ho+0.041
 throats+0.037
anged+0.035
olph+0.033
keeping+0.031
Neg logit contributions
 biggest-0.031
 teaches-0.031
 exact-0.030
 best-0.029
ceed-0.029
Token he
Token ID339
Feature activation+1.04e-01

Pos logit contributions
ho+0.344
 throats+0.280
anged+0.280
olph+0.276
keeping+0.249
Neg logit contributions
 biggest-0.353
 teaches-0.349
ceed-0.344
 two-0.342
 exact-0.337
Token was
Token ID373
Feature activation-6.04e-02

Pos logit contributions
ho+0.186
 throats+0.163
anged+0.153
keeping+0.149
olph+0.147
Neg logit contributions
ceed-0.174
 teaches-0.173
 biggest-0.172
 exact-0.167
 smallest-0.164
Token afraid
Token ID7787
Feature activation+5.23e-03

Pos logit contributions
ceed+0.105
 biggest+0.105
 exact+0.100
 teaches+0.099
ate+0.097
Neg logit contributions
ho-0.108
 throats-0.092
anged-0.087
olph-0.084
keeping-0.081
Token of
Token ID286
Feature activation+5.58e-03

Pos logit contributions
ho+0.011
anged+0.009
 throats+0.009
olph+0.009
keeping+0.009
Neg logit contributions
 biggest-0.008
 two-0.008
 teaches-0.007
 best-0.007
 most-0.007
Token her
Token ID607
Feature activation-2.69e-02

Pos logit contributions
ho+0.011
 throats+0.009
anged+0.009
olph+0.008
keeping+0.008
Neg logit contributions
 biggest-0.009
ceed-0.009
 exact-0.009
 two-0.009
 teaches-0.009
Token."
Token ID526
Feature activation-2.58e-02

Pos logit contributions
ho+0.047
 throats+0.040
anged+0.037
olph+0.037
keeping+0.036
Neg logit contributions
ceed-0.045
 biggest-0.043
 exact-0.042
 teaches-0.040
ate-0.040
Token
Token ID198
Feature activation-1.29e-01

Pos logit contributions
ceed+0.036
 biggest+0.035
 exact+0.033
 teaches+0.032
ate+0.032
Neg logit contributions
ho-0.055
 throats-0.049
anged-0.047
olph-0.045
keeping-0.044
Token
Token ID198
Feature activation-1.28e-01

Pos logit contributions
 biggest+0.233
ceed+0.224
 exact+0.224
 nic+0.219
 smallest+0.212
Neg logit contributions
ho-0.216
olph-0.186
 he-0.180
 throats-0.177
keeping-0.170
TokenThey
Token ID2990
Feature activation+7.22e-02

Pos logit contributions
 biggest+0.197
ceed+0.185
 exact+0.173
 smallest+0.169
 teaches+0.168
Neg logit contributions
ho-0.256
 throats-0.221
olph-0.204
 after-0.200
keeping-0.198
Token searched
Token ID16499
Feature activation-2.59e-02

Pos logit contributions
ho+0.109
 throats+0.094
olph+0.083
keeping+0.083
anged+0.082
Neg logit contributions
ceed-0.139
 biggest-0.139
 two-0.138
 exact-0.134
ate-0.133
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 biggest+0.042
 best+0.040
 two+0.040
 most+0.039
 exact+0.038
Neg logit contributions
ho-0.049
 throats-0.044
anged-0.042
olph-0.041
keeping-0.041
Token searched
Token ID16499
Feature activation-2.61e-02

Pos logit contributions
ho+0.310
 throats+0.270
olph+0.240
anged+0.234
keeping+0.233
Neg logit contributions
 biggest-0.380
 exact-0.369
 two-0.365
ceed-0.359
 most-0.357
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 biggest+0.047
 best+0.044
 two+0.043
 most+0.043
 exact+0.043
Neg logit contributions
ho-0.045
 throats-0.041
anged-0.039
olph-0.038
keeping-0.036
Token finally
Token ID3443
Feature activation+7.42e-02

Pos logit contributions
ho+0.309
 throats+0.269
olph+0.242
anged+0.238
keeping+0.231
Neg logit contributions
 biggest-0.385
 exact-0.371
ceed-0.365
 two-0.360
 most-0.359
Token found
Token ID1043
Feature activation-1.03e-01

Pos logit contributions
ho+0.108
 throats+0.084
anged+0.084
olph+0.078
ged+0.073
Neg logit contributions
ceed-0.157
 biggest-0.152
 teaches-0.147
 exact-0.147
ate-0.142
Token Mitt
Token ID16627
Feature activation+3.70e-03

Pos logit contributions
ceed+0.194
 biggest+0.183
 teaches+0.182
 exact+0.180
hest+0.175
Neg logit contributions
ho-0.173
 throats-0.143
anged-0.136
olph-0.128
keeping-0.127
Token
Token ID198
Feature activation-1.28e-01

Pos logit contributions
 biggest+0.233
ceed+0.224
 exact+0.224
 nic+0.219
 smallest+0.212
Neg logit contributions
ho-0.216
olph-0.186
 he-0.180
 throats-0.177
keeping-0.170
TokenThey
Token ID2990
Feature activation+7.22e-02

Pos logit contributions
 biggest+0.197
ceed+0.185
 exact+0.173
 smallest+0.169
 teaches+0.168
Neg logit contributions
ho-0.256
 throats-0.221
olph-0.204
 after-0.200
keeping-0.198
Token searched
Token ID16499
Feature activation-2.59e-02

Pos logit contributions
ho+0.109
 throats+0.094
olph+0.083
keeping+0.083
anged+0.082
Neg logit contributions
ceed-0.139
 biggest-0.139
 two-0.138
 exact-0.134
ate-0.133
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 biggest+0.042
 best+0.040
 two+0.040
 most+0.039
 exact+0.038
Neg logit contributions
ho-0.049
 throats-0.044
anged-0.042
olph-0.041
keeping-0.041
Token searched
Token ID16499
Feature activation-2.61e-02

Pos logit contributions
ho+0.310
 throats+0.270
olph+0.240
anged+0.234
keeping+0.233
Neg logit contributions
 biggest-0.380
 exact-0.369
 two-0.365
ceed-0.359
 most-0.357
Token and
Token ID290
Feature activation+1.98e-01

Pos logit contributions
 biggest+0.047
 best+0.044
 two+0.043
 most+0.043
 exact+0.043
Neg logit contributions
ho-0.045
 throats-0.041
anged-0.039
olph-0.038
keeping-0.036
Token finally
Token ID3443
Feature activation+7.42e-02

Pos logit contributions
ho+0.309
 throats+0.269
olph+0.242
anged+0.238
keeping+0.231
Neg logit contributions
 biggest-0.385
 exact-0.371
ceed-0.365
 two-0.360
 most-0.359
Token found
Token ID1043
Feature activation-1.03e-01

Pos logit contributions
ho+0.108
 throats+0.084
anged+0.084
olph+0.078
ged+0.073
Neg logit contributions
ceed-0.157
 biggest-0.152
 teaches-0.147
 exact-0.147
ate-0.142
Token Mitt
Token ID16627
Feature activation+3.70e-03

Pos logit contributions
ceed+0.194
 biggest+0.183
 teaches+0.182
 exact+0.180
hest+0.175
Neg logit contributions
ho-0.173
 throats-0.143
anged-0.136
olph-0.128
keeping-0.127
Tokenens
Token ID641
Feature activation+7.74e-02

Pos logit contributions
ho+0.004
 throats+0.003
anged+0.003
olph+0.003
keeping+0.003
Neg logit contributions
 biggest-0.009
ceed-0.009
 exact-0.008
 teaches-0.008
 two-0.008
Token hiding
Token ID11816
Feature activation-5.09e-02

Pos logit contributions
ho+0.137
 throats+0.121
anged+0.111
olph+0.109
keeping+0.102
Neg logit contributions
 biggest-0.133
ceed-0.130
 exact-0.129
 two-0.126
 teaches-0.123
Token sign
Token ID1051
Feature activation-1.63e-03

Pos logit contributions
ho+0.218
 throats+0.189
anged+0.180
olph+0.175
keeping+0.169
Neg logit contributions
ceed-0.167
 biggest-0.166
 exact-0.158
 teaches-0.154
ate-0.152
Token '
Token ID705
Feature activation-3.07e-02

Pos logit contributions
 biggest+0.003
ceed+0.003
 teaches+0.003
 exact+0.003
ate+0.003
Neg logit contributions
ho-0.003
 throats-0.003
anged-0.002
olph-0.002
keeping-0.002
Tokenhappy
Token ID34191
Feature activation-5.04e-02

Pos logit contributions
 biggest+0.047
ceed+0.047
 exact+0.045
 teaches+0.044
ate+0.043
Neg logit contributions
ho-0.060
 throats-0.052
anged-0.050
olph-0.049
keeping-0.047
Token',
Token ID3256
Feature activation-2.29e-03

Pos logit contributions
ceed+0.072
 biggest+0.072
 exact+0.068
 teaches+0.066
ate+0.065
Neg logit contributions
ho-0.105
 throats-0.092
anged-0.087
olph-0.085
keeping-0.083
Token '
Token ID705
Feature activation-3.06e-02

Pos logit contributions
ceed+0.005
 biggest+0.005
 exact+0.005
 teaches+0.005
ate+0.005
Neg logit contributions
ho-0.003
 throats-0.003
anged-0.002
olph-0.002
keeping-0.002
Tokenbirth
Token ID24280
Feature activation-2.80e-01

Pos logit contributions
 biggest+0.045
ceed+0.045
 exact+0.042
 teaches+0.042
ate+0.040
Neg logit contributions
ho-0.063
 throats-0.054
anged-0.052
olph-0.051
keeping-0.049
Tokenday
Token ID820
Feature activation+4.55e-02

Pos logit contributions
 biggest+0.457
 exact+0.439
ceed+0.423
hest+0.404
 smallest+0.396
Neg logit contributions
ho-0.521
 throats-0.453
olph-0.424
keeping-0.417
anged-0.415
Token',
Token ID3256
Feature activation-1.42e-03

Pos logit contributions
ho+0.083
 throats+0.072
anged+0.065
olph+0.065
keeping+0.063
Neg logit contributions
 biggest-0.076
ceed-0.075
 exact-0.071
ate-0.069
hest-0.069
Token and
Token ID290
Feature activation+1.87e-01

Pos logit contributions
ceed+0.003
 biggest+0.003
 exact+0.003
 teaches+0.003
ate+0.003
Neg logit contributions
ho-0.002
 throats-0.002
anged-0.002
olph-0.001
keeping-0.001
Token '
Token ID705
Feature activation-3.01e-02

Pos logit contributions
ho+0.301
 throats+0.251
anged+0.239
olph+0.231
keeping+0.223
Neg logit contributions
 biggest-0.352
ceed-0.351
 exact-0.340
 teaches-0.339
ate-0.331
Tokenyou
Token ID5832
Feature activation+1.09e-02

Pos logit contributions
ceed+0.043
 biggest+0.043
 exact+0.041
 teaches+0.040
ate+0.040
Neg logit contributions
ho-0.062
 throats-0.054
anged-0.052
olph-0.051
keeping-0.049
Token should
Token ID815
Feature activation+5.95e-02

Pos logit contributions
ceed+0.007
 biggest+0.007
 exact+0.006
 teaches+0.006
ate+0.006
Neg logit contributions
ho-0.012
 throats-0.011
anged-0.010
olph-0.010
keeping-0.010
Token be
Token ID307
Feature activation-3.95e-02

Pos logit contributions
ho+0.122
 throats+0.105
anged+0.105
olph+0.098
keeping+0.097
Neg logit contributions
ceed-0.088
 biggest-0.085
 teaches-0.084
 exact-0.083
ate-0.082
Token compassionate
Token ID32533
Feature activation-1.55e-02

Pos logit contributions
 biggest+0.063
ceed+0.063
 exact+0.063
hest+0.061
ate+0.058
Neg logit contributions
ho-0.068
 throats-0.063
anged-0.059
olph-0.058
ged-0.055
Token.
Token ID13
Feature activation+8.07e-03

Pos logit contributions
ceed+0.024
 biggest+0.024
 teaches+0.023
 two+0.023
ate+0.022
Neg logit contributions
ho-0.030
 throats-0.026
anged-0.025
olph-0.024
ged-0.022
Token Comp
Token ID3082
Feature activation+1.22e-02

Pos logit contributions
ho+0.019
 throats+0.017
anged+0.016
olph+0.016
keeping+0.015
Neg logit contributions
 biggest-0.010
ceed-0.010
 exact-0.009
 teaches-0.008
ate-0.008
Tokenassion
Token ID11857
Feature activation-2.57e-01

Pos logit contributions
ho+0.027
anged+0.022
 throats+0.021
olph+0.020
keeping+0.020
Neg logit contributions
 biggest-0.018
 exact-0.016
ceed-0.016
 smallest-0.016
 most-0.016
Tokenate
Token ID378
Feature activation-6.72e-02

Pos logit contributions
 exact+0.342
 nic+0.314
 biggest+0.310
ceed+0.297
 BEST+0.275
Neg logit contributions
ho-0.514
 throats-0.474
away-0.461
olph-0.457
keeping-0.447
Token means
Token ID1724
Feature activation-8.96e-02

Pos logit contributions
 biggest+0.060
ceed+0.059
 teaches+0.056
 exact+0.054
ate+0.053
Neg logit contributions
ho-0.175
 throats-0.158
anged-0.154
olph-0.150
keeping-0.146
Token you
Token ID345
Feature activation+5.55e-02

Pos logit contributions
 biggest+0.146
ceed+0.145
 exact+0.138
 teaches+0.138
ate+0.134
Neg logit contributions
ho-0.168
 throats-0.144
anged-0.138
olph-0.133
keeping-0.130
Token care
Token ID1337
Feature activation+6.26e-02

Pos logit contributions
ho+0.110
 throats+0.096
anged+0.092
olph+0.089
keeping+0.085
Neg logit contributions
ceed-0.084
 biggest-0.083
 exact-0.080
ate-0.077
 teaches-0.074
Token about
Token ID546
Feature activation+1.70e-02

Pos logit contributions
ho+0.086
 throats+0.071
 after+0.065
anged+0.063
 he+0.063
Neg logit contributions
 biggest-0.123
 exact-0.121
ceed-0.121
 nic-0.120
V-0.115
TokenGrand
Token ID23581
Feature activation-9.24e-03

Pos logit contributions
 biggest+0.199
ceed+0.179
 exact+0.179
 smallest+0.168
 nic+0.167
Neg logit contributions
ho-0.296
 throats-0.244
olph-0.241
 he-0.234
anged-0.232
Tokenma
Token ID2611
Feature activation+1.06e-01

Pos logit contributions
ceed+0.017
 biggest+0.017
 exact+0.016
 teaches+0.016
ate+0.015
Neg logit contributions
ho-0.015
 throats-0.013
anged-0.013
olph-0.012
keeping-0.012
Token's
Token ID338
Feature activation-5.40e-02

Pos logit contributions
ho+0.180
 throats+0.165
anged+0.148
keeping+0.146
olph+0.140
Neg logit contributions
 teaches-0.183
ceed-0.179
 biggest-0.178
 two-0.178
ate-0.174
Token words
Token ID2456
Feature activation-1.14e-03

Pos logit contributions
 biggest+0.078
ceed+0.077
 exact+0.074
 teaches+0.073
ate+0.071
Neg logit contributions
ho-0.112
 throats-0.098
anged-0.094
olph-0.091
keeping-0.089
Token fore
Token ID1674
Feature activation+2.90e-02

Pos logit contributions
ceed+0.002
 biggest+0.002
 exact+0.002
 teaches+0.002
ate+0.002
Neg logit contributions
ho-0.002
 throats-0.002
anged-0.002
olph-0.002
keeping-0.001
Tokenshadow
Token ID19106
Feature activation-2.28e-01

Pos logit contributions
ho+0.056
 throats+0.048
anged+0.047
olph+0.045
keeping+0.043
Neg logit contributions
 biggest-0.045
ceed-0.045
 exact-0.043
 teaches-0.042
ate-0.042
Tokened
Token ID276
Feature activation-3.17e-02

Pos logit contributions
 biggest+0.501
 exact+0.495
ceed+0.493
 teaches+0.476
 smallest+0.466
Neg logit contributions
ho-0.298
anged-0.235
 throats-0.218
olph-0.204
keeping-0.189
Token a
Token ID257
Feature activation-1.25e-01

Pos logit contributions
 biggest+0.059
 two+0.057
 most+0.057
 teaches+0.056
 exact+0.056
Neg logit contributions
ho-0.050
 throats-0.045
olph-0.042
anged-0.041
keeping-0.039
Token happy
Token ID3772
Feature activation-1.15e-02

Pos logit contributions
 biggest+0.167
ceed+0.164
 exact+0.156
 teaches+0.155
 two+0.148
Neg logit contributions
ho-0.266
 throats-0.240
anged-0.233
olph-0.224
keeping-0.219
Token future
Token ID2003
Feature activation+1.48e-02

Pos logit contributions
 biggest+0.014
 two+0.014
 teaches+0.014
 exact+0.014
ceed+0.013
Neg logit contributions
ho-0.024
anged-0.022
 throats-0.022
olph-0.021
keeping-0.020
Token for
Token ID329
Feature activation-5.85e-02

Pos logit contributions
ho+0.025
 throats+0.022
anged+0.020
olph+0.019
keeping+0.018
Neg logit contributions
ceed-0.027
 biggest-0.026
 teaches-0.026
 exact-0.026
ate-0.025
Token
Token ID198
Feature activation-1.39e-01

Pos logit contributions
 biggest+0.271
 exact+0.270
ceed+0.262
 nic+0.252
 quickest+0.251
Neg logit contributions
ho-0.194
 he-0.179
olph-0.163
 him-0.158
 they-0.158
TokenL
Token ID43
Feature activation-9.29e-02

Pos logit contributions
 biggest+0.253
ceed+0.222
 exact+0.217
 nic+0.214
 smallest+0.214
Neg logit contributions
ho-0.219
 after-0.188
 he-0.182
ged-0.177
olph-0.172
Tokenily
Token ID813
Feature activation+2.12e-02

Pos logit contributions
 biggest+0.136
ceed+0.123
 exact+0.121
 nic+0.116
 smallest+0.111
Neg logit contributions
ho-0.187
olph-0.166
anged-0.165
keeping-0.163
 throats-0.160
Token had
Token ID550
Feature activation-4.59e-02

Pos logit contributions
ho+0.035
 throats+0.031
anged+0.029
keeping+0.028
olph+0.028
Neg logit contributions
 teaches-0.037
 biggest-0.037
 two-0.037
ceed-0.037
 exact-0.036
Token fore
Token ID1674
Feature activation+3.11e-02

Pos logit contributions
ceed+0.082
 biggest+0.078
ate+0.075
 exact+0.073
hest+0.073
Neg logit contributions
ho-0.077
 throats-0.069
anged-0.063
keeping-0.061
olph-0.060
Tokenshadow
Token ID19106
Feature activation-2.27e-01

Pos logit contributions
ho+0.061
 throats+0.053
anged+0.051
olph+0.049
keeping+0.048
Neg logit contributions
 biggest-0.047
ceed-0.047
 exact-0.045
 teaches-0.044
ate-0.044
Tokened
Token ID276
Feature activation-3.06e-02

Pos logit contributions
 exact+0.514
 biggest+0.512
ceed+0.509
 teaches+0.494
 smallest+0.480
Neg logit contributions
ho-0.285
anged-0.219
 throats-0.204
olph-0.188
ged-0.173
Token the
Token ID262
Feature activation-1.34e-01

Pos logit contributions
 biggest+0.051
 two+0.049
 most+0.049
 teaches+0.048
 exact+0.048
Neg logit contributions
ho-0.055
 throats-0.048
olph-0.047
anged-0.047
keeping-0.042
Token danger
Token ID3514
Feature activation-9.96e-03

Pos logit contributions
ceed+0.220
 biggest+0.200
ate+0.200
hest+0.194
 teaches+0.194
Neg logit contributions
ho-0.262
 throats-0.215
anged-0.204
olph-0.200
keeping-0.193
Token with
Token ID351
Feature activation-6.74e-02

Pos logit contributions
 biggest+0.018
ceed+0.018
 teaches+0.018
 exact+0.017
 two+0.017
Neg logit contributions
ho-0.016
 throats-0.015
anged-0.014
olph-0.013
keeping-0.012
Token her
Token ID607
Feature activation-3.19e-02

Pos logit contributions
 biggest+0.093
ceed+0.092
 exact+0.088
 teaches+0.086
ate+0.085
Neg logit contributions
ho-0.143
 throats-0.125
anged-0.121
olph-0.118
keeping-0.114
Token lucky
Token ID9670
Feature activation+3.89e-03

Pos logit contributions
ho+0.406
anged+0.347
 throats+0.347
olph+0.340
keeping+0.314
Neg logit contributions
 biggest-0.262
ceed-0.258
 teaches-0.252
 exact-0.245
ate-0.240
Token.
Token ID13
Feature activation-4.92e-03

Pos logit contributions
ho+0.007
 throats+0.006
anged+0.006
olph+0.006
keeping+0.006
Neg logit contributions
 biggest-0.006
ceed-0.006
 teaches-0.006
 two-0.006
 best-0.006
Token
Token ID198
Feature activation-1.42e-01

Pos logit contributions
 biggest+0.008
ceed+0.008
 exact+0.007
hest+0.007
 nic+0.007
Neg logit contributions
ho-0.009
 throats-0.008
olph-0.008
keeping-0.008
anged-0.007
Token
Token ID198
Feature activation-1.42e-01

Pos logit contributions
 exact+0.261
 biggest+0.261
ceed+0.251
 nic+0.245
 quickest+0.243
Neg logit contributions
ho-0.242
 throats-0.185
 he-0.172
 him-0.171
olph-0.170
TokenFore
Token ID16351
Feature activation-2.50e-03

Pos logit contributions
 biggest+0.229
 nic+0.200
ceed+0.199
 exact+0.198
 smallest+0.185
Neg logit contributions
ho-0.291
 he-0.220
 throats-0.217
olph-0.214
 him-0.204
Tokenshadow
Token ID19106
Feature activation-2.27e-01

Pos logit contributions
 biggest+0.005
 smallest+0.005
 exact+0.005
 teaches+0.005
ceed+0.005
Neg logit contributions
ho-0.004
anged-0.003
owing-0.003
ged-0.003
olph-0.003
Tokening
Token ID278
Feature activation-6.25e-03

Pos logit contributions
 exact+0.542
 biggest+0.525
ceed+0.512
 nic+0.495
 smallest+0.494
Neg logit contributions
ho-0.280
anged-0.209
 throats-0.196
olph-0.174
owing-0.163
Token or
Token ID393
Feature activation+1.74e-01

Pos logit contributions
 teaches+0.011
 biggest+0.011
ceed+0.011
 two+0.011
 exact+0.011
Neg logit contributions
ho-0.010
 throats-0.009
anged-0.009
olph-0.008
keeping-0.008
Token setup
Token ID9058
Feature activation-6.91e-02

Pos logit contributions
ho+0.346
 throats+0.297
anged+0.289
olph+0.283
keeping+0.274
Neg logit contributions
ceed-0.261
 biggest-0.258
 exact-0.248
 teaches-0.246
 two-0.243
Token and
Token ID290
Feature activation+1.87e-01

Pos logit contributions
ceed+0.142
 biggest+0.142
 exact+0.136
 teaches+0.133
ate+0.133
Neg logit contributions
ho-0.100
 throats-0.082
anged-0.076
olph-0.073
keeping-0.070
Token payoff
Token ID43843
Feature activation-3.17e-02

Pos logit contributions
ho+0.383
 throats+0.334
anged+0.319
olph+0.311
keeping+0.301
Neg logit contributions
ceed-0.273
 biggest-0.272
 exact-0.259
 teaches-0.254
ate-0.251
Token.
Token ID13
Feature activation-2.66e-03

Pos logit contributions
ho+0.186
 throats+0.172
keeping+0.156
anged+0.152
olph+0.150
Neg logit contributions
 two-0.138
 biggest-0.136
 exact-0.134
ceed-0.133
 teaches-0.128
Token\n
Token ID198
Feature activation-1.37e-01

Pos logit contributions
ceed+0.003
 biggest+0.003
 exact+0.003
 teaches+0.003
ate+0.003
Neg logit contributions
ho-0.006
 throats-0.005
anged-0.005
olph-0.005
keeping-0.005
Token\n
Token ID198
Feature activation-1.37e-01

Pos logit contributions
 biggest+0.203
 exact+0.198
ceed+0.191
 teaches+0.183
 nic+0.182
Neg logit contributions
ho-0.284
olph-0.244
 he-0.232
 throats-0.222
anged-0.219
Token"
Token ID1
Feature activation-1.66e-02

Pos logit contributions
ceed+0.213
 biggest+0.211
 exact+0.201
 teaches+0.197
ate+0.196
Neg logit contributions
ho-0.268
 throats-0.234
anged-0.223
olph-0.215
keeping-0.209
TokenMar
Token ID7676
Feature activation+4.43e-02

Pos logit contributions
 biggest+0.022
ceed+0.022
 exact+0.020
 teaches+0.020
ate+0.019
Neg logit contributions
ho-0.037
 throats-0.032
olph-0.031
anged-0.030
keeping-0.029
Tokenriage
Token ID4087
Feature activation-2.25e-01

Pos logit contributions
ho+0.066
 throats+0.062
anged+0.054
keeping+0.051
olph+0.050
Neg logit contributions
 exact-0.084
 biggest-0.083
ceed-0.082
ate-0.082
 two-0.079
Token is
Token ID318
Feature activation-4.72e-02

Pos logit contributions
ceed+0.328
 biggest+0.319
 exact+0.315
V+0.289
 smallest+0.283
Neg logit contributions
ho-0.424
 throats-0.356
keeping-0.350
 he-0.339
anged-0.333
Token when
Token ID618
Feature activation+3.57e-02

Pos logit contributions
ceed+0.096
 biggest+0.095
 exact+0.092
 teaches+0.090
ate+0.090
Neg logit contributions
ho-0.070
 throats-0.057
anged-0.053
olph-0.051
keeping-0.049
Token two
Token ID734
Feature activation-7.79e-02

Pos logit contributions
ho+0.073
anged+0.064
 throats+0.063
olph+0.059
keeping+0.057
Neg logit contributions
 biggest-0.054
 two-0.053
 most-0.051
 smallest-0.050
 best-0.049
Token people
Token ID661
Feature activation+6.14e-02

Pos logit contributions
 biggest+0.116
ceed+0.116
 exact+0.111
 teaches+0.109
ate+0.107
Neg logit contributions
ho-0.157
 throats-0.136
anged-0.130
olph-0.126
keeping-0.122
Token love
Token ID1842
Feature activation+1.58e-02

Pos logit contributions
ho+0.123
 throats+0.109
anged+0.102
olph+0.100
keeping+0.099
Neg logit contributions
 biggest-0.090
ceed-0.090
 teaches-0.089
 exact-0.087
 two-0.085
Token and
Token ID290
Feature activation+1.88e-01

Pos logit contributions
 biggest+0.093
ceed+0.091
 exact+0.090
 teaches+0.087
 two+0.086
Neg logit contributions
ho-0.085
 throats-0.075
anged-0.071
olph-0.069
keeping-0.064
Token wait
Token ID4043
Feature activation+2.42e-02

Pos logit contributions
ho+0.383
olph+0.330
 throats+0.329
anged+0.326
gie+0.295
Neg logit contributions
 biggest-0.278
 two-0.271
ceed-0.270
 exact-0.268
 most-0.262
Token.
Token ID13
Feature activation-2.29e-03

Pos logit contributions
ho+0.043
 throats+0.037
anged+0.036
olph+0.035
keeping+0.033
Neg logit contributions
 biggest-0.042
ceed-0.041
 exact-0.041
 teaches-0.040
 two-0.038
Token The
Token ID383
Feature activation-1.50e-01

Pos logit contributions
ceed+0.004
 biggest+0.004
 exact+0.003
 teaches+0.003
ate+0.003
Neg logit contributions
ho-0.004
 throats-0.004
anged-0.004
olph-0.003
keeping-0.003
Token fore
Token ID1674
Feature activation+3.28e-02

Pos logit contributions
 biggest+0.232
 two+0.218
ceed+0.212
 exact+0.209
 smallest+0.209
Neg logit contributions
ho-0.312
anged-0.283
 throats-0.276
olph-0.257
keeping-0.253
Tokenshadow
Token ID19106
Feature activation-2.25e-01

Pos logit contributions
ho+0.060
anged+0.050
 throats+0.049
olph+0.047
keeping+0.045
Neg logit contributions
 biggest-0.055
ceed-0.053
 teaches-0.051
 exact-0.051
 smallest-0.051
Tokening
Token ID278
Feature activation-3.57e-03

Pos logit contributions
 biggest+0.515
 exact+0.512
ceed+0.500
 teaches+0.491
 smallest+0.482
Neg logit contributions
ho-0.277
anged-0.213
 throats-0.192
olph-0.176
ged-0.162
Token of
Token ID286
Feature activation-2.03e-03

Pos logit contributions
 teaches+0.006
 biggest+0.006
ceed+0.006
 exact+0.006
 two+0.006
Neg logit contributions
ho-0.006
 throats-0.006
anged-0.006
olph-0.005
keeping-0.005
Token the
Token ID262
Feature activation-1.36e-01

Pos logit contributions
 biggest+0.003
ceed+0.003
 exact+0.003
 teaches+0.003
ate+0.003
Neg logit contributions
ho-0.004
 throats-0.004
anged-0.004
olph-0.003
keeping-0.003
Token dull
Token ID19222
Feature activation-5.33e-02

Pos logit contributions
 biggest+0.212
 two+0.195
 exact+0.195
 smallest+0.192
ceed+0.192
Neg logit contributions
ho-0.277
anged-0.250
 throats-0.247
olph-0.230
ged-0.223
Token clouds
Token ID15114
Feature activation+5.49e-02

Pos logit contributions
 biggest+0.086
 exact+0.082
 two+0.082
 smallest+0.080
 teaches+0.079
Neg logit contributions
ho-0.103
anged-0.092
 throats-0.091
olph-0.083
keeping-0.080
Token\n
Token ID198
Feature activation-1.38e-01

Pos logit contributions
 biggest+0.265
 exact+0.253
 nic+0.246
ceed+0.244
 smallest+0.234
Neg logit contributions
ho-0.219
olph-0.180
 he-0.180
anged-0.179
ged-0.171
TokenThe
Token ID464
Feature activation-1.49e-01

Pos logit contributions
 biggest+0.242
ceed+0.208
 exact+0.204
 nic+0.204
 smallest+0.199
Neg logit contributions
ho-0.241
 throats-0.192
anged-0.192
 after-0.186
olph-0.183
Token wave
Token ID6769
Feature activation-4.41e-02

Pos logit contributions
 biggest+0.196
ceed+0.195
 exact+0.184
 teaches+0.181
ate+0.177
Neg logit contributions
ho-0.331
 throats-0.292
anged-0.281
olph-0.273
keeping-0.266
Token had
Token ID550
Feature activation-4.33e-02

Pos logit contributions
 teaches+0.073
ceed+0.070
 biggest+0.069
 two+0.069
 exact+0.067
Neg logit contributions
ho-0.077
 throats-0.072
keeping-0.067
anged-0.066
olph-0.062
Token fore
Token ID1674
Feature activation+3.24e-02

Pos logit contributions
ceed+0.074
 biggest+0.074
 exact+0.070
ate+0.070
 teaches+0.069
Neg logit contributions
ho-0.077
 throats-0.066
anged-0.062
olph-0.060
keeping-0.059
Tokenshadow
Token ID19106
Feature activation-2.25e-01

Pos logit contributions
ho+0.067
 throats+0.058
anged+0.056
olph+0.054
keeping+0.052
Neg logit contributions
ceed-0.047
 biggest-0.047
 exact-0.044
 teaches-0.043
ate-0.043
Tokened
Token ID276
Feature activation-2.99e-02

Pos logit contributions
 biggest+0.524
ceed+0.523
 exact+0.521
 teaches+0.510
 smallest+0.494
Neg logit contributions
ho-0.266
anged-0.205
 throats-0.184
olph-0.166
ged-0.156
Token L
Token ID406
Feature activation-8.78e-02

Pos logit contributions
 biggest+0.052
 two+0.050
 most+0.050
 exact+0.049
 teaches+0.049
Neg logit contributions
ho-0.049
 throats-0.045
olph-0.044
anged-0.042
keeping-0.039
Tokenola
Token ID5708
Feature activation+6.14e-03

Pos logit contributions
 biggest+0.141
ceed+0.140
 exact+0.133
 teaches+0.129
 smallest+0.126
Neg logit contributions
ho-0.168
 throats-0.140
anged-0.139
olph-0.135
keeping-0.128
Token's
Token ID338
Feature activation-5.04e-02

Pos logit contributions
ho+0.012
 throats+0.011
anged+0.011
olph+0.010
keeping+0.010
Neg logit contributions
 two-0.009
 biggest-0.009
 teaches-0.009
ceed-0.008
 most-0.008
Token fate
Token ID10030
Feature activation+1.02e-02

Pos logit contributions
 biggest+0.073
 two+0.068
 best+0.066
 exact+0.066
 smallest+0.066
Neg logit contributions
ho-0.107
olph-0.094
anged-0.093
 throats-0.093
ged-0.089
Token,
Token ID11
Feature activation+1.26e-01

Pos logit contributions
ceed+0.012
 biggest+0.012
 exact+0.012
 teaches+0.011
 smallest+0.011
Neg logit contributions
ho-0.020
 throats-0.017
anged-0.017
olph-0.017
keeping-0.016
Token but
Token ID475
Feature activation+9.07e-02

Pos logit contributions
ho+0.220
 throats+0.192
anged+0.189
olph+0.172
keeping+0.170
Neg logit contributions
 biggest-0.220
ceed-0.214
 exact-0.210
 two-0.208
 teaches-0.205
Token it
Token ID340
Feature activation+2.27e-02

Pos logit contributions
ho+0.141
anged+0.128
 throats+0.127
keeping+0.116
olph+0.113
Neg logit contributions
 two-0.170
 biggest-0.167
 most-0.165
 best-0.160
 exact-0.160
Token was
Token ID373
Feature activation-6.61e-02

Pos logit contributions
ho+0.043
 throats+0.037
anged+0.035
olph+0.034
keeping+0.033
Neg logit contributions
ceed-0.036
 biggest-0.036
 exact-0.034
ate-0.034
 teaches-0.034
Token fore
Token ID1674
Feature activation+3.54e-02

Pos logit contributions
ceed+0.111
 biggest+0.110
 exact+0.105
 teaches+0.104
ate+0.103
Neg logit contributions
ho-0.121
 throats-0.104
anged-0.097
olph-0.095
keeping-0.091
Tokenshadow
Token ID19106
Feature activation-2.24e-01

Pos logit contributions
ho+0.065
anged+0.054
 throats+0.054
olph+0.051
keeping+0.049
Neg logit contributions
 biggest-0.058
ceed-0.058
 teaches-0.056
 smallest-0.054
ate-0.054
Tokened
Token ID276
Feature activation-3.30e-02

Pos logit contributions
 biggest+0.476
ceed+0.475
 exact+0.467
 teaches+0.457
ate+0.448
Neg logit contributions
ho-0.316
 throats-0.235
anged-0.235
olph-0.215
keeping-0.208
Token in
Token ID287
Feature activation-1.37e-03

Pos logit contributions
 biggest+0.058
 two+0.057
 exact+0.057
ceed+0.056
 most+0.056
Neg logit contributions
ho-0.054
 throats-0.049
anged-0.048
olph-0.046
keeping-0.039
Token the
Token ID262
Feature activation-1.41e-01

Pos logit contributions
ceed+0.002
 biggest+0.002
 exact+0.002
 teaches+0.002
ate+0.002
Neg logit contributions
ho-0.003
 throats-0.002
anged-0.002
olph-0.002
keeping-0.002
Token beginning
Token ID3726
Feature activation-4.24e-02

Pos logit contributions
ceed+0.209
 biggest+0.209
 exact+0.199
 teaches+0.195
ate+0.192
Neg logit contributions
ho-0.284
 throats-0.248
anged-0.237
olph-0.230
keeping-0.223
Token when
Token ID618
Feature activation+3.60e-02

Pos logit contributions
ceed+0.079
 biggest+0.078
 exact+0.075
 teaches+0.074
ate+0.074
Neg logit contributions
ho-0.070
 throats-0.059
anged-0.055
olph-0.053
keeping-0.051
Token her
Token ID607
Feature activation-3.00e-02

Pos logit contributions
 biggest+0.098
 two+0.094
 exact+0.093
 most+0.092
 best+0.090
Neg logit contributions
ho-0.125
anged-0.109
 throats-0.105
olph-0.103
ged-0.090
Token beloved
Token ID14142
Feature activation+1.53e-02

Pos logit contributions
ceed+0.053
 biggest+0.051
 exact+0.049
 teaches+0.049
ate+0.049
Neg logit contributions
ho-0.052
 throats-0.045
anged-0.043
olph-0.041
keeping-0.039
Token sock
Token ID32263
Feature activation+1.97e-02

Pos logit contributions
ho+0.024
 throats+0.020
anged+0.019
olph+0.018
keeping+0.016
Neg logit contributions
ceed-0.031
 biggest-0.028
 teaches-0.028
 exact-0.028
ate-0.028
Token.
Token ID13
Feature activation-7.36e-04

Pos logit contributions
ho+0.038
 throats+0.034
anged+0.034
olph+0.032
keeping+0.029
Neg logit contributions
 biggest-0.031
 exact-0.029
 best-0.029
 most-0.029
 teaches-0.029
Token Fore
Token ID4558
Feature activation+1.63e-02

Pos logit contributions
ceed+0.001
 biggest+0.001
hest+0.001
ate+0.001
 exact+0.001
Neg logit contributions
ho-0.001
 throats-0.001
anged-0.001
keeping-0.001
olph-0.001
Tokenshadow
Token ID19106
Feature activation-2.24e-01

Pos logit contributions
ho+0.035
 throats+0.032
anged+0.030
olph+0.029
keeping+0.028
Neg logit contributions
ceed-0.022
 biggest-0.021
 exact-0.021
ate-0.020
 teaches-0.020
Tokening
Token ID278
Feature activation-1.51e-03

Pos logit contributions
 biggest+0.498
 exact+0.496
ceed+0.491
 teaches+0.478
hest+0.465
Neg logit contributions
ho-0.290
anged-0.227
 throats-0.211
olph-0.195
ged-0.178
Token her
Token ID607
Feature activation-3.07e-02

Pos logit contributions
 teaches+0.003
ceed+0.002
 biggest+0.002
 exact+0.002
 two+0.002
Neg logit contributions
ho-0.003
 throats-0.002
anged-0.002
olph-0.002
keeping-0.002
Token bad
Token ID2089
Feature activation+1.35e-02

Pos logit contributions
 biggest+0.046
ceed+0.045
 exact+0.044
 teaches+0.043
ate+0.042
Neg logit contributions
ho-0.062
 throats-0.054
anged-0.052
olph-0.051
keeping-0.050
Token ending
Token ID7464
Feature activation-6.72e-03

Pos logit contributions
ho+0.026
 throats+0.022
anged+0.021
olph+0.021
keeping+0.021
Neg logit contributions
 biggest-0.022
ceed-0.020
 teaches-0.020
 exact-0.020
 two-0.020
Token,
Token ID11
Feature activation+1.36e-01

Pos logit contributions
ceed+0.013
 biggest+0.013
 exact+0.013
 teaches+0.013
ate+0.013
Neg logit contributions
ho-0.010
 throats-0.008
anged-0.008
olph-0.008
keeping-0.007
Token,
Token ID11
Feature activation+1.38e-01

Pos logit contributions
ceed+0.071
 biggest+0.071
 exact+0.068
 teaches+0.067
ate+0.067
Neg logit contributions
ho-0.062
 throats-0.052
anged-0.049
olph-0.047
keeping-0.045
Token her
Token ID607
Feature activation-2.96e-02

Pos logit contributions
ho+0.214
 throats+0.180
anged+0.172
olph+0.165
keeping+0.157
Neg logit contributions
 biggest-0.271
ceed-0.268
 exact-0.262
 teaches-0.258
 two-0.254
Token mum
Token ID25682
Feature activation+1.40e-01

Pos logit contributions
 biggest+0.051
ceed+0.048
 exact+0.047
 two+0.046
 teaches+0.046
Neg logit contributions
ho-0.055
 throats-0.047
anged-0.045
olph-0.045
keeping-0.043
Token was
Token ID373
Feature activation-7.08e-02

Pos logit contributions
ho+0.253
 throats+0.229
anged+0.210
olph+0.201
keeping+0.201
Neg logit contributions
ceed-0.231
 biggest-0.230
 teaches-0.227
 exact-0.225
 two-0.220
Token fore
Token ID1674
Feature activation+3.32e-02

Pos logit contributions
ceed+0.134
 biggest+0.128
hest+0.123
 teaches+0.123
V+0.122
Neg logit contributions
ho-0.104
 throats-0.087
olph-0.084
anged-0.081
keeping-0.079
Tokenshadow
Token ID19106
Feature activation-2.24e-01

Pos logit contributions
ho+0.065
 throats+0.055
anged+0.054
olph+0.052
keeping+0.050
Neg logit contributions
ceed-0.051
 biggest-0.051
 teaches-0.048
 exact-0.048
ate-0.047
Tokening
Token ID278
Feature activation-2.21e-03

Pos logit contributions
 biggest+0.483
ceed+0.477
 exact+0.475
 teaches+0.464
 smallest+0.455
Neg logit contributions
ho-0.305
anged-0.228
 throats-0.221
olph-0.206
keeping-0.199
Token...
Token ID986
Feature activation-9.55e-04

Pos logit contributions
 biggest+0.004
ceed+0.004
 teaches+0.004
 exact+0.004
 two+0.004
Neg logit contributions
ho-0.004
 throats-0.003
anged-0.003
olph-0.003
keeping-0.003
Tokenthe
Token ID1169
Feature activation-1.39e-01

Pos logit contributions
ceed+0.002
 biggest+0.002
 exact+0.002
 teaches+0.002
ate+0.002
Neg logit contributions
ho-0.002
 throats-0.001
anged-0.001
olph-0.001
keeping-0.001
Token next
Token ID1306
Feature activation-1.17e-02

Pos logit contributions
ceed+0.212
 biggest+0.201
ate+0.197
 teaches+0.196
 exact+0.195
Neg logit contributions
ho-0.268
 throats-0.228
anged-0.222
olph-0.212
 he-0.199
Token day
Token ID1110
Feature activation+9.09e-02

Pos logit contributions
 biggest+0.018
ceed+0.018
 teaches+0.018
 exact+0.017
 two+0.017
Neg logit contributions
ho-0.022
 throats-0.020
anged-0.018
olph-0.018
keeping-0.018
Token by
Token ID416
Feature activation+2.94e-02

Pos logit contributions
 biggest+0.069
ceed+0.069
 teaches+0.067
 exact+0.067
ate+0.066
Neg logit contributions
ho-0.042
 throats-0.034
anged-0.032
olph-0.030
keeping-0.028
Token saying
Token ID2282
Feature activation-4.88e-02

Pos logit contributions
ho+0.051
 throats+0.044
olph+0.040
anged+0.040
keeping+0.038
Neg logit contributions
 biggest-0.052
ceed-0.052
 exact-0.050
ate-0.049
hest-0.048
Token,
Token ID11
Feature activation+1.38e-01

Pos logit contributions
ceed+0.077
 biggest+0.075
 exact+0.074
ate+0.071
hest+0.071
Neg logit contributions
ho-0.094
 throats-0.081
olph-0.073
anged-0.072
keeping-0.068
Token "
Token ID366
Feature activation-4.24e-02

Pos logit contributions
ho+0.314
 throats+0.278
anged+0.267
olph+0.261
keeping+0.254
Neg logit contributions
ceed-0.168
 biggest-0.167
 exact-0.157
 teaches-0.154
ate-0.151
TokenMar
Token ID7676
Feature activation+4.60e-02

Pos logit contributions
 biggest+0.060
ceed+0.059
 exact+0.055
 teaches+0.055
 best+0.052
Neg logit contributions
ho-0.090
 throats-0.072
olph-0.070
keeping-0.069
anged-0.069
Tokenriage
Token ID4087
Feature activation-2.23e-01

Pos logit contributions
ho+0.101
 throats+0.088
anged+0.086
olph+0.083
keeping+0.081
Neg logit contributions
ceed-0.062
 biggest-0.061
 exact-0.057
 teaches-0.056
ate-0.055
Token is
Token ID318
Feature activation-4.83e-02

Pos logit contributions
 exact+0.342
 biggest+0.329
ceed+0.327
 prett+0.312
hest+0.301
Neg logit contributions
ho-0.394
keeping-0.329
 throats-0.327
 to-0.317
olph-0.298
Token when
Token ID618
Feature activation+3.68e-02

Pos logit contributions
ceed+0.110
 biggest+0.108
 exact+0.107
hest+0.105
V+0.104
Neg logit contributions
ho-0.054
 throats-0.039
anged-0.036
 after-0.035
olph-0.035
Token two
Token ID734
Feature activation-8.04e-02

Pos logit contributions
ho+0.070
 throats+0.060
anged+0.058
olph+0.056
keeping+0.054
Neg logit contributions
 biggest-0.060
ceed-0.058
 exact-0.056
 teaches-0.055
 two-0.055
Token people
Token ID661
Feature activation+5.99e-02

Pos logit contributions
ceed+0.122
 biggest+0.121
 exact+0.115
 teaches+0.113
ate+0.112
Neg logit contributions
ho-0.159
 throats-0.139
anged-0.132
olph-0.128
keeping-0.124
Token who
Token ID508
Feature activation+3.73e-02

Pos logit contributions
ho+0.118
 throats+0.105
anged+0.098
olph+0.096
keeping+0.095
Neg logit contributions
ceed-0.090
 biggest-0.090
 teaches-0.089
 exact-0.086
ate-0.085
Token.
Token ID13
Feature activation-1.79e-03

Pos logit contributions
ho+0.051
 throats+0.045
anged+0.043
olph+0.042
keeping+0.040
Neg logit contributions
 biggest-0.047
ceed-0.046
 exact-0.046
 two-0.044
 teaches-0.044
Token She
Token ID1375
Feature activation+6.74e-02

Pos logit contributions
ceed+0.003
 biggest+0.003
 exact+0.003
 teaches+0.003
hest+0.003
Neg logit contributions
ho-0.003
 throats-0.002
anged-0.002
keeping-0.002
olph-0.002
Token had
Token ID550
Feature activation-4.08e-02

Pos logit contributions
ho+0.115
 throats+0.100
anged+0.093
olph+0.090
keeping+0.089
Neg logit contributions
 biggest-0.119
ceed-0.119
 teaches-0.116
 exact-0.114
 two-0.113
Token been
Token ID587
Feature activation-7.36e-02

Pos logit contributions
ceed+0.080
 biggest+0.079
ate+0.075
 exact+0.075
 teaches+0.074
Neg logit contributions
ho-0.059
 throats-0.050
anged-0.046
keeping-0.046
olph-0.044
Token fore
Token ID1674
Feature activation+3.39e-02

Pos logit contributions
ceed+0.141
 biggest+0.138
 exact+0.133
 teaches+0.133
ate+0.131
Neg logit contributions
ho-0.117
 throats-0.098
anged-0.094
olph-0.091
keeping-0.087
Tokenshadow
Token ID19106
Feature activation-2.23e-01

Pos logit contributions
ho+0.069
 throats+0.059
anged+0.057
olph+0.055
keeping+0.054
Neg logit contributions
ceed-0.050
 biggest-0.050
 exact-0.047
 teaches-0.047
ate-0.046
Tokened
Token ID276
Feature activation-2.83e-02

Pos logit contributions
 biggest+0.475
ceed+0.474
 exact+0.469
 teaches+0.463
 smallest+0.449
Neg logit contributions
ho-0.304
anged-0.234
 throats-0.224
olph-0.205
keeping-0.199
Token about
Token ID546
Feature activation+1.38e-02

Pos logit contributions
 biggest+0.051
 two+0.049
 most+0.049
 exact+0.048
 best+0.047
Neg logit contributions
ho-0.047
 throats-0.043
anged-0.041
olph-0.041
keeping-0.035
Token the
Token ID262
Feature activation-1.34e-01

Pos logit contributions
ho+0.029
 throats+0.026
anged+0.026
olph+0.024
keeping+0.023
Neg logit contributions
 biggest-0.019
 two-0.019
 most-0.018
ceed-0.018
 exact-0.018
Token power
Token ID1176
Feature activation-2.09e-02

Pos logit contributions
ceed+0.247
 biggest+0.233
ate+0.231
 exact+0.227
 teaches+0.226
Neg logit contributions
ho-0.227
 throats-0.189
olph-0.177
anged-0.176
keeping-0.165
Token of
Token ID286
Feature activation-1.47e-03

Pos logit contributions
 biggest+0.032
 teaches+0.031
 exact+0.030
ceed+0.030
 most+0.029
Neg logit contributions
ho-0.041
 throats-0.038
anged-0.036
olph-0.035
keeping-0.033
Token heavy
Token ID4334
Feature activation-3.03e-02

Pos logit contributions
 biggest+0.205
 exact+0.192
ceed+0.190
 most+0.183
 two+0.183
Neg logit contributions
ho-0.273
 throats-0.238
anged-0.235
olph-0.222
keeping-0.213
Token bag
Token ID6131
Feature activation-1.39e-02

Pos logit contributions
 biggest+0.041
 exact+0.040
 two+0.039
 most+0.037
 smallest+0.036
Neg logit contributions
ho-0.066
 throats-0.058
anged-0.058
olph-0.055
keeping-0.052
Token had
Token ID550
Feature activation-4.27e-02

Pos logit contributions
 biggest+0.026
 exact+0.025
ceed+0.024
 two+0.024
 teaches+0.024
Neg logit contributions
ho-0.023
 throats-0.021
anged-0.020
olph-0.019
keeping-0.017
Token been
Token ID587
Feature activation-7.51e-02

Pos logit contributions
ceed+0.072
 biggest+0.072
 exact+0.069
ate+0.067
 teaches+0.067
Neg logit contributions
ho-0.077
 throats-0.065
anged-0.062
olph-0.060
keeping-0.059
Token fore
Token ID1674
Feature activation+3.34e-02

Pos logit contributions
 biggest+0.129
ceed+0.126
 exact+0.125
 teaches+0.122
 two+0.120
Neg logit contributions
ho-0.136
 throats-0.116
anged-0.109
olph-0.103
keeping-0.099
Tokenshadow
Token ID19106
Feature activation-2.23e-01

Pos logit contributions
ho+0.067
 throats+0.057
anged+0.055
olph+0.054
keeping+0.052
Neg logit contributions
ceed-0.050
 biggest-0.050
 exact-0.047
 teaches-0.047
ate-0.046
Tokened
Token ID276
Feature activation-2.66e-02

Pos logit contributions
 biggest+0.453
ceed+0.452
 exact+0.443
 teaches+0.431
ate+0.423
Neg logit contributions
ho-0.329
anged-0.255
 throats-0.253
olph-0.234
keeping-0.228
Token,
Token ID11
Feature activation+1.35e-01

Pos logit contributions
 biggest+0.047
 two+0.046
 most+0.045
 exact+0.045
 best+0.045
Neg logit contributions
ho-0.045
 throats-0.039
anged-0.037
olph-0.037
keeping-0.032
Token and
Token ID290
Feature activation+1.87e-01

Pos logit contributions
ho+0.217
anged+0.183
 throats+0.183
olph+0.169
keeping+0.161
Neg logit contributions
 two-0.253
 teaches-0.252
 biggest-0.251
 exact-0.251
ceed-0.246
Token that
Token ID326
Feature activation+2.24e-02

Pos logit contributions
ho+0.340
 throats+0.305
anged+0.293
olph+0.274
keeping+0.258
Neg logit contributions
 two-0.306
 teaches-0.297
ceed-0.296
 biggest-0.296
 most-0.289
Token setup
Token ID9058
Feature activation-6.42e-02

Pos logit contributions
ho+0.036
 throats+0.030
anged+0.029
olph+0.028
keeping+0.027
Neg logit contributions
 biggest-0.043
ceed-0.042
 teaches-0.041
 exact-0.041
 two-0.040
Token sky
Token ID6766
Feature activation+4.92e-02

Pos logit contributions
ceed+0.205
 biggest+0.198
 exact+0.188
 teaches+0.188
ate+0.186
Neg logit contributions
ho-0.311
 throats-0.273
anged-0.258
olph-0.252
keeping-0.247
Token before
Token ID878
Feature activation+1.79e-02

Pos logit contributions
ho+0.074
 throats+0.061
anged+0.056
olph+0.054
keeping+0.052
Neg logit contributions
ceed-0.099
 biggest-0.098
 exact-0.094
 teaches-0.093
ate-0.092
Token they
Token ID484
Feature activation+7.16e-02

Pos logit contributions
ho+0.034
anged+0.030
 throats+0.029
olph+0.027
keeping+0.026
Neg logit contributions
 two-0.028
 biggest-0.028
ceed-0.028
 exact-0.027
 most-0.026
Token walked
Token ID6807
Feature activation-5.42e-02

Pos logit contributions
ho+0.125
 throats+0.110
olph+0.104
anged+0.103
keeping+0.100
Neg logit contributions
ceed-0.122
 biggest-0.121
 teaches-0.116
 exact-0.115
ate-0.112
Token fore
Token ID1674
Feature activation+3.45e-02

Pos logit contributions
 biggest+0.106
ceed+0.103
 exact+0.102
 two+0.101
ate+0.100
Neg logit contributions
ho-0.083
 throats-0.073
anged-0.067
olph-0.065
keeping-0.060
Tokenshadow
Token ID19106
Feature activation-2.22e-01

Pos logit contributions
ho+0.067
 throats+0.058
anged+0.056
olph+0.054
keeping+0.053
Neg logit contributions
ceed-0.053
 biggest-0.053
 teaches-0.050
 exact-0.050
ate-0.049
Tokened
Token ID276
Feature activation-3.09e-02

Pos logit contributions
ceed+0.512
 biggest+0.506
 exact+0.500
 teaches+0.494
 smallest+0.484
Neg logit contributions
ho-0.274
 throats-0.198
anged-0.196
olph-0.173
keeping-0.172
Token the
Token ID262
Feature activation-1.38e-01

Pos logit contributions
 biggest+0.057
 two+0.056
 teaches+0.055
 exact+0.055
 most+0.054
Neg logit contributions
ho-0.047
 throats-0.043
anged-0.042
olph-0.040
keeping-0.035
Token down
Token ID866
Feature activation-1.64e-02

Pos logit contributions
ceed+0.203
 biggest+0.201
 exact+0.191
 teaches+0.189
ate+0.187
Neg logit contributions
ho-0.279
 throats-0.242
anged-0.230
olph-0.224
keeping-0.218
Tokenpour
Token ID48681
Feature activation+1.83e-02

Pos logit contributions
 biggest+0.031
ceed+0.031
 exact+0.030
 teaches+0.030
ate+0.029
Neg logit contributions
ho-0.026
 throats-0.022
anged-0.021
olph-0.021
keeping-0.019
Token that
Token ID326
Feature activation+2.24e-02

Pos logit contributions
ho+0.030
 throats+0.026
anged+0.024
olph+0.023
keeping+0.022
Neg logit contributions
ceed-0.034
 biggest-0.034
 exact-0.032
 teaches-0.032
ate-0.032
Token cat
Token ID3797
Feature activation+6.91e-02

Pos logit contributions
ceed+0.338
ate+0.329
 exact+0.325
 biggest+0.323
hest+0.319
Neg logit contributions
ho-0.193
 throats-0.150
olph-0.134
anged-0.129
gie-0.111
Token said
Token ID531
Feature activation-7.00e-02

Pos logit contributions
ho+0.147
 throats+0.134
keeping+0.130
anged+0.129
olph+0.120
Neg logit contributions
ceed-0.089
 teaches-0.089
 two-0.086
 biggest-0.086
 exact-0.083
Token,
Token ID11
Feature activation+1.45e-01

Pos logit contributions
 two+0.111
 best+0.109
 biggest+0.108
ceed+0.105
 teaches+0.102
Neg logit contributions
ho-0.126
keeping-0.126
 throats-0.119
anged-0.118
olph-0.105
Token "
Token ID366
Feature activation-4.15e-02

Pos logit contributions
ho+0.357
 throats+0.326
anged+0.321
keeping+0.315
olph+0.300
Neg logit contributions
ceed-0.142
 biggest-0.141
 two-0.133
 teaches-0.132
 exact-0.128
TokenMar
Token ID7676
Feature activation+4.68e-02

Pos logit contributions
 biggest+0.054
ceed+0.054
 exact+0.051
 teaches+0.049
ate+0.048
Neg logit contributions
ho-0.094
 throats-0.080
anged-0.076
olph-0.076
keeping-0.073
Tokenriage
Token ID4087
Feature activation-2.22e-01

Pos logit contributions
ho+0.078
 throats+0.063
anged+0.061
olph+0.059
keeping+0.055
Neg logit contributions
ceed-0.089
 biggest-0.089
 exact-0.084
 teaches-0.084
ate-0.082
Token is
Token ID318
Feature activation-4.44e-02

Pos logit contributions
 exact+0.323
 biggest+0.312
ceed+0.306
 prett+0.285
 nic+0.277
Neg logit contributions
ho-0.429
 throats-0.365
keeping-0.352
anged-0.337
 he-0.335
Token when
Token ID618
Feature activation+3.80e-02

Pos logit contributions
ceed+0.091
 biggest+0.091
 exact+0.088
 teaches+0.086
ate+0.086
Neg logit contributions
ho-0.064
 throats-0.051
anged-0.048
olph-0.046
keeping-0.044
Token two
Token ID734
Feature activation-7.19e-02

Pos logit contributions
ho+0.077
 throats+0.066
anged+0.065
olph+0.062
keeping+0.060
Neg logit contributions
 biggest-0.058
ceed-0.055
 two-0.055
 exact-0.053
 most-0.053
Token animals
Token ID4695
Feature activation+6.01e-02

Pos logit contributions
ceed+0.103
 biggest+0.102
 exact+0.097
 teaches+0.095
ate+0.094
Neg logit contributions
ho-0.149
 throats-0.130
anged-0.124
olph-0.121
keeping-0.117
Token love
Token ID1842
Feature activation+1.94e-02

Pos logit contributions
ho+0.122
 throats+0.108
anged+0.103
olph+0.100
keeping+0.098
Neg logit contributions
ceed-0.086
 biggest-0.086
 teaches-0.085
 exact-0.081
ate-0.081
Token was
Token ID373
Feature activation-6.82e-02

Pos logit contributions
 biggest+0.104
ceed+0.100
 teaches+0.099
 exact+0.098
 two+0.096
Neg logit contributions
ho-0.148
 throats-0.133
anged-0.130
olph-0.126
keeping-0.121
Token the
Token ID262
Feature activation-1.34e-01

Pos logit contributions
 biggest+0.095
ceed+0.094
 exact+0.091
 teaches+0.089
 two+0.086
Neg logit contributions
ho-0.146
 throats-0.128
anged-0.123
olph-0.119
keeping-0.116
Token setup
Token ID9058
Feature activation-6.22e-02

Pos logit contributions
ceed+0.195
 biggest+0.190
 exact+0.182
ate+0.180
 teaches+0.180
Neg logit contributions
ho-0.278
 throats-0.242
anged-0.229
olph-0.224
keeping-0.217
Token that
Token ID326
Feature activation+2.32e-02

Pos logit contributions
 teaches+0.117
ceed+0.115
 two+0.112
 biggest+0.111
 exact+0.111
Neg logit contributions
ho-0.097
 throats-0.086
anged-0.079
olph-0.076
keeping-0.073
Token fore
Token ID1674
Feature activation+3.42e-02

Pos logit contributions
ho+0.034
 throats+0.029
anged+0.028
olph+0.027
keeping+0.025
Neg logit contributions
 biggest-0.047
 teaches-0.046
ceed-0.046
 exact-0.046
 two-0.045
Tokenshadow
Token ID19106
Feature activation-2.22e-01

Pos logit contributions
ho+0.065
anged+0.054
 throats+0.054
olph+0.051
keeping+0.050
Neg logit contributions
 biggest-0.055
ceed-0.054
 teaches-0.052
 exact-0.052
 smallest-0.051
Tokened
Token ID276
Feature activation-2.79e-02

Pos logit contributions
 biggest+0.501
 exact+0.494
ceed+0.493
 teaches+0.476
 smallest+0.466
Neg logit contributions
ho-0.280
anged-0.211
 throats-0.201
olph-0.186
keeping-0.181
Token the
Token ID262
Feature activation-1.34e-01

Pos logit contributions
 biggest+0.051
 two+0.050
 teaches+0.050
ceed+0.049
 most+0.049
Neg logit contributions
ho-0.044
 throats-0.039
olph-0.037
anged-0.037
keeping-0.033
Token bad
Token ID2089
Feature activation+1.41e-02

Pos logit contributions
ceed+0.190
 biggest+0.187
 exact+0.177
ate+0.175
 teaches+0.174
Neg logit contributions
ho-0.284
 throats-0.248
anged-0.236
olph-0.230
keeping-0.224
Token ending
Token ID7464
Feature activation-5.88e-03

Pos logit contributions
ho+0.027
 throats+0.024
anged+0.024
olph+0.023
keeping+0.022
Neg logit contributions
 biggest-0.022
 smallest-0.021
 exact-0.021
 teaches-0.021
 two-0.021
Token.
Token ID13
Feature activation-4.82e-03

Pos logit contributions
 biggest+0.011
ceed+0.011
 teaches+0.011
 exact+0.010
 two+0.010
Neg logit contributions
ho-0.009
 throats-0.008
anged-0.008
olph-0.008
keeping-0.007
Token.
Token ID13
Feature activation+5.27e-03

Pos logit contributions
ho+0.124
 throats+0.113
anged+0.106
olph+0.102
keeping+0.094
Neg logit contributions
 biggest-0.120
 exact-0.118
 two-0.116
ceed-0.116
 teaches-0.115
Token He
Token ID679
Feature activation+8.25e-02

Pos logit contributions
ho+0.009
 throats+0.007
anged+0.007
olph+0.007
keeping+0.006
Neg logit contributions
ceed-0.010
 biggest-0.010
 exact-0.010
 teaches-0.009
ate-0.009
Token had
Token ID550
Feature activation-3.77e-02

Pos logit contributions
ho+0.156
 throats+0.141
keeping+0.132
anged+0.131
olph+0.127
Neg logit contributions
 two-0.123
 teaches-0.123
 biggest-0.120
 smallest-0.120
 most-0.119
Token been
Token ID587
Feature activation-7.29e-02

Pos logit contributions
ceed+0.065
 biggest+0.065
 exact+0.062
 teaches+0.061
ate+0.060
Neg logit contributions
ho-0.068
 throats-0.058
anged-0.055
olph-0.053
keeping-0.051
Token fore
Token ID1674
Feature activation+3.60e-02

Pos logit contributions
 biggest+0.132
ceed+0.131
 exact+0.126
 teaches+0.124
ate+0.123
Neg logit contributions
ho-0.125
 throats-0.106
anged-0.100
olph-0.096
keeping-0.093
Tokenshadow
Token ID19106
Feature activation-2.21e-01

Pos logit contributions
ho+0.074
 throats+0.064
anged+0.062
olph+0.060
keeping+0.058
Neg logit contributions
ceed-0.052
 biggest-0.052
 exact-0.049
 teaches-0.049
ate-0.048
Tokened
Token ID276
Feature activation-2.62e-02

Pos logit contributions
 biggest+0.491
ceed+0.488
 exact+0.484
 teaches+0.473
ate+0.464
Neg logit contributions
ho-0.289
anged-0.214
 throats-0.210
olph-0.198
keeping-0.183
Token about
Token ID546
Feature activation+1.68e-02

Pos logit contributions
 biggest+0.048
 two+0.047
 most+0.046
ceed+0.046
 best+0.046
Neg logit contributions
ho-0.042
 throats-0.038
anged-0.036
olph-0.034
keeping-0.031
Token this
Token ID428
Feature activation-8.34e-02

Pos logit contributions
ho+0.036
 throats+0.031
anged+0.030
olph+0.028
keeping+0.028
Neg logit contributions
 biggest-0.024
 two-0.024
 exact-0.023
ceed-0.023
 most-0.023
Token bad
Token ID2089
Feature activation+1.62e-02

Pos logit contributions
 biggest+0.118
 exact+0.115
ceed+0.112
 teaches+0.111
 smallest+0.107
Neg logit contributions
ho-0.175
 throats-0.158
anged-0.152
olph-0.145
keeping-0.141
Token ending
Token ID7464
Feature activation-3.57e-03

Pos logit contributions
ho+0.032
 throats+0.030
anged+0.029
olph+0.027
keeping+0.027
Neg logit contributions
 biggest-0.024
 smallest-0.023
 exact-0.023
 two-0.023
 best-0.022
Token that
Token ID326
Feature activation+2.63e-02

Pos logit contributions
 two+0.090
 biggest+0.080
 new+0.077
 best+0.076
 teaches+0.076
Neg logit contributions
ho-0.143
anged-0.139
 throats-0.137
keeping-0.137
olph-0.135
Token it
Token ID340
Feature activation+3.07e-02

Pos logit contributions
ho+0.044
anged+0.039
 throats+0.039
olph+0.037
keeping+0.034
Neg logit contributions
 biggest-0.048
 two-0.047
 teaches-0.047
ceed-0.046
 most-0.046
Token was
Token ID373
Feature activation-6.49e-02

Pos logit contributions
ho+0.067
 throats+0.060
anged+0.057
olph+0.056
keeping+0.055
Neg logit contributions
 biggest-0.041
ceed-0.041
 teaches-0.040
 exact-0.038
 smallest-0.038
Token a
Token ID257
Feature activation-1.18e-01

Pos logit contributions
ceed+0.088
 biggest+0.088
 exact+0.083
 teaches+0.082
ate+0.080
Neg logit contributions
ho-0.140
 throats-0.123
anged-0.118
olph-0.115
keeping-0.111
Token '
Token ID705
Feature activation-1.87e-02

Pos logit contributions
 biggest+0.145
ceed+0.134
 exact+0.130
 teaches+0.124
 smallest+0.124
Neg logit contributions
ho-0.273
 throats-0.250
anged-0.249
olph-0.237
keeping-0.232
Tokenshadow
Token ID19106
Feature activation-2.21e-01

Pos logit contributions
ceed+0.030
 biggest+0.030
 exact+0.028
 teaches+0.028
ate+0.028
Neg logit contributions
ho-0.036
 throats-0.031
anged-0.029
olph-0.028
keeping-0.027
Token'
Token ID6
Feature activation+4.66e-02

Pos logit contributions
 biggest+0.489
 exact+0.482
ceed+0.468
 teaches+0.454
 nic+0.448
Neg logit contributions
ho-0.293
anged-0.224
 throats-0.204
olph-0.201
keeping-0.197
Token of
Token ID286
Feature activation+3.01e-04

Pos logit contributions
ho+0.082
 throats+0.070
anged+0.066
olph+0.065
keeping+0.062
Neg logit contributions
ceed-0.081
 biggest-0.081
 exact-0.077
 teaches-0.077
ate-0.075
Token her
Token ID607
Feature activation-2.74e-02

Pos logit contributions
ho+0.001
 throats+0.001
anged+0.001
olph+0.001
keeping+0.000
Neg logit contributions
 biggest-0.000
ceed-0.000
 teaches-0.000
 exact-0.000
 two-0.000
Token own
Token ID898
Feature activation-2.64e-02

Pos logit contributions
 biggest+0.045
ceed+0.042
 exact+0.041
 teaches+0.040
 most+0.040
Neg logit contributions
ho-0.053
 throats-0.046
olph-0.043
anged-0.043
keeping-0.041
Token shape
Token ID5485
Feature activation-1.40e-02

Pos logit contributions
 biggest+0.043
ceed+0.043
 exact+0.041
 teaches+0.040
ate+0.040
Neg logit contributions
ho-0.050
 throats-0.043
anged-0.041
olph-0.040
keeping-0.038
Token girl
Token ID2576
Feature activation+6.17e-02

Pos logit contributions
ho+0.003
 throats+0.003
anged+0.003
keeping+0.003
olph+0.003
Neg logit contributions
 biggest-0.002
 two-0.002
 exact-0.002
 most-0.002
 smallest-0.002
Token remembered
Token ID12086
Feature activation-3.32e-02

Pos logit contributions
ho+0.123
 throats+0.110
keeping+0.108
anged+0.103
olph+0.100
Neg logit contributions
 teaches-0.087
 two-0.086
 biggest-0.086
 exact-0.084
ceed-0.084
Token the
Token ID262
Feature activation-1.38e-01

Pos logit contributions
 two+0.047
 biggest+0.042
 exact+0.041
 most+0.041
ceed+0.040
Neg logit contributions
ho-0.072
anged-0.063
 throats-0.062
olph-0.058
keeping-0.055
Token earlier
Token ID2961
Feature activation-1.02e-02

Pos logit contributions
 biggest+0.197
ceed+0.188
 exact+0.185
 teaches+0.178
 smallest+0.177
Neg logit contributions
ho-0.290
 throats-0.257
anged-0.252
olph-0.242
keeping-0.236
Token fore
Token ID1674
Feature activation+3.31e-02

Pos logit contributions
 teaches+0.018
 biggest+0.017
 two+0.017
ceed+0.017
 exact+0.017
Neg logit contributions
ho-0.017
 throats-0.015
anged-0.015
olph-0.014
keeping-0.014
Tokenshadow
Token ID19106
Feature activation-2.19e-01

Pos logit contributions
ho+0.066
 throats+0.057
anged+0.055
olph+0.053
keeping+0.052
Neg logit contributions
 biggest-0.050
ceed-0.050
 exact-0.047
 teaches-0.046
ate-0.045
Tokening
Token ID278
Feature activation-3.10e-03

Pos logit contributions
ceed+0.474
 biggest+0.474
 exact+0.464
 teaches+0.455
 smallest+0.445
Neg logit contributions
ho-0.298
 throats-0.226
anged-0.223
olph-0.204
keeping-0.197
Token,
Token ID11
Feature activation+1.36e-01

Pos logit contributions
 teaches+0.005
 biggest+0.005
ceed+0.005
 two+0.005
 exact+0.005
Neg logit contributions
ho-0.005
 throats-0.005
anged-0.005
olph-0.004
keeping-0.004
Token and
Token ID290
Feature activation+1.89e-01

Pos logit contributions
ho+0.236
 throats+0.207
anged+0.204
olph+0.193
keeping+0.186
Neg logit contributions
 biggest-0.238
 two-0.235
 exact-0.233
 most-0.229
ceed-0.229
Token it
Token ID340
Feature activation+2.99e-02

Pos logit contributions
ho+0.350
 throats+0.325
anged+0.303
olph+0.294
keeping+0.280
Neg logit contributions
 two-0.295
ceed-0.287
 biggest-0.287
 teaches-0.280
 most-0.279
Token saved
Token ID7448
Feature activation-6.69e-02

Pos logit contributions
ho+0.056
 throats+0.048
anged+0.046
olph+0.045
keeping+0.043
Neg logit contributions
ceed-0.049
 biggest-0.049
 exact-0.046
 teaches-0.045
ate-0.045
Token so
Token ID523
Feature activation+8.29e-02

Pos logit contributions
ho+0.152
 throats+0.128
anged+0.120
olph+0.117
keeping+0.113
Neg logit contributions
ceed-0.184
 biggest-0.180
 exact-0.174
 teaches-0.174
ate-0.171
Token happy
Token ID3772
Feature activation-2.95e-03

Pos logit contributions
ho+0.160
 throats+0.139
anged+0.132
olph+0.128
keeping+0.124
Neg logit contributions
ceed-0.131
 biggest-0.131
 exact-0.125
 teaches-0.123
ate-0.122
Token there
Token ID612
Feature activation-4.39e-02

Pos logit contributions
 biggest+0.005
ceed+0.005
ate+0.004
 teaches+0.004
 two+0.004
Neg logit contributions
ho-0.006
 throats-0.005
anged-0.005
olph-0.005
gie-0.004
Token.
Token ID13
Feature activation+1.80e-03

Pos logit contributions
 biggest+0.075
ceed+0.073
 teaches+0.073
 exact+0.071
 best+0.070
Neg logit contributions
ho-0.077
 throats-0.068
anged-0.067
olph-0.064
keeping-0.061
Token One
Token ID1881
Feature activation-1.17e-01

Pos logit contributions
ho+0.003
 throats+0.003
olph+0.003
 he+0.003
anged+0.003
Neg logit contributions
ceed-0.003
 biggest-0.003
 exact-0.003
 teaches-0.003
ate-0.003
Token day
Token ID1110
Feature activation+9.53e-02

Pos logit contributions
 biggest+0.201
ceed+0.195
 exact+0.190
 teaches+0.188
ate+0.185
Neg logit contributions
ho-0.214
 throats-0.187
anged-0.175
olph-0.170
keeping-0.170
Token,
Token ID11
Feature activation+1.45e-01

Pos logit contributions
ho+0.161
 throats+0.136
keeping+0.134
anged+0.124
olph+0.118
Neg logit contributions
 two-0.177
 biggest-0.175
ceed-0.173
 best-0.165
 most-0.162
Token Peter
Token ID5613
Feature activation+8.61e-02

Pos logit contributions
ho+0.379
 throats+0.338
anged+0.337
keeping+0.327
olph+0.318
Neg logit contributions
 biggest-0.144
 two-0.137
 most-0.132
ceed-0.128
 exact-0.117
Token peek
Token ID27185
Feature activation-9.80e-02

Pos logit contributions
ho+0.162
 throats+0.149
anged+0.133
olph+0.126
keeping+0.124
Neg logit contributions
ate-0.132
 biggest-0.132
 two-0.131
 teaches-0.128
ceed-0.128
Tokened
Token ID276
Feature activation-2.51e-02

Pos logit contributions
 biggest+0.208
ceed+0.208
 exact+0.200
 teaches+0.197
ate+0.197
Neg logit contributions
ho-0.137
 throats-0.112
anged-0.102
olph-0.099
keeping-0.094
Token into
Token ID656
Feature activation-1.63e-02

Pos logit contributions
 biggest+0.035
 two+0.034
 most+0.033
ate+0.032
 same+0.031
Neg logit contributions
ho-0.056
 throats-0.049
olph-0.049
anged-0.047
keeping-0.046
Token She
Token ID1375
Feature activation+6.71e-02

Pos logit contributions
ceed+0.008
 biggest+0.008
 exact+0.008
 teaches+0.008
ate+0.008
Neg logit contributions
ho-0.010
 throats-0.009
anged-0.008
olph-0.008
keeping-0.008
Token stepped
Token ID10764
Feature activation-9.14e-02

Pos logit contributions
ho+0.099
anged+0.078
 throats+0.077
olph+0.073
ged+0.067
Neg logit contributions
ceed-0.139
 biggest-0.138
 exact-0.134
 teaches-0.129
ate-0.127
Token through
Token ID832
Feature activation-2.30e-02

Pos logit contributions
 biggest+0.176
ceed+0.174
 two+0.171
 exact+0.170
ate+0.164
Neg logit contributions
ho-0.146
 throats-0.130
anged-0.116
olph-0.115
keeping-0.109
Token the
Token ID262
Feature activation-1.34e-01

Pos logit contributions
 biggest+0.033
ceed+0.032
 exact+0.031
 two+0.031
ate+0.030
Neg logit contributions
ho-0.048
 throats-0.043
anged-0.041
olph-0.040
keeping-0.037
Token gate
Token ID8946
Feature activation+6.44e-03

Pos logit contributions
ceed+0.209
 biggest+0.201
ate+0.193
 teaches+0.193
 exact+0.193
Neg logit contributions
ho-0.268
 throats-0.233
anged-0.220
keeping-0.208
olph-0.206
Token,
Token ID11
Feature activation+1.37e-01

Pos logit contributions
ho+0.008
 throats+0.006
anged+0.006
olph+0.005
keeping+0.005
Neg logit contributions
ceed-0.015
 biggest-0.014
 exact-0.014
 teaches-0.014
ate-0.014
Token excited
Token ID6568
Feature activation-2.47e-03

Pos logit contributions
ho+0.226
 throats+0.194
anged+0.184
olph+0.179
keeping+0.172
Neg logit contributions
 biggest-0.254
ceed-0.248
 exact-0.246
 two-0.237
 teaches-0.236
Token to
Token ID284
Feature activation+1.10e-01

Pos logit contributions
 biggest+0.004
 same+0.004
 two+0.004
 most+0.004
 exact+0.004
Neg logit contributions
ho-0.004
olph-0.004
 throats-0.004
anged-0.003
ipop-0.003
Token see
Token ID766
Feature activation-4.37e-02

Pos logit contributions
ho+0.172
 throats+0.143
anged+0.135
keeping+0.130
olph+0.126
Neg logit contributions
ceed-0.207
 biggest-0.205
 exact-0.198
 teaches-0.195
 smallest-0.190
Token what
Token ID644
Feature activation-4.36e-02

Pos logit contributions
ceed+0.091
 teaches+0.084
 exact+0.084
 biggest+0.082
V+0.082
Neg logit contributions
ho-0.059
 throats-0.044
 he-0.044
 after-0.043
keeping-0.043
Token was
Token ID373
Feature activation-6.97e-02

Pos logit contributions
ceed+0.084
 biggest+0.083
 exact+0.081
ate+0.078
hest+0.077
Neg logit contributions
ho-0.069
 throats-0.056
anged-0.051
olph-0.051
keeping-0.049
Token too
Token ID1165
Feature activation+1.00e-01

Pos logit contributions
ceed+0.132
 biggest+0.132
 exact+0.126
 teaches+0.124
ate+0.124
Neg logit contributions
ho-0.123
 throats-0.103
anged-0.097
olph-0.095
keeping-0.090
Token quick
Token ID2068
Feature activation+2.84e-03

Pos logit contributions
ho+0.202
 throats+0.176
anged+0.168
olph+0.162
keeping+0.159
Neg logit contributions
ceed-0.151
 biggest-0.151
 exact-0.143
 teaches-0.140
ate-0.139
Token for
Token ID329
Feature activation-5.92e-02

Pos logit contributions
ho+0.006
 throats+0.005
olph+0.005
anged+0.005
chard+0.004
Neg logit contributions
 two-0.004
 teaches-0.004
 best-0.004
 biggest-0.004
 exact-0.004
Token him
Token ID683
Feature activation+6.55e-02

Pos logit contributions
 two+0.109
ceed+0.108
 biggest+0.108
 exact+0.100
 most+0.099
Neg logit contributions
ho-0.101
 throats-0.083
anged-0.083
olph-0.073
keeping-0.073
Token,
Token ID11
Feature activation+1.35e-01

Pos logit contributions
ho+0.141
 throats+0.134
olph+0.125
anged+0.124
 tongues+0.111
Neg logit contributions
 two-0.082
 biggest-0.081
 best-0.081
 exact-0.076
 most-0.073
Token and
Token ID290
Feature activation+1.85e-01

Pos logit contributions
ho+0.260
anged+0.247
 throats+0.234
olph+0.224
keeping+0.222
Neg logit contributions
 two-0.202
 most-0.201
 biggest-0.201
 exact-0.200
 teaches-0.192
Token he
Token ID339
Feature activation+8.95e-02

Pos logit contributions
ho+0.246
 throats+0.235
anged+0.218
keeping+0.214
olph+0.190
Neg logit contributions
 two-0.386
ceed-0.366
 most-0.364
 biggest-0.364
 teaches-0.354
Token couldn
Token ID3521
Feature activation-1.79e-02

Pos logit contributions
ho+0.142
 throats+0.133
keeping+0.131
anged+0.122
olph+0.116
Neg logit contributions
ceed-0.156
 two-0.155
 biggest-0.155
 teaches-0.150
 nic-0.146
Token't
Token ID470
Feature activation+1.53e-02

Pos logit contributions
 biggest+0.031
ceed+0.030
 exact+0.030
ate+0.029
 smallest+0.028
Neg logit contributions
ho-0.032
 throats-0.025
olph-0.024
anged-0.024
 he-0.023
Token find
Token ID1064
Feature activation-8.35e-02

Pos logit contributions
ho+0.030
 throats+0.027
anged+0.025
olph+0.024
keeping+0.024
Neg logit contributions
ceed-0.024
 biggest-0.024
 exact-0.023
 teaches-0.022
 two-0.022
Token her
Token ID607
Feature activation-3.43e-02

Pos logit contributions
 biggest+0.173
ceed+0.164
 two+0.162
 exact+0.160
 most+0.156
Neg logit contributions
ho-0.127
 throats-0.105
anged-0.098
olph-0.090
keeping-0.089
Token figure
Token ID3785
Feature activation+6.88e-03

Pos logit contributions
ceed+0.263
 teaches+0.255
hest+0.253
 exact+0.235
ate+0.235
Neg logit contributions
ho-0.256
anged-0.202
 throats-0.198
 he-0.177
 after-0.176
Token dreamed
Token ID27947
Feature activation-7.55e-02

Pos logit contributions
ho+0.012
 throats+0.010
anged+0.009
olph+0.009
keeping+0.009
Neg logit contributions
ceed-0.012
 biggest-0.012
 exact-0.012
 teaches-0.011
ate-0.011
Token of
Token ID286
Feature activation+8.15e-04

Pos logit contributions
 biggest+0.118
ceed+0.118
 exact+0.112
 teaches+0.111
ate+0.108
Neg logit contributions
ho-0.147
 throats-0.128
anged-0.121
olph-0.118
keeping-0.114
Token soaring
Token ID32376
Feature activation-7.10e-02

Pos logit contributions
ho+0.002
 throats+0.001
anged+0.001
olph+0.001
keeping+0.001
Neg logit contributions
ceed-0.001
 biggest-0.001
 exact-0.001
 teaches-0.001
ate-0.001
Token higher
Token ID2440
Feature activation-2.82e-02

Pos logit contributions
ceed+0.145
 biggest+0.142
 exact+0.138
 teaches+0.137
ate+0.135
Neg logit contributions
ho-0.106
 throats-0.085
anged-0.079
olph-0.075
keeping-0.072
Token than
Token ID621
Feature activation+8.67e-02

Pos logit contributions
 biggest+0.056
 most+0.054
ate+0.053
 exact+0.053
 two+0.053
Neg logit contributions
ho-0.042
 throats-0.038
anged-0.037
olph-0.036
keeping-0.032
Token any
Token ID597
Feature activation-8.86e-02

Pos logit contributions
ho+0.147
 throats+0.137
anged+0.132
olph+0.124
keeping+0.110
Neg logit contributions
 biggest-0.155
 two-0.149
 most-0.149
 smallest-0.142
 best-0.142
Token other
Token ID584
Feature activation-1.93e-02

Pos logit contributions
 biggest+0.109
ceed+0.109
 exact+0.102
 teaches+0.100
ate+0.098
Neg logit contributions
ho-0.202
 throats-0.179
anged-0.172
olph-0.168
keeping-0.164
Token figure
Token ID3785
Feature activation+7.19e-03

Pos logit contributions
ceed+0.033
 teaches+0.031
 biggest+0.030
 exact+0.030
ate+0.030
Neg logit contributions
ho-0.035
anged-0.029
 throats-0.029
keeping-0.027
olph-0.026
Token in
Token ID287
Feature activation+1.73e-03

Pos logit contributions
ho+0.012
 throats+0.010
anged+0.010
olph+0.009
keeping+0.009
Neg logit contributions
 biggest-0.013
ceed-0.013
 exact-0.013
 teaches-0.013
ate-0.012
Token the
Token ID262
Feature activation-1.31e-01

Pos logit contributions
ho+0.003
 throats+0.002
keeping+0.002
anged+0.002
olph+0.002
Neg logit contributions
ceed-0.003
 biggest-0.003
 exact-0.003
 teaches-0.003
hest-0.003
Token.
Token ID13
Feature activation-6.12e-03

Pos logit contributions
ho+0.061
 throats+0.052
anged+0.048
olph+0.047
keeping+0.045
Neg logit contributions
 biggest-0.060
ceed-0.058
 teaches-0.056
 exact-0.056
ate-0.056
Token↵
Token ID198
Feature activation-1.42e-01

Pos logit contributions
ceed+0.010
 biggest+0.009
 exact+0.009
ate+0.008
hest+0.008
Neg logit contributions
ho-0.011
 throats-0.010
anged-0.010
olph-0.010
 he-0.009
Token↵
Token ID198
Feature activation-1.42e-01

Pos logit contributions
ceed+0.266
 biggest+0.265
 exact+0.256
 nic+0.252
 prett+0.237
Neg logit contributions
ho-0.228
 he-0.198
olph-0.190
 him-0.176
 to-0.175
TokenOne
Token ID3198
Feature activation-1.34e-01

Pos logit contributions
 biggest+0.150
ceed+0.146
 exact+0.121
 nic+0.117
 teaches+0.115
Neg logit contributions
ho-0.356
 throats-0.301
olph-0.301
anged-0.294
 after-0.292
Token day
Token ID1110
Feature activation+8.45e-02

Pos logit contributions
 biggest+0.233
ceed+0.227
 exact+0.221
 teaches+0.218
ate+0.215
Neg logit contributions
ho-0.240
 throats-0.209
anged-0.198
olph-0.192
keeping-0.190
Token,
Token ID11
Feature activation+1.37e-01

Pos logit contributions
ho+0.136
 throats+0.115
keeping+0.107
anged+0.104
olph+0.099
Neg logit contributions
 biggest-0.160
 two-0.160
ceed-0.157
ate-0.152
 best-0.151
Token they
Token ID484
Feature activation+7.13e-02

Pos logit contributions
ho+0.204
 throats+0.165
anged+0.155
keeping+0.147
olph+0.145
Neg logit contributions
 biggest-0.282
ceed-0.276
 two-0.270
 exact-0.269
ate-0.265
Token see
Token ID766
Feature activation-4.28e-02

Pos logit contributions
ho+0.119
 throats+0.098
anged+0.093
olph+0.089
keeping+0.086
Neg logit contributions
ceed-0.133
 biggest-0.132
 exact-0.126
 teaches-0.124
ate-0.122
Token a
Token ID257
Feature activation-1.25e-01

Pos logit contributions
ceed+0.058
 exact+0.053
 biggest+0.053
 teaches+0.052
hest+0.051
Neg logit contributions
ho-0.091
 throats-0.082
anged-0.078
olph-0.075
 he-0.073
Token big
Token ID1263
Feature activation-6.34e-02

Pos logit contributions
 biggest+0.127
ceed+0.118
 exact+0.111
 two+0.104
 teaches+0.103
Neg logit contributions
ho-0.311
 throats-0.288
anged-0.283
keeping-0.273
olph-0.271
Token cable
Token ID7862
Feature activation+4.75e-02

Pos logit contributions
 biggest+0.091
 two+0.084
 exact+0.082
 same+0.081
 best+0.081
Neg logit contributions
ho-0.126
anged-0.122
keeping-0.120
 throats-0.119
olph-0.111
Token.
Token ID13
Feature activation-4.33e-03

Pos logit contributions
ho+0.128
anged+0.121
 throats+0.119
olph+0.117
ged+0.110
Neg logit contributions
 two-0.091
 exact-0.090
 biggest-0.090
 teaches-0.084
ceed-0.084
Token It
Token ID632
Feature activation+1.63e-02

Pos logit contributions
ceed+0.007
 biggest+0.007
 exact+0.006
 teaches+0.006
ate+0.006
Neg logit contributions
ho-0.009
 throats-0.007
anged-0.007
olph-0.007
keeping-0.007
Token had
Token ID550
Feature activation-4.27e-02

Pos logit contributions
ho+0.034
 throats+0.030
anged+0.029
olph+0.028
keeping+0.028
Neg logit contributions
 biggest-0.023
ceed-0.023
 exact-0.022
 teaches-0.022
ate-0.021
Token some
Token ID617
Feature activation-1.22e-01

Pos logit contributions
ceed+0.054
 biggest+0.053
 exact+0.051
 teaches+0.050
ate+0.050
Neg logit contributions
ho-0.095
 throats-0.084
anged-0.079
olph-0.078
keeping-0.077
Token food
Token ID2057
Feature activation+5.44e-02

Pos logit contributions
 biggest+0.195
 exact+0.178
ceed+0.177
 two+0.177
 most+0.174
Neg logit contributions
ho-0.244
 throats-0.208
anged-0.207
olph-0.203
keeping-0.199
Token that
Token ID326
Feature activation+2.33e-02

Pos logit contributions
ho+0.092
anged+0.090
 throats+0.090
olph+0.087
gie+0.076
Neg logit contributions
 two-0.087
 biggest-0.085
 best-0.083
 most-0.081
 exact-0.081
Token was
Token ID373
Feature activation-6.84e-02

Pos logit contributions
ho+0.041
 throats+0.036
anged+0.035
olph+0.034
keeping+0.032
Neg logit contributions
 biggest-0.040
ceed-0.040
 teaches-0.038
 exact-0.038
 most-0.037
Token all
Token ID477
Feature activation+2.32e-02

Pos logit contributions
 biggest+0.101
 two+0.097
 exact+0.096
ceed+0.094
 most+0.093
Neg logit contributions
ho-0.137
anged-0.125
 throats-0.120
olph-0.118
keeping-0.114
Token black
Token ID2042
Feature activation-7.12e-02

Pos logit contributions
ho+0.045
 throats+0.041
anged+0.041
olph+0.038
ged+0.035
Neg logit contributions
 biggest-0.037
 exact-0.036
 teaches-0.036
ceed-0.035
 two-0.034
Token,
Token ID11
Feature activation+1.39e-01

Pos logit contributions
 two+0.107
 biggest+0.097
 smallest+0.090
ate+0.089
 most+0.087
Neg logit contributions
ho-0.148
 throats-0.118
anged-0.114
olph-0.111
 whining-0.108
Token and
Token ID290
Feature activation+1.90e-01

Pos logit contributions
ho+0.276
anged+0.267
olph+0.238
 throats+0.227
owing+0.221
Neg logit contributions
 most-0.227
 biggest-0.226
 two-0.220
 exact-0.214
 teaches-0.205
Token to
Token ID284
Feature activation+1.12e-01

Pos logit contributions
ho+0.149
 throats+0.142
anged+0.131
olph+0.123
 tongues+0.117
Neg logit contributions
 biggest-0.147
 exact-0.143
 two-0.140
 teaches-0.140
 most-0.139
Token be
Token ID307
Feature activation-4.44e-02

Pos logit contributions
ho+0.218
 throats+0.190
anged+0.185
olph+0.180
keeping+0.168
Neg logit contributions
ceed-0.173
 biggest-0.170
 exact-0.169
 teaches-0.168
 two-0.162
Token so
Token ID523
Feature activation+7.54e-02

Pos logit contributions
 biggest+0.076
ceed+0.074
 exact+0.072
 teaches+0.071
ate+0.070
Neg logit contributions
ho-0.083
 throats-0.069
anged-0.066
olph-0.063
keeping-0.060
Token friendly
Token ID8030
Feature activation-3.65e-02

Pos logit contributions
ho+0.152
 throats+0.138
anged+0.133
olph+0.130
keeping+0.126
Neg logit contributions
 biggest-0.114
ceed-0.109
 exact-0.106
 two-0.103
 teaches-0.103
Token and
Token ID290
Feature activation+1.88e-01

Pos logit contributions
 biggest+0.056
 two+0.055
 teaches+0.055
 most+0.054
 exact+0.053
Neg logit contributions
ho-0.069
 throats-0.065
anged-0.064
olph-0.058
 tongues-0.054
Token thanked
Token ID26280
Feature activation-3.34e-02

Pos logit contributions
ho+0.391
 throats+0.343
anged+0.334
olph+0.326
keeping+0.308
Neg logit contributions
 biggest-0.273
ceed-0.268
 exact-0.257
 teaches-0.254
ate-0.244
Token her
Token ID607
Feature activation-3.04e-02

Pos logit contributions
ceed+0.059
 biggest+0.058
 exact+0.056
 teaches+0.055
ate+0.055
Neg logit contributions
ho-0.059
 throats-0.050
anged-0.047
olph-0.046
keeping-0.044
Token in
Token ID287
Feature activation-1.16e-03

Pos logit contributions
ceed+0.058
 biggest+0.057
 exact+0.055
 teaches+0.054
ate+0.054
Neg logit contributions
ho-0.049
 throats-0.040
anged-0.037
olph-0.036
keeping-0.035
Token return
Token ID1441
Feature activation-8.23e-03

Pos logit contributions
ceed+0.002
 biggest+0.002
ate+0.002
hest+0.002
 exact+0.002
Neg logit contributions
ho-0.002
 throats-0.002
anged-0.002
keeping-0.001
olph-0.001
Token.
Token ID13
Feature activation-4.97e-03

Pos logit contributions
 biggest+0.014
 exact+0.013
ate+0.013
ceed+0.013
 two+0.013
Neg logit contributions
ho-0.015
 throats-0.013
anged-0.012
olph-0.012
gie-0.011
Token␣
Token ID220
Feature activation-1.38e-01

Pos logit contributions
ceed+0.009
 biggest+0.008
hest+0.008
 exact+0.007
redible+0.007
Neg logit contributions
ho-0.009
 throats-0.008
 he-0.008
anged-0.007
olph-0.007
Token a
Token ID257
Feature activation-1.22e-01

Pos logit contributions
ho+0.037
 throats+0.033
anged+0.032
olph+0.031
keeping+0.030
Neg logit contributions
 biggest-0.020
ceed-0.019
 exact-0.018
 teaches-0.017
 two-0.017
Token sh
Token ID427
Feature activation-4.32e-02

Pos logit contributions
 biggest+0.181
ceed+0.178
 exact+0.171
 teaches+0.167
ate+0.162
Neg logit contributions
ho-0.248
 throats-0.217
anged-0.211
olph-0.202
keeping-0.198
Tokenaw
Token ID707
Feature activation+1.10e-02

Pos logit contributions
ceed+0.070
 biggest+0.069
 exact+0.067
 teaches+0.065
ate+0.065
Neg logit contributions
ho-0.082
 throats-0.071
anged-0.068
olph-0.065
keeping-0.063
Tokenl
Token ID75
Feature activation-8.66e-02

Pos logit contributions
ho+0.020
 throats+0.020
anged+0.019
olph+0.018
keeping+0.018
Neg logit contributions
 biggest-0.016
ceed-0.016
 exact-0.015
 two-0.015
 best-0.014
Token.
Token ID13
Feature activation-6.16e-03

Pos logit contributions
ceed+0.154
 biggest+0.151
 exact+0.145
 teaches+0.143
ate+0.141
Neg logit contributions
ho-0.149
 throats-0.129
anged-0.121
olph-0.115
keeping-0.111
Token Then
Token ID3244
Feature activation+4.24e-02

Pos logit contributions
ceed+0.010
 biggest+0.010
 exact+0.010
hest+0.009
 teaches+0.009
Neg logit contributions
ho-0.012
 throats-0.010
anged-0.009
olph-0.009
keeping-0.009
Token,
Token ID11
Feature activation+1.37e-01

Pos logit contributions
ho+0.077
 throats+0.066
olph+0.065
keeping+0.065
anged+0.065
Neg logit contributions
 two-0.078
 biggest-0.074
 most-0.072
 best-0.072
 exact-0.067
Token she
Token ID673
Feature activation+7.80e-02

Pos logit contributions
ho+0.213
 throats+0.188
anged+0.180
olph+0.174
keeping+0.167
Neg logit contributions
 biggest-0.271
 two-0.269
 most-0.262
ceed-0.261
 exact-0.250
Token began
Token ID2540
Feature activation-6.63e-02

Pos logit contributions
ho+0.129
 throats+0.110
anged+0.102
olph+0.098
keeping+0.097
Neg logit contributions
ceed-0.145
 biggest-0.144
 exact-0.138
 teaches-0.138
ate-0.137
Token to
Token ID284
Feature activation+1.09e-01

Pos logit contributions
 two+0.136
 biggest+0.134
 best+0.127
ate+0.127
 exact+0.126
Neg logit contributions
ho-0.094
 throats-0.076
olph-0.071
anged-0.069
keeping-0.068
Token roll
Token ID4836
Feature activation-6.77e-02

Pos logit contributions
ho+0.202
 throats+0.173
anged+0.165
olph+0.161
keeping+0.155
Neg logit contributions
 biggest-0.187
ceed-0.183
 exact-0.181
ate-0.174
 two-0.173
Tokenaffe
Token ID21223
Feature activation+7.58e-02

Pos logit contributions
 biggest+0.021
ceed+0.021
 teaches+0.020
hest+0.020
 exact+0.020
Neg logit contributions
ho-0.022
 throats-0.020
keeping-0.018
anged-0.018
olph-0.018
Token!
Token ID0
Feature activation+1.21e-03

Pos logit contributions
ho+0.135
 throats+0.121
anged+0.111
olph+0.109
keeping+0.105
Neg logit contributions
 biggest-0.127
ceed-0.127
 exact-0.121
 teaches-0.121
 two-0.119
Token␣
Token ID220
Feature activation-1.34e-01

Pos logit contributions
ho+0.002
anged+0.002
 throats+0.002
olph+0.002
 he+0.002
Neg logit contributions
ceed-0.002
 biggest-0.002
 exact-0.002
ate-0.002
 teaches-0.002
Token↵
Token ID198
Feature activation-1.34e-01

Pos logit contributions
 exact+0.222
 biggest+0.221
 simplest+0.216
ceed+0.215
 nic+0.214
Neg logit contributions
ho-0.209
 he-0.199
anged-0.186
 him-0.186
olph-0.180
Token↵
Token ID198
Feature activation-1.33e-01

Pos logit contributions
 biggest+0.249
 exact+0.244
 nic+0.239
 prett+0.236
ceed+0.232
Neg logit contributions
ho-0.200
 he-0.186
olph-0.174
 him-0.159
ged-0.156
TokenBut
Token ID1537
Feature activation+3.79e-02

Pos logit contributions
 biggest+0.205
 best+0.181
 exact+0.180
 nic+0.178
 smallest+0.173
Neg logit contributions
ho-0.261
 he-0.219
olph-0.212
 dev-0.200
ged-0.199
Token right
Token ID826
Feature activation-1.30e-02

Pos logit contributions
ho+0.071
 throats+0.066
anged+0.061
keeping+0.060
olph+0.059
Neg logit contributions
 two-0.061
 biggest-0.060
 most-0.058
 exact-0.057
ceed-0.056
Token at
Token ID379
Feature activation+4.44e-03

Pos logit contributions
 biggest+0.023
 two+0.022
 exact+0.022
 most+0.022
ate+0.022
Neg logit contributions
ho-0.023
 throats-0.020
anged-0.019
olph-0.019
keeping-0.019
Token the
Token ID262
Feature activation-1.32e-01

Pos logit contributions
ho+0.009
 throats+0.007
anged+0.007
olph+0.007
keeping+0.007
Neg logit contributions
ceed-0.007
 biggest-0.007
 exact-0.007
 teaches-0.006
ate-0.006
Token end
Token ID886
Feature activation-2.88e-02

Pos logit contributions
ceed+0.187
ate+0.179
 teaches+0.166
 biggest+0.155
hest+0.154
Neg logit contributions
ho-0.284
 throats-0.222
 after-0.213
anged-0.209
 crab-0.206
Token,
Token ID11
Feature activation+1.38e-01

Pos logit contributions
 biggest+0.041
ceed+0.039
 two+0.038
 teaches+0.038
 exact+0.037
Neg logit contributions
ho-0.060
 throats-0.055
olph-0.051
anged-0.051
keeping-0.050
Token Lily
Token ID20037
Feature activation-4.32e-03

Pos logit contributions
ho+0.284
keeping+0.261
anged+0.251
 throats+0.244
owing+0.241
Neg logit contributions
 two-0.223
 biggest-0.219
 most-0.213
 same-0.201
ceed-0.198
Token went
Token ID1816
Feature activation+8.55e-03

Pos logit contributions
 biggest+0.007
ceed+0.007
 exact+0.006
 teaches+0.006
ate+0.006
Neg logit contributions
ho-0.008
 throats-0.007
anged-0.007
olph-0.007
keeping-0.007
Token to
Token ID284
Feature activation+1.13e-01

Pos logit contributions
ho+0.012
 throats+0.010
anged+0.009
keeping+0.008
olph+0.008
Neg logit contributions
 biggest-0.019
ceed-0.018
 two-0.018
ate-0.018
 best-0.018
Token a
Token ID257
Feature activation-1.21e-01

Pos logit contributions
ho+0.145
 throats+0.119
olph+0.116
anged+0.113
 him+0.111
Neg logit contributions
ceed-0.241
 biggest-0.231
hest-0.230
 exact-0.228
 teaches-0.227
Token dance
Token ID9280
Feature activation-2.55e-02

Pos logit contributions
 biggest+0.139
ceed+0.138
 exact+0.130
 teaches+0.125
ate+0.124
Neg logit contributions
ho-0.283
 throats-0.254
anged-0.245
olph-0.238
keeping-0.233
Token competition
Token ID5449
Feature activation+2.19e-02

Pos logit contributions
ceed+0.043
 biggest+0.043
 exact+0.041
 teaches+0.041
ate+0.040
Neg logit contributions
ho-0.046
 throats-0.040
anged-0.038
olph-0.036
keeping-0.035
Token where
Token ID810
Feature activation-7.14e-02

Pos logit contributions
ho+0.028
anged+0.021
 throats+0.021
keeping+0.019
olph+0.019
Neg logit contributions
 biggest-0.049
ceed-0.049
 exact-0.048
 teaches-0.046
 smallest-0.046
Token she
Token ID673
Feature activation+8.27e-02

Pos logit contributions
ceed+0.145
 biggest+0.145
 exact+0.139
 teaches+0.138
ate+0.137
Neg logit contributions
ho-0.105
 throats-0.087
anged-0.082
olph-0.078
keeping-0.074
Token saw
Token ID2497
Feature activation-1.18e-01

Pos logit contributions
ho+0.109
anged+0.084
 throats+0.081
olph+0.077
ged+0.070
Neg logit contributions
ceed-0.182
 exact-0.176
 biggest-0.176
hest-0.164
 teaches-0.163
Token a
Token ID257
Feature activation-1.20e-01

Pos logit contributions
ceed+0.186
 exact+0.185
 teaches+0.179
hest+0.172
 biggest+0.170
Neg logit contributions
ho-0.215
 throats-0.181
anged-0.174
 he-0.171
olph-0.169
Token famous
Token ID5863
Feature activation-2.60e-02

Pos logit contributions
ceed+0.173
 exact+0.161
 biggest+0.161
ate+0.159
 teaches+0.159
Neg logit contributions
ho-0.249
 throats-0.212
olph-0.197
anged-0.195
keeping-0.184
Token bright
Token ID6016
Feature activation-5.57e-02

Pos logit contributions
ho+0.213
 throats+0.186
anged+0.179
olph+0.174
keeping+0.169
Neg logit contributions
ceed-0.146
 biggest-0.145
 exact-0.137
 teaches-0.136
ate-0.134
Token and
Token ID290
Feature activation+1.91e-01

Pos logit contributions
 two+0.079
 teaches+0.076
 best+0.075
 biggest+0.074
 same+0.072
Neg logit contributions
ho-0.115
olph-0.100
anged-0.100
 throats-0.099
keeping-0.096
Token makes
Token ID1838
Feature activation-1.11e-01

Pos logit contributions
ho+0.396
 throats+0.345
anged+0.332
olph+0.325
keeping+0.315
Neg logit contributions
 biggest-0.274
ceed-0.274
 exact-0.260
 teaches-0.256
ate-0.251
Token her
Token ID607
Feature activation-2.67e-02

Pos logit contributions
 biggest+0.189
ceed+0.189
 exact+0.177
 two+0.177
 teaches+0.176
Neg logit contributions
ho-0.201
 throats-0.169
olph-0.163
anged-0.163
keeping-0.156
Token eyes
Token ID2951
Feature activation+2.67e-02

Pos logit contributions
 biggest+0.046
 best+0.042
 two+0.042
 most+0.041
 same+0.040
Neg logit contributions
ho-0.051
olph-0.046
anged-0.041
dd-0.040
keeping-0.040
Token hurt
Token ID5938
Feature activation-4.20e-02

Pos logit contributions
ho+0.054
 throats+0.046
olph+0.045
anged+0.045
keeping+0.044
Neg logit contributions
 biggest-0.041
ceed-0.040
 exact-0.039
 teaches-0.038
 two-0.037
Token.
Token ID13
Feature activation+6.85e-04

Pos logit contributions
 two+0.053
 best+0.048
 biggest+0.046
 most+0.044
ceed+0.042
Neg logit contributions
ho-0.093
olph-0.085
anged-0.084
 throats-0.083
ipop-0.079
Token
Token ID198
Feature activation-1.34e-01

Pos logit contributions
ho+0.002
 throats+0.001
anged+0.001
olph+0.001
keeping+0.001
Neg logit contributions
ceed-0.001
 biggest-0.001
 exact-0.001
 teaches-0.001
hest-0.001
Token
Token ID198
Feature activation-1.34e-01

Pos logit contributions
 biggest+0.182
 exact+0.177
ceed+0.176
 nic+0.168
 teaches+0.160
Neg logit contributions
ho-0.299
 throats-0.257
anged-0.250
olph-0.247
keeping-0.233
TokenMom
Token ID29252
Feature activation+1.03e-01

Pos logit contributions
ceed+0.153
 biggest+0.149
 exact+0.141
ate+0.140
 teaches+0.138
Neg logit contributions
ho-0.317
 throats-0.284
anged-0.274
olph-0.267
keeping-0.263
Token hears
Token ID23180
Feature activation-8.87e-02

Pos logit contributions
ho+0.187
 throats+0.161
anged+0.153
olph+0.148
keeping+0.141
Neg logit contributions
ceed-0.173
 biggest-0.173
 exact-0.166
 teaches-0.162
ate-0.161
Token her
Token ID607
Feature activation-3.17e-02

Pos logit contributions
 two+0.101
 biggest+0.100
 most+0.094
 best+0.092
 exact+0.090
Neg logit contributions
ho-0.137
anged-0.132
olph-0.112
 throats-0.111
owing-0.108
Token incredible
Token ID8082
Feature activation-1.90e-02

Pos logit contributions
ceed+0.057
 biggest+0.052
 teaches+0.052
ate+0.051
 exact+0.051
Neg logit contributions
ho-0.056
 throats-0.048
anged-0.045
olph-0.045
keeping-0.041
Token m
Token ID285
Feature activation+6.82e-02

Pos logit contributions
ceed+0.036
 biggest+0.035
 exact+0.034
 teaches+0.034
ate+0.033
Neg logit contributions
ho-0.031
 throats-0.026
anged-0.025
olph-0.024
keeping-0.023
Tokenittens
Token ID34978
Feature activation+1.15e-01

Pos logit contributions
ho+0.094
 throats+0.077
anged+0.070
keeping+0.067
olph+0.066
Neg logit contributions
 biggest-0.143
ceed-0.139
 exact-0.134
ate-0.132
 teaches-0.131
Token.
Token ID13
Feature activation-7.94e-04

Pos logit contributions
ho+0.247
anged+0.231
 throats+0.227
olph+0.220
 inspection+0.198
Neg logit contributions
 biggest-0.158
 best-0.154
 two-0.149
 most-0.146
 exact-0.143
Token<|endoftext|>
Token ID50256
Feature activation-1.41e-01

Pos logit contributions
ceed+0.001
 biggest+0.001
 exact+0.001
 teaches+0.001
ate+0.001
Neg logit contributions
ho-0.002
 throats-0.001
anged-0.001
olph-0.001
keeping-0.001
TokenOnce
Token ID7454
Feature activation+1.37e-03

Pos logit contributions
 biggest+0.233
 exact+0.229
ceed+0.227
 prett+0.212
 nic+0.212
Neg logit contributions
ho-0.258
olph-0.226
keeping-0.220
 he-0.213
 him-0.207
Token upon
Token ID2402
Feature activation-1.67e-01

Pos logit contributions
ho+0.003
 throats+0.002
anged+0.002
olph+0.002
 he+0.002
Neg logit contributions
ceed-0.002
 biggest-0.002
 exact-0.002
 teaches-0.002
ate-0.002
Token a
Token ID257
Feature activation-1.28e-01

Pos logit contributions
ceed+0.192
 biggest+0.192
 exact+0.181
 teaches+0.178
ate+0.172
Neg logit contributions
ho-0.382
 throats-0.343
olph-0.325
anged-0.322
keeping-0.318
Token time
Token ID640
Feature activation+8.04e-02

Pos logit contributions
ceed+0.177
 biggest+0.172
 exact+0.158
ate+0.156
 teaches+0.156
Neg logit contributions
ho-0.279
 throats-0.243
olph-0.230
anged-0.229
keeping-0.210
Token,
Token ID11
Feature activation+1.40e-01

Pos logit contributions
ho+0.109
 throats+0.090
anged+0.083
olph+0.078
keeping+0.076
Neg logit contributions
ceed-0.171
 biggest-0.171
 exact-0.165
 teaches-0.163
ate-0.162
Token it
Token ID340
Feature activation+2.65e-02

Pos logit contributions
ho+0.313
anged+0.281
 throats+0.277
keeping+0.263
olph+0.258
Neg logit contributions
 biggest-0.196
 teaches-0.192
ceed-0.191
 two-0.189
 exact-0.185
Token is
Token ID318
Feature activation-4.66e-02

Pos logit contributions
ho+0.053
 throats+0.045
anged+0.042
olph+0.041
 he+0.041
Neg logit contributions
 biggest-0.035
ceed-0.035
ate-0.034
 exact-0.033
Each-0.033
Token time
Token ID640
Feature activation+8.36e-02

Pos logit contributions
ceed+0.064
 biggest+0.061
 exact+0.058
ate+0.057
hest+0.057
Neg logit contributions
ho-0.094
 throats-0.087
olph-0.080
anged-0.079
keeping-0.077
Token to
Token ID284
Feature activation+1.12e-01

Pos logit contributions
ho+0.111
 throats+0.086
olph+0.079
anged+0.079
keeping+0.071
Neg logit contributions
 biggest-0.184
ceed-0.178
 exact-0.176
 teaches-0.175
 two-0.174
Token clean
Token ID3424
Feature activation+3.66e-02

Pos logit contributions
ho+0.177
 throats+0.153
anged+0.139
keeping+0.135
olph+0.134
Neg logit contributions
 biggest-0.209
ceed-0.204
 exact-0.196
 smallest-0.192
 teaches-0.191
Token your
Token ID534
Feature activation-4.69e-02

Pos logit contributions
ho+0.080
anged+0.068
 throats+0.068
olph+0.063
keeping+0.060
Neg logit contributions
 biggest-0.052
ate-0.050
ceed-0.050
 exact-0.049
 teaches-0.048
Token room
Token ID2119
Feature activation+7.65e-02

Pos logit contributions
ceed+0.070
 biggest+0.068
 exact+0.066
 teaches+0.065
ate+0.063
Neg logit contributions
ho-0.093
 throats-0.084
anged-0.079
olph-0.076
keeping-0.075
Token.
Token ID13
Feature activation+4.29e-03

Pos logit contributions
ho+0.130
 throats+0.112
anged+0.104
olph+0.101
keeping+0.100
Neg logit contributions
ceed-0.136
 biggest-0.134
 exact-0.128
 teaches-0.126
ate-0.124
Token I
Token ID314
Feature activation+2.39e-02

Pos logit contributions
ho+0.010
 throats+0.008
anged+0.008
olph+0.008
keeping+0.008
Neg logit contributions
 biggest-0.006
ceed-0.006
 teaches-0.005
 exact-0.005
ate-0.005
Token will
Token ID481
Feature activation+5.33e-02

Pos logit contributions
ho+0.050
 throats+0.044
anged+0.042
olph+0.041
keeping+0.040
Neg logit contributions
 biggest-0.033
ceed-0.033
 teaches-0.032
 exact-0.032
ate-0.031
Token watch
Token ID2342
Feature activation-2.22e-02

Pos logit contributions
ho+0.096
 throats+0.083
anged+0.078
olph+0.076
keeping+0.074
Neg logit contributions
ceed-0.090
 biggest-0.090
 exact-0.085
 teaches-0.084
ate-0.083
Token see
Token ID766
Feature activation-4.51e-02

Pos logit contributions
 biggest+0.079
ceed+0.079
 exact+0.075
ate+0.074
 teaches+0.073
Neg logit contributions
ho-0.103
 throats-0.089
anged-0.086
olph-0.084
keeping-0.080
Token if
Token ID611
Feature activation+6.12e-03

Pos logit contributions
ceed+0.092
 teaches+0.088
 exact+0.087
 biggest+0.085
V+0.084
Neg logit contributions
ho-0.062
 throats-0.050
 after-0.047
anged-0.046
olph-0.045
Token this
Token ID428
Feature activation-9.09e-02

Pos logit contributions
ho+0.011
 throats+0.009
anged+0.009
olph+0.009
keeping+0.008
Neg logit contributions
 biggest-0.011
ceed-0.011
 exact-0.010
 teaches-0.010
ate-0.010
Token works
Token ID2499
Feature activation-1.02e-01

Pos logit contributions
ceed+0.157
 biggest+0.150
 exact+0.146
ate+0.143
V+0.143
Neg logit contributions
ho-0.162
 throats-0.138
anged-0.124
olph-0.123
keeping-0.118
Token".
Token ID1911
Feature activation-3.25e-02

Pos logit contributions
ceed+0.153
 biggest+0.152
 exact+0.148
V+0.141
ate+0.140
Neg logit contributions
ho-0.190
 throats-0.164
anged-0.156
olph-0.154
keeping-0.149
Token
Token ID198
Feature activation-1.43e-01

Pos logit contributions
ceed+0.039
 biggest+0.038
 teaches+0.036
 two+0.035
 exact+0.035
Neg logit contributions
ho-0.077
 throats-0.066
anged-0.064
olph-0.062
ged-0.061
Token
Token ID198
Feature activation-1.42e-01

Pos logit contributions
 exact+0.228
 biggest+0.227
ceed+0.211
 nic+0.209
 smallest+0.207
Neg logit contributions
ho-0.274
olph-0.235
 he-0.226
 throats-0.219
keeping-0.215
TokenThe
Token ID464
Feature activation-1.51e-01

Pos logit contributions
 biggest+0.159
 exact+0.144
ceed+0.143
 teaches+0.135
 smallest+0.131
Neg logit contributions
ho-0.345
 throats-0.298
olph-0.289
anged-0.283
keeping-0.278
Token duck
Token ID22045
Feature activation+5.21e-02

Pos logit contributions
ceed+0.349
 exact+0.325
ate+0.325
hest+0.324
 teaches+0.322
Neg logit contributions
ho-0.196
 throats-0.149
olph-0.138
 crab-0.120
anged-0.117
Token and
Token ID290
Feature activation+1.86e-01

Pos logit contributions
ho+0.093
 throats+0.082
anged+0.077
keeping+0.076
olph+0.074
Neg logit contributions
ceed-0.086
 biggest-0.086
 teaches-0.085
 two-0.084
 exact-0.082
Token the
Token ID262
Feature activation-1.37e-01

Pos logit contributions
ho+0.345
 throats+0.302
anged+0.296
keeping+0.281
olph+0.274
Neg logit contributions
 biggest-0.300
ceed-0.295
 two-0.284
 exact-0.283
 teaches-0.277
Token out
Token ID503
Feature activation-4.75e-03

Pos logit contributions
 throats+0.095
ho+0.094
anged+0.092
olph+0.087
gie+0.080
Neg logit contributions
 biggest-0.104
 two-0.100
 best-0.099
ceed-0.097
 most-0.095
Token of
Token ID286
Feature activation-3.61e-04

Pos logit contributions
 biggest+0.007
ceed+0.007
 teaches+0.007
 exact+0.007
 two+0.007
Neg logit contributions
ho-0.009
 throats-0.008
anged-0.008
olph-0.008
keeping-0.008
Token mom
Token ID1995
Feature activation+1.38e-01

Pos logit contributions
ceed+0.001
 biggest+0.001
 exact+0.001
 teaches+0.001
ate+0.001
Neg logit contributions
ho-0.001
 throats-0.001
anged-0.001
olph-0.001
keeping-0.001
Tokenmy
Token ID1820
Feature activation+8.11e-02

Pos logit contributions
ho+0.332
 throats+0.302
anged+0.278
olph+0.275
keeping+0.267
Neg logit contributions
 biggest-0.155
 exact-0.151
 two-0.146
ceed-0.137
 most-0.137
Tokenâ
Token ID22940
Feature activation-3.43e-02

Pos logit contributions
ho+0.218
 throats+0.200
anged+0.189
olph+0.188
keeping+0.179
Neg logit contributions
 exact-0.063
 two-0.062
 biggest-0.062
 teaches-0.057
 best-0.055
Token€
Token ID26391
Feature activation-6.54e-02

Pos logit contributions
 biggest+0.054
ceed+0.050
 exact+0.047
 most+0.046
ate+0.046
Neg logit contributions
ho-0.068
 throats-0.059
anged-0.058
olph-0.056
keeping-0.055
Token™
Token ID8151
Feature activation+6.48e-02

Pos logit contributions
 biggest+0.120
ceed+0.120
 exact+0.115
 teaches+0.113
ate+0.111
Neg logit contributions
ho-0.110
 throats-0.093
anged-0.088
olph-0.085
keeping-0.082
Tokens
Token ID82
Feature activation-3.67e-02

Pos logit contributions
ho+0.112
 throats+0.095
anged+0.090
olph+0.087
keeping+0.084
Neg logit contributions
ceed-0.114
 biggest-0.114
 exact-0.109
 teaches-0.108
ate-0.106
Token pocket
Token ID10000
Feature activation-5.10e-03

Pos logit contributions
 biggest+0.058
 two+0.057
 exact+0.055
 teaches+0.055
ceed+0.054
Neg logit contributions
ho-0.067
anged-0.057
olph-0.057
 throats-0.056
gie-0.052
Token.
Token ID13
Feature activation-2.57e-03

Pos logit contributions
 biggest+0.009
ceed+0.009
 exact+0.009
 teaches+0.008
 two+0.008
Neg logit contributions
ho-0.009
 throats-0.008
anged-0.007
olph-0.007
keeping-0.007
Token It
Token ID632
Feature activation+1.76e-02

Pos logit contributions
ceed+0.005
 biggest+0.004
 teaches+0.004
 exact+0.004
 smallest+0.004
Neg logit contributions
ho-0.004
 throats-0.004
anged-0.004
olph-0.004
keeping-0.003
Token and
Token ID290
Feature activation+1.88e-01

Pos logit contributions
 best+0.016
 two+0.016
 most+0.016
 biggest+0.015
 teaches+0.014
Neg logit contributions
ho-0.022
anged-0.018
 throats-0.018
olph-0.018
keeping-0.017
Token independent
Token ID4795
Feature activation-2.25e-03

Pos logit contributions
ho+0.343
 throats+0.294
anged+0.278
olph+0.268
keeping+0.263
Neg logit contributions
ceed-0.314
 biggest-0.311
 exact-0.299
 teaches-0.293
ate-0.290
Token.
Token ID13
Feature activation+2.94e-03

Pos logit contributions
 biggest+0.004
ceed+0.004
 teaches+0.004
 exact+0.004
 best+0.004
Neg logit contributions
ho-0.004
 throats-0.004
anged-0.003
olph-0.003
keeping-0.003
Token<|endoftext|>
Token ID50256
Feature activation-1.35e-01

Pos logit contributions
ho+0.006
 throats+0.006
olph+0.005
anged+0.005
 he+0.005
Neg logit contributions
ceed-0.004
 biggest-0.004
 exact-0.004
ate-0.004
hest-0.003
TokenOnce
Token ID7454
Feature activation+3.28e-03

Pos logit contributions
 exact+0.244
 biggest+0.244
ceed+0.238
 nic+0.228
 prett+0.227
Neg logit contributions
ho-0.208
keeping-0.188
olph-0.186
 him-0.183
 he-0.180
Token upon
Token ID2402
Feature activation-1.62e-01

Pos logit contributions
ho+0.006
 throats+0.005
anged+0.005
olph+0.005
 he+0.005
Neg logit contributions
ceed-0.006
 biggest-0.005
 exact-0.005
 teaches-0.005
ate-0.005
Token a
Token ID257
Feature activation-1.23e-01

Pos logit contributions
ceed+0.215
 biggest+0.211
 exact+0.207
 teaches+0.207
ate+0.192
Neg logit contributions
ho-0.329
 throats-0.298
olph-0.284
 he-0.279
keeping-0.272
Token time
Token ID640
Feature activation+8.59e-02

Pos logit contributions
ceed+0.177
 biggest+0.169
 exact+0.159
ate+0.158
 teaches+0.157
Neg logit contributions
ho-0.257
 throats-0.226
anged-0.211
olph-0.211
owing-0.191
Token,
Token ID11
Feature activation+1.45e-01

Pos logit contributions
ho+0.111
 throats+0.087
anged+0.081
olph+0.077
keeping+0.071
Neg logit contributions
ceed-0.192
 biggest-0.191
 exact-0.185
 teaches-0.183
ate-0.181
Token there
Token ID612
Feature activation-4.63e-02

Pos logit contributions
ho+0.210
olph+0.155
 throats+0.150
anged+0.148
ged+0.140
Neg logit contributions
ceed-0.310
 teaches-0.298
 biggest-0.296
 exact-0.289
ate-0.283
Token was
Token ID373
Feature activation-6.39e-02

Pos logit contributions
ceed+0.081
 biggest+0.078
 exact+0.076
 teaches+0.074
ate+0.072
Neg logit contributions
ho-0.085
 throats-0.069
anged-0.066
olph-0.066
 he-0.060
Token very
Token ID845
Feature activation+8.52e-02

Pos logit contributions
 biggest+0.093
ceed+0.092
 exact+0.088
 teaches+0.087
ate+0.085
Neg logit contributions
ho-0.117
 throats-0.100
anged-0.097
olph-0.094
keeping-0.091
Token cheerful
Token ID37999
Feature activation-2.22e-02

Pos logit contributions
ho+0.179
 throats+0.155
anged+0.150
olph+0.148
keeping+0.145
Neg logit contributions
 biggest-0.126
ceed-0.120
 exact-0.119
 teaches-0.114
ate-0.113
Token.
Token ID13
Feature activation+4.50e-04

Pos logit contributions
 two+0.034
 biggest+0.033
 exact+0.033
 most+0.032
 best+0.032
Neg logit contributions
ho-0.045
anged-0.038
 throats-0.037
olph-0.035
gie-0.033
Token<|endoftext|>
Token ID50256
Feature activation-1.27e-01

Pos logit contributions
ho+0.001
 throats+0.001
anged+0.001
olph+0.001
keeping+0.001
Neg logit contributions
ceed-0.001
 biggest-0.001
 exact-0.001
ate-0.001
 teaches-0.001
TokenOnce
Token ID7454
Feature activation+9.26e-03

Pos logit contributions
 biggest+0.170
ceed+0.165
 exact+0.160
 teaches+0.153
 nic+0.150
Neg logit contributions
ho-0.269
olph-0.239
keeping-0.233
 he-0.223
 throats-0.221
Token upon
Token ID2402
Feature activation-1.60e-01

Pos logit contributions
ho+0.019
 throats+0.016
anged+0.015
olph+0.015
keeping+0.014
Neg logit contributions
ceed-0.014
 biggest-0.014
 exact-0.013
 teaches-0.013
ate-0.013
Token a
Token ID257
Feature activation-1.22e-01

Pos logit contributions
ceed+0.195
 biggest+0.191
 teaches+0.182
 exact+0.181
ate+0.175
Neg logit contributions
ho-0.350
 throats-0.317
olph-0.299
anged-0.293
keeping-0.292
Token time
Token ID640
Feature activation+8.54e-02

Pos logit contributions
ceed+0.170
 biggest+0.162
ate+0.153
 exact+0.149
 teaches+0.149
Neg logit contributions
ho-0.261
 throats-0.230
anged-0.217
olph-0.216
ged-0.196
Token,
Token ID11
Feature activation+1.43e-01

Pos logit contributions
ho+0.126
 throats+0.114
anged+0.103
keeping+0.100
olph+0.095
Neg logit contributions
 biggest-0.167
ceed-0.164
 two-0.159
 exact-0.159
 teaches-0.155
Token there
Token ID612
Feature activation-5.01e-02

Pos logit contributions
ho+0.234
 throats+0.188
anged+0.180
olph+0.179
keeping+0.162
Neg logit contributions
ceed-0.274
 biggest-0.268
 teaches-0.259
 exact-0.257
ate-0.253
Token was
Token ID373
Feature activation-6.39e-02

Pos logit contributions
ceed+0.078
 biggest+0.077
 exact+0.074
 teaches+0.072
ate+0.071
Neg logit contributions
ho-0.099
 throats-0.085
anged-0.081
olph-0.079
keeping-0.075
Token spent
Token ID3377
Feature activation-1.21e-01

Pos logit contributions
ho+0.130
 throats+0.114
olph+0.102
keeping+0.101
anged+0.100
Neg logit contributions
 biggest-0.161
 two-0.158
 exact-0.156
 most-0.156
 best-0.154
Token together
Token ID1978
Feature activation+1.92e-02

Pos logit contributions
 biggest+0.206
ceed+0.205
 exact+0.197
 teaches+0.194
ate+0.194
Neg logit contributions
ho-0.223
 throats-0.188
anged-0.179
olph-0.173
keeping-0.165
Token.
Token ID13
Feature activation-2.16e-03

Pos logit contributions
ho+0.034
 throats+0.031
anged+0.029
olph+0.028
gie+0.026
Neg logit contributions
 two-0.031
 exact-0.031
 biggest-0.030
 best-0.030
 most-0.029
Token<|endoftext|>
Token ID50256
Feature activation-1.29e-01

Pos logit contributions
ceed+0.003
 biggest+0.003
 exact+0.003
ate+0.003
 smallest+0.003
Neg logit contributions
ho-0.004
 throats-0.004
anged-0.004
 he-0.003
olph-0.003
TokenOnce
Token ID7454
Feature activation+6.23e-03

Pos logit contributions
 biggest+0.177
ceed+0.173
 exact+0.172
 teaches+0.161
 nic+0.158
Neg logit contributions
ho-0.271
olph-0.239
keeping-0.233
 he-0.226
 him-0.217
Token upon
Token ID2402
Feature activation-1.62e-01

Pos logit contributions
ho+0.013
 throats+0.011
anged+0.011
olph+0.010
keeping+0.010
Neg logit contributions
ceed-0.009
 biggest-0.009
 exact-0.009
 teaches-0.009
ate-0.008
Token a
Token ID257
Feature activation-1.24e-01

Pos logit contributions
ceed+0.192
 biggest+0.189
 teaches+0.180
 exact+0.178
ate+0.173
Neg logit contributions
ho-0.358
 throats-0.324
olph-0.309
anged-0.303
keeping-0.301
Token time
Token ID640
Feature activation+8.35e-02

Pos logit contributions
ceed+0.157
 biggest+0.153
 exact+0.142
ate+0.140
 teaches+0.139
Neg logit contributions
ho-0.279
 throats-0.247
anged-0.236
olph-0.233
keeping-0.219
Token,
Token ID11
Feature activation+1.40e-01

Pos logit contributions
ho+0.129
 throats+0.122
keeping+0.109
anged+0.109
olph+0.100
Neg logit contributions
 biggest-0.154
ceed-0.151
 two-0.150
 best-0.146
 exact-0.145
Token there
Token ID612
Feature activation-5.26e-02

Pos logit contributions
ho+0.252
 throats+0.217
anged+0.205
olph+0.197
keeping+0.193
Neg logit contributions
 biggest-0.240
ceed-0.240
 exact-0.231
 teaches-0.225
ate-0.224
Token was
Token ID373
Feature activation-6.67e-02

Pos logit contributions
ceed+0.080
 biggest+0.080
 exact+0.076
 teaches+0.075
ate+0.073
Neg logit contributions
ho-0.105
 throats-0.090
anged-0.086
olph-0.084
keeping-0.081
Token garden
Token ID11376
Feature activation+9.23e-02

Pos logit contributions
ceed+0.211
 teaches+0.186
V+0.184
hest+0.182
ate+0.181
Neg logit contributions
ho-0.279
 throats-0.233
keeping-0.203
anged-0.202
olph-0.202
Token bloom
Token ID29955
Feature activation-8.78e-02

Pos logit contributions
ho+0.148
anged+0.134
 throats+0.133
olph+0.133
keeping+0.118
Neg logit contributions
 biggest-0.172
 exact-0.164
 most-0.161
 two-0.159
 teaches-0.158
Token.
Token ID13
Feature activation-7.81e-04

Pos logit contributions
 biggest+0.173
ceed+0.173
 exact+0.166
 teaches+0.164
ate+0.163
Neg logit contributions
ho-0.134
 throats-0.112
anged-0.105
olph-0.101
keeping-0.096
Token<|endoftext|>
Token ID50256
Feature activation-1.28e-01

Pos logit contributions
ceed+0.001
 biggest+0.001
 exact+0.001
hest+0.001
ate+0.001
Neg logit contributions
ho-0.001
 throats-0.001
keeping-0.001
anged-0.001
olph-0.001
TokenOnce
Token ID7454
Feature activation+7.49e-03

Pos logit contributions
 biggest+0.166
ceed+0.160
 exact+0.155
 teaches+0.147
 nic+0.143
Neg logit contributions
ho-0.283
olph-0.248
keeping-0.242
 he-0.235
 throats-0.230
Token upon
Token ID2402
Feature activation-1.64e-01

Pos logit contributions
ho+0.015
 throats+0.013
anged+0.012
olph+0.012
keeping+0.011
Neg logit contributions
ceed-0.012
 biggest-0.011
 exact-0.011
 teaches-0.011
ate-0.010
Token a
Token ID257
Feature activation-1.25e-01

Pos logit contributions
ceed+0.202
 biggest+0.200
 teaches+0.190
 exact+0.188
ate+0.182
Neg logit contributions
ho-0.352
 throats-0.317
olph-0.301
keeping-0.295
anged-0.294
Token time
Token ID640
Feature activation+8.31e-02

Pos logit contributions
ceed+0.169
 biggest+0.164
ate+0.151
 exact+0.150
 teaches+0.149
Neg logit contributions
ho-0.273
 throats-0.238
anged-0.226
olph-0.225
keeping-0.207
Token,
Token ID11
Feature activation+1.42e-01

Pos logit contributions
ho+0.125
 throats+0.116
anged+0.103
keeping+0.102
olph+0.096
Neg logit contributions
 biggest-0.158
ceed-0.155
 two-0.152
 exact-0.150
 best-0.148
Token there
Token ID612
Feature activation-5.39e-02

Pos logit contributions
ho+0.236
 throats+0.195
anged+0.185
olph+0.182
keeping+0.169
Neg logit contributions
ceed-0.263
 biggest-0.259
 exact-0.248
 teaches-0.248
ate-0.244
Token was
Token ID373
Feature activation-6.77e-02

Pos logit contributions
ceed+0.083
 biggest+0.083
 exact+0.079
 teaches+0.077
ate+0.076
Neg logit contributions
ho-0.107
 throats-0.091
anged-0.087
olph-0.085
keeping-0.082
Token next
Token ID1306
Feature activation-1.08e-02

Pos logit contributions
 biggest+0.060
 best+0.055
 two+0.054
 most+0.054
 exact+0.052
Neg logit contributions
ho-0.075
 throats-0.065
anged-0.064
olph-0.063
keeping-0.061
Token adventure
Token ID8855
Feature activation+4.61e-02

Pos logit contributions
 biggest+0.017
 exact+0.016
ceed+0.016
 two+0.016
 best+0.015
Neg logit contributions
ho-0.022
 throats-0.019
anged-0.018
olph-0.017
keeping-0.017
Token!
Token ID0
Feature activation-1.13e-03

Pos logit contributions
ho+0.088
 throats+0.079
anged+0.075
olph+0.072
keeping+0.066
Neg logit contributions
 biggest-0.073
 exact-0.071
ceed-0.070
 most-0.068
 two-0.067
Token<|endoftext|>
Token ID50256
Feature activation-1.30e-01

Pos logit contributions
ceed+0.002
 biggest+0.002
 exact+0.002
ate+0.001
 smallest+0.001
Neg logit contributions
ho-0.002
 throats-0.002
anged-0.002
olph-0.002
 he-0.002
TokenOnce
Token ID7454
Feature activation+7.43e-03

Pos logit contributions
 biggest+0.170
ceed+0.164
 exact+0.159
 teaches+0.154
 nic+0.148
Neg logit contributions
ho-0.281
olph-0.251
keeping-0.242
 he-0.237
ged-0.228
Token upon
Token ID2402
Feature activation-1.64e-01

Pos logit contributions
ho+0.016
 throats+0.014
anged+0.013
olph+0.013
keeping+0.012
Neg logit contributions
 biggest-0.010
ceed-0.010
 exact-0.010
 teaches-0.010
ate-0.009
Token a
Token ID257
Feature activation-1.24e-01

Pos logit contributions
ceed+0.205
 biggest+0.202
 teaches+0.195
 exact+0.191
ate+0.185
Neg logit contributions
ho-0.348
 throats-0.314
olph-0.299
keeping-0.293
anged-0.292
Token time
Token ID640
Feature activation+8.20e-02

Pos logit contributions
ceed+0.173
 biggest+0.166
ate+0.155
 exact+0.152
 teaches+0.152
Neg logit contributions
ho-0.267
 throats-0.234
anged-0.221
olph-0.220
keeping-0.201
Token there
Token ID612
Feature activation-5.49e-02

Pos logit contributions
ho+0.123
 throats+0.113
anged+0.101
keeping+0.099
olph+0.094
Neg logit contributions
 biggest-0.157
ceed-0.154
 two-0.151
 exact-0.149
 best-0.147
Token was
Token ID373
Feature activation-6.71e-02

Pos logit contributions
ceed+0.087
 biggest+0.086
 exact+0.082
 teaches+0.081
ate+0.079
Neg logit contributions
ho-0.107
 throats-0.091
anged-0.087
olph-0.085
keeping-0.081
Token a
Token ID257
Feature activation-1.24e-01

Pos logit contributions
ceed+0.076
 biggest+0.073
 teaches+0.070
 exact+0.069
ate+0.067
Neg logit contributions
ho-0.157
 throats-0.141
anged-0.134
olph-0.134
keeping-0.129