TOP ACTIVATIONS
MAX = 0.295

not want to be friends anymore . They turn away and
glad you are not rude anymore . You are very wise
play catch near the hut anymore . They learn to respect
want to play with blocks anymore . They wanted to stay
he wouldn 't get punished anymore . The
't sign at the traffic anymore ." Jess
't even remember to frown anymore . The end . <|endoftext|>
that Sammy wouldn 't complain anymore . From that day on
not play with the car anymore . S ara
't anxious about making requests anymore . She knew that if
they couldn 't play outside anymore . But Tim my had
to lend her the meal anymore . They hugged and smiled
they couldn 't play together anymore . The bald man said
't let you play here anymore !" Tim didn
to play with his friends anymore . He missed his favorite
that he couldn 't swim anymore . He became very worried
to be friends with Lily anymore . He takes all the
to go to the party anymore ," he said .
about the boy 's train anymore . They were happy with
was too tired to move anymore . He felt

MIN ACTIVATIONS
MIN = -0.208

friendly . He crou ched down and said ,
eyes . Jill and Mike crou ched down to pet the
said $ 1 . 99 per pound . "
. P oppy crou ched down and said ,
its mouth . Doug ie crou ched down and said quietly
the bunny , so she crou ched down and bowed to
She put some on her eyel ids and cheeks . It
to say hello . He crou ched down and carefully reached
. He blinked his long eyel ashes and kept hopping .
door . She saw Lily crou ching behind the clothes .
surrender meant , so Daddy crou ched down and said ,
theater and Larry felt his eyel ids become heavy . He
and saw him . She crou ched down and put her
cave , but the hero per se vered . He searched
get closer , so he crou ched down and watched .
. Every day she would crou ch down and demand to
to reach . The turtle per se vered though , padd
M ia 's mom crou ched down and said ,
. She explained how to crou ch down and support herself
" It 's one dollar per cup . But I have

INTERVAL 0.169 - 0.295
CONTAINS 0.034%

can 't play with paper anymore . It is not safe
. They were not bored anymore . They were scared .
she wasn 't dizz y anymore . She was so happy
full they couldn 't eat anymore cheese . It was a
how you make each other happy and grow as a family

INTERVAL 0.044 - 0.169
CONTAINS 14.260%

He scrub bed and polished until it was so shiny it
on his head . One day , Max saw a little
fun . One day , the pretty sheep went
fun dancing together ! <|endoftext|> Once upon a time , there
itself . The little girl was so excited !

INTERVAL -0.082 - 0.044
CONTAINS 80.484%

inside . Inside was a beautiful toy car with vibrant colors
he had done that day and show them the fish he
again , but she will always remember how it refused to
stick to fish the wheel out of the water and put
lesson and never went to the lion 's cave again .

INTERVAL -0.208 - -0.082
CONTAINS 5.222%

too high up . He jumped up , but he still
. When he got there , he saw the
as fast as he could and hid in a bush .
but the knife did not go in . He pushed harder
ran out of the yard and into the street . Suddenly
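
The four interval buckets above appear to split the range from MIN to MAX into equal-width bins, and their CONTAINS shares add up to 100%. The dashboard itself does not state how the buckets were chosen, so the equal-width rule in this minimal sketch is an assumption inferred from the printed edges. The helper name `equal_width_edges` is hypothetical.

```python
# Values copied from the dashboard above.
ACT_MIN, ACT_MAX = -0.208, 0.295
# CONTAINS percentages, ordered from the lowest interval to the highest.
BUCKET_SHARES = [5.222, 80.484, 14.260, 0.034]


def equal_width_edges(lo, hi, n_buckets):
    """Return n_buckets + 1 edges that split [lo, hi] into equal-width bins.

    Assumption: the dashboard uses equal-width bins; this is inferred
    from the printed interval bounds, not stated in the source.
    """
    width = (hi - lo) / n_buckets
    return [lo + i * width for i in range(n_buckets + 1)]


edges = equal_width_edges(ACT_MIN, ACT_MAX, 4)
total_share = sum(BUCKET_SHARES)

# Print each reconstructed interval next to its share of tokens.
for (left, right), share in zip(zip(edges, edges[1:]), BUCKET_SHARES):
    print(f"{left:+.3f} to {right:+.3f}: {share:.3f}%")
print(f"total: {total_share:.3f}%")
```

Under that assumption the inner edges land within 0.001 of the printed bounds (-0.082, 0.044, 0.169), and the shares show that roughly four in five tokens sit in the bucket around zero.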

TOKEN DETAILS (TOP ACTIVATIONS, IN ORDER)

Token not
Token ID407
Feature activation+8.95e-02

Pos logit contributions
 Majesty+0.084
 disl+0.084
 endings+0.071
idents+0.070
 fault+0.069
Neg logit contributions
 full-0.062
hes-0.061
 tucked-0.058
 called-0.057
 sn-0.055
Token want
Token ID765
Feature activation+9.91e-02

Pos logit contributions
 Majesty+0.178
 disl+0.175
idents+0.158
 endings+0.157
flies+0.150
Neg logit contributions
 full-0.112
hes-0.098
 tucked-0.095
 sn-0.095
 called-0.094
Token to
Token ID284
Feature activation+7.52e-02

Pos logit contributions
 Majesty+0.175
 disl+0.174
idents+0.146
 endings+0.143
cient+0.142
Neg logit contributions
 full-0.143
 sn-0.131
hes-0.130
 tucked-0.130
 called-0.130
Token be
Token ID307
Feature activation-3.75e-02

Pos logit contributions
 disl+0.151
 Majesty+0.150
 endings+0.134
 fault+0.131
 beings+0.131
Neg logit contributions
 full-0.087
 tucked-0.079
 sn-0.079
hes-0.079
 called-0.076
Token friends
Token ID2460
Feature activation+1.06e-01

Pos logit contributions
 full+0.043
hes+0.039
 called+0.038
 tucked+0.037
 sn+0.036
Neg logit contributions
 disl-0.078
 Majesty-0.077
 endings-0.072
idents-0.071
 fault-0.067
Token anymore
Token ID7471
Feature activation+2.28e-01

Pos logit contributions
 Majesty+0.257
 disl+0.250
idents+0.221
 endings+0.221
flies+0.218
Neg logit contributions
 full-0.112
 j-0.093
 called-0.092
 tucked-0.092
 sn-0.090
Token.
Token ID13
Feature activation+3.18e-02

Pos logit contributions
 Majesty+0.536
 disl+0.503
cient+0.481
idents+0.472
roc+0.470
Neg logit contributions
 full-0.220
 called-0.201
 j-0.182
 sn-0.180
 tucked-0.172
Token They
Token ID1119
Feature activation+1.89e-02

Pos logit contributions
 Majesty+0.057
 disl+0.057
 endings+0.051
 fault+0.051
 appearance+0.050
Neg logit contributions
 full-0.042
hes-0.039
 called-0.039
On-0.037
 tucked-0.037
Token turn
Token ID1210
Feature activation-5.46e-02

Pos logit contributions
 disl+0.035
 Majesty+0.034
 endings+0.030
 fault+0.030
idents+0.029
Neg logit contributions
 full-0.026
hes-0.024
 tucked-0.024
 sn-0.023
 called-0.023
Token away
Token ID1497
Feature activation+1.21e-01

Pos logit contributions
 full+0.066
hes+0.066
 tucked+0.063
 nick+0.059
 sn+0.059
Neg logit contributions
 disl-0.106
 Majesty-0.101
 endings-0.092
 fault-0.091
 appearance-0.088
Token and
Token ID290
Feature activation-5.69e-02

Pos logit contributions
 disl+0.248
 Majesty+0.242
 endings+0.217
 fault+0.212
 appearance+0.207
Neg logit contributions
 full-0.138
hes-0.135
 tucked-0.130
 sn-0.121
 nick-0.119
Token glad
Token ID9675
Feature activation+2.18e-01

Pos logit contributions
hes+0.014
 full+0.013
 tucked+0.013
 sn+0.011
 nick+0.011
Neg logit contributions
 disl-0.036
 Majesty-0.036
 fault-0.032
 endings-0.032
 appearance-0.031
Token you
Token ID345
Feature activation+1.05e-01

Pos logit contributions
 disl+0.437
 Majesty+0.434
itionally+0.379
 perspectives+0.378
 fault+0.375
Neg logit contributions
 full-0.271
 called-0.236
 sn-0.227
 j-0.224
 tucked-0.222
Token are
Token ID389
Feature activation+4.50e-02

Pos logit contributions
 disl+0.208
 Majesty+0.204
 fault+0.180
 endings+0.180
idents+0.177
Neg logit contributions
 full-0.142
 tucked-0.123
 called-0.121
 j-0.120
 sn-0.120
Token not
Token ID407
Feature activation+6.95e-02

Pos logit contributions
 disl+0.088
 Majesty+0.085
 endings+0.078
idents+0.077
 appearance+0.074
Neg logit contributions
 full-0.059
hes-0.055
 tucked-0.053
 sn-0.051
 called-0.051
Token rude
Token ID22066
Feature activation+1.37e-01

Pos logit contributions
 disl+0.138
 Majesty+0.136
 endings+0.123
idents+0.121
 purposes+0.116
Neg logit contributions
 full-0.087
hes-0.078
 tucked-0.076
 sn-0.074
 called-0.074
Token anymore
Token ID7471
Feature activation+2.25e-01

Pos logit contributions
 Majesty+0.326
 disl+0.312
cient+0.280
 majesty+0.277
idents+0.268
Neg logit contributions
 full-0.127
 called-0.107
 j-0.106
 ped-0.098
 bra-0.097
Token.
Token ID13
Feature activation+2.38e-02

Pos logit contributions
 Majesty+0.511
 disl+0.475
cient+0.451
idents+0.447
roc+0.435
Neg logit contributions
 full-0.228
 called-0.201
 sn-0.185
 j-0.185
hes-0.178
Token You
Token ID921
Feature activation+4.12e-02

Pos logit contributions
 Majesty+0.035
 disl+0.035
 endings+0.031
idents+0.030
 fault+0.030
Neg logit contributions
 full-0.039
hes-0.038
 tucked-0.036
 called-0.036
 sn-0.035
Token are
Token ID389
Feature activation-5.82e-03

Pos logit contributions
 Majesty+0.070
 disl+0.067
 endings+0.059
 appearance+0.057
 fault+0.056
Neg logit contributions
 full-0.066
hes-0.064
 tucked-0.061
 called-0.061
 sn-0.060
Token very
Token ID845
Feature activation+2.29e-02

Pos logit contributions
 full+0.007
hes+0.007
 tucked+0.006
 sn+0.006
 nick+0.006
Neg logit contributions
 disl-0.012
 Majesty-0.011
 endings-0.010
 fault-0.010
 appearance-0.010
Token wise
Token ID10787
Feature activation+9.92e-02

Pos logit contributions
 disl+0.048
 Majesty+0.047
 endings+0.044
 fault+0.043
 appearance+0.042
Neg logit contributions
 full-0.025
hes-0.024
 tucked-0.022
 sn-0.021
 called-0.021
Token play
Token ID711
Feature activation+5.12e-02

Pos logit contributions
 Majesty+0.178
 disl+0.175
idents+0.158
 endings+0.157
flies+0.150
Neg logit contributions
 full-0.112
hes-0.098
 tucked-0.095
 sn-0.095
 called-0.094
Token catch
Token ID4929
Feature activation+4.19e-02

Pos logit contributions
 Majesty+0.114
 disl+0.113
 perspectives+0.099
 preferences+0.098
 purposes+0.098
Neg logit contributions
 full-0.052
hes-0.048
 j-0.046
 sn-0.046
 called-0.045
Token near
Token ID1474
Feature activation+7.46e-02

Pos logit contributions
 Majesty+0.078
 disl+0.076
 fault+0.066
idents+0.064
 endings+0.064
Neg logit contributions
 full-0.056
hes-0.053
 sn-0.052
 called-0.052
 tucked-0.052
Token the
Token ID262
Feature activation+6.45e-02

Pos logit contributions
 disl+0.149
 Majesty+0.148
idents+0.130
 fault+0.128
 endings+0.128
Neg logit contributions
 full-0.088
hes-0.082
 called-0.082
 tucked-0.081
 sn-0.081
Token hut
Token ID40812
Feature activation+1.34e-01

Pos logit contributions
 disl+0.116
 Majesty+0.114
 endings+0.100
 fault+0.098
 appearance+0.097
Neg logit contributions
hes-0.088
 full-0.083
 tucked-0.082
 ped-0.078
On-0.077
Token anymore
Token ID7471
Feature activation+2.19e-01

Pos logit contributions
 disl+0.299
 Majesty+0.288
 endings+0.253
 preferences+0.251
rums+0.244
Neg logit contributions
 full-0.172
 tucked-0.147
 called-0.144
 sn-0.134
 closed-0.132
Token.
Token ID13
Feature activation+3.10e-02

Pos logit contributions
 Majesty+0.497
 disl+0.488
idents+0.446
 fault+0.444
roc+0.436
Neg logit contributions
 full-0.228
 called-0.217
 sn-0.192
 tucked-0.192
hes-0.183
Token They
Token ID1119
Feature activation+2.02e-02

Pos logit contributions
 disl+0.059
 Majesty+0.058
 endings+0.052
 fault+0.051
 appearance+0.051
Neg logit contributions
 full-0.040
hes-0.037
 called-0.036
 tucked-0.035
On-0.034
Token learn
Token ID2193
Feature activation+2.40e-02

Pos logit contributions
 disl+0.038
 Majesty+0.037
 endings+0.033
 fault+0.033
idents+0.031
Neg logit contributions
 full-0.027
 tucked-0.025
hes-0.025
 sn-0.024
 called-0.024
Token to
Token ID284
Feature activation+5.38e-02

Pos logit contributions
 disl+0.038
 Majesty+0.036
 endings+0.032
 fault+0.031
 appearance+0.030
Neg logit contributions
 full-0.038
hes-0.037
 tucked-0.036
 nick-0.035
 sn-0.035
Token respect
Token ID2461
Feature activation+8.52e-02

Pos logit contributions
 disl+0.092
 Majesty+0.091
 endings+0.081
 fault+0.079
idents+0.077
Neg logit contributions
 full-0.077
hes-0.074
 tucked-0.073
 sn-0.071
 called-0.070
Token want
Token ID765
Feature activation+8.51e-02

Pos logit contributions
 disl+0.114
 Majesty+0.112
 endings+0.099
 fault+0.097
idents+0.096
Neg logit contributions
 full-0.078
hes-0.073
 tucked-0.071
 sn-0.069
 called-0.068
Token to
Token ID284
Feature activation+7.01e-02

Pos logit contributions
 disl+0.139
 Majesty+0.139
 endings+0.116
idents+0.114
 fault+0.111
Neg logit contributions
 full-0.133
hes-0.126
 tucked-0.124
 sn-0.121
 called-0.120
Token play
Token ID711
Feature activation+4.04e-02

Pos logit contributions
 disl+0.138
 Majesty+0.136
 endings+0.122
 fault+0.119
idents+0.117
Neg logit contributions
 full-0.084
hes-0.079
 tucked-0.078
 sn-0.075
 called-0.073
Token with
Token ID351
Feature activation+2.74e-02

Pos logit contributions
 disl+0.083
 Majesty+0.081
 endings+0.072
 fault+0.070
 preferences+0.069
Neg logit contributions
 full-0.047
hes-0.046
 tucked-0.044
 sn-0.042
 called-0.041
Token blocks
Token ID7021
Feature activation+4.35e-03

Pos logit contributions
 disl+0.051
 Majesty+0.050
 endings+0.044
 fault+0.044
idents+0.043
Neg logit contributions
 full-0.035
hes-0.035
 called-0.034
 sn-0.033
 tucked-0.033
Token anymore
Token ID7471
Feature activation+2.19e-01

Pos logit contributions
 Majesty+0.008
 disl+0.008
 endings+0.007
idents+0.007
 perspectives+0.007
Neg logit contributions
 full-0.006
hes-0.006
 tucked-0.006
 called-0.005
 sn-0.005
Token.
Token ID13
Feature activation+2.54e-02

Pos logit contributions
 Majesty+0.522
 disl+0.500
cient+0.473
roc+0.466
idents+0.463
Neg logit contributions
 full-0.197
 called-0.193
 sn-0.168
 j-0.163
hes-0.158
Token They
Token ID1119
Feature activation+1.95e-02

Pos logit contributions
 Majesty+0.049
 disl+0.048
 endings+0.044
 fault+0.043
 appearance+0.043
Neg logit contributions
 full-0.031
hes-0.029
 called-0.029
On-0.027
 tucked-0.027
Token wanted
Token ID2227
Feature activation+5.47e-02

Pos logit contributions
 disl+0.037
 Majesty+0.036
 endings+0.032
 fault+0.031
idents+0.030
Neg logit contributions
 full-0.026
hes-0.025
 tucked-0.024
 sn-0.023
 called-0.023
Token to
Token ID284
Feature activation+5.77e-02

Pos logit contributions
 Majesty+0.096
 disl+0.094
 endings+0.080
idents+0.077
cient+0.075
Neg logit contributions
 full-0.083
hes-0.079
 tucked-0.078
 called-0.077
 sn-0.075
Token stay
Token ID2652
Feature activation-4.68e-02

Pos logit contributions
 Majesty+0.117
 disl+0.115
 endings+0.109
 beings+0.107
 fault+0.104
Neg logit contributions
 full-0.065
 sn-0.061
 called-0.059
 tucked-0.058
hes-0.058
Token he
Token ID339
Feature activation+5.94e-02

Pos logit contributions
 disl+0.114
 Majesty+0.111
 endings+0.099
 fault+0.096
 appearance+0.094
Neg logit contributions
 full-0.077
hes-0.076
 tucked-0.073
 sn-0.069
 nick-0.068
Token wouldn
Token ID3636
Feature activation+2.63e-03

Pos logit contributions
 disl+0.115
 Majesty+0.110
 endings+0.100
 fault+0.098
idents+0.096
Neg logit contributions
 full-0.075
hes-0.072
 tucked-0.071
 sn-0.068
 called-0.067
Token't
Token ID470
Feature activation+5.75e-02

Pos logit contributions
 disl+0.003
 Majesty+0.003
 endings+0.003
 fault+0.003
idents+0.002
Neg logit contributions
 full-0.005
hes-0.005
 tucked-0.005
 nick-0.004
aver-0.004
Token get
Token ID651
Feature activation-4.16e-02

Pos logit contributions
 disl+0.121
 Majesty+0.119
 endings+0.107
 fault+0.105
 beings+0.104
Neg logit contributions
 full-0.066
 tucked-0.061
 sn-0.060
 called-0.057
 closed-0.056
Token punished
Token ID16851
Feature activation+1.01e-01

Pos logit contributions
 full+0.052
hes+0.051
 tucked+0.049
 sn+0.046
 nick+0.046
Neg logit contributions
 disl-0.080
 Majesty-0.078
 endings-0.070
 fault-0.068
 appearance-0.066
Token anymore
Token ID7471
Feature activation+2.18e-01

Pos logit contributions
 disl+0.205
 Majesty+0.199
 preferences+0.171
flies+0.167
 endings+0.166
Neg logit contributions
 full-0.124
 tucked-0.113
 sn-0.113
 called-0.111
 j-0.109
Token.
Token ID13
Feature activation+2.60e-02

Pos logit contributions
 Majesty+0.523
 disl+0.508
cient+0.464
idents+0.453
roc+0.452
Neg logit contributions
 full-0.206
 called-0.189
 tucked-0.177
 sn-0.173
 ped-0.161
Token [space]
Token ID220
Feature activation+6.26e-02

Pos logit contributions
 disl+0.051
 Majesty+0.049
 endings+0.045
 appearance+0.044
 fault+0.044
Neg logit contributions
 full-0.032
hes-0.029
 called-0.028
 tucked-0.028
On-0.028
Token [newline]
Token ID198
Feature activation+2.32e-02

Pos logit contributions
 disl+0.128
 Majesty+0.124
 endings+0.112
 appearance+0.111
 beings+0.108
Neg logit contributions
hes-0.073
 full-0.072
 tucked-0.065
On-0.063
 called-0.062
Token [newline]
Token ID198
Feature activation+1.60e-02

Pos logit contributions
 disl+0.041
 Majesty+0.039
 endings+0.034
 fault+0.033
 purposes+0.032
Neg logit contributions
 full-0.033
hes-0.032
 tucked-0.032
 sn-0.030
 nick-0.029
TokenThe
Token ID464
Feature activation+1.51e-02

Pos logit contributions
 disl+0.034
 Majesty+0.032
 endings+0.029
 fault+0.027
 preferences+0.027
Neg logit contributions
 full-0.017
hes-0.017
 tucked-0.017
 sn-0.016
 nick-0.015
Token't
Token ID470
Feature activation+8.41e-02

Pos logit contributions
 disl+0.237
 Majesty+0.236
 endings+0.198
 appearance+0.193
 preferences+0.192
Neg logit contributions
 full-0.199
hes-0.195
 tucked-0.189
 sn-0.183
 ped-0.182
Token sign
Token ID1051
Feature activation+1.70e-03

Pos logit contributions
 disl+0.158
 Majesty+0.155
 endings+0.137
 fault+0.134
idents+0.131
Neg logit contributions
 full-0.110
hes-0.104
 tucked-0.102
 sn-0.098
 called-0.096
Token at
Token ID379
Feature activation+8.89e-02

Pos logit contributions
 Majesty+0.003
 disl+0.003
 endings+0.003
idents+0.002
 fault+0.002
Neg logit contributions
 full-0.002
hes-0.002
 tucked-0.002
 called-0.002
 sn-0.002
Token the
Token ID262
Feature activation+5.27e-02

Pos logit contributions
 disl+0.150
 Majesty+0.146
 endings+0.125
 fault+0.123
idents+0.120
Neg logit contributions
 full-0.134
hes-0.130
 tucked-0.127
 called-0.123
 sn-0.121
Token traffic
Token ID4979
Feature activation+1.30e-02

Pos logit contributions
 disl+0.097
 Majesty+0.094
 endings+0.084
 fault+0.081
idents+0.079
Neg logit contributions
 full-0.071
hes-0.068
 tucked-0.066
 sn-0.063
 called-0.062
Token anymore
Token ID7471
Feature activation+2.18e-01

Pos logit contributions
 Majesty+0.026
 disl+0.026
 endings+0.022
 preferences+0.021
 fault+0.021
Neg logit contributions
 full-0.017
 tucked-0.015
 called-0.015
hes-0.015
 sn-0.014
Token."
Token ID526
Feature activation+6.16e-02

Pos logit contributions
 Majesty+0.456
 disl+0.441
idents+0.401
cient+0.396
roc+0.396
Neg logit contributions
 full-0.251
 called-0.241
 tucked-0.217
 sn-0.217
 j-0.210
Token [space]
Token ID220
Feature activation+7.60e-02

Pos logit contributions
 disl+0.124
 Majesty+0.123
 endings+0.111
 fault+0.110
 appearance+0.108
Neg logit contributions
 full-0.072
hes-0.067
 called-0.066
 tucked-0.065
 sn-0.063
Token [newline]
Token ID198
Feature activation+3.11e-02

Pos logit contributions
 disl+0.160
 Majesty+0.160
 endings+0.144
 appearance+0.144
 beings+0.138
Neg logit contributions
hes-0.085
 full-0.081
 called-0.071
On-0.070
 tucked-0.070
Token [newline]
Token ID198
Feature activation+2.26e-02

Pos logit contributions
 disl+0.056
 Majesty+0.054
 endings+0.048
 fault+0.047
 appearance+0.045
Neg logit contributions
 full-0.043
hes-0.042
 tucked-0.041
 sn-0.038
 nick-0.038
TokenJess
Token ID34648
Feature activation+8.48e-02

Pos logit contributions
 disl+0.050
 Majesty+0.048
 endings+0.044
 fault+0.042
 preferences+0.041
Neg logit contributions
 full-0.022
hes-0.022
 tucked-0.021
 sn-0.019
 nick-0.019
Token't
Token ID470
Feature activation+5.67e-02

Pos logit contributions
 disl+0.040
 Majesty+0.036
 fault+0.032
 endings+0.032
 means+0.030
Neg logit contributions
 full-0.064
hes-0.064
 tucked-0.062
 nick-0.060
aver-0.059
Token even
Token ID772
Feature activation+2.74e-02

Pos logit contributions
 disl+0.099
 Majesty+0.096
 endings+0.085
 fault+0.082
 appearance+0.080
Neg logit contributions
 full-0.082
hes-0.080
 tucked-0.078
 sn-0.074
 nick-0.073
Token remember
Token ID3505
Feature activation+1.02e-01

Pos logit contributions
 Majesty+0.055
 disl+0.053
idents+0.049
 fault+0.049
flies+0.047
Neg logit contributions
 full-0.033
 bra-0.031
 sn-0.030
 j-0.029
 called-0.029
Token to
Token ID284
Feature activation+5.67e-02

Pos logit contributions
 Majesty+0.185
 disl+0.184
idents+0.163
 endings+0.161
 purposes+0.157
Neg logit contributions
 full-0.142
 called-0.131
 tucked-0.130
 sn-0.128
 j-0.122
Token frown
Token ID22002
Feature activation+4.04e-02

Pos logit contributions
 disl+0.106
 Majesty+0.104
 endings+0.095
 fault+0.093
idents+0.093
Neg logit contributions
 full-0.074
 tucked-0.068
 sn-0.068
hes-0.066
 called-0.066
Token anymore
Token ID7471
Feature activation+2.17e-01

Pos logit contributions
 disl+0.084
 Majesty+0.082
 endings+0.071
 fault+0.070
 majesty+0.068
Neg logit contributions
 full-0.048
 j-0.045
hes-0.044
 called-0.043
 sn-0.042
Token.
Token ID13
Feature activation+2.55e-02

Pos logit contributions
 Majesty+0.483
 disl+0.472
idents+0.426
 fault+0.424
cient+0.424
Neg logit contributions
 full-0.219
 called-0.212
 sn-0.195
 tucked-0.191
 j-0.185
Token The
Token ID383
Feature activation+2.34e-02

Pos logit contributions
 disl+0.051
 Majesty+0.050
 endings+0.046
 beings+0.046
 appearance+0.046
Neg logit contributions
 full-0.029
 called-0.026
hes-0.026
On-0.025
 tucked-0.025
Token end
Token ID886
Feature activation+2.35e-02

Pos logit contributions
 Majesty+0.039
 disl+0.039
 endings+0.034
 appearance+0.032
 fault+0.032
Neg logit contributions
hes-0.036
 full-0.033
 tucked-0.033
 ped-0.031
On-0.031
Token.
Token ID13
Feature activation+1.71e-02

Pos logit contributions
 Majesty+0.048
 disl+0.047
 endings+0.041
 fault+0.041
 appearance+0.040
Neg logit contributions
 full-0.029
hes-0.027
 tucked-0.027
 called-0.026
 sn-0.025
Token<|endoftext|>
Token ID50256
Feature activation+1.31e-01

Pos logit contributions
 disl+0.033
 Majesty+0.033
 endings+0.029
 fault+0.028
 appearance+0.028
Neg logit contributions
 full-0.021
hes-0.020
 tucked-0.020
 called-0.019
 nick-0.018
Token that
Token ID326
Feature activation+1.23e-01

Pos logit contributions
 disl+0.147
 Majesty+0.145
 endings+0.132
 fault+0.128
 appearance+0.127
Neg logit contributions
 full-0.091
hes-0.085
 tucked-0.083
 sn-0.078
 called-0.077
Token Sammy
Token ID44067
Feature activation+6.81e-02

Pos logit contributions
 disl+0.233
 Majesty+0.226
 endings+0.200
idents+0.198
 fault+0.198
Neg logit contributions
 full-0.170
 tucked-0.159
 sn-0.153
hes-0.153
 called-0.148
Token wouldn
Token ID3636
Feature activation+1.18e-02

Pos logit contributions
 disl+0.129
 Majesty+0.125
 endings+0.113
 fault+0.108
 appearance+0.107
Neg logit contributions
 full-0.091
 tucked-0.089
hes-0.086
 sn-0.085
 called-0.083
Token't
Token ID470
Feature activation+6.35e-02

Pos logit contributions
 disl+0.015
 Majesty+0.014
 endings+0.012
 fault+0.012
idents+0.011
Neg logit contributions
hes-0.022
 full-0.022
 tucked-0.021
 nick-0.020
aver-0.020
Token complain
Token ID13121
Feature activation+4.51e-02

Pos logit contributions
 disl+0.129
 Majesty+0.128
 endings+0.114
 fault+0.111
 beings+0.110
Neg logit contributions
 full-0.076
 tucked-0.071
 sn-0.069
hes-0.066
 called-0.066
Token anymore
Token ID7471
Feature activation+2.17e-01

Pos logit contributions
 Majesty+0.088
 disl+0.085
 majesty+0.072
cient+0.071
flies+0.070
Neg logit contributions
 full-0.061
 called-0.054
 tucked-0.054
 j-0.054
 sn-0.054
Token.
Token ID13
Feature activation+3.08e-02

Pos logit contributions
 Majesty+0.489
 disl+0.464
cient+0.425
 endings+0.416
idents+0.415
Neg logit contributions
 full-0.225
 called-0.209
 sn-0.204
 tucked-0.203
 j-0.191
Token From
Token ID3574
Feature activation+4.84e-02

Pos logit contributions
 disl+0.057
 Majesty+0.056
 endings+0.050
 fault+0.049
 appearance+0.048
Neg logit contributions
 full-0.040
hes-0.039
 tucked-0.037
 called-0.036
 sn-0.036
Token that
Token ID326
Feature activation+1.11e-01

Pos logit contributions
 disl+0.085
 Majesty+0.082
 endings+0.072
 fault+0.070
 appearance+0.068
Neg logit contributions
 full-0.069
hes-0.068
 tucked-0.066
 sn-0.063
 nick-0.062
Token day
Token ID1110
Feature activation+1.04e-01

Pos logit contributions
 disl+0.207
 Majesty+0.200
 endings+0.176
 fault+0.169
idents+0.168
Neg logit contributions
 full-0.155
hes-0.145
 tucked-0.145
 sn-0.138
 nick-0.137
Token on
Token ID319
Feature activation+3.15e-02

Pos logit contributions
 Majesty+0.174
 disl+0.171
 endings+0.167
 means+0.166
 relationships+0.165
Neg logit contributions
 tucked-0.116
hes-0.115
 full-0.104
 draped-0.101
 parked-0.100
Token not
Token ID407
Feature activation+8.95e-02

Pos logit contributions
hes+0.027
 full+0.025
 tucked+0.024
 nick+0.024
 bra+0.024
Neg logit contributions
 disl-0.031
 Majesty-0.031
idents-0.027
 endings-0.027
 fault-0.026
Token play
Token ID711
Feature activation+5.12e-02

Pos logit contributions
 Majesty+0.178
 disl+0.175
idents+0.158
 endings+0.157
flies+0.150
Neg logit contributions
 full-0.112
hes-0.098
 tucked-0.095
 sn-0.095
 called-0.094
Token with
Token ID351
Feature activation+3.69e-02

Pos logit contributions
 Majesty+0.114
 disl+0.113
 perspectives+0.099
 preferences+0.098
 purposes+0.098
Neg logit contributions
 full-0.052
hes-0.048
 j-0.046
 sn-0.046
 called-0.045
Token the
Token ID262
Feature activation+4.44e-02

Pos logit contributions
 disl+0.071
 Majesty+0.070
idents+0.062
 endings+0.060
 perspectives+0.060
Neg logit contributions
 full-0.045
hes-0.043
 called-0.042
 sn-0.041
 tucked-0.041
Token car
Token ID1097
Feature activation+1.03e-02

Pos logit contributions
 disl+0.080
 Majesty+0.077
 endings+0.068
 fault+0.068
 appearance+0.065
Neg logit contributions
hes-0.061
 full-0.060
 tucked-0.058
 nick-0.054
 sn-0.054
Token anymore
Token ID7471
Feature activation+2.16e-01

Pos logit contributions
 disl+0.019
 Majesty+0.019
 endings+0.016
 preferences+0.016
 perspectives+0.016
Neg logit contributions
 full-0.014
 tucked-0.014
hes-0.013
 called-0.013
 sn-0.013
Token.
Token ID13
Feature activation+2.72e-02

Pos logit contributions
 Majesty+0.486
 disl+0.476
idents+0.434
roc+0.430
cient+0.427
Neg logit contributions
 full-0.221
 called-0.210
 tucked-0.191
 sn-0.191
hes-0.186
Token [newline]
Token ID198
Feature activation+1.85e-02

Pos logit contributions
 disl+0.049
 Majesty+0.048
 endings+0.043
 fault+0.042
 appearance+0.041
Neg logit contributions
 full-0.037
hes-0.036
 tucked-0.034
 called-0.033
 sn-0.032
Token [newline]
Token ID198
Feature activation+1.14e-02

Pos logit contributions
 disl+0.037
 Majesty+0.034
 endings+0.031
 fault+0.030
 purposes+0.029
Neg logit contributions
 tucked-0.021
 full-0.021
hes-0.021
 sn-0.020
 bra-0.020
TokenS
Token ID50
Feature activation+1.26e-02

Pos logit contributions
 disl+0.023
 Majesty+0.021
 endings+0.020
 property+0.018
 purposes+0.018
Neg logit contributions
 tucked-0.013
 full-0.013
hes-0.013
 sn-0.012
 bra-0.012
Tokenara
Token ID3301
Feature activation-2.36e-02

Pos logit contributions
 disl+0.021
 Majesty+0.021
 endings+0.018
 fault+0.018
 beings+0.017
Neg logit contributions
 full-0.019
 tucked-0.018
hes-0.018
 sn-0.017
 called-0.017
Token't
Token ID470
Feature activation+5.23e-02

Pos logit contributions
 tucked+0.018
 full+0.017
hes+0.017
 bisc+0.017
 nick+0.017
Neg logit contributions
 disl-0.007
 Majesty-0.005
 fault-0.005
 anymore-0.004
idents-0.004
Token anxious
Token ID18116
Feature activation+1.32e-01

Pos logit contributions
 disl+0.111
 Majesty+0.108
 endings+0.097
 fault+0.095
 appearance+0.093
Neg logit contributions
 full-0.056
hes-0.055
 tucked-0.053
 sn-0.049
 nick-0.048
Token about
Token ID546
Feature activation+8.65e-02

Pos logit contributions
 Majesty+0.212
 disl+0.201
cient+0.164
 majesty+0.162
flies+0.153
Neg logit contributions
 full-0.227
 j-0.209
 called-0.209
 sn-0.206
 tucked-0.204
Token making
Token ID1642
Feature activation+2.51e-02

Pos logit contributions
 Majesty+0.168
 disl+0.162
 endings+0.147
idents+0.146
 purposes+0.142
Neg logit contributions
 full-0.103
 sn-0.097
 j-0.094
 called-0.093
 tucked-0.092
Token requests
Token ID7007
Feature activation+7.40e-02

Pos logit contributions
 disl+0.051
 Majesty+0.050
 endings+0.045
 fault+0.043
 appearance+0.043
Neg logit contributions
 full-0.028
hes-0.027
 tucked-0.026
 sn-0.026
 called-0.025
Token anymore
Token ID7471
Feature activation+2.15e-01

Pos logit contributions
 disl+0.143
 Majesty+0.140
 endings+0.123
 fault+0.120
idents+0.117
Neg logit contributions
 full-0.094
hes-0.091
 tucked-0.088
 sn-0.083
 called-0.082
Token.
Token ID13
Feature activation+2.83e-02

Pos logit contributions
 Majesty+0.480
 disl+0.459
 fault+0.413
idents+0.412
cient+0.409
Neg logit contributions
 full-0.218
 called-0.216
 sn-0.200
 tucked-0.198
hes-0.192
Token She
Token ID1375
Feature activation+4.70e-02

Pos logit contributions
 disl+0.055
 Majesty+0.054
 endings+0.049
 beings+0.048
 fault+0.048
Neg logit contributions
 full-0.034
hes-0.032
 tucked-0.031
 called-0.030
 sn-0.030
Token knew
Token ID2993
Feature activation+5.73e-02

Pos logit contributions
 disl+0.090
 Majesty+0.087
 endings+0.079
 fault+0.077
idents+0.075
Neg logit contributions
 full-0.060
hes-0.058
 tucked-0.058
 sn-0.055
 called-0.054
Token that
Token ID326
Feature activation+1.13e-01

Pos logit contributions
 disl+0.099
 Majesty+0.098
 endings+0.085
 purposes+0.083
idents+0.083
Neg logit contributions
 full-0.085
 tucked-0.081
hes-0.080
 called-0.079
 sn-0.077
Token if
Token ID611
Feature activation+1.00e-01

Pos logit contributions
 disl+0.198
 Majesty+0.197
 endings+0.168
 fault+0.164
idents+0.162
Neg logit contributions
 full-0.160
hes-0.153
 tucked-0.152
 sn-0.146
 called-0.143
Token they
Token ID484
Feature activation+6.12e-02

Pos logit contributions
 disl+0.095
 Majesty+0.093
 endings+0.081
 fault+0.080
idents+0.078
Neg logit contributions
 full-0.074
hes-0.071
 tucked-0.070
 sn-0.066
 called-0.065
Token couldn
Token ID3521
Feature activation+5.48e-02

Pos logit contributions
 Majesty+0.136
 disl+0.131
 endings+0.117
 fault+0.115
 appearance+0.113
Neg logit contributions
 full-0.072
hes-0.063
 tucked-0.062
 called-0.061
 sn-0.061
Token't
Token ID470
Feature activation+4.85e-02

Pos logit contributions
 Majesty+0.114
 disl+0.109
 appearance+0.096
 beings+0.096
 preferences+0.095
Neg logit contributions
hes-0.067
 full-0.066
 sn-0.060
 tucked-0.060
 nick-0.058
Token play
Token ID711
Feature activation+2.04e-02

Pos logit contributions
 disl+0.091
 Majesty+0.087
 endings+0.078
 fault+0.075
 appearance+0.074
Neg logit contributions
hes-0.063
 full-0.062
 tucked-0.059
 nick-0.056
 sn-0.055
Token outside
Token ID2354
Feature activation+7.16e-02

Pos logit contributions
 Majesty+0.046
 disl+0.045
 perspectives+0.039
 preferences+0.039
 purposes+0.039
Neg logit contributions
 full-0.021
hes-0.020
 sn-0.019
 j-0.018
 called-0.018
Token anymore
Token ID7471
Feature activation+2.15e-01

Pos logit contributions
 disl+0.157
 Majesty+0.155
 endings+0.131
 preferences+0.131
idents+0.129
Neg logit contributions
 full-0.088
 tucked-0.076
 called-0.075
 sn-0.074
hes-0.072
Token.
Token ID13
Feature activation+3.54e-02

Pos logit contributions
 Majesty+0.503
 disl+0.481
cient+0.431
idents+0.429
roc+0.428
Neg logit contributions
 full-0.210
 called-0.198
 sn-0.184
 tucked-0.181
hes-0.179
Token But
Token ID887
Feature activation+4.89e-02

Pos logit contributions
 Majesty+0.071
 disl+0.070
 endings+0.063
 fault+0.062
 appearance+0.062
Neg logit contributions
 full-0.041
hes-0.038
 called-0.038
On-0.036
 tucked-0.036
Token Tim
Token ID5045
Feature activation+5.19e-02

Pos logit contributions
 disl+0.104
 Majesty+0.101
 fault+0.090
 endings+0.090
 preferences+0.090
Neg logit contributions
 full-0.059
 called-0.054
 tucked-0.052
 sn-0.052
 closed-0.049
Tokenmy
Token ID1820
Feature activation+1.65e-02

Pos logit contributions
 disl+0.105
 Majesty+0.100
 endings+0.092
 beings+0.087
 ourselves+0.087
Neg logit contributions
 full-0.066
 tucked-0.064
 called-0.063
 sn-0.063
 closed-0.061
Token had
Token ID550
Feature activation+1.54e-02

Pos logit contributions
 disl+0.032
 Majesty+0.030
 endings+0.028
 ourselves+0.027
 beings+0.026
Neg logit contributions
 full-0.023
 tucked-0.022
 called-0.022
 sn-0.022
 closed-0.021
Token to
Token ID284
Feature activation+9.56e-02

Pos logit contributions
 disl+0.138
 Majesty+0.136
 endings+0.117
idents+0.114
 fault+0.113
Neg logit contributions
 full-0.102
 tucked-0.089
 sn-0.088
hes-0.088
 j-0.086
Token lend
Token ID22096
Feature activation+5.79e-02

Pos logit contributions
 disl+0.192
 Majesty+0.191
 endings+0.175
 fault+0.172
idents+0.170
Neg logit contributions
 full-0.110
 tucked-0.102
hes-0.102
 sn-0.101
 called-0.099
Token her
Token ID607
Feature activation+7.06e-02

Pos logit contributions
 Majesty+0.102
 disl+0.102
 endings+0.088
 fault+0.085
idents+0.083
Neg logit contributions
 full-0.081
hes-0.079
 tucked-0.077
 called-0.076
 sn-0.074
Token the
Token ID262
Feature activation+5.30e-02

Pos logit contributions
 disl+0.142
 Majesty+0.142
 endings+0.129
idents+0.122
 appearance+0.121
Neg logit contributions
 full-0.081
 tucked-0.076
hes-0.076
 sn-0.072
 called-0.072
Token meal
Token ID9799
Feature activation+7.03e-02

Pos logit contributions
 disl+0.100
 Majesty+0.097
 endings+0.087
 fault+0.084
 appearance+0.082
Neg logit contributions
 full-0.068
hes-0.066
 tucked-0.064
 sn-0.060
 nick-0.060
Token anymore
Token ID7471
Feature activation+2.15e-01

Pos logit contributions
 disl+0.130
 Majesty+0.126
 endings+0.112
 fault+0.108
 appearance+0.106
Neg logit contributions
 full-0.095
hes-0.092
 tucked-0.089
 sn-0.084
 called-0.083
Token.
Token ID13
Feature activation+3.46e-02

Pos logit contributions
 Majesty+0.451
 disl+0.443
 fault+0.393
idents+0.390
 endings+0.390
Neg logit contributions
 full-0.257
 called-0.235
 tucked-0.234
 sn-0.223
hes-0.220
Token They
Token ID1119
Feature activation+1.81e-02

Pos logit contributions
 disl+0.069
 Majesty+0.067
 endings+0.061
 fault+0.060
 appearance+0.060
Neg logit contributions
 full-0.042
hes-0.039
 tucked-0.037
 called-0.037
On-0.036
Token hugged
Token ID41130
Feature activation-1.51e-02

Pos logit contributions
 disl+0.037
 Majesty+0.036
 endings+0.032
 fault+0.032
 appearance+0.031
Neg logit contributions
 full-0.021
hes-0.020
 tucked-0.020
 sn-0.018
 called-0.018
Token and
Token ID290
Feature activation-5.17e-02

Pos logit contributions
 full+0.019
 tucked+0.018
 sn+0.018
hes+0.018
 called+0.017
Neg logit contributions
 disl-0.030
 Majesty-0.029
 endings-0.025
 fault-0.025
 purposes-0.024
Token smiled
Token ID13541
Feature activation-4.93e-03

Pos logit contributions
hes+0.064
 full+0.059
On+0.058
 nick+0.056
aver+0.055
Neg logit contributions
 disl-0.098
 Majesty-0.090
 endings-0.082
 preferences-0.079
idents-0.078
Token they
Token ID484
Feature activation+7.26e-02

Pos logit contributions
hes+0.008
 full+0.007
 nick+0.007
 tucked+0.007
On+0.007
Neg logit contributions
 disl-0.009
 Majesty-0.009
 fault-0.008
 appearance-0.008
 endings-0.008
Token couldn
Token ID3521
Feature activation+6.92e-02

Pos logit contributions
 disl+0.150
 Majesty+0.145
 endings+0.130
 fault+0.128
idents+0.125
Neg logit contributions
 full-0.086
hes-0.081
 tucked-0.080
 sn-0.075
 called-0.075
Token't
Token ID470
Feature activation+5.77e-02

Pos logit contributions
 Majesty+0.141
 disl+0.138
 appearance+0.121
 preferences+0.120
 beings+0.120
Neg logit contributions
 full-0.086
hes-0.081
 sn-0.080
 tucked-0.080
 called-0.077
Token play
Token ID711
Feature activation+2.56e-02

Pos logit contributions
 disl+0.105
 Majesty+0.102
 endings+0.090
 fault+0.088
 appearance+0.085
Neg logit contributions
 full-0.078
hes-0.077
 tucked-0.074
 sn-0.070
 nick-0.070
Token together
Token ID1978
Feature activation+1.38e-01

Pos logit contributions
 disl+0.059
 Majesty+0.059
 preferences+0.052
 perspectives+0.051
 purposes+0.051
Neg logit contributions
 full-0.025
hes-0.024
 j-0.023
 sn-0.023
 called-0.022
Token anymore
Token ID7471
Feature activation+2.15e-01

Pos logit contributions
 disl+0.274
 Majesty+0.262
 endings+0.229
idents+0.227
 fault+0.222
Neg logit contributions
 full-0.182
 tucked-0.166
hes-0.164
 called-0.160
 sn-0.160
Token.
Token ID13
Feature activation+3.32e-02

Pos logit contributions
 Majesty+0.506
 disl+0.490
roc+0.440
cient+0.439
 fault+0.435
Neg logit contributions
 full-0.202
 called-0.199
 sn-0.173
 tucked-0.169
 j-0.162
Token The
Token ID383
Feature activation+2.38e-02

Pos logit contributions
 disl+0.066
 Majesty+0.065
 endings+0.058
 appearance+0.057
 fault+0.057
Neg logit contributions
 full-0.040
hes-0.037
 called-0.036
On-0.035
 tucked-0.035
Token bald
Token ID28273
Feature activation+1.26e-02

Pos logit contributions
 Majesty+0.040
 disl+0.039
 endings+0.034
 fault+0.033
 appearance+0.032
Neg logit contributions
hes-0.036
 full-0.034
 tucked-0.034
 ped-0.032
On-0.031
Token man
Token ID582
Feature activation+1.52e-02

Pos logit contributions
 disl+0.023
 Majesty+0.022
 endings+0.020
 fault+0.019
 appearance+0.019
Neg logit contributions
 full-0.017
hes-0.017
 tucked-0.016
 sn-0.016
 called-0.015
Token said
Token ID531
Feature activation+1.77e-02

Pos logit contributions
 disl+0.030
 Majesty+0.029
 endings+0.027
 fault+0.026
 appearance+0.025
Neg logit contributions
 full-0.018
hes-0.018
 tucked-0.018
 sn-0.016
 called-0.016
Token't
Token ID470
Feature activation+5.76e-02

Pos logit contributions
 disl+0.110
 Majesty+0.109
 endings+0.096
 fault+0.093
 appearance+0.091
Neg logit contributions
 full-0.096
hes-0.094
 tucked-0.091
 sn-0.090
 ped-0.089
Token let
Token ID1309
Feature activation+4.49e-02

Pos logit contributions
 disl+0.116
 Majesty+0.116
 endings+0.101
 fault+0.101
idents+0.099
Neg logit contributions
 full-0.070
 tucked-0.064
 sn-0.063
 called-0.061
hes-0.060
Token you
Token ID345
Feature activation+5.80e-02

Pos logit contributions
 disl+0.073
 Majesty+0.067
 preferences+0.060
 perspectives+0.058
 beings+0.057
Neg logit contributions
 sn-0.072
 full-0.071
 j-0.068
 nick-0.068
 called-0.067
Token play
Token ID711
Feature activation+2.52e-02

Pos logit contributions
 disl+0.116
 Majesty+0.113
 endings+0.102
 fault+0.100
 appearance+0.098
Neg logit contributions
hes-0.067
 full-0.065
 tucked-0.062
 nick-0.057
On-0.056
Token here
Token ID994
Feature activation+1.52e-01

Pos logit contributions
 disl+0.054
 Majesty+0.053
 fault+0.047
 endings+0.047
 purposes+0.046
Neg logit contributions
 full-0.027
hes-0.026
 tucked-0.025
 sn-0.024
 called-0.024
Token anymore
Token ID7471
Feature activation+2.15e-01

Pos logit contributions
 disl+0.279
 Majesty+0.262
 endings+0.226
 fault+0.219
idents+0.219
Neg logit contributions
 full-0.232
 tucked-0.214
 sn-0.206
 called-0.205
hes-0.202
Token!"
Token ID2474
Feature activation+6.06e-02

Pos logit contributions
 disl+0.431
 Majesty+0.423
 fault+0.380
idents+0.378
roc+0.374
Neg logit contributions
 called-0.258
 full-0.257
 sn-0.243
 j-0.234
 tucked-0.227
Token\n
Token ID198
Feature activation+2.09e-02

Pos logit contributions
 disl+0.118
 Majesty+0.118
 endings+0.105
 fault+0.104
 appearance+0.102
Neg logit contributions
 full-0.075
hes-0.071
 tucked-0.069
 called-0.068
 sn-0.066
Token\n
Token ID198
Feature activation+1.40e-02

Pos logit contributions
 disl+0.040
 Majesty+0.038
 endings+0.033
 fault+0.032
 preferences+0.031
Neg logit contributions
 full-0.027
hes-0.027
 tucked-0.026
 sn-0.024
 nick-0.024
TokenTim
Token ID14967
Feature activation+2.75e-02

Pos logit contributions
 disl+0.030
 Majesty+0.028
 endings+0.025
 fault+0.024
 preferences+0.024
Neg logit contributions
 full-0.015
hes-0.015
 tucked-0.015
 sn-0.013
 nick-0.013
Token didn
Token ID1422
Feature activation+3.90e-03

Pos logit contributions
 disl+0.055
 Majesty+0.053
 endings+0.048
 fault+0.046
 appearance+0.045
Neg logit contributions
 full-0.034
 tucked-0.033
hes-0.032
 sn-0.032
 called-0.031
Token to
Token ID284
Feature activation+7.37e-02

Pos logit contributions
 Majesty+0.172
 disl+0.171
idents+0.141
 endings+0.141
cient+0.137
Neg logit contributions
 full-0.144
 sn-0.132
 tucked-0.131
 called-0.131
hes-0.130
Token play
Token ID711
Feature activation+4.63e-02

Pos logit contributions
 disl+0.149
 Majesty+0.148
 endings+0.134
 fault+0.130
 beings+0.130
Neg logit contributions
 full-0.083
 tucked-0.076
 sn-0.076
hes-0.076
 called-0.074
Token with
Token ID351
Feature activation+3.00e-02

Pos logit contributions
 disl+0.098
 Majesty+0.096
 endings+0.085
 fault+0.084
 appearance+0.083
Neg logit contributions
 full-0.051
hes-0.049
 tucked-0.047
 sn-0.045
 called-0.045
Token his
Token ID465
Feature activation+4.51e-02

Pos logit contributions
 disl+0.056
 Majesty+0.056
 endings+0.049
 fault+0.048
idents+0.048
Neg logit contributions
 full-0.038
 called-0.037
hes-0.037
 tucked-0.036
 sn-0.036
Token friends
Token ID2460
Feature activation+7.95e-02

Pos logit contributions
 disl+0.082
 Majesty+0.078
 endings+0.071
 purposes+0.067
idents+0.066
Neg logit contributions
 full-0.063
 tucked-0.058
 called-0.057
 sn-0.057
hes-0.056
Token anymore
Token ID7471
Feature activation+2.15e-01

Pos logit contributions
 disl+0.173
 Majesty+0.168
 endings+0.151
idents+0.143
 preferences+0.142
Neg logit contributions
 full-0.095
 tucked-0.088
 called-0.086
 sn-0.081
hes-0.080
Token.
Token ID13
Feature activation+3.02e-02

Pos logit contributions
 Majesty+0.501
 disl+0.485
cient+0.444
roc+0.442
idents+0.440
Neg logit contributions
 full-0.207
 called-0.199
 sn-0.175
 tucked-0.174
 j-0.171
Token He
Token ID679
Feature activation+4.24e-02

Pos logit contributions
 disl+0.060
 Majesty+0.059
 endings+0.053
 fault+0.053
 appearance+0.052
Neg logit contributions
 full-0.036
 called-0.033
hes-0.032
On-0.032
 tucked-0.031
Token missed
Token ID6825
Feature activation+7.32e-02

Pos logit contributions
 disl+0.077
 Majesty+0.074
 endings+0.067
 fault+0.065
 purposes+0.064
Neg logit contributions
 full-0.058
 tucked-0.057
 sn-0.056
 called-0.056
hes-0.055
Token his
Token ID465
Feature activation+3.10e-02

Pos logit contributions
 disl+0.139
 Majesty+0.136
 endings+0.120
 fault+0.117
idents+0.115
Neg logit contributions
 full-0.094
hes-0.092
 tucked-0.089
 sn-0.084
 called-0.083
Token favorite
Token ID4004
Feature activation+9.53e-02

Pos logit contributions
 disl+0.049
 Majesty+0.047
 endings+0.041
 fault+0.039
idents+0.038
Neg logit contributions
 full-0.050
hes-0.048
 tucked-0.047
 sn-0.045
 nick-0.045
Token that
Token ID326
Feature activation+1.48e-01

Pos logit contributions
hes+0.088
 full+0.087
 tucked+0.083
 nick+0.079
 sn+0.078
Neg logit contributions
 disl-0.146
 Majesty-0.143
 endings-0.129
 fault-0.126
 appearance-0.123
Token he
Token ID339
Feature activation+7.66e-02

Pos logit contributions
 Majesty+0.272
 disl+0.270
 endings+0.231
idents+0.230
 fault+0.228
Neg logit contributions
 full-0.210
hes-0.195
 tucked-0.194
 called-0.187
 sn-0.187
Token couldn
Token ID3521
Feature activation+6.52e-02

Pos logit contributions
 disl+0.144
 Majesty+0.138
 endings+0.123
 fault+0.121
idents+0.118
Neg logit contributions
 full-0.103
hes-0.099
 tucked-0.098
 sn-0.094
 called-0.093
Token't
Token ID470
Feature activation+5.97e-02

Pos logit contributions
 disl+0.118
 Majesty+0.118
 endings+0.100
 appearance+0.099
 beings+0.098
Neg logit contributions
 full-0.093
hes-0.089
 tucked-0.088
 sn-0.085
 called-0.082
Token swim
Token ID9422
Feature activation-3.73e-02

Pos logit contributions
 disl+0.101
 Majesty+0.097
 endings+0.086
 fault+0.083
 appearance+0.081
Neg logit contributions
 full-0.088
hes-0.088
 tucked-0.084
 nick-0.079
 sn-0.079
Token anymore
Token ID7471
Feature activation+2.14e-01

Pos logit contributions
 full+0.050
hes+0.044
 tucked+0.044
 called+0.043
 sn+0.042
Neg logit contributions
 disl-0.075
 Majesty-0.073
 endings-0.063
 preferences-0.061
 fault-0.061
Token.
Token ID13
Feature activation+3.39e-02

Pos logit contributions
 disl+0.469
 Majesty+0.468
idents+0.415
 fault+0.406
 preferences+0.403
Neg logit contributions
 full-0.229
 called-0.212
 sn-0.205
 tucked-0.202
 j-0.194
Token He
Token ID679
Feature activation+4.64e-02

Pos logit contributions
 disl+0.067
 Majesty+0.066
 endings+0.060
 fault+0.059
 beings+0.059
Neg logit contributions
 full-0.041
 called-0.036
hes-0.036
 tucked-0.035
On-0.035
Token became
Token ID2627
Feature activation-2.84e-02

Pos logit contributions
 disl+0.089
 Majesty+0.086
 endings+0.077
 fault+0.075
 appearance+0.073
Neg logit contributions
 full-0.058
hes-0.057
 tucked-0.055
 sn-0.052
 nick-0.051
Token very
Token ID845
Feature activation+2.67e-02

Pos logit contributions
hes+0.029
 tucked+0.029
 full+0.028
 nick+0.026
On+0.025
Neg logit contributions
 disl-0.059
 Majesty-0.058
 endings-0.051
 fault-0.051
 relationships-0.050
Token worried
Token ID7960
Feature activation+1.25e-01

Pos logit contributions
 disl+0.057
 Majesty+0.056
 endings+0.050
 fault+0.049
 appearance+0.048
Neg logit contributions
 full-0.028
hes-0.027
 tucked-0.026
 sn-0.024
 nick-0.024
Token to
Token ID284
Feature activation+8.76e-02

Pos logit contributions
 Majesty+0.227
 disl+0.222
idents+0.185
 endings+0.181
cient+0.180
Neg logit contributions
 full-0.183
 called-0.171
 sn-0.170
 tucked-0.167
hes-0.165
Token be
Token ID307
Feature activation-2.99e-02

Pos logit contributions
 disl+0.178
 Majesty+0.176
 endings+0.160
idents+0.156
 beings+0.156
Neg logit contributions
 full-0.098
hes-0.090
 tucked-0.090
 sn-0.089
 called-0.087
Token friends
Token ID2460
Feature activation+1.14e-01

Pos logit contributions
 full+0.036
hes+0.034
 called+0.032
 tucked+0.032
 sn+0.032
Neg logit contributions
 disl-0.059
 Majesty-0.058
 endings-0.055
idents-0.054
 fault-0.052
Token with
Token ID351
Feature activation+2.71e-02

Pos logit contributions
 Majesty+0.266
 disl+0.264
 endings+0.237
idents+0.230
 purposes+0.225
Neg logit contributions
 full-0.123
 called-0.111
 j-0.104
 tucked-0.104
 sn-0.100
Token Lily
Token ID20037
Feature activation+8.21e-02

Pos logit contributions
 Majesty+0.048
 disl+0.048
 endings+0.041
idents+0.041
 fault+0.041
Neg logit contributions
 full-0.038
hes-0.037
 called-0.036
 tucked-0.036
 sn-0.035
Token anymore
Token ID7471
Feature activation+2.14e-01

Pos logit contributions
 disl+0.208
 Majesty+0.193
 endings+0.176
 majesty+0.163
 perspectives+0.163
Neg logit contributions
 called-0.077
 full-0.076
 tucked-0.068
 sn-0.067
 j-0.067
Token.
Token ID13
Feature activation+2.65e-02

Pos logit contributions
 Majesty+0.481
 disl+0.467
idents+0.429
roc+0.428
cient+0.428
Neg logit contributions
 full-0.212
 called-0.205
 sn-0.181
 j-0.180
 tucked-0.178
Token He
Token ID679
Feature activation+4.19e-02

Pos logit contributions
 disl+0.050
 Majesty+0.049
 endings+0.045
 fault+0.044
 appearance+0.043
Neg logit contributions
 full-0.034
hes-0.031
 called-0.031
 tucked-0.030
On-0.029
Token takes
Token ID2753
Feature activation-5.86e-02

Pos logit contributions
 disl+0.062
 Majesty+0.059
 endings+0.053
 fault+0.049
 purposes+0.049
Neg logit contributions
 full-0.072
 tucked-0.070
hes-0.069
 sn-0.069
 called-0.068
Token all
Token ID477
Feature activation+3.43e-02

Pos logit contributions
 full+0.066
hes+0.065
 tucked+0.063
 sn+0.058
 nick+0.058
Neg logit contributions
 disl-0.120
 Majesty-0.116
 endings-0.105
 fault-0.102
 appearance-0.100
Token the
Token ID262
Feature activation+1.92e-02

Pos logit contributions
 Majesty+0.078
 disl+0.075
 endings+0.069
idents+0.068
 appearance+0.068
Neg logit contributions
 full-0.035
hes-0.034
 tucked-0.031
 called-0.030
 sn-0.029
Token to
Token ID284
Feature activation+7.37e-02

Pos logit contributions
 Majesty+0.172
 disl+0.171
idents+0.141
 endings+0.141
cient+0.137
Neg logit contributions
 full-0.144
 sn-0.132
 tucked-0.131
 called-0.131
hes-0.130
Token go
Token ID467
Feature activation-3.62e-02

Pos logit contributions
 disl+0.149
 Majesty+0.148
 endings+0.134
 fault+0.130
 beings+0.130
Neg logit contributions
 full-0.083
 tucked-0.076
 sn-0.076
hes-0.076
 called-0.074
Token to
Token ID284
Feature activation+6.90e-02

Pos logit contributions
 full+0.055
hes+0.055
 tucked+0.052
 nick+0.050
 sn+0.049
Neg logit contributions
 disl-0.060
 Majesty-0.058
 endings-0.050
 fault-0.049
 appearance-0.048
Token the
Token ID262
Feature activation+4.85e-02

Pos logit contributions
 disl+0.142
 Majesty+0.140
 endings+0.127
 preferences+0.123
idents+0.122
Neg logit contributions
 full-0.077
 sn-0.072
 tucked-0.072
hes-0.070
 nick-0.069
Token party
Token ID2151
Feature activation+9.48e-02

Pos logit contributions
 Majesty+0.082
 disl+0.080
 fault+0.070
 endings+0.069
 appearance+0.068
Neg logit contributions
hes-0.071
 full-0.067
 tucked-0.066
 ped-0.063
 nick-0.061
Token anymore
Token ID7471
Feature activation+2.14e-01

Pos logit contributions
 disl+0.200
 Majesty+0.188
 endings+0.166
 fault+0.163
 preferences+0.163
Neg logit contributions
 full-0.122
 tucked-0.112
 called-0.107
hes-0.106
 sn-0.101
Token,"
Token ID553
Feature activation+7.34e-02

Pos logit contributions
 Majesty+0.481
 disl+0.479
roc+0.428
idents+0.428
cient+0.427
Neg logit contributions
 full-0.224
 called-0.213
 tucked-0.195
 sn-0.191
 j-0.181
Token he
Token ID339
Feature activation+4.05e-02

Pos logit contributions
 disl+0.153
 Majesty+0.153
 fault+0.139
 endings+0.137
 appearance+0.134
Neg logit contributions
 full-0.082
 called-0.077
hes-0.074
 tucked-0.072
 sn-0.072
Token said
Token ID531
Feature activation+2.62e-02

Pos logit contributions
 disl+0.084
 Majesty+0.081
 endings+0.074
 fault+0.072
 appearance+0.071
Neg logit contributions
 full-0.045
hes-0.044
 tucked-0.043
 sn-0.040
 called-0.040
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
 disl+0.050
 Majesty+0.049
 endings+0.044
 fault+0.042
 appearance+0.042
Neg logit contributions
 full-0.035
hes-0.033
 tucked-0.032
 sn-0.030
 called-0.030
Token\n
Token ID198
Feature activation+1.46e-02

Pos logit contributions
 disl+0.023
 Majesty+0.023
 endings+0.020
 fault+0.020
 appearance+0.019
Neg logit contributions
 full-0.017
hes-0.016
 tucked-0.016
 sn-0.015
 called-0.015
Token about
Token ID546
Feature activation+1.07e-01

Pos logit contributions
 Majesty+0.201
 disl+0.192
idents+0.175
 endings+0.172
eman+0.166
Neg logit contributions
 full-0.102
hes-0.091
 called-0.089
 tucked-0.088
 j-0.085
Token the
Token ID262
Feature activation+4.91e-02

Pos logit contributions
 Majesty+0.200
 disl+0.196
idents+0.179
cient+0.173
 endings+0.169
Neg logit contributions
 full-0.136
 sn-0.124
 called-0.120
 j-0.119
 tucked-0.119
Token boy
Token ID2933
Feature activation+1.08e-01

Pos logit contributions
 disl+0.087
 Majesty+0.084
 endings+0.075
 fault+0.073
 appearance+0.071
Neg logit contributions
 full-0.068
hes-0.068
 tucked-0.065
 sn-0.061
 nick-0.061
Token's
Token ID338
Feature activation+4.41e-02

Pos logit contributions
 disl+0.227
 Majesty+0.224
 endings+0.192
 perspectives+0.188
 preferences+0.187
Neg logit contributions
 full-0.135
 tucked-0.124
 called-0.121
 sn-0.121
 j-0.112
Token train
Token ID4512
Feature activation+5.07e-03

Pos logit contributions
 disl+0.072
 Majesty+0.069
 endings+0.062
idents+0.060
 purposes+0.058
Neg logit contributions
 full-0.067
 tucked-0.061
 sn-0.061
hes-0.059
 j-0.058
Token anymore
Token ID7471
Feature activation+2.14e-01

Pos logit contributions
 disl+0.010
 Majesty+0.009
 endings+0.008
 preferences+0.008
idents+0.008
Neg logit contributions
 full-0.007
 tucked-0.006
hes-0.006
 called-0.006
 sn-0.006
Token.
Token ID13
Feature activation+2.26e-02

Pos logit contributions
 Majesty+0.478
 disl+0.462
roc+0.431
idents+0.425
cient+0.421
Neg logit contributions
 full-0.223
 called-0.205
 tucked-0.194
 sn-0.189
 j-0.179
Token They
Token ID1119
Feature activation+1.54e-02

Pos logit contributions
 disl+0.043
 Majesty+0.043
 endings+0.039
 appearance+0.038
 fault+0.038
Neg logit contributions
 full-0.028
hes-0.026
 called-0.025
 tucked-0.025
On-0.024
Token were
Token ID547
Feature activation-4.07e-02

Pos logit contributions
 disl+0.029
 Majesty+0.028
 endings+0.025
 fault+0.024
 purposes+0.024
Neg logit contributions
 full-0.021
 tucked-0.020
hes-0.019
 sn-0.019
 called-0.019
Token happy
Token ID3772
Feature activation+1.45e-01

Pos logit contributions
 full+0.045
hes+0.044
 tucked+0.041
 sn+0.039
 called+0.038
Neg logit contributions
 disl-0.086
 Majesty-0.084
 endings-0.077
idents-0.074
 fault-0.073
Token with
Token ID351
Feature activation+7.17e-03

Pos logit contributions
 disl+0.323
 Majesty+0.321
cient+0.291
 perspectives+0.280
form+0.279
Neg logit contributions
 full-0.165
 called-0.142
 tucked-0.140
 sn-0.139
 j-0.137
Token was
Token ID373
Feature activation-2.14e-02

Pos logit contributions
 disl+0.182
 Majesty+0.174
 endings+0.157
 fault+0.154
idents+0.151
Neg logit contributions
 full-0.114
hes-0.109
 tucked-0.108
 sn-0.101
 called-0.101
Token too
Token ID1165
Feature activation+3.91e-02

Pos logit contributions
hes+0.021
 full+0.021
 tucked+0.020
 nick+0.019
 sn+0.018
Neg logit contributions
 disl-0.045
 Majesty-0.045
 fault-0.039
 endings-0.039
 appearance-0.038
Token tired
Token ID10032
Feature activation+1.00e-01

Pos logit contributions
 disl+0.087
 Majesty+0.083
 fault+0.076
 endings+0.074
 beings+0.073
Neg logit contributions
hes-0.034
 tucked-0.034
 nick-0.032
 full-0.032
Wh-0.030
Token to
Token ID284
Feature activation+5.54e-02

Pos logit contributions
 disl+0.191
 Majesty+0.185
 endings+0.157
 preferences+0.154
 perspectives+0.151
Neg logit contributions
 full-0.141
 tucked-0.132
hes-0.128
 called-0.123
 sn-0.123
Token move
Token ID1445
Feature activation-7.21e-03

Pos logit contributions
 disl+0.115
 Majesty+0.111
 endings+0.101
 fault+0.099
 beings+0.097
Neg logit contributions
 full-0.063
 tucked-0.059
hes-0.059
 sn-0.057
 called-0.054
Token anymore
Token ID7471
Feature activation+2.13e-01

Pos logit contributions
 full+0.009
hes+0.009
 tucked+0.009
 sn+0.008
 called+0.008
Neg logit contributions
 disl-0.014
 Majesty-0.014
 endings-0.012
 fault-0.012
idents-0.011
Token.
Token ID13
Feature activation+2.40e-02

Pos logit contributions
 disl+0.458
 Majesty+0.452
idents+0.399
 endings+0.390
 fault+0.390
Neg logit contributions
 full-0.243
 tucked-0.221
 called-0.218
 sn-0.214
hes-0.200
Token\n
Token ID198
Feature activation+1.75e-02

Pos logit contributions
 disl+0.046
 Majesty+0.044
 endings+0.040
 fault+0.039
 appearance+0.038
Neg logit contributions
 full-0.031
hes-0.030
 tucked-0.029
 called-0.028
 sn-0.027
Token\n
Token ID198
Feature activation+1.13e-02

Pos logit contributions
 disl+0.033
 Majesty+0.031
 endings+0.028
 fault+0.027
 purposes+0.026
Neg logit contributions
 full-0.022
 tucked-0.022
hes-0.022
 sn-0.020
 nick-0.020
TokenHe
Token ID1544
Feature activation+2.72e-02

Pos logit contributions
 disl+0.023
 Majesty+0.022
 endings+0.020
 purposes+0.018
 perspectives+0.018
Neg logit contributions
 full-0.013
hes-0.013
 tucked-0.013
 sn-0.012
 nick-0.011
Token felt
Token ID2936
Feature activation-5.29e-02

Pos logit contributions
 Majesty+0.051
 disl+0.051
 endings+0.045
 means+0.043
 appearance+0.043
Neg logit contributions
hes-0.033
 full-0.032
 nick-0.028
 tucked-0.028
Wh-0.027
Token friendly
Token ID8030
Feature activation+5.49e-02

Pos logit contributions
 full+0.017
hes+0.017
 tucked+0.016
 sn+0.014
 nick+0.014
Neg logit contributions
 disl-0.038
 Majesty-0.037
 endings-0.034
 fault-0.033
 appearance-0.033
Token.
Token ID13
Feature activation-2.46e-02

Pos logit contributions
 disl+0.116
 Majesty+0.112
 endings+0.097
 fault+0.094
idents+0.094
Neg logit contributions
 full-0.069
 tucked-0.064
 called-0.060
 sn-0.060
hes-0.058
Token\n
Token ID198
Feature activation-2.50e-02

Pos logit contributions
 full+0.029
hes+0.027
 called+0.026
 tucked+0.025
On+0.025
Neg logit contributions
 disl-0.051
 Majesty-0.051
 endings-0.046
 fault-0.045
 appearance-0.044
Token\n
Token ID198
Feature activation-2.83e-02

Pos logit contributions
 full+0.033
hes+0.033
 tucked+0.031
 sn+0.030
 nick+0.029
Neg logit contributions
 disl-0.046
 Majesty-0.045
 endings-0.040
 fault-0.039
 appearance-0.038
TokenHe
Token ID1544
Feature activation-1.22e-02

Pos logit contributions
 full+0.036
 tucked+0.035
hes+0.035
 sn+0.033
 nick+0.033
Neg logit contributions
 disl-0.052
 Majesty-0.048
 endings-0.044
 perspectives-0.042
 purposes-0.041
Token crou
Token ID33838
Feature activation-2.06e-01

Pos logit contributions
hes+0.016
 full+0.015
 nick+0.014
 tucked+0.014
 gust+0.013
Neg logit contributions
 disl-0.021
 Majesty-0.021
 endings-0.018
 appearance-0.018
flies-0.018
Tokenched
Token ID1740
Feature activation-5.05e-02

Pos logit contributions
 full+0.367
 convey+0.361
 com+0.346
 curly+0.345
 masse+0.345
Neg logit contributions
vered-0.200
 disl-0.193
form-0.183
minded-0.182
 means-0.181
Token down
Token ID866
Feature activation-1.05e-02

Pos logit contributions
hes+0.083
 full+0.081
 tucked+0.080
aver+0.074
On+0.073
Neg logit contributions
 disl-0.068
 Majesty-0.065
 means-0.061
 beings-0.060
 endings-0.059
Token and
Token ID290
Feature activation-1.02e-01

Pos logit contributions
hes+0.020
aver+0.019
Wh+0.019
Ring+0.018
 nick+0.018
Neg logit contributions
 means-0.010
minded-0.008
 anymore-0.008
 beings-0.008
ry-0.008
Token said
Token ID531
Feature activation-2.78e-02

Pos logit contributions
hes+0.118
 full+0.113
 tucked+0.101
 nick+0.101
aver+0.100
Neg logit contributions
 disl-0.201
 Majesty-0.193
 endings-0.168
 means-0.166
idents-0.166
Token,
Token ID11
Feature activation-1.17e-02

Pos logit contributions
 full+0.042
 sn+0.038
 j+0.037
 tucked+0.036
 bra+0.036
Neg logit contributions
 disl-0.050
 Majesty-0.047
 fault-0.044
 endings-0.044
idents-0.043
Token eyes
Token ID2951
Feature activation-1.17e-01

Pos logit contributions
hes+0.026
 full+0.024
 tucked+0.024
 nick+0.022
aver+0.022
Neg logit contributions
 disl-0.036
 Majesty-0.035
 fault-0.032
 endings-0.031
 appearance-0.031
Token.
Token ID13
Feature activation-2.54e-02

Pos logit contributions
hes+0.149
 nick+0.140
aver+0.139
Wh+0.133
 prec+0.130
Neg logit contributions
 disl-0.208
 Majesty-0.192
 endings-0.178
 appearance-0.178
flies-0.175
Token Jill
Token ID22772
Feature activation+1.11e-02

Pos logit contributions
 full+0.036
hes+0.036
 tucked+0.035
 sn+0.033
 nick+0.032
Neg logit contributions
 disl-0.045
 Majesty-0.043
 endings-0.038
 fault-0.037
idents-0.036
Token and
Token ID290
Feature activation-1.03e-01

Pos logit contributions
 disl+0.022
 Majesty+0.021
 endings+0.019
 fault+0.019
 appearance+0.018
Neg logit contributions
 full-0.013
hes-0.013
 tucked-0.013
 sn-0.012
 called-0.012
Token Mike
Token ID4995
Feature activation-3.08e-02

Pos logit contributions
 full+0.158
hes+0.150
 tucked+0.149
 sn+0.144
 ped+0.143
Neg logit contributions
 disl-0.170
 Majesty-0.167
 endings-0.146
 fault-0.143
 preferences-0.137
Token crou
Token ID33838
Feature activation-2.05e-01

Pos logit contributions
 full+0.037
hes+0.037
 tucked+0.035
 nick+0.033
 sn+0.032
Neg logit contributions
 disl-0.060
 Majesty-0.058
 endings-0.052
 fault-0.051
 appearance-0.050
Tokenched
Token ID1740
Feature activation-4.76e-02

Pos logit contributions
 convey+0.358
 full+0.352
Grand+0.350
Kim+0.347
 masse+0.346
Neg logit contributions
vered-0.207
 disl-0.191
mer-0.178
form-0.177
cient-0.168
Token down
Token ID866
Feature activation-7.88e-03

Pos logit contributions
hes+0.066
 full+0.066
 tucked+0.063
 nick+0.059
 sn+0.058
Neg logit contributions
 disl-0.082
 Majesty-0.079
 endings-0.072
 fault-0.070
 beings-0.069
Token to
Token ID284
Feature activation-1.04e-02

Pos logit contributions
hes+0.014
aver+0.013
Wh+0.012
eh+0.012
 tucked+0.012
Neg logit contributions
 disl-0.010
 Majesty-0.010
 fault-0.009
 means-0.009
 beings-0.009
Token pet
Token ID4273
Feature activation-6.11e-02

Pos logit contributions
 full+0.014
hes+0.013
 tucked+0.013
 nick+0.012
 sn+0.012
Neg logit contributions
 disl-0.020
 Majesty-0.019
 endings-0.017
 fault-0.016
 appearance-0.016
Token the
Token ID262
Feature activation-7.09e-04

Pos logit contributions
 full+0.092
hes+0.092
 tucked+0.089
 nick+0.084
 sn+0.083
Neg logit contributions
 disl-0.100
 Majesty-0.096
 endings-0.085
 fault-0.082
idents-0.080
Token said
Token ID531
Feature activation-2.83e-02

Pos logit contributions
hes+0.019
 nick+0.019
 full+0.019
 leash+0.018
 gust+0.018
Neg logit contributions
 disl-0.015
 Majesty-0.014
 means-0.013
flies-0.013
 preferences-0.012
Token $
Token ID720
Feature activation-9.52e-02

Pos logit contributions
 full+0.042
hes+0.041
 tucked+0.040
 sn+0.038
 called+0.037
Neg logit contributions
 disl-0.049
 Majesty-0.047
 endings-0.042
 fault-0.041
 appearance-0.040
Token1
Token ID16
Feature activation-3.11e-02

Pos logit contributions
hes+0.149
 tucked+0.143
 full+0.140
aver+0.136
 sn+0.132
Neg logit contributions
 Majesty-0.139
 disl-0.139
idents-0.133
 beings-0.129
minded-0.129
Token.
Token ID13
Feature activation-3.38e-02

Pos logit contributions
 full+0.051
hes+0.050
 tucked+0.049
 sn+0.046
 called+0.046
Neg logit contributions
 disl-0.048
 Majesty-0.046
 endings-0.040
 fault-0.038
idents-0.037
Token99
Token ID2079
Feature activation-4.24e-02

Pos logit contributions
 tucked+0.059
 ped+0.056
 full+0.056
hes+0.055
aver+0.055
Neg logit contributions
 disl-0.045
 Majesty-0.042
 fault-0.037
 endings-0.035
 means-0.035
Token per
Token ID583
Feature activation-2.04e-01

Pos logit contributions
 full+0.060
hes+0.058
 tucked+0.058
 sn+0.054
 called+0.053
Neg logit contributions
 disl-0.076
 Majesty-0.073
 endings-0.065
 fault-0.062
idents-0.062
Token pound
Token ID14896
Feature activation-6.56e-02

Pos logit contributions
 full+0.332
hes+0.329
 tucked+0.323
On+0.310
 called+0.308
Neg logit contributions
 Majesty-0.313
 disl-0.307
 fault-0.267
 endings-0.252
flies-0.249
Token.
Token ID13
Feature activation-3.60e-02

Pos logit contributions
 full+0.087
hes+0.085
 tucked+0.083
 sn+0.078
 called+0.077
Neg logit contributions
 disl-0.123
 Majesty-0.118
 endings-0.105
 fault-0.102
idents-0.100
Token\n
Token ID198
Feature activation-3.52e-02

Pos logit contributions
 tucked+0.060
 ped+0.057
 full+0.057
 sn+0.056
hes+0.056
Neg logit contributions
 disl-0.051
 Majesty-0.047
 fault-0.043
 endings-0.041
 means-0.040
Token\n
Token ID198
Feature activation-3.82e-02

Pos logit contributions
 full+0.046
hes+0.045
 tucked+0.043
 sn+0.041
 called+0.041
Neg logit contributions
 disl-0.066
 Majesty-0.064
 endings-0.057
 fault-0.056
 appearance-0.055
Token"
Token ID1
Feature activation+2.25e-02

Pos logit contributions
 full+0.062
hes+0.062
 tucked+0.061
 sn+0.058
 nick+0.058
Neg logit contributions
 disl-0.058
 Majesty-0.055
 endings-0.048
 purposes-0.045
 preferences-0.045
Token.
Token ID13
Feature activation-2.00e-02

Pos logit contributions
hes+0.073
 nick+0.067
 full+0.065
 bra+0.064
 tucked+0.064
Neg logit contributions
 disl-0.055
 Majesty-0.054
 endings-0.049
flies-0.049
 appearance-0.046
Token\n
Token ID198
Feature activation-1.96e-02

Pos logit contributions
 full+0.025
hes+0.024
 tucked+0.024
 sn+0.022
 nick+0.022
Neg logit contributions
 disl-0.039
 Majesty-0.038
 endings-0.034
 fault-0.033
 appearance-0.032
Token\n
Token ID198
Feature activation-2.29e-02

Pos logit contributions
 full+0.023
hes+0.022
 tucked+0.021
On+0.020
 called+0.020
Neg logit contributions
 disl-0.042
 Majesty-0.041
 endings-0.036
 fault-0.036
 appearance-0.036
TokenP
Token ID47
Feature activation-4.51e-02

Pos logit contributions
hes+0.026
 full+0.025
 tucked+0.024
 ped+0.022
 nick+0.022
Neg logit contributions
 disl-0.046
 Majesty-0.045
 endings-0.041
 appearance-0.039
 perspectives-0.038
Tokenoppy
Token ID26696
Feature activation-2.28e-02

Pos logit contributions
hes+0.098
 full+0.098
 tucked+0.095
 sn+0.091
 nick+0.091
Neg logit contributions
 disl-0.044
 Majesty-0.043
 endings-0.035
 fault-0.032
idents-0.030
Token crou
Token ID33838
Feature activation-2.03e-01

Pos logit contributions
 full+0.030
 tucked+0.029
hes+0.029
 sn+0.028
 called+0.028
Neg logit contributions
 disl-0.042
 Majesty-0.041
 fault-0.037
 endings-0.037
idents-0.035
Tokenched
Token ID1740
Feature activation-4.79e-02

Pos logit contributions
 masse+0.369
 convey+0.369
 full+0.369
 com+0.350
 leash+0.349
Neg logit contributions
vered-0.194
 disl-0.191
form-0.167
cient-0.161
 means-0.160
Token down
Token ID866
Feature activation-5.06e-03

Pos logit contributions
hes+0.079
 full+0.075
 tucked+0.074
 nick+0.070
aver+0.069
Neg logit contributions
 disl-0.067
 Majesty-0.064
 means-0.057
 beings-0.057
idents-0.056
Token and
Token ID290
Feature activation-9.60e-02

Pos logit contributions
hes+0.010
aver+0.010
Wh+0.009
 nick+0.009
 convey+0.009
Neg logit contributions
 means-0.004
 disl-0.004
 beings-0.004
minded-0.003
 anymore-0.003
Token said
Token ID531
Feature activation-2.42e-02

Pos logit contributions
 full+0.109
hes+0.108
 tucked+0.102
 nick+0.095
 sn+0.094
Neg logit contributions
 disl-0.196
 Majesty-0.190
 endings-0.170
 fault-0.166
 appearance-0.163
Token,
Token ID11
Feature activation-7.93e-03

Pos logit contributions
 full+0.033
 sn+0.030
 j+0.029
 low+0.028
 bra+0.027
Neg logit contributions
 disl-0.047
 Majesty-0.044
 fault-0.044
 purposes-0.043
 endings-0.043
Token its
Token ID663
Feature activation-2.45e-02

Pos logit contributions
hes+0.041
 full+0.040
 tucked+0.040
 nick+0.037
 bra+0.036
Neg logit contributions
 disl-0.046
 Majesty-0.043
 endings-0.038
 fault-0.037
 appearance-0.036
Token mouth
Token ID5422
Feature activation-6.98e-02

Pos logit contributions
 full+0.030
hes+0.029
 tucked+0.028
 sn+0.026
 nick+0.026
Neg logit contributions
 disl-0.048
 Majesty-0.047
 endings-0.042
 fault-0.041
 appearance-0.040
Token.
Token ID13
Feature activation-2.73e-02

Pos logit contributions
 nick+0.121
hes+0.118
Wh+0.115
Grab+0.113
 masse+0.113
Neg logit contributions
 disl-0.086
 Majesty-0.078
 fault-0.075
flies-0.072
 endings-0.071
Token Doug
Token ID15115
Feature activation+7.52e-04

Pos logit contributions
 full+0.031
 called+0.029
hes+0.028
 sn+0.026
 nick+0.026
Neg logit contributions
 disl-0.057
 Majesty-0.057
 fault-0.054
 endings-0.052
 appearance-0.050
Tokenie
Token ID494
Feature activation-1.38e-02

Pos logit contributions
 disl+0.001
 Majesty+0.001
 fault+0.001
 endings+0.001
idents+0.001
Neg logit contributions
 full-0.001
 tucked-0.001
 called-0.001
 sn-0.001
hes-0.001
Token crou
Token ID33838
Feature activation-2.01e-01

Pos logit contributions
 full+0.019
hes+0.018
 tucked+0.018
 sn+0.017
 called+0.017
Neg logit contributions
 disl-0.025
 Majesty-0.025
 endings-0.022
 fault-0.021
 appearance-0.021
Tokenched
Token ID1740
Feature activation-4.55e-02

Pos logit contributions
 full+0.394
Grand+0.391
 convey+0.389
 com+0.387
 masse+0.387
Neg logit contributions
vered-0.157
 disl-0.122
cient-0.120
form-0.117
 means-0.114
Token down
Token ID866
Feature activation-3.87e-03

Pos logit contributions
hes+0.072
 full+0.070
 tucked+0.068
 nick+0.064
On+0.064
Neg logit contributions
 disl-0.066
 Majesty-0.064
 beings-0.057
 endings-0.057
 means-0.056
Token and
Token ID290
Feature activation-9.32e-02

Pos logit contributions
hes+0.007
aver+0.007
Wh+0.007
 nick+0.007
Ring+0.006
Neg logit contributions
 means-0.004
 disl-0.004
 beings-0.003
 anymore-0.003
 fault-0.003
Token said
Token ID531
Feature activation-1.87e-02

Pos logit contributions
hes+0.103
 full+0.103
 tucked+0.095
 nick+0.091
 sn+0.087
Neg logit contributions
 disl-0.191
 Majesty-0.185
 endings-0.165
 fault-0.159
idents-0.158
Token quietly
Token ID12703
Feature activation-8.38e-04

Pos logit contributions
 full+0.027
 j+0.023
 sn+0.023
 bra+0.022
 tucked+0.022
Neg logit contributions
 disl-0.035
 Majesty-0.034
 purposes-0.033
 endings-0.032
 preferences-0.032
Token the
Token ID262
Feature activation-9.74e-03

Pos logit contributions
 full+0.024
 called+0.022
 tucked+0.021
 sn+0.021
hes+0.021
Neg logit contributions
 Majesty-0.045
 disl-0.044
 endings-0.039
 fault-0.039
idents-0.037
Token bunny
Token ID44915
Feature activation+7.02e-03

Pos logit contributions
 full+0.013
 tucked+0.012
 sn+0.012
 j+0.011
 cold+0.011
Neg logit contributions
 disl-0.021
 Majesty-0.019
 endings-0.017
 fault-0.017
flies-0.016
Token,
Token ID11
Feature activation+3.14e-03

Pos logit contributions
 disl+0.016
 Majesty+0.015
 endings+0.014
 purposes+0.013
 preferences+0.013
Neg logit contributions
 full-0.009
 called-0.008
 tucked-0.007
 sn-0.007
 j-0.007
Token so
Token ID523
Feature activation+7.61e-03

Pos logit contributions
 disl+0.005
 Majesty+0.005
 fault+0.005
 endings+0.004
 preferences+0.004
Neg logit contributions
hes-0.005
 full-0.004
 tucked-0.004
 nick-0.004
aver-0.004
Token she
Token ID673
Feature activation+2.06e-02

Pos logit contributions
 disl+0.010
 Majesty+0.010
 endings+0.008
idents+0.008
 appearance+0.008
Neg logit contributions
hes-0.014
 nick-0.012
aver-0.012
 tucked-0.012
 prec-0.012
Token crou
Token ID33838
Feature activation-1.99e-01

Pos logit contributions
 Majesty+0.038
 disl+0.037
flies+0.033
 means+0.032
cient+0.032
Neg logit contributions
hes-0.024
 full-0.024
 nick-0.023
 gust-0.022
 leash-0.021
Tokenched
Token ID1740
Feature activation-4.75e-02

Pos logit contributions
 full+0.352
 convey+0.341
Kim+0.335
 nick+0.333
 masse+0.333
Neg logit contributions
 disl-0.200
vered-0.196
cient-0.188
form-0.188
 means-0.181
Token down
Token ID866
Feature activation-8.67e-03

Pos logit contributions
hes+0.076
 full+0.074
 tucked+0.072
aver+0.069
 nick+0.069
Neg logit contributions
 disl-0.068
 Majesty-0.064
 means-0.058
 endings-0.058
 beings-0.058
Token and
Token ID290
Feature activation-9.48e-02

Pos logit contributions
aver+0.015
hes+0.015
 nick+0.014
Wh+0.014
Hold+0.013
Neg logit contributions
 means-0.010
 disl-0.009
 fault-0.009
 beings-0.009
 Majesty-0.009
Token bowed
Token ID35737
Feature activation-5.91e-02

Pos logit contributions
hes+0.093
 full+0.091
 tucked+0.083
 nick+0.079
On+0.076
Neg logit contributions
 disl-0.208
 Majesty-0.202
 endings-0.181
 appearance-0.176
 fault-0.175
Token to
Token ID284
Feature activation-3.82e-03

Pos logit contributions
 full+0.079
hes+0.076
 tucked+0.074
 sn+0.071
 called+0.069
Neg logit contributions
 disl-0.111
 Majesty-0.109
 endings-0.097
 fault-0.095
idents-0.091
Token She
Token ID1375
Feature activation+1.23e-03

Pos logit contributions
 full+0.023
hes+0.022
 tucked+0.020
 called+0.020
 sn+0.020
Neg logit contributions
 disl-0.043
 Majesty-0.042
 endings-0.038
 fault-0.038
 appearance-0.038
Token put
Token ID1234
Feature activation-9.68e-02

Pos logit contributions
 disl+0.002
 Majesty+0.002
 endings+0.002
 preferences+0.002
 fault+0.002
Neg logit contributions
 full-0.002
hes-0.002
 nick-0.002
 leash-0.002
 tucked-0.002
Token some
Token ID617
Feature activation-4.43e-02

Pos logit contributions
 tucked+0.169
aver+0.168
 nick+0.166
 full+0.165
On+0.165
Neg logit contributions
 disl-0.113
 Majesty-0.102
 means-0.091
 relationships-0.089
idents-0.088
Token on
Token ID319
Feature activation-3.78e-02

Pos logit contributions
hes+0.058
 ped+0.057
Phill+0.056
Wh+0.055
 nick+0.055
Neg logit contributions
 disl-0.073
 preferences-0.064
 purposes-0.062
 endings-0.062
 Majesty-0.062
Token her
Token ID607
Feature activation-2.09e-02

Pos logit contributions
aver+0.060
hes+0.060
 tucked+0.058
 full+0.056
 nick+0.056
Neg logit contributions
 disl-0.055
 Majesty-0.051
 fault-0.047
 appearance-0.045
 endings-0.045
Token eyel
Token ID29708
Feature activation-1.99e-01

Pos logit contributions
hes+0.027
 full+0.026
 tucked+0.026
 nick+0.025
 ped+0.024
Neg logit contributions
 disl-0.039
 Majesty-0.038
 endings-0.033
 preferences-0.033
 appearance-0.032
Tokenids
Token ID2340
Feature activation-1.07e-01

Pos logit contributions
hes+0.302
 full+0.302
 parked+0.300
 tucked+0.299
On+0.298
Neg logit contributions
 Majesty-0.332
 disl-0.317
 fault-0.306
 appearance-0.260
 endings-0.257
Token and
Token ID290
Feature activation-1.04e-01

Pos logit contributions
hes+0.136
 full+0.134
 tucked+0.129
 nick+0.126
 sn+0.122
Neg logit contributions
 disl-0.195
 Majesty-0.190
 endings-0.169
 fault-0.167
 appearance-0.162
Token cheeks
Token ID25839
Feature activation-1.01e-01

Pos logit contributions
hes+0.129
aver+0.129
 full+0.128
 nick+0.126
 leash+0.119
Neg logit contributions
 disl-0.184
 Majesty-0.176
 means-0.156
 preferences-0.149
 endings-0.148
Token.
Token ID13
Feature activation-2.34e-02

Pos logit contributions
 nick+0.160
hes+0.158
aver+0.154
 full+0.149
Wh+0.146
Neg logit contributions
 disl-0.142
 Majesty-0.139
 endings-0.121
 fault-0.120
idents-0.118
Token It
Token ID632
Feature activation-1.35e-02

Pos logit contributions
 full+0.029
hes+0.028
 tucked+0.026
 sn+0.025
 called+0.025
Neg logit contributions
 disl-0.046
 Majesty-0.045
 endings-0.040
 fault-0.040
 appearance-0.039
Token to
Token ID284
Feature activation+2.30e-02

Pos logit contributions
 disl+0.024
 Majesty+0.024
 endings+0.020
 fault+0.020
idents+0.019
Neg logit contributions
 full-0.027
hes-0.026
 tucked-0.026
 sn-0.025
 called-0.025
Token say
Token ID910
Feature activation-3.35e-02

Pos logit contributions
 disl+0.045
 Majesty+0.043
 endings+0.038
 fault+0.037
 appearance+0.037
Neg logit contributions
hes-0.028
 full-0.028
 tucked-0.027
 nick-0.024
 ped-0.024
Token hello
Token ID23748
Feature activation+2.47e-02

Pos logit contributions
 full+0.045
 sn+0.041
hes+0.040
 tucked+0.040
 j+0.040
Neg logit contributions
 disl-0.063
 Majesty-0.060
 endings-0.055
idents-0.054
 fault-0.054
Token.
Token ID13
Feature activation-1.10e-02

Pos logit contributions
 disl+0.054
 Majesty+0.050
 endings+0.046
 purposes+0.044
 fault+0.044
Neg logit contributions
 full-0.029
 sn-0.026
 tucked-0.026
 called-0.025
 j-0.025
Token He
Token ID679
Feature activation+7.30e-03

Pos logit contributions
 full+0.013
On+0.011
 called+0.011
 low+0.010
 j+0.010
Neg logit contributions
 Majesty-0.024
 disl-0.024
 fault-0.022
 beings-0.022
 endings-0.022
Token crou
Token ID33838
Feature activation-1.98e-01

Pos logit contributions
 disl+0.013
 Majesty+0.013
flies+0.011
 means+0.011
by+0.011
Neg logit contributions
hes-0.010
 full-0.009
 nick-0.008
 tucked-0.008
 leash-0.008
Tokenched
Token ID1740
Feature activation-4.63e-02

Pos logit contributions
 convey+0.363
 full+0.346
 masse+0.337
Kim+0.334
 nick+0.334
Neg logit contributions
 disl-0.196
vered-0.184
form-0.176
 means-0.173
cient-0.169
Token down
Token ID866
Feature activation-5.98e-03

Pos logit contributions
hes+0.085
 full+0.080
 tucked+0.079
aver+0.075
On+0.075
Neg logit contributions
 disl-0.049
 means-0.049
 Majesty-0.047
 beings-0.046
by-0.045
Token and
Token ID290
Feature activation-9.42e-02

Pos logit contributions
hes+0.012
aver+0.011
Wh+0.011
 nick+0.011
 convey+0.010
Neg logit contributions
 means-0.005
minded-0.004
 anymore-0.004
 disl-0.004
 beings-0.004
Token carefully
Token ID7773
Feature activation-2.36e-02

Pos logit contributions
hes+0.105
 full+0.104
 tucked+0.096
 nick+0.091
On+0.087
Neg logit contributions
 disl-0.193
 Majesty-0.187
 endings-0.167
 appearance-0.161
 fault-0.160
Token reached
Token ID4251
Feature activation-7.83e-02

Pos logit contributions
hes+0.028
 nick+0.027
 masse+0.025
Wh+0.025
Ring+0.024
Neg logit contributions
 disl-0.044
 Majesty-0.040
 means-0.036
flies-0.035
 appearance-0.035
Token.
Token ID13
Feature activation-2.03e-02

Pos logit contributions
 disl+0.154
 Majesty+0.152
idents+0.127
 endings+0.124
 fault+0.122
Neg logit contributions
 full-0.091
 tucked-0.083
 called-0.083
 sn-0.076
hes-0.074
Token He
Token ID679
Feature activation+2.25e-03

Pos logit contributions
 full+0.027
hes+0.026
 tucked+0.025
 sn+0.024
 nick+0.023
Neg logit contributions
 disl-0.038
 Majesty-0.037
 endings-0.033
 fault-0.032
 appearance-0.032
Token blinked
Token ID40360
Feature activation-4.37e-02

Pos logit contributions
 Majesty+0.003
 disl+0.003
 endings+0.003
flies+0.003
 means+0.003
Neg logit contributions
hes-0.003
 full-0.003
 nick-0.003
 gust-0.003
 leash-0.003
Token his
Token ID465
Feature activation-1.90e-02

Pos logit contributions
 full+0.050
hes+0.046
 tucked+0.045
 sn+0.043
 called+0.041
Neg logit contributions
 disl-0.093
 Majesty-0.090
 endings-0.081
 fault-0.080
idents-0.078
Token long
Token ID890
Feature activation+1.95e-02

Pos logit contributions
 full+0.023
hes+0.022
 tucked+0.021
 sn+0.020
 nick+0.020
Neg logit contributions
 disl-0.038
 Majesty-0.037
 endings-0.033
 fault-0.032
 appearance-0.031
Token eyel
Token ID29708
Feature activation-1.95e-01

Pos logit contributions
 disl+0.038
 Majesty+0.038
 endings+0.034
 appearance+0.034
 fault+0.032
Neg logit contributions
hes-0.024
 full-0.022
 tucked-0.021
 nick-0.019
 ped-0.019
Tokenashes
Token ID7465
Feature activation-6.32e-02

Pos logit contributions
hes+0.310
 tucked+0.306
 full+0.304
On+0.302
 parked+0.300
Neg logit contributions
 Majesty-0.313
 disl-0.300
 fault-0.298
 appearance-0.253
 endings-0.249
Token and
Token ID290
Feature activation-9.61e-02

Pos logit contributions
hes+0.087
 full+0.084
 tucked+0.080
 nick+0.078
Wh+0.076
Neg logit contributions
 disl-0.107
 Majesty-0.104
 endings-0.096
 fault-0.092
 appearance-0.090
Token kept
Token ID4030
Feature activation-6.43e-02

Pos logit contributions
 full+0.127
hes+0.127
 tucked+0.121
 nick+0.114
 sn+0.114
Neg logit contributions
 disl-0.177
 Majesty-0.172
 endings-0.152
 fault-0.147
 appearance-0.145
Token hopping
Token ID47153
Feature activation-7.70e-02

Pos logit contributions
hes+0.083
 full+0.081
 tucked+0.080
Wh+0.078
aver+0.078
Neg logit contributions
 Majesty-0.110
 disl-0.110
 beings-0.094
 appearance-0.092
 relationships-0.091
Token.
Token ID13
Feature activation-1.97e-02

Pos logit contributions
hes+0.117
 full+0.111
 tucked+0.110
 nick+0.105
 ped+0.103
Neg logit contributions
 disl-0.124
 Majesty-0.119
 endings-0.107
 fault-0.102
 relationships-0.101
Token door
Token ID3420
Feature activation-5.33e-02

Pos logit contributions
hes+0.006
 full+0.006
 ped+0.005
aver+0.005
 tucked+0.005
Neg logit contributions
 Majesty-0.008
 disl-0.008
 fault-0.007
 endings-0.007
 appearance-0.007
Token.
Token ID13
Feature activation-2.48e-02

Pos logit contributions
hes+0.092
 nick+0.090
aver+0.088
Wh+0.084
 gust+0.083
Neg logit contributions
 disl-0.067
 Majesty-0.059
 means-0.058
 fault-0.057
 purposes-0.056
Token She
Token ID1375
Feature activation+8.08e-03

Pos logit contributions
 full+0.033
hes+0.033
 tucked+0.032
 nick+0.030
 sn+0.030
Neg logit contributions
 disl-0.045
 Majesty-0.044
 endings-0.039
 fault-0.038
idents-0.037
Token saw
Token ID2497
Feature activation+7.97e-03

Pos logit contributions
 Majesty+0.011
 disl+0.011
 means+0.009
 sibling+0.009
 majesty+0.009
Neg logit contributions
 leash-0.012
 nick-0.012
hes-0.012
 gust-0.012
 full-0.012
Token Lily
Token ID20037
Feature activation+2.40e-02

Pos logit contributions
 disl+0.015
 Majesty+0.015
 endings+0.013
 fault+0.013
idents+0.013
Neg logit contributions
 full-0.010
hes-0.009
 tucked-0.009
 nick-0.009
 sn-0.009
Token crou
Token ID33838
Feature activation-1.95e-01

Pos logit contributions
 disl+0.044
 Majesty+0.043
 endings+0.038
 fault+0.038
 appearance+0.037
Neg logit contributions
 full-0.031
hes-0.031
 tucked-0.029
 nick-0.028
 bra-0.027
Tokenching
Token ID10813
Feature activation+2.04e-02

Pos logit contributions
 full+0.333
On+0.322
 masse+0.321
Un+0.320
 curly+0.317
Neg logit contributions
vered-0.178
ry-0.176
 anymore-0.170
 Majesty-0.161
 purposes-0.159
Token behind
Token ID2157
Feature activation+1.01e-02

Pos logit contributions
 means+0.027
 endings+0.026
 anymore+0.026
 beings+0.026
 disl+0.025
Neg logit contributions
aver-0.032
hes-0.031
Phill-0.030
 nick-0.029
 tucked-0.028
Token the
Token ID262
Feature activation+1.34e-02

Pos logit contributions
 disl+0.014
 Majesty+0.014
 endings+0.012
 appearance+0.012
 fault+0.012
Neg logit contributions
hes-0.017
aver-0.016
 full-0.016
 tucked-0.015
 nick-0.015
Token clothes
Token ID8242
Feature activation-6.08e-02

Pos logit contributions
 disl+0.023
 Majesty+0.022
 endings+0.020
 appearance+0.020
 beings+0.019
Neg logit contributions
hes-0.020
 full-0.018
aver-0.017
 nick-0.017
 ped-0.017
Token.
Token ID13
Feature activation-1.97e-02

Pos logit contributions
hes+0.084
 full+0.082
 tucked+0.079
 nick+0.078
 sn+0.075
Neg logit contributions
 disl-0.103
 Majesty-0.100
 endings-0.089
 fault-0.089
 appearance-0.086
Token surrender
Token ID16908
Feature activation+2.28e-02

Pos logit contributions
 Majesty+0.127
 disl+0.115
idents+0.104
 endings+0.102
 appearance+0.100
Neg logit contributions
 full-0.083
 called-0.078
hes-0.077
 j-0.074
 tucked-0.073
Token meant
Token ID4001
Feature activation+4.67e-02

Pos logit contributions
 Majesty+0.045
 disl+0.044
 endings+0.036
idents+0.035
 preferences+0.034
Neg logit contributions
 full-0.032
 tucked-0.030
hes-0.029
 called-0.029
 closed-0.027
Token,
Token ID11
Feature activation+8.86e-03

Pos logit contributions
 Majesty+0.107
 disl+0.099
roc+0.090
cient+0.089
 majesty+0.087
Neg logit contributions
 full-0.056
 called-0.049
 tucked-0.046
 j-0.044
 closed-0.043
Token so
Token ID523
Feature activation+1.51e-02

Pos logit contributions
 disl+0.018
 Majesty+0.017
 endings+0.015
 fault+0.015
 appearance+0.015
Neg logit contributions
hes-0.010
 full-0.010
 tucked-0.010
 nick-0.009
 sn-0.009
Token Daddy
Token ID26926
Feature activation+1.07e-02

Pos logit contributions
 disl+0.029
 Majesty+0.028
 fault+0.025
 endings+0.025
 appearance+0.024
Neg logit contributions
 full-0.021
 tucked-0.019
hes-0.019
 called-0.018
 sn-0.018
Token crou
Token ID33838
Feature activation-1.93e-01

Pos logit contributions
 disl+0.023
 Majesty+0.022
 endings+0.020
 fault+0.019
 appearance+0.019
Neg logit contributions
 full-0.012
hes-0.011
 tucked-0.011
 sn-0.010
 called-0.010
Tokenched
Token ID1740
Feature activation-4.47e-02

Pos logit contributions
 full+0.311
 tucked+0.293
On+0.282
 com+0.279
 nick+0.277
Neg logit contributions
 disl-0.255
 means-0.227
 Majesty-0.224
vered-0.221
cient-0.219
Token down
Token ID866
Feature activation-6.10e-03

Pos logit contributions
hes+0.071
 full+0.070
 tucked+0.068
aver+0.064
 nick+0.064
Neg logit contributions
 disl-0.063
 Majesty-0.059
 means-0.055
 endings-0.055
 beings-0.054
Token and
Token ID290
Feature activation-9.39e-02

Pos logit contributions
Wh+0.010
hes+0.010
aver+0.010
 nick+0.009
Loop+0.009
Neg logit contributions
 disl-0.007
 means-0.007
 fault-0.007
 Majesty-0.007
 beings-0.007
Token said
Token ID531
Feature activation-1.99e-02

Pos logit contributions
hes+0.116
 full+0.111
 tucked+0.103
 nick+0.102
On+0.100
Neg logit contributions
 disl-0.177
 Majesty-0.170
 endings-0.150
 fault-0.145
idents-0.144
Token,
Token ID11
Feature activation-5.96e-03

Pos logit contributions
 full+0.028
 j+0.025
 sn+0.025
hes+0.024
 tucked+0.024
Neg logit contributions
 disl-0.037
 Majesty-0.035
 endings-0.033
idents-0.032
 appearance-0.032
Token theater
Token ID13766
Feature activation-2.07e-02

Pos logit contributions
hes+0.007
 full+0.007
 tucked+0.007
 ped+0.007
 nick+0.006
Neg logit contributions
 disl-0.009
 Majesty-0.009
 endings-0.008
 appearance-0.008
 fault-0.008
Token and
Token ID290
Feature activation-9.27e-02

Pos logit contributions
 full+0.028
 tucked+0.026
hes+0.026
 called+0.025
 sn+0.025
Neg logit contributions
 disl-0.040
 Majesty-0.038
 endings-0.034
 fault-0.034
 preferences-0.033
Token Larry
Token ID13633
Feature activation+2.90e-02

Pos logit contributions
 full+0.122
 tucked+0.103
 called+0.102
 sn+0.101
hes+0.100
Neg logit contributions
 disl-0.197
 Majesty-0.194
 fault-0.179
 preferences-0.168
 endings-0.167
Token felt
Token ID2936
Feature activation-8.66e-02

Pos logit contributions
 disl+0.059
 Majesty+0.054
 endings+0.049
 fault+0.049
idents+0.048
Neg logit contributions
 full-0.040
 tucked-0.039
 called-0.039
 sn-0.039
 closed-0.037
Token his
Token ID465
Feature activation-2.64e-02

Pos logit contributions
 full+0.075
hes+0.074
 tucked+0.070
 nick+0.064
 sn+0.063
Neg logit contributions
 disl-0.198
 Majesty-0.193
 endings-0.176
 fault-0.172
 appearance-0.169
Token eyel
Token ID29708
Feature activation-1.92e-01

Pos logit contributions
hes+0.032
 full+0.031
 tucked+0.030
On+0.028
 called+0.028
Neg logit contributions
 disl-0.051
 Majesty-0.050
 endings-0.045
 fault-0.045
 appearance-0.043
Tokenids
Token ID2340
Feature activation-9.83e-02

Pos logit contributions
On+0.301
hes+0.300
 tucked+0.294
 full+0.292
 parked+0.288
Neg logit contributions
 Majesty-0.309
 disl-0.294
 fault-0.289
 endings-0.240
 appearance-0.237
Token become
Token ID1716
Feature activation-8.15e-02

Pos logit contributions
hes+0.125
 full+0.124
 tucked+0.118
 nick+0.113
 sn+0.111
Neg logit contributions
 disl-0.182
 Majesty-0.177
 endings-0.157
 fault-0.155
 appearance-0.150
Token heavy
Token ID4334
Feature activation+4.19e-02

Pos logit contributions
hes+0.078
 nick+0.076
Grab+0.074
Wh+0.074
 tucked+0.073
Neg logit contributions
 Majesty-0.167
 disl-0.161
 fault-0.148
 beings-0.144
 endings-0.144
Token.
Token ID13
Feature activation-2.47e-02

Pos logit contributions
 disl+0.089
 Majesty+0.083
 endings+0.073
idents+0.071
 preferences+0.070
Neg logit contributions
 full-0.051
 tucked-0.048
 sn-0.045
hes-0.044
 called-0.044
Token He
Token ID679
Feature activation+2.57e-03

Pos logit contributions
 full+0.021
hes+0.019
 nick+0.018
 called+0.017
On+0.017
Neg logit contributions
 disl-0.059
 Majesty-0.057
 fault-0.055
 appearance-0.053
 property-0.052
Token and
Token ID290
Feature activation-7.05e-02

Pos logit contributions
 disl+0.041
 Majesty+0.040
 endings+0.036
 fault+0.035
idents+0.034
Neg logit contributions
 full-0.021
hes-0.020
 tucked-0.020
 sn-0.018
 called-0.018
Token saw
Token ID2497
Feature activation+1.85e-02

Pos logit contributions
 full+0.092
hes+0.089
 tucked+0.088
 sn+0.083
 called+0.083
Neg logit contributions
 disl-0.135
 Majesty-0.132
 endings-0.119
 fault-0.118
 appearance-0.113
Token him
Token ID683
Feature activation-1.92e-02

Pos logit contributions
 Majesty+0.040
 disl+0.036
 preferences+0.034
 fault+0.033
 majesty+0.033
Neg logit contributions
 called-0.021
hes-0.021
 full-0.020
 tucked-0.019
 j-0.019
Token.
Token ID13
Feature activation-1.47e-02

Pos logit contributions
hes+0.029
 full+0.029
 tucked+0.028
 nick+0.027
aver+0.027
Neg logit contributions
 disl-0.030
 Majesty-0.027
 fault-0.025
 endings-0.025
 appearance-0.024
Token She
Token ID1375
Feature activation+1.26e-02

Pos logit contributions
 full+0.017
 tucked+0.015
hes+0.015
 called+0.015
 sn+0.015
Neg logit contributions
 disl-0.031
 Majesty-0.030
 fault-0.027
 endings-0.027
 preferences-0.026
Token crou
Token ID33838
Feature activation-1.92e-01

Pos logit contributions
 disl+0.018
 Majesty+0.016
 means+0.015
cient+0.014
flies+0.014
Neg logit contributions
hes-0.020
 full-0.019
 nick-0.018
 leash-0.018
 bra-0.018
Tokenched
Token ID1740
Feature activation-3.33e-02

Pos logit contributions
 full+0.388
 com+0.364
Grand+0.364
 masse+0.363
 convey+0.361
Neg logit contributions
vered-0.163
iling-0.125
rums-0.117
 endings-0.109
 purposes-0.109
Token down
Token ID866
Feature activation+7.03e-03

Pos logit contributions
hes+0.056
 full+0.056
 tucked+0.054
On+0.051
aver+0.051
Neg logit contributions
 disl-0.044
 Majesty-0.041
 means-0.039
 beings-0.039
 endings-0.039
Token and
Token ID290
Feature activation-8.50e-02

Pos logit contributions
 means+0.007
 anymore+0.006
 disl+0.005
 beings+0.005
 fault+0.005
Neg logit contributions
aver-0.014
hes-0.013
Ta-0.013
Wh-0.013
Ring-0.013
Token put
Token ID1234
Feature activation-8.30e-02

Pos logit contributions
 full+0.114
hes+0.113
 nick+0.105
 leash+0.105
On+0.103
Neg logit contributions
 disl-0.143
 Majesty-0.130
 means-0.121
idents-0.112
 endings-0.111
Token her
Token ID607
Feature activation-1.13e-02

Pos logit contributions
On+0.144
 full+0.137
 tucked+0.135
 masse+0.134
Grand+0.131
Neg logit contributions
 disl-0.093
 means-0.086
 ourselves-0.082
 relationships-0.080
 fault-0.079
Token cave
Token ID11527
Feature activation-3.89e-03

Pos logit contributions
hes+0.018
 tucked+0.016
 full+0.016
 ped+0.016
 sn+0.015
Neg logit contributions
 Majesty-0.016
 disl-0.016
 endings-0.014
 appearance-0.014
 relationships-0.013
Token,
Token ID11
Feature activation-6.94e-03

Pos logit contributions
 full+0.006
 tucked+0.005
hes+0.005
 called+0.005
 sn+0.005
Neg logit contributions
 disl-0.008
 Majesty-0.007
 fault-0.006
 endings-0.006
 preferences-0.006
Token but
Token ID475
Feature activation-1.61e-02

Pos logit contributions
hes+0.009
 full+0.009
 tucked+0.008
 nick+0.008
 sn+0.008
Neg logit contributions
 disl-0.013
 Majesty-0.013
 endings-0.011
 fault-0.011
 appearance-0.011
Token the
Token ID262
Feature activation-2.36e-02

Pos logit contributions
 full+0.022
 tucked+0.019
 called+0.019
 closed+0.019
hes+0.018
Neg logit contributions
 disl-0.033
 Majesty-0.032
 fault-0.028
 endings-0.028
 preferences-0.027
Token hero
Token ID4293
Feature activation+1.56e-02

Pos logit contributions
hes+0.036
 tucked+0.033
 full+0.033
 draped+0.030
 ped+0.030
Neg logit contributions
 Majesty-0.040
 disl-0.037
 endings-0.035
 appearance-0.033
 relationships-0.032
Token per
Token ID583
Feature activation-1.90e-01

Pos logit contributions
 disl+0.029
 Majesty+0.028
 endings+0.025
 fault+0.025
 appearance+0.024
Neg logit contributions
 full-0.021
 tucked-0.020
hes-0.020
 called-0.019
 sn-0.019
Tokense
Token ID325
Feature activation-4.70e-02

Pos logit contributions
 full+0.329
hes+0.324
 tucked+0.306
 called+0.290
 sn+0.285
Neg logit contributions
 Majesty-0.228
cient-0.219
 disl-0.213
flies-0.207
 endings-0.197
Tokenvered
Token ID21917
Feature activation+4.09e-02

Pos logit contributions
 full+0.096
 tucked+0.095
hes+0.094
 sn+0.092
 called+0.091
Neg logit contributions
 disl-0.057
 Majesty-0.053
 fault-0.042
idents-0.041
 endings-0.041
Token.
Token ID13
Feature activation-2.82e-02

Pos logit contributions
 disl+0.082
 Majesty+0.079
 endings+0.069
idents+0.068
 fault+0.067
Neg logit contributions
 full-0.052
 tucked-0.048
hes-0.047
 sn-0.046
 called-0.045
Token He
Token ID679
Feature activation-4.80e-04

Pos logit contributions
 full+0.039
hes+0.039
 tucked+0.038
 sn+0.035
 nick+0.035
Neg logit contributions
 disl-0.050
 Majesty-0.049
 endings-0.043
 fault-0.042
idents-0.041
Token searched
Token ID16499
Feature activation-7.72e-02

Pos logit contributions
hes+0.001
 full+0.001
 nick+0.001
 tucked+0.001
 gust+0.001
Neg logit contributions
 disl-0.001
 Majesty-0.001
 endings-0.001
flies-0.001
 means-0.001
Token get
Token ID651
Feature activation-8.11e-02

Pos logit contributions
 disl+0.054
 Majesty+0.049
 endings+0.043
 appearance+0.042
 preferences+0.042
Neg logit contributions
hes-0.039
 full-0.036
 tucked-0.034
 ped-0.032
 prec-0.032
Token closer
Token ID5699
Feature activation+6.10e-02

Pos logit contributions
hes+0.104
 full+0.099
 tucked+0.098
On+0.093
 ped+0.092
Neg logit contributions
 disl-0.150
 Majesty-0.137
 endings-0.123
 fault-0.121
 beings-0.120
Token,
Token ID11
Feature activation+1.79e-02

Pos logit contributions
 disl+0.141
 Majesty+0.133
 endings+0.119
cient+0.116
 preferences+0.115
Neg logit contributions
 full-0.074
 tucked-0.064
 called-0.063
 j-0.062
 sn-0.062
Token so
Token ID523
Feature activation+1.84e-02

Pos logit contributions
 disl+0.030
 Majesty+0.029
 endings+0.025
idents+0.025
 fault+0.025
Neg logit contributions
hes-0.026
 full-0.024
 tucked-0.024
aver-0.023
 nick-0.023
Token he
Token ID339
Feature activation+2.25e-02

Pos logit contributions
 disl+0.028
 Majesty+0.027
 endings+0.023
 fault+0.023
idents+0.022
Neg logit contributions
hes-0.030
 full-0.029
 tucked-0.028
 nick-0.027
 bra-0.026
Token crou
Token ID33838
Feature activation-1.90e-01

Pos logit contributions
 Majesty+0.038
 disl+0.036
by+0.035
flies+0.035
 means+0.033
Neg logit contributions
 full-0.027
hes-0.026
 nick-0.024
 gust-0.023
wered-0.023
Tokenched
Token ID1740
Feature activation-4.01e-02

Pos logit contributions
 convey+0.356
 full+0.347
 curly+0.343
Kim+0.342
Grand+0.342
Neg logit contributions
vered-0.163
 disl-0.161
 means-0.147
form-0.146
cient-0.142
Token down
Token ID866
Feature activation-5.52e-04

Pos logit contributions
hes+0.077
 convey+0.072
 full+0.072
 tucked+0.071
 masse+0.071
Neg logit contributions
 means-0.037
by-0.033
 disl-0.033
fl-0.031
 ourselves-0.031
Token and
Token ID290
Feature activation-8.82e-02

Pos logit contributions
hes+0.001
aver+0.001
Wh+0.001
Ta+0.001
 nick+0.001
Neg logit contributions
 means-0.000
 anymore-0.000
 disl-0.000
 fault-0.000
minded-0.000
Token watched
Token ID7342
Feature activation-2.51e-02

Pos logit contributions
hes+0.089
 full+0.087
 tucked+0.078
 nick+0.075
On+0.074
Neg logit contributions
 disl-0.189
 Majesty-0.183
 endings-0.163
idents-0.159
 appearance-0.158
Token.
Token ID13
Feature activation-1.13e-02

Pos logit contributions
hes+0.046
Wh+0.044
 tucked+0.044
aver+0.043
 full+0.043
Neg logit contributions
 disl-0.032
 Majesty-0.029
 endings-0.025
 fault-0.023
 relationships-0.023
Token.
Token ID13
Feature activation-8.92e-03

Pos logit contributions
 disl+0.044
 Majesty+0.042
 endings+0.038
idents+0.037
 beings+0.037
Neg logit contributions
 full-0.019
 sn-0.017
hes-0.017
 tucked-0.017
 bra-0.016
Token Every
Token ID3887
Feature activation+4.06e-02

Pos logit contributions
 full+0.009
hes+0.008
On+0.008
 called+0.007
 tucked+0.007
Neg logit contributions
 disl-0.020
 Majesty-0.020
 beings-0.018
 endings-0.018
 fault-0.017
Token day
Token ID1110
Feature activation+6.18e-02

Pos logit contributions
 disl+0.064
 Majesty+0.062
 endings+0.056
 fault+0.054
 appearance+0.052
Neg logit contributions
hes-0.063
 full-0.061
 tucked-0.059
 ped-0.056
Wh-0.055
Token she
Token ID673
Feature activation+1.67e-02

Pos logit contributions
 disl+0.087
 Majesty+0.083
 endings+0.075
 appearance+0.070
 purposes+0.068
Neg logit contributions
hes-0.106
 full-0.101
 tucked-0.098
 nick-0.096
aver-0.094
Token would
Token ID561
Feature activation-3.52e-02

Pos logit contributions
 disl+0.025
 Majesty+0.025
 endings+0.021
 fault+0.020
 appearance+0.020
Neg logit contributions
 full-0.028
hes-0.027
 tucked-0.026
 nick-0.025
 sn-0.025
Token crou
Token ID33838
Feature activation-1.89e-01

Pos logit contributions
 full+0.035
hes+0.033
 tucked+0.032
 sn+0.030
 nick+0.029
Neg logit contributions
 disl-0.078
 Majesty-0.075
 endings-0.069
 fault-0.067
 preferences-0.066
Tokench
Token ID354
Feature activation-1.05e-01

Pos logit contributions
 full+0.271
 tucked+0.260
hes+0.255
 nick+0.253
Wh+0.249
Neg logit contributions
 disl-0.277
 Majesty-0.262
cient-0.245
 means-0.239
 relationships-0.235
Token down
Token ID866
Feature activation-2.53e-03

Pos logit contributions
 full+0.095
 tucked+0.085
hes+0.083
 nick+0.082
On+0.077
Neg logit contributions
 Majesty-0.224
 disl-0.224
 fault-0.206
 endings-0.202
flies-0.199
Token and
Token ID290
Feature activation-8.87e-02

Pos logit contributions
hes+0.004
aver+0.004
Wh+0.004
 bisc+0.004
Ring+0.004
Neg logit contributions
 means-0.003
 Majesty-0.003
flies-0.003
 anymore-0.003
 beings-0.003
Token demand
Token ID3512
Feature activation-7.55e-04

Pos logit contributions
 full+0.084
 tucked+0.082
 sn+0.081
 closed+0.076
hes+0.075
Neg logit contributions
 disl-0.202
 Majesty-0.196
 endings-0.184
 fault-0.178
 beings-0.176
Token to
Token ID284
Feature activation+1.58e-02

Pos logit contributions
 full+0.001
hes+0.001
 tucked+0.001
 nick+0.001
 sn+0.001
Neg logit contributions
 disl-0.001
 Majesty-0.001
 endings-0.001
 fault-0.001
 appearance-0.001
Token to
Token ID284
Feature activation+2.51e-03

Pos logit contributions
 disl+0.078
 Majesty+0.076
 endings+0.066
cient+0.065
idents+0.064
Neg logit contributions
 full-0.037
 tucked-0.035
 sn-0.032
 called-0.032
 closed-0.030
Token reach
Token ID3151
Feature activation-5.53e-02

Pos logit contributions
 disl+0.005
 Majesty+0.005
 fault+0.004
 appearance+0.004
 ourselves+0.004
Neg logit contributions
hes-0.003
 full-0.003
 tucked-0.003
Say-0.003
 nick-0.003
Token.
Token ID13
Feature activation-2.78e-02

Pos logit contributions
hes+0.095
 full+0.088
 tucked+0.086
 nick+0.084
aver+0.083
Neg logit contributions
 disl-0.077
 Majesty-0.076
 fault-0.068
 endings-0.065
 appearance-0.065
Token The
Token ID383
Feature activation-1.48e-02

Pos logit contributions
 full+0.039
hes+0.039
 tucked+0.038
 sn+0.035
 nick+0.035
Neg logit contributions
 disl-0.049
 Majesty-0.048
 endings-0.042
 fault-0.041
 appearance-0.040
Token turtle
Token ID28699
Feature activation-2.86e-02

Pos logit contributions
 full+0.021
hes+0.020
 tucked+0.019
 sn+0.019
 nick+0.018
Neg logit contributions
 disl-0.027
 Majesty-0.026
 endings-0.023
 fault-0.023
idents-0.022
Token per
Token ID583
Feature activation-1.88e-01

Pos logit contributions
 full+0.036
 tucked+0.034
hes+0.033
 called+0.033
 sn+0.033
Neg logit contributions
 disl-0.057
 Majesty-0.055
 endings-0.049
 fault-0.049
 appearance-0.047
Tokense
Token ID325
Feature activation-4.81e-02

Pos logit contributions
hes+0.363
 full+0.355
 tucked+0.340
 called+0.322
 nick+0.318
Neg logit contributions
 disl-0.195
 Majesty-0.195
flies-0.185
cient-0.182
idents-0.166
Tokenvered
Token ID21917
Feature activation+4.43e-02

Pos logit contributions
 full+0.104
 tucked+0.102
 sn+0.101
hes+0.101
 ped+0.100
Neg logit contributions
 disl-0.052
 Majesty-0.049
 fault-0.037
 endings-0.037
idents-0.037
Token though
Token ID996
Feature activation+7.18e-02

Pos logit contributions
 disl+0.091
 Majesty+0.088
 endings+0.078
 fault+0.076
 preferences+0.075
Neg logit contributions
 full-0.052
 tucked-0.049
hes-0.048
 sn-0.047
 called-0.045
Token,
Token ID11
Feature activation+1.56e-03

Pos logit contributions
 disl+0.104
 Majesty+0.101
 endings+0.089
 fault+0.086
 appearance+0.083
Neg logit contributions
hes-0.123
 full-0.117
 tucked-0.115
 nick-0.110
 sn-0.108
Token padd
Token ID14098
Feature activation-1.15e-01

Pos logit contributions
 disl+0.003
 Majesty+0.002
 endings+0.002
 fault+0.002
 appearance+0.002
Neg logit contributions
hes-0.002
 full-0.002
 tucked-0.002
 nick-0.002
 sn-0.002
Token
Token ID198
Feature activation-8.22e-03

Pos logit contributions
 full+0.006
hes+0.006
 tucked+0.006
 sn+0.005
 nick+0.005
Neg logit contributions
 disl-0.008
 Majesty-0.008
 endings-0.007
 fault-0.007
 appearance-0.007
TokenM
Token ID44
Feature activation-2.77e-02

Pos logit contributions
 full+0.008
 tucked+0.008
hes+0.008
 sn+0.007
 nick+0.007
Neg logit contributions
 disl-0.018
 Majesty-0.017
 endings-0.015
 preferences-0.015
 purposes-0.015
Tokenia
Token ID544
Feature activation-2.55e-02

Pos logit contributions
 full+0.045
hes+0.045
 tucked+0.043
 sn+0.042
 nick+0.041
Neg logit contributions
 disl-0.043
 Majesty-0.042
 endings-0.036
 fault-0.036
 purposes-0.034
Token's
Token ID338
Feature activation-1.47e-02

Pos logit contributions
 full+0.033
hes+0.032
 tucked+0.032
 sn+0.030
 called+0.029
Neg logit contributions
 disl-0.048
 Majesty-0.047
 endings-0.042
 fault-0.041
 appearance-0.040
Token mom
Token ID1995
Feature activation-6.19e-03

Pos logit contributions
hes+0.021
 full+0.020
 tucked+0.019
 nick+0.018
On+0.018
Neg logit contributions
 disl-0.026
 Majesty-0.026
 endings-0.022
 fault-0.022
 appearance-0.022
Token crou
Token ID33838
Feature activation-1.88e-01

Pos logit contributions
hes+0.007
 full+0.007
 tucked+0.007
 nick+0.007
 sn+0.006
Neg logit contributions
 disl-0.012
 Majesty-0.012
 endings-0.010
 fault-0.010
idents-0.010
Tokenched
Token ID1740
Feature activation-2.96e-02

Pos logit contributions
 full+0.302
 leash+0.282
Wh+0.282
 convey+0.281
 tucked+0.280
Neg logit contributions
 disl-0.228
vered-0.206
 Majesty-0.199
 means-0.198
form-0.198
Token down
Token ID866
Feature activation+1.17e-02

Pos logit contributions
hes+0.043
 full+0.042
 tucked+0.041
 nick+0.038
 ped+0.038
Neg logit contributions
 disl-0.049
 Majesty-0.048
 endings-0.043
 fault-0.042
 beings-0.041
Token and
Token ID290
Feature activation-8.05e-02

Pos logit contributions
 disl+0.015
 Majesty+0.014
 means+0.014
 fault+0.014
 beings+0.013
Neg logit contributions
hes-0.019
aver-0.018
Wh-0.018
 nick-0.017
Hold-0.017
Token said
Token ID531
Feature activation-8.19e-03

Pos logit contributions
hes+0.092
 full+0.089
 nick+0.084
aver+0.083
On+0.081
Neg logit contributions
 disl-0.156
 Majesty-0.150
 means-0.130
idents-0.128
 endings-0.127
Token,
Token ID11
Feature activation+7.22e-03

Pos logit contributions
 full+0.012
 tucked+0.011
 sn+0.011
 j+0.011
hes+0.011
Neg logit contributions
 disl-0.015
 Majesty-0.014
 endings-0.013
 property-0.013
 fault-0.013
Token.
Token ID13
Feature activation-2.88e-02

Pos logit contributions
 full+0.056
 tucked+0.051
hes+0.048
 called+0.047
 sn+0.047
Neg logit contributions
 disl-0.097
 Majesty-0.093
 endings-0.082
idents-0.080
 preferences-0.079
Token She
Token ID1375
Feature activation+2.32e-03

Pos logit contributions
 full+0.042
hes+0.039
 tucked+0.037
On+0.036
 called+0.036
Neg logit contributions
 disl-0.051
 Majesty-0.050
 endings-0.044
 appearance-0.043
 fault-0.043
Token explained
Token ID4893
Feature activation+1.40e-02

Pos logit contributions
 disl+0.004
 Majesty+0.004
 means+0.003
flies+0.003
 endings+0.003
Neg logit contributions
hes-0.003
 full-0.003
 nick-0.003
 gust-0.003
 leash-0.003
Token how
Token ID703
Feature activation+2.45e-02

Pos logit contributions
 disl+0.025
 Majesty+0.025
 fault+0.022
 endings+0.022
idents+0.022
Neg logit contributions
 full-0.019
 called-0.018
 tucked-0.017
 sn-0.017
 bra-0.017
Token to
Token ID284
Feature activation+1.13e-02

Pos logit contributions
 Majesty+0.048
 disl+0.046
 endings+0.041
 fault+0.041
 appearance+0.041
Neg logit contributions
 full-0.033
hes-0.030
 tucked-0.028
 j-0.028
 sn-0.027
Token crou
Token ID33838
Feature activation-1.88e-01

Pos logit contributions
 disl+0.023
 Majesty+0.022
 endings+0.020
 fault+0.019
idents+0.019
Neg logit contributions
 full-0.013
hes-0.013
 tucked-0.013
 sn-0.012
 called-0.012
Tokench
Token ID354
Feature activation-1.10e-01

Pos logit contributions
 full+0.245
hes+0.243
 tucked+0.242
 nick+0.229
Wh+0.222
Neg logit contributions
 disl-0.330
 Majesty-0.302
 means-0.278
 fault-0.276
 endings-0.275
Token down
Token ID866
Feature activation-9.69e-03

Pos logit contributions
hes+0.142
 full+0.141
 tucked+0.134
 sn+0.128
 called+0.126
Neg logit contributions
 disl-0.208
 Majesty-0.203
 endings-0.181
 fault-0.173
 appearance-0.172
Token and
Token ID290
Feature activation-9.45e-02

Pos logit contributions
hes+0.013
 tucked+0.012
 full+0.012
 nick+0.012
Wh+0.012
Neg logit contributions
 disl-0.017
 Majesty-0.016
 endings-0.015
 fault-0.015
 appearance-0.014
Token support
Token ID1104
Feature activation-7.33e-04

Pos logit contributions
 full+0.111
 tucked+0.105
hes+0.101
 sn+0.099
 called+0.098
Neg logit contributions
 disl-0.193
 Majesty-0.188
 fault-0.170
 endings-0.170
idents-0.164
Token herself
Token ID5223
Feature activation+2.49e-03

Pos logit contributions
 full+0.001
hes+0.001
 tucked+0.001
 called+0.001
 sn+0.001
Neg logit contributions
 disl-0.001
 Majesty-0.001
 endings-0.001
 fault-0.001
idents-0.001
Token"
Token ID1
Feature activation+3.16e-02

Pos logit contributions
 full+0.032
hes+0.031
 tucked+0.030
On+0.028
 called+0.028
Neg logit contributions
 disl-0.050
 Majesty-0.049
 endings-0.044
 fault-0.044
 appearance-0.042
TokenIt
Token ID1026
Feature activation+6.35e-03

Pos logit contributions
 disl+0.055
 Majesty+0.044
 preferences+0.043
 endings+0.042
 means+0.041
Neg logit contributions
 tucked-0.049
 full-0.046
 nick-0.045
 sn-0.044
hes-0.044
Token's
Token ID338
Feature activation-2.62e-02

Pos logit contributions
 disl+0.010
 endings+0.010
 Majesty+0.010
 perspective+0.009
 relationships+0.009
Neg logit contributions
 sn-0.011
 called-0.010
hes-0.010
 tucked-0.010
 full-0.010
Token one
Token ID530
Feature activation-3.05e-03

Pos logit contributions
 ped+0.038
hes+0.035
 nick+0.035
Grab+0.035
 tucked+0.034
Neg logit contributions
 disl-0.043
 Majesty-0.041
 fault-0.039
 our-0.038
 ourselves-0.037
Token dollar
Token ID8872
Feature activation-1.88e-02

Pos logit contributions
hes+0.004
 full+0.004
 nick+0.004
 tucked+0.004
 ped+0.004
Neg logit contributions
 disl-0.006
 Majesty-0.005
 fault-0.005
 endings-0.005
 beings-0.005
Token per
Token ID583
Feature activation-1.87e-01

Pos logit contributions
 full+0.028
hes+0.028
 tucked+0.027
 sn+0.026
 nick+0.025
Neg logit contributions
 disl-0.031
 Majesty-0.030
 endings-0.027
 fault-0.026
 appearance-0.025
Token cup
Token ID6508
Feature activation-7.57e-02

Pos logit contributions
 full+0.223
 tucked+0.212
hes+0.206
 ped+0.200
Su+0.195
Neg logit contributions
 Majesty-0.320
cient-0.310
ish-0.307
 fault-0.307
 means-0.307
Token.
Token ID13
Feature activation-2.44e-02

Pos logit contributions
 nick+0.109
 ped+0.104
hes+0.103
 full+0.101
aver+0.100
Neg logit contributions
 disl-0.111
 means-0.107
 Majesty-0.105
 fault-0.102
 endings-0.101
Token But
Token ID887
Feature activation+1.58e-02

Pos logit contributions
 full+0.044
hes+0.044
 tucked+0.043
 nick+0.042
 ped+0.041
Neg logit contributions
 disl-0.033
 Majesty-0.032
 endings-0.026
 fault-0.025
 preferences-0.025
Token I
Token ID314
Feature activation-1.62e-02

Pos logit contributions
 disl+0.024
 Majesty+0.023
 endings+0.020
 fault+0.019
 purposes+0.018
Neg logit contributions
 full-0.027
hes-0.026
 tucked-0.026
 sn-0.025
 called-0.025
Token have
Token ID423
Feature activation-2.19e-02

Pos logit contributions
 full+0.025
hes+0.024
 tucked+0.024
 sn+0.023
 called+0.022
Neg logit contributions
 Majesty-0.028
 disl-0.027
 endings-0.025
 appearance-0.023
 relationships-0.022
Token can
Token ID460
Feature activation+8.16e-02

Pos logit contributions
 disl+0.145
 Majesty+0.144
 appearance+0.127
 endings+0.125
idents+0.124
Neg logit contributions
 full-0.112
 called-0.102
 tucked-0.098
hes-0.097
 sn-0.095
Token't
Token ID470
Feature activation+5.79e-02

Pos logit contributions
 disl+0.171
 Majesty+0.167
 endings+0.158
 beings+0.153
flies+0.148
Neg logit contributions
 full-0.096
 called-0.085
 sn-0.085
 tucked-0.084
 j-0.083
Token play
Token ID711
Feature activation+2.96e-02

Pos logit contributions
 disl+0.110
 Majesty+0.106
 endings+0.095
 fault+0.092
 appearance+0.090
Neg logit contributions
 full-0.074
hes-0.074
 tucked-0.071
 nick-0.066
 sn-0.066
Token with
Token ID351
Feature activation+2.43e-02

Pos logit contributions
 Majesty+0.065
 disl+0.065
 endings+0.058
 fault+0.057
 purposes+0.057
Neg logit contributions
 full-0.028
hes-0.028
 j-0.027
 called-0.027
 sn-0.026
Token paper
Token ID3348
Feature activation+3.73e-04

Pos logit contributions
 disl+0.039
 Majesty+0.038
 endings+0.033
 perspectives+0.032
 purposes+0.032
Neg logit contributions
hes-0.037
 full-0.036
 sn-0.036
 called-0.036
 tucked-0.035
Token anymore
Token ID7471
Feature activation+2.13e-01

Pos logit contributions
 disl+0.001
 Majesty+0.001
 endings+0.001
 fault+0.001
 perspectives+0.001
Neg logit contributions
 full-0.000
 tucked-0.000
hes-0.000
 called-0.000
 sn-0.000
Token.
Token ID13
Feature activation+2.74e-02

Pos logit contributions
 Majesty+0.418
 disl+0.412
roc+0.374
cient+0.373
 fault+0.367
Neg logit contributions
 called-0.264
 full-0.252
 sn-0.242
hes-0.236
 j-0.234
Token It
Token ID632
Feature activation+3.31e-02

Pos logit contributions
 disl+0.038
 Majesty+0.038
 endings+0.033
 fault+0.031
 appearance+0.030
Neg logit contributions
 full-0.047
hes-0.047
 tucked-0.045
 called-0.044
 sn-0.043
Token is
Token ID318
Feature activation-4.29e-02

Pos logit contributions
 Majesty+0.056
 disl+0.053
 endings+0.047
 topics+0.045
 perspective+0.044
Neg logit contributions
 called-0.056
 sn-0.056
hes-0.055
 tucked-0.054
 full-0.054
Token not
Token ID407
Feature activation+2.43e-02

Pos logit contributions
 ped+0.059
 nick+0.057
Grab+0.056
hes+0.056
 tucked+0.056
Neg logit contributions
 disl-0.072
 fault-0.067
 Majesty-0.067
 beings-0.060
 preferences-0.060
Token safe
Token ID3338
Feature activation+1.14e-01

Pos logit contributions
 disl+0.047
 Majesty+0.045
 endings+0.040
 fault+0.040
 appearance+0.039
Neg logit contributions
 full-0.030
hes-0.030
 tucked-0.029
 nick-0.027
 sn-0.027
Token.
Token ID13
Feature activation+4.69e-02

Pos logit contributions
 disl+0.031
 Majesty+0.031
 endings+0.027
idents+0.026
 fault+0.025
Neg logit contributions
 full-0.029
hes-0.027
 tucked-0.027
 sn-0.027
 called-0.026
Token They
Token ID1119
Feature activation+2.29e-02

Pos logit contributions
 disl+0.087
 Majesty+0.085
 endings+0.076
 fault+0.075
 appearance+0.073
Neg logit contributions
 full-0.062
hes-0.059
 tucked-0.057
 called-0.056
 sn-0.055
Token were
Token ID547
Feature activation-3.23e-02

Pos logit contributions
 disl+0.043
 Majesty+0.041
 endings+0.037
 fault+0.036
 appearance+0.035
Neg logit contributions
hes-0.029
 full-0.029
 tucked-0.028
 nick-0.026
 sn-0.026
Token not
Token ID407
Feature activation+3.24e-02

Pos logit contributions
 tucked+0.038
On+0.038
 nick+0.038
 full+0.038
hes+0.038
Neg logit contributions
 Majesty-0.058
 disl-0.058
 fault-0.050
 beings-0.049
 endings-0.047
Token bored
Token ID17533
Feature activation+1.21e-01

Pos logit contributions
 disl+0.068
 Majesty+0.067
 endings+0.061
 fault+0.059
idents+0.058
Neg logit contributions
 full-0.035
hes-0.034
 tucked-0.032
 sn-0.030
 called-0.030
Token anymore
Token ID7471
Feature activation+1.98e-01

Pos logit contributions
 Majesty+0.201
 disl+0.188
 majesty+0.153
cient+0.151
idents+0.148
Neg logit contributions
 full-0.214
 called-0.193
 tucked-0.192
 j-0.188
 sn-0.188
Token.
Token ID13
Feature activation+1.83e-02

Pos logit contributions
 Majesty+0.472
 disl+0.445
cient+0.405
idents+0.395
roc+0.394
Neg logit contributions
 full-0.197
 called-0.181
 tucked-0.159
 sn-0.158
 j-0.156
Token They
Token ID1119
Feature activation+1.55e-02

Pos logit contributions
 disl+0.031
 Majesty+0.030
 endings+0.027
 fault+0.026
 appearance+0.025
Neg logit contributions
 full-0.027
hes-0.026
 tucked-0.025
 sn-0.024
 nick-0.024
Token were
Token ID547
Feature activation-3.94e-02

Pos logit contributions
 disl+0.026
 Majesty+0.025
 endings+0.022
 fault+0.022
 appearance+0.021
Neg logit contributions
hes-0.022
 full-0.022
 tucked-0.021
 nick-0.020
On-0.019
Token scared
Token ID12008
Feature activation+8.63e-02

Pos logit contributions
 full+0.046
hes+0.045
 tucked+0.044
 nick+0.042
On+0.041
Neg logit contributions
 disl-0.077
 Majesty-0.075
 fault-0.066
 endings-0.066
 beings-0.064
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
 Majesty+0.196
 disl+0.196
idents+0.165
 fault+0.165
 endings+0.164
Neg logit contributions
 full-0.091
 called-0.081
 tucked-0.081
hes-0.079
 sn-0.078
Token she
Token ID673
Feature activation+3.78e-02

Pos logit contributions
 full+0.068
 sn+0.061
 j+0.059
 tucked+0.058
 called+0.057
Neg logit contributions
 Majesty-0.115
 disl-0.113
 fault-0.105
 endings-0.105
 beings-0.101
Token wasn
Token ID2492
Feature activation-3.19e-02

Pos logit contributions
 disl+0.075
 Majesty+0.072
 endings+0.066
idents+0.063
 fault+0.063
Neg logit contributions
 full-0.047
hes-0.045
 tucked-0.045
 sn-0.044
 called-0.043
Token't
Token ID470
Feature activation+2.79e-02

Pos logit contributions
 tucked+0.067
hes+0.065
 full+0.065
aver+0.063
 nick+0.063
Neg logit contributions
 disl-0.029
 Majesty-0.024
 fault-0.023
idents-0.020
 anymore-0.019
Token dizz
Token ID29527
Feature activation+3.53e-02

Pos logit contributions
 disl+0.059
 Majesty+0.058
 endings+0.052
 fault+0.051
 appearance+0.050
Neg logit contributions
 full-0.029
hes-0.029
 tucked-0.028
 nick-0.025
 sn-0.025
Tokeny
Token ID88
Feature activation+6.88e-02

Pos logit contributions
 disl+0.039
 Majesty+0.038
 endings+0.030
 preferences+0.027
idents+0.027
Neg logit contributions
 full-0.075
hes-0.073
 tucked-0.071
 called-0.069
 sn-0.068
Token anymore
Token ID7471
Feature activation+1.77e-01

Pos logit contributions
 disl+0.113
 Majesty+0.110
 endings+0.088
idents+0.088
 preferences+0.085
Neg logit contributions
 full-0.116
 tucked-0.106
 sn-0.103
 called-0.103
hes-0.102
Token.
Token ID13
Feature activation+2.34e-03

Pos logit contributions
 Majesty+0.407
 disl+0.401
idents+0.356
cient+0.355
roc+0.348
Neg logit contributions
 full-0.183
 called-0.161
 tucked-0.154
 sn-0.150
 j-0.149
Token She
Token ID1375
Feature activation+3.38e-02

Pos logit contributions
 disl+0.005
 Majesty+0.005
 beings+0.004
 endings+0.004
 appearance+0.004
Neg logit contributions
 full-0.003
hes-0.003
 tucked-0.002
 nick-0.002
On-0.002
Token was
Token ID373
Feature activation-6.00e-02

Pos logit contributions
 disl+0.061
 Majesty+0.059
 endings+0.052
 fault+0.051
 appearance+0.049
Neg logit contributions
 full-0.046
hes-0.046
 tucked-0.044
 nick-0.042
 sn-0.041
Token so
Token ID523
Feature activation+2.53e-02

Pos logit contributions
hes+0.065
 full+0.065
 tucked+0.063
 nick+0.059
On+0.057
Neg logit contributions
 disl-0.121
 Majesty-0.119
 fault-0.105
 endings-0.103
 appearance-0.100
Token happy
Token ID3772
Feature activation+1.32e-01

Pos logit contributions
 disl+0.051
 Majesty+0.049
 fault+0.042
 endings+0.042
 preferences+0.041
Neg logit contributions
hes-0.029
 tucked-0.028
 full-0.027
 nick-0.026
 sn-0.026
Token full
Token ID1336
Feature activation+9.42e-02

Pos logit contributions
 full+0.010
hes+0.010
 tucked+0.009
 sn+0.009
 called+0.009
Neg logit contributions
 disl-0.011
 Majesty-0.010
 endings-0.009
 fault-0.009
 appearance-0.008
Token they
Token ID484
Feature activation+7.07e-02

Pos logit contributions
 disl+0.184
 Majesty+0.178
 endings+0.165
 fault+0.158
 purposes+0.153
Neg logit contributions
hes-0.107
 full-0.099
Wh-0.098
 nick-0.097
 tucked-0.096
Token couldn
Token ID3521
Feature activation+5.83e-02

Pos logit contributions
 disl+0.140
 Majesty+0.136
 endings+0.122
 fault+0.119
 appearance+0.116
Neg logit contributions
 full-0.083
hes-0.082
 tucked-0.078
 nick-0.073
 sn-0.073
Token't
Token ID470
Feature activation+4.82e-02

Pos logit contributions
 disl+0.109
 Majesty+0.108
 endings+0.094
 appearance+0.092
 preferences+0.092
Neg logit contributions
 full-0.080
hes-0.077
 tucked-0.076
 sn-0.072
 called-0.071
Token eat
Token ID4483
Feature activation+2.04e-02

Pos logit contributions
 disl+0.082
 Majesty+0.079
 endings+0.069
 fault+0.067
 appearance+0.065
Neg logit contributions
hes-0.071
 full-0.071
 tucked-0.068
 nick-0.064
 sn-0.064
Token anymore
Token ID7471
Feature activation+2.12e-01

Pos logit contributions
 disl+0.039
 Majesty+0.038
 endings+0.033
 fault+0.032
 perspectives+0.032
Neg logit contributions
 full-0.027
hes-0.025
 tucked-0.025
 sn-0.024
 called-0.023
Token cheese
Token ID9891
Feature activation+2.35e-02

Pos logit contributions
 Majesty+0.483
 disl+0.475
cient+0.424
idents+0.420
roc+0.419
Neg logit contributions
 full-0.222
 called-0.197
 sn-0.189
 tucked-0.186
 j-0.173
Token.
Token ID13
Feature activation+3.13e-02

Pos logit contributions
 disl+0.048
 Majesty+0.046
 perspectives+0.040
 endings+0.040
 preferences+0.039
Neg logit contributions
 full-0.030
 tucked-0.028
hes-0.027
 called-0.026
 sn-0.026
Token It
Token ID632
Feature activation+2.81e-02

Pos logit contributions
 disl+0.061
 Majesty+0.059
 endings+0.054
 fault+0.053
 appearance+0.053
Neg logit contributions
 full-0.039
hes-0.036
 called-0.035
 tucked-0.035
On-0.034
Token was
Token ID373
Feature activation-6.20e-02

Pos logit contributions
 disl+0.051
 Majesty+0.051
 endings+0.045
 fault+0.044
 appearance+0.043
Neg logit contributions
 full-0.040
 tucked-0.038
hes-0.038
 sn-0.037
 called-0.036
Token a
Token ID257
Feature activation-3.50e-02

Pos logit contributions
hes+0.063
 tucked+0.059
 ped+0.058
 full+0.058
 nick+0.057
Neg logit contributions
 disl-0.128
 Majesty-0.124
 fault-0.113
 endings-0.109
 preferences-0.106
Token how
Token ID703
Feature activation+7.89e-02

Pos logit contributions
 disl+0.060
 Majesty+0.059
 endings+0.053
 fault+0.052
 appearance+0.051
Neg logit contributions
 full-0.030
hes-0.029
 tucked-0.028
 sn-0.026
 nick-0.026
Token you
Token ID345
Feature activation+8.55e-02

Pos logit contributions
 Majesty+0.140
 disl+0.140
 endings+0.123
 appearance+0.118
 fault+0.117
Neg logit contributions
 full-0.122
hes-0.109
 tucked-0.104
 sn-0.104
 j-0.104
Token make
Token ID787
Feature activation+6.22e-03

Pos logit contributions
 disl+0.156
 Majesty+0.154
 endings+0.138
 fault+0.136
idents+0.130
Neg logit contributions
 full-0.119
 tucked-0.111
hes-0.110
 sn-0.109
 called-0.107
Token each
Token ID1123
Feature activation+1.98e-02

Pos logit contributions
 disl+0.011
 Majesty+0.010
 endings+0.009
idents+0.009
 fault+0.009
Neg logit contributions
 full-0.009
hes-0.009
 tucked-0.008
 sn-0.008
 called-0.008
Token other
Token ID584
Feature activation+1.06e-01

Pos logit contributions
 disl+0.036
 Majesty+0.034
 endings+0.030
 fault+0.029
idents+0.029
Neg logit contributions
 full-0.028
hes-0.027
 tucked-0.026
 sn-0.025
 called-0.025
Token happy
Token ID3772
Feature activation+1.74e-01

Pos logit contributions
 disl+0.202
 Majesty+0.199
 endings+0.180
 fault+0.172
idents+0.171
Neg logit contributions
 full-0.135
 sn-0.124
hes-0.123
 j-0.121
 tucked-0.120
Token and
Token ID290
Feature activation-3.90e-02

Pos logit contributions
 disl+0.324
 Majesty+0.318
itionally+0.281
cient+0.278
roc+0.275
Neg logit contributions
 full-0.222
 called-0.211
 sn-0.201
 tucked-0.199
 j-0.198
Token grow
Token ID1663
Feature activation-3.04e-02

Pos logit contributions
 full+0.050
 sn+0.041
hes+0.041
 tucked+0.041
 called+0.041
Neg logit contributions
 Majesty-0.078
 endings-0.075
 fault-0.074
 disl-0.073
eman-0.071
Token as
Token ID355
Feature activation+1.77e-02

Pos logit contributions
 full+0.043
hes+0.042
 tucked+0.037
 called+0.036
 j+0.036
Neg logit contributions
 disl-0.058
 Majesty-0.057
 endings-0.049
 fault-0.049
idents-0.048
Token a
Token ID257
Feature activation-1.39e-02

Pos logit contributions
 disl+0.036
 Majesty+0.036
 endings+0.032
 fault+0.031
idents+0.031
Neg logit contributions
 full-0.020
hes-0.019
 tucked-0.018
 sn-0.017
 called-0.017
Token family
Token ID1641
Feature activation+5.72e-02

Pos logit contributions
 full+0.018
hes+0.018
 tucked+0.017
 sn+0.016
 nick+0.016
Neg logit contributions
 disl-0.026
 Majesty-0.025
 endings-0.022
 fault-0.022
 appearance-0.021
Token He
Token ID679
Feature activation+8.74e-03

Pos logit contributions
 full+0.028
hes+0.027
 tucked+0.026
 sn+0.025
 nick+0.025
Neg logit contributions
 disl-0.032
 Majesty-0.031
 endings-0.028
 fault-0.027
 appearance-0.026
Token scrub
Token ID27268
Feature activation-6.30e-02

Pos logit contributions
 disl+0.017
 Majesty+0.017
 endings+0.015
 fault+0.014
 preferences+0.014
Neg logit contributions
 full-0.010
hes-0.010
 nick-0.009
 tucked-0.009
 bra-0.009
Tokenbed
Token ID3077
Feature activation-4.57e-02

Pos logit contributions
hes+0.103
 full+0.097
 tucked+0.096
 sn+0.095
 called+0.093
Neg logit contributions
 disl-0.102
 Majesty-0.099
 endings-0.087
 fault-0.085
 appearance-0.083
Token and
Token ID290
Feature activation-8.71e-02

Pos logit contributions
 full+0.050
hes+0.049
 tucked+0.047
 sn+0.047
 j+0.044
Neg logit contributions
 disl-0.097
 Majesty-0.094
 endings-0.086
 fault-0.083
 appearance-0.081
Token polished
Token ID23895
Feature activation-7.66e-03

Pos logit contributions
 full+0.081
hes+0.078
 tucked+0.073
 nick+0.070
 ped+0.067
Neg logit contributions
 disl-0.193
 Majesty-0.183
 endings-0.166
 preferences-0.161
 fault-0.161
Token until
Token ID1566
Feature activation+4.82e-02

Pos logit contributions
 full+0.009
hes+0.009
 tucked+0.009
 sn+0.009
 called+0.008
Neg logit contributions
 disl-0.015
 Majesty-0.015
 endings-0.013
 fault-0.013
 appearance-0.013
Token it
Token ID340
Feature activation-4.18e-03

Pos logit contributions
 disl+0.075
 Majesty+0.073
 endings+0.064
 fault+0.061
idents+0.060
Neg logit contributions
hes-0.074
 full-0.073
 tucked-0.072
 nick-0.068
 sn-0.068
Token was
Token ID373
Feature activation-8.08e-02

Pos logit contributions
 full+0.006
hes+0.006
 tucked+0.006
 nick+0.006
 sn+0.006
Neg logit contributions
 disl-0.007
 Majesty-0.006
 endings-0.006
 fault-0.006
idents-0.005
Token so
Token ID523
Feature activation+4.31e-03

Pos logit contributions
Wh+0.087
Say+0.087
Grab+0.086
 ped+0.085
Hold+0.082
Neg logit contributions
 disl-0.143
 Majesty-0.142
 fault-0.138
 anymore-0.123
 ourselves-0.121
Token shiny
Token ID22441
Feature activation-1.09e-03

Pos logit contributions
 disl+0.009
 Majesty+0.008
idents+0.008
 ourselves+0.007
 preferences+0.007
Neg logit contributions
Wh-0.004
 ped-0.004
 leash-0.004
 nick-0.004
aver-0.004
Token it
Token ID340
Feature activation+2.19e-03

Pos logit contributions
 full+0.001
 tucked+0.001
hes+0.001
 sn+0.001
 closed+0.001
Neg logit contributions
 disl-0.002
 Majesty-0.002
 endings-0.002
 preferences-0.002
 perspectives-0.002
Token on
Token ID319
Feature activation+6.73e-03

Pos logit contributions
 full+0.107
hes+0.106
 tucked+0.102
 nick+0.098
 sn+0.095
Neg logit contributions
 disl-0.153
 Majesty-0.148
 endings-0.132
 fault-0.130
 appearance-0.127
Token his
Token ID465
Feature activation+1.49e-02

Pos logit contributions
 disl+0.012
 Majesty+0.012
 endings+0.010
 appearance+0.010
 fault+0.010
Neg logit contributions
 full-0.009
hes-0.009
 tucked-0.009
 nick-0.008
aver-0.008
Token head
Token ID1182
Feature activation-6.77e-02

Pos logit contributions
 disl+0.030
 Majesty+0.030
 endings+0.026
 fault+0.026
 appearance+0.026
Neg logit contributions
hes-0.017
 full-0.017
 tucked-0.016
On-0.015
 nick-0.014
Token.
Token ID13
Feature activation-6.62e-04

Pos logit contributions
 full+0.092
hes+0.090
 tucked+0.087
 sn+0.082
 nick+0.082
Neg logit contributions
 disl-0.123
 Majesty-0.119
 endings-0.105
 fault-0.103
 appearance-0.100
Token One
Token ID1881
Feature activation+3.88e-02

Pos logit contributions
 full+0.001
hes+0.001
 tucked+0.001
 nick+0.001
 sn+0.001
Neg logit contributions
 disl-0.001
 Majesty-0.001
 endings-0.001
 fault-0.001
 appearance-0.001
Token day
Token ID1110
Feature activation+7.74e-02

Pos logit contributions
 disl+0.036
 means+0.028
 Majesty+0.027
by+0.026
 property+0.025
Neg logit contributions
hes-0.079
 bra-0.076
 convey-0.075
 tucked-0.075
 ped-0.074
Token,
Token ID11
Feature activation+1.60e-02

Pos logit contributions
 disl+0.127
 Majesty+0.124
 endings+0.103
 fault+0.103
 preferences+0.100
Neg logit contributions
 full-0.129
hes-0.124
 tucked-0.120
 called-0.118
 sn-0.116
Token Max
Token ID5436
Feature activation+4.41e-02

Pos logit contributions
 disl+0.029
 Majesty+0.028
 endings+0.025
 fault+0.024
 appearance+0.024
Neg logit contributions
 full-0.022
hes-0.022
 tucked-0.021
 sn-0.020
 nick-0.020
Token saw
Token ID2497
Feature activation+2.01e-02

Pos logit contributions
 disl+0.082
 Majesty+0.080
 fault+0.070
 endings+0.070
 preferences+0.067
Neg logit contributions
 full-0.061
 tucked-0.059
hes-0.058
 sn-0.057
 called-0.057
Token a
Token ID257
Feature activation-4.43e-02

Pos logit contributions
 disl+0.044
 Majesty+0.042
 endings+0.038
 fault+0.038
 preferences+0.036
Neg logit contributions
 full-0.019
hes-0.018
 tucked-0.018
 nick-0.017
 bra-0.016
Token little
Token ID1310
Feature activation+5.53e-02

Pos logit contributions
hes+0.063
 tucked+0.054
 ped+0.053
 full+0.051
 nick+0.051
Neg logit contributions
 Majesty-0.077
 disl-0.075
 appearance-0.066
cient-0.065
 environment-0.064
Token fun
Token ID1257
Feature activation+5.19e-02

Pos logit contributions
 tucked+0.036
 full+0.035
 parked+0.034
hes+0.033
 masse+0.033
Neg logit contributions
 disl-0.041
 Majesty-0.038
 preferences-0.035
 endings-0.035
 fault-0.034
Token.
Token ID13
Feature activation-1.48e-02

Pos logit contributions
 Majesty+0.108
 disl+0.108
 fault+0.096
 appearance+0.093
idents+0.092
Neg logit contributions
 full-0.061
hes-0.060
 called-0.055
 sn-0.054
 tucked-0.053
Token
Token ID198
Feature activation-1.39e-02

Pos logit contributions
 full+0.018
hes+0.017
 tucked+0.016
 called+0.016
 sn+0.015
Neg logit contributions
 disl-0.030
 Majesty-0.030
 endings-0.026
 fault-0.026
 appearance-0.026
Token
Token ID198
Feature activation-1.72e-02

Pos logit contributions
 full+0.017
hes+0.017
 tucked+0.016
 called+0.015
On+0.015
Neg logit contributions
 disl-0.028
 Majesty-0.027
 endings-0.024
 fault-0.024
 appearance-0.023
TokenOne
Token ID3198
Feature activation+1.48e-02

Pos logit contributions
 tucked+0.021
hes+0.021
 full+0.020
 bra+0.020
 nick+0.019
Neg logit contributions
 disl-0.031
 Majesty-0.029
 endings-0.028
 topics-0.026
 perspectives-0.026
Token day
Token ID1110
Feature activation+6.69e-02

Pos logit contributions
 disl+0.022
 Majesty+0.020
 endings+0.018
 beings+0.017
 means+0.017
Neg logit contributions
hes-0.023
 ped-0.022
 tucked-0.022
 bra-0.021
 convey-0.021
Token,
Token ID11
Feature activation+3.14e-03

Pos logit contributions
 disl+0.103
 Majesty+0.099
 endings+0.084
 fault+0.084
 preferences+0.080
Neg logit contributions
 full-0.116
hes-0.111
 tucked-0.109
 called-0.106
 sn-0.105
Token the
Token ID262
Feature activation-1.29e-02

Pos logit contributions
 disl+0.006
 Majesty+0.006
 endings+0.005
idents+0.005
 fault+0.005
Neg logit contributions
hes-0.004
 tucked-0.004
 full-0.004
 nick-0.003
 ped-0.003
Token pretty
Token ID2495
Feature activation+4.88e-02

Pos logit contributions
hes+0.016
 ped+0.016
 full+0.015
 tucked+0.015
wheel+0.015
Neg logit contributions
 disl-0.022
 Majesty-0.022
 beings-0.020
 environment-0.019
 appearance-0.019
Token sheep
Token ID15900
Feature activation-4.39e-02

Pos logit contributions
 disl+0.098
 Majesty+0.095
 endings+0.086
 fault+0.084
idents+0.081
Neg logit contributions
 full-0.058
hes-0.057
 tucked-0.056
 sn-0.053
 called-0.051
Token went
Token ID1816
Feature activation-1.04e-01

Pos logit contributions
 full+0.054
hes+0.051
 tucked+0.051
 called+0.049
 sn+0.048
Neg logit contributions
 disl-0.089
 Majesty-0.086
 fault-0.077
 endings-0.077
 preferences-0.074
Token fun
Token ID1257
Feature activation+6.73e-02

Pos logit contributions
 disl+0.046
 Majesty+0.040
 endings+0.040
 fault+0.038
 relationships+0.037
Neg logit contributions
hes-0.034
Wh-0.031
 full-0.031
 tucked-0.031
aver-0.030
Token dancing
Token ID15360
Feature activation-3.23e-02

Pos logit contributions
 disl+0.137
 Majesty+0.137
cient+0.121
roc+0.121
 preferences+0.120
Neg logit contributions
 full-0.082
 j-0.077
 sn-0.075
 tucked-0.074
 called-0.074
Token together
Token ID1978
Feature activation+9.86e-02

Pos logit contributions
 full+0.041
 tucked+0.037
hes+0.037
 called+0.035
 sn+0.035
Neg logit contributions
 disl-0.065
 Majesty-0.064
 endings-0.055
 preferences-0.055
 fault-0.054
Token!
Token ID0
Feature activation-5.75e-03

Pos logit contributions
 disl+0.201
 Majesty+0.193
 endings+0.168
 preferences+0.165
idents+0.164
Neg logit contributions
 full-0.128
 tucked-0.116
hes-0.113
 sn-0.111
 called-0.111
Token<|endoftext|>
Token ID50256
Feature activation+1.12e-01

Pos logit contributions
 full+0.008
hes+0.008
 tucked+0.007
 sn+0.007
 nick+0.007
Neg logit contributions
 disl-0.011
 Majesty-0.010
 endings-0.009
 fault-0.009
 appearance-0.009
TokenOnce
Token ID7454
Feature activation+1.16e-01

Pos logit contributions
 disl+0.200
 Majesty+0.194
 endings+0.169
 fault+0.165
 appearance+0.163
Neg logit contributions
 full-0.160
hes-0.157
 tucked-0.151
 sn-0.143
 called-0.141
Token upon
Token ID2402
Feature activation+4.99e-02

Pos logit contributions
 disl+0.272
 Majesty+0.268
 endings+0.246
 preferences+0.242
 property+0.234
Neg logit contributions
 full-0.146
 j-0.118
 low-0.115
 cold-0.115
 called-0.114
Token a
Token ID257
Feature activation-1.80e-02

Pos logit contributions
 disl+0.108
 Majesty+0.102
 endings+0.098
 fault+0.093
 purposes+0.092
Neg logit contributions
hes-0.041
 full-0.041
 tucked-0.041
 nick-0.039
 bra-0.038
Token time
Token ID640
Feature activation+4.11e-02

Pos logit contributions
hes+0.023
 tucked+0.022
 full+0.022
 ped+0.020
 bra+0.020
Neg logit contributions
 disl-0.033
 Majesty-0.032
 fault-0.029
 endings-0.029
 appearance-0.027
Token,
Token ID11
Feature activation+3.04e-02

Pos logit contributions
 disl+0.084
 Majesty+0.084
 preferences+0.078
 endings+0.074
 perspectives+0.073
Neg logit contributions
 full-0.055
 j-0.049
 called-0.046
On-0.044
 sn-0.044
Token there
Token ID612
Feature activation+9.94e-02

Pos logit contributions
 disl+0.068
 Majesty+0.066
 endings+0.059
 preferences+0.058
 appearance+0.057
Neg logit contributions
 full-0.037
hes-0.033
 tucked-0.031
 called-0.031
On-0.030
Token itself
Token ID2346
Feature activation-2.88e-02

Pos logit contributions
 disl+0.022
 Majesty+0.021
 endings+0.019
idents+0.019
 fault+0.019
Neg logit contributions
 full-0.013
 tucked-0.011
 sn-0.011
hes-0.011
 called-0.011
Token.
Token ID13
Feature activation-1.95e-02

Pos logit contributions
 full+0.033
 tucked+0.030
 sn+0.028
 called+0.027
 closed+0.027
Neg logit contributions
 disl-0.066
 Majesty-0.062
 endings-0.056
 preferences-0.054
 fault-0.053
Token
Token ID198
Feature activation-2.17e-02

Pos logit contributions
 full+0.021
hes+0.019
 tucked+0.018
 called+0.018
On+0.018
Neg logit contributions
 disl-0.042
 Majesty-0.041
 fault-0.038
 endings-0.038
 appearance-0.037
Token
Token ID198
Feature activation-2.49e-02

Pos logit contributions
 full+0.030
hes+0.029
 tucked+0.028
 sn+0.027
 nick+0.026
Neg logit contributions
 disl-0.039
 Majesty-0.038
 endings-0.034
 fault-0.033
 appearance-0.032
TokenThe
Token ID464
Feature activation-1.29e-02

Pos logit contributions
hes+0.031
 full+0.031
 tucked+0.031
 sn+0.028
 nick+0.028
Neg logit contributions
 disl-0.048
 Majesty-0.044
 endings-0.040
 perspectives-0.039
 property-0.038
Token little
Token ID1310
Feature activation+5.41e-02

Pos logit contributions
hes+0.019
 full+0.019
 ped+0.018
 tucked+0.018
 nick+0.018
Neg logit contributions
 disl-0.021
 Majesty-0.020
 endings-0.018
 fault-0.017
idents-0.017
Token girl
Token ID2576
Feature activation+3.49e-02

Pos logit contributions
 disl+0.087
 Majesty+0.081
 endings+0.071
 fault+0.069
 appearance+0.066
Neg logit contributions
 full-0.092
 tucked-0.088
hes-0.087
 sn-0.084
 called-0.082
Token was
Token ID373
Feature activation-8.63e-02

Pos logit contributions
 disl+0.066
 Majesty+0.064
 endings+0.057
 fault+0.056
 appearance+0.055
Neg logit contributions
 full-0.045
hes-0.044
 tucked-0.043
 sn-0.041
 called-0.040
Token so
Token ID523
Feature activation+8.22e-04

Pos logit contributions
hes+0.092
 tucked+0.088
 full+0.087
 nick+0.087
On+0.085
Neg logit contributions
 disl-0.167
 Majesty-0.167
 fault-0.146
 beings-0.138
 endings-0.138
Token excited
Token ID6568
Feature activation+6.16e-02

Pos logit contributions
 disl+0.002
 Majesty+0.002
 endings+0.002
 fault+0.001
idents+0.001
Neg logit contributions
 full-0.001
hes-0.001
 tucked-0.001
 nick-0.001
 sn-0.001
Token!
Token ID0
Feature activation-2.37e-02

Pos logit contributions
 Majesty+0.133
 disl+0.130
 fault+0.110
 preferences+0.110
 perspectives+0.109
Neg logit contributions
 full-0.080
 called-0.074
 tucked-0.072
 sn-0.068
 closed-0.065
Token inside
Token ID2641
Feature activation+4.14e-02

Pos logit contributions
 full+0.040
 tucked+0.036
hes+0.035
 sn+0.034
 closed+0.032
Neg logit contributions
 disl-0.078
 Majesty-0.075
 endings-0.067
 fault-0.065
 preferences-0.065
Token.
Token ID13
Feature activation-2.84e-02

Pos logit contributions
 disl+0.091
 Majesty+0.090
 endings+0.079
 fault+0.077
 preferences+0.076
Neg logit contributions
 full-0.051
 tucked-0.045
 sn-0.042
hes-0.042
 called-0.041
Token Inside
Token ID14384
Feature activation+4.55e-03

Pos logit contributions
hes+0.043
 full+0.043
 tucked+0.042
 sn+0.040
 bra+0.040
Neg logit contributions
 disl-0.046
 Majesty-0.044
 endings-0.039
 fault-0.038
idents-0.037
Token was
Token ID373
Feature activation-9.07e-02

Pos logit contributions
 disl+0.011
 Majesty+0.010
 endings+0.009
 preferences+0.009
 appearance+0.009
Neg logit contributions
 full-0.006
 tucked-0.006
 called-0.005
 sn-0.005
hes-0.005
Token a
Token ID257
Feature activation-6.86e-02

Pos logit contributions
 full+0.066
hes+0.066
 tucked+0.062
 nick+0.056
 sn+0.055
Neg logit contributions
 disl-0.216
 Majesty-0.212
 endings-0.194
 fault-0.191
 preferences-0.185
Token beautiful
Token ID4950
Feature activation+4.24e-02

Pos logit contributions
hes+0.077
 tucked+0.067
 ped+0.067
 full+0.067
 nick+0.064
Neg logit contributions
 Majesty-0.139
 disl-0.138
 fault-0.120
 endings-0.119
idents-0.117
Token toy
Token ID13373
Feature activation+5.55e-04

Pos logit contributions
 disl+0.087
 Majesty+0.085
 endings+0.076
 fault+0.074
 appearance+0.073
Neg logit contributions
hes-0.047
 full-0.046
 tucked-0.044
 nick-0.041
 ped-0.040
Token car
Token ID1097
Feature activation-6.91e-02

Pos logit contributions
 disl+0.001
 Majesty+0.001
 endings+0.001
 fault+0.001
 appearance+0.001
Neg logit contributions
 full-0.001
hes-0.001
 tucked-0.001
 sn-0.001
 called-0.001
Token with
Token ID351
Feature activation-4.06e-02

Pos logit contributions
 full+0.100
hes+0.099
 tucked+0.095
 nick+0.090
 sn+0.090
Neg logit contributions
 disl-0.118
 Majesty-0.114
 endings-0.101
 fault-0.098
 appearance-0.095
Token vibrant
Token ID21266
Feature activation+2.77e-02

Pos logit contributions
hes+0.041
 full+0.039
 tucked+0.039
 nick+0.037
On+0.035
Neg logit contributions
 disl-0.085
 Majesty-0.083
 endings-0.075
 fault-0.074
 preferences-0.072
Token colors
Token ID7577
Feature activation-4.58e-02

Pos logit contributions
 disl+0.058
 Majesty+0.056
 endings+0.051
 fault+0.049
idents+0.049
Neg logit contributions
 full-0.031
hes-0.029
 tucked-0.028
 sn-0.027
 called-0.026
Token he
Token ID339
Feature activation+7.45e-02

Pos logit contributions
 Majesty+0.197
 disl+0.188
idents+0.170
 fault+0.165
 endings+0.164
Neg logit contributions
 full-0.135
hes-0.119
 tucked-0.116
 called-0.116
 j-0.116
Token had
Token ID550
Feature activation+3.47e-02

Pos logit contributions
 disl+0.143
 Majesty+0.141
 endings+0.124
 fault+0.124
 appearance+0.119
Neg logit contributions
 full-0.099
 tucked-0.094
hes-0.093
 called-0.089
 sn-0.089
Token done
Token ID1760
Feature activation+9.11e-02

Pos logit contributions
 Majesty+0.071
 disl+0.071
 endings+0.061
idents+0.060
 appearance+0.060
Neg logit contributions
 full-0.044
hes-0.040
 tucked-0.039
 called-0.038
 ped-0.036
Token that
Token ID326
Feature activation+1.04e-01

Pos logit contributions
 Majesty+0.207
 disl+0.195
cient+0.176
 appearance+0.174
 endings+0.173
Neg logit contributions
 full-0.102
hes-0.088
 called-0.082
 tucked-0.082
 j-0.079
Token day
Token ID1110
Feature activation+1.07e-01

Pos logit contributions
 disl+0.188
 Majesty+0.185
 endings+0.161
 fault+0.156
idents+0.153
Neg logit contributions
 full-0.145
hes-0.139
 tucked-0.136
 sn-0.129
 nick-0.127
Token and
Token ID290
Feature activation-6.39e-02

Pos logit contributions
 disl+0.191
 Majesty+0.184
 endings+0.164
 fault+0.159
 appearance+0.155
Neg logit contributions
 full-0.149
hes-0.147
 tucked-0.142
 sn-0.134
 nick-0.134
Token show
Token ID905
Feature activation+2.39e-02

Pos logit contributions
 full+0.081
 called+0.072
 tucked+0.072
 sn+0.069
hes+0.066
Neg logit contributions
 disl-0.129
 Majesty-0.127
 fault-0.120
 endings-0.117
idents-0.116
Token them
Token ID606
Feature activation+1.11e-02

Pos logit contributions
 disl+0.048
 Majesty+0.046
idents+0.043
 endings+0.042
 fault+0.041
Neg logit contributions
 full-0.028
 tucked-0.027
hes-0.027
 called-0.026
 sn-0.026
Token the
Token ID262
Feature activation+2.89e-02

Pos logit contributions
 Majesty+0.023
 disl+0.022
 endings+0.020
idents+0.020
 appearance+0.020
Neg logit contributions
 full-0.014
hes-0.012
 tucked-0.012
 j-0.011
 sn-0.011
Token fish
Token ID5916
Feature activation-1.98e-02

Pos logit contributions
 disl+0.055
 Majesty+0.053
 endings+0.047
idents+0.046
 fault+0.045
Neg logit contributions
 full-0.038
hes-0.035
 tucked-0.034
 sn-0.033
 nick-0.033
Token he
Token ID339
Feature activation+5.42e-02

Pos logit contributions
 full+0.026
 tucked+0.022
 called+0.022
hes+0.022
 sn+0.022
Neg logit contributions
 Majesty-0.040
 disl-0.039
idents-0.035
 endings-0.035
 fault-0.035
Token again
Token ID757
Feature activation+4.18e-02

Pos logit contributions
 full+0.034
 tucked+0.032
 sn+0.031
hes+0.031
 called+0.030
Neg logit contributions
 disl-0.033
 Majesty-0.031
 endings-0.027
 preferences-0.026
 fault-0.026
Token,
Token ID11
Feature activation+1.08e-02

Pos logit contributions
 disl+0.085
 Majesty+0.084
 endings+0.073
 fault+0.073
idents+0.072
Neg logit contributions
 full-0.053
 tucked-0.047
 called-0.046
 sn-0.046
hes-0.045
Token but
Token ID475
Feature activation+8.27e-04

Pos logit contributions
 disl+0.020
 Majesty+0.020
 endings+0.018
 fault+0.017
 appearance+0.017
Neg logit contributions
hes-0.014
 full-0.013
 tucked-0.013
 nick-0.012
 sn-0.012
Token she
Token ID673
Feature activation+1.93e-02

Pos logit contributions
 Majesty+0.002
 disl+0.002
 endings+0.002
 purposes+0.002
cient+0.002
Neg logit contributions
 full-0.001
 tucked-0.001
 called-0.001
 closed-0.001
 sn-0.001
Token will
Token ID481
Feature activation-2.73e-03

Pos logit contributions
 disl+0.037
 Majesty+0.036
 endings+0.033
 fault+0.031
 appearance+0.031
Neg logit contributions
 full-0.026
 tucked-0.025
 sn-0.025
 called-0.024
hes-0.024
Token always
Token ID1464
Feature activation+1.97e-02

Pos logit contributions
 full+0.004
hes+0.004
 tucked+0.003
 sn+0.003
 nick+0.003
Neg logit contributions
 disl-0.005
 Majesty-0.005
 endings-0.004
 fault-0.004
 appearance-0.004
Token remember
Token ID3505
Feature activation+5.89e-02

Pos logit contributions
 disl+0.035
 Majesty+0.033
 endings+0.030
 fault+0.029
idents+0.029
Neg logit contributions
 full-0.029
 tucked-0.027
 sn-0.027
hes-0.026
 called-0.026
Token how
Token ID703
Feature activation+3.51e-02

Pos logit contributions
 Majesty+0.127
 disl+0.127
 endings+0.112
idents+0.112
 purposes+0.109
Neg logit contributions
 full-0.066
 tucked-0.061
 called-0.060
 sn-0.058
hes-0.056
Token it
Token ID340
Feature activation+2.24e-02

Pos logit contributions
 Majesty+0.080
 disl+0.075
 endings+0.072
eman+0.072
idents+0.071
Neg logit contributions
 full-0.038
 j-0.031
 much-0.030
 fast-0.030
hes-0.030
Token refused
Token ID6520
Feature activation+5.08e-02

Pos logit contributions
 Majesty+0.043
 disl+0.042
 endings+0.037
 fault+0.036
 appearance+0.035
Neg logit contributions
 full-0.030
hes-0.029
 tucked-0.029
 sn-0.029
 closed-0.028
Token to
Token ID284
Feature activation+2.79e-02

Pos logit contributions
 disl+0.095
 Majesty+0.095
 endings+0.080
idents+0.077
cient+0.076
Neg logit contributions
 full-0.070
 tucked-0.065
hes-0.063
 sn-0.062
 called-0.061
Token stick
Token ID4859
Feature activation-1.02e-01

Pos logit contributions
hes+0.068
 full+0.061
 tucked+0.060
 ped+0.056
 sn+0.056
Neg logit contributions
 disl-0.116
 Majesty-0.113
 fault-0.103
 endings-0.098
 appearance-0.097
Token to
Token ID284
Feature activation-7.39e-03

Pos logit contributions
 full+0.159
hes+0.156
 tucked+0.152
 sn+0.144
 nick+0.143
Neg logit contributions
 disl-0.165
 Majesty-0.159
 endings-0.138
 fault-0.134
 appearance-0.130
Token fish
Token ID5916
Feature activation-5.34e-02

Pos logit contributions
hes+0.011
 masse+0.011
 full+0.011
Say+0.011
 tucked+0.010
Neg logit contributions
 disl-0.011
 Majesty-0.010
 ourselves-0.009
 appearance-0.009
 relationships-0.009
Token the
Token ID262
Feature activation-4.03e-03

Pos logit contributions
 full+0.063
 tucked+0.057
 called+0.056
 sn+0.052
hes+0.050
Neg logit contributions
 Majesty-0.116
 disl-0.116
 endings-0.104
 preferences-0.099
 appearance-0.099
Token wheel
Token ID7825
Feature activation-9.01e-02

Pos logit contributions
hes+0.006
 full+0.006
 tucked+0.005
 sn+0.005
 nick+0.005
Neg logit contributions
 disl-0.007
 Majesty-0.007
 endings-0.006
 fault-0.006
 appearance-0.006
Token out
Token ID503
Feature activation-1.80e-02

Pos logit contributions
 full+0.120
hes+0.119
 tucked+0.114
 sn+0.108
 nick+0.107
Neg logit contributions
 disl-0.165
 Majesty-0.160
 endings-0.142
 fault-0.139
 appearance-0.135
Token of
Token ID286
Feature activation+5.10e-03

Pos logit contributions
hes+0.021
 full+0.019
 tucked+0.018
 nick+0.018
 sn+0.017
Neg logit contributions
 disl-0.035
 Majesty-0.033
 endings-0.032
 fault-0.031
 appearance-0.030
Token the
Token ID262
Feature activation+4.85e-03

Pos logit contributions
 disl+0.007
 Majesty+0.006
 fault+0.006
 appearance+0.006
 endings+0.006
Neg logit contributions
hes-0.009
 tucked-0.008
 full-0.008
 masse-0.008
 bra-0.008
Token water
Token ID1660
Feature activation-7.90e-02

Pos logit contributions
 disl+0.009
 Majesty+0.008
 endings+0.007
 fault+0.007
idents+0.007
Neg logit contributions
 full-0.007
hes-0.007
 tucked-0.007
 nick-0.006
 sn-0.006
Token and
Token ID290
Feature activation-9.20e-02

Pos logit contributions
 full+0.106
hes+0.102
 tucked+0.100
 sn+0.094
 called+0.093
Neg logit contributions
 disl-0.146
 Majesty-0.142
 endings-0.126
 fault-0.122
 appearance-0.119
Token put
Token ID1234
Feature activation-9.37e-02

Pos logit contributions
 full+0.097
hes+0.093
 tucked+0.091
 sn+0.084
 called+0.083
Neg logit contributions
 disl-0.198
 Majesty-0.193
 endings-0.175
 fault-0.171
 appearance-0.167
Token lesson
Token ID11483
Feature activation+4.48e-02

Pos logit contributions
 disl+0.010
 Majesty+0.010
 endings+0.008
 fault+0.008
 appearance+0.008
Neg logit contributions
 full-0.009
hes-0.009
 tucked-0.009
 sn-0.008
 nick-0.008
Token and
Token ID290
Feature activation-8.04e-02

Pos logit contributions
 Majesty+0.095
 disl+0.094
 endings+0.080
 fault+0.080
idents+0.078
Neg logit contributions
 full-0.052
hes-0.049
 tucked-0.048
 called-0.047
 sn-0.045
Token never
Token ID1239
Feature activation+2.11e-02

Pos logit contributions
 full+0.100
hes+0.097
 tucked+0.096
 sn+0.090
 called+0.090
Neg logit contributions
 disl-0.159
 Majesty-0.156
 endings-0.138
 fault-0.135
 appearance-0.133
Token went
Token ID1816
Feature activation-8.44e-02

Pos logit contributions
 disl+0.046
 Majesty+0.046
 endings+0.041
 fault+0.040
idents+0.039
Neg logit contributions
 full-0.022
 tucked-0.020
hes-0.020
 sn-0.019
 called-0.019
Token to
Token ID284
Feature activation+2.89e-02

Pos logit contributions
hes+0.113
 full+0.110
 tucked+0.108
 nick+0.101
 sn+0.100
Neg logit contributions
 disl-0.152
 Majesty-0.147
 endings-0.131
 fault-0.128
 appearance-0.128
Token the
Token ID262
Feature activation+1.84e-02

Pos logit contributions
 disl+0.056
 Majesty+0.055
 endings+0.048
idents+0.047
 fault+0.047
Neg logit contributions
 full-0.037
 tucked-0.035
 sn-0.034
hes-0.034
 nick-0.033
Token lion
Token ID18744
Feature activation+7.27e-03

Pos logit contributions
 disl+0.032
 Majesty+0.030
 endings+0.026
 fault+0.026
idents+0.025
Neg logit contributions
 full-0.028
hes-0.026
 tucked-0.026
 sn-0.025
 nick-0.025
Token's
Token ID338
Feature activation+5.94e-04

Pos logit contributions
 disl+0.013
 Majesty+0.012
 preferences+0.011
 endings+0.010
 perspectives+0.010
Neg logit contributions
 full-0.012
 tucked-0.011
 sn-0.011
 called-0.011
 closed-0.010
Token cave
Token ID11527
Feature activation+2.49e-02

Pos logit contributions
 disl+0.001
 Majesty+0.001
 endings+0.001
idents+0.001
cient+0.001
Neg logit contributions
 full-0.001
 sn-0.001
 tucked-0.001
 closed-0.001
 j-0.001
Token again
Token ID757
Feature activation+5.15e-02

Pos logit contributions
 disl+0.039
 Majesty+0.037
 endings+0.032
 fault+0.030
idents+0.030
Neg logit contributions
 full-0.042
hes-0.040
 tucked-0.040
 sn-0.038
 called-0.037
Token.
Token ID13
Feature activation-2.09e-02

Pos logit contributions
 Majesty+0.112
 disl+0.111
idents+0.098
 fault+0.098
 endings+0.097
Neg logit contributions
 full-0.060
 called-0.051
 tucked-0.051
 sn-0.050
 j-0.048
Token too
Token ID1165
Feature activation-2.87e-02

Pos logit contributions
Ta+0.142
Ring+0.138
hes+0.135
La+0.134
 convey+0.133
Neg logit contributions
 fault-0.114
 disl-0.102
 means-0.100
 anymore-0.100
flies-0.096
Token high
Token ID1029
Feature activation+8.84e-03

Pos logit contributions
 tucked+0.032
 nick+0.031
 ped+0.030
 blush+0.029
 convey+0.029
Neg logit contributions
 disl-0.052
 means-0.045
by-0.044
 fault-0.044
idents-0.043
Token up
Token ID510
Feature activation+1.22e-03

Pos logit contributions
 disl+0.019
 Majesty+0.019
 endings+0.016
 preferences+0.016
 perspectives+0.016
Neg logit contributions
 full-0.011
 tucked-0.010
hes-0.009
 sn-0.009
 called-0.009
Token.
Token ID13
Feature activation-1.67e-02

Pos logit contributions
 disl+0.002
 Majesty+0.002
 fault+0.001
 endings+0.001
 appearance+0.001
Neg logit contributions
hes-0.002
 tucked-0.002
 nick-0.002
 sn-0.002
Wh-0.002
Token He
Token ID679
Feature activation+3.07e-03

Pos logit contributions
 full+0.021
hes+0.020
 tucked+0.019
 nick+0.018
 sn+0.018
Neg logit contributions
 disl-0.033
 Majesty-0.032
 endings-0.028
 fault-0.028
 beings-0.027
Token jumped
Token ID11687
Feature activation-1.27e-01

Pos logit contributions
 disl+0.004
 Majesty+0.004
 endings+0.004
idents+0.004
 means+0.004
Neg logit contributions
hes-0.005
 full-0.005
 tucked-0.004
 nick-0.004
 masse-0.004
Token up
Token ID510
Feature activation+6.96e-04

Pos logit contributions
hes+0.204
 tucked+0.185
aver+0.183
Say+0.178
 nick+0.176
Neg logit contributions
 disl-0.154
 anymore-0.154
idents-0.153
 Majesty-0.152
 fault-0.151
Token,
Token ID11
Feature activation+3.45e-03

Pos logit contributions
 disl+0.001
idents+0.001
flies+0.001
 fault+0.001
 anymore+0.001
Neg logit contributions
hes-0.001
aver-0.001
Ring-0.001
 tucked-0.001
Say-0.001
Token but
Token ID475
Feature activation-1.31e-02

Pos logit contributions
 disl+0.006
 Majesty+0.006
idents+0.005
 fault+0.005
 appearance+0.005
Neg logit contributions
hes-0.004
 full-0.004
 tucked-0.004
 nick-0.004
aver-0.004
Token he
Token ID339
Feature activation+1.48e-03

Pos logit contributions
 full+0.018
hes+0.017
 tucked+0.017
 sn+0.016
 called+0.016
Neg logit contributions
 disl-0.025
 Majesty-0.024
 endings-0.022
 fault-0.021
 preferences-0.021
Token still
Token ID991
Feature activation+2.27e-02

Pos logit contributions
 disl+0.002
 Majesty+0.002
 endings+0.002
 fault+0.002
idents+0.002
Neg logit contributions
hes-0.002
 full-0.002
 tucked-0.002
 nick-0.002
 sn-0.002
Token.
Token ID13
Feature activation-3.45e-03

Pos logit contributions
 disl+0.036
 Majesty+0.035
 endings+0.031
 fault+0.030
idents+0.029
Neg logit contributions
 full-0.026
 tucked-0.024
hes-0.024
 called-0.023
 sn-0.023
Token
Token ID198
Feature activation-4.68e-03

Pos logit contributions
hes+0.005
 full+0.005
 tucked+0.005
 sn+0.005
 bra+0.004
Neg logit contributions
 disl-0.006
 Majesty-0.006
 endings-0.005
 fault-0.005
idents-0.005
Token
Token ID198
Feature activation-8.91e-03

Pos logit contributions
 full+0.007
 tucked+0.007
hes+0.006
 sn+0.006
 nick+0.006
Neg logit contributions
 disl-0.008
 Majesty-0.007
 endings-0.007
 fault-0.006
 purposes-0.006
TokenWhen
Token ID2215
Feature activation+3.48e-02

Pos logit contributions
 full+0.011
 tucked+0.011
hes+0.010
 sn+0.010
 called+0.010
Neg logit contributions
 disl-0.017
 Majesty-0.016
 endings-0.014
 perspectives-0.014
 topics-0.013
Token he
Token ID339
Feature activation+1.12e-02

Pos logit contributions
 disl+0.062
 Majesty+0.059
 endings+0.052
 fault+0.051
 preferences+0.049
Neg logit contributions
 full-0.051
hes-0.049
 tucked-0.048
 sn-0.046
 nick-0.045
Token got
Token ID1392
Feature activation-8.51e-02

Pos logit contributions
 disl+0.019
 Majesty+0.019
 endings+0.017
 fault+0.016
 appearance+0.016
Neg logit contributions
hes-0.016
 full-0.016
 tucked-0.015
 nick-0.014
 sn-0.014
Token there
Token ID612
Feature activation+9.20e-02

Pos logit contributions
hes+0.142
 full+0.137
 tucked+0.136
 nick+0.129
 sn+0.129
Neg logit contributions
 disl-0.122
 Majesty-0.120
 endings-0.102
 fault-0.100
idents-0.096
Token,
Token ID11
Feature activation+1.71e-02

Pos logit contributions
 disl+0.197
 Majesty+0.176
 preferences+0.163
 fault+0.158
 beings+0.151
Neg logit contributions
 full-0.135
 tucked-0.120
 j-0.116
 sn-0.115
 called-0.114
Token he
Token ID339
Feature activation+1.75e-02

Pos logit contributions
 disl+0.036
 Majesty+0.034
 fault+0.030
 preferences+0.030
 endings+0.030
Neg logit contributions
 full-0.022
hes-0.019
 tucked-0.019
 nick-0.019
 sn-0.019
Token saw
Token ID2497
Feature activation+1.56e-02

Pos logit contributions
 disl+0.029
 Majesty+0.029
 endings+0.025
 fault+0.024
 appearance+0.024
Neg logit contributions
 full-0.026
hes-0.026
 tucked-0.024
 nick-0.023
 sn-0.023
Token the
Token ID262
Feature activation+1.48e-03

Pos logit contributions
 disl+0.035
 Majesty+0.034
 endings+0.031
 fault+0.030
idents+0.030
Neg logit contributions
 full-0.015
hes-0.014
 tucked-0.014
 sn-0.013
 called-0.013
Token as
Token ID355
Feature activation-3.17e-02

Pos logit contributions
 disl+0.124
 Majesty+0.118
 fault+0.107
idents+0.104
flies+0.104
Neg logit contributions
hes-0.130
 nick-0.114
 tucked-0.114
Ring-0.112
aver-0.112
Token fast
Token ID3049
Feature activation-9.24e-03

Pos logit contributions
 full+0.017
hes+0.016
 tucked+0.015
 sn+0.013
 nick+0.012
Neg logit contributions
 disl-0.084
 Majesty-0.082
 endings-0.075
 fault-0.074
 appearance-0.073
Token as
Token ID355
Feature activation-3.75e-02

Pos logit contributions
hes+0.012
 full+0.011
 tucked+0.011
 nick+0.010
On+0.010
Neg logit contributions
 disl-0.018
 Majesty-0.017
 endings-0.016
 fault-0.015
 appearance-0.015
Token he
Token ID339
Feature activation-5.14e-03

Pos logit contributions
 full+0.054
hes+0.050
 tucked+0.050
 sn+0.047
 called+0.047
Neg logit contributions
 disl-0.068
 Majesty-0.066
 endings-0.058
 fault-0.057
idents-0.056
Token could
Token ID714
Feature activation-1.37e-02

Pos logit contributions
hes+0.010
 full+0.010
 nick+0.009
 tucked+0.009
 leash+0.009
Neg logit contributions
 Majesty-0.005
 disl-0.005
 endings-0.005
flies-0.004
idents-0.004
Token and
Token ID290
Feature activation-9.60e-02

Pos logit contributions
hes+0.021
 full+0.020
 tucked+0.020
 nick+0.019
 sn+0.019
Neg logit contributions
 disl-0.022
 Majesty-0.022
 endings-0.019
 fault-0.019
 appearance-0.018
Token hid
Token ID24519
Feature activation-4.61e-02

Pos logit contributions
 full+0.110
 tucked+0.105
 sn+0.104
 called+0.099
hes+0.098
Neg logit contributions
 disl-0.205
 Majesty-0.200
 fault-0.185
 endings-0.180
 preferences-0.178
Token in
Token ID287
Feature activation-2.11e-02

Pos logit contributions
hes+0.048
 full+0.043
 tucked+0.043
 nick+0.040
On+0.038
Neg logit contributions
 disl-0.095
 Majesty-0.092
 endings-0.086
 beings-0.084
 relationships-0.083
Token a
Token ID257
Feature activation-5.39e-02

Pos logit contributions
hes+0.029
 tucked+0.025
 full+0.025
aver+0.025
wheel+0.023
Neg logit contributions
 disl-0.037
 Majesty-0.035
 endings-0.033
 appearance-0.033
 fault-0.032
Token bush
Token ID24484
Feature activation-4.59e-02

Pos logit contributions
hes+0.066
 full+0.059
 tucked+0.058
 ped+0.058
 parked+0.054
Neg logit contributions
 disl-0.104
 Majesty-0.101
 endings-0.088
 appearance-0.088
 property-0.086
Token.
Token ID13
Feature activation-1.83e-02

Pos logit contributions
hes+0.083
Ring+0.075
 ped+0.074
 nick+0.073
Wh+0.073
Neg logit contributions
 disl-0.053
 Majesty-0.051
 anymore-0.049
 fault-0.049
 property-0.047
Token but
Token ID475
Feature activation-7.59e-03

Pos logit contributions
 disl+0.002
 Majesty+0.002
 fault+0.002
 appearance+0.002
 endings+0.002
Neg logit contributions
hes-0.002
 full-0.002
 tucked-0.002
 nick-0.002
aver-0.002
Token the
Token ID262
Feature activation-1.84e-02

Pos logit contributions
hes+0.012
 full+0.012
 tucked+0.012
 sn+0.011
 nick+0.011
Neg logit contributions
 disl-0.012
 Majesty-0.011
 endings-0.010
 fault-0.009
idents-0.009
Token knife
Token ID9845
Feature activation-5.57e-02

Pos logit contributions
 full+0.028
hes+0.025
 cold+0.024
 tucked+0.023
 sn+0.023
Neg logit contributions
 disl-0.034
 Majesty-0.031
 endings-0.028
idents-0.028
ors-0.028
Token did
Token ID750
Feature activation-1.66e-02

Pos logit contributions
 full+0.062
hes+0.060
 tucked+0.057
 nick+0.056
On+0.053
Neg logit contributions
 disl-0.110
 Majesty-0.106
 endings-0.097
 fault-0.096
idents-0.092
Token not
Token ID407
Feature activation-1.01e-02

Pos logit contributions
 full+0.021
hes+0.021
 tucked+0.020
 nick+0.019
 sn+0.019
Neg logit contributions
 disl-0.031
 Majesty-0.030
 endings-0.027
 fault-0.026
idents-0.026
Token go
Token ID467
Feature activation-9.75e-02

Pos logit contributions
hes+0.012
 full+0.011
 nick+0.011
 tucked+0.011
aver+0.010
Neg logit contributions
 disl-0.019
 Majesty-0.018
 fault-0.017
 endings-0.016
idents-0.016
Token in
Token ID287
Feature activation-4.82e-03

Pos logit contributions
hes+0.150
 tucked+0.143
 nick+0.139
 full+0.138
 ped+0.132
Neg logit contributions
 disl-0.142
 Majesty-0.135
 fault-0.127
 appearance-0.127
 endings-0.123
Token.
Token ID13
Feature activation-2.71e-02

Pos logit contributions
hes+0.007
 full+0.007
 tucked+0.007
 nick+0.006
aver+0.006
Neg logit contributions
 disl-0.008
 Majesty-0.008
 fault-0.007
 endings-0.007
 appearance-0.007
Token He
Token ID679
Feature activation+3.89e-03

Pos logit contributions
 tucked+0.049
 full+0.045
 sn+0.045
aver+0.044
hes+0.044
Neg logit contributions
 disl-0.036
 Majesty-0.033
 fault-0.031
 endings-0.031
idents-0.030
Token pushed
Token ID7121
Feature activation-6.28e-02

Pos logit contributions
 disl+0.006
 Majesty+0.005
 means+0.005
 endings+0.005
flies+0.005
Neg logit contributions
 full-0.005
 nick-0.005
hes-0.005
 leash-0.005
 masse-0.005
Token harder
Token ID7069
Feature activation-6.43e-03

Pos logit contributions
hes+0.093
 full+0.092
 tucked+0.092
 nick+0.083
 sn+0.080
Neg logit contributions
 disl-0.099
 Majesty-0.096
 endings-0.088
 fault-0.087
 relationships-0.086
Token ran
Token ID4966
Feature activation-8.79e-02

Pos logit contributions
 full+0.001
hes+0.001
 tucked+0.001
 sn+0.001
 called+0.001
Neg logit contributions
 disl-0.002
 Majesty-0.002
 endings-0.002
 fault-0.002
 appearance-0.002
Token out
Token ID503
Feature activation-1.14e-02

Pos logit contributions
hes+0.158
 tucked+0.145
 full+0.142
aver+0.138
 nick+0.137
Neg logit contributions
 disl-0.112
 Majesty-0.105
idents-0.095
 endings-0.091
 fault-0.090
Token of
Token ID286
Feature activation+1.42e-02

Pos logit contributions
hes+0.020
Ring+0.018
aver+0.018
Say+0.018
 bisc+0.018
Neg logit contributions
 disl-0.012
 means-0.012
 beings-0.012
flies-0.011
 sibling-0.011
Token the
Token ID262
Feature activation+3.38e-03

Pos logit contributions
 disl+0.018
 Majesty+0.015
 appearance+0.014
 means+0.014
 topics+0.014
Neg logit contributions
hes-0.025
 full-0.023
 bra-0.023
 tucked-0.023
oll-0.023
Token yard
Token ID12699
Feature activation+1.38e-02

Pos logit contributions
 disl+0.006
 Majesty+0.006
 endings+0.005
 fault+0.005
 appearance+0.005
Neg logit contributions
hes-0.004
 full-0.004
 tucked-0.004
 sn-0.004
 nick-0.004
Token and
Token ID290
Feature activation-8.42e-02

Pos logit contributions
 disl+0.025
 Majesty+0.025
 endings+0.022
 fault+0.021
 appearance+0.021
Neg logit contributions
 full-0.019
hes-0.018
 tucked-0.018
 sn-0.017
 called-0.016
Token into
Token ID656
Feature activation-2.31e-02

Pos logit contributions
 full+0.108
hes+0.107
 tucked+0.102
 nick+0.096
 sn+0.095
Neg logit contributions
 disl-0.159
 Majesty-0.154
 endings-0.136
 fault-0.132
idents-0.130
Token the
Token ID262
Feature activation-2.89e-03

Pos logit contributions
hes+0.035
 full+0.033
 tucked+0.033
 nick+0.030
aver+0.030
Neg logit contributions
 disl-0.038
 Majesty-0.036
 endings-0.032
 relationships-0.030
 purposes-0.030
Token street
Token ID4675
Feature activation-9.91e-03

Pos logit contributions
hes+0.005
 tucked+0.004
 full+0.004
 ped+0.004
 sn+0.004
Neg logit contributions
 disl-0.004
 Majesty-0.004
 endings-0.004
 property-0.004
 appearance-0.004
Token.
Token ID13
Feature activation-2.11e-02

Pos logit contributions
 full+0.015
 tucked+0.014
 called+0.013
 sn+0.013
hes+0.013
Neg logit contributions
 disl-0.018
 Majesty-0.017
 endings-0.016
 fault-0.015
 preferences-0.015
Token Suddenly
Token ID24975
Feature activation+4.29e-02

Pos logit contributions
 tucked+0.038
hes+0.037
aver+0.036
ak+0.036
 full+0.035
Neg logit contributions
 disl-0.026
 Majesty-0.025
idents-0.022
 means-0.020
 endings-0.020