TOP ACTIVATIONS
MAX = 5.887

that scared them : jelly fish . " Look
" D anger . Jelly fish . Do not swim ."
. She mixed purple jelly fish with a lot of different
must be careful . Jelly fish in the sea can be
and f enn el and an ise and car away and
no . She said jelly fish can sting and hurt .
better . They said jelly fish are not bad , they
together . They make the dancer . They look
big hug . The jelly fish was so happy and thanked
into it . The star fish said ,
't touch it . Jelly fish can sting you and make
with our time !" The dancer and G aby laughed and
" J elly fish are very interesting , but
no . He said jelly fish can hurt you . Lily
the fire . Their flowers bloom . Bloom means grow and
the water . The jelly fish was cool and Tim my
Trust me . The jelly fish is not a good friend
And that 's why Tues days are so important . <|endoftext|>
help me ! The jelly fish st ung me !" Mia
said . " J elly fish can hurt you . They

MIN ACTIVATIONS
MIN = -6.767

sund ae with lots of topp ings . Tim my was
sund ae with lots of topp ings . They
weighed at least two hundred pounds ! W opsy was so
spotted a row of books hel ves in her grandparents '
my had a long screw driver that he loved to play
has all the y ummy topp ings you like ."
had a quarrel about what topp ings to put on the
a hammer , a screw driver , and a saw .
the world and wanted Wh isk ers to come along .
saw him use a screw driver , a wrench , and
and took out a screw driver . Then he carefully unsc
" I have a screw driver !" Ben says . "
tightened it with a screw driver . The clock was now
room and used a screw driver to open it .
a hammer and a screw driver that he liked to store
a special treat for the chip m unk . She said
a hammer , a screw driver , and a saw .
a hammer and a screw driver . M olly
One day , Wh isk ers ' friend , a
, fifteen , sixteen , seventeen , eighteen , nineteen ,

INTERVAL 2.723 - 5.887
CONTAINS 0.339%

good . Can you help me make some icing ?" Jane
too soft . Squ ee ze it just fair . Do
and paper and made his own cartoon characters . He had
ed so beautifully ." The flower felt very happy and proud
said . " You hurt my doll ! Give it back

INTERVAL -0.440 - 2.723
CONTAINS 69.929%

coins and jewels spilled out on the ground . They looked
late . <|endoftext|> Frank ie was a smart 3 year old
didn 't have any hay to eat .
bird . It started to fly away fast . The loud
. " Le o 's doll has a loose head .

INTERVAL -3.604 - -0.440
CONTAINS 29.612%

the le opard was known as the brav est and most
would be fun ," Ben said . L ily
stubborn , which means he didn 't like to listen to
They never parted ways and the train became very light with
. One day , a dog came near the

INTERVAL -6.767 - -3.604
CONTAINS 0.121%

to nib ble at the crust of the sandwich . Jill
hard and share with others than to be lazy and alone
do more push - ups than most adults . Her friends
was a polite three - year - old . Whenever somebody
she noticed a 3 - year - old child sat by
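
Note on the interval ranges: the four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. This is inferred from the printed numbers and is not stated by the source. A minimal Python sketch of that reading follows; the helper name `equal_width_edges` is hypothetical, and the inputs are the rounded values displayed above, so the edges match only to within display rounding.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced bin edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as displayed in the dump (rounded to three decimals).
edges = equal_width_edges(-6.767, 5.887, 4)

# Interval boundaries as displayed in the dump, lowest to highest.
listed_edges = [-6.767, -3.604, -0.440, 2.723, 5.887]

# Largest gap between a computed edge and the displayed one.
max_gap = max(abs(e - l) for e, l in zip(edges, listed_edges))

# The CONTAINS percentages, lowest interval first.
shares = [0.121, 29.612, 69.929, 0.339]
total_share = sum(shares)
```

Under this reading, `max_gap` stays below the rounding step of the display and `total_share` comes out at roughly 100, which is consistent with the four bins covering every sampled activation.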
Token that
Token ID326
Feature activation+8.25e-01

Pos logit contributions
 trends+0.017
 Grace+0.016
 occasions+0.016
 Grac+0.016
lyss+0.016
Neg logit contributions
 supposed-0.010
 leaning-0.010
 Tom-0.008
 allergic-0.008
 lying-0.008
Token scared
Token ID12008
Feature activation+1.07e+00

Pos logit contributions
 trends+0.044
 occasions+0.041
lyss+0.041
 Grace+0.041
 Grac+0.040
Neg logit contributions
 supposed-0.035
 leaning-0.035
 Tom-0.033
 allergic-0.030
 done-0.030
Token them
Token ID606
Feature activation-1.41e+00

Pos logit contributions
 trends+0.050
lyss+0.048
 Grac+0.048
 Grace+0.048
 Flor+0.047
Neg logit contributions
 supposed-0.051
 leaning-0.050
 Tom-0.046
 allergic-0.045
aked-0.043
Token:
Token ID25
Feature activation+3.76e-01

Pos logit contributions
 supposed+0.048
 leaning+0.047
 allergic+0.039
 Tom+0.038
 done+0.035
Neg logit contributions
 trends-0.087
lyss-0.086
 Grace-0.086
 Grac-0.084
 occasions-0.083
Token jelly
Token ID27644
Feature activation+2.42e+00

Pos logit contributions
 trends+0.027
 occasions+0.025
 Grace+0.025
lyss+0.025
 Grac+0.024
Neg logit contributions
 leaning-0.010
 supposed-0.009
 Tom-0.009
 allergic-0.008
 greedy-0.007
Tokenfish
Token ID11084
Feature activation+5.28e+00

Pos logit contributions
 trends+0.066
 Grace+0.065
 Grac+0.062
lyss+0.059
 Nina+0.055
Neg logit contributions
 supposed-0.166
 leaning-0.163
 allergic-0.151
 done-0.151
 lying-0.150
Token.
Token ID13
Feature activation+6.55e-02

Pos logit contributions
 trends+0.313
lyss+0.299
 occasions+0.292
 Grace+0.291
 Grac+0.286
Neg logit contributions
 leaning-0.205
 supposed-0.201
 allergic-0.181
 lying-0.176
 Tom-0.170
Token\n
Token ID198
Feature activation-2.08e-01

Pos logit contributions
 trends+0.004
 occasions+0.003
 Grace+0.003
lyss+0.003
 Grac+0.003
Neg logit contributions
 leaning-0.002
 Tom-0.002
 supposed-0.002
 allergic-0.002
agged-0.002
Token\n
Token ID198
Feature activation-6.31e-01

Pos logit contributions
 supposed+0.009
 leaning+0.009
 Tom+0.008
aked+0.008
 allergic+0.008
Neg logit contributions
 trends-0.010
 Grace-0.010
 Grac-0.010
lyss-0.010
 Flor-0.010
Token"
Token ID1
Feature activation-7.63e-01

Pos logit contributions
 supposed+0.027
 leaning+0.027
 Tom+0.024
aked+0.024
 allergic+0.024
Neg logit contributions
 trends-0.031
 Grace-0.030
 Flor-0.030
 Grac-0.030
lyss-0.029
TokenLook
Token ID8567
Feature activation+2.69e-01

Pos logit contributions
 leaning+0.027
 supposed+0.027
 Tom+0.025
aked+0.024
 allergic+0.023
Neg logit contributions
 trends-0.044
 Grace-0.042
 Grac-0.041
lyss-0.040
 occasions-0.039
Token "
Token ID366
Feature activation+2.72e-01

Pos logit contributions
 leaning+0.012
 supposed+0.012
 Tom+0.010
 allergic+0.010
 done+0.010
Neg logit contributions
 trends-0.017
 Grace-0.017
 Grac-0.016
lyss-0.016
 occasions-0.016
TokenD
Token ID35
Feature activation+1.40e+00

Pos logit contributions
 trends+0.012
 Grace+0.012
 Grac+0.011
lyss+0.011
 Flor+0.011
Neg logit contributions
 supposed-0.014
 leaning-0.014
 allergic-0.012
aked-0.012
 Tom-0.012
Tokenanger
Token ID2564
Feature activation+1.79e+00

Pos logit contributions
 trends+0.062
 Grace+0.061
 Grac+0.059
lyss+0.058
 Flor+0.057
Neg logit contributions
 supposed-0.070
 leaning-0.068
 allergic-0.061
 Tom-0.059
aked-0.058
Token.
Token ID13
Feature activation+5.05e-01

Pos logit contributions
 trends+0.086
 Grace+0.085
lyss+0.083
 Grac+0.083
 Nina+0.081
Neg logit contributions
 supposed-0.082
 leaning-0.080
 Tom-0.071
 allergic-0.068
aked-0.068
Token Jelly
Token ID37152
Feature activation+2.67e+00

Pos logit contributions
 trends+0.025
fish+0.025
 Grace+0.024
 occasions+0.024
lyss+0.024
Neg logit contributions
 Tom-0.021
 supposed-0.020
 leaning-0.020
aked-0.018
 thieves-0.017
Tokenfish
Token ID11084
Feature activation+5.22e+00

Pos logit contributions
 Grac+0.123
 Grace+0.122
 trends+0.119
 occasions+0.111
 Flor+0.111
Neg logit contributions
 supposed-0.132
 leaning-0.128
 Tom-0.117
 done-0.113
 allergic-0.109
Token.
Token ID13
Feature activation+6.61e-01

Pos logit contributions
 trends+0.254
 Grace+0.252
 Grac+0.247
lyss+0.244
 Nina+0.238
Neg logit contributions
 supposed-0.253
 leaning-0.245
 done-0.227
 allergic-0.220
 thieves-0.214
Token Do
Token ID2141
Feature activation+9.36e-01

Pos logit contributions
 trends+0.034
 occasions+0.033
 Grace+0.032
fish+0.032
 outfits+0.032
Neg logit contributions
 Tom-0.029
 supposed-0.024
aked-0.023
 leaning-0.023
 greedy-0.022
Token not
Token ID407
Feature activation-1.83e+00

Pos logit contributions
 trends+0.053
 Grace+0.052
 Grac+0.051
lyss+0.050
 Nina+0.050
Neg logit contributions
 supposed-0.036
 leaning-0.035
 Tom-0.030
 allergic-0.030
aked-0.028
Token swim
Token ID9422
Feature activation+1.86e+00

Pos logit contributions
 supposed+0.084
 leaning+0.083
 Tom+0.074
 allergic+0.072
 thieves+0.070
Neg logit contributions
 trends-0.092
 Grace-0.090
 Grac-0.086
lyss-0.086
 Nina-0.084
Token."
Token ID526
Feature activation+9.77e-01

Pos logit contributions
 Grace+0.110
 Grac+0.109
 trends+0.108
lyss+0.105
 occasions+0.105
Neg logit contributions
 supposed-0.066
 leaning-0.062
 Tom-0.054
aked-0.053
 done-0.053
Token.
Token ID13
Feature activation+7.25e-02

Pos logit contributions
 Grace+0.097
 trends+0.096
 Grac+0.094
 Nina+0.092
 Flor+0.091
Neg logit contributions
 supposed-0.072
 leaning-0.069
 allergic-0.061
aked-0.059
agged-0.059
Token She
Token ID1375
Feature activation+5.63e-01

Pos logit contributions
 trends+0.004
lyss+0.004
fish+0.003
 coral+0.003
 Grac+0.003
Neg logit contributions
 supposed-0.003
 leaning-0.003
 Tom-0.003
 allergic-0.003
aked-0.003
Token mixed
Token ID7668
Feature activation+1.29e+00

Pos logit contributions
 trends+0.027
 Grace+0.027
 Grac+0.026
lyss+0.026
 Flor+0.026
Neg logit contributions
 supposed-0.026
 leaning-0.025
 allergic-0.022
aked-0.021
 Tom-0.021
Token purple
Token ID14032
Feature activation+8.98e-01

Pos logit contributions
 trends+0.064
 Grace+0.064
 Grac+0.062
lyss+0.061
 Flor+0.060
Neg logit contributions
 supposed-0.057
 leaning-0.055
 allergic-0.048
aked-0.046
 Tom-0.046
Token jelly
Token ID27644
Feature activation+2.36e+00

Pos logit contributions
 trends+0.054
lyss+0.054
 Grace+0.053
 Nina+0.052
 Flor+0.051
Neg logit contributions
 leaning-0.027
 supposed-0.026
 chocolate-0.025
 Tom-0.023
aked-0.023
Tokenfish
Token ID11084
Feature activation+5.09e+00

Pos logit contributions
 trends+0.138
 Grace+0.137
lyss+0.136
 Isn+0.132
 Nina+0.131
Neg logit contributions
 supposed-0.076
 leaning-0.072
 Tom-0.066
aked-0.065
hes-0.064
Token with
Token ID351
Feature activation+1.30e+00

Pos logit contributions
lyss+0.313
 trends+0.311
 Isn+0.300
 Grace+0.297
 occasions+0.292
Neg logit contributions
 supposed-0.153
 leaning-0.153
 Tom-0.135
 done-0.129
 hard-0.128
Token a
Token ID257
Feature activation+5.60e-01

Pos logit contributions
 Grace+0.067
 trends+0.066
 Grac+0.065
lyss+0.063
 Flor+0.063
Neg logit contributions
 supposed-0.055
 leaning-0.054
 allergic-0.047
aked-0.045
agged-0.045
Token lot
Token ID1256
Feature activation-1.27e+00

Pos logit contributions
 trends+0.035
 Grace+0.034
 occasions+0.034
 Flor+0.033
lyss+0.033
Neg logit contributions
 leaning-0.017
 supposed-0.016
 allergic-0.014
ige-0.013
 chocolate-0.013
Token of
Token ID286
Feature activation-1.51e+00

Pos logit contributions
 supposed+0.045
 leaning+0.042
 Tom+0.036
aked+0.035
 allergic+0.035
Neg logit contributions
 Grace-0.074
 trends-0.074
lyss-0.072
 Grac-0.071
 Flor-0.070
Token different
Token ID1180
Feature activation-6.55e-01

Pos logit contributions
 supposed+0.052
 leaning+0.051
 allergic+0.045
 Tom+0.044
 chocolate+0.044
Neg logit contributions
 trends-0.088
 Grace-0.086
lyss-0.085
 Flor-0.083
 Grac-0.082
Token must
Token ID1276
Feature activation+4.22e-01

Pos logit contributions
 trends+0.098
 Grace+0.092
 occasions+0.091
lyss+0.091
 Flor+0.089
Neg logit contributions
 supposed-0.093
 leaning-0.085
 allergic-0.080
 Tom-0.079
 done-0.079
Token be
Token ID307
Feature activation+4.05e-01

Pos logit contributions
 trends+0.021
 Grace+0.021
 Grac+0.020
lyss+0.020
 Isn+0.020
Neg logit contributions
 supposed-0.019
 leaning-0.018
 done-0.016
 allergic-0.016
 Tom-0.016
Token careful
Token ID8161
Feature activation+1.06e+00

Pos logit contributions
 trends+0.025
 Grace+0.025
 Grac+0.024
lyss+0.024
 Flor+0.024
Neg logit contributions
 supposed-0.013
 leaning-0.013
 allergic-0.011
aked-0.010
 Tom-0.010
Token.
Token ID13
Feature activation+5.27e-01

Pos logit contributions
 Grace+0.049
 trends+0.049
 Grac+0.047
 Nina+0.046
lyss+0.046
Neg logit contributions
 supposed-0.048
 leaning-0.046
aked-0.043
agged-0.042
 allergic-0.042
Token Jelly
Token ID37152
Feature activation+2.25e+00

Pos logit contributions
 trends+0.027
 outfits+0.025
fish+0.025
 coral+0.025
 Grace+0.024
Neg logit contributions
 leaning-0.023
 supposed-0.023
 Tom-0.022
itt-0.020
 allergic-0.020
Tokenfish
Token ID11084
Feature activation+5.06e+00

Pos logit contributions
 trends+0.062
 Grace+0.059
 Grac+0.057
 occasions+0.052
 Nina+0.048
Neg logit contributions
 supposed-0.152
 leaning-0.144
 done-0.142
 greedy-0.139
 allergic-0.139
Token in
Token ID287
Feature activation+1.27e+00

Pos logit contributions
 trends+0.259
lyss+0.248
 Grac+0.243
 Grace+0.242
 Nina+0.241
Neg logit contributions
 supposed-0.235
 leaning-0.229
 done-0.221
 allergic-0.212
ow-0.201
Token the
Token ID262
Feature activation-3.17e-01

Pos logit contributions
 trends+0.073
 Grace+0.068
lyss+0.068
 Grac+0.067
 Nina+0.066
Neg logit contributions
 supposed-0.050
 leaning-0.048
 allergic-0.044
 Tom-0.043
 thieves-0.041
Token sea
Token ID5417
Feature activation+1.27e+00

Pos logit contributions
 leaning+0.013
 supposed+0.012
 done+0.012
 allergic+0.012
 greedy+0.012
Neg logit contributions
 trends-0.017
 Grac-0.017
 Grace-0.017
 Tracy-0.016
lyss-0.015
Token can
Token ID460
Feature activation+1.26e+00

Pos logit contributions
 trends+0.068
 Grace+0.067
 Grac+0.067
lyss+0.066
 Nina+0.063
Neg logit contributions
 leaning-0.051
 supposed-0.051
 done-0.049
 greedy-0.045
 Tom-0.044
Token be
Token ID307
Feature activation+4.68e-01

Pos logit contributions
 trends+0.079
lyss+0.074
 Isn+0.072
 outfits+0.072
 Grace+0.071
Neg logit contributions
 supposed-0.041
 leaning-0.038
ow-0.038
 Tom-0.037
 allergic-0.036
Token and
Token ID290
Feature activation+2.20e+00

Pos logit contributions
 trends+0.191
 coral+0.186
 Grac+0.185
 Grace+0.184
lyss+0.182
Neg logit contributions
 leaning-0.121
 supposed-0.117
 Tom-0.101
 done-0.099
aged-0.097
Token f
Token ID277
Feature activation+3.67e+00

Pos logit contributions
 trends+0.114
 Grace+0.107
lyss+0.107
 Grac+0.106
 occasions+0.104
Neg logit contributions
 leaning-0.097
 supposed-0.092
 Tom-0.087
 thieves-0.083
 done-0.083
Tokenenn
Token ID1697
Feature activation+3.42e+00

Pos logit contributions
 Grace+0.220
 trends+0.218
 Flor+0.209
 Grac+0.209
lyss+0.209
Neg logit contributions
 supposed-0.126
 leaning-0.119
aked-0.115
agged-0.107
aged-0.103
Tokenel
Token ID417
Feature activation+3.55e-01

Pos logit contributions
 trends+0.201
 Grace+0.193
lyss+0.192
 Grac+0.187
 Flor+0.186
Neg logit contributions
 supposed-0.122
 leaning-0.120
aked-0.110
 Tom-0.102
agged-0.101
Token and
Token ID290
Feature activation+2.30e+00

Pos logit contributions
 trends+0.018
lyss+0.018
 Grace+0.017
 Grac+0.017
 coral+0.017
Neg logit contributions
 leaning-0.015
 supposed-0.015
aked-0.014
aged-0.014
 done-0.013
Token an
Token ID281
Feature activation+5.03e+00

Pos logit contributions
 trends+0.123
lyss+0.116
 Grace+0.116
 Grac+0.113
 occasions+0.113
Neg logit contributions
 leaning-0.099
 supposed-0.093
 Tom-0.089
 thieves-0.084
 done-0.084
Tokenise
Token ID786
Feature activation-1.30e-01

Pos logit contributions
 trends+0.277
 Grace+0.273
 Grac+0.267
lyss+0.266
 Flor+0.265
Neg logit contributions
 supposed-0.199
 leaning-0.189
aked-0.163
 allergic-0.162
 Tom-0.158
Token and
Token ID290
Feature activation+2.42e+00

Pos logit contributions
 supposed+0.005
 leaning+0.005
aked+0.005
 Tom+0.005
aged+0.004
Neg logit contributions
 trends-0.007
lyss-0.007
 Grace-0.007
 Grac-0.007
 coral-0.007
Token car
Token ID1097
Feature activation+1.96e+00

Pos logit contributions
 trends+0.130
lyss+0.122
 Grace+0.121
 occasions+0.119
 Grac+0.119
Neg logit contributions
 leaning-0.103
 supposed-0.096
 Tom-0.094
 thieves-0.088
 done-0.085
Tokenaway
Token ID8272
Feature activation+2.89e+00

Pos logit contributions
 trends+0.112
 Grace+0.111
 Grac+0.109
 coral+0.109
lyss+0.109
Neg logit contributions
 leaning-0.069
 supposed-0.068
aked-0.062
 Tom-0.061
aged-0.060
Token and
Token ID290
Feature activation+2.45e+00

Pos logit contributions
 trends+0.159
 Grace+0.154
lyss+0.151
 Grac+0.150
 coral+0.149
Neg logit contributions
 leaning-0.106
 supposed-0.103
aged-0.102
aked-0.100
 Tom-0.098
Token no
Token ID645
Feature activation+2.58e-01

Pos logit contributions
 supposed+0.034
 leaning+0.033
 allergic+0.029
 Tom+0.028
aked+0.028
Neg logit contributions
 trends-0.031
 Grace-0.031
 Grac-0.030
lyss-0.029
 Flor-0.029
Token.
Token ID13
Feature activation-1.60e-01

Pos logit contributions
 trends+0.014
 Grace+0.014
 Grac+0.014
lyss+0.014
 Nina+0.014
Neg logit contributions
 supposed-0.010
 leaning-0.009
 allergic-0.008
aked-0.008
agged-0.008
Token She
Token ID1375
Feature activation+7.93e-01

Pos logit contributions
 leaning+0.007
 supposed+0.007
 Tom+0.006
aked+0.006
 allergic+0.006
Neg logit contributions
 trends-0.008
 Grace-0.008
 Grac-0.007
lyss-0.007
 coral-0.007
Token said
Token ID531
Feature activation-4.03e-01

Pos logit contributions
 trends+0.038
 Grace+0.037
 Grac+0.036
lyss+0.035
 coral+0.035
Neg logit contributions
 supposed-0.037
 leaning-0.036
 allergic-0.032
 Tom-0.032
aked-0.032
Token jelly
Token ID27644
Feature activation+2.17e+00

Pos logit contributions
 leaning+0.019
 supposed+0.019
 Tom+0.017
 allergic+0.017
aked+0.016
Neg logit contributions
 trends-0.019
 Grace-0.018
 Grac-0.017
lyss-0.017
 occasions-0.017
Tokenfish
Token ID11084
Feature activation+4.97e+00

Pos logit contributions
 trends+0.057
 Grace+0.054
lyss+0.050
 Grac+0.050
 occasions+0.050
Neg logit contributions
 supposed-0.143
 leaning-0.134
 done-0.133
aked-0.132
 Tom-0.132
Token can
Token ID460
Feature activation+9.41e-01

Pos logit contributions
 trends+0.217
lyss+0.205
 Grace+0.201
 Grac+0.199
 occasions+0.199
Neg logit contributions
 supposed-0.253
 leaning-0.248
 Tom-0.227
 allergic-0.224
 done-0.219
Token sting
Token ID24276
Feature activation+2.05e+00

Pos logit contributions
 trends+0.056
lyss+0.054
 Grace+0.053
 coral+0.052
 occasions+0.052
Neg logit contributions
 leaning-0.033
 supposed-0.033
 Tom-0.030
 allergic-0.029
ow-0.026
Token and
Token ID290
Feature activation+4.61e-01

Pos logit contributions
 trends+0.107
lyss+0.103
 Grac+0.097
 Isn+0.096
 Grace+0.094
Neg logit contributions
 Tom-0.086
 supposed-0.085
 leaning-0.085
aked-0.078
 allergic-0.077
Token hurt
Token ID5938
Feature activation+9.90e-01

Pos logit contributions
 trends+0.025
lyss+0.024
maid+0.022
 occasions+0.022
 Grac+0.022
Neg logit contributions
 Tom-0.020
 leaning-0.019
 supposed-0.018
 allergic-0.017
 angry-0.017
Token.
Token ID13
Feature activation+4.65e-02

Pos logit contributions
 trends+0.054
lyss+0.052
 Grace+0.050
 Grac+0.049
 Isn+0.049
Neg logit contributions
 supposed-0.039
 leaning-0.039
 Tom-0.037
 allergic-0.034
aked-0.032
Token better
Token ID1365
Feature activation+1.05e+00

Pos logit contributions
 trends+0.047
lyss+0.046
 Grace+0.046
 Grac+0.045
 Isn+0.044
Neg logit contributions
 supposed-0.037
 leaning-0.037
 allergic-0.033
 Tom-0.032
eping-0.031
Token.
Token ID13
Feature activation-3.29e-02

Pos logit contributions
 Grace+0.062
 trends+0.062
lyss+0.061
 Grac+0.061
 Flor+0.059
Neg logit contributions
 supposed-0.037
 leaning-0.036
 Tom-0.030
 allergic-0.030
 done-0.029
Token They
Token ID1119
Feature activation-3.09e-01

Pos logit contributions
 Tom+0.001
 leaning+0.001
 supposed+0.001
 late+0.001
 Jerry+0.001
Neg logit contributions
fish-0.002
 trends-0.002
maid-0.002
 occasions-0.002
lyss-0.002
Token said
Token ID531
Feature activation-2.90e-01

Pos logit contributions
 supposed+0.015
 leaning+0.014
 Tom+0.013
 allergic+0.013
aked+0.012
Neg logit contributions
 trends-0.014
 Grace-0.014
lyss-0.014
 Grac-0.014
 Flor-0.013
Token jelly
Token ID27644
Feature activation+2.12e+00

Pos logit contributions
 supposed+0.015
 leaning+0.015
 Tom+0.014
 allergic+0.014
aked+0.013
Neg logit contributions
 trends-0.012
 Grace-0.011
lyss-0.011
 Grac-0.011
 Flor-0.011
Tokenfish
Token ID11084
Feature activation+4.91e+00

Pos logit contributions
 trends+0.076
 Grace+0.075
 Isn+0.073
 Grac+0.072
lyss+0.071
Neg logit contributions
 supposed-0.113
 Tom-0.109
 done-0.108
aked-0.107
oni-0.106
Token are
Token ID389
Feature activation+3.66e-01

Pos logit contributions
 trends+0.231
lyss+0.223
 Grac+0.219
 Grace+0.218
 Isn+0.216
Neg logit contributions
 supposed-0.221
 leaning-0.214
 Tom-0.209
ow-0.199
 done-0.199
Token not
Token ID407
Feature activation-2.60e+00

Pos logit contributions
 Grace+0.022
 trends+0.022
 Grac+0.021
lyss+0.021
 Flor+0.021
Neg logit contributions
 supposed-0.013
 leaning-0.012
 allergic-0.010
aked-0.010
agged-0.010
Token bad
Token ID2089
Feature activation-1.44e+00

Pos logit contributions
 leaning+0.095
 supposed+0.095
aked+0.081
agged+0.080
 allergic+0.078
Neg logit contributions
 Grace-0.146
 trends-0.143
 Grac-0.142
 Nina-0.139
 Flor-0.138
Token,
Token ID11
Feature activation-2.97e-01

Pos logit contributions
 supposed+0.057
 leaning+0.055
 allergic+0.048
 Tom+0.047
aked+0.046
Neg logit contributions
 trends-0.078
 Grace-0.078
 Grac-0.076
lyss-0.075
 Flor-0.074
Token they
Token ID484
Feature activation+1.09e-01

Pos logit contributions
 supposed+0.014
 leaning+0.014
 Tom+0.014
 allergic+0.013
 greedy+0.012
Neg logit contributions
 trends-0.014
 Grace-0.013
lyss-0.013
 Grac-0.013
 Flor-0.013
Token together
Token ID1978
Feature activation+1.72e+00

Pos logit contributions
 supposed+0.011
 leaning+0.010
 allergic+0.010
agged+0.009
aked+0.009
Neg logit contributions
 trends-0.013
 Grace-0.013
 Grac-0.013
 Flor-0.013
 dance-0.013
Token.
Token ID13
Feature activation+7.68e-02

Pos logit contributions
 Grace+0.084
 trends+0.084
 Flor+0.082
 coral+0.080
 Grac+0.080
Neg logit contributions
 supposed-0.076
 leaning-0.070
 allergic-0.065
agged-0.064
aked-0.064
Token They
Token ID1119
Feature activation-3.06e-01

Pos logit contributions
 trends+0.004
 Grace+0.004
 Grac+0.003
lyss+0.003
 Flor+0.003
Neg logit contributions
 supposed-0.004
 leaning-0.004
 Tom-0.003
 allergic-0.003
aked-0.003
Token make
Token ID787
Feature activation+1.49e+00

Pos logit contributions
 leaning+0.014
 supposed+0.013
 Tom+0.013
 done+0.012
hes+0.012
Neg logit contributions
lyss-0.015
 trends-0.015
 Grace-0.014
 occasions-0.014
 Isn-0.014
Token the
Token ID262
Feature activation+1.36e+00

Pos logit contributions
 Grace+0.083
 trends+0.083
 Grac+0.081
 Flor+0.081
 Nina+0.080
Neg logit contributions
 supposed-0.056
 leaning-0.053
 allergic-0.046
aked-0.045
agged-0.045
Token dancer
Token ID38619
Feature activation+4.86e+00

Pos logit contributions
 trends+0.064
 Grace+0.063
lyss+0.061
 Grac+0.061
 Nina+0.059
Neg logit contributions
 leaning-0.064
 supposed-0.064
 allergic-0.056
 Tom-0.055
aked-0.053
Token.
Token ID13
Feature activation+2.39e-01

Pos logit contributions
 trends+0.236
lyss+0.233
 Grace+0.231
 Isn+0.226
 Grac+0.224
Neg logit contributions
 leaning-0.216
 supposed-0.215
 allergic-0.189
 Tom-0.188
 done-0.183
Token\n
Token ID198
Feature activation+8.00e-02

Pos logit contributions
 trends+0.013
lyss+0.012
 Grace+0.012
 Grac+0.012
 occasions+0.012
Neg logit contributions
 leaning-0.010
 supposed-0.010
 Tom-0.009
aked-0.008
hes-0.008
Token\n
Token ID198
Feature activation-3.98e-01

Pos logit contributions
 trends+0.004
 Grace+0.004
lyss+0.004
 Grac+0.004
 Nina+0.004
Neg logit contributions
 supposed-0.004
 leaning-0.003
 Tom-0.003
 allergic-0.003
aked-0.003
TokenThey
Token ID2990
Feature activation-1.57e+00

Pos logit contributions
 supposed+0.018
 leaning+0.018
 Tom+0.016
aked+0.016
 allergic+0.015
Neg logit contributions
 trends-0.019
 Grace-0.019
 Grac-0.019
 Nina-0.018
lyss-0.018
Token look
Token ID804
Feature activation+8.88e-01

Pos logit contributions
 leaning+0.069
 Tom+0.066
 supposed+0.065
 done+0.062
hy+0.059
Neg logit contributions
 trends-0.076
lyss-0.076
 Grace-0.074
 occasions-0.074
 Isn-0.074
Token big
Token ID1263
Feature activation-4.88e-01

Pos logit contributions
 trends+0.093
 occasions+0.089
 Grace+0.088
 Flor+0.086
 Grac+0.085
Neg logit contributions
 supposed-0.039
 leaning-0.037
 hard-0.033
 allergic-0.031
 done-0.031
Token hug
Token ID16225
Feature activation+7.58e-01

Pos logit contributions
 leaning+0.021
 supposed+0.020
 Tom+0.019
 allergic+0.018
 hard+0.018
Neg logit contributions
 trends-0.025
 Grace-0.023
 occasions-0.023
 Grac-0.022
 Flor-0.022
Token.
Token ID13
Feature activation+9.45e-03

Pos logit contributions
 trends+0.041
 Grace+0.041
 Grac+0.039
lyss+0.039
 Nina+0.038
Neg logit contributions
 supposed-0.030
 leaning-0.029
 allergic-0.025
aked-0.025
agged-0.024
Token The
Token ID383
Feature activation+1.01e+00

Pos logit contributions
 trends+0.001
 Grace+0.000
 occasions+0.000
 Grac+0.000
lyss+0.000
Neg logit contributions
 leaning-0.000
 supposed-0.000
 Tom-0.000
aked-0.000
 allergic-0.000
Token jelly
Token ID27644
Feature activation+2.08e+00

Pos logit contributions
 trends+0.045
 Grace+0.042
 Grac+0.041
 occasions+0.041
 Isn+0.040
Neg logit contributions
 leaning-0.048
 supposed-0.047
 allergic-0.045
 greedy-0.044
 pain-0.043
Tokenfish
Token ID11084
Feature activation+4.86e+00

Pos logit contributions
 trends+0.067
 Isn+0.060
 Grace+0.058
 Grac+0.055
 occasions+0.054
Neg logit contributions
 done-0.122
 leaning-0.122
aged-0.118
aver-0.117
aked-0.117
Token was
Token ID373
Feature activation+1.66e-01

Pos logit contributions
 trends+0.283
 occasions+0.269
 Isn+0.259
 conclusions+0.257
 Nina+0.246
Neg logit contributions
 leaning-0.178
 done-0.170
aged-0.160
 gone-0.158
 eaten-0.158
Token so
Token ID523
Feature activation+1.37e+00

Pos logit contributions
 Grace+0.008
 trends+0.007
 Grac+0.007
lyss+0.007
 Flor+0.007
Neg logit contributions
 supposed-0.008
 leaning-0.007
aked-0.007
 allergic-0.007
agged-0.007
Token happy
Token ID3772
Feature activation-5.66e-01

Pos logit contributions
 trends+0.082
 Grace+0.080
 Grac+0.078
 occasions+0.078
lyss+0.077
Neg logit contributions
 supposed-0.049
 leaning-0.047
 allergic-0.040
 Tom-0.039
aked-0.037
Token and
Token ID290
Feature activation-4.18e-01

Pos logit contributions
 leaning+0.019
 supposed+0.019
 Tom+0.017
 enough+0.015
 lying+0.015
Neg logit contributions
 trends-0.035
 Grac-0.033
 Grace-0.033
lyss-0.033
 Flor-0.032
Token thanked
Token ID26280
Feature activation-2.38e-01

Pos logit contributions
 leaning+0.017
 supposed+0.016
 Tom+0.016
 done+0.014
aged+0.014
Neg logit contributions
 occasions-0.023
 trends-0.023
 Grac-0.021
 outfits-0.021
 conclusions-0.021
Token into
Token ID656
Feature activation+1.91e+00

Pos logit contributions
 Grace+0.026
 Grac+0.025
 Nina+0.024
 Flor+0.024
 trends+0.024
Neg logit contributions
 supposed-0.033
 leaning-0.032
 allergic-0.030
agged-0.029
aked-0.029
Token it
Token ID340
Feature activation+4.32e-01

Pos logit contributions
 trends+0.117
 Grace+0.115
lyss+0.112
 Grac+0.112
 Flor+0.110
Neg logit contributions
 supposed-0.063
 leaning-0.061
 allergic-0.051
 Tom-0.050
aked-0.048
Token.
Token ID13
Feature activation+6.66e-02

Pos logit contributions
 trends+0.025
 Grace+0.025
lyss+0.025
 Grac+0.024
 Flor+0.024
Neg logit contributions
 supposed-0.015
 leaning-0.015
 allergic-0.013
 Tom-0.012
aked-0.012
Token The
Token ID383
Feature activation+9.10e-01

Pos logit contributions
 trends+0.004
 outfits+0.004
 occasions+0.004
maid+0.004
fish+0.004
Neg logit contributions
 Tom-0.002
 leaning-0.002
 supposed-0.002
 Jerry-0.002
 allergic-0.002
Token star
Token ID3491
Feature activation+3.50e+00

Pos logit contributions
 trends+0.039
 Grace+0.038
 Grac+0.037
 occasions+0.036
 Isn+0.035
Neg logit contributions
 leaning-0.046
 supposed-0.044
 Tom-0.040
 allergic-0.040
 greedy-0.040
Tokenfish
Token ID11084
Feature activation+4.84e+00

Pos logit contributions
 trends+0.112
 Grace+0.103
 occasions+0.099
 Flor+0.098
 Grac+0.096
Neg logit contributions
 leaning-0.223
 supposed-0.213
eping-0.205
aged-0.201
aked-0.200
Token said
Token ID531
Feature activation-4.63e-01

Pos logit contributions
 trends+0.280
 occasions+0.267
lyss+0.256
 Grace+0.255
 Tracy+0.252
Neg logit contributions
 leaning-0.199
 supposed-0.174
aged-0.168
 done-0.166
eping-0.166
Token,
Token ID11
Feature activation+5.39e-01

Pos logit contributions
 leaning+0.017
 supposed+0.015
 Tom+0.015
 allergic+0.014
 enough+0.013
Neg logit contributions
 trends-0.028
 occasions-0.026
 Grace-0.025
 outfits-0.025
lyss-0.025
Token
Token ID6184
Feature activation+1.71e+00

Pos logit contributions
 trends+0.031
 Grace+0.029
 occasions+0.029
lyss+0.029
 Grac+0.028
Neg logit contributions
 leaning-0.020
 supposed-0.020
 Tom-0.018
 allergic-0.017
aged-0.015
Token
Token ID95
Feature activation+2.77e+00

Pos logit contributions
 Grace+0.074
 trends+0.073
 Grac+0.072
 occasions+0.069
 outfits+0.068
Neg logit contributions
 leaning-0.084
 supposed-0.082
 Tom-0.075
hy-0.074
agged-0.073
Token
Token ID26391
Feature activation-4.93e-01

Pos logit contributions
 Grace+0.147
 Grac+0.145
 trends+0.143
 Nina+0.137
 Flor+0.135
Neg logit contributions
 supposed-0.118
 leaning-0.112
aked-0.097
agged-0.097
 Tom-0.097
Token't
Token ID470
Feature activation-2.97e-01

Pos logit contributions
 supposed+0.073
 leaning+0.070
 allergic+0.061
aked+0.061
agged+0.059
Neg logit contributions
 trends-0.084
 Grace-0.084
 Grac-0.081
lyss-0.080
 Flor-0.079
Token touch
Token ID3638
Feature activation+5.12e-01

Pos logit contributions
 supposed+0.013
 leaning+0.013
 Tom+0.012
 allergic+0.012
 thieves+0.011
Neg logit contributions
 trends-0.015
lyss-0.015
 Grace-0.015
 Grac-0.014
 Nina-0.014
Token it
Token ID340
Feature activation-3.71e-01

Pos logit contributions
 trends+0.029
lyss+0.027
 Grace+0.027
 occasions+0.026
 Grac+0.026
Neg logit contributions
 leaning-0.020
 supposed-0.019
 Tom-0.018
 allergic-0.018
 enough-0.018
Token.
Token ID13
Feature activation+2.61e-01

Pos logit contributions
 supposed+0.016
 leaning+0.015
 allergic+0.014
 Tom+0.013
aked+0.013
Neg logit contributions
 trends-0.019
 Grace-0.019
lyss-0.019
 Grac-0.018
 Nina-0.018
Token Jelly
Token ID37152
Feature activation+2.02e+00

Pos logit contributions
 trends+0.016
lyss+0.015
 occasions+0.015
 outfits+0.015
 Grace+0.014
Neg logit contributions
 Tom-0.009
 leaning-0.009
 supposed-0.007
ow-0.007
 greedy-0.007
Tokenfish
Token ID11084
Feature activation+4.83e+00

Pos logit contributions
 trends+0.059
 Grace+0.056
 Grac+0.053
lyss+0.049
 occasions+0.049
Neg logit contributions
 supposed-0.131
 leaning-0.127
 done-0.125
 greedy-0.123
 Tom-0.122
Token can
Token ID460
Feature activation+1.25e+00

Pos logit contributions
 trends+0.256
lyss+0.253
 Grace+0.244
 Nina+0.235
 Grac+0.235
Neg logit contributions
 supposed-0.213
 leaning-0.209
 done-0.204
 allergic-0.199
 mad-0.188
Token sting
Token ID24276
Feature activation+2.13e+00

Pos logit contributions
 trends+0.086
lyss+0.081
 occasions+0.080
 Grace+0.079
 outfits+0.078
Neg logit contributions
 supposed-0.035
 leaning-0.034
 Tom-0.031
 allergic-0.031
 mad-0.027
Token you
Token ID345
Feature activation+2.05e+00

Pos logit contributions
 trends+0.102
lyss+0.097
 Grac+0.088
 Grace+0.086
 pir+0.085
Neg logit contributions
 Tom-0.104
 supposed-0.102
 leaning-0.100
 thieves-0.094
aked-0.094
Token and
Token ID290
Feature activation+1.15e-01

Pos logit contributions
 trends+0.117
lyss+0.115
 Grace+0.112
 Grac+0.109
 Nina+0.108
Neg logit contributions
 supposed-0.081
 leaning-0.078
 allergic-0.071
 Tom-0.067
 done-0.064
Token make
Token ID787
Feature activation+1.04e+00

Pos logit contributions
 trends+0.006
lyss+0.006
 Grac+0.006
 Grace+0.006
 Tracy+0.005
Neg logit contributions
 Tom-0.005
 leaning-0.005
 supposed-0.005
 allergic-0.005
 greedy-0.005
Token with
Token ID351
Feature activation-1.68e-01

Pos logit contributions
lyss+0.121
 trends+0.116
 Grac+0.115
 Flor+0.111
 Grace+0.110
Neg logit contributions
 supposed-0.084
 leaning-0.079
 enough-0.076
 Tom-0.076
 done-0.071
Token our
Token ID674
Feature activation-9.49e-02

Pos logit contributions
 supposed+0.008
 leaning+0.007
 Tom+0.007
 allergic+0.007
 thieves+0.006
Neg logit contributions
 trends-0.008
lyss-0.008
 Isn-0.008
 Grac-0.008
 Grace-0.008
Token time
Token ID640
Feature activation-1.64e+00

Pos logit contributions
 supposed+0.004
 leaning+0.004
 allergic+0.003
 Tom+0.003
 thieves+0.003
Neg logit contributions
 trends-0.005
lyss-0.005
 Grace-0.005
 Grac-0.005
 Nina-0.005
Token!"
Token ID2474
Feature activation+4.51e-01

Pos logit contributions
 supposed+0.083
 leaning+0.079
 Tom+0.071
 done+0.071
 allergic+0.068
Neg logit contributions
 trends-0.071
lyss-0.071
 Flor-0.070
 Grac-0.070
 Grace-0.069
Token The
Token ID383
Feature activation+9.62e-01

Pos logit contributions
 trends+0.024
 occasions+0.022
lyss+0.022
fish+0.022
 Grace+0.022
Neg logit contributions
 leaning-0.018
 Tom-0.018
 supposed-0.017
aked-0.016
aged-0.015
Token dancer
Token ID38619
Feature activation+4.81e+00

Pos logit contributions
 trends+0.045
 Isn+0.044
 reefs+0.042
 sometime+0.042
 occasions+0.041
Neg logit contributions
 supposed-0.043
 leaning-0.043
 allergic-0.040
 pain-0.039
 angry-0.038
Token and
Token ID290
Feature activation+4.97e-01

Pos logit contributions
 trends+0.254
 coral+0.241
 occasions+0.239
 conclusions+0.239
 Isn+0.238
Neg logit contributions
 leaning-0.193
 Tom-0.182
 supposed-0.173
aged-0.173
 done-0.171
Token G
Token ID402
Feature activation+1.13e-02

Pos logit contributions
 trends+0.025
 Isn+0.022
 occasions+0.022
lyss+0.022
 reefs+0.022
Neg logit contributions
 leaning-0.022
 supposed-0.021
 Tom-0.021
 allergic-0.019
aged-0.019
Tokenaby
Token ID3930
Feature activation+2.17e-02

Pos logit contributions
 trends+0.000
 coral+0.000
lyss+0.000
 Isn+0.000
 reefs+0.000
Neg logit contributions
 supposed-0.001
 leaning-0.001
agged-0.001
aked-0.001
 allergic-0.001
Token laughed
Token ID13818
Feature activation+2.38e-01

Pos logit contributions
 trends+0.001
 Grace+0.001
lyss+0.001
 Grac+0.001
 Nina+0.001
Neg logit contributions
 supposed-0.001
 leaning-0.001
aked-0.001
 Tom-0.001
aged-0.001
Token and
Token ID290
Feature activation+3.75e-01

Pos logit contributions
 trends+0.013
lyss+0.012
 Grace+0.012
 Grac+0.012
 occasions+0.012
Neg logit contributions
 leaning-0.009
 supposed-0.009
 Tom-0.008
 allergic-0.008
aked-0.008
Token\n
Token ID198
Feature activation+2.67e-02

Pos logit contributions
 trends+0.003
lyss+0.002
 occasions+0.002
 Grace+0.002
 Grac+0.002
Neg logit contributions
 Tom-0.002
 leaning-0.002
 supposed-0.002
 allergic-0.002
agged-0.001
Token\n
Token ID198
Feature activation-3.95e-01

Pos logit contributions
 trends+0.001
 Grace+0.001
lyss+0.001
 Grac+0.001
 Flor+0.001
Neg logit contributions
 supposed-0.001
 leaning-0.001
 Tom-0.001
 allergic-0.001
aked-0.001
Token"
Token ID1
Feature activation-5.78e-01

Pos logit contributions
 supposed+0.017
 leaning+0.017
 Tom+0.016
aked+0.015
 allergic+0.015
Neg logit contributions
 trends-0.019
 Grace-0.019
 Grac-0.019
 Flor-0.018
lyss-0.018
TokenJ
Token ID41
Feature activation-4.94e-01

Pos logit contributions
 supposed+0.026
 leaning+0.025
 allergic+0.022
aked+0.022
 Tom+0.022
Neg logit contributions
 trends-0.028
 Grace-0.028
 Grac-0.027
lyss-0.027
 Flor-0.027
Tokenelly
Token ID6148
Feature activation+2.21e+00

Pos logit contributions
 supposed+0.028
 leaning+0.027
 allergic+0.025
aged+0.024
eping+0.024
Neg logit contributions
 trends-0.018
 Grace-0.018
lyss-0.017
 Grac-0.017
 Nina-0.017
Tokenfish
Token ID11084
Feature activation+4.81e+00

Pos logit contributions
 trends+0.104
 Grace+0.099
 Grac+0.095
lyss+0.092
Gr+0.089
Neg logit contributions
 Tom-0.104
 leaning-0.100
 supposed-0.097
 enough-0.092
 done-0.092
Token are
Token ID389
Feature activation+3.85e-01

Pos logit contributions
 trends+0.242
 Grace+0.228
lyss+0.223
 Grac+0.221
 Nina+0.219
Neg logit contributions
 leaning-0.222
 supposed-0.218
 done-0.208
 allergic-0.204
 Tom-0.199
Token very
Token ID845
Feature activation+7.44e-01

Pos logit contributions
 trends+0.024
 Grace+0.024
lyss+0.023
 Grac+0.023
 Nina+0.023
Neg logit contributions
 supposed-0.012
 leaning-0.012
 allergic-0.010
 Tom-0.009
aked-0.009
Token interesting
Token ID3499
Feature activation+1.10e-01

Pos logit contributions
 trends+0.048
 Grace+0.047
 Grac+0.046
lyss+0.045
 Flor+0.045
Neg logit contributions
 supposed-0.023
 leaning-0.022
 allergic-0.019
 Tom-0.018
aked-0.017
Token,
Token ID11
Feature activation+1.44e-01

Pos logit contributions
 trends+0.006
lyss+0.006
 Grace+0.006
 Grac+0.006
 Nina+0.006
Neg logit contributions
 supposed-0.004
 leaning-0.004
 Tom-0.004
 thieves-0.004
 allergic-0.003
Token but
Token ID475
Feature activation+1.72e-01

Pos logit contributions
 trends+0.008
 occasions+0.008
lyss+0.008
 Grace+0.008
 Tracy+0.008
Neg logit contributions
 Tom-0.006
 leaning-0.005
 greedy-0.005
 allergic-0.005
 angry-0.005
Token no
Token ID645
Feature activation-2.29e-01

Pos logit contributions
 leaning+0.029
 supposed+0.028
 Tom+0.025
 allergic+0.023
 greedy+0.021
Neg logit contributions
 trends-0.078
 Grace-0.075
 Grac-0.073
 occasions-0.072
lyss-0.072
Token.
Token ID13
Feature activation-3.85e-01

Pos logit contributions
 supposed+0.009
 leaning+0.008
 allergic+0.007
aked+0.007
agged+0.007
Neg logit contributions
 trends-0.013
 Grace-0.013
 Grac-0.012
lyss-0.012
 Flor-0.012
Token He
Token ID679
Feature activation-1.03e+00

Pos logit contributions
 leaning+0.014
 Tom+0.013
 supposed+0.013
aked+0.012
agged+0.012
Neg logit contributions
 trends-0.022
 Grace-0.021
 occasions-0.020
lyss-0.020
 Grac-0.020
Token said
Token ID531
Feature activation-5.73e-01

Pos logit contributions
 supposed+0.046
 leaning+0.045
 Tom+0.040
 allergic+0.040
aked+0.039
Neg logit contributions
 trends-0.051
 Grace-0.050
 Grac-0.048
lyss-0.047
 Flor-0.047
Token jelly
Token ID27644
Feature activation+2.04e+00

Pos logit contributions
 leaning+0.022
 supposed+0.022
 Tom+0.020
 allergic+0.019
aked+0.018
Neg logit contributions
 trends-0.032
 Grace-0.031
 Grac-0.030
lyss-0.030
 occasions-0.029
Tokenfish
Token ID11084
Feature activation+4.79e+00

Pos logit contributions
 trends+0.050
 Grace+0.047
lyss+0.043
 occasions+0.043
 Grac+0.042
Neg logit contributions
 supposed-0.139
 leaning-0.133
 Tom-0.132
aked-0.129
oni-0.128
Token can
Token ID460
Feature activation+9.03e-01

Pos logit contributions
 trends+0.203
 Grace+0.192
lyss+0.192
 Grac+0.187
 occasions+0.185
Neg logit contributions
 supposed-0.251
 leaning-0.248
 Tom-0.228
 allergic-0.226
aked-0.215
Token hurt
Token ID5938
Feature activation+1.27e+00

Pos logit contributions
 trends+0.056
lyss+0.054
 Grace+0.053
 occasions+0.052
 Isn+0.051
Neg logit contributions
 leaning-0.031
 supposed-0.030
 Tom-0.027
 allergic-0.027
hy-0.024
Token you
Token ID345
Feature activation+1.75e+00

Pos logit contributions
 trends+0.075
lyss+0.072
 Grace+0.070
 Grac+0.069
 Isn+0.068
Neg logit contributions
 supposed-0.044
 leaning-0.044
 Tom-0.042
 allergic-0.038
 thieves-0.037
Token.
Token ID13
Feature activation+1.47e-01

Pos logit contributions
 trends+0.092
 Grace+0.092
 Grac+0.089
lyss+0.088
 Flor+0.087
Neg logit contributions
 supposed-0.071
 leaning-0.069
 allergic-0.060
aked-0.059
agged-0.058
Token Lily
Token ID20037
Feature activation+1.17e+00

Pos logit contributions
 trends+0.008
 Grace+0.007
 occasions+0.007
 coral+0.007
fish+0.007
Neg logit contributions
 Tom-0.006
 leaning-0.006
 supposed-0.006
agged-0.005
aked-0.005
Token the
Token ID262
Feature activation-4.29e-01

Pos logit contributions
 supposed+0.018
 leaning+0.017
aked+0.016
agged+0.016
 allergic+0.015
Neg logit contributions
 Grace-0.020
 Grac-0.019
 trends-0.019
 Flor-0.019
 Nina-0.019
Token fire
Token ID2046
Feature activation-4.59e-01

Pos logit contributions
 supposed+0.020
 leaning+0.019
 allergic+0.017
 Tom+0.017
aked+0.016
Neg logit contributions
 trends-0.021
 Grace-0.020
lyss-0.020
 Grac-0.020
 Nina-0.019
Token.
Token ID13
Feature activation-1.87e-01

Pos logit contributions
 supposed+0.016
 leaning+0.016
 allergic+0.013
 Tom+0.013
aked+0.013
Neg logit contributions
 trends-0.027
 Grace-0.026
lyss-0.026
 Grac-0.026
 Nina-0.025
Token Their
Token ID5334
Feature activation-1.15e+00

Pos logit contributions
 supposed+0.008
 Tom+0.007
 leaning+0.007
 allergic+0.006
aked+0.006
Neg logit contributions
 trends-0.010
 Grace-0.009
lyss-0.009
fish-0.009
 occasions-0.009
Token flowers
Token ID12734
Feature activation+2.53e+00

Pos logit contributions
 leaning+0.049
 supposed+0.049
 Tom+0.043
 allergic+0.042
aked+0.041
Neg logit contributions
 trends-0.058
 Grace-0.057
lyss-0.057
 Grac-0.056
 Nina-0.054
Token bloom
Token ID29955
Feature activation+4.77e+00

Pos logit contributions
lyss+0.133
 trends+0.131
 Grace+0.130
 Grac+0.128
 Nina+0.126
Neg logit contributions
 leaning-0.111
 supposed-0.108
 done-0.102
 allergic-0.096
 gone-0.090
Token.
Token ID13
Feature activation+6.01e-02

Pos logit contributions
 trends+0.304
 Grace+0.302
lyss+0.298
 Grac+0.296
 Nina+0.291
Neg logit contributions
 supposed-0.144
 leaning-0.135
aked-0.115
 allergic-0.111
agged-0.110
Token Bloom
Token ID11891
Feature activation+3.29e+00

Pos logit contributions
 trends+0.003
fish+0.003
 occasions+0.003
lyss+0.003
 dance+0.003
Neg logit contributions
 Tom-0.002
 supposed-0.002
 leaning-0.002
hes-0.002
 Jerry-0.002
Token means
Token ID1724
Feature activation-3.77e-01

Pos logit contributions
 trends+0.156
 Isn+0.153
 Grac+0.147
lyss+0.146
 occasions+0.146
Neg logit contributions
 supposed-0.146
 leaning-0.140
 done-0.137
 Tom-0.136
aked-0.134
Token grow
Token ID1663
Feature activation+2.34e+00

Pos logit contributions
 supposed+0.016
 leaning+0.016
 Tom+0.013
 done+0.013
 allergic+0.013
Neg logit contributions
 trends-0.020
lyss-0.019
 Grac-0.019
 Grace-0.019
 Isn-0.018
Token and
Token ID290
Feature activation+6.44e-01

Pos logit contributions
 trends+0.128
lyss+0.128
 occasions+0.124
 Grace+0.123
 Grac+0.121
Neg logit contributions
 supposed-0.091
 leaning-0.087
hes-0.078
 allergic-0.078
 done-0.078
Token the
Token ID262
Feature activation-4.57e-01

Pos logit contributions
 Grace+0.044
 trends+0.044
 Grac+0.042
lyss+0.042
 Flor+0.042
Neg logit contributions
 supposed-0.035
 leaning-0.034
 allergic-0.029
aked-0.029
agged-0.028
Token water
Token ID1660
Feature activation+4.29e-01

Pos logit contributions
 leaning+0.020
 supposed+0.019
 done+0.017
 Tom+0.017
aked+0.017
Neg logit contributions
 Grace-0.024
 Grac-0.024
 trends-0.024
 occasions-0.023
 Tracy-0.022
Token.
Token ID13
Feature activation+8.27e-02

Pos logit contributions
 trends+0.025
 Grace+0.025
 Grac+0.024
lyss+0.024
 Flor+0.024
Neg logit contributions
 supposed-0.016
 leaning-0.015
 allergic-0.013
 Tom-0.012
aked-0.012
Token The
Token ID383
Feature activation+9.10e-01

Pos logit contributions
 trends+0.004
 Grace+0.004
 Grac+0.004
lyss+0.004
 Flor+0.004
Neg logit contributions
 supposed-0.003
 leaning-0.003
 allergic-0.003
 Tom-0.003
aked-0.003
Token jelly
Token ID27644
Feature activation+1.79e+00

Pos logit contributions
 Grace+0.035
 trends+0.035
 Grac+0.033
 occasions+0.030
lyss+0.030
Neg logit contributions
 supposed-0.050
 leaning-0.050
 allergic-0.047
 Tom-0.047
 done-0.046
Tokenfish
Token ID11084
Feature activation+4.76e+00

Pos logit contributions
 trends+0.063
 Grace+0.061
 Tracy+0.056
 occasions+0.055
 conclusions+0.054
Neg logit contributions
 done-0.101
 leaning-0.100
 supposed-0.100
oni-0.100
hes-0.099
Token was
Token ID373
Feature activation+1.08e-01

Pos logit contributions
 trends+0.305
 occasions+0.304
 conclusions+0.292
 Tracy+0.287
 Grace+0.279
Neg logit contributions
 leaning-0.169
 supposed-0.155
 done-0.153
 lying-0.145
 gone-0.142
Token cool
Token ID3608
Feature activation+1.20e+00

Pos logit contributions
 Grace+0.006
 coral+0.005
 Grac+0.005
 Flor+0.005
 Nina+0.005
Neg logit contributions
 supposed-0.004
 leaning-0.004
aked-0.004
agged-0.004
itt-0.004
Token and
Token ID290
Feature activation-2.22e-01

Pos logit contributions
 trends+0.054
 Grace+0.054
 Grac+0.052
 Nina+0.051
lyss+0.051
Neg logit contributions
 supposed-0.058
 leaning-0.056
aked-0.051
 allergic-0.050
agged-0.050
Token Tim
Token ID5045
Feature activation-1.65e+00

Pos logit contributions
 leaning+0.009
 supposed+0.008
 Tom+0.008
 allergic+0.007
 Jerry+0.007
Neg logit contributions
 trends-0.013
 occasions-0.013
lyss-0.012
 Grace-0.012
 Flor-0.012
Tokenmy
Token ID1820
Feature activation-1.71e+00

Pos logit contributions
 supposed+0.063
 leaning+0.063
 allergic+0.055
 thieves+0.054
agged+0.053
Neg logit contributions
 Grace-0.088
 Grac-0.085
 trends-0.085
 Nina-0.083
maid-0.081
Token Trust
Token ID9870
Feature activation+1.04e+00

Pos logit contributions
 leaning+0.003
 supposed+0.003
 Tom+0.003
 allergic+0.003
aged+0.002
Neg logit contributions
 trends-0.003
 Grace-0.003
lyss-0.003
 coral-0.003
 occasions-0.003
Token me
Token ID502
Feature activation+1.68e+00

Pos logit contributions
 Grace+0.012
 Grac+0.009
 dance+0.009
lyss+0.008
 trends+0.008
Neg logit contributions
 supposed-0.085
 leaning-0.084
 allergic-0.080
itt-0.079
aked-0.078
Token.
Token ID13
Feature activation+1.81e-01

Pos logit contributions
 Grace+0.075
 trends+0.075
 Grac+0.072
lyss+0.070
 Flor+0.070
Neg logit contributions
 supposed-0.082
 leaning-0.080
 allergic-0.072
aked-0.071
agged-0.070
Token The
Token ID383
Feature activation+8.39e-01

Pos logit contributions
 trends+0.008
 Grace+0.008
lyss+0.008
 coral+0.008
 Grac+0.008
Neg logit contributions
 leaning-0.009
 supposed-0.008
 Tom-0.008
 allergic-0.008
agged-0.007
Token jelly
Token ID27644
Feature activation+1.90e+00

Pos logit contributions
 trends+0.031
 Grace+0.031
 Grac+0.029
lyss+0.027
 occasions+0.027
Neg logit contributions
 leaning-0.048
 supposed-0.047
 allergic-0.045
 greedy-0.043
 thieves-0.043
Tokenfish
Token ID11084
Feature activation+4.74e+00

Pos logit contributions
 Grace+0.096
 Grac+0.095
 trends+0.092
 Tracy+0.087
 Nina+0.087
Neg logit contributions
 leaning-0.087
 done-0.086
 supposed-0.083
 Tom-0.081
 hard-0.079
Token is
Token ID318
Feature activation+4.50e-01

Pos logit contributions
 trends+0.227
 Grace+0.221
lyss+0.219
 Grac+0.217
 Nina+0.214
Neg logit contributions
 supposed-0.235
 leaning-0.233
 allergic-0.218
 done-0.218
 Tom-0.199
Token not
Token ID407
Feature activation-2.42e+00

Pos logit contributions
 Grace+0.026
 Grac+0.025
 trends+0.025
 Flor+0.025
 coral+0.025
Neg logit contributions
 supposed-0.015
 leaning-0.015
aked-0.013
agged-0.012
itt-0.012
Token a
Token ID257
Feature activation+1.81e-01

Pos logit contributions
 leaning+0.091
 supposed+0.091
aked+0.079
agged+0.078
itt+0.075
Neg logit contributions
 Grace-0.131
 trends-0.127
 Grac-0.126
 Flor-0.125
 Nina-0.124
Token good
Token ID922
Feature activation-2.11e+00

Pos logit contributions
 trends+0.011
 Grace+0.010
 Grac+0.010
lyss+0.010
 Tracy+0.010
Neg logit contributions
 supposed-0.007
 leaning-0.006
 allergic-0.006
 greedy-0.006
 pain-0.006
Token friend
Token ID1545
Feature activation+7.71e-02

Pos logit contributions
 supposed+0.081
 leaning+0.074
 Tom+0.070
 allergic+0.068
 enough+0.067
Neg logit contributions
 trends-0.117
 Grace-0.115
 Grac-0.111
lyss-0.111
 Isn-0.106
Token And
Token ID843
Feature activation+1.10e+00

Pos logit contributions
 supposed+0.007
 leaning+0.007
 Tom+0.007
aked+0.006
 allergic+0.006
Neg logit contributions
 trends-0.009
 Grace-0.009
fish-0.009
 Grac-0.009
 dance-0.009
Token that
Token ID326
Feature activation+8.85e-01

Pos logit contributions
 trends+0.058
 Grace+0.055
lyss+0.053
 occasions+0.053
 Grac+0.053
Neg logit contributions
 supposed-0.045
 leaning-0.043
 Tom-0.043
 done-0.039
 enough-0.038
Token's
Token ID338
Feature activation+1.40e+00

Pos logit contributions
 trends+0.039
 Grace+0.037
 Flor+0.036
 Grac+0.036
lyss+0.036
Neg logit contributions
 supposed-0.047
 leaning-0.045
 allergic-0.041
 done-0.041
 Tom-0.040
Token why
Token ID1521
Feature activation-5.58e-01

Pos logit contributions
 Grace+0.058
 trends+0.057
 Grac+0.056
 Nina+0.054
lyss+0.054
Neg logit contributions
 supposed-0.072
 leaning-0.070
aked-0.064
agged-0.064
 allergic-0.063
Token Tues
Token ID48496
Feature activation+7.56e-01

Pos logit contributions
 supposed+0.026
 leaning+0.026
 allergic+0.023
 Tom+0.022
aked+0.022
Neg logit contributions
 trends-0.026
 Grace-0.026
 Grac-0.025
lyss-0.025
 Flor-0.024
Tokendays
Token ID12545
Feature activation+4.74e+00

Pos logit contributions
 trends+0.022
 Grace+0.022
 Grac+0.021
lyss+0.021
 Flor+0.020
Neg logit contributions
 supposed-0.049
 leaning-0.048
 allergic-0.043
aked-0.043
 Tom-0.043
Token are
Token ID389
Feature activation+7.43e-01

Pos logit contributions
 Flor+0.214
 Grace+0.213
 Grac+0.211
 coral+0.210
 trends+0.210
Neg logit contributions
 supposed-0.230
 leaning-0.226
 done-0.203
 Tom-0.202
 allergic-0.194
Token so
Token ID523
Feature activation+1.59e+00

Pos logit contributions
 trends+0.041
 Grace+0.041
 Grac+0.040
lyss+0.039
 Flor+0.039
Neg logit contributions
 supposed-0.029
 leaning-0.028
 allergic-0.024
 Tom-0.023
aked-0.023
Token important
Token ID1593
Feature activation-1.56e-01

Pos logit contributions
 trends+0.098
 Grace+0.097
 Grac+0.095
lyss+0.094
 Flor+0.093
Neg logit contributions
 supposed-0.052
 leaning-0.049
 allergic-0.041
aked-0.039
 Tom-0.039
Token.
Token ID13
Feature activation-3.20e-02

Pos logit contributions
 supposed+0.006
 leaning+0.006
 Tom+0.005
 allergic+0.005
 thieves+0.004
Neg logit contributions
 trends-0.009
 Grace-0.009
 Grac-0.009
lyss-0.009
 Flor-0.008
Token<|endoftext|>
Token ID50256
Feature activation+2.56e+00

Pos logit contributions
 supposed+0.001
 leaning+0.001
 Tom+0.001
 Jerry+0.001
itt+0.001
Neg logit contributions
 trends-0.002
 dance-0.002
 outfits-0.002
maid-0.002
fish-0.002
Token help
Token ID1037
Feature activation+2.06e+00

Pos logit contributions
 trends+0.041
 Grace+0.038
lyss+0.037
 Grac+0.036
 occasions+0.036
Neg logit contributions
 supposed-0.039
 leaning-0.038
 Tom-0.036
 allergic-0.035
 done-0.034
Token me
Token ID502
Feature activation+2.89e+00

Pos logit contributions
 trends+0.056
 Grace+0.054
 Grac+0.051
lyss+0.051
 Flor+0.049
Neg logit contributions
 supposed-0.140
 leaning-0.137
 allergic-0.126
 Tom-0.125
aked-0.124
Token!
Token ID0
Feature activation+4.16e-01

Pos logit contributions
 trends+0.117
 Grace+0.116
 Grac+0.112
lyss+0.111
 Flor+0.109
Neg logit contributions
 supposed-0.154
 leaning-0.150
 allergic-0.135
aked-0.133
 Tom-0.132
Token The
Token ID383
Feature activation+7.24e-01

Pos logit contributions
 trends+0.019
 Grace+0.018
 Grac+0.017
lyss+0.017
 Flor+0.017
Neg logit contributions
 supposed-0.021
 leaning-0.020
 Tom-0.018
 allergic-0.018
aked-0.018
Token jelly
Token ID27644
Feature activation+1.80e+00

Pos logit contributions
 trends+0.028
 Grace+0.028
 Grac+0.025
 occasions+0.024
lyss+0.024
Neg logit contributions
 supposed-0.040
 leaning-0.040
 allergic-0.038
 pain-0.038
 greedy-0.038
Tokenfish
Token ID11084
Feature activation+4.73e+00

Pos logit contributions
 Grace+0.055
 trends+0.055
 Grac+0.051
 Tracy+0.046
lyss+0.045
Neg logit contributions
 supposed-0.119
 leaning-0.117
 done-0.117
 gone-0.112
 lying-0.109
Token st
Token ID336
Feature activation+1.07e+00

Pos logit contributions
 trends+0.226
 Grace+0.216
lyss+0.211
 Grac+0.209
 Nina+0.207
Neg logit contributions
 supposed-0.241
 leaning-0.237
 allergic-0.216
 done-0.215
 thieves-0.202
Tokenung
Token ID2150
Feature activation+2.37e+00

Pos logit contributions
 trends+0.067
 Grace+0.067
 Grac+0.065
lyss+0.065
 Flor+0.064
Neg logit contributions
 supposed-0.033
 leaning-0.031
 allergic-0.026
 Tom-0.024
aked-0.024
Token me
Token ID502
Feature activation+3.04e+00

Pos logit contributions
 trends+0.072
 Grace+0.067
lyss+0.066
 Grac+0.064
 Flor+0.062
Neg logit contributions
 supposed-0.152
 leaning-0.150
 Tom-0.140
 allergic-0.137
aked-0.136
Token!"
Token ID2474
Feature activation-2.19e-01

Pos logit contributions
 Grace+0.113
 trends+0.113
 Grac+0.108
 Flor+0.106
lyss+0.105
Neg logit contributions
 supposed-0.167
 leaning-0.163
 allergic-0.148
agged-0.144
aked-0.144
Token Mia
Token ID32189
Feature activation+9.84e-01

Pos logit contributions
 Tom+0.008
 leaning+0.008
 supposed+0.007
 allergic+0.007
 mad+0.006
Neg logit contributions
 trends-0.014
maid-0.013
lyss-0.013
 outfits-0.013
 occasions-0.012
Token said
Token ID531
Feature activation-5.30e-01

Pos logit contributions
 supposed+0.022
 fault+0.020
 leaning+0.019
 allergic+0.018
working+0.018
Neg logit contributions
 Grace-0.007
 trends-0.007
 Grac-0.007
lyss-0.006
fish-0.006
Token.
Token ID13
Feature activation+2.68e-03

Pos logit contributions
 supposed+0.030
aked+0.028
wy+0.027
ige+0.027
Ray+0.026
Neg logit contributions
 Grace-0.018
 Nina-0.018
 Grac-0.018
 dance-0.017
 trends-0.017
Token "
Token ID366
Feature activation+3.18e-01

Pos logit contributions
 trends+0.000
 Grace+0.000
 Grac+0.000
lyss+0.000
 Flor+0.000
Neg logit contributions
 supposed-0.000
 leaning-0.000
 allergic-0.000
aked-0.000
 Tom-0.000
TokenJ
Token ID41
Feature activation-5.00e-01

Pos logit contributions
 trends+0.015
 Grace+0.015
 Grac+0.015
lyss+0.014
 Nina+0.014
Neg logit contributions
 supposed-0.015
 leaning-0.014
 allergic-0.013
aked-0.013
 Tom-0.012
Tokenelly
Token ID6148
Feature activation+2.28e+00

Pos logit contributions
 supposed+0.024
 leaning+0.023
 allergic+0.021
aked+0.020
 Tom+0.020
Neg logit contributions
 trends-0.023
 Grace-0.023
 Grac-0.022
lyss-0.022
 Nina-0.022
Tokenfish
Token ID11084
Feature activation+4.73e+00

Pos logit contributions
 trends+0.112
 Grace+0.105
 Grac+0.101
Gr+0.100
lyss+0.099
Neg logit contributions
 Tom-0.105
 leaning-0.098
 enough-0.097
 supposed-0.096
 done-0.093
Token can
Token ID460
Feature activation+1.13e+00

Pos logit contributions
 trends+0.225
lyss+0.216
 Grace+0.215
 Grac+0.208
 Isn+0.206
Neg logit contributions
 supposed-0.233
 leaning-0.231
 allergic-0.213
 done-0.209
 Tom-0.205
Token hurt
Token ID5938
Feature activation+1.38e+00

Pos logit contributions
 trends+0.075
lyss+0.071
 occasions+0.070
 Grace+0.070
 outfits+0.069
Neg logit contributions
 leaning-0.034
 supposed-0.034
 Tom-0.030
 allergic-0.030
ow-0.025
Token you
Token ID345
Feature activation+1.74e+00

Pos logit contributions
 trends+0.064
lyss+0.061
 Grace+0.055
 Grac+0.055
 occasions+0.053
Neg logit contributions
 Tom-0.071
 supposed-0.069
 leaning-0.068
 thieves-0.065
 allergic-0.063
Token.
Token ID13
Feature activation+2.94e-01

Pos logit contributions
 trends+0.088
 Grace+0.087
lyss+0.085
 Grac+0.085
 Nina+0.083
Neg logit contributions
 supposed-0.077
 leaning-0.075
 allergic-0.066
 Tom-0.064
aked-0.062
Token They
Token ID1119
Feature activation-3.40e-01

Pos logit contributions
 trends+0.015
fish+0.014
lyss+0.014
 coral+0.014
 outfits+0.014
Neg logit contributions
 Tom-0.013
 leaning-0.012
 supposed-0.012
 greedy-0.011
 allergic-0.011
Token sund
Token ID37437
Feature activation+9.23e-01

Pos logit contributions
 supposed+0.093
 leaning+0.086
aked+0.080
agged+0.079
 allergic+0.077
Neg logit contributions
 trends-0.127
 Grace-0.121
lyss-0.121
 Grac-0.119
 Flor-0.118
Tokenae
Token ID3609
Feature activation+4.76e-01

Pos logit contributions
 trends+0.044
 occasions+0.042
lyss+0.041
 Grace+0.040
fish+0.040
Neg logit contributions
 leaning-0.042
 supposed-0.041
 allergic-0.037
 vain-0.036
 handcuffs-0.036
Token with
Token ID351
Feature activation+1.66e-01

Pos logit contributions
 trends+0.025
 Grace+0.025
 Grac+0.025
lyss+0.024
 Flor+0.024
Neg logit contributions
 supposed-0.019
 leaning-0.019
 allergic-0.016
 Tom-0.016
aked-0.015
Token lots
Token ID6041
Feature activation+3.38e-01

Pos logit contributions
 Grace+0.009
 Grac+0.008
 Nina+0.008
 coral+0.008
 trends+0.008
Neg logit contributions
 supposed-0.007
 leaning-0.007
agged-0.006
aked-0.006
owing-0.006
Token of
Token ID286
Feature activation-2.15e+00

Pos logit contributions
 trends+0.017
 Grace+0.017
lyss+0.016
 Grac+0.016
 Nina+0.016
Neg logit contributions
 supposed-0.015
 leaning-0.014
 allergic-0.012
aked-0.012
 Tom-0.012
Token topp
Token ID23126
Feature activation-6.77e+00

Pos logit contributions
 supposed+0.075
 leaning+0.072
 allergic+0.060
aked+0.059
 Tom+0.057
Neg logit contributions
 trends-0.127
 Grace-0.126
 Grac-0.123
lyss-0.122
 Nina-0.121
Tokenings
Token ID654
Feature activation-2.45e+00

Pos logit contributions
 leaning+0.356
 supposed+0.355
 allergic+0.310
 vain+0.304
hy+0.304
Neg logit contributions
maid-0.273
fish-0.259
 outfits-0.258
 occasions-0.249
 coral-0.247
Token.
Token ID13
Feature activation-3.65e-02

Pos logit contributions
 supposed+0.111
 leaning+0.105
 allergic+0.099
agged+0.096
eping+0.094
Neg logit contributions
 trends-0.115
 Grace-0.112
 Isn-0.108
 Nina-0.107
 Grac-0.106
Token Tim
Token ID5045
Feature activation-7.46e-01

Pos logit contributions
 supposed+0.002
 leaning+0.001
 allergic+0.001
aked+0.001
agged+0.001
Neg logit contributions
 trends-0.002
 Grace-0.002
 Grac-0.002
lyss-0.002
 Nina-0.002
Tokenmy
Token ID1820
Feature activation-8.44e-01

Pos logit contributions
 supposed+0.034
 leaning+0.034
hy+0.031
agged+0.031
 thieves+0.031
Neg logit contributions
 Grace-0.031
 trends-0.030
 Grac-0.030
 occasions-0.030
 dance-0.030
Token was
Token ID373
Feature activation+5.20e-02

Pos logit contributions
 supposed+0.044
 leaning+0.043
 allergic+0.039
 Tom+0.038
aked+0.038
Neg logit contributions
 trends-0.036
 Grace-0.035
 Grac-0.034
lyss-0.033
 Flor-0.033
Token sund
Token ID37437
Feature activation+4.42e-01

Pos logit contributions
 trends+0.058
 Grace+0.058
 Grac+0.056
lyss+0.056
 Flor+0.055
Neg logit contributions
 supposed-0.040
 leaning-0.039
 allergic-0.034
 Tom-0.032
aked-0.032
Tokenae
Token ID3609
Feature activation-3.16e-01

Pos logit contributions
 trends+0.021
 occasions+0.020
lyss+0.019
fish+0.019
maid+0.019
Neg logit contributions
 leaning-0.021
 supposed-0.021
 allergic-0.019
 Tom-0.018
 Jerry-0.018
Token with
Token ID351
Feature activation-9.60e-02

Pos logit contributions
 supposed+0.013
 leaning+0.012
agged+0.011
aked+0.011
 allergic+0.011
Neg logit contributions
 trends-0.016
 Grace-0.016
 Grac-0.016
 Flor-0.015
lyss-0.015
Token lots
Token ID6041
Feature activation+1.91e-01

Pos logit contributions
 supposed+0.004
 leaning+0.004
agged+0.003
aked+0.003
 allergic+0.003
Neg logit contributions
 Grace-0.005
 Grac-0.005
 Nina-0.005
 trends-0.005
 Flor-0.005
Token of
Token ID286
Feature activation-2.32e+00

Pos logit contributions
 trends+0.010
 Grace+0.010
 Grac+0.009
lyss+0.009
 Nina+0.009
Neg logit contributions
 supposed-0.008
 leaning-0.008
 allergic-0.007
aked-0.007
 Tom-0.007
Token topp
Token ID23126
Feature activation-6.72e+00

Pos logit contributions
 supposed+0.085
 leaning+0.081
 allergic+0.068
aked+0.066
agged+0.065
Neg logit contributions
 trends-0.134
 Grace-0.132
 Grac-0.129
lyss-0.128
 Nina-0.127
Tokenings
Token ID654
Feature activation-2.39e+00

Pos logit contributions
 leaning+0.340
 supposed+0.340
 allergic+0.301
 greedy+0.286
 handcuffs+0.282
Neg logit contributions
maid-0.280
 outfits-0.274
 trends-0.272
fish-0.269
 Grac-0.266
Token.
Token ID13
Feature activation-1.75e-01

Pos logit contributions
 supposed+0.104
 leaning+0.097
 allergic+0.090
agged+0.089
aked+0.087
Neg logit contributions
 Grace-0.122
 trends-0.120
 Grac-0.116
 Nina-0.116
 Flor-0.112
Token\n
Token ID198
Feature activation-1.08e-01

Pos logit contributions
 supposed+0.008
 leaning+0.007
 Tom+0.007
 allergic+0.006
aked+0.006
Neg logit contributions
 trends-0.009
 coral-0.008
 Grace-0.008
lyss-0.008
fish-0.008
Token\n
Token ID198
Feature activation-5.02e-01

Pos logit contributions
 supposed+0.005
 leaning+0.004
 Tom+0.004
aked+0.004
 allergic+0.004
Neg logit contributions
 trends-0.006
 Grace-0.005
 coral-0.005
 Grac-0.005
lyss-0.005
TokenThey
Token ID2990
Feature activation-1.65e+00

Pos logit contributions
 supposed+0.024
 leaning+0.023
aked+0.021
 Tom+0.021
eping+0.020
Neg logit contributions
maid-0.023
 Flor-0.023
 trends-0.023
 coral-0.023
 Grace-0.023
Token weighed
Token ID20261
Feature activation-1.60e+00

Pos logit contributions
 Grace+0.003
 trends+0.003
 Grac+0.003
lyss+0.003
 Flor+0.003
Neg logit contributions
 supposed-0.001
 leaning-0.001
agged-0.001
 allergic-0.001
aked-0.001
Token at
Token ID379
Feature activation-2.68e-01

Pos logit contributions
 supposed+0.069
 leaning+0.066
 allergic+0.058
agged+0.058
aked+0.057
Neg logit contributions
 Grace-0.079
 trends-0.078
 Grac-0.077
 Flor-0.076
 coral-0.075
Token least
Token ID1551
Feature activation+2.13e+00

Pos logit contributions
 supposed+0.011
 leaning+0.010
 allergic+0.009
aked+0.008
 Tom+0.008
Neg logit contributions
 trends-0.015
 Grace-0.015
 Grac-0.014
lyss-0.014
 Flor-0.014
Token two
Token ID734
Feature activation-1.49e+00

Pos logit contributions
 trends+0.114
 Grace+0.112
 Grac+0.111
lyss+0.111
 Flor+0.106
Neg logit contributions
 supposed-0.091
 leaning-0.088
 Tom-0.077
 allergic-0.074
 done-0.073
Token hundred
Token ID3470
Feature activation-1.59e+00

Pos logit contributions
 supposed+0.064
 leaning+0.062
agged+0.055
aked+0.054
 allergic+0.053
Neg logit contributions
 trends-0.074
 Grace-0.073
 Grac-0.071
 coral-0.070
lyss-0.070
Token pounds
Token ID8059
Feature activation-6.08e+00

Pos logit contributions
 supposed+0.069
 leaning+0.066
agged+0.058
 allergic+0.057
aked+0.057
Neg logit contributions
 trends-0.079
 Grace-0.078
 coral-0.076
 Grac-0.075
lyss-0.074
Token!
Token ID0
Feature activation-5.04e-01

Pos logit contributions
 supposed+0.301
agged+0.297
 leaning+0.290
 allergic+0.274
aked+0.268
Neg logit contributions
 trends-0.242
fish-0.240
 Grace-0.233
 coral-0.232
 reefs-0.228
Token W
Token ID370
Feature activation+3.37e+00

Pos logit contributions
 Tom+0.018
 leaning+0.017
 supposed+0.017
aked+0.016
itt+0.016
Neg logit contributions
 trends-0.029
 occasions-0.028
 dance-0.027
 outfits-0.027
fish-0.027
Tokenopsy
Token ID44522
Feature activation+1.48e+00

Pos logit contributions
 trends+0.215
 Grace+0.215
 Grac+0.209
lyss+0.207
 Nina+0.204
Neg logit contributions
 supposed-0.105
 leaning-0.097
 allergic-0.083
aked-0.079
agged-0.077
Token was
Token ID373
Feature activation-1.43e-02

Pos logit contributions
 trends+0.070
 Grace+0.067
 Nina+0.066
 Grac+0.064
lyss+0.064
Neg logit contributions
 leaning-0.072
 supposed-0.070
aked-0.064
aged-0.063
 done-0.062
Token so
Token ID523
Feature activation+1.28e+00

Pos logit contributions
agged+0.001
arl+0.001
 supposed+0.001
wy+0.001
issors+0.001
Neg logit contributions
 Grace-0.000
 Flor-0.000
 Grac-0.000
lyss-0.000
 dance-0.000
Token spotted
Token ID13489
Feature activation-4.89e-01

Pos logit contributions
 trends+0.067
 occasions+0.066
 Grac+0.065
 conclusions+0.065
lyss+0.064
Neg logit contributions
 supposed-0.036
 leaning-0.033
 Tom-0.032
 done-0.031
 enough-0.030
Token a
Token ID257
Feature activation+8.66e-01

Pos logit contributions
 supposed+0.015
 leaning+0.014
 allergic+0.011
 Tom+0.011
aked+0.011
Neg logit contributions
 trends-0.031
 Grace-0.031
 Grac-0.030
lyss-0.030
 Flor-0.030
Token row
Token ID5752
Feature activation-9.19e-01

Pos logit contributions
 trends+0.049
 Grace+0.048
 Grac+0.047
 occasions+0.047
 Nina+0.046
Neg logit contributions
 leaning-0.032
 supposed-0.032
 allergic-0.027
 Tom-0.025
 thieves-0.024
Token of
Token ID286
Feature activation-1.06e+00

Pos logit contributions
 supposed+0.043
 leaning+0.040
 allergic+0.037
aked+0.036
agged+0.035
Neg logit contributions
 trends-0.043
 Grace-0.043
lyss-0.041
 Grac-0.041
 Nina-0.040
Token books
Token ID3835
Feature activation-2.26e+00

Pos logit contributions
 leaning+0.046
 supposed+0.046
 Tom+0.040
 allergic+0.040
eping+0.039
Neg logit contributions
 trends-0.053
 Grace-0.051
 Grac-0.050
lyss-0.050
 Flor-0.050
Tokenhel
Token ID2978
Feature activation-6.04e+00

Pos logit contributions
 supposed+0.083
 leaning+0.073
 allergic+0.071
aked+0.070
 thieves+0.069
Neg logit contributions
 trends-0.126
 Grace-0.122
 dance-0.120
fish-0.116
lyss-0.116
Tokenves
Token ID1158
Feature activation-2.29e+00

Pos logit contributions
 supposed+0.356
 leaning+0.345
 allergic+0.326
 handcuffs+0.317
aked+0.315
Neg logit contributions
 trends-0.203
fish-0.201
 Grace-0.198
 occasions-0.191
 Flor-0.186
Token in
Token ID287
Feature activation+7.18e-01

Pos logit contributions
 supposed+0.088
 leaning+0.085
 allergic+0.073
aked+0.070
 Tom+0.070
Neg logit contributions
 trends-0.127
 Grace-0.126
 Grac-0.123
lyss-0.122
 Flor-0.120
Token her
Token ID607
Feature activation-1.35e-01

Pos logit contributions
 trends+0.039
 Grace+0.039
 Grac+0.038
lyss+0.038
 Flor+0.037
Neg logit contributions
 supposed-0.028
 leaning-0.027
 allergic-0.023
aked-0.023
 Tom-0.023
Token grandparents
Token ID28571
Feature activation-3.45e-02

Pos logit contributions
 leaning+0.006
 supposed+0.006
 Tom+0.005
 allergic+0.005
 done+0.005
Neg logit contributions
 trends-0.007
 Grac-0.007
lyss-0.007
 Grace-0.007
 Isn-0.007
Token'
Token ID6
Feature activation-4.05e-01

Pos logit contributions
 supposed+0.002
 leaning+0.002
 Tom+0.001
 allergic+0.001
 done+0.001
Neg logit contributions
 trends-0.002
lyss-0.002
 Grace-0.002
 Grac-0.002
 Flor-0.002
Tokenmy
Token ID1820
Feature activation-1.04e+00

Pos logit contributions
 thieves+0.044
 leaning+0.044
 supposed+0.043
itt+0.042
iers+0.041
Neg logit contributions
 Grace-0.040
 dance-0.038
 dances-0.037
 coral-0.037
 costume-0.037
Token had
Token ID550
Feature activation-1.11e+00

Pos logit contributions
 supposed+0.053
 leaning+0.052
 thieves+0.048
 allergic+0.047
agged+0.047
Neg logit contributions
 Grace-0.043
 trends-0.042
lyss-0.041
 Grac-0.040
 Nina-0.040
Token a
Token ID257
Feature activation-3.41e-02

Pos logit contributions
ared+0.050
 leaning+0.050
arl+0.050
owing+0.050
aced+0.049
Neg logit contributions
 Grace-0.049
 dance-0.047
 tut-0.046
 Grac-0.046
 Nina-0.045
Token long
Token ID890
Feature activation-2.47e+00

Pos logit contributions
 supposed+0.001
 leaning+0.001
aked+0.001
agged+0.001
 allergic+0.001
Neg logit contributions
 Grace-0.002
 trends-0.002
 Grac-0.002
lyss-0.002
 Flor-0.002
Token screw
Token ID9580
Feature activation-3.60e+00

Pos logit contributions
 supposed+0.120
 leaning+0.109
aked+0.099
 allergic+0.098
 thieves+0.097
Neg logit contributions
 Grace-0.113
 trends-0.110
 dance-0.110
lyss-0.107
 Grac-0.106
Tokendriver
Token ID26230
Feature activation-6.03e+00

Pos logit contributions
 supposed+0.159
 leaning+0.148
agged+0.143
 allergic+0.143
aked+0.137
Neg logit contributions
 trends-0.172
 costume-0.160
fish-0.160
 Grace-0.159
 dance-0.157
Token that
Token ID326
Feature activation+4.52e-01

Pos logit contributions
 supposed+0.302
aked+0.290
ige+0.284
agged+0.282
 leaning+0.280
Neg logit contributions
 trends-0.224
 dance-0.217
 Grace-0.206
 costume-0.202
 tut-0.201
Token he
Token ID339
Feature activation+1.81e-01

Pos logit contributions
 trends+0.018
 Grace+0.018
 Grac+0.018
lyss+0.017
 Flor+0.017
Neg logit contributions
 supposed-0.024
 leaning-0.023
 allergic-0.021
 Tom-0.021
aked-0.021
Token loved
Token ID6151
Feature activation-4.14e-02

Pos logit contributions
 trends+0.008
 Grace+0.008
 Grac+0.007
lyss+0.007
 Nina+0.007
Neg logit contributions
 leaning-0.009
 supposed-0.009
 allergic-0.008
 Tom-0.008
 thieves-0.008
Token to
Token ID284
Feature activation-6.44e-01

Pos logit contributions
 supposed+0.002
agged+0.002
 leaning+0.002
aked+0.002
ige+0.002
Neg logit contributions
 trends-0.002
 Grace-0.002
 occasions-0.002
 Grac-0.002
 dance-0.002
Token play
Token ID711
Feature activation+1.39e+00

Pos logit contributions
 supposed+0.036
 leaning+0.035
 allergic+0.031
aked+0.031
 Tom+0.031
Neg logit contributions
 trends-0.025
 Grace-0.025
 Grac-0.024
lyss-0.023
 Flor-0.023
Token has
Token ID468
Feature activation-2.25e-01

Pos logit contributions
 supposed+0.014
 leaning+0.014
 done+0.013
 allergic+0.013
 Tom+0.012
Neg logit contributions
 trends-0.013
 Grace-0.013
 Nina-0.012
lyss-0.012
 Grac-0.012
Token all
Token ID477
Feature activation+9.15e-01

Pos logit contributions
 supposed+0.008
 leaning+0.007
 done+0.006
 allergic+0.006
 Tom+0.006
Neg logit contributions
 trends-0.014
 Grace-0.014
lyss-0.013
 Grac-0.013
 Flor-0.013
Token the
Token ID262
Feature activation-1.46e+00

Pos logit contributions
 trends+0.050
 Grace+0.050
 Grac+0.048
lyss+0.048
 Flor+0.047
Neg logit contributions
 supposed-0.037
 leaning-0.036
 allergic-0.031
 Tom-0.030
aked-0.029
Token y
Token ID331
Feature activation-1.48e+00

Pos logit contributions
 leaning+0.058
 supposed+0.056
agged+0.049
hy+0.048
aked+0.048
Neg logit contributions
 coral-0.073
 trends-0.073
 occasions-0.073
 Grac-0.071
 reefs-0.071
Tokenummy
Token ID13513
Feature activation-1.37e+00

Pos logit contributions
 leaning+0.077
 supposed+0.076
hy+0.067
 handcuffs+0.066
eping+0.066
Neg logit contributions
 occasions-0.061
 trends-0.059
 coral-0.059
 reefs-0.057
 Nina-0.054
Token topp
Token ID23126
Feature activation-5.93e+00

Pos logit contributions
 supposed+0.052
 leaning+0.052
 allergic+0.043
agged+0.042
aked+0.042
Neg logit contributions
 trends-0.074
 Grace-0.072
 Grac-0.071
 coral-0.071
lyss-0.071
Tokenings
Token ID654
Feature activation-1.78e+00

Pos logit contributions
 supposed+0.259
 leaning+0.254
 allergic+0.221
 Tom+0.210
aked+0.210
Neg logit contributions
 trends-0.294
 Grace-0.284
 Grac-0.281
 coral-0.278
 occasions-0.277
Token you
Token ID345
Feature activation+7.11e-01

Pos logit contributions
 supposed+0.087
 leaning+0.081
 Tom+0.078
 done+0.074
 allergic+0.072
Neg logit contributions
 trends-0.080
 Flor-0.079
 coral-0.078
lyss-0.078
 Grace-0.077
Token like
Token ID588
Feature activation-8.66e-01

Pos logit contributions
 Grace+0.037
 trends+0.037
lyss+0.036
 Grac+0.035
 Flor+0.034
Neg logit contributions
 supposed-0.033
 leaning-0.029
 done-0.027
 allergic-0.027
 Tom-0.027
Token."
Token ID526
Feature activation+9.06e-01

Pos logit contributions
 supposed+0.042
 leaning+0.041
 Tom+0.037
 allergic+0.036
 done+0.034
Neg logit contributions
 trends-0.041
 Grace-0.039
lyss-0.039
 Grac-0.039
 Flor-0.038
Token (space)
Token ID220
Feature activation+2.54e-01

Pos logit contributions
 trends+0.050
 coral+0.047
fish+0.047
 reefs+0.046
lyss+0.046
Neg logit contributions
 Tom-0.035
 supposed-0.035
 leaning-0.034
hes-0.030
 allergic-0.030
Token had
Token ID550
Feature activation-1.11e+00

Pos logit contributions
 Grace+0.009
 trends+0.008
 Nina+0.008
 Grac+0.008
lyss+0.008
Neg logit contributions
 supposed-0.010
 leaning-0.010
agged-0.009
aked-0.009
 allergic-0.009
Token a
Token ID257
Feature activation+1.14e-01

Pos logit contributions
 supposed+0.049
 leaning+0.049
agged+0.045
aked+0.044
eping+0.043
Neg logit contributions
 coral-0.050
 Grace-0.050
 trends-0.049
lyss-0.049
 Nina-0.049
Token quarrel
Token ID41998
Feature activation-2.20e+00

Pos logit contributions
lyss+0.006
 Grace+0.006
 dance+0.006
 trends+0.006
 Grac+0.006
Neg logit contributions
 supposed-0.004
 leaning-0.004
agged-0.004
aked-0.004
umbled-0.004
Token about
Token ID546
Feature activation-2.03e+00

Pos logit contributions
 supposed+0.086
 leaning+0.083
 allergic+0.075
agged+0.072
aked+0.071
Neg logit contributions
 trends-0.117
 Grace-0.114
 Grac-0.110
lyss-0.110
 Nina-0.108
Token what
Token ID644
Feature activation-2.65e+00

Pos logit contributions
 supposed+0.074
 leaning+0.072
 Tom+0.062
 allergic+0.061
 thieves+0.055
Neg logit contributions
 trends-0.120
 Grace-0.117
lyss-0.116
 Grac-0.115
 Nina-0.112
Token topp
Token ID23126
Feature activation-5.90e+00

Pos logit contributions
 supposed+0.133
 leaning+0.129
 Tom+0.121
 allergic+0.113
 done+0.109
Neg logit contributions
 trends-0.121
 Grace-0.118
 Grac-0.117
 Flor-0.114
 Nina-0.112
Tokenings
Token ID654
Feature activation-2.11e+00

Pos logit contributions
 supposed+0.265
 leaning+0.257
 allergic+0.226
aked+0.217
 Tom+0.215
Neg logit contributions
 trends-0.288
 Grace-0.282
 Grac-0.276
lyss-0.272
 Nina-0.270
Token to
Token ID284
Feature activation-1.48e-01

Pos logit contributions
 supposed+0.096
 leaning+0.092
 Tom+0.083
 allergic+0.081
 done+0.078
Neg logit contributions
 trends-0.103
 Grace-0.101
 Grac-0.100
lyss-0.099
 Flor-0.099
Token put
Token ID1234
Feature activation-2.83e-01

Pos logit contributions
 supposed+0.008
 leaning+0.008
aked+0.007
agged+0.007
 allergic+0.007
Neg logit contributions
 trends-0.006
 Grace-0.006
 dance-0.006
 Grac-0.006
 occasions-0.006
Token on
Token ID319
Feature activation-7.72e-02

Pos logit contributions
 supposed+0.012
 leaning+0.012
 allergic+0.010
agged+0.010
aked+0.010
Neg logit contributions
 Grace-0.014
 trends-0.014
 Grac-0.013
 Nina-0.013
 coral-0.013
Token the
Token ID262
Feature activation-6.03e-01

Pos logit contributions
 supposed+0.003
 leaning+0.003
aked+0.003
 allergic+0.003
agged+0.003
Neg logit contributions
 trends-0.004
 Grace-0.004
 Grac-0.004
lyss-0.004
 Nina-0.004
Token a
Token ID257
Feature activation+2.02e-01

Pos logit contributions
 leaning+0.032
owing+0.032
agged+0.031
 supposed+0.031
ared+0.030
Neg logit contributions
 Grace-0.039
 Grac-0.038
 coral-0.037
 trends-0.037
 Nina-0.037
Token hammer
Token ID15554
Feature activation-3.24e+00

Pos logit contributions
 Grace+0.011
 trends+0.011
 Grac+0.011
lyss+0.011
 dance+0.010
Neg logit contributions
 supposed-0.008
 leaning-0.008
aked-0.007
agged-0.007
 allergic-0.007
Token,
Token ID11
Feature activation+8.86e-02

Pos logit contributions
 supposed+0.174
agged+0.158
 leaning+0.158
 allergic+0.155
aked+0.154
Neg logit contributions
 trends-0.123
 dance-0.116
 Grace-0.113
 Grac-0.112
 tut-0.111
Token a
Token ID257
Feature activation+1.01e+00

Pos logit contributions
 Grace+0.004
 Grac+0.004
 trends+0.004
 Nina+0.004
 coral+0.004
Neg logit contributions
 supposed-0.004
 leaning-0.004
aked-0.003
agged-0.003
 thieves-0.003
Token screw
Token ID9580
Feature activation-3.19e+00

Pos logit contributions
 Grace+0.057
 trends+0.057
 Grac+0.056
lyss+0.056
 Flor+0.054
Neg logit contributions
 supposed-0.038
 leaning-0.035
aked-0.031
agged-0.031
 allergic-0.030
Tokendriver
Token ID26230
Feature activation-5.87e+00

Pos logit contributions
 supposed+0.124
 allergic+0.105
 leaning+0.105
agged+0.102
aked+0.097
Neg logit contributions
 trends-0.177
 Grac-0.172
 Grace-0.170
fish-0.167
 dance-0.166
Token,
Token ID11
Feature activation+2.66e-01

Pos logit contributions
 supposed+0.346
 leaning+0.318
aked+0.310
 allergic+0.310
agged+0.305
Neg logit contributions
 trends-0.192
 Grace-0.183
 dance-0.177
 Grac-0.173
lyss-0.173
Token and
Token ID290
Feature activation+9.21e-02

Pos logit contributions
 Grace+0.011
 trends+0.011
 Grac+0.011
 Nina+0.010
 Flor+0.010
Neg logit contributions
 supposed-0.014
 leaning-0.013
agged-0.012
 allergic-0.012
aked-0.011
Token a
Token ID257
Feature activation+7.55e-01

Pos logit contributions
 Grace+0.005
 trends+0.005
 Grac+0.004
lyss+0.004
 Nina+0.004
Neg logit contributions
 supposed-0.004
 leaning-0.004
 allergic-0.003
aked-0.003
agged-0.003
Token saw
Token ID2497
Feature activation-1.51e+00

Pos logit contributions
 trends+0.043
 Grace+0.043
 Grac+0.042
lyss+0.041
 Flor+0.041
Neg logit contributions
 supposed-0.028
 leaning-0.027
 allergic-0.023
 Tom-0.022
aked-0.022
Token.
Token ID13
Feature activation-1.34e-01

Pos logit contributions
 supposed+0.073
 leaning+0.068
 allergic+0.064
agged+0.060
 thieves+0.060
Neg logit contributions
 trends-0.067
 Grace-0.065
 Isn-0.064
lyss-0.063
 Grac-0.063
Token the
Token ID262
Feature activation-3.59e-01

Pos logit contributions
 trends+0.022
lyss+0.022
 Grace+0.021
 Grac+0.021
 Flor+0.021
Neg logit contributions
 supposed-0.014
 leaning-0.014
 Tom-0.012
 allergic-0.012
aked-0.011
Token world
Token ID995
Feature activation+1.91e+00

Pos logit contributions
 supposed+0.017
 leaning+0.017
 allergic+0.015
 Tom+0.014
aked+0.014
Neg logit contributions
 trends-0.017
 Grace-0.017
 Grac-0.016
lyss-0.016
 Flor-0.016
Token and
Token ID290
Feature activation+2.03e-01

Pos logit contributions
 trends+0.111
 Grac+0.110
 Grace+0.109
lyss+0.109
 Nina+0.107
Neg logit contributions
 supposed-0.069
 leaning-0.068
 Tom-0.058
 allergic-0.053
aked-0.053
Token wanted
Token ID2227
Feature activation-2.32e+00

Pos logit contributions
 trends+0.012
lyss+0.012
 Grace+0.012
 Grac+0.011
maid+0.011
Neg logit contributions
 supposed-0.008
 leaning-0.008
 Tom-0.007
 done-0.006
 allergic-0.006
Token Wh
Token ID854
Feature activation-2.77e+00

Pos logit contributions
 supposed+0.081
 leaning+0.075
 Tom+0.067
 allergic+0.064
aked+0.060
Neg logit contributions
 trends-0.139
 Grace-0.136
lyss-0.134
 Grac-0.133
 Flor-0.130
Tokenisk
Token ID1984
Feature activation-5.85e+00

Pos logit contributions
 leaning+0.109
 supposed+0.102
 Tom+0.088
 allergic+0.088
aged+0.083
Neg logit contributions
 trends-0.151
 Grace-0.146
 Grac-0.144
 occasions-0.143
lyss-0.143
Tokeners
Token ID364
Feature activation-2.91e+00

Pos logit contributions
 leaning+0.240
 supposed+0.230
 allergic+0.210
 Tom+0.188
 vain+0.185
Neg logit contributions
 trends-0.305
 Grace-0.303
lyss-0.295
 Flor-0.295
 Grac-0.292
Token to
Token ID284
Feature activation-3.51e-01

Pos logit contributions
 supposed+0.102
 leaning+0.093
 allergic+0.081
 Tom+0.080
 done+0.078
Neg logit contributions
 Grace-0.173
 Grac-0.171
 trends-0.171
lyss-0.170
 Nina-0.166
Token come
Token ID1282
Feature activation-7.07e-01

Pos logit contributions
 Tom+0.014
 supposed+0.014
 leaning+0.014
hy+0.013
 allergic+0.013
Neg logit contributions
 trends-0.019
lyss-0.018
 Grac-0.018
 Grace-0.017
 Tracy-0.017
Token along
Token ID1863
Feature activation+1.61e+00

Pos logit contributions
 supposed+0.026
 leaning+0.025
 Tom+0.021
 allergic+0.021
eping+0.020
Neg logit contributions
 trends-0.041
lyss-0.041
 Grace-0.039
 occasions-0.039
 Grac-0.039
Token.
Token ID13
Feature activation-2.35e-01

Pos logit contributions
 trends+0.103
lyss+0.102
 Grac+0.099
 Grace+0.097
 Nina+0.096
Neg logit contributions
 supposed-0.049
 leaning-0.048
 Tom-0.041
 allergic-0.039
 enough-0.035
Token saw
Token ID2497
Feature activation-5.58e-01

Pos logit contributions
 trends+0.022
 Grace+0.021
 occasions+0.021
lyss+0.020
 Grac+0.020
Neg logit contributions
 leaning-0.016
 supposed-0.014
 Tom-0.014
aked-0.014
aged-0.013
Token him
Token ID683
Feature activation-1.22e+00

Pos logit contributions
 supposed+0.025
 leaning+0.022
agged+0.021
aked+0.020
 allergic+0.019
Neg logit contributions
 Grace-0.029
 Flor-0.029
 Grac-0.028
 Nina-0.028
 coral-0.028
Token use
Token ID779
Feature activation+3.08e-01

Pos logit contributions
 supposed+0.054
agged+0.049
 leaning+0.048
aked+0.047
 allergic+0.044
Neg logit contributions
 Grace-0.058
 dance-0.058
 trends-0.058
 Grac-0.057
 Flor-0.056
Token a
Token ID257
Feature activation+2.56e-01

Pos logit contributions
 Grace+0.016
 trends+0.016
 Grac+0.016
 Nina+0.016
 Flor+0.015
Neg logit contributions
 supposed-0.013
 leaning-0.012
 allergic-0.010
agged-0.010
aked-0.010
Token screw
Token ID9580
Feature activation-3.62e+00

Pos logit contributions
 trends+0.015
 Grace+0.014
 Flor+0.014
lyss+0.014
 occasions+0.014
Neg logit contributions
 supposed-0.009
 leaning-0.009
 allergic-0.008
eping-0.007
 done-0.007
Tokendriver
Token ID26230
Feature activation-5.84e+00

Pos logit contributions
 supposed+0.132
 leaning+0.114
 allergic+0.112
agged+0.107
ige+0.096
Neg logit contributions
 trends-0.210
 Grac-0.204
 Grace-0.200
 Flor-0.194
 Nina-0.194
Token,
Token ID11
Feature activation+1.43e-01

Pos logit contributions
agged+0.363
 supposed+0.361
ige+0.344
wy+0.340
aked+0.338
Neg logit contributions
 trends-0.147
 dance-0.131
 Grace-0.127
 anytime-0.121
 Grac-0.120
Token a
Token ID257
Feature activation+7.68e-01

Pos logit contributions
 Grace+0.008
 trends+0.008
 Grac+0.008
 Flor+0.008
lyss+0.008
Neg logit contributions
 supposed-0.005
 leaning-0.005
 allergic-0.004
aked-0.004
agged-0.004
Token wrench
Token ID34561
Feature activation-3.98e+00

Pos logit contributions
 trends+0.044
 Grace+0.044
lyss+0.042
 Grac+0.042
 Flor+0.042
Neg logit contributions
 supposed-0.028
 leaning-0.027
 allergic-0.023
 Tom-0.022
eping-0.021
Token,
Token ID11
Feature activation+2.19e-01

Pos logit contributions
 supposed+0.245
 allergic+0.222
agged+0.219
 leaning+0.212
eping+0.207
Neg logit contributions
 trends-0.127
 Grac-0.123
 Grace-0.120
 Isn-0.110
 Flor-0.110
Token and
Token ID290
Feature activation-2.62e-02

Pos logit contributions
 Grace+0.011
 Grac+0.011
 trends+0.011
 Flor+0.011
 Nina+0.010
Neg logit contributions
 supposed-0.010
 leaning-0.009
agged-0.008
 allergic-0.008
aked-0.007
Token and
Token ID290
Feature activation+6.32e-01

Pos logit contributions
 supposed+0.026
 leaning+0.025
 allergic+0.024
agged+0.023
aked+0.023
Neg logit contributions
 trends-0.015
 Grace-0.015
 Grac-0.015
 Nina-0.014
 dance-0.013
Token took
Token ID1718
Feature activation-4.82e-01

Pos logit contributions
 trends+0.033
lyss+0.032
 Grace+0.032
 Isn+0.032
 Flor+0.032
Neg logit contributions
 leaning-0.025
 supposed-0.025
 Tom-0.023
aged-0.022
 allergic-0.021
Token out
Token ID503
Feature activation-7.54e-01

Pos logit contributions
 supposed+0.023
 leaning+0.022
 allergic+0.020
agged+0.019
ige+0.018
Neg logit contributions
 Grace-0.023
 Grac-0.023
 trends-0.022
 Nina-0.022
 Flor-0.022
Token a
Token ID257
Feature activation+7.67e-01

Pos logit contributions
 supposed+0.034
 leaning+0.032
 allergic+0.028
agged+0.028
aked+0.027
Neg logit contributions
 Grace-0.039
 Grac-0.038
 trends-0.037
 Nina-0.037
 Flor-0.036
Token screw
Token ID9580
Feature activation-3.51e+00

Pos logit contributions
 Grace+0.042
 trends+0.041
 Grac+0.041
lyss+0.040
 Nina+0.039
Neg logit contributions
 supposed-0.030
 leaning-0.029
 allergic-0.025
aked-0.025
agged-0.025
Tokendriver
Token ID26230
Feature activation-5.83e+00

Pos logit contributions
 supposed+0.126
agged+0.114
 leaning+0.111
 allergic+0.110
aked+0.099
Neg logit contributions
 trends-0.202
 Grac-0.195
 Grace-0.194
fish-0.188
 dance-0.186
Token.
Token ID13
Feature activation+1.15e-01

Pos logit contributions
agged+0.325
 supposed+0.318
 allergic+0.310
 leaning+0.308
aked+0.307
Neg logit contributions
 trends-0.187
 Grace-0.175
 Grac-0.163
 dance-0.162
 anytime-0.161
Token Then
Token ID3244
Feature activation-8.20e-01

Pos logit contributions
 trends+0.006
 Grace+0.006
 occasions+0.006
fish+0.006
 coral+0.006
Neg logit contributions
 supposed-0.004
 leaning-0.004
 Tom-0.004
 allergic-0.004
aked-0.004
Token he
Token ID339
Feature activation+5.01e-01

Pos logit contributions
 Tom+0.028
 leaning+0.028
 enough+0.027
 supposed+0.024
aged+0.023
Neg logit contributions
 trends-0.048
 occasions-0.046
 reefs-0.044
 Isn-0.044
 coral-0.044
Token carefully
Token ID7773
Feature activation+3.25e-01

Pos logit contributions
 trends+0.027
 Grace+0.026
lyss+0.026
 Flor+0.025
 Grac+0.025
Neg logit contributions
 supposed-0.020
 leaning-0.020
aked-0.017
 Tom-0.017
 allergic-0.017
Token unsc
Token ID28594
Feature activation+1.83e-01

Pos logit contributions
 trends+0.015
 Grac+0.015
 Grace+0.014
 dance+0.014
lyss+0.014
Neg logit contributions
 supposed-0.016
 leaning-0.014
agged-0.014
 allergic-0.014
ige-0.013
Token"
Token ID1
Feature activation-6.05e-01

Pos logit contributions
 leaning+0.001
 supposed+0.001
 Tom+0.001
 allergic+0.001
aked+0.001
Neg logit contributions
 Grac-0.002
 Grace-0.002
 trends-0.002
 Flor-0.002
lyss-0.002
TokenI
Token ID40
Feature activation+1.65e+00

Pos logit contributions
 supposed+0.028
 leaning+0.027
 Tom+0.024
 allergic+0.024
aked+0.023
Neg logit contributions
 trends-0.029
 Grace-0.029
 Grac-0.028
lyss-0.027
 Flor-0.027
Token have
Token ID423
Feature activation-1.59e-01

Pos logit contributions
 trends+0.069
 Grace+0.068
 Grac+0.065
lyss+0.065
 Flor+0.064
Neg logit contributions
 supposed-0.087
 leaning-0.085
 allergic-0.076
aked-0.074
 Tom-0.074
Token a
Token ID257
Feature activation+4.11e-02

Pos logit contributions
 supposed+0.004
 leaning+0.004
 allergic+0.003
 Tom+0.003
aked+0.003
Neg logit contributions
 trends-0.011
 Grace-0.011
lyss-0.010
 Grac-0.010
 Flor-0.010
Token screw
Token ID9580
Feature activation-3.61e+00

Pos logit contributions
 trends+0.002
 Grace+0.002
 Grac+0.002
lyss+0.002
 Flor+0.002
Neg logit contributions
 supposed-0.001
 leaning-0.001
 allergic-0.001
 Tom-0.001
aked-0.001
Tokendriver
Token ID26230
Feature activation-5.81e+00

Pos logit contributions
 supposed+0.117
 leaning+0.111
 allergic+0.095
agged+0.090
aked+0.090
Neg logit contributions
 trends-0.221
 Grace-0.217
 Grac-0.215
lyss-0.211
 Flor-0.209
Token!"
Token ID2474
Feature activation+1.81e-01

Pos logit contributions
 supposed+0.317
 leaning+0.309
 allergic+0.284
aked+0.279
agged+0.278
Neg logit contributions
 trends-0.221
 Grace-0.213
 Grac-0.205
lyss-0.202
 Flor-0.198
Token Ben
Token ID3932
Feature activation-1.82e-01

Pos logit contributions
 trends+0.012
lyss+0.011
 Grace+0.011
 Grac+0.011
 occasions+0.011
Neg logit contributions
 Tom-0.005
 leaning-0.005
 supposed-0.005
 allergic-0.004
 done-0.004
Token says
Token ID1139
Feature activation-1.53e-01

Pos logit contributions
 Tom+0.008
 leaning+0.008
 supposed+0.008
 allergic+0.007
 angry+0.007
Neg logit contributions
 trends-0.009
lyss-0.009
 Grace-0.009
 coral-0.009
 Grac-0.009
Token.
Token ID13
Feature activation-7.28e-03

Pos logit contributions
 supposed+0.008
 leaning+0.007
aked+0.007
wy+0.007
ige+0.007
Neg logit contributions
 Grace-0.007
 Grac-0.007
 dance-0.007
 trends-0.007
 Flor-0.006
Token "
Token ID366
Feature activation+4.44e-01

Pos logit contributions
 supposed+0.000
 leaning+0.000
 Tom+0.000
 allergic+0.000
aked+0.000
Neg logit contributions
 trends-0.000
 Grace-0.000
 Grac-0.000
lyss-0.000
 occasions-0.000
Token tightened
Token ID37257
Feature activation-2.14e+00

Pos logit contributions
 trends+0.004
lyss+0.004
 occasions+0.004
maid+0.004
 Flor+0.004
Neg logit contributions
 leaning-0.003
 supposed-0.003
 Tom-0.003
 done-0.002
aged-0.002
Token it
Token ID340
Feature activation-1.18e-01

Pos logit contributions
 supposed+0.109
 leaning+0.101
 allergic+0.095
agged+0.093
aked+0.090
Neg logit contributions
 Grace-0.096
 Grac-0.094
 trends-0.092
 Nina-0.091
 Flor-0.090
Token with
Token ID351
Feature activation+8.92e-01

Pos logit contributions
 supposed+0.005
 leaning+0.005
 allergic+0.004
agged+0.004
aked+0.004
Neg logit contributions
 Grace-0.006
 trends-0.006
 Grac-0.006
 Flor-0.005
 Nina-0.005
Token a
Token ID257
Feature activation+5.35e-01

Pos logit contributions
 Grace+0.050
 Grac+0.049
 trends+0.048
 Nina+0.047
 Flor+0.046
Neg logit contributions
 supposed-0.035
 leaning-0.033
agged-0.029
 allergic-0.029
aked-0.028
Token screw
Token ID9580
Feature activation-3.55e+00

Pos logit contributions
 trends+0.030
 Grace+0.030
 Grac+0.029
lyss+0.029
 Flor+0.029
Neg logit contributions
 supposed-0.020
 leaning-0.020
 allergic-0.017
 Tom-0.016
aked-0.016
Tokendriver
Token ID26230
Feature activation-5.80e+00

Pos logit contributions
 supposed+0.140
agged+0.125
 leaning+0.124
 allergic+0.123
aked+0.115
Neg logit contributions
 trends-0.191
 Grac-0.184
 Grace-0.183
 dance-0.177
fish-0.177
Token.
Token ID13
Feature activation-1.51e-01

Pos logit contributions
agged+0.279
 supposed+0.279
aked+0.265
 allergic+0.259
 leaning+0.258
Neg logit contributions
 trends-0.238
 Grace-0.221
 dance-0.220
 Grac-0.218
fish-0.208
Token The
Token ID383
Feature activation+8.99e-01

Pos logit contributions
 leaning+0.006
 supposed+0.006
 Tom+0.006
aged+0.005
itt+0.005
Neg logit contributions
fish-0.008
 coral-0.008
 trends-0.007
maid-0.007
 occasions-0.007
Token clock
Token ID8801
Feature activation-2.74e-01

Pos logit contributions
 Grace+0.047
 trends+0.047
 dance+0.045
 Grac+0.045
 occasions+0.044
Neg logit contributions
 supposed-0.036
agged-0.032
aked-0.032
 leaning-0.031
 allergic-0.029
Token was
Token ID373
Feature activation+2.67e-01

Pos logit contributions
 leaning+0.013
 supposed+0.012
 Tom+0.011
 done+0.011
aged+0.011
Neg logit contributions
 trends-0.013
 Grace-0.013
lyss-0.012
 Nina-0.012
 Flor-0.012
Token now
Token ID783
Feature activation+9.04e-01

Pos logit contributions
 Grace+0.013
 Grac+0.013
 trends+0.012
 dance+0.012
 Nina+0.012
Neg logit contributions
agged-0.011
 supposed-0.011
aked-0.011
 thieves-0.010
itt-0.010
Token room
Token ID2119
Feature activation+9.44e-01

Pos logit contributions
 leaning+0.001
 supposed+0.001
 Tom+0.001
 allergic+0.001
 thieves+0.001
Neg logit contributions
 trends-0.001
lyss-0.001
 Grace-0.001
 Grac-0.001
 Isn-0.001
Token and
Token ID290
Feature activation+4.92e-01

Pos logit contributions
 trends+0.046
 Grace+0.046
 Grac+0.044
lyss+0.044
 Flor+0.043
Neg logit contributions
 supposed-0.042
 leaning-0.041
 allergic-0.036
 Tom-0.035
aked-0.035
Token used
Token ID973
Feature activation+1.02e+00

Pos logit contributions
lyss+0.029
 trends+0.029
 Grac+0.027
 reefs+0.027
 Grace+0.027
Neg logit contributions
 Tom-0.018
 leaning-0.018
 supposed-0.016
 done-0.016
aged-0.015
Token a
Token ID257
Feature activation+4.83e-01

Pos logit contributions
 Grace+0.039
 trends+0.039
 Grac+0.038
 Nina+0.037
lyss+0.036
Neg logit contributions
 supposed-0.056
 leaning-0.054
 allergic-0.049
aked-0.048
agged-0.048
Token screw
Token ID9580
Feature activation-3.44e+00

Pos logit contributions
 trends+0.028
 Flor+0.027
 Grace+0.027
lyss+0.027
 Nina+0.026
Neg logit contributions
 leaning-0.017
 supposed-0.017
 allergic-0.014
eping-0.014
 hard-0.014
Tokendriver
Token ID26230
Feature activation-5.79e+00

Pos logit contributions
 supposed+0.110
 leaning+0.100
 allergic+0.091
agged+0.084
aked+0.077
Neg logit contributions
 trends-0.215
 Grac-0.209
 Grace-0.208
 Flor-0.202
 Isn-0.201
Token to
Token ID284
Feature activation-4.64e-01

Pos logit contributions
 supposed+0.302
 leaning+0.295
 allergic+0.286
agged+0.285
ige+0.268
Neg logit contributions
 trends-0.213
 dance-0.187
 Grace-0.186
 outfits-0.185
 occasions-0.184
Token open
Token ID1280
Feature activation+1.19e-01

Pos logit contributions
 leaning+0.021
 supposed+0.020
 Tom+0.019
eping+0.018
 allergic+0.018
Neg logit contributions
lyss-0.023
 trends-0.023
 Grace-0.022
 Flor-0.022
 Isn-0.021
Token it
Token ID340
Feature activation+1.76e-01

Pos logit contributions
lyss+0.006
 trends+0.006
 Grace+0.006
 Grac+0.006
 Flor+0.006
Neg logit contributions
 leaning-0.005
 supposed-0.005
 Tom-0.005
 handcuffs-0.004
 allergic-0.004
Token.
Token ID13
Feature activation+1.01e-01

Pos logit contributions
 trends+0.010
 Grace+0.010
lyss+0.010
 Grac+0.010
 Flor+0.010
Neg logit contributions
 supposed-0.006
 leaning-0.006
 allergic-0.005
 Tom-0.005
aked-0.005
Token (newline)
Token ID198
Feature activation+5.79e-02

Pos logit contributions
 trends+0.006
 Grace+0.005
lyss+0.005
 Grac+0.005
 Flor+0.005
Neg logit contributions
 supposed-0.004
 leaning-0.004
 Tom-0.003
 allergic-0.003
aked-0.003
Token a
Token ID257
Feature activation+2.37e-01

Pos logit contributions
 leaning+0.035
 supposed+0.032
owing+0.032
agged+0.032
ared+0.031
Neg logit contributions
 Grace-0.046
 Grac-0.044
 coral-0.042
 occasions-0.041
 trends-0.041
Token hammer
Token ID15554
Feature activation-2.94e+00

Pos logit contributions
 Grace+0.013
 Grac+0.012
 trends+0.012
lyss+0.012
 dance+0.012
Neg logit contributions
 supposed-0.010
 leaning-0.009
agged-0.009
aked-0.009
umbled-0.008
Token and
Token ID290
Feature activation-3.36e-01

Pos logit contributions
 supposed+0.166
agged+0.153
 leaning+0.150
 allergic+0.147
aked+0.146
Neg logit contributions
 trends-0.101
 Grace-0.095
 dance-0.095
 Grac-0.094
 tut-0.092
Token a
Token ID257
Feature activation+6.61e-01

Pos logit contributions
 supposed+0.015
 leaning+0.015
 allergic+0.013
agged+0.013
aked+0.012
Neg logit contributions
 Grace-0.016
 trends-0.016
 Grac-0.016
lyss-0.016
 Nina-0.016
Token screw
Token ID9580
Feature activation-3.34e+00

Pos logit contributions
 Grace+0.036
 trends+0.035
 Grac+0.035
lyss+0.034
 Flor+0.034
Neg logit contributions
 supposed-0.027
 leaning-0.025
aked-0.022
agged-0.022
 allergic-0.022
Tokendriver
Token ID26230
Feature activation-5.74e+00

Pos logit contributions
 supposed+0.120
agged+0.105
 allergic+0.104
 leaning+0.104
ige+0.096
Neg logit contributions
 trends-0.189
 Grac-0.180
fish-0.180
 dance-0.176
 Grace-0.176
Token that
Token ID326
Feature activation+6.35e-01

Pos logit contributions
 supposed+0.286
 leaning+0.270
ige+0.268
agged+0.263
aked+0.262
Neg logit contributions
 trends-0.216
 dance-0.209
fish-0.198
 occasions-0.192
 Grace-0.192
Token he
Token ID339
Feature activation+2.88e-01

Pos logit contributions
 trends+0.029
 Grace+0.028
lyss+0.027
 Grac+0.027
 Flor+0.027
Neg logit contributions
 supposed-0.031
 leaning-0.030
 Tom-0.027
 allergic-0.027
aked-0.026
Token liked
Token ID8288
Feature activation-1.23e+00

Pos logit contributions
 trends+0.014
 Grace+0.013
 Grac+0.013
lyss+0.013
 Flor+0.013
Neg logit contributions
 supposed-0.013
 leaning-0.013
 allergic-0.012
 Tom-0.011
aked-0.011
Token to
Token ID284
Feature activation-6.50e-01

Pos logit contributions
 supposed+0.052
 leaning+0.050
 allergic+0.045
agged+0.045
aked+0.045
Neg logit contributions
 Grace-0.062
 trends-0.062
 Grac-0.060
lyss-0.059
 Flor-0.059
Token store
Token ID3650
Feature activation-1.40e-01

Pos logit contributions
 supposed+0.033
 leaning+0.032
 allergic+0.029
 Tom+0.028
aked+0.028
Neg logit contributions
 trends-0.028
 Grace-0.028
 Grac-0.027
lyss-0.027
 Flor-0.026
Token a
Token ID257
Feature activation+1.08e+00

Pos logit contributions
 supposed+0.015
 leaning+0.015
 allergic+0.012
 Tom+0.012
aked+0.012
Neg logit contributions
 trends-0.026
 Grace-0.026
lyss-0.025
 Grac-0.025
 Flor-0.025
Token special
Token ID2041
Feature activation+1.05e+00

Pos logit contributions
 trends+0.058
 Grace+0.057
 Grac+0.055
 Flor+0.055
lyss+0.055
Neg logit contributions
 supposed-0.043
 leaning-0.042
 allergic-0.036
 Tom-0.034
eping-0.034
Token treat
Token ID2190
Feature activation-5.15e-01

Pos logit contributions
 trends+0.053
 Grace+0.051
 Grac+0.050
 Flor+0.050
lyss+0.049
Neg logit contributions
 leaning-0.045
 supposed-0.045
 allergic-0.039
 Tom-0.037
eping-0.036
Token for
Token ID329
Feature activation-1.24e+00

Pos logit contributions
 supposed+0.019
 leaning+0.018
 allergic+0.015
 Tom+0.015
aked+0.015
Neg logit contributions
 trends-0.030
 Grace-0.029
 Grac-0.029
lyss-0.028
 Flor-0.028
Token the
Token ID262
Feature activation-1.88e+00

Pos logit contributions
 supposed+0.052
 leaning+0.051
 allergic+0.044
 Tom+0.044
aked+0.042
Neg logit contributions
 trends-0.064
 Grace-0.062
lyss-0.061
 Grac-0.061
 Flor-0.059
Token chip
Token ID11594
Feature activation-5.73e+00

Pos logit contributions
 supposed+0.084
 leaning+0.081
aked+0.073
agged+0.072
eping+0.070
Neg logit contributions
 Grace-0.093
 trends-0.092
 Grac-0.091
lyss-0.090
 Flor-0.088
Tokenm
Token ID76
Feature activation-2.36e+00

Pos logit contributions
 supposed+0.253
 leaning+0.244
 allergic+0.211
aked+0.208
agged+0.206
Neg logit contributions
 trends-0.286
 Grace-0.285
 Grac-0.278
lyss-0.276
 Flor-0.271
Tokenunk
Token ID2954
Feature activation-7.44e-01

Pos logit contributions
 leaning+0.074
 supposed+0.069
 handcuffs+0.058
eping+0.055
 Jerry+0.055
Neg logit contributions
 Grace-0.138
 Grac-0.137
fish-0.136
 Nina-0.136
lyss-0.135
Token.
Token ID13
Feature activation-2.01e-01

Pos logit contributions
 supposed+0.027
 leaning+0.025
 allergic+0.022
 Tom+0.021
aked+0.020
Neg logit contributions
 trends-0.044
 Grace-0.042
lyss-0.042
 Grac-0.042
 Flor-0.041
Token She
Token ID1375
Feature activation+8.56e-01

Pos logit contributions
 leaning+0.007
 Tom+0.007
 supposed+0.006
 Jerry+0.006
itt+0.006
Neg logit contributions
 trends-0.012
fish-0.012
 coral-0.011
maid-0.011
 occasions-0.011
Token said
Token ID531
Feature activation-3.92e-01

Pos logit contributions
 trends+0.048
 conclusions+0.045
 occasions+0.044
 Isn+0.043
 Grace+0.042
Neg logit contributions
 leaning-0.033
 supposed-0.031
 Tom-0.030
 eaten-0.029
aged-0.029
Token a
Token ID257
Feature activation+1.41e-01

Pos logit contributions
 leaning+0.028
 supposed+0.028
agged+0.027
owing+0.026
 allergic+0.025
Neg logit contributions
 Grace-0.039
 Grac-0.038
 trends-0.037
 Nina-0.036
 coral-0.036
Token hammer
Token ID15554
Feature activation-3.03e+00

Pos logit contributions
 Grace+0.008
 trends+0.008
 Grac+0.008
lyss+0.007
 dance+0.007
Neg logit contributions
 supposed-0.006
 leaning-0.005
aked-0.005
agged-0.005
 allergic-0.005
Token,
Token ID11
Feature activation+1.37e-01

Pos logit contributions
 supposed+0.176
agged+0.159
 leaning+0.157
 allergic+0.157
aked+0.154
Neg logit contributions
 trends-0.100
 Grac-0.093
 dance-0.093
 Grace-0.092
fish-0.091
Token a
Token ID257
Feature activation+1.02e+00

Pos logit contributions
 Grace+0.006
 Grac+0.006
 trends+0.006
 Nina+0.006
 coral+0.006
Neg logit contributions
 supposed-0.007
 leaning-0.006
aked-0.006
agged-0.006
 allergic-0.006
Token screw
Token ID9580
Feature activation-3.13e+00

Pos logit contributions
 Grace+0.058
 trends+0.058
 Grac+0.057
lyss+0.056
 Nina+0.055
Neg logit contributions
 supposed-0.037
 leaning-0.035
aked-0.031
agged-0.030
 allergic-0.030
Tokendriver
Token ID26230
Feature activation-5.71e+00

Pos logit contributions
 supposed+0.127
 allergic+0.107
 leaning+0.104
agged+0.104
aked+0.098
Neg logit contributions
 trends-0.168
 Grac-0.165
 Grace-0.161
fish-0.160
 dance-0.157
Token,
Token ID11
Feature activation+2.90e-01

Pos logit contributions
 supposed+0.359
aked+0.326
 leaning+0.325
 allergic+0.322
agged+0.320
Neg logit contributions
 trends-0.157
 Grace-0.148
 dance-0.143
 Grac-0.139
 tut-0.138
Token and
Token ID290
Feature activation+1.33e-01

Pos logit contributions
 Grace+0.012
 trends+0.011
 Grac+0.011
 Nina+0.011
 Flor+0.011
Neg logit contributions
 supposed-0.015
 leaning-0.014
agged-0.013
 allergic-0.013
aked-0.013
Token a
Token ID257
Feature activation+7.11e-01

Pos logit contributions
 trends+0.007
 Grace+0.007
 Grac+0.007
lyss+0.006
 Nina+0.006
Neg logit contributions
 supposed-0.006
 leaning-0.005
 allergic-0.005
aked-0.005
agged-0.005
Token saw
Token ID2497
Feature activation-1.45e+00

Pos logit contributions
 trends+0.041
 Grace+0.041
 Grac+0.040
lyss+0.039
 Flor+0.039
Neg logit contributions
 supposed-0.026
 leaning-0.025
 allergic-0.021
 Tom-0.020
aked-0.020
Token.
Token ID13
Feature activation-2.08e-01

Pos logit contributions
 supposed+0.069
 leaning+0.064
 allergic+0.060
 thieves+0.056
agged+0.056
Neg logit contributions
 trends-0.064
 Isn-0.063
 Grace-0.063
 coral-0.061
 Grac-0.061
Token a
Token ID257
Feature activation+1.28e+00

Pos logit contributions
 supposed+0.005
 leaning+0.005
 Tom+0.004
 allergic+0.004
aked+0.004
Neg logit contributions
 trends-0.009
lyss-0.009
 Grace-0.009
 Grac-0.009
 Isn-0.009
Token hammer
Token ID15554
Feature activation-2.71e+00

Pos logit contributions
 trends+0.070
 Grace+0.069
 Flor+0.067
 Grac+0.067
lyss+0.066
Neg logit contributions
 supposed-0.049
 leaning-0.048
 allergic-0.042
 Tom-0.039
eping-0.038
Token and
Token ID290
Feature activation+4.08e-01

Pos logit contributions
 supposed+0.140
agged+0.131
 leaning+0.128
 allergic+0.126
aked+0.120
Neg logit contributions
 trends-0.111
 Grace-0.111
 Grac-0.110
 Nina-0.103
 dance-0.099
Token a
Token ID257
Feature activation+9.29e-01

Pos logit contributions
 trends+0.022
lyss+0.021
 Grace+0.021
 Flor+0.021
 Isn+0.020
Neg logit contributions
 supposed-0.017
 leaning-0.016
 Tom-0.015
 done-0.014
 allergic-0.014
Token screw
Token ID9580
Feature activation-3.36e+00

Pos logit contributions
 trends+0.051
 Grace+0.051
 Grac+0.049
lyss+0.049
 Flor+0.049
Neg logit contributions
 supposed-0.036
 leaning-0.035
 allergic-0.030
 Tom-0.029
aked-0.028
Tokendriver
Token ID26230
Feature activation-5.70e+00

Pos logit contributions
 supposed+0.116
 allergic+0.100
 leaning+0.098
agged+0.097
ige+0.087
Neg logit contributions
 Grac-0.197
 trends-0.195
 Grace-0.191
fish-0.183
 Isn-0.182
Token.
Token ID13
Feature activation-6.91e-02

Pos logit contributions
 supposed+0.282
 leaning+0.264
 allergic+0.262
agged+0.262
wy+0.253
Neg logit contributions
 trends-0.226
 Grace-0.220
 Grac-0.217
 Nina-0.205
 dance-0.202
Token
Token ID198
Feature activation-1.99e-01

Pos logit contributions
 Tom+0.003
 supposed+0.003
 leaning+0.003
aked+0.003
hes+0.003
Neg logit contributions
 trends-0.003
 occasions-0.003
fish-0.003
 outfits-0.003
maid-0.003
Token
Token ID198
Feature activation-5.49e-01

Pos logit contributions
 supposed+0.007
 Tom+0.006
 leaning+0.006
hes+0.006
aked+0.006
Neg logit contributions
 trends-0.012
 occasions-0.011
maid-0.011
 outfits-0.011
 Grace-0.011
TokenM
Token ID44
Feature activation-7.59e-01

Pos logit contributions
 supposed+0.014
hes+0.014
aked+0.014
 leaning+0.013
itt+0.013
Neg logit contributions
 trends-0.036
maid-0.034
 Grace-0.034
 occasions-0.034
 outfits-0.033
Tokenolly
Token ID5098
Feature activation+2.09e+00

Pos logit contributions
 supposed+0.036
 leaning+0.035
 allergic+0.031
aked+0.030
 Tom+0.030
Neg logit contributions
 trends-0.035
 Grace-0.035
 Grac-0.034
lyss-0.034
 Flor-0.033
Token
Token ID198
Feature activation-7.30e-01

Pos logit contributions
 supposed+0.018
 leaning+0.018
 allergic+0.016
 thieves+0.016
aked+0.016
Neg logit contributions
 Grace-0.012
 trends-0.011
 Grac-0.011
lyss-0.011
 Flor-0.011
TokenOne
Token ID3198
Feature activation-3.88e-01

Pos logit contributions
 supposed+0.041
 leaning+0.039
 thieves+0.036
 allergic+0.036
agged+0.035
Neg logit contributions
 trends-0.027
 Grace-0.027
lyss-0.025
 Grac-0.025
 occasions-0.025
Token day
Token ID1110
Feature activation+1.06e+00

Pos logit contributions
 supposed+0.020
 leaning+0.019
agged+0.019
 handcuffs+0.019
 thieves+0.019
Neg logit contributions
 dance-0.017
 Grace-0.016
lyss-0.015
 coral-0.015
 dancer-0.015
Token,
Token ID11
Feature activation-6.68e-01

Pos logit contributions
 Grace+0.031
 Grac+0.028
lyss+0.027
 Bub+0.026
 dance+0.026
Neg logit contributions
 supposed-0.073
ige-0.064
 fault-0.064
 leaning-0.064
wy-0.062
Token Wh
Token ID854
Feature activation-2.48e+00

Pos logit contributions
 supposed+0.039
 leaning+0.033
 fault+0.032
aked+0.032
ige+0.032
Neg logit contributions
 Grace-0.028
 Bub-0.027
 Grac-0.026
 dance-0.025
 Flor-0.024
Tokenisk
Token ID1984
Feature activation-5.69e+00

Pos logit contributions
 leaning+0.117
 supposed+0.108
 allergic+0.093
 Tom+0.093
aged+0.092
Neg logit contributions
 trends-0.110
 Grace-0.105
 Flor-0.104
 Grac-0.103
lyss-0.102
Tokeners
Token ID364
Feature activation-2.25e+00

Pos logit contributions
 leaning+0.245
 supposed+0.235
 allergic+0.218
 vain+0.195
 Tom+0.194
Neg logit contributions
 Grace-0.283
 trends-0.279
 Flor-0.276
 Grac-0.272
lyss-0.270
Token'
Token ID6
Feature activation-8.34e-01

Pos logit contributions
 supposed+0.102
 leaning+0.100
 allergic+0.089
 Tom+0.089
aked+0.087
Neg logit contributions
 trends-0.109
 Grace-0.107
 Grac-0.105
lyss-0.103
 Flor-0.103
Token friend
Token ID1545
Feature activation+1.02e+00

Pos logit contributions
 supposed+0.040
 leaning+0.038
bench+0.033
 allergic+0.033
agged+0.032
Neg logit contributions
 Grace-0.039
 trends-0.038
 Flor-0.038
lyss-0.038
 Grac-0.038
Token,
Token ID11
Feature activation-1.34e-01

Pos logit contributions
 Grace+0.048
 Grac+0.046
lyss+0.046
 trends+0.045
 Nina+0.044
Neg logit contributions
 supposed-0.048
 leaning-0.047
agged-0.042
 allergic-0.041
eping-0.040
Token a
Token ID257
Feature activation+7.74e-01

Pos logit contributions
 supposed+0.005
 leaning+0.004
agged+0.004
 allergic+0.004
aked+0.004
Neg logit contributions
 Grace-0.008
 Grac-0.008
 Flor-0.008
 Nina-0.008
lyss-0.008
Token,
Token ID11
Feature activation+4.78e-01

Pos logit contributions
 supposed+0.078
 leaning+0.077
 allergic+0.067
 Tom+0.066
aked+0.064
Neg logit contributions
 trends-0.088
 Grace-0.087
 Grac-0.085
lyss-0.084
 Flor-0.082
Token fifteen
Token ID17280
Feature activation-2.66e+00

Pos logit contributions
 Grace+0.030
 Grac+0.030
 trends+0.030
 Flor+0.030
 coral+0.029
Neg logit contributions
 supposed-0.015
 leaning-0.013
 allergic-0.011
agged-0.011
aked-0.011
Token,
Token ID11
Feature activation-2.18e-01

Pos logit contributions
 supposed+0.131
 leaning+0.125
 allergic+0.112
aked+0.110
agged+0.109
Neg logit contributions
 trends-0.118
 Grace-0.117
 Grac-0.113
lyss-0.112
 Flor-0.111
Token sixteen
Token ID27913
Feature activation-3.00e+00

Pos logit contributions
 supposed+0.006
 leaning+0.006
 allergic+0.005
aked+0.005
agged+0.004
Neg logit contributions
 trends-0.014
 Grace-0.014
 Grac-0.014
 Flor-0.014
lyss-0.014
Token,
Token ID11
Feature activation-1.88e-01

Pos logit contributions
 leaning+0.122
 supposed+0.121
 Tom+0.105
 allergic+0.104
 done+0.100
Neg logit contributions
 trends-0.159
 Grace-0.158
lyss-0.153
 Grac-0.152
 Nina-0.150
Token seventeen
Token ID38741
Feature activation-5.69e+00

Pos logit contributions
 supposed+0.005
 leaning+0.005
 allergic+0.004
aked+0.004
agged+0.003
Neg logit contributions
 trends-0.013
 Grace-0.013
 Grac-0.012
 Flor-0.012
lyss-0.012
Token,
Token ID11
Feature activation+2.40e-01

Pos logit contributions
 supposed+0.338
 leaning+0.302
aked+0.292
 allergic+0.283
 handcuffs+0.277
Neg logit contributions
 trends-0.189
 coral-0.188
 Flor-0.187
 dance-0.184
 dances-0.184
Token eighteen
Token ID29095
Feature activation-3.61e+00

Pos logit contributions
 Flor+0.015
 Grace+0.015
 Grac+0.015
 trends+0.015
 coral+0.014
Neg logit contributions
 supposed-0.008
 leaning-0.007
 allergic-0.006
aked-0.006
agged-0.006
Token,
Token ID11
Feature activation+4.15e-01

Pos logit contributions
 supposed+0.171
 leaning+0.164
 allergic+0.146
aked+0.144
agged+0.142
Neg logit contributions
 trends-0.166
 Grace-0.165
 Grac-0.161
lyss-0.159
 Flor-0.158
Token nineteen
Token ID43063
Feature activation-2.17e+00

Pos logit contributions
 Grace+0.025
 trends+0.025
 Grac+0.025
 Flor+0.025
 coral+0.024
Neg logit contributions
 supposed-0.013
 leaning-0.012
 allergic-0.010
aked-0.010
agged-0.010
Token,
Token ID11
Feature activation+6.33e-01

Pos logit contributions
 supposed+0.090
 leaning+0.088
 allergic+0.077
 Tom+0.076
aked+0.074
Neg logit contributions
 trends-0.114
 Grace-0.113
lyss-0.110
 Grac-0.109
 Nina-0.106
Token good
Token ID922
Feature activation+3.45e-01

Pos logit contributions
 trends+0.182
 Grace+0.178
 occasions+0.176
 Grac+0.174
lyss+0.174
Neg logit contributions
 supposed-0.077
 leaning-0.068
 done-0.063
 allergic-0.060
 Tom-0.057
Token.
Token ID13
Feature activation+4.86e-01

Pos logit contributions
 trends+0.018
lyss+0.018
 Flor+0.018
 Grace+0.018
 Grac+0.017
Neg logit contributions
 supposed-0.015
 leaning-0.014
 Tom-0.013
 done-0.013
 enough-0.012
Token Can
Token ID1680
Feature activation+1.47e+00

Pos logit contributions
fish+0.025
 coral+0.023
 trends+0.023
 outfits+0.022
maid+0.022
Neg logit contributions
 Tom-0.023
 supposed-0.021
 leaning-0.020
itt-0.019
 Jerry-0.019
Token you
Token ID345
Feature activation+1.94e+00

Pos logit contributions
 coral+0.056
 trends+0.056
 dances+0.053
 reefs+0.053
 outfits+0.052
Neg logit contributions
 Tom-0.086
 leaning-0.079
 supposed-0.073
 Jerry-0.073
 done-0.072
Token help
Token ID1037
Feature activation+2.26e+00

Pos logit contributions
 trends+0.103
lyss+0.098
 Grace+0.097
 dances+0.097
 reefs+0.096
Neg logit contributions
 supposed-0.079
 leaning-0.075
 done-0.074
 Tom-0.069
 allergic-0.068
Token me
Token ID502
Feature activation+2.82e+00

Pos logit contributions
 trends+0.090
lyss+0.087
 dances+0.084
 Flor+0.084
 reefs+0.084
Neg logit contributions
 supposed-0.123
 leaning-0.115
 Tom-0.115
 allergic-0.109
 done-0.109
Token make
Token ID787
Feature activation+1.28e+00

Pos logit contributions
 trends+0.139
 Grace+0.135
lyss+0.134
 Flor+0.130
 Isn+0.130
Neg logit contributions
 supposed-0.131
 leaning-0.129
 allergic-0.115
 done-0.115
 Tom-0.107
Token some
Token ID617
Feature activation+1.85e+00

Pos logit contributions
 trends+0.069
 Grace+0.067
lyss+0.066
 Grac+0.065
 Flor+0.064
Neg logit contributions
 supposed-0.052
 leaning-0.049
 allergic-0.045
 Tom-0.044
 done-0.042
Token icing
Token ID41567
Feature activation-1.19e+00

Pos logit contributions
 Grace+0.114
 trends+0.112
lyss+0.111
 Grac+0.111
 Flor+0.109
Neg logit contributions
 supposed-0.062
 leaning-0.058
 allergic-0.054
 done-0.053
 chocolate-0.050
Token?"
Token ID1701
Feature activation+1.18e+00

Pos logit contributions
 supposed+0.061
 leaning+0.060
 done+0.055
 Tom+0.054
hes+0.052
Neg logit contributions
 Grace-0.050
 trends-0.049
lyss-0.048
 Flor-0.048
 Grac-0.047
Token Jane
Token ID12091
Feature activation+1.82e+00

Pos logit contributions
 trends+0.069
 coral+0.066
fish+0.065
maid+0.064
 Grace+0.064
Neg logit contributions
 Tom-0.042
 leaning-0.040
 supposed-0.039
 Jerry-0.034
aked-0.033
Token too
Token ID1165
Feature activation-1.54e+00

Pos logit contributions
 supposed+0.111
 leaning+0.108
agged+0.100
aked+0.098
 allergic+0.094
Neg logit contributions
 Grace-0.143
 trends-0.141
 Grac-0.138
 Flor-0.135
 Nina-0.135
Token soft
Token ID2705
Feature activation-7.77e-01

Pos logit contributions
agged+0.059
 supposed+0.055
aked+0.053
 leaning+0.051
hy+0.050
Neg logit contributions
 dance-0.079
 Grace-0.078
fish-0.076
 coral-0.076
 we-0.075
Token.
Token ID13
Feature activation-2.39e-01

Pos logit contributions
 supposed+0.034
 leaning+0.032
 allergic+0.029
aked+0.028
agged+0.028
Neg logit contributions
 Grace-0.038
 trends-0.038
 Grac-0.037
 Nina-0.036
 Flor-0.036
Token Squ
Token ID5056
Feature activation-7.00e-01

Pos logit contributions
 Tom+0.009
 leaning+0.009
 supposed+0.009
 Jerry+0.008
 allergic+0.007
Neg logit contributions
 trends-0.014
fish-0.013
 coral-0.013
 outfits-0.013
 dance-0.013
Tokenee
Token ID1453
Feature activation+1.24e+00

Pos logit contributions
 supposed+0.029
 leaning+0.028
 allergic+0.023
 Tom+0.022
 thieves+0.022
Neg logit contributions
 trends-0.037
lyss-0.035
 Grace-0.035
 Grac-0.035
 Nina-0.035
Tokenze
Token ID2736
Feature activation+2.85e+00

Pos logit contributions
 trends+0.066
 Grace+0.065
 Grac+0.062
lyss+0.062
 Nina+0.061
Neg logit contributions
 supposed-0.051
 leaning-0.051
aked-0.045
 allergic-0.044
 Tom-0.044
Token it
Token ID340
Feature activation+1.66e+00

Pos logit contributions
 trends+0.156
 coral+0.149
 Grace+0.148
 Grac+0.146
 occasions+0.146
Neg logit contributions
 leaning-0.113
 supposed-0.113
 Tom-0.113
 done-0.104
hes-0.102
Token just
Token ID655
Feature activation+1.99e+00

Pos logit contributions
 trends+0.107
lyss+0.105
 Grace+0.102
 Nina+0.102
 occasions+0.101
Neg logit contributions
 leaning-0.055
 supposed-0.055
 allergic-0.049
 done-0.046
 hard-0.045
Token fair
Token ID3148
Feature activation+4.57e-01

Pos logit contributions
 trends+0.106
lyss+0.101
 Grace+0.098
 Grac+0.096
 occasions+0.094
Neg logit contributions
 supposed-0.088
 leaning-0.087
 Tom-0.080
 allergic-0.076
 done-0.075
Token.
Token ID13
Feature activation+1.35e-01

Pos logit contributions
 Grace+0.027
lyss+0.026
 trends+0.026
 Grac+0.026
 Flor+0.025
Neg logit contributions
 supposed-0.017
 leaning-0.016
 done-0.014
 Tom-0.014
aged-0.013
Token Do
Token ID2141
Feature activation+9.74e-01

Pos logit contributions
 trends+0.008
fish+0.007
 dance+0.007
 outfits+0.007
 coral+0.007
Neg logit contributions
 Tom-0.006
 leaning-0.005
 supposed-0.005
 Jerry-0.004
itt-0.004
Token and
Token ID290
Feature activation+1.66e-01

Pos logit contributions
 supposed+0.092
 leaning+0.086
agged+0.084
 allergic+0.081
aked+0.081
Neg logit contributions
 Grace-0.063
 Grac-0.063
 trends-0.062
 dance-0.061
 coral-0.059
Token paper
Token ID3348
Feature activation-9.72e-01

Pos logit contributions
 trends+0.009
lyss+0.009
 Grace+0.009
 Flor+0.009
 Grac+0.008
Neg logit contributions
 leaning-0.006
 supposed-0.006
 Tom-0.006
aked-0.005
eping-0.005
Token and
Token ID290
Feature activation+4.18e-01

Pos logit contributions
 supposed+0.042
 leaning+0.040
 allergic+0.035
aked+0.034
 Tom+0.034
Neg logit contributions
 trends-0.049
 Grace-0.049
 Grac-0.048
lyss-0.047
 Flor-0.047
Token made
Token ID925
Feature activation+2.85e-01

Pos logit contributions
 trends+0.022
lyss+0.021
 Grace+0.021
 Flor+0.020
 Grac+0.020
Neg logit contributions
 leaning-0.017
 supposed-0.017
 Tom-0.016
aked-0.015
eping-0.014
Token his
Token ID465
Feature activation+3.49e-01

Pos logit contributions
 Grace+0.016
 trends+0.016
 Grac+0.015
 Flor+0.015
 Nina+0.015
Neg logit contributions
 supposed-0.011
 leaning-0.011
 allergic-0.009
agged-0.009
aked-0.009
Token own
Token ID898
Feature activation+3.06e+00

Pos logit contributions
 trends+0.016
 Grace+0.016
lyss+0.015
 Grac+0.015
 Nina+0.015
Neg logit contributions
 supposed-0.017
 leaning-0.016
 allergic-0.015
 Tom-0.014
aked-0.014
Token cartoon
Token ID16251
Feature activation+9.47e-01

Pos logit contributions
 trends+0.148
lyss+0.147
 Grace+0.146
 Grac+0.143
 Flor+0.142
Neg logit contributions
 supposed-0.140
 leaning-0.135
 allergic-0.120
 Tom-0.118
eping-0.115
Token characters
Token ID3435
Feature activation-5.10e-01

Pos logit contributions
 Grace+0.057
 trends+0.056
lyss+0.056
 Grac+0.055
 Nina+0.054
Neg logit contributions
 supposed-0.032
 leaning-0.031
 Tom-0.026
 allergic-0.025
aked-0.024
Token.
Token ID13
Feature activation+1.94e-01

Pos logit contributions
 supposed+0.019
 leaning+0.018
 allergic+0.016
aked+0.015
agged+0.015
Neg logit contributions
 trends-0.028
 Grace-0.028
 Grac-0.027
lyss-0.027
 Flor-0.027
Token He
Token ID679
Feature activation-6.52e-01

Pos logit contributions
 Isn+0.007
 Grace+0.007
 Nina+0.007
 trends+0.007
 Grac+0.007
Neg logit contributions
 supposed-0.012
 leaning-0.011
 allergic-0.011
 thieves-0.010
eping-0.010
Token had
Token ID550
Feature activation-3.84e-01

Pos logit contributions
 leaning+0.024
 supposed+0.024
hes+0.023
 Tom+0.023
 done+0.022
Neg logit contributions
 trends-0.036
 Grace-0.035
lyss-0.034
 conclusions-0.034
 occasions-0.034
Tokened
Token ID276
Feature activation-7.86e-01

Pos logit contributions
 trends+0.198
lyss+0.195
 Grace+0.193
 occasions+0.193
 Grac+0.192
Neg logit contributions
 supposed-0.113
 leaning-0.100
aked-0.093
 done-0.088
agged-0.083
Token so
Token ID523
Feature activation+1.82e+00

Pos logit contributions
 supposed+0.030
 leaning+0.028
 enough+0.025
 Tom+0.025
 done+0.025
Neg logit contributions
lyss-0.046
 trends-0.045
 occasions-0.041
 Tracy-0.041
 Grac-0.041
Token beautifully
Token ID21104
Feature activation+1.51e+00

Pos logit contributions
 trends+0.128
lyss+0.122
 occasions+0.122
 outfits+0.120
 Grac+0.120
Neg logit contributions
 leaning-0.044
 supposed-0.042
 hard-0.037
 allergic-0.034
 Tom-0.032
Token."
Token ID526
Feature activation+1.05e+00

Pos logit contributions
 trends+0.075
 Grac+0.074
lyss+0.073
 Grace+0.070
 Flor+0.068
Neg logit contributions
 supposed-0.071
 leaning-0.069
 done-0.062
 enough-0.060
 Tom-0.055
Token The
Token ID383
Feature activation+1.04e+00

Pos logit contributions
 trends+0.063
fish+0.061
 coral+0.060
maid+0.060
 reefs+0.058
Neg logit contributions
 Tom-0.038
 leaning-0.034
 supposed-0.031
 Jerry-0.030
itt-0.028
Token flower
Token ID15061
Feature activation+3.10e+00

Pos logit contributions
 Grace+0.045
 trends+0.044
 Flor+0.044
lyss+0.043
 Grac+0.042
Neg logit contributions
 supposed-0.053
 leaning-0.049
aked-0.045
 allergic-0.045
agged-0.045
Token felt
Token ID2936
Feature activation+7.96e-01

Pos logit contributions
 trends+0.177
 occasions+0.164
 conclusions+0.163
 Isn+0.160
lyss+0.157
Neg logit contributions
 leaning-0.122
 done-0.113
 gone-0.102
 angry-0.100
 supposed-0.100
Token very
Token ID845
Feature activation+1.42e+00

Pos logit contributions
 trends+0.056
 occasions+0.055
 outfits+0.053
lyss+0.051
maid+0.051
Neg logit contributions
 supposed-0.022
 allergic-0.020
 leaning-0.019
 angry-0.018
 done-0.018
Token happy
Token ID3772
Feature activation-4.09e-01

Pos logit contributions
 trends+0.095
 occasions+0.095
 Grace+0.092
 Isn+0.091
lyss+0.091
Neg logit contributions
 supposed-0.041
 leaning-0.039
 allergic-0.036
 done-0.034
 hard-0.032
Token and
Token ID290
Feature activation-5.51e-01

Pos logit contributions
 leaning+0.012
 supposed+0.011
 Tom+0.010
 lying+0.010
 enough+0.010
Neg logit contributions
 trends-0.027
lyss-0.027
 Grac-0.026
 Grace-0.026
 Tracy-0.025
Token proud
Token ID6613
Feature activation+3.07e-02

Pos logit contributions
 leaning+0.020
 supposed+0.019
 done+0.018
 Tom+0.017
aged+0.016
Neg logit contributions
 occasions-0.033
 trends-0.032
 conclusions-0.032
lyss-0.031
maid-0.031
Token said
Token ID531
Feature activation-7.74e-01

Pos logit contributions
 supposed+0.022
 leaning+0.022
 Tom+0.020
 allergic+0.019
aked+0.019
Neg logit contributions
 trends-0.025
 Grace-0.025
 Grac-0.024
lyss-0.024
 coral-0.024
Token.
Token ID13
Feature activation-3.59e-01

Pos logit contributions
 supposed+0.043
wy+0.038
 leaning+0.038
aked+0.038
ige+0.038
Neg logit contributions
 Grace-0.032
 Flor-0.031
 Grac-0.031
 Nina-0.030
 trends-0.029
Token "
Token ID366
Feature activation+3.75e-01

Pos logit contributions
 supposed+0.018
 leaning+0.017
 allergic+0.016
eping+0.015
aked+0.015
Neg logit contributions
 Grace-0.016
 Grac-0.015
 Flor-0.015
 trends-0.015
lyss-0.015
TokenYou
Token ID1639
Feature activation+4.60e-01

Pos logit contributions
 trends+0.017
 Grace+0.017
 Grac+0.017
lyss+0.017
 Flor+0.016
Neg logit contributions
 supposed-0.018
 leaning-0.017
 allergic-0.015
aked-0.015
 Tom-0.015
Token hurt
Token ID5938
Feature activation+1.70e+00

Pos logit contributions
 trends+0.025
 Grace+0.024
 occasions+0.024
lyss+0.024
 Flor+0.023
Neg logit contributions
 supposed-0.019
 leaning-0.017
 Tom-0.017
 allergic-0.016
 done-0.016
Token my
Token ID616
Feature activation+2.80e+00

Pos logit contributions
 trends+0.066
 Grace+0.062
lyss+0.061
 Grac+0.060
 Flor+0.058
Neg logit contributions
 supposed-0.097
 leaning-0.095
 Tom-0.092
 allergic-0.084
aked-0.083
Token doll
Token ID3654
Feature activation+3.01e-01

Pos logit contributions
 trends+0.137
 Grace+0.135
lyss+0.132
 Grac+0.131
 coral+0.130
Neg logit contributions
 supposed-0.127
 leaning-0.125
 allergic-0.111
 Tom-0.107
aked-0.104
Token!
Token ID0
Feature activation+2.74e-01

Pos logit contributions
 trends+0.013
 Grace+0.013
 Grac+0.013
 Flor+0.013
lyss+0.013
Neg logit contributions
 supposed-0.015
 leaning-0.014
aked-0.013
agged-0.013
 allergic-0.012
Token Give
Token ID13786
Feature activation+4.70e-01

Pos logit contributions
 trends+0.013
 coral+0.013
 Grace+0.013
 reefs+0.012
fish+0.012
Neg logit contributions
 supposed-0.013
 leaning-0.012
 Tom-0.012
 allergic-0.011
aked-0.010
Token it
Token ID340
Feature activation+2.71e-03

Pos logit contributions
 trends+0.017
 Grace+0.017
 Grac+0.016
lyss+0.016
 Flor+0.016
Neg logit contributions
 supposed-0.027
 leaning-0.026
 Tom-0.024
 allergic-0.024
aked-0.023
Token back
Token ID736
Feature activation+4.68e-02

Pos logit contributions
 trends+0.000
 Grace+0.000
 Grac+0.000
lyss+0.000
 dance+0.000
Neg logit contributions
 supposed-0.000
 leaning-0.000
 allergic-0.000
agged-0.000
aked-0.000
Token coins
Token ID10796
Feature activation-1.01e+00

Pos logit contributions
 trends+0.046
 Grace+0.046
 Grac+0.044
lyss+0.044
 Flor+0.043
Neg logit contributions
 supposed-0.051
 leaning-0.049
 allergic-0.044
aked-0.043
 Tom-0.042
Token and
Token ID290
Feature activation+5.96e-01

Pos logit contributions
 supposed+0.045
 leaning+0.043
 allergic+0.038
aked+0.037
agged+0.037
Neg logit contributions
 trends-0.049
 Grace-0.048
 Grac-0.047
lyss-0.046
 Flor-0.046
Token jewels
Token ID42701
Feature activation+1.93e-01

Pos logit contributions
 trends+0.035
 Grace+0.034
lyss+0.033
 Grac+0.032
 occasions+0.032
Neg logit contributions
 leaning-0.021
 supposed-0.020
 Tom-0.019
 thieves-0.017
 handcuffs-0.017
Token spilled
Token ID31896
Feature activation+2.80e-01

Pos logit contributions
 trends+0.008
 Grace+0.008
 Grac+0.008
lyss+0.008
 Flor+0.008
Neg logit contributions
 supposed-0.010
 leaning-0.009
 allergic-0.008
aked-0.008
agged-0.008
Token out
Token ID503
Feature activation+7.80e-02

Pos logit contributions
 Grace+0.014
 Grac+0.014
 trends+0.014
 Flor+0.014
 coral+0.014
Neg logit contributions
 supposed-0.012
 leaning-0.011
 allergic-0.010
aked-0.010
agged-0.010
Token on
Token ID319
Feature activation-1.27e-01

Pos logit contributions
 trends+0.004
 Grace+0.004
lyss+0.004
 Grac+0.004
 Flor+0.004
Neg logit contributions
 supposed-0.003
 leaning-0.003
 Tom-0.002
 allergic-0.002
aked-0.002
Token the
Token ID262
Feature activation-6.02e-01

Pos logit contributions
 supposed+0.006
 leaning+0.006
 allergic+0.005
aked+0.005
agged+0.005
Neg logit contributions
 Grace-0.006
 Grac-0.006
 trends-0.006
 Flor-0.006
 Nina-0.006
Token ground
Token ID2323
Feature activation-9.33e-01

Pos logit contributions
 supposed+0.028
 leaning+0.027
 Tom+0.024
 allergic+0.024
 handcuffs+0.023
Neg logit contributions
 trends-0.028
lyss-0.028
 Grace-0.028
 Grac-0.027
 Nina-0.027
Token.
Token ID13
Feature activation+7.13e-02

Pos logit contributions
 supposed+0.038
 leaning+0.036
 allergic+0.032
agged+0.031
aked+0.031
Neg logit contributions
 trends-0.049
 Grace-0.049
 Grac-0.047
 Flor-0.046
lyss-0.046
Token They
Token ID1119
Feature activation-2.05e-01

Pos logit contributions
 trends+0.004
 Grace+0.003
lyss+0.003
 Grac+0.003
 occasions+0.003
Neg logit contributions
 supposed-0.003
 leaning-0.003
 Tom-0.003
 allergic-0.003
aked-0.003
Token looked
Token ID3114
Feature activation+8.61e-01

Pos logit contributions
 supposed+0.009
 leaning+0.009
 allergic+0.008
aked+0.008
agged+0.008
Neg logit contributions
 trends-0.010
 Grace-0.010
 Grac-0.010
lyss-0.009
 Flor-0.009
Token late
Token ID2739
Feature activation-2.28e+00

Pos logit contributions
 supposed+0.006
 leaning+0.005
agged+0.004
 allergic+0.004
aked+0.004
Neg logit contributions
 Grace-0.013
 trends-0.013
 Grac-0.013
 Flor-0.013
 Nina-0.013
Token.
Token ID13
Feature activation-6.09e-01

Pos logit contributions
 supposed+0.093
 leaning+0.090
 allergic+0.078
 Tom+0.076
aked+0.075
Neg logit contributions
 trends-0.121
 Grace-0.120
 Grac-0.117
lyss-0.116
 Flor-0.114
Token<|endoftext|>
Token ID50256
Feature activation+1.75e+00

Pos logit contributions
 Tom+0.022
 leaning+0.020
 supposed+0.019
 Jerry+0.018
 handcuffs+0.016
Neg logit contributions
 trends-0.037
fish-0.035
 outfits-0.035
 occasions-0.034
 coral-0.034
TokenFrank
Token ID17439
Feature activation-1.16e+00

Pos logit contributions
 Grace+0.073
 trends+0.072
 Grac+0.070
lyss+0.069
 Flor+0.069
Neg logit contributions
 supposed-0.091
 leaning-0.088
 allergic-0.080
aked-0.077
eping-0.076
Tokenie
Token ID494
Feature activation+2.44e+00

Pos logit contributions
 supposed+0.058
 leaning+0.057
 allergic+0.051
aked+0.050
 Tom+0.050
Neg logit contributions
 trends-0.051
 Grace-0.050
 Grac-0.048
lyss-0.048
 Flor-0.047
Token was
Token ID373
Feature activation-1.91e-01

Pos logit contributions
 trends+0.129
 Grace+0.120
 Grac+0.120
lyss+0.119
 Flor+0.118
Neg logit contributions
 leaning-0.106
 supposed-0.102
 Tom-0.094
 done-0.088
 allergic-0.088
Token a
Token ID257
Feature activation+7.67e-01

Pos logit contributions
 supposed+0.006
 leaning+0.006
 thieves+0.006
 handcuffs+0.006
hy+0.006
Neg logit contributions
 Grace-0.011
 Grac-0.010
 Nina-0.010
 Flor-0.010
 dance-0.010
Token smart
Token ID4451
Feature activation-5.63e-01

Pos logit contributions
 Grace+0.040
lyss+0.039
 trends+0.038
 Grac+0.038
 Flor+0.038
Neg logit contributions
 supposed-0.034
 leaning-0.030
aked-0.029
agged-0.029
aged-0.027
Token 3
Token ID513
Feature activation-7.15e-01

Pos logit contributions
 supposed+0.029
 leaning+0.027
aked+0.025
agged+0.025
 allergic+0.025
Neg logit contributions
 Grace-0.025
 trends-0.024
lyss-0.024
 Grac-0.024
 Nina-0.023
Token year
Token ID614
Feature activation-8.18e-01

Pos logit contributions
 supposed+0.038
 leaning+0.037
 allergic+0.033
agged+0.032
ige+0.032
Neg logit contributions
 trends-0.029
 Grace-0.028
lyss-0.028
 occasions-0.028
 Grac-0.026
Token old
Token ID1468
Feature activation-1.77e+00

Pos logit contributions
 supposed+0.023
 leaning+0.022
 allergic+0.018
 Tom+0.017
aked+0.017
Neg logit contributions
 trends-0.053
 Grace-0.053
 Grac-0.052
lyss-0.052
 Flor-0.051
Token didn
Token ID1422
Feature activation-1.43e+00

Pos logit contributions
 supposed+0.018
 leaning+0.017
 allergic+0.015
aked+0.015
agged+0.015
Neg logit contributions
 Grace-0.016
 trends-0.016
 Grac-0.015
lyss-0.015
 Nina-0.015
Token't
Token ID470
Feature activation-2.31e-01

Pos logit contributions
 supposed+0.061
 leaning+0.059
 allergic+0.052
aked+0.051
 Tom+0.050
Neg logit contributions
 trends-0.073
 Grace-0.073
 Grac-0.071
lyss-0.070
 Flor-0.069
Token have
Token ID423
Feature activation-1.12e+00

Pos logit contributions
 supposed+0.011
 leaning+0.011
 allergic+0.009
aked+0.009
agged+0.009
Neg logit contributions
 trends-0.011
 Grace-0.011
 Grac-0.010
lyss-0.010
 Flor-0.010
Token any
Token ID597
Feature activation-1.19e+00

Pos logit contributions
 leaning+0.043
 supposed+0.041
agged+0.037
aked+0.036
eping+0.035
Neg logit contributions
 Grace-0.060
 Flor-0.058
 Grac-0.058
 Nina-0.057
 trends-0.057
Token hay
Token ID27678
Feature activation-3.06e+00

Pos logit contributions
 supposed+0.052
 leaning+0.051
agged+0.044
ige+0.044
aked+0.044
Neg logit contributions
 Grace-0.057
 trends-0.056
 occasions-0.054
 dance-0.054
 Grac-0.054
Token to
Token ID284
Feature activation-4.34e-01

Pos logit contributions
 supposed+0.129
 leaning+0.123
agged+0.117
aked+0.115
eping+0.111
Neg logit contributions
 trends-0.151
 Grace-0.150
 Grac-0.145
 Flor-0.143
 Nina-0.142
Token eat
Token ID4483
Feature activation-9.73e-02

Pos logit contributions
 supposed+0.020
 leaning+0.020
 allergic+0.018
 Tom+0.018
 done+0.017
Neg logit contributions
 trends-0.020
lyss-0.020
 Grace-0.020
 Grac-0.019
 Isn-0.019
Token.
Token ID13
Feature activation-1.91e-01

Pos logit contributions
 supposed+0.004
 leaning+0.003
 allergic+0.003
aked+0.003
 Tom+0.003
Neg logit contributions
 trends-0.006
 Grace-0.006
 Grac-0.005
lyss-0.005
 Flor-0.005
Token
Token ID220
Feature activation-6.74e-03

Pos logit contributions
 supposed+0.010
 leaning+0.009
 allergic+0.008
aked+0.008
 Tom+0.008
Neg logit contributions
 trends-0.008
 Grace-0.008
 Grac-0.008
lyss-0.008
 Flor-0.008
Token
Token ID198
Feature activation-2.97e-01

Pos logit contributions
 supposed+0.000
 leaning+0.000
 thieves+0.000
 allergic+0.000
eping+0.000
Neg logit contributions
 Grace-0.000
 Grac-0.000
 Flor-0.000
 Nina-0.000
 trends-0.000
Token
Token ID198
Feature activation-6.92e-01

Pos logit contributions
 supposed+0.014
 leaning+0.014
 allergic+0.012
 Tom+0.012
aked+0.012
Neg logit contributions
 trends-0.014
 Grace-0.013
 Grac-0.013
lyss-0.013
 Flor-0.013
Token bird
Token ID6512
Feature activation+6.59e-01

Pos logit contributions
 supposed+0.036
agged+0.033
aked+0.033
 leaning+0.032
aced+0.030
Neg logit contributions
 dance-0.023
 Grace-0.023
 coral-0.023
lyss-0.022
 stream-0.022
Token.
Token ID13
Feature activation-1.58e-01

Pos logit contributions
 trends+0.035
 Grace+0.034
lyss+0.034
 Grac+0.034
 Flor+0.033
Neg logit contributions
 supposed-0.027
 leaning-0.027
 allergic-0.023
 Tom-0.022
aked-0.022
Token It
Token ID632
Feature activation+3.35e-01

Pos logit contributions
 leaning+0.006
 supposed+0.006
 Tom+0.006
 allergic+0.005
aked+0.005
Neg logit contributions
 trends-0.009
 occasions-0.008
 Grac-0.008
lyss-0.008
 Grace-0.008
Token started
Token ID2067
Feature activation-9.42e-01

Pos logit contributions
 trends+0.021
 occasions+0.021
 Grac+0.020
lyss+0.019
 Nina+0.019
Neg logit contributions
 leaning-0.012
 supposed-0.011
 angry-0.010
eping-0.010
 Tom-0.010
Token to
Token ID284
Feature activation-1.62e-01

Pos logit contributions
 supposed+0.041
 leaning+0.039
agged+0.037
wy+0.036
aked+0.036
Neg logit contributions
 Grace-0.045
 Grac-0.044
 trends-0.044
 dance-0.043
 Flor-0.043
Token fly
Token ID6129
Feature activation+1.67e+00

Pos logit contributions
 supposed+0.007
 leaning+0.007
 allergic+0.006
 Tom+0.006
eping+0.005
Neg logit contributions
 trends-0.009
lyss-0.008
 Grace-0.008
 Grac-0.008
 occasions-0.008
Token away
Token ID1497
Feature activation+5.51e-01

Pos logit contributions
 trends+0.103
lyss+0.101
 occasions+0.099
 Grac+0.098
 Grace+0.097
Neg logit contributions
 leaning-0.056
 supposed-0.056
 Tom-0.048
aked-0.045
 allergic-0.044
Token fast
Token ID3049
Feature activation+3.17e-01

Pos logit contributions
 trends+0.027
 Grace+0.027
lyss+0.026
 Grac+0.026
 Nina+0.026
Neg logit contributions
 supposed-0.025
 leaning-0.024
 allergic-0.021
 Tom-0.021
aked-0.020
Token.
Token ID13
Feature activation-2.71e-01

Pos logit contributions
 trends+0.016
 Grace+0.016
 Grac+0.016
lyss+0.015
 Flor+0.015
Neg logit contributions
 supposed-0.014
 leaning-0.013
 allergic-0.012
aked-0.011
 Tom-0.011
Token The
Token ID383
Feature activation+6.80e-01

Pos logit contributions
 leaning+0.009
 Tom+0.009
 supposed+0.009
agged+0.008
 allergic+0.007
Neg logit contributions
 trends-0.017
 occasions-0.015
maid-0.015
 outfits-0.015
lyss-0.015
Token loud
Token ID7812
Feature activation-1.39e+00

Pos logit contributions
 dance+0.020
 stream+0.020
 coral+0.019
 planet+0.018
fish+0.018
Neg logit contributions
 supposed-0.046
aked-0.042
agged-0.041
aced-0.040
bench-0.039
Token.
Token ID13
Feature activation+1.64e-01

Pos logit contributions
 leaning+0.028
 supposed+0.028
 Tom+0.027
 allergic+0.023
aged+0.022
Neg logit contributions
 trends-0.046
lyss-0.043
 occasions-0.043
 Grace-0.043
 coral-0.042
Token "
Token ID366
Feature activation+9.87e-01

Pos logit contributions
 trends+0.010
 occasions+0.010
 coral+0.010
fish+0.009
 outfits+0.009
Neg logit contributions
 Tom-0.006
 leaning-0.005
 Jerry-0.005
 supposed-0.004
 Kim-0.004
TokenLe
Token ID3123
Feature activation-4.88e-01

Pos logit contributions
 trends+0.048
 Grace+0.047
 Grac+0.047
lyss+0.045
 Nina+0.045
Neg logit contributions
 supposed-0.043
 leaning-0.043
aked-0.039
 Tom-0.038
agged-0.037
Tokeno
Token ID78
Feature activation+5.69e-01

Pos logit contributions
 supposed+0.014
 leaning+0.013
 allergic+0.011
agged+0.010
aked+0.010
Neg logit contributions
 Grace-0.031
 trends-0.031
 Grac-0.031
lyss-0.031
 Flor-0.030
Token's
Token ID338
Feature activation+2.22e+00

Pos logit contributions
 trends+0.028
 Grace+0.026
 Grac+0.026
 occasions+0.025
 Nina+0.025
Neg logit contributions
 leaning-0.027
 supposed-0.027
 Tom-0.024
 done-0.024
aked-0.023
Token doll
Token ID3654
Feature activation+1.54e+00

Pos logit contributions
 trends+0.124
 Grace+0.121
lyss+0.119
 Grac+0.119
 Nina+0.115
Neg logit contributions
 supposed-0.087
 leaning-0.084
 Tom-0.073
 allergic-0.073
aked-0.068
Token has
Token ID468
Feature activation-2.07e-01

Pos logit contributions
 trends+0.073
 reefs+0.072
 coral+0.072
 Nina+0.072
 occasions+0.071
Neg logit contributions
 leaning-0.071
 supposed-0.070
hy-0.066
 Tom-0.065
 done-0.063
Token a
Token ID257
Feature activation+2.89e-01

Pos logit contributions
 supposed+0.006
 leaning+0.005
 Tom+0.005
 allergic+0.004
 done+0.004
Neg logit contributions
 trends-0.014
lyss-0.014
 Grace-0.014
 Grac-0.013
maid-0.013
Token loose
Token ID9155
Feature activation-2.09e+00

Pos logit contributions
 trends+0.016
 reefs+0.016
 Grace+0.016
 occasions+0.016
 Grac+0.016
Neg logit contributions
 supposed-0.010
 leaning-0.010
 allergic-0.009
 Tom-0.009
hy-0.008
Token head
Token ID1182
Feature activation-1.30e+00

Pos logit contributions
 supposed+0.091
 leaning+0.089
 allergic+0.078
 Tom+0.076
aked+0.076
Neg logit contributions
 trends-0.104
 Grace-0.102
lyss-0.100
 Grac-0.100
 Flor-0.098
Token.
Token ID13
Feature activation+3.63e-01

Pos logit contributions
 supposed+0.054
 leaning+0.053
 allergic+0.045
 Tom+0.044
aked+0.043
Neg logit contributions
 trends-0.068
 Grace-0.068
 Grac-0.066
lyss-0.066
 Flor-0.065
Token the
Token ID262
Feature activation+1.60e-01

Pos logit contributions
 supposed+0.020
 leaning+0.020
 allergic+0.017
 Tom+0.017
aked+0.017
Neg logit contributions
 trends-0.024
 Grace-0.024
 Grac-0.024
lyss-0.023
 Flor-0.023
Token le
Token ID443
Feature activation-2.05e+00

Pos logit contributions
lyss+0.007
 Grac+0.007
 Grace+0.007
 stream+0.007
 trends+0.007
Neg logit contributions
 supposed-0.008
agged-0.007
 leaning-0.007
aced-0.007
aked-0.006
Tokenopard
Token ID15478
Feature activation+2.38e-01

Pos logit contributions
 supposed+0.069
 leaning+0.063
 allergic+0.057
 done+0.053
 vain+0.051
Neg logit contributions
 trends-0.121
lyss-0.121
 Grac-0.119
 Flor-0.118
 Grace-0.117
Token was
Token ID373
Feature activation-2.42e-02

Pos logit contributions
 trends+0.013
 Grace+0.012
 Nina+0.012
 occasions+0.012
 Grac+0.012
Neg logit contributions
 leaning-0.010
 supposed-0.010
 done-0.009
aked-0.009
aged-0.009
Token known
Token ID1900
Feature activation-2.02e+00

Pos logit contributions
 supposed+0.001
 leaning+0.001
agged+0.001
aked+0.001
 allergic+0.001
Neg logit contributions
 Grace-0.001
 trends-0.001
 Grac-0.001
lyss-0.001
 Flor-0.001
Token as
Token ID355
Feature activation-2.79e+00

Pos logit contributions
 supposed+0.078
 leaning+0.075
 allergic+0.065
aked+0.064
 Tom+0.062
Neg logit contributions
 trends-0.111
 Grace-0.110
 Grac-0.107
lyss-0.106
 Flor-0.105
Token the
Token ID262
Feature activation-1.37e+00

Pos logit contributions
 supposed+0.107
 leaning+0.103
 allergic+0.088
 Tom+0.087
aked+0.083
Neg logit contributions
 trends-0.157
 Grace-0.152
lyss-0.149
 Grac-0.148
 occasions-0.146
Token brav
Token ID49025
Feature activation+1.08e-01

Pos logit contributions
 supposed+0.060
 leaning+0.057
agged+0.053
aked+0.051
 allergic+0.051
Neg logit contributions
 trends-0.069
 Grace-0.069
 Grac-0.067
lyss-0.067
 Flor-0.066
Tokenest
Token ID395
Feature activation-2.12e-02

Pos logit contributions
 occasions+0.003
 Grac+0.002
 Grace+0.002
 Flor+0.002
 Isn+0.002
Neg logit contributions
 supposed-0.007
 leaning-0.007
hes-0.007
bench-0.006
 allergic-0.006
Token and
Token ID290
Feature activation-1.29e+00

Pos logit contributions
 supposed+0.001
 leaning+0.001
 allergic+0.001
 Tom+0.001
aked+0.001
Neg logit contributions
 trends-0.001
 Grace-0.001
 Grac-0.001
lyss-0.001
 Flor-0.001
Token most
Token ID749
Feature activation+1.23e+00

Pos logit contributions
 supposed+0.048
 leaning+0.047
 allergic+0.039
 Tom+0.039
 thieves+0.037
Neg logit contributions
 trends-0.074
 Grace-0.071
 occasions-0.070
lyss-0.069
 Grac-0.069
Token would
Token ID561
Feature activation+7.80e-01

Pos logit contributions
 Grac+0.019
lyss+0.019
 Nina+0.018
 trends+0.017
 Holly+0.017
Neg logit contributions
 enough-0.010
 supposed-0.009
 leaning-0.009
 wrong-0.009
 time-0.009
Token be
Token ID307
Feature activation+4.69e-01

Pos logit contributions
lyss+0.047
 trends+0.046
 Isn+0.043
 Tracy+0.042
maid+0.042
Neg logit contributions
 supposed-0.027
 leaning-0.026
 Tom-0.026
 hard-0.025
 allergic-0.025
Token fun
Token ID1257
Feature activation+2.95e-01

Pos logit contributions
 trends+0.031
lyss+0.030
 Grace+0.029
 Grac+0.029
 occasions+0.029
Neg logit contributions
 supposed-0.015
 leaning-0.014
 Tom-0.012
 allergic-0.012
 done-0.011
Token,"
Token ID553
Feature activation-1.25e+00

Pos logit contributions
lyss+0.020
 Grac+0.019
 trends+0.019
 Flor+0.019
 Tracy+0.019
Neg logit contributions
 supposed-0.009
 enough-0.007
 leaning-0.007
 Tom-0.006
 done-0.006
Token Ben
Token ID3932
Feature activation-3.04e-01

Pos logit contributions
 leaning+0.038
 supposed+0.037
 Tom+0.034
 allergic+0.031
 thieves+0.028
Neg logit contributions
 trends-0.080
 Grace-0.076
lyss-0.076
 Grac-0.075
 Isn-0.074
Token said
Token ID531
Feature activation-9.44e-01

Pos logit contributions
 Tom+0.010
 supposed+0.010
 leaning+0.009
aked+0.009
 angry+0.009
Neg logit contributions
 trends-0.018
 Grace-0.017
 occasions-0.017
 Grac-0.017
 coral-0.017
Token.
Token ID13
Feature activation-5.33e-01

Pos logit contributions
 supposed+0.049
 leaning+0.045
ige+0.042
aked+0.041
eping+0.041
Neg logit contributions
 Grace-0.043
 Grac-0.041
 Flor-0.041
 Nina-0.040
 trends-0.039
Token
Token ID198
Feature activation-1.64e-01

Pos logit contributions
 leaning+0.021
 supposed+0.020
 Tom+0.020
aked+0.018
agged+0.017
Neg logit contributions
 trends-0.029
 Grace-0.027
lyss-0.027
 occasions-0.027
 Grac-0.027
Token
Token ID198
Feature activation-4.77e-01

Pos logit contributions
 supposed+0.008
 leaning+0.008
 Tom+0.007
 allergic+0.007
aked+0.007
Neg logit contributions
 trends-0.008
 Grace-0.008
 Grac-0.007
lyss-0.007
 Nina-0.007
TokenL
Token ID43
Feature activation-4.75e-01

Pos logit contributions
 leaning+0.021
 supposed+0.021
aked+0.019
 Tom+0.019
 allergic+0.018
Neg logit contributions
 trends-0.023
 Grace-0.023
 Grac-0.022
lyss-0.022
 Nina-0.022
Tokenily
Token ID813
Feature activation+8.24e-01

Pos logit contributions
 supposed+0.013
 leaning+0.012
 allergic+0.010
aked+0.010
 Tom+0.010
Neg logit contributions
 trends-0.032
 Grace-0.032
 Grac-0.031
lyss-0.031
 Nina-0.030
Token stubborn
Token ID24444
Feature activation+4.54e-01

Pos logit contributions
 Grace+0.035
 Grac+0.035
lyss+0.035
 Nina+0.035
 dance+0.035
Neg logit contributions
 supposed-0.023
 leaning-0.023
 handcuffs-0.021
aked-0.021
aced-0.020
Token,
Token ID11
Feature activation-1.94e-01

Pos logit contributions
 trends+0.019
 Grace+0.019
lyss+0.017
 Nina+0.017
 dance+0.017
Neg logit contributions
 supposed-0.023
 leaning-0.022
aked-0.021
agged-0.020
 allergic-0.020
Token which
Token ID543
Feature activation-5.93e-01

Pos logit contributions
 supposed+0.011
 leaning+0.011
 allergic+0.010
 Tom+0.009
aked+0.009
Neg logit contributions
 trends-0.007
 Grace-0.007
 Grac-0.007
lyss-0.007
 Flor-0.007
Token means
Token ID1724
Feature activation-8.36e-01

Pos logit contributions
 supposed+0.026
 leaning+0.025
agged+0.024
aked+0.023
 allergic+0.023
Neg logit contributions
 Grace-0.029
 trends-0.028
lyss-0.027
 Grac-0.027
 dance-0.027
Token he
Token ID339
Feature activation+1.05e-01

Pos logit contributions
 supposed+0.038
 leaning+0.037
 allergic+0.032
 Tom+0.032
 done+0.030
Neg logit contributions
 trends-0.041
 Grace-0.041
 Grac-0.040
lyss-0.039
 Flor-0.039
Token didn
Token ID1422
Feature activation-1.20e+00

Pos logit contributions
 trends+0.005
 Grace+0.005
 Grac+0.005
lyss+0.005
 Flor+0.005
Neg logit contributions
 supposed-0.005
 leaning-0.005
 allergic-0.004
aked-0.004
 Tom-0.004
Token't
Token ID470
Feature activation+9.00e-02

Pos logit contributions
 supposed+0.049
 leaning+0.048
 allergic+0.041
 Tom+0.040
aked+0.040
Neg logit contributions
 trends-0.064
 Grace-0.063
 Grac-0.061
lyss-0.061
 Flor-0.060
Token like
Token ID588
Feature activation-1.14e+00

Pos logit contributions
 trends+0.004
 Grace+0.004
 Grac+0.004
lyss+0.004
 dance+0.004
Neg logit contributions
 supposed-0.004
 leaning-0.004
aked-0.004
agged-0.004
 allergic-0.004
Token to
Token ID284
Feature activation-8.64e-01

Pos logit contributions
 supposed+0.045
 leaning+0.044
 allergic+0.038
 Tom+0.037
aked+0.035
Neg logit contributions
 trends-0.062
 Grace-0.062
 Grac-0.060
lyss-0.060
 Flor-0.059
Token listen
Token ID6004
Feature activation+3.17e-01

Pos logit contributions
 supposed+0.047
 leaning+0.046
 allergic+0.042
 Tom+0.041
aked+0.040
Neg logit contributions
 trends-0.034
 Grace-0.034
 Grac-0.033
lyss-0.033
 Flor-0.032
Token to
Token ID284
Feature activation+4.48e-01

Pos logit contributions
lyss+0.020
 trends+0.020
 Grac+0.020
 Grace+0.020
 Flor+0.020
Neg logit contributions
 supposed-0.010
 leaning-0.009
 Tom-0.008
 allergic-0.008
 done-0.007
Token They
Token ID1119
Feature activation-4.15e-01

Pos logit contributions
 Tom+0.010
 supposed+0.010
 leaning+0.010
 Jerry+0.009
 Bill+0.008
Neg logit contributions
 trends-0.015
maid-0.014
fish-0.014
 occasions-0.014
 coral-0.014
Token never
Token ID1239
Feature activation-5.24e-02

Pos logit contributions
 leaning+0.013
 supposed+0.013
 done+0.012
 Tom+0.012
hy+0.011
Neg logit contributions
 occasions-0.025
 trends-0.025
 Isn-0.024
lyss-0.024
 Grac-0.022
Token parted
Token ID37813
Feature activation+6.26e-01

Pos logit contributions
 supposed+0.002
 leaning+0.002
 Tom+0.002
 done+0.002
 allergic+0.002
Neg logit contributions
 trends-0.003
 Grace-0.002
 Grac-0.002
 Flor-0.002
 occasions-0.002
Token ways
Token ID2842
Feature activation-1.40e+00

Pos logit contributions
 trends+0.032
lyss+0.031
 Grace+0.031
 Grac+0.031
 Flor+0.030
Neg logit contributions
 supposed-0.026
 leaning-0.026
 Tom-0.023
aked-0.023
 thieves-0.023
Token and
Token ID290
Feature activation-3.72e-01

Pos logit contributions
 supposed+0.057
 leaning+0.056
 Tom+0.046
 allergic+0.045
 enough+0.044
Neg logit contributions
 trends-0.076
lyss-0.074
 Grace-0.074
 Grac-0.073
 Flor-0.071
Token the
Token ID262
Feature activation-4.94e-01

Pos logit contributions
 supposed+0.016
 leaning+0.015
 done+0.014
 Tom+0.013
 allergic+0.013
Neg logit contributions
 trends-0.019
lyss-0.019
 Grace-0.018
 Grac-0.018
maid-0.018
Token train
Token ID4512
Feature activation+5.00e-01

Pos logit contributions
 supposed+0.023
 leaning+0.022
 allergic+0.019
aked+0.019
agged+0.019
Neg logit contributions
 trends-0.024
 Grace-0.024
 Grac-0.023
lyss-0.023
 Nina-0.022
Token became
Token ID2627
Feature activation-1.00e+00

Pos logit contributions
 Grac+0.024
 trends+0.024
lyss+0.024
 Grace+0.024
 Isn+0.024
Neg logit contributions
 leaning-0.022
 supposed-0.021
 done-0.020
 Tom-0.019
hy-0.019
Token very
Token ID845
Feature activation+7.16e-01

Pos logit contributions
 supposed+0.039
 leaning+0.037
 allergic+0.032
 Tom+0.031
aked+0.031
Neg logit contributions
 trends-0.056
 Grace-0.055
 Grac-0.054
lyss-0.054
 Flor-0.053
Token light
Token ID1657
Feature activation+8.29e-01

Pos logit contributions
 trends+0.046
 occasions+0.046
 Isn+0.045
 Grace+0.045
 Grac+0.045
Neg logit contributions
 supposed-0.021
 leaning-0.020
 allergic-0.017
 done-0.016
 greedy-0.016
Token with
Token ID351
Feature activation+3.67e-01

Pos logit contributions
 trends+0.045
 Grace+0.044
lyss+0.044
 Grac+0.044
 Flor+0.043
Neg logit contributions
 supposed-0.032
 leaning-0.032
 allergic-0.027
 Tom-0.026
eping-0.025
Token.
Token ID13
Feature activation+2.07e-02

Pos logit contributions
 supposed+0.025
 leaning+0.024
 allergic+0.020
 Tom+0.019
aked+0.018
Neg logit contributions
 trends-0.049
lyss-0.048
 Grace-0.048
 Grac-0.047
 Nina-0.047
Token
Token ID198
Feature activation-2.46e-01

Pos logit contributions
 trends+0.001
lyss+0.001
 Grac+0.001
 Grace+0.001
maid+0.001
Neg logit contributions
 leaning-0.001
 supposed-0.001
 Tom-0.001
 allergic-0.001
 Jerry-0.001
Token
Token ID198
Feature activation-6.62e-01

Pos logit contributions
 supposed+0.013
 leaning+0.013
 allergic+0.012
aked+0.011
 thieves+0.011
Neg logit contributions
 Grace-0.010
 trends-0.010
 Grac-0.009
lyss-0.009
 Flor-0.009
TokenOne
Token ID3198
Feature activation-4.40e-01

Pos logit contributions
 supposed+0.034
 leaning+0.032
 allergic+0.030
 thieves+0.029
agged+0.029
Neg logit contributions
 Grace-0.028
 trends-0.028
lyss-0.027
 Grac-0.027
 coral-0.026
Token day
Token ID1110
Feature activation+1.00e+00

Pos logit contributions
 supposed+0.020
 leaning+0.019
 handcuffs+0.017
agged+0.017
 thieves+0.017
Neg logit contributions
 dance-0.022
 Grace-0.021
 coral-0.021
lyss-0.021
 trends-0.020
Token,
Token ID11
Feature activation-6.99e-01

Pos logit contributions
 Grace+0.038
 Grac+0.036
lyss+0.036
 Flor+0.035
 trends+0.035
Neg logit contributions
 supposed-0.058
 leaning-0.053
 allergic-0.049
eping-0.049
aked-0.049
Token a
Token ID257
Feature activation+5.85e-01

Pos logit contributions
 supposed+0.030
 leaning+0.026
aked+0.024
agged+0.023
 allergic+0.023
Neg logit contributions
 Grace-0.039
 Grac-0.037
 Flor-0.037
 Nina-0.037
 coral-0.037
Token dog
Token ID3290
Feature activation-1.07e+00

Pos logit contributions
 Grace+0.030
 trends+0.030
lyss+0.030
 Grac+0.029
 Flor+0.029
Neg logit contributions
 supposed-0.025
 leaning-0.024
aked-0.021
 allergic-0.020
agged-0.020
Token came
Token ID1625
Feature activation-1.00e+00

Pos logit contributions
 leaning+0.038
 supposed+0.037
 Tom+0.036
 allergic+0.032
aked+0.031
Neg logit contributions
 trends-0.066
 Grac-0.062
 occasions-0.061
 Grace-0.061
 Nina-0.060
Token near
Token ID1474
Feature activation+8.29e-01

Pos logit contributions
 supposed+0.043
 leaning+0.042
 allergic+0.039
agged+0.038
ige+0.038
Neg logit contributions
 Grace-0.047
 trends-0.045
 Grac-0.045
 Flor-0.045
 dance-0.045
Token the
Token ID262
Feature activation-4.04e-01

Pos logit contributions
 Grace+0.034
 coral+0.032
 Flor+0.031
 dance+0.031
 Nina+0.031
Neg logit contributions
 supposed-0.044
 leaning-0.041
agged-0.039
hy-0.038
aged-0.038
Token to
Token ID284
Feature activation-4.22e-01

Pos logit contributions
 supposed+0.052
 leaning+0.050
 allergic+0.043
aked+0.042
agged+0.042
Neg logit contributions
 Grace-0.074
 trends-0.073
 Grac-0.072
lyss-0.071
 Flor-0.070
Token nib
Token ID33272
Feature activation-3.27e-01

Pos logit contributions
 leaning+0.019
 supposed+0.018
 allergic+0.018
 Tom+0.017
 eaten+0.016
Neg logit contributions
 trends-0.022
lyss-0.021
 Isn-0.020
 occasions-0.019
 Tracy-0.019
Tokenble
Token ID903
Feature activation-6.90e-01

Pos logit contributions
 supposed+0.014
 leaning+0.013
 allergic+0.011
hy+0.011
 Tom+0.011
Neg logit contributions
 Grace-0.017
 Grac-0.017
 Flor-0.017
 Nina-0.016
 trends-0.016
Token at
Token ID379
Feature activation-5.21e-01

Pos logit contributions
 supposed+0.023
 leaning+0.022
 allergic+0.019
 Tom+0.018
aked+0.017
Neg logit contributions
 trends-0.042
 Grace-0.041
lyss-0.041
 Grac-0.040
 Nina-0.039
Token the
Token ID262
Feature activation-1.07e+00

Pos logit contributions
 supposed+0.023
 leaning+0.022
 allergic+0.020
 Tom+0.019
aked+0.019
Neg logit contributions
 trends-0.026
 Grace-0.026
lyss-0.025
 Grac-0.025
 Flor-0.024
Token crust
Token ID20048
Feature activation-4.53e+00

Pos logit contributions
 supposed+0.052
 leaning+0.050
 allergic+0.045
 Tom+0.043
aked+0.043
Neg logit contributions
 trends-0.049
 Grace-0.049
 Grac-0.047
lyss-0.047
 Flor-0.046
Token of
Token ID286
Feature activation-6.35e-01

Pos logit contributions
 supposed+0.195
 leaning+0.184
 allergic+0.165
aked+0.164
agged+0.164
Neg logit contributions
 Grace-0.228
 trends-0.228
 Grac-0.224
 Flor-0.218
 Nina-0.218
Token the
Token ID262
Feature activation-1.06e+00

Pos logit contributions
 supposed+0.029
 leaning+0.027
aked+0.026
agged+0.025
aged+0.024
Neg logit contributions
 Grace-0.031
 coral-0.030
 Grac-0.030
 Flor-0.030
 Nina-0.029
Token sandwich
Token ID20433
Feature activation-1.68e+00

Pos logit contributions
 supposed+0.048
 leaning+0.046
 allergic+0.040
aked+0.040
agged+0.039
Neg logit contributions
 trends-0.052
 Grace-0.051
 Grac-0.050
lyss-0.049
 Flor-0.049
Token.
Token ID13
Feature activation-1.16e-01

Pos logit contributions
 leaning+0.051
 supposed+0.051
 Tom+0.042
 allergic+0.041
 done+0.039
Neg logit contributions
 trends-0.106
lyss-0.105
 Grace-0.103
 Grac-0.102
 Nina-0.101
Token Jill
Token ID22772
Feature activation+1.08e+00

Pos logit contributions
 supposed+0.005
 Tom+0.005
 leaning+0.005
 done+0.004
aked+0.004
Neg logit contributions
 trends-0.006
fish-0.006
 Grace-0.006
 occasions-0.006
lyss-0.006
Token hard
Token ID1327
Feature activation-7.24e-01

Pos logit contributions
 supposed+0.002
 leaning+0.002
 allergic+0.001
 Tom+0.001
aked+0.001
Neg logit contributions
 trends-0.004
 Grace-0.004
 Grac-0.004
lyss-0.004
 Flor-0.004
Token and
Token ID290
Feature activation-4.86e-01

Pos logit contributions
 supposed+0.030
 leaning+0.029
 allergic+0.025
aked+0.025
 Tom+0.024
Neg logit contributions
 trends-0.038
 Grace-0.038
 Grac-0.036
lyss-0.036
 Flor-0.036
Token share
Token ID2648
Feature activation+2.09e-01

Pos logit contributions
 supposed+0.019
 leaning+0.019
 done+0.016
 Tom+0.016
 allergic+0.015
Neg logit contributions
 trends-0.027
 Grace-0.027
 Grac-0.026
lyss-0.026
maid-0.026
Token with
Token ID351
Feature activation+4.28e-01

Pos logit contributions
lyss+0.013
 trends+0.013
 Grac+0.012
 Grace+0.012
 Tracy+0.012
Neg logit contributions
 supposed-0.007
 Tom-0.006
 leaning-0.006
 allergic-0.006
 hard-0.005
Token others
Token ID1854
Feature activation-9.92e-01

Pos logit contributions
 trends+0.023
lyss+0.023
 Isn+0.021
 Grac+0.021
 Grace+0.021
Neg logit contributions
 supposed-0.018
 leaning-0.017
 Tom-0.016
 allergic-0.015
 thieves-0.014
Token than
Token ID621
Feature activation-3.93e+00

Pos logit contributions
 supposed+0.028
 leaning+0.027
 Tom+0.021
 allergic+0.021
 greedy+0.019
Neg logit contributions
 trends-0.065
lyss-0.064
 Grac-0.064
 Grace-0.064
 Nina-0.063
Token to
Token ID284
Feature activation-7.41e-01

Pos logit contributions
 supposed+0.147
 leaning+0.141
 allergic+0.121
 Tom+0.116
aked+0.114
Neg logit contributions
 trends-0.224
 Grace-0.222
 Grac-0.216
lyss-0.215
 Flor-0.211
Token be
Token ID307
Feature activation-6.85e-01

Pos logit contributions
 supposed+0.038
 leaning+0.037
 allergic+0.033
 Tom+0.033
 done+0.031
Neg logit contributions
 trends-0.032
 Grace-0.032
lyss-0.031
 Grac-0.031
 Flor-0.030
Token lazy
Token ID16931
Feature activation+1.57e+00

Pos logit contributions
 leaning+0.024
 supposed+0.024
agged+0.021
 allergic+0.020
aked+0.020
Neg logit contributions
 trends-0.037
 Grace-0.037
 Grac-0.037
lyss-0.036
 coral-0.036
Token and
Token ID290
Feature activation-7.72e-01

Pos logit contributions
 trends+0.088
 Grace+0.087
 Grac+0.085
lyss+0.085
 Flor+0.084
Neg logit contributions
 supposed-0.059
 leaning-0.057
 allergic-0.049
 Tom-0.047
aked-0.047
Token alone
Token ID3436
Feature activation+1.69e+00

Pos logit contributions
 supposed+0.029
 leaning+0.027
 allergic+0.023
 Tom+0.022
 done+0.022
Neg logit contributions
 trends-0.045
 Grace-0.045
 Grac-0.044
lyss-0.044
 Flor-0.043
Token do
Token ID466
Feature activation-1.13e+00

Pos logit contributions
 trends+0.028
lyss+0.028
 Grace+0.027
 Grac+0.027
 Isn+0.027
Neg logit contributions
 leaning-0.026
 supposed-0.025
 allergic-0.022
 Tom-0.022
 done-0.021
Token more
Token ID517
Feature activation-8.68e-01

Pos logit contributions
 supposed+0.045
 leaning+0.042
 allergic+0.037
 Tom+0.037
 done+0.037
Neg logit contributions
lyss-0.061
 trends-0.061
maid-0.059
 Grace-0.059
 Isn-0.059
Token push
Token ID4574
Feature activation-2.03e+00

Pos logit contributions
 supposed+0.034
 leaning+0.033
 allergic+0.028
aked+0.027
 Tom+0.027
Neg logit contributions
 trends-0.048
 Grace-0.047
 Grac-0.046
lyss-0.046
 Flor-0.045
Token-
Token ID12
Feature activation-3.33e+00

Pos logit contributions
 supposed+0.079
 leaning+0.077
 Tom+0.068
aked+0.065
 allergic+0.065
Neg logit contributions
 trends-0.109
lyss-0.107
 Grace-0.106
 coral-0.105
 Grac-0.103
Tokenups
Token ID4739
Feature activation-1.03e+00

Pos logit contributions
 leaning+0.166
 supposed+0.164
 allergic+0.151
agged+0.145
 Tom+0.144
Neg logit contributions
 trends-0.143
lyss-0.135
 Grace-0.134
 occasions-0.131
 Grac-0.130
Token than
Token ID621
Feature activation-3.95e+00

Pos logit contributions
 supposed+0.041
 leaning+0.039
 allergic+0.034
aked+0.033
 Tom+0.033
Neg logit contributions
 trends-0.055
 Grace-0.055
 Grac-0.054
lyss-0.053
 Flor-0.052
Token most
Token ID749
Feature activation+1.96e-01

Pos logit contributions
 supposed+0.172
 leaning+0.166
 allergic+0.146
aked+0.142
 Tom+0.141
Neg logit contributions
 trends-0.198
 Grace-0.197
 Grac-0.191
lyss-0.189
 Flor-0.187
Token adults
Token ID6490
Feature activation-2.17e+00

Pos logit contributions
 trends+0.011
 Grace+0.011
 Grac+0.010
lyss+0.010
 Flor+0.010
Neg logit contributions
 supposed-0.008
 leaning-0.008
 allergic-0.007
 Tom-0.006
aked-0.006
Token.
Token ID13
Feature activation-2.05e-01

Pos logit contributions
 supposed+0.081
 leaning+0.077
 Tom+0.064
 allergic+0.062
 done+0.059
Neg logit contributions
 trends-0.124
 Grace-0.122
 Grac-0.122
lyss-0.121
 coral-0.120
Token Her
Token ID2332
Feature activation+3.64e-01

Pos logit contributions
 supposed+0.008
 leaning+0.008
 Tom+0.008
aked+0.007
 allergic+0.007
Neg logit contributions
 trends-0.011
fish-0.011
lyss-0.010
 Grace-0.010
 coral-0.010
Token friends
Token ID2460
Feature activation+2.07e+00

Pos logit contributions
 trends+0.021
 Grace+0.021
 Grac+0.020
lyss+0.020
 Flor+0.020
Neg logit contributions
 supposed-0.014
 leaning-0.013
 allergic-0.011
aked-0.011
 Tom-0.011
Token was
Token ID373
Feature activation-5.42e-02

Pos logit contributions
 supposed+0.053
 leaning+0.053
 Tom+0.048
aked+0.047
 done+0.044
Neg logit contributions
 trends-0.070
lyss-0.066
 Grac-0.066
 occasions-0.065
 Grace-0.065
Token a
Token ID257
Feature activation+8.94e-01

Pos logit contributions
 supposed+0.002
 handcuffs+0.002
 leaning+0.002
 thieves+0.002
ache+0.002
Neg logit contributions
 Grace-0.003
 Grac-0.003
 Nina-0.003
 Flor-0.003
 dance-0.003
Token polite
Token ID23507
Feature activation-1.08e+00

Pos logit contributions
 Grace+0.049
 trends+0.048
lyss+0.047
 Grac+0.047
 Flor+0.046
Neg logit contributions
 supposed-0.036
 leaning-0.034
aked-0.030
agged-0.030
 allergic-0.029
Token three
Token ID1115
Feature activation-6.51e-01

Pos logit contributions
 supposed+0.050
 leaning+0.048
 allergic+0.043
aked+0.042
agged+0.041
Neg logit contributions
 Grace-0.052
 trends-0.051
 Grac-0.050
lyss-0.050
 Nina-0.049
Token-
Token ID12
Feature activation-3.83e+00

Pos logit contributions
 supposed+0.028
 leaning+0.027
 allergic+0.023
aked+0.023
 Tom+0.023
Neg logit contributions
 trends-0.033
 Grace-0.033
 Grac-0.032
lyss-0.032
 Flor-0.032
Tokenyear
Token ID1941
Feature activation-3.87e+00

Pos logit contributions
 leaning+0.133
 supposed+0.131
 allergic+0.114
agged+0.108
 thieves+0.107
Neg logit contributions
 trends-0.226
lyss-0.221
 Grace-0.220
 Grac-0.217
 Flor-0.213
Token-
Token ID12
Feature activation-4.56e+00

Pos logit contributions
 supposed+0.187
 leaning+0.178
 Tom+0.160
 allergic+0.158
aked+0.157
Neg logit contributions
 trends-0.183
lyss-0.172
 Grace-0.168
fish-0.165
 coral-0.163
Tokenold
Token ID727
Feature activation-2.89e+00

Pos logit contributions
 leaning+0.181
 supposed+0.178
 allergic+0.157
agged+0.149
 thieves+0.148
Neg logit contributions
 trends-0.249
lyss-0.240
 Grace-0.240
 Grac-0.234
 Flor-0.230
Token.
Token ID13
Feature activation-3.73e-01

Pos logit contributions
 supposed+0.115
 leaning+0.113
 Tom+0.098
 allergic+0.096
aked+0.094
Neg logit contributions
 trends-0.154
 Grace-0.151
 Grac-0.149
lyss-0.147
 Flor-0.145
Token Whenever
Token ID21326
Feature activation-5.71e-01

Pos logit contributions
 supposed+0.016
 leaning+0.015
 allergic+0.013
aked+0.013
agged+0.012
Neg logit contributions
 trends-0.020
 Grace-0.019
 Grac-0.019
lyss-0.019
 Nina-0.018
Token somebody
Token ID8276
Feature activation+1.04e+00

Pos logit contributions
 leaning+0.025
 supposed+0.025
 Tom+0.023
 allergic+0.021
 done+0.021
Neg logit contributions
 trends-0.029
 Grac-0.028
 Grace-0.027
lyss-0.027
 Flor-0.027
Token she
Token ID673
Feature activation+1.60e+00

Pos logit contributions
 trends+0.001
lyss+0.001
 occasions+0.001
 Grace+0.001
 coral+0.001
Neg logit contributions
 leaning-0.001
 supposed-0.001
 Tom-0.001
 allergic-0.001
 thieves-0.001
Token noticed
Token ID6810
Feature activation+1.06e-01

Pos logit contributions
 occasions+0.114
 trends+0.110
 conclusions+0.107
 Grac+0.103
 Grace+0.103
Neg logit contributions
 leaning-0.046
 Tom-0.040
 done-0.038
 supposed-0.038
 angry-0.033
Token a
Token ID257
Feature activation+1.10e+00

Pos logit contributions
 trends+0.006
 Grace+0.006
lyss+0.006
 Grac+0.006
 Flor+0.006
Neg logit contributions
 supposed-0.004
 leaning-0.004
 allergic-0.003
 Tom-0.003
aked-0.003
Token 3
Token ID513
Feature activation-5.04e-01

Pos logit contributions
 trends+0.063
 Grace+0.061
 Grac+0.060
 Flor+0.059
 occasions+0.059
Neg logit contributions
 leaning-0.040
 supposed-0.040
 allergic-0.033
 Tom-0.032
eping-0.031
Token-
Token ID12
Feature activation-3.62e+00

Pos logit contributions
 supposed+0.023
 leaning+0.022
agged+0.020
 allergic+0.020
aked+0.019
Neg logit contributions
 trends-0.024
 Grace-0.024
lyss-0.024
 Grac-0.023
 occasions-0.022
Tokenyear
Token ID1941
Feature activation-3.99e+00

Pos logit contributions
 supposed+0.134
 leaning+0.133
 allergic+0.117
agged+0.115
 Tom+0.112
Neg logit contributions
 trends-0.196
 Grac-0.194
 Grace-0.193
lyss-0.192
 Flor-0.192
Token-
Token ID12
Feature activation-4.76e+00

Pos logit contributions
 supposed+0.216
 leaning+0.198
 handcuffs+0.186
 Tom+0.184
agged+0.182
Neg logit contributions
 trends-0.158
fish-0.155
lyss-0.153
 coral-0.151
 mer-0.144
Tokenold
Token ID727
Feature activation-2.67e+00

Pos logit contributions
 supposed+0.204
 leaning+0.202
 allergic+0.182
agged+0.179
 thieves+0.174
Neg logit contributions
 trends-0.235
lyss-0.226
 Flor-0.223
 Grace-0.223
 Grac-0.219
Token child
Token ID1200
Feature activation-6.56e-01

Pos logit contributions
 supposed+0.130
 leaning+0.125
 allergic+0.112
agged+0.110
aked+0.109
Neg logit contributions
 Grace-0.121
 trends-0.121
 Grac-0.116
lyss-0.116
 Flor-0.114
Token sat
Token ID3332
Feature activation+1.94e-02

Pos logit contributions
 supposed+0.023
 leaning+0.021
 allergic+0.019
aked+0.018
agged+0.018
Neg logit contributions
 Grace-0.038
 trends-0.038
 Grac-0.037
lyss-0.037
 Flor-0.037
Token by
Token ID416
Feature activation-3.40e-01

Pos logit contributions
 Grace+0.001
 trends+0.001
 Nina+0.001
 coral+0.001
 Grac+0.001
Neg logit contributions
 supposed-0.001
 allergic-0.001
ige-0.001
 leaning-0.001
hy-0.001