TOP ACTIVATIONS
MAX = 0.113

<|endoftext|> M ilo was a cute little boy who liked to
time , there was a playful kitten named Mitt ens .
He was a mis chie vous little baby and always got
time , there was a cute little girl who loved to
time , there was a wide tree . The tree had
time , there was a wide room that had a computer
time , there was a playful puppy named Max . Max
time , there was a playful prince who loved to dance
time , there was a cute little girl named Lily .
time , there was a cute little girl named Lucy .
time , there was a wide path in the woods .
time , there was a wide , green me adow .
time , there was a playful little girl named Lily .
George . Mia was a cute little girl , and George
time , there was a playful kitten named Mitt ens .
time , there was a cute little dove named D olly
time , there was a hard little girl named Lucy .
time , there was a cute little duck . The duck
time , there was a cool ot ter who lived alone
time , there was a cute little girl named Lily .

MIN ACTIVATIONS
MIN = -0.123

. The moral of the story is that it
. The moral of the story is : You
." M oral of the story : Being enthusiastic
. The moral of the story is : Don
M oral of the story : It 's
. The moral of the story is : Don
M oral of the story : Always examine
." The moral of the story is : Be
then good . The moral of the story is : honesty
The moral of this story is that it
The moral of the story is that it
M oral of the story : It is
M oral of the story : Even if
. M oral of the story : It is
. The moral of the story is : Sharing
. The moral of the story is : Be
M oral of the story : We should
. The moral of the story is that hugs
. M oral of the story : Be
. The moral of this story is that it

INTERVAL 0.054 - 0.113
CONTAINS 3.157%

" Hello , little ones . I come in peace .
things and created a big , beautiful boat . This time
upon a time , there was a farmer named Bob .
a family . They had a house , but the kitchen
stopped ! He looked around to find out what

INTERVAL -0.005 - 0.054
CONTAINS 53.122%

can try to make a pretty car !" So
" They are very nice , Ben . You have
excited . It was his first sleep over with his best
to the bathroom . She was hes his finger and puts
your letters ," the man said . " You cannot mail

INTERVAL -0.064 - -0.005
CONTAINS 43.187%

cookies were delicious and her mother thanked the little girl for
and asked her mom if she could have it . Her
The wild fire work hit Sara and Ben . It
From that day on , Gray happily sold his sweet ,
L ily followed Mom my 's movements and watched

INTERVAL -0.123 - -0.064
CONTAINS 0.534%

, reading always helps . <|endoftext|> Once upon a time there
was ready for bed . <|endoftext|> Once upon a time ,
sky together every day . <|endoftext|> Once upon a time ,
not hurt worms again . <|endoftext|> Tom is a bird .
and ran back home . <|endoftext|> Once there was an infant
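
The four INTERVAL blocks above each span 0.059, which is the MIN-to-MAX range (0.113 - (-0.123) = 0.236) divided by four, and their CONTAINS percentages sum to 100%. The dashboard does not state how it bins activations, so the following is only a minimal sketch that assumes equal-width intervals between the minimum and maximum activation. The function name `interval_percentages` and the sample values are hypothetical and are not taken from the tool that produced this output.

```python
from typing import List, Tuple


def interval_percentages(
    activations: List[float], n_intervals: int = 4
) -> List[Tuple[float, float, float]]:
    """Split [min, max] into equal-width intervals.

    Returns one (lower, upper, percent_of_values) tuple per interval,
    ordered from the lowest interval to the highest.
    Assumption: equal-width binning; the source does not confirm this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    lo, hi = min(activations), max(activations)
    if hi == lo:
        # Degenerate case: every value is identical, so one interval holds all.
        return [(lo, hi, 100.0)]
    width = (hi - lo) / n_intervals
    counts = [0] * n_intervals
    for a in activations:
        # Clamp so that the maximum value falls in the last interval.
        idx = min(int((a - lo) / width), n_intervals - 1)
        counts[idx] += 1
    total = len(activations)
    return [
        (lo + i * width, lo + (i + 1) * width, 100.0 * counts[i] / total)
        for i in range(n_intervals)
    ]


# Hypothetical sample values chosen to span the MIN and MAX reported above.
sample = [-0.123, -0.07, -0.01, 0.0, 0.02, 0.113]
result = interval_percentages(sample)
for lower, upper, pct in result:
    print(f"INTERVAL {lower:.3f} - {upper:.3f}  CONTAINS {pct:.3f}%")
```

The intervals are returned lowest-first, whereas the dashboard lists them highest-first; reversing the returned list would match that order.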
Token<|endoftext|>
Token ID50256
Feature activation-6.29e-02

Pos logit contributions
 wondering+0.009
 loudly+0.008
 She+0.008
 curious+0.008
 indeed+0.008
Neg logit contributions
aster-0.014
uch-0.013
aches-0.013
ons-0.013
kins-0.013
TokenM
Token ID44
Feature activation-2.74e-02

Pos logit contributions
aster+0.088
ons+0.079
rac+0.078
uch+0.076
kins+0.075
Neg logit contributions
 loudly-0.108
 wondering-0.108
 pad-0.107
 curious-0.104
 indeed-0.103
Tokenilo
Token ID18526
Feature activation+1.39e-02

Pos logit contributions
aster+0.028
kins+0.026
ons+0.025
uch+0.024
rac+0.024
Neg logit contributions
 indeed-0.056
 wondering-0.054
 loudly-0.053
 pad-0.052
 curious-0.052
Token was
Token ID373
Feature activation+7.75e-02

Pos logit contributions
 loudly+0.017
 wondering+0.016
 pad+0.016
 curious+0.016
 floppy+0.015
Neg logit contributions
aster-0.027
ons-0.025
uch-0.024
rac-0.024
kins-0.024
Token a
Token ID257
Feature activation+8.54e-02

Pos logit contributions
 pad+0.091
 Nice+0.089
 loudly+0.088
 detective+0.086
aly+0.086
Neg logit contributions
aster-0.150
ons-0.132
ma-0.130
uch-0.126
rac-0.124
Token cute
Token ID13779
Feature activation+1.13e-01

Pos logit contributions
aly+0.107
 Nice+0.102
 pad+0.100
 undone+0.096
lier+0.095
Neg logit contributions
aster-0.159
ma-0.133
ossible-0.122
kins-0.121
rac-0.121
Token little
Token ID1310
Feature activation+7.21e-02

Pos logit contributions
 undone+0.187
aly+0.186
ends+0.183
lessly+0.181
 likely+0.176
Neg logit contributions
aster-0.135
ma-0.125
 end-0.124
pa-0.107
 next-0.107
Token boy
Token ID2933
Feature activation+3.38e-02

Pos logit contributions
 undone+0.119
epad+0.116
ISE+0.110
 pad+0.110
 likely+0.109
Neg logit contributions
ma-0.075
aster-0.074
guard-0.070
uses-0.069
kins-0.067
Token who
Token ID508
Feature activation+1.10e-02

Pos logit contributions
 interested+0.054
 loudly+0.053
 handsome+0.053
 detective+0.052
 wondering+0.052
Neg logit contributions
aster-0.049
ons-0.046
ma-0.043
kins-0.041
rac-0.039
Token liked
Token ID8288
Feature activation+5.23e-03

Pos logit contributions
 pad+0.015
 loudly+0.014
 wondering+0.014
 undone+0.014
aly+0.013
Neg logit contributions
aster-0.018
ons-0.018
uch-0.017
kins-0.016
ma-0.016
Token to
Token ID284
Feature activation+8.32e-03

Pos logit contributions
 loudly+0.007
 pad+0.007
 wondering+0.007
 interested+0.007
aly+0.007
Neg logit contributions
aster-0.009
ons-0.009
rac-0.008
kins-0.008
uch-0.008
Token time
Token ID640
Feature activation+4.66e-02

Pos logit contributions
 loudly+0.075
 pad+0.073
 wondering+0.073
 curious+0.072
 floppy+0.070
Neg logit contributions
aster-0.083
ons-0.082
kins-0.078
rac-0.077
uch-0.074
Token,
Token ID11
Feature activation+3.17e-02

Pos logit contributions
 loudly+0.042
 wondering+0.042
 curious+0.040
 pad+0.039
 indeed+0.038
Neg logit contributions
aster-0.105
ons-0.098
rac-0.095
kins-0.095
uch-0.094
Token there
Token ID612
Feature activation-2.01e-02

Pos logit contributions
 loudly+0.041
 wondering+0.040
 pad+0.039
 curious+0.039
 interested+0.038
Neg logit contributions
aster-0.058
ons-0.055
uch-0.053
rac-0.053
kins-0.052
Token was
Token ID373
Feature activation+7.85e-02

Pos logit contributions
aster+0.044
rac+0.041
vation+0.041
kins+0.040
aches+0.040
Neg logit contributions
 loudly-0.019
 curious-0.019
 She-0.019
 wondering-0.018
 indeed-0.018
Token a
Token ID257
Feature activation+8.96e-02

Pos logit contributions
 wondering+0.113
 pad+0.112
 loudly+0.111
 detective+0.108
 interested+0.107
Neg logit contributions
aster-0.139
ons-0.125
uch-0.111
kins-0.107
ossible-0.106
Token playful
Token ID34264
Feature activation+1.12e-01

Pos logit contributions
 pad+0.122
 wondering+0.119
 undone+0.117
 indeed+0.115
aly+0.114
Neg logit contributions
aster-0.161
kins-0.141
ma-0.132
 moms-0.127
uch-0.127
Token kitten
Token ID42143
Feature activation+5.43e-02

Pos logit contributions
 undone+0.188
 wondering+0.187
 likely+0.183
 hesitant+0.177
 interested+0.173
Neg logit contributions
aster-0.164
ma-0.131
ons-0.125
 course-0.125
asis-0.119
Token named
Token ID3706
Feature activation+4.68e-02

Pos logit contributions
 wondering+0.077
 pad+0.075
 loudly+0.075
 interested+0.074
 blonde+0.072
Neg logit contributions
aster-0.092
ons-0.087
kins-0.081
uch-0.080
rac-0.079
Token Mitt
Token ID16627
Feature activation+5.58e-03

Pos logit contributions
 wondering+0.059
 loudly+0.058
 pad+0.057
 interested+0.056
 detective+0.055
Neg logit contributions
aster-0.087
ons-0.082
uch-0.079
rac-0.079
kins-0.075
Tokenens
Token ID641
Feature activation+2.67e-02

Pos logit contributions
 loudly+0.011
 pad+0.010
 wondering+0.010
 floppy+0.010
 indeed+0.010
Neg logit contributions
aster-0.007
rac-0.006
ons-0.006
kins-0.006
uch-0.006
Token.
Token ID13
Feature activation+4.44e-02

Pos logit contributions
 loudly+0.035
 wondering+0.035
 pad+0.033
 curious+0.033
 floppy+0.032
Neg logit contributions
aster-0.049
ons-0.046
kins-0.044
uch-0.043
rac-0.043
Token He
Token ID679
Feature activation+2.33e-02

Pos logit contributions
 wondering+0.054
 pad+0.053
 loudly+0.053
 handsome+0.051
 detective+0.051
Neg logit contributions
aster-0.065
ons-0.053
rac-0.048
uch-0.047
ma-0.046
Token was
Token ID373
Feature activation+7.44e-02

Pos logit contributions
 wondering+0.032
 loudly+0.032
 pad+0.031
 interested+0.030
 floppy+0.030
Neg logit contributions
aster-0.041
ons-0.039
uch-0.037
kins-0.037
rac-0.036
Token a
Token ID257
Feature activation+8.48e-02

Pos logit contributions
 pad+0.091
 loudly+0.080
lier+0.079
 wondering+0.074
 detective+0.073
Neg logit contributions
aster-0.147
ons-0.127
uch-0.127
rac-0.123
ma-0.122
Token mis
Token ID2984
Feature activation+6.68e-02

Pos logit contributions
 pad+0.117
 undone+0.112
lier+0.111
 Lay+0.109
 wondering+0.109
Neg logit contributions
aster-0.147
uch-0.120
ma-0.116
kins-0.114
rac-0.109
Tokenchie
Token ID3043
Feature activation+5.93e-02

Pos logit contributions
 wondering+0.119
 floppy+0.115
 loudly+0.114
 hello+0.112
 pad+0.112
Neg logit contributions
aster-0.093
ons-0.084
rac-0.082
uch-0.081
kins-0.079
Tokenvous
Token ID31222
Feature activation+1.12e-01

Pos logit contributions
 loudly+0.076
 curious+0.074
 pad+0.072
 interested+0.070
 wondering+0.069
Neg logit contributions
aster-0.106
kins-0.103
ons-0.101
 halftime-0.100
rac-0.099
Token little
Token ID1310
Feature activation+6.30e-02

Pos logit contributions
 undone+0.205
epad+0.187
ends+0.183
aly+0.174
 wondering+0.172
Neg logit contributions
aster-0.135
ma-0.117
pa-0.103
 moms-0.098
 course-0.098
Token baby
Token ID5156
Feature activation+3.67e-02

Pos logit contributions
 undone+0.110
epad+0.108
 Lay+0.107
 wondering+0.106
 blonde+0.105
Neg logit contributions
aster-0.062
ma-0.060
uses-0.059
kins-0.058
guard-0.056
Token and
Token ID290
Feature activation+2.83e-02

Pos logit contributions
 undone+0.060
 wondering+0.060
 blonde+0.059
 interested+0.059
aly+0.057
Neg logit contributions
aster-0.050
ons-0.048
ma-0.048
kins-0.044
uses-0.040
Token always
Token ID1464
Feature activation-1.33e-03

Pos logit contributions
 pad+0.033
 wondering+0.032
 loudly+0.031
 interested+0.029
 indeed+0.028
Neg logit contributions
aster-0.056
ons-0.053
kins-0.051
rac-0.051
uch-0.050
Token got
Token ID1392
Feature activation+4.66e-02

Pos logit contributions
aster+0.002
ons+0.002
uch+0.002
kins+0.002
rac+0.002
Neg logit contributions
 pad-0.002
 loudly-0.002
 wondering-0.002
aly-0.002
 detective-0.002
Token time
Token ID640
Feature activation+4.60e-02

Pos logit contributions
 loudly+0.074
 pad+0.071
 wondering+0.071
 curious+0.070
 floppy+0.068
Neg logit contributions
aster-0.082
ons-0.081
kins-0.077
rac-0.077
uch-0.073
Token,
Token ID11
Feature activation+2.95e-02

Pos logit contributions
 loudly+0.044
 wondering+0.043
 curious+0.041
 pad+0.041
 indeed+0.039
Neg logit contributions
aster-0.101
ons-0.095
rac-0.092
kins-0.091
uch-0.091
Token there
Token ID612
Feature activation-1.95e-02

Pos logit contributions
 loudly+0.041
 wondering+0.041
 pad+0.040
 floppy+0.039
 interested+0.038
Neg logit contributions
aster-0.051
ons-0.048
kins-0.046
uch-0.046
rac-0.046
Token was
Token ID373
Feature activation+8.08e-02

Pos logit contributions
aster+0.041
rac+0.038
kins+0.037
vation+0.037
ons+0.037
Neg logit contributions
 loudly-0.021
 curious-0.020
 wondering-0.020
 indeed-0.020
 She-0.020
Token a
Token ID257
Feature activation+8.88e-02

Pos logit contributions
 wondering+0.117
 pad+0.116
 loudly+0.115
 detective+0.112
 interested+0.110
Neg logit contributions
aster-0.141
ons-0.128
uch-0.113
kins-0.110
ossible-0.110
Token cute
Token ID13779
Feature activation+1.11e-01

Pos logit contributions
 pad+0.123
 wondering+0.120
 undone+0.118
aly+0.117
 indeed+0.116
Neg logit contributions
aster-0.157
kins-0.136
ma-0.129
 moms-0.124
 halftime-0.123
Token little
Token ID1310
Feature activation+8.25e-02

Pos logit contributions
 undone+0.196
lessly+0.185
 hesitant+0.183
 wondering+0.183
 likely+0.181
Neg logit contributions
aster-0.148
ma-0.124
 end-0.111
ons-0.110
asis-0.110
Token girl
Token ID2576
Feature activation+5.16e-02

Pos logit contributions
 undone+0.133
 wondering+0.131
 interested+0.125
 likely+0.124
 hesitant+0.118
Neg logit contributions
aster-0.122
ma-0.118
ons-0.112
kins-0.109
 moms-0.105
Token who
Token ID508
Feature activation+2.00e-02

Pos logit contributions
 loudly+0.058
 wondering+0.057
 pad+0.055
 curious+0.055
 floppy+0.053
Neg logit contributions
aster-0.103
ons-0.098
uch-0.096
rac-0.095
kins-0.095
Token loved
Token ID6151
Feature activation+1.58e-02

Pos logit contributions
 pad+0.027
 wondering+0.026
 loudly+0.026
 hesitant+0.025
aly+0.024
Neg logit contributions
aster-0.035
ons-0.033
uch-0.032
kins-0.030
ma-0.030
Token to
Token ID284
Feature activation+1.10e-02

Pos logit contributions
 loudly+0.020
 wondering+0.019
 pad+0.019
 curious+0.018
 interested+0.018
Neg logit contributions
aster-0.030
ons-0.028
rac-0.027
kins-0.027
uch-0.027
Token time
Token ID640
Feature activation+4.14e-02

Pos logit contributions
 loudly+0.065
 pad+0.064
 curious+0.062
 wondering+0.062
 floppy+0.060
Neg logit contributions
aster-0.079
ons-0.079
kins-0.075
rac-0.075
aro-0.071
Token,
Token ID11
Feature activation+2.62e-02

Pos logit contributions
 wondering+0.032
 loudly+0.032
 curious+0.031
 pad+0.029
 indeed+0.029
Neg logit contributions
aster-0.099
ons-0.092
rac-0.089
kins-0.089
aches-0.088
Token there
Token ID612
Feature activation-2.41e-02

Pos logit contributions
 loudly+0.034
 wondering+0.033
 pad+0.032
 curious+0.032
 interested+0.031
Neg logit contributions
aster-0.048
ons-0.046
uch-0.044
rac-0.044
kins-0.044
Token was
Token ID373
Feature activation+8.09e-02

Pos logit contributions
aster+0.053
rac+0.050
vation+0.048
aches+0.047
kins+0.047
Neg logit contributions
 curious-0.024
 loudly-0.023
 indeed-0.023
 She-0.023
 wondering-0.023
Token a
Token ID257
Feature activation+8.67e-02

Pos logit contributions
 wondering+0.109
 loudly+0.109
 pad+0.108
 interested+0.104
 detective+0.104
Neg logit contributions
aster-0.149
ons-0.136
uch-0.120
kins-0.118
ossible-0.117
Token wide
Token ID3094
Feature activation+1.11e-01

Pos logit contributions
 wondering+0.115
 pad+0.115
 undone+0.111
aly+0.110
 indeed+0.110
Neg logit contributions
aster-0.157
kins-0.137
ma-0.128
 moms-0.125
 halftime-0.124
Token tree
Token ID5509
Feature activation+3.19e-02

Pos logit contributions
 wondering+0.180
 interested+0.177
 hesitant+0.174
 undone+0.171
 likely+0.169
Neg logit contributions
aster-0.182
ons-0.145
ma-0.145
asis-0.140
rac-0.134
Token.
Token ID13
Feature activation+4.45e-02

Pos logit contributions
 loudly+0.036
 wondering+0.034
 curious+0.034
 indeed+0.032
 pad+0.032
Neg logit contributions
aster-0.065
ons-0.062
kins-0.060
uch-0.060
rac-0.060
Token The
Token ID383
Feature activation+2.45e-02

Pos logit contributions
 wondering+0.070
 loudly+0.069
 pad+0.068
 handsome+0.066
 detective+0.065
Neg logit contributions
aster-0.075
ons-0.064
rac-0.061
uch-0.057
ossible-0.055
Token tree
Token ID5509
Feature activation-3.80e-03

Pos logit contributions
 loudly+0.042
 wondering+0.042
 pad+0.041
 floppy+0.040
 curious+0.040
Neg logit contributions
aster-0.034
ons-0.032
rac-0.030
uch-0.030
kins-0.030
Token had
Token ID550
Feature activation+4.26e-02

Pos logit contributions
aster+0.008
vation+0.007
ons+0.007
uch+0.007
rac+0.007
Neg logit contributions
 curious-0.005
 loudly-0.005
 wondering-0.004
 pad-0.004
 indeed-0.004
Token time
Token ID640
Feature activation+3.87e-02

Pos logit contributions
 loudly+0.076
 wondering+0.075
 pad+0.074
 curious+0.072
 floppy+0.072
Neg logit contributions
aster-0.070
ons-0.066
kins-0.062
rac-0.062
uch-0.062
Token,
Token ID11
Feature activation+3.03e-02

Pos logit contributions
 loudly+0.041
 wondering+0.040
 pad+0.039
 curious+0.038
 interested+0.037
Neg logit contributions
aster-0.080
ons-0.076
uch-0.073
kins-0.073
rac-0.073
Token there
Token ID612
Feature activation-2.63e-02

Pos logit contributions
 wondering+0.047
 loudly+0.046
 pad+0.046
 floppy+0.044
 detective+0.044
Neg logit contributions
aster-0.046
ons-0.043
kins-0.042
uch-0.041
rac-0.041
Token was
Token ID373
Feature activation+7.50e-02

Pos logit contributions
aster+0.054
rac+0.050
ons+0.048
vation+0.048
kins+0.048
Neg logit contributions
 loudly-0.030
 wondering-0.029
 curious-0.029
 indeed-0.028
 She-0.028
Token a
Token ID257
Feature activation+8.47e-02

Pos logit contributions
 wondering+0.114
 pad+0.113
 loudly+0.113
 detective+0.110
 interested+0.109
Neg logit contributions
aster-0.126
ons-0.109
uch-0.096
ma-0.093
ossible-0.092
Token wide
Token ID3094
Feature activation+1.11e-01

Pos logit contributions
 pad+0.120
 wondering+0.116
 undone+0.116
aly+0.115
 indeed+0.112
Neg logit contributions
aster-0.145
kins-0.123
ma-0.119
 halftime-0.113
 moms-0.113
Token room
Token ID2119
Feature activation+5.43e-02

Pos logit contributions
 wondering+0.182
 interested+0.181
 hesitant+0.176
 undone+0.176
 likely+0.173
Neg logit contributions
aster-0.173
ma-0.140
asis-0.133
ons-0.133
 halftime-0.121
Token that
Token ID326
Feature activation+1.50e-03

Pos logit contributions
 loudly+0.069
 wondering+0.069
 pad+0.067
 floppy+0.065
 curious+0.064
Neg logit contributions
aster-0.101
ons-0.095
kins-0.090
uch-0.090
rac-0.090
Token had
Token ID550
Feature activation+3.81e-02

Pos logit contributions
 wondering+0.002
 loudly+0.002
 curious+0.002
 interested+0.002
 floppy+0.002
Neg logit contributions
aster-0.003
ons-0.003
uch-0.003
rac-0.003
kins-0.003
Token a
Token ID257
Feature activation+6.65e-02

Pos logit contributions
 pad+0.050
 wondering+0.048
 loudly+0.048
aly+0.045
 interested+0.045
Neg logit contributions
aster-0.069
ons-0.066
rac-0.062
uch-0.059
kins-0.058
Token computer
Token ID3644
Feature activation+4.38e-02

Pos logit contributions
lier+0.116
aly+0.110
 pad+0.105
king+0.105
ked+0.104
Neg logit contributions
aster-0.080
 moms-0.074
ma-0.072
ons-0.071
kins-0.070
Token time
Token ID640
Feature activation+4.00e-02

Pos logit contributions
 loudly+0.070
 wondering+0.069
 pad+0.068
 curious+0.067
 floppy+0.066
Neg logit contributions
aster-0.076
ons-0.074
kins-0.071
rac-0.070
uch-0.068
Token,
Token ID11
Feature activation+2.96e-02

Pos logit contributions
 loudly+0.041
 wondering+0.040
 pad+0.038
 curious+0.038
 interested+0.037
Neg logit contributions
aster-0.085
ons-0.080
uch-0.077
kins-0.077
rac-0.077
Token there
Token ID612
Feature activation-2.36e-02

Pos logit contributions
 wondering+0.047
 loudly+0.046
 pad+0.046
 floppy+0.044
 detective+0.043
Neg logit contributions
aster-0.045
ons-0.042
kins-0.041
uch-0.040
rac-0.039
Token was
Token ID373
Feature activation+8.04e-02

Pos logit contributions
aster+0.048
rac+0.044
kins+0.043
ons+0.043
vation+0.042
Neg logit contributions
 loudly-0.027
 wondering-0.026
 curious-0.026
 indeed-0.026
 She-0.025
Token a
Token ID257
Feature activation+8.79e-02

Pos logit contributions
 loudly+0.114
 wondering+0.114
 pad+0.112
 interested+0.110
 detective+0.109
Neg logit contributions
aster-0.142
ons-0.129
uch-0.111
ossible-0.110
ma-0.109
Token playful
Token ID34264
Feature activation+1.11e-01

Pos logit contributions
 wondering+0.120
 pad+0.120
 undone+0.118
aly+0.118
 indeed+0.115
Neg logit contributions
aster-0.153
kins-0.130
ma-0.126
 moms-0.122
 halftime-0.120
Token puppy
Token ID26188
Feature activation+5.33e-02

Pos logit contributions
 undone+0.183
 wondering+0.179
 likely+0.177
 hesitant+0.170
 interested+0.166
Neg logit contributions
aster-0.164
ma-0.136
 course-0.128
ons-0.126
 moms-0.121
Token named
Token ID3706
Feature activation+5.15e-02

Pos logit contributions
 wondering+0.096
 blonde+0.095
 interested+0.095
 pad+0.094
 undone+0.093
Neg logit contributions
aster-0.070
ons-0.067
ma-0.056
kins-0.055
uch-0.054
Token Max
Token ID5436
Feature activation+2.15e-02

Pos logit contributions
 wondering+0.067
 loudly+0.066
 interested+0.065
 pad+0.065
 curious+0.062
Neg logit contributions
aster-0.090
ons-0.088
rac-0.082
uch-0.082
ma-0.079
Token.
Token ID13
Feature activation+4.74e-02

Pos logit contributions
 wondering+0.032
 loudly+0.032
 pad+0.031
 floppy+0.031
 interested+0.031
Neg logit contributions
aster-0.035
ons-0.032
kins-0.030
ma-0.029
rac-0.029
Token Max
Token ID5436
Feature activation+3.34e-02

Pos logit contributions
 wondering+0.066
 interested+0.064
 pad+0.063
 loudly+0.063
 indeed+0.060
Neg logit contributions
aster-0.081
ons-0.078
rac-0.074
uch-0.070
ossible-0.070
Token time
Token ID640
Feature activation+4.55e-02

Pos logit contributions
 loudly+0.071
 wondering+0.069
 pad+0.069
 curious+0.067
 floppy+0.066
Neg logit contributions
aster-0.078
ons-0.076
kins-0.073
rac-0.072
uch-0.069
Token,
Token ID11
Feature activation+2.32e-02

Pos logit contributions
 loudly+0.046
 wondering+0.045
 pad+0.043
 curious+0.043
 interested+0.041
Neg logit contributions
aster-0.097
ons-0.092
rac-0.088
uch-0.088
kins-0.088
Token there
Token ID612
Feature activation-2.15e-02

Pos logit contributions
 wondering+0.035
 loudly+0.035
 pad+0.035
 floppy+0.034
 detective+0.033
Neg logit contributions
aster-0.036
ons-0.034
kins-0.033
uch-0.032
rac-0.032
Token was
Token ID373
Feature activation+7.53e-02

Pos logit contributions
aster+0.043
rac+0.040
ons+0.039
kins+0.038
vation+0.038
Neg logit contributions
 loudly-0.025
 wondering-0.024
 curious-0.024
 indeed-0.024
 She-0.023
Token a
Token ID257
Feature activation+8.53e-02

Pos logit contributions
 wondering+0.102
 loudly+0.101
 pad+0.100
 interested+0.097
 detective+0.096
Neg logit contributions
aster-0.139
ons-0.128
uch-0.113
kins-0.111
rac-0.110
Token playful
Token ID34264
Feature activation+1.11e-01

Pos logit contributions
 pad+0.117
 wondering+0.114
aly+0.114
 undone+0.113
 indeed+0.109
Neg logit contributions
aster-0.151
kins-0.130
ma-0.122
 halftime-0.119
 moms-0.117
Token prince
Token ID19716
Feature activation+4.45e-02

Pos logit contributions
 undone+0.189
 wondering+0.185
 likely+0.182
 hesitant+0.175
 unsure+0.170
Neg logit contributions
aster-0.163
ma-0.130
ons-0.126
 course-0.122
 moms-0.117
Token who
Token ID508
Feature activation+1.94e-02

Pos logit contributions
 wondering+0.060
 loudly+0.059
 pad+0.057
 floppy+0.057
 interested+0.056
Neg logit contributions
aster-0.079
ons-0.076
kins-0.070
uch-0.068
rac-0.068
Token loved
Token ID6151
Feature activation+8.96e-03

Pos logit contributions
 wondering+0.029
 loudly+0.029
 pad+0.029
 hesitant+0.027
 interested+0.027
Neg logit contributions
aster-0.030
ons-0.030
uch-0.027
kins-0.027
aches-0.026
Token to
Token ID284
Feature activation+7.88e-03

Pos logit contributions
 loudly+0.010
 wondering+0.010
 pad+0.009
 curious+0.009
 floppy+0.009
Neg logit contributions
aster-0.018
ons-0.017
uch-0.017
kins-0.017
rac-0.016
Token dance
Token ID9280
Feature activation+2.99e-02

Pos logit contributions
 loudly+0.011
 wondering+0.011
 curious+0.010
 floppy+0.010
 pad+0.010
Neg logit contributions
aster-0.013
ons-0.013
uch-0.013
kins-0.013
rac-0.013
Token time
Token ID640
Feature activation+4.42e-02

Pos logit contributions
 loudly+0.075
 wondering+0.074
 pad+0.073
 curious+0.072
 floppy+0.070
Neg logit contributions
aster-0.077
ons-0.074
kins-0.070
rac-0.070
uch-0.068
Token,
Token ID11
Feature activation+2.89e-02

Pos logit contributions
 loudly+0.051
 wondering+0.050
 pad+0.048
 floppy+0.047
 curious+0.046
Neg logit contributions
aster-0.087
ons-0.083
kins-0.079
uch-0.079
rac-0.079
Token there
Token ID612
Feature activation-2.10e-02

Pos logit contributions
 wondering+0.047
 pad+0.046
 loudly+0.046
 floppy+0.045
 detective+0.044
Neg logit contributions
aster-0.042
ons-0.040
kins-0.039
uch-0.038
rac-0.037
Token was
Token ID373
Feature activation+7.86e-02

Pos logit contributions
aster+0.041
rac+0.038
kins+0.037
ons+0.037
uch+0.036
Neg logit contributions
 loudly-0.025
 wondering-0.024
 curious-0.024
 indeed-0.024
 She-0.023
Token a
Token ID257
Feature activation+8.74e-02

Pos logit contributions
 wondering+0.118
 loudly+0.117
 pad+0.117
 detective+0.113
 interested+0.113
Neg logit contributions
aster-0.134
ons-0.120
uch-0.104
ma-0.102
ossible-0.101
Token cute
Token ID13779
Feature activation+1.11e-01

Pos logit contributions
 pad+0.124
aly+0.120
 wondering+0.119
 undone+0.118
 indeed+0.115
Neg logit contributions
aster-0.151
kins-0.129
ma-0.125
 moms-0.118
 halftime-0.118
Token little
Token ID1310
Feature activation+8.09e-02

Pos logit contributions
 undone+0.195
lessly+0.187
 wondering+0.184
 hesitant+0.182
 likely+0.180
Neg logit contributions
aster-0.142
ma-0.119
 end-0.108
asis-0.106
 moms-0.105
Token girl
Token ID2576
Feature activation+5.09e-02

Pos logit contributions
 undone+0.132
 wondering+0.131
 likely+0.124
 interested+0.124
 hesitant+0.118
Neg logit contributions
aster-0.116
ma-0.113
ons-0.106
kins-0.104
 moms-0.101
Token named
Token ID3706
Feature activation+4.42e-02

Pos logit contributions
 loudly+0.058
 wondering+0.058
 pad+0.055
 curious+0.055
 floppy+0.053
Neg logit contributions
aster-0.101
ons-0.096
uch-0.093
rac-0.092
kins-0.092
Token Lily
Token ID20037
Feature activation+2.81e-02

Pos logit contributions
 loudly+0.054
 wondering+0.054
 pad+0.051
 interested+0.050
 indeed+0.049
Neg logit contributions
aster-0.083
ons-0.079
uch-0.075
rac-0.075
kins-0.074
Token.
Token ID13
Feature activation+3.51e-02

Pos logit contributions
 loudly+0.036
 wondering+0.036
 pad+0.034
 curious+0.034
 floppy+0.033
Neg logit contributions
aster-0.052
ons-0.049
uch-0.047
kins-0.047
rac-0.047
Token time
Token ID640
Feature activation+4.12e-02

Pos logit contributions
 wondering+0.081
 loudly+0.080
 pad+0.078
 floppy+0.076
 curious+0.076
Neg logit contributions
aster-0.066
ons-0.060
uch-0.057
kins-0.056
rac-0.056
Token,
Token ID11
Feature activation+3.00e-02

Pos logit contributions
 loudly+0.041
 wondering+0.041
 pad+0.038
 curious+0.038
 interested+0.037
Neg logit contributions
aster-0.089
ons-0.084
uch-0.081
rac-0.081
kins-0.080
Token there
Token ID612
Feature activation-2.23e-02

Pos logit contributions
 wondering+0.043
 loudly+0.042
 pad+0.041
 floppy+0.040
 indeed+0.040
Neg logit contributions
aster-0.050
ons-0.047
kins-0.046
uch-0.045
rac-0.045
Token was
Token ID373
Feature activation+7.01e-02

Pos logit contributions
aster+0.047
rac+0.043
ons+0.043
kins+0.042
vation+0.042
Neg logit contributions
 loudly-0.025
 curious-0.024
 wondering-0.023
 She-0.023
 indeed-0.023
Token a
Token ID257
Feature activation+8.46e-02

Pos logit contributions
 wondering+0.107
 pad+0.106
 detective+0.105
 loudly+0.105
 interested+0.102
Neg logit contributions
aster-0.114
ons-0.099
uch-0.089
ossible-0.085
ma-0.084
Token cute
Token ID13779
Feature activation+1.11e-01

Pos logit contributions
 pad+0.121
aly+0.115
 wondering+0.115
 undone+0.113
 indeed+0.112
Neg logit contributions
aster-0.144
kins-0.123
ma-0.119
uch-0.115
 moms-0.114
Token little
Token ID1310
Feature activation+7.99e-02

Pos logit contributions
 undone+0.190
lessly+0.186
 wondering+0.179
 likely+0.179
 hesitant+0.177
Neg logit contributions
aster-0.144
ma-0.122
 end-0.114
asis-0.113
lf-0.109
Token girl
Token ID2576
Feature activation+4.90e-02

Pos logit contributions
 undone+0.130
 wondering+0.129
 likely+0.125
 interested+0.124
 constantly+0.118
Neg logit contributions
aster-0.110
ma-0.109
ons-0.101
kins-0.100
 moms-0.096
Token named
Token ID3706
Feature activation+4.39e-02

Pos logit contributions
 loudly+0.058
 wondering+0.058
 pad+0.056
 curious+0.054
 floppy+0.054
Neg logit contributions
aster-0.095
ons-0.090
uch-0.086
kins-0.086
rac-0.086
Token Lucy
Token ID22162
Feature activation+1.96e-02

Pos logit contributions
 loudly+0.056
 wondering+0.056
 pad+0.053
 interested+0.052
 detective+0.052
Neg logit contributions
aster-0.080
ons-0.076
uch-0.072
rac-0.071
kins-0.070
Token.
Token ID13
Feature activation+3.26e-02

Pos logit contributions
 loudly+0.030
 wondering+0.030
 pad+0.029
 floppy+0.029
 detective+0.029
Neg logit contributions
aster-0.031
ons-0.028
ma-0.026
kins-0.026
uch-0.026
Token time
Token ID640
Feature activation+4.58e-02

Pos logit contributions
 loudly+0.077
 pad+0.074
 wondering+0.074
 curious+0.074
 floppy+0.071
Neg logit contributions
aster-0.086
ons-0.084
kins-0.080
rac-0.080
uch-0.076
Token,
Token ID11
Feature activation+3.05e-02

Pos logit contributions
 loudly+0.039
 wondering+0.038
 curious+0.038
 pad+0.035
 indeed+0.035
Neg logit contributions
aster-0.107
ons-0.099
rac-0.096
kins-0.096
uch-0.095
Token there
Token ID612
Feature activation-1.92e-02

Pos logit contributions
 loudly+0.038
 wondering+0.037
 curious+0.036
 pad+0.035
 interested+0.035
Neg logit contributions
aster-0.059
ons-0.055
rac-0.053
uch-0.053
kins-0.052
Token was
Token ID373
Feature activation+7.81e-02

Pos logit contributions
aster+0.041
rac+0.038
vation+0.037
ons+0.037
kins+0.037
Neg logit contributions
 loudly-0.020
 curious-0.020
 wondering-0.019
 indeed-0.019
 She-0.019
Token a
Token ID257
Feature activation+8.63e-02

Pos logit contributions
 wondering+0.112
 loudly+0.110
 pad+0.110
 detective+0.105
 interested+0.104
Neg logit contributions
aster-0.140
ons-0.126
uch-0.113
kins-0.109
ossible-0.108
Token wide
Token ID3094
Feature activation+1.11e-01

Pos logit contributions
 pad+0.120
 wondering+0.115
 undone+0.112
aly+0.111
 indeed+0.111
Neg logit contributions
aster-0.153
kins-0.134
ma-0.126
uch-0.123
 moms-0.121
Token path
Token ID3108
Feature activation+3.54e-02

Pos logit contributions
 wondering+0.184
 interested+0.180
 hesitant+0.176
 undone+0.175
 likely+0.172
Neg logit contributions
aster-0.174
ma-0.139
ons-0.135
asis-0.135
uch-0.126
Token in
Token ID287
Feature activation-2.35e-02

Pos logit contributions
 loudly+0.045
 wondering+0.045
 pad+0.043
 curious+0.043
 floppy+0.042
Neg logit contributions
aster-0.066
ons-0.062
uch-0.059
kins-0.059
rac-0.059
Token the
Token ID262
Feature activation-2.53e-03

Pos logit contributions
uch+0.053
kins+0.051
ons+0.051
aster+0.050
rac+0.050
Neg logit contributions
 curious-0.020
 loudly-0.020
 wondering-0.019
 indeed-0.017
 floppy-0.017
Token woods
Token ID16479
Feature activation+1.29e-02

Pos logit contributions
aster+0.004
rac+0.004
ons+0.004
uch+0.004
ma+0.003
Neg logit contributions
 loudly-0.004
 wondering-0.004
 indeed-0.003
 interested-0.003
 pad-0.003
Token.
Token ID13
Feature activation+3.80e-02

Pos logit contributions
 loudly+0.016
 wondering+0.015
 curious+0.015
 interested+0.014
 pad+0.014
Neg logit contributions
aster-0.025
ons-0.024
uch-0.023
rac-0.023
kins-0.023
Token time
Token ID640
Feature activation+4.17e-02

Pos logit contributions
 loudly+0.072
 wondering+0.071
 pad+0.070
 curious+0.069
 interested+0.067
Neg logit contributions
aster-0.072
ons-0.069
kins-0.065
rac-0.065
uch-0.064
Token,
Token ID11
Feature activation+2.88e-02

Pos logit contributions
 loudly+0.040
 wondering+0.040
 curious+0.038
 pad+0.037
 interested+0.036
Neg logit contributions
aster-0.092
ons-0.086
uch-0.083
kins-0.083
rac-0.083
Token there
Token ID612
Feature activation-2.31e-02

Pos logit contributions
 wondering+0.041
 loudly+0.041
 pad+0.040
 floppy+0.039
 indeed+0.038
Neg logit contributions
aster-0.049
ons-0.046
kins-0.044
uch-0.044
rac-0.044
Token was
Token ID373
Feature activation+7.59e-02

Pos logit contributions
aster+0.047
rac+0.043
ons+0.042
kins+0.042
vation+0.041
Neg logit contributions
 loudly-0.027
 wondering-0.026
 curious-0.026
 indeed-0.025
 She-0.025
Token a
Token ID257
Feature activation+8.66e-02

Pos logit contributions
 wondering+0.111
 pad+0.110
 loudly+0.108
 detective+0.106
 interested+0.104
Neg logit contributions
aster-0.132
ons-0.118
uch-0.105
kins-0.101
ossible-0.101
Token wide
Token ID3094
Feature activation+1.11e-01

Pos logit contributions
 pad+0.123
 wondering+0.118
 undone+0.116
aly+0.116
 indeed+0.113
Neg logit contributions
aster-0.150
kins-0.129
ma-0.124
 halftime-0.118
 moms-0.118
Token,
Token ID11
Feature activation+6.14e-02

Pos logit contributions
 wondering+0.183
 interested+0.181
 hesitant+0.177
 undone+0.177
aly+0.172
Neg logit contributions
aster-0.171
ma-0.139
asis-0.133
ons-0.132
 halftime-0.122
Token green
Token ID4077
Feature activation+7.24e-02

Pos logit contributions
 pad+0.090
 exception+0.087
lier+0.087
der+0.086
 Lay+0.084
Neg logit contributions
aster-0.089
kins-0.086
ons-0.077
ma-0.077
uch-0.077
Token me
Token ID502
Feature activation+1.40e-02

Pos logit contributions
 wondering+0.118
 interested+0.117
 hesitant+0.112
 undone+0.112
 likely+0.109
Neg logit contributions
aster-0.114
ons-0.098
kins-0.093
ma-0.092
asis-0.092
Tokenadow
Token ID4584
Feature activation+4.33e-02

Pos logit contributions
 loudly+0.019
 wondering+0.019
 pad+0.019
 curious+0.018
 interested+0.018
Neg logit contributions
aster-0.025
ons-0.023
rac-0.022
kins-0.022
uch-0.022
Token.
Token ID13
Feature activation+4.16e-02

Pos logit contributions
 wondering+0.056
 pad+0.056
 loudly+0.054
 floppy+0.054
 detective+0.052
Neg logit contributions
aster-0.078
ons-0.074
kins-0.071
uch-0.070
rac-0.070
Token time
Token ID640
Feature activation+4.49e-02

Pos logit contributions
 loudly+0.074
 pad+0.071
 wondering+0.071
 curious+0.070
 interested+0.068
Neg logit contributions
aster-0.081
ons-0.080
kins-0.077
rac-0.076
uch-0.073
Token,
Token ID11
Feature activation+2.84e-02

Pos logit contributions
 loudly+0.042
 wondering+0.041
 curious+0.039
 pad+0.039
 interested+0.037
Neg logit contributions
aster-0.100
ons-0.094
kins-0.091
rac-0.090
uch-0.090
Token there
Token ID612
Feature activation-2.30e-02

Pos logit contributions
 wondering+0.041
 loudly+0.041
 pad+0.040
 floppy+0.039
 interested+0.038
Neg logit contributions
aster-0.048
ons-0.045
kins-0.043
uch-0.043
rac-0.042
Token was
Token ID373
Feature activation+7.88e-02

Pos logit contributions
aster+0.047
rac+0.044
kins+0.043
ons+0.043
vation+0.042
Neg logit contributions
 loudly-0.025
 wondering-0.025
 curious-0.025
 indeed-0.024
 She-0.024
Token a
Token ID257
Feature activation+8.82e-02

Pos logit contributions
 wondering+0.116
 loudly+0.114
 pad+0.114
 detective+0.111
 interested+0.110
Neg logit contributions
aster-0.137
ons-0.124
uch-0.107
ma-0.105
ossible-0.105
Token playful
Token ID34264
Feature activation+1.10e-01

Pos logit contributions
 pad+0.123
 wondering+0.120
 undone+0.119
aly+0.118
 indeed+0.116
Neg logit contributions
aster-0.154
kins-0.130
ma-0.126
 moms-0.121
 halftime-0.119
Token little
Token ID1310
Feature activation+8.02e-02

Pos logit contributions
 undone+0.185
 wondering+0.180
 likely+0.179
 hesitant+0.173
 unsure+0.168
Neg logit contributions
aster-0.162
ma-0.134
 course-0.125
ons-0.123
 moms-0.118
Token girl
Token ID2576
Feature activation+5.13e-02

Pos logit contributions
 wondering+0.130
 undone+0.127
 likely+0.119
 interested+0.119
 blonde+0.118
Neg logit contributions
aster-0.123
ma-0.114
ons-0.109
kins-0.104
 moms-0.102
Token named
Token ID3706
Feature activation+4.61e-02

Pos logit contributions
 loudly+0.054
 wondering+0.054
 curious+0.052
 pad+0.051
 floppy+0.049
Neg logit contributions
aster-0.105
ons-0.101
uch-0.099
rac-0.098
kins-0.098
Token Lily
Token ID20037
Feature activation+2.53e-02

Pos logit contributions
 wondering+0.058
 loudly+0.058
 pad+0.055
 interested+0.053
 floppy+0.053
Neg logit contributions
aster-0.086
ons-0.081
uch-0.076
rac-0.076
kins-0.075
Token.
Token ID13
Feature activation+3.96e-02

Pos logit contributions
 loudly+0.031
 wondering+0.031
 pad+0.030
 curious+0.029
 floppy+0.029
Neg logit contributions
aster-0.048
ons-0.046
uch-0.044
kins-0.044
rac-0.044
Token George
Token ID4502
Feature activation+2.89e-02

Pos logit contributions
 pad+0.051
 loudly+0.051
 wondering+0.050
 indeed+0.048
 interested+0.048
Neg logit contributions
aster-0.062
ons-0.058
kins-0.055
rac-0.054
uch-0.054
Token.
Token ID13
Feature activation+3.33e-02

Pos logit contributions
 loudly+0.050
 wondering+0.050
 detective+0.050
 handsome+0.049
 pad+0.049
Neg logit contributions
aster-0.035
ons-0.034
kins-0.033
ma-0.032
ossible-0.032
Token Mia
Token ID32189
Feature activation+2.82e-02

Pos logit contributions
 wondering+0.031
 loudly+0.029
 handsome+0.029
 badly+0.029
 undone+0.028
Neg logit contributions
aster-0.072
ons-0.067
kins-0.065
rac-0.064
ossible-0.063
Token was
Token ID373
Feature activation+7.56e-02

Pos logit contributions
 loudly+0.031
 curious+0.031
 wondering+0.031
 pad+0.031
 interested+0.029
Neg logit contributions
aster-0.058
ons-0.053
rac-0.052
uch-0.051
kins-0.051
Token a
Token ID257
Feature activation+8.45e-02

Pos logit contributions
 pad+0.096
 loudly+0.086
 wondering+0.080
 Nice+0.079
 indeed+0.078
Neg logit contributions
aster-0.148
ons-0.133
uch-0.130
aches-0.128
rac-0.128
Token cute
Token ID13779
Feature activation+1.10e-01

Pos logit contributions
 pad+0.119
 wondering+0.117
 undone+0.113
lier+0.109
aly+0.108
Neg logit contributions
aster-0.134
ma-0.115
 moms-0.115
uch-0.114
kins-0.114
Token little
Token ID1310
Feature activation+5.81e-02

Pos logit contributions
 undone+0.201
lessly+0.194
epad+0.183
 wondering+0.180
aly+0.180
Neg logit contributions
 end-0.107
ma-0.103
aster-0.098
 moms-0.094
 day-0.093
Token girl
Token ID2576
Feature activation+5.07e-02

Pos logit contributions
 undone+0.100
epad+0.095
ISE+0.093
vel+0.090
 blonde+0.090
Neg logit contributions
 moms-0.059
ma-0.058
guard-0.057
 last-0.056
kins-0.056
Token,
Token ID11
Feature activation+3.41e-02

Pos logit contributions
 loudly+0.062
 wondering+0.061
 handsome+0.059
 interested+0.059
 undone+0.059
Neg logit contributions
aster-0.089
ons-0.089
ma-0.086
kins-0.083
rac-0.078
Token and
Token ID290
Feature activation+2.76e-02

Pos logit contributions
 loudly+0.035
aly+0.034
 detective+0.033
 pad+0.032
 interested+0.032
Neg logit contributions
aster-0.066
ons-0.066
ma-0.061
rac-0.060
uch-0.059
Token George
Token ID4502
Feature activation+1.28e-02

Pos logit contributions
 pad+0.038
 undone+0.036
 wondering+0.036
 loudly+0.035
 interested+0.034
Neg logit contributions
ons-0.046
aster-0.046
kins-0.043
ma-0.043
rac-0.042
Token time
Token ID640
Feature activation+4.08e-02

Pos logit contributions
 loudly+0.068
 pad+0.065
 wondering+0.065
 curious+0.064
 interested+0.062
Neg logit contributions
aster-0.074
ons-0.074
kins-0.070
rac-0.069
uch-0.066
Token,
Token ID11
Feature activation+2.45e-02

Pos logit contributions
 loudly+0.039
 wondering+0.039
 curious+0.037
 pad+0.037
 interested+0.035
Neg logit contributions
aster-0.089
ons-0.084
kins-0.081
uch-0.081
rac-0.081
Token there
Token ID612
Feature activation-2.67e-02

Pos logit contributions
 wondering+0.035
 loudly+0.035
 pad+0.034
 floppy+0.033
 interested+0.032
Neg logit contributions
aster-0.041
ons-0.039
kins-0.037
uch-0.037
rac-0.037
Token was
Token ID373
Feature activation+7.44e-02

Pos logit contributions
aster+0.054
rac+0.050
ons+0.048
kins+0.048
vation+0.048
Neg logit contributions
 loudly-0.030
 wondering-0.030
 curious-0.029
 indeed-0.029
 She-0.029
Token a
Token ID257
Feature activation+8.50e-02

Pos logit contributions
 wondering+0.111
 pad+0.111
 loudly+0.109
 interested+0.106
 detective+0.106
Neg logit contributions
aster-0.128
ons-0.112
uch-0.098
ossible-0.094
ma-0.094
Token playful
Token ID34264
Feature activation+1.10e-01

Pos logit contributions
 pad+0.121
 undone+0.117
 wondering+0.116
aly+0.114
 indeed+0.113
Neg logit contributions
aster-0.145
kins-0.126
ma-0.119
 moms-0.115
 halftime-0.113
Token kitten
Token ID42143
Feature activation+4.95e-02

Pos logit contributions
 undone+0.182
 wondering+0.180
 likely+0.177
 hesitant+0.171
 interested+0.167
Neg logit contributions
aster-0.160
ma-0.128
 course-0.123
ons-0.119
asis-0.116
Token named
Token ID3706
Feature activation+4.45e-02

Pos logit contributions
 wondering+0.073
 pad+0.072
 loudly+0.071
 interested+0.071
 blonde+0.069
Neg logit contributions
aster-0.079
ons-0.076
kins-0.070
uch-0.068
ma-0.067
Token Mitt
Token ID16627
Feature activation+2.95e-03

Pos logit contributions
 wondering+0.058
 loudly+0.057
 pad+0.055
 interested+0.054
 detective+0.054
Neg logit contributions
aster-0.081
ons-0.076
uch-0.074
rac-0.074
kins-0.070
Tokenens
Token ID641
Feature activation+2.08e-02

Pos logit contributions
 loudly+0.006
 pad+0.006
 wondering+0.005
 floppy+0.005
 indeed+0.005
Neg logit contributions
aster-0.004
ons-0.003
rac-0.003
kins-0.003
uch-0.003
Token.
Token ID13
Feature activation+4.09e-02

Pos logit contributions
 loudly+0.028
 wondering+0.028
 pad+0.027
 floppy+0.026
 curious+0.026
Neg logit contributions
aster-0.037
ons-0.035
kins-0.034
rac-0.033
uch-0.033
Token time
Token ID640
Feature activation+4.44e-02

Pos logit contributions
 loudly+0.071
 pad+0.069
 wondering+0.069
 curious+0.068
 interested+0.066
Neg logit contributions
aster-0.082
ons-0.080
kins-0.077
rac-0.077
uch-0.073
Token,
Token ID11
Feature activation+3.18e-02

Pos logit contributions
 loudly+0.040
 wondering+0.040
 curious+0.038
 pad+0.037
 indeed+0.036
Neg logit contributions
aster-0.100
ons-0.094
rac-0.091
kins-0.091
uch-0.090
Token there
Token ID612
Feature activation-1.89e-02

Pos logit contributions
 loudly+0.045
 wondering+0.045
 pad+0.043
 floppy+0.042
 interested+0.041
Neg logit contributions
aster-0.054
ons-0.051
kins-0.049
uch-0.049
rac-0.049
Token was
Token ID373
Feature activation+8.00e-02

Pos logit contributions
aster+0.039
rac+0.036
kins+0.035
ons+0.035
vation+0.035
Neg logit contributions
 loudly-0.021
 wondering-0.021
 curious-0.020
 indeed-0.020
 She-0.020
Token a
Token ID257
Feature activation+8.64e-02

Pos logit contributions
 wondering+0.118
 pad+0.115
 loudly+0.115
 detective+0.113
 interested+0.111
Neg logit contributions
aster-0.141
ons-0.125
uch-0.110
ossible-0.107
kins-0.104
Token cute
Token ID13779
Feature activation+1.10e-01

Pos logit contributions
 pad+0.119
 wondering+0.117
 undone+0.114
aly+0.113
 indeed+0.113
Neg logit contributions
aster-0.153
kins-0.132
ma-0.124
 moms-0.120
 halftime-0.120
Token little
Token ID1310
Feature activation+8.10e-02

Pos logit contributions
 undone+0.187
 wondering+0.178
lessly+0.175
 hesitant+0.174
 likely+0.174
Neg logit contributions
aster-0.153
ma-0.125
asis-0.115
ons-0.113
kins-0.112
Token dove
Token ID38862
Feature activation+5.46e-02

Pos logit contributions
 wondering+0.132
 undone+0.131
 interested+0.124
 likely+0.124
 hesitant+0.118
Neg logit contributions
aster-0.120
ma-0.113
ons-0.108
kins-0.105
 moms-0.101
Token named
Token ID3706
Feature activation+5.35e-02

Pos logit contributions
 loudly+0.065
 wondering+0.064
 pad+0.062
 curious+0.061
 floppy+0.059
Neg logit contributions
aster-0.106
ons-0.100
rac-0.096
kins-0.096
uch-0.096
Token D
Token ID360
Feature activation+2.74e-02

Pos logit contributions
 loudly+0.069
 wondering+0.068
 pad+0.065
 interested+0.064
 detective+0.063
Neg logit contributions
aster-0.097
ons-0.091
rac-0.089
uch-0.089
kins-0.085
Tokenolly
Token ID5098
Feature activation+4.40e-02

Pos logit contributions
 loudly+0.038
 pad+0.038
 wondering+0.037
 curious+0.036
 indeed+0.036
Neg logit contributions
aster-0.047
ons-0.045
kins-0.042
rac-0.042
uch-0.041
Token time
Token ID640
Feature activation+4.06e-02

Pos logit contributions
 loudly+0.076
 wondering+0.076
 pad+0.074
 curious+0.073
 floppy+0.072
Neg logit contributions
aster-0.070
ons-0.066
kins-0.062
rac-0.061
uch-0.061
Token,
Token ID11
Feature activation+2.96e-02

Pos logit contributions
 loudly+0.044
 wondering+0.043
 pad+0.041
 curious+0.040
 floppy+0.040
Neg logit contributions
aster-0.083
ons-0.079
uch-0.076
kins-0.076
rac-0.076
Token there
Token ID612
Feature activation-2.38e-02

Pos logit contributions
 wondering+0.046
 loudly+0.045
 pad+0.045
 floppy+0.043
 detective+0.043
Neg logit contributions
aster-0.045
ons-0.042
kins-0.042
uch-0.041
rac-0.040
Token was
Token ID373
Feature activation+7.58e-02

Pos logit contributions
aster+0.050
rac+0.046
ons+0.045
vation+0.045
kins+0.044
Neg logit contributions
 loudly-0.025
 curious-0.025
 wondering-0.025
 indeed-0.025
 She-0.024
Token a
Token ID257
Feature activation+8.46e-02

Pos logit contributions
 wondering+0.114
 loudly+0.112
 pad+0.111
 detective+0.109
 interested+0.108
Neg logit contributions
aster-0.128
ons-0.113
uch-0.101
kins-0.097
ossible-0.096
Token hard
Token ID1327
Feature activation+1.09e-01

Pos logit contributions
 pad+0.118
 wondering+0.115
 undone+0.114
aly+0.113
 indeed+0.111
Neg logit contributions
aster-0.146
kins-0.126
ma-0.120
 moms-0.114
 halftime-0.113
Token little
Token ID1310
Feature activation+7.38e-02

Pos logit contributions
 wondering+0.187
aly+0.184
 Nice+0.184
 Lay+0.182
lier+0.181
Neg logit contributions
aster-0.167
uch-0.129
ons-0.125
kins-0.120
asis-0.119
Token girl
Token ID2576
Feature activation+4.74e-02

Pos logit contributions
 wondering+0.119
 undone+0.118
 interested+0.111
 likely+0.109
 indeed+0.108
Neg logit contributions
aster-0.109
ma-0.100
ons-0.096
kins-0.093
 moms-0.088
Token named
Token ID3706
Feature activation+4.10e-02

Pos logit contributions
 loudly+0.050
 wondering+0.050
 curious+0.048
 pad+0.047
 floppy+0.045
Neg logit contributions
aster-0.098
ons-0.094
uch-0.092
rac-0.092
kins-0.091
Token Lucy
Token ID22162
Feature activation+1.30e-02

Pos logit contributions
 wondering+0.050
 loudly+0.050
 pad+0.048
 interested+0.046
 floppy+0.046
Neg logit contributions
aster-0.078
ons-0.074
rac-0.070
uch-0.070
kins-0.069
Token.
Token ID13
Feature activation+3.21e-02

Pos logit contributions
 loudly+0.020
 wondering+0.019
 pad+0.019
 floppy+0.019
 interested+0.018
Neg logit contributions
aster-0.021
ons-0.020
kins-0.018
ma-0.018
uch-0.018
Token time
Token ID640
Feature activation+4.06e-02

Pos logit contributions
 loudly+0.070
 wondering+0.068
 pad+0.068
 curious+0.067
 floppy+0.065
Neg logit contributions
aster-0.075
ons-0.073
kins-0.069
rac-0.069
uch-0.066
Token,
Token ID11
Feature activation+3.06e-02

Pos logit contributions
 loudly+0.038
 wondering+0.037
 curious+0.036
 pad+0.035
 indeed+0.034
Neg logit contributions
aster-0.091
ons-0.085
rac-0.082
uch-0.082
kins-0.082
Token there
Token ID612
Feature activation-2.32e-02

Pos logit contributions
 wondering+0.046
 loudly+0.045
 pad+0.045
 floppy+0.043
 detective+0.043
Neg logit contributions
aster-0.049
ons-0.046
kins-0.045
uch-0.044
rac-0.044
Token was
Token ID373
Feature activation+7.85e-02

Pos logit contributions
aster+0.046
rac+0.042
ons+0.042
kins+0.042
vation+0.041
Neg logit contributions
 loudly-0.027
 wondering-0.026
 curious-0.026
 indeed-0.026
 She-0.025
Token a
Token ID257
Feature activation+8.49e-02

Pos logit contributions
 wondering+0.116
 pad+0.114
 detective+0.114
 loudly+0.113
 interested+0.110
Neg logit contributions
aster-0.133
ons-0.121
uch-0.107
ossible-0.105
ma-0.102
Token cute
Token ID13779
Feature activation+1.09e-01

Pos logit contributions
 pad+0.123
 wondering+0.120
aly+0.118
 undone+0.118
 Lay+0.114
Neg logit contributions
aster-0.143
kins-0.123
ma-0.117
 halftime-0.114
 moms-0.113
Token little
Token ID1310
Feature activation+7.92e-02

Pos logit contributions
 undone+0.187
 wondering+0.177
lessly+0.176
 hesitant+0.176
 likely+0.173
Neg logit contributions
aster-0.149
ma-0.125
asis-0.115
 end-0.114
ons-0.112
Token duck
Token ID22045
Feature activation+4.51e-02

Pos logit contributions
 wondering+0.128
 undone+0.128
 interested+0.120
 likely+0.119
 hesitant+0.114
Neg logit contributions
aster-0.115
ma-0.110
ons-0.106
kins-0.102
 moms-0.097
Token.
Token ID13
Feature activation+4.19e-02

Pos logit contributions
 loudly+0.052
 wondering+0.050
 curious+0.048
 pad+0.048
 interested+0.046
Neg logit contributions
aster-0.091
ons-0.086
rac-0.083
uch-0.083
kins-0.082
Token The
Token ID383
Feature activation+2.13e-02

Pos logit contributions
 wondering+0.058
 pad+0.058
 loudly+0.057
 interested+0.054
 detective+0.054
Neg logit contributions
aster-0.075
ons-0.068
rac-0.067
kins-0.063
uch-0.063
Token duck
Token ID22045
Feature activation+1.89e-02

Pos logit contributions
 curious+0.025
 loudly+0.025
 wondering+0.024
 floppy+0.024
 puppy+0.024
Neg logit contributions
aster-0.043
ons-0.041
rac-0.038
uch-0.037
kins-0.037
Token time
Token ID640
Feature activation+3.79e-02

Pos logit contributions
 loudly+0.075
 wondering+0.075
 pad+0.073
 curious+0.072
 floppy+0.071
Neg logit contributions
aster-0.069
ons-0.065
kins-0.061
rac-0.061
uch-0.061
Token,
Token ID11
Feature activation+2.85e-02

Pos logit contributions
 loudly+0.041
 wondering+0.041
 pad+0.039
 curious+0.038
 floppy+0.038
Neg logit contributions
aster-0.077
ons-0.074
uch-0.070
kins-0.070
rac-0.070
Token there
Token ID612
Feature activation-2.67e-02

Pos logit contributions
 wondering+0.046
 pad+0.045
 loudly+0.045
 floppy+0.043
 detective+0.043
Neg logit contributions
aster-0.042
ons-0.039
kins-0.038
uch-0.037
rac-0.037
Token was
Token ID373
Feature activation+7.31e-02

Pos logit contributions
aster+0.052
rac+0.048
ons+0.047
kins+0.047
uch+0.046
Neg logit contributions
 loudly-0.032
 wondering-0.032
 curious-0.031
 indeed-0.031
 She-0.030
Token a
Token ID257
Feature activation+8.51e-02

Pos logit contributions
 loudly+0.110
 wondering+0.110
 pad+0.109
 detective+0.106
 interested+0.106
Neg logit contributions
aster-0.121
ons-0.107
uch-0.093
ma-0.091
ossible-0.090
Token cool
Token ID3608
Feature activation+1.09e-01

Pos logit contributions
 pad+0.121
aly+0.117
 wondering+0.116
 undone+0.115
 indeed+0.111
Neg logit contributions
aster-0.144
kins-0.124
ma-0.119
 halftime-0.114
 moms-0.114
Token ot
Token ID30972
Feature activation+5.07e-02

Pos logit contributions
 wondering+0.166
 undone+0.162
 interested+0.162
 likely+0.160
 hesitant+0.159
Neg logit contributions
aster-0.178
ma-0.150
ons-0.144
asis-0.141
uch-0.139
Tokenter
Token ID353
Feature activation+4.71e-02

Pos logit contributions
 wondering+0.070
 loudly+0.069
 pad+0.066
 floppy+0.066
 blonde+0.065
Neg logit contributions
aster-0.087
ons-0.084
uch-0.082
kins-0.082
rac-0.079
Token who
Token ID508
Feature activation+1.10e-02

Pos logit contributions
 wondering+0.053
 loudly+0.053
 pad+0.051
 floppy+0.050
 interested+0.049
Neg logit contributions
aster-0.093
ons-0.089
kins-0.085
uch-0.084
rac-0.084
Token lived
Token ID5615
Feature activation+3.12e-02

Pos logit contributions
 pad+0.015
 wondering+0.015
 loudly+0.015
 detective+0.014
 indeed+0.014
Neg logit contributions
aster-0.018
ons-0.018
kins-0.017
uch-0.017
rac-0.016
Token alone
Token ID3436
Feature activation+1.06e-02

Pos logit contributions
 pad+0.040
 wondering+0.040
 loudly+0.040
 floppy+0.039
 blonde+0.038
Neg logit contributions
aster-0.055
ons-0.053
kins-0.052
rac-0.050
uch-0.050
Token time
Token ID640
Feature activation+4.39e-02

Pos logit contributions
 loudly+0.066
 pad+0.065
 wondering+0.064
 curious+0.063
 floppy+0.061
Neg logit contributions
aster-0.079
ons-0.078
kins-0.075
rac-0.075
uch-0.071
Token,
Token ID11
Feature activation+2.85e-02

Pos logit contributions
 loudly+0.038
 wondering+0.038
 curious+0.037
 pad+0.036
 indeed+0.034
Neg logit contributions
aster-0.100
ons-0.094
rac-0.091
kins-0.090
uch-0.090
Token there
Token ID612
Feature activation-1.78e-02

Pos logit contributions
 loudly+0.039
 wondering+0.039
 pad+0.037
 floppy+0.036
 curious+0.036
Neg logit contributions
aster-0.050
ons-0.047
kins-0.045
uch-0.045
rac-0.045
Token was
Token ID373
Feature activation+8.09e-02

Pos logit contributions
aster+0.038
rac+0.036
vation+0.035
ons+0.034
kins+0.034
Neg logit contributions
 wondering-0.018
 curious-0.018
 She-0.018
 loudly-0.018
 indeed-0.017
Token a
Token ID257
Feature activation+8.66e-02

Pos logit contributions
 wondering+0.116
 loudly+0.115
 pad+0.113
 detective+0.112
 interested+0.110
Neg logit contributions
aster-0.142
ons-0.128
uch-0.113
ossible-0.111
kins-0.109
Token cute
Token ID13779
Feature activation+1.09e-01

Pos logit contributions
 pad+0.118
 wondering+0.117
 undone+0.113
 indeed+0.113
aly+0.111
Neg logit contributions
aster-0.154
kins-0.134
ma-0.127
 moms-0.122
 halftime-0.120
Token little
Token ID1310
Feature activation+7.96e-02

Pos logit contributions
 undone+0.186
 wondering+0.178
lessly+0.176
 hesitant+0.175
 likely+0.174
Neg logit contributions
aster-0.152
ma-0.123
 end-0.113
asis-0.110
 moms-0.110
Token girl
Token ID2576
Feature activation+5.11e-02

Pos logit contributions
 wondering+0.128
 undone+0.128
 interested+0.121
 likely+0.120
 hesitant+0.115
Neg logit contributions
aster-0.119
ma-0.113
ons-0.109
kins-0.106
 moms-0.101
Token named
Token ID3706
Feature activation+4.86e-02

Pos logit contributions
 loudly+0.052
 wondering+0.052
 curious+0.050
 pad+0.049
 floppy+0.047
Neg logit contributions
aster-0.106
ons-0.102
uch-0.101
rac-0.100
kins-0.100
Token Lily
Token ID20037
Feature activation+2.61e-02

Pos logit contributions
 loudly+0.058
 wondering+0.057
 pad+0.055
 interested+0.053
 floppy+0.053
Neg logit contributions
aster-0.094
ons-0.089
uch-0.084
rac-0.084
kins-0.084
Token.
Token ID13
Feature activation+3.82e-02

Pos logit contributions
 loudly+0.031
 wondering+0.031
 pad+0.030
 curious+0.029
 floppy+0.028
Neg logit contributions
aster-0.051
ons-0.048
uch-0.047
kins-0.047
rac-0.047
Token.
Token ID13
Feature activation-1.74e-03

Pos logit contributions
aster+0.011
kins+0.011
ons+0.010
rac+0.010
uch+0.010
Neg logit contributions
 wondering-0.009
 pad-0.009
 floppy-0.009
 blonde-0.009
 loudly-0.009
Token\n
Token ID198
Feature activation-5.88e-02

Pos logit contributions
aster+0.003
ons+0.003
rac+0.003
kins+0.003
uch+0.003
Neg logit contributions
 loudly-0.002
 wondering-0.002
 pad-0.002
 curious-0.002
 floppy-0.002
Token\n
Token ID198
Feature activation-4.64e-02

Pos logit contributions
uch+0.107
aster+0.102
rac+0.099
kins+0.098
aches+0.096
Neg logit contributions
 loudly-0.075
 curious-0.075
 wondering-0.074
 She-0.071
 It-0.069
TokenThe
Token ID464
Feature activation-1.43e-02

Pos logit contributions
aster+0.078
uch+0.074
ons+0.074
rac+0.072
kins+0.071
Neg logit contributions
 loudly-0.066
 wondering-0.062
 curious-0.061
 indeed-0.061
 pad-0.061
Token moral
Token ID6573
Feature activation-4.62e-02

Pos logit contributions
aster+0.017
ons+0.016
uch+0.015
rac+0.015
kins+0.015
Neg logit contributions
 loudly-0.028
 wondering-0.028
 curious-0.027
 pad-0.027
 interested-0.026
Token of
Token ID286
Feature activation-1.11e-01

Pos logit contributions
aster+0.080
ons+0.075
kins+0.074
uch+0.073
rac+0.072
Neg logit contributions
 loudly-0.064
 wondering-0.063
 indeed-0.061
 curious-0.061
 pad-0.060
Token the
Token ID262
Feature activation-4.60e-02

Pos logit contributions
uch+0.244
ups+0.235
bel+0.225
rac+0.224
vals+0.223
Neg logit contributions
 indeed-0.109
 wondering-0.105
 very-0.098
 so-0.098
 knowing-0.096
Token story
Token ID1621
Feature activation-6.02e-02

Pos logit contributions
aster+0.065
ons+0.059
rac+0.058
kins+0.057
uch+0.057
Neg logit contributions
 wondering-0.080
 loudly-0.079
 curious-0.078
 indeed-0.077
 floppy-0.075
Token is
Token ID318
Feature activation-1.01e-02

Pos logit contributions
aster+0.098
ons+0.089
uch+0.085
kins+0.084
rac+0.083
Neg logit contributions
 loudly-0.094
 wondering-0.091
 indeed-0.089
 pad-0.089
 curious-0.088
Token that
Token ID326
Feature activation-3.43e-02

Pos logit contributions
aster+0.016
ons+0.015
kins+0.014
uch+0.014
rac+0.013
Neg logit contributions
 loudly-0.016
 wondering-0.016
 curious-0.016
 indeed-0.016
 interested-0.015
Token it
Token ID340
Feature activation-4.61e-02

Pos logit contributions
aster+0.056
ons+0.053
kins+0.050
rac+0.050
uch+0.050
Neg logit contributions
 pad-0.049
 loudly-0.048
 floppy-0.047
 wondering-0.047
 blonde-0.047
Token.
Token ID13
Feature activation+2.23e-02

Pos logit contributions
aster+0.047
ons+0.045
kins+0.043
rac+0.043
uch+0.042
Neg logit contributions
 wondering-0.030
 loudly-0.030
 pad-0.029
 floppy-0.028
 curious-0.028
Token\n
Token ID198
Feature activation-5.70e-02

Pos logit contributions
 loudly+0.028
 wondering+0.027
 pad+0.026
 curious+0.026
 floppy+0.026
Neg logit contributions
aster-0.043
ons-0.040
uch-0.039
kins-0.038
rac-0.038
Token\n
Token ID198
Feature activation-4.36e-02

Pos logit contributions
uch+0.113
kins+0.106
rac+0.104
vation+0.102
atmeal+0.101
Neg logit contributions
 loudly-0.070
 curious-0.070
 wondering-0.064
 She-0.063
 interested-0.062
TokenThe
Token ID464
Feature activation-6.11e-03

Pos logit contributions
uch+0.080
ons+0.080
rac+0.079
kins+0.078
at+0.076
Neg logit contributions
 loudly-0.051
 curious-0.046
 indeed-0.045
 pad-0.045
 wondering-0.045
Token moral
Token ID6573
Feature activation-4.02e-02

Pos logit contributions
ons+0.009
aster+0.009
rac+0.009
uch+0.008
kins+0.008
Neg logit contributions
 curious-0.010
 loudly-0.010
 wondering-0.010
 detective-0.010
 pad-0.010
Token of
Token ID286
Feature activation-1.11e-01

Pos logit contributions
aster+0.076
kins+0.073
uch+0.072
ons+0.071
rac+0.069
Neg logit contributions
 loudly-0.048
 indeed-0.047
 curious-0.046
 wondering-0.045
 interested-0.044
Token the
Token ID262
Feature activation-4.36e-02

Pos logit contributions
uch+0.242
rac+0.233
ups+0.233
vation+0.232
bel+0.230
Neg logit contributions
 indeed-0.103
 very-0.097
 and-0.095
 baby-0.091
 like-0.089
Token story
Token ID1621
Feature activation-6.13e-02

Pos logit contributions
aster+0.073
rac+0.070
uch+0.067
ons+0.066
kins+0.065
Neg logit contributions
 curious-0.063
 wondering-0.061
 loudly-0.060
 shy-0.059
 interested-0.059
Token is
Token ID318
Feature activation-1.58e-02

Pos logit contributions
aster+0.113
ons+0.100
uch+0.097
kins+0.097
rac+0.095
Neg logit contributions
 loudly-0.086
 indeed-0.083
 curious-0.078
 wondering-0.078
 interested-0.077
Token:
Token ID25
Feature activation-1.15e-02

Pos logit contributions
aster+0.031
ons+0.029
kins+0.029
uch+0.028
rac+0.026
Neg logit contributions
 loudly-0.019
 wondering-0.019
 curious-0.019
 indeed-0.018
 She-0.017
Token You
Token ID921
Feature activation-2.67e-02

Pos logit contributions
aster+0.022
ons+0.021
rac+0.020
uch+0.020
kins+0.019
Neg logit contributions
 loudly-0.013
 pad-0.013
 wondering-0.013
 floppy-0.013
 blonde-0.012
Token."
Token ID526
Feature activation-2.19e-02

Pos logit contributions
aster+0.092
ons+0.087
rac+0.082
uch+0.082
kins+0.082
Neg logit contributions
 loudly-0.092
 wondering-0.092
 curious-0.088
 pad-0.088
 interested-0.087
Token\n
Token ID198
Feature activation-6.33e-02

Pos logit contributions
aster+0.039
uch+0.036
ons+0.036
kins+0.035
rac+0.035
Neg logit contributions
 wondering-0.030
 loudly-0.030
 pad-0.028
 curious-0.028
 She-0.027
Token\n
Token ID198
Feature activation-4.94e-02

Pos logit contributions
uch+0.138
aster+0.132
kins+0.130
aches+0.129
rac+0.127
Neg logit contributions
 loudly-0.061
 She-0.059
 curious-0.058
 wondering-0.058
 curiously-0.053
TokenM
Token ID44
Feature activation-2.54e-02

Pos logit contributions
aster+0.079
uch+0.076
ons+0.074
rac+0.073
kins+0.073
Neg logit contributions
 loudly-0.074
 wondering-0.071
 curious-0.069
 pad-0.069
 indeed-0.068
Tokenoral
Token ID6864
Feature activation-4.51e-02

Pos logit contributions
aster+0.034
ons+0.031
rac+0.030
kins+0.030
uch+0.029
Neg logit contributions
 loudly-0.045
 wondering-0.044
 indeed-0.043
 pad-0.043
 curious-0.042
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
aster+0.083
kins+0.077
ons+0.076
uch+0.076
rac+0.073
Neg logit contributions
 loudly-0.059
 indeed-0.058
 wondering-0.056
 curious-0.055
 interested-0.053
Token the
Token ID262
Feature activation-4.33e-02

Pos logit contributions
uch+0.247
ups+0.231
kins+0.230
aches+0.224
ops+0.221
Neg logit contributions
 indeed-0.106
 wondering-0.105
 three-0.103
 knowing-0.100
 3-0.100
Token story
Token ID1621
Feature activation-5.05e-02

Pos logit contributions
aster+0.065
uch+0.058
ons+0.057
rac+0.057
kins+0.056
Neg logit contributions
 wondering-0.073
 loudly-0.072
 curious-0.071
 indeed-0.070
 floppy-0.068
Token:
Token ID25
Feature activation-2.04e-02

Pos logit contributions
aster+0.097
ons+0.088
uch+0.087
kins+0.085
rac+0.083
Neg logit contributions
 loudly-0.065
 wondering-0.062
 indeed-0.061
 pad-0.060
 curious-0.059
Token Being
Token ID11204
Feature activation-1.94e-02

Pos logit contributions
aster+0.041
ons+0.039
rac+0.037
kins+0.036
uch+0.036
Neg logit contributions
 loudly-0.022
 pad-0.022
 wondering-0.022
 floppy-0.022
 interested-0.020
Token enthusiastic
Token ID21848
Feature activation-4.38e-02

Pos logit contributions
aster+0.040
ons+0.039
rac+0.038
kins+0.037
uch+0.037
Neg logit contributions
 pad-0.019
 loudly-0.019
 wondering-0.018
 floppy-0.017
 indeed-0.017
Token.
Token ID13
Feature activation+1.92e-02

Pos logit contributions
aster+0.037
ons+0.037
kins+0.034
rac+0.033
uch+0.032
Neg logit contributions
 wondering-0.035
 loudly-0.034
 pad-0.034
 floppy-0.034
 indeed-0.033
Token\n
Token ID198
Feature activation-6.39e-02

Pos logit contributions
 loudly+0.021
 wondering+0.020
 pad+0.020
 interested+0.018
 floppy+0.018
Neg logit contributions
aster-0.039
ons-0.037
rac-0.035
kins-0.035
uch-0.034
Token\n
Token ID198
Feature activation-4.92e-02

Pos logit contributions
uch+0.137
kins+0.133
vation+0.130
atmeal+0.128
rac+0.126
Neg logit contributions
 curious-0.069
 loudly-0.068
 She-0.067
 He-0.066
 It-0.064
TokenThe
Token ID464
Feature activation-1.69e-02

Pos logit contributions
uch+0.116
ons+0.115
rac+0.115
kins+0.114
aster+0.112
Neg logit contributions
 loudly-0.033
 curious-0.029
 She-0.027
 pad-0.027
 indeed-0.025
Token moral
Token ID6573
Feature activation-3.74e-02

Pos logit contributions
aster+0.024
ons+0.023
rac+0.022
uch+0.021
kins+0.020
Neg logit contributions
 loudly-0.029
 wondering-0.029
 curious-0.029
 pad-0.028
 detective-0.028
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
aster+0.068
kins+0.065
ons+0.064
uch+0.063
rac+0.062
Neg logit contributions
 loudly-0.048
 wondering-0.046
 indeed-0.046
 curious-0.046
 interested-0.044
Token the
Token ID262
Feature activation-4.42e-02

Pos logit contributions
uch+0.237
vation+0.226
ups+0.226
rac+0.226
bel+0.223
Neg logit contributions
 indeed-0.102
 and-0.100
 very-0.090
 like-0.089
 Will-0.087
Token story
Token ID1621
Feature activation-6.08e-02

Pos logit contributions
aster+0.069
rac+0.066
ons+0.063
uch+0.063
kins+0.062
Neg logit contributions
 curious-0.068
 wondering-0.068
 loudly-0.066
 detective-0.065
 indeed-0.065
Token is
Token ID318
Feature activation-1.63e-02

Pos logit contributions
aster+0.115
ons+0.101
kins+0.101
uch+0.099
at+0.097
Neg logit contributions
 loudly-0.084
 indeed-0.079
 She-0.076
 curious-0.075
 wondering-0.074
Token:
Token ID25
Feature activation-8.02e-03

Pos logit contributions
aster+0.032
kins+0.030
ons+0.030
uch+0.029
vation+0.028
Neg logit contributions
 loudly-0.019
 wondering-0.019
 curious-0.019
 indeed-0.019
 She-0.017
Token Don
Token ID2094
Feature activation-3.99e-02

Pos logit contributions
aster+0.016
ons+0.015
rac+0.015
uch+0.014
kins+0.014
Neg logit contributions
 loudly-0.009
 wondering-0.009
 pad-0.009
 floppy-0.008
 curious-0.008
Token
Token ID220
Feature activation-1.55e-02

Pos logit contributions
 loudly+0.011
 wondering+0.011
 curious+0.011
 pad+0.010
 interested+0.010
Neg logit contributions
aster-0.024
ons-0.023
kins-0.021
rac-0.021
ma-0.021
Token\n
Token ID198
Feature activation-6.31e-02

Pos logit contributions
aster+0.028
ons+0.026
kins+0.025
uch+0.025
rac+0.025
Neg logit contributions
 loudly-0.021
 wondering-0.020
 pad-0.019
 She-0.019
 curious-0.019
Token\n
Token ID198
Feature activation-4.95e-02

Pos logit contributions
uch+0.140
kins+0.134
rac+0.133
aster+0.130
ons+0.125
Neg logit contributions
 She-0.060
 loudly-0.056
 curious-0.050
 wondering-0.050
 He-0.050
TokenM
Token ID44
Feature activation-1.94e-02

Pos logit contributions
kins+0.078
uch+0.078
rac+0.078
aster+0.076
ons+0.075
Neg logit contributions
 loudly-0.072
 indeed-0.069
 wondering-0.065
 pad-0.064
 She-0.064
Tokenoral
Token ID6864
Feature activation-4.27e-02

Pos logit contributions
aster+0.020
ons+0.018
rac+0.017
kins+0.017
uch+0.016
Neg logit contributions
 loudly-0.041
 indeed-0.040
 wondering-0.039
 pad-0.039
 interested-0.038
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
aster+0.079
kins+0.077
ons+0.074
uch+0.072
aches+0.071
Neg logit contributions
 indeed-0.054
 loudly-0.054
 wondering-0.050
 curious-0.050
 She-0.048
Token the
Token ID262
Feature activation-4.80e-02

Pos logit contributions
uch+0.247
kins+0.242
ups+0.234
rac+0.231
aches+0.225
Neg logit contributions
 knowing-0.101
 indeed-0.101
 wondering-0.097
 about-0.097
 three-0.095
Token story
Token ID1621
Feature activation-4.91e-02

Pos logit contributions
aster+0.078
kins+0.072
rac+0.071
uch+0.071
ons+0.070
Neg logit contributions
 wondering-0.072
 indeed-0.071
 curious-0.071
 loudly-0.071
 floppy-0.066
Token:
Token ID25
Feature activation-2.19e-02

Pos logit contributions
aster+0.103
ons+0.094
uch+0.093
kins+0.093
aches+0.089
Neg logit contributions
 loudly-0.055
 indeed-0.052
 She-0.051
 wondering-0.049
aly-0.049
Token It
Token ID632
Feature activation-5.29e-02

Pos logit contributions
aster+0.044
ons+0.042
rac+0.040
uch+0.040
kins+0.040
Neg logit contributions
 loudly-0.024
 wondering-0.024
 pad-0.023
 floppy-0.023
 curious-0.022
Token's
Token ID338
Feature activation-2.25e-03

Pos logit contributions
aster+0.108
ons+0.093
kins+0.093
uch+0.090
rac+0.089
Neg logit contributions
 loudly-0.065
 curious-0.063
 wondering-0.062
 indeed-0.061
 interested-0.060
Token.
Token ID13
Feature activation+3.01e-02

Pos logit contributions
aster+0.069
ons+0.066
uch+0.063
rac+0.063
kins+0.062
Neg logit contributions
 loudly-0.043
 wondering-0.042
 pad-0.041
 curious-0.040
 interested-0.039
Token
Token ID198
Feature activation-5.50e-02

Pos logit contributions
 loudly+0.026
 wondering+0.025
 pad+0.024
 interested+0.022
 floppy+0.022
Neg logit contributions
aster-0.068
ons-0.065
rac-0.063
kins-0.063
uch-0.061
Token
Token ID198
Feature activation-4.12e-02

Pos logit contributions
uch+0.123
kins+0.114
atmeal+0.114
rac+0.113
aster+0.113
Neg logit contributions
 curious-0.056
 loudly-0.055
 She-0.052
 wondering-0.049
 excited-0.049
TokenThe
Token ID464
Feature activation-7.06e-03

Pos logit contributions
uch+0.079
aster+0.078
ons+0.078
at+0.077
rac+0.077
Neg logit contributions
 loudly-0.047
 curious-0.044
 wondering-0.041
 indeed-0.040
 interested-0.040
Token moral
Token ID6573
Feature activation-3.79e-02

Pos logit contributions
aster+0.012
ons+0.012
rac+0.012
uch+0.011
kins+0.011
Neg logit contributions
 loudly-0.010
 curious-0.010
 wondering-0.010
 interested-0.009
 pad-0.009
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
aster+0.066
ons+0.062
kins+0.061
uch+0.060
rac+0.059
Neg logit contributions
 loudly-0.053
 wondering-0.051
 curious-0.050
 indeed-0.050
 pad-0.049
Token the
Token ID262
Feature activation-4.12e-02

Pos logit contributions
uch+0.240
rac+0.227
vation+0.223
aches+0.221
ups+0.219
Neg logit contributions
 indeed-0.102
 baby-0.096
 very-0.095
 three-0.092
 knowing-0.091
Token story
Token ID1621
Feature activation-6.31e-02

Pos logit contributions
aster+0.062
rac+0.059
ons+0.057
kins+0.056
uch+0.056
Neg logit contributions
 curious-0.066
 wondering-0.066
 loudly-0.065
 interested-0.063
 puppy-0.063
Token is
Token ID318
Feature activation-1.73e-02

Pos logit contributions
aster+0.115
ons+0.103
kins+0.100
uch+0.098
rac+0.097
Neg logit contributions
 loudly-0.091
 indeed-0.085
 curious-0.082
 wondering-0.081
 interested-0.080
Token:
Token ID25
Feature activation-9.28e-03

Pos logit contributions
aster+0.033
ons+0.031
kins+0.031
uch+0.029
rac+0.029
Neg logit contributions
 loudly-0.022
 wondering-0.021
 curious-0.021
 indeed-0.021
 interested-0.020
Token Don
Token ID2094
Feature activation-4.16e-02

Pos logit contributions
aster+0.018
ons+0.017
rac+0.016
uch+0.016
kins+0.015
Neg logit contributions
 loudly-0.011
 pad-0.011
 wondering-0.011
 floppy-0.011
 interested-0.010
Token
Token ID220
Feature activation-1.45e-02

Pos logit contributions
aster+0.057
kins+0.052
ons+0.050
rac+0.049
aches+0.048
Neg logit contributions
 pad-0.042
 loudly-0.042
 indeed-0.040
 badly-0.039
aly-0.039
Token
Token ID198
Feature activation-6.26e-02

Pos logit contributions
aster+0.027
ons+0.025
kins+0.024
uch+0.024
aches+0.023
Neg logit contributions
 loudly-0.019
 wondering-0.018
 curious-0.018
 pad-0.018
 interested-0.018
Token
Token ID198
Feature activation-4.77e-02

Pos logit contributions
uch+0.118
aster+0.118
kins+0.116
rac+0.114
ons+0.111
Neg logit contributions
 loudly-0.071
 She-0.068
 curious-0.067
 indeed-0.063
 interested-0.063
TokenM
Token ID44
Feature activation-1.84e-02

Pos logit contributions
aster+0.080
kins+0.080
uch+0.080
ons+0.079
rac+0.079
Neg logit contributions
 loudly-0.064
 indeed-0.061
 wondering-0.057
 curious-0.056
 interested-0.056
Tokenoral
Token ID6864
Feature activation-4.60e-02

Pos logit contributions
aster+0.019
ons+0.018
kins+0.018
rac+0.017
uch+0.016
Neg logit contributions
 loudly-0.036
 indeed-0.036
 pad-0.035
 wondering-0.035
aly-0.034
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
aster+0.086
kins+0.082
ons+0.080
uch+0.078
rac+0.076
Neg logit contributions
 indeed-0.059
 loudly-0.058
 wondering-0.055
 curious-0.054
 interested-0.051
Token the
Token ID262
Feature activation-4.68e-02

Pos logit contributions
uch+0.249
kins+0.240
ups+0.234
vals+0.223
ops+0.222
Neg logit contributions
 indeed-0.106
 about-0.102
 wondering-0.101
 knowing-0.100
 so-0.097
Token story
Token ID1621
Feature activation-4.90e-02

Pos logit contributions
aster+0.076
ons+0.070
kins+0.069
uch+0.068
rac+0.068
Neg logit contributions
 wondering-0.071
 indeed-0.071
 curious-0.071
 loudly-0.070
 interested-0.065
Token:
Token ID25
Feature activation-2.14e-02

Pos logit contributions
aster+0.097
ons+0.090
uch+0.087
kins+0.086
rac+0.083
Neg logit contributions
 loudly-0.060
 indeed-0.057
 wondering-0.056
 pad-0.055
 curious-0.053
Token Always
Token ID16622
Feature activation-4.19e-02

Pos logit contributions
aster+0.043
ons+0.040
rac+0.039
uch+0.038
kins+0.038
Neg logit contributions
 loudly-0.024
 wondering-0.023
 pad-0.023
 floppy-0.023
 curious-0.022
Token examine
Token ID10716
Feature activation-5.12e-02

Pos logit contributions
aster+0.071
ons+0.068
kins+0.064
uch+0.063
rac+0.063
Neg logit contributions
 wondering-0.060
 loudly-0.060
 pad-0.058
 curious-0.057
 interested-0.056
Token."
Token ID526
Feature activation-4.29e-04

Pos logit contributions
aster+0.097
ons+0.095
rac+0.091
uch+0.090
kins+0.088
Neg logit contributions
 loudly-0.055
 wondering-0.052
 curious-0.051
 indeed-0.049
 pad-0.049
Token
Token ID198
Feature activation-5.51e-02

Pos logit contributions
aster+0.001
ons+0.001
uch+0.001
kins+0.001
rac+0.001
Neg logit contributions
 wondering-0.001
 loudly-0.001
 curious-0.001
 pad-0.001
 floppy-0.001
Token
Token ID198
Feature activation-4.15e-02

Pos logit contributions
uch+0.125
aster+0.121
at+0.119
kins+0.119
vation+0.118
Neg logit contributions
 She-0.051
 loudly-0.048
 curious-0.046
 wondering-0.045
 He-0.041
TokenThe
Token ID464
Feature activation-7.22e-03

Pos logit contributions
aster+0.068
uch+0.065
rac+0.065
ons+0.064
kins+0.063
Neg logit contributions
 loudly-0.060
 wondering-0.058
 curious-0.057
 pad-0.055
 She-0.055
Token moral
Token ID6573
Feature activation-3.92e-02

Pos logit contributions
aster+0.011
ons+0.010
uch+0.010
kins+0.010
rac+0.010
Neg logit contributions
 loudly-0.011
 wondering-0.011
 pad-0.011
 curious-0.011
 floppy-0.011
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
aster+0.070
ons+0.066
kins+0.066
uch+0.065
rac+0.064
Neg logit contributions
 loudly-0.053
 wondering-0.050
 curious-0.050
 indeed-0.050
 floppy-0.049
Token the
Token ID262
Feature activation-4.27e-02

Pos logit contributions
uch+0.240
vation+0.238
rac+0.225
ups+0.224
vals+0.224
Neg logit contributions
 indeed-0.104
 and-0.099
 very-0.097
 like-0.093
 so-0.092
Token story
Token ID1621
Feature activation-6.27e-02

Pos logit contributions
aster+0.062
rac+0.057
ons+0.057
kins+0.055
uch+0.054
Neg logit contributions
 loudly-0.072
 curious-0.071
 wondering-0.071
 indeed-0.068
 interested-0.068
Token is
Token ID318
Feature activation-1.67e-02

Pos logit contributions
aster+0.125
ons+0.111
kins+0.110
uch+0.108
at+0.107
Neg logit contributions
 loudly-0.083
 She-0.077
 indeed-0.077
 badly-0.073
 curious-0.072
Token:
Token ID25
Feature activation-1.01e-02

Pos logit contributions
aster+0.032
ons+0.030
kins+0.029
uch+0.029
rac+0.028
Neg logit contributions
 loudly-0.021
 wondering-0.021
 curious-0.020
 indeed-0.020
 pad-0.019
Token Be
Token ID1355
Feature activation-1.59e-02

Pos logit contributions
aster+0.021
ons+0.019
rac+0.019
uch+0.019
kins+0.018
Neg logit contributions
 loudly-0.011
 wondering-0.011
 pad-0.011
 floppy-0.010
 interested-0.010
Token then
Token ID788
Feature activation-5.44e-02

Pos logit contributions
aster+0.169
uch+0.154
ons+0.154
rac+0.153
kins+0.151
Neg logit contributions
 loudly-0.088
 wondering-0.084
 pad-0.084
 indeed-0.080
 curious-0.080
Token good
Token ID922
Feature activation-3.98e-02

Pos logit contributions
aster+0.103
ons+0.097
uch+0.097
rac+0.096
at+0.093
Neg logit contributions
 loudly-0.070
 wondering-0.067
 indeed-0.064
 curious-0.063
 pad-0.061
Token.
Token ID13
Feature activation-1.60e-02

Pos logit contributions
aster+0.068
ons+0.064
uch+0.062
rac+0.061
kins+0.060
Neg logit contributions
 loudly-0.058
 wondering-0.057
 pad-0.055
 curious-0.054
 interested-0.053
Token The
Token ID383
Feature activation-2.04e-02

Pos logit contributions
aster+0.030
ons+0.028
uch+0.028
kins+0.028
aches+0.027
Neg logit contributions
 wondering-0.020
 loudly-0.020
 pad-0.019
 indeed-0.019
 curious-0.019
Token moral
Token ID6573
Feature activation-5.50e-02

Pos logit contributions
aster+0.027
ons+0.025
uch+0.023
kins+0.023
rac+0.023
Neg logit contributions
 loudly-0.037
 wondering-0.037
 pad-0.036
 curious-0.035
 floppy-0.035
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
aster+0.098
kins+0.091
ons+0.090
uch+0.089
rac+0.088
Neg logit contributions
 loudly-0.075
 wondering-0.073
 indeed-0.072
 curious-0.071
 pad-0.069
Token the
Token ID262
Feature activation-4.60e-02

Pos logit contributions
uch+0.219
ups+0.216
bel+0.202
atmeal+0.199
ats+0.199
Neg logit contributions
 wondering-0.129
 indeed-0.123
 knowing-0.116
 three-0.107
 curious-0.106
Token story
Token ID1621
Feature activation-6.04e-02

Pos logit contributions
aster+0.068
ons+0.063
rac+0.061
uch+0.060
kins+0.060
Neg logit contributions
 curious-0.077
 wondering-0.076
 loudly-0.076
 indeed-0.073
 interested-0.073
Token is
Token ID318
Feature activation-1.59e-02

Pos logit contributions
aster+0.102
ons+0.091
uch+0.088
kins+0.087
rac+0.086
Neg logit contributions
 loudly-0.092
 wondering-0.088
 indeed-0.087
 pad-0.086
 curious-0.086
Token:
Token ID25
Feature activation-2.15e-02

Pos logit contributions
aster+0.024
ons+0.022
kins+0.021
uch+0.021
rac+0.020
Neg logit contributions
 loudly-0.026
 wondering-0.026
 curious-0.026
 indeed-0.025
 interested-0.025
Token honesty
Token ID24345
Feature activation-4.74e-02

Pos logit contributions
aster+0.038
ons+0.035
rac+0.034
uch+0.033
kins+0.032
Neg logit contributions
 pad-0.028
 loudly-0.028
 floppy-0.028
 wondering-0.028
 blonde-0.027
Token
Token ID220
Feature activation-1.66e-02

Pos logit contributions
 loudly+0.011
 wondering+0.011
 curious+0.011
 pad+0.011
 interested+0.010
Neg logit contributions
aster-0.015
ons-0.014
kins-0.013
rac-0.013
uch-0.013
Token
Token ID198
Feature activation-6.43e-02

Pos logit contributions
aster+0.030
ons+0.027
kins+0.027
uch+0.027
rac+0.026
Neg logit contributions
 loudly-0.022
 wondering-0.021
 pad-0.021
 curious-0.021
 interested-0.021
Token
Token ID198
Feature activation-4.83e-02

Pos logit contributions
uch+0.129
aster+0.129
kins+0.127
rac+0.124
ons+0.123
Neg logit contributions
 wondering-0.066
 loudly-0.066
 She-0.065
 curious-0.062
 interested-0.061
TokenThe
Token ID464
Feature activation-1.09e-02

Pos logit contributions
aster+0.075
kins+0.074
uch+0.074
ons+0.074
rac+0.073
Neg logit contributions
 loudly-0.072
 wondering-0.070
 indeed-0.070
 pad-0.068
 interested-0.067
Token moral
Token ID6573
Feature activation-5.69e-02

Pos logit contributions
aster+0.019
ons+0.018
uch+0.017
kins+0.017
rac+0.017
Neg logit contributions
 loudly-0.015
 wondering-0.015
 pad-0.014
 curious-0.014
 interested-0.014
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
aster+0.102
kins+0.094
ons+0.093
uch+0.093
rac+0.090
Neg logit contributions
 loudly-0.077
 wondering-0.075
 indeed-0.074
 curious-0.072
 pad-0.071
Token this
Token ID428
Feature activation-6.25e-02

Pos logit contributions
uch+0.221
ups+0.212
bel+0.211
kins+0.204
vation+0.203
Neg logit contributions
 indeed-0.121
 wondering-0.116
 knowing-0.106
 and-0.106
 about-0.105
Token story
Token ID1621
Feature activation-6.22e-02

Pos logit contributions
aster+0.097
ons+0.085
rac+0.084
kins+0.083
uch+0.082
Neg logit contributions
 loudly-0.103
 indeed-0.102
 wondering-0.099
 curious-0.097
 interested-0.094
Token is
Token ID318
Feature activation-1.05e-02

Pos logit contributions
aster+0.103
ons+0.093
uch+0.089
kins+0.087
rac+0.087
Neg logit contributions
 loudly-0.096
 wondering-0.093
 indeed-0.091
 pad-0.091
 curious-0.090
Token that
Token ID326
Feature activation-3.63e-02

Pos logit contributions
aster+0.017
ons+0.016
kins+0.015
uch+0.015
rac+0.015
Neg logit contributions
 loudly-0.016
 wondering-0.016
 curious-0.015
 indeed-0.015
 interested-0.015
Token it
Token ID340
Feature activation-4.34e-02

Pos logit contributions
aster+0.062
ons+0.059
kins+0.056
uch+0.055
rac+0.055
Neg logit contributions
 loudly-0.051
 wondering-0.051
 pad-0.049
 curious-0.048
 floppy-0.048
Token
Token ID220
Feature activation-1.35e-02

Pos logit contributions
aster+0.020
uch+0.019
kins+0.018
aches+0.018
ons+0.018
Neg logit contributions
 She-0.012
 wondering-0.012
 indeed-0.011
 loudly-0.011
 pad-0.011
Token
Token ID198
Feature activation-6.17e-02

Pos logit contributions
aster+0.026
uch+0.023
kins+0.023
ons+0.023
aches+0.022
Neg logit contributions
 loudly-0.016
 She-0.016
 wondering-0.016
 curious-0.016
 pad-0.016
Token
Token ID198
Feature activation-4.95e-02

Pos logit contributions
uch+0.144
rac+0.132
kins+0.131
aster+0.129
aches+0.129
Neg logit contributions
 She-0.054
 It-0.049
 loudly-0.047
 curious-0.045
 He-0.045
TokenThe
Token ID464
Feature activation-1.88e-02

Pos logit contributions
uch+0.090
kins+0.087
rac+0.086
aster+0.083
ats+0.081
Neg logit contributions
 indeed-0.061
 loudly-0.060
 She-0.054
 wondering-0.051
 curious-0.051
Token moral
Token ID6573
Feature activation-3.98e-02

Pos logit contributions
aster+0.024
ons+0.023
rac+0.022
uch+0.021
kins+0.021
Neg logit contributions
 curious-0.035
 loudly-0.034
 wondering-0.034
 indeed-0.034
 interested-0.033
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.072
kins+0.067
ons+0.066
uch+0.065
rac+0.064
Neg logit contributions
 loudly-0.053
 indeed-0.051
 wondering-0.051
 curious-0.050
 interested-0.049
Token the
Token ID262
Feature activation-4.48e-02

Pos logit contributions
uch+0.242
ups+0.230
vals+0.226
rac+0.225
aches+0.223
Neg logit contributions
 indeed-0.107
 wondering-0.097
 knowing-0.097
 so-0.096
 and-0.094
Token story
Token ID1621
Feature activation-5.96e-02

Pos logit contributions
aster+0.077
rac+0.069
kins+0.068
ons+0.068
uch+0.068
Neg logit contributions
 curious-0.066
 wondering-0.065
 indeed-0.065
 loudly-0.062
 interested-0.060
Token is
Token ID318
Feature activation-7.03e-03

Pos logit contributions
aster+0.101
ons+0.089
uch+0.087
kins+0.087
rac+0.085
Neg logit contributions
 loudly-0.091
 wondering-0.087
 indeed-0.087
 pad-0.084
 curious-0.084
Token that
Token ID326
Feature activation-3.28e-02

Pos logit contributions
aster+0.011
ons+0.010
kins+0.010
uch+0.010
rac+0.010
Neg logit contributions
 wondering-0.011
 loudly-0.011
 indeed-0.011
 curious-0.011
 interested-0.010
Token it
Token ID340
Feature activation-4.23e-02

Pos logit contributions
aster+0.057
ons+0.055
kins+0.051
rac+0.051
uch+0.051
Neg logit contributions
 pad-0.044
 loudly-0.044
 wondering-0.043
 floppy-0.042
aly-0.041
Token
Token ID220
Feature activation-1.85e-02

Pos logit contributions
aster+0.001
ons+0.001
uch+0.001
kins+0.001
rac+0.001
Neg logit contributions
 wondering-0.001
 loudly-0.001
 pad-0.000
 curious-0.000
 indeed-0.000
Token
Token ID198
Feature activation-6.63e-02

Pos logit contributions
aster+0.036
kins+0.033
ons+0.032
uch+0.032
aches+0.032
Neg logit contributions
 loudly-0.022
 She-0.021
 curious-0.021
 wondering-0.021
 interested-0.021
Token
Token ID198
Feature activation-4.81e-02

Pos logit contributions
uch+0.145
aster+0.143
rac+0.135
kins+0.135
ons+0.130
Neg logit contributions
 She-0.062
 loudly-0.062
 curious-0.059
 wondering-0.057
 interested-0.056
TokenM
Token ID44
Feature activation-1.35e-02

Pos logit contributions
aster+0.080
uch+0.078
kins+0.076
rac+0.076
ons+0.075
Neg logit contributions
 loudly-0.067
 indeed-0.064
 interested-0.059
 wondering-0.059
 curious-0.059
Tokenoral
Token ID6864
Feature activation-4.52e-02

Pos logit contributions
aster+0.013
ons+0.012
rac+0.011
kins+0.011
uch+0.011
Neg logit contributions
 loudly-0.028
 pad-0.028
 indeed-0.028
 wondering-0.027
 interested-0.027
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.089
kins+0.083
ons+0.081
aches+0.080
uch+0.080
Neg logit contributions
 indeed-0.055
 loudly-0.053
 curious-0.050
 wondering-0.049
 She-0.049
Token the
Token ID262
Feature activation-4.32e-02

Pos logit contributions
uch+0.250
kins+0.240
ups+0.237
aches+0.233
ops+0.230
Neg logit contributions
 indeed-0.101
 knowing-0.099
 three-0.096
 3-0.094
 very-0.093
Token story
Token ID1621
Feature activation-5.48e-02

Pos logit contributions
aster+0.070
uch+0.060
rac+0.060
ons+0.059
kins+0.059
Neg logit contributions
 wondering-0.068
 loudly-0.068
 curious-0.068
 indeed-0.066
 floppy-0.065
Token:
Token ID25
Feature activation-2.09e-02

Pos logit contributions
aster+0.114
ons+0.103
uch+0.102
kins+0.100
aches+0.098
Neg logit contributions
 loudly-0.063
 indeed-0.061
 She-0.058
 wondering-0.058
 pad-0.057
Token It
Token ID632
Feature activation-5.32e-02

Pos logit contributions
aster+0.042
ons+0.040
rac+0.038
uch+0.038
kins+0.038
Neg logit contributions
 loudly-0.023
 wondering-0.023
 pad-0.022
 floppy-0.022
 curious-0.021
Token is
Token ID318
Feature activation+9.23e-03

Pos logit contributions
aster+0.104
ons+0.090
kins+0.088
uch+0.087
rac+0.086
Neg logit contributions
 loudly-0.069
 curious-0.068
 wondering-0.068
 indeed-0.066
 interested-0.065
Token
Token ID220
Feature activation-1.62e-02

Pos logit contributions
 wondering+0.003
 loudly+0.003
 pad+0.003
 She+0.003
 curious+0.002
Neg logit contributions
aster-0.007
kins-0.007
ons-0.007
uch-0.006
rac-0.006
Token
Token ID198
Feature activation-6.33e-02

Pos logit contributions
aster+0.030
kins+0.028
ons+0.028
uch+0.027
rac+0.027
Neg logit contributions
 loudly-0.020
 She-0.020
 wondering-0.020
 curious-0.019
 pad-0.019
Token
Token ID198
Feature activation-4.76e-02

Pos logit contributions
kins+0.148
uch+0.144
aster+0.137
ons+0.136
rac+0.136
Neg logit contributions
 She-0.058
 loudly-0.049
 curious-0.049
 wondering-0.047
 He-0.045
TokenM
Token ID44
Feature activation-1.10e-02

Pos logit contributions
kins+0.084
ons+0.078
uch+0.075
rac+0.074
aster+0.074
Neg logit contributions
 loudly-0.066
 She-0.062
 indeed-0.061
 curious-0.060
 badly-0.059
Tokenoral
Token ID6864
Feature activation-4.57e-02

Pos logit contributions
ons+0.013
kins+0.013
aster+0.013
rac+0.012
 halftime+0.012
Neg logit contributions
 loudly-0.020
 indeed-0.019
aly-0.019
 wondering-0.018
 badly-0.018
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.085
kins+0.082
ons+0.079
uch+0.077
rac+0.076
Neg logit contributions
 loudly-0.059
 indeed-0.058
 wondering-0.055
 curious-0.055
 She-0.053
Token the
Token ID262
Feature activation-4.13e-02

Pos logit contributions
uch+0.254
kins+0.250
ups+0.239
rac+0.234
ops+0.234
Neg logit contributions
 indeed-0.098
 knowing-0.098
 about-0.094
 wondering-0.092
 very-0.092
Token story
Token ID1621
Feature activation-5.15e-02

Pos logit contributions
aster+0.067
kins+0.066
rac+0.064
ons+0.062
uch+0.061
Neg logit contributions
 curious-0.060
 indeed-0.060
 wondering-0.059
 loudly-0.059
 puppy-0.056
Token:
Token ID25
Feature activation-2.24e-02

Pos logit contributions
aster+0.106
ons+0.096
uch+0.095
kins+0.095
aches+0.090
Neg logit contributions
 loudly-0.061
 indeed-0.057
 She-0.056
 wondering-0.055
 pad-0.054
Token Even
Token ID3412
Feature activation-3.57e-02

Pos logit contributions
aster+0.045
ons+0.043
rac+0.041
uch+0.040
kins+0.040
Neg logit contributions
 loudly-0.025
 wondering-0.025
 pad-0.024
 curious-0.023
 floppy-0.023
Token if
Token ID611
Feature activation-5.35e-02

Pos logit contributions
aster+0.072
ons+0.068
rac+0.065
kins+0.065
uch+0.065
Neg logit contributions
 loudly-0.039
 wondering-0.039
 pad-0.038
 curious-0.036
 floppy-0.036
Token.
Token ID13
Feature activation-2.29e-03

Pos logit contributions
aster+0.013
ons+0.013
kins+0.012
rac+0.012
uch+0.012
Neg logit contributions
 wondering-0.008
 loudly-0.008
 pad-0.008
 floppy-0.008
 curious-0.008
Token
Token ID198
Feature activation-6.47e-02

Pos logit contributions
aster+0.004
ons+0.004
rac+0.004
kins+0.004
ma+0.004
Neg logit contributions
 loudly-0.003
 wondering-0.003
 pad-0.003
 interested-0.003
 floppy-0.003
Token
Token ID198
Feature activation-4.99e-02

Pos logit contributions
uch+0.140
kins+0.135
aster+0.133
aches+0.128
atmeal+0.127
Neg logit contributions
 She-0.064
 loudly-0.062
 wondering-0.061
 curious-0.059
 It-0.055
TokenM
Token ID44
Feature activation-1.98e-02

Pos logit contributions
aster+0.084
uch+0.081
kins+0.080
ons+0.078
rac+0.077
Neg logit contributions
 loudly-0.071
 wondering-0.067
 indeed-0.065
 pad-0.065
 curious-0.064
Tokenoral
Token ID6864
Feature activation-3.90e-02

Pos logit contributions
aster+0.022
ons+0.020
kins+0.019
uch+0.018
aches+0.018
Neg logit contributions
 loudly-0.039
 indeed-0.039
 wondering-0.038
 pad-0.038
aly-0.037
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.069
ons+0.065
kins+0.063
uch+0.062
rac+0.062
Neg logit contributions
 loudly-0.054
 wondering-0.052
 indeed-0.051
 curious-0.051
 pad-0.049
Token the
Token ID262
Feature activation-4.27e-02

Pos logit contributions
kins+0.240
uch+0.237
aches+0.230
ups+0.226
mons+0.222
Neg logit contributions
 knowing-0.113
 wondering-0.111
 indeed-0.110
 3-0.104
 three-0.103
Token story
Token ID1621
Feature activation-4.88e-02

Pos logit contributions
aster+0.066
kins+0.059
uch+0.058
ons+0.058
rac+0.057
Neg logit contributions
 wondering-0.070
 loudly-0.068
 curious-0.068
 indeed-0.067
 floppy-0.065
Token:
Token ID25
Feature activation-1.93e-02

Pos logit contributions
aster+0.097
ons+0.089
kins+0.087
uch+0.087
aches+0.083
Neg logit contributions
 loudly-0.060
 indeed-0.057
 wondering-0.056
 She-0.054
 pad-0.054
Token It
Token ID632
Feature activation-5.14e-02

Pos logit contributions
aster+0.038
ons+0.036
rac+0.034
uch+0.033
kins+0.033
Neg logit contributions
 loudly-0.022
 pad-0.022
 floppy-0.022
 wondering-0.021
 interested-0.020
Token is
Token ID318
Feature activation+9.77e-03

Pos logit contributions
aster+0.105
ons+0.090
kins+0.089
rac+0.087
uch+0.087
Neg logit contributions
 loudly-0.063
 curious-0.062
 wondering-0.061
 indeed-0.060
 interested-0.059
Token.
Token ID13
Feature activation+1.59e-02

Pos logit contributions
aster+0.014
ons+0.014
kins+0.014
uch+0.012
rac+0.012
Neg logit contributions
 wondering-0.013
 loudly-0.013
 pad-0.013
 floppy-0.013
aly-0.013
Token
Token ID198
Feature activation-6.06e-02

Pos logit contributions
 loudly+0.015
 wondering+0.015
 pad+0.014
 curious+0.013
 interested+0.013
Neg logit contributions
aster-0.035
ons-0.033
rac-0.032
kins-0.032
uch-0.032
Token
Token ID198
Feature activation-4.83e-02

Pos logit contributions
uch+0.146
atmeal+0.138
kins+0.136
aster+0.136
at+0.135
Neg logit contributions
 loudly-0.049
 curious-0.046
 She-0.045
 excited-0.042
 It-0.041
TokenThe
Token ID464
Feature activation-1.19e-02

Pos logit contributions
ons+0.094
at+0.094
aster+0.093
uch+0.092
atmeal+0.092
Neg logit contributions
 loudly-0.053
 curious-0.048
 louder-0.046
 indeed-0.045
 pad-0.044
Token moral
Token ID6573
Feature activation-3.73e-02

Pos logit contributions
aster+0.024
ons+0.023
rac+0.022
kins+0.022
uch+0.021
Neg logit contributions
 curious-0.014
 loudly-0.013
 wondering-0.013
 indeed-0.013
 puppy-0.013
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.069
kins+0.066
ons+0.065
uch+0.063
rac+0.062
Neg logit contributions
 loudly-0.047
 indeed-0.045
 curious-0.045
 wondering-0.044
 interested-0.043
Token the
Token ID262
Feature activation-4.28e-02

Pos logit contributions
uch+0.249
ups+0.238
vation+0.237
rac+0.234
aches+0.233
Neg logit contributions
 very-0.091
 indeed-0.090
 and-0.088
 three-0.084
 baby-0.084
Token story
Token ID1621
Feature activation-6.18e-02

Pos logit contributions
aster+0.071
rac+0.065
kins+0.062
ons+0.061
uch+0.059
Neg logit contributions
 curious-0.065
 wondering-0.064
 indeed-0.064
 loudly-0.062
 interested-0.061
Token is
Token ID318
Feature activation-1.53e-02

Pos logit contributions
aster+0.116
ons+0.103
kins+0.101
uch+0.100
rac+0.098
Neg logit contributions
 loudly-0.085
 indeed-0.081
 curious-0.077
 She-0.076
 interested-0.075
Token:
Token ID25
Feature activation-1.44e-02

Pos logit contributions
aster+0.032
kins+0.030
ons+0.030
uch+0.029
vation+0.028
Neg logit contributions
 loudly-0.016
 indeed-0.016
 curious-0.016
 wondering-0.016
 She-0.015
Token Sharing
Token ID36644
Feature activation-4.32e-02

Pos logit contributions
aster+0.028
ons+0.026
rac+0.026
uch+0.025
kins+0.025
Neg logit contributions
 loudly-0.017
 wondering-0.016
 pad-0.016
 floppy-0.016
 interested-0.015
Token.
Token ID13
Feature activation+1.52e-02

Pos logit contributions
aster+0.008
ons+0.007
uch+0.007
rac+0.007
kins+0.007
Neg logit contributions
 loudly-0.005
 wondering-0.004
 curious-0.004
 pad-0.004
 indeed-0.004
Token
Token ID198
Feature activation-6.52e-02

Pos logit contributions
 wondering+0.014
 loudly+0.014
 curious+0.014
 She+0.013
 pad+0.013
Neg logit contributions
aster-0.034
ons-0.032
kins-0.032
uch-0.031
aches-0.031
Token
Token ID198
Feature activation-4.93e-02

Pos logit contributions
uch+0.136
kins+0.131
rac+0.126
atmeal+0.126
vation+0.125
Neg logit contributions
 She-0.076
 curious-0.075
 loudly-0.072
 wondering-0.069
 It-0.067
TokenThe
Token ID464
Feature activation-1.39e-02

Pos logit contributions
uch+0.097
aster+0.095
aro+0.093
ons+0.093
at+0.093
Neg logit contributions
 She-0.052
 loudly-0.052
 curious-0.052
 wondering-0.049
 interested-0.046
Token moral
Token ID6573
Feature activation-3.99e-02

Pos logit contributions
aster+0.017
ons+0.016
rac+0.015
uch+0.014
kins+0.014
Neg logit contributions
 loudly-0.027
 wondering-0.027
 pad-0.026
 curious-0.026
 indeed-0.026
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.071
kins+0.068
ons+0.067
uch+0.066
rac+0.065
Neg logit contributions
 loudly-0.053
 wondering-0.050
 curious-0.050
 indeed-0.050
 interested-0.048
Token the
Token ID262
Feature activation-4.58e-02

Pos logit contributions
uch+0.249
rac+0.235
ups+0.233
vation+0.228
aches+0.227
Neg logit contributions
 indeed-0.094
 and-0.093
 very-0.088
 like-0.083
 She-0.081
Token story
Token ID1621
Feature activation-6.60e-02

Pos logit contributions
aster+0.072
rac+0.067
ons+0.064
uch+0.064
kins+0.062
Neg logit contributions
 curious-0.071
 wondering-0.071
 loudly-0.069
 indeed-0.069
 interested-0.068
Token is
Token ID318
Feature activation-2.07e-02

Pos logit contributions
aster+0.128
ons+0.115
kins+0.113
uch+0.111
at+0.110
Neg logit contributions
 loudly-0.088
 indeed-0.081
 She-0.079
 curious-0.076
 badly-0.075
Token:
Token ID25
Feature activation-1.21e-02

Pos logit contributions
aster+0.039
kins+0.037
ons+0.037
uch+0.036
rac+0.035
Neg logit contributions
 loudly-0.025
 wondering-0.025
 curious-0.025
 indeed-0.024
 She-0.023
Token Be
Token ID1355
Feature activation-1.39e-02

Pos logit contributions
aster+0.024
ons+0.022
rac+0.021
uch+0.021
kins+0.021
Neg logit contributions
 loudly-0.014
 wondering-0.014
 pad-0.014
 floppy-0.014
 interested-0.013
Token
Token ID220
Feature activation-9.33e-03

Pos logit contributions
 loudly+0.010
 wondering+0.010
 pad+0.009
 curious+0.009
 floppy+0.009
Neg logit contributions
aster-0.013
ons-0.012
rac-0.011
kins-0.011
uch-0.011
Token
Token ID198
Feature activation-5.58e-02

Pos logit contributions
aster+0.018
ons+0.017
uch+0.017
kins+0.016
rac+0.016
Neg logit contributions
 loudly-0.011
 wondering-0.011
 She-0.011
 curious-0.011
 pad-0.010
Token
Token ID198
Feature activation-4.11e-02

Pos logit contributions
uch+0.118
kins+0.118
aster+0.117
ons+0.115
rac+0.115
Neg logit contributions
 loudly-0.052
 She-0.050
 wondering-0.050
 curious-0.049
 He-0.046
TokenM
Token ID44
Feature activation-1.30e-02

Pos logit contributions
aster+0.062
ons+0.062
kins+0.061
rac+0.061
uch+0.059
Neg logit contributions
 loudly-0.064
 wondering-0.060
 pad-0.059
 curious-0.059
 indeed-0.059
Tokenoral
Token ID6864
Feature activation-4.87e-02

Pos logit contributions
aster+0.019
ons+0.018
kins+0.018
rac+0.017
uch+0.016
Neg logit contributions
 loudly-0.021
 indeed-0.021
 wondering-0.020
 pad-0.020
 curious-0.019
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.089
kins+0.086
ons+0.086
uch+0.082
rac+0.081
Neg logit contributions
 loudly-0.063
 indeed-0.062
 wondering-0.059
 curious-0.059
 She-0.056
Token the
Token ID262
Feature activation-4.90e-02

Pos logit contributions
uch+0.245
kins+0.239
ups+0.224
rac+0.223
aches+0.221
Neg logit contributions
 wondering-0.106
 indeed-0.105
 about-0.102
 three-0.101
 3-0.099
Token story
Token ID1621
Feature activation-5.25e-02

Pos logit contributions
aster+0.076
kins+0.070
ons+0.070
uch+0.067
rac+0.067
Neg logit contributions
 wondering-0.080
 loudly-0.079
 curious-0.078
 indeed-0.077
 floppy-0.073
Token:
Token ID25
Feature activation-2.49e-02

Pos logit contributions
aster+0.111
ons+0.103
uch+0.101
kins+0.099
aches+0.095
Neg logit contributions
 loudly-0.059
 indeed-0.055
 She-0.053
 wondering-0.053
aly-0.052
Token We
Token ID775
Feature activation-2.82e-02

Pos logit contributions
aster+0.051
ons+0.049
rac+0.046
uch+0.046
kins+0.046
Neg logit contributions
 loudly-0.026
 wondering-0.026
 pad-0.025
 floppy-0.025
 curious-0.024
Token should
Token ID815
Feature activation-1.51e-02

Pos logit contributions
aster+0.056
ons+0.053
kins+0.050
rac+0.050
uch+0.050
Neg logit contributions
 loudly-0.033
 wondering-0.032
 pad-0.031
 curious-0.031
 interested-0.030
Token.
Token ID13
Feature activation-4.40e-03

Pos logit contributions
aster+0.042
ons+0.040
uch+0.038
kins+0.038
rac+0.038
Neg logit contributions
 loudly-0.027
 wondering-0.026
 pad-0.026
 curious-0.025
 floppy-0.025
Token
Token ID198
Feature activation-6.10e-02

Pos logit contributions
aster+0.008
ons+0.008
uch+0.008
kins+0.007
aches+0.007
Neg logit contributions
 wondering-0.006
 loudly-0.006
 pad-0.006
 She-0.006
 curious-0.005
Token
Token ID198
Feature activation-4.84e-02

Pos logit contributions
uch+0.132
vals+0.121
kins+0.120
rac+0.118
aster+0.118
Neg logit contributions
 She-0.065
 curious-0.062
 loudly-0.061
 excited-0.059
 It-0.059
TokenThe
Token ID464
Feature activation-1.41e-02

Pos logit contributions
uch+0.090
ons+0.088
aster+0.088
lo+0.085
kins+0.085
Neg logit contributions
 loudly-0.057
 indeed-0.054
 She-0.051
 curious-0.051
 happily-0.049
Token moral
Token ID6573
Feature activation-3.88e-02

Pos logit contributions
aster+0.018
ons+0.016
rac+0.015
kins+0.015
uch+0.015
Neg logit contributions
 curious-0.028
 loudly-0.027
 wondering-0.027
 pad-0.027
 floppy-0.026
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.069
kins+0.066
ons+0.064
uch+0.062
rac+0.062
Neg logit contributions
 loudly-0.052
 wondering-0.051
 indeed-0.050
 curious-0.049
 pad-0.049
Token the
Token ID262
Feature activation-4.47e-02

Pos logit contributions
uch+0.220
ups+0.214
rac+0.207
ops+0.207
bel+0.205
Neg logit contributions
 indeed-0.117
 wondering-0.107
 very-0.104
 knowing-0.103
 so-0.101
Token story
Token ID1621
Feature activation-6.10e-02

Pos logit contributions
aster+0.067
rac+0.061
ons+0.061
uch+0.059
kins+0.058
Neg logit contributions
 wondering-0.074
 curious-0.074
 loudly-0.073
 indeed-0.071
 interested-0.070
Token is
Token ID318
Feature activation-1.25e-02

Pos logit contributions
aster+0.103
ons+0.092
kins+0.089
uch+0.088
rac+0.086
Neg logit contributions
 loudly-0.094
 wondering-0.089
 indeed-0.089
 curious-0.087
 pad-0.086
Token that
Token ID326
Feature activation-3.32e-02

Pos logit contributions
aster+0.020
ons+0.019
kins+0.019
uch+0.018
rac+0.017
Neg logit contributions
 loudly-0.020
 wondering-0.020
 indeed-0.019
 curious-0.019
 interested-0.018
Token hugs
Token ID40657
Feature activation-1.74e-02

Pos logit contributions
aster+0.057
ons+0.055
kins+0.051
uch+0.051
rac+0.051
Neg logit contributions
 pad-0.045
 loudly-0.044
 wondering-0.044
 floppy-0.043
 blonde-0.042
Token.
Token ID13
Feature activation+2.79e-03

Pos logit contributions
aster+0.019
kins+0.019
ons+0.018
rac+0.016
ma+0.016
Neg logit contributions
 detective-0.036
 blonde-0.034
 pad-0.034
 floppy-0.033
 personality-0.032
Token\n
Token ID198
Feature activation-5.82e-02

Pos logit contributions
 loudly+0.004
 wondering+0.004
 pad+0.003
 curious+0.003
 floppy+0.003
Neg logit contributions
aster-0.005
ons-0.005
uch-0.005
kins-0.005
rac-0.005
Token\n
Token ID198
Feature activation-4.11e-02

Pos logit contributions
uch+0.118
kins+0.114
aster+0.110
rac+0.110
atmeal+0.106
Neg logit contributions
 loudly-0.064
 curious-0.061
 She-0.060
 wondering-0.060
 interested-0.058
TokenM
Token ID44
Feature activation-7.17e-03

Pos logit contributions
aster+0.064
kins+0.062
ons+0.062
uch+0.062
rac+0.060
Neg logit contributions
 loudly-0.062
 indeed-0.057
 pad-0.057
 wondering-0.057
 interested-0.056
Tokenoral
Token ID6864
Feature activation-4.17e-02

Pos logit contributions
aster+0.007
ons+0.006
rac+0.006
kins+0.006
uch+0.006
Neg logit contributions
 loudly-0.015
 indeed-0.015
 pad-0.015
 wondering-0.014
 badly-0.014
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.078
kins+0.075
ons+0.073
uch+0.072
aches+0.070
Neg logit contributions
 indeed-0.053
 loudly-0.052
 wondering-0.049
 curious-0.049
 She-0.047
Token the
Token ID262
Feature activation-3.84e-02

Pos logit contributions
uch+0.245
kins+0.240
ups+0.239
aches+0.226
ops+0.226
Neg logit contributions
 indeed-0.095
 three-0.092
 knowing-0.091
 Will-0.090
 3-0.090
Token story
Token ID1621
Feature activation-4.80e-02

Pos logit contributions
aster+0.061
kins+0.055
uch+0.054
rac+0.054
ons+0.054
Neg logit contributions
 wondering-0.060
 loudly-0.059
 indeed-0.059
 curious-0.058
 floppy-0.056
Token:
Token ID25
Feature activation-2.05e-02

Pos logit contributions
aster+0.098
ons+0.089
uch+0.088
kins+0.087
aches+0.084
Neg logit contributions
 loudly-0.057
 indeed-0.054
 wondering-0.052
 She-0.051
 pad-0.051
Token\n
Token ID198
Feature activation-6.34e-02

Pos logit contributions
aster+0.041
ons+0.039
rac+0.037
uch+0.036
kins+0.036
Neg logit contributions
 loudly-0.023
 wondering-0.022
 pad-0.022
 floppy-0.022
 interested-0.021
TokenBe
Token ID3856
Feature activation-8.64e-03

Pos logit contributions
kins+0.124
uch+0.123
atmeal+0.118
aster+0.116
aches+0.116
Neg logit contributions
 She-0.064
 indeed-0.063
 It-0.062
 curious-0.062
 very-0.061
Token.
Token ID13
Feature activation+2.20e-03

Pos logit contributions
aster+0.017
ons+0.016
kins+0.016
uch+0.015
rac+0.015
Neg logit contributions
 loudly-0.013
 wondering-0.013
 pad-0.013
 floppy-0.012
 curious-0.012
Token\n
Token ID198
Feature activation-6.12e-02

Pos logit contributions
 loudly+0.003
 wondering+0.003
 pad+0.003
 curious+0.003
 floppy+0.003
Neg logit contributions
aster-0.004
ons-0.004
rac-0.004
kins-0.004
uch-0.004
Token\n
Token ID198
Feature activation-4.48e-02

Pos logit contributions
uch+0.128
kins+0.125
aster+0.123
rac+0.123
aches+0.118
Neg logit contributions
 loudly-0.061
 wondering-0.059
 curious-0.057
 She-0.056
 interested-0.053
TokenThe
Token ID464
Feature activation-6.86e-03

Pos logit contributions
aster+0.075
ons+0.071
kins+0.071
uch+0.071
rac+0.071
Neg logit contributions
 loudly-0.064
 wondering-0.061
 indeed-0.059
 curious-0.059
 pad-0.058
Token moral
Token ID6573
Feature activation-4.32e-02

Pos logit contributions
aster+0.008
ons+0.008
rac+0.007
kins+0.007
uch+0.007
Neg logit contributions
 loudly-0.013
 wondering-0.013
 curious-0.013
 pad-0.013
 floppy-0.013
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
aster+0.075
ons+0.070
kins+0.069
uch+0.068
rac+0.068
Neg logit contributions
 loudly-0.060
 wondering-0.059
 indeed-0.057
 curious-0.057
 pad-0.056
Token this
Token ID428
Feature activation-5.70e-02

Pos logit contributions
uch+0.223
ups+0.222
aches+0.214
kins+0.213
bel+0.212
Neg logit contributions
 indeed-0.111
 wondering-0.110
 and-0.104
 knowing-0.102
 so-0.100
Token story
Token ID1621
Feature activation-5.77e-02

Pos logit contributions
aster+0.080
ons+0.071
rac+0.070
kins+0.069
uch+0.067
Neg logit contributions
 loudly-0.101
 indeed-0.098
 wondering-0.098
 curious-0.096
 interested-0.093
Token is
Token ID318
Feature activation-7.69e-03

Pos logit contributions
aster+0.096
ons+0.086
uch+0.082
kins+0.081
rac+0.081
Neg logit contributions
 loudly-0.089
 wondering-0.085
 indeed-0.085
 pad-0.083
 curious-0.083
Token that
Token ID326
Feature activation-3.43e-02

Pos logit contributions
aster+0.012
ons+0.011
kins+0.011
uch+0.011
rac+0.010
Neg logit contributions
 loudly-0.012
 wondering-0.012
 indeed-0.012
 curious-0.012
 interested-0.011
Token it
Token ID340
Feature activation-4.07e-02

Pos logit contributions
aster+0.058
ons+0.056
kins+0.052
uch+0.052
rac+0.052
Neg logit contributions
 loudly-0.048
 wondering-0.048
 pad-0.047
 floppy-0.046
 curious-0.045
Token"
Token ID1
Feature activation-1.76e-02

Pos logit contributions
aster+0.061
ons+0.057
uch+0.054
kins+0.054
rac+0.054
Neg logit contributions
 loudly-0.048
 wondering-0.048
 pad-0.046
 curious-0.045
 floppy-0.045
TokenHello
Token ID15496
Feature activation+3.50e-03

Pos logit contributions
aster+0.034
ons+0.032
kins+0.031
uch+0.031
rac+0.031
Neg logit contributions
 loudly-0.021
 wondering-0.021
 pad-0.020
 curious-0.020
 floppy-0.020
Token,
Token ID11
Feature activation+3.13e-02

Pos logit contributions
 wondering+0.005
 pad+0.005
 loudly+0.005
 floppy+0.004
 interested+0.004
Neg logit contributions
aster-0.006
ons-0.006
kins-0.006
rac-0.006
ma-0.006
Token little
Token ID1310
Feature activation+1.22e-02

Pos logit contributions
 pad+0.047
 loudly+0.046
 wondering+0.045
 floppy+0.045
 blonde+0.044
Neg logit contributions
aster-0.050
ons-0.046
kins-0.046
rac-0.045
ma-0.045
Token ones
Token ID3392
Feature activation-4.11e-03

Pos logit contributions
 pad+0.017
 wondering+0.017
 loudly+0.017
 floppy+0.016
 interested+0.016
Neg logit contributions
aster-0.020
kins-0.020
ma-0.019
ons-0.018
 halftime-0.018
Token.
Token ID13
Feature activation+6.32e-02

Pos logit contributions
aster+0.006
ons+0.006
kins+0.006
ma+0.006
uch+0.006
Neg logit contributions
 wondering-0.006
 loudly-0.006
 pad-0.006
 floppy-0.006
 blonde-0.006
Token I
Token ID314
Feature activation+1.66e-03

Pos logit contributions
 loudly+0.084
 pad+0.076
 wondering+0.076
 interested+0.074
 badly+0.074
Neg logit contributions
aster-0.116
rac-0.104
ons-0.103
uch-0.097
ma-0.096
Token come
Token ID1282
Feature activation+1.72e-02

Pos logit contributions
 loudly+0.003
 wondering+0.003
 pad+0.002
 curious+0.002
 floppy+0.002
Neg logit contributions
aster-0.003
ons-0.002
uch-0.002
kins-0.002
rac-0.002
Token in
Token ID287
Feature activation-1.89e-02

Pos logit contributions
 blonde+0.027
 floppy+0.026
 loudly+0.025
 wondering+0.025
 pad+0.025
Neg logit contributions
aster-0.026
ma-0.025
rac-0.024
ons-0.023
ossible-0.023
Token peace
Token ID4167
Feature activation-9.31e-03

Pos logit contributions
aster+0.040
ons+0.038
kins+0.036
uch+0.036
rac+0.036
Neg logit contributions
 loudly-0.019
 wondering-0.019
 pad-0.018
 curious-0.018
 floppy-0.018
Token.
Token ID13
Feature activation+5.25e-02

Pos logit contributions
aster+0.018
ons+0.017
kins+0.016
rac+0.016
uch+0.016
Neg logit contributions
 wondering-0.011
 loudly-0.011
 pad-0.010
 floppy-0.010
 interested-0.010
Token things
Token ID1243
Feature activation+1.43e-02

Pos logit contributions
aster+0.003
ons+0.002
kins+0.002
uch+0.002
rac+0.002
Neg logit contributions
 loudly-0.002
 wondering-0.002
 pad-0.002
 curious-0.002
 interested-0.002
Token and
Token ID290
Feature activation+6.10e-03

Pos logit contributions
 loudly+0.013
 wondering+0.013
 pad+0.013
 curious+0.012
 indeed+0.012
Neg logit contributions
aster-0.032
ons-0.030
uch-0.029
rac-0.029
kins-0.029
Token created
Token ID2727
Feature activation-2.23e-03

Pos logit contributions
 wondering+0.008
 loudly+0.008
 curious+0.008
 pad+0.008
 floppy+0.008
Neg logit contributions
aster-0.010
uch-0.010
rac-0.010
ons-0.010
kins-0.010
Token a
Token ID257
Feature activation+5.29e-02

Pos logit contributions
aster+0.005
ons+0.004
kins+0.004
uch+0.004
rac+0.004
Neg logit contributions
 loudly-0.002
 wondering-0.002
 pad-0.002
 curious-0.002
 floppy-0.002
Token big
Token ID1263
Feature activation+6.04e-02

Pos logit contributions
 wondering+0.073
 loudly+0.072
aly+0.071
 floppy+0.070
 Mind+0.069
Neg logit contributions
aster-0.084
ons-0.079
kins-0.077
ma-0.073
aches-0.072
Token,
Token ID11
Feature activation+6.73e-02

Pos logit contributions
 loudly+0.103
 interested+0.102
aly+0.101
 wondering+0.099
 hesitant+0.097
Neg logit contributions
aster-0.082
ons-0.075
ma-0.070
kins-0.066
 course-0.062
Token beautiful
Token ID4950
Feature activation+5.97e-02

Pos logit contributions
 loudly+0.074
 wondering+0.074
 pad+0.071
aly+0.068
 badly+0.067
Neg logit contributions
aster-0.129
ons-0.122
kins-0.120
aches-0.110
rac-0.110
Token boat
Token ID8848
Feature activation+4.29e-02

Pos logit contributions
aly+0.096
 loudly+0.092
 hesitant+0.091
 interested+0.091
 wondering+0.090
Neg logit contributions
aster-0.087
ons-0.085
kins-0.076
ma-0.075
aches-0.070
Token.
Token ID13
Feature activation+1.15e-02

Pos logit contributions
 loudly+0.049
 wondering+0.048
 pad+0.047
 curious+0.046
 indeed+0.046
Neg logit contributions
aster-0.086
ons-0.080
rac-0.079
uch-0.079
kins-0.076
Token This
Token ID770
Feature activation-1.98e-02

Pos logit contributions
 pad+0.012
 loudly+0.012
 wondering+0.012
 She+0.012
 floppy+0.011
Neg logit contributions
aster-0.023
uch-0.023
kins-0.022
ons-0.022
aches-0.022
Token time
Token ID640
Feature activation-2.17e-02

Pos logit contributions
aster+0.035
rac+0.033
ons+0.032
uch+0.031
aches+0.030
Neg logit contributions
 wondering-0.027
 pad-0.027
 loudly-0.027
 curious-0.026
 indeed-0.026
Token upon
Token ID2402
Feature activation-3.04e-02

Pos logit contributions
aster+0.146
uch+0.142
vation+0.139
ons+0.136
rac+0.135
Neg logit contributions
 loudly-0.025
 wondering-0.024
 curious-0.023
 She-0.023
 interested-0.021
Token a
Token ID257
Feature activation+4.61e-02

Pos logit contributions
aster+0.056
ons+0.051
uch+0.046
kins+0.045
rac+0.044
Neg logit contributions
 pad-0.041
 wondering-0.040
 loudly-0.039
 detective-0.038
 floppy-0.038
Token time
Token ID640
Feature activation+3.91e-02

Pos logit contributions
 loudly+0.070
 pad+0.068
 wondering+0.067
 curious+0.066
 floppy+0.065
Neg logit contributions
aster-0.074
ons-0.072
kins-0.069
rac-0.068
uch-0.066
Token,
Token ID11
Feature activation+2.58e-02

Pos logit contributions
 loudly+0.038
 wondering+0.038
 pad+0.036
 curious+0.035
 interested+0.034
Neg logit contributions
aster-0.085
ons-0.080
kins-0.078
rac-0.077
uch-0.077
Token there
Token ID612
Feature activation-2.97e-02

Pos logit contributions
 wondering+0.041
 loudly+0.040
 pad+0.039
 floppy+0.039
 indeed+0.037
Neg logit contributions
aster-0.039
ons-0.036
kins-0.036
uch-0.035
rac-0.034
Token was
Token ID373
Feature activation+7.45e-02

Pos logit contributions
aster+0.061
rac+0.058
kins+0.055
vation+0.055
ons+0.055
Neg logit contributions
 loudly-0.032
 wondering-0.032
 curious-0.032
 indeed-0.031
 She-0.031
Token a
Token ID257
Feature activation+8.33e-02

Pos logit contributions
 wondering+0.107
 loudly+0.107
 pad+0.107
 interested+0.103
 detective+0.101
Neg logit contributions
aster-0.132
ons-0.118
uch-0.103
ossible-0.100
kins-0.100
Token farmer
Token ID18739
Feature activation+2.21e-02

Pos logit contributions
 pad+0.115
 undone+0.113
 wondering+0.113
aly+0.111
 indeed+0.110
Neg logit contributions
aster-0.142
kins-0.123
ma-0.117
 moms-0.114
uch-0.110
Token named
Token ID3706
Feature activation+4.35e-02

Pos logit contributions
 wondering+0.020
 loudly+0.020
 curious+0.020
 pad+0.019
 indeed+0.018
Neg logit contributions
aster-0.048
uch-0.046
rac-0.046
kins-0.046
ons-0.046
Token Bob
Token ID5811
Feature activation+1.81e-03

Pos logit contributions
 loudly+0.049
 wondering+0.049
 pad+0.047
 curious+0.046
 floppy+0.045
Neg logit contributions
aster-0.088
ons-0.083
uch-0.079
kins-0.079
rac-0.079
Token.
Token ID13
Feature activation+4.39e-02

Pos logit contributions
 loudly+0.003
 wondering+0.003
 pad+0.003
 curious+0.003
 floppy+0.003
Neg logit contributions
aster-0.003
ons-0.003
kins-0.003
uch-0.003
rac-0.003
Token a
Token ID257
Feature activation+7.40e-02

Pos logit contributions
 pad+0.106
 loudly+0.105
 wondering+0.105
 detective+0.103
aly+0.102
Neg logit contributions
aster-0.119
ons-0.105
uch-0.092
ma-0.090
ossible-0.088
Token family
Token ID1641
Feature activation+1.90e-02

Pos logit contributions
 pad+0.101
 wondering+0.098
aly+0.097
 undone+0.095
 indeed+0.095
Neg logit contributions
aster-0.131
kins-0.110
ma-0.107
 halftime-0.105
 moms-0.102
Token.
Token ID13
Feature activation+3.97e-02

Pos logit contributions
 wondering+0.019
 loudly+0.019
 curious+0.018
 pad+0.018
 indeed+0.017
Neg logit contributions
aster-0.040
ons-0.038
uch-0.037
rac-0.037
kins-0.037
Token They
Token ID1119
Feature activation+1.49e-02

Pos logit contributions
 loudly+0.041
 wondering+0.040
 pad+0.039
 interested+0.036
 detective+0.036
Neg logit contributions
aster-0.087
ons-0.077
rac-0.076
kins-0.073
ossible-0.072
Token had
Token ID550
Feature activation+3.93e-02

Pos logit contributions
 loudly+0.019
 wondering+0.019
 pad+0.019
 floppy+0.018
 detective+0.017
Neg logit contributions
aster-0.027
ons-0.026
kins-0.025
rac-0.025
uch-0.024
Token a
Token ID257
Feature activation+6.56e-02

Pos logit contributions
 loudly+0.044
 wondering+0.043
 pad+0.042
 interested+0.040
 detective+0.039
Neg logit contributions
aster-0.079
ons-0.075
uch-0.071
rac-0.071
kins-0.070
Token house
Token ID2156
Feature activation+3.03e-02

Pos logit contributions
lier+0.100
 loudly+0.100
 indeed+0.098
 wondering+0.097
 pad+0.097
Neg logit contributions
aster-0.096
uch-0.085
kins-0.084
ossible-0.084
 moms-0.083
Token,
Token ID11
Feature activation+2.79e-02

Pos logit contributions
 loudly+0.033
 wondering+0.032
 curious+0.031
 indeed+0.031
 interested+0.030
Neg logit contributions
aster-0.063
ons-0.059
uch-0.058
rac-0.058
kins-0.057
Token but
Token ID475
Feature activation-8.48e-03

Pos logit contributions
 pad+0.035
 loudly+0.035
aly+0.032
 undone+0.032
 wondering+0.031
Neg logit contributions
aster-0.052
ons-0.049
kins-0.046
rac-0.044
ma-0.043
Token the
Token ID262
Feature activation+2.22e-02

Pos logit contributions
aster+0.012
ons+0.012
kins+0.011
rac+0.011
uch+0.011
Neg logit contributions
 loudly-0.014
 wondering-0.014
 pad-0.014
 floppy-0.013
 interested-0.013
Token kitchen
Token ID9592
Feature activation+1.20e-02

Pos logit contributions
 loudly+0.033
 wondering+0.033
 pad+0.032
 indeed+0.031
 interested+0.031
Neg logit contributions
aster-0.035
ons-0.032
kins-0.031
rac-0.031
uch-0.030
Token stopped
Token ID5025
Feature activation+1.31e-02

Pos logit contributions
 wondering+0.018
 loudly+0.017
 pad+0.017
 floppy+0.017
 interested+0.017
Neg logit contributions
aster-0.017
ons-0.016
uch-0.015
kins-0.015
rac-0.015
Token!
Token ID0
Feature activation+1.84e-02

Pos logit contributions
 pad+0.020
 detective+0.019
 blonde+0.019
 floppy+0.019
 wondering+0.019
Neg logit contributions
aster-0.021
ons-0.019
kins-0.018
rac-0.017
uch-0.017
Token\n
Token ID198
Feature activation-4.58e-02

Pos logit contributions
 pad+0.017
 wondering+0.017
 loudly+0.016
 detective+0.016
 indeed+0.016
Neg logit contributions
aster-0.040
ons-0.037
rac-0.036
kins-0.035
uch-0.033
Token\n
Token ID198
Feature activation-3.36e-02

Pos logit contributions
aster+0.079
ons+0.073
uch+0.073
kins+0.071
rac+0.070
Neg logit contributions
 loudly-0.065
 wondering-0.063
 curious-0.063
 pad-0.061
 interested-0.061
TokenHe
Token ID1544
Feature activation+4.09e-03

Pos logit contributions
aster+0.051
ons+0.048
rac+0.045
kins+0.045
uch+0.044
Neg logit contributions
 wondering-0.054
 loudly-0.054
 pad-0.053
 floppy-0.052
 indeed-0.051
Token looked
Token ID3114
Feature activation+7.02e-02

Pos logit contributions
 wondering+0.007
 pad+0.007
 loudly+0.006
 indeed+0.006
 interested+0.006
Neg logit contributions
aster-0.006
ons-0.006
rac-0.005
kins-0.005
uch-0.005
Token around
Token ID1088
Feature activation-1.64e-02

Pos logit contributions
 blonde+0.138
 superstar+0.130
 detective+0.126
 personality+0.126
 indeed+0.126
Neg logit contributions
ons-0.058
uses-0.055
ma-0.053
aster-0.053
rac-0.049
Token to
Token ID284
Feature activation-1.63e-02

Pos logit contributions
aster+0.032
ons+0.029
kins+0.028
ma+0.028
rac+0.028
Neg logit contributions
 pad-0.018
 floppy-0.018
aly-0.017
 loudly-0.017
 blonde-0.017
Token find
Token ID1064
Feature activation-7.12e-03

Pos logit contributions
ons+0.028
aster+0.027
kins+0.027
uch+0.027
rac+0.026
Neg logit contributions
 wondering-0.023
 loudly-0.022
 curious-0.022
 pad-0.021
 interested-0.021
Token out
Token ID503
Feature activation-1.76e-02

Pos logit contributions
aster+0.015
ons+0.014
uch+0.014
kins+0.014
rac+0.014
Neg logit contributions
 loudly-0.007
 wondering-0.007
 curious-0.007
 pad-0.007
 floppy-0.007
Token what
Token ID644
Feature activation-3.03e-02

Pos logit contributions
aster+0.023
ons+0.023
ma+0.019
kins+0.019
rac+0.019
Neg logit contributions
 pad-0.031
 floppy-0.029
 blonde-0.029
 wondering-0.029
 superstar-0.028
Token can
Token ID460
Feature activation-1.30e-02

Pos logit contributions
aster+0.043
ons+0.042
rac+0.040
uch+0.040
kins+0.039
Neg logit contributions
 wondering-0.039
 loudly-0.039
 floppy-0.036
 interested-0.036
 blonde-0.035
Token try
Token ID1949
Feature activation-1.98e-02

Pos logit contributions
aster+0.020
ons+0.019
kins+0.018
uch+0.018
rac+0.018
Neg logit contributions
 loudly-0.020
 wondering-0.020
 pad-0.020
 curious-0.019
 floppy-0.019
Token to
Token ID284
Feature activation-5.63e-02

Pos logit contributions
aster+0.034
ons+0.034
kins+0.031
rac+0.031
aches+0.031
Neg logit contributions
 loudly-0.026
 pad-0.025
 wondering-0.025
 floppy-0.025
 interested-0.025
Token make
Token ID787
Feature activation-2.65e-02

Pos logit contributions
aster+0.087
ons+0.083
uch+0.078
kins+0.078
rac+0.077
Neg logit contributions
 loudly-0.088
 wondering-0.088
 curious-0.085
 pad-0.084
 interested-0.083
Token a
Token ID257
Feature activation+2.93e-02

Pos logit contributions
aster+0.049
ons+0.046
uch+0.045
kins+0.045
rac+0.044
Neg logit contributions
 loudly-0.034
 wondering-0.033
 pad-0.032
 curious-0.031
 floppy-0.031
Token pretty
Token ID2495
Feature activation+3.31e-02

Pos logit contributions
 wondering+0.044
 indeed+0.044
 loudly+0.043
 floppy+0.043
 blonde+0.043
Neg logit contributions
aster-0.046
ons-0.041
kins-0.040
aches-0.039
ma-0.038
Token car
Token ID1097
Feature activation+7.70e-03

Pos logit contributions
aly+0.053
 wondering+0.052
 loudly+0.052
 floppy+0.052
 pad+0.051
Neg logit contributions
aster-0.048
ons-0.048
kins-0.043
aches-0.042
ma-0.042
Token!"
Token ID2474
Feature activation-1.65e-02

Pos logit contributions
 wondering+0.009
 loudly+0.009
 pad+0.009
 curious+0.008
 floppy+0.008
Neg logit contributions
aster-0.015
ons-0.014
kins-0.014
uch-0.014
rac-0.014
Token\n
Token ID198
Feature activation-5.29e-02

Pos logit contributions
aster+0.032
uch+0.031
kins+0.031
ons+0.031
at+0.029
Neg logit contributions
 She-0.020
 loudly-0.019
 wondering-0.019
 pad-0.018
 curious-0.018
Token\n
Token ID198
Feature activation-3.74e-02

Pos logit contributions
kins+0.110
aster+0.108
ons+0.106
uch+0.103
uses+0.100
Neg logit contributions
 loudly-0.055
 She-0.055
 badly-0.053
 curious-0.050
 wondering-0.048
TokenSo
Token ID2396
Feature activation-5.99e-03

Pos logit contributions
aster+0.068
ons+0.067
kins+0.065
uch+0.060
rac+0.059
Neg logit contributions
 loudly-0.049
 badly-0.046
 wondering-0.046
 pad-0.045
 indeed-0.045
Token\n
Token ID198
Feature activation-3.39e-02

Pos logit contributions
uch+0.102
aster+0.101
ons+0.097
kins+0.096
rac+0.095
Neg logit contributions
 loudly-0.050
 She-0.049
 curious-0.046
 wondering-0.044
 He-0.043
Token"
Token ID1
Feature activation-1.56e-03

Pos logit contributions
aster+0.058
ons+0.054
uch+0.052
kins+0.052
rac+0.052
Neg logit contributions
 loudly-0.049
 wondering-0.048
 pad-0.047
 curious-0.046
 floppy-0.046
TokenThey
Token ID2990
Feature activation-5.65e-03

Pos logit contributions
aster+0.002
kins+0.002
rac+0.002
ons+0.002
uch+0.002
Neg logit contributions
 loudly-0.003
 wondering-0.003
 pad-0.003
 interested-0.002
 floppy-0.002
Token are
Token ID389
Feature activation+3.95e-02

Pos logit contributions
aster+0.007
ons+0.007
uch+0.006
rac+0.006
kins+0.006
Neg logit contributions
 loudly-0.010
 wondering-0.010
 pad-0.010
 floppy-0.010
 curious-0.010
Token very
Token ID845
Feature activation+4.84e-02

Pos logit contributions
 pad+0.050
 loudly+0.048
 wondering+0.044
 blonde+0.042
 undone+0.042
Neg logit contributions
aster-0.071
ons-0.069
ma-0.067
kins-0.066
aches-0.065
Token nice
Token ID3621
Feature activation+1.74e-02

Pos logit contributions
 pad+0.047
 wondering+0.042
 loudly+0.041
 indeed+0.038
 floppy+0.038
Neg logit contributions
aster-0.104
ons-0.099
kins-0.099
rac-0.097
aches-0.096
Token,
Token ID11
Feature activation+3.09e-02

Pos logit contributions
 undone+0.029
 pad+0.028
 blonde+0.027
 wondering+0.027
 floppy+0.027
Neg logit contributions
ons-0.024
ma-0.023
kins-0.022
ups-0.021
aster-0.021
Token Ben
Token ID3932
Feature activation+1.41e-02

Pos logit contributions
 wondering+0.040
 loudly+0.039
 pad+0.039
 interested+0.038
 blonde+0.038
Neg logit contributions
aster-0.053
kins-0.049
rac-0.048
aches-0.047
ma-0.047
Token.
Token ID13
Feature activation+5.18e-02

Pos logit contributions
 wondering+0.021
 loudly+0.020
 pad+0.020
 blonde+0.019
 interested+0.019
Neg logit contributions
aster-0.022
ons-0.022
kins-0.021
ma-0.020
uch-0.020
Token You
Token ID921
Feature activation-7.53e-03

Pos logit contributions
 loudly+0.070
 pad+0.067
 wondering+0.067
 interested+0.065
 badly+0.064
Neg logit contributions
aster-0.089
rac-0.082
ons-0.079
ossible-0.076
 moms-0.076
Token have
Token ID423
Feature activation-3.40e-03

Pos logit contributions
ons+0.009
aster+0.009
uch+0.008
rac+0.008
kins+0.008
Neg logit contributions
 loudly-0.015
 wondering-0.014
 pad-0.014
 floppy-0.013
 blonde-0.013
Token excited
Token ID6568
Feature activation-7.93e-03

Pos logit contributions
 pad+0.031
 detective+0.028
 wondering+0.026
 loudly+0.026
 indeed+0.025
Neg logit contributions
aster-0.123
ons-0.120
kins-0.115
uch-0.115
rac-0.114
Token.
Token ID13
Feature activation+2.48e-02

Pos logit contributions
aster+0.014
ons+0.014
rac+0.012
uch+0.012
kins+0.012
Neg logit contributions
 pad-0.010
 wondering-0.010
 loudly-0.010
 floppy-0.010
 curious-0.010
Token It
Token ID632
Feature activation-7.98e-03

Pos logit contributions
 loudly+0.031
 wondering+0.029
 interested+0.028
 pad+0.027
 badly+0.027
Neg logit contributions
aster-0.043
rac-0.037
ons-0.037
ossible-0.036
ma-0.036
Token was
Token ID373
Feature activation+5.51e-02

Pos logit contributions
aster+0.012
ons+0.012
kins+0.011
uch+0.011
rac+0.011
Neg logit contributions
 loudly-0.013
 wondering-0.013
 pad-0.012
 curious-0.012
 floppy-0.012
Token his
Token ID465
Feature activation+3.39e-02

Pos logit contributions
 loudly+0.065
 pad+0.063
 wondering+0.063
 interested+0.059
 curious+0.059
Neg logit contributions
aster-0.109
ons-0.102
rac-0.095
kins-0.094
uch-0.094
Token first
Token ID717
Feature activation+4.36e-02

Pos logit contributions
 loudly+0.047
 wondering+0.046
 pad+0.045
 interested+0.043
 floppy+0.043
Neg logit contributions
aster-0.059
ons-0.055
rac-0.053
uch-0.052
kins-0.052
Token sleep
Token ID3993
Feature activation+1.43e-02

Pos logit contributions
 loudly+0.069
 wondering+0.068
 floppy+0.066
 interested+0.065
 curious+0.064
Neg logit contributions
aster-0.068
ons-0.062
uch-0.058
kins-0.057
rac-0.057
Tokenover
Token ID2502
Feature activation+1.11e-02

Pos logit contributions
 wondering+0.018
 loudly+0.018
 pad+0.017
 floppy+0.017
 curious+0.017
Neg logit contributions
aster-0.027
ons-0.025
kins-0.024
uch-0.024
rac-0.024
Token with
Token ID351
Feature activation-5.75e-03

Pos logit contributions
 loudly+0.014
 wondering+0.013
 floppy+0.013
 interested+0.013
 pad+0.013
Neg logit contributions
aster-0.021
ons-0.020
kins-0.019
uch-0.019
rac-0.018
Token his
Token ID465
Feature activation+1.13e-02

Pos logit contributions
aster+0.010
ons+0.010
kins+0.009
rac+0.009
uch+0.009
Neg logit contributions
 loudly-0.007
 wondering-0.007
 pad-0.007
 interested-0.007
 curious-0.007
Token best
Token ID1266
Feature activation+3.33e-02

Pos logit contributions
 loudly+0.013
 wondering+0.013
 pad+0.013
 curious+0.012
 floppy+0.012
Neg logit contributions
aster-0.022
ons-0.021
rac-0.020
uch-0.020
kins-0.020
Token to
Token ID284
Feature activation-4.18e-02

Pos logit contributions
aster+0.002
kins+0.002
ons+0.002
rac+0.002
uch+0.002
Neg logit contributions
 loudly-0.001
 pad-0.001
 curious-0.001
 wondering-0.001
 interested-0.001
Token the
Token ID262
Feature activation-1.29e-02

Pos logit contributions
uch+0.077
kins+0.074
ons+0.073
at+0.072
rac+0.071
Neg logit contributions
 wondering-0.056
 loudly-0.056
 curious-0.054
 puppy-0.053
 She-0.053
Token bathroom
Token ID12436
Feature activation+3.71e-02

Pos logit contributions
aster+0.021
ons+0.020
uch+0.019
kins+0.019
rac+0.019
Neg logit contributions
 loudly-0.019
 wondering-0.019
 pad-0.019
 curious-0.018
 floppy-0.018
Token.
Token ID13
Feature activation+3.80e-02

Pos logit contributions
 loudly+0.037
 wondering+0.036
 pad+0.034
 curious+0.034
 interested+0.033
Neg logit contributions
aster-0.080
ons-0.077
uch-0.074
rac-0.074
kins-0.073
Token She
Token ID1375
Feature activation-9.76e-03

Pos logit contributions
 loudly+0.027
 wondering+0.026
 curious+0.024
 pad+0.023
 floppy+0.023
Neg logit contributions
aster-0.093
ons-0.090
kins-0.088
uch-0.087
rac-0.086
Token was
Token ID373
Feature activation+4.49e-02

Pos logit contributions
aster+0.016
ons+0.015
kins+0.014
uch+0.014
rac+0.014
Neg logit contributions
 loudly-0.015
 wondering-0.015
 pad-0.014
 curious-0.014
 floppy-0.014
Tokenhes
Token ID956
Feature activation-3.98e-02

Pos logit contributions
 loudly+0.049
 wondering+0.048
 pad+0.048
 floppy+0.045
 interested+0.045
Neg logit contributions
aster-0.091
ons-0.087
kins-0.083
uch-0.083
rac-0.083
Token his
Token ID465
Feature activation+1.44e-02

Pos logit contributions
aster+0.062
ons+0.060
uch+0.059
kins+0.058
rac+0.057
Neg logit contributions
 loudly-0.062
 wondering-0.057
 curious-0.056
 floppy-0.055
 pad-0.055
Token finger
Token ID7660
Feature activation+1.90e-02

Pos logit contributions
 curious+0.017
 loudly+0.017
 floppy+0.016
 pad+0.016
 wondering+0.016
Neg logit contributions
ons-0.028
aster-0.028
lo-0.026
vation-0.025
uch-0.025
Token and
Token ID290
Feature activation+1.35e-02

Pos logit contributions
 wondering+0.023
 loudly+0.022
 pad+0.021
 interested+0.021
 detective+0.021
Neg logit contributions
aster-0.037
ons-0.035
uch-0.033
kins-0.033
rac-0.033
Token puts
Token ID7584
Feature activation-1.68e-03

Pos logit contributions
 loudly+0.018
 wondering+0.018
 curious+0.018
 floppy+0.017
 pad+0.016
Neg logit contributions
uch-0.023
aster-0.023
rac-0.022
ons-0.022
at-0.022
Token your
Token ID534
Feature activation+9.16e-03

Pos logit contributions
aster+0.024
ons+0.024
aches+0.023
ma+0.020
rac+0.020
Neg logit contributions
 blonde-0.020
 superstar-0.020
 wondering-0.019
 Nice-0.019
 detective-0.019
Token letters
Token ID7475
Feature activation-3.22e-02

Pos logit contributions
 loudly+0.012
 wondering+0.012
 pad+0.012
 floppy+0.011
 indeed+0.011
Neg logit contributions
aster-0.016
ons-0.015
uch-0.014
kins-0.014
rac-0.014
Token,"
Token ID553
Feature activation-1.69e-02

Pos logit contributions
aster+0.065
ons+0.060
kins+0.057
rac+0.057
uch+0.057
Neg logit contributions
 loudly-0.037
 wondering-0.036
 pad-0.035
 curious-0.035
 indeed-0.035
Token the
Token ID262
Feature activation+4.26e-03

Pos logit contributions
aster+0.030
uch+0.030
ons+0.029
kins+0.028
at+0.028
Neg logit contributions
 loudly-0.024
 curious-0.023
 wondering-0.022
 interested-0.021
 floppy-0.020
Token man
Token ID582
Feature activation-7.49e-03

Pos logit contributions
 loudly+0.006
 wondering+0.006
 curious+0.006
 interested+0.005
 pad+0.005
Neg logit contributions
aster-0.008
ons-0.007
uch-0.007
kins-0.007
rac-0.007
Token said
Token ID531
Feature activation+2.66e-02

Pos logit contributions
aster+0.012
ons+0.011
uch+0.010
kins+0.010
rac+0.010
Neg logit contributions
 loudly-0.012
 wondering-0.012
 curious-0.012
 pad-0.011
 interested-0.011
Token.
Token ID13
Feature activation+2.46e-02

Pos logit contributions
 pad+0.036
 wondering+0.034
 floppy+0.033
 detective+0.033
 loudly+0.033
Neg logit contributions
aster-0.047
ons-0.043
kins-0.043
rac-0.042
uch-0.040
Token "
Token ID366
Feature activation-1.29e-02

Pos logit contributions
 loudly+0.028
 pad+0.028
 wondering+0.027
 detective+0.027
 floppy+0.026
Neg logit contributions
aster-0.047
ons-0.044
rac-0.043
kins-0.042
uch-0.041
TokenYou
Token ID1639
Feature activation-1.65e-02

Pos logit contributions
aster+0.023
ons+0.022
kins+0.021
uch+0.021
rac+0.021
Neg logit contributions
 loudly-0.017
 wondering-0.017
 pad-0.016
 curious-0.016
 interested-0.016
Token cannot
Token ID2314
Feature activation-2.22e-02

Pos logit contributions
aster+0.031
ons+0.030
uch+0.029
rac+0.028
kins+0.028
Neg logit contributions
 wondering-0.020
 loudly-0.020
 pad-0.019
 floppy-0.018
 interested-0.018
Token mail
Token ID6920
Feature activation-2.54e-02

Pos logit contributions
aster+0.035
ons+0.032
uch+0.031
rac+0.030
kins+0.030
Neg logit contributions
 loudly-0.035
 wondering-0.035
 pad-0.034
 curious-0.033
 floppy-0.033
Token cookies
Token ID14746
Feature activation-8.17e-03

Pos logit contributions
aster+0.006
ons+0.005
rac+0.005
uch+0.005
kins+0.005
Neg logit contributions
 loudly-0.005
 wondering-0.005
 pad-0.004
 curious-0.004
 interested-0.004
Token were
Token ID547
Feature activation+4.56e-02

Pos logit contributions
aster+0.014
rac+0.013
uch+0.013
ons+0.012
 halftime+0.012
Neg logit contributions
 loudly-0.012
 indeed-0.011
 wondering-0.011
 curious-0.011
 pad-0.011
Token delicious
Token ID12625
Feature activation+2.62e-02

Pos logit contributions
 loudly+0.039
 wondering+0.039
 pad+0.036
 curious+0.036
 indeed+0.036
Neg logit contributions
aster-0.104
ons-0.100
uch-0.096
rac-0.096
kins-0.095
Token and
Token ID290
Feature activation+7.55e-03

Pos logit contributions
 wondering+0.034
 undone+0.033
aly+0.033
 pad+0.032
 loudly+0.032
Neg logit contributions
ons-0.045
aster-0.044
kins-0.043
ma-0.041
ops-0.038
Token her
Token ID607
Feature activation+1.18e-02

Pos logit contributions
 loudly+0.010
 wondering+0.009
 curious+0.009
 indeed+0.009
 floppy+0.009
Neg logit contributions
aster-0.014
ons-0.013
uch-0.013
rac-0.013
kins-0.012
Token mother
Token ID2802
Feature activation-1.64e-02

Pos logit contributions
 loudly+0.018
 wondering+0.018
 pad+0.017
 curious+0.017
 interested+0.017
Neg logit contributions
aster-0.019
ons-0.018
kins-0.017
uch-0.017
rac-0.017
Token thanked
Token ID26280
Feature activation-2.84e-02

Pos logit contributions
aster+0.028
ons+0.025
rac+0.025
uch+0.025
kins+0.024
Neg logit contributions
 loudly-0.024
 wondering-0.023
 pad-0.023
 curious-0.023
 indeed-0.023
Token the
Token ID262
Feature activation+3.32e-03

Pos logit contributions
aster+0.053
uch+0.052
ons+0.051
kins+0.049
rac+0.049
Neg logit contributions
 loudly-0.036
 wondering-0.035
 indeed-0.033
 curious-0.032
 pad-0.032
Token little
Token ID1310
Feature activation-8.10e-03

Pos logit contributions
 loudly+0.003
 wondering+0.003
 pad+0.003
 floppy+0.003
aly+0.003
Neg logit contributions
aster-0.007
ons-0.007
kins-0.007
ma-0.007
uch-0.006
Token girl
Token ID2576
Feature activation-1.13e-03

Pos logit contributions
aster+0.015
ma+0.014
kins+0.014
ons+0.013
rac+0.013
Neg logit contributions
 wondering-0.009
 loudly-0.009
aly-0.009
 pad-0.008
 indeed-0.008
Token for
Token ID329
Feature activation-3.60e-02

Pos logit contributions
aster+0.002
ons+0.002
kins+0.002
ma+0.002
rac+0.002
Neg logit contributions
 wondering-0.002
 floppy-0.002
 blonde-0.002
 loudly-0.001
 detective-0.001
Token and
Token ID290
Feature activation+1.20e-02

Pos logit contributions
 loudly+0.036
 wondering+0.036
 pad+0.034
 floppy+0.034
 curious+0.034
Neg logit contributions
aster-0.050
ons-0.048
kins-0.046
uch-0.045
rac-0.044
Token asked
Token ID1965
Feature activation-2.77e-02

Pos logit contributions
 loudly+0.016
 pad+0.016
aly+0.016
 wondering+0.015
 indeed+0.015
Neg logit contributions
aster-0.020
ons-0.020
kins-0.018
aches-0.018
rac-0.017
Token her
Token ID607
Feature activation-1.92e-02

Pos logit contributions
aster+0.049
ons+0.047
kins+0.044
rac+0.044
ma+0.043
Neg logit contributions
 pad-0.036
 loudly-0.036
 wondering-0.036
 floppy-0.035
 indeed-0.034
Token mom
Token ID1995
Feature activation-4.03e-02

Pos logit contributions
aster+0.027
ons+0.025
uch+0.023
rac+0.023
kins+0.023
Neg logit contributions
 loudly-0.034
 wondering-0.034
 pad-0.033
 curious-0.033
 interested-0.032
Token if
Token ID611
Feature activation-3.80e-02

Pos logit contributions
aster+0.078
ons+0.074
kins+0.071
rac+0.071
uch+0.071
Neg logit contributions
 loudly-0.048
 wondering-0.047
 pad-0.045
 curious-0.044
 floppy-0.044
Token she
Token ID673
Feature activation-3.93e-02

Pos logit contributions
aster+0.071
ons+0.067
uch+0.065
rac+0.065
kins+0.063
Neg logit contributions
 wondering-0.048
 loudly-0.048
 curious-0.046
 indeed-0.045
 interested-0.044
Token could
Token ID714
Feature activation-8.07e-03

Pos logit contributions
aster+0.062
ons+0.059
uch+0.056
rac+0.055
kins+0.055
Neg logit contributions
 loudly-0.060
 wondering-0.060
 pad-0.058
 floppy-0.056
 curious-0.056
Token have
Token ID423
Feature activation-1.78e-02

Pos logit contributions
aster+0.012
ons+0.012
kins+0.011
rac+0.011
uch+0.011
Neg logit contributions
 loudly-0.013
 wondering-0.013
 curious-0.012
 pad-0.012
 indeed-0.012
Token it
Token ID340
Feature activation-1.91e-02

Pos logit contributions
aster+0.034
ons+0.032
kins+0.031
rac+0.031
uch+0.031
Neg logit contributions
 wondering-0.021
 loudly-0.021
 pad-0.020
 indeed-0.020
 curious-0.020
Token.
Token ID13
Feature activation+2.21e-02

Pos logit contributions
aster+0.036
ons+0.034
kins+0.033
uch+0.033
rac+0.032
Neg logit contributions
 loudly-0.024
 wondering-0.023
 pad-0.023
 curious-0.022
 floppy-0.022
Token Her
Token ID2332
Feature activation+1.34e-02

Pos logit contributions
 wondering+0.027
 pad+0.026
 indeed+0.026
 interested+0.026
 loudly+0.026
Neg logit contributions
aster-0.040
rac-0.038
ons-0.038
ossible-0.035
kins-0.034
Token\n
Token ID198
Feature activation-3.60e-02

Pos logit contributions
kins+0.110
aster+0.108
uch+0.108
at+0.107
ons+0.106
Neg logit contributions
 She-0.049
 loudly-0.046
 curious-0.044
 He-0.042
 It-0.042
TokenThe
Token ID464
Feature activation+4.39e-03

Pos logit contributions
aster+0.056
ons+0.056
kins+0.052
rac+0.051
uch+0.051
Neg logit contributions
 loudly-0.055
 curious-0.052
 wondering-0.051
 pad-0.050
 indeed-0.050
Token wild
Token ID4295
Feature activation-8.69e-04

Pos logit contributions
 loudly+0.006
 wondering+0.006
 pad+0.006
 curious+0.006
 floppy+0.005
Neg logit contributions
aster-0.008
ons-0.008
rac-0.007
kins-0.007
uch-0.007
Token fire
Token ID2046
Feature activation-1.30e-02

Pos logit contributions
aster+0.002
ons+0.002
kins+0.001
uch+0.001
rac+0.001
Neg logit contributions
 loudly-0.001
 wondering-0.001
 pad-0.001
 floppy-0.001
 curious-0.001
Tokenwork
Token ID1818
Feature activation-1.14e-02

Pos logit contributions
aster+0.025
ons+0.024
aches+0.023
rac+0.022
uch+0.022
Neg logit contributions
 loudly-0.017
 curious-0.016
 indeed-0.015
 puppy-0.015
 wondering-0.014
Token hit
Token ID2277
Feature activation-2.63e-02

Pos logit contributions
aster+0.021
rac+0.021
ons+0.020
uch+0.019
aches+0.019
Neg logit contributions
 loudly-0.014
 curious-0.013
 wondering-0.013
 indeed-0.013
 pad-0.012
Token Sara
Token ID24799
Feature activation+1.00e-02

Pos logit contributions
aster+0.040
ons+0.036
rac+0.034
kins+0.034
uch+0.034
Neg logit contributions
 wondering-0.043
 loudly-0.042
 pad-0.042
 interested-0.041
 curious-0.041
Token and
Token ID290
Feature activation-1.94e-03

Pos logit contributions
 wondering+0.009
 loudly+0.008
 interested+0.008
 detective+0.008
 floppy+0.008
Neg logit contributions
aster-0.022
ons-0.022
uch-0.021
rac-0.021
kins-0.020
Token Ben
Token ID3932
Feature activation+2.63e-03

Pos logit contributions
aster+0.004
ons+0.003
uch+0.003
kins+0.003
rac+0.003
Neg logit contributions
 loudly-0.003
 wondering-0.002
 curious-0.002
 floppy-0.002
 pad-0.002
Token.
Token ID13
Feature activation+4.10e-02

Pos logit contributions
 wondering+0.003
 loudly+0.003
 pad+0.003
 interested+0.003
 floppy+0.003
Neg logit contributions
aster-0.005
ons-0.005
uch-0.004
kins-0.004
rac-0.004
Token It
Token ID632
Feature activation-1.20e-02

Pos logit contributions
 pad+0.043
 wondering+0.041
 loudly+0.039
 interested+0.038
 floppy+0.037
Neg logit contributions
aster-0.088
rac-0.080
ons-0.079
kins-0.078
uch-0.075
TokenFrom
Token ID4863
Feature activation-9.48e-03

Pos logit contributions
aster+0.064
kins+0.061
ons+0.061
uch+0.058
rac+0.058
Neg logit contributions
 loudly-0.053
 wondering-0.050
 curious-0.048
 pad-0.048
 indeed-0.048
Token that
Token ID326
Feature activation-2.89e-02

Pos logit contributions
aster+0.016
ons+0.015
kins+0.014
uch+0.014
rac+0.014
Neg logit contributions
 loudly-0.015
 wondering-0.014
 curious-0.014
 indeed-0.013
 floppy-0.013
Token day
Token ID1110
Feature activation-2.03e-02

Pos logit contributions
aster+0.042
kins+0.039
ons+0.038
aches+0.038
rac+0.037
Neg logit contributions
 loudly-0.049
 wondering-0.049
 pad-0.048
 curious-0.047
 indeed-0.046
Token on
Token ID319
Feature activation-3.84e-02

Pos logit contributions
aster+0.038
ons+0.036
kins+0.034
uch+0.034
rac+0.034
Neg logit contributions
 loudly-0.026
 wondering-0.025
 pad-0.025
 curious-0.024
 floppy-0.024
Token,
Token ID11
Feature activation+3.96e-04

Pos logit contributions
aster+0.082
ons+0.077
uch+0.075
rac+0.074
kins+0.074
Neg logit contributions
 loudly-0.040
 wondering-0.038
 pad-0.036
 floppy-0.036
 curious-0.036
Token Gray
Token ID12723
Feature activation-1.47e-02

Pos logit contributions
 loudly+0.000
 wondering+0.000
 pad+0.000
 curious+0.000
 floppy+0.000
Neg logit contributions
aster-0.001
ons-0.001
uch-0.001
kins-0.001
rac-0.001
Token happily
Token ID18177
Feature activation-1.43e-02

Pos logit contributions
aster+0.027
ons+0.026
kins+0.024
uch+0.024
rac+0.024
Neg logit contributions
 wondering-0.019
 loudly-0.019
 pad-0.018
 floppy-0.017
 curious-0.017
Token sold
Token ID2702
Feature activation-2.69e-03

Pos logit contributions
aster+0.024
ons+0.023
rac+0.022
kins+0.022
uch+0.022
Neg logit contributions
 loudly-0.020
 wondering-0.020
 pad-0.020
 curious-0.019
 interested-0.019
Token his
Token ID465
Feature activation-7.83e-03

Pos logit contributions
aster+0.005
ons+0.005
uch+0.004
kins+0.004
rac+0.004
Neg logit contributions
 loudly-0.004
 wondering-0.004
 pad-0.003
 curious-0.003
 floppy-0.003
Token sweet
Token ID6029
Feature activation+6.07e-03

Pos logit contributions
aster+0.015
ons+0.014
uch+0.014
rac+0.014
kins+0.013
Neg logit contributions
 loudly-0.009
 wondering-0.009
 pad-0.009
 curious-0.009
 floppy-0.009
Token,
Token ID11
Feature activation+2.55e-02

Pos logit contributions
 loudly+0.008
 wondering+0.008
 curious+0.008
 pad+0.008
 interested+0.007
Neg logit contributions
aster-0.011
ons-0.010
kins-0.010
rac-0.010
uch-0.010
Token\n
Token ID198
Feature activation-4.39e-02

Pos logit contributions
 loudly+0.055
 wondering+0.054
 pad+0.053
 interested+0.052
 indeed+0.051
Neg logit contributions
aster-0.038
ons-0.035
kins-0.032
rac-0.032
 halftime-0.029
Token\n
Token ID198
Feature activation-3.09e-02

Pos logit contributions
aster+0.087
uch+0.085
rac+0.081
kins+0.081
ons+0.080
Neg logit contributions
 loudly-0.051
 She-0.050
 wondering-0.049
 curious-0.048
 pad-0.047
TokenL
Token ID43
Feature activation-1.95e-02

Pos logit contributions
aster+0.045
ons+0.042
kins+0.040
rac+0.039
uch+0.038
Neg logit contributions
 wondering-0.052
 loudly-0.051
 interested-0.050
 pad-0.049
 floppy-0.049
Tokenily
Token ID813
Feature activation-2.74e-03

Pos logit contributions
aster+0.025
ons+0.024
uch+0.022
kins+0.022
ma+0.021
Neg logit contributions
 wondering-0.038
 loudly-0.036
 floppy-0.036
 curious-0.036
 interested-0.035
Token followed
Token ID3940
Feature activation-7.52e-03

Pos logit contributions
aster+0.005
ons+0.005
rac+0.004
kins+0.004
uch+0.004
Neg logit contributions
 wondering-0.004
 loudly-0.004
 pad-0.004
 floppy-0.004
 interested-0.003
Token Mom
Token ID11254
Feature activation-2.96e-02

Pos logit contributions
aster+0.013
ons+0.012
rac+0.012
kins+0.012
uch+0.012
Neg logit contributions
 loudly-0.010
 wondering-0.010
 pad-0.010
 curious-0.010
 interested-0.010
Tokenmy
Token ID1820
Feature activation-1.92e-02

Pos logit contributions
aster+0.046
ons+0.044
kins+0.040
rac+0.040
uch+0.040
Neg logit contributions
 loudly-0.047
 wondering-0.046
 pad-0.044
 curious-0.044
 interested-0.044
Token's
Token ID338
Feature activation-2.07e-03

Pos logit contributions
aster+0.036
ons+0.036
kins+0.033
rac+0.032
ma+0.032
Neg logit contributions
 wondering-0.022
 floppy-0.022
 loudly-0.022
 blonde-0.021
 indeed-0.021
Token movements
Token ID8650
Feature activation+6.94e-04

Pos logit contributions
aster+0.003
ons+0.003
uch+0.003
kins+0.003
rac+0.003
Neg logit contributions
 loudly-0.004
 wondering-0.004
 pad-0.004
 curious-0.004
 interested-0.003
Token and
Token ID290
Feature activation+7.72e-03

Pos logit contributions
 wondering+0.001
 loudly+0.001
 pad+0.001
 floppy+0.001
 detective+0.001
Neg logit contributions
aster-0.001
ons-0.001
kins-0.001
rac-0.001
ma-0.001
Token watched
Token ID7342
Feature activation+4.08e-03

Pos logit contributions
 loudly+0.011
 wondering+0.011
 curious+0.010
 pad+0.010
 interested+0.010
Neg logit contributions
aster-0.013
ons-0.013
uch-0.013
rac-0.012
kins-0.012
Token,
Token ID11
Feature activation-5.35e-03

Pos logit contributions
aster+0.008
ons+0.008
kins+0.008
uch+0.007
rac+0.007
Neg logit contributions
 pad-0.006
 wondering-0.006
 floppy-0.006
 blonde-0.006
 loudly-0.006
Token reading
Token ID3555
Feature activation-4.32e-02

Pos logit contributions
aster+0.008
ons+0.007
kins+0.007
rac+0.007
uch+0.007
Neg logit contributions
 loudly-0.009
 wondering-0.009
 pad-0.008
 curious-0.008
 floppy-0.008
Token always
Token ID1464
Feature activation-4.06e-02

Pos logit contributions
aster+0.082
ons+0.077
uch+0.074
rac+0.073
kins+0.073
Neg logit contributions
 loudly-0.054
 wondering-0.053
 pad-0.051
 curious-0.050
 indeed-0.049
Token helps
Token ID5419
Feature activation-5.55e-02

Pos logit contributions
aster+0.074
ons+0.071
uch+0.067
kins+0.067
rac+0.067
Neg logit contributions
 wondering-0.054
 loudly-0.053
 curious-0.051
 pad-0.051
 interested-0.050
Token.
Token ID13
Feature activation-1.00e-02

Pos logit contributions
aster+0.090
uch+0.085
ons+0.084
kins+0.082
at+0.080
Neg logit contributions
 loudly-0.086
 wondering-0.080
 curious-0.079
 pad-0.076
 indeed-0.076
Token<|endoftext|>
Token ID50256
Feature activation-7.60e-02

Pos logit contributions
aster+0.017
ons+0.016
kins+0.015
uch+0.015
rac+0.015
Neg logit contributions
 loudly-0.014
 wondering-0.014
 pad-0.014
 curious-0.014
 floppy-0.014
TokenOnce
Token ID7454
Feature activation-4.96e-02

Pos logit contributions
aster+0.132
rac+0.120
uch+0.117
aches+0.117
ons+0.116
Neg logit contributions
 pad-0.105
 loudly-0.104
 wondering-0.104
 She-0.103
 curious-0.101
Token upon
Token ID2402
Feature activation-3.58e-02

Pos logit contributions
aster+0.134
uch+0.127
vation+0.125
ons+0.122
rac+0.119
Neg logit contributions
 curious-0.025
 loudly-0.025
 wondering-0.024
 She-0.023
 interested-0.023
Token a
Token ID257
Feature activation+3.91e-02

Pos logit contributions
aster+0.064
ons+0.056
uch+0.051
kins+0.050
aches+0.050
Neg logit contributions
 pad-0.050
 wondering-0.049
 loudly-0.047
 floppy-0.047
 detective-0.046
Token time
Token ID640
Feature activation+3.31e-02

Pos logit contributions
 loudly+0.059
 pad+0.057
 wondering+0.057
 curious+0.056
 floppy+0.055
Neg logit contributions
aster-0.063
ons-0.062
kins-0.059
rac-0.058
uch-0.056
Token there
Token ID612
Feature activation-3.94e-02

Pos logit contributions
 loudly+0.031
 wondering+0.031
 curious+0.029
 pad+0.029
 indeed+0.028
Neg logit contributions
aster-0.073
ons-0.069
uch-0.066
kins-0.066
rac-0.066
Token was
Token ID373
Feature activation+3.54e-02

Pos logit contributions
aster+0.004
ons+0.003
kins+0.003
uch+0.003
rac+0.003
Neg logit contributions
 loudly-0.003
 wondering-0.003
 pad-0.003
 curious-0.003
 floppy-0.003
Token ready
Token ID3492
Feature activation+3.00e-02

Pos logit contributions
 pad+0.039
 loudly+0.039
 wondering+0.038
 floppy+0.036
 interested+0.035
Neg logit contributions
aster-0.072
ons-0.067
uch-0.065
kins-0.064
rac-0.064
Token for
Token ID329
Feature activation-4.61e-02

Pos logit contributions
 loudly+0.033
 wondering+0.032
 pad+0.031
 curious+0.030
 indeed+0.030
Neg logit contributions
aster-0.062
ons-0.059
kins-0.057
uch-0.057
rac-0.057
Token bed
Token ID3996
Feature activation+6.92e-03

Pos logit contributions
aster+0.088
ons+0.085
rac+0.082
uch+0.082
kins+0.082
Neg logit contributions
 loudly-0.056
 wondering-0.056
 curious-0.052
 indeed-0.051
 pad-0.051
Token.
Token ID13
Feature activation+8.35e-03

Pos logit contributions
 loudly+0.008
 wondering+0.008
 indeed+0.007
 curious+0.007
 pad+0.007
Neg logit contributions
aster-0.014
ons-0.013
rac-0.013
uch-0.013
kins-0.012
Token<|endoftext|>
Token ID50256
Feature activation-6.45e-02

Pos logit contributions
 loudly+0.012
 wondering+0.012
 pad+0.012
 curious+0.012
 floppy+0.011
Neg logit contributions
aster-0.014
ons-0.013
kins-0.013
uch-0.013
rac-0.012
TokenOnce
Token ID7454
Feature activation-4.82e-02

Pos logit contributions
aster+0.130
rac+0.120
aches+0.119
uch+0.116
ons+0.116
Neg logit contributions
 She-0.074
 loudly-0.072
 wondering-0.072
 curious-0.071
 pad-0.071
Token upon
Token ID2402
Feature activation-2.53e-02

Pos logit contributions
aster+0.127
uch+0.120
vation+0.120
asis+0.115
ons+0.114
Neg logit contributions
 She-0.027
 loudly-0.026
 curious-0.026
 wondering-0.025
 pad-0.023
Token a
Token ID257
Feature activation+4.39e-02

Pos logit contributions
aster+0.051
ons+0.047
uch+0.044
kins+0.043
rac+0.043
Neg logit contributions
 wondering-0.030
 pad-0.029
 loudly-0.029
 floppy-0.028
 detective-0.027
Token time
Token ID640
Feature activation+4.21e-02

Pos logit contributions
 loudly+0.063
 pad+0.061
 wondering+0.060
 curious+0.060
 interested+0.058
Neg logit contributions
aster-0.074
ons-0.073
kins-0.070
rac-0.070
uch-0.066
Token,
Token ID11
Feature activation+2.93e-02

Pos logit contributions
 loudly+0.036
 wondering+0.036
 curious+0.035
 pad+0.033
 indeed+0.032
Neg logit contributions
aster-0.097
ons-0.091
rac-0.088
uch-0.087
kins-0.087
Token sky
Token ID6766
Feature activation-1.35e-02

Pos logit contributions
aster+0.032
ons+0.031
kins+0.029
uch+0.029
rac+0.029
Neg logit contributions
 loudly-0.026
 wondering-0.026
 pad-0.025
 curious-0.025
 floppy-0.024
Token together
Token ID1978
Feature activation-4.00e-02

Pos logit contributions
aster+0.027
ons+0.026
uch+0.025
rac+0.025
kins+0.025
Neg logit contributions
 loudly-0.015
 wondering-0.015
 pad-0.014
 curious-0.014
 floppy-0.014
Token every
Token ID790
Feature activation-4.28e-02

Pos logit contributions
aster+0.080
ons+0.075
uch+0.073
rac+0.072
kins+0.072
Neg logit contributions
 loudly-0.047
 wondering-0.046
 pad-0.045
 curious-0.043
 indeed-0.043
Token day
Token ID1110
Feature activation-2.34e-02

Pos logit contributions
aster+0.053
ons+0.051
rac+0.049
kins+0.049
uch+0.047
Neg logit contributions
 loudly-0.080
 wondering-0.079
 pad-0.078
 curious-0.077
 indeed-0.075
Token.
Token ID13
Feature activation+4.52e-03

Pos logit contributions
aster+0.043
ons+0.041
kins+0.039
rac+0.038
uch+0.038
Neg logit contributions
 loudly-0.030
 wondering-0.030
 pad-0.029
 curious-0.028
 floppy-0.028
Token<|endoftext|>
Token ID50256
Feature activation-6.94e-02

Pos logit contributions
 She+0.005
 pad+0.005
 wondering+0.005
 It+0.005
 loudly+0.004
Neg logit contributions
aster-0.010
uch-0.009
aches-0.009
kins-0.009
ons-0.009
TokenOnce
Token ID7454
Feature activation-4.83e-02

Pos logit contributions
aster+0.124
rac+0.113
ons+0.112
uch+0.111
aches+0.110
Neg logit contributions
 loudly-0.094
 wondering-0.093
 pad-0.092
 curious-0.090
 She-0.088
Token upon
Token ID2402
Feature activation-2.45e-02

Pos logit contributions
aster+0.125
uch+0.123
vation+0.120
ons+0.117
rac+0.115
Neg logit contributions
 loudly-0.026
 wondering-0.026
 curious-0.026
 She-0.025
 pad-0.024
Token a
Token ID257
Feature activation+4.53e-02

Pos logit contributions
aster+0.046
ons+0.041
uch+0.039
kins+0.038
rac+0.037
Neg logit contributions
 pad-0.032
 wondering-0.032
 loudly-0.030
 floppy-0.030
 detective-0.030
Token time
Token ID640
Feature activation+3.92e-02

Pos logit contributions
 loudly+0.068
 wondering+0.066
 pad+0.066
 curious+0.065
 interested+0.063
Neg logit contributions
aster-0.073
ons-0.071
kins-0.068
rac-0.067
uch-0.065
Token,
Token ID11
Feature activation+2.85e-02

Pos logit contributions
 loudly+0.037
 wondering+0.037
 curious+0.035
 pad+0.035
 interested+0.033
Neg logit contributions
aster-0.087
ons-0.082
uch-0.078
kins-0.078
rac-0.078
Token not
Token ID407
Feature activation-3.21e-02

Pos logit contributions
ons+0.021
aster+0.021
kins+0.020
rac+0.020
uch+0.019
Neg logit contributions
 loudly-0.016
 wondering-0.016
 indeed-0.015
 curious-0.015
 pad-0.015
Token hurt
Token ID5938
Feature activation-2.84e-02

Pos logit contributions
ons+0.052
aster+0.051
kins+0.050
rac+0.049
uch+0.047
Neg logit contributions
 loudly-0.048
 wondering-0.046
 curious-0.045
 pad-0.045
 indeed-0.044
Token worms
Token ID29174
Feature activation-1.24e-02

Pos logit contributions
aster+0.047
ons+0.044
uch+0.042
kins+0.042
rac+0.042
Neg logit contributions
 loudly-0.042
 wondering-0.042
 pad-0.041
 curious-0.040
 floppy-0.040
Token again
Token ID757
Feature activation-3.38e-02

Pos logit contributions
aster+0.027
ons+0.025
rac+0.024
uch+0.024
kins+0.024
Neg logit contributions
 loudly-0.013
 wondering-0.012
 pad-0.012
 curious-0.011
 indeed-0.011
Token.
Token ID13
Feature activation+1.40e-02

Pos logit contributions
aster+0.066
ons+0.063
rac+0.061
uch+0.061
kins+0.059
Neg logit contributions
 loudly-0.041
 wondering-0.040
 pad-0.038
 curious-0.038
 indeed-0.037
Token<|endoftext|>
Token ID50256
Feature activation-6.84e-02

Pos logit contributions
 loudly+0.015
 wondering+0.014
 pad+0.014
 interested+0.013
 indeed+0.013
Neg logit contributions
aster-0.029
ons-0.027
rac-0.026
uch-0.025
kins-0.025
TokenTom
Token ID13787
Feature activation-3.48e-02

Pos logit contributions
aster+0.109
rac+0.103
aches+0.102
uch+0.098
kins+0.095
Neg logit contributions
 She-0.104
 loudly-0.097
 curious-0.097
 pad-0.096
 Can-0.096
Token is
Token ID318
Feature activation+6.09e-02

Pos logit contributions
aster+0.066
ons+0.062
rac+0.059
kins+0.059
uch+0.059
Neg logit contributions
 loudly-0.043
 wondering-0.043
 pad-0.042
 curious-0.040
 floppy-0.040
Token a
Token ID257
Feature activation+7.02e-02

Pos logit contributions
 loudly+0.063
 pad+0.061
 wondering+0.059
 floppy+0.059
 indeed+0.058
Neg logit contributions
aster-0.130
ons-0.121
kins-0.115
rac-0.115
uch-0.114
Token bird
Token ID6512
Feature activation+1.87e-02

Pos logit contributions
 undone+0.092
 wondering+0.091
 Lay+0.090
 pad+0.090
 indeed+0.089
Neg logit contributions
aster-0.132
kins-0.108
ma-0.107
 moms-0.104
rac-0.102
Token.
Token ID13
Feature activation+3.54e-02

Pos logit contributions
 loudly+0.019
 wondering+0.018
 curious+0.017
 pad+0.017
 indeed+0.017
Neg logit contributions
aster-0.039
ons-0.038
rac-0.037
uch-0.037
kins-0.037
Token and
Token ID290
Feature activation+4.06e-03

Pos logit contributions
aster+0.036
ons+0.034
uch+0.033
rac+0.033
kins+0.033
Neg logit contributions
 loudly-0.023
 wondering-0.022
 curious-0.021
 pad-0.021
 interested-0.021
Token ran
Token ID4966
Feature activation+2.18e-02

Pos logit contributions
 wondering+0.006
 loudly+0.006
 curious+0.005
 pad+0.005
 interested+0.005
Neg logit contributions
aster-0.007
ons-0.007
uch-0.007
rac-0.007
kins-0.007
Token back
Token ID736
Feature activation-2.15e-02

Pos logit contributions
 wondering+0.033
 interested+0.033
 blonde+0.032
 floppy+0.032
 pad+0.032
Neg logit contributions
aster-0.033
ons-0.029
rac-0.027
ma-0.026
aches-0.026
Token home
Token ID1363
Feature activation-2.31e-02

Pos logit contributions
aster+0.040
ons+0.038
kins+0.036
rac+0.036
uch+0.036
Neg logit contributions
 loudly-0.027
 wondering-0.027
 pad-0.026
 curious-0.025
 floppy-0.025
Token.
Token ID13
Feature activation+2.25e-03

Pos logit contributions
aster+0.043
ons+0.041
kins+0.039
uch+0.039
rac+0.039
Neg logit contributions
 loudly-0.029
 wondering-0.029
 pad-0.028
 curious-0.027
 floppy-0.027
Token<|endoftext|>
Token ID50256
Feature activation-7.90e-02

Pos logit contributions
 loudly+0.003
 wondering+0.003
 pad+0.003
 interested+0.003
 curious+0.003
Neg logit contributions
aster-0.004
ons-0.003
rac-0.003
kins-0.003
uch-0.003
TokenOnce
Token ID7454
Feature activation-4.12e-02

Pos logit contributions
aster+0.151
rac+0.139
uch+0.136
aches+0.136
ons+0.135
Neg logit contributions
 loudly-0.096
 wondering-0.095
 pad-0.095
 She-0.094
 curious-0.092
Token there
Token ID612
Feature activation-3.88e-02

Pos logit contributions
aster+0.106
uch+0.098
ons+0.098
vation+0.096
rac+0.095
Neg logit contributions
 loudly-0.027
 wondering-0.026
 curious-0.025
 pad-0.024
 interested-0.023
Token was
Token ID373
Feature activation+7.22e-02

Pos logit contributions
aster+0.076
ons+0.070
rac+0.068
kins+0.068
uch+0.067
Neg logit contributions
 wondering-0.049
 loudly-0.047
 curious-0.046
 pad-0.046
 indeed-0.046
Token an
Token ID281
Feature activation+5.59e-02

Pos logit contributions
 wondering+0.094
 loudly+0.092
 interested+0.090
 detective+0.090
 pad+0.087
Neg logit contributions
aster-0.137
ons-0.124
uch-0.116
ossible-0.109
ma-0.109
Token infant
Token ID11212
Feature activation+2.72e-02

Pos logit contributions
 wondering+0.068
 loudly+0.067
 pad+0.066
 undone+0.062
 blonde+0.061
Neg logit contributions
aster-0.110
ons-0.096
ma-0.094
at-0.092
rac-0.091