TOP ACTIVATIONS
MAX = 8.518

. He was a mis chie vous little baby and always
The cat grinned mis chie v ously . " Never
<|endoftext|> Peter was a mis chie vous 3 year old who
One day , a mis chie vous young boy decided to
the squirrel got a mis chie vous look in its eye
sweet tune , a mis chie vous fox appeared . "
But then , a mis chie vous rabbit came and started
Suddenly , a mis chie vous bird flew by and
Suddenly , a mis chie vous squirrel came and tried
One day a mis chie vous monkey saw the bear
With a mis chie vous gl int in her
The little boy smiled mis chie v ously and said ,
at him with a mis chie vous grin . She said
S ara felt a gust of wind . She saw
to prevent the bad guys ( the toys on the floor
old lady with a mis chie vous tw inkle in her
, there was a loud * be ep * from the
small boy with a mis chie vous tw inkle in his
like happy â and
so that no more mis chie vous mice could come in

MIN ACTIVATIONS
MIN = -9.046

jam to the kids who say yes . They open the
. But Mom and Dad say no . They say the
. But Mom and Dad say , " Wait ! You
They stir and taste and say y um .
. But Anna and Ben say no . They say they
to open her mouth and say ‘ ah
They run to Ben and say , " Give us our
. Their mom and dad say they have to clean up
They walk to her and say hello . They ask her
They pay the man and say , " Thank you ."
Dad . Mom and Dad say yes , but be careful
their moms . Their moms say yes , but they have
wall . Mom and Dad say they can paint on the
word and one number and say them out loud .
bag and his arm and say , " You are caught
up on the floor and say " Ready , set ,
started to tease Jack and say he looked silly .
and dad , who always say they are very pretty .
run to the dog and say , " Hello , dog
go near the horse and say hello . "

INTERVAL 4.127 - 8.518
CONTAINS 1.959%

And they all lived happily ever after , spo iling
. Maybe they will come back and play again ." <|endoftext|>
He turned around and went back home , leaving his mom
girl called Jo was playing in her garden . She was
saw a shady looking man lurking near the exit . They

INTERVAL -0.264 - 4.127
CONTAINS 54.840%

safe . The goose cannot get us ," Tom says .
went home sad and his mom asked him why he was
. You should try to be calm all the time ,
dreaming of all the adventures she would have in her new
." <|endoftext|> Once upon a time , there was a little

INTERVAL -4.655 - -0.264
CONTAINS 41.640%

day , she found a worm that was very kind .
at that big tree ! Let 's go play under it
sh ivering in the cold and he knew right away what
rabbit trusted her . She thanked it and ran off ,
my and daddy . One day , they decided to go

INTERVAL -9.046 - -4.655
CONTAINS 1.560%

M ama smiled and said , " Good job helping
Tom . " Here , Tom . Drink some
says . " He is telling the people about something he
girl was so happy and said , " Thank you !
with it . They also say , " Thank you ,
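
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The following is a minimal sketch under that assumption, not the code of the tool that produced this dump; the names `bin_edges`, `N_BINS`, `reported`, and `shares` are hypothetical and chosen only for illustration. It recomputes the bin edges from the listed MIN and MAX and compares them with the listed interval boundaries.

```python
# Assumption: the dashboard splits [MIN, MAX] into equal-width bins.
MIN_ACT = -9.046   # "MIN = -9.046" as listed above
MAX_ACT = 8.518    # "MAX = 8.518" as listed above
N_BINS = 4         # hypothetical parameter, inferred from the four INTERVAL headers


def bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = bin_edges(MIN_ACT, MAX_ACT, N_BINS)

# Interval boundaries as printed in the INTERVAL headers, lowest first.
reported = [-9.046, -4.655, -0.264, 4.127, 8.518]

# CONTAINS percentages as printed, lowest bin first.
shares = [1.560, 41.640, 54.840, 1.959]

for computed, listed in zip(edges, reported):
    # Listed edges are rounded to three decimals, so allow a small tolerance.
    assert abs(computed - listed) < 1e-3, (computed, listed)

total_share = sum(shares)
print([round(e, 3) for e in edges], round(total_share, 3))
```

Under this reading each bin is 4.391 wide, and the two middle bins together hold roughly 96% of the sampled activations, so the examples in TOP ACTIVATIONS and MIN ACTIVATIONS come from thin tails of the distribution.
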
Token.
Token ID13
Feature activation+9.10e-01

Pos logit contributions
 alone+0.067
 freely+0.066
 wildly+0.062
ashing+0.058
 late+0.058
Neg logit contributions
lor-0.081
ocket-0.062
ama-0.061
ipop-0.060
ves-0.059
Token He
Token ID679
Feature activation+5.92e-01

Pos logit contributions
 alone+0.027
 freely+0.023
 competitive+0.021
 happily+0.021
 fast+0.021
Neg logit contributions
lor-0.045
ocket-0.035
fs-0.034
chief-0.034
 story-0.034
Token was
Token ID373
Feature activation+2.62e-01

Pos logit contributions
 alone+0.019
 freely+0.017
 wildly+0.016
ashing+0.015
 late+0.015
Neg logit contributions
lor-0.028
ves-0.022
ocket-0.022
fs-0.022
 story-0.021
Token a
Token ID257
Feature activation-8.86e-02

Pos logit contributions
 alone+0.005
 freely+0.005
 happily+0.004
 wildly+0.004
 like+0.004
Neg logit contributions
lor-0.015
ocket-0.013
fs-0.012
chief-0.012
ves-0.012
Token mis
Token ID2984
Feature activation+5.70e+00

Pos logit contributions
lor+0.004
 passport+0.003
 story+0.003
ocket+0.003
ipop+0.003
Neg logit contributions
 alone-0.003
 freely-0.002
 wildly-0.002
ashes-0.002
Stop-0.002
Tokenchie
Token ID3043
Feature activation+8.52e+00

Pos logit contributions
 alone+0.202
 like+0.179
 happily+0.173
 freely+0.170
 their+0.168
Neg logit contributions
lor-0.260
ocket-0.201
fs-0.188
card-0.184
vation-0.184
Tokenvous
Token ID31222
Feature activation+2.05e+00

Pos logit contributions
 alone+0.226
 freely+0.177
 like+0.172
 happily+0.169
 fast+0.159
Neg logit contributions
lor-0.487
fs-0.393
ocket-0.386
ves-0.379
wen-0.373
Token little
Token ID1310
Feature activation+2.33e+00

Pos logit contributions
 alone+0.076
 freely+0.070
 late+0.065
 separately+0.064
 wildly+0.061
Neg logit contributions
lor-0.073
 story-0.064
 passport-0.063
ipop-0.062
 poem-0.059
Token baby
Token ID5156
Feature activation+1.41e+00

Pos logit contributions
 freely+0.077
Stop+0.072
Acc+0.071
 wildly+0.071
 happily+0.068
Neg logit contributions
ipop-0.070
 passport-0.066
ster-0.066
 chest-0.065
lor-0.064
Token and
Token ID290
Feature activation-3.81e-01

Pos logit contributions
 alone+0.041
 freely+0.040
 late+0.038
 wildly+0.037
ashes+0.036
Neg logit contributions
lor-0.062
ocket-0.050
ipop-0.049
ster-0.048
ama-0.047
Token always
Token ID1464
Feature activation+1.81e+00

Pos logit contributions
lor+0.021
fs+0.018
ves+0.017
ocket+0.017
chief+0.017
Neg logit contributions
 alone-0.009
 freely-0.008
 happily-0.007
 like-0.006
 late-0.006
Token\n
Token ID198
Feature activation-1.56e-01

Pos logit contributions
lor+0.019
fs+0.016
ocket+0.015
ves+0.015
chief+0.015
Neg logit contributions
 alone-0.008
 freely-0.006
 like-0.005
 happily-0.005
 fast-0.005
TokenThe
Token ID464
Feature activation-1.81e+00

Pos logit contributions
lor+0.009
fs+0.007
ves+0.006
ocket+0.006
wen+0.006
Neg logit contributions
 alone-0.005
 like-0.004
 freely-0.004
 happily-0.004
 outside-0.004
Token cat
Token ID3797
Feature activation-8.83e-02

Pos logit contributions
lor+0.099
fs+0.079
ocket+0.077
ves+0.075
wen+0.074
Neg logit contributions
 alone-0.055
 like-0.044
 freely-0.043
 happily-0.042
 fast-0.041
Token grinned
Token ID36268
Feature activation+7.91e-01

Pos logit contributions
lor+0.005
fs+0.004
ocket+0.004
ves+0.003
wen+0.003
Neg logit contributions
 alone-0.003
 happily-0.002
 like-0.002
 freely-0.002
 fast-0.002
Token mis
Token ID2984
Feature activation+5.16e+00

Pos logit contributions
 alone+0.020
 freely+0.016
 hate+0.016
 waste+0.016
 active+0.015
Neg logit contributions
lor-0.042
ama-0.033
ocket-0.032
fs-0.031
chief-0.031
Tokenchie
Token ID3043
Feature activation+8.42e+00

Pos logit contributions
 alone+0.190
 like+0.167
 outside+0.165
 happily+0.164
 normal+0.163
Neg logit contributions
lor-0.226
ocket-0.167
fs-0.159
card-0.156
wen-0.154
Tokenv
Token ID85
Feature activation+2.38e+00

Pos logit contributions
 alone+0.220
 like+0.166
 freely+0.161
 happily+0.158
 fast+0.155
Neg logit contributions
lor-0.499
fs-0.405
ocket-0.397
ves-0.385
chief-0.381
Tokenously
Token ID3481
Feature activation+1.87e+00

Pos logit contributions
 alone+0.094
 freely+0.079
 like+0.079
 happily+0.077
 fast+0.075
Neg logit contributions
lor-0.107
fs-0.081
ocket-0.078
ves-0.076
wen-0.074
Token.
Token ID13
Feature activation+3.36e-01

Pos logit contributions
 alone+0.042
 freely+0.030
 like+0.030
 happily+0.028
 fast+0.027
Neg logit contributions
lor-0.116
fs-0.095
ocket-0.093
ves-0.092
wen-0.090
Token "
Token ID366
Feature activation+4.77e+00

Pos logit contributions
 alone+0.010
 like+0.008
 freely+0.008
 happily+0.007
 fast+0.007
Neg logit contributions
lor-0.018
fs-0.014
ocket-0.014
ves-0.014
wen-0.014
TokenNever
Token ID12295
Feature activation-2.77e-01

Pos logit contributions
 alone+0.126
 freely+0.094
 like+0.093
 happily+0.091
 fast+0.088
Neg logit contributions
lor-0.281
fs-0.225
ocket-0.220
ves-0.215
wen-0.214
Token<|endoftext|>
Token ID50256
Feature activation-1.15e+00

Pos logit contributions
 alone+0.019
 freely+0.014
 like+0.014
 happily+0.014
 fast+0.013
Neg logit contributions
lor-0.044
fs-0.037
ocket-0.036
ves-0.035
wen-0.034
TokenPeter
Token ID19727
Feature activation+6.24e-01

Pos logit contributions
lor+0.062
fs+0.048
ocket+0.047
ves+0.046
wen+0.045
Neg logit contributions
 alone-0.035
 freely-0.028
 like-0.028
 happily-0.027
 fast-0.026
Token was
Token ID373
Feature activation+8.70e-01

Pos logit contributions
 alone+0.019
 freely+0.016
 fast+0.015
 like+0.015
 wildly+0.015
Neg logit contributions
lor-0.032
ocket-0.025
fs-0.025
wen-0.024
ves-0.024
Token a
Token ID257
Feature activation+4.22e-02

Pos logit contributions
 alone+0.019
 freely+0.018
cy+0.015
 wildly+0.015
 than+0.015
Neg logit contributions
lor-0.047
ocket-0.038
fs-0.037
wen-0.037
card-0.036
Token mis
Token ID2984
Feature activation+5.70e+00

Pos logit contributions
 alone+0.001
 freely+0.001
ashes+0.001
 wildly+0.001
Stop+0.001
Neg logit contributions
lor-0.002
 passport-0.002
ama-0.002
 story-0.002
ocket-0.002
Tokenchie
Token ID3043
Feature activation+8.40e+00

Pos logit contributions
 alone+0.201
 like+0.177
 happily+0.170
 freely+0.167
 their+0.166
Neg logit contributions
lor-0.267
ocket-0.201
fs-0.192
vation-0.191
card-0.189
Tokenvous
Token ID31222
Feature activation+2.18e+00

Pos logit contributions
 alone+0.222
 freely+0.172
 like+0.168
 happily+0.165
 fast+0.156
Neg logit contributions
lor-0.482
fs-0.389
ocket-0.382
ves-0.374
wen-0.369
Token 3
Token ID513
Feature activation+4.00e+00

Pos logit contributions
 alone+0.072
 freely+0.064
 late+0.061
 wildly+0.055
 separately+0.055
Neg logit contributions
lor-0.088
ipop-0.074
 story-0.074
 passport-0.074
ocket-0.072
Token year
Token ID614
Feature activation+1.22e+00

Pos logit contributions
 alone+0.146
 freely+0.132
 like+0.124
 late+0.124
 fast+0.117
Neg logit contributions
lor-0.174
fs-0.137
ves-0.135
ocket-0.131
wen-0.124
Token old
Token ID1468
Feature activation+2.37e+00

Pos logit contributions
 alone+0.037
 freely+0.033
 like+0.032
 happily+0.031
 hate+0.029
Neg logit contributions
lor-0.059
fs-0.047
ocket-0.047
cases-0.046
ves-0.045
Token who
Token ID508
Feature activation+1.89e-01

Pos logit contributions
 alone+0.064
 freely+0.057
 late+0.050
 wildly+0.047
 like+0.047
Neg logit contributions
lor-0.124
ocket-0.099
ipop-0.095
fs-0.095
 passport-0.095
TokenOne
Token ID3198
Feature activation+1.07e+00

Pos logit contributions
lor+0.011
ves+0.009
fs+0.008
wen+0.008
 story+0.008
Neg logit contributions
 alone-0.009
 freely-0.007
 happily-0.007
cy-0.007
ashing-0.007
Token day
Token ID1110
Feature activation-1.63e+00

Pos logit contributions
 alone+0.032
 freely+0.027
 happily+0.026
 like+0.025
 wildly+0.024
Neg logit contributions
lor-0.055
fs-0.044
ocket-0.043
ves-0.043
wen-0.042
Token,
Token ID11
Feature activation+1.60e+00

Pos logit contributions
lor+0.083
fs+0.068
ves+0.068
ocket+0.065
ipop+0.064
Neg logit contributions
 alone-0.050
 freely-0.042
 active-0.037
ashing-0.037
 happily-0.037
Token a
Token ID257
Feature activation-1.53e+00

Pos logit contributions
 alone+0.043
 freely+0.035
 like+0.032
 happily+0.032
ashing+0.030
Neg logit contributions
lor-0.088
fs-0.074
ves-0.071
ocket-0.070
chief-0.068
Token mis
Token ID2984
Feature activation+5.11e+00

Pos logit contributions
lor+0.084
fs+0.067
ocket+0.066
ves+0.065
wen+0.063
Neg logit contributions
 alone-0.044
 freely-0.035
 like-0.035
 happily-0.033
 fast-0.031
Tokenchie
Token ID3043
Feature activation+7.98e+00

Pos logit contributions
 alone+0.179
 like+0.148
 happily+0.147
 freely+0.144
 late+0.143
Neg logit contributions
lor-0.247
ocket-0.191
fs-0.182
card-0.177
wen-0.176
Tokenvous
Token ID31222
Feature activation+1.33e+00

Pos logit contributions
 alone+0.200
 like+0.149
 freely+0.147
 happily+0.142
 fast+0.138
Neg logit contributions
lor-0.479
fs-0.389
ocket-0.382
ves-0.370
chief-0.368
Token young
Token ID1862
Feature activation+3.15e+00

Pos logit contributions
 alone+0.046
 freely+0.039
 happily+0.037
 like+0.036
 late+0.036
Neg logit contributions
lor-0.063
ocket-0.049
fs-0.048
ves-0.047
anger-0.046
Token boy
Token ID2933
Feature activation+1.80e+00

Pos logit contributions
 alone+0.103
 freely+0.096
 wildly+0.091
 like+0.088
 hate+0.087
Neg logit contributions
lor-0.135
 passport-0.103
 message-0.102
 story-0.101
ocket-0.099
Token decided
Token ID3066
Feature activation-2.38e+00

Pos logit contributions
 alone+0.062
 freely+0.053
 like+0.049
 fast+0.049
 happily+0.049
Neg logit contributions
lor-0.087
ocket-0.067
fs-0.067
ves-0.065
wen-0.063
Token to
Token ID284
Feature activation-3.90e-01

Pos logit contributions
lor+0.147
fs+0.121
ocket+0.118
ves+0.116
wen+0.114
Neg logit contributions
 alone-0.055
 freely-0.040
 like-0.039
 happily-0.038
 fast-0.036
Token the
Token ID262
Feature activation-1.10e+00

Pos logit contributions
 alone+0.029
 like+0.021
 freely+0.020
 happily+0.020
 fast+0.019
Neg logit contributions
lor-0.083
fs-0.068
ocket-0.067
ves-0.065
wen-0.065
Token squirrel
Token ID33039
Feature activation+1.46e+00

Pos logit contributions
lor+0.063
fs+0.051
ocket+0.050
ves+0.049
wen+0.048
Neg logit contributions
 alone-0.030
 like-0.023
 freely-0.023
 happily-0.022
 fast-0.022
Token got
Token ID1392
Feature activation+2.05e+00

Pos logit contributions
 alone+0.048
 like+0.038
 freely+0.038
 happily+0.037
 fast+0.036
Neg logit contributions
lor-0.078
fs-0.061
ocket-0.060
ves-0.058
wen-0.058
Token a
Token ID257
Feature activation-1.04e+00

Pos logit contributions
 alone+0.050
 freely+0.043
 wildly+0.037
 like+0.037
 than+0.037
Neg logit contributions
lor-0.109
fs-0.090
ocket-0.090
chief-0.089
ves-0.087
Token mis
Token ID2984
Feature activation+5.33e+00

Pos logit contributions
lor+0.057
fs+0.046
ocket+0.045
ves+0.044
wen+0.043
Neg logit contributions
 alone-0.031
 like-0.024
 freely-0.024
 happily-0.023
 fast-0.023
Tokenchie
Token ID3043
Feature activation+7.96e+00

Pos logit contributions
 alone+0.170
 like+0.139
 freely+0.137
 happily+0.135
 fast+0.129
Neg logit contributions
lor-0.278
fs-0.214
ocket-0.214
ves-0.204
vation-0.202
Tokenvous
Token ID31222
Feature activation+1.15e+00

Pos logit contributions
 alone+0.182
 like+0.132
 freely+0.124
 happily+0.121
 fast+0.120
Neg logit contributions
lor-0.502
fs-0.413
ocket-0.405
chief-0.394
ves-0.390
Token look
Token ID804
Feature activation+2.95e-01

Pos logit contributions
 alone+0.035
 freely+0.028
 like+0.027
 happily+0.027
 fast+0.026
Neg logit contributions
lor-0.061
fs-0.048
ocket-0.048
ves-0.046
chief-0.046
Token in
Token ID287
Feature activation+3.16e+00

Pos logit contributions
 alone+0.006
 freely+0.005
 like+0.004
 happily+0.004
 fast+0.004
Neg logit contributions
lor-0.018
fs-0.015
ocket-0.015
ves-0.015
wen-0.014
Token its
Token ID663
Feature activation-7.09e-01

Pos logit contributions
 alone+0.072
 freely+0.056
ashing+0.053
 like+0.052
 happily+0.052
Neg logit contributions
lor-0.185
ocket-0.150
fs-0.149
chief-0.146
ves-0.142
Token eye
Token ID4151
Feature activation-5.68e+00

Pos logit contributions
lor+0.032
fs+0.025
ocket+0.023
ves+0.023
wen+0.022
Neg logit contributions
 alone-0.029
 like-0.025
 freely-0.024
 happily-0.024
 fast-0.024
Token sweet
Token ID6029
Feature activation+1.38e+00

Pos logit contributions
lor+0.039
fs+0.031
ves+0.030
ocket+0.029
vation+0.028
Neg logit contributions
 alone-0.033
 like-0.028
 freely-0.028
 late-0.027
 happily-0.027
Token tune
Token ID14009
Feature activation-1.51e+00

Pos logit contributions
 alone+0.043
 like+0.037
 happily+0.035
 freely+0.035
 outside+0.035
Neg logit contributions
lor-0.076
fs-0.058
ocket-0.058
wen-0.057
cases-0.056
Token,
Token ID11
Feature activation+1.40e+00

Pos logit contributions
lor+0.091
fs+0.074
ocket+0.073
ves+0.071
wen+0.070
Neg logit contributions
 alone-0.037
 like-0.027
 freely-0.027
 happily-0.026
 fast-0.025
Token a
Token ID257
Feature activation-1.72e+00

Pos logit contributions
 alone+0.040
 like+0.031
 freely+0.030
 happily+0.030
 fast+0.029
Neg logit contributions
lor-0.080
fs-0.064
ocket-0.063
ves-0.061
wen-0.061
Token mis
Token ID2984
Feature activation+5.10e+00

Pos logit contributions
lor+0.094
fs+0.074
ocket+0.073
ves+0.071
wen+0.070
Neg logit contributions
 alone-0.053
 like-0.042
 freely-0.042
 happily-0.041
 fast-0.040
Tokenchie
Token ID3043
Feature activation+7.93e+00

Pos logit contributions
 alone+0.170
 like+0.136
 freely+0.134
 happily+0.134
 fast+0.130
Neg logit contributions
lor-0.262
ocket-0.202
fs-0.198
wen-0.190
vation-0.189
Tokenvous
Token ID31222
Feature activation+1.03e+00

Pos logit contributions
 alone+0.192
 like+0.143
 freely+0.138
 happily+0.133
 fast+0.130
Neg logit contributions
lor-0.485
fs-0.396
ocket-0.389
chief-0.377
ves-0.375
Token fox
Token ID21831
Feature activation-3.79e-01

Pos logit contributions
 alone+0.035
 freely+0.029
 like+0.029
 happily+0.028
 fast+0.027
Neg logit contributions
lor-0.052
fs-0.040
ocket-0.040
ves-0.038
wen-0.038
Token appeared
Token ID4120
Feature activation-1.01e+00

Pos logit contributions
lor+0.019
fs+0.015
ocket+0.014
wen+0.014
ves+0.014
Neg logit contributions
 alone-0.014
 like-0.011
 happily-0.011
 freely-0.011
 fast-0.010
Token.
Token ID13
Feature activation+4.12e-01

Pos logit contributions
lor+0.060
fs+0.049
ocket+0.048
ves+0.046
wen+0.046
Neg logit contributions
 alone-0.023
 freely-0.017
 like-0.016
 happily-0.015
 youth-0.015
Token "
Token ID366
Feature activation+4.55e+00

Pos logit contributions
 alone+0.012
 freely+0.009
 like+0.009
 fast+0.008
 happily+0.008
Neg logit contributions
lor-0.023
fs-0.018
ocket-0.018
chief-0.017
anger-0.017
TokenBut
Token ID1537
Feature activation-2.36e+00

Pos logit contributions
lor+0.019
fs+0.015
ves+0.014
wen+0.014
vation+0.013
Neg logit contributions
 alone-0.013
 like-0.010
 freely-0.010
 happily-0.009
 outside-0.009
Token then
Token ID788
Feature activation-2.30e-01

Pos logit contributions
lor+0.127
fs+0.101
ocket+0.099
ves+0.097
chief+0.095
Neg logit contributions
 alone-0.070
 freely-0.056
 like-0.054
 happily-0.052
 fast-0.052
Token,
Token ID11
Feature activation+1.32e+00

Pos logit contributions
lor+0.013
fs+0.011
ocket+0.010
ves+0.010
wen+0.010
Neg logit contributions
 alone-0.006
 freely-0.005
 like-0.005
 happily-0.005
 fast-0.004
Token a
Token ID257
Feature activation-1.88e+00

Pos logit contributions
 alone+0.034
 freely+0.026
 like+0.025
 happily+0.024
 fast+0.023
Neg logit contributions
lor-0.077
fs-0.063
ocket-0.061
ves-0.060
wen-0.059
Token mis
Token ID2984
Feature activation+5.15e+00

Pos logit contributions
lor+0.104
fs+0.084
ocket+0.082
ves+0.080
wen+0.079
Neg logit contributions
 alone-0.054
 freely-0.042
 like-0.042
 happily-0.040
 fast-0.038
Tokenchie
Token ID3043
Feature activation+7.86e+00

Pos logit contributions
 alone+0.186
 like+0.157
 happily+0.153
 freely+0.152
 outside+0.148
Neg logit contributions
lor-0.245
ocket-0.183
fs-0.175
card-0.172
wen-0.170
Tokenvous
Token ID31222
Feature activation+1.10e+00

Pos logit contributions
 alone+0.189
 like+0.139
 freely+0.136
 happily+0.131
 fast+0.128
Neg logit contributions
lor-0.482
fs-0.394
ocket-0.387
chief-0.374
ves-0.374
Token rabbit
Token ID22746
Feature activation-9.87e-01

Pos logit contributions
 alone+0.036
 freely+0.029
 like+0.028
 happily+0.028
 late+0.027
Neg logit contributions
lor-0.055
ocket-0.043
fs-0.043
ves-0.042
chief-0.041
Token came
Token ID1625
Feature activation+1.85e+00

Pos logit contributions
lor+0.049
fs+0.038
ocket+0.037
ves+0.036
wen+0.036
Neg logit contributions
 alone-0.036
 like-0.029
 happily-0.029
 freely-0.029
 fast-0.027
Token and
Token ID290
Feature activation-1.78e+00

Pos logit contributions
 alone+0.046
 freely+0.037
 like+0.034
 youth+0.034
 hate+0.034
Neg logit contributions
lor-0.101
fs-0.077
wen-0.077
ocket-0.076
ama-0.076
Token started
Token ID2067
Feature activation+3.63e+00

Pos logit contributions
lor+0.084
fs+0.065
ocket+0.063
ves+0.062
chief+0.062
Neg logit contributions
 alone-0.063
 freely-0.053
 like-0.052
 happily-0.049
 fast-0.048
Token\n
Token ID198
Feature activation-3.23e-01

Pos logit contributions
lor+0.028
fs+0.022
ocket+0.022
ves+0.021
wen+0.021
Neg logit contributions
 alone-0.012
 freely-0.009
 like-0.009
 happily-0.009
 fast-0.008
TokenSuddenly
Token ID38582
Feature activation-1.56e+00

Pos logit contributions
lor+0.015
fs+0.012
ves+0.012
wen+0.011
ocket+0.011
Neg logit contributions
 alone-0.011
 freely-0.010
 like-0.010
 happily-0.009
 youth-0.009
Token,
Token ID11
Feature activation+1.10e+00

Pos logit contributions
lor+0.083
fs+0.067
ves+0.065
ocket+0.065
chief+0.063
Neg logit contributions
 alone-0.045
 freely-0.037
 like-0.034
 active-0.034
 happily-0.033
Token a
Token ID257
Feature activation-2.03e+00

Pos logit contributions
 alone+0.029
 freely+0.023
 like+0.022
 active+0.021
 happily+0.021
Neg logit contributions
lor-0.061
fs-0.050
ocket-0.049
ves-0.048
chief-0.047
Token mis
Token ID2984
Feature activation+4.94e+00

Pos logit contributions
lor+0.116
fs+0.093
ocket+0.091
ves+0.089
wen+0.088
Neg logit contributions
 alone-0.057
 like-0.044
 freely-0.043
 happily-0.042
 fast-0.042
Tokenchie
Token ID3043
Feature activation+7.82e+00

Pos logit contributions
 alone+0.173
 like+0.146
 happily+0.145
 freely+0.143
 late+0.141
Neg logit contributions
lor-0.236
ocket-0.182
fs-0.172
wen-0.168
card-0.167
Tokenvous
Token ID31222
Feature activation+1.18e+00

Pos logit contributions
 alone+0.186
 like+0.136
 fast+0.128
 freely+0.127
 happily+0.123
Neg logit contributions
lor-0.490
fs-0.400
ocket-0.394
chief-0.389
wen-0.376
Token bird
Token ID6512
Feature activation+7.21e-01

Pos logit contributions
 alone+0.038
 freely+0.032
 like+0.030
 happily+0.030
 late+0.029
Neg logit contributions
lor-0.060
ocket-0.047
fs-0.047
ves-0.045
wen-0.044
Token flew
Token ID13112
Feature activation+1.04e+00

Pos logit contributions
 alone+0.027
 like+0.022
 happily+0.021
 freely+0.021
 outside+0.021
Neg logit contributions
lor-0.036
fs-0.028
ocket-0.027
wen-0.026
ves-0.026
Token by
Token ID416
Feature activation+7.82e-02

Pos logit contributions
 alone+0.022
 freely+0.018
 youth+0.017
 active+0.016
 hate+0.016
Neg logit contributions
lor-0.059
fs-0.047
ocket-0.046
anger-0.046
chief-0.046
Token and
Token ID290
Feature activation-1.53e+00

Pos logit contributions
 alone+0.002
 active+0.002
ashing+0.002
 competitive+0.002
 freely+0.002
Neg logit contributions
lor-0.004
fs-0.003
ocket-0.003
ves-0.003
aby-0.003
Token\n
Token ID198
Feature activation-5.37e-01

Pos logit contributions
lor+0.038
ocket+0.032
fs+0.031
chief+0.030
vation+0.030
Neg logit contributions
 alone-0.013
 happily-0.010
 freely-0.010
 like-0.010
 fast-0.009
TokenSuddenly
Token ID38582
Feature activation-1.70e+00

Pos logit contributions
lor+0.027
fs+0.021
ves+0.021
ocket+0.020
wen+0.020
Neg logit contributions
 alone-0.019
 freely-0.015
 like-0.015
 happily-0.014
 fast-0.014
Token,
Token ID11
Feature activation+1.19e+00

Pos logit contributions
lor+0.082
fs+0.066
ves+0.066
chief+0.063
ocket+0.063
Neg logit contributions
 alone-0.055
 freely-0.047
 active-0.043
ashing-0.042
umpy-0.042
Token a
Token ID257
Feature activation-1.94e+00

Pos logit contributions
 alone+0.029
 freely+0.022
 like+0.020
 happily+0.020
 active+0.019
Neg logit contributions
lor-0.069
fs-0.058
ocket-0.056
ves-0.056
chief-0.054
Token mis
Token ID2984
Feature activation+4.89e+00

Pos logit contributions
lor+0.110
fs+0.089
ocket+0.087
ves+0.085
wen+0.083
Neg logit contributions
 alone-0.054
 freely-0.042
 like-0.042
 happily-0.040
 fast-0.039
Tokenchie
Token ID3043
Feature activation+7.78e+00

Pos logit contributions
 alone+0.166
 like+0.136
 freely+0.135
 happily+0.135
 late+0.131
Neg logit contributions
lor-0.242
ocket-0.191
fs-0.181
wen-0.174
card-0.173
Tokenvous
Token ID31222
Feature activation+1.13e+00

Pos logit contributions
 alone+0.190
 like+0.140
 freely+0.136
 happily+0.132
 fast+0.130
Neg logit contributions
lor-0.476
fs-0.387
ocket-0.380
ves-0.368
chief-0.367
Token squirrel
Token ID33039
Feature activation+1.66e+00

Pos logit contributions
 alone+0.035
 freely+0.028
 happily+0.027
 like+0.027
 late+0.026
Neg logit contributions
lor-0.058
fs-0.046
ocket-0.046
ves-0.044
chief-0.044
Token came
Token ID1625
Feature activation+1.79e+00

Pos logit contributions
 alone+0.060
 freely+0.049
 like+0.049
 happily+0.048
 fast+0.047
Neg logit contributions
lor-0.081
fs-0.062
ocket-0.061
ves-0.059
wen-0.058
Token and
Token ID290
Feature activation-1.87e+00

Pos logit contributions
 alone+0.046
 freely+0.040
 miserable+0.037
 lonely+0.036
 hate+0.036
Neg logit contributions
lor-0.085
wen-0.067
 story-0.066
ama-0.065
fs-0.064
Token tried
Token ID3088
Feature activation-1.73e+00

Pos logit contributions
lor+0.087
fs+0.069
chief+0.067
ves+0.066
ocket+0.066
Neg logit contributions
 alone-0.065
 freely-0.056
 like-0.053
 happily-0.051
 wildly-0.050
Token\n
Token ID198
Feature activation-3.69e-01

Pos logit contributions
lor+0.033
ocket+0.027
fs+0.027
chief+0.026
vation+0.026
Neg logit contributions
 alone-0.012
 freely-0.009
 like-0.009
 happily-0.009
 fast-0.009
TokenOne
Token ID3198
Feature activation+1.03e+00

Pos logit contributions
lor+0.019
fs+0.015
ves+0.015
wen+0.014
ocket+0.014
Neg logit contributions
 alone-0.012
 freely-0.009
 like-0.009
 happily-0.009
 outside-0.009
Token day
Token ID1110
Feature activation-1.75e+00

Pos logit contributions
 alone+0.032
 freely+0.027
 like+0.025
 happily+0.025
 wildly+0.024
Neg logit contributions
lor-0.053
fs-0.042
ocket-0.041
ves-0.040
wen-0.040
Token a
Token ID257
Feature activation-1.79e+00

Pos logit contributions
lor+0.088
fs+0.071
ves+0.070
ocket+0.068
ipop+0.067
Neg logit contributions
 alone-0.054
 freely-0.045
 active-0.041
ashing-0.040
 wildly-0.039
Token mis
Token ID2984
Feature activation+4.89e+00

Pos logit contributions
lor+0.109
fs+0.088
ocket+0.086
wen+0.084
ves+0.083
Neg logit contributions
 alone-0.045
 fast-0.034
 like-0.034
 happily-0.033
 freely-0.033
Tokenchie
Token ID3043
Feature activation+7.72e+00

Pos logit contributions
 alone+0.169
 like+0.140
 happily+0.137
 freely+0.137
 fast+0.134
Neg logit contributions
lor-0.244
ocket-0.186
fs-0.177
wen-0.172
vation-0.170
Tokenvous
Token ID31222
Feature activation+1.21e+00

Pos logit contributions
 alone+0.184
 like+0.136
 freely+0.132
 happily+0.128
 fast+0.124
Neg logit contributions
lor-0.474
fs-0.387
ocket-0.380
chief-0.368
ves-0.367
Token monkey
Token ID21657
Feature activation+3.60e-01

Pos logit contributions
 alone+0.041
 freely+0.034
 like+0.032
 happily+0.032
 late+0.031
Neg logit contributions
lor-0.060
fs-0.046
ocket-0.046
ves-0.045
chief-0.044
Token saw
Token ID2497
Feature activation-8.34e-01

Pos logit contributions
 alone+0.011
 like+0.009
 happily+0.009
 freely+0.009
 fast+0.008
Neg logit contributions
lor-0.020
fs-0.016
ocket-0.015
wen-0.015
ves-0.015
Token the
Token ID262
Feature activation-1.01e+00

Pos logit contributions
lor+0.055
fs+0.045
ocket+0.044
ves+0.044
wen+0.043
Neg logit contributions
 alone-0.017
 like-0.011
 freely-0.011
 happily-0.011
 fast-0.010
Token bear
Token ID6842
Feature activation-3.77e-02

Pos logit contributions
lor+0.053
fs+0.042
ocket+0.041
ves+0.040
wen+0.039
Neg logit contributions
 alone-0.033
 freely-0.027
 like-0.026
 happily-0.026
 fast-0.025
Token\n
Token ID198
Feature activation-4.13e-01

Pos logit contributions
lor+0.075
ocket+0.062
fs+0.062
vation+0.059
chief+0.059
Neg logit contributions
 alone-0.027
 like-0.019
 freely-0.019
 happily-0.019
 fast-0.018
Token\n
Token ID198
Feature activation-2.99e-01

Pos logit contributions
lor+0.024
fs+0.019
ocket+0.019
ves+0.018
wen+0.018
Neg logit contributions
 alone-0.011
 freely-0.008
 like-0.008
 happily-0.008
 fast-0.008
TokenWith
Token ID3152
Feature activation-2.15e+00

Pos logit contributions
lor+0.014
fs+0.011
ves+0.010
wen+0.010
cases+0.010
Neg logit contributions
 alone-0.011
 freely-0.009
 like-0.009
 happily-0.009
 youth-0.008
Token a
Token ID257
Feature activation-1.69e+00

Pos logit contributions
lor+0.160
fs+0.129
vation+0.128
ocket+0.127
anger+0.126
Neg logit contributions
 alone-0.033
 like-0.023
 fast-0.021
 happily-0.019
 each-0.018
Token mis
Token ID2984
Feature activation+4.96e+00

Pos logit contributions
lor+0.107
fs+0.085
ocket+0.084
wen+0.083
ves+0.081
Neg logit contributions
 alone-0.039
 fast-0.033
 happily-0.031
 competitive-0.030
 like-0.029
Tokenchie
Token ID3043
Feature activation+7.70e+00

Pos logit contributions
 alone+0.154
 freely+0.122
 like+0.121
 happily+0.120
 fast+0.114
Neg logit contributions
lor-0.265
fs-0.209
ocket-0.208
ves-0.199
wen-0.198
Tokenvous
Token ID31222
Feature activation+1.22e+00

Pos logit contributions
 alone+0.175
 like+0.128
 fast+0.117
 freely+0.117
 happily+0.114
Neg logit contributions
lor-0.490
fs-0.400
ocket-0.393
chief-0.386
ves-0.379
Token gl
Token ID1278
Feature activation-3.61e-01

Pos logit contributions
 alone+0.045
 freely+0.037
 like+0.037
 happily+0.036
 fast+0.035
Neg logit contributions
lor-0.058
fs-0.045
ocket-0.044
ves-0.042
wen-0.042
Tokenint
Token ID600
Feature activation-1.64e+00

Pos logit contributions
lor+0.013
fs+0.010
vation+0.009
 Story+0.009
ocket+0.009
Neg logit contributions
 alone-0.016
 like-0.014
 freely-0.014
 happily-0.014
ashing-0.013
Token in
Token ID287
Feature activation+2.76e+00

Pos logit contributions
lor+0.094
fs+0.075
ocket+0.073
vation+0.072
wen+0.069
Neg logit contributions
 alone-0.048
 like-0.041
 outside-0.037
 in-0.036
 fast-0.036
Token her
Token ID607
Feature activation+9.41e-02

Pos logit contributions
 alone+0.053
 like+0.035
 freely+0.034
 happily+0.033
 fast+0.031
Neg logit contributions
lor-0.185
fs-0.153
ocket-0.150
ves-0.148
wen-0.146
Token The
Token ID383
Feature activation-2.16e+00

Pos logit contributions
 alone+0.015
 freely+0.011
 like+0.011
 happily+0.011
 fast+0.011
Neg logit contributions
lor-0.029
fs-0.023
ocket-0.023
ves-0.022
wen-0.022
Token little
Token ID1310
Feature activation+1.29e+00

Pos logit contributions
lor+0.119
fs+0.095
ocket+0.092
ves+0.090
wen+0.090
Neg logit contributions
 alone-0.068
 like-0.053
 fast-0.051
 freely-0.051
 happily-0.051
Token boy
Token ID2933
Feature activation+1.76e+00

Pos logit contributions
 alone+0.035
 freely+0.029
 like+0.027
 happily+0.026
 wildly+0.026
Neg logit contributions
lor-0.069
ocket-0.054
fs-0.054
chief-0.053
ipop-0.053
Token smiled
Token ID13541
Feature activation+7.09e-01

Pos logit contributions
 alone+0.074
 happily+0.064
 like+0.064
 freely+0.062
 fast+0.060
Neg logit contributions
lor-0.077
fs-0.057
ocket-0.055
ves-0.053
wen-0.052
Token mis
Token ID2984
Feature activation+4.68e+00

Pos logit contributions
 alone+0.018
 freely+0.015
 waste+0.014
 hate+0.014
Stop+0.013
Neg logit contributions
lor-0.038
ocket-0.030
chief-0.030
ama-0.029
cases-0.028
Tokenchie
Token ID3043
Feature activation+7.69e+00

Pos logit contributions
 alone+0.165
 like+0.142
 normal+0.140
 freely+0.137
 lonely+0.137
Neg logit contributions
lor-0.219
ocket-0.165
card-0.156
fs-0.156
wen-0.154
Tokenv
Token ID85
Feature activation+2.02e+00

Pos logit contributions
 alone+0.222
 like+0.173
 freely+0.170
 happily+0.167
 fast+0.162
Neg logit contributions
lor-0.433
fs-0.347
ocket-0.339
ves-0.330
wen-0.326
Tokenously
Token ID3481
Feature activation+1.27e+00

Pos logit contributions
 alone+0.079
 happily+0.066
 like+0.066
 freely+0.065
 fast+0.062
Neg logit contributions
lor-0.092
fs-0.070
ocket-0.068
ves-0.065
wen-0.064
Token and
Token ID290
Feature activation-1.67e+00

Pos logit contributions
 alone+0.027
 like+0.019
 freely+0.019
 happily+0.019
 fast+0.017
Neg logit contributions
lor-0.081
fs-0.067
ocket-0.066
ves-0.064
wen-0.064
Token said
Token ID531
Feature activation-7.45e+00

Pos logit contributions
lor+0.081
fs+0.063
ocket+0.060
ves+0.058
wen+0.057
Neg logit contributions
 alone-0.064
 happily-0.055
 like-0.053
 freely-0.051
 fast-0.051
Token,
Token ID11
Feature activation+5.20e-01

Pos logit contributions
lor+0.425
fs+0.340
ocket+0.332
ves+0.331
chief+0.322
Neg logit contributions
 alone-0.202
 freely-0.159
 like-0.152
 late-0.143
 happily-0.142
Token at
Token ID379
Feature activation+1.23e-02

Pos logit contributions
lor+0.077
ocket+0.069
ipop+0.065
cases+0.064
 replied+0.064
Neg logit contributions
 alone-0.036
 freely-0.031
free-0.030
 town-0.030
 youth-0.030
Token him
Token ID683
Feature activation+2.46e-02

Pos logit contributions
 alone+0.000
 like+0.000
 freely+0.000
 happily+0.000
 fast+0.000
Neg logit contributions
lor-0.001
fs-0.001
ocket-0.001
vation-0.001
ves-0.001
Token with
Token ID351
Feature activation+3.49e-01

Pos logit contributions
 alone+0.000
 like+0.000
 freely+0.000
 happily+0.000
 fast+0.000
Neg logit contributions
lor-0.002
fs-0.001
ocket-0.001
ves-0.001
wen-0.001
Token a
Token ID257
Feature activation-1.48e+00

Pos logit contributions
 alone+0.009
 like+0.008
 happily+0.007
 fast+0.007
 freely+0.007
Neg logit contributions
lor-0.021
fs-0.017
ves-0.017
wen-0.016
ocket-0.016
Token mis
Token ID2984
Feature activation+5.11e+00

Pos logit contributions
lor+0.085
fs+0.068
ocket+0.066
wen+0.064
ves+0.064
Neg logit contributions
 alone-0.042
 like-0.034
 happily-0.032
 fast-0.032
 freely-0.032
Tokenchie
Token ID3043
Feature activation+7.64e+00

Pos logit contributions
 alone+0.170
 like+0.136
 happily+0.136
 freely+0.135
 fast+0.130
Neg logit contributions
lor-0.261
ocket-0.205
fs-0.201
ves-0.191
vation-0.189
Tokenvous
Token ID31222
Feature activation+1.22e+00

Pos logit contributions
 alone+0.174
 like+0.130
 freely+0.115
 fast+0.115
 happily+0.115
Neg logit contributions
lor-0.486
fs-0.400
ocket-0.388
chief-0.384
anger-0.377
Token grin
Token ID16770
Feature activation+9.69e-01

Pos logit contributions
 alone+0.048
 like+0.041
 freely+0.040
 happily+0.040
 fast+0.039
Neg logit contributions
lor-0.055
fs-0.042
ocket-0.040
ves-0.039
wen-0.038
Token.
Token ID13
Feature activation+4.80e-01

Pos logit contributions
 alone+0.015
 like+0.012
 happily+0.010
 outside+0.008
 fast+0.008
Neg logit contributions
lor-0.068
fs-0.056
wen-0.055
ocket-0.055
vation-0.055
Token She
Token ID1375
Feature activation+5.96e-01

Pos logit contributions
 alone+0.011
 freely+0.008
 like+0.008
 happily+0.008
 fast+0.007
Neg logit contributions
lor-0.030
fs-0.024
ocket-0.024
ves-0.023
wen-0.023
Token said
Token ID531
Feature activation-7.43e+00

Pos logit contributions
 alone+0.022
 happily+0.019
 like+0.019
 freely+0.018
 fast+0.018
Neg logit contributions
lor-0.029
fs-0.022
ocket-0.021
vation-0.021
anger-0.020
Token
Token ID198
Feature activation-2.96e-02

Pos logit contributions
lor+0.007
fs+0.005
ves+0.005
ocket+0.005
wen+0.005
Neg logit contributions
 alone-0.004
 like-0.003
 freely-0.003
 active-0.003
 happily-0.003
TokenS
Token ID50
Feature activation+3.47e-01

Pos logit contributions
lor+0.002
ves+0.001
fs+0.001
wen+0.001
vation+0.001
Neg logit contributions
 alone-0.001
 freely-0.001
 like-0.001
 outside-0.001
 happily-0.001
Tokenara
Token ID3301
Feature activation+1.69e+00

Pos logit contributions
 alone+0.012
 like+0.010
 happily+0.010
 freely+0.009
 late+0.009
Neg logit contributions
lor-0.017
fs-0.013
ocket-0.013
chief-0.012
ves-0.012
Token felt
Token ID2936
Feature activation+7.40e-01

Pos logit contributions
 alone+0.058
 freely+0.048
 like+0.047
 happily+0.045
 fast+0.044
Neg logit contributions
lor-0.084
fs-0.066
ocket-0.064
ves-0.063
wen-0.061
Token a
Token ID257
Feature activation+2.81e-01

Pos logit contributions
 alone+0.016
 freely+0.015
cy+0.015
ashing+0.015
 wildly+0.014
Neg logit contributions
lor-0.038
ocket-0.033
fs-0.032
chief-0.031
ves-0.031
Token gust
Token ID35253
Feature activation+7.64e+00

Pos logit contributions
 alone+0.009
 freely+0.007
 like+0.007
 late+0.007
ashing+0.007
Neg logit contributions
lor-0.014
fs-0.011
ocket-0.011
ves-0.011
chief-0.010
Token of
Token ID286
Feature activation+4.19e+00

Pos logit contributions
 alone+0.313
 freely+0.277
 active+0.272
 happily+0.265
 youth+0.263
Neg logit contributions
lor-0.287
fs-0.214
ocket-0.211
ves-0.200
ipop-0.194
Token wind
Token ID2344
Feature activation+3.63e+00

Pos logit contributions
 alone+0.117
 freely+0.108
 like+0.101
 happily+0.097
cy+0.092
Neg logit contributions
lor-0.200
fs-0.171
ocket-0.163
ves-0.161
chief-0.160
Token.
Token ID13
Feature activation+9.44e-01

Pos logit contributions
 alone+0.090
 freely+0.076
 active+0.069
 like+0.069
 competitive+0.067
Neg logit contributions
lor-0.200
fs-0.162
ves-0.157
ocket-0.156
chief-0.155
Token She
Token ID1375
Feature activation+1.18e+00

Pos logit contributions
 alone+0.025
 freely+0.021
 active+0.020
 competitive+0.019
 happily+0.018
Neg logit contributions
lor-0.051
anger-0.041
chief-0.041
ocket-0.040
ama-0.040
Token saw
Token ID2497
Feature activation-1.15e+00

Pos logit contributions
 alone+0.039
 like+0.032
 happily+0.031
 freely+0.031
 fast+0.030
Neg logit contributions
lor-0.061
fs-0.048
ocket-0.047
ves-0.045
wen-0.045
Token to
Token ID284
Feature activation-1.25e+00

Pos logit contributions
lor+0.060
fs+0.049
ocket+0.048
ves+0.047
wen+0.047
Neg logit contributions
 alone-0.025
 freely-0.018
 like-0.018
 happily-0.018
 fast-0.017
Token prevent
Token ID2948
Feature activation+1.93e+00

Pos logit contributions
lor+0.066
fs+0.052
ocket+0.051
ves+0.049
wen+0.049
Neg logit contributions
 alone-0.041
 like-0.033
 freely-0.032
 happily-0.032
 fast-0.031
Token the
Token ID262
Feature activation+3.21e-01

Pos logit contributions
 alone+0.053
 freely+0.043
 like+0.041
 happily+0.041
 active+0.039
Neg logit contributions
lor-0.103
fs-0.084
ocket-0.082
ves-0.080
chief-0.080
Token bad
Token ID2089
Feature activation+2.47e+00

Pos logit contributions
 alone+0.011
 like+0.009
 freely+0.009
 fast+0.009
 happily+0.009
Neg logit contributions
lor-0.017
fs-0.013
ocket-0.013
ves-0.012
wen-0.012
Token guys
Token ID3730
Feature activation+1.66e+00

Pos logit contributions
 alone+0.087
 freely+0.072
 like+0.070
 happily+0.070
 fast+0.067
Neg logit contributions
lor-0.120
fs-0.094
ocket-0.091
ves-0.089
chief-0.087
Token (
Token ID357
Feature activation+7.62e+00

Pos logit contributions
 alone+0.040
 freely+0.031
 happily+0.030
 like+0.030
 fast+0.028
Neg logit contributions
lor-0.097
fs-0.080
ocket-0.077
ves-0.076
vation-0.075
Tokenthe
Token ID1169
Feature activation-9.17e-01

Pos logit contributions
 alone+0.259
 freely+0.215
 like+0.214
 happily+0.211
 wildly+0.201
Neg logit contributions
lor-0.371
fs-0.291
ves-0.277
ocket-0.276
wen-0.269
Token toys
Token ID14958
Feature activation-2.17e-01

Pos logit contributions
lor+0.044
fs+0.034
ves+0.033
ocket+0.033
anger+0.032
Neg logit contributions
 alone-0.030
 freely-0.026
 like-0.025
 happily-0.025
 wildly-0.024
Token on
Token ID319
Feature activation+1.31e+00

Pos logit contributions
lor+0.012
fs+0.010
ocket+0.010
ves+0.010
wen+0.009
Neg logit contributions
 alone-0.006
 freely-0.004
 like-0.004
 happily-0.004
 fast-0.004
Token the
Token ID262
Feature activation-1.75e-01

Pos logit contributions
 alone+0.032
 freely+0.025
 like+0.024
 happily+0.023
 fast+0.022
Neg logit contributions
lor-0.077
fs-0.063
ves-0.060
ocket-0.060
wen-0.059
Token floor
Token ID4314
Feature activation+4.65e-01

Pos logit contributions
lor+0.007
fs+0.005
ocket+0.005
ves+0.005
vation+0.005
Neg logit contributions
 alone-0.007
 freely-0.006
 happily-0.006
 like-0.006
 wildly-0.006
Token old
Token ID1468
Feature activation+1.15e+00

Pos logit contributions
lor+0.074
fs+0.059
ocket+0.059
ves+0.055
wen+0.055
Neg logit contributions
 alone-0.051
 like-0.043
 fast-0.041
 freely-0.039
 youth-0.039
Token lady
Token ID10846
Feature activation+1.85e-01

Pos logit contributions
 alone+0.040
 like+0.035
 fast+0.033
 happily+0.032
 freely+0.031
Neg logit contributions
lor-0.058
fs-0.045
ocket-0.045
wen-0.042
vation-0.041
Token with
Token ID351
Feature activation+3.40e-01

Pos logit contributions
 alone+0.005
 like+0.004
 freely+0.004
 happily+0.004
 fast+0.004
Neg logit contributions
lor-0.011
fs-0.009
ocket-0.008
ves-0.008
wen-0.008
Token a
Token ID257
Feature activation-1.46e+00

Pos logit contributions
 alone+0.008
 like+0.006
 fast+0.005
 different+0.005
 hate+0.005
Neg logit contributions
lor-0.022
fs-0.018
ves-0.017
ocket-0.017
wen-0.017
Token mis
Token ID2984
Feature activation+4.99e+00

Pos logit contributions
lor+0.082
fs+0.065
ocket+0.063
ves+0.062
wen+0.061
Neg logit contributions
 alone-0.043
 like-0.036
 fast-0.035
 happily-0.033
 freely-0.033
Tokenchie
Token ID3043
Feature activation+7.58e+00

Pos logit contributions
 alone+0.168
 happily+0.138
 like+0.137
 freely+0.136
 late+0.129
Neg logit contributions
lor-0.252
ocket-0.194
fs-0.190
vation-0.182
wen-0.180
Tokenvous
Token ID31222
Feature activation+1.03e+00

Pos logit contributions
 alone+0.177
 like+0.133
 fast+0.120
 freely+0.120
 happily+0.117
Neg logit contributions
lor-0.475
fs-0.388
ocket-0.382
chief-0.373
ves-0.367
Token tw
Token ID665
Feature activation+1.38e+00

Pos logit contributions
 alone+0.038
 like+0.033
 freely+0.031
 happily+0.031
 fast+0.031
Neg logit contributions
lor-0.050
fs-0.038
ocket-0.037
ves-0.036
wen-0.035
Tokeninkle
Token ID19894
Feature activation-1.11e+00

Pos logit contributions
 alone+0.064
 happily+0.056
 freely+0.055
 like+0.055
 fast+0.054
Neg logit contributions
lor-0.050
fs-0.037
cases-0.036
izza-0.035
ocket-0.034
Token in
Token ID287
Feature activation+3.83e+00

Pos logit contributions
lor+0.068
fs+0.056
ocket+0.055
ves+0.054
wen+0.053
Neg logit contributions
 alone-0.025
 like-0.018
 freely-0.018
 happily-0.018
 fast-0.017
Token her
Token ID607
Feature activation+9.25e-01

Pos logit contributions
 alone+0.079
 like+0.058
 fast+0.050
 freely+0.047
 happily+0.047
Neg logit contributions
lor-0.252
fs-0.211
ocket-0.207
ves-0.204
wen-0.202
Token,
Token ID11
Feature activation+9.24e-01

Pos logit contributions
lor+0.132
fs+0.107
ocket+0.104
ves+0.104
wen+0.101
Neg logit contributions
 alone-0.060
 freely-0.047
 like-0.045
 happily-0.043
 fast-0.042
Token there
Token ID612
Feature activation+1.55e-02

Pos logit contributions
 alone+0.024
 like+0.018
 freely+0.018
 happily+0.017
 fast+0.017
Neg logit contributions
lor-0.055
fs-0.044
ocket-0.043
ves-0.042
wen-0.042
Token was
Token ID373
Feature activation-3.44e-01

Pos logit contributions
 alone+0.000
 like+0.000
 happily+0.000
 freely+0.000
 fast+0.000
Neg logit contributions
lor-0.001
fs-0.001
ocket-0.001
vation-0.001
wen-0.001
Token a
Token ID257
Feature activation-1.16e+00

Pos logit contributions
lor+0.020
fs+0.016
ocket+0.016
ves+0.016
wen+0.015
Neg logit contributions
 alone-0.009
 like-0.007
 freely-0.007
 happily-0.007
 fast-0.006
Token loud
Token ID7812
Feature activation+2.02e+00

Pos logit contributions
lor+0.065
fs+0.052
ocket+0.051
ves+0.050
wen+0.049
Neg logit contributions
 alone-0.033
 freely-0.026
 like-0.026
 happily-0.025
 fast-0.024
Token *
Token ID1635
Feature activation+7.57e+00

Pos logit contributions
 alone+0.073
 freely+0.061
 like+0.059
 happily+0.058
 fast+0.056
Neg logit contributions
lor-0.094
fs-0.073
ves-0.071
ocket-0.070
chief-0.069
Tokenbe
Token ID1350
Feature activation+2.77e+00

Pos logit contributions
 alone+0.273
 freely+0.226
 like+0.225
 happily+0.219
 fast+0.215
Neg logit contributions
lor-0.364
fs-0.282
ocket-0.276
ves-0.268
wen-0.263
Tokenep
Token ID538
Feature activation-5.51e-01

Pos logit contributions
 alone+0.087
 like+0.069
 freely+0.068
 happily+0.067
 fast+0.065
Neg logit contributions
lor-0.149
fs-0.118
ocket-0.116
ves-0.111
wen-0.110
Token*
Token ID9
Feature activation+2.86e+00

Pos logit contributions
lor+0.030
fs+0.024
ocket+0.024
ves+0.023
wen+0.023
Neg logit contributions
 alone-0.017
 like-0.014
 happily-0.013
 freely-0.013
 fast-0.012
Token from
Token ID422
Feature activation-8.33e-01

Pos logit contributions
 alone+0.080
 like+0.063
 freely+0.060
 happily+0.060
 fast+0.060
Neg logit contributions
lor-0.165
fs-0.133
ocket-0.133
ves-0.127
wen-0.126
Token the
Token ID262
Feature activation-4.10e-01

Pos logit contributions
lor+0.043
ocket+0.036
fs+0.036
chief+0.035
 address+0.035
Neg logit contributions
 alone-0.020
 freely-0.017
ashing-0.016
 like-0.016
 active-0.015
Token small
Token ID1402
Feature activation+2.28e+00

Pos logit contributions
lor+0.100
fs+0.079
ocket+0.077
ves+0.075
wen+0.074
Neg logit contributions
 alone-0.061
 freely-0.050
 like-0.048
 happily-0.047
 late-0.045
Token boy
Token ID2933
Feature activation+1.41e+00

Pos logit contributions
 alone+0.083
 freely+0.074
 late+0.069
 happily+0.069
ashing+0.068
Neg logit contributions
lor-0.096
ocket-0.074
chief-0.072
 passport-0.071
 message-0.071
Token with
Token ID351
Feature activation+5.46e-01

Pos logit contributions
 alone+0.035
 freely+0.028
 like+0.026
 happily+0.025
 fast+0.025
Neg logit contributions
lor-0.082
fs-0.066
ocket-0.065
ves-0.064
chief-0.063
Token a
Token ID257
Feature activation-1.81e+00

Pos logit contributions
 alone+0.014
 freely+0.011
 like+0.010
 happily+0.010
 fast+0.010
Neg logit contributions
lor-0.032
fs-0.026
ocket-0.026
ves-0.025
wen-0.025
Token mis
Token ID2984
Feature activation+4.48e+00

Pos logit contributions
lor+0.097
fs+0.077
ocket+0.075
ves+0.073
wen+0.072
Neg logit contributions
 alone-0.056
 freely-0.044
 like-0.044
 happily-0.043
 fast-0.042
Tokenchie
Token ID3043
Feature activation+7.35e+00

Pos logit contributions
 alone+0.151
 happily+0.124
 like+0.124
 freely+0.123
 late+0.119
Neg logit contributions
lor-0.223
ocket-0.175
fs-0.167
wen-0.162
vation-0.158
Tokenvous
Token ID31222
Feature activation+7.10e-01

Pos logit contributions
 alone+0.173
 like+0.127
 freely+0.119
 happily+0.115
 fast+0.115
Neg logit contributions
lor-0.458
fs-0.375
ocket-0.366
chief-0.358
ves-0.356
Token tw
Token ID665
Feature activation+1.06e+00

Pos logit contributions
 alone+0.028
 freely+0.025
 like+0.023
 happily+0.023
 late+0.023
Neg logit contributions
lor-0.030
ocket-0.023
fs-0.022
chief-0.022
ves-0.021
Tokeninkle
Token ID19894
Feature activation-1.61e+00

Pos logit contributions
 alone+0.051
 freely+0.044
 happily+0.044
 like+0.044
 fast+0.043
Neg logit contributions
lor-0.038
fs-0.027
ocket-0.025
vation-0.024
cases-0.024
Token in
Token ID287
Feature activation+3.38e+00

Pos logit contributions
lor+0.102
fs+0.085
ocket+0.083
ves+0.081
wen+0.080
Neg logit contributions
 alone-0.033
 freely-0.023
 like-0.022
 happily-0.021
 fast-0.020
Token his
Token ID465
Feature activation-1.90e-01

Pos logit contributions
 alone+0.065
 like+0.044
 freely+0.041
 happily+0.040
 fast+0.039
Neg logit contributions
lor-0.227
fs-0.190
ocket-0.185
ves-0.182
wen-0.180
Token like
Token ID588
Feature activation+1.48e+00

Pos logit contributions
lor+0.205
fs+0.168
ocket+0.166
ves+0.161
wen+0.161
Neg logit contributions
 alone-0.075
 like-0.058
 freely-0.054
 happily-0.053
 fast-0.051
Token
Token ID6184
Feature activation-7.67e-01

Pos logit contributions
 alone+0.037
 like+0.030
 fast+0.027
 happily+0.027
 freely+0.026
Neg logit contributions
lor-0.090
fs-0.073
ocket-0.071
vation-0.069
wen-0.069
Token
Token ID95
Feature activation+1.18e+00

Pos logit contributions
lor+0.033
fs+0.024
ocket+0.023
ves+0.023
wen+0.022
Neg logit contributions
 alone-0.033
 freely-0.027
 like-0.027
 happily-0.027
 fast-0.026
Token
Token ID26391
Feature activation-4.26e-01

Pos logit contributions
 alone+0.047
 freely+0.040
 like+0.040
 happily+0.039
 fast+0.038
Neg logit contributions
lor-0.053
fs-0.039
ocket-0.038
ves-0.037
wen-0.037
Token
Token ID129
Feature activation+9.64e-01

Pos logit contributions
lor+0.019
fs+0.014
ocket+0.014
ves+0.013
wen+0.013
Neg logit contributions
 alone-0.017
 like-0.014
 freely-0.014
 happily-0.014
 fast-0.014
Token
Token ID241
Feature activation+7.35e+00

Pos logit contributions
 alone+0.035
 freely+0.029
 happily+0.028
 like+0.028
 fast+0.027
Neg logit contributions
lor-0.046
fs-0.036
ves-0.035
ocket-0.034
wen-0.034
Tokenhappy
Token ID34191
Feature activation-1.23e+00

Pos logit contributions
 alone+0.211
 freely+0.161
 happily+0.154
 like+0.152
 fast+0.146
Neg logit contributions
lor-0.411
fs-0.329
ocket-0.320
ves-0.317
wen-0.314
Tokenâ
Token ID22940
Feature activation+1.10e+00

Pos logit contributions
lor+0.054
fs+0.043
ocket+0.042
wen+0.040
cases+0.039
Neg logit contributions
 alone-0.048
 like-0.042
 happily-0.040
 freely-0.040
 fast-0.039
Token
Token ID26391
Feature activation-2.48e-01

Pos logit contributions
 alone+0.044
 like+0.038
 freely+0.037
 happily+0.036
 fast+0.035
Neg logit contributions
lor-0.049
fs-0.037
ocket-0.037
ves-0.035
wen-0.034
Token and
Token ID290
Feature activation-1.03e+00

Pos logit contributions
lor+0.011
fs+0.009
ocket+0.009
ves+0.008
wen+0.008
Neg logit contributions
 alone-0.009
 like-0.008
 freely-0.008
 happily-0.008
 fast-0.008
Token
Token ID6184
Feature activation-1.46e+00

Pos logit contributions
lor+0.058
fs+0.046
ocket+0.045
ves+0.044
wen+0.044
Neg logit contributions
 alone-0.032
 like-0.026
 happily-0.024
 fast-0.024
 freely-0.024
Token so
Token ID523
Feature activation-1.77e-01

Pos logit contributions
lor+0.003
fs+0.002
ocket+0.002
ves+0.002
chief+0.002
Neg logit contributions
 alone-0.001
 freely-0.001
 like-0.001
 happily-0.001
 fast-0.001
Token that
Token ID326
Feature activation-1.20e+00

Pos logit contributions
lor+0.008
fs+0.006
ocket+0.006
ves+0.006
wen+0.006
Neg logit contributions
 alone-0.007
 freely-0.005
 like-0.005
 happily-0.005
 fast-0.005
Token no
Token ID645
Feature activation+7.18e-01

Pos logit contributions
lor+0.065
fs+0.052
ocket+0.051
anger+0.049
wen+0.048
Neg logit contributions
 alone-0.039
 like-0.032
 happily-0.031
 freely-0.031
 fast-0.030
Token more
Token ID517
Feature activation+4.57e+00

Pos logit contributions
 alone+0.021
 freely+0.017
 like+0.017
 happily+0.016
 fast+0.016
Neg logit contributions
lor-0.039
fs-0.031
ocket-0.031
ves-0.030
wen-0.029
Token mis
Token ID2984
Feature activation+5.38e+00

Pos logit contributions
 alone+0.138
 freely+0.111
 like+0.109
 happily+0.106
 fast+0.103
Neg logit contributions
lor-0.246
fs-0.196
ocket-0.192
ves-0.187
wen-0.183
Tokenchie
Token ID3043
Feature activation+7.34e+00

Pos logit contributions
 alone+0.176
 like+0.149
 happily+0.146
 freely+0.146
 fast+0.141
Neg logit contributions
lor-0.268
ocket-0.202
fs-0.202
vation-0.196
ves-0.191
Tokenvous
Token ID31222
Feature activation+1.02e+00

Pos logit contributions
 alone+0.189
 like+0.142
 freely+0.137
 happily+0.133
 fast+0.130
Neg logit contributions
lor-0.438
fs-0.358
ocket-0.350
chief-0.339
ves-0.339
Token mice
Token ID10693
Feature activation+2.22e+00

Pos logit contributions
 alone+0.034
 freely+0.028
 like+0.028
 happily+0.027
 fast+0.026
Neg logit contributions
lor-0.052
fs-0.041
ocket-0.040
ves-0.039
wen-0.038
Token could
Token ID714
Feature activation-1.22e+00

Pos logit contributions
 alone+0.071
 like+0.057
 freely+0.056
 happily+0.055
 fast+0.053
Neg logit contributions
lor-0.117
fs-0.093
ocket-0.090
ves-0.088
wen-0.087
Token come
Token ID1282
Feature activation+1.67e+00

Pos logit contributions
lor+0.062
fs+0.049
ocket+0.048
ves+0.046
wen+0.046
Neg logit contributions
 alone-0.042
 like-0.034
 freely-0.034
 happily-0.033
 fast-0.032
Token in
Token ID287
Feature activation+3.86e+00

Pos logit contributions
 alone+0.042
 freely+0.032
 happily+0.031
 youth+0.030
 wildly+0.030
Neg logit contributions
lor-0.093
fs-0.073
ocket-0.072
ves-0.070
cases-0.069
Token jam
Token ID16853
Feature activation-5.77e+00

Pos logit contributions
lor+0.114
fs+0.094
ocket+0.092
ves+0.090
wen+0.089
Neg logit contributions
 alone-0.042
 freely-0.030
 like-0.030
 happily-0.029
 fast-0.027
Token to
Token ID284
Feature activation-1.64e+00

Pos logit contributions
lor+0.363
fs+0.293
wen+0.288
ocket+0.285
vation+0.285
Neg logit contributions
 alone-0.131
 like-0.108
 happily-0.096
 freely-0.094
 fast-0.091
Token the
Token ID262
Feature activation-1.54e+00

Pos logit contributions
lor+0.105
fs+0.087
ocket+0.085
ves+0.084
wen+0.083
Neg logit contributions
 alone-0.034
 freely-0.023
 like-0.023
 happily-0.022
 fast-0.021
Token kids
Token ID3988
Feature activation+1.98e+00

Pos logit contributions
lor+0.087
fs+0.070
ocket+0.069
ves+0.067
wen+0.066
Neg logit contributions
 alone-0.042
 freely-0.033
 like-0.033
 happily-0.032
 fast-0.030
Token who
Token ID508
Feature activation-1.14e+00

Pos logit contributions
 alone+0.045
 freely+0.036
 late+0.031
 wildly+0.030
 happily+0.030
Neg logit contributions
lor-0.116
fs-0.096
ocket-0.095
ves-0.093
ipop-0.091
Token say
Token ID910
Feature activation-8.92e+00

Pos logit contributions
lor+0.058
fs+0.045
ocket+0.044
ves+0.043
wen+0.042
Neg logit contributions
 alone-0.039
 freely-0.032
 like-0.032
 happily-0.031
 fast-0.030
Token yes
Token ID3763
Feature activation-3.08e+00

Pos logit contributions
lor+0.488
fs+0.375
wen+0.371
vation+0.370
ocket+0.363
Neg logit contributions
 alone-0.271
 like-0.247
 happily-0.218
 fast-0.217
 freely-0.204
Token.
Token ID13
Feature activation+3.25e-01

Pos logit contributions
lor+0.194
fs+0.160
ocket+0.157
ves+0.155
wen+0.152
Neg logit contributions
 alone-0.066
 freely-0.047
 like-0.045
 happily-0.044
 fast-0.041
Token They
Token ID1119
Feature activation+9.30e-01

Pos logit contributions
 alone+0.008
 freely+0.006
 active+0.006
 like+0.006
 hate+0.005
Neg logit contributions
lor-0.018
fs-0.015
ves-0.015
ocket-0.014
cases-0.014
Token open
Token ID1280
Feature activation-4.29e-01

Pos logit contributions
 alone+0.034
 like+0.029
 freely+0.028
 happily+0.028
 fast+0.027
Neg logit contributions
lor-0.045
fs-0.035
ocket-0.034
ves-0.033
wen-0.033
Token the
Token ID262
Feature activation-2.10e+00

Pos logit contributions
lor+0.028
fs+0.023
ocket+0.023
ves+0.022
wen+0.022
Neg logit contributions
 alone-0.008
 like-0.006
 freely-0.006
 happily-0.005
 fast-0.005
Token.
Token ID13
Feature activation+4.83e-01

Pos logit contributions
lor+0.175
fs+0.143
ocket+0.142
wen+0.139
ves+0.137
Neg logit contributions
 alone-0.048
 like-0.036
 happily-0.032
 freely-0.031
 fast-0.030
Token But
Token ID887
Feature activation-3.61e+00

Pos logit contributions
 alone+0.016
 freely+0.013
 like+0.012
 active+0.012
 fast+0.011
Neg logit contributions
lor-0.024
fs-0.019
ves-0.019
chief-0.019
ipop-0.018
Token Mom
Token ID11254
Feature activation-1.38e+00

Pos logit contributions
lor+0.197
fs+0.157
ocket+0.153
ves+0.150
wen+0.147
Neg logit contributions
 alone-0.109
 freely-0.086
 like-0.085
 happily-0.083
 fast-0.080
Token and
Token ID290
Feature activation-2.57e+00

Pos logit contributions
lor+0.092
fs+0.074
ocket+0.074
vation+0.072
ves+0.071
Neg logit contributions
 alone-0.028
 like-0.022
 happily-0.018
 outside-0.018
 fast-0.017
Token Dad
Token ID17415
Feature activation-1.35e+00

Pos logit contributions
lor+0.124
fs+0.095
ocket+0.091
ves+0.090
wen+0.087
Neg logit contributions
 alone-0.097
 like-0.084
 happily-0.080
 freely-0.079
 fast-0.078
Token say
Token ID910
Feature activation-8.81e+00

Pos logit contributions
lor+0.069
fs+0.052
ocket+0.051
ves+0.049
wen+0.049
Neg logit contributions
 alone-0.049
 like-0.042
 happily-0.039
 freely-0.039
 fast-0.038
Token no
Token ID645
Feature activation-3.93e-01

Pos logit contributions
lor+0.456
fs+0.349
ocket+0.345
wen+0.338
vation+0.335
Neg logit contributions
 alone-0.295
 like-0.253
 happily-0.236
 fast-0.234
 freely-0.230
Token.
Token ID13
Feature activation-1.19e-01

Pos logit contributions
lor+0.020
fs+0.017
ves+0.017
cases+0.016
ipop+0.016
Neg logit contributions
 alone-0.011
 freely-0.009
 happily-0.008
 active-0.008
 wildly-0.008
Token They
Token ID1119
Feature activation+8.70e-01

Pos logit contributions
lor+0.006
fs+0.005
chief+0.005
ama+0.005
ves+0.005
Neg logit contributions
 alone-0.004
 freely-0.003
 active-0.003
 happily-0.003
 like-0.002
Token say
Token ID910
Feature activation-8.32e+00

Pos logit contributions
 alone+0.027
 like+0.023
 freely+0.021
 happily+0.021
 fast+0.021
Neg logit contributions
lor-0.048
fs-0.037
ocket-0.036
ves-0.035
wen-0.035
Token the
Token ID262
Feature activation-2.08e+00

Pos logit contributions
lor+0.425
fs+0.333
ocket+0.325
ves+0.316
wen+0.311
Neg logit contributions
 alone-0.280
 freely-0.227
 like-0.227
 happily-0.221
 fast-0.214
Token.
Token ID13
Feature activation+2.67e-01

Pos logit contributions
lor+0.345
chief+0.282
wen+0.281
fs+0.281
ocket+0.277
Neg logit contributions
 alone-0.086
 like-0.064
 outside-0.058
 happily-0.056
 fast-0.052
Token But
Token ID887
Feature activation-3.81e+00

Pos logit contributions
 alone+0.009
 freely+0.007
 like+0.007
 happily+0.007
 fast+0.007
Neg logit contributions
lor-0.013
fs-0.011
ocket-0.010
ves-0.010
anger-0.010
Token Mom
Token ID11254
Feature activation-1.45e+00

Pos logit contributions
lor+0.217
fs+0.171
ocket+0.168
ves+0.163
wen+0.162
Neg logit contributions
 alone-0.112
 like-0.089
 freely-0.086
 happily-0.085
 fast-0.082
Token and
Token ID290
Feature activation-2.84e+00

Pos logit contributions
lor+0.103
fs+0.085
ocket+0.084
vation+0.082
ves+0.081
Neg logit contributions
 alone-0.022
 like-0.016
 outside-0.013
 happily-0.013
 fast-0.011
Token Dad
Token ID17415
Feature activation-1.47e+00

Pos logit contributions
lor+0.137
fs+0.104
ocket+0.100
ves+0.099
wen+0.096
Neg logit contributions
 alone-0.108
 like-0.093
 happily-0.088
 freely-0.087
 fast-0.087
Token say
Token ID910
Feature activation-8.80e+00

Pos logit contributions
lor+0.081
fs+0.061
ocket+0.059
ves+0.058
wen+0.058
Neg logit contributions
 alone-0.049
 like-0.042
 happily-0.038
 fast-0.037
 freely-0.037
Token,
Token ID11
Feature activation-2.58e-02

Pos logit contributions
lor+0.473
fs+0.363
ocket+0.361
wen+0.351
vation+0.350
Neg logit contributions
 alone-0.282
 like-0.242
 fast-0.224
 happily-0.220
 freely-0.216
Token "
Token ID366
Feature activation+3.66e+00

Pos logit contributions
lor+0.001
fs+0.001
ocket+0.001
wen+0.001
vation+0.001
Neg logit contributions
 alone-0.001
 like-0.001
 freely-0.001
 fast-0.001
 happily-0.001
TokenWait
Token ID21321
Feature activation-2.26e+00

Pos logit contributions
 alone+0.104
 freely+0.080
 like+0.079
 happily+0.079
 fast+0.072
Neg logit contributions
lor-0.206
fs-0.168
ves-0.160
ocket-0.159
vation-0.157
Token!
Token ID0
Feature activation-3.81e-01

Pos logit contributions
lor+0.125
fs+0.102
ocket+0.098
ves+0.096
wen+0.095
Neg logit contributions
 alone-0.063
 freely-0.049
 happily-0.047
 like-0.046
 active-0.044
Token You
Token ID921
Feature activation-8.58e-01

Pos logit contributions
lor+0.020
fs+0.016
ocket+0.016
ves+0.016
wen+0.015
Neg logit contributions
 alone-0.012
 like-0.009
 freely-0.009
 happily-0.009
 fast-0.009
Token They
Token ID1119
Feature activation+1.01e+00

Pos logit contributions
 alone+0.007
 freely+0.006
 active+0.006
 like+0.005
 than+0.005
Neg logit contributions
lor-0.011
ves-0.009
fs-0.009
ama-0.009
anger-0.009
Token stir
Token ID11240
Feature activation-7.94e-01

Pos logit contributions
 alone+0.034
 freely+0.029
 happily+0.027
 late+0.027
 fast+0.026
Neg logit contributions
lor-0.049
ocket-0.039
fs-0.039
ves-0.037
wen-0.036
Token and
Token ID290
Feature activation-1.88e+00

Pos logit contributions
lor+0.045
fs+0.038
ves+0.038
ocket+0.036
chief+0.035
Neg logit contributions
 alone-0.021
 freely-0.015
 happily-0.014
 youth-0.014
 hate-0.014
Token taste
Token ID6938
Feature activation-2.24e+00

Pos logit contributions
lor+0.082
fs+0.064
ocket+0.062
ves+0.061
chief+0.059
Neg logit contributions
 alone-0.073
 freely-0.061
 happily-0.059
 like-0.058
 fast-0.057
Token and
Token ID290
Feature activation-1.88e+00

Pos logit contributions
lor+0.115
fs+0.098
ves+0.096
ocket+0.091
 story+0.087
Neg logit contributions
 alone-0.068
 freely-0.052
 youth-0.046
 hate-0.046
ashing-0.046
Token say
Token ID910
Feature activation-8.70e+00

Pos logit contributions
lor+0.089
fs+0.070
ocket+0.069
ves+0.067
chief+0.066
Neg logit contributions
 alone-0.066
 freely-0.055
 happily-0.052
 like-0.051
 fast-0.051
Token y
Token ID331
Feature activation-1.28e+00

Pos logit contributions
lor+0.509
fs+0.410
ocket+0.402
wen+0.390
ves+0.388
Neg logit contributions
 alone-0.231
 like-0.180
 freely-0.175
 happily-0.171
 fast-0.166
Tokenum
Token ID388
Feature activation-7.85e-01

Pos logit contributions
lor+0.045
fs+0.030
ocket+0.029
ves+0.028
wen+0.027
Neg logit contributions
 alone-0.064
 freely-0.055
 like-0.055
 happily-0.054
 fast-0.053
Token.
Token ID13
Feature activation+5.24e-02

Pos logit contributions
lor+0.049
fs+0.040
vation+0.039
ocket+0.039
chief+0.038
Neg logit contributions
 alone-0.017
 like-0.015
 happily-0.013
 fast-0.012
 freely-0.012
Token
Token ID198
Feature activation-7.21e-01

Pos logit contributions
 alone+0.002
 freely+0.001
 active+0.001
 like+0.001
 hate+0.001
Neg logit contributions
lor-0.003
ves-0.002
fs-0.002
ama-0.002
chief-0.002
Token
Token ID198
Feature activation-5.35e-01

Pos logit contributions
lor+0.036
ves+0.029
fs+0.029
wen+0.027
ama+0.026
Neg logit contributions
 alone-0.024
 freely-0.019
 like-0.019
 happily-0.018
 active-0.018
Token.
Token ID13
Feature activation+9.19e-01

Pos logit contributions
lor+0.023
fs+0.019
ocket+0.019
wen+0.018
ves+0.018
Neg logit contributions
 alone-0.007
 like-0.005
 freely-0.005
 happily-0.005
 fast-0.004
Token But
Token ID887
Feature activation-3.66e+00

Pos logit contributions
 alone+0.026
 freely+0.020
 active+0.019
 happily+0.018
 like+0.017
Neg logit contributions
lor-0.051
fs-0.040
ocket-0.038
anger-0.038
ama-0.038
Token Anna
Token ID11735
Feature activation+1.90e-01

Pos logit contributions
lor+0.209
fs+0.165
ocket+0.162
ves+0.158
wen+0.157
Neg logit contributions
 alone-0.107
 like-0.084
 freely-0.082
 happily-0.081
 fast-0.078
Token and
Token ID290
Feature activation-1.54e+00

Pos logit contributions
 alone+0.008
 freely+0.007
 late+0.007
 wildly+0.006
 happily+0.006
Neg logit contributions
lor-0.008
fs-0.006
ocket-0.006
ves-0.006
chief-0.005
Token Ben
Token ID3932
Feature activation-2.33e+00

Pos logit contributions
lor+0.087
fs+0.069
ocket+0.067
ves+0.066
wen+0.065
Neg logit contributions
 alone-0.046
 like-0.036
 freely-0.035
 happily-0.035
 fast-0.034
Token say
Token ID910
Feature activation-8.69e+00

Pos logit contributions
lor+0.096
fs+0.075
ocket+0.073
ves+0.072
chief+0.070
Neg logit contributions
 alone-0.088
 freely-0.079
 late-0.077
 active-0.073
 happily-0.073
Token no
Token ID645
Feature activation-2.70e-01

Pos logit contributions
lor+0.449
fs+0.354
ocket+0.344
ves+0.339
wen+0.328
Neg logit contributions
 alone-0.288
 freely-0.235
 like-0.230
 happily-0.225
 fast-0.217
Token.
Token ID13
Feature activation+4.91e-01

Pos logit contributions
lor+0.014
fs+0.011
ves+0.011
 story+0.011
ipop+0.011
Neg logit contributions
 alone-0.008
 freely-0.007
 happily-0.006
 active-0.006
 late-0.005
Token They
Token ID1119
Feature activation+1.10e+00

Pos logit contributions
 alone+0.013
 freely+0.011
 active+0.010
 hate+0.009
 happily+0.009
Neg logit contributions
lor-0.026
ves-0.021
fs-0.021
cases-0.020
izza-0.020
Token say
Token ID910
Feature activation-8.15e+00

Pos logit contributions
 alone+0.039
 freely+0.033
 late+0.032
 happily+0.031
 fast+0.030
Neg logit contributions
lor-0.050
fs-0.040
ocket-0.039
ves-0.038
chief-0.038
Token they
Token ID484
Feature activation-1.31e-01

Pos logit contributions
lor+0.419
fs+0.332
ocket+0.322
ves+0.319
chief+0.306
Neg logit contributions
 alone-0.268
 freely-0.220
 like-0.213
 happily-0.210
 fast-0.201
Token to
Token ID284
Feature activation-2.54e+00

Pos logit contributions
lor+0.111
fs+0.092
ocket+0.088
vation+0.087
ves+0.087
Neg logit contributions
 alone-0.044
 like-0.035
 happily-0.032
 freely-0.031
 fast-0.031
Token open
Token ID1280
Feature activation-8.65e-01

Pos logit contributions
lor+0.138
fs+0.111
ocket+0.106
ves+0.104
cases+0.103
Neg logit contributions
 alone-0.083
 like-0.072
 happily-0.065
 fast-0.065
 outside-0.063
Token her
Token ID607
Feature activation-1.44e+00

Pos logit contributions
lor+0.053
fs+0.043
ocket+0.043
ves+0.042
wen+0.041
Neg logit contributions
 alone-0.020
 freely-0.015
 like-0.015
 happily-0.014
 fast-0.013
Token mouth
Token ID5422
Feature activation-4.37e+00

Pos logit contributions
lor+0.063
fs+0.048
ves+0.046
wen+0.045
vation+0.045
Neg logit contributions
 alone-0.062
 like-0.053
 outside-0.051
 happily-0.051
 fast-0.050
Token and
Token ID290
Feature activation-1.97e+00

Pos logit contributions
lor+0.285
fs+0.236
ocket+0.232
ves+0.228
wen+0.225
Neg logit contributions
 alone-0.089
 like-0.064
 freely-0.059
 happily-0.057
 fast-0.055
Token say
Token ID910
Feature activation-8.69e+00

Pos logit contributions
lor+0.108
fs+0.083
ocket+0.080
anger+0.078
wen+0.078
Neg logit contributions
 alone-0.066
 like-0.057
 happily-0.054
 fast-0.054
 outside-0.051
Token
Token ID6184
Feature activation-2.24e+00

Pos logit contributions
lor+0.490
fs+0.393
ocket+0.388
vation+0.378
wen+0.376
Neg logit contributions
 alone-0.245
 like-0.201
 happily-0.186
 fast-0.185
 freely-0.182
Token
Token ID95
Feature activation+3.63e-01

Pos logit contributions
lor+0.093
fs+0.065
ocket+0.064
chief+0.063
ves+0.062
Neg logit contributions
 alone-0.098
 freely-0.084
 happily-0.083
 like-0.082
 late-0.080
Token
Token ID26391
Feature activation-1.18e+00

Pos logit contributions
 alone+0.014
 freely+0.012
 happily+0.012
 like+0.012
 fast+0.012
Neg logit contributions
lor-0.017
fs-0.012
ocket-0.012
ves-0.011
wen-0.011
Token˜
Token ID41185
Feature activation-4.65e-01

Pos logit contributions
lor+0.053
fs+0.043
ocket+0.043
vation+0.039
anger+0.039
Neg logit contributions
 alone-0.043
 like-0.037
 fast-0.035
 happily-0.035
 freely-0.035
Tokenah
Token ID993
Feature activation-1.73e+00

Pos logit contributions
lor+0.028
fs+0.023
ocket+0.022
ves+0.022
vation+0.021
Neg logit contributions
 alone-0.012
 like-0.009
 freely-0.008
 happily-0.008
 fast-0.008
Token They
Token ID1119
Feature activation+1.16e+00

Pos logit contributions
 alone+0.006
 freely+0.005
 active+0.004
 happily+0.004
free+0.004
Neg logit contributions
lor-0.014
fs-0.012
cases-0.012
ves-0.012
chief-0.011
Token run
Token ID1057
Feature activation+1.81e+00

Pos logit contributions
 alone+0.041
 freely+0.034
 like+0.033
 happily+0.033
 fast+0.032
Neg logit contributions
lor-0.056
fs-0.044
ocket-0.043
ves-0.042
wen-0.041
Token to
Token ID284
Feature activation-1.40e-01

Pos logit contributions
Acc+0.051
 active+0.049
 alone+0.048
 youth+0.044
 lonely+0.042
Neg logit contributions
lor-0.071
cases-0.066
anger-0.063
 ticket-0.062
fs-0.061
Token Ben
Token ID3932
Feature activation-1.07e+00

Pos logit contributions
lor+0.008
fs+0.007
ocket+0.007
ves+0.006
chief+0.006
Neg logit contributions
 alone-0.003
 freely-0.003
 like-0.002
 active-0.002
 late-0.002
Token and
Token ID290
Feature activation-1.43e+00

Pos logit contributions
lor+0.058
fs+0.046
ocket+0.045
ves+0.044
chief+0.043
Neg logit contributions
 alone-0.032
 freely-0.026
 like-0.025
 happily-0.024
 fast-0.024
Token say
Token ID910
Feature activation-8.69e+00

Pos logit contributions
lor+0.068
fs+0.052
ocket+0.050
ves+0.049
wen+0.048
Neg logit contributions
 alone-0.055
 like-0.045
 freely-0.045
 happily-0.044
 fast-0.043
Token,
Token ID11
Feature activation+1.35e-01

Pos logit contributions
lor+0.480
fs+0.381
ocket+0.374
ves+0.362
wen+0.360
Neg logit contributions
 alone-0.261
 like-0.206
 freely-0.204
 happily-0.199
 fast-0.193
Token "
Token ID366
Feature activation+3.95e+00

Pos logit contributions
 alone+0.004
 freely+0.004
 like+0.004
 happily+0.004
 fast+0.003
Neg logit contributions
lor-0.007
fs-0.005
ocket-0.005
ves-0.005
chief-0.005
TokenGive
Token ID23318
Feature activation-2.63e+00

Pos logit contributions
 alone+0.111
 freely+0.087
 happily+0.084
 like+0.084
 fast+0.079
Neg logit contributions
lor-0.223
fs-0.180
ves-0.172
ocket-0.172
vation-0.170
Token us
Token ID514
Feature activation+1.92e+00

Pos logit contributions
lor+0.147
fs+0.117
ocket+0.115
ves+0.112
wen+0.111
Neg logit contributions
 alone-0.076
 freely-0.060
 like-0.059
 happily-0.057
 fast-0.055
Token our
Token ID674
Feature activation-9.61e-01

Pos logit contributions
 alone+0.045
 freely+0.035
 happily+0.033
 like+0.032
 wildly+0.029
Neg logit contributions
lor-0.115
fs-0.094
ocket-0.091
ves-0.091
wen-0.089
Token.
Token ID13
Feature activation+3.03e-01

Pos logit contributions
lor+0.003
fs+0.002
ocket+0.002
ves+0.002
wen+0.002
Neg logit contributions
 alone-0.001
 like-0.001
 freely-0.001
 happily-0.001
 fast-0.001
Token Their
Token ID5334
Feature activation-2.51e+00

Pos logit contributions
 alone+0.010
 freely+0.008
 like+0.008
 active+0.008
free+0.008
Neg logit contributions
lor-0.015
fs-0.011
ves-0.011
izza-0.011
 story-0.011
Token mom
Token ID1995
Feature activation-7.45e-01

Pos logit contributions
lor+0.098
fs+0.071
ocket+0.068
ves+0.066
wen+0.064
Neg logit contributions
 alone-0.113
 freely-0.098
 like-0.098
 happily-0.095
 fast-0.093
Token and
Token ID290
Feature activation-2.63e+00

Pos logit contributions
lor+0.035
fs+0.026
ocket+0.026
ves+0.025
wen+0.025
Neg logit contributions
 alone-0.029
 like-0.024
 freely-0.024
 happily-0.023
 fast-0.023
Token dad
Token ID9955
Feature activation-1.46e+00

Pos logit contributions
lor+0.099
fs+0.070
ocket+0.068
ves+0.065
wen+0.063
Neg logit contributions
 alone-0.124
 freely-0.107
 like-0.106
 happily-0.105
 fast-0.103
Token say
Token ID910
Feature activation-8.67e+00

Pos logit contributions
lor+0.071
fs+0.055
ocket+0.054
ves+0.052
wen+0.051
Neg logit contributions
 alone-0.052
 freely-0.043
 like-0.043
 happily-0.042
 fast-0.041
Token they
Token ID484
Feature activation-3.36e-01

Pos logit contributions
lor+0.490
fs+0.396
ocket+0.386
ves+0.381
wen+0.370
Neg logit contributions
 alone-0.243
 freely-0.191
 like-0.185
 happily-0.181
 fast-0.173
Token have
Token ID423
Feature activation-7.34e-01

Pos logit contributions
lor+0.017
fs+0.014
ocket+0.013
ves+0.013
wen+0.012
Neg logit contributions
 alone-0.011
 freely-0.009
 happily-0.009
 fast-0.008
 like-0.008
Token to
Token ID284
Feature activation-1.86e+00

Pos logit contributions
lor+0.045
fs+0.036
ocket+0.036
ves+0.035
wen+0.034
Neg logit contributions
 alone-0.017
 freely-0.013
 like-0.013
 happily-0.012
 fast-0.012
Token clean
Token ID3424
Feature activation+1.79e+00

Pos logit contributions
lor+0.098
fs+0.078
ocket+0.076
ves+0.075
wen+0.073
Neg logit contributions
 alone-0.057
 freely-0.046
 like-0.044
 happily-0.044
 fast-0.042
Token up
Token ID510
Feature activation+2.80e+00

Pos logit contributions
 alone+0.053
 freely+0.044
ashing+0.042
free+0.041
umpy+0.040
Neg logit contributions
lor-0.088
fs-0.074
ves-0.071
chief-0.070
ocket-0.068
Token They
Token ID1119
Feature activation+9.40e-01

Pos logit contributions
 alone+0.005
 freely+0.004
 active+0.003
 hate+0.003
free+0.003
Neg logit contributions
lor-0.010
cases-0.008
 address-0.008
ves-0.008
fs-0.008
Token walk
Token ID2513
Feature activation+1.04e+00

Pos logit contributions
 alone+0.034
 like+0.028
 freely+0.027
 happily+0.027
 fast+0.026
Neg logit contributions
lor-0.047
fs-0.036
ocket-0.035
ves-0.034
wen-0.034
Token to
Token ID284
Feature activation-1.84e-01

Pos logit contributions
Acc+0.030
 alone+0.029
 hate+0.027
 waste+0.027
 miserable+0.026
Neg logit contributions
lor-0.038
cases-0.035
ves-0.035
fs-0.034
ipop-0.034
Token her
Token ID607
Feature activation+3.98e-02

Pos logit contributions
lor+0.011
fs+0.009
ocket+0.009
ves+0.009
wen+0.009
Neg logit contributions
 alone-0.004
 freely-0.003
 like-0.003
 happily-0.003
 late-0.003
Token and
Token ID290
Feature activation-1.57e+00

Pos logit contributions
 alone+0.001
 happily+0.001
 like+0.001
 fast+0.001
 freely+0.001
Neg logit contributions
lor-0.002
fs-0.002
ocket-0.002
ves-0.002
wen-0.002
Token say
Token ID910
Feature activation-8.66e+00

Pos logit contributions
lor+0.080
fs+0.061
ocket+0.058
wen+0.058
ves+0.057
Neg logit contributions
 alone-0.059
 like-0.049
 happily-0.048
 freely-0.047
 fast-0.046
Token hello
Token ID23748
Feature activation-1.74e+00

Pos logit contributions
lor+0.422
fs+0.323
ocket+0.316
wen+0.307
vation+0.302
Neg logit contributions
 alone-0.317
 like-0.266
 happily-0.257
 freely-0.257
 fast-0.254
Token.
Token ID13
Feature activation+1.34e-01

Pos logit contributions
lor+0.096
ocket+0.075
fs+0.074
cases+0.073
ves+0.073
Neg logit contributions
 alone-0.046
 freely-0.036
 active-0.035
 late-0.033
 competitive-0.032
Token They
Token ID1119
Feature activation+8.76e-01

Pos logit contributions
 alone+0.004
 freely+0.003
 active+0.003
 hate+0.003
 muddy+0.003
Neg logit contributions
lor-0.007
ves-0.005
cases-0.005
fs-0.005
ipop-0.005
Token ask
Token ID1265
Feature activation-3.23e+00

Pos logit contributions
 alone+0.031
 like+0.026
 freely+0.026
 happily+0.025
 fast+0.024
Neg logit contributions
lor-0.043
fs-0.033
ocket-0.032
ves-0.032
wen-0.031
Token her
Token ID607
Feature activation-1.70e+00

Pos logit contributions
lor+0.194
fs+0.163
ocket+0.160
ves+0.158
wen+0.152
Neg logit contributions
 alone-0.072
 freely-0.052
 like-0.049
 late-0.048
 happily-0.047
Token They
Token ID1119
Feature activation+1.31e+00

Pos logit contributions
 alone+0.013
 freely+0.010
 active+0.009
 like+0.009
 happily+0.009
Neg logit contributions
lor-0.023
fs-0.019
ves-0.018
chief-0.018
ocket-0.018
Token pay
Token ID1414
Feature activation-2.63e+00

Pos logit contributions
 alone+0.048
 freely+0.041
 late+0.040
 active+0.038
 wildly+0.037
Neg logit contributions
lor-0.056
fs-0.045
chief-0.044
ocket-0.044
ves-0.043
Token the
Token ID262
Feature activation-1.82e+00

Pos logit contributions
lor+0.155
fs+0.125
ocket+0.122
ves+0.119
wen+0.118
Neg logit contributions
 alone-0.070
 like-0.054
 freely-0.053
 happily-0.052
 fast-0.050
Token man
Token ID582
Feature activation-1.87e-01

Pos logit contributions
lor+0.100
fs+0.079
ocket+0.078
wen+0.073
ves+0.073
Neg logit contributions
 alone-0.057
 like-0.053
 fast-0.050
 different-0.047
 happily-0.046
Token and
Token ID290
Feature activation-1.74e+00

Pos logit contributions
lor+0.011
fs+0.009
ocket+0.009
ves+0.008
wen+0.008
Neg logit contributions
 alone-0.005
 freely-0.004
 like-0.004
 happily-0.004
 fast-0.003
Token say
Token ID910
Feature activation-8.63e+00

Pos logit contributions
lor+0.095
fs+0.074
ocket+0.072
wen+0.070
ves+0.069
Neg logit contributions
 alone-0.056
 like-0.048
 happily-0.045
 fast-0.044
 freely-0.044
Token,
Token ID11
Feature activation+4.43e-01

Pos logit contributions
lor+0.461
fs+0.362
ocket+0.355
wen+0.345
vation+0.341
Neg logit contributions
 alone-0.274
 like-0.226
 happily-0.216
 freely-0.216
 fast-0.212
Token "
Token ID366
Feature activation+4.13e+00

Pos logit contributions
 alone+0.013
 like+0.010
 freely+0.010
 happily+0.010
 fast+0.010
Neg logit contributions
lor-0.025
fs-0.020
ocket-0.020
wen-0.019
ves-0.019
TokenThank
Token ID10449
Feature activation-1.86e+00

Pos logit contributions
 alone+0.123
 freely+0.095
 like+0.092
 happily+0.091
 fast+0.087
Neg logit contributions
lor-0.230
fs-0.183
ocket-0.177
ves-0.175
vation-0.172
Token you
Token ID345
Feature activation+1.82e+00

Pos logit contributions
lor+0.109
ocket+0.087
fs+0.087
wen+0.085
cases+0.084
Neg logit contributions
 alone-0.048
 like-0.038
 fast-0.036
 freely-0.036
 happily-0.035
Token."
Token ID526
Feature activation+9.71e-01

Pos logit contributions
 alone+0.039
 like+0.030
 happily+0.028
 fast+0.027
 freely+0.027
Neg logit contributions
lor-0.115
fs-0.094
ocket-0.093
wen-0.091
ves-0.091
Token Dad
Token ID17415
Feature activation-1.81e+00

Pos logit contributions
lor+0.121
fs+0.093
ves+0.088
ocket+0.087
wen+0.084
Neg logit contributions
 alone-0.096
 like-0.082
 happily-0.079
 fast-0.077
 freely-0.077
Token.
Token ID13
Feature activation-1.62e-01

Pos logit contributions
lor+0.116
fs+0.095
ocket+0.093
wen+0.091
ves+0.091
Neg logit contributions
 alone-0.040
 like-0.031
 happily-0.027
 freely-0.027
 fast-0.026
Token Mom
Token ID11254
Feature activation-1.01e+00

Pos logit contributions
lor+0.008
fs+0.006
ama+0.006
ves+0.006
ocket+0.006
Neg logit contributions
 alone-0.005
 freely-0.004
 like-0.004
 happily-0.004
 active-0.004
Token and
Token ID290
Feature activation-2.26e+00

Pos logit contributions
lor+0.062
fs+0.051
ocket+0.050
ves+0.048
vation+0.047
Neg logit contributions
 alone-0.025
 like-0.020
 happily-0.018
 outside-0.017
 fast-0.017
Token Dad
Token ID17415
Feature activation-1.16e+00

Pos logit contributions
lor+0.102
fs+0.076
ocket+0.074
ves+0.072
wen+0.070
Neg logit contributions
 alone-0.091
 like-0.078
 freely-0.076
 happily-0.075
 fast-0.074
Token say
Token ID910
Feature activation-8.62e+00

Pos logit contributions
lor+0.056
fs+0.043
ocket+0.041
ves+0.040
wen+0.040
Neg logit contributions
 alone-0.044
 like-0.038
 happily-0.036
 freely-0.036
 fast-0.035
Token yes
Token ID3763
Feature activation-2.57e+00

Pos logit contributions
lor+0.406
ocket+0.299
fs+0.298
wen+0.295
vation+0.292
Neg logit contributions
 alone-0.332
 like-0.303
 fast-0.286
 happily-0.277
 their-0.270
Token,
Token ID11
Feature activation+1.80e+00

Pos logit contributions
lor+0.164
fs+0.136
ocket+0.133
ves+0.130
wen+0.129
Neg logit contributions
 alone-0.054
 freely-0.038
 like-0.038
 happily-0.036
 fast-0.034
Token but
Token ID475
Feature activation-3.80e+00

Pos logit contributions
 alone+0.039
 like+0.030
 fast+0.026
 freely+0.026
 happily+0.025
Neg logit contributions
lor-0.118
fs-0.095
ocket-0.094
wen-0.093
ves-0.091
Token be
Token ID307
Feature activation+1.50e+00

Pos logit contributions
lor+0.213
fs+0.168
ocket+0.164
wen+0.160
ves+0.160
Neg logit contributions
 alone-0.120
 like-0.098
 freely-0.092
 happily-0.092
 fast-0.091
Token careful
Token ID8161
Feature activation-6.41e-01

Pos logit contributions
 freely+0.041
 alone+0.039
cy+0.036
 wildly+0.035
 hate+0.034
Neg logit contributions
lor-0.072
ves-0.058
card-0.058
fs-0.058
wen-0.056
Token their
Token ID511
Feature activation-2.00e+00

Pos logit contributions
lor+0.197
fs+0.166
ocket+0.162
ves+0.162
wen+0.156
Neg logit contributions
 alone-0.075
 freely-0.056
 like-0.051
 happily-0.049
 late-0.048
Token moms
Token ID34015
Feature activation-1.63e+00

Pos logit contributions
lor+0.082
fs+0.059
ocket+0.057
ves+0.055
wen+0.054
Neg logit contributions
 alone-0.089
 like-0.076
 freely-0.076
 happily-0.075
 fast-0.074
Token.
Token ID13
Feature activation-4.02e-01

Pos logit contributions
lor+0.102
fs+0.083
ocket+0.081
ves+0.080
wen+0.079
Neg logit contributions
 alone-0.038
 like-0.029
 freely-0.027
 happily-0.026
 fast-0.025
Token Their
Token ID5334
Feature activation-2.84e+00

Pos logit contributions
lor+0.020
fs+0.016
ves+0.016
cases+0.015
chief+0.015
Neg logit contributions
 alone-0.013
 freely-0.011
 like-0.010
 active-0.010
 happily-0.009
Token moms
Token ID34015
Feature activation-9.22e-01

Pos logit contributions
lor+0.113
fs+0.081
ocket+0.078
ves+0.075
wen+0.073
Neg logit contributions
 alone-0.129
 like-0.111
 freely-0.110
 happily-0.109
 fast-0.107
Token say
Token ID910
Feature activation-8.62e+00

Pos logit contributions
lor+0.045
fs+0.034
ocket+0.033
ves+0.032
wen+0.032
Neg logit contributions
 alone-0.035
 like-0.030
 happily-0.029
 freely-0.028
 fast-0.027
Token yes
Token ID3763
Feature activation-2.76e+00

Pos logit contributions
lor+0.428
vation+0.319
ocket+0.319
fs+0.314
wen+0.312
Neg logit contributions
 alone-0.308
 like-0.282
 fast-0.269
 happily-0.252
 their-0.252
Token,
Token ID11
Feature activation+1.60e+00

Pos logit contributions
lor+0.186
fs+0.154
ocket+0.154
wen+0.148
ves+0.146
Neg logit contributions
 alone-0.052
 like-0.038
 fast-0.033
 happily-0.033
 freely-0.031
Token but
Token ID475
Feature activation-3.64e+00

Pos logit contributions
 alone+0.034
 like+0.027
 fast+0.023
 happily+0.021
 freely+0.021
Neg logit contributions
lor-0.107
ocket-0.085
fs-0.085
wen-0.083
vation-0.082
Token they
Token ID484
Feature activation-5.20e-02

Pos logit contributions
lor+0.201
fs+0.155
ocket+0.153
wen+0.150
ves+0.147
Neg logit contributions
 alone-0.118
 like-0.097
 freely-0.091
 fast-0.090
 happily-0.090
Token have
Token ID423
Feature activation-3.55e-01

Pos logit contributions
lor+0.003
fs+0.002
ocket+0.002
wen+0.002
ves+0.002
Neg logit contributions
 alone-0.002
 like-0.001
 freely-0.001
 happily-0.001
 fast-0.001
Token wall
Token ID3355
Feature activation-1.02e+00

Pos logit contributions
lor+0.048
fs+0.035
ocket+0.034
ves+0.033
wen+0.032
Neg logit contributions
 alone-0.044
 like-0.037
 freely-0.036
 fast-0.036
 happily-0.036
Token.
Token ID13
Feature activation+2.55e-01

Pos logit contributions
lor+0.066
fs+0.055
ocket+0.054
wen+0.052
ves+0.052
Neg logit contributions
 alone-0.021
 like-0.015
 freely-0.014
 happily-0.014
 fast-0.013
Token Mom
Token ID11254
Feature activation-6.65e-01

Pos logit contributions
 alone+0.008
 freely+0.006
 happily+0.005
 like+0.005
 active+0.005
Neg logit contributions
lor-0.014
fs-0.011
ves-0.011
cases-0.011
chief-0.010
Token and
Token ID290
Feature activation-2.52e+00

Pos logit contributions
lor+0.037
fs+0.029
ocket+0.029
ves+0.028
wen+0.028
Neg logit contributions
 alone-0.020
 freely-0.015
 like-0.015
 happily-0.015
 fast-0.014
Token Dad
Token ID17415
Feature activation-9.63e-01

Pos logit contributions
lor+0.099
fs+0.078
ipop+0.075
ocket+0.073
cases+0.073
Neg logit contributions
 alone-0.102
 freely-0.092
 like-0.083
 happily-0.082
 wildly-0.081
Token say
Token ID910
Feature activation-8.62e+00

Pos logit contributions
lor+0.049
fs+0.039
ocket+0.038
ves+0.037
wen+0.036
Neg logit contributions
 alone-0.032
 like-0.026
 freely-0.026
 happily-0.026
 fast-0.025
Token they
Token ID484
Feature activation-3.27e-01

Pos logit contributions
lor+0.524
fs+0.424
ocket+0.419
wen+0.408
vation+0.402
Neg logit contributions
 alone-0.210
 like-0.164
 happily-0.152
 freely-0.152
 fast-0.150
Token can
Token ID460
Feature activation-2.01e+00

Pos logit contributions
lor+0.016
fs+0.012
ocket+0.012
ves+0.012
wen+0.011
Neg logit contributions
 alone-0.011
 freely-0.009
 happily-0.009
 like-0.009
 fast-0.009
Token paint
Token ID7521
Feature activation-2.21e+00

Pos logit contributions
lor+0.110
fs+0.086
ocket+0.085
wen+0.081
ves+0.081
Neg logit contributions
 alone-0.062
 like-0.054
 happily-0.050
 freely-0.050
 fast-0.049
Token on
Token ID319
Feature activation+9.58e-02

Pos logit contributions
lor+0.136
fs+0.110
ocket+0.109
wen+0.105
ves+0.104
Neg logit contributions
 alone-0.053
 like-0.041
 freely-0.039
 happily-0.038
 fast-0.037
Token the
Token ID262
Feature activation-7.22e-01

Pos logit contributions
 alone+0.003
 freely+0.002
 happily+0.002
ashing+0.002
 active+0.002
Neg logit contributions
lor-0.005
fs-0.004
ves-0.004
ocket-0.004
wen-0.004
Token word
Token ID1573
Feature activation-5.36e+00

Pos logit contributions
lor+0.016
fs+0.013
wen+0.013
chief+0.013
ocket+0.013
Neg logit contributions
 alone-0.005
 like-0.004
 happily-0.004
 fast-0.004
 outside-0.004
Token and
Token ID290
Feature activation-1.37e+00

Pos logit contributions
lor+0.341
fs+0.276
ocket+0.271
wen+0.268
chief+0.265
Neg logit contributions
 alone-0.119
 like-0.093
 happily-0.082
 freely-0.081
 fast-0.080
Token one
Token ID530
Feature activation-4.47e-01

Pos logit contributions
lor+0.075
fs+0.058
ocket+0.056
wen+0.056
ves+0.054
Neg logit contributions
 alone-0.044
 like-0.039
 happily-0.035
 fast-0.035
 freely-0.034
Token number
Token ID1271
Feature activation-3.76e-01

Pos logit contributions
lor+0.024
fs+0.018
wen+0.018
chief+0.018
vation+0.017
Neg logit contributions
 alone-0.015
 like-0.014
 fast-0.013
 happily-0.012
 different-0.012
Token and
Token ID290
Feature activation-1.37e+00

Pos logit contributions
lor+0.023
fs+0.018
ocket+0.018
wen+0.017
ves+0.017
Neg logit contributions
 alone-0.009
 like-0.007
 freely-0.007
 happily-0.007
 fast-0.007
Token say
Token ID910
Feature activation-8.61e+00

Pos logit contributions
lor+0.074
fs+0.058
ocket+0.055
wen+0.055
ves+0.054
Neg logit contributions
 alone-0.045
 like-0.038
 happily-0.036
 freely-0.035
 fast-0.035
Token them
Token ID606
Feature activation-7.40e-02

Pos logit contributions
lor+0.462
fs+0.362
ocket+0.356
wen+0.346
vation+0.342
Neg logit contributions
 alone-0.268
 like-0.226
 happily-0.212
 fast-0.210
 freely-0.210
Token out
Token ID503
Feature activation+2.00e+00

Pos logit contributions
lor+0.004
fs+0.003
ocket+0.003
ves+0.003
wen+0.003
Neg logit contributions
 alone-0.002
 like-0.002
 freely-0.002
 happily-0.002
 fast-0.002
Token loud
Token ID7812
Feature activation-1.88e-04

Pos logit contributions
 alone+0.051
 like+0.038
 freely+0.038
 happily+0.036
 fast+0.035
Neg logit contributions
lor-0.119
fs-0.097
ocket-0.095
ves-0.093
wen-0.092
Token.
Token ID13
Feature activation+1.39e-01

Pos logit contributions
lor+0.000
fs+0.000
ocket+0.000
ves+0.000
chief+0.000
Neg logit contributions
 alone-0.000
 freely-0.000
 like-0.000
 happily-0.000
 fast-0.000
Token\n
Token ID198
Feature activation-7.64e-01

Pos logit contributions
 alone+0.005
 freely+0.004
 active+0.004
free+0.004
 than+0.004
Neg logit contributions
lor-0.006
ves-0.005
fs-0.005
cases-0.005
ipop-0.005
Token bag
Token ID6131
Feature activation-3.57e+00

Pos logit contributions
lor+0.036
wen+0.027
fs+0.027
vation+0.026
ves+0.026
Neg logit contributions
 alone-0.023
 like-0.019
 fast-0.018
 happily-0.018
 outside-0.018
Token and
Token ID290
Feature activation-1.32e+00

Pos logit contributions
lor+0.233
fs+0.193
ocket+0.190
ves+0.186
wen+0.184
Neg logit contributions
 alone-0.070
 like-0.047
 freely-0.047
 happily-0.044
 fast-0.042
Token his
Token ID465
Feature activation-1.33e+00

Pos logit contributions
lor+0.076
fs+0.059
ocket+0.058
wen+0.057
ves+0.056
Neg logit contributions
 alone-0.040
 like-0.033
 happily-0.032
 fast-0.030
 freely-0.030
Token arm
Token ID3211
Feature activation-3.51e+00

Pos logit contributions
lor+0.061
wen+0.044
vation+0.043
fs+0.041
ves+0.039
Neg logit contributions
 alone-0.059
 fast-0.051
 happily-0.050
 like-0.049
 youth-0.048
Token and
Token ID290
Feature activation-1.35e+00

Pos logit contributions
lor+0.211
fs+0.176
ves+0.170
ocket+0.168
chief+0.166
Neg logit contributions
 alone-0.080
 freely-0.061
 like-0.058
 happily-0.055
 active-0.053
Token say
Token ID910
Feature activation-8.61e+00

Pos logit contributions
lor+0.076
fs+0.060
ocket+0.058
wen+0.057
ves+0.057
Neg logit contributions
 alone-0.041
 like-0.033
 happily-0.032
 freely-0.032
 fast-0.031
Token,
Token ID11
Feature activation+6.02e-01

Pos logit contributions
lor+0.516
fs+0.420
ocket+0.411
ves+0.402
wen+0.397
Neg logit contributions
 alone-0.215
 freely-0.161
 like-0.160
 happily-0.154
 fast-0.148
Token "
Token ID366
Feature activation+4.28e+00

Pos logit contributions
 alone+0.018
 like+0.014
 freely+0.014
 happily+0.014
 fast+0.013
Neg logit contributions
lor-0.034
fs-0.027
ocket-0.027
wen-0.026
ves-0.026
TokenYou
Token ID1639
Feature activation-1.16e+00

Pos logit contributions
 alone+0.115
 like+0.089
 freely+0.088
 happily+0.085
 fast+0.082
Neg logit contributions
lor-0.249
fs-0.201
ocket-0.198
ves-0.193
wen-0.191
Token are
Token ID389
Feature activation-3.72e-01

Pos logit contributions
lor+0.065
fs+0.053
ves+0.050
wen+0.050
chief+0.049
Neg logit contributions
 alone-0.035
 like-0.032
 happily-0.027
 fast-0.027
 hate-0.026
Token caught
Token ID4978
Feature activation+1.00e+00

Pos logit contributions
lor+0.021
ocket+0.018
fs+0.017
ves+0.016
wen+0.016
Neg logit contributions
 alone-0.009
 freely-0.008
 happily-0.007
ashing-0.007
 wildly-0.007
Token up
Token ID510
Feature activation+3.61e+00

Pos logit contributions
 alone+0.027
 freely+0.022
 late+0.020
 like+0.019
 youth+0.019
Neg logit contributions
lor-0.054
fs-0.046
ves-0.045
cases-0.045
chief-0.043
Token on
Token ID319
Feature activation+1.10e+00

Pos logit contributions
 alone+0.089
 freely+0.072
 hate+0.067
 youth+0.067
 active+0.065
Neg logit contributions
lor-0.179
fs-0.155
ves-0.153
cases-0.145
box-0.145
Token the
Token ID262
Feature activation+2.39e-02

Pos logit contributions
 alone+0.028
 freely+0.021
 like+0.019
ashing+0.019
 active+0.019
Neg logit contributions
lor-0.063
fs-0.053
ves-0.052
wen-0.050
ocket-0.049
Token floor
Token ID4314
Feature activation+4.67e-01

Pos logit contributions
 alone+0.001
 freely+0.001
 like+0.001
 happily+0.001
 late+0.001
Neg logit contributions
lor-0.001
fs-0.001
ves-0.001
ocket-0.001
cases-0.001
Token and
Token ID290
Feature activation-1.49e+00

Pos logit contributions
 alone+0.011
 freely+0.009
 active+0.008
 youth+0.008
 happily+0.007
Neg logit contributions
lor-0.027
fs-0.022
ves-0.022
ocket-0.022
cases-0.021
Token say
Token ID910
Feature activation-8.61e+00

Pos logit contributions
lor+0.074
fs+0.057
ocket+0.056
ves+0.054
wen+0.053
Neg logit contributions
 alone-0.052
 freely-0.043
 like-0.042
 happily-0.042
 fast-0.040
Token "
Token ID366
Feature activation+3.42e+00

Pos logit contributions
lor+0.497
fs+0.398
ocket+0.393
wen+0.380
ves+0.376
Neg logit contributions
 alone-0.235
 like-0.184
 freely-0.179
 happily-0.176
 fast-0.172
TokenReady
Token ID35474
Feature activation-9.02e-01

Pos logit contributions
 alone+0.113
 freely+0.089
 like+0.088
 happily+0.087
 fast+0.082
Neg logit contributions
lor-0.177
fs-0.140
ves-0.133
ocket-0.132
vation-0.131
Token,
Token ID11
Feature activation+2.32e+00

Pos logit contributions
lor+0.052
fs+0.042
ocket+0.041
wen+0.040
ves+0.040
Neg logit contributions
 alone-0.024
 like-0.020
 freely-0.019
 happily-0.019
 fast-0.018
Token set
Token ID900
Feature activation-7.34e-01

Pos logit contributions
 alone+0.083
 like+0.072
 freely+0.068
 happily+0.068
 fast+0.067
Neg logit contributions
lor-0.116
fs-0.090
ocket-0.088
ves-0.085
wen-0.084
Token,
Token ID11
Feature activation+2.16e+00

Pos logit contributions
lor+0.048
ocket+0.042
fs+0.040
wen+0.039
ipop+0.038
Neg logit contributions
 alone-0.013
 like-0.011
 fast-0.011
 outside-0.010
 each-0.010
Token started
Token ID2067
Feature activation+3.51e+00

Pos logit contributions
 alone+0.008
 freely+0.006
 like+0.006
 happily+0.006
 fast+0.006
Neg logit contributions
lor-0.011
fs-0.009
ocket-0.008
ves-0.008
wen-0.008
Token to
Token ID284
Feature activation-1.12e-01

Pos logit contributions
 alone+0.099
 freely+0.088
 like+0.078
cy+0.077
 happily+0.073
Neg logit contributions
lor-0.170
chief-0.139
fs-0.139
ves-0.136
 story-0.131
Token tease
Token ID34691
Feature activation+1.28e+00

Pos logit contributions
lor+0.004
chief+0.004
fs+0.003
 story+0.003
ves+0.003
Neg logit contributions
 alone-0.004
 freely-0.004
 late-0.003
 than-0.003
ashing-0.003
Token Jack
Token ID3619
Feature activation+7.42e-01

Pos logit contributions
 alone+0.039
 freely+0.033
ashing+0.030
 competitive+0.030
 fast+0.030
Neg logit contributions
lor-0.062
fs-0.051
ocket-0.049
chief-0.048
ves-0.046
Token and
Token ID290
Feature activation-1.78e+00

Pos logit contributions
 alone+0.016
 freely+0.015
 late+0.014
 fast+0.013
 youth+0.013
Neg logit contributions
lor-0.040
fs-0.033
ocket-0.032
chief-0.032
ves-0.031
Token say
Token ID910
Feature activation-8.60e+00

Pos logit contributions
lor+0.096
fs+0.075
ocket+0.073
ves+0.071
wen+0.070
Neg logit contributions
 alone-0.058
 like-0.047
 happily-0.046
 freely-0.045
 fast-0.044
Token he
Token ID339
Feature activation-8.69e-01

Pos logit contributions
lor+0.467
fs+0.381
ves+0.372
ocket+0.364
chief+0.355
Neg logit contributions
 alone-0.246
 freely-0.202
 happily-0.185
 late-0.183
 like-0.179
Token looked
Token ID3114
Feature activation-2.47e+00

Pos logit contributions
lor+0.043
fs+0.034
ocket+0.033
ves+0.032
wen+0.032
Neg logit contributions
 alone-0.030
 freely-0.025
 like-0.024
 happily-0.024
 fast-0.023
Token silly
Token ID14397
Feature activation-1.37e-01

Pos logit contributions
lor+0.107
fs+0.099
ocket+0.095
cases+0.095
 message+0.093
Neg logit contributions
cy-0.051
 alone-0.050
 freely-0.050
 town-0.047
 youth-0.047
Token.
Token ID13
Feature activation+3.51e-01

Pos logit contributions
lor+0.008
fs+0.006
ocket+0.006
ves+0.006
cases+0.006
Neg logit contributions
 alone-0.003
 freely-0.003
 happily-0.002
 late-0.002
 fast-0.002
Token\n
Token ID198
Feature activation-6.68e-01

Pos logit contributions
 alone+0.014
 freely+0.012
 like+0.012
 fast+0.011
 active+0.011
Neg logit contributions
lor-0.014
chief-0.011
fs-0.010
 replied-0.010
cases-0.010
Token and
Token ID290
Feature activation-1.91e+00

Pos logit contributions
lor+0.033
fs+0.027
ocket+0.027
ves+0.027
wen+0.026
Neg logit contributions
 alone-0.008
 like-0.006
 happily-0.005
 outside-0.005
 freely-0.004
Token dad
Token ID9955
Feature activation-1.28e+00

Pos logit contributions
lor+0.096
fs+0.072
vation+0.068
anger+0.068
ocket+0.068
Neg logit contributions
 alone-0.073
 like-0.065
 happily-0.062
 each-0.061
 youth-0.060
Token,
Token ID11
Feature activation+9.31e-01

Pos logit contributions
lor+0.084
fs+0.070
ocket+0.068
ves+0.067
wen+0.067
Neg logit contributions
 alone-0.026
 like-0.018
 freely-0.017
 happily-0.017
 fast-0.015
Token who
Token ID508
Feature activation-1.64e+00

Pos logit contributions
 alone+0.021
 like+0.016
 freely+0.015
 happily+0.015
 fast+0.014
Neg logit contributions
lor-0.058
fs-0.048
ocket-0.047
ves-0.046
wen-0.045
Token always
Token ID1464
Feature activation+7.01e-01

Pos logit contributions
lor+0.079
fs+0.061
ocket+0.060
ves+0.058
wen+0.057
Neg logit contributions
 alone-0.059
 freely-0.049
 like-0.048
 happily-0.047
 fast-0.046
Token say
Token ID910
Feature activation-8.59e+00

Pos logit contributions
 alone+0.025
 like+0.021
 freely+0.020
 happily+0.020
 fast+0.019
Neg logit contributions
lor-0.035
fs-0.027
ocket-0.026
ves-0.026
wen-0.026
Token they
Token ID484
Feature activation-1.98e-01

Pos logit contributions
lor+0.505
fs+0.401
wen+0.401
ocket+0.399
vation+0.394
Neg logit contributions
 alone-0.217
 like-0.194
 fast-0.170
 happily-0.166
 different-0.160
Token are
Token ID389
Feature activation-6.27e-01

Pos logit contributions
lor+0.010
fs+0.008
ocket+0.008
ves+0.008
wen+0.008
Neg logit contributions
 alone-0.006
 like-0.005
 freely-0.005
 happily-0.005
 fast-0.005
Token very
Token ID845
Feature activation+2.11e+00

Pos logit contributions
lor+0.037
fs+0.030
ocket+0.029
ves+0.029
wen+0.028
Neg logit contributions
 alone-0.016
 freely-0.012
 like-0.012
 happily-0.012
 fast-0.011
Token pretty
Token ID2495
Feature activation+8.53e-01

Pos logit contributions
 alone+0.056
 freely+0.043
 like+0.040
 happily+0.038
 wildly+0.038
Neg logit contributions
lor-0.118
fs-0.098
ves-0.096
ocket-0.096
vation-0.092
Token.
Token ID13
Feature activation+6.09e-01

Pos logit contributions
 alone+0.022
 freely+0.018
 wildly+0.016
 late+0.016
 fast+0.015
Neg logit contributions
lor-0.048
ocket-0.039
fs-0.039
ves-0.037
cases-0.037
Token run
Token ID1057
Feature activation+1.92e+00

Pos logit contributions
 alone+0.031
 freely+0.026
 late+0.025
 happily+0.024
 fast+0.024
Neg logit contributions
lor-0.043
fs-0.034
ocket-0.034
ves-0.033
chief-0.033
Token to
Token ID284
Feature activation-6.39e-02

Pos logit contributions
Acc+0.067
 alone+0.065
 lonely+0.060
 miserable+0.058
 active+0.058
Neg logit contributions
lor-0.060
cases-0.055
 ticket-0.054
anger-0.053
fs-0.052
Token the
Token ID262
Feature activation-8.98e-01

Pos logit contributions
lor+0.004
fs+0.003
ves+0.003
ocket+0.003
chief+0.003
Neg logit contributions
 alone-0.002
 freely-0.001
 like-0.001
 late-0.001
 active-0.001
Token dog
Token ID3290
Feature activation+1.55e-01

Pos logit contributions
lor+0.046
fs+0.037
ocket+0.035
ves+0.035
wen+0.034
Neg logit contributions
 alone-0.028
 freely-0.024
 like-0.023
 happily-0.022
 late-0.021
Token and
Token ID290
Feature activation-1.59e+00

Pos logit contributions
 alone+0.004
 freely+0.003
 late+0.003
 like+0.003
 active+0.003
Neg logit contributions
lor-0.009
fs-0.007
ocket-0.007
ves-0.007
chief-0.007
Token say
Token ID910
Feature activation-8.59e+00

Pos logit contributions
lor+0.086
fs+0.069
ocket+0.068
ves+0.066
wen+0.065
Neg logit contributions
 alone-0.047
 freely-0.038
 like-0.037
 happily-0.036
 fast-0.035
Token,
Token ID11
Feature activation+1.70e-01

Pos logit contributions
lor+0.423
fs+0.329
ocket+0.319
ves+0.314
wen+0.304
Neg logit contributions
 alone-0.304
 freely-0.251
 like-0.246
 happily-0.242
 fast-0.234
Token "
Token ID366
Feature activation+4.05e+00

Pos logit contributions
 alone+0.006
 freely+0.005
 like+0.005
 happily+0.005
 late+0.004
Neg logit contributions
lor-0.008
fs-0.007
ocket-0.006
ves-0.006
chief-0.006
TokenHello
Token ID15496
Feature activation-1.17e+00

Pos logit contributions
 alone+0.107
 freely+0.079
 happily+0.075
 like+0.073
 competitive+0.069
Neg logit contributions
lor-0.235
fs-0.188
vation-0.182
ves-0.182
chief-0.182
Token,
Token ID11
Feature activation+2.04e+00

Pos logit contributions
lor+0.071
fs+0.058
ocket+0.057
ves+0.056
wen+0.055
Neg logit contributions
 alone-0.028
 like-0.021
 freely-0.021
 happily-0.020
 fast-0.019
Token dog
Token ID3290
Feature activation+9.85e-02

Pos logit contributions
 alone+0.053
 like+0.041
 fast+0.038
 freely+0.037
 happily+0.036
Neg logit contributions
lor-0.124
fs-0.100
ocket-0.098
ves-0.095
wen-0.095
Token go
Token ID467
Feature activation+2.24e+00

Pos logit contributions
 alone+0.027
 freely+0.022
 like+0.021
 happily+0.021
 fast+0.020
Neg logit contributions
lor-0.047
fs-0.037
ocket-0.037
ves-0.036
wen-0.035
Token near
Token ID1474
Feature activation+2.11e+00

Pos logit contributions
 alone+0.058
 freely+0.045
 wildly+0.042
 like+0.041
 happily+0.041
Neg logit contributions
lor-0.124
fs-0.102
ocket-0.098
ves-0.098
ipop-0.096
Token the
Token ID262
Feature activation-7.01e-01

Pos logit contributions
 alone+0.039
 like+0.026
 freely+0.025
 happily+0.024
 fast+0.023
Neg logit contributions
lor-0.142
fs-0.118
ocket-0.115
ves-0.113
wen-0.112
Token horse
Token ID8223
Feature activation-5.02e-01

Pos logit contributions
lor+0.032
fs+0.024
ocket+0.024
ves+0.023
wen+0.022
Neg logit contributions
 alone-0.028
 like-0.023
 freely-0.023
 fast-0.023
 happily-0.023
Token and
Token ID290
Feature activation-1.35e+00

Pos logit contributions
lor+0.032
fs+0.026
ocket+0.026
ves+0.025
wen+0.025
Neg logit contributions
 alone-0.011
 like-0.008
 freely-0.007
 happily-0.007
 fast-0.007
Token say
Token ID910
Feature activation-8.58e+00

Pos logit contributions
lor+0.071
fs+0.056
ocket+0.055
ves+0.054
wen+0.053
Neg logit contributions
 alone-0.043
 freely-0.034
 like-0.034
 happily-0.033
 fast-0.032
Token hello
Token ID23748
Feature activation-1.76e+00

Pos logit contributions
lor+0.412
fs+0.314
ocket+0.308
wen+0.295
vation+0.294
Neg logit contributions
 alone-0.321
 like-0.271
 freely-0.263
 happily-0.262
 fast-0.260
Token.
Token ID13
Feature activation+5.64e-02

Pos logit contributions
lor+0.096
ocket+0.073
ves+0.073
cases+0.073
chief+0.073
Neg logit contributions
 alone-0.049
 freely-0.038
 active-0.037
 late-0.035
 competitive-0.034
Token\n
Token ID198
Feature activation-6.94e-01

Pos logit contributions
 alone+0.002
 freely+0.001
 active+0.001
 like+0.001
 happily+0.001
Neg logit contributions
lor-0.003
ves-0.002
fs-0.002
cases-0.002
chief-0.002
Token\n
Token ID198
Feature activation-5.31e-01

Pos logit contributions
lor+0.038
fs+0.030
ves+0.030
wen+0.029
ocket+0.028
Neg logit contributions
 alone-0.021
 freely-0.016
 like-0.016
 happily-0.015
 active-0.015
Token"
Token ID1
Feature activation+3.56e+00

Pos logit contributions
lor+0.024
ves+0.021
fs+0.020
vation+0.018
wen+0.018
Neg logit contributions
 alone-0.019
 freely-0.016
 happily-0.015
 lonely-0.015
 miserable-0.014
Token\n
Token ID198
Feature activation-2.58e-01

Pos logit contributions
lor+0.021
fs+0.017
ocket+0.017
chief+0.016
vation+0.016
Neg logit contributions
 alone-0.013
 freely-0.011
 happily-0.011
 like-0.010
 fast-0.010
TokenAnd
Token ID1870
Feature activation-2.50e+00

Pos logit contributions
lor+0.014
fs+0.011
ocket+0.011
ves+0.011
wen+0.011
Neg logit contributions
 alone-0.008
 like-0.006
 freely-0.006
 happily-0.006
 fast-0.006
Token they
Token ID484
Feature activation+1.00e+00

Pos logit contributions
lor+0.149
fs+0.121
ocket+0.119
wen+0.117
anger+0.115
Neg logit contributions
 alone-0.067
 like-0.057
 happily-0.053
 freely-0.051
 fast-0.050
Token all
Token ID477
Feature activation+4.77e+00

Pos logit contributions
 alone+0.028
 like+0.022
 freely+0.022
 happily+0.021
 fast+0.020
Neg logit contributions
lor-0.057
fs-0.046
ocket-0.045
ves-0.044
wen-0.043
Token lived
Token ID5615
Feature activation+5.88e-01

Pos logit contributions
 alone+0.175
 freely+0.150
 like+0.145
 fast+0.142
 late+0.142
Neg logit contributions
lor-0.217
fs-0.163
ocket-0.163
ves-0.157
wen-0.154
Token happily
Token ID18177
Feature activation+5.08e+00

Pos logit contributions
 alone+0.008
 hate+0.006
 like+0.005
ashing+0.005
 miserable+0.005
Neg logit contributions
lor-0.038
ves-0.032
ocket-0.032
fs-0.032
card-0.030
Token ever
Token ID1683
Feature activation+3.53e+00

Pos logit contributions
 alone+0.133
 freely+0.103
 like+0.101
 happily+0.096
 fast+0.093
Neg logit contributions
lor-0.293
fs-0.236
ocket-0.233
ves-0.227
vation-0.222
Token after
Token ID706
Feature activation-1.43e+00

Pos logit contributions
 alone+0.072
 freely+0.051
 fast+0.045
 happily+0.044
 active+0.043
Neg logit contributions
lor-0.208
ocket-0.177
vation-0.169
fs-0.166
urger-0.165
Token,
Token ID11
Feature activation+1.89e+00

Pos logit contributions
lor+0.079
fs+0.064
ves+0.062
ocket+0.062
wen+0.059
Neg logit contributions
 alone-0.039
 freely-0.031
 like-0.029
 active-0.028
 wildly-0.028
Token spo
Token ID15106
Feature activation+1.80e+00

Pos logit contributions
 alone+0.050
 freely+0.038
 like+0.037
 happily+0.036
 fast+0.035
Neg logit contributions
lor-0.107
fs-0.086
ocket-0.085
ves-0.083
wen-0.081
Tokeniling
Token ID4386
Feature activation+2.47e+00

Pos logit contributions
 alone+0.058
 fast+0.052
 freely+0.049
 like+0.049
 happily+0.047
Neg logit contributions
lor-0.090
vation-0.069
fs-0.069
ocket-0.066
wen-0.066
Token.
Token ID13
Feature activation+1.72e-01

Pos logit contributions
lor+0.090
fs+0.074
ocket+0.072
wen+0.071
ves+0.070
Neg logit contributions
 alone-0.038
 like-0.029
 freely-0.028
 happily-0.027
 fast-0.026
Token Maybe
Token ID6674
Feature activation-1.19e+00

Pos logit contributions
 alone+0.005
 freely+0.003
 like+0.003
 happily+0.003
 fast+0.003
Neg logit contributions
lor-0.010
fs-0.008
ocket-0.008
ves-0.008
wen-0.008
Token they
Token ID484
Feature activation+5.07e-01

Pos logit contributions
lor+0.074
fs+0.060
vation+0.058
wen+0.058
ocket+0.057
Neg logit contributions
 alone-0.031
 like-0.027
 fast-0.024
 happily-0.023
 outside-0.022
Token will
Token ID481
Feature activation-8.26e-01

Pos logit contributions
 alone+0.019
 freely+0.016
 like+0.016
 happily+0.016
 fast+0.016
Neg logit contributions
lor-0.023
fs-0.017
ocket-0.017
ves-0.016
wen-0.016
Token come
Token ID1282
Feature activation+1.49e+00

Pos logit contributions
lor+0.042
fs+0.033
ocket+0.032
ves+0.031
wen+0.031
Neg logit contributions
 alone-0.028
 like-0.023
 freely-0.023
 happily-0.022
 fast-0.021
Token back
Token ID736
Feature activation+6.38e+00

Pos logit contributions
 alone+0.035
 freely+0.025
 wildly+0.023
 like+0.023
 happily+0.023
Neg logit contributions
lor-0.087
ocket-0.071
fs-0.070
ves-0.068
card-0.067
Token and
Token ID290
Feature activation-6.61e-01

Pos logit contributions
 alone+0.173
 freely+0.133
 wildly+0.129
ashing+0.123
 active+0.123
Neg logit contributions
lor-0.341
ocket-0.273
fs-0.269
ves-0.264
wen-0.257
Token play
Token ID711
Feature activation+2.05e+00

Pos logit contributions
lor+0.035
fs+0.028
ocket+0.027
ves+0.026
wen+0.026
Neg logit contributions
 alone-0.022
 like-0.017
 freely-0.017
 happily-0.017
 fast-0.016
Token again
Token ID757
Feature activation+4.71e+00

Pos logit contributions
 alone+0.046
ashing+0.043
 wildly+0.039
 hate+0.038
umpy+0.038
Neg logit contributions
lor-0.110
fs-0.089
ocket-0.088
ves-0.088
 story-0.081
Token."
Token ID526
Feature activation+1.47e+00

Pos logit contributions
 alone+0.138
ashing+0.110
 freely+0.110
 wildly+0.107
free+0.105
Neg logit contributions
lor-0.237
ves-0.184
fs-0.182
ocket-0.181
wen-0.172
Token<|endoftext|>
Token ID50256
Feature activation-3.91e-01

Pos logit contributions
 alone+0.045
 freely+0.038
 like+0.036
 active+0.035
 competitive+0.034
Neg logit contributions
lor-0.074
fs-0.057
ves-0.056
ocket-0.056
chief-0.055
Token He
Token ID679
Feature activation+9.62e-02

Pos logit contributions
 alone+0.013
 freely+0.010
 like+0.010
 happily+0.010
 fast+0.009
Neg logit contributions
lor-0.030
fs-0.024
ocket-0.024
ves-0.024
wen-0.023
Token turned
Token ID2900
Feature activation-2.32e+00

Pos logit contributions
 alone+0.003
 happily+0.003
 like+0.003
 freely+0.002
 fast+0.002
Neg logit contributions
lor-0.005
fs-0.004
ocket-0.004
ves-0.004
vation-0.004
Token around
Token ID1088
Feature activation+4.47e+00

Pos logit contributions
lor+0.124
fs+0.105
ocket+0.099
chief+0.099
ves+0.098
Neg logit contributions
 alone-0.060
 freely-0.048
 like-0.045
 active-0.044
 youth-0.043
Token and
Token ID290
Feature activation-1.75e+00

Pos logit contributions
 alone+0.101
 freely+0.076
 active+0.071
 like+0.070
 happily+0.069
Neg logit contributions
lor-0.255
fs-0.210
ocket-0.208
chief-0.202
ves-0.200
Token went
Token ID1816
Feature activation+1.26e+00

Pos logit contributions
lor+0.085
fs+0.066
ocket+0.065
ves+0.063
chief+0.063
Neg logit contributions
 alone-0.059
 freely-0.049
 like-0.048
 happily-0.047
 fast-0.046
Token back
Token ID736
Feature activation+5.63e+00

Pos logit contributions
 alone+0.030
 freely+0.025
 like+0.024
 hate+0.023
cy+0.023
Neg logit contributions
lor-0.071
fs-0.057
ocket-0.056
anger-0.055
chief-0.055
Token home
Token ID1363
Feature activation+5.43e-01

Pos logit contributions
 alone+0.149
 freely+0.122
 like+0.120
 wildly+0.117
 active+0.116
Neg logit contributions
lor-0.286
ocket-0.226
cases-0.221
fs-0.220
chief-0.220
Token,
Token ID11
Feature activation+1.88e+00

Pos logit contributions
 alone+0.013
 freely+0.011
 like+0.010
cy+0.010
Stop+0.010
Neg logit contributions
lor-0.030
fs-0.024
ocket-0.024
ves-0.024
cases-0.023
Token leaving
Token ID4305
Feature activation+3.00e+00

Pos logit contributions
 alone+0.055
 freely+0.045
 like+0.044
 happily+0.042
 fast+0.041
Neg logit contributions
lor-0.099
fs-0.078
ocket-0.077
chief-0.076
ves-0.074
Token his
Token ID465
Feature activation+2.13e-01

Pos logit contributions
 alone+0.069
 freely+0.053
 like+0.050
 happily+0.047
 active+0.047
Neg logit contributions
lor-0.175
fs-0.144
ocket-0.143
ves-0.138
wen-0.137
Token mom
Token ID1995
Feature activation+7.11e-01

Pos logit contributions
 alone+0.008
 like+0.007
 freely+0.007
 happily+0.007
 fast+0.007
Neg logit contributions
lor-0.010
fs-0.008
ocket-0.007
ves-0.007
wen-0.007
Token girl
Token ID2576
Feature activation+2.36e+00

Pos logit contributions
 alone+0.085
 freely+0.077
 wildly+0.073
 happily+0.070
 separately+0.069
Neg logit contributions
lor-0.085
ipop-0.079
ama-0.076
 passport-0.072
 story-0.071
Token called
Token ID1444
Feature activation-1.74e+00

Pos logit contributions
 alone+0.077
 freely+0.068
 like+0.062
 wildly+0.062
 fast+0.061
Neg logit contributions
lor-0.112
ocket-0.088
fs-0.085
ves-0.084
wen-0.081
Token Jo
Token ID5302
Feature activation+6.22e-01

Pos logit contributions
lor+0.106
ocket+0.086
fs+0.086
ves+0.083
anger+0.082
Neg logit contributions
 alone-0.038
 freely-0.029
 like-0.027
 happily-0.026
 active-0.025
Token was
Token ID373
Feature activation+4.94e-02

Pos logit contributions
 alone+0.020
 freely+0.017
 like+0.016
 late+0.015
 competitive+0.015
Neg logit contributions
lor-0.032
ocket-0.024
fs-0.024
ves-0.024
wen-0.024
Token playing
Token ID2712
Feature activation+3.60e+00

Pos logit contributions
 alone+0.001
 freely+0.001
 wildly+0.001
 than+0.001
 happily+0.001
Neg logit contributions
lor-0.003
ocket-0.002
 story-0.002
fs-0.002
ama-0.002
Token in
Token ID287
Feature activation+4.17e+00

Pos logit contributions
umpy+0.088
 alone+0.079
ashing+0.079
 freely+0.077
 than+0.074
Neg logit contributions
lor-0.166
fs-0.137
ves-0.135
 story-0.132
lf-0.131
Token her
Token ID607
Feature activation+9.08e-01

Pos logit contributions
ashing+0.105
 alone+0.100
 freely+0.098
umpy+0.094
ashes+0.090
Neg logit contributions
lor-0.203
fs-0.165
lf-0.164
 story-0.163
ocket-0.160
Token garden
Token ID11376
Feature activation+1.15e-01

Pos logit contributions
 alone+0.034
 like+0.032
 freely+0.032
free+0.029
 wildly+0.029
Neg logit contributions
lor-0.039
ocket-0.027
wen-0.027
fs-0.027
ama-0.027
Token.
Token ID13
Feature activation+1.41e-01

Pos logit contributions
 alone+0.003
 freely+0.003
 wildly+0.002
 like+0.002
 happily+0.002
Neg logit contributions
lor-0.006
fs-0.005
ves-0.005
ocket-0.005
 story-0.004
Token She
Token ID1375
Feature activation+3.07e-01

Pos logit contributions
 alone+0.003
 like+0.002
 freely+0.002
 happily+0.002
 fast+0.002
Neg logit contributions
lor-0.009
fs-0.008
ocket-0.008
ves-0.007
wen-0.007
Token was
Token ID373
Feature activation-2.84e-01

Pos logit contributions
 alone+0.010
 freely+0.009
 like+0.008
 late+0.008
 wildly+0.008
Neg logit contributions
lor-0.013
ves-0.011
fs-0.011
ocket-0.011
wen-0.010
Token saw
Token ID2497
Feature activation-9.31e-01

Pos logit contributions
 alone+0.047
 like+0.038
 freely+0.038
 happily+0.038
 fast+0.036
Neg logit contributions
lor-0.073
fs-0.058
ocket-0.056
ves-0.054
wen-0.054
Token a
Token ID257
Feature activation-1.68e+00

Pos logit contributions
lor+0.056
fs+0.046
ocket+0.045
ves+0.044
wen+0.043
Neg logit contributions
 alone-0.023
 like-0.018
 freely-0.017
 happily-0.017
 fast-0.016
Token shady
Token ID36135
Feature activation+2.30e+00

Pos logit contributions
lor+0.086
fs+0.068
ocket+0.067
ves+0.066
wen+0.064
Neg logit contributions
 alone-0.053
 freely-0.043
 like-0.042
 happily-0.040
 late-0.038
Token looking
Token ID2045
Feature activation+2.04e+00

Pos logit contributions
 alone+0.077
 freely+0.064
 like+0.064
 happily+0.061
 fast+0.060
Neg logit contributions
lor-0.115
fs-0.088
ocket-0.087
ves-0.085
wen-0.083
Token man
Token ID582
Feature activation+1.42e-03

Pos logit contributions
 alone+0.072
 freely+0.060
 like+0.059
 happily+0.057
 fast+0.057
Neg logit contributions
lor-0.098
fs-0.075
ocket-0.074
ves-0.071
wen-0.070
Token lurking
Token ID36105
Feature activation+5.37e+00

Pos logit contributions
 alone+0.000
 like+0.000
 freely+0.000
 happily+0.000
 fast+0.000
Neg logit contributions
lor-0.000
fs-0.000
ocket-0.000
ves-0.000
wen-0.000
Token near
Token ID1474
Feature activation+2.42e+00

Pos logit contributions
 alone+0.103
 youth+0.098
 active+0.098
 hate+0.096
 competitive+0.093
Neg logit contributions
lor-0.258
fs-0.219
ves-0.210
ocket-0.205
card-0.195
Token the
Token ID262
Feature activation-7.01e-01

Pos logit contributions
 alone+0.052
 freely+0.037
 like+0.036
 happily+0.035
 fast+0.033
Neg logit contributions
lor-0.150
fs-0.124
ocket-0.122
ves-0.119
wen-0.118
Token exit
Token ID8420
Feature activation+1.90e+00

Pos logit contributions
lor+0.030
ocket+0.023
 lid+0.023
 message+0.022
 story+0.022
Neg logit contributions
 alone-0.023
 freely-0.020
 happily-0.020
ashing-0.020
 active-0.020
Token.
Token ID13
Feature activation+6.58e-01

Pos logit contributions
 alone+0.046
 freely+0.034
 like+0.034
 happily+0.033
 fast+0.031
Neg logit contributions
lor-0.114
fs-0.093
ocket-0.091
ves-0.089
wen-0.088
Token They
Token ID1119
Feature activation+1.27e+00

Pos logit contributions
 alone+0.015
 like+0.011
 freely+0.011
 happily+0.011
 fast+0.010
Neg logit contributions
lor-0.041
fs-0.034
ocket-0.033
ves-0.032
wen-0.032
Token safe
Token ID3338
Feature activation+3.62e+00

Pos logit contributions
lor+0.017
ocket+0.014
fs+0.014
ves+0.013
chief+0.013
Neg logit contributions
 alone-0.009
 freely-0.009
cy-0.009
 wildly-0.009
able-0.008
Token.
Token ID13
Feature activation+7.13e-01

Pos logit contributions
 alone+0.099
ashing+0.092
 freely+0.091
 wildly+0.089
 hate+0.088
Neg logit contributions
lor-0.164
ocket-0.123
fs-0.119
ves-0.117
cases-0.117
Token The
Token ID383
Feature activation-2.68e+00

Pos logit contributions
 alone+0.022
 like+0.017
 freely+0.017
 happily+0.016
 fast+0.016
Neg logit contributions
lor-0.039
fs-0.031
ocket-0.030
ves-0.030
wen-0.030
Token goose
Token ID37246
Feature activation-2.26e-01

Pos logit contributions
lor+0.146
fs+0.116
ocket+0.113
ves+0.110
wen+0.109
Neg logit contributions
 alone-0.083
 like-0.067
 freely-0.065
 fast-0.064
 happily-0.063
Token cannot
Token ID2314
Feature activation-1.18e+00

Pos logit contributions
lor+0.010
fs+0.007
ocket+0.007
ves+0.007
vation+0.007
Neg logit contributions
 alone-0.009
 freely-0.008
 wildly-0.007
 happily-0.007
 like-0.007
Token get
Token ID651
Feature activation+2.30e+00

Pos logit contributions
lor+0.054
ves+0.041
fs+0.041
ocket+0.040
vation+0.039
Neg logit contributions
 alone-0.042
 freely-0.035
 active-0.035
 late-0.034
 happily-0.034
Token us
Token ID514
Feature activation+3.02e+00

Pos logit contributions
 alone+0.060
 freely+0.051
 happily+0.049
 wildly+0.048
 like+0.048
Neg logit contributions
lor-0.121
fs-0.097
ocket-0.096
chief-0.095
ves-0.094
Token,"
Token ID553
Feature activation-3.94e-01

Pos logit contributions
 alone+0.072
 freely+0.063
 happily+0.060
 wildly+0.058
 youth+0.057
Neg logit contributions
lor-0.164
fs-0.132
ocket-0.128
cases-0.127
ves-0.126
Token Tom
Token ID4186
Feature activation+1.16e+00

Pos logit contributions
lor+0.023
fs+0.019
ocket+0.018
ves+0.018
chief+0.018
Neg logit contributions
 alone-0.010
 freely-0.008
 like-0.007
 happily-0.007
 fast-0.007
Token says
Token ID1139
Feature activation-5.21e+00

Pos logit contributions
 alone+0.055
 freely+0.049
 like+0.047
 happily+0.046
 fast+0.046
Neg logit contributions
lor-0.042
fs-0.029
ocket-0.028
ves-0.027
wen-0.027
Token.
Token ID13
Feature activation+6.52e-01

Pos logit contributions
lor+0.296
fs+0.245
ves+0.241
ocket+0.239
chief+0.233
Neg logit contributions
 alone-0.124
 freely-0.099
 like-0.092
 active-0.088
 youth-0.086
Token went
Token ID1816
Feature activation+1.38e+00

Pos logit contributions
lor+0.026
fs+0.020
ocket+0.020
wen+0.019
ves+0.019
Neg logit contributions
 alone-0.018
 happily-0.015
 like-0.015
 freely-0.014
 fast-0.014
Token home
Token ID1363
Feature activation+6.17e-01

Pos logit contributions
 alone+0.033
 freely+0.029
 like+0.027
 hate+0.027
 wildly+0.027
Neg logit contributions
lor-0.072
fs-0.058
ocket-0.057
anger-0.057
card-0.057
Token sad
Token ID6507
Feature activation+2.96e+00

Pos logit contributions
 alone+0.014
 freely+0.013
Stop+0.012
 like+0.012
cy+0.012
Neg logit contributions
lor-0.032
fs-0.026
ocket-0.026
cases-0.025
chief-0.025
Token and
Token ID290
Feature activation-3.02e-01

Pos logit contributions
 alone+0.073
 freely+0.059
cy+0.056
 active+0.054
 like+0.053
Neg logit contributions
lor-0.161
fs-0.132
chief-0.128
ocket-0.128
cases-0.128
Token his
Token ID465
Feature activation-8.96e-01

Pos logit contributions
lor+0.017
fs+0.014
ocket+0.014
ves+0.013
wen+0.013
Neg logit contributions
 alone-0.009
 like-0.007
 freely-0.007
 happily-0.007
 fast-0.006
Token mom
Token ID1995
Feature activation-1.26e-01

Pos logit contributions
lor+0.050
fs+0.038
wen+0.036
vation+0.036
ocket+0.036
Neg logit contributions
 alone-0.032
 fast-0.025
 happily-0.025
 lonely-0.024
 like-0.024
Token asked
Token ID1965
Feature activation-2.68e+00

Pos logit contributions
lor+0.007
fs+0.006
ocket+0.006
anger+0.005
vation+0.005
Neg logit contributions
 alone-0.004
 happily-0.003
 like-0.003
 outside-0.003
 fast-0.003
Token him
Token ID683
Feature activation-1.91e+00

Pos logit contributions
lor+0.138
fs+0.113
ocket+0.111
chief+0.107
ves+0.106
Neg logit contributions
 alone-0.076
 freely-0.063
 like-0.057
 happily-0.057
 fast-0.057
Token why
Token ID1521
Feature activation-1.97e+00

Pos logit contributions
lor+0.104
fs+0.082
ocket+0.080
ves+0.078
wen+0.078
Neg logit contributions
 alone-0.060
 like-0.048
 freely-0.047
 happily-0.046
 fast-0.045
Token he
Token ID339
Feature activation-8.10e-01

Pos logit contributions
lor+0.089
fs+0.069
ocket+0.067
ves+0.066
wen+0.064
Neg logit contributions
 alone-0.074
 freely-0.063
 like-0.062
 happily-0.061
 fast-0.059
Token was
Token ID373
Feature activation-3.44e-01

Pos logit contributions
lor+0.041
fs+0.032
ocket+0.031
ves+0.030
wen+0.029
Neg logit contributions
 alone-0.028
 like-0.023
 freely-0.023
 happily-0.023
 fast-0.022
Token.
Token ID13
Feature activation+3.23e-01

Pos logit contributions
 alone+0.063
 freely+0.053
 happily+0.049
 hate+0.047
 wildly+0.047
Neg logit contributions
lor-0.140
chief-0.106
ocket-0.105
ama-0.105
cases-0.105
Token You
Token ID921
Feature activation-6.92e-01

Pos logit contributions
 alone+0.010
 like+0.008
 freely+0.008
 happily+0.008
 fast+0.008
Neg logit contributions
lor-0.017
fs-0.014
ocket-0.013
ves-0.013
wen-0.013
Token should
Token ID815
Feature activation-4.60e+00

Pos logit contributions
lor+0.037
fs+0.029
ocket+0.028
ves+0.028
wen+0.027
Neg logit contributions
 alone-0.023
 like-0.019
 freely-0.018
 happily-0.018
 fast-0.017
Token try
Token ID1949
Feature activation-1.84e+00

Pos logit contributions
lor+0.242
fs+0.191
ocket+0.186
ves+0.181
wen+0.179
Neg logit contributions
 alone-0.147
 freely-0.118
 like-0.117
 happily-0.114
 fast-0.110
Token to
Token ID284
Feature activation-1.23e+00

Pos logit contributions
lor+0.103
fs+0.083
ves+0.081
ocket+0.080
chief+0.079
Neg logit contributions
 alone-0.050
 freely-0.039
 happily-0.036
 like-0.035
cy-0.035
Token be
Token ID307
Feature activation+1.94e+00

Pos logit contributions
lor+0.061
fs+0.047
ocket+0.045
ves+0.045
wen+0.044
Neg logit contributions
 alone-0.043
 freely-0.035
 like-0.034
 happily-0.034
 late-0.033
Token calm
Token ID9480
Feature activation+2.76e+00

Pos logit contributions
 alone+0.050
 freely+0.050
cy+0.048
 wildly+0.046
ashing+0.043
Neg logit contributions
lor-0.090
ves-0.072
 message-0.072
 address-0.069
card-0.069
Token all
Token ID477
Feature activation+4.97e+00

Pos logit contributions
 alone+0.067
 freely+0.055
 happily+0.052
 hate+0.051
 muddy+0.050
Neg logit contributions
lor-0.149
chief-0.112
ocket-0.112
fs-0.111
cases-0.110
Token the
Token ID262
Feature activation-1.09e+00

Pos logit contributions
 alone+0.106
 happily+0.099
 freely+0.098
 wildly+0.092
ashing+0.083
Neg logit contributions
lor-0.265
fs-0.217
 story-0.210
vation-0.207
ocket-0.207
Token time
Token ID640
Feature activation+1.35e-01

Pos logit contributions
lor+0.047
fs+0.036
ocket+0.034
ves+0.033
vation+0.032
Neg logit contributions
 alone-0.043
 freely-0.038
 happily-0.036
 like-0.035
 wildly-0.035
Token,
Token ID11
Feature activation+1.56e+00

Pos logit contributions
 alone+0.004
 freely+0.003
 happily+0.003
 like+0.003
 wildly+0.003
Neg logit contributions
lor-0.007
fs-0.006
ves-0.006
ocket-0.006
cases-0.006
Token dreaming
Token ID28368
Feature activation-1.33e+00

Pos logit contributions
 alone+0.083
 freely+0.071
 like+0.071
 happily+0.069
 fast+0.068
Neg logit contributions
lor-0.081
fs-0.060
ocket-0.058
ves-0.056
wen-0.055
Token of
Token ID286
Feature activation+1.82e+00

Pos logit contributions
lor+0.065
fs+0.051
ocket+0.049
ves+0.048
wen+0.047
Neg logit contributions
 alone-0.046
 freely-0.039
 like-0.038
 happily-0.037
 fast-0.036
Token all
Token ID477
Feature activation+4.35e+00

Pos logit contributions
 alone+0.045
 freely+0.035
 like+0.033
 happily+0.032
 fast+0.030
Neg logit contributions
lor-0.106
fs-0.087
ocket-0.085
ves-0.084
wen-0.082
Token the
Token ID262
Feature activation-1.39e+00

Pos logit contributions
 alone+0.093
 freely+0.066
 like+0.065
 happily+0.063
 fast+0.059
Neg logit contributions
lor-0.276
fs-0.228
ocket-0.223
ves-0.219
wen-0.216
Token adventures
Token ID17545
Feature activation+8.11e-02

Pos logit contributions
lor+0.080
fs+0.066
ocket+0.064
ves+0.063
wen+0.062
Neg logit contributions
 alone-0.036
 freely-0.028
 like-0.027
 happily-0.026
 fast-0.025
Token she
Token ID673
Feature activation+1.77e+00

Pos logit contributions
 alone+0.002
 freely+0.002
 like+0.002
 happily+0.002
 fast+0.002
Neg logit contributions
lor-0.004
fs-0.003
ocket-0.003
ves-0.003
wen-0.003
Token would
Token ID561
Feature activation-8.89e-01

Pos logit contributions
 alone+0.066
 like+0.055
 freely+0.054
 happily+0.053
 fast+0.052
Neg logit contributions
lor-0.086
fs-0.067
ocket-0.064
ves-0.063
wen-0.062
Token have
Token ID423
Feature activation+1.06e+00

Pos logit contributions
lor+0.047
fs+0.038
ocket+0.037
ves+0.036
wen+0.035
Neg logit contributions
 alone-0.029
 like-0.024
 freely-0.023
 happily-0.022
 fast-0.022
Token in
Token ID287
Feature activation+3.40e+00

Pos logit contributions
 alone+0.026
 freely+0.020
 like+0.018
 happily+0.018
 fast+0.018
Neg logit contributions
lor-0.062
fs-0.051
ocket-0.050
ves-0.049
wen-0.047
Token her
Token ID607
Feature activation+4.53e-01

Pos logit contributions
 alone+0.079
 freely+0.059
 like+0.056
 happily+0.055
 fast+0.052
Neg logit contributions
lor-0.206
fs-0.168
ocket-0.165
ves-0.161
wen-0.158
Token new
Token ID649
Feature activation+2.69e+00

Pos logit contributions
 alone+0.014
 freely+0.012
 like+0.012
 happily+0.011
 fast+0.011
Neg logit contributions
lor-0.024
fs-0.019
ocket-0.018
ves-0.018
wen-0.017
Token."
Token ID526
Feature activation+1.23e+00

Pos logit contributions
lor+0.018
fs+0.013
ocket+0.013
wen+0.013
chief+0.013
Neg logit contributions
 alone-0.010
 freely-0.009
 wildly-0.008
 happily-0.008
 like-0.008
Token<|endoftext|>
Token ID50256
Feature activation-1.11e+00

Pos logit contributions
 alone+0.035
 freely+0.027
 like+0.027
 fast+0.025
 happily+0.025
Neg logit contributions
lor-0.069
fs-0.055
ocket-0.054
ves-0.053
wen-0.052
TokenOnce
Token ID7454
Feature activation-2.75e+00

Pos logit contributions
lor+0.065
fs+0.053
ocket+0.052
ves+0.051
wen+0.050
Neg logit contributions
 alone-0.028
 like-0.021
 freely-0.021
 happily-0.021
 fast-0.020
Token upon
Token ID2402
Feature activation-8.13e-01

Pos logit contributions
lor+0.159
fs+0.125
ocket+0.124
vation+0.121
chief+0.118
Neg logit contributions
 alone-0.083
 happily-0.063
 like-0.062
 freely-0.061
 fast-0.061
Token a
Token ID257
Feature activation-2.41e+00

Pos logit contributions
lor+0.046
fs+0.037
ocket+0.036
ves+0.036
wen+0.035
Neg logit contributions
 alone-0.022
 freely-0.017
 like-0.016
 happily-0.016
 wildly-0.015
Token time
Token ID640
Feature activation+1.36e-01

Pos logit contributions
lor+0.123
fs+0.097
ves+0.093
ocket+0.093
wen+0.091
Neg logit contributions
 alone-0.079
 freely-0.064
 like-0.061
 happily-0.059
 fast-0.059
Token,
Token ID11
Feature activation+1.48e+00

Pos logit contributions
 alone+0.004
 freely+0.003
 like+0.003
 happily+0.003
 active+0.003
Neg logit contributions
lor-0.008
fs-0.006
ocket-0.006
ves-0.006
wen-0.006
Token there
Token ID612
Feature activation+5.53e-01

Pos logit contributions
 alone+0.048
 freely+0.044
 active+0.041
 youth+0.040
 wildly+0.040
Neg logit contributions
lor-0.064
fs-0.051
ves-0.050
ocket-0.050
ipop-0.048
Token was
Token ID373
Feature activation+4.94e-01

Pos logit contributions
 alone+0.021
 freely+0.017
 like+0.017
 late+0.016
 fast+0.016
Neg logit contributions
lor-0.026
fs-0.020
ves-0.019
ocket-0.019
wen-0.018
Token a
Token ID257
Feature activation-3.87e-01

Pos logit contributions
 alone+0.016
 freely+0.014
 wildly+0.012
ashing+0.012
 active+0.011
Neg logit contributions
lor-0.024
ocket-0.019
fs-0.019
ves-0.018
wen-0.018
Token little
Token ID1310
Feature activation+2.42e+00

Pos logit contributions
lor+0.017
 story+0.014
ocket+0.014
ama+0.013
fs+0.013
Neg logit contributions
 alone-0.013
 freely-0.012
 wildly-0.011
 like-0.010
 than-0.010
Token day
Token ID1110
Feature activation-2.20e+00

Pos logit contributions
lor+0.017
fs+0.013
ocket+0.013
ves+0.013
wen+0.013
Neg logit contributions
 alone-0.010
 freely-0.009
 happily-0.008
 like-0.008
 wildly-0.008
Token,
Token ID11
Feature activation+1.17e+00

Pos logit contributions
lor+0.104
fs+0.087
ves+0.086
ipop+0.084
ama+0.080
Neg logit contributions
 alone-0.075
 freely-0.064
 active-0.058
 late-0.055
 happily-0.055
Token she
Token ID673
Feature activation+8.10e-01

Pos logit contributions
 alone+0.041
 freely+0.035
 active+0.032
 happily+0.032
 like+0.032
Neg logit contributions
lor-0.053
fs-0.045
ves-0.042
ocket-0.041
ipop-0.041
Token found
Token ID1043
Feature activation-1.34e+00

Pos logit contributions
 alone+0.023
 freely+0.019
 like+0.018
 fast+0.017
 happily+0.017
Neg logit contributions
lor-0.044
fs-0.035
ocket-0.035
ves-0.034
wen-0.033
Token a
Token ID257
Feature activation-2.31e+00

Pos logit contributions
lor+0.083
fs+0.068
ocket+0.066
ves+0.065
vation+0.064
Neg logit contributions
 alone-0.032
 like-0.023
 freely-0.022
 happily-0.022
 fast-0.021
Token worm
Token ID23849
Feature activation-2.24e+00

Pos logit contributions
lor+0.132
fs+0.106
ocket+0.103
ves+0.101
wen+0.101
Neg logit contributions
 alone-0.064
 like-0.050
 freely-0.049
 happily-0.048
 fast-0.048
Token that
Token ID326
Feature activation-1.65e+00

Pos logit contributions
lor+0.147
fs+0.123
wen+0.122
vation+0.121
cases+0.120
Neg logit contributions
 alone-0.048
 like-0.034
 outside-0.031
 happily-0.028
 freely-0.027
Token was
Token ID373
Feature activation-5.87e-01

Pos logit contributions
lor+0.091
fs+0.072
ocket+0.069
vation+0.068
wen+0.067
Neg logit contributions
 alone-0.053
 like-0.042
 happily-0.040
 freely-0.040
 fast-0.040
Token very
Token ID845
Feature activation+2.30e+00

Pos logit contributions
lor+0.036
fs+0.029
ocket+0.029
ves+0.028
wen+0.028
Neg logit contributions
 alone-0.014
 freely-0.010
 like-0.010
 happily-0.009
 fast-0.009
Token kind
Token ID1611
Feature activation+2.05e+00

Pos logit contributions
 alone+0.061
 freely+0.049
 like+0.047
 happily+0.045
 wildly+0.044
Neg logit contributions
lor-0.124
ocket-0.103
fs-0.102
ves-0.100
ipop-0.096
Token.
Token ID13
Feature activation+6.56e-01

Pos logit contributions
 alone+0.061
 freely+0.050
 happily+0.047
 like+0.046
 fast+0.046
Neg logit contributions
lor-0.110
ocket-0.087
fs-0.087
ves-0.082
anger-0.081
Token at
Token ID379
Feature activation+5.12e-01

Pos logit contributions
lor+0.008
fs+0.007
ocket+0.007
ves+0.006
cases+0.006
Neg logit contributions
 alone-0.003
 freely-0.002
 happily-0.002
 youth-0.002
 like-0.002
Token that
Token ID326
Feature activation-9.73e-01

Pos logit contributions
 alone+0.012
 like+0.009
 freely+0.008
 happily+0.008
 fast+0.008
Neg logit contributions
lor-0.032
fs-0.026
ocket-0.026
ves-0.025
wen-0.025
Token big
Token ID1263
Feature activation+8.56e-01

Pos logit contributions
lor+0.051
fs+0.040
ocket+0.039
ves+0.038
wen+0.038
Neg logit contributions
 alone-0.032
 like-0.026
 freely-0.025
 happily-0.025
 fast-0.024
Token tree
Token ID5509
Feature activation+6.92e-01

Pos logit contributions
 alone+0.029
 like+0.024
 freely+0.023
 happily+0.023
 fast+0.022
Neg logit contributions
lor-0.044
fs-0.034
ocket-0.033
ves-0.032
wen-0.032
Token!
Token ID0
Feature activation+4.06e-01

Pos logit contributions
 alone+0.017
 like+0.012
 freely+0.012
 happily+0.012
 fast+0.011
Neg logit contributions
lor-0.042
fs-0.034
ocket-0.034
ves-0.033
wen-0.032
Token Let
Token ID3914
Feature activation-8.86e-01

Pos logit contributions
 alone+0.011
 like+0.008
 freely+0.008
 happily+0.008
 fast+0.008
Neg logit contributions
lor-0.023
fs-0.019
ocket-0.019
ves-0.018
wen-0.018
Token's
Token ID338
Feature activation-1.24e+00

Pos logit contributions
lor+0.047
fs+0.038
ocket+0.038
ves+0.035
anger+0.035
Neg logit contributions
 alone-0.030
 like-0.024
 happily-0.024
 freely-0.022
 outside-0.022
Token go
Token ID467
Feature activation+2.70e+00

Pos logit contributions
lor+0.065
fs+0.052
ocket+0.050
ves+0.048
wen+0.048
Neg logit contributions
 alone-0.042
 like-0.035
 happily-0.034
 freely-0.033
 fast-0.032
Token play
Token ID711
Feature activation+2.45e+00

Pos logit contributions
 alone+0.067
 freely+0.050
 like+0.050
 happily+0.048
 fast+0.046
Neg logit contributions
lor-0.162
fs-0.132
ocket-0.129
ves-0.126
wen-0.125
Token under
Token ID739
Feature activation+1.82e+00

Pos logit contributions
 alone+0.057
ashing+0.043
 freely+0.043
 active+0.042
 hate+0.042
Neg logit contributions
lor-0.142
fs-0.117
ves-0.115
ocket-0.111
anger-0.108
Token it
Token ID340
Feature activation+9.58e-01

Pos logit contributions
 alone+0.047
 freely+0.038
ashing+0.038
 like+0.037
 active+0.037
Neg logit contributions
lor-0.101
fs-0.081
ves-0.079
chief-0.078
ocket-0.078
Token sh
Token ID427
Feature activation-8.56e-01

Pos logit contributions
lor+0.052
fs+0.043
ocket+0.042
ves+0.041
wen+0.041
Neg logit contributions
 alone-0.019
 like-0.013
 freely-0.013
 happily-0.013
 fast-0.012
Tokenivering
Token ID35598
Feature activation+1.85e+00

Pos logit contributions
lor+0.046
fs+0.036
ocket+0.035
ves+0.034
wen+0.034
Neg logit contributions
 alone-0.027
 freely-0.022
 like-0.022
 happily-0.021
 fast-0.021
Token in
Token ID287
Feature activation+3.58e+00

Pos logit contributions
 alone+0.039
 freely+0.028
 like+0.027
 happily+0.026
 fast+0.025
Neg logit contributions
lor-0.116
fs-0.096
ocket-0.094
ves-0.092
wen-0.091
Token the
Token ID262
Feature activation-6.02e-01

Pos logit contributions
 alone+0.085
 like+0.062
 freely+0.059
 happily+0.058
 fast+0.057
Neg logit contributions
lor-0.224
fs-0.184
ocket-0.179
ves-0.177
wen-0.175
Token cold
Token ID4692
Feature activation+3.10e+00

Pos logit contributions
lor+0.032
fs+0.024
ocket+0.024
chief+0.023
ves+0.023
Neg logit contributions
 alone-0.018
 freely-0.016
 like-0.015
 happily-0.015
 fast-0.014
Token and
Token ID290
Feature activation-9.86e-01

Pos logit contributions
 alone+0.097
 freely+0.083
 happily+0.081
 like+0.080
 competitive+0.078
Neg logit contributions
lor-0.159
fs-0.120
ocket-0.119
anger-0.115
chief-0.115
Token he
Token ID339
Feature activation+1.07e+00

Pos logit contributions
lor+0.053
fs+0.042
ocket+0.041
ves+0.040
chief+0.040
Neg logit contributions
 alone-0.029
 freely-0.023
 like-0.023
 happily-0.023
 fast-0.022
Token knew
Token ID2993
Feature activation-4.59e+00

Pos logit contributions
 alone+0.037
 happily+0.030
 like+0.030
 freely+0.029
 fast+0.028
Neg logit contributions
lor-0.055
fs-0.043
ocket-0.042
wen-0.040
anger-0.040
Token right
Token ID826
Feature activation+2.93e-01

Pos logit contributions
lor+0.252
fs+0.202
vation+0.199
ocket+0.198
anger+0.190
Neg logit contributions
 alone-0.143
 like-0.117
 happily-0.109
 fast-0.107
 freely-0.106
Token away
Token ID1497
Feature activation+1.14e+00

Pos logit contributions
 alone+0.006
 freely+0.004
 like+0.004
 happily+0.004
 fast+0.004
Neg logit contributions
lor-0.018
fs-0.015
ocket-0.015
ves-0.014
wen-0.014
Token what
Token ID644
Feature activation-1.11e+00

Pos logit contributions
 alone+0.025
 freely+0.018
 like+0.017
 happily+0.017
 fast+0.016
Neg logit contributions
lor-0.072
fs-0.059
ocket-0.058
ves-0.057
wen-0.056
Token rabbit
Token ID22746
Feature activation-1.53e+00

Pos logit contributions
lor+0.084
fs+0.068
ves+0.065
ocket+0.065
cases+0.064
Neg logit contributions
 alone-0.046
 like-0.037
 fast-0.035
 freely-0.035
 happily-0.035
Token trusted
Token ID13467
Feature activation-5.53e-01

Pos logit contributions
lor+0.076
fs+0.059
ocket+0.058
ves+0.056
wen+0.055
Neg logit contributions
 alone-0.053
 freely-0.043
 like-0.043
 happily-0.042
 fast-0.041
Token her
Token ID607
Feature activation-1.66e-01

Pos logit contributions
lor+0.035
fs+0.028
ocket+0.028
ves+0.027
wen+0.027
Neg logit contributions
 alone-0.012
 freely-0.008
 like-0.008
 happily-0.008
 fast-0.008
Token.
Token ID13
Feature activation+3.33e-01

Pos logit contributions
lor+0.011
fs+0.009
ocket+0.009
wen+0.009
vation+0.009
Neg logit contributions
 alone-0.003
 like-0.003
 happily-0.002
 fast-0.002
 freely-0.002
Token She
Token ID1375
Feature activation+4.79e-01

Pos logit contributions
 alone+0.011
 like+0.009
 freely+0.009
 happily+0.008
 fast+0.008
Neg logit contributions
lor-0.018
fs-0.013
ocket-0.013
ves-0.013
wen-0.013
Token thanked
Token ID26280
Feature activation-3.78e+00

Pos logit contributions
 alone+0.016
 happily+0.014
 like+0.013
 freely+0.013
 fast+0.012
Neg logit contributions
lor-0.025
fs-0.020
ocket-0.019
anger-0.019
vation-0.019
Token it
Token ID340
Feature activation-6.58e-01

Pos logit contributions
lor+0.248
fs+0.206
ocket+0.202
ves+0.198
wen+0.196
Neg logit contributions
 alone-0.074
 freely-0.050
 like-0.050
 happily-0.047
 fast-0.044
Token and
Token ID290
Feature activation-1.56e+00

Pos logit contributions
lor+0.040
fs+0.032
ocket+0.032
wen+0.031
ves+0.031
Neg logit contributions
 alone-0.017
 happily-0.013
 like-0.013
 freely-0.013
 fast-0.012
Token ran
Token ID4966
Feature activation+1.84e+00

Pos logit contributions
lor+0.085
fs+0.068
ocket+0.066
anger+0.064
wen+0.063
Neg logit contributions
 alone-0.050
 happily-0.044
 like-0.040
 freely-0.039
 fast-0.039
Token off
Token ID572
Feature activation+9.10e-01

Pos logit contributions
 alone+0.063
 hate+0.050
 freely+0.050
 lonely+0.049
 active+0.049
Neg logit contributions
lor-0.087
fs-0.066
ocket-0.064
ves-0.063
anger-0.063
Token,
Token ID11
Feature activation+1.89e+00

Pos logit contributions
 alone+0.022
 like+0.018
 hate+0.018
 freely+0.018
Stop+0.018
Neg logit contributions
lor-0.049
fs-0.039
ocket-0.039
ves-0.038
chief-0.038
Tokenmy
Token ID1820
Feature activation-8.04e-01

Pos logit contributions
 alone+0.002
 like+0.002
 outside+0.001
 happily+0.001
 freely+0.001
Neg logit contributions
lor-0.008
fs-0.006
ocket-0.006
ves-0.006
wen-0.006
Token and
Token ID290
Feature activation-2.03e+00

Pos logit contributions
lor+0.050
fs+0.041
ocket+0.040
ves+0.039
wen+0.039
Neg logit contributions
 alone-0.019
 like-0.014
 freely-0.013
 happily-0.013
 fast-0.012
Token daddy
Token ID35822
Feature activation-1.28e+00

Pos logit contributions
lor+0.094
fs+0.071
ocket+0.067
anger+0.065
ves+0.065
Neg logit contributions
 alone-0.085
 like-0.073
 happily-0.072
 fast-0.068
 outside-0.068
Token.
Token ID13
Feature activation-7.85e-02

Pos logit contributions
lor+0.078
fs+0.064
ocket+0.063
ves+0.061
chief+0.060
Neg logit contributions
 alone-0.029
 freely-0.022
 like-0.020
 happily-0.020
 fast-0.019
Token One
Token ID1881
Feature activation-3.80e-01

Pos logit contributions
lor+0.004
fs+0.003
ves+0.003
cases+0.003
chief+0.003
Neg logit contributions
 alone-0.003
 freely-0.002
 active-0.002
 fast-0.002
 happily-0.002
Token day
Token ID1110
Feature activation-2.12e+00

Pos logit contributions
lor+0.021
fs+0.017
ocket+0.017
ves+0.016
wen+0.016
Neg logit contributions
 alone-0.011
 freely-0.009
 like-0.009
 happily-0.008
 fast-0.008
Token,
Token ID11
Feature activation+1.14e+00

Pos logit contributions
lor+0.098
fs+0.083
ves+0.082
ipop+0.080
ama+0.078
Neg logit contributions
 alone-0.071
 freely-0.061
 active-0.054
 happily-0.051
 late-0.051
Token they
Token ID484
Feature activation+6.76e-01

Pos logit contributions
 alone+0.032
 freely+0.026
 active+0.022
 happily+0.022
umpy+0.021
Neg logit contributions
lor-0.059
fs-0.052
ves-0.050
ipop-0.049
chief-0.048
Token decided
Token ID3066
Feature activation-2.77e+00

Pos logit contributions
 alone+0.023
 freely+0.020
 like+0.018
 fast+0.018
 wildly+0.018
Neg logit contributions
lor-0.032
ocket-0.025
fs-0.025
ves-0.023
cases-0.023
Token to
Token ID284
Feature activation-6.20e-01

Pos logit contributions
lor+0.158
fs+0.128
ves+0.126
ocket+0.123
chief+0.121
Neg logit contributions
 alone-0.073
 freely-0.058
 happily-0.052
 like-0.052
cy-0.049
Token go
Token ID467
Feature activation+2.50e+00

Pos logit contributions
lor+0.033
fs+0.026
ocket+0.026
wen+0.025
ves+0.025
Neg logit contributions
 alone-0.020
 like-0.017
 happily-0.016
 freely-0.016
 fast-0.015
Token
Token ID198
Feature activation-4.34e-01

Pos logit contributions
lor+0.034
fs+0.027
ves+0.026
ocket+0.026
wen+0.026
Neg logit contributions
 alone-0.017
 like-0.013
 freely-0.013
 happily-0.013
 fast-0.012
TokenM
Token ID44
Feature activation-1.71e+00

Pos logit contributions
lor+0.020
fs+0.015
ves+0.015
ama+0.015
wen+0.014
Neg logit contributions
 alone-0.017
 freely-0.014
 like-0.014
 happily-0.013
 than-0.012
Tokenama
Token ID1689
Feature activation-1.61e+00

Pos logit contributions
lor+0.048
fs+0.030
ocket+0.029
vation+0.023
ves+0.022
Neg logit contributions
 alone-0.091
 happily-0.085
 late-0.085
 like-0.083
 fast-0.082
Token smiled
Token ID13541
Feature activation+4.24e-01

Pos logit contributions
lor+0.078
fs+0.061
ocket+0.059
ves+0.057
wen+0.056
Neg logit contributions
 alone-0.059
 like-0.048
 freely-0.048
 happily-0.048
 fast-0.046
Token and
Token ID290
Feature activation-1.26e+00

Pos logit contributions
 alone+0.013
 waste+0.011
 hate+0.011
 freely+0.011
Stop+0.010
Neg logit contributions
lor-0.019
ama-0.015
 replied-0.015
chief-0.015
cases-0.014
Token said
Token ID531
Feature activation-7.41e+00

Pos logit contributions
lor+0.059
fs+0.043
ocket+0.042
ves+0.040
anger+0.040
Neg logit contributions
 alone-0.051
 happily-0.046
 like-0.044
 fast-0.043
 freely-0.042
Token,
Token ID11
Feature activation+4.41e-01

Pos logit contributions
lor+0.348
ves+0.277
fs+0.271
ipop+0.269
chief+0.265
Neg logit contributions
 alone-0.254
 freely-0.219
 active-0.203
 late-0.201
cy-0.198
Token "
Token ID366
Feature activation+4.03e+00

Pos logit contributions
 alone+0.013
 like+0.011
 freely+0.011
 happily+0.010
 fast+0.010
Neg logit contributions
lor-0.024
fs-0.019
ocket-0.019
ves-0.018
wen-0.018
TokenGood
Token ID10248
Feature activation+2.21e+00

Pos logit contributions
 alone+0.113
 freely+0.083
 happily+0.081
 like+0.081
 fast+0.076
Neg logit contributions
lor-0.234
fs-0.187
ocket-0.180
wen-0.179
ves-0.178
Token job
Token ID1693
Feature activation-1.17e+00

Pos logit contributions
 alone+0.095
 freely+0.083
 wildly+0.075
 happily+0.075
 fast+0.074
Neg logit contributions
lor-0.082
ama-0.062
anger-0.060
chief-0.059
vation-0.058
Token helping
Token ID5742
Feature activation+1.15e-01

Pos logit contributions
lor+0.066
fs+0.054
ocket+0.052
ves+0.052
wen+0.051
Neg logit contributions
 alone-0.031
 freely-0.025
 like-0.024
 happily-0.023
 fast-0.022
Token Tom
Token ID4186
Feature activation-6.76e-01

Pos logit contributions
lor+0.066
fs+0.054
ocket+0.053
ves+0.052
chief+0.051
Neg logit contributions
 alone-0.034
 freely-0.027
 like-0.026
 happily-0.025
 active-0.025
Token.
Token ID13
Feature activation+1.87e-01

Pos logit contributions
lor+0.039
fs+0.032
ocket+0.032
chief+0.031
ves+0.031
Neg logit contributions
 alone-0.016
 freely-0.013
 wildly-0.012
 happily-0.011
 fast-0.011
Token
Token ID198
Feature activation-7.79e-01

Pos logit contributions
 alone+0.006
 freely+0.004
 active+0.004
 like+0.004
cy+0.004
Neg logit contributions
lor-0.010
fs-0.008
ves-0.008
chief-0.008
ocket-0.007
Token
Token ID198
Feature activation-5.58e-01

Pos logit contributions
lor+0.046
fs+0.037
ocket+0.036
ves+0.035
wen+0.035
Neg logit contributions
 alone-0.020
 freely-0.015
 like-0.015
 happily-0.015
 fast-0.014
Token"
Token ID1
Feature activation+3.85e+00

Pos logit contributions
lor+0.025
fs+0.020
ves+0.020
wen+0.018
vation+0.018
Neg logit contributions
 alone-0.021
 freely-0.018
 happily-0.017
 like-0.017
 lonely-0.017
TokenHere
Token ID4342
Feature activation-6.18e+00

Pos logit contributions
 alone+0.109
 freely+0.083
 like+0.080
 happily+0.080
 fast+0.074
Neg logit contributions
lor-0.221
fs-0.178
ocket-0.172
ves-0.171
vation-0.167
Token,
Token ID11
Feature activation+1.65e+00

Pos logit contributions
lor+0.394
fs+0.321
ocket+0.318
vation+0.309
wen+0.309
Neg logit contributions
 alone-0.140
 like-0.104
 fast-0.092
 happily-0.090
 freely-0.085
Token Tom
Token ID4186
Feature activation+6.15e-01

Pos logit contributions
 alone+0.044
 like+0.035
 happily+0.031
 fast+0.031
 freely+0.031
Neg logit contributions
lor-0.100
fs-0.081
ocket-0.079
wen-0.077
ves-0.077
Token.
Token ID13
Feature activation+4.64e-01

Pos logit contributions
 alone+0.015
 like+0.011
 freely+0.011
 happily+0.010
 fast+0.010
Neg logit contributions
lor-0.038
fs-0.031
ocket-0.030
ves-0.029
wen-0.029
Token Drink
Token ID32906
Feature activation-1.74e+00

Pos logit contributions
 alone+0.014
 like+0.012
 freely+0.011
 late+0.011
 happily+0.010
Neg logit contributions
lor-0.025
fs-0.021
ves-0.020
ocket-0.020
wen-0.020
Token some
Token ID617
Feature activation-4.48e-01

Pos logit contributions
lor+0.094
fs+0.076
ocket+0.073
ves+0.073
chief+0.071
Neg logit contributions
 alone-0.052
 freely-0.041
 like-0.039
 happily-0.039
 fast-0.037
Token says
Token ID1139
Feature activation-5.20e+00

Pos logit contributions
lor+0.016
fs+0.011
ocket+0.011
ves+0.010
wen+0.010
Neg logit contributions
 alone-0.018
 like-0.016
 happily-0.016
 freely-0.015
 fast-0.015
Token.
Token ID13
Feature activation+4.88e-01

Pos logit contributions
lor+0.309
fs+0.255
ocket+0.253
ves+0.247
chief+0.242
Neg logit contributions
 alone-0.117
 freely-0.090
 like-0.085
 happily-0.079
 active-0.078
Token "
Token ID366
Feature activation+4.12e+00

Pos logit contributions
 alone+0.014
 freely+0.012
 like+0.011
 active+0.010
 happily+0.010
Neg logit contributions
lor-0.026
fs-0.021
ocket-0.020
ves-0.020
chief-0.019
TokenHe
Token ID1544
Feature activation-9.83e-01

Pos logit contributions
 alone+0.114
 freely+0.087
 happily+0.086
 like+0.082
 wildly+0.077
Neg logit contributions
lor-0.233
fs-0.186
ocket-0.182
ves-0.180
vation-0.174
Token is
Token ID318
Feature activation-5.50e-01

Pos logit contributions
lor+0.052
fs+0.042
ocket+0.039
ves+0.038
wen+0.038
Neg logit contributions
 alone-0.032
 like-0.029
 hate-0.025
 happily-0.025
 fast-0.024
Token telling
Token ID5149
Feature activation-5.04e+00

Pos logit contributions
lor+0.029
ocket+0.024
fs+0.024
ves+0.023
wen+0.023
Neg logit contributions
 alone-0.015
 freely-0.013
 happily-0.012
cy-0.011
 wildly-0.011
Token the
Token ID262
Feature activation-1.90e+00

Pos logit contributions
lor+0.305
fs+0.245
ocket+0.242
wen+0.234
ves+0.232
Neg logit contributions
 alone-0.130
 like-0.103
 fast-0.097
 freely-0.090
 happily-0.090
Token people
Token ID661
Feature activation+1.52e+00

Pos logit contributions
lor+0.084
chief+0.065
fs+0.062
ocket+0.060
cases+0.060
Neg logit contributions
 alone-0.079
 like-0.073
 fast-0.070
 different-0.069
 town-0.068
Token about
Token ID546
Feature activation+1.03e+00

Pos logit contributions
 alone+0.043
 freely+0.035
 happily+0.033
 like+0.032
 fast+0.031
Neg logit contributions
lor-0.083
fs-0.067
ocket-0.066
ves-0.065
wen-0.062
Token something
Token ID1223
Feature activation-2.52e+00

Pos logit contributions
 alone+0.027
 freely+0.020
 like+0.020
 happily+0.019
 fast+0.018
Neg logit contributions
lor-0.060
fs-0.049
ocket-0.048
ves-0.047
wen-0.046
Token he
Token ID339
Feature activation+1.18e+00

Pos logit contributions
lor+0.136
fs+0.112
ves+0.109
ocket+0.108
cases+0.103
Neg logit contributions
 alone-0.069
 freely-0.057
 happily-0.054
 wildly-0.052
 like-0.050
Token girl
Token ID2576
Feature activation+1.37e+00

Pos logit contributions
 alone+0.035
 freely+0.030
 like+0.028
 happily+0.027
 wildly+0.027
Neg logit contributions
lor-0.059
fs-0.045
ocket-0.045
ves-0.044
chief-0.044
Token was
Token ID373
Feature activation-8.58e-02

Pos logit contributions
 alone+0.046
 happily+0.039
 like+0.038
 freely+0.037
 fast+0.036
Neg logit contributions
lor-0.071
fs-0.056
ocket-0.054
wen-0.052
ves-0.052
Token so
Token ID523
Feature activation-2.58e-01

Pos logit contributions
lor+0.005
ocket+0.004
fs+0.004
ves+0.004
wen+0.004
Neg logit contributions
 alone-0.002
 freely-0.002
 like-0.002
cy-0.001
 wildly-0.001
Token happy
Token ID3772
Feature activation+1.45e+00

Pos logit contributions
lor+0.014
ocket+0.012
ves+0.012
fs+0.012
ama+0.011
Neg logit contributions
 alone-0.005
 freely-0.004
 like-0.004
 wildly-0.004
 outside-0.004
Token and
Token ID290
Feature activation-1.18e+00

Pos logit contributions
 alone+0.040
 freely+0.031
 like+0.029
 happily+0.029
 fast+0.028
Neg logit contributions
lor-0.082
fs-0.066
ocket-0.065
ves-0.064
chief-0.062
Token said
Token ID531
Feature activation-7.30e+00

Pos logit contributions
lor+0.060
fs+0.046
ocket+0.045
wen+0.044
ves+0.043
Neg logit contributions
 alone-0.043
 happily-0.036
 like-0.035
 freely-0.035
 fast-0.034
Token,
Token ID11
Feature activation+7.82e-01

Pos logit contributions
lor+0.387
fs+0.303
ves+0.299
ocket+0.297
chief+0.292
Neg logit contributions
 alone-0.214
 freely-0.176
 like-0.165
 late-0.164
ashing-0.160
Token "
Token ID366
Feature activation+4.21e+00

Pos logit contributions
 alone+0.024
 like+0.019
 freely+0.019
 happily+0.019
 fast+0.018
Neg logit contributions
lor-0.042
fs-0.034
ocket-0.033
ves-0.032
wen-0.032
TokenThank
Token ID10449
Feature activation-1.52e+00

Pos logit contributions
 alone+0.093
 freely+0.065
 like+0.065
 happily+0.062
 fast+0.059
Neg logit contributions
lor-0.265
fs-0.217
ocket-0.213
ves-0.209
wen-0.206
Token you
Token ID345
Feature activation+1.96e+00

Pos logit contributions
lor+0.081
fs+0.064
ocket+0.063
ves+0.061
wen+0.060
Neg logit contributions
 alone-0.048
 freely-0.038
 like-0.038
 happily-0.037
 fast-0.036
Token!
Token ID0
Feature activation+9.61e-02

Pos logit contributions
 alone+0.047
 like+0.036
 happily+0.033
 fast+0.033
 freely+0.033
Neg logit contributions
lor-0.121
fs-0.099
ocket-0.097
ves-0.095
wen-0.094
Token with
Token ID351
Feature activation+9.23e-01

Pos logit contributions
ashing+0.070
umpy+0.067
 alone+0.067
 active+0.064
cy+0.063
Neg logit contributions
lor-0.126
ves-0.114
fs-0.113
glass-0.106
 story-0.102
Token it
Token ID340
Feature activation-1.99e-01

Pos logit contributions
 alone+0.023
 freely+0.019
ashing+0.019
 active+0.018
 like+0.018
Neg logit contributions
lor-0.049
fs-0.042
ves-0.042
ocket-0.041
chief-0.041
Token.
Token ID13
Feature activation+8.49e-01

Pos logit contributions
lor+0.011
fs+0.009
ves+0.009
ocket+0.009
cases+0.008
Neg logit contributions
 alone-0.005
 freely-0.004
 youth-0.004
 active-0.004
 hate-0.003
Token They
Token ID1119
Feature activation+1.58e+00

Pos logit contributions
 alone+0.020
 freely+0.014
 like+0.013
 active+0.013
 happily+0.013
Neg logit contributions
lor-0.051
ves-0.041
fs-0.041
ocket-0.040
chief-0.039
Token also
Token ID635
Feature activation+3.72e-01

Pos logit contributions
 alone+0.054
 freely+0.045
 late+0.043
 happily+0.042
 fast+0.041
Neg logit contributions
lor-0.077
fs-0.060
ocket-0.060
ves-0.057
chief-0.056
Token say
Token ID910
Feature activation-7.98e+00

Pos logit contributions
 alone+0.013
 freely+0.011
 happily+0.010
 like+0.010
 fast+0.010
Neg logit contributions
lor-0.018
ocket-0.014
fs-0.014
ves-0.013
chief-0.013
Token,
Token ID11
Feature activation+1.08e+00

Pos logit contributions
lor+0.439
fs+0.355
ves+0.349
ocket+0.344
chief+0.331
Neg logit contributions
 alone-0.227
 freely-0.183
 like-0.171
 happily-0.163
 late-0.162
Token "
Token ID366
Feature activation+4.44e+00

Pos logit contributions
 alone+0.035
 freely+0.028
 like+0.028
 happily+0.027
 fast+0.026
Neg logit contributions
lor-0.056
fs-0.045
ocket-0.043
ves-0.042
wen-0.041
TokenThank
Token ID10449
Feature activation-1.64e+00

Pos logit contributions
 alone+0.130
 freely+0.096
 like+0.096
 happily+0.091
 outside+0.089
Neg logit contributions
lor-0.250
fs-0.198
ves-0.193
vation-0.192
ocket-0.188
Token you
Token ID345
Feature activation+2.08e+00

Pos logit contributions
lor+0.086
fs+0.068
ocket+0.066
ves+0.065
wen+0.064
Neg logit contributions
 alone-0.052
 freely-0.042
 like-0.042
 happily-0.041
 fast-0.039
Token,
Token ID11
Feature activation+2.36e+00

Pos logit contributions
 alone+0.051
 like+0.037
 freely+0.037
 happily+0.036
 fast+0.034
Neg logit contributions
lor-0.126
fs-0.103
ocket-0.101
ves-0.098
wen-0.097