TOP ACTIVATIONS
MAX = 0.111

too cold for a re ind eer like you !"
when he saw a re ind eer . The re ind
Once there was a re ind eer named S illy .
. Lily has a re ind eer toy . It is
some carrots for his re ind eer ," Anna says .
? I want a re ind eer like this one ."
she always wanted a re ind eer . One day ,
happy to have a re ind eer of her own .
might not have a re ind eer toy for you .
. She wanted a re ind eer toy for Christmas .
Christmas and he loved re ind eer . He would often
there was a happy re ind eer named Rudolph . He
not know what a re ind eer is . They ask
there was a great re ind eer . He had a
loud engine as it ch ug ged along the tracks .
Once there was a re ind eer that was so very
. It was a re ind eer ! "
there was a poor re ind eer . He didn 't
could join the other re ind eer and play . He
's with a funny re ind eer . <|endoftext|> Once upon

MIN ACTIVATIONS
MIN = -0.081

It was covered in cher ries ! The bear was so
been packed in their back packs by mistake . The kids
They grabbed their back packs and put on their shoes
coats and packed their back packs . The whole
a pretty dancer bird in trouble . The bird had fallen
's eyes spark led in amaz ement . The palace garden
. " He is in trouble !" Tim 's
Ell a found herself in trouble . She had been walking
a small town , in trouble . The jet swoop ed
" I 'm in big trouble . Can you help me
B enny got in trouble and had to go to
it was an animal in trouble !" So Terry
Inside the sand pit he found a teacher with
agreed and packed their back packs with some of their favorite
" You are in big trouble . I 'm going to
But we will be in trouble !" Anna said . "
, you are in big trouble !" her mom said .
, " I am in trouble . I lost my key
's an oct opus in trouble !" Fin exclaimed . The
. " We 're in trouble !" Dad heard

INTERVAL 0.063 - 0.111
CONTAINS 0.005%

ready and she was looking forward to the adventure . She
't bud ge . Fed or sighed . Just
for a while - and suddenly , she
there was a mighty re ind eer . He lived in
he was as heavy as three of his fake bones .

INTERVAL 0.015 - 0.063
CONTAINS 11.212%

all loved it . Lucy 's mom was not troubled anymore
wasn 't so scary after all . In fact , it
this motorcycle is not for you . It is big and
Anna and Ben show him their toys and books
O ll ie and went back to their home . They

INTERVAL -0.033 - 0.015
CONTAINS 88.197%

, and then slowly dropped it . The
homework . Tim my didn 't want to tell the truth
how it tastes ." Tom thought that was a
After learning , Jake and Jill got sleepy . Mom
, he had to turn around a corner . He leaned

INTERVAL -0.081 - -0.033
CONTAINS 0.586%

back home . She opened her eyes and she was back
meant , but she closed her eyes and tried to imagine
I taste popcorn in my mouth !" Mom my gave her
very tired from collecting ac orns all day ."
feel better . She hugged her t eddy bear and closed
Token too
Token ID1165
Feature activation+6.84e-03

Pos logit contributions
 its+0.021
 hovering+0.020
cks+0.019
 his+0.019
 perched+0.017
Neg logit contributions
t-0.030
ugg-0.029
case-0.027
ave-0.027
bye-0.027
Token cold
Token ID4692
Feature activation+2.20e-02

Pos logit contributions
t+0.013
ugg+0.013
case+0.012
ave+0.012
 virtue+0.012
Neg logit contributions
 its-0.009
 hovering-0.008
cks-0.008
 his-0.008
 outer-0.007
Token for
Token ID329
Feature activation+1.81e-02

Pos logit contributions
t+0.040
ugg+0.037
case+0.036
bye+0.035
 virtue+0.035
Neg logit contributions
 its-0.031
cks-0.029
 hovering-0.029
 his-0.027
 outer-0.027
Token a
Token ID257
Feature activation+2.76e-02

Pos logit contributions
t+0.040
ugg+0.038
case+0.037
bye+0.036
 judge+0.036
Neg logit contributions
 its-0.016
cks-0.016
 hovering-0.016
 outer-0.013
 perched-0.012
Token re
Token ID302
Feature activation+5.61e-02

Pos logit contributions
ugg+0.050
ave+0.049
t+0.047
omm+0.047
case+0.046
Neg logit contributions
 its-0.038
 his-0.035
 carrot-0.034
 hovering-0.033
 seaw-0.033
Tokenind
Token ID521
Feature activation+1.11e-01

Pos logit contributions
t+0.105
case+0.102
ave+0.101
ugg+0.100
omm+0.099
Neg logit contributions
cks-0.068
 its-0.068
 hovering-0.064
 his-0.061
 perched-0.060
Tokeneer
Token ID28153
Feature activation+3.38e-02

Pos logit contributions
t+0.227
case+0.217
ugg+0.216
 virtue+0.210
omm+0.209
Neg logit contributions
 its-0.128
 his-0.115
cks-0.109
 hovering-0.107
 outer-0.098
Token like
Token ID588
Feature activation+1.41e-02

Pos logit contributions
t+0.068
ugg+0.067
case+0.064
 virtue+0.063
omm+0.063
Neg logit contributions
 its-0.040
 hovering-0.036
 his-0.035
cks-0.034
 perched-0.032
Token you
Token ID345
Feature activation+1.98e-02

Pos logit contributions
t+0.032
ugg+0.030
case+0.029
ave+0.029
 virtue+0.029
Neg logit contributions
 its-0.015
 hovering-0.013
 his-0.012
cks-0.012
 outer-0.011
Token!"
Token ID2474
Feature activation+1.30e-03

Pos logit contributions
t+0.037
ugg+0.036
case+0.035
 virtue+0.034
ave+0.034
Neg logit contributions
 its-0.026
 hovering-0.024
cks-0.023
 his-0.023
 perched-0.021
Token \n
Token ID198
Feature activation-6.17e-03

Pos logit contributions
t+0.002
ugg+0.002
case+0.002
 virtue+0.002
ave+0.002
Neg logit contributions
 its-0.002
 his-0.002
 hovering-0.002
cks-0.002
 perched-0.001
Token when
Token ID618
Feature activation+5.89e-03

Pos logit contributions
t+0.022
ugg+0.020
case+0.020
bye+0.019
ave+0.019
Neg logit contributions
 its-0.017
 hovering-0.017
cks-0.016
 outer-0.016
 his-0.015
Token he
Token ID339
Feature activation+6.24e-03

Pos logit contributions
t+0.014
ugg+0.013
case+0.013
bye+0.012
 judge+0.012
Neg logit contributions
cks-0.005
 its-0.005
 hovering-0.005
 outer-0.004
 pans-0.004
Token saw
Token ID2497
Feature activation+1.97e-02

Pos logit contributions
t+0.013
ugg+0.013
 virtue+0.013
omm+0.012
case+0.012
Neg logit contributions
 its-0.007
 his-0.007
 perched-0.006
 seaw-0.006
 hovering-0.006
Token a
Token ID257
Feature activation+1.50e-02

Pos logit contributions
t+0.037
 tum+0.035
case+0.035
ugg+0.035
bye+0.033
Neg logit contributions
cks-0.024
 hovering-0.023
 its-0.021
 outer-0.019
 pans-0.019
Token re
Token ID302
Feature activation+5.23e-02

Pos logit contributions
t+0.028
ugg+0.026
case+0.025
ave+0.025
bye+0.024
Neg logit contributions
 its-0.021
 hovering-0.019
cks-0.019
 his-0.018
 outer-0.018
Tokenind
Token ID521
Feature activation+1.02e-01

Pos logit contributions
t+0.092
ugg+0.086
case+0.083
ave+0.081
 virtue+0.081
Neg logit contributions
 its-0.077
cks-0.073
 hovering-0.072
 his-0.069
 outer-0.066
Tokeneer
Token ID28153
Feature activation+2.80e-02

Pos logit contributions
t+0.185
ugg+0.176
bye+0.166
case+0.165
ave+0.163
Neg logit contributions
 its-0.143
 hovering-0.138
cks-0.135
 his-0.124
 outer-0.124
Token.
Token ID13
Feature activation+1.86e-03

Pos logit contributions
t+0.056
ugg+0.053
case+0.052
 judge+0.050
bye+0.050
Neg logit contributions
 its-0.033
cks-0.030
 hovering-0.030
 his-0.028
 outer-0.027
Token The
Token ID383
Feature activation+1.51e-02

Pos logit contributions
t+0.003
ugg+0.003
case+0.003
ave+0.003
 virtue+0.003
Neg logit contributions
 its-0.003
 hovering-0.003
cks-0.003
 his-0.003
 outer-0.002
Token re
Token ID302
Feature activation+2.15e-02

Pos logit contributions
t+0.032
ugg+0.031
case+0.030
ave+0.029
 virtue+0.029
Neg logit contributions
 its-0.017
 hovering-0.016
cks-0.015
 his-0.015
 outer-0.014
Tokenind
Token ID521
Feature activation+6.94e-02

Pos logit contributions
t+0.038
ugg+0.035
case+0.035
 virtue+0.034
ave+0.033
Neg logit contributions
 its-0.032
cks-0.030
 hovering-0.030
 his-0.028
 perched-0.027
TokenOnce
Token ID7454
Feature activation-7.62e-03

Pos logit contributions
t+0.038
ugg+0.035
case+0.034
bye+0.031
ave+0.031
Neg logit contributions
 its-0.039
 hovering-0.038
cks-0.038
 outer-0.035
 pans-0.035
Token there
Token ID612
Feature activation+2.07e-02

Pos logit contributions
 its+0.007
 hovering+0.007
cks+0.007
 outer+0.006
 pans+0.006
Neg logit contributions
t-0.018
ugg-0.016
case-0.016
 tum-0.015
ave-0.015
Token was
Token ID373
Feature activation+1.76e-02

Pos logit contributions
t+0.029
ugg+0.026
case+0.025
bye+0.023
ave+0.023
Neg logit contributions
 its-0.038
cks-0.037
 hovering-0.037
 outer-0.034
 his-0.034
Token a
Token ID257
Feature activation+2.69e-02

Pos logit contributions
t+0.034
ugg+0.033
case+0.032
bye+0.031
 tum+0.030
Neg logit contributions
cks-0.021
 hovering-0.021
 its-0.020
 outer-0.018
 pans-0.018
Token re
Token ID302
Feature activation+5.49e-02

Pos logit contributions
t+0.040
ugg+0.036
ave+0.035
case+0.035
 virtue+0.034
Neg logit contributions
 its-0.048
 his-0.045
 hovering-0.044
cks-0.043
 outer-0.042
Tokenind
Token ID521
Feature activation+1.02e-01

Pos logit contributions
t+0.106
case+0.098
ugg+0.097
ave+0.096
 virtue+0.095
Neg logit contributions
 its-0.072
cks-0.069
 hovering-0.065
 his-0.064
 perched-0.061
Tokeneer
Token ID28153
Feature activation+2.73e-02

Pos logit contributions
t+0.199
ugg+0.188
case+0.184
 virtue+0.178
ave+0.177
Neg logit contributions
 its-0.129
cks-0.118
 hovering-0.117
 his-0.114
 outer-0.106
Token named
Token ID3706
Feature activation+3.26e-02

Pos logit contributions
t+0.054
 virtue+0.053
case+0.051
omm+0.051
ugg+0.050
Neg logit contributions
 its-0.036
 his-0.033
 hovering-0.030
 perched-0.029
cks-0.029
Token S
Token ID311
Feature activation+9.75e-03

Pos logit contributions
t+0.057
ugg+0.053
case+0.050
bye+0.049
 judge+0.048
Neg logit contributions
 hovering-0.046
 its-0.045
cks-0.044
 outer-0.040
 pans-0.039
Tokenilly
Token ID6548
Feature activation+1.31e-02

Pos logit contributions
t+0.014
ugg+0.013
case+0.013
bye+0.012
ave+0.012
Neg logit contributions
 its-0.017
 hovering-0.016
cks-0.016
 his-0.015
 outer-0.015
Token.
Token ID13
Feature activation+9.86e-03

Pos logit contributions
t+0.025
ugg+0.024
case+0.023
bye+0.023
ave+0.023
Neg logit contributions
 its-0.016
 hovering-0.016
cks-0.016
 outer-0.014
 his-0.014
Token.
Token ID13
Feature activation+1.41e-03

Pos logit contributions
 its+0.006
 hovering+0.006
cks+0.006
 his+0.005
 outer+0.005
Neg logit contributions
t-0.010
ugg-0.010
case-0.010
ave-0.009
 virtue-0.009
Token Lily
Token ID20037
Feature activation-2.75e-03

Pos logit contributions
t+0.002
ugg+0.002
case+0.001
ave+0.001
 virtue+0.001
Neg logit contributions
 its-0.003
 hovering-0.003
cks-0.003
 his-0.003
 outer-0.003
Token has
Token ID468
Feature activation+3.99e-03

Pos logit contributions
 its+0.004
cks+0.004
 his+0.004
 hovering+0.004
 perched+0.003
Neg logit contributions
t-0.005
ugg-0.005
 virtue-0.004
case-0.004
bye-0.004
Token a
Token ID257
Feature activation+1.26e-02

Pos logit contributions
t+0.010
ugg+0.009
 virtue+0.009
case+0.009
ave+0.009
Neg logit contributions
 its-0.004
 his-0.003
 hovering-0.003
cks-0.003
 outer-0.003
Token re
Token ID302
Feature activation+5.56e-02

Pos logit contributions
t+0.027
ugg+0.026
ave+0.025
bye+0.024
 virtue+0.024
Neg logit contributions
 its-0.015
 his-0.013
 hovering-0.013
 outer-0.012
cks-0.011
Tokenind
Token ID521
Feature activation+1.01e-01

Pos logit contributions
t+0.100
ugg+0.094
case+0.092
ave+0.090
 virtue+0.090
Neg logit contributions
 its-0.078
cks-0.077
 hovering-0.071
 his-0.071
 outer-0.065
Tokeneer
Token ID28153
Feature activation+2.10e-02

Pos logit contributions
t+0.193
ugg+0.183
case+0.176
 virtue+0.171
ave+0.171
Neg logit contributions
 its-0.131
 hovering-0.122
cks-0.121
 his-0.115
 outer-0.110
Token toy
Token ID13373
Feature activation-4.17e-03

Pos logit contributions
t+0.045
ugg+0.043
 virtue+0.042
ave+0.041
bye+0.041
Neg logit contributions
 its-0.024
 his-0.021
 hovering-0.021
cks-0.020
 perched-0.018
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
 its+0.005
 his+0.005
 hovering+0.005
cks+0.005
 perched+0.004
Neg logit contributions
t-0.008
ugg-0.008
case-0.008
 virtue-0.008
ave-0.007
Token It
Token ID632
Feature activation+2.10e-02

Pos logit contributions
t+0.017
ugg+0.016
case+0.016
ave+0.015
 judge+0.015
Neg logit contributions
 its-0.016
 hovering-0.016
cks-0.016
 outer-0.015
 his-0.014
Token is
Token ID318
Feature activation-2.67e-03

Pos logit contributions
t+0.031
ugg+0.029
case+0.028
ave+0.026
 judge+0.026
Neg logit contributions
 its-0.036
 hovering-0.035
cks-0.035
 his-0.033
 outer-0.033
Token some
Token ID617
Feature activation+1.14e-02

Pos logit contributions
 its+0.001
 hovering+0.001
cks+0.001
 his+0.001
 outer+0.001
Neg logit contributions
t-0.002
ugg-0.002
case-0.001
 virtue-0.001
ave-0.001
Token carrots
Token ID34397
Feature activation+7.07e-03

Pos logit contributions
t+0.022
ugg+0.021
case+0.020
ave+0.019
 virtue+0.019
Neg logit contributions
 its-0.015
 his-0.014
 hovering-0.014
cks-0.014
 outer-0.013
Token for
Token ID329
Feature activation+1.39e-02

Pos logit contributions
t+0.013
ugg+0.012
case+0.012
bye+0.011
 virtue+0.011
Neg logit contributions
 its-0.010
 hovering-0.010
cks-0.009
 outer-0.009
 his-0.009
Token his
Token ID465
Feature activation+2.93e-02

Pos logit contributions
t+0.029
ugg+0.028
case+0.027
ave+0.026
 virtue+0.026
Neg logit contributions
 its-0.016
cks-0.015
 hovering-0.014
 his-0.014
 outer-0.013
Token re
Token ID302
Feature activation+4.96e-02

Pos logit contributions
t+0.053
ugg+0.050
ave+0.048
case+0.048
 virtue+0.045
Neg logit contributions
 its-0.044
 his-0.041
 hovering-0.038
cks-0.037
 carrot-0.037
Tokenind
Token ID521
Feature activation+9.75e-02

Pos logit contributions
t+0.094
case+0.088
ugg+0.087
ave+0.084
 virtue+0.083
Neg logit contributions
 its-0.068
cks-0.065
 hovering-0.060
 his-0.060
 perched-0.055
Tokeneer
Token ID28153
Feature activation+3.22e-02

Pos logit contributions
t+0.192
ugg+0.183
case+0.178
 virtue+0.173
ave+0.171
Neg logit contributions
 its-0.122
cks-0.111
 hovering-0.109
 his-0.108
 outer-0.099
Token,"
Token ID553
Feature activation-6.36e-03

Pos logit contributions
t+0.065
ugg+0.063
case+0.060
 virtue+0.060
ave+0.059
Neg logit contributions
 its-0.038
cks-0.034
 hovering-0.034
 his-0.033
 perched-0.030
Token Anna
Token ID11735
Feature activation+8.29e-03

Pos logit contributions
cks+0.010
 hovering+0.010
 its+0.010
 outer+0.009
 perched+0.009
Neg logit contributions
t-0.010
ugg-0.009
case-0.009
 judge-0.009
bye-0.008
Token says
Token ID1139
Feature activation-1.94e-03

Pos logit contributions
t+0.016
ugg+0.015
case+0.015
ave+0.014
 virtue+0.014
Neg logit contributions
 its-0.011
 hovering-0.010
cks-0.010
 his-0.009
 outer-0.009
Token.
Token ID13
Feature activation+3.92e-03

Pos logit contributions
 its+0.002
cks+0.002
 hovering+0.002
 outer+0.002
 his+0.002
Neg logit contributions
t-0.004
ugg-0.003
case-0.003
ave-0.003
bye-0.003
Token?
Token ID30
Feature activation-4.69e-03

Pos logit contributions
t+0.024
ugg+0.022
case+0.022
bye+0.021
 judge+0.021
Neg logit contributions
 its-0.017
 hovering-0.016
cks-0.016
 his-0.015
 outer-0.015
Token I
Token ID314
Feature activation-1.17e-02

Pos logit contributions
 its+0.007
 hovering+0.007
cks+0.007
 his+0.006
 outer+0.006
Neg logit contributions
t-0.008
ugg-0.007
case-0.007
 virtue-0.007
ave-0.007
Token want
Token ID765
Feature activation+1.38e-02

Pos logit contributions
 its+0.017
 his+0.016
 hovering+0.015
 perched+0.014
cks+0.014
Neg logit contributions
omm-0.019
ugg-0.019
case-0.018
t-0.018
becca-0.018
Token a
Token ID257
Feature activation+1.75e-02

Pos logit contributions
t+0.024
ugg+0.024
case+0.024
ave+0.023
 virtue+0.022
Neg logit contributions
 its-0.021
 his-0.020
 hovering-0.017
 outer-0.017
cks-0.017
Token re
Token ID302
Feature activation+5.25e-02

Pos logit contributions
t+0.033
ugg+0.031
case+0.030
ave+0.029
 virtue+0.029
Neg logit contributions
 its-0.024
 hovering-0.022
cks-0.021
 his-0.021
 outer-0.020
Tokenind
Token ID521
Feature activation+9.72e-02

Pos logit contributions
t+0.099
case+0.094
ugg+0.093
ave+0.091
 virtue+0.089
Neg logit contributions
 its-0.068
cks-0.067
 hovering-0.063
 his-0.060
 outer-0.058
Tokeneer
Token ID28153
Feature activation+2.78e-02

Pos logit contributions
t+0.190
ugg+0.180
case+0.176
 virtue+0.170
ave+0.169
Neg logit contributions
 its-0.124
cks-0.113
 hovering-0.113
 his-0.109
 outer-0.102
Token like
Token ID588
Feature activation+1.31e-02

Pos logit contributions
t+0.053
ugg+0.051
case+0.049
ave+0.048
 virtue+0.048
Neg logit contributions
 its-0.036
 hovering-0.032
 his-0.032
cks-0.031
 outer-0.029
Token this
Token ID428
Feature activation+1.35e-02

Pos logit contributions
t+0.026
ugg+0.025
case+0.024
 virtue+0.023
ave+0.023
Neg logit contributions
 its-0.016
 hovering-0.015
cks-0.015
 his-0.014
 outer-0.013
Token one
Token ID530
Feature activation+1.63e-02

Pos logit contributions
t+0.025
ugg+0.024
case+0.023
ave+0.023
 virtue+0.022
Neg logit contributions
 its-0.018
 hovering-0.017
cks-0.017
 his-0.016
 outer-0.015
Token."
Token ID526
Feature activation-2.37e-03

Pos logit contributions
t+0.028
ugg+0.026
case+0.025
bye+0.024
 virtue+0.024
Neg logit contributions
 its-0.025
 hovering-0.023
cks-0.023
 his-0.022
 outer-0.021
Token she
Token ID673
Feature activation-2.54e-02

Pos logit contributions
t+0.012
ugg+0.011
case+0.011
bye+0.011
 judge+0.011
Neg logit contributions
cks-0.007
 its-0.007
 hovering-0.007
 his-0.006
 outer-0.006
Token always
Token ID1464
Feature activation-1.09e-02

Pos logit contributions
 his+0.038
 its+0.037
 perched+0.033
 her+0.032
 hovering+0.031
Neg logit contributions
ugg-0.041
bye-0.041
aring-0.040
t-0.039
ari-0.038
Token wanted
Token ID2227
Feature activation-6.66e-03

Pos logit contributions
 its+0.019
 his+0.018
 hovering+0.017
cks+0.017
 perched+0.016
Neg logit contributions
t-0.017
case-0.016
ugg-0.015
 virtue-0.015
omm-0.015
Token a
Token ID257
Feature activation+9.98e-03

Pos logit contributions
 its+0.010
 his+0.009
 hovering+0.008
 outer+0.008
 her+0.008
Neg logit contributions
t-0.013
ugg-0.013
case-0.012
 virtue-0.012
ave-0.012
Token re
Token ID302
Feature activation+5.10e-02

Pos logit contributions
t+0.019
ugg+0.018
case+0.017
ave+0.017
omm+0.017
Neg logit contributions
 its-0.014
 his-0.013
 hovering-0.012
 outer-0.012
 carrot-0.012
Tokenind
Token ID521
Feature activation+9.71e-02

Pos logit contributions
t+0.101
case+0.097
ugg+0.095
ave+0.095
 virtue+0.093
Neg logit contributions
cks-0.060
 its-0.058
 hovering-0.054
 his-0.050
 perched-0.049
Tokeneer
Token ID28153
Feature activation+2.31e-02

Pos logit contributions
t+0.194
ugg+0.184
case+0.181
 virtue+0.175
ave+0.172
Neg logit contributions
 its-0.119
cks-0.108
 hovering-0.106
 his-0.105
 outer-0.096
Token.
Token ID13
Feature activation-6.99e-04

Pos logit contributions
t+0.043
ugg+0.041
case+0.039
 virtue+0.038
ave+0.038
Neg logit contributions
 its-0.032
 hovering-0.029
 his-0.029
cks-0.028
 outer-0.026
Token One
Token ID1881
Feature activation+1.48e-02

Pos logit contributions
 its+0.001
 hovering+0.001
cks+0.001
 his+0.001
 outer+0.001
Neg logit contributions
t-0.001
ugg-0.001
case-0.001
 virtue-0.001
ave-0.001
Token day
Token ID1110
Feature activation-1.58e-02

Pos logit contributions
t+0.015
ugg+0.014
case+0.013
 virtue+0.013
ave+0.012
Neg logit contributions
 its-0.033
 his-0.031
 hovering-0.031
cks-0.030
 outer-0.029
Token,
Token ID11
Feature activation-5.60e-04

Pos logit contributions
 its+0.024
 hovering+0.023
cks+0.022
 his+0.022
 outer+0.021
Neg logit contributions
t-0.027
ugg-0.025
case-0.024
 virtue-0.024
ave-0.023
Token happy
Token ID3772
Feature activation+5.97e-03

Pos logit contributions
t+0.020
ugg+0.019
 virtue+0.019
case+0.019
ave+0.018
Neg logit contributions
 its-0.010
 his-0.008
cks-0.008
 hovering-0.008
 perched-0.007
Token to
Token ID284
Feature activation-3.01e-03

Pos logit contributions
t+0.009
ugg+0.009
case+0.008
bye+0.008
ave+0.008
Neg logit contributions
 its-0.010
cks-0.009
 hovering-0.009
 his-0.009
 outer-0.009
Token have
Token ID423
Feature activation+8.92e-03

Pos logit contributions
 its+0.005
 his+0.005
cks+0.004
 hovering+0.004
 outer+0.004
Neg logit contributions
t-0.005
ugg-0.005
 virtue-0.005
case-0.004
ave-0.004
Token a
Token ID257
Feature activation+9.40e-03

Pos logit contributions
t+0.018
ugg+0.017
ave+0.017
 judge+0.017
bye+0.017
Neg logit contributions
cks-0.009
 hovering-0.009
 its-0.008
 outer-0.008
 pans-0.008
Token re
Token ID302
Feature activation+5.47e-02

Pos logit contributions
t+0.020
ugg+0.019
case+0.018
ave+0.018
 virtue+0.018
Neg logit contributions
 its-0.011
 his-0.010
 hovering-0.010
cks-0.009
 outer-0.009
Tokenind
Token ID521
Feature activation+9.70e-02

Pos logit contributions
t+0.105
case+0.101
ugg+0.097
ave+0.096
 virtue+0.093
Neg logit contributions
 its-0.069
cks-0.068
 hovering-0.063
 his-0.062
 perched-0.057
Tokeneer
Token ID28153
Feature activation+1.96e-02

Pos logit contributions
t+0.182
ugg+0.172
case+0.163
bye+0.162
ave+0.161
Neg logit contributions
 its-0.131
 hovering-0.124
cks-0.123
 his-0.114
 outer-0.111
Token of
Token ID286
Feature activation+8.97e-03

Pos logit contributions
t+0.039
ugg+0.037
case+0.036
 virtue+0.035
omm+0.035
Neg logit contributions
 its-0.025
 hovering-0.022
 his-0.022
cks-0.021
 perched-0.020
Token her
Token ID607
Feature activation-1.23e-02

Pos logit contributions
t+0.022
ugg+0.021
 judge+0.020
case+0.020
bye+0.020
Neg logit contributions
 its-0.006
 hovering-0.006
cks-0.006
 outer-0.005
 his-0.005
Token own
Token ID898
Feature activation+4.44e-03

Pos logit contributions
 its+0.021
 hovering+0.020
cks+0.020
 his+0.019
 outer+0.019
Neg logit contributions
t-0.018
ugg-0.017
case-0.016
bye-0.016
 virtue-0.016
Token.
Token ID13
Feature activation+2.89e-03

Pos logit contributions
t+0.009
ugg+0.008
case+0.008
ave+0.008
 virtue+0.008
Neg logit contributions
 its-0.006
 his-0.005
 hovering-0.005
cks-0.005
 outer-0.005
Token might
Token ID1244
Feature activation-2.02e-02

Pos logit contributions
t+0.009
ugg+0.009
bye+0.008
 virtue+0.008
omm+0.008
Neg logit contributions
 its-0.007
 his-0.006
cks-0.006
 hovering-0.006
 perched-0.006
Token not
Token ID407
Feature activation-2.06e-02

Pos logit contributions
 its+0.036
 his+0.033
cks+0.032
 hovering+0.032
 outer+0.031
Neg logit contributions
ugg-0.031
t-0.030
ave-0.028
case-0.027
 virtue-0.027
Token have
Token ID423
Feature activation+6.88e-03

Pos logit contributions
 its+0.031
 his+0.029
cks+0.026
 hovering+0.025
 outer+0.024
Neg logit contributions
ugg-0.039
omm-0.037
t-0.035
ave-0.034
 virtue-0.034
Token a
Token ID257
Feature activation+1.58e-02

Pos logit contributions
ugg+0.015
t+0.015
case+0.014
omm+0.013
 virtue+0.013
Neg logit contributions
 its-0.009
 his-0.008
cks-0.007
 hovering-0.006
 outer-0.006
Token re
Token ID302
Feature activation+5.33e-02

Pos logit contributions
t+0.028
ugg+0.027
ave+0.025
case+0.025
bye+0.024
Neg logit contributions
 its-0.024
 his-0.022
 hovering-0.021
cks-0.021
 outer-0.020
Tokenind
Token ID521
Feature activation+9.68e-02

Pos logit contributions
t+0.108
case+0.104
ugg+0.103
ave+0.100
 virtue+0.096
Neg logit contributions
cks-0.063
 its-0.061
 hovering-0.056
 his-0.053
 outer-0.051
Tokeneer
Token ID28153
Feature activation+1.97e-02

Pos logit contributions
t+0.194
ugg+0.186
case+0.184
 virtue+0.176
ave+0.173
Neg logit contributions
 its-0.118
cks-0.105
 his-0.104
 hovering-0.104
 outer-0.095
Token toy
Token ID13373
Feature activation-1.04e-03

Pos logit contributions
t+0.038
ugg+0.037
case+0.035
 virtue+0.035
ave+0.034
Neg logit contributions
 its-0.025
 his-0.023
 hovering-0.023
cks-0.023
 perched-0.020
Token for
Token ID329
Feature activation+1.54e-02

Pos logit contributions
 its+0.001
 his+0.001
 hovering+0.001
cks+0.001
 outer+0.001
Neg logit contributions
t-0.002
ugg-0.002
case-0.002
 virtue-0.002
ave-0.002
Token you
Token ID345
Feature activation+2.32e-02

Pos logit contributions
t+0.033
ugg+0.032
case+0.031
ave+0.030
 virtue+0.030
Neg logit contributions
 its-0.016
cks-0.015
 hovering-0.015
 his-0.014
 outer-0.013
Token.
Token ID13
Feature activation+1.01e-02

Pos logit contributions
t+0.047
ugg+0.046
case+0.044
 virtue+0.043
bye+0.042
Neg logit contributions
 its-0.028
cks-0.025
 his-0.024
 hovering-0.024
 outer-0.022
Token.
Token ID13
Feature activation-1.31e-03

Pos logit contributions
t+0.028
ugg+0.027
case+0.026
 virtue+0.025
ave+0.025
Neg logit contributions
 its-0.017
 hovering-0.015
cks-0.015
 his-0.015
 outer-0.014
Token She
Token ID1375
Feature activation-1.92e-03

Pos logit contributions
 its+0.002
cks+0.002
 hovering+0.002
 his+0.002
 outer+0.002
Neg logit contributions
t-0.002
ugg-0.002
case-0.002
bye-0.002
 virtue-0.002
Token wanted
Token ID2227
Feature activation-2.44e-03

Pos logit contributions
 its+0.003
 his+0.003
 perched+0.002
cks+0.002
 hovering+0.002
Neg logit contributions
t-0.004
ugg-0.004
 virtue-0.004
aring-0.003
bye-0.003
Token a
Token ID257
Feature activation+1.80e-02

Pos logit contributions
 its+0.004
 his+0.003
 hovering+0.003
 outer+0.003
cks+0.003
Neg logit contributions
t-0.005
ugg-0.004
case-0.004
 virtue-0.004
ave-0.004
Token re
Token ID302
Feature activation+5.15e-02

Pos logit contributions
t+0.033
ugg+0.031
case+0.030
ave+0.029
 virtue+0.029
Neg logit contributions
 its-0.025
 hovering-0.023
cks-0.023
 his-0.022
 outer-0.021
Tokenind
Token ID521
Feature activation+9.64e-02

Pos logit contributions
t+0.094
case+0.088
ugg+0.088
ave+0.085
 virtue+0.084
Neg logit contributions
 its-0.071
cks-0.069
 hovering-0.066
 his-0.063
 outer-0.060
Tokeneer
Token ID28153
Feature activation+2.96e-02

Pos logit contributions
t+0.181
ugg+0.172
case+0.164
bye+0.161
ave+0.160
Neg logit contributions
 its-0.129
 hovering-0.122
cks-0.121
 his-0.113
 outer-0.109
Token toy
Token ID13373
Feature activation-3.30e-03

Pos logit contributions
t+0.052
ugg+0.050
case+0.048
 virtue+0.046
ave+0.046
Neg logit contributions
 its-0.043
 hovering-0.040
cks-0.040
 his-0.038
 outer-0.036
Token for
Token ID329
Feature activation+1.38e-02

Pos logit contributions
 its+0.005
 hovering+0.004
 his+0.004
cks+0.004
 outer+0.004
Neg logit contributions
t-0.006
ugg-0.006
case-0.006
 virtue-0.006
ave-0.005
Token Christmas
Token ID6786
Feature activation+3.69e-02

Pos logit contributions
t+0.033
ugg+0.031
case+0.030
 virtue+0.030
bye+0.030
Neg logit contributions
 its-0.011
 hovering-0.010
cks-0.010
 his-0.009
 outer-0.009
Token.
Token ID13
Feature activation-4.82e-03

Pos logit contributions
t+0.079
ugg+0.077
ave+0.075
 virtue+0.073
omm+0.072
Neg logit contributions
 its-0.038
 his-0.035
 hovering-0.032
cks-0.032
 outer-0.031
Token Christmas
Token ID6786
Feature activation+2.25e-02

Pos logit contributions
t+0.027
ugg+0.025
case+0.023
 judge+0.023
bye+0.023
Neg logit contributions
 its-0.033
 hovering-0.033
cks-0.033
 outer-0.030
 his-0.029
Token and
Token ID290
Feature activation+5.71e-03

Pos logit contributions
t+0.044
ugg+0.042
ave+0.041
 virtue+0.040
case+0.039
Neg logit contributions
 its-0.028
 his-0.026
 hovering-0.025
cks-0.024
 outer-0.023
Token he
Token ID339
Feature activation-8.37e-03

Pos logit contributions
t+0.007
ugg+0.006
case+0.006
bye+0.006
ave+0.005
Neg logit contributions
cks-0.011
 its-0.011
 hovering-0.011
 outer-0.010
 his-0.010
Token loved
Token ID6151
Feature activation+2.00e-02

Pos logit contributions
 its+0.014
 his+0.014
 perched+0.012
 hovering+0.012
 her+0.011
Neg logit contributions
t-0.013
 virtue-0.012
aring-0.012
ugg-0.012
 punish-0.012
Token re
Token ID302
Feature activation+5.37e-02

Pos logit contributions
t+0.026
ugg+0.025
bye+0.023
case+0.023
 judge+0.022
Neg logit contributions
cks-0.036
 hovering-0.036
 its-0.035
 outer-0.032
 afloat-0.032
Tokenind
Token ID521
Feature activation+9.61e-02

Pos logit contributions
t+0.096
case+0.089
ugg+0.089
 virtue+0.086
ave+0.085
Neg logit contributions
 its-0.077
cks-0.074
 hovering-0.071
 his-0.069
 outer-0.065
Tokeneer
Token ID28153
Feature activation+2.48e-02

Pos logit contributions
t+0.173
ugg+0.165
bye+0.155
ave+0.154
case+0.152
Neg logit contributions
 its-0.136
 hovering-0.136
cks-0.130
 outer-0.120
 his-0.117
Token.
Token ID13
Feature activation+1.01e-02

Pos logit contributions
t+0.048
ugg+0.046
case+0.044
ave+0.043
 virtue+0.043
Neg logit contributions
 its-0.032
 his-0.030
 hovering-0.028
cks-0.028
 outer-0.025
Token He
Token ID679
Feature activation+9.79e-03

Pos logit contributions
t+0.014
ugg+0.013
case+0.013
bye+0.012
 virtue+0.012
Neg logit contributions
 its-0.018
 hovering-0.018
cks-0.018
 outer-0.016
 perched-0.016
Token would
Token ID561
Feature activation+1.31e-02

Pos logit contributions
t+0.015
 virtue+0.014
ugg+0.013
aring+0.013
case+0.013
Neg logit contributions
 its-0.017
 his-0.017
 perched-0.015
 hovering-0.015
 carrot-0.014
Token often
Token ID1690
Feature activation+5.21e-03

Pos logit contributions
t+0.021
ugg+0.019
ave+0.019
 virtue+0.019
case+0.018
Neg logit contributions
 its-0.023
 his-0.023
 hovering-0.020
cks-0.020
 outer-0.020
Token there
Token ID612
Feature activation+5.32e-03

Pos logit contributions
t+0.020
case+0.019
ugg+0.018
 judge+0.017
um+0.016
Neg logit contributions
 seaw-0.013
rob-0.012
 hovering-0.012
cks-0.012
 its-0.011
Token was
Token ID373
Feature activation+7.21e-03

Pos logit contributions
t+0.006
ugg+0.006
case+0.006
bye+0.005
 virtue+0.005
Neg logit contributions
 its-0.011
 hovering-0.010
cks-0.010
 his-0.010
 outer-0.010
Token a
Token ID257
Feature activation+1.59e-02

Pos logit contributions
t+0.013
case+0.013
ugg+0.013
bye+0.012
 tum+0.012
Neg logit contributions
cks-0.009
 hovering-0.009
 its-0.008
 pans-0.008
 outer-0.008
Token happy
Token ID3772
Feature activation-1.51e-02

Pos logit contributions
t+0.023
ugg+0.021
case+0.020
ave+0.019
 virtue+0.019
Neg logit contributions
 its-0.029
 hovering-0.027
 his-0.027
cks-0.027
 outer-0.025
Token re
Token ID302
Feature activation+5.02e-02

Pos logit contributions
 its+0.023
 his+0.021
 carrot+0.019
 coral+0.019
 hovering+0.018
Neg logit contributions
t-0.030
ave-0.028
 virtue-0.026
case-0.026
ugg-0.025
Tokenind
Token ID521
Feature activation+9.53e-02

Pos logit contributions
t+0.098
case+0.091
ave+0.091
ugg+0.090
 virtue+0.089
Neg logit contributions
 its-0.062
cks-0.061
 hovering-0.056
 his-0.055
 perched-0.051
Tokeneer
Token ID28153
Feature activation+2.24e-02

Pos logit contributions
t+0.181
ugg+0.172
case+0.166
ave+0.161
bye+0.161
Neg logit contributions
 its-0.125
 hovering-0.117
cks-0.116
 his-0.110
 outer-0.105
Token named
Token ID3706
Feature activation+2.95e-02

Pos logit contributions
t+0.043
 virtue+0.041
omm+0.041
case+0.040
ugg+0.040
Neg logit contributions
 its-0.032
 his-0.029
 hovering-0.026
 perched-0.025
cks-0.025
Token Rudolph
Token ID48684
Feature activation+3.22e-02

Pos logit contributions
t+0.050
ugg+0.046
bye+0.043
case+0.042
omm+0.041
Neg logit contributions
 hovering-0.045
cks-0.043
 its-0.043
 outer-0.038
 seaw-0.038
Token.
Token ID13
Feature activation+9.32e-03

Pos logit contributions
t+0.061
ugg+0.057
bye+0.054
case+0.054
 judge+0.053
Neg logit contributions
 its-0.042
 hovering-0.042
cks-0.040
 outer-0.038
 pans-0.036
Token He
Token ID679
Feature activation+8.29e-03

Pos logit contributions
t+0.016
ugg+0.015
bye+0.014
case+0.014
 tum+0.014
Neg logit contributions
cks-0.014
 hovering-0.014
 its-0.013
cking-0.012
 pans-0.012
Token not
Token ID407
Feature activation-7.06e-03

Pos logit contributions
 its+0.008
cks+0.007
 his+0.007
 hovering+0.007
 outer+0.007
Neg logit contributions
t-0.004
ugg-0.004
ave-0.004
 virtue-0.004
case-0.004
Token know
Token ID760
Feature activation+1.06e-02

Pos logit contributions
 its+0.012
cks+0.012
 hovering+0.011
 his+0.011
 outer+0.011
Neg logit contributions
t-0.011
ugg-0.010
case-0.009
 virtue-0.009
ave-0.009
Token what
Token ID644
Feature activation+2.65e-02

Pos logit contributions
t+0.020
case+0.018
 tum+0.018
 moms+0.018
 judge+0.018
Neg logit contributions
 hovering-0.012
cks-0.011
 its-0.011
rob-0.011
 afloat-0.010
Token a
Token ID257
Feature activation+1.24e-02

Pos logit contributions
t+0.053
ugg+0.049
case+0.049
 judge+0.047
bye+0.047
Neg logit contributions
 its-0.030
cks-0.030
 hovering-0.030
 outer-0.026
 his-0.026
Token re
Token ID302
Feature activation+5.06e-02

Pos logit contributions
t+0.021
ugg+0.019
case+0.019
 judge+0.019
 virtue+0.019
Neg logit contributions
 its-0.018
cks-0.018
 hovering-0.017
 his-0.016
 outer-0.016
Tokenind
Token ID521
Feature activation+9.48e-02

Pos logit contributions
t+0.087
ugg+0.082
case+0.079
ave+0.077
 virtue+0.076
Neg logit contributions
 its-0.076
cks-0.072
 hovering-0.071
 his-0.068
 outer-0.065
Tokeneer
Token ID28153
Feature activation+2.22e-02

Pos logit contributions
t+0.173
ugg+0.162
bye+0.153
case+0.153
ave+0.152
Neg logit contributions
 its-0.132
 hovering-0.129
cks-0.125
 outer-0.115
 his-0.115
Token is
Token ID318
Feature activation+4.11e-04

Pos logit contributions
t+0.038
ugg+0.036
case+0.035
 virtue+0.034
ave+0.034
Neg logit contributions
 its-0.033
 hovering-0.031
cks-0.031
 his-0.030
 outer-0.028
Token.
Token ID13
Feature activation-5.76e-03

Pos logit contributions
t+0.001
 judge+0.001
case+0.001
ave+0.001
ugg+0.001
Neg logit contributions
 its-0.001
 hovering-0.001
cks-0.001
 outer-0.001
 pans-0.001
Token They
Token ID1119
Feature activation+2.44e-04

Pos logit contributions
 its+0.009
cks+0.008
 hovering+0.008
 his+0.008
 outer+0.008
Neg logit contributions
t-0.010
ugg-0.009
case-0.009
 virtue-0.009
ave-0.009
Token ask
Token ID1265
Feature activation+1.25e-02

Pos logit contributions
ugg+0.000
t+0.000
 virtue+0.000
omm+0.000
bye+0.000
Neg logit contributions
 its-0.000
 his-0.000
cks-0.000
 hovering-0.000
 perched-0.000
Token there
Token ID612
Feature activation+1.30e-03

Pos logit contributions
t+0.012
case+0.011
ugg+0.011
bye+0.010
 judge+0.010
Neg logit contributions
cks-0.005
 seaw-0.005
 hovering-0.005
rob-0.005
 pans-0.005
Token was
Token ID373
Feature activation+6.84e-04

Pos logit contributions
t+0.002
ugg+0.002
case+0.001
 virtue+0.001
ave+0.001
Neg logit contributions
 its-0.002
 hovering-0.002
cks-0.002
 his-0.002
 outer-0.002
Token a
Token ID257
Feature activation+7.29e-03

Pos logit contributions
t+0.001
case+0.001
ugg+0.001
bye+0.001
 tum+0.001
Neg logit contributions
cks-0.001
 hovering-0.001
 pans-0.001
 its-0.001
b-0.001
Token great
Token ID1049
Feature activation-2.86e-04

Pos logit contributions
t+0.011
ugg+0.010
ave+0.010
case+0.010
 virtue+0.010
Neg logit contributions
 its-0.013
 his-0.012
 hovering-0.012
cks-0.011
 outer-0.011
Token re
Token ID302
Feature activation+4.96e-02

Pos logit contributions
 its+0.000
 his+0.000
 carrot+0.000
 hovering+0.000
 seaw+0.000
Neg logit contributions
t-0.001
ave-0.001
ugg-0.000
 virtue-0.000
case-0.000
Tokenind
Token ID521
Feature activation+9.36e-02

Pos logit contributions
t+0.095
case+0.089
ave+0.089
ugg+0.089
 virtue+0.088
Neg logit contributions
 its-0.062
cks-0.061
 hovering-0.057
 his-0.055
 perched-0.053
Tokeneer
Token ID28153
Feature activation+1.60e-02

Pos logit contributions
t+0.181
ugg+0.171
case+0.166
 virtue+0.161
ave+0.160
Neg logit contributions
 its-0.121
 hovering-0.111
cks-0.111
 his-0.106
 outer-0.100
Token.
Token ID13
Feature activation+3.79e-03

Pos logit contributions
t+0.025
case+0.023
 virtue+0.023
omm+0.023
ugg+0.023
Neg logit contributions
 its-0.028
 his-0.026
 hovering-0.025
cks-0.024
 perched-0.024
Token He
Token ID679
Feature activation+8.01e-03

Pos logit contributions
t+0.006
ugg+0.006
case+0.006
bye+0.006
 judge+0.005
Neg logit contributions
 its-0.006
 hovering-0.006
cks-0.006
 his-0.005
 outer-0.005
Token had
Token ID550
Feature activation-3.94e-04

Pos logit contributions
t+0.012
 virtue+0.011
ugg+0.011
case+0.011
omm+0.011
Neg logit contributions
 its-0.015
 his-0.014
cks-0.013
 hovering-0.013
 perched-0.012
Token a
Token ID257
Feature activation+1.15e-02

Pos logit contributions
 its+0.000
 hovering+0.000
cks+0.000
 his+0.000
 outer+0.000
Neg logit contributions
t-0.001
ugg-0.001
case-0.001
 virtue-0.001
ave-0.001
Token loud
Token ID7812
Feature activation-1.11e-02

Pos logit contributions
ugg+0.003
t+0.003
case+0.003
ave+0.003
omm+0.003
Neg logit contributions
 its-0.002
 his-0.002
 hovering-0.002
 glowing-0.002
 pans-0.002
Token engine
Token ID3113
Feature activation+8.18e-03

Pos logit contributions
 its+0.014
 his+0.013
 hovering+0.012
cks+0.011
 outer+0.011
Neg logit contributions
ugg-0.023
t-0.023
ave-0.022
case-0.021
 virtue-0.021
Token as
Token ID355
Feature activation+4.97e-03

Pos logit contributions
ugg+0.017
t+0.017
 virtue+0.016
ave+0.016
omm+0.016
Neg logit contributions
 its-0.009
 his-0.008
 hovering-0.008
cks-0.008
 perched-0.007
Token it
Token ID340
Feature activation+4.29e-02

Pos logit contributions
t+0.012
ugg+0.011
case+0.011
bye+0.010
 virtue+0.010
Neg logit contributions
 its-0.004
cks-0.004
 hovering-0.004
 perched-0.003
 his-0.003
Token ch
Token ID442
Feature activation+3.36e-02

Pos logit contributions
t+0.066
ugg+0.065
 virtue+0.062
omm+0.060
case+0.060
Neg logit contributions
 its-0.071
 his-0.066
 hovering-0.065
cks-0.065
 outer-0.060
Tokenug
Token ID1018
Feature activation+9.35e-02

Pos logit contributions
t+0.047
ugg+0.044
case+0.042
ave+0.040
 virtue+0.040
Neg logit contributions
 its-0.062
 hovering-0.059
cks-0.058
 his-0.057
 outer-0.055
Tokenged
Token ID2004
Feature activation-2.02e-03

Pos logit contributions
t+0.234
ugg+0.223
case+0.215
ave+0.214
bye+0.212
Neg logit contributions
 its-0.068
 hovering-0.061
cks-0.059
 his-0.052
 outer-0.050
Token along
Token ID1863
Feature activation+2.81e-02

Pos logit contributions
 its+0.002
cks+0.002
 hovering+0.002
 outer+0.002
 his+0.002
Neg logit contributions
t-0.004
ugg-0.004
case-0.004
ave-0.004
 virtue-0.003
Token the
Token ID262
Feature activation+8.33e-03

Pos logit contributions
t+0.060
ugg+0.059
case+0.056
ave+0.055
bye+0.054
Neg logit contributions
 its-0.032
 his-0.028
 hovering-0.027
cks-0.027
 outer-0.025
Token tracks
Token ID8339
Feature activation+9.25e-03

Pos logit contributions
ugg+0.013
t+0.013
ave+0.012
omm+0.011
case+0.011
Neg logit contributions
 its-0.014
 his-0.013
 outer-0.013
 hovering-0.012
cks-0.012
Token.
Token ID13
Feature activation-1.48e-04

Pos logit contributions
ugg+0.020
t+0.020
 virtue+0.019
omm+0.019
case+0.018
Neg logit contributions
 its-0.010
 his-0.008
 hovering-0.008
cks-0.008
 outer-0.007
TokenOnce
Token ID7454
Feature activation-2.28e-02

Pos logit contributions
t+0.030
ugg+0.028
case+0.028
ave+0.025
bye+0.025
Neg logit contributions
 its-0.018
cks-0.018
 hovering-0.018
 outer-0.016
 pans-0.016
Token there
Token ID612
Feature activation+9.38e-03

Pos logit contributions
 hovering+0.025
cks+0.024
 its+0.023
 outer+0.021
 seaw+0.021
Neg logit contributions
t-0.051
case-0.046
ugg-0.045
 tum-0.043
um-0.041
Token was
Token ID373
Feature activation+5.80e-03

Pos logit contributions
t+0.012
ugg+0.010
case+0.010
bye+0.009
ave+0.009
Neg logit contributions
 its-0.019
cks-0.018
 hovering-0.018
 outer-0.017
 his-0.017
Token a
Token ID257
Feature activation+1.16e-02

Pos logit contributions
t+0.010
ugg+0.010
case+0.010
bye+0.010
 tum+0.009
Neg logit contributions
cks-0.008
 hovering-0.007
 pans-0.007
 its-0.007
 outer-0.006
Token re
Token ID302
Feature activation+4.92e-02

Pos logit contributions
t+0.019
ave+0.017
ugg+0.017
case+0.017
 virtue+0.016
Neg logit contributions
 its-0.019
 his-0.018
 hovering-0.018
 outer-0.017
 pans-0.016
Tokenind
Token ID521
Feature activation+9.31e-02

Pos logit contributions
t+0.098
case+0.091
ave+0.091
ugg+0.090
 virtue+0.089
Neg logit contributions
 its-0.060
cks-0.058
 hovering-0.054
 his-0.053
 perched-0.051
Tokeneer
Token ID28153
Feature activation+1.65e-02

Pos logit contributions
t+0.184
ugg+0.174
case+0.171
 virtue+0.166
ave+0.164
Neg logit contributions
 its-0.116
cks-0.105
 hovering-0.105
 his-0.103
 outer-0.095
Token that
Token ID326
Feature activation-2.13e-03

Pos logit contributions
t+0.028
 virtue+0.027
omm+0.027
case+0.026
ugg+0.026
Neg logit contributions
 its-0.027
 his-0.025
 hovering-0.023
 perched-0.023
cks-0.023
Token was
Token ID373
Feature activation+8.81e-03

Pos logit contributions
cks+0.004
 hovering+0.004
 its+0.004
 outer+0.004
 pans+0.004
Neg logit contributions
t-0.002
ugg-0.002
bye-0.002
case-0.002
 tum-0.002
Token so
Token ID523
Feature activation+8.25e-03

Pos logit contributions
t+0.017
ugg+0.016
case+0.015
 virtue+0.015
ave+0.015
Neg logit contributions
 its-0.012
 hovering-0.011
cks-0.011
 his-0.010
 outer-0.010
Token very
Token ID845
Feature activation+1.84e-02

Pos logit contributions
t+0.018
ugg+0.017
case+0.017
 virtue+0.017
ave+0.016
Neg logit contributions
 its-0.009
 his-0.008
cks-0.008
 hovering-0.008
 outer-0.007
Token.
Token ID13
Feature activation-1.13e-02

Pos logit contributions
 its+0.015
cks+0.014
 hovering+0.014
 outer+0.013
 his+0.013
Neg logit contributions
t-0.023
case-0.021
ugg-0.021
ave-0.020
 judge-0.020
Token It
Token ID632
Feature activation+1.46e-02

Pos logit contributions
 its+0.018
 hovering+0.017
cks+0.017
 outer+0.016
 his+0.016
Neg logit contributions
t-0.019
ugg-0.017
case-0.016
bye-0.016
ave-0.016
Token was
Token ID373
Feature activation-7.52e-03

Pos logit contributions
t+0.014
case+0.011
 judge+0.010
ugg+0.009
 tum+0.009
Neg logit contributions
cks-0.032
 outer-0.032
 its-0.032
 hovering-0.032
ocol-0.030
Token a
Token ID257
Feature activation+7.22e-03

Pos logit contributions
cks+0.010
rob+0.009
 hovering+0.009
 its+0.009
 outer+0.009
Neg logit contributions
t-0.013
case-0.012
 tum-0.012
 judge-0.012
omm-0.011
Token re
Token ID302
Feature activation+4.47e-02

Pos logit contributions
t+0.014
ugg+0.013
case+0.013
ave+0.012
 virtue+0.012
Neg logit contributions
 its-0.010
 hovering-0.009
cks-0.009
 his-0.008
 outer-0.008
Tokenind
Token ID521
Feature activation+9.31e-02

Pos logit contributions
t+0.074
ugg+0.069
case+0.067
 virtue+0.064
bye+0.064
Neg logit contributions
 its-0.071
 hovering-0.066
cks-0.065
 his-0.064
 outer-0.061
Tokeneer
Token ID28153
Feature activation+1.49e-02

Pos logit contributions
t+0.151
ugg+0.139
bye+0.136
ave+0.130
 judge+0.126
Neg logit contributions
 hovering-0.147
 its-0.144
cks-0.139
 outer-0.138
 perched-0.132
Token!
Token ID0
Feature activation+4.13e-03

Pos logit contributions
t+0.030
ugg+0.028
case+0.028
 virtue+0.027
bye+0.027
Neg logit contributions
 its-0.018
cks-0.016
 hovering-0.016
 his-0.015
 outer-0.015
Token
Token ID198
Feature activation-2.42e-03

Pos logit contributions
t+0.007
ugg+0.007
case+0.007
bye+0.006
ave+0.006
Neg logit contributions
 its-0.006
 hovering-0.006
cks-0.005
 outer-0.005
 his-0.005
Token
Token ID198
Feature activation-9.14e-03

Pos logit contributions
 its+0.004
cks+0.003
 hovering+0.003
 his+0.003
 outer+0.003
Neg logit contributions
t-0.004
ugg-0.004
case-0.004
 virtue-0.004
ave-0.004
Token"
Token ID1
Feature activation-2.27e-02

Pos logit contributions
 its+0.014
 hovering+0.013
cks+0.013
 his+0.013
 outer+0.012
Neg logit contributions
t-0.015
ugg-0.014
case-0.014
 virtue-0.013
ave-0.013
Token there
Token ID612
Feature activation+7.39e-03

Pos logit contributions
t+0.027
case+0.025
ugg+0.023
 judge+0.021
 never+0.021
Neg logit contributions
 seaw-0.016
cks-0.014
 pans-0.014
rob-0.014
ocol-0.014
Token was
Token ID373
Feature activation+7.22e-03

Pos logit contributions
t+0.009
ugg+0.008
case+0.008
bye+0.007
ave+0.007
Neg logit contributions
 its-0.015
 hovering-0.015
cks-0.015
 outer-0.014
 his-0.014
Token a
Token ID257
Feature activation+1.61e-02

Pos logit contributions
t+0.015
ugg+0.014
case+0.014
bye+0.014
ave+0.013
Neg logit contributions
 its-0.008
cks-0.008
 hovering-0.007
 outer-0.006
 his-0.006
Token poor
Token ID3595
Feature activation+1.28e-02

Pos logit contributions
t+0.024
ugg+0.022
case+0.021
ave+0.021
 virtue+0.020
Neg logit contributions
 its-0.029
 hovering-0.027
 his-0.027
cks-0.026
 outer-0.025
Token re
Token ID302
Feature activation+4.31e-02

Pos logit contributions
t+0.025
ave+0.023
ugg+0.022
case+0.021
 virtue+0.021
Neg logit contributions
 its-0.019
 his-0.017
 carrot-0.016
 hovering-0.016
 coral-0.016
Tokenind
Token ID521
Feature activation+9.30e-02

Pos logit contributions
t+0.083
case+0.078
ave+0.077
ugg+0.077
 virtue+0.075
Neg logit contributions
 its-0.055
cks-0.053
 hovering-0.050
 his-0.048
 perched-0.046
Tokeneer
Token ID28153
Feature activation+1.65e-02

Pos logit contributions
t+0.179
ugg+0.170
case+0.164
 virtue+0.159
ave+0.159
Neg logit contributions
 its-0.121
 hovering-0.112
cks-0.111
 his-0.106
 outer-0.101
Token.
Token ID13
Feature activation+3.65e-03

Pos logit contributions
t+0.031
ugg+0.029
 virtue+0.029
omm+0.028
case+0.028
Neg logit contributions
 its-0.024
 his-0.022
 hovering-0.021
cks-0.020
 perched-0.020
Token He
Token ID679
Feature activation+7.08e-03

Pos logit contributions
t+0.006
ugg+0.005
case+0.005
bye+0.005
 virtue+0.005
Neg logit contributions
 its-0.006
cks-0.006
 hovering-0.006
 his-0.005
 outer-0.005
Token didn
Token ID1422
Feature activation+4.29e-03

Pos logit contributions
t+0.011
ugg+0.011
 virtue+0.010
case+0.010
omm+0.010
Neg logit contributions
 its-0.012
 his-0.012
cks-0.011
 hovering-0.011
 perched-0.011
Token't
Token ID470
Feature activation+1.19e-04

Pos logit contributions
ugg+0.005
 virtue+0.004
ave+0.004
bye+0.004
 tum+0.004
Neg logit contributions
 its-0.009
 his-0.009
 hovering-0.008
cks-0.008
 outer-0.008
Token could
Token ID714
Feature activation+2.07e-02

Pos logit contributions
 its+0.023
 his+0.021
 hovering+0.021
cks+0.020
 perched+0.020
Neg logit contributions
t-0.020
ugg-0.019
 virtue-0.018
case-0.018
omm-0.018
Token join
Token ID4654
Feature activation+1.90e-02

Pos logit contributions
t+0.030
ugg+0.028
case+0.026
ave+0.026
 virtue+0.026
Neg logit contributions
 its-0.037
 hovering-0.035
cks-0.035
 his-0.034
 outer-0.033
Token the
Token ID262
Feature activation+9.53e-03

Pos logit contributions
t+0.036
ugg+0.034
case+0.033
bye+0.032
 judge+0.032
Neg logit contributions
 hovering-0.022
cks-0.021
 its-0.021
 pans-0.018
 outer-0.018
Token other
Token ID584
Feature activation+8.10e-03

Pos logit contributions
t+0.017
ugg+0.017
ave+0.016
case+0.016
 virtue+0.015
Neg logit contributions
 its-0.013
 hovering-0.013
cks-0.012
 his-0.012
 outer-0.012
Token re
Token ID302
Feature activation+4.89e-02

Pos logit contributions
t+0.014
ugg+0.014
case+0.013
ave+0.013
 virtue+0.013
Neg logit contributions
 its-0.012
 hovering-0.011
cks-0.011
 his-0.011
 outer-0.010
Tokenind
Token ID521
Feature activation+9.29e-02

Pos logit contributions
t+0.092
case+0.089
ugg+0.087
ave+0.085
 virtue+0.085
Neg logit contributions
cks-0.063
 its-0.062
 hovering-0.059
 his-0.054
 perched-0.053
Tokeneer
Token ID28153
Feature activation+1.94e-02

Pos logit contributions
t+0.182
ugg+0.173
case+0.170
 virtue+0.165
ave+0.163
Neg logit contributions
 its-0.118
cks-0.108
 hovering-0.107
 his-0.103
 outer-0.097
Token and
Token ID290
Feature activation+9.11e-03

Pos logit contributions
t+0.037
ugg+0.036
case+0.034
omm+0.034
ave+0.034
Neg logit contributions
 its-0.025
 hovering-0.023
 his-0.023
cks-0.022
 outer-0.021
Token play
Token ID711
Feature activation+1.29e-02

Pos logit contributions
t+0.015
ugg+0.014
case+0.013
ave+0.013
 virtue+0.013
Neg logit contributions
 its-0.015
 his-0.014
 hovering-0.013
cks-0.013
 outer-0.013
Token.
Token ID13
Feature activation+6.39e-03

Pos logit contributions
t+0.019
case+0.018
ugg+0.017
bye+0.017
 virtue+0.016
Neg logit contributions
cks-0.021
 hovering-0.020
 its-0.019
 pans-0.019
 outer-0.018
Token He
Token ID679
Feature activation+8.65e-03

Pos logit contributions
t+0.011
ugg+0.011
case+0.010
 virtue+0.010
ave+0.010
Neg logit contributions
 its-0.009
 hovering-0.009
cks-0.009
 his-0.008
 outer-0.008
Token's
Token ID338
Feature activation+1.71e-02

Pos logit contributions
ugg+0.036
t+0.036
case+0.033
 virtue+0.033
bye+0.032
Neg logit contributions
 its-0.033
 hovering-0.031
 his-0.030
cks-0.030
 perched-0.028
Token with
Token ID351
Feature activation-6.15e-03

Pos logit contributions
t+0.025
ugg+0.024
case+0.023
ave+0.022
 virtue+0.022
Neg logit contributions
 its-0.030
 hovering-0.028
cks-0.028
 his-0.027
 outer-0.026
Token a
Token ID257
Feature activation+1.74e-03

Pos logit contributions
 its+0.008
 hovering+0.008
cks+0.008
 his+0.007
 outer+0.007
Neg logit contributions
t-0.012
ugg-0.011
case-0.011
ave-0.010
 virtue-0.010
Token funny
Token ID8258
Feature activation-7.73e-03

Pos logit contributions
t+0.003
ugg+0.003
case+0.003
ave+0.003
omm+0.003
Neg logit contributions
 its-0.003
 hovering-0.003
 his-0.002
 outer-0.002
cks-0.002
Token re
Token ID302
Feature activation+4.52e-02

Pos logit contributions
 its+0.013
 his+0.012
 hovering+0.012
cks+0.011
 seaw+0.011
Neg logit contributions
t-0.013
ugg-0.012
case-0.011
ave-0.011
omm-0.011
Tokenind
Token ID521
Feature activation+9.27e-02

Pos logit contributions
t+0.087
case+0.085
ugg+0.083
ave+0.080
omm+0.077
Neg logit contributions
 its-0.057
cks-0.057
 hovering-0.053
 his-0.050
 outer-0.047
Tokeneer
Token ID28153
Feature activation+1.70e-02

Pos logit contributions
t+0.178
ugg+0.168
case+0.163
 virtue+0.158
ave+0.158
Neg logit contributions
 its-0.121
 hovering-0.113
cks-0.112
 his-0.107
 outer-0.102
Token.
Token ID13
Feature activation-3.49e-04

Pos logit contributions
t+0.034
ugg+0.033
case+0.032
 virtue+0.031
ave+0.031
Neg logit contributions
 its-0.020
 hovering-0.019
cks-0.018
 his-0.018
 outer-0.017
Token<|endoftext|>
Token ID50256
Feature activation+1.69e-02

Pos logit contributions
 its+0.000
 his+0.000
 hovering+0.000
cks+0.000
 perched+0.000
Neg logit contributions
t-0.001
ugg-0.001
ave-0.001
bye-0.001
case-0.001
TokenOnce
Token ID7454
Feature activation-1.48e-02

Pos logit contributions
t+0.027
ugg+0.024
case+0.024
ave+0.022
bye+0.022
Neg logit contributions
 its-0.028
 hovering-0.027
cks-0.027
 outer-0.025
 his-0.024
Token upon
Token ID2402
Feature activation+2.12e-03

Pos logit contributions
 its+0.013
 hovering+0.013
cks+0.013
 outer+0.011
 seaw+0.011
Neg logit contributions
t-0.034
ugg-0.032
case-0.032
ave-0.030
 tum-0.030
Token It
Token ID632
Feature activation+1.49e-02

Pos logit contributions
 its+0.003
 his+0.003
 hovering+0.003
cks+0.003
 perched+0.003
Neg logit contributions
t-0.005
ugg-0.005
bye-0.005
 virtue-0.005
ave-0.005
Token was
Token ID373
Feature activation-7.61e-03

Pos logit contributions
t+0.021
ugg+0.020
case+0.018
 virtue+0.018
bye+0.018
Neg logit contributions
 its-0.027
 hovering-0.026
cks-0.025
 his-0.025
 perched-0.024
Token covered
Token ID5017
Feature activation-3.31e-02

Pos logit contributions
 its+0.009
cks+0.008
 hovering+0.008
 outer+0.007
 his+0.007
Neg logit contributions
t-0.015
ugg-0.014
case-0.014
 virtue-0.014
 judge-0.014
Token in
Token ID287
Feature activation-1.46e-02

Pos logit contributions
 its+0.044
cks+0.043
 hovering+0.041
 outer+0.038
 his+0.038
Neg logit contributions
t-0.061
ugg-0.057
case-0.056
bye-0.053
 virtue-0.053
Token cher
Token ID21382
Feature activation-2.80e-02

Pos logit contributions
 its+0.017
cks+0.017
 hovering+0.016
 his+0.014
 outer+0.014
Neg logit contributions
t-0.028
ugg-0.026
case-0.026
 virtue-0.026
 judge-0.025
Tokenries
Token ID1678
Feature activation-8.10e-02

Pos logit contributions
 its+0.035
 his+0.031
 outer+0.031
 theirs+0.030
 coral+0.027
Neg logit contributions
t-0.054
um-0.050
ugg-0.048
omm-0.047
case-0.047
Token!
Token ID0
Feature activation+2.37e-04

Pos logit contributions
 its+0.097
cks+0.093
 hovering+0.088
 outer+0.083
 his+0.082
Neg logit contributions
t-0.164
case-0.153
ugg-0.150
 virtue-0.143
bye-0.142
Token The
Token ID383
Feature activation+1.08e-02

Pos logit contributions
t+0.000
ugg+0.000
 virtue+0.000
ave+0.000
bye+0.000
Neg logit contributions
 its-0.000
 his-0.000
 hovering-0.000
cks-0.000
 perched-0.000
Token bear
Token ID6842
Feature activation-1.01e-02

Pos logit contributions
t+0.026
ugg+0.025
ave+0.025
case+0.023
 virtue+0.023
Neg logit contributions
 carrot-0.009
 its-0.009
 hovering-0.008
 glowing-0.008
 his-0.008
Token was
Token ID373
Feature activation-1.07e-02

Pos logit contributions
 its+0.015
 his+0.014
 perched+0.014
 hovering+0.013
 carrot+0.011
Neg logit contributions
ugg-0.019
t-0.018
 virtue-0.018
bye-0.017
omm-0.017
Token so
Token ID523
Feature activation+4.74e-04

Pos logit contributions
 its+0.014
 hovering+0.012
 his+0.012
cks+0.012
 perched+0.011
Neg logit contributions
t-0.022
ugg-0.022
bye-0.020
ave-0.020
 virtue-0.020
Token been
Token ID587
Feature activation+6.45e-03

Pos logit contributions
 its+0.010
 his+0.009
 hovering+0.009
cks+0.009
 perched+0.008
Neg logit contributions
ugg-0.010
t-0.010
case-0.009
omm-0.009
 virtue-0.009
Token packed
Token ID11856
Feature activation-3.99e-03

Pos logit contributions
t+0.012
ugg+0.012
case+0.011
bye+0.011
 virtue+0.011
Neg logit contributions
 its-0.009
 hovering-0.008
cks-0.008
 his-0.008
 outer-0.007
Token in
Token ID287
Feature activation-1.25e-03

Pos logit contributions
 its+0.005
 his+0.005
 hovering+0.004
cks+0.004
 outer+0.004
Neg logit contributions
t-0.008
ugg-0.008
case-0.007
ave-0.007
omm-0.007
Token their
Token ID511
Feature activation-2.34e-02

Pos logit contributions
 its+0.001
cks+0.001
 hovering+0.001
 his+0.001
 outer+0.001
Neg logit contributions
t-0.003
ugg-0.003
case-0.003
 virtue-0.003
bye-0.003
Token back
Token ID736
Feature activation-1.68e-03

Pos logit contributions
 its+0.032
 his+0.029
 outer+0.027
 hovering+0.026
cks+0.025
Neg logit contributions
t-0.049
ugg-0.044
ave-0.043
case-0.041
omm-0.040
Tokenpacks
Token ID32377
Feature activation-7.99e-02

Pos logit contributions
 its+0.002
 hovering+0.002
cks+0.002
 his+0.002
 outer+0.002
Neg logit contributions
t-0.003
ugg-0.003
case-0.003
 virtue-0.003
ave-0.003
Token by
Token ID416
Feature activation-1.18e-02

Pos logit contributions
 its+0.107
 hovering+0.097
 his+0.094
cks+0.094
 outer+0.088
Neg logit contributions
t-0.150
ugg-0.144
case-0.136
ave-0.135
 virtue-0.135
Token mistake
Token ID7457
Feature activation-2.45e-02

Pos logit contributions
 its+0.014
 his+0.012
 hovering+0.012
cks+0.012
 outer+0.010
Neg logit contributions
t-0.026
ugg-0.026
case-0.025
ave-0.025
bye-0.023
Token.
Token ID13
Feature activation-2.88e-03

Pos logit contributions
 its+0.032
cks+0.031
 hovering+0.030
 his+0.028
 outer+0.028
Neg logit contributions
t-0.047
ugg-0.043
case-0.042
bye-0.041
ave-0.041
Token The
Token ID383
Feature activation+1.90e-02

Pos logit contributions
 its+0.004
 hovering+0.003
cks+0.003
 his+0.003
 perched+0.003
Neg logit contributions
t-0.005
ugg-0.005
case-0.005
 virtue-0.005
ave-0.005
Token kids
Token ID3988
Feature activation-1.21e-02

Pos logit contributions
ave+0.042
t+0.041
ugg+0.040
case+0.039
EEP+0.036
Neg logit contributions
 glowing-0.019
 its-0.019
 his-0.019
 hovering-0.017
 fl-0.017
Token
Token ID198
Feature activation-3.90e-03

Pos logit contributions
t+0.008
ugg+0.008
case+0.007
ave+0.007
 virtue+0.007
Neg logit contributions
 its-0.008
 hovering-0.007
cks-0.007
 his-0.007
 outer-0.007
TokenThey
Token ID2990
Feature activation-5.50e-03

Pos logit contributions
 its+0.006
cks+0.006
 hovering+0.006
 his+0.006
 outer+0.005
Neg logit contributions
t-0.006
ugg-0.006
case-0.006
 virtue-0.005
ave-0.005
Token grabbed
Token ID13176
Feature activation-2.82e-02

Pos logit contributions
 its+0.009
 his+0.009
 hovering+0.009
 perched+0.008
cks+0.008
Neg logit contributions
t-0.008
 virtue-0.008
ugg-0.008
omm-0.008
case-0.007
Token their
Token ID511
Feature activation-1.16e-02

Pos logit contributions
 its+0.030
 hovering+0.030
cks+0.029
 outer+0.026
 his+0.026
Neg logit contributions
t-0.057
ugg-0.054
case-0.053
bye-0.051
ave-0.051
Token back
Token ID736
Feature activation+8.68e-06

Pos logit contributions
 its+0.017
 hovering+0.016
cks+0.016
 his+0.016
 outer+0.015
Neg logit contributions
t-0.020
ugg-0.019
case-0.018
ave-0.018
 virtue-0.018
Tokenpacks
Token ID32377
Feature activation-7.85e-02

Feature not active on prev token; no predictions were affected.
Token and
Token ID290
Feature activation-1.02e-02

Pos logit contributions
 its+0.102
 hovering+0.091
 his+0.090
cks+0.087
 outer+0.082
Neg logit contributions
t-0.150
ugg-0.144
 virtue-0.137
ave-0.136
case-0.136
Token put
Token ID1234
Feature activation-7.13e-03

Pos logit contributions
 its+0.018
 his+0.017
 hovering+0.016
cks+0.016
 perched+0.016
Neg logit contributions
t-0.015
ugg-0.014
 virtue-0.014
case-0.014
omm-0.013
Token on
Token ID319
Feature activation+4.80e-03

Pos logit contributions
 hovering+0.009
cks+0.009
 outer+0.008
 seaw+0.008
 creepy+0.008
Neg logit contributions
t-0.012
ugg-0.012
 moms-0.011
case-0.011
 tum-0.011
Token their
Token ID511
Feature activation-3.52e-03

Pos logit contributions
t+0.011
ugg+0.011
case+0.011
bye+0.011
 tum+0.010
Neg logit contributions
 hovering-0.003
 its-0.003
cks-0.003
 perched-0.002
 outer-0.002
Token shoes
Token ID10012
Feature activation-6.01e-03

Pos logit contributions
 its+0.005
 hovering+0.005
cks+0.005
 his+0.004
 outer+0.004
Neg logit contributions
t-0.006
ugg-0.006
case-0.006
bye-0.006
ave-0.006
Token coats
Token ID30720
Feature activation-2.81e-02

Pos logit contributions
 its+0.003
 hovering+0.003
cks+0.003
 perched+0.003
 outer+0.003
Neg logit contributions
t-0.004
ugg-0.004
case-0.004
bye-0.004
 tum-0.004
Token and
Token ID290
Feature activation-1.61e-02

Pos logit contributions
 its+0.037
 hovering+0.034
 his+0.033
cks+0.033
 outer+0.030
Neg logit contributions
t-0.053
ugg-0.051
 virtue-0.049
case-0.049
ave-0.048
Token packed
Token ID11856
Feature activation+1.02e-03

Pos logit contributions
 its+0.023
cks+0.023
 hovering+0.023
 outer+0.020
 his+0.020
Neg logit contributions
t-0.028
ugg-0.026
case-0.025
ave-0.025
bye-0.025
Token their
Token ID511
Feature activation-1.19e-02

Pos logit contributions
t+0.002
ugg+0.002
case+0.002
 virtue+0.002
ave+0.002
Neg logit contributions
 its-0.001
 hovering-0.001
cks-0.001
 his-0.001
 outer-0.001
Token back
Token ID736
Feature activation+1.68e-03

Pos logit contributions
 its+0.015
 hovering+0.014
cks+0.014
 his+0.013
 outer+0.012
Neg logit contributions
t-0.023
ugg-0.022
case-0.021
ave-0.021
 virtue-0.021
Tokenpacks
Token ID32377
Feature activation-7.74e-02

Pos logit contributions
t+0.004
ugg+0.004
case+0.004
bye+0.004
ave+0.003
Neg logit contributions
 hovering-0.001
 its-0.001
cks-0.001
 perched-0.001
 outer-0.001
Token.
Token ID13
Feature activation-6.07e-03

Pos logit contributions
 its+0.106
 hovering+0.097
cks+0.097
 his+0.094
 outer+0.089
Neg logit contributions
t-0.143
ugg-0.136
case-0.130
 virtue-0.127
ave-0.127
Token
Token ID198
Feature activation+4.97e-03

Pos logit contributions
 its+0.009
 his+0.008
cks+0.007
 hovering+0.007
 outer+0.007
Neg logit contributions
t-0.011
 virtue-0.010
ugg-0.010
ave-0.010
case-0.010
Token
Token ID198
Feature activation-2.90e-03

Pos logit contributions
t+0.009
ugg+0.009
 virtue+0.008
case+0.008
ave+0.008
Neg logit contributions
 its-0.007
cks-0.006
 his-0.006
 hovering-0.006
 outer-0.006
TokenThe
Token ID464
Feature activation+1.75e-02

Pos logit contributions
 its+0.004
 his+0.004
 hovering+0.003
cks+0.003
 outer+0.003
Neg logit contributions
ugg-0.006
 tum-0.005
omm-0.005
 virtue-0.005
 stomach-0.005
Token whole
Token ID2187
Feature activation-1.46e-02

Pos logit contributions
ugg+0.035
 virtue+0.035
t+0.034
ave+0.034
age+0.033
Neg logit contributions
 pans-0.022
 glowing-0.020
 shining-0.019
 his-0.019
 wooden-0.018
Token a
Token ID257
Feature activation-2.93e-03

Pos logit contributions
cks+0.005
 hovering+0.005
 pans+0.004
ocol+0.004
 afloat+0.004
Neg logit contributions
 tum-0.005
t-0.005
ugg-0.005
 judge-0.005
 moms-0.004
Token pretty
Token ID2495
Feature activation-1.08e-02

Pos logit contributions
 its+0.004
cks+0.004
 hovering+0.004
 his+0.004
 outer+0.003
Neg logit contributions
t-0.005
ugg-0.005
case-0.005
 virtue-0.005
 judge-0.005
Token dancer
Token ID38619
Feature activation-1.75e-03

Pos logit contributions
 its+0.014
cks+0.013
 hovering+0.013
 his+0.012
 outer+0.012
Neg logit contributions
t-0.021
ugg-0.020
case-0.019
bye-0.018
 virtue-0.018
Token bird
Token ID6512
Feature activation-6.07e-03

Pos logit contributions
 its+0.002
cks+0.002
 hovering+0.002
 his+0.002
 outer+0.002
Neg logit contributions
t-0.003
ugg-0.003
case-0.003
 judge-0.003
bye-0.003
Token in
Token ID287
Feature activation+6.56e-03

Pos logit contributions
 its+0.007
cks+0.007
 hovering+0.007
 his+0.007
 outer+0.007
Neg logit contributions
t-0.012
ugg-0.011
case-0.011
 judge-0.010
bye-0.010
Token trouble
Token ID5876
Feature activation-7.65e-02

Pos logit contributions
t+0.010
case+0.010
 judge+0.010
 Everyone+0.010
 tum+0.009
Neg logit contributions
 hovering-0.009
cks-0.008
piring-0.008
 pans-0.008
 atop-0.008
Token.
Token ID13
Feature activation+1.11e-03

Pos logit contributions
 its+0.103
cks+0.101
 hovering+0.099
 pans+0.096
 outer+0.096
Neg logit contributions
t-0.138
case-0.125
ugg-0.122
 judge-0.118
ave-0.117
Token The
Token ID383
Feature activation+1.13e-02

Pos logit contributions
t+0.002
ugg+0.002
case+0.002
ave+0.002
 tum+0.002
Neg logit contributions
cks-0.002
 hovering-0.002
 its-0.002
 outer-0.001
 perched-0.001
Token bird
Token ID6512
Feature activation-1.53e-02

Pos logit contributions
ugg+0.020
t+0.019
ave+0.018
 virtue+0.018
case+0.018
Neg logit contributions
 its-0.018
 hovering-0.016
 his-0.016
 outer-0.015
cks-0.014
Token had
Token ID550
Feature activation-9.49e-03

Pos logit contributions
 its+0.028
 hovering+0.026
cks+0.026
 his+0.025
 outer+0.024
Neg logit contributions
t-0.022
ugg-0.021
case-0.020
 virtue-0.019
ave-0.019
Token fallen
Token ID9292
Feature activation-1.90e-02

Pos logit contributions
 its+0.012
 his+0.010
 hovering+0.010
cks+0.009
 perched+0.009
Neg logit contributions
ugg-0.020
t-0.019
case-0.018
 virtue-0.018
omm-0.018
Token's
Token ID338
Feature activation+1.31e-02

Pos logit contributions
 its+0.006
 hovering+0.005
 his+0.005
 perched+0.005
cks+0.005
Neg logit contributions
t-0.008
 virtue-0.008
ugg-0.008
bye-0.008
case-0.007
Token eyes
Token ID2951
Feature activation-1.61e-02

Pos logit contributions
t+0.027
ave+0.025
ugg+0.023
 virtue+0.023
aring+0.022
Neg logit contributions
 his-0.017
 glowing-0.016
 its-0.016
 hut-0.015
 wooden-0.015
Token spark
Token ID9009
Feature activation-2.28e-02

Pos logit contributions
 its+0.024
 hovering+0.022
cks+0.022
 his+0.021
 outer+0.020
Neg logit contributions
t-0.028
ugg-0.027
 virtue-0.026
case-0.025
bye-0.025
Tokenled
Token ID992
Feature activation-3.53e-02

Pos logit contributions
 its+0.032
cks+0.031
 hovering+0.030
 outer+0.028
 his+0.028
Neg logit contributions
t-0.041
ave-0.039
ugg-0.036
case-0.035
bye-0.035
Token in
Token ID287
Feature activation-5.80e-03

Pos logit contributions
cks+0.067
rob+0.064
 hovering+0.054
 pans+0.054
ocol+0.053
Neg logit contributions
t-0.048
case-0.041
bye-0.040
ave-0.040
um-0.039
Token amaz
Token ID40642
Feature activation-7.56e-02

Pos logit contributions
cks+0.006
 hovering+0.006
rob+0.006
 perched+0.005
 its+0.005
Neg logit contributions
t-0.011
case-0.011
 tum-0.011
 judge-0.010
 hire-0.010
Tokenement
Token ID972
Feature activation-3.68e-02

Pos logit contributions
cks+0.129
 hovering+0.127
 afloat+0.127
 its+0.126
 perched+0.120
Neg logit contributions
t-0.105
um-0.099
case-0.098
omm-0.097
ugg-0.095
Token.
Token ID13
Feature activation+1.67e-03

Pos logit contributions
 its+0.047
cks+0.045
 hovering+0.045
 his+0.041
 outer+0.041
Neg logit contributions
t-0.070
ugg-0.065
case-0.063
bye-0.062
ave-0.062
Token The
Token ID383
Feature activation+1.18e-02

Pos logit contributions
t+0.004
ugg+0.003
ave+0.003
 virtue+0.003
bye+0.003
Neg logit contributions
 its-0.002
 his-0.002
 hovering-0.001
 down-0.001
 perched-0.001
Token palace
Token ID20562
Feature activation-1.79e-02

Pos logit contributions
t+0.022
ugg+0.021
 virtue+0.020
ave+0.020
aring+0.019
Neg logit contributions
 its-0.017
 his-0.016
 outer-0.014
 hovering-0.014
cks-0.014
Token garden
Token ID11376
Feature activation-8.90e-03

Pos logit contributions
 its+0.022
 hovering+0.021
 his+0.020
 perched+0.020
 glowing+0.020
Neg logit contributions
ugg-0.035
 virtue-0.034
aring-0.032
t-0.032
bye-0.032
Token.
Token ID13
Feature activation-3.91e-03

Pos logit contributions
t+0.017
ugg+0.016
case+0.015
 virtue+0.015
ave+0.015
Neg logit contributions
 its-0.010
 hovering-0.010
cks-0.009
 his-0.009
 outer-0.008
Token "
Token ID366
Feature activation-1.51e-02

Pos logit contributions
 its+0.005
 his+0.004
 hovering+0.004
cks+0.003
 perched+0.003
Neg logit contributions
ugg-0.008
t-0.008
 virtue-0.008
ave-0.008
bye-0.008
TokenHe
Token ID1544
Feature activation-1.52e-02

Pos logit contributions
 its+0.021
 hovering+0.020
cks+0.019
 his+0.018
 outer+0.018
Neg logit contributions
t-0.029
ugg-0.026
case-0.025
bye-0.025
ave-0.024
Token is
Token ID318
Feature activation-1.76e-02

Pos logit contributions
 its+0.021
cks+0.021
 hovering+0.020
 his+0.020
 perched+0.019
Neg logit contributions
t-0.027
ugg-0.026
 virtue-0.025
bye-0.024
omm-0.024
Token in
Token ID287
Feature activation+1.01e-02

Pos logit contributions
 its+0.021
cks+0.020
 hovering+0.020
 outer+0.018
 seaw+0.018
Neg logit contributions
t-0.034
case-0.032
 judge-0.031
ugg-0.030
 virtue-0.030
Token trouble
Token ID5876
Feature activation-7.55e-02

Pos logit contributions
t+0.021
case+0.020
ugg+0.019
 judge+0.019
 virtue+0.019
Neg logit contributions
 hovering-0.010
 its-0.010
cks-0.010
 perched-0.009
 seaw-0.008
Token!"
Token ID2474
Feature activation+7.34e-04

Pos logit contributions
 its+0.083
 hovering+0.075
cks+0.075
 his+0.072
 outer+0.064
Neg logit contributions
t-0.159
ugg-0.156
case-0.148
 virtue-0.146
ave-0.145
Token\n
Token ID198
Feature activation+1.66e-04

Pos logit contributions
t+0.001
ugg+0.001
 virtue+0.001
ave+0.001
case+0.001
Neg logit contributions
 its-0.001
 his-0.001
cks-0.001
 hovering-0.001
 perched-0.001
Token\n
Token ID198
Feature activation-6.48e-03

Pos logit contributions
t+0.000
ugg+0.000
case+0.000
 virtue+0.000
ave+0.000
Neg logit contributions
 its-0.000
cks-0.000
 hovering-0.000
 his-0.000
 outer-0.000
TokenTim
Token ID14967
Feature activation-9.81e-03

Pos logit contributions
 its+0.009
 his+0.009
cks+0.008
 hovering+0.008
 outer+0.007
Neg logit contributions
t-0.012
ugg-0.012
ave-0.011
case-0.011
 virtue-0.011
Token's
Token ID338
Feature activation+2.37e-02

Pos logit contributions
 his+0.013
 its+0.013
 hovering+0.011
 perched+0.011
cks+0.011
Neg logit contributions
ugg-0.019
 virtue-0.019
t-0.019
bye-0.018
case-0.017
Token Ell
Token ID7122
Feature activation+3.99e-03

Pos logit contributions
 its+0.006
 hovering+0.006
cks+0.006
 his+0.006
 outer+0.006
Neg logit contributions
t-0.006
ugg-0.006
case-0.006
 virtue-0.005
ave-0.005
Tokena
Token ID64
Feature activation+1.98e-02

Pos logit contributions
t+0.006
 virtue+0.006
case+0.006
ugg+0.006
 tum+0.006
Neg logit contributions
 its-0.006
cks-0.006
 his-0.006
 hovering-0.006
 perched-0.005
Token found
Token ID1043
Feature activation-4.67e-03

Pos logit contributions
 virtue+0.042
t+0.041
ugg+0.040
omm+0.039
 conscience+0.036
Neg logit contributions
 its-0.023
 his-0.022
 perched-0.021
 hovering-0.019
 there-0.018
Token herself
Token ID5223
Feature activation-3.11e-02

Pos logit contributions
cks+0.005
 its+0.005
 hovering+0.005
 outer+0.004
 his+0.004
Neg logit contributions
t-0.010
ugg-0.009
case-0.009
ave-0.009
 virtue-0.009
Token in
Token ID287
Feature activation+1.50e-02

Pos logit contributions
 its+0.035
 hovering+0.032
cks+0.031
 his+0.030
 outer+0.028
Neg logit contributions
t-0.066
ugg-0.064
case-0.060
ave-0.060
bye-0.060
Token trouble
Token ID5876
Feature activation-7.55e-02

Pos logit contributions
t+0.034
ugg+0.033
case+0.032
 judge+0.032
 virtue+0.031
Neg logit contributions
cks-0.012
 its-0.012
 hovering-0.012
 his-0.010
 perched-0.010
Token.
Token ID13
Feature activation+6.10e-04

Pos logit contributions
 its+0.098
cks+0.093
 hovering+0.092
 outer+0.085
 his+0.084
Neg logit contributions
t-0.144
case-0.132
ugg-0.132
 virtue-0.125
ave-0.124
Token She
Token ID1375
Feature activation-5.47e-03

Pos logit contributions
t+0.001
 virtue+0.001
ave+0.001
ugg+0.001
bye+0.001
Neg logit contributions
 its-0.001
 his-0.000
 down-0.000
 her-0.000
 hovering-0.000
Token had
Token ID550
Feature activation-6.66e-03

Pos logit contributions
 its+0.008
 his+0.008
 perched+0.007
 hovering+0.006
cks+0.006
Neg logit contributions
ugg-0.011
t-0.010
 virtue-0.010
bye-0.010
omm-0.009
Token been
Token ID587
Feature activation-2.55e-04

Pos logit contributions
 its+0.011
 his+0.010
 hovering+0.009
 perched+0.009
cks+0.009
Neg logit contributions
ugg-0.012
t-0.011
bye-0.011
omm-0.010
case-0.010
Token walking
Token ID6155
Feature activation+4.46e-03

Pos logit contributions
 its+0.000
 hovering+0.000
cks+0.000
 his+0.000
 outer+0.000
Neg logit contributions
t-0.001
ugg-0.001
case-0.000
bye-0.000
 virtue-0.000
Token a
Token ID257
Feature activation+1.14e-03

Pos logit contributions
cks+0.010
 its+0.009
 hovering+0.009
rob+0.009
 pans+0.008
Neg logit contributions
t-0.015
case-0.014
 judge-0.014
 virtue-0.014
ugg-0.014
Token small
Token ID1402
Feature activation+3.17e-03

Pos logit contributions
t+0.002
ugg+0.002
ave+0.002
case+0.002
omm+0.002
Neg logit contributions
 its-0.001
 hovering-0.001
 his-0.001
 outer-0.001
 glowing-0.001
Token town
Token ID3240
Feature activation+2.23e-02

Pos logit contributions
t+0.006
ugg+0.006
ave+0.006
case+0.005
omm+0.005
Neg logit contributions
 its-0.004
 hovering-0.004
 his-0.004
 outer-0.004
 perched-0.004
Token,
Token ID11
Feature activation-2.85e-03

Pos logit contributions
ugg+0.048
ave+0.047
t+0.045
omm+0.044
 virtue+0.043
Neg logit contributions
 its-0.026
 hovering-0.025
 his-0.022
cks-0.020
 perched-0.020
Token in
Token ID287
Feature activation+8.50e-03

Pos logit contributions
 its+0.004
cks+0.004
 hovering+0.004
 his+0.003
 outer+0.003
Neg logit contributions
t-0.005
ugg-0.005
case-0.005
bye-0.005
 virtue-0.005
Token trouble
Token ID5876
Feature activation-7.54e-02

Pos logit contributions
t+0.017
case+0.016
 judge+0.016
 virtue+0.015
ugg+0.015
Neg logit contributions
cks-0.009
 hovering-0.009
 its-0.008
 perched-0.008
rob-0.007
Token.
Token ID13
Feature activation-8.00e-03

Pos logit contributions
 its+0.095
cks+0.091
 hovering+0.088
 his+0.083
 outer+0.083
Neg logit contributions
t-0.145
case-0.133
ugg-0.131
 judge-0.126
bye-0.126
Token The
Token ID383
Feature activation+7.90e-03

Pos logit contributions
 its+0.012
 hovering+0.010
cks+0.010
 his+0.010
 outer+0.009
Neg logit contributions
t-0.014
ugg-0.014
case-0.013
 virtue-0.013
ave-0.013
Token jet
Token ID12644
Feature activation-9.80e-04

Pos logit contributions
ugg+0.017
t+0.016
ave+0.015
omm+0.015
aring+0.014
Neg logit contributions
 its-0.010
 hovering-0.009
 glowing-0.009
 his-0.009
 shining-0.008
Token swoop
Token ID38527
Feature activation-2.92e-02

Pos logit contributions
 its+0.001
 hovering+0.001
 perched+0.001
 his+0.001
 atop+0.001
Neg logit contributions
ugg-0.002
omm-0.002
 virtue-0.002
t-0.002
 moms-0.002
Tokened
Token ID276
Feature activation-2.80e-02

Pos logit contributions
cks+0.051
 its+0.051
 hovering+0.049
 outer+0.049
 his+0.047
Neg logit contributions
t-0.044
ugg-0.038
ave-0.037
bye-0.036
case-0.035
Token "
Token ID366
Feature activation-2.29e-02

Pos logit contributions
 its+0.003
cks+0.003
 hovering+0.003
 outer+0.003
 his+0.003
Neg logit contributions
t-0.004
ugg-0.004
case-0.004
ave-0.004
 virtue-0.004
TokenI
Token ID40
Feature activation-1.52e-02

Pos logit contributions
 its+0.034
 hovering+0.032
cks+0.032
 his+0.031
 outer+0.029
Neg logit contributions
t-0.040
ugg-0.037
case-0.036
 virtue-0.035
ave-0.035
Token'm
Token ID1101
Feature activation-1.32e-02

Pos logit contributions
 its+0.024
 his+0.022
 hovering+0.022
 perched+0.020
cks+0.020
Neg logit contributions
t-0.024
ugg-0.024
case-0.023
omm-0.023
 virtue-0.022
Token in
Token ID287
Feature activation+9.59e-03

Pos logit contributions
 its+0.017
cks+0.017
 hovering+0.016
 outer+0.015
 his+0.015
Neg logit contributions
t-0.024
ugg-0.022
case-0.022
 judge-0.022
 virtue-0.021
Token big
Token ID1263
Feature activation-3.91e-02

Pos logit contributions
t+0.022
ugg+0.021
case+0.020
 virtue+0.020
bye+0.019
Neg logit contributions
 its-0.009
cks-0.008
 hovering-0.008
 his-0.007
 outer-0.007
Token trouble
Token ID5876
Feature activation-7.51e-02

Pos logit contributions
 its+0.057
cks+0.056
 hovering+0.055
 perched+0.050
 outer+0.050
Neg logit contributions
t-0.064
case-0.060
ugg-0.059
 judge-0.058
 tum-0.057
Token.
Token ID13
Feature activation-2.25e-02

Pos logit contributions
 its+0.093
cks+0.086
 hovering+0.086
 his+0.081
 outer+0.078
Neg logit contributions
t-0.148
ugg-0.140
case-0.136
 virtue-0.132
ave-0.131
Token Can
Token ID1680
Feature activation-7.30e-03

Pos logit contributions
 its+0.030
 his+0.027
 hovering+0.025
 outer+0.023
cks+0.023
Neg logit contributions
t-0.043
ugg-0.041
ave-0.039
case-0.039
 virtue-0.039
Token you
Token ID345
Feature activation-1.18e-02

Pos logit contributions
 its+0.008
 his+0.008
 hovering+0.007
cks+0.006
 perched+0.006
Neg logit contributions
ugg-0.015
ave-0.015
 virtue-0.015
case-0.014
t-0.014
Token help
Token ID1037
Feature activation-1.82e-04

Pos logit contributions
 its+0.018
 his+0.017
 hovering+0.016
 outer+0.015
cks+0.015
Neg logit contributions
ugg-0.020
t-0.019
ave-0.018
 virtue-0.018
omm-0.018
Token me
Token ID502
Feature activation+1.33e-02

Pos logit contributions
 its+0.000
 his+0.000
cks+0.000
 hovering+0.000
 outer+0.000
Neg logit contributions
t-0.000
ugg-0.000
case-0.000
ave-0.000
 virtue-0.000
Token\n
Token ID198
Feature activation-3.72e-03

Pos logit contributions
t+0.006
ugg+0.006
case+0.006
 virtue+0.005
bye+0.005
Neg logit contributions
 its-0.005
 hovering-0.005
cks-0.005
 his-0.005
 outer-0.004
TokenB
Token ID33
Feature activation+2.49e-02

Pos logit contributions
 its+0.005
 hovering+0.004
cks+0.004
 his+0.004
 outer+0.004
Neg logit contributions
t-0.007
ugg-0.007
case-0.007
 virtue-0.006
bye-0.006
Tokenenny
Token ID11870
Feature activation+5.99e-03

Pos logit contributions
 salon+0.032
 tum+0.032
case+0.031
 moms+0.030
 punish+0.030
Neg logit contributions
 its-0.040
 his-0.038
 afloat-0.037
 hovering-0.036
 down-0.035
Token got
Token ID1392
Feature activation-2.39e-03

Pos logit contributions
ugg+0.013
 virtue+0.012
t+0.012
uggle+0.012
umm+0.012
Neg logit contributions
 its-0.007
 his-0.007
 perched-0.005
 her-0.005
 fl-0.005
Token in
Token ID287
Feature activation+1.57e-02

Pos logit contributions
 its+0.003
 his+0.003
 down+0.002
 hovering+0.002
 perched+0.002
Neg logit contributions
t-0.005
ugg-0.005
case-0.005
 virtue-0.005
bye-0.005
Token trouble
Token ID5876
Feature activation-7.51e-02

Pos logit contributions
t+0.036
ugg+0.034
case+0.033
 virtue+0.032
ave+0.032
Neg logit contributions
 its-0.014
cks-0.013
 hovering-0.013
 his-0.012
 outer-0.012
Token and
Token ID290
Feature activation+7.77e-03

Pos logit contributions
 its+0.099
 hovering+0.091
cks+0.090
 his+0.087
 outer+0.081
Neg logit contributions
t-0.143
ugg-0.137
case-0.131
ave-0.128
 virtue-0.128
Token had
Token ID550
Feature activation+3.42e-04

Pos logit contributions
t+0.013
ugg+0.012
case+0.012
 virtue+0.012
omm+0.011
Neg logit contributions
 its-0.013
 his-0.013
 hovering-0.011
 perched-0.011
cks-0.011
Token to
Token ID284
Feature activation-4.50e-03

Pos logit contributions
ugg+0.001
t+0.001
case+0.001
bye+0.001
omm+0.001
Neg logit contributions
 its-0.001
 his-0.000
 hovering-0.000
 perched-0.000
cks-0.000
Token go
Token ID467
Feature activation+1.86e-02

Pos logit contributions
 its+0.008
 his+0.008
cks+0.007
 her+0.006
 hovering+0.006
Neg logit contributions
t-0.008
ugg-0.008
ave-0.007
case-0.006
omm-0.006
Token to
Token ID284
Feature activation-1.13e-02

Pos logit contributions
t+0.034
ugg+0.034
case+0.032
 virtue+0.032
omm+0.032
Neg logit contributions
 its-0.025
 his-0.024
cks-0.022
 hovering-0.021
 down-0.021
Token it
Token ID340
Feature activation-5.37e-03

Pos logit contributions
cks+0.022
 its+0.022
 hovering+0.021
 outer+0.019
 his+0.019
Neg logit contributions
t-0.045
case-0.042
ugg-0.041
bye-0.040
 judge-0.039
Token was
Token ID373
Feature activation-1.64e-02

Pos logit contributions
 its+0.010
cks+0.009
 hovering+0.009
 his+0.009
 outer+0.009
Neg logit contributions
t-0.007
ugg-0.007
case-0.007
ave-0.006
 virtue-0.006
Token an
Token ID281
Feature activation-1.76e-02

Pos logit contributions
cks+0.019
 its+0.019
 hovering+0.018
 outer+0.016
 his+0.016
Neg logit contributions
t-0.032
case-0.030
 virtue-0.029
 judge-0.029
bye-0.028
Token animal
Token ID5044
Feature activation-3.43e-03

Pos logit contributions
 its+0.027
 hovering+0.025
cks+0.025
 his+0.024
 outer+0.023
Neg logit contributions
t-0.030
ugg-0.029
case-0.027
ave-0.026
 virtue-0.026
Token in
Token ID287
Feature activation-1.21e-03

Pos logit contributions
 its+0.004
 hovering+0.004
 his+0.003
cks+0.003
 outer+0.003
Neg logit contributions
t-0.007
ugg-0.007
case-0.007
ave-0.007
 virtue-0.006
Token trouble
Token ID5876
Feature activation-7.50e-02

Pos logit contributions
 hovering+0.001
cks+0.001
 its+0.001
 perched+0.001
rob+0.001
Neg logit contributions
t-0.002
case-0.002
 judge-0.002
bye-0.002
 punish-0.002
Token!"
Token ID2474
Feature activation-3.63e-04

Pos logit contributions
 its+0.094
 hovering+0.087
cks+0.087
 his+0.082
 outer+0.078
Neg logit contributions
t-0.148
ugg-0.139
case-0.135
 virtue-0.131
ave-0.131
Token\n
Token ID198
Feature activation-7.43e-03

Pos logit contributions
 its+0.001
 hovering+0.000
cks+0.000
 his+0.000
 outer+0.000
Neg logit contributions
t-0.001
ugg-0.001
case-0.001
 virtue-0.001
ave-0.001
Token\n
Token ID198
Feature activation-1.35e-02

Pos logit contributions
 its+0.011
cks+0.009
 hovering+0.009
 his+0.009
 perched+0.008
Neg logit contributions
t-0.014
ugg-0.013
case-0.012
 virtue-0.012
ave-0.012
TokenSo
Token ID2396
Feature activation+1.08e-03

Pos logit contributions
 its+0.021
 his+0.019
 hovering+0.019
cks+0.018
 perched+0.017
Neg logit contributions
t-0.023
ugg-0.023
case-0.021
 virtue-0.021
omm-0.020
Token Terry
Token ID14286
Feature activation+1.41e-02

Pos logit contributions
t+0.002
ugg+0.002
case+0.002
 virtue+0.002
ave+0.002
Neg logit contributions
 its-0.001
 hovering-0.001
 his-0.001
cks-0.001
 outer-0.001
Token\n
Token ID198
Feature activation+1.36e-03

Pos logit contributions
t+0.026
ugg+0.023
case+0.023
ave+0.021
 virtue+0.021
Neg logit contributions
 its-0.025
 hovering-0.025
cks-0.024
 outer-0.023
 his-0.023
Token\n
Token ID198
Feature activation-7.95e-03

Pos logit contributions
t+0.002
ugg+0.002
case+0.002
 virtue+0.002
bye+0.002
Neg logit contributions
 its-0.002
 his-0.002
 hovering-0.002
cks-0.002
 outer-0.002
TokenInside
Token ID24441
Feature activation-2.60e-02

Pos logit contributions
 its+0.012
 his+0.011
 hovering+0.011
cks+0.010
 outer+0.010
Neg logit contributions
t-0.014
ugg-0.013
case-0.013
 virtue-0.012
bye-0.012
Token the
Token ID262
Feature activation-2.03e-02

Pos logit contributions
cks+0.033
 hovering+0.032
 its+0.029
 outer+0.029
 pans+0.028
Neg logit contributions
t-0.053
case-0.048
ugg-0.047
um-0.045
bye-0.044
Token sand
Token ID6450
Feature activation-1.52e-02

Pos logit contributions
 its+0.022
 hovering+0.022
cks+0.021
 perched+0.018
 outer+0.018
Neg logit contributions
t-0.043
case-0.041
ugg-0.039
 judge-0.038
bye-0.038
Tokenpit
Token ID15544
Feature activation-7.47e-02

Pos logit contributions
 its+0.024
 hovering+0.024
cks+0.023
 outer+0.022
 perched+0.021
Neg logit contributions
t-0.025
case-0.022
ugg-0.022
bye-0.020
 judge-0.020
Token he
Token ID339
Feature activation-3.44e-02

Pos logit contributions
 its+0.115
 hovering+0.107
cks+0.106
 his+0.104
 outer+0.098
Neg logit contributions
t-0.126
ugg-0.119
case-0.113
 virtue-0.111
ave-0.110
Token found
Token ID1043
Feature activation+1.08e-03

Pos logit contributions
 its+0.037
 his+0.036
 hovering+0.030
 perched+0.030
cks+0.030
Neg logit contributions
t-0.074
ugg-0.073
 virtue-0.072
bye-0.069
omm-0.069
Token a
Token ID257
Feature activation-3.04e-03

Pos logit contributions
t+0.002
ugg+0.002
case+0.002
ave+0.002
bye+0.002
Neg logit contributions
 its-0.001
cks-0.001
 hovering-0.001
 outer-0.001
 pans-0.001
Token teacher
Token ID4701
Feature activation+3.90e-03

Pos logit contributions
 its+0.004
 his+0.003
 hovering+0.003
 outer+0.003
 seaw+0.003
Neg logit contributions
t-0.006
ugg-0.006
ave-0.006
bye-0.006
case-0.006
Token with
Token ID351
Feature activation-1.03e-02

Pos logit contributions
t+0.008
ugg+0.007
case+0.007
 virtue+0.007
ave+0.007
Neg logit contributions
 its-0.005
 hovering-0.005
cks-0.004
 his-0.004
 outer-0.004
Token agreed
Token ID4987
Feature activation+2.76e-03

Pos logit contributions
 its+0.001
 his+0.001
 perched+0.001
 her+0.001
 down+0.001
Neg logit contributions
t-0.001
omm-0.001
 virtue-0.001
ugg-0.001
case-0.001
Token and
Token ID290
Feature activation+1.35e-02

Pos logit contributions
ugg+0.006
case+0.006
t+0.006
ave+0.006
omm+0.006
Neg logit contributions
 its-0.003
 his-0.003
 their-0.002
 her-0.002
 hovering-0.002
Token packed
Token ID11856
Feature activation+2.72e-03

Pos logit contributions
t+0.025
omm+0.022
case+0.022
ugg+0.022
 virtue+0.022
Neg logit contributions
 its-0.023
 his-0.022
 their-0.018
 her-0.017
 down-0.017
Token their
Token ID511
Feature activation-1.84e-02

Pos logit contributions
t+0.008
ugg+0.007
case+0.007
 virtue+0.007
bye+0.007
Neg logit contributions
 its-0.001
 his-0.001
cks-0.001
 hovering-0.001
 outer-0.001
Token back
Token ID736
Feature activation+7.98e-04

Pos logit contributions
 its+0.021
 his+0.019
cks+0.018
 hovering+0.017
 pans+0.017
Neg logit contributions
t-0.040
ugg-0.037
case-0.036
 virtue-0.036
bye-0.035
Tokenpacks
Token ID32377
Feature activation-7.43e-02

Pos logit contributions
t+0.002
ugg+0.002
 virtue+0.002
case+0.002
ave+0.002
Neg logit contributions
 its-0.001
 his-0.001
cks-0.001
 hovering-0.000
 outer-0.000
Token with
Token ID351
Feature activation-1.59e-02

Pos logit contributions
 its+0.095
 his+0.088
 hovering+0.078
cks+0.074
 outer+0.072
Neg logit contributions
t-0.143
ugg-0.139
 virtue-0.135
omm-0.134
ave-0.132
Token some
Token ID617
Feature activation-2.63e-03

Pos logit contributions
 its+0.012
 hovering+0.010
cks+0.010
 his+0.009
 outer+0.008
Neg logit contributions
t-0.040
ugg-0.038
case-0.037
 virtue-0.036
ave-0.036
Token of
Token ID286
Feature activation-3.75e-03

Pos logit contributions
 its+0.003
 his+0.003
 hovering+0.003
cks+0.003
 seaw+0.003
Neg logit contributions
t-0.005
ugg-0.005
case-0.005
 virtue-0.005
bye-0.005
Token their
Token ID511
Feature activation+9.20e-04

Pos logit contributions
 its+0.002
 hovering+0.002
cks+0.002
 his+0.002
 outer+0.002
Neg logit contributions
t-0.010
ugg-0.009
case-0.009
 virtue-0.009
bye-0.009
Token favorite
Token ID4004
Feature activation-1.93e-02

Pos logit contributions
t+0.002
ugg+0.002
case+0.002
 virtue+0.002
ave+0.002
Neg logit contributions
 its-0.001
 his-0.001
 pans-0.001
cks-0.001
 hovering-0.001
Token "
Token ID366
Feature activation-4.70e-03

Pos logit contributions
t+0.021
ugg+0.020
case+0.019
 virtue+0.018
ave+0.018
Neg logit contributions
 its-0.015
 hovering-0.014
cks-0.014
 his-0.014
 outer-0.013
TokenYou
Token ID1639
Feature activation+4.80e-04

Pos logit contributions
 its+0.008
 hovering+0.007
cks+0.007
 his+0.007
 outer+0.007
Neg logit contributions
t-0.008
ugg-0.007
case-0.007
ave-0.006
bye-0.006
Token are
Token ID389
Feature activation+9.09e-04

Pos logit contributions
ugg+0.001
t+0.001
omm+0.001
bye+0.001
 virtue+0.001
Neg logit contributions
 its-0.001
 his-0.001
cks-0.001
 hovering-0.001
 seaw-0.000
Token in
Token ID287
Feature activation+7.51e-03

Pos logit contributions
t+0.001
 judge+0.001
case+0.001
 punish+0.001
ugg+0.001
Neg logit contributions
 its-0.001
 hovering-0.001
cks-0.001
 outer-0.001
 perched-0.001
Token big
Token ID1263
Feature activation-3.84e-02

Pos logit contributions
t+0.017
ugg+0.017
case+0.016
ave+0.016
 virtue+0.016
Neg logit contributions
 its-0.007
 hovering-0.006
cks-0.006
 his-0.006
 outer-0.005
Token trouble
Token ID5876
Feature activation-7.43e-02

Pos logit contributions
 its+0.052
 hovering+0.050
cks+0.049
 outer+0.046
 his+0.044
Neg logit contributions
t-0.070
ugg-0.066
case-0.064
 judge-0.063
ave-0.062
Token.
Token ID13
Feature activation-7.51e-03

Pos logit contributions
 its+0.083
 hovering+0.075
cks+0.075
 his+0.072
 outer+0.066
Neg logit contributions
t-0.156
ugg-0.150
case-0.144
ave-0.141
 virtue-0.141
Token I
Token ID314
Feature activation-8.61e-03

Pos logit contributions
 its+0.010
 his+0.009
cks+0.009
 hovering+0.009
 outer+0.008
Neg logit contributions
t-0.014
case-0.013
ugg-0.013
 virtue-0.013
ave-0.013
Token'm
Token ID1101
Feature activation+1.45e-03

Pos logit contributions
 its+0.012
 his+0.011
cks+0.010
 hovering+0.010
 perched+0.010
Neg logit contributions
omm-0.015
ugg-0.015
bye-0.014
case-0.014
um-0.014
Token going
Token ID1016
Feature activation+1.38e-02

Pos logit contributions
t+0.002
ugg+0.002
case+0.002
 judge+0.002
 punish+0.002
Neg logit contributions
 its-0.002
cks-0.002
 hovering-0.002
 outer-0.002
 seaw-0.002
Token to
Token ID284
Feature activation-4.72e-03

Pos logit contributions
ugg+0.024
t+0.024
case+0.023
 virtue+0.023
omm+0.023
Neg logit contributions
 its-0.020
 his-0.020
 hovering-0.018
cks-0.017
 down-0.017
TokenBut
Token ID1537
Feature activation+5.88e-03

Pos logit contributions
 its+0.027
 hovering+0.025
cks+0.025
 his+0.024
 outer+0.024
Neg logit contributions
t-0.023
ugg-0.022
case-0.021
 virtue-0.020
ave-0.020
Token we
Token ID356
Feature activation-1.19e-02

Pos logit contributions
t+0.008
case+0.008
ugg+0.008
 judge+0.007
 moms+0.007
Neg logit contributions
cks-0.010
 hovering-0.010
 its-0.010
 outer-0.009
rob-0.009
Token will
Token ID481
Feature activation+1.02e-02

Pos logit contributions
 its+0.017
 his+0.016
cks+0.015
 hovering+0.014
 seaw+0.014
Neg logit contributions
ugg-0.021
t-0.020
omm-0.019
bye-0.019
 virtue-0.018
Token be
Token ID307
Feature activation-8.98e-03

Pos logit contributions
ave+0.019
ugg+0.019
t+0.018
omm+0.017
ase+0.017
Neg logit contributions
 its-0.016
 his-0.015
 her-0.012
 their-0.012
cks-0.012
Token in
Token ID287
Feature activation+1.16e-02

Pos logit contributions
 outer+0.014
 hovering+0.013
rob+0.013
 seaw+0.013
 its+0.013
Neg logit contributions
 punish-0.012
t-0.012
case-0.012
ave-0.012
 virtue-0.012
Token trouble
Token ID5876
Feature activation-7.41e-02

Pos logit contributions
t+0.024
case+0.022
ugg+0.022
 virtue+0.022
bye+0.021
Neg logit contributions
 its-0.013
 hovering-0.012
cks-0.012
 his-0.011
 outer-0.010
Token!"
Token ID2474
Feature activation-6.08e-03

Pos logit contributions
 its+0.086
 hovering+0.078
cks+0.078
 his+0.075
 outer+0.068
Neg logit contributions
t-0.153
ugg-0.148
case-0.142
ave-0.138
 virtue-0.138
Token Anna
Token ID11735
Feature activation+6.05e-03

Pos logit contributions
 its+0.009
 hovering+0.008
cks+0.008
 his+0.008
 outer+0.008
Neg logit contributions
t-0.011
ugg-0.010
case-0.010
ave-0.009
 virtue-0.009
Token said
Token ID531
Feature activation-2.30e-03

Pos logit contributions
ugg+0.013
t+0.013
 virtue+0.013
case+0.012
bye+0.012
Neg logit contributions
 its-0.007
 his-0.006
 hovering-0.005
cks-0.005
 perched-0.005
Token.
Token ID13
Feature activation+4.45e-03

Pos logit contributions
 its+0.003
 hovering+0.003
cks+0.003
 outer+0.002
 his+0.002
Neg logit contributions
t-0.005
ugg-0.004
case-0.004
ave-0.004
bye-0.004
Token "
Token ID366
Feature activation-4.31e-03

Pos logit contributions
t+0.008
ugg+0.008
case+0.008
ave+0.007
 virtue+0.007
Neg logit contributions
 its-0.006
 hovering-0.006
cks-0.006
 his-0.005
 outer-0.005
Token,
Token ID11
Feature activation-4.28e-03

Pos logit contributions
 its+0.026
 his+0.023
 hovering+0.021
cks+0.021
 perched+0.020
Neg logit contributions
t-0.046
ugg-0.045
 virtue-0.043
bye-0.042
case-0.042
Token you
Token ID345
Feature activation-8.26e-03

Pos logit contributions
 its+0.006
 his+0.005
 hovering+0.005
cks+0.005
 outer+0.005
Neg logit contributions
t-0.008
ugg-0.008
case-0.008
ave-0.008
bye-0.007
Token are
Token ID389
Feature activation-6.52e-05

Pos logit contributions
 its+0.010
 there+0.009
 his+0.009
 ours+0.008
 creepy+0.007
Neg logit contributions
ugg-0.017
bye-0.016
becca-0.015
omm-0.015
 virtue-0.015
Token in
Token ID287
Feature activation+1.39e-02

Pos logit contributions
 its+0.000
cks+0.000
 hovering+0.000
 his+0.000
 outer+0.000
Neg logit contributions
t-0.000
case-0.000
ugg-0.000
 virtue-0.000
 judge-0.000
Token big
Token ID1263
Feature activation-3.46e-02

Pos logit contributions
t+0.031
ugg+0.030
case+0.029
ave+0.029
bye+0.028
Neg logit contributions
 its-0.014
cks-0.012
 hovering-0.012
 his-0.012
 outer-0.011
Token trouble
Token ID5876
Feature activation-7.40e-02

Pos logit contributions
 its+0.045
 hovering+0.043
cks+0.043
 outer+0.039
 his+0.038
Neg logit contributions
t-0.065
ugg-0.060
case-0.060
 judge-0.058
ave-0.058
Token!"
Token ID2474
Feature activation-3.70e-03

Pos logit contributions
 its+0.084
 hovering+0.077
cks+0.076
 his+0.074
 outer+0.066
Neg logit contributions
t-0.153
ugg-0.149
case-0.141
ave-0.139
 virtue-0.139
Token her
Token ID607
Feature activation-2.09e-03

Pos logit contributions
 hovering+0.004
cks+0.004
 perched+0.004
 atop+0.003
rob+0.003
Neg logit contributions
t-0.008
 tum-0.007
ugg-0.007
case-0.007
 judge-0.007
Token mom
Token ID1995
Feature activation+2.19e-02

Pos logit contributions
 its+0.004
 his+0.004
 pans+0.003
 hovering+0.003
 outer+0.003
Neg logit contributions
t-0.003
ugg-0.003
bye-0.003
ave-0.003
 virtue-0.003
Token said
Token ID531
Feature activation-4.20e-03

Pos logit contributions
t+0.039
ugg+0.037
case+0.035
 virtue+0.034
ave+0.034
Neg logit contributions
 its-0.032
 hovering-0.030
cks-0.030
 his-0.029
 outer-0.027
Token.
Token ID13
Feature activation-3.01e-03

Pos logit contributions
 its+0.005
cks+0.005
 hovering+0.005
 his+0.004
 outer+0.004
Neg logit contributions
t-0.008
ugg-0.008
case-0.008
ave-0.007
 virtue-0.007
Token,
Token ID11
Feature activation+1.42e-02

Pos logit contributions
 its+0.003
cks+0.003
 hovering+0.003
 outer+0.002
 his+0.002
Neg logit contributions
t-0.003
ugg-0.003
case-0.003
bye-0.002
ave-0.002
Token "
Token ID366
Feature activation-1.31e-02

Pos logit contributions
t+0.026
ugg+0.024
case+0.023
 judge+0.022
 virtue+0.022
Neg logit contributions
 its-0.020
cks-0.019
 hovering-0.019
 outer-0.018
 his-0.017
TokenI
Token ID40
Feature activation-1.42e-02

Pos logit contributions
 its+0.017
 hovering+0.015
cks+0.015
 his+0.015
 outer+0.014
Neg logit contributions
t-0.026
ugg-0.024
case-0.024
 virtue-0.023
ave-0.023
Token am
Token ID716
Feature activation-2.20e-02

Pos logit contributions
 its+0.021
 his+0.020
 hovering+0.018
 perched+0.018
 glowing+0.016
Neg logit contributions
ugg-0.023
omm-0.023
case-0.022
t-0.022
becca-0.022
Token in
Token ID287
Feature activation+1.07e-02

Pos logit contributions
cks+0.031
 its+0.030
 hovering+0.030
rob+0.028
 seaw+0.028
Neg logit contributions
t-0.039
ugg-0.035
case-0.035
 judge-0.034
 punish-0.034
Token trouble
Token ID5876
Feature activation-7.37e-02

Pos logit contributions
t+0.024
ugg+0.023
case+0.022
 virtue+0.022
bye+0.021
Neg logit contributions
 its-0.010
 hovering-0.010
cks-0.009
 his-0.009
 outer-0.008
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
 its+0.094
 hovering+0.087
cks+0.087
 his+0.083
 outer+0.078
Neg logit contributions
t-0.143
ugg-0.136
case-0.131
 virtue-0.127
ave-0.127
Token I
Token ID314
Feature activation-1.71e-02

Pos logit contributions
 its+0.020
 his+0.017
 hovering+0.015
 perched+0.014
 outer+0.014
Neg logit contributions
t-0.034
case-0.033
ugg-0.032
 virtue-0.032
ave-0.032
Token lost
Token ID2626
Feature activation-1.60e-02

Pos logit contributions
 its+0.024
 his+0.022
 perched+0.021
 hovering+0.020
cks+0.018
Neg logit contributions
omm-0.030
ugg-0.029
becca-0.028
um-0.028
bye-0.027
Token my
Token ID616
Feature activation-1.62e-03

Pos logit contributions
 hovering+0.017
cks+0.016
rob+0.014
 perched+0.013
 its+0.013
Neg logit contributions
t-0.033
 tum-0.030
bye-0.030
 judge-0.030
 virtue-0.029
Token key
Token ID1994
Feature activation-2.37e-02

Pos logit contributions
 its+0.002
 hovering+0.002
 his+0.002
cks+0.002
 outer+0.002
Neg logit contributions
t-0.003
ugg-0.003
case-0.003
ave-0.003
bye-0.003
Token's
Token ID338
Feature activation+7.43e-03

Pos logit contributions
 its+0.026
 hovering+0.023
cks+0.023
 his+0.023
 perched+0.021
Neg logit contributions
ugg-0.025
t-0.025
ave-0.023
case-0.023
 virtue-0.023
Token an
Token ID281
Feature activation-1.94e-02

Pos logit contributions
t+0.012
ugg+0.011
case+0.011
 virtue+0.010
bye+0.010
Neg logit contributions
 its-0.012
 hovering-0.011
cks-0.011
 his-0.011
 outer-0.010
Token oct
Token ID19318
Feature activation+1.51e-02

Pos logit contributions
 its+0.024
 hovering+0.022
cks+0.021
 his+0.021
 outer+0.019
Neg logit contributions
t-0.040
ugg-0.038
case-0.036
ave-0.035
 virtue-0.035
Tokenopus
Token ID25790
Feature activation+2.46e-03

Pos logit contributions
 virtue+0.024
 Sadly+0.024
ugg+0.024
ave+0.023
 hire+0.023
Neg logit contributions
 hovering-0.019
 seaw-0.019
rob-0.019
 coral-0.018
cks-0.018
Token in
Token ID287
Feature activation+1.07e-02

Pos logit contributions
t+0.005
ugg+0.005
case+0.005
omm+0.005
 virtue+0.005
Neg logit contributions
 its-0.003
 hovering-0.002
cks-0.002
 his-0.002
 perched-0.002
Token trouble
Token ID5876
Feature activation-7.37e-02

Pos logit contributions
t+0.022
case+0.021
 judge+0.021
ugg+0.020
 virtue+0.020
Neg logit contributions
 hovering-0.010
cks-0.010
 its-0.009
 perched-0.008
 pans-0.008
Token!"
Token ID2474
Feature activation-7.21e-03

Pos logit contributions
 its+0.091
 hovering+0.084
cks+0.084
 his+0.078
 outer+0.076
Neg logit contributions
t-0.146
ugg-0.135
case-0.135
bye-0.128
 virtue-0.128
Token Fin
Token ID4463
Feature activation+1.52e-02

Pos logit contributions
 its+0.008
 his+0.007
 hovering+0.007
cks+0.007
 perched+0.006
Neg logit contributions
t-0.015
ugg-0.015
case-0.014
 virtue-0.014
ave-0.014
Token exclaimed
Token ID33552
Feature activation-9.99e-04

Pos logit contributions
ugg+0.030
t+0.029
 virtue+0.029
case+0.028
omm+0.028
Neg logit contributions
 its-0.019
 his-0.018
 hovering-0.016
cks-0.016
 perched-0.016
Token.
Token ID13
Feature activation+2.63e-03

Pos logit contributions
 its+0.001
cks+0.001
 hovering+0.001
 his+0.001
 outer+0.001
Neg logit contributions
t-0.002
ugg-0.002
case-0.002
ave-0.002
 virtue-0.002
Token The
Token ID383
Feature activation+8.52e-03

Pos logit contributions
ugg+0.006
t+0.005
case+0.005
 virtue+0.005
 salon+0.005
Neg logit contributions
 its-0.003
 his-0.003
 hovering-0.002
 perched-0.002
cks-0.002
Token.
Token ID13
Feature activation-4.27e-04

Pos logit contributions
 its+0.008
 hovering+0.007
cks+0.007
 his+0.007
 outer+0.006
Neg logit contributions
t-0.014
ugg-0.014
case-0.013
 virtue-0.013
ave-0.013
Token "
Token ID366
Feature activation-7.66e-03

Pos logit contributions
 its+0.001
 hovering+0.000
cks+0.000
 his+0.000
 outer+0.000
Neg logit contributions
t-0.001
ugg-0.001
case-0.001
 virtue-0.001
ave-0.001
TokenWe
Token ID1135
Feature activation+1.79e-03

Pos logit contributions
 its+0.012
 hovering+0.011
cks+0.011
 his+0.011
 outer+0.010
Neg logit contributions
t-0.013
ugg-0.012
case-0.011
ave-0.011
 virtue-0.011
Token're
Token ID821
Feature activation+7.52e-03

Pos logit contributions
omm+0.004
ugg+0.004
bye+0.003
becca+0.003
oral+0.003
Neg logit contributions
 its-0.002
 his-0.002
cks-0.002
 perched-0.002
 there-0.002
Token in
Token ID287
Feature activation+1.11e-02

Pos logit contributions
t+0.015
ugg+0.014
case+0.014
ave+0.013
bye+0.013
Neg logit contributions
 its-0.010
cks-0.009
 hovering-0.009
 his-0.009
 perched-0.008
Token trouble
Token ID5876
Feature activation-7.34e-02

Pos logit contributions
t+0.024
ugg+0.024
ave+0.023
case+0.022
bye+0.022
Neg logit contributions
 its-0.012
cks-0.010
 his-0.010
 hovering-0.010
 outer-0.009
Token!"
Token ID2474
Feature activation-4.93e-03

Pos logit contributions
 its+0.078
 hovering+0.071
cks+0.070
 his+0.067
 perched+0.060
Neg logit contributions
t-0.158
ugg-0.153
case-0.147
ave-0.144
 virtue-0.143
Token\n
Token ID198
Feature activation-1.96e-03

Pos logit contributions
 hovering+0.008
 its+0.008
cks+0.008
 outer+0.008
 perched+0.007
Neg logit contributions
t-0.008
ugg-0.007
case-0.007
bye-0.006
 tum-0.006
Token\n
Token ID198
Feature activation-8.29e-03

Pos logit contributions
 its+0.003
 hovering+0.003
cks+0.003
 outer+0.003
 his+0.003
Neg logit contributions
t-0.003
ugg-0.003
case-0.003
bye-0.003
 virtue-0.003
TokenDad
Token ID46270
Feature activation+2.39e-02

Pos logit contributions
 its+0.014
 hovering+0.013
cks+0.013
 his+0.013
 outer+0.012
Neg logit contributions
t-0.013
ugg-0.012
case-0.012
 virtue-0.011
bye-0.011
Token heard
Token ID2982
Feature activation-2.04e-03

Pos logit contributions
 virtue+0.049
bye+0.048
ugg+0.047
t+0.045
 deed+0.043
Neg logit contributions
 its-0.028
 his-0.028
 hovering-0.027
 perched-0.025
cks-0.025
Token ready
Token ID3492
Feature activation+2.00e-02

Pos logit contributions
 its+0.014
 his+0.013
 hovering+0.013
cks+0.013
 outer+0.011
Neg logit contributions
ugg-0.023
t-0.023
 virtue-0.022
ave-0.021
case-0.021
Token and
Token ID290
Feature activation+1.34e-02

Pos logit contributions
t+0.038
ugg+0.037
case+0.036
ave+0.035
 virtue+0.035
Neg logit contributions
 its-0.026
 his-0.024
 hovering-0.023
cks-0.023
 outer-0.021
Token she
Token ID673
Feature activation+8.44e-03

Pos logit contributions
t+0.023
ugg+0.022
case+0.021
 virtue+0.021
ave+0.020
Neg logit contributions
 its-0.020
 his-0.019
 hovering-0.018
cks-0.018
 outer-0.017
Token was
Token ID373
Feature activation+2.39e-02

Pos logit contributions
t+0.014
ugg+0.013
bye+0.013
omm+0.012
 virtue+0.012
Neg logit contributions
 its-0.013
 his-0.013
 hovering-0.012
 perched-0.012
cks-0.011
Token looking
Token ID2045
Feature activation+3.21e-02

Pos logit contributions
t+0.048
ugg+0.047
bye+0.044
case+0.044
ave+0.043
Neg logit contributions
 its-0.030
 hovering-0.028
 his-0.027
cks-0.026
 perched-0.025
Token forward
Token ID2651
Feature activation+7.30e-02

Pos logit contributions
t+0.061
ugg+0.057
case+0.055
ave+0.054
 virtue+0.054
Neg logit contributions
 its-0.042
cks-0.041
 hovering-0.040
 his-0.036
 outer-0.036
Token to
Token ID284
Feature activation+1.30e-02

Pos logit contributions
t+0.102
ugg+0.094
case+0.089
ave+0.086
bye+0.086
Neg logit contributions
 its-0.132
cks-0.128
 hovering-0.127
 his-0.120
 outer-0.117
Token the
Token ID262
Feature activation+2.43e-02

Pos logit contributions
t+0.025
 judge+0.024
case+0.024
ugg+0.024
 punish+0.023
Neg logit contributions
cks-0.014
 hovering-0.014
 its-0.014
 perched-0.012
 pans-0.012
Token adventure
Token ID8855
Feature activation+1.33e-02

Pos logit contributions
ugg+0.044
t+0.043
ave+0.042
aring+0.039
case+0.038
Neg logit contributions
 his-0.033
 outer-0.032
 its-0.032
 hovering-0.031
 pans-0.031
Token.
Token ID13
Feature activation+5.64e-03

Pos logit contributions
t+0.026
ugg+0.024
case+0.023
 virtue+0.023
ave+0.023
Neg logit contributions
 its-0.017
 hovering-0.016
cks-0.015
 his-0.015
 outer-0.014
Token She
Token ID1375
Feature activation+9.69e-03

Pos logit contributions
t+0.011
ugg+0.010
 virtue+0.010
ave+0.010
bye+0.010
Neg logit contributions
 its-0.008
 his-0.007
 hovering-0.006
 perched-0.006
 outer-0.006
Token't
Token ID470
Feature activation-1.96e-02

Pos logit contributions
 its+0.004
 hovering+0.003
 his+0.003
cks+0.003
 outer+0.003
Neg logit contributions
ugg-0.002
t-0.002
ave-0.001
 virtue-0.001
bye-0.001
Token bud
Token ID3600
Feature activation+9.02e-03

Pos logit contributions
cks+0.037
 its+0.035
 hovering+0.035
rob+0.034
 coral+0.033
Neg logit contributions
t-0.029
case-0.025
 punish-0.023
 judge-0.022
 virtue-0.022
Tokenge
Token ID469
Feature activation-2.71e-02

Pos logit contributions
t+0.018
case+0.015
ugg+0.015
ave+0.015
aring+0.014
Neg logit contributions
 its-0.012
cks-0.011
 hovering-0.011
 coral-0.011
 seaw-0.011
Token.
Token ID13
Feature activation-5.44e-03

Pos logit contributions
 its+0.034
cks+0.033
 hovering+0.033
 outer+0.030
 his+0.029
Neg logit contributions
t-0.053
case-0.048
ugg-0.047
 virtue-0.046
ave-0.046
Token Fed
Token ID10169
Feature activation+2.97e-02

Pos logit contributions
 its+0.007
 his+0.007
 hovering+0.006
 perched+0.006
cks+0.006
Neg logit contributions
t-0.010
ugg-0.010
bye-0.010
ave-0.010
 virtue-0.010
Tokenor
Token ID273
Feature activation+7.10e-02

Pos logit contributions
t+0.027
ugg+0.024
case+0.023
 virtue+0.021
bye+0.021
Neg logit contributions
 its-0.068
 hovering-0.065
cks-0.065
 his-0.064
 outer-0.062
Token sighed
Token ID21893
Feature activation-1.85e-02

Pos logit contributions
t+0.112
ugg+0.112
case+0.107
 virtue+0.105
bye+0.104
Neg logit contributions
 its-0.114
 his-0.106
 hovering-0.102
cks-0.101
 perched-0.099
Token.
Token ID13
Feature activation-5.17e-03

Pos logit contributions
 its+0.026
cks+0.025
 hovering+0.025
 his+0.023
 outer+0.023
Neg logit contributions
t-0.032
ugg-0.030
case-0.029
 virtue-0.028
ave-0.028
Token
Token ID198
Feature activation-1.55e-03

Pos logit contributions
 its+0.007
 his+0.006
 hovering+0.006
cks+0.006
 perched+0.005
Neg logit contributions
t-0.010
ugg-0.010
case-0.009
bye-0.009
ave-0.009
Token
Token ID198
Feature activation-7.72e-03

Pos logit contributions
 its+0.002
 his+0.002
 hovering+0.001
cks+0.001
 perched+0.001
Neg logit contributions
ugg-0.003
t-0.003
bye-0.003
 virtue-0.003
ave-0.003
TokenJust
Token ID5703
Feature activation+1.71e-03

Pos logit contributions
 its+0.010
 his+0.009
 hovering+0.009
 perched+0.008
cks+0.008
Neg logit contributions
ugg-0.014
t-0.014
 tum-0.014
case-0.013
bye-0.013
Token for
Token ID329
Feature activation+5.85e-03

Pos logit contributions
cks+0.014
 pans+0.013
 hovering+0.012
 its+0.012
 afloat+0.012
Neg logit contributions
t-0.012
case-0.011
ugg-0.010
 virtue-0.009
 tum-0.009
Token a
Token ID257
Feature activation+1.54e-02

Pos logit contributions
t+0.009
case+0.009
bye+0.008
 virtue+0.008
 punish+0.008
Neg logit contributions
cks-0.010
 hovering-0.008
cking-0.008
 pans-0.008
 its-0.008
Token while
Token ID981
Feature activation+1.17e-02

Pos logit contributions
t+0.029
ugg+0.029
case+0.027
ave+0.027
bye+0.026
Neg logit contributions
 its-0.020
 hovering-0.018
cks-0.018
 his-0.018
 outer-0.017
Tokenâ
Token ID22940
Feature activation+2.85e-02

Pos logit contributions
t+0.022
ugg+0.021
case+0.020
 virtue+0.020
ave+0.020
Neg logit contributions
 its-0.016
 hovering-0.014
cks-0.014
 his-0.014
 outer-0.013
Token€
Token ID26391
Feature activation+4.05e-02

Pos logit contributions
t+0.045
ugg+0.041
case+0.040
 virtue+0.039
 tum+0.038
Neg logit contributions
 its-0.046
cks-0.043
 hovering-0.043
 his-0.041
 outer-0.039
Token�
Token ID126
Feature activation+6.75e-02

Pos logit contributions
t+0.059
ugg+0.055
case+0.052
 virtue+0.050
ave+0.050
Neg logit contributions
 its-0.072
 hovering-0.068
cks-0.067
 his-0.065
 outer-0.063
Token�
Token ID99
Feature activation+4.35e-02

Pos logit contributions
t+0.107
ugg+0.101
ave+0.094
 moms+0.093
 judge+0.091
Neg logit contributions
 its-0.111
cks-0.107
 hovering-0.105
 his-0.098
 outer-0.093
Token and
Token ID290
Feature activation+1.43e-02

Pos logit contributions
t+0.071
ugg+0.063
ave+0.062
spe+0.059
bye+0.059
Neg logit contributions
 its-0.069
 hovering-0.068
cks-0.066
 his-0.063
 outer-0.060
Token suddenly
Token ID6451
Feature activation+1.93e-02

Pos logit contributions
t+0.024
ugg+0.023
case+0.022
ave+0.021
 virtue+0.021
Neg logit contributions
 its-0.022
cks-0.020
 hovering-0.020
 his-0.019
 outer-0.019
Token,
Token ID11
Feature activation+1.86e-02

Pos logit contributions
t+0.035
ugg+0.033
case+0.032
ave+0.031
 virtue+0.031
Neg logit contributions
 its-0.027
cks-0.025
 hovering-0.025
 his-0.024
 outer-0.023
Token she
Token ID673
Feature activation-1.80e-02

Pos logit contributions
t+0.044
ugg+0.042
case+0.041
 virtue+0.040
ave+0.040
Neg logit contributions
 its-0.017
 hovering-0.015
cks-0.014
 his-0.014
 outer-0.013
Token there
Token ID612
Feature activation-1.98e-03

Pos logit contributions
 its+0.016
cks+0.016
 hovering+0.016
 outer+0.014
 his+0.014
Neg logit contributions
t-0.015
ugg-0.014
case-0.013
bye-0.012
 judge-0.012
Token was
Token ID373
Feature activation-1.01e-03

Pos logit contributions
 its+0.004
 hovering+0.004
cks+0.004
 his+0.004
 outer+0.004
Neg logit contributions
t-0.002
ugg-0.002
case-0.002
bye-0.002
 virtue-0.002
Token a
Token ID257
Feature activation+5.07e-03

Pos logit contributions
cks+0.001
 hovering+0.001
 its+0.001
 pans+0.001
 outer+0.001
Neg logit contributions
t-0.002
ugg-0.002
case-0.002
bye-0.002
 tum-0.002
Token mighty
Token ID18680
Feature activation-5.51e-04

Pos logit contributions
t+0.008
ugg+0.007
ave+0.007
case+0.007
 virtue+0.006
Neg logit contributions
 its-0.009
 hovering-0.008
 his-0.008
cks-0.008
 outer-0.008
Token re
Token ID302
Feature activation+4.04e-02

Pos logit contributions
 its+0.001
 his+0.001
 carrot+0.001
 hovering+0.001
 coral+0.001
Neg logit contributions
t-0.001
ave-0.001
ugg-0.001
 virtue-0.001
case-0.001
Tokenind
Token ID521
Feature activation+8.57e-02

Pos logit contributions
t+0.081
ave+0.076
case+0.074
ugg+0.074
 virtue+0.074
Neg logit contributions
 its-0.048
cks-0.047
 hovering-0.044
 his-0.042
 perched-0.041
Tokeneer
Token ID28153
Feature activation+7.10e-03

Pos logit contributions
t+0.169
ugg+0.159
case+0.156
 virtue+0.151
ave+0.150
Neg logit contributions
 its-0.108
cks-0.098
 hovering-0.098
 his-0.095
 outer-0.089
Token.
Token ID13
Feature activation-5.54e-05

Pos logit contributions
t+0.013
 virtue+0.013
omm+0.013
ugg+0.013
case+0.013
Neg logit contributions
 its-0.010
 his-0.009
 hovering-0.009
cks-0.008
 perched-0.008
Token He
Token ID679
Feature activation+2.05e-03

Pos logit contributions
 its+0.000
cks+0.000
 hovering+0.000
 his+0.000
 outer+0.000
Neg logit contributions
t-0.000
ugg-0.000
case-0.000
bye-0.000
 virtue-0.000
Token lived
Token ID5615
Feature activation-1.38e-02

Pos logit contributions
t+0.003
ugg+0.003
omm+0.003
 virtue+0.003
case+0.003
Neg logit contributions
 its-0.003
 his-0.003
cks-0.003
 hovering-0.003
 perched-0.003
Token in
Token ID287
Feature activation-7.04e-04

Pos logit contributions
 its+0.019
cks+0.019
 hovering+0.019
 outer+0.018
 seaw+0.017
Neg logit contributions
t-0.024
ugg-0.023
case-0.022
ave-0.021
bye-0.021
Token he
Token ID339
Feature activation-1.18e-02

Pos logit contributions
 its+0.013
 his+0.012
 hovering+0.012
cks+0.011
 outer+0.010
Neg logit contributions
t-0.016
ugg-0.015
case-0.014
 virtue-0.014
ave-0.014
Token was
Token ID373
Feature activation+3.20e-03

Pos logit contributions
 its+0.016
 his+0.016
 perched+0.015
 hovering+0.014
cks+0.014
Neg logit contributions
ugg-0.021
t-0.020
 virtue-0.019
bye-0.019
case-0.018
Token as
Token ID355
Feature activation+3.87e-03

Pos logit contributions
t+0.005
 judge+0.005
case+0.005
 virtue+0.005
 tum+0.005
Neg logit contributions
 hovering-0.004
 its-0.004
cks-0.004
 outer-0.004
rob-0.004
Token heavy
Token ID4334
Feature activation+2.62e-02

Pos logit contributions
t+0.007
case+0.006
 virtue+0.006
ugg+0.006
ave+0.006
Neg logit contributions
cks-0.006
 hovering-0.006
 its-0.005
 outer-0.005
rob-0.005
Token as
Token ID355
Feature activation+6.52e-03

Pos logit contributions
t+0.041
ugg+0.037
 judge+0.035
bye+0.035
 virtue+0.034
Neg logit contributions
cks-0.040
 its-0.040
 hovering-0.040
 outer-0.037
 coral-0.035
Token three
Token ID1115
Feature activation+6.67e-02

Pos logit contributions
t+0.015
ugg+0.014
case+0.014
bye+0.013
 virtue+0.013
Neg logit contributions
cks-0.006
 its-0.006
 hovering-0.006
 outer-0.005
 perched-0.004
Token of
Token ID286
Feature activation+2.45e-02

Pos logit contributions
t+0.126
case+0.113
ugg+0.113
ave+0.110
 judge+0.110
Neg logit contributions
 its-0.085
 hovering-0.082
cks-0.079
 outer-0.076
 perched-0.071
Token his
Token ID465
Feature activation+1.79e-02

Pos logit contributions
t+0.058
ugg+0.054
case+0.054
 virtue+0.053
 judge+0.052
Neg logit contributions
 its-0.019
 hovering-0.019
cks-0.018
 outer-0.015
 perched-0.015
Token fake
Token ID8390
Feature activation-2.53e-03

Pos logit contributions
t+0.034
ugg+0.033
case+0.031
 virtue+0.030
ave+0.030
Neg logit contributions
 its-0.024
 his-0.022
cks-0.022
 hovering-0.022
 outer-0.020
Token bones
Token ID11945
Feature activation+8.02e-03

Pos logit contributions
 its+0.004
cks+0.003
 hovering+0.003
 outer+0.003
 perched+0.003
Neg logit contributions
t-0.004
 judge-0.004
case-0.004
 tum-0.004
ugg-0.004
Token.
Token ID13
Feature activation-2.27e-03

Pos logit contributions
t+0.016
ugg+0.015
case+0.015
 virtue+0.015
ave+0.014
Neg logit contributions
 its-0.010
 hovering-0.009
cks-0.009
 his-0.008
 outer-0.008
Token all
Token ID477
Feature activation+1.59e-02

Pos logit contributions
 its+0.010
 his+0.009
 hovering+0.008
cks+0.008
 perched+0.008
Neg logit contributions
t-0.012
ugg-0.011
 virtue-0.011
omm-0.011
case-0.010
Token loved
Token ID6151
Feature activation+1.77e-02

Pos logit contributions
t+0.022
ugg+0.020
case+0.019
 virtue+0.019
omm+0.019
Neg logit contributions
 its-0.030
 his-0.028
cks-0.028
 hovering-0.027
 perched-0.026
Token it
Token ID340
Feature activation+2.37e-02

Pos logit contributions
t+0.039
ugg+0.037
case+0.036
 virtue+0.035
bye+0.035
Neg logit contributions
 its-0.017
cks-0.017
 hovering-0.016
 his-0.015
 outer-0.014
Token.
Token ID13
Feature activation-3.20e-03

Pos logit contributions
t+0.046
ugg+0.042
case+0.042
 virtue+0.040
bye+0.040
Neg logit contributions
 its-0.030
cks-0.030
 hovering-0.029
 outer-0.026
 his-0.026
Token Lucy
Token ID22162
Feature activation-7.75e-04

Pos logit contributions
 its+0.005
 his+0.004
 hovering+0.004
 her+0.003
 down+0.003
Neg logit contributions
t-0.006
 judge-0.006
 virtue-0.006
ave-0.005
ugg-0.005
Token's
Token ID338
Feature activation+2.92e-02

Pos logit contributions
 its+0.001
 his+0.001
 perched+0.001
 hovering+0.001
 seaw+0.001
Neg logit contributions
t-0.001
 virtue-0.001
ugg-0.001
case-0.001
bye-0.001
Token mom
Token ID1995
Feature activation+1.69e-02

Pos logit contributions
t+0.059
ugg+0.054
ave+0.054
aring+0.050
bye+0.050
Neg logit contributions
 his-0.042
 its-0.039
 carrot-0.038
 glowing-0.036
 furry-0.036
Token was
Token ID373
Feature activation+3.06e-03

Pos logit contributions
t+0.034
 virtue+0.032
ugg+0.031
case+0.031
bye+0.031
Neg logit contributions
 his-0.022
 its-0.022
 perched-0.019
 her-0.018
 hovering-0.018
Token not
Token ID407
Feature activation-1.10e-02

Pos logit contributions
t+0.006
ugg+0.006
case+0.006
bye+0.006
ave+0.006
Neg logit contributions
 its-0.004
 his-0.003
 hovering-0.003
cks-0.003
 perched-0.003
Token troubled
Token ID17840
Feature activation+1.02e-02

Pos logit contributions
cks+0.017
 its+0.016
 hovering+0.016
 outer+0.016
 seaw+0.015
Neg logit contributions
t-0.018
 judge-0.016
ugg-0.016
ave-0.015
 punish-0.015
Token anymore
Token ID7471
Feature activation-2.22e-02

Pos logit contributions
ugg+0.019
case+0.017
t+0.017
bye+0.016
 virtue+0.016
Neg logit contributions
 its-0.015
 his-0.015
 down-0.013
 hovering-0.013
 her-0.012
Token wasn
Token ID2492
Feature activation-4.89e-03

Pos logit contributions
ugg+0.014
bye+0.013
t+0.012
 virtue+0.012
case+0.012
Neg logit contributions
 its-0.013
 his-0.013
cks-0.012
 hovering-0.012
 perched-0.011
Token't
Token ID470
Feature activation-7.36e-03

Pos logit contributions
 its+0.011
 his+0.010
 hovering+0.010
cks+0.010
 outer+0.010
Neg logit contributions
ugg-0.005
t-0.004
ave-0.004
bye-0.004
 virtue-0.004
Token so
Token ID523
Feature activation+3.93e-03

Pos logit contributions
cks+0.012
 hovering+0.012
rob+0.011
 outer+0.011
 valleys+0.011
Neg logit contributions
t-0.011
 judge-0.010
 quit-0.010
 always-0.010
 punish-0.010
Token scary
Token ID14343
Feature activation+1.23e-02

Pos logit contributions
t+0.007
ugg+0.006
case+0.006
ave+0.006
 virtue+0.006
Neg logit contributions
 its-0.006
 hovering-0.006
cks-0.006
 his-0.005
 outer-0.005
Token after
Token ID706
Feature activation+7.46e-03

Pos logit contributions
t+0.022
ugg+0.021
case+0.020
ave+0.019
 virtue+0.019
Neg logit contributions
 its-0.018
 hovering-0.016
cks-0.016
 his-0.015
 outer-0.015
Token all
Token ID477
Feature activation+2.56e-02

Pos logit contributions
t+0.013
ugg+0.012
case+0.011
ave+0.011
 virtue+0.011
Neg logit contributions
 its-0.012
cks-0.011
 his-0.011
 hovering-0.011
 outer-0.010
Token.
Token ID13
Feature activation-1.42e-03

Pos logit contributions
t+0.043
case+0.039
 judge+0.038
bye+0.036
 punish+0.036
Neg logit contributions
 hovering-0.037
cks-0.035
 its-0.035
 outer-0.035
 perched-0.033
Token In
Token ID554
Feature activation+1.01e-02

Pos logit contributions
 its+0.001
 his+0.001
 down+0.001
 hovering+0.001
cks+0.001
Neg logit contributions
t-0.003
ugg-0.003
 virtue-0.003
ave-0.003
bye-0.003
Token fact
Token ID1109
Feature activation-3.46e-02

Pos logit contributions
t+0.025
ugg+0.024
case+0.024
 virtue+0.023
bye+0.023
Neg logit contributions
 its-0.007
 hovering-0.006
cks-0.006
 his-0.005
 outer-0.005
Token,
Token ID11
Feature activation-1.11e-03

Pos logit contributions
 its+0.055
cks+0.054
 hovering+0.053
 outer+0.050
 his+0.049
Neg logit contributions
t-0.057
ugg-0.052
case-0.050
ave-0.048
 judge-0.047
Token it
Token ID340
Feature activation+2.05e-03

Pos logit contributions
cks+0.001
 hovering+0.001
 its+0.001
 outer+0.001
rob+0.001
Neg logit contributions
t-0.002
case-0.002
 tum-0.002
ugg-0.002
 judge-0.002
Token this
Token ID428
Feature activation-7.26e-03

Pos logit contributions
 its+0.011
 his+0.010
 hovering+0.010
cks+0.010
 outer+0.009
Neg logit contributions
t-0.016
ugg-0.016
case-0.015
ave-0.015
 virtue-0.015
Token motorcycle
Token ID18757
Feature activation+6.56e-03

Pos logit contributions
 its+0.012
cks+0.012
 hovering+0.012
 outer+0.011
 perched+0.011
Neg logit contributions
t-0.011
case-0.010
ugg-0.010
bye-0.010
 virtue-0.010
Token is
Token ID318
Feature activation-1.62e-02

Pos logit contributions
ugg+0.014
t+0.013
omm+0.012
 virtue+0.012
ave+0.012
Neg logit contributions
 its-0.008
 his-0.008
cks-0.007
 hovering-0.006
 down-0.006
Token not
Token ID407
Feature activation-2.39e-02

Pos logit contributions
 its+0.026
cks+0.025
 hovering+0.025
 his+0.023
 outer+0.023
Neg logit contributions
t-0.026
ugg-0.024
case-0.024
 virtue-0.023
bye-0.022
Token for
Token ID329
Feature activation+6.92e-03

Pos logit contributions
 hovering+0.037
cks+0.036
rob+0.033
 coral+0.033
 seaw+0.033
Neg logit contributions
t-0.037
 virtue-0.034
case-0.034
 judge-0.033
bye-0.033
Token you
Token ID345
Feature activation+1.76e-02

Pos logit contributions
t+0.014
ugg+0.014
case+0.013
ave+0.013
 virtue+0.013
Neg logit contributions
 its-0.009
cks-0.008
 his-0.008
 hovering-0.008
 outer-0.007
Token.
Token ID13
Feature activation-2.19e-03

Pos logit contributions
ugg+0.041
t+0.039
omm+0.038
case+0.038
 virtue+0.038
Neg logit contributions
 its-0.017
 his-0.015
cks-0.013
 hovering-0.012
 outer-0.011
Token It
Token ID632
Feature activation+1.51e-02

Pos logit contributions
 its+0.003
 his+0.002
cks+0.002
 hovering+0.002
 outer+0.002
Neg logit contributions
t-0.005
ugg-0.004
ave-0.004
case-0.004
 virtue-0.004
Token is
Token ID318
Feature activation-7.81e-03

Pos logit contributions
ugg+0.024
t+0.024
 virtue+0.022
case+0.022
bye+0.022
Neg logit contributions
 its-0.024
cks-0.022
 his-0.022
 hovering-0.022
 perched-0.020
Token big
Token ID1263
Feature activation+1.84e-03

Pos logit contributions
cks+0.012
 hovering+0.012
 its+0.012
 outer+0.011
 seaw+0.011
Neg logit contributions
t-0.012
case-0.011
 virtue-0.011
ugg-0.010
 judge-0.010
Token and
Token ID290
Feature activation-8.40e-03

Pos logit contributions
t+0.003
ugg+0.003
case+0.003
 virtue+0.003
bye+0.003
Neg logit contributions
 its-0.003
cks-0.003
 hovering-0.003
 his-0.002
 outer-0.002
Token
Token ID198
Feature activation+4.53e-03

Pos logit contributions
t+0.013
ugg+0.012
case+0.011
ave+0.011
bye+0.011
Neg logit contributions
 its-0.012
 hovering-0.012
cks-0.012
 outer-0.011
 perched-0.011
Token
Token ID198
Feature activation-1.34e-03

Pos logit contributions
t+0.008
ugg+0.008
case+0.007
 virtue+0.007
ave+0.007
Neg logit contributions
 its-0.007
cks-0.006
 hovering-0.006
 his-0.006
 outer-0.006
TokenAnna
Token ID31160
Feature activation+2.45e-03

Pos logit contributions
 its+0.002
 hovering+0.002
cks+0.002
 his+0.002
 outer+0.002
Neg logit contributions
t-0.002
ugg-0.002
case-0.002
 virtue-0.002
ave-0.002
Token and
Token ID290
Feature activation+1.56e-02

Pos logit contributions
t+0.005
ugg+0.005
 virtue+0.005
bye+0.005
case+0.005
Neg logit contributions
 its-0.003
 hovering-0.002
 his-0.002
cks-0.002
 perched-0.002
Token Ben
Token ID3932
Feature activation+1.39e-02

Pos logit contributions
t+0.024
 judge+0.023
ugg+0.023
 moms+0.022
bye+0.021
Neg logit contributions
 hovering-0.024
cks-0.024
 its-0.024
 perched-0.022
 outer-0.022
Token show
Token ID905
Feature activation+2.56e-02

Pos logit contributions
t+0.027
ugg+0.026
 virtue+0.024
omm+0.024
bye+0.023
Neg logit contributions
 its-0.018
 his-0.017
cks-0.016
 hovering-0.016
 seaw-0.015
Token him
Token ID683
Feature activation+1.44e-02

Pos logit contributions
t+0.062
ugg+0.058
 virtue+0.057
case+0.056
ave+0.054
Neg logit contributions
 its-0.029
 his-0.027
 their-0.021
 her-0.020
cks-0.020
Token their
Token ID511
Feature activation+1.78e-02

Pos logit contributions
t+0.037
ugg+0.036
case+0.034
 virtue+0.034
bye+0.034
Neg logit contributions
 its-0.011
 his-0.009
cks-0.009
 hovering-0.008
 outer-0.006
Token toys
Token ID14958
Feature activation+3.93e-03

Pos logit contributions
t+0.034
ugg+0.032
ave+0.031
case+0.030
aring+0.029
Neg logit contributions
 his-0.026
 its-0.025
 pans-0.023
 seaw-0.022
 wooden-0.021
Token and
Token ID290
Feature activation+6.84e-03

Pos logit contributions
t+0.007
ugg+0.007
case+0.007
ave+0.007
 virtue+0.006
Neg logit contributions
 its-0.005
 hovering-0.005
cks-0.005
 his-0.005
 outer-0.004
Token books
Token ID3835
Feature activation+6.07e-03

Pos logit contributions
t+0.013
ugg+0.012
case+0.011
 virtue+0.011
ave+0.011
Neg logit contributions
 its-0.010
 his-0.009
cks-0.009
 hovering-0.009
 outer-0.008
Token O
Token ID440
Feature activation+1.13e-02

Pos logit contributions
 hovering+0.047
cks+0.043
 its+0.041
 pans+0.040
 outer+0.038
Neg logit contributions
ugg-0.058
t-0.056
 tum-0.054
bye-0.053
 judge-0.053
Tokenll
Token ID297
Feature activation-2.07e-02

Pos logit contributions
t+0.015
ugg+0.014
case+0.013
bye+0.013
ave+0.013
Neg logit contributions
 its-0.022
 hovering-0.020
cks-0.020
 his-0.020
 outer-0.019
Tokenie
Token ID494
Feature activation-1.32e-02

Pos logit contributions
 hovering+0.037
 its+0.036
cks+0.035
 perched+0.033
 outer+0.033
Neg logit contributions
t-0.028
ugg-0.025
bye-0.024
 judge-0.023
 punish-0.023
Token and
Token ID290
Feature activation-8.77e-04

Pos logit contributions
 its+0.017
 his+0.015
 hovering+0.015
cks+0.014
 perched+0.013
Neg logit contributions
t-0.027
ugg-0.025
case-0.024
 virtue-0.024
ave-0.024
Token went
Token ID1816
Feature activation+2.48e-02

Pos logit contributions
 its+0.002
 his+0.001
 hovering+0.001
cks+0.001
 perched+0.001
Neg logit contributions
t-0.001
case-0.001
ugg-0.001
 virtue-0.001
omm-0.001
Token back
Token ID736
Feature activation+3.13e-02

Pos logit contributions
t+0.041
ugg+0.039
case+0.037
bye+0.036
ave+0.036
Neg logit contributions
 its-0.038
 hovering-0.036
cks-0.036
 his-0.034
 outer-0.033
Token to
Token ID284
Feature activation-4.03e-03

Pos logit contributions
ugg+0.039
t+0.036
case+0.032
bye+0.032
 judge+0.031
Neg logit contributions
 hovering-0.060
cks-0.060
 its-0.056
 seaw-0.053
 pans-0.053
Token their
Token ID511
Feature activation-1.63e-03

Pos logit contributions
 hovering+0.006
cks+0.005
 pans+0.005
hole+0.005
rob+0.005
Neg logit contributions
ugg-0.007
 tum-0.007
 judge-0.006
 moms-0.006
 punish-0.006
Token home
Token ID1363
Feature activation+1.82e-02

Pos logit contributions
 its+0.002
 his+0.002
 coral+0.002
 outer+0.002
 seaw+0.002
Neg logit contributions
t-0.003
ave-0.003
ugg-0.003
case-0.003
omm-0.003
Token.
Token ID13
Feature activation-2.71e-03

Pos logit contributions
t+0.037
ugg+0.035
case+0.034
 virtue+0.033
ave+0.033
Neg logit contributions
 its-0.021
 hovering-0.020
cks-0.020
 his-0.019
 outer-0.017
Token They
Token ID1119
Feature activation+9.78e-04

Pos logit contributions
 its+0.004
 his+0.003
cks+0.003
 hovering+0.003
 outer+0.003
Neg logit contributions
t-0.005
case-0.005
ave-0.005
 virtue-0.005
ugg-0.005
Token,
Token ID11
Feature activation-1.35e-02

Pos logit contributions
 its+0.012
 hovering+0.011
 his+0.011
cks+0.010
 perched+0.009
Neg logit contributions
t-0.017
ugg-0.016
 virtue-0.015
case-0.015
ave-0.015
Token and
Token ID290
Feature activation-1.35e-02

Pos logit contributions
 its+0.020
cks+0.019
 hovering+0.019
 outer+0.018
 his+0.017
Neg logit contributions
t-0.023
ugg-0.021
case-0.021
ave-0.020
 virtue-0.020
Token then
Token ID788
Feature activation+1.26e-02

Pos logit contributions
 its+0.020
 hovering+0.018
 his+0.018
cks+0.018
 outer+0.017
Neg logit contributions
t-0.024
ugg-0.023
case-0.022
 virtue-0.021
bye-0.021
Token slowly
Token ID6364
Feature activation+9.28e-03

Pos logit contributions
t+0.024
ugg+0.023
case+0.022
 virtue+0.022
bye+0.022
Neg logit contributions
 its-0.017
 his-0.015
 hovering-0.015
cks-0.015
 perched-0.014
Token dropped
Token ID5710
Feature activation-2.37e-02

Pos logit contributions
t+0.016
ugg+0.015
case+0.014
 virtue+0.014
bye+0.014
Neg logit contributions
 its-0.014
 hovering-0.014
cks-0.013
 his-0.013
 outer-0.012
Token it
Token ID340
Feature activation-2.63e-04

Pos logit contributions
cks+0.024
 hovering+0.023
 outer+0.021
 its+0.021
rob+0.020
Neg logit contributions
t-0.049
ugg-0.045
case-0.044
bye-0.044
 virtue-0.043
Token.
Token ID13
Feature activation-6.38e-03

Pos logit contributions
 its+0.000
 hovering+0.000
cks+0.000
 his+0.000
 outer+0.000
Neg logit contributions
t-0.001
ugg-0.001
case-0.000
 virtue-0.000
ave-0.000
Token
Token ID220
Feature activation+1.08e-02

Pos logit contributions
 its+0.009
 his+0.008
 hovering+0.007
cks+0.007
 outer+0.006
Neg logit contributions
t-0.012
ugg-0.012
 virtue-0.011
bye-0.011
 judge-0.011
Token
Token ID198
Feature activation+8.11e-05

Pos logit contributions
t+0.019
ugg+0.018
case+0.017
bye+0.017
 virtue+0.017
Neg logit contributions
 its-0.016
 hovering-0.014
cks-0.014
 his-0.014
 outer-0.013
Token
Token ID198
Feature activation-6.21e-03

Pos logit contributions
t+0.000
ugg+0.000
 virtue+0.000
case+0.000
bye+0.000
Neg logit contributions
 its-0.000
 his-0.000
cks-0.000
 hovering-0.000
 outer-0.000
TokenThe
Token ID464
Feature activation+1.84e-02

Pos logit contributions
 its+0.008
 his+0.007
 hovering+0.007
cks+0.006
 outer+0.006
Neg logit contributions
ugg-0.012
t-0.012
bye-0.011
 tum-0.011
 virtue-0.011
Token homework
Token ID26131
Feature activation+4.61e-03

Pos logit contributions
t+0.029
ugg+0.028
ave+0.027
case+0.027
bye+0.026
Neg logit contributions
 its-0.021
 his-0.020
 hovering-0.017
 her-0.016
cks-0.016
Token.
Token ID13
Feature activation-2.33e-03

Pos logit contributions
ugg+0.010
t+0.010
omm+0.009
bye+0.009
 virtue+0.009
Neg logit contributions
 its-0.004
 his-0.004
 hovering-0.003
cks-0.003
 down-0.003
Token Tim
Token ID5045
Feature activation-1.00e-02

Pos logit contributions
 its+0.003
 his+0.003
 hovering+0.002
cks+0.002
 down+0.002
Neg logit contributions
t-0.005
ugg-0.005
 virtue-0.005
ave-0.005
case-0.005
Tokenmy
Token ID1820
Feature activation+9.95e-03

Pos logit contributions
 its+0.012
 his+0.012
cks+0.010
 hovering+0.010
 perched+0.009
Neg logit contributions
t-0.021
ugg-0.020
case-0.019
 virtue-0.019
bye-0.018
Token didn
Token ID1422
Feature activation-1.75e-03

Pos logit contributions
ugg+0.020
t+0.020
 virtue+0.019
bye+0.019
case+0.018
Neg logit contributions
 his-0.013
 its-0.012
 her-0.010
 perched-0.009
 their-0.009
Token't
Token ID470
Feature activation-1.20e-02

Pos logit contributions
 its+0.003
 his+0.003
 hovering+0.003
cks+0.003
 outer+0.003
Neg logit contributions
ugg-0.002
ave-0.002
bye-0.002
 virtue-0.002
 tum-0.002
Token want
Token ID765
Feature activation+1.12e-02

Pos logit contributions
 its+0.015
 his+0.015
 down+0.012
 her+0.012
cks+0.012
Neg logit contributions
ugg-0.023
case-0.023
omm-0.022
t-0.022
 virtue-0.021
Token to
Token ID284
Feature activation-1.78e-03

Pos logit contributions
ugg+0.024
case+0.024
t+0.023
spe+0.022
bye+0.022
Neg logit contributions
 his-0.016
 its-0.015
 her-0.011
 him-0.011
 their-0.011
Token tell
Token ID1560
Feature activation+7.41e-03

Pos logit contributions
 his+0.002
 its+0.002
 her+0.002
 their+0.002
 your+0.002
Neg logit contributions
case-0.003
ave-0.003
ugg-0.003
ari-0.003
spe-0.003
Token the
Token ID262
Feature activation+1.69e-02

Pos logit contributions
ugg+0.022
t+0.022
case+0.021
ave+0.021
 virtue+0.021
Neg logit contributions
 its-0.003
 his-0.003
cks-0.001
 hovering-0.001
 outer-0.000
Token truth
Token ID3872
Feature activation-6.82e-04

Pos logit contributions
ugg+0.032
ave+0.029
bye+0.029
aring+0.028
t+0.028
Neg logit contributions
 his-0.024
 carrot-0.023
 glowing-0.020
 pans-0.020
 carrots-0.020
Token how
Token ID703
Feature activation+1.16e-04

Pos logit contributions
t+0.089
ugg+0.086
case+0.082
ave+0.081
bye+0.080
Neg logit contributions
 its-0.048
 his-0.041
 hovering-0.041
cks-0.041
 outer-0.037
Token it
Token ID340
Feature activation+2.36e-02

Pos logit contributions
t+0.000
ugg+0.000
case+0.000
ave+0.000
omm+0.000
Neg logit contributions
 its-0.000
 his-0.000
cks-0.000
 hovering-0.000
 outer-0.000
Token tastes
Token ID18221
Feature activation-8.03e-03

Pos logit contributions
t+0.041
ugg+0.040
bye+0.037
case+0.037
 virtue+0.037
Neg logit contributions
 its-0.035
cks-0.033
 hovering-0.032
 his-0.032
 perched-0.030
Token."
Token ID526
Feature activation+8.73e-04

Pos logit contributions
 hovering+0.011
cks+0.011
 its+0.011
 outer+0.010
 perched+0.010
Neg logit contributions
t-0.014
 virtue-0.012
ugg-0.012
case-0.012
bye-0.012
Token
Token ID198
Feature activation-2.19e-03

Pos logit contributions
t+0.002
ugg+0.002
 virtue+0.002
ave+0.002
case+0.002
Neg logit contributions
 its-0.001
 his-0.001
cks-0.001
 hovering-0.001
 perched-0.001
Token
Token ID198
Feature activation-8.24e-03

Pos logit contributions
 its+0.003
 hovering+0.003
cks+0.003
 outer+0.003
 his+0.003
Neg logit contributions
t-0.004
ugg-0.003
case-0.003
bye-0.003
 virtue-0.003
TokenTom
Token ID13787
Feature activation+3.99e-03

Pos logit contributions
 its+0.016
 hovering+0.015
cks+0.015
 his+0.015
 outer+0.014
Neg logit contributions
t-0.011
ugg-0.010
case-0.010
 virtue-0.009
ave-0.009
Token thought
Token ID1807
Feature activation-1.54e-02

Pos logit contributions
t+0.009
 virtue+0.008
ugg+0.008
bye+0.008
 deed+0.008
Neg logit contributions
 his-0.005
 its-0.004
 perched-0.004
 hovering-0.004
cks-0.003
Token that
Token ID326
Feature activation-3.04e-03

Pos logit contributions
 its+0.027
 hovering+0.025
cks+0.025
 his+0.024
 outer+0.023
Neg logit contributions
t-0.023
ugg-0.021
case-0.021
 virtue-0.020
ave-0.020
Token was
Token ID373
Feature activation-1.00e-02

Pos logit contributions
 its+0.006
 hovering+0.006
cks+0.005
 his+0.005
 outer+0.005
Neg logit contributions
t-0.004
ugg-0.004
case-0.004
 virtue-0.003
ave-0.003
Token a
Token ID257
Feature activation+3.85e-03

Pos logit contributions
cks+0.014
 hovering+0.014
 outer+0.014
 its+0.014
rob+0.013
Neg logit contributions
t-0.015
 judge-0.015
case-0.014
 virtue-0.014
 moms-0.014
Token
Token ID198
Feature activation-2.69e-04

Pos logit contributions
t+0.013
ugg+0.011
case+0.011
ave+0.010
bye+0.010
Neg logit contributions
 its-0.013
 hovering-0.013
cks-0.013
 outer-0.012
 his-0.012
TokenAfter
Token ID3260
Feature activation+1.43e-02

Pos logit contributions
 its+0.000
 hovering+0.000
cks+0.000
 his+0.000
 outer+0.000
Neg logit contributions
ugg-0.000
t-0.000
case-0.000
 virtue-0.000
bye-0.000
Token learning
Token ID4673
Feature activation+1.22e-02

Pos logit contributions
t+0.031
ugg+0.030
case+0.029
ave+0.029
 virtue+0.028
Neg logit contributions
 its-0.016
 his-0.014
 hovering-0.014
cks-0.013
 outer-0.012
Token,
Token ID11
Feature activation-3.69e-03

Pos logit contributions
t+0.020
ugg+0.019
case+0.019
 virtue+0.018
bye+0.018
Neg logit contributions
 its-0.018
cks-0.018
 hovering-0.018
 his-0.016
 outer-0.016
Token Jake
Token ID14757
Feature activation-1.50e-03

Pos logit contributions
 its+0.004
 his+0.004
 hovering+0.004
cks+0.004
 outer+0.003
Neg logit contributions
t-0.008
ugg-0.008
case-0.007
 virtue-0.007
ave-0.007
Token and
Token ID290
Feature activation+8.44e-03

Pos logit contributions
 its+0.002
 his+0.002
 perched+0.001
 hovering+0.001
 down+0.001
Neg logit contributions
ugg-0.003
t-0.003
 virtue-0.003
omm-0.003
bye-0.003
Token Jill
Token ID22772
Feature activation-1.70e-03

Pos logit contributions
t+0.017
ugg+0.015
case+0.015
bye+0.015
ave+0.015
Neg logit contributions
 its-0.010
 hovering-0.010
cks-0.010
 outer-0.009
 perched-0.009
Token got
Token ID1392
Feature activation+7.86e-03

Pos logit contributions
 his+0.002
 its+0.002
 her+0.002
 perched+0.002
 hovering+0.002
Neg logit contributions
ugg-0.003
omm-0.003
t-0.003
 virtue-0.003
aring-0.003
Token sleepy
Token ID42023
Feature activation-2.47e-03

Pos logit contributions
t+0.016
ugg+0.015
case+0.015
 virtue+0.015
bye+0.014
Neg logit contributions
 its-0.010
 his-0.009
 hovering-0.008
cks-0.008
 down-0.008
Token.
Token ID13
Feature activation+7.27e-03

Pos logit contributions
 its+0.003
 hovering+0.003
cks+0.003
 his+0.003
 outer+0.002
Neg logit contributions
t-0.005
ugg-0.005
case-0.005
 virtue-0.005
ave-0.005
Token Mom
Token ID11254
Feature activation+3.79e-03

Pos logit contributions
t+0.013
ugg+0.013
 virtue+0.012
case+0.012
bye+0.012
Neg logit contributions
 its-0.011
 his-0.010
cks-0.009
 hovering-0.009
 outer-0.008
Token,
Token ID11
Feature activation+2.86e-04

Pos logit contributions
t+0.015
ugg+0.013
case+0.013
bye+0.013
ave+0.013
Neg logit contributions
cks-0.015
 its-0.015
 hovering-0.015
 outer-0.013
 perched-0.013
Token he
Token ID339
Feature activation-1.94e-02

Pos logit contributions
t+0.001
ugg+0.001
case+0.001
bye+0.001
ave+0.000
Neg logit contributions
cks-0.000
 hovering-0.000
 its-0.000
 perched-0.000
 outer-0.000
Token had
Token ID550
Feature activation-6.96e-03

Pos logit contributions
 his+0.024
 its+0.022
 perched+0.020
 seaw+0.020
 hovering+0.019
Neg logit contributions
ugg-0.043
omm-0.040
 virtue-0.040
t-0.039
case-0.037
Token to
Token ID284
Feature activation-8.39e-03

Pos logit contributions
 its+0.007
 hovering+0.006
cks+0.006
 his+0.006
 outer+0.005
Neg logit contributions
t-0.016
ugg-0.015
case-0.015
 virtue-0.014
ave-0.014
Token turn
Token ID1210
Feature activation-3.39e-03

Pos logit contributions
 its+0.016
 his+0.015
cks+0.015
 hovering+0.014
 outer+0.014
Neg logit contributions
t-0.012
ugg-0.012
ave-0.010
case-0.010
 virtue-0.010
Token around
Token ID1088
Feature activation-2.64e-03

Pos logit contributions
cks+0.004
 its+0.004
 hovering+0.004
 outer+0.004
 perched+0.003
Neg logit contributions
t-0.007
ugg-0.006
case-0.006
ave-0.006
bye-0.006
Token a
Token ID257
Feature activation-6.07e-03

Pos logit contributions
 its+0.003
cks+0.003
 hovering+0.003
 outer+0.003
 perched+0.003
Neg logit contributions
t-0.005
ugg-0.005
case-0.005
 virtue-0.004
bye-0.004
Token corner
Token ID5228
Feature activation+4.15e-03

Pos logit contributions
 its+0.009
 his+0.008
 hovering+0.008
 outer+0.007
cks+0.007
Neg logit contributions
t-0.011
ugg-0.011
ave-0.011
bye-0.010
omm-0.010
Token.
Token ID13
Feature activation-5.05e-03

Pos logit contributions
ugg+0.009
t+0.009
omm+0.008
 virtue+0.008
ave+0.008
Neg logit contributions
 its-0.005
 his-0.004
 hovering-0.004
cks-0.004
 outer-0.003
Token He
Token ID679
Feature activation+1.22e-03

Pos logit contributions
 its+0.006
 his+0.006
 seaw+0.004
 outer+0.004
 hovering+0.004
Neg logit contributions
t-0.011
ugg-0.010
 virtue-0.010
bye-0.010
 punish-0.010
Token leaned
Token ID23831
Feature activation-2.22e-02

Pos logit contributions
ugg+0.002
omm+0.002
 virtue+0.002
t+0.002
oral+0.002
Neg logit contributions
 his-0.002
 its-0.002
 perched-0.001
b-0.001
 hovering-0.001
Token back
Token ID736
Feature activation+1.98e-02

Pos logit contributions
t+0.018
ugg+0.018
case+0.017
bye+0.016
omm+0.016
Neg logit contributions
 its-0.009
 his-0.008
 hovering-0.008
cks-0.007
 outer-0.007
Token home
Token ID1363
Feature activation+1.49e-02

Pos logit contributions
t+0.035
ugg+0.034
case+0.032
ave+0.031
 virtue+0.031
Neg logit contributions
 its-0.029
 hovering-0.027
cks-0.026
 his-0.026
 outer-0.024
Token.
Token ID13
Feature activation-1.19e-03

Pos logit contributions
t+0.031
ugg+0.031
case+0.029
ave+0.029
omm+0.029
Neg logit contributions
 its-0.017
 his-0.015
 hovering-0.015
cks-0.014
 outer-0.013
Token She
Token ID1375
Feature activation-3.73e-03

Pos logit contributions
 its+0.002
 his+0.001
 hovering+0.001
 outer+0.001
 perched+0.001
Neg logit contributions
t-0.002
case-0.002
 virtue-0.002
bye-0.002
 judge-0.002
Token opened
Token ID4721
Feature activation-3.89e-02

Pos logit contributions
 its+0.006
 his+0.005
 perched+0.005
 hovering+0.005
 fl+0.005
Neg logit contributions
t-0.007
ugg-0.006
omm-0.006
 virtue-0.006
aring-0.006
Token her
Token ID607
Feature activation-3.60e-02

Pos logit contributions
cks+0.041
 hovering+0.037
rob+0.035
 pans+0.034
 seaw+0.033
Neg logit contributions
t-0.075
case-0.071
um-0.071
ugg-0.071
 tum-0.069
Token eyes
Token ID2951
Feature activation-2.53e-02

Pos logit contributions
 its+0.053
 hovering+0.048
 his+0.048
cks+0.046
 outer+0.045
Neg logit contributions
t-0.067
ugg-0.062
ave-0.059
bye-0.058
case-0.058
Token and
Token ID290
Feature activation-2.75e-03

Pos logit contributions
 its+0.037
cks+0.035
 hovering+0.034
 his+0.033
 outer+0.032
Neg logit contributions
t-0.045
ugg-0.042
case-0.041
ave-0.039
bye-0.038
Token she
Token ID673
Feature activation-2.68e-02

Pos logit contributions
 its+0.004
 hovering+0.004
cks+0.004
 his+0.004
 outer+0.004
Neg logit contributions
t-0.005
ugg-0.004
case-0.004
 virtue-0.004
bye-0.004
Token was
Token ID373
Feature activation+2.99e-03

Pos logit contributions
 its+0.040
 his+0.036
 hovering+0.036
 perched+0.035
cks+0.034
Neg logit contributions
t-0.049
ugg-0.047
 virtue-0.046
omm-0.044
bye-0.044
Token back
Token ID736
Feature activation+3.36e-02

Pos logit contributions
t+0.006
ugg+0.006
case+0.005
bye+0.005
 virtue+0.005
Neg logit contributions
 its-0.004
 hovering-0.004
cks-0.003
 his-0.003
 perched-0.003
Token meant
Token ID4001
Feature activation-6.24e-03

Pos logit contributions
t+0.000
case+0.000
ugg+0.000
bye+0.000
 judge+0.000
Neg logit contributions
 its-0.000
cks-0.000
 hovering-0.000
 outer-0.000
 his-0.000
Token,
Token ID11
Feature activation+3.87e-03

Pos logit contributions
cks+0.011
 its+0.011
 hovering+0.010
 outer+0.010
 perched+0.010
Neg logit contributions
t-0.010
case-0.008
ugg-0.008
 judge-0.008
bye-0.008
Token but
Token ID475
Feature activation+5.24e-03

Pos logit contributions
t+0.008
ugg+0.008
case+0.008
 virtue+0.008
ave+0.008
Neg logit contributions
 its-0.004
 hovering-0.004
 his-0.004
cks-0.003
 perched-0.003
Token she
Token ID673
Feature activation-3.20e-02

Pos logit contributions
t+0.011
ugg+0.010
case+0.010
bye+0.010
ave+0.010
Neg logit contributions
cks-0.006
 its-0.006
 hovering-0.005
 outer-0.005
 his-0.005
Token closed
Token ID4838
Feature activation-2.32e-02

Pos logit contributions
 its+0.044
 his+0.043
 perched+0.037
 hovering+0.036
 her+0.034
Neg logit contributions
ugg-0.060
t-0.059
 virtue-0.054
aring-0.053
case-0.052
Token her
Token ID607
Feature activation-4.16e-02

Pos logit contributions
cks+0.026
 hovering+0.025
rob+0.025
 seaw+0.024
piring+0.024
Neg logit contributions
t-0.042
um-0.040
ugg-0.040
 tum-0.039
case-0.038
Token eyes
Token ID2951
Feature activation-2.51e-02

Pos logit contributions
 its+0.070
 hovering+0.069
cks+0.068
 perched+0.063
 outer+0.062
Neg logit contributions
t-0.061
case-0.057
ugg-0.057
 virtue-0.053
ave-0.053
Token and
Token ID290
Feature activation-5.86e-03

Pos logit contributions
 its+0.035
cks+0.033
 hovering+0.033
 his+0.031
 outer+0.030
Neg logit contributions
t-0.046
ugg-0.043
case-0.042
ave-0.040
 virtue-0.040
Token tried
Token ID3088
Feature activation+1.01e-02

Pos logit contributions
 its+0.010
cks+0.010
 hovering+0.010
 his+0.009
 outer+0.009
Neg logit contributions
t-0.009
ugg-0.008
case-0.008
ave-0.007
bye-0.007
Token to
Token ID284
Feature activation+5.97e-03

Pos logit contributions
t+0.013
case+0.011
ugg+0.011
 virtue+0.011
 judge+0.010
Neg logit contributions
cks-0.019
 hovering-0.018
 its-0.018
rob-0.017
 seaw-0.017
Token imagine
Token ID5967
Feature activation-3.36e-03

Pos logit contributions
t+0.009
ugg+0.009
case+0.008
 virtue+0.008
ave+0.008
Neg logit contributions
 its-0.010
 hovering-0.010
cks-0.010
 his-0.009
 outer-0.009
TokenI
Token ID40
Feature activation-1.09e-02

Pos logit contributions
 its+0.029
 hovering+0.027
cks+0.027
 his+0.026
 outer+0.025
Neg logit contributions
t-0.021
ugg-0.019
case-0.019
 virtue-0.018
ave-0.018
Token taste
Token ID6938
Feature activation-1.17e-02

Pos logit contributions
 its+0.017
 his+0.015
 hovering+0.015
 perched+0.014
 outer+0.014
Neg logit contributions
t-0.017
ugg-0.017
case-0.017
omm-0.016
ave-0.016
Token popcorn
Token ID38914
Feature activation-3.64e-04

Pos logit contributions
cks+0.016
 hovering+0.016
 its+0.014
 outer+0.014
 shadowy+0.013
Neg logit contributions
t-0.020
ugg-0.019
 tum-0.019
 virtue-0.017
omm-0.017
Token in
Token ID287
Feature activation-7.54e-03

Pos logit contributions
 its+0.000
cks+0.000
 hovering+0.000
 his+0.000
 outer+0.000
Neg logit contributions
t-0.001
ugg-0.001
case-0.001
bye-0.001
 virtue-0.001
Token my
Token ID616
Feature activation-2.73e-02

Pos logit contributions
 hovering+0.008
cks+0.007
 its+0.007
rob+0.006
 perched+0.006
Neg logit contributions
t-0.015
case-0.014
ugg-0.014
 tum-0.014
 virtue-0.014
Token mouth
Token ID5422
Feature activation-3.94e-02

Pos logit contributions
 its+0.039
 hovering+0.038
cks+0.038
 his+0.034
 perched+0.033
Neg logit contributions
t-0.047
ugg-0.046
case-0.044
 tum-0.043
bye-0.043
Token!"
Token ID2474
Feature activation+7.08e-04

Pos logit contributions
 its+0.045
 his+0.040
 hovering+0.039
cks+0.038
 perched+0.035
Neg logit contributions
t-0.078
ugg-0.078
ave-0.074
 virtue-0.074
case-0.073
Token Mom
Token ID11254
Feature activation+5.63e-03

Pos logit contributions
t+0.001
 virtue+0.001
case+0.001
 judge+0.001
bye+0.001
Neg logit contributions
 its-0.001
 his-0.001
 her-0.001
 down-0.001
 hovering-0.001
Tokenmy
Token ID1820
Feature activation+4.65e-03

Pos logit contributions
t+0.009
ugg+0.009
case+0.009
 virtue+0.008
bye+0.008
Neg logit contributions
 its-0.009
 hovering-0.008
cks-0.008
 his-0.008
 outer-0.007
Token gave
Token ID2921
Feature activation-1.34e-02

Pos logit contributions
t+0.009
 virtue+0.009
case+0.009
ugg+0.009
bye+0.008
Neg logit contributions
 its-0.006
 his-0.006
 hovering-0.005
 perched-0.005
 carrot-0.005
Token her
Token ID607
Feature activation-2.27e-02

Pos logit contributions
 its+0.007
 his+0.005
 hovering+0.005
cks+0.005
 outer+0.003
Neg logit contributions
t-0.037
ugg-0.036
case-0.035
 virtue-0.034
bye-0.034
Token very
Token ID845
Feature activation-8.34e-03

Pos logit contributions
cks+0.014
rob+0.013
 its+0.013
 pans+0.013
 seaw+0.013
Neg logit contributions
t-0.018
ugg-0.016
 tum-0.016
case-0.016
 punish-0.016
Token tired
Token ID10032
Feature activation-1.47e-02

Pos logit contributions
 its+0.012
cks+0.011
 hovering+0.011
 outer+0.010
 his+0.010
Neg logit contributions
t-0.015
ugg-0.014
case-0.013
 virtue-0.013
ave-0.013
Token from
Token ID422
Feature activation-1.25e-02

Pos logit contributions
 its+0.020
 hovering+0.018
cks+0.018
 his+0.018
 outer+0.017
Neg logit contributions
t-0.027
ugg-0.026
case-0.025
ave-0.024
 virtue-0.024
Token collecting
Token ID13157
Feature activation-1.10e-02

Pos logit contributions
 its+0.014
cks+0.014
 hovering+0.014
 outer+0.012
 his+0.012
Neg logit contributions
t-0.025
ugg-0.023
case-0.023
 judge-0.022
bye-0.022
Token ac
Token ID936
Feature activation+1.25e-02

Pos logit contributions
 hovering+0.013
cks+0.013
 its+0.012
 outer+0.010
 perched+0.010
Neg logit contributions
t-0.020
ugg-0.020
 tum-0.019
 virtue-0.019
case-0.019
Tokenorns
Token ID19942
Feature activation-4.12e-02

Pos logit contributions
t+0.027
ugg+0.026
case+0.024
bye+0.023
ave+0.023
Neg logit contributions
 its-0.015
 hovering-0.014
 outer-0.013
cks-0.013
 his-0.012
Token all
Token ID477
Feature activation-8.92e-03

Pos logit contributions
 hovering+0.066
cks+0.065
 its+0.063
 outer+0.061
 pans+0.056
Neg logit contributions
t-0.070
case-0.060
ugg-0.059
bye-0.057
 virtue-0.055
Token day
Token ID1110
Feature activation-6.86e-03

Pos logit contributions
 hovering+0.015
cks+0.015
 its+0.014
 pans+0.014
 perched+0.013
Neg logit contributions
t-0.013
ugg-0.012
bye-0.011
case-0.011
 virtue-0.010
Token."
Token ID526
Feature activation-1.95e-03

Pos logit contributions
 its+0.009
 hovering+0.008
cks+0.008
 his+0.008
 outer+0.007
Neg logit contributions
t-0.013
ugg-0.013
case-0.012
ave-0.012
 virtue-0.012
Token
Token ID198
Feature activation-3.78e-03

Pos logit contributions
 its+0.003
cks+0.003
 hovering+0.003
 outer+0.003
 his+0.003
Neg logit contributions
t-0.003
ugg-0.003
case-0.003
ave-0.003
bye-0.003
Token
Token ID198
Feature activation-1.01e-02

Pos logit contributions
 its+0.005
cks+0.005
 hovering+0.005
 his+0.005
 outer+0.004
Neg logit contributions
t-0.007
ugg-0.006
case-0.006
 virtue-0.006
ave-0.006
Token feel
Token ID1254
Feature activation-5.05e-03

Pos logit contributions
t+0.019
ugg+0.018
case+0.017
 virtue+0.017
ave+0.017
Neg logit contributions
 its-0.013
 hovering-0.012
cks-0.012
 his-0.012
 outer-0.011
Token better
Token ID1365
Feature activation+1.64e-02

Pos logit contributions
cks+0.007
 hovering+0.007
 its+0.007
 outer+0.007
rob+0.007
Neg logit contributions
t-0.008
ugg-0.008
 tum-0.008
 virtue-0.008
ave-0.007
Token.
Token ID13
Feature activation-4.59e-03

Pos logit contributions
t+0.027
ugg+0.024
 tum+0.023
 virtue+0.023
bye+0.023
Neg logit contributions
cks-0.026
 hovering-0.025
 its-0.025
 outer-0.024
rob-0.024
Token She
Token ID1375
Feature activation-1.13e-03

Pos logit contributions
 its+0.008
 hovering+0.007
cks+0.007
 his+0.007
 outer+0.006
Neg logit contributions
t-0.007
ugg-0.007
case-0.007
ave-0.006
 virtue-0.006
Token hugged
Token ID41130
Feature activation-1.14e-02

Pos logit contributions
 its+0.002
 his+0.002
 seaw+0.001
 carrot+0.001
 perched+0.001
Neg logit contributions
t-0.002
aring-0.002
ugg-0.002
 virtue-0.002
case-0.002
Token her
Token ID607
Feature activation-3.55e-02

Pos logit contributions
 its+0.012
 his+0.011
 hovering+0.010
cks+0.010
 outer+0.009
Neg logit contributions
t-0.026
ugg-0.024
case-0.024
 virtue-0.023
ave-0.023
Token t
Token ID256
Feature activation-2.18e-02

Pos logit contributions
 its+0.077
 hovering+0.075
cks+0.075
 his+0.072
 outer+0.071
Neg logit contributions
t-0.037
ugg-0.033
case-0.031
 virtue-0.029
bye-0.029
Tokeneddy
Token ID21874
Feature activation+1.65e-02

Pos logit contributions
 its+0.031
 his+0.028
cks+0.028
 hovering+0.027
 pans+0.026
Neg logit contributions
t-0.037
case-0.036
 judge-0.034
 virtue-0.034
 punish-0.034
Token bear
Token ID6842
Feature activation-1.29e-02

Pos logit contributions
t+0.031
ugg+0.030
case+0.028
bye+0.028
ave+0.028
Neg logit contributions
 its-0.021
 hovering-0.021
cks-0.020
 outer-0.019
 his-0.018
Token and
Token ID290
Feature activation-4.04e-03

Pos logit contributions
 its+0.015
 his+0.014
 hovering+0.012
 carrot+0.012
cks+0.011
Neg logit contributions
t-0.028
 virtue-0.025
ugg-0.025
case-0.025
ave-0.025
Token closed
Token ID4838
Feature activation-1.22e-02

Pos logit contributions
 its+0.007
 hovering+0.006
cks+0.006
 outer+0.006
 his+0.006
Neg logit contributions
t-0.006
ugg-0.006
case-0.006
ave-0.005
bye-0.005