TOP ACTIVATIONS
MAX = 0.168

truck with a sign that says " News ". They are
box with a sign that says " Do not open ".
It has a label that says " mail ". They are
is a big sign that says " O asis Hotel and
It has a button that says " power ". L ila
, " Look , it says Tom . It 's my
can figure out what it says !" So together the mom
truck has a sign that says , " L ocks mith
box had a label that said : " Please return to
It has a sign that says " test ". They are
also has a sign that says , " Boo !" They
license is a paper that says you can drive a car
there is a sign that says ' Hot el ' with
has a big sign that says " Magic Shop ". Lily
has a heart and it says I love you ."
lady with a badge that says " M useum Staff ".
'm not sure what it says . It looks like a
they see a sign that says " Do not go in
is holding a sign that says " Mr . Smith ".
They see a sign that says " Ad opt a Panda

MIN ACTIVATIONS
MIN = -0.180

. The nos y person then tried to
It 's ok ," the nos y girl said . "
happy . The nos y metal robot realized that
around . The nos y cart then made its
. The nos y metal robot thought it
The puppy reluctantly let the nos y person pet them .
together ." The nos y bird was excited and
replied . The nos y neighbor asked , "
!" The nos y girl smiled . She
away . The nos y duck did not listen
give . The nos y bear and nos y
tree . The nos y man finally went away
details . The nos y lady came to check
ground ." The nos y dove got closer and
?" The nos y monster smiled and waved
Sm iley laughed at the nos y dove and then continued
. The nos y face was never forgotten
. The nos y rabbit asked his friends
before . The nos y girl watched as they
persistent - so they gathered around the ladder

INTERVAL 0.081 - 0.168
CONTAINS 2.313%

Sam wanted to be friends with Tim , so he offered
friends ," said the princess with a smile .
. Even when not performing with it , Jane liked to
river and filled her bucket with water . The water was
! Daisy was so happy that she kept spl ashing and

INTERVAL -0.006 - 0.081
CONTAINS 53.508%

his dinner was and how much he loved playing with his
!" cried Jill . " You were so fast !"
on the side of the road . Sam
a loyal friend to his owner , a little girl named
day , Tim my 's mom my told him they were

INTERVAL -0.093 - -0.006
CONTAINS 43.804%

down . " Hey !" Mia says . She
before . But something even more incredible happened .
he was badly hurt . He lay there until some animals
boy said , Yes ! I
But they do not know that one button is

INTERVAL -0.180 - -0.093
CONTAINS 0.375%

. Can I mark something with
. The k ite goes higher and higher . Sara feels
<|endoftext|> Once there was a curious little girl called Am a
, Let ’ s
He flew up to a tall branch and brought down a
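
The four INTERVAL blocks above appear to be equal-width bins spanning MIN to MAX, and their CONTAINS shares appear to cover every sampled token. The sketch below checks both readings using only the numbers printed in this dump. The helper name `interval_edges` and the equal-width assumption are illustrative; the dump does not state how the dashboard computes its bins.

```python
# Values copied from the dump above (already rounded to three decimals there).
MIN_ACT, MAX_ACT = -0.180, 0.168
# CONTAINS percentages, listed from the lowest interval to the highest.
shares = [0.375, 43.804, 53.508, 2.313]


def interval_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins between lo and hi.

    Hypothetical helper: assumes equal-width binning, which the dump
    does not confirm.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


edges = interval_edges(MIN_ACT, MAX_ACT, 4)
total = round(sum(shares), 3)

print("edges:", edges)
print("total share (%):", total)
```

If the printed edges match the INTERVAL bounds and the total is 100, the histogram is consistent with four equal-width bins. On that reading, about 97% of tokens fall in the two middle bins, and the "says" examples under TOP ACTIVATIONS sit in the sparsely populated top bin.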
Token truck
Token ID7779
Feature activation+1.92e-02

Pos logit contributions
 aware+0.057
appointed+0.055
 understand+0.053
 active+0.052
 cautious+0.052
Neg logit contributions
ays-0.074
overs-0.065
ions-0.061
es-0.060
ase-0.060
Token with
Token ID351
Feature activation+1.13e-01

Pos logit contributions
ays+0.045
overs+0.041
acey+0.041
kins+0.039
ocket+0.039
Neg logit contributions
appointed-0.024
 calm-0.024
 aware-0.023
 understand-0.023
 trem-0.022
Token a
Token ID257
Feature activation+4.19e-02

Pos logit contributions
ays+0.271
overs+0.240
es+0.226
ions+0.226
ase+0.225
Neg logit contributions
 aware-0.155
appointed-0.145
 understand-0.144
 active-0.141
 gentle-0.137
Token sign
Token ID1051
Feature activation+3.79e-02

Pos logit contributions
ays+0.081
overs+0.069
otion+0.065
ase+0.065
es+0.064
Neg logit contributions
 aware-0.076
appointed-0.070
 active-0.066
 cautious-0.066
 understand-0.065
Token that
Token ID326
Feature activation+1.36e-01

Pos logit contributions
ays+0.084
overs+0.077
acey+0.076
kins+0.072
es+0.071
Neg logit contributions
 gentle-0.050
 calm-0.049
 aware-0.048
 understand-0.048
appointed-0.047
Token says
Token ID1139
Feature activation+1.68e-01

Pos logit contributions
ays+0.307
overs+0.294
ase+0.276
ions+0.269
es+0.256
Neg logit contributions
 calm-0.194
 understand-0.192
 gentle-0.189
appointed-0.185
 aware-0.179
Token "
Token ID366
Feature activation-2.49e-02

Pos logit contributions
ays+0.448
overs+0.371
ase+0.354
acey+0.352
ions+0.352
Neg logit contributions
 understand-0.194
 gentle-0.175
 aware-0.171
 calm-0.169
 wise-0.165
TokenNews
Token ID9980
Feature activation-2.70e-02

Pos logit contributions
 aware+0.034
appointed+0.033
 understand+0.033
 active+0.032
 calm+0.031
Neg logit contributions
ays-0.058
overs-0.051
ions-0.048
otion-0.048
es-0.048
Token".
Token ID1911
Feature activation-2.65e-02

Pos logit contributions
 understand+0.042
 aware+0.042
appointed+0.042
 gentle+0.039
 calm+0.039
Neg logit contributions
ays-0.056
overs-0.049
kins-0.045
ions-0.045
ase-0.044
Token They
Token ID1119
Feature activation+3.81e-03

Pos logit contributions
appointed+0.039
 calm+0.038
 aware+0.037
 understand+0.036
 active+0.036
Neg logit contributions
ays-0.056
ions-0.050
otion-0.050
ocket-0.049
es-0.049
Token are
Token ID389
Feature activation+3.70e-02

Pos logit contributions
ays+0.007
es+0.006
otion+0.006
overs+0.006
acey+0.006
Neg logit contributions
 aware-0.006
 active-0.006
appointed-0.006
 cautious-0.005
 pleasant-0.005
Token box
Token ID3091
Feature activation+6.57e-02

Pos logit contributions
 aware+0.049
appointed+0.046
 understand+0.045
 active+0.045
 cautious+0.044
Neg logit contributions
ays-0.067
overs-0.058
ions-0.055
es-0.055
ase-0.054
Token with
Token ID351
Feature activation+1.14e-01

Pos logit contributions
ays+0.158
acey+0.147
overs+0.141
es+0.135
ocket+0.134
Neg logit contributions
 calm-0.082
 gentle-0.076
 aware-0.075
appointed-0.072
 active-0.071
Token a
Token ID257
Feature activation+4.30e-02

Pos logit contributions
ays+0.285
overs+0.249
ase+0.243
otion+0.243
es+0.232
Neg logit contributions
 aware-0.143
 gentle-0.138
 understand-0.137
 calm-0.134
 pleasant-0.130
Token sign
Token ID1051
Feature activation+3.98e-02

Pos logit contributions
ays+0.090
overs+0.078
ions+0.073
es+0.073
ase+0.072
Neg logit contributions
 aware-0.071
appointed-0.068
 understand-0.066
 active-0.066
 cautious-0.064
Token that
Token ID326
Feature activation+1.39e-01

Pos logit contributions
ays+0.092
overs+0.084
acey+0.083
kins+0.077
ions+0.077
Neg logit contributions
 gentle-0.048
 calm-0.048
 aware-0.044
 understand-0.044
 exciting-0.043
Token says
Token ID1139
Feature activation+1.67e-01

Pos logit contributions
ays+0.375
overs+0.362
ase+0.346
ions+0.333
otion+0.318
Neg logit contributions
 calm-0.148
 gentle-0.143
 understand-0.139
 exciting-0.130
 strong-0.124
Token "
Token ID366
Feature activation-2.56e-02

Pos logit contributions
ays+0.457
overs+0.375
ase+0.355
ls+0.351
acey+0.351
Neg logit contributions
 understand-0.182
 gentle-0.170
 aware-0.163
 wise-0.162
 calm-0.156
TokenDo
Token ID5211
Feature activation+1.90e-02

Pos logit contributions
 aware+0.034
appointed+0.033
 understand+0.033
 active+0.032
 calm+0.031
Neg logit contributions
ays-0.061
overs-0.053
otion-0.051
ions-0.050
es-0.050
Token not
Token ID407
Feature activation-3.23e-02

Pos logit contributions
ays+0.050
overs+0.044
ase+0.041
ions+0.040
es+0.039
Neg logit contributions
 understand-0.023
 gentle-0.023
appointed-0.023
 aware-0.022
 calm-0.022
Token open
Token ID1280
Feature activation-7.83e-03

Pos logit contributions
 aware+0.039
appointed+0.037
 active+0.035
 understand+0.035
 cautious+0.034
Neg logit contributions
ays-0.082
overs-0.073
ions-0.069
es-0.069
otion-0.068
Token".
Token ID1911
Feature activation-3.10e-02

Pos logit contributions
 aware+0.012
 understand+0.011
 calm+0.011
 gentle+0.011
appointed+0.011
Neg logit contributions
ays-0.016
ase-0.014
overs-0.014
otion-0.013
ions-0.013
Token It
Token ID632
Feature activation-2.19e-02

Pos logit contributions
appointed+0.000
 calm+0.000
 gentle+0.000
 aware+0.000
 nod+0.000
Neg logit contributions
ays-0.001
ions-0.001
ocket-0.001
overs-0.001
ase-0.001
Token has
Token ID468
Feature activation+6.45e-02

Pos logit contributions
 gentle+0.031
 calm+0.030
 exciting+0.030
 steep+0.030
 climb+0.029
Neg logit contributions
overs-0.050
ays-0.048
otion-0.041
ase-0.041
ions-0.040
Token a
Token ID257
Feature activation+3.40e-02

Pos logit contributions
ays+0.155
overs+0.136
ions+0.128
es+0.128
otion+0.127
Neg logit contributions
 aware-0.089
appointed-0.083
 understand-0.083
 active-0.080
 gentle-0.079
Token label
Token ID6167
Feature activation+4.95e-02

Pos logit contributions
ays+0.075
overs+0.066
ions+0.062
es+0.062
ase+0.061
Neg logit contributions
 aware-0.053
appointed-0.050
 understand-0.049
 active-0.048
 cautious-0.047
Token that
Token ID326
Feature activation+1.35e-01

Pos logit contributions
ays+0.116
overs+0.103
ase+0.098
acey+0.096
icer+0.093
Neg logit contributions
 understand-0.059
 gentle-0.059
 calm-0.058
 aware-0.057
 strong-0.053
Token says
Token ID1139
Feature activation+1.66e-01

Pos logit contributions
ays+0.300
overs+0.289
ase+0.279
ions+0.256
ips+0.247
Neg logit contributions
 calm-0.200
 gentle-0.198
 understand-0.194
 exciting-0.181
 strong-0.180
Token "
Token ID366
Feature activation-2.01e-02

Pos logit contributions
ays+0.460
overs+0.373
ase+0.357
ls+0.355
ips+0.351
Neg logit contributions
 understand-0.175
 gentle-0.169
 wise-0.160
 calm-0.155
 aware-0.153
Tokenmail
Token ID4529
Feature activation+8.28e-02

Pos logit contributions
 aware+0.027
 understand+0.027
appointed+0.026
 active+0.025
 calm+0.025
Neg logit contributions
ays-0.048
overs-0.041
otion-0.039
ions-0.039
es-0.039
Token".
Token ID1911
Feature activation-2.47e-02

Pos logit contributions
ays+0.194
overs+0.157
ions+0.155
otion+0.151
ase+0.150
Neg logit contributions
 understand-0.113
 aware-0.113
appointed-0.109
 gentle-0.104
 active-0.102
Token They
Token ID1119
Feature activation+4.57e-04

Pos logit contributions
appointed+0.046
 calm+0.045
 aware+0.044
 understand+0.044
 gentle+0.044
Neg logit contributions
ays-0.043
ions-0.038
otion-0.036
ase-0.036
overs-0.036
Token are
Token ID389
Feature activation+3.27e-02

Pos logit contributions
ays+0.001
overs+0.001
es+0.001
ions+0.001
otion+0.001
Neg logit contributions
 aware-0.001
appointed-0.001
 active-0.001
 cautious-0.001
 pleasant-0.001
Token is
Token ID318
Feature activation+3.93e-02

Pos logit contributions
 aware+0.051
appointed+0.049
 understand+0.048
 active+0.048
 gentle+0.048
Neg logit contributions
ays-0.035
overs-0.031
ions-0.027
ase-0.027
otion-0.026
Token a
Token ID257
Feature activation+4.83e-02

Pos logit contributions
ays+0.084
overs+0.074
ions+0.071
es+0.068
otion+0.068
Neg logit contributions
 aware-0.061
appointed-0.059
 understand-0.057
 active-0.056
 cautious-0.054
Token big
Token ID1263
Feature activation-6.69e-03

Pos logit contributions
ays+0.100
overs+0.094
ase+0.084
ions+0.084
ls+0.082
Neg logit contributions
appointed-0.074
 aware-0.073
 understand-0.067
 active-0.064
 cautious-0.062
Token sign
Token ID1051
Feature activation+4.39e-02

Pos logit contributions
 aware+0.011
appointed+0.010
 understand+0.010
 active+0.010
 cautious+0.010
Neg logit contributions
ays-0.014
overs-0.013
ions-0.012
es-0.012
otion-0.012
Token that
Token ID326
Feature activation+1.38e-01

Pos logit contributions
ays+0.097
overs+0.082
acey+0.079
ase+0.078
otion+0.077
Neg logit contributions
 calm-0.067
 gentle-0.067
 aware-0.065
 active-0.062
 cautious-0.062
Token says
Token ID1139
Feature activation+1.66e-01

Pos logit contributions
ays+0.255
overs+0.230
ase+0.218
ions+0.204
es+0.201
Neg logit contributions
 aware-0.252
 understand-0.251
 calm-0.247
 gentle-0.245
appointed-0.236
Token "
Token ID366
Feature activation-2.15e-02

Pos logit contributions
ays+0.400
overs+0.336
ase+0.324
acey+0.315
ions+0.311
Neg logit contributions
 aware-0.226
 understand-0.219
 gentle-0.212
 cautious-0.210
 calm-0.205
TokenO
Token ID46
Feature activation-7.77e-03

Pos logit contributions
 aware+0.030
appointed+0.029
 understand+0.028
 active+0.027
 cautious+0.027
Neg logit contributions
ays-0.050
overs-0.044
ions-0.042
es-0.042
otion-0.041
Tokenasis
Token ID17765
Feature activation+2.41e-02

Pos logit contributions
 aware+0.018
appointed+0.017
 active+0.016
 cautious+0.016
 understand+0.016
Neg logit contributions
ays-0.011
overs-0.009
otion-0.008
ions-0.008
ase-0.008
Token Hotel
Token ID12696
Feature activation+3.87e-02

Pos logit contributions
ays+0.047
overs+0.040
ions+0.037
ase+0.037
otion+0.037
Neg logit contributions
 aware-0.042
appointed-0.040
 active-0.039
 gentle-0.038
 understand-0.038
Token and
Token ID290
Feature activation+4.97e-02

Pos logit contributions
ays+0.079
overs+0.066
otion+0.062
ions+0.062
es+0.062
Neg logit contributions
 aware-0.063
appointed-0.062
 active-0.059
 understand-0.059
 cautious-0.058
Token It
Token ID632
Feature activation-2.18e-02

Pos logit contributions
appointed+0.009
 aware+0.008
 calm+0.008
 active+0.008
 gentle+0.008
Neg logit contributions
ays-0.011
ons-0.010
ocket-0.010
es-0.010
overs-0.010
Token has
Token ID468
Feature activation+6.96e-02

Pos logit contributions
 understand+0.038
 gentle+0.037
 calm+0.037
 aware+0.036
 matter+0.035
Neg logit contributions
overs-0.041
ays-0.041
ase-0.035
ions-0.034
otion-0.034
Token a
Token ID257
Feature activation+3.76e-02

Pos logit contributions
ays+0.159
overs+0.140
es+0.133
ions+0.133
ase+0.129
Neg logit contributions
 aware-0.102
appointed-0.097
 active-0.093
 understand-0.093
 cautious-0.091
Token button
Token ID4936
Feature activation+3.34e-02

Pos logit contributions
ays+0.090
overs+0.080
es+0.076
ions+0.075
ase+0.075
Neg logit contributions
 aware-0.051
appointed-0.048
 understand-0.046
 active-0.046
 cautious-0.045
Token that
Token ID326
Feature activation+1.34e-01

Pos logit contributions
ays+0.071
overs+0.066
ase+0.061
otion+0.060
acey+0.059
Neg logit contributions
 aware-0.046
 understand-0.045
 gentle-0.045
 calm-0.044
appointed-0.043
Token says
Token ID1139
Feature activation+1.65e-01

Pos logit contributions
ays+0.268
overs+0.265
ase+0.250
otion+0.228
ions+0.224
Neg logit contributions
 calm-0.209
 understand-0.207
 gentle-0.206
 slow-0.193
 strong-0.189
Token "
Token ID366
Feature activation-2.18e-02

Pos logit contributions
ays+0.411
overs+0.355
ase+0.331
ions+0.319
ls+0.316
Neg logit contributions
 understand-0.202
 gentle-0.197
 aware-0.192
 calm-0.186
 wise-0.178
Tokenpower
Token ID6477
Feature activation+3.54e-02

Pos logit contributions
 aware+0.031
appointed+0.030
 understand+0.030
 active+0.029
 calm+0.028
Neg logit contributions
ays-0.050
overs-0.044
otion-0.041
ions-0.041
es-0.041
Token".
Token ID1911
Feature activation-2.47e-02

Pos logit contributions
ays+0.075
overs+0.063
otion+0.060
ase+0.059
ions+0.059
Neg logit contributions
 aware-0.054
 understand-0.053
appointed-0.052
 active-0.051
 gentle-0.050
Token L
Token ID406
Feature activation+1.77e-02

Pos logit contributions
 aware+0.044
appointed+0.044
 calm+0.042
 understand+0.042
 gentle+0.040
Neg logit contributions
ays-0.043
ase-0.037
overs-0.037
ions-0.037
ons-0.037
Tokenila
Token ID10102
Feature activation-2.26e-02

Pos logit contributions
ays+0.025
overs+0.024
 DVD+0.021
box+0.020
ase+0.020
Neg logit contributions
 aware-0.035
 understand-0.035
 cautious-0.033
 calm-0.032
 active-0.032
Token,
Token ID11
Feature activation+4.64e-03

Pos logit contributions
ays+0.271
overs+0.234
ase+0.223
ions+0.221
otion+0.213
Neg logit contributions
 aware-0.149
 understand-0.143
 gentle-0.138
 cautious-0.136
 active-0.134
Token "
Token ID366
Feature activation-5.26e-02

Pos logit contributions
ays+0.009
overs+0.008
es+0.008
ions+0.007
otion+0.007
Neg logit contributions
 aware-0.008
appointed-0.008
 active-0.008
 understand-0.008
 cautious-0.007
TokenLook
Token ID8567
Feature activation+6.67e-02

Pos logit contributions
 aware+0.070
appointed+0.067
 understand+0.066
 active+0.064
 cautious+0.063
Neg logit contributions
ays-0.128
overs-0.114
ions-0.108
es-0.107
otion-0.107
Token,
Token ID11
Feature activation+3.83e-02

Pos logit contributions
ays+0.172
overs+0.158
ase+0.154
ions+0.152
otion+0.143
Neg logit contributions
 gentle-0.072
 aware-0.072
 calm-0.069
 understand-0.069
 careful-0.065
Token it
Token ID340
Feature activation+3.06e-02

Pos logit contributions
ays+0.081
overs+0.067
ase+0.061
ions+0.060
es+0.059
Neg logit contributions
 gentle-0.070
 understand-0.070
 calm-0.067
 listen-0.066
 worry-0.065
Token says
Token ID1139
Feature activation+1.65e-01

Pos logit contributions
ays+0.066
overs+0.065
ions+0.056
ase+0.054
otion+0.054
Neg logit contributions
 gentle-0.047
 calm-0.045
appointed-0.044
 aware-0.043
 active-0.043
Token Tom
Token ID4186
Feature activation+7.53e-02

Pos logit contributions
ays+0.421
overs+0.350
ions+0.344
ase+0.342
acey+0.334
Neg logit contributions
 gentle-0.197
 calm-0.192
 wise-0.179
 understand-0.178
 listen-0.177
Token.
Token ID13
Feature activation+5.49e-02

Pos logit contributions
ays+0.181
overs+0.156
ions+0.153
otion+0.151
ase+0.143
Neg logit contributions
 gentle-0.098
 understand-0.096
 aware-0.096
 calm-0.092
 careful-0.089
Token It
Token ID632
Feature activation+3.13e-03

Pos logit contributions
ays+0.119
kins+0.110
ls+0.102
overs+0.102
ons+0.101
Neg logit contributions
 understand-0.077
 nod-0.073
 appreciate-0.070
 calm-0.069
 listen-0.068
Token's
Token ID338
Feature activation+2.35e-02

Pos logit contributions
ays+0.007
overs+0.007
ions+0.006
otion+0.006
ase+0.005
Neg logit contributions
 gentle-0.005
 calm-0.005
 understand-0.005
 aware-0.005
 matter-0.005
Token my
Token ID616
Feature activation+5.20e-02

Pos logit contributions
ays+0.060
overs+0.053
ase+0.051
es+0.051
ions+0.051
Neg logit contributions
 aware-0.029
 gentle-0.028
 calm-0.028
 understand-0.028
 active-0.027
Token can
Token ID460
Feature activation-7.56e-02

Pos logit contributions
 aware+0.048
appointed+0.047
 active+0.045
 cautious+0.044
 understand+0.044
Neg logit contributions
ays-0.046
overs-0.039
ions-0.037
es-0.037
otion-0.036
Token figure
Token ID3785
Feature activation+2.00e-02

Pos logit contributions
 aware+0.111
appointed+0.104
 active+0.100
 cautious+0.099
 understand+0.099
Neg logit contributions
ays-0.175
overs-0.154
ions-0.145
es-0.145
otion-0.143
Token out
Token ID503
Feature activation-5.10e-03

Pos logit contributions
ays+0.040
ase+0.037
overs+0.036
otion+0.033
ions+0.033
Neg logit contributions
 understand-0.034
 aware-0.033
 gentle-0.032
 calm-0.032
 listen-0.030
Token what
Token ID644
Feature activation+4.99e-02

Pos logit contributions
 aware+0.009
appointed+0.008
 understand+0.008
 active+0.008
 cautious+0.008
Neg logit contributions
ays-0.010
overs-0.009
ions-0.008
ase-0.008
es-0.008
Token it
Token ID340
Feature activation+4.06e-02

Pos logit contributions
ays+0.097
overs+0.084
otion+0.077
ase+0.077
ions+0.076
Neg logit contributions
 aware-0.088
 understand-0.087
 gentle-0.086
 calm-0.086
 active-0.085
Token says
Token ID1139
Feature activation+1.65e-01

Pos logit contributions
ays+0.068
overs+0.059
ions+0.053
ase+0.052
otion+0.051
Neg logit contributions
 aware-0.084
appointed-0.081
 understand-0.080
 active-0.079
 gentle-0.078
Token!"
Token ID2474
Feature activation-2.97e-02

Pos logit contributions
ays+0.371
overs+0.315
ase+0.304
ions+0.297
acey+0.292
Neg logit contributions
 understand-0.226
 aware-0.223
 gentle-0.215
 calm-0.215
appointed-0.214
Token So
Token ID1406
Feature activation-2.41e-02

Pos logit contributions
 aware+0.049
appointed+0.046
 active+0.045
 understand+0.045
 cautious+0.044
Neg logit contributions
ays-0.064
overs-0.056
ions-0.052
es-0.052
ase-0.052
Token together
Token ID1978
Feature activation-2.61e-03

Pos logit contributions
 aware+0.041
appointed+0.039
 understand+0.038
 active+0.038
 calm+0.037
Neg logit contributions
ays-0.051
overs-0.043
ions-0.041
ase-0.041
otion-0.040
Token the
Token ID262
Feature activation-2.22e-02

Pos logit contributions
 aware+0.004
 understand+0.004
appointed+0.004
 calm+0.004
 gentle+0.004
Neg logit contributions
ays-0.006
overs-0.005
ase-0.005
ions-0.005
ocket-0.005
Token mom
Token ID1995
Feature activation-3.08e-02

Pos logit contributions
 aware+0.029
 gentle+0.029
 understand+0.028
 calm+0.028
appointed+0.027
Neg logit contributions
ays-0.057
overs-0.048
ions-0.047
es-0.046
ase-0.045
Token truck
Token ID7779
Feature activation-5.21e-03

Pos logit contributions
ays+0.004
overs+0.003
es+0.003
ions+0.003
otion+0.003
Neg logit contributions
 aware-0.004
 calm-0.004
 gentle-0.004
 understand-0.004
appointed-0.004
Token has
Token ID468
Feature activation+6.04e-02

Pos logit contributions
 calm+0.009
 gentle+0.008
appointed+0.008
 aware+0.008
 understand+0.008
Neg logit contributions
ays-0.010
overs-0.010
ions-0.009
acey-0.009
ase-0.009
Token a
Token ID257
Feature activation+3.41e-02

Pos logit contributions
ays+0.144
overs+0.127
ions+0.120
es+0.120
otion+0.119
Neg logit contributions
 aware-0.083
appointed-0.078
 understand-0.077
 active-0.075
 cautious-0.074
Token sign
Token ID1051
Feature activation+3.00e-02

Pos logit contributions
ays+0.075
overs+0.066
ase+0.062
es+0.061
otion+0.061
Neg logit contributions
 aware-0.053
appointed-0.050
 active-0.047
 understand-0.047
 cautious-0.046
Token that
Token ID326
Feature activation+1.29e-01

Pos logit contributions
ays+0.067
overs+0.061
acey+0.059
ions+0.058
kins+0.058
Neg logit contributions
 gentle-0.041
 calm-0.041
 understand-0.038
 aware-0.038
 slow-0.036
Token says
Token ID1139
Feature activation+1.64e-01

Pos logit contributions
ays+0.265
overs+0.255
ase+0.241
ions+0.229
otion+0.217
Neg logit contributions
 calm-0.202
 gentle-0.198
 understand-0.197
 strong-0.180
 slow-0.179
Token,
Token ID11
Feature activation+3.78e-02

Pos logit contributions
ays+0.432
overs+0.353
ase+0.339
ions+0.332
ls+0.329
Neg logit contributions
 gentle-0.177
 understand-0.177
 calm-0.169
 wise-0.167
 aware-0.167
Token "
Token ID366
Feature activation-3.78e-02

Pos logit contributions
ays+0.081
overs+0.070
ions+0.066
es+0.065
ase+0.065
Neg logit contributions
 aware-0.062
appointed-0.059
 understand-0.058
 active-0.057
 cautious-0.056
TokenL
Token ID43
Feature activation+7.65e-02

Pos logit contributions
 aware+0.046
appointed+0.045
 understand+0.044
Slow+0.041
 calm+0.041
Neg logit contributions
ays-0.093
overs-0.083
otion-0.078
ions-0.078
es-0.078
Tokenocks
Token ID3320
Feature activation-3.09e-03

Pos logit contributions
ays+0.126
overs+0.097
kins+0.096
ls+0.095
es+0.088
Neg logit contributions
 aware-0.145
 understand-0.137
Slow-0.134
 gentle-0.132
 listen-0.132
Tokenmith
Token ID22947
Feature activation+8.10e-02

Pos logit contributions
 aware+0.005
 understand+0.005
 gentle+0.005
 listen+0.005
 calm+0.005
Neg logit contributions
ays-0.006
overs-0.005
ase-0.005
acey-0.005
ocket-0.004
Token box
Token ID3091
Feature activation+6.61e-02

Pos logit contributions
ays+0.017
overs+0.014
es+0.014
ions+0.014
otion+0.013
Neg logit contributions
 aware-0.014
 understand-0.013
 calm-0.013
appointed-0.013
 gentle-0.013
Token had
Token ID550
Feature activation+9.22e-02

Pos logit contributions
ays+0.173
overs+0.156
acey+0.145
ions+0.142
ase+0.139
Neg logit contributions
 calm-0.081
 matter-0.078
 gentle-0.073
 trem-0.069
appointed-0.068
Token a
Token ID257
Feature activation+5.14e-02

Pos logit contributions
ays+0.225
overs+0.193
ase+0.185
otion+0.185
es+0.183
Neg logit contributions
 aware-0.125
 understand-0.121
 calm-0.119
 gentle-0.118
appointed-0.116
Token label
Token ID6167
Feature activation+5.92e-02

Pos logit contributions
ays+0.109
overs+0.095
ions+0.089
es+0.089
ase+0.088
Neg logit contributions
 aware-0.084
appointed-0.080
 understand-0.078
 active-0.077
 cautious-0.075
Token that
Token ID326
Feature activation+1.44e-01

Pos logit contributions
ays+0.137
overs+0.117
ase+0.109
ions+0.107
acey+0.106
Neg logit contributions
 calm-0.083
 understand-0.081
 gentle-0.079
 aware-0.078
 listen-0.074
Token said
Token ID531
Feature activation+1.64e-01

Pos logit contributions
ays+0.370
overs+0.329
ase+0.302
otion+0.293
ions+0.292
Neg logit contributions
 calm-0.188
 understand-0.182
 gentle-0.177
 exciting-0.165
 matter-0.159
Token:
Token ID25
Feature activation+1.38e-02

Pos logit contributions
ays+0.440
overs+0.346
ase+0.342
ls+0.336
ions+0.333
Neg logit contributions
 understand-0.183
 gentle-0.176
 calm-0.169
 wise-0.167
 aware-0.166
Token "
Token ID366
Feature activation-3.32e-02

Pos logit contributions
ays+0.030
overs+0.026
ions+0.025
ase+0.025
es+0.024
Neg logit contributions
 aware-0.022
 understand-0.020
appointed-0.020
 gentle-0.020
 cautious-0.020
TokenPlease
Token ID5492
Feature activation-2.39e-02

Pos logit contributions
 understand+0.039
appointed+0.039
 aware+0.038
Slow+0.037
 calm+0.036
Neg logit contributions
ays-0.081
overs-0.072
otion-0.069
ions-0.068
es-0.068
Token return
Token ID1441
Feature activation+7.16e-02

Pos logit contributions
 aware+0.031
appointed+0.029
 active+0.028
 understand+0.028
 cautious+0.027
Neg logit contributions
ays-0.059
overs-0.053
es-0.050
ions-0.050
otion-0.049
Token to
Token ID284
Feature activation-1.87e-02

Pos logit contributions
ays+0.170
otion+0.144
overs+0.144
ase+0.141
es+0.141
Neg logit contributions
 aware-0.095
 calm-0.089
 understand-0.088
 gentle-0.087
appointed-0.085
Token It
Token ID632
Feature activation-2.27e-02

Pos logit contributions
ays+0.002
overs+0.002
es+0.002
ions+0.002
ase+0.002
Neg logit contributions
appointed-0.002
 aware-0.002
 understand-0.002
 calm-0.002
 gentle-0.002
Token has
Token ID468
Feature activation+6.49e-02

Pos logit contributions
 gentle+0.033
 calm+0.032
 exciting+0.031
 steep+0.030
 climb+0.030
Neg logit contributions
overs-0.053
ays-0.051
otion-0.044
ions-0.043
ase-0.043
Token a
Token ID257
Feature activation+3.35e-02

Pos logit contributions
ays+0.169
overs+0.148
otion+0.142
ions+0.140
ase+0.140
Neg logit contributions
 aware-0.077
 understand-0.074
appointed-0.072
 gentle-0.071
 calm-0.070
Token sign
Token ID1051
Feature activation+2.87e-02

Pos logit contributions
ays+0.073
overs+0.063
ions+0.060
es+0.059
ase+0.059
Neg logit contributions
 aware-0.053
appointed-0.050
 understand-0.050
 active-0.049
 cautious-0.048
Token that
Token ID326
Feature activation+1.29e-01

Pos logit contributions
ays+0.066
overs+0.061
acey+0.058
ase+0.056
ions+0.056
Neg logit contributions
 gentle-0.038
 calm-0.037
 understand-0.035
 exciting-0.035
 aware-0.034
Token says
Token ID1139
Feature activation+1.63e-01

Pos logit contributions
ays+0.270
overs+0.257
ase+0.248
ions+0.232
es+0.224
Neg logit contributions
 calm-0.206
 understand-0.204
 gentle-0.202
appointed-0.191
 exciting-0.191
Token "
Token ID366
Feature activation-2.56e-02

Pos logit contributions
ays+0.440
overs+0.363
ase+0.347
ls+0.339
ions+0.338
Neg logit contributions
 understand-0.182
 gentle-0.169
 calm-0.161
 wise-0.160
 aware-0.159
Tokentest
Token ID9288
Feature activation+4.86e-03

Pos logit contributions
 aware+0.035
appointed+0.034
 understand+0.034
 active+0.032
 calm+0.032
Neg logit contributions
ays-0.060
overs-0.053
es-0.050
otion-0.049
ions-0.049
Token".
Token ID1911
Feature activation-2.94e-02

Pos logit contributions
ays+0.010
overs+0.008
ions+0.008
otion+0.008
ase+0.008
Neg logit contributions
 aware-0.008
appointed-0.008
 understand-0.008
 active-0.008
 cautious-0.008
Token They
Token ID1119
Feature activation-2.49e-03

Pos logit contributions
appointed+0.051
 aware+0.050
 understand+0.048
 calm+0.047
 active+0.046
Neg logit contributions
ays-0.059
overs-0.050
ions-0.048
es-0.048
otion-0.047
Token are
Token ID389
Feature activation+3.00e-02

Pos logit contributions
 aware+0.004
 active+0.004
appointed+0.004
 cautious+0.004
 pleasant+0.004
Neg logit contributions
ays-0.005
otion-0.004
overs-0.004
es-0.004
ions-0.004
Token also
Token ID635
Feature activation+2.20e-02

Pos logit contributions
 aware+0.046
 gentle+0.045
 understand+0.045
 calm+0.044
 active+0.043
Neg logit contributions
ays-0.037
overs-0.034
ions-0.030
ase-0.030
otion-0.029
Token has
Token ID468
Feature activation+7.24e-02

Pos logit contributions
ays+0.041
overs+0.035
ions+0.032
ase+0.032
es+0.032
Neg logit contributions
 aware-0.042
appointed-0.040
 understand-0.040
 active-0.039
 cautious-0.039
Token a
Token ID257
Feature activation+3.79e-02

Pos logit contributions
ays+0.175
overs+0.155
ions+0.146
es+0.146
ase+0.145
Neg logit contributions
 aware-0.097
appointed-0.091
 understand-0.089
 active-0.088
 cautious-0.086
Token sign
Token ID1051
Feature activation+2.71e-02

Pos logit contributions
ays+0.087
overs+0.077
es+0.072
ase+0.071
ions+0.071
Neg logit contributions
 aware-0.055
appointed-0.052
 understand-0.049
 active-0.049
 cautious-0.048
Token that
Token ID326
Feature activation+1.31e-01

Pos logit contributions
ays+0.060
overs+0.052
ase+0.050
acey+0.050
otion+0.050
Neg logit contributions
 gentle-0.039
 calm-0.038
 understand-0.037
 aware-0.036
 respectful-0.035
Token says
Token ID1139
Feature activation+1.63e-01

Pos logit contributions
ays+0.256
overs+0.241
ase+0.228
ions+0.216
otion+0.211
Neg logit contributions
 understand-0.220
 gentle-0.217
 calm-0.215
 aware-0.203
 pleasant-0.199
Token,
Token ID11
Feature activation+3.36e-02

Pos logit contributions
ays+0.410
overs+0.343
ase+0.330
ions+0.318
ls+0.313
Neg logit contributions
 understand-0.196
 gentle-0.194
 aware-0.190
 calm-0.184
 wise-0.176
Token "
Token ID366
Feature activation-3.82e-02

Pos logit contributions
ays+0.071
overs+0.062
ions+0.058
es+0.058
ase+0.057
Neg logit contributions
 aware-0.055
appointed-0.052
 understand-0.052
 active-0.051
 cautious-0.050
TokenBoo
Token ID46120
Feature activation+2.76e-02

Pos logit contributions
 aware+0.049
 understand+0.047
appointed+0.047
 active+0.045
 cautious+0.044
Neg logit contributions
ays-0.093
overs-0.083
ions-0.078
otion-0.078
es-0.078
Token!"
Token ID2474
Feature activation+7.34e-03

Pos logit contributions
ays+0.059
overs+0.051
otion+0.048
ions+0.048
ase+0.047
Neg logit contributions
 aware-0.043
 active-0.040
 gentle-0.039
 calm-0.039
 pleasant-0.038
Token They
Token ID1119
Feature activation+1.70e-02

Pos logit contributions
ays+0.015
overs+0.013
ions+0.012
es+0.012
otion+0.012
Neg logit contributions
 aware-0.012
 understand-0.012
appointed-0.012
 calm-0.012
 gentle-0.011
Token license
Token ID5964
Feature activation+5.22e-02

Pos logit contributions
 aware+0.001
 understand+0.001
 active+0.001
 gentle+0.001
appointed+0.001
Neg logit contributions
ays-0.002
overs-0.001
ions-0.001
otion-0.001
es-0.001
Token is
Token ID318
Feature activation+5.26e-02

Pos logit contributions
ays+0.107
overs+0.093
otion+0.086
ions+0.085
acey+0.080
Neg logit contributions
 gentle-0.085
 calm-0.081
 listen-0.081
 understand-0.081
 patient-0.080
Token a
Token ID257
Feature activation+5.88e-02

Pos logit contributions
ays+0.125
overs+0.110
ase+0.106
ions+0.104
otion+0.103
Neg logit contributions
 aware-0.074
 understand-0.069
appointed-0.069
 gentle-0.068
 active-0.068
Token paper
Token ID3348
Feature activation+3.75e-02

Pos logit contributions
ays+0.125
overs+0.106
ions+0.102
es+0.101
otion+0.100
Neg logit contributions
 aware-0.097
 understand-0.095
 gentle-0.093
 active-0.093
 calm-0.092
Token that
Token ID326
Feature activation+1.38e-01

Pos logit contributions
ays+0.086
otion+0.073
overs+0.073
ions+0.071
ls+0.066
Neg logit contributions
 gentle-0.049
 calm-0.049
 understand-0.048
 active-0.048
 aware-0.048
Token says
Token ID1139
Feature activation+1.63e-01

Pos logit contributions
ays+0.266
overs+0.234
ase+0.214
otion+0.214
ions+0.209
Neg logit contributions
 understand-0.245
 gentle-0.240
 aware-0.235
 calm-0.233
 active-0.227
Token you
Token ID345
Feature activation+2.39e-02

Pos logit contributions
ays+0.378
overs+0.321
ase+0.309
ions+0.297
otion+0.292
Neg logit contributions
 understand-0.217
 gentle-0.216
 aware-0.209
 calm-0.205
 patient-0.202
Token can
Token ID460
Feature activation-4.66e-02

Pos logit contributions
ays+0.054
overs+0.047
ase+0.044
otion+0.044
ions+0.043
Neg logit contributions
 aware-0.037
 understand-0.037
 active-0.034
 gentle-0.034
appointed-0.034
Token drive
Token ID3708
Feature activation+7.26e-02

Pos logit contributions
 aware+0.060
 understand+0.060
appointed+0.057
 calm+0.055
 active+0.055
Neg logit contributions
ays-0.115
overs-0.102
ions-0.097
ase-0.096
es-0.096
Token a
Token ID257
Feature activation+3.55e-02

Pos logit contributions
ays+0.167
otion+0.144
overs+0.143
ase+0.139
ions+0.136
Neg logit contributions
 calm-0.099
 aware-0.099
 active-0.098
 gentle-0.097
 understand-0.097
Token car
Token ID1097
Feature activation+7.62e-02

Pos logit contributions
ays+0.074
overs+0.062
ions+0.060
otion+0.059
es+0.059
Neg logit contributions
 aware-0.060
 understand-0.058
 active-0.058
 gentle-0.057
 calm-0.057
Token there
Token ID612
Feature activation+1.76e-02

Pos logit contributions
ays+0.040
overs+0.033
ase+0.031
ions+0.030
es+0.030
Neg logit contributions
 aware-0.048
 understand-0.046
 calm-0.046
 gentle-0.046
 active-0.045
Token is
Token ID318
Feature activation+1.75e-02

Pos logit contributions
ays+0.032
overs+0.028
ase+0.026
ions+0.026
acey+0.025
Neg logit contributions
 understand-0.033
 calm-0.033
 aware-0.032
 gentle-0.032
appointed-0.031
Token a
Token ID257
Feature activation+3.51e-02

Pos logit contributions
ays+0.044
overs+0.039
ase+0.038
otion+0.037
es+0.037
Neg logit contributions
 aware-0.021
 calm-0.019
 gentle-0.019
 understand-0.019
appointed-0.019
Token sign
Token ID1051
Feature activation+2.92e-02

Pos logit contributions
ays+0.084
overs+0.073
es+0.070
ions+0.070
ase+0.069
Neg logit contributions
 aware-0.048
appointed-0.045
 understand-0.045
 active-0.045
 calm-0.044
Token that
Token ID326
Feature activation+1.29e-01

Pos logit contributions
ays+0.063
overs+0.054
acey+0.052
es+0.051
ase+0.051
Neg logit contributions
 calm-0.044
 gentle-0.044
 aware-0.043
 active-0.042
 understand-0.041
Token says
Token ID1139
Feature activation+1.62e-01

Pos logit contributions
ays+0.242
overs+0.212
ase+0.196
ions+0.191
acey+0.187
Neg logit contributions
 understand-0.233
 aware-0.232
 calm-0.232
 gentle-0.229
appointed-0.222
Token '
Token ID705
Feature activation-8.95e-03

Pos logit contributions
ays+0.375
overs+0.310
ase+0.301
ions+0.297
acey+0.289
Neg logit contributions
 aware-0.231
 understand-0.230
 gentle-0.226
 calm-0.220
appointed-0.216
TokenHot
Token ID21352
Feature activation+4.84e-03

Pos logit contributions
 aware+0.013
appointed+0.012
 understand+0.012
 active+0.012
 calm+0.011
Neg logit contributions
ays-0.021
overs-0.018
otion-0.017
ions-0.017
es-0.017
Tokenel
Token ID417
Feature activation+3.92e-02

Pos logit contributions
ays+0.009
overs+0.008
es+0.008
otion+0.007
ase+0.007
Neg logit contributions
 aware-0.008
 understand-0.008
appointed-0.008
 active-0.008
 calm-0.008
Token'
Token ID6
Feature activation+1.24e-02

Pos logit contributions
ays+0.076
overs+0.065
ions+0.060
otion+0.060
es+0.060
Neg logit contributions
 aware-0.070
appointed-0.066
 understand-0.066
 active-0.064
 cautious-0.063
Token with
Token ID351
Feature activation+1.32e-01

Pos logit contributions
ays+0.026
overs+0.022
ions+0.021
es+0.021
ase+0.020
Neg logit contributions
 aware-0.021
appointed-0.020
 understand-0.019
 active-0.019
 cautious-0.019
Token has
Token ID468
Feature activation+6.99e-02

Pos logit contributions
 exciting+0.028
 gentle+0.027
 calm+0.026
 wise+0.026
 slow+0.025
Neg logit contributions
ays-0.044
overs-0.043
ions-0.038
otion-0.037
kins-0.037
Token a
Token ID257
Feature activation+3.66e-02

Pos logit contributions
ays+0.178
overs+0.153
otion+0.150
ions+0.148
es+0.148
Neg logit contributions
 aware-0.086
 understand-0.082
 gentle-0.081
appointed-0.080
 calm-0.079
Token big
Token ID1263
Feature activation-1.71e-02

Pos logit contributions
ays+0.089
overs+0.078
ions+0.075
es+0.075
otion+0.073
Neg logit contributions
 aware-0.049
 understand-0.046
appointed-0.046
 active-0.045
 gentle-0.045
Token sign
Token ID1051
Feature activation+3.48e-02

Pos logit contributions
 aware+0.029
 understand+0.028
appointed+0.028
 gentle+0.027
 active+0.027
Neg logit contributions
ays-0.035
overs-0.029
es-0.029
ions-0.029
otion-0.028
Token that
Token ID326
Feature activation+1.27e-01

Pos logit contributions
ays+0.081
overs+0.070
acey+0.070
ions+0.068
ase+0.067
Neg logit contributions
 gentle-0.046
 calm-0.045
 understand-0.043
 aware-0.043
 exciting-0.042
Token says
Token ID1139
Feature activation+1.62e-01

Pos logit contributions
ays+0.278
overs+0.252
ase+0.249
ions+0.237
es+0.226
Neg logit contributions
 calm-0.196
 understand-0.194
 gentle-0.193
 exciting-0.182
 wise-0.176
Token "
Token ID366
Feature activation-2.56e-02

Pos logit contributions
ays+0.446
overs+0.356
ase+0.353
ls+0.349
ions+0.344
Neg logit contributions
 understand-0.168
 gentle-0.161
 wise-0.156
 calm-0.153
 aware-0.152
TokenMagic
Token ID22975
Feature activation+1.67e-02

Pos logit contributions
 aware+0.035
 understand+0.034
appointed+0.034
 active+0.033
 calm+0.032
Neg logit contributions
ays-0.060
overs-0.052
otion-0.049
es-0.049
ions-0.049
Token Shop
Token ID13705
Feature activation+4.59e-02

Pos logit contributions
ays+0.034
overs+0.026
ls+0.026
otion+0.026
kins+0.025
Neg logit contributions
 aware-0.026
 understand-0.025
 gentle-0.025
 approach-0.025
 listen-0.024
Token".
Token ID1911
Feature activation-2.73e-02

Pos logit contributions
ays+0.102
otion+0.083
overs+0.082
ls+0.079
ase+0.079
Neg logit contributions
 aware-0.065
appointed-0.060
 active-0.059
 understand-0.058
 gentle-0.057
Token Lily
Token ID20037
Feature activation+5.60e-03

Pos logit contributions
appointed+0.039
 calm+0.039
 understand+0.037
 nod+0.037
 gentle+0.037
Neg logit contributions
ays-0.056
ions-0.051
es-0.051
otion-0.049
ocket-0.049
Token has
Token ID468
Feature activation+8.04e-02

Pos logit contributions
 aware+0.005
appointed+0.005
 understand+0.005
 active+0.005
 cautious+0.005
Neg logit contributions
ays-0.006
overs-0.005
ions-0.005
es-0.005
ase-0.005
Token a
Token ID257
Feature activation+4.52e-02

Pos logit contributions
ays+0.188
overs+0.166
ions+0.156
es+0.156
otion+0.155
Neg logit contributions
 aware-0.113
appointed-0.107
 understand-0.106
 active-0.103
 gentle-0.101
Token heart
Token ID2612
Feature activation+3.46e-02

Pos logit contributions
ays+0.098
overs+0.086
ions+0.080
es+0.080
ase+0.079
Neg logit contributions
 aware-0.071
appointed-0.068
 understand-0.066
 active-0.065
 cautious-0.064
Token and
Token ID290
Feature activation+3.27e-02

Pos logit contributions
ays+0.073
overs+0.063
es+0.059
ions+0.059
ase+0.058
Neg logit contributions
 aware-0.058
appointed-0.055
 understand-0.054
 active-0.053
 cautious-0.052
Token it
Token ID340
Feature activation+2.53e-02

Pos logit contributions
ays+0.081
overs+0.072
ase+0.068
otion+0.068
ions+0.067
Neg logit contributions
 aware-0.042
 understand-0.040
 gentle-0.039
appointed-0.038
 calm-0.038
Token says
Token ID1139
Feature activation+1.62e-01

Pos logit contributions
ays+0.048
overs+0.042
ions+0.039
otion+0.038
ase+0.038
Neg logit contributions
 aware-0.046
appointed-0.044
 understand-0.044
 active-0.043
 gentle-0.043
Token I
Token ID314
Feature activation-4.83e-03

Pos logit contributions
ays+0.362
overs+0.309
ase+0.290
ions+0.289
otion+0.283
Neg logit contributions
 aware-0.240
 understand-0.233
appointed-0.229
 gentle-0.228
 calm-0.223
Token love
Token ID1842
Feature activation+2.05e-02

Pos logit contributions
 aware+0.007
 understand+0.007
 active+0.006
 gentle+0.006
 cautious+0.006
Neg logit contributions
ays-0.011
overs-0.010
ions-0.009
otion-0.009
ase-0.009
Token you
Token ID345
Feature activation+1.68e-02

Pos logit contributions
ays+0.040
overs+0.033
otion+0.032
ase+0.031
es+0.030
Neg logit contributions
 aware-0.036
 gentle-0.035
 understand-0.035
 active-0.034
 calm-0.034
Token."
Token ID526
Feature activation+2.88e-03

Pos logit contributions
ays+0.032
overs+0.027
es+0.025
ions+0.025
ase+0.025
Neg logit contributions
 aware-0.031
appointed-0.030
 understand-0.029
 active-0.029
 cautious-0.029
Token\n
Token ID198
Feature activation-4.44e-03

Pos logit contributions
ays+0.007
ions+0.007
overs+0.007
otion+0.006
kins+0.006
Neg logit contributions
 understand-0.003
 aware-0.003
appointed-0.003
 calm-0.003
 gentle-0.003
Token lady
Token ID10846
Feature activation+3.45e-02

Pos logit contributions
ays+0.026
overs+0.023
es+0.022
ions+0.021
ase+0.021
Neg logit contributions
 aware-0.015
appointed-0.015
 understand-0.014
 active-0.014
 cautious-0.013
Token with
Token ID351
Feature activation+8.83e-02

Pos logit contributions
ays+0.075
overs+0.062
ions+0.062
anger+0.061
ase+0.060
Neg logit contributions
 aware-0.054
 calm-0.052
 understand-0.051
 gentle-0.050
 cautious-0.048
Token a
Token ID257
Feature activation+2.81e-02

Pos logit contributions
ays+0.233
ase+0.204
overs+0.203
otion+0.197
es+0.195
Neg logit contributions
 gentle-0.098
 aware-0.097
 understand-0.096
 calm-0.094
 pleasant-0.089
Token badge
Token ID23009
Feature activation+4.30e-02

Pos logit contributions
ays+0.058
overs+0.051
es+0.047
ase+0.047
ions+0.047
Neg logit contributions
 aware-0.047
appointed-0.046
 active-0.043
 understand-0.043
 cautious-0.042
Token that
Token ID326
Feature activation+1.21e-01

Pos logit contributions
ays+0.107
overs+0.098
ions+0.094
acey+0.093
kins+0.091
Neg logit contributions
 aware-0.048
 understand-0.048
 calm-0.045
 gentle-0.044
appointed-0.043
Token says
Token ID1139
Feature activation+1.62e-01

Pos logit contributions
ays+0.259
overs+0.242
ase+0.224
ions+0.219
otion+0.208
Neg logit contributions
 understand-0.186
 gentle-0.184
 calm-0.180
 nod-0.174
 listen-0.172
Token "
Token ID366
Feature activation-2.42e-02

Pos logit contributions
ays+0.405
overs+0.335
ase+0.322
ions+0.311
acey+0.309
Neg logit contributions
 aware-0.213
 understand-0.212
 gentle-0.208
 cautious-0.199
 calm-0.196
TokenM
Token ID44
Feature activation+8.18e-03

Pos logit contributions
 aware+0.032
appointed+0.030
 understand+0.030
 active+0.029
 cautious+0.028
Neg logit contributions
ays-0.059
overs-0.052
ions-0.049
otion-0.049
es-0.049
Tokenuseum
Token ID6744
Feature activation-2.20e-02

Pos logit contributions
ays+0.015
overs+0.013
es+0.012
ions+0.012
otion+0.012
Neg logit contributions
 aware-0.015
appointed-0.014
 understand-0.014
 cautious-0.014
 active-0.013
Token Staff
Token ID9983
Feature activation+5.96e-02

Pos logit contributions
 aware+0.029
appointed+0.029
 understand+0.028
 gentle+0.027
 calm+0.027
Neg logit contributions
ays-0.046
kins-0.039
overs-0.039
ocket-0.038
otion-0.038
Token".
Token ID1911
Feature activation-1.67e-02

Pos logit contributions
ays+0.131
overs+0.108
otion+0.103
kins+0.102
ase+0.102
Neg logit contributions
 aware-0.083
 understand-0.079
 approach-0.075
 listen-0.075
 gentle-0.074
Token'm
Token ID1101
Feature activation-4.04e-02

Pos logit contributions
 aware+0.056
appointed+0.053
 active+0.051
 understand+0.050
 cautious+0.049
Neg logit contributions
ays-0.107
overs-0.094
es-0.090
ions-0.089
otion-0.088
Token not
Token ID407
Feature activation-2.82e-02

Pos logit contributions
 aware+0.052
appointed+0.047
 understand+0.047
 active+0.046
 cautious+0.046
Neg logit contributions
ays-0.102
overs-0.090
es-0.086
ions-0.085
ase-0.085
Token sure
Token ID1654
Feature activation+2.07e-02

Pos logit contributions
appointed+0.033
 aware+0.033
 understand+0.030
 active+0.029
 cautious+0.028
Neg logit contributions
ays-0.070
overs-0.064
es-0.060
otion-0.059
ions-0.059
Token what
Token ID644
Feature activation+5.16e-02

Pos logit contributions
ays+0.041
overs+0.035
ions+0.033
es+0.032
ase+0.032
Neg logit contributions
 aware-0.037
appointed-0.035
 understand-0.035
 active-0.034
 cautious-0.034
Token it
Token ID340
Feature activation+3.94e-02

Pos logit contributions
ays+0.125
overs+0.106
otion+0.101
ase+0.101
ions+0.100
Neg logit contributions
 aware-0.071
 understand-0.069
 gentle-0.067
 active-0.067
 cautious-0.064
Token says
Token ID1139
Feature activation+1.61e-01

Pos logit contributions
ays+0.071
overs+0.062
ions+0.056
ase+0.055
es+0.055
Neg logit contributions
 aware-0.077
appointed-0.073
 understand-0.073
 active-0.071
 cautious-0.070
Token.
Token ID13
Feature activation-3.23e-04

Pos logit contributions
ays+0.383
overs+0.319
ase+0.318
ions+0.314
ls+0.302
Neg logit contributions
 understand-0.211
 aware-0.199
 calm-0.192
 gentle-0.191
 listen-0.189
Token It
Token ID632
Feature activation-1.68e-02

Pos logit contributions
 aware+0.001
appointed+0.001
 understand+0.001
 active+0.001
 cautious+0.001
Neg logit contributions
ays-0.001
overs-0.000
ions-0.000
es-0.000
ase-0.000
Token looks
Token ID3073
Feature activation+1.87e-02

Pos logit contributions
 aware+0.029
 understand+0.028
appointed+0.028
 gentle+0.027
 active+0.027
Neg logit contributions
ays-0.034
overs-0.030
ions-0.028
ase-0.027
otion-0.027
Token like
Token ID588
Feature activation+3.68e-02

Pos logit contributions
ays+0.050
overs+0.046
es+0.043
ions+0.043
otion+0.042
Neg logit contributions
 aware-0.019
appointed-0.018
 understand-0.016
 active-0.016
 cautious-0.016
Token a
Token ID257
Feature activation+4.12e-02

Pos logit contributions
ays+0.092
overs+0.080
ase+0.077
es+0.076
otion+0.076
Neg logit contributions
 aware-0.048
 understand-0.044
 gentle-0.044
 active-0.043
 calm-0.043
Token they
Token ID484
Feature activation+1.70e-02

Pos logit contributions
 calm+0.033
 understand+0.033
 aware+0.033
 gentle+0.033
 active+0.032
Neg logit contributions
ays-0.026
ase-0.021
overs-0.020
ocket-0.019
otion-0.019
Token see
Token ID766
Feature activation+6.69e-03

Pos logit contributions
ays+0.046
overs+0.041
es+0.039
ions+0.039
ase+0.039
Neg logit contributions
 aware-0.019
 understand-0.018
 calm-0.017
 active-0.017
appointed-0.017
Token a
Token ID257
Feature activation+8.63e-03

Pos logit contributions
ays+0.016
overs+0.014
ions+0.013
es+0.013
ase+0.013
Neg logit contributions
 aware-0.009
appointed-0.009
 understand-0.009
 active-0.009
 cautious-0.008
Token sign
Token ID1051
Feature activation+2.52e-02

Pos logit contributions
ays+0.020
overs+0.018
ase+0.017
ions+0.017
otion+0.017
Neg logit contributions
 aware-0.012
appointed-0.011
 understand-0.011
 active-0.010
 cautious-0.010
Token that
Token ID326
Feature activation+1.23e-01

Pos logit contributions
ays+0.055
overs+0.051
acey+0.049
es+0.047
ions+0.047
Neg logit contributions
 calm-0.036
 gentle-0.035
 understand-0.034
 aware-0.033
 exciting-0.032
Token says
Token ID1139
Feature activation+1.61e-01

Pos logit contributions
ays+0.261
overs+0.246
ase+0.230
ions+0.222
es+0.214
Neg logit contributions
 calm-0.195
 understand-0.194
 gentle-0.190
 exciting-0.179
 aware-0.178
Token "
Token ID366
Feature activation-2.57e-02

Pos logit contributions
ays+0.418
overs+0.348
ase+0.330
acey+0.327
ions+0.327
Neg logit contributions
 understand-0.199
 gentle-0.183
 aware-0.179
 calm-0.177
 wise-0.176
TokenDo
Token ID5211
Feature activation+2.34e-02

Pos logit contributions
 aware+0.036
appointed+0.034
 understand+0.034
 active+0.033
 cautious+0.032
Neg logit contributions
ays-0.060
overs-0.053
ions-0.050
es-0.050
otion-0.050
Token not
Token ID407
Feature activation-2.69e-02

Pos logit contributions
ays+0.057
overs+0.050
ions+0.047
ase+0.047
es+0.047
Neg logit contributions
 aware-0.031
appointed-0.030
 understand-0.030
 active-0.029
 gentle-0.029
Token go
Token ID467
Feature activation+4.44e-03

Pos logit contributions
 aware+0.038
appointed+0.038
 active+0.035
 cautious+0.035
 pleasant+0.033
Neg logit contributions
ays-0.060
overs-0.054
es-0.050
ions-0.050
otion-0.050
Token in
Token ID287
Feature activation-7.11e-03

Pos logit contributions
ays+0.010
overs+0.008
ions+0.008
es+0.008
ase+0.008
Neg logit contributions
 aware-0.007
appointed-0.007
 understand-0.006
 active-0.006
 calm-0.006
Token is
Token ID318
Feature activation-3.86e-03

Pos logit contributions
ays+0.007
overs+0.006
ions+0.006
es+0.006
ase+0.006
Neg logit contributions
 aware-0.005
appointed-0.005
 understand-0.005
 active-0.005
 cautious-0.004
Token holding
Token ID4769
Feature activation+1.83e-02

Pos logit contributions
 aware+0.005
appointed+0.005
 understand+0.005
 active+0.004
 cautious+0.004
Neg logit contributions
ays-0.009
overs-0.008
ions-0.008
es-0.008
otion-0.008
Token a
Token ID257
Feature activation+7.88e-03

Pos logit contributions
ays+0.040
overs+0.034
ase+0.033
otion+0.033
ions+0.032
Neg logit contributions
 aware-0.029
appointed-0.027
 understand-0.027
 active-0.027
 gentle-0.026
Token sign
Token ID1051
Feature activation+1.83e-02

Pos logit contributions
ays+0.015
overs+0.013
ase+0.012
ions+0.012
es+0.012
Neg logit contributions
 aware-0.014
appointed-0.014
 understand-0.013
 cautious-0.013
 active-0.013
Token that
Token ID326
Feature activation+1.25e-01

Pos logit contributions
ays+0.040
overs+0.034
otion+0.033
ions+0.033
ase+0.032
Neg logit contributions
 aware-0.028
 understand-0.027
 gentle-0.027
 calm-0.027
 active-0.026
Token says
Token ID1139
Feature activation+1.61e-01

Pos logit contributions
ays+0.262
overs+0.231
ase+0.213
ions+0.210
otion+0.209
Neg logit contributions
 aware-0.204
 understand-0.203
 calm-0.196
 gentle-0.196
appointed-0.193
Token "
Token ID366
Feature activation-2.73e-02

Pos logit contributions
ays+0.388
overs+0.322
ase+0.308
ions+0.299
otion+0.296
Neg logit contributions
 aware-0.219
 understand-0.219
 gentle-0.213
 calm-0.206
 cautious-0.202
TokenMr
Token ID5246
Feature activation+2.92e-02

Pos logit contributions
 aware+0.038
appointed+0.037
 understand+0.036
 active+0.035
 cautious+0.034
Neg logit contributions
ays-0.064
overs-0.056
ions-0.053
es-0.052
otion-0.052
Token.
Token ID13
Feature activation+3.76e-02

Pos logit contributions
ays+0.062
overs+0.051
otion+0.050
ions+0.050
ase+0.047
Neg logit contributions
 aware-0.048
 understand-0.046
 cautious-0.044
 active-0.043
appointed-0.043
Token Smith
Token ID4176
Feature activation+1.34e-02

Pos logit contributions
ays+0.076
overs+0.064
ons+0.061
kins+0.061
ions+0.061
Neg logit contributions
 understand-0.059
 aware-0.059
 nod-0.056
 approach-0.056
 cautious-0.055
Token".
Token ID1911
Feature activation-1.79e-02

Pos logit contributions
ays+0.027
overs+0.023
otion+0.022
ions+0.021
ase+0.021
Neg logit contributions
 aware-0.023
 understand-0.021
 gentle-0.021
 cautious-0.020
 active-0.020
Token They
Token ID1119
Feature activation+1.16e-02

Pos logit contributions
ays+0.025
overs+0.021
ions+0.020
es+0.020
ase+0.019
Neg logit contributions
 aware-0.028
appointed-0.027
 understand-0.026
 active-0.026
 cautious-0.026
Token see
Token ID766
Feature activation+6.79e-03

Pos logit contributions
ays+0.025
overs+0.023
es+0.022
acey+0.022
otion+0.022
Neg logit contributions
 aware-0.016
appointed-0.016
 active-0.015
 pleasant-0.014
 cautious-0.014
Token a
Token ID257
Feature activation+1.53e-02

Pos logit contributions
ays+0.016
overs+0.015
ase+0.014
ions+0.014
otion+0.014
Neg logit contributions
 aware-0.009
 understand-0.009
 gentle-0.008
 calm-0.008
 cautious-0.008
Token sign
Token ID1051
Feature activation+2.54e-02

Pos logit contributions
ays+0.037
overs+0.033
es+0.031
ions+0.031
ase+0.031
Neg logit contributions
 aware-0.020
appointed-0.019
 understand-0.019
 active-0.018
 cautious-0.018
Token that
Token ID326
Feature activation+1.27e-01

Pos logit contributions
ays+0.055
overs+0.048
acey+0.047
ase+0.045
ions+0.045
Neg logit contributions
 gentle-0.039
 aware-0.038
 calm-0.038
 understand-0.038
 cautious-0.036
Token says
Token ID1139
Feature activation+1.61e-01

Pos logit contributions
ays+0.251
overs+0.231
ase+0.218
ions+0.205
acey+0.197
Neg logit contributions
 understand-0.221
 calm-0.213
 gentle-0.212
 aware-0.210
 listen-0.198
Token "
Token ID366
Feature activation-2.31e-02

Pos logit contributions
ays+0.394
overs+0.333
ase+0.313
ions+0.306
acey+0.303
Neg logit contributions
 aware-0.222
 understand-0.216
 gentle-0.207
 cautious-0.204
 calm-0.198
TokenAd
Token ID2782
Feature activation+3.63e-02

Pos logit contributions
 aware+0.032
appointed+0.030
 understand+0.030
 active+0.029
 cautious+0.029
Neg logit contributions
ays-0.055
overs-0.048
ions-0.045
es-0.045
otion-0.045
Tokenopt
Token ID8738
Feature activation+1.53e-02

Pos logit contributions
ays+0.064
overs+0.053
otion+0.051
ase+0.049
kins+0.048
Neg logit contributions
 aware-0.071
 cautious-0.065
 gentle-0.064
 active-0.064
 understand-0.064
Token a
Token ID257
Feature activation+3.35e-02

Pos logit contributions
ays+0.034
overs+0.028
otion+0.028
ase+0.027
acey+0.026
Neg logit contributions
 aware-0.024
 gentle-0.023
 calm-0.022
 understand-0.022
appointed-0.022
Token Panda
Token ID41112
Feature activation+2.16e-02

Pos logit contributions
ays+0.073
overs+0.065
es+0.061
ions+0.061
ase+0.061
Neg logit contributions
 aware-0.051
appointed-0.049
 understand-0.046
 active-0.046
 cautious-0.045
Token.
Token ID13
Feature activation-2.95e-04

Pos logit contributions
 aware+0.039
appointed+0.037
 understand+0.037
 active+0.036
 cautious+0.036
Neg logit contributions
ays-0.048
overs-0.041
ions-0.038
ase-0.038
otion-0.038
Token\x20
Token ID220
Feature activation-9.98e-03

Pos logit contributions
 aware+0.000
appointed+0.000
 understand+0.000
 active+0.000
 cautious+0.000
Neg logit contributions
ays-0.001
overs-0.001
ions-0.001
es-0.001
otion-0.001
Token\n
Token ID198
Feature activation-1.75e-02

Pos logit contributions
 understand+0.011
 aware+0.011
appointed+0.010
 calm+0.009
 gentle+0.009
Neg logit contributions
ays-0.026
overs-0.023
ocket-0.022
ions-0.022
ase-0.022
Token\n
Token ID198
Feature activation-2.57e-02

Pos logit contributions
 aware+0.032
appointed+0.030
 active+0.029
 understand+0.029
 cautious+0.029
Neg logit contributions
ays-0.034
overs-0.029
es-0.028
ions-0.027
ase-0.027
TokenThe
Token ID464
Feature activation-3.10e-02

Pos logit contributions
 aware+0.034
 understand+0.031
 matter+0.031
 pleasant+0.031
 hesitate+0.031
Neg logit contributions
ays-0.055
overs-0.052
es-0.050
ions-0.050
ase-0.049
Token nos
Token ID43630
Feature activation-1.80e-01

Pos logit contributions
 aware+0.043
appointed+0.041
 understand+0.037
 active+0.035
 cautious+0.035
Neg logit contributions
ays-0.068
overs-0.065
otion-0.060
es-0.059
kins-0.059
Tokeny
Token ID88
Feature activation-7.15e-02

Pos logit contributions
 aware+0.281
 understand+0.264
appointed+0.256
 active+0.255
 cautious+0.247
Neg logit contributions
ays-0.378
overs-0.339
ions-0.315
es-0.314
otion-0.309
Token person
Token ID1048
Feature activation+5.24e-03

Pos logit contributions
 aware+0.112
appointed+0.107
 understand+0.104
 active+0.102
 cautious+0.100
Neg logit contributions
ays-0.157
overs-0.138
es-0.130
ions-0.128
ase-0.128
Token then
Token ID788
Feature activation-1.78e-02

Pos logit contributions
ays+0.012
overs+0.010
es+0.010
ions+0.010
ase+0.009
Neg logit contributions
 aware-0.008
appointed-0.008
 understand-0.007
 active-0.007
 cautious-0.007
Token tried
Token ID3088
Feature activation-5.33e-02

Pos logit contributions
 aware+0.028
appointed+0.026
 understand+0.026
 active+0.025
 cautious+0.025
Neg logit contributions
ays-0.040
overs-0.035
ions-0.033
ase-0.032
es-0.032
Token to
Token ID284
Feature activation-6.82e-02

Pos logit contributions
appointed+0.103
rated+0.096
 active+0.093
 pleasant+0.091
 aware+0.090
Neg logit contributions
ays-0.084
es-0.083
overs-0.074
ions-0.073
 they-0.072
TokenIt
Token ID1026
Feature activation-8.30e-03

Pos logit contributions
 aware+0.051
appointed+0.047
 understand+0.045
 active+0.044
 cautious+0.043
Neg logit contributions
ays-0.061
overs-0.052
ions-0.051
ase-0.049
otion-0.048
Token's
Token ID338
Feature activation+2.14e-02

Pos logit contributions
 aware+0.017
appointed+0.016
 understand+0.016
 active+0.016
 cautious+0.015
Neg logit contributions
ays-0.014
es-0.012
overs-0.010
otion-0.010
ls-0.010
Token ok
Token ID12876
Feature activation-5.83e-02

Pos logit contributions
ays+0.046
overs+0.041
ions+0.039
es+0.037
otion+0.037
Neg logit contributions
 aware-0.029
 understand-0.027
appointed-0.026
 hesitate-0.026
 nod-0.026
Token,"
Token ID553
Feature activation+1.65e-03

Pos logit contributions
 aware+0.109
appointed+0.105
 understand+0.102
 active+0.101
 cautious+0.100
Neg logit contributions
ays-0.106
overs-0.093
ions-0.087
es-0.086
otion-0.084
Token the
Token ID262
Feature activation-4.78e-02

Pos logit contributions
ays+0.003
overs+0.002
es+0.002
ions+0.002
otion+0.002
Neg logit contributions
 aware-0.003
appointed-0.003
 active-0.003
 understand-0.003
 cautious-0.003
Token nos
Token ID43630
Feature activation-1.79e-01

Pos logit contributions
 aware+0.072
 understand+0.062
appointed+0.059
 active+0.059
 cautious+0.059
Neg logit contributions
ays-0.102
overs-0.092
ase-0.091
otion-0.089
es-0.086
Tokeny
Token ID88
Feature activation-7.79e-02

Pos logit contributions
 aware+0.322
appointed+0.308
 understand+0.303
 active+0.299
 cautious+0.293
Neg logit contributions
ays-0.348
overs-0.300
es-0.279
ions-0.278
ase-0.272
Token girl
Token ID2576
Feature activation-4.79e-03

Pos logit contributions
 aware+0.121
appointed+0.113
 understand+0.108
 active+0.107
 cautious+0.104
Neg logit contributions
ays-0.159
overs-0.142
es-0.139
ase-0.136
otion-0.131
Token said
Token ID531
Feature activation+8.73e-02

Pos logit contributions
appointed+0.010
 aware+0.010
 active+0.010
 pleasant+0.010
 steep+0.010
Neg logit contributions
ays-0.006
es-0.006
overs-0.005
acey-0.005
ls-0.005
Token.
Token ID13
Feature activation-1.05e-03

Pos logit contributions
ays+0.148
overs+0.135
es+0.127
otion+0.117
ions+0.115
Neg logit contributions
 aware-0.168
appointed-0.163
 active-0.157
 pleasant-0.156
 understand-0.155
Token "
Token ID366
Feature activation-6.48e-02

Pos logit contributions
 aware+0.002
appointed+0.002
 active+0.002
 understand+0.002
 cautious+0.002
Neg logit contributions
ays-0.002
overs-0.002
es-0.002
ions-0.002
otion-0.002
Token happy
Token ID3772
Feature activation-1.11e-02

Pos logit contributions
appointed+0.024
 aware+0.024
 cautious+0.021
 active+0.021
 understand+0.021
Neg logit contributions
ays-0.043
overs-0.039
es-0.036
ions-0.036
otion-0.035
Token.
Token ID13
Feature activation+2.27e-03

Pos logit contributions
 aware+0.020
appointed+0.020
 active+0.019
 understand+0.019
 cautious+0.019
Neg logit contributions
ays-0.021
overs-0.018
es-0.018
ions-0.017
otion-0.016
Token\n
Token ID198
Feature activation-1.15e-02

Pos logit contributions
ays+0.005
overs+0.005
otion+0.005
ions+0.005
es+0.005
Neg logit contributions
appointed-0.003
 aware-0.003
 understand-0.003
 active-0.003
 cautious-0.003
Token\n
Token ID198
Feature activation-1.91e-02

Pos logit contributions
 aware+0.017
 understand+0.016
appointed+0.016
 gentle+0.015
 active+0.015
Neg logit contributions
ays-0.027
overs-0.024
ions-0.022
otion-0.022
ase-0.022
TokenThe
Token ID464
Feature activation-2.95e-02

Pos logit contributions
 aware+0.020
appointed+0.019
 active+0.017
 understand+0.017
 cautious+0.017
Neg logit contributions
ays-0.050
overs-0.045
ions-0.044
es-0.044
ase-0.043
Token nos
Token ID43630
Feature activation-1.78e-01

Pos logit contributions
 aware+0.048
appointed+0.046
 understand+0.045
 active+0.044
 cautious+0.043
Neg logit contributions
ays-0.063
overs-0.055
ions-0.051
es-0.051
ase-0.051
Tokeny
Token ID88
Feature activation-7.09e-02

Pos logit contributions
 aware+0.272
 understand+0.258
appointed+0.253
 active+0.249
 cautious+0.240
Neg logit contributions
ays-0.373
overs-0.334
es-0.312
ions-0.310
otion-0.306
Token metal
Token ID6147
Feature activation-6.70e-02

Pos logit contributions
 aware+0.114
appointed+0.108
 understand+0.105
 active+0.104
 cautious+0.102
Neg logit contributions
ays-0.153
overs-0.134
es-0.125
ions-0.125
ase-0.124
Token robot
Token ID9379
Feature activation+1.94e-02

Pos logit contributions
 aware+0.118
appointed+0.113
 understand+0.111
 active+0.109
 cautious+0.107
Neg logit contributions
ays-0.133
overs-0.115
ions-0.107
es-0.107
ase-0.106
Token realized
Token ID6939
Feature activation+3.32e-02

Pos logit contributions
ays+0.040
overs+0.035
es+0.034
ase+0.033
ions+0.033
Neg logit contributions
 aware-0.032
appointed-0.031
 understand-0.029
 active-0.029
 cautious-0.028
Token that
Token ID326
Feature activation+7.63e-02

Pos logit contributions
ays+0.064
overs+0.054
ase+0.051
otion+0.048
ions+0.048
Neg logit contributions
 aware-0.063
 understand-0.059
appointed-0.058
 cautious-0.057
 gentle-0.057
Token around
Token ID1088
Feature activation-4.01e-02

Pos logit contributions
ays+0.070
overs+0.061
es+0.057
ions+0.057
otion+0.056
Neg logit contributions
 aware-0.051
appointed-0.049
 understand-0.048
 active-0.047
 cautious-0.046
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
 aware+0.061
 understand+0.058
 calm+0.057
 active+0.056
 gentle+0.056
Neg logit contributions
ays-0.090
overs-0.074
ase-0.073
ions-0.072
es-0.071
Token\n
Token ID198
Feature activation-2.28e-02

Pos logit contributions
 aware+0.014
appointed+0.014
 understand+0.014
 active+0.013
 gentle+0.013
Neg logit contributions
ays-0.023
overs-0.020
es-0.019
otion-0.019
ions-0.019
Token\n
Token ID198
Feature activation-3.03e-02

Pos logit contributions
 aware+0.033
 understand+0.033
appointed+0.032
 active+0.030
 calm+0.030
Neg logit contributions
ays-0.052
overs-0.046
ions-0.043
es-0.043
otion-0.042
TokenThe
Token ID464
Feature activation-2.75e-02

Pos logit contributions
 aware+0.039
appointed+0.034
 cautious+0.034
 hesitate+0.033
 active+0.033
Neg logit contributions
ays-0.068
ions-0.064
ase-0.064
overs-0.064
otion-0.061
Token nos
Token ID43630
Feature activation-1.78e-01

Pos logit contributions
 aware+0.040
appointed+0.037
 understand+0.036
 cautious+0.034
 active+0.033
Neg logit contributions
ays-0.056
overs-0.055
ase-0.051
kins-0.050
otion-0.050
Tokeny
Token ID88
Feature activation-7.38e-02

Pos logit contributions
 aware+0.294
 understand+0.276
appointed+0.272
 active+0.270
 cautious+0.261
Neg logit contributions
ays-0.363
overs-0.321
es-0.298
ions-0.298
otion-0.291
Token cart
Token ID6383
Feature activation-3.58e-02

Pos logit contributions
 aware+0.144
appointed+0.140
 understand+0.139
 cautious+0.131
 pleasant+0.130
Neg logit contributions
ays-0.125
overs-0.115
ase-0.107
es-0.102
otion-0.098
Token then
Token ID788
Feature activation-9.18e-03

Pos logit contributions
 aware+0.065
appointed+0.061
 understand+0.061
 active+0.060
 pleasant+0.060
Neg logit contributions
ays-0.066
ase-0.059
overs-0.058
es-0.057
ions-0.053
Token made
Token ID925
Feature activation+3.18e-03

Pos logit contributions
 aware+0.014
appointed+0.013
 understand+0.013
 active+0.013
 cautious+0.013
Neg logit contributions
ays-0.021
overs-0.018
ions-0.017
es-0.017
ase-0.017
Token its
Token ID663
Feature activation-7.45e-03

Pos logit contributions
ays+0.007
overs+0.006
ions+0.006
es+0.006
ase+0.006
Neg logit contributions
 aware-0.005
appointed-0.004
 understand-0.004
 active-0.004
 cautious-0.004
Token.
Token ID13
Feature activation-9.30e-03

Pos logit contributions
 aware+0.010
appointed+0.010
 understand+0.009
 active+0.009
 cautious+0.009
Neg logit contributions
ays-0.013
overs-0.012
ions-0.011
es-0.011
ase-0.011
Token\x20
Token ID220
Feature activation-1.86e-02

Pos logit contributions
 aware+0.014
appointed+0.013
 understand+0.013
 active+0.012
 cautious+0.012
Neg logit contributions
ays-0.021
overs-0.019
ions-0.018
otion-0.018
ase-0.017
Token\n
Token ID198
Feature activation-2.37e-02

Pos logit contributions
 understand+0.022
 aware+0.021
appointed+0.020
 calm+0.019
 gentle+0.018
Neg logit contributions
ays-0.047
overs-0.043
ocket-0.041
ions-0.040
ase-0.040
Token\n
Token ID198
Feature activation-3.09e-02

Pos logit contributions
 aware+0.041
appointed+0.039
 active+0.038
 understand+0.038
 cautious+0.037
Neg logit contributions
ays-0.048
overs-0.041
es-0.039
ions-0.038
ase-0.038
TokenThe
Token ID464
Feature activation-2.86e-02

Pos logit contributions
 aware+0.043
 former+0.041
appointed+0.041
 itch+0.041
 pleasant+0.040
Neg logit contributions
es-0.057
ays-0.054
overs-0.053
ons-0.051
ions-0.051
Token nos
Token ID43630
Feature activation-1.78e-01

Pos logit contributions
 aware+0.046
appointed+0.043
 understand+0.042
 active+0.041
 cautious+0.041
Neg logit contributions
ays-0.063
overs-0.055
ions-0.051
es-0.051
otion-0.051
Tokeny
Token ID88
Feature activation-7.32e-02

Pos logit contributions
 aware+0.270
 understand+0.258
appointed+0.249
 active+0.247
 calm+0.239
Neg logit contributions
ays-0.372
overs-0.337
ions-0.314
es-0.310
acey-0.306
Token metal
Token ID6147
Feature activation-6.79e-02

Pos logit contributions
 aware+0.128
appointed+0.123
 understand+0.118
 active+0.115
 cautious+0.115
Neg logit contributions
ays-0.146
overs-0.128
es-0.120
ase-0.120
otion-0.117
Token robot
Token ID9379
Feature activation+5.97e-03

Pos logit contributions
 aware+0.123
appointed+0.118
 understand+0.117
 active+0.115
 cautious+0.113
Neg logit contributions
ays-0.133
overs-0.114
ions-0.106
es-0.104
otion-0.103
Token thought
Token ID1807
Feature activation+4.83e-02

Pos logit contributions
ays+0.012
es+0.010
overs+0.010
ase+0.010
ions+0.010
Neg logit contributions
 aware-0.010
appointed-0.009
 understand-0.009
 active-0.009
 pleasant-0.009
Token it
Token ID340
Feature activation+1.95e-02

Pos logit contributions
ays+0.107
overs+0.091
ase+0.089
acey+0.081
es+0.081
Neg logit contributions
 understand-0.076
 aware-0.075
 gentle-0.073
 calm-0.071
 active-0.071
Token The
Token ID383
Feature activation-1.06e-02

Pos logit contributions
ays+0.017
overs+0.015
ions+0.015
ase+0.014
es+0.014
Neg logit contributions
 aware-0.009
appointed-0.009
 understand-0.009
 calm-0.008
 cautious-0.008
Token puppy
Token ID26188
Feature activation+2.98e-02

Pos logit contributions
 aware+0.015
appointed+0.014
 understand+0.013
 active+0.013
 cautious+0.013
Neg logit contributions
ays-0.023
overs-0.022
es-0.020
otion-0.020
ase-0.020
Token reluctantly
Token ID35462
Feature activation+9.56e-03

Pos logit contributions
ays+0.057
overs+0.049
es+0.049
ase+0.046
ions+0.046
Neg logit contributions
 aware-0.052
appointed-0.051
 understand-0.050
 active-0.049
 cautious-0.047
Token let
Token ID1309
Feature activation-1.43e-02

Pos logit contributions
ays+0.021
overs+0.018
es+0.017
ions+0.017
ase+0.017
Neg logit contributions
 aware-0.015
appointed-0.014
 understand-0.014
 active-0.014
 cautious-0.013
Token the
Token ID262
Feature activation-4.13e-02

Pos logit contributions
 understand+0.020
 calm+0.020
 aware+0.019
 gentle+0.018
 resist+0.018
Neg logit contributions
ays-0.036
ase-0.030
ions-0.029
overs-0.029
es-0.028
Token nos
Token ID43630
Feature activation-1.77e-01

Pos logit contributions
 aware+0.060
appointed+0.057
 understand+0.057
 active+0.056
 calm+0.055
Neg logit contributions
ays-0.097
overs-0.084
ions-0.081
es-0.079
ase-0.079
Tokeny
Token ID88
Feature activation-8.06e-02

Pos logit contributions
 aware+0.267
 understand+0.253
appointed+0.244
 active+0.237
 calm+0.235
Neg logit contributions
ays-0.380
overs-0.342
acey-0.321
ions-0.320
ase-0.316
Token person
Token ID1048
Feature activation-2.05e-03

Pos logit contributions
 aware+0.110
 understand+0.107
 calm+0.103
 gentle+0.102
 active+0.102
Neg logit contributions
ays-0.204
ions-0.173
overs-0.171
otion-0.163
ase-0.163
Token pet
Token ID4273
Feature activation+2.11e-02

Pos logit contributions
 aware+0.003
 understand+0.003
 calm+0.003
 gentle+0.003
 active+0.003
Neg logit contributions
ays-0.005
overs-0.004
ase-0.004
ions-0.004
otion-0.004
Token them
Token ID606
Feature activation+4.05e-03

Pos logit contributions
ays+0.044
ase+0.035
otion+0.035
overs+0.035
acey+0.034
Neg logit contributions
 calm-0.034
 aware-0.034
 understand-0.033
 gentle-0.033
 cautious-0.031
Token.
Token ID13
Feature activation+1.15e-02

Pos logit contributions
ays+0.009
overs+0.007
ions+0.007
es+0.007
ase+0.007
Neg logit contributions
 aware-0.007
appointed-0.006
 understand-0.006
 active-0.006
 cautious-0.006
Token together
Token ID1978
Feature activation+4.95e-02

Pos logit contributions
ays+0.038
overs+0.032
otion+0.030
ions+0.030
es+0.030
Neg logit contributions
 aware-0.030
 understand-0.029
appointed-0.029
 active-0.028
 gentle-0.028
Token."
Token ID526
Feature activation-2.67e-02

Pos logit contributions
ays+0.085
overs+0.073
es+0.069
ions+0.068
ase+0.065
Neg logit contributions
 aware-0.099
appointed-0.097
 understand-0.092
 cautious-0.092
 active-0.092
Token
Token ID198
Feature activation-2.83e-02

Pos logit contributions
 aware+0.043
appointed+0.040
 understand+0.039
 active+0.039
 cautious+0.038
Neg logit contributions
ays-0.059
overs-0.051
es-0.048
ions-0.048
ase-0.048
Token
Token ID198
Feature activation-3.32e-02

Pos logit contributions
 aware+0.049
appointed+0.046
 understand+0.045
 active+0.045
 cautious+0.044
Neg logit contributions
ays-0.058
overs-0.050
es-0.047
ions-0.047
ase-0.046
TokenThe
Token ID464
Feature activation-3.22e-02

Pos logit contributions
 aware+0.047
 hesitate+0.043
 understand+0.043
appointed+0.043
 itch+0.043
Neg logit contributions
es-0.062
ays-0.059
overs-0.058
ase-0.057
ions-0.055
Token nos
Token ID43630
Feature activation-1.77e-01

Pos logit contributions
 aware+0.054
appointed+0.053
 understand+0.049
 cautious+0.046
 appreciate+0.046
Neg logit contributions
ays-0.059
overs-0.053
es-0.051
kins-0.050
ls-0.049
Tokeny
Token ID88
Feature activation-7.34e-02

Pos logit contributions
 aware+0.321
appointed+0.306
 understand+0.301
 active+0.298
 cautious+0.293
Neg logit contributions
ays-0.345
overs-0.296
ions-0.275
es-0.275
ase-0.270
Token bird
Token ID6512
Feature activation-3.84e-02

Pos logit contributions
appointed+0.123
 aware+0.122
 understand+0.117
 consider+0.109
 active+0.109
Neg logit contributions
ays-0.139
es-0.125
overs-0.124
ase-0.122
ls-0.118
Token was
Token ID373
Feature activation+6.25e-03

Pos logit contributions
appointed+0.073
 aware+0.071
 active+0.067
 understand+0.067
 pleasant+0.067
Neg logit contributions
ays-0.065
es-0.062
ase-0.052
ls-0.051
ions-0.050
Token excited
Token ID6568
Feature activation-2.93e-02

Pos logit contributions
ays+0.012
overs+0.011
es+0.011
otion+0.010
ions+0.010
Neg logit contributions
 matter-0.009
appointed-0.009
 aware-0.008
 approach-0.008
 perspective-0.008
Token and
Token ID290
Feature activation-5.62e-03

Pos logit contributions
 aware+0.046
appointed+0.046
 active+0.043
 understand+0.043
 cautious+0.043
Neg logit contributions
ays-0.062
overs-0.055
es-0.052
ions-0.051
otion-0.050
Token replied
Token ID8712
Feature activation+8.25e-02

Pos logit contributions
 aware+0.092
appointed+0.089
 active+0.087
 understand+0.085
 pleasant+0.085
Neg logit contributions
ays-0.083
es-0.070
overs-0.070
ase-0.066
ions-0.064
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
ays+0.130
overs+0.121
es+0.113
otion+0.101
ions+0.101
Neg logit contributions
 aware-0.167
appointed-0.163
 matter-0.156
 understand-0.156
 pleasant-0.155
Token
Token ID198
Feature activation-2.69e-02

Pos logit contributions
 aware+0.035
 matter+0.033
 understand+0.033
 active+0.032
appointed+0.032
Neg logit contributions
ays-0.029
overs-0.025
es-0.023
ase-0.023
ions-0.022
Token
Token ID198
Feature activation-3.28e-02

Pos logit contributions
 aware+0.045
appointed+0.043
 understand+0.042
 active+0.042
 cautious+0.041
Neg logit contributions
ays-0.056
overs-0.048
es-0.045
ions-0.045
ase-0.045
TokenThe
Token ID464
Feature activation-3.07e-02

Pos logit contributions
 matter+0.064
 understand+0.063
 aware+0.063
 hesitate+0.061
 itch+0.061
Neg logit contributions
es-0.044
ase-0.043
ays-0.041
overs-0.040
ions-0.037
Token nos
Token ID43630
Feature activation-1.77e-01

Pos logit contributions
 aware+0.049
appointed+0.047
 understand+0.044
 active+0.043
 cautious+0.043
Neg logit contributions
ays-0.058
overs-0.056
ase-0.050
es-0.050
otion-0.049
Tokeny
Token ID88
Feature activation-7.41e-02

Pos logit contributions
 aware+0.324
appointed+0.310
 understand+0.305
 active+0.301
 cautious+0.296
Neg logit contributions
ays-0.340
overs-0.292
ions-0.271
es-0.270
ase-0.266
Token neighbor
Token ID4780
Feature activation-1.86e-02

Pos logit contributions
appointed+0.115
 aware+0.115
 understand+0.109
 active+0.104
 cautious+0.102
Neg logit contributions
ays-0.149
overs-0.140
ase-0.132
es-0.130
otion-0.124
Token asked
Token ID1965
Feature activation-9.35e-03

Pos logit contributions
 aware+0.033
appointed+0.032
 understand+0.031
 active+0.030
 cautious+0.030
Neg logit contributions
ays-0.035
overs-0.031
es-0.030
ase-0.029
ions-0.029
Token,
Token ID11
Feature activation-2.54e-02

Pos logit contributions
 aware+0.016
appointed+0.015
 active+0.015
 understand+0.015
 cautious+0.014
Neg logit contributions
ays-0.019
overs-0.017
es-0.016
ions-0.016
otion-0.016
Token "
Token ID366
Feature activation-7.41e-02

Pos logit contributions
appointed+0.052
 aware+0.051
 active+0.049
 understand+0.049
 matter+0.048
Neg logit contributions
ays-0.038
overs-0.035
es-0.033
ions-0.032
otion-0.032
Token!"
Token ID2474
Feature activation-2.63e-02

Pos logit contributions
ays+0.027
overs+0.023
ions+0.022
es+0.021
ase+0.021
Neg logit contributions
 aware-0.024
appointed-0.023
 understand-0.023
 active-0.022
 cautious-0.022
Token
Token ID220
Feature activation-2.29e-02

Pos logit contributions
 aware+0.042
appointed+0.040
 understand+0.040
 active+0.038
 cautious+0.038
Neg logit contributions
ays-0.058
overs-0.050
ions-0.047
es-0.046
ase-0.046
Token
Token ID198
Feature activation-2.64e-02

Pos logit contributions
 aware+0.032
 understand+0.031
appointed+0.030
 calm+0.028
 cautious+0.028
Neg logit contributions
ays-0.054
overs-0.048
ions-0.046
ase-0.045
otion-0.044
Token
Token ID198
Feature activation-3.17e-02

Pos logit contributions
 aware+0.048
appointed+0.046
 active+0.045
 understand+0.043
 cautious+0.043
Neg logit contributions
ays-0.050
overs-0.044
es-0.042
ions-0.041
ase-0.041
TokenThe
Token ID464
Feature activation-3.18e-02

Pos logit contributions
appointed+0.043
rated+0.040
 aware+0.040
 matter+0.039
 active+0.039
Neg logit contributions
ays-0.064
es-0.063
overs-0.062
ions-0.061
ase-0.061
Token nos
Token ID43630
Feature activation-1.76e-01

Pos logit contributions
 aware+0.048
appointed+0.046
 understand+0.044
 active+0.043
 cautious+0.042
Neg logit contributions
ays-0.070
overs-0.062
otion-0.058
es-0.058
ions-0.058
Tokeny
Token ID88
Feature activation-7.19e-02

Pos logit contributions
 aware+0.278
 understand+0.259
appointed+0.258
 active+0.252
 cautious+0.246
Neg logit contributions
ays-0.372
overs-0.331
ions-0.307
es-0.306
otion-0.299
Token girl
Token ID2576
Feature activation+2.48e-03

Pos logit contributions
 aware+0.104
appointed+0.099
 understand+0.096
 active+0.094
 cautious+0.092
Neg logit contributions
ays-0.163
overs-0.145
es-0.138
ase-0.136
otion-0.135
Token smiled
Token ID13541
Feature activation+4.87e-03

Pos logit contributions
ays+0.004
es+0.004
overs+0.004
ase+0.004
acey+0.004
Neg logit contributions
 aware-0.004
appointed-0.004
 active-0.004
 understand-0.004
 pleasant-0.004
Token.
Token ID13
Feature activation-9.47e-03

Pos logit contributions
ays+0.008
overs+0.007
es+0.007
otion+0.006
ls+0.006
Neg logit contributions
appointed-0.009
 matter-0.009
 aware-0.009
 active-0.009
Slow-0.008
Token She
Token ID1375
Feature activation+7.85e-04

Pos logit contributions
 aware+0.017
 active+0.017
appointed+0.017
 understand+0.016
 pleasant+0.016
Neg logit contributions
ays-0.018
overs-0.015
es-0.014
ions-0.014
ase-0.014
Token away
Token ID1497
Feature activation+3.92e-02

Pos logit contributions
 aware+0.018
appointed+0.017
 calm+0.017
 understand+0.017
 active+0.016
Neg logit contributions
ays-0.036
overs-0.031
ions-0.030
es-0.030
otion-0.030
Token.
Token ID13
Feature activation+2.60e-02

Pos logit contributions
ays+0.072
overs+0.063
es+0.060
ions+0.058
otion+0.056
Neg logit contributions
 aware-0.073
appointed-0.072
 active-0.068
 understand-0.068
 pleasant-0.067
Token
Token ID198
Feature activation+3.02e-03

Pos logit contributions
ays+0.063
overs+0.055
es+0.053
ions+0.053
otion+0.053
Neg logit contributions
 aware-0.033
 understand-0.033
appointed-0.033
 calm-0.033
 gentle-0.032
Token
Token ID198
Feature activation-6.07e-03

Pos logit contributions
ays+0.008
overs+0.007
ions+0.006
ase+0.006
otion+0.006
Neg logit contributions
 understand-0.004
 aware-0.004
 calm-0.004
appointed-0.003
 gentle-0.003
TokenThe
Token ID464
Feature activation-2.28e-02

Pos logit contributions
 aware+0.009
appointed+0.008
 understand+0.008
 active+0.008
 cautious+0.008
Neg logit contributions
ays-0.014
overs-0.013
ions-0.012
es-0.012
ase-0.012
Token nos
Token ID43630
Feature activation-1.75e-01

Pos logit contributions
 aware+0.034
appointed+0.032
 understand+0.031
 active+0.031
 cautious+0.030
Neg logit contributions
ays-0.052
overs-0.046
ions-0.043
es-0.043
otion-0.043
Tokeny
Token ID88
Feature activation-6.79e-02

Pos logit contributions
 aware+0.239
 understand+0.228
 active+0.219
 calm+0.219
appointed+0.212
Neg logit contributions
ays-0.367
overs-0.344
ocket-0.329
es-0.322
acey-0.321
Token duck
Token ID22045
Feature activation-3.30e-02

Pos logit contributions
 aware+0.110
 understand+0.103
 gentle+0.103
 calm+0.103
appointed+0.102
Neg logit contributions
ays-0.148
overs-0.127
ions-0.123
es-0.120
otion-0.117
Token did
Token ID750
Feature activation+4.98e-04

Pos logit contributions
 aware+0.055
appointed+0.053
 understand+0.051
 active+0.050
 cautious+0.049
Neg logit contributions
ays-0.070
overs-0.060
es-0.058
ase-0.056
ions-0.056
Token not
Token ID407
Feature activation-4.61e-02

Pos logit contributions
ays+0.001
overs+0.001
es+0.001
otion+0.001
ls+0.001
Neg logit contributions
 aware-0.001
 active-0.001
appointed-0.001
 cautious-0.001
 pleasant-0.001
Token listen
Token ID6004
Feature activation-1.55e-02

Pos logit contributions
appointed+0.060
 aware+0.054
 former+0.052
 cautious+0.049
Slow+0.049
Neg logit contributions
overs-0.091
otion-0.090
es-0.089
ays-0.088
acey-0.084
Token give
Token ID1577
Feature activation+6.57e-02

Pos logit contributions
 aware+0.037
appointed+0.035
 active+0.033
 cautious+0.033
 understand+0.032
Neg logit contributions
ays-0.058
overs-0.054
ions-0.050
es-0.050
otion-0.050
Token.
Token ID13
Feature activation+9.68e-03

Pos logit contributions
ays+0.142
overs+0.120
otion+0.115
ase+0.114
ions+0.113
Neg logit contributions
 aware-0.106
 understand-0.099
appointed-0.099
 active-0.098
 calm-0.097
Token
Token ID198
Feature activation-4.32e-03

Pos logit contributions
ays+0.022
overs+0.019
es+0.018
ions+0.018
kins+0.018
Neg logit contributions
 aware-0.014
appointed-0.014
 understand-0.013
 active-0.012
 cautious-0.012
Token
Token ID198
Feature activation-1.34e-02

Pos logit contributions
 aware+0.006
 understand+0.006
appointed+0.006
 active+0.006
 gentle+0.006
Neg logit contributions
ays-0.010
overs-0.009
ions-0.008
es-0.008
otion-0.008
TokenThe
Token ID464
Feature activation-2.63e-02

Pos logit contributions
 aware+0.015
appointed+0.014
 understand+0.014
 active+0.014
 cautious+0.013
Neg logit contributions
ays-0.033
overs-0.031
ions-0.030
es-0.029
ase-0.029
Token nos
Token ID43630
Feature activation-1.74e-01

Pos logit contributions
 aware+0.040
appointed+0.039
 understand+0.036
 active+0.035
 cautious+0.035
Neg logit contributions
ays-0.056
overs-0.051
ions-0.047
ase-0.047
otion-0.047
Tokeny
Token ID88
Feature activation-6.60e-02

Pos logit contributions
 aware+0.296
appointed+0.279
 understand+0.277
 active+0.271
 cautious+0.266
Neg logit contributions
ays-0.354
overs-0.311
es-0.287
ions-0.287
otion-0.282
Token bear
Token ID6842
Feature activation-1.46e-02

Pos logit contributions
 aware+0.101
appointed+0.097
 understand+0.094
 active+0.091
 cautious+0.090
Neg logit contributions
ays-0.143
overs-0.129
es-0.121
ase-0.120
ions-0.118
Token and
Token ID290
Feature activation-2.18e-03

Pos logit contributions
 aware+0.024
appointed+0.024
 active+0.023
 understand+0.023
 cautious+0.022
Neg logit contributions
ays-0.029
overs-0.026
es-0.025
ase-0.024
ions-0.024
Token nos
Token ID43630
Feature activation-1.40e-01

Pos logit contributions
 aware+0.004
 understand+0.004
appointed+0.004
 active+0.004
 cautious+0.004
Neg logit contributions
ays-0.004
overs-0.004
ions-0.004
otion-0.004
ase-0.003
Tokeny
Token ID88
Feature activation-3.32e-02

Pos logit contributions
 aware+0.238
appointed+0.225
 understand+0.222
 active+0.217
 cautious+0.213
Neg logit contributions
ays-0.284
overs-0.249
ions-0.231
es-0.229
otion-0.229
Token tree
Token ID5509
Feature activation+3.17e-04

Pos logit contributions
 aware+0.007
appointed+0.007
 understand+0.007
 active+0.007
 cautious+0.007
Neg logit contributions
ays-0.009
overs-0.008
ase-0.008
ions-0.008
es-0.008
Token.
Token ID13
Feature activation-1.17e-02

Pos logit contributions
ays+0.001
overs+0.001
ions+0.000
es+0.000
ase+0.000
Neg logit contributions
 aware-0.001
appointed-0.001
 understand-0.001
 active-0.001
 cautious-0.001
Token
Token ID198
Feature activation-2.23e-02

Pos logit contributions
 aware+0.020
 active+0.019
appointed+0.018
 understand+0.018
 matter+0.018
Neg logit contributions
ays-0.024
overs-0.021
ions-0.019
es-0.019
otion-0.018
Token
Token ID198
Feature activation-3.04e-02

Pos logit contributions
 aware+0.042
appointed+0.041
 active+0.040
 matter+0.039
 cautious+0.039
Neg logit contributions
ays-0.039
overs-0.034
ions-0.033
es-0.032
ase-0.031
TokenThe
Token ID464
Feature activation-3.18e-02

Pos logit contributions
 matter+0.044
 aware+0.043
 itch+0.042
appointed+0.041
 active+0.041
Neg logit contributions
ays-0.059
overs-0.055
ions-0.055
es-0.053
on-0.052
Token nos
Token ID43630
Feature activation-1.74e-01

Pos logit contributions
 aware+0.053
appointed+0.050
 understand+0.048
 active+0.046
 consider+0.046
Neg logit contributions
ays-0.056
overs-0.053
ase-0.050
ions-0.050
box-0.048
Tokeny
Token ID88
Feature activation-6.81e-02

Pos logit contributions
 aware+0.335
appointed+0.326
 understand+0.316
 active+0.315
 cautious+0.311
Neg logit contributions
ays-0.330
overs-0.273
ions-0.260
es-0.253
otion-0.252
Token man
Token ID582
Feature activation+5.62e-04

Pos logit contributions
 aware+0.109
appointed+0.106
 understand+0.102
 active+0.099
 cautious+0.096
Neg logit contributions
ays-0.136
overs-0.127
ase-0.116
es-0.116
ions-0.114
Token finally
Token ID3443
Feature activation-1.12e-02

Pos logit contributions
ays+0.001
overs+0.001
ions+0.001
ase+0.001
otion+0.001
Neg logit contributions
 aware-0.001
 understand-0.001
appointed-0.001
 active-0.001
 cautious-0.001
Token went
Token ID1816
Feature activation-6.30e-03

Pos logit contributions
 aware+0.018
 understand+0.017
appointed+0.017
 active+0.016
 cautious+0.016
Neg logit contributions
ays-0.025
overs-0.021
ions-0.020
ase-0.020
otion-0.020
Token away
Token ID1497
Feature activation+1.94e-02

Pos logit contributions
 aware+0.009
appointed+0.008
 calm+0.008
 active+0.008
 understand+0.008
Neg logit contributions
ays-0.015
overs-0.013
ase-0.012
es-0.012
ions-0.012
Token details
Token ID3307
Feature activation-3.46e-02

Pos logit contributions
 aware+0.020
 active+0.019
appointed+0.019
 understand+0.019
 gentle+0.019
Neg logit contributions
ays-0.020
overs-0.017
otion-0.016
es-0.016
ions-0.016
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
 aware+0.049
 active+0.047
 understand+0.046
 calm+0.046
 gentle+0.046
Neg logit contributions
ays-0.077
overs-0.066
otion-0.064
ase-0.063
acey-0.062
Token
Token ID220
Feature activation-1.83e-02

Pos logit contributions
 aware+0.025
appointed+0.024
 understand+0.024
 active+0.023
 cautious+0.023
Neg logit contributions
ays-0.040
overs-0.035
es-0.033
ions-0.033
otion-0.033
Token
Token ID198
Feature activation-2.37e-02

Pos logit contributions
 aware+0.023
 understand+0.023
appointed+0.022
 calm+0.020
 gentle+0.020
Neg logit contributions
ays-0.043
overs-0.039
ocket-0.037
ions-0.037
ase-0.036
TokenThe
Token ID464
Feature activation-2.91e-02

Pos logit contributions
 aware+0.042
appointed+0.040
 active+0.038
 understand+0.038
 cautious+0.038
Neg logit contributions
ays-0.047
overs-0.041
ions-0.038
es-0.038
ase-0.038
Token nos
Token ID43630
Feature activation-1.74e-01

Pos logit contributions
 aware+0.049
appointed+0.046
 understand+0.045
 active+0.045
 cautious+0.044
Neg logit contributions
ays-0.061
overs-0.053
ions-0.049
es-0.049
ase-0.049
Tokeny
Token ID88
Feature activation-6.74e-02

Pos logit contributions
 aware+0.310
appointed+0.295
 understand+0.291
 active+0.287
 cautious+0.282
Neg logit contributions
ays-0.343
overs-0.296
ions-0.275
es-0.275
otion-0.270
Token lady
Token ID10846
Feature activation+1.69e-02

Pos logit contributions
 aware+0.100
appointed+0.095
 understand+0.092
 active+0.091
 cautious+0.089
Neg logit contributions
ays-0.153
overs-0.135
es-0.127
ions-0.126
ase-0.126
Token came
Token ID1625
Feature activation-6.70e-02

Pos logit contributions
ays+0.035
overs+0.030
es+0.029
ions+0.028
ase+0.028
Neg logit contributions
 aware-0.028
appointed-0.028
 active-0.026
 understand-0.026
 cautious-0.026
Token to
Token ID284
Feature activation-5.42e-02

Pos logit contributions
appointed+0.114
 aware+0.109
 matter+0.106
 active+0.105
 understand+0.104
Neg logit contributions
ays-0.131
overs-0.118
es-0.112
ions-0.106
otion-0.103
Token check
Token ID2198
Feature activation+1.43e-02

Pos logit contributions
appointed+0.092
 aware+0.091
 active+0.084
 cautious+0.083
 understand+0.078
Neg logit contributions
ays-0.107
overs-0.100
ions-0.092
otion-0.088
es-0.088
Token ground
Token ID2323
Feature activation-3.10e-02

Pos logit contributions
 aware+0.024
appointed+0.022
 understand+0.021
 cautious+0.021
 active+0.020
Neg logit contributions
ays-0.030
overs-0.027
ase-0.025
ions-0.025
es-0.024
Token."
Token ID526
Feature activation-1.90e-02

Pos logit contributions
 aware+0.052
appointed+0.050
 understand+0.049
 active+0.049
 cautious+0.048
Neg logit contributions
ays-0.064
overs-0.055
ions-0.051
es-0.051
otion-0.051
Token
Token ID198
Feature activation-1.69e-02

Pos logit contributions
 aware+0.025
 understand+0.024
appointed+0.024
 calm+0.022
 gentle+0.022
Neg logit contributions
ays-0.046
overs-0.039
otion-0.038
ions-0.038
es-0.037
Token
Token ID198
Feature activation-2.19e-02

Pos logit contributions
 understand+0.022
 aware+0.022
 gentle+0.021
appointed+0.021
 calm+0.021
Neg logit contributions
ays-0.042
overs-0.036
otion-0.034
ions-0.034
es-0.034
TokenThe
Token ID464
Feature activation-2.76e-02

Pos logit contributions
 aware+0.028
appointed+0.027
 understand+0.025
 active+0.025
 cautious+0.024
Neg logit contributions
ays-0.053
overs-0.048
ions-0.046
es-0.046
ase-0.045
Token nos
Token ID43630
Feature activation-1.73e-01

Pos logit contributions
 aware+0.037
appointed+0.036
 understand+0.034
 active+0.033
 cautious+0.033
Neg logit contributions
ays-0.065
overs-0.058
ions-0.055
es-0.055
ase-0.055
Tokeny
Token ID88
Feature activation-6.52e-02

Pos logit contributions
 aware+0.289
appointed+0.272
 understand+0.271
 active+0.263
 cautious+0.259
Neg logit contributions
ays-0.354
overs-0.312
ions-0.289
es-0.289
otion-0.284
Token dove
Token ID38862
Feature activation-1.52e-02

Pos logit contributions
 aware+0.092
appointed+0.088
 understand+0.084
 active+0.082
 cautious+0.081
Neg logit contributions
ays-0.150
overs-0.135
es-0.127
ase-0.126
ions-0.125
Token got
Token ID1392
Feature activation-3.74e-03

Pos logit contributions
 aware+0.025
appointed+0.024
 understand+0.023
 active+0.023
 cautious+0.022
Neg logit contributions
ays-0.032
overs-0.028
es-0.027
ions-0.026
ase-0.026
Token closer
Token ID5699
Feature activation-8.19e-02

Pos logit contributions
appointed+0.005
 aware+0.005
 active+0.004
 understand+0.004
 matter+0.004
Neg logit contributions
ays-0.009
overs-0.008
es-0.007
ions-0.007
ls-0.007
Token and
Token ID290
Feature activation-1.85e-03

Pos logit contributions
appointed+0.150
 active+0.144
 aware+0.144
 matter+0.139
Slow+0.138
Neg logit contributions
ays-0.148
es-0.133
overs-0.129
ions-0.118
otion-0.113
Token?"
Token ID1701
Feature activation-3.11e-02

Pos logit contributions
 aware+0.046
appointed+0.045
 active+0.042
 cautious+0.042
 understand+0.042
Neg logit contributions
ays-0.044
es-0.036
overs-0.036
ase-0.035
ions-0.034
Token
Token ID220
Feature activation-2.23e-02

Pos logit contributions
 aware+0.057
appointed+0.056
 active+0.053
 pleasant+0.053
 understand+0.052
Neg logit contributions
ays-0.059
overs-0.052
es-0.049
ase-0.048
ions-0.047
Token
Token ID198
Feature activation-2.16e-02

Pos logit contributions
 aware+0.032
 understand+0.031
appointed+0.031
 active+0.029
 calm+0.029
Neg logit contributions
ays-0.051
overs-0.045
ions-0.043
ase-0.042
es-0.042
Token
Token ID198
Feature activation-2.75e-02

Pos logit contributions
 aware+0.044
appointed+0.043
 matter+0.043
 active+0.041
 cautious+0.041
Neg logit contributions
ays-0.034
overs-0.030
es-0.030
ase-0.029
ions-0.028
TokenThe
Token ID464
Feature activation-2.91e-02

Pos logit contributions
appointed+0.041
 matter+0.040
rated+0.040
 aware+0.040
 steep+0.038
Neg logit contributions
es-0.051
overs-0.050
ase-0.049
ays-0.049
ions-0.047
Token nos
Token ID43630
Feature activation-1.72e-01

Pos logit contributions
 aware+0.044
appointed+0.043
 understand+0.039
 cautious+0.037
 active+0.037
Neg logit contributions
ays-0.061
overs-0.054
es-0.054
otion-0.054
ase-0.053
Tokeny
Token ID88
Feature activation-7.25e-02

Pos logit contributions
 aware+0.292
appointed+0.276
 understand+0.275
 active+0.270
 cautious+0.263
Neg logit contributions
ays-0.348
overs-0.306
ions-0.283
es-0.282
otion-0.276
Token monster
Token ID9234
Feature activation-3.43e-02

Pos logit contributions
 aware+0.138
appointed+0.137
 understand+0.128
 consider+0.128
 active+0.124
Neg logit contributions
ays-0.119
es-0.107
ase-0.104
ls-0.102
overs-0.100
Token smiled
Token ID13541
Feature activation+2.05e-02

Pos logit contributions
 aware+0.059
appointed+0.058
 active+0.055
 understand+0.053
 cautious+0.051
Neg logit contributions
ays-0.062
es-0.059
ase-0.051
overs-0.051
ions-0.050
Token and
Token ID290
Feature activation+1.92e-02

Pos logit contributions
ays+0.038
overs+0.033
es+0.031
ls+0.027
ase+0.027
Neg logit contributions
appointed-0.037
 aware-0.035
 matter-0.035
 understand-0.034
 active-0.034
Token waved
Token ID26834
Feature activation+4.37e-02

Pos logit contributions
ays+0.033
overs+0.028
es+0.028
ions+0.026
ls+0.025
Neg logit contributions
 aware-0.036
appointed-0.036
 understand-0.035
 active-0.035
 cautious-0.033
Token Sm
Token ID2439
Feature activation-3.64e-02

Pos logit contributions
appointed+0.029
 aware+0.029
 active+0.027
 understand+0.027
 cautious+0.027
Neg logit contributions
ays-0.031
overs-0.028
ions-0.026
es-0.026
otion-0.025
Tokeniley
Token ID9618
Feature activation+1.58e-02

Pos logit contributions
 aware+0.068
appointed+0.066
 understand+0.064
 active+0.063
 gentle+0.062
Neg logit contributions
ays-0.069
overs-0.059
ions-0.056
es-0.054
otion-0.054
Token laughed
Token ID13818
Feature activation+1.15e-02

Pos logit contributions
ays+0.034
overs+0.030
ions+0.028
ase+0.027
es+0.027
Neg logit contributions
 aware-0.025
appointed-0.024
 understand-0.024
 active-0.023
 cautious-0.023
Token at
Token ID379
Feature activation-1.02e-04

Pos logit contributions
ays+0.024
overs+0.021
es+0.020
ions+0.019
otion+0.019
Neg logit contributions
 aware-0.019
appointed-0.018
 understand-0.018
 active-0.018
 cautious-0.017
Token the
Token ID262
Feature activation-3.06e-02

Pos logit contributions
 aware+0.000
appointed+0.000
 active+0.000
 understand+0.000
 cautious+0.000
Neg logit contributions
ays-0.000
overs-0.000
ions-0.000
es-0.000
otion-0.000
Token nos
Token ID43630
Feature activation-1.72e-01

Pos logit contributions
 aware+0.047
appointed+0.047
 active+0.044
 understand+0.042
During+0.041
Neg logit contributions
ays-0.061
overs-0.053
otion-0.051
ls-0.050
ocket-0.048
Tokeny
Token ID88
Feature activation-6.14e-02

Pos logit contributions
 aware+0.292
 understand+0.273
appointed+0.272
 active+0.268
 cautious+0.263
Neg logit contributions
ays-0.349
overs-0.306
es-0.285
ions-0.283
ase-0.280
Token dove
Token ID38862
Feature activation-4.52e-04

Pos logit contributions
 aware+0.099
appointed+0.098
 active+0.092
 understand+0.091
 cautious+0.087
Neg logit contributions
ays-0.123
overs-0.112
otion-0.103
es-0.103
ase-0.101
Token and
Token ID290
Feature activation+1.04e-02

Pos logit contributions
 aware+0.001
appointed+0.001
 understand+0.001
 active+0.001
 cautious+0.001
Neg logit contributions
ays-0.001
overs-0.001
ions-0.001
ase-0.001
es-0.001
Token then
Token ID788
Feature activation-5.63e-03

Pos logit contributions
ays+0.020
overs+0.018
es+0.017
ions+0.017
otion+0.016
Neg logit contributions
 aware-0.018
appointed-0.017
 understand-0.017
 active-0.017
 cautious-0.016
Token continued
Token ID3767
Feature activation-1.38e-02

Pos logit contributions
 aware+0.010
appointed+0.009
 understand+0.009
 active+0.009
 cautious+0.009
Neg logit contributions
ays-0.011
overs-0.010
ions-0.009
es-0.009
ase-0.009
Token.
Token ID13
Feature activation+2.51e-03

Pos logit contributions
ays+0.005
overs+0.004
ions+0.004
es+0.004
ase+0.004
Neg logit contributions
 aware-0.005
appointed-0.005
 understand-0.004
 active-0.004
 cautious-0.004
Token
Token ID220
Feature activation-6.61e-03

Pos logit contributions
ays+0.006
overs+0.005
es+0.005
ions+0.005
kins+0.005
Neg logit contributions
appointed-0.003
 aware-0.003
 understand-0.003
 active-0.003
 calm-0.003
Token
Token ID198
Feature activation-1.53e-02

Pos logit contributions
 aware+0.008
 understand+0.008
appointed+0.008
 calm+0.007
 gentle+0.007
Neg logit contributions
ays-0.016
overs-0.014
ocket-0.014
ions-0.013
otion-0.013
Token
Token ID198
Feature activation-2.38e-02

Pos logit contributions
 aware+0.024
appointed+0.023
 understand+0.023
 active+0.022
 cautious+0.022
Neg logit contributions
ays-0.034
overs-0.029
ions-0.027
es-0.027
otion-0.027
TokenThe
Token ID464
Feature activation-2.62e-02

Pos logit contributions
 aware+0.024
appointed+0.023
 active+0.021
 understand+0.021
 cautious+0.020
Neg logit contributions
ays-0.064
overs-0.058
ions-0.056
es-0.056
ase-0.055
Token nos
Token ID43630
Feature activation-1.72e-01

Pos logit contributions
 aware+0.044
appointed+0.042
 understand+0.040
 cautious+0.039
 active+0.039
Neg logit contributions
ays-0.052
overs-0.047
ions-0.043
ase-0.043
es-0.043
Tokeny
Token ID88
Feature activation-6.67e-02

Pos logit contributions
 aware+0.278
 understand+0.258
appointed+0.257
 active+0.255
 cautious+0.247
Neg logit contributions
ays-0.356
overs-0.316
ions-0.293
es-0.292
otion-0.291
Token face
Token ID1986
Feature activation-2.29e-02

Pos logit contributions
 aware+0.122
appointed+0.119
 understand+0.115
 active+0.111
 cautious+0.111
Neg logit contributions
ays-0.126
overs-0.111
es-0.103
ase-0.102
ions-0.100
Token was
Token ID373
Feature activation+2.65e-02

Pos logit contributions
 aware+0.042
appointed+0.041
 active+0.039
 understand+0.039
 cautious+0.038
Neg logit contributions
ays-0.042
es-0.036
overs-0.035
ase-0.034
ions-0.034
Token never
Token ID1239
Feature activation+1.98e-02

Pos logit contributions
ays+0.057
overs+0.054
es+0.050
ions+0.049
otion+0.047
Neg logit contributions
 aware-0.036
appointed-0.036
 understand-0.034
 active-0.033
 cautious-0.032
Token forgotten
Token ID11564
Feature activation+2.82e-02

Pos logit contributions
ays+0.038
overs+0.036
es+0.033
ions+0.031
otion+0.031
Neg logit contributions
appointed-0.031
 aware-0.029
 understand-0.028
Slow-0.027
 active-0.027
Token.
Token ID13
Feature activation-1.52e-02

Pos logit contributions
ays+0.012
overs+0.011
acey+0.010
ase+0.010
otion+0.010
Neg logit contributions
 aware-0.006
 gentle-0.006
 understand-0.006
 calm-0.006
 active-0.006
Token
Token ID220
Feature activation-1.89e-02

Pos logit contributions
 aware+0.022
appointed+0.022
 understand+0.021
 active+0.020
 gentle+0.020
Neg logit contributions
ays-0.034
overs-0.029
otion-0.028
ions-0.028
es-0.028
Token
Token ID198
Feature activation-2.31e-02

Pos logit contributions
 understand+0.022
 aware+0.020
appointed+0.019
 calm+0.018
 gentle+0.017
Neg logit contributions
ays-0.048
overs-0.044
ocket-0.041
ase-0.041
ions-0.041
Token
Token ID198
Feature activation-2.95e-02

Pos logit contributions
 aware+0.040
appointed+0.038
 understand+0.037
 active+0.037
 cautious+0.036
Neg logit contributions
ays-0.047
overs-0.040
es-0.038
ions-0.038
ase-0.037
TokenThe
Token ID464
Feature activation-2.62e-02

Pos logit contributions
 aware+0.037
 matter+0.034
appointed+0.034
 pleasant+0.033
 worry+0.032
Neg logit contributions
ays-0.064
overs-0.062
es-0.061
ions-0.060
ase-0.059
Token nos
Token ID43630
Feature activation-1.72e-01

Pos logit contributions
 aware+0.041
appointed+0.040
 understand+0.038
 active+0.035
 cautious+0.035
Neg logit contributions
ays-0.049
overs-0.046
kins-0.044
es-0.044
ocket-0.043
Tokeny
Token ID88
Feature activation-6.85e-02

Pos logit contributions
 aware+0.304
appointed+0.287
 understand+0.284
 active+0.280
 cautious+0.275
Neg logit contributions
ays-0.342
overs-0.297
ions-0.275
es-0.274
otion-0.271
Token rabbit
Token ID22746
Feature activation-2.53e-02

Pos logit contributions
appointed+0.109
 aware+0.108
 understand+0.105
 active+0.098
 pleasant+0.097
Neg logit contributions
ays-0.132
overs-0.121
es-0.119
ase-0.115
ls-0.111
Token asked
Token ID1965
Feature activation-7.62e-03

Pos logit contributions
appointed+0.046
 aware+0.046
 active+0.044
 understand+0.043
 pleasant+0.043
Neg logit contributions
ays-0.044
es-0.040
overs-0.036
ase-0.036
ions-0.035
Token his
Token ID465
Feature activation+5.66e-04

Pos logit contributions
 aware+0.014
appointed+0.014
 understand+0.014
 active+0.013
 cautious+0.013
Neg logit contributions
ays-0.014
overs-0.012
ions-0.011
es-0.011
ase-0.011
Token friends
Token ID2460
Feature activation-3.04e-02

Pos logit contributions
ays+0.001
overs+0.001
es+0.001
ions+0.001
ase+0.001
Neg logit contributions
appointed-0.001
 aware-0.001
 active-0.001
 understand-0.001
 cautious-0.001
Token before
Token ID878
Feature activation-1.03e-02

Pos logit contributions
 aware+0.019
appointed+0.018
 active+0.017
 understand+0.017
 cautious+0.017
Neg logit contributions
ays-0.024
overs-0.021
es-0.020
ions-0.020
otion-0.020
Token.
Token ID13
Feature activation-7.93e-03

Pos logit contributions
 aware+0.018
appointed+0.017
 understand+0.017
 active+0.016
 cautious+0.016
Neg logit contributions
ays-0.021
overs-0.018
ions-0.017
ase-0.017
es-0.017
Token
Token ID198
Feature activation-1.32e-02

Pos logit contributions
 aware+0.012
appointed+0.012
 understand+0.012
 active+0.011
 cautious+0.011
Neg logit contributions
ays-0.018
overs-0.015
es-0.014
ions-0.014
otion-0.014
Token
Token ID198
Feature activation-1.95e-02

Pos logit contributions
 aware+0.021
appointed+0.020
 understand+0.020
 active+0.019
 cautious+0.019
Neg logit contributions
ays-0.029
overs-0.025
ions-0.024
es-0.024
otion-0.023
TokenThe
Token ID464
Feature activation-2.22e-02

Pos logit contributions
 aware+0.022
appointed+0.020
 active+0.019
 understand+0.019
 cautious+0.019
Neg logit contributions
ays-0.049
overs-0.045
ions-0.043
ase-0.043
es-0.043
Token nos
Token ID43630
Feature activation-1.72e-01

Pos logit contributions
 aware+0.032
appointed+0.031
 understand+0.029
 active+0.029
 cautious+0.028
Neg logit contributions
ays-0.050
overs-0.045
ions-0.042
es-0.042
ase-0.042
Tokeny
Token ID88
Feature activation-6.42e-02

Pos logit contributions
 aware+0.309
appointed+0.295
 understand+0.290
 active+0.286
 cautious+0.282
Neg logit contributions
ays-0.337
overs-0.290
ions-0.270
es-0.269
ase-0.265
Token girl
Token ID2576
Feature activation+9.48e-03

Pos logit contributions
 aware+0.099
appointed+0.095
 understand+0.092
 active+0.089
 cautious+0.088
Neg logit contributions
ays-0.137
overs-0.124
ase-0.116
es-0.115
ions-0.114
Token watched
Token ID7342
Feature activation-1.37e-02

Pos logit contributions
ays+0.019
overs+0.016
es+0.016
ions+0.015
ase+0.015
Neg logit contributions
 aware-0.016
appointed-0.016
 active-0.015
 understand-0.015
 cautious-0.015
Token as
Token ID355
Feature activation-1.79e-02

Pos logit contributions
 aware+0.023
appointed+0.021
 active+0.021
 understand+0.021
 cautious+0.020
Neg logit contributions
ays-0.029
overs-0.025
ions-0.024
es-0.024
otion-0.023
Token they
Token ID484
Feature activation+1.63e-02

Pos logit contributions
 aware+0.033
appointed+0.032
 active+0.030
 understand+0.030
 cautious+0.030
Neg logit contributions
ays-0.034
overs-0.030
ions-0.028
es-0.027
otion-0.027
Token persistent
Token ID16218
Feature activation-4.89e-02

Pos logit contributions
 aware+0.031
appointed+0.029
 understand+0.027
 active+0.026
 cautious+0.026
Neg logit contributions
ays-0.086
overs-0.079
es-0.075
ions-0.074
otion-0.073
Tokenâ
Token ID22940
Feature activation-3.55e-02

Pos logit contributions
 aware+0.079
 understand+0.073
appointed+0.073
 active+0.073
 cautious+0.072
Neg logit contributions
ays-0.106
overs-0.090
ase-0.085
otion-0.085
ions-0.084
Token
Token ID26391
Feature activation-2.09e-02

Pos logit contributions
 aware+0.046
 understand+0.042
 active+0.041
 gentle+0.041
 pleasant+0.041
Neg logit contributions
ays-0.086
overs-0.079
ase-0.073
otion-0.073
ions-0.072
Token
Token ID447
Feature activation-3.85e-02

Pos logit contributions
 aware+0.034
appointed+0.033
 understand+0.032
 cautious+0.031
 active+0.031
Neg logit contributions
ays-0.044
ions-0.037
overs-0.037
otion-0.036
ocket-0.034
Token
Token ID251
Feature activation-4.92e-02

Pos logit contributions
 aware+0.092
appointed+0.089
 understand+0.088
 active+0.087
 cautious+0.086
Neg logit contributions
ays-0.053
overs-0.042
ions-0.038
es-0.037
otion-0.037
Tokenso
Token ID568
Feature activation-1.72e-01

Pos logit contributions
 aware+0.080
appointed+0.076
 understand+0.076
 active+0.074
 cautious+0.073
Neg logit contributions
ays-0.105
overs-0.090
ions-0.085
es-0.085
ase-0.085
Token they
Token ID484
Feature activation-2.83e-02

Pos logit contributions
 aware+0.284
appointed+0.273
 active+0.262
 understand+0.261
 pleasant+0.254
Neg logit contributions
ays-0.349
overs-0.309
es-0.292
otion-0.282
ions-0.281
Token gathered
Token ID9272
Feature activation-1.81e-02

Pos logit contributions
 aware+0.047
appointed+0.044
 active+0.044
 pleasant+0.042
 cautious+0.042
Neg logit contributions
ays-0.052
overs-0.047
es-0.047
acey-0.046
otion-0.045
Token around
Token ID1088
Feature activation-6.29e-02

Pos logit contributions
 aware+0.032
appointed+0.031
 understand+0.030
 active+0.030
 cautious+0.029
Neg logit contributions
ays-0.036
overs-0.031
ions-0.029
es-0.029
otion-0.028
Token the
Token ID262
Feature activation-4.14e-02

Pos logit contributions
 aware+0.104
appointed+0.099
 understand+0.098
 active+0.096
 cautious+0.094
Neg logit contributions
ays-0.133
overs-0.115
ions-0.107
es-0.107
ase-0.106
Token ladder
Token ID18002
Feature activation-4.64e-02

Pos logit contributions
 aware+0.073
appointed+0.070
 understand+0.067
 active+0.065
 cautious+0.065
Neg logit contributions
ays-0.082
overs-0.072
ions-0.067
ase-0.066
es-0.065
Token Sam
Token ID3409
Feature activation-1.02e-02

Pos logit contributions
 aware+0.032
appointed+0.030
 understand+0.030
 active+0.029
 cautious+0.029
Neg logit contributions
ays-0.028
overs-0.024
es-0.022
ions-0.022
ase-0.022
Token wanted
Token ID2227
Feature activation-3.33e-02

Pos logit contributions
appointed+0.020
 aware+0.020
 active+0.019
 understand+0.019
 pleasant+0.019
Neg logit contributions
ays-0.016
es-0.015
ls-0.013
ase-0.013
overs-0.013
Token to
Token ID284
Feature activation-6.15e-02

Pos logit contributions
appointed+0.053
 aware+0.051
 active+0.047
 cautious+0.045
 understand+0.045
Neg logit contributions
ays-0.071
overs-0.064
ions-0.064
es-0.063
ons-0.061
Token be
Token ID307
Feature activation-2.57e-02

Pos logit contributions
appointed+0.112
 aware+0.109
 active+0.100
 cautious+0.099
 steep+0.097
Neg logit contributions
overs-0.095
ays-0.092
ions-0.084
es-0.084
ons-0.079
Token friends
Token ID2460
Feature activation+1.23e-02

Pos logit contributions
appointed+0.037
 hesitate+0.031
 exp+0.031
rated+0.031
 perspective+0.030
Neg logit contributions
ays-0.054
overs-0.053
es-0.050
ions-0.046
kins-0.045
Token with
Token ID351
Feature activation+8.40e-02

Pos logit contributions
ays+0.019
overs+0.017
es+0.017
ase+0.015
ions+0.015
Neg logit contributions
appointed-0.026
 aware-0.024
 pleasant-0.023
Slow-0.023
 matter-0.023
Token Tim
Token ID5045
Feature activation+3.33e-02

Pos logit contributions
ays+0.179
overs+0.153
ase+0.145
otion+0.143
ions+0.141
Neg logit contributions
 aware-0.140
 understand-0.133
 calm-0.131
 gentle-0.131
appointed-0.130
Token,
Token ID11
Feature activation+1.34e-02

Pos logit contributions
ays+0.064
overs+0.055
es+0.053
ions+0.051
ase+0.050
Neg logit contributions
 aware-0.060
appointed-0.058
 understand-0.056
 active-0.056
 pleasant-0.055
Token so
Token ID523
Feature activation-2.87e-02

Pos logit contributions
ays+0.029
overs+0.026
es+0.025
ions+0.025
otion+0.024
Neg logit contributions
appointed-0.020
 aware-0.019
 active-0.018
 pleasant-0.017
 understand-0.017
Token he
Token ID339
Feature activation+1.02e-02

Pos logit contributions
 aware+0.054
 understand+0.051
appointed+0.051
 active+0.050
 gentle+0.050
Neg logit contributions
ays-0.057
overs-0.048
ase-0.046
otion-0.044
ions-0.044
Token offered
Token ID4438
Feature activation+1.56e-03

Pos logit contributions
ays+0.021
overs+0.019
es+0.018
ions+0.018
ase+0.017
Neg logit contributions
 aware-0.016
appointed-0.016
 understand-0.015
 active-0.015
 cautious-0.014
Token friends
Token ID2460
Feature activation+1.88e-02

Pos logit contributions
appointed+0.050
 aware+0.050
 understand+0.047
 cautious+0.046
 active+0.046
Neg logit contributions
ays-0.054
overs-0.047
es-0.045
ions-0.043
ase-0.041
Token,"
Token ID553
Feature activation+7.95e-03

Pos logit contributions
ays+0.031
es+0.027
overs+0.027
ions+0.024
ase+0.024
Neg logit contributions
appointed-0.038
 aware-0.037
 active-0.034
 cautious-0.034
Slow-0.034
Token said
Token ID531
Feature activation+7.99e-02

Pos logit contributions
ays+0.017
overs+0.014
ions+0.013
otion+0.013
ase+0.013
Neg logit contributions
 aware-0.014
 understand-0.013
appointed-0.013
 active-0.012
 gentle-0.012
Token the
Token ID262
Feature activation-2.24e-02

Pos logit contributions
ays+0.188
overs+0.162
ase+0.154
ions+0.154
otion+0.153
Neg logit contributions
 aware-0.113
 understand-0.109
appointed-0.105
 calm-0.104
 gentle-0.104
Token princess
Token ID21752
Feature activation-1.14e-02

Pos logit contributions
 aware+0.033
appointed+0.031
 understand+0.031
 active+0.030
 cautious+0.030
Neg logit contributions
ays-0.052
overs-0.045
ions-0.043
es-0.043
ase-0.042
Token with
Token ID351
Feature activation+8.40e-02

Pos logit contributions
 aware+0.021
appointed+0.020
 understand+0.020
 active+0.020
 cautious+0.019
Neg logit contributions
ays-0.021
overs-0.018
es-0.017
ions-0.017
ase-0.017
Token a
Token ID257
Feature activation+1.51e-02

Pos logit contributions
ays+0.212
overs+0.186
ase+0.178
es+0.176
ions+0.176
Neg logit contributions
 aware-0.107
 understand-0.104
appointed-0.099
 gentle-0.098
 calm-0.098
Token smile
Token ID8212
Feature activation-2.99e-03

Pos logit contributions
overs+0.029
ays+0.029
es+0.027
box+0.027
kin+0.027
Neg logit contributions
appointed-0.021
 aware-0.019
 active-0.017
 heights-0.016
 resist-0.016
Token.
Token ID13
Feature activation+6.83e-03

Pos logit contributions
 aware+0.005
appointed+0.005
 understand+0.005
 active+0.005
 cautious+0.005
Neg logit contributions
ays-0.006
overs-0.005
ions-0.005
es-0.005
ase-0.005
Token
Token ID198
Feature activation-9.34e-03

Pos logit contributions
ays+0.015
overs+0.014
ions+0.013
otion+0.013
es+0.013
Neg logit contributions
 aware-0.010
appointed-0.010
 understand-0.009
 cautious-0.009
 active-0.009
Token
Token ID198
Feature activation-1.71e-02

Pos logit contributions
 aware+0.013
 understand+0.013
appointed+0.013
 gentle+0.012
 calm+0.012
Neg logit contributions
ays-0.022
overs-0.019
ions-0.018
otion-0.018
es-0.017
Token.
Token ID13
Feature activation-1.26e-02

Pos logit contributions
ays+0.092
overs+0.080
ions+0.074
es+0.074
otion+0.073
Neg logit contributions
 aware-0.076
appointed-0.073
 understand-0.072
 active-0.071
 cautious-0.069
Token Even
Token ID3412
Feature activation-2.28e-02

Pos logit contributions
 aware+0.021
appointed+0.020
 understand+0.020
 active+0.020
 cautious+0.019
Neg logit contributions
ays-0.026
overs-0.023
ions-0.021
es-0.021
ase-0.021
Token when
Token ID618
Feature activation+1.09e-02

Pos logit contributions
appointed+0.032
 aware+0.030
 understand+0.029
 cautious+0.027
 active+0.027
Neg logit contributions
ays-0.053
overs-0.047
ions-0.046
es-0.045
otion-0.043
Token not
Token ID407
Feature activation-5.87e-02

Pos logit contributions
ays+0.019
overs+0.016
es+0.015
ase+0.015
ions+0.015
Neg logit contributions
 aware-0.022
appointed-0.021
 understand-0.020
 active-0.020
 cautious-0.020
Token performing
Token ID9489
Feature activation+1.10e-02

Pos logit contributions
appointed+0.088
 aware+0.087
 active+0.083
 cautious+0.082
 understand+0.082
Neg logit contributions
ays-0.126
overs-0.114
otion-0.105
ions-0.105
es-0.104
Token with
Token ID351
Feature activation+9.06e-02

Pos logit contributions
ays+0.024
overs+0.020
ase+0.019
ions+0.019
es+0.019
Neg logit contributions
 aware-0.018
appointed-0.017
 understand-0.017
 active-0.017
 cautious-0.016
Token it
Token ID340
Feature activation+1.36e-02

Pos logit contributions
ays+0.186
overs+0.162
ions+0.152
es+0.151
otion+0.148
Neg logit contributions
 aware-0.155
appointed-0.148
 understand-0.144
 active-0.143
 cautious-0.140
Token,
Token ID11
Feature activation+3.24e-03

Pos logit contributions
ays+0.030
overs+0.026
ions+0.024
ase+0.024
es+0.024
Neg logit contributions
 aware-0.021
appointed-0.020
 understand-0.020
 active-0.020
 cautious-0.019
Token Jane
Token ID12091
Feature activation-9.05e-03

Pos logit contributions
ays+0.006
overs+0.005
ions+0.004
es+0.004
ase+0.004
Neg logit contributions
 aware-0.007
appointed-0.006
 understand-0.006
 active-0.006
 cautious-0.006
Token liked
Token ID8288
Feature activation-2.00e-02

Pos logit contributions
 aware+0.015
appointed+0.014
 understand+0.014
 active+0.014
 cautious+0.013
Neg logit contributions
ays-0.019
overs-0.016
es-0.016
ions-0.015
ase-0.015
Token to
Token ID284
Feature activation-4.93e-02

Pos logit contributions
 aware+0.032
appointed+0.032
 understand+0.030
 active+0.030
 cautious+0.029
Neg logit contributions
ays-0.042
overs-0.037
ions-0.035
es-0.035
ase-0.034
Token river
Token ID7850
Feature activation-9.88e-02

Pos logit contributions
 aware+0.067
 understand+0.063
appointed+0.062
 cautious+0.058
 hesitate+0.057
Neg logit contributions
ays-0.076
overs-0.070
ase-0.066
ions-0.063
box-0.062
Token and
Token ID290
Feature activation+4.27e-04

Pos logit contributions
 aware+0.181
appointed+0.179
 understand+0.173
 active+0.171
 pleasant+0.171
Neg logit contributions
ays-0.184
overs-0.163
es-0.148
ions-0.147
ase-0.147
Token filled
Token ID5901
Feature activation-3.94e-02

Pos logit contributions
ays+0.001
overs+0.001
ions+0.001
es+0.001
ase+0.001
Neg logit contributions
 aware-0.001
appointed-0.001
 active-0.001
 understand-0.001
 cautious-0.001
Token her
Token ID607
Feature activation+1.66e-02

Pos logit contributions
 aware+0.059
 calm+0.056
 gentle+0.056
 understand+0.055
appointed+0.054
Neg logit contributions
ays-0.089
overs-0.077
ase-0.074
otion-0.073
es-0.072
Token bucket
Token ID19236
Feature activation-1.88e-02

Pos logit contributions
ays+0.031
overs+0.027
ions+0.025
es+0.025
ase+0.024
Neg logit contributions
 aware-0.031
appointed-0.030
 understand-0.029
 active-0.029
 cautious-0.028
Token with
Token ID351
Feature activation+9.27e-02

Pos logit contributions
 aware+0.032
appointed+0.031
 active+0.030
 understand+0.030
 calm+0.030
Neg logit contributions
ays-0.038
overs-0.032
otion-0.031
es-0.031
ions-0.030
Token water
Token ID1660
Feature activation-2.92e-02

Pos logit contributions
ays+0.206
overs+0.184
ions+0.175
es+0.169
kins+0.162
Neg logit contributions
 aware-0.143
appointed-0.141
 active-0.130
 understand-0.129
 cautious-0.128
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
 aware+0.044
 calm+0.041
 gentle+0.040
 understand+0.040
 active+0.040
Neg logit contributions
ays-0.063
overs-0.055
acey-0.053
otion-0.053
ase-0.053
Token The
Token ID383
Feature activation-1.60e-02

Pos logit contributions
 aware+0.019
appointed+0.018
 understand+0.018
 active+0.018
 cautious+0.017
Neg logit contributions
ays-0.023
overs-0.020
ions-0.019
es-0.018
ase-0.018
Token water
Token ID1660
Feature activation-4.36e-02

Pos logit contributions
 aware+0.029
appointed+0.029
 understand+0.028
 active+0.026
 hesitate+0.026
Neg logit contributions
ays-0.028
overs-0.027
ase-0.024
kins-0.024
ions-0.023
Token was
Token ID373
Feature activation+4.19e-02

Pos logit contributions
 aware+0.072
appointed+0.068
 understand+0.067
 calm+0.067
 active+0.067
Neg logit contributions
ays-0.091
overs-0.079
otion-0.073
ions-0.073
es-0.072
Token!
Token ID0
Feature activation-1.70e-02

Pos logit contributions
ays+0.060
overs+0.053
es+0.050
ions+0.049
otion+0.048
Neg logit contributions
 aware-0.053
appointed-0.052
 active-0.049
 understand-0.049
 cautious-0.048
Token Daisy
Token ID40355
Feature activation-1.13e-02

Pos logit contributions
 aware+0.026
appointed+0.025
 understand+0.025
 active+0.024
 cautious+0.024
Neg logit contributions
ays-0.038
overs-0.032
ions-0.031
otion-0.031
es-0.031
Token was
Token ID373
Feature activation-4.57e-03

Pos logit contributions
 aware+0.019
appointed+0.019
 understand+0.018
 active+0.018
 cautious+0.018
Neg logit contributions
ays-0.022
es-0.019
overs-0.019
ions-0.017
ase-0.017
Token so
Token ID523
Feature activation-3.51e-02

Pos logit contributions
appointed+0.006
 aware+0.006
 matter+0.006
 understand+0.006
 active+0.006
Neg logit contributions
ays-0.010
overs-0.010
ions-0.009
es-0.009
otion-0.009
Token happy
Token ID3772
Feature activation-7.44e-03

Pos logit contributions
 matter+0.047
appointed+0.046
 aware+0.043
 understand+0.040
 approach+0.040
Neg logit contributions
ays-0.077
overs-0.074
es-0.067
otion-0.065
ions-0.065
Token that
Token ID326
Feature activation+8.83e-02

Pos logit contributions
 aware+0.013
appointed+0.012
 understand+0.012
 active+0.012
 cautious+0.012
Neg logit contributions
ays-0.015
overs-0.013
ions-0.012
ase-0.012
es-0.012
Token she
Token ID673
Feature activation+3.72e-02

Pos logit contributions
ays+0.162
overs+0.130
ase+0.124
ions+0.120
es+0.119
Neg logit contributions
 aware-0.180
 understand-0.173
appointed-0.170
 active-0.168
 cautious-0.166
Token kept
Token ID4030
Feature activation-3.50e-02

Pos logit contributions
ays+0.082
overs+0.070
ions+0.066
otion+0.065
ase+0.065
Neg logit contributions
 aware-0.061
 active-0.057
appointed-0.057
 understand-0.057
 calm-0.056
Token spl
Token ID4328
Feature activation-7.70e-02

Pos logit contributions
 aware+0.059
appointed+0.057
 understand+0.056
 active+0.055
 cautious+0.054
Neg logit contributions
ays-0.072
overs-0.063
ions-0.058
es-0.058
otion-0.057
Tokenashing
Token ID2140
Feature activation-6.11e-02

Pos logit contributions
 matter+0.131
 understand+0.126
 aware+0.125
 pleasant+0.121
 gentle+0.121
Neg logit contributions
ays-0.163
overs-0.142
ions-0.126
ase-0.125
es-0.122
Token and
Token ID290
Feature activation+1.90e-02

Pos logit contributions
 aware+0.099
appointed+0.094
 understand+0.092
 active+0.090
 cautious+0.089
Neg logit contributions
ays-0.131
overs-0.115
ions-0.108
es-0.107
ase-0.105
Token his
Token ID465
Feature activation+5.64e-02

Pos logit contributions
 aware+0.025
appointed+0.024
 understand+0.024
 active+0.024
 cautious+0.023
Neg logit contributions
ays-0.021
overs-0.018
ions-0.016
es-0.016
ase-0.016
Token dinner
Token ID8073
Feature activation+6.96e-02

Pos logit contributions
ays+0.111
overs+0.096
es+0.090
ase+0.090
ions+0.086
Neg logit contributions
 aware-0.099
appointed-0.094
 understand-0.089
 active-0.087
 cautious-0.087
Token was
Token ID373
Feature activation+7.38e-02

Pos logit contributions
ays+0.166
overs+0.138
otion+0.132
ions+0.128
acey+0.127
Neg logit contributions
 calm-0.095
 understand-0.095
 active-0.093
 matter-0.091
 gentle-0.090
Token and
Token ID290
Feature activation+2.79e-02

Pos logit contributions
ays+0.134
overs+0.117
es+0.108
ions+0.106
otion+0.104
Neg logit contributions
 aware-0.139
appointed-0.135
 understand-0.129
 cautious-0.128
 active-0.127
Token how
Token ID703
Feature activation-1.82e-02

Pos logit contributions
ays+0.059
overs+0.053
es+0.049
ions+0.049
otion+0.047
Neg logit contributions
 aware-0.042
appointed-0.042
 understand-0.039
 active-0.038
 cautious-0.038
Token much
Token ID881
Feature activation-5.34e-03

Pos logit contributions
 aware+0.030
appointed+0.029
 understand+0.028
 imitate+0.027
 cautious+0.027
Neg logit contributions
ays-0.034
overs-0.031
es-0.029
ls-0.028
ions-0.028
Token he
Token ID339
Feature activation+4.09e-02

Pos logit contributions
 aware+0.009
appointed+0.008
 cautious+0.008
 active+0.008
 understand+0.008
Neg logit contributions
ays-0.011
overs-0.009
ions-0.009
es-0.009
otion-0.009
Token loved
Token ID6151
Feature activation-5.57e-03

Pos logit contributions
ays+0.082
overs+0.070
es+0.069
ions+0.066
ase+0.066
Neg logit contributions
 aware-0.071
appointed-0.069
 understand-0.065
 active-0.064
 cautious-0.063
Token playing
Token ID2712
Feature activation+5.36e-05

Pos logit contributions
 aware+0.011
appointed+0.010
 understand+0.010
 active+0.010
 cautious+0.010
Neg logit contributions
ays-0.010
overs-0.009
ions-0.008
es-0.008
ase-0.008
Token with
Token ID351
Feature activation+8.87e-02

Pos logit contributions
ays+0.000
overs+0.000
otion+0.000
ions+0.000
ase+0.000
Neg logit contributions
 aware-0.000
 active-0.000
 understand-0.000
appointed-0.000
 calm-0.000
Token his
Token ID465
Feature activation+3.42e-02

Pos logit contributions
ays+0.181
overs+0.157
ions+0.146
es+0.146
ase+0.143
Neg logit contributions
 aware-0.153
appointed-0.147
 understand-0.143
 active-0.141
 cautious-0.139
Token!"
Token ID2474
Feature activation-1.90e-02

Pos logit contributions
 aware+0.062
appointed+0.059
 understand+0.059
 active+0.058
 cautious+0.057
Neg logit contributions
ays-0.067
overs-0.057
es-0.053
ions-0.053
ase-0.052
Token cried
Token ID16896
Feature activation+3.39e-02

Pos logit contributions
 understand+0.029
 aware+0.028
appointed+0.027
 calm+0.027
 gentle+0.027
Neg logit contributions
ays-0.043
ase-0.037
overs-0.037
ions-0.036
otion-0.035
Token Jill
Token ID22772
Feature activation-2.36e-02

Pos logit contributions
ays+0.061
overs+0.050
ions+0.047
ase+0.047
es+0.047
Neg logit contributions
 aware-0.067
 understand-0.064
appointed-0.064
 active-0.063
 cautious-0.062
Token.
Token ID13
Feature activation+4.02e-03

Pos logit contributions
 aware+0.043
appointed+0.042
 active+0.041
 understand+0.041
 pleasant+0.040
Neg logit contributions
ays-0.043
overs-0.038
es-0.037
ions-0.034
ase-0.034
Token "
Token ID366
Feature activation-6.50e-02

Pos logit contributions
ays+0.009
overs+0.008
ions+0.008
es+0.008
ase+0.008
Neg logit contributions
 aware-0.006
appointed-0.006
 understand-0.005
 cautious-0.005
 active-0.005
TokenYou
Token ID1639
Feature activation+2.60e-03

Pos logit contributions
 aware+0.115
appointed+0.108
 active+0.104
 understand+0.103
 cautious+0.102
Neg logit contributions
ays-0.129
overs-0.113
ions-0.106
ase-0.106
es-0.104
Token were
Token ID547
Feature activation+4.70e-02

Pos logit contributions
ays+0.005
overs+0.004
es+0.004
ions+0.004
otion+0.004
Neg logit contributions
 aware-0.005
appointed-0.004
 active-0.004
 understand-0.004
 cautious-0.004
Token so
Token ID523
Feature activation-1.43e-02

Pos logit contributions
ays+0.107
overs+0.097
ions+0.092
es+0.092
otion+0.090
Neg logit contributions
 aware-0.064
appointed-0.061
 understand-0.059
 active-0.056
 cautious-0.056
Token fast
Token ID3049
Feature activation+2.01e-03

Pos logit contributions
 aware+0.014
appointed+0.013
 understand+0.013
 matter+0.011
 cautious+0.011
Neg logit contributions
ays-0.037
overs-0.035
es-0.033
otion-0.033
ions-0.033
Token!"
Token ID2474
Feature activation-6.78e-03

Pos logit contributions
ays+0.004
overs+0.003
ions+0.003
es+0.003
ase+0.003
Neg logit contributions
 aware-0.004
appointed-0.003
 understand-0.003
 active-0.003
 cautious-0.003
Token
Token ID198
Feature activation-1.81e-02

Pos logit contributions
 aware+0.010
 understand+0.009
appointed+0.009
 calm+0.009
 gentle+0.009
Neg logit contributions
ays-0.016
overs-0.014
ase-0.013
ions-0.013
otion-0.013
Token on
Token ID319
Feature activation+4.45e-02

Pos logit contributions
ays+0.081
overs+0.074
acey+0.073
kins+0.069
es+0.069
Neg logit contributions
 aware-0.043
 calm-0.043
appointed-0.043
 active-0.043
 gentle-0.041
Token the
Token ID262
Feature activation-6.22e-03

Pos logit contributions
ays+0.097
overs+0.085
ions+0.080
es+0.080
ase+0.078
Neg logit contributions
 aware-0.071
appointed-0.067
 understand-0.065
 active-0.065
 cautious-0.064
Token side
Token ID1735
Feature activation+6.51e-03

Pos logit contributions
 aware+0.010
 understand+0.009
appointed+0.009
 cautious+0.009
 active+0.009
Neg logit contributions
ays-0.013
overs-0.012
ase-0.011
ions-0.011
otion-0.011
Token of
Token ID286
Feature activation+2.11e-02

Pos logit contributions
ays+0.012
acey+0.011
es+0.010
overs+0.010
kins+0.010
Neg logit contributions
 aware-0.011
appointed-0.011
 understand-0.011
 calm-0.010
 active-0.010
Token the
Token ID262
Feature activation-2.62e-03

Pos logit contributions
ays+0.045
overs+0.039
ions+0.037
es+0.036
ase+0.036
Neg logit contributions
 aware-0.035
appointed-0.034
 understand-0.032
 active-0.032
 cautious-0.032
Token road
Token ID2975
Feature activation+1.67e-02

Pos logit contributions
 aware+0.004
appointed+0.004
 understand+0.004
 cautious+0.004
 pleasant+0.004
Neg logit contributions
ays-0.006
overs-0.005
ase-0.005
ions-0.005
es-0.005
Token.
Token ID13
Feature activation-9.58e-03

Pos logit contributions
ays+0.036
overs+0.031
ions+0.030
acey+0.030
es+0.029
Neg logit contributions
 aware-0.026
appointed-0.026
 understand-0.025
 calm-0.024
 active-0.024
Token
Token ID198
Feature activation-2.07e-02

Pos logit contributions
 aware+0.018
 active+0.017
 understand+0.016
appointed+0.016
 cautious+0.016
Neg logit contributions
ays-0.018
overs-0.015
es-0.015
ase-0.014
ions-0.014
Token
Token ID220
Feature activation-2.64e-02

Pos logit contributions
 understand+0.027
 aware+0.026
 calm+0.025
appointed+0.024
 gentle+0.024
Neg logit contributions
ays-0.051
overs-0.046
ions-0.043
otion-0.042
ase-0.041
Token
Token ID198
Feature activation-2.79e-02

Pos logit contributions
 understand+0.032
 aware+0.030
appointed+0.028
 calm+0.026
 gentle+0.026
Neg logit contributions
ays-0.066
overs-0.061
ions-0.058
kins-0.057
ocket-0.057
TokenSam
Token ID16305
Feature activation-3.52e-02

Pos logit contributions
 aware+0.039
appointed+0.036
 understand+0.035
 pleasant+0.035
 hesitate+0.034
Neg logit contributions
ays-0.062
overs-0.055
es-0.055
ase-0.054
ions-0.052
Token a
Token ID257
Feature activation+2.96e-02

Pos logit contributions
ays+0.019
overs+0.017
ions+0.016
es+0.016
otion+0.016
Neg logit contributions
 aware-0.010
appointed-0.010
 understand-0.009
 active-0.009
 cautious-0.009
Token loyal
Token ID9112
Feature activation-1.24e-02

Pos logit contributions
overs+0.062
ays+0.059
acey+0.056
ase+0.056
 DVD+0.054
Neg logit contributions
appointed-0.043
 aware-0.039
 hesitate-0.035
 understand-0.034
 heights-0.033
Token friend
Token ID1545
Feature activation+5.36e-02

Pos logit contributions
 aware+0.025
appointed+0.025
 understand+0.024
 active+0.023
 cautious+0.023
Neg logit contributions
ays-0.021
overs-0.018
ase-0.016
es-0.016
ions-0.016
Token to
Token ID284
Feature activation-1.54e-02

Pos logit contributions
ays+0.121
overs+0.103
otion+0.100
kins+0.096
ocket+0.095
Neg logit contributions
 aware-0.085
 understand-0.079
 active-0.078
 gentle-0.078
 cautious-0.077
Token his
Token ID465
Feature activation+2.28e-02

Pos logit contributions
 understand+0.026
 aware+0.024
 active+0.023
 gentle+0.023
 calm+0.023
Neg logit contributions
ays-0.036
overs-0.029
ase-0.027
ions-0.027
kins-0.027
Token owner
Token ID4870
Feature activation+2.38e-02

Pos logit contributions
ays+0.043
overs+0.038
es+0.037
ase+0.035
acey+0.034
Neg logit contributions
appointed-0.039
 aware-0.039
 hesitate-0.035
 pleasant-0.034
 matter-0.034
Token,
Token ID11
Feature activation+1.83e-02

Pos logit contributions
ays+0.055
overs+0.047
otion+0.045
ions+0.044
ase+0.043
Neg logit contributions
 aware-0.036
 understand-0.034
 active-0.033
 gentle-0.032
 cautious-0.032
Token a
Token ID257
Feature activation+2.97e-02

Pos logit contributions
ays+0.035
overs+0.031
es+0.029
ions+0.029
otion+0.028
Neg logit contributions
 aware-0.033
appointed-0.032
 understand-0.030
 active-0.030
 cautious-0.030
Token little
Token ID1310
Feature activation+4.94e-04

Pos logit contributions
overs+0.053
ays+0.051
acey+0.048
ase+0.048
robe+0.047
Neg logit contributions
appointed-0.050
 aware-0.047
 hesitate-0.040
 heights-0.040
rated-0.040
Token girl
Token ID2576
Feature activation+5.95e-02

Pos logit contributions
ays+0.001
overs+0.001
es+0.001
ase+0.001
ions+0.001
Neg logit contributions
appointed-0.001
 aware-0.001
 understand-0.001
 active-0.001
 pleasant-0.001
Token named
Token ID3706
Feature activation+1.13e-01

Pos logit contributions
ays+0.126
otion+0.099
overs+0.098
ions+0.095
ls+0.093
Neg logit contributions
 aware-0.105
 understand-0.099
 cautious-0.098
 gentle-0.097
 calm-0.097
Token day
Token ID1110
Feature activation+3.94e-02

Pos logit contributions
 aware+0.069
appointed+0.065
 cautious+0.062
 hesitate+0.062
 understand+0.060
Neg logit contributions
ays-0.064
overs-0.059
ase-0.056
ions-0.051
otion-0.051
Token,
Token ID11
Feature activation-2.14e-02

Pos logit contributions
ays+0.087
overs+0.076
ions+0.071
es+0.071
ase+0.070
Neg logit contributions
 aware-0.061
appointed-0.058
 understand-0.057
 active-0.056
 cautious-0.055
Token Tim
Token ID5045
Feature activation-8.76e-03

Pos logit contributions
 aware+0.040
appointed+0.038
 understand+0.038
 active+0.037
 cautious+0.037
Neg logit contributions
ays-0.041
overs-0.035
ions-0.032
es-0.032
ase-0.032
Tokenmy
Token ID1820
Feature activation-2.66e-02

Pos logit contributions
 aware+0.012
appointed+0.011
 understand+0.011
 active+0.011
 cautious+0.011
Neg logit contributions
ays-0.021
overs-0.018
es-0.018
ions-0.017
ase-0.017
Token's
Token ID338
Feature activation-4.02e-02

Pos logit contributions
 aware+0.041
 active+0.039
 understand+0.039
appointed+0.038
 calm+0.038
Neg logit contributions
ays-0.060
overs-0.052
ions-0.050
otion-0.049
es-0.048
Token mom
Token ID1995
Feature activation-2.96e-04

Pos logit contributions
 aware+0.063
 understand+0.061
 active+0.061
appointed+0.061
 gentle+0.060
Neg logit contributions
ays-0.087
overs-0.076
ions-0.073
es-0.072
otion-0.070
Tokenmy
Token ID1820
Feature activation-1.17e-03

Pos logit contributions
 aware+0.000
 understand+0.000
 gentle+0.000
 active+0.000
 calm+0.000
Neg logit contributions
ays-0.001
overs-0.001
ions-0.001
otion-0.001
es-0.001
Token told
Token ID1297
Feature activation+6.12e-02

Pos logit contributions
 aware+0.002
 understand+0.002
 active+0.001
 gentle+0.001
 calm+0.001
Neg logit contributions
ays-0.003
overs-0.003
ions-0.002
otion-0.002
ase-0.002
Token him
Token ID683
Feature activation-9.26e-03

Pos logit contributions
ays+0.100
overs+0.084
ions+0.076
es+0.075
otion+0.075
Neg logit contributions
 aware-0.131
appointed-0.126
 understand-0.123
 active-0.123
 cautious-0.122
Token they
Token ID484
Feature activation+4.90e-02

Pos logit contributions
 aware+0.016
 understand+0.015
appointed+0.015
 active+0.015
 gentle+0.014
Neg logit contributions
ays-0.019
overs-0.016
ase-0.015
ions-0.015
otion-0.015
Token were
Token ID547
Feature activation+5.82e-02

Pos logit contributions
ays+0.107
overs+0.091
ions+0.086
ase+0.084
es+0.083
Neg logit contributions
 understand-0.086
 aware-0.078
appointed-0.078
 active-0.077
 calm-0.075
Token down
Token ID866
Feature activation+5.90e-03

Pos logit contributions
ays+0.026
overs+0.021
es+0.020
ions+0.020
ase+0.020
Neg logit contributions
 aware-0.023
 understand-0.022
appointed-0.022
 cautious-0.021
 active-0.021
Token.
Token ID13
Feature activation+2.88e-02

Pos logit contributions
ays+0.013
overs+0.011
ase+0.010
ions+0.010
otion+0.010
Neg logit contributions
 aware-0.009
 calm-0.009
appointed-0.009
 understand-0.008
 active-0.008
Token
Token ID198
Feature activation+7.60e-03

Pos logit contributions
ays+0.060
overs+0.052
ions+0.049
es+0.049
ase+0.048
Neg logit contributions
 aware-0.049
appointed-0.047
 understand-0.045
 active-0.045
 cautious-0.044
Token
Token ID198
Feature activation-2.12e-03

Pos logit contributions
ays+0.018
overs+0.016
ase+0.015
ions+0.015
otion+0.015
Neg logit contributions
 aware-0.010
 understand-0.010
appointed-0.010
 calm-0.009
 gentle-0.009
Token"
Token ID1
Feature activation-2.64e-02

Pos logit contributions
 aware+0.003
appointed+0.003
 understand+0.003
 active+0.003
 cautious+0.003
Neg logit contributions
ays-0.005
overs-0.004
ions-0.004
es-0.004
otion-0.004
TokenHey
Token ID10814
Feature activation-1.53e-02

Pos logit contributions
 aware+0.043
appointed+0.039
 active+0.037
 understand+0.037
 cautious+0.037
Neg logit contributions
ays-0.059
ions-0.049
overs-0.048
es-0.048
otion-0.047
Token!"
Token ID2474
Feature activation+6.17e-05

Pos logit contributions
 aware+0.022
 understand+0.020
 gentle+0.020
appointed+0.020
 calm+0.020
Neg logit contributions
ays-0.036
overs-0.031
ase-0.029
ions-0.029
es-0.029
Token Mia
Token ID32189
Feature activation-3.32e-02

Pos logit contributions
ays+0.000
overs+0.000
ase+0.000
es+0.000
ions+0.000
Neg logit contributions
 aware-0.000
 understand-0.000
appointed-0.000
 calm-0.000
 gentle-0.000
Token says
Token ID1139
Feature activation+9.11e-02

Pos logit contributions
 aware+0.075
appointed+0.072
 active+0.070
 understand+0.070
 cautious+0.069
Neg logit contributions
ays-0.052
overs-0.041
es-0.039
ions-0.037
otion-0.037
Token.
Token ID13
Feature activation+3.07e-02

Pos logit contributions
ays+0.175
overs+0.150
es+0.139
ions+0.138
otion+0.136
Neg logit contributions
 aware-0.169
appointed-0.161
 understand-0.158
 active-0.157
 cautious-0.154
Token She
Token ID1375
Feature activation+1.03e-02

Pos logit contributions
ays+0.060
overs+0.051
es+0.048
otion+0.047
ions+0.047
Neg logit contributions
 aware-0.057
appointed-0.054
 active-0.053
 understand-0.053
 cautious-0.052
Token before
Token ID878
Feature activation+6.34e-03

Pos logit contributions
ays+0.056
overs+0.049
es+0.047
ions+0.046
otion+0.046
Neg logit contributions
 aware-0.043
appointed-0.040
 active-0.039
 understand-0.039
 cautious-0.038
Token.
Token ID13
Feature activation-1.39e-03

Pos logit contributions
ays+0.014
overs+0.012
ase+0.012
otion+0.011
ions+0.011
Neg logit contributions
 aware-0.010
appointed-0.009
 understand-0.009
 active-0.009
 calm-0.009
Token
Token ID198
Feature activation-1.32e-02

Pos logit contributions
 aware+0.002
appointed+0.002
 understand+0.002
 active+0.002
 cautious+0.002
Neg logit contributions
ays-0.003
overs-0.003
ions-0.003
es-0.003
otion-0.003
Token
Token ID198
Feature activation-2.19e-02

Pos logit contributions
 aware+0.021
appointed+0.020
 understand+0.020
 active+0.020
 cautious+0.019
Neg logit contributions
ays-0.028
overs-0.025
ions-0.023
es-0.023
ase-0.023
TokenBut
Token ID1537
Feature activation-2.89e-02

Pos logit contributions
 aware+0.026
appointed+0.025
 matter+0.023
 pleasant+0.023
 hesitate+0.023
Neg logit contributions
ays-0.052
overs-0.048
ase-0.047
es-0.047
ions-0.046
Token something
Token ID1223
Feature activation-2.66e-02

Pos logit contributions
 aware+0.040
appointed+0.037
 understand+0.037
 active+0.036
 cautious+0.036
Neg logit contributions
ays-0.070
overs-0.061
ase-0.058
ions-0.058
otion-0.057
Token even
Token ID772
Feature activation-1.48e-02

Pos logit contributions
 aware+0.035
appointed+0.033
 understand+0.031
 active+0.030
 cautious+0.029
Neg logit contributions
ays-0.064
overs-0.057
es-0.055
ions-0.054
otion-0.054
Token more
Token ID517
Feature activation-3.30e-02

Pos logit contributions
appointed+0.020
 aware+0.020
 understand+0.018
 active+0.018
 cautious+0.017
Neg logit contributions
ays-0.032
overs-0.029
es-0.029
ions-0.029
ons-0.027
Token incredible
Token ID8082
Feature activation+5.96e-03

Pos logit contributions
appointed+0.041
 aware+0.041
 understand+0.037
 active+0.035
 cautious+0.035
Neg logit contributions
ays-0.080
overs-0.073
ions-0.070
es-0.070
otion-0.067
Token happened
Token ID3022
Feature activation+2.66e-02

Pos logit contributions
ays+0.013
overs+0.011
acey+0.010
otion+0.010
ase+0.010
Neg logit contributions
 aware-0.009
 understand-0.009
appointed-0.009
 active-0.009
 cautious-0.009
Token.
Token ID13
Feature activation-1.03e-02

Pos logit contributions
ays+0.058
overs+0.049
ase+0.046
ions+0.046
otion+0.046
Neg logit contributions
 aware-0.043
appointed-0.040
 understand-0.040
 active-0.039
 cautious-0.039
Token he
Token ID339
Feature activation+1.50e-02

Pos logit contributions
 aware+0.013
appointed+0.013
 active+0.012
 understand+0.012
 cautious+0.012
Neg logit contributions
ays-0.013
overs-0.011
es-0.011
ions-0.011
otion-0.010
Token was
Token ID373
Feature activation+8.21e-03

Pos logit contributions
ays+0.029
es+0.026
overs+0.025
ions+0.023
ase+0.023
Neg logit contributions
 aware-0.026
appointed-0.025
 active-0.024
 understand-0.023
 pleasant-0.023
Token badly
Token ID11234
Feature activation-4.74e-03

Pos logit contributions
ays+0.016
overs+0.014
es+0.014
ions+0.013
otion+0.013
Neg logit contributions
 matter-0.012
appointed-0.012
Okay-0.012
 aware-0.011
 active-0.011
Token hurt
Token ID5938
Feature activation-1.83e-02

Pos logit contributions
appointed+0.008
 understand+0.007
Years+0.007
 matter+0.006
 nod+0.006
Neg logit contributions
ays-0.010
overs-0.009
es-0.009
ls-0.008
ions-0.008
Token.
Token ID13
Feature activation-8.09e-03

Pos logit contributions
 aware+0.036
appointed+0.036
 understand+0.035
 active+0.034
 pleasant+0.034
Neg logit contributions
ays-0.031
overs-0.028
es-0.026
ions-0.025
otion-0.024
Token He
Token ID679
Feature activation-1.04e-02

Pos logit contributions
 aware+0.013
appointed+0.012
 understand+0.012
 active+0.012
 cautious+0.012
Neg logit contributions
ays-0.017
overs-0.015
es-0.014
ions-0.014
ase-0.014
Token lay
Token ID3830
Feature activation-1.13e-02

Pos logit contributions
 aware+0.018
appointed+0.018
 active+0.017
 understand+0.017
 pleasant+0.016
Neg logit contributions
ays-0.019
es-0.017
overs-0.016
ions-0.015
ase-0.015
Token there
Token ID612
Feature activation+2.42e-03

Pos logit contributions
 aware+0.018
appointed+0.016
 active+0.016
 cautious+0.016
 calm+0.016
Neg logit contributions
ays-0.025
overs-0.022
ase-0.021
ions-0.021
otion-0.021
Token until
Token ID1566
Feature activation+4.38e-02

Pos logit contributions
ays+0.004
overs+0.004
es+0.004
ions+0.004
otion+0.004
Neg logit contributions
 aware-0.004
appointed-0.004
 active-0.004
 understand-0.004
 cautious-0.004
Token some
Token ID617
Feature activation-1.27e-02

Pos logit contributions
ays+0.089
overs+0.075
es+0.071
ase+0.071
ions+0.070
Neg logit contributions
 aware-0.076
appointed-0.074
 understand-0.072
 calm-0.071
 active-0.071
Token animals
Token ID4695
Feature activation-4.12e-02

Pos logit contributions
 aware+0.020
appointed+0.020
 active+0.019
 understand+0.019
 cautious+0.018
Neg logit contributions
ays-0.027
overs-0.024
ions-0.023
es-0.022
otion-0.022
Token boy
Token ID2933
Feature activation-9.90e-04

Pos logit contributions
 aware+0.073
appointed+0.069
 active+0.069
 understand+0.067
 calm+0.067
Neg logit contributions
ays-0.118
overs-0.103
es-0.099
ions-0.099
otion-0.098
Token said
Token ID531
Feature activation+9.67e-02

Pos logit contributions
 aware+0.001
 active+0.001
 understand+0.001
appointed+0.001
 gentle+0.001
Neg logit contributions
ays-0.002
overs-0.002
otion-0.002
ions-0.002
ase-0.002
Token,
Token ID11
Feature activation-3.52e-03

Pos logit contributions
ays+0.202
overs+0.177
es+0.166
ions+0.165
otion+0.162
Neg logit contributions
 aware-0.161
appointed-0.154
 understand-0.150
 active-0.148
 cautious-0.145
Token
Token ID6184
Feature activation+1.18e-02

Pos logit contributions
 aware+0.006
appointed+0.006
 understand+0.006
 active+0.006
 cautious+0.006
Neg logit contributions
ays-0.007
overs-0.006
es-0.006
ions-0.006
otion-0.005
Token
Token ID95
Feature activation+1.69e-03

Pos logit contributions
ays+0.024
overs+0.021
es+0.020
ions+0.019
ase+0.019
Neg logit contributions
 aware-0.020
appointed-0.020
 understand-0.019
 active-0.019
 cautious-0.019
Token
Token ID26391
Feature activation-2.84e-02

Pos logit contributions
ays+0.003
overs+0.003
ase+0.003
ions+0.003
es+0.003
Neg logit contributions
 aware-0.003
appointed-0.003
 understand-0.003
 active-0.003
 cautious-0.003
Token
Token ID129
Feature activation-4.42e-02

Pos logit contributions
 aware+0.045
appointed+0.043
 understand+0.042
 active+0.042
 cautious+0.041
Neg logit contributions
ays-0.061
overs-0.053
ions-0.052
otion-0.050
ase-0.049
Token
Token ID241
Feature activation-9.90e-02

Pos logit contributions
 aware+0.064
 understand+0.060
 matter+0.057
 hesitate+0.057
 pleasant+0.057
Neg logit contributions
ays-0.100
overs-0.090
otion-0.081
es-0.080
ase-0.080
TokenYes
Token ID5297
Feature activation+5.33e-02

Pos logit contributions
 aware+0.130
appointed+0.120
 active+0.110
 understand+0.109
 cautious+0.108
Neg logit contributions
ays-0.236
overs-0.219
ions-0.200
ase-0.197
otion-0.195
Token!
Token ID0
Feature activation+8.10e-03

Pos logit contributions
ays+0.126
overs+0.108
ase+0.105
ions+0.103
es+0.102
Neg logit contributions
 aware-0.074
appointed-0.073
 understand-0.072
 calm-0.071
 gentle-0.070
Token I
Token ID314
Feature activation-4.78e-02

Pos logit contributions
ays+0.014
overs+0.012
es+0.011
ase+0.011
ions+0.011
Neg logit contributions
 aware-0.016
appointed-0.016
 understand-0.015
 active-0.015
 cautious-0.015
Token
Token ID198
Feature activation-7.39e-03

Pos logit contributions
ays+0.012
overs+0.011
es+0.010
kins+0.010
ase+0.010
Neg logit contributions
appointed-0.008
 aware-0.008
 understand-0.008
 active-0.008
 gentle-0.007
Token
Token ID198
Feature activation-1.33e-02

Pos logit contributions
 aware+0.010
 understand+0.010
appointed+0.009
 gentle+0.009
 calm+0.009
Neg logit contributions
ays-0.018
overs-0.016
es-0.015
ions-0.015
otion-0.015
TokenBut
Token ID1537
Feature activation-3.61e-02

Pos logit contributions
 aware+0.019
appointed+0.018
 understand+0.017
 active+0.017
 cautious+0.017
Neg logit contributions
ays-0.031
overs-0.028
ions-0.026
es-0.026
ase-0.026
Token they
Token ID484
Feature activation-1.39e-02

Pos logit contributions
 aware+0.047
appointed+0.044
 understand+0.043
 calm+0.043
 gentle+0.043
Neg logit contributions
ays-0.090
overs-0.077
ase-0.075
es-0.072
otion-0.072
Token do
Token ID466
Feature activation-4.12e-02

Pos logit contributions
 aware+0.023
 understand+0.022
appointed+0.022
 active+0.021
 cautious+0.021
Neg logit contributions
ays-0.030
overs-0.026
ions-0.024
es-0.024
ase-0.024
Token not
Token ID407
Feature activation-7.26e-02

Pos logit contributions
 aware+0.061
appointed+0.057
 active+0.056
 understand+0.055
 cautious+0.054
Neg logit contributions
ays-0.094
overs-0.083
es-0.079
ions-0.078
otion-0.078
Token know
Token ID760
Feature activation+1.29e-02

Pos logit contributions
appointed+0.105
 aware+0.090
 active+0.086
 pleasant+0.086
 former+0.085
Neg logit contributions
overs-0.144
ays-0.143
otion-0.137
es-0.135
ions-0.129
Token that
Token ID326
Feature activation+8.28e-02

Pos logit contributions
ays+0.028
overs+0.024
ase+0.023
es+0.022
otion+0.022
Neg logit contributions
 aware-0.021
 understand-0.019
appointed-0.019
 gentle-0.019
 calm-0.019
Token one
Token ID530
Feature activation+2.47e-02

Pos logit contributions
ays+0.167
overs+0.143
ase+0.135
otion+0.130
es+0.130
Neg logit contributions
 aware-0.143
 understand-0.138
 calm-0.135
 active-0.134
 gentle-0.134
Token button
Token ID4936
Feature activation+3.45e-02

Pos logit contributions
ays+0.056
overs+0.049
es+0.046
ions+0.046
ase+0.045
Neg logit contributions
 aware-0.037
 understand-0.035
appointed-0.034
 active-0.034
 cautious-0.033
Token is
Token ID318
Feature activation+5.56e-02

Pos logit contributions
ays+0.065
overs+0.062
ase+0.055
ions+0.054
otion+0.053
Neg logit contributions
 understand-0.057
 aware-0.056
appointed-0.055
 calm-0.055
 matter-0.055
Token.
Token ID13
Feature activation-1.47e-02

Pos logit contributions
 aware+0.052
appointed+0.048
 understand+0.047
 active+0.046
 calm+0.045
Neg logit contributions
ays-0.086
overs-0.075
ions-0.071
ase-0.071
es-0.070
Token
Token ID6184
Feature activation-2.02e-03

Pos logit contributions
 aware+0.026
 active+0.024
appointed+0.024
 understand+0.024
 cautious+0.024
Neg logit contributions
ays-0.030
overs-0.026
es-0.024
ions-0.024
ase-0.024
Token
Token ID95
Feature activation-8.80e-04

Pos logit contributions
 aware+0.004
appointed+0.003
 understand+0.003
 active+0.003
 cautious+0.003
Neg logit contributions
ays-0.004
overs-0.004
ions-0.003
es-0.003
ase-0.003
Token
Token ID26391
Feature activation-3.10e-02

Pos logit contributions
 aware+0.001
 understand+0.001
 active+0.001
appointed+0.001
 cautious+0.001
Neg logit contributions
ays-0.002
overs-0.002
ase-0.002
ions-0.001
es-0.001
Token
Token ID129
Feature activation-4.57e-02

Pos logit contributions
 aware+0.050
appointed+0.048
 understand+0.047
 cautious+0.045
 active+0.045
Neg logit contributions
ays-0.066
overs-0.058
ions-0.055
otion-0.054
ase-0.052
Token
Token ID241
Feature activation-9.90e-02

Pos logit contributions
 aware+0.070
 understand+0.066
 active+0.064
 hesitate+0.064
 pleasant+0.063
Neg logit contributions
ays-0.097
overs-0.086
otion-0.078
ase-0.078
es-0.078
TokenCan
Token ID6090
Feature activation-5.95e-02

Pos logit contributions
 aware+0.152
appointed+0.140
 active+0.137
 cautious+0.134
 understand+0.133
Neg logit contributions
ays-0.216
overs-0.195
ions-0.181
es-0.179
ase-0.177
Token I
Token ID314
Feature activation-7.17e-02

Pos logit contributions
 aware+0.132
appointed+0.129
 understand+0.127
 active+0.125
 cautious+0.124
Neg logit contributions
ays-0.091
overs-0.075
es-0.067
ions-0.067
ase-0.066
Token mark
Token ID1317
Feature activation-1.14e-02

Pos logit contributions
 aware+0.099
appointed+0.094
 active+0.089
 understand+0.088
 cautious+0.087
Neg logit contributions
ays-0.170
overs-0.151
ions-0.143
es-0.142
ase-0.140
Token something
Token ID1223
Feature activation-4.48e-03

Pos logit contributions
 aware+0.019
 understand+0.018
 active+0.018
appointed+0.018
 gentle+0.018
Neg logit contributions
ays-0.023
overs-0.020
ase-0.019
otion-0.019
ions-0.019
Token with
Token ID351
Feature activation+1.02e-01

Pos logit contributions
 aware+0.007
 understand+0.006
 active+0.006
appointed+0.006
 gentle+0.006
Neg logit contributions
ays-0.010
overs-0.009
ase-0.008
otion-0.008
es-0.008
Token.
Token ID13
Feature activation+2.39e-02

Pos logit contributions
ays+0.011
overs+0.009
otion+0.009
ions+0.009
ase+0.009
Neg logit contributions
 aware-0.008
appointed-0.007
 understand-0.007
 active-0.007
 gentle-0.007
Token The
Token ID383
Feature activation-8.33e-03

Pos logit contributions
ays+0.050
overs+0.043
ions+0.041
es+0.041
ase+0.040
Neg logit contributions
 aware-0.040
appointed-0.038
 understand-0.038
 active-0.037
 cautious-0.036
Token k
Token ID479
Feature activation-3.02e-02

Pos logit contributions
 aware+0.012
appointed+0.011
 active+0.011
 understand+0.011
 cautious+0.011
Neg logit contributions
ays-0.019
overs-0.017
ions-0.016
ase-0.016
es-0.016
Tokenite
Token ID578
Feature activation-6.91e-03

Pos logit contributions
 aware+0.053
appointed+0.050
 gentle+0.050
 cautious+0.050
 calm+0.049
Neg logit contributions
ays-0.050
overs-0.048
acey-0.046
kins-0.045
otion-0.045
Token goes
Token ID2925
Feature activation+3.39e-03

Pos logit contributions
 aware+0.011
 gentle+0.011
 calm+0.011
 understand+0.011
appointed+0.011
Neg logit contributions
ays-0.014
overs-0.013
ions-0.012
otion-0.011
ase-0.011
Token higher
Token ID2440
Feature activation-1.06e-01

Pos logit contributions
ays+0.009
overs+0.008
es+0.008
otion+0.008
ase+0.008
Neg logit contributions
appointed-0.003
 aware-0.003
 calm-0.003
 active-0.003
 gentle-0.003
Token and
Token ID290
Feature activation+2.49e-02

Pos logit contributions
 understand+0.190
 aware+0.190
appointed+0.189
 active+0.178
 pleasant+0.178
Neg logit contributions
ays-0.212
overs-0.176
es-0.170
ions-0.164
ls-0.162
Token higher
Token ID2440
Feature activation-9.65e-02

Pos logit contributions
ays+0.076
overs+0.067
es+0.066
ions+0.065
ls+0.065
Neg logit contributions
 aware-0.018
 understand-0.016
 active-0.015
appointed-0.015
 cautious-0.014
Token.
Token ID13
Feature activation+2.39e-02

Pos logit contributions
 aware+0.192
 understand+0.189
appointed+0.189
 active+0.183
 cautious+0.179
Neg logit contributions
ays-0.172
overs-0.142
es-0.135
ions-0.130
ls-0.126
Token Sara
Token ID24799
Feature activation-1.24e-02

Pos logit contributions
ays+0.040
overs+0.033
ions+0.031
es+0.031
ase+0.030
Neg logit contributions
 aware-0.049
appointed-0.048
 understand-0.047
 active-0.046
 cautious-0.046
Token feels
Token ID5300
Feature activation-2.49e-03

Pos logit contributions
 aware+0.024
appointed+0.024
 active+0.023
 pleasant+0.022
 understand+0.022
Neg logit contributions
ays-0.023
es-0.019
overs-0.018
ions-0.017
ls-0.017
Token<|endoftext|>
Token ID50256
Feature activation+7.31e-02

Pos logit contributions
appointed+0.012
 understand+0.012
 aware+0.012
 active+0.011
 calm+0.010
Neg logit contributions
ays-0.025
overs-0.022
ions-0.022
kins-0.021
es-0.021
TokenOnce
Token ID7454
Feature activation+7.60e-02

Pos logit contributions
ays+0.173
overs+0.157
ocket+0.149
otion+0.147
kins+0.145
Neg logit contributions
 calm-0.081
 understand-0.078
appointed-0.076
Slow-0.075
 aware-0.073
Token there
Token ID612
Feature activation-8.81e-03

Pos logit contributions
ays+0.201
overs+0.185
ions+0.178
es+0.177
ase+0.176
Neg logit contributions
 understand-0.079
 aware-0.074
 gentle-0.070
appointed-0.069
 steep-0.067
Token was
Token ID373
Feature activation-2.29e-02

Pos logit contributions
 understand+0.013
 aware+0.012
 calm+0.012
appointed+0.012
 gentle+0.012
Neg logit contributions
ays-0.020
overs-0.018
ions-0.017
es-0.016
ase-0.016
Token a
Token ID257
Feature activation+2.04e-03

Pos logit contributions
 aware+0.033
appointed+0.031
 understand+0.030
 active+0.030
 cautious+0.029
Neg logit contributions
ays-0.053
overs-0.047
ions-0.045
es-0.044
otion-0.043
Token curious
Token ID11040
Feature activation-1.07e-01

Pos logit contributions
ays+0.005
overs+0.004
acey+0.004
ase+0.004
es+0.004
Neg logit contributions
appointed-0.003
 aware-0.003
 understand-0.002
 hesitate-0.002
 cautious-0.002
Token little
Token ID1310
Feature activation-1.76e-02

Pos logit contributions
appointed+0.171
 aware+0.167
 understand+0.151
 cautious+0.148
 active+0.144
Neg logit contributions
ays-0.221
overs-0.212
ase-0.197
otion-0.187
ions-0.186
Token girl
Token ID2576
Feature activation+4.29e-02

Pos logit contributions
 aware+0.028
appointed+0.027
 understand+0.026
 active+0.026
 cautious+0.025
Neg logit contributions
ays-0.038
overs-0.033
ions-0.031
es-0.031
ase-0.031
Token called
Token ID1444
Feature activation+5.91e-02

Pos logit contributions
ays+0.075
overs+0.060
otion+0.057
ions+0.056
es+0.055
Neg logit contributions
 aware-0.090
 understand-0.084
 active-0.084
 cautious-0.083
 gentle-0.083
Token Am
Token ID1703
Feature activation+3.88e-02

Pos logit contributions
ays+0.135
overs+0.110
ase+0.106
es+0.104
ions+0.102
Neg logit contributions
 aware-0.091
 understand-0.088
appointed-0.087
 calm-0.084
 gentle-0.084
Tokena
Token ID64
Feature activation+2.45e-02

Pos logit contributions
ays+0.090
otion+0.079
overs+0.079
ions+0.078
ase+0.074
Neg logit contributions
 aware-0.054
 understand-0.049
 gentle-0.048
 matter-0.045
 calm-0.045
Token,
Token ID11
Feature activation-2.10e-03

Pos logit contributions
ays+0.182
overs+0.161
es+0.156
otion+0.151
ions+0.149
Neg logit contributions
 aware-0.174
appointed-0.169
 active-0.160
 understand-0.160
 cautious-0.157
Token
Token ID6184
Feature activation+1.48e-02

Pos logit contributions
 aware+0.004
appointed+0.004
 active+0.004
 understand+0.004
 matter+0.004
Neg logit contributions
ays-0.003
overs-0.003
es-0.003
otion-0.003
ions-0.003
Token
Token ID95
Feature activation+5.09e-03

Pos logit contributions
ays+0.028
overs+0.024
ions+0.022
es+0.022
ase+0.022
Neg logit contributions
 aware-0.027
appointed-0.026
 understand-0.026
 active-0.025
 cautious-0.025
Token
Token ID26391
Feature activation-2.76e-02

Pos logit contributions
ays+0.010
overs+0.009
ase+0.008
ions+0.008
es+0.008
Neg logit contributions
 aware-0.009
appointed-0.008
 understand-0.008
 active-0.008
 cautious-0.008
Token
Token ID129
Feature activation-4.38e-02

Pos logit contributions
 aware+0.045
appointed+0.042
 understand+0.042
 active+0.041
 cautious+0.041
Neg logit contributions
ays-0.059
overs-0.051
ions-0.049
otion-0.048
ase-0.046
Token
Token ID241
Feature activation-9.88e-02

Pos logit contributions
 aware+0.066
 understand+0.063
 hesitate+0.061
 pleasant+0.060
 matter+0.060
Neg logit contributions
ays-0.096
overs-0.084
ase-0.076
es-0.076
otion-0.076
TokenLet
Token ID5756
Feature activation-7.28e-02

Pos logit contributions
 aware+0.162
appointed+0.152
 active+0.145
 understand+0.143
 cautious+0.143
Neg logit contributions
ays-0.205
overs-0.184
ions-0.170
ase-0.167
otion-0.167
Tokenâ
Token ID22940
Feature activation-5.30e-02

Pos logit contributions
 aware+0.128
appointed+0.122
 understand+0.120
 active+0.118
 cautious+0.117
Neg logit contributions
ays-0.147
overs-0.127
es-0.118
ions-0.118
ase-0.116
Token
Token ID26391
Feature activation-3.62e-02

Pos logit contributions
 aware+0.110
appointed+0.104
 active+0.103
 understand+0.102
 cautious+0.101
Neg logit contributions
ays-0.087
overs-0.076
ase-0.069
ions-0.068
otion-0.067
Token
Token ID8151
Feature activation-1.00e-01

Pos logit contributions
 aware+0.052
 understand+0.050
appointed+0.050
 active+0.048
 cautious+0.047
Neg logit contributions
ays-0.083
overs-0.072
ions-0.072
otion-0.069
ase-0.066
Tokens
Token ID82
Feature activation-3.73e-02

Pos logit contributions
appointed+0.162
 understand+0.159
 aware+0.147
So+0.142
 gentle+0.139
Neg logit contributions
ays-0.192
overs-0.171
ocket-0.165
kins-0.160
ions-0.159
Token He
Token ID679
Feature activation-3.58e-03

Pos logit contributions
 aware+0.017
appointed+0.016
 understand+0.016
 active+0.015
 cautious+0.015
Neg logit contributions
ays-0.022
overs-0.020
es-0.018
ions-0.018
ase-0.018
Token flew
Token ID13112
Feature activation-9.21e-03

Pos logit contributions
 aware+0.006
appointed+0.006
 understand+0.005
 active+0.005
 cautious+0.005
Neg logit contributions
ays-0.007
overs-0.006
es-0.006
ions-0.006
ase-0.006
Token up
Token ID510
Feature activation-4.85e-02

Pos logit contributions
 aware+0.016
 understand+0.015
appointed+0.015
 active+0.014
 cautious+0.014
Neg logit contributions
ays-0.019
overs-0.016
ions-0.015
es-0.015
ls-0.014
Token to
Token ID284
Feature activation-5.06e-02

Pos logit contributions
 aware+0.072
appointed+0.068
 understand+0.067
 active+0.064
 cautious+0.063
Neg logit contributions
ays-0.112
overs-0.100
ions-0.094
es-0.093
ase-0.092
Token a
Token ID257
Feature activation-7.05e-03

Pos logit contributions
 aware+0.080
appointed+0.076
 understand+0.076
 active+0.074
 cautious+0.072
Neg logit contributions
ays-0.111
overs-0.096
es-0.091
ions-0.090
ase-0.090
Token tall
Token ID7331
Feature activation-1.14e-01

Pos logit contributions
appointed+0.011
 aware+0.011
 understand+0.010
 active+0.010
 cautious+0.009
Neg logit contributions
ays-0.015
overs-0.014
ase-0.013
ions-0.012
es-0.012
Token branch
Token ID8478
Feature activation-2.71e-03

Pos logit contributions
 aware+0.193
appointed+0.189
 understand+0.182
 active+0.175
 cautious+0.173
Neg logit contributions
ays-0.227
overs-0.209
ase-0.193
ions-0.187
es-0.184
Token and
Token ID290
Feature activation+1.44e-02

Pos logit contributions
 aware+0.004
appointed+0.004
 understand+0.004
 active+0.003
 cautious+0.003
Neg logit contributions
ays-0.006
overs-0.006
ions-0.005
es-0.005
otion-0.005
Token brought
Token ID3181
Feature activation+2.54e-02

Pos logit contributions
ays+0.027
es+0.023
overs+0.023
ions+0.022
ls+0.021
Neg logit contributions
 aware-0.025
appointed-0.024
 understand-0.024
 active-0.023
 cautious-0.022
Token down
Token ID866
Feature activation-2.63e-02

Pos logit contributions
ays+0.057
overs+0.048
otion+0.048
ase+0.046
es+0.045
Neg logit contributions
 aware-0.039
 understand-0.038
 gentle-0.037
 calm-0.037
 active-0.036
Token a
Token ID257
Feature activation-9.10e-03

Pos logit contributions
 aware+0.037
 understand+0.035
appointed+0.034
 active+0.034
 calm+0.034
Neg logit contributions
ays-0.062
overs-0.054
otion-0.052
ase-0.051
es-0.051