TOP ACTIVATIONS
MAX = 0.134

only and anything you make must stay outside . Don 't
is a gift that you must share with others ." And
a reminder that even our brains can be full of surprises
Amy all the things she must do to make her dream
it is a sign that brains weren 't meant to be
may discover that your fears will turn out to be kind
break and relax ? Trying to catch the bird is a
told Be ep that they must all be careful and not
explorers . I fear they must have been lost ."
there something you can do to make it better ?"
and dreaming of what it must be like to travel in
is too heavy and its wings are too weak . But
to make sure the copper pieces are fair . We need
bad thing and that everything will be okay ."
To find the world you must look in your heart .
it wisely and your wound will be healed in no time
help build strong friendship which will last a lifetime . <|endoftext|>
blue is something all children must understand . Nature provides us
and remember that your dreams will come true if you try
law is something that everyone must obey . When you attach

MIN ACTIVATIONS
MIN = -0.137

<|endoftext|> Once there was a curious little girl called Martha .
a time there was a curious bear . One day he
a time there was a curious flower . The flower wanted
a time there was a curious little girl called Jane .
<|endoftext|> Once there was a curious little dancer named L ila
a time there was a curious boat . The boat wanted
a time there was a curious little girl called Daisy .
M olly was a very curious three year old . She
a time there was a curious little girl . She was
<|endoftext|> Once there was a curious little girl named Daisy .
a time there was a curious little girl . Every day
a time there was a curious girl . She wanted to
a time there was a curious mouse . He liked to
<|endoftext|> Once there was a curious little girl called Am a
a time there was a curious little boy . He wanted
a time there was a curious cat . The cat loved
a time there was a curious little girl . She was
a time there was a curious puppy called Benny . He
a time there was a curious little girl named Lucy .
time , there was a curious girl named Jill . She

INTERVAL 0.066 - 0.134
CONTAINS 3.003%

When Emma went to sleep at night , she was very
sorry . He wishes he did not bite Mia . He
his work . <|endoftext|> Once upon a time , there was
and the k itty wanted to play with it .
mixer was very special . One day , the

INTERVAL -0.002 - 0.066
CONTAINS 47.265%

get free . So Jay settled down
was big and ugly and had a long needle .
, " Wow , the bug is so pretty ! Can
sorry you had a terrible day . I came to comfort
and happy again . She puts on her purple dress and

INTERVAL -0.069 - -0.002
CONTAINS 47.919%

stole his fish . The boy said , " I 'm
the toy . Max was happy and agreed .
made his mom and dad angry . The next day ,
too , with its colorful surprise . The little
. He mixed blue with yellow to make green . He

INTERVAL -0.137 - -0.069
CONTAINS 1.813%

. Sam shouted with g lee as he found his mom
Tim saw a big , scary bug . He
chim ney ," Ben said , pointing to the old chim
Her mom smiled and said , " I 'm glad you
, look ! Jim said , pointing excited ly at something
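
The four INTERVAL blocks above appear to split the range from MIN = -0.137 to MAX = 0.134 into four equal-width bins (width about 0.068), with each CONTAINS value giving the share of activations that fall in that bin. The dump does not say how the bins were computed, so the sketch below is an assumption that happens to reproduce the listed bounds. The function names `interval_edges` and `interval_shares` are hypothetical.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning scheme)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def interval_shares(activations, lo, hi, n_bins=4):
    """Return the percentage of activations in each bin, ordered from lowest to highest bin."""
    if not activations:
        raise ValueError("activations must be non-empty")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        # Clamp so that a value equal to hi lands in the last bin
        # and anything below lo lands in the first.
        idx = int((a - lo) / width)
        idx = max(0, min(idx, n_bins - 1))
        counts[idx] += 1
    total = len(activations)
    return [100.0 * c / total for c in counts]


# Edges implied by the MIN and MAX values reported in this dump.
edges = interval_edges(-0.137, 0.134)

# Toy activations, invented for illustration only (not taken from the dump).
shares = interval_shares([-0.1, -0.05, 0.0, 0.05, 0.1, 0.12], -0.137, 0.134)
```

With the reported extremes, the interior edges fall near -0.069, -0.002 and 0.066, matching the interval bounds listed above to within rounding.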
Token only
Token ID691
Feature activation+2.29e-02

Pos logit contributions
aches+0.055
 belong+0.054
 happen+0.051
irs+0.050
 yourself+0.049
Neg logit contributions
 "-0.049
 girl-0.045
-0.045
hog-0.045
 �-0.044
Token and
Token ID290
Feature activation+5.16e-02

Pos logit contributions
aches+0.041
 belong+0.040
 happen+0.038
 yourself+0.037
irs+0.037
Neg logit contributions
 "-0.039
 girl-0.037
 �-0.036
sed-0.036
elt-0.036
Token anything
Token ID1997
Feature activation+5.88e-02

Pos logit contributions
aches+0.089
 belong+0.086
irs+0.084
irl+0.080
ck+0.079
Neg logit contributions
 "-0.094
 �-0.084
 girl-0.083
sed-0.082
elt-0.081
Token you
Token ID345
Feature activation+8.07e-02

Pos logit contributions
aches+0.101
irs+0.099
ck+0.094
 ease+0.093
 yourself+0.093
Neg logit contributions
 "-0.107
 girl-0.096
hog-0.095
 �-0.093
-0.092
Token make
Token ID787
Feature activation+5.79e-02

Pos logit contributions
aches+0.118
 yourself+0.114
 ourselves+0.113
irs+0.113
 happen+0.108
Neg logit contributions
 "-0.153
-0.148
sed-0.142
 girl-0.141
 gir-0.137
Token must
Token ID1276
Feature activation+1.34e-01

Pos logit contributions
aches+0.099
 belong+0.096
ck+0.091
irs+0.089
 avoid+0.088
Neg logit contributions
 "-0.110
 girl-0.092
 �-0.092
-0.086
keeper-0.085
Token stay
Token ID2652
Feature activation+4.52e-02

Pos logit contributions
irs+0.204
cks+0.190
wy+0.183
eks+0.169
ws+0.167
Neg logit contributions
 "-0.240
 �-0.200
 lady-0.196
 man-0.196
 girl-0.190
Token outside
Token ID2354
Feature activation+4.44e-02

Pos logit contributions
aches+0.091
 belong+0.088
irs+0.087
 happen+0.084
 ease+0.082
Neg logit contributions
 "-0.068
-0.063
 girl-0.063
joy-0.061
keeper-0.059
Token.
Token ID13
Feature activation+2.45e-03

Pos logit contributions
aches+0.085
 belong+0.083
irs+0.079
 happen+0.077
ck+0.076
Neg logit contributions
 "-0.075
 girl-0.069
joy-0.067
-0.066
 �-0.066
Token Don
Token ID2094
Feature activation+6.12e-02

Pos logit contributions
 belong+0.006
aches+0.006
 happen+0.005
 yourself+0.005
 know+0.005
Neg logit contributions
 "-0.003
hog-0.003
elt-0.003
sed-0.003
-0.003
Token't
Token ID470
Feature activation+4.35e-02

Pos logit contributions
aches+0.087
ws+0.087
irs+0.086
 ease+0.086
ck+0.084
Neg logit contributions
 "-0.139
-0.121
 �-0.120
hog-0.116
 girl-0.116
Token is
Token ID318
Feature activation+4.15e-02

Pos logit contributions
aches+0.064
 belong+0.063
irs+0.059
 happen+0.059
 ease+0.058
Neg logit contributions
 "-0.062
-0.057
 girl-0.056
hog-0.056
keeper-0.055
Token a
Token ID257
Feature activation-1.15e-02

Pos logit contributions
 belong+0.066
irs+0.065
aches+0.065
 happen+0.063
 ease+0.060
Neg logit contributions
 "-0.083
 �-0.075
 gir-0.071
 girl-0.070
-0.069
Token gift
Token ID6979
Feature activation+8.05e-03

Pos logit contributions
elt+0.022
sed+0.022
hog+0.022
 Lee+0.021
 "+0.021
Neg logit contributions
 belong-0.018
aches-0.017
 happen-0.017
 yourself-0.017
 know-0.016
Token that
Token ID326
Feature activation+5.70e-02

Pos logit contributions
aches+0.016
 belong+0.016
 happen+0.015
 yourself+0.015
 know+0.014
Neg logit contributions
 "-0.012
hog-0.012
 girl-0.012
sed-0.012
-0.012
Token you
Token ID345
Feature activation+8.89e-02

Pos logit contributions
aches+0.087
irs+0.083
ck+0.078
irl+0.078
 yourself+0.077
Neg logit contributions
 "-0.116
 girl-0.106
 boy-0.101
 �-0.100
hog-0.100
Token must
Token ID1276
Feature activation+1.31e-01

Pos logit contributions
irs+0.132
aches+0.129
 yourself+0.124
 happen+0.122
irl+0.121
Neg logit contributions
 "-0.171
-0.160
 gir-0.152
 girl-0.149
 mouse-0.149
Token share
Token ID2648
Feature activation+6.63e-02

Pos logit contributions
irs+0.230
cks+0.199
inations+0.189
wy+0.181
lled+0.179
Neg logit contributions
 "-0.229
 lady-0.192
 man-0.192
 gard-0.185
 nodded-0.180
Token with
Token ID351
Feature activation+6.58e-02

Pos logit contributions
 belong+0.139
aches+0.136
irs+0.133
irl+0.131
 ease+0.128
Neg logit contributions
 "-0.105
-0.084
 �-0.082
 girl-0.081
keeper-0.081
Token others
Token ID1854
Feature activation+2.57e-02

Pos logit contributions
 belong+0.098
aches+0.096
 happen+0.092
irl+0.089
 achieve+0.085
Neg logit contributions
 "-0.139
 mouse-0.122
-0.121
 girl-0.121
 gir-0.116
Token."
Token ID526
Feature activation-2.43e-02

Pos logit contributions
 belong+0.041
aches+0.041
 happen+0.039
irs+0.038
 ease+0.037
Neg logit contributions
 "-0.050
-0.047
 girl-0.046
hog-0.046
 �-0.045
Token And
Token ID843
Feature activation-1.57e-02

Pos logit contributions
sed+0.040
hog+0.038
elt+0.038
joy+0.037
 girl+0.037
Neg logit contributions
 yourself-0.042
 belong-0.042
aches-0.041
 know-0.041
 ourselves-0.040
Token a
Token ID257
Feature activation-3.21e-02

Pos logit contributions
aches+0.070
 belong+0.067
irs+0.062
 happen+0.060
 yourself+0.060
Neg logit contributions
 "-0.064
 girl-0.061
 gir-0.058
 �-0.057
-0.057
Token reminder
Token ID15438
Feature activation+2.72e-02

Pos logit contributions
elt+0.057
 Lee+0.055
hog+0.055
sed+0.054
 "+0.053
Neg logit contributions
 belong-0.056
 happen-0.054
 know-0.053
aches-0.053
 avoid-0.050
Token that
Token ID326
Feature activation+3.53e-02

Pos logit contributions
 belong+0.051
aches+0.051
 yourself+0.049
irs+0.048
irl+0.048
Neg logit contributions
 "-0.044
keeper-0.041
hog-0.041
 �-0.041
 girl-0.040
Token even
Token ID772
Feature activation+4.06e-02

Pos logit contributions
aches+0.062
irs+0.058
 belong+0.058
 yourself+0.056
 ease+0.055
Neg logit contributions
 "-0.064
 girl-0.060
 �-0.058
-0.057
keeper-0.057
Token our
Token ID674
Feature activation+1.66e-02

Pos logit contributions
aches+0.071
 belong+0.070
irs+0.068
 happen+0.064
 ease+0.064
Neg logit contributions
 "-0.071
 girl-0.069
 �-0.064
 gir-0.064
 boy-0.063
Token brains
Token ID14290
Feature activation+1.31e-01

Pos logit contributions
 belong+0.027
aches+0.027
 happen+0.025
 yourself+0.025
irs+0.025
Neg logit contributions
 "-0.030
joy-0.030
sed-0.029
 girl-0.029
keeper-0.029
Token can
Token ID460
Feature activation+1.14e-01

Pos logit contributions
irs+0.188
 moderation+0.185
 yourself+0.182
irl+0.181
aches+0.180
Neg logit contributions
 "-0.253
-0.231
keeper-0.230
hog-0.227
 girl-0.226
Token be
Token ID307
Feature activation+5.43e-02

Pos logit contributions
irs+0.183
cks+0.169
 moderation+0.164
ws+0.162
irl+0.158
Neg logit contributions
 "-0.201
hog-0.179
-0.179
keeper-0.177
 rabbit-0.176
Token full
Token ID1336
Feature activation+4.01e-02

Pos logit contributions
irs+0.095
 belong+0.094
aches+0.092
 happen+0.091
irl+0.088
Neg logit contributions
 "-0.093
 �-0.084
 girl-0.082
keeper-0.081
-0.080
Token of
Token ID286
Feature activation+5.28e-02

Pos logit contributions
aches+0.075
 belong+0.073
irl+0.070
irs+0.069
 happen+0.069
Neg logit contributions
 "-0.067
 girl-0.062
 �-0.062
keeper-0.062
hog-0.061
Token surprises
Token ID24072
Feature activation+5.66e-02

Pos logit contributions
aches+0.089
 belong+0.087
irs+0.082
irl+0.080
 happen+0.077
Neg logit contributions
 "-0.093
 girl-0.088
joy-0.088
 �-0.086
 gir-0.084
Token Amy
Token ID14235
Feature activation-5.66e-03

Pos logit contributions
 belong+0.022
aches+0.022
 yourself+0.021
 happen+0.021
 ourselves+0.020
Neg logit contributions
 "-0.020
 girl-0.018
hog-0.018
 �-0.018
keeper-0.018
Token all
Token ID477
Feature activation+1.69e-02

Pos logit contributions
 "+0.009
sed+0.009
+0.009
elt+0.009
hog+0.009
Neg logit contributions
 belong-0.010
aches-0.010
 happen-0.010
 yourself-0.010
 know-0.010
Token the
Token ID262
Feature activation+2.05e-02

Pos logit contributions
aches+0.028
 belong+0.027
 yourself+0.027
 happen+0.026
 ourselves+0.025
Neg logit contributions
 "-0.031
 girl-0.030
 �-0.029
hog-0.029
joy-0.028
Token things
Token ID1243
Feature activation+1.72e-02

Pos logit contributions
aches+0.035
 belong+0.034
 yourself+0.032
irs+0.031
 ourselves+0.031
Neg logit contributions
 girl-0.037
 "-0.037
joy-0.035
 gir-0.035
 boy-0.035
Token she
Token ID673
Feature activation+4.74e-02

Pos logit contributions
aches+0.037
 belong+0.036
 yourself+0.036
 happen+0.035
 ourselves+0.034
Neg logit contributions
 "-0.023
hog-0.021
 �-0.021
 girl-0.021
keeper-0.021
Token must
Token ID1276
Feature activation+1.30e-01

Pos logit contributions
aches+0.115
irs+0.112
 ourselves+0.111
 yourself+0.108
 belong+0.107
Neg logit contributions
 "-0.057
sed-0.052
-0.052
 girl-0.051
elt-0.050
Token do
Token ID466
Feature activation+7.28e-02

Pos logit contributions
irs+0.225
 unnecessary+0.203
cks+0.199
eling+0.183
wy+0.181
Neg logit contributions
 "-0.241
 man-0.217
 lady-0.210
 nodded-0.209
 �-0.200
Token to
Token ID284
Feature activation+9.98e-02

Pos logit contributions
irs+0.175
aches+0.172
 belong+0.169
ck+0.167
 ease+0.161
Neg logit contributions
 "-0.095
 �-0.071
-0.068
joy-0.065
keeper-0.062
Token make
Token ID787
Feature activation+4.43e-02

Pos logit contributions
irs+0.176
cks+0.151
 yours+0.149
ws+0.143
 ours+0.140
Neg logit contributions
 "-0.158
 man-0.154
 girl-0.151
 nodded-0.148
 cheeks-0.147
Token her
Token ID607
Feature activation+3.58e-02

Pos logit contributions
 belong+0.074
aches+0.072
 appreciate+0.065
 avoid+0.064
irs+0.064
Neg logit contributions
 "-0.082
 girl-0.078
 �-0.074
 Lee-0.073
keeper-0.072
Token dream
Token ID4320
Feature activation+8.08e-02

Pos logit contributions
aches+0.060
irs+0.060
 belong+0.057
 yourself+0.057
 happen+0.054
Neg logit contributions
 girl-0.065
 "-0.064
 boy-0.061
keeper-0.061
joy-0.060
Token it
Token ID340
Feature activation+3.36e-02

Pos logit contributions
 belong+0.029
aches+0.029
 happen+0.027
 yourself+0.027
irs+0.027
Neg logit contributions
 "-0.027
 girl-0.025
 gir-0.025
 �-0.025
-0.025
Token is
Token ID318
Feature activation+3.95e-02

Pos logit contributions
aches+0.067
irs+0.066
 yourself+0.065
 belong+0.064
 happen+0.062
Neg logit contributions
 "-0.050
hog-0.047
 girl-0.047
keeper-0.047
-0.046
Token a
Token ID257
Feature activation-4.32e-02

Pos logit contributions
irs+0.068
aches+0.066
 belong+0.065
 achieve+0.064
 happen+0.063
Neg logit contributions
 "-0.071
 �-0.065
sed-0.063
 girl-0.062
 gir-0.061
Token sign
Token ID1051
Feature activation-8.05e-03

Pos logit contributions
elt+0.079
hog+0.073
 Lee+0.072
ked+0.071
sed+0.071
Neg logit contributions
 belong-0.076
 happen-0.072
aches-0.069
 know-0.067
 yourself-0.067
Token that
Token ID326
Feature activation+4.49e-02

Pos logit contributions
sed+0.014
 "+0.014
elt+0.014
joy+0.013
+0.013
Neg logit contributions
 belong-0.014
aches-0.014
 happen-0.013
 yourself-0.013
 know-0.013
Token brains
Token ID14290
Feature activation+1.30e-01

Pos logit contributions
aches+0.088
irs+0.083
 yourself+0.083
 belong+0.082
 ease+0.081
Neg logit contributions
 "-0.071
 girl-0.063
hog-0.062
keeper-0.062
 �-0.061
Token weren
Token ID6304
Feature activation+8.41e-02

Pos logit contributions
irs+0.218
 yourself+0.209
aches+0.208
 moderation+0.207
irl+0.206
Neg logit contributions
 "-0.232
-0.201
keeper-0.198
 girl-0.198
hog-0.198
Token't
Token ID470
Feature activation+5.09e-02

Pos logit contributions
 appreciate+0.132
aches+0.127
 moderation+0.125
irl+0.121
inations+0.120
Neg logit contributions
 "-0.168
 �-0.151
keeper-0.143
hog-0.139
-0.131
Token meant
Token ID4001
Feature activation+1.63e-02

Pos logit contributions
irl+0.092
aches+0.087
 achieve+0.086
irs+0.085
 yourself+0.083
Neg logit contributions
 "-0.091
 gir-0.078
 girl-0.078
 �-0.077
 mouse-0.074
Token to
Token ID284
Feature activation+9.04e-02

Pos logit contributions
aches+0.034
 belong+0.034
 yourself+0.032
 happen+0.032
irl+0.031
Neg logit contributions
 "-0.023
hog-0.021
sed-0.021
 �-0.021
keeper-0.021
Token be
Token ID307
Feature activation+3.64e-02

Pos logit contributions
irl+0.132
irs+0.130
inations+0.127
vern+0.123
aches+0.118
Neg logit contributions
 "-0.168
 girl-0.149
 gir-0.148
 mouse-0.147
 rabbit-0.146
Token may
Token ID743
Feature activation+8.95e-02

Pos logit contributions
irs+0.144
aches+0.139
 happen+0.136
 yourself+0.133
irl+0.132
Neg logit contributions
 "-0.131
-0.119
keeper-0.113
hog-0.112
 mouse-0.111
Token discover
Token ID7073
Feature activation+6.33e-02

Pos logit contributions
irs+0.148
wy+0.132
inations+0.130
ws+0.126
cks+0.124
Neg logit contributions
 "-0.177
 �-0.152
hog-0.142
 rabbit-0.140
 gir-0.137
Token that
Token ID326
Feature activation+4.10e-02

Pos logit contributions
aches+0.111
 belong+0.108
 know+0.101
ck+0.097
irs+0.097
Neg logit contributions
 "-0.118
 �-0.108
 girl-0.104
joy-0.100
-0.099
Token your
Token ID534
Feature activation+2.21e-02

Pos logit contributions
aches+0.064
irs+0.060
 belong+0.058
ck+0.058
irl+0.055
Neg logit contributions
 "-0.083
 girl-0.079
 �-0.075
 boy-0.075
 gir-0.073
Token fears
Token ID10251
Feature activation+1.08e-01

Pos logit contributions
 belong+0.043
aches+0.043
irs+0.041
 happen+0.041
 ourselves+0.040
Neg logit contributions
 "-0.034
 girl-0.033
sed-0.033
joy-0.032
-0.031
Token will
Token ID481
Feature activation+1.29e-01

Pos logit contributions
aches+0.190
irs+0.187
 moderation+0.176
irl+0.175
ck+0.173
Neg logit contributions
 "-0.192
 girl-0.167
hog-0.167
-0.166
 �-0.164
Token turn
Token ID1210
Feature activation+4.41e-02

Pos logit contributions
irs+0.216
irl+0.193
wy+0.189
cks+0.186
ws+0.181
Neg logit contributions
 "-0.244
 �-0.205
 rabbit-0.194
 man-0.188
 but-0.188
Token out
Token ID503
Feature activation+5.94e-02

Pos logit contributions
aches+0.092
 belong+0.088
 happen+0.082
irs+0.082
irl+0.081
Neg logit contributions
 "-0.064
 �-0.060
 girl-0.059
keeper-0.057
hog-0.057
Token to
Token ID284
Feature activation+1.03e-01

Pos logit contributions
aches+0.128
 belong+0.122
irs+0.117
irl+0.117
 happen+0.113
Neg logit contributions
 "-0.087
 �-0.079
 girl-0.075
joy-0.073
keeper-0.072
Token be
Token ID307
Feature activation+4.46e-02

Pos logit contributions
aches+0.162
irl+0.157
irs+0.155
wy+0.155
inations+0.147
Neg logit contributions
 "-0.195
 �-0.169
 girl-0.164
 mouse-0.163
 gir-0.159
Token kind
Token ID1611
Feature activation+5.89e-03

Pos logit contributions
aches+0.080
 belong+0.079
irs+0.079
 happen+0.077
irl+0.072
Neg logit contributions
 "-0.076
 girl-0.069
joy-0.069
 �-0.066
-0.066
Token break
Token ID2270
Feature activation+5.52e-02

Pos logit contributions
aches+0.027
 yourself+0.026
 belong+0.026
 ourselves+0.025
 happen+0.025
Neg logit contributions
 "-0.036
 girl-0.036
 �-0.035
joy-0.035
keeper-0.035
Token and
Token ID290
Feature activation+4.93e-02

Pos logit contributions
 belong+0.099
aches+0.094
irs+0.089
 happen+0.087
ck+0.087
Neg logit contributions
 "-0.101
 girl-0.094
 �-0.092
keeper-0.090
-0.090
Token relax
Token ID8960
Feature activation+3.94e-02

Pos logit contributions
aches+0.071
irs+0.069
 belong+0.068
 happen+0.063
vern+0.063
Neg logit contributions
 "-0.099
 girl-0.096
 �-0.093
keeper-0.092
 boy-0.091
Token?
Token ID30
Feature activation-1.31e-02

Pos logit contributions
 belong+0.077
aches+0.076
irs+0.073
 know+0.072
 happen+0.069
Neg logit contributions
 "-0.065
-0.061
 girl-0.059
 �-0.058
joy-0.056
Token Trying
Token ID31165
Feature activation+5.78e-02

Pos logit contributions
sed+0.017
hog+0.017
 "+0.017
 gir+0.016
joy+0.016
Neg logit contributions
 belong-0.027
 yourself-0.027
aches-0.027
 know-0.026
 happen-0.026
Token to
Token ID284
Feature activation+1.29e-01

Pos logit contributions
irs+0.108
 belong+0.105
ck+0.103
aches+0.103
 ease+0.098
Neg logit contributions
 "-0.098
 girl-0.088
 �-0.086
 boy-0.082
keeper-0.080
Token catch
Token ID4929
Feature activation+4.10e-02

Pos logit contributions
irs+0.222
cks+0.201
ws+0.185
lled+0.176
inations+0.174
Neg logit contributions
 "-0.215
 girl-0.210
 boy-0.200
 lady-0.195
 man-0.194
Token the
Token ID262
Feature activation-1.72e-02

Pos logit contributions
 belong+0.068
aches+0.067
 happen+0.062
irs+0.062
 achieve+0.061
Neg logit contributions
 "-0.075
 girl-0.071
 �-0.071
hog-0.070
keeper-0.068
Token bird
Token ID6512
Feature activation+3.11e-02

Pos logit contributions
 "+0.024
elt+0.024
+0.024
hog+0.023
joy+0.023
Neg logit contributions
 belong-0.035
aches-0.035
 yourself-0.034
 happen-0.034
 ourselves-0.032
Token is
Token ID318
Feature activation+3.20e-02

Pos logit contributions
aches+0.068
 belong+0.068
irs+0.067
 yourself+0.066
 happen+0.065
Neg logit contributions
 "-0.039
hog-0.037
 girl-0.036
 �-0.035
-0.034
Token a
Token ID257
Feature activation-1.78e-02

Pos logit contributions
 belong+0.053
aches+0.053
irs+0.052
 happen+0.051
 yourself+0.050
Neg logit contributions
 "-0.059
 �-0.054
sed-0.053
hog-0.053
 girl-0.052
Token told
Token ID1297
Feature activation+1.31e-02

Pos logit contributions
 yourself+0.049
aches+0.048
 belong+0.047
 happen+0.047
 ourselves+0.046
Neg logit contributions
 "-0.038
 girl-0.036
-0.035
 �-0.035
 gir-0.035
Token Be
Token ID1355
Feature activation+3.10e-02

Pos logit contributions
 belong+0.016
aches+0.016
 yourself+0.015
 happen+0.014
 achieve+0.013
Neg logit contributions
 "-0.030
 �-0.029
 girl-0.028
 Lee-0.028
hog-0.028
Tokenep
Token ID538
Feature activation+2.42e-02

Pos logit contributions
 belong+0.059
aches+0.059
 happen+0.058
irs+0.058
 ease+0.055
Neg logit contributions
 "-0.050
 girl-0.047
-0.046
joy-0.045
 �-0.045
Token that
Token ID326
Feature activation+1.06e-02

Pos logit contributions
aches+0.049
 belong+0.049
 yourself+0.047
 happen+0.047
irs+0.046
Neg logit contributions
 "-0.035
hog-0.032
-0.032
 girl-0.032
keeper-0.032
Token they
Token ID484
Feature activation+9.15e-02

Pos logit contributions
aches+0.019
 belong+0.019
 yourself+0.018
 happen+0.018
 know+0.017
Neg logit contributions
 "-0.018
 girl-0.017
hog-0.017
joy-0.017
sed-0.017
Token must
Token ID1276
Feature activation+1.28e-01

Pos logit contributions
 yourself+0.172
aches+0.172
irs+0.171
irl+0.165
 moderation+0.159
Neg logit contributions
 "-0.148
-0.136
 boy-0.133
 girl-0.132
hog-0.124
Token all
Token ID477
Feature activation+6.50e-02

Pos logit contributions
irs+0.229
cks+0.200
wy+0.192
inations+0.191
ieve+0.185
Neg logit contributions
 "-0.213
 man-0.182
 driver-0.178
 �-0.178
 lady-0.174
Token be
Token ID307
Feature activation+3.70e-02

Pos logit contributions
aches+0.085
irs+0.083
wy+0.079
irl+0.079
 yourself+0.076
Neg logit contributions
 "-0.140
 girl-0.127
 boy-0.127
 �-0.126
hog-0.123
Token careful
Token ID8161
Feature activation+3.23e-02

Pos logit contributions
 belong+0.068
aches+0.068
 happen+0.067
irs+0.064
 yourself+0.061
Neg logit contributions
 "-0.062
joy-0.058
 girl-0.055
-0.055
 gir-0.054
Token and
Token ID290
Feature activation+6.92e-02

Pos logit contributions
aches+0.074
 belong+0.073
irs+0.069
 yourself+0.068
 ease+0.067
Neg logit contributions
 "-0.042
 �-0.037
keeper-0.036
-0.036
hog-0.036
Token not
Token ID407
Feature activation+4.89e-02

Pos logit contributions
irs+0.098
aches+0.090
 ease+0.086
irl+0.085
hest+0.080
Neg logit contributions
 "-0.149
-0.133
 girl-0.132
 �-0.131
 boy-0.128
Token explorers
Token ID45531
Feature activation+6.67e-02

Pos logit contributions
 belong+0.006
aches+0.006
 happen+0.006
 yourself+0.006
 know+0.006
Neg logit contributions
 "-0.005
hog-0.005
sed-0.005
 girl-0.005
elt-0.005
Token.
Token ID13
Feature activation-2.13e-02

Pos logit contributions
aches+0.123
irl+0.116
 yourself+0.114
irs+0.114
 belong+0.110
Neg logit contributions
 "-0.116
 girl-0.105
-0.104
hog-0.103
keeper-0.102
Token I
Token ID314
Feature activation+7.72e-02

Pos logit contributions
joy+0.028
sed+0.027
hog+0.027
 girl+0.026
 gir+0.026
Neg logit contributions
 belong-0.045
 yourself-0.045
 know-0.043
 ourselves-0.043
aches-0.042
Token fear
Token ID3252
Feature activation+4.38e-02

Pos logit contributions
irs+0.133
 yourself+0.133
aches+0.128
 ourselves+0.125
 happen+0.117
Neg logit contributions
 "-0.138
 girl-0.135
-0.133
keeper-0.133
 boy-0.132
Token they
Token ID484
Feature activation+8.76e-02

Pos logit contributions
aches+0.084
 belong+0.079
irs+0.078
irl+0.077
 yourself+0.076
Neg logit contributions
 "-0.073
-0.067
 girl-0.066
keeper-0.065
hog-0.065
Token must
Token ID1276
Feature activation+1.26e-01

Pos logit contributions
aches+0.142
irs+0.139
irl+0.139
 yourself+0.135
 moderation+0.132
Neg logit contributions
 "-0.155
 girl-0.149
-0.147
 boy-0.145
keeper-0.142
Token have
Token ID423
Feature activation+7.27e-02

Pos logit contributions
irs+0.208
cks+0.180
wy+0.171
irl+0.170
inations+0.170
Neg logit contributions
 "-0.221
 man-0.199
 lady-0.196
 driver-0.189
 girl-0.187
Token been
Token ID587
Feature activation+4.59e-02

Pos logit contributions
irl+0.106
aches+0.103
irs+0.102
 achieve+0.102
iver+0.098
Neg logit contributions
 "-0.139
sed-0.128
joy-0.125
keeper-0.122
 gir-0.121
Token lost
Token ID2626
Feature activation+3.80e-02

Pos logit contributions
aches+0.080
irs+0.078
 belong+0.078
 happen+0.076
irl+0.075
Neg logit contributions
 "-0.079
joy-0.073
 girl-0.071
 gir-0.070
keeper-0.070
Token."
Token ID526
Feature activation-2.63e-02

Pos logit contributions
aches+0.063
 belong+0.061
 happen+0.057
irl+0.056
irs+0.056
Neg logit contributions
 "-0.074
 girl-0.069
keeper-0.068
-0.068
hog-0.067
Token
Token ID220
Feature activation+5.59e-02

Pos logit contributions
sed+0.039
elt+0.038
 gir+0.038
joy+0.037
 Pole+0.037
Neg logit contributions
 know-0.049
 belong-0.049
 yourself-0.048
 ourselves-0.048
aches-0.047
Token there
Token ID612
Feature activation+6.12e-02

Pos logit contributions
 belong+0.064
aches+0.061
 happen+0.060
 avoid+0.059
 yourself+0.059
Neg logit contributions
 "-0.086
 �-0.080
 girl-0.080
 gir-0.079
keeper-0.079
Token something
Token ID1223
Feature activation+4.37e-02

Pos logit contributions
aches+0.097
irs+0.095
ck+0.095
irl+0.092
 belong+0.087
Neg logit contributions
 "-0.122
 girl-0.111
 �-0.109
keeper-0.106
-0.106
Token you
Token ID345
Feature activation+8.84e-02

Pos logit contributions
aches+0.071
irs+0.068
 belong+0.067
 yourself+0.066
 ease+0.066
Neg logit contributions
 "-0.082
 girl-0.076
hog-0.076
 �-0.075
keeper-0.074
Token can
Token ID460
Feature activation+1.01e-01

Pos logit contributions
aches+0.126
irs+0.120
 ourselves+0.119
ck+0.112
 yourself+0.111
Neg logit contributions
 "-0.180
-0.166
 �-0.158
 girl-0.155
 Pole-0.152
Token do
Token ID466
Feature activation+9.10e-02

Pos logit contributions
ws+0.157
irs+0.155
 moderation+0.143
cks+0.143
 yourself+0.139
Neg logit contributions
 "-0.192
hog-0.168
 rabbit-0.163
 mouse-0.162
 �-0.159
Token to
Token ID284
Feature activation+1.25e-01

Pos logit contributions
irs+0.174
 ease+0.169
ck+0.167
aches+0.167
cks+0.159
Neg logit contributions
 "-0.156
 �-0.125
-0.120
hog-0.115
 girl-0.115
Token make
Token ID787
Feature activation+7.78e-02

Pos logit contributions
irs+0.222
ws+0.216
cks+0.205
ats+0.203
wy+0.198
Neg logit contributions
 "-0.178
 man-0.172
 lady-0.169
 rabbit-0.164
 girl-0.163
Token it
Token ID340
Feature activation+1.01e-01

Pos logit contributions
 belong+0.116
aches+0.116
ws+0.110
 avoid+0.109
 achieve+0.107
Neg logit contributions
 "-0.147
 girl-0.140
 mouse-0.134
 boy-0.133
 �-0.126
Token better
Token ID1365
Feature activation+6.13e-02

Pos logit contributions
irs+0.186
cks+0.163
ck+0.160
aches+0.157
ws+0.153
Neg logit contributions
 "-0.178
 girl-0.162
 boy-0.153
 �-0.153
keeper-0.152
Token?"
Token ID1701
Feature activation-1.99e-02

Pos logit contributions
irs+0.104
 ease+0.097
 belong+0.094
aches+0.094
vern+0.087
Neg logit contributions
 "-0.121
 girl-0.115
-0.111
 boy-0.110
keeper-0.103
Token
Token ID198
Feature activation+4.11e-02

Pos logit contributions
sed+0.037
 Pole+0.035
 gir+0.035
elt+0.034
hog+0.034
Neg logit contributions
aches-0.031
 belong-0.031
 know-0.031
 yourself-0.031
 happen-0.029
Token and
Token ID290
Feature activation+5.25e-03

Pos logit contributions
sed+0.067
elt+0.065
hog+0.063
+0.063
joy+0.063
Neg logit contributions
 happen-0.056
 belong-0.055
aches-0.055
 yourself-0.054
 ourselves-0.052
Token dreaming
Token ID28368
Feature activation+5.98e-03

Pos logit contributions
 belong+0.010
aches+0.010
 happen+0.009
 yourself+0.009
 know+0.009
Neg logit contributions
 "-0.009
hog-0.008
sed-0.008
elt-0.008
 girl-0.008
Token of
Token ID286
Feature activation+3.22e-02

Pos logit contributions
 belong+0.010
aches+0.010
 yourself+0.010
 happen+0.010
 know+0.009
Neg logit contributions
 "-0.011
hog-0.010
 girl-0.010
elt-0.010
joy-0.010
Token what
Token ID644
Feature activation+1.05e-02

Pos logit contributions
 belong+0.055
aches+0.054
 happen+0.048
 know+0.048
 yourself+0.048
Neg logit contributions
 "-0.061
joy-0.057
 �-0.056
 girl-0.055
 gir-0.054
Token it
Token ID340
Feature activation+6.20e-02

Pos logit contributions
aches+0.023
 belong+0.022
 yourself+0.022
 happen+0.021
 know+0.021
Neg logit contributions
 "-0.014
 girl-0.013
hog-0.013
-0.013
keeper-0.013
Token must
Token ID1276
Feature activation+1.25e-01

Pos logit contributions
aches+0.165
 ourselves+0.163
irs+0.159
 yourself+0.157
ck+0.154
Neg logit contributions
 "-0.067
 girl-0.054
-0.047
 boy-0.047
hog-0.044
Token be
Token ID307
Feature activation+5.60e-02

Pos logit contributions
irs+0.209
 unnecessary+0.208
wy+0.205
cks+0.199
ws+0.197
Neg logit contributions
 "-0.240
 man-0.195
,-0.190
 girl-0.187
 boy-0.184
Token like
Token ID588
Feature activation+1.61e-02

Pos logit contributions
irs+0.111
aches+0.103
 belong+0.100
 ourselves+0.100
 ease+0.098
Neg logit contributions
 "-0.089
joy-0.076
 girl-0.075
 gir-0.074
keeper-0.072
Token to
Token ID284
Feature activation+9.88e-02

Pos logit contributions
 belong+0.032
aches+0.032
 happen+0.031
 yourself+0.030
 know+0.030
Neg logit contributions
 "-0.025
hog-0.023
joy-0.022
 �-0.022
 girl-0.022
Token travel
Token ID3067
Feature activation+5.13e-02

Pos logit contributions
wy+0.174
uch+0.155
ats+0.154
aches+0.154
ws+0.150
Neg logit contributions
 "-0.170
 man-0.146
 woman-0.143
 �-0.143
 gir-0.137
Token in
Token ID287
Feature activation+7.61e-02

Pos logit contributions
aches+0.121
 belong+0.112
irs+0.110
ck+0.109
 know+0.107
Neg logit contributions
 "-0.068
 girl-0.060
keeper-0.060
 �-0.058
hog-0.056
Token is
Token ID318
Feature activation+1.81e-02

Pos logit contributions
aches+0.038
irs+0.036
 belong+0.036
 yourself+0.036
 happen+0.035
Neg logit contributions
 "-0.041
hog-0.039
sed-0.039
 gir-0.039
 girl-0.039
Token too
Token ID1165
Feature activation+3.30e-02

Pos logit contributions
aches+0.029
 belong+0.029
 happen+0.027
irs+0.027
 yourself+0.027
Neg logit contributions
 "-0.035
sed-0.033
 �-0.032
 gir-0.032
 girl-0.032
Token heavy
Token ID4334
Feature activation+2.66e-03

Pos logit contributions
 belong+0.066
aches+0.065
 ease+0.064
 happen+0.063
irs+0.063
Neg logit contributions
 "-0.051
 gir-0.047
 girl-0.046
-0.046
hog-0.044
Token and
Token ID290
Feature activation+4.66e-02

Pos logit contributions
 belong+0.006
aches+0.006
 happen+0.005
 yourself+0.005
 know+0.005
Neg logit contributions
 "-0.004
hog-0.003
sed-0.003
elt-0.003
 girl-0.003
Token its
Token ID663
Feature activation+1.64e-02

Pos logit contributions
aches+0.084
irs+0.084
 ease+0.079
 belong+0.078
iver+0.076
Neg logit contributions
 "-0.081
sed-0.073
 gir-0.072
 girl-0.072
 �-0.072
Token wings
Token ID12098
Feature activation+1.25e-01

Pos logit contributions
 belong+0.034
aches+0.034
irs+0.033
 happen+0.033
 yourself+0.033
Neg logit contributions
 "-0.022
sed-0.022
 girl-0.021
 gir-0.021
hog-0.020
Token are
Token ID389
Feature activation+5.13e-02

Pos logit contributions
irs+0.193
 moderation+0.164
aches+0.158
inations+0.151
irl+0.151
Neg logit contributions
 "-0.248
 mouth-0.217
 girl-0.216
hog-0.211
 boy-0.211
Token too
Token ID1165
Feature activation+4.46e-02

Pos logit contributions
irs+0.100
 achieve+0.096
 belong+0.095
aches+0.095
 happen+0.092
Neg logit contributions
 "-0.080
sed-0.072
 gir-0.071
 �-0.070
 girl-0.065
Token weak
Token ID4939
Feature activation+3.18e-02

Pos logit contributions
 belong+0.089
aches+0.088
irs+0.087
 ease+0.086
 achieve+0.085
Neg logit contributions
 "-0.066
-0.061
 gir-0.059
 girl-0.059
ometer-0.057
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
aches+0.070
 belong+0.066
irl+0.065
irs+0.063
 achieve+0.062
Neg logit contributions
 "-0.047
elt-0.041
 girl-0.041
hog-0.040
 �-0.040
Token But
Token ID887
Feature activation+2.26e-02

Pos logit contributions
joy+0.022
hog+0.021
sed+0.021
keeper+0.021
 girl+0.021
Neg logit contributions
 belong-0.037
 yourself-0.036
 ourselves-0.035
 know-0.035
aches-0.034
Token to
Token ID284
Feature activation+9.50e-02

Pos logit contributions
 belong+0.096
aches+0.094
 happen+0.093
irl+0.092
irs+0.091
Neg logit contributions
 "-0.049
elt-0.044
-0.042
keeper-0.042
ometer-0.042
Token make
Token ID787
Feature activation+5.69e-02

Pos logit contributions
irs+0.146
vern+0.139
rest+0.132
lled+0.131
irl+0.131
Neg logit contributions
 "-0.164
 gir-0.164
 mouse-0.146
 truck-0.146
 mouth-0.145
Token sure
Token ID1654
Feature activation+6.73e-02

Pos logit contributions
 belong+0.099
aches+0.091
 appreciate+0.086
irs+0.083
 happen+0.083
Neg logit contributions
 "-0.105
 gir-0.096
 girl-0.090
 �-0.089
 boy-0.089
Token the
Token ID262
Feature activation+1.01e-03

Pos logit contributions
irl+0.109
iver+0.108
irs+0.107
aches+0.105
 belong+0.105
Neg logit contributions
 "-0.128
keeper-0.112
 girl-0.109
 boy-0.108
 �-0.106
Token copper
Token ID15317
Feature activation+8.88e-03

Pos logit contributions
 belong+0.002
aches+0.002
 happen+0.002
 yourself+0.002
 know+0.002
Neg logit contributions
 "-0.002
hog-0.002
elt-0.001
sed-0.001
-0.001
Token pieces
Token ID5207
Feature activation+1.25e-01

Pos logit contributions
 belong+0.016
aches+0.016
 happen+0.015
 yourself+0.015
irs+0.015
Neg logit contributions
 "-0.015
sed-0.014
hog-0.014
 girl-0.014
-0.014
Token are
Token ID389
Feature activation+5.34e-02

Pos logit contributions
irs+0.175
 moderation+0.159
irl+0.155
 yourself+0.145
ck+0.140
Neg logit contributions
 "-0.252
 girl-0.234
keeper-0.232
 boy-0.227
hog-0.224
Token fair
Token ID3148
Feature activation+3.43e-02

Pos logit contributions
irs+0.096
 belong+0.095
aches+0.091
irl+0.091
 know+0.089
Neg logit contributions
 "-0.088
 �-0.076
 gir-0.075
keeper-0.074
 girl-0.072
Token.
Token ID13
Feature activation-3.80e-02

Pos logit contributions
 belong+0.062
aches+0.061
irs+0.058
irl+0.057
 happen+0.057
Neg logit contributions
 "-0.059
 girl-0.055
keeper-0.055
-0.053
elt-0.053
Token We
Token ID775
Feature activation+6.96e-02

Pos logit contributions
sed+0.052
joy+0.052
hog+0.049
 girl+0.048
elt+0.047
Neg logit contributions
 yourself-0.079
 belong-0.077
 know-0.077
 ourselves-0.076
 appreciate-0.072
Token need
Token ID761
Feature activation+5.11e-02

Pos logit contributions
 yourself+0.119
 happen+0.114
 ourselves+0.113
aches+0.112
irs+0.111
Neg logit contributions
 "-0.113
-0.111
 girl-0.110
 gir-0.108
 boy-0.106
Token bad
Token ID2089
Feature activation-3.04e-02

Pos logit contributions
elt+0.022
hog+0.022
sed+0.022
 "+0.021
 Lee+0.021
Neg logit contributions
 belong-0.025
aches-0.024
 happen-0.024
 yourself-0.023
 know-0.023
Token thing
Token ID1517
Feature activation+1.55e-02

Pos logit contributions
elt+0.049
hog+0.049
+0.047
 Pole+0.047
keeper+0.047
Neg logit contributions
 happen-0.057
 belong-0.057
 know-0.053
aches-0.053
 yourself-0.052
Token and
Token ID290
Feature activation+5.11e-02

Pos logit contributions
aches+0.029
 belong+0.029
 yourself+0.027
 happen+0.027
irs+0.027
Neg logit contributions
 "-0.025
hog-0.023
sed-0.023
elt-0.023
 girl-0.023
Token that
Token ID326
Feature activation+4.11e-02

Pos logit contributions
aches+0.087
irs+0.085
 belong+0.080
irl+0.079
ck+0.076
Neg logit contributions
 "-0.097
 �-0.087
 girl-0.086
 gir-0.085
sed-0.083
Token everything
Token ID2279
Feature activation+6.74e-02

Pos logit contributions
aches+0.071
irs+0.068
 belong+0.067
 yourself+0.064
irl+0.063
Neg logit contributions
 "-0.075
 girl-0.069
 �-0.067
-0.067
 gir-0.067
Token will
Token ID481
Feature activation+1.25e-01

Pos logit contributions
aches+0.124
irs+0.123
 ease+0.114
ck+0.113
 belong+0.112
Neg logit contributions
 "-0.111
 girl-0.102
-0.099
hog-0.098
 boy-0.097
Token be
Token ID307
Feature activation+4.94e-02

Pos logit contributions
irs+0.197
irl+0.194
cks+0.180
ws+0.180
 moderation+0.175
Neg logit contributions
 "-0.243
 �-0.208
 rabbit-0.192
 man-0.190
 girl-0.184
Token okay
Token ID8788
Feature activation+3.35e-02

Pos logit contributions
irs+0.085
aches+0.085
 belong+0.083
irl+0.081
 happen+0.079
Neg logit contributions
 "-0.088
 �-0.081
 girl-0.077
keeper-0.074
 gir-0.073
Token."
Token ID526
Feature activation-1.73e-02

Pos logit contributions
aches+0.059
 belong+0.057
irs+0.055
irl+0.054
ck+0.053
Neg logit contributions
 "-0.065
 girl-0.057
-0.056
 �-0.056
 Lee-0.055
Token [space]
Token ID220
Feature activation+3.58e-02

Pos logit contributions
sed+0.026
hog+0.025
 gir+0.025
elt+0.024
keeper+0.024
Neg logit contributions
 yourself-0.033
 know-0.033
 belong-0.032
 achieve-0.032
 ourselves-0.032
Token [newline]
Token ID198
Feature activation+3.73e-02

Pos logit contributions
 belong+0.079
aches+0.078
 happen+0.074
 yourself+0.069
ck+0.068
Neg logit contributions
 "-0.057
 �-0.051
-0.050
 Sure-0.048
elt-0.048
Token To
Token ID1675
Feature activation+9.25e-02

Pos logit contributions
sed+0.019
joy+0.019
hog+0.019
 girl+0.019
 gir+0.019
Neg logit contributions
 belong-0.035
 yourself-0.034
 know-0.033
aches-0.033
 ourselves-0.033
Token find
Token ID1064
Feature activation+5.26e-02

Pos logit contributions
inations+0.120
vern+0.119
irs+0.113
aches+0.113
 ease+0.111
Neg logit contributions
 "-0.182
 Lee-0.159
 girl-0.156
-0.150
 �-0.145
Token the
Token ID262
Feature activation-6.36e-03

Pos logit contributions
aches+0.087
 achieve+0.086
 belong+0.085
irl+0.081
 happen+0.080
Neg logit contributions
 "-0.100
 girl-0.093
 �-0.088
 gir-0.087
 mouse-0.086
Token world
Token ID995
Feature activation+4.32e-02

Pos logit contributions
 "+0.009
elt+0.009
hog+0.009
sed+0.009
+0.009
Neg logit contributions
 belong-0.012
aches-0.012
 happen-0.012
 yourself-0.012
 know-0.011
Token you
Token ID345
Feature activation+9.11e-02

Pos logit contributions
aches+0.085
 belong+0.082
irs+0.080
 ease+0.079
 know+0.079
Neg logit contributions
 "-0.069
hog-0.063
-0.062
 girl-0.062
keeper-0.061
Token must
Token ID1276
Feature activation+1.24e-01

Pos logit contributions
aches+0.142
irs+0.142
irl+0.137
 moderation+0.129
 happen+0.128
Neg logit contributions
 "-0.176
-0.161
 girl-0.152
 gir-0.150
 Pole-0.148
Token look
Token ID804
Feature activation+3.40e-02

Pos logit contributions
irs+0.221
wy+0.187
inations+0.184
cks+0.176
ws+0.170
Neg logit contributions
 "-0.223
 lady-0.195
 man-0.185
,-0.177
 girl-0.175
Token in
Token ID287
Feature activation+6.97e-02

Pos logit contributions
aches+0.063
 belong+0.063
 happen+0.060
irs+0.060
 yourself+0.059
Neg logit contributions
 "-0.056
 girl-0.051
 �-0.051
joy-0.051
keeper-0.050
Token your
Token ID534
Feature activation-8.33e-03

Pos logit contributions
aches+0.134
 belong+0.131
irs+0.119
 know+0.118
 happen+0.118
Neg logit contributions
 "-0.119
joy-0.108
 �-0.105
 girl-0.104
 gir-0.099
Token heart
Token ID2612
Feature activation+4.21e-02

Pos logit contributions
 "+0.012
hog+0.012
elt+0.012
sed+0.011
keeper+0.011
Neg logit contributions
 belong-0.017
aches-0.017
 yourself-0.016
 happen-0.016
 know-0.015
Token.
Token ID13
Feature activation-1.54e-02

Pos logit contributions
 belong+0.069
aches+0.066
irs+0.064
 happen+0.063
 ease+0.062
Neg logit contributions
 "-0.082
-0.076
 girl-0.075
keeper-0.074
 Pole-0.073
Token it
Token ID340
Feature activation+4.36e-02

Pos logit contributions
aches+0.103
 belong+0.099
irs+0.094
 happen+0.093
 achieve+0.092
Neg logit contributions
 "-0.111
 �-0.101
 girl-0.097
-0.095
keeper-0.092
Token wisely
Token ID32773
Feature activation+7.10e-02

Pos logit contributions
aches+0.089
 belong+0.087
irs+0.085
 happen+0.084
irl+0.083
Neg logit contributions
 "-0.065
 �-0.058
-0.058
hog-0.057
keeper-0.056
Token and
Token ID290
Feature activation+5.83e-02

Pos logit contributions
aches+0.153
 belong+0.149
irs+0.146
irl+0.145
ck+0.140
Neg logit contributions
 "-0.107
-0.097
 �-0.092
keeper-0.088
 girl-0.087
Token your
Token ID534
Feature activation+1.88e-02

Pos logit contributions
aches+0.094
irs+0.093
 belong+0.088
irl+0.085
iver+0.083
Neg logit contributions
 "-0.116
 �-0.106
 girl-0.103
keeper-0.100
 gir-0.099
Token wound
Token ID11699
Feature activation+9.17e-02

Pos logit contributions
 belong+0.036
aches+0.036
 happen+0.034
irs+0.034
 yourself+0.034
Neg logit contributions
 "-0.030
sed-0.029
 girl-0.029
joy-0.028
-0.028
Token will
Token ID481
Feature activation+1.24e-01

Pos logit contributions
irs+0.153
aches+0.141
 belong+0.138
 happen+0.132
irl+0.132
Neg logit contributions
 "-0.169
 girl-0.153
-0.149
keeper-0.148
 �-0.144
Token be
Token ID307
Feature activation+5.62e-02

Pos logit contributions
irs+0.203
cks+0.193
ws+0.168
aches+0.164
wy+0.164
Neg logit contributions
 "-0.231
 �-0.204
-0.190
keeper-0.188
 rabbit-0.188
Token healed
Token ID28557
Feature activation+5.11e-02

Pos logit contributions
irs+0.102
 belong+0.097
 happen+0.096
aches+0.094
 ourselves+0.091
Neg logit contributions
 "-0.095
 girl-0.088
 �-0.086
keeper-0.082
joy-0.082
Token in
Token ID287
Feature activation+8.68e-02

Pos logit contributions
aches+0.097
irs+0.095
 belong+0.093
irl+0.089
 ease+0.086
Neg logit contributions
 "-0.086
 girl-0.082
-0.080
keeper-0.080
 �-0.077
Token no
Token ID645
Feature activation+5.28e-02

Pos logit contributions
aches+0.153
 belong+0.144
irl+0.135
irs+0.134
 happen+0.133
Neg logit contributions
 "-0.152
 girl-0.143
 mouse-0.141
 eyes-0.130
 �-0.130
Token time
Token ID640
Feature activation+4.46e-02

Pos logit contributions
aches+0.087
irs+0.079
 belong+0.078
 ourselves+0.078
 achieve+0.077
Neg logit contributions
 girl-0.109
joy-0.100
 boy-0.098
 mouse-0.092
 "-0.090
Token help
Token ID1037
Feature activation+6.51e-02

Pos logit contributions
irs+0.156
cks+0.151
ws+0.149
 yours+0.148
aches+0.141
Neg logit contributions
 "-0.160
hog-0.144
ishly-0.140
-0.139
 �-0.137
Token build
Token ID1382
Feature activation+6.33e-02

Pos logit contributions
aches+0.089
irl+0.083
ws+0.082
 belong+0.081
 ease+0.079
Neg logit contributions
 "-0.138
 girl-0.130
 boy-0.121
 �-0.120
-0.118
Token strong
Token ID1913
Feature activation+1.67e-02

Pos logit contributions
aches+0.110
 belong+0.110
irs+0.100
 know+0.100
 happen+0.097
Neg logit contributions
 "-0.106
 girl-0.101
 boy-0.096
 �-0.095
joy-0.091
Token friendship
Token ID14738
Feature activation+4.59e-02

Pos logit contributions
aches+0.031
 belong+0.030
 happen+0.028
 yourself+0.028
 know+0.027
Neg logit contributions
 "-0.027
 girl-0.027
joy-0.026
hog-0.026
elt-0.026
Token which
Token ID543
Feature activation+4.80e-02

Pos logit contributions
aches+0.114
 belong+0.112
irs+0.108
irl+0.105
ck+0.105
Neg logit contributions
 "-0.050
 �-0.044
 girl-0.043
hog-0.043
keeper-0.042
Token will
Token ID481
Feature activation+1.24e-01

Pos logit contributions
aches+0.079
irs+0.073
ck+0.072
 yourself+0.072
irl+0.071
Neg logit contributions
 "-0.091
 girl-0.085
hog-0.083
 boy-0.082
sed-0.081
Token last
Token ID938
Feature activation+7.19e-02

Pos logit contributions
irl+0.187
irs+0.183
cks+0.177
wy+0.175
ws+0.169
Neg logit contributions
 "-0.229
 �-0.195
 rabbit-0.191
 girl-0.185
hog-0.184
Token a
Token ID257
Feature activation-1.46e-02

Pos logit contributions
aches+0.133
irs+0.123
ck+0.116
irl+0.115
eks+0.110
Neg logit contributions
 "-0.128
 girl-0.119
 �-0.117
 boy-0.112
joy-0.109
Token lifetime
Token ID10869
Feature activation+3.86e-02

Pos logit contributions
elt+0.026
sed+0.025
 "+0.025
 Lee+0.025
hog+0.025
Neg logit contributions
 belong-0.025
aches-0.024
 happen-0.024
 yourself-0.023
 know-0.023
Token.
Token ID13
Feature activation+2.80e-02

Pos logit contributions
aches+0.081
 belong+0.076
irl+0.074
irs+0.074
ck+0.073
Neg logit contributions
 "-0.058
 girl-0.053
 �-0.053
hog-0.052
-0.051
Token<|endoftext|>
Token ID50256
Feature activation+2.30e-02

Pos logit contributions
aches+0.061
 belong+0.058
 happen+0.057
ck+0.053
irl+0.053
Neg logit contributions
 "-0.043
 �-0.039
 Sure-0.038
elt-0.038
joy-0.038
Token blue
Token ID4171
Feature activation+1.25e-02

Pos logit contributions
 belong+0.070
aches+0.069
irs+0.068
 happen+0.066
 yourself+0.065
Neg logit contributions
 "-0.059
 �-0.053
 gir-0.051
sed-0.050
 girl-0.050
Token is
Token ID318
Feature activation+4.18e-02

Pos logit contributions
aches+0.018
 belong+0.018
 happen+0.016
 yourself+0.016
 know+0.016
Neg logit contributions
 "-0.026
elt-0.025
 girl-0.025
hog-0.025
-0.025
Token something
Token ID1223
Feature activation+3.04e-02

Pos logit contributions
 belong+0.080
aches+0.079
irs+0.079
 know+0.075
 yourself+0.074
Neg logit contributions
 "-0.068
 �-0.061
 gir-0.058
sed-0.057
 girl-0.057
Token all
Token ID477
Feature activation+6.14e-02

Pos logit contributions
aches+0.053
irs+0.051
 belong+0.051
 yourself+0.050
 ease+0.049
Neg logit contributions
 "-0.054
 �-0.049
hog-0.049
 girl-0.049
 gir-0.048
Token children
Token ID1751
Feature activation+6.59e-02

Pos logit contributions
irs+0.117
aches+0.113
vern+0.108
irl+0.107
 yourself+0.104
Neg logit contributions
 "-0.099
 girl-0.092
hog-0.086
 �-0.086
 boy-0.085
Token must
Token ID1276
Feature activation+1.24e-01

Pos logit contributions
aches+0.110
irs+0.106
 ease+0.104
 belong+0.101
 ourselves+0.099
Neg logit contributions
 "-0.122
-0.111
 girl-0.109
elt-0.105
 Pole-0.103
Token understand
Token ID1833
Feature activation+7.18e-02

Pos logit contributions
irs+0.215
wy+0.188
cks+0.185
inations+0.184
 unnecessary+0.183
Neg logit contributions
 "-0.225
 lady-0.183
 �-0.183
 man-0.181
hog-0.172
Token.
Token ID13
Feature activation-3.77e-02

Pos logit contributions
 belong+0.108
aches+0.104
irs+0.103
 ease+0.101
irl+0.100
Neg logit contributions
 "-0.151
 �-0.129
-0.125
 girl-0.123
 Lee-0.123
Token Nature
Token ID10362
Feature activation+3.97e-02

Pos logit contributions
joy+0.051
sed+0.051
 girl+0.048
hog+0.047
 gir+0.046
Neg logit contributions
 yourself-0.079
 know-0.076
 belong-0.076
 ourselves-0.073
 appreciate-0.071
Token provides
Token ID3769
Feature activation+6.04e-02

Pos logit contributions
aches+0.067
 belong+0.066
irs+0.064
 yourself+0.063
 happen+0.062
Neg logit contributions
 "-0.071
-0.066
 girl-0.066
hog-0.066
 gir-0.064
Token us
Token ID514
Feature activation+5.87e-02

Pos logit contributions
aches+0.094
 belong+0.092
 achieve+0.088
 happen+0.088
irs+0.087
Neg logit contributions
 "-0.117
 girl-0.108
 �-0.106
 gir-0.102
keeper-0.101
Token and
Token ID290
Feature activation+4.71e-02

Pos logit contributions
 belong+0.031
aches+0.030
 happen+0.029
 yourself+0.029
irs+0.029
Neg logit contributions
 "-0.025
hog-0.023
 girl-0.023
-0.023
elt-0.023
Token remember
Token ID3505
Feature activation+4.65e-02

Pos logit contributions
aches+0.077
irs+0.073
 belong+0.072
 happen+0.066
 yourself+0.065
Neg logit contributions
 "-0.091
 girl-0.086
 �-0.084
sed-0.082
-0.081
Token that
Token ID326
Feature activation+3.74e-02

Pos logit contributions
 belong+0.082
aches+0.080
irs+0.074
ck+0.074
 know+0.073
Neg logit contributions
 "-0.088
 �-0.080
-0.074
 girl-0.072
joy-0.072
Token your
Token ID534
Feature activation+2.77e-02

Pos logit contributions
aches+0.062
irs+0.059
 belong+0.059
ck+0.056
 happen+0.054
Neg logit contributions
 "-0.071
 girl-0.066
 �-0.064
 boy-0.062
-0.062
Token dreams
Token ID10625
Feature activation+1.03e-01

Pos logit contributions
aches+0.048
 belong+0.048
irs+0.046
 happen+0.045
 yourself+0.044
Neg logit contributions
 "-0.048
 girl-0.048
sed-0.048
joy-0.047
keeper-0.045
Token will
Token ID481
Feature activation+1.24e-01

Pos logit contributions
aches+0.182
irs+0.180
ck+0.175
 moderation+0.167
cks+0.164
Neg logit contributions
 "-0.184
-0.158
 girl-0.157
 �-0.150
keeper-0.148
Token come
Token ID1282
Feature activation+5.93e-02

Pos logit contributions
irs+0.194
cks+0.191
aches+0.185
irl+0.176
ws+0.171
Neg logit contributions
 "-0.248
 �-0.210
-0.199
 girl-0.197
 man-0.184
Token true
Token ID2081
Feature activation+2.98e-02

Pos logit contributions
aches+0.111
 belong+0.099
irs+0.098
irl+0.095
 happen+0.094
Neg logit contributions
 "-0.107
 �-0.097
keeper-0.095
 girl-0.095
-0.095
Token if
Token ID611
Feature activation+5.02e-02

Pos logit contributions
aches+0.058
 belong+0.057
irs+0.054
 happen+0.053
irl+0.053
Neg logit contributions
 "-0.049
 girl-0.045
keeper-0.045
-0.044
 �-0.044
Token you
Token ID345
Feature activation+9.83e-02

Pos logit contributions
aches+0.065
 belong+0.060
irs+0.057
 know+0.055
ck+0.055
Neg logit contributions
 "-0.115
 girl-0.107
-0.104
 �-0.103
keeper-0.103
Token try
Token ID1949
Feature activation+6.98e-02

Pos logit contributions
irs+0.138
aches+0.127
cks+0.124
 yourself+0.122
 ourselves+0.120
Neg logit contributions
 "-0.196
 girl-0.183
-0.180
 nodded-0.180
 �-0.178
Token law
Token ID1099
Feature activation-1.61e-02

Pos logit contributions
aches+0.038
 belong+0.037
irs+0.037
 happen+0.036
ck+0.036
Neg logit contributions
 "-0.032
 girl-0.030
 gir-0.030
-0.029
hog-0.029
Token is
Token ID318
Feature activation+2.52e-02

Pos logit contributions
joy+0.027
sed+0.027
elt+0.026
 "+0.026
 Pole+0.026
Neg logit contributions
 belong-0.028
aches-0.028
 yourself-0.027
 happen-0.027
 know-0.026
Token something
Token ID1223
Feature activation+9.12e-03

Pos logit contributions
aches+0.042
 belong+0.042
 happen+0.040
irs+0.039
 yourself+0.039
Neg logit contributions
 "-0.047
 �-0.044
 gir-0.042
sed-0.042
hog-0.042
Token that
Token ID326
Feature activation+5.12e-02

Pos logit contributions
aches+0.015
 belong+0.015
 yourself+0.014
 happen+0.014
 know+0.014
Neg logit contributions
 "-0.017
hog-0.016
 girl-0.016
sed-0.016
elt-0.016
Token everyone
Token ID2506
Feature activation+3.56e-02

Pos logit contributions
aches+0.081
irs+0.077
 belong+0.076
ck+0.074
 ourselves+0.073
Neg logit contributions
 "-0.098
 girl-0.090
 �-0.086
 gir-0.086
 boy-0.086
Token must
Token ID1276
Feature activation+1.23e-01

Pos logit contributions
aches+0.072
 belong+0.070
 yourself+0.069
irs+0.068
 ourselves+0.068
Neg logit contributions
 "-0.052
 girl-0.049
-0.048
hog-0.048
elt-0.046
Token obey
Token ID22389
Feature activation+4.47e-02

Pos logit contributions
irs+0.190
 unnecessary+0.173
cks+0.166
wy+0.164
 moderation+0.162
Neg logit contributions
 "-0.247
 man-0.222
 lady-0.221
 �-0.202
 woman-0.199
Token.
Token ID13
Feature activation-4.46e-02

Pos logit contributions
aches+0.075
 belong+0.072
irl+0.066
 happen+0.065
inations+0.065
Neg logit contributions
 "-0.088
 �-0.081
 Lee-0.079
 girl-0.077
-0.076
Token When
Token ID1649
Feature activation+1.73e-02

Pos logit contributions
sed+0.064
hog+0.055
 gir+0.054
keeper+0.053
joy+0.053
Neg logit contributions
 yourself-0.094
 know-0.089
 belong-0.087
 ourselves-0.087
irs-0.086
Token you
Token ID345
Feature activation+8.85e-02

Pos logit contributions
aches+0.031
 belong+0.030
 happen+0.029
 yourself+0.028
 know+0.028
Neg logit contributions
 "-0.030
 girl-0.029
-0.028
 Lee-0.028
joy-0.028
Token attach
Token ID10199
Feature activation+5.58e-02

Pos logit contributions
 yourself+0.130
aches+0.125
 possible+0.117
 ourselves+0.116
 yours+0.115
Neg logit contributions
-0.169
 "-0.167
ishly-0.157
 nodded-0.155
 �-0.154
Token<|endoftext|>
Token ID50256
Feature activation+7.53e-03

Pos logit contributions
 belong+0.002
aches+0.002
 happen+0.002
 yourself+0.002
 know+0.002
Neg logit contributions
 "-0.002
hog-0.002
sed-0.002
elt-0.002
 girl-0.002
TokenOnce
Token ID7454
Feature activation+4.95e-02

Pos logit contributions
 belong+0.014
aches+0.014
 happen+0.013
 yourself+0.013
 know+0.012
Neg logit contributions
 "-0.013
hog-0.012
elt-0.012
 girl-0.012
joy-0.012
Token there
Token ID612
Feature activation+1.72e-02

Pos logit contributions
 yourself+0.108
 belong+0.107
aches+0.106
 happen+0.101
 ourselves+0.101
Neg logit contributions
 "-0.076
 girl-0.067
 �-0.066
joy-0.064
 boy-0.063
Token was
Token ID373
Feature activation+2.88e-02

Pos logit contributions
aches+0.036
 belong+0.035
 yourself+0.035
 happen+0.034
irs+0.033
Neg logit contributions
 "-0.025
 girl-0.023
 �-0.023
-0.023
hog-0.022
Token a
Token ID257
Feature activation-7.64e-02

Pos logit contributions
aches+0.051
 belong+0.049
 happen+0.047
 yourself+0.046
 achieve+0.045
Neg logit contributions
 "-0.052
 girl-0.051
elt-0.051
 �-0.049
joy-0.049
Token curious
Token ID11040
Feature activation-1.37e-01

Pos logit contributions
ishly+0.148
elt+0.147
sed+0.140
ked+0.139
edly+0.132
Neg logit contributions
 avoid-0.121
 have-0.121
 know-0.116
irl-0.113
 enjoy-0.112
Token little
Token ID1310
Feature activation-1.09e-01

Pos logit contributions
ked+0.242
sed+0.242
ishly+0.239
ider+0.235
edly+0.231
Neg logit contributions
 know-0.195
 value-0.193
 belong-0.192
 have-0.186
?"-0.186
Token girl
Token ID2576
Feature activation-3.33e-02

Pos logit contributions
ically+0.172
ked+0.169
edly+0.169
atively+0.169
sed+0.168
Neg logit contributions
 belong-0.183
 you-0.178
 live-0.177
 need-0.176
 us-0.173
Token called
Token ID1444
Feature activation-2.55e-02

Pos logit contributions
 Pole+0.060
sed+0.059
joy+0.057
keeper+0.056
hog+0.054
Neg logit contributions
 belong-0.056
 know-0.054
 yourself-0.052
aches-0.052
 appreciate-0.051
Token Martha
Token ID27243
Feature activation-1.34e-02

Pos logit contributions
sed+0.035
+0.035
elt+0.035
hog+0.034
keeper+0.034
Neg logit contributions
 belong-0.053
 yourself-0.050
aches-0.050
 know-0.050
 happen-0.050
Token.
Token ID13
Feature activation-7.75e-03

Pos logit contributions
sed+0.025
 Pole+0.025
elt+0.024
joy+0.024
hog+0.024
Neg logit contributions
 belong-0.022
aches-0.021
 know-0.020
 happen-0.019
 yourself-0.019
Token a
Token ID257
Feature activation+3.05e-02

Pos logit contributions
aches+0.190
irs+0.189
 belong+0.181
ws+0.173
 happen+0.172
Neg logit contributions
 "-0.107
 girl-0.091
 boy-0.088
 �-0.080
 monkey-0.077
Token time
Token ID640
Feature activation-9.45e-03

Pos logit contributions
 belong+0.040
aches+0.040
 yourself+0.039
 happen+0.037
 ourselves+0.037
Neg logit contributions
 "-0.067
joy-0.066
 �-0.065
 girl-0.065
-0.065
Token there
Token ID612
Feature activation+2.59e-02

Pos logit contributions
sed+0.010
hog+0.010
elt+0.010
keeper+0.009
 girl+0.009
Neg logit contributions
 belong-0.022
aches-0.022
 happen-0.022
 yourself-0.021
 know-0.021
Token was
Token ID373
Feature activation+1.90e-02

Pos logit contributions
aches+0.060
 belong+0.058
 yourself+0.058
 ourselves+0.056
irs+0.056
Neg logit contributions
 "-0.034
 girl-0.030
 �-0.030
 boy-0.029
-0.029
Token a
Token ID257
Feature activation-8.93e-02

Pos logit contributions
aches+0.032
 belong+0.031
 happen+0.030
 yourself+0.029
 achieve+0.028
Neg logit contributions
 "-0.036
elt-0.035
 girl-0.035
joy-0.034
 �-0.034
Token curious
Token ID11040
Feature activation-1.36e-01

Pos logit contributions
elt+0.190
ishly+0.187
ked+0.176
edly+0.172
sed+0.170
Neg logit contributions
 have-0.125
 avoid-0.122
 know-0.119
irl-0.118
 belong-0.116
Token bear
Token ID6842
Feature activation-2.71e-02

Pos logit contributions
sed+0.236
ishly+0.233
ked+0.233
ider+0.232
elt+0.228
Neg logit contributions
 know-0.198
 belong-0.193
 value-0.187
 have-0.178
?"-0.177
Token.
Token ID13
Feature activation-4.23e-02

Pos logit contributions
joy+0.045
sed+0.045
 Pole+0.044
elt+0.043
+0.043
Neg logit contributions
 belong-0.048
aches-0.048
 know-0.045
 happen-0.044
 yourself-0.044
Token One
Token ID1881
Feature activation-1.78e-02

Pos logit contributions
sed+0.065
keeper+0.063
+0.060
joy+0.058
 clerk+0.058
Neg logit contributions
 belong-0.082
 know-0.081
 yourself-0.078
 have-0.076
 appreciate-0.073
Token day
Token ID1110
Feature activation-4.79e-02

Pos logit contributions
joy+0.039
sed+0.038
 Lee+0.038
+0.038
keeper+0.037
Neg logit contributions
 belong-0.025
aches-0.023
 know-0.022
 happen-0.022
 avoid-0.020
Token he
Token ID339
Feature activation+1.68e-02

Pos logit contributions
hog+0.096
keeper+0.092
sed+0.090
ksh+0.087
ishly+0.085
Neg logit contributions
 belong-0.078
 happen-0.075
 are-0.073
aches-0.071
 have-0.071
Token a
Token ID257
Feature activation+3.10e-02

Pos logit contributions
aches+0.183
irs+0.182
 belong+0.172
 happen+0.166
ws+0.166
Neg logit contributions
 "-0.109
 girl-0.094
 boy-0.089
 �-0.083
elt-0.079
Token time
Token ID640
Feature activation-1.75e-02

Pos logit contributions
 belong+0.041
aches+0.041
 yourself+0.039
 happen+0.038
 ourselves+0.037
Neg logit contributions
 "-0.069
-0.067
joy-0.067
 girl-0.067
 �-0.066
Token there
Token ID612
Feature activation+2.74e-02

Pos logit contributions
keeper+0.020
sed+0.019
elt+0.019
hog+0.019
 gir+0.019
Neg logit contributions
 belong-0.040
aches-0.039
 happen-0.038
 yourself-0.038
 know-0.037
Token was
Token ID373
Feature activation+2.16e-02

Pos logit contributions
aches+0.064
 belong+0.062
 yourself+0.062
irs+0.060
 ourselves+0.060
Neg logit contributions
 "-0.036
 girl-0.032
 �-0.031
 boy-0.030
-0.030
Token a
Token ID257
Feature activation-9.35e-02

Pos logit contributions
aches+0.037
 belong+0.035
 happen+0.034
 yourself+0.033
 achieve+0.032
Neg logit contributions
 "-0.040
 girl-0.039
elt-0.038
joy-0.038
 �-0.038
Token curious
Token ID11040
Feature activation-1.36e-01

Pos logit contributions
elt+0.201
ishly+0.192
ked+0.189
edly+0.183
sed+0.180
Neg logit contributions
 have-0.133
 avoid-0.127
irl-0.126
 know-0.123
 belong-0.123
Token flower
Token ID15061
Feature activation-4.74e-02

Pos logit contributions
ked+0.236
sed+0.233
ider+0.230
elt+0.229
ishly+0.228
Neg logit contributions
 know-0.199
 belong-0.192
 value-0.188
?"-0.185
 have-0.183
Token.
Token ID13
Feature activation-3.43e-02

Pos logit contributions
joy+0.102
 Pole+0.097
sed+0.096
icol+0.095
 Lee+0.095
Neg logit contributions
aches-0.064
 belong-0.062
 yourself-0.056
 happen-0.055
 know-0.054
Token The
Token ID383
Feature activation-1.15e-02

Pos logit contributions
sed+0.056
keeper+0.053
hog+0.050
ked+0.050
 Pole+0.050
Neg logit contributions
 know-0.066
 belong-0.065
 yourself-0.064
 happen-0.060
 have-0.060
Token flower
Token ID15061
Feature activation-1.93e-02

Pos logit contributions
elt+0.018
joy+0.018
hog+0.018
 Lee+0.018
 Pole+0.018
Neg logit contributions
 belong-0.022
aches-0.022
 happen-0.021
 yourself-0.021
 know-0.020
Token wanted
Token ID2227
Feature activation+8.80e-03

Pos logit contributions
joy+0.035
 Lee+0.034
 Pole+0.033
elt+0.033
sed+0.033
Neg logit contributions
aches-0.033
 belong-0.032
 happen-0.030
 know-0.030
 yourself-0.030
Token a
Token ID257
Feature activation+4.04e-02

Pos logit contributions
aches+0.181
irs+0.175
 belong+0.169
ws+0.164
 achieve+0.161
Neg logit contributions
 "-0.104
 girl-0.089
 boy-0.087
 �-0.077
elt-0.076
Token time
Token ID640
Feature activation-8.57e-03

Pos logit contributions
aches+0.058
 yourself+0.057
 belong+0.057
 ourselves+0.054
 happen+0.054
Neg logit contributions
 "-0.084
joy-0.082
 �-0.082
 girl-0.081
-0.081
Token there
Token ID612
Feature activation+1.82e-02

Pos logit contributions
hog+0.009
sed+0.009
elt+0.009
keeper+0.009
 girl+0.009
Neg logit contributions
 belong-0.020
aches-0.020
 happen-0.020
 yourself-0.020
 know-0.019
Token was
Token ID373
Feature activation+2.35e-02

Pos logit contributions
aches+0.040
 belong+0.039
 yourself+0.038
 happen+0.037
 ourselves+0.037
Neg logit contributions
 "-0.025
 girl-0.023
hog-0.022
-0.022
 �-0.022
Token a
Token ID257
Feature activation-8.79e-02

Pos logit contributions
aches+0.041
 belong+0.040
 happen+0.038
 yourself+0.037
 achieve+0.036
Neg logit contributions
 "-0.043
elt-0.041
 girl-0.041
 �-0.041
joy-0.040
Token curious
Token ID11040
Feature activation-1.35e-01

Pos logit contributions
elt+0.192
ishly+0.183
ked+0.178
sed+0.173
edly+0.171
Neg logit contributions
 have-0.125
irl-0.121
 avoid-0.121
 know-0.118
 belong-0.117
Token little
Token ID1310
Feature activation-1.06e-01

Pos logit contributions
ishly+0.248
sed+0.247
ked+0.243
ider+0.243
elt+0.239
Neg logit contributions
 know-0.190
 value-0.181
 belong-0.178
?"-0.175
 have-0.172
Token girl
Token ID2576
Feature activation-3.35e-02

Pos logit contributions
ically+0.162
sed+0.161
ked+0.161
edly+0.159
atively+0.156
Neg logit contributions
 belong-0.187
 you-0.180
 need-0.178
 care-0.177
 live-0.175
Token called
Token ID1444
Feature activation-2.60e-02

Pos logit contributions
 Pole+0.061
sed+0.060
joy+0.059
keeper+0.057
elt+0.056
Neg logit contributions
 belong-0.056
 know-0.055
aches-0.052
 yourself-0.051
 appreciate-0.051
Token Jane
Token ID12091
Feature activation+1.66e-03

Pos logit contributions
elt+0.036
+0.035
sed+0.035
hog+0.034
 Pole+0.034
Neg logit contributions
 belong-0.054
 yourself-0.051
 know-0.051
 happen-0.051
aches-0.051
Token.
Token ID13
Feature activation-8.58e-03

Pos logit contributions
 belong+0.003
aches+0.003
 happen+0.003
 yourself+0.003
 know+0.003
Neg logit contributions
 "-0.003
hog-0.003
sed-0.003
elt-0.003
-0.003
Token<|endoftext|>
Token ID50256
Feature activation+1.18e-02

Pos logit contributions
sed+0.012
hog+0.012
 gir+0.011
 "+0.011
 girl+0.011
Neg logit contributions
 belong-0.016
aches-0.015
 yourself-0.015
 know-0.015
 happen-0.015
TokenOnce
Token ID7454
Feature activation+4.62e-02

Pos logit contributions
 belong+0.023
aches+0.023
 happen+0.022
 yourself+0.022
 know+0.021
Neg logit contributions
 "-0.019
 girl-0.017
-0.017
elt-0.017
joy-0.017
Token there
Token ID612
Feature activation+1.95e-02

Pos logit contributions
 yourself+0.100
 belong+0.098
aches+0.098
 ourselves+0.094
 happen+0.093
Neg logit contributions
 "-0.071
 girl-0.063
 �-0.063
joy-0.061
 boy-0.060
Token was
Token ID373
Feature activation+3.05e-02

Pos logit contributions
aches+0.042
 belong+0.041
 yourself+0.041
 happen+0.039
 ourselves+0.039
Neg logit contributions
 "-0.028
 girl-0.025
 �-0.025
-0.025
hog-0.024
Token a
Token ID257
Feature activation-7.93e-02

Pos logit contributions
aches+0.055
 belong+0.053
 happen+0.052
 yourself+0.049
 achieve+0.049
Neg logit contributions
 "-0.054
 girl-0.053
elt-0.053
 �-0.051
joy-0.051
Token curious
Token ID11040
Feature activation-1.34e-01

Pos logit contributions
ishly+0.150
elt+0.148
ked+0.141
sed+0.137
edly+0.134
Neg logit contributions
 have-0.129
 avoid-0.128
 know-0.124
irl-0.122
 belong-0.120
Token little
Token ID1310
Feature activation-1.07e-01

Pos logit contributions
ked+0.226
ishly+0.224
sed+0.220
ider+0.219
elt+0.212
Neg logit contributions
 know-0.201
 belong-0.197
 value-0.192
 have-0.190
 rest-0.187
Token dancer
Token ID38619
Feature activation-4.69e-02

Pos logit contributions
ked+0.165
ically+0.161
edly+0.159
sed+0.158
atively+0.156
Neg logit contributions
 belong-0.187
 need-0.178
 love-0.177
 have-0.176
 you-0.176
Token named
Token ID3706
Feature activation-2.11e-03

Pos logit contributions
 Pole+0.088
elt+0.086
joy+0.085
sed+0.085
 clerk+0.083
Neg logit contributions
 belong-0.072
aches-0.069
 know-0.065
 yourself-0.065
e-0.064
Token L
Token ID406
Feature activation+4.02e-02

Pos logit contributions
 "+0.002
sed+0.002
elt+0.002
hog+0.002
+0.002
Neg logit contributions
 belong-0.005
aches-0.005
 yourself-0.005
 happen-0.005
 know-0.005
Tokenila
Token ID10102
Feature activation+2.73e-02

Pos logit contributions
 belong+0.082
 ourselves+0.076
 happen+0.076
aches+0.075
 yourself+0.074
Neg logit contributions
 "-0.063
joy-0.060
 girl-0.060
 Lee-0.059
elt-0.059
Token a
Token ID257
Feature activation+3.68e-02

Pos logit contributions
irs+0.224
aches+0.223
ws+0.207
 belong+0.206
irl+0.198
Neg logit contributions
 "-0.116
 girl-0.099
 boy-0.098
 �-0.084
 mouse-0.081
Token time
Token ID640
Feature activation-8.27e-03

Pos logit contributions
aches+0.051
 belong+0.051
 yourself+0.050
 ourselves+0.048
 happen+0.047
Neg logit contributions
 "-0.079
joy-0.077
 �-0.077
 girl-0.075
sed-0.075
Token there
Token ID612
Feature activation+2.79e-02

Pos logit contributions
sed+0.008
hog+0.008
elt+0.008
 "+0.008
 girl+0.008
Neg logit contributions
 belong-0.020
aches-0.019
 happen-0.019
 yourself-0.019
 know-0.019
Token was
Token ID373
Feature activation+2.56e-02

Pos logit contributions
aches+0.065
 yourself+0.063
 belong+0.063
irs+0.061
 ourselves+0.061
Neg logit contributions
 "-0.037
 girl-0.032
 �-0.031
 boy-0.031
hog-0.030
Token a
Token ID257
Feature activation-8.78e-02

Pos logit contributions
aches+0.046
 belong+0.043
 happen+0.042
 yourself+0.041
 achieve+0.040
Neg logit contributions
 "-0.047
 girl-0.045
elt-0.044
 �-0.044
joy-0.043
Token curious
Token ID11040
Feature activation-1.34e-01

Pos logit contributions
elt+0.188
ishly+0.182
ked+0.175
sed+0.169
edly+0.166
Neg logit contributions
 have-0.125
 avoid-0.122
irl-0.121
 know-0.119
 belong-0.118
Token boat
Token ID8848
Feature activation-1.58e-02

Pos logit contributions
ishly+0.242
sed+0.240
elt+0.239
ked+0.238
awed+0.228
Neg logit contributions
 know-0.190
 belong-0.184
 value-0.182
?"-0.176
 have-0.173
Token.
Token ID13
Feature activation-2.00e-02

Pos logit contributions
joy+0.027
sed+0.027
elt+0.027
 Lee+0.026
 �+0.026
Neg logit contributions
 belong-0.028
aches-0.027
 happen-0.026
 yourself-0.025
 know-0.024
Token The
Token ID383
Feature activation+1.62e-02

Pos logit contributions
sed+0.027
hog+0.026
keeper+0.025
joy+0.025
 girl+0.025
Neg logit contributions
 belong-0.043
 know-0.042
 yourself-0.041
 happen-0.041
aches-0.040
Token boat
Token ID8848
Feature activation+6.41e-04

Pos logit contributions
aches+0.041
 belong+0.041
 yourself+0.040
 happen+0.039
irs+0.039
Neg logit contributions
 "-0.016
 girl-0.016
-0.015
 boy-0.014
 Pole-0.014
Token wanted
Token ID2227
Feature activation+1.35e-02

Pos logit contributions
 belong+0.001
aches+0.001
 happen+0.001
 yourself+0.001
 know+0.001
Neg logit contributions
 "-0.001
hog-0.001
sed-0.001
elt-0.001
joy-0.001
Token a
Token ID257
Feature activation+3.50e-02

Pos logit contributions
aches+0.213
irs+0.212
 belong+0.199
 happen+0.192
 harm+0.190
Neg logit contributions
 "-0.117
 girl-0.099
 boy-0.099
elt-0.085
 �-0.084
Token time
Token ID640
Feature activation-4.13e-03

Pos logit contributions
aches+0.047
 belong+0.046
 yourself+0.045
 happen+0.044
 ourselves+0.043
Neg logit contributions
 "-0.077
joy-0.075
 �-0.074
 girl-0.074
-0.073
Token there
Token ID612
Feature activation+3.44e-02

Pos logit contributions
 "+0.004
hog+0.004
sed+0.004
elt+0.004
 girl+0.004
Neg logit contributions
 belong-0.010
aches-0.010
 happen-0.010
 yourself-0.010
 know-0.009
Token was
Token ID373
Feature activation+2.87e-02

Pos logit contributions
aches+0.084
 yourself+0.081
 belong+0.081
irs+0.079
 ourselves+0.079
Neg logit contributions
 "-0.043
 girl-0.037
 boy-0.036
 �-0.036
-0.035
Token a
Token ID257
Feature activation-8.16e-02

Pos logit contributions
aches+0.052
 belong+0.050
 happen+0.049
 yourself+0.047
 achieve+0.047
Neg logit contributions
 "-0.051
 girl-0.049
elt-0.049
 �-0.048
joy-0.047
Token curious
Token ID11040
Feature activation-1.33e-01

Pos logit contributions
elt+0.170
ishly+0.163
ked+0.154
sed+0.151
edly+0.149
Neg logit contributions
 have-0.119
 avoid-0.118
irl-0.117
 know-0.114
 belong-0.113
Token little
Token ID1310
Feature activation-1.05e-01

Pos logit contributions
ishly+0.242
sed+0.238
elt+0.236
ked+0.235
ider+0.231
Neg logit contributions
 know-0.186
 belong-0.175
?"-0.175
 value-0.174
 have-0.167
Token girl
Token ID2576
Feature activation-3.17e-02

Pos logit contributions
ically+0.163
ked+0.160
overed+0.158
edly+0.157
sed+0.155
Neg logit contributions
 belong-0.182
 you-0.171
 live-0.169
 care-0.167
 need-0.166
Token called
Token ID1444
Feature activation-2.56e-02

Pos logit contributions
 Pole+0.057
sed+0.055
joy+0.054
keeper+0.053
elt+0.053
Neg logit contributions
 belong-0.055
 know-0.052
aches-0.052
 yourself-0.051
 appreciate-0.049
Token Daisy
Token ID40355
Feature activation-9.40e-03

Pos logit contributions
elt+0.036
+0.035
sed+0.035
keeper+0.034
hog+0.034
Neg logit contributions
 belong-0.053
 yourself-0.050
aches-0.050
 know-0.050
 happen-0.049
Token.
Token ID13
Feature activation-1.60e-02

Pos logit contributions
sed+0.017
 Pole+0.017
elt+0.017
joy+0.017
hog+0.017
Neg logit contributions
 belong-0.015
aches-0.014
 yourself-0.014
 happen-0.014
 know-0.014
TokenM
Token ID44
Feature activation+1.61e-02

Pos logit contributions
 belong+0.012
aches+0.012
 happen+0.011
 yourself+0.011
 know+0.011
Neg logit contributions
 "-0.010
elt-0.010
hog-0.010
 girl-0.010
-0.010
Tokenolly
Token ID5098
Feature activation-5.42e-02

Pos logit contributions
 belong+0.026
 yourself+0.025
aches+0.025
 happen+0.025
 ourselves+0.024
Neg logit contributions
 "-0.030
 �-0.029
 girl-0.029
hog-0.029
keeper-0.029
Token was
Token ID373
Feature activation-1.29e-02

Pos logit contributions
joy+0.081
sed+0.081
elt+0.080
ometer+0.080
 Pole+0.078
Neg logit contributions
 belong-0.098
aches-0.095
 happen-0.095
 know-0.093
 yourself-0.093
Token a
Token ID257
Feature activation-8.81e-02

Pos logit contributions
hog+0.021
 Pole+0.021
keeper+0.021
+0.021
 gir+0.020
Neg logit contributions
 belong-0.023
aches-0.022
 yourself-0.022
 know-0.021
 happen-0.021
Token very
Token ID845
Feature activation-3.93e-02

Pos logit contributions
elt+0.172
ked+0.170
sed+0.167
ishly+0.164
hog+0.159
Neg logit contributions
 own-0.129
 have-0.126
 love-0.121
irl-0.120
 enjoy-0.120
Token curious
Token ID11040
Feature activation-1.33e-01

Pos logit contributions
 Lee+0.069
elt+0.069
sed+0.068
ked+0.067
hog+0.067
Neg logit contributions
 belong-0.067
 know-0.064
aches-0.062
irl-0.061
 happen-0.059
Token three
Token ID1115
Feature activation-4.48e-02

Pos logit contributions
sed+0.230
ked+0.216
awed+0.205
uffed+0.199
ider+0.198
Neg logit contributions
 know-0.223
 value-0.215
 love-0.215
 need-0.214
 have-0.206
Token year
Token ID614
Feature activation-6.27e-02

Pos logit contributions
sed+0.097
elt+0.093
 Pole+0.091
 Sure+0.091
ometer+0.088
Neg logit contributions
aches-0.065
 belong-0.060
 know-0.059
 happen-0.058
 yourself-0.055
Token old
Token ID1468
Feature activation-9.43e-02

Pos logit contributions
sed+0.119
elt+0.110
ked+0.108
+0.104
 Lee+0.104
Neg logit contributions
aches-0.096
 happen-0.096
 yourself-0.095
 belong-0.095
 know-0.094
Token.
Token ID13
Feature activation-2.47e-02

Pos logit contributions
sed+0.214
ometer+0.196
 Pole+0.196
icol+0.190
 Sure+0.187
Neg logit contributions
 appreciate-0.096
 know-0.095
 yourself-0.094
 love-0.089
 us-0.088
Token She
Token ID1375
Feature activation+6.77e-04

Pos logit contributions
sed+0.035
 gir+0.035
 Pole+0.035
keeper+0.034
ometer+0.033
Neg logit contributions
 belong-0.049
 yourself-0.048
 know-0.047
 happen-0.045
aches-0.045
Token a
Token ID257
Feature activation+4.03e-02

Pos logit contributions
irs+0.212
aches+0.210
 belong+0.198
 happen+0.191
ws+0.190
Neg logit contributions
 "-0.114
 girl-0.096
 boy-0.095
 �-0.081
 monkey-0.080
Token time
Token ID640
Feature activation-6.02e-03

Pos logit contributions
 belong+0.057
aches+0.057
 yourself+0.056
 happen+0.054
 ourselves+0.053
Neg logit contributions
 "-0.085
joy-0.083
 �-0.082
-0.081
 girl-0.081
Token there
Token ID612
Feature activation+3.04e-02

Pos logit contributions
 "+0.006
hog+0.006
sed+0.006
elt+0.006
 girl+0.006
Neg logit contributions
 belong-0.015
aches-0.014
 happen-0.014
 yourself-0.014
 know-0.014
Token was
Token ID373
Feature activation+2.14e-02

Pos logit contributions
aches+0.072
 yourself+0.070
 belong+0.070
 ourselves+0.068
irs+0.068
Neg logit contributions
 "-0.039
 girl-0.034
 boy-0.033
 �-0.033
hog-0.032
Token a
Token ID257
Feature activation-8.72e-02

Pos logit contributions
aches+0.037
 belong+0.035
 happen+0.034
 yourself+0.033
 achieve+0.032
Neg logit contributions
 "-0.040
 girl-0.038
elt-0.038
 �-0.037
joy-0.037
Token curious
Token ID11040
Feature activation-1.33e-01

Pos logit contributions
elt+0.189
ishly+0.182
ked+0.172
edly+0.168
sed+0.164
Neg logit contributions
 have-0.123
 avoid-0.122
irl-0.119
 know-0.117
 belong-0.115
Token little
Token ID1310
Feature activation-1.08e-01

Pos logit contributions
ishly+0.239
ked+0.234
sed+0.232
elt+0.232
ider+0.228
Neg logit contributions
 know-0.194
 belong-0.180
 value-0.178
?"-0.177
 have-0.171
Token girl
Token ID2576
Feature activation-3.16e-02

Pos logit contributions
ked+0.173
ically+0.173
overed+0.169
edly+0.169
sed+0.161
Neg logit contributions
 belong-0.184
 need-0.173
 you-0.170
 know-0.169
 live-0.169
Token.
Token ID13
Feature activation-3.91e-02

Pos logit contributions
 Pole+0.056
sed+0.055
joy+0.054
keeper+0.052
elt+0.051
Neg logit contributions
 belong-0.055
 know-0.053
aches-0.052
 yourself-0.051
 appreciate-0.049
Token She
Token ID1375
Feature activation-9.79e-04

Pos logit contributions
sed+0.057
 Pole+0.055
keeper+0.054
ked+0.052
ometer+0.051
Neg logit contributions
 know-0.082
 belong-0.080
 yourself-0.077
 appreciate-0.075
 have-0.075
Token was
Token ID373
Feature activation-2.78e-02

Pos logit contributions
 "+0.002
hog+0.002
joy+0.002
elt+0.002
sed+0.002
Neg logit contributions
 belong-0.002
aches-0.002
 happen-0.002
 yourself-0.002
 know-0.002
Token<|endoftext|>
Token ID50256
Feature activation+8.70e-03

Pos logit contributions
aches+0.030
 belong+0.030
 happen+0.029
 yourself+0.028
ck+0.026
Neg logit contributions
 "-0.023
 �-0.021
joy-0.021
elt-0.021
-0.021
TokenOnce
Token ID7454
Feature activation+4.47e-02

Pos logit contributions
 belong+0.017
aches+0.017
 happen+0.016
 yourself+0.016
 know+0.015
Neg logit contributions
 "-0.014
hog-0.013
elt-0.013
 girl-0.013
-0.013
Token there
Token ID612
Feature activation+1.35e-02

Pos logit contributions
 yourself+0.096
 belong+0.094
aches+0.094
 happen+0.089
 ourselves+0.089
Neg logit contributions
 "-0.069
 girl-0.062
 �-0.061
joy-0.060
 boy-0.058
Token was
Token ID373
Feature activation+3.37e-02

Pos logit contributions
aches+0.028
 belong+0.028
 yourself+0.027
 happen+0.026
 ourselves+0.026
Neg logit contributions
 "-0.020
 girl-0.018
 �-0.018
hog-0.018
-0.018
Token a
Token ID257
Feature activation-7.58e-02

Pos logit contributions
aches+0.062
 belong+0.059
 happen+0.058
 yourself+0.056
 achieve+0.056
Neg logit contributions
 "-0.059
 girl-0.058
elt-0.057
 �-0.055
joy-0.055
Token curious
Token ID11040
Feature activation-1.32e-01

Pos logit contributions
elt+0.146
ishly+0.146
ked+0.138
sed+0.137
edly+0.131
Neg logit contributions
 have-0.120
 avoid-0.120
 know-0.116
irl-0.112
 enjoy-0.112
Token little
Token ID1310
Feature activation-1.06e-01

Pos logit contributions
ked+0.234
ider+0.233
sed+0.232
ishly+0.231
elt+0.222
Neg logit contributions
 know-0.186
 belong-0.179
 value-0.179
 have-0.178
 love-0.171
Token girl
Token ID2576
Feature activation-2.99e-02

Pos logit contributions
ked+0.169
edly+0.165
atively+0.164
ically+0.163
sed+0.163
Neg logit contributions
 belong-0.177
 you-0.173
 have-0.171
 love-0.171
 need-0.171
Token named
Token ID3706
Feature activation+4.52e-03

Pos logit contributions
 Pole+0.053
sed+0.051
joy+0.049
keeper+0.049
elt+0.048
Neg logit contributions
 belong-0.052
 know-0.050
aches-0.049
 yourself-0.049
 appreciate-0.048
Token Daisy
Token ID40355
Feature activation+8.94e-03

Pos logit contributions
 belong+0.011
aches+0.011
 happen+0.011
 yourself+0.011
 ourselves+0.010
Neg logit contributions
 "-0.005
hog-0.004
 girl-0.004
elt-0.004
sed-0.004
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
aches+0.017
 belong+0.017
 happen+0.016
 yourself+0.016
irs+0.015
Neg logit contributions
 "-0.015
 girl-0.014
-0.014
hog-0.014
elt-0.013
Token a
Token ID257
Feature activation+3.13e-02

Pos logit contributions
irs+0.191
aches+0.191
 belong+0.181
 happen+0.177
 achieve+0.173
Neg logit contributions
 "-0.103
 girl-0.094
 boy-0.092
 monkey-0.081
 mouse-0.080
Token time
Token ID640
Feature activation-2.05e-03

Pos logit contributions
 belong+0.042
aches+0.040
 happen+0.040
 yourself+0.039
 ourselves+0.038
Neg logit contributions
 "-0.069
joy-0.067
 girl-0.067
-0.066
sed-0.066
Token there
Token ID612
Feature activation+3.98e-02

Pos logit contributions
 "+0.002
hog+0.002
sed+0.002
elt+0.002
 girl+0.002
Neg logit contributions
 belong-0.005
aches-0.005
 happen-0.005
 yourself-0.005
 know-0.005
Token was
Token ID373
Feature activation+2.16e-02

Pos logit contributions
aches+0.101
 yourself+0.099
 belong+0.097
irs+0.096
 ourselves+0.096
Neg logit contributions
 "-0.047
 boy-0.041
 girl-0.040
 �-0.037
-0.037
Token a
Token ID257
Feature activation-7.96e-02

Pos logit contributions
aches+0.037
 belong+0.036
 happen+0.035
 yourself+0.034
 achieve+0.033
Neg logit contributions
 "-0.040
elt-0.039
 girl-0.038
sed-0.038
 �-0.037
Token curious
Token ID11040
Feature activation-1.32e-01

Pos logit contributions
elt+0.165
ishly+0.161
ked+0.148
edly+0.145
hog+0.144
Neg logit contributions
 have-0.113
irl-0.112
 avoid-0.112
 know-0.110
 own-0.107
Token little
Token ID1310
Feature activation-1.01e-01

Pos logit contributions
ishly+0.216
elt+0.214
sed+0.211
ider+0.209
ked+0.207
Neg logit contributions
 belong-0.200
 know-0.198
 value-0.192
 have-0.181
 stay-0.180
Token girl
Token ID2576
Feature activation-2.21e-02

Pos logit contributions
ked+0.158
ically+0.158
overed+0.153
edly+0.149
elt+0.148
Neg logit contributions
 belong-0.180
 live-0.166
 you-0.163
 need-0.163
 are-0.162
Token.
Token ID13
Feature activation-3.82e-02

Pos logit contributions
 Pole+0.035
joy+0.035
sed+0.034
elt+0.033
keeper+0.033
Neg logit contributions
 belong-0.042
aches-0.040
 know-0.039
 yourself-0.039
 happen-0.038
Token Every
Token ID3887
Feature activation-8.16e-03

Pos logit contributions
sed+0.055
keeper+0.054
 Pole+0.051
ked+0.050
hog+0.049
Neg logit contributions
 know-0.079
 belong-0.079
 yourself-0.076
 have-0.075
 appreciate-0.074
Token day
Token ID1110
Feature activation-3.76e-02

Pos logit contributions
hog+0.015
 "+0.015
keeper+0.015
sed+0.015
+0.015
Neg logit contributions
 belong-0.013
aches-0.012
 happen-0.012
 yourself-0.011
 know-0.011
Token a
Token ID257
Feature activation+2.48e-02

Pos logit contributions
irs+0.215
aches+0.212
 belong+0.203
 happen+0.194
ws+0.193
Neg logit contributions
 "-0.110
 girl-0.098
 boy-0.097
 monkey-0.083
joy-0.080
Token time
Token ID640
Feature activation-2.03e-03

Pos logit contributions
 belong+0.031
aches+0.031
 yourself+0.029
 happen+0.029
 ourselves+0.028
Neg logit contributions
 "-0.056
joy-0.055
 girl-0.054
-0.054
sed-0.054
Token there
Token ID612
Feature activation+4.34e-02

Pos logit contributions
 "+0.002
hog+0.002
sed+0.002
elt+0.002
 girl+0.002
Neg logit contributions
 belong-0.005
aches-0.005
 happen-0.005
 yourself-0.005
 know-0.005
Token was
Token ID373
Feature activation+2.24e-02

Pos logit contributions
aches+0.113
 yourself+0.109
irs+0.107
 belong+0.107
 ourselves+0.106
Neg logit contributions
 "-0.050
 boy-0.043
 girl-0.042
 �-0.038
-0.038
Token a
Token ID257
Feature activation-7.79e-02

Pos logit contributions
aches+0.038
 belong+0.037
 happen+0.036
 yourself+0.035
 achieve+0.034
Neg logit contributions
 "-0.041
elt-0.040
 girl-0.040
sed-0.039
 �-0.039
Token curious
Token ID11040
Feature activation-1.32e-01

Pos logit contributions
elt+0.157
ishly+0.153
ked+0.140
sed+0.138
hog+0.137
Neg logit contributions
 avoid-0.117
 have-0.116
irl-0.116
 know-0.112
 belong-0.111
Token girl
Token ID2576
Feature activation-2.58e-02

Pos logit contributions
ishly+0.220
elt+0.216
sed+0.213
ider+0.209
ked+0.209
Neg logit contributions
 know-0.196
 belong-0.195
 value-0.186
 have-0.180
?"-0.178
Token.
Token ID13
Feature activation-3.62e-02

Pos logit contributions
 Pole+0.043
sed+0.043
joy+0.042
keeper+0.041
elt+0.041
Neg logit contributions
 belong-0.047
aches-0.045
 know-0.044
 yourself-0.044
 happen-0.042
Token She
Token ID1375
Feature activation+7.16e-03

Pos logit contributions
sed+0.050
keeper+0.049
 Pole+0.048
ked+0.045
hog+0.045
Neg logit contributions
 know-0.077
 belong-0.077
 yourself-0.073
 have-0.072
 appreciate-0.071
Token wanted
Token ID2227
Feature activation+6.55e-03

Pos logit contributions
 belong+0.014
aches+0.014
 happen+0.013
 yourself+0.013
 ourselves+0.013
Neg logit contributions
 "-0.011
 girl-0.011
-0.011
hog-0.011
sed-0.011
Token to
Token ID284
Feature activation+7.66e-02

Pos logit contributions
 belong+0.014
aches+0.014
 happen+0.014
 yourself+0.014
 ourselves+0.013
Neg logit contributions
 "-0.009
elt-0.008
 girl-0.008
sed-0.008
hog-0.008
Token a
Token ID257
Feature activation+2.93e-02

Pos logit contributions
aches+0.185
irs+0.180
 belong+0.177
 happen+0.169
ws+0.167
Neg logit contributions
 "-0.105
 girl-0.090
 boy-0.087
 �-0.081
joy-0.079
Token time
Token ID640
Feature activation-8.61e-03

Pos logit contributions
 belong+0.038
aches+0.038
 yourself+0.036
 happen+0.036
 ourselves+0.035
Neg logit contributions
 "-0.065
joy-0.064
 �-0.063
 girl-0.063
-0.063
Token there
Token ID612
Feature activation+3.40e-02

Pos logit contributions
sed+0.009
elt+0.009
hog+0.009
 "+0.008
 girl+0.008
Neg logit contributions
 belong-0.021
aches-0.020
 happen-0.020
 yourself-0.020
 know-0.019
Token was
Token ID373
Feature activation+2.06e-02

Pos logit contributions
aches+0.083
 yourself+0.080
 belong+0.080
 ourselves+0.078
irs+0.078
Neg logit contributions
 "-0.042
 girl-0.036
 boy-0.036
 �-0.036
hog-0.034
Token a
Token ID257
Feature activation-8.52e-02

Pos logit contributions
aches+0.035
 belong+0.034
 happen+0.033
 yourself+0.031
 achieve+0.030
Neg logit contributions
 "-0.039
elt-0.037
 girl-0.037
 �-0.036
joy-0.036
Token curious
Token ID11040
Feature activation-1.32e-01

Pos logit contributions
elt+0.175
ishly+0.170
ked+0.161
sed+0.159
edly+0.158
Neg logit contributions
 have-0.124
 avoid-0.122
 know-0.120
irl-0.120
 belong-0.114
Token mouse
Token ID10211
Feature activation-3.62e-02

Pos logit contributions
sed+0.229
ider+0.225
ishly+0.225
ked+0.221
elt+0.221
Neg logit contributions
 know-0.194
 belong-0.186
 value-0.180
 have-0.173
?"-0.172
Token.
Token ID13
Feature activation-4.12e-02

Pos logit contributions
joy+0.066
sed+0.065
elt+0.064
 Pole+0.064
+0.061
Neg logit contributions
 belong-0.060
aches-0.060
 yourself-0.055
 know-0.055
 happen-0.055
Token He
Token ID679
Feature activation-4.00e-03

Pos logit contributions
sed+0.065
keeper+0.062
+0.060
ked+0.059
 Pole+0.059
Neg logit contributions
 know-0.079
 belong-0.079
 yourself-0.077
 have-0.073
 appreciate-0.071
Token liked
Token ID8288
Feature activation-1.97e-02

Pos logit contributions
 "+0.007
joy+0.007
elt+0.007
hog+0.007
sed+0.007
Neg logit contributions
 belong-0.007
aches-0.007
 happen-0.006
 yourself-0.006
 know-0.006
Token to
Token ID284
Feature activation+4.62e-02

Pos logit contributions
hog+0.032
+0.031
keeper+0.031
sed+0.031
 girl+0.030
Neg logit contributions
aches-0.036
 belong-0.036
 happen-0.035
 yourself-0.035
 know-0.034
Token<|endoftext|>
Token ID50256
Feature activation+1.74e-02

Pos logit contributions
 "+0.004
sed+0.004
hog+0.004
elt+0.004
 girl+0.004
Neg logit contributions
 belong-0.008
aches-0.008
 yourself-0.008
 happen-0.008
 know-0.008
TokenOnce
Token ID7454
Feature activation+4.30e-02

Pos logit contributions
aches+0.035
 belong+0.035
 happen+0.033
 yourself+0.032
 ourselves+0.032
Neg logit contributions
 "-0.027
joy-0.025
 girl-0.025
-0.025
elt-0.025
Token there
Token ID612
Feature activation+2.08e-02

Pos logit contributions
 yourself+0.092
 belong+0.092
aches+0.091
 happen+0.089
 ourselves+0.088
Neg logit contributions
 "-0.067
 girl-0.059
 �-0.058
joy-0.057
 boy-0.057
Token was
Token ID373
Feature activation+3.05e-02

Pos logit contributions
aches+0.046
 belong+0.044
 yourself+0.044
 happen+0.043
irs+0.042
Neg logit contributions
 "-0.029
 girl-0.026
 �-0.026
-0.026
joy-0.026
Token a
Token ID257
Feature activation-7.59e-02

Pos logit contributions
aches+0.055
 belong+0.053
 happen+0.052
 yourself+0.050
 achieve+0.050
Neg logit contributions
 "-0.054
elt-0.053
 girl-0.053
 �-0.051
joy-0.050
Token curious
Token ID11040
Feature activation-1.32e-01

Pos logit contributions
ishly+0.144
elt+0.143
sed+0.136
ked+0.133
hog+0.130
Neg logit contributions
 avoid-0.120
 have-0.119
 know-0.117
irl-0.114
 love-0.112
Token little
Token ID1310
Feature activation-1.05e-01

Pos logit contributions
sed+0.229
ishly+0.223
ider+0.220
ked+0.218
elt+0.213
Neg logit contributions
 know-0.194
 belong-0.191
 value-0.185
 have-0.180
?"-0.175
Token girl
Token ID2576
Feature activation-3.13e-02

Pos logit contributions
edly+0.158
sed+0.158
ked+0.157
ically+0.156
atively+0.151
Neg logit contributions
 belong-0.186
 you-0.178
 live-0.175
 have-0.172
 us-0.172
Token called
Token ID1444
Feature activation-1.55e-02

Pos logit contributions
 Pole+0.055
sed+0.054
joy+0.053
keeper+0.052
elt+0.051
Neg logit contributions
 belong-0.054
 know-0.051
aches-0.050
 yourself-0.050
 appreciate-0.049
Token Am
Token ID1703
Feature activation+1.07e-03

Pos logit contributions
elt+0.020
sed+0.020
+0.020
hog+0.020
keeper+0.020
Neg logit contributions
 belong-0.033
aches-0.032
 happen-0.032
 yourself-0.032
 know-0.031
Tokena
Token ID64
Feature activation-3.18e-02

Pos logit contributions
 belong+0.002
aches+0.002
 happen+0.002
 yourself+0.002
 know+0.002
Neg logit contributions
 "-0.002
hog-0.002
sed-0.002
elt-0.002
-0.002
Token a
Token ID257
Feature activation+2.47e-02

Pos logit contributions
aches+0.192
irs+0.189
 belong+0.180
ws+0.177
 achieve+0.173
Neg logit contributions
 "-0.106
 girl-0.092
 boy-0.089
 �-0.080
joy-0.078
Token time
Token ID640
Feature activation-7.55e-03

Pos logit contributions
aches+0.030
 belong+0.030
 yourself+0.029
 happen+0.028
 ourselves+0.028
Neg logit contributions
 "-0.056
joy-0.055
 girl-0.055
 �-0.054
-0.054
Token there
Token ID612
Feature activation+2.47e-02

Pos logit contributions
sed+0.008
hog+0.007
 "+0.007
elt+0.007
 girl+0.007
Neg logit contributions
 belong-0.018
aches-0.018
 happen-0.017
 yourself-0.017
 know-0.017
Token was
Token ID373
Feature activation+2.46e-02

Pos logit contributions
aches+0.056
 belong+0.055
 yourself+0.055
 ourselves+0.053
irs+0.053
Neg logit contributions
 "-0.033
 girl-0.029
 �-0.029
hog-0.028
 boy-0.028
Token a
Token ID257
Feature activation-8.66e-02

Pos logit contributions
aches+0.044
 belong+0.042
 happen+0.041
 yourself+0.040
 achieve+0.039
Neg logit contributions
 "-0.045
 girl-0.043
elt-0.042
 �-0.042
joy-0.041
Token curious
Token ID11040
Feature activation-1.32e-01

Pos logit contributions
elt+0.186
ishly+0.176
ked+0.173
edly+0.165
sed+0.165
Neg logit contributions
 have-0.124
 avoid-0.121
irl-0.120
 know-0.117
 belong-0.117
Token little
Token ID1310
Feature activation-1.06e-01

Pos logit contributions
ked+0.231
elt+0.230
sed+0.230
ishly+0.224
ider+0.221
Neg logit contributions
 know-0.192
 belong-0.190
 value-0.182
?"-0.178
 have-0.176
Token boy
Token ID2933
Feature activation-2.02e-02

Pos logit contributions
ked+0.165
ically+0.157
sed+0.156
edly+0.155
overed+0.153
Neg logit contributions
 belong-0.189
 need-0.177
 live-0.174
 you-0.174
 know-0.172
Token.
Token ID13
Feature activation-3.64e-02

Pos logit contributions
sed+0.031
joy+0.031
 Pole+0.030
keeper+0.030
+0.030
Neg logit contributions
 belong-0.040
aches-0.038
 know-0.037
 yourself-0.037
 happen-0.036
Token He
Token ID679
Feature activation-8.45e-03

Pos logit contributions
sed+0.049
keeper+0.048
+0.047
hog+0.045
ked+0.045
Neg logit contributions
 know-0.077
 belong-0.076
 yourself-0.073
 have-0.071
 appreciate-0.071
Token wanted
Token ID2227
Feature activation+1.05e-02

Pos logit contributions
joy+0.015
keeper+0.014
elt+0.014
 "+0.014
hog+0.014
Neg logit contributions
 belong-0.015
aches-0.014
 happen-0.014
 yourself-0.014
 know-0.013
Token a
Token ID257
Feature activation+2.33e-02

Pos logit contributions
aches+0.198
irs+0.196
 belong+0.188
 happen+0.181
ws+0.179
Neg logit contributions
 "-0.105
 girl-0.092
 boy-0.089
 �-0.078
joy-0.078
Token time
Token ID640
Feature activation-1.07e-02

Pos logit contributions
 belong+0.028
aches+0.028
 happen+0.027
 yourself+0.026
 ourselves+0.026
Neg logit contributions
 "-0.053
joy-0.052
 girl-0.052
sed-0.051
 �-0.051
Token there
Token ID612
Feature activation+3.40e-02

Pos logit contributions
sed+0.011
hog+0.011
elt+0.011
keeper+0.011
 girl+0.011
Neg logit contributions
 belong-0.025
aches-0.025
 happen-0.024
 yourself-0.024
 know-0.024
Token was
Token ID373
Feature activation+2.29e-02

Pos logit contributions
aches+0.083
 yourself+0.081
 belong+0.081
 ourselves+0.079
irs+0.078
Neg logit contributions
 "-0.043
 girl-0.037
 boy-0.036
 �-0.035
-0.034
Token a
Token ID257
Feature activation-7.99e-02

Pos logit contributions
aches+0.039
 belong+0.038
 happen+0.037
 yourself+0.036
 achieve+0.035
Neg logit contributions
 "-0.042
elt-0.041
 girl-0.041
 �-0.039
joy-0.039
Token curious
Token ID11040
Feature activation-1.31e-01

Pos logit contributions
elt+0.164
ishly+0.158
ked+0.148
sed+0.147
edly+0.144
Neg logit contributions
 avoid-0.115
 have-0.115
irl-0.115
 know-0.111
 belong-0.110
Token cat
Token ID3797
Feature activation-2.87e-02

Pos logit contributions
sed+0.224
ishly+0.221
elt+0.218
ked+0.215
ider+0.215
Neg logit contributions
 know-0.195
 belong-0.194
 value-0.185
?"-0.176
 have-0.174
Token.
Token ID13
Feature activation-4.21e-02

Pos logit contributions
joy+0.049
sed+0.048
elt+0.047
 Pole+0.046
+0.045
Neg logit contributions
aches-0.051
 belong-0.050
 yourself-0.048
 happen-0.048
 know-0.047
Token The
Token ID383
Feature activation-2.92e-02

Pos logit contributions
keeper+0.071
sed+0.065
 clerk+0.063
hog+0.063
 Pole+0.063
Neg logit contributions
 know-0.078
 belong-0.077
 yourself-0.075
 have-0.072
 enjoy-0.069
Token cat
Token ID3797
Feature activation-3.57e-02

Pos logit contributions
joy+0.031
keeper+0.029
elt+0.028
 Pole+0.027
hog+0.026
Neg logit contributions
 belong-0.073
 know-0.069
 happen-0.068
 yourself-0.067
aches-0.066
Token loved
Token ID6151
Feature activation-2.20e-02

Pos logit contributions
joy+0.071
elt+0.068
sed+0.066
 Pole+0.066
keeper+0.066
Neg logit contributions
aches-0.055
 belong-0.053
 know-0.051
 yourself-0.050
 happen-0.049
Token a
Token ID257
Feature activation+3.25e-02

Pos logit contributions
irs+0.200
aches+0.199
 belong+0.188
 happen+0.181
ws+0.180
Neg logit contributions
 "-0.110
 boy-0.092
 girl-0.092
 �-0.082
joy-0.079
Token time
Token ID640
Feature activation-1.50e-02

Pos logit contributions
 belong+0.043
aches+0.043
 yourself+0.042
 happen+0.040
 ourselves+0.040
Neg logit contributions
 "-0.071
joy-0.070
 �-0.069
 girl-0.068
sed-0.068
Token there
Token ID612
Feature activation+3.05e-02

Pos logit contributions
sed+0.016
keeper+0.016
elt+0.016
hog+0.016
 girl+0.016
Neg logit contributions
 belong-0.035
aches-0.034
 happen-0.034
 yourself-0.033
 know-0.032
Token was
Token ID373
Feature activation+2.02e-02

Pos logit contributions
aches+0.072
 belong+0.070
 yourself+0.070
 ourselves+0.068
irs+0.067
Neg logit contributions
 "-0.039
 girl-0.034
 �-0.033
 boy-0.033
hog-0.032
Token a
Token ID257
Feature activation-8.85e-02

Pos logit contributions
aches+0.034
 belong+0.033
 happen+0.032
 yourself+0.031
 achieve+0.030
Neg logit contributions
 "-0.038
 girl-0.036
elt-0.036
 �-0.036
joy-0.036
Token curious
Token ID11040
Feature activation-1.31e-01

Pos logit contributions
elt+0.190
ishly+0.183
ked+0.176
edly+0.170
sed+0.169
Neg logit contributions
 have-0.127
 avoid-0.122
irl-0.122
 know-0.121
 belong-0.117
Token little
Token ID1310
Feature activation-1.08e-01

Pos logit contributions
ishly+0.228
sed+0.228
ider+0.227
ked+0.225
elt+0.224
Neg logit contributions
 know-0.191
 belong-0.186
 value-0.180
 have-0.173
?"-0.172
Token girl
Token ID2576
Feature activation-3.24e-02

Pos logit contributions
ked+0.170
ically+0.168
sed+0.162
edly+0.160
atively+0.159
Neg logit contributions
 belong-0.191
 you-0.178
 live-0.176
 need-0.175
 care-0.174
Token.
Token ID13
Feature activation-3.81e-02

Pos logit contributions
 Pole+0.058
sed+0.058
joy+0.056
keeper+0.054
elt+0.053
Neg logit contributions
 belong-0.056
 know-0.053
aches-0.052
 yourself-0.051
 appreciate-0.050
Token She
Token ID1375
Feature activation-2.63e-03

Pos logit contributions
sed+0.057
 Pole+0.053
keeper+0.053
ked+0.052
+0.050
Neg logit contributions
 know-0.078
 belong-0.078
 yourself-0.075
 appreciate-0.074
 have-0.073
Token was
Token ID373
Feature activation-2.82e-02

Pos logit contributions
 "+0.004
joy+0.004
hog+0.004
elt+0.004
sed+0.004
Neg logit contributions
 belong-0.005
aches-0.005
 happen-0.004
 yourself-0.004
 know-0.004
Token a
Token ID257
Feature activation+3.40e-02

Pos logit contributions
irs+0.201
aches+0.200
 belong+0.191
 happen+0.182
ws+0.181
Neg logit contributions
 "-0.113
 girl-0.095
 boy-0.094
 �-0.084
joy-0.083
Token time
Token ID640
Feature activation-1.23e-02

Pos logit contributions
 belong+0.046
aches+0.045
 yourself+0.045
 happen+0.043
 ourselves+0.043
Neg logit contributions
 "-0.073
joy-0.072
 girl-0.071
 �-0.071
-0.071
Token there
Token ID612
Feature activation+2.45e-02

Pos logit contributions
sed+0.013
hog+0.013
keeper+0.013
elt+0.013
 girl+0.013
Neg logit contributions
 belong-0.029
aches-0.028
 happen-0.028
 yourself-0.027
 know-0.027
Token was
Token ID373
Feature activation+2.05e-02

Pos logit contributions
aches+0.056
 belong+0.055
 yourself+0.054
 ourselves+0.053
irs+0.053
Neg logit contributions
 "-0.033
 girl-0.029
 �-0.028
 boy-0.028
-0.028
Token a
Token ID257
Feature activation-8.49e-02

Pos logit contributions
aches+0.035
 belong+0.034
 happen+0.033
 yourself+0.032
 achieve+0.031
Neg logit contributions
 "-0.038
 girl-0.037
elt-0.037
 �-0.036
joy-0.036
Token curious
Token ID11040
Feature activation-1.31e-01

Pos logit contributions
elt+0.185
ishly+0.178
ked+0.170
edly+0.165
sed+0.162
Neg logit contributions
 have-0.119
irl-0.118
 avoid-0.117
 know-0.112
 belong-0.112
Token puppy
Token ID26188
Feature activation-5.98e-02

Pos logit contributions
ishly+0.234
ked+0.233
sed+0.229
elt+0.229
ider+0.225
Neg logit contributions
 know-0.186
 belong-0.183
 value-0.179
?"-0.177
 have-0.170
Token called
Token ID1444
Feature activation-1.96e-02

Pos logit contributions
joy+0.128
sed+0.123
 Pole+0.122
umm+0.122
elt+0.121
Neg logit contributions
 belong-0.082
 know-0.073
 live-0.072
e-0.072
aches-0.072
Token Benny
Token ID44275
Feature activation+8.07e-03

Pos logit contributions
elt+0.025
+0.025
hog+0.025
sed+0.024
 girl+0.024
Neg logit contributions
 belong-0.042
 yourself-0.040
 happen-0.040
aches-0.040
 know-0.039
Token.
Token ID13
Feature activation-1.42e-02

Pos logit contributions
 belong+0.015
aches+0.015
 happen+0.014
 yourself+0.014
irs+0.014
Neg logit contributions
 "-0.013
hog-0.013
 girl-0.012
-0.012
elt-0.012
Token He
Token ID679
Feature activation-8.39e-03

Pos logit contributions
sed+0.019
keeper+0.019
elt+0.018
 Lee+0.018
hog+0.018
Neg logit contributions
 belong-0.029
 yourself-0.028
 happen-0.028
 know-0.028
aches-0.027
Token a
Token ID257
Feature activation+4.20e-02

Pos logit contributions
irs+0.211
aches+0.209
 belong+0.192
ws+0.191
 affect+0.188
Neg logit contributions
 "-0.111
 boy-0.091
 girl-0.090
 �-0.081
 monkey-0.076
Token time
Token ID640
Feature activation-1.71e-02

Pos logit contributions
aches+0.061
 belong+0.061
 yourself+0.060
 happen+0.058
 ourselves+0.057
Neg logit contributions
 "-0.087
joy-0.084
 �-0.084
 girl-0.084
-0.083
Token there
Token ID612
Feature activation+3.11e-02

Pos logit contributions
sed+0.019
elt+0.019
keeper+0.019
hog+0.019
 girl+0.019
Neg logit contributions
 belong-0.039
aches-0.038
 happen-0.038
 yourself-0.037
 know-0.037
Token was
Token ID373
Feature activation+2.11e-02

Pos logit contributions
aches+0.074
 yourself+0.072
 belong+0.071
 ourselves+0.070
irs+0.069
Neg logit contributions
 "-0.039
 girl-0.034
 boy-0.034
 �-0.034
-0.033
Token a
Token ID257
Feature activation-8.49e-02

Pos logit contributions
aches+0.036
 belong+0.035
 happen+0.034
 yourself+0.033
 achieve+0.032
Neg logit contributions
 "-0.039
 girl-0.038
elt-0.037
 �-0.037
joy-0.037
Token curious
Token ID11040
Feature activation-1.31e-01

Pos logit contributions
elt+0.181
ishly+0.171
ked+0.165
edly+0.161
sed+0.161
Neg logit contributions
 have-0.124
irl-0.120
 avoid-0.120
 belong-0.118
 know-0.116
Token little
Token ID1310
Feature activation-1.08e-01

Pos logit contributions
sed+0.231
ked+0.231
ishly+0.229
ider+0.229
elt+0.228
Neg logit contributions
 know-0.192
 belong-0.186
 value-0.181
?"-0.175
 have-0.175
Token girl
Token ID2576
Feature activation-3.06e-02

Pos logit contributions
ically+0.171
ked+0.170
edly+0.163
sed+0.163
overed+0.161
Neg logit contributions
 belong-0.192
 need-0.179
 live-0.177
 know-0.177
 you-0.176
Token named
Token ID3706
Feature activation-1.77e-03

Pos logit contributions
 Pole+0.053
sed+0.053
joy+0.051
keeper+0.049
elt+0.049
Neg logit contributions
 belong-0.054
 know-0.051
aches-0.050
 yourself-0.049
 appreciate-0.048
Token Lucy
Token ID22162
Feature activation+8.47e-03

Pos logit contributions
 "+0.002
sed+0.002
hog+0.002
elt+0.002
+0.002
Neg logit contributions
 belong-0.004
aches-0.004
 happen-0.004
 yourself-0.004
 know-0.004
Token.
Token ID13
Feature activation-2.24e-02

Pos logit contributions
aches+0.016
 belong+0.016
 happen+0.015
 yourself+0.015
 ourselves+0.015
Neg logit contributions
 "-0.014
 girl-0.013
hog-0.013
-0.013
elt-0.013
Token time
Token ID640
Feature activation-1.66e-02

Pos logit contributions
aches+0.041
 belong+0.040
 yourself+0.039
 happen+0.038
 ourselves+0.037
Neg logit contributions
 "-0.070
joy-0.068
 girl-0.067
-0.067
 �-0.067
Token,
Token ID11
Feature activation+3.61e-03

Pos logit contributions
elt+0.018
keeper+0.018
sed+0.018
hog+0.018
 gir+0.018
Neg logit contributions
 belong-0.038
aches-0.037
 happen-0.037
 yourself-0.036
 know-0.036
Token there
Token ID612
Feature activation+2.07e-02

Pos logit contributions
 belong+0.006
aches+0.006
 happen+0.006
 yourself+0.006
 know+0.006
Neg logit contributions
 "-0.006
hog-0.006
sed-0.006
elt-0.006
 girl-0.006
Token was
Token ID373
Feature activation+1.65e-02

Pos logit contributions
aches+0.046
 belong+0.045
 yourself+0.044
 happen+0.043
 ourselves+0.043
Neg logit contributions
 "-0.029
 girl-0.025
 �-0.025
-0.025
hog-0.025
Token a
Token ID257
Feature activation-9.34e-02

Pos logit contributions
aches+0.027
 belong+0.026
 happen+0.025
 yourself+0.025
 achieve+0.024
Neg logit contributions
 "-0.032
 girl-0.031
elt-0.030
joy-0.030
 �-0.030
Token curious
Token ID11040
Feature activation-1.31e-01

Pos logit contributions
elt+0.203
ishly+0.193
ked+0.187
edly+0.184
ed+0.178
Neg logit contributions
 have-0.131
 avoid-0.125
irl-0.124
 belong-0.122
 know-0.120
Token girl
Token ID2576
Feature activation-2.93e-02

Pos logit contributions
ishly+0.230
ked+0.229
sed+0.224
ider+0.223
elt+0.220
Neg logit contributions
 know-0.197
 belong-0.186
 value-0.181
?"-0.178
 have-0.174
Token named
Token ID3706
Feature activation-1.07e-02

Pos logit contributions
 Pole+0.050
sed+0.050
joy+0.048
elt+0.047
keeper+0.047
Neg logit contributions
 belong-0.053
 know-0.050
aches-0.050
 yourself-0.049
 appreciate-0.047
Token Jill
Token ID22772
Feature activation+2.73e-02

Pos logit contributions
sed+0.014
hog+0.013
joy+0.013
elt+0.013
+0.013
Neg logit contributions
 belong-0.024
aches-0.023
 happen-0.022
 yourself-0.022
 know-0.022
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
 belong+0.058
aches+0.058
 happen+0.056
 yourself+0.056
irs+0.055
Neg logit contributions
 "-0.039
 girl-0.036
hog-0.035
-0.035
 gir-0.035
Token She
Token ID1375
Feature activation-1.39e-02

Pos logit contributions
sed+0.021
keeper+0.020
 Pole+0.020
hog+0.020
ometer+0.019
Neg logit contributions
 belong-0.048
 yourself-0.047
aches-0.047
 know-0.047
 happen-0.046
TokenWhen
Token ID2215
Feature activation-2.22e-02

Pos logit contributions
 belong+0.075
 happen+0.073
aches+0.069
 yourself+0.064
 ourselves+0.063
Neg logit contributions
 "-0.065
-0.064
 �-0.064
hog-0.058
 girl-0.057
Token Emma
Token ID18966
Feature activation-2.46e-02

Pos logit contributions
hog+0.047
sed+0.047
elt+0.046
 Pole+0.045
keeper+0.044
Neg logit contributions
 belong-0.032
aches-0.030
 happen-0.030
 yourself-0.029
 ourselves-0.028
Token went
Token ID1816
Feature activation-4.25e-02

Pos logit contributions
sed+0.043
hog+0.042
 "+0.042
joy+0.041
ometer+0.040
Neg logit contributions
 belong-0.041
 know-0.040
aches-0.040
 happen-0.040
 achieve-0.038
Token to
Token ID284
Feature activation+4.01e-02

Pos logit contributions
sed+0.089
+0.085
 gir+0.083
 Pole+0.081
keeper+0.081
Neg logit contributions
 belong-0.059
aches-0.059
 happen-0.056
 yourself-0.056
 know-0.052
Token sleep
Token ID3993
Feature activation-2.03e-02

Pos logit contributions
aches+0.076
 belong+0.073
 yourself+0.068
 ourselves+0.068
inations+0.067
Neg logit contributions
 "-0.071
 Lee-0.066
 �-0.066
 girl-0.062
joy-0.061
Token at
Token ID379
Feature activation+6.88e-02

Pos logit contributions
elt+0.025
hog+0.025
sed+0.025
 gir+0.024
keeper+0.024
Neg logit contributions
 happen-0.044
aches-0.044
 belong-0.044
 yourself-0.043
 know-0.041
Token night
Token ID1755
Feature activation-3.38e-02

Pos logit contributions
aches+0.173
 belong+0.163
ck+0.153
inations+0.149
its+0.149
Neg logit contributions
 "-0.090
 girl-0.087
joy-0.084
 Lee-0.081
 boy-0.079
Token,
Token ID11
Feature activation+1.64e-03

Pos logit contributions
sed+0.046
hog+0.044
elt+0.043
 gir+0.043
+0.042
Neg logit contributions
 happen-0.071
 belong-0.069
aches-0.068
 yourself-0.066
 know-0.066
Token she
Token ID673
Feature activation+1.60e-02

Pos logit contributions
 belong+0.004
aches+0.004
 happen+0.003
 yourself+0.003
 know+0.003
Neg logit contributions
 "-0.002
hog-0.002
sed-0.002
elt-0.002
-0.002
Token was
Token ID373
Feature activation+9.67e-03

Pos logit contributions
 belong+0.035
aches+0.034
 happen+0.033
 yourself+0.033
 ourselves+0.032
Neg logit contributions
 "-0.022
 girl-0.020
-0.020
 �-0.020
elt-0.020
Token very
Token ID845
Feature activation+9.30e-03

Pos logit contributions
 belong+0.020
aches+0.020
 happen+0.019
 yourself+0.019
 ourselves+0.019
Neg logit contributions
 "-0.014
joy-0.013
elt-0.013
 girl-0.013
hog-0.013
Token sorry
Token ID7926
Feature activation-3.36e-03

Pos logit contributions
aches+0.080
irs+0.078
 belong+0.075
 happen+0.075
irl+0.073
Neg logit contributions
 "-0.037
 girl-0.034
sed-0.031
 �-0.031
joy-0.031
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
hog+0.005
joy+0.005
sed+0.005
elt+0.005
 "+0.005
Neg logit contributions
 belong-0.006
aches-0.006
 happen-0.006
 yourself-0.006
 know-0.006
Token He
Token ID679
Feature activation+2.36e-03

Pos logit contributions
sed+0.019
hog+0.019
 gir+0.018
keeper+0.018
+0.018
Neg logit contributions
 yourself-0.029
 belong-0.029
 know-0.028
aches-0.028
 happen-0.027
Token wishes
Token ID12802
Feature activation+3.62e-03

Pos logit contributions
 belong+0.004
aches+0.004
 happen+0.004
 yourself+0.004
 know+0.004
Neg logit contributions
 "-0.004
hog-0.004
sed-0.004
elt-0.004
 girl-0.004
Token he
Token ID339
Feature activation+3.23e-02

Pos logit contributions
 belong+0.008
aches+0.008
 yourself+0.008
 happen+0.008
 know+0.008
Neg logit contributions
 "-0.004
hog-0.004
sed-0.004
elt-0.004
 girl-0.004
Token did
Token ID750
Feature activation+8.13e-02

Pos logit contributions
irs+0.071
aches+0.069
 ease+0.067
 ourselves+0.067
 belong+0.066
Neg logit contributions
 "-0.043
 girl-0.042
 boy-0.041
-0.040
hog-0.040
Token not
Token ID407
Feature activation+5.03e-02

Pos logit contributions
irs+0.170
 ease+0.141
wy+0.140
hest+0.140
iver+0.138
Neg logit contributions
 "-0.145
 lady-0.101
joy-0.100
 man-0.100
 girl-0.099
Token bite
Token ID13197
Feature activation+2.24e-02

Pos logit contributions
irs+0.074
 ourselves+0.067
aches+0.067
 ease+0.067
vern+0.066
Neg logit contributions
 "-0.097
 girl-0.094
 boy-0.093
hog-0.093
 gir-0.092
Token Mia
Token ID32189
Feature activation+9.85e-03

Pos logit contributions
 belong+0.037
aches+0.035
 happen+0.033
 know+0.033
 achieve+0.033
Neg logit contributions
 "-0.042
 girl-0.041
 Lee-0.040
elt-0.039
 boy-0.039
Token.
Token ID13
Feature activation-1.03e-02

Pos logit contributions
 belong+0.018
aches+0.018
 yourself+0.017
 happen+0.017
irs+0.017
Neg logit contributions
 "-0.016
hog-0.015
 girl-0.015
keeper-0.015
 Lee-0.015
Token He
Token ID679
Feature activation+1.61e-02

Pos logit contributions
hog+0.012
sed+0.011
+0.011
 gir+0.011
keeper+0.011
Neg logit contributions
 belong-0.024
 yourself-0.024
 know-0.023
aches-0.023
 happen-0.023
Token his
Token ID465
Feature activation+3.73e-03

Pos logit contributions
 belong+0.019
aches+0.018
 happen+0.017
 yourself+0.017
 know+0.017
Neg logit contributions
 "-0.024
joy-0.023
 �-0.023
hog-0.023
 gir-0.023
Token work
Token ID670
Feature activation+7.06e-03

Pos logit contributions
 belong+0.007
aches+0.007
 happen+0.007
 yourself+0.007
 know+0.007
Neg logit contributions
 "-0.006
sed-0.005
hog-0.005
elt-0.005
joy-0.005
Token.
Token ID13
Feature activation+1.42e-02

Pos logit contributions
 belong+0.014
aches+0.014
 yourself+0.013
 happen+0.013
 know+0.013
Neg logit contributions
 "-0.010
hog-0.010
 girl-0.010
joy-0.010
-0.010
Token<|endoftext|>
Token ID50256
Feature activation+1.72e-02

Pos logit contributions
aches+0.030
 belong+0.029
 happen+0.028
 yourself+0.028
 ourselves+0.027
Neg logit contributions
 "-0.022
 �-0.020
joy-0.020
elt-0.019
hog-0.019
TokenOnce
Token ID7454
Feature activation+4.48e-02

Pos logit contributions
aches+0.034
 belong+0.034
 happen+0.032
 yourself+0.032
 know+0.031
Neg logit contributions
 "-0.028
joy-0.025
 girl-0.025
-0.025
hog-0.025
Token upon
Token ID2402
Feature activation+8.46e-02

Pos logit contributions
 yourself+0.097
 belong+0.095
aches+0.094
 happen+0.090
 ourselves+0.089
Neg logit contributions
 "-0.070
 �-0.061
 girl-0.061
joy-0.059
 boy-0.059
Token a
Token ID257
Feature activation+2.75e-02

Pos logit contributions
aches+0.204
irs+0.203
 belong+0.195
 happen+0.186
ws+0.186
Neg logit contributions
 "-0.116
 girl-0.097
 boy-0.095
 �-0.087
 mouse-0.083
Token time
Token ID640
Feature activation-1.65e-02

Pos logit contributions
 belong+0.035
aches+0.035
 yourself+0.034
 happen+0.032
 ourselves+0.032
Neg logit contributions
 "-0.062
joy-0.060
 �-0.060
 girl-0.060
-0.059
Token,
Token ID11
Feature activation+4.87e-03

Pos logit contributions
keeper+0.018
hog+0.018
elt+0.018
sed+0.018
 girl+0.018
Neg logit contributions
 belong-0.038
aches-0.037
 happen-0.037
 yourself-0.036
 know-0.035
Token there
Token ID612
Feature activation+2.40e-02

Pos logit contributions
 belong+0.008
aches+0.008
 happen+0.008
 yourself+0.008
 know+0.008
Neg logit contributions
 "-0.009
hog-0.008
 girl-0.008
-0.008
elt-0.008
Token was
Token ID373
Feature activation+1.70e-02

Pos logit contributions
aches+0.055
 belong+0.054
 yourself+0.053
irs+0.051
 ourselves+0.051
Neg logit contributions
 "-0.033
 girl-0.028
 �-0.028
-0.028
 boy-0.027
Token and
Token ID290
Feature activation-1.92e-03

Pos logit contributions
sed+0.071
 Sure+0.068
elt+0.067
 Pole+0.067
hog+0.066
Neg logit contributions
 yourself-0.075
aches-0.073
 belong-0.070
 happen-0.068
 ourselves-0.068
Token the
Token ID262
Feature activation-2.76e-02

Pos logit contributions
 "+0.003
hog+0.003
sed+0.003
joy+0.003
elt+0.003
Neg logit contributions
 belong-0.004
aches-0.004
 happen-0.003
 yourself-0.003
 know-0.003
Token k
Token ID479
Feature activation-1.48e-02

Pos logit contributions
joy+0.035
elt+0.034
 �+0.033
 Pole+0.033
hog+0.033
Neg logit contributions
 belong-0.062
 yourself-0.060
aches-0.059
 know-0.057
 happen-0.056
Tokenitty
Token ID9760
Feature activation-6.06e-02

Pos logit contributions
 "+0.018
 �+0.017
joy+0.017
 Pole+0.017
sed+0.017
Neg logit contributions
 belong-0.033
aches-0.033
 yourself-0.032
 ourselves-0.031
 know-0.030
Token wanted
Token ID2227
Feature activation-3.44e-03

Pos logit contributions
joy+0.129
 Pole+0.117
 Lee+0.117
icol+0.114
 �+0.110
Neg logit contributions
aches-0.087
 belong-0.086
 know-0.085
 have-0.076
 yourself-0.074
Token to
Token ID284
Feature activation+6.82e-02

Pos logit contributions
 "+0.005
hog+0.005
+0.005
sed+0.005
elt+0.005
Neg logit contributions
 belong-0.007
aches-0.007
 happen-0.007
 yourself-0.007
 know-0.006
Token play
Token ID711
Feature activation+8.28e-03

Pos logit contributions
irs+0.123
 ourselves+0.115
cks+0.115
inations+0.114
lled+0.112
Neg logit contributions
 "-0.115
 girl-0.103
 monkey-0.102
 cat-0.102
 lady-0.101
Token with
Token ID351
Feature activation+3.29e-02

Pos logit contributions
 belong+0.017
aches+0.017
 happen+0.017
 yourself+0.017
 know+0.016
Neg logit contributions
 "-0.012
hog-0.011
 girl-0.011
-0.011
joy-0.011
Token it
Token ID340
Feature activation+2.42e-03

Pos logit contributions
 belong+0.061
 achieve+0.060
 happen+0.057
aches+0.056
inations+0.056
Neg logit contributions
 "-0.059
 gir-0.051
 girl-0.050
 mouse-0.049
 �-0.049
Token.
Token ID13
Feature activation-3.09e-02

Pos logit contributions
 belong+0.004
aches+0.004
 happen+0.004
 yourself+0.004
 know+0.004
Neg logit contributions
 "-0.004
hog-0.004
sed-0.004
elt-0.004
 girl-0.004
Token
Token ID220
Feature activation+8.55e-02

Pos logit contributions
sed+0.054
joy+0.053
 gir+0.049
hog+0.048
icol+0.048
Neg logit contributions
 know-0.053
 yourself-0.050
 belong-0.049
 avoid-0.048
aches-0.047
Token mixer
Token ID33938
Feature activation+3.06e-03

Pos logit contributions
 "+0.000
hog+0.000
sed+0.000
elt+0.000
joy+0.000
Neg logit contributions
 belong-0.001
aches-0.001
 happen-0.001
 yourself-0.001
 know-0.001
Token was
Token ID373
Feature activation+2.44e-02

Pos logit contributions
 belong+0.006
aches+0.006
 happen+0.006
 yourself+0.006
 know+0.005
Neg logit contributions
 "-0.005
hog-0.005
sed-0.005
elt-0.005
 girl-0.005
Token very
Token ID845
Feature activation+1.96e-02

Pos logit contributions
aches+0.045
irs+0.045
 belong+0.045
 yourself+0.044
 achieve+0.043
Neg logit contributions
 "-0.041
 �-0.036
 girl-0.035
 gir-0.034
hog-0.033
Token special
Token ID2041
Feature activation-2.27e-02

Pos logit contributions
aches+0.032
 yourself+0.031
 belong+0.031
 happen+0.031
irs+0.031
Neg logit contributions
 "-0.036
 girl-0.034
-0.033
 gir-0.032
 �-0.032
Token.
Token ID13
Feature activation-1.87e-02

Pos logit contributions
sed+0.045
 Pole+0.044
elt+0.044
+0.044
joy+0.043
Neg logit contributions
 belong-0.034
 happen-0.033
 yourself-0.032
aches-0.032
 know-0.031
Token
Token ID198
Feature activation+8.05e-02

Pos logit contributions
sed+0.036
hog+0.036
joy+0.035
 gir+0.035
keeper+0.035
Neg logit contributions
 yourself-0.027
 belong-0.027
 know-0.027
aches-0.026
 ourselves-0.025
Token
Token ID198
Feature activation+8.85e-03

Pos logit contributions
 belong+0.230
aches+0.223
 yourself+0.216
 happen+0.210
 ourselves+0.206
Neg logit contributions
 "-0.091
 �-0.067
 Sure-0.064
 boy-0.061
 girl-0.060
TokenOne
Token ID3198
Feature activation-1.58e-02

Pos logit contributions
 belong+0.009
aches+0.009
 happen+0.009
 yourself+0.008
 ourselves+0.008
Neg logit contributions
 "-0.022
 girl-0.021
-0.021
hog-0.021
 �-0.021
Token day
Token ID1110
Feature activation-5.01e-02

Pos logit contributions
sed+0.035
joy+0.034
hog+0.034
 Pole+0.034
 Lee+0.033
Neg logit contributions
 belong-0.022
aches-0.020
 happen-0.020
 know-0.020
 avoid-0.018
Token,
Token ID11
Feature activation-3.61e-02

Pos logit contributions
hog+0.108
keeper+0.106
 Pole+0.099
 clerk+0.098
sed+0.098
Neg logit contributions
 belong-0.075
 are-0.071
 happen-0.069
 have-0.068
 yourself-0.068
Token the
Token ID262
Feature activation-3.73e-02

Pos logit contributions
 Pole+0.093
hog+0.092
joy+0.090
keeper+0.089
sed+0.089
Neg logit contributions
 belong-0.034
aches-0.032
 yourself-0.030
 happen-0.030
 know-0.029
Token get
Token ID651
Feature activation+5.33e-02

Pos logit contributions
irs+0.145
 belong+0.128
aches+0.124
cks+0.120
 ease+0.119
Neg logit contributions
 "-0.148
 �-0.132
 girl-0.132
-0.130
keeper-0.124
Token free
Token ID1479
Feature activation+4.80e-02

Pos logit contributions
aches+0.090
 achieve+0.087
 belong+0.087
 happen+0.083
irs+0.083
Neg logit contributions
 "-0.092
elt-0.090
 girl-0.088
keeper-0.084
 �-0.084
Token.
Token ID13
Feature activation-2.24e-02

Pos logit contributions
aches+0.098
 belong+0.096
irs+0.091
ck+0.089
 know+0.088
Neg logit contributions
 "-0.074
 girl-0.069
 �-0.068
elt-0.067
keeper-0.065
Tokenâ
Token ID22940
Feature activation+9.05e-03

Pos logit contributions
sed+0.032
hog+0.030
 girl+0.030
 gir+0.030
keeper+0.029
Neg logit contributions
 yourself-0.045
 know-0.044
 belong-0.044
 ourselves-0.042
irs-0.041
Token
Token ID26391
Feature activation+1.26e-02

Pos logit contributions
aches+0.014
 belong+0.013
 happen+0.013
 yourself+0.013
 know+0.012
Neg logit contributions
 "-0.018
hog-0.017
-0.017
 girl-0.017
elt-0.017
Token
Token ID220
Feature activation+3.90e-02

Pos logit contributions
 belong+0.024
aches+0.024
 happen+0.023
 yourself+0.022
irs+0.021
Neg logit contributions
 "-0.021
-0.020
joy-0.019
sed-0.019
elt-0.019
Token
Token ID198
Feature activation+4.88e-02

Pos logit contributions
 belong+0.084
aches+0.082
 happen+0.080
ck+0.074
irl+0.072
Neg logit contributions
 "-0.062
 �-0.055
elt-0.054
 Lee-0.054
-0.053
TokenSo
Token ID2396
Feature activation-2.57e-03

Pos logit contributions
 belong+0.112
aches+0.104
 happen+0.101
 yours+0.097
 harm+0.095
Neg logit contributions
 "-0.078
 �-0.073
-0.071
hog-0.063
 Sure-0.060
Token Jay
Token ID9180
Feature activation-2.02e-02

Pos logit contributions
 "+0.003
hog+0.003
elt+0.003
sed+0.003
joy+0.003
Neg logit contributions
 belong-0.006
aches-0.005
 happen-0.005
 yourself-0.005
 know-0.005
Token settled
Token ID10282
Feature activation-2.04e-02

Pos logit contributions
sed+0.029
joy+0.028
 "+0.028
elt+0.028
 Pole+0.028
Neg logit contributions
 belong-0.040
aches-0.040
 yourself-0.038
 happen-0.038
 know-0.038
Token down
Token ID866
Feature activation-5.91e-03

Pos logit contributions
hog+0.030
sed+0.030
 gir+0.029
 Pole+0.029
elt+0.029
Neg logit contributions
 yourself-0.040
aches-0.039
 belong-0.038
 ourselves-0.037
 happen-0.037
Token was
Token ID373
Feature activation-6.80e-03

Pos logit contributions
joy+0.014
 Pole+0.013
elt+0.013
 "+0.013
 Lee+0.013
Neg logit contributions
 belong-0.015
aches-0.015
 happen-0.015
 yourself-0.014
 know-0.014
Token big
Token ID1263
Feature activation-6.33e-02

Pos logit contributions
+0.012
 Pole+0.012
hog+0.012
 "+0.012
elt+0.012
Neg logit contributions
 belong-0.011
aches-0.011
 yourself-0.011
 happen-0.011
 know-0.010
Token and
Token ID290
Feature activation-6.65e-03

Pos logit contributions
 Sure+0.117
elt+0.113
ometer+0.112
+0.110
keeper+0.109
Neg logit contributions
 yourself-0.099
 happen-0.098
 are-0.094
 belong-0.092
 harm-0.091
Token ugly
Token ID13400
Feature activation-4.65e-02

Pos logit contributions
 "+0.009
elt+0.009
joy+0.009
hog+0.009
 Pole+0.009
Neg logit contributions
 belong-0.013
aches-0.013
 happen-0.013
 yourself-0.013
 know-0.012
Token and
Token ID290
Feature activation-2.86e-03

Pos logit contributions
 Sure+0.091
keeper+0.088
+0.088
ometer+0.087
 Pole+0.087
Neg logit contributions
 happen-0.073
 belong-0.068
 yourself-0.067
 are-0.064
 ourselves-0.063
Token had
Token ID550
Feature activation+2.51e-02

Pos logit contributions
 "+0.004
elt+0.004
joy+0.004
hog+0.004
+0.004
Neg logit contributions
 belong-0.005
aches-0.005
 happen-0.005
 yourself-0.005
 know-0.005
Token a
Token ID257
Feature activation-1.32e-02

Pos logit contributions
aches+0.044
 belong+0.044
 achieve+0.042
 happen+0.042
 yourself+0.041
Neg logit contributions
 "-0.044
sed-0.043
 gir-0.042
elt-0.041
hog-0.040
Token long
Token ID890
Feature activation-1.69e-02

Pos logit contributions
elt+0.022
hog+0.022
 Pole+0.022
joy+0.022
 "+0.022
Neg logit contributions
 belong-0.023
aches-0.023
 happen-0.022
 yourself-0.022
 know-0.022
Token needle
Token ID17598
Feature activation+1.18e-03

Pos logit contributions
elt+0.019
keeper+0.019
hog+0.019
 Pole+0.019
joy+0.018
Neg logit contributions
aches-0.039
 belong-0.039
 happen-0.038
 yourself-0.037
 know-0.037
Token.
Token ID13
Feature activation-1.63e-02

Pos logit contributions
 belong+0.002
aches+0.002
 happen+0.002
 yourself+0.002
 know+0.002
Neg logit contributions
 "-0.002
hog-0.002
elt-0.002
sed-0.002
 girl-0.002
Token
Token ID198
Feature activation+5.35e-02

Pos logit contributions
hog+0.022
sed+0.021
 gir+0.021
 girl+0.021
keeper+0.021
Neg logit contributions
 yourself-0.034
 belong-0.033
aches-0.032
 ourselves-0.032
 know-0.032
Token,
Token ID11
Feature activation-9.79e-02

Pos logit contributions
 Pole+0.104
suit+0.099
 Girl+0.099
umm+0.098
ksh+0.097
Neg logit contributions
aches-0.096
 happen-0.094
 belong-0.092
 know-0.086
 yourself-0.084
Token "
Token ID366
Feature activation-3.32e-03

Pos logit contributions
umm+0.137
hog+0.130
suit+0.124
 girl+0.120
keeper+0.116
Neg logit contributions
 them-0.179
 happen-0.175
 have-0.175
 know-0.175
't-0.171
TokenWow
Token ID22017
Feature activation-7.13e-03

Pos logit contributions
 "+0.005
sed+0.005
hog+0.005
joy+0.005
 Lee+0.005
Neg logit contributions
 belong-0.007
aches-0.007
 yourself-0.006
 happen-0.006
 know-0.006
Token,
Token ID11
Feature activation+6.35e-05

Pos logit contributions
hog+0.008
elt+0.007
 "+0.007
sed+0.007
keeper+0.007
Neg logit contributions
 belong-0.017
aches-0.017
 happen-0.017
 yourself-0.017
 ourselves-0.016
Token the
Token ID262
Feature activation-2.80e-03

Pos logit contributions
 belong+0.000
aches+0.000
 happen+0.000
 yourself+0.000
 know+0.000
Neg logit contributions
 "-0.000
hog-0.000
elt-0.000
sed-0.000
joy-0.000
Token bug
Token ID5434
Feature activation+4.77e-03

Pos logit contributions
 "+0.003
elt+0.003
+0.003
hog+0.003
joy+0.003
Neg logit contributions
 belong-0.007
aches-0.007
 happen-0.007
 yourself-0.007
 know-0.006
Token is
Token ID318
Feature activation+2.18e-02

Pos logit contributions
 belong+0.007
aches+0.007
 yourself+0.007
 happen+0.007
 know+0.006
Neg logit contributions
 "-0.009
hog-0.009
sed-0.009
 girl-0.009
elt-0.009
Token so
Token ID523
Feature activation+2.38e-02

Pos logit contributions
aches+0.044
 belong+0.043
irs+0.043
 happen+0.041
 avoid+0.041
Neg logit contributions
 "-0.033
 gir-0.030
sed-0.030
 �-0.029
hog-0.029
Token pretty
Token ID2495
Feature activation-2.65e-02

Pos logit contributions
aches+0.043
 belong+0.041
irs+0.041
 happen+0.040
 yourself+0.039
Neg logit contributions
 "-0.041
 gir-0.038
hog-0.038
 girl-0.037
sed-0.037
Token!
Token ID0
Feature activation-6.31e-02

Pos logit contributions
sed+0.062
joy+0.060
elt+0.060
 Pole+0.059
 Sure+0.059
Neg logit contributions
 yourself-0.030
 happen-0.029
 belong-0.029
aches-0.028
 ourselves-0.027
Token Can
Token ID1680
Feature activation+1.69e-02

Pos logit contributions
joy+0.099
sed+0.087
keeper+0.082
 Pole+0.081
 girl+0.081
Neg logit contributions
 yourself-0.123
 know-0.120
 belong-0.120
 have-0.113
 yours-0.113
Token sorry
Token ID7926
Feature activation-1.36e-02

Pos logit contributions
 belong+0.003
aches+0.003
 happen+0.003
 yourself+0.003
 know+0.003
Neg logit contributions
 "-0.003
hog-0.003
sed-0.003
elt-0.003
-0.002
Token you
Token ID345
Feature activation+2.93e-02

Pos logit contributions
sed+0.023
elt+0.023
joy+0.023
hog+0.022
 Pole+0.022
Neg logit contributions
 belong-0.024
 yourself-0.024
 happen-0.024
aches-0.023
 ourselves-0.023
Token had
Token ID550
Feature activation+4.61e-02

Pos logit contributions
aches+0.051
 yourself+0.047
irs+0.047
 achieve+0.046
irl+0.045
Neg logit contributions
 "-0.053
 girl-0.051
sed-0.050
elt-0.050
hog-0.049
Token a
Token ID257
Feature activation-2.07e-02

Pos logit contributions
 achieve+0.077
aches+0.076
irs+0.074
 belong+0.073
 accomplish+0.069
Neg logit contributions
sed-0.082
 "-0.081
 gir-0.075
 eyes-0.074
 mouth-0.074
Token terrible
Token ID7818
Feature activation-2.09e-02

Pos logit contributions
elt+0.034
 Pole+0.033
hog+0.033
+0.032
keeper+0.032
Neg logit contributions
 belong-0.038
aches-0.037
 know-0.036
 happen-0.036
 yourself-0.035
Token day
Token ID1110
Feature activation+1.09e-02

Pos logit contributions
keeper+0.030
hog+0.030
+0.030
elt+0.030
 Pole+0.030
Neg logit contributions
 belong-0.042
 happen-0.041
aches-0.041
 yourself-0.040
 know-0.039
Token.
Token ID13
Feature activation-2.14e-02

Pos logit contributions
aches+0.019
 belong+0.019
 happen+0.018
 yourself+0.018
irs+0.017
Neg logit contributions
 "-0.019
hog-0.018
elt-0.018
 girl-0.018
sed-0.018
Token I
Token ID314
Feature activation+5.20e-02

Pos logit contributions
joy+0.030
sed+0.030
 Pole+0.030
hog+0.030
keeper+0.030
Neg logit contributions
 belong-0.043
 yourself-0.043
 know-0.040
 happen-0.040
aches-0.039
Token came
Token ID1625
Feature activation+3.35e-02

Pos logit contributions
 yourself+0.115
 ourselves+0.113
irs+0.113
aches+0.112
 happen+0.104
Neg logit contributions
 girl-0.067
hog-0.067
sed-0.066
 "-0.065
 boy-0.065
Token to
Token ID284
Feature activation+5.48e-02

Pos logit contributions
aches+0.080
 belong+0.076
irs+0.076
 happen+0.073
irl+0.073
Neg logit contributions
 "-0.042
 girl-0.040
elt-0.037
hog-0.036
 boy-0.035
Token comfort
Token ID4467
Feature activation+2.51e-02

Pos logit contributions
aches+0.094
irl+0.086
inations+0.086
hest+0.082
 belong+0.082
Neg logit contributions
 "-0.108
 girl-0.103
 mouse-0.096
 boy-0.095
 Sure-0.095
Token and
Token ID290
Feature activation+2.57e-02

Pos logit contributions
sed+0.079
hog+0.078
 Sure+0.077
ometer+0.077
 Pole+0.074
Neg logit contributions
 yourself-0.075
 happen-0.072
 ourselves-0.068
 belong-0.068
aches-0.067
Token happy
Token ID3772
Feature activation-2.28e-02

Pos logit contributions
aches+0.052
irs+0.050
 belong+0.049
 happen+0.047
 yourself+0.047
Neg logit contributions
 "-0.040
 girl-0.038
sed-0.036
 �-0.036
-0.035
Token again
Token ID757
Feature activation+1.70e-03

Pos logit contributions
sed+0.045
hog+0.045
+0.045
 Pole+0.045
ometer+0.044
Neg logit contributions
 happen-0.034
 belong-0.032
 yourself-0.032
aches-0.032
 ourselves-0.030
Token.
Token ID13
Feature activation-3.05e-02

Pos logit contributions
 belong+0.003
aches+0.003
 happen+0.003
 yourself+0.003
 know+0.003
Neg logit contributions
 "-0.003
hog-0.003
sed-0.003
elt-0.003
 girl-0.003
Token She
Token ID1375
Feature activation+8.16e-03

Pos logit contributions
 gir+0.041
hog+0.041
 Pole+0.039
keeper+0.039
sed+0.039
Neg logit contributions
 know-0.062
 yourself-0.062
 belong-0.059
 ourselves-0.058
 appreciate-0.056
Token puts
Token ID7584
Feature activation+4.27e-03

Pos logit contributions
 belong+0.016
aches+0.016
 happen+0.015
 yourself+0.015
 ourselves+0.015
Neg logit contributions
 "-0.012
sed-0.012
 girl-0.012
hog-0.012
elt-0.012
Token on
Token ID319
Feature activation-1.63e-03

Pos logit contributions
 belong+0.007
aches+0.007
 happen+0.007
 yourself+0.007
 know+0.007
Neg logit contributions
 "-0.008
 girl-0.007
elt-0.007
sed-0.007
hog-0.007
Token her
Token ID607
Feature activation+4.03e-03

Pos logit contributions
 "+0.004
hog+0.004
elt+0.003
sed+0.003
+0.003
Neg logit contributions
 belong-0.002
aches-0.002
 yourself-0.002
 happen-0.002
 know-0.002
Token purple
Token ID14032
Feature activation-2.55e-02

Pos logit contributions
 belong+0.009
aches+0.009
 happen+0.008
 yourself+0.008
 know+0.008
Neg logit contributions
 "-0.005
sed-0.005
hog-0.005
 girl-0.005
elt-0.005
Token dress
Token ID6576
Feature activation-3.54e-02

Pos logit contributions
hog+0.039
keeper+0.039
elt+0.039
joy+0.038
 "+0.038
Neg logit contributions
 belong-0.047
aches-0.046
 happen-0.046
 yourself-0.045
irl-0.044
Token and
Token ID290
Feature activation+2.19e-03

Pos logit contributions
joy+0.051
elt+0.050
sed+0.050
hog+0.048
 Pole+0.048
Neg logit contributions
aches-0.069
 belong-0.067
 happen-0.066
 yourself-0.066
 ourselves-0.062
Token stole
Token ID18240
Feature activation+1.34e-02

Pos logit contributions
 ourselves+0.119
aches+0.119
irs+0.118
 yourself+0.116
 belong+0.110
Neg logit contributions
 "-0.054
hog-0.049
 boy-0.048
 girl-0.047
sed-0.047
Token his
Token ID465
Feature activation+5.25e-03

Pos logit contributions
 belong+0.020
aches+0.020
 happen+0.019
 yourself+0.018
 achieve+0.018
Neg logit contributions
 "-0.027
 girl-0.026
joy-0.025
keeper-0.025
hog-0.025
Token fish
Token ID5916
Feature activation+2.00e-02

Pos logit contributions
 belong+0.011
aches+0.011
 yourself+0.011
 happen+0.010
 know+0.010
Neg logit contributions
 "-0.007
sed-0.007
hog-0.007
 girl-0.007
elt-0.007
Token.
Token ID13
Feature activation-6.20e-02

Pos logit contributions
 belong+0.040
aches+0.040
 happen+0.039
 yourself+0.038
irs+0.038
Neg logit contributions
 "-0.029
hog-0.028
keeper-0.028
 girl-0.027
-0.027
Token The
Token ID383
Feature activation-2.46e-02

Pos logit contributions
sed+0.104
elt+0.095
 girl+0.092
 gir+0.092
 Pole+0.090
Neg logit contributions
 know-0.105
 yourself-0.097
 have-0.096
 enjoy-0.095
 avoid-0.095
Token boy
Token ID2933
Feature activation-3.47e-02

Pos logit contributions
elt+0.026
sed+0.025
 Lee+0.024
+0.024
hog+0.023
Neg logit contributions
 belong-0.058
aches-0.058
 know-0.056
 happen-0.056
 yourself-0.056
Token said
Token ID531
Feature activation-3.22e-02

Pos logit contributions
joy+0.062
elt+0.060
 Pole+0.060
 Lee+0.058
icol+0.057
Neg logit contributions
aches-0.062
 know-0.059
 belong-0.059
ck-0.057
 happen-0.056
Token,
Token ID11
Feature activation-6.26e-02

Pos logit contributions
elt+0.055
 Pole+0.055
sed+0.053
hog+0.053
 Sure+0.053
Neg logit contributions
aches-0.059
 belong-0.057
 happen-0.056
 yourself-0.052
 know-0.051
Token "
Token ID366
Feature activation+3.33e-02

Pos logit contributions
icol+0.049
sed+0.048
umm+0.047
suit+0.047
elt+0.046
Neg logit contributions
 know-0.155
 happen-0.154
 yourself-0.153
aches-0.153
 achieve-0.150
TokenI
Token ID40
Feature activation+4.82e-02

Pos logit contributions
 belong+0.066
aches+0.065
 happen+0.065
irs+0.064
 know+0.061
Neg logit contributions
-0.050
elt-0.050
keeper-0.049
 girl-0.048
hog-0.048
Token'm
Token ID1101
Feature activation-1.19e-03

Pos logit contributions
 yourself+0.081
 ourselves+0.080
irs+0.080
aches+0.075
 ease+0.071
Neg logit contributions
 "-0.088
 girl-0.086
 boy-0.086
keeper-0.084
hog-0.083
Token the
Token ID262
Feature activation+9.10e-03

Pos logit contributions
 belong+0.075
aches+0.075
 happen+0.072
irs+0.070
 know+0.069
Neg logit contributions
 "-0.053
 �-0.049
 girl-0.048
hog-0.048
keeper-0.048
Token toy
Token ID13373
Feature activation-1.54e-02

Pos logit contributions
aches+0.016
 belong+0.016
 happen+0.016
 yourself+0.016
 know+0.015
Neg logit contributions
 "-0.015
 girl-0.015
sed-0.014
hog-0.014
joy-0.014
Token.
Token ID13
Feature activation-4.26e-02

Pos logit contributions
elt+0.021
joy+0.021
 "+0.021
sed+0.021
+0.021
Neg logit contributions
 belong-0.031
aches-0.030
 happen-0.030
 yourself-0.030
 ourselves-0.029
Token Max
Token ID5436
Feature activation-4.19e-02

Pos logit contributions
sed+0.073
 gir+0.071
elt+0.069
joy+0.068
 girl+0.067
Neg logit contributions
 know-0.070
 yourself-0.068
 ourselves-0.067
irs-0.066
 belong-0.066
Token was
Token ID373
Feature activation+1.27e-02

Pos logit contributions
 Pole+0.072
elt+0.071
joy+0.071
sed+0.068
ometer+0.065
Neg logit contributions
 know-0.071
aches-0.070
 belong-0.069
 happen-0.066
 yourself-0.064
Token happy
Token ID3772
Feature activation-3.41e-02

Pos logit contributions
 belong+0.026
aches+0.026
 happen+0.025
 yourself+0.025
irs+0.024
Neg logit contributions
 "-0.019
 �-0.017
joy-0.017
 girl-0.017
hog-0.017
Token and
Token ID290
Feature activation-2.79e-02

Pos logit contributions
 Pole+0.062
sed+0.062
ometer+0.059
elt+0.059
 gir+0.058
Neg logit contributions
 happen-0.057
 know-0.054
 yourself-0.054
 belong-0.052
 avoid-0.051
Token agreed
Token ID4987
Feature activation-1.46e-02

Pos logit contributions
joy+0.044
 Pole+0.043
elt+0.043
sed+0.041
+0.041
Neg logit contributions
 belong-0.054
 know-0.052
aches-0.051
 happen-0.051
 yourself-0.049
Token.
Token ID13
Feature activation-4.22e-02

Pos logit contributions
sed+0.025
elt+0.025
 Pole+0.025
hog+0.024
+0.024
Neg logit contributions
 belong-0.025
 happen-0.025
aches-0.025
 know-0.024
 yourself-0.023
Token
Token ID198
Feature activation+5.48e-02

Pos logit contributions
sed+0.070
 gir+0.070
elt+0.068
joy+0.067
 Pole+0.066
Neg logit contributions
 know-0.070
 yourself-0.067
 ourselves-0.067
 ease-0.064
irs-0.064
Token
Token ID198
Feature activation+2.56e-02

Pos logit contributions
 belong+0.147
aches+0.139
 happen+0.133
 yourself+0.131
 ourselves+0.124
Neg logit contributions
 "-0.075
 �-0.070
 Sure-0.059
-0.058
hog-0.057
Token made
Token ID925
Feature activation+3.67e-02

Pos logit contributions
aches+0.050
 belong+0.048
irs+0.047
 yourself+0.047
 ourselves+0.047
Neg logit contributions
 "-0.032
sed-0.031
-0.030
 girl-0.030
hog-0.030
Token his
Token ID465
Feature activation+3.53e-03

Pos logit contributions
 belong+0.068
aches+0.067
 achieve+0.062
irs+0.060
 happen+0.060
Neg logit contributions
 "-0.066
 girl-0.060
 boy-0.060
joy-0.059
 �-0.057
Token mom
Token ID1995
Feature activation+2.25e-02

Pos logit contributions
 belong+0.008
aches+0.008
 yourself+0.008
 happen+0.008
 know+0.008
Neg logit contributions
 "-0.004
sed-0.004
hog-0.004
 girl-0.004
joy-0.004
Token and
Token ID290
Feature activation-8.44e-03

Pos logit contributions
aches+0.047
 belong+0.047
irs+0.046
 yourself+0.046
 happen+0.046
Neg logit contributions
 girl-0.031
 "-0.030
keeper-0.029
 boy-0.028
hog-0.028
Token dad
Token ID9955
Feature activation+1.04e-02

Pos logit contributions
hog+0.011
elt+0.011
 Pole+0.011
 "+0.011
+0.011
Neg logit contributions
 belong-0.018
aches-0.018
 yourself-0.017
 know-0.017
 happen-0.017
Token angry
Token ID7954
Feature activation-2.11e-02

Pos logit contributions
aches+0.018
 belong+0.018
 yourself+0.018
 happen+0.017
irs+0.017
Neg logit contributions
 "-0.018
 girl-0.017
hog-0.017
joy-0.017
elt-0.017
Token.
Token ID13
Feature activation-2.29e-02

Pos logit contributions
hog+0.041
+0.040
elt+0.040
sed+0.040
 Pole+0.039
Neg logit contributions
 belong-0.032
 happen-0.032
 yourself-0.032
 ourselves-0.031
aches-0.030
Token The
Token ID383
Feature activation-1.31e-02

Pos logit contributions
 gir+0.029
joy+0.029
hog+0.028
sed+0.028
 girl+0.028
Neg logit contributions
 yourself-0.048
 know-0.047
 belong-0.047
 ourselves-0.045
aches-0.044
Token next
Token ID1306
Feature activation-1.98e-02

Pos logit contributions
elt+0.029
hog+0.028
 "+0.028
joy+0.028
+0.028
Neg logit contributions
 belong-0.017
 happen-0.016
aches-0.015
 yourself-0.015
 know-0.014
Token day
Token ID1110
Feature activation-3.84e-02

Pos logit contributions
hog+0.044
 �+0.043
ishly+0.043
+0.043
 "+0.043
Neg logit contributions
 belong-0.026
 happen-0.024
 know-0.022
aches-0.022
 yourself-0.022
Token,
Token ID11
Feature activation-4.06e-03

Pos logit contributions
hog+0.069
sed+0.066
ishly+0.064
 gir+0.062
icol+0.060
Neg logit contributions
 belong-0.069
 are-0.065
 have-0.065
 know-0.065
 happen-0.065
Token too
Token ID1165
Feature activation-1.98e-02

Pos logit contributions
 Pole+0.047
sed+0.046
+0.045
hog+0.044
ometer+0.044
Neg logit contributions
 happen-0.049
 belong-0.048
 yourself-0.048
aches-0.046
 know-0.045
Token,
Token ID11
Feature activation-1.37e-02

Pos logit contributions
sed+0.036
hog+0.035
elt+0.034
 Lee+0.034
keeper+0.034
Neg logit contributions
 belong-0.033
 happen-0.032
 yourself-0.031
 know-0.030
aches-0.030
Token with
Token ID351
Feature activation+3.70e-02

Pos logit contributions
+0.018
hog+0.018
sed+0.018
 Pole+0.018
elt+0.017
Neg logit contributions
 belong-0.029
aches-0.028
 happen-0.028
 yourself-0.028
 know-0.027
Token its
Token ID663
Feature activation+8.18e-03

Pos logit contributions
aches+0.068
 achieve+0.064
 belong+0.062
 happen+0.061
irl+0.061
Neg logit contributions
 "-0.067
 girl-0.061
joy-0.061
 �-0.061
 gir-0.058
Token colorful
Token ID20239
Feature activation-1.43e-02

Pos logit contributions
 belong+0.016
aches+0.016
 yourself+0.016
 happen+0.015
irs+0.015
Neg logit contributions
 "-0.012
sed-0.012
hog-0.012
 girl-0.012
joy-0.011
Token surprise
Token ID5975
Feature activation-2.80e-02

Pos logit contributions
elt+0.020
+0.020
 "+0.020
 Pole+0.020
hog+0.020
Neg logit contributions
 belong-0.029
aches-0.028
 happen-0.028
 yourself-0.027
 know-0.026
Token.
Token ID13
Feature activation-2.14e-02

Pos logit contributions
elt+0.048
+0.048
joy+0.047
sed+0.047
 Pole+0.047
Neg logit contributions
 belong-0.050
 happen-0.050
aches-0.047
 know-0.044
 yourself-0.044
Token\n
Token ID198
Feature activation+5.69e-02

Pos logit contributions
sed+0.036
 gir+0.035
 Pole+0.034
hog+0.034
joy+0.034
Neg logit contributions
 know-0.037
 yourself-0.036
 belong-0.036
irs-0.034
 ourselves-0.034
Token\n
Token ID198
Feature activation+3.28e-02

Pos logit contributions
 belong+0.131
aches+0.122
 yourself+0.121
 happen+0.117
 ourselves+0.114
Neg logit contributions
 "-0.094
 �-0.086
-0.074
 Sure-0.073
hog-0.069
TokenThe
Token ID464
Feature activation-1.95e-02

Pos logit contributions
 belong+0.065
aches+0.062
 happen+0.062
 yourself+0.058
 yours+0.057
Neg logit contributions
 "-0.054
-0.053
 �-0.052
hog-0.049
 girl-0.048
Token little
Token ID1310
Feature activation-5.08e-02

Pos logit contributions
elt+0.028
ometer+0.026
sed+0.026
 Lee+0.025
hog+0.025
Neg logit contributions
 belong-0.042
aches-0.040
 happen-0.040
 yourself-0.038
 know-0.038
Token.
Token ID13
Feature activation-5.12e-02

Pos logit contributions
joy+0.044
+0.042
 Pole+0.042
sed+0.042
hog+0.042
Neg logit contributions
 happen-0.046
 belong-0.046
aches-0.046
 ourselves-0.043
 yourself-0.042
Token He
Token ID679
Feature activation-5.99e-03

Pos logit contributions
 calf+0.078
 woman+0.076
keeper+0.075
hog+0.075
sed+0.071
Neg logit contributions
 know-0.098
 ease-0.096
 yourself-0.092
 belong-0.090
 improve-0.087
Token mixed
Token ID7668
Feature activation+2.90e-02

Pos logit contributions
 "+0.010
joy+0.010
hog+0.010
sed+0.010
elt+0.010
Neg logit contributions
aches-0.010
 belong-0.010
 happen-0.010
 yourself-0.010
 know-0.010
Token blue
Token ID4171
Feature activation-9.29e-03

Pos logit contributions
 belong+0.053
aches+0.048
 yourself+0.047
 happen+0.047
 know+0.046
Neg logit contributions
 "-0.053
 girl-0.049
 �-0.049
-0.047
elt-0.047
Token with
Token ID351
Feature activation+5.72e-03

Pos logit contributions
sed+0.011
hog+0.010
joy+0.010
 "+0.010
keeper+0.010
Neg logit contributions
aches-0.021
 belong-0.021
 happen-0.021
 yourself-0.021
 ourselves-0.020
Token yellow
Token ID7872
Feature activation-2.35e-02

Pos logit contributions
 belong+0.011
aches+0.011
 happen+0.011
 yourself+0.011
 know+0.011
Neg logit contributions
 "-0.008
elt-0.008
 girl-0.008
hog-0.008
sed-0.008
Token to
Token ID284
Feature activation+3.70e-02

Pos logit contributions
hog+0.033
sed+0.032
keeper+0.032
 Lee+0.031
joy+0.031
Neg logit contributions
aches-0.048
 belong-0.047
 happen-0.047
 yourself-0.046
 know-0.045
Token make
Token ID787
Feature activation+2.13e-02

Pos logit contributions
 belong+0.054
aches+0.050
irs+0.050
 yourself+0.049
 ourselves+0.048
Neg logit contributions
 "-0.078
 �-0.071
 girl-0.071
 gir-0.070
 boy-0.067
Token green
Token ID4077
Feature activation-3.61e-02

Pos logit contributions
 belong+0.035
aches+0.033
 yourself+0.032
 happen+0.032
irs+0.030
Neg logit contributions
 "-0.041
 girl-0.038
 �-0.038
 gir-0.037
elt-0.037
Token.
Token ID13
Feature activation-3.79e-02

Pos logit contributions
sed+0.056
keeper+0.056
joy+0.056
 clerk+0.056
elt+0.055
Neg logit contributions
 happen-0.069
aches-0.068
 yourself-0.065
 belong-0.065
 ourselves-0.065
Token He
Token ID679
Feature activation-4.30e-03

Pos logit contributions
hog+0.053
keeper+0.052
 calf+0.052
 woman+0.051
sed+0.050
Neg logit contributions
 know-0.077
 ease-0.075
 yourself-0.073
 belong-0.072
 avoid-0.069
Token.
Token ID13
Feature activation-2.74e-02

Pos logit contributions
 belong+0.028
aches+0.027
 happen+0.026
 yourself+0.026
 know+0.025
Neg logit contributions
 "-0.027
 girl-0.026
joy-0.026
hog-0.025
sed-0.025
Token Sam
Token ID3409
Feature activation-2.27e-02

Pos logit contributions
 girl+0.044
 gir+0.044
sed+0.043
keeper+0.043
hog+0.043
Neg logit contributions
 know-0.047
 ease-0.045
 belong-0.045
irs-0.044
 yourself-0.044
Token shouted
Token ID17293
Feature activation-6.67e-02

Pos logit contributions
elt+0.037
joy+0.037
 Pole+0.037
sed+0.036
 "+0.036
Neg logit contributions
aches-0.039
 belong-0.039
 know-0.038
 happen-0.038
 yourself-0.037
Token with
Token ID351
Feature activation+4.24e-02

Pos logit contributions
 Pole+0.113
elt+0.113
hog+0.112
 gir+0.112
 Girl+0.111
Neg logit contributions
 happen-0.112
 yourself-0.108
 know-0.107
 belong-0.105
 them-0.102
Token g
Token ID308
Feature activation-2.02e-03

Pos logit contributions
aches+0.098
 belong+0.086
 achieve+0.085
 happen+0.085
 yourself+0.083
Neg logit contributions
 "-0.061
joy-0.060
 �-0.059
 girl-0.056
 eyes-0.054
Tokenlee
Token ID7197
Feature activation-8.13e-02

Pos logit contributions
 "+0.002
hog+0.002
 girl+0.002
+0.002
keeper+0.002
Neg logit contributions
aches-0.005
 belong-0.005
 happen-0.005
 yourself-0.005
 know-0.004
Token as
Token ID355
Feature activation+2.79e-03

Pos logit contributions
Really+0.146
elt+0.138
joy+0.136
 Pole+0.134
 girl+0.133
Neg logit contributions
 belong-0.130
 happen-0.129
 know-0.121
aches-0.119
 ourselves-0.118
Token he
Token ID339
Feature activation-6.51e-03

Pos logit contributions
 belong+0.006
aches+0.006
 happen+0.006
 yourself+0.006
 know+0.006
Neg logit contributions
 "-0.003
hog-0.003
sed-0.003
elt-0.003
 girl-0.003
Token found
Token ID1043
Feature activation+1.23e-02

Pos logit contributions
 "+0.009
elt+0.009
hog+0.009
 girl+0.009
+0.009
Neg logit contributions
 belong-0.013
aches-0.013
 happen-0.012
 yourself-0.012
 know-0.012
Token his
Token ID465
Feature activation-2.96e-03

Pos logit contributions
 belong+0.019
aches+0.019
 happen+0.018
 yourself+0.018
 know+0.017
Neg logit contributions
 "-0.024
hog-0.023
 girl-0.023
joy-0.022
 �-0.022
Token mom
Token ID1995
Feature activation-2.17e-02

Pos logit contributions
 "+0.004
elt+0.004
+0.004
hog+0.004
joy+0.004
Neg logit contributions
 belong-0.006
aches-0.006
 happen-0.006
 yourself-0.006
 know-0.006
Token\n
Token ID198
Feature activation+9.59e-02

Pos logit contributions
sed+0.022
hog+0.022
joy+0.022
 gir+0.022
 girl+0.021
Neg logit contributions
 belong-0.018
aches-0.018
 know-0.018
 yourself-0.017
 happen-0.017
Token\n
Token ID198
Feature activation+8.08e-03

Pos logit contributions
 belong+0.276
aches+0.271
 yourself+0.270
 happen+0.257
 ourselves+0.255
Neg logit contributions
 "-0.106
 Sure-0.074
 �-0.070
 boy-0.070
 girl-0.062
TokenTim
Token ID14967
Feature activation-1.40e-02

Pos logit contributions
 belong+0.013
aches+0.013
 happen+0.013
 yourself+0.013
 ourselves+0.012
Neg logit contributions
 "-0.015
-0.014
hog-0.014
elt-0.014
 girl-0.014
Token saw
Token ID2497
Feature activation+1.51e-02

Pos logit contributions
joy+0.023
sed+0.022
 "+0.022
keeper+0.022
hog+0.022
Neg logit contributions
 belong-0.025
aches-0.025
 know-0.024
 yourself-0.024
 happen-0.023
Token a
Token ID257
Feature activation-6.59e-02

Pos logit contributions
 belong+0.024
aches+0.024
 happen+0.023
 yourself+0.022
irs+0.022
Neg logit contributions
 "-0.030
elt-0.028
 girl-0.027
 gir-0.027
sed-0.027
Token big
Token ID1263
Feature activation-7.17e-02

Pos logit contributions
ishly+0.153
edly+0.153
elt+0.151
ked+0.149
hog+0.142
Neg logit contributions
 belong-0.079
 happen-0.075
aches-0.073
 know-0.071
 have-0.068
Token,
Token ID11
Feature activation-5.09e-02

Pos logit contributions
edly+0.139
elt+0.133
ked+0.133
keeper+0.130
 Sure+0.127
Neg logit contributions
 belong-0.125
 happen-0.124
 have-0.113
 are-0.112
aches-0.111
Token scary
Token ID14343
Feature activation-7.17e-02

Pos logit contributions
 Lee+0.097
elt+0.096
ishly+0.095
keeper+0.095
sed+0.095
Neg logit contributions
 belong-0.084
 happen-0.078
aches-0.074
 know-0.074
 yourself-0.071
Token bug
Token ID5434
Feature activation-1.94e-02

Pos logit contributions
keeper+0.112
 �+0.104
ishly+0.104
elt+0.104
ked+0.103
Neg logit contributions
 happen-0.144
 belong-0.137
 harm-0.127
 know-0.126
aches-0.125
Token.
Token ID13
Feature activation-7.45e-02

Pos logit contributions
joy+0.038
keeper+0.037
 Pole+0.037
elt+0.037
 Sure+0.036
Neg logit contributions
 belong-0.030
aches-0.030
 happen-0.029
 yourself-0.028
 know-0.026
Token He
Token ID679
Feature activation-2.57e-02

Pos logit contributions
joy+0.145
ista+0.130
pit+0.127
 woman+0.126
icol+0.125
Neg logit contributions
 know-0.126
 belong-0.117
 have-0.115
 happen-0.113
 yourself-0.106
Token chim
Token ID18205
Feature activation+1.58e-02

Pos logit contributions
elt+0.018
 "+0.018
sed+0.018
hog+0.018
 Lee+0.018
Neg logit contributions
 belong-0.024
aches-0.024
 yourself-0.024
 happen-0.023
 know-0.022
Tokenney
Token ID1681
Feature activation+3.80e-02

Pos logit contributions
 belong+0.037
aches+0.036
 happen+0.035
 yourself+0.034
irs+0.034
Neg logit contributions
 "-0.019
hog-0.018
keeper-0.018
-0.018
 girl-0.017
Token,"
Token ID553
Feature activation-2.79e-02

Pos logit contributions
 belong+0.064
aches+0.060
irs+0.060
 yourself+0.058
 happen+0.058
Neg logit contributions
 "-0.069
keeper-0.067
 girl-0.066
hog-0.065
 �-0.064
Token Ben
Token ID3932
Feature activation-4.68e-02

Pos logit contributions
sed+0.044
elt+0.043
hog+0.042
ometer+0.042
 Pole+0.042
Neg logit contributions
 belong-0.052
 know-0.052
aches-0.050
 yourself-0.049
 happen-0.049
Token said
Token ID531
Feature activation-5.72e-02

Pos logit contributions
joy+0.097
 Pole+0.090
sed+0.090
icol+0.087
elt+0.085
Neg logit contributions
 know-0.075
aches-0.068
ck-0.063
 belong-0.062
 happen-0.060
Token,
Token ID11
Feature activation-7.56e-02

Pos logit contributions
 Pole+0.112
 gir+0.106
elt+0.106
Really+0.106
sed+0.106
Neg logit contributions
 know-0.093
 happen-0.092
 belong-0.090
aches-0.087
 stay-0.080
Token pointing
Token ID10609
Feature activation-3.21e-02

Pos logit contributions
ometer+0.130
 Pole+0.125
hog+0.121
umm+0.119
board+0.116
Neg logit contributions
aches-0.131
ck-0.126
 yourself-0.125
 happen-0.124
 enjoy-0.124
Token to
Token ID284
Feature activation+2.72e-02

Pos logit contributions
 Pole+0.057
hog+0.056
sed+0.055
 gir+0.054
elt+0.053
Neg logit contributions
 belong-0.053
 yourself-0.052
aches-0.052
 know-0.052
 happen-0.050
Token the
Token ID262
Feature activation-1.63e-02

Pos logit contributions
aches+0.050
 belong+0.047
 happen+0.044
 yourself+0.043
irl+0.043
Neg logit contributions
 "-0.053
 girl-0.048
 �-0.048
joy-0.047
 gir-0.046
Token old
Token ID1468
Feature activation-2.99e-02

Pos logit contributions
elt+0.023
hog+0.022
 "+0.022
sed+0.022
 Lee+0.022
Neg logit contributions
 belong-0.033
aches-0.033
 yourself-0.032
 happen-0.032
 know-0.031
Token chim
Token ID18205
Feature activation-1.41e-02

Pos logit contributions
elt+0.050
sed+0.049
hog+0.046
+0.046
ked+0.046
Neg logit contributions
aches-0.055
 happen-0.053
 belong-0.053
 yourself-0.053
 know-0.051
Token Her
Token ID2332
Feature activation-2.44e-02

Pos logit contributions
 gir+0.055
hog+0.053
 Pole+0.052
sed+0.051
keeper+0.051
Neg logit contributions
 know-0.048
 belong-0.046
 yourself-0.046
 enjoy-0.045
 ease-0.044
Token mom
Token ID1995
Feature activation-4.25e-02

Pos logit contributions
elt+0.040
 Pole+0.040
hog+0.038
+0.038
 Sure+0.037
Neg logit contributions
 belong-0.044
aches-0.043
 know-0.043
 happen-0.043
irl-0.041
Token smiled
Token ID13541
Feature activation-5.76e-02

Pos logit contributions
 Pole+0.070
sed+0.070
joy+0.068
ometer+0.067
elt+0.064
Neg logit contributions
aches-0.078
 know-0.077
ck-0.077
 belong-0.073
 happen-0.072
Token and
Token ID290
Feature activation-4.43e-02

Pos logit contributions
 Pole+0.126
sed+0.126
hog+0.118
ometer+0.116
 gir+0.114
Neg logit contributions
 know-0.072
 understand-0.070
 love-0.069
 yourself-0.067
 enjoy-0.067
Token said
Token ID531
Feature activation-4.83e-02

Pos logit contributions
joy+0.083
sed+0.082
 Pole+0.082
hog+0.081
ometer+0.080
Neg logit contributions
aches-0.067
 know-0.067
 happen-0.066
 belong-0.066
ck-0.064
Token,
Token ID11
Feature activation-7.41e-02

Pos logit contributions
 Pole+0.083
sed+0.082
hog+0.080
 gir+0.078
elt+0.077
Neg logit contributions
 happen-0.090
 belong-0.085
aches-0.083
 yourself-0.078
 know-0.078
Token "
Token ID366
Feature activation+2.74e-03

Pos logit contributions
hog+0.074
sed+0.071
suit+0.071
joy+0.065
ometer+0.063
Neg logit contributions
 happen-0.174
aches-0.168
 have-0.165
n-0.165
 know-0.164
TokenI
Token ID40
Feature activation+3.78e-02

Pos logit contributions
 belong+0.005
aches+0.005
 happen+0.005
 yourself+0.005
 know+0.004
Neg logit contributions
 "-0.005
hog-0.004
elt-0.004
 girl-0.004
sed-0.004
Token'm
Token ID1101
Feature activation+1.46e-02

Pos logit contributions
aches+0.057
 yourself+0.056
irs+0.055
 ourselves+0.053
cks+0.053
Neg logit contributions
 "-0.081
 girl-0.078
-0.077
 �-0.076
 boy-0.074
Token glad
Token ID9675
Feature activation-4.68e-03

Pos logit contributions
aches+0.031
 belong+0.031
 happen+0.030
 yourself+0.029
irs+0.029
Neg logit contributions
 "-0.021
 �-0.019
joy-0.019
 girl-0.019
elt-0.018
Token you
Token ID345
Feature activation+4.00e-02

Pos logit contributions
sed+0.012
 "+0.012
hog+0.012
elt+0.012
+0.012
Neg logit contributions
 belong-0.004
aches-0.003
 yourself-0.003
 happen-0.003
 ourselves-0.003
Token,
Token ID11
Feature activation-5.64e-03

Pos logit contributions
sed+0.057
joy+0.056
hog+0.055
ometer+0.054
keeper+0.053
Neg logit contributions
 belong-0.078
aches-0.077
 happen-0.076
 know-0.075
 yourself-0.073
Token look
Token ID804
Feature activation-1.88e-02

Pos logit contributions
 "+0.009
sed+0.009
hog+0.009
joy+0.008
 Pole+0.008
Neg logit contributions
 belong-0.011
aches-0.010
 yourself-0.010
 happen-0.010
 know-0.010
Token!
Token ID0
Feature activation-8.19e-02

Pos logit contributions
sed+0.027
hog+0.026
 gir+0.026
 Lee+0.026
elt+0.026
Neg logit contributions
aches-0.037
 belong-0.037
 yourself-0.036
 happen-0.035
 know-0.035
Token Jim
Token ID5395
Feature activation+8.71e-03

Pos logit contributions
sed+0.142
keeper+0.128
hog+0.125
 gir+0.124
ked+0.123
Neg logit contributions
 have-0.136
 yourself-0.133
 know-0.133
irs-0.132
 belong-0.132
Token said
Token ID531
Feature activation-4.63e-02

Pos logit contributions
aches+0.018
 belong+0.018
 happen+0.017
 yourself+0.017
 know+0.016
Neg logit contributions
 "-0.012
-0.012
elt-0.012
hog-0.012
 girl-0.012
Token,
Token ID11
Feature activation-1.10e-01

Pos logit contributions
sed+0.067
 Lee+0.065
elt+0.064
 Girl+0.064
hog+0.063
Neg logit contributions
 belong-0.093
 happen-0.092
aches-0.088
 know-0.087
 yourself-0.085
Token pointing
Token ID10609
Feature activation-2.12e-02

Pos logit contributions
sed+0.122
hog+0.114
umm+0.112
ometer+0.107
oky+0.107
Neg logit contributions
 have-0.230
 happen-0.229
 them-0.226
 know-0.216
 belong-0.212
Token excited
Token ID6568
Feature activation-2.59e-02

Pos logit contributions
sed+0.034
hog+0.034
 Pole+0.033
+0.032
elt+0.032
Neg logit contributions
 belong-0.039
 yourself-0.038
aches-0.038
 know-0.037
 happen-0.036
Tokenly
Token ID306
Feature activation-4.72e-02

Pos logit contributions
sed+0.035
+0.034
hog+0.033
ometer+0.032
 Lee+0.032
Neg logit contributions
 happen-0.056
 belong-0.056
aches-0.053
 yourself-0.053
 know-0.052
Token at
Token ID379
Feature activation+5.38e-02

Pos logit contributions
sed+0.082
hog+0.078
elt+0.078
 Sure+0.075
+0.075
Neg logit contributions
 belong-0.079
aches-0.079
 happen-0.077
 yourself-0.077
 know-0.073
Token something
Token ID1223
Feature activation-1.25e-02

Pos logit contributions
 belong+0.110
aches+0.109
iver+0.104
vern+0.102
 achieve+0.102
Neg logit contributions
 "-0.089
 girl-0.079
 boy-0.077
joy-0.076
 eyes-0.076