TOP ACTIVATIONS
MAX = 11.270

<|endoftext|> <|endoftext|> L ila and Ben are friends . They like
Once upon a time , Jack and his sister Mia were
<|endoftext|> <|endoftext|> <|endoftext|> Anna and Ben are playing with their toys
<|endoftext|> <|endoftext|> L ily and Sam went on a saf ari
Once upon a time , Sally and Jane were best friends
Once upon a time , Elliot the elephant was walking through
Once upon a time , Jack was out for lunch with
." <|endoftext|> <|endoftext|> <|endoftext|> L ily and Sam went on a
. <|endoftext|> One day , Diego was feeling very impatient .
upon a time , there arrived a boy to a stable
. <|endoftext|> One day , Jack went for a walk in
<|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> L ila and Ben are friends .
. <|endoftext|> One day , Paul and his mom my went
. <|endoftext|> One day , John went to the mall with
. <|endoftext|> One day , Tommy was walking in the park
Once upon a time , Jack was playing with his hoop
! <|endoftext|> One day , Sally was in the park with
Once upon a time , Jack and Jill were very excited
. <|endoftext|> One day , Sammy saw something very beautiful in
me adow <|endoftext|> Once upon a time , there was a

MIN ACTIVATIONS
MIN = -12.074

had learned to appreciate the therm ometer . <|endoftext|> John lay
. She had the strongest therm ometer in the town and
They took turns putting the therm ometer in their mouths and
will not touch the ash tr ay again ," Tim says
him how to use the therm ometer and soon John knew
were brave and kept the therm ometer in your mouth !"
checked . She put the therm ometer next to his forehead
and they bought the new therm ometer ! <|endoftext|> Once upon
not played with the ash tr ay . They wished they
was cheaper than the other therm ometer . Bob was happy
should never touch an ash tr ay ." S
ing on how great the hat looked . Lily loved it
wanted to keep the ash tr ay . So he tried
can use this broken ash tr ay as a puzzle !"
Did you destroy my ash tr ay ?" Lily felt bad
, you should have the therm ometer in your mouth .
We did not share the er aser ." The
that he had used the m op in an organized way
his mouth and put the therm ometer in his mouth again
not nice to keep the dish all to yourself ."

INTERVAL 5.434 - 11.270
CONTAINS 4.224%

for you . <|endoftext|> M ia and Leo were playing in
dirt iness . <|endoftext|> Once upon a time , there was
bad ending . <|endoftext|> Once upon a time , there was
to improve ." <|endoftext|> There was a cat . She was
. <|endoftext|> Once upon a time , there was a boy

INTERVAL -0.402 - 5.434
CONTAINS 50.390%

cob we bs ," she said . " We
and Ben were twins who liked to play outside . They
." They take turns to push the red car
Tom and Mia are happy . They have helped someone
help the bird . They went back to the tree and

INTERVAL -6.238 - -0.402
CONTAINS 44.933%

and that it would always keep her toy safe . <|endoftext|>
my said , " Wow , you look so fancy !
and walked with Lily to the market . He gave the
suit . She grabbed a towel and hopped into the pirate
, she finally could balance the task ! She felt so

INTERVAL -12.074 - -6.238
CONTAINS 0.452%

listen to me next time . The zoo is a fun
for giving him the best cocoa . The End . <|endoftext|>
are many things to see and do and learn inside .
Can I buy this tea ? â . Her
, " It is my block !" " No , it
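
The four INTERVAL ranges above appear to be equal-width slices of the span between MIN (-12.074) and MAX (11.270), each 5.836 wide, and the CONTAINS percentages sum to roughly 100%. The following is a minimal Python sketch of that binning, assuming equal-width intervals; the function names and the sample values are hypothetical and are not taken from the tool that produced this page.

```python
def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_shares(values, edges):
    """Fraction of values falling in each interval.

    Assumes every value lies within [edges[0], edges[-1]];
    a value equal to the top edge is counted in the last interval.
    """
    n_bins = len(edges) - 1
    lo, hi = edges[0], edges[-1]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


# MIN and MAX as reported on this page, split into four intervals.
edges = interval_edges(-12.074, 11.270, 4)
print([round(e, 3) for e in edges])

# Hypothetical activations, for illustration only (not data from this page).
sample = [-7.0, -1.5, 0.3, 2.2, 6.0]
print(bin_shares(sample, edges))
```

Rounded to three decimals, the computed edges match the interval boundaries listed above (-6.238, -0.402, 5.434), which is consistent with, but does not prove, equal-width binning in the original tool.
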
Token<|endoftext|>
Token ID50256
Feature activation+7.19e+00

Pos logit contributions
 colored+0.132
 saw+0.130
 little+0.122
 coloured+0.120
 remembered+0.117
Neg logit contributions
othe-0.113
vere-0.109
rier-0.108
unes-0.105
uts-0.103
Token<|endoftext|>
Token ID50256
Feature activation+7.46e+00

Pos logit contributions
 colored+0.144
 saw+0.141
 little+0.133
 coloured+0.131
 remembered+0.129
Neg logit contributions
othe-0.117
vere-0.113
rier-0.111
unes-0.109
uts-0.106
TokenL
Token ID43
Feature activation+9.35e+00

Pos logit contributions
 colored+0.150
 saw+0.147
 little+0.140
 coloured+0.137
 remembered+0.135
Neg logit contributions
othe-0.119
vere-0.114
rier-0.111
unes-0.111
uts-0.108
Tokenila
Token ID10102
Feature activation+1.03e+01

Pos logit contributions
 colored+0.145
 saw+0.138
ptions+0.136
 coloured+0.133
 remembered+0.133
Neg logit contributions
unes-0.182
umbs-0.173
othe-0.171
uts-0.167
 spine-0.162
Token and
Token ID290
Feature activation+6.86e+00

Pos logit contributions
ptions+0.145
 colored+0.136
 saw+0.121
 coloured+0.120
 amusement+0.118
Neg logit contributions
othe-0.223
 spine-0.214
uts-0.207
ometer-0.205
unes-0.203
Token Ben
Token ID3932
Feature activation+1.13e+01

Pos logit contributions
 colored+0.072
 saw+0.067
 coloured+0.059
 little+0.059
ptions+0.057
Neg logit contributions
othe-0.185
rier-0.174
vere-0.174
 spine-0.172
ometers-0.170
Token are
Token ID389
Feature activation+1.00e+01

Pos logit contributions
 colored+0.139
ptions+0.135
 saw+0.126
 coloured+0.120
 little+0.119
Neg logit contributions
othe-0.263
 spine-0.249
rier-0.245
ometers-0.245
uts-0.244
Token friends
Token ID2460
Feature activation+9.62e+00

Pos logit contributions
 colored+0.125
 saw+0.123
 little+0.107
 coloured+0.106
 heard+0.104
Neg logit contributions
othe-0.248
rier-0.242
vere-0.239
unes-0.236
ometers-0.234
Token.
Token ID13
Feature activation+1.67e+00

Pos logit contributions
 saw+0.133
 little+0.116
 colored+0.113
 heard+0.111
 a+0.108
Neg logit contributions
rier-0.251
unes-0.245
ometers-0.245
ometer-0.244
drops-0.241
Token They
Token ID1119
Feature activation+5.41e+00

Pos logit contributions
 saw+0.035
 colored+0.035
 little+0.033
 heard+0.032
 coloured+0.031
Neg logit contributions
othe-0.028
rier-0.027
vere-0.026
ometers-0.025
 spine-0.025
Token like
Token ID588
Feature activation+3.26e+00

Pos logit contributions
 colored+0.068
 saw+0.067
ptions+0.064
 little+0.061
 coloured+0.058
Neg logit contributions
othe-0.127
rier-0.123
vere-0.120
drops-0.120
 spine-0.119
TokenOnce
Token ID7454
Feature activation+7.19e+00

Pos logit contributions
 colored+0.125
 saw+0.123
 little+0.116
 coloured+0.114
ptions+0.111
Neg logit contributions
othe-0.104
rier-0.099
vere-0.098
unes-0.096
ometer-0.095
Token upon
Token ID2402
Feature activation+7.73e+00

Pos logit contributions
 saw+0.088
 colored+0.088
 little+0.078
 coloured+0.074
 heard+0.074
Neg logit contributions
othe-0.180
rier-0.179
vere-0.173
 spine-0.170
ometers-0.170
Token a
Token ID257
Feature activation+9.41e+00

Pos logit contributions
 colored+0.074
ptions+0.071
 saw+0.069
ere+0.064
 remembered+0.061
Neg logit contributions
othe-0.187
uts-0.187
vere-0.184
 numb-0.180
 bleed-0.179
Token time
Token ID640
Feature activation+6.16e+00

Pos logit contributions
 colored+0.093
 saw+0.087
 coloured+0.076
ptions+0.071
 heard+0.070
Neg logit contributions
othe-0.252
 spine-0.246
vere-0.246
rier-0.244
unes-0.236
Token,
Token ID11
Feature activation+7.51e+00

Pos logit contributions
 colored+0.063
 saw+0.060
 coloured+0.051
 little+0.049
 heard+0.049
Neg logit contributions
othe-0.168
rier-0.164
vere-0.163
 spine-0.159
uts-0.159
Token Jack
Token ID3619
Feature activation+1.09e+01

Pos logit contributions
 colored+0.097
 amusement+0.096
 coloured+0.085
 media+0.085
othy+0.078
Neg logit contributions
rier-0.173
othe-0.168
ometer-0.165
 spine-0.160
balls-0.159
Token and
Token ID290
Feature activation+4.74e+00

Pos logit contributions
ptions+0.196
 amusement+0.166
 media+0.166
 preschool+0.147
grown+0.141
Neg logit contributions
othe-0.173
oes-0.170
 spine-0.164
urch-0.164
umbs-0.162
Token his
Token ID465
Feature activation+7.55e+00

Pos logit contributions
 colored+0.067
 saw+0.065
ptions+0.059
 coloured+0.058
 heard+0.057
Neg logit contributions
othe-0.108
rier-0.105
vere-0.101
uts-0.100
ometer-0.099
Token sister
Token ID6621
Feature activation+7.87e+00

Pos logit contributions
 colored+0.094
 saw+0.090
 coloured+0.082
ptions+0.081
 heard+0.079
Neg logit contributions
othe-0.184
rier-0.183
vere-0.181
ometer-0.180
ometers-0.177
Token Mia
Token ID32189
Feature activation+7.80e+00

Pos logit contributions
ptions+0.120
 colored+0.110
 media+0.109
 coloured+0.108
 amusement+0.093
Neg logit contributions
unes-0.150
othe-0.143
ometers-0.141
drops-0.139
ometer-0.135
Token were
Token ID547
Feature activation+9.35e+00

Pos logit contributions
ptions+0.131
 media+0.114
 colored+0.110
 amusement+0.104
 coloured+0.103
Neg logit contributions
othe-0.138
unes-0.136
drops-0.134
uts-0.129
ometers-0.127
Token<|endoftext|>
Token ID50256
Feature activation+6.85e+00

Pos logit contributions
 saw+0.102
 colored+0.101
 little+0.094
 heard+0.090
 coloured+0.090
Neg logit contributions
othe-0.100
rier-0.099
vere-0.095
ometers-0.092
 spine-0.092
Token<|endoftext|>
Token ID50256
Feature activation+7.49e+00

Pos logit contributions
 colored+0.137
 saw+0.134
 little+0.127
 coloured+0.124
 remembered+0.121
Neg logit contributions
othe-0.116
rier-0.111
vere-0.110
unes-0.106
ometer-0.104
Token<|endoftext|>
Token ID50256
Feature activation+7.76e+00

Pos logit contributions
 colored+0.151
 saw+0.147
 little+0.140
 coloured+0.137
 remembered+0.134
Neg logit contributions
othe-0.123
vere-0.117
rier-0.116
unes-0.113
uts-0.110
TokenAnna
Token ID31160
Feature activation+1.02e+01

Pos logit contributions
 colored+0.157
 saw+0.153
 little+0.145
 coloured+0.143
 remembered+0.141
Neg logit contributions
othe-0.125
vere-0.119
rier-0.117
unes-0.115
uts-0.112
Token and
Token ID290
Feature activation+6.60e+00

Pos logit contributions
ptions+0.170
 colored+0.155
 coloured+0.139
 amusement+0.138
 media+0.135
Neg logit contributions
othe-0.199
 spine-0.191
ometer-0.185
uts-0.182
oes-0.176
Token Ben
Token ID3932
Feature activation+1.09e+01

Pos logit contributions
 colored+0.068
 saw+0.060
 coloured+0.057
ptions+0.056
 little+0.052
Neg logit contributions
othe-0.180
vere-0.167
 spine-0.166
rier-0.164
drops-0.162
Token are
Token ID389
Feature activation+9.52e+00

Pos logit contributions
 colored+0.141
ptions+0.138
 saw+0.127
 coloured+0.123
 little+0.120
Neg logit contributions
othe-0.249
 spine-0.234
rier-0.231
uts-0.230
ometers-0.230
Token playing
Token ID2712
Feature activation+7.33e+00

Pos logit contributions
 colored+0.123
 saw+0.122
 little+0.108
 coloured+0.106
 heard+0.104
Neg logit contributions
othe-0.232
rier-0.228
vere-0.224
unes-0.220
ometers-0.219
Token with
Token ID351
Feature activation+5.24e+00

Pos logit contributions
 colored+0.110
 saw+0.107
 packed+0.101
ptions+0.100
 remembered+0.100
Neg logit contributions
rier-0.152
othe-0.150
vere-0.145
balls-0.142
 spine-0.140
Token their
Token ID511
Feature activation+5.58e+00

Pos logit contributions
 saw+0.073
 colored+0.069
 heard+0.067
ptions+0.067
 remembered+0.065
Neg logit contributions
othe-0.118
 numb-0.109
rier-0.108
ometers-0.106
unes-0.105
Token toys
Token ID14958
Feature activation+2.31e+00

Pos logit contributions
 saw+0.078
ptions+0.076
 colored+0.075
 heard+0.074
 remembered+0.071
Neg logit contributions
rier-0.116
othe-0.116
balls-0.116
ometer-0.113
 spine-0.113
Token<|endoftext|>
Token ID50256
Feature activation+7.11e+00

Pos logit contributions
 saw+0.096
 colored+0.095
 little+0.088
 heard+0.085
 coloured+0.084
Neg logit contributions
othe-0.107
rier-0.106
vere-0.101
ometers-0.099
 spine-0.099
Token<|endoftext|>
Token ID50256
Feature activation+7.92e+00

Pos logit contributions
 colored+0.137
 saw+0.136
 little+0.127
 coloured+0.125
 remembered+0.121
Neg logit contributions
othe-0.124
rier-0.119
vere-0.119
unes-0.115
uts-0.112
TokenL
Token ID43
Feature activation+8.80e+00

Pos logit contributions
 colored+0.158
 saw+0.155
 little+0.146
 coloured+0.144
 remembered+0.142
Neg logit contributions
othe-0.129
vere-0.124
rier-0.122
unes-0.121
uts-0.117
Tokenily
Token ID813
Feature activation+1.04e+01

Pos logit contributions
 colored+0.139
 saw+0.128
 coloured+0.128
ptions+0.125
 remembered+0.122
Neg logit contributions
unes-0.173
umbs-0.166
othe-0.164
uts-0.161
 spine-0.152
Token and
Token ID290
Feature activation+6.27e+00

Pos logit contributions
ptions+0.183
 colored+0.155
 amusement+0.148
 coloured+0.141
 media+0.140
Neg logit contributions
othe-0.188
 spine-0.184
ometer-0.175
oes-0.171
uts-0.171
Token Sam
Token ID3409
Feature activation+1.08e+01

Pos logit contributions
 colored+0.064
 saw+0.058
ptions+0.053
 coloured+0.053
 little+0.051
Neg logit contributions
othe-0.172
vere-0.159
rier-0.158
 spine-0.157
ometers-0.156
Token went
Token ID1816
Feature activation+9.32e+00

Pos logit contributions
ptions+0.143
 colored+0.134
 little+0.120
 saw+0.118
 coloured+0.117
Neg logit contributions
othe-0.240
unes-0.222
 spine-0.221
ometers-0.218
drops-0.218
Token on
Token ID319
Feature activation+7.85e+00

Pos logit contributions
 saw+0.125
 colored+0.120
 little+0.114
 found+0.105
 heard+0.103
Neg logit contributions
rier-0.237
othe-0.226
vere-0.222
ometers-0.217
drops-0.215
Token a
Token ID257
Feature activation+7.31e+00

Pos logit contributions
ptions+0.081
 saw+0.075
 colored+0.074
 remembered+0.073
 noticed+0.066
Neg logit contributions
othe-0.196
vere-0.180
 spine-0.179
uts-0.176
ometers-0.175
Token saf
Token ID1932
Feature activation+1.92e+00

Pos logit contributions
 colored+0.106
ptions+0.098
 coloured+0.094
 saw+0.092
 remembered+0.090
Neg logit contributions
othe-0.152
 spine-0.148
ometer-0.145
vere-0.141
 numb-0.137
Tokenari
Token ID2743
Feature activation+6.86e+00

Pos logit contributions
 saw+0.027
 colored+0.024
 little+0.024
 heard+0.021
 found+0.021
Neg logit contributions
rier-0.047
othe-0.047
drops-0.044
 bleed-0.044
vere-0.043
TokenOnce
Token ID7454
Feature activation+6.81e+00

Pos logit contributions
 colored+0.151
 saw+0.148
 little+0.139
 coloured+0.137
 heard+0.133
Neg logit contributions
othe-0.121
rier-0.116
vere-0.115
ometer-0.110
uts-0.110
Token upon
Token ID2402
Feature activation+7.19e+00

Pos logit contributions
 saw+0.083
 colored+0.082
 little+0.074
 heard+0.069
 coloured+0.069
Neg logit contributions
othe-0.171
rier-0.170
vere-0.165
 spine-0.162
ometers-0.162
Token a
Token ID257
Feature activation+9.18e+00

Pos logit contributions
 colored+0.063
 saw+0.054
ptions+0.054
ere+0.051
 coloured+0.050
Neg logit contributions
othe-0.191
vere-0.186
uts-0.185
 numb-0.179
 bleed-0.178
Token time
Token ID640
Feature activation+5.63e+00

Pos logit contributions
 colored+0.105
 saw+0.094
ptions+0.091
 coloured+0.088
 remembered+0.081
Neg logit contributions
othe-0.231
 spine-0.224
vere-0.221
ometer-0.211
uts-0.209
Token,
Token ID11
Feature activation+6.59e+00

Pos logit contributions
 colored+0.062
 saw+0.057
 coloured+0.051
 little+0.049
ptions+0.048
Neg logit contributions
othe-0.150
vere-0.145
rier-0.142
uts-0.141
 spine-0.140
Token Sally
Token ID25737
Feature activation+1.07e+01

Pos logit contributions
 amusement+0.081
ptions+0.080
 colored+0.079
 media+0.076
othy+0.071
Neg logit contributions
othe-0.159
vere-0.139
ometer-0.138
rier-0.135
 spine-0.133
Token and
Token ID290
Feature activation+3.79e+00

Pos logit contributions
ptions+0.169
 amusement+0.148
 media+0.136
 preschool+0.121
 colored+0.118
Neg logit contributions
othe-0.206
 bleed-0.188
oes-0.186
 spine-0.183
ometer-0.179
Token Jane
Token ID12091
Feature activation+9.11e+00

Pos logit contributions
 colored+0.048
 saw+0.046
ptions+0.045
 coloured+0.042
 heard+0.041
Neg logit contributions
othe-0.092
vere-0.085
uts-0.084
ometers-0.083
rier-0.083
Token were
Token ID547
Feature activation+9.35e+00

Pos logit contributions
ptions+0.143
 media+0.138
 colored+0.133
 little+0.133
 amusement+0.126
Neg logit contributions
othe-0.173
unes-0.158
drops-0.157
 bleed-0.153
ometers-0.150
Token best
Token ID1266
Feature activation+6.06e+00

Pos logit contributions
 colored+0.157
 saw+0.151
 coloured+0.144
 heard+0.136
ptions+0.132
Neg logit contributions
othe-0.187
rier-0.184
vere-0.183
unes-0.180
ometers-0.172
Token friends
Token ID2460
Feature activation+8.15e+00

Pos logit contributions
 saw+0.056
 little+0.052
 colored+0.051
 heard+0.043
 found+0.041
Neg logit contributions
rier-0.175
othe-0.171
 spine-0.167
ometers-0.165
ometer-0.164
TokenOnce
Token ID7454
Feature activation+6.34e+00

Pos logit contributions
 colored+0.097
 saw+0.096
 little+0.090
 coloured+0.088
 heard+0.086
Neg logit contributions
othe-0.091
rier-0.088
vere-0.086
unes-0.084
ometer-0.084
Token upon
Token ID2402
Feature activation+6.61e+00

Pos logit contributions
 saw+0.078
 colored+0.077
 little+0.069
 heard+0.065
 coloured+0.065
Neg logit contributions
othe-0.159
rier-0.158
vere-0.153
 spine-0.150
ometers-0.150
Token a
Token ID257
Feature activation+8.59e+00

Pos logit contributions
 colored+0.064
 saw+0.057
ptions+0.057
ere+0.056
 remembered+0.052
Neg logit contributions
vere-0.165
othe-0.165
uts-0.160
rier-0.157
 numb-0.157
Token time
Token ID640
Feature activation+5.26e+00

Pos logit contributions
 colored+0.100
 saw+0.090
 coloured+0.084
ptions+0.083
 picked+0.075
Neg logit contributions
othe-0.211
 spine-0.208
vere-0.205
rier-0.197
ometer-0.194
Token,
Token ID11
Feature activation+6.62e+00

Pos logit contributions
 saw+0.045
 colored+0.045
 little+0.038
 heard+0.035
 coloured+0.035
Neg logit contributions
othe-0.151
rier-0.150
vere-0.146
 spine-0.144
ometers-0.144
Token Elliot
Token ID30451
Feature activation+1.05e+01

Pos logit contributions
 colored+0.076
 amusement+0.066
 coloured+0.063
ptions+0.061
 saw+0.061
Neg logit contributions
othe-0.166
rier-0.165
ometer-0.159
vere-0.154
 spine-0.152
Token the
Token ID262
Feature activation+3.53e+00

Pos logit contributions
ptions+0.147
 amusement+0.140
 media+0.131
 seeker+0.114
 preschool+0.112
Neg logit contributions
lest-0.168
s-0.167
urch-0.166
othe-0.164
ars-0.162
Token elephant
Token ID20950
Feature activation+5.03e+00

Pos logit contributions
 saw+0.057
 colored+0.056
 little+0.051
 heard+0.050
 coloured+0.050
Neg logit contributions
othe-0.074
rier-0.074
vere-0.071
 spine-0.070
ometers-0.070
Token was
Token ID373
Feature activation+7.51e+00

Pos logit contributions
ptions+0.067
 colored+0.052
 amusement+0.050
 little+0.048
 media+0.048
Neg logit contributions
othe-0.109
uts-0.105
ometer-0.104
unes-0.103
ometers-0.101
Token walking
Token ID6155
Feature activation+6.32e+00

Pos logit contributions
 colored+0.110
 saw+0.109
 little+0.097
 coloured+0.096
 heard+0.095
Neg logit contributions
rier-0.168
othe-0.168
vere-0.162
unes-0.158
ometers-0.158
Token through
Token ID832
Feature activation+6.75e+00

Pos logit contributions
 saw+0.104
 colored+0.104
 little+0.095
 heard+0.092
 coloured+0.091
Neg logit contributions
othe-0.130
rier-0.129
vere-0.122
 spine-0.122
ometer-0.121
TokenOnce
Token ID7454
Feature activation+6.17e+00

Pos logit contributions
 colored+0.110
 saw+0.109
 little+0.102
 coloured+0.099
 heard+0.097
Neg logit contributions
othe-0.103
rier-0.101
vere-0.099
unes-0.096
ometers-0.095
Token upon
Token ID2402
Feature activation+7.16e+00

Pos logit contributions
 saw+0.075
 colored+0.073
 little+0.067
 heard+0.062
 coloured+0.061
Neg logit contributions
othe-0.156
rier-0.155
vere-0.150
ometers-0.148
 spine-0.148
Token a
Token ID257
Feature activation+9.41e+00

Pos logit contributions
 colored+0.062
 saw+0.054
ptions+0.054
ere+0.051
 coloured+0.049
Neg logit contributions
othe-0.191
vere-0.188
uts-0.183
rier-0.177
 numb-0.177
Token time
Token ID640
Feature activation+5.90e+00

Pos logit contributions
 colored+0.104
 saw+0.095
ptions+0.087
 coloured+0.086
 remembered+0.079
Neg logit contributions
othe-0.242
 spine-0.234
vere-0.233
rier-0.223
ometer-0.222
Token,
Token ID11
Feature activation+6.88e+00

Pos logit contributions
 colored+0.059
 saw+0.056
 coloured+0.048
 little+0.047
 heard+0.045
Neg logit contributions
othe-0.163
rier-0.157
vere-0.157
 spine-0.153
unes-0.153
Token Jack
Token ID3619
Feature activation+1.04e+01

Pos logit contributions
 colored+0.078
ptions+0.071
 amusement+0.068
 media+0.066
 coloured+0.066
Neg logit contributions
othe-0.176
vere-0.159
ometer-0.157
rier-0.156
 spine-0.153
Token was
Token ID373
Feature activation+9.59e+00

Pos logit contributions
ptions+0.173
 media+0.134
 amusement+0.133
 young+0.121
 little+0.118
Neg logit contributions
othe-0.194
oes-0.181
 falls-0.167
 spine-0.166
urch-0.163
Token out
Token ID503
Feature activation+6.00e+00

Pos logit contributions
 colored+0.083
 saw+0.077
ptions+0.069
 coloured+0.069
 heard+0.065
Neg logit contributions
othe-0.264
vere-0.250
 spine-0.248
rier-0.247
unes-0.246
Token for
Token ID329
Feature activation+5.46e+00

Pos logit contributions
 colored+0.085
 saw+0.084
ptions+0.076
 did+0.076
 heard+0.076
Neg logit contributions
othe-0.137
rier-0.129
 spine-0.126
vere-0.121
ometer-0.119
Token lunch
Token ID9965
Feature activation+3.10e+00

Pos logit contributions
 saw+0.064
 remembered+0.056
ptions+0.054
iet+0.054
 noticed+0.053
Neg logit contributions
othe-0.123
 numb-0.123
unes-0.120
balls-0.115
 spine-0.114
Token with
Token ID351
Feature activation+3.85e+00

Pos logit contributions
 colored+0.056
 saw+0.056
ptions+0.051
 remembered+0.050
 coloured+0.050
Neg logit contributions
othe-0.057
rier-0.056
vere-0.055
uts-0.054
 spine-0.053
Token."
Token ID526
Feature activation-3.08e+00

Pos logit contributions
othe+0.145
rier+0.142
 spine+0.142
vere+0.141
ometer+0.137
Neg logit contributions
 saw-0.085
 colored-0.082
 little-0.077
 found-0.074
 heard-0.072
Token<|endoftext|>
Token ID50256
Feature activation+5.40e+00

Pos logit contributions
rier+0.094
 spine+0.087
drops+0.087
ometers+0.087
vere+0.085
Neg logit contributions
 saw-0.036
 little-0.034
 a-0.032
 two-0.029
 found-0.026
Token<|endoftext|>
Token ID50256
Feature activation+7.11e+00

Pos logit contributions
 saw+0.096
 colored+0.095
 little+0.088
 heard+0.085
 coloured+0.084
Neg logit contributions
othe-0.107
rier-0.106
vere-0.101
ometers-0.099
 spine-0.099
Token<|endoftext|>
Token ID50256
Feature activation+7.92e+00

Pos logit contributions
 colored+0.137
 saw+0.136
 little+0.127
 coloured+0.125
 remembered+0.121
Neg logit contributions
othe-0.124
rier-0.119
vere-0.119
unes-0.115
uts-0.112
TokenL
Token ID43
Feature activation+8.80e+00

Pos logit contributions
 colored+0.158
 saw+0.155
 little+0.146
 coloured+0.144
 remembered+0.142
Neg logit contributions
othe-0.129
vere-0.124
rier-0.122
unes-0.121
uts-0.117
Tokenily
Token ID813
Feature activation+1.04e+01

Pos logit contributions
 colored+0.139
 saw+0.128
 coloured+0.128
ptions+0.125
 remembered+0.122
Neg logit contributions
unes-0.173
umbs-0.166
othe-0.164
uts-0.161
 spine-0.152
Token and
Token ID290
Feature activation+6.27e+00

Pos logit contributions
ptions+0.183
 colored+0.155
 amusement+0.148
 coloured+0.141
 media+0.140
Neg logit contributions
othe-0.188
 spine-0.184
ometer-0.175
oes-0.171
uts-0.171
Token Sam
Token ID3409
Feature activation+1.08e+01

Pos logit contributions
 colored+0.064
 saw+0.058
ptions+0.053
 coloured+0.053
 little+0.051
Neg logit contributions
othe-0.172
vere-0.159
rier-0.158
 spine-0.157
ometers-0.156
Token went
Token ID1816
Feature activation+9.32e+00

Pos logit contributions
ptions+0.143
 colored+0.134
 little+0.120
 saw+0.118
 coloured+0.117
Neg logit contributions
othe-0.240
unes-0.222
 spine-0.221
ometers-0.218
drops-0.218
Token on
Token ID319
Feature activation+7.85e+00

Pos logit contributions
 saw+0.125
 colored+0.120
 little+0.114
 found+0.105
 heard+0.103
Neg logit contributions
rier-0.237
othe-0.226
vere-0.222
ometers-0.217
drops-0.215
Token a
Token ID257
Feature activation+7.31e+00

Pos logit contributions
ptions+0.081
 saw+0.075
 colored+0.074
 remembered+0.073
 noticed+0.066
Neg logit contributions
othe-0.196
vere-0.180
 spine-0.179
uts-0.176
ometers-0.175
Token.
Token ID13
Feature activation-4.61e+00

Pos logit contributions
 saw+0.003
 colored+0.002
 little+0.002
 heard+0.002
 coloured+0.002
Neg logit contributions
othe-0.005
rier-0.005
vere-0.004
ometers-0.004
uts-0.004
Token<|endoftext|>
Token ID50256
Feature activation+5.36e+00

Pos logit contributions
rier+0.121
 numb+0.115
 spine+0.113
vere+0.113
drops+0.110
Neg logit contributions
 saw-0.066
 little-0.065
 two-0.058
 a-0.057
 found-0.056
TokenOne
Token ID3198
Feature activation+4.67e+00

Pos logit contributions
 saw+0.101
 colored+0.101
 little+0.094
 coloured+0.090
 heard+0.090
Neg logit contributions
othe-0.098
rier-0.097
vere-0.093
ometers-0.091
 spine-0.091
Token day
Token ID1110
Feature activation+5.13e+00

Pos logit contributions
 colored+0.090
 saw+0.089
ptions+0.088
 coloured+0.081
 heard+0.081
Neg logit contributions
othe-0.080
 spine-0.072
vere-0.071
ometer-0.071
rier-0.070
Token,
Token ID11
Feature activation+5.51e+00

Pos logit contributions
 colored+0.055
 saw+0.050
ptions+0.047
 coloured+0.046
 little+0.043
Neg logit contributions
othe-0.137
vere-0.130
rier-0.129
 spine-0.126
uts-0.126
Token Diego
Token ID9500
Feature activation+1.03e+01

Pos logit contributions
 colored+0.048
ptions+0.046
 amusement+0.041
 packed+0.038
 media+0.037
Neg logit contributions
othe-0.152
vere-0.136
 spine-0.136
ometer-0.135
balls-0.135
Token was
Token ID373
Feature activation+7.96e+00

Pos logit contributions
ptions+0.137
 amusement+0.119
 media+0.114
 preschool+0.099
 colored+0.095
Neg logit contributions
oes-0.188
line-0.185
urch-0.179
othe-0.178
 He-0.178
Token feeling
Token ID4203
Feature activation+3.86e+00

Pos logit contributions
 colored+0.099
 saw+0.092
 coloured+0.085
ptions+0.081
 heard+0.079
Neg logit contributions
othe-0.193
rier-0.191
vere-0.188
 spine-0.184
ometers-0.180
Token very
Token ID845
Feature activation+2.05e+00

Pos logit contributions
 saw+0.066
 colored+0.065
 little+0.060
 heard+0.058
 coloured+0.058
Neg logit contributions
othe-0.079
rier-0.078
vere-0.075
ometers-0.073
 spine-0.073
Token impatient
Token ID33440
Feature activation+3.19e+00

Pos logit contributions
 saw+0.036
 colored+0.034
 little+0.034
 found+0.031
 heard+0.031
Neg logit contributions
rier-0.042
othe-0.042
vere-0.040
ometers-0.040
unes-0.040
Token.
Token ID13
Feature activation-1.87e+00

Pos logit contributions
 saw+0.054
 colored+0.054
 little+0.050
 heard+0.048
 coloured+0.047
Neg logit contributions
othe-0.065
rier-0.065
vere-0.062
ometers-0.061
 spine-0.061
Token upon
Token ID2402
Feature activation+6.53e+00

Pos logit contributions
 saw+0.073
 colored+0.068
 little+0.065
 heard+0.059
 found+0.058
Neg logit contributions
rier-0.166
othe-0.164
vere-0.160
ometers-0.157
drops-0.157
Token a
Token ID257
Feature activation+8.82e+00

Pos logit contributions
 colored+0.063
 saw+0.057
ptions+0.056
ere+0.053
 coloured+0.050
Neg logit contributions
othe-0.166
vere-0.164
uts-0.159
 numb-0.156
 bleed-0.155
Token time
Token ID640
Feature activation+5.39e+00

Pos logit contributions
 colored+0.101
 saw+0.090
 coloured+0.083
ptions+0.083
 remembered+0.074
Neg logit contributions
othe-0.220
 spine-0.216
vere-0.213
rier-0.204
ometer-0.201
Token,
Token ID11
Feature activation+6.50e+00

Pos logit contributions
 colored+0.049
 saw+0.048
 little+0.040
 coloured+0.039
 heard+0.038
Neg logit contributions
othe-0.153
rier-0.150
vere-0.148
 spine-0.145
ometers-0.144
Token there
Token ID612
Feature activation+6.58e+00

Pos logit contributions
 colored+0.073
 amusement+0.064
ptions+0.062
 coloured+0.061
 saw+0.059
Neg logit contributions
othe-0.164
rier-0.157
ometer-0.153
 spine-0.150
vere-0.150
Token arrived
Token ID5284
Feature activation+1.03e+01

Pos logit contributions
ptions+0.093
 colored+0.087
 coloured+0.078
 packed+0.078
iet+0.074
Neg logit contributions
othe-0.147
ometer-0.129
uts-0.127
vere-0.124
ometers-0.122
Token a
Token ID257
Feature activation+5.83e+00

Pos logit contributions
 colored+0.190
ptions+0.186
 coloured+0.171
 saw+0.168
 heard+0.164
Neg logit contributions
othe-0.190
rier-0.177
ometer-0.177
unes-0.172
vere-0.172
Token boy
Token ID2933
Feature activation+5.93e+00

Pos logit contributions
 colored+0.049
 saw+0.039
 coloured+0.038
ptions+0.037
 heard+0.034
Neg logit contributions
rier-0.168
othe-0.164
ometer-0.161
vere-0.160
 spine-0.154
Token to
Token ID284
Feature activation+4.27e+00

Pos logit contributions
 colored+0.050
ptions+0.049
 little+0.043
 saw+0.042
 coloured+0.040
Neg logit contributions
othe-0.161
 spine-0.148
ometer-0.148
vere-0.146
uts-0.145
Token a
Token ID257
Feature activation+4.45e+00

Pos logit contributions
ptions+0.074
 colored+0.073
iet+0.071
 media+0.068
 coloured+0.068
Neg logit contributions
othe-0.073
 heal-0.063
vere-0.062
 Ple-0.061
 Dragons-0.061
Token stable
Token ID8245
Feature activation+7.81e-01

Pos logit contributions
 colored+0.051
 coloured+0.043
ptions+0.041
 saw+0.041
 heard+0.038
Neg logit contributions
rier-0.114
othe-0.111
ometer-0.111
vere-0.108
 spine-0.106
Token.
Token ID13
Feature activation-1.80e+00

Pos logit contributions
 colored+0.047
ptions+0.042
 coloured+0.039
 little+0.038
 saw+0.038
Neg logit contributions
othe-0.108
rier-0.100
vere-0.100
ometer-0.099
unes-0.098
Token<|endoftext|>
Token ID50256
Feature activation+6.21e+00

Pos logit contributions
rier+0.035
othe+0.034
vere+0.033
 spine+0.032
ometers+0.031
Neg logit contributions
 saw-0.035
 little-0.033
 colored-0.032
 heard-0.030
 found-0.030
TokenOne
Token ID3198
Feature activation+5.46e+00

Pos logit contributions
 colored+0.125
 saw+0.123
 little+0.116
 coloured+0.114
ptions+0.111
Neg logit contributions
othe-0.104
rier-0.099
vere-0.098
unes-0.096
ometer-0.095
Token day
Token ID1110
Feature activation+5.62e+00

Pos logit contributions
 saw+0.100
 colored+0.100
ptions+0.095
 little+0.090
 heard+0.090
Neg logit contributions
othe-0.100
 spine-0.092
vere-0.092
rier-0.091
unes-0.089
Token,
Token ID11
Feature activation+6.45e+00

Pos logit contributions
 colored+0.055
 saw+0.053
 coloured+0.044
 little+0.044
ptions+0.043
Neg logit contributions
othe-0.155
rier-0.151
vere-0.149
 spine-0.147
unes-0.146
Token Jack
Token ID3619
Feature activation+1.03e+01

Pos logit contributions
 colored+0.071
ptions+0.066
 amusement+0.064
 media+0.064
 taught+0.059
Neg logit contributions
othe-0.155
ometer-0.151
balls-0.150
rier-0.150
ometers-0.143
Token went
Token ID1816
Feature activation+8.12e+00

Pos logit contributions
ptions+0.182
 amusement+0.140
 media+0.136
 preschool+0.127
 colored+0.120
Neg logit contributions
othe-0.180
 spine-0.166
uts-0.165
oes-0.162
 bal-0.160
Token for
Token ID329
Feature activation+5.46e+00

Pos logit contributions
 saw+0.122
 colored+0.119
 little+0.111
 heard+0.105
 found+0.103
Neg logit contributions
rier-0.184
othe-0.183
vere-0.176
ometers-0.173
 spine-0.172
Token a
Token ID257
Feature activation+4.95e+00

Pos logit contributions
 saw+0.064
 remembered+0.058
 colored+0.058
 noticed+0.057
ere+0.054
Neg logit contributions
 numb-0.122
othe-0.119
 bleed-0.116
vere-0.114
balls-0.113
Token walk
Token ID2513
Feature activation+5.96e+00

Pos logit contributions
 colored+0.074
 saw+0.069
 coloured+0.064
ptions+0.061
 heard+0.061
Neg logit contributions
othe-0.110
 spine-0.105
vere-0.105
rier-0.104
ometer-0.104
Token in
Token ID287
Feature activation+4.49e+00

Pos logit contributions
 saw+0.081
 colored+0.079
 little+0.072
 heard+0.069
 coloured+0.068
Neg logit contributions
othe-0.144
rier-0.144
vere-0.139
ometers-0.137
unes-0.136
Token<|endoftext|>
Token ID50256
Feature activation+5.86e+00

Pos logit contributions
 saw+0.076
 colored+0.074
 little+0.070
 heard+0.067
 coloured+0.066
Neg logit contributions
othe-0.085
rier-0.085
vere-0.081
 spine-0.080
ometers-0.080
Token<|endoftext|>
Token ID50256
Feature activation+6.70e+00

Pos logit contributions
 colored+0.112
 saw+0.111
 little+0.104
 coloured+0.101
 heard+0.099
Neg logit contributions
othe-0.105
rier-0.102
vere-0.100
unes-0.097
ometers-0.096
Token<|endoftext|>
Token ID50256
Feature activation+7.19e+00

Pos logit contributions
 colored+0.132
 saw+0.130
 little+0.122
 coloured+0.120
 remembered+0.117
Neg logit contributions
othe-0.113
vere-0.109
rier-0.108
unes-0.105
uts-0.103
Token<|endoftext|>
Token ID50256
Feature activation+7.46e+00

Pos logit contributions
 colored+0.144
 saw+0.141
 little+0.133
 coloured+0.131
 remembered+0.129
Neg logit contributions
othe-0.117
vere-0.113
rier-0.111
unes-0.109
uts-0.106
TokenL
Token ID43
Feature activation+9.35e+00

Pos logit contributions
 colored+0.150
 saw+0.147
 little+0.140
 coloured+0.137
 remembered+0.135
Neg logit contributions
othe-0.119
vere-0.114
rier-0.111
unes-0.111
uts-0.108
Tokenila
Token ID10102
Feature activation+1.03e+01

Pos logit contributions
 colored+0.145
 saw+0.138
ptions+0.136
 coloured+0.133
 remembered+0.133
Neg logit contributions
unes-0.182
umbs-0.173
othe-0.171
uts-0.167
 spine-0.162
Token and
Token ID290
Feature activation+6.86e+00

Pos logit contributions
ptions+0.145
 colored+0.136
 saw+0.121
 coloured+0.120
 amusement+0.118
Neg logit contributions
othe-0.223
 spine-0.214
uts-0.207
ometer-0.205
unes-0.203
Token Ben
Token ID3932
Feature activation+1.13e+01

Pos logit contributions
 colored+0.072
 saw+0.067
 coloured+0.059
 little+0.059
ptions+0.057
Neg logit contributions
othe-0.185
rier-0.174
vere-0.174
 spine-0.172
ometers-0.170
Token are
Token ID389
Feature activation+1.00e+01

Pos logit contributions
 colored+0.139
ptions+0.135
 saw+0.126
 coloured+0.120
 little+0.119
Neg logit contributions
othe-0.263
 spine-0.249
rier-0.245
ometers-0.245
uts-0.244
Token friends
Token ID2460
Feature activation+9.62e+00

Pos logit contributions
 colored+0.125
 saw+0.123
 little+0.107
 coloured+0.106
 heard+0.104
Neg logit contributions
othe-0.248
rier-0.242
vere-0.239
unes-0.236
ometers-0.234
Token.
Token ID13
Feature activation+1.67e+00

Pos logit contributions
 saw+0.133
 little+0.116
 colored+0.113
 heard+0.111
 a+0.108
Neg logit contributions
rier-0.251
unes-0.245
ometers-0.245
ometer-0.244
drops-0.241
Token.
Token ID13
Feature activation-5.18e+00

Pos logit contributions
othe+0.103
rier+0.102
vere+0.099
ometers+0.097
 spine+0.096
Neg logit contributions
 saw-0.064
 colored-0.063
 little-0.058
 heard-0.055
 coloured-0.054
Token<|endoftext|>
Token ID50256
Feature activation+6.17e+00

Pos logit contributions
rier+0.127
vere+0.117
othe+0.115
 spine+0.112
drops+0.111
Neg logit contributions
 saw-0.079
 little-0.073
 colored-0.071
 heard-0.067
 a-0.066
TokenOne
Token ID3198
Feature activation+5.49e+00

Pos logit contributions
 colored+0.121
 saw+0.118
 little+0.111
 coloured+0.109
 heard+0.106
Neg logit contributions
othe-0.107
rier-0.104
vere-0.102
unes-0.099
ometer-0.099
Token day
Token ID1110
Feature activation+5.54e+00

Pos logit contributions
ptions+0.111
 colored+0.110
 saw+0.106
 coloured+0.099
 media+0.099
Neg logit contributions
othe-0.090
 spine-0.081
vere-0.078
ometer-0.078
ometers-0.076
Token,
Token ID11
Feature activation+6.36e+00

Pos logit contributions
 colored+0.066
ptions+0.060
 saw+0.058
 coloured+0.055
 little+0.050
Neg logit contributions
othe-0.142
vere-0.132
rier-0.131
uts-0.131
 spine-0.130
Token Paul
Token ID3362
Feature activation+1.03e+01

Pos logit contributions
 colored+0.073
ptions+0.068
 coloured+0.059
 media+0.057
 amusement+0.057
Neg logit contributions
othe-0.163
balls-0.147
ometer-0.146
vere-0.144
 spine-0.144
Token and
Token ID290
Feature activation+2.86e+00

Pos logit contributions
ptions+0.188
 media+0.142
 amusement+0.135
 setup+0.134
 preschool+0.131
Neg logit contributions
othe-0.155
uts-0.151
 He-0.150
ometer-0.147
 spine-0.144
Token his
Token ID465
Feature activation+6.88e+00

Pos logit contributions
 colored+0.048
ptions+0.046
 saw+0.045
 coloured+0.043
 heard+0.041
Neg logit contributions
othe-0.057
vere-0.053
rier-0.053
 spine-0.051
uts-0.051
Token mom
Token ID1995
Feature activation+6.27e+00

Pos logit contributions
 saw+0.088
 colored+0.088
 little+0.075
 coloured+0.075
 heard+0.074
Neg logit contributions
othe-0.168
rier-0.167
vere-0.163
 spine-0.160
ometers-0.160
Tokenmy
Token ID1820
Feature activation+7.97e+00

Pos logit contributions
 media+0.125
ptions+0.124
ption+0.100
 colored+0.099
iet+0.098
Neg logit contributions
unes-0.088
drops-0.081
oes-0.080
whe-0.079
othe-0.077
Token went
Token ID1816
Feature activation+7.50e+00

Pos logit contributions
ptions+0.143
 media+0.132
 colored+0.123
 little+0.117
 coloured+0.116
Neg logit contributions
othe-0.123
drops-0.121
unes-0.120
ometers-0.114
 bal-0.112
Token.
Token ID13
Feature activation-2.20e+00

Pos logit contributions
othe+0.001
rier+0.001
vere+0.001
ometers+0.001
ometer+0.001
Neg logit contributions
 saw-0.000
 colored-0.000
 little-0.000
 heard-0.000
 coloured-0.000
Token<|endoftext|>
Token ID50256
Feature activation+5.88e+00

Pos logit contributions
rier+0.041
othe+0.040
vere+0.039
 spine+0.038
ometers+0.037
Neg logit contributions
 saw-0.043
 colored-0.041
 little-0.040
 heard-0.038
 found-0.038
TokenOne
Token ID3198
Feature activation+5.48e+00

Pos logit contributions
 colored+0.115
 saw+0.114
 little+0.107
 coloured+0.104
 heard+0.102
Neg logit contributions
othe-0.102
rier-0.098
vere-0.096
unes-0.094
ometer-0.094
Token day
Token ID1110
Feature activation+5.66e+00

Pos logit contributions
 saw+0.101
 colored+0.101
ptions+0.093
 little+0.092
 heard+0.090
Neg logit contributions
othe-0.101
rier-0.094
 spine-0.094
vere-0.093
unes-0.091
Token,
Token ID11
Feature activation+6.65e+00

Pos logit contributions
 colored+0.057
 saw+0.054
 coloured+0.047
ptions+0.046
 little+0.045
Neg logit contributions
othe-0.154
rier-0.150
vere-0.148
 spine-0.146
uts-0.145
Token John
Token ID1757
Feature activation+1.03e+01

Pos logit contributions
 colored+0.073
ptions+0.066
 amusement+0.065
 media+0.064
 taught+0.060
Neg logit contributions
othe-0.162
rier-0.158
ometer-0.156
balls-0.154
 spine-0.148
Token went
Token ID1816
Feature activation+8.14e+00

Pos logit contributions
ptions+0.169
 media+0.136
 amusement+0.133
 preschool+0.119
 little+0.114
Neg logit contributions
othe-0.174
oes-0.172
 spine-0.168
uts-0.167
ometer-0.167
Token to
Token ID284
Feature activation+5.73e+00

Pos logit contributions
 saw+0.121
 colored+0.118
 little+0.110
 heard+0.104
 found+0.102
Neg logit contributions
rier-0.187
othe-0.186
vere-0.179
ometers-0.176
 spine-0.174
Token the
Token ID262
Feature activation+3.00e+00

Pos logit contributions
 colored+0.068
ptions+0.066
 remembered+0.062
 saw+0.060
 coloured+0.060
Neg logit contributions
othe-0.130
 spine-0.119
 glide-0.117
vere-0.117
 numb-0.117
Token mall
Token ID17374
Feature activation+2.75e+00

Pos logit contributions
 remembered+0.035
 colored+0.035
 saw+0.032
 heard+0.031
 coloured+0.031
Neg logit contributions
rier-0.070
ometer-0.067
 spine-0.065
vere-0.064
 highest-0.064
Token with
Token ID351
Feature activation+3.65e+00

Pos logit contributions
 saw+0.042
 colored+0.041
 little+0.038
 heard+0.036
 coloured+0.036
Neg logit contributions
othe-0.061
rier-0.060
vere-0.058
 spine-0.058
ometers-0.057
Token.
Token ID13
Feature activation-3.86e+00

Pos logit contributions
othe+0.115
rier+0.113
unes+0.112
ometers+0.111
vere+0.111
Neg logit contributions
 saw-0.067
 colored-0.063
 little-0.063
 heard-0.059
 a-0.058
Token<|endoftext|>
Token ID50256
Feature activation+7.16e+00

Pos logit contributions
rier+0.117
drops+0.105
vere+0.105
 spine+0.102
 numb+0.099
Neg logit contributions
 saw-0.048
 a-0.046
 little-0.045
 found-0.041
 the-0.039
TokenOne
Token ID3198
Feature activation+5.87e+00

Pos logit contributions
 colored+0.150
 saw+0.147
 little+0.138
 coloured+0.137
 remembered+0.133
Neg logit contributions
othe-0.113
rier-0.108
vere-0.107
uts-0.102
ometer-0.102
Token day
Token ID1110
Feature activation+5.98e+00

Pos logit contributions
ptions+0.140
 colored+0.131
 media+0.126
 saw+0.124
 remembered+0.121
Neg logit contributions
othe-0.074
oes-0.065
 spine-0.062
ometer-0.058
vere-0.057
Token,
Token ID11
Feature activation+5.87e+00

Pos logit contributions
 colored+0.067
 saw+0.062
ptions+0.060
 coloured+0.056
 heard+0.052
Neg logit contributions
othe-0.157
vere-0.148
rier-0.146
uts-0.146
 spine-0.144
Token Tommy
Token ID19919
Feature activation+1.02e+01

Pos logit contributions
ptions+0.051
 colored+0.051
 amusement+0.042
 saw+0.039
 coloured+0.038
Neg logit contributions
othe-0.165
vere-0.148
uts-0.148
balls-0.147
ometer-0.147
Token was
Token ID373
Feature activation+8.64e+00

Pos logit contributions
ptions+0.180
 amusement+0.145
 media+0.134
 colored+0.131
 preschool+0.127
Neg logit contributions
othe-0.179
uts-0.169
 bleed-0.166
ometer-0.163
 We-0.162
Token walking
Token ID6155
Feature activation+7.73e+00

Pos logit contributions
 colored+0.104
 saw+0.101
 coloured+0.090
ptions+0.087
 heard+0.086
Neg logit contributions
othe-0.213
rier-0.210
vere-0.206
 spine-0.202
unes-0.198
Token in
Token ID287
Feature activation+6.28e+00

Pos logit contributions
 saw+0.109
 colored+0.107
 little+0.099
 heard+0.093
 coloured+0.092
Neg logit contributions
othe-0.185
rier-0.184
vere-0.179
ometers-0.175
unes-0.173
Token the
Token ID262
Feature activation+3.64e+00

Pos logit contributions
 colored+0.085
 remembered+0.084
ptions+0.078
 saw+0.075
 did+0.074
Neg logit contributions
othe-0.128
 numb-0.122
 spine-0.121
vere-0.116
whe-0.113
Token park
Token ID3952
Feature activation+4.34e+00

Pos logit contributions
 colored+0.045
 remembered+0.043
 did+0.043
 coloured+0.041
ptions+0.040
Neg logit contributions
 spine-0.071
 highest-0.071
 numb-0.070
oes-0.069
 stra-0.067
TokenOnce
Token ID7454
Feature activation+6.23e+00

Pos logit contributions
 colored+0.108
 saw+0.106
 little+0.100
 coloured+0.098
 remembered+0.095
Neg logit contributions
othe-0.088
rier-0.085
vere-0.084
ometer-0.081
unes-0.081
Token upon
Token ID2402
Feature activation+6.81e+00

Pos logit contributions
 saw+0.075
 colored+0.073
 little+0.067
 heard+0.062
 coloured+0.061
Neg logit contributions
othe-0.158
rier-0.158
vere-0.153
ometers-0.150
 spine-0.150
Token a
Token ID257
Feature activation+8.90e+00

Pos logit contributions
 colored+0.071
ptions+0.064
ere+0.064
 saw+0.063
 remembered+0.060
Neg logit contributions
vere-0.164
othe-0.162
uts-0.156
 numb-0.152
 bleed-0.151
Token time
Token ID640
Feature activation+5.42e+00

Pos logit contributions
 colored+0.106
 saw+0.095
ptions+0.092
 coloured+0.089
 remembered+0.081
Neg logit contributions
othe-0.213
 spine-0.209
vere-0.206
ometer-0.195
rier-0.194
Token,
Token ID11
Feature activation+6.58e+00

Pos logit contributions
 colored+0.056
 saw+0.053
 coloured+0.045
 little+0.044
 heard+0.043
Neg logit contributions
othe-0.148
vere-0.143
rier-0.143
 spine-0.139
uts-0.138
Token Jack
Token ID3619
Feature activation+1.02e+01

Pos logit contributions
 colored+0.085
 amusement+0.081
ptions+0.078
 coloured+0.073
 media+0.072
Neg logit contributions
othe-0.153
ometer-0.143
rier-0.142
vere-0.138
 spine-0.135
Token was
Token ID373
Feature activation+9.39e+00

Pos logit contributions
ptions+0.189
 amusement+0.153
 media+0.138
 preschool+0.127
 3+0.126
Neg logit contributions
oes-0.163
othe-0.160
 spine-0.149
 bleed-0.148
bo-0.148
Token playing
Token ID2712
Feature activation+7.02e+00

Pos logit contributions
 colored+0.083
 saw+0.077
 coloured+0.069
ptions+0.066
 heard+0.065
Neg logit contributions
othe-0.258
rier-0.252
vere-0.249
 spine-0.244
unes-0.241
Token with
Token ID351
Feature activation+3.73e+00

Pos logit contributions
 colored+0.117
 packed+0.116
 remembered+0.116
ptions+0.112
 saw+0.111
Neg logit contributions
othe-0.124
balls-0.123
vere-0.117
rier-0.116
but-0.116
Token his
Token ID465
Feature activation+4.41e+00

Pos logit contributions
 saw+0.063
 remembered+0.060
 heard+0.059
ptions+0.058
 packed+0.056
Neg logit contributions
othe-0.066
balls-0.066
 numb-0.065
uts-0.062
unes-0.061
Token hoop
Token ID44792
Feature activation-1.00e+00

Pos logit contributions
ptions+0.066
 saw+0.066
 heard+0.064
 colored+0.060
 remembered+0.059
Neg logit contributions
balls-0.093
rier-0.086
ometer-0.086
vere-0.083
othe-0.083
Token!
Token ID0
Feature activation-5.04e+00

Pos logit contributions
 saw+0.006
 colored+0.006
 little+0.006
 heard+0.006
 coloured+0.006
Neg logit contributions
othe-0.008
rier-0.008
vere-0.008
 spine-0.007
ometers-0.007
Token<|endoftext|>
Token ID50256
Feature activation+5.84e+00

Pos logit contributions
rier+0.121
drops+0.116
 numb+0.114
aro+0.108
unes+0.108
Neg logit contributions
 saw-0.084
 little-0.083
 a-0.082
 two-0.079
 found-0.074
TokenOne
Token ID3198
Feature activation+5.64e+00

Pos logit contributions
 colored+0.112
 saw+0.111
 little+0.104
 coloured+0.101
 heard+0.099
Neg logit contributions
othe-0.104
rier-0.102
vere-0.099
unes-0.096
ometers-0.096
Token day
Token ID1110
Feature activation+5.59e+00

Pos logit contributions
ptions+0.116
 colored+0.114
 saw+0.110
 media+0.105
 coloured+0.103
Neg logit contributions
othe-0.089
 spine-0.079
vere-0.076
oes-0.076
ometer-0.075
Token,
Token ID11
Feature activation+6.43e+00

Pos logit contributions
 colored+0.062
 saw+0.056
ptions+0.054
 coloured+0.051
 little+0.048
Neg logit contributions
othe-0.148
vere-0.140
rier-0.139
uts-0.137
 spine-0.137
Token Sally
Token ID25737
Feature activation+1.02e+01

Pos logit contributions
 colored+0.061
ptions+0.058
 amusement+0.049
 coloured+0.047
 media+0.047
Neg logit contributions
othe-0.174
balls-0.159
vere-0.158
uts-0.155
ometer-0.155
Token was
Token ID373
Feature activation+8.94e+00

Pos logit contributions
ptions+0.159
 amusement+0.127
 media+0.122
 colored+0.121
 little+0.111
Neg logit contributions
othe-0.201
uts-0.187
 spine-0.185
drops-0.181
ometer-0.177
Token in
Token ID287
Feature activation+6.90e+00

Pos logit contributions
 colored+0.104
 saw+0.097
 coloured+0.090
ptions+0.084
 heard+0.082
Neg logit contributions
othe-0.224
vere-0.217
rier-0.217
 spine-0.211
unes-0.209
Token the
Token ID262
Feature activation+3.90e+00

Pos logit contributions
 colored+0.083
ptions+0.083
 remembered+0.077
 saw+0.074
 coloured+0.073
Neg logit contributions
othe-0.148
 numb-0.141
 spine-0.130
 bleed-0.127
vere-0.123
Token park
Token ID3952
Feature activation+4.48e+00

Pos logit contributions
 colored+0.048
 coloured+0.044
 did+0.044
 remembered+0.043
ptions+0.043
Neg logit contributions
 spine-0.080
 numb-0.080
unes-0.075
rier-0.075
 highest-0.074
Token with
Token ID351
Feature activation+4.47e+00

Pos logit contributions
 saw+0.067
 colored+0.067
 little+0.061
 heard+0.058
 coloured+0.058
Neg logit contributions
othe-0.100
rier-0.097
vere-0.094
 spine-0.093
ometers-0.092
TokenOnce
Token ID7454
Feature activation+6.77e+00

Pos logit contributions
 saw+0.090
 colored+0.089
 little+0.083
 heard+0.080
 coloured+0.079
Neg logit contributions
othe-0.093
rier-0.092
vere-0.088
ometers-0.086
 spine-0.086
Token upon
Token ID2402
Feature activation+6.83e+00

Pos logit contributions
 saw+0.083
 colored+0.082
 little+0.073
 heard+0.069
 coloured+0.069
Neg logit contributions
othe-0.170
rier-0.169
vere-0.163
 spine-0.161
ometers-0.160
Token a
Token ID257
Feature activation+9.11e+00

Pos logit contributions
 colored+0.068
 saw+0.061
ptions+0.060
ere+0.058
 remembered+0.055
Neg logit contributions
vere-0.168
othe-0.167
uts-0.163
rier-0.160
 numb-0.159
Token time
Token ID640
Feature activation+5.54e+00

Pos logit contributions
 colored+0.105
 saw+0.094
 coloured+0.088
ptions+0.087
 picked+0.077
Neg logit contributions
othe-0.227
 spine-0.222
vere-0.220
rier-0.212
ometer-0.207
Token,
Token ID11
Feature activation+6.98e+00

Pos logit contributions
 saw+0.048
 colored+0.048
 little+0.040
 coloured+0.037
 heard+0.037
Neg logit contributions
othe-0.159
rier-0.157
vere-0.153
 spine-0.151
ometers-0.150
Token Jack
Token ID3619
Feature activation+1.02e+01

Pos logit contributions
 colored+0.090
 amusement+0.084
 coloured+0.078
ptions+0.074
 media+0.074
Neg logit contributions
othe-0.163
rier-0.162
ometer-0.155
 spine-0.151
vere-0.149
Token and
Token ID290
Feature activation+3.94e+00

Pos logit contributions
ptions+0.180
 amusement+0.147
 media+0.144
 preschool+0.126
 young+0.123
Neg logit contributions
oes-0.162
othe-0.159
urch-0.153
 spine-0.152
 bleed-0.149
Token Jill
Token ID22772
Feature activation+9.19e+00

Pos logit contributions
 colored+0.053
 saw+0.053
 little+0.047
 coloured+0.046
 heard+0.046
Neg logit contributions
othe-0.094
rier-0.092
vere-0.089
 spine-0.088
ometer-0.087
Token were
Token ID547
Feature activation+9.23e+00

Pos logit contributions
ptions+0.146
 colored+0.140
 little+0.136
 media+0.136
 saw+0.130
Neg logit contributions
othe-0.171
unes-0.159
drops-0.157
 spine-0.156
ometers-0.155
Token very
Token ID845
Feature activation+3.41e+00

Pos logit contributions
 colored+0.167
 saw+0.164
 coloured+0.152
 heard+0.148
 little+0.145
Neg logit contributions
rier-0.172
othe-0.172
vere-0.168
unes-0.163
ometers-0.158
Token excited
Token ID6568
Feature activation+4.86e+00

Pos logit contributions
 colored+0.061
 saw+0.060
 heard+0.056
 coloured+0.055
ptions+0.054
Neg logit contributions
othe-0.066
rier-0.065
vere-0.062
ometers-0.061
ometer-0.060
Token.
Token ID13
Feature activation-2.40e+00

Pos logit contributions
rier+0.023
othe+0.022
vere+0.022
ometer+0.022
ometers+0.021
Neg logit contributions
 saw-0.015
 colored-0.015
 little-0.014
 heard-0.013
 found-0.013
Token<|endoftext|>
Token ID50256
Feature activation+6.14e+00

Pos logit contributions
rier+0.047
othe+0.046
vere+0.044
 spine+0.043
ometers+0.042
Neg logit contributions
 saw-0.046
 little-0.043
 colored-0.043
 found-0.040
 heard-0.040
TokenOne
Token ID3198
Feature activation+5.57e+00

Pos logit contributions
 colored+0.124
 saw+0.122
 little+0.114
 coloured+0.113
 heard+0.110
Neg logit contributions
othe-0.103
rier-0.098
vere-0.097
unes-0.095
ometer-0.094
Token day
Token ID1110
Feature activation+5.72e+00

Pos logit contributions
 colored+0.104
 saw+0.104
ptions+0.099
 heard+0.093
 coloured+0.093
Neg logit contributions
othe-0.100
 spine-0.092
vere-0.091
rier-0.091
unes-0.088
Token,
Token ID11
Feature activation+6.34e+00

Pos logit contributions
 colored+0.056
 saw+0.054
 coloured+0.046
 little+0.045
ptions+0.045
Neg logit contributions
othe-0.157
rier-0.153
vere-0.151
 spine-0.149
uts-0.148
Token Sammy
Token ID44067
Feature activation+1.02e+01

Pos logit contributions
 colored+0.067
ptions+0.061
 amusement+0.059
 media+0.057
 coloured+0.054
Neg logit contributions
othe-0.157
rier-0.153
ometer-0.153
balls-0.151
ometers-0.144
Token saw
Token ID2497
Feature activation+8.60e+00

Pos logit contributions
ptions+0.167
 amusement+0.130
 media+0.117
 preschool+0.116
 colored+0.116
Neg logit contributions
othe-0.192
 spine-0.175
uts-0.168
oes-0.164
line-0.164
Token something
Token ID1223
Feature activation+4.56e-01

Pos logit contributions
ptions+0.094
 packed+0.091
 colored+0.090
 remembered+0.088
 saw+0.086
Neg logit contributions
othe-0.212
rier-0.193
 numb-0.190
uts-0.190
ometers-0.189
Token very
Token ID845
Feature activation+4.32e-01

Pos logit contributions
 saw+0.006
 colored+0.006
 little+0.005
 heard+0.005
 coloured+0.005
Neg logit contributions
rier-0.011
othe-0.011
vere-0.011
 spine-0.011
ometers-0.011
Token beautiful
Token ID4950
Feature activation+2.21e+00

Pos logit contributions
 colored+0.006
 saw+0.006
ptions+0.006
 heard+0.005
 remembered+0.005
Neg logit contributions
othe-0.010
rier-0.010
vere-0.009
ometer-0.009
ometers-0.009
Token in
Token ID287
Feature activation+1.75e+00

Pos logit contributions
 saw+0.042
ptions+0.041
 little+0.039
 remembered+0.039
 colored+0.038
Neg logit contributions
rier-0.038
othe-0.037
uts-0.035
drops-0.035
ometer-0.035
Token me
Token ID502
Feature activation-2.89e+00

Pos logit contributions
othe+0.120
rier+0.119
vere+0.115
 spine+0.113
ometers+0.113
Neg logit contributions
 saw-0.068
 colored-0.067
 little-0.061
 heard-0.058
 coloured-0.058
Tokenadow
Token ID4584
Feature activation-2.34e+00

Pos logit contributions
rier+0.056
othe+0.055
vere+0.052
 spine+0.051
ometers+0.051
Neg logit contributions
 saw-0.054
 colored-0.053
 little-0.050
 heard-0.048
 found-0.048
Token<|endoftext|>
Token ID50256
Feature activation+6.33e+00

Pos logit contributions
othe+0.051
rier+0.051
vere+0.049
ometers+0.048
 spine+0.048
Neg logit contributions
 saw-0.037
 colored-0.036
 little-0.033
 heard-0.032
 coloured-0.031
TokenOnce
Token ID7454
Feature activation+7.62e+00

Pos logit contributions
 colored+0.122
 saw+0.120
 little+0.113
 coloured+0.110
 remembered+0.107
Neg logit contributions
othe-0.110
rier-0.107
vere-0.105
unes-0.102
ometers-0.100
Token upon
Token ID2402
Feature activation+7.87e+00

Pos logit contributions
 colored+0.096
 saw+0.094
 little+0.084
 coloured+0.081
 heard+0.080
Neg logit contributions
othe-0.190
rier-0.187
vere-0.182
 spine-0.178
ometers-0.178
Token a
Token ID257
Feature activation+1.02e+01

Pos logit contributions
 colored+0.078
ptions+0.070
ere+0.067
 saw+0.067
 coloured+0.063
Neg logit contributions
othe-0.198
vere-0.196
uts-0.187
 bleed-0.186
 numb-0.185
Token time
Token ID640
Feature activation+6.40e+00

Pos logit contributions
 colored+0.119
 saw+0.106
ptions+0.102
 coloured+0.099
 remembered+0.088
Neg logit contributions
othe-0.253
 spine-0.245
vere-0.243
rier-0.231
ometer-0.229
Token,
Token ID11
Feature activation+7.30e+00

Pos logit contributions
 colored+0.063
 saw+0.060
 little+0.050
 coloured+0.050
 heard+0.048
Neg logit contributions
othe-0.178
rier-0.172
vere-0.171
 spine-0.167
ometers-0.166
Token there
Token ID612
Feature activation+7.08e+00

Pos logit contributions
 colored+0.087
 amusement+0.077
ptions+0.076
 coloured+0.074
 media+0.073
Neg logit contributions
othe-0.180
rier-0.169
ometer-0.165
vere-0.165
 spine-0.162
Token was
Token ID373
Feature activation+8.32e+00

Pos logit contributions
ptions+0.103
 colored+0.097
 packed+0.090
 coloured+0.088
 media+0.083
Neg logit contributions
othe-0.150
uts-0.126
urch-0.125
ometer-0.124
vere-0.123
Token a
Token ID257
Feature activation+6.20e+00

Pos logit contributions
ptions+0.096
 saw+0.094
 remembered+0.090
 colored+0.084
 packed+0.083
Neg logit contributions
othe-0.179
 numb-0.168
umbs-0.167
unes-0.165
vere-0.164
Token had
Token ID550
Feature activation-3.90e-01

Pos logit contributions
othe+0.102
rier+0.097
ometers+0.094
vere+0.093
drops+0.092
Neg logit contributions
 colored-0.066
 saw-0.063
ptions-0.058
 coloured-0.057
 little-0.056
Token learned
Token ID4499
Feature activation-1.15e+00

Pos logit contributions
othe+0.010
ometers+0.009
uts+0.009
unes+0.009
rier+0.009
Neg logit contributions
 colored-0.004
 saw-0.004
ptions-0.004
 coloured-0.004
 little-0.003
Token to
Token ID284
Feature activation-5.41e-01

Pos logit contributions
othe+0.030
rier+0.029
ometers+0.028
vere+0.028
ometer+0.028
Neg logit contributions
 saw-0.013
 colored-0.013
ptions-0.011
 coloured-0.011
 heard-0.011
Token appreciate
Token ID9144
Feature activation+6.05e-01

Pos logit contributions
othe+0.011
 spine+0.010
vere+0.010
rier+0.009
drops+0.009
Neg logit contributions
 colored-0.009
 saw-0.009
ptions-0.009
 little-0.008
 coloured-0.008
Token the
Token ID262
Feature activation-3.41e+00

Pos logit contributions
 saw+0.007
 colored+0.007
 heard+0.006
ptions+0.006
 found+0.006
Neg logit contributions
othe-0.016
rier-0.015
ometers-0.015
vere-0.015
ometer-0.014
Token therm
Token ID21969
Feature activation-1.21e+01

Pos logit contributions
othe+0.079
rier+0.077
uts+0.076
vere+0.075
unes+0.074
Neg logit contributions
 saw-0.052
 colored-0.051
 little-0.050
 heard-0.044
 coloured-0.044
Tokenometer
Token ID15635
Feature activation-8.46e+00

Pos logit contributions
aro+0.111
vere+0.110
 glide+0.106
othe+0.100
umbs+0.096
Neg logit contributions
 little-0.303
 saw-0.290
 young-0.280
 two-0.277
 colored-0.274
Token.
Token ID13
Feature activation-4.82e+00

Pos logit contributions
othe+0.200
rier+0.200
vere+0.197
unes+0.193
 spine+0.191
Neg logit contributions
 saw-0.117
 colored-0.114
 little-0.108
 heard-0.101
 found-0.100
Token<|endoftext|>
Token ID50256
Feature activation+5.64e+00

Pos logit contributions
rier+0.126
vere+0.113
 spine+0.112
drops+0.111
 numb+0.111
Neg logit contributions
 saw-0.073
 little-0.066
 colored-0.063
 found-0.062
 a-0.061
TokenJohn
Token ID7554
Feature activation+7.15e+00

Pos logit contributions
 colored+0.110
 saw+0.108
 little+0.101
 coloured+0.099
 heard+0.096
Neg logit contributions
othe-0.099
rier-0.097
vere-0.094
unes-0.091
ometer-0.091
Token lay
Token ID3830
Feature activation+5.71e+00

Pos logit contributions
ptions+0.097
 amusement+0.074
 colored+0.069
 little+0.067
 media+0.066
Neg logit contributions
ometer-0.156
othe-0.155
ometers-0.150
drops-0.148
unes-0.148
Token.
Token ID13
Feature activation-5.36e+00

Pos logit contributions
unes+0.054
rier+0.053
 spine+0.053
ometers+0.052
othe+0.052
Neg logit contributions
 saw-0.029
 little-0.027
 colored-0.027
 heard-0.024
 found-0.024
Token She
Token ID1375
Feature activation-1.45e+00

Pos logit contributions
rier+0.130
vere+0.123
 spine+0.121
drops+0.119
 numb+0.117
Neg logit contributions
 saw-0.079
 little-0.075
 a-0.072
 two-0.070
 found-0.069
Token had
Token ID550
Feature activation+4.69e-01

Pos logit contributions
othe+0.038
vere+0.035
rier+0.035
ometer+0.035
uts+0.034
Neg logit contributions
 colored-0.016
ptions-0.015
 saw-0.014
 little-0.013
 coloured-0.013
Token the
Token ID262
Feature activation-1.84e+00

Pos logit contributions
ptions+0.006
 colored+0.005
 media+0.005
 coloured+0.005
 amusement+0.004
Neg logit contributions
othe-0.010
vere-0.010
uts-0.010
umbs-0.009
ometers-0.009
Token strongest
Token ID12841
Feature activation-2.70e+00

Pos logit contributions
othe+0.039
rier+0.039
vere+0.037
ometers+0.037
 spine+0.036
Neg logit contributions
 saw-0.031
 colored-0.030
 little-0.029
 heard-0.027
 coloured-0.027
Token therm
Token ID21969
Feature activation-1.11e+01

Pos logit contributions
rier+0.067
othe+0.066
 spine+0.063
vere+0.063
ometer+0.062
Neg logit contributions
 colored-0.033
 saw-0.032
ptions-0.029
 heard-0.027
 coloured-0.027
Tokenometer
Token ID15635
Feature activation-7.44e+00

Pos logit contributions
othe+0.078
vere+0.076
unes+0.070
 numb+0.069
umbs+0.068
Neg logit contributions
 little-0.304
 saw-0.301
 colored-0.295
 two-0.287
 did-0.282
Token in
Token ID287
Feature activation-3.60e-01

Pos logit contributions
rier+0.189
othe+0.189
vere+0.183
unes+0.180
ometers+0.179
Neg logit contributions
 saw-0.093
 colored-0.089
 little-0.082
 heard-0.077
 found-0.077
Token the
Token ID262
Feature activation-2.84e+00

Pos logit contributions
othe+0.009
rier+0.009
vere+0.009
 spine+0.008
uts+0.008
Neg logit contributions
 colored-0.004
 saw-0.004
ptions-0.003
 coloured-0.003
 heard-0.003
Token town
Token ID3240
Feature activation-3.56e+00

Pos logit contributions
rier+0.063
othe+0.062
 spine+0.060
vere+0.060
ometer+0.059
Neg logit contributions
 colored-0.042
 saw-0.040
 coloured-0.037
 little-0.036
 heard-0.035
Token and
Token ID290
Feature activation-4.52e+00

Pos logit contributions
othe+0.088
rier+0.087
vere+0.085
 spine+0.083
ometers+0.083
Neg logit contributions
 saw-0.044
 colored-0.044
 little-0.039
 coloured-0.037
 heard-0.037
Token They
Token ID1119
Feature activation+3.52e-01

Pos logit contributions
 spine+0.065
vere+0.064
rier+0.063
ometers+0.062
othe+0.062
Neg logit contributions
 little-0.040
 saw-0.038
 two-0.034
 a-0.033
 did-0.031
Token took
Token ID1718
Feature activation+3.34e+00

Pos logit contributions
 preschool+0.005
ptions+0.005
grown+0.005
 media+0.005
 abandoned+0.005
Neg logit contributions
drops-0.007
othe-0.006
whe-0.006
bars-0.006
balls-0.006
Token turns
Token ID4962
Feature activation+2.12e-01

Pos logit contributions
 saw+0.028
 colored+0.027
 little+0.022
 heard+0.022
 coloured+0.021
Neg logit contributions
othe-0.097
rier-0.096
vere-0.092
ometers-0.092
ometer-0.091
Token putting
Token ID5137
Feature activation+1.22e+00

Pos logit contributions
 saw+0.003
 colored+0.003
 little+0.003
 heard+0.003
 coloured+0.003
Neg logit contributions
othe-0.005
rier-0.005
vere-0.005
 spine-0.004
ometers-0.004
Token the
Token ID262
Feature activation-3.73e+00

Pos logit contributions
 saw+0.008
 colored+0.008
 little+0.007
 coloured+0.006
 heard+0.006
Neg logit contributions
othe-0.038
rier-0.037
vere-0.037
 spine-0.037
uts-0.036
Token therm
Token ID21969
Feature activation-1.09e+01

Pos logit contributions
othe+0.077
uts+0.072
vere+0.072
rier+0.071
 spine+0.069
Neg logit contributions
 saw-0.068
 little-0.068
 colored-0.066
 two-0.060
 heard-0.059
Tokenometer
Token ID15635
Feature activation-6.51e+00

Pos logit contributions
othe+0.112
vere+0.110
aro+0.100
 spine+0.096
drops+0.095
Neg logit contributions
 little-0.261
 two-0.256
 saw-0.255
 a-0.245
 three-0.240
Token in
Token ID287
Feature activation+4.52e-01

Pos logit contributions
othe+0.169
vere+0.167
rier+0.163
plets+0.159
unes+0.158
Neg logit contributions
 colored-0.078
 saw-0.078
 a-0.070
 little-0.070
 found-0.067
Token their
Token ID511
Feature activation-1.59e+00

Pos logit contributions
 saw+0.005
 colored+0.005
 remembered+0.004
 observed+0.004
 packed+0.004
Neg logit contributions
othe-0.011
rier-0.011
vere-0.010
 spine-0.010
 numb-0.010
Token mouths
Token ID28552
Feature activation-3.14e+00

Pos logit contributions
othe+0.032
rier+0.032
vere+0.031
uts+0.030
plets+0.029
Neg logit contributions
 saw-0.030
 colored-0.029
 little-0.028
 found-0.026
 heard-0.026
Token and
Token ID290
Feature activation-4.62e+00

Pos logit contributions
othe+0.077
rier+0.075
vere+0.074
ometers+0.072
unes+0.071
Neg logit contributions
 saw-0.042
 colored-0.041
 little-0.038
 heard-0.036
 coloured-0.035
Token will
Token ID481
Feature activation-1.08e+00

Pos logit contributions
othe+0.052
ometer+0.048
drops+0.047
ometers+0.047
vere+0.047
Neg logit contributions
 colored-0.028
ptions-0.026
 saw-0.026
 little-0.026
 coloured-0.024
Token not
Token ID407
Feature activation-3.72e+00

Pos logit contributions
othe+0.023
rier+0.022
vere+0.022
 spine+0.021
ometers+0.021
Neg logit contributions
 saw-0.017
 colored-0.017
 little-0.016
 coloured-0.015
 heard-0.015
Token touch
Token ID3638
Feature activation+2.92e-01

Pos logit contributions
othe+0.086
rier+0.085
vere+0.082
ometers+0.081
 spine+0.081
Neg logit contributions
 saw-0.054
 colored-0.053
 little-0.049
 heard-0.047
 coloured-0.046
Token the
Token ID262
Feature activation-4.63e+00

Pos logit contributions
 saw+0.003
 colored+0.003
 little+0.003
 a+0.002
 heard+0.002
Neg logit contributions
othe-0.008
rier-0.008
 spine-0.008
vere-0.008
ometer-0.008
Token ash
Token ID12530
Feature activation-9.94e+00

Pos logit contributions
othe+0.103
rier+0.102
vere+0.099
ometers+0.096
 spine+0.096
Neg logit contributions
 saw-0.075
 colored-0.073
 little-0.069
 heard-0.064
 coloured-0.063
Tokentr
Token ID2213
Feature activation-1.09e+01

Pos logit contributions
vere+0.196
othe+0.190
aro+0.171
rier+0.167
plets+0.165
Neg logit contributions
 saw-0.191
 colored-0.189
 little-0.180
 a-0.179
 did-0.173
Tokenay
Token ID323
Feature activation-4.84e+00

Pos logit contributions
 spine+0.262
rier+0.257
othe+0.254
vere+0.248
aro+0.246
Neg logit contributions
 colored-0.139
 saw-0.135
 little-0.119
 did-0.118
 coloured-0.114
Token again
Token ID757
Feature activation-2.68e+00

Pos logit contributions
 spine+0.116
vere+0.113
rier+0.113
othe+0.112
ometers+0.107
Neg logit contributions
 colored-0.071
 saw-0.069
 little-0.063
 a-0.061
 did-0.061
Token,"
Token ID553
Feature activation-2.23e+00

Pos logit contributions
othe+0.052
rier+0.052
vere+0.050
ometers+0.049
 spine+0.049
Neg logit contributions
 saw-0.048
 colored-0.047
 little-0.045
 heard-0.043
 coloured-0.042
Token Tim
Token ID5045
Feature activation+2.16e+00

Pos logit contributions
rier+0.051
 spine+0.049
ometer+0.048
ometers+0.046
vere+0.046
Neg logit contributions
 saw-0.041
 little-0.036
 colored-0.036
 a-0.035
 did-0.034
Token says
Token ID1139
Feature activation+2.82e+00

Pos logit contributions
 saw+0.034
 colored+0.033
 little+0.031
 heard+0.030
 found+0.029
Neg logit contributions
othe-0.048
rier-0.048
vere-0.046
ometers-0.045
 spine-0.045
Token him
Token ID683
Feature activation+5.45e-01

Pos logit contributions
 saw+0.040
 colored+0.040
 remembered+0.037
 packed+0.036
 heard+0.035
Neg logit contributions
othe-0.049
rier-0.049
ometers-0.045
ometer-0.044
vere-0.044
Token how
Token ID703
Feature activation-2.97e+00

Pos logit contributions
 colored+0.005
 saw+0.005
ptions+0.005
 coloured+0.004
 heard+0.004
Neg logit contributions
othe-0.015
rier-0.015
ometers-0.014
ometer-0.014
vere-0.014
Token to
Token ID284
Feature activation-1.14e+00

Pos logit contributions
rier+0.069
othe+0.069
vere+0.066
 spine+0.065
unes+0.065
Neg logit contributions
 saw-0.043
 colored-0.042
 little-0.039
 heard-0.037
 coloured-0.036
Token use
Token ID779
Feature activation-3.05e-01

Pos logit contributions
othe+0.023
rier+0.022
vere+0.021
 spine+0.021
ometers+0.021
Neg logit contributions
 saw-0.020
 colored-0.020
 little-0.018
 heard-0.018
 coloured-0.017
Token the
Token ID262
Feature activation-4.48e+00

Pos logit contributions
rier+0.009
othe+0.009
vere+0.009
 spine+0.008
drops+0.008
Neg logit contributions
 saw-0.003
 colored-0.003
 little-0.002
 coloured-0.002
 heard-0.002
Token therm
Token ID21969
Feature activation-1.09e+01

Pos logit contributions
othe+0.091
uts+0.089
rier+0.087
vere+0.087
drops+0.085
Neg logit contributions
 little-0.083
 saw-0.081
 colored-0.079
 two-0.072
 heard-0.071
Tokenometer
Token ID15635
Feature activation-6.07e+00

Pos logit contributions
vere+0.097
aro+0.097
drops+0.092
othe+0.092
 glide+0.085
Neg logit contributions
 little-0.278
 saw-0.267
 two-0.262
 young-0.255
 colored-0.254
Token and
Token ID290
Feature activation-4.76e+00

Pos logit contributions
othe+0.155
vere+0.155
rier+0.154
 spine+0.149
plets+0.148
Neg logit contributions
 saw-0.071
 colored-0.069
 little-0.065
 a-0.063
 heard-0.061
Token soon
Token ID2582
Feature activation-1.32e+00

Pos logit contributions
othe+0.115
rier+0.110
ometers+0.107
vere+0.106
ometer+0.105
Neg logit contributions
 colored-0.062
 saw-0.061
 little-0.055
ptions-0.054
 coloured-0.053
Token John
Token ID1757
Feature activation+2.70e-01

Pos logit contributions
othe+0.026
rier+0.025
ometers+0.024
uts+0.024
vere+0.024
Neg logit contributions
 colored-0.022
 saw-0.021
ptions-0.020
 coloured-0.020
 little-0.019
Token knew
Token ID2993
Feature activation-1.15e+00

Pos logit contributions
 colored+0.003
 saw+0.003
 little+0.002
ptions+0.002
 coloured+0.002
Neg logit contributions
othe-0.007
rier-0.007
ometer-0.007
ometers-0.007
vere-0.007
Token were
Token ID547
Feature activation+8.68e-01

Pos logit contributions
othe+0.036
vere+0.035
drops+0.035
taker+0.034
rier+0.034
Neg logit contributions
 colored-0.036
ptions-0.034
 coloured-0.034
 saw-0.032
 media-0.031
Token brave
Token ID14802
Feature activation+5.71e-01

Pos logit contributions
 saw+0.014
 colored+0.014
 heard+0.013
 coloured+0.012
 little+0.012
Neg logit contributions
rier-0.018
othe-0.017
vere-0.017
ometers-0.016
drops-0.016
Token and
Token ID290
Feature activation-4.81e+00

Pos logit contributions
 colored+0.008
 saw+0.008
ptions+0.008
 coloured+0.007
 observed+0.007
Neg logit contributions
rier-0.013
drops-0.012
vere-0.011
othe-0.011
 spine-0.011
Token kept
Token ID4030
Feature activation+9.74e-01

Pos logit contributions
othe+0.097
rier+0.091
ometers+0.090
vere+0.090
ometer+0.088
Neg logit contributions
 colored-0.083
 saw-0.079
ptions-0.078
 coloured-0.074
 heard-0.073
Token the
Token ID262
Feature activation-2.86e+00

Pos logit contributions
 saw+0.012
 colored+0.012
 little+0.011
 heard+0.010
 coloured+0.010
Neg logit contributions
othe-0.024
rier-0.024
vere-0.023
ometers-0.023
 spine-0.023
Token therm
Token ID21969
Feature activation-1.09e+01

Pos logit contributions
othe+0.057
vere+0.055
uts+0.055
rier+0.054
unes+0.053
Neg logit contributions
 saw-0.053
 little-0.052
 colored-0.052
 coloured-0.046
 heard-0.046
Tokenometer
Token ID15635
Feature activation-7.62e+00

Pos logit contributions
aro+0.101
vere+0.097
unes+0.094
othe+0.091
 glide+0.090
Neg logit contributions
 little-0.271
 saw-0.258
 young-0.249
 two-0.247
 did-0.239
Token in
Token ID287
Feature activation-1.34e+00

Pos logit contributions
othe+0.174
vere+0.167
unes+0.165
rier+0.162
ometers+0.162
Neg logit contributions
 saw-0.113
 little-0.112
 colored-0.108
 a-0.102
 found-0.100
Token your
Token ID534
Feature activation-1.41e+00

Pos logit contributions
othe+0.033
rier+0.032
vere+0.031
 spine+0.031
ometers+0.030
Neg logit contributions
 saw-0.017
 colored-0.016
 heard-0.014
ptions-0.013
 did-0.013
Token mouth
Token ID5422
Feature activation-4.23e+00

Pos logit contributions
othe+0.029
rier+0.028
vere+0.028
uts+0.028
unes+0.027
Neg logit contributions
 saw-0.026
 little-0.025
 colored-0.023
 found-0.022
 heard-0.022
Token!"
Token ID2474
Feature activation-3.63e+00

Pos logit contributions
othe+0.076
rier+0.075
vere+0.072
 spine+0.070
ometers+0.070
Neg logit contributions
 saw-0.082
 colored-0.081
 little-0.077
 heard-0.074
 coloured-0.073
Token checked
Token ID10667
Feature activation+1.10e+00

Pos logit contributions
 colored+0.000
 saw+0.000
ptions+0.000
 little+0.000
 coloured+0.000
Neg logit contributions
othe-0.000
rier-0.000
vere-0.000
 numb-0.000
drops-0.000
Token.
Token ID13
Feature activation-4.98e+00

Pos logit contributions
 saw+0.013
 colored+0.013
 little+0.012
 heard+0.011
 coloured+0.011
Neg logit contributions
othe-0.028
rier-0.028
vere-0.027
 spine-0.027
unes-0.027
Token She
Token ID1375
Feature activation-7.52e-01

Pos logit contributions
rier+0.101
othe+0.100
vere+0.099
 spine+0.095
drops+0.094
Neg logit contributions
 saw-0.086
 little-0.084
 colored-0.078
 a-0.075
 found-0.074
Token put
Token ID1234
Feature activation+1.33e+00

Pos logit contributions
othe+0.019
ometer+0.017
rier+0.017
drops+0.017
vere+0.017
Neg logit contributions
ptions-0.009
 colored-0.008
 saw-0.007
 little-0.007
 coloured-0.007
Token the
Token ID262
Feature activation-2.50e+00

Pos logit contributions
 saw+0.013
 colored+0.013
 little+0.011
 heard+0.010
 coloured+0.010
Neg logit contributions
othe-0.037
rier-0.037
vere-0.036
 spine-0.035
ometers-0.035
Token therm
Token ID21969
Feature activation-1.08e+01

Pos logit contributions
rier+0.049
othe+0.049
vere+0.047
ometer+0.046
 spine+0.046
Neg logit contributions
 saw-0.044
 colored-0.043
 heard-0.039
 found-0.039
 coloured-0.039
Tokenometer
Token ID15635
Feature activation-6.13e+00

Pos logit contributions
vere+0.091
othe+0.090
umbs+0.084
unes+0.082
 spine+0.081
Neg logit contributions
 saw-0.273
 little-0.273
 two-0.265
 colored-0.262
 a-0.251
Token next
Token ID1306
Feature activation-1.50e+00

Pos logit contributions
othe+0.146
rier+0.145
vere+0.143
unes+0.142
ometers+0.139
Neg logit contributions
 saw-0.087
 colored-0.082
 little-0.078
 found-0.074
 heard-0.074
Token to
Token ID284
Feature activation+9.29e-01

Pos logit contributions
rier+0.031
othe+0.031
 spine+0.030
ometer+0.029
vere+0.029
Neg logit contributions
 colored-0.024
 saw-0.024
 little-0.021
 coloured-0.021
ptions-0.021
Token his
Token ID465
Feature activation-4.81e-01

Pos logit contributions
 colored+0.013
 saw+0.013
 coloured+0.011
 little+0.011
 heard+0.011
Neg logit contributions
othe-0.022
rier-0.022
vere-0.021
 spine-0.020
ometers-0.020
Token forehead
Token ID22645
Feature activation-7.72e-01

Pos logit contributions
othe+0.011
rier+0.011
vere+0.010
 spine+0.010
ometers+0.010
Neg logit contributions
 saw-0.007
 colored-0.007
 little-0.006
 heard-0.006
 coloured-0.006
Token and
Token ID290
Feature activation-1.77e+00

Pos logit contributions
 colored+0.030
 saw+0.030
 little+0.027
 coloured+0.026
ptions+0.026
Neg logit contributions
othe-0.049
rier-0.048
vere-0.046
 spine-0.045
ometers-0.045
Token they
Token ID484
Feature activation-5.56e-01

Pos logit contributions
othe+0.032
rier+0.029
vere+0.027
ometers+0.027
ometer+0.027
Neg logit contributions
 colored-0.035
 saw-0.033
ptions-0.032
 coloured-0.032
 little-0.031
Token bought
Token ID5839
Feature activation+2.65e+00

Pos logit contributions
othe+0.011
drops+0.010
taker+0.010
vere+0.010
ometers+0.010
Neg logit contributions
ptions-0.008
 media-0.008
 colored-0.008
 little-0.008
 preschool-0.007
Token the
Token ID262
Feature activation-2.72e+00

Pos logit contributions
 saw+0.024
 colored+0.023
 little+0.019
 heard+0.019
 coloured+0.018
Neg logit contributions
othe-0.075
rier-0.075
ometers-0.072
vere-0.071
ometer-0.071
Token new
Token ID649
Feature activation-3.65e+00

Pos logit contributions
othe+0.048
rier+0.048
vere+0.046
 spine+0.045
ometers+0.045
Neg logit contributions
 saw-0.054
 colored-0.054
 little-0.050
 heard-0.049
 coloured-0.048
Token therm
Token ID21969
Feature activation-1.07e+01

Pos logit contributions
othe+0.073
rier+0.072
vere+0.070
 spine+0.068
ometers+0.068
Neg logit contributions
 saw-0.065
 colored-0.064
 little-0.061
 coloured-0.057
 heard-0.057
Tokenometer
Token ID15635
Feature activation-6.17e+00

Pos logit contributions
vere+0.103
 glide+0.100
drops+0.099
aro+0.096
othe+0.096
Neg logit contributions
 little-0.264
 saw-0.256
 two-0.247
 colored-0.245
 young-0.242
Token!
Token ID0
Feature activation-4.66e+00

Pos logit contributions
vere+0.153
othe+0.150
rier+0.149
plets+0.146
 spine+0.146
Neg logit contributions
 saw-0.081
 colored-0.078
 a-0.074
 little-0.073
 found-0.072
Token<|endoftext|>
Token ID50256
Feature activation+5.83e+00

Pos logit contributions
rier+0.120
vere+0.109
 spine+0.108
 numb+0.105
unes+0.102
Neg logit contributions
 saw-0.070
 little-0.065
 a-0.062
 two-0.061
 found-0.058
TokenOnce
Token ID7454
Feature activation+6.48e+00

Pos logit contributions
 colored+0.115
 saw+0.113
 little+0.106
 coloured+0.105
 heard+0.101
Neg logit contributions
othe-0.100
rier-0.097
vere-0.095
unes-0.092
ometer-0.092
Token upon
Token ID2402
Feature activation+7.14e+00

Pos logit contributions
 saw+0.078
 colored+0.074
 little+0.069
 heard+0.064
 found+0.062
Neg logit contributions
rier-0.166
othe-0.165
vere-0.161
ometers-0.158
 spine-0.158
Token not
Token ID407
Feature activation-3.18e+00

Pos logit contributions
othe+0.026
vere+0.024
balls+0.024
ometers+0.024
uts+0.023
Neg logit contributions
 colored-0.016
 coloured-0.014
 little-0.014
ptions-0.014
 saw-0.014
Token played
Token ID2826
Feature activation+3.92e-01

Pos logit contributions
othe+0.074
 bleed+0.069
drops+0.068
 numb+0.067
uts+0.067
Neg logit contributions
 media-0.037
iet-0.036
ptions-0.036
ere-0.034
iley-0.034
Token with
Token ID351
Feature activation-8.93e-01

Pos logit contributions
 saw+0.006
 colored+0.006
 little+0.005
 heard+0.005
 coloured+0.005
Neg logit contributions
rier-0.009
othe-0.009
vere-0.008
ometers-0.008
ometer-0.008
Token the
Token ID262
Feature activation-2.85e+00

Pos logit contributions
othe+0.022
rier+0.022
ometers+0.021
unes+0.021
ometer+0.021
Neg logit contributions
 saw-0.011
 colored-0.010
 heard-0.009
 packed-0.009
 coloured-0.009
Token ash
Token ID12530
Feature activation-9.35e+00

Pos logit contributions
othe+0.059
rier+0.059
vere+0.057
ometers+0.056
 spine+0.056
Neg logit contributions
 saw-0.048
 colored-0.047
 little-0.044
 heard-0.042
 coloured-0.042
Tokentr
Token ID2213
Feature activation-1.07e+01

Pos logit contributions
vere+0.180
othe+0.172
rier+0.165
plets+0.155
aro+0.153
Neg logit contributions
 saw-0.194
 colored-0.186
 little-0.179
 did-0.175
 heard-0.173
Tokenay
Token ID323
Feature activation-3.94e+00

Pos logit contributions
rier+0.258
 spine+0.255
othe+0.251
vere+0.247
aro+0.241
Neg logit contributions
 colored-0.137
 saw-0.136
 little-0.120
 did-0.115
 coloured-0.113
Token.
Token ID13
Feature activation-4.61e+00

Pos logit contributions
rier+0.090
vere+0.089
 spine+0.089
othe+0.088
ometers+0.083
Neg logit contributions
 colored-0.060
 saw-0.059
 little-0.054
 coloured-0.052
 heard-0.052
Token They
Token ID1119
Feature activation-9.41e-01

Pos logit contributions
rier+0.090
othe+0.085
vere+0.085
unes+0.082
 spine+0.082
Neg logit contributions
 saw-0.090
 little-0.085
 colored-0.084
 a-0.079
 did-0.078
Token wished
Token ID16555
Feature activation-9.30e-01

Pos logit contributions
othe+0.023
drops+0.021
vere+0.021
 numb+0.021
 spine+0.020
Neg logit contributions
 colored-0.012
ptions-0.011
 coloured-0.011
 little-0.011
 saw-0.010
Token they
Token ID484
Feature activation-3.25e+00

Pos logit contributions
rier+0.019
vere+0.018
 spine+0.018
ometer+0.018
plets+0.018
Neg logit contributions
 saw-0.017
 little-0.017
 colored-0.016
 a-0.016
 two-0.015
Token was
Token ID373
Feature activation+1.65e+00

Pos logit contributions
othe+0.068
rier+0.066
ometers+0.063
vere+0.063
 spine+0.062
Neg logit contributions
 colored-0.044
 saw-0.042
 little-0.039
 coloured-0.038
ptions-0.038
Token cheaper
Token ID11721
Feature activation-1.18e+00

Pos logit contributions
 colored+0.026
 saw+0.025
 little+0.023
 coloured+0.023
ptions+0.022
Neg logit contributions
rier-0.036
othe-0.036
vere-0.034
ometers-0.033
ometer-0.032
Token than
Token ID621
Feature activation-3.55e+00

Pos logit contributions
rier+0.023
othe+0.022
ometers+0.021
vere+0.021
ometer+0.021
Neg logit contributions
 saw-0.021
 colored-0.021
 little-0.019
 heard-0.019
 coloured-0.019
Token the
Token ID262
Feature activation-4.11e+00

Pos logit contributions
othe+0.084
rier+0.084
vere+0.082
 spine+0.080
ometer+0.080
Neg logit contributions
 saw-0.049
 colored-0.047
 little-0.044
 heard-0.041
 coloured-0.041
Token other
Token ID584
Feature activation-4.58e+00

Pos logit contributions
othe+0.095
rier+0.091
vere+0.091
uts+0.089
unes+0.088
Neg logit contributions
 saw-0.062
 colored-0.060
 little-0.059
 heard-0.053
 found-0.052
Token therm
Token ID21969
Feature activation-1.07e+01

Pos logit contributions
othe+0.115
rier+0.112
vere+0.112
uts+0.107
unes+0.107
Neg logit contributions
 saw-0.064
 little-0.059
 colored-0.058
 heard-0.053
 found-0.051
Tokenometer
Token ID15635
Feature activation-7.66e+00

Pos logit contributions
vere+0.083
othe+0.081
drops+0.075
aro+0.074
unes+0.069
Neg logit contributions
 little-0.291
 saw-0.290
 colored-0.270
 two-0.267
 young-0.267
Token.
Token ID13
Feature activation-5.21e+00

Pos logit contributions
othe+0.168
rier+0.167
vere+0.163
unes+0.157
 spine+0.157
Neg logit contributions
 saw-0.122
 colored-0.117
 little-0.111
 heard-0.106
 found-0.104
Token Bob
Token ID5811
Feature activation+2.10e+00

Pos logit contributions
rier+0.122
vere+0.117
 spine+0.113
drops+0.113
 numb+0.112
Neg logit contributions
 saw-0.080
 little-0.077
 two-0.069
 a-0.069
 colored-0.067
Token was
Token ID373
Feature activation+2.93e+00

Pos logit contributions
ptions+0.024
 colored+0.024
 coloured+0.020
 saw+0.020
 amusement+0.020
Neg logit contributions
othe-0.050
ometer-0.049
rier-0.047
uts-0.047
drops-0.047
Token happy
Token ID3772
Feature activation+4.96e-01

Pos logit contributions
 colored+0.056
 saw+0.051
ptions+0.050
 coloured+0.050
ere+0.047
Neg logit contributions
rier-0.054
othe-0.053
vere-0.051
uts-0.047
 numb-0.047
Token should
Token ID815
Feature activation-5.38e-01

Pos logit contributions
othe+0.059
rier+0.058
vere+0.057
ometers+0.056
ometer+0.055
Neg logit contributions
 saw-0.041
 colored-0.041
 little-0.037
 coloured-0.036
 heard-0.035
Token never
Token ID1239
Feature activation-3.39e+00

Pos logit contributions
othe+0.012
rier+0.011
vere+0.011
 spine+0.011
ometers+0.011
Neg logit contributions
 colored-0.008
 saw-0.008
 little-0.008
 coloured-0.008
 heard-0.007
Token touch
Token ID3638
Feature activation-7.66e-02

Pos logit contributions
othe+0.071
 spine+0.067
vere+0.066
rier+0.065
drops+0.065
Neg logit contributions
 colored-0.052
 saw-0.051
 little-0.049
 coloured-0.047
ptions-0.046
Token an
Token ID281
Feature activation-2.92e+00

Pos logit contributions
othe+0.002
rier+0.002
vere+0.002
ometers+0.002
 spine+0.002
Neg logit contributions
 saw-0.001
 colored-0.001
 little-0.001
 heard-0.001
 coloured-0.001
Token ash
Token ID12530
Feature activation-8.57e+00

Pos logit contributions
rier+0.055
othe+0.055
 spine+0.052
unes+0.052
ometer+0.052
Neg logit contributions
 saw-0.053
 colored-0.050
 heard-0.047
 little-0.047
 did-0.046
Tokentr
Token ID2213
Feature activation-1.06e+01

Pos logit contributions
othe+0.145
vere+0.142
rier+0.124
ometers+0.121
aro+0.121
Neg logit contributions
 saw-0.188
 colored-0.183
 little-0.178
 heard-0.172
 a-0.170
Tokenay
Token ID323
Feature activation-3.77e+00

Pos logit contributions
 spine+0.253
othe+0.252
rier+0.246
aro+0.243
vere+0.241
Neg logit contributions
 saw-0.132
 colored-0.132
 a-0.116
 little-0.115
 now-0.114
Token."
Token ID526
Feature activation-2.60e+00

Pos logit contributions
othe+0.078
rier+0.076
vere+0.076
 spine+0.075
ometers+0.074
Neg logit contributions
 saw-0.064
 colored-0.063
 little-0.059
 heard-0.057
 coloured-0.055
Token\n
Token ID198
Feature activation-1.37e+00

Pos logit contributions
rier+0.070
drops+0.068
 spine+0.067
ometers+0.066
vere+0.065
Neg logit contributions
 saw-0.035
 little-0.034
 a-0.031
 two-0.030
 young-0.027
Token\n
Token ID198
Feature activation-1.72e+00

Pos logit contributions
rier+0.032
 spine+0.031
drops+0.030
unes+0.030
ometers+0.030
Neg logit contributions
 saw-0.022
 little-0.021
 colored-0.019
 heard-0.018
 a-0.018
TokenS
Token ID50
Feature activation-2.44e-01

Pos logit contributions
othe+0.043
rier+0.043
vere+0.041
 spine+0.041
ometers+0.041
Neg logit contributions
 saw-0.022
 colored-0.022
 little-0.020
 heard-0.019
 coloured-0.018
Tokening
Token ID278
Feature activation-1.16e+00

Pos logit contributions
othe+0.010
rier+0.010
vere+0.009
ometers+0.009
 spine+0.009
Neg logit contributions
 saw-0.008
 colored-0.008
 little-0.008
 heard-0.007
 coloured-0.007
Token on
Token ID319
Feature activation+7.34e-02

Pos logit contributions
othe+0.029
rier+0.029
vere+0.027
ometers+0.027
ometer+0.027
Neg logit contributions
 saw-0.015
 colored-0.014
 little-0.013
 heard-0.012
 coloured-0.012
Token how
Token ID703
Feature activation-3.73e+00

Pos logit contributions
 saw+0.001
 colored+0.001
 heard+0.001
 remembered+0.001
 little+0.001
Neg logit contributions
othe-0.002
rier-0.002
ometers-0.002
vere-0.002
ometer-0.002
Token great
Token ID1049
Feature activation-3.26e+00

Pos logit contributions
othe+0.081
rier+0.080
vere+0.078
ometers+0.076
ometer+0.075
Neg logit contributions
 saw-0.058
 colored-0.058
 little-0.052
 heard-0.051
 coloured-0.050
Token the
Token ID262
Feature activation-4.34e+00

Pos logit contributions
rier+0.068
othe+0.068
vere+0.065
ometers+0.063
drops+0.063
Neg logit contributions
 saw-0.051
 colored-0.050
 little-0.045
ptions-0.045
 heard-0.044
Token hat
Token ID6877
Feature activation-1.06e+01

Pos logit contributions
rier+0.100
vere+0.095
othe+0.095
 spine+0.094
drops+0.090
Neg logit contributions
 saw-0.061
 colored-0.060
ptions-0.053
 heard-0.053
 coloured-0.052
Token looked
Token ID3114
Feature activation-3.77e+00

Pos logit contributions
othe+0.290
rier+0.287
vere+0.281
 spine+0.278
ometers+0.274
Neg logit contributions
 saw-0.119
 colored-0.113
 little-0.099
 did-0.096
 found-0.096
Token.
Token ID13
Feature activation-5.15e+00

Pos logit contributions
rier+0.080
othe+0.080
vere+0.076
 spine+0.075
ometers+0.075
Neg logit contributions
 saw-0.060
 colored-0.060
 little-0.055
 heard-0.053
 coloured-0.052
Token Lily
Token ID20037
Feature activation+9.50e-01

Pos logit contributions
rier+0.126
vere+0.118
 spine+0.115
othe+0.113
drops+0.112
Neg logit contributions
 saw-0.077
 little-0.076
 a-0.068
 found-0.068
 colored-0.068
Token loved
Token ID6151
Feature activation+1.58e+00

Pos logit contributions
ptions+0.012
 colored+0.011
 saw+0.010
 little+0.010
 coloured+0.010
Neg logit contributions
othe-0.021
drops-0.021
ometer-0.020
uts-0.020
vere-0.020
Token it
Token ID340
Feature activation-4.44e+00

Pos logit contributions
 saw+0.022
 colored+0.021
ptions+0.020
 remembered+0.019
 heard+0.019
Neg logit contributions
othe-0.034
rier-0.032
uts-0.032
vere-0.031
ometers-0.031
Token wanted
Token ID2227
Feature activation+6.07e-01

Pos logit contributions
othe+0.003
ometers+0.003
ometer+0.003
 spine+0.003
rier+0.003
Neg logit contributions
 colored-0.001
ptions-0.001
 little-0.001
 saw-0.001
 coloured-0.001
Token to
Token ID284
Feature activation+8.59e-01

Pos logit contributions
 colored+0.009
 saw+0.009
 little+0.008
 coloured+0.008
 heard+0.008
Neg logit contributions
othe-0.014
rier-0.013
vere-0.013
ometers-0.013
unes-0.013
Token keep
Token ID1394
Feature activation+5.74e-01

Pos logit contributions
 colored+0.014
 saw+0.014
 little+0.013
ptions+0.013
 coloured+0.013
Neg logit contributions
othe-0.018
rier-0.017
vere-0.017
 spine-0.016
ometers-0.016
Token the
Token ID262
Feature activation-3.51e+00

Pos logit contributions
 saw+0.008
 colored+0.007
 little+0.007
 heard+0.007
 coloured+0.006
Neg logit contributions
othe-0.014
rier-0.014
vere-0.013
ometers-0.013
unes-0.013
Token ash
Token ID12530
Feature activation-9.80e+00

Pos logit contributions
othe+0.078
rier+0.078
vere+0.076
uts+0.074
ometers+0.073
Neg logit contributions
 saw-0.057
 colored-0.055
 little-0.053
 coloured-0.048
 heard-0.048
Tokentr
Token ID2213
Feature activation-1.06e+01

Pos logit contributions
vere+0.220
plets+0.213
aro+0.213
othe+0.210
rier+0.199
Neg logit contributions
 a-0.161
 little-0.159
 saw-0.157
 colored-0.147
 found-0.146
Tokenay
Token ID323
Feature activation-4.27e+00

Pos logit contributions
 spine+0.271
rier+0.268
aro+0.265
 drown+0.264
othe+0.259
Neg logit contributions
 colored-0.103
 saw-0.103
 little-0.094
 a-0.091
 now-0.090
Token.
Token ID13
Feature activation-4.49e+00

Pos logit contributions
othe+0.100
rier+0.100
 spine+0.099
vere+0.098
unes+0.095
Neg logit contributions
 colored-0.058
 saw-0.058
 little-0.056
 a-0.054
 found-0.051
Token So
Token ID1406
Feature activation-3.61e-01

Pos logit contributions
rier+0.090
vere+0.086
othe+0.086
aro+0.083
plets+0.082
Neg logit contributions
 saw-0.084
 little-0.083
 colored-0.077
 did-0.076
 thought-0.074
Token he
Token ID339
Feature activation+6.05e-01

Pos logit contributions
othe+0.010
rier+0.010
vere+0.009
 spine+0.009
ometers+0.009
Neg logit contributions
 saw-0.004
 colored-0.004
 little-0.003
 heard-0.003
 coloured-0.003
Token tried
Token ID3088
Feature activation+1.09e+00

Pos logit contributions
ptions+0.009
grown+0.008
 preschool+0.008
 colored+0.008
 amusement+0.008
Neg logit contributions
ometer-0.011
balls-0.011
drops-0.011
othe-0.011
uts-0.011
Token can
Token ID460
Feature activation-1.49e+00

Pos logit contributions
othe+0.018
vere+0.018
ometer+0.017
rier+0.017
unes+0.017
Neg logit contributions
 colored-0.020
 coloured-0.019
 saw-0.018
 little-0.018
ptions-0.018
Token use
Token ID779
Feature activation-2.51e-02

Pos logit contributions
othe+0.030
rier+0.030
vere+0.029
ometers+0.028
ometer+0.028
Neg logit contributions
 saw-0.025
 colored-0.025
 little-0.023
 heard-0.022
 coloured-0.022
Token this
Token ID428
Feature activation-5.08e+00

Pos logit contributions
othe+0.001
rier+0.001
vere+0.001
ometers+0.001
unes+0.001
Neg logit contributions
 saw-0.000
 colored-0.000
 little-0.000
 heard-0.000
 coloured-0.000
Token broken
Token ID5445
Feature activation-2.76e+00

Pos logit contributions
rier+0.117
othe+0.117
vere+0.114
uts+0.110
ometers+0.109
Neg logit contributions
 saw-0.078
 colored-0.076
 little-0.072
 heard-0.067
 found-0.066
Token ash
Token ID12530
Feature activation-8.12e+00

Pos logit contributions
rier+0.053
othe+0.053
ometer+0.050
ometers+0.050
unes+0.050
Neg logit contributions
 saw-0.048
 colored-0.048
 little-0.044
 heard-0.043
 coloured-0.042
Tokentr
Token ID2213
Feature activation-1.05e+01

Pos logit contributions
othe+0.145
vere+0.136
rier+0.133
 numb+0.123
plets+0.122
Neg logit contributions
 saw-0.172
 colored-0.167
 little-0.161
 a-0.159
 found-0.159
Tokenay
Token ID323
Feature activation-3.90e+00

Pos logit contributions
 spine+0.263
rier+0.255
othe+0.254
drops+0.244
aro+0.244
Neg logit contributions
 colored-0.119
 saw-0.118
 a-0.104
 little-0.104
 did-0.103
Token as
Token ID355
Feature activation-3.69e+00

Pos logit contributions
othe+0.089
rier+0.089
vere+0.086
 spine+0.084
ometers+0.083
Neg logit contributions
 saw-0.058
 colored-0.057
 little-0.052
 heard-0.050
 coloured-0.050
Token a
Token ID257
Feature activation-2.40e+00

Pos logit contributions
othe+0.101
rier+0.099
vere+0.097
ometers+0.097
unes+0.095
Neg logit contributions
 saw-0.036
 colored-0.034
 heard-0.029
ptions-0.028
 little-0.028
Token puzzle
Token ID15027
Feature activation-4.49e+00

Pos logit contributions
othe+0.060
rier+0.059
uts+0.058
vere+0.058
ometers+0.058
Neg logit contributions
 saw-0.033
 little-0.032
 colored-0.029
 heard-0.027
 found-0.027
Token!"
Token ID2474
Feature activation-2.56e+00

Pos logit contributions
othe+0.098
rier+0.095
vere+0.093
ometers+0.092
 spine+0.091
Neg logit contributions
 saw-0.074
 colored-0.071
 little-0.068
 found-0.065
 heard-0.064
TokenDid
Token ID11633
Feature activation-1.24e+00

Pos logit contributions
rier+0.040
othe+0.039
but+0.038
vere+0.038
drops+0.037
Neg logit contributions
 colored-0.036
 saw-0.034
 little-0.033
 coloured-0.032
 packed-0.032
Token you
Token ID345
Feature activation-9.79e-01

Pos logit contributions
rier+0.021
othe+0.021
 spine+0.020
vere+0.020
ometer+0.020
Neg logit contributions
 saw-0.026
 colored-0.025
 little-0.024
 heard-0.023
 did-0.023
Token destroy
Token ID4117
Feature activation+2.04e+00

Pos logit contributions
othe+0.020
vere+0.018
drops+0.017
 bleed+0.017
 spine+0.017
Neg logit contributions
 colored-0.017
 saw-0.016
ptions-0.016
 coloured-0.015
 little-0.015
Token my
Token ID616
Feature activation-2.13e+00

Pos logit contributions
 saw+0.022
 colored+0.021
 packed+0.019
 heard+0.018
 remembered+0.018
Neg logit contributions
othe-0.054
rier-0.053
vere-0.051
ometers-0.050
unes-0.050
Token ash
Token ID12530
Feature activation-8.98e+00

Pos logit contributions
rier+0.049
othe+0.049
vere+0.048
ometers+0.046
uts+0.046
Neg logit contributions
 saw-0.033
 colored-0.032
 little-0.031
 coloured-0.028
 heard-0.028
Tokentr
Token ID2213
Feature activation-1.05e+01

Pos logit contributions
othe+0.139
rier+0.135
vere+0.132
 spine+0.118
ometers+0.117
Neg logit contributions
 saw-0.207
 colored-0.203
 little-0.198
 did-0.186
 heard-0.185
Tokenay
Token ID323
Feature activation-4.20e+00

Pos logit contributions
rier+0.248
 spine+0.246
othe+0.243
vere+0.233
aro+0.231
Neg logit contributions
 saw-0.134
 colored-0.131
 little-0.120
 did-0.116
 now-0.111
Token?"
Token ID1701
Feature activation-1.75e+00

Pos logit contributions
othe+0.101
rier+0.100
vere+0.096
ometer+0.095
ometers+0.095
Neg logit contributions
 saw-0.056
 colored-0.055
 little-0.050
 heard-0.047
 coloured-0.047
Token Lily
Token ID20037
Feature activation+3.68e+00

Pos logit contributions
rier+0.048
vere+0.046
othe+0.046
ometers+0.046
 spine+0.045
Neg logit contributions
 saw-0.020
 colored-0.018
 little-0.018
 heard-0.016
 found-0.016
Token felt
Token ID2936
Feature activation+3.07e+00

Pos logit contributions
 colored+0.052
ptions+0.049
 coloured+0.047
 amusement+0.046
grown+0.045
Neg logit contributions
othe-0.081
rier-0.074
ometer-0.074
vere-0.074
drops-0.072
Token bad
Token ID2089
Feature activation+2.85e-01

Pos logit contributions
 saw+0.055
 colored+0.054
 little+0.051
 heard+0.048
 coloured+0.048
Neg logit contributions
othe-0.060
rier-0.060
vere-0.058
ometers-0.056
 spine-0.056
Token,
Token ID11
Feature activation-2.09e+00

Pos logit contributions
ptions+0.007
 colored+0.006
 coloured+0.006
 little+0.006
 amusement+0.006
Neg logit contributions
othe-0.011
uts-0.011
ometer-0.011
ometers-0.011
drops-0.011
Token you
Token ID345
Feature activation-3.29e-01

Pos logit contributions
othe+0.041
rier+0.040
vere+0.039
ometers+0.038
 spine+0.038
Neg logit contributions
 saw-0.038
 colored-0.037
 little-0.035
 heard-0.033
 coloured-0.033
Token should
Token ID815
Feature activation+8.49e-01

Pos logit contributions
othe+0.006
uts+0.006
vere+0.005
ometers+0.005
 bleed+0.005
Neg logit contributions
 colored-0.006
ptions-0.006
 coloured-0.006
 media-0.005
 amusement-0.005
Token have
Token ID423
Feature activation-7.54e-01

Pos logit contributions
 colored+0.015
 saw+0.015
ptions+0.014
 coloured+0.014
 heard+0.014
Neg logit contributions
othe-0.017
rier-0.015
vere-0.015
 spine-0.015
uts-0.014
Token the
Token ID262
Feature activation-2.87e+00

Pos logit contributions
othe+0.018
vere+0.017
balls+0.017
ometers+0.017
uts+0.017
Neg logit contributions
 colored-0.008
 little-0.007
 saw-0.007
ptions-0.007
 coloured-0.007
Token therm
Token ID21969
Feature activation-1.04e+01

Pos logit contributions
othe+0.067
uts+0.062
rier+0.062
vere+0.062
ometers+0.061
Neg logit contributions
 saw-0.045
 little-0.044
 colored-0.043
 heard-0.038
 two-0.038
Tokenometer
Token ID15635
Feature activation-5.86e+00

Pos logit contributions
vere+0.095
othe+0.090
drops+0.086
unes+0.083
aro+0.080
Neg logit contributions
 little-0.270
 saw-0.269
 colored-0.258
 packed-0.251
 thought-0.246
Token in
Token ID287
Feature activation-6.44e-01

Pos logit contributions
othe+0.132
rier+0.130
vere+0.128
ometers+0.124
 spine+0.124
Neg logit contributions
 saw-0.088
 colored-0.086
 little-0.082
 heard-0.076
 found-0.075
Token your
Token ID534
Feature activation-1.46e+00

Pos logit contributions
othe+0.015
rier+0.014
uts+0.014
ometers+0.013
 numb+0.013
Neg logit contributions
 saw-0.008
 colored-0.008
 observed-0.008
ptions-0.008
 remembered-0.007
Token mouth
Token ID5422
Feature activation-3.86e+00

Pos logit contributions
othe+0.034
rier+0.033
vere+0.032
 spine+0.032
ometers+0.032
Neg logit contributions
 saw-0.021
 colored-0.021
 little-0.019
 coloured-0.018
 heard-0.018
Token.
Token ID13
Feature activation-5.91e+00

Pos logit contributions
othe+0.079
rier+0.079
vere+0.075
ometers+0.074
 spine+0.074
Neg logit contributions
 saw-0.065
 colored-0.064
 little-0.060
 heard-0.057
 coloured-0.057
Token We
Token ID775
Feature activation-1.38e+00

Pos logit contributions
othe+0.086
rier+0.083
 spine+0.082
vere+0.078
unes+0.076
Neg logit contributions
 colored-0.125
 saw-0.125
 little-0.123
 a-0.120
 two-0.119
Token did
Token ID750
Feature activation-1.73e-01

Pos logit contributions
othe+0.035
drops+0.033
ometers+0.033
vere+0.032
unes+0.032
Neg logit contributions
 colored-0.014
 little-0.013
ptions-0.013
 saw-0.013
 coloured-0.012
Token not
Token ID407
Feature activation-1.76e+00

Pos logit contributions
othe+0.003
rier+0.003
vere+0.003
ometers+0.003
uts+0.003
Neg logit contributions
 saw-0.003
 colored-0.003
 packed-0.003
 heard-0.003
 coloured-0.003
Token share
Token ID2648
Feature activation-5.05e-01

Pos logit contributions
othe+0.033
vere+0.031
drops+0.030
unes+0.030
 bleed+0.030
Neg logit contributions
 saw-0.027
 colored-0.026
 little-0.026
 media-0.025
ptions-0.025
Token the
Token ID262
Feature activation-4.44e+00

Pos logit contributions
rier+0.010
othe+0.010
vere+0.010
ometers+0.009
unes+0.009
Neg logit contributions
 saw-0.009
 colored-0.009
 little-0.008
 heard-0.008
 coloured-0.008
Token er
Token ID1931
Feature activation-1.04e+01

Pos logit contributions
othe+0.111
rier+0.110
vere+0.107
ometers+0.105
 spine+0.105
Neg logit contributions
 saw-0.056
 colored-0.055
 little-0.050
 heard-0.047
 coloured-0.046
Tokenaser
Token ID6005
Feature activation-7.98e+00

Pos logit contributions
othe+0.213
 spine+0.212
 numb+0.205
rier+0.188
 afloat+0.187
Neg logit contributions
 colored-0.188
 coloured-0.167
 saw-0.164
 little-0.163
 two-0.154
Token."
Token ID526
Feature activation-2.11e+00

Pos logit contributions
othe+0.193
 spine+0.189
ometers+0.186
vere+0.185
unes+0.178
Neg logit contributions
 colored-0.109
 saw-0.107
 little-0.106
 a-0.099
 found-0.095
Token\n
Token ID198
Feature activation-4.61e-01

Pos logit contributions
rier+0.054
 spine+0.052
unes+0.051
ometers+0.050
drops+0.050
Neg logit contributions
 saw-0.032
 little-0.029
 a-0.028
 heard-0.026
 colored-0.026
Token\n
Token ID198
Feature activation-8.50e-01

Pos logit contributions
rier+0.010
 spine+0.010
unes+0.010
drops+0.009
ometers+0.009
Neg logit contributions
 saw-0.008
 colored-0.007
 little-0.007
 heard-0.007
 found-0.007
TokenThe
Token ID464
Feature activation-1.86e-01

Pos logit contributions
othe+0.017
vere+0.017
rier+0.017
uts+0.016
ometer+0.016
Neg logit contributions
 colored-0.014
 saw-0.014
 little-0.013
 coloured-0.013
ptions-0.013
Token that
Token ID326
Feature activation-2.17e+00

Pos logit contributions
 saw+0.008
 colored+0.008
 little+0.007
 heard+0.007
 coloured+0.007
Neg logit contributions
othe-0.012
rier-0.012
vere-0.012
ometers-0.012
unes-0.011
Token he
Token ID339
Feature activation-2.44e+00

Pos logit contributions
othe+0.045
rier+0.045
vere+0.043
 spine+0.043
ometers+0.042
Neg logit contributions
 saw-0.036
 colored-0.036
 little-0.033
 heard-0.032
 coloured-0.031
Token had
Token ID550
Feature activation-5.56e-01

Pos logit contributions
othe+0.051
vere+0.051
drops+0.050
uts+0.049
ometers+0.049
Neg logit contributions
ptions-0.032
 colored-0.031
 little-0.030
 amusement-0.029
 coloured-0.027
Token used
Token ID973
Feature activation+9.57e-04

Pos logit contributions
othe+0.013
vere+0.012
ometers+0.012
uts+0.012
unes+0.012
Neg logit contributions
 colored-0.006
ptions-0.006
 saw-0.006
iet-0.006
 little-0.006
Token the
Token ID262
Feature activation-4.10e+00

Pos logit contributions
 saw+0.000
 colored+0.000
 little+0.000
 heard+0.000
ptions+0.000
Neg logit contributions
othe-0.000
rier-0.000
vere-0.000
ometers-0.000
unes-0.000
Token m
Token ID285
Feature activation-1.04e+01

Pos logit contributions
othe+0.111
rier+0.108
uts+0.107
vere+0.106
unes+0.102
Neg logit contributions
 saw-0.049
 little-0.049
 colored-0.047
 two-0.040
 heard-0.040
Tokenop
Token ID404
Feature activation-7.84e+00

Pos logit contributions
rier+0.219
vere+0.208
othe+0.203
 spine+0.195
 glide+0.187
Neg logit contributions
 little-0.160
 colored-0.159
 coloured-0.146
 saw-0.144
 two-0.137
Token in
Token ID287
Feature activation-8.01e-01

Pos logit contributions
rier+0.192
othe+0.191
vere+0.189
 spine+0.183
unes+0.177
Neg logit contributions
 saw-0.106
 colored-0.103
 little-0.096
 heard-0.090
 found-0.090
Token an
Token ID281
Feature activation-2.55e-01

Pos logit contributions
othe+0.020
rier+0.019
vere+0.018
ometers+0.018
 spine+0.018
Neg logit contributions
 saw-0.010
 colored-0.010
ptions-0.009
 little-0.008
 heard-0.008
Token organized
Token ID8389
Feature activation-7.01e-01

Pos logit contributions
rier+0.005
othe+0.005
vere+0.005
 spine+0.005
unes+0.005
Neg logit contributions
 saw-0.004
 colored-0.004
ptions-0.004
 little-0.004
 did-0.004
Token way
Token ID835
Feature activation-3.43e-01

Pos logit contributions
rier+0.015
othe+0.014
 spine+0.014
ometer+0.014
vere+0.014
Neg logit contributions
 saw-0.010
 colored-0.010
ptions-0.010
 remembered-0.009
 heard-0.009
Token his
Token ID465
Feature activation-1.18e+00

Pos logit contributions
 saw+0.004
 colored+0.004
ptions+0.004
 remembered+0.003
 coloured+0.003
Neg logit contributions
rier-0.009
othe-0.009
vere-0.009
 spine-0.009
ometer-0.009
Token mouth
Token ID5422
Feature activation-2.48e+00

Pos logit contributions
othe+0.025
vere+0.024
rier+0.024
uts+0.023
drops+0.023
Neg logit contributions
 saw-0.022
 little-0.021
 colored-0.020
 found-0.019
 a-0.019
Token and
Token ID290
Feature activation-3.57e+00

Pos logit contributions
othe+0.061
rier+0.060
vere+0.059
ometers+0.057
 spine+0.057
Neg logit contributions
 saw-0.032
 colored-0.032
 little-0.029
 heard-0.028
 coloured-0.027
Token put
Token ID1234
Feature activation+1.33e+00

Pos logit contributions
othe+0.084
drops+0.081
balls+0.081
rier+0.079
ometers+0.077
Neg logit contributions
ptions-0.041
ere-0.038
 little-0.037
iet-0.037
 colored-0.036
Token the
Token ID262
Feature activation-3.67e+00

Pos logit contributions
 saw+0.018
 colored+0.018
 little+0.016
 heard+0.015
 coloured+0.015
Neg logit contributions
othe-0.032
rier-0.032
vere-0.031
 spine-0.031
ometers-0.031
Token therm
Token ID21969
Feature activation-1.04e+01

Pos logit contributions
othe+0.092
uts+0.087
vere+0.086
rier+0.084
ometers+0.083
Neg logit contributions
 saw-0.055
 little-0.052
 colored-0.052
 two-0.046
 heard-0.046
Tokenometer
Token ID15635
Feature activation-5.83e+00

Pos logit contributions
vere+0.106
othe+0.105
ickets+0.085
 spine+0.083
 affect+0.082
Neg logit contributions
 saw-0.263
 little-0.251
 two-0.250
 colored-0.248
 did-0.242
Token in
Token ID287
Feature activation+6.45e-02

Pos logit contributions
othe+0.163
vere+0.162
plets+0.156
ometers+0.155
rier+0.153
Neg logit contributions
 saw-0.060
 colored-0.058
 a-0.054
 little-0.052
 found-0.051
Token his
Token ID465
Feature activation-1.46e+00

Pos logit contributions
 saw+0.001
 colored+0.001
 little+0.001
 coloured+0.001
 heard+0.001
Neg logit contributions
rier-0.002
othe-0.002
vere-0.002
 spine-0.002
ometers-0.002
Token mouth
Token ID5422
Feature activation-2.33e+00

Pos logit contributions
othe+0.032
vere+0.031
uts+0.030
rier+0.030
drops+0.029
Neg logit contributions
 saw-0.027
 little-0.026
 a-0.024
 colored-0.023
 found-0.023
Token again
Token ID757
Feature activation-1.12e+00

Pos logit contributions
othe+0.055
vere+0.054
rier+0.053
ometers+0.051
 spine+0.051
Neg logit contributions
 saw-0.034
 colored-0.033
 little-0.031
 heard-0.030
 a-0.029
Token not
Token ID407
Feature activation-2.92e+00

Pos logit contributions
othe+0.041
rier+0.040
vere+0.038
ometers+0.038
 spine+0.038
Neg logit contributions
 saw-0.035
 colored-0.034
 heard-0.031
 little-0.031
 coloured-0.031
Token nice
Token ID3621
Feature activation-2.13e+00

Pos logit contributions
othe+0.068
rier+0.067
vere+0.065
ometers+0.064
unes+0.063
Neg logit contributions
 saw-0.040
 colored-0.039
 heard-0.034
 coloured-0.034
 little-0.033
Token to
Token ID284
Feature activation-1.76e+00

Pos logit contributions
othe+0.045
rier+0.045
vere+0.043
ometers+0.042
 spine+0.042
Neg logit contributions
 saw-0.035
 colored-0.035
 little-0.032
 heard-0.031
 coloured-0.031
Token keep
Token ID1394
Feature activation-1.25e+00

Pos logit contributions
othe+0.039
rier+0.037
vere+0.036
 spine+0.036
ometers+0.035
Neg logit contributions
 colored-0.026
 saw-0.026
 coloured-0.023
 little-0.023
 heard-0.023
Token the
Token ID262
Feature activation-3.65e+00

Pos logit contributions
rier+0.028
othe+0.028
ometers+0.026
unes+0.026
vere+0.026
Neg logit contributions
 saw-0.019
 colored-0.018
 heard-0.016
 coloured-0.016
 little-0.016
Token dish
Token ID9433
Feature activation-1.04e+01

Pos logit contributions
othe+0.084
vere+0.080
uts+0.078
rier+0.078
ometers+0.076
Neg logit contributions
 saw-0.058
 colored-0.057
 little-0.055
 heard-0.049
 coloured-0.049
Token all
Token ID477
Feature activation-3.79e+00

Pos logit contributions
othe+0.225
vere+0.220
 spine+0.217
aro+0.203
plets+0.203
Neg logit contributions
 little-0.173
 saw-0.170
 colored-0.160
 a-0.154
 found-0.151
Token to
Token ID284
Feature activation-3.12e+00

Pos logit contributions
othe+0.076
rier+0.075
vere+0.072
ometers+0.071
 spine+0.071
Neg logit contributions
 saw-0.067
 colored-0.066
 little-0.062
 heard-0.059
 coloured-0.059
Token yourself
Token ID3511
Feature activation-1.24e+00

Pos logit contributions
othe+0.068
rier+0.065
vere+0.062
 spine+0.062
ometers+0.061
Neg logit contributions
 saw-0.049
 colored-0.048
 little-0.044
 heard-0.043
 coloured-0.042
Token."
Token ID526
Feature activation-2.55e+00

Pos logit contributions
othe+0.024
rier+0.023
vere+0.023
 spine+0.022
ometers+0.022
Neg logit contributions
 saw-0.023
 colored-0.022
 little-0.021
 heard-0.020
 coloured-0.020
Token
Token ID198
Feature activation-6.41e-01

Pos logit contributions
rier+0.070
ometers+0.069
drops+0.068
vere+0.067
plets+0.066
Neg logit contributions
 saw-0.035
 little-0.035
 a-0.032
 two-0.028
 heard-0.028
Token for
Token ID329
Feature activation-7.57e-01

Pos logit contributions
othe+0.064
rier+0.064
vere+0.061
ometers+0.060
 spine+0.059
Neg logit contributions
 saw-0.057
 colored-0.056
 little-0.052
 heard-0.051
 coloured-0.050
Token you
Token ID345
Feature activation-2.52e+00

Pos logit contributions
othe+0.016
rier+0.015
vere+0.015
ometers+0.014
uts+0.014
Neg logit contributions
 saw-0.013
 colored-0.012
 heard-0.011
 coloured-0.011
 little-0.011
Token.
Token ID13
Feature activation-4.89e+00

Pos logit contributions
othe+0.050
rier+0.049
vere+0.047
ometers+0.046
 spine+0.046
Neg logit contributions
 saw-0.045
 colored-0.044
 little-0.041
 coloured-0.039
 heard-0.039
Token<|endoftext|>
Token ID50256
Feature activation+4.92e+00

Pos logit contributions
rier+0.105
vere+0.100
othe+0.099
 spine+0.097
drops+0.097
Neg logit contributions
 saw-0.083
 little-0.080
 colored-0.079
 a-0.075
 two-0.073
TokenM
Token ID44
Feature activation+4.52e+00

Pos logit contributions
 saw+0.094
 colored+0.093
 little+0.087
 coloured+0.084
 heard+0.084
Neg logit contributions
othe-0.089
rier-0.088
vere-0.085
ometers-0.083
 spine-0.082
Tokenia
Token ID544
Feature activation+7.53e+00

Pos logit contributions
 colored+0.089
 saw+0.088
ptions+0.081
 coloured+0.080
 heard+0.079
Neg logit contributions
othe-0.081
rier-0.075
vere-0.075
unes-0.073
ometer-0.073
Token and
Token ID290
Feature activation+3.26e+00

Pos logit contributions
ptions+0.126
 amusement+0.093
 little+0.091
 colored+0.088
 media+0.086
Neg logit contributions
othe-0.145
uts-0.142
drops-0.139
ometer-0.137
 spine-0.135
Token Leo
Token ID19632
Feature activation+7.70e+00

Pos logit contributions
 colored+0.043
 saw+0.041
 coloured+0.037
 little+0.037
ptions+0.037
Neg logit contributions
othe-0.079
rier-0.076
vere-0.074
ometers-0.074
 spine-0.073
Token were
Token ID547
Feature activation+8.50e+00

Pos logit contributions
ptions+0.123
 colored+0.106
 little+0.102
 media+0.100
 coloured+0.097
Neg logit contributions
othe-0.150
unes-0.137
drops-0.137
uts-0.135
ometers-0.135
Token playing
Token ID2712
Feature activation+5.70e+00

Pos logit contributions
 colored+0.133
 saw+0.127
 coloured+0.118
 little+0.112
 heard+0.112
Neg logit contributions
othe-0.184
rier-0.182
vere-0.179
unes-0.174
 spine-0.173
Token in
Token ID287
Feature activation+4.44e+00

Pos logit contributions
 colored+0.082
 saw+0.082
 packed+0.075
 remembered+0.074
ptions+0.074
Neg logit contributions
rier-0.123
othe-0.120
vere-0.117
balls-0.114
 spine-0.111
Token dirt
Token ID13647
Feature activation-4.91e+00

Pos logit contributions
rier+0.059
othe+0.059
vere+0.056
ometers+0.055
unes+0.055
Neg logit contributions
 saw-0.056
 colored-0.056
 little-0.051
 coloured-0.050
 heard-0.050
Tokeniness
Token ID1272
Feature activation-4.97e+00

Pos logit contributions
othe+0.093
rier+0.093
vere+0.089
ometers+0.087
 spine+0.087
Neg logit contributions
 saw-0.090
 colored-0.089
 little-0.083
 heard-0.080
 coloured-0.080
Token.
Token ID13
Feature activation-6.06e+00

Pos logit contributions
othe+0.109
rier+0.106
 spine+0.106
vere+0.104
ometers+0.103
Neg logit contributions
 saw-0.077
 little-0.074
 colored-0.074
 heard-0.068
 did-0.067
Token<|endoftext|>
Token ID50256
Feature activation+5.82e+00

Pos logit contributions
rier+0.154
drops+0.144
 spine+0.142
othe+0.140
 numb+0.138
Neg logit contributions
 little-0.079
 saw-0.079
 two-0.077
 a-0.077
 young-0.072
TokenOnce
Token ID7454
Feature activation+6.25e+00

Pos logit contributions
 colored+0.115
 saw+0.112
 little+0.106
 coloured+0.104
 remembered+0.101
Neg logit contributions
othe-0.099
rier-0.096
vere-0.095
unes-0.093
ometer-0.091
Token upon
Token ID2402
Feature activation+7.13e+00

Pos logit contributions
 saw+0.075
 colored+0.072
 little+0.066
 heard+0.061
 coloured+0.060
Neg logit contributions
rier-0.160
othe-0.159
vere-0.154
ometers-0.152
 spine-0.152
Token a
Token ID257
Feature activation+9.30e+00

Pos logit contributions
 colored+0.062
 saw+0.055
ptions+0.052
 remembered+0.049
ere+0.048
Neg logit contributions
othe-0.191
vere-0.188
uts-0.182
rier-0.178
 numb-0.177
Token time
Token ID640
Feature activation+5.70e+00

Pos logit contributions
 colored+0.106
 saw+0.097
ptions+0.091
 coloured+0.088
 remembered+0.081
Neg logit contributions
othe-0.235
 spine-0.227
vere-0.226
ometer-0.213
rier-0.213
Token,
Token ID11
Feature activation+6.52e+00

Pos logit contributions
 colored+0.054
 saw+0.052
 little+0.044
 coloured+0.043
 heard+0.042
Neg logit contributions
othe-0.160
rier-0.156
vere-0.154
 spine-0.151
ometers-0.151
Token there
Token ID612
Feature activation+7.00e+00

Pos logit contributions
 colored+0.080
ptions+0.076
 amusement+0.074
 media+0.072
 coloured+0.068
Neg logit contributions
othe-0.156
ometer-0.143
rier-0.138
vere-0.137
 spine-0.136
Token was
Token ID373
Feature activation+8.54e+00

Pos logit contributions
ptions+0.094
 colored+0.090
 packed+0.081
 coloured+0.080
iet+0.075
Neg logit contributions
othe-0.161
ometer-0.144
uts-0.138
ometers-0.136
vere-0.135
Token bad
Token ID2089
Feature activation-4.98e-01

Pos logit contributions
othe+0.006
rier+0.006
vere+0.006
ometer+0.006
 spine+0.006
Neg logit contributions
 colored-0.003
 saw-0.003
 coloured-0.003
 little-0.003
 heard-0.003
Token ending
Token ID7464
Feature activation+1.17e+00

Pos logit contributions
rier+0.010
othe+0.010
vere+0.010
ometers+0.010
unes+0.010
Neg logit contributions
 saw-0.009
 little-0.009
 colored-0.009
 heard-0.008
 found-0.008
Token.
Token ID13
Feature activation-5.04e+00

Pos logit contributions
 saw+0.018
 colored+0.017
 little+0.017
 heard+0.016
 found+0.016
Neg logit contributions
rier-0.027
othe-0.026
vere-0.025
ometers-0.025
unes-0.025
Token<|endoftext|>
Token ID50256
Feature activation+6.29e+00

Pos logit contributions
rier+0.133
drops+0.132
 spine+0.124
unes+0.123
ometers+0.123
Neg logit contributions
 little-0.065
 saw-0.065
 a-0.063
 two-0.058
 found-0.055
TokenOnce
Token ID7454
Feature activation+6.68e+00

Pos logit contributions
 saw+0.121
 colored+0.121
 little+0.113
 coloured+0.109
 heard+0.108
Neg logit contributions
othe-0.113
rier-0.111
vere-0.107
ometers-0.104
 spine-0.104
Token upon
Token ID2402
Feature activation+6.97e+00

Pos logit contributions
 saw+0.080
 colored+0.075
 little+0.070
 heard+0.065
 found+0.063
Neg logit contributions
rier-0.173
othe-0.171
vere-0.166
ometers-0.164
 spine-0.164
Token a
Token ID257
Feature activation+9.30e+00

Pos logit contributions
 colored+0.056
 saw+0.048
ptions+0.046
 coloured+0.044
ere+0.043
Neg logit contributions
othe-0.193
vere-0.187
uts-0.184
rier-0.183
 numb-0.180
Token time
Token ID640
Feature activation+5.90e+00

Pos logit contributions
 colored+0.099
 saw+0.092
ptions+0.082
 coloured+0.081
 little+0.075
Neg logit contributions
othe-0.244
 spine-0.235
vere-0.233
ometer-0.225
rier-0.225
Token,
Token ID11
Feature activation+6.87e+00

Pos logit contributions
 colored+0.052
 saw+0.052
 little+0.044
 coloured+0.041
 heard+0.040
Neg logit contributions
othe-0.169
rier-0.166
vere-0.163
 spine-0.160
ometers-0.159
Token there
Token ID612
Feature activation+7.04e+00

Pos logit contributions
 colored+0.082
ptions+0.081
 amusement+0.078
 media+0.074
othy+0.073
Neg logit contributions
othe-0.170
ometer-0.153
vere-0.151
rier-0.149
 spine-0.144
Token was
Token ID373
Feature activation+8.79e+00

Pos logit contributions
ptions+0.095
 colored+0.087
 coloured+0.078
 packed+0.078
 amusement+0.077
Neg logit contributions
othe-0.160
ometer-0.144
uts-0.139
vere-0.136
balls-0.135
Token to
Token ID284
Feature activation-2.03e+00

Pos logit contributions
rier+0.077
othe+0.077
 spine+0.074
vere+0.073
unes+0.073
Neg logit contributions
 saw-0.051
 colored-0.050
 little-0.048
 heard-0.044
 coloured-0.043
Token improve
Token ID2987
Feature activation-2.49e+00

Pos logit contributions
othe+0.045
vere+0.044
rier+0.043
 spine+0.041
drops+0.041
Neg logit contributions
 saw-0.030
 colored-0.029
ptions-0.027
 little-0.026
 heard-0.026
Token."
Token ID526
Feature activation-3.78e+00

Pos logit contributions
rier+0.058
othe+0.058
vere+0.056
ometers+0.054
uts+0.054
Neg logit contributions
 saw-0.035
 colored-0.034
 heard-0.030
 little-0.030
 found-0.029
Token<|endoftext|>
Token ID50256
Feature activation+5.50e+00

Pos logit contributions
rier+0.096
vere+0.090
drops+0.087
unes+0.085
ometers+0.085
Neg logit contributions
 saw-0.057
 little-0.055
 colored-0.051
 a-0.050
 found-0.049
TokenThere
Token ID1858
Feature activation+5.11e+00

Pos logit contributions
 colored+0.109
 saw+0.108
 little+0.101
 coloured+0.099
 heard+0.096
Neg logit contributions
othe-0.094
rier-0.091
vere-0.089
unes-0.086
ometer-0.086
Token was
Token ID373
Feature activation+6.74e+00

Pos logit contributions
 colored+0.050
 saw+0.045
ptions+0.043
 coloured+0.041
 little+0.041
Neg logit contributions
othe-0.136
rier-0.133
vere-0.128
unes-0.128
uts-0.127
Token a
Token ID257
Feature activation+4.60e+00

Pos logit contributions
ptions+0.055
 saw+0.053
 colored+0.053
 remembered+0.044
 little+0.042
Neg logit contributions
othe-0.182
unes-0.173
 spine-0.172
uts-0.171
vere-0.171
Token cat
Token ID3797
Feature activation+2.69e+00

Pos logit contributions
 colored+0.040
 saw+0.036
 coloured+0.032
ptions+0.031
 heard+0.030
Neg logit contributions
othe-0.128
rier-0.128
ometer-0.124
vere-0.124
 spine-0.124
Token.
Token ID13
Feature activation-1.56e+00

Pos logit contributions
 colored+0.032
 saw+0.031
 little+0.028
ptions+0.028
 coloured+0.027
Neg logit contributions
othe-0.067
rier-0.063
 spine-0.063
vere-0.062
uts-0.062
Token She
Token ID1375
Feature activation+2.57e+00

Pos logit contributions
othe+0.030
rier+0.030
vere+0.029
 spine+0.028
ometers+0.028
Neg logit contributions
 saw-0.028
 colored-0.028
 little-0.026
 heard-0.025
 coloured-0.025
Token was
Token ID373
Feature activation+5.45e+00

Pos logit contributions
ptions+0.032
 amusement+0.024
 colored+0.024
 little+0.022
 media+0.021
Neg logit contributions
othe-0.060
uts-0.057
drops-0.056
ometer-0.056
vere-0.055
Token.
Token ID13
Feature activation-4.79e+00

Pos logit contributions
 saw+0.006
 colored+0.006
 little+0.006
 coloured+0.006
 heard+0.006
Neg logit contributions
othe-0.007
rier-0.007
vere-0.006
ometers-0.006
unes-0.006
Token<|endoftext|>
Token ID50256
Feature activation+6.03e+00

Pos logit contributions
rier+0.144
vere+0.135
 numb+0.131
drops+0.130
 spine+0.128
Neg logit contributions
 saw-0.055
 little-0.053
 a-0.053
 two-0.047
 the-0.046
TokenOnce
Token ID7454
Feature activation+5.91e+00

Pos logit contributions
 colored+0.120
 saw+0.118
 little+0.111
 coloured+0.109
 heard+0.106
Neg logit contributions
othe-0.101
rier-0.099
vere-0.096
unes-0.094
ometer-0.093
Token upon
Token ID2402
Feature activation+6.93e+00

Pos logit contributions
 saw+0.071
 colored+0.069
 little+0.063
 heard+0.058
 coloured+0.057
Neg logit contributions
othe-0.150
rier-0.150
vere-0.145
ometers-0.143
 spine-0.143
Token a
Token ID257
Feature activation+9.28e+00

Pos logit contributions
 colored+0.068
ptions+0.062
ere+0.059
 saw+0.058
 coloured+0.054
Neg logit contributions
othe-0.175
vere-0.171
uts-0.166
 numb-0.163
 bleed-0.161
Token time
Token ID640
Feature activation+5.45e+00

Pos logit contributions
 colored+0.094
 saw+0.088
 coloured+0.076
ptions+0.076
 little+0.071
Neg logit contributions
othe-0.248
 spine-0.239
vere-0.239
rier-0.232
ometer-0.229
Token,
Token ID11
Feature activation+6.18e+00

Pos logit contributions
 colored+0.061
 saw+0.056
 coloured+0.050
ptions+0.047
 little+0.047
Neg logit contributions
othe-0.145
vere-0.139
rier-0.137
uts-0.136
 spine-0.135
Token there
Token ID612
Feature activation+6.66e+00

Pos logit contributions
 colored+0.065
ptions+0.059
 media+0.055
 amusement+0.054
 coloured+0.054
Neg logit contributions
othe-0.161
vere-0.146
rier-0.145
ometer-0.145
 spine-0.144
Token was
Token ID373
Feature activation+7.97e+00

Pos logit contributions
ptions+0.103
 colored+0.092
 packed+0.087
iet+0.086
 coloured+0.084
Neg logit contributions
othe-0.136
ometer-0.117
uts-0.113
urch-0.111
oles-0.111
Token a
Token ID257
Feature activation+6.34e+00

Pos logit contributions
 saw+0.084
ptions+0.082
 remembered+0.078
 colored+0.070
 packed+0.069
Neg logit contributions
othe-0.181
umbs-0.172
 numb-0.171
unes-0.169
uts-0.167
Token boy
Token ID2933
Feature activation+6.91e+00

Pos logit contributions
 colored+0.054
ptions+0.043
 coloured+0.043
 saw+0.041
 heard+0.037
Neg logit contributions
rier-0.177
othe-0.175
ometer-0.171
vere-0.170
 spine-0.167
Token cob
Token ID22843
Feature activation-5.78e+00

Pos logit contributions
rier+0.076
othe+0.076
vere+0.073
 spine+0.073
ometer+0.072
Neg logit contributions
 saw-0.057
 colored-0.056
 little-0.050
 coloured-0.050
 heard-0.049
Tokenwe
Token ID732
Feature activation-5.41e+00

Pos logit contributions
othe+0.126
rier+0.123
vere+0.122
 spine+0.120
ometers+0.120
Neg logit contributions
 saw-0.093
 colored-0.092
 little-0.082
 heard-0.082
 coloured-0.080
Tokenbs
Token ID1443
Feature activation-3.10e+00

Pos logit contributions
othe+0.063
rier+0.061
vere+0.058
ometers+0.056
 spine+0.055
Neg logit contributions
 colored-0.138
 saw-0.137
 little-0.131
 coloured-0.127
 heard-0.126
Token,"
Token ID553
Feature activation-1.50e+00

Pos logit contributions
othe+0.073
rier+0.072
vere+0.070
ometers+0.070
 spine+0.069
Neg logit contributions
 saw-0.044
 colored-0.042
 little-0.039
 heard-0.038
 found-0.036
Token she
Token ID673
Feature activation+1.14e+00

Pos logit contributions
rier+0.033
othe+0.032
vere+0.032
ometers+0.032
unes+0.031
Neg logit contributions
 saw-0.024
 colored-0.023
 little-0.023
 heard-0.021
 found-0.021
Token said
Token ID531
Feature activation+4.13e+00

Pos logit contributions
 colored+0.016
ptions+0.015
 saw+0.014
 coloured+0.014
 preschool+0.014
Neg logit contributions
othe-0.026
vere-0.024
rier-0.024
ometer-0.024
uts-0.024
Token.
Token ID13
Feature activation-2.21e+00

Pos logit contributions
 little+0.036
 saw+0.035
 a+0.032
 colored+0.031
 heard+0.028
Neg logit contributions
unes-0.117
plets-0.116
ometers-0.115
umm-0.113
vere-0.112
Token
Token ID198
Feature activation-1.01e+00

Pos logit contributions
rier+0.054
drops+0.052
unes+0.052
 spine+0.052
othe+0.052
Neg logit contributions
 saw-0.032
 little-0.030
 heard-0.027
 a-0.027
 two-0.026
Token
Token ID198
Feature activation-1.29e+00

Pos logit contributions
rier+0.023
drops+0.022
 spine+0.022
unes+0.022
ometers+0.021
Neg logit contributions
 saw-0.017
 little-0.016
 colored-0.015
 heard-0.014
 a-0.014
Token"
Token ID1
Feature activation-2.05e+00

Pos logit contributions
othe+0.027
rier+0.027
vere+0.026
uts+0.025
ometer+0.025
Neg logit contributions
 colored-0.021
 saw-0.020
 coloured-0.019
 little-0.018
 heard-0.018
TokenWe
Token ID1135
Feature activation-8.96e-01

Pos logit contributions
othe+0.044
rier+0.043
unes+0.043
 spine+0.042
anks+0.041
Neg logit contributions
 saw-0.034
 colored-0.033
 little-0.032
 heard-0.030
 did-0.030
Token and
Token ID290
Feature activation+2.59e+00

Pos logit contributions
ptions+0.104
 little+0.089
 amusement+0.085
 colored+0.083
 coloured+0.077
Neg logit contributions
ometer-0.113
uts-0.109
othe-0.108
ometers-0.108
 spine-0.108
Token Ben
Token ID3932
Feature activation+6.71e+00

Pos logit contributions
 colored+0.034
ptions+0.033
 coloured+0.030
 saw+0.027
 media+0.026
Neg logit contributions
othe-0.063
vere-0.056
 spine-0.056
ometers-0.055
uts-0.055
Token were
Token ID547
Feature activation+7.86e+00

Pos logit contributions
ptions+0.123
 colored+0.107
 little+0.104
 media+0.104
 coloured+0.101
Neg logit contributions
othe-0.106
drops-0.105
unes-0.102
 spine-0.099
ometers-0.098
Token twins
Token ID20345
Feature activation+6.17e+00

Pos logit contributions
 colored+0.131
 saw+0.119
 coloured+0.119
ptions+0.113
 heard+0.107
Neg logit contributions
othe-0.162
vere-0.159
rier-0.155
unes-0.150
 spine-0.149
Token who
Token ID508
Feature activation+2.17e+00

Pos logit contributions
 saw+0.090
 colored+0.081
 little+0.080
 heard+0.076
 found+0.073
Neg logit contributions
rier-0.149
othe-0.148
unes-0.146
ometers-0.144
vere-0.144
Token liked
Token ID8288
Feature activation+4.16e+00

Pos logit contributions
 colored+0.018
ptions+0.017
 little+0.016
 saw+0.016
 coloured+0.015
Neg logit contributions
othe-0.061
ometers-0.057
drops-0.056
vere-0.056
uts-0.056
Token to
Token ID284
Feature activation+1.56e+00

Pos logit contributions
 saw+0.069
 colored+0.068
 remembered+0.063
 heard+0.062
 little+0.061
Neg logit contributions
othe-0.081
rier-0.078
ometers-0.077
unes-0.077
uts-0.076
Token play
Token ID711
Feature activation+3.51e+00

Pos logit contributions
 saw+0.022
 colored+0.021
ptions+0.021
 little+0.019
 heard+0.019
Neg logit contributions
othe-0.034
vere-0.032
unes-0.032
 spine-0.032
drops-0.032
Token outside
Token ID2354
Feature activation+2.71e+00

Pos logit contributions
 saw+0.055
 colored+0.054
 remembered+0.050
 packed+0.050
 little+0.050
Neg logit contributions
othe-0.072
rier-0.071
vere-0.069
balls-0.066
 spine-0.065
Token.
Token ID13
Feature activation-2.34e+00

Pos logit contributions
 saw+0.039
 colored+0.038
 little+0.036
 heard+0.034
 a+0.033
Neg logit contributions
othe-0.065
rier-0.065
vere-0.063
ometers-0.062
 spine-0.062
Token They
Token ID1119
Feature activation+1.00e+00

Pos logit contributions
othe+0.041
rier+0.040
 spine+0.038
unes+0.038
ometers+0.038
Neg logit contributions
 colored-0.046
 saw-0.045
 little-0.042
 coloured-0.042
 heard-0.041
Token."
Token ID526
Feature activation-3.06e+00

Pos logit contributions
 colored+0.017
 saw+0.016
 little+0.015
 coloured+0.015
 heard+0.015
Neg logit contributions
rier-0.014
othe-0.014
vere-0.014
uts-0.013
ometer-0.013
Token
Token ID198
Feature activation-1.10e+00

Pos logit contributions
rier+0.086
vere+0.079
 spine+0.078
drops+0.076
ometers+0.076
Neg logit contributions
 saw-0.040
 little-0.038
 a-0.036
 two-0.033
 colored-0.032
Token
Token ID198
Feature activation-1.39e+00

Pos logit contributions
rier+0.024
 spine+0.023
unes+0.023
othe+0.023
ometers+0.023
Neg logit contributions
 saw-0.018
 colored-0.017
 little-0.017
 heard-0.016
 found-0.015
TokenThey
Token ID2990
Feature activation+9.34e-01

Pos logit contributions
othe+0.026
rier+0.026
vere+0.025
ometers+0.025
 spine+0.025
Neg logit contributions
 saw-0.026
 colored-0.026
 little-0.024
 heard-0.023
 coloured-0.023
Token take
Token ID1011
Feature activation+2.55e+00

Pos logit contributions
 colored+0.017
ptions+0.016
 coloured+0.016
 saw+0.016
 little+0.015
Neg logit contributions
othe-0.017
drops-0.016
vere-0.016
 spine-0.015
rier-0.015
Token turns
Token ID4962
Feature activation+5.40e-01

Pos logit contributions
 saw+0.031
 colored+0.031
 remembered+0.028
 coloured+0.028
 noticed+0.027
Neg logit contributions
othe-0.060
rier-0.060
unes-0.057
ometer-0.056
ometers-0.056
Token to
Token ID284
Feature activation+1.09e+00

Pos logit contributions
 saw+0.008
 colored+0.008
 little+0.007
 heard+0.007
 coloured+0.007
Neg logit contributions
othe-0.012
rier-0.012
vere-0.012
 spine-0.012
uts-0.011
Token push
Token ID4574
Feature activation+2.72e+00

Pos logit contributions
 saw+0.019
 colored+0.019
ptions+0.018
 heard+0.018
 little+0.018
Neg logit contributions
othe-0.020
rier-0.019
vere-0.018
drops-0.017
 spine-0.017
Token the
Token ID262
Feature activation-2.81e+00

Pos logit contributions
 saw+0.024
 colored+0.023
 little+0.019
 heard+0.018
 coloured+0.018
Neg logit contributions
othe-0.078
rier-0.077
vere-0.074
ometers-0.073
ometer-0.073
Token red
Token ID2266
Feature activation-4.00e+00

Pos logit contributions
othe+0.064
rier+0.064
vere+0.062
 spine+0.061
ometer+0.060
Neg logit contributions
 saw-0.040
 colored-0.040
 little-0.035
 coloured-0.035
 heard-0.035
Token car
Token ID1097
Feature activation-3.86e+00

Pos logit contributions
othe+0.069
rier+0.069
vere+0.065
 spine+0.064
ometer+0.064
Neg logit contributions
 saw-0.080
 colored-0.079
 little-0.074
 heard-0.072
 coloured-0.072
Token
Token ID198
Feature activation-1.76e+00

Pos logit contributions
rier+0.026
 spine+0.025
othe+0.024
ometers+0.024
drops+0.024
Neg logit contributions
 saw-0.020
 colored-0.018
 little-0.018
 heard-0.017
 found-0.017
TokenTom
Token ID13787
Feature activation+2.70e+00

Pos logit contributions
othe+0.040
rier+0.039
vere+0.039
ometers+0.037
ometer+0.037
Neg logit contributions
 saw-0.026
 colored-0.026
 little-0.023
 coloured-0.022
 heard-0.022
Token and
Token ID290
Feature activation-7.74e-01

Pos logit contributions
 colored+0.036
ptions+0.036
 saw+0.035
 coloured+0.033
 little+0.033
Neg logit contributions
drops-0.056
unes-0.056
ometer-0.056
rier-0.056
othe-0.056
Token Mia
Token ID32189
Feature activation+2.69e+00

Pos logit contributions
othe+0.016
rier+0.015
 spine+0.015
vere+0.014
unes+0.014
Neg logit contributions
 colored-0.013
 saw-0.012
 coloured-0.012
 packed-0.011
 little-0.011
Token are
Token ID389
Feature activation+3.21e+00

Pos logit contributions
 saw+0.046
 colored+0.046
 little+0.042
 heard+0.041
 coloured+0.040
Neg logit contributions
othe-0.055
rier-0.054
vere-0.052
 spine-0.051
ometers-0.051
Token happy
Token ID3772
Feature activation+7.90e-01

Pos logit contributions
 colored+0.057
 saw+0.055
ptions+0.052
 coloured+0.052
 little+0.050
Neg logit contributions
rier-0.062
othe-0.061
vere-0.059
uts-0.057
drops-0.056
Token.
Token ID13
Feature activation-3.69e+00

Pos logit contributions
 saw+0.011
 little+0.010
 colored+0.010
 heard+0.010
 a+0.009
Neg logit contributions
rier-0.020
vere-0.019
ometers-0.019
ometer-0.019
othe-0.019
Token They
Token ID1119
Feature activation+1.39e-01

Pos logit contributions
rier+0.075
vere+0.071
othe+0.071
 spine+0.067
unes+0.066
Neg logit contributions
 saw-0.070
 colored-0.067
 little-0.064
 a-0.061
 did-0.061
Token have
Token ID423
Feature activation+6.03e-01

Pos logit contributions
 colored+0.002
ptions+0.002
 saw+0.002
 coloured+0.002
 little+0.002
Neg logit contributions
othe-0.003
vere-0.003
drops-0.003
rier-0.002
 spine-0.002
Token helped
Token ID4193
Feature activation+1.76e+00

Pos logit contributions
 colored+0.007
 coloured+0.007
ptions+0.006
 media+0.006
ere+0.006
Neg logit contributions
ometers-0.012
uts-0.012
 numb-0.012
unes-0.012
othe-0.012
Token someone
Token ID2130
Feature activation-1.83e+00

Pos logit contributions
 saw+0.022
 colored+0.022
 little+0.020
 heard+0.018
 coloured+0.018
Neg logit contributions
othe-0.044
rier-0.044
vere-0.042
 spine-0.042
ometers-0.042
Token help
Token ID1037
Feature activation+2.08e+00

Pos logit contributions
 colored+0.002
 saw+0.002
ptions+0.002
 heard+0.002
 little+0.002
Neg logit contributions
othe-0.003
rier-0.002
vere-0.002
 spine-0.002
ometers-0.002
Token the
Token ID262
Feature activation-1.76e+00

Pos logit contributions
 saw+0.027
 colored+0.026
 little+0.025
 found+0.023
 heard+0.022
Neg logit contributions
rier-0.052
othe-0.051
 spine-0.051
vere-0.051
ometer-0.050
Token bird
Token ID6512
Feature activation-2.31e+00

Pos logit contributions
rier+0.041
vere+0.039
othe+0.038
 spine+0.038
ometer+0.038
Neg logit contributions
 colored-0.024
 saw-0.024
 coloured-0.021
 found-0.020
 heard-0.020
Token.
Token ID13
Feature activation-4.80e+00

Pos logit contributions
rier+0.053
othe+0.052
vere+0.052
ometers+0.051
 spine+0.051
Neg logit contributions
 saw-0.035
 colored-0.033
 little-0.033
 found-0.031
 heard-0.030
Token They
Token ID1119
Feature activation-3.80e-01

Pos logit contributions
rier+0.100
vere+0.094
othe+0.093
 spine+0.091
 numb+0.087
Neg logit contributions
 saw-0.086
 little-0.083
 colored-0.080
 a-0.076
 two-0.074
Token went
Token ID1816
Feature activation+2.72e+00

Pos logit contributions
othe+0.009
drops+0.008
ometers+0.008
ometer+0.008
taker+0.008
Neg logit contributions
ptions-0.004
 colored-0.004
 little-0.004
 preschool-0.004
 media-0.004
Token back
Token ID736
Feature activation+2.48e+00

Pos logit contributions
 saw+0.041
 colored+0.039
 little+0.037
 found+0.035
 heard+0.035
Neg logit contributions
rier-0.063
othe-0.063
vere-0.061
ometers-0.060
 spine-0.059
Token to
Token ID284
Feature activation+1.59e+00

Pos logit contributions
 colored+0.046
ptions+0.043
 saw+0.042
 little+0.042
 coloured+0.041
Neg logit contributions
othe-0.044
ometer-0.043
ometers-0.042
 spine-0.040
unes-0.040
Token the
Token ID262
Feature activation-8.18e-01

Pos logit contributions
 colored+0.020
 saw+0.019
ptions+0.017
 coloured+0.017
 heard+0.017
Neg logit contributions
othe-0.040
 spine-0.036
ometers-0.035
rier-0.035
vere-0.034
Token tree
Token ID5509
Feature activation-4.58e-02

Pos logit contributions
rier+0.016
 spine+0.016
ometer+0.015
vere+0.015
othe+0.015
Neg logit contributions
 colored-0.013
 saw-0.012
 coloured-0.012
 did-0.011
ptions-0.011
Token and
Token ID290
Feature activation-2.51e+00

Pos logit contributions
othe+0.001
rier+0.001
vere+0.001
 spine+0.001
ometers+0.001
Neg logit contributions
 saw-0.001
 colored-0.001
 little-0.000
 heard-0.000
 coloured-0.000
Token and
Token ID290
Feature activation-3.84e+00

Pos logit contributions
othe+0.080
rier+0.079
vere+0.076
 spine+0.075
ometers+0.075
Neg logit contributions
 saw-0.043
 colored-0.043
 little-0.039
 heard-0.037
 coloured-0.037
Token that
Token ID326
Feature activation-2.64e+00

Pos logit contributions
othe+0.072
rier+0.067
vere+0.065
ometers+0.063
drops+0.062
Neg logit contributions
 colored-0.070
ptions-0.069
 saw-0.067
 coloured-0.063
 little-0.062
Token it
Token ID340
Feature activation-4.87e+00

Pos logit contributions
othe+0.062
rier+0.060
vere+0.059
ometers+0.058
ometer+0.057
Neg logit contributions
 saw-0.036
 colored-0.036
 little-0.033
 coloured-0.031
 heard-0.031
Token would
Token ID561
Feature activation-2.69e+00

Pos logit contributions
othe+0.128
rier+0.125
vere+0.123
ometers+0.121
ometer+0.120
Neg logit contributions
 colored-0.050
 saw-0.048
 little-0.043
ptions-0.041
 coloured-0.041
Token always
Token ID1464
Feature activation-2.89e+00

Pos logit contributions
rier+0.069
othe+0.066
unes+0.066
ometers+0.066
ometer+0.066
Neg logit contributions
 saw-0.035
 colored-0.033
 little-0.032
 found-0.030
 heard-0.029
Token keep
Token ID1394
Feature activation-8.00e-01

Pos logit contributions
rier+0.069
othe+0.068
unes+0.067
ometers+0.066
ometer+0.066
Neg logit contributions
 saw-0.041
 colored-0.040
 little-0.037
 heard-0.035
 found-0.035
Token her
Token ID607
Feature activation-2.09e+00

Pos logit contributions
othe+0.021
rier+0.021
vere+0.020
ometers+0.020
 spine+0.020
Neg logit contributions
 saw-0.009
 colored-0.009
 little-0.008
 heard-0.007
 coloured-0.007
Token toy
Token ID13373
Feature activation-6.59e+00

Pos logit contributions
othe+0.045
rier+0.044
vere+0.043
ometers+0.042
unes+0.042
Neg logit contributions
 saw-0.034
 colored-0.034
 little-0.033
 heard-0.030
 found-0.030
Token safe
Token ID3338
Feature activation-2.69e+00

Pos logit contributions
othe+0.169
plets+0.160
unes+0.158
vere+0.158
rier+0.157
Neg logit contributions
 little-0.085
 saw-0.085
 colored-0.078
 found-0.075
 a-0.073
Token.
Token ID13
Feature activation-5.18e+00

Pos logit contributions
othe+0.058
rier+0.057
vere+0.055
 spine+0.054
ometers+0.054
Neg logit contributions
 saw-0.043
 colored-0.042
 little-0.039
 heard-0.037
 coloured-0.037
Token<|endoftext|>
Token ID50256
Feature activation+5.76e+00

Pos logit contributions
rier+0.123
drops+0.116
unes+0.115
 spine+0.114
othe+0.113
Neg logit contributions
 saw-0.080
 little-0.078
 a-0.071
 colored-0.071
 found-0.069
Tokenmy
Token ID1820
Feature activation+4.42e+00

Pos logit contributions
ptions+0.067
 amusement+0.063
 colored+0.061
 media+0.060
 preschool+0.059
Neg logit contributions
oes-0.057
ometer-0.054
drops-0.053
othe-0.052
 spine-0.051
Token said
Token ID531
Feature activation+4.45e+00

Pos logit contributions
ptions+0.064
grown+0.059
 preschool+0.057
 colored+0.057
 amusement+0.054
Neg logit contributions
othe-0.090
uts-0.086
drops-0.085
ometer-0.083
but-0.081
Token,
Token ID11
Feature activation+1.20e+00

Pos logit contributions
 saw+0.035
 colored+0.032
 little+0.029
 heard+0.025
 found+0.024
Neg logit contributions
othe-0.133
rier-0.132
vere-0.129
 spine-0.129
ometers-0.129
Token "
Token ID366
Feature activation-2.17e+00

Pos logit contributions
 saw+0.020
 colored+0.019
 little+0.018
 heard+0.017
 found+0.017
Neg logit contributions
othe-0.026
rier-0.026
vere-0.025
ometers-0.024
 spine-0.024
TokenWow
Token ID22017
Feature activation+1.33e+00

Pos logit contributions
rier+0.058
unes+0.057
 spine+0.057
othe+0.055
ometers+0.055
Neg logit contributions
 colored-0.030
 saw-0.030
 thought-0.027
 coloured-0.026
 little-0.025
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
 saw+0.012
 colored+0.012
 little+0.010
 heard+0.009
 coloured+0.009
Neg logit contributions
othe-0.038
rier-0.038
vere-0.037
ometers-0.036
 spine-0.036
Token you
Token ID345
Feature activation-2.05e-01

Pos logit contributions
othe+0.022
rier+0.022
vere+0.021
 spine+0.019
drops+0.019
Neg logit contributions
 colored-0.029
ptions-0.027
 coloured-0.026
 saw-0.026
 heard-0.025
Token look
Token ID804
Feature activation+7.39e-01

Pos logit contributions
othe+0.004
vere+0.004
unes+0.004
drops+0.004
plets+0.004
Neg logit contributions
 colored-0.003
ptions-0.003
 little-0.003
 saw-0.003
 coloured-0.003
Token so
Token ID523
Feature activation-1.51e+00

Pos logit contributions
 colored+0.012
 saw+0.011
ptions+0.010
 coloured+0.010
 packed+0.010
Neg logit contributions
rier-0.016
othe-0.015
vere-0.015
unes-0.014
 spine-0.014
Token fancy
Token ID14996
Feature activation+3.33e-01

Pos logit contributions
othe+0.038
rier+0.038
vere+0.037
ometers+0.036
ometer+0.036
Neg logit contributions
 colored-0.018
 saw-0.018
 little-0.015
 heard-0.015
 coloured-0.015
Token!
Token ID0
Feature activation-6.21e+00

Pos logit contributions
 saw+0.005
 colored+0.005
 little+0.005
 heard+0.005
 coloured+0.005
Neg logit contributions
rier-0.007
othe-0.007
vere-0.007
ometers-0.007
ometer-0.007
Token and
Token ID290
Feature activation-2.75e+00

Pos logit contributions
rier+0.062
othe+0.062
vere+0.062
ometers+0.062
plets+0.061
Neg logit contributions
 saw-0.025
 little-0.022
 a-0.022
 found-0.022
 colored-0.021
Token walked
Token ID6807
Feature activation+4.68e+00

Pos logit contributions
othe+0.067
drops+0.062
balls+0.062
ometer+0.060
ometers+0.060
Neg logit contributions
ptions-0.031
 colored-0.030
 saw-0.028
 little-0.027
 coloured-0.026
Token with
Token ID351
Feature activation+2.35e+00

Pos logit contributions
 colored+0.082
 saw+0.082
 little+0.076
 coloured+0.073
 heard+0.072
Neg logit contributions
rier-0.092
othe-0.092
vere-0.087
 spine-0.086
ometers-0.085
Token Lily
Token ID20037
Feature activation+1.21e+00

Pos logit contributions
 colored+0.025
 saw+0.024
 little+0.020
 coloured+0.020
 heard+0.020
Neg logit contributions
othe-0.063
rier-0.063
vere-0.060
unes-0.059
ometers-0.059
Token to
Token ID284
Feature activation+1.98e+00

Pos logit contributions
 saw+0.018
 colored+0.017
 little+0.016
 heard+0.015
 found+0.015
Neg logit contributions
rier-0.028
othe-0.028
vere-0.027
unes-0.026
ometers-0.026
Token the
Token ID262
Feature activation-9.97e-01

Pos logit contributions
 colored+0.022
 saw+0.019
 coloured+0.019
 remembered+0.017
ptions+0.017
Neg logit contributions
othe-0.052
rier-0.049
vere-0.049
 spine-0.048
 numb-0.045
Token market
Token ID1910
Feature activation-3.47e-01

Pos logit contributions
rier+0.021
 spine+0.020
vere+0.020
othe+0.020
ometer+0.020
Neg logit contributions
 colored-0.016
 coloured-0.015
 saw-0.014
ptions-0.014
 remembered-0.013
Token.
Token ID13
Feature activation-4.47e+00

Pos logit contributions
othe+0.007
rier+0.007
vere+0.007
unes+0.007
 spine+0.007
Neg logit contributions
 saw-0.006
 colored-0.005
 little-0.005
 heard-0.005
 found-0.005
Token He
Token ID679
Feature activation-4.60e-01

Pos logit contributions
rier+0.067
othe+0.067
vere+0.067
ometers+0.063
 spine+0.062
Neg logit contributions
 saw-0.101
 little-0.098
 colored-0.095
 found-0.091
 heard-0.090
Token gave
Token ID2921
Feature activation+2.04e+00

Pos logit contributions
othe+0.011
ometer+0.011
balls+0.011
ometers+0.011
drops+0.010
Neg logit contributions
 colored-0.004
ptions-0.004
 little-0.004
 coloured-0.004
 media-0.004
Token the
Token ID262
Feature activation-2.03e+00

Pos logit contributions
 colored+0.029
 saw+0.027
 coloured+0.026
 packed+0.026
ptions+0.025
Neg logit contributions
rier-0.046
othe-0.045
unes-0.042
balls-0.040
ometers-0.040
Token suit
Token ID6050
Feature activation-4.49e+00

Pos logit contributions
rier+0.057
othe+0.057
vere+0.055
ometer+0.055
 spine+0.054
Neg logit contributions
 saw-0.038
 colored-0.037
 little-0.034
 heard-0.033
ptions-0.032
Token.
Token ID13
Feature activation-4.87e+00

Pos logit contributions
othe+0.094
rier+0.093
vere+0.090
 spine+0.088
ometers+0.088
Neg logit contributions
 saw-0.074
 colored-0.073
 little-0.068
 heard-0.065
 coloured-0.064
Token She
Token ID1375
Feature activation-2.03e-02

Pos logit contributions
rier+0.104
vere+0.103
othe+0.101
 numb+0.100
 spine+0.097
Neg logit contributions
 little-0.080
 saw-0.079
 a-0.072
 two-0.071
 young-0.070
Token grabbed
Token ID13176
Feature activation+3.99e+00

Pos logit contributions
othe+0.000
drops+0.000
ometer+0.000
vere+0.000
uts+0.000
Neg logit contributions
ptions-0.000
 colored-0.000
 preschool-0.000
 media-0.000
grown-0.000
Token a
Token ID257
Feature activation-1.05e+00

Pos logit contributions
 saw+0.039
 colored+0.037
 heard+0.032
 little+0.031
ptions+0.031
Neg logit contributions
othe-0.110
rier-0.107
unes-0.103
vere-0.102
ometers-0.102
Token towel
Token ID24808
Feature activation-3.00e+00

Pos logit contributions
othe+0.022
rier+0.022
vere+0.021
ometer+0.021
 spine+0.021
Neg logit contributions
 colored-0.016
 saw-0.015
ptions-0.014
 coloured-0.014
 heard-0.014
Token and
Token ID290
Feature activation-3.02e+00

Pos logit contributions
othe+0.084
rier+0.083
vere+0.081
ometers+0.080
unes+0.079
Neg logit contributions
 saw-0.029
 colored-0.028
 little-0.025
 heard-0.023
 coloured-0.022
Token hopped
Token ID45230
Feature activation+3.32e+00

Pos logit contributions
othe+0.071
drops+0.066
ometers+0.064
ometer+0.064
uts+0.064
Neg logit contributions
ptions-0.039
 colored-0.036
 saw-0.034
 preschool-0.032
 little-0.032
Token into
Token ID656
Feature activation+3.98e+00

Pos logit contributions
 saw+0.057
 colored+0.056
 little+0.053
 found+0.050
 a+0.050
Neg logit contributions
othe-0.069
rier-0.068
ometers-0.067
vere-0.067
 spine-0.066
Token the
Token ID262
Feature activation-1.58e+00

Pos logit contributions
 saw+0.041
 colored+0.039
 remembered+0.037
 did+0.034
ptions+0.034
Neg logit contributions
othe-0.105
rier-0.100
vere-0.097
 numb-0.096
 spine-0.094
Token pirate
Token ID25868
Feature activation-8.71e-01

Pos logit contributions
rier+0.029
othe+0.028
 spine+0.027
vere+0.027
ometer+0.026
Neg logit contributions
 colored-0.029
 saw-0.029
 heard-0.026
 coloured-0.026
 little-0.026
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
 saw+0.000
 colored+0.000
 little+0.000
 coloured+0.000
 heard+0.000
Neg logit contributions
othe-0.001
rier-0.001
vere-0.001
 spine-0.001
ometers-0.001
Token she
Token ID673
Feature activation+1.88e-01

Pos logit contributions
othe+0.023
rier+0.023
vere+0.022
ometers+0.021
 spine+0.021
Neg logit contributions
 saw-0.017
 colored-0.017
 little-0.015
 coloured-0.015
 heard-0.015
Token finally
Token ID3443
Feature activation+1.26e+00

Pos logit contributions
 colored+0.002
ptions+0.002
 amusement+0.002
 preschool+0.002
 little+0.002
Neg logit contributions
othe-0.004
drops-0.004
ometer-0.004
vere-0.004
balls-0.004
Token could
Token ID714
Feature activation+7.02e-01

Pos logit contributions
 colored+0.018
ptions+0.017
 preschool+0.016
 media+0.016
 little+0.016
Neg logit contributions
othe-0.027
uts-0.025
vere-0.025
plets-0.024
drops-0.024
Token balance
Token ID5236
Feature activation+1.20e+00

Pos logit contributions
 saw+0.011
 colored+0.010
 little+0.010
 heard+0.010
 found+0.010
Neg logit contributions
rier-0.016
othe-0.015
ometers-0.015
unes-0.015
ometer-0.015
Token the
Token ID262
Feature activation-3.11e+00

Pos logit contributions
 saw+0.014
 colored+0.014
 a+0.013
 little+0.013
 heard+0.012
Neg logit contributions
othe-0.031
ometer-0.031
ometers-0.031
 spine-0.031
rier-0.030
Token task
Token ID4876
Feature activation-3.17e+00

Pos logit contributions
othe+0.066
rier+0.065
vere+0.063
 spine+0.062
ometers+0.061
Neg logit contributions
 saw-0.051
 colored-0.051
 little-0.047
 heard-0.045
 coloured-0.044
Token!
Token ID0
Feature activation-3.91e+00

Pos logit contributions
rier+0.079
othe+0.078
ometers+0.077
unes+0.077
ometer+0.075
Neg logit contributions
 saw-0.042
 colored-0.039
 found-0.037
 little-0.037
 a-0.037
Token She
Token ID1375
Feature activation+4.49e-01

Pos logit contributions
rier+0.095
vere+0.090
 numb+0.087
othe+0.086
 spine+0.084
Neg logit contributions
 saw-0.061
 little-0.059
 a-0.055
 found-0.053
 two-0.052
Token felt
Token ID2936
Feature activation+3.26e+00

Pos logit contributions
 colored+0.005
ptions+0.005
 saw+0.004
 little+0.004
 amusement+0.004
Neg logit contributions
othe-0.011
vere-0.010
drops-0.010
rier-0.010
ometer-0.010
Token so
Token ID523
Feature activation-1.30e-01

Pos logit contributions
 saw+0.039
 colored+0.038
 little+0.036
 heard+0.033
 found+0.032
Neg logit contributions
rier-0.084
othe-0.083
ometers-0.081
 spine-0.081
ometer-0.081
Token listen
Token ID6004
Feature activation-9.33e-03

Pos logit contributions
othe+0.126
rier+0.126
vere+0.121
ometers+0.119
 spine+0.119
Neg logit contributions
 saw-0.085
 colored-0.084
 little-0.078
 heard-0.074
 coloured-0.073
Token to
Token ID284
Feature activation+2.75e-01

Pos logit contributions
othe+0.000
rier+0.000
vere+0.000
ometers+0.000
ometer+0.000
Neg logit contributions
 colored-0.000
 saw-0.000
 little-0.000
 coloured-0.000
 heard-0.000
Token me
Token ID502
Feature activation-1.62e+00

Pos logit contributions
 colored+0.003
 saw+0.003
 coloured+0.003
 heard+0.003
ptions+0.003
Neg logit contributions
othe-0.007
rier-0.007
vere-0.006
 spine-0.006
ometers-0.006
Token next
Token ID1306
Feature activation-5.18e-01

Pos logit contributions
othe+0.035
rier+0.034
vere+0.033
ometers+0.032
 spine+0.032
Neg logit contributions
 saw-0.026
 colored-0.026
 little-0.024
 heard-0.023
 coloured-0.023
Token time
Token ID640
Feature activation-1.08e+00

Pos logit contributions
rier+0.011
 spine+0.011
vere+0.011
othe+0.011
ometer+0.010
Neg logit contributions
 colored-0.007
 saw-0.007
ptions-0.007
 remembered-0.007
 coloured-0.007
Token.
Token ID13
Feature activation-6.47e+00

Pos logit contributions
othe+0.023
rier+0.023
vere+0.022
unes+0.022
ometers+0.021
Neg logit contributions
 saw-0.018
 colored-0.017
 little-0.017
 heard-0.016
 found-0.015
Token The
Token ID383
Feature activation-2.76e+00

Pos logit contributions
rier+0.122
vere+0.113
othe+0.113
drops+0.105
 spine+0.104
Neg logit contributions
 saw-0.121
 little-0.121
 a-0.116
 two-0.116
 colored-0.113
Token zoo
Token ID26626
Feature activation-2.62e+00

Pos logit contributions
othe+0.069
rier+0.067
uts+0.066
ometers+0.065
vere+0.065
Neg logit contributions
 saw-0.038
 little-0.037
 colored-0.036
 heard-0.032
 two-0.032
Token is
Token ID318
Feature activation-1.56e+00

Pos logit contributions
othe+0.065
rier+0.065
vere+0.063
ometers+0.062
ometer+0.062
Neg logit contributions
 colored-0.032
 saw-0.031
 little-0.028
 coloured-0.027
 heard-0.026
Token a
Token ID257
Feature activation-2.46e+00

Pos logit contributions
othe+0.031
rier+0.031
vere+0.030
ometers+0.029
 spine+0.029
Neg logit contributions
 colored-0.026
 saw-0.025
 coloured-0.023
 little-0.023
ptions-0.023
Token fun
Token ID1257
Feature activation-4.78e-01

Pos logit contributions
othe+0.058
rier+0.058
vere+0.056
 spine+0.055
ometer+0.055
Neg logit contributions
 colored-0.034
 saw-0.033
 little-0.029
 coloured-0.029
 heard-0.028
Token for
Token ID329
Feature activation+7.06e-01

Pos logit contributions
othe+0.109
rier+0.105
vere+0.101
ometers+0.101
 spine+0.100
Neg logit contributions
 colored-0.064
 saw-0.064
 little-0.059
ptions-0.056
 heard-0.056
Token giving
Token ID3501
Feature activation-5.87e-01

Pos logit contributions
 saw+0.011
 colored+0.010
 coloured+0.009
iet+0.009
ptions+0.009
Neg logit contributions
othe-0.016
rier-0.014
vere-0.014
 numb-0.014
balls-0.013
Token him
Token ID683
Feature activation-1.05e+00

Pos logit contributions
othe+0.012
rier+0.012
vere+0.011
ometers+0.011
unes+0.011
Neg logit contributions
 saw-0.010
 colored-0.010
 little-0.009
 heard-0.009
 coloured-0.009
Token the
Token ID262
Feature activation-3.52e+00

Pos logit contributions
othe+0.028
rier+0.027
ometers+0.027
unes+0.026
vere+0.026
Neg logit contributions
 saw-0.011
 colored-0.011
ptions-0.009
 little-0.009
 coloured-0.009
Token best
Token ID1266
Feature activation-3.82e+00

Pos logit contributions
othe+0.080
rier+0.078
vere+0.076
 spine+0.075
ometers+0.074
Neg logit contributions
 saw-0.054
 colored-0.054
 little-0.052
 heard-0.047
 coloured-0.047
Token cocoa
Token ID35845
Feature activation-8.62e+00

Pos logit contributions
rier+0.089
othe+0.088
vere+0.087
 spine+0.084
ometers+0.083
Neg logit contributions
 colored-0.058
 saw-0.057
 little-0.055
 coloured-0.050
 heard-0.049
Token.
Token ID13
Feature activation-4.93e+00

Pos logit contributions
vere+0.227
othe+0.222
rier+0.221
 spine+0.216
unes+0.206
Neg logit contributions
 saw-0.106
 colored-0.105
 little-0.097
 a-0.096
 found-0.092
Token The
Token ID383
Feature activation-6.72e-01

Pos logit contributions
rier+0.136
vere+0.125
drops+0.123
 spine+0.118
 numb+0.116
Neg logit contributions
 saw-0.070
 little-0.065
 a-0.062
 found-0.059
 two-0.058
Token End
Token ID5268
Feature activation+9.23e-01

Pos logit contributions
othe+0.013
rier+0.013
vere+0.012
 spine+0.012
ometer+0.012
Neg logit contributions
 colored-0.012
 saw-0.011
ptions-0.011
 coloured-0.010
 found-0.010
Token.
Token ID13
Feature activation-3.89e+00

Pos logit contributions
 saw+0.014
 colored+0.013
 little+0.012
 heard+0.012
 a+0.012
Neg logit contributions
rier-0.023
 spine-0.022
drops-0.021
unes-0.021
othe-0.021
Token<|endoftext|>
Token ID50256
Feature activation+5.44e+00

Pos logit contributions
rier+0.111
drops+0.104
vere+0.099
 spine+0.098
unes+0.095
Neg logit contributions
 saw-0.053
 a-0.048
 little-0.047
 found-0.045
 two-0.044
Token are
Token ID389
Feature activation-2.50e+00

Pos logit contributions
rier+0.047
othe+0.047
 spine+0.045
vere+0.045
unes+0.044
Neg logit contributions
 saw-0.025
 little-0.023
 colored-0.022
 did-0.021
 heard-0.020
Token many
Token ID867
Feature activation-1.62e+00

Pos logit contributions
rier+0.057
othe+0.057
vere+0.056
ometers+0.055
unes+0.054
Neg logit contributions
 colored-0.034
 saw-0.034
 little-0.030
 coloured-0.030
 heard-0.029
Token things
Token ID1243
Feature activation-4.90e+00

Pos logit contributions
rier+0.029
vere+0.028
othe+0.028
drops+0.027
unes+0.027
Neg logit contributions
 colored-0.027
 saw-0.026
othy-0.025
ptions-0.025
 remembered-0.025
Token to
Token ID284
Feature activation-3.60e+00

Pos logit contributions
othe+0.123
rier+0.123
vere+0.118
 spine+0.117
ometers+0.116
Neg logit contributions
 saw-0.063
 colored-0.060
 little-0.055
 heard-0.052
 found-0.052
Token see
Token ID766
Feature activation-1.99e+00

Pos logit contributions
othe+0.067
vere+0.063
 spine+0.059
rier+0.058
 numb+0.057
Neg logit contributions
 colored-0.062
 saw-0.059
 remembered-0.058
ptions-0.057
 coloured-0.056
Token and
Token ID290
Feature activation-6.57e+00

Pos logit contributions
rier+0.041
othe+0.041
vere+0.039
ometers+0.039
unes+0.038
Neg logit contributions
 saw-0.032
 colored-0.032
 little-0.029
 coloured-0.028
 heard-0.028
Token do
Token ID466
Feature activation-2.00e+00

Pos logit contributions
othe+0.141
rier+0.136
vere+0.133
 spine+0.131
drops+0.131
Neg logit contributions
 colored-0.101
 saw-0.096
 coloured-0.090
 little-0.088
ptions-0.086
Token and
Token ID290
Feature activation-5.40e+00

Pos logit contributions
othe+0.045
rier+0.044
vere+0.043
ometers+0.042
unes+0.041
Neg logit contributions
 saw-0.030
 colored-0.030
 little-0.027
 heard-0.026
 coloured-0.026
Token learn
Token ID2193
Feature activation-2.85e+00

Pos logit contributions
othe+0.120
rier+0.117
vere+0.113
 spine+0.112
ometers+0.110
Neg logit contributions
 colored-0.081
 saw-0.080
 little-0.073
 coloured-0.071
 heard-0.070
Token inside
Token ID2641
Feature activation-1.01e+00

Pos logit contributions
othe+0.063
rier+0.063
 spine+0.061
vere+0.061
ometers+0.060
Neg logit contributions
 saw-0.043
 colored-0.043
 little-0.041
 heard-0.037
 coloured-0.037
Token.
Token ID13
Feature activation-5.83e+00

Pos logit contributions
othe+0.020
rier+0.020
vere+0.019
ometers+0.018
ometer+0.018
Neg logit contributions
 saw-0.018
 colored-0.018
 little-0.016
 heard-0.016
 coloured-0.016
Token
Token ID241
Feature activation-1.27e+00

Pos logit contributions
 saw+0.012
 colored+0.012
 little+0.011
 heard+0.011
 coloured+0.011
Neg logit contributions
othe-0.012
rier-0.012
vere-0.011
ometers-0.011
 spine-0.011
TokenCan
Token ID6090
Feature activation-2.31e+00

Pos logit contributions
othe+0.023
rier+0.023
vere+0.022
ometers+0.022
ometer+0.021
Neg logit contributions
 saw-0.024
 colored-0.024
 little-0.022
 heard-0.021
 coloured-0.021
Token I
Token ID314
Feature activation-3.41e-01

Pos logit contributions
rier+0.041
othe+0.039
 spine+0.038
vere+0.036
ometers+0.036
Neg logit contributions
 saw-0.049
 little-0.046
 colored-0.045
 did-0.043
 a-0.043
Token buy
Token ID2822
Feature activation+1.34e+00

Pos logit contributions
othe+0.007
rier+0.007
vere+0.007
unes+0.007
ometers+0.007
Neg logit contributions
 colored-0.005
 saw-0.005
 little-0.005
 coloured-0.005
ptions-0.005
Token this
Token ID428
Feature activation-4.23e+00

Pos logit contributions
 saw+0.017
 colored+0.017
 heard+0.015
 remembered+0.015
 little+0.015
Neg logit contributions
rier-0.031
othe-0.031
unes-0.030
ometers-0.030
vere-0.029
Token tea
Token ID8887
Feature activation-7.40e+00

Pos logit contributions
rier+0.080
othe+0.079
unes+0.079
ometer+0.079
balls+0.077
Neg logit contributions
 saw-0.067
ptions-0.065
 colored-0.065
 remembered-0.063
 observed-0.062
Token?
Token ID30
Feature activation-4.65e+00

Pos logit contributions
othe+0.188
rier+0.188
vere+0.181
ometers+0.178
unes+0.178
Neg logit contributions
 saw-0.087
 colored-0.085
 little-0.076
 heard-0.072
 coloured-0.071
Tokenâ
Token ID22940
Feature activation-6.07e-01

Pos logit contributions
rier+0.089
othe+0.088
 spine+0.085
vere+0.083
unes+0.080
Neg logit contributions
 saw-0.089
 little-0.087
 colored-0.082
 a-0.079
 found-0.078
Token
Token ID26391
Feature activation-5.69e-01

Pos logit contributions
rier+0.010
othe+0.010
vere+0.009
ometers+0.009
unes+0.009
Neg logit contributions
 saw-0.013
 colored-0.013
 little-0.012
 coloured-0.012
 heard-0.012
Token.
Token ID13
Feature activation-3.97e+00

Pos logit contributions
othe+0.011
vere+0.010
ometer+0.010
uts+0.010
ometers+0.010
Neg logit contributions
 colored-0.011
 saw-0.010
 little-0.010
 coloured-0.010
 heard-0.009
Token Her
Token ID2332
Feature activation+7.90e-01

Pos logit contributions
othe+0.078
rier+0.077
vere+0.073
 spine+0.073
ometers+0.072
Neg logit contributions
 saw-0.072
 little-0.069
 colored-0.069
 heard-0.063
 found-0.063
Token,
Token ID11
Feature activation-1.67e-01

Pos logit contributions
 saw+0.030
 colored+0.029
 little+0.028
 heard+0.026
 did+0.026
Neg logit contributions
rier-0.044
othe-0.043
 spine-0.042
vere-0.042
ometers-0.042
Token "
Token ID366
Feature activation-2.46e+00

Pos logit contributions
rier+0.004
othe+0.004
vere+0.004
 spine+0.004
unes+0.004
Neg logit contributions
 saw-0.002
 little-0.002
 did-0.002
 colored-0.002
 heard-0.002
TokenIt
Token ID1026
Feature activation-4.60e+00

Pos logit contributions
rier+0.065
unes+0.064
othe+0.064
 spine+0.064
ometers+0.063
Neg logit contributions
 saw-0.032
 colored-0.030
 heard-0.027
 little-0.027
 did-0.025
Token is
Token ID318
Feature activation-2.16e+00

Pos logit contributions
othe+0.102
rier+0.101
vere+0.098
ometer+0.098
ometers+0.096
Neg logit contributions
 saw-0.067
 colored-0.067
 little-0.061
 coloured-0.059
 heard-0.058
Token my
Token ID616
Feature activation-7.23e-02

Pos logit contributions
othe+0.048
rier+0.048
vere+0.046
ometers+0.045
 spine+0.045
Neg logit contributions
 saw-0.032
 colored-0.032
 little-0.029
 heard-0.028
 coloured-0.028
Token block
Token ID2512
Feature activation-6.54e+00

Pos logit contributions
othe+0.002
rier+0.002
vere+0.002
ometers+0.002
uts+0.002
Neg logit contributions
 saw-0.001
 colored-0.001
 little-0.001
 heard-0.001
 coloured-0.001
Token!"
Token ID2474
Feature activation-2.75e+00

Pos logit contributions
othe+0.169
rier+0.166
vere+0.162
ometers+0.161
 spine+0.160
Neg logit contributions
 saw-0.080
 colored-0.077
 little-0.075
 did-0.066
 heard-0.066
Token "
Token ID366
Feature activation-1.06e+00

Pos logit contributions
rier+0.057
vere+0.056
othe+0.055
ometers+0.055
unes+0.054
Neg logit contributions
 saw-0.048
 little-0.046
 colored-0.044
 heard-0.042
 a-0.042
TokenNo
Token ID2949
Feature activation+7.13e-01

Pos logit contributions
unes+0.027
 spine+0.026
rier+0.026
ometers+0.026
ometer+0.025
Neg logit contributions
 saw-0.015
 colored-0.015
 little-0.013
 heard-0.013
 did-0.013
Token,
Token ID11
Feature activation-2.43e+00

Pos logit contributions
 colored+0.009
 saw+0.009
ptions+0.009
 found+0.008
 coloured+0.008
Neg logit contributions
othe-0.017
rier-0.016
vere-0.016
ometer-0.016
 spine-0.015
Token it
Token ID340
Feature activation-4.06e+00

Pos logit contributions
othe+0.046
rier+0.046
vere+0.043
 spine+0.042
ometers+0.042
Neg logit contributions
 colored-0.046
 saw-0.045
 coloured-0.041
 little-0.040
 heard-0.040