TOP ACTIVATIONS
MAX = 0.221

They lifted him onto a stret cher and put him in
They put Lily on a stret cher and carried her to
the elderly man on a stret cher . She thanked Tim
with a man on a stret cher . The man 's
adventure to explore a big cliff . It was so high
loved to play make - bel ieve . Today she was
to the edge of the cliff and fell off . She
He would race make - bel ieve cars around the track
to the edge of the cliff and looked down . He
, he noticed a giant cliff ! The cloth was now
, they found a big cliff near their house . They
her friends went to the cliff . They had a brilliant
he suddenly fell off a cliff . He was hurt and
put it in a v ase and watered it every day
was ready for make - bel ieve fun . <|endoftext|> Once
" Please keep your seat bel ts on . We will
It was a very high cliff , but she was brave
pulled him away from the cliff just in time . The
was her royal s ce pter . As they
she started walking towards the cliff . Scott decided

MIN ACTIVATIONS
MIN = -0.267

the owner smiled . " Looks like you two are friends
smiled . " Looks like the frog likes you
's mom my . " Looks like the wave destroyed all
smiled and said , " Looks like you like the wardrobe
table . " Looks like mom has some mail
laughed and said , " Looks like that apple made you
smiled and said , " Great job , Tim my !
All their efforts were in vain and the ball was still
happily . " Great job , sweet ie ,"
The bee was proud and vain . He said , "
all his efforts were in vain . The monster destroyed everything
. And that 's why E amon always tried to be
won the race . " Great job , Tim my !"
happy . " Great shot , Tim !" his
smiled and said , " Great idea !" The
's mom said , " Great job , kids ! It
it and said , " Great job , Jill . You
all her efforts were in vain - all the exits were
Dad said , " Great job , Joe . You
M ama said , " Great job Henry ! We are

INTERVAL 0.099 - 0.221
CONTAINS 13.889%

up with sunshine . The family was happy to
't want to play with balls . It
soon they found a big tree that was full of berries
job , Tim !" In the water , they
the car , he needed to find the right fuel .

INTERVAL -0.023 - 0.099
CONTAINS 47.141%

the sand . Big T came over and said , "
the stairs . The stairs were wet and slippery . Anna
but you have to wait . I have to go to
time , there were two twins . They were very old
" Fine , Mom . I will try ." Max says

INTERVAL -0.145 - -0.023
CONTAINS 36.840%

and hugged her . He was sorry for her . He
end . <|endoftext|> Once upon a time , there was a
to clean her room , so she didn 't . Her
reach the stars ! She was so happy that she laughed
laughed and said , " That was fun !" Ben waited

INTERVAL -0.267 - -0.145
CONTAINS 2.130%

scared . He had a great time , learning new things
talk . Lily is very happy . She wants to show
for sale !" Benny was happy because he could buy lots
The little boy was so relieved and he felt reminded not
" No , silly , ze br as are

TOKEN BREAKDOWN

Token They
Token ID1119
Feature activation-4.15e-02

Pos logit contributions
 brothers+0.020
lee+0.018
 mistakes+0.018
 siblings+0.018
 adorable+0.018
Neg logit contributions
Circ-0.021
 shortcut-0.020
fall-0.020
 gra-0.020
ground-0.019
Token lifted
Token ID13663
Feature activation-1.32e-02

Pos logit contributions
 mistakes+0.059
 brothers+0.058
lee+0.058
heart+0.057
 important+0.056
Neg logit contributions
Circ-0.081
fall-0.077
ground-0.076
Wait-0.075
Ready-0.075
Token him
Token ID683
Feature activation-1.96e-02

Pos logit contributions
lee+0.026
ushing+0.026
aky+0.026
ious+0.026
ity+0.026
Neg logit contributions
Circ-0.019
Ready-0.017
ground-0.016
Wait-0.016
 stairs-0.016
Token onto
Token ID4291
Feature activation+7.59e-02

Pos logit contributions
 mistakes+0.026
 brothers+0.026
 important+0.025
 adorable+0.025
ity+0.025
Neg logit contributions
Circ-0.040
-0.039
Ready-0.038
Wait-0.038
fall-0.038
Token a
Token ID257
Feature activation-3.03e-02

Pos logit contributions
Circ+0.135
fall+0.129
Ready+0.126
Wait+0.126
+0.125
Neg logit contributions
 brothers-0.123
 mistakes-0.120
lee-0.118
 important-0.115
 adorable-0.115
Token stret
Token ID5759
Feature activation+2.21e-01

Pos logit contributions
 adorable+0.060
 important+0.060
aky+0.060
 brothers+0.060
 mistakes+0.059
Neg logit contributions
Circ-0.042
fall-0.038
Ready-0.038
Wait-0.038
-0.038
Tokencher
Token ID2044
Feature activation+5.13e-03

Pos logit contributions
Circ+0.332
 stairs+0.291
 gra+0.287
Wait+0.286
Find+0.284
Neg logit contributions
 brothers-0.432
aky-0.410
 siblings-0.409
ious-0.409
 careful-0.405
Token and
Token ID290
Feature activation-3.03e-02

Pos logit contributions
Circ+0.012
fall+0.011
Ready+0.011
ground+0.011
Wait+0.011
Neg logit contributions
 brothers-0.006
 mistakes-0.005
lee-0.005
 important-0.005
heart-0.005
Token put
Token ID1234
Feature activation-1.28e-02

Pos logit contributions
 brothers+0.046
lee+0.044
 important+0.043
 mistakes+0.043
 adorable+0.043
Neg logit contributions
Circ-0.058
fall-0.055
-0.055
ground-0.054
Ready-0.054
Token him
Token ID683
Feature activation-1.91e-02

Pos logit contributions
 brothers+0.020
 mistakes+0.019
lee+0.019
 adorable+0.019
 important+0.019
Neg logit contributions
Circ-0.024
fall-0.023
-0.022
ground-0.022
Find-0.022
Token in
Token ID287
Feature activation+1.18e-01

Pos logit contributions
 brothers+0.034
 mistakes+0.033
 important+0.033
aky+0.033
 adorable+0.033
Neg logit contributions
Circ-0.031
fall-0.029
-0.029
Ready-0.029
Wait-0.028
Token They
Token ID1119
Feature activation-4.18e-02

Pos logit contributions
 brothers+0.024
lee+0.024
 mistakes+0.023
 important+0.023
 adorable+0.023
Neg logit contributions
Circ-0.015
fall-0.014
Ready-0.014
 gra-0.013
Wait-0.013
Token put
Token ID1234
Feature activation-1.20e-02

Pos logit contributions
 brothers+0.063
 mistakes+0.062
lee+0.062
heart+0.060
 important+0.060
Neg logit contributions
Circ-0.079
fall-0.074
ground-0.073
Ready-0.073
Wait-0.073
Token Lily
Token ID20037
Feature activation-1.37e-01

Pos logit contributions
 brothers+0.018
 mistakes+0.017
lee+0.017
 important+0.017
 adorable+0.017
Neg logit contributions
Circ-0.023
fall-0.022
Ready-0.022
ground-0.022
Wait-0.021
Token on
Token ID319
Feature activation+1.24e-01

Pos logit contributions
 brothers+0.209
 mistakes+0.205
lee+0.204
 important+0.199
 adorable+0.198
Neg logit contributions
Circ-0.256
fall-0.242
Ready-0.238
ground-0.236
Wait-0.235
Token a
Token ID257
Feature activation-3.02e-02

Pos logit contributions
Circ+0.222
fall+0.208
ground+0.205
Ready+0.204
Wait+0.203
Neg logit contributions
 brothers-0.200
lee-0.197
 mistakes-0.195
heart-0.192
ity-0.191
Token stret
Token ID5759
Feature activation+2.18e-01

Pos logit contributions
 important+0.061
 adorable+0.060
aky+0.059
 brothers+0.058
 mistakes+0.057
Neg logit contributions
Circ-0.043
Ready-0.039
-0.039
fall-0.039
Wait-0.038
Tokencher
Token ID2044
Feature activation+4.41e-03

Pos logit contributions
Circ+0.286
fall+0.256
Ready+0.252
ground+0.251
 gra+0.248
Neg logit contributions
 brothers-0.461
 mistakes-0.445
lee-0.443
aky-0.441
 siblings-0.439
Token and
Token ID290
Feature activation-3.12e-02

Pos logit contributions
Circ+0.010
fall+0.010
ground+0.010
Ready+0.010
Wait+0.010
Neg logit contributions
 brothers-0.005
lee-0.005
 mistakes-0.004
 important-0.004
heart-0.004
Token carried
Token ID5281
Feature activation-3.29e-02

Pos logit contributions
 brothers+0.048
lee+0.046
 mistakes+0.046
 important+0.045
 adorable+0.045
Neg logit contributions
Circ-0.059
fall-0.055
-0.055
ground-0.055
Ready-0.054
Token her
Token ID607
Feature activation-6.09e-02

Pos logit contributions
 brothers+0.038
 mistakes+0.037
lee+0.037
 important+0.036
heart+0.035
Neg logit contributions
Circ-0.073
fall-0.070
Ready-0.069
ground-0.069
-0.069
Token to
Token ID284
Feature activation+1.51e-01

Pos logit contributions
 brothers+0.086
 mistakes+0.085
 important+0.084
 adorable+0.083
 feelings+0.083
Neg logit contributions
Circ-0.129
Ready-0.121
fall-0.121
Wait-0.120
-0.118
Token the
Token ID262
Feature activation+2.81e-02

Pos logit contributions
 brothers+0.017
 mistakes+0.015
 important+0.015
lee+0.015
 adorable+0.015
Neg logit contributions
Circ-0.021
fall-0.020
-0.020
ground-0.020
Find-0.020
Token elderly
Token ID13830
Feature activation-1.04e-01

Pos logit contributions
Circ+0.054
fall+0.051
+0.050
ground+0.050
Ready+0.049
Neg logit contributions
 brothers-0.044
 mistakes-0.043
 important-0.042
 adorable-0.042
 feelings-0.041
Token man
Token ID582
Feature activation+6.07e-02

Pos logit contributions
 brothers+0.193
 mistakes+0.185
 important+0.183
 siblings+0.180
 adorable+0.180
Neg logit contributions
Circ-0.173
Ready-0.157
-0.155
fall-0.155
ground-0.153
Token on
Token ID319
Feature activation+1.24e-01

Pos logit contributions
Circ+0.105
fall+0.099
+0.098
Ready+0.097
Wait+0.096
Neg logit contributions
 brothers-0.103
 mistakes-0.100
lee-0.098
 important-0.098
aky-0.098
Token a
Token ID257
Feature activation-2.75e-02

Pos logit contributions
Circ+0.246
fall+0.235
Ready+0.230
Wait+0.230
Find+0.229
Neg logit contributions
 brothers-0.177
 mistakes-0.174
lee-0.167
 important-0.166
 adorable-0.164
Token stret
Token ID5759
Feature activation+2.18e-01

Pos logit contributions
 important+0.060
 adorable+0.059
 brothers+0.058
 mistakes+0.058
aky+0.058
Neg logit contributions
Circ-0.035
Ready-0.032
Wait-0.031
fall-0.031
-0.030
Tokencher
Token ID2044
Feature activation+4.92e-03

Pos logit contributions
Circ+0.325
 stairs+0.284
fall+0.280
Wait+0.279
Ready+0.278
Neg logit contributions
 brothers-0.435
 siblings-0.413
aky-0.410
 mistakes-0.407
ious-0.405
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
Circ+0.011
Ready+0.011
fall+0.011
Wait+0.011
+0.010
Neg logit contributions
 brothers-0.006
 mistakes-0.005
lee-0.005
aky-0.005
ity-0.005
Token She
Token ID1375
Feature activation-7.15e-02

Pos logit contributions
 brothers+0.016
lee+0.016
 mistakes+0.015
 darling+0.015
 adorable+0.015
Neg logit contributions
Circ-0.022
ground-0.021
Find-0.020
 shortcut-0.020
Ready-0.020
Token thanked
Token ID26280
Feature activation-6.61e-02

Pos logit contributions
 brothers+0.093
lee+0.087
 important+0.085
 mistakes+0.085
 adorable+0.084
Neg logit contributions
Circ-0.154
fall-0.148
Ready-0.145
ground-0.144
-0.144
Token Tim
Token ID5045
Feature activation-1.22e-01

Pos logit contributions
 brothers+0.088
lee+0.085
 mistakes+0.084
heart+0.083
 adorable+0.080
Neg logit contributions
Circ-0.138
fall-0.132
Ready-0.131
ground-0.129
Wait-0.129
Token with
Token ID351
Feature activation+2.77e-02

Pos logit contributions
Circ+0.222
fall+0.211
ground+0.207
+0.203
Wait+0.202
Neg logit contributions
 brothers-0.162
 mistakes-0.160
lee-0.158
ity-0.157
heart-0.156
Token a
Token ID257
Feature activation-2.88e-02

Pos logit contributions
Circ+0.052
fall+0.049
Ready+0.049
ground+0.048
Wait+0.048
Neg logit contributions
 brothers-0.041
lee-0.041
 mistakes-0.041
heart-0.040
ity-0.040
Token man
Token ID582
Feature activation+5.99e-02

Pos logit contributions
 brothers+0.039
 important+0.039
 adorable+0.038
 mistakes+0.037
 daughter+0.036
Neg logit contributions
Circ-0.060
Ready-0.056
fall-0.056
ground-0.056
Find-0.054
Token on
Token ID319
Feature activation+1.22e-01

Pos logit contributions
Circ+0.123
fall+0.117
ground+0.115
Ready+0.115
Wait+0.114
Neg logit contributions
 brothers-0.080
 mistakes-0.078
lee-0.078
 important-0.076
heart-0.075
Token a
Token ID257
Feature activation-2.82e-02

Pos logit contributions
Circ+0.254
Ready+0.247
fall+0.242
Wait+0.241
Find+0.239
Neg logit contributions
 brothers-0.170
 mistakes-0.162
lee-0.159
 adorable-0.157
 important-0.155
Token stret
Token ID5759
Feature activation+2.17e-01

Pos logit contributions
 important+0.053
 adorable+0.052
 brothers+0.052
 mistakes+0.051
lee+0.051
Neg logit contributions
Circ-0.044
fall-0.040
Ready-0.040
Find-0.039
ground-0.039
Tokencher
Token ID2044
Feature activation+2.81e-03

Pos logit contributions
Circ+0.317
 gra+0.274
fall+0.270
 stairs+0.270
Ready+0.269
Neg logit contributions
 brothers-0.437
lee-0.419
 siblings-0.414
aky-0.412
 mistakes-0.412
Token.
Token ID13
Feature activation-3.71e-03

Pos logit contributions
Circ+0.007
Ready+0.006
Wait+0.006
fall+0.006
Find+0.006
Neg logit contributions
 brothers-0.003
lee-0.003
 mistakes-0.003
ity-0.003
aky-0.003
Token The
Token ID383
Feature activation-4.85e-02

Pos logit contributions
 brothers+0.007
lee+0.007
 mistakes+0.007
 important+0.007
 adorable+0.007
Neg logit contributions
Circ-0.006
fall-0.005
Ready-0.005
ground-0.005
Wait-0.005
Token man
Token ID582
Feature activation+5.94e-02

Pos logit contributions
 brothers+0.075
 mistakes+0.071
 important+0.071
 adorable+0.070
lee+0.069
Neg logit contributions
Circ-0.096
fall-0.089
ground-0.088
-0.087
Ready-0.087
Token's
Token ID338
Feature activation-6.16e-02

Pos logit contributions
Circ+0.111
fall+0.105
ground+0.103
Find+0.103
Wait+0.102
Neg logit contributions
 mistakes-0.089
 brothers-0.087
lee-0.087
heart-0.086
ity-0.084
Token adventure
Token ID8855
Feature activation+7.49e-02

Pos logit contributions
ushing+0.135
 brothers+0.133
ity+0.131
 mistakes+0.129
 siblings+0.128
Neg logit contributions
 hike-0.086
 hunt-0.083
Circ-0.081
 shortcut-0.081
 stairs-0.080
Token to
Token ID284
Feature activation+1.54e-01

Pos logit contributions
Circ+0.163
Ready+0.153
Find+0.151
Wait+0.150
fall+0.149
Neg logit contributions
 brothers-0.096
 important-0.092
lee-0.091
 mistakes-0.091
heart-0.089
Token explore
Token ID7301
Feature activation+6.35e-02

Pos logit contributions
Circ+0.285
+0.270
fall+0.269
Wait+0.266
Find+0.258
Neg logit contributions
 brothers-0.239
 important-0.231
 adorable-0.231
 mistakes-0.223
lee-0.221
Token a
Token ID257
Feature activation-2.17e-02

Pos logit contributions
Circ+0.118
fall+0.111
Ready+0.109
ground+0.109
Wait+0.108
Neg logit contributions
 brothers-0.098
lee-0.096
 mistakes-0.096
 important-0.093
heart-0.093
Token big
Token ID1263
Feature activation-7.03e-02

Pos logit contributions
 brothers+0.033
lee+0.033
 mistakes+0.032
heart+0.031
ity+0.031
Neg logit contributions
Circ-0.041
fall-0.038
Ready-0.038
ground-0.038
Wait-0.037
Token cliff
Token ID19516
Feature activation+2.01e-01

Pos logit contributions
 brothers+0.156
 mistakes+0.155
lee+0.155
ushing+0.152
ity+0.152
Neg logit contributions
Circ-0.080
fall-0.073
ground-0.073
Ready-0.073
 shortcut-0.070
Token.
Token ID13
Feature activation+2.47e-04

Pos logit contributions
Circ+0.467
Ready+0.432
Find+0.431
fall+0.429
Wait+0.428
Neg logit contributions
 brothers-0.231
 important-0.222
 mistakes-0.221
lee-0.218
heart-0.217
Token It
Token ID632
Feature activation+7.04e-02

Pos logit contributions
Circ+0.000
fall+0.000
Wait+0.000
 shortcut+0.000
Find+0.000
Neg logit contributions
 brothers-0.000
lee-0.000
 mistakes-0.000
heart-0.000
 adorable-0.000
Token was
Token ID373
Feature activation-1.94e-02

Pos logit contributions
Circ+0.129
fall+0.121
+0.119
Ready+0.119
ground+0.118
Neg logit contributions
 brothers-0.112
lee-0.109
 mistakes-0.109
 important-0.108
heart-0.107
Token so
Token ID523
Feature activation-3.66e-02

Pos logit contributions
 mistakes+0.027
 brothers+0.027
lee+0.027
ity+0.026
ushing+0.026
Neg logit contributions
Circ-0.038
fall-0.036
Wait-0.036
ground-0.036
Ready-0.036
Token high
Token ID1029
Feature activation-4.72e-02

Pos logit contributions
 mistakes+0.056
 brothers+0.056
ity+0.055
 feelings+0.054
 sisters+0.054
Neg logit contributions
Circ-0.067
fall-0.066
 stairs-0.065
Wait-0.064
ground-0.064
Token loved
Token ID6151
Feature activation-1.13e-01

Pos logit contributions
 mistakes+0.067
 brothers+0.064
ushing+0.064
 sons+0.063
 feelings+0.063
Neg logit contributions
 arrived-0.053
ground-0.052
 gra-0.052
Circ-0.052
fall-0.051
Token to
Token ID284
Feature activation+1.57e-01

Pos logit contributions
 brothers+0.217
lee+0.216
 important+0.216
 mistakes+0.215
 adorable+0.214
Neg logit contributions
ground-0.171
-0.168
Ready-0.168
fall-0.165
Circ-0.165
Token play
Token ID711
Feature activation+3.51e-02

Pos logit contributions
Circ+0.191
fall+0.175
Ready+0.170
ground+0.170
Wait+0.168
Neg logit contributions
 brothers-0.341
 mistakes-0.337
lee-0.337
heart-0.329
 important-0.329
Token make
Token ID787
Feature activation+1.97e-02

Pos logit contributions
Circ+0.065
fall+0.062
Wait+0.062
Ready+0.061
+0.060
Neg logit contributions
 brothers-0.054
 adorable-0.053
 mistakes-0.052
 important-0.052
lee-0.051
Token-
Token ID12
Feature activation+4.68e-02

Pos logit contributions
Circ+0.037
 gra+0.034
fall+0.034
 stairs+0.033
ground+0.033
Neg logit contributions
aky-0.030
 brothers-0.029
nie-0.029
ushing-0.029
omo-0.029
Tokenbel
Token ID6667
Feature activation+2.00e-01

Pos logit contributions
Circ+0.062
fall+0.057
Ready+0.056
ground+0.056
Wait+0.055
Neg logit contributions
 brothers-0.097
 mistakes-0.095
lee-0.095
 important-0.093
heart-0.093
Tokenieve
Token ID12311
Feature activation-7.01e-02

Pos logit contributions
Circ+0.393
fall+0.369
Find+0.367
Ready+0.366
ground+0.366
Neg logit contributions
lee-0.279
heart-0.268
 brothers-0.267
dy-0.264
ushing-0.263
Token.
Token ID13
Feature activation+3.08e-04

Pos logit contributions
 brothers+0.107
lee+0.106
 mistakes+0.105
heart+0.102
 important+0.101
Neg logit contributions
Circ-0.131
fall-0.123
ground-0.122
Ready-0.120
Find-0.119
Token Today
Token ID6288
Feature activation-4.46e-03

Pos logit contributions
Circ+0.001
fall+0.000
ground+0.000
Wait+0.000
Ready+0.000
Neg logit contributions
 brothers-0.001
lee-0.001
 mistakes-0.001
heart-0.001
 important-0.001
Token she
Token ID673
Feature activation-1.45e-02

Pos logit contributions
 brothers+0.006
lee+0.006
 mistakes+0.006
aky+0.006
ity+0.006
Neg logit contributions
Circ-0.009
fall-0.008
ground-0.008
Ready-0.008
Wait-0.008
Token was
Token ID373
Feature activation-2.31e-02

Pos logit contributions
 mistakes+0.028
 feelings+0.027
ushing+0.027
 brothers+0.027
ity+0.027
Neg logit contributions
Circ-0.021
 arrived-0.020
Find-0.019
fall-0.019
Ready-0.019
Token to
Token ID284
Feature activation+1.55e-01

Pos logit contributions
 brothers+0.104
lee+0.104
 mistakes+0.104
heart+0.102
 important+0.101
Neg logit contributions
Circ-0.062
Ready-0.055
fall-0.055
Find-0.054
ground-0.054
Token the
Token ID262
Feature activation+2.39e-02

Pos logit contributions
Circ+0.326
fall+0.311
Ready+0.305
Find+0.305
+0.305
Neg logit contributions
 brothers-0.206
 mistakes-0.201
lee-0.195
 adorable-0.194
 feelings-0.190
Token edge
Token ID5743
Feature activation+9.32e-02

Pos logit contributions
Circ+0.038
fall+0.035
ground+0.035
Ready+0.034
Wait+0.034
Neg logit contributions
 brothers-0.044
 mistakes-0.043
lee-0.042
 important-0.042
 adorable-0.041
Token of
Token ID286
Feature activation+8.22e-02

Pos logit contributions
Circ+0.180
fall+0.170
Ready+0.168
ground+0.167
Wait+0.166
Neg logit contributions
 brothers-0.136
 mistakes-0.133
lee-0.133
heart-0.129
 important-0.129
Token the
Token ID262
Feature activation+2.32e-02

Pos logit contributions
Circ+0.171
fall+0.169
Ready+0.163
Wait+0.159
Find+0.158
Neg logit contributions
 brothers-0.114
 mistakes-0.110
lee-0.106
 sisters-0.105
 adorable-0.105
Token cliff
Token ID19516
Feature activation+2.00e-01

Pos logit contributions
Circ+0.032
fall+0.030
Wait+0.029
Find+0.029
Ready+0.029
Neg logit contributions
 brothers-0.047
 mistakes-0.046
lee-0.046
 important-0.045
 adorable-0.045
Token and
Token ID290
Feature activation-1.44e-02

Pos logit contributions
Circ+0.452
fall+0.423
Ready+0.422
Find+0.420
Wait+0.418
Neg logit contributions
 brothers-0.230
 mistakes-0.225
lee-0.223
heart-0.218
 important-0.217
Token fell
Token ID3214
Feature activation-4.25e-02

Pos logit contributions
lee+0.022
 brothers+0.022
 mistakes+0.021
heart+0.021
 important+0.021
Neg logit contributions
Circ-0.028
-0.026
fall-0.025
Find-0.024
ground-0.024
Token off
Token ID572
Feature activation+8.85e-02

Pos logit contributions
 brothers+0.075
lee+0.075
 mistakes+0.074
 important+0.073
 adorable+0.072
Neg logit contributions
Circ-0.071
Ready-0.065
ground-0.065
Find-0.064
-0.063
Token.
Token ID13
Feature activation-1.83e-03

Pos logit contributions
Circ+0.197
Ready+0.186
Wait+0.182
Find+0.182
fall+0.179
Neg logit contributions
 mistakes-0.107
 brothers-0.107
lee-0.106
 important-0.102
 adorable-0.101
Token She
Token ID1375
Feature activation-6.88e-02

Pos logit contributions
lee+0.003
 brothers+0.003
 darling+0.003
 mistakes+0.003
 adorable+0.003
Neg logit contributions
Circ-0.003
 shortcut-0.003
 hunt-0.003
fall-0.003
ground-0.003
Token He
Token ID679
Feature activation-6.10e-02

Pos logit contributions
 brothers+0.009
lee+0.008
 mistakes+0.008
heart+0.008
 important+0.008
Neg logit contributions
Circ-0.007
fall-0.007
ground-0.007
Ready-0.007
Wait-0.007
Token would
Token ID561
Feature activation+3.95e-02

Pos logit contributions
 mistakes+0.104
heart+0.103
ushing+0.102
lee+0.101
ity+0.101
Neg logit contributions
Circ-0.101
Ready-0.096
Find-0.094
ground-0.093
 stairs-0.093
Token race
Token ID3234
Feature activation+6.10e-02

Pos logit contributions
Circ+0.055
fall+0.051
Ready+0.049
ground+0.049
Wait+0.049
Neg logit contributions
 brothers-0.079
lee-0.078
 mistakes-0.078
 important-0.076
heart-0.076
Token make
Token ID787
Feature activation+1.45e-02

Pos logit contributions
Circ+0.116
+0.115
fall+0.112
 gra+0.111
Wait+0.108
Neg logit contributions
 brothers-0.092
 mistakes-0.089
 important-0.088
lee-0.087
 adorable-0.085
Token-
Token ID12
Feature activation+4.77e-02

Pos logit contributions
Circ+0.027
fall+0.025
Ready+0.025
ground+0.025
Wait+0.025
Neg logit contributions
 brothers-0.022
lee-0.022
 mistakes-0.021
heart-0.021
aky-0.021
Tokenbel
Token ID6667
Feature activation+1.99e-01

Pos logit contributions
+0.064
Circ+0.063
ground+0.061
fall+0.061
Ready+0.058
Neg logit contributions
 brothers-0.095
 mistakes-0.095
ity-0.094
aky-0.094
heart-0.093
Tokenieve
Token ID12311
Feature activation-7.19e-02

Pos logit contributions
Circ+0.385
fall+0.364
Ready+0.363
ground+0.362
Find+0.361
Neg logit contributions
lee-0.283
 brothers-0.276
heart-0.271
ushing-0.266
 mistakes-0.266
Token cars
Token ID5006
Feature activation+1.01e-01

Pos logit contributions
 brothers+0.097
 mistakes+0.095
lee+0.093
 important+0.092
 adorable+0.090
Neg logit contributions
Circ-0.149
fall-0.141
Ready-0.140
Wait-0.140
-0.139
Token around
Token ID1088
Feature activation+1.32e-01

Pos logit contributions
Circ+0.248
Wait+0.233
Ready+0.232
+0.232
fall+0.231
Neg logit contributions
 brothers-0.104
 important-0.095
 mistakes-0.093
 adorable-0.092
lee-0.090
Token the
Token ID262
Feature activation+2.32e-02

Pos logit contributions
Circ+0.269
Wait+0.260
fall+0.258
Ready+0.255
+0.252
Neg logit contributions
 brothers-0.188
 mistakes-0.179
 important-0.174
lee-0.174
 adorable-0.170
Token track
Token ID2610
Feature activation+1.01e-01

Pos logit contributions
Circ+0.031
Ready+0.030
ground+0.029
 stairs+0.029
fall+0.028
Neg logit contributions
lee-0.047
heart-0.047
ushing-0.046
 brothers-0.046
 mistakes-0.046
Token to
Token ID284
Feature activation+1.51e-01

Pos logit contributions
 brothers+0.076
 mistakes+0.075
 important+0.074
lee+0.074
 adorable+0.073
Neg logit contributions
Circ-0.049
 gra-0.046
-0.046
Find-0.045
fall-0.045
Token the
Token ID262
Feature activation+2.35e-02

Pos logit contributions
Circ+0.322
br+0.316
fall+0.312
+0.310
Follow+0.306
Neg logit contributions
 brothers-0.192
 mistakes-0.181
 daughter-0.177
lee-0.177
 siblings-0.168
Token edge
Token ID5743
Feature activation+9.22e-02

Pos logit contributions
Circ+0.037
fall+0.035
+0.034
Find+0.033
ground+0.033
Neg logit contributions
 brothers-0.044
 mistakes-0.042
 sisters-0.041
 adorable-0.041
 siblings-0.041
Token of
Token ID286
Feature activation+8.22e-02

Pos logit contributions
Circ+0.188
Ready+0.178
fall+0.178
Find+0.175
ground+0.174
Neg logit contributions
 brothers-0.129
 mistakes-0.124
lee-0.123
heart-0.122
aky-0.120
Token the
Token ID262
Feature activation+1.96e-02

Pos logit contributions
fall+0.185
Circ+0.178
+0.172
Ready+0.171
br+0.169
Neg logit contributions
 brothers-0.106
 siblings-0.092
lee-0.092
 adorable-0.091
 twins-0.090
Token cliff
Token ID19516
Feature activation+1.98e-01

Pos logit contributions
Circ+0.033
fall+0.032
+0.031
Find+0.031
Wait+0.031
Neg logit contributions
 brothers-0.035
 mistakes-0.033
 twins-0.032
 important-0.032
 siblings-0.032
Token and
Token ID290
Feature activation-2.19e-02

Pos logit contributions
Circ+0.469
fall+0.442
Ready+0.442
Find+0.438
Wait+0.437
Neg logit contributions
 brothers-0.211
 mistakes-0.204
lee-0.202
heart-0.195
 important-0.195
Token looked
Token ID3114
Feature activation-3.89e-02

Pos logit contributions
 brothers+0.032
lee+0.032
 adorable+0.029
heart+0.029
 careful+0.029
Neg logit contributions
Circ-0.043
-0.041
fall-0.039
ground-0.039
Find-0.039
Token down
Token ID866
Feature activation+1.25e-01

Pos logit contributions
ity+0.076
 mistakes+0.071
 brothers+0.070
omo+0.070
aky+0.070
Neg logit contributions
fall-0.055
Circ-0.052
 stairs-0.051
-0.051
 shortcut-0.051
Token.
Token ID13
Feature activation-9.95e-03

Pos logit contributions
fall+0.230
Circ+0.227
Wait+0.219
+0.217
 stairs+0.216
Neg logit contributions
ity-0.183
 brothers-0.182
lee-0.181
aky-0.180
 mistakes-0.180
Token He
Token ID679
Feature activation-6.59e-02

Pos logit contributions
 brothers+0.014
lee+0.014
 darling+0.013
 mistakes+0.012
 important+0.012
Neg logit contributions
Circ-0.020
 gra-0.019
ground-0.019
 shortcut-0.019
fall-0.018
Token,
Token ID11
Feature activation-4.46e-03

Pos logit contributions
Circ+0.328
Ready+0.304
+0.300
Find+0.299
Wait+0.297
Neg logit contributions
 brothers-0.186
lee-0.180
 important-0.180
heart-0.178
 mistakes-0.177
Token he
Token ID339
Feature activation-5.08e-03

Pos logit contributions
 brothers+0.007
lee+0.007
 mistakes+0.007
 important+0.007
heart+0.007
Neg logit contributions
Circ-0.008
fall-0.008
Ready-0.008
ground-0.008
Wait-0.008
Token noticed
Token ID6810
Feature activation-2.76e-02

Pos logit contributions
 mistakes+0.008
 brothers+0.008
lee+0.008
heart+0.008
ushing+0.008
Neg logit contributions
Circ-0.009
fall-0.008
Ready-0.008
Wait-0.008
 gra-0.008
Token a
Token ID257
Feature activation-2.23e-02

Pos logit contributions
 brothers+0.043
lee+0.043
 mistakes+0.043
heart+0.041
aky+0.041
Neg logit contributions
Circ-0.050
fall-0.047
ground-0.046
Ready-0.046
Wait-0.046
Token giant
Token ID6175
Feature activation-4.05e-02

Pos logit contributions
 brothers+0.031
 important+0.030
 adorable+0.030
 mistakes+0.030
ity+0.029
Neg logit contributions
Circ-0.045
fall-0.043
ground-0.042
-0.042
Ready-0.042
Token cliff
Token ID19516
Feature activation+1.97e-01

Pos logit contributions
 brothers+0.073
 mistakes+0.072
lee+0.071
 important+0.070
heart+0.070
Neg logit contributions
Circ-0.065
fall-0.060
Ready-0.059
ground-0.059
Wait-0.059
Token!
Token ID0
Feature activation-2.21e-03

Pos logit contributions
Circ+0.399
Ready+0.362
Find+0.358
fall+0.355
+0.353
Neg logit contributions
 brothers-0.291
 important-0.283
lee-0.281
 mistakes-0.280
heart-0.280
Token The
Token ID383
Feature activation-5.58e-02

Pos logit contributions
 brothers+0.004
lee+0.003
 darling+0.003
heart+0.003
 adorable+0.003
Neg logit contributions
Circ-0.004
fall-0.004
 shortcut-0.004
Wait-0.004
bel-0.004
Token cloth
Token ID16270
Feature activation+9.60e-02

Pos logit contributions
lee+0.102
 mistakes+0.102
 brothers+0.101
ushing+0.099
aky+0.098
Neg logit contributions
Circ-0.086
fall-0.080
Ready-0.079
ground-0.078
Wait-0.078
Token was
Token ID373
Feature activation-2.08e-02

Pos logit contributions
Circ+0.164
fall+0.153
Ready+0.151
ground+0.150
Wait+0.149
Neg logit contributions
 brothers-0.162
lee-0.159
 mistakes-0.159
 important-0.154
heart-0.154
Token now
Token ID783
Feature activation-8.30e-02

Pos logit contributions
 brothers+0.027
lee+0.026
 mistakes+0.026
 important+0.025
heart+0.025
Neg logit contributions
Circ-0.044
fall-0.042
Ready-0.041
ground-0.041
Wait-0.041
Token,
Token ID11
Feature activation-7.21e-04

Pos logit contributions
Circ+0.310
Find+0.292
 gra+0.291
Ready+0.289
+0.288
Neg logit contributions
 brothers-0.110
heart-0.100
 important-0.099
lee-0.098
 adorable-0.097
Token they
Token ID484
Feature activation+3.03e-02

Pos logit contributions
 brothers+0.001
 adorable+0.001
heart+0.001
 darling+0.001
 important+0.001
Neg logit contributions
Circ-0.001
Ready-0.001
 shortcut-0.001
fall-0.001
 gra-0.001
Token found
Token ID1043
Feature activation+1.41e-02

Pos logit contributions
Circ+0.051
fall+0.047
Ready+0.047
ground+0.047
Wait+0.047
Neg logit contributions
 mistakes-0.052
 brothers-0.052
lee-0.051
 important-0.050
heart-0.050
Token a
Token ID257
Feature activation-2.71e-02

Pos logit contributions
Circ+0.025
Wait+0.023
Ready+0.023
ground+0.023
fall+0.023
Neg logit contributions
lee-0.023
heart-0.023
 mistakes-0.022
ity-0.022
 brothers-0.022
Token big
Token ID1263
Feature activation-7.13e-02

Pos logit contributions
 important+0.036
 adorable+0.035
 brothers+0.034
 darling+0.033
 mistakes+0.032
Neg logit contributions
Circ-0.057
fall-0.055
ground-0.054
-0.053
Wait-0.053
Token cliff
Token ID19516
Feature activation+1.97e-01

Pos logit contributions
 brothers+0.133
lee+0.132
 mistakes+0.132
heart+0.128
 important+0.128
Neg logit contributions
Circ-0.108
fall-0.100
Ready-0.098
ground-0.098
Wait-0.097
Token near
Token ID1474
Feature activation+7.14e-02

Pos logit contributions
Circ+0.426
Ready+0.395
Find+0.391
fall+0.391
Wait+0.383
Neg logit contributions
 important-0.253
 brothers-0.252
 adorable-0.247
 mistakes-0.245
lee-0.244
Token their
Token ID511
Feature activation-6.39e-02

Pos logit contributions
Circ+0.132
fall+0.127
Ready+0.124
Wait+0.123
+0.123
Neg logit contributions
 brothers-0.111
 mistakes-0.109
lee-0.109
 important-0.106
 adorable-0.105
Token house
Token ID2156
Feature activation+1.57e-01

Pos logit contributions
 brothers+0.109
lee+0.108
 mistakes+0.107
heart+0.107
aky+0.106
Neg logit contributions
Circ-0.107
fall-0.099
ground-0.099
Ready-0.098
 stairs-0.097
Token.
Token ID13
Feature activation-4.90e-03

Pos logit contributions
Circ+0.357
Ready+0.335
fall+0.335
Wait+0.331
Find+0.330
Neg logit contributions
 brothers-0.183
 mistakes-0.178
lee-0.177
 important-0.171
heart-0.169
Token They
Token ID1119
Feature activation-4.38e-02

Pos logit contributions
lee+0.007
 brothers+0.007
 darling+0.006
 adorable+0.006
 mistakes+0.006
Neg logit contributions
Circ-0.010
 shortcut-0.009
fall-0.009
Ready-0.009
Wait-0.009
Token her
Token ID607
Feature activation-6.88e-02

Pos logit contributions
 mistakes+0.021
ushing+0.021
lee+0.020
aky+0.020
 important+0.020
Neg logit contributions
Circ-0.028
fall-0.025
Wait-0.025
 gra-0.025
ground-0.025
Token friends
Token ID2460
Feature activation+3.61e-02

Pos logit contributions
heart+0.069
 mistakes+0.068
lee+0.068
 brothers+0.066
aky+0.066
Neg logit contributions
Circ-0.161
fall-0.153
-0.152
Ready-0.152
ground-0.151
Token went
Token ID1816
Feature activation+4.00e-02

Pos logit contributions
Circ+0.056
fall+0.053
Ready+0.052
ground+0.052
Wait+0.052
Neg logit contributions
 mistakes-0.065
 brothers-0.064
lee-0.063
 important-0.062
ity-0.062
Token to
Token ID284
Feature activation+1.54e-01

Pos logit contributions
Circ+0.064
Wait+0.062
Find+0.061
+0.061
Ready+0.061
Neg logit contributions
 brothers-0.074
lee-0.070
 important-0.070
 sisters-0.069
 friends-0.068
Token the
Token ID262
Feature activation+2.57e-02

Pos logit contributions
Circ+0.292
fall+0.291
Wait+0.289
+0.284
Ready+0.278
Neg logit contributions
 brothers-0.226
 mistakes-0.221
lee-0.215
 adorable-0.214
 daughter-0.211
Token cliff
Token ID19516
Feature activation+1.97e-01

Pos logit contributions
Circ+0.033
fall+0.033
Wait+0.031
+0.031
Find+0.030
Neg logit contributions
 brothers-0.053
 important-0.053
 adorable-0.052
 mistakes-0.051
 sisters-0.050
Token.
Token ID13
Feature activation-5.65e-04

Pos logit contributions
Circ+0.431
fall+0.408
Ready+0.406
Wait+0.402
Find+0.401
Neg logit contributions
 brothers-0.241
 mistakes-0.234
lee-0.234
 important-0.228
heart-0.227
Token They
Token ID1119
Feature activation-3.75e-02

Pos logit contributions
lee+0.001
 brothers+0.001
 adorable+0.001
 darling+0.001
 mistakes+0.001
Neg logit contributions
Circ-0.001
Wait-0.001
 shortcut-0.001
fall-0.001
Ready-0.001
Token had
Token ID550
Feature activation+1.76e-02

Pos logit contributions
 brothers+0.058
lee+0.057
aky+0.054
 adorable+0.054
 important+0.054
Neg logit contributions
Circ-0.071
fall-0.068
Ready-0.066
-0.066
ground-0.065
Token a
Token ID257
Feature activation-2.89e-02

Pos logit contributions
Circ+0.036
fall+0.034
+0.034
Ready+0.034
ground+0.034
Neg logit contributions
 brothers-0.024
lee-0.024
 important-0.023
 mistakes-0.023
 adorable-0.023
Token brilliant
Token ID10457
Feature activation-1.23e-01

Pos logit contributions
 brothers+0.041
lee+0.041
 mistakes+0.041
heart+0.040
ity+0.039
Neg logit contributions
Circ-0.057
fall-0.054
Ready-0.053
ground-0.053
Wait-0.053
Token he
Token ID339
Feature activation-7.05e-03

Pos logit contributions
Circ+0.104
Find+0.092
 gra+0.092
+0.090
fall+0.090
Neg logit contributions
 brothers-0.118
 mistakes-0.111
lee-0.111
 important-0.111
 adorable-0.109
Token suddenly
Token ID6451
Feature activation-6.04e-02

Pos logit contributions
lee+0.011
 brothers+0.010
aky+0.010
 careful+0.010
 years+0.010
Neg logit contributions
Circ-0.014
-0.013
 gra-0.012
 stairs-0.012
stack-0.012
Token fell
Token ID3214
Feature activation-4.48e-02

Pos logit contributions
lee+0.101
 brothers+0.100
aky+0.095
 important+0.095
ity+0.094
Neg logit contributions
Circ-0.108
-0.101
Ready-0.097
Find-0.096
 gra-0.096
Token off
Token ID572
Feature activation+8.68e-02

Pos logit contributions
 brothers+0.070
lee+0.069
 important+0.067
 mistakes+0.067
 adorable+0.066
Neg logit contributions
Circ-0.085
Find-0.078
Ready-0.078
-0.078
ground-0.077
Token a
Token ID257
Feature activation-2.53e-02

Pos logit contributions
Circ+0.197
Ready+0.188
Find+0.184
Wait+0.183
+0.179
Neg logit contributions
 brothers-0.104
lee-0.100
 mistakes-0.100
 important-0.096
 adorable-0.095
Token cliff
Token ID19516
Feature activation+1.97e-01

Pos logit contributions
 brothers+0.047
 important+0.047
 mistakes+0.047
lee+0.046
 adorable+0.046
Neg logit contributions
Circ-0.038
fall-0.036
Ready-0.035
-0.035
Wait-0.034
Token.
Token ID13
Feature activation-1.75e-03

Pos logit contributions
Circ+0.498
Ready+0.462
Find+0.456
+0.451
Wait+0.446
Neg logit contributions
heart-0.185
lee-0.177
 mistakes-0.177
 brothers-0.177
 important-0.175
Token He
Token ID679
Feature activation-6.17e-02

Pos logit contributions
lee+0.003
 brothers+0.003
 darling+0.002
 mistakes+0.002
 adorable+0.002
Neg logit contributions
Circ-0.003
 shortcut-0.003
Wait-0.003
Ready-0.003
bel-0.003
Token was
Token ID373
Feature activation-2.18e-02

Pos logit contributions
 brothers+0.105
 mistakes+0.104
lee+0.103
heart+0.101
 important+0.100
Neg logit contributions
Circ-0.103
fall-0.097
Wait-0.095
Ready-0.095
ground-0.095
Token hurt
Token ID5938
Feature activation-8.42e-02

Pos logit contributions
 brothers+0.024
lee+0.023
 mistakes+0.023
 important+0.022
 adorable+0.022
Neg logit contributions
Circ-0.051
fall-0.049
Ready-0.048
ground-0.048
-0.048
Token and
Token ID290
Feature activation-1.92e-02

Pos logit contributions
 mistakes+0.118
lee+0.117
 daughters+0.115
 brothers+0.114
ity+0.110
Neg logit contributions
fall-0.156
Circ-0.154
 stairs-0.152
ground-0.151
 shortcut-0.150
Token put
Token ID1234
Feature activation-1.11e-02

Pos logit contributions
 mistakes+0.145
heart+0.140
ubby+0.140
ious+0.139
ushing+0.138
Neg logit contributions
 arrived-0.095
Circ-0.090
Wait-0.087
Under-0.086
 walk-0.086
Token it
Token ID340
Feature activation+9.71e-02

Pos logit contributions
 brothers+0.018
 adorable+0.017
lee+0.017
 important+0.017
 mistakes+0.017
Neg logit contributions
Circ-0.020
fall-0.020
ground-0.019
Ready-0.019
 shortcut-0.019
Token in
Token ID287
Feature activation+1.19e-01

Pos logit contributions
Circ+0.171
fall+0.161
Ready+0.160
+0.158
 gra+0.156
Neg logit contributions
 brothers-0.161
 important-0.160
 adorable-0.158
aky-0.158
lee-0.154
Token a
Token ID257
Feature activation-2.68e-02

Pos logit contributions
Circ+0.227
fall+0.214
ground+0.211
Ready+0.211
Wait+0.210
Neg logit contributions
 brothers-0.177
lee-0.174
 mistakes-0.174
heart-0.168
 important-0.168
Token v
Token ID410
Feature activation+5.03e-02

Pos logit contributions
 brothers+0.041
 important+0.041
 adorable+0.041
 mistakes+0.040
lee+0.040
Neg logit contributions
Circ-0.050
fall-0.048
Ready-0.045
Wait-0.045
ground-0.045
Tokenase
Token ID589
Feature activation+1.97e-01

Pos logit contributions
Circ+0.055
fall+0.049
+0.048
ground+0.048
Ready+0.047
Neg logit contributions
 brothers-0.116
 mistakes-0.115
lee-0.115
 important-0.113
heart-0.112
Token and
Token ID290
Feature activation-2.83e-02

Pos logit contributions
Circ+0.396
fall+0.380
ground+0.377
+0.371
Wait+0.369
Neg logit contributions
 brothers-0.263
 mistakes-0.261
lee-0.259
heart-0.250
ity-0.246
Token watered
Token ID48024
Feature activation+3.17e-03

Pos logit contributions
 brothers+0.045
lee+0.044
 mistakes+0.044
 important+0.043
heart+0.043
Neg logit contributions
Circ-0.051
fall-0.048
ground-0.047
Ready-0.047
-0.047
Token it
Token ID340
Feature activation+9.76e-02

Pos logit contributions
Circ+0.006
fall+0.005
+0.005
Wait+0.005
Ready+0.005
Neg logit contributions
 mistakes-0.005
 brothers-0.005
lee-0.005
 important-0.005
heart-0.005
Token every
Token ID790
Feature activation-5.84e-04

Pos logit contributions
Circ+0.167
fall+0.157
Ready+0.155
ground+0.154
Wait+0.153
Neg logit contributions
 brothers-0.165
 mistakes-0.162
lee-0.162
 important-0.158
heart-0.157
Token day
Token ID1110
Feature activation+1.24e-01

Pos logit contributions
 important+0.001
 brothers+0.001
heart+0.001
 mistakes+0.001
 adorable+0.001
Neg logit contributions
Circ-0.001
fall-0.001
Ready-0.001
Find-0.001
ground-0.001
Token was
Token ID373
Feature activation-2.51e-02

Pos logit contributions
 mistakes+0.124
 important+0.119
 brothers+0.118
lee+0.117
 adorable+0.117
Neg logit contributions
Circ-0.142
fall-0.138
 gra-0.136
ground-0.134
Wait-0.132
Token ready
Token ID3492
Feature activation-3.04e-02

Pos logit contributions
 mistakes+0.035
heart+0.034
lee+0.034
 brothers+0.034
ity+0.034
Neg logit contributions
Circ-0.047
fall-0.046
 gra-0.045
ground-0.045
Wait-0.045
Token for
Token ID329
Feature activation+3.90e-02

Pos logit contributions
 brothers+0.054
lee+0.053
 mistakes+0.053
 important+0.052
heart+0.052
Neg logit contributions
Circ-0.050
fall-0.046
Ready-0.046
ground-0.045
Wait-0.045
Token make
Token ID787
Feature activation+1.71e-02

Pos logit contributions
Circ+0.065
fall+0.063
ground+0.063
+0.061
Ready+0.060
Neg logit contributions
lee-0.065
ushing-0.064
aky-0.064
heart-0.064
ity-0.064
Token-
Token ID12
Feature activation+4.87e-02

Pos logit contributions
Circ+0.026
fall+0.026
ground+0.026
 gra+0.025
 stairs+0.023
Neg logit contributions
aky-0.030
ushing-0.029
lee-0.028
omo-0.028
ubby-0.028
Tokenbel
Token ID6667
Feature activation+1.96e-01

Pos logit contributions
Circ+0.080
fall+0.075
Ready+0.074
Wait+0.073
ground+0.073
Neg logit contributions
 brothers-0.085
lee-0.084
 mistakes-0.083
 important-0.081
 adorable-0.080
Tokenieve
Token ID12311
Feature activation-7.59e-02

Pos logit contributions
Circ+0.377
ground+0.363
fall+0.361
Ready+0.357
Find+0.356
Neg logit contributions
lee-0.282
dy-0.264
 brothers-0.264
heart-0.263
ushing-0.262
Token fun
Token ID1257
Feature activation-1.00e-01

Pos logit contributions
lee+0.101
 brothers+0.099
 mistakes+0.096
aky+0.095
ushing+0.095
Neg logit contributions
Circ-0.154
fall-0.147
ground-0.147
 stairs-0.142
Ready-0.141
Token.
Token ID13
Feature activation-5.88e-03

Pos logit contributions
lee+0.137
 brothers+0.135
 mistakes+0.132
aky+0.132
ushing+0.131
Neg logit contributions
Circ-0.200
fall-0.192
ground-0.190
Ready-0.185
 stairs-0.184
Token<|endoftext|>
Token ID50256
Feature activation+1.49e-01

Pos logit contributions
 brothers+0.011
lee+0.011
 mistakes+0.011
heart+0.010
 important+0.010
Neg logit contributions
Circ-0.009
fall-0.008
Find-0.008
Ready-0.008
ground-0.008
TokenOnce
Token ID7454
Feature activation+7.23e-02

Pos logit contributions
Circ+0.162
bel+0.159
 shortcut+0.158
fall+0.151
Find+0.150
Neg logit contributions
 brothers-0.345
 twins-0.318
lee-0.312
aky-0.312
 siblings-0.310
Token "
Token ID366
Feature activation+7.33e-02

Pos logit contributions
 brothers+0.011
lee+0.011
 adorable+0.011
 important+0.011
 darling+0.011
Neg logit contributions
Circ-0.011
Ready-0.010
-0.010
Find-0.010
 shortcut-0.010
TokenPlease
Token ID5492
Feature activation-7.73e-03

Pos logit contributions
 shortcut+0.107
 stairs+0.106
+0.102
 hunt+0.101
 disappears+0.101
Neg logit contributions
heart-0.141
ity-0.133
aky-0.133
nie-0.130
 adorable-0.127
Token keep
Token ID1394
Feature activation-1.32e-02

Pos logit contributions
 brothers+0.012
 darling+0.011
heart+0.011
 mistakes+0.011
 important+0.011
Neg logit contributions
Circ-0.015
Find-0.014
Ready-0.014
 stairs-0.013
Wait-0.013
Token your
Token ID534
Feature activation-8.78e-02

Pos logit contributions
 brothers+0.021
 mistakes+0.020
lee+0.020
 important+0.020
 adorable+0.020
Neg logit contributions
Circ-0.025
fall-0.023
Ready-0.023
Wait-0.023
-0.023
Token seat
Token ID5852
Feature activation+9.60e-02

Pos logit contributions
 brothers+0.160
 mistakes+0.158
lee+0.157
 important+0.155
 adorable+0.153
Neg logit contributions
Circ-0.141
fall-0.131
Ready-0.129
Wait-0.127
ground-0.127
Tokenbel
Token ID6667
Feature activation+1.96e-01

Pos logit contributions
Circ+0.167
fall+0.164
ground+0.163
+0.158
 gra+0.158
Neg logit contributions
lee-0.151
 brothers-0.150
 mistakes-0.149
heart-0.141
 sisters-0.141
Tokents
Token ID912
Feature activation+3.82e-02

Pos logit contributions
Circ+0.378
ground+0.365
fall+0.364
Ready+0.358
Find+0.357
Neg logit contributions
lee-0.281
 brothers-0.268
heart-0.263
dy-0.263
 darling-0.258
Token on
Token ID319
Feature activation+1.26e-01

Pos logit contributions
ground+0.072
Circ+0.070
fall+0.070
 shortcut+0.068
 stairs+0.068
Neg logit contributions
 brothers-0.055
lee-0.055
 mistakes-0.053
 sisters-0.051
 siblings-0.051
Token.
Token ID13
Feature activation+1.18e-03

Pos logit contributions
Circ+0.271
fall+0.255
Wait+0.253
Ready+0.253
+0.251
Neg logit contributions
 brothers-0.162
 mistakes-0.161
lee-0.156
 important-0.156
 adorable-0.152
Token We
Token ID775
Feature activation+4.83e-03

Pos logit contributions
Circ+0.002
fall+0.002
ground+0.002
Wait+0.002
 stairs+0.002
Neg logit contributions
 brothers-0.002
 mistakes-0.002
 darling-0.002
lee-0.002
 adorable-0.002
Token will
Token ID481
Feature activation+1.63e-02

Pos logit contributions
Circ+0.008
fall+0.008
Ready+0.008
ground+0.008
Wait+0.008
Neg logit contributions
 brothers-0.008
 mistakes-0.008
 important-0.008
lee-0.008
aky-0.008
Token It
Token ID632
Feature activation+6.95e-02

Pos logit contributions
 brothers+0.000
lee+0.000
 mistakes+0.000
 darling+0.000
 adorable+0.000
Neg logit contributions
Circ-0.000
 shortcut-0.000
fall-0.000
 gra-0.000
ground-0.000
Token was
Token ID373
Feature activation-2.20e-02

Pos logit contributions
Circ+0.128
fall+0.119
ground+0.118
+0.118
Ready+0.117
Neg logit contributions
 brothers-0.109
lee-0.107
 mistakes-0.106
 important-0.105
heart-0.105
Token a
Token ID257
Feature activation-2.80e-02

Pos logit contributions
 brothers+0.031
lee+0.030
 mistakes+0.030
heart+0.029
 important+0.029
Neg logit contributions
Circ-0.044
fall-0.041
Ready-0.041
ground-0.041
Wait-0.040
Token very
Token ID845
Feature activation-8.43e-02

Pos logit contributions
 brothers+0.042
 mistakes+0.041
lee+0.041
 important+0.041
 adorable+0.040
Neg logit contributions
Circ-0.053
fall-0.049
ground-0.049
Ready-0.048
Wait-0.048
Token high
Token ID1029
Feature activation-4.69e-02

Pos logit contributions
 brothers+0.121
 mistakes+0.119
lee+0.118
ity+0.118
heart+0.116
Neg logit contributions
Circ-0.159
fall-0.158
Wait-0.155
Ready-0.153
 stairs-0.153
Token cliff
Token ID19516
Feature activation+1.95e-01

Pos logit contributions
 brothers+0.082
 mistakes+0.080
lee+0.080
heart+0.078
 important+0.078
Neg logit contributions
Circ-0.078
fall-0.073
Ready-0.072
ground-0.071
Wait-0.071
Token,
Token ID11
Feature activation-2.70e-03

Pos logit contributions
Circ+0.470
Ready+0.440
Find+0.438
fall+0.437
Wait+0.434
Neg logit contributions
 brothers-0.197
 mistakes-0.191
lee-0.191
heart-0.187
 important-0.187
Token but
Token ID475
Feature activation-8.83e-03

Pos logit contributions
 brothers+0.004
lee+0.004
 mistakes+0.004
aky+0.003
ity+0.003
Neg logit contributions
Circ-0.005
fall-0.005
Ready-0.005
ground-0.005
Wait-0.005
Token she
Token ID673
Feature activation-1.74e-02

Pos logit contributions
 brothers+0.015
lee+0.015
 mistakes+0.015
heart+0.014
ity+0.014
Neg logit contributions
Circ-0.015
fall-0.014
Ready-0.014
ground-0.014
Wait-0.014
Token was
Token ID373
Feature activation-2.24e-02

Pos logit contributions
 mistakes+0.029
 brothers+0.028
lee+0.028
heart+0.027
ity+0.027
Neg logit contributions
Circ-0.029
Wait-0.028
fall-0.028
Ready-0.027
Find-0.027
Token brave
Token ID14802
Feature activation-1.60e-01

Pos logit contributions
 brothers+0.025
 mistakes+0.025
lee+0.025
heart+0.024
ity+0.024
Neg logit contributions
Circ-0.050
fall-0.048
Ready-0.048
Wait-0.047
ground-0.047
Token pulled
Token ID5954
Feature activation-2.88e-02

Pos logit contributions
Circ+0.046
fall+0.043
Ready+0.043
ground+0.043
Wait+0.042
Neg logit contributions
 brothers-0.032
lee-0.031
 mistakes-0.030
 adorable-0.029
 important-0.029
Token him
Token ID683
Feature activation-2.75e-02

Pos logit contributions
lee+0.051
ity+0.051
aky+0.050
 brothers+0.050
heart+0.050
Neg logit contributions
Circ-0.046
Ready-0.042
fall-0.042
ground-0.042
Wait-0.042
Token away
Token ID1497
Feature activation+3.26e-02

Pos logit contributions
 brothers+0.044
 mistakes+0.043
 important+0.043
 adorable+0.041
ushing+0.041
Neg logit contributions
Circ-0.051
fall-0.047
-0.047
Ready-0.047
Find-0.047
Token from
Token ID422
Feature activation+7.50e-02

Pos logit contributions
Circ+0.070
fall+0.066
Ready+0.066
Wait+0.065
Find+0.065
Neg logit contributions
 brothers-0.041
 mistakes-0.040
lee-0.040
 important-0.039
heart-0.038
Token the
Token ID262
Feature activation+2.17e-02

Pos logit contributions
Circ+0.139
fall+0.132
Wait+0.129
Ready+0.129
Find+0.128
Neg logit contributions
 brothers-0.121
 mistakes-0.116
lee-0.115
 adorable-0.113
 important-0.113
Token cliff
Token ID19516
Feature activation+1.95e-01

Pos logit contributions
Circ+0.037
fall+0.034
ground+0.033
+0.033
Find+0.033
Neg logit contributions
 brothers-0.040
 important-0.038
 mistakes-0.038
 feelings-0.037
 adorable-0.037
Token just
Token ID655
Feature activation-1.01e-01

Pos logit contributions
Circ+0.435
fall+0.411
Ready+0.408
Wait+0.405
Find+0.404
Neg logit contributions
 brothers-0.230
 mistakes-0.225
lee-0.223
 important-0.216
heart-0.215
Token in
Token ID287
Feature activation+1.21e-01

Pos logit contributions
 important+0.138
 brothers+0.138
 feelings+0.134
 adorable+0.134
lee+0.134
Neg logit contributions
Circ-0.215
-0.196
Find-0.195
ground-0.191
Ready-0.187
Token time
Token ID640
Feature activation+1.08e-01

Pos logit contributions
Circ+0.233
fall+0.223
Ready+0.220
Wait+0.214
Find+0.214
Neg logit contributions
 brothers-0.185
 mistakes-0.178
 important-0.172
lee-0.171
 adorable-0.169
Token.
Token ID13
Feature activation-9.69e-03

Pos logit contributions
Circ+0.240
fall+0.226
Ready+0.225
Wait+0.223
ground+0.222
Neg logit contributions
 brothers-0.127
lee-0.124
 mistakes-0.124
 important-0.119
heart-0.118
Token The
Token ID383
Feature activation-5.34e-02

Pos logit contributions
 brothers+0.014
lee+0.013
 mistakes+0.012
 darling+0.012
 adorable+0.012
Neg logit contributions
Circ-0.019
Wait-0.018
 shortcut-0.018
fall-0.018
Ready-0.018
Token was
Token ID373
Feature activation-2.70e-02

Pos logit contributions
Circ+0.212
fall+0.203
+0.201
Ready+0.199
 gra+0.199
Neg logit contributions
 important-0.123
 adorable-0.120
 brothers-0.120
ity-0.114
lee-0.108
Token her
Token ID607
Feature activation-6.69e-02

Pos logit contributions
 brothers+0.038
lee+0.037
 mistakes+0.037
 important+0.037
 adorable+0.036
Neg logit contributions
Circ-0.054
fall-0.051
Ready-0.050
ground-0.050
-0.050
Token royal
Token ID15100
Feature activation-5.33e-02

Pos logit contributions
 brothers+0.075
 mistakes+0.071
 important+0.070
 adorable+0.069
lee+0.069
Neg logit contributions
Circ-0.156
fall-0.149
Ready-0.147
Wait-0.147
ground-0.146
Token s
Token ID264
Feature activation-1.92e-02

Pos logit contributions
 brothers+0.093
lee+0.088
 mistakes+0.088
 siblings+0.087
 important+0.087
Neg logit contributions
Circ-0.095
Wait-0.088
Ready-0.087
fall-0.085
Find-0.085
Tokence
Token ID344
Feature activation-1.99e-03

Pos logit contributions
lee+0.032
 brothers+0.032
 mistakes+0.031
heart+0.031
ushing+0.030
Neg logit contributions
Circ-0.034
 gra-0.032
 shortcut-0.032
Wait-0.031
Find-0.031
Tokenpter
Token ID42104
Feature activation+1.95e-01

Pos logit contributions
 brothers+0.003
lee+0.003
 mistakes+0.003
ity+0.003
 important+0.003
Neg logit contributions
Circ-0.004
fall-0.004
Ready-0.004
Wait-0.004
Find-0.004
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
Circ+0.390
fall+0.381
ground+0.371
+0.370
Wait+0.365
Neg logit contributions
 mistakes-0.258
 brothers-0.258
lee-0.256
heart-0.245
 important-0.244
Token\n
Token ID198
Feature activation+1.16e-01

Pos logit contributions
 brothers+0.017
lee+0.017
 mistakes+0.016
 adorable+0.016
 siblings+0.016
Neg logit contributions
Circ-0.018
fall-0.017
 shortcut-0.017
ground-0.017
 gra-0.016
Token\n
Token ID198
Feature activation+1.16e-01

Pos logit contributions
Circ+0.154
fall+0.148
Wait+0.139
+0.138
Ready+0.138
Neg logit contributions
 brothers-0.238
 mistakes-0.233
lee-0.232
 important-0.230
aky-0.228
TokenAs
Token ID1722
Feature activation-1.81e-02

Pos logit contributions
Circ+0.205
Ready+0.193
fall+0.192
ground+0.191
Wait+0.190
Neg logit contributions
 brothers-0.192
lee-0.190
 mistakes-0.188
heart-0.183
 important-0.182
Token they
Token ID484
Feature activation+2.52e-02

Pos logit contributions
 brothers+0.031
 mistakes+0.030
 important+0.030
lee+0.030
 adorable+0.030
Neg logit contributions
Circ-0.031
fall-0.029
 gra-0.028
ground-0.028
Ready-0.028
Token she
Token ID673
Feature activation-1.78e-02

Pos logit contributions
 brothers+0.008
 mistakes+0.008
lee+0.008
 important+0.007
heart+0.007
Neg logit contributions
Circ-0.009
fall-0.008
Ready-0.008
ground-0.008
Wait-0.008
Token started
Token ID2067
Feature activation+3.13e-03

Pos logit contributions
 brothers+0.028
 mistakes+0.027
lee+0.027
heart+0.027
 important+0.027
Neg logit contributions
Circ-0.032
fall-0.030
Ready-0.030
ground-0.030
Wait-0.030
Token walking
Token ID6155
Feature activation+5.68e-02

Pos logit contributions
Circ+0.004
Ready+0.004
ground+0.004
Find+0.004
+0.004
Neg logit contributions
 mistakes-0.007
 brothers-0.006
lee-0.006
 important-0.006
 sisters-0.006
Token towards
Token ID3371
Feature activation+4.29e-02

Pos logit contributions
Circ+0.107
fall+0.102
ground+0.100
Ready+0.100
Wait+0.099
Neg logit contributions
 brothers-0.086
lee-0.085
 mistakes-0.084
ity-0.081
 important-0.081
Token the
Token ID262
Feature activation+2.80e-02

Pos logit contributions
Circ+0.085
fall+0.083
+0.081
Ready+0.081
Wait+0.080
Neg logit contributions
 brothers-0.061
 mistakes-0.060
lee-0.060
heart-0.057
 adorable-0.057
Token cliff
Token ID19516
Feature activation+1.95e-01

Pos logit contributions
Circ+0.036
fall+0.034
+0.032
Wait+0.032
ground+0.032
Neg logit contributions
 brothers-0.059
 mistakes-0.058
 important-0.057
lee-0.057
 adorable-0.057
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
Circ+0.430
fall+0.408
Ready+0.404
ground+0.402
Wait+0.401
Neg logit contributions
 brothers-0.232
 mistakes-0.226
lee-0.225
 important-0.217
heart-0.217
Token\n
Token ID198
Feature activation+1.17e-01

Pos logit contributions
lee+0.018
 brothers+0.017
 darling+0.017
 mistakes+0.016
 adorable+0.016
Neg logit contributions
Circ-0.018
 shortcut-0.017
Find-0.017
fall-0.017
Wait-0.017
Token\n
Token ID198
Feature activation+1.17e-01

Pos logit contributions
Circ+0.181
fall+0.178
Wait+0.169
+0.168
Ready+0.167
Neg logit contributions
 brothers-0.211
lee-0.208
 mistakes-0.205
 important-0.202
aky-0.199
TokenScott
Token ID19040
Feature activation-7.15e-02

Pos logit contributions
Circ+0.199
fall+0.195
+0.185
 gra+0.184
 shortcut+0.181
Neg logit contributions
 brothers-0.191
aky-0.187
 mistakes-0.187
 important-0.185
ity-0.183
Token decided
Token ID3066
Feature activation-1.33e-01

Pos logit contributions
 mistakes+0.137
 darling+0.134
ushing+0.132
 adorable+0.130
 daughters+0.130
Neg logit contributions
 arrived-0.101
ground-0.088
walk-0.086
Circ-0.085
 hike-0.085
Token the
Token ID262
Feature activation+2.43e-02

Pos logit contributions
 brothers+0.043
 mistakes+0.043
lee+0.042
heart+0.042
aky+0.041
Neg logit contributions
Circ-0.054
fall-0.051
Ready-0.050
Wait-0.050
ground-0.050
Token owner
Token ID4870
Feature activation+3.69e-02

Pos logit contributions
Circ+0.041
ground+0.038
fall+0.038
+0.036
Find+0.036
Neg logit contributions
 brothers-0.044
 adorable-0.043
 important-0.042
 feelings-0.042
 mistakes-0.042
Token smiled
Token ID13541
Feature activation-1.28e-01

Pos logit contributions
Circ+0.066
fall+0.062
Wait+0.061
+0.061
Find+0.061
Neg logit contributions
 mistakes-0.060
 important-0.056
lee-0.056
 brothers-0.055
heart-0.055
Token.
Token ID13
Feature activation-5.98e-03

Pos logit contributions
ice+0.263
 mistakes+0.262
hew+0.262
nie+0.254
heart+0.247
Neg logit contributions
fall-0.166
Circ-0.151
Wait-0.137
 to-0.135
 towards-0.135
Token "
Token ID366
Feature activation+7.12e-02

Pos logit contributions
 mistakes+0.012
aky+0.012
ity+0.012
 brothers+0.012
heart+0.012
Neg logit contributions
Circ-0.008
fall-0.008
Ready-0.008
-0.008
ground-0.008
TokenLooks
Token ID41102
Feature activation-2.67e-01

Pos logit contributions
Circ+0.103
fall+0.095
Ready+0.095
Wait+0.094
Find+0.092
Neg logit contributions
 brothers-0.140
lee-0.139
 mistakes-0.138
heart-0.134
 important-0.134
Token like
Token ID588
Feature activation-7.25e-02

Pos logit contributions
lee+0.541
omo+0.532
nie+0.507
heart+0.500
illy+0.500
Neg logit contributions
Wait-0.375
Ready-0.372
br-0.361
Circ-0.349
 Whe-0.340
Token you
Token ID345
Feature activation-9.19e-03

Pos logit contributions
 brothers+0.101
 important+0.098
 mistakes+0.097
 adorable+0.096
 feelings+0.093
Neg logit contributions
Circ-0.151
fall-0.144
ground-0.144
-0.143
Ready-0.141
Token two
Token ID734
Feature activation-7.03e-02

Pos logit contributions
 mistakes+0.015
lee+0.015
 brothers+0.015
ushing+0.014
heart+0.014
Neg logit contributions
fall-0.016
Circ-0.016
Wait-0.015
Ready-0.015
Find-0.015
Token are
Token ID389
Feature activation-3.29e-03

Pos logit contributions
lee+0.099
 mistakes+0.093
ushing+0.092
 brothers+0.092
heart+0.090
Neg logit contributions
fall-0.134
Circ-0.132
Wait-0.128
 shortcut-0.128
 gra-0.127
Token friends
Token ID2460
Feature activation+3.58e-02

Pos logit contributions
lee+0.005
ushing+0.005
 mistakes+0.005
heart+0.005
ity+0.005
Neg logit contributions
fall-0.006
 gra-0.005
Circ-0.005
't-0.005
ground-0.005
Token smiled
Token ID13541
Feature activation-1.30e-01

Pos logit contributions
 mistakes+0.106
 brothers+0.104
 important+0.102
lee+0.102
 adorable+0.098
Neg logit contributions
Circ-0.191
fall-0.182
Ready-0.179
Wait-0.179
ground-0.179
Token.
Token ID13
Feature activation-1.16e-02

Pos logit contributions
hew+0.221
 mistakes+0.217
nie+0.215
lee+0.206
ice+0.205
Neg logit contributions
Circ-0.207
fall-0.201
Wait-0.196
 distance-0.185
-0.185
Token\n
Token ID198
Feature activation+1.17e-01

Pos logit contributions
lee+0.023
 brothers+0.023
 mistakes+0.022
 adorable+0.022
 important+0.021
Neg logit contributions
Circ-0.017
ground-0.016
fall-0.016
 shortcut-0.016
 gra-0.015
Token\n
Token ID198
Feature activation+1.17e-01

Pos logit contributions
Circ+0.141
fall+0.131
Ready+0.126
Wait+0.125
ground+0.124
Neg logit contributions
 brothers-0.254
lee-0.252
 mistakes-0.251
 important-0.246
 adorable-0.243
Token"
Token ID1
Feature activation+6.35e-02

Pos logit contributions
Circ+0.187
Ready+0.186
Wait+0.181
ground+0.181
To+0.179
Neg logit contributions
 brothers-0.211
lee-0.208
 mistakes-0.207
 feelings-0.206
heart-0.205
TokenLooks
Token ID41102
Feature activation-2.67e-01

Pos logit contributions
ground+0.095
fall+0.092
 shortcut+0.091
Circ+0.091
 gra+0.089
Neg logit contributions
 adorable-0.119
 brothers-0.117
 important-0.114
 sisters-0.113
aky-0.113
Token like
Token ID588
Feature activation-7.62e-02

Pos logit contributions
omo+0.497
lee+0.487
nie+0.482
heart+0.473
asured+0.458
Neg logit contributions
Wait-0.399
Ready-0.397
Circ-0.391
br-0.386
Find-0.368
Token the
Token ID262
Feature activation+1.94e-02

Pos logit contributions
 brothers+0.097
 mistakes+0.097
 important+0.096
 adorable+0.096
lee+0.093
Neg logit contributions
Circ-0.166
ground-0.160
-0.159
fall-0.158
Ready-0.156
Token frog
Token ID21264
Feature activation+3.25e-02

Pos logit contributions
Circ+0.024
fall+0.022
Ready+0.022
Wait+0.022
ground+0.022
Neg logit contributions
 brothers-0.042
lee-0.041
 mistakes-0.041
heart-0.040
 important-0.040
Token likes
Token ID7832
Feature activation-9.15e-02

Pos logit contributions
Circ+0.066
fall+0.063
 gra+0.062
ground+0.062
Ready+0.061
Neg logit contributions
lee-0.043
 mistakes-0.042
 brothers-0.042
heart-0.040
ity-0.040
Token you
Token ID345
Feature activation-1.38e-02

Pos logit contributions
 brothers+0.149
 mistakes+0.147
lee+0.144
 important+0.143
 adorable+0.142
Neg logit contributions
Circ-0.163
fall-0.155
-0.154
ground-0.154
Ready-0.153
Token's
Token ID338
Feature activation-6.31e-02

Pos logit contributions
 mistakes+0.211
 important+0.201
heart+0.196
 brothers+0.195
 adorable+0.192
Neg logit contributions
fall-0.229
Circ-0.216
ground-0.207
 gra-0.205
-0.205
Token mom
Token ID1995
Feature activation-2.63e-02

Pos logit contributions
aky+0.112
ice+0.107
ady+0.101
ious+0.101
ity+0.099
Neg logit contributions
 stairs-0.105
-0.098
fall-0.097
 ladder-0.095
 bench-0.095
Tokenmy
Token ID1820
Feature activation-9.94e-02

Pos logit contributions
 mistakes+0.033
 important+0.033
 feelings+0.031
ubby+0.031
 enjoyable+0.030
Neg logit contributions
fall-0.051
 gra-0.049
 ride-0.048
bel-0.048
 arrived-0.047
Token.
Token ID13
Feature activation-5.71e-03

Pos logit contributions
 mistakes+0.148
 important+0.146
 brothers+0.144
 feelings+0.140
 adorable+0.139
Neg logit contributions
fall-0.179
Circ-0.169
 gra-0.167
ground-0.165
-0.165
Token "
Token ID366
Feature activation+6.83e-02

Pos logit contributions
lee+0.010
 darling+0.009
 brothers+0.009
 adorable+0.009
 mistakes+0.009
Neg logit contributions
Circ-0.010
 shortcut-0.009
 gra-0.009
Ready-0.009
 hike-0.009
TokenLooks
Token ID41102
Feature activation-2.66e-01

Pos logit contributions
Circ+0.099
fall+0.095
 shortcut+0.092
ground+0.092
 gra+0.090
Neg logit contributions
 mistakes-0.129
 brothers-0.129
aky-0.129
 adorable-0.127
 important-0.126
Token like
Token ID588
Feature activation-7.43e-02

Pos logit contributions
lee+0.452
omo+0.451
heart+0.442
nie+0.434
illy+0.433
Neg logit contributions
Wait-0.437
Ready-0.429
Circ-0.417
Find-0.414
br-0.412
Token the
Token ID262
Feature activation+2.10e-02

Pos logit contributions
 mistakes+0.110
 brothers+0.110
 important+0.109
lee+0.107
 adorable+0.106
Neg logit contributions
Circ-0.146
fall-0.137
Ready-0.136
-0.135
ground-0.134
Token wave
Token ID6769
Feature activation+8.65e-03

Pos logit contributions
Circ+0.029
fall+0.027
Ready+0.027
Wait+0.027
ground+0.026
Neg logit contributions
lee-0.042
 mistakes-0.041
 brothers-0.041
ushing-0.041
aky-0.041
Token destroyed
Token ID6572
Feature activation-3.57e-02

Pos logit contributions
Circ+0.017
fall+0.016
Ready+0.016
ground+0.016
Wait+0.016
Neg logit contributions
 brothers-0.012
 mistakes-0.012
lee-0.012
heart-0.012
 important-0.012
Token all
Token ID477
Feature activation+2.45e-02

Pos logit contributions
 brothers+0.046
lee+0.045
 mistakes+0.045
aky+0.043
 important+0.043
Neg logit contributions
Circ-0.074
fall-0.071
ground-0.070
Ready-0.070
Wait-0.069
Token smiled
Token ID13541
Feature activation-1.28e-01

Pos logit contributions
 mistakes+0.038
 important+0.036
 feelings+0.036
ubby+0.034
aky+0.034
Neg logit contributions
 arrived-0.040
Circ-0.038
-0.036
fall-0.035
Find-0.035
Token and
Token ID290
Feature activation-2.90e-02

Pos logit contributions
hew+0.251
 mistakes+0.242
ice+0.240
nie+0.239
akers+0.238
Neg logit contributions
fall-0.170
Circ-0.166
 to-0.154
 across-0.152
 towards-0.148
Token said
Token ID531
Feature activation-8.60e-02

Pos logit contributions
 mistakes+0.040
 brothers+0.038
aky+0.037
heart+0.037
ity+0.037
Neg logit contributions
Circ-0.060
fall-0.057
Wait-0.056
-0.055
Ready-0.055
Token,
Token ID11
Feature activation-9.79e-03

Pos logit contributions
 brothers+0.124
aky+0.121
 mistakes+0.120
lee+0.115
heart+0.113
Neg logit contributions
Circ-0.167
fall-0.166
-0.160
Wait-0.156
ground-0.156
Token "
Token ID366
Feature activation+7.09e-02

Pos logit contributions
lee+0.016
 brothers+0.016
 adorable+0.015
 darling+0.015
 important+0.015
Neg logit contributions
Circ-0.018
 shortcut-0.017
Ready-0.017
ground-0.016
Find-0.016
TokenLooks
Token ID41102
Feature activation-2.66e-01

Pos logit contributions
Circ+0.101
Ready+0.092
fall+0.092
Wait+0.091
Find+0.090
Neg logit contributions
 brothers-0.142
lee-0.141
 mistakes-0.140
 important-0.136
heart-0.135
Token like
Token ID588
Feature activation-7.09e-02

Pos logit contributions
omo+0.561
lee+0.558
heart+0.528
pa+0.525
nie+0.519
Neg logit contributions
br-0.335
Ready-0.331
Circ-0.329
Wait-0.323
sc-0.306
Token you
Token ID345
Feature activation-1.05e-02

Pos logit contributions
 brothers+0.106
lee+0.105
 mistakes+0.104
aky+0.102
heart+0.102
Neg logit contributions
Circ-0.133
fall-0.126
Ready-0.124
ground-0.124
Wait-0.123
Token like
Token ID588
Feature activation-7.05e-02

Pos logit contributions
 mistakes+0.017
lee+0.016
ushing+0.016
ity+0.015
heart+0.015
Neg logit contributions
fall-0.018
 arrived-0.017
Circ-0.017
 arrive-0.017
ground-0.017
Token the
Token ID262
Feature activation+2.62e-02

Pos logit contributions
 brothers+0.109
 mistakes+0.108
lee+0.107
 important+0.106
 adorable+0.104
Neg logit contributions
Circ-0.131
fall-0.123
Ready-0.122
Wait-0.121
ground-0.120
Token wardrobe
Token ID35593
Feature activation+1.13e-01

Pos logit contributions
Circ+0.047
 stairs+0.046
Ready+0.045
fall+0.045
 shortcut+0.044
Neg logit contributions
lee-0.042
ushing-0.040
aky-0.039
 brothers-0.039
ity-0.039
Token table
Token ID3084
Feature activation+1.18e-01

Pos logit contributions
Circ+0.034
fall+0.032
Ready+0.031
ground+0.031
Wait+0.031
Neg logit contributions
 brothers-0.039
lee-0.038
 mistakes-0.038
heart-0.037
 important-0.037
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
Circ+0.257
fall+0.245
Ready+0.242
ground+0.240
Wait+0.240
Neg logit contributions
 brothers-0.142
 mistakes-0.139
lee-0.138
 important-0.133
heart-0.132
Token\n
Token ID198
Feature activation+1.20e-01

Pos logit contributions
 brothers+0.023
lee+0.022
 mistakes+0.022
 important+0.021
 adorable+0.021
Neg logit contributions
Circ-0.024
fall-0.023
 gra-0.023
Ready-0.023
ground-0.023
Token\n
Token ID198
Feature activation+1.20e-01

Pos logit contributions
Circ+0.160
fall+0.154
Wait+0.148
Ready+0.147
 gra+0.146
Neg logit contributions
 brothers-0.248
 mistakes-0.240
lee-0.239
 important-0.235
 adorable-0.232
Token"
Token ID1
Feature activation+6.32e-02

Pos logit contributions
Circ+0.220
Ready+0.209
ground+0.207
Wait+0.205
Find+0.204
Neg logit contributions
lee-0.188
 mistakes-0.188
 brothers-0.185
heart-0.185
ity-0.181
TokenLooks
Token ID41102
Feature activation-2.65e-01

Pos logit contributions
 gra+0.093
fall+0.093
+0.092
Circ+0.090
 shortcut+0.088
Neg logit contributions
 brothers-0.119
 siblings-0.115
aky-0.114
 sisters-0.114
 mistakes-0.113
Token like
Token ID588
Feature activation-7.49e-02

Pos logit contributions
omo+0.507
heart+0.499
lee+0.492
nie+0.490
ie+0.476
Neg logit contributions
Ready-0.386
Wait-0.381
Circ-0.375
br-0.373
Find-0.363
Token mom
Token ID1995
Feature activation-2.69e-02

Pos logit contributions
 brothers+0.108
 mistakes+0.106
 important+0.105
lee+0.103
 adorable+0.102
Neg logit contributions
Circ-0.151
fall-0.143
Ready-0.142
Wait-0.141
-0.140
Token has
Token ID468
Feature activation-2.92e-03

Pos logit contributions
lee+0.041
 mistakes+0.040
ity+0.038
 important+0.038
ushing+0.037
Neg logit contributions
fall-0.045
 aisle-0.045
 stairs-0.044
 arrived-0.044
-0.043
Token some
Token ID617
Feature activation+2.50e-02

Pos logit contributions
 brothers+0.004
 mistakes+0.004
lee+0.004
heart+0.004
ity+0.004
Neg logit contributions
Circ-0.005
fall-0.005
ground-0.005
Ready-0.005
Wait-0.005
Token mail
Token ID6920
Feature activation+5.24e-02

Pos logit contributions
Circ+0.040
fall+0.040
 stairs+0.039
ground+0.039
+0.038
Neg logit contributions
lee-0.042
ity-0.041
heart-0.040
ushing-0.040
 brothers-0.040
Token laughed
Token ID13818
Feature activation-1.16e-01

Pos logit contributions
 mistakes+0.169
 brothers+0.166
 important+0.162
lee+0.161
aky+0.159
Neg logit contributions
Circ-0.175
fall-0.167
Wait-0.162
Ready-0.161
ground-0.161
Token and
Token ID290
Feature activation-2.58e-02

Pos logit contributions
lee+0.202
aky+0.199
 sons+0.198
 mistakes+0.196
hew+0.196
Neg logit contributions
fall-0.183
Circ-0.183
Wait-0.171
br-0.166
To-0.163
Token said
Token ID531
Feature activation-8.55e-02

Pos logit contributions
 brothers+0.034
lee+0.033
 mistakes+0.033
 important+0.032
heart+0.032
Neg logit contributions
Circ-0.054
fall-0.051
ground-0.051
Ready-0.051
Wait-0.050
Token,
Token ID11
Feature activation-8.75e-03

Pos logit contributions
 brothers+0.119
 mistakes+0.115
lee+0.114
aky+0.112
heart+0.110
Neg logit contributions
Circ-0.172
fall-0.166
Wait-0.161
Ready-0.160
ground-0.160
Token "
Token ID366
Feature activation+7.26e-02

Pos logit contributions
lee+0.014
 brothers+0.014
 adorable+0.014
 important+0.014
 darling+0.014
Neg logit contributions
Circ-0.016
 shortcut-0.015
ground-0.014
Ready-0.014
 distance-0.014
TokenLooks
Token ID41102
Feature activation-2.65e-01

Pos logit contributions
Circ+0.104
fall+0.096
Ready+0.094
ground+0.094
Wait+0.093
Neg logit contributions
 brothers-0.143
lee-0.141
 mistakes-0.141
 important-0.138
heart-0.137
Token like
Token ID588
Feature activation-7.04e-02

Pos logit contributions
omo+0.514
lee+0.514
nie+0.496
heart+0.492
 brothers+0.486
Neg logit contributions
br-0.369
Wait-0.369
Ready-0.367
Circ-0.357
 Whe-0.345
Token that
Token ID326
Feature activation+2.98e-02

Pos logit contributions
 mistakes+0.092
 brothers+0.092
 important+0.091
 adorable+0.089
 sisters+0.087
Neg logit contributions
Circ-0.150
fall-0.143
ground-0.142
Ready-0.140
-0.140
Token apple
Token ID17180
Feature activation+9.29e-02

Pos logit contributions
Circ+0.060
fall+0.058
Wait+0.057
+0.056
Ready+0.056
Neg logit contributions
lee-0.041
 brothers-0.040
aky-0.039
 mistakes-0.038
ushing-0.038
Token made
Token ID925
Feature activation+2.08e-02

Pos logit contributions
Circ+0.197
Ready+0.185
fall+0.184
ground+0.183
Find+0.182
Neg logit contributions
 brothers-0.121
 mistakes-0.118
 important-0.117
heart-0.115
 adorable-0.115
Token you
Token ID345
Feature activation-1.11e-02

Pos logit contributions
Circ+0.038
fall+0.037
ground+0.035
Wait+0.035
Ready+0.035
Neg logit contributions
lee-0.032
 brothers-0.031
aky-0.031
ushing-0.030
 mistakes-0.030
Token smiled
Token ID13541
Feature activation-1.32e-01

Pos logit contributions
 mistakes+0.173
aky+0.166
 important+0.164
ushing+0.162
heart+0.162
Neg logit contributions
Circ-0.150
 arrived-0.141
fall-0.134
Ready-0.133
ground-0.133
Token and
Token ID290
Feature activation-3.21e-02

Pos logit contributions
ice+0.249
hew+0.249
nie+0.245
 mistakes+0.245
akers+0.239
Neg logit contributions
Circ-0.186
fall-0.185
 to-0.167
Wait-0.163
 ride-0.162
Token said
Token ID531
Feature activation-8.78e-02

Pos logit contributions
 mistakes+0.052
aky+0.051
heart+0.050
 brothers+0.049
ity+0.049
Neg logit contributions
Circ-0.058
fall-0.056
Wait-0.055
Ready-0.053
Find-0.053
Token,
Token ID11
Feature activation-1.32e-02

Pos logit contributions
 brothers+0.114
 mistakes+0.113
lee+0.111
aky+0.110
heart+0.107
Neg logit contributions
Circ-0.183
fall-0.177
Wait-0.171
ground-0.170
Ready-0.170
Token "
Token ID366
Feature activation+6.65e-02

Pos logit contributions
lee+0.021
 brothers+0.021
 adorable+0.020
 important+0.020
 careful+0.020
Neg logit contributions
Circ-0.024
 shortcut-0.022
Ready-0.022
ground-0.022
 stairs-0.022
TokenGreat
Token ID13681
Feature activation-2.39e-01

Pos logit contributions
Circ+0.100
Ready+0.092
fall+0.092
Wait+0.091
Find+0.090
Neg logit contributions
 brothers-0.127
lee-0.126
 mistakes-0.126
 important-0.122
heart-0.121
Token job
Token ID1693
Feature activation+2.23e-02

Pos logit contributions
aky+0.512
ice+0.492
lee+0.479
 sisters+0.472
asured+0.467
Neg logit contributions
fall-0.350
-0.325
 contest-0.309
 shortcut-0.303
 ride-0.303
Token,
Token ID11
Feature activation-1.22e-02

Pos logit contributions
Circ+0.048
fall+0.047
ground+0.046
+0.045
Ready+0.045
Neg logit contributions
 brothers-0.027
lee-0.026
 mistakes-0.026
aky-0.026
heart-0.026
Token Tim
Token ID5045
Feature activation-1.22e-01

Pos logit contributions
aky+0.020
lee+0.020
ity+0.019
ushing+0.019
ady+0.019
Neg logit contributions
Circ-0.021
Ready-0.020
 shortcut-0.020
Find-0.020
fall-0.020
Tokenmy
Token ID1820
Feature activation-1.03e-01

Pos logit contributions
 daughters+0.222
lee+0.218
 mistakes+0.217
 orderly+0.213
 wishes+0.213
Neg logit contributions
 ride-0.144
-0.140
 arrived-0.138
 hunt-0.137
 hill-0.131
Token!
Token ID0
Feature activation-8.66e-03

Pos logit contributions
lee+0.161
aky+0.160
 mistakes+0.160
 daughters+0.153
 sisters+0.152
Neg logit contributions
fall-0.184
Circ-0.172
Ready-0.166
 hunt-0.165
ground-0.165
Token All
Token ID1439
Feature activation+3.12e-02

Pos logit contributions
 brothers+0.008
lee+0.008
 mistakes+0.008
 siblings+0.007
 darling+0.007
Neg logit contributions
Circ-0.009
fall-0.008
Wait-0.008
 gra-0.008
Ready-0.008
Token their
Token ID511
Feature activation-6.33e-02

Pos logit contributions
Circ+0.048
ground+0.044
Wait+0.044
fall+0.044
 stairs+0.043
Neg logit contributions
 brothers-0.055
 mistakes-0.055
lee-0.055
ity-0.054
ushing-0.054
Token efforts
Token ID4040
Feature activation+2.17e-02

Pos logit contributions
heart+0.076
aky+0.070
ushing+0.070
ity+0.069
lee+0.068
Neg logit contributions
Circ-0.136
-0.131
 stairs-0.131
Find-0.130
Wait-0.129
Token were
Token ID547
Feature activation-1.22e-02

Pos logit contributions
Circ+0.042
Ready+0.039
fall+0.038
 gra+0.037
Wait+0.037
Neg logit contributions
 brothers-0.033
lee-0.033
 mistakes-0.032
 important-0.032
ushing-0.031
Token in
Token ID287
Feature activation+1.14e-01

Pos logit contributions
 brothers+0.016
lee+0.016
heart+0.016
 mistakes+0.016
ity+0.015
Neg logit contributions
Circ-0.025
fall-0.023
Wait-0.023
ground-0.023
Find-0.023
Token vain
Token ID23469
Feature activation-2.39e-01

Pos logit contributions
Circ+0.232
fall+0.220
Ready+0.217
ground+0.215
Wait+0.214
Neg logit contributions
 brothers-0.159
 mistakes-0.155
lee-0.154
 important-0.150
heart-0.148
Token and
Token ID290
Feature activation-2.84e-02

Pos logit contributions
aky+0.349
 brothers+0.339
lee+0.325
 adorable+0.325
heart+0.323
Neg logit contributions
fall-0.438
 stairs-0.422
ground-0.420
Circ-0.417
-0.406
Token the
Token ID262
Feature activation+2.14e-02

Pos logit contributions
 brothers+0.044
 mistakes+0.044
lee+0.044
aky+0.043
ity+0.043
Neg logit contributions
Circ-0.050
fall-0.047
Wait-0.047
Ready-0.046
ground-0.046
Token ball
Token ID2613
Feature activation+9.09e-02

Pos logit contributions
Circ+0.036
Ready+0.035
+0.034
Wait+0.034
 stairs+0.034
Neg logit contributions
lee-0.037
heart-0.035
ushing-0.035
ity-0.035
aky-0.035
Token was
Token ID373
Feature activation-2.75e-02

Pos logit contributions
Circ+0.164
fall+0.152
Ready+0.151
ground+0.150
Wait+0.148
Neg logit contributions
 brothers-0.147
 mistakes-0.144
lee-0.144
 important-0.141
 adorable-0.138
Token still
Token ID991
Feature activation-1.14e-01

Pos logit contributions
 brothers+0.036
lee+0.036
 mistakes+0.036
heart+0.035
ity+0.035
Neg logit contributions
Circ-0.056
fall-0.053
ground-0.052
Wait-0.052
Ready-0.052
Token happily
Token ID18177
Feature activation-7.24e-02

Pos logit contributions
aky+0.110
 important+0.108
hew+0.106
ity+0.105
omo+0.105
Neg logit contributions
fall-0.114
Circ-0.108
-0.107
ground-0.104
 gra-0.101
Token.
Token ID13
Feature activation-4.07e-03

Pos logit contributions
 mistakes+0.118
heart+0.117
 important+0.116
 brothers+0.116
hew+0.114
Neg logit contributions
fall-0.125
Circ-0.121
ground-0.109
 ride-0.108
bel-0.107
Token
Token ID198
Feature activation+1.18e-01

Pos logit contributions
 brothers+0.008
lee+0.008
 mistakes+0.008
 siblings+0.007
 adorable+0.007
Neg logit contributions
Circ-0.006
fall-0.006
ground-0.005
Wait-0.005
 shortcut-0.005
Token
Token ID198
Feature activation+1.18e-01

Pos logit contributions
Circ+0.138
fall+0.125
Ready+0.123
ground+0.122
Wait+0.120
Neg logit contributions
 brothers-0.263
 mistakes-0.260
lee-0.259
heart-0.255
 important-0.254
Token"
Token ID1
Feature activation+6.47e-02

Pos logit contributions
Circ+0.203
Ready+0.193
ground+0.190
Find+0.188
Wait+0.187
Neg logit contributions
 brothers-0.201
lee-0.200
 mistakes-0.196
heart-0.193
 important-0.190
TokenGreat
Token ID13681
Feature activation-2.39e-01

Pos logit contributions
Circ+0.101
fall+0.096
ground+0.094
 shortcut+0.093
 gra+0.093
Neg logit contributions
 brothers-0.117
 important-0.114
 adorable-0.114
 mistakes-0.113
aky-0.112
Token job
Token ID1693
Feature activation+2.37e-02

Pos logit contributions
aky+0.501
lee+0.484
asured+0.467
ady+0.458
ice+0.458
Neg logit contributions
fall-0.378
 contest-0.335
-0.326
 prize-0.312
 ride-0.306
Token,
Token ID11
Feature activation-1.11e-02

Pos logit contributions
fall+0.049
Circ+0.048
ground+0.048
 stairs+0.047
+0.047
Neg logit contributions
aky-0.030
 brothers-0.029
lee-0.029
 mistakes-0.028
ity-0.028
Token sweet
Token ID6029
Feature activation-2.00e-01

Pos logit contributions
aky+0.017
lee+0.017
 brothers+0.016
ity+0.016
ushing+0.016
Neg logit contributions
Ready-0.020
Circ-0.020
fall-0.019
Find-0.019
ground-0.019
Tokenie
Token ID494
Feature activation-3.52e-02

Pos logit contributions
asured+0.141
aky+0.127
 brothers+0.120
ushing+0.117
 feelings+0.115
Neg logit contributions
fall-0.531
Circ-0.511
 gra-0.499
Wait-0.498
 stairs-0.497
Token,"
Token ID553
Feature activation+1.84e-02

Pos logit contributions
 mistakes+0.054
lee+0.054
ushing+0.053
asured+0.052
 feelings+0.052
Neg logit contributions
fall-0.062
ground-0.058
 gra-0.055
 stairs-0.055
br-0.055
TokenThe
Token ID464
Feature activation-5.27e-02

Pos logit contributions
Circ+0.188
Ready+0.177
Wait+0.174
fall+0.173
ground+0.172
Neg logit contributions
lee-0.221
 brothers-0.219
 mistakes-0.218
heart-0.213
 important-0.211
Token bee
Token ID20697
Feature activation+7.16e-02

Pos logit contributions
lee+0.092
 brothers+0.091
 mistakes+0.091
heart+0.089
ushing+0.088
Neg logit contributions
Circ-0.087
fall-0.081
Ready-0.081
Wait-0.080
ground-0.079
Token was
Token ID373
Feature activation-3.05e-02

Pos logit contributions
Circ+0.141
fall+0.135
ground+0.132
Wait+0.132
Ready+0.132
Neg logit contributions
 brothers-0.099
 mistakes-0.098
lee-0.097
 important-0.094
heart-0.093
Token proud
Token ID6613
Feature activation-1.52e-01

Pos logit contributions
heart+0.036
 mistakes+0.035
ity+0.035
ushing+0.034
lee+0.033
Neg logit contributions
Circ-0.063
 gra-0.063
fall-0.062
ground-0.061
Wait-0.061
Token and
Token ID290
Feature activation-2.96e-02

Pos logit contributions
 mistakes+0.240
lee+0.240
 brothers+0.237
ity+0.237
aky+0.236
Neg logit contributions
Circ-0.259
fall-0.246
Wait-0.239
 stairs-0.238
ground-0.237
Token vain
Token ID23469
Feature activation-2.39e-01

Pos logit contributions
 mistakes+0.042
aky+0.039
heart+0.039
 brothers+0.039
lee+0.039
Neg logit contributions
Circ-0.057
Wait-0.055
fall-0.055
 gra-0.055
Ready-0.055
Token.
Token ID13
Feature activation-1.36e-02

Pos logit contributions
aky+0.373
 brothers+0.369
 mistakes+0.362
lee+0.362
heart+0.361
Neg logit contributions
fall-0.409
Circ-0.396
 stairs-0.384
Wait-0.380
ground-0.379
Token He
Token ID679
Feature activation-6.62e-02

Pos logit contributions
ity+0.024
 brothers+0.024
aky+0.024
 mistakes+0.024
lee+0.024
Neg logit contributions
Circ-0.021
Ready-0.020
fall-0.020
-0.020
Wait-0.019
Token said
Token ID531
Feature activation-9.12e-02

Pos logit contributions
 mistakes+0.096
 brothers+0.094
heart+0.093
lee+0.093
ity+0.090
Neg logit contributions
Circ-0.127
fall-0.120
Ready-0.119
Wait-0.119
Find-0.119
Token,
Token ID11
Feature activation-1.04e-02

Pos logit contributions
 brothers+0.121
 mistakes+0.119
lee+0.118
aky+0.118
heart+0.117
Neg logit contributions
Circ-0.184
fall-0.178
Wait-0.172
Find-0.170
ground-0.170
Token "
Token ID366
Feature activation+6.73e-02

Pos logit contributions
 brothers+0.016
lee+0.016
 important+0.015
 careful+0.014
 adorable+0.014
Neg logit contributions
Circ-0.020
 shortcut-0.019
Ready-0.018
ground-0.018
 stairs-0.018
Token all
Token ID477
Feature activation+2.39e-02

Pos logit contributions
Circ+0.046
+0.043
fall+0.043
ground+0.043
Ready+0.042
Neg logit contributions
 brothers-0.045
 mistakes-0.044
 important-0.043
 adorable-0.042
 feelings-0.041
Token his
Token ID465
Feature activation-7.25e-02

Pos logit contributions
Circ+0.040
fall+0.038
Ready+0.037
ground+0.037
Wait+0.037
Neg logit contributions
 brothers-0.041
 mistakes-0.041
lee-0.040
 important-0.039
heart-0.039
Token efforts
Token ID4040
Feature activation+2.12e-02

Pos logit contributions
 brothers+0.078
lee+0.077
 mistakes+0.076
heart+0.075
aky+0.073
Neg logit contributions
Circ-0.166
fall-0.159
Ready-0.157
ground-0.156
Wait-0.156
Token were
Token ID547
Feature activation-1.43e-02

Pos logit contributions
Circ+0.042
fall+0.038
Ready+0.038
+0.037
 gra+0.037
Neg logit contributions
 brothers-0.033
 important-0.032
 mistakes-0.032
lee-0.031
 feelings-0.031
Token in
Token ID287
Feature activation+1.14e-01

Pos logit contributions
 brothers+0.018
lee+0.018
 mistakes+0.018
heart+0.017
ity+0.017
Neg logit contributions
Circ-0.030
fall-0.029
Ready-0.028
ground-0.028
Wait-0.028
Token vain
Token ID23469
Feature activation-2.38e-01

Pos logit contributions
Circ+0.226
fall+0.214
Ready+0.210
 gra+0.206
Wait+0.205
Neg logit contributions
 brothers-0.174
 mistakes-0.170
 important-0.168
lee-0.161
 feelings-0.161
Token.
Token ID13
Feature activation-1.36e-02

Pos logit contributions
 brothers+0.314
lee+0.313
aky+0.308
 adorable+0.300
heart+0.298
Neg logit contributions
fall-0.458
Circ-0.456
ground-0.445
 stairs-0.445
 shortcut-0.434
Token The
Token ID383
Feature activation-5.68e-02

Pos logit contributions
 brothers+0.023
 mistakes+0.023
lee+0.023
 important+0.022
heart+0.022
Neg logit contributions
Circ-0.023
fall-0.022
Ready-0.021
ground-0.021
Wait-0.021
Token monster
Token ID9234
Feature activation+6.15e-02

Pos logit contributions
lee+0.081
heart+0.078
ushing+0.076
aky+0.075
ity+0.075
Neg logit contributions
Circ-0.112
Ready-0.109
Wait-0.106
 shortcut-0.106
 stairs-0.106
Token destroyed
Token ID6572
Feature activation-3.81e-02

Pos logit contributions
Circ+0.103
fall+0.100
Wait+0.098
Find+0.097
ground+0.097
Neg logit contributions
lee-0.098
 brothers-0.097
 mistakes-0.097
heart-0.095
ity-0.095
Token everything
Token ID2279
Feature activation+1.29e-01

Pos logit contributions
 brothers+0.060
 mistakes+0.059
lee+0.059
 important+0.057
heart+0.057
Neg logit contributions
Circ-0.069
fall-0.065
Ready-0.064
ground-0.064
Wait-0.064
Token.
Token ID13
Feature activation-1.03e-02

Pos logit contributions
 brothers+0.093
 mistakes+0.091
lee+0.091
 important+0.088
heart+0.088
Neg logit contributions
Circ-0.153
fall-0.145
Ready-0.143
ground-0.142
Wait-0.142
Token And
Token ID843
Feature activation-9.35e-02

Pos logit contributions
 brothers+0.022
lee+0.022
 mistakes+0.022
 important+0.021
heart+0.021
Neg logit contributions
Circ-0.013
fall-0.012
Ready-0.012
Wait-0.012
ground-0.011
Token that
Token ID326
Feature activation+2.47e-02

Pos logit contributions
 brothers+0.178
lee+0.175
 mistakes+0.175
 important+0.170
heart+0.170
Neg logit contributions
Circ-0.140
fall-0.130
Ready-0.128
ground-0.127
Wait-0.126
Token's
Token ID338
Feature activation-6.45e-02

Pos logit contributions
Circ+0.041
Wait+0.039
ground+0.039
Ready+0.038
fall+0.038
Neg logit contributions
lee-0.043
ushing-0.039
aky-0.038
 darling-0.038
 daughters-0.038
Token why
Token ID1521
Feature activation+7.26e-02

Pos logit contributions
lee+0.113
 mistakes+0.105
ushing+0.102
ious+0.101
heart+0.101
Neg logit contributions
Circ-0.096
Wait-0.096
Ready-0.095
ground-0.094
 gra-0.094
Token E
Token ID412
Feature activation-2.38e-01

Pos logit contributions
Circ+0.122
Ready+0.112
Wait+0.110
fall+0.109
+0.109
Neg logit contributions
 brothers-0.133
 mistakes-0.128
 important-0.127
lee-0.125
 siblings-0.123
Tokenamon
Token ID16487
Feature activation-1.15e-01

Pos logit contributions
lee+0.371
 mistakes+0.355
 brothers+0.351
 adorable+0.349
 feelings+0.344
Neg logit contributions
Circ-0.435
fall-0.418
-0.415
ground-0.413
Ready-0.413
Token always
Token ID1464
Feature activation-1.14e-01

Pos logit contributions
lee+0.177
 brothers+0.175
 mistakes+0.173
 adorable+0.171
heart+0.169
Neg logit contributions
Circ-0.199
fall-0.194
ground-0.192
Ready-0.189
Wait-0.189
Token tried
Token ID3088
Feature activation-3.26e-02

Pos logit contributions
 brothers+0.160
 mistakes+0.155
lee+0.153
 important+0.153
 adorable+0.148
Neg logit contributions
Circ-0.234
fall-0.219
ground-0.217
Ready-0.216
-0.215
Token to
Token ID284
Feature activation+1.49e-01

Pos logit contributions
lee+0.067
aky+0.066
 brothers+0.066
heart+0.065
ushing+0.065
Neg logit contributions
Circ-0.042
fall-0.038
Ready-0.037
Wait-0.037
 shortcut-0.036
Token be
Token ID307
Feature activation-1.79e-02

Pos logit contributions
Circ+0.221
fall+0.213
Ready+0.207
ground+0.207
Wait+0.205
Neg logit contributions
lee-0.275
 brothers-0.269
 mistakes-0.262
heart-0.262
aky-0.262
Token won
Token ID1839
Feature activation+1.78e-02

Pos logit contributions
 brothers+0.047
lee+0.045
 important+0.045
 adorable+0.044
 mistakes+0.044
Neg logit contributions
Circ-0.048
-0.045
fall-0.045
ground-0.044
 gra-0.043
Token the
Token ID262
Feature activation+2.16e-02

Pos logit contributions
Circ+0.038
 gra+0.036
Wait+0.035
Ready+0.034
Find+0.034
Neg logit contributions
 mistakes-0.024
 brothers-0.023
 important-0.023
lee-0.022
 adorable-0.021
Token race
Token ID3234
Feature activation+5.51e-02

Pos logit contributions
Circ+0.026
+0.023
fall+0.023
Wait+0.022
 gra+0.022
Neg logit contributions
 important-0.048
 mistakes-0.048
 brothers-0.048
 adorable-0.047
 feelings-0.045
Token.
Token ID13
Feature activation-7.03e-03

Pos logit contributions
Circ+0.129
 gra+0.122
Wait+0.121
fall+0.121
+0.121
Neg logit contributions
 brothers-0.061
 mistakes-0.059
 important-0.058
 adorable-0.054
lee-0.054
Token "
Token ID366
Feature activation+6.91e-02

Pos logit contributions
 brothers+0.010
lee+0.010
 mistakes+0.010
 important+0.010
 adorable+0.010
Neg logit contributions
Circ-0.014
fall-0.013
ground-0.013
Wait-0.013
Ready-0.013
TokenGreat
Token ID13681
Feature activation-2.37e-01

Pos logit contributions
fall+0.093
Circ+0.092
 gra+0.089
ground+0.089
 stairs+0.088
Neg logit contributions
 brothers-0.133
 adorable-0.130
aky-0.129
heart-0.128
 mistakes-0.128
Token job
Token ID1693
Feature activation+2.29e-02

Pos logit contributions
lee+0.459
aky+0.457
 brothers+0.438
 mistakes+0.438
 sisters+0.437
Neg logit contributions
fall-0.364
Circ-0.345
-0.340
 shortcut-0.336
ground-0.332
Token,
Token ID11
Feature activation-1.14e-02

Pos logit contributions
Circ+0.053
fall+0.051
Ready+0.050
ground+0.050
Wait+0.050
Neg logit contributions
 brothers-0.024
lee-0.024
 mistakes-0.024
 important-0.023
heart-0.023
Token Tim
Token ID5045
Feature activation-1.23e-01

Pos logit contributions
aky+0.016
lee+0.016
 mistakes+0.016
ity+0.015
 brothers+0.015
Neg logit contributions
Circ-0.022
Ready-0.022
fall-0.021
ground-0.021
 shortcut-0.021
Tokenmy
Token ID1820
Feature activation-1.04e-01

Pos logit contributions
 mistakes+0.206
 daughters+0.201
lee+0.200
ushing+0.193
 pride+0.193
Neg logit contributions
 ride-0.179
 hill-0.172
 hunt-0.166
-0.166
 arrived-0.165
Token!"
Token ID2474
Feature activation+7.83e-03

Pos logit contributions
lee+0.160
 mistakes+0.153
aky+0.152
 sisters+0.148
ushing+0.147
Neg logit contributions
fall-0.191
Circ-0.185
Ready-0.178
 ride-0.178
 hunt-0.176
Token happy
Token ID3772
Feature activation-1.63e-01

Pos logit contributions
heart+0.090
lee+0.083
ice+0.080
nie+0.078
 mistakes+0.078
Neg logit contributions
Circ-0.214
fall-0.211
Find-0.208
Wait-0.207
 shortcut-0.206
Token.
Token ID13
Feature activation-1.18e-02

Pos logit contributions
 brothers+0.194
 mistakes+0.189
lee+0.189
 important+0.182
heart+0.181
Neg logit contributions
Circ-0.363
fall-0.345
Ready-0.342
ground-0.340
Wait-0.339
Token
Token ID198
Feature activation+1.19e-01

Pos logit contributions
 brothers+0.022
lee+0.021
 mistakes+0.021
 important+0.020
 adorable+0.020
Neg logit contributions
Circ-0.019
fall-0.018
ground-0.018
 gra-0.017
Ready-0.017
Token
Token ID198
Feature activation+1.19e-01

Pos logit contributions
Circ+0.164
fall+0.156
Ready+0.153
Wait+0.152
+0.151
Neg logit contributions
 brothers-0.241
lee-0.234
 mistakes-0.232
 important-0.232
 adorable-0.228
Token"
Token ID1
Feature activation+6.09e-02

Pos logit contributions
Circ+0.204
fall+0.191
Ready+0.188
ground+0.187
Wait+0.186
Neg logit contributions
 brothers-0.203
lee-0.200
 mistakes-0.199
heart-0.194
 important-0.193
TokenGreat
Token ID13681
Feature activation-2.37e-01

Pos logit contributions
Circ+0.093
fall+0.091
ground+0.090
 shortcut+0.089
 stairs+0.088
Neg logit contributions
 brothers-0.113
 adorable-0.107
ity-0.106
aky-0.105
heart-0.105
Token shot
Token ID2823
Feature activation-2.92e-02

Pos logit contributions
lee+0.493
aky+0.478
 sisters+0.464
ice+0.461
 mistakes+0.456
Neg logit contributions
fall-0.344
 shortcut-0.306
-0.306
Circ-0.305
 ride-0.304
Token,
Token ID11
Feature activation-1.34e-02

Pos logit contributions
lee+0.038
 brothers+0.037
aky+0.036
ushing+0.036
 mistakes+0.036
Neg logit contributions
Circ-0.060
fall-0.058
 shortcut-0.057
Ready-0.056
ground-0.056
Token Tim
Token ID5045
Feature activation-1.26e-01

Pos logit contributions
aky+0.023
ity+0.022
 mistakes+0.022
lee+0.021
ady+0.021
Neg logit contributions
Circ-0.022
 shortcut-0.021
Wait-0.021
 stairs-0.021
Ready-0.021
Token!"
Token ID2474
Feature activation+2.47e-02

Pos logit contributions
 daughters+0.241
lee+0.239
 mistakes+0.237
ushing+0.228
 sisters+0.227
Neg logit contributions
 hunt-0.165
 ride-0.164
 hill-0.154
 arrived-0.152
fall-0.150
Token his
Token ID465
Feature activation-7.16e-02

Pos logit contributions
Circ+0.044
fall+0.042
Wait+0.042
ground+0.040
Ready+0.040
Neg logit contributions
aky-0.039
 mistakes-0.039
ity-0.039
lee-0.039
ushing-0.039
Token smiled
Token ID13541
Feature activation-1.28e-01

Pos logit contributions
 mistakes+0.184
 important+0.172
heart+0.167
aky+0.166
 brothers+0.165
Neg logit contributions
Circ-0.231
ground-0.217
fall-0.215
 arrived-0.214
 gra-0.210
Token and
Token ID290
Feature activation-2.94e-02

Pos logit contributions
 mistakes+0.219
ice+0.214
hew+0.213
aky+0.208
nie+0.206
Neg logit contributions
Circ-0.203
fall-0.202
Wait-0.184
 ride-0.181
bel-0.181
Token said
Token ID531
Feature activation-8.51e-02

Pos logit contributions
 mistakes+0.047
 brothers+0.045
aky+0.045
heart+0.044
 important+0.044
Neg logit contributions
Circ-0.053
fall-0.051
Wait-0.050
ground-0.049
Ready-0.049
Token,
Token ID11
Feature activation-1.11e-02

Pos logit contributions
 brothers+0.112
 mistakes+0.111
lee+0.109
aky+0.107
heart+0.105
Neg logit contributions
Circ-0.175
fall-0.170
ground-0.165
Wait-0.165
-0.164
Token "
Token ID366
Feature activation+6.95e-02

Pos logit contributions
lee+0.020
 brothers+0.019
 adorable+0.019
 important+0.018
 sisters+0.018
Neg logit contributions
Circ-0.019
Ready-0.017
 shortcut-0.017
Find-0.016
Wait-0.016
TokenGreat
Token ID13681
Feature activation-2.37e-01

Pos logit contributions
Circ+0.104
fall+0.098
ground+0.096
+0.093
 gra+0.093
Neg logit contributions
 brothers-0.130
 mistakes-0.127
lee-0.127
heart-0.127
 adorable-0.126
Token idea
Token ID2126
Feature activation-3.94e-02

Pos logit contributions
aky+0.499
lee+0.472
ice+0.465
 mistakes+0.453
asured+0.447
Neg logit contributions
fall-0.371
 ride-0.330
 shortcut-0.310
-0.309
 contest-0.302
Token!"
Token ID2474
Feature activation+4.03e-03

Pos logit contributions
aky+0.057
 brothers+0.053
 mistakes+0.053
lee+0.052
 important+0.051
Neg logit contributions
fall-0.079
ground-0.075
Circ-0.075
 stairs-0.072
-0.072
Token
Token ID198
Feature activation+1.21e-01

Pos logit contributions
Circ+0.007
fall+0.007
ground+0.007
Ready+0.007
Wait+0.007
Neg logit contributions
 brothers-0.006
 mistakes-0.006
lee-0.006
 important-0.006
heart-0.006
Token
Token ID198
Feature activation+1.21e-01

Pos logit contributions
Circ+0.196
fall+0.183
Ready+0.180
ground+0.179
Wait+0.178
Neg logit contributions
 brothers-0.216
lee-0.212
 mistakes-0.212
 important-0.207
heart-0.206
TokenThe
Token ID464
Feature activation-5.11e-02

Pos logit contributions
Circ+0.236
fall+0.221
ground+0.220
Ready+0.218
Wait+0.217
Neg logit contributions
 brothers-0.177
lee-0.175
 mistakes-0.174
 important-0.168
heart-0.167
Token's
Token ID338
Feature activation-6.44e-02

Pos logit contributions
 mistakes+0.210
heart+0.206
 important+0.202
ity+0.202
aky+0.199
Neg logit contributions
Circ-0.247
 arrived-0.240
fall-0.238
ground-0.230
 gra-0.228
Token mom
Token ID1995
Feature activation-2.91e-02

Pos logit contributions
lee+0.088
heart+0.087
aky+0.085
ushing+0.083
ity+0.082
Neg logit contributions
Circ-0.126
fall-0.120
Wait-0.120
-0.120
 shortcut-0.119
Token said
Token ID531
Feature activation-8.91e-02

Pos logit contributions
 mistakes+0.047
 important+0.045
 feelings+0.045
ity+0.044
 brothers+0.043
Neg logit contributions
Circ-0.050
 arrived-0.049
fall-0.048
 gra-0.048
Wait-0.047
Token,
Token ID11
Feature activation-1.04e-02

Pos logit contributions
 brothers+0.134
aky+0.134
 mistakes+0.133
lee+0.131
heart+0.128
Neg logit contributions
Circ-0.164
fall-0.161
Wait-0.154
ground-0.154
-0.153
Token "
Token ID366
Feature activation+6.90e-02

Pos logit contributions
lee+0.016
 brothers+0.016
 adorable+0.015
 important+0.015
 sisters+0.015
Neg logit contributions
Circ-0.020
Ready-0.019
 shortcut-0.019
ground-0.018
 distance-0.018
TokenGreat
Token ID13681
Feature activation-2.37e-01

Pos logit contributions
Circ+0.096
fall+0.089
Ready+0.088
Wait+0.087
ground+0.087
Neg logit contributions
 brothers-0.139
lee-0.137
 mistakes-0.137
 important-0.134
heart-0.133
Token job
Token ID1693
Feature activation+2.53e-02

Pos logit contributions
aky+0.561
lee+0.525
asured+0.523
ice+0.514
ady+0.506
Neg logit contributions
fall-0.313
-0.277
 ride-0.272
 contest-0.263
 subway-0.246
Token,
Token ID11
Feature activation-9.59e-03

Pos logit contributions
Circ+0.052
fall+0.052
ground+0.050
+0.050
Ready+0.050
Neg logit contributions
 brothers-0.032
lee-0.031
 mistakes-0.031
aky-0.031
heart-0.030
Token kids
Token ID3988
Feature activation+6.92e-02

Pos logit contributions
aky+0.017
lee+0.017
ady+0.016
ious+0.016
ity+0.016
Neg logit contributions
Ready-0.016
Find-0.015
Circ-0.015
Wait-0.015
 shortcut-0.015
Token!
Token ID0
Feature activation-7.28e-03

Pos logit contributions
Circ+0.144
fall+0.142
ground+0.139
Ready+0.137
+0.136
Neg logit contributions
 brothers-0.086
lee-0.086
 mistakes-0.086
ity-0.082
 important-0.081
Token It
Token ID632
Feature activation+6.24e-02

Pos logit contributions
 brothers+0.011
lee+0.011
 mistakes+0.011
heart+0.010
 important+0.010
Neg logit contributions
Circ-0.014
fall-0.013
ground-0.013
 gra-0.013
Ready-0.013
Token it
Token ID340
Feature activation+9.41e-02

Pos logit contributions
Circ+0.207
fall+0.198
ground+0.196
+0.195
 gra+0.192
Neg logit contributions
aky-0.202
 brothers-0.199
 mistakes-0.199
ity-0.198
ushing-0.197
Token and
Token ID290
Feature activation-2.69e-02

Pos logit contributions
Circ+0.212
fall+0.205
ground+0.202
+0.201
Ready+0.200
Neg logit contributions
 brothers-0.104
lee-0.102
 mistakes-0.101
heart-0.097
aky-0.096
Token said
Token ID531
Feature activation-8.93e-02

Pos logit contributions
 brothers+0.041
 mistakes+0.041
lee+0.041
aky+0.040
heart+0.039
Neg logit contributions
Circ-0.050
fall-0.047
Ready-0.046
Wait-0.046
ground-0.046
Token,
Token ID11
Feature activation-9.95e-03

Pos logit contributions
 brothers+0.113
lee+0.111
 mistakes+0.110
 important+0.108
heart+0.106
Neg logit contributions
Circ-0.191
fall-0.181
Ready-0.180
ground-0.179
-0.177
Token "
Token ID366
Feature activation+7.02e-02

Pos logit contributions
lee+0.015
 brothers+0.015
 important+0.015
 careful+0.014
 darling+0.014
Neg logit contributions
Circ-0.019
 shortcut-0.017
Ready-0.017
 hike-0.017
ground-0.017
TokenGreat
Token ID13681
Feature activation-2.36e-01

Pos logit contributions
Circ+0.099
fall+0.095
ground+0.093
 shortcut+0.092
 stairs+0.092
Neg logit contributions
 brothers-0.136
heart-0.135
 mistakes-0.133
ity-0.132
aky-0.131
Token job
Token ID1693
Feature activation+2.45e-02

Pos logit contributions
aky+0.489
lee+0.473
ice+0.464
 sisters+0.455
asured+0.445
Neg logit contributions
fall-0.356
 shortcut-0.326
-0.314
 contest-0.314
 ride-0.299
Token,
Token ID11
Feature activation-9.78e-03

Pos logit contributions
Circ+0.052
fall+0.050
ground+0.049
Ready+0.049
+0.048
Neg logit contributions
 brothers-0.030
lee-0.030
 mistakes-0.030
aky-0.029
heart-0.029
Token Jill
Token ID22772
Feature activation-1.33e-01

Pos logit contributions
aky+0.015
ity+0.014
ushing+0.014
lee+0.013
ady+0.013
Neg logit contributions
Circ-0.017
Ready-0.017
 shortcut-0.017
 stairs-0.017
fall-0.017
Token.
Token ID13
Feature activation-2.95e-04

Pos logit contributions
 mistakes+0.209
 daughters+0.194
lee+0.193
 feelings+0.191
 brothers+0.191
Neg logit contributions
fall-0.223
ground-0.218
 gra-0.215
 arrived-0.212
-0.206
Token You
Token ID921
Feature activation-6.14e-02

Pos logit contributions
 brothers+0.000
lee+0.000
 mistakes+0.000
heart+0.000
 siblings+0.000
Neg logit contributions
Circ-0.001
fall-0.001
ground-0.001
Wait-0.000
 gra-0.000
Token all
Token ID477
Feature activation+2.51e-02

Pos logit contributions
 important+0.018
 adorable+0.017
 brothers+0.017
 careful+0.017
 mistakes+0.016
Neg logit contributions
Circ-0.023
fall-0.021
 gra-0.021
-0.020
Ready-0.020
Token her
Token ID607
Feature activation-6.63e-02

Pos logit contributions
Circ+0.039
fall+0.037
ground+0.036
Ready+0.036
Wait+0.036
Neg logit contributions
 brothers-0.046
 mistakes-0.045
lee-0.045
 important-0.044
ity-0.044
Token efforts
Token ID4040
Feature activation+2.38e-02

Pos logit contributions
heart+0.048
lee+0.045
 mistakes+0.044
ushing+0.043
 brothers+0.041
Neg logit contributions
Circ-0.172
 stairs-0.165
fall-0.165
 shortcut-0.164
Wait-0.164
Token were
Token ID547
Feature activation-9.71e-03

Pos logit contributions
Circ+0.050
Ready+0.047
Find+0.045
+0.044
fall+0.044
Neg logit contributions
lee-0.033
 important-0.033
 brothers-0.032
 mistakes-0.031
 careful-0.031
Token in
Token ID287
Feature activation+1.16e-01

Pos logit contributions
 brothers+0.014
 mistakes+0.014
 important+0.014
lee+0.014
 adorable+0.014
Neg logit contributions
Circ-0.019
fall-0.018
Ready-0.018
ground-0.018
-0.018
Token vain
Token ID23469
Feature activation-2.36e-01

Pos logit contributions
Circ+0.224
fall+0.214
Ready+0.210
Wait+0.204
ground+0.204
Neg logit contributions
 brothers-0.176
 mistakes-0.174
 important-0.170
lee-0.168
 adorable-0.166
Token -
Token ID532
Feature activation-1.37e-02

Pos logit contributions
 brothers+0.355
aky+0.354
lee+0.352
omo+0.341
heart+0.341
Neg logit contributions
fall-0.406
 stairs-0.405
Circ-0.399
ground-0.396
 shortcut-0.379
Token all
Token ID477
Feature activation+2.48e-02

Pos logit contributions
 brothers+0.022
 mistakes+0.022
 important+0.022
 adorable+0.021
lee+0.021
Neg logit contributions
Circ-0.025
fall-0.023
ground-0.023
-0.023
Ready-0.023
Token the
Token ID262
Feature activation+2.57e-02

Pos logit contributions
Circ+0.042
fall+0.039
Ready+0.038
ground+0.038
Wait+0.038
Neg logit contributions
 brothers-0.043
 mistakes-0.042
lee-0.042
 important-0.041
heart-0.041
Token exits
Token ID30151
Feature activation+4.82e-02

Pos logit contributions
Circ+0.048
fall+0.045
Ready+0.044
ground+0.044
Wait+0.044
Neg logit contributions
 brothers-0.040
lee-0.039
 mistakes-0.039
heart-0.038
 important-0.038
Token were
Token ID547
Feature activation-1.05e-02

Pos logit contributions
Circ+0.092
Ready+0.084
Find+0.082
fall+0.081
Wait+0.080
Neg logit contributions
 brothers-0.075
lee-0.074
 important-0.074
 mistakes-0.073
heart-0.073
Token
Token ID198
Feature activation+1.21e-01

Pos logit contributions
Circ+0.175
fall+0.166
 shortcut+0.162
Ready+0.160
Find+0.159
Neg logit contributions
 brothers-0.237
lee-0.229
 mistakes-0.222
 important-0.220
aky-0.219
TokenDad
Token ID46270
Feature activation-7.03e-02

Pos logit contributions
Circ+0.245
fall+0.236
 gra+0.228
Ready+0.227
+0.227
Neg logit contributions
 brothers-0.168
lee-0.160
 mistakes-0.159
 important-0.157
aky-0.156
Token said
Token ID531
Feature activation-9.03e-02

Pos logit contributions
 mistakes+0.129
 feelings+0.124
 important+0.120
ity+0.117
 adorable+0.115
Neg logit contributions
 arrived-0.114
Circ-0.109
fall-0.104
ground-0.103
Find-0.102
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
 brothers+0.115
 mistakes+0.112
lee+0.112
 important+0.108
heart+0.108
Neg logit contributions
Circ-0.193
fall-0.183
Ready-0.180
ground-0.180
Wait-0.179
Token "
Token ID366
Feature activation+6.91e-02

Pos logit contributions
 brothers+0.023
lee+0.022
 important+0.021
 darling+0.021
 adorable+0.021
Neg logit contributions
Circ-0.025
Ready-0.023
 shortcut-0.023
Find-0.023
 gra-0.022
TokenGreat
Token ID13681
Feature activation-2.36e-01

Pos logit contributions
Circ+0.098
fall+0.091
ground+0.089
Ready+0.088
+0.087
Neg logit contributions
 brothers-0.137
 mistakes-0.135
lee-0.135
heart-0.132
 important-0.132
Token job
Token ID1693
Feature activation+2.35e-02

Pos logit contributions
aky+0.564
lee+0.535
 sisters+0.517
ice+0.514
asured+0.512
Neg logit contributions
fall-0.300
-0.256
 ride-0.243
 contest-0.238
 shortcut-0.237
Token,
Token ID11
Feature activation-1.32e-02

Pos logit contributions
fall+0.048
Circ+0.047
ground+0.047
+0.046
 stairs+0.045
Neg logit contributions
aky-0.030
lee-0.029
 brothers-0.029
 mistakes-0.029
ity-0.028
Token Joe
Token ID5689
Feature activation-9.49e-02

Pos logit contributions
aky+0.026
ity+0.023
ady+0.023
ice+0.022
 mistakes+0.022
Neg logit contributions
 shortcut-0.018
fall-0.018
ground-0.018
Circ-0.018
 stairs-0.018
Token.
Token ID13
Feature activation-3.34e-03

Pos logit contributions
 mistakes+0.147
lee+0.143
 daughters+0.143
 feelings+0.140
aky+0.139
Neg logit contributions
fall-0.164
ground-0.158
 gra-0.152
 arrived-0.149
 hunt-0.148
Token You
Token ID921
Feature activation-6.19e-02

Pos logit contributions
 brothers+0.005
lee+0.005
 mistakes+0.005
heart+0.005
 important+0.005
Neg logit contributions
Circ-0.006
fall-0.006
ground-0.006
Ready-0.006
Wait-0.006
TokenM
Token ID44
Feature activation-1.03e-01

Pos logit contributions
Circ+0.208
fall+0.197
ground+0.191
Ready+0.191
+0.190
Neg logit contributions
 brothers-0.202
 mistakes-0.198
lee-0.197
 important-0.193
heart-0.192
Tokenama
Token ID1689
Feature activation-6.88e-02

Pos logit contributions
 adorable+0.143
 sisters+0.139
 feelings+0.139
lee+0.137
 brothers+0.137
Neg logit contributions
-0.192
fall-0.189
Circ-0.184
ground-0.178
Wait-0.177
Token said
Token ID531
Feature activation-8.98e-02

Pos logit contributions
 mistakes+0.107
 important+0.103
 feelings+0.102
ity+0.098
 adorable+0.098
Neg logit contributions
Circ-0.123
fall-0.119
 arrived-0.118
ground-0.117
Wait-0.115
Token,
Token ID11
Feature activation-1.22e-02

Pos logit contributions
 brothers+0.122
lee+0.121
 mistakes+0.119
 important+0.116
heart+0.116
Neg logit contributions
Circ-0.184
Ready-0.173
fall-0.172
ground-0.171
Wait-0.169
Token "
Token ID366
Feature activation+7.03e-02

Pos logit contributions
lee+0.020
 brothers+0.019
 darling+0.019
 important+0.019
 adorable+0.019
Neg logit contributions
Circ-0.022
 shortcut-0.021
Ready-0.020
Find-0.020
ground-0.020
TokenGreat
Token ID13681
Feature activation-2.36e-01

Pos logit contributions
Circ+0.104
fall+0.100
ground+0.097
+0.095
 shortcut+0.095
Neg logit contributions
 brothers-0.133
heart-0.132
 mistakes-0.130
lee-0.129
ity-0.128
Token job
Token ID1693
Feature activation+2.54e-02

Pos logit contributions
aky+0.546
lee+0.514
asured+0.492
ice+0.491
 sisters+0.490
Neg logit contributions
fall-0.328
-0.267
 shortcut-0.260
ground-0.255
 ride-0.255
Token Henry
Token ID8616
Feature activation-9.79e-02

Pos logit contributions
Circ+0.053
fall+0.052
ground+0.051
Ready+0.050
+0.050
Neg logit contributions
 brothers-0.032
lee-0.031
 mistakes-0.031
aky-0.030
heart-0.030
Token!
Token ID0
Feature activation-6.23e-03

Pos logit contributions
lee+0.140
 mistakes+0.139
aky+0.130
 feelings+0.129
 adorable+0.129
Neg logit contributions
fall-0.183
ground-0.182
-0.171
Circ-0.171
 arrived-0.170
Token We
Token ID775
Feature activation+3.10e-03

Pos logit contributions
 brothers+0.010
 mistakes+0.009
lee+0.009
heart+0.009
 important+0.009
Neg logit contributions
Circ-0.012
fall-0.011
ground-0.011
Ready-0.011
Wait-0.011
Token are
Token ID389
Feature activation-7.46e-03

Pos logit contributions
Circ+0.005
fall+0.005
Ready+0.004
ground+0.004
Find+0.004
Neg logit contributions
 mistakes-0.006
lee-0.006
 brothers-0.006
heart-0.005
ity-0.005
Token up
Token ID510
Feature activation+4.64e-02

Pos logit contributions
 brothers+0.021
omo+0.020
 siblings+0.020
ushing+0.020
lee+0.020
Neg logit contributions
Circ-0.021
ground-0.020
fall-0.020
 gra-0.019
 stairs-0.019
Token with
Token ID351
Feature activation+3.34e-02

Pos logit contributions
Circ+0.089
fall+0.086
ground+0.085
 gra+0.083
Ready+0.083
Neg logit contributions
lee-0.066
 brothers-0.064
 mistakes-0.063
ity-0.062
ushing-0.062
Token sunshine
Token ID34488
Feature activation+3.86e-02

Pos logit contributions
Circ+0.058
 gra+0.055
fall+0.055
ground+0.054
Ready+0.054
Neg logit contributions
ushing-0.053
lee-0.052
 brothers-0.051
aky-0.051
ity-0.051
Token.
Token ID13
Feature activation-6.18e-03

Pos logit contributions
Circ+0.086
fall+0.082
ground+0.081
Ready+0.081
Wait+0.080
Neg logit contributions
 brothers-0.044
lee-0.043
 mistakes-0.043
 important-0.041
heart-0.041
Token
Token ID198
Feature activation+1.22e-01

Pos logit contributions
 brothers+0.012
lee+0.012
 mistakes+0.012
 important+0.012
heart+0.012
Neg logit contributions
Circ-0.009
fall-0.008
Ready-0.008
ground-0.008
Wait-0.008
Token
Token ID198
Feature activation+1.21e-01

Pos logit contributions
Circ+0.169
fall+0.153
Ready+0.153
ground+0.153
Wait+0.150
Neg logit contributions
 brothers-0.244
 mistakes-0.241
lee-0.241
heart-0.237
 important-0.235
TokenThe
Token ID464
Feature activation-5.41e-02

Pos logit contributions
Circ+0.184
Ready+0.172
ground+0.169
Wait+0.169
Find+0.167
Neg logit contributions
lee-0.230
 brothers-0.230
 mistakes-0.227
heart-0.221
ushing-0.220
Token family
Token ID1641
Feature activation+3.20e-02

Pos logit contributions
lee+0.082
ushing+0.080
 mistakes+0.079
heart+0.079
 brothers+0.078
Neg logit contributions
Circ-0.099
Ready-0.096
fall-0.094
Wait-0.093
ground-0.093
Token was
Token ID373
Feature activation-2.44e-02

Pos logit contributions
Circ+0.049
ground+0.048
 arrived+0.048
 aisle+0.046
 stairs+0.046
Neg logit contributions
 mistakes-0.057
ushing-0.054
 darling-0.054
 feelings-0.054
ity-0.053
Token happy
Token ID3772
Feature activation-1.60e-01

Pos logit contributions
heart+0.028
 mistakes+0.028
ity+0.028
lee+0.028
ushing+0.028
Neg logit contributions
Circ-0.052
ground-0.051
fall-0.051
 gra-0.051
Wait-0.049
Token to
Token ID284
Feature activation+1.49e-01

Pos logit contributions
aky+0.277
lee+0.275
 brothers+0.274
 mistakes+0.274
ity+0.268
Neg logit contributions
Circ-0.257
fall-0.253
ground-0.237
-0.233
Wait-0.233
Tokenâ
Token ID22940
Feature activation-1.93e-01

Pos logit contributions
 brothers+0.063
 mistakes+0.062
lee+0.061
 important+0.061
 adorable+0.060
Neg logit contributions
Circ-0.018
Ready-0.015
Wait-0.014
 gra-0.014
fall-0.014
Token
Token ID26391
Feature activation-1.14e-01

Pos logit contributions
heart+0.303
 brothers+0.301
 mistakes+0.298
lee+0.294
 important+0.293
Neg logit contributions
Circ-0.350
 stairs-0.324
Find-0.317
fall-0.316
 shortcut-0.308
Token
Token ID8151
Feature activation+9.38e-02

Pos logit contributions
 brothers+0.223
lee+0.220
/+0.218
 mistakes+0.215
ushing+0.213
Neg logit contributions
Circ-0.163
 gra-0.146
Wait-0.145
Ready-0.145
 shortcut-0.144
Tokent
Token ID83
Feature activation-8.87e-03

Pos logit contributions
Circ+0.074
Ready+0.061
Find+0.060
fall+0.059
 gra+0.058
Neg logit contributions
 brothers-0.247
lee-0.243
 mistakes-0.243
heart-0.240
 important-0.239
Token want
Token ID765
Feature activation-4.79e-02

Pos logit contributions
 brothers+0.013
lee+0.013
 mistakes+0.013
ushing+0.012
 adorable+0.012
Neg logit contributions
Circ-0.017
fall-0.017
Wait-0.016
Ready-0.016
ground-0.016
Token to
Token ID284
Feature activation+1.52e-01

Pos logit contributions
 mistakes+0.104
 brothers+0.103
 important+0.102
lee+0.101
 adorable+0.100
Neg logit contributions
Circ-0.061
fall-0.056
ground-0.055
 gra-0.055
-0.055
Token play
Token ID711
Feature activation+3.32e-02

Pos logit contributions
Circ+0.195
fall+0.189
Ready+0.183
ground+0.181
Wait+0.178
Neg logit contributions
 brothers-0.309
lee-0.308
aky-0.305
ushing-0.301
 mistakes-0.301
Token with
Token ID351
Feature activation+3.11e-02

Pos logit contributions
Circ+0.066
fall+0.063
Ready+0.062
ground+0.062
Wait+0.062
Neg logit contributions
 brothers-0.046
 mistakes-0.045
lee-0.045
 important-0.044
heart-0.044
Token balls
Token ID11333
Feature activation+1.02e-01

Pos logit contributions
Circ+0.059
fall+0.056
ground+0.055
Ready+0.055
Wait+0.054
Neg logit contributions
 brothers-0.046
lee-0.045
 mistakes-0.045
 important-0.044
aky-0.044
Token.
Token ID13
Feature activation-5.90e-04

Pos logit contributions
Circ+0.213
fall+0.204
ground+0.201
Ready+0.200
+0.198
Neg logit contributions
lee-0.130
 brothers-0.129
 mistakes-0.126
ity-0.122
 important-0.122
Token It
Token ID632
Feature activation+6.25e-02

Pos logit contributions
 brothers+0.001
 mistakes+0.001
lee+0.001
heart+0.001
 important+0.001
Neg logit contributions
Circ-0.001
fall-0.001
ground-0.001
Ready-0.001
Wait-0.001
Token soon
Token ID2582
Feature activation-8.83e-02

Pos logit contributions
 brothers+0.041
lee+0.041
heart+0.039
 important+0.039
 adorable+0.039
Neg logit contributions
Circ-0.050
-0.049
Ready-0.047
fall-0.047
Find-0.046
Token they
Token ID484
Feature activation+2.94e-02

Pos logit contributions
 brothers+0.161
lee+0.158
 mistakes+0.157
 important+0.154
heart+0.153
Neg logit contributions
Circ-0.139
fall-0.130
Ready-0.128
ground-0.127
-0.126
Token found
Token ID1043
Feature activation+1.28e-02

Pos logit contributions
Circ+0.051
Ready+0.048
fall+0.048
+0.047
ground+0.046
Neg logit contributions
lee-0.050
 brothers-0.049
aky-0.047
 adorable-0.047
 important-0.046
Token a
Token ID257
Feature activation-2.69e-02

Pos logit contributions
Circ+0.023
fall+0.022
Ready+0.022
ground+0.022
Wait+0.021
Neg logit contributions
 brothers-0.020
lee-0.020
 mistakes-0.020
 important-0.019
heart-0.019
Token big
Token ID1263
Feature activation-7.70e-02

Pos logit contributions
 brothers+0.038
 important+0.038
 adorable+0.037
 mistakes+0.037
lee+0.036
Neg logit contributions
Circ-0.053
fall-0.050
-0.050
Ready-0.050
Wait-0.049
Token tree
Token ID5509
Feature activation+1.85e-01

Pos logit contributions
lee+0.145
 brothers+0.145
 mistakes+0.145
ity+0.140
ushing+0.140
Neg logit contributions
Circ-0.114
fall-0.106
ground-0.105
-0.103
 gra-0.103
Token that
Token ID326
Feature activation+3.09e-02

Pos logit contributions
Circ+0.393
Ready+0.370
fall+0.369
Wait+0.367
ground+0.361
Neg logit contributions
 brothers-0.237
lee-0.231
 mistakes-0.231
 important-0.225
 adorable-0.223
Token was
Token ID373
Feature activation-2.63e-02

Pos logit contributions
Circ+0.057
fall+0.055
+0.054
Ready+0.054
Find+0.053
Neg logit contributions
 brothers-0.048
 important-0.047
 adorable-0.047
lee-0.046
heart-0.044
Token full
Token ID1336
Feature activation-8.53e-02

Pos logit contributions
 brothers+0.041
 mistakes+0.041
lee+0.041
heart+0.039
 important+0.039
Neg logit contributions
Circ-0.048
fall-0.045
Ready-0.045
ground-0.045
Wait-0.044
Token of
Token ID286
Feature activation+8.32e-02

Pos logit contributions
 brothers+0.129
lee+0.126
 mistakes+0.126
 important+0.123
heart+0.122
Neg logit contributions
Circ-0.161
fall-0.152
Ready-0.151
ground-0.149
Wait-0.149
Token berries
Token ID36322
Feature activation+1.04e-01

Pos logit contributions
Circ+0.147
 gra+0.141
 stairs+0.139
fall+0.137
Ready+0.136
Neg logit contributions
lee-0.133
ushing-0.132
aky-0.127
ity-0.126
heart-0.126
Token job
Token ID1693
Feature activation+2.46e-02

Pos logit contributions
lee+0.426
aky+0.406
ice+0.398
 mistakes+0.389
ady+0.387
Neg logit contributions
fall-0.277
Circ-0.263
Ready-0.244
Wait-0.240
 ride-0.240
Token,
Token ID11
Feature activation-1.10e-02

Pos logit contributions
Circ+0.053
fall+0.051
ground+0.050
Ready+0.050
+0.050
Neg logit contributions
 brothers-0.030
lee-0.030
 mistakes-0.029
aky-0.029
heart-0.028
Token Tim
Token ID5045
Feature activation-1.24e-01

Pos logit contributions
aky+0.020
lee+0.019
ady+0.019
ity+0.019
 mistakes+0.018
Neg logit contributions
Circ-0.018
fall-0.017
Ready-0.017
 shortcut-0.017
Wait-0.017
Token!"
Token ID2474
Feature activation+4.64e-03

Pos logit contributions
 daughters+0.245
 mistakes+0.242
lee+0.239
efully+0.238
 darling+0.234
Neg logit contributions
 hunt-0.153
 ride-0.143
-0.139
 hill-0.139
fall-0.136
Token
Token ID198
Feature activation+1.24e-01

Pos logit contributions
Circ+0.006
fall+0.006
ground+0.006
Ready+0.006
+0.005
Neg logit contributions
 brothers-0.010
 mistakes-0.010
lee-0.010
heart-0.009
 important-0.009
Token
Token ID198
Feature activation+1.23e-01

Pos logit contributions
Circ+0.161
fall+0.152
Ready+0.150
Wait+0.148
Find+0.147
Neg logit contributions
 brothers-0.257
 mistakes-0.249
 important-0.249
lee-0.247
 adorable-0.245
TokenIn
Token ID818
Feature activation+6.72e-02

Pos logit contributions
Circ+0.182
Ready+0.168
ground+0.167
fall+0.167
Wait+0.166
Neg logit contributions
lee-0.244
 brothers-0.240
 mistakes-0.239
heart-0.232
 sisters-0.231
Token the
Token ID262
Feature activation+2.23e-02

Pos logit contributions
fall+0.137
Circ+0.136
Ready+0.130
 gra+0.126
Wait+0.124
Neg logit contributions
 brothers-0.104
 important-0.094
 siblings-0.093
 adorable-0.092
lee-0.091
Token water
Token ID1660
Feature activation+4.95e-02

Pos logit contributions
Circ+0.042
fall+0.040
ground+0.038
+0.038
Ready+0.038
Neg logit contributions
 brothers-0.036
 important-0.034
 mistakes-0.034
lee-0.033
 adorable-0.033
Token,
Token ID11
Feature activation-1.15e-02

Pos logit contributions
Circ+0.116
fall+0.108
Ready+0.107
Find+0.106
+0.106
Neg logit contributions
 brothers-0.058
lee-0.054
 mistakes-0.053
heart-0.053
 important-0.051
Token they
Token ID484
Feature activation+2.38e-02

Pos logit contributions
 brothers+0.016
 mistakes+0.016
lee+0.016
 important+0.015
heart+0.015
Neg logit contributions
Circ-0.023
fall-0.022
Ready-0.022
ground-0.022
Wait-0.021
Token the
Token ID262
Feature activation+2.67e-02

Pos logit contributions
Circ+0.096
fall+0.092
Ready+0.090
ground+0.090
 stairs+0.088
Neg logit contributions
lee-0.101
heart-0.099
 brothers-0.099
aky-0.098
ity-0.097
Token car
Token ID1097
Feature activation+8.71e-02

Pos logit contributions
Circ+0.035
Ready+0.033
fall+0.032
ground+0.032
Wait+0.031
Neg logit contributions
lee-0.056
heart-0.055
 brothers-0.055
 mistakes-0.054
ity-0.054
Token,
Token ID11
Feature activation-9.17e-03

Pos logit contributions
Circ+0.183
Wait+0.171
Ready+0.170
fall+0.170
+0.168
Neg logit contributions
 brothers-0.118
 mistakes-0.113
 important-0.111
lee-0.109
 adorable-0.109
Token he
Token ID339
Feature activation-8.45e-03

Pos logit contributions
 brothers+0.016
 adorable+0.015
 important+0.015
 mistakes+0.014
 siblings+0.014
Neg logit contributions
Circ-0.017
-0.015
Ready-0.015
fall-0.015
Find-0.015
Token needed
Token ID2622
Feature activation-2.91e-02

Pos logit contributions
lee+0.016
heart+0.016
 mistakes+0.016
ity+0.015
 darling+0.015
Neg logit contributions
Circ-0.012
fall-0.012
Ready-0.012
Find-0.012
Wait-0.011
Token to
Token ID284
Feature activation+1.50e-01

Pos logit contributions
 brothers+0.051
 mistakes+0.049
 important+0.049
 feelings+0.047
 careful+0.046
Neg logit contributions
Circ-0.051
fall-0.049
 gra-0.047
ground-0.047
-0.047
Token find
Token ID1064
Feature activation+3.01e-02

Pos logit contributions
Circ+0.204
fall+0.189
Ready+0.185
ground+0.185
Wait+0.182
Neg logit contributions
 brothers-0.305
lee-0.302
 mistakes-0.301
heart-0.295
 important-0.294
Token the
Token ID262
Feature activation+2.76e-02

Pos logit contributions
Circ+0.053
fall+0.049
Ready+0.049
ground+0.048
Wait+0.048
Neg logit contributions
 brothers-0.049
lee-0.049
 mistakes-0.048
heart-0.048
ity-0.047
Token right
Token ID826
Feature activation-7.53e-02

Pos logit contributions
Circ+0.059
fall+0.055
ground+0.055
+0.054
 gra+0.054
Neg logit contributions
 brothers-0.038
 important-0.036
 mistakes-0.036
 adorable-0.035
 feelings-0.034
Token fuel
Token ID5252
Feature activation+6.20e-02

Pos logit contributions
 brothers+0.134
 mistakes+0.130
 important+0.128
lee+0.127
 feelings+0.126
Neg logit contributions
Circ-0.126
fall-0.118
-0.116
Ready-0.115
 gra-0.115
Token.
Token ID13
Feature activation-2.82e-03

Pos logit contributions
Circ+0.137
fall+0.130
Ready+0.129
Wait+0.129
Find+0.129
Neg logit contributions
 brothers-0.076
 important-0.072
 mistakes-0.071
lee-0.070
 adorable-0.069
Token the
Token ID262
Feature activation+1.97e-02

Pos logit contributions
Circ+0.229
fall+0.222
Ready+0.216
Wait+0.214
 gra+0.212
Neg logit contributions
 brothers-0.175
 mistakes-0.170
 important-0.162
lee-0.161
 adorable-0.160
Token sand
Token ID6450
Feature activation+6.97e-02

Pos logit contributions
Circ+0.036
fall+0.034
ground+0.033
+0.032
Ready+0.032
Neg logit contributions
 brothers-0.032
 mistakes-0.031
lee-0.031
 important-0.031
 adorable-0.030
Token.
Token ID13
Feature activation-8.92e-03

Pos logit contributions
Circ+0.165
Ready+0.157
Wait+0.155
Find+0.155
fall+0.154
Neg logit contributions
 brothers-0.078
 mistakes-0.073
lee-0.071
aky-0.069
heart-0.069
Token Big
Token ID4403
Feature activation-9.47e-02

Pos logit contributions
 brothers+0.015
lee+0.015
 important+0.014
 adorable+0.014
 mistakes+0.014
Neg logit contributions
Circ-0.015
ground-0.014
 gra-0.014
Find-0.014
Ready-0.014
Token T
Token ID309
Feature activation-2.16e-01

Pos logit contributions
 brothers+0.184
lee+0.181
 mistakes+0.177
 adorable+0.175
 important+0.174
Neg logit contributions
Circ-0.141
 gra-0.128
fall-0.128
ground-0.127
Ready-0.125
Token came
Token ID1625
Feature activation+5.55e-02

Pos logit contributions
 brothers+0.334
 mistakes+0.330
lee+0.326
heart+0.321
 important+0.319
Neg logit contributions
Circ-0.398
fall-0.378
Ready-0.369
ground-0.368
Wait-0.366
Token over
Token ID625
Feature activation+9.30e-02

Pos logit contributions
Circ+0.107
fall+0.101
Ready+0.099
ground+0.099
Wait+0.099
Neg logit contributions
 brothers-0.081
lee-0.080
 mistakes-0.080
heart-0.078
 important-0.078
Token and
Token ID290
Feature activation-3.07e-02

Pos logit contributions
Circ+0.230
 gra+0.223
Ready+0.219
fall+0.217
ground+0.216
Neg logit contributions
 brothers-0.093
lee-0.089
 mistakes-0.084
 adorable-0.077
 siblings-0.077
Token said
Token ID531
Feature activation-8.82e-02

Pos logit contributions
 brothers+0.043
lee+0.043
 careful+0.041
 adorable+0.041
 important+0.041
Neg logit contributions
Circ-0.063
-0.060
ground-0.060
fall-0.058
 gra-0.057
Token,
Token ID11
Feature activation-1.14e-02

Pos logit contributions
 brothers+0.095
lee+0.091
 mistakes+0.091
 important+0.091
 adorable+0.087
Neg logit contributions
Circ-0.208
ground-0.196
Ready-0.196
fall-0.195
-0.194
Token "
Token ID366
Feature activation+6.98e-02

Pos logit contributions
 brothers+0.017
lee+0.016
 important+0.016
 adorable+0.016
 careful+0.016
Neg logit contributions
Circ-0.023
Ready-0.021
ground-0.020
Find-0.020
 shortcut-0.020
Token the
Token ID262
Feature activation+2.79e-02

Pos logit contributions
Circ+0.269
+0.266
Find+0.265
 gra+0.264
Wait+0.262
Neg logit contributions
 brothers-0.150
 important-0.144
 mistakes-0.141
 years-0.135
 twins-0.133
Token stairs
Token ID16046
Feature activation+1.39e-01

Pos logit contributions
Circ+0.033
+0.033
Wait+0.030
Find+0.030
fall+0.029
Neg logit contributions
 important-0.063
nie-0.061
 adorable-0.060
 brothers-0.060
 mistakes-0.057
Token.
Token ID13
Feature activation-8.03e-03

Pos logit contributions
Circ+0.307
fall+0.291
Ready+0.290
Wait+0.288
Find+0.285
Neg logit contributions
 brothers-0.165
 mistakes-0.160
lee-0.160
 important-0.154
heart-0.152
Token The
Token ID383
Feature activation-5.28e-02

Pos logit contributions
 brothers+0.012
lee+0.012
 darling+0.011
 adorable+0.011
 mistakes+0.011
Neg logit contributions
Circ-0.015
 shortcut-0.014
fall-0.014
 hunt-0.014
 gra-0.014
Token stairs
Token ID16046
Feature activation+1.38e-01

Pos logit contributions
 brothers+0.117
 important+0.113
 mistakes+0.112
 adorable+0.112
 siblings+0.110
Neg logit contributions
Circ-0.068
fall-0.061
-0.059
 gra-0.058
Find-0.058
Token were
Token ID547
Feature activation-1.36e-02

Pos logit contributions
Circ+0.282
Ready+0.259
Find+0.251
 gra+0.250
fall+0.249
Neg logit contributions
 brothers-0.191
 important-0.185
 adorable-0.181
lee-0.180
 mistakes-0.180
Token wet
Token ID9583
Feature activation-1.14e-01

Pos logit contributions
 brothers+0.019
 mistakes+0.019
lee+0.019
heart+0.018
ity+0.018
Neg logit contributions
Circ-0.027
fall-0.026
Ready-0.026
ground-0.025
Wait-0.025
Token and
Token ID290
Feature activation-2.39e-02

Pos logit contributions
lee+0.149
 mistakes+0.145
 brothers+0.143
ity+0.142
aky+0.142
Neg logit contributions
fall-0.233
 stairs-0.229
Circ-0.225
ground-0.223
 aisle-0.219
Token slippery
Token ID32911
Feature activation-1.21e-01

Pos logit contributions
 brothers+0.030
lee+0.029
 mistakes+0.029
 important+0.029
 adorable+0.028
Neg logit contributions
Circ-0.052
fall-0.049
Ready-0.048
ground-0.048
-0.048
Token.
Token ID13
Feature activation-9.33e-03

Pos logit contributions
 brothers+0.136
 mistakes+0.133
lee+0.129
 important+0.127
heart+0.125
Neg logit contributions
Circ-0.280
Ready-0.262
Find-0.261
fall-0.261
Wait-0.260
Token Anna
Token ID11735
Feature activation-1.25e-01

Pos logit contributions
 brothers+0.014
lee+0.013
 mistakes+0.013
 adorable+0.013
 important+0.013
Neg logit contributions
Circ-0.018
fall-0.017
Ready-0.017
ground-0.017
 shortcut-0.017
Token but
Token ID475
Feature activation-1.32e-02

Pos logit contributions
lee+0.014
 brothers+0.014
aky+0.014
ity+0.013
 mistakes+0.013
Neg logit contributions
Circ-0.011
Wait-0.010
fall-0.010
 stairs-0.010
Ready-0.010
Token you
Token ID345
Feature activation-8.93e-03

Pos logit contributions
 important+0.018
 brothers+0.018
 mistakes+0.017
 adorable+0.016
 feelings+0.016
Neg logit contributions
Circ-0.029
fall-0.027
ground-0.026
-0.026
Ready-0.026
Token have
Token ID423
Feature activation+9.50e-03

Pos logit contributions
 brothers+0.017
 mistakes+0.016
lee+0.016
 important+0.016
 adorable+0.016
Neg logit contributions
Circ-0.014
fall-0.013
Ready-0.013
ground-0.013
Wait-0.013
Token to
Token ID284
Feature activation+1.52e-01

Pos logit contributions
Circ+0.011
fall+0.011
ground+0.010
Ready+0.010
+0.010
Neg logit contributions
 brothers-0.021
 important-0.021
 mistakes-0.021
 feelings-0.021
lee-0.020
Token wait
Token ID4043
Feature activation-4.75e-02

Pos logit contributions
fall+0.185
Circ+0.182
 gra+0.182
ground+0.178
Wait+0.178
Neg logit contributions
lee-0.330
ushing-0.311
 brothers-0.311
 mistakes-0.309
heart-0.307
Token.
Token ID13
Feature activation+5.61e-05

Pos logit contributions
ity+0.074
lee+0.073
 brothers+0.072
aky+0.072
heart+0.072
Neg logit contributions
fall-0.084
ground-0.082
 stairs-0.080
Circ-0.078
 gra-0.076
Token I
Token ID314
Feature activation+3.18e-02

Pos logit contributions
Circ+0.000
fall+0.000
Ready+0.000
ground+0.000
Wait+0.000
Neg logit contributions
 brothers-0.000
lee-0.000
 mistakes-0.000
heart-0.000
 important-0.000
Token have
Token ID423
Feature activation+9.16e-03

Pos logit contributions
Circ+0.055
fall+0.051
ground+0.051
 stairs+0.051
Ready+0.050
Neg logit contributions
 brothers-0.052
 important-0.052
aky-0.051
 mistakes-0.050
 adorable-0.050
Token to
Token ID284
Feature activation+1.52e-01

Pos logit contributions
Circ+0.012
fall+0.011
Ready+0.011
ground+0.011
+0.011
Neg logit contributions
 brothers-0.019
 mistakes-0.019
lee-0.019
 important-0.019
heart-0.018
Token go
Token ID467
Feature activation+5.18e-02

Pos logit contributions
Circ+0.188
fall+0.183
 gra+0.180
ground+0.179
+0.175
Neg logit contributions
lee-0.322
 brothers-0.311
 mistakes-0.311
ushing-0.309
heart-0.309
Token to
Token ID284
Feature activation+1.52e-01

Pos logit contributions
Circ+0.074
Ready+0.067
Wait+0.067
Find+0.066
+0.066
Neg logit contributions
 brothers-0.104
 important-0.101
 mistakes-0.101
lee-0.100
 adorable-0.099
Token time
Token ID640
Feature activation+1.09e-01

Pos logit contributions
lee+0.049
 brothers+0.049
 mistakes+0.048
heart+0.047
 sisters+0.046
Neg logit contributions
Circ-0.040
-0.038
fall-0.037
Wait-0.037
ground-0.036
Token,
Token ID11
Feature activation-5.44e-03

Pos logit contributions
Circ+0.263
Ready+0.250
fall+0.248
 gra+0.248
Find+0.246
Neg logit contributions
 brothers-0.111
 mistakes-0.106
lee-0.105
 adorable-0.103
 important-0.102
Token there
Token ID612
Feature activation+1.75e-01

Pos logit contributions
lee+0.012
 mistakes+0.012
 brothers+0.012
heart+0.012
ity+0.012
Neg logit contributions
Circ-0.006
Wait-0.006
ground-0.006
Find-0.006
fall-0.006
Token were
Token ID547
Feature activation-8.33e-03

Pos logit contributions
Circ+0.338
fall+0.322
Ready+0.317
ground+0.310
Find+0.310
Neg logit contributions
 brothers-0.257
 adorable-0.246
 mistakes-0.246
lee-0.245
 important-0.244
Token two
Token ID734
Feature activation-6.82e-02

Pos logit contributions
ushing+0.014
heart+0.014
ity+0.014
ious+0.013
aky+0.013
Neg logit contributions
Circ-0.015
ground-0.013
 stairs-0.013
Find-0.013
-0.013
Token twins
Token ID20345
Feature activation+1.33e-02

Pos logit contributions
heart+0.083
asured+0.083
ushing+0.073
nie+0.073
 darling+0.072
Neg logit contributions
 stairs-0.133
 hills-0.130
 ladder-0.130
 walk-0.129
 hike-0.127
Token.
Token ID13
Feature activation+2.06e-03

Pos logit contributions
ground+0.019
 hunt+0.019
Circ+0.019
fall+0.018
+0.018
Neg logit contributions
heart-0.024
 mistakes-0.024
 darling-0.023
ity-0.023
 important-0.022
Token They
Token ID1119
Feature activation-3.46e-02

Pos logit contributions
Circ+0.003
fall+0.003
Ready+0.003
ground+0.003
+0.003
Neg logit contributions
 mistakes-0.004
 brothers-0.004
lee-0.003
heart-0.003
ity-0.003
Token were
Token ID547
Feature activation-8.88e-03

Pos logit contributions
 mistakes+0.078
heart+0.073
ushing+0.068
lee+0.067
 darling+0.067
Neg logit contributions
 arrived-0.049
Circ-0.046
 arrive-0.045
 ride-0.044
Find-0.044
Token very
Token ID845
Feature activation-8.61e-02

Pos logit contributions
heart+0.016
ity+0.015
ious+0.015
ushing+0.015
 mistakes+0.015
Neg logit contributions
 gra-0.015
Circ-0.014
 stairs-0.014
 arrive-0.014
 Enter-0.014
Token old
Token ID1468
Feature activation-9.80e-02

Pos logit contributions
heart+0.123
 mistakes+0.115
hew+0.107
ity+0.105
 feelings+0.103
Neg logit contributions
Circ-0.174
 stairs-0.171
fall-0.170
bel-0.167
 gra-0.167
Token"
Token ID1
Feature activation+6.09e-02

Pos logit contributions
Ready+0.146
Circ+0.140
Wait+0.139
To+0.134
Find+0.132
Neg logit contributions
lee-0.260
heart-0.252
ity-0.249
ious-0.248
ushing-0.246
TokenFine
Token ID34389
Feature activation-1.80e-01

Pos logit contributions
Circ+0.100
 gra+0.097
fall+0.097
ground+0.095
 shortcut+0.093
Neg logit contributions
 brothers-0.104
 mistakes-0.103
 important-0.102
 adorable-0.102
 siblings-0.100
Token,
Token ID11
Feature activation-1.07e-02

Pos logit contributions
aky+0.349
lee+0.340
ushing+0.330
ity+0.327
 brothers+0.321
Neg logit contributions
fall-0.256
 stairs-0.248
ground-0.239
 hunt-0.234
Ready-0.233
Token Mom
Token ID11254
Feature activation-8.46e-02

Pos logit contributions
lee+0.018
aky+0.018
ity+0.017
ady+0.017
ious+0.017
Neg logit contributions
Ready-0.018
Wait-0.018
 stairs-0.017
Find-0.017
 shortcut-0.017
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
lee+0.158
ity+0.153
aky+0.150
 adorable+0.148
ushing+0.147
Neg logit contributions
fall-0.120
 stairs-0.111
-0.110
 hunt-0.109
Ready-0.107
Token I
Token ID314
Feature activation+2.88e-02

Pos logit contributions
 brothers+0.023
lee+0.023
 mistakes+0.022
ity+0.022
aky+0.022
Neg logit contributions
Circ-0.018
Ready-0.017
fall-0.017
Wait-0.017
-0.017
Token will
Token ID481
Feature activation+1.37e-02

Pos logit contributions
Circ+0.045
fall+0.042
Ready+0.042
Wait+0.041
Find+0.041
Neg logit contributions
lee-0.053
 brothers-0.053
 mistakes-0.052
heart-0.051
ity-0.050
Token try
Token ID1949
Feature activation-2.71e-02

Pos logit contributions
fall+0.019
Circ+0.018
ground+0.018
+0.018
 hunt+0.018
Neg logit contributions
lee-0.027
heart-0.027
 brothers-0.026
aky-0.026
ity-0.026
Token."
Token ID526
Feature activation+1.25e-02

Pos logit contributions
 brothers+0.059
heart+0.059
aky+0.058
ity+0.058
lee+0.058
Neg logit contributions
Circ-0.032
fall-0.029
 stairs-0.029
ground-0.028
Ready-0.028
Token Max
Token ID5436
Feature activation-9.35e-02

Pos logit contributions
Circ+0.023
fall+0.022
Ready+0.021
Wait+0.021
ground+0.021
Neg logit contributions
 brothers-0.020
lee-0.020
 mistakes-0.019
ity-0.019
heart-0.019
Token says
Token ID1139
Feature activation-6.86e-02

Pos logit contributions
ity+0.149
ious+0.146
aky+0.145
lee+0.145
heart+0.143
Neg logit contributions
Circ-0.173
Ready-0.168
fall-0.161
Wait-0.160
 stairs-0.158
Token and
Token ID290
Feature activation-1.85e-02

Pos logit contributions
 brothers+0.093
lee+0.093
 mistakes+0.091
heart+0.090
aky+0.089
Neg logit contributions
Circ-0.151
fall-0.144
ground-0.142
 gra-0.142
Ready-0.142
Token hugged
Token ID41130
Feature activation-7.85e-02

Pos logit contributions
 brothers+0.027
lee+0.027
 mistakes+0.027
 important+0.026
heart+0.026
Neg logit contributions
Circ-0.035
fall-0.033
ground-0.033
Ready-0.033
-0.033
Token her
Token ID607
Feature activation-7.04e-02

Pos logit contributions
 brothers+0.098
lee+0.096
 mistakes+0.096
heart+0.092
 important+0.092
Neg logit contributions
Circ-0.169
fall-0.161
Ready-0.159
ground-0.159
Wait-0.158
Token.
Token ID13
Feature activation-9.11e-03

Pos logit contributions
aky+0.105
lee+0.105
heart+0.104
ushing+0.103
 mistakes+0.100
Neg logit contributions
Circ-0.122
 gra-0.121
fall-0.117
 stairs-0.116
-0.116
Token He
Token ID679
Feature activation-6.60e-02

Pos logit contributions
 brothers+0.016
lee+0.016
 mistakes+0.016
 adorable+0.015
 important+0.015
Neg logit contributions
Circ-0.015
fall-0.014
ground-0.014
Ready-0.013
Wait-0.013
Token was
Token ID373
Feature activation-2.49e-02

Pos logit contributions
 mistakes+0.118
heart+0.116
ushing+0.111
ubby+0.111
ity+0.110
Neg logit contributions
Circ-0.103
 arrived-0.099
Wait-0.097
 gra-0.097
Under-0.095
Token sorry
Token ID7926
Feature activation-1.63e-01

Pos logit contributions
heart+0.032
ity+0.032
ushing+0.031
 mistakes+0.031
ious+0.030
Neg logit contributions
 gra-0.049
 arrive-0.047
fall-0.047
Find-0.046
Wait-0.046
Token for
Token ID329
Feature activation+3.73e-02

Pos logit contributions
aky+0.326
lee+0.323
 brothers+0.320
ady+0.318
ity+0.312
Neg logit contributions
fall-0.212
Circ-0.210
Wait-0.197
 stairs-0.192
 shortcut-0.190
Token her
Token ID607
Feature activation-7.17e-02

Pos logit contributions
Circ+0.083
Ready+0.077
ground+0.077
fall+0.077
+0.076
Neg logit contributions
 brothers-0.047
 mistakes-0.046
lee-0.045
 important-0.045
 adorable-0.045
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
lee+0.113
 brothers+0.110
heart+0.110
aky+0.109
ushing+0.107
Neg logit contributions
Circ-0.124
fall-0.122
 shortcut-0.115
 gra-0.115
 stairs-0.115
Token He
Token ID679
Feature activation-6.48e-02

Pos logit contributions
 brothers+0.021
lee+0.020
 adorable+0.020
 darling+0.019
 important+0.019
Neg logit contributions
Circ-0.021
fall-0.019
ground-0.019
 shortcut-0.019
 gra-0.019
Token end
Token ID886
Feature activation+7.90e-02

Pos logit contributions
lee+0.082
ushing+0.077
heart+0.077
 mistakes+0.075
ity+0.075
Neg logit contributions
Circ-0.095
fall-0.092
Ready-0.091
ground-0.090
-0.090
Token.
Token ID13
Feature activation-3.69e-03

Pos logit contributions
Circ+0.191
Find+0.185
Ready+0.184
Wait+0.183
+0.183
Neg logit contributions
 brothers-0.078
 important-0.077
 mistakes-0.076
 adorable-0.073
lee-0.071
Token<|endoftext|>
Token ID50256
Feature activation+1.49e-01

Pos logit contributions
 brothers+0.007
lee+0.007
 adorable+0.007
 important+0.007
 mistakes+0.007
Neg logit contributions
 shortcut-0.005
Ready-0.005
Circ-0.005
fall-0.005
Wait-0.005
TokenOnce
Token ID7454
Feature activation+7.22e-02

Pos logit contributions
Find+0.259
Circ+0.257
bel+0.257
Wait+0.257
fall+0.256
Neg logit contributions
 brothers-0.241
 twins-0.218
 adorable-0.210
/-0.208
 years-0.206
Token upon
Token ID2402
Feature activation+6.63e-02

Pos logit contributions
Circ+0.112
fall+0.105
Ready+0.103
ground+0.102
Wait+0.101
Neg logit contributions
 brothers-0.134
lee-0.131
 mistakes-0.131
 important-0.128
heart-0.128
Token a
Token ID257
Feature activation-2.57e-02

Pos logit contributions
Circ+0.120
fall+0.113
Wait+0.111
Ready+0.110
ground+0.110
Neg logit contributions
lee-0.107
 brothers-0.105
 mistakes-0.104
heart-0.102
ushing-0.101
Token time
Token ID640
Feature activation+1.09e-01

Pos logit contributions
lee+0.052
 brothers+0.052
 mistakes+0.051
 sisters+0.049
heart+0.049
Neg logit contributions
Circ-0.036
-0.034
Wait-0.033
fall-0.033
 hike-0.033
Token,
Token ID11
Feature activation-3.69e-03

Pos logit contributions
Circ+0.261
Ready+0.248
fall+0.247
 gra+0.245
Find+0.244
Neg logit contributions
 brothers-0.113
lee-0.108
 mistakes-0.108
 important-0.105
 adorable-0.105
Token there
Token ID612
Feature activation+1.75e-01

Pos logit contributions
lee+0.008
 mistakes+0.008
 brothers+0.008
heart+0.008
ity+0.007
Neg logit contributions
Circ-0.005
fall-0.004
ground-0.004
Wait-0.004
Ready-0.004
Token was
Token ID373
Feature activation-2.24e-02

Pos logit contributions
Circ+0.327
fall+0.310
Ready+0.305
ground+0.300
Find+0.299
Neg logit contributions
 brothers-0.271
 mistakes-0.262
lee-0.260
 important-0.258
 adorable-0.258
Token a
Token ID257
Feature activation-2.54e-02

Pos logit contributions
lee+0.049
ushing+0.048
heart+0.048
 mistakes+0.047
ious+0.046
Neg logit contributions
Circ-0.032
-0.027
Under-0.027
ground-0.026
Be-0.026
Token to
Token ID284
Feature activation+1.49e-01

Pos logit contributions
 mistakes+0.105
 important+0.105
 brothers+0.103
lee+0.102
 adorable+0.101
Neg logit contributions
ground-0.068
fall-0.067
Circ-0.066
 gra-0.065
-0.065
Token clean
Token ID3424
Feature activation-8.14e-02

Pos logit contributions
fall+0.255
+0.249
 ride+0.246
 arrive+0.242
 walk+0.238
Neg logit contributions
aky-0.241
lee-0.241
ious-0.240
 brothers-0.238
dy-0.234
Token her
Token ID607
Feature activation-6.66e-02

Pos logit contributions
aky+0.173
ady+0.160
ity+0.159
 brothers+0.159
ushing+0.157
Neg logit contributions
 stairs-0.106
 aisle-0.103
ground-0.100
fall-0.100
 hills-0.092
Token room
Token ID2119
Feature activation+1.37e-01

Pos logit contributions
aky+0.106
heart+0.105
lee+0.102
 brothers+0.098
ious+0.097
Neg logit contributions
 stairs-0.119
Circ-0.111
ground-0.109
fall-0.109
-0.107
Token,
Token ID11
Feature activation-8.76e-03

Pos logit contributions
 stairs+0.245
fall+0.242
+0.240
 aisle+0.236
ground+0.235
Neg logit contributions
 brothers-0.213
lee-0.207
ity-0.202
aky-0.200
heart-0.194
Token so
Token ID523
Feature activation-4.38e-02

Pos logit contributions
aky+0.017
ity+0.017
 brothers+0.017
ubby+0.017
ious+0.017
Neg logit contributions
Wait-0.012
 stairs-0.012
Circ-0.011
fall-0.011
 bench-0.011
Token she
Token ID673
Feature activation-2.17e-02

Pos logit contributions
aky+0.115
ady+0.114
 mistakes+0.112
ubby+0.111
lee+0.111
Neg logit contributions
Wait-0.029
 stairs-0.028
fall-0.025
 aisle-0.025
bel-0.024
Token didn
Token ID1422
Feature activation-5.34e-02

Pos logit contributions
 mistakes+0.046
 instant+0.042
 choices+0.042
 feelings+0.042
ity+0.042
Neg logit contributions
Under-0.027
 arrived-0.027
Wait-0.027
Circ-0.026
Find-0.025
Token't
Token ID470
Feature activation-3.77e-02

Pos logit contributions
 mistakes+0.141
 brothers+0.138
 important+0.138
 adorable+0.135
lee+0.135
Neg logit contributions
Circ-0.041
Ready-0.036
 gra-0.036
Wait-0.035
fall-0.034
Token.
Token ID13
Feature activation-6.28e-03

Pos logit contributions
lee+0.065
 mistakes+0.063
 brothers+0.063
ious+0.062
ushing+0.060
Neg logit contributions
fall-0.061
Circ-0.061
Wait-0.061
 shortcut-0.058
 ride-0.058
Token Her
Token ID2332
Feature activation-1.57e-01

Pos logit contributions
 brothers+0.011
 mistakes+0.010
lee+0.010
aky+0.010
heart+0.010
Neg logit contributions
Circ-0.011
fall-0.010
Ready-0.010
Wait-0.010
ground-0.010
Token reach
Token ID3151
Feature activation+6.08e-02

Pos logit contributions
Circ+0.237
fall+0.212
Ready+0.211
Wait+0.211
ground+0.204
Neg logit contributions
 brothers-0.288
 mistakes-0.285
lee-0.280
 important-0.275
 adorable-0.275
Token the
Token ID262
Feature activation+2.46e-02

Pos logit contributions
Circ+0.116
Wait+0.108
fall+0.107
Ready+0.107
Find+0.107
Neg logit contributions
 mistakes-0.093
 brothers-0.092
 important-0.090
lee-0.090
 adorable-0.087
Token stars
Token ID5788
Feature activation+7.12e-02

Pos logit contributions
Circ+0.042
fall+0.038
Wait+0.037
East+0.036
Find+0.036
Neg logit contributions
 mistakes-0.045
 important-0.044
 brothers-0.043
 adorable-0.042
 darling-0.041
Token!
Token ID0
Feature activation-3.81e-03

Pos logit contributions
Circ+0.153
Ready+0.144
Wait+0.142
fall+0.142
Find+0.141
Neg logit contributions
 brothers-0.090
lee-0.088
 mistakes-0.088
heart-0.086
ity-0.085
Token She
Token ID1375
Feature activation-7.31e-02

Pos logit contributions
lee+0.006
 brothers+0.006
 mistakes+0.005
 darling+0.005
 siblings+0.005
Neg logit contributions
Circ-0.007
 shortcut-0.007
 hunt-0.007
fall-0.007
ground-0.007
Token was
Token ID373
Feature activation-2.39e-02

Pos logit contributions
 mistakes+0.120
 brothers+0.119
lee+0.116
heart+0.116
 important+0.115
Neg logit contributions
Circ-0.126
fall-0.117
ground-0.117
Find-0.116
Ready-0.116
Token so
Token ID523
Feature activation-4.28e-02

Pos logit contributions
ity+0.029
heart+0.029
 mistakes+0.029
 brothers+0.028
ushing+0.028
Neg logit contributions
Circ-0.049
fall-0.048
 gra-0.048
ground-0.048
Find-0.047
Token happy
Token ID3772
Feature activation-1.61e-01

Pos logit contributions
heart+0.043
 mistakes+0.042
 brothers+0.041
ity+0.040
lee+0.040
Neg logit contributions
Circ-0.099
fall-0.099
Wait-0.096
 gra-0.096
-0.096
Token that
Token ID326
Feature activation+2.88e-02

Pos logit contributions
 brothers+0.233
 mistakes+0.228
lee+0.228
 important+0.221
heart+0.220
Neg logit contributions
Circ-0.314
fall-0.297
Ready-0.293
ground-0.292
Wait-0.290
Token she
Token ID673
Feature activation-2.48e-02

Pos logit contributions
Circ+0.049
fall+0.046
ground+0.045
+0.045
Ready+0.045
Neg logit contributions
 brothers-0.050
 important-0.049
 mistakes-0.049
 adorable-0.048
heart-0.048
Token laughed
Token ID13818
Feature activation-1.19e-01

Pos logit contributions
 mistakes+0.044
 brothers+0.043
 feelings+0.042
heart+0.042
ity+0.042
Neg logit contributions
Circ-0.039
fall-0.038
Ready-0.037
ground-0.037
Find-0.037
Token laughed
Token ID13818
Feature activation-1.15e-01

Pos logit contributions
 brothers+0.126
lee+0.123
 mistakes+0.122
 important+0.120
 adorable+0.119
Neg logit contributions
Circ-0.124
fall-0.116
ground-0.114
Ready-0.113
-0.113
Token and
Token ID290
Feature activation-2.66e-02

Pos logit contributions
lee+0.185
aky+0.181
heart+0.181
 mistakes+0.179
nie+0.178
Neg logit contributions
fall-0.200
Circ-0.198
Wait-0.190
 hills-0.186
 stairs-0.185
Token said
Token ID531
Feature activation-8.35e-02

Pos logit contributions
 brothers+0.037
lee+0.036
 adorable+0.035
 important+0.034
 mistakes+0.034
Neg logit contributions
Circ-0.055
ground-0.051
fall-0.051
Ready-0.050
Find-0.050
Token,
Token ID11
Feature activation-8.33e-03

Pos logit contributions
 brothers+0.098
lee+0.097
 mistakes+0.095
 important+0.092
 adorable+0.091
Neg logit contributions
Circ-0.188
ground-0.178
Ready-0.177
fall-0.177
Find-0.174
Token "
Token ID366
Feature activation+7.08e-02

Pos logit contributions
lee+0.013
 brothers+0.012
 darling+0.012
 adorable+0.012
 careful+0.012
Neg logit contributions
Circ-0.016
 shortcut-0.015
ground-0.015
Find-0.014
Ready-0.014
TokenThat
Token ID2504
Feature activation-6.33e-02

Pos logit contributions
Circ+0.084
ground+0.080
 shortcut+0.080
fall+0.079
+0.076
Neg logit contributions
heart-0.147
 brothers-0.147
 adorable-0.146
ity-0.145
 mistakes-0.144
Token was
Token ID373
Feature activation-2.47e-02

Pos logit contributions
 brothers+0.095
 mistakes+0.093
 important+0.092
 adorable+0.091
heart+0.091
Neg logit contributions
Circ-0.122
fall-0.114
ground-0.113
-0.112
Find-0.111
Token fun
Token ID1257
Feature activation-1.01e-01

Pos logit contributions
lee+0.030
 brothers+0.029
 mistakes+0.029
ity+0.029
heart+0.028
Neg logit contributions
Circ-0.054
Ready-0.052
Wait-0.051
fall-0.051
 shortcut-0.050
Token!"
Token ID2474
Feature activation+4.14e-03

Pos logit contributions
aky+0.131
lee+0.131
 brothers+0.130
 mistakes+0.124
ity+0.123
Neg logit contributions
fall-0.207
Circ-0.205
 stairs-0.202
ground-0.199
-0.196
Token Ben
Token ID3932
Feature activation-1.22e-01

Pos logit contributions
Circ+0.009
+0.008
fall+0.008
Wait+0.008
Ready+0.008
Neg logit contributions
aky-0.005
ity-0.005
 brothers-0.005
 important-0.005
 mistakes-0.005
Token waited
Token ID13488
Feature activation-2.91e-02

Pos logit contributions
 mistakes+0.201
 brothers+0.191
heart+0.191
lee+0.190
 important+0.189
Neg logit contributions
Circ-0.217
Wait-0.205
fall-0.203
Ready-0.202
Find-0.196
Token scared
Token ID12008
Feature activation-1.32e-01

Pos logit contributions
 mistakes+0.118
ity+0.115
heart+0.114
 brothers+0.114
lee+0.114
Neg logit contributions
Circ-0.191
fall-0.191
Wait-0.182
ground-0.181
 stairs-0.180
Token.
Token ID13
Feature activation-9.55e-03

Pos logit contributions
 brothers+0.182
lee+0.179
 mistakes+0.178
 important+0.172
heart+0.171
Neg logit contributions
Circ-0.265
fall-0.254
Ready-0.248
ground-0.247
Wait-0.247
Token He
Token ID679
Feature activation-6.40e-02

Pos logit contributions
 brothers+0.016
lee+0.015
 mistakes+0.015
 important+0.015
 adorable+0.014
Neg logit contributions
Circ-0.017
fall-0.016
ground-0.016
 gra-0.016
Find-0.016
Token had
Token ID550
Feature activation+1.09e-02

Pos logit contributions
 mistakes+0.106
 brothers+0.102
heart+0.101
ushing+0.100
lee+0.100
Neg logit contributions
Circ-0.109
fall-0.103
Wait-0.103
ground-0.103
Ready-0.102
Token a
Token ID257
Feature activation-3.12e-02

Pos logit contributions
Circ+0.021
fall+0.020
Ready+0.019
ground+0.019
Wait+0.019
Neg logit contributions
 brothers-0.016
 mistakes-0.016
lee-0.016
heart-0.016
 important-0.015
Token great
Token ID1049
Feature activation-1.51e-01

Pos logit contributions
 brothers+0.033
lee+0.033
 mistakes+0.032
heart+0.031
ity+0.031
Neg logit contributions
Circ-0.073
fall-0.069
Ready-0.069
ground-0.069
Wait-0.068
Token time
Token ID640
Feature activation+1.04e-01

Pos logit contributions
lee+0.280
 brothers+0.276
 mistakes+0.274
heart+0.270
aky+0.269
Neg logit contributions
Circ-0.237
fall-0.220
ground-0.217
Wait-0.215
 shortcut-0.215
Token,
Token ID11
Feature activation-1.16e-02

Pos logit contributions
Circ+0.207
Ready+0.191
Wait+0.190
fall+0.190
Find+0.188
Neg logit contributions
 brothers-0.153
 mistakes-0.148
lee-0.147
 important-0.144
 adorable-0.142
Token learning
Token ID4673
Feature activation-7.49e-02

Pos logit contributions
 brothers+0.015
lee+0.015
 mistakes+0.015
heart+0.014
ity+0.014
Neg logit contributions
Circ-0.024
fall-0.023
Ready-0.023
ground-0.023
Wait-0.023
Token new
Token ID649
Feature activation-9.04e-02

Pos logit contributions
 brothers+0.087
 mistakes+0.086
 important+0.085
 adorable+0.080
 feelings+0.080
Neg logit contributions
Circ-0.173
-0.167
fall-0.166
Ready-0.164
Find-0.164
Token things
Token ID1243
Feature activation+1.20e-01

Pos logit contributions
 brothers+0.132
lee+0.132
aky+0.128
 mistakes+0.128
ushing+0.126
Neg logit contributions
Circ-0.173
fall-0.163
Ready-0.160
ground-0.160
Wait-0.159
Token talk
Token ID1561
Feature activation-4.00e-02

Pos logit contributions
Circ+0.098
Ready+0.091
ground+0.090
fall+0.090
Wait+0.089
Neg logit contributions
 brothers-0.103
 adorable-0.103
 important-0.102
 mistakes-0.100
lee-0.098
Token.
Token ID13
Feature activation-4.29e-03

Pos logit contributions
 brothers+0.046
 mistakes+0.045
 important+0.045
lee+0.044
 adorable+0.044
Neg logit contributions
Circ-0.090
-0.086
fall-0.085
Ready-0.085
Find-0.084
Token Lily
Token ID20037
Feature activation-1.35e-01

Pos logit contributions
lee+0.006
 brothers+0.006
 mistakes+0.006
 adorable+0.006
 darling+0.006
Neg logit contributions
Circ-0.008
fall-0.008
ground-0.008
 shortcut-0.008
bel-0.008
Token is
Token ID318
Feature activation-1.54e-02

Pos logit contributions
heart+0.244
ious+0.243
ity+0.241
aky+0.237
ushing+0.227
Neg logit contributions
Circ-0.211
Ready-0.207
 stairs-0.205
Find-0.197
 aisle-0.197
Token very
Token ID845
Feature activation-8.57e-02

Pos logit contributions
heart+0.021
 mistakes+0.020
hew+0.020
lee+0.019
ushing+0.019
Neg logit contributions
Circ-0.031
Find-0.030
 gra-0.029
Ready-0.029
fall-0.029
Token happy
Token ID3772
Feature activation-1.61e-01

Pos logit contributions
heart+0.123
 mistakes+0.114
ious+0.110
ice+0.109
hew+0.107
Neg logit contributions
bel-0.160
Circ-0.160
Find-0.160
Under-0.157
fall-0.154
Token.
Token ID13
Feature activation-5.15e-03

Pos logit contributions
lee+0.217
 brothers+0.215
 mistakes+0.211
aky+0.208
ity+0.206
Neg logit contributions
Circ-0.326
fall-0.312
ground-0.304
Ready-0.303
Wait-0.302
Token She
Token ID1375
Feature activation-7.36e-02

Pos logit contributions
lee+0.008
 brothers+0.008
 mistakes+0.008
 adorable+0.008
 darling+0.008
Neg logit contributions
Circ-0.009
fall-0.009
ground-0.009
 shortcut-0.008
 gra-0.008
Token wants
Token ID3382
Feature activation-5.10e-02

Pos logit contributions
heart+0.126
ity+0.125
ious+0.120
aky+0.120
 mistakes+0.119
Neg logit contributions
Circ-0.125
Ready-0.124
 gra-0.117
 hunt-0.117
 stairs-0.117
Token to
Token ID284
Feature activation+1.50e-01

Pos logit contributions
 brothers+0.108
 mistakes+0.106
 important+0.106
lee+0.104
 feelings+0.104
Neg logit contributions
-0.070
fall-0.066
ground-0.066
Circ-0.064
Ready-0.064
Token show
Token ID905
Feature activation+2.07e-02

Pos logit contributions
Circ+0.224
fall+0.211
ground+0.207
Ready+0.207
Wait+0.205
Neg logit contributions
 brothers-0.286
lee-0.278
 mistakes-0.277
 adorable-0.276
 important-0.273
Token for
Token ID329
Feature activation+3.34e-02

Pos logit contributions
Circ+0.207
Ready+0.196
fall+0.194
Wait+0.194
Find+0.193
Neg logit contributions
 brothers-0.094
 mistakes-0.089
lee-0.088
 important-0.087
heart-0.086
Token sale
Token ID5466
Feature activation+4.34e-02

Pos logit contributions
Circ+0.075
fall+0.071
Ready+0.071
Wait+0.070
+0.070
Neg logit contributions
 brothers-0.041
 mistakes-0.040
 important-0.038
lee-0.038
 adorable-0.037
Token!"
Token ID2474
Feature activation+2.69e-03

Pos logit contributions
Circ+0.098
fall+0.094
ground+0.093
Ready+0.093
Wait+0.092
Neg logit contributions
 brothers-0.049
lee-0.048
 mistakes-0.047
heart-0.045
 important-0.045
Token Benny
Token ID44275
Feature activation-1.23e-01

Pos logit contributions
Circ+0.005
fall+0.005
+0.005
Ready+0.005
ground+0.005
Neg logit contributions
 mistakes-0.004
 brothers-0.004
lee-0.004
aky-0.004
ity-0.004
Token was
Token ID373
Feature activation-3.04e-02

Pos logit contributions
 mistakes+0.241
heart+0.229
ubby+0.225
aky+0.224
 brothers+0.224
Neg logit contributions
Circ-0.178
 arrived-0.165
fall-0.163
Wait-0.158
ground-0.156
Token happy
Token ID3772
Feature activation-1.65e-01

Pos logit contributions
heart+0.041
ity+0.039
lee+0.038
illy+0.037
hew+0.037
Neg logit contributions
Circ-0.061
fall-0.060
 gra-0.059
Find-0.058
Wait-0.058
Token because
Token ID780
Feature activation-1.99e-02

Pos logit contributions
 brothers+0.256
lee+0.251
 mistakes+0.251
 important+0.243
heart+0.243
Neg logit contributions
Circ-0.308
fall-0.290
Ready-0.286
ground-0.285
Wait-0.283
Token he
Token ID339
Feature activation-1.05e-02

Pos logit contributions
lee+0.053
ity+0.052
ushing+0.052
aky+0.052
 mistakes+0.052
Neg logit contributions
Circ-0.014
 stairs-0.013
ground-0.013
fall-0.012
Ready-0.012
Token could
Token ID714
Feature activation+8.08e-02

Pos logit contributions
 mistakes+0.017
 brothers+0.016
lee+0.016
heart+0.016
ity+0.015
Neg logit contributions
Circ-0.020
fall-0.018
Ready-0.018
Wait-0.018
Find-0.018
Token buy
Token ID2822
Feature activation+2.34e-02

Pos logit contributions
Circ+0.127
Ready+0.117
fall+0.115
ground+0.115
Wait+0.114
Neg logit contributions
 brothers-0.154
 mistakes-0.147
 important-0.146
lee-0.146
 adorable-0.145
Token lots
Token ID6041
Feature activation-5.86e-02

Pos logit contributions
Circ+0.043
fall+0.040
Ready+0.040
ground+0.039
Wait+0.039
Neg logit contributions
 brothers-0.036
lee-0.036
 mistakes-0.036
heart-0.035
 important-0.035
TokenThe
Token ID464
Feature activation-5.68e-02

Pos logit contributions
Circ+0.184
fall+0.170
Ready+0.169
ground+0.167
Wait+0.167
Neg logit contributions
 brothers-0.222
lee-0.219
 mistakes-0.218
heart-0.213
 important-0.212
Token little
Token ID1310
Feature activation-8.62e-02

Pos logit contributions
lee+0.078
heart+0.076
ushing+0.074
 mistakes+0.072
ity+0.072
Neg logit contributions
Circ-0.115
Ready-0.111
Wait-0.109
fall-0.108
 stairs-0.108
Token boy
Token ID2933
Feature activation-3.28e-03

Pos logit contributions
lee+0.165
ushing+0.155
 mistakes+0.152
ity+0.150
aky+0.149
Neg logit contributions
Circ-0.132
Find-0.125
Wait-0.125
Ready-0.123
 stairs-0.122
Token was
Token ID373
Feature activation-2.88e-02

Pos logit contributions
 mistakes+0.005
 brothers+0.005
lee+0.005
ushing+0.005
heart+0.005
Neg logit contributions
Circ-0.005
fall-0.005
ground-0.005
Wait-0.005
 gra-0.005
Token so
Token ID523
Feature activation-4.53e-02

Pos logit contributions
 brothers+0.031
 mistakes+0.031
lee+0.031
heart+0.030
ity+0.030
Neg logit contributions
Circ-0.064
fall-0.062
 gra-0.062
ground-0.061
Find-0.061
Token relieved
Token ID22598
Feature activation-1.52e-01

Pos logit contributions
 mistakes+0.047
lee+0.047
 brothers+0.047
heart+0.047
ity+0.045
Neg logit contributions
Circ-0.101
fall-0.100
Wait-0.098
 gra-0.097
Find-0.097
Token and
Token ID290
Feature activation-2.70e-02

Pos logit contributions
 brothers+0.207
lee+0.203
 mistakes+0.203
heart+0.196
 important+0.196
Neg logit contributions
Circ-0.310
fall-0.291
Ready-0.290
ground-0.287
Wait-0.286
Token he
Token ID339
Feature activation-1.54e-02

Pos logit contributions
 mistakes+0.040
aky+0.039
 brothers+0.039
ushing+0.039
lee+0.039
Neg logit contributions
Circ-0.048
fall-0.048
Wait-0.047
 gra-0.047
Ready-0.046
Token felt
Token ID2936
Feature activation-7.25e-02

Pos logit contributions
 mistakes+0.023
 brothers+0.023
heart+0.022
ity+0.022
lee+0.022
Neg logit contributions
Circ-0.028
fall-0.027
Wait-0.027
ground-0.027
 gra-0.027
Token reminded
Token ID14516
Feature activation-8.98e-02

Pos logit contributions
 brothers+0.088
ity+0.088
lee+0.088
heart+0.086
 mistakes+0.085
Neg logit contributions
Circ-0.150
fall-0.149
Wait-0.149
 gra-0.147
Ready-0.146
Token not
Token ID407
Feature activation-5.66e-02

Pos logit contributions
 brothers+0.145
lee+0.144
 mistakes+0.141
aky+0.141
ity+0.140
Neg logit contributions
Circ-0.152
fall-0.147
ground-0.145
-0.142
Wait-0.142
Token
Token ID198
Feature activation+1.22e-01

Pos logit contributions
 brothers+0.011
 adorable+0.011
 darling+0.011
 mistakes+0.010
 important+0.010
Neg logit contributions
Circ-0.014
 hike-0.014
 shortcut-0.014
 gra-0.014
ground-0.013
Token
Token ID198
Feature activation+1.22e-01

Pos logit contributions
Circ+0.139
fall+0.127
Ready+0.123
ground+0.122
Wait+0.121
Neg logit contributions
 brothers-0.276
 mistakes-0.272
lee-0.272
 important-0.267
heart-0.266
Token"
Token ID1
Feature activation+6.55e-02

Pos logit contributions
Circ+0.204
Ready+0.200
Wait+0.193
Find+0.191
ground+0.189
Neg logit contributions
lee-0.223
 brothers-0.211
 mistakes-0.210
heart-0.210
 adorable-0.206
TokenNo
Token ID2949
Feature activation-8.98e-02

Pos logit contributions
ground+0.106
fall+0.105
 gra+0.104
 hike+0.103
 hunt+0.100
Neg logit contributions
 careful-0.106
 adorable-0.104
 important-0.103
aky-0.103
 feelings-0.099
Token,
Token ID11
Feature activation-4.48e-03

Pos logit contributions
aky+0.146
lee+0.141
ushing+0.138
aming+0.137
asured+0.135
Neg logit contributions
fall-0.156
 shortcut-0.155
 stairs-0.153
Circ-0.151
 hunt-0.150
Token silly
Token ID14397
Feature activation-1.78e-01

Pos logit contributions
 brothers+0.007
 mistakes+0.007
lee+0.007
 important+0.007
 adorable+0.007
Neg logit contributions
Circ-0.008
fall-0.008
ground-0.007
Ready-0.007
-0.007
Token,
Token ID11
Feature activation-4.88e-03

Pos logit contributions
 brothers+0.225
lee+0.223
 mistakes+0.220
 important+0.214
aky+0.214
Neg logit contributions
Circ-0.375
fall-0.364
ground-0.355
Ready-0.354
Wait-0.352
Token ze
Token ID41271
Feature activation-3.03e-02

Pos logit contributions
 brothers+0.008
lee+0.008
 mistakes+0.008
heart+0.008
 important+0.008
Neg logit contributions
Circ-0.008
fall-0.008
Ready-0.008
ground-0.008
Wait-0.008
Tokenbr
Token ID1671
Feature activation+1.37e-01

Pos logit contributions
 brothers+0.076
/+0.073
aky+0.073
ushing+0.073
 important+0.072
Neg logit contributions
Circ-0.029
bel-0.024
Find-0.023
Wait-0.023
 gra-0.022
Tokenas
Token ID292
Feature activation+6.82e-02

Pos logit contributions
Circ+0.331
fall+0.324
Ready+0.322
+0.320
Wait+0.319
Neg logit contributions
 brothers-0.138
aky-0.134
ity-0.131
lee-0.129
ushing-0.127
Token are
Token ID389
Feature activation-7.42e-03

Pos logit contributions
Circ+0.157
fall+0.148
Ready+0.147
Wait+0.146
ground+0.145
Neg logit contributions
 brothers-0.076
 mistakes-0.074
lee-0.073
heart-0.072
aky-0.070