TOP ACTIVATIONS
MAX = 0.471

big board . She was crying . " Are
An nie stopped crying and was very excited .
little girl . She was crying because she forgot her t
an old man who was crying and seemed to be in
kind lady saw Tim my crying and asked him what was
" I 'm not crying ," said Emma .
and found a little girl crying . El p said to
asked him why he was crying . The boy told her
sitting on the ground and crying . Jim was a compassionate
noises . It sounded like crying and so he followed the
apples . They felt like crying . But then
my saw his friend Sally crying . He asked her ,
wet and pin ching and crying . She was not happy
asked Lily why she was crying . Lily told her about
so happy . He stopped crying and went with the lady
with the hammer and started crying . Her mom had to
happy that she was not crying anymore . Later , she
was a picture of her crying . She did not like
, " Why are you crying ?" The bird told her
ily 's mom saw her crying and asked , " Why

MIN ACTIVATIONS
MIN = -0.535

can ! I know many songs . Let 's whistle together
She pret ends to sing songs with it .
, sad songs , silly songs and loud songs . She
, silly songs and loud songs . She sang to her
sang happy songs , sad songs , silly songs and loud
bed . She sang happy songs , sad songs , silly
many videos of animals , songs , and stories . Sometimes
at night . They sang songs about the sun , the
story with pupp ets and songs . Ben and Lily laughed
. They sang their favorite songs and made up some new
Can you teach me more songs ?" L ily
their guitar and sing their songs . They make everyone happy
to her family and her songs are beautiful because they come
. Anna liked to play songs for her family and friends
they could see cartoons or songs or stories . But today
and the cars and the songs . He wants to go
. She liked to play songs for her family and friends
they sing along with the songs or cl ap their hands
was so proud of the songs he had made .
and the colors and the songs . She had a big

INTERVAL 0.219 - 0.471
CONTAINS 4.252%

Fin sw am back up to Bub bles and
. She was sad . Her hand was cold and wet
. " It 's so big and shiny !"
Sarah was crying because she lost her favorite doll . Tim
wet and cold and still trembling . He said , "

INTERVAL -0.032 - 0.219
CONTAINS 63.051%

, they see a thin cat in the park . The
" OK , Ben . That 's a good one .
the Easter Bunny and now you know where I live !"
something new to measure . <|endoftext|> Once upon a time ,
door and it led to a big room . He saw

INTERVAL -0.284 - -0.032
CONTAINS 29.681%

only have cars and trucks and toys ." The robot says
a deep sense of content ment . He was glad he
was a little boy named Tim my . Tim my loved
to make a mess and play . Tom my
outside with his friends . One day , Tim my 's

INTERVAL -0.535 - -0.284
CONTAINS 3.016%

their blocks on top of each other , but they did
, but we couldn 't agree on the t eddy bear
John asked his mom multiple times , but his mom never
. One day , he heard a strange
and his friends were relieved they had made the right decision

Token big
Token ID1263
Feature activation+7.72e-02

Pos logit contributions
 crying+0.264
 mirror+0.251
 bleeding+0.236
 wardrobe+0.233
 ha+0.229
Neg logit contributions
 songs-0.288
 rods-0.275
 bond-0.265
 unity-0.263
 enjoyable-0.262
Token board
Token ID3096
Feature activation-4.30e-02

Pos logit contributions
 crying+0.114
 mirror+0.108
 bleeding+0.100
 wardrobe+0.099
 stub+0.098
Neg logit contributions
 songs-0.152
 rods-0.148
 unity-0.141
 bond-0.141
 habitats-0.140
Token.
Token ID13
Feature activation+9.99e-02

Pos logit contributions
 songs+0.085
 rods+0.083
 similarities+0.081
 habitats+0.081
 unity+0.081
Neg logit contributions
 crying-0.065
 mirror-0.061
 wardrobe-0.056
 bleeding-0.056
 ha-0.055
Token She
Token ID1375
Feature activation+3.20e-01

Pos logit contributions
 crying+0.161
 mirror+0.153
 bleeding+0.145
 tooth+0.143
 wardrobe+0.142
Neg logit contributions
 songs-0.183
 rods-0.179
 habitats-0.171
 unity-0.169
 similarities-0.167
Token was
Token ID373
Feature activation+6.08e-02

Pos logit contributions
 crying+0.464
 mirror+0.454
 ha+0.413
 wardrobe+0.412
 drawer+0.398
Neg logit contributions
 songs-0.656
 rods-0.630
 unity-0.614
 bond-0.613
 similarities-0.612
Token crying
Token ID13774
Feature activation+4.68e-01

Pos logit contributions
 mirror+0.095
 crying+0.094
 diary+0.090
 wardrobe+0.088
 bleeding+0.088
Neg logit contributions
 songs-0.110
 games-0.102
 enjoyable-0.101
 rods-0.101
asted-0.100
Token.
Token ID13
Feature activation+9.03e-02

Pos logit contributions
 crying+0.843
 mirror+0.809
 bleeding+0.751
 wardrobe+0.749
 ha+0.741
Neg logit contributions
 songs-0.789
 rods-0.769
 habitats-0.728
 unity-0.727
 similarities-0.725
Token
Token ID198
Feature activation+4.09e-02

Pos logit contributions
 tooth+0.150
 mirror+0.140
 crying+0.139
 bleeding+0.139
 wardrobe+0.139
Neg logit contributions
 They-0.146
 songs-0.139
 rods-0.139
 they-0.137
 habitats-0.137
Token
Token ID198
Feature activation+4.05e-02

Pos logit contributions
 crying+0.075
 mirror+0.075
 tooth+0.073
 wardrobe+0.071
 bleeding+0.070
Neg logit contributions
 songs-0.062
 rods-0.061
Years-0.060
 lesson-0.058
 habitats-0.057
Token"
Token ID1
Feature activation+5.17e-02

Pos logit contributions
 crying+0.071
 mirror+0.068
 tooth+0.066
 wardrobe+0.064
 bleeding+0.063
Neg logit contributions
 songs-0.068
 rods-0.066
Years-0.065
 similarities-0.063
asted-0.063
TokenAre
Token ID8491
Feature activation+1.86e-01

Pos logit contributions
 crying+0.080
 wardrobe+0.078
 mirror+0.077
 cabinet+0.076
 drawer+0.075
Neg logit contributions
 songs-0.092
 rods-0.088
Years-0.086
asted-0.085
 similarities-0.084
Token
Token ID198
Feature activation+4.06e-02

Pos logit contributions
 songs+0.004
 rods+0.004
 habitats+0.004
 unity+0.004
 similarities+0.004
Neg logit contributions
 crying-0.003
 mirror-0.003
 bleeding-0.003
 wardrobe-0.003
 tooth-0.003
Token
Token ID198
Feature activation+3.99e-02

Pos logit contributions
 crying+0.067
 mirror+0.064
 wardrobe+0.059
 bleeding+0.059
 tooth+0.058
Neg logit contributions
 songs-0.074
 rods-0.072
 habitats-0.069
 similarities-0.069
 unity-0.069
TokenAn
Token ID2025
Feature activation+1.82e-01

Pos logit contributions
 crying+0.064
 mirror+0.062
 ha+0.057
 bleeding+0.056
 wardrobe+0.055
Neg logit contributions
 songs-0.074
 rods-0.073
 facts-0.070
 habitats-0.070
 unity-0.069
Tokennie
Token ID11952
Feature activation+3.91e-02

Pos logit contributions
 crying+0.276
 mirror+0.263
 wardrobe+0.242
 ha+0.239
 tooth+0.238
Neg logit contributions
 songs-0.352
 rods-0.350
 similarities-0.331
 habitats-0.330
 facts-0.330
Token stopped
Token ID5025
Feature activation+5.51e-02

Pos logit contributions
 crying+0.060
 mirror+0.057
 ha+0.052
 wardrobe+0.052
 bleeding+0.052
Neg logit contributions
 songs-0.077
 rods-0.075
 habitats-0.072
 similarities-0.072
 unity-0.072
Token crying
Token ID13774
Feature activation+4.66e-01

Pos logit contributions
 mirror+0.082
 crying+0.081
 ha+0.077
 wardrobe+0.077
 tooth+0.076
Neg logit contributions
 songs-0.103
 rods-0.098
 unity-0.095
 habitats-0.095
 similarities-0.094
Token and
Token ID290
Feature activation-1.26e-01

Pos logit contributions
 crying+0.915
 mirror+0.883
 bleeding+0.825
 wardrobe+0.824
 ha+0.818
Neg logit contributions
 songs-0.710
 rods-0.682
 habitats-0.642
 unity-0.642
 similarities-0.639
Token was
Token ID373
Feature activation+5.01e-02

Pos logit contributions
 songs+0.222
 they+0.207
 rods+0.207
asted+0.206
 games+0.206
Neg logit contributions
 mirror-0.189
 crying-0.189
 bleeding-0.181
 cabinet-0.178
 wardrobe-0.177
Token very
Token ID845
Feature activation+1.17e-01

Pos logit contributions
 mirror+0.085
 Girl+0.082
 cabinet+0.082
 drawer+0.081
 wardrobe+0.081
Neg logit contributions
 songs-0.079
 enjoyable-0.076
 often-0.073
asted-0.073
 unity-0.071
Token excited
Token ID6568
Feature activation-8.59e-03

Pos logit contributions
 mirror+0.191
 wardrobe+0.185
 crying+0.182
 diary+0.182
 cabinet+0.180
Neg logit contributions
 enjoyable-0.195
 songs-0.189
 lesson-0.182
 unity-0.180
 teamwork-0.174
Token.
Token ID13
Feature activation+9.07e-02

Pos logit contributions
 songs+0.017
 rods+0.017
 unity+0.016
 habitats+0.016
 similarities+0.016
Neg logit contributions
 crying-0.013
 mirror-0.012
 bleeding-0.011
 wardrobe-0.011
 tooth-0.011
Token little
Token ID1310
Feature activation+9.21e-02

Pos logit contributions
 crying+0.264
 mirror+0.251
 bleeding+0.236
 wardrobe+0.233
 ha+0.229
Neg logit contributions
 songs-0.288
 rods-0.275
 bond-0.265
 unity-0.263
 enjoyable-0.262
Token girl
Token ID2576
Feature activation+1.94e-01

Pos logit contributions
 crying+0.113
 mirror+0.108
 bleeding+0.100
 wardrobe+0.098
 stub+0.097
Neg logit contributions
 songs-0.203
 rods-0.194
 bond-0.188
 unity-0.188
 games-0.186
Token.
Token ID13
Feature activation+9.87e-02

Pos logit contributions
 crying+0.327
 mirror+0.325
 bleeding+0.309
 cabinet+0.303
 wardrobe+0.302
Neg logit contributions
 songs-0.312
 rods-0.295
 bond-0.289
asted-0.288
 unity-0.285
Token She
Token ID1375
Feature activation+3.20e-01

Pos logit contributions
 crying+0.125
 bleeding+0.116
 mirror+0.116
 tooth+0.115
 wardrobe+0.108
Neg logit contributions
 songs-0.206
 habitats-0.198
 rods-0.198
 unity-0.195
asted-0.194
Token was
Token ID373
Feature activation+6.27e-02

Pos logit contributions
 crying+0.529
 mirror+0.506
 wardrobe+0.465
 bleeding+0.463
 ha+0.459
Neg logit contributions
 songs-0.587
 rods-0.575
 unity-0.545
 habitats-0.545
 similarities-0.544
Token crying
Token ID13774
Feature activation+4.66e-01

Pos logit contributions
 mirror+0.096
 diary+0.092
 crying+0.091
 cabinet+0.090
 wardrobe+0.089
Neg logit contributions
 songs-0.111
 enjoyable-0.102
asted-0.101
 often-0.101
 games-0.101
Token because
Token ID780
Feature activation-9.33e-02

Pos logit contributions
 crying+0.872
 mirror+0.837
 bleeding+0.781
 wardrobe+0.778
 ha+0.766
Neg logit contributions
 songs-0.751
 rods-0.737
 habitats-0.696
 unity-0.694
 similarities-0.693
Token she
Token ID673
Feature activation+2.12e-01

Pos logit contributions
 songs+0.208
 rods+0.201
 unity+0.195
 similarities+0.194
 habitats+0.193
Neg logit contributions
 crying-0.115
 mirror-0.108
 bleeding-0.097
 ha-0.097
 wardrobe-0.096
Token forgot
Token ID16453
Feature activation-6.41e-02

Pos logit contributions
used+0.438
 mirror+0.437
 front+0.409
 ha+0.405
 register+0.403
Neg logit contributions
 songs-0.250
 often-0.233
 lesson-0.214
 games-0.212
 sizes-0.207
Token her
Token ID607
Feature activation+2.49e-01

Pos logit contributions
 songs+0.121
 rods+0.112
 facts+0.110
 games+0.110
 lesson+0.109
Neg logit contributions
 crying-0.099
 mirror-0.097
 ha-0.091
 wardrobe-0.089
 bleeding-0.088
Token t
Token ID256
Feature activation+1.83e-01

Pos logit contributions
 crying+0.389
 mirror+0.371
 bleeding+0.340
 wardrobe+0.339
 ha+0.336
Neg logit contributions
 songs-0.479
 rods-0.467
 habitats-0.446
 unity-0.445
 similarities-0.444
Token an
Token ID281
Feature activation+2.06e-01

Pos logit contributions
 crying+0.246
 mirror+0.235
 wardrobe+0.213
 bleeding+0.210
 ha+0.207
Neg logit contributions
 songs-0.421
 rods-0.411
 habitats-0.393
 facts-0.391
 unity-0.391
Token old
Token ID1468
Feature activation+8.91e-02

Pos logit contributions
 crying+0.313
 mirror+0.299
 bleeding+0.274
 wardrobe+0.271
 ha+0.270
Neg logit contributions
 songs-0.403
 rods-0.392
 unity-0.376
 enjoyable-0.375
 habitats-0.374
Token man
Token ID582
Feature activation+3.01e-02

Pos logit contributions
 crying+0.133
 mirror+0.126
 bleeding+0.121
used+0.119
 stub+0.118
Neg logit contributions
 songs-0.168
 rods-0.162
 bond-0.157
 games-0.154
 lesson-0.154
Token who
Token ID508
Feature activation+1.55e-02

Pos logit contributions
 mirror+0.041
 crying+0.039
 wardrobe+0.037
 cabinet+0.037
 tooth+0.037
Neg logit contributions
 songs-0.059
 rods-0.058
asted-0.055
 unity-0.055
 games-0.055
Token was
Token ID373
Feature activation+5.58e-02

Pos logit contributions
 crying+0.025
 mirror+0.025
 wardrobe+0.023
 ha+0.022
 bleeding+0.022
Neg logit contributions
 songs-0.028
 rods-0.027
 unity-0.026
 habitats-0.026
 bond-0.026
Token crying
Token ID13774
Feature activation+4.64e-01

Pos logit contributions
 mirror+0.074
 diary+0.068
 wardrobe+0.068
 cabinet+0.066
 drawer+0.066
Neg logit contributions
 songs-0.109
 rods-0.104
 enjoyable-0.102
 unity-0.102
 games-0.101
Token and
Token ID290
Feature activation-1.20e-01

Pos logit contributions
 crying+0.649
 mirror+0.602
 bleeding+0.564
 wardrobe+0.552
 tooth+0.538
Neg logit contributions
 songs-0.963
 rods-0.963
 similarities-0.931
 habitats-0.926
 unity-0.925
Token seemed
Token ID3947
Feature activation+1.57e-02

Pos logit contributions
 songs+0.230
 rods+0.218
 habitats+0.213
 games+0.212
 unity+0.212
Neg logit contributions
 mirror-0.170
 crying-0.162
 ha-0.154
 wardrobe-0.153
 tooth-0.153
Token to
Token ID284
Feature activation+8.11e-03

Pos logit contributions
 crying+0.022
 mirror+0.022
 wardrobe+0.020
 bleeding+0.019
 ha+0.019
Neg logit contributions
 songs-0.032
 rods-0.031
 unity-0.030
 habitats-0.030
 similarities-0.030
Token be
Token ID307
Feature activation+4.31e-02

Pos logit contributions
 crying+0.013
 mirror+0.013
 wardrobe+0.012
 bleeding+0.012
 ha+0.011
Neg logit contributions
 songs-0.015
 rods-0.014
 unity-0.014
 similarities-0.014
 habitats-0.014
Token in
Token ID287
Feature activation+7.83e-02

Pos logit contributions
 mirror+0.055
 crying+0.050
 cabinet+0.050
 wardrobe+0.050
 drawer+0.049
Neg logit contributions
 songs-0.091
 enjoyable-0.087
 games-0.085
 rods-0.084
 unity-0.084
Token kind
Token ID1611
Feature activation+7.18e-02

Pos logit contributions
 crying+0.264
 mirror+0.251
 bleeding+0.236
 wardrobe+0.233
 ha+0.229
Neg logit contributions
 songs-0.288
 rods-0.275
 bond-0.265
 unity-0.263
 enjoyable-0.262
Token lady
Token ID10846
Feature activation+1.25e-01

Pos logit contributions
 crying+0.102
 mirror+0.096
 bleeding+0.088
 wardrobe+0.087
 ha+0.086
Neg logit contributions
 songs-0.148
 rods-0.145
 habitats-0.139
 unity-0.139
 similarities-0.139
Token saw
Token ID2497
Feature activation+1.93e-01

Pos logit contributions
 crying+0.196
 mirror+0.194
 bleeding+0.180
 wardrobe+0.179
 cabinet+0.177
Neg logit contributions
 songs-0.226
 rods-0.213
 bond-0.207
 unity-0.207
 habitats-0.207
Token Tim
Token ID5045
Feature activation-8.49e-02

Pos logit contributions
 crying+0.270
 mirror+0.260
 wardrobe+0.239
 bleeding+0.237
 stub+0.235
Neg logit contributions
 songs-0.397
 rods-0.382
 habitats-0.368
 facts-0.368
 unity-0.365
Tokenmy
Token ID1820
Feature activation+3.26e-01

Pos logit contributions
 songs+0.198
asted+0.187
 games+0.186
 bond+0.184
asting+0.183
Neg logit contributions
 mirror-0.078
 wardrobe-0.070
 ha-0.068
 stub-0.068
aila-0.067
Token crying
Token ID13774
Feature activation+4.64e-01

Pos logit contributions
 mirror+0.385
 crying+0.379
 wardrobe+0.343
 ha+0.342
 bleeding+0.337
Neg logit contributions
 songs-0.718
 rods-0.693
 unity-0.670
 enjoyable-0.669
asted-0.666
Token and
Token ID290
Feature activation-1.27e-01

Pos logit contributions
 crying+0.716
 mirror+0.674
 bleeding+0.626
 wardrobe+0.612
 tooth+0.607
Neg logit contributions
 rods-0.898
 songs-0.895
 unity-0.856
 similarities-0.854
 habitats-0.853
Token asked
Token ID1965
Feature activation+1.06e-01

Pos logit contributions
 songs+0.252
 rods+0.241
 unity+0.233
 habitats+0.232
 similarities+0.231
Neg logit contributions
 crying-0.180
 mirror-0.178
 bleeding-0.161
 wardrobe-0.161
 ha-0.160
Token him
Token ID683
Feature activation-4.55e-02

Pos logit contributions
 crying+0.178
 mirror+0.172
 bleeding+0.159
 wardrobe+0.158
 ha+0.156
Neg logit contributions
 songs-0.192
 rods-0.186
 unity-0.176
 habitats-0.176
 similarities-0.176
Token what
Token ID644
Feature activation+3.07e-02

Pos logit contributions
 songs+0.071
 rods+0.067
 facts+0.064
 unity+0.064
 lesson+0.064
Neg logit contributions
 crying-0.085
 mirror-0.084
 bleeding-0.079
 wardrobe-0.078
 stub-0.078
Token was
Token ID373
Feature activation+4.74e-02

Pos logit contributions
 crying+0.054
 mirror+0.052
 wardrobe+0.048
 bleeding+0.048
 ha+0.048
Neg logit contributions
 songs-0.053
 rods-0.051
 unity-0.049
 facts-0.049
 similarities-0.049
Token
Token ID198
Feature activation+4.83e-02

Pos logit contributions
 crying+0.090
 tooth+0.088
 wardrobe+0.087
 mirror+0.087
 bleeding+0.085
Neg logit contributions
 songs-0.064
Years-0.062
After-0.058
They-0.058
 enjoyable-0.057
Token"
Token ID1
Feature activation+5.68e-02

Pos logit contributions
 crying+0.080
 mirror+0.076
 wardrobe+0.073
 tooth+0.073
 bleeding+0.073
Neg logit contributions
 songs-0.088
Years-0.083
 enjoyable-0.080
 rods-0.080
 unity-0.080
TokenI
Token ID40
Feature activation+2.15e-01

Pos logit contributions
 crying+0.094
 mirror+0.090
 wardrobe+0.084
 bleeding+0.083
 tooth+0.082
Neg logit contributions
 songs-0.104
 rods-0.100
 unity-0.096
 similarities-0.096
 habitats-0.096
Token'm
Token ID1101
Feature activation+1.03e-01

Pos logit contributions
 crying+0.365
 mirror+0.360
 wardrobe+0.339
 bleeding+0.332
 cabinet+0.331
Neg logit contributions
 songs-0.359
 lesson-0.345
 rods-0.343
 bond-0.341
 enjoyable-0.340
Token not
Token ID407
Feature activation+2.26e-02

Pos logit contributions
 mirror+0.141
 drawer+0.133
 wardrobe+0.131
 diary+0.130
 cabinet+0.130
Neg logit contributions
 enjoyable-0.190
 songs-0.186
 teamwork-0.177
 often-0.176
 lesson-0.176
Token crying
Token ID13774
Feature activation+4.63e-01

Pos logit contributions
 mirror+0.032
 drawer+0.032
 cabinet+0.031
 wardrobe+0.031
 closet+0.030
Neg logit contributions
 songs-0.043
 enjoyable-0.043
 lesson-0.040
 games-0.039
 theirs-0.039
Token,"
Token ID553
Feature activation+7.53e-02

Pos logit contributions
 crying+0.748
 mirror+0.745
 wardrobe+0.687
 stub+0.682
 tooth+0.675
Neg logit contributions
 songs-0.859
 rods-0.775
 enjoyable-0.772
 games-0.766
 lesson-0.765
Token said
Token ID531
Feature activation+8.60e-02

Pos logit contributions
 crying+0.095
 mirror+0.095
 wardrobe+0.090
 stub+0.089
 cabinet+0.088
Neg logit contributions
 songs-0.159
 lesson-0.146
 enjoyable-0.146
 unity-0.145
 bond-0.143
Token Emma
Token ID18966
Feature activation+2.06e-01

Pos logit contributions
 mirror+0.122
 wardrobe+0.115
 crying+0.112
 bleeding+0.112
 closet+0.111
Neg logit contributions
 songs-0.166
 they-0.158
 lesson-0.155
 their-0.151
 every-0.147
Token.
Token ID13
Feature activation+8.78e-02

Pos logit contributions
 mirror+0.407
 wardrobe+0.377
 diary+0.372
aila+0.369
 cabinet+0.363
Neg logit contributions
 songs-0.304
 every-0.273
 until-0.266
 they-0.264
 agree-0.258
Token
Token ID198
Feature activation+4.42e-02

Pos logit contributions
 mirror+0.157
 tooth+0.153
 crying+0.153
 ha+0.153
 wardrobe+0.150
Neg logit contributions
 songs-0.135
 rods-0.121
Years-0.121
 unity-0.119
 they-0.118
Token and
Token ID290
Feature activation-1.41e-01

Pos logit contributions
 mirror+0.008
 crying+0.008
 wardrobe+0.008
 ha+0.008
 tooth+0.008
Neg logit contributions
 songs-0.008
 rods-0.008
 lesson-0.007
 enjoyable-0.007
Years-0.007
Token found
Token ID1043
Feature activation+1.63e-01

Pos logit contributions
 songs+0.268
 rods+0.265
 similarities+0.252
 habitats+0.252
 unity+0.252
Neg logit contributions
 crying-0.224
 mirror-0.213
 bleeding-0.195
 wardrobe-0.194
 ha-0.192
Token a
Token ID257
Feature activation+1.73e-01

Pos logit contributions
 crying+0.244
 mirror+0.237
 stub+0.219
 wardrobe+0.218
 ha+0.217
Neg logit contributions
 songs-0.316
 rods-0.311
 unity-0.295
 facts-0.292
 lesson-0.291
Token little
Token ID1310
Feature activation+8.93e-02

Pos logit contributions
 crying+0.278
 mirror+0.262
used+0.248
 bleeding+0.246
 wardrobe+0.245
Neg logit contributions
 songs-0.315
 rods-0.307
 bond-0.296
 lesson-0.292
 unity-0.292
Token girl
Token ID2576
Feature activation+1.88e-01

Pos logit contributions
 crying+0.123
 mirror+0.118
 bleeding+0.109
aila+0.108
 wardrobe+0.108
Neg logit contributions
 songs-0.180
 rods-0.175
 bond-0.171
 unity-0.169
 games-0.166
Token crying
Token ID13774
Feature activation+4.62e-01

Pos logit contributions
 mirror+0.300
 crying+0.286
 cabinet+0.280
 wardrobe+0.278
 bleeding+0.278
Neg logit contributions
 songs-0.312
 rods-0.293
 bond-0.291
 games-0.291
asted-0.291
Token.
Token ID13
Feature activation+9.65e-02

Pos logit contributions
 crying+0.715
 mirror+0.702
 ha+0.667
 stub+0.648
 wardrobe+0.637
Neg logit contributions
 songs-0.890
 games-0.792
 rods-0.792
 habitats-0.780
 bond-0.779
Token El
Token ID2574
Feature activation+5.40e-02

Pos logit contributions
 crying+0.124
 mirror+0.124
 tooth+0.118
 bleeding+0.118
 ha+0.114
Neg logit contributions
 songs-0.192
 rods-0.190
 habitats-0.190
 unity-0.182
 similarities-0.182
Tokenp
Token ID79
Feature activation+5.51e-02

Pos logit contributions
 crying+0.085
 mirror+0.081
 bleeding+0.075
 wardrobe+0.074
 ha+0.073
Neg logit contributions
 songs-0.103
 rods-0.101
 habitats-0.096
 unity-0.096
 similarities-0.096
Token said
Token ID531
Feature activation+8.20e-02

Pos logit contributions
 crying+0.090
 mirror+0.089
 cabinet+0.083
 wardrobe+0.083
 drawer+0.083
Neg logit contributions
 songs-0.097
 rods-0.092
 habitats-0.089
 similarities-0.088
 unity-0.088
Token to
Token ID284
Feature activation+5.97e-03

Pos logit contributions
 mirror+0.144
 wardrobe+0.139
 crying+0.139
 drawer+0.137
 closet+0.137
Neg logit contributions
 songs-0.131
 they-0.119
 lesson-0.119
 rods-0.114
 unity-0.114
Token asked
Token ID1965
Feature activation+1.11e-01

Pos logit contributions
 songs+0.213
 rods+0.208
 habitats+0.199
 unity+0.198
 similarities+0.198
Neg logit contributions
 crying-0.181
 mirror-0.173
 bleeding-0.158
 wardrobe-0.158
 ha-0.156
Token him
Token ID683
Feature activation-4.19e-02

Pos logit contributions
 crying+0.192
 mirror+0.184
 bleeding+0.170
 wardrobe+0.169
 ha+0.167
Neg logit contributions
 songs-0.196
 rods-0.191
 habitats-0.182
 unity-0.181
 similarities-0.181
Token why
Token ID1521
Feature activation+1.91e-02

Pos logit contributions
 songs+0.073
 rods+0.071
 habitats+0.067
 unity+0.067
 similarities+0.067
Neg logit contributions
 crying-0.073
 mirror-0.070
 bleeding-0.065
 wardrobe-0.064
 ha-0.064
Token he
Token ID339
Feature activation-1.07e-01

Pos logit contributions
 crying+0.037
 mirror+0.035
 bleeding+0.032
 wardrobe+0.032
 tooth+0.032
Neg logit contributions
 songs-0.031
 rods-0.030
 habitats-0.029
 unity-0.029
 bond-0.028
Token was
Token ID373
Feature activation+5.20e-02

Pos logit contributions
 songs+0.174
 rods+0.165
 facts+0.164
 enjoyable+0.163
 habitats+0.163
Neg logit contributions
 mirror-0.184
 crying-0.183
 wardrobe-0.174
 drawer-0.168
 cabinet-0.167
Token crying
Token ID13774
Feature activation+4.61e-01

Pos logit contributions
 mirror+0.068
 wardrobe+0.062
 diary+0.061
 cabinet+0.060
aila+0.060
Neg logit contributions
 songs-0.102
 enjoyable-0.097
 rods-0.097
 games-0.096
 theirs-0.094
Token.
Token ID13
Feature activation+9.21e-02

Pos logit contributions
 crying+0.681
 mirror+0.659
 ha+0.604
 wardrobe+0.599
 stub+0.594
Neg logit contributions
 songs-0.920
 rods-0.875
 enjoyable-0.842
 facts-0.834
 habitats-0.833
Token The
Token ID383
Feature activation+5.52e-02

Pos logit contributions
 crying+0.157
 mirror+0.154
 bleeding+0.145
 wardrobe+0.144
 tooth+0.142
Neg logit contributions
 rods-0.156
 songs-0.155
 habitats-0.148
 unity-0.145
 similarities-0.144
Token boy
Token ID2933
Feature activation+4.41e-02

Pos logit contributions
 crying+0.081
 mirror+0.077
 bleeding+0.072
 wardrobe+0.071
 stub+0.070
Neg logit contributions
 songs-0.110
 rods-0.106
 lesson-0.101
 habitats-0.100
 unity-0.100
Token told
Token ID1297
Feature activation+8.30e-02

Pos logit contributions
 mirror+0.065
 crying+0.064
 wardrobe+0.059
 cabinet+0.058
 bleeding+0.057
Neg logit contributions
 songs-0.085
 rods-0.082
 unity-0.079
 bond-0.078
 habitats-0.078
Token her
Token ID607
Feature activation+2.40e-01

Pos logit contributions
 crying+0.148
 mirror+0.143
 bleeding+0.134
 wardrobe+0.133
 stub+0.131
Neg logit contributions
 songs-0.141
 rods-0.136
 unity-0.129
 habitats-0.128
 facts-0.128
Token sitting
Token ID5586
Feature activation+2.67e-01

Pos logit contributions
 mirror+0.092
 crying+0.087
 diary+0.086
 cabinet+0.085
 wardrobe+0.084
Neg logit contributions
 songs-0.107
 enjoyable-0.100
asted-0.098
 unity-0.098
 games-0.098
Token on
Token ID319
Feature activation+1.00e-01

Pos logit contributions
 crying+0.441
 stub+0.423
 mirror+0.417
 ha+0.405
 sob+0.403
Neg logit contributions
 songs-0.464
 games-0.423
 rods-0.416
 bond-0.404
 every-0.404
Token the
Token ID262
Feature activation+2.49e-02

Pos logit contributions
 crying+0.135
 mirror+0.127
 bleeding+0.118
 stub+0.115
 ha+0.115
Neg logit contributions
 songs-0.214
 rods-0.206
 habitats-0.198
 unity-0.197
 games-0.197
Token ground
Token ID2323
Feature activation+1.20e-01

Pos logit contributions
 crying+0.039
 mirror+0.037
 bleeding+0.034
 ha+0.034
 wardrobe+0.033
Neg logit contributions
 songs-0.048
 rods-0.046
 habitats-0.044
 unity-0.044
 similarities-0.044
Token and
Token ID290
Feature activation-1.20e-01

Pos logit contributions
 crying+0.233
 mirror+0.224
 bleeding+0.209
 wardrobe+0.209
 ha+0.206
Neg logit contributions
 songs-0.184
 rods-0.179
 habitats-0.168
 unity-0.168
 similarities-0.168
Token crying
Token ID13774
Feature activation+4.61e-01

Pos logit contributions
 songs+0.230
 rods+0.219
 habitats+0.217
 games+0.213
asted+0.210
Neg logit contributions
 crying-0.164
 mirror-0.164
 wardrobe-0.155
 ha-0.155
 diary-0.154
Token.
Token ID13
Feature activation+9.39e-02

Pos logit contributions
 crying+0.679
 mirror+0.643
 bleeding+0.590
 wardrobe+0.585
 tooth+0.574
Neg logit contributions
 songs-0.919
 rods-0.915
 unity-0.873
 habitats-0.871
 similarities-0.870
Token Jim
Token ID5395
Feature activation-1.22e-02

Pos logit contributions
 tooth+0.134
 crying+0.133
 mirror+0.132
 bleeding+0.131
 wardrobe+0.126
Neg logit contributions
 habitats-0.170
 rods-0.166
 songs-0.166
 They-0.161
 they-0.159
Token was
Token ID373
Feature activation+5.11e-02

Pos logit contributions
 songs+0.021
 rods+0.020
 habitats+0.020
asted+0.020
 unity+0.019
Neg logit contributions
 mirror-0.019
 crying-0.018
 wardrobe-0.017
 cabinet-0.017
 diary-0.017
Token a
Token ID257
Feature activation+1.65e-01

Pos logit contributions
 Girl+0.063
 mirror+0.063
 drawer+0.060
 cabinet+0.060
 diary+0.060
Neg logit contributions
 songs-0.102
 often-0.094
 enjoyable-0.094
 games-0.093
 unity-0.091
Token compassionate
Token ID32533
Feature activation+6.11e-02

Pos logit contributions
 crying+0.228
 mirror+0.218
 bleeding+0.202
 wardrobe+0.201
used+0.199
Neg logit contributions
 songs-0.334
 rods-0.319
 bond-0.314
 lesson-0.310
 unity-0.309
Token noises
Token ID26782
Feature activation-1.52e-01

Pos logit contributions
 crying+0.169
 mirror+0.161
 wardrobe+0.149
 bleeding+0.149
 tooth+0.147
Neg logit contributions
 rods-0.135
 songs-0.134
 similarities-0.127
 unity-0.127
 habitats-0.126
Token.
Token ID13
Feature activation+7.73e-02

Pos logit contributions
 rods+0.323
 similarities+0.315
 songs+0.314
 habitats+0.311
 unity+0.310
Neg logit contributions
 crying-0.220
 mirror-0.200
 bleeding-0.182
 wardrobe-0.176
 staring-0.171
Token It
Token ID632
Feature activation+7.82e-02

Pos logit contributions
 crying+0.141
 mirror+0.139
 bleeding+0.132
 tooth+0.131
 wardrobe+0.129
Neg logit contributions
 songs-0.124
 rods-0.124
 habitats-0.119
 facts-0.114
 unity-0.113
Token sounded
Token ID14846
Feature activation+8.06e-02

Pos logit contributions
 crying+0.129
 mirror+0.128
 wardrobe+0.118
 drawer+0.117
 ha+0.116
Neg logit contributions
 songs-0.141
 rods-0.131
 unity-0.127
 lesson-0.126
 enjoyable-0.126
Token like
Token ID588
Feature activation-1.28e-01

Pos logit contributions
 crying+0.154
 mirror+0.144
 bleeding+0.136
 wardrobe+0.131
 stub+0.131
Neg logit contributions
 rods-0.132
 songs-0.126
 habitats-0.123
 similarities-0.123
 bond-0.121
Token crying
Token ID13774
Feature activation+4.60e-01

Pos logit contributions
 songs+0.265
 rods+0.264
 habitats+0.254
 similarities+0.254
 bond+0.253
Neg logit contributions
 crying-0.179
 mirror-0.167
 bleeding-0.152
 wardrobe-0.151
 ha-0.148
Token and
Token ID290
Feature activation-1.35e-01

Pos logit contributions
 crying+0.667
 mirror+0.633
 bleeding+0.576
 wardrobe+0.575
 ha+0.565
Neg logit contributions
 songs-0.935
 rods-0.917
 habitats-0.877
 unity-0.876
 similarities-0.874
Token so
Token ID523
Feature activation+1.13e-01

Pos logit contributions
 songs+0.258
 rods+0.252
 habitats+0.240
 unity+0.240
 similarities+0.240
Neg logit contributions
 crying-0.213
 mirror-0.203
 bleeding-0.186
 wardrobe-0.186
 ha-0.183
Token he
Token ID339
Feature activation-1.04e-01

Pos logit contributions
 crying+0.196
 mirror+0.184
 bleeding+0.172
 wardrobe+0.170
 ha+0.169
Neg logit contributions
 songs-0.197
 rods-0.196
 habitats-0.185
 similarities-0.184
 bond-0.183
Token followed
Token ID3940
Feature activation+7.08e-03

Pos logit contributions
 songs+0.197
 rods+0.192
 habitats+0.183
 unity+0.183
 similarities+0.182
Neg logit contributions
 crying-0.165
 mirror-0.158
 bleeding-0.144
 wardrobe-0.144
 ha-0.142
Token the
Token ID262
Feature activation+2.09e-02

Pos logit contributions
 crying+0.013
 mirror+0.013
 bleeding+0.012
 wardrobe+0.012
 cabinet+0.012
Neg logit contributions
 songs-0.012
 rods-0.011
 unity-0.011
 lesson-0.010
 similarities-0.010
Token apples
Token ID22514
Feature activation-1.49e-01

Pos logit contributions
 songs+0.682
 rods+0.670
 enjoyable+0.644
 similarities+0.641
 unity+0.641
Neg logit contributions
 crying-0.580
 mirror-0.558
 wardrobe-0.512
 ha-0.502
 tooth-0.501
Token.
Token ID13
Feature activation+9.69e-02

Pos logit contributions
 songs+0.333
 similarities+0.324
 habitats+0.320
 rods+0.320
 facts+0.320
Neg logit contributions
 crying-0.190
 mirror-0.172
 bleeding-0.159
 wardrobe-0.156
 stuck-0.153
Token They
Token ID1119
Feature activation-3.14e-01

Pos logit contributions
 crying+0.219
 mirror+0.211
 bleeding+0.198
 wardrobe+0.198
 ha+0.197
Neg logit contributions
 songs-0.120
 rods-0.113
 bond-0.107
 unity-0.107
 facts-0.107
Token felt
Token ID2936
Feature activation+1.05e-01

Pos logit contributions
 songs+0.593
 rods+0.579
 habitats+0.550
 unity+0.547
 enjoyable+0.545
Neg logit contributions
 crying-0.506
 mirror-0.494
 ha-0.457
 wardrobe-0.452
 tooth-0.444
Token like
Token ID588
Feature activation-1.29e-01

Pos logit contributions
 crying+0.166
 mirror+0.154
 bleeding+0.143
 tooth+0.138
 wardrobe+0.137
Neg logit contributions
 rods-0.205
 songs-0.201
 habitats-0.200
 similarities-0.194
 bond-0.193
Token crying
Token ID13774
Feature activation+4.60e-01

Pos logit contributions
 habitats+0.210
 bond+0.208
asted+0.206
Years+0.205
 rods+0.202
Neg logit contributions
 crying-0.246
 mirror-0.220
 bleeding-0.209
 tooth-0.205
 lying-0.199
Token.
Token ID13
Feature activation+7.98e-02

Pos logit contributions
 crying+0.546
 bleeding+0.468
 mirror+0.464
 wardrobe+0.427
 tooth+0.423
Neg logit contributions
 rods-1.057
asted-1.049
 similarities-1.042
 unity-1.042
Years-1.034
Token
Token ID198
Feature activation+3.28e-02

Pos logit contributions
 crying+0.181
 mirror+0.174
 bleeding+0.164
 wardrobe+0.163
 ha+0.162
Neg logit contributions
 songs-0.099
 rods-0.094
 unity-0.089
 bond-0.088
 facts-0.088
Token
Token ID198
Feature activation+3.48e-02

Pos logit contributions
 crying+0.046
 mirror+0.042
 bleeding+0.039
 wardrobe+0.037
 ha+0.037
Neg logit contributions
 songs-0.070
 rods-0.067
 unity-0.066
 habitats-0.065
 facts-0.065
TokenBut
Token ID1537
Feature activation-1.32e-01

Pos logit contributions
 crying+0.057
 mirror+0.053
 ha+0.051
 bleeding+0.050
 wardrobe+0.048
Neg logit contributions
 songs-0.065
 rods-0.062
 facts-0.062
 habitats-0.061
 unity-0.060
Token then
Token ID788
Feature activation-1.33e-01

Pos logit contributions
 songs+0.187
 rods+0.180
 habitats+0.170
 similarities+0.169
 unity+0.169
Neg logit contributions
 crying-0.277
 mirror-0.266
 bleeding-0.251
 wardrobe-0.250
 ha-0.249
Tokenmy
Token ID1820
Feature activation+3.17e-01

Pos logit contributions
 songs+0.269
 rods+0.257
 lesson+0.254
 enjoyable+0.253
 habitats+0.251
Neg logit contributions
 crying-0.091
 mirror-0.088
 wardrobe-0.077
 cabinet-0.076
used-0.075
Token saw
Token ID2497
Feature activation+1.86e-01

Pos logit contributions
 crying+0.461
 mirror+0.433
 wardrobe+0.392
 bleeding+0.392
 tooth+0.390
Neg logit contributions
 songs-0.648
 rods-0.639
 unity-0.615
 similarities-0.614
 habitats-0.612
Token his
Token ID465
Feature activation-8.42e-02

Pos logit contributions
 crying+0.250
 mirror+0.241
 wardrobe+0.223
 ha+0.223
used+0.222
Neg logit contributions
 songs-0.391
 rods-0.377
 games-0.358
 unity-0.355
 enjoyable-0.355
Token friend
Token ID1545
Feature activation+7.27e-02

Pos logit contributions
 songs+0.166
 rods+0.161
 habitats+0.154
 unity+0.153
 bond+0.152
Neg logit contributions
 crying-0.123
 mirror-0.119
 ha-0.109
 bleeding-0.109
 wardrobe-0.108
Token Sally
Token ID25737
Feature activation+4.60e-02

Pos logit contributions
 mirror+0.122
 crying+0.116
 wardrobe+0.113
 ha+0.112
 drawer+0.110
Neg logit contributions
 songs-0.122
 rods-0.117
 games-0.112
 enjoyable-0.110
asted-0.109
Token crying
Token ID13774
Feature activation+4.60e-01

Pos logit contributions
 mirror+0.069
 ha+0.064
 crying+0.063
 wardrobe+0.061
 tooth+0.061
Neg logit contributions
 songs-0.086
 rods-0.084
 games-0.081
 enjoyable-0.079
asted-0.078
Token.
Token ID13
Feature activation+8.25e-02

Pos logit contributions
 crying+0.667
 mirror+0.623
 bleeding+0.580
 wardrobe+0.563
 tooth+0.557
Neg logit contributions
 rods-0.939
 songs-0.930
 unity-0.901
 similarities-0.899
 habitats-0.893
Token He
Token ID679
Feature activation-2.32e-02

Pos logit contributions
 mirror+0.150
 crying+0.150
 tooth+0.143
 ha+0.142
 wardrobe+0.140
Neg logit contributions
 songs-0.126
 rods-0.124
 habitats-0.119
 enjoyable-0.115
 similarities-0.115
Token asked
Token ID1965
Feature activation+1.05e-01

Pos logit contributions
 songs+0.043
 rods+0.041
 habitats+0.040
 unity+0.039
asted+0.039
Neg logit contributions
 mirror-0.036
 crying-0.035
 cabinet-0.033
 wardrobe-0.032
 ha-0.032
Token her
Token ID607
Feature activation+2.39e-01

Pos logit contributions
 crying+0.171
 mirror+0.161
 wardrobe+0.148
 bleeding+0.147
 ha+0.146
Neg logit contributions
 songs-0.199
 rods-0.195
 habitats-0.187
 unity-0.186
 similarities-0.186
Token,
Token ID11
Feature activation+4.20e-02

Pos logit contributions
 crying+0.437
 mirror+0.423
 bleeding+0.400
 wardrobe+0.391
 ha+0.390
Neg logit contributions
 songs-0.391
 rods-0.378
 unity-0.355
 facts-0.354
 habitats-0.354
Token wet
Token ID9583
Feature activation+1.91e-01

Pos logit contributions
 songs+0.502
 rods+0.492
 habitats+0.470
 unity+0.470
 similarities+0.469
Neg logit contributions
 crying-0.383
 mirror-0.362
 bleeding-0.331
 wardrobe-0.328
 ha-0.323
Token and
Token ID290
Feature activation-1.31e-01

Pos logit contributions
 crying+0.284
 mirror+0.268
 bleeding+0.248
 wardrobe+0.241
 tooth+0.235
Neg logit contributions
 songs-0.378
 rods-0.377
 similarities-0.369
 unity-0.368
 habitats-0.366
Token pin
Token ID6757
Feature activation+2.96e-01

Pos logit contributions
 songs+0.294
 rods+0.287
 unity+0.275
 similarities+0.275
 habitats+0.275
Neg logit contributions
 crying-0.162
 mirror-0.154
 wardrobe-0.138
 bleeding-0.136
 ha-0.136
Tokenching
Token ID10813
Feature activation+1.23e-01

Pos logit contributions
 crying+0.497
 mirror+0.492
 wardrobe+0.458
 cabinet+0.450
 drawer+0.447
Neg logit contributions
 songs-0.530
 rods-0.507
 lesson-0.477
 enjoyable-0.474
 games-0.473
Token and
Token ID290
Feature activation-1.33e-01

Pos logit contributions
 crying+0.243
 mirror+0.228
 bleeding+0.218
 ha+0.211
 wardrobe+0.209
Neg logit contributions
 songs-0.179
 rods-0.178
 habitats-0.176
 similarities-0.175
 enjoyable-0.174
Token crying
Token ID13774
Feature activation+4.59e-01

Pos logit contributions
 songs+0.254
 rods+0.250
 habitats+0.240
 similarities+0.239
 unity+0.239
Neg logit contributions
 crying-0.214
 mirror-0.200
 bleeding-0.186
 wardrobe-0.182
 ha-0.181
Token.
Token ID13
Feature activation+7.65e-02

Pos logit contributions
 crying+0.559
 bleeding+0.485
 mirror+0.474
 sob+0.429
 wardrobe+0.424
Neg logit contributions
 rods-1.053
asted-1.031
 unity-1.019
 habitats-1.017
Years-1.015
Token She
Token ID1375
Feature activation+3.06e-01

Pos logit contributions
 crying+0.145
 mirror+0.141
 bleeding+0.133
 tooth+0.132
 wardrobe+0.132
Neg logit contributions
 songs-0.117
 rods-0.116
 habitats-0.109
 unity-0.107
 similarities-0.107
Token was
Token ID373
Feature activation+5.00e-02

Pos logit contributions
 crying+0.465
 mirror+0.451
 bleeding+0.418
 wardrobe+0.415
 cabinet+0.412
Neg logit contributions
 songs-0.567
 rods-0.552
 habitats-0.530
asted-0.521
 facts-0.521
Token not
Token ID407
Feature activation+1.90e-02

Pos logit contributions
 mirror+0.064
 drawer+0.062
 diary+0.062
 Girl+0.062
 wardrobe+0.061
Neg logit contributions
 songs-0.099
 enjoyable-0.093
asted-0.092
 games-0.091
 theirs-0.090
Token happy
Token ID3772
Feature activation-1.02e-01

Pos logit contributions
 mirror+0.031
 wardrobe+0.030
 ha+0.029
used+0.029
 cabinet+0.029
Neg logit contributions
 songs-0.034
 enjoyable-0.031
 theirs-0.030
 games-0.030
 sizes-0.029
Token asked
Token ID1965
Feature activation+6.96e-02

Pos logit contributions
 crying+0.050
 mirror+0.046
 bleeding+0.042
 ha+0.042
 wardrobe+0.041
Neg logit contributions
 rods-0.066
 songs-0.065
 habitats-0.065
 similarities-0.063
 unity-0.063
Token Lily
Token ID20037
Feature activation+9.54e-02

Pos logit contributions
 crying+0.110
 mirror+0.108
 bleeding+0.100
 wardrobe+0.100
 stub+0.099
Neg logit contributions
 songs-0.127
 rods-0.124
 unity-0.117
 facts-0.116
 habitats-0.116
Token why
Token ID1521
Feature activation+1.62e-02

Pos logit contributions
 crying+0.176
 mirror+0.174
 bleeding+0.167
 wardrobe+0.164
 ha+0.162
Neg logit contributions
 songs-0.145
 rods-0.138
 lesson-0.131
 habitats-0.131
 facts-0.131
Token she
Token ID673
Feature activation+2.06e-01

Pos logit contributions
 crying+0.022
 mirror+0.021
 bleeding+0.019
 wardrobe+0.019
 ha+0.018
Neg logit contributions
 songs-0.034
 rods-0.034
 habitats-0.032
 unity-0.032
 similarities-0.032
Token was
Token ID373
Feature activation+5.10e-02

Pos logit contributions
 mirror+0.364
 crying+0.359
 wardrobe+0.346
 bleeding+0.341
 drawer+0.339
Neg logit contributions
 songs-0.312
 enjoyable-0.295
 facts-0.292
 lesson-0.289
 habitats-0.286
Token crying
Token ID13774
Feature activation+4.59e-01

Pos logit contributions
 mirror+0.064
ainted+0.060
used+0.060
 ha+0.060
aila+0.059
Neg logit contributions
 songs-0.094
 enjoyable-0.089
 games-0.088
 often-0.087
 they-0.086
Token.
Token ID13
Feature activation+8.62e-02

Pos logit contributions
 crying+0.749
 mirror+0.735
 ha+0.720
 stub+0.691
 wardrobe+0.689
Neg logit contributions
 songs-0.827
 enjoyable-0.729
 lesson-0.728
 games-0.718
 rods-0.718
Token Lily
Token ID20037
Feature activation+9.07e-02

Pos logit contributions
 mirror+0.146
 tooth+0.142
 wardrobe+0.142
 crying+0.140
 bleeding+0.137
Neg logit contributions
 songs-0.131
 habitats-0.128
 they-0.128
 similarities-0.125
 They-0.124
Token told
Token ID1297
Feature activation+8.63e-02

Pos logit contributions
 mirror+0.161
used+0.154
 cabinet+0.153
 bleeding+0.153
 crying+0.152
Neg logit contributions
 often-0.126
 songs-0.124
Years-0.119
asted-0.116
 lesson-0.115
Token her
Token ID607
Feature activation+2.41e-01

Pos logit contributions
 crying+0.091
used+0.090
 mirror+0.090
 stub+0.088
 cabinet+0.084
Neg logit contributions
 songs-0.203
 lesson-0.186
asted-0.184
 games-0.182
 facts-0.182
Token about
Token ID546
Feature activation-1.99e-01

Pos logit contributions
 crying+0.458
 mirror+0.453
 ha+0.428
 bleeding+0.426
 wardrobe+0.419
Neg logit contributions
 songs-0.371
 rods-0.340
 lesson-0.336
 facts-0.333
 similarities-0.327
Token so
Token ID523
Feature activation+1.15e-01

Pos logit contributions
 mirror+0.092
 wardrobe+0.088
 diary+0.087
 cabinet+0.086
 drawer+0.085
Neg logit contributions
 songs-0.101
 enjoyable-0.098
asted-0.093
 games-0.092
 unity-0.091
Token happy
Token ID3772
Feature activation-9.02e-02

Pos logit contributions
 crying+0.201
 mirror+0.188
 bleeding+0.176
 ha+0.172
 wardrobe+0.171
Neg logit contributions
 songs-0.208
 rods-0.204
 habitats-0.196
 similarities-0.193
 facts-0.193
Token.
Token ID13
Feature activation+1.04e-01

Pos logit contributions
 similarities+0.179
 rods+0.179
 unity+0.176
 songs+0.175
 habitats+0.175
Neg logit contributions
 crying-0.147
 mirror-0.127
 bleeding-0.116
 staring-0.114
 tooth-0.113
Token He
Token ID679
Feature activation-1.19e-02

Pos logit contributions
 crying+0.174
 mirror+0.165
 bleeding+0.151
 wardrobe+0.151
 cabinet+0.151
Neg logit contributions
 songs-0.189
 rods-0.184
 unity-0.175
 similarities-0.175
 bond-0.175
Token stopped
Token ID5025
Feature activation+5.20e-02

Pos logit contributions
 songs+0.025
 rods+0.025
 pleasures+0.025
 similarities+0.024
 facts+0.024
Neg logit contributions
 crying-0.016
 mirror-0.015
 locked-0.014
 stub-0.014
 ha-0.014
Token crying
Token ID13774
Feature activation+4.59e-01

Pos logit contributions
 crying+0.077
 mirror+0.073
 bleeding+0.067
 wardrobe+0.067
 ha+0.066
Neg logit contributions
 songs-0.105
 rods-0.102
 habitats-0.098
 unity-0.098
 similarities-0.097
Token and
Token ID290
Feature activation-1.23e-01

Pos logit contributions
 crying+0.837
 mirror+0.785
 bleeding+0.738
 tooth+0.719
 wardrobe+0.719
Neg logit contributions
 rods-0.799
 similarities-0.775
 habitats-0.766
 unity-0.761
asted-0.753
Token went
Token ID1816
Feature activation-1.64e-02

Pos logit contributions
 songs+0.249
 rods+0.243
 habitats+0.233
 unity+0.232
 similarities+0.232
Neg logit contributions
 crying-0.182
 mirror-0.173
 bleeding-0.157
 wardrobe-0.157
 ha-0.155
Token with
Token ID351
Feature activation-2.63e-02

Pos logit contributions
 songs+0.031
 rods+0.030
 games+0.029
 habitats+0.028
 lesson+0.028
Neg logit contributions
 crying-0.024
 mirror-0.024
 bleeding-0.022
 ha-0.022
 tooth-0.022
Token the
Token ID262
Feature activation+1.86e-02

Pos logit contributions
 songs+0.045
 rods+0.043
 unity+0.041
 games+0.040
 enjoyable+0.040
Neg logit contributions
 crying-0.047
 mirror-0.045
 bleeding-0.042
 ha-0.042
 wardrobe-0.042
Token lady
Token ID10846
Feature activation+1.11e-01

Pos logit contributions
 crying+0.027
 mirror+0.026
 bleeding+0.024
 wardrobe+0.024
 ha+0.023
Neg logit contributions
 songs-0.037
 rods-0.036
 unity-0.034
 bond-0.034
 habitats-0.034
Token with
Token ID351
Feature activation-2.76e-02

Pos logit contributions
 crying+0.535
 mirror+0.521
 wardrobe+0.492
 cabinet+0.478
 ha+0.474
Neg logit contributions
 songs-0.645
 rods-0.619
 lesson-0.590
 enjoyable-0.587
 unity-0.585
Token the
Token ID262
Feature activation+3.22e-02

Pos logit contributions
 songs+0.054
 rods+0.052
 habitats+0.050
 unity+0.050
 similarities+0.050
Neg logit contributions
 crying-0.042
 mirror-0.041
 bleeding-0.037
 wardrobe-0.037
 ha-0.036
Token hammer
Token ID15554
Feature activation+2.05e-01

Pos logit contributions
 crying+0.044
 mirror+0.041
used+0.039
 wardrobe+0.039
 stub+0.038
Neg logit contributions
 songs-0.066
 rods-0.064
 lesson-0.061
 bond-0.061
 unity-0.060
Token and
Token ID290
Feature activation-1.14e-01

Pos logit contributions
 crying+0.319
 mirror+0.306
 wardrobe+0.279
 bleeding+0.279
 ha+0.277
Neg logit contributions
 songs-0.391
 rods-0.383
 unity-0.363
 habitats-0.363
 similarities-0.362
Token started
Token ID2067
Feature activation+8.05e-02

Pos logit contributions
 songs+0.212
 rods+0.205
 habitats+0.200
 similarities+0.199
 bond+0.198
Neg logit contributions
 crying-0.186
 mirror-0.178
 bleeding-0.162
 wardrobe-0.162
 ha-0.161
Token crying
Token ID13774
Feature activation+4.59e-01

Pos logit contributions
 crying+0.131
 mirror+0.126
 bleeding+0.116
 wardrobe+0.116
 ha+0.115
Neg logit contributions
 songs-0.148
 rods-0.145
 habitats-0.137
 unity-0.137
 similarities-0.137
Token.
Token ID13
Feature activation+8.94e-02

Pos logit contributions
 crying+0.526
 mirror+0.465
 bleeding+0.437
 wardrobe+0.406
 sob+0.390
Neg logit contributions
 rods-1.106
 habitats-1.067
 similarities-1.066
 unity-1.056
asted-1.055
Token Her
Token ID2332
Feature activation+2.75e-01

Pos logit contributions
 crying+0.151
 mirror+0.148
 wardrobe+0.138
 bleeding+0.137
 tooth+0.136
Neg logit contributions
 songs-0.154
 rods-0.153
 habitats-0.146
 similarities-0.143
 unity-0.143
Token mom
Token ID1995
Feature activation+9.03e-02

Pos logit contributions
 crying+0.397
 mirror+0.380
 bleeding+0.350
 wardrobe+0.347
 cabinet+0.342
Neg logit contributions
 songs-0.555
 rods-0.537
 unity-0.511
 habitats-0.510
 similarities-0.509
Token had
Token ID550
Feature activation+1.70e-02

Pos logit contributions
 mirror+0.118
 crying+0.112
 diary+0.110
 cabinet+0.109
 wardrobe+0.108
Neg logit contributions
 songs-0.176
Years-0.165
asted-0.164
 often-0.162
 rods-0.162
Token to
Token ID284
Feature activation+3.61e-03

Pos logit contributions
 mirror+0.026
used+0.025
 cabinet+0.025
 crying+0.025
 wardrobe+0.025
Neg logit contributions
 songs-0.029
 rods-0.029
 lesson-0.029
 enjoyable-0.028
 teamwork-0.028
Token happy
Token ID3772
Feature activation-9.93e-02

Pos logit contributions
 crying+0.204
 mirror+0.194
 bleeding+0.180
 ha+0.178
 wardrobe+0.176
Neg logit contributions
 songs-0.190
 rods-0.188
 habitats-0.180
 facts-0.177
 similarities-0.176
Token that
Token ID326
Feature activation-7.84e-02

Pos logit contributions
 rods+0.194
 similarities+0.188
 songs+0.186
 unity+0.185
asted+0.183
Neg logit contributions
 crying-0.164
 mirror-0.147
 wardrobe-0.133
 bleeding-0.132
 ha-0.131
Token she
Token ID673
Feature activation+2.09e-01

Pos logit contributions
 rods+0.194
 songs+0.192
 habitats+0.187
 similarities+0.183
 unity+0.182
Neg logit contributions
 crying-0.083
 mirror-0.075
 wardrobe-0.065
 bleeding-0.064
 ha-0.063
Token was
Token ID373
Feature activation+5.36e-02

Pos logit contributions
 crying+0.324
 mirror+0.307
 ha+0.283
 bleeding+0.281
 wardrobe+0.280
Neg logit contributions
 songs-0.406
 rods-0.404
 habitats-0.386
 facts-0.383
 similarities-0.383
Token not
Token ID407
Feature activation+2.05e-02

Pos logit contributions
 mirror+0.085
 crying+0.084
 cabinet+0.079
 wardrobe+0.079
 diary+0.078
Neg logit contributions
 songs-0.096
 enjoyable-0.090
 rods-0.089
asted-0.089
 unity-0.088
Token crying
Token ID13774
Feature activation+4.59e-01

Pos logit contributions
 mirror+0.028
 crying+0.026
 wardrobe+0.026
 diary+0.025
 cabinet+0.025
Neg logit contributions
 songs-0.041
 enjoyable-0.040
 rods-0.038
 games-0.038
 lesson-0.038
Token anymore
Token ID7471
Feature activation-2.78e-02

Pos logit contributions
 crying+0.762
 mirror+0.731
 bleeding+0.673
 wardrobe+0.672
 ha+0.665
Neg logit contributions
 songs-0.835
 rods-0.807
 habitats-0.770
 unity-0.768
 similarities-0.766
Token.
Token ID13
Feature activation+9.22e-02

Pos logit contributions
 rods+0.064
 songs+0.062
 similarities+0.061
 unity+0.061
asted+0.060
Neg logit contributions
 crying-0.035
 mirror-0.031
 bleeding-0.029
 wardrobe-0.028
 tooth-0.027
Token Later
Token ID11450
Feature activation-1.99e-02

Pos logit contributions
 crying+0.146
 mirror+0.140
 tooth+0.134
 bleeding+0.133
 ha+0.131
Neg logit contributions
 songs-0.169
 habitats-0.162
 rods-0.162
 enjoyable-0.158
 similarities-0.158
Token,
Token ID11
Feature activation+4.75e-02

Pos logit contributions
 songs+0.031
 rods+0.030
 habitats+0.028
 unity+0.028
 similarities+0.028
Neg logit contributions
 crying-0.039
 mirror-0.037
 bleeding-0.035
 wardrobe-0.035
 ha-0.034
Token she
Token ID673
Feature activation+2.00e-01

Pos logit contributions
 crying+0.063
 mirror+0.059
 bleeding+0.053
 wardrobe+0.053
 ha+0.052
Neg logit contributions
 songs-0.103
 rods-0.102
 habitats-0.097
 unity-0.097
 similarities-0.097
Token was
Token ID373
Feature activation+4.97e-02

Pos logit contributions
 mirror+0.158
used+0.148
 wardrobe+0.147
 crying+0.146
 ha+0.142
Neg logit contributions
 lesson-0.102
 they-0.100
 songs-0.099
 rods-0.096
 sizes-0.095
Token a
Token ID257
Feature activation+1.61e-01

Pos logit contributions
 mirror+0.067
 crying+0.065
 ha+0.061
 wardrobe+0.061
used+0.061
Neg logit contributions
 songs-0.101
 rods-0.100
 lesson-0.096
 enjoyable-0.095
 unity-0.095
Token picture
Token ID4286
Feature activation+2.92e-02

Pos logit contributions
 crying+0.233
 mirror+0.223
used+0.213
 wardrobe+0.207
 ha+0.204
Neg logit contributions
 songs-0.315
 rods-0.305
 lesson-0.297
 bond-0.296
 unity-0.291
Token of
Token ID286
Feature activation-1.57e-01

Pos logit contributions
 crying+0.065
 mirror+0.062
 bleeding+0.059
 wardrobe+0.059
 ha+0.058
Neg logit contributions
 songs-0.037
 rods-0.036
 habitats-0.033
 unity-0.033
 similarities-0.033
Token her
Token ID607
Feature activation+2.38e-01

Pos logit contributions
 songs+0.359
 rods+0.352
 habitats+0.339
 unity+0.338
 similarities+0.338
Neg logit contributions
 crying-0.187
 mirror-0.176
 bleeding-0.156
 wardrobe-0.156
 ha-0.153
Token crying
Token ID13774
Feature activation+4.58e-01

Pos logit contributions
 crying+0.369
 mirror+0.352
 bleeding+0.323
 wardrobe+0.320
 ha+0.318
Neg logit contributions
 songs-0.462
 rods-0.449
 habitats-0.430
 unity-0.429
 similarities-0.427
Token.
Token ID13
Feature activation+8.86e-02

Pos logit contributions
 crying+0.750
 mirror+0.716
 ha+0.669
 stub+0.664
 bleeding+0.648
Neg logit contributions
 songs-0.877
 rods-0.803
 games-0.781
 habitats-0.777
 facts-0.774
Token She
Token ID1375
Feature activation+3.06e-01

Pos logit contributions
 mirror+0.137
 wardrobe+0.136
 crying+0.132
 tooth+0.129
 bleeding+0.128
Neg logit contributions
 They-0.152
 rods-0.151
 habitats-0.151
 lesson-0.147
 songs-0.146
Token did
Token ID750
Feature activation+4.52e-02

Pos logit contributions
 crying+0.467
 mirror+0.455
 wardrobe+0.417
 bleeding+0.410
 cabinet+0.407
Neg logit contributions
 songs-0.584
 rods-0.565
 habitats-0.540
 similarities-0.536
asted-0.535
Token not
Token ID407
Feature activation+1.67e-02

Pos logit contributions
 crying+0.077
 mirror+0.077
 wardrobe+0.072
 bleeding+0.070
 ha+0.070
Neg logit contributions
 songs-0.078
 rods-0.075
 lesson-0.072
 similarities-0.072
 facts-0.072
Token like
Token ID588
Feature activation-1.31e-01

Pos logit contributions
 crying+0.032
 mirror+0.031
 bleeding+0.029
 wardrobe+0.029
 ha+0.028
Neg logit contributions
 songs-0.027
 rods-0.026
 habitats-0.024
 unity-0.024
 similarities-0.024
Token,
Token ID11
Feature activation+5.43e-02

Pos logit contributions
 songs+0.027
 rods+0.026
 unity+0.025
 facts+0.025
 similarities+0.025
Neg logit contributions
 crying-0.028
 mirror-0.028
 bleeding-0.026
 ha-0.026
 stub-0.026
Token "
Token ID366
Feature activation+5.76e-02

Pos logit contributions
 crying+0.088
 mirror+0.088
 wardrobe+0.082
 ha+0.081
 tooth+0.080
Neg logit contributions
 songs-0.096
 rods-0.092
 unity-0.088
 similarities-0.087
 enjoyable-0.087
TokenWhy
Token ID5195
Feature activation+9.12e-02

Pos logit contributions
 crying+0.102
 wardrobe+0.099
 cabinet+0.098
 mirror+0.096
 closet+0.094
Neg logit contributions
 songs-0.091
Years-0.088
asted-0.088
 rods-0.086
 bond-0.082
Token are
Token ID389
Feature activation-3.04e-02

Pos logit contributions
 crying+0.167
 mirror+0.161
 bleeding+0.150
 wardrobe+0.149
 stub+0.147
Neg logit contributions
 songs-0.150
 rods-0.143
 facts-0.137
 lesson-0.137
 similarities-0.136
Token you
Token ID345
Feature activation+1.06e-01

Pos logit contributions
 songs+0.051
 rods+0.049
 they+0.049
 games+0.047
 teamwork+0.045
Neg logit contributions
 mirror-0.053
 crying-0.051
 closet-0.049
 drawer-0.049
 cabinet-0.048
Token crying
Token ID13774
Feature activation+4.58e-01

Pos logit contributions
 mirror+0.143
aila+0.136
 ha+0.134
 cabinet+0.131
 drawer+0.130
Neg logit contributions
 songs-0.201
 involve-0.184
 rods-0.179
 theirs-0.178
 unity-0.178
Token?"
Token ID1701
Feature activation+1.74e-01

Pos logit contributions
 crying+0.568
 mirror+0.549
 ha+0.495
 stub+0.493
 wardrobe+0.491
Neg logit contributions
 songs-1.019
 rods-0.956
 games-0.923
 enjoyable-0.923
 unity-0.915
Token The
Token ID383
Feature activation+5.32e-02

Pos logit contributions
 mirror+0.264
 crying+0.263
 wardrobe+0.252
 bleeding+0.244
 tooth+0.241
Neg logit contributions
 songs-0.321
 rods-0.305
asted-0.297
 unity-0.292
 habitats-0.291
Token bird
Token ID6512
Feature activation+2.11e-01

Pos logit contributions
used+0.079
ainted+0.076
 Heather+0.074
aila+0.073
 crying+0.073
Neg logit contributions
 songs-0.105
 lesson-0.095
 rods-0.091
 unity-0.089
 games-0.089
Token told
Token ID1297
Feature activation+8.69e-02

Pos logit contributions
 mirror+0.325
 crying+0.322
 wardrobe+0.307
 cabinet+0.301
 drawer+0.300
Neg logit contributions
 songs-0.389
 rods-0.356
 unity-0.354
 bond-0.352
 similarities-0.351
Token her
Token ID607
Feature activation+2.41e-01

Pos logit contributions
 stub+0.121
 crying+0.111
 mirror+0.110
used+0.110
 bleeding+0.107
Neg logit contributions
 songs-0.184
 lesson-0.165
 times-0.164
asted-0.162
 teamwork-0.161
Tokenily
Token ID813
Feature activation+1.91e-01

Pos logit contributions
 crying+0.106
 mirror+0.097
 bleeding+0.091
 wardrobe+0.089
 stub+0.088
Neg logit contributions
 songs-0.181
 rods-0.173
 habitats-0.172
asted-0.171
 similarities-0.170
Token's
Token ID338
Feature activation+3.17e-01

Pos logit contributions
 crying+0.330
 mirror+0.316
 bleeding+0.295
 wardrobe+0.292
 ha+0.288
Neg logit contributions
 songs-0.333
 rods-0.324
 habitats-0.309
 unity-0.306
 similarities-0.305
Token mom
Token ID1995
Feature activation+9.46e-02

Pos logit contributions
 crying+0.449
 mirror+0.427
 bleeding+0.390
 wardrobe+0.387
 ha+0.381
Neg logit contributions
 songs-0.660
 rods-0.640
 unity-0.612
 habitats-0.612
 similarities-0.609
Token saw
Token ID2497
Feature activation+1.91e-01

Pos logit contributions
 crying+0.137
 mirror+0.136
 diary+0.132
 stub+0.131
 tooth+0.129
Neg logit contributions
 songs-0.168
asted-0.163
 rods-0.161
Years-0.156
 teamwork-0.155
Token her
Token ID607
Feature activation+2.42e-01

Pos logit contributions
 crying+0.257
 stub+0.253
 mirror+0.253
 ha+0.245
used+0.242
Neg logit contributions
 songs-0.387
 rods-0.365
 games-0.353
 facts-0.350
Years-0.345
Token crying
Token ID13774
Feature activation+4.58e-01

Pos logit contributions
 mirror+0.295
 crying+0.282
 ha+0.268
 stub+0.267
 wardrobe+0.264
Neg logit contributions
 songs-0.528
 rods-0.502
 games-0.489
 bond-0.483
 unity-0.481
Token and
Token ID290
Feature activation-1.32e-01

Pos logit contributions
 crying+0.928
 mirror+0.899
 wardrobe+0.840
 bleeding+0.838
 ha+0.836
Neg logit contributions
 songs-0.670
 rods-0.641
 habitats-0.602
 unity-0.600
 similarities-0.599
Token asked
Token ID1965
Feature activation+1.09e-01

Pos logit contributions
 songs+0.235
 they+0.219
 rods+0.217
 games+0.215
asted+0.210
Neg logit contributions
 mirror-0.202
 diary-0.190
 crying-0.186
 tooth-0.186
 cabinet-0.186
Token,
Token ID11
Feature activation+4.69e-02

Pos logit contributions
 crying+0.171
 mirror+0.168
 stub+0.158
 bleeding+0.158
 wardrobe+0.154
Neg logit contributions
 songs-0.204
 rods-0.193
 facts-0.187
 similarities-0.185
 lesson-0.185
Token "
Token ID366
Feature activation+4.82e-02

Pos logit contributions
 mirror+0.080
 crying+0.078
 tooth+0.075
 wardrobe+0.075
 diary+0.074
Neg logit contributions
 songs-0.077
 rods-0.073
Years-0.070
 lesson-0.070
 unity-0.070
TokenWhy
Token ID5195
Feature activation+8.70e-02

Pos logit contributions
 crying+0.079
 mirror+0.076
 wardrobe+0.075
 cabinet+0.074
 tooth+0.072
Neg logit contributions
 songs-0.082
Years-0.081
 rods-0.078
 similarities-0.075
 bond-0.074
Token can
Token ID460
Feature activation-3.35e-02

Pos logit contributions
 crying+0.269
 mirror+0.267
 ha+0.241
 drawer+0.232
 wardrobe+0.231
Neg logit contributions
 songs-0.402
 rods-0.390
 habitats-0.383
 unity-0.372
lights-0.369
Token!
Token ID0
Feature activation+8.69e-02

Pos logit contributions
 rods+0.067
 songs+0.066
 habitats+0.066
 similarities+0.063
 enjoyable+0.063
Neg logit contributions
 crying-0.050
 mirror-0.047
 ha-0.043
 bleeding-0.042
 wardrobe-0.041
Token I
Token ID314
Feature activation+1.92e-01

Pos logit contributions
 crying+0.123
 mirror+0.117
 bleeding+0.106
 ha+0.105
 wardrobe+0.105
Neg logit contributions
 songs-0.178
 rods-0.176
 habitats-0.168
 unity-0.167
 similarities-0.167
Token know
Token ID760
Feature activation-8.80e-02

Pos logit contributions
 crying+0.289
 mirror+0.284
 ha+0.270
 tooth+0.252
 stub+0.251
Neg logit contributions
 songs-0.380
 rods-0.378
 habitats-0.363
 facts-0.356
lights-0.354
Token many
Token ID867
Feature activation-2.41e-01

Pos logit contributions
 songs+0.177
 rods+0.172
 unity+0.165
 habitats+0.165
 similarities+0.164
Neg logit contributions
 crying-0.130
 mirror-0.123
 bleeding-0.113
 wardrobe-0.112
 ha-0.110
Token songs
Token ID7259
Feature activation-5.35e-01

Pos logit contributions
 rods+0.311
asted+0.310
Years+0.293
 habitats+0.289
 similarities+0.276
Neg logit contributions
 crying-0.506
 mirror-0.505
 diary-0.471
 tooth-0.468
 ha-0.467
Token.
Token ID13
Feature activation+3.51e-02

Pos logit contributions
 rods+1.069
 songs+1.052
 habitats+1.033
 similarities+1.023
asted+1.012
Neg logit contributions
 crying-0.785
 mirror-0.764
 wardrobe-0.682
 bleeding-0.670
 ha-0.669
Token Let
Token ID3914
Feature activation-5.63e-02

Pos logit contributions
 crying+0.048
 mirror+0.046
 bleeding+0.041
 wardrobe+0.041
 ha+0.041
Neg logit contributions
 songs-0.074
 rods-0.072
 habitats-0.069
 unity-0.069
 similarities-0.069
Token's
Token ID338
Feature activation+2.66e-01

Pos logit contributions
 songs+0.140
 lesson+0.131
 games+0.129
 rods+0.128
 times+0.125
Neg logit contributions
 crying-0.060
 mirror-0.054
 bleeding-0.051
used-0.050
 cabinet-0.050
Token whistle
Token ID16121
Feature activation-5.45e-02

Pos logit contributions
 crying+0.447
 mirror+0.444
 ha+0.409
 wardrobe+0.401
 tooth+0.395
Neg logit contributions
 rods-0.471
 songs-0.459
 habitats-0.452
 facts-0.449
 similarities-0.443
Token together
Token ID1978
Feature activation-3.58e-01

Pos logit contributions
 rods+0.099
 habitats+0.098
asted+0.095
 similarities+0.094
 unity+0.092
Neg logit contributions
 crying-0.084
 mirror-0.082
 ha-0.074
 tooth-0.074
 bleeding-0.073
Token She
Token ID1375
Feature activation+2.63e-01

Pos logit contributions
 crying+0.054
 mirror+0.052
 tooth+0.051
 ha+0.050
 wardrobe+0.049
Neg logit contributions
 songs-0.063
 rods-0.062
 similarities-0.057
 games-0.057
 habitats-0.057
Token pret
Token ID2181
Feature activation-1.12e-01

Pos logit contributions
 crying+0.397
 mirror+0.393
 wardrobe+0.352
 ha+0.346
 stub+0.342
Neg logit contributions
 songs-0.525
 bond-0.503
 rods-0.499
 unity-0.499
Years-0.497
Tokenends
Token ID2412
Feature activation-1.09e-01

Pos logit contributions
 enjoyable+0.220
 bond+0.220
 lesson+0.218
 unity+0.216
 habitats+0.216
Neg logit contributions
 crying-0.162
 ha-0.150
 mirror-0.147
 sob-0.143
 finger-0.141
Token to
Token ID284
Feature activation-3.64e-02

Pos logit contributions
 songs+0.193
 rods+0.190
 unity+0.184
 habitats+0.184
 bond+0.182
Neg logit contributions
 crying-0.185
 mirror-0.176
 bleeding-0.161
 wardrobe-0.161
 ha-0.160
Token sing
Token ID1702
Feature activation-2.06e-02

Pos logit contributions
 songs+0.065
 rods+0.064
 habitats+0.061
 unity+0.061
 similarities+0.060
Neg logit contributions
 crying-0.062
 mirror-0.060
 ha-0.055
 wardrobe-0.054
 bleeding-0.054
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 rods+0.041
 habitats+0.039
 unity+0.039
 songs+0.039
 bond+0.039
Neg logit contributions
 crying-0.031
 mirror-0.029
 bleeding-0.027
 wardrobe-0.026
 tooth-0.026
Token with
Token ID351
Feature activation-7.12e-02

Pos logit contributions
 rods+1.014
 songs+1.005
 habitats+0.971
 similarities+0.966
 unity+0.966
Neg logit contributions
 crying-0.845
 mirror-0.801
 bleeding-0.734
 wardrobe-0.727
 tooth-0.713
Token it
Token ID340
Feature activation-6.84e-02

Pos logit contributions
 songs+0.144
 rods+0.139
 habitats+0.132
 unity+0.132
 similarities+0.131
Neg logit contributions
 crying-0.105
 mirror-0.100
 wardrobe-0.092
 bleeding-0.092
 ha-0.091
Token.
Token ID13
Feature activation+3.34e-02

Pos logit contributions
 songs+0.144
 rods+0.144
 habitats+0.138
 unity+0.138
 similarities+0.138
Neg logit contributions
 crying-0.092
 mirror-0.087
 bleeding-0.078
 wardrobe-0.078
 tooth-0.077
Token\n
Token ID198
Feature activation-1.50e-02

Pos logit contributions
 crying+0.056
 mirror+0.053
 tooth+0.052
 wardrobe+0.051
 bleeding+0.050
Neg logit contributions
 songs-0.060
 rods-0.058
 similarities-0.054
 teamwork-0.054
 games-0.054
Token\n
Token ID198
Feature activation-1.50e-02

Pos logit contributions
 songs+0.028
 rods+0.027
 habitats+0.026
 unity+0.026
 similarities+0.026
Neg logit contributions
 crying-0.024
 mirror-0.023
 bleeding-0.021
 wardrobe-0.021
 ha-0.021
Token,
Token ID11
Feature activation-2.97e-03

Pos logit contributions
 rods+1.050
 songs+1.021
 habitats+1.003
 similarities+0.997
 unity+0.992
Neg logit contributions
 crying-0.828
 mirror-0.782
 wardrobe-0.710
 bleeding-0.708
 tooth-0.694
Token sad
Token ID6507
Feature activation+1.94e-01

Pos logit contributions
 rods+0.006
Years+0.005
 unity+0.005
 bond+0.005
 habitats+0.005
Neg logit contributions
 crying-0.005
 mirror-0.005
 sob-0.004
 bleeding-0.004
 tooth-0.004
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 crying+0.391
 mirror+0.385
 wardrobe+0.353
 tooth+0.350
 cabinet+0.345
Neg logit contributions
 rods-0.301
asted-0.285
 unity-0.282
Years-0.278
 enjoyable-0.277
Token,
Token ID11
Feature activation-3.91e-03

Pos logit contributions
 rods+1.068
 songs+1.061
 habitats+1.020
 similarities+1.015
 unity+1.015
Neg logit contributions
 crying-0.793
 mirror-0.749
 bleeding-0.681
 wardrobe-0.679
 tooth-0.665
Token silly
Token ID14397
Feature activation+3.01e-02

Pos logit contributions
 rods+0.007
 unity+0.007
Years+0.007
asted+0.007
 bond+0.007
Neg logit contributions
 crying-0.007
 mirror-0.007
 bleeding-0.006
 sob-0.006
 tooth-0.006
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 crying+0.081
 mirror+0.079
 wardrobe+0.075
 tooth+0.074
 ha+0.073
Neg logit contributions
 rods-0.025
Years-0.023
asted-0.022
 unity-0.022
 enjoyable-0.021
Token and
Token ID290
Feature activation-1.84e-01

Pos logit contributions
 rods+1.106
 songs+1.085
 habitats+1.056
 similarities+1.052
 unity+1.052
Neg logit contributions
 crying-0.765
 mirror-0.719
 bleeding-0.650
 wardrobe-0.648
 ha-0.634
Token loud
Token ID7812
Feature activation-9.28e-02

Pos logit contributions
 rods+0.313
Years+0.281
asted+0.278
 habitats+0.273
asts+0.271
Neg logit contributions
 crying-0.359
 mirror-0.339
 bleeding-0.317
 tooth-0.311
 wardrobe-0.305
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 rods+0.124
 songs+0.119
 unity+0.114
 similarities+0.113
 habitats+0.112
Neg logit contributions
 crying-0.199
 mirror-0.192
 bleeding-0.179
 wardrobe-0.179
 tooth-0.178
Token.
Token ID13
Feature activation+3.47e-02

Pos logit contributions
 songs+1.127
 rods+1.102
 habitats+1.056
 unity+1.055
 similarities+1.053
Neg logit contributions
 crying-0.728
 mirror-0.690
 bleeding-0.624
 wardrobe-0.622
 ha-0.613
Token She
Token ID1375
Feature activation+2.59e-01

Pos logit contributions
 crying+0.051
 mirror+0.047
 bleeding+0.046
 ha+0.045
 wardrobe+0.045
Neg logit contributions
 songs-0.070
 rods-0.066
 similarities-0.064
 habitats-0.064
 lesson-0.064
Token,
Token ID11
Feature activation-3.91e-03

Pos logit contributions
 rods+1.068
 songs+1.061
 habitats+1.020
 similarities+1.015
 unity+1.015
Neg logit contributions
 crying-0.793
 mirror-0.749
 bleeding-0.681
 wardrobe-0.679
 tooth-0.665
Token silly
Token ID14397
Feature activation+3.01e-02

Pos logit contributions
 rods+0.007
 unity+0.007
Years+0.007
asted+0.007
 bond+0.007
Neg logit contributions
 crying-0.007
 mirror-0.007
 bleeding-0.006
 sob-0.006
 tooth-0.006
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 crying+0.081
 mirror+0.079
 wardrobe+0.075
 tooth+0.074
 ha+0.073
Neg logit contributions
 rods-0.025
Years-0.023
asted-0.022
 unity-0.022
 enjoyable-0.021
Token and
Token ID290
Feature activation-1.84e-01

Pos logit contributions
 rods+1.106
 songs+1.085
 habitats+1.056
 similarities+1.052
 unity+1.052
Neg logit contributions
 crying-0.765
 mirror-0.719
 bleeding-0.650
 wardrobe-0.648
 ha-0.634
Token loud
Token ID7812
Feature activation-9.28e-02

Pos logit contributions
 rods+0.313
Years+0.281
asted+0.278
 habitats+0.273
asts+0.271
Neg logit contributions
 crying-0.359
 mirror-0.339
 bleeding-0.317
 tooth-0.311
 wardrobe-0.305
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 rods+0.124
 songs+0.119
 unity+0.114
 similarities+0.113
 habitats+0.112
Neg logit contributions
 crying-0.199
 mirror-0.192
 bleeding-0.179
 wardrobe-0.179
 tooth-0.178
Token.
Token ID13
Feature activation+3.47e-02

Pos logit contributions
 songs+1.127
 rods+1.102
 habitats+1.056
 unity+1.055
 similarities+1.053
Neg logit contributions
 crying-0.728
 mirror-0.690
 bleeding-0.624
 wardrobe-0.622
 ha-0.613
Token She
Token ID1375
Feature activation+2.59e-01

Pos logit contributions
 crying+0.051
 mirror+0.047
 bleeding+0.046
 ha+0.045
 wardrobe+0.045
Neg logit contributions
 songs-0.070
 rods-0.066
 similarities-0.064
 habitats-0.064
 lesson-0.064
Token sang
Token ID25889
Feature activation-8.93e-02

Pos logit contributions
 crying+0.418
 mirror+0.401
 stub+0.368
 tooth+0.367
 wardrobe+0.363
Neg logit contributions
 rods-0.507
 songs-0.475
 unity-0.469
 habitats-0.461
 enjoyable-0.460
Token to
Token ID284
Feature activation-3.85e-02

Pos logit contributions
 rods+0.167
 habitats+0.155
 unity+0.154
 bond+0.153
 similarities+0.152
Neg logit contributions
 crying-0.150
 mirror-0.144
 wardrobe-0.129
 bleeding-0.128
 tooth-0.128
Token her
Token ID607
Feature activation+1.93e-01

Pos logit contributions
 rods+0.084
 songs+0.083
 habitats+0.080
 unity+0.080
 similarities+0.079
Neg logit contributions
 crying-0.050
 mirror-0.048
 wardrobe-0.042
 bleeding-0.042
 tooth-0.041
Token sang
Token ID25889
Feature activation-8.80e-02

Pos logit contributions
 crying+0.416
 mirror+0.398
 wardrobe+0.364
 bleeding+0.364
 ha+0.359
Neg logit contributions
 songs-0.494
 rods-0.486
 unity-0.463
 habitats-0.462
 similarities-0.460
Token happy
Token ID3772
Feature activation-1.49e-01

Pos logit contributions
 rods+0.174
 songs+0.171
 habitats+0.165
 unity+0.165
 similarities+0.164
Neg logit contributions
 crying-0.132
 mirror-0.127
 wardrobe-0.114
 bleeding-0.114
 tooth-0.113
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 rods+0.149
 unity+0.130
Years+0.129
asted+0.124
 enjoyable+0.123
Neg logit contributions
 crying-0.391
 mirror-0.379
 tooth-0.360
 wardrobe-0.353
 sob-0.349
Token,
Token ID11
Feature activation-2.97e-03

Pos logit contributions
 rods+1.050
 songs+1.021
 habitats+1.003
 similarities+0.997
 unity+0.992
Neg logit contributions
 crying-0.828
 mirror-0.782
 wardrobe-0.710
 bleeding-0.708
 tooth-0.694
Token sad
Token ID6507
Feature activation+1.94e-01

Pos logit contributions
 rods+0.006
Years+0.005
 unity+0.005
 bond+0.005
 habitats+0.005
Neg logit contributions
 crying-0.005
 mirror-0.005
 sob-0.004
 bleeding-0.004
 tooth-0.004
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 crying+0.391
 mirror+0.385
 wardrobe+0.353
 tooth+0.350
 cabinet+0.345
Neg logit contributions
 rods-0.301
asted-0.285
 unity-0.282
Years-0.278
 enjoyable-0.277
Token,
Token ID11
Feature activation-3.91e-03

Pos logit contributions
 rods+1.068
 songs+1.061
 habitats+1.020
 similarities+1.015
 unity+1.015
Neg logit contributions
 crying-0.793
 mirror-0.749
 bleeding-0.681
 wardrobe-0.679
 tooth-0.665
Token silly
Token ID14397
Feature activation+3.01e-02

Pos logit contributions
 rods+0.007
 unity+0.007
Years+0.007
asted+0.007
 bond+0.007
Neg logit contributions
 crying-0.007
 mirror-0.007
 bleeding-0.006
 sob-0.006
 tooth-0.006
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 crying+0.081
 mirror+0.079
 wardrobe+0.075
 tooth+0.074
 ha+0.073
Neg logit contributions
 rods-0.025
Years-0.023
asted-0.022
 unity-0.022
 enjoyable-0.021
Token and
Token ID290
Feature activation-1.84e-01

Pos logit contributions
 rods+1.106
 songs+1.085
 habitats+1.056
 similarities+1.052
 unity+1.052
Neg logit contributions
 crying-0.765
 mirror-0.719
 bleeding-0.650
 wardrobe-0.648
 ha-0.634
Token loud
Token ID7812
Feature activation-9.28e-02

Pos logit contributions
 rods+0.313
Years+0.281
asted+0.278
 habitats+0.273
asts+0.271
Neg logit contributions
 crying-0.359
 mirror-0.339
 bleeding-0.317
 tooth-0.311
 wardrobe-0.305
Token bed
Token ID3996
Feature activation+6.73e-02

Pos logit contributions
 songs+0.067
 rods+0.064
 unity+0.062
 similarities+0.061
 habitats+0.061
Neg logit contributions
 crying-0.056
 mirror-0.053
 bleeding-0.050
 wardrobe-0.049
 ha-0.048
Token.
Token ID13
Feature activation+3.67e-02

Pos logit contributions
 crying+0.098
 mirror+0.094
 bleeding+0.087
 ha+0.085
 wardrobe+0.085
Neg logit contributions
 songs-0.136
 rods-0.128
 games-0.124
 enjoyable-0.123
 habitats-0.123
Token She
Token ID1375
Feature activation+2.61e-01

Pos logit contributions
 crying+0.045
 bleeding+0.040
 mirror+0.040
 ha+0.039
 wardrobe+0.038
Neg logit contributions
 songs-0.083
 rods-0.078
 games-0.076
 lesson-0.076
 habitats-0.076
Token sang
Token ID25889
Feature activation-8.80e-02

Pos logit contributions
 crying+0.416
 mirror+0.398
 wardrobe+0.364
 bleeding+0.364
 ha+0.359
Neg logit contributions
 songs-0.494
 rods-0.486
 unity-0.463
 habitats-0.462
 similarities-0.460
Token happy
Token ID3772
Feature activation-1.49e-01

Pos logit contributions
 rods+0.174
 songs+0.171
 habitats+0.165
 unity+0.165
 similarities+0.164
Neg logit contributions
 crying-0.132
 mirror-0.127
 wardrobe-0.114
 bleeding-0.114
 tooth-0.113
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 rods+0.149
 unity+0.130
Years+0.129
asted+0.124
 enjoyable+0.123
Neg logit contributions
 crying-0.391
 mirror-0.379
 tooth-0.360
 wardrobe-0.353
 sob-0.349
Token,
Token ID11
Feature activation-2.97e-03

Pos logit contributions
 rods+1.050
 songs+1.021
 habitats+1.003
 similarities+0.997
 unity+0.992
Neg logit contributions
 crying-0.828
 mirror-0.782
 wardrobe-0.710
 bleeding-0.708
 tooth-0.694
Token sad
Token ID6507
Feature activation+1.94e-01

Pos logit contributions
 rods+0.006
Years+0.005
 unity+0.005
 bond+0.005
 habitats+0.005
Neg logit contributions
 crying-0.005
 mirror-0.005
 sob-0.004
 bleeding-0.004
 tooth-0.004
Token songs
Token ID7259
Feature activation-5.34e-01

Pos logit contributions
 crying+0.391
 mirror+0.385
 wardrobe+0.353
 tooth+0.350
 cabinet+0.345
Neg logit contributions
 rods-0.301
asted-0.285
 unity-0.282
Years-0.278
 enjoyable-0.277
Token,
Token ID11
Feature activation-3.91e-03

Pos logit contributions
 rods+1.068
 songs+1.061
 habitats+1.020
 similarities+1.015
 unity+1.015
Neg logit contributions
 crying-0.793
 mirror-0.749
 bleeding-0.681
 wardrobe-0.679
 tooth-0.665
Token silly
Token ID14397
Feature activation+3.01e-02

Pos logit contributions
 rods+0.007
 unity+0.007
Years+0.007
asted+0.007
 bond+0.007
Neg logit contributions
 crying-0.007
 mirror-0.007
 bleeding-0.006
 sob-0.006
 tooth-0.006
Token many
Token ID867
Feature activation-2.41e-01

Pos logit contributions
 bond+0.046
 unity+0.046
 pleasures+0.045
 enjoyable+0.045
 rods+0.045
Neg logit contributions
 crying-0.026
 mirror-0.025
 wardrobe-0.022
 tooth-0.021
 ha-0.021
Token videos
Token ID5861
Feature activation-2.64e-01

Pos logit contributions
asted+0.413
Years+0.394
 bond+0.389
Fin+0.389
 unity+0.388
Neg logit contributions
 crying-0.427
 mirror-0.416
 wardrobe-0.380
 closet-0.373
 drawer-0.370
Token of
Token ID286
Feature activation-2.01e-01

Pos logit contributions
 similarities+0.482
 habitats+0.482
asted+0.479
 unity+0.476
 bond+0.472
Neg logit contributions
 crying-0.425
 mirror-0.421
 cabinet-0.378
 ha-0.369
 diary-0.365
Token animals
Token ID4695
Feature activation-2.38e-01

Pos logit contributions
Years+0.317
 rods+0.313
 bond+0.312
 enjoyable+0.307
 unity+0.301
Neg logit contributions
 crying-0.375
 mirror-0.365
 wardrobe-0.326
 closet-0.319
 tooth-0.318
Token,
Token ID11
Feature activation-2.40e-03

Pos logit contributions
Years+0.514
 similarities+0.500
 enjoyable+0.498
 habitats+0.496
 rods+0.496
Neg logit contributions
 crying-0.329
 mirror-0.293
 wardrobe-0.259
 cabinet-0.256
 stuck-0.256
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
 bond+0.005
Years+0.005
 unity+0.004
asted+0.004
 teamwork+0.004
Neg logit contributions
 crying-0.004
 mirror-0.004
 sob-0.003
 tooth-0.003
 closet-0.003
Token,
Token ID11
Feature activation-2.87e-03

Pos logit contributions
 enjoyable+1.290
 habitats+1.276
 similarities+1.259
 rods+1.257
Years+1.254
Neg logit contributions
 crying-0.625
 mirror-0.555
 bleeding-0.478
 staring-0.457
 lying-0.452
Token and
Token ID290
Feature activation-1.83e-01

Pos logit contributions
 bond+0.006
Years+0.006
 unity+0.005
 enjoyable+0.005
 teamwork+0.005
Neg logit contributions
 crying-0.005
 mirror-0.004
 sob-0.004
 tooth-0.004
 staring-0.004
Token stories
Token ID3923
Feature activation-3.92e-01

Pos logit contributions
 rods+0.261
Years+0.258
asted+0.247
 bond+0.244
angles+0.238
Neg logit contributions
 crying-0.384
 mirror-0.367
 tooth-0.333
 ha-0.333
 wardrobe-0.325
Token.
Token ID13
Feature activation+3.23e-02

Pos logit contributions
 rods+0.943
 similarities+0.926
 unity+0.922
 habitats+0.920
asted+0.917
Neg logit contributions
 crying-0.396
 mirror-0.378
 cabinet-0.337
 wardrobe-0.322
 drawer-0.315
Token Sometimes
Token ID8975
Feature activation-1.69e-01

Pos logit contributions
 mirror+0.048
 crying+0.047
 drawer+0.043
 pant+0.042
 cabinet+0.041
Neg logit contributions
 songs-0.065
 unity-0.063
 facts-0.062
 bond-0.062
 habitats-0.060
Token at
Token ID379
Feature activation+8.56e-03

Pos logit contributions
 rods+0.267
 unity+0.266
 pleasures+0.264
 bond+0.264
 habitats+0.262
Neg logit contributions
 crying-0.381
 mirror-0.353
 stub-0.341
 Elsa-0.329
 tooth-0.328
Token night
Token ID1755
Feature activation-1.41e-01

Pos logit contributions
 crying+0.014
 mirror+0.013
 wardrobe+0.012
 tooth+0.012
 drawer+0.012
Neg logit contributions
 rods-0.016
 songs-0.015
 similarities-0.015
Years-0.015
 facts-0.015
Token.
Token ID13
Feature activation+3.20e-02

Pos logit contributions
 songs+0.297
 rods+0.293
 unity+0.286
 habitats+0.286
 similarities+0.284
Neg logit contributions
 crying-0.194
 mirror-0.182
 wardrobe-0.163
 bleeding-0.162
 tooth-0.160
Token They
Token ID1119
Feature activation-3.75e-01

Pos logit contributions
 crying+0.063
 mirror+0.063
 drawer+0.057
 cabinet+0.057
 pant+0.057
Neg logit contributions
 songs-0.048
 unity-0.046
 bond-0.045
 facts-0.044
 enjoyable-0.044
Token sang
Token ID25889
Feature activation-9.05e-02

Pos logit contributions
 songs+0.641
 rods+0.633
 unity+0.597
 habitats+0.596
 similarities+0.594
Neg logit contributions
 crying-0.664
 mirror-0.637
 wardrobe-0.589
 bleeding-0.587
 ha-0.582
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
 rods+0.149
 habitats+0.138
asted+0.137
 unity+0.134
 similarities+0.133
Neg logit contributions
 crying-0.166
 mirror-0.159
 Rose-0.151
 diary-0.148
 wardrobe-0.145
Token about
Token ID546
Feature activation-2.43e-01

Pos logit contributions
 rods+1.193
 habitats+1.165
 similarities+1.157
 enjoyable+1.139
asted+1.124
Neg logit contributions
 crying-0.738
 mirror-0.640
 Rose-0.585
 bleeding-0.573
 staring-0.572
Token the
Token ID262
Feature activation-3.21e-02

Pos logit contributions
Years+0.413
 rods+0.392
 enjoyable+0.388
 bond+0.378
 lesson+0.375
Neg logit contributions
 crying-0.469
 mirror-0.423
 bleeding-0.390
 tooth-0.390
 weeping-0.388
Token sun
Token ID4252
Feature activation-1.41e-01

Pos logit contributions
 rods+0.056
 songs+0.053
asted+0.053
 similarities+0.052
 unity+0.052
Neg logit contributions
 crying-0.055
 mirror-0.054
 wardrobe-0.050
 tooth-0.049
 bleeding-0.049
Token,
Token ID11
Feature activation-4.53e-03

Pos logit contributions
 rods+0.276
Years+0.273
 enjoyable+0.271
 similarities+0.268
 unity+0.268
Neg logit contributions
 crying-0.215
 mirror-0.202
 bleeding-0.184
 wardrobe-0.180
 tooth-0.179
Token the
Token ID262
Feature activation-3.14e-02

Pos logit contributions
Years+0.010
 rods+0.010
 bond+0.009
 unity+0.009
 enjoyable+0.009
Neg logit contributions
 crying-0.007
 mirror-0.006
 bleeding-0.006
 stub-0.005
 staring-0.005
Token story
Token ID1621
Feature activation-5.98e-03

Pos logit contributions
 rods+0.097
 habitats+0.095
 unity+0.094
 enjoyable+0.094
asted+0.094
Neg logit contributions
 crying-0.126
 mirror-0.121
 tooth-0.112
 wardrobe-0.110
 ha-0.109
Token with
Token ID351
Feature activation-6.94e-02

Pos logit contributions
 rods+0.014
asted+0.013
 habitats+0.013
 similarities+0.013
 unity+0.013
Neg logit contributions
 crying-0.007
 mirror-0.006
 tooth-0.006
 bleeding-0.006
 diary-0.006
Token pupp
Token ID13595
Feature activation-1.16e-01

Pos logit contributions
 rods+0.132
 songs+0.132
Years+0.131
 bond+0.131
 enjoyable+0.129
Neg logit contributions
 crying-0.106
 mirror-0.099
 tooth-0.091
 bleeding-0.090
 ha-0.088
Tokenets
Token ID1039
Feature activation-3.44e-01

Pos logit contributions
Years+0.135
 rods+0.134
 bond+0.131
asted+0.130
 unity+0.130
Neg logit contributions
 mirror-0.259
 crying-0.254
 wardrobe-0.234
 closet-0.227
 bleeding-0.225
Token and
Token ID290
Feature activation-1.79e-01

Pos logit contributions
 rods+0.717
 songs+0.698
 similarities+0.687
Years+0.686
 bond+0.684
Neg logit contributions
 crying-0.492
 mirror-0.454
 bleeding-0.407
 wardrobe-0.405
 cabinet-0.396
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
Years+0.295
 rods+0.286
 bond+0.279
 enjoyable+0.277
asted+0.276
Neg logit contributions
 crying-0.333
 mirror-0.311
 tooth-0.294
 bleeding-0.289
 ha-0.286
Token.
Token ID13
Feature activation+3.81e-02

Pos logit contributions
 rods+1.264
 habitats+1.224
 similarities+1.223
 songs+1.217
asted+1.214
Neg logit contributions
 crying-0.633
 mirror-0.563
 bleeding-0.512
 wardrobe-0.473
 tooth-0.462
Token Ben
Token ID3932
Feature activation+1.04e-01

Pos logit contributions
 crying+0.056
 mirror+0.053
 bleeding+0.047
 cabinet+0.047
 drawer+0.046
Neg logit contributions
 songs-0.076
 bond-0.075
 rods-0.073
 unity-0.072
asted-0.072
Token and
Token ID290
Feature activation-1.81e-01

Pos logit contributions
 crying+0.186
 mirror+0.174
 sob+0.164
 tooth+0.161
 ha+0.161
Neg logit contributions
 rods-0.178
 enjoyable-0.175
 similarities-0.173
 unity-0.172
Years-0.171
Token Lily
Token ID20037
Feature activation+4.95e-02

Pos logit contributions
 songs+0.393
 rods+0.386
 habitats+0.371
 unity+0.370
 similarities+0.369
Neg logit contributions
 crying-0.238
 mirror-0.225
 bleeding-0.202
 wardrobe-0.201
 ha-0.199
Token laughed
Token ID13818
Feature activation-5.62e-02

Pos logit contributions
 crying+0.079
 mirror+0.074
 sob+0.069
 ha+0.067
 bleeding+0.066
Neg logit contributions
 rods-0.097
 enjoyable-0.093
 songs-0.092
 similarities-0.092
Years-0.092
Token.
Token ID13
Feature activation+3.81e-02

Pos logit contributions
 rods+0.042
 habitats+0.040
asted+0.039
 enjoyable+0.039
Years+0.039
Neg logit contributions
 crying-0.029
 mirror-0.028
 bleeding-0.024
 diary-0.024
 tooth-0.024
Token They
Token ID1119
Feature activation-3.70e-01

Pos logit contributions
 crying+0.067
 She+0.064
 mirror+0.062
 Rose+0.060
 sob+0.058
Neg logit contributions
 songs-0.062
 bond-0.061
asting-0.060
 pleasures-0.059
Years-0.058
Token sang
Token ID25889
Feature activation-9.11e-02

Pos logit contributions
 songs+0.653
 rods+0.646
 habitats+0.610
 unity+0.609
 enjoyable+0.608
Neg logit contributions
 crying-0.633
 mirror-0.607
 wardrobe-0.559
 bleeding-0.558
 ha-0.554
Token their
Token ID511
Feature activation-4.10e-01

Pos logit contributions
 rods+0.168
 habitats+0.157
asted+0.156
 enjoyable+0.151
asts+0.150
Neg logit contributions
 crying-0.153
 mirror-0.143
 Rose-0.132
 diary-0.130
 sob-0.129
Token favorite
Token ID4004
Feature activation-2.08e-01

Pos logit contributions
 rods+0.431
asted+0.390
 enjoyable+0.382
 habitats+0.381
asts+0.379
Neg logit contributions
 crying-0.998
 mirror-0.974
 wardrobe-0.930
 tooth-0.916
 diary-0.911
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
 rods+0.189
 enjoyable+0.166
Years+0.159
Fin+0.157
 habitats+0.157
Neg logit contributions
 crying-0.548
 mirror-0.542
 tooth-0.512
 diary-0.500
 wardrobe-0.498
Token and
Token ID290
Feature activation-1.80e-01

Pos logit contributions
 rods+1.123
 habitats+1.084
 similarities+1.081
 enjoyable+1.066
 unity+1.045
Neg logit contributions
 crying-0.824
 mirror-0.726
 bleeding-0.677
 tooth-0.641
 staring-0.641
Token made
Token ID925
Feature activation-3.77e-02

Pos logit contributions
Years+0.332
 rods+0.329
addle+0.315
asts+0.314
asting+0.311
Neg logit contributions
 crying-0.295
 mirror-0.272
 tooth-0.262
 sob-0.261
 bleeding-0.253
Token up
Token ID510
Feature activation+1.54e-01

Pos logit contributions
 rods+0.057
 habitats+0.055
 bond+0.055
 unity+0.053
 similarities+0.052
Neg logit contributions
 crying-0.074
 mirror-0.070
 tooth-0.064
 bleeding-0.063
 sob-0.063
Token some
Token ID617
Feature activation-9.08e-02

Pos logit contributions
 crying+0.349
 mirror+0.321
 tooth+0.294
 wardrobe+0.293
 weeping+0.293
Neg logit contributions
 rods-0.202
Fin-0.193
Years-0.192
asted-0.176
 bond-0.176
Token new
Token ID649
Feature activation-1.07e-01

Pos logit contributions
 rods+0.136
 habitats+0.132
Fin+0.132
Years+0.131
asted+0.125
Neg logit contributions
 mirror-0.183
 crying-0.182
 tooth-0.162
 bleeding-0.159
 wardrobe-0.159
Token Can
Token ID1680
Feature activation+4.88e-02

Pos logit contributions
 crying+0.118
 mirror+0.114
 ha+0.103
 bleeding+0.101
 tooth+0.099
Neg logit contributions
 songs-0.191
 rods-0.191
 habitats-0.183
 similarities-0.181
 unity-0.181
Token you
Token ID345
Feature activation+6.24e-02

Pos logit contributions
 crying+0.060
 mirror+0.057
 wardrobe+0.051
 bleeding+0.051
 tooth+0.050
Neg logit contributions
 songs-0.110
 rods-0.107
 similarities-0.103
 unity-0.103
 habitats-0.103
Token teach
Token ID4545
Feature activation-6.64e-02

Pos logit contributions
 crying+0.106
 mirror+0.102
 ha+0.094
 bleeding+0.093
 wardrobe+0.092
Neg logit contributions
 songs-0.112
 rods-0.111
 habitats-0.107
 facts-0.105
 similarities-0.105
Token me
Token ID502
Feature activation+2.61e-01

Pos logit contributions
 songs+0.174
 rods+0.165
 games+0.160
 facts+0.160
 unity+0.158
Neg logit contributions
 crying-0.062
 mirror-0.054
 wardrobe-0.053
 bleeding-0.052
 cabinet-0.051
Token more
Token ID517
Feature activation-1.20e-01

Pos logit contributions
 crying+0.423
 mirror+0.420
 ha+0.375
 tooth+0.373
 bleeding+0.370
Neg logit contributions
 rods-0.466
 songs-0.465
 habitats-0.457
 similarities-0.448
 unity-0.444
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
 rods+0.227
 habitats+0.225
asted+0.223
 similarities+0.222
Years+0.217
Neg logit contributions
 mirror-0.180
 crying-0.175
 ha-0.159
 tooth-0.155
 diary-0.152
Token?"
Token ID1701
Feature activation+1.29e-01

Pos logit contributions
 rods+1.265
 songs+1.263
 habitats+1.229
 similarities+1.220
 unity+1.219
Neg logit contributions
 crying-0.575
 mirror-0.552
 bleeding-0.468
 wardrobe-0.467
 ha-0.462
Token\n
Token ID198
Feature activation-8.97e-03

Pos logit contributions
 crying+0.211
 mirror+0.202
 wardrobe+0.188
 bleeding+0.188
 ha+0.184
Neg logit contributions
 songs-0.239
 rods-0.234
 habitats-0.221
 similarities-0.221
 unity-0.220
Token\n
Token ID198
Feature activation-9.02e-03

Pos logit contributions
 songs+0.016
 rods+0.016
 similarities+0.015
 unity+0.015
 habitats+0.015
Neg logit contributions
 crying-0.015
 mirror-0.014
 wardrobe-0.013
 bleeding-0.013
 ha-0.013
TokenL
Token ID43
Feature activation+3.31e-02

Pos logit contributions
 songs+0.018
 rods+0.018
 habitats+0.018
 unity+0.017
 facts+0.017
Neg logit contributions
 crying-0.013
 mirror-0.013
 ha-0.011
 bleeding-0.011
 wardrobe-0.011
Tokenily
Token ID813
Feature activation+1.39e-01

Pos logit contributions
 crying+0.056
 wardrobe+0.050
 bleeding+0.050
 stub+0.049
 mirror+0.048
Neg logit contributions
 songs-0.056
 lesson-0.055
 rods-0.055
asted-0.055
asting-0.055
Token their
Token ID511
Feature activation-4.09e-01

Pos logit contributions
 sizes+0.378
 habitats+0.374
lights+0.348
 enjoyable+0.345
 unity+0.339
Neg logit contributions
 crying-0.227
 mirror-0.214
 ha-0.201
 diary-0.185
 drawer-0.185
Token guitar
Token ID10047
Feature activation-8.59e-02

Pos logit contributions
asted+0.654
 enjoyable+0.628
asts+0.624
Fin+0.615
 pleasures+0.615
Neg logit contributions
 crying-0.770
 ha-0.707
 mirror-0.694
 wardrobe-0.676
 arm-0.664
Token and
Token ID290
Feature activation-1.79e-01

Pos logit contributions
 enjoyable+0.163
 similarities+0.162
 rods+0.161
asted+0.159
 facts+0.157
Neg logit contributions
 crying-0.136
 mirror-0.124
 wardrobe-0.118
 diary-0.117
 cabinet-0.115
Token sing
Token ID1702
Feature activation-1.70e-02

Pos logit contributions
Years+0.354
 rods+0.339
 enjoyable+0.334
 habitats+0.333
asting+0.328
Neg logit contributions
 crying-0.286
 mirror-0.274
 ha-0.246
 bleeding-0.240
 wardrobe-0.238
Token their
Token ID511
Feature activation-4.11e-01

Pos logit contributions
 rods+0.031
asted+0.030
 habitats+0.029
 enjoyable+0.029
 unity+0.029
Neg logit contributions
 crying-0.027
 mirror-0.025
 diary-0.024
 Rose-0.023
 bleeding-0.023
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
asted+0.476
asts+0.452
Fin+0.451
 enjoyable+0.449
Years+0.445
Neg logit contributions
 crying-0.905
 mirror-0.883
 tooth-0.878
 arm-0.861
 sob-0.861
Token.
Token ID13
Feature activation+3.68e-02

Pos logit contributions
 rods+1.340
 habitats+1.300
 similarities+1.285
asted+1.270
 facts+1.267
Neg logit contributions
 crying-0.571
 mirror-0.485
 bleeding-0.441
 stuck-0.411
 wardrobe-0.407
Token They
Token ID1119
Feature activation-3.71e-01

Pos logit contributions
 crying+0.076
 She+0.073
 mirror+0.072
 Rose+0.070
 closet+0.069
Neg logit contributions
 bond-0.046
 songs-0.046
 rods-0.045
 unity-0.044
 times-0.044
Token make
Token ID787
Feature activation-1.12e-01

Pos logit contributions
 rods+0.771
 songs+0.744
 habitats+0.728
Years+0.715
 enjoyable+0.713
Neg logit contributions
 crying-0.533
 mirror-0.531
 ha-0.489
 wardrobe-0.474
 tooth-0.458
Token everyone
Token ID2506
Feature activation-2.75e-01

Pos logit contributions
 rods+0.181
 habitats+0.176
 bond+0.176
 songs+0.172
 unity+0.170
Neg logit contributions
 crying-0.209
 mirror-0.196
 tooth-0.180
 bleeding-0.180
 wardrobe-0.177
Token happy
Token ID3772
Feature activation-1.47e-01

Pos logit contributions
 rods+0.471
 habitats+0.455
 songs+0.438
Years+0.436
 unity+0.432
Neg logit contributions
 crying-0.509
 mirror-0.484
 bleeding-0.438
 wardrobe-0.436
 tooth-0.434
Token to
Token ID284
Feature activation-3.49e-02

Pos logit contributions
 crying+0.025
 mirror+0.024
 bleeding+0.022
 wardrobe+0.022
 tooth+0.022
Neg logit contributions
 songs-0.028
 rods-0.027
 unity-0.026
 similarities-0.026
 habitats-0.026
Token her
Token ID607
Feature activation+1.97e-01

Pos logit contributions
 rods+0.071
 songs+0.069
 habitats+0.066
 unity+0.066
 bond+0.066
Neg logit contributions
 crying-0.051
 mirror-0.048
 ha-0.043
 wardrobe-0.043
 tooth-0.042
Token family
Token ID1641
Feature activation-1.41e-01

Pos logit contributions
 crying+0.304
 mirror+0.288
 wardrobe+0.265
 ha+0.260
 tooth+0.256
Neg logit contributions
 rods-0.386
 songs-0.372
Years-0.366
asted-0.366
 similarities-0.363
Token and
Token ID290
Feature activation-1.75e-01

Pos logit contributions
 rods+0.290
 songs+0.286
 similarities+0.276
 habitats+0.275
 unity+0.274
Neg logit contributions
 crying-0.204
 mirror-0.192
 wardrobe-0.175
 tooth-0.171
 ha-0.170
Token her
Token ID607
Feature activation+1.97e-01

Pos logit contributions
 rods+0.314
 songs+0.303
Years+0.300
asted+0.295
 habitats+0.293
Neg logit contributions
 crying-0.303
 mirror-0.286
 bleeding-0.259
 wardrobe-0.259
 tooth-0.258
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
 crying+0.348
 mirror+0.337
 wardrobe+0.311
 diary+0.305
 ha+0.298
Neg logit contributions
 rods-0.346
asted-0.337
Years-0.328
asting-0.319
asts-0.317
Token are
Token ID389
Feature activation-7.76e-02

Pos logit contributions
 rods+1.155
 songs+1.128
 habitats+1.097
 unity+1.081
asted+1.081
Neg logit contributions
 crying-0.735
 mirror-0.689
 bleeding-0.617
 wardrobe-0.604
 tooth-0.580
Token beautiful
Token ID4950
Feature activation-6.53e-02

Pos logit contributions
 rods+0.152
 songs+0.147
 habitats+0.145
 bond+0.140
 facts+0.140
Neg logit contributions
 crying-0.124
 mirror-0.114
 bleeding-0.107
 tooth-0.104
 wardrobe-0.103
Token because
Token ID780
Feature activation-1.41e-01

Pos logit contributions
 rods+0.132
 songs+0.129
asted+0.126
 similarities+0.125
 habitats+0.125
Neg logit contributions
 crying-0.096
 mirror-0.092
 wardrobe-0.083
 tooth-0.082
 bleeding-0.081
Token they
Token ID484
Feature activation-4.21e-01

Pos logit contributions
 rods+0.272
 songs+0.263
 habitats+0.263
asted+0.258
 bond+0.257
Neg logit contributions
 crying-0.226
 mirror-0.214
 bleeding-0.192
 wardrobe-0.191
 tooth-0.186
Token come
Token ID1282
Feature activation+5.83e-02

Pos logit contributions
 rods+0.867
 songs+0.842
 habitats+0.834
Years+0.808
lights+0.793
Neg logit contributions
 crying-0.633
 mirror-0.597
 wardrobe-0.536
 locked-0.530
 tooth-0.525
Token.
Token ID13
Feature activation+3.39e-02

Pos logit contributions
 rods+0.388
 habitats+0.376
asted+0.369
 bond+0.364
 unity+0.362
Neg logit contributions
 crying-0.208
 mirror-0.196
 bleeding-0.167
 wardrobe-0.162
 tooth-0.161
Token Anna
Token ID11735
Feature activation+1.27e-01

Pos logit contributions
 crying+0.052
 mirror+0.049
 bleeding+0.045
 wardrobe+0.045
 ha+0.045
Neg logit contributions
 songs-0.066
 rods-0.065
 habitats-0.062
 unity-0.062
 similarities-0.062
Token liked
Token ID8288
Feature activation-1.97e-01

Pos logit contributions
 crying+0.231
 mirror+0.221
 stub+0.205
 ha+0.205
 wardrobe+0.203
Neg logit contributions
 rods-0.212
 songs-0.207
 enjoyable-0.200
 facts-0.199
 unity-0.199
Token to
Token ID284
Feature activation-3.74e-02

Pos logit contributions
 rods+0.408
 enjoyable+0.389
asted+0.388
 bond+0.387
 unity+0.385
Neg logit contributions
 crying-0.302
 mirror-0.282
 wardrobe-0.248
 bleeding-0.241
 diary-0.240
Token play
Token ID711
Feature activation-1.71e-01

Pos logit contributions
 rods+0.069
 songs+0.064
 habitats+0.064
 unity+0.063
 enjoyable+0.062
Neg logit contributions
 crying-0.062
 mirror-0.062
 wardrobe-0.056
 ha-0.055
 tooth-0.055
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
 rods+0.301
 songs+0.295
 habitats+0.289
 unity+0.283
 similarities+0.281
Neg logit contributions
 crying-0.296
 mirror-0.284
 wardrobe-0.261
 bleeding-0.258
 ha-0.256
Token for
Token ID329
Feature activation-2.07e-01

Pos logit contributions
 rods+0.924
 songs+0.891
 habitats+0.878
 unity+0.866
 similarities+0.864
Neg logit contributions
 crying-0.947
 mirror-0.906
 wardrobe-0.829
 bleeding-0.828
 cabinet-0.811
Token her
Token ID607
Feature activation+1.94e-01

Pos logit contributions
 rods+0.438
 bond+0.417
 songs+0.413
 unity+0.412
 habitats+0.412
Neg logit contributions
 crying-0.287
 mirror-0.266
 ha-0.244
 tooth-0.243
 wardrobe-0.240
Token family
Token ID1641
Feature activation-1.44e-01

Pos logit contributions
 crying+0.302
 mirror+0.277
 wardrobe+0.255
 diary+0.254
 bleeding+0.251
Neg logit contributions
 rods-0.396
asted-0.384
 habitats-0.368
asts-0.367
 unity-0.364
Token and
Token ID290
Feature activation-1.86e-01

Pos logit contributions
 rods+0.315
 songs+0.308
 habitats+0.303
 unity+0.303
 similarities+0.303
Neg logit contributions
 crying-0.187
 mirror-0.173
 wardrobe-0.156
 drawer-0.154
 cabinet-0.153
Token friends
Token ID2460
Feature activation-1.28e-01

Pos logit contributions
 rods+0.382
Years+0.356
asting+0.355
 enjoyable+0.353
asts+0.350
Neg logit contributions
 crying-0.291
 mirror-0.277
 wardrobe-0.249
 diary-0.247
 ha-0.243
Token they
Token ID484
Feature activation-4.24e-01

Pos logit contributions
 rods+0.028
 songs+0.028
Years+0.028
 habitats+0.028
 bond+0.028
Neg logit contributions
 crying-0.042
 mirror-0.040
 wardrobe-0.039
 tooth-0.038
 drawer-0.038
Token could
Token ID714
Feature activation-3.45e-03

Pos logit contributions
 songs+0.895
 rods+0.873
 enjoyable+0.855
 unity+0.853
 bond+0.850
Neg logit contributions
 crying-0.607
 mirror-0.574
 wardrobe-0.521
 tooth-0.519
 stub-0.514
Token see
Token ID766
Feature activation-4.78e-03

Pos logit contributions
 rods+0.007
 songs+0.007
 habitats+0.007
 enjoyable+0.006
 facts+0.006
Neg logit contributions
 crying-0.005
 mirror-0.005
 tooth-0.005
 ha-0.005
 bleeding-0.004
Token cartoons
Token ID30070
Feature activation-3.29e-01

Pos logit contributions
 rods+0.008
 songs+0.008
 bond+0.008
Years+0.008
 similarities+0.008
Neg logit contributions
 crying-0.008
 mirror-0.008
 wardrobe-0.007
 tooth-0.007
 bleeding-0.007
Token or
Token ID393
Feature activation-6.65e-02

Pos logit contributions
 habitats+0.657
 similarities+0.651
 rods+0.648
 songs+0.641
 enjoyable+0.640
Neg logit contributions
 crying-0.496
 mirror-0.464
 cabinet-0.421
 bleeding-0.421
 wardrobe-0.421
Token songs
Token ID7259
Feature activation-5.33e-01

Pos logit contributions
 rods+0.100
Years+0.098
 bond+0.097
 songs+0.097
 enjoyable+0.096
Neg logit contributions
 crying-0.133
 mirror-0.126
 bleeding-0.117
 tooth-0.115
 wardrobe-0.115
Token or
Token ID393
Feature activation-6.66e-02

Pos logit contributions
 rods+1.183
 habitats+1.155
 similarities+1.150
 unity+1.139
 bond+1.137
Neg logit contributions
 crying-0.696
 mirror-0.648
 bleeding-0.575
 wardrobe-0.571
 cabinet-0.557
Token stories
Token ID3923
Feature activation-3.93e-01

Pos logit contributions
Years+0.101
 rods+0.100
asted+0.096
 bond+0.096
 enjoyable+0.094
Neg logit contributions
 crying-0.134
 mirror-0.127
 tooth-0.118
 bleeding-0.117
 ha-0.115
Token.
Token ID13
Feature activation+3.23e-02

Pos logit contributions
 rods+0.962
 similarities+0.933
asted+0.930
 unity+0.929
 bond+0.928
Neg logit contributions
 crying-0.403
 mirror-0.378
 cabinet-0.329
 wardrobe-0.328
 bleeding-0.319
Token But
Token ID887
Feature activation-1.92e-01

Pos logit contributions
 crying+0.048
 mirror+0.046
 cabinet+0.041
 drawer+0.041
 ha+0.041
Neg logit contributions
 songs-0.066
 unity-0.061
 facts-0.061
 habitats-0.061
 rods-0.061
Token today
Token ID1909
Feature activation-1.46e-01

Pos logit contributions
 songs+0.271
 habitats+0.254
Years+0.253
 facts+0.253
 rods+0.251
Neg logit contributions
 crying-0.412
 mirror-0.385
 ha-0.372
 bleeding-0.362
 wardrobe-0.361
Token and
Token ID290
Feature activation-1.82e-01

Pos logit contributions
 songs+0.482
 rods+0.481
 habitats+0.474
Years+0.473
 similarities+0.470
Neg logit contributions
 crying-0.371
 mirror-0.329
 bleeding-0.313
 wardrobe-0.299
 tooth-0.295
Token the
Token ID262
Feature activation-2.93e-02

Pos logit contributions
Years+0.370
 rods+0.360
 bond+0.359
addle+0.353
asted+0.352
Neg logit contributions
 crying-0.283
 mirror-0.265
 bleeding-0.233
 tooth-0.230
 wardrobe-0.229
Token cars
Token ID5006
Feature activation-2.08e-01

Pos logit contributions
 rods+0.051
 songs+0.049
asted+0.049
Years+0.048
 enjoyable+0.048
Neg logit contributions
 crying-0.051
 mirror-0.050
 wardrobe-0.045
 bleeding-0.045
 cabinet-0.045
Token and
Token ID290
Feature activation-1.81e-01

Pos logit contributions
 rods+0.468
 songs+0.460
 habitats+0.460
Years+0.452
 similarities+0.450
Neg logit contributions
 crying-0.275
 mirror-0.243
 bleeding-0.232
 stuck-0.210
 wardrobe-0.208
Token the
Token ID262
Feature activation-2.93e-02

Pos logit contributions
Years+0.395
 rods+0.382
addle+0.379
asted+0.379
 enjoyable+0.378
Neg logit contributions
 crying-0.260
 mirror-0.242
 wardrobe-0.215
 bleeding-0.211
 cabinet-0.204
Token songs
Token ID7259
Feature activation-5.32e-01

Pos logit contributions
 rods+0.049
 songs+0.048
asted+0.047
 enjoyable+0.047
Years+0.047
Neg logit contributions
 crying-0.053
 mirror-0.051
 wardrobe-0.047
 bleeding-0.046
 cabinet-0.046
Token.
Token ID13
Feature activation+3.61e-02

Pos logit contributions
 rods+1.255
 habitats+1.239
 songs+1.218
 similarities+1.208
 unity+1.206
Neg logit contributions
 crying-0.642
 mirror-0.578
 bleeding-0.533
 wardrobe-0.481
 tooth-0.478
Token He
Token ID679
Feature activation-6.72e-02

Pos logit contributions
 crying+0.067
 mirror+0.064
 wardrobe+0.060
 bleeding+0.060
 ha+0.059
Neg logit contributions
 songs-0.059
 rods-0.058
 habitats-0.055
 similarities-0.054
 facts-0.054
Token wants
Token ID3382
Feature activation-4.26e-02

Pos logit contributions
 songs+0.130
 rods+0.127
 unity+0.121
 habitats+0.121
 similarities+0.121
Neg logit contributions
 crying-0.104
 mirror-0.099
 bleeding-0.091
 wardrobe-0.091
 ha-0.090
Token to
Token ID284
Feature activation-3.88e-02

Pos logit contributions
 bond+0.085
 enjoyable+0.083
 unity+0.083
 songs+0.083
 rods+0.082
Neg logit contributions
 crying-0.065
 mirror-0.060
 tooth-0.054
 bleeding-0.052
 ha-0.052
Token go
Token ID467
Feature activation-1.09e-01

Pos logit contributions
 rods+0.072
 songs+0.068
 habitats+0.067
 facts+0.066
 enjoyable+0.066
Neg logit contributions
 crying-0.065
 mirror-0.064
 ha-0.059
 tooth-0.056
 wardrobe-0.056
Token.
Token ID13
Feature activation+3.78e-02

Pos logit contributions
 songs+0.148
 rods+0.148
 habitats+0.143
 similarities+0.142
 unity+0.142
Neg logit contributions
 crying-0.091
 mirror-0.085
 bleeding-0.077
 wardrobe-0.077
 tooth-0.075
Token She
Token ID1375
Feature activation+2.61e-01

Pos logit contributions
 crying+0.057
 bleeding+0.052
 mirror+0.051
 ha+0.050
 wardrobe+0.050
Neg logit contributions
 songs-0.074
 rods-0.071
 habitats-0.069
 similarities-0.069
 lesson-0.068
Token liked
Token ID8288
Feature activation-1.96e-01

Pos logit contributions
 crying+0.393
 stub+0.379
 mirror+0.374
 stuck+0.355
 tooth+0.350
Neg logit contributions
 facts-0.513
 pleasures-0.512
 rods-0.511
 enjoyable-0.509
 unity-0.509
Token to
Token ID284
Feature activation-3.61e-02

Pos logit contributions
 rods+0.417
 enjoyable+0.413
 bond+0.408
 unity+0.403
 pleasures+0.401
Neg logit contributions
 crying-0.287
 mirror-0.262
 bleeding-0.226
 wardrobe-0.225
 stub-0.221
Token play
Token ID711
Feature activation-1.69e-01

Pos logit contributions
 rods+0.068
 habitats+0.066
 songs+0.064
 facts+0.064
 enjoyable+0.063
Neg logit contributions
 crying-0.058
 mirror-0.057
 wardrobe-0.052
 ha-0.051
 tooth-0.051
Token songs
Token ID7259
Feature activation-5.32e-01

Pos logit contributions
 rods+0.317
 habitats+0.311
 songs+0.302
 unity+0.302
 facts+0.301
Neg logit contributions
 crying-0.277
 mirror-0.266
 wardrobe-0.243
 ha-0.241
 bleeding-0.239
Token for
Token ID329
Feature activation-2.04e-01

Pos logit contributions
 songs+0.834
 rods+0.822
 habitats+0.777
 unity+0.773
 similarities+0.771
Neg logit contributions
 crying-1.017
 mirror-0.979
 bleeding-0.910
 wardrobe-0.908
 ha-0.897
Token her
Token ID607
Feature activation+1.95e-01

Pos logit contributions
 rods+0.480
 bond+0.461
 unity+0.456
 songs+0.456
 habitats+0.455
Neg logit contributions
 crying-0.236
 mirror-0.216
 tooth-0.195
 ha-0.193
 bleeding-0.190
Token family
Token ID1641
Feature activation-1.42e-01

Pos logit contributions
 crying+0.339
 mirror+0.312
 wardrobe+0.300
 tooth+0.294
 ha+0.292
Neg logit contributions
asted-0.353
 rods-0.350
Years-0.338
 facts-0.332
asts-0.332
Token and
Token ID290
Feature activation-1.82e-01

Pos logit contributions
 rods+0.287
 songs+0.285
 similarities+0.279
 unity+0.278
 habitats+0.278
Neg logit contributions
 crying-0.206
 mirror-0.193
 wardrobe-0.177
 tooth-0.174
 drawer-0.174
Token friends
Token ID2460
Feature activation-1.27e-01

Pos logit contributions
 rods+0.362
Years+0.344
 enjoyable+0.341
 habitats+0.341
asting+0.333
Neg logit contributions
 crying-0.290
 mirror-0.273
 diary-0.248
 wardrobe-0.246
 tooth-0.245
Token they
Token ID484
Feature activation-4.23e-01

Pos logit contributions
Years+0.227
asted+0.224
 habitats+0.217
 rods+0.216
 bond+0.216
Neg logit contributions
 crying-0.368
 mirror-0.344
 ha-0.334
 wardrobe-0.329
 diary-0.323
Token sing
Token ID1702
Feature activation-1.58e-02

Pos logit contributions
 rods+0.794
 songs+0.781
 bond+0.743
 unity+0.740
 lesson+0.733
Neg logit contributions
 crying-0.708
 mirror-0.682
 wardrobe-0.615
 ha-0.608
 locked-0.596
Token along
Token ID1863
Feature activation-1.35e-01

Pos logit contributions
 rods+0.029
asted+0.028
 bond+0.028
 unity+0.027
 habitats+0.027
Neg logit contributions
 crying-0.026
 mirror-0.024
 diary-0.022
 Rose-0.022
 bleeding-0.022
Token with
Token ID351
Feature activation-7.09e-02

Pos logit contributions
 rods+0.276
asted+0.269
 bond+0.265
 similarities+0.263
 enjoyable+0.262
Neg logit contributions
 crying-0.195
 mirror-0.174
 cabinet-0.158
 ha-0.156
 wardrobe-0.155
Token the
Token ID262
Feature activation-2.72e-02

Pos logit contributions
 rods+0.107
 songs+0.105
Years+0.104
 bond+0.103
 enjoyable+0.103
Neg logit contributions
 crying-0.140
 mirror-0.131
 bleeding-0.123
 tooth-0.120
 wardrobe-0.120
Token songs
Token ID7259
Feature activation-5.32e-01

Pos logit contributions
 songs+0.045
 rods+0.044
 habitats+0.041
 unity+0.041
 similarities+0.041
Neg logit contributions
 crying-0.050
 mirror-0.048
 bleeding-0.045
 wardrobe-0.044
 ha-0.044
Token or
Token ID393
Feature activation-6.00e-02

Pos logit contributions
 rods+1.080
 habitats+1.068
 similarities+1.051
asted+1.040
Years+1.031
Neg logit contributions
 crying-0.799
 mirror-0.761
 bleeding-0.670
 wardrobe-0.666
 drawer-0.654
Token cl
Token ID537
Feature activation-6.47e-02

Pos logit contributions
 rods+0.105
Years+0.103
 songs+0.099
asted+0.099
 bond+0.099
Neg logit contributions
 crying-0.107
 mirror-0.100
 bleeding-0.092
 tooth-0.090
 wardrobe-0.089
Tokenap
Token ID499
Feature activation-1.32e-01

Pos logit contributions
 songs+0.090
 rods+0.087
 habitats+0.082
 similarities+0.081
 unity+0.081
Neg logit contributions
 crying-0.136
 mirror-0.131
 bleeding-0.124
 wardrobe-0.123
 ha-0.122
Token their
Token ID511
Feature activation-4.10e-01

Pos logit contributions
Years+0.229
 rods+0.229
asted+0.223
 habitats+0.220
 unity+0.220
Neg logit contributions
 crying-0.223
 mirror-0.218
 bleeding-0.196
 wardrobe-0.191
 tooth-0.188
Token hands
Token ID2832
Feature activation+1.56e-02

Pos logit contributions
 rods+0.829
 songs+0.823
Years+0.801
 enjoyable+0.796
 unity+0.793
Neg logit contributions
 crying-0.588
 mirror-0.554
 bleeding-0.505
 tooth-0.495
 wardrobe-0.494
Token was
Token ID373
Feature activation-1.05e-03

Pos logit contributions
 rods+0.126
 songs+0.120
 enjoyable+0.118
 unity+0.118
 facts+0.117
Neg logit contributions
 crying-0.123
 mirror-0.118
 stub-0.111
 wardrobe-0.109
 tooth-0.108
Token so
Token ID523
Feature activation+6.36e-02

Pos logit contributions
 songs+0.002
 rods+0.002
 enjoyable+0.002
 unity+0.002
 habitats+0.002
Neg logit contributions
 crying-0.002
 mirror-0.002
 wardrobe-0.001
 bleeding-0.001
 cabinet-0.001
Token proud
Token ID6613
Feature activation-1.06e-01

Pos logit contributions
 crying+0.107
 mirror+0.101
 bleeding+0.092
 ha+0.090
 stub+0.089
Neg logit contributions
 rods-0.121
 songs-0.115
 habitats-0.115
 bond-0.112
 facts-0.111
Token of
Token ID286
Feature activation-2.04e-01

Pos logit contributions
 similarities+0.197
asted+0.193
 bond+0.191
 unity+0.190
 rods+0.189
Neg logit contributions
 crying-0.183
 mirror-0.172
 tooth-0.151
 wardrobe-0.148
 spider-0.148
Token the
Token ID262
Feature activation-3.04e-02

Pos logit contributions
 rods+0.392
 songs+0.380
 bond+0.378
 habitats+0.375
 similarities+0.365
Neg logit contributions
 crying-0.336
 mirror-0.322
 bleeding-0.285
 wardrobe-0.281
 ha-0.281
Token songs
Token ID7259
Feature activation-5.32e-01

Pos logit contributions
 rods+0.043
 songs+0.041
 habitats+0.040
 similarities+0.039
 unity+0.039
Neg logit contributions
 crying-0.063
 mirror-0.061
 bleeding-0.057
 wardrobe-0.056
 tooth-0.056
Token he
Token ID339
Feature activation-1.57e-01

Pos logit contributions
 rods+1.016
 habitats+1.013
 similarities+1.002
 unity+0.962
asted+0.958
Neg logit contributions
 crying-0.894
 mirror-0.833
 bleeding-0.761
 wardrobe-0.744
 tooth-0.733
Token had
Token ID550
Feature activation-2.97e-02

Pos logit contributions
 rods+0.292
 songs+0.289
 habitats+0.275
 unity+0.274
 similarities+0.273
Neg logit contributions
 crying-0.256
 mirror-0.243
 bleeding-0.224
 wardrobe-0.222
 stub-0.222
Token made
Token ID925
Feature activation-4.05e-02

Pos logit contributions
 rods+0.055
 songs+0.055
 similarities+0.052
 unity+0.051
 bond+0.051
Neg logit contributions
 crying-0.048
 mirror-0.046
 ha-0.042
 bleeding-0.042
 stub-0.042
Token.
Token ID13
Feature activation+4.22e-02

Pos logit contributions
 rods+0.082
 habitats+0.081
 similarities+0.080
 bond+0.079
 unity+0.078
Neg logit contributions
 crying-0.060
 mirror-0.055
 bleeding-0.049
 tooth-0.049
 wardrobe-0.048
Token\n
Token ID198
Feature activation-1.69e-02

Pos logit contributions
 crying+0.055
 mirror+0.053
 cabinet+0.047
 drawer+0.046
 wardrobe+0.046
Neg logit contributions
 songs-0.093
 rods-0.089
 unity-0.087
 similarities-0.086
 facts-0.086
Token and
Token ID290
Feature activation-1.83e-01

Pos logit contributions
 habitats+0.475
 rods+0.474
Years+0.468
 bond+0.461
 lesson+0.460
Neg logit contributions
 crying-0.366
 mirror-0.339
 wardrobe-0.305
 cabinet-0.303
 pant-0.299
Token the
Token ID262
Feature activation-2.76e-02

Pos logit contributions
 rods+0.378
Years+0.367
 enjoyable+0.361
 bond+0.357
addle+0.346
Neg logit contributions
 crying-0.279
 mirror-0.273
 wardrobe-0.232
 tooth-0.222
 weeping-0.217
Token colors
Token ID7577
Feature activation-2.23e-01

Pos logit contributions
 rods+0.042
asted+0.042
 enjoyable+0.041
Years+0.041
 unity+0.040
Neg logit contributions
 crying-0.052
 mirror-0.052
 wardrobe-0.048
 cabinet-0.047
 tooth-0.046
Token and
Token ID290
Feature activation-1.83e-01

Pos logit contributions
 rods+0.538
 habitats+0.531
 unity+0.517
 bond+0.511
Years+0.511
Neg logit contributions
 crying-0.247
 mirror-0.246
 cabinet-0.202
 wardrobe-0.200
 drawer-0.193
Token the
Token ID262
Feature activation-2.78e-02

Pos logit contributions
Years+0.409
 rods+0.400
addle+0.385
Fin+0.384
 bond+0.384
Neg logit contributions
 crying-0.239
 mirror-0.236
 wardrobe-0.198
 tooth-0.189
 cabinet-0.187
Token songs
Token ID7259
Feature activation-5.32e-01

Pos logit contributions
 rods+0.043
Years+0.042
asted+0.042
 songs+0.041
 enjoyable+0.041
Neg logit contributions
 crying-0.052
 mirror-0.052
 wardrobe-0.048
 cabinet-0.046
 bleeding-0.046
Token.
Token ID13
Feature activation+3.50e-02

Pos logit contributions
 rods+1.268
 habitats+1.228
 unity+1.184
 similarities+1.181
 enjoyable+1.181
Neg logit contributions
 crying-0.633
 mirror-0.611
 wardrobe-0.520
 cabinet-0.509
 bleeding-0.506
Token She
Token ID1375
Feature activation+2.60e-01

Pos logit contributions
 crying+0.059
 bleeding+0.054
 ha+0.054
 mirror+0.053
 tooth+0.053
Neg logit contributions
 songs-0.062
 rods-0.059
 similarities-0.058
 habitats-0.057
 enjoyable-0.056
Token had
Token ID550
Feature activation-2.24e-02

Pos logit contributions
 crying+0.408
 mirror+0.393
 stub+0.368
 tooth+0.360
 wardrobe+0.359
Neg logit contributions
 songs-0.501
 rods-0.498
 unity-0.496
 pleasures-0.487
 enjoyable-0.484
Token a
Token ID257
Feature activation+1.11e-01

Pos logit contributions
 songs+0.047
 rods+0.047
 unity+0.045
 habitats+0.045
 similarities+0.045
Neg logit contributions
 crying-0.030
 mirror-0.029
 wardrobe-0.026
 bleeding-0.026
 ha-0.025
Token big
Token ID1263
Feature activation+2.49e-02

Pos logit contributions
 crying+0.183
 mirror+0.173
 bleeding+0.165
 ha+0.160
 wardrobe+0.159
Neg logit contributions
 songs-0.203
 rods-0.195
 bond-0.186
 unity-0.185
 similarities-0.184
Token\n
Token ID198
Feature activation+2.81e-02

Pos logit contributions
 songs+0.059
 rods+0.056
 unity+0.053
 similarities+0.053
 habitats+0.052
Neg logit contributions
 crying-0.082
 mirror-0.079
 bleeding-0.076
 wardrobe-0.076
 tooth-0.074
Token\n
Token ID198
Feature activation+4.69e-02

Pos logit contributions
 crying+0.048
 mirror+0.046
 bleeding+0.042
 wardrobe+0.042
 ha+0.042
Neg logit contributions
 songs-0.050
 rods-0.049
 habitats-0.046
 unity-0.046
 similarities-0.046
TokenFin
Token ID18467
Feature activation-3.55e-01

Pos logit contributions
 crying+0.082
 mirror+0.079
 ha+0.073
 bleeding+0.073
 wardrobe+0.073
Neg logit contributions
 songs-0.081
 rods-0.079
 facts-0.075
 habitats-0.075
 unity-0.074
Token sw
Token ID1509
Feature activation-1.70e-01

Pos logit contributions
 songs+0.721
 rods+0.694
 similarities+0.679
 facts+0.676
 enjoyable+0.672
Neg logit contributions
 crying-0.517
 mirror-0.478
 ha-0.459
 tooth-0.444
 stuck-0.441
Tokenam
Token ID321
Feature activation-7.64e-02

Pos logit contributions
 songs+0.270
 enjoyable+0.249
 rods+0.249
 facts+0.248
 bond+0.247
Neg logit contributions
 crying-0.314
 mirror-0.304
 ha-0.284
 wardrobe-0.284
 bleeding-0.281
Token back
Token ID736
Feature activation+2.65e-01

Pos logit contributions
 songs+0.177
 rods+0.173
 similarities+0.171
 facts+0.171
 habitats+0.170
Neg logit contributions
 crying-0.090
 mirror-0.081
 bleeding-0.072
 ha-0.071
 stub-0.071
Token up
Token ID510
Feature activation+2.01e-01

Pos logit contributions
 crying+0.399
 mirror+0.364
 wardrobe+0.336
 bleeding+0.326
 ha+0.322
Neg logit contributions
 songs-0.544
 similarities-0.527
 rods-0.524
 facts-0.517
asted-0.514
Token to
Token ID284
Feature activation+4.66e-03

Pos logit contributions
 crying+0.299
 mirror+0.260
 stuck+0.240
 stub+0.239
 bleeding+0.237
Neg logit contributions
 songs-0.418
 similarities-0.410
 facts-0.401
 rods-0.398
 enjoyable-0.394
Token Bub
Token ID19645
Feature activation-1.32e-01

Pos logit contributions
 crying+0.007
 mirror+0.007
 ha+0.006
 wardrobe+0.006
 tooth+0.006
Neg logit contributions
 songs-0.009
 rods-0.009
 enjoyable-0.009
 facts-0.009
 similarities-0.009
Tokenbles
Token ID7689
Feature activation-6.24e-03

Pos logit contributions
 rods+0.189
 songs+0.178
 habitats+0.165
 islands+0.165
addle+0.162
Neg logit contributions
 mirror-0.266
 crying-0.265
 bleeding-0.245
 cabinet-0.244
 diary-0.241
Token and
Token ID290
Feature activation-1.30e-01

Pos logit contributions
 songs+0.011
 rods+0.011
 habitats+0.011
 similarities+0.011
 unity+0.011
Neg logit contributions
 crying-0.011
 mirror-0.010
 bleeding-0.009
 wardrobe-0.009
 ha-0.009
Token.
Token ID13
Feature activation+4.21e-02

Pos logit contributions
 songs+0.247
 rods+0.235
 games+0.223
 lesson+0.223
 bond+0.219
Neg logit contributions
 crying-0.221
 mirror-0.221
 ha-0.205
 wardrobe-0.201
 bleeding-0.195
Token She
Token ID1375
Feature activation+2.66e-01

Pos logit contributions
 crying+0.049
 wardrobe+0.045
 bleeding+0.045
 mirror+0.045
 stub+0.044
Neg logit contributions
 songs-0.092
 rods-0.091
 habitats-0.089
 games-0.087
 facts-0.087
Token was
Token ID373
Feature activation+8.75e-03

Pos logit contributions
 crying+0.417
 mirror+0.408
used+0.377
 wardrobe+0.366
 cabinet+0.362
Neg logit contributions
 songs-0.482
 rods-0.474
 games-0.443
 habitats-0.436
asted-0.431
Token sad
Token ID6507
Feature activation+2.01e-01

Pos logit contributions
 mirror+0.012
 Girl+0.012
used+0.012
oh+0.012
 ha+0.012
Neg logit contributions
 songs-0.016
 often-0.015
 theirs-0.015
 games-0.015
 their-0.015
Token.
Token ID13
Feature activation+4.14e-02

Pos logit contributions
 crying+0.310
 mirror+0.282
 bleeding+0.263
 wardrobe+0.251
 tooth+0.248
Neg logit contributions
 rods-0.417
 songs-0.410
 unity-0.406
 similarities-0.406
 habitats-0.403
Token Her
Token ID2332
Feature activation+2.36e-01

Pos logit contributions
 crying+0.048
 wardrobe+0.045
 tooth+0.045
 mirror+0.044
 bleeding+0.043
Neg logit contributions
 songs-0.092
 habitats-0.089
 rods-0.089
 facts-0.087
 games-0.086
Token hand
Token ID1021
Feature activation+2.74e-01

Pos logit contributions
 crying+0.338
 mirror+0.320
 wardrobe+0.292
 bleeding+0.290
 ha+0.288
Neg logit contributions
 songs-0.486
 rods-0.473
 similarities-0.448
 habitats-0.448
 unity-0.446
Token was
Token ID373
Feature activation+8.01e-03

Pos logit contributions
 mirror+0.439
 crying+0.433
 wardrobe+0.403
 ha+0.396
 cabinet+0.392
Neg logit contributions
 songs-0.495
 rods-0.470
 lesson-0.441
 games-0.437
 similarities-0.433
Token cold
Token ID4692
Feature activation+5.40e-02

Pos logit contributions
 ha+0.010
used+0.010
 mirror+0.010
 Girl+0.010
oh+0.009
Neg logit contributions
 songs-0.017
 theirs-0.016
 enjoyable-0.015
 lesson-0.015
lights-0.015
Token and
Token ID290
Feature activation-1.75e-01

Pos logit contributions
 crying+0.111
 mirror+0.107
 bleeding+0.100
 wardrobe+0.100
 ha+0.099
Neg logit contributions
 songs-0.078
 rods-0.075
 habitats-0.070
 unity-0.070
 similarities-0.070
Token wet
Token ID9583
Feature activation+1.41e-01

Pos logit contributions
 songs+0.349
 they+0.328
 enjoyable+0.322
 sizes+0.320
 lesson+0.319
Neg logit contributions
 mirror-0.231
used-0.221
 ha-0.220
 crying-0.219
 wardrobe-0.208
Token.
Token ID13
Feature activation+3.70e-02

Pos logit contributions
 crying+0.089
 mirror+0.088
 ha+0.084
 stub+0.080
 wardrobe+0.080
Neg logit contributions
 songs-0.102
 rods-0.096
 games-0.094
 enjoyable-0.093
 lesson-0.092
Token\n
Token ID198
Feature activation-1.48e-02

Pos logit contributions
 crying+0.040
 bleeding+0.038
 mirror+0.037
 tooth+0.036
 ha+0.035
Neg logit contributions
 songs-0.085
 rods-0.083
 habitats-0.083
 lesson-0.081
 enjoyable-0.080
Token\n
Token ID198
Feature activation-1.46e-02

Pos logit contributions
 songs+0.025
 rods+0.025
 lesson+0.024
Years+0.023
 similarities+0.023
Neg logit contributions
 crying-0.026
 mirror-0.025
 bleeding-0.023
 tooth-0.023
 wardrobe-0.023
Token"
Token ID1
Feature activation-2.88e-03

Pos logit contributions
 songs+0.026
 rods+0.026
 habitats+0.024
 unity+0.024
 similarities+0.024
Neg logit contributions
 crying-0.025
 mirror-0.024
 bleeding-0.022
 wardrobe-0.022
 ha-0.021
TokenIt
Token ID1026
Feature activation+9.14e-02

Pos logit contributions
 songs+0.005
 rods+0.004
asted+0.004
 unity+0.004
 similarities+0.004
Neg logit contributions
 crying-0.005
 mirror-0.005
 wardrobe-0.005
 bleeding-0.005
 cabinet-0.005
Token's
Token ID338
Feature activation+2.69e-01

Pos logit contributions
 crying+0.115
 mirror+0.114
 stub+0.109
 ha+0.107
aila+0.106
Neg logit contributions
 songs-0.190
 lesson-0.183
 sizes-0.181
 enjoyable-0.175
 teamwork-0.175
Token so
Token ID523
Feature activation+6.56e-02

Pos logit contributions
 crying+0.429
 mirror+0.407
 ha+0.378
 stub+0.372
 bleeding+0.369
Neg logit contributions
 songs-0.520
 enjoyable-0.489
 rods-0.489
 unity-0.477
asted-0.474
Token big
Token ID1263
Feature activation+2.56e-02

Pos logit contributions
 crying+0.118
 mirror+0.111
 bleeding+0.106
 ha+0.103
 stub+0.103
Neg logit contributions
 songs-0.112
 enjoyable-0.106
 rods-0.104
 similarities-0.103
 lesson-0.103
Token and
Token ID290
Feature activation-1.76e-01

Pos logit contributions
 crying+0.040
 mirror+0.038
 wardrobe+0.035
 bleeding+0.034
 tooth+0.034
Neg logit contributions
 songs-0.049
 rods-0.049
 similarities-0.048
 habitats-0.047
asted-0.047
Token shiny
Token ID22441
Feature activation-5.23e-02

Pos logit contributions
 songs+0.310
 enjoyable+0.289
 rods+0.288
 unity+0.283
 games+0.280
Neg logit contributions
 crying-0.304
 mirror-0.290
 stub-0.268
 ha-0.268
 bleeding-0.267
Token!"
Token ID2474
Feature activation+1.43e-03

Pos logit contributions
 songs+0.103
 rods+0.100
 unity+0.096
 habitats+0.096
 facts+0.095
Neg logit contributions
 crying-0.079
 mirror-0.075
 bleeding-0.069
 wardrobe-0.069
 ha-0.068
Token Sarah
Token ID10490
Feature activation+1.35e-01

Pos logit contributions
 crying+0.101
 mirror+0.100
 tooth+0.099
 ha+0.097
 bleeding+0.095
Neg logit contributions
 songs-0.123
 rods-0.118
 habitats-0.114
 games-0.114
 sizes-0.113
Token was
Token ID373
Feature activation+3.19e-02

Pos logit contributions
 mirror+0.226
 crying+0.213
 wardrobe+0.212
 diary+0.210
 drawer+0.210
Neg logit contributions
 songs-0.215
 rods-0.207
 they-0.204
 often-0.198
 habitats-0.196
Token crying
Token ID13774
Feature activation+4.44e-01

Pos logit contributions
 wardrobe+0.048
 diary+0.048
 cabinet+0.048
 mirror+0.048
 drawer+0.047
Neg logit contributions
 songs-0.054
 often-0.050
 games-0.049
 their-0.049
 enjoyable-0.048
Token because
Token ID780
Feature activation-1.16e-01

Pos logit contributions
 crying+0.902
 mirror+0.869
 bleeding+0.815
 wardrobe+0.812
 ha+0.802
Neg logit contributions
 songs-0.650
 rods-0.634
 habitats-0.596
 unity-0.594
 similarities-0.593
Token she
Token ID673
Feature activation+1.87e-01

Pos logit contributions
 songs+0.251
 rods+0.248
 habitats+0.238
 similarities+0.236
 unity+0.236
Neg logit contributions
 crying-0.156
 mirror-0.147
 bleeding-0.132
 wardrobe-0.132
 ha-0.128
Token lost
Token ID2626
Feature activation+3.32e-01

Pos logit contributions
 mirror+0.344
used+0.327
 wardrobe+0.324
 crying+0.324
 ha+0.322
Neg logit contributions
 songs-0.274
 habitats-0.252
 often-0.248
Years-0.242
 rods-0.241
Token her
Token ID607
Feature activation+2.21e-01

Pos logit contributions
 crying+0.476
used+0.462
 mirror+0.455
 stub+0.451
 ha+0.451
Neg logit contributions
 songs-0.669
 rods-0.617
 games-0.615
 theirs-0.584
 unity-0.577
Token favorite
Token ID4004
Feature activation-1.81e-01

Pos logit contributions
 crying+0.369
 mirror+0.350
 wardrobe+0.324
 bleeding+0.324
 tooth+0.322
Neg logit contributions
 songs-0.402
 rods-0.395
 similarities-0.381
 unity-0.380
 habitats-0.380
Token doll
Token ID3654
Feature activation+2.36e-01

Pos logit contributions
 songs+0.357
 rods+0.351
 habitats+0.337
 unity+0.337
 similarities+0.337
Neg logit contributions
 crying-0.272
 mirror-0.260
 wardrobe-0.236
 bleeding-0.235
 tooth-0.233
Token.
Token ID13
Feature activation+6.40e-02

Pos logit contributions
 mirror+0.384
 crying+0.379
 ha+0.346
 stub+0.339
used+0.338
Neg logit contributions
 songs-0.432
 rods-0.400
 games-0.393
 bond-0.385
 habitats-0.377
Token Tim
Token ID5045
Feature activation-1.08e-01

Pos logit contributions
 mirror+0.101
 tooth+0.100
 crying+0.100
 bleeding+0.096
 ha+0.095
Neg logit contributions
 songs-0.113
 rods-0.108
 habitats-0.105
 games-0.105
 enjoyable-0.104
Token wet
Token ID9583
Feature activation+1.48e-01

Pos logit contributions
 Girl+0.032
 cabinet+0.031
 drawer+0.031
 wardrobe+0.031
 mirror+0.031
Neg logit contributions
 songs-0.029
 their-0.028
 enjoyable-0.027
 theirs-0.027
asting-0.026
Token and
Token ID290
Feature activation-1.66e-01

Pos logit contributions
 crying+0.322
 mirror+0.317
 ha+0.299
 wardrobe+0.298
 stub+0.297
Neg logit contributions
 songs-0.188
 rods-0.175
 habitats-0.162
 games-0.161
 enjoyable-0.159
Token cold
Token ID4692
Feature activation+6.15e-02

Pos logit contributions
 songs+0.384
 rods+0.374
 enjoyable+0.362
 similarities+0.360
 unity+0.360
Neg logit contributions
 crying-0.186
 mirror-0.182
 wardrobe-0.161
 ha-0.160
 tooth-0.155
Token and
Token ID290
Feature activation-1.67e-01

Pos logit contributions
 crying+0.109
 mirror+0.105
 wardrobe+0.097
 bleeding+0.097
 ha+0.096
Neg logit contributions
 songs-0.105
 rods-0.102
 habitats-0.096
 unity-0.096
 similarities-0.095
Token still
Token ID991
Feature activation+7.27e-02

Pos logit contributions
 songs+0.373
 rods+0.366
 habitats+0.351
 unity+0.351
 similarities+0.350
Neg logit contributions
 crying-0.209
 mirror-0.198
 bleeding-0.177
 wardrobe-0.176
 ha-0.173
Token trembling
Token ID39228
Feature activation+3.40e-01

Pos logit contributions
 crying+0.092
 mirror+0.092
 wardrobe+0.083
 ha+0.083
 cabinet+0.079
Neg logit contributions
 songs-0.157
 rods-0.151
 enjoyable-0.146
 habitats-0.146
 unity-0.145
Token.
Token ID13
Feature activation+5.02e-02

Pos logit contributions
 crying+0.464
 mirror+0.440
 bleeding+0.398
 wardrobe+0.397
 ha+0.391
Neg logit contributions
 songs-0.716
 rods-0.701
 habitats-0.672
 unity-0.671
 similarities-0.670
Token He
Token ID679
Feature activation-5.16e-02

Pos logit contributions
 crying+0.095
 mirror+0.092
 wardrobe+0.088
 ha+0.087
 bleeding+0.086
Neg logit contributions
 songs-0.077
 rods-0.076
 habitats-0.071
 similarities-0.071
 enjoyable-0.070
Token said
Token ID531
Feature activation+4.86e-02

Pos logit contributions
 songs+0.105
 rods+0.103
 unity+0.099
 similarities+0.098
 habitats+0.098
Neg logit contributions
 crying-0.076
 mirror-0.071
 bleeding-0.066
 tooth-0.065
 wardrobe-0.065
Token,
Token ID11
Feature activation+1.27e-02

Pos logit contributions
 crying+0.079
 mirror+0.077
 wardrobe+0.072
 bleeding+0.071
 drawer+0.070
Neg logit contributions
 songs-0.089
 rods-0.086
 similarities-0.081
 lesson-0.081
 unity-0.081
Token "
Token ID366
Feature activation+1.60e-02

Pos logit contributions
 crying+0.021
 mirror+0.020
 wardrobe+0.019
 ha+0.019
 tooth+0.019
Neg logit contributions
 songs-0.022
 rods-0.022
 similarities-0.020
 unity-0.020
 habitats-0.020
Token,
Token ID11
Feature activation+5.73e-03

Pos logit contributions
 songs+0.610
 rods+0.590
 habitats+0.586
 unity+0.581
 similarities+0.581
Neg logit contributions
 crying-0.390
 mirror-0.371
 ha-0.347
 wardrobe-0.329
 bleeding-0.328
Token they
Token ID484
Feature activation-4.17e-01

Pos logit contributions
 crying+0.010
 mirror+0.010
 ha+0.010
 tooth+0.009
 bleeding+0.009
Neg logit contributions
 habitats-0.009
 bond-0.009
 similarities-0.009
 unity-0.009
 songs-0.009
Token see
Token ID766
Feature activation+4.68e-03

Pos logit contributions
 songs+0.727
 rods+0.720
 habitats+0.680
asted+0.676
 unity+0.674
Neg logit contributions
 crying-0.702
 mirror-0.661
 bleeding-0.644
 tooth-0.615
 wardrobe-0.607
Token a
Token ID257
Feature activation+1.18e-01

Pos logit contributions
 crying+0.006
 mirror+0.006
 wardrobe+0.005
 stub+0.005
 bleeding+0.005
Neg logit contributions
 songs-0.010
 rods-0.010
 unity-0.010
 facts-0.010
 games-0.010
Token thin
Token ID7888
Feature activation+7.88e-02

Pos logit contributions
 crying+0.192
used+0.183
 mirror+0.178
 bleeding+0.173
 sob+0.171
Neg logit contributions
 songs-0.208
 rods-0.205
 lesson-0.194
 bond-0.193
 games-0.191
Token cat
Token ID3797
Feature activation+1.60e-01

Pos logit contributions
 crying+0.117
 mirror+0.109
 bleeding+0.103
 stub+0.100
 wardrobe+0.100
Neg logit contributions
 songs-0.155
 rods-0.151
 bond-0.144
 unity-0.142
 lesson-0.141
Token in
Token ID287
Feature activation+3.65e-02

Pos logit contributions
 mirror+0.268
aila+0.260
 Girl+0.256
 tooth+0.254
 ha+0.248
Neg logit contributions
 songs-0.252
 games-0.240
 rods-0.237
 they-0.233
asted-0.222
Token the
Token ID262
Feature activation-2.16e-02

Pos logit contributions
 crying+0.064
 mirror+0.063
 stub+0.060
 wardrobe+0.059
 bleeding+0.058
Neg logit contributions
 songs-0.063
 rods-0.061
 unity-0.057
 games-0.056
asted-0.055
Token park
Token ID3952
Feature activation+1.05e-02

Pos logit contributions
 songs+0.029
 rods+0.028
 games+0.027
 bond+0.027
 habitats+0.025
Neg logit contributions
used-0.048
 crying-0.044
 sob-0.042
ainted-0.041
issa-0.041
Token.
Token ID13
Feature activation+3.78e-02

Pos logit contributions
 crying+0.017
 mirror+0.017
 tooth+0.016
 wardrobe+0.016
 bleeding+0.015
Neg logit contributions
 songs-0.019
 rods-0.019
 habitats-0.018
 games-0.017
 facts-0.017
Token The
Token ID383
Feature activation+1.48e-02

Pos logit contributions
 crying+0.061
 mirror+0.058
 wardrobe+0.055
 bleeding+0.054
 ha+0.053
Neg logit contributions
 songs-0.070
 rods-0.069
 habitats-0.065
 facts-0.065
 similarities-0.065
Token"
Token ID1
Feature activation+1.27e-02

Pos logit contributions
 crying+0.003
 mirror+0.003
 ha+0.002
 bleeding+0.002
 drawer+0.002
Neg logit contributions
 songs-0.004
 habitats-0.004
 facts-0.004
 unity-0.004
 bond-0.004
TokenOK
Token ID11380
Feature activation-2.64e-01

Pos logit contributions
 mirror+0.016
 crying+0.015
 ha+0.015
 bleeding+0.014
 stub+0.014
Neg logit contributions
 songs-0.026
 habitats-0.026
asts-0.026
 enjoyable-0.025
 facts-0.025
Token,
Token ID11
Feature activation+1.24e-02

Pos logit contributions
 songs+0.461
 rods+0.447
 enjoyable+0.417
 games+0.417
 unity+0.415
Neg logit contributions
 crying-0.461
 mirror-0.439
 bleeding-0.419
 stub-0.414
 wardrobe-0.404
Token Ben
Token ID3932
Feature activation+1.13e-01

Pos logit contributions
 crying+0.016
 mirror+0.015
 ha+0.013
 wardrobe+0.013
 bleeding+0.013
Neg logit contributions
 songs-0.027
 rods-0.027
 habitats-0.027
 facts-0.026
 unity-0.026
Token.
Token ID13
Feature activation+4.76e-02

Pos logit contributions
 crying+0.187
 mirror+0.181
 wardrobe+0.166
 bleeding+0.166
 stub+0.163
Neg logit contributions
 songs-0.209
 rods-0.204
 enjoyable-0.192
 facts-0.192
 unity-0.191
Token That
Token ID1320
Feature activation+2.15e-01

Pos logit contributions
 crying+0.079
 mirror+0.076
 bleeding+0.070
 wardrobe+0.070
 tooth+0.068
Neg logit contributions
 songs-0.087
 rods-0.085
 habitats-0.081
 unity-0.081
 similarities-0.081
Token's
Token ID338
Feature activation+2.78e-01

Pos logit contributions
 crying+0.286
 mirror+0.274
 wardrobe+0.244
 bleeding+0.243
 tooth+0.240
Neg logit contributions
 songs-0.462
 rods-0.450
 habitats-0.440
 similarities-0.434
 unity-0.432
Token a
Token ID257
Feature activation+1.22e-01

Pos logit contributions
 crying+0.418
 mirror+0.394
 stub+0.370
 wardrobe+0.360
 bleeding+0.350
Neg logit contributions
 songs-0.565
 rods-0.550
 unity-0.529
 enjoyable-0.528
 similarities-0.523
Token good
Token ID922
Feature activation-1.97e-01

Pos logit contributions
 crying+0.212
 mirror+0.210
 ha+0.188
 tooth+0.187
 wardrobe+0.186
Neg logit contributions
 songs-0.213
 habitats-0.210
 rods-0.208
 facts-0.203
 similarities-0.198
Token one
Token ID530
Feature activation+4.78e-02

Pos logit contributions
 songs+0.328
 habitats+0.323
 rods+0.319
 similarities+0.311
 facts+0.308
Neg logit contributions
 crying-0.356
 mirror-0.355
 tooth-0.318
 cabinet-0.317
 drawer-0.316
Token.
Token ID13
Feature activation+4.71e-02

Pos logit contributions
 crying+0.063
 mirror+0.059
 bleeding+0.053
 ha+0.053
 tooth+0.053
Neg logit contributions
 songs-0.103
 rods-0.101
 habitats-0.099
 similarities-0.098
 unity-0.097
Token the
Token ID262
Feature activation+2.22e-02

Pos logit contributions
 mirror+0.079
 crying+0.078
 wardrobe+0.075
 cabinet+0.074
 diary+0.073
Neg logit contributions
 songs-0.099
 enjoyable-0.095
 unity-0.092
 rods-0.091
asted-0.091
Token Easter
Token ID20111
Feature activation-5.93e-02

Pos logit contributions
 crying+0.034
 mirror+0.031
 bleeding+0.030
 sob+0.029
 stub+0.029
Neg logit contributions
 songs-0.044
 rods-0.041
 unity-0.040
 bond-0.040
 lesson-0.040
Token Bunny
Token ID28683
Feature activation+9.73e-02

Pos logit contributions
 songs+0.122
 rods+0.117
 unity+0.113
 habitats+0.113
 bond+0.112
Neg logit contributions
 crying-0.085
 mirror-0.080
 bleeding-0.074
 ha-0.072
 stub-0.072
Token and
Token ID290
Feature activation-1.26e-01

Pos logit contributions
 crying+0.146
 mirror+0.140
 bleeding+0.127
 wardrobe+0.127
 ha+0.125
Neg logit contributions
 songs-0.191
 rods-0.187
 habitats-0.178
 unity-0.178
 similarities-0.177
Token now
Token ID783
Feature activation-3.40e-02

Pos logit contributions
 songs+0.297
 rods+0.296
 habitats+0.284
 similarities+0.283
 unity+0.282
Neg logit contributions
 crying-0.142
 mirror-0.133
 wardrobe-0.117
 bleeding-0.117
 ha-0.115
Token you
Token ID345
Feature activation+1.00e-01

Pos logit contributions
 songs+0.084
 rods+0.082
 habitats+0.079
 unity+0.079
 similarities+0.079
Neg logit contributions
 crying-0.035
 mirror-0.032
 bleeding-0.028
 wardrobe-0.028
 ha-0.027
Token know
Token ID760
Feature activation-4.93e-02

Pos logit contributions
 crying+0.170
 mirror+0.163
 wardrobe+0.150
 bleeding+0.150
 ha+0.149
Neg logit contributions
 songs-0.179
 rods-0.176
 unity-0.165
 similarities-0.165
 habitats-0.165
Token where
Token ID810
Feature activation+1.98e-02

Pos logit contributions
 rods+0.091
 songs+0.089
 habitats+0.085
 bond+0.085
 similarities+0.085
Neg logit contributions
 crying-0.080
 mirror-0.077
 wardrobe-0.071
 bleeding-0.070
 ha-0.069
Token I
Token ID314
Feature activation+2.32e-01

Pos logit contributions
 crying+0.028
 mirror+0.027
 bleeding+0.025
 wardrobe+0.025
 tooth+0.024
Neg logit contributions
 songs-0.040
 rods-0.038
 facts-0.037
 unity-0.037
 enjoyable-0.037
Token live
Token ID2107
Feature activation-1.32e-01

Pos logit contributions
 crying+0.393
 mirror+0.376
 bleeding+0.348
 wardrobe+0.346
 ha+0.342
Neg logit contributions
 songs-0.413
 rods-0.403
 habitats-0.383
 unity-0.383
 similarities-0.381
Token!"
Token ID2474
Feature activation+4.42e-02

Pos logit contributions
 songs+0.274
 rods+0.265
 habitats+0.256
 unity+0.253
 enjoyable+0.253
Neg logit contributions
 crying-0.185
 mirror-0.174
 bleeding-0.161
 ha-0.158
 tooth-0.156
Token something
Token ID1223
Feature activation+1.50e-01

Pos logit contributions
 crying+0.238
 mirror+0.228
 wardrobe+0.213
 stub+0.212
 ha+0.211
Neg logit contributions
 songs-0.317
 rods-0.301
 unity-0.294
 facts-0.289
 lesson-0.289
Token new
Token ID649
Feature activation-7.92e-02

Pos logit contributions
 crying+0.267
 mirror+0.255
 bleeding+0.236
 wardrobe+0.236
 ha+0.232
Neg logit contributions
 songs-0.256
 rods-0.251
 habitats-0.240
 similarities-0.237
 unity-0.237
Token to
Token ID284
Feature activation-1.17e-02

Pos logit contributions
 songs+0.166
 rods+0.163
 similarities+0.157
 unity+0.156
 habitats+0.155
Neg logit contributions
 crying-0.110
 mirror-0.104
 wardrobe-0.094
 bleeding-0.093
 ha-0.092
Token measure
Token ID3953
Feature activation+6.48e-02

Pos logit contributions
 rods+0.020
 songs+0.020
 habitats+0.019
 similarities+0.018
lights+0.018
Neg logit contributions
 crying-0.021
 mirror-0.020
 tooth-0.018
 ha-0.018
 bleeding-0.018
Token.
Token ID13
Feature activation+6.53e-02

Pos logit contributions
 crying+0.094
 mirror+0.089
 bleeding+0.081
 wardrobe+0.081
 ha+0.080
Neg logit contributions
 songs-0.131
 rods-0.128
 habitats-0.123
 unity-0.123
 similarities-0.123
Token<|endoftext|>
Token ID50256
Feature activation-1.90e-02

Pos logit contributions
 crying+0.109
 mirror+0.107
 bleeding+0.100
 tooth+0.100
 ha+0.100
Neg logit contributions
 songs-0.113
 rods-0.108
 habitats-0.107
 enjoyable-0.105
 unity-0.105
TokenOnce
Token ID7454
Feature activation+7.36e-02

Pos logit contributions
 rods+0.024
 songs+0.024
 lesson+0.023
 games+0.022
 habitats+0.021
Neg logit contributions
 crying-0.039
 mirror-0.038
 tooth-0.038
 wardrobe-0.037
 bleeding-0.037
Token upon
Token ID2402
Feature activation+1.67e-01

Pos logit contributions
 mirror+0.104
 crying+0.104
 ha+0.102
 bleeding+0.100
 tooth+0.100
Neg logit contributions
 lesson-0.135
 rods-0.133
 songs-0.132
 they-0.132
 enjoyable-0.127
Token a
Token ID257
Feature activation+1.38e-01

Pos logit contributions
 mirror+0.232
 crying+0.232
 wardrobe+0.223
 ha+0.213
used+0.213
Neg logit contributions
 songs-0.354
 games-0.327
 rods-0.319
 lesson-0.316
 enjoyable-0.314
Token time
Token ID640
Feature activation-1.53e-01

Pos logit contributions
 crying+0.274
 mirror+0.262
 bleeding+0.245
 stub+0.244
 tooth+0.244
Neg logit contributions
 songs-0.203
 rods-0.203
 unity-0.193
 facts-0.192
 habitats-0.190
Token,
Token ID11
Feature activation+2.44e-02

Pos logit contributions
 rods+0.218
 songs+0.215
 they+0.205
 games+0.205
 lesson+0.203
Neg logit contributions
 mirror-0.291
 crying-0.280
 tooth-0.274
 bleeding-0.271
 wardrobe-0.269
Token door
Token ID3420
Feature activation+2.94e-01

Pos logit contributions
 lesson+0.046
 rods+0.043
 songs+0.042
 times+0.042
 teamwork+0.041
Neg logit contributions
 crying-0.026
 stub-0.025
bug-0.023
bo-0.023
 Elsa-0.023
Token and
Token ID290
Feature activation-1.65e-01

Pos logit contributions
 crying+0.589
 mirror+0.567
 wardrobe+0.530
 bleeding+0.529
 ha+0.519
Neg logit contributions
 songs-0.437
 rods-0.424
 habitats-0.403
 similarities-0.401
 unity-0.400
Token it
Token ID340
Feature activation-6.03e-02

Pos logit contributions
 songs+0.293
 rods+0.290
 they+0.289
 games+0.285
asted+0.276
Neg logit contributions
 crying-0.257
 mirror-0.239
 bleeding-0.232
 weeping-0.221
 stub-0.219
Token led
Token ID2957
Feature activation+4.73e-02

Pos logit contributions
 lesson+0.083
 songs+0.083
 they+0.081
Years+0.080
 games+0.079
Neg logit contributions
 crying-0.119
 mirror-0.117
 weeping-0.107
 bleeding-0.107
used-0.106
Token to
Token ID284
Feature activation-2.81e-02

Pos logit contributions
 crying+0.098
 mirror+0.096
 bleeding+0.092
 wardrobe+0.091
 stub+0.090
Neg logit contributions
 songs-0.067
 rods-0.065
 habitats-0.061
 games-0.059
 lesson-0.059
Token a
Token ID257
Feature activation+1.23e-01

Pos logit contributions
 songs+0.063
 rods+0.062
 habitats+0.061
 similarities+0.060
 facts+0.060
Neg logit contributions
 crying-0.034
 mirror-0.032
 wardrobe-0.029
 ha-0.028
 bleeding-0.028
Token big
Token ID1263
Feature activation+3.50e-02

Pos logit contributions
 crying+0.199
 mirror+0.187
 bleeding+0.179
 stub+0.175
used+0.174
Neg logit contributions
 songs-0.227
 rods-0.218
 bond-0.209
 games-0.209
 unity-0.209
Token room
Token ID2119
Feature activation+1.40e-01

Pos logit contributions
 crying+0.053
 mirror+0.049
 bleeding+0.047
 stub+0.046
 sob+0.045
Neg logit contributions
 songs-0.068
 rods-0.066
 unity-0.063
 bond-0.063
 games-0.063
Token.
Token ID13
Feature activation+4.88e-02

Pos logit contributions
 crying+0.189
 mirror+0.179
 bleeding+0.161
 wardrobe+0.161
 ha+0.159
Neg logit contributions
 songs-0.299
 rods-0.293
 habitats-0.281
 unity-0.281
 similarities-0.281
Token He
Token ID679
Feature activation-5.95e-02

Pos logit contributions
 crying+0.080
 mirror+0.076
 bleeding+0.072
 wardrobe+0.070
 ha+0.070
Neg logit contributions
 songs-0.089
 rods-0.088
 habitats-0.083
 unity-0.082
 facts-0.082
Token saw
Token ID2497
Feature activation+1.55e-01

Pos logit contributions
 songs+0.126
 rods+0.123
 habitats+0.118
 unity+0.117
 similarities+0.117
Neg logit contributions
 crying-0.080
 mirror-0.076
 bleeding-0.069
 wardrobe-0.068
 ha-0.067
Token only
Token ID691
Feature activation+6.62e-02

Pos logit contributions
 crying+0.282
 ha+0.256
 locked+0.249
 mirror+0.249
 stuck+0.246
Neg logit contributions
 songs-0.424
lights-0.402
 rods-0.399
 habitats-0.394
 unity-0.391
Token have
Token ID423
Feature activation-6.40e-02

Pos logit contributions
 crying+0.099
 mirror+0.090
 ha+0.089
 tooth+0.084
 bleeding+0.083
Neg logit contributions
 songs-0.136
lights-0.131
 rods-0.129
Years-0.129
 habitats-0.128
Token cars
Token ID5006
Feature activation-2.04e-01

Pos logit contributions
 songs+0.130
Years+0.125
 bond+0.125
 habitats+0.125
 rods+0.124
Neg logit contributions
 crying-0.090
 mirror-0.084
 tooth-0.081
 bleeding-0.080
 ha-0.078
Token and
Token ID290
Feature activation-1.75e-01

Pos logit contributions
asted+0.392
 unity+0.385
 songs+0.382
 similarities+0.382
Years+0.381
Neg logit contributions
 crying-0.320
 mirror-0.298
 bleeding-0.287
 ha-0.285
 tooth-0.272
Token trucks
Token ID13960
Feature activation-3.36e-01

Pos logit contributions
Years+0.256
addle+0.243
asted+0.241
 enjoyable+0.240
asting+0.232
Neg logit contributions
 crying-0.356
 mirror-0.343
 bleeding-0.339
 tooth-0.335
 ha-0.330
Token and
Token ID290
Feature activation-1.75e-01

Pos logit contributions
 songs+0.704
asted+0.695
 unity+0.693
 similarities+0.691
 rods+0.689
Neg logit contributions
 crying-0.469
 mirror-0.428
 ha-0.405
 bleeding-0.403
 tooth-0.381
Token toys
Token ID14958
Feature activation-1.44e-01

Pos logit contributions
Years+0.264
addle+0.251
asted+0.247
 enjoyable+0.241
 bond+0.238
Neg logit contributions
 crying-0.353
 mirror-0.339
 bleeding-0.333
 tooth-0.328
 ha-0.327
Token."
Token ID526
Feature activation-5.98e-02

Pos logit contributions
 songs+0.315
asted+0.313
 unity+0.310
 similarities+0.310
 rods+0.309
Neg logit contributions
 crying-0.182
 mirror-0.171
 ha-0.157
 bleeding-0.155
 drawer-0.152
Token The
Token ID383
Feature activation+1.29e-02

Pos logit contributions
 songs+0.124
 bond+0.118
 unity+0.118
 rods+0.116
 similarities+0.116
Neg logit contributions
 crying-0.088
 mirror-0.081
 bleeding-0.074
 cabinet-0.074
 ha-0.073
Token robot
Token ID9379
Feature activation+7.03e-02

Pos logit contributions
 crying+0.020
used+0.020
 stub+0.020
 mirror+0.019
ow+0.019
Neg logit contributions
 songs-0.024
 lesson-0.023
 games-0.022
 rods-0.022
 teamwork-0.022
Token says
Token ID1139
Feature activation+1.95e-02

Pos logit contributions
 crying+0.101
 mirror+0.096
 wardrobe+0.087
 bleeding+0.087
 ha+0.086
Neg logit contributions
 songs-0.143
 rods-0.140
 habitats-0.135
 unity-0.135
 enjoyable-0.134
Token a
Token ID257
Feature activation+1.59e-01

Pos logit contributions
 crying+0.140
 mirror+0.136
 wardrobe+0.127
 ha+0.126
 cabinet+0.126
Neg logit contributions
 songs-0.197
 enjoyable-0.188
 rods-0.186
 unity-0.183
 lesson-0.182
Token deep
Token ID2769
Feature activation+1.30e-01

Pos logit contributions
 crying+0.234
 mirror+0.221
 wardrobe+0.206
 bleeding+0.203
 ha+0.203
Neg logit contributions
 songs-0.313
 rods-0.296
 bond-0.294
 enjoyable-0.290
 lesson-0.290
Token sense
Token ID2565
Feature activation-2.77e-02

Pos logit contributions
 crying+0.198
 mirror+0.190
 ha+0.176
 wardrobe+0.175
 bleeding+0.173
Neg logit contributions
 songs-0.246
 rods-0.234
 bond-0.228
 enjoyable-0.227
 unity-0.226
Token of
Token ID286
Feature activation-1.61e-01

Pos logit contributions
 songs+0.036
 rods+0.036
 habitats+0.034
 similarities+0.034
 unity+0.033
Neg logit contributions
 crying-0.061
 mirror-0.059
 bleeding-0.055
 wardrobe-0.055
 tooth-0.054
Token content
Token ID2695
Feature activation-1.72e-01

Pos logit contributions
 rods+0.279
 songs+0.268
 similarities+0.256
 habitats+0.254
 facts+0.251
Neg logit contributions
 crying-0.298
 mirror-0.286
 bleeding-0.260
 wardrobe-0.254
 tooth-0.253
Tokenment
Token ID434
Feature activation-1.72e-01

Pos logit contributions
 rods+0.322
 songs+0.310
 similarities+0.306
 unity+0.304
Years+0.302
Neg logit contributions
 crying-0.288
 mirror-0.273
 wardrobe-0.249
 tooth-0.247
 bleeding-0.246
Token.
Token ID13
Feature activation+7.79e-02

Pos logit contributions
 rods+0.362
 songs+0.353
 similarities+0.350
 habitats+0.348
 unity+0.339
Neg logit contributions
 crying-0.244
 mirror-0.228
 bleeding-0.209
 tooth-0.206
 wardrobe-0.205
Token He
Token ID679
Feature activation-2.49e-02

Pos logit contributions
 crying+0.134
 mirror+0.128
 bleeding+0.119
 wardrobe+0.118
 ha+0.117
Neg logit contributions
 songs-0.138
 rods-0.134
 habitats-0.127
 unity-0.127
 similarities-0.127
Token was
Token ID373
Feature activation+4.17e-02

Pos logit contributions
 songs+0.049
 rods+0.048
 habitats+0.046
 unity+0.046
 similarities+0.046
Neg logit contributions
 crying-0.037
 mirror-0.036
 bleeding-0.032
 wardrobe-0.032
 ha-0.032
Token glad
Token ID9675
Feature activation-1.78e-02

Pos logit contributions
 mirror+0.065
 cabinet+0.063
 crying+0.063
 diary+0.062
 ha+0.061
Neg logit contributions
 songs-0.072
 enjoyable-0.069
asted-0.067
 unity-0.066
 often-0.064
Token he
Token ID339
Feature activation-1.14e-01

Pos logit contributions
 rods+0.035
 similarities+0.034
Years+0.033
asted+0.033
angles+0.033
Neg logit contributions
 crying-0.030
 mirror-0.027
 tooth-0.025
 wardrobe-0.024
 staring-0.024
Token was
Token ID373
Feature activation+7.84e-03

Pos logit contributions
 crying+0.309
 mirror+0.297
 bleeding+0.276
 wardrobe+0.275
 tooth+0.269
Neg logit contributions
 songs-0.342
 rods-0.336
 similarities-0.317
 bond-0.316
 habitats-0.315
Token a
Token ID257
Feature activation+1.18e-01

Pos logit contributions
 crying+0.010
 mirror+0.009
 wardrobe+0.009
 ha+0.009
 bleeding+0.008
Neg logit contributions
 songs-0.017
 rods-0.017
 games-0.016
 enjoyable-0.016
asted-0.016
Token little
Token ID1310
Feature activation+3.85e-02

Pos logit contributions
 crying+0.179
 mirror+0.169
 bleeding+0.159
used+0.157
 wardrobe+0.156
Neg logit contributions
 songs-0.230
 rods-0.218
 similarities-0.208
 bond-0.208
 games-0.207
Token boy
Token ID2933
Feature activation+1.83e-03

Pos logit contributions
 crying+0.049
 mirror+0.045
 bleeding+0.042
 stub+0.042
used+0.041
Neg logit contributions
 songs-0.084
 rods-0.082
 bond-0.078
 unity-0.077
 games-0.077
Token named
Token ID3706
Feature activation-1.91e-02

Pos logit contributions
 crying+0.003
 mirror+0.003
 bleeding+0.003
 wardrobe+0.003
 tooth+0.003
Neg logit contributions
 songs-0.003
 rods-0.003
 habitats-0.003
 similarities-0.003
 bond-0.003
Token Tim
Token ID5045
Feature activation-1.28e-01

Pos logit contributions
 songs+0.030
 rods+0.029
 habitats+0.027
 unity+0.027
 similarities+0.027
Neg logit contributions
 crying-0.037
 mirror-0.036
 wardrobe-0.034
 bleeding-0.033
 ha-0.033
Tokenmy
Token ID1820
Feature activation+2.81e-01

Pos logit contributions
 songs+0.245
 rods+0.228
 habitats+0.223
asted+0.221
 similarities+0.220
Neg logit contributions
 crying-0.190
 mirror-0.187
 ha-0.178
 wardrobe-0.176
used-0.174
Token.
Token ID13
Feature activation+4.50e-02

Pos logit contributions
 crying+0.394
 mirror+0.378
 wardrobe+0.343
 bleeding+0.341
 ha+0.338
Neg logit contributions
 songs-0.584
 rods-0.566
 habitats-0.543
 similarities-0.541
 unity-0.541
Token Tim
Token ID5045
Feature activation-1.29e-01

Pos logit contributions
 crying+0.077
 mirror+0.073
 bleeding+0.069
 cabinet+0.068
 pant+0.065
Neg logit contributions
 songs-0.082
 rods-0.078
 unity-0.077
 facts-0.076
 bond-0.076
Tokenmy
Token ID1820
Feature activation+2.81e-01

Pos logit contributions
 songs+0.321
 rods+0.309
 habitats+0.302
 similarities+0.301
 bond+0.301
Neg logit contributions
 crying-0.121
 mirror-0.112
 ha-0.102
 wardrobe-0.098
used-0.098
Token loved
Token ID6151
Feature activation-1.98e-01

Pos logit contributions
 crying+0.581
 mirror+0.557
 bleeding+0.523
 wardrobe+0.521
 tooth+0.518
Neg logit contributions
 songs-0.401
 rods-0.396
 unity-0.375
 habitats-0.371
 facts-0.369
Token to
Token ID284
Feature activation-2.64e-02

Pos logit contributions
 songs+0.014
 rods+0.014
 unity+0.014
 bond+0.013
 habitats+0.013
Neg logit contributions
 crying-0.012
 mirror-0.012
 tooth-0.011
 wardrobe-0.010
 bleeding-0.010
Token make
Token ID787
Feature activation-1.00e-01

Pos logit contributions
 rods+0.044
 songs+0.044
 habitats+0.041
 unity+0.041
 facts+0.041
Neg logit contributions
 crying-0.048
 mirror-0.047
 ha-0.043
 wardrobe-0.043
 bleeding-0.043
Token a
Token ID257
Feature activation+1.23e-01

Pos logit contributions
 songs+0.181
 rods+0.173
 enjoyable+0.163
 games+0.163
 facts+0.162
Neg logit contributions
 crying-0.169
 mirror-0.163
 bleeding-0.153
 stub-0.153
 wardrobe-0.151
Token mess
Token ID2085
Feature activation-2.40e-02

Pos logit contributions
 crying+0.204
 mirror+0.196
 wardrobe+0.180
 bleeding+0.177
 tooth+0.177
Neg logit contributions
 songs-0.228
 rods-0.225
 habitats-0.218
 unity-0.215
 similarities-0.214
Token and
Token ID290
Feature activation-1.67e-01

Pos logit contributions
 rods+0.049
 songs+0.048
 habitats+0.048
 similarities+0.047
 unity+0.047
Neg logit contributions
 crying-0.036
 mirror-0.033
 bleeding-0.030
 wardrobe-0.030
 cabinet-0.030
Token play
Token ID711
Feature activation-1.62e-01

Pos logit contributions
 songs+0.301
 rods+0.301
 habitats+0.287
 unity+0.286
 similarities+0.286
Neg logit contributions
 crying-0.286
 mirror-0.270
 bleeding-0.249
 ha-0.246
 wardrobe-0.246
Token.
Token ID13
Feature activation+5.23e-02

Pos logit contributions
 songs+0.285
 rods+0.281
 habitats+0.269
 unity+0.267
 similarities+0.266
Neg logit contributions
 crying-0.279
 mirror-0.267
 bleeding-0.246
 wardrobe-0.246
 ha-0.244
Token\n
Token ID198
Feature activation-1.73e-03

Pos logit contributions
 crying+0.091
 mirror+0.087
 wardrobe+0.082
 bleeding+0.082
 tooth+0.081
Neg logit contributions
 songs-0.090
 rods-0.088
 habitats-0.084
 similarities-0.083
 enjoyable-0.083
Token\n
Token ID198
Feature activation-1.46e-03

Pos logit contributions
 songs+0.004
 rods+0.003
 unity+0.003
 habitats+0.003
 similarities+0.003
Neg logit contributions
 crying-0.002
 mirror-0.002
 bleeding-0.002
 ha-0.002
 stub-0.002
TokenTom
Token ID13787
Feature activation-1.19e-01

Pos logit contributions
 songs+0.003
 facts+0.002
 rods+0.002
 unity+0.002
 habitats+0.002
Neg logit contributions
 crying-0.002
 ha-0.002
 mirror-0.002
 bleeding-0.002
 pant-0.002
Tokenmy
Token ID1820
Feature activation+2.89e-01

Pos logit contributions
 songs+0.281
 habitats+0.264
Years+0.260
 sizes+0.260
 rods+0.260
Neg logit contributions
 crying-0.117
 mirror-0.108
used-0.104
 ha-0.098
 stub-0.096
Token outside
Token ID2354
Feature activation+1.38e-01

Pos logit contributions
 habitats+0.315
 unity+0.312
 enjoyable+0.310
 facts+0.309
 rods+0.309
Neg logit contributions
 crying-0.211
 mirror-0.201
 ha-0.183
 wardrobe-0.179
 cabinet-0.179
Token with
Token ID351
Feature activation-5.13e-02

Pos logit contributions
 crying+0.178
 mirror+0.160
 cabinet+0.146
 stuck+0.143
 wardrobe+0.143
Neg logit contributions
 similarities-0.310
 habitats-0.309
 unity-0.307
 facts-0.306
 enjoyable-0.304
Token his
Token ID465
Feature activation-1.10e-01

Pos logit contributions
 songs+0.087
 habitats+0.086
 bond+0.086
 unity+0.086
 enjoyable+0.085
Neg logit contributions
 crying-0.090
 mirror-0.089
 wardrobe-0.080
 tooth-0.080
 bleeding-0.080
Token friends
Token ID2460
Feature activation-1.05e-01

Pos logit contributions
 songs+0.181
 rods+0.175
 habitats+0.169
 unity+0.168
 similarities+0.168
Neg logit contributions
 crying-0.204
 mirror-0.196
 wardrobe-0.182
 bleeding-0.182
 ha-0.179
Token.
Token ID13
Feature activation+6.37e-02

Pos logit contributions
 songs+0.226
 rods+0.222
 habitats+0.215
 unity+0.214
 similarities+0.214
Neg logit contributions
 crying-0.141
 mirror-0.133
 wardrobe-0.119
 bleeding-0.119
 ha-0.117
Token One
Token ID1881
Feature activation-3.29e-02

Pos logit contributions
 crying+0.095
 mirror+0.094
 cabinet+0.090
 closet+0.084
 drawer+0.084
Neg logit contributions
 songs-0.125
 unity-0.122
 bond-0.119
 facts-0.119
 enjoyable-0.118
Token day
Token ID1110
Feature activation-2.78e-01

Pos logit contributions
 rods+0.041
 habitats+0.040
 facts+0.039
 unity+0.039
 similarities+0.039
Neg logit contributions
 crying-0.075
 mirror-0.071
 tooth-0.068
 bleeding-0.068
 stub-0.067
Token,
Token ID11
Feature activation+2.10e-02

Pos logit contributions
 songs+0.620
 habitats+0.603
 unity+0.593
 similarities+0.590
 rods+0.590
Neg logit contributions
 crying-0.369
 mirror-0.335
 ha-0.312
 bleeding-0.311
 stub-0.296
Token Tim
Token ID5045
Feature activation-1.16e-01

Pos logit contributions
 crying+0.036
 mirror+0.032
 bleeding+0.032
 stuck+0.031
 ha+0.030
Neg logit contributions
 habitats-0.040
 facts-0.038
 similarities-0.038
 unity-0.038
 songs-0.038
Tokenmy
Token ID1820
Feature activation+2.94e-01

Pos logit contributions
 rods+0.285
 songs+0.285
 lesson+0.272
 habitats+0.271
 games+0.270
Neg logit contributions
 crying-0.100
 mirror-0.096
used-0.092
 ha-0.086
 bleeding-0.081
Token's
Token ID338
Feature activation+2.85e-01

Pos logit contributions
 crying+0.358
 mirror+0.334
 wardrobe+0.297
 bleeding+0.296
 stub+0.292
Neg logit contributions
 songs-0.670
 rods-0.655
 unity-0.636
 habitats-0.634
 similarities-0.633
Token their
Token ID511
Feature activation-4.04e-01

Pos logit contributions
 habitats+0.125
 songs+0.125
 enjoyable+0.120
 facts+0.118
 bond+0.118
Neg logit contributions
 mirror-0.090
 crying-0.088
 wardrobe-0.079
 drawer-0.079
 cabinet-0.078
Token blocks
Token ID7021
Feature activation-1.07e-01

Pos logit contributions
 songs+0.640
 rods+0.631
 similarities+0.587
 unity+0.585
 habitats+0.581
Neg logit contributions
 crying-0.769
 mirror-0.734
 wardrobe-0.688
 bleeding-0.687
 stub-0.680
Token on
Token ID319
Feature activation+5.57e-02

Pos logit contributions
 songs+0.255
 rods+0.249
 habitats+0.243
 unity+0.242
 similarities+0.241
Neg logit contributions
 crying-0.118
 mirror-0.110
 bleeding-0.097
 wardrobe-0.096
 ha-0.094
Token top
Token ID1353
Feature activation+8.01e-02

Pos logit contributions
 crying+0.071
 mirror+0.069
 cabinet+0.062
 drawer+0.062
 ha+0.061
Neg logit contributions
 songs-0.120
 habitats-0.117
 facts-0.117
 enjoyable-0.116
 similarities-0.116
Token of
Token ID286
Feature activation-1.97e-01

Pos logit contributions
 crying+0.131
 mirror+0.127
 wardrobe+0.116
 cabinet+0.116
 drawer+0.116
Neg logit contributions
 songs-0.145
 rods-0.140
 habitats-0.140
 similarities-0.139
 unity-0.138
Token each
Token ID1123
Feature activation-3.89e-01

Pos logit contributions
 songs+0.342
 habitats+0.327
 rods+0.323
 enjoyable+0.320
 facts+0.320
Neg logit contributions
 crying-0.347
 mirror-0.335
 bleeding-0.306
 ha-0.305
 wardrobe-0.304
Token other
Token ID584
Feature activation-7.42e-02

Pos logit contributions
 songs+0.700
 rods+0.685
 unity+0.647
 similarities+0.645
 habitats+0.644
Neg logit contributions
 crying-0.658
 mirror-0.629
 bleeding-0.581
 wardrobe-0.580
 ha-0.572
Token,
Token ID11
Feature activation+3.32e-03

Pos logit contributions
 songs+0.160
 rods+0.157
 habitats+0.152
 similarities+0.151
 unity+0.151
Neg logit contributions
 crying-0.098
 mirror-0.093
 wardrobe-0.083
 bleeding-0.083
 ha-0.082
Token but
Token ID475
Feature activation-7.41e-02

Pos logit contributions
 crying+0.005
 mirror+0.004
 pant+0.004
 bleeding+0.004
 sob+0.004
Neg logit contributions
 habitats-0.007
 bond-0.007
 songs-0.007
 similarities-0.007
 facts-0.007
Token they
Token ID484
Feature activation-4.18e-01

Pos logit contributions
 songs+0.135
 rods+0.130
 habitats+0.127
 unity+0.126
 facts+0.126
Neg logit contributions
 crying-0.125
 mirror-0.120
 ha-0.112
 bleeding-0.111
 wardrobe-0.109
Token did
Token ID750
Feature activation+9.61e-03

Pos logit contributions
 songs+0.718
 rods+0.701
 habitats+0.663
 unity+0.662
 similarities+0.661
Neg logit contributions
 crying-0.738
 mirror-0.709
 wardrobe-0.655
 bleeding-0.655
 ha-0.647
Token,
Token ID11
Feature activation+1.63e-02

Pos logit contributions
 crying+0.022
 mirror+0.019
 ha+0.018
 wardrobe+0.018
 cabinet+0.018
Neg logit contributions
 rods-0.035
 songs-0.035
 similarities-0.034
asted-0.034
 unity-0.034
Token but
Token ID475
Feature activation-6.30e-02

Pos logit contributions
 crying+0.024
 mirror+0.022
 bleeding+0.020
 wardrobe+0.019
 sob+0.019
Neg logit contributions
 habitats-0.036
 rods-0.036
 facts-0.035
 bond-0.034
asted-0.034
Token we
Token ID356
Feature activation-1.73e-01

Pos logit contributions
 songs+0.104
 habitats+0.103
 rods+0.102
Years+0.098
 facts+0.097
Neg logit contributions
 crying-0.121
 mirror-0.115
 bleeding-0.108
 wardrobe-0.106
 ha-0.105
Token couldn
Token ID3521
Feature activation-2.90e-02

Pos logit contributions
 songs+0.336
 rods+0.328
 habitats+0.313
 unity+0.313
 facts+0.312
Neg logit contributions
 crying-0.277
 mirror-0.262
 bleeding-0.242
 wardrobe-0.238
 stub-0.236
Token't
Token ID470
Feature activation+1.93e-01

Pos logit contributions
 rods+0.059
 lesson+0.058
 songs+0.058
 they+0.056
 games+0.055
Neg logit contributions
 mirror-0.038
 crying-0.037
 tooth-0.035
 wardrobe-0.034
 drawer-0.034
Token agree
Token ID4236
Feature activation-3.89e-01

Pos logit contributions
 crying+0.291
 mirror+0.276
 wardrobe+0.251
 bleeding+0.250
 ha+0.249
Neg logit contributions
 songs-0.384
 rods-0.382
 habitats-0.372
 facts-0.366
 unity-0.366
Token on
Token ID319
Feature activation+6.33e-02

Pos logit contributions
 songs+0.830
 rods+0.825
 habitats+0.804
 bond+0.803
 unity+0.801
Neg logit contributions
 crying-0.527
 mirror-0.486
 bleeding-0.450
 ha-0.431
 wardrobe-0.422
Token the
Token ID262
Feature activation-1.36e-02

Pos logit contributions
 crying+0.106
 mirror+0.100
 bleeding+0.092
 ha+0.091
 wardrobe+0.090
Neg logit contributions
 songs-0.116
 rods-0.115
 habitats-0.111
 bond-0.109
 similarities-0.109
Token t
Token ID256
Feature activation+1.52e-01

Pos logit contributions
 songs+0.030
 rods+0.028
 lesson+0.028
 games+0.028
 bond+0.027
Neg logit contributions
 crying-0.017
used-0.016
 mirror-0.016
bug-0.015
 stub-0.015
Tokeneddy
Token ID21874
Feature activation-8.79e-02

Pos logit contributions
 crying+0.329
 mirror+0.317
 ha+0.300
 bleeding+0.299
 stub+0.298
Neg logit contributions
 songs-0.194
 rods-0.183
 habitats-0.173
 bond-0.168
 lesson-0.166
Token bear
Token ID6842
Feature activation+1.82e-01

Pos logit contributions
 songs+0.208
 rods+0.192
 bond+0.190
 games+0.189
 lesson+0.188
Neg logit contributions
 crying-0.100
 mirror-0.094
 stub-0.089
used-0.087
 ha-0.083
Token John
Token ID1757
Feature activation-6.75e-02

Pos logit contributions
 crying+0.184
 mirror+0.173
 bleeding+0.161
 wardrobe+0.157
 ha+0.157
Neg logit contributions
 songs-0.209
 rods-0.204
 similarities-0.194
 habitats-0.194
 unity-0.192
Token asked
Token ID1965
Feature activation+8.50e-02

Pos logit contributions
 songs+0.132
 rods+0.130
 unity+0.123
 similarities+0.123
 habitats+0.123
Neg logit contributions
 crying-0.104
 mirror-0.098
 bleeding-0.090
 wardrobe-0.090
 ha-0.089
Token his
Token ID465
Feature activation-1.07e-01

Pos logit contributions
 crying+0.148
 mirror+0.141
 wardrobe+0.129
 ha+0.128
 tooth+0.128
Neg logit contributions
 rods-0.151
 songs-0.148
 habitats-0.146
 similarities-0.143
 unity-0.141
Token mom
Token ID1995
Feature activation+7.00e-02

Pos logit contributions
 songs+0.212
 rods+0.203
 lesson+0.199
 facts+0.194
 unity+0.194
Neg logit contributions
 crying-0.159
 mirror-0.150
 bleeding-0.144
 stub-0.143
 wardrobe-0.137
Token multiple
Token ID3294
Feature activation-2.87e-02

Pos logit contributions
 crying+0.120
 mirror+0.115
 bleeding+0.108
 stub+0.107
 wardrobe+0.106
Neg logit contributions
 songs-0.122
 rods-0.118
 facts-0.112
 unity-0.112
 habitats-0.112
Token times
Token ID1661
Feature activation-4.49e-01

Pos logit contributions
 rods+0.024
 bond+0.022
 similarities+0.022
 songs+0.022
 habitats+0.021
Neg logit contributions
 crying-0.076
 mirror-0.074
 wardrobe-0.070
 tooth-0.070
 ha-0.069
Token,
Token ID11
Feature activation+2.73e-02

Pos logit contributions
 songs+0.841
 rods+0.832
 habitats+0.802
 similarities+0.794
 unity+0.789
Neg logit contributions
 crying-0.734
 mirror-0.692
 wardrobe-0.641
 bleeding-0.633
 ha-0.622
Token but
Token ID475
Feature activation-5.64e-02

Pos logit contributions
 crying+0.045
 mirror+0.042
 bleeding+0.038
 ha+0.038
 wardrobe+0.038
Neg logit contributions
 rods-0.052
 songs-0.051
 habitats-0.050
 similarities-0.049
 facts-0.048
Token his
Token ID465
Feature activation-1.08e-01

Pos logit contributions
 songs+0.108
 rods+0.105
 unity+0.101
 similarities+0.100
 habitats+0.100
Neg logit contributions
 crying-0.087
 mirror-0.084
 bleeding-0.077
 wardrobe-0.076
 ha-0.075
Token mom
Token ID1995
Feature activation+6.88e-02

Pos logit contributions
 songs+0.213
 lesson+0.200
 rods+0.198
 bond+0.194
 games+0.191
Neg logit contributions
 crying-0.163
 mirror-0.153
 stub-0.144
 bleeding-0.144
 ha-0.140
Token never
Token ID1239
Feature activation-9.48e-02

Pos logit contributions
 crying+0.101
 mirror+0.097
 wardrobe+0.088
 bleeding+0.088
 ha+0.086
Neg logit contributions
 songs-0.137
 rods-0.134
 unity-0.128
 habitats-0.128
 similarities-0.128
Token.
Token ID13
Feature activation+5.38e-02

Pos logit contributions
 rods+0.663
 similarities+0.655
 songs+0.651
 unity+0.650
 bond+0.647
Neg logit contributions
 crying-0.335
 mirror-0.294
 stub-0.263
 tooth-0.259
 bleeding-0.258
Token
Token ID220
Feature activation+1.46e-01

Pos logit contributions
 crying+0.063
 mirror+0.060
 cabinet+0.054
 drawer+0.050
 pant+0.050
Neg logit contributions
 songs-0.123
 rods-0.118
 enjoyable-0.117
 facts-0.117
 unity-0.117
Token\n
Token ID198
Feature activation-2.72e-04

Pos logit contributions
 crying+0.238
 mirror+0.229
 wardrobe+0.211
 bleeding+0.210
 ha+0.209
Neg logit contributions
 songs-0.268
 rods-0.263
 habitats-0.248
 similarities-0.248
 unity-0.248
Token\n
Token ID198
Feature activation-2.73e-04

Pos logit contributions
 songs+0.001
 rods+0.001
 habitats+0.001
 unity+0.001
 facts+0.001
Neg logit contributions
 crying-0.000
 mirror-0.000
 stub-0.000
 ha-0.000
 bleeding-0.000
TokenOne
Token ID3198
Feature activation-6.28e-02

Pos logit contributions
 rods+0.001
 facts+0.001
lights+0.001
 accomplishment+0.001
 bond+0.001
Neg logit contributions
 crying-0.000
 mirror-0.000
 ha-0.000
 front-0.000
 stub-0.000
Token day
Token ID1110
Feature activation-2.87e-01

Pos logit contributions
 rods+0.077
 habitats+0.072
 facts+0.071
 similarities+0.070
 unity+0.069
Neg logit contributions
 crying-0.144
 mirror-0.136
 tooth-0.134
 bleeding-0.130
 stub-0.129
Token,
Token ID11
Feature activation+1.57e-02

Pos logit contributions
 songs+0.610
 rods+0.594
 habitats+0.589
 similarities+0.585
 unity+0.580
Neg logit contributions
 crying-0.396
 mirror-0.370
 ha-0.343
 bleeding-0.336
 tooth-0.331
Token he
Token ID339
Feature activation-1.44e-01

Pos logit contributions
 crying+0.018
 mirror+0.016
 ha+0.016
 tooth+0.015
 bleeding+0.015
Neg logit contributions
 habitats-0.037
 similarities-0.036
 facts-0.036
 enjoyable-0.036
Years-0.035
Token heard
Token ID2982
Feature activation+2.05e-01

Pos logit contributions
 songs+0.311
 rods+0.302
 habitats+0.294
 unity+0.287
 similarities+0.287
Neg logit contributions
 crying-0.184
 mirror-0.181
 bleeding-0.162
 wardrobe-0.162
 cabinet-0.161
Token a
Token ID257
Feature activation+1.25e-01

Pos logit contributions
 crying+0.241
 mirror+0.227
 bleeding+0.201
 wardrobe+0.201
 ha+0.198
Neg logit contributions
 songs-0.474
 rods-0.463
 habitats-0.446
 unity-0.445
 similarities-0.444
Token strange
Token ID6283
Feature activation+4.49e-02

Pos logit contributions
 crying+0.213
 mirror+0.208
 wardrobe+0.193
 stub+0.192
 bleeding+0.191
Neg logit contributions
 songs-0.222
 rods-0.205
 unity-0.198
 lesson-0.197
 bond-0.197
Token and
Token ID290
Feature activation-1.71e-01

Pos logit contributions
 songs+0.124
 rods+0.121
 unity+0.115
 habitats+0.115
 similarities+0.114
Neg logit contributions
 crying-0.120
 mirror-0.116
 wardrobe-0.107
 bleeding-0.107
 ha-0.106
Token his
Token ID465
Feature activation-1.26e-01

Pos logit contributions
 songs+0.302
 rods+0.298
 habitats+0.283
 enjoyable+0.280
 similarities+0.280
Neg logit contributions
 crying-0.298
 mirror-0.284
 ha-0.260
 bleeding-0.260
 wardrobe-0.259
Token friends
Token ID2460
Feature activation-1.22e-01

Pos logit contributions
 songs+0.187
 rods+0.178
 unity+0.170
 similarities+0.167
 habitats+0.167
Neg logit contributions
 crying-0.251
 mirror-0.243
 bleeding-0.230
 wardrobe-0.227
 ha-0.226
Token were
Token ID547
Feature activation-2.17e-01

Pos logit contributions
 songs+0.225
 rods+0.224
 habitats+0.215
 facts+0.214
 similarities+0.212
Neg logit contributions
 crying-0.208
 mirror-0.193
 wardrobe-0.179
 bleeding-0.178
 ha-0.176
Token relieved
Token ID22598
Feature activation+4.23e-02

Pos logit contributions
 songs+0.439
 rods+0.424
 unity+0.411
 enjoyable+0.410
 similarities+0.403
Neg logit contributions
 crying-0.300
 mirror-0.300
 wardrobe-0.271
 cabinet-0.268
 tooth-0.267
Token they
Token ID484
Feature activation-4.25e-01

Pos logit contributions
 crying+0.066
 mirror+0.054
 staring+0.051
 trembling+0.050
 bleeding+0.050
Neg logit contributions
 rods-0.090
 habitats-0.088
 similarities-0.087
 songs-0.087
angles-0.085
Token had
Token ID550
Feature activation-2.33e-02

Pos logit contributions
 songs+0.726
 rods+0.710
 habitats+0.672
 unity+0.669
 similarities+0.668
Neg logit contributions
 crying-0.759
 mirror-0.726
 bleeding-0.675
 wardrobe-0.673
 ha-0.665
Token made
Token ID925
Feature activation-3.96e-02

Pos logit contributions
 songs+0.050
 habitats+0.050
lights+0.049
 facts+0.049
 rods+0.049
Neg logit contributions
 crying-0.032
 stuck-0.027
 mirror-0.026
 locked-0.026
 ha-0.026
Token the
Token ID262
Feature activation-2.57e-02

Pos logit contributions
 habitats+0.091
 songs+0.088
 similarities+0.087
 rods+0.087
Years+0.085
Neg logit contributions
 crying-0.051
 mirror-0.044
 tooth-0.039
 bleeding-0.039
 staring-0.037
Token right
Token ID826
Feature activation+2.01e-01

Pos logit contributions
 lesson+0.045
 songs+0.045
 bond+0.043
 games+0.042
 enjoyable+0.042
Neg logit contributions
 bleeding-0.041
 crying-0.040
used-0.040
ainted-0.040
 stub-0.038
Token decision
Token ID2551
Feature activation-1.74e-01

Pos logit contributions
 crying+0.430
 mirror+0.417
 wardrobe+0.388
 tooth+0.386
 ha+0.383
Neg logit contributions
 songs-0.272
 rods-0.268
 habitats-0.255
 similarities-0.254
 unity-0.250