TOP ACTIVATIONS
MAX = 0.234

't know the rh in oc eros was wild . Can
quiet compared to the noisy se ag ull s outside .
said goodbye to the g az eb o and went back
't cry , boys . Acc idents happen . I have
to eat the y ummy ins ides . From that day
give to the rh in oc eros . Lily walked to
giving him too many ch ocol ates ." The king learned
. The more , the mer rier . Come to my
am taking the rh in oc eros to a new home
for Santa and his re ind eer ," Anna says to
before dinner . <|endoftext|> S aman tha was a lazy girl
stayed away from the other ge ese . "
not know why his t eddy bear was sad . He
it to the rh in oc eros . When he tasted
Why don ’ t you come with me
wanted to guess what the kitten was doing .
“ M az zy ’ s
these toys and how your grand pa and I played with
on top . The b unn ies wanted to have a
I want to taste something y ummy !" The rabbit smiled

MIN ACTIVATIONS
MIN = -0.097

tightly . The tornado was getting closer , but luckily the
tight . His aim was getting better and better .
found one and dad used a tool to fix the robot
. Soon the distance was getting smaller , and the hill
pa showed him how to use the tools to fix the
on , Lily learned to use spray paint responsibly and continued
build their fort . They used the leaves and sticks and
it was in pain and needed help . They took out
didn 't feel well and needed some medicine too . But
the house . Lily used a broom to sweep the floor
that day on , Lily used her gifts to help everyone
. She didn 't even need help from anyone !
dad showed him how to use it safely and Tim my
The thin tree learned to be strong , too . They
sheep . The sheep were getting sleepy after all the walking
. They built the castle for the whole
't know how . They needed a plan to keep the
band ages . Daisy chose a set of colourful band ages
big ! The hole was too big , and Joe was
an important lesson . He needed to be careful when stretching

INTERVAL 0.151 - 0.234
CONTAINS 0.047%

quiet compared to the noisy se ag ull s outside .
't know the rh in oc eros was wild . Can

INTERVAL 0.069 - 0.151
CONTAINS 2.701%

and couldn 't stop smiling . <|endoftext|> Once upon a time
her mom my . " Mom my , can we find
to share ! It 's mine !" said Billy .
the mouse . Then , she took the mouse outside and
huge guard stood at the entrance . He said to Lucy

INTERVAL -0.014 - 0.069
CONTAINS 61.925%

the food and then soared back up into the sky .
saw what was happening and came over to talk to them
a big hug and said : " Thank you !"
" Give me the dinosaur !" Lily shouted .
. For days and months and years , the little girl

INTERVAL -0.097 - -0.014
CONTAINS 35.327%

. She said , " No , dinner comes first ."
bee did not like that . It buzz ed angrily and
The owl was happy to help and showed the fox how
was going to play with his favorite bright things - his
cat me owed and joined him . The boy was happy
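
NOTE ON INTERVALS
The four INTERVAL ranges above are consistent with splitting the span from MIN (-0.097) to MAX (0.234) into four equal-width bins, and the CONTAINS percentages sum to 100%. The sketch below reproduces the bin edges under that assumption. The function name interval_edges and the equal-width rule are illustrative guesses, not taken from the tool that produced this dump.

```python
# Minimal sketch: rebuild the interval edges, assuming equal-width bins.
# interval_edges is a hypothetical helper, not part of any dashboard library.

def interval_edges(min_act, max_act, n_bins=4):
    """Return n_bins + 1 evenly spaced edges from min_act to max_act."""
    width = (max_act - min_act) / n_bins
    return [min_act + i * width for i in range(n_bins + 1)]

# MIN and MAX as reported at the top of the dump.
edges = interval_edges(-0.097, 0.234)

# Pair consecutive edges into (low, high) intervals, lowest first.
intervals = list(zip(edges[:-1], edges[1:]))

# CONTAINS percentages copied from the dump, lowest interval first.
shares = [35.327, 61.925, 2.701, 0.047]

for (low, high), share in zip(intervals, shares):
    print(f"{low:+.4f} to {high:+.4f} contains {share}%")
```

Under this reading, about 97% of the listed activations fall in the two lowest intervals, so the TOP ACTIVATIONS examples come from a very thin upper tail.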
Token't
Token ID470
Feature activation+1.01e-01

Pos logit contributions
 darker+0.063
 stairs+0.063
 security+0.062
 darkness+0.060
 silence+0.060
Neg logit contributions
ish-0.081
anda-0.077
an-0.073
amb-0.072
o-0.071
Token know
Token ID760
Feature activation+7.05e-02

Pos logit contributions
ish+0.244
anda+0.240
amb+0.223
an+0.220
ouse+0.218
Neg logit contributions
 silence-0.133
 security-0.130
 danger-0.127
 dark-0.125
 darkness-0.124
Token the
Token ID262
Feature activation+7.59e-02

Pos logit contributions
ish+0.159
anda+0.154
amb+0.144
o+0.142
an+0.142
Neg logit contributions
 stairs-0.099
 darker-0.099
 security-0.098
 darkness-0.097
 silence-0.094
Token rh
Token ID9529
Feature activation+1.35e-01

Pos logit contributions
ish+0.182
anda+0.176
an+0.162
o+0.161
ouse+0.158
Neg logit contributions
 darker-0.093
 security-0.091
 stairs-0.091
 narrower-0.091
 darkness-0.088
Tokenin
Token ID259
Feature activation+1.05e-01

Pos logit contributions
ish+0.210
anda+0.209
amb+0.194
anna+0.194
am+0.188
Neg logit contributions
 security-0.269
 stairs-0.252
 darker-0.251
 darkness-0.247
Where-0.246
Tokenoc
Token ID420
Feature activation+1.59e-01

Pos logit contributions
ish+0.127
anda+0.124
amb+0.121
anna+0.110
am+0.109
Neg logit contributions
 security-0.242
 silence-0.238
 darkness-0.233
 stairs-0.229
 danger-0.228
Tokeneros
Token ID27498
Feature activation+1.12e-01

Pos logit contributions
anda+0.182
ouse+0.178
adow+0.175
orse+0.168
ish+0.167
Neg logit contributions
 security-0.370
 stairs-0.352
 front-0.350
 opposite-0.348
 darkness-0.346
Token was
Token ID373
Feature activation+4.20e-02

Pos logit contributions
ish+0.252
amb+0.247
anda+0.244
ouse+0.233
anna+0.222
Neg logit contributions
 security-0.157
 darkness-0.146
 silence-0.144
 stairs-0.141
 darker-0.141
Token wild
Token ID4295
Feature activation+8.21e-02

Pos logit contributions
ish+0.098
anda+0.095
o+0.089
an+0.086
am+0.086
Neg logit contributions
 stairs-0.055
 darker-0.054
 security-0.053
 narrower-0.052
 darkness-0.052
Token.
Token ID13
Feature activation+6.67e-02

Pos logit contributions
anda+0.193
ish+0.192
ouse+0.190
orse+0.188
amb+0.179
Neg logit contributions
 stairs-0.107
 silence-0.103
 darkness-0.103
 darker-0.102
 security-0.099
Token Can
Token ID1680
Feature activation+6.82e-02

Pos logit contributions
ish+0.160
anda+0.149
an+0.142
ouse+0.138
amb+0.135
Neg logit contributions
 stairs-0.090
 darkness-0.088
 silence-0.087
 darker-0.087
 security-0.086
Token quiet
Token ID5897
Feature activation+1.53e-01

Pos logit contributions
 darker+0.072
 stairs+0.071
 security+0.070
 darkness+0.070
 narrower+0.068
Neg logit contributions
ish-0.094
anda-0.092
amb-0.085
o-0.085
ouse-0.083
Token compared
Token ID3688
Feature activation+8.49e-02

Pos logit contributions
anda+0.388
ish+0.387
ouse+0.350
amb+0.349
o+0.348
Neg logit contributions
 silence-0.189
 stairs-0.182
 darkness-0.178
 opposite-0.167
 front-0.166
Token to
Token ID284
Feature activation+5.59e-02

Pos logit contributions
ish+0.212
anda+0.206
o+0.191
amb+0.190
ouse+0.189
Neg logit contributions
 stairs-0.100
 darker-0.100
 security-0.098
 darkness-0.097
 narrower-0.092
Token the
Token ID262
Feature activation+6.39e-02

Pos logit contributions
ish+0.141
anda+0.137
o+0.128
amb+0.127
an+0.126
Neg logit contributions
 darker-0.063
 stairs-0.063
 security-0.062
 darkness-0.061
 narrower-0.059
Token noisy
Token ID31210
Feature activation+7.85e-02

Pos logit contributions
ish+0.156
anda+0.150
an+0.139
o+0.137
ouse+0.135
Neg logit contributions
 darker-0.075
 narrower-0.073
 security-0.072
 stairs-0.071
 darkness-0.069
Token se
Token ID384
Feature activation+1.56e-01

Pos logit contributions
ish+0.185
anda+0.182
amb+0.169
o+0.169
am+0.165
Neg logit contributions
 stairs-0.103
 darker-0.100
 darkness-0.100
 security-0.099
 silence-0.097
Tokenag
Token ID363
Feature activation+9.48e-02

Pos logit contributions
anda+0.278
sters+0.261
ish+0.256
oppy+0.241
amb+0.237
Neg logit contributions
 lesson-0.241
Time-0.236
 opposite-0.226
 month-0.222
 hours-0.222
Tokenull
Token ID724
Feature activation+1.05e-01

Pos logit contributions
ish+0.197
anda+0.190
amb+0.172
ouse+0.163
sters+0.159
Neg logit contributions
 silence-0.145
 stairs-0.143
 darkness-0.142
 front-0.141
 opposite-0.139
Tokens
Token ID82
Feature activation+6.91e-02

Pos logit contributions
ish+0.200
anda+0.196
amb+0.193
ouse+0.184
am+0.174
Neg logit contributions
 silence-0.147
 darkness-0.143
 front-0.143
 darker-0.141
 dark-0.140
Token outside
Token ID2354
Feature activation+6.67e-02

Pos logit contributions
ish+0.151
anda+0.148
amb+0.136
o+0.134
an+0.133
Neg logit contributions
 darkness-0.100
 darker-0.100
 stairs-0.099
 silence-0.098
 security-0.097
Token.
Token ID13
Feature activation+7.57e-02

Pos logit contributions
ish+0.148
anda+0.145
amb+0.133
o+0.132
an+0.131
Neg logit contributions
 darker-0.096
 stairs-0.096
 darkness-0.094
 security-0.094
 silence-0.091
Token said
Token ID531
Feature activation+4.80e-02

Pos logit contributions
anda+0.201
ish+0.198
amb+0.187
an+0.185
ans+0.173
Neg logit contributions
 security-0.116
 silence-0.109
 front-0.107
 darkness-0.107
 stairs-0.102
Token goodbye
Token ID24829
Feature activation+9.43e-02

Pos logit contributions
ish+0.109
anda+0.105
an+0.098
amb+0.097
o+0.096
Neg logit contributions
 security-0.069
 darker-0.068
 silence-0.066
 darkness-0.066
 stairs-0.066
Token to
Token ID284
Feature activation+5.92e-02

Pos logit contributions
ish+0.255
anda+0.247
an+0.224
ishes+0.215
ouse+0.214
Neg logit contributions
 silence-0.096
 front-0.093
 security-0.092
 dark-0.089
 darkness-0.088
Token the
Token ID262
Feature activation+6.99e-02

Pos logit contributions
ish+0.162
anda+0.160
an+0.148
ouse+0.147
amb+0.147
Neg logit contributions
 security-0.057
 stairs-0.055
 darkness-0.054
 silence-0.054
 darker-0.053
Token g
Token ID308
Feature activation+1.19e-01

Pos logit contributions
ish+0.162
anda+0.157
o+0.145
an+0.143
amb+0.142
Neg logit contributions
 darker-0.092
 stairs-0.090
 security-0.090
 narrower-0.088
 darkness-0.088
Tokenaz
Token ID1031
Feature activation+1.51e-01

Pos logit contributions
ish+0.219
o+0.208
pp+0.200
lets+0.199
pper+0.199
Neg logit contributions
 stairs-0.175
 darkness-0.169
 security-0.159
 silence-0.155
 front-0.154
Tokeneb
Token ID1765
Feature activation+1.33e-01

Pos logit contributions
ish+0.242
anda+0.213
sters+0.206
ishes+0.204
oppy+0.195
Neg logit contributions
 stairs-0.278
 top-0.277
 front-0.273
 opposite-0.268
 security-0.263
Tokeno
Token ID78
Feature activation+9.49e-02

Pos logit contributions
ish+0.170
amb+0.153
anda+0.140
unn+0.135
imm+0.132
Neg logit contributions
 front-0.236
 other-0.227
 top-0.226
 usual-0.226
 security-0.225
Token and
Token ID290
Feature activation+4.66e-02

Pos logit contributions
anda+0.224
ish+0.220
amb+0.199
ouse+0.196
am+0.195
Neg logit contributions
 front-0.108
 silence-0.102
 security-0.101
 darkness-0.101
 corner-0.096
Token went
Token ID1816
Feature activation+1.66e-02

Pos logit contributions
ish+0.106
anda+0.105
an+0.095
amb+0.090
ouse+0.090
Neg logit contributions
 security-0.069
 darkness-0.066
 silence-0.065
 stairs-0.064
 darker-0.064
Token back
Token ID736
Feature activation+4.85e-02

Pos logit contributions
ish+0.039
anda+0.038
an+0.035
o+0.035
amb+0.034
Neg logit contributions
 stairs-0.022
 security-0.022
 darker-0.022
 darkness-0.021
 front-0.020
Token't
Token ID470
Feature activation+7.73e-02

Pos logit contributions
ish+0.328
anda+0.313
ishes+0.298
lets+0.294
o+0.288
Neg logit contributions
Oh-0.094
 security-0.093
 front-0.093
Time-0.092
 darkness-0.090
Token cry
Token ID3960
Feature activation+9.66e-02

Pos logit contributions
anda+0.194
ish+0.187
o+0.175
ouse+0.174
am+0.171
Neg logit contributions
 dark-0.091
 damage-0.090
 darkness-0.089
 silence-0.089
 darker-0.084
Token,
Token ID11
Feature activation+6.22e-02

Pos logit contributions
ish+0.226
anda+0.225
an+0.220
am+0.215
seed+0.215
Neg logit contributions
 dark-0.108
 darker-0.107
 harder-0.105
 silence-0.103
 hard-0.100
Token boys
Token ID6510
Feature activation+9.90e-02

Pos logit contributions
ish+0.147
anda+0.143
am+0.133
ouse+0.132
an+0.131
Neg logit contributions
 darker-0.083
 silence-0.083
 stairs-0.083
 darkness-0.081
 security-0.080
Token.
Token ID13
Feature activation+6.97e-02

Pos logit contributions
ouse+0.251
anda+0.251
amb+0.249
ish+0.248
orse+0.244
Neg logit contributions
 opposite-0.083
 silence-0.082
 security-0.082
 stairs-0.078
 dark-0.078
Token Acc
Token ID6366
Feature activation+1.51e-01

Pos logit contributions
ish+0.168
anda+0.155
pp+0.155
an+0.154
ese+0.154
Neg logit contributions
Oh-0.083
 silence-0.080
Where-0.078
 tow-0.076
 opposite-0.074
Tokenidents
Token ID3231
Feature activation+7.98e-02

Pos logit contributions
adow+0.356
anda+0.332
at+0.310
bug+0.303
ppy+0.300
Neg logit contributions
 harder-0.165
 louder-0.162
 disaster-0.154
 darker-0.152
What-0.144
Token happen
Token ID1645
Feature activation+7.22e-02

Pos logit contributions
ish+0.208
anda+0.203
amb+0.194
o+0.186
am+0.186
Neg logit contributions
 stairs-0.091
 darker-0.090
 silence-0.082
 dark-0.082
 security-0.080
Token.
Token ID13
Feature activation+5.88e-02

Pos logit contributions
anda+0.177
ish+0.174
amb+0.170
ouse+0.160
o+0.159
Neg logit contributions
 darker-0.089
 dark-0.082
 silence-0.080
 stairs-0.080
 darkness-0.079
Token I
Token ID314
Feature activation+5.69e-02

Pos logit contributions
ish+0.133
anda+0.126
an+0.121
adow+0.116
ouse+0.116
Neg logit contributions
Oh-0.081
 darker-0.079
Where-0.078
 silence-0.078
 darkness-0.077
Token have
Token ID423
Feature activation+2.08e-02

Pos logit contributions
anda+0.122
ish+0.122
amb+0.112
an+0.110
am+0.110
Neg logit contributions
 darker-0.082
 silence-0.080
 darkness-0.078
 narrower-0.076
 dark-0.076
Token to
Token ID284
Feature activation+6.77e-02

Pos logit contributions
anda+0.311
ish+0.309
amb+0.288
an+0.283
ouse+0.282
Neg logit contributions
 stairs-0.121
 darkness-0.120
 silence-0.118
 darker-0.117
 security-0.113
Token eat
Token ID4483
Feature activation+5.60e-02

Pos logit contributions
anda+0.163
ish+0.162
amb+0.149
o+0.148
ouse+0.145
Neg logit contributions
 stairs-0.086
 darkness-0.086
 security-0.085
 silence-0.084
 darker-0.083
Token the
Token ID262
Feature activation+7.33e-02

Pos logit contributions
ish+0.140
anda+0.136
o+0.126
amb+0.125
an+0.124
Neg logit contributions
 darker-0.065
 stairs-0.064
 security-0.063
 darkness-0.062
 narrower-0.061
Token y
Token ID331
Feature activation+1.27e-01

Pos logit contributions
ish+0.173
anda+0.168
o+0.157
amb+0.156
an+0.153
Neg logit contributions
 darker-0.094
 stairs-0.094
 security-0.092
 darkness-0.092
 narrower-0.089
Tokenummy
Token ID13513
Feature activation+9.54e-02

Pos logit contributions
anda+0.240
ish+0.223
ouse+0.219
o+0.217
orse+0.217
Neg logit contributions
 stairs-0.211
 darkness-0.209
 darker-0.205
 silence-0.204
 front-0.199
Token ins
Token ID1035
Feature activation+1.49e-01

Pos logit contributions
anda+0.239
ish+0.231
amb+0.226
o+0.221
am+0.221
Neg logit contributions
 stairs-0.111
 darkness-0.105
 silence-0.104
 darker-0.103
 dark-0.101
Tokenides
Token ID1460
Feature activation+8.80e-02

Pos logit contributions
anda+0.252
ish+0.249
o+0.228
an+0.211
ouse+0.211
Neg logit contributions
 month-0.229
 months-0.226
 handy-0.224
 stairs-0.224
 darkness-0.221
Token.
Token ID13
Feature activation+7.57e-02

Pos logit contributions
anda+0.190
ish+0.188
an+0.172
amb+0.171
am+0.167
Neg logit contributions
 stairs-0.132
 darker-0.128
 darkness-0.125
 front-0.122
 security-0.121
Token From
Token ID3574
Feature activation+4.71e-02

Pos logit contributions
ish+0.196
anda+0.188
an+0.180
ouse+0.172
ese+0.166
Neg logit contributions
 stairs-0.085
 silence-0.082
 darkness-0.081
Where-0.081
Oh-0.080
Token that
Token ID326
Feature activation+3.29e-02

Pos logit contributions
ish+0.109
anda+0.106
ouse+0.102
amb+0.099
o+0.096
Neg logit contributions
 darkness-0.063
 darker-0.063
 security-0.061
 silence-0.060
 opposite-0.060
Token day
Token ID1110
Feature activation+2.28e-02

Pos logit contributions
ish+0.088
anda+0.087
amb+0.081
an+0.080
am+0.080
Neg logit contributions
 darker-0.032
 stairs-0.032
 darkness-0.031
 security-0.030
 silence-0.030
Token give
Token ID1577
Feature activation+6.73e-02

Pos logit contributions
anda+0.217
ish+0.217
amb+0.203
ouse+0.198
o+0.196
Neg logit contributions
 stairs-0.107
 silence-0.104
 darkness-0.103
 opposite-0.103
 darker-0.102
Token to
Token ID284
Feature activation+6.31e-02

Pos logit contributions
anda+0.180
ish+0.179
amb+0.169
ouse+0.168
an+0.165
Neg logit contributions
 darker-0.060
 stairs-0.060
 darkness-0.059
 silence-0.058
 security-0.057
Token the
Token ID262
Feature activation+5.98e-02

Pos logit contributions
ish+0.181
anda+0.180
amb+0.172
ouse+0.169
an+0.167
Neg logit contributions
 stairs-0.052
 silence-0.050
 opposite-0.049
 darkness-0.049
 security-0.047
Token rh
Token ID9529
Feature activation+1.19e-01

Pos logit contributions
ish+0.137
anda+0.133
o+0.123
amb+0.122
an+0.121
Neg logit contributions
 darker-0.081
 stairs-0.080
 security-0.079
 darkness-0.078
 narrower-0.077
Tokenin
Token ID259
Feature activation+7.93e-02

Pos logit contributions
ish+0.196
anda+0.188
amb+0.184
anna+0.181
am+0.172
Neg logit contributions
 security-0.226
 stairs-0.210
 darker-0.205
 opposite-0.204
Where-0.202
Tokenoc
Token ID420
Feature activation+1.48e-01

Pos logit contributions
anda+0.091
ish+0.089
anna+0.079
amb+0.077
am+0.076
Neg logit contributions
 front-0.187
 stairs-0.184
 silence-0.183
 security-0.183
 lot-0.182
Tokeneros
Token ID27498
Feature activation+9.99e-02

Pos logit contributions
anda+0.164
ouse+0.163
ish+0.158
adow+0.158
anna+0.148
Neg logit contributions
 security-0.349
 front-0.334
 stairs-0.334
 opposite-0.332
 darkness-0.327
Token.
Token ID13
Feature activation+6.26e-02

Pos logit contributions
ish+0.246
anda+0.245
amb+0.235
ouse+0.229
an+0.225
Neg logit contributions
 silence-0.105
 stairs-0.104
 security-0.104
 opposite-0.102
 darkness-0.102
Token Lily
Token ID20037
Feature activation+8.10e-02

Pos logit contributions
ish+0.154
anda+0.146
an+0.138
ouse+0.133
am+0.130
Neg logit contributions
 stairs-0.081
 darker-0.077
 opposite-0.076
 silence-0.076
 darkness-0.075
Token walked
Token ID6807
Feature activation+4.03e-03

Pos logit contributions
ish+0.190
anda+0.189
amb+0.182
orse+0.178
ouse+0.176
Neg logit contributions
 stairs-0.087
 security-0.085
 opposite-0.081
 front-0.079
 darkness-0.079
Token to
Token ID284
Feature activation+2.39e-02

Pos logit contributions
ish+0.008
anda+0.008
o+0.008
amb+0.007
an+0.007
Neg logit contributions
 darker-0.006
 darkness-0.006
 security-0.006
 narrower-0.006
 stairs-0.006
Token giving
Token ID3501
Feature activation+5.92e-02

Pos logit contributions
ish+0.132
anda+0.128
ouse+0.119
amb+0.118
o+0.114
Neg logit contributions
 front-0.073
 silence-0.071
 stairs-0.071
 darkness-0.070
 security-0.068
Token him
Token ID683
Feature activation+5.40e-02

Pos logit contributions
anda+0.132
ish+0.129
orse+0.124
amb+0.123
ouse+0.123
Neg logit contributions
 darkness-0.079
 opposite-0.077
 silence-0.077
 front-0.077
 dark-0.076
Token too
Token ID1165
Feature activation+3.17e-02

Pos logit contributions
ish+0.152
anda+0.152
amb+0.149
orse+0.143
ouse+0.142
Neg logit contributions
 silence-0.038
 darkness-0.038
 darker-0.038
 front-0.037
 opposite-0.036
Token many
Token ID867
Feature activation+2.71e-02

Pos logit contributions
ish+0.077
anda+0.077
o+0.074
am+0.071
ouse+0.069
Neg logit contributions
 darker-0.036
 security-0.034
 narrower-0.034
 stairs-0.033
Where-0.033
Token ch
Token ID442
Feature activation+8.25e-02

Pos logit contributions
ish+0.060
anda+0.058
o+0.053
amb+0.053
an+0.052
Neg logit contributions
 darker-0.039
 stairs-0.039
 security-0.039
 darkness-0.038
 narrower-0.037
Tokenocol
Token ID4668
Feature activation+1.47e-01

Pos logit contributions
ci+0.158
ouse+0.156
ish+0.153
anda+0.153
amb+0.149
Neg logit contributions
 danger-0.114
 darkness-0.113
 darker-0.109
 dark-0.107
 security-0.106
Tokenates
Token ID689
Feature activation+8.14e-02

Pos logit contributions
anda+0.276
ish+0.270
ouse+0.250
an+0.240
anna+0.238
Neg logit contributions
 front-0.217
 security-0.213
 darker-0.212
 stairs-0.211
 darkness-0.209
Token."
Token ID526
Feature activation+7.97e-02

Pos logit contributions
ish+0.179
anda+0.179
amb+0.165
an+0.164
am+0.161
Neg logit contributions
 stairs-0.112
 front-0.106
 security-0.105
 darkness-0.104
 darker-0.102
Token The
Token ID383
Feature activation+4.27e-02

Pos logit contributions
ish+0.233
anda+0.221
an+0.213
ouse+0.201
ese+0.197
Neg logit contributions
 opposite-0.066
Oh-0.064
 silence-0.062
 front-0.061
Where-0.061
Token king
Token ID5822
Feature activation+1.01e-01

Pos logit contributions
ish+0.097
anda+0.092
amb+0.088
o+0.087
ouse+0.085
Neg logit contributions
 stairs-0.060
 security-0.060
 darker-0.059
 darkness-0.058
 narrower-0.056
Token learned
Token ID4499
Feature activation+2.20e-02

Pos logit contributions
ish+0.212
amb+0.197
anda+0.196
ans+0.191
sters+0.186
Neg logit contributions
 security-0.127
 moral-0.121
 silence-0.121
 darkness-0.119
 opposite-0.116
Token.
Token ID13
Feature activation+6.91e-02

Pos logit contributions
ish+0.179
ouse+0.169
anda+0.168
amb+0.165
an+0.161
Neg logit contributions
 stairs-0.134
 security-0.131
 darker-0.130
 front-0.129
 silence-0.129
Token The
Token ID383
Feature activation+4.88e-02

Pos logit contributions
ish+0.149
anda+0.141
an+0.136
ouse+0.133
am+0.126
Neg logit contributions
 stairs-0.100
Where-0.099
 silence-0.099
Oh-0.098
 front-0.097
Token more
Token ID517
Feature activation+4.96e-02

Pos logit contributions
ish+0.110
anda+0.107
an+0.097
o+0.096
ouse+0.093
Neg logit contributions
 darker-0.068
 security-0.066
 stairs-0.066
 narrower-0.066
 darkness-0.065
Token,
Token ID11
Feature activation+2.70e-02

Pos logit contributions
ish+0.118
anda+0.111
amb+0.111
o+0.107
an+0.105
Neg logit contributions
 stairs-0.070
 darkness-0.063
 security-0.063
 darker-0.063
 silence-0.062
Token the
Token ID262
Feature activation+5.22e-02

Pos logit contributions
ish+0.066
anda+0.065
amb+0.061
o+0.060
ouse+0.060
Neg logit contributions
 darker-0.033
 stairs-0.033
 darkness-0.032
 security-0.032
 narrower-0.031
Token mer
Token ID4017
Feature activation+1.46e-01

Pos logit contributions
ish+0.120
anda+0.118
an+0.106
o+0.106
ouse+0.103
Neg logit contributions
 darker-0.068
 security-0.067
 stairs-0.066
 narrower-0.066
 darkness-0.065
Tokenrier
Token ID5277
Feature activation+1.10e-01

Pos logit contributions
o+0.227
seed+0.191
ongo+0.190
ish+0.184
pper+0.183
Neg logit contributions
 security-0.273
 stairs-0.268
 clues-0.266
 darker-0.254
 seconds-0.252
Token.
Token ID13
Feature activation+3.82e-02

Pos logit contributions
ish+0.246
anda+0.237
amb+0.222
ouse+0.222
am+0.219
Neg logit contributions
 security-0.143
 silence-0.142
 stairs-0.139
 darkness-0.132
 front-0.131
Token Come
Token ID7911
Feature activation+4.34e-02

Pos logit contributions
ish+0.082
anda+0.079
an+0.072
o+0.071
ouse+0.071
Neg logit contributions
 stairs-0.058
 darker-0.058
 darkness-0.057
 security-0.056
 narrower-0.055
Token to
Token ID284
Feature activation+3.57e-02

Pos logit contributions
ish+0.101
anda+0.100
an+0.093
anna+0.092
am+0.091
Neg logit contributions
 front-0.054
 stairs-0.053
 darker-0.052
 security-0.052
Where-0.051
Token my
Token ID616
Feature activation+5.26e-02

Pos logit contributions
ish+0.092
anda+0.090
o+0.084
amb+0.083
an+0.083
Neg logit contributions
 darker-0.038
 stairs-0.038
 security-0.038
 darkness-0.037
 narrower-0.036
Token am
Token ID716
Feature activation+4.86e-02

Pos logit contributions
anda+0.208
ish+0.207
an+0.194
amb+0.192
ese+0.186
Neg logit contributions
 silence-0.120
 darker-0.119
 darkness-0.116
 dark-0.116
 lot-0.114
Token taking
Token ID2263
Feature activation+4.32e-02

Pos logit contributions
ish+0.122
anda+0.116
amb+0.112
an+0.111
ouse+0.108
Neg logit contributions
 security-0.058
 darker-0.057
 stairs-0.056
 darkness-0.055
 silence-0.055
Token the
Token ID262
Feature activation+6.60e-02

Pos logit contributions
ish+0.107
anda+0.104
o+0.097
amb+0.096
an+0.096
Neg logit contributions
 darker-0.050
 stairs-0.050
 security-0.050
 darkness-0.049
 narrower-0.047
Token rh
Token ID9529
Feature activation+1.16e-01

Pos logit contributions
ish+0.151
anda+0.146
o+0.134
an+0.133
amb+0.133
Neg logit contributions
 darker-0.090
 security-0.088
 stairs-0.088
 darkness-0.086
 narrower-0.086
Tokenin
Token ID259
Feature activation+7.92e-02

Pos logit contributions
anda+0.184
ish+0.184
anna+0.175
amb+0.169
ouse+0.168
Neg logit contributions
 security-0.219
 stairs-0.202
 danger-0.199
 darker-0.198
 opposite-0.197
Tokenoc
Token ID420
Feature activation+1.46e-01

Pos logit contributions
anda+0.089
ish+0.082
anna+0.074
ouse+0.071
am+0.069
Neg logit contributions
 front-0.192
 stairs-0.188
 security-0.188
 darkness-0.185
 silence-0.185
Tokeneros
Token ID27498
Feature activation+1.01e-01

Pos logit contributions
ouse+0.171
anda+0.167
adow+0.163
orse+0.155
anna+0.154
Neg logit contributions
 security-0.327
 front-0.315
 stairs-0.309
 opposite-0.306
 darkness-0.306
Token to
Token ID284
Feature activation+4.05e-02

Pos logit contributions
anda+0.238
ish+0.232
amb+0.218
ouse+0.215
anna+0.209
Neg logit contributions
 security-0.122
 stairs-0.121
 front-0.120
 darkness-0.118
 silence-0.116
Token a
Token ID257
Feature activation+2.53e-02

Pos logit contributions
ish+0.101
anda+0.098
o+0.093
an+0.091
amb+0.091
Neg logit contributions
 darker-0.045
 stairs-0.044
 narrower-0.043
 security-0.043
 darkness-0.042
Token new
Token ID649
Feature activation+4.28e-02

Pos logit contributions
ish+0.061
anda+0.060
o+0.055
an+0.054
ouse+0.053
Neg logit contributions
 darker-0.030
 stairs-0.029
 narrower-0.029
 darkness-0.028
 security-0.027
Token home
Token ID1363
Feature activation+8.04e-02

Pos logit contributions
ish+0.104
anda+0.101
o+0.094
amb+0.093
ouse+0.092
Neg logit contributions
 stairs-0.052
 darker-0.052
 security-0.051
 darkness-0.050
 narrower-0.049
Token for
Token ID329
Feature activation+6.29e-02

Pos logit contributions
ish+0.247
anda+0.247
amb+0.232
an+0.225
anna+0.222
Neg logit contributions
 stairs-0.145
 front-0.129
 silence-0.129
 darkness-0.128
 security-0.126
Token Santa
Token ID8909
Feature activation+1.10e-01

Pos logit contributions
ish+0.147
anda+0.143
o+0.132
amb+0.131
an+0.130
Neg logit contributions
 darker-0.082
 stairs-0.082
 security-0.081
 darkness-0.080
 narrower-0.078
Token and
Token ID290
Feature activation+7.90e-02

Pos logit contributions
anda+0.277
amb+0.275
ish+0.269
an+0.253
imm+0.250
Neg logit contributions
 silence-0.090
 security-0.089
 front-0.089
 last-0.089
 oxygen-0.088
Token his
Token ID465
Feature activation+6.75e-02

Pos logit contributions
ish+0.205
anda+0.204
an+0.193
amb+0.191
ouse+0.188
Neg logit contributions
 security-0.087
 stairs-0.086
 darkness-0.084
 silence-0.082
 dark-0.080
Token re
Token ID302
Feature activation+1.29e-01

Pos logit contributions
ish+0.155
anda+0.151
amb+0.145
o+0.142
am+0.140
Neg logit contributions
 stairs-0.090
 darkness-0.087
 security-0.087
 silence-0.086
 front-0.084
Tokenind
Token ID521
Feature activation+1.46e-01

Pos logit contributions
ish+0.233
ouncing+0.226
ouse+0.219
atmeal+0.219
amb+0.218
Neg logit contributions
 security-0.170
 danger-0.163
 front-0.157
 darkness-0.144
 silence-0.143
Tokeneer
Token ID28153
Feature activation+9.43e-02

Pos logit contributions
ish+0.207
anda+0.198
ouse+0.183
orse+0.172
am+0.168
Neg logit contributions
 stairs-0.307
 silence-0.301
 darkness-0.301
 security-0.299
 opposite-0.291
Token,"
Token ID553
Feature activation+8.43e-02

Pos logit contributions
ish+0.222
anda+0.220
amb+0.211
ouse+0.204
an+0.204
Neg logit contributions
 security-0.108
 stairs-0.107
 silence-0.106
 darkness-0.106
 front-0.104
Token Anna
Token ID11735
Feature activation+1.01e-01

Pos logit contributions
ish+0.216
anda+0.209
an+0.193
ouse+0.187
am+0.186
Neg logit contributions
 security-0.104
 opposite-0.099
Oh-0.098
 silence-0.096
Where-0.096
Token says
Token ID1139
Feature activation+5.74e-02

Pos logit contributions
ish+0.220
anda+0.212
an+0.199
ouse+0.193
ci+0.188
Neg logit contributions
 opposite-0.153
 silence-0.148
 front-0.147
 security-0.146
 darkness-0.144
Token to
Token ID284
Feature activation+5.11e-02

Pos logit contributions
ish+0.138
anda+0.137
an+0.126
amb+0.121
ouse+0.120
Neg logit contributions
 security-0.070
 front-0.068
 silence-0.068
 darker-0.067
 darkness-0.067
Token before
Token ID878
Feature activation+6.75e-02

Pos logit contributions
ish+0.233
anda+0.227
amb+0.211
an+0.209
o+0.204
Neg logit contributions
 stairs-0.144
 darker-0.133
 darkness-0.133
 security-0.132
 silence-0.131
Token dinner
Token ID8073
Feature activation+8.34e-02

Pos logit contributions
ish+0.153
anda+0.148
o+0.136
an+0.135
amb+0.135
Neg logit contributions
 stairs-0.095
 darker-0.095
 security-0.093
 darkness-0.092
 narrower-0.090
Token.
Token ID13
Feature activation+7.37e-02

Pos logit contributions
ish+0.186
anda+0.183
amb+0.169
an+0.166
o+0.164
Neg logit contributions
 stairs-0.131
 darkness-0.120
 security-0.120
 darker-0.120
 silence-0.119
Token<|endoftext|>
Token ID50256
Feature activation+7.56e-02

Pos logit contributions
ish+0.189
anda+0.175
an+0.171
ouse+0.159
am+0.159
Neg logit contributions
 stairs-0.093
 darkness-0.086
 silence-0.086
Where-0.086
 darker-0.084
TokenS
Token ID50
Feature activation+1.02e-01

Pos logit contributions
ish+0.180
anda+0.175
amb+0.160
an+0.160
o+0.159
Neg logit contributions
 stairs-0.097
 darker-0.096
 security-0.094
 darkness-0.094
 narrower-0.091
Tokenaman
Token ID10546
Feature activation+1.45e-01

Pos logit contributions
ish+0.192
amb+0.178
ouse+0.172
anda+0.172
ci+0.168
Neg logit contributions
 front-0.170
 stairs-0.161
 security-0.157
On-0.157
 top-0.156
Tokentha
Token ID12898
Feature activation+1.08e-01

Pos logit contributions
ish+0.309
ouse+0.291
amb+0.276
aring+0.273
lets+0.268
Neg logit contributions
 front-0.202
 top-0.200
 security-0.185
 stairs-0.175
 usual-0.167
Token was
Token ID373
Feature activation+3.72e-02

Pos logit contributions
anda+0.260
ish+0.259
ouse+0.249
aring+0.243
imm+0.240
Neg logit contributions
 front-0.114
 stairs-0.108
 darkness-0.104
 corner-0.101
 dark-0.094
Token a
Token ID257
Feature activation+2.58e-02

Pos logit contributions
ish+0.090
anda+0.089
an+0.081
ouse+0.080
lets+0.077
Neg logit contributions
 dark-0.048
 darker-0.048
 darkness-0.047
 stairs-0.047
 front-0.046
Token lazy
Token ID16931
Feature activation+6.03e-02

Pos logit contributions
ish+0.064
anda+0.063
o+0.059
amb+0.059
am+0.058
Neg logit contributions
 darker-0.029
 stairs-0.029
 security-0.029
 darkness-0.028
 narrower-0.027
Token girl
Token ID2576
Feature activation+7.55e-02

Pos logit contributions
ish+0.130
anda+0.127
am+0.125
amb+0.120
ouse+0.120
Neg logit contributions
 stairs-0.086
 darkness-0.084
 darker-0.084
 security-0.083
 silence-0.083
Token stayed
Token ID9658
Feature activation+7.68e-02

Pos logit contributions
 front+0.004
 silence+0.004
 darkness+0.004
 security+0.004
 stairs+0.004
Neg logit contributions
anda-0.010
ish-0.009
amb-0.009
ouse-0.009
anna-0.009
Token away
Token ID1497
Feature activation+9.25e-02

Pos logit contributions
anda+0.184
ish+0.183
amb+0.168
o+0.167
ouse+0.163
Neg logit contributions
 darker-0.103
 darkness-0.098
 dark-0.095
 silence-0.094
 stairs-0.094
Token from
Token ID422
Feature activation+7.33e-02

Pos logit contributions
ish+0.222
anda+0.216
amb+0.199
an+0.198
ouse+0.197
Neg logit contributions
 darker-0.123
 stairs-0.123
 darkness-0.122
 silence-0.120
 security-0.118
Token the
Token ID262
Feature activation+8.08e-02

Pos logit contributions
ish+0.191
anda+0.185
o+0.172
amb+0.172
ouse+0.171
Neg logit contributions
 darker-0.079
 stairs-0.078
 security-0.077
 darkness-0.076
 narrower-0.073
Token other
Token ID584
Feature activation+9.12e-02

Pos logit contributions
ish+0.187
anda+0.183
an+0.168
o+0.166
ouse+0.162
Neg logit contributions
 darker-0.103
 narrower-0.100
 security-0.100
 stairs-0.099
 darkness-0.096
Token ge
Token ID4903
Feature activation+1.45e-01

Pos logit contributions
ish+0.212
anda+0.206
amb+0.194
o+0.192
ouse+0.191
Neg logit contributions
 stairs-0.122
 darkness-0.121
 darker-0.120
 security-0.120
 silence-0.115
Tokenese
Token ID2771
Feature activation+1.12e-01

Pos logit contributions
ouse+0.136
ish+0.132
amb+0.122
anda+0.118
imm+0.098
Neg logit contributions
 darkness-0.344
 front-0.344
 opposite-0.344
 silence-0.342
 danger-0.341
Token.
Token ID13
Feature activation+7.38e-02

Pos logit contributions
anda+0.265
ish+0.265
amb+0.243
ouse+0.242
an+0.240
Neg logit contributions
 darkness-0.135
 silence-0.132
 stairs-0.131
 security-0.131
 front-0.130
Token\n
Token ID198
Feature activation+6.12e-02

Pos logit contributions
ish+0.186
anda+0.174
an+0.166
ouse+0.162
am+0.154
Neg logit contributions
 stairs-0.091
 silence-0.087
 darkness-0.086
 opposite-0.085
Where-0.085
Token\n
Token ID198
Feature activation+6.05e-02

Pos logit contributions
ish+0.156
anda+0.147
amb+0.136
ouse+0.135
an+0.135
Neg logit contributions
 stairs-0.072
 darker-0.071
 security-0.070
 darkness-0.069
 narrower-0.067
Token"
Token ID1
Feature activation+4.91e-02

Pos logit contributions
ish+0.155
anda+0.152
o+0.143
amb+0.141
an+0.140
Neg logit contributions
 darker-0.066
 stairs-0.064
 security-0.063
 darkness-0.063
 narrower-0.061
Token not
Token ID407
Feature activation+9.96e-02

Pos logit contributions
 basement+0.100
 darkness+0.099
 stairs+0.095
 darker+0.094
 narrower+0.093
Neg logit contributions
ish-0.127
anda-0.114
am-0.110
o-0.110
an-0.103
Token know
Token ID760
Feature activation+6.74e-02

Pos logit contributions
anda+0.246
ish+0.244
amb+0.230
ouse+0.222
an+0.220
Neg logit contributions
 security-0.125
 silence-0.124
 danger-0.119
 dark-0.119
 darker-0.116
Token why
Token ID1521
Feature activation+5.33e-02

Pos logit contributions
ish+0.154
anda+0.149
o+0.138
amb+0.138
an+0.136
Neg logit contributions
 darker-0.092
 stairs-0.092
 security-0.092
 darkness-0.090
 narrower-0.088
Token his
Token ID465
Feature activation+9.38e-02

Pos logit contributions
ish+0.123
anda+0.119
amb+0.112
an+0.110
o+0.108
Neg logit contributions
 stairs-0.074
 darker-0.072
 darkness-0.072
 security-0.072
 silence-0.070
Token t
Token ID256
Feature activation+1.42e-01

Pos logit contributions
ish+0.218
anda+0.207
amb+0.201
o+0.198
pp+0.196
Neg logit contributions
 security-0.122
 silence-0.120
 stairs-0.120
 darkness-0.118
 front-0.115
Tokeneddy
Token ID21874
Feature activation+1.45e-01

Pos logit contributions
orse+0.294
ans+0.270
ish+0.264
orses+0.263
ese+0.257
Neg logit contributions
 security-0.183
 stairs-0.180
 front-0.175
Where-0.174
 corner-0.164
Token bear
Token ID6842
Feature activation+1.02e-01

Pos logit contributions
anda+0.329
amb+0.307
aring+0.303
am+0.299
orse+0.299
Neg logit contributions
 front-0.196
 silence-0.193
 darkness-0.188
 security-0.182
 other-0.181
Token was
Token ID373
Feature activation+5.06e-02

Pos logit contributions
amb+0.230
anda+0.225
ish+0.224
an+0.204
anna+0.202
Neg logit contributions
 darkness-0.131
 security-0.131
 silence-0.131
 dark-0.124
 danger-0.123
Token sad
Token ID6507
Feature activation+8.16e-02

Pos logit contributions
ish+0.121
anda+0.118
amb+0.111
an+0.108
o+0.108
Neg logit contributions
 darker-0.066
 stairs-0.065
 darkness-0.065
 security-0.064
 silence-0.062
Token.
Token ID13
Feature activation+7.57e-02

Pos logit contributions
anda+0.209
ish+0.206
amb+0.195
an+0.194
orse+0.191
Neg logit contributions
 stairs-0.091
 darker-0.089
 dark-0.088
 silence-0.087
 darkness-0.087
Token He
Token ID679
Feature activation+6.83e-02

Pos logit contributions
ish+0.193
anda+0.175
an+0.172
ouse+0.165
ese+0.161
Neg logit contributions
 stairs-0.097
 darkness-0.093
 silence-0.093
 opposite-0.092
Oh-0.091
Token it
Token ID340
Feature activation+5.59e-02

Pos logit contributions
ish+0.193
anda+0.192
ouse+0.182
amb+0.180
an+0.178
Neg logit contributions
 stairs-0.068
 darkness-0.067
 security-0.067
 silence-0.066
 darker-0.066
Token to
Token ID284
Feature activation+6.52e-02

Pos logit contributions
ish+0.144
anda+0.143
amb+0.134
o+0.130
an+0.130
Neg logit contributions
 stairs-0.058
 darkness-0.057
 darker-0.057
 silence-0.056
 security-0.056
Token the
Token ID262
Feature activation+7.83e-02

Pos logit contributions
ish+0.170
anda+0.167
ouse+0.153
amb+0.153
o+0.153
Neg logit contributions
 security-0.070
 stairs-0.070
 darkness-0.068
 silence-0.068
 darker-0.068
Token rh
Token ID9529
Feature activation+1.14e-01

Pos logit contributions
ish+0.182
anda+0.176
o+0.162
an+0.161
amb+0.159
Neg logit contributions
 darker-0.103
 stairs-0.101
 security-0.100
 narrower-0.099
 darkness-0.098
Tokenin
Token ID259
Feature activation+8.08e-02

Pos logit contributions
ish+0.186
anda+0.181
amb+0.176
anna+0.174
am+0.163
Neg logit contributions
 security-0.219
 stairs-0.201
 darker-0.201
 darkness-0.199
 danger-0.199
Tokenoc
Token ID420
Feature activation+1.45e-01

Pos logit contributions
anda+0.090
ish+0.090
amb+0.077
anna+0.073
ouse+0.072
Neg logit contributions
 front-0.190
 security-0.189
 silence-0.187
 opposite-0.186
 darkness-0.185
Tokeneros
Token ID27498
Feature activation+9.75e-02

Pos logit contributions
ouse+0.167
anda+0.165
adow+0.163
anna+0.157
ish+0.156
Neg logit contributions
 security-0.332
 front-0.315
 stairs-0.311
 darkness-0.311
 opposite-0.311
Token.
Token ID13
Feature activation+5.84e-02

Pos logit contributions
ish+0.234
anda+0.234
amb+0.226
ouse+0.214
an+0.212
Neg logit contributions
 silence-0.107
 security-0.104
 opposite-0.102
 darkness-0.102
 many-0.101
Token When
Token ID1649
Feature activation+2.99e-02

Pos logit contributions
ish+0.143
anda+0.135
an+0.128
ouse+0.123
am+0.120
Neg logit contributions
 stairs-0.075
 security-0.070
 silence-0.070
 opposite-0.070
 darkness-0.070
Token he
Token ID339
Feature activation+5.23e-02

Pos logit contributions
ish+0.070
anda+0.068
o+0.064
amb+0.063
an+0.062
Neg logit contributions
 darker-0.039
 stairs-0.038
 security-0.037
 darkness-0.037
 narrower-0.037
Token tasted
Token ID29187
Feature activation+2.50e-02

Pos logit contributions
ish+0.117
anda+0.114
amb+0.108
an+0.105
anna+0.102
Neg logit contributions
 security-0.075
 darkness-0.072
 stairs-0.071
 darker-0.070
 front-0.069
Token
Token ID241
Feature activation+1.25e-01

Pos logit contributions
ish+0.309
ishes+0.267
anda+0.264
lets+0.235
seed+0.235
Neg logit contributions
 front-0.291
 opposite-0.278
 top-0.273
 security-0.271
 silence-0.264
TokenWhy
Token ID5195
Feature activation+5.99e-02

Pos logit contributions
ish+0.337
anda+0.329
an+0.300
ouse+0.296
am+0.292
Neg logit contributions
Oh-0.129
 stairs-0.129
Where-0.128
Time-0.126
 darker-0.125
Token don
Token ID836
Feature activation+7.94e-02

Pos logit contributions
anda+0.130
ish+0.130
amb+0.126
an+0.124
am+0.117
Neg logit contributions
 silence-0.087
 stairs-0.085
 lot-0.082
 opposite-0.081
 truth-0.081
Tokenâ
Token ID22940
Feature activation+1.50e-01

Pos logit contributions
ish+0.211
anda+0.207
an+0.192
o+0.188
adow+0.186
Neg logit contributions
 stairs-0.085
 security-0.082
 darker-0.081
 darkness-0.078
 front-0.078
Token
Token ID26391
Feature activation+1.30e-01

Pos logit contributions
anda+0.302
ish+0.274
lets+0.256
ouse+0.249
o+0.247
Neg logit contributions
 opposite-0.251
 silence-0.246
Time-0.245
 front-0.242
 darkness-0.239
Token
Token ID8151
Feature activation+1.44e-01

Pos logit contributions
ish+0.328
an+0.305
anda+0.288
ie+0.284
ishes+0.284
Neg logit contributions
On-0.153
Time-0.144
 front-0.137
To-0.137
Before-0.136
Tokent
Token ID83
Feature activation+7.46e-02

Pos logit contributions
ish+0.343
an+0.342
 Majesty+0.320
anda+0.317
ans+0.305
Neg logit contributions
 stairs-0.156
 darker-0.151
On-0.150
 lot-0.147
 month-0.146
Token you
Token ID345
Feature activation+7.54e-02

Pos logit contributions
ish+0.167
anda+0.166
an+0.153
amb+0.153
am+0.149
Neg logit contributions
 silence-0.102
 darkness-0.098
 dark-0.096
 stairs-0.096
Time-0.095
Token come
Token ID1282
Feature activation+6.78e-02

Pos logit contributions
anda+0.151
ish+0.150
an+0.140
amb+0.133
o+0.133
Neg logit contributions
 darkness-0.116
 silence-0.114
 stairs-0.114
 darker-0.113
 dark-0.112
Token with
Token ID351
Feature activation+4.17e-02

Pos logit contributions
ish+0.162
anda+0.160
an+0.146
am+0.144
orse+0.144
Neg logit contributions
 darker-0.084
 stairs-0.082
 front-0.082
 security-0.081
 silence-0.080
Token me
Token ID502
Feature activation+8.02e-02

Pos logit contributions
ish+0.092
anda+0.089
o+0.082
amb+0.081
an+0.080
Neg logit contributions
 darker-0.061
 stairs-0.060
 security-0.060
 darkness-0.059
 narrower-0.058
Token wanted
Token ID2227
Feature activation+7.97e-02

Pos logit contributions
ish+0.001
anda+0.001
o+0.001
an+0.001
ouse+0.001
Neg logit contributions
 darker-0.001
 darkness-0.001
 narrower-0.001
 silence-0.001
 security-0.001
Token to
Token ID284
Feature activation+8.38e-02

Pos logit contributions
ish+0.210
anda+0.204
amb+0.193
an+0.191
ouse+0.190
Neg logit contributions
 stairs-0.087
 silence-0.083
 security-0.083
 darker-0.083
 darkness-0.083
Token guess
Token ID4724
Feature activation+6.84e-02

Pos logit contributions
ish+0.206
anda+0.203
amb+0.187
ouse+0.184
o+0.183
Neg logit contributions
 stairs-0.109
 front-0.108
 silence-0.107
 opposite-0.104
 security-0.103
Token what
Token ID644
Feature activation+6.68e-02

Pos logit contributions
ish+0.167
anda+0.160
amb+0.157
an+0.153
ouse+0.151
Neg logit contributions
 stairs-0.085
 darker-0.083
 security-0.081
 silence-0.079
 opposite-0.078
Token the
Token ID262
Feature activation+8.57e-02

Pos logit contributions
ish+0.162
anda+0.155
amb+0.149
an+0.145
ouse+0.143
Neg logit contributions
 stairs-0.090
 security-0.086
 darker-0.085
 darkness-0.084
 silence-0.083
Token kitten
Token ID42143
Feature activation+1.44e-01

Pos logit contributions
ish+0.204
anda+0.198
o+0.184
amb+0.184
ouse+0.181
Neg logit contributions
 darker-0.109
 stairs-0.109
 security-0.107
 darkness-0.106
 narrower-0.103
Token was
Token ID373
Feature activation+6.15e-02

Pos logit contributions
anda+0.307
amb+0.307
ish+0.306
ouse+0.286
am+0.286
Neg logit contributions
 darkness-0.192
 silence-0.192
 stairs-0.191
 dark-0.185
 security-0.181
Token doing
Token ID1804
Feature activation+8.02e-02

Pos logit contributions
ish+0.148
anda+0.144
amb+0.136
ouse+0.133
o+0.131
Neg logit contributions
 stairs-0.079
 security-0.076
 darker-0.076
 darkness-0.076
 silence-0.075
Token.
Token ID13
Feature activation+7.60e-02

Pos logit contributions
anda+0.195
ish+0.194
amb+0.177
an+0.174
ouse+0.171
Neg logit contributions
 stairs-0.102
 darker-0.096
 front-0.095
 security-0.095
 darkness-0.094
Token
Token ID220
Feature activation+6.10e-02

Pos logit contributions
ish+0.196
anda+0.184
an+0.180
ouse+0.174
ese+0.167
Neg logit contributions
 stairs-0.090
Where-0.086
 opposite-0.085
Oh-0.085
 front-0.082
Token
Token ID198
Feature activation+5.60e-02

Pos logit contributions
ish+0.164
anda+0.154
an+0.148
amb+0.146
ouse+0.143
Neg logit contributions
 stairs-0.062
 darker-0.056
 opposite-0.055
 darkness-0.054
 silence-0.053
Tokenâ
Token ID22940
Feature activation+1.32e-01

Pos logit contributions
ish+0.263
anda+0.246
an+0.242
ouse+0.226
ese+0.225
Neg logit contributions
 stairs-0.109
 silence-0.104
Oh-0.104
Where-0.103
 darkness-0.103
Token
Token ID26391
Feature activation+1.06e-01

Pos logit contributions
ish+0.235
anda+0.230
ouse+0.205
sters+0.196
lets+0.194
Neg logit contributions
 darkness-0.238
 darker-0.236
Time-0.236
 opposite-0.233
 silence-0.232
Token
Token ID198
Feature activation+6.11e-02

Pos logit contributions
ish+0.285
an+0.249
ishes+0.240
anda+0.240
ie+0.237
Neg logit contributions
 stairs-0.109
 front-0.108
On-0.103
 top-0.100
Inside-0.097
Token
Token ID198
Feature activation+6.06e-02

Pos logit contributions
ish+0.140
anda+0.135
o+0.124
amb+0.123
an+0.123
Neg logit contributions
 darker-0.084
 stairs-0.083
 security-0.082
 darkness-0.081
 narrower-0.080
TokenM
Token ID44
Feature activation+8.76e-02

Pos logit contributions
ish+0.153
anda+0.151
o+0.142
amb+0.140
an+0.138
Neg logit contributions
 darker-0.067
 stairs-0.066
 security-0.065
 darkness-0.064
 narrower-0.062
Tokenaz
Token ID1031
Feature activation+1.44e-01

Pos logit contributions
ish+0.156
anda+0.138
o+0.135
ouse+0.133
an+0.131
Neg logit contributions
 stairs-0.160
 security-0.160
 darker-0.159
 darkness-0.157
Time-0.153
Tokenzy
Token ID7357
Feature activation+1.01e-01

Pos logit contributions
ish+0.212
anda+0.180
sters+0.174
ishes+0.170
ouse+0.170
Neg logit contributions
 stairs-0.268
 top-0.265
 opposite-0.259
 front-0.259
 hours-0.250
Tokenâ
Token ID22940
Feature activation+1.14e-01

Pos logit contributions
ish+0.213
anda+0.201
ouse+0.197
amb+0.196
an+0.187
Neg logit contributions
 darkness-0.140
 stairs-0.135
 silence-0.133
 front-0.133
 opposite-0.130
Token
Token ID26391
Feature activation+7.54e-02

Pos logit contributions
ish+0.189
anda+0.187
ouse+0.164
o+0.156
orse+0.154
Neg logit contributions
 darkness-0.222
 opposite-0.218
 darker-0.218
 silence-0.217
 stairs-0.214
Token
Token ID8151
Feature activation+8.57e-02

Pos logit contributions
ish+0.158
an+0.136
anda+0.133
ie+0.130
amb+0.128
Neg logit contributions
 stairs-0.121
 front-0.121
On-0.118
Inside-0.113
 top-0.111
Tokens
Token ID82
Feature activation+4.57e-02

Pos logit contributions
ish+0.165
an+0.148
anda+0.146
amb+0.136
sters+0.131
Neg logit contributions
 stairs-0.147
 opposite-0.140
 darker-0.139
 silence-0.137
 security-0.137
Token these
Token ID777
Feature activation+1.22e-01

Pos logit contributions
 stairs+0.066
 darker+0.066
 darkness+0.065
 security+0.065
 narrower+0.064
Neg logit contributions
ish-0.078
anda-0.075
o-0.070
an-0.067
am-0.067
Token toys
Token ID14958
Feature activation+1.19e-01

Pos logit contributions
ish+0.301
anda+0.294
amb+0.287
o+0.285
ouse+0.280
Neg logit contributions
 stairs-0.143
 darker-0.134
 security-0.133
 darkness-0.132
 opposite-0.129
Token and
Token ID290
Feature activation+7.85e-02

Pos logit contributions
ish+0.267
anda+0.262
amb+0.249
an+0.248
orse+0.240
Neg logit contributions
 stairs-0.172
 front-0.147
 security-0.145
 dark-0.142
 darkness-0.141
Token how
Token ID703
Feature activation+4.49e-02

Pos logit contributions
ish+0.190
anda+0.189
amb+0.184
an+0.178
o+0.171
Neg logit contributions
 stairs-0.105
 security-0.098
 darkness-0.096
 silence-0.096
 dark-0.095
Token your
Token ID534
Feature activation+7.48e-02

Pos logit contributions
ish+0.121
anda+0.117
amb+0.116
an+0.114
ouse+0.109
Neg logit contributions
 stairs-0.048
 darker-0.044
 security-0.042
 dark-0.042
 many-0.041
Token grand
Token ID4490
Feature activation+1.43e-01

Pos logit contributions
ish+0.177
anda+0.170
amb+0.169
o+0.166
am+0.159
Neg logit contributions
 stairs-0.099
 silence-0.091
 front-0.091
 security-0.090
 darkness-0.090
Tokenpa
Token ID8957
Feature activation+1.01e-01

Pos logit contributions
anda+0.280
ish+0.269
amb+0.250
ouse+0.245
adow+0.234
Neg logit contributions
 front-0.233
 silence-0.231
 stairs-0.231
 months-0.230
 weeks-0.223
Token and
Token ID290
Feature activation+5.01e-02

Pos logit contributions
anda+0.214
ish+0.211
amb+0.211
an+0.202
ouse+0.193
Neg logit contributions
 security-0.136
 stairs-0.133
 darkness-0.131
 silence-0.131
 front-0.127
Token I
Token ID314
Feature activation+6.87e-02

Pos logit contributions
ish+0.126
anda+0.126
amb+0.120
an+0.118
o+0.114
Neg logit contributions
 stairs-0.059
 security-0.059
 darkness-0.056
 silence-0.055
 darker-0.055
Token played
Token ID2826
Feature activation+3.83e-02

Pos logit contributions
ish+0.156
anda+0.155
an+0.146
amb+0.141
am+0.140
Neg logit contributions
 security-0.086
 silence-0.086
 stairs-0.085
 darkness-0.085
 darker-0.084
Token with
Token ID351
Feature activation+3.60e-02

Pos logit contributions
ish+0.090
anda+0.089
o+0.079
amb+0.079
an+0.079
Neg logit contributions
 stairs-0.051
 security-0.050
 darker-0.050
 silence-0.049
 opposite-0.048
Token on
Token ID319
Feature activation+8.63e-02

Pos logit contributions
 security+0.055
 silence+0.055
 darkness+0.054
 darker+0.053
 stairs+0.052
Neg logit contributions
anda-0.145
ish-0.145
ouse-0.134
amb-0.134
o-0.132
Token top
Token ID1353
Feature activation+9.97e-02

Pos logit contributions
ish+0.214
anda+0.209
o+0.196
am+0.192
an+0.192
Neg logit contributions
 darker-0.095
 security-0.092
 narrower-0.091
 stairs-0.090
 darkness-0.090
Token.
Token ID13
Feature activation+8.93e-02

Pos logit contributions
ish+0.253
anda+0.243
amb+0.235
ouse+0.230
an+0.226
Neg logit contributions
 stairs-0.134
 darker-0.121
 darkness-0.114
 front-0.112
 silence-0.111
Token The
Token ID383
Feature activation+5.18e-02

Pos logit contributions
ish+0.232
anda+0.219
an+0.213
ouse+0.200
ese+0.196
Neg logit contributions
 stairs-0.107
Where-0.101
 silence-0.099
 front-0.099
Oh-0.098
Token b
Token ID275
Feature activation+1.14e-01

Pos logit contributions
ish+0.118
anda+0.115
o+0.104
an+0.104
ouse+0.102
Neg logit contributions
 darker-0.071
 security-0.069
 stairs-0.069
 narrower-0.068
 darkness-0.068
Tokenunn
Token ID20935
Feature activation+1.43e-01

Pos logit contributions
ouse+0.142
ish+0.133
ci+0.133
orse+0.133
o+0.121
Neg logit contributions
 darkness-0.228
 stairs-0.228
 darker-0.220
 security-0.217
 danger-0.213
Tokenies
Token ID444
Feature activation+9.37e-02

Pos logit contributions
ouse+0.193
anda+0.169
ish+0.165
ese+0.159
atmeal+0.154
Neg logit contributions
 many-0.271
 other-0.269
 stairs-0.268
 months-0.257
 weeks-0.257
Token wanted
Token ID2227
Feature activation+1.70e-02

Pos logit contributions
anda+0.216
ish+0.214
ouse+0.200
o+0.199
amb+0.198
Neg logit contributions
 stairs-0.122
 darkness-0.121
 darker-0.119
 security-0.117
 silence-0.114
Token to
Token ID284
Feature activation+4.53e-02

Pos logit contributions
ish+0.036
anda+0.035
o+0.034
am+0.032
an+0.031
Neg logit contributions
 darker-0.025
 narrower-0.024
 security-0.024
 stairs-0.024
 darkness-0.023
Token have
Token ID423
Feature activation+1.60e-03

Pos logit contributions
ish+0.109
anda+0.108
o+0.099
amb+0.098
ouse+0.098
Neg logit contributions
 stairs-0.058
 silence-0.057
 front-0.057
 darkness-0.056
 security-0.056
Token a
Token ID257
Feature activation+4.11e-03

Pos logit contributions
ish+0.004
anda+0.004
o+0.004
am+0.004
anna+0.004
Neg logit contributions
 darker-0.002
 stairs-0.002
 security-0.002
 narrower-0.002
 darkness-0.001
TokenI
Token ID40
Feature activation+1.26e-01

Pos logit contributions
 darker+0.046
 stairs+0.046
 security+0.045
 darkness+0.044
 narrower+0.043
Neg logit contributions
ish-0.139
anda-0.135
o-0.127
amb-0.126
an-0.125
Token want
Token ID765
Feature activation+6.92e-02

Pos logit contributions
anda+0.285
ish+0.279
an+0.275
ans+0.262
sters+0.262
Neg logit contributions
 silence-0.168
 darker-0.164
 dark-0.162
 darkness-0.155
Time-0.153
Token to
Token ID284
Feature activation+7.77e-02

Pos logit contributions
ish+0.169
anda+0.165
amb+0.153
an+0.153
ouse+0.151
Neg logit contributions
 stairs-0.088
 security-0.084
 silence-0.084
 darker-0.084
 darkness-0.084
Token taste
Token ID6938
Feature activation+6.00e-02

Pos logit contributions
ish+0.191
anda+0.189
o+0.172
amb+0.170
an+0.170
Neg logit contributions
 front-0.100
 silence-0.098
 stairs-0.096
 opposite-0.095
 security-0.093
Token something
Token ID1223
Feature activation+7.92e-02

Pos logit contributions
ish+0.150
anda+0.147
o+0.136
amb+0.136
an+0.135
Neg logit contributions
 darker-0.070
 stairs-0.068
 security-0.067
 darkness-0.067
 narrower-0.065
Token y
Token ID331
Feature activation+1.43e-01

Pos logit contributions
ish+0.182
anda+0.178
amb+0.169
o+0.167
am+0.165
Neg logit contributions
 stairs-0.105
 darker-0.105
 security-0.101
 darkness-0.099
 silence-0.097
Tokenummy
Token ID13513
Feature activation+1.21e-01

Pos logit contributions
anda+0.277
anna+0.252
ish+0.249
orse+0.249
ouse+0.246
Neg logit contributions
 stairs-0.224
 darker-0.224
 darkness-0.215
 dark-0.214
 front-0.211
Token!"
Token ID2474
Feature activation+9.40e-02

Pos logit contributions
anda+0.249
amb+0.224
ish+0.223
am+0.218
o+0.216
Neg logit contributions
 stairs-0.191
 dark-0.178
 silence-0.178
 darker-0.176
 oxygen-0.176
Token The
Token ID383
Feature activation+6.05e-02

Pos logit contributions
ish+0.264
anda+0.257
an+0.244
ouse+0.230
ese+0.223
Neg logit contributions
 opposite-0.089
Oh-0.089
 silence-0.084
 front-0.084
 stairs-0.083
Token rabbit
Token ID22746
Feature activation+1.08e-01

Pos logit contributions
ish+0.143
anda+0.138
amb+0.130
o+0.130
am+0.127
Neg logit contributions
 stairs-0.079
 darker-0.078
 security-0.077
 darkness-0.076
 silence-0.073
Token smiled
Token ID13541
Feature activation+6.01e-02

Pos logit contributions
anda+0.233
ish+0.232
ouse+0.222
amb+0.220
pp+0.213
Neg logit contributions
 opposite-0.142
 silence-0.141
 darkness-0.140
 front-0.138
 stairs-0.136
Token tightly
Token ID17707
Feature activation-2.14e-02

Pos logit contributions
 darker+0.041
 stairs+0.040
 security+0.040
 darkness+0.040
 narrower+0.040
Neg logit contributions
ish-0.042
anda-0.040
o-0.037
amb-0.036
an-0.036
Token.
Token ID13
Feature activation-3.37e-02

Pos logit contributions
 darker+0.038
 security+0.037
 narrower+0.037
 stairs+0.037
Time+0.037
Neg logit contributions
ish-0.040
anda-0.038
o-0.035
amb-0.034
am-0.034
Token The
Token ID383
Feature activation-4.62e-02

Pos logit contributions
 darker+0.059
 narrower+0.057
 security+0.057
 darkness+0.057
 stairs+0.056
Neg logit contributions
ish-0.060
anda-0.059
am-0.056
amb-0.055
o-0.055
Token tornado
Token ID33718
Feature activation+6.84e-03

Pos logit contributions
 narrower+0.051
 outer+0.047
Where+0.047
 security+0.046
 darker+0.046
Neg logit contributions
ish-0.108
anda-0.100
an-0.098
am-0.093
ouse-0.091
Token was
Token ID373
Feature activation-8.17e-02

Pos logit contributions
ish+0.014
anda+0.013
o+0.012
amb+0.012
an+0.012
Neg logit contributions
 darker-0.011
 stairs-0.011
 security-0.011
 darkness-0.011
 narrower-0.011
Token getting
Token ID1972
Feature activation-9.60e-02

Pos logit contributions
 stairs+0.133
Ready+0.128
Oh+0.125
 clues+0.123
 opposite+0.121
Neg logit contributions
o-0.151
anda-0.147
ish-0.146
an-0.129
am-0.128
Token closer
Token ID5699
Feature activation-2.39e-02

Pos logit contributions
 basement+0.144
Warning+0.142
 stairs+0.141
 month+0.140
 importance+0.139
Neg logit contributions
o-0.188
ish-0.174
am-0.159
bo-0.154
anda-0.154
Token,
Token ID11
Feature activation-3.11e-02

Pos logit contributions
 darker+0.040
 stairs+0.039
 darkness+0.039
 security+0.038
 silence+0.038
Neg logit contributions
ish-0.049
anda-0.048
ouse-0.043
an-0.043
amb-0.043
Token but
Token ID475
Feature activation-4.33e-02

Pos logit contributions
 darker+0.051
 stairs+0.050
 security+0.049
 darkness+0.049
 narrower+0.048
Neg logit contributions
ish-0.064
anda-0.063
o-0.056
amb-0.056
ouse-0.056
Token luckily
Token ID45120
Feature activation-3.89e-02

Pos logit contributions
 stairs+0.079
 darker+0.078
 narrower+0.076
 opposite+0.076
 darkness+0.075
Neg logit contributions
ish-0.067
o-0.065
anda-0.064
am-0.060
amb-0.060
Token the
Token ID262
Feature activation-4.28e-02

Pos logit contributions
 stairs+0.077
 darker+0.075
Oh+0.074
 opposite+0.073
 darkness+0.072
Neg logit contributions
ish-0.062
anda-0.056
o-0.056
am-0.055
ouse-0.051
Token tight
Token ID5381
Feature activation-2.09e-02

Pos logit contributions
 darker+0.022
 stairs+0.022
 security+0.022
 darkness+0.022
 narrower+0.021
Neg logit contributions
ish-0.033
anda-0.032
o-0.030
amb-0.029
an-0.029
Token.
Token ID13
Feature activation-1.53e-02

Pos logit contributions
Time+0.038
 darker+0.038
 stairs+0.037
 security+0.037
 narrower+0.037
Neg logit contributions
ish-0.039
anda-0.036
amb-0.033
am-0.033
o-0.032
Token His
Token ID2399
Feature activation-3.89e-02

Pos logit contributions
 darker+0.025
 security+0.025
 stairs+0.024
 darkness+0.024
 narrower+0.024
Neg logit contributions
ish-0.031
anda-0.030
o-0.028
amb-0.027
am-0.027
Token aim
Token ID4031
Feature activation-2.28e-02

Pos logit contributions
 security+0.055
 stairs+0.053
 darkness+0.053
 darker+0.053
 silence+0.053
Neg logit contributions
ish-0.086
anda-0.086
o-0.081
amb-0.078
anna-0.077
Token was
Token ID373
Feature activation-7.20e-02

Pos logit contributions
 darker+0.038
 security+0.037
 stairs+0.037
 darkness+0.037
 narrower+0.036
Neg logit contributions
ish-0.046
anda-0.045
amb-0.041
o-0.041
an-0.040
Token getting
Token ID1972
Feature activation-9.51e-02

Pos logit contributions
 stairs+0.096
 darker+0.093
 security+0.092
 darkness+0.092
 narrower+0.089
Neg logit contributions
ish-0.165
anda-0.160
o-0.152
ouse-0.146
am-0.146
Token better
Token ID1365
Feature activation-3.13e-02

Pos logit contributions
 basement+0.145
 stairs+0.131
jar+0.120
Time+0.119
 silence+0.118
Neg logit contributions
ish-0.207
am-0.198
o-0.190
anda-0.186
bo-0.172
Token and
Token ID290
Feature activation-4.80e-02

Pos logit contributions
 security+0.051
 stairs+0.050
 darker+0.050
 darkness+0.050
 silence+0.048
Neg logit contributions
anda-0.065
ish-0.064
o-0.058
an-0.058
amb-0.057
Token better
Token ID1365
Feature activation-2.90e-02

Pos logit contributions
 stairs+0.048
 security+0.046
 darkness+0.046
 darker+0.046
 opposite+0.044
Neg logit contributions
ish-0.125
anda-0.120
o-0.114
am-0.113
ouse-0.111
Token.
Token ID13
Feature activation-1.33e-02

Pos logit contributions
 security+0.040
 darker+0.039
 stairs+0.039
 darkness+0.039
 silence+0.038
Neg logit contributions
anda-0.068
ish-0.067
an-0.061
o-0.061
amb-0.060
Token
Token ID220
Feature activation-2.02e-02

Pos logit contributions
 darker+0.020
 stairs+0.020
 security+0.020
 darkness+0.019
 narrower+0.019
Neg logit contributions
ish-0.029
anda-0.028
o-0.025
amb-0.025
an-0.025
Token found
Token ID1043
Feature activation-7.25e-02

Pos logit contributions
 darker+0.059
 silence+0.059
 stairs+0.058
Time+0.058
 darkness+0.057
Neg logit contributions
ish-0.047
o-0.043
anda-0.042
am-0.039
ouse-0.037
Token one
Token ID530
Feature activation-5.95e-02

Pos logit contributions
Time+0.095
 darker+0.089
 stairs+0.089
jar+0.088
 month+0.085
Neg logit contributions
ish-0.173
o-0.155
am-0.153
anda-0.153
an-0.138
Token and
Token ID290
Feature activation-3.81e-02

Pos logit contributions
 darker+0.101
 security+0.100
 narrower+0.100
 stairs+0.100
 darkness+0.098
Neg logit contributions
ish-0.121
anda-0.112
o-0.105
an-0.100
ouse-0.099
Token dad
Token ID9955
Feature activation-1.46e-02

Pos logit contributions
 darker+0.067
Time+0.065
 silence+0.065
 darkness+0.065
 security+0.065
Neg logit contributions
ish-0.072
o-0.065
anda-0.063
am-0.062
lets-0.059
Token used
Token ID973
Feature activation-7.67e-02

Pos logit contributions
 security+0.023
 stairs+0.023
 darker+0.023
 darkness+0.022
 narrower+0.022
Neg logit contributions
ish-0.030
anda-0.030
amb-0.027
an-0.027
o-0.026
Token a
Token ID257
Feature activation-9.47e-02

Pos logit contributions
 darker+0.114
Time+0.112
Oh+0.112
 basement+0.111
 security+0.109
Neg logit contributions
ish-0.163
o-0.146
am-0.146
anda-0.142
lets-0.132
Token tool
Token ID2891
Feature activation-3.90e-02

Pos logit contributions
wered+0.157
 darker+0.149
Time+0.149
 darkness+0.148
 security+0.148
Neg logit contributions
ish-0.193
ouse-0.176
anda-0.174
o-0.161
lets-0.146
Token to
Token ID284
Feature activation-6.09e-02

Pos logit contributions
 darker+0.057
 stairs+0.056
 security+0.056
 darkness+0.055
 narrower+0.055
Neg logit contributions
ish-0.086
anda-0.083
o-0.077
amb-0.076
an-0.075
Token fix
Token ID4259
Feature activation-4.66e-02

Pos logit contributions
 darker+0.106
Time+0.104
 security+0.103
 darkness+0.101
Oh+0.100
Neg logit contributions
ish-0.114
o-0.100
anda-0.097
am-0.094
gy-0.092
Token the
Token ID262
Feature activation-4.72e-02

Pos logit contributions
 darker+0.061
 stairs+0.060
 security+0.060
 darkness+0.059
 narrower+0.059
Neg logit contributions
ish-0.111
anda-0.101
o-0.097
am-0.095
an-0.094
Token robot
Token ID9379
Feature activation-8.90e-03

Pos logit contributions
 darker+0.087
Time+0.085
 narrower+0.083
 security+0.083
wered+0.081
Neg logit contributions
ish-0.085
anda-0.073
an-0.069
ouse-0.068
o-0.067
Token.
Token ID13
Feature activation-3.06e-02

Pos logit contributions
 security+0.043
 darker+0.043
 darkness+0.042
 stairs+0.042
 narrower+0.041
Neg logit contributions
ish-0.051
anda-0.049
o-0.045
amb-0.044
an-0.044
Token Soon
Token ID15894
Feature activation-4.98e-02

Pos logit contributions
 security+0.056
 darker+0.055
 narrower+0.053
 darkness+0.053
Time+0.051
Neg logit contributions
ish-0.054
anda-0.052
amb-0.049
o-0.049
am-0.048
Token the
Token ID262
Feature activation-4.83e-02

Pos logit contributions
 darker+0.112
 stairs+0.112
 darkness+0.111
 security+0.111
 opposite+0.109
Neg logit contributions
ish-0.064
anda-0.062
am-0.053
o-0.052
amb-0.048
Token distance
Token ID5253
Feature activation-1.70e-02

Pos logit contributions
 security+0.067
Where+0.065
Time+0.064
 emergencies+0.062
Oh+0.062
Neg logit contributions
ish-0.107
anda-0.101
an-0.094
adow-0.086
am-0.084
Token was
Token ID373
Feature activation-7.53e-02

Pos logit contributions
 darker+0.029
 stairs+0.028
 security+0.028
 narrower+0.028
 darkness+0.028
Neg logit contributions
ish-0.033
anda-0.032
o-0.029
an-0.029
amb-0.028
Token getting
Token ID1972
Feature activation-9.44e-02

Pos logit contributions
 clues+0.112
Oh+0.107
Warning+0.105
 emergencies+0.104
 electricity+0.099
Neg logit contributions
o-0.144
adow-0.129
anda-0.129
ish-0.128
am-0.122
Token smaller
Token ID4833
Feature activation-3.98e-02

Pos logit contributions
 basement+0.142
growth+0.134
 emergencies+0.131
 month+0.127
Warning+0.127
Neg logit contributions
o-0.195
ish-0.172
am-0.171
anda-0.160
bo-0.160
Token,
Token ID11
Feature activation-3.83e-02

Pos logit contributions
 darker+0.070
 stairs+0.070
 security+0.069
 darkness+0.068
 narrower+0.067
Neg logit contributions
ish-0.076
anda-0.074
o-0.067
ouse-0.066
amb-0.066
Token and
Token ID290
Feature activation-6.13e-02

Pos logit contributions
 security+0.075
 stairs+0.074
 darker+0.073
 narrower+0.072
 darkness+0.072
Neg logit contributions
ish-0.062
anda-0.058
o-0.054
an-0.052
am-0.052
Token the
Token ID262
Feature activation-4.03e-02

Pos logit contributions
 security+0.072
 tow+0.071
 emergencies+0.068
 month+0.068
Care+0.066
Neg logit contributions
ish-0.125
anda-0.121
o-0.118
am-0.113
ie-0.112
Token hill
Token ID12788
Feature activation-1.85e-02

Pos logit contributions
 security+0.075
 emergencies+0.074
Time+0.073
Where+0.073
 tow+0.073
Neg logit contributions
ish-0.065
anda-0.061
an-0.055
adow-0.049
ie-0.048
Tokenpa
Token ID8957
Feature activation+8.97e-03

Pos logit contributions
ish+0.030
anda+0.029
ouse+0.026
an+0.025
amb+0.024
Neg logit contributions
 security-0.035
 stairs-0.034
 front-0.034
 darkness-0.034
 darker-0.033
Token showed
Token ID3751
Feature activation-2.64e-02

Pos logit contributions
ish+0.018
anda+0.018
amb+0.017
an+0.017
ouse+0.017
Neg logit contributions
 security-0.014
 stairs-0.013
 darker-0.013
 darkness-0.013
 narrower-0.013
Token him
Token ID683
Feature activation-3.76e-02

Pos logit contributions
 security+0.039
 stairs+0.038
 darkness+0.037
 darker+0.037
 opposite+0.036
Neg logit contributions
ish-0.057
anda-0.056
ouse-0.053
amb-0.051
orse-0.051
Token how
Token ID703
Feature activation-6.27e-02

Pos logit contributions
 stairs+0.053
 security+0.053
 darker+0.053
 darkness+0.052
 narrower+0.051
Neg logit contributions
ish-0.084
anda-0.082
amb-0.075
an-0.075
o-0.074
Token to
Token ID284
Feature activation-5.34e-02

Pos logit contributions
 darker+0.098
 darkness+0.095
 security+0.094
 stairs+0.094
 narrower+0.093
Neg logit contributions
ish-0.129
anda-0.124
o-0.120
am-0.114
ouse-0.112
Token use
Token ID779
Feature activation-9.37e-02

Pos logit contributions
 darker+0.084
 security+0.081
 stairs+0.081
 darkness+0.079
 narrower+0.079
Neg logit contributions
ish-0.114
anda-0.103
o-0.098
an-0.097
am-0.096
Token the
Token ID262
Feature activation-7.64e-02

Pos logit contributions
Oh+0.137
 darker+0.134
Time+0.132
 narrower+0.130
Where+0.130
Neg logit contributions
ish-0.195
am-0.166
o-0.163
anda-0.162
ci-0.155
Token tools
Token ID4899
Feature activation-4.55e-02

Pos logit contributions
 narrower+0.127
Where+0.121
 darker+0.121
Oh+0.116
Time+0.111
Neg logit contributions
ish-0.148
anna-0.126
anda-0.126
ci-0.122
ouse-0.118
Token to
Token ID284
Feature activation-6.45e-02

Pos logit contributions
 darker+0.077
 stairs+0.076
 security+0.076
 darkness+0.075
 narrower+0.075
Neg logit contributions
ish-0.089
anda-0.083
o-0.078
amb-0.075
am-0.075
Token fix
Token ID4259
Feature activation-5.62e-02

Pos logit contributions
 darker+0.112
Oh+0.108
Time+0.106
 security+0.105
 darkness+0.105
Neg logit contributions
ish-0.124
anda-0.105
o-0.103
am-0.101
ci-0.100
Token the
Token ID262
Feature activation-5.21e-02

Pos logit contributions
 darker+0.063
 darkness+0.060
 narrower+0.060
 stairs+0.059
 security+0.059
Neg logit contributions
ish-0.144
anda-0.134
o-0.127
am-0.124
an-0.124
Token on
Token ID319
Feature activation-1.71e-02

Pos logit contributions
 stairs+0.077
 darker+0.075
 narrower+0.073
 clues+0.073
 darkness+0.072
Neg logit contributions
ish-0.076
o-0.071
amb-0.066
am-0.066
anda-0.066
Token,
Token ID11
Feature activation-4.81e-02

Pos logit contributions
 darker+0.034
 stairs+0.034
 security+0.033
 darkness+0.033
 narrower+0.032
Neg logit contributions
ish-0.029
anda-0.028
o-0.025
amb-0.025
an-0.025
Token Lily
Token ID20037
Feature activation-8.54e-03

Pos logit contributions
 security+0.080
 darker+0.078
 darkness+0.078
 stairs+0.078
 narrower+0.078
Neg logit contributions
ish-0.094
anda-0.090
o-0.084
ouse-0.081
amb-0.080
Token learned
Token ID4499
Feature activation-6.37e-02

Pos logit contributions
 darker+0.012
 stairs+0.012
 darkness+0.012
 security+0.012
 narrower+0.012
Neg logit contributions
anda-0.018
ish-0.018
amb-0.016
an-0.016
o-0.016
Token to
Token ID284
Feature activation-4.80e-02

Pos logit contributions
 narrower+0.137
 darker+0.135
 stairs+0.135
 darkness+0.131
 tow+0.131
Neg logit contributions
ish-0.086
anda-0.079
o-0.079
amb-0.079
ouse-0.073
Token use
Token ID779
Feature activation-9.32e-02

Pos logit contributions
 security+0.093
 darker+0.093
 stairs+0.091
 narrower+0.089
 darkness+0.088
Neg logit contributions
ish-0.086
anda-0.077
o-0.073
amb-0.072
an-0.071
Token spray
Token ID11662
Feature activation-6.18e-02

Pos logit contributions
 security+0.139
 narrower+0.134
Where+0.133
jar+0.132
Oh+0.129
Neg logit contributions
ish-0.197
anda-0.168
am-0.163
o-0.162
c-0.158
Token paint
Token ID7521
Feature activation-5.08e-02

Pos logit contributions
 security+0.090
 stairs+0.084
 narrower+0.083
 darker+0.081
 darkness+0.079
Neg logit contributions
ish-0.143
anda-0.130
o-0.123
am-0.120
an-0.120
Token responsibly
Token ID43493
Feature activation-3.86e-02

Pos logit contributions
 security+0.092
 stairs+0.089
 darker+0.088
 darkness+0.088
 narrower+0.088
Neg logit contributions
ish-0.095
anda-0.088
o-0.082
ouse-0.078
am-0.078
Token and
Token ID290
Feature activation-4.46e-02

Pos logit contributions
 darker+0.077
 security+0.075
 narrower+0.075
 stairs+0.074
 darkness+0.073
Neg logit contributions
ish-0.063
anda-0.059
o-0.054
amb-0.052
an-0.052
Token continued
Token ID3767
Feature activation-6.63e-02

Pos logit contributions
 darker+0.064
 security+0.064
 stairs+0.063
 narrower+0.062
 darkness+0.062
Neg logit contributions
ish-0.097
anda-0.093
o-0.087
am-0.084
ouse-0.084
Token build
Token ID1382
Feature activation-8.10e-02

Pos logit contributions
 darker+0.080
 stairs+0.080
 security+0.078
 darkness+0.078
 narrower+0.077
Neg logit contributions
ish-0.109
anda-0.106
o-0.096
amb-0.096
an-0.095
Token their
Token ID511
Feature activation-4.16e-02

Pos logit contributions
 security+0.085
Time+0.084
Oh+0.084
 stairs+0.084
 darkness+0.083
Neg logit contributions
ish-0.203
anda-0.191
am-0.185
an-0.179
o-0.179
Token fort
Token ID6285
Feature activation-2.49e-02

Pos logit contributions
 stairs+0.058
 darker+0.056
 darkness+0.055
 security+0.055
 dark+0.053
Neg logit contributions
ish-0.093
anda-0.092
amb-0.090
o-0.088
anna-0.086
Token.
Token ID13
Feature activation-3.95e-02

Pos logit contributions
 stairs+0.037
 darker+0.036
 security+0.035
 darkness+0.034
 narrower+0.034
Neg logit contributions
ish-0.056
anda-0.055
amb-0.051
o-0.050
anna-0.050
Token They
Token ID1119
Feature activation-2.60e-02

Pos logit contributions
 security+0.088
 darkness+0.086
 darker+0.085
 narrower+0.084
 danger+0.084
Neg logit contributions
o-0.048
am-0.047
anda-0.047
ish-0.047
amb-0.044
Token used
Token ID973
Feature activation-9.30e-02

Pos logit contributions
 darker+0.043
 security+0.042
 stairs+0.042
 darkness+0.041
 narrower+0.041
Neg logit contributions
ish-0.052
anda-0.050
o-0.047
am-0.045
an-0.045
Token the
Token ID262
Feature activation-7.09e-02

Pos logit contributions
Oh+0.129
 security+0.123
Time+0.122
Where+0.121
 darker+0.119
Neg logit contributions
ish-0.205
anda-0.186
am-0.185
o-0.184
lets-0.173
Token leaves
Token ID5667
Feature activation-4.82e-02

Pos logit contributions
 security+0.097
 darker+0.095
 darkness+0.092
 stairs+0.091
 narrower+0.091
Neg logit contributions
ish-0.164
anda-0.156
an-0.142
o-0.141
ouse-0.138
Token and
Token ID290
Feature activation-4.81e-02

Pos logit contributions
 darker+0.069
 stairs+0.068
 security+0.067
 darkness+0.066
 narrower+0.065
Neg logit contributions
ish-0.108
anda-0.105
o-0.096
amb-0.095
an-0.095
Token sticks
Token ID16461
Feature activation-4.70e-02

Pos logit contributions
 security+0.065
 darker+0.064
Oh+0.062
Time+0.062
 darkness+0.061
Neg logit contributions
ish-0.105
anda-0.098
o-0.096
am-0.094
orse-0.092
Token and
Token ID290
Feature activation-4.70e-02

Pos logit contributions
 darker+0.073
 security+0.072
 stairs+0.072
 darkness+0.071
 narrower+0.070
Neg logit contributions
ish-0.099
anda-0.094
o-0.088
am-0.085
an-0.085
Token it
Token ID340
Feature activation-2.37e-02

Pos logit contributions
 darker+0.051
 stairs+0.051
 security+0.050
 darkness+0.049
 narrower+0.049
Neg logit contributions
ish-0.136
anda-0.132
o-0.125
amb-0.121
an-0.121
Token was
Token ID373
Feature activation-6.33e-02

Pos logit contributions
 darker+0.032
 darkness+0.032
 security+0.030
 stairs+0.030
 narrower+0.030
Neg logit contributions
ish-0.055
anda-0.053
amb-0.053
ouse-0.051
anna-0.050
Token in
Token ID287
Feature activation-5.47e-02

Pos logit contributions
 stairs+0.093
 security+0.085
 narrower+0.082
 darker+0.082
 opposite+0.082
Neg logit contributions
ish-0.138
anda-0.133
o-0.128
an-0.122
am-0.121
Token pain
Token ID2356
Feature activation-1.20e-02

Pos logit contributions
 darker+0.081
 narrower+0.077
Ready+0.077
Hours+0.074
 stairs+0.072
Neg logit contributions
gy-0.082
o-0.081
ish-0.080
seed-0.077
dy-0.077
Token and
Token ID290
Feature activation-2.52e-02

Pos logit contributions
 darker+0.020
 stairs+0.020
 darkness+0.020
 security+0.019
 silence+0.019
Neg logit contributions
ish-0.024
anda-0.023
amb-0.021
o-0.021
an-0.021
Token needed
Token ID2622
Feature activation-9.30e-02

Pos logit contributions
 darker+0.036
 darkness+0.035
 security+0.035
 stairs+0.035
 narrower+0.034
Neg logit contributions
ish-0.058
anda-0.056
amb-0.053
ouse-0.052
an-0.051
Token help
Token ID1037
Feature activation-3.44e-02

Pos logit contributions
 darker+0.174
 stairs+0.173
 narrower+0.173
 basement+0.168
 security+0.165
Neg logit contributions
o-0.124
ish-0.124
seed-0.120
ow-0.117
am-0.112
Token.
Token ID13
Feature activation-2.37e-02

Pos logit contributions
 darker+0.054
 stairs+0.054
 security+0.053
 darkness+0.052
 narrower+0.052
Neg logit contributions
ish-0.072
anda-0.069
o-0.064
amb-0.063
an-0.063
Token They
Token ID1119
Feature activation-2.34e-02

Pos logit contributions
 darker+0.035
 stairs+0.035
 security+0.035
 darkness+0.034
 narrower+0.034
Neg logit contributions
ish-0.051
anda-0.050
o-0.046
amb-0.045
an-0.045
Token took
Token ID1718
Feature activation-5.94e-02

Pos logit contributions
 darker+0.050
 stairs+0.050
Time+0.048
 silence+0.047
 narrower+0.047
Neg logit contributions
ish-0.037
anda-0.034
o-0.033
am-0.030
orse-0.028
Token out
Token ID503
Feature activation-5.55e-02

Pos logit contributions
 darker+0.068
 stairs+0.067
 narrower+0.065
 security+0.064
 darkness+0.062
Neg logit contributions
ish-0.153
anda-0.143
o-0.136
am-0.135
an-0.132
Token didn
Token ID1422
Feature activation-2.62e-02

Pos logit contributions
 darker+0.017
 stairs+0.016
 darkness+0.016
 security+0.016
 narrower+0.016
Neg logit contributions
ish-0.019
anda-0.019
amb-0.017
o-0.017
an-0.017
Token't
Token ID470
Feature activation-3.17e-02

Pos logit contributions
 darker+0.043
 security+0.041
 darkness+0.041
 narrower+0.041
 stairs+0.041
Neg logit contributions
ish-0.052
anda-0.051
o-0.047
amb-0.046
am-0.046
Token feel
Token ID1254
Feature activation-6.06e-02

Pos logit contributions
 darker+0.055
 stairs+0.054
 darkness+0.054
 security+0.054
 silence+0.053
Neg logit contributions
ish-0.060
anda-0.059
amb-0.052
an-0.052
am-0.052
Token well
Token ID880
Feature activation-3.72e-02

Pos logit contributions
 stairs+0.070
 darker+0.069
 security+0.068
 darkness+0.066
 narrower+0.066
Neg logit contributions
ish-0.153
anda-0.147
o-0.137
amb-0.135
an-0.135
Token and
Token ID290
Feature activation-1.11e-02

Pos logit contributions
 darker+0.056
 stairs+0.054
 security+0.054
 darkness+0.054
 narrower+0.053
Neg logit contributions
ish-0.081
anda-0.079
o-0.072
amb-0.072
an-0.071
Token needed
Token ID2622
Feature activation-9.24e-02

Pos logit contributions
 darker+0.018
 darkness+0.017
 dark+0.017
 stairs+0.017
 silence+0.017
Neg logit contributions
ish-0.024
anda-0.023
amb-0.023
an-0.022
ouse-0.021
Token some
Token ID617
Feature activation-6.07e-02

Pos logit contributions
 narrower+0.180
 security+0.170
 darker+0.168
wered+0.168
Ready+0.165
Neg logit contributions
o-0.119
ish-0.112
seed-0.108
ow-0.105
ie-0.104
Token medicine
Token ID9007
Feature activation-8.91e-03

Pos logit contributions
 narrower+0.089
 security+0.087
 darker+0.085
 stairs+0.084
 opposite+0.082
Neg logit contributions
ish-0.131
anda-0.123
am-0.113
o-0.112
ie-0.110
Token too
Token ID1165
Feature activation-5.30e-02

Pos logit contributions
 stairs+0.012
 darker+0.012
 security+0.012
 darkness+0.012
 silence+0.011
Neg logit contributions
ish-0.021
anda-0.020
an-0.018
amb-0.018
o-0.018
Token.
Token ID13
Feature activation-1.78e-02

Pos logit contributions
 darker+0.100
Oh+0.100
 narrower+0.100
Ready+0.100
 security+0.099
Neg logit contributions
ish-0.089
anda-0.084
o-0.080
am-0.075
ouse-0.074
Token But
Token ID887
Feature activation-3.13e-02

Pos logit contributions
 stairs+0.025
 darker+0.025
 darkness+0.024
 security+0.023
 opposite+0.023
Neg logit contributions
ish-0.041
anda-0.039
an-0.036
o-0.036
amb-0.036
Token the
Token ID262
Feature activation-5.60e-02

Pos logit contributions
 security+0.092
 narrower+0.091
Time+0.091
 darker+0.090
Oh+0.089
Neg logit contributions
ish-0.094
anda-0.080
ci-0.076
am-0.075
an-0.073
Token house
Token ID2156
Feature activation-3.84e-02

Pos logit contributions
Oh+0.117
wered+0.113
 darker+0.113
 narrower+0.113
Care+0.112
Neg logit contributions
ish-0.082
anda-0.077
am-0.069
an-0.068
ouse-0.061
Token.
Token ID13
Feature activation-4.29e-02

Pos logit contributions
Oh+0.076
Time+0.075
 security+0.075
 darker+0.074
 narrower+0.073
Neg logit contributions
ish-0.063
anda-0.057
o-0.054
am-0.053
ouse-0.051
Token Lily
Token ID20037
Feature activation-8.82e-03

Pos logit contributions
 security+0.085
 darker+0.078
 danger+0.078
 narrower+0.078
 darkness+0.077
Neg logit contributions
ish-0.070
anda-0.067
am-0.062
anna-0.061
o-0.060
Token used
Token ID973
Feature activation-8.69e-02

Pos logit contributions
 stairs+0.015
 darker+0.015
 security+0.014
 darkness+0.014
 narrower+0.014
Neg logit contributions
ish-0.017
anda-0.017
amb-0.015
o-0.015
an-0.015
Token a
Token ID257
Feature activation-9.24e-02

Pos logit contributions
 security+0.142
Ready+0.131
Oh+0.128
 narrower+0.127
Time+0.126
Neg logit contributions
ish-0.170
o-0.151
anda-0.145
am-0.143
ie-0.140
Token broom
Token ID47085
Feature activation-2.20e-02

Pos logit contributions
wered+0.140
Where+0.132
 narrower+0.132
 security+0.132
Warning+0.131
Neg logit contributions
ish-0.197
ouse-0.194
anda-0.185
o-0.159
ig-0.152
Token to
Token ID284
Feature activation-6.29e-02

Pos logit contributions
 darker+0.036
 stairs+0.035
 security+0.035
 darkness+0.034
 narrower+0.034
Neg logit contributions
ish-0.045
anda-0.043
o-0.040
amb-0.039
an-0.039
Token sweep
Token ID16085
Feature activation-5.97e-02

Pos logit contributions
 security+0.105
Oh+0.102
 darker+0.101
Where+0.097
Time+0.097
Neg logit contributions
ish-0.124
anda-0.111
o-0.108
adow-0.103
ci-0.103
Token the
Token ID262
Feature activation-4.58e-02

Pos logit contributions
 security+0.095
 narrower+0.093
Where+0.088
 darker+0.086
 outer+0.085
Neg logit contributions
ish-0.122
o-0.105
ie-0.104
anda-0.102
am-0.099
Token floor
Token ID4314
Feature activation-4.57e-02

Pos logit contributions
 darker+0.065
Time+0.064
 narrower+0.064
Where+0.064
Oh+0.063
Neg logit contributions
ish-0.099
anda-0.096
an-0.089
am-0.086
ouse-0.084
Token that
Token ID326
Feature activation-3.95e-02

Pos logit contributions
 darker+0.055
 stairs+0.054
 security+0.054
 darkness+0.053
 narrower+0.052
Neg logit contributions
ish-0.064
anda-0.061
o-0.056
amb-0.055
an-0.055
Token day
Token ID1110
Feature activation-3.79e-02

Pos logit contributions
 darker+0.064
 security+0.064
 narrower+0.064
 stairs+0.062
 clues+0.060
Neg logit contributions
ish-0.082
o-0.075
anda-0.073
amb-0.071
an-0.070
Token on
Token ID319
Feature activation-1.92e-02

Pos logit contributions
 stairs+0.069
 darker+0.067
 darkness+0.066
 security+0.065
 clues+0.065
Neg logit contributions
ish-0.073
o-0.068
anda-0.064
am-0.063
amb-0.063
Token,
Token ID11
Feature activation-4.61e-02

Pos logit contributions
 stairs+0.038
 darker+0.038
 security+0.037
 darkness+0.037
 narrower+0.036
Neg logit contributions
ish-0.033
anda-0.032
o-0.029
amb-0.028
an-0.028
Token Lily
Token ID20037
Feature activation-4.85e-03

Pos logit contributions
 security+0.073
 darkness+0.072
 darker+0.072
 narrower+0.071
 stairs+0.070
Neg logit contributions
ish-0.093
anda-0.090
o-0.083
ouse-0.080
am-0.078
Token used
Token ID973
Feature activation-9.16e-02

Pos logit contributions
 darkness+0.006
 darker+0.006
 security+0.006
 stairs+0.006
 silence+0.006
Neg logit contributions
anda-0.011
ish-0.011
amb-0.010
ouse-0.010
an-0.009
Token her
Token ID607
Feature activation-6.86e-02

Pos logit contributions
 narrower+0.140
 darker+0.138
 security+0.137
 stairs+0.135
 basement+0.134
Neg logit contributions
ish-0.188
o-0.170
anda-0.167
am-0.163
seed-0.160
Token gifts
Token ID13201
Feature activation-4.50e-02

Pos logit contributions
 darker+0.099
 narrower+0.098
 stairs+0.097
 security+0.097
Where+0.092
Neg logit contributions
ish-0.154
anda-0.147
an-0.134
ouse-0.134
am-0.132
Token to
Token ID284
Feature activation-5.78e-02

Pos logit contributions
 darker+0.068
 stairs+0.067
 security+0.067
 narrower+0.066
 darkness+0.065
Neg logit contributions
ish-0.097
anda-0.092
o-0.086
amb-0.084
ouse-0.084
Token help
Token ID1037
Feature activation-4.86e-02

Pos logit contributions
 stairs+0.086
 darker+0.086
 security+0.084
 narrower+0.083
 darkness+0.080
Neg logit contributions
ish-0.127
anda-0.118
o-0.111
an-0.111
amb-0.108
Token everyone
Token ID2506
Feature activation-2.33e-02

Pos logit contributions
 darker+0.061
 stairs+0.060
 narrower+0.059
 security+0.057
 darkness+0.057
Neg logit contributions
ish-0.118
anda-0.111
an-0.105
o-0.105
amb-0.103
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
 stairs+0.032
 darker+0.032
 security+0.032
 darkness+0.031
 narrower+0.031
Neg logit contributions
ish-0.046
anda-0.045
o-0.041
amb-0.041
an-0.040
Token She
Token ID1375
Feature activation-4.70e-04

Pos logit contributions
 darker+0.034
 stairs+0.033
 security+0.033
 darkness+0.033
 narrower+0.032
Neg logit contributions
ish-0.040
anda-0.039
o-0.036
amb-0.035
an-0.035
Token didn
Token ID1422
Feature activation-3.04e-02

Pos logit contributions
 security+0.001
 stairs+0.001
 darkness+0.001
 hours+0.001
 front+0.001
Neg logit contributions
ish-0.001
anda-0.001
an-0.001
anna-0.001
amb-0.001
Token't
Token ID470
Feature activation-2.97e-02

Pos logit contributions
 darker+0.052
 darkness+0.051
 security+0.051
 narrower+0.050
 stairs+0.050
Neg logit contributions
ish-0.056
anda-0.055
o-0.051
amb-0.051
am-0.050
Token even
Token ID772
Feature activation-3.37e-02

Pos logit contributions
 darker+0.049
 stairs+0.049
 security+0.048
 darkness+0.048
 narrower+0.047
Neg logit contributions
ish-0.060
anda-0.058
o-0.053
amb-0.052
an-0.052
Token need
Token ID761
Feature activation-9.16e-02

Pos logit contributions
 stairs+0.044
 darker+0.043
 narrower+0.042
 security+0.041
 darkness+0.041
Neg logit contributions
ish-0.078
anda-0.072
o-0.067
ouse-0.067
am-0.066
Token help
Token ID1037
Feature activation-4.20e-02

Pos logit contributions
 narrower+0.166
 darker+0.159
wered+0.156
 basement+0.153
Time+0.152
Neg logit contributions
ish-0.145
anda-0.118
o-0.118
lets-0.114
am-0.112
Token from
Token ID422
Feature activation-2.86e-02

Pos logit contributions
 darker+0.075
 stairs+0.075
 security+0.074
 narrower+0.074
 darkness+0.073
Neg logit contributions
ish-0.078
anda-0.073
o-0.066
an-0.064
amb-0.064
Token anyone
Token ID2687
Feature activation-1.61e-02

Pos logit contributions
 narrower+0.039
 darker+0.039
 stairs+0.038
Time+0.038
 security+0.037
Neg logit contributions
ish-0.063
anda-0.058
ie-0.054
i-0.054
o-0.054
Token!
Token ID0
Feature activation-1.95e-02

Pos logit contributions
 security+0.023
 stairs+0.022
 darkness+0.022
 darker+0.022
 dark+0.021
Neg logit contributions
amb-0.036
ish-0.035
anda-0.034
an-0.034
ouse-0.033
Token
Token ID220
Feature activation-2.18e-02

Pos logit contributions
 darker+0.030
 stairs+0.030
 security+0.029
 darkness+0.029
 narrower+0.029
Neg logit contributions
ish-0.042
anda-0.040
o-0.037
amb-0.037
ouse-0.036
Token dad
Token ID9955
Feature activation+1.27e-03

Pos logit contributions
 security+0.039
 stairs+0.039
 darker+0.039
 darkness+0.038
 narrower+0.037
Neg logit contributions
ish-0.055
anda-0.053
o-0.049
amb-0.048
an-0.048
Token showed
Token ID3751
Feature activation-3.24e-02

Pos logit contributions
ish+0.002
anda+0.002
amb+0.002
an+0.002
ouse+0.002
Neg logit contributions
 security-0.002
 darker-0.002
 stairs-0.002
 darkness-0.002
 narrower-0.002
Token him
Token ID683
Feature activation-3.65e-02

Pos logit contributions
 darker+0.056
 stairs+0.056
 security+0.055
 darkness+0.055
 narrower+0.054
Neg logit contributions
ish-0.063
anda-0.061
o-0.055
amb-0.054
an-0.054
Token how
Token ID703
Feature activation-6.09e-02

Pos logit contributions
 darker+0.060
 stairs+0.060
 security+0.059
 darkness+0.058
 narrower+0.058
Neg logit contributions
ish-0.073
anda-0.071
o-0.065
amb-0.064
an-0.064
Token to
Token ID284
Feature activation-4.90e-02

Pos logit contributions
 darker+0.093
 stairs+0.091
 security+0.091
 darkness+0.090
 narrower+0.089
Neg logit contributions
ish-0.130
anda-0.124
o-0.117
am-0.115
ouse-0.112
Token use
Token ID779
Feature activation-9.14e-02

Pos logit contributions
 darker+0.080
 security+0.078
 stairs+0.077
 darkness+0.076
 narrower+0.076
Neg logit contributions
ish-0.103
anda-0.095
amb-0.088
o-0.088
an-0.088
Token it
Token ID340
Feature activation-5.46e-02

Pos logit contributions
Oh+0.163
Time+0.161
 darker+0.159
 basement+0.156
 narrower+0.154
Neg logit contributions
ish-0.164
am-0.140
anda-0.132
o-0.129
orse-0.127
Token safely
Token ID11512
Feature activation-3.85e-02

Pos logit contributions
 security+0.101
Oh+0.101
 darker+0.101
Time+0.099
 stairs+0.098
Neg logit contributions
ish-0.098
anda-0.088
o-0.086
am-0.084
ouse-0.079
Token and
Token ID290
Feature activation-3.80e-02

Pos logit contributions
 darker+0.065
 security+0.064
 narrower+0.063
 stairs+0.063
 darkness+0.063
Neg logit contributions
ish-0.077
anda-0.072
o-0.067
ouse-0.065
am-0.065
Token Tim
Token ID5045
Feature activation+1.44e-02

Pos logit contributions
 darker+0.060
 stairs+0.059
 security+0.059
 darkness+0.058
 narrower+0.057
Neg logit contributions
ish-0.079
anda-0.077
o-0.070
amb-0.069
an-0.069
Tokenmy
Token ID1820
Feature activation-1.20e-02

Pos logit contributions
orse+0.028
amb+0.027
anda+0.026
an+0.025
sters+0.025
Neg logit contributions
 front-0.021
 stairs-0.020
 last-0.020
 opposite-0.020
 darkness-0.020
Token The
Token ID383
Feature activation-5.13e-02

Pos logit contributions
 stairs+0.039
 darker+0.039
 security+0.039
 darkness+0.039
 narrower+0.038
Neg logit contributions
ish-0.056
anda-0.054
o-0.049
an-0.049
ouse-0.049
Token thin
Token ID7888
Feature activation-3.01e-02

Pos logit contributions
 security+0.069
 darker+0.064
 stairs+0.064
 outer+0.063
 narrower+0.063
Neg logit contributions
ish-0.114
anda-0.111
o-0.102
an-0.101
am-0.099
Token tree
Token ID5509
Feature activation+4.60e-04

Pos logit contributions
 security+0.045
 darker+0.045
 narrower+0.044
 stairs+0.043
 darkness+0.043
Neg logit contributions
ish-0.065
anda-0.063
o-0.057
an-0.057
am-0.055
Token learned
Token ID4499
Feature activation-5.33e-02

Pos logit contributions
ish+0.001
anda+0.001
amb+0.001
ouse+0.001
anna+0.001
Neg logit contributions
 silence-0.001
 darkness-0.001
 stairs-0.001
 darker-0.001
 security-0.001
Token to
Token ID284
Feature activation-5.39e-02

Pos logit contributions
 stairs+0.099
 darker+0.097
 narrower+0.097
 darkness+0.095
Time+0.093
Neg logit contributions
ish-0.090
anda-0.087
o-0.085
am-0.083
amb-0.081
Token be
Token ID307
Feature activation-9.14e-02

Pos logit contributions
 darker+0.093
 stairs+0.091
 security+0.089
 darkness+0.088
 narrower+0.088
Neg logit contributions
ish-0.106
anda-0.100
o-0.094
amb-0.093
an-0.092
Token strong
Token ID1913
Feature activation-2.84e-02

Pos logit contributions
 stairs+0.160
 security+0.148
 darkness+0.142
 darker+0.142
 basement+0.138
Neg logit contributions
ish-0.182
o-0.172
anda-0.168
am-0.159
an-0.157
Token,
Token ID11
Feature activation-4.25e-02

Pos logit contributions
 darker+0.045
 darkness+0.044
 stairs+0.043
 security+0.043
 narrower+0.043
Neg logit contributions
anda-0.059
ish-0.058
ouse-0.054
amb-0.053
an-0.053
Token too
Token ID1165
Feature activation-6.66e-02

Pos logit contributions
 stairs+0.077
 darker+0.077
 security+0.077
 darkness+0.075
 narrower+0.075
Neg logit contributions
ish-0.078
anda-0.074
o-0.068
amb-0.067
an-0.066
Token.
Token ID13
Feature activation-2.70e-02

Pos logit contributions
 darker+0.101
 stairs+0.100
 security+0.099
 darkness+0.098
 narrower+0.097
Neg logit contributions
ish-0.143
anda-0.138
o-0.127
amb-0.125
an-0.125
Token They
Token ID1119
Feature activation-2.03e-02

Pos logit contributions
 darker+0.043
 stairs+0.043
 darkness+0.042
 security+0.042
 narrower+0.041
Neg logit contributions
ish-0.056
anda-0.054
o-0.049
an-0.049
ouse-0.049
Token sheep
Token ID15900
Feature activation-8.97e-03

Pos logit contributions
 darker+0.072
 darkness+0.070
 security+0.069
 silence+0.069
 narrower+0.068
Neg logit contributions
ish-0.099
anda-0.097
an-0.094
o-0.089
amb-0.087
Token.
Token ID13
Feature activation-3.37e-02

Pos logit contributions
 stairs+0.013
 darker+0.012
 darkness+0.012
 security+0.012
 front+0.012
Neg logit contributions
anda-0.020
ish-0.019
ouse-0.018
amb-0.018
an-0.018
Token The
Token ID383
Feature activation-5.19e-02

Pos logit contributions
 darker+0.055
 security+0.054
 narrower+0.053
 darkness+0.052
 stairs+0.052
Neg logit contributions
ish-0.066
anda-0.065
o-0.061
amb-0.060
anna-0.059
Token sheep
Token ID15900
Feature activation+1.10e-02

Pos logit contributions
 darker+0.095
 stairs+0.094
 security+0.093
 narrower+0.093
 darkness+0.091
Neg logit contributions
ish-0.092
anda-0.087
an-0.085
amb-0.080
o-0.078
Token were
Token ID547
Feature activation-7.20e-02

Pos logit contributions
ish+0.023
anda+0.022
ouse+0.021
amb+0.020
an+0.020
Neg logit contributions
 stairs-0.017
 security-0.016
 darker-0.016
 darkness-0.016
 narrower-0.016
Token getting
Token ID1972
Feature activation-9.11e-02

Pos logit contributions
 darker+0.100
 stairs+0.098
 security+0.098
 darkness+0.097
 narrower+0.095
Neg logit contributions
ish-0.166
anda-0.160
amb-0.147
o-0.146
an-0.146
Token sleepy
Token ID42023
Feature activation-1.56e-02

Pos logit contributions
 basement+0.146
 stairs+0.139
 security+0.131
 opposite+0.129
 darkness+0.128
Neg logit contributions
o-0.188
ish-0.182
anda-0.173
am-0.165
ishes-0.152
Token after
Token ID706
Feature activation-2.15e-02

Pos logit contributions
 darker+0.030
 stairs+0.030
 security+0.030
 narrower+0.029
 darkness+0.029
Neg logit contributions
ish-0.026
anda-0.025
o-0.023
amb-0.022
an-0.022
Token all
Token ID477
Feature activation-4.22e-02

Pos logit contributions
 darker+0.035
 narrower+0.034
 stairs+0.033
 security+0.033
 darkness+0.033
Neg logit contributions
ish-0.042
anda-0.041
o-0.038
am-0.037
amb-0.037
Token the
Token ID262
Feature activation-4.95e-02

Pos logit contributions
 darker+0.044
 stairs+0.044
 security+0.043
 darkness+0.043
 narrower+0.041
Neg logit contributions
ish-0.111
anda-0.108
ouse-0.101
o-0.101
amb-0.101
Token walking
Token ID6155
Feature activation-4.03e-02

Pos logit contributions
 darker+0.082
 narrower+0.079
 security+0.078
 stairs+0.077
Where+0.076
Neg logit contributions
ish-0.094
anda-0.091
an-0.088
anna-0.085
o-0.083
Token.
Token ID13
Feature activation-2.78e-02

Pos logit contributions
 stairs+0.018
 darkness+0.017
 front+0.016
 dark+0.016
 darker+0.016
Neg logit contributions
anda-0.033
ish-0.031
amb-0.031
an-0.030
anna-0.029
Token
Token ID220
Feature activation-2.16e-02

Pos logit contributions
 security+0.050
 darker+0.049
 narrower+0.048
 darkness+0.047
 stairs+0.046
Neg logit contributions
ish-0.051
anda-0.048
o-0.048
amb-0.046
am-0.045
Token
Token ID198
Feature activation-2.76e-02

Pos logit contributions
 stairs+0.026
 darker+0.026
 security+0.025
 darkness+0.025
 opposite+0.024
Neg logit contributions
ish-0.053
anda-0.051
an-0.048
amb-0.047
ouse-0.046
Token
Token ID198
Feature activation-2.67e-02

Pos logit contributions
 darker+0.044
 security+0.043
 darkness+0.043
 narrower+0.043
 stairs+0.042
Neg logit contributions
ish-0.055
anda-0.054
o-0.052
amb-0.050
ouse-0.049
TokenThey
Token ID2990
Feature activation-2.64e-02

Pos logit contributions
 darker+0.037
 darkness+0.037
 security+0.035
 clues+0.034
 narrower+0.033
Neg logit contributions
o-0.056
amb-0.053
orse-0.052
ish-0.051
anda-0.051
Token built
Token ID3170
Feature activation-9.09e-02

Pos logit contributions
 darker+0.043
 security+0.042
 stairs+0.042
 darkness+0.042
 narrower+0.041
Neg logit contributions
ish-0.054
anda-0.051
o-0.048
am-0.046
amb-0.046
Token the
Token ID262
Feature activation-6.32e-02

Pos logit contributions
Oh+0.088
 security+0.085
 darker+0.084
Where+0.084
 darkness+0.082
Neg logit contributions
ish-0.232
anda-0.222
am-0.214
o-0.210
an-0.208
Token castle
Token ID16669
Feature activation-4.05e-02

Pos logit contributions
 stairs+0.124
 darker+0.123
 security+0.121
 darkness+0.121
 narrower+0.119
Neg logit contributions
ish-0.109
anda-0.105
amb-0.097
o-0.096
ouse-0.093
Token for
Token ID329
Feature activation-6.64e-02

Pos logit contributions
 stairs+0.060
 darker+0.057
 security+0.055
 darkness+0.055
 narrower+0.053
Neg logit contributions
ish-0.092
anda-0.091
amb-0.087
anna-0.084
o-0.082
Token the
Token ID262
Feature activation-4.57e-02

Pos logit contributions
Oh+0.093
 darker+0.090
Ready+0.084
 narrower+0.083
Care+0.082
Neg logit contributions
orse-0.125
o-0.115
ish-0.114
am-0.113
anda-0.110
Token whole
Token ID2187
Feature activation-4.94e-02

Pos logit contributions
 darker+0.077
 security+0.075
 stairs+0.075
 darkness+0.074
 narrower+0.074
Neg logit contributions
ish-0.091
anda-0.088
o-0.079
an-0.079
amb-0.077
Token't
Token ID470
Feature activation-2.28e-02

Pos logit contributions
 darker+0.030
 darkness+0.030
 security+0.029
 stairs+0.029
 narrower+0.029
Neg logit contributions
ish-0.033
anda-0.032
o-0.030
amb-0.030
am-0.029
Token know
Token ID760
Feature activation-5.51e-02

Pos logit contributions
 darker+0.042
 stairs+0.041
 security+0.041
 darkness+0.041
 narrower+0.040
Neg logit contributions
ish-0.042
anda-0.041
o-0.036
amb-0.036
an-0.036
Token how
Token ID703
Feature activation-4.82e-02

Pos logit contributions
 stairs+0.111
 darker+0.109
 darkness+0.107
 security+0.107
Time+0.107
Neg logit contributions
ish-0.086
anda-0.085
o-0.080
am-0.072
adow-0.069
Token.
Token ID13
Feature activation-1.93e-02

Pos logit contributions
 darker+0.071
 stairs+0.071
 security+0.070
 darkness+0.069
 narrower+0.068
Neg logit contributions
ish-0.103
anda-0.101
o-0.094
am-0.091
an-0.090
Token They
Token ID1119
Feature activation-1.38e-02

Pos logit contributions
 darker+0.033
 stairs+0.033
 security+0.032
 darkness+0.032
 narrower+0.032
Neg logit contributions
ish-0.038
anda-0.037
o-0.033
amb-0.033
an-0.033
Token needed
Token ID2622
Feature activation-9.09e-02

Pos logit contributions
 stairs+0.030
 darker+0.029
 outer+0.029
Time+0.028
 security+0.028
Neg logit contributions
ish-0.021
anda-0.019
o-0.019
am-0.018
an-0.016
Token a
Token ID257
Feature activation-6.72e-02

Pos logit contributions
Oh+0.190
jar+0.189
 darker+0.185
Ready+0.178
 horizon+0.177
Neg logit contributions
o-0.108
am-0.094
ow-0.093
gy-0.082
's-0.077
Token plan
Token ID1410
Feature activation-3.74e-02

Pos logit contributions
Where+0.094
Oh+0.092
 darker+0.092
 darkness+0.090
 narrower+0.089
Neg logit contributions
ish-0.146
anda-0.141
ouse-0.131
o-0.130
an-0.124
Token to
Token ID284
Feature activation-3.72e-02

Pos logit contributions
 darker+0.073
 narrower+0.070
Oh+0.070
 darkness+0.069
 stairs+0.069
Neg logit contributions
o-0.062
ish-0.060
anda-0.059
am-0.056
adow-0.054
Token keep
Token ID1394
Feature activation-4.16e-02

Pos logit contributions
 darker+0.057
 stairs+0.056
 security+0.055
 darkness+0.054
 narrower+0.054
Neg logit contributions
ish-0.079
anda-0.076
o-0.071
an-0.070
amb-0.069
Token the
Token ID262
Feature activation-5.68e-02

Pos logit contributions
 darker+0.058
 stairs+0.057
 security+0.055
 narrower+0.055
 darkness+0.054
Neg logit contributions
ish-0.095
anda-0.090
o-0.087
am-0.084
an-0.083
Token band
Token ID4097
Feature activation-2.29e-03

Pos logit contributions
 stairs+0.041
 darker+0.040
 security+0.039
 darkness+0.038
 silence+0.037
Neg logit contributions
ish-0.088
anda-0.085
amb-0.079
o-0.079
an-0.079
Tokenages
Token ID1095
Feature activation-2.47e-02

Pos logit contributions
 security+0.005
 stairs+0.005
 silence+0.005
 many+0.005
 front+0.005
Neg logit contributions
ish-0.003
anda-0.003
amb-0.002
an-0.002
 Majesty-0.002
Token.
Token ID13
Feature activation-4.99e-02

Pos logit contributions
 stairs+0.035
 security+0.034
 darker+0.034
 darkness+0.033
 silence+0.032
Neg logit contributions
ish-0.055
anda-0.054
an-0.049
o-0.049
amb-0.049
Token Daisy
Token ID40355
Feature activation+6.35e-04

Pos logit contributions
 darker+0.078
 security+0.077
 stairs+0.076
 narrower+0.075
 darkness+0.074
Neg logit contributions
ish-0.105
anda-0.100
o-0.093
amb-0.092
am-0.090
Token chose
Token ID7690
Feature activation-6.69e-02

Pos logit contributions
ish+0.001
anda+0.001
an+0.001
amb+0.001
ouse+0.001
Neg logit contributions
 stairs-0.001
 security-0.001
 front-0.001
 darkness-0.001
 hours-0.001
Token a
Token ID257
Feature activation-9.09e-02

Pos logit contributions
 darker+0.063
 stairs+0.060
 darkness+0.059
 narrower+0.059
 security+0.059
Neg logit contributions
ish-0.181
anda-0.177
o-0.169
amb-0.162
an-0.161
Token set
Token ID900
Feature activation-4.29e-02

Pos logit contributions
 darker+0.111
 narrower+0.108
 darkness+0.108
 stairs+0.108
 security+0.106
Neg logit contributions
ish-0.220
anda-0.213
ouse-0.200
o-0.197
an-0.192
Token of
Token ID286
Feature activation-6.69e-02

Pos logit contributions
 stairs+0.059
 security+0.058
 darker+0.056
 darkness+0.055
 front+0.055
Neg logit contributions
ish-0.098
anda-0.096
o-0.088
amb-0.088
an-0.086
Token colourful
Token ID44799
Feature activation-6.32e-02

Pos logit contributions
 darker+0.087
 narrower+0.086
 stairs+0.085
 security+0.084
Where+0.083
Neg logit contributions
ish-0.156
anda-0.146
o-0.138
an-0.134
amb-0.133
Token band
Token ID4097
Feature activation-9.02e-03

Pos logit contributions
 darker+0.092
 stairs+0.091
 security+0.090
 darkness+0.089
 narrower+0.088
Neg logit contributions
ish-0.139
anda-0.135
o-0.124
amb-0.123
an-0.122
Tokenages
Token ID1095
Feature activation-3.13e-02

Pos logit contributions
 security+0.016
 stairs+0.016
 silence+0.016
 front+0.015
 hours+0.015
Neg logit contributions
ish-0.016
anda-0.015
am-0.013
an-0.013
 Majesty-0.013
Token big
Token ID1263
Feature activation-1.92e-02

Pos logit contributions
 security+0.091
Time+0.090
Oh+0.087
Where+0.086
 silence+0.086
Neg logit contributions
ish-0.113
o-0.104
anda-0.103
am-0.098
an-0.094
Token!
Token ID0
Feature activation-2.69e-02

Pos logit contributions
 darker+0.029
 stairs+0.029
 security+0.029
 darkness+0.028
 narrower+0.028
Neg logit contributions
ish-0.041
anda-0.039
o-0.036
amb-0.036
an-0.035
Token The
Token ID383
Feature activation-5.59e-02

Pos logit contributions
 darker+0.051
 security+0.051
 stairs+0.050
 darkness+0.050
 narrower+0.049
Neg logit contributions
ish-0.047
anda-0.045
o-0.041
amb-0.040
am-0.040
Token hole
Token ID7604
Feature activation-1.04e-03

Pos logit contributions
 security+0.087
 outer+0.086
 January+0.080
Oh+0.080
Where+0.080
Neg logit contributions
ish-0.119
an-0.103
anda-0.098
am-0.094
sters-0.088
Token was
Token ID373
Feature activation-7.79e-02

Pos logit contributions
 darker+0.002
 stairs+0.002
 darkness+0.001
 narrower+0.001
 security+0.001
Neg logit contributions
ish-0.002
anda-0.002
amb-0.002
anna-0.002
ouse-0.002
Token too
Token ID1165
Feature activation-9.09e-02

Pos logit contributions
 security+0.125
Ready+0.122
Oh+0.118
Time+0.118
 opposite+0.118
Neg logit contributions
ish-0.143
o-0.137
anda-0.132
am-0.121
adow-0.115
Token big
Token ID1263
Feature activation-1.66e-02

Pos logit contributions
 security+0.124
Where+0.121
Ready+0.120
 emergencies+0.115
Oh+0.114
Neg logit contributions
ish-0.208
o-0.194
anda-0.183
ouse-0.175
an-0.172
Token,
Token ID11
Feature activation-2.58e-02

Pos logit contributions
 darker+0.027
 stairs+0.027
 security+0.026
 darkness+0.026
 narrower+0.026
Neg logit contributions
ish-0.034
anda-0.033
o-0.030
amb-0.030
an-0.030
Token and
Token ID290
Feature activation-5.54e-02

Pos logit contributions
 darker+0.043
 stairs+0.043
 security+0.043
 darkness+0.042
 narrower+0.042
Neg logit contributions
ish-0.051
anda-0.049
o-0.045
amb-0.044
an-0.044
Token Joe
Token ID5689
Feature activation+1.35e-02

Pos logit contributions
 security+0.081
 stairs+0.078
 darker+0.077
 darkness+0.074
 silence+0.074
Neg logit contributions
ish-0.120
o-0.109
anda-0.109
am-0.106
orse-0.102
Token was
Token ID373
Feature activation-5.71e-02

Pos logit contributions
anda+0.030
amb+0.030
ish+0.029
anna+0.028
ouse+0.027
Neg logit contributions
 darkness-0.017
 stairs-0.016
 security-0.016
 hours-0.016
 narrower-0.016
Token an
Token ID281
Feature activation-3.79e-02

Pos logit contributions
 stairs+0.086
 darker+0.085
 darkness+0.081
 narrower+0.081
 security+0.079
Neg logit contributions
ish-0.096
anda-0.090
o-0.089
amb-0.085
am-0.084
Token important
Token ID1593
Feature activation-2.57e-02

Pos logit contributions
 stairs+0.048
 darker+0.047
 security+0.046
 darkness+0.045
 narrower+0.044
Neg logit contributions
ish-0.093
anda-0.090
an-0.085
amb-0.082
o-0.082
Token lesson
Token ID11483
Feature activation-1.26e-02

Pos logit contributions
 darker+0.021
 darkness+0.019
 stairs+0.019
 narrower+0.018
 security+0.018
Neg logit contributions
ish-0.074
anda-0.071
an-0.069
ouse-0.069
o-0.069
Token.
Token ID13
Feature activation-2.73e-02

Pos logit contributions
 darker+0.022
 stairs+0.021
 security+0.021
 darkness+0.021
 narrower+0.021
Neg logit contributions
ish-0.024
anda-0.023
o-0.022
amb-0.021
an-0.021
Token He
Token ID679
Feature activation-1.47e-02

Pos logit contributions
 stairs+0.044
 darkness+0.044
 security+0.043
 darker+0.043
 silence+0.042
Neg logit contributions
ish-0.057
anda-0.055
an-0.049
ouse-0.048
amb-0.048
Token needed
Token ID2622
Feature activation-9.06e-02

Pos logit contributions
 darkness+0.018
 security+0.018
 narrower+0.017
 darker+0.017
 stairs+0.017
Neg logit contributions
anda-0.032
ish-0.032
ouse-0.030
amb-0.029
anna-0.029
Token to
Token ID284
Feature activation-4.24e-02

Pos logit contributions
 darker+0.185
 narrower+0.178
 stairs+0.172
 security+0.170
jar+0.168
Neg logit contributions
o-0.123
ish-0.121
am-0.106
orse-0.099
amb-0.098
Token be
Token ID307
Feature activation-7.54e-02

Pos logit contributions
 stairs+0.058
 security+0.058
 darker+0.058
 darkness+0.058
 silence+0.056
Neg logit contributions
ish-0.096
anda-0.094
o-0.086
amb-0.085
ouse-0.085
Token careful
Token ID8161
Feature activation-5.90e-03

Pos logit contributions
 stairs+0.122
 darker+0.119
 security+0.117
 darkness+0.116
 narrower+0.114
Neg logit contributions
ish-0.158
anda-0.149
o-0.140
am-0.136
an-0.135
Token when
Token ID618
Feature activation-4.09e-02

Pos logit contributions
 darkness+0.008
 stairs+0.008
 darker+0.008
 silence+0.008
 security+0.007
Neg logit contributions
anda-0.014
ish-0.013
amb-0.012
ouse-0.012
an-0.012
Token stretching
Token ID20880
Feature activation-4.55e-02

Pos logit contributions
 darker+0.066
 stairs+0.065
 security+0.064
 darkness+0.064
 narrower+0.063
Neg logit contributions
ish-0.084
anda-0.081
o-0.074
amb-0.074
an-0.073
Token quiet
Token ID5897
Feature activation+1.53e-01

Pos logit contributions
 darker+0.072
 stairs+0.071
 security+0.070
 darkness+0.070
 narrower+0.068
Neg logit contributions
ish-0.094
anda-0.092
amb-0.085
o-0.085
ouse-0.083
Token compared
Token ID3688
Feature activation+8.49e-02

Pos logit contributions
anda+0.388
ish+0.387
ouse+0.350
amb+0.349
o+0.348
Neg logit contributions
 silence-0.189
 stairs-0.182
 darkness-0.178
 opposite-0.167
 front-0.166
Token to
Token ID284
Feature activation+5.59e-02

Pos logit contributions
ish+0.212
anda+0.206
o+0.191
amb+0.190
ouse+0.189
Neg logit contributions
 stairs-0.100
 darker-0.100
 security-0.098
 darkness-0.097
 narrower-0.092
Token the
Token ID262
Feature activation+6.39e-02

Pos logit contributions
ish+0.141
anda+0.137
o+0.128
amb+0.127
an+0.126
Neg logit contributions
 darker-0.063
 stairs-0.063
 security-0.062
 darkness-0.061
 narrower-0.059
Token noisy
Token ID31210
Feature activation+7.85e-02

Pos logit contributions
ish+0.156
anda+0.150
an+0.139
o+0.137
ouse+0.135
Neg logit contributions
 darker-0.075
 narrower-0.073
 security-0.072
 stairs-0.071
 darkness-0.069
Token se
Token ID384
Feature activation+1.56e-01

Pos logit contributions
ish+0.185
anda+0.182
amb+0.169
o+0.169
am+0.165
Neg logit contributions
 stairs-0.103
 darker-0.100
 darkness-0.100
 security-0.099
 silence-0.097
Tokenag
Token ID363
Feature activation+9.48e-02

Pos logit contributions
anda+0.278
sters+0.261
ish+0.256
oppy+0.241
amb+0.237
Neg logit contributions
 lesson-0.241
Time-0.236
 opposite-0.226
 month-0.222
 hours-0.222
Tokenull
Token ID724
Feature activation+1.05e-01

Pos logit contributions
ish+0.197
anda+0.190
amb+0.172
ouse+0.163
sters+0.159
Neg logit contributions
 silence-0.145
 stairs-0.143
 darkness-0.142
 front-0.141
 opposite-0.139
Tokens
Token ID82
Feature activation+6.91e-02

Pos logit contributions
ish+0.200
anda+0.196
amb+0.193
ouse+0.184
am+0.174
Neg logit contributions
 silence-0.147
 darkness-0.143
 front-0.143
 darker-0.141
 dark-0.140
Token outside
Token ID2354
Feature activation+6.67e-02

Pos logit contributions
ish+0.151
anda+0.148
amb+0.136
o+0.134
an+0.133
Neg logit contributions
 darkness-0.100
 darker-0.100
 stairs-0.099
 silence-0.098
 security-0.097
Token.
Token ID13
Feature activation+7.57e-02

Pos logit contributions
ish+0.148
anda+0.145
amb+0.133
o+0.132
an+0.131
Neg logit contributions
 darker-0.096
 stairs-0.096
 darkness-0.094
 security-0.094
 silence-0.091
Token't
Token ID470
Feature activation+1.01e-01

Pos logit contributions
 darker+0.063
 stairs+0.063
 security+0.062
 darkness+0.060
 silence+0.060
Neg logit contributions
ish-0.081
anda-0.077
an-0.073
amb-0.072
o-0.071
Token know
Token ID760
Feature activation+7.05e-02

Pos logit contributions
ish+0.244
anda+0.240
amb+0.223
an+0.220
ouse+0.218
Neg logit contributions
 silence-0.133
 security-0.130
 danger-0.127
 dark-0.125
 darkness-0.124
Token the
Token ID262
Feature activation+7.59e-02

Pos logit contributions
ish+0.159
anda+0.154
amb+0.144
o+0.142
an+0.142
Neg logit contributions
 stairs-0.099
 darker-0.099
 security-0.098
 darkness-0.097
 silence-0.094
Token rh
Token ID9529
Feature activation+1.35e-01

Pos logit contributions
ish+0.182
anda+0.176
an+0.162
o+0.161
ouse+0.158
Neg logit contributions
 darker-0.093
 security-0.091
 stairs-0.091
 narrower-0.091
 darkness-0.088
Tokenin
Token ID259
Feature activation+1.05e-01

Pos logit contributions
ish+0.210
anda+0.209
amb+0.194
anna+0.194
am+0.188
Neg logit contributions
 security-0.269
 stairs-0.252
 darker-0.251
 darkness-0.247
Where-0.246
Tokenoc
Token ID420
Feature activation+1.59e-01

Pos logit contributions
ish+0.127
anda+0.124
amb+0.121
anna+0.110
am+0.109
Neg logit contributions
 security-0.242
 silence-0.238
 darkness-0.233
 stairs-0.229
 danger-0.228
Tokeneros
Token ID27498
Feature activation+1.12e-01

Pos logit contributions
anda+0.182
ouse+0.178
adow+0.175
orse+0.168
ish+0.167
Neg logit contributions
 security-0.370
 stairs-0.352
 front-0.350
 opposite-0.348
 darkness-0.346
Token was
Token ID373
Feature activation+4.20e-02

Pos logit contributions
ish+0.252
amb+0.247
anda+0.244
ouse+0.233
anna+0.222
Neg logit contributions
 security-0.157
 darkness-0.146
 silence-0.144
 stairs-0.141
 darker-0.141
Token wild
Token ID4295
Feature activation+8.21e-02

Pos logit contributions
ish+0.098
anda+0.095
o+0.089
an+0.086
am+0.086
Neg logit contributions
 stairs-0.055
 darker-0.054
 security-0.053
 narrower-0.052
 darkness-0.052
Token.
Token ID13
Feature activation+6.67e-02

Pos logit contributions
anda+0.193
ish+0.192
ouse+0.190
orse+0.188
amb+0.179
Neg logit contributions
 stairs-0.107
 silence-0.103
 darkness-0.103
 darker-0.102
 security-0.099
Token Can
Token ID1680
Feature activation+6.82e-02

Pos logit contributions
ish+0.160
anda+0.149
an+0.142
ouse+0.138
amb+0.135
Neg logit contributions
 stairs-0.090
 darkness-0.088
 silence-0.087
 darker-0.087
 security-0.086
Token and
Token ID290
Feature activation+1.05e-01

Pos logit contributions
 stairs+0.014
 darker+0.014
 security+0.014
 darkness+0.014
 narrower+0.013
Neg logit contributions
ish-0.021
anda-0.021
ouse-0.019
an-0.019
o-0.019
Token couldn
Token ID3521
Feature activation+7.13e-02

Pos logit contributions
ish+0.253
anda+0.248
amb+0.238
an+0.233
ouse+0.225
Neg logit contributions
 stairs-0.143
 darker-0.136
 darkness-0.136
 dark-0.135
 silence-0.135
Token't
Token ID470
Feature activation+6.80e-02

Pos logit contributions
ish+0.199
anda+0.190
an+0.182
amb+0.176
ouse+0.175
Neg logit contributions
 stairs-0.075
 front-0.069
 silence-0.068
 darkness-0.067
 darker-0.065
Token stop
Token ID2245
Feature activation+6.19e-02

Pos logit contributions
ish+0.164
anda+0.164
o+0.150
amb+0.147
ouse+0.147
Neg logit contributions
 darkness-0.088
 silence-0.085
 security-0.084
 front-0.084
 stairs-0.083
Token smiling
Token ID16755
Feature activation+7.69e-02

Pos logit contributions
anda+0.152
ish+0.151
ouse+0.146
an+0.143
amb+0.140
Neg logit contributions
 darker-0.074
 front-0.074
 darkness-0.073
 silence-0.072
 stairs-0.072
Token.
Token ID13
Feature activation+7.62e-02

Pos logit contributions
ish+0.185
anda+0.184
o+0.174
an+0.170
orse+0.169
Neg logit contributions
 darker-0.094
 security-0.094
 dark-0.091
 silence-0.091
 stairs-0.091
Token<|endoftext|>
Token ID50256
Feature activation+7.44e-02

Pos logit contributions
ish+0.197
anda+0.182
an+0.180
ouse+0.170
ese+0.165
Neg logit contributions
 stairs-0.092
Oh-0.086
 opposite-0.086
 silence-0.086
Where-0.085
TokenOnce
Token ID7454
Feature activation+6.23e-02

Pos logit contributions
ish+0.176
anda+0.170
amb+0.156
an+0.155
o+0.155
Neg logit contributions
 stairs-0.097
 darker-0.097
 security-0.094
 darkness-0.094
 narrower-0.091
Token upon
Token ID2402
Feature activation+7.51e-02

Pos logit contributions
ish+0.156
amb+0.151
ese+0.149
an+0.146
anda+0.145
Neg logit contributions
 stairs-0.070
 darker-0.070
 dark-0.070
 darkness-0.065
 many-0.064
Token a
Token ID257
Feature activation+3.77e-02

Pos logit contributions
ish+0.234
anda+0.228
amb+0.213
o+0.212
an+0.212
Neg logit contributions
 darker-0.044
 stairs-0.043
 security-0.041
 darkness-0.041
 narrower-0.038
Token time
Token ID640
Feature activation+3.16e-02

Pos logit contributions
ish+0.130
anda+0.127
amb+0.123
ouse+0.122
an+0.120
Neg logit contributions
 dark-0.006
 opposite-0.005
 silence-0.005
 stairs-0.005
 hours-0.005
Token her
Token ID607
Feature activation+8.38e-02

Pos logit contributions
ish+0.127
anda+0.123
o+0.117
amb+0.114
am+0.113
Neg logit contributions
 darker-0.061
 narrower-0.058
 security-0.057
 stairs-0.057
 darkness-0.056
Token mom
Token ID1995
Feature activation+8.33e-02

Pos logit contributions
ish+0.211
anda+0.202
amb+0.199
o+0.195
an+0.191
Neg logit contributions
 stairs-0.103
 security-0.099
 front-0.097
 darkness-0.097
 darker-0.091
Tokenmy
Token ID1820
Feature activation+8.54e-02

Pos logit contributions
anda+0.169
amb+0.162
ish+0.160
an+0.160
orse+0.152
Neg logit contributions
 front-0.143
 corner-0.135
 stairs-0.130
 dark-0.126
Where-0.123
Token.
Token ID13
Feature activation+5.83e-02

Pos logit contributions
anda+0.211
ish+0.199
an+0.196
amb+0.195
orse+0.188
Neg logit contributions
 front-0.104
 stairs-0.103
 corner-0.097
 darkness-0.093
 security-0.091
Token "
Token ID366
Feature activation+6.23e-02

Pos logit contributions
ish+0.137
anda+0.131
an+0.123
amb+0.117
am+0.117
Neg logit contributions
 stairs-0.080
 darker-0.078
 darkness-0.077
 security-0.076
Where-0.074
TokenMom
Token ID29252
Feature activation+8.47e-02

Pos logit contributions
ish+0.160
o+0.156
amb+0.156
anda+0.154
ouse+0.148
Neg logit contributions
 darker-0.064
 stairs-0.062
 security-0.062
 darkness-0.060
 narrower-0.059
Tokenmy
Token ID1820
Feature activation+8.46e-02

Pos logit contributions
amb+0.182
orse+0.172
an+0.171
ans+0.170
anda+0.165
Neg logit contributions
 front-0.117
 dark-0.116
 stairs-0.113
 darker-0.112
 last-0.111
Token,
Token ID11
Feature activation+4.65e-02

Pos logit contributions
amb+0.196
anda+0.182
an+0.180
ans+0.178
orse+0.177
Neg logit contributions
 dark-0.100
 hard-0.094
 stairs-0.093
 darkness-0.092
 security-0.092
Token can
Token ID460
Feature activation+4.54e-02

Pos logit contributions
ish+0.104
anda+0.102
amb+0.099
an+0.098
am+0.096
Neg logit contributions
 stairs-0.066
 security-0.065
 darkness-0.063
 darker-0.061
 dark-0.061
Token we
Token ID356
Feature activation+4.15e-02

Pos logit contributions
ish+0.112
anda+0.106
amb+0.104
an+0.104
adow+0.102
Neg logit contributions
 front-0.059
 stairs-0.058
Where-0.053
Oh-0.053
 top-0.053
Token find
Token ID1064
Feature activation+4.41e-03

Pos logit contributions
anda+0.101
ish+0.100
anna+0.093
an+0.092
o+0.092
Neg logit contributions
 front-0.049
 stairs-0.048
 security-0.048
 darkness-0.046
 darker-0.045
Token to
Token ID284
Feature activation+4.97e-02

Pos logit contributions
ish+0.058
anda+0.056
o+0.052
amb+0.051
an+0.051
Neg logit contributions
 darker-0.038
 security-0.037
 stairs-0.037
 darkness-0.036
 narrower-0.036
Token share
Token ID2648
Feature activation+4.13e-02

Pos logit contributions
ish+0.120
anda+0.119
amb+0.109
ouse+0.108
an+0.108
Neg logit contributions
 silence-0.064
 front-0.063
 opposite-0.061
 stairs-0.061
 damage-0.060
Token!
Token ID0
Feature activation+6.30e-02

Pos logit contributions
ish+0.078
anda+0.076
amb+0.071
an+0.069
ouse+0.069
Neg logit contributions
 stairs-0.074
 security-0.073
 darker-0.072
 silence-0.072
 opposite-0.072
Token It
Token ID632
Feature activation+6.91e-02

Pos logit contributions
ish+0.139
anda+0.128
an+0.125
am+0.120
ouse+0.120
Neg logit contributions
Oh-0.092
 stairs-0.091
Where-0.091
 silence-0.091
 opposite-0.087
Token's
Token ID338
Feature activation+3.97e-02

Pos logit contributions
amb+0.152
orse+0.147
ouse+0.136
anna+0.134
adow+0.130
Neg logit contributions
 opposite-0.102
 hard-0.101
 stairs-0.100
 dark-0.099
 darker-0.098
Token mine
Token ID6164
Feature activation+9.03e-02

Pos logit contributions
ish+0.081
anda+0.078
o+0.072
amb+0.070
an+0.070
Neg logit contributions
 darker-0.064
 stairs-0.064
 security-0.063
 darkness-0.063
 narrower-0.062
Token!"
Token ID2474
Feature activation+5.97e-02

Pos logit contributions
ish+0.171
orse+0.169
amb+0.165
anda+0.158
an+0.156
Neg logit contributions
 stairs-0.167
 darker-0.153
 dark-0.150
 hard-0.146
 front-0.143
Token said
Token ID531
Feature activation+3.56e-02

Pos logit contributions
ish+0.156
anda+0.148
an+0.139
ouse+0.137
amb+0.134
Neg logit contributions
 stairs-0.071
 opposite-0.071
 security-0.071
 silence-0.070
 darkness-0.069
Token Billy
Token ID15890
Feature activation+7.29e-02

Pos logit contributions
ish+0.078
anda+0.075
o+0.071
amb+0.069
am+0.068
Neg logit contributions
 darker-0.051
 stairs-0.051
 narrower-0.050
 security-0.049
 darkness-0.049
Token.
Token ID13
Feature activation+5.25e-02

Pos logit contributions
ish+0.194
anda+0.192
amb+0.182
an+0.180
ans+0.177
Neg logit contributions
 front-0.063
 security-0.062
 harder-0.060
 silence-0.057
 opposite-0.057
Token\n
Token ID198
Feature activation+4.09e-02

Pos logit contributions
ish+0.123
anda+0.119
an+0.109
amb+0.108
o+0.107
Neg logit contributions
 darker-0.070
 stairs-0.070
 security-0.069
 darkness-0.068
 narrower-0.067
Token the
Token ID262
Feature activation+6.34e-02

Pos logit contributions
ish+0.169
anda+0.166
amb+0.155
ouse+0.155
o+0.155
Neg logit contributions
 stairs-0.064
 security-0.064
 darker-0.063
 darkness-0.061
 opposite-0.059
Token mouse
Token ID10211
Feature activation+9.06e-02

Pos logit contributions
ish+0.142
anda+0.137
o+0.125
an+0.125
ouse+0.123
Neg logit contributions
 darker-0.089
 security-0.086
 narrower-0.086
 stairs-0.085
 darkness-0.085
Token.
Token ID13
Feature activation+7.02e-02

Pos logit contributions
anda+0.214
ish+0.212
amb+0.212
anna+0.199
ouse+0.198
Neg logit contributions
 security-0.111
 stairs-0.111
 front-0.109
 darkness-0.106
 silence-0.102
Token Then
Token ID3244
Feature activation+4.49e-02

Pos logit contributions
ish+0.179
anda+0.167
an+0.163
ouse+0.153
ese+0.149
Neg logit contributions
 stairs-0.088
 front-0.080
Where-0.080
 opposite-0.079
Oh-0.079
Token,
Token ID11
Feature activation+3.81e-02

Pos logit contributions
ish+0.107
anda+0.107
an+0.100
amb+0.100
ouse+0.097
Neg logit contributions
 stairs-0.057
 security-0.057
 darkness-0.056
 darker-0.056
 silence-0.055
Token she
Token ID673
Feature activation+7.82e-02

Pos logit contributions
ish+0.096
anda+0.094
an+0.087
amb+0.087
ouse+0.086
Neg logit contributions
 stairs-0.045
 darker-0.044
 security-0.044
 darkness-0.043
 silence-0.041
Token took
Token ID1718
Feature activation+7.77e-03

Pos logit contributions
ish+0.175
anda+0.167
amb+0.166
anna+0.159
an+0.158
Neg logit contributions
 security-0.108
 stairs-0.107
 darkness-0.098
 front-0.097
 darker-0.095
Token the
Token ID262
Feature activation+1.33e-02

Pos logit contributions
ish+0.020
anda+0.019
o+0.018
am+0.018
amb+0.017
Neg logit contributions
 darker-0.009
 narrower-0.008
 stairs-0.008
 security-0.008
 darkness-0.008
Token mouse
Token ID10211
Feature activation+5.73e-02

Pos logit contributions
ish+0.028
anda+0.027
an+0.024
o+0.024
ouse+0.023
Neg logit contributions
 darker-0.021
 narrower-0.020
Time-0.020
 security-0.020
 darkness-0.019
Token outside
Token ID2354
Feature activation+3.59e-02

Pos logit contributions
anda+0.141
ish+0.136
amb+0.133
anna+0.130
ouse+0.130
Neg logit contributions
 stairs-0.066
 darkness-0.066
 front-0.065
 security-0.064
 opposite-0.062
Token and
Token ID290
Feature activation+3.69e-02

Pos logit contributions
ish+0.076
anda+0.075
an+0.068
amb+0.068
o+0.067
Neg logit contributions
 stairs-0.056
 darker-0.055
 darkness-0.054
 security-0.054
 silence-0.052
Token huge
Token ID3236
Feature activation+9.01e-02

Pos logit contributions
ish+0.151
anda+0.147
o+0.137
an+0.135
amb+0.134
Neg logit contributions
 darker-0.065
 stairs-0.064
 security-0.063
 darkness-0.062
 narrower-0.062
Token guard
Token ID4860
Feature activation+1.34e-01

Pos logit contributions
ish+0.198
anda+0.197
amb+0.186
am+0.183
o+0.183
Neg logit contributions
 stairs-0.130
 darkness-0.126
 darker-0.126
 silence-0.126
 security-0.126
Token stood
Token ID6204
Feature activation+5.79e-02

Pos logit contributions
ish+0.306
anda+0.282
amb+0.277
ouse+0.271
an+0.266
Neg logit contributions
 security-0.194
 stairs-0.180
 darkness-0.169
 darker-0.168
 silence-0.166
Token at
Token ID379
Feature activation+5.32e-02

Pos logit contributions
ish+0.133
anda+0.129
amb+0.118
o+0.117
ouse+0.117
Neg logit contributions
 darker-0.081
 security-0.079
 stairs-0.078
 darkness-0.078
 narrower-0.076
Token the
Token ID262
Feature activation+5.84e-02

Pos logit contributions
ish+0.121
anda+0.117
o+0.111
amb+0.108
an+0.107
Neg logit contributions
 darker-0.072
 narrower-0.069
 stairs-0.068
 security-0.068
 darkness-0.067
Token entrance
Token ID10384
Feature activation+6.96e-02

Pos logit contributions
ish+0.143
anda+0.140
o+0.129
an+0.128
amb+0.127
Neg logit contributions
 darker-0.070
 stairs-0.068
 security-0.067
 narrower-0.066
 darkness-0.066
Token.
Token ID13
Feature activation+6.05e-02

Pos logit contributions
ish+0.153
anda+0.148
amb+0.137
an+0.135
o+0.135
Neg logit contributions
 stairs-0.102
 darker-0.102
 security-0.100
 darkness-0.099
 narrower-0.096
Token He
Token ID679
Feature activation+5.73e-02

Pos logit contributions
ish+0.136
anda+0.129
an+0.120
ouse+0.116
amb+0.116
Neg logit contributions
 stairs-0.090
 darker-0.087
 security-0.086
 darkness-0.086
 narrower-0.083
Token said
Token ID531
Feature activation+2.29e-02

Pos logit contributions
ish+0.122
anda+0.113
amb+0.108
an+0.105
ouse+0.105
Neg logit contributions
 security-0.092
 stairs-0.090
 darkness-0.089
 darker-0.088
 narrower-0.086
Token to
Token ID284
Feature activation+5.05e-02

Pos logit contributions
ish+0.052
anda+0.048
amb+0.046
an+0.046
o+0.045
Neg logit contributions
 security-0.034
 darker-0.033
 stairs-0.033
 silence-0.032
 darkness-0.031
Token Lucy
Token ID22162
Feature activation+7.94e-02

Pos logit contributions
ish+0.137
anda+0.133
o+0.124
ouse+0.123
amb+0.123
Neg logit contributions
 stairs-0.050
 security-0.050
 darker-0.048
 darkness-0.048
 opposite-0.046
Token the
Token ID262
Feature activation-5.36e-02

Pos logit contributions
 security+0.057
 stairs+0.055
 narrower+0.054
 darker+0.053
 darkness+0.052
Neg logit contributions
ish-0.094
anda-0.084
o-0.083
am-0.081
amb-0.080
Token food
Token ID2057
Feature activation-1.06e-02

Pos logit contributions
 darker+0.071
 security+0.070
 stairs+0.070
 darkness+0.068
 narrower+0.068
Neg logit contributions
ish-0.127
anda-0.123
o-0.113
an-0.113
amb-0.111
Token and
Token ID290
Feature activation-2.21e-02

Pos logit contributions
 stairs+0.017
 darker+0.017
 darkness+0.017
 security+0.017
 silence+0.017
Neg logit contributions
ish-0.021
anda-0.021
amb-0.019
o-0.019
an-0.018
Token then
Token ID788
Feature activation-2.16e-02

Pos logit contributions
 darker+0.038
 stairs+0.038
 darkness+0.038
 security+0.038
 silence+0.037
Neg logit contributions
ish-0.043
anda-0.042
amb-0.038
an-0.037
o-0.037
Token soared
Token ID35292
Feature activation-3.89e-02

Pos logit contributions
 stairs+0.032
 darker+0.032
 darkness+0.032
 security+0.031
 dark+0.031
Neg logit contributions
anda-0.047
ish-0.046
amb-0.042
an-0.042
o-0.042
Token back
Token ID736
Feature activation-5.04e-03

Pos logit contributions
 darker+0.054
 stairs+0.050
 darkness+0.049
 opposite+0.049
 dark+0.049
Neg logit contributions
ish-0.091
anda-0.090
ouse-0.082
orse-0.082
anna-0.082
Token up
Token ID510
Feature activation-3.03e-02

Pos logit contributions
 stairs+0.007
 darker+0.007
 darkness+0.006
 front+0.006
 security+0.006
Neg logit contributions
ish-0.012
anda-0.011
orse-0.011
an-0.011
ouse-0.011
Token into
Token ID656
Feature activation-5.74e-02

Pos logit contributions
Time+0.048
 security+0.048
 narrower+0.048
 emergencies+0.048
Where+0.047
Neg logit contributions
ish-0.058
o-0.055
anda-0.054
am-0.054
amb-0.051
Token the
Token ID262
Feature activation-3.50e-02

Pos logit contributions
 narrower+0.091
 darker+0.090
Ready+0.085
Time+0.084
Listen+0.082
Neg logit contributions
ish-0.094
orse-0.090
o-0.087
am-0.084
anda-0.081
Token sky
Token ID6766
Feature activation-5.36e-03

Pos logit contributions
 security+0.039
 narrower+0.038
Where+0.036
 darker+0.036
 stairs+0.036
Neg logit contributions
ish-0.089
anda-0.086
an-0.082
o-0.079
ouse-0.077
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
 darker+0.006
 darkness+0.006
 silence+0.006
 stairs+0.006
 front+0.006
Neg logit contributions
anda-0.013
ish-0.013
amb-0.012
ouse-0.012
an-0.012
Token saw
Token ID2497
Feature activation-1.42e-02

Pos logit contributions
ish+0.084
amb+0.083
anda+0.080
ouse+0.079
an+0.077
Neg logit contributions
 security-0.052
 darker-0.050
 darkness-0.050
 stairs-0.050
 front-0.049
Token what
Token ID644
Feature activation+4.79e-03

Pos logit contributions
 stairs+0.016
 security+0.016
 darker+0.016
 darkness+0.016
 opposite+0.015
Neg logit contributions
ish-0.036
anda-0.036
amb-0.033
ouse-0.033
an-0.033
Token was
Token ID373
Feature activation-1.08e-02

Pos logit contributions
ish+0.009
anda+0.009
o+0.009
ouse+0.008
am+0.008
Neg logit contributions
 darker-0.008
 narrower-0.008
 stairs-0.008
 darkness-0.008
 security-0.008
Token happening
Token ID5836
Feature activation+1.01e-02

Pos logit contributions
 darker+0.013
 stairs+0.013
 security+0.013
 darkness+0.013
 narrower+0.013
Neg logit contributions
ish-0.027
anda-0.026
o-0.024
an-0.024
amb-0.024
Token and
Token ID290
Feature activation+1.04e-02

Pos logit contributions
ish+0.019
anda+0.018
o+0.017
amb+0.016
an+0.016
Neg logit contributions
 darker-0.018
 stairs-0.018
 security-0.018
 darkness-0.018
 narrower-0.017
Token came
Token ID1625
Feature activation-4.44e-03

Pos logit contributions
ish+0.021
anda+0.020
amb+0.019
an+0.019
ouse+0.018
Neg logit contributions
 security-0.017
 darker-0.017
 stairs-0.017
 darkness-0.017
 narrower-0.017
Token over
Token ID625
Feature activation-3.34e-03

Pos logit contributions
 darker+0.006
 stairs+0.006
 security+0.006
 darkness+0.006
 narrower+0.006
Neg logit contributions
ish-0.010
anda-0.010
o-0.009
amb-0.009
am-0.009
Token to
Token ID284
Feature activation-1.04e-03

Pos logit contributions
 darker+0.006
 narrower+0.006
 stairs+0.006
 darkness+0.006
 security+0.006
Neg logit contributions
ish-0.005
o-0.005
anda-0.005
am-0.005
amb-0.005
Token talk
Token ID1561
Feature activation+1.02e-02

Pos logit contributions
 darker+0.002
 stairs+0.002
 narrower+0.002
 darkness+0.002
 security+0.002
Neg logit contributions
ish-0.002
anda-0.002
o-0.002
amb-0.002
an-0.002
Token to
Token ID284
Feature activation+1.13e-02

Pos logit contributions
ish+0.020
anda+0.019
o+0.018
am+0.017
amb+0.017
Neg logit contributions
 darker-0.017
 stairs-0.017
 narrower-0.017
 darkness-0.016
 security-0.016
Token them
Token ID606
Feature activation+2.39e-02

Pos logit contributions
ish+0.024
anda+0.024
o+0.022
amb+0.021
an+0.021
Neg logit contributions
 darker-0.017
 stairs-0.017
 security-0.017
 darkness-0.016
 narrower-0.016
Token a
Token ID257
Feature activation-1.25e-02

Pos logit contributions
ish+0.157
anda+0.156
amb+0.146
ouse+0.144
an+0.144
Neg logit contributions
 security-0.032
 darkness-0.031
 stairs-0.031
 darker-0.031
 front-0.030
Token big
Token ID1263
Feature activation+1.22e-02

Pos logit contributions
 darker+0.015
 narrower+0.015
 darkness+0.014
 stairs+0.014
 security+0.014
Neg logit contributions
ish-0.031
anda-0.030
o-0.028
ouse-0.028
amb-0.027
Token hug
Token ID16225
Feature activation+4.05e-02

Pos logit contributions
ish+0.027
anda+0.026
o+0.024
amb+0.023
an+0.023
Neg logit contributions
 darker-0.018
 stairs-0.018
 security-0.018
 narrower-0.018
 darkness-0.017
Token and
Token ID290
Feature activation+2.35e-02

Pos logit contributions
ish+0.081
anda+0.080
an+0.071
o+0.071
amb+0.070
Neg logit contributions
 security-0.065
 darker-0.065
 stairs-0.065
 darkness-0.065
 silence-0.063
Token said
Token ID531
Feature activation+1.08e-02

Pos logit contributions
ish+0.048
anda+0.047
an+0.043
amb+0.041
o+0.041
Neg logit contributions
 security-0.039
 stairs-0.038
 darker-0.038
 darkness-0.038
 silence-0.037
Token:
Token ID25
Feature activation+1.79e-02

Pos logit contributions
ish+0.023
anda+0.022
an+0.021
o+0.020
ese+0.020
Neg logit contributions
 security-0.017
 stairs-0.016
 darker-0.016
 silence-0.016
 many-0.016
Token "
Token ID366
Feature activation+3.50e-02

Pos logit contributions
ish+0.045
anda+0.044
an+0.041
am+0.039
o+0.039
Neg logit contributions
 stairs-0.020
 darker-0.020
 security-0.020
 darkness-0.019
 silence-0.019
TokenThank
Token ID10449
Feature activation+2.90e-02

Pos logit contributions
ish+0.096
anda+0.095
an+0.086
am+0.086
o+0.086
Neg logit contributions
 darker-0.033
 stairs-0.031
 darkness-0.031
 security-0.031
 narrower-0.030
Token you
Token ID345
Feature activation+1.89e-02

Pos logit contributions
ish+0.059
anda+0.057
ouse+0.053
an+0.050
anna+0.049
Neg logit contributions
 security-0.047
Oh-0.047
Where-0.047
 front-0.047
Time-0.046
Token!"
Token ID2474
Feature activation+3.13e-02

Pos logit contributions
ish+0.039
anda+0.037
an+0.034
amb+0.034
am+0.033
Neg logit contributions
 security-0.030
 stairs-0.030
 front-0.029
 darker-0.029
 darkness-0.029
Token\n
Token ID198
Feature activation+2.75e-02

Pos logit contributions
ish+0.073
anda+0.071
an+0.065
ouse+0.061
amb+0.061
Neg logit contributions
 stairs-0.042
 darkness-0.041
 security-0.041
 silence-0.041
 opposite-0.041
Token\n
Token ID198
Feature activation-1.34e-02

Pos logit contributions
 darker+0.022
 security+0.022
 stairs+0.022
 darkness+0.022
 narrower+0.022
Neg logit contributions
ish-0.030
anda-0.030
o-0.028
amb-0.027
ouse-0.027
Token"
Token ID1
Feature activation-1.16e-02

Pos logit contributions
 darker+0.018
 security+0.018
 stairs+0.018
 darkness+0.018
 narrower+0.017
Neg logit contributions
ish-0.031
anda-0.030
o-0.028
amb-0.028
ouse-0.028
TokenGive
Token ID23318
Feature activation-6.11e-03

Pos logit contributions
 darker+0.017
 stairs+0.017
 darkness+0.017
 security+0.017
 narrower+0.016
Neg logit contributions
ish-0.024
o-0.022
amb-0.022
anda-0.022
ouse-0.021
Token me
Token ID502
Feature activation-1.67e-02

Pos logit contributions
 opposite+0.007
 front+0.007
 other+0.006
Where+0.006
 security+0.006
Neg logit contributions
adow-0.016
anda-0.016
ish-0.015
orse-0.015
ouse-0.015
Token the
Token ID262
Feature activation-5.91e-02

Pos logit contributions
 front+0.019
 security+0.018
 stairs+0.018
 opposite+0.018
 darker+0.017
Neg logit contributions
ish-0.041
anda-0.041
amb-0.040
ouse-0.038
anna-0.038
Token dinosaur
Token ID28257
Feature activation+1.05e-02

Pos logit contributions
 stairs+0.082
 security+0.081
 darker+0.080
 darkness+0.079
 front+0.077
Neg logit contributions
ish-0.134
anda-0.131
o-0.122
amb-0.120
am-0.120
Token!"
Token ID2474
Feature activation-5.00e-03

Pos logit contributions
anna+0.025
amb+0.024
orse+0.023
ish+0.022
ci+0.022
Neg logit contributions
 stairs-0.013
 hard-0.013
 many-0.013
 front-0.013
 top-0.013
Token Lily
Token ID20037
Feature activation+2.06e-02

Pos logit contributions
 stairs+0.006
 darker+0.006
 security+0.006
 darkness+0.006
 opposite+0.006
Neg logit contributions
ish-0.012
anda-0.012
o-0.011
an-0.011
amb-0.011
Token shouted
Token ID17293
Feature activation-1.98e-02

Pos logit contributions
ish+0.045
anda+0.043
amb+0.040
an+0.040
ouse+0.040
Neg logit contributions
 security-0.031
 stairs-0.031
 opposite-0.031
 darkness-0.030
 front-0.029
Token.
Token ID13
Feature activation-3.82e-03

Pos logit contributions
 darker+0.027
 security+0.027
 darkness+0.026
 stairs+0.026
 silence+0.026
Neg logit contributions
ish-0.045
anda-0.044
an-0.041
ouse-0.040
o-0.040
Token\n
Token ID198
Feature activation-1.04e-02

Pos logit contributions
 stairs+0.006
 darker+0.006
 security+0.005
 darkness+0.005
 narrower+0.005
Neg logit contributions
ish-0.008
anda-0.008
an-0.007
o-0.007
amb-0.007
Token.
Token ID13
Feature activation+6.85e-03

Pos logit contributions
ish+0.017
anda+0.016
o+0.015
amb+0.015
an+0.015
Neg logit contributions
 darker-0.011
 stairs-0.011
 security-0.011
 darkness-0.011
 narrower-0.011
Token For
Token ID1114
Feature activation-3.00e-02

Pos logit contributions
ish+0.015
anda+0.015
an+0.014
amb+0.013
o+0.013
Neg logit contributions
 stairs-0.010
 darker-0.010
 darkness-0.010
 security-0.009
 silence-0.009
Token days
Token ID1528
Feature activation-1.06e-02

Pos logit contributions
 narrower+0.040
 darker+0.040
 stairs+0.039
 security+0.037
 basement+0.034
Neg logit contributions
ish-0.065
o-0.063
anda-0.062
an-0.059
am-0.058
Token and
Token ID290
Feature activation-1.05e-02

Pos logit contributions
 stairs+0.019
 darker+0.019
 security+0.019
 darkness+0.019
 narrower+0.018
Neg logit contributions
ish-0.019
anda-0.019
an-0.017
ouse-0.017
amb-0.017
Token months
Token ID1933
Feature activation-2.03e-02

Pos logit contributions
 darker+0.012
 stairs+0.012
 security+0.011
 narrower+0.011
 darkness+0.011
Neg logit contributions
ish-0.027
anda-0.026
o-0.024
amb-0.024
an-0.024
Token and
Token ID290
Feature activation-1.05e-02

Pos logit contributions
 darker+0.039
 stairs+0.038
 security+0.038
 darkness+0.038
 narrower+0.038
Neg logit contributions
ish-0.035
anda-0.034
o-0.031
amb-0.030
am-0.029
Token years
Token ID812
Feature activation-4.38e-03

Pos logit contributions
 darker+0.011
 stairs+0.011
 security+0.011
 narrower+0.011
 darkness+0.011
Neg logit contributions
ish-0.027
anda-0.026
o-0.024
amb-0.024
an-0.024
Token,
Token ID11
Feature activation-5.33e-03

Pos logit contributions
 darker+0.007
 stairs+0.007
 narrower+0.007
 security+0.007
 darkness+0.007
Neg logit contributions
ish-0.008
anda-0.008
o-0.007
amb-0.007
am-0.007
Token the
Token ID262
Feature activation-1.38e-02

Pos logit contributions
 darker+0.006
 stairs+0.006
 security+0.006
 darkness+0.006
 narrower+0.005
Neg logit contributions
ish-0.014
anda-0.013
o-0.012
amb-0.012
an-0.012
Token little
Token ID1310
Feature activation+1.37e-02

Pos logit contributions
 security+0.021
 narrower+0.021
 darker+0.021
Where+0.020
 outer+0.020
Neg logit contributions
ish-0.028
anda-0.025
an-0.023
ouse-0.021
adow-0.021
Token girl
Token ID2576
Feature activation+2.42e-02

Pos logit contributions
ish+0.027
anda+0.023
an+0.021
o+0.020
amb+0.020
Neg logit contributions
 security-0.024
 darker-0.024
 narrower-0.024
 stairs-0.024
Where-0.023
Token.
Token ID13
Feature activation-2.04e-02

Pos logit contributions
 security+0.001
 stairs+0.001
 darker+0.001
 darkness+0.001
 silence+0.001
Neg logit contributions
ish-0.002
anda-0.002
an-0.002
amb-0.002
o-0.002
Token She
Token ID1375
Feature activation-7.92e-03

Pos logit contributions
 darker+0.033
 stairs+0.033
 security+0.033
 darkness+0.032
 narrower+0.032
Neg logit contributions
ish-0.042
anda-0.040
o-0.037
amb-0.036
an-0.036
Token said
Token ID531
Feature activation-3.90e-02

Pos logit contributions
 security+0.015
 darker+0.014
 stairs+0.014
 darkness+0.014
 narrower+0.014
Neg logit contributions
ish-0.014
anda-0.014
amb-0.012
o-0.012
an-0.012
Token,
Token ID11
Feature activation-2.32e-02

Pos logit contributions
 darker+0.065
 stairs+0.064
 security+0.064
 darkness+0.063
 narrower+0.062
Neg logit contributions
ish-0.078
anda-0.076
o-0.069
amb-0.068
an-0.068
Token "
Token ID366
Feature activation-1.49e-02

Pos logit contributions
 darker+0.036
 stairs+0.035
 security+0.034
 darkness+0.034
 narrower+0.034
Neg logit contributions
ish-0.049
anda-0.047
o-0.044
amb-0.044
anna-0.043
TokenNo
Token ID2949
Feature activation-2.03e-02

Pos logit contributions
 stairs+0.018
 security+0.018
 darker+0.018
 darkness+0.018
 narrower+0.017
Neg logit contributions
ish-0.035
anda-0.033
o-0.033
amb-0.033
anna-0.032
Token,
Token ID11
Feature activation-3.99e-02

Pos logit contributions
 darker+0.039
 stairs+0.038
 security+0.038
 narrower+0.038
 darkness+0.037
Neg logit contributions
ish-0.036
anda-0.035
o-0.031
an-0.030
anna-0.030
Token dinner
Token ID8073
Feature activation-7.86e-03

Pos logit contributions
 darker+0.066
 stairs+0.066
 security+0.065
 darkness+0.065
 narrower+0.063
Neg logit contributions
ish-0.080
anda-0.077
o-0.071
amb-0.071
an-0.070
Token comes
Token ID2058
Feature activation-2.68e-02

Pos logit contributions
 stairs+0.014
 security+0.014
 darker+0.014
 darkness+0.013
 silence+0.013
Neg logit contributions
ish-0.015
anda-0.014
amb-0.014
o-0.013
an-0.013
Token first
Token ID717
Feature activation-1.65e-02

Pos logit contributions
 darker+0.039
 stairs+0.039
 security+0.038
 darkness+0.038
 narrower+0.037
Neg logit contributions
ish-0.059
anda-0.057
amb-0.053
o-0.052
an-0.052
Token."
Token ID526
Feature activation-1.39e-02

Pos logit contributions
 stairs+0.028
 darker+0.028
 security+0.027
 darkness+0.027
 silence+0.027
Neg logit contributions
ish-0.032
anda-0.031
amb-0.029
o-0.028
an-0.028
Token bee
Token ID20697
Feature activation+2.09e-02

Pos logit contributions
 security+0.082
 darker+0.082
 stairs+0.082
 narrower+0.081
 darkness+0.080
Neg logit contributions
ish-0.088
anda-0.085
an-0.077
o-0.076
ouse-0.074
Token did
Token ID750
Feature activation-4.17e-02

Pos logit contributions
ish+0.044
anda+0.042
amb+0.042
ouse+0.040
anna+0.039
Neg logit contributions
 darker-0.032
 darkness-0.031
 silence-0.031
 stairs-0.030
 security-0.030
Token not
Token ID407
Feature activation-3.23e-02

Pos logit contributions
 stairs+0.083
 darkness+0.081
 basement+0.079
 security+0.079
 darker+0.079
Neg logit contributions
ish-0.067
anda-0.064
am-0.059
o-0.058
an-0.056
Token like
Token ID588
Feature activation-4.58e-02

Pos logit contributions
 darker+0.050
 stairs+0.050
 security+0.049
 darkness+0.048
 narrower+0.048
Neg logit contributions
ish-0.069
anda-0.066
o-0.061
amb-0.060
an-0.060
Token that
Token ID326
Feature activation-3.70e-02

Pos logit contributions
 darker+0.051
 stairs+0.050
 security+0.050
 darkness+0.049
 narrower+0.047
Neg logit contributions
ish-0.117
anda-0.114
amb-0.106
o-0.106
ouse-0.105
Token.
Token ID13
Feature activation-2.52e-02

Pos logit contributions
 silence+0.049
 darker+0.049
 darkness+0.048
 security+0.046
 opposite+0.046
Neg logit contributions
ish-0.087
anda-0.087
amb-0.081
ouse-0.080
o-0.078
Token It
Token ID632
Feature activation-1.83e-03

Pos logit contributions
 darker+0.041
 stairs+0.041
 security+0.041
 darkness+0.040
 narrower+0.040
Neg logit contributions
ish-0.051
anda-0.049
o-0.045
amb-0.045
an-0.044
Token buzz
Token ID14713
Feature activation-3.17e-03

Pos logit contributions
 darker+0.003
 darkness+0.003
 silence+0.003
 security+0.003
 stairs+0.003
Neg logit contributions
ish-0.004
anda-0.004
amb-0.004
ouse-0.004
anna-0.004
Tokened
Token ID276
Feature activation-3.57e-02

Pos logit contributions
 darker+0.007
 darkness+0.007
 security+0.007
 narrower+0.007
 stairs+0.007
Neg logit contributions
ish-0.005
anda-0.004
o-0.004
amb-0.004
ouse-0.004
Token angrily
Token ID34425
Feature activation-9.77e-03

Pos logit contributions
 stairs+0.055
 darker+0.055
 security+0.054
 darkness+0.053
 narrower+0.053
Neg logit contributions
ish-0.076
anda-0.073
o-0.067
amb-0.066
an-0.066
Token and
Token ID290
Feature activation-1.77e-02

Pos logit contributions
 security+0.014
 darker+0.014
 silence+0.014
 darkness+0.013
 front+0.013
Neg logit contributions
anda-0.022
ish-0.021
ouse-0.020
o-0.019
amb-0.019
Token The
Token ID383
Feature activation-2.61e-02

Pos logit contributions
ish+0.014
anda+0.014
an+0.013
ouse+0.013
amb+0.012
Neg logit contributions
 stairs-0.008
 opposite-0.008
 security-0.008
 darkness-0.008
 darker-0.008
Token owl
Token ID39610
Feature activation+3.57e-02

Pos logit contributions
 stairs+0.055
 darker+0.054
 security+0.053
 narrower+0.052
 basement+0.051
Neg logit contributions
ish-0.042
anda-0.038
an-0.036
o-0.033
amb-0.032
Token was
Token ID373
Feature activation-4.25e-02

Pos logit contributions
anda+0.074
ish+0.073
amb+0.069
ouse+0.068
anna+0.067
Neg logit contributions
 security-0.050
 darker-0.049
 darkness-0.049
 silence-0.049
 opposite-0.046
Token happy
Token ID3772
Feature activation+1.38e-03

Pos logit contributions
 darker+0.064
 security+0.062
 stairs+0.062
 darkness+0.062
 narrower+0.061
Neg logit contributions
ish-0.093
anda-0.090
amb-0.082
o-0.082
an-0.081
Token to
Token ID284
Feature activation-1.26e-02

Pos logit contributions
anda+0.003
ish+0.003
an+0.003
ouse+0.003
amb+0.003
Neg logit contributions
 darkness-0.002
 darker-0.002
 security-0.002
 silence-0.002
 stairs-0.002
Token help
Token ID1037
Feature activation-1.97e-02

Pos logit contributions
 security+0.017
 darker+0.017
 stairs+0.017
 darkness+0.017
 silence+0.016
Neg logit contributions
ish-0.029
anda-0.029
o-0.026
amb-0.026
ouse-0.026
Token and
Token ID290
Feature activation-1.48e-02

Pos logit contributions
 darker+0.033
 stairs+0.033
 security+0.032
 darkness+0.032
 narrower+0.032
Neg logit contributions
ish-0.040
anda-0.037
o-0.035
an-0.034
amb-0.034
Token showed
Token ID3751
Feature activation-1.42e-02

Pos logit contributions
 security+0.021
 darker+0.021
 darkness+0.020
 stairs+0.020
 narrower+0.020
Neg logit contributions
ish-0.034
anda-0.034
amb-0.030
an-0.030
ouse-0.030
Token the
Token ID262
Feature activation-3.05e-02

Pos logit contributions
 security+0.016
 darker+0.016
 stairs+0.016
 darkness+0.016
 opposite+0.015
Neg logit contributions
ish-0.035
anda-0.035
ouse-0.032
o-0.032
amb-0.032
Token fox
Token ID21831
Feature activation+1.48e-02

Pos logit contributions
 darker+0.056
 stairs+0.055
 security+0.054
 narrower+0.053
 darkness+0.052
Neg logit contributions
ish-0.057
anda-0.054
an-0.049
o-0.048
amb-0.047
Token how
Token ID703
Feature activation-3.88e-02

Pos logit contributions
anda+0.038
ish+0.037
amb+0.036
ouse+0.034
o+0.034
Neg logit contributions
 security-0.016
 darkness-0.015
 silence-0.015
 many-0.015
 opposite-0.015
Token was
Token ID373
Feature activation-3.87e-02

Pos logit contributions
ish+0.050
anda+0.048
amb+0.046
an+0.046
ese+0.045
Neg logit contributions
 stairs-0.027
 dark-0.025
 darkness-0.025
 security-0.025
 hours-0.025
Token going
Token ID1016
Feature activation-2.44e-02

Pos logit contributions
 stairs+0.054
 darker+0.053
 security+0.052
 darkness+0.052
 front+0.050
Neg logit contributions
ish-0.090
anda-0.088
an-0.082
amb-0.081
ouse-0.078
Token to
Token ID284
Feature activation-1.57e-02

Pos logit contributions
 darker+0.032
 stairs+0.032
 security+0.032
 darkness+0.031
 narrower+0.031
Neg logit contributions
ish-0.057
anda-0.055
o-0.051
amb-0.050
an-0.050
Token play
Token ID711
Feature activation-3.40e-02

Pos logit contributions
 darker+0.021
 stairs+0.020
 security+0.020
 darkness+0.020
 narrower+0.020
Neg logit contributions
ish-0.037
anda-0.035
o-0.033
amb-0.033
an-0.032
Token with
Token ID351
Feature activation-3.24e-02

Pos logit contributions
 darker+0.048
 stairs+0.048
 security+0.047
 darkness+0.046
 silence+0.046
Neg logit contributions
ish-0.077
anda-0.076
o-0.069
amb-0.068
an-0.068
Token his
Token ID465
Feature activation-2.33e-02

Pos logit contributions
 darker+0.042
 stairs+0.041
 security+0.041
 darkness+0.041
 narrower+0.040
Neg logit contributions
ish-0.076
anda-0.074
o-0.069
amb-0.068
an-0.067
Token favorite
Token ID4004
Feature activation-1.36e-02

Pos logit contributions
 stairs+0.037
 darker+0.036
 security+0.036
 darkness+0.036
 narrower+0.035
Neg logit contributions
ish-0.049
anda-0.047
o-0.043
amb-0.043
an-0.042
Token bright
Token ID6016
Feature activation-1.03e-02

Pos logit contributions
 stairs+0.019
 darker+0.018
 security+0.018
 darkness+0.018
 narrower+0.017
Neg logit contributions
ish-0.031
anda-0.030
o-0.028
amb-0.028
an-0.027
Token things
Token ID1243
Feature activation-6.26e-03

Pos logit contributions
 stairs+0.011
 darker+0.010
 darkness+0.010
 silence+0.010
 front+0.010
Neg logit contributions
ish-0.026
anda-0.026
o-0.025
amb-0.024
ouse-0.024
Token -
Token ID532
Feature activation-5.25e-02

Pos logit contributions
 darker+0.009
 stairs+0.009
 security+0.009
 darkness+0.009
 narrower+0.009
Neg logit contributions
ish-0.013
anda-0.013
an-0.012
amb-0.012
o-0.012
Token his
Token ID465
Feature activation-2.98e-02

Pos logit contributions
 darker+0.053
 security+0.052
 stairs+0.051
 darkness+0.051
 narrower+0.049
Neg logit contributions
ish-0.137
anda-0.133
o-0.126
amb-0.123
an-0.122
Token cat
Token ID3797
Feature activation+2.38e-02

Pos logit contributions
 stairs+0.083
 narrower+0.082
 darker+0.082
wered+0.079
 basement+0.079
Neg logit contributions
anda-0.119
ish-0.116
ouse-0.113
orse-0.111
o-0.111
Token me
Token ID502
Feature activation+4.06e-04

Pos logit contributions
ish+0.055
anda+0.053
amb+0.051
ouse+0.051
an+0.049
Neg logit contributions
 security-0.032
 darkness-0.031
 darker-0.031
 stairs-0.031
 silence-0.030
Tokenowed
Token ID6972
Feature activation-2.94e-02

Pos logit contributions
ish+0.000
anda+0.000
an+0.000
amb+0.000
ouse+0.000
Neg logit contributions
 darkness-0.001
 silence-0.001
 security-0.001
 darker-0.001
 stairs-0.001
Token and
Token ID290
Feature activation-2.71e-02

Pos logit contributions
 darker+0.049
 stairs+0.048
 security+0.048
 darkness+0.047
 narrower+0.047
Neg logit contributions
ish-0.059
anda-0.057
o-0.052
amb-0.052
an-0.051
Token joined
Token ID5399
Feature activation-1.72e-02

Pos logit contributions
 darker+0.043
 security+0.042
 stairs+0.042
 darkness+0.042
 narrower+0.041
Neg logit contributions
ish-0.057
anda-0.056
amb-0.051
an-0.051
ouse-0.050
Token him
Token ID683
Feature activation-1.70e-02

Pos logit contributions
 stairs+0.016
 darker+0.016
 security+0.016
 darkness+0.016
 silence+0.015
Neg logit contributions
ish-0.046
anda-0.046
ouse-0.042
o-0.042
amb-0.042
Token.
Token ID13
Feature activation-2.46e-02

Pos logit contributions
 stairs+0.021
 security+0.021
 darkness+0.020
 silence+0.020
 front+0.020
Neg logit contributions
ish-0.040
anda-0.040
amb-0.037
ouse-0.037
an-0.037
Token The
Token ID383
Feature activation-3.30e-02

Pos logit contributions
 stairs+0.038
 darker+0.038
 darkness+0.037
 security+0.037
 narrower+0.036
Neg logit contributions
ish-0.053
anda-0.051
an-0.046
o-0.046
amb-0.046
Token boy
Token ID2933
Feature activation+4.05e-03

Pos logit contributions
 security+0.064
 stairs+0.063
 outer+0.063
 darker+0.062
 narrower+0.061
Neg logit contributions
ish-0.053
anda-0.048
an-0.045
o-0.044
ouse-0.040
Token was
Token ID373
Feature activation-6.10e-02

Pos logit contributions
ish+0.009
anda+0.009
amb+0.009
ouse+0.008
an+0.008
Neg logit contributions
 security-0.005
 darkness-0.005
 silence-0.005
 hours-0.005
 stairs-0.005
Token happy
Token ID3772
Feature activation-1.89e-02

Pos logit contributions
 darker+0.087
 security+0.085
 darkness+0.084
 silence+0.084
 front+0.082
Neg logit contributions
ish-0.145
anda-0.138
amb-0.131
an-0.127
ouse-0.126