TOP ACTIVATIONS
MAX = 16.255

't let yourself become too dependent on one thing . It
bring the quarrel to a resolution , until everyone was happy
never dared to join them <|endoftext|> Ab by and Tommy were
if nothing had ever happened <|endoftext|> Once upon a time there
your cr ay on will <|endoftext|> Once upon a time ,
we can have a rainbow <|endoftext|> S ara liked to play
with the cute veil he <|endoftext|> Once upon a time ,
of the power of believing <|endoftext|> Once upon a time ,
sad to lose the battery <|endoftext|> Once upon a time ,
't stay dry in rainy <|endoftext|> Once upon a time there
use it whenever you want <|endoftext|> Once there was a girl
is a noise that matches <|endoftext|> Ben and Lily were playing
back into bed for a <|endoftext|> Once upon a time there
dog call out to him <|endoftext|> Once there was a pup
always ask before you peek <|endoftext|> Once , there was a
parents and skipped off to <|endoftext|> Once upon a time ,
never go near a hive <|endoftext|> Once upon a time ,
it 's never okay to <|endoftext|> Once upon a time there
this old man might be <|endoftext|> Once there was a little
out for more miracles to <|endoftext|> Once upon a time ,

MIN ACTIVATIONS
MIN = -12.150

cheered . Ell a felt their applause through her big feet
?" R uffy w agged his tail . Tim
now he could qu ench his thirst and have a cool
Max w agged his mustache happily and said ,
family spent the rest of their days in their new home
and pull . They feel their arms and legs working .
you ?" Max w agged his tail and l icked his
! Harry felt his worries drift away with each
ed happily and w agged his tail some more . "
Spot ?" Spot w agged his tail and bark ed again
how Mike and Tony spent their day playing video and eating
Max w agged his tail and ran around happily
happy again and w agged his tail . The little girl
." The dog w agged his tail and said , "
ground !" Spot w agged his tail and bark ed ,
Max smiled and w agged their tails . They were happy
his shoulder . He felt their little hands tick ling his
owner !" Spot w agged his tail happily , feeling proud
The children nodded their heads in understanding and started
, and Spot w agged his tail as they shared a

INTERVAL 9.154 - 16.255
CONTAINS 0.636%

dreams can come true . <|endoftext|> S ara was so excited
her . The end . <|endoftext|> Once upon a time ,
They started to argue , but then their mom
the best day ever . <|endoftext|> Tom and Mia liked to
They wished they had not fought . They wished they could

INTERVAL 2.052 - 9.154
CONTAINS 40.124%

When Peter came back he saw the balloon had landed in
awesome , L ila !" L ila hugged him
It tastes like dirt and grass ," Anna says .
complete the task because his paws were too big . Max
you want to be my friend ?" Spot w agged his

INTERVAL -5.049 - 2.052
CONTAINS 39.659%

driver called out to the boy and asked him what was
are mean . Give it back to me ." She ran
grabbed her doll and said , " This is mine now
tightly and then slowly lowered it into the pit . He
," Tom says . " That is a word we know

INTERVAL -12.150 - -5.049
CONTAINS 19.581%

world , learning new things each day . He even made
Then she heard a voice . It was coming
. They continued to race and play together , and they
and kind . You 're the best visitors ever ."
walking on the bridge noticed the apple and stopped to pick
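
The four INTERVAL ranges above look like equal-width bins between MIN and MAX, and the CONTAINS percentages appear to sum to 100%. This is an inference from the printed numbers, not something the export states. The minimal sketch below recomputes the bin edges under that assumption; the helper name `equal_width_edges` is hypothetical and is not part of whatever tool produced this output.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Values copied from the MIN, MAX and CONTAINS lines above.
edges = equal_width_edges(-12.150, 16.255, 4)
shares = [0.636, 40.124, 39.659, 19.581]  # percent, highest interval first

# Compare these with the printed INTERVAL boundaries; small differences
# are expected because MIN and MAX are themselves rounded to 3 decimals.
print([round(e, 3) for e in edges])
print(round(sum(shares), 3))
```

If the assumption holds, the recomputed edges should match the printed boundaries to within rounding.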

Token't
Token ID470
Feature activation-1.80e+00

Pos logit contributions
elt+0.150
uzz+0.149
 Isn+0.149
ges+0.149
uts+0.148
Neg logit contributions
ches-0.197
 hospital-0.195
 needle-0.195
 way-0.194
 knot-0.190
Token let
Token ID1309
Feature activation+6.76e+00

Pos logit contributions
 forgot+0.036
 way+0.036
 fort+0.035
 trap+0.034
 hospital+0.033
Neg logit contributions
uts-0.059
anes-0.059
all-0.055
ous-0.052
ges-0.052
Token yourself
Token ID3511
Feature activation-5.05e-01

Pos logit contributions
uts+0.216
anes+0.210
all+0.204
ous+0.194
ges+0.190
Neg logit contributions
 fort-0.149
 way-0.140
 trap-0.136
 scar-0.136
 forgot-0.135
Token become
Token ID1716
Feature activation+1.70e+00

Pos logit contributions
 fort+0.012
 way+0.012
 forgot+0.012
 trap+0.011
 pile+0.011
Neg logit contributions
anes-0.014
uts-0.014
all-0.013
unch-0.013
ous-0.013
Token too
Token ID1165
Feature activation+2.69e+00

Pos logit contributions
uts+0.054
all+0.053
anes+0.052
ous+0.048
that+0.048
Neg logit contributions
 fort-0.037
 way-0.033
 trap-0.033
 scar-0.033
 forgot-0.032
Token dependent
Token ID10795
Feature activation+1.63e+01

Pos logit contributions
anes+0.102
uts+0.099
all+0.097
amb+0.095
that+0.094
Neg logit contributions
 way-0.042
 forgot-0.041
 fort-0.038
 dangerous-0.037
 hole-0.036
Token on
Token ID319
Feature activation+2.83e+00

Pos logit contributions
uts+0.409
all+0.391
anes+0.389
unch+0.380
ous+0.378
Neg logit contributions
 way-0.411
 fort-0.406
 trap-0.393
 closet-0.381
 hospital-0.380
Token one
Token ID530
Feature activation+2.51e-01

Pos logit contributions
uts+0.087
anes+0.084
all+0.082
ous+0.077
that+0.076
Neg logit contributions
 fort-0.065
 way-0.061
 trap-0.060
 scar-0.060
 forgot-0.059
Token thing
Token ID1517
Feature activation+7.04e+00

Pos logit contributions
uts+0.007
anes+0.007
all+0.006
ous+0.006
ges+0.006
Neg logit contributions
 fort-0.007
 way-0.007
 trap-0.006
 pile-0.006
 scar-0.006
Token.
Token ID13
Feature activation+5.50e+00

Pos logit contributions
anes+0.167
uts+0.165
all+0.161
unch+0.152
that+0.147
Neg logit contributions
 fort-0.201
 way-0.196
 forgot-0.188
 trap-0.188
 scar-0.185
Token It
Token ID632
Feature activation-3.69e+00

Pos logit contributions
uts+0.136
anes+0.128
ous+0.127
ges+0.123
all+0.118
Neg logit contributions
 fort-0.141
 way-0.141
 hospital-0.141
 closet-0.137
 trap-0.134
Token bring
Token ID2222
Feature activation+4.81e+00

Pos logit contributions
 fort+0.041
 way+0.037
 scar+0.036
 trap+0.036
 forgot+0.036
Neg logit contributions
uts-0.059
anes-0.058
all-0.057
ous-0.054
that-0.053
Token the
Token ID262
Feature activation-6.89e+00

Pos logit contributions
uts+0.128
all+0.124
anes+0.124
that+0.112
ous+0.111
Neg logit contributions
 fort-0.131
 forgot-0.120
 trap-0.118
 scar-0.118
 closet-0.116
Token quarrel
Token ID41998
Feature activation+8.35e+00

Pos logit contributions
 fort+0.151
 forgot+0.134
 scar+0.133
 trap+0.127
 hospital+0.126
Neg logit contributions
uts-0.209
anes-0.208
all-0.205
ous-0.193
that-0.191
Token to
Token ID284
Feature activation-9.41e-01

Pos logit contributions
uts+0.221
anes+0.213
all+0.209
ous+0.194
that+0.190
Neg logit contributions
 fort-0.228
 way-0.215
 trap-0.211
 scar-0.211
 forgot-0.210
Token a
Token ID257
Feature activation-4.63e+00

Pos logit contributions
 fort+0.023
 way+0.022
 trap+0.021
 scar+0.021
 forgot+0.021
Neg logit contributions
uts-0.027
anes-0.026
all-0.026
ous-0.024
that-0.024
Token resolution
Token ID6323
Feature activation+1.56e+01

Pos logit contributions
 fort+0.088
 scar+0.081
 forgot+0.079
 closet+0.076
 trap+0.073
Neg logit contributions
uts-0.150
anes-0.146
all-0.146
ous-0.142
that-0.139
Token,
Token ID11
Feature activation+2.13e-02

Pos logit contributions
uts+0.455
anes+0.449
all+0.430
ous+0.402
 Isn+0.400
Neg logit contributions
 fort-0.366
 way-0.347
 forgot-0.336
 trap-0.332
 scar-0.332
Token until
Token ID1566
Feature activation+9.33e-01

Pos logit contributions
uts+0.001
all+0.001
anes+0.001
that+0.001
ous+0.001
Neg logit contributions
 fort-0.001
 trap-0.000
 scar-0.000
 way-0.000
 forgot-0.000
Token everyone
Token ID2506
Feature activation-9.83e-01

Pos logit contributions
uts+0.024
all+0.023
anes+0.023
ous+0.021
that+0.021
Neg logit contributions
 fort-0.026
 scar-0.024
 trap-0.024
 way-0.024
 forgot-0.024
Token was
Token ID373
Feature activation-2.77e+00

Pos logit contributions
 fort+0.021
 scar+0.019
 way+0.019
 trap+0.019
 closet+0.018
Neg logit contributions
uts-0.032
anes-0.032
all-0.031
ous-0.030
ges-0.029
Token happy
Token ID3772
Feature activation+2.40e+00

Pos logit contributions
 fort+0.061
 way+0.058
 forgot+0.057
 trap+0.056
 scar+0.056
Neg logit contributions
uts-0.088
anes-0.086
all-0.083
ous-0.080
ges-0.078
Token never
Token ID1239
Feature activation-3.40e+00

Pos logit contributions
 fort+0.068
 scar+0.063
 trap+0.061
 closet+0.061
 way+0.061
Neg logit contributions
uts-0.064
all-0.063
anes-0.061
ous-0.058
that-0.056
Token dared
Token ID28765
Feature activation+7.83e+00

Pos logit contributions
 fort+0.075
 way+0.069
 trap+0.068
 scar+0.068
 forgot+0.067
Neg logit contributions
uts-0.109
anes-0.105
all-0.104
ous-0.098
that-0.096
Token to
Token ID284
Feature activation-1.36e+00

Pos logit contributions
uts+0.182
anes+0.177
all+0.165
ous+0.156
that+0.153
Neg logit contributions
 fort-0.239
 way-0.233
 scar-0.228
 trap-0.225
 forgot-0.221
Token join
Token ID4654
Feature activation+8.10e+00

Pos logit contributions
 way+0.030
 forgot+0.028
 trap+0.026
 fort+0.026
 pile+0.026
Neg logit contributions
uts-0.046
anes-0.046
all-0.042
unch-0.041
ges-0.041
Token them
Token ID606
Feature activation-3.13e+00

Pos logit contributions
uts+0.237
anes+0.227
all+0.216
ous+0.207
unch+0.205
Neg logit contributions
 fort-0.201
 way-0.194
 trap-0.190
 scar-0.187
 hospital-0.181
Token<|endoftext|>
Token ID50256
Feature activation+1.43e+01

Pos logit contributions
 fort+0.086
 way+0.084
 forgot+0.078
 hospital+0.078
 trap+0.078
Neg logit contributions
anes-0.082
uts-0.080
all-0.075
ous-0.074
ges-0.073
TokenAb
Token ID4826
Feature activation-4.70e+00

Pos logit contributions
uts+0.468
ous+0.446
all+0.408
amb+0.403
anes+0.393
Neg logit contributions
 fort-0.320
 way-0.312
 hole-0.311
 closet-0.297
 pile-0.297
Tokenby
Token ID1525
Feature activation-2.97e+00

Pos logit contributions
 fort+0.104
 trap+0.095
 way+0.094
 scar+0.094
 forgot+0.093
Neg logit contributions
uts-0.149
anes-0.145
all-0.144
ges-0.134
ous-0.133
Token and
Token ID290
Feature activation-5.56e+00

Pos logit contributions
 fort+0.085
 scar+0.078
 closet+0.077
 trap+0.077
 hospital+0.076
Neg logit contributions
uts-0.077
all-0.074
anes-0.067
ous-0.066
ges-0.065
Token Tommy
Token ID19919
Feature activation-4.75e+00

Pos logit contributions
 way+0.121
 fort+0.120
 forgot+0.116
 trap+0.114
 hospital+0.112
Neg logit contributions
uts-0.178
anes-0.170
all-0.166
ous-0.158
unch-0.155
Token were
Token ID547
Feature activation-2.02e+00

Pos logit contributions
 forgot+0.108
 fort+0.108
 way+0.106
 pile+0.100
 closet+0.099
Neg logit contributions
uts-0.148
anes-0.142
all-0.137
ous-0.130
unch-0.128
Token if
Token ID611
Feature activation+2.28e+00

Pos logit contributions
uts+0.054
anes+0.053
all+0.050
that+0.049
 Isn+0.048
Neg logit contributions
 fort-0.045
 way-0.043
 forgot-0.041
 scar-0.041
 trap-0.041
Token nothing
Token ID2147
Feature activation-1.34e-01

Pos logit contributions
uts+0.060
anes+0.057
all+0.056
ous+0.052
that+0.051
Neg logit contributions
 fort-0.064
 way-0.060
 trap-0.059
 scar-0.059
 forgot-0.059
Token had
Token ID550
Feature activation+9.71e-01

Pos logit contributions
 fort+0.004
 closet+0.003
 scar+0.003
 trap+0.003
 hospital+0.003
Neg logit contributions
uts-0.003
all-0.003
ous-0.003
anes-0.003
ges-0.003
Token ever
Token ID1683
Feature activation-1.53e+00

Pos logit contributions
uts+0.026
anes+0.025
all+0.025
ous+0.023
that+0.022
Neg logit contributions
 fort-0.026
 way-0.025
 trap-0.024
 scar-0.024
 forgot-0.024
Token happened
Token ID3022
Feature activation+7.08e+00

Pos logit contributions
 fort+0.038
 scar+0.035
 trap+0.034
 closet+0.034
 way+0.034
Neg logit contributions
uts-0.045
all-0.043
anes-0.043
ous-0.040
ges-0.039
Token<|endoftext|>
Token ID50256
Feature activation+1.43e+01

Pos logit contributions
anes+0.213
uts+0.194
all+0.189
ous+0.188
unch+0.186
Neg logit contributions
 way-0.171
 forgot-0.163
 fort-0.162
 head-0.151
 trap-0.150
TokenOnce
Token ID7454
Feature activation+6.29e+00

Pos logit contributions
uts+0.461
ous+0.424
all+0.413
amb+0.393
ful+0.390
Neg logit contributions
 fort-0.320
 way-0.311
 room-0.304
 closet-0.303
 hole-0.300
Token upon
Token ID2402
Feature activation-9.02e-02

Pos logit contributions
uts+0.197
anes+0.190
all+0.185
ous+0.174
that+0.172
Neg logit contributions
 fort-0.144
 way-0.135
 forgot-0.131
 scar-0.128
 closet-0.127
Token a
Token ID257
Feature activation-4.20e+00

Pos logit contributions
 fort+0.002
 way+0.002
 trap+0.002
 scar+0.002
 forgot+0.002
Neg logit contributions
uts-0.003
anes-0.003
all-0.003
ous-0.003
ges-0.003
Token time
Token ID640
Feature activation+4.33e+00

Pos logit contributions
 fort+0.076
 scar+0.075
 closet+0.073
 trap+0.072
 needle+0.072
Neg logit contributions
all-0.137
uts-0.132
unch-0.127
that-0.121
anes-0.120
Token there
Token ID612
Feature activation+2.74e+00

Pos logit contributions
anes+0.173
uts+0.168
all+0.158
good+0.157
 Share+0.155
Neg logit contributions
 fort-0.060
 way-0.058
 forgot-0.051
 closet-0.050
 hole-0.047
Token your
Token ID534
Feature activation-7.90e+00

Pos logit contributions
uts+0.130
anes+0.129
all+0.123
unch+0.117
ges+0.113
Neg logit contributions
 way-0.171
 fort-0.171
 forgot-0.161
 trap-0.159
 closet-0.158
Token cr
Token ID1067
Feature activation+5.47e+00

Pos logit contributions
 fort+0.163
 forgot+0.152
 scar+0.147
 trap+0.146
 hospital+0.142
Neg logit contributions
uts-0.248
all-0.235
anes-0.233
ous-0.228
that-0.216
Tokenay
Token ID323
Feature activation+1.30e+00

Pos logit contributions
 Mike+0.106
ges+0.101
 Hello+0.097
Mike+0.096
 Hi+0.095
Neg logit contributions
 fort-0.175
 way-0.160
 closet-0.158
 trap-0.158
 head-0.157
Tokenon
Token ID261
Feature activation+2.70e+00

Pos logit contributions
uts+0.036
unch+0.035
 Mike+0.034
all+0.034
ges+0.034
Neg logit contributions
 fort-0.030
 way-0.029
 scar-0.027
 scarf-0.027
 belt-0.027
Token will
Token ID481
Feature activation+3.16e+00

Pos logit contributions
uts+0.058
anes+0.056
all+0.054
ous+0.050
ges+0.049
Neg logit contributions
 fort-0.087
 way-0.083
 trap-0.081
 forgot-0.081
 scar-0.081
Token<|endoftext|>
Token ID50256
Feature activation+1.43e+01

Pos logit contributions
uts+0.104
anes+0.104
all+0.097
ges+0.094
unch+0.094
Neg logit contributions
 way-0.064
 fort-0.063
 trap-0.061
 forgot-0.060
 pile-0.058
TokenOnce
Token ID7454
Feature activation+6.33e+00

Pos logit contributions
uts+0.453
ous+0.430
all+0.401
amb+0.385
ful+0.381
Neg logit contributions
 fort-0.319
 way-0.314
 hole-0.304
 room-0.302
 closet-0.294
Token upon
Token ID2402
Feature activation-5.32e-03

Pos logit contributions
uts+0.193
anes+0.187
all+0.183
ous+0.172
that+0.169
Neg logit contributions
 fort-0.149
 way-0.139
 forgot-0.136
 scar-0.135
 trap-0.135
Token a
Token ID257
Feature activation-4.36e+00

Pos logit contributions
 fort+0.000
 way+0.000
 trap+0.000
 scar+0.000
 forgot+0.000
Neg logit contributions
uts-0.000
anes-0.000
all-0.000
ous-0.000
ges-0.000
Token time
Token ID640
Feature activation+4.24e+00

Pos logit contributions
 closet+0.090
 scar+0.089
 needle+0.089
 fort+0.089
 hospital+0.087
Neg logit contributions
all-0.130
unch-0.123
uts-0.119
that-0.110
od-0.109
Token,
Token ID11
Feature activation-4.71e-01

Pos logit contributions
anes+0.170
uts+0.164
all+0.155
good+0.152
 Isn+0.151
Neg logit contributions
 way-0.059
 fort-0.059
 forgot-0.054
 hole-0.049
 closet-0.048
Token we
Token ID356
Feature activation-2.52e+00

Pos logit contributions
uts+0.031
all+0.030
anes+0.029
ous+0.026
 Isn+0.026
Neg logit contributions
 fort-0.049
 trap-0.045
 scar-0.045
 way-0.045
 closet-0.045
Token can
Token ID460
Feature activation+3.94e+00

Pos logit contributions
 fort+0.054
 way+0.050
 closet+0.048
 scar+0.048
 forgot+0.047
Neg logit contributions
uts-0.079
anes-0.077
all-0.077
ges-0.072
 Share-0.071
Token have
Token ID423
Feature activation+4.29e-02

Pos logit contributions
anes+0.150
uts+0.146
ous+0.144
all+0.141
year+0.138
Neg logit contributions
 way-0.065
 pile-0.064
 head-0.063
 trap-0.063
 fort-0.062
Token a
Token ID257
Feature activation-4.90e+00

Pos logit contributions
uts+0.001
anes+0.001
all+0.001
ous+0.001
ges+0.001
Neg logit contributions
 fort-0.001
 way-0.001
 scar-0.001
 trap-0.001
 forgot-0.001
Token rainbow
Token ID27223
Feature activation+3.20e+00

Pos logit contributions
 fort+0.099
 forgot+0.097
 hospital+0.092
 closet+0.091
 scar+0.087
Neg logit contributions
uts-0.143
all-0.142
anes-0.137
ous-0.134
 Isn-0.133
Token<|endoftext|>
Token ID50256
Feature activation+1.43e+01

Pos logit contributions
uts+0.084
anes+0.082
all+0.078
ous+0.075
that+0.075
Neg logit contributions
 fort-0.093
 scar-0.084
 trap-0.083
 pile-0.081
 way-0.081
TokenS
Token ID50
Feature activation+1.31e+00

Pos logit contributions
uts+0.496
ous+0.488
all+0.448
amb+0.423
ful+0.410
Neg logit contributions
 fort-0.291
 hole-0.285
 way-0.273
 pile-0.268
 trap-0.266
Tokenara
Token ID3301
Feature activation-4.31e+00

Pos logit contributions
uts+0.042
anes+0.040
all+0.039
ous+0.038
ges+0.037
Neg logit contributions
 fort-0.030
 way-0.028
 scar-0.027
 closet-0.027
 trap-0.027
Token liked
Token ID8288
Feature activation+3.40e+00

Pos logit contributions
 fort+0.095
 way+0.087
 scar+0.086
 trap+0.086
 forgot+0.085
Neg logit contributions
uts-0.138
anes-0.133
all-0.132
ous-0.123
ges-0.122
Token to
Token ID284
Feature activation-1.62e+00

Pos logit contributions
uts+0.102
anes+0.097
all+0.094
ous+0.093
that+0.090
Neg logit contributions
 fort-0.084
 scar-0.079
 way-0.078
 scarf-0.077
 closet-0.076
Token play
Token ID711
Feature activation+7.67e+00

Pos logit contributions
 fort+0.033
 way+0.031
 scar+0.030
 trap+0.030
 forgot+0.030
Neg logit contributions
uts-0.055
anes-0.053
all-0.052
ous-0.049
that-0.048
Token with
Token ID351
Feature activation+2.02e+00

Pos logit contributions
uts+0.162
anes+0.160
unch+0.143
ges+0.141
ous+0.141
Neg logit contributions
 fort-0.129
 way-0.125
 forgot-0.119
 scar-0.118
 trap-0.116
Token the
Token ID262
Feature activation-6.67e+00

Pos logit contributions
uts+0.057
anes+0.055
all+0.054
ous+0.050
that+0.049
Neg logit contributions
 fort-0.052
 way-0.049
 trap-0.048
 scar-0.048
 forgot-0.048
Token cute
Token ID13779
Feature activation-2.46e+00

Pos logit contributions
 fort+0.151
 forgot+0.138
 slammed+0.133
 scar+0.132
ches+0.128
Neg logit contributions
all-0.187
anes-0.184
uts-0.181
that-0.175
ous-0.171
Token veil
Token ID30615
Feature activation+4.24e+00

Pos logit contributions
 fort+0.052
 forgot+0.048
 scar+0.047
 hospital+0.046
 trap+0.046
Neg logit contributions
uts-0.078
all-0.077
anes-0.076
ous-0.074
 Isn-0.070
Token he
Token ID339
Feature activation-6.65e+00

Pos logit contributions
uts+0.120
anes+0.117
all+0.113
ges+0.106
ous+0.106
Neg logit contributions
 fort-0.108
 way-0.101
 trap-0.099
 scar-0.098
 forgot-0.098
Token<|endoftext|>
Token ID50256
Feature activation+1.42e+01

Pos logit contributions
 fort+0.130
 trap+0.112
 scar+0.112
 way+0.111
 closet+0.109
Neg logit contributions
uts-0.226
all-0.224
anes-0.220
ous-0.209
that-0.207
TokenOnce
Token ID7454
Feature activation+6.16e+00

Pos logit contributions
uts+0.474
ous+0.421
all+0.391
anes+0.389
amb+0.378
Neg logit contributions
 way-0.307
 fort-0.306
 closet-0.303
 hole-0.298
 room-0.288
Token upon
Token ID2402
Feature activation-1.36e-01

Pos logit contributions
uts+0.192
anes+0.187
all+0.181
ous+0.171
that+0.168
Neg logit contributions
 fort-0.141
 way-0.132
 forgot-0.128
 scar-0.126
 trap-0.126
Token a
Token ID257
Feature activation-3.70e+00

Pos logit contributions
 fort+0.003
 way+0.002
 trap+0.002
 scarf+0.002
 scar+0.002
Neg logit contributions
uts-0.005
anes-0.005
ous-0.004
all-0.004
ges-0.004
Token time
Token ID640
Feature activation+4.64e+00

Pos logit contributions
 fort+0.062
 scar+0.059
 closet+0.056
 trap+0.056
 forgot+0.056
Neg logit contributions
all-0.129
uts-0.129
unch-0.119
that-0.117
anes-0.117
Token,
Token ID11
Feature activation-3.01e-01

Pos logit contributions
anes+0.193
uts+0.185
good+0.177
leaf+0.175
 Share+0.173
Neg logit contributions
 fort-0.054
 way-0.052
 closet-0.045
 forgot-0.044
 hole-0.042
Token of
Token ID286
Feature activation-3.41e-01

Pos logit contributions
uts+0.081
all+0.077
anes+0.074
ous+0.069
that+0.061
Neg logit contributions
 fort-0.208
 scar-0.199
 forgot-0.197
 trap-0.197
 way-0.193
Token the
Token ID262
Feature activation-6.42e+00

Pos logit contributions
 fort+0.009
 scar+0.009
 forgot+0.009
 trap+0.009
 way+0.008
Neg logit contributions
uts-0.009
all-0.009
anes-0.009
ous-0.008
that-0.008
Token power
Token ID1176
Feature activation+3.44e+00

Pos logit contributions
 fort+0.146
 scar+0.136
 forgot+0.132
ches+0.128
 closet+0.125
Neg logit contributions
all-0.187
uts-0.182
anes-0.180
that-0.176
 Isn-0.170
Token of
Token ID286
Feature activation-1.12e+00

Pos logit contributions
uts+0.084
anes+0.081
all+0.079
ous+0.074
that+0.071
Neg logit contributions
 fort-0.102
 scar-0.095
 trap-0.094
 way-0.094
 forgot-0.094
Token believing
Token ID14773
Feature activation+8.43e+00

Pos logit contributions
 fort+0.030
 forgot+0.027
 scar+0.027
 trap+0.027
 way+0.027
Neg logit contributions
uts-0.030
all-0.030
anes-0.029
ous-0.027
that-0.027
Token<|endoftext|>
Token ID50256
Feature activation+1.42e+01

Pos logit contributions
uts+0.224
anes+0.216
all+0.210
ous+0.193
that+0.191
Neg logit contributions
 fort-0.229
 way-0.217
 trap-0.212
 forgot-0.212
 scar-0.211
TokenOnce
Token ID7454
Feature activation+6.12e+00

Pos logit contributions
uts+0.473
ous+0.444
all+0.409
anes+0.394
amb+0.392
Neg logit contributions
 way-0.306
 fort-0.301
 hole-0.297
 closet-0.289
 pile-0.286
Token upon
Token ID2402
Feature activation-3.51e-01

Pos logit contributions
uts+0.187
anes+0.182
all+0.177
ous+0.167
that+0.164
Neg logit contributions
 fort-0.143
 way-0.134
 forgot-0.131
 trap-0.130
 scar-0.130
Token a
Token ID257
Feature activation-4.06e+00

Pos logit contributions
 fort+0.008
 way+0.007
 trap+0.007
 scar+0.007
 forgot+0.007
Neg logit contributions
uts-0.011
anes-0.011
all-0.010
ous-0.010
ges-0.010
Token time
Token ID640
Feature activation+4.30e+00

Pos logit contributions
 fort+0.076
 scar+0.073
 closet+0.073
 hospital+0.071
 needle+0.071
Neg logit contributions
all-0.132
uts-0.127
unch-0.123
that-0.118
 Mike-0.115
Token,
Token ID11
Feature activation-4.21e-01

Pos logit contributions
anes+0.172
uts+0.166
all+0.156
good+0.153
ous+0.152
Neg logit contributions
 fort-0.061
 way-0.060
 forgot-0.053
 closet-0.050
 pile-0.049
Token sad
Token ID6507
Feature activation+2.58e+00

Pos logit contributions
 fort+0.054
 scar+0.048
 way+0.048
 trap+0.048
 forgot+0.046
Neg logit contributions
uts-0.105
all-0.101
anes-0.101
ous-0.095
that-0.094
Token to
Token ID284
Feature activation-2.18e+00

Pos logit contributions
anes+0.071
uts+0.069
all+0.066
ges+0.061
unch+0.061
Neg logit contributions
 fort-0.068
 way-0.066
 forgot-0.064
 hospital-0.063
 scar-0.062
Token lose
Token ID4425
Feature activation+2.99e+00

Pos logit contributions
 way+0.039
 fort+0.037
 hospital+0.036
 forgot+0.036
 trap+0.036
Neg logit contributions
uts-0.079
anes-0.078
all-0.074
ous-0.072
unch-0.071
Token the
Token ID262
Feature activation-7.05e+00

Pos logit contributions
uts+0.075
all+0.072
anes+0.071
 Mike+0.066
ges+0.066
Neg logit contributions
 fort-0.086
 trap-0.079
 scar-0.079
 forgot-0.078
 way-0.076
Token battery
Token ID6555
Feature activation+1.89e+00

Pos logit contributions
 forgot+0.160
 fort+0.160
 scar+0.149
 trap+0.145
eh+0.142
Neg logit contributions
uts-0.196
all-0.196
anes-0.196
that-0.185
 Isn-0.181
Token<|endoftext|>
Token ID50256
Feature activation+1.42e+01

Pos logit contributions
uts+0.044
all+0.041
anes+0.040
ous+0.038
that+0.037
Neg logit contributions
 fort-0.058
 scar-0.056
 trap-0.054
 forgot-0.054
 way-0.053
TokenOnce
Token ID7454
Feature activation+6.27e+00

Pos logit contributions
uts+0.479
ous+0.449
all+0.419
amb+0.403
ful+0.402
Neg logit contributions
 fort-0.295
 way-0.293
 hole-0.293
 closet-0.286
 room-0.284
Token upon
Token ID2402
Feature activation-1.73e-01

Pos logit contributions
uts+0.204
anes+0.198
all+0.191
ous+0.181
that+0.179
Neg logit contributions
 fort-0.136
 way-0.128
 forgot-0.124
 closet-0.120
 scar-0.119
Token a
Token ID257
Feature activation-3.86e+00

Pos logit contributions
 fort+0.004
 way+0.003
 trap+0.003
 hospital+0.003
 forgot+0.003
Neg logit contributions
uts-0.006
anes-0.006
all-0.005
ous-0.005
ges-0.005
Token time
Token ID640
Feature activation+4.47e+00

Pos logit contributions
 fort+0.064
 scar+0.061
 closet+0.058
 forgot+0.058
 trap+0.058
Neg logit contributions
all-0.136
uts-0.135
unch-0.126
anes-0.124
that-0.123
Token,
Token ID11
Feature activation-3.22e-01

Pos logit contributions
anes+0.181
uts+0.175
good+0.166
all+0.165
leaf+0.162
Neg logit contributions
 fort-0.057
 way-0.056
 closet-0.049
 forgot-0.048
 hole-0.046
Token't
Token ID470
Feature activation-1.94e+00

Pos logit contributions
uts+0.142
anes+0.139
all+0.136
ous+0.128
ges+0.126
Neg logit contributions
 fort-0.099
 way-0.092
 trap-0.091
 scar-0.090
 forgot-0.090
Token stay
Token ID2652
Feature activation+1.99e+00

Pos logit contributions
 fort+0.041
 way+0.037
 scar+0.037
 closet+0.037
 trap+0.036
Neg logit contributions
uts-0.064
anes-0.062
all-0.061
ous-0.058
that-0.056
Token dry
Token ID5894
Feature activation+5.00e+00

Pos logit contributions
uts+0.055
all+0.054
anes+0.051
ges+0.049
that+0.048
Neg logit contributions
 fort-0.049
 forgot-0.045
 hospital-0.045
 trap-0.045
 way-0.044
Token in
Token ID287
Feature activation-4.41e-01

Pos logit contributions
uts+0.135
anes+0.132
all+0.127
ous+0.119
that+0.117
Neg logit contributions
 fort-0.134
 way-0.126
 trap-0.123
 scar-0.122
 forgot-0.122
Token rainy
Token ID37259
Feature activation+4.39e+00

Pos logit contributions
 fort+0.011
 way+0.010
 scar+0.010
 forgot+0.010
 trap+0.010
Neg logit contributions
uts-0.013
anes-0.013
all-0.013
ous-0.012
ges-0.012
Token<|endoftext|>
Token ID50256
Feature activation+1.42e+01

Pos logit contributions
uts+0.118
anes+0.114
all+0.112
ous+0.103
ges+0.101
Neg logit contributions
 fort-0.117
 scar-0.110
 trap-0.110
 forgot-0.109
 way-0.108
TokenOnce
Token ID7454
Feature activation+6.37e+00

Pos logit contributions
uts+0.467
ous+0.455
all+0.415
amb+0.409
ful+0.403
Neg logit contributions
 way-0.304
 fort-0.292
 hole-0.292
 room-0.289
 closet-0.287
Token upon
Token ID2402
Feature activation-2.33e-01

Pos logit contributions
uts+0.208
anes+0.202
all+0.195
ous+0.185
that+0.182
Neg logit contributions
 fort-0.137
 way-0.130
 forgot-0.127
 closet-0.121
 scar-0.120
Token a
Token ID257
Feature activation-4.15e+00

Pos logit contributions
 fort+0.005
 way+0.005
 trap+0.005
 scar+0.005
 forgot+0.004
Neg logit contributions
uts-0.008
anes-0.008
all-0.007
ous-0.007
ges-0.007
Token time
Token ID640
Feature activation+4.41e+00

Pos logit contributions
 scar+0.079
 fort+0.078
 closet+0.076
 needle+0.076
 hospital+0.073
Neg logit contributions
all-0.131
uts-0.125
unch-0.122
that-0.115
 Mike-0.111
Token there
Token ID612
Feature activation+2.56e+00

Pos logit contributions
anes+0.178
uts+0.172
all+0.162
good+0.160
leaf+0.158
Neg logit contributions
 fort-0.059
 way-0.059
 forgot-0.053
 closet-0.050
 hole-0.048
Token use
Token ID779
Feature activation+1.63e+00

Pos logit contributions
uts+0.129
anes+0.126
all+0.120
ges+0.119
ous+0.117
Neg logit contributions
 fort-0.067
 way-0.065
 forgot-0.063
 pile-0.062
 hospital-0.062
Token it
Token ID340
Feature activation-2.45e+00

Pos logit contributions
uts+0.045
all+0.044
anes+0.044
ous+0.040
ges+0.040
Neg logit contributions
 fort-0.042
 forgot-0.039
 trap-0.038
 way-0.037
 scar-0.037
Token whenever
Token ID8797
Feature activation+1.32e+00

Pos logit contributions
 fort+0.075
 way+0.071
 scar+0.070
 trap+0.070
 forgot+0.069
Neg logit contributions
uts-0.057
anes-0.055
all-0.053
ous-0.049
ges-0.048
Token you
Token ID345
Feature activation-3.59e+00

Pos logit contributions
uts+0.033
anes+0.032
all+0.031
ous+0.029
that+0.028
Neg logit contributions
 fort-0.038
 way-0.036
 trap-0.035
 forgot-0.035
 scar-0.035
Token want
Token ID765
Feature activation+6.58e+00

Pos logit contributions
 fort+0.088
 way+0.082
 trap+0.081
 forgot+0.081
 scar+0.081
Neg logit contributions
uts-0.105
anes-0.102
all-0.099
ous-0.093
that-0.091
Token<|endoftext|>
Token ID50256
Feature activation+1.42e+01

Pos logit contributions
anes+0.162
uts+0.160
all+0.149
ous+0.146
unch+0.145
Neg logit contributions
 way-0.190
 fort-0.186
 trap-0.179
 pile-0.172
 hospital-0.172
TokenOnce
Token ID7454
Feature activation+6.74e+00

Pos logit contributions
uts+0.457
ous+0.446
all+0.409
amb+0.399
ful+0.394
Neg logit contributions
 fort-0.311
 way-0.309
 hole-0.296
 room-0.290
 trap-0.288
Token there
Token ID612
Feature activation+2.94e+00

Pos logit contributions
uts+0.200
anes+0.194
all+0.191
ous+0.179
that+0.175
Neg logit contributions
 fort-0.163
 way-0.151
 trap-0.149
 scar-0.149
 forgot-0.148
Token was
Token ID373
Feature activation-2.14e+00

Pos logit contributions
uts+0.095
anes+0.092
all+0.091
ous+0.085
that+0.085
Neg logit contributions
 fort-0.063
 scar-0.058
 trap-0.058
 way-0.057
 forgot-0.056
Token a
Token ID257
Feature activation-4.16e+00

Pos logit contributions
 fort+0.054
 way+0.050
 trap+0.049
 scar+0.049
 forgot+0.049
Neg logit contributions
uts-0.063
anes-0.061
all-0.059
ous-0.056
that-0.054
Token girl
Token ID2576
Feature activation-1.83e+00

Pos logit contributions
 fort+0.069
 trap+0.064
 way+0.063
 scar+0.063
 closet+0.062
Neg logit contributions
all-0.137
uts-0.136
unch-0.129
that-0.128
anes-0.128
Token is
Token ID318
Feature activation-2.09e+00

Pos logit contributions
 fort+0.034
 scar+0.031
 forgot+0.031
 trap+0.030
 closet+0.029
Neg logit contributions
uts-0.057
anes-0.056
all-0.054
ous-0.053
 Isn-0.052
Token a
Token ID257
Feature activation-4.67e+00

Pos logit contributions
 fort+0.053
 trap+0.048
 forgot+0.048
 way+0.048
 scar+0.047
Neg logit contributions
uts-0.058
all-0.057
anes-0.056
ous-0.052
that-0.052
Token noise
Token ID7838
Feature activation+4.66e+00

Pos logit contributions
 fort+0.091
 forgot+0.083
 scar+0.082
 trap+0.079
 way+0.079
Neg logit contributions
uts-0.154
anes-0.149
all-0.148
ous-0.141
 Isn-0.140
Token that
Token ID326
Feature activation-1.86e+00

Pos logit contributions
uts+0.125
anes+0.125
all+0.118
unch+0.112
that+0.110
Neg logit contributions
 fort-0.125
 way-0.121
 trap-0.116
 forgot-0.116
 pile-0.113
Token matches
Token ID7466
Feature activation+2.18e+00

Pos logit contributions
 fort+0.042
 scar+0.037
 way+0.037
 trap+0.037
 forgot+0.037
Neg logit contributions
uts-0.059
anes-0.057
all-0.056
ous-0.053
that-0.051
Token<|endoftext|>
Token ID50256
Feature activation+1.42e+01

Pos logit contributions
uts+0.060
anes+0.058
all+0.057
ous+0.053
that+0.052
Neg logit contributions
 fort-0.057
 way-0.054
 trap-0.053
 forgot-0.053
 scar-0.052
TokenBen
Token ID11696
Feature activation-1.52e+00

Pos logit contributions
ous+0.470
uts+0.470
all+0.428
amb+0.418
ful+0.395
Neg logit contributions
 hole-0.300
 fort-0.300
 way-0.298
 trap-0.285
 room-0.283
Token and
Token ID290
Feature activation-5.86e+00

Pos logit contributions
 fort+0.038
 scar+0.035
 trap+0.034
 way+0.034
 hospital+0.034
Neg logit contributions
uts-0.044
all-0.043
anes-0.042
ges-0.040
that-0.039
Token Lily
Token ID20037
Feature activation-5.15e+00

Pos logit contributions
 fort+0.144
 way+0.134
 trap+0.132
 scar+0.132
 forgot+0.131
Neg logit contributions
uts-0.173
anes-0.167
all-0.164
ous-0.154
that-0.151
Token were
Token ID547
Feature activation-2.05e+00

Pos logit contributions
 fort+0.096
 way+0.088
 forgot+0.087
 scar+0.086
 trap+0.086
Neg logit contributions
uts-0.181
anes-0.176
all-0.173
ous-0.165
that-0.161
Token playing
Token ID2712
Feature activation+7.13e+00

Pos logit contributions
 fort+0.051
 way+0.048
 scar+0.047
 trap+0.047
 forgot+0.047
Neg logit contributions
uts-0.060
anes-0.058
all-0.057
ous-0.053
that-0.052
Token back
Token ID736
Feature activation-9.44e-01

Pos logit contributions
all+0.084
uts+0.083
anes+0.082
that+0.075
ges+0.074
Neg logit contributions
 fort-0.097
 forgot-0.095
 trap-0.093
 way-0.089
 scar-0.089
Token into
Token ID656
Feature activation-1.55e+00

Pos logit contributions
 fort+0.027
 way+0.025
 scar+0.025
 trap+0.025
 forgot+0.025
Neg logit contributions
uts-0.024
anes-0.023
all-0.023
ous-0.021
that-0.020
Token bed
Token ID3996
Feature activation+7.19e+00

Pos logit contributions
 fort+0.036
 way+0.033
 trap+0.033
 scar+0.033
 forgot+0.033
Neg logit contributions
uts-0.047
anes-0.046
all-0.045
ous-0.042
that-0.042
Token for
Token ID329
Feature activation+2.60e-01

Pos logit contributions
uts+0.204
all+0.190
anes+0.182
ous+0.180
that+0.169
Neg logit contributions
 fort-0.190
 trap-0.182
 scar-0.179
 forgot-0.172
 way-0.172
Token a
Token ID257
Feature activation-5.46e+00

Pos logit contributions
uts+0.007
all+0.007
anes+0.007
ous+0.007
that+0.007
Neg logit contributions
 fort-0.007
 forgot-0.006
 trap-0.006
 way-0.006
 scar-0.006
Token<|endoftext|>
Token ID50256
Feature activation+1.42e+01

Pos logit contributions
 fort+0.112
 closet+0.101
 hospital+0.101
 forgot+0.100
 trap+0.099
Neg logit contributions
all-0.164
uts-0.163
ous-0.160
that-0.152
unch-0.150
TokenOnce
Token ID7454
Feature activation+6.24e+00

Pos logit contributions
uts+0.462
ous+0.424
all+0.404
anes+0.391
amb+0.378
Neg logit contributions
 fort-0.315
 closet-0.308
 way-0.307
 hole-0.304
 room-0.294
Token upon
Token ID2402
Feature activation-3.32e-01

Pos logit contributions
uts+0.190
anes+0.184
all+0.180
ous+0.169
that+0.166
Neg logit contributions
 fort-0.147
 way-0.137
 forgot-0.134
 scar-0.133
 trap-0.133
Token a
Token ID257
Feature activation-4.03e+00

Pos logit contributions
 fort+0.007
 way+0.006
 trap+0.006
 scar+0.006
 scarf+0.006
Neg logit contributions
uts-0.011
anes-0.011
all-0.010
ous-0.010
ges-0.010
Token time
Token ID640
Feature activation+4.47e+00

Pos logit contributions
 fort+0.068
 scar+0.065
 forgot+0.061
 closet+0.061
 trap+0.061
Neg logit contributions
all-0.141
uts-0.140
unch-0.131
that-0.129
anes-0.127
Token there
Token ID612
Feature activation+2.91e+00

Pos logit contributions
anes+0.177
uts+0.172
all+0.161
good+0.160
that+0.157
Neg logit contributions
 fort-0.063
 way-0.062
 forgot-0.055
 closet-0.053
 pile-0.051
Token dog
Token ID3290
Feature activation+1.68e+00

Pos logit contributions
 fort+0.007
 forgot+0.007
 scar+0.007
 closet+0.006
 scarf+0.006
Neg logit contributions
anes-0.014
uts-0.014
ous-0.014
all-0.013
od-0.013
Token call
Token ID869
Feature activation+7.99e+00

Pos logit contributions
uts+0.051
anes+0.050
all+0.049
ous+0.045
that+0.045
Neg logit contributions
 fort-0.040
 way-0.037
 trap-0.036
 scar-0.036
 forgot-0.036
Token out
Token ID503
Feature activation+7.18e-01

Pos logit contributions
uts+0.228
anes+0.218
all+0.210
unch+0.196
ous+0.191
Neg logit contributions
 fort-0.208
 way-0.202
 trap-0.195
 scar-0.192
 hospital-0.189
Token to
Token ID284
Feature activation-1.06e+00

Pos logit contributions
anes+0.022
uts+0.022
all+0.020
unch+0.020
that+0.019
Neg logit contributions
 fort-0.018
 trap-0.016
 scar-0.016
 way-0.016
 shield-0.016
Token him
Token ID683
Feature activation-4.32e+00

Pos logit contributions
 way+0.027
 fort+0.026
 trap+0.026
 hospital+0.025
 scar+0.025
Neg logit contributions
uts-0.033
anes-0.031
all-0.029
ous-0.028
unch-0.027
Token<|endoftext|>
Token ID50256
Feature activation+1.42e+01

Pos logit contributions
 fort+0.117
 way+0.113
 trap+0.108
 scar+0.107
 pile+0.106
Neg logit contributions
anes-0.119
uts-0.116
all-0.108
unch-0.100
 Isn-0.100
TokenOnce
Token ID7454
Feature activation+6.18e+00

Pos logit contributions
uts+0.473
ous+0.448
all+0.416
anes+0.404
amb+0.402
Neg logit contributions
 hole-0.304
 way-0.303
 fort-0.298
 closet-0.288
 room-0.283
Token there
Token ID612
Feature activation+2.74e+00

Pos logit contributions
uts+0.197
anes+0.192
all+0.186
ous+0.175
that+0.173
Neg logit contributions
 fort-0.137
 way-0.130
 forgot-0.125
 scar-0.120
 closet-0.120
Token was
Token ID373
Feature activation-2.41e+00

Pos logit contributions
uts+0.097
anes+0.094
all+0.092
ous+0.087
ges+0.085
Neg logit contributions
 fort-0.051
 way-0.049
 forgot-0.047
 trap-0.045
 scar-0.045
Token a
Token ID257
Feature activation-4.81e+00

Pos logit contributions
 fort+0.053
 way+0.049
 trap+0.049
 scar+0.048
 hospital+0.048
Neg logit contributions
uts-0.078
anes-0.076
all-0.072
ous-0.072
ges-0.068
Token pup
Token ID15552
Feature activation+1.09e+00

Pos logit contributions
 fort+0.081
 way+0.074
 scar+0.074
 forgot+0.072
 trap+0.070
Neg logit contributions
uts-0.159
all-0.157
anes-0.153
ous-0.151
unch-0.151
Token always
Token ID1464
Feature activation-3.14e+00

Pos logit contributions
 fort+0.028
 way+0.025
 trap+0.025
 scar+0.025
 forgot+0.025
Neg logit contributions
uts-0.044
anes-0.043
all-0.042
ous-0.040
that-0.039
Token ask
Token ID1265
Feature activation+5.90e+00

Pos logit contributions
 fort+0.066
 way+0.062
 trap+0.060
 scar+0.060
 forgot+0.060
Neg logit contributions
uts-0.103
anes-0.100
all-0.098
ous-0.092
that-0.091
Token before
Token ID878
Feature activation+7.85e+00

Pos logit contributions
uts+0.172
anes+0.167
all+0.163
ous+0.153
that+0.150
Neg logit contributions
 fort-0.146
 way-0.136
 scar-0.134
 trap-0.134
 forgot-0.133
Token you
Token ID345
Feature activation-2.89e+00

Pos logit contributions
anes+0.229
uts+0.224
ous+0.204
all+0.204
that+0.201
Neg logit contributions
 way-0.191
 fort-0.183
 scar-0.180
 forgot-0.178
 hospital-0.176
Token peek
Token ID27185
Feature activation+7.25e+00

Pos logit contributions
 fort+0.067
 way+0.062
 trap+0.061
 scar+0.061
 forgot+0.061
Neg logit contributions
uts-0.089
anes-0.086
all-0.084
ous-0.079
that-0.078
Token<|endoftext|>
Token ID50256
Feature activation+1.41e+01

Pos logit contributions
uts+0.196
anes+0.191
all+0.183
ous+0.171
unch+0.170
Neg logit contributions
 fort-0.194
 way-0.183
 scar-0.182
 trap-0.178
 forgot-0.177
TokenOnce
Token ID7454
Feature activation+6.45e+00

Pos logit contributions
uts+0.500
ous+0.468
all+0.435
ful+0.426
amb+0.414
Neg logit contributions
 fort-0.267
 way-0.264
 hole-0.263
 closet-0.259
 pile-0.246
Token,
Token ID11
Feature activation+3.34e-01

Pos logit contributions
uts+0.207
anes+0.201
all+0.194
ous+0.184
that+0.181
Neg logit contributions
 fort-0.142
 way-0.134
 forgot-0.130
 scar-0.127
 closet-0.126
Token there
Token ID612
Feature activation+1.96e+00

Pos logit contributions
uts+0.010
anes+0.009
all+0.009
ous+0.008
that+0.008
Neg logit contributions
 fort-0.009
 way-0.008
 forgot-0.008
 scar-0.008
 closet-0.008
Token was
Token ID373
Feature activation-2.76e+00

Pos logit contributions
uts+0.070
anes+0.069
all+0.067
ous+0.064
ges+0.062
Neg logit contributions
 fort-0.035
 way-0.034
 forgot-0.032
 trap-0.032
 closet-0.031
Token a
Token ID257
Feature activation-5.14e+00

Pos logit contributions
 fort+0.066
 way+0.061
 scar+0.060
 trap+0.060
 closet+0.059
Neg logit contributions
uts-0.084
anes-0.082
all-0.078
ous-0.076
that-0.073
Token parents
Token ID3397
Feature activation+2.81e+00

Pos logit contributions
 forgot+0.172
ns+0.159
 trap+0.158
eh+0.156
 hospital+0.153
Neg logit contributions
all-0.231
uts-0.228
ous-0.227
that-0.219
anes-0.218
Token and
Token ID290
Feature activation-5.64e+00

Pos logit contributions
uts+0.092
anes+0.090
all+0.087
ous+0.082
ges+0.081
Neg logit contributions
 fort-0.059
 way-0.055
 scar-0.053
 trap-0.053
 forgot-0.053
Token skipped
Token ID26684
Feature activation+4.64e+00

Pos logit contributions
 fort+0.139
 trap+0.129
 closet+0.123
 scar+0.123
 hospital+0.122
Neg logit contributions
uts-0.164
all-0.154
anes-0.151
ous-0.146
 Mike-0.143
Token off
Token ID572
Feature activation+1.46e+00

Pos logit contributions
uts+0.145
anes+0.142
all+0.135
ous+0.131
unch+0.128
Neg logit contributions
 fort-0.107
 way-0.101
 scar-0.099
 trap-0.096
 closet-0.095
Token to
Token ID284
Feature activation-2.03e+00

Pos logit contributions
anes+0.040
uts+0.039
all+0.036
ous+0.035
 Isn+0.035
Neg logit contributions
 fort-0.038
 way-0.038
 forgot-0.036
 scar-0.035
 closet-0.035
Token<|endoftext|>
Token ID50256
Feature activation+1.41e+01

Pos logit contributions
 fort+0.049
 scar+0.046
 forgot+0.046
 trap+0.045
 way+0.043
Neg logit contributions
uts-0.057
all-0.056
anes-0.055
ous-0.052
that-0.052
TokenOnce
Token ID7454
Feature activation+6.15e+00

Pos logit contributions
uts+0.477
ous+0.452
all+0.423
amb+0.407
reating+0.405
Neg logit contributions
 hole-0.292
 fort-0.290
 way-0.288
 closet-0.274
 backyard-0.274
Token upon
Token ID2402
Feature activation-4.27e-01

Pos logit contributions
uts+0.192
anes+0.186
all+0.181
ous+0.170
that+0.168
Neg logit contributions
 fort-0.141
 way-0.132
 forgot-0.128
 scar-0.126
 trap-0.125
Token a
Token ID257
Feature activation-4.14e+00

Pos logit contributions
 fort+0.009
 way+0.008
 trap+0.008
 scar+0.008
 forgot+0.008
Neg logit contributions
uts-0.014
anes-0.014
all-0.013
ous-0.013
ges-0.013
Token time
Token ID640
Feature activation+4.36e+00

Pos logit contributions
 fort+0.075
 scar+0.074
 needle+0.069
 closet+0.069
 trap+0.069
Neg logit contributions
all-0.138
uts-0.135
unch-0.130
that-0.124
anes-0.122
Token,
Token ID11
Feature activation-4.84e-01

Pos logit contributions
anes+0.168
uts+0.165
all+0.155
good+0.150
that+0.149
Neg logit contributions
 fort-0.066
 way-0.065
 forgot-0.058
 closet-0.054
 pile-0.054
Token never
Token ID1239
Feature activation-1.57e+00

Pos logit contributions
 fort+0.017
 way+0.016
 trap+0.015
 scar+0.015
 forgot+0.015
Neg logit contributions
uts-0.030
anes-0.029
all-0.029
ous-0.027
ges-0.027
Token go
Token ID467
Feature activation+4.67e+00

Pos logit contributions
 fort+0.029
 way+0.029
 forgot+0.028
 trap+0.027
 scar+0.026
Neg logit contributions
uts-0.056
anes-0.055
all-0.053
ous-0.050
that-0.050
Token near
Token ID1474
Feature activation+2.83e+00

Pos logit contributions
uts+0.142
anes+0.137
all+0.132
ous+0.127
ges+0.123
Neg logit contributions
 fort-0.112
 way-0.106
 trap-0.103
 scar-0.103
 forgot-0.102
Token a
Token ID257
Feature activation-4.48e+00

Pos logit contributions
uts+0.081
anes+0.078
all+0.077
ous+0.071
ges+0.070
Neg logit contributions
 fort-0.071
 way-0.066
 forgot-0.066
 scar-0.066
 trap-0.065
Token hive
Token ID35881
Feature activation+4.97e+00

Pos logit contributions
 fort+0.085
 forgot+0.084
 scar+0.079
 way+0.075
 closet+0.075
Neg logit contributions
anes-0.143
uts-0.142
all-0.141
ous-0.136
 Isn-0.131
Token<|endoftext|>
Token ID50256
Feature activation+1.41e+01

Pos logit contributions
uts+0.124
all+0.119
anes+0.119
ous+0.109
that+0.106
Neg logit contributions
 fort-0.141
 scar-0.134
 way-0.132
 trap-0.131
 forgot-0.131
TokenOnce
Token ID7454
Feature activation+6.53e+00

Pos logit contributions
uts+0.463
ous+0.436
all+0.407
ful+0.385
anes+0.384
Neg logit contributions
 fort-0.302
 hole-0.292
 way-0.290
 closet-0.284
 trap-0.274
Token upon
Token ID2402
Feature activation-3.27e-02

Pos logit contributions
uts+0.200
anes+0.194
all+0.190
ous+0.178
that+0.175
Neg logit contributions
 fort-0.152
 way-0.142
 forgot-0.139
 scar-0.137
 trap-0.137
Token a
Token ID257
Feature activation-4.03e+00

Pos logit contributions
 fort+0.001
 trap+0.001
 way+0.001
 scar+0.001
 forgot+0.001
Neg logit contributions
uts-0.001
anes-0.001
all-0.001
ous-0.001
ges-0.001
Token time
Token ID640
Feature activation+4.63e+00

Pos logit contributions
 scar+0.079
 fort+0.077
 needle+0.076
 closet+0.076
 hospital+0.075
Neg logit contributions
all-0.126
uts-0.119
unch-0.118
that-0.108
od-0.107
Token,
Token ID11
Feature activation-2.60e-01

Pos logit contributions
anes+0.186
uts+0.178
good+0.168
all+0.168
 Isn+0.166
Neg logit contributions
 fort-0.065
 way-0.064
 forgot-0.055
 closet-0.052
 pile-0.052
Token it
Token ID340
Feature activation-3.22e+00

Pos logit contributions
 fort+0.056
 way+0.053
 scar+0.052
 trap+0.052
 forgot+0.052
Neg logit contributions
uts-0.058
anes-0.056
all-0.054
ous-0.051
ges-0.050
Token's
Token ID338
Feature activation-4.90e+00

Pos logit contributions
 fort+0.076
 closet+0.070
 scar+0.070
 trap+0.068
 scarf+0.063
Neg logit contributions
all-0.097
uts-0.095
anes-0.090
ous-0.089
that-0.084
Token never
Token ID1239
Feature activation-2.45e+00

Pos logit contributions
 fort+0.116
 trap+0.100
 closet+0.100
 scar+0.098
 forgot+0.094
Neg logit contributions
all-0.153
uts-0.145
that-0.135
anes-0.135
ous-0.133
Token okay
Token ID8788
Feature activation+2.96e+00

Pos logit contributions
 fort+0.059
 trap+0.053
 scar+0.053
 closet+0.053
 hospital+0.051
Neg logit contributions
uts-0.071
all-0.069
anes-0.068
ous-0.067
that-0.062
Token to
Token ID284
Feature activation-1.35e+00

Pos logit contributions
anes+0.087
uts+0.077
amb+0.076
unch+0.075
all+0.072
Neg logit contributions
 way-0.075
 fort-0.073
 forgot-0.071
 head-0.066
 trap-0.065
Token<|endoftext|>
Token ID50256
Feature activation+1.41e+01

Pos logit contributions
 way+0.024
 forgot+0.022
 pile+0.021
 head+0.021
 trap+0.020
Neg logit contributions
uts-0.050
anes-0.050
all-0.046
unch-0.045
elt-0.045
TokenOnce
Token ID7454
Feature activation+6.55e+00

Pos logit contributions
uts+0.469
ous+0.448
all+0.412
ful+0.403
amb+0.402
Neg logit contributions
 way-0.290
 fort-0.288
 closet-0.277
 hole-0.276
 room-0.272
Token upon
Token ID2402
Feature activation+8.63e-02

Pos logit contributions
uts+0.213
anes+0.206
all+0.199
ous+0.189
that+0.187
Neg logit contributions
 fort-0.142
 way-0.134
 forgot-0.130
 closet-0.125
 scar-0.125
Token a
Token ID257
Feature activation-3.91e+00

Pos logit contributions
uts+0.003
anes+0.003
all+0.003
ous+0.003
ges+0.003
Neg logit contributions
 fort-0.002
 way-0.002
 trap-0.002
 scar-0.002
 hospital-0.002
Token time
Token ID640
Feature activation+4.62e+00

Pos logit contributions
 fort+0.074
 scar+0.073
 closet+0.071
 needle+0.071
 trap+0.071
Neg logit contributions
all-0.126
uts-0.119
unch-0.115
that-0.108
 Mike-0.108
Token there
Token ID612
Feature activation+2.83e+00

Pos logit contributions
anes+0.191
uts+0.186
good+0.175
all+0.173
leaf+0.172
Neg logit contributions
 fort-0.056
 way-0.056
 forgot-0.047
 closet-0.046
 pile-0.044
Token this
Token ID428
Feature activation-5.23e+00

Pos logit contributions
 fort+0.032
 way+0.030
 scar+0.030
 trap+0.030
 forgot+0.030
Neg logit contributions
uts-0.035
anes-0.034
all-0.033
ous-0.031
that-0.030
Token old
Token ID1468
Feature activation-3.14e+00

Pos logit contributions
 fort+0.121
 scar+0.112
 trap+0.107
 shield+0.103
 stable+0.103
Neg logit contributions
all-0.147
uts-0.146
that-0.138
 Mike-0.136
 Isn-0.133
Token man
Token ID582
Feature activation+1.63e+00

Pos logit contributions
 fort+0.061
 scar+0.054
 forgot+0.054
 trap+0.052
 web+0.049
Neg logit contributions
uts-0.104
all-0.099
anes-0.097
that-0.094
 Mike-0.092
Token might
Token ID1244
Feature activation+1.33e+00

Pos logit contributions
uts+0.050
all+0.049
ous+0.047
anes+0.046
that+0.046
Neg logit contributions
 fort-0.039
 scar-0.035
 closet-0.033
 trap-0.033
 hole-0.032
Token be
Token ID307
Feature activation-2.93e+00

Pos logit contributions
uts+0.044
anes+0.043
all+0.043
ous+0.040
that+0.039
Neg logit contributions
 fort-0.027
 way-0.025
 scar-0.024
 trap-0.024
 forgot-0.024
Token<|endoftext|>
Token ID50256
Feature activation+1.41e+01

Pos logit contributions
 fort+0.071
 trap+0.060
 closet+0.060
 scar+0.060
 way+0.059
Neg logit contributions
uts-0.089
all-0.087
anes-0.082
ges-0.079
that-0.078
TokenOnce
Token ID7454
Feature activation+6.30e+00

Pos logit contributions
uts+0.458
ous+0.427
all+0.399
anes+0.384
amb+0.379
Neg logit contributions
 way-0.309
 fort-0.308
 hole-0.299
 closet-0.295
 room-0.286
Token there
Token ID612
Feature activation+2.86e+00

Pos logit contributions
uts+0.196
anes+0.190
all+0.185
ous+0.174
that+0.172
Neg logit contributions
 fort-0.145
 way-0.136
 forgot-0.132
 scar-0.130
 trap-0.129
Token was
Token ID373
Feature activation-2.21e+00

Pos logit contributions
uts+0.097
anes+0.094
all+0.092
ous+0.087
ges+0.085
Neg logit contributions
 fort-0.057
 way-0.053
 forgot-0.052
 trap-0.051
 scar-0.051
Token a
Token ID257
Feature activation-4.48e+00

Pos logit contributions
 fort+0.053
 way+0.049
 scar+0.048
 trap+0.048
 forgot+0.047
Neg logit contributions
uts-0.067
anes-0.065
all-0.062
ous-0.060
that-0.058
Token little
Token ID1310
Feature activation-2.80e+00

Pos logit contributions
 fort+0.077
 way+0.070
 trap+0.068
 scar+0.068
 forgot+0.067
Neg logit contributions
uts-0.148
all-0.147
anes-0.141
ous-0.139
that-0.139
Token out
Token ID503
Feature activation-1.31e+00

Pos logit contributions
uts+0.047
all+0.047
ous+0.044
anes+0.044
ges+0.042
Neg logit contributions
 fort-0.073
 forgot-0.070
 scar-0.067
 trap-0.066
 hospital-0.066
Token for
Token ID329
Feature activation+2.72e-01

Pos logit contributions
 fort+0.041
 forgot+0.039
 trap+0.038
 scar+0.037
 closet+0.037
Neg logit contributions
all-0.029
uts-0.028
ous-0.026
anes-0.026
ges-0.026
Token more
Token ID517
Feature activation-2.19e+00

Pos logit contributions
all+0.007
uts+0.007
 Mike+0.006
anes+0.006
ges+0.006
Neg logit contributions
 fort-0.008
 forgot-0.007
 closet-0.007
 trap-0.007
 pile-0.006
Token miracles
Token ID32904
Feature activation+2.28e+00

Pos logit contributions
 fort+0.051
 forgot+0.047
 closet+0.046
 hospital+0.046
 trap+0.046
Neg logit contributions
uts-0.065
all-0.062
anes-0.061
ges-0.057
ous-0.056
Token to
Token ID284
Feature activation-2.18e+00

Pos logit contributions
uts+0.055
anes+0.053
all+0.052
ous+0.048
that+0.046
Neg logit contributions
 fort-0.068
 trap-0.063
 way-0.063
 scar-0.063
 forgot-0.063
Token<|endoftext|>
Token ID50256
Feature activation+1.41e+01

Pos logit contributions
 fort+0.046
 way+0.044
 trap+0.043
 forgot+0.042
 scar+0.042
Neg logit contributions
uts-0.072
anes-0.070
all-0.068
ous-0.064
ges-0.063
TokenOnce
Token ID7454
Feature activation+6.11e+00

Pos logit contributions
uts+0.476
ous+0.461
all+0.428
amb+0.413
ful+0.407
Neg logit contributions
 way-0.291
 hole-0.285
 fort-0.279
 closet-0.266
 room-0.266
Token upon
Token ID2402
Feature activation-4.32e-01

Pos logit contributions
uts+0.185
anes+0.180
all+0.176
ous+0.165
that+0.162
Neg logit contributions
 fort-0.145
 way-0.135
 forgot-0.132
 trap-0.131
 scar-0.131
Token a
Token ID257
Feature activation-4.33e+00

Pos logit contributions
 fort+0.010
 way+0.009
 trap+0.009
 scar+0.009
 forgot+0.009
Neg logit contributions
uts-0.014
anes-0.014
all-0.013
ous-0.013
ges-0.012
Token time
Token ID640
Feature activation+4.24e+00

Pos logit contributions
 fort+0.082
 scar+0.078
 closet+0.078
 needle+0.076
 hospital+0.076
Neg logit contributions
all-0.140
uts-0.133
unch-0.130
that-0.124
 Mike-0.119
Token,
Token ID11
Feature activation-6.63e-01

Pos logit contributions
anes+0.167
uts+0.163
all+0.153
good+0.151
 Isn+0.149
Neg logit contributions
 fort-0.060
 way-0.059
 forgot-0.053
 closet-0.050
 hole-0.049
Token cheered
Token ID34767
Feature activation+8.74e+00

Pos logit contributions
 fort+0.147
 way+0.139
 scar+0.137
 forgot+0.137
 trap+0.135
Neg logit contributions
uts-0.184
anes-0.179
all-0.175
ous-0.163
that-0.161
Token.
Token ID13
Feature activation+5.18e+00

Pos logit contributions
anes+0.282
uts+0.270
unch+0.258
all+0.241
 Isn+0.239
Neg logit contributions
 fort-0.199
 way-0.191
 scar-0.182
 forgot-0.175
 head-0.173
Token Ell
Token ID7122
Feature activation-7.12e+00

Pos logit contributions
anes+0.143
uts+0.142
all+0.137
unch+0.134
ous+0.128
Neg logit contributions
 way-0.124
 floor-0.115
 fort-0.115
 room-0.115
 proper-0.115
Tokena
Token ID64
Feature activation-7.12e+00

Pos logit contributions
 fort+0.181
 trap+0.167
 way+0.165
 closet+0.162
 scar+0.162
Neg logit contributions
uts-0.210
all-0.201
anes-0.200
ous-0.187
that-0.183
Token felt
Token ID2936
Feature activation-1.66e+00

Pos logit contributions
 fort+0.164
 trap+0.149
 closet+0.143
 scar+0.142
 way+0.139
Neg logit contributions
uts-0.226
all-0.217
anes-0.216
ous-0.204
that-0.200
Token their
Token ID511
Feature activation-1.22e+01

Pos logit contributions
 fort+0.033
 way+0.031
 forgot+0.031
 trap+0.030
 scar+0.030
Neg logit contributions
uts-0.056
anes-0.055
all-0.053
ous-0.051
ges-0.050
Token applause
Token ID28445
Feature activation+4.68e+00

Pos logit contributions
 fort+0.293
 scar+0.279
 trap+0.259
 closet+0.259
 hospital+0.257
Neg logit contributions
uts-0.345
all-0.339
ous-0.323
anes-0.321
that-0.317
Token through
Token ID832
Feature activation-1.49e+00

Pos logit contributions
uts+0.130
anes+0.126
all+0.122
ous+0.113
ges+0.112
Neg logit contributions
 fort-0.122
 way-0.116
 forgot-0.113
 trap-0.112
 scar-0.112
Token her
Token ID607
Feature activation-9.46e+00

Pos logit contributions
 way+0.027
 fort+0.025
 trap+0.023
 hospital+0.023
 scar+0.023
Neg logit contributions
uts-0.055
anes-0.054
all-0.051
unch-0.050
ous-0.049
Token big
Token ID1263
Feature activation-4.95e+00

Pos logit contributions
 fort+0.201
 forgot+0.197
 scar+0.197
 hospital+0.189
 needle+0.181
Neg logit contributions
uts-0.289
anes-0.260
ous-0.254
all-0.253
od-0.251
Token feet
Token ID3625
Feature activation+4.75e+00

Pos logit contributions
 fort+0.108
 forgot+0.099
 scar+0.099
 hospital+0.097
 trap+0.097
Neg logit contributions
uts-0.155
anes-0.152
all-0.148
ous-0.142
ges-0.138
Token?"
Token ID1701
Feature activation+7.76e+00

Pos logit contributions
uts+0.091
anes+0.088
all+0.086
ous+0.079
that+0.077
Neg logit contributions
 fort-0.114
 way-0.108
 scar-0.107
 forgot-0.107
 trap-0.107
Token R
Token ID371
Feature activation-2.06e+00

Pos logit contributions
uts+0.216
anes+0.214
all+0.201
ous+0.199
that+0.194
Neg logit contributions
 scar-0.194
 fort-0.193
 way-0.186
 forgot-0.184
 hospital-0.180
Tokenuffy
Token ID15352
Feature activation-5.10e+00

Pos logit contributions
 fort+0.041
 trap+0.037
 scar+0.036
 way+0.036
 forgot+0.036
Neg logit contributions
uts-0.070
anes-0.068
all-0.067
that-0.064
ges-0.063
Token w
Token ID266
Feature activation-2.75e-01

Pos logit contributions
 fort+0.137
 trap+0.124
 closet+0.121
 way+0.121
 scar+0.119
Neg logit contributions
uts-0.142
all-0.138
anes-0.128
ous-0.124
that-0.122
Tokenagged
Token ID14655
Feature activation+4.18e-02

Pos logit contributions
 fort+0.008
 way+0.008
 forgot+0.007
 trap+0.007
 scar+0.007
Neg logit contributions
uts-0.007
anes-0.007
all-0.007
ous-0.006
unch-0.006
Token his
Token ID465
Feature activation-1.19e+01

Pos logit contributions
uts+0.001
all+0.001
anes+0.001
ous+0.001
that+0.001
Neg logit contributions
 fort-0.001
 trap-0.001
 way-0.001
 forgot-0.001
 scar-0.001
Token tail
Token ID7894
Feature activation+1.86e+00

Pos logit contributions
 hospital+0.279
 fort+0.255
 forgot+0.254
 closet+0.240
 scar+0.240
Neg logit contributions
uts-0.332
all-0.286
ges-0.281
unch-0.278
that-0.277
Token.
Token ID13
Feature activation+5.10e+00

Pos logit contributions
uts+0.054
anes+0.052
all+0.051
ous+0.048
that+0.047
Neg logit contributions
 fort-0.046
 way-0.043
 trap-0.043
 scar-0.043
 forgot-0.042
Token\n
Token ID198
Feature activation+3.49e+00

Pos logit contributions
anes+0.143
uts+0.141
all+0.137
ous+0.130
ges+0.129
Neg logit contributions
 fort-0.128
 scar-0.120
 way-0.120
 closet-0.116
 pile-0.115
Token\n
Token ID198
Feature activation+3.06e+00

Pos logit contributions
uts+0.075
anes+0.075
ous+0.070
all+0.069
ges+0.066
Neg logit contributions
 fort-0.109
 way-0.105
 trap-0.101
 scar-0.101
 forgot-0.098
TokenTim
Token ID14967
Feature activation-6.34e+00

Pos logit contributions
uts+0.076
anes+0.074
all+0.070
ous+0.068
ges+0.066
Neg logit contributions
 fort-0.089
 scar-0.084
 way-0.083
 trap-0.082
 forgot-0.082
Token now
Token ID783
Feature activation-2.98e+00

Pos logit contributions
 fort+0.032
 way+0.030
 scar+0.030
 trap+0.030
 forgot+0.030
Neg logit contributions
uts-0.035
anes-0.034
all-0.032
ous-0.031
ges-0.030
Token he
Token ID339
Feature activation-7.96e+00

Pos logit contributions
 fort+0.084
 scar+0.078
 way+0.078
 trap+0.077
 closet+0.076
Neg logit contributions
uts-0.077
anes-0.073
all-0.073
ous-0.067
that-0.066
Token could
Token ID714
Feature activation+2.41e+00

Pos logit contributions
 fort+0.213
 closet+0.193
 trap+0.184
 scar+0.183
 hole+0.179
Neg logit contributions
all-0.224
uts-0.217
that-0.201
 Isn-0.192
anes-0.192
Token qu
Token ID627
Feature activation+5.21e+00

Pos logit contributions
uts+0.079
all+0.076
anes+0.076
ous+0.071
that+0.070
Neg logit contributions
 fort-0.051
 way-0.047
 scar-0.047
 trap-0.046
 forgot-0.046
Tokenench
Token ID24421
Feature activation+4.85e+00

Pos logit contributions
uts+0.148
anes+0.143
all+0.140
ous+0.130
that+0.128
Neg logit contributions
 fort-0.134
 way-0.126
 trap-0.124
 scar-0.124
 forgot-0.123
Token his
Token ID465
Feature activation-1.19e+01

Pos logit contributions
uts+0.152
anes+0.151
all+0.142
ous+0.139
ges+0.132
Neg logit contributions
 fort-0.110
 way-0.105
 scar-0.104
 trap-0.102
 forgot-0.099
Token thirst
Token ID24613
Feature activation+3.84e+00

Pos logit contributions
 fort+0.226
 forgot+0.207
 hospital+0.205
 closet+0.204
 slammed+0.192
Neg logit contributions
all-0.377
uts-0.373
unch-0.370
elt-0.348
that-0.346
Token and
Token ID290
Feature activation-5.87e+00

Pos logit contributions
uts+0.106
anes+0.101
all+0.100
ous+0.094
that+0.092
Neg logit contributions
 fort-0.101
 way-0.094
 scar-0.094
 trap-0.094
 forgot-0.093
Token have
Token ID423
Feature activation-1.14e+00

Pos logit contributions
 fort+0.134
 trap+0.121
 closet+0.120
 scar+0.118
 hospital+0.113
Neg logit contributions
uts-0.174
all-0.170
anes-0.159
ous-0.157
 Mike-0.154
Token a
Token ID257
Feature activation-6.22e+00

Pos logit contributions
 fort+0.030
 trap+0.028
 closet+0.027
 way+0.027
 scar+0.027
Neg logit contributions
uts-0.031
all-0.030
anes-0.029
that-0.027
ous-0.027
Token cool
Token ID3608
Feature activation-2.55e+00

Pos logit contributions
 fort+0.128
 floor+0.118
 closet+0.117
 forgot+0.117
 traps+0.115
Neg logit contributions
all-0.185
uts-0.177
unch-0.171
that-0.168
 Isn-0.166
Token\n
Token ID198
Feature activation+3.73e+00

Pos logit contributions
all+0.174
anes+0.172
uts+0.171
ous+0.166
od+0.162
Neg logit contributions
 fort-0.101
 way-0.094
 forgot-0.089
 hospital-0.089
 scar-0.089
Token\n
Token ID198
Feature activation+3.25e+00

Pos logit contributions
uts+0.080
anes+0.078
all+0.076
ous+0.074
ges+0.069
Neg logit contributions
 fort-0.116
 way-0.108
 trap-0.107
 hospital-0.106
 scar-0.105
TokenMax
Token ID11518
Feature activation-5.67e+00

Pos logit contributions
uts+0.098
all+0.093
anes+0.093
that+0.087
ous+0.084
Neg logit contributions
 fort-0.079
 way-0.076
 scar-0.072
 closet-0.072
 trap-0.072
Token w
Token ID266
Feature activation-6.29e-01

Pos logit contributions
 fort+0.136
 trap+0.125
 closet+0.118
 hole+0.118
 way+0.116
Neg logit contributions
all-0.173
uts-0.172
anes-0.160
ous-0.157
that-0.156
Tokenagged
Token ID14655
Feature activation-1.33e-01

Pos logit contributions
 fort+0.021
 way+0.021
 closet+0.020
 scar+0.020
 pile+0.019
Neg logit contributions
uts-0.014
anes-0.014
all-0.013
unch-0.013
ous-0.011
Token his
Token ID465
Feature activation-1.18e+01

Pos logit contributions
 fort+0.004
 trap+0.004
 forgot+0.004
 scar+0.004
 hospital+0.004
Neg logit contributions
all-0.003
uts-0.003
that-0.003
 Mike-0.003
ges-0.003
Token mustache
Token ID49303
Feature activation+2.06e+00

Pos logit contributions
 hospital+0.304
 fort+0.285
ns+0.278
 forgot+0.277
 scar+0.273
Neg logit contributions
uts-0.257
that-0.237
amb-0.225
ges-0.223
all-0.221
Token happily
Token ID18177
Feature activation-1.01e+00

Pos logit contributions
uts+0.052
all+0.050
ous+0.047
that+0.045
anes+0.044
Neg logit contributions
 fort-0.058
 closet-0.055
 trap-0.055
 trash-0.054
 scar-0.053
Token and
Token ID290
Feature activation-6.05e+00

Pos logit contributions
 fort+0.024
 trap+0.022
 way+0.022
 scar+0.022
 closet+0.022
Neg logit contributions
uts-0.031
all-0.029
anes-0.029
ous-0.027
that-0.027
Token said
Token ID531
Feature activation+3.05e+00

Pos logit contributions
 fort+0.162
 trap+0.154
 closet+0.147
 hole+0.143
ns+0.138
Neg logit contributions
uts-0.153
all-0.151
 Mike-0.149
ous-0.138
that-0.136
Token,
Token ID11
Feature activation-7.41e-01

Pos logit contributions
anes+0.120
uts+0.115
all+0.111
unch+0.107
that+0.107
Neg logit contributions
 fort-0.044
 way-0.042
 forgot-0.041
 shield-0.038
 scar-0.036
Token family
Token ID1641
Feature activation+1.28e-02

Pos logit contributions
 fort+0.190
 trap+0.178
 forgot+0.177
 scar+0.176
 way+0.168
Neg logit contributions
anes-0.249
all-0.249
uts-0.245
ous-0.227
that-0.226
Token spent
Token ID3377
Feature activation+2.52e-01

Pos logit contributions
all+0.000
anes+0.000
uts+0.000
 Mike+0.000
that+0.000
Neg logit contributions
 fort-0.000
 trap-0.000
 scar-0.000
 closet-0.000
 way-0.000
Token the
Token ID262
Feature activation-9.24e+00

Pos logit contributions
all+0.005
 Mike+0.005
that+0.005
uts+0.005
od+0.005
Neg logit contributions
 fort-0.008
 closet-0.007
 needle-0.007
 scar-0.007
 shield-0.007
Token rest
Token ID1334
Feature activation+2.86e+00

Pos logit contributions
 fort+0.178
 forgot+0.165
 scar+0.165
 trap+0.164
 closet+0.160
Neg logit contributions
uts-0.309
all-0.306
anes-0.305
that-0.287
ous-0.284
Token of
Token ID286
Feature activation-3.17e+00

Pos logit contributions
uts+0.043
all+0.039
anes+0.038
ous+0.035
ges+0.034
Neg logit contributions
 fort-0.113
 scar-0.109
 trap-0.108
 way-0.106
 closet-0.106
Token their
Token ID511
Feature activation-1.18e+01

Pos logit contributions
 fort+0.078
 forgot+0.072
 scar+0.071
 way+0.071
 trap+0.071
Neg logit contributions
uts-0.092
all-0.089
anes-0.088
ges-0.082
ous-0.082
Token days
Token ID1528
Feature activation+3.25e+00

Pos logit contributions
 forgot+0.266
 fort+0.262
 trap+0.258
 needle+0.254
 scar+0.254
Neg logit contributions
all-0.328
uts-0.311
anes-0.308
good-0.296
ous-0.291
Token in
Token ID287
Feature activation-2.29e+00

Pos logit contributions
uts+0.087
anes+0.084
all+0.082
ous+0.077
that+0.075
Neg logit contributions
 fort-0.088
 way-0.083
 forgot-0.081
 scar-0.081
 trap-0.081
Token their
Token ID511
Feature activation-1.10e+01

Pos logit contributions
 fort+0.053
 way+0.049
 scar+0.049
 trap+0.049
 forgot+0.048
Neg logit contributions
uts-0.070
anes-0.068
all-0.067
ous-0.063
ges-0.062
Token new
Token ID649
Feature activation-4.30e+00

Pos logit contributions
ches+0.253
 forgot+0.242
eh+0.240
 needle+0.237
 trap+0.237
Neg logit contributions
all-0.264
 Mike-0.261
uts-0.249
good-0.245
that-0.245
Token home
Token ID1363
Feature activation+3.04e+00

Pos logit contributions
 forgot+0.095
ches+0.087
 trap+0.086
 scar+0.086
 needle+0.085
Neg logit contributions
all-0.117
uts-0.116
anes-0.111
 Mike-0.109
good-0.107
Token and
Token ID290
Feature activation-6.06e+00

Pos logit contributions
uts+0.199
anes+0.191
all+0.189
ges+0.176
ous+0.176
Neg logit contributions
 fort-0.161
 way-0.148
 trap-0.148
 forgot-0.148
 scar-0.147
Token pull
Token ID2834
Feature activation+6.83e+00

Pos logit contributions
 fort+0.130
 trap+0.115
 way+0.115
 closet+0.114
 scar+0.113
Neg logit contributions
uts-0.197
all-0.185
anes-0.185
ges-0.178
that-0.175
Token.
Token ID13
Feature activation+5.04e+00

Pos logit contributions
uts+0.185
anes+0.178
all+0.175
ous+0.162
ges+0.160
Neg logit contributions
 fort-0.184
 way-0.172
 trap-0.170
 scar-0.170
 forgot-0.169
Token They
Token ID1119
Feature activation-6.74e+00

Pos logit contributions
uts+0.128
anes+0.125
all+0.123
ous+0.116
that+0.109
Neg logit contributions
 fort-0.143
 way-0.135
 trap-0.131
 scar-0.129
 forgot-0.129
Token feel
Token ID1254
Feature activation-1.64e+00

Pos logit contributions
 fort+0.184
eh+0.167
 trap+0.163
 closet+0.162
 scar+0.159
Neg logit contributions
uts-0.174
anes-0.159
 Mike-0.159
that-0.155
ges-0.154
Token their
Token ID511
Feature activation-1.18e+01

Pos logit contributions
 fort+0.040
 way+0.036
 scar+0.035
 forgot+0.035
 closet+0.035
Neg logit contributions
uts-0.050
anes-0.047
all-0.047
ges-0.045
that-0.044
Token arms
Token ID5101
Feature activation+1.76e+00

Pos logit contributions
 fort+0.293
eh+0.291
 closet+0.286
 hospital+0.286
 forgot+0.278
Neg logit contributions
uts-0.316
unch-0.248
ges-0.245
anes-0.240
ous-0.236
Token and
Token ID290
Feature activation-6.66e+00

Pos logit contributions
uts+0.056
anes+0.055
all+0.054
 Isn+0.050
ous+0.050
Neg logit contributions
 fort-0.037
 way-0.036
 forgot-0.034
 scar-0.034
 trap-0.034
Token legs
Token ID7405
Feature activation+4.21e+00

Pos logit contributions
 fort+0.150
 hospital+0.148
eh+0.140
ches+0.139
iper+0.133
Neg logit contributions
uts-0.192
ges-0.172
anes-0.163
 Mike-0.163
that-0.163
Token working
Token ID1762
Feature activation+5.22e+00

Pos logit contributions
anes+0.116
uts+0.115
all+0.112
 Isn+0.103
ous+0.103
Neg logit contributions
 fort-0.107
 way-0.104
 forgot-0.101
 scar-0.099
 trap-0.099
Token.
Token ID13
Feature activation+5.02e+00

Pos logit contributions
uts+0.136
all+0.127
anes+0.127
ous+0.117
that+0.116
Neg logit contributions
 fort-0.145
 trap-0.136
 scar-0.135
 forgot-0.135
 way-0.134
Token you
Token ID345
Feature activation-4.11e+00

Pos logit contributions
 fort+0.047
 way+0.047
 forgot+0.047
 trap+0.045
 scar+0.043
Neg logit contributions
anes-0.103
uts-0.103
all-0.096
ges-0.093
ous-0.093
Token?"
Token ID1701
Feature activation+7.94e+00

Pos logit contributions
 fort+0.096
 forgot+0.096
 way+0.095
 trap+0.091
 scar+0.090
Neg logit contributions
uts-0.122
anes-0.121
all-0.114
that-0.109
ges-0.109
Token Max
Token ID5436
Feature activation-4.93e+00

Pos logit contributions
uts+0.239
anes+0.230
all+0.226
that+0.210
ous+0.210
Neg logit contributions
 fort-0.191
 trap-0.176
 way-0.176
 scar-0.175
 forgot-0.172
Token w
Token ID266
Feature activation-5.60e-02

Pos logit contributions
 fort+0.125
 trap+0.116
 traps+0.109
 hospital+0.107
 hole+0.106
Neg logit contributions
all-0.144
uts-0.142
 Mike-0.130
ous-0.129
that-0.126
Tokenagged
Token ID14655
Feature activation+2.16e-01

Pos logit contributions
 fort+0.002
 way+0.002
 scar+0.002
 closet+0.002
 forgot+0.002
Neg logit contributions
uts-0.001
anes-0.001
all-0.001
unch-0.001
ous-0.001
Token his
Token ID465
Feature activation-1.18e+01

Pos logit contributions
uts+0.005
all+0.005
anes+0.005
ous+0.005
 Mike+0.005
Neg logit contributions
 fort-0.006
 trap-0.006
 forgot-0.006
 scar-0.006
 hospital-0.006
Token tail
Token ID7894
Feature activation+1.69e+00

Pos logit contributions
 hospital+0.284
ches+0.281
 fort+0.274
 forgot+0.272
 needle+0.269
Neg logit contributions
uts-0.261
that-0.222
uzz-0.214
amb-0.209
 Share-0.209
Token and
Token ID290
Feature activation-6.30e+00

Pos logit contributions
uts+0.044
all+0.043
anes+0.040
ous+0.039
that+0.038
Neg logit contributions
 fort-0.046
 trap-0.043
 scar-0.042
 forgot-0.042
 closet-0.042
Token l
Token ID300
Feature activation+2.81e+00

Pos logit contributions
 fort+0.153
 trap+0.147
 closet+0.136
 hospital+0.131
 needle+0.131
Neg logit contributions
uts-0.176
 Mike-0.170
all-0.169
ous-0.158
that-0.155
Tokenicked
Token ID9484
Feature activation+5.54e+00

Pos logit contributions
uts+0.047
all+0.045
anes+0.045
ous+0.039
 Isn+0.039
Neg logit contributions
 fort-0.102
 trap-0.097
 forgot-0.096
 scar-0.096
 way-0.095
Token his
Token ID465
Feature activation-1.04e+01

Pos logit contributions
uts+0.148
anes+0.141
all+0.140
ges+0.135
ous+0.128
Neg logit contributions
 fort-0.149
 forgot-0.139
 way-0.139
 trap-0.138
 scar-0.137
Token!
Token ID0
Feature activation+4.40e+00

Pos logit contributions
anes+0.077
uts+0.077
all+0.070
unch+0.069
 Share+0.068
Neg logit contributions
 fort-0.051
 way-0.048
 scar-0.045
 trap-0.045
 hospital-0.044
Token
Token ID198
Feature activation+3.57e+00

Pos logit contributions
uts+0.115
anes+0.106
all+0.106
ous+0.102
ges+0.097
Neg logit contributions
 way-0.123
 fort-0.121
 trap-0.115
 forgot-0.113
 hole-0.113
Token
Token ID198
Feature activation+3.22e+00

Pos logit contributions
uts+0.087
ous+0.084
anes+0.081
ges+0.075
all+0.075
Neg logit contributions
 way-0.100
 trap-0.094
 fort-0.094
 hole-0.091
 forgot-0.090
TokenHarry
Token ID18308
Feature activation-6.29e+00

Pos logit contributions
uts+0.089
anes+0.085
all+0.082
ous+0.080
ges+0.077
Neg logit contributions
 fort-0.083
 trap-0.081
 way-0.080
 forgot-0.079
 scar-0.077
Token felt
Token ID2936
Feature activation-1.80e+00

Pos logit contributions
 fort+0.166
 trap+0.147
 closet+0.142
 hospital+0.137
 hole+0.134
Neg logit contributions
all-0.182
uts-0.173
ous-0.161
od-0.159
 Mike-0.158
Token his
Token ID465
Feature activation-1.17e+01

Pos logit contributions
 fort+0.041
 way+0.037
 scar+0.036
 trap+0.036
 forgot+0.036
Neg logit contributions
uts-0.057
anes-0.055
all-0.055
ous-0.051
that-0.051
Token worries
Token ID18572
Feature activation+1.39e+00

Pos logit contributions
 fort+0.229
 forgot+0.226
 scar+0.216
 trap+0.201
 slammed+0.195
Neg logit contributions
uts-0.372
all-0.356
ous-0.353
anes-0.339
 Mike-0.332
Token drift
Token ID24260
Feature activation+3.01e+00

Pos logit contributions
uts+0.042
anes+0.040
all+0.039
ous+0.038
that+0.037
Neg logit contributions
 fort-0.033
 trap-0.031
 scar-0.030
 way-0.030
 forgot-0.030
Token away
Token ID1497
Feature activation-3.58e-01

Pos logit contributions
uts+0.097
all+0.095
anes+0.094
ous+0.089
ges+0.087
Neg logit contributions
 fort-0.066
 trap-0.061
 forgot-0.060
 scar-0.059
 way-0.059
Token with
Token ID351
Feature activation+4.36e-01

Pos logit contributions
 fort+0.010
 way+0.009
 trap+0.009
 scar+0.009
 forgot+0.009
Neg logit contributions
uts-0.009
anes-0.009
all-0.009
ous-0.008
that-0.008
Token each
Token ID1123
Feature activation-6.43e+00

Pos logit contributions
uts+0.013
anes+0.012
all+0.012
ous+0.011
ges+0.011
Neg logit contributions
 fort-0.011
 way-0.010
 scar-0.010
 forgot-0.010
 trap-0.010
Tokened
Token ID276
Feature activation+1.58e+00

Pos logit contributions
uts+0.042
anes+0.040
all+0.037
ous+0.031
that+0.030
Neg logit contributions
 fort-0.140
 way-0.134
 scar-0.133
 trap-0.133
 forgot-0.133
Token happily
Token ID18177
Feature activation-4.45e-01

Pos logit contributions
all+0.038
uts+0.038
anes+0.034
ous+0.034
that+0.033
Neg logit contributions
 fort-0.044
 trap-0.042
 forgot-0.042
 closet-0.042
 way-0.041
Token and
Token ID290
Feature activation-6.32e+00

Pos logit contributions
 fort+0.010
 way+0.009
 scar+0.009
 forgot+0.009
 trap+0.009
Neg logit contributions
uts-0.014
anes-0.014
all-0.013
ous-0.013
ges-0.012
Token w
Token ID266
Feature activation+1.23e+00

Pos logit contributions
 fort+0.171
 trap+0.161
 closet+0.157
 hole+0.149
 hospital+0.148
Neg logit contributions
uts-0.160
all-0.152
 Mike-0.149
ous-0.142
ful-0.137
Tokenagged
Token ID14655
Feature activation+8.77e-01

Pos logit contributions
uts+0.028
anes+0.027
all+0.026
unch+0.024
ous+0.023
Neg logit contributions
 fort-0.041
 way-0.040
 closet-0.038
 trap-0.038
 scar-0.038
Token his
Token ID465
Feature activation-1.17e+01

Pos logit contributions
uts+0.022
all+0.021
anes+0.020
ous+0.020
ges+0.019
Neg logit contributions
 fort-0.025
 trap-0.023
 forgot-0.023
 scar-0.022
 closet-0.022
Token tail
Token ID7894
Feature activation+1.55e+00

Pos logit contributions
 hospital+0.282
 fort+0.258
 slammed+0.237
 closet+0.237
 scar+0.234
Neg logit contributions
uts-0.309
that-0.291
ges-0.278
ous-0.277
all-0.277
Token some
Token ID617
Feature activation-3.19e+00

Pos logit contributions
uts+0.040
anes+0.038
all+0.037
ous+0.035
that+0.034
Neg logit contributions
 fort-0.044
 scar-0.041
 way-0.041
 trap-0.041
 forgot-0.040
Token more
Token ID517
Feature activation-3.62e+00

Pos logit contributions
 fort+0.078
 closet+0.073
 forgot+0.069
 slammed+0.068
iper+0.065
Neg logit contributions
that-0.081
all-0.081
Welcome-0.079
good-0.077
ges-0.077
Token.
Token ID13
Feature activation+4.64e+00

Pos logit contributions
 fort+0.114
 closet+0.109
 scar+0.109
 trap+0.105
 forgot+0.103
Neg logit contributions
uts-0.085
all-0.079
anes-0.070
ous-0.069
that-0.067
Token "
Token ID366
Feature activation+2.28e+00

Pos logit contributions
uts+0.120
anes+0.114
all+0.113
ous+0.104
that+0.103
Neg logit contributions
 fort-0.130
 trap-0.124
 scar-0.122
 way-0.122
 forgot-0.120
Token Spot
Token ID15899
Feature activation-4.21e+00

Pos logit contributions
 fort+0.021
 way+0.019
 forgot+0.019
 trap+0.019
 closet+0.019
Neg logit contributions
uts-0.020
all-0.019
anes-0.018
that-0.017
unch-0.017
Token?"
Token ID1701
Feature activation+7.80e+00

Pos logit contributions
 fort+0.138
 scar+0.132
 way+0.131
 trap+0.129
 forgot+0.129
Neg logit contributions
uts-0.085
all-0.081
anes-0.080
ous-0.071
that-0.071
Token Spot
Token ID15899
Feature activation-4.67e+00

Pos logit contributions
uts+0.235
anes+0.231
ous+0.218
all+0.218
ges+0.205
Neg logit contributions
 way-0.181
 fort-0.180
 scar-0.171
 hospital-0.171
 forgot-0.169
Token w
Token ID266
Feature activation+3.64e-01

Pos logit contributions
 fort+0.110
 trap+0.099
 closet+0.098
 hospital+0.096
 way+0.094
Neg logit contributions
uts-0.143
all-0.141
anes-0.129
that-0.128
ous-0.128
Tokenagged
Token ID14655
Feature activation+7.01e-01

Pos logit contributions
uts+0.009
anes+0.009
all+0.009
unch+0.008
ous+0.008
Neg logit contributions
 fort-0.011
 way-0.010
 scar-0.010
 closet-0.010
 forgot-0.010
Token his
Token ID465
Feature activation-1.17e+01

Pos logit contributions
all+0.017
uts+0.017
anes+0.016
that+0.015
ges+0.015
Neg logit contributions
 fort-0.020
 trap-0.019
 forgot-0.019
 scar-0.019
 hospital-0.018
Token tail
Token ID7894
Feature activation+1.75e+00

Pos logit contributions
 hospital+0.296
 fort+0.268
 forgot+0.265
 needle+0.257
ience+0.255
Neg logit contributions
uts-0.296
that-0.262
uzz-0.256
ges-0.255
all-0.253
Token and
Token ID290
Feature activation-6.28e+00

Pos logit contributions
uts+0.048
all+0.046
anes+0.044
ous+0.042
that+0.042
Neg logit contributions
 fort-0.046
 scar-0.043
 forgot-0.042
 trap-0.042
 closet-0.041
Token bark
Token ID21405
Feature activation+3.78e+00

Pos logit contributions
 fort+0.145
 closet+0.136
 trap+0.133
 hospital+0.126
 hole+0.123
Neg logit contributions
uts-0.185
all-0.178
 Mike-0.173
that-0.168
ous-0.165
Tokened
Token ID276
Feature activation+1.50e+00

Pos logit contributions
anes+0.071
uts+0.066
all+0.062
unch+0.055
ous+0.055
Neg logit contributions
 fort-0.131
 forgot-0.129
 way-0.126
 bit-0.124
 trap-0.124
Token again
Token ID757
Feature activation+1.84e+00

Pos logit contributions
uts+0.041
anes+0.041
all+0.039
ous+0.037
that+0.036
Neg logit contributions
 fort-0.040
 way-0.038
 scar-0.037
 trap-0.037
 forgot-0.037
Token how
Token ID703
Feature activation+4.09e+00

Pos logit contributions
 fort+0.173
 way+0.166
 scar+0.161
 forgot+0.161
 trap+0.160
Neg logit contributions
uts-0.152
anes-0.147
all-0.137
ous-0.130
ges-0.128
Token Mike
Token ID4995
Feature activation-5.72e+00

Pos logit contributions
uts+0.088
anes+0.085
all+0.077
ges+0.073
that+0.072
Neg logit contributions
 fort-0.132
 way-0.128
 forgot-0.128
 scar-0.123
 trap-0.121
Token and
Token ID290
Feature activation-7.03e+00

Pos logit contributions
 fort+0.131
 trap+0.118
 scar+0.118
 closet+0.116
 way+0.116
Neg logit contributions
uts-0.181
all-0.178
anes-0.171
ous-0.163
that-0.162
Token Tony
Token ID8832
Feature activation-5.46e+00

Pos logit contributions
 way+0.249
 fort+0.242
 scar+0.233
 pile+0.230
 forgot+0.230
Neg logit contributions
uts-0.144
anes-0.132
all-0.117
unch-0.114
ous-0.110
Token spent
Token ID3377
Feature activation+3.00e-01

Pos logit contributions
 fort+0.126
 scar+0.111
 closet+0.111
 trap+0.111
 way+0.110
Neg logit contributions
uts-0.167
all-0.165
anes-0.159
that-0.152
ous-0.151
Token their
Token ID511
Feature activation-1.17e+01

Pos logit contributions
all+0.007
 Mike+0.006
uts+0.006
that+0.006
ges+0.006
Neg logit contributions
 fort-0.009
 closet-0.008
 needle-0.008
 forgot-0.008
 shield-0.008
Token day
Token ID1110
Feature activation+2.89e+00

Pos logit contributions
 fort+0.252
 closet+0.244
 scar+0.240
 shield+0.238
 hospital+0.237
Neg logit contributions
all-0.344
uts-0.335
that-0.321
anes-0.314
good-0.313
Token playing
Token ID2712
Feature activation+5.55e+00

Pos logit contributions
uts+0.067
all+0.063
anes+0.063
ges+0.056
unch+0.056
Neg logit contributions
 fort-0.088
 scar-0.083
 way-0.083
 trap-0.082
 closet-0.082
Token video
Token ID2008
Feature activation+6.80e+00

Pos logit contributions
uts+0.109
all+0.107
anes+0.101
that+0.097
ges+0.096
Neg logit contributions
 forgot-0.173
 scar-0.171
 fort-0.171
 way-0.165
 closet-0.165
Token and
Token ID290
Feature activation-5.72e+00

Pos logit contributions
uts+0.146
anes+0.136
all+0.135
ous+0.119
that+0.118
Neg logit contributions
 fort-0.217
 scar-0.210
 forgot-0.208
 trap-0.207
 way-0.207
Token eating
Token ID6600
Feature activation+5.85e+00

Pos logit contributions
 fort+0.181
 trap+0.170
 scar+0.170
 way+0.169
 forgot+0.166
Neg logit contributions
uts-0.126
anes-0.119
all-0.119
ous-0.108
that-0.106
Token
Token ID198
Feature activation+4.01e+00

Pos logit contributions
ous+0.184
all+0.183
anes+0.177
ges+0.175
 Isn+0.172
Neg logit contributions
 way-0.190
 fort-0.188
 hole-0.171
 forgot-0.167
 closet-0.166
Token
Token ID198
Feature activation+3.49e+00

Pos logit contributions
uts+0.088
anes+0.087
all+0.083
ous+0.082
ges+0.078
Neg logit contributions
 fort-0.119
 way-0.119
 forgot-0.111
 closet-0.110
 scar-0.109
TokenMax
Token ID11518
Feature activation-5.17e+00

Pos logit contributions
uts+0.102
all+0.096
anes+0.095
that+0.090
ous+0.087
Neg logit contributions
 fort-0.088
 way-0.084
 scar-0.081
 trap-0.081
 hospital-0.080
Token w
Token ID266
Feature activation-1.37e-01

Pos logit contributions
 fort+0.127
 hospital+0.116
 trap+0.115
 way+0.113
 proper+0.111
Neg logit contributions
all-0.149
uts-0.147
 Mike-0.130
ous-0.130
od-0.129
Tokenagged
Token ID14655
Feature activation+3.59e-01

Pos logit contributions
 fort+0.005
 way+0.005
 scar+0.004
 closet+0.004
 forgot+0.004
Neg logit contributions
uts-0.003
anes-0.003
all-0.003
unch-0.003
ous-0.002
Token his
Token ID465
Feature activation-1.17e+01

Pos logit contributions
all+0.008
uts+0.008
that+0.007
 Mike+0.007
ges+0.007
Neg logit contributions
 fort-0.011
 trap-0.010
 forgot-0.010
ns-0.010
 scar-0.010
Token tail
Token ID7894
Feature activation+1.65e+00

Pos logit contributions
 hospital+0.303
ience+0.286
 forgot+0.280
 needle+0.279
ches+0.276
Neg logit contributions
uts-0.254
that-0.219
amb-0.213
 That-0.212
ges-0.202
Token and
Token ID290
Feature activation-6.13e+00

Pos logit contributions
uts+0.042
all+0.040
anes+0.038
ous+0.037
that+0.036
Neg logit contributions
 fort-0.045
 forgot-0.043
 scar-0.043
 trap-0.042
 closet-0.041
Token ran
Token ID4966
Feature activation+3.30e+00

Pos logit contributions
 fort+0.152
 trap+0.148
ches+0.138
 hospital+0.137
ns+0.137
Neg logit contributions
 Mike-0.158
uts-0.153
all-0.152
that-0.140
ous-0.139
Token around
Token ID1088
Feature activation+2.80e+00

Pos logit contributions
uts+0.089
anes+0.086
all+0.083
ous+0.079
that+0.076
Neg logit contributions
 fort-0.090
 way-0.085
 trap-0.083
 scar-0.083
 forgot-0.083
Token happily
Token ID18177
Feature activation+2.44e-01

Pos logit contributions
uts+0.079
anes+0.076
all+0.075
ous+0.070
ges+0.069
Neg logit contributions
 fort-0.072
 way-0.067
 scar-0.067
 trap-0.066
 forgot-0.066
Token happy
Token ID3772
Feature activation+1.36e+00

Pos logit contributions
 fort+0.025
 way+0.023
 scar+0.023
 trap+0.023
 forgot+0.022
Neg logit contributions
uts-0.044
anes-0.042
all-0.042
ous-0.039
that-0.039
Token again
Token ID757
Feature activation+1.57e+00

Pos logit contributions
anes+0.040
uts+0.039
all+0.037
ous+0.035
ges+0.035
Neg logit contributions
 fort-0.034
 way-0.032
 forgot-0.031
 scar-0.031
 trap-0.031
Token and
Token ID290
Feature activation-6.40e+00

Pos logit contributions
anes+0.046
uts+0.045
all+0.041
ges+0.039
ous+0.039
Neg logit contributions
 fort-0.039
 way-0.038
 trap-0.037
 forgot-0.037
 scar-0.036
Token w
Token ID266
Feature activation+1.38e+00

Pos logit contributions
 fort+0.161
 trap+0.149
 scar+0.145
 closet+0.144
 needle+0.140
Neg logit contributions
all-0.180
uts-0.179
anes-0.166
that-0.165
ous-0.164
Tokenagged
Token ID14655
Feature activation+7.39e-01

Pos logit contributions
uts+0.034
anes+0.032
all+0.032
ous+0.029
unch+0.029
Neg logit contributions
 fort-0.042
 way-0.041
 scar-0.040
 trap-0.039
 closet-0.039
Token his
Token ID465
Feature activation-1.17e+01

Pos logit contributions
uts+0.019
all+0.018
anes+0.017
ous+0.017
that+0.016
Neg logit contributions
 fort-0.021
 trap-0.019
 forgot-0.019
 scar-0.019
 closet-0.019
Token tail
Token ID7894
Feature activation+1.61e+00

Pos logit contributions
 hospital+0.265
 scar+0.251
ches+0.246
 fort+0.241
 forgot+0.237
Neg logit contributions
uts-0.289
ous-0.264
that-0.261
uzz-0.256
ges-0.252
Token.
Token ID13
Feature activation+4.54e+00

Pos logit contributions
uts+0.037
anes+0.036
all+0.035
ous+0.032
ges+0.031
Neg logit contributions
 fort-0.050
 way-0.047
 trap-0.046
 scar-0.046
 forgot-0.046
Token The
Token ID383
Feature activation-7.78e+00

Pos logit contributions
uts+0.112
anes+0.107
all+0.104
ous+0.096
that+0.094
Neg logit contributions
 fort-0.134
 way-0.127
 trap-0.126
 scar-0.126
 forgot-0.125
Token little
Token ID1310
Feature activation-4.61e+00

Pos logit contributions
 forgot+0.152
 scar+0.150
 fort+0.150
igs+0.144
ches+0.143
Neg logit contributions
that-0.216
all-0.215
Welcome-0.209
anes-0.209
uts-0.207
Token girl
Token ID2576
Feature activation-4.34e+00

Pos logit contributions
ns+0.108
ches+0.106
 forgot+0.106
 fort+0.105
eh+0.104
Neg logit contributions
elt-0.105
anes-0.104
uts-0.103
 Mike-0.102
all-0.101
Token."
Token ID526
Feature activation+7.69e+00

Pos logit contributions
anes+0.110
uts+0.109
all+0.100
ous+0.095
unch+0.094
Neg logit contributions
 fort-0.147
 way-0.142
 forgot-0.140
 trap-0.137
 hospital-0.135
Token The
Token ID383
Feature activation-7.45e+00

Pos logit contributions
uts+0.166
anes+0.157
all+0.156
ous+0.144
ges+0.139
Neg logit contributions
 fort-0.249
 way-0.237
 scar-0.233
 hospital-0.229
 trap-0.228
Token dog
Token ID3290
Feature activation+1.65e-01

Pos logit contributions
 fort+0.144
 forgot+0.144
 way+0.130
 scar+0.127
 closet+0.127
Neg logit contributions
uts-0.235
anes-0.233
all-0.228
that-0.222
unch-0.221
Token w
Token ID266
Feature activation+1.04e+00

Pos logit contributions
uts+0.005
all+0.005
ous+0.004
that+0.004
 Mike+0.004
Neg logit contributions
 fort-0.004
 closet-0.004
 trap-0.004
 hole-0.004
 scar-0.004
Tokenagged
Token ID14655
Feature activation+9.37e-01

Pos logit contributions
uts+0.026
anes+0.025
all+0.024
ous+0.022
unch+0.022
Neg logit contributions
 fort-0.031
 way-0.030
 closet-0.029
 scar-0.029
 trap-0.029
Token his
Token ID465
Feature activation-1.16e+01

Pos logit contributions
uts+0.022
all+0.021
anes+0.020
ous+0.019
ges+0.019
Neg logit contributions
 fort-0.028
 trap-0.026
 forgot-0.026
 closet-0.026
 scar-0.026
Token tail
Token ID7894
Feature activation+1.89e+00

Pos logit contributions
 hospital+0.285
 fort+0.272
 forgot+0.262
 closet+0.259
 slammed+0.248
Neg logit contributions
uts-0.295
that-0.269
ges-0.265
all-0.261
unch-0.258
Token and
Token ID290
Feature activation-6.01e+00

Pos logit contributions
uts+0.053
all+0.051
anes+0.050
ous+0.048
that+0.046
Neg logit contributions
 fort-0.048
 scar-0.044
 trap-0.044
 forgot-0.044
 closet-0.044
Token said
Token ID531
Feature activation+3.66e+00

Pos logit contributions
 fort+0.142
 trap+0.130
 closet+0.127
 hospital+0.121
 needle+0.119
Neg logit contributions
uts-0.178
all-0.170
 Mike-0.166
anes-0.164
ous-0.163
Token,
Token ID11
Feature activation-2.90e-01

Pos logit contributions
anes+0.141
uts+0.138
all+0.134
unch+0.127
that+0.127
Neg logit contributions
 fort-0.053
 way-0.051
 shield-0.047
 forgot-0.046
 trap-0.044
Token "
Token ID366
Feature activation+3.74e+00

Pos logit contributions
 head+0.006
 forgot+0.006
 way+0.006
 fort+0.006
 scar+0.006
Neg logit contributions
anes-0.009
uts-0.009
ges-0.008
all-0.008
reating-0.008
Token ground
Token ID2323
Feature activation+2.26e+00

Pos logit contributions
 forgot+0.148
iper+0.139
 scar+0.123
ches+0.112
 slammed+0.112
Neg logit contributions
 Isn-0.172
anes-0.172
that-0.171
 Hello-0.170
Ice-0.169
Token!"
Token ID2474
Feature activation+7.67e+00

Pos logit contributions
uts+0.050
anes+0.048
all+0.046
ous+0.043
that+0.041
Neg logit contributions
 fort-0.073
 way-0.069
 trap-0.068
 forgot-0.067
 scar-0.067
Token Spot
Token ID15899
Feature activation-5.03e+00

Pos logit contributions
uts+0.211
anes+0.196
all+0.192
that+0.177
 Mike+0.176
Neg logit contributions
 fort-0.207
 forgot-0.191
 trap-0.191
 scar-0.190
 closet-0.190
Token w
Token ID266
Feature activation+1.01e-01

Pos logit contributions
 fort+0.121
 hospital+0.116
 closet+0.112
 trap+0.108
 way+0.107
Neg logit contributions
uts-0.149
all-0.140
 Mike-0.128
ous-0.128
that-0.127
Tokenagged
Token ID14655
Feature activation+5.62e-01

Pos logit contributions
uts+0.002
anes+0.002
all+0.002
unch+0.002
ous+0.002
Neg logit contributions
 fort-0.003
 way-0.003
 closet-0.003
 scar-0.003
 trap-0.003
Token his
Token ID465
Feature activation-1.16e+01

Pos logit contributions
uts+0.013
all+0.013
anes+0.011
that+0.011
ges+0.011
Neg logit contributions
 fort-0.017
 trap-0.016
 forgot-0.016
 hospital-0.016
 scar-0.016
Token tail
Token ID7894
Feature activation+1.61e+00

Pos logit contributions
 hospital+0.299
 forgot+0.279
 fort+0.271
 closet+0.270
 needle+0.269
Neg logit contributions
uts-0.294
amb-0.247
ges-0.228
unch-0.227
that-0.226
Token and
Token ID290
Feature activation-6.32e+00

Pos logit contributions
uts+0.044
all+0.043
anes+0.041
ous+0.039
ges+0.039
Neg logit contributions
 fort-0.041
 forgot-0.039
 scar-0.039
 trap-0.038
 way-0.037
Token bark
Token ID21405
Feature activation+3.59e+00

Pos logit contributions
 fort+0.149
 trap+0.139
 closet+0.138
 hospital+0.138
 needle+0.130
Neg logit contributions
uts-0.181
 Mike-0.167
all-0.167
that-0.156
ous-0.152
Tokened
Token ID276
Feature activation+1.42e+00

Pos logit contributions
anes+0.066
uts+0.062
all+0.058
ous+0.051
unch+0.051
Neg logit contributions
 fort-0.127
 forgot-0.123
 way-0.122
 scar-0.119
 trap-0.119
Token,
Token ID11
Feature activation-1.65e+00

Pos logit contributions
anes+0.038
uts+0.038
all+0.035
ous+0.034
unch+0.033
Neg logit contributions
 fort-0.040
 way-0.037
 scar-0.036
 trap-0.036
 forgot-0.036
Token Max
Token ID5436
Feature activation-5.69e+00

Pos logit contributions
 trap+0.202
 fort+0.201
eh+0.194
 forgot+0.191
 way+0.191
Neg logit contributions
uts-0.166
anes-0.159
all-0.158
ges-0.154
 Share-0.152
Token smiled
Token ID13541
Feature activation+2.66e+00

Pos logit contributions
 fort+0.135
 trap+0.124
 way+0.119
 closet+0.117
 hospital+0.115
Neg logit contributions
uts-0.173
all-0.164
anes-0.162
that-0.155
ous-0.154
Token and
Token ID290
Feature activation-6.67e+00

Pos logit contributions
anes+0.091
uts+0.090
all+0.087
ous+0.082
ges+0.082
Neg logit contributions
 fort-0.055
 scar-0.049
 way-0.048
 trap-0.047
 forgot-0.046
Token w
Token ID266
Feature activation+1.23e+00

Pos logit contributions
 fort+0.143
 trap+0.127
 way+0.126
 closet+0.123
 hospital+0.122
Neg logit contributions
uts-0.221
anes-0.206
all-0.205
ous-0.197
that-0.195
Tokenagged
Token ID14655
Feature activation+1.01e+00

Pos logit contributions
uts+0.030
anes+0.030
all+0.028
ous+0.026
ges+0.026
Neg logit contributions
 fort-0.037
 way-0.035
 trap-0.034
 scar-0.034
 closet-0.034
Token their
Token ID511
Feature activation-1.16e+01

Pos logit contributions
uts+0.024
all+0.021
ges+0.021
anes+0.021
ous+0.020
Neg logit contributions
 fort-0.029
 forgot-0.028
 trap-0.028
 needle-0.027
 hospital-0.027
Token tails
Token ID30514
Feature activation+7.89e-01

Pos logit contributions
 hospital+0.235
ches+0.234
 needle+0.230
 fort+0.225
 scar+0.220
Neg logit contributions
uts-0.321
ges-0.316
uzz-0.309
 Share-0.306
that-0.298
Token.
Token ID13
Feature activation+5.05e+00

Pos logit contributions
uts+0.019
anes+0.018
all+0.018
ous+0.016
 Isn+0.016
Neg logit contributions
 fort-0.024
 way-0.023
 trap-0.022
 scar-0.022
 forgot-0.022
Token They
Token ID1119
Feature activation-6.40e+00

Pos logit contributions
anes+0.133
all+0.132
uts+0.129
ous+0.121
unch+0.120
Neg logit contributions
 fort-0.143
 way-0.134
 forgot-0.131
 closet-0.129
 scar-0.129
Token were
Token ID547
Feature activation-2.47e+00

Pos logit contributions
 fort+0.191
 trap+0.174
 hole+0.164
 closet+0.163
 traps+0.163
Neg logit contributions
uts-0.155
all-0.147
anes-0.140
that-0.140
 Mike-0.138
Token happy
Token ID3772
Feature activation+1.54e+00

Pos logit contributions
 fort+0.051
 way+0.046
 trap+0.046
 scar+0.046
 forgot+0.045
Neg logit contributions
uts-0.083
anes-0.080
all-0.080
ous-0.075
that-0.074
Token his
Token ID465
Feature activation-9.64e+00

Pos logit contributions
uts+0.014
anes+0.014
all+0.014
ous+0.013
ges+0.013
Neg logit contributions
 fort-0.012
 forgot-0.011
 way-0.011
 scar-0.011
 trap-0.011
Token shoulder
Token ID8163
Feature activation+3.99e+00

Pos logit contributions
 forgot+0.203
 hospital+0.198
 fort+0.191
eh+0.185
 trap+0.182
Neg logit contributions
uts-0.296
ous-0.258
that-0.255
all-0.253
anes-0.247
Token.
Token ID13
Feature activation+5.23e+00

Pos logit contributions
uts+0.102
anes+0.099
all+0.096
ous+0.089
that+0.087
Neg logit contributions
 fort-0.113
 way-0.107
 scar-0.105
 trap-0.105
 forgot-0.104
Token He
Token ID679
Feature activation-8.45e+00

Pos logit contributions
uts+0.137
ous+0.131
all+0.127
anes+0.126
reating+0.120
Neg logit contributions
 way-0.131
 fort-0.127
 hospital-0.120
 hole-0.120
 pile-0.120
Token felt
Token ID2936
Feature activation-1.36e+00

Pos logit contributions
 fort+0.211
 hospital+0.196
 trap+0.192
 closet+0.187
 hole+0.181
Neg logit contributions
all-0.229
uts-0.228
 Mike-0.217
ous-0.202
od-0.200
Token their
Token ID511
Feature activation-1.16e+01

Pos logit contributions
 fort+0.035
 way+0.032
 scar+0.031
 trap+0.031
 forgot+0.031
Neg logit contributions
uts-0.039
anes-0.037
all-0.037
ous-0.034
that-0.034
Token little
Token ID1310
Feature activation-4.80e+00

Pos logit contributions
eh+0.254
 fort+0.244
 forgot+0.240
 hospital+0.233
 scar+0.232
Neg logit contributions
uts-0.315
all-0.308
ous-0.307
 Mike-0.305
that-0.284
Token hands
Token ID2832
Feature activation+2.94e+00

Pos logit contributions
 fort+0.105
eh+0.103
 forgot+0.100
ches+0.099
iper+0.095
Neg logit contributions
 Mike-0.126
uts-0.125
all-0.123
ous-0.119
ges-0.118
Token tick
Token ID4378
Feature activation+3.77e+00

Pos logit contributions
uts+0.086
anes+0.084
all+0.082
ous+0.076
that+0.075
Neg logit contributions
 fort-0.071
 way-0.067
 forgot-0.066
 scar-0.066
 trap-0.065
Tokenling
Token ID1359
Feature activation-8.48e-01

Pos logit contributions
uts+0.111
all+0.110
anes+0.106
ous+0.101
ges+0.101
Neg logit contributions
 fort-0.095
 way-0.088
 closet-0.087
 trap-0.086
 forgot-0.085
Token his
Token ID465
Feature activation-9.65e+00

Pos logit contributions
 fort+0.022
 forgot+0.021
 way+0.021
 scar+0.021
 trap+0.021
Neg logit contributions
uts-0.023
all-0.022
anes-0.022
ous-0.021
ges-0.020
Token owner
Token ID4870
Feature activation+2.16e+00

Pos logit contributions
 fort+0.009
 forgot+0.009
 scar+0.009
 trap+0.008
 closet+0.008
Neg logit contributions
uts-0.013
all-0.013
anes-0.013
ous-0.013
 Mike-0.012
Token!"
Token ID2474
Feature activation+7.67e+00

Pos logit contributions
uts+0.053
anes+0.052
all+0.046
unch+0.045
ous+0.044
Neg logit contributions
 fort-0.063
 way-0.062
 forgot-0.060
 trap-0.058
 pile-0.058
Token Spot
Token ID15899
Feature activation-4.62e+00

Pos logit contributions
anes+0.256
all+0.247
uts+0.246
ous+0.245
od+0.238
Neg logit contributions
 fort-0.141
 hospital-0.137
 way-0.137
 scar-0.129
 trap-0.127
Token w
Token ID266
Feature activation+4.23e-01

Pos logit contributions
 fort+0.112
 trap+0.104
 closet+0.098
 hospital+0.097
 hole+0.096
Neg logit contributions
uts-0.139
all-0.138
that-0.126
ous-0.125
 Mike-0.125
Tokenagged
Token ID14655
Feature activation+6.80e-01

Pos logit contributions
uts+0.010
anes+0.010
all+0.009
unch+0.009
ous+0.008
Neg logit contributions
 fort-0.014
 way-0.013
 closet-0.013
 scar-0.013
 trap-0.012
Token his
Token ID465
Feature activation-1.16e+01

Pos logit contributions
uts+0.016
all+0.016
anes+0.014
 Mike+0.014
ges+0.014
Neg logit contributions
 fort-0.020
 forgot-0.019
 trap-0.019
 scar-0.018
 closet-0.018
Token tail
Token ID7894
Feature activation+2.01e+00

Pos logit contributions
 hospital+0.288
 forgot+0.274
ches+0.273
 fort+0.269
 needle+0.268
Neg logit contributions
uts-0.255
uzz-0.224
amb-0.223
that-0.223
ges-0.215
Token happily
Token ID18177
Feature activation-7.99e-01

Pos logit contributions
uts+0.054
all+0.052
anes+0.049
ous+0.047
that+0.047
Neg logit contributions
 fort-0.053
 scar-0.050
 forgot-0.050
 trap-0.049
 closet-0.048
Token,
Token ID11
Feature activation-1.38e+00

Pos logit contributions
 fort+0.019
 trap+0.018
 way+0.018
 scar+0.018
 closet+0.017
Neg logit contributions
uts-0.024
all-0.023
anes-0.023
ous-0.021
that-0.021
Token feeling
Token ID4203
Feature activation+1.62e+00

Pos logit contributions
 fort+0.038
 trap+0.037
 knot+0.034
iper+0.033
 closet+0.033
Neg logit contributions
all-0.036
uts-0.034
that-0.033
 Mike-0.031
unch-0.031
Token proud
Token ID6613
Feature activation+4.24e+00

Pos logit contributions
uts+0.053
anes+0.052
all+0.051
ous+0.048
that+0.047
Neg logit contributions
 fort-0.035
 way-0.032
 scar-0.031
 trap-0.031
 forgot-0.031
Token
Token ID198
Feature activation+3.71e+00

Pos logit contributions
uts+0.132
anes+0.128
ous+0.128
all+0.119
ges+0.116
Neg logit contributions
 way-0.129
 fort-0.121
 closet-0.120
 hospital-0.120
 forgot-0.119
Token
Token ID198
Feature activation+3.34e+00

Pos logit contributions
uts+0.107
anes+0.103
ous+0.102
all+0.099
ges+0.096
Neg logit contributions
 fort-0.088
 hospital-0.082
 way-0.081
 trap-0.081
 scar-0.079
TokenThe
Token ID464
Feature activation-8.09e+00

Pos logit contributions
uts+0.099
anes+0.096
all+0.092
ous+0.092
ges+0.088
Neg logit contributions
 fort-0.080
 trap-0.074
 scar-0.074
 forgot-0.073
 way-0.073
Token children
Token ID1751
Feature activation+1.43e-01

Pos logit contributions
 fort+0.215
 trap+0.201
 forgot+0.198
 scar+0.192
 proper+0.192
Neg logit contributions
anes-0.206
all-0.205
uts-0.196
ous-0.183
that-0.182
Token nodded
Token ID14464
Feature activation+1.81e+00

Pos logit contributions
uts+0.004
all+0.004
anes+0.004
ous+0.004
that+0.004
Neg logit contributions
 fort-0.003
 trap-0.003
 scar-0.003
 way-0.003
 closet-0.003
Token their
Token ID511
Feature activation-1.16e+01

Pos logit contributions
uts+0.055
anes+0.053
all+0.053
ous+0.050
ges+0.049
Neg logit contributions
 fort-0.042
 way-0.039
 trap-0.039
 scar-0.038
 forgot-0.038
Token heads
Token ID6665
Feature activation+2.00e+00

Pos logit contributions
eh+0.220
 fort+0.219
 slammed+0.207
 forgot+0.204
 hospital+0.203
Neg logit contributions
all-0.322
 Mike-0.318
uts-0.301
ous-0.298
ach-0.292
Token in
Token ID287
Feature activation-1.80e+00

Pos logit contributions
uts+0.063
anes+0.062
all+0.060
ous+0.056
that+0.056
Neg logit contributions
 fort-0.044
 way-0.041
 scar-0.040
 forgot-0.040
 trap-0.040
Token understanding
Token ID4547
Feature activation+4.37e+00

Pos logit contributions
 fort+0.042
 way+0.040
 scar+0.039
 trap+0.039
 forgot+0.038
Neg logit contributions
uts-0.055
anes-0.054
all-0.052
ous-0.049
that-0.048
Token and
Token ID290
Feature activation-5.53e+00

Pos logit contributions
uts+0.118
all+0.114
anes+0.112
ous+0.106
ges+0.105
Neg logit contributions
 fort-0.118
 trap-0.107
 scar-0.106
 forgot-0.105
 closet-0.104
Token started
Token ID2067
Feature activation+4.03e+00

Pos logit contributions
 fort+0.130
 trap+0.116
 scar+0.115
 closet+0.113
 way+0.110
Neg logit contributions
uts-0.169
all-0.162
anes-0.159
ous-0.155
that-0.150
Token,
Token ID11
Feature activation-1.54e+00

Pos logit contributions
uts+0.048
all+0.047
anes+0.045
ous+0.043
unch+0.042
Neg logit contributions
 fort-0.046
 forgot-0.046
 scar-0.045
 trap-0.045
 needle-0.043
Token and
Token ID290
Feature activation-6.29e+00

Pos logit contributions
 fort+0.046
 trap+0.045
 knot+0.040
 needle+0.040
 hospital+0.039
Neg logit contributions
all-0.038
uts-0.036
that-0.035
 Mike-0.033
ges-0.033
Token Spot
Token ID15899
Feature activation-5.05e+00

Pos logit contributions
 fort+0.170
 trap+0.158
 closet+0.154
 hole+0.147
 web+0.147
Neg logit contributions
uts-0.165
all-0.161
 Mike-0.160
ous-0.145
that-0.145
Token w
Token ID266
Feature activation+3.24e-01

Pos logit contributions
 fort+0.107
 trap+0.100
 closet+0.093
 scar+0.090
 hole+0.090
Neg logit contributions
uts-0.169
all-0.159
anes-0.154
that-0.150
ous-0.149
Tokenagged
Token ID14655
Feature activation+5.35e-01

Pos logit contributions
uts+0.006
anes+0.006
unch+0.006
all+0.005
ucked+0.005
Neg logit contributions
 fort-0.012
 way-0.012
 closet-0.011
 shelf-0.011
 trap-0.011
Token his
Token ID465
Feature activation-1.16e+01

Pos logit contributions
uts+0.014
all+0.013
anes+0.013
ous+0.012
that+0.012
Neg logit contributions
 fort-0.015
 trap-0.014
 forgot-0.014
 closet-0.014
 hospital-0.013
Token tail
Token ID7894
Feature activation+1.89e+00

Pos logit contributions
 hospital+0.305
 fort+0.273
 closet+0.265
 needle+0.260
 traps+0.255
Neg logit contributions
uts-0.294
amb-0.255
that-0.244
cr-0.240
all-0.238
Token as
Token ID355
Feature activation+9.37e-01

Pos logit contributions
uts+0.047
anes+0.044
all+0.044
ous+0.041
that+0.040
Neg logit contributions
 fort-0.054
 forgot-0.051
 trap-0.051
 scar-0.051
 way-0.051
Token they
Token ID484
Feature activation-6.54e+00

Pos logit contributions
uts+0.026
anes+0.026
all+0.025
ous+0.023
that+0.023
Neg logit contributions
 fort-0.024
 way-0.023
 forgot-0.023
 scar-0.022
 trap-0.022
Token shared
Token ID4888
Feature activation+3.89e+00

Pos logit contributions
 fort+0.132
 trap+0.122
 way+0.121
 closet+0.116
 hospital+0.116
Neg logit contributions
uts-0.218
anes-0.214
all-0.213
ous-0.199
ges-0.198
Token a
Token ID257
Feature activation-5.73e+00

Pos logit contributions
anes+0.117
uts+0.116
all+0.112
ous+0.108
unch+0.102
Neg logit contributions
 fort-0.094
 way-0.090
 scar-0.088
 trap-0.086
 forgot-0.084
Token dreams
Token ID10625
Feature activation+5.15e+00

Pos logit contributions
 fort+0.191
 scar+0.175
 trap+0.171
 forgot+0.170
 way+0.163
Neg logit contributions
uts-0.282
anes-0.269
all-0.268
ous-0.255
that-0.249
Token can
Token ID460
Feature activation+4.77e+00

Pos logit contributions
uts+0.166
anes+0.161
all+0.158
ous+0.149
ges+0.147
Neg logit contributions
 fort-0.111
 way-0.103
 forgot-0.100
 trap-0.100
 scar-0.100
Token come
Token ID1282
Feature activation+4.74e+00

Pos logit contributions
uts+0.166
anes+0.161
all+0.155
ges+0.153
unch+0.151
Neg logit contributions
 way-0.087
 fort-0.087
 trap-0.081
 forgot-0.081
 pile-0.077
Token true
Token ID2081
Feature activation-5.83e-01

Pos logit contributions
uts+0.119
anes+0.114
all+0.108
unch+0.104
ges+0.102
Neg logit contributions
 fort-0.136
 way-0.131
 trap-0.127
 scar-0.127
 forgot-0.126
Token.
Token ID13
Feature activation+5.11e+00

Pos logit contributions
 fort+0.017
 way+0.017
 trap+0.016
 scar+0.016
 forgot+0.016
Neg logit contributions
uts-0.014
anes-0.014
all-0.013
ous-0.012
ges-0.012
Token<|endoftext|>
Token ID50256
Feature activation+1.25e+01

Pos logit contributions
anes+0.141
ous+0.140
uts+0.138
ges+0.128
piring+0.126
Neg logit contributions
 way-0.127
 hospital-0.112
 closet-0.110
 forgot-0.110
 pocket-0.110
TokenS
Token ID50
Feature activation+2.07e-01

Pos logit contributions
uts+0.392
ous+0.358
all+0.345
anes+0.335
ful+0.323
Neg logit contributions
 way-0.297
 fort-0.293
 hole-0.279
 closet-0.274
 room-0.267
Tokenara
Token ID3301
Feature activation-5.37e+00

Pos logit contributions
uts+0.006
anes+0.006
all+0.005
ous+0.005
ges+0.005
Neg logit contributions
 fort-0.005
 way-0.005
 scar-0.005
 forgot-0.005
 closet-0.005
Token was
Token ID373
Feature activation-2.74e+00

Pos logit contributions
 fort+0.117
 trap+0.108
 scar+0.108
 way+0.105
 forgot+0.102
Neg logit contributions
uts-0.170
anes-0.166
all-0.165
ges-0.154
ous-0.153
Token so
Token ID523
Feature activation-1.42e+00

Pos logit contributions
 fort+0.072
 trap+0.067
 forgot+0.066
 way+0.066
 scar+0.066
Neg logit contributions
uts-0.075
all-0.073
anes-0.072
that-0.066
ous-0.066
Token excited
Token ID6568
Feature activation+4.18e+00

Pos logit contributions
 forgot+0.030
 way+0.029
 fort+0.028
 dusty+0.027
 hospital+0.027
Neg logit contributions
anes-0.046
uts-0.045
ous-0.041
ges-0.040
all-0.040
Token her
Token ID607
Feature activation-7.06e+00

Pos logit contributions
 fort+0.038
 scar+0.035
 way+0.035
 trap+0.035
 forgot+0.034
Neg logit contributions
uts-0.040
all-0.039
anes-0.038
ous-0.036
ges-0.036
Token.
Token ID13
Feature activation+4.81e+00

Pos logit contributions
 fort+0.190
 trap+0.178
 forgot+0.178
 scar+0.177
 way+0.177
Neg logit contributions
uts-0.191
all-0.181
anes-0.181
ous-0.168
that-0.164
Token The
Token ID383
Feature activation-7.63e+00

Pos logit contributions
uts+0.105
all+0.099
anes+0.097
that+0.090
 Isn+0.088
Neg logit contributions
 fort-0.156
 trap-0.148
 scar-0.148
 way-0.144
 shield-0.143
Token end
Token ID886
Feature activation+2.88e+00

Pos logit contributions
 scar+0.147
 forgot+0.145
 fort+0.144
iper+0.140
 web+0.140
Neg logit contributions
anes-0.239
all-0.227
uts-0.222
that-0.221
 Share-0.209
Token.
Token ID13
Feature activation+4.70e+00

Pos logit contributions
anes+0.073
uts+0.072
all+0.068
unch+0.063
that+0.063
Neg logit contributions
 fort-0.086
 way-0.079
 trap-0.079
 scar-0.076
 hospital-0.076
Token<|endoftext|>
Token ID50256
Feature activation+1.21e+01

Pos logit contributions
uts+0.093
anes+0.092
all+0.084
ous+0.082
ges+0.074
Neg logit contributions
 fort-0.159
 way-0.154
 forgot-0.152
 hospital-0.151
 closet-0.151
TokenOnce
Token ID7454
Feature activation+5.31e+00

Pos logit contributions
ous+0.386
uts+0.373
all+0.345
anes+0.343
amb+0.342
Neg logit contributions
 way-0.285
 fort-0.270
 room-0.269
 hole-0.264
 closet-0.255
Token upon
Token ID2402
Feature activation-9.33e-01

Pos logit contributions
uts+0.165
anes+0.161
all+0.156
ous+0.147
that+0.145
Neg logit contributions
 fort-0.122
 way-0.115
 forgot-0.112
 scar-0.109
 trap-0.109
Token a
Token ID257
Feature activation-4.16e+00

Pos logit contributions
 fort+0.019
 trap+0.018
 way+0.018
 forgot+0.017
 hospital+0.017
Neg logit contributions
anes-0.031
uts-0.031
all-0.029
ous-0.029
ges-0.028
Token time
Token ID640
Feature activation+4.21e+00

Pos logit contributions
 fort+0.075
 scar+0.073
 hospital+0.070
 closet+0.069
 shield+0.069
Neg logit contributions
all-0.138
uts-0.135
unch-0.131
that-0.122
anes-0.120
Token,
Token ID11
Feature activation-4.05e-01

Pos logit contributions
anes+0.166
uts+0.162
all+0.153
that+0.150
ous+0.150
Neg logit contributions
 fort-0.060
 way-0.058
 forgot-0.050
 hole-0.048
 closet-0.048
Token\n
Token ID198
Feature activation+3.74e+00

Pos logit contributions
anes+0.133
uts+0.132
ous+0.127
all+0.124
piring+0.116
Neg logit contributions
 way-0.136
 fort-0.134
 closet-0.134
 forgot-0.132
 pile-0.129
Token\n
Token ID198
Feature activation+3.43e+00

Pos logit contributions
ous+0.094
uts+0.094
anes+0.092
all+0.082
piring+0.081
Neg logit contributions
 fort-0.097
 way-0.097
 trap-0.093
 hospital-0.092
 closet-0.091
TokenThey
Token ID2990
Feature activation-5.08e+00

Pos logit contributions
anes+0.103
uts+0.102
ous+0.101
all+0.094
ges+0.090
Neg logit contributions
 fort-0.075
 trap-0.071
 forgot-0.071
 scar-0.070
 way-0.069
Token started
Token ID2067
Feature activation+4.26e+00

Pos logit contributions
 fort+0.137
 trap+0.124
 way+0.120
 hole+0.118
 web+0.115
Neg logit contributions
all-0.131
 Mike-0.130
uts-0.129
anes-0.126
ful-0.121
Token to
Token ID284
Feature activation-2.40e+00

Pos logit contributions
uts+0.107
anes+0.105
all+0.099
ous+0.094
that+0.090
Neg logit contributions
 fort-0.123
 way-0.121
 scar-0.116
 trap-0.115
 forgot-0.114
Token argue
Token ID7267
Feature activation+1.11e+01

Pos logit contributions
 fort+0.053
 way+0.049
 trap+0.049
 scar+0.049
 forgot+0.048
Neg logit contributions
uts-0.076
anes-0.074
all-0.073
ous-0.068
that-0.067
Token,
Token ID11
Feature activation-1.75e-01

Pos logit contributions
anes+0.368
uts+0.366
all+0.349
unch+0.333
ous+0.332
Neg logit contributions
 fort-0.222
 way-0.213
 forgot-0.200
 hospital-0.192
 pile-0.192
Token but
Token ID475
Feature activation-1.93e+00

Pos logit contributions
 fort+0.004
 way+0.004
 trap+0.004
 scar+0.004
 forgot+0.003
Neg logit contributions
uts-0.006
anes-0.005
all-0.005
ous-0.005
that-0.005
Token then
Token ID788
Feature activation-1.11e+00

Pos logit contributions
 fort+0.061
 trap+0.056
 scar+0.054
 closet+0.053
 way+0.053
Neg logit contributions
uts-0.043
 Mike-0.042
all-0.042
ful-0.039
ges-0.038
Token their
Token ID511
Feature activation-1.05e+01

Pos logit contributions
 fort+0.033
 trap+0.030
 way+0.029
 scar+0.029
 closet+0.029
Neg logit contributions
uts-0.027
all-0.025
anes-0.025
 Mike-0.025
ous-0.024
Token mom
Token ID1995
Feature activation-4.09e-01

Pos logit contributions
 fort+0.225
 forgot+0.215
 scar+0.214
 trap+0.212
eh+0.201
Neg logit contributions
uts-0.309
all-0.291
anes-0.287
ous-0.275
 Mike-0.268
Token the
Token ID262
Feature activation-7.83e+00

Pos logit contributions
 fort+0.026
 scar+0.024
 forgot+0.023
 trap+0.023
 way+0.023
Neg logit contributions
uts-0.030
all-0.029
anes-0.028
that-0.027
 Isn-0.026
Token best
Token ID1266
Feature activation-3.85e+00

Pos logit contributions
 fort+0.149
 forgot+0.142
 trap+0.134
 scar+0.131
 way+0.126
Neg logit contributions
uts-0.253
all-0.252
anes-0.252
that-0.236
 Isn-0.230
Token day
Token ID1110
Feature activation+5.06e+00

Pos logit contributions
 fort+0.086
 way+0.081
 trap+0.077
 scar+0.077
 forgot+0.076
Neg logit contributions
uts-0.125
anes-0.122
all-0.118
ous-0.112
ges-0.110
Token ever
Token ID1683
Feature activation-5.28e-01

Pos logit contributions
uts+0.074
anes+0.071
all+0.067
ous+0.058
that+0.056
Neg logit contributions
 fort-0.197
 way-0.190
 forgot-0.188
 trap-0.187
 scar-0.186
Token.
Token ID13
Feature activation+4.83e+00

Pos logit contributions
 fort+0.017
 way+0.016
 forgot+0.016
 trap+0.016
 scar+0.015
Neg logit contributions
uts-0.012
anes-0.012
all-0.011
ous-0.010
ges-0.010
Token<|endoftext|>
Token ID50256
Feature activation+1.26e+01

Pos logit contributions
anes+0.126
all+0.120
uts+0.116
ous+0.116
unch+0.108
Neg logit contributions
 closet-0.134
 fort-0.129
 way-0.129
 forgot-0.127
 hospital-0.125
TokenTom
Token ID13787
Feature activation-2.44e+00

Pos logit contributions
uts+0.430
ous+0.430
all+0.390
anes+0.371
amb+0.366
Neg logit contributions
 fort-0.259
 hole-0.256
 way-0.254
 room-0.252
 closet-0.245
Token and
Token ID290
Feature activation-6.02e+00

Pos logit contributions
 fort+0.064
 trap+0.061
 scar+0.060
 knot+0.058
 web+0.057
Neg logit contributions
all-0.066
uts-0.062
anes-0.060
ges-0.060
that-0.058
Token Mia
Token ID32189
Feature activation-6.08e+00

Pos logit contributions
 fort+0.145
 way+0.139
 forgot+0.136
 trap+0.136
 scar+0.134
Neg logit contributions
uts-0.179
anes-0.174
all-0.170
ous-0.160
that-0.156
Token liked
Token ID8288
Feature activation+2.51e+00

Pos logit contributions
 fort+0.142
 scar+0.132
 trap+0.131
 way+0.126
 hospital+0.121
Neg logit contributions
uts-0.182
anes-0.177
all-0.177
ges-0.166
that-0.164
Token to
Token ID284
Feature activation-2.24e+00

Pos logit contributions
uts+0.074
anes+0.072
all+0.068
ous+0.066
unch+0.066
Neg logit contributions
 fort-0.065
 way-0.061
 scar-0.059
 trap-0.057
 hospital-0.057
Token They
Token ID1119
Feature activation-6.50e+00

Pos logit contributions
anes+0.142
uts+0.135
all+0.135
ous+0.135
unch+0.125
Neg logit contributions
 fort-0.145
 closet-0.143
 way-0.141
 forgot-0.140
 hole-0.138
Token wished
Token ID16555
Feature activation+8.04e+00

Pos logit contributions
 fort+0.167
eh+0.154
 trap+0.150
 way+0.141
igs+0.141
Neg logit contributions
uts-0.173
 Mike-0.159
all-0.158
anes-0.154
 Share-0.148
Token they
Token ID484
Feature activation-5.01e+00

Pos logit contributions
uts+0.201
anes+0.193
all+0.190
ous+0.177
ges+0.172
Neg logit contributions
 fort-0.233
 trap-0.220
 way-0.218
 scar-0.217
 forgot-0.215
Token had
Token ID550
Feature activation+6.75e-01

Pos logit contributions
 fort+0.101
 trap+0.090
 scar+0.090
 way+0.087
 hospital+0.083
Neg logit contributions
uts-0.175
all-0.164
anes-0.161
that-0.157
ous-0.154
Token not
Token ID407
Feature activation-2.52e+00

Pos logit contributions
uts+0.021
all+0.020
anes+0.020
that+0.019
ous+0.019
Neg logit contributions
 fort-0.015
 trap-0.013
 scar-0.013
 way-0.013
 closet-0.013
Token fought
Token ID8350
Feature activation+9.95e+00

Pos logit contributions
 fort+0.054
 trap+0.047
 scar+0.047
 way+0.046
 closet+0.045
Neg logit contributions
uts-0.084
all-0.080
anes-0.079
ous-0.075
ges-0.074
Token.
Token ID13
Feature activation+5.58e+00

Pos logit contributions
uts+0.252
anes+0.251
all+0.236
unch+0.231
ous+0.228
Neg logit contributions
 fort-0.273
 way-0.268
 forgot-0.258
 trap-0.255
 hospital-0.252
Token They
Token ID1119
Feature activation-6.39e+00

Pos logit contributions
anes+0.155
all+0.151
ous+0.151
uts+0.147
unch+0.139
Neg logit contributions
 fort-0.144
 closet-0.143
 forgot-0.138
 way-0.138
 hole-0.137
Token wished
Token ID16555
Feature activation+8.03e+00

Pos logit contributions
 fort+0.176
eh+0.160
 trap+0.154
igs+0.147
iper+0.146
Neg logit contributions
uts-0.164
 Mike-0.151
all-0.146
anes-0.144
 Isn-0.142
Token they
Token ID484
Feature activation-5.09e+00

Pos logit contributions
uts+0.203
anes+0.194
all+0.191
ous+0.178
ges+0.173
Neg logit contributions
 fort-0.231
 trap-0.217
 way-0.217
 scar-0.215
 forgot-0.214
Token could
Token ID714
Feature activation+3.79e+00

Pos logit contributions
 fort+0.111
 scar+0.099
 trap+0.098
 way+0.094
 closet+0.091
Neg logit contributions
uts-0.172
all-0.159
anes-0.155
that-0.153
ous-0.148
TokenWhen
Token ID2215
Feature activation+5.25e-01

Pos logit contributions
uts+0.092
anes+0.090
all+0.086
ous+0.084
ges+0.080
Neg logit contributions
 fort-0.078
 trap-0.074
 forgot-0.074
 way-0.073
 scar-0.072
Token Peter
Token ID5613
Feature activation-7.24e+00

Pos logit contributions
uts+0.012
anes+0.012
all+0.012
ous+0.011
 Mike+0.011
Neg logit contributions
 fort-0.016
 trap-0.015
 scar-0.015
 forgot-0.015
 way-0.014
Token came
Token ID1625
Feature activation+2.87e+00

Pos logit contributions
 fort+0.178
 scar+0.157
 trap+0.156
 web+0.151
 closet+0.150
Neg logit contributions
all-0.213
uts-0.205
 Mike-0.198
ges-0.196
anes-0.194
Token back
Token ID736
Feature activation-6.36e-01

Pos logit contributions
uts+0.125
anes+0.123
qu+0.114
ous+0.113
unch+0.113
Neg logit contributions
 way-0.032
 fort-0.030
 head-0.029
 closet-0.026
 hole-0.025
Token he
Token ID339
Feature activation-7.73e+00

Pos logit contributions
 fort+0.013
 way+0.013
 pile+0.012
 closet+0.012
 hole+0.012
Neg logit contributions
anes-0.022
uts-0.022
qu-0.019
ous-0.019
unch-0.019
Token saw
Token ID2497
Feature activation+2.62e+00

Pos logit contributions
 fort+0.156
 trap+0.135
 scar+0.132
 hospital+0.128
 knot+0.127
Neg logit contributions
uts-0.256
all-0.252
anes-0.243
 Mike-0.242
ous-0.236
Token the
Token ID262
Feature activation-7.47e+00

Pos logit contributions
uts+0.085
anes+0.081
all+0.078
ous+0.075
unch+0.073
Neg logit contributions
 fort-0.058
 way-0.057
 hospital-0.054
 trap-0.054
 scar-0.054
Token balloon
Token ID21190
Feature activation+2.33e+00

Pos logit contributions
 fort+0.182
 way+0.172
 trap+0.168
 scar+0.165
 forgot+0.164
Neg logit contributions
uts-0.225
anes-0.217
all-0.214
ous-0.202
ges-0.199
Token had
Token ID550
Feature activation+1.63e-01

Pos logit contributions
uts+0.064
anes+0.062
all+0.061
ous+0.057
that+0.056
Neg logit contributions
 fort-0.060
 scar-0.057
 forgot-0.056
 way-0.056
 trap-0.056
Token landed
Token ID11406
Feature activation+5.07e+00

Pos logit contributions
uts+0.005
anes+0.005
all+0.005
ous+0.005
that+0.005
Neg logit contributions
 fort-0.004
 scar-0.003
 trap-0.003
 way-0.003
 closet-0.003
Token in
Token ID287
Feature activation-6.63e-01

Pos logit contributions
uts+0.144
anes+0.138
all+0.133
ous+0.129
 Isn+0.126
Neg logit contributions
 fort-0.126
 way-0.121
 trap-0.119
 hospital-0.115
 hole-0.115
Token awesome
Token ID7427
Feature activation-3.96e-01

Pos logit contributions
 fort+0.051
 way+0.049
 trap+0.048
 forgot+0.048
 scar+0.048
Neg logit contributions
uts-0.052
anes-0.050
all-0.048
ous-0.046
ges-0.045
Token,
Token ID11
Feature activation-1.17e+00

Pos logit contributions
 fort+0.011
 way+0.010
 pile+0.010
 scar+0.010
 trap+0.010
Neg logit contributions
anes-0.011
uts-0.011
all-0.010
that-0.010
ges-0.010
Token L
Token ID406
Feature activation-1.06e-01

Pos logit contributions
 fort+0.033
 forgot+0.031
 trap+0.031
 way+0.031
 scar+0.030
Neg logit contributions
uts-0.029
all-0.028
anes-0.028
ous-0.026
that-0.025
Tokenila
Token ID10102
Feature activation-6.06e+00

Pos logit contributions
 fort+0.002
 forgot+0.002
ig+0.002
 way+0.002
eh+0.002
Neg logit contributions
uts-0.004
anes-0.004
ous-0.003
 Share-0.003
all-0.003
Token!"
Token ID2474
Feature activation+7.28e+00

Pos logit contributions
 fort+0.205
 scar+0.195
 trap+0.194
 way+0.194
 forgot+0.192
Neg logit contributions
uts-0.120
all-0.112
anes-0.112
ous-0.101
that-0.098
Token\n
Token ID198
Feature activation+3.55e+00

Pos logit contributions
anes+0.182
ous+0.179
uts+0.174
all+0.166
od+0.158
Neg logit contributions
 scar-0.200
 fort-0.198
 way-0.192
 closet-0.192
 hole-0.189
Token\n
Token ID198
Feature activation+3.14e+00

Pos logit contributions
anes+0.090
uts+0.090
ous+0.089
all+0.084
work+0.079
Neg logit contributions
 fort-0.098
 hole-0.095
 scar-0.091
 way-0.090
 hospital-0.089
TokenL
Token ID43
Feature activation-2.98e+00

Pos logit contributions
uts+0.113
anes+0.112
ous+0.108
all+0.107
ges+0.100
Neg logit contributions
 fort-0.056
 scar-0.052
 forgot-0.050
 pile-0.049
 hole-0.049
Tokenila
Token ID10102
Feature activation-7.48e+00

Pos logit contributions
 fort+0.075
 way+0.070
 forgot+0.069
 scar+0.069
 trap+0.069
Neg logit contributions
uts-0.086
anes-0.084
all-0.080
ous-0.077
that-0.074
Token hugged
Token ID41130
Feature activation+6.00e+00

Pos logit contributions
 fort+0.204
 trap+0.197
 proper+0.181
 traps+0.180
 shield+0.175
Neg logit contributions
all-0.196
uts-0.194
 Mike-0.184
anes-0.180
that-0.177
Token him
Token ID683
Feature activation-6.10e+00

Pos logit contributions
uts+0.170
anes+0.164
all+0.153
ous+0.152
unch+0.147
Neg logit contributions
 fort-0.156
 scar-0.154
 way-0.149
 closet-0.144
 hospital-0.142
Token It
Token ID632
Feature activation-4.08e+00

Pos logit contributions
anes+0.154
all+0.147
ous+0.144
uts+0.143
good+0.126
Neg logit contributions
 trap-0.162
 hole-0.156
 hospital-0.155
 fort-0.154
 trash-0.154
Token tastes
Token ID18221
Feature activation+4.55e+00

Pos logit contributions
 fort+0.091
 closet+0.085
 scar+0.083
 shield+0.081
 way+0.080
Neg logit contributions
uts-0.126
all-0.122
anes-0.119
that-0.115
ous-0.113
Token like
Token ID588
Feature activation+2.58e+00

Pos logit contributions
uts+0.178
anes+0.175
all+0.171
ous+0.163
ges+0.161
Neg logit contributions
 fort-0.067
 way-0.061
 trap-0.059
 scar-0.059
 forgot-0.059
Token dirt
Token ID13647
Feature activation+4.32e+00

Pos logit contributions
uts+0.078
anes+0.075
all+0.074
ous+0.069
that+0.068
Neg logit contributions
 fort-0.061
 way-0.056
 forgot-0.056
 scar-0.055
 trap-0.055
Token and
Token ID290
Feature activation-5.38e+00

Pos logit contributions
anes+0.107
uts+0.106
all+0.104
ous+0.096
ges+0.094
Neg logit contributions
 fort-0.125
 way-0.118
 trap-0.116
 scar-0.115
 pile-0.113
Token grass
Token ID8701
Feature activation+3.51e+00

Pos logit contributions
 fort+0.122
 closet+0.109
 way+0.109
 forgot+0.107
 trap+0.106
Neg logit contributions
uts-0.164
all-0.155
anes-0.153
ous-0.146
ges-0.145
Token,"
Token ID553
Feature activation+5.45e+00

Pos logit contributions
anes+0.090
all+0.086
uts+0.086
ous+0.080
unch+0.078
Neg logit contributions
 fort-0.100
 trap-0.095
 way-0.094
 scar-0.093
 pile-0.092
Token Anna
Token ID11735
Feature activation-6.27e+00

Pos logit contributions
uts+0.157
all+0.151
anes+0.150
 Mike+0.143
 Isn+0.141
Neg logit contributions
 fort-0.130
 trap-0.121
 way-0.119
 forgot-0.118
 scar-0.117
Token says
Token ID1139
Feature activation+3.48e+00

Pos logit contributions
 fort+0.126
 way+0.119
 scar+0.118
 forgot+0.117
 trap+0.114
Neg logit contributions
uts-0.211
anes-0.208
all-0.200
ous-0.193
ges-0.189
Token.
Token ID13
Feature activation+6.18e+00

Pos logit contributions
anes+0.109
all+0.101
uts+0.101
 Isn+0.098
that+0.095
Neg logit contributions
 fort-0.080
 head-0.078
 way-0.074
 forgot-0.070
 pile-0.070
Token\n
Token ID198
Feature activation+4.00e+00

Pos logit contributions
all+0.165
anes+0.165
ous+0.163
uts+0.154
od+0.151
Neg logit contributions
 fort-0.150
 closet-0.147
 hole-0.147
 trap-0.142
 hospital-0.142
Token complete
Token ID1844
Feature activation+4.89e+00

Pos logit contributions
 fort+0.081
 way+0.076
 trap+0.075
 forgot+0.074
 scar+0.074
Neg logit contributions
uts-0.091
anes-0.089
all-0.086
ous-0.081
that-0.079
Token the
Token ID262
Feature activation-7.01e+00

Pos logit contributions
uts+0.126
all+0.121
anes+0.119
ges+0.109
ous+0.109
Neg logit contributions
 fort-0.136
 forgot-0.127
 way-0.125
 trap-0.125
 scar-0.125
Token task
Token ID4876
Feature activation+5.74e+00

Pos logit contributions
 forgot+0.128
 fort+0.125
iper+0.122
ns+0.120
eh+0.118
Neg logit contributions
uts-0.219
all-0.217
that-0.212
anes-0.208
good-0.199
Token because
Token ID780
Feature activation-1.01e+00

Pos logit contributions
uts+0.129
anes+0.124
all+0.120
ous+0.110
that+0.107
Neg logit contributions
 fort-0.180
 way-0.171
 trap-0.169
 scar-0.168
 forgot-0.168
Token his
Token ID465
Feature activation-9.88e+00

Pos logit contributions
 fort+0.031
 scar+0.030
 forgot+0.029
 trap+0.029
 way+0.029
Neg logit contributions
uts-0.023
anes-0.022
all-0.022
ous-0.020
 Mike-0.019
Token paws
Token ID49193
Feature activation+3.18e+00

Pos logit contributions
 fort+0.178
ns+0.177
eh+0.176
 forgot+0.172
ches+0.172
Neg logit contributions
uts-0.321
all-0.281
anes-0.276
that-0.264
ous-0.263
Token were
Token ID547
Feature activation-1.15e+00

Pos logit contributions
uts+0.093
anes+0.087
all+0.086
ous+0.081
that+0.080
Neg logit contributions
 fort-0.081
 scar-0.075
 trap-0.074
 way-0.072
 closet-0.072
Token too
Token ID1165
Feature activation+1.10e+00

Pos logit contributions
 fort+0.026
 hospital+0.023
 trap+0.023
 scar+0.022
 closet+0.022
Neg logit contributions
uts-0.037
all-0.035
anes-0.033
ous-0.033
that-0.032
Token big
Token ID1263
Feature activation-6.02e-01

Pos logit contributions
uts+0.044
anes+0.043
all+0.043
ous+0.041
that+0.040
Neg logit contributions
 fort-0.015
 way-0.013
 scar-0.013
 trap-0.013
 forgot-0.013
Token.
Token ID13
Feature activation+5.02e+00

Pos logit contributions
 fort+0.016
 way+0.015
 scar+0.015
 trap+0.015
 forgot+0.015
Neg logit contributions
uts-0.016
anes-0.016
all-0.015
ous-0.014
that-0.014
Token Max
Token ID5436
Feature activation-5.38e+00

Pos logit contributions
uts+0.107
anes+0.098
all+0.095
that+0.085
ous+0.084
Neg logit contributions
 fort-0.164
 scar-0.160
 trap-0.156
 forgot-0.155
 way-0.153
Token you
Token ID345
Feature activation-4.32e+00

Pos logit contributions
uts+0.061
all+0.059
anes+0.055
ous+0.051
that+0.051
Neg logit contributions
 fort-0.090
 closet-0.086
 pile-0.083
 hole-0.082
 way-0.081
Token want
Token ID765
Feature activation+5.85e+00

Pos logit contributions
 fort+0.128
 way+0.119
 scar+0.118
 trap+0.117
 closet+0.116
Neg logit contributions
uts-0.105
anes-0.102
all-0.101
ous-0.092
ges-0.089
Token to
Token ID284
Feature activation-1.35e+00

Pos logit contributions
uts+0.146
anes+0.140
all+0.134
ges+0.127
ous+0.127
Neg logit contributions
 fort-0.169
 way-0.167
 trap-0.164
 scar-0.163
 hospital-0.159
Token be
Token ID307
Feature activation-2.54e+00

Pos logit contributions
 way+0.025
 fort+0.024
 trap+0.023
 hospital+0.023
 scar+0.023
Neg logit contributions
uts-0.049
anes-0.047
all-0.046
ges-0.045
that-0.045
Token my
Token ID616
Feature activation-8.24e+00

Pos logit contributions
 fort+0.056
 way+0.051
 forgot+0.050
 scar+0.050
 trap+0.050
Neg logit contributions
uts-0.081
anes-0.078
all-0.078
ous-0.072
ges-0.071
Token friend
Token ID1545
Feature activation+3.77e+00

Pos logit contributions
 fort+0.140
 way+0.126
 trap+0.124
 scar+0.124
 forgot+0.123
Neg logit contributions
uts-0.305
anes-0.297
all-0.293
ous-0.278
that-0.274
Token?"
Token ID1701
Feature activation+8.07e+00

Pos logit contributions
uts+0.096
anes+0.092
all+0.088
ous+0.083
ges+0.082
Neg logit contributions
 fort-0.108
 way-0.103
 trap-0.101
 scar-0.100
 forgot-0.100
Token Spot
Token ID15899
Feature activation-4.32e+00

Pos logit contributions
uts+0.226
anes+0.216
all+0.213
that+0.196
unch+0.194
Neg logit contributions
 fort-0.211
 forgot-0.194
 closet-0.193
 way-0.192
 trap-0.191
Token w
Token ID266
Feature activation+5.04e-01

Pos logit contributions
 fort+0.099
 trap+0.088
 closet+0.086
 hospital+0.084
 scar+0.083
Neg logit contributions
uts-0.136
all-0.133
anes-0.124
ous-0.123
that-0.122
Tokenagged
Token ID14655
Feature activation+7.10e-01

Pos logit contributions
uts+0.013
anes+0.013
all+0.012
ous+0.011
unch+0.011
Neg logit contributions
 fort-0.015
 way-0.014
 scar-0.013
 trap-0.013
 forgot-0.013
Token his
Token ID465
Feature activation-1.13e+01

Pos logit contributions
uts+0.018
all+0.017
anes+0.016
that+0.016
ous+0.015
Neg logit contributions
 fort-0.020
 trap-0.019
 forgot-0.019
 scar-0.019
 closet-0.018
Token driver
Token ID4639
Feature activation+4.04e-01

Pos logit contributions
 web+0.152
iper+0.150
 scar+0.149
 fort+0.146
 forgot+0.144
Neg logit contributions
anes-0.249
all-0.241
uts-0.230
that-0.229
 Share-0.221
Token called
Token ID1444
Feature activation+5.30e+00

Pos logit contributions
all+0.011
anes+0.011
ous+0.010
uts+0.010
 Mike+0.010
Neg logit contributions
 fort-0.010
 closet-0.010
 web-0.009
 scar-0.009
 trap-0.009
Token out
Token ID503
Feature activation-2.39e-01

Pos logit contributions
uts+0.162
anes+0.156
all+0.152
ous+0.143
that+0.141
Neg logit contributions
 fort-0.127
 way-0.119
 trap-0.116
 scar-0.115
 forgot-0.113
Token to
Token ID284
Feature activation-1.37e+00

Pos logit contributions
 fort+0.005
 scar+0.005
 trap+0.005
 shield+0.005
 way+0.004
Neg logit contributions
anes-0.008
uts-0.008
that-0.007
unch-0.007
good-0.007
Token the
Token ID262
Feature activation-6.85e+00

Pos logit contributions
 fort+0.032
 way+0.031
 trap+0.030
 hospital+0.030
 scar+0.029
Neg logit contributions
uts-0.044
anes-0.041
all-0.039
ous-0.038
unch-0.037
Token boy
Token ID2933
Feature activation-1.05e+00

Pos logit contributions
 scar+0.131
 fort+0.127
iper+0.126
 web+0.126
 forgot+0.125
Neg logit contributions
anes-0.217
all-0.202
uts-0.200
that-0.198
ous-0.194
Token and
Token ID290
Feature activation-5.85e+00

Pos logit contributions
 fort+0.021
 way+0.020
 closet+0.019
 hospital+0.019
 trap+0.019
Neg logit contributions
uts-0.036
anes-0.036
all-0.033
ous-0.032
that-0.032
Token asked
Token ID1965
Feature activation+5.14e+00

Pos logit contributions
 fort+0.166
 web+0.149
 trap+0.148
 closet+0.148
 hole+0.145
Neg logit contributions
all-0.144
 Mike-0.137
uts-0.137
anes-0.134
ous-0.132
Token him
Token ID683
Feature activation-4.86e+00

Pos logit contributions
uts+0.135
anes+0.129
all+0.119
ous+0.113
that+0.112
Neg logit contributions
 fort-0.150
 way-0.142
 scar-0.139
 trap-0.138
 hospital-0.137
Token what
Token ID644
Feature activation+1.88e+00

Pos logit contributions
 fort+0.130
 way+0.123
 scar+0.119
 forgot+0.118
 trap+0.118
Neg logit contributions
uts-0.136
anes-0.135
all-0.123
unch-0.118
ous-0.117
Token was
Token ID373
Feature activation-1.76e+00

Pos logit contributions
uts+0.054
anes+0.053
all+0.052
ous+0.048
ges+0.048
Neg logit contributions
 fort-0.047
 way-0.045
 forgot-0.044
 trap-0.044
 hospital-0.043
Token are
Token ID389
Feature activation-2.65e+00

Pos logit contributions
 fort+0.055
 way+0.050
 scar+0.050
 closet+0.049
 trap+0.048
Neg logit contributions
uts-0.050
anes-0.046
all-0.044
 Share-0.041
ges-0.041
Token mean
Token ID1612
Feature activation+2.12e+00

Pos logit contributions
 fort+0.066
 scar+0.056
 way+0.056
 closet+0.055
 pile+0.054
Neg logit contributions
uts-0.078
all-0.075
anes-0.071
ful-0.068
od-0.068
Token.
Token ID13
Feature activation+5.62e+00

Pos logit contributions
anes+0.065
all+0.062
uts+0.062
unch+0.061
that+0.058
Neg logit contributions
 fort-0.048
 way-0.045
 forgot-0.044
 closet-0.043
 scar-0.042
Token Give
Token ID13786
Feature activation+7.59e+00

Pos logit contributions
anes+0.117
all+0.117
uts+0.113
ous+0.109
unch+0.106
Neg logit contributions
 closet-0.165
 fort-0.165
 forgot-0.165
 hospital-0.164
 trap-0.163
Token it
Token ID340
Feature activation-2.37e+00

Pos logit contributions
uts+0.242
anes+0.236
all+0.227
ous+0.220
unch+0.213
Neg logit contributions
 fort-0.168
 way-0.160
 scar-0.156
 trap-0.156
 forgot-0.154
Token back
Token ID736
Feature activation+3.06e-01

Pos logit contributions
 fort+0.051
 way+0.047
 scar+0.047
 forgot+0.047
 trap+0.047
Neg logit contributions
uts-0.076
anes-0.074
all-0.073
ous-0.068
that-0.068
Token to
Token ID284
Feature activation-1.14e+00

Pos logit contributions
uts+0.006
anes+0.006
all+0.006
ous+0.005
ges+0.005
Neg logit contributions
 fort-0.010
 way-0.010
 scar-0.009
 trap-0.009
 forgot-0.009
Token me
Token ID502
Feature activation-2.18e+00

Pos logit contributions
 fort+0.015
 way+0.013
 scar+0.012
 trap+0.012
 forgot+0.012
Neg logit contributions
uts-0.047
anes-0.046
all-0.046
ous-0.044
that-0.043
Token."
Token ID526
Feature activation+7.60e+00

Pos logit contributions
 fort+0.076
 way+0.072
 trap+0.071
 forgot+0.071
 scar+0.071
Neg logit contributions
uts-0.042
anes-0.041
all-0.039
ous-0.035
that-0.034
Token She
Token ID1375
Feature activation-8.29e+00

Pos logit contributions
uts+0.176
anes+0.175
all+0.170
ous+0.162
that+0.152
Neg logit contributions
 fort-0.223
 way-0.218
 hospital-0.214
 closet-0.213
 forgot-0.212
Token ran
Token ID4966
Feature activation+3.30e+00

Pos logit contributions
 fort+0.237
 spine+0.207
 web+0.201
 trap+0.200
 den+0.197
Neg logit contributions
uts-0.192
 Mike-0.191
all-0.191
hey-0.183
ucked-0.176
Token grabbed
Token ID13176
Feature activation+3.64e+00

Pos logit contributions
 fort+0.242
 closet+0.205
 web+0.203
 spine+0.202
 trap+0.201
Neg logit contributions
 Mike-0.176
all-0.170
uts-0.154
Mike-0.147
amb-0.144
Token her
Token ID607
Feature activation-7.85e+00

Pos logit contributions
uts+0.087
all+0.085
anes+0.084
ges+0.083
 Mike+0.079
Neg logit contributions
 fort-0.105
 forgot-0.102
 way-0.097
 trap-0.095
 closet-0.093
Token doll
Token ID3654
Feature activation+2.92e-01

Pos logit contributions
 forgot+0.161
 fort+0.150
 trap+0.143
 hospital+0.141
 scar+0.131
Neg logit contributions
uts-0.247
anes-0.228
all-0.227
ges-0.224
that-0.221
Token and
Token ID290
Feature activation-5.49e+00

Pos logit contributions
uts+0.008
all+0.007
anes+0.006
ful+0.006
ges+0.006
Neg logit contributions
 forgot-0.008
 fort-0.008
 way-0.007
 trap-0.007
 scar-0.007
Token said
Token ID531
Feature activation+4.45e+00

Pos logit contributions
 fort+0.141
 lookout+0.127
 trap+0.127
 web+0.122
eh+0.121
Neg logit contributions
uts-0.144
 Mike-0.134
ges-0.124
ful-0.124
all-0.119
Token,
Token ID11
Feature activation-1.02e-01

Pos logit contributions
anes+0.144
uts+0.143
all+0.141
that+0.133
unch+0.129
Neg logit contributions
 fort-0.090
 forgot-0.083
 way-0.083
 scar-0.083
 trap-0.082
Token "
Token ID366
Feature activation+4.05e+00

Pos logit contributions
 fort+0.003
 way+0.003
 scar+0.003
 forgot+0.003
 trap+0.003
Neg logit contributions
anes-0.003
uts-0.003
all-0.002
ous-0.002
ges-0.002
TokenThis
Token ID1212
Feature activation-3.13e+00

Pos logit contributions
uts+0.092
anes+0.089
all+0.085
ous+0.081
ges+0.076
Neg logit contributions
 fort-0.126
 way-0.120
 trap-0.119
 scar-0.119
 forgot-0.119
Token is
Token ID318
Feature activation-1.59e+00

Pos logit contributions
 fort+0.058
 forgot+0.056
 scar+0.055
 trap+0.054
 way+0.050
Neg logit contributions
uts-0.107
anes-0.103
all-0.099
ous-0.095
that-0.095
Token mine
Token ID6164
Feature activation+2.29e+00

Pos logit contributions
 fort+0.044
 hole+0.038
 closet+0.038
 forgot+0.037
 hospital+0.037
Neg logit contributions
all-0.042
uts-0.039
ful-0.036
 Mike-0.036
ous-0.034
Token now
Token ID783
Feature activation-9.04e-01

Pos logit contributions
anes+0.059
uts+0.058
all+0.055
unch+0.052
ges+0.051
Neg logit contributions
 fort-0.066
 way-0.061
 forgot-0.059
 pile-0.059
 closet-0.059
Token tightly
Token ID17707
Feature activation-8.63e-01

Pos logit contributions
 fort+0.095
 hospital+0.093
 forgot+0.092
 trap+0.090
 scar+0.089
Neg logit contributions
all-0.098
uts-0.096
that-0.087
anes-0.084
ges-0.083
Token and
Token ID290
Feature activation-5.87e+00

Pos logit contributions
 fort+0.020
 way+0.019
 scar+0.019
 trap+0.018
 forgot+0.018
Neg logit contributions
uts-0.026
anes-0.026
all-0.025
ous-0.023
that-0.023
Token then
Token ID788
Feature activation-1.23e+00

Pos logit contributions
 fort+0.150
 closet+0.132
 hospital+0.131
 trap+0.130
 stable+0.125
Neg logit contributions
all-0.164
 Mike-0.163
uts-0.162
that-0.150
ges-0.148
Token slowly
Token ID6364
Feature activation-2.25e+00

Pos logit contributions
 fort+0.029
 trap+0.026
 closet+0.025
 hospital+0.025
 scar+0.024
Neg logit contributions
uts-0.036
all-0.035
 Mike-0.035
ges-0.033
that-0.032
Token lowered
Token ID17788
Feature activation+4.64e+00

Pos logit contributions
 fort+0.051
 trap+0.046
 way+0.045
 closet+0.044
 hospital+0.044
Neg logit contributions
uts-0.069
all-0.067
anes-0.066
ges-0.064
ous-0.062
Token it
Token ID340
Feature activation-3.12e+00

Pos logit contributions
uts+0.162
anes+0.159
ous+0.148
unch+0.142
all+0.141
Neg logit contributions
 way-0.094
 fort-0.091
 scar-0.088
 hole-0.087
 trap-0.087
Token into
Token ID656
Feature activation-1.63e+00

Pos logit contributions
 fort+0.087
 way+0.081
 scar+0.081
 trap+0.081
 forgot+0.080
Neg logit contributions
uts-0.082
anes-0.078
all-0.077
ous-0.071
that-0.070
Token the
Token ID262
Feature activation-7.69e+00

Pos logit contributions
 fort+0.035
 way+0.033
 trap+0.033
 scar+0.032
 pile+0.031
Neg logit contributions
uts-0.054
anes-0.053
all-0.051
ous-0.049
that-0.047
Token pit
Token ID6028
Feature activation+1.92e+00

Pos logit contributions
iper+0.157
 scar+0.156
 forgot+0.146
 fort+0.145
ches+0.142
Neg logit contributions
 What-0.219
that-0.215
 Isn-0.215
anes-0.215
 Mike-0.208
Token.
Token ID13
Feature activation+4.65e+00

Pos logit contributions
uts+0.042
anes+0.040
all+0.039
ous+0.036
that+0.035
Neg logit contributions
 fort-0.061
 way-0.058
 trap-0.058
 scar-0.057
 forgot-0.057
Token He
Token ID679
Feature activation-8.82e+00

Pos logit contributions
uts+0.131
anes+0.129
all+0.123
ous+0.121
od+0.119
Neg logit contributions
 fort-0.113
 way-0.113
 hole-0.105
 forgot-0.104
 closet-0.104
Token,"
Token ID553
Feature activation+5.42e+00

Pos logit contributions
 fort+0.091
 way+0.085
 scar+0.084
 trap+0.083
 forgot+0.083
Neg logit contributions
uts-0.088
anes-0.086
all-0.082
ous-0.077
that-0.076
Token Tom
Token ID4186
Feature activation-5.05e+00

Pos logit contributions
uts+0.140
anes+0.136
all+0.133
ous+0.122
that+0.120
Neg logit contributions
 fort-0.153
 way-0.144
 trap-0.142
 forgot-0.142
 scar-0.141
Token says
Token ID1139
Feature activation+3.08e+00

Pos logit contributions
 fort+0.104
 scar+0.097
 way+0.097
 trap+0.092
 forgot+0.092
Neg logit contributions
uts-0.170
anes-0.163
all-0.159
ous-0.154
ges-0.153
Token.
Token ID13
Feature activation+6.36e+00

Pos logit contributions
anes+0.083
uts+0.082
all+0.080
 Isn+0.075
that+0.075
Neg logit contributions
 fort-0.084
 way-0.078
 head-0.076
 scar-0.074
 pile-0.074
Token "
Token ID366
Feature activation+3.81e+00

Pos logit contributions
anes+0.180
all+0.178
uts+0.173
ous+0.171
od+0.167
Neg logit contributions
 fort-0.162
 pile-0.145
 way-0.143
 trap-0.142
 hole-0.142
TokenThat
Token ID2504
Feature activation-2.37e+00

Pos logit contributions
ous+0.110
 Mike+0.109
anes+0.106
uts+0.097
all+0.097
Neg logit contributions
 forgot-0.079
 fort-0.079
 trap-0.076
 scarf-0.076
 floor-0.076
Token is
Token ID318
Feature activation-1.97e+00

Pos logit contributions
 fort+0.051
 scar+0.046
 trap+0.045
 forgot+0.044
 closet+0.044
Neg logit contributions
uts-0.076
anes-0.074
all-0.073
ous-0.069
that-0.069
Token a
Token ID257
Feature activation-4.69e+00

Pos logit contributions
 fort+0.050
 forgot+0.045
 trap+0.044
 way+0.043
 hole+0.043
Neg logit contributions
uts-0.054
all-0.053
anes-0.049
 Mike-0.048
 Isn-0.048
Token word
Token ID1573
Feature activation+5.32e+00

Pos logit contributions
 fort+0.092
 forgot+0.086
 scar+0.083
 way+0.082
 trap+0.081
Neg logit contributions
uts-0.153
all-0.149
anes-0.148
ous-0.140
that-0.140
Token we
Token ID356
Feature activation-1.91e+00

Pos logit contributions
uts+0.140
anes+0.140
all+0.139
unch+0.133
ges+0.132
Neg logit contributions
 fort-0.136
 way-0.131
 hospital-0.127
 closet-0.126
 scar-0.125
Token know
Token ID760
Feature activation+4.40e+00

Pos logit contributions
 fort+0.039
 way+0.036
 trap+0.035
 scar+0.035
 forgot+0.034
Neg logit contributions
uts-0.064
anes-0.062
all-0.061
ous-0.057
ges-0.056
Token world
Token ID995
Feature activation+4.02e+00

Pos logit contributions
 slammed+0.202
 forgot+0.198
 sank+0.174
iper+0.173
 scar+0.169
Neg logit contributions
anes-0.167
all-0.163
Welcome-0.151
that-0.143
Happy-0.138
Token,
Token ID11
Feature activation-1.18e+00

Pos logit contributions
uts+0.107
anes+0.105
all+0.104
ous+0.097
that+0.093
Neg logit contributions
 fort-0.106
 scar-0.104
 forgot-0.103
 trap-0.101
 closet-0.100
Token learning
Token ID4673
Feature activation+4.74e+00

Pos logit contributions
 fort+0.035
 knot+0.034
 trap+0.033
 closet+0.033
 needle+0.033
Neg logit contributions
all-0.028
uts-0.025
anes-0.024
that-0.024
ges-0.023
Token new
Token ID649
Feature activation-1.53e+00

Pos logit contributions
uts+0.129
anes+0.124
all+0.121
ous+0.113
that+0.111
Neg logit contributions
 fort-0.127
 way-0.118
 trap-0.118
 forgot-0.117
 scar-0.117
Token things
Token ID1243
Feature activation+3.64e+00

Pos logit contributions
 fort+0.030
 forgot+0.028
 closet+0.028
 scar+0.027
 trap+0.027
Neg logit contributions
uts-0.051
anes-0.050
all-0.049
ous-0.046
that-0.046
Token each
Token ID1123
Feature activation-5.32e+00

Pos logit contributions
anes+0.097
uts+0.096
all+0.092
unch+0.089
ges+0.088
Neg logit contributions
 fort-0.095
 way-0.094
 trap-0.088
 scar-0.087
 forgot-0.087
Token day
Token ID1110
Feature activation+6.53e+00

Pos logit contributions
 fort+0.127
 scar+0.123
 forgot+0.120
 closet+0.116
 slammed+0.111
Neg logit contributions
all-0.152
uts-0.144
anes-0.144
ges-0.142
that-0.138
Token.
Token ID13
Feature activation+4.84e+00

Pos logit contributions
uts+0.158
anes+0.155
all+0.150
ous+0.139
that+0.136
Neg logit contributions
 fort-0.191
 way-0.183
 trap-0.180
 scar-0.178
 forgot-0.178
Token He
Token ID679
Feature activation-8.44e+00

Pos logit contributions
uts+0.117
anes+0.107
ous+0.103
all+0.102
ges+0.095
Neg logit contributions
 fort-0.145
 way-0.143
 hospital-0.135
 trap-0.133
 forgot-0.132
Token even
Token ID772
Feature activation-4.94e+00

Pos logit contributions
 fort+0.220
 trap+0.200
 closet+0.197
 trash+0.190
 scar+0.189
Neg logit contributions
all-0.233
uts-0.208
od-0.207
anes-0.205
 Mike-0.200
Token made
Token ID925
Feature activation+3.75e+00

Pos logit contributions
 fort+0.128
 closet+0.111
 knot+0.110
 hole+0.108
 shelf+0.106
Neg logit contributions
all-0.138
ful-0.123
ges-0.120
od-0.117
year-0.115
Token\n
Token ID198
Feature activation+3.63e+00

Pos logit contributions
uts+0.128
anes+0.123
all+0.123
piring+0.119
qu+0.119
Neg logit contributions
 closet-0.101
 way-0.098
 fort-0.096
 backyard-0.093
 room-0.090
Token\n
Token ID198
Feature activation+3.23e+00

Pos logit contributions
uts+0.092
anes+0.092
ous+0.091
work+0.087
all+0.084
Neg logit contributions
 fort-0.090
 way-0.088
 hospital-0.085
 trap-0.084
 closet-0.083
TokenThen
Token ID6423
Feature activation-5.83e-01

Pos logit contributions
uts+0.099
anes+0.097
ous+0.092
all+0.091
ges+0.089
Neg logit contributions
 fort-0.072
 trap-0.067
 forgot-0.067
 hospital-0.066
 way-0.066
Token she
Token ID673
Feature activation-8.31e+00

Pos logit contributions
 fort+0.018
 trap+0.017
eh+0.017
ches+0.016
ns+0.016
Neg logit contributions
all-0.012
 Mike-0.012
uts-0.011
anes-0.010
ous-0.010
Token heard
Token ID2982
Feature activation+4.38e+00

Pos logit contributions
 fort+0.175
 trap+0.154
 way+0.153
 hospital+0.150
 proper+0.147
Neg logit contributions
all-0.262
uts-0.261
anes-0.247
ous-0.237
 Isn-0.234
Token a
Token ID257
Feature activation-5.45e+00

Pos logit contributions
uts+0.136
anes+0.130
all+0.127
ges+0.120
that+0.119
Neg logit contributions
 fort-0.101
 way-0.096
 trap-0.094
 scar-0.094
 hospital-0.093
Token voice
Token ID3809
Feature activation+2.79e+00

Pos logit contributions
 fort+0.106
 way+0.097
 forgot+0.096
 trap+0.095
 scar+0.094
Neg logit contributions
uts-0.185
anes-0.181
all-0.178
ous-0.170
that-0.166
Token.
Token ID13
Feature activation+5.15e+00

Pos logit contributions
uts+0.089
anes+0.082
unch+0.077
all+0.077
that+0.074
Neg logit contributions
 fort-0.065
 hospital-0.063
 way-0.062
 scar-0.062
 trap-0.062
Token It
Token ID632
Feature activation-4.26e+00

Pos logit contributions
uts+0.135
all+0.131
anes+0.126
unch+0.123
that+0.115
Neg logit contributions
 fort-0.136
 closet-0.135
 way-0.130
 trap-0.130
 hole-0.129
Token was
Token ID373
Feature activation-2.44e+00

Pos logit contributions
 fort+0.086
 way+0.079
 trap+0.078
 scar+0.078
 forgot+0.077
Neg logit contributions
uts-0.143
anes-0.139
all-0.137
ous-0.129
that-0.127
Token coming
Token ID2406
Feature activation+5.71e+00

Pos logit contributions
 fort+0.051
 way+0.050
 trap+0.049
 closet+0.049
 scar+0.047
Neg logit contributions
uts-0.083
anes-0.079
all-0.074
ous-0.073
that-0.071
Token.
Token ID13
Feature activation+4.46e+00

Pos logit contributions
anes+0.065
uts+0.059
all+0.058
 Isn+0.057
ocolate+0.056
Neg logit contributions
 fort-0.047
 way-0.043
 forgot-0.042
 hole-0.040
 room-0.039
Token They
Token ID1119
Feature activation-7.05e+00

Pos logit contributions
uts+0.116
anes+0.113
all+0.111
ges+0.105
ous+0.105
Neg logit contributions
 fort-0.121
 way-0.116
 hospital-0.112
 forgot-0.111
 trap-0.110
Token continued
Token ID3767
Feature activation+3.66e+00

Pos logit contributions
 fort+0.190
 trap+0.172
 closet+0.167
 hospital+0.167
 way+0.166
Neg logit contributions
uts-0.185
all-0.184
anes-0.177
 Mike-0.168
od-0.166
Token to
Token ID284
Feature activation-2.77e+00

Pos logit contributions
uts+0.085
anes+0.082
all+0.079
ous+0.073
that+0.071
Neg logit contributions
 fort-0.113
 way-0.107
 trap-0.106
 scar-0.106
 forgot-0.105
Token race
Token ID3234
Feature activation+8.26e+00

Pos logit contributions
 fort+0.067
 trap+0.061
 scar+0.061
 way+0.060
 forgot+0.059
Neg logit contributions
uts-0.080
anes-0.078
all-0.078
ous-0.073
that-0.072
Token and
Token ID290
Feature activation-5.58e+00

Pos logit contributions
uts+0.212
anes+0.202
all+0.199
ges+0.183
that+0.182
Neg logit contributions
 fort-0.230
 trap-0.220
 scar-0.218
 forgot-0.215
 closet-0.214
Token play
Token ID711
Feature activation+6.29e+00

Pos logit contributions
 fort+0.146
 scar+0.135
 trap+0.135
 closet+0.132
 hospital+0.127
Neg logit contributions
uts-0.148
all-0.144
anes-0.140
ges-0.134
that-0.133
Token together
Token ID1978
Feature activation+3.07e-01

Pos logit contributions
uts+0.115
all+0.108
that+0.105
anes+0.102
ges+0.098
Neg logit contributions
 scar-0.201
 forgot-0.199
 closet-0.198
iper-0.197
 fort-0.194
Token,
Token ID11
Feature activation-1.47e+00

Pos logit contributions
anes+0.008
uts+0.008
all+0.008
ous+0.007
 Isn+0.007
Neg logit contributions
 fort-0.008
 way-0.008
 forgot-0.008
 trap-0.008
 pile-0.008
Token and
Token ID290
Feature activation-6.16e+00

Pos logit contributions
 fort+0.042
 trap+0.041
 scar+0.041
 closet+0.040
iper+0.040
Neg logit contributions
uts-0.035
all-0.034
that-0.032
anes-0.031
 Mike-0.030
Token they
Token ID484
Feature activation-6.42e+00

Pos logit contributions
 fort+0.176
 closet+0.165
 trap+0.163
 scar+0.163
 needle+0.159
Neg logit contributions
uts-0.152
all-0.147
 Mike-0.138
anes-0.137
that-0.134
Token and
Token ID290
Feature activation-5.82e+00

Pos logit contributions
anes+0.017
uts+0.016
that+0.016
all+0.015
unch+0.015
Neg logit contributions
 fort-0.011
 scar-0.010
 hospital-0.010
 way-0.010
 forgot-0.010
Token kind
Token ID1611
Feature activation+4.33e+00

Pos logit contributions
 fort+0.125
 way+0.112
 scar+0.112
 trap+0.111
 closet+0.109
Neg logit contributions
uts-0.190
all-0.182
anes-0.182
ous-0.172
that-0.169
Token.
Token ID13
Feature activation+5.39e+00

Pos logit contributions
anes+0.116
uts+0.114
unch+0.110
all+0.109
that+0.104
Neg logit contributions
 fort-0.114
 way-0.113
 scar-0.106
 trap-0.104
 hospital-0.104
Token You
Token ID921
Feature activation-1.90e+00

Pos logit contributions
uts+0.137
anes+0.130
all+0.122
ous+0.120
unch+0.119
Neg logit contributions
 closet-0.142
 fort-0.141
 way-0.139
 hospital-0.138
 trap-0.135
Token're
Token ID821
Feature activation-2.19e+00

Pos logit contributions
 fort+0.052
 closet+0.045
 scar+0.045
 way+0.045
 trap+0.044
Neg logit contributions
uts-0.053
anes-0.051
all-0.050
ous-0.046
that-0.045
Token the
Token ID262
Feature activation-7.58e+00

Pos logit contributions
 fort+0.055
 scar+0.049
 way+0.048
 trap+0.048
 forgot+0.048
Neg logit contributions
uts-0.064
all-0.064
anes-0.062
that-0.058
ous-0.057
Token best
Token ID1266
Feature activation-3.89e+00

Pos logit contributions
 fort+0.144
 forgot+0.137
 trap+0.126
 scar+0.123
 way+0.121
Neg logit contributions
anes-0.249
uts-0.248
all-0.246
that-0.231
ous-0.224
Token visitors
Token ID9692
Feature activation+3.42e+00

Pos logit contributions
 fort+0.088
 way+0.085
 hospital+0.081
 scar+0.079
 scarf+0.077
Neg logit contributions
uts-0.126
anes-0.126
all-0.118
that-0.117
ges-0.115
Token ever
Token ID1683
Feature activation-1.68e+00

Pos logit contributions
uts+0.074
anes+0.072
all+0.069
unch+0.063
that+0.062
Neg logit contributions
 fort-0.108
 way-0.104
 forgot-0.103
 trap-0.101
 closet-0.099
Token."
Token ID526
Feature activation+7.52e+00

Pos logit contributions
 forgot+0.049
 way+0.048
 fort+0.046
 trap+0.045
 scar+0.045
Neg logit contributions
anes-0.044
uts-0.043
unch-0.038
all-0.038
ous-0.036
Token\n
Token ID198
Feature activation+3.82e+00

Pos logit contributions
anes+0.176
ous+0.174
all+0.174
uts+0.172
that+0.156
Neg logit contributions
 closet-0.216
 fort-0.211
 way-0.202
 hospital-0.201
 scar-0.199
Token walking
Token ID6155
Feature activation+4.33e+00

Pos logit contributions
 fort+0.087
 scar+0.078
 way+0.076
 trap+0.075
 hospital+0.073
Neg logit contributions
all-0.122
uts-0.122
ges-0.120
anes-0.120
ous-0.114
Token on
Token ID319
Feature activation+2.28e-01

Pos logit contributions
all+0.094
uts+0.092
anes+0.091
ges+0.087
that+0.084
Neg logit contributions
 fort-0.132
 forgot-0.129
 scar-0.126
 trap-0.125
 way-0.125
Token the
Token ID262
Feature activation-7.55e+00

Pos logit contributions
all+0.005
uts+0.005
anes+0.005
ges+0.004
 Mike+0.004
Neg logit contributions
 forgot-0.007
 fort-0.007
 scar-0.006
 proper-0.006
 trap-0.006
Token bridge
Token ID7696
Feature activation+3.13e+00

Pos logit contributions
iper+0.148
 forgot+0.145
 scar+0.134
 proper+0.134
 closet+0.132
Neg logit contributions
that-0.212
anes-0.206
ous-0.203
all-0.202
od-0.191
Token noticed
Token ID6810
Feature activation+3.10e+00

Pos logit contributions
all+0.082
anes+0.082
uts+0.081
ous+0.076
that+0.074
Neg logit contributions
 scar-0.082
 fort-0.082
 trap-0.076
 needle-0.076
 knot-0.075
Token the
Token ID262
Feature activation-7.44e+00

Pos logit contributions
uts+0.094
anes+0.090
all+0.087
ous+0.082
that+0.081
Neg logit contributions
 fort-0.074
 way-0.070
 trap-0.069
 scar-0.069
 forgot-0.067
Token apple
Token ID17180
Feature activation+1.47e+00

Pos logit contributions
iper+0.202
ches+0.190
 forgot+0.183
 proper+0.179
 slammed+0.168
Neg logit contributions
anes-0.160
all-0.154
that-0.153
ous-0.141
 Whale-0.140
Token and
Token ID290
Feature activation-5.43e+00

Pos logit contributions
uts+0.042
all+0.039
anes+0.039
that+0.038
ous+0.037
Neg logit contributions
 scar-0.035
 fort-0.035
 forgot-0.033
 proper-0.033
iper-0.032
Token stopped
Token ID5025
Feature activation+6.29e+00

Pos logit contributions
 fort+0.137
 trap+0.121
 way+0.117
 closet+0.116
 scar+0.115
Neg logit contributions
uts-0.157
all-0.156
anes-0.148
ous-0.145
that-0.144
Token to
Token ID284
Feature activation-1.90e+00

Pos logit contributions
uts+0.197
anes+0.193
ous+0.171
 Isn+0.169
all+0.169
Neg logit contributions
 way-0.142
 fort-0.141
 trap-0.138
 scar-0.138
 hole-0.137
Token pick
Token ID2298
Feature activation+6.00e+00

Pos logit contributions
 fort+0.046
 way+0.043
 trap+0.042
 scar+0.042
 forgot+0.042
Neg logit contributions
uts-0.058
anes-0.056
all-0.054
ous-0.051
ges-0.050