TOP ACTIVATIONS
MAX = 0.141

Tim grabbed the hook and anx iously put it in the
the day , Bella kissed Aless io on the cheek and
smile . Mary anx iously opened up the pass
little bird nodded and listened int ently as the owl whispered
ber b ere and sum ac and tam ar ind and
. The gir affe listened int ently . When
voice . The birds listened int ently , then nodded and
temper and often threw temper tant rums . One day ,
home . The puppy listened int ently and w agged his
on his paws and listen int ently to her until she
Tom my listened int ently to his mom 's
! â Jim said anx iously . His
. The little girl listened int ently as they introduced her
Both of them put their anx ieties aside . From then
the answer . She asked anx iously , " What is
Sammy would often share av oc ados and other y ummy
see . The owl listened int ently and then it said
. Kate listened int ently and proudly .
to break ." Molly nodded solemn ly , understanding the lesson
from the trap . B amb i was very happy and

MIN ACTIVATIONS
MIN = -0.134

play . <|endoftext|> Once upon a time , there was a
down . <|endoftext|> Once upon a time , there was a
day . <|endoftext|> Once upon a time , in a thick
ball . <|endoftext|> Once upon a time , there was a
face . <|endoftext|> Once upon a time , there was a
tomorrow . <|endoftext|> Once upon a time , there was a
long . <|endoftext|> Once upon a time , there was a
song . <|endoftext|> Once upon a time , there was a
faces . <|endoftext|> Once upon a time , there was a
yard . <|endoftext|> Once upon a time , there was a
place ! <|endoftext|> Once upon a time there was a small
come . <|endoftext|> Once upon a time , there was a
tree . <|endoftext|> Once upon a time , there was a
together . <|endoftext|> Once upon a time , there was an
new . <|endoftext|> Once upon a time , there were two
her . <|endoftext|> Once upon a time , there was a
end . <|endoftext|> Once upon a time , there was a
long . <|endoftext|> Once upon a time , there was a
together . <|endoftext|> Once upon a time , there was a
park . <|endoftext|> Once upon a time , there was a

INTERVAL 0.072 - 0.141
CONTAINS 0.688%

better and smiled . She hugged her mum back .
Sally . She smiled and hugged Jake . " This old
look at their picture and smile . They learned
men and said , " Thank you for saving the old
. The gard ener was happy to hear

INTERVAL 0.004 - 0.072
CONTAINS 45.993%

an orig ami figure . But no matter how hard she
too comfortable here ." The big bear was very
feeling grateful for his friends . From that day on ,
She said , " No , you are silly . And
fun and cool . One day , Lily went

INTERVAL -0.065 - 0.004
CONTAINS 50.449%

there were two friends named Jack and Jo jo . Every
be careful . And we have to take Rex with us
tired from running around , she decided to face the tornado
and angry . She did not want to leave her home
is too small and boring . We need more cars and

INTERVAL -0.134 - -0.065
CONTAINS 2.870%

Once upon a time there was a little girl . Her
Daisy and her mum took a drive . Daisy was excited
a time , there lived a little girl . She loved
my went to a big building with an elevator . Lily
upon a time , there was a little girl named Sue
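
The four INTERVAL headings above appear to split the range between MIN and MAX into equal-width bins. The source does not state this; it is an inference from the listed edges. A minimal sketch that checks the inference, using only the numbers reported above (the function name `interval_edges` is hypothetical):

```python
# Sketch: test whether the reported interval edges match equal-width bins
# between the reported MIN and MAX. Equal-width binning is an assumption.

def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# Values copied from the MIN / MAX lines above.
reported_min, reported_max = -0.134, 0.141

# Edges copied from the INTERVAL headings, lowest to highest.
reported_edges = [-0.134, -0.065, 0.004, 0.072, 0.141]

# Shares copied from the CONTAINS lines, highest interval first.
reported_shares = [0.688, 45.993, 50.449, 2.870]

computed_edges = interval_edges(reported_min, reported_max)

# MIN and MAX are themselves rounded to three decimals, so allow a
# tolerance of one unit in the third decimal place.
edges_match = all(
    abs(c - r) <= 0.001 for c, r in zip(computed_edges, reported_edges)
)
total_share = sum(reported_shares)

print("computed edges:", computed_edges)
print("edges match reported headings:", edges_match)
print("total share (%):", total_share)
```

Under that assumption each bin is about 0.069 wide, and the four CONTAINS percentages add up to 100% within floating-point error, so the intervals cover every sampled token exactly once.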
TokenTim
Token ID14967
Feature activation+3.38e-02

Pos logit contributions
lee+0.035
bled+0.031
alks+0.031
atted+0.031
urg+0.030
Neg logit contributions
 hide-0.025
 tag-0.024
 hiding-0.023
 seeker-0.022
 hidden-0.022
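
Each per-token record in this listing follows the same layout: a `Token` line, a `Token ID` line, a `Feature activation` line, then five positive and five negative logit contributions. The labels are fused directly onto their values, and a leading space after `Token` or before a contribution entry seems to belong to the token text itself. A minimal sketch of a parser for one such record, assuming that layout holds throughout (the names `parse_record` and `ENTRY` are hypothetical and not part of any dashboard tool):

```python
import re

# A contribution line is token text followed by a signed decimal,
# e.g. "lee+0.035" or " hide-0.025". Greedy matching keeps any "+" or "-"
# that is part of the token text itself.
ENTRY = re.compile(r"^(.*)([+-]\d+\.\d+)$")

def parse_record(lines):
    """Parse one fused token record into a dict (layout is an assumption)."""
    token = lines[0][len("Token"):]            # keeps a leading space if present
    token_id = int(lines[1][len("Token ID"):])
    activation = float(lines[2][len("Feature activation"):])
    pos, neg, bucket = [], [], None
    for line in lines[3:]:
        if line == "Pos logit contributions":
            bucket = pos
        elif line == "Neg logit contributions":
            bucket = neg
        elif line.strip() and bucket is not None:
            match = ENTRY.match(line)
            if match:
                bucket.append((match.group(1), float(match.group(2))))
    return {"token": token, "id": token_id, "activation": activation,
            "pos": pos, "neg": neg}

# The record directly above, copied line for line.
record_lines = [
    "TokenTim",
    "Token ID14967",
    "Feature activation+3.38e-02",
    "",
    "Pos logit contributions",
    "lee+0.035", "bled+0.031", "alks+0.031", "atted+0.031", "urg+0.030",
    "Neg logit contributions",
    " hide-0.025", " tag-0.024", " hiding-0.023", " seeker-0.022",
    " hidden-0.022",
]

record = parse_record(record_lines)
print(record["token"], record["id"], record["activation"])
print(record["pos"][0], record["neg"][0])
```

Keeping the leading space in the parsed token text preserves the distinction between word-initial tokens such as ` hide` and word-internal pieces such as `lee`, which is what separates the positive and negative lists in most records here.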
Token grabbed
Token ID13176
Feature activation+5.31e-02

Pos logit contributions
lee+0.081
bled+0.074
alks+0.073
atted+0.071
urg+0.069
Neg logit contributions
 hide-0.050
 tag-0.049
 hiding-0.047
 seeker-0.046
 hidden-0.044
Token the
Token ID262
Feature activation-8.49e-03

Pos logit contributions
lee+0.133
bled+0.119
alks+0.118
atted+0.117
urg+0.115
Neg logit contributions
 hide-0.077
 tag-0.073
 hiding-0.071
 seeker-0.067
 hidden-0.067
Token hook
Token ID8011
Feature activation-3.48e-02

Pos logit contributions
 hide+0.014
 seeker+0.013
 tag+0.013
 hiding+0.013
 chamber+0.012
Neg logit contributions
lee-0.017
bled-0.017
alks-0.017
rac-0.016
atted-0.015
Token and
Token ID290
Feature activation+2.01e-02

Pos logit contributions
 seeker+0.047
 hide+0.046
 tag+0.045
 hiding+0.042
 exit+0.040
Neg logit contributions
lee-0.086
bled-0.080
atted-0.078
alks-0.077
urg-0.073
Token anx
Token ID7296
Feature activation+1.41e-01

Pos logit contributions
lee+0.043
bled+0.038
atted+0.037
alks+0.036
urg+0.035
Neg logit contributions
 hide-0.035
 seeker-0.035
 tag-0.034
 hiding-0.034
 exit-0.033
Tokeniously
Token ID6819
Feature activation+1.51e-02

Pos logit contributions
alks+0.149
lee+0.141
umm+0.128
ubb+0.127
bled+0.126
Neg logit contributions
 hiding-0.338
 hide-0.332
 seeker-0.296
 like-0.293
 tag-0.292
Token put
Token ID1234
Feature activation+2.69e-02

Pos logit contributions
lee+0.036
bled+0.032
atted+0.032
alks+0.031
urg+0.030
Neg logit contributions
 hide-0.023
 tag-0.023
 seeker-0.022
 hiding-0.022
 exit-0.021
Token it
Token ID340
Feature activation-2.15e-02

Pos logit contributions
lee+0.066
bled+0.059
alks+0.058
atted+0.058
urg+0.056
Neg logit contributions
 hide-0.040
 tag-0.039
 hiding-0.037
 seeker-0.036
 hidden-0.035
Token in
Token ID287
Feature activation+1.15e-02

Pos logit contributions
 seeker+0.026
 hide+0.025
 tag+0.025
 library+0.023
 exit+0.022
Neg logit contributions
bled-0.057
lee-0.056
atted-0.052
alks-0.049
ped-0.049
Token the
Token ID262
Feature activation-1.65e-02

Pos logit contributions
lee+0.025
bled+0.023
atted+0.022
alks+0.022
urg+0.021
Neg logit contributions
 hide-0.020
 tag-0.020
 hiding-0.019
 seeker-0.019
 hidden-0.018
Token the
Token ID262
Feature activation-1.45e-02

Pos logit contributions
lee+0.023
bled+0.021
atted+0.021
urg+0.020
aned+0.020
Neg logit contributions
 hide-0.013
 tag-0.012
 hiding-0.012
 seeker-0.011
 hidden-0.011
Token day
Token ID1110
Feature activation-1.60e-02

Pos logit contributions
 hide+0.018
 tag+0.017
 hiding+0.016
 seeker+0.016
 hidden+0.015
Neg logit contributions
lee-0.039
bled-0.036
alks-0.035
atted-0.035
urg-0.034
Token,
Token ID11
Feature activation-1.47e-02

Pos logit contributions
 hide+0.019
 seeker+0.019
 tag+0.019
 exit+0.018
 hiding+0.018
Neg logit contributions
lee-0.043
alks-0.038
atted-0.037
bled-0.037
ums-0.036
Token Bella
Token ID39304
Feature activation+1.08e-02

Pos logit contributions
 hide+0.026
 seeker+0.025
 tag+0.025
 hiding+0.025
 exit+0.025
Neg logit contributions
lee-0.031
bled-0.026
alks-0.026
atted-0.025
rac-0.024
Token kissed
Token ID27946
Feature activation+5.62e-02

Pos logit contributions
lee+0.024
bled+0.021
atted+0.021
alks+0.020
urg+0.019
Neg logit contributions
 hide-0.019
 tag-0.018
 seeker-0.018
 hiding-0.018
 exit-0.017
Token Aless
Token ID47319
Feature activation+1.39e-01

Pos logit contributions
lee+0.124
bled+0.110
alks+0.108
atted+0.107
urg+0.103
Neg logit contributions
 hide-0.097
 tag-0.094
 hiding-0.091
 seeker-0.090
 hidden-0.086
Tokenio
Token ID952
Feature activation+5.47e-02

Pos logit contributions
bled+0.173
lee+0.172
lights+0.161
alks+0.154
rac+0.148
Neg logit contributions
 hiding-0.301
 hide-0.293
 tag-0.290
 seeker-0.282
 exit-0.281
Token on
Token ID319
Feature activation+3.24e-02

Pos logit contributions
lee+0.116
bled+0.104
alks+0.102
atted+0.098
urg+0.096
Neg logit contributions
 hide-0.097
 tag-0.094
 hiding-0.090
 seeker-0.087
 hidden-0.086
Token the
Token ID262
Feature activation-6.80e-03

Pos logit contributions
lee+0.072
bled+0.064
alks+0.063
atted+0.063
urg+0.060
Neg logit contributions
 hide-0.055
 tag-0.053
 hiding-0.051
 seeker-0.051
 hidden-0.048
Token cheek
Token ID19353
Feature activation+1.44e-02

Pos logit contributions
 hide+0.011
 tag+0.010
 seeker+0.010
 hiding+0.010
 exit+0.009
Neg logit contributions
lee-0.016
bled-0.014
atted-0.014
alks-0.014
urg-0.013
Token and
Token ID290
Feature activation+3.50e-02

Pos logit contributions
lee+0.036
bled+0.033
atted+0.032
alks+0.032
ums+0.031
Neg logit contributions
 hide-0.020
 seeker-0.019
 hiding-0.019
 tag-0.018
 exit-0.017
Token smile
Token ID8212
Feature activation+3.42e-02

Pos logit contributions
 hide+0.048
 seeker+0.048
 tag+0.046
 hiding+0.045
 exit+0.045
Neg logit contributions
lee-0.039
alks-0.033
atted-0.033
bled-0.031
urg-0.029
Token.
Token ID13
Feature activation+5.45e-03

Pos logit contributions
lee+0.097
bled+0.087
atted+0.086
alks+0.085
urg+0.082
Neg logit contributions
 hide-0.040
 seeker-0.037
 tag-0.036
 hiding-0.034
 exit-0.033
Token
Token ID198
Feature activation+3.55e-02

Pos logit contributions
lee+0.013
bled+0.012
atted+0.012
alks+0.012
urg+0.011
Neg logit contributions
 hide-0.008
 tag-0.008
 hiding-0.007
 seeker-0.007
 hidden-0.007
Token
Token ID198
Feature activation+3.13e-02

Pos logit contributions
lee+0.091
bled+0.085
alks+0.083
atted+0.083
urg+0.080
Neg logit contributions
 hide-0.046
 tag-0.044
 hiding-0.042
 hidden-0.039
 exit-0.038
TokenMary
Token ID24119
Feature activation+3.24e-02

Pos logit contributions
lee+0.060
bled+0.053
atted+0.051
ipt+0.050
alks+0.050
Neg logit contributions
 hide-0.062
 hiding-0.059
 seeker-0.059
 tag-0.059
 hidden-0.055
Token anx
Token ID7296
Feature activation+1.36e-01

Pos logit contributions
lee+0.070
bled+0.061
atted+0.060
alks+0.059
urg+0.057
Neg logit contributions
 hide-0.058
 tag-0.056
 seeker-0.055
 hiding-0.054
 exit-0.052
Tokeniously
Token ID6819
Feature activation+2.65e-02

Pos logit contributions
lee+0.146
lights+0.143
alks+0.142
bled+0.137
ubb+0.137
Neg logit contributions
 hiding-0.329
 hide-0.319
 exit-0.295
 tag-0.294
 seeker-0.291
Token opened
Token ID4721
Feature activation+2.56e-02

Pos logit contributions
lee+0.072
bled+0.065
atted+0.064
alks+0.064
urg+0.062
Neg logit contributions
 hide-0.032
 tag-0.031
 seeker-0.030
 hiding-0.030
 exit-0.027
Token up
Token ID510
Feature activation+8.06e-03

Pos logit contributions
lee+0.058
bled+0.052
alks+0.051
atted+0.051
urg+0.049
Neg logit contributions
 hide-0.042
 tag-0.041
 hiding-0.040
 seeker-0.039
 hidden-0.037
Token the
Token ID262
Feature activation-1.19e-02

Pos logit contributions
lee+0.018
bled+0.016
atted+0.016
alks+0.016
rac+0.015
Neg logit contributions
 hide-0.013
 seeker-0.013
 tag-0.013
 hiding-0.012
 exit-0.012
Token pass
Token ID1208
Feature activation-1.43e-02

Pos logit contributions
 seeker+0.019
 tim+0.018
 hide+0.018
 tag+0.017
 hiding+0.016
Neg logit contributions
alks-0.024
lee-0.024
atted-0.022
bled-0.020
um-0.020
Token little
Token ID1310
Feature activation+4.91e-02

Pos logit contributions
lee+0.082
bled+0.075
atted+0.074
alks+0.073
urg+0.071
Neg logit contributions
 hide-0.040
 tag-0.038
 hiding-0.037
 seeker-0.035
 hidden-0.034
Token bird
Token ID6512
Feature activation+4.08e-02

Pos logit contributions
lee+0.130
bled+0.119
alks+0.116
atted+0.116
urg+0.112
Neg logit contributions
 hide-0.063
 tag-0.060
 hiding-0.058
 seeker-0.057
 hidden-0.053
Token nodded
Token ID14464
Feature activation+8.55e-02

Pos logit contributions
lee+0.089
bled+0.080
atted+0.077
alks+0.077
urg+0.074
Neg logit contributions
 hide-0.071
 tag-0.069
 hiding-0.066
 seeker-0.066
 hidden-0.063
Token and
Token ID290
Feature activation+4.20e-02

Pos logit contributions
lee+0.243
alks+0.226
bled+0.223
atted+0.220
ipt+0.217
Neg logit contributions
 hide-0.088
 tag-0.084
 hiding-0.082
 hidden-0.074
 seeker-0.068
Token listened
Token ID16399
Feature activation+3.86e-02

Pos logit contributions
lee+0.089
bled+0.079
alks+0.078
atted+0.076
urg+0.074
Neg logit contributions
 hide-0.077
 tag-0.075
 hiding-0.072
 seeker-0.070
 hidden-0.069
Token int
Token ID493
Feature activation+1.30e-01

Pos logit contributions
lee+0.101
bled+0.091
alks+0.089
atted+0.089
urg+0.086
Neg logit contributions
 hide-0.051
 tag-0.049
 hiding-0.047
 seeker-0.046
 hidden-0.043
Tokenently
Token ID1473
Feature activation+1.49e-02

Pos logit contributions
lee+0.179
alks+0.152
umm+0.136
eled+0.131
arted+0.129
Neg logit contributions
 hide-0.290
 tag-0.282
 hiding-0.280
 outside-0.279
?-0.278
Token as
Token ID355
Feature activation+7.95e-03

Pos logit contributions
lee+0.042
bled+0.038
alks+0.038
atted+0.037
urg+0.036
Neg logit contributions
 hide-0.016
 tag-0.016
 seeker-0.016
 hiding-0.015
 exit-0.014
Token the
Token ID262
Feature activation+2.80e-02

Pos logit contributions
lee+0.017
bled+0.015
alks+0.015
atted+0.014
urg+0.014
Neg logit contributions
 hide-0.014
 tag-0.013
 seeker-0.013
 hiding-0.013
 exit-0.012
Token owl
Token ID39610
Feature activation+3.28e-02

Pos logit contributions
lee+0.071
bled+0.065
atted+0.064
alks+0.063
urg+0.061
Neg logit contributions
 hide-0.039
 tag-0.038
 hiding-0.036
 seeker-0.035
 hidden-0.035
Token whispered
Token ID25029
Feature activation+7.15e-02

Pos logit contributions
lee+0.068
atted+0.060
bled+0.059
alks+0.057
eled+0.055
Neg logit contributions
 seeker-0.061
 hide-0.057
 tag-0.056
 hiding-0.055
 exit-0.053
Token ber
Token ID18157
Feature activation+6.82e-02

Pos logit contributions
lee+0.081
bled+0.072
atted+0.070
alks+0.070
urg+0.067
Neg logit contributions
 hide-0.065
 tag-0.063
 hiding-0.061
 seeker-0.060
 hidden-0.057
Tokenb
Token ID65
Feature activation+7.37e-02

Pos logit contributions
lee+0.112
bled+0.096
alks+0.095
atted+0.091
umm+0.088
Neg logit contributions
 hide-0.146
 tag-0.144
 hiding-0.142
 hidden-0.134
 seeker-0.132
Tokenere
Token ID567
Feature activation+8.34e-02

Pos logit contributions
lee+0.130
bled+0.111
alks+0.109
atted+0.107
ipt+0.104
Neg logit contributions
 hide-0.152
 tag-0.151
 hiding-0.144
 seeker-0.140
 hidden-0.139
Token and
Token ID290
Feature activation+2.25e-02

Pos logit contributions
lee+0.154
bled+0.136
atted+0.129
alks+0.129
urg+0.122
Neg logit contributions
 hide-0.174
 tag-0.168
 seeker-0.165
 hiding-0.165
 exit-0.158
Token sum
Token ID2160
Feature activation+5.08e-02

Pos logit contributions
lee+0.051
bled+0.045
alks+0.044
atted+0.044
urg+0.042
Neg logit contributions
 hide-0.037
 tag-0.035
 seeker-0.035
 hiding-0.034
 exit-0.033
Tokenac
Token ID330
Feature activation+1.29e-01

Pos logit contributions
lee+0.110
bled+0.099
atted+0.097
alks+0.095
rac+0.089
Neg logit contributions
 hide-0.088
 tag-0.085
 seeker-0.084
 hiding-0.083
 exit-0.079
Token and
Token ID290
Feature activation+2.52e-02

Pos logit contributions
bled+0.222
lee+0.202
umm+0.190
alks+0.180
lights+0.177
Neg logit contributions
 hide-0.247
 tag-0.237
 hidden-0.229
 hiding-0.225
 like-0.222
Token tam
Token ID21885
Feature activation+8.46e-02

Pos logit contributions
lee+0.056
bled+0.049
alks+0.048
atted+0.048
urg+0.046
Neg logit contributions
 hide-0.042
 tag-0.041
 seeker-0.040
 hiding-0.040
 exit-0.038
Tokenar
Token ID283
Feature activation+4.79e-02

Pos logit contributions
lee+0.152
bled+0.134
alks+0.128
atted+0.127
urg+0.121
Neg logit contributions
 hide-0.180
 tag-0.174
 hiding-0.172
 seeker-0.170
 hidden-0.164
Tokenind
Token ID521
Feature activation+8.59e-02

Pos logit contributions
lee+0.110
bled+0.098
atted+0.096
alks+0.095
urg+0.091
Neg logit contributions
 hide-0.079
 tag-0.076
 seeker-0.075
 hiding-0.074
 exit-0.069
Token and
Token ID290
Feature activation+2.96e-02

Pos logit contributions
lee+0.169
bled+0.151
alks+0.144
atted+0.143
urg+0.138
Neg logit contributions
 hide-0.164
 tag-0.161
 hiding-0.155
 seeker-0.153
 hidden-0.147
Token.
Token ID13
Feature activation+1.01e-02

Pos logit contributions
lee+0.021
alks+0.018
atted+0.018
bled+0.018
ums+0.017
Neg logit contributions
 seeker-0.012
 tag-0.011
 hide-0.011
 exit-0.010
 hiding-0.010
Token The
Token ID383
Feature activation+2.78e-02

Pos logit contributions
lee+0.023
bled+0.021
atted+0.020
alks+0.020
urg+0.020
Neg logit contributions
 hide-0.017
 tag-0.016
 hiding-0.015
 seeker-0.015
 hidden-0.014
Token gir
Token ID37370
Feature activation+5.35e-02

Pos logit contributions
lee+0.066
bled+0.058
alks+0.057
atted+0.057
urg+0.055
Neg logit contributions
 hide-0.042
 tag-0.040
 hiding-0.039
 seeker-0.039
 exit-0.036
Tokenaffe
Token ID21223
Feature activation+3.36e-02

Pos logit contributions
lee+0.104
bled+0.098
alks+0.090
urg+0.089
atted+0.088
Neg logit contributions
 tag-0.100
 hide-0.100
 hiding-0.094
 exit-0.092
 hidden-0.091
Token listened
Token ID16399
Feature activation+3.92e-02

Pos logit contributions
lee+0.067
bled+0.058
atted+0.058
alks+0.056
urg+0.054
Neg logit contributions
 hide-0.063
 seeker-0.062
 tag-0.062
 hiding-0.060
 exit-0.058
Token int
Token ID493
Feature activation+1.28e-01

Pos logit contributions
lee+0.096
bled+0.086
atted+0.084
alks+0.083
urg+0.080
Neg logit contributions
 hide-0.058
 seeker-0.057
 tag-0.056
 hiding-0.053
 exit-0.051
Tokenently
Token ID1473
Feature activation+1.32e-02

Pos logit contributions
lee+0.167
alks+0.146
eled+0.119
umm+0.116
arted+0.115
Neg logit contributions
 hide-0.301
 tag-0.297
 hiding-0.289
 like-0.286
 outside-0.281
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
lee+0.034
bled+0.030
atted+0.030
alks+0.030
urg+0.029
Neg logit contributions
 hide-0.017
 seeker-0.017
 tag-0.016
 hiding-0.016
 exit-0.015
Token
Token ID220
Feature activation+5.15e-02

Pos logit contributions
lee+0.022
bled+0.020
atted+0.019
urg+0.019
alks+0.019
Neg logit contributions
 hide-0.018
 tag-0.018
 hiding-0.017
 seeker-0.017
 hidden-0.016
Token
Token ID198
Feature activation+3.12e-02

Pos logit contributions
lee+0.127
bled+0.115
atted+0.112
alks+0.112
urg+0.109
Neg logit contributions
 hide-0.073
 tag-0.071
 hiding-0.068
 seeker-0.065
 hidden-0.063
TokenWhen
Token ID2215
Feature activation+1.41e-03

Pos logit contributions
lee+0.084
atted+0.077
bled+0.076
alks+0.075
urg+0.075
Neg logit contributions
 hide-0.038
 tag-0.038
 hiding-0.034
 hidden-0.032
 seeker-0.031
Token voice
Token ID3809
Feature activation+3.22e-02

Pos logit contributions
lee+0.093
atted+0.080
alks+0.078
bled+0.076
urg+0.073
Neg logit contributions
 seeker-0.092
 hide-0.089
 tag-0.088
 hiding-0.086
 exit-0.080
Token.
Token ID13
Feature activation+5.48e-03

Pos logit contributions
lee+0.084
atted+0.076
bled+0.073
ment+0.071
alks+0.070
Neg logit contributions
 seeker-0.045
 hide-0.040
 tag-0.038
 hiding-0.037
 maze-0.035
Token The
Token ID383
Feature activation+2.08e-02

Pos logit contributions
lee+0.012
bled+0.011
alks+0.010
atted+0.010
urg+0.010
Neg logit contributions
 hide-0.009
 tag-0.009
 hiding-0.009
 seeker-0.009
 hidden-0.008
Token birds
Token ID10087
Feature activation+1.17e-02

Pos logit contributions
lee+0.042
bled+0.035
atted+0.035
alks+0.035
eled+0.032
Neg logit contributions
 hide-0.038
 seeker-0.036
 hiding-0.036
 tag-0.036
 hidden-0.033
Token listened
Token ID16399
Feature activation+3.74e-02

Pos logit contributions
lee+0.026
atted+0.023
bled+0.023
alks+0.021
eled+0.021
Neg logit contributions
 hide-0.020
 seeker-0.020
 tag-0.019
 hiding-0.019
 exit-0.018
Token int
Token ID493
Feature activation+1.28e-01

Pos logit contributions
lee+0.097
bled+0.088
atted+0.087
alks+0.086
urg+0.083
Neg logit contributions
 hide-0.049
 tag-0.047
 seeker-0.046
 hiding-0.045
 exit-0.042
Tokenently
Token ID1473
Feature activation+1.41e-02

Pos logit contributions
lee+0.160
alks+0.133
umm+0.113
arted+0.107
eled+0.107
Neg logit contributions
 hide-0.312
 tag-0.308
 hiding-0.300
 like-0.290
 hidden-0.289
Token,
Token ID11
Feature activation+2.05e-02

Pos logit contributions
lee+0.037
bled+0.033
atted+0.033
alks+0.033
urg+0.032
Neg logit contributions
 hide-0.018
 seeker-0.017
 tag-0.017
 hiding-0.016
 exit-0.016
Token then
Token ID788
Feature activation+2.06e-02

Pos logit contributions
lee+0.050
bled+0.044
atted+0.043
alks+0.043
urg+0.042
Neg logit contributions
 hide-0.031
 tag-0.029
 seeker-0.029
 hiding-0.028
 exit-0.027
Token nodded
Token ID14464
Feature activation+8.59e-02

Pos logit contributions
lee+0.045
bled+0.039
atted+0.039
alks+0.038
urg+0.036
Neg logit contributions
 hide-0.036
 seeker-0.035
 tag-0.034
 hiding-0.034
 exit-0.033
Token and
Token ID290
Feature activation+4.21e-02

Pos logit contributions
lee+0.224
bled+0.203
alks+0.200
atted+0.199
urg+0.192
Neg logit contributions
 hide-0.112
 tag-0.108
 hiding-0.104
 seeker-0.100
 hidden-0.095
Token temper
Token ID4124
Feature activation-7.10e-03

Pos logit contributions
lee+0.017
atted+0.015
alks+0.015
bled+0.014
ums+0.014
Neg logit contributions
 seeker-0.014
 hide-0.014
 tag-0.014
 hiding-0.013
 exit-0.013
Token and
Token ID290
Feature activation+8.24e-03

Pos logit contributions
 seeker+0.010
 hide+0.010
 tag+0.009
 hiding+0.009
 exit+0.008
Neg logit contributions
lee-0.018
atted-0.016
alks-0.016
bled-0.015
ums-0.015
Token often
Token ID1690
Feature activation+1.10e-02

Pos logit contributions
lee+0.020
bled+0.017
atted+0.017
alks+0.017
ums+0.016
Neg logit contributions
 hide-0.012
 seeker-0.012
 tag-0.012
 hiding-0.011
 exit-0.011
Token threw
Token ID9617
Feature activation+2.26e-02

Pos logit contributions
lee+0.023
atted+0.020
bled+0.020
alks+0.019
urg+0.018
Neg logit contributions
 seeker-0.019
 tag-0.019
 hide-0.019
 exit-0.018
 hiding-0.018
Token temper
Token ID4124
Feature activation+2.33e-02

Pos logit contributions
lee+0.051
bled+0.045
alks+0.044
atted+0.044
urg+0.041
Neg logit contributions
 hide-0.037
 tag-0.036
 seeker-0.036
 hiding-0.035
 exit-0.033
Token tant
Token ID24246
Feature activation+1.28e-01

Pos logit contributions
lee+0.049
atted+0.043
bled+0.043
alks+0.043
urg+0.040
Neg logit contributions
 hide-0.042
 seeker-0.041
 tag-0.040
 hiding-0.040
 exit-0.038
Tokenrums
Token ID45241
Feature activation+4.07e-02

Pos logit contributions
lee+0.218
bled+0.190
umm+0.189
alks+0.187
ipt+0.175
Neg logit contributions
 hide-0.245
 hiding-0.243
 tag-0.239
 exit-0.207
 anywhere-0.207
Token.
Token ID13
Feature activation-2.66e-03

Pos logit contributions
lee+0.103
bled+0.096
atted+0.091
ums+0.089
alks+0.089
Neg logit contributions
 seeker-0.054
 hide-0.054
 tag-0.051
 hiding-0.051
 exit-0.048
Token One
Token ID1881
Feature activation-1.94e-02

Pos logit contributions
 hide+0.005
 tag+0.005
 hiding+0.004
 seeker+0.004
 hidden+0.004
Neg logit contributions
lee-0.006
bled-0.005
alks-0.005
atted-0.005
urg-0.005
Token day
Token ID1110
Feature activation-2.15e-02

Pos logit contributions
 tag+0.035
 hide+0.034
 seeker+0.032
 hiding+0.031
 exit+0.030
Neg logit contributions
lee-0.039
atted-0.035
eled-0.034
alks-0.033
um-0.033
Token,
Token ID11
Feature activation-1.96e-02

Pos logit contributions
 seeker+0.033
 hide+0.032
 tag+0.031
 exit+0.030
 hiding+0.030
Neg logit contributions
lee-0.045
bled-0.044
atted-0.042
ment-0.042
um-0.041
Token home
Token ID1363
Feature activation-1.84e-03

Pos logit contributions
 tim+0.022
 seeker+0.022
 hide+0.021
 anywhere+0.020
WR+0.020
Neg logit contributions
lee-0.031
um-0.027
ment-0.027
alks-0.027
 faces-0.027
Token.
Token ID13
Feature activation+5.28e-03

Pos logit contributions
 seeker+0.003
 hide+0.003
 tag+0.002
 maze+0.002
 hiding+0.002
Neg logit contributions
lee-0.004
bled-0.004
atted-0.004
ment-0.004
alks-0.004
Token The
Token ID383
Feature activation+1.80e-02

Pos logit contributions
lee+0.011
bled+0.010
atted+0.010
alks+0.010
urg+0.009
Neg logit contributions
 hide-0.009
 tag-0.009
 hiding-0.009
 seeker-0.009
 hidden-0.008
Token puppy
Token ID26188
Feature activation+1.58e-02

Pos logit contributions
lee+0.041
bled+0.037
atted+0.035
um+0.035
alks+0.034
Neg logit contributions
 hide-0.025
 hiding-0.024
 tag-0.023
 tim-0.022
 seeker-0.022
Token listened
Token ID16399
Feature activation+3.89e-02

Pos logit contributions
lee+0.035
atted+0.031
bled+0.029
aned+0.027
rac+0.027
Neg logit contributions
 seeker-0.027
 hide-0.024
 exit-0.024
 tag-0.023
 maze-0.023
Token int
Token ID493
Feature activation+1.27e-01

Pos logit contributions
lee+0.097
bled+0.087
atted+0.085
alks+0.083
urg+0.081
Neg logit contributions
 hide-0.056
 seeker-0.055
 tag-0.054
 hiding-0.050
 exit-0.049
Tokenently
Token ID1473
Feature activation+1.23e-02

Pos logit contributions
lee+0.158
alks+0.134
eled+0.110
arted+0.108
atted+0.106
Neg logit contributions
 hide-0.312
 tag-0.305
 hiding-0.302
 hidden-0.292
 outside-0.290
Token and
Token ID290
Feature activation+3.82e-02

Pos logit contributions
lee+0.031
bled+0.028
atted+0.027
alks+0.027
urg+0.026
Neg logit contributions
 hide-0.017
 seeker-0.017
 tag-0.016
 exit-0.015
 hiding-0.015
Token w
Token ID266
Feature activation+8.51e-02

Pos logit contributions
lee+0.079
bled+0.071
atted+0.069
alks+0.066
urg+0.065
Neg logit contributions
 hide-0.069
 tag-0.067
 seeker-0.066
 hiding-0.065
 exit-0.063
Tokenagged
Token ID14655
Feature activation+5.86e-02

Pos logit contributions
lee+0.092
alks+0.079
ipt+0.074
bled+0.071
rac+0.069
Neg logit contributions
 hide-0.227
 tag-0.221
 hiding-0.215
 exit-0.209
 hidden-0.207
Token his
Token ID465
Feature activation+4.80e-02

Pos logit contributions
lee+0.116
bled+0.101
atted+0.099
alks+0.097
urg+0.093
Neg logit contributions
 hide-0.112
 seeker-0.109
 tag-0.108
 hiding-0.105
 exit-0.102
Token on
Token ID319
Feature activation+3.00e-02

Pos logit contributions
lee+0.028
bled+0.027
atted+0.025
rac+0.025
aned+0.024
Neg logit contributions
 hide-0.016
 seeker-0.015
 tag-0.015
 hiding-0.014
 exit-0.014
Token his
Token ID465
Feature activation+7.53e-03

Pos logit contributions
lee+0.062
bled+0.055
alks+0.054
atted+0.052
rac+0.050
Neg logit contributions
 hide-0.051
 seeker-0.050
 tag-0.048
 hiding-0.048
 exit-0.046
Token paws
Token ID49193
Feature activation-1.43e-02

Pos logit contributions
lee+0.016
bled+0.014
alks+0.014
atted+0.014
urg+0.013
Neg logit contributions
 hide-0.013
 tag-0.013
 seeker-0.013
 hiding-0.013
 hidden-0.012
Token and
Token ID290
Feature activation+2.23e-02

Pos logit contributions
 hide+0.020
 seeker+0.019
 tag+0.019
 hiding+0.018
 exit+0.018
Neg logit contributions
lee-0.034
bled-0.031
atted-0.031
alks-0.030
rac-0.029
Token listen
Token ID6004
Feature activation+3.67e-02

Pos logit contributions
lee+0.050
bled+0.045
atted+0.044
alks+0.043
rac+0.042
Neg logit contributions
 seeker-0.034
 hide-0.033
 tag-0.033
 hiding-0.032
 exit-0.031
Token int
Token ID493
Feature activation+1.26e-01

Pos logit contributions
lee+0.091
bled+0.082
atted+0.080
alks+0.078
urg+0.076
Neg logit contributions
 seeker-0.053
 hide-0.052
 tag-0.051
 hiding-0.048
 exit-0.047
Tokenently
Token ID1473
Feature activation+1.15e-02

Pos logit contributions
lee+0.157
alks+0.132
arted+0.109
atted+0.108
eled+0.108
Neg logit contributions
 hide-0.316
 tag-0.307
 hiding-0.303
 hidden-0.293
 outside-0.291
Token to
Token ID284
Feature activation+5.64e-02

Pos logit contributions
lee+0.028
bled+0.025
atted+0.024
alks+0.024
urg+0.023
Neg logit contributions
 hide-0.017
 seeker-0.017
 tag-0.016
 hiding-0.015
 exit-0.015
Token her
Token ID607
Feature activation+3.62e-02

Pos logit contributions
lee+0.122
bled+0.109
alks+0.106
atted+0.106
urg+0.101
Neg logit contributions
 hide-0.099
 tag-0.097
 hiding-0.094
 seeker-0.092
 hidden-0.088
Token until
Token ID1566
Feature activation+5.62e-03

Pos logit contributions
lee+0.075
bled+0.064
atted+0.063
alks+0.063
urg+0.062
Neg logit contributions
 seeker-0.063
 hide-0.063
 tag-0.062
 hiding-0.059
 exit-0.058
Token she
Token ID673
Feature activation-2.61e-04

Pos logit contributions
lee+0.011
bled+0.011
atted+0.010
alks+0.010
rac+0.009
Neg logit contributions
 hide-0.010
 tag-0.010
 seeker-0.010
 hiding-0.010
 exit-0.009
Token
Token ID198
Feature activation+2.17e-02

Pos logit contributions
lee+0.033
bled+0.031
urg+0.029
atted+0.029
alks+0.029
Neg logit contributions
 hide-0.019
 tag-0.019
 seeker-0.018
 hiding-0.018
 gym-0.017
Token
Token ID198
Feature activation+2.01e-02

Pos logit contributions
lee+0.059
bled+0.055
alks+0.054
atted+0.054
urg+0.053
Neg logit contributions
 hide-0.025
 tag-0.024
 hiding-0.022
 hidden-0.021
 seeker-0.020
TokenTom
Token ID13787
Feature activation+5.81e-02

Pos logit contributions
lee+0.040
bled+0.036
alks+0.034
atted+0.034
urg+0.033
Neg logit contributions
 hide-0.039
 tag-0.038
 hiding-0.037
 seeker-0.036
 hidden-0.035
Tokenmy
Token ID1820
Feature activation+3.19e-02

Pos logit contributions
lee+0.144
bled+0.135
alks+0.133
atted+0.127
urg+0.126
Neg logit contributions
 hide-0.080
 tag-0.076
 hiding-0.076
 seeker-0.072
 hidden-0.070
Token listened
Token ID16399
Feature activation+4.24e-02

Pos logit contributions
lee+0.083
alks+0.078
bled+0.078
atted+0.074
urg+0.074
Neg logit contributions
 hide-0.041
 tag-0.039
 hiding-0.037
 hidden-0.034
 seeker-0.033
Token int
Token ID493
Feature activation+1.25e-01

Pos logit contributions
lee+0.108
bled+0.098
atted+0.095
alks+0.095
urg+0.092
Neg logit contributions
 hide-0.059
 tag-0.057
 hiding-0.054
 seeker-0.054
 hidden-0.050
Tokenently
Token ID1473
Feature activation+1.60e-02

Pos logit contributions
lee+0.165
alks+0.149
umm+0.129
ubb+0.122
 pulp+0.119
Neg logit contributions
 hide-0.281
?-0.278
 outside-0.277
 tag-0.276
 hiding-0.270
Token to
Token ID284
Feature activation+6.26e-02

Pos logit contributions
lee+0.043
bled+0.039
atted+0.038
alks+0.038
urg+0.036
Neg logit contributions
 hide-0.020
 tag-0.019
 seeker-0.019
 hiding-0.018
 exit-0.017
Token his
Token ID465
Feature activation+4.59e-02

Pos logit contributions
lee+0.154
atted+0.150
bled+0.149
alks+0.145
umbled+0.144
Neg logit contributions
 hide-0.091
 tag-0.085
 hiding-0.084
 exit-0.079
 hidden-0.076
Token mom
Token ID1995
Feature activation+3.68e-02

Pos logit contributions
lee+0.077
bled+0.066
alks+0.064
atted+0.064
urg+0.060
Neg logit contributions
 hide-0.104
 tag-0.102
 hiding-0.099
 seeker-0.099
 hidden-0.095
Token's
Token ID338
Feature activation+2.39e-02

Pos logit contributions
lee+0.099
bled+0.089
atted+0.089
alks+0.087
urg+0.085
Neg logit contributions
 hide-0.045
 tag-0.044
 seeker-0.043
 hiding-0.041
 exit-0.038
Token!
Token ID0
Feature activation-6.69e-03

Pos logit contributions
lee+0.010
alks+0.009
atted+0.009
bled+0.008
rac+0.008
Neg logit contributions
 seeker-0.006
 hide-0.006
 tag-0.006
 hiding-0.005
 exit-0.005
Tokenâ
Token ID22940
Feature activation+3.99e-02

Pos logit contributions
 hide+0.012
 tag+0.012
 hiding+0.012
 seeker+0.012
 hidden+0.011
Neg logit contributions
lee-0.014
alks-0.012
atted-0.012
bled-0.012
ipt-0.011
Token
Token ID26391
Feature activation+3.79e-04

Pos logit contributions
lee+0.080
bled+0.071
atted+0.069
alks+0.067
rac+0.066
Neg logit contributions
 hide-0.077
 tag-0.075
 hiding-0.074
 seeker-0.071
 hidden-0.069
Token Jim
Token ID5395
Feature activation+2.41e-02

Pos logit contributions
lee+0.001
alks+0.001
atted+0.001
bled+0.001
rac+0.001
Neg logit contributions
 hide-0.001
 hiding-0.001
 tag-0.001
 exit-0.001
 seeker-0.001
Token said
Token ID531
Feature activation+4.47e-02

Pos logit contributions
lee+0.054
atted+0.048
bled+0.046
alks+0.044
aned+0.044
Neg logit contributions
 hide-0.040
 seeker-0.039
 tag-0.039
 hiding-0.038
 exit-0.037
Token anx
Token ID7296
Feature activation+1.25e-01

Pos logit contributions
lee+0.106
bled+0.095
atted+0.093
rac+0.089
alks+0.089
Neg logit contributions
 hide-0.067
 seeker-0.065
 hiding-0.064
 exit-0.064
 tag-0.064
Tokeniously
Token ID6819
Feature activation+2.21e-02

Pos logit contributions
lee+0.129
bled+0.112
alks+0.111
lights+0.100
ipt+0.094
Neg logit contributions
 hide-0.317
 hiding-0.317
 seeker-0.315
 tag-0.305
 exit-0.296
Token.
Token ID13
Feature activation-3.41e-04

Pos logit contributions
lee+0.062
atted+0.055
alks+0.055
bled+0.054
ums+0.052
Neg logit contributions
 hide-0.025
 seeker-0.024
 tag-0.023
 hiding-0.022
 exit-0.021
Token
Token ID198
Feature activation+1.38e-02

Pos logit contributions
 hide+0.001
 tag+0.001
 hiding+0.001
 seeker+0.001
 hidden+0.001
Neg logit contributions
lee-0.001
bled-0.001
alks-0.001
atted-0.001
urg-0.001
Token
Token ID198
Feature activation+1.03e-02

Pos logit contributions
lee+0.034
bled+0.031
atted+0.031
alks+0.031
urg+0.030
Neg logit contributions
 hide-0.019
 tag-0.019
 hiding-0.018
 seeker-0.018
 hidden-0.017
TokenHis
Token ID6653
Feature activation+4.28e-02

Pos logit contributions
lee+0.020
bled+0.018
alks+0.017
atted+0.017
urg+0.016
Neg logit contributions
 hide-0.021
 tag-0.020
 hiding-0.020
 seeker-0.019
 hidden-0.019
Token.
Token ID13
Feature activation-5.85e-05

Pos logit contributions
 tag+0.046
 seeker+0.046
 hide+0.045
 maze+0.042
 exit+0.042
Neg logit contributions
lee-0.087
atted-0.076
bled-0.076
alks-0.075
ums-0.071
Token The
Token ID383
Feature activation+3.20e-02

Pos logit contributions
 hide+0.000
 tag+0.000
 hiding+0.000
 seeker+0.000
 hidden+0.000
Neg logit contributions
lee-0.000
bled-0.000
atted-0.000
alks-0.000
urg-0.000
Token little
Token ID1310
Feature activation+5.19e-02

Pos logit contributions
lee+0.069
bled+0.061
atted+0.059
alks+0.059
urg+0.057
Neg logit contributions
 hide-0.057
 tag-0.055
 hiding-0.054
 seeker-0.053
 hidden-0.050
Token girl
Token ID2576
Feature activation+2.34e-02

Pos logit contributions
lee+0.119
bled+0.105
atted+0.104
alks+0.103
urg+0.098
Neg logit contributions
 hide-0.085
 tag-0.082
 hiding-0.079
 seeker-0.078
 exit-0.074
Token listened
Token ID16399
Feature activation+3.37e-02

Pos logit contributions
lee+0.056
atted+0.049
bled+0.049
alks+0.047
rac+0.046
Neg logit contributions
 hide-0.036
 tag-0.035
 seeker-0.035
 hiding-0.034
 exit-0.032
Token int
Token ID493
Feature activation+1.23e-01

Pos logit contributions
lee+0.087
bled+0.079
atted+0.077
alks+0.077
urg+0.074
Neg logit contributions
 hide-0.045
 tag-0.044
 seeker-0.042
 hiding-0.042
 exit-0.038
Tokenently
Token ID1473
Feature activation+7.20e-03

Pos logit contributions
lee+0.164
alks+0.144
umm+0.125
arted+0.121
eled+0.118
Neg logit contributions
 hide-0.283
 tag-0.278
 hiding-0.272
?-0.270
 outside-0.270
Token as
Token ID355
Feature activation+7.83e-04

Pos logit contributions
lee+0.019
bled+0.017
atted+0.017
alks+0.017
urg+0.016
Neg logit contributions
 hide-0.009
 tag-0.009
 seeker-0.009
 hiding-0.008
 exit-0.008
Token they
Token ID484
Feature activation+1.04e-02

Pos logit contributions
lee+0.002
bled+0.002
alks+0.001
atted+0.001
urg+0.001
Neg logit contributions
 hide-0.001
 tag-0.001
 seeker-0.001
 hiding-0.001
 exit-0.001
Token introduced
Token ID5495
Feature activation+6.12e-02

Pos logit contributions
lee+0.023
bled+0.020
atted+0.020
alks+0.019
rac+0.019
Neg logit contributions
 hide-0.017
 tag-0.017
 seeker-0.017
 hiding-0.017
 exit-0.016
Token her
Token ID607
Feature activation+2.72e-02

Pos logit contributions
lee+0.133
bled+0.122
atted+0.119
alks+0.118
urg+0.116
Neg logit contributions
 hide-0.107
 tag-0.102
 hiding-0.099
 hidden-0.093
 seeker-0.092
Token Both
Token ID5747
Feature activation+3.01e-02

Pos logit contributions
lee+0.024
bled+0.024
urg+0.023
atted+0.023
alks+0.022
Neg logit contributions
 hide-0.012
 tag-0.011
 hiding-0.011
 seeker-0.010
 chamber-0.010
Token of
Token ID286
Feature activation+4.56e-02

Pos logit contributions
lee+0.067
bled+0.062
alks+0.060
atted+0.059
urg+0.058
Neg logit contributions
 hide-0.050
 tag-0.048
 hiding-0.047
 seeker-0.045
 hidden-0.044
Token them
Token ID606
Feature activation+7.88e-03

Pos logit contributions
lee+0.087
alks+0.080
bled+0.080
urg+0.077
atted+0.077
Neg logit contributions
 hide-0.090
 tag-0.088
 hiding-0.085
 hidden-0.079
 seeker-0.077
Token put
Token ID1234
Feature activation+3.19e-02

Pos logit contributions
lee+0.017
atted+0.014
bled+0.014
alks+0.013
eled+0.013
Neg logit contributions
 seeker-0.015
 hide-0.014
 tag-0.014
 hiding-0.014
 exit-0.013
Token their
Token ID511
Feature activation+3.64e-02

Pos logit contributions
lee+0.076
bled+0.067
alks+0.066
atted+0.065
urg+0.064
Neg logit contributions
 hide-0.050
 tag-0.049
 hiding-0.046
 seeker-0.045
 hidden-0.044
Token anx
Token ID7296
Feature activation+1.23e-01

Pos logit contributions
lee+0.087
bled+0.079
alks+0.075
atted+0.073
urg+0.073
Neg logit contributions
 hide-0.058
 tag-0.055
 hiding-0.055
 hidden-0.050
 exit-0.049
Tokenieties
Token ID9545
Feature activation-4.13e-03

Pos logit contributions
lee+0.142
bled+0.132
lights+0.121
alks+0.110
ipt+0.107
Neg logit contributions
 hide-0.302
 hiding-0.298
 tag-0.274
 exit-0.269
house-0.266
Token aside
Token ID7263
Feature activation+2.21e-02

Pos logit contributions
 hide+0.005
 tag+0.005
 seeker+0.005
 hiding+0.005
 exit+0.004
Neg logit contributions
lee-0.011
bled-0.010
atted-0.010
alks-0.010
urg-0.009
Token.
Token ID13
Feature activation+6.22e-03

Pos logit contributions
lee+0.052
bled+0.047
atted+0.046
alks+0.045
ment+0.043
Neg logit contributions
 seeker-0.032
 hide-0.031
 tag-0.029
 exit-0.028
 hiding-0.027
Token From
Token ID3574
Feature activation+2.04e-02

Pos logit contributions
lee+0.015
bled+0.014
atted+0.014
urg+0.014
alks+0.014
Neg logit contributions
 hide-0.009
 tag-0.009
 hiding-0.008
 seeker-0.008
 hidden-0.007
Token then
Token ID788
Feature activation-1.01e-02

Pos logit contributions
lee+0.038
bled+0.033
atted+0.032
alks+0.030
rac+0.030
Neg logit contributions
 hide-0.043
 tag-0.041
 seeker-0.041
 hiding-0.040
 exit-0.039
Token the
Token ID262
Feature activation+2.07e-02

Pos logit contributions
lee+0.062
atted+0.053
bled+0.051
ment+0.047
urg+0.047
Neg logit contributions
 seeker-0.061
 hide-0.056
 hiding-0.053
 exit-0.053
 tag-0.053
Token answer
Token ID3280
Feature activation+2.45e-02

Pos logit contributions
lee+0.042
bled+0.038
alks+0.038
ums+0.037
atted+0.037
Neg logit contributions
 seeker-0.028
 hide-0.028
 tim-0.026
 hiding-0.026
 tag-0.026
Token.
Token ID13
Feature activation-2.52e-03

Pos logit contributions
lee+0.062
atted+0.055
bled+0.054
alks+0.052
ums+0.051
Neg logit contributions
 seeker-0.034
 hide-0.032
 hiding-0.028
 tag-0.028
 exit-0.028
Token She
Token ID1375
Feature activation+1.48e-02

Pos logit contributions
 hide+0.005
 tag+0.004
 hiding+0.004
 seeker+0.004
 hidden+0.004
Neg logit contributions
lee-0.005
bled-0.005
alks-0.005
atted-0.005
urg-0.004
Token asked
Token ID1965
Feature activation+6.60e-02

Pos logit contributions
lee+0.031
atted+0.026
bled+0.026
alks+0.024
aned+0.024
Neg logit contributions
 hide-0.028
 seeker-0.028
 tag-0.027
 hiding-0.026
 exit-0.025
Token anx
Token ID7296
Feature activation+1.23e-01

Pos logit contributions
lee+0.142
bled+0.126
atted+0.123
alks+0.119
urg+0.117
Neg logit contributions
 hide-0.113
 tag-0.111
 seeker-0.109
 hiding-0.109
 exit-0.104
Tokeniously
Token ID6819
Feature activation+1.62e-02

Pos logit contributions
lee+0.121
bled+0.119
alks+0.112
lights+0.109
umm+0.103
Neg logit contributions
 hiding-0.315
 hide-0.313
 tag-0.294
 seeker-0.286
 exit-0.285
Token,
Token ID11
Feature activation+7.89e-03

Pos logit contributions
lee+0.043
atted+0.038
bled+0.038
alks+0.037
ums+0.036
Neg logit contributions
 seeker-0.020
 hide-0.020
 tag-0.019
 exit-0.018
 hiding-0.017
Token "
Token ID366
Feature activation+5.28e-03

Pos logit contributions
lee+0.017
bled+0.015
atted+0.015
alks+0.014
rac+0.014
Neg logit contributions
 hide-0.014
 seeker-0.013
 tag-0.013
 hiding-0.013
 exit-0.013
TokenWhat
Token ID2061
Feature activation-5.82e-04

Pos logit contributions
lee+0.010
bled+0.009
atted+0.008
alks+0.008
ipt+0.008
Neg logit contributions
 hide-0.011
 tag-0.010
 hiding-0.010
 seeker-0.010
 hidden-0.010
Token is
Token ID318
Feature activation-4.91e-02

Pos logit contributions
 clearing+0.001
 seeker+0.001
 hiding+0.001
 exit+0.001
 shelter+0.001
Neg logit contributions
rac-0.001
lee-0.001
ums-0.001
atted-0.001
um-0.001
Token Sammy
Token ID44067
Feature activation+1.68e-02

Pos logit contributions
lee+0.116
bled+0.113
atted+0.109
alks+0.107
aned+0.107
Neg logit contributions
 hide-0.074
 hiding-0.069
 tag-0.067
 seeker-0.064
 hidden-0.061
Token would
Token ID561
Feature activation+2.68e-03

Pos logit contributions
lee+0.037
bled+0.032
atted+0.031
alks+0.031
urg+0.030
Neg logit contributions
 hide-0.029
 tag-0.028
 hiding-0.028
 seeker-0.028
 exit-0.026
Token often
Token ID1690
Feature activation+6.94e-03

Pos logit contributions
lee+0.007
bled+0.006
atted+0.006
alks+0.006
urg+0.006
Neg logit contributions
 hide-0.004
 tag-0.004
 hiding-0.003
 seeker-0.003
 hidden-0.003
Token share
Token ID2648
Feature activation+3.13e-02

Pos logit contributions
lee+0.017
alks+0.015
bled+0.014
atted+0.014
urg+0.014
Neg logit contributions
 seeker-0.010
 tag-0.010
 hide-0.009
 hiding-0.009
 exit-0.009
Token av
Token ID1196
Feature activation+2.39e-04

Pos logit contributions
lee+0.072
bled+0.065
alks+0.063
atted+0.063
urg+0.061
Neg logit contributions
 hide-0.050
 tag-0.049
 hiding-0.047
 seeker-0.046
 hidden-0.044
Tokenoc
Token ID420
Feature activation+1.22e-01

Pos logit contributions
lee+0.001
bled+0.001
urg+0.001
atted+0.001
aned+0.001
Neg logit contributions
 hide-0.000
 tag-0.000
house-0.000
 hiding-0.000
 hidden-0.000
Tokenados
Token ID22484
Feature activation-2.42e-02

Pos logit contributions
lee+0.152
lights+0.147
bled+0.125
eled+0.121
Dad+0.120
Neg logit contributions
 hide-0.270
 tag-0.257
 hiding-0.254
 jail-0.245
house-0.235
Token and
Token ID290
Feature activation+3.24e-02

Pos logit contributions
 hide+0.032
 tag+0.031
 seeker+0.031
 hiding+0.030
 exit+0.028
Neg logit contributions
lee-0.062
bled-0.056
alks-0.055
atted-0.055
urg-0.053
Token other
Token ID584
Feature activation+4.25e-02

Pos logit contributions
lee+0.084
bled+0.078
atted+0.076
alks+0.075
ipt+0.074
Neg logit contributions
 hide-0.046
 tag-0.043
 hiding-0.041
 hidden-0.038
 seeker-0.036
Token y
Token ID331
Feature activation+6.69e-02

Pos logit contributions
lee+0.094
bled+0.087
alks+0.084
atted+0.082
urg+0.080
Neg logit contributions
 hide-0.072
 tag-0.069
 hiding-0.067
 hidden-0.064
 seeker-0.064
Tokenummy
Token ID13513
Feature activation+3.21e-02

Pos logit contributions
bled+0.091
lee+0.089
rac+0.085
aned+0.080
lights+0.071
Neg logit contributions
 hide-0.144
 hiding-0.139
 hidden-0.131
 exit-0.127
house-0.126
Token see
Token ID766
Feature activation+2.62e-02

Pos logit contributions
lee+0.001
alks+0.001
rac+0.001
bled+0.001
urg+0.001
Neg logit contributions
 seeker-0.000
 hiding-0.000
 tag-0.000
 hide-0.000
 chamber-0.000
Token.
Token ID13
Feature activation+9.03e-03

Pos logit contributions
lee+0.061
atted+0.051
bled+0.051
ment+0.050
alks+0.049
Neg logit contributions
 seeker-0.042
 hide-0.037
 tag-0.036
 hiding-0.035
 exit-0.033
Token The
Token ID383
Feature activation+2.35e-02

Pos logit contributions
lee+0.020
bled+0.018
alks+0.018
atted+0.018
urg+0.017
Neg logit contributions
 hide-0.015
 tag-0.015
 hiding-0.014
 seeker-0.013
 hidden-0.013
Token owl
Token ID39610
Feature activation+2.28e-02

Pos logit contributions
lee+0.056
bled+0.051
atted+0.050
alks+0.050
urg+0.048
Neg logit contributions
 hide-0.036
 tag-0.035
 hiding-0.034
 seeker-0.033
 hidden-0.032
Token listened
Token ID16399
Feature activation+3.51e-02

Pos logit contributions
lee+0.045
bled+0.039
atted+0.038
alks+0.037
urg+0.036
Neg logit contributions
 hide-0.045
 tag-0.043
 seeker-0.043
 hiding-0.042
 exit-0.040
Token int
Token ID493
Feature activation+1.22e-01

Pos logit contributions
lee+0.084
bled+0.075
atted+0.074
alks+0.073
urg+0.070
Neg logit contributions
 hide-0.054
 seeker-0.052
 tag-0.052
 hiding-0.049
 exit-0.047
Tokenently
Token ID1473
Feature activation+6.67e-03

Pos logit contributions
lee+0.148
alks+0.128
eled+0.106
atted+0.102
umm+0.101
Neg logit contributions
 hide-0.301
 tag-0.296
 hiding-0.294
 hidden-0.282
 exit-0.281
Token and
Token ID290
Feature activation+3.23e-02

Pos logit contributions
lee+0.018
bled+0.016
atted+0.016
alks+0.016
urg+0.015
Neg logit contributions
 hide-0.009
 seeker-0.008
 tag-0.008
 hiding-0.008
 exit-0.007
Token then
Token ID788
Feature activation+1.41e-02

Pos logit contributions
lee+0.066
bled+0.059
atted+0.057
alks+0.055
urg+0.054
Neg logit contributions
 hide-0.060
 tag-0.058
 seeker-0.057
 hiding-0.056
 exit-0.054
Token it
Token ID340
Feature activation+6.34e-03

Pos logit contributions
lee+0.030
bled+0.027
atted+0.026
alks+0.026
rac+0.024
Neg logit contributions
 hide-0.025
 seeker-0.024
 tag-0.024
 hiding-0.023
 exit-0.023
Token said
Token ID531
Feature activation+4.15e-02

Pos logit contributions
lee+0.014
bled+0.012
atted+0.012
alks+0.012
urg+0.011
Neg logit contributions
 hide-0.011
 seeker-0.011
 tag-0.011
 hiding-0.010
 exit-0.010
Token.
Token ID13
Feature activation+1.85e-03

Pos logit contributions
 seeker+0.038
 tag+0.035
 hiding+0.033
 hide+0.032
 maze+0.030
Neg logit contributions
lee-0.054
bled-0.048
alks-0.047
rac-0.046
atted-0.046
Token\n
Token ID198
Feature activation+1.88e-02

Pos logit contributions
lee+0.004
bled+0.004
atted+0.004
alks+0.004
urg+0.004
Neg logit contributions
 hide-0.003
 tag-0.003
 hiding-0.003
 chamber-0.002
 hidden-0.002
Token\n
Token ID198
Feature activation+1.81e-02

Pos logit contributions
lee+0.047
bled+0.043
alks+0.042
atted+0.042
urg+0.040
Neg logit contributions
 hide-0.027
 tag-0.026
 hiding-0.025
 hidden-0.023
 seeker-0.023
TokenKate
Token ID45087
Feature activation+2.70e-02

Pos logit contributions
lee+0.038
bled+0.034
alks+0.034
atted+0.033
urg+0.032
Neg logit contributions
 hide-0.033
 tag-0.032
 hiding-0.031
 hidden-0.030
 seeker-0.029
Token listened
Token ID16399
Feature activation+3.61e-02

Pos logit contributions
lee+0.066
alks+0.061
bled+0.061
atted+0.058
urg+0.057
Neg logit contributions
 hide-0.039
 tag-0.037
 hiding-0.036
 hidden-0.034
 seeker-0.033
Token int
Token ID493
Feature activation+1.22e-01

Pos logit contributions
lee+0.088
bled+0.080
alks+0.078
atted+0.078
urg+0.075
Neg logit contributions
 hide-0.053
 tag-0.051
 hiding-0.049
 seeker-0.049
 hidden-0.046
Tokenently
Token ID1473
Feature activation+9.89e-03

Pos logit contributions
lee+0.154
alks+0.139
eled+0.119
umm+0.118
ubb+0.115
Neg logit contributions
 hide-0.279
 hiding-0.275
 outside-0.275
?-0.273
 tag-0.272
Token and
Token ID290
Feature activation+3.72e-02

Pos logit contributions
lee+0.027
bled+0.025
atted+0.024
alks+0.024
urg+0.023
Neg logit contributions
 hide-0.012
 tag-0.011
 seeker-0.011
 hiding-0.011
 exit-0.010
Token proudly
Token ID23017
Feature activation+4.74e-02

Pos logit contributions
lee+0.082
alks+0.074
bled+0.073
atted+0.071
urg+0.069
Neg logit contributions
 hide-0.065
 tag-0.062
 hiding-0.061
 hidden-0.058
 seeker-0.058
Token.
Token ID13
Feature activation+1.22e-03

Pos logit contributions
lee+0.101
bled+0.090
atted+0.087
alks+0.086
urg+0.083
Neg logit contributions
 hide-0.085
 tag-0.082
 hiding-0.080
 seeker-0.080
 exit-0.076
Token\n
Token ID198
Feature activation+1.87e-02

Pos logit contributions
lee+0.003
bled+0.003
atted+0.003
alks+0.003
eled+0.003
Neg logit contributions
 hide-0.002
 hiding-0.002
 chamber-0.002
 tag-0.002
 hidden-0.001
Token to
Token ID284
Feature activation+2.75e-02

Pos logit contributions
 tag+0.013
 seeker+0.012
 hide+0.012
 exit+0.011
 maze+0.011
Neg logit contributions
lee-0.021
bled-0.019
atted-0.019
rac-0.018
alks-0.017
Token break
Token ID2270
Feature activation+4.53e-02

Pos logit contributions
lee+0.065
bled+0.058
alks+0.057
atted+0.056
urg+0.055
Neg logit contributions
 hide-0.041
 tag-0.041
 seeker-0.039
 hiding-0.039
 exit-0.036
Token."
Token ID526
Feature activation+4.80e-02

Pos logit contributions
lee+0.115
bled+0.103
atted+0.101
alks+0.101
urg+0.097
Neg logit contributions
 hide-0.062
 tag-0.060
 seeker-0.060
 hiding-0.058
 exit-0.054
Token Molly
Token ID30236
Feature activation+4.82e-02

Pos logit contributions
bled+0.113
lee+0.110
urg+0.105
aned+0.104
atted+0.102
Neg logit contributions
 hide-0.070
 hiding-0.063
 tag-0.062
 chamber-0.062
 gym-0.060
Token nodded
Token ID14464
Feature activation+9.39e-02

Pos logit contributions
lee+0.101
alks+0.093
bled+0.092
atted+0.087
urg+0.086
Neg logit contributions
 hide-0.087
 tag-0.085
 hiding-0.081
 hidden-0.077
 seeker-0.076
Token solemn
Token ID26322
Feature activation+1.22e-01

Pos logit contributions
lee+0.285
alks+0.282
bled+0.269
atted+0.265
umbled+0.265
Neg logit contributions
 hide-0.075
 hiding-0.070
 tag-0.067
 hidden-0.060
 like-0.060
Tokenly
Token ID306
Feature activation+6.95e-02

Pos logit contributions
bled+0.232
lee+0.225
alks+0.223
ipt+0.207
eled+0.201
Neg logit contributions
 hide-0.216
 hiding-0.208
 like-0.201
 tag-0.196
 hidden-0.194
Token,
Token ID11
Feature activation+2.81e-02

Pos logit contributions
lee+0.197
bled+0.186
alks+0.183
atted+0.177
urg+0.176
Neg logit contributions
 hide-0.073
 tag-0.068
 hiding-0.067
 hidden-0.059
 seeker-0.056
Token understanding
Token ID4547
Feature activation+7.71e-02

Pos logit contributions
lee+0.069
alks+0.066
bled+0.064
urg+0.062
ums+0.061
Neg logit contributions
 hide-0.040
 tag-0.039
 hiding-0.039
 hidden-0.037
 like-0.034
Token the
Token ID262
Feature activation+3.86e-02

Pos logit contributions
lee+0.199
bled+0.178
atted+0.176
alks+0.174
urg+0.168
Neg logit contributions
 hide-0.105
 tag-0.100
 seeker-0.098
 hiding-0.097
 exit-0.089
Token lesson
Token ID11483
Feature activation+4.67e-02

Pos logit contributions
lee+0.093
bled+0.084
alks+0.082
atted+0.082
urg+0.079
Neg logit contributions
 hide-0.059
 tag-0.058
 hiding-0.055
 seeker-0.055
 hidden-0.052
Token from
Token ID422
Feature activation+3.03e-02

Pos logit contributions
 seeker+0.015
 tag+0.014
 hide+0.014
 hiding+0.013
 exit+0.013
Neg logit contributions
lee-0.025
bled-0.023
atted-0.023
alks-0.022
urg-0.022
Token the
Token ID262
Feature activation-8.61e-03

Pos logit contributions
lee+0.075
bled+0.068
alks+0.067
atted+0.066
urg+0.065
Neg logit contributions
 hide-0.045
 tag-0.043
 hiding-0.042
 seeker-0.040
 hidden-0.040
Token trap
Token ID12840
Feature activation-2.70e-02

Pos logit contributions
 seeker+0.010
 hide+0.010
 tag+0.010
 tim+0.009
 hiding+0.009
Neg logit contributions
lee-0.021
alks-0.019
bled-0.018
arted-0.017
rac-0.017
Token.
Token ID13
Feature activation-2.42e-03

Pos logit contributions
 seeker+0.039
 hide+0.035
 tag+0.033
 exit+0.032
 hiding+0.032
Neg logit contributions
lee-0.067
alks-0.060
bled-0.059
atted-0.059
urg-0.056
Token B
Token ID347
Feature activation+7.79e-02

Pos logit contributions
 hide+0.004
 tag+0.003
 hiding+0.003
 hidden+0.003
 shelter+0.003
Neg logit contributions
lee-0.005
bled-0.005
alks-0.005
atted-0.005
rac-0.005
Tokenamb
Token ID4131
Feature activation+1.22e-01

Pos logit contributions
lee+0.136
bled+0.130
alks+0.128
lights+0.127
atted+0.118
Neg logit contributions
 seeker-0.141
 hidden-0.138
 hide-0.137
 hiding-0.137
 tag-0.131
Tokeni
Token ID72
Feature activation+4.53e-02

Pos logit contributions
alks+0.203
lee+0.194
umm+0.190
lights+0.184
bled+0.181
Neg logit contributions
 hiding-0.224
 hide-0.220
 seeker-0.210
 hidden-0.206
 exit-0.198
Token was
Token ID373
Feature activation+2.26e-03

Pos logit contributions
lee+0.120
bled+0.110
alks+0.110
atted+0.106
urg+0.105
Neg logit contributions
 hide-0.057
 tag-0.055
 hiding-0.051
 hidden-0.048
 seeker-0.046
Token very
Token ID845
Feature activation-1.63e-02

Pos logit contributions
lee+0.006
alks+0.005
bled+0.005
atted+0.005
urg+0.005
Neg logit contributions
 hide-0.003
 tag-0.003
 hiding-0.003
 seeker-0.003
 hidden-0.003
Token happy
Token ID3772
Feature activation+2.49e-02

Pos logit contributions
 hide+0.025
 tag+0.024
 hiding+0.023
 hidden+0.023
 seeker+0.021
Neg logit contributions
lee-0.041
alks-0.036
bled-0.035
atted-0.035
urg-0.035
Token and
Token ID290
Feature activation+2.36e-02

Pos logit contributions
lee+0.062
bled+0.055
atted+0.055
alks+0.053
urg+0.052
Neg logit contributions
 hide-0.035
 seeker-0.034
 tag-0.034
 exit-0.031
 hiding-0.030
Token play
Token ID711
Feature activation-1.35e-02

Pos logit contributions
 hiding+0.000
 tag+0.000
 chamber+0.000
 hide+0.000
 seeker+0.000
Neg logit contributions
lee-0.000
bled-0.000
alks-0.000
urg-0.000
rac-0.000
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
 seeker+0.019
 exit+0.019
 locked+0.018
 library+0.018
 hiding+0.017
Neg logit contributions
bled-0.026
lee-0.025
urg-0.022
ums-0.021
lights-0.021
Token<|endoftext|>
Token ID50256
Feature activation-5.69e-02

Pos logit contributions
 hide+0.030
 tag+0.029
 hiding+0.028
 seeker+0.027
 hidden+0.026
Neg logit contributions
lee-0.042
bled-0.038
alks-0.037
atted-0.037
urg-0.035
TokenOnce
Token ID7454
Feature activation-5.99e-02

Pos logit contributions
 hide+0.101
 seeker+0.100
 tag+0.099
 hiding+0.097
 exit+0.094
Neg logit contributions
lee-0.112
bled-0.100
alks-0.095
atted-0.094
ipt-0.093
Token upon
Token ID2402
Feature activation-5.87e-02

Pos logit contributions
 hide+0.052
 hiding+0.049
 tag+0.047
 seeker+0.044
 hidden+0.043
Neg logit contributions
lee-0.181
bled-0.168
alks-0.165
urg-0.163
atted-0.162
Token a
Token ID257
Feature activation-1.34e-01

Pos logit contributions
 seeker+0.078
 tag+0.068
 hide+0.068
 hiding+0.068
 exit+0.067
Neg logit contributions
lee-0.129
bled-0.124
ipt-0.117
rac-0.115
ped-0.114
Token time
Token ID640
Feature activation-1.10e-01

Pos logit contributions
 tag+0.156
 hide+0.155
 seeker+0.151
 hiding+0.141
 exit+0.138
Neg logit contributions
lee-0.339
atted-0.306
bled-0.302
um-0.298
rac-0.295
Token,
Token ID11
Feature activation-1.07e-01

Pos logit contributions
 hide+0.126
 tag+0.126
 seeker+0.121
 exit+0.115
 hiding+0.108
Neg logit contributions
lee-0.287
bled-0.264
atted-0.263
alks-0.255
rac-0.253
Token there
Token ID612
Feature activation-5.53e-02

Pos logit contributions
 tag+0.109
 seeker+0.108
 hide+0.106
 exit+0.097
 library+0.092
Neg logit contributions
bled-0.279
lee-0.275
atted-0.262
rac-0.259
ipt-0.247
Token was
Token ID373
Feature activation-1.02e-01

Pos logit contributions
 seeker+0.075
 hide+0.073
 tag+0.072
 hiding+0.068
 exit+0.067
Neg logit contributions
lee-0.132
bled-0.122
atted-0.119
ipt-0.115
umbled-0.114
Token a
Token ID257
Feature activation-1.26e-01

Pos logit contributions
 tag+0.117
 hide+0.116
 seeker+0.111
 hiding+0.110
 exit+0.108
Neg logit contributions
lee-0.251
bled-0.238
alks-0.221
um-0.220
ipt-0.219
Token down
Token ID866
Feature activation-2.17e-02

Pos logit contributions
 hide+0.020
 tag+0.019
 seeker+0.019
 hiding+0.019
 exit+0.017
Neg logit contributions
lee-0.026
bled-0.024
atted-0.023
alks-0.023
arted-0.023
Token.
Token ID13
Feature activation-1.51e-02

Pos logit contributions
 seeker+0.029
 hide+0.029
 tag+0.028
 hiding+0.027
 exit+0.025
Neg logit contributions
lee-0.053
bled-0.049
alks-0.047
umbled-0.046
atted-0.045
Token<|endoftext|>
Token ID50256
Feature activation-5.47e-02

Pos logit contributions
 hide+0.023
 tag+0.022
 hiding+0.021
 seeker+0.020
 hidden+0.020
Neg logit contributions
lee-0.036
bled-0.033
alks-0.032
atted-0.032
urg-0.031
TokenOnce
Token ID7454
Feature activation-5.66e-02

Pos logit contributions
 seeker+0.101
 hide+0.100
 tag+0.098
 hiding+0.096
 exit+0.094
Neg logit contributions
lee-0.104
bled-0.091
alks-0.088
atted-0.086
ipt-0.085
Token upon
Token ID2402
Feature activation-5.63e-02

Pos logit contributions
 hide+0.048
 hiding+0.046
 tag+0.043
 seeker+0.040
 hidden+0.040
Neg logit contributions
lee-0.172
bled-0.160
alks-0.157
urg-0.156
atted-0.154
Token a
Token ID257
Feature activation-1.32e-01

Pos logit contributions
 seeker+0.074
 tag+0.066
 hide+0.065
 exit+0.065
 hiding+0.064
Neg logit contributions
lee-0.126
bled-0.122
ipt-0.113
ped-0.111
urg-0.111
Token time
Token ID640
Feature activation-1.09e-01

Pos logit contributions
 tag+0.153
 hide+0.151
 seeker+0.147
 hiding+0.136
 exit+0.134
Neg logit contributions
lee-0.341
atted-0.307
bled-0.303
um-0.297
alks-0.295
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
 hide+0.124
 tag+0.123
 seeker+0.118
 exit+0.113
 hiding+0.106
Neg logit contributions
lee-0.286
bled-0.263
atted-0.261
alks-0.254
rac-0.251
Token there
Token ID612
Feature activation-5.45e-02

Pos logit contributions
 tag+0.102
 hide+0.100
 seeker+0.097
 exit+0.088
 library+0.085
Neg logit contributions
bled-0.285
lee-0.281
atted-0.264
rac-0.260
ipt-0.251
Token was
Token ID373
Feature activation-1.00e-01

Pos logit contributions
 seeker+0.071
 hide+0.071
 tag+0.070
 hiding+0.065
 exit+0.065
Neg logit contributions
lee-0.133
bled-0.123
atted-0.119
ipt-0.115
umbled-0.114
Token a
Token ID257
Feature activation-1.24e-01

Pos logit contributions
 tag+0.107
 hide+0.107
 seeker+0.100
 hiding+0.100
 exit+0.097
Neg logit contributions
lee-0.263
bled-0.247
alks-0.233
atted-0.228
um-0.228
Token day
Token ID1110
Feature activation-2.45e-02

Pos logit contributions
 hide+0.018
 tag+0.017
 hiding+0.016
 seeker+0.016
 hidden+0.015
Neg logit contributions
lee-0.036
bled-0.033
alks-0.032
atted-0.032
urg-0.031
Token.
Token ID13
Feature activation-8.44e-03

Pos logit contributions
 seeker+0.029
 hide+0.028
 tag+0.027
 exit+0.027
 hiding+0.025
Neg logit contributions
lee-0.065
bled-0.059
alks-0.059
atted-0.055
ums-0.055
Token<|endoftext|>
Token ID50256
Feature activation-5.19e-02

Pos logit contributions
 hide+0.012
 tag+0.011
 hiding+0.011
 seeker+0.010
 hidden+0.010
Neg logit contributions
lee-0.021
bled-0.020
atted-0.019
urg-0.019
alks-0.019
TokenOnce
Token ID7454
Feature activation-5.37e-02

Pos logit contributions
 hide+0.096
 seeker+0.096
 tag+0.095
 hiding+0.093
 exit+0.090
Neg logit contributions
lee-0.097
bled-0.085
alks-0.082
ipt-0.081
atted-0.080
Token upon
Token ID2402
Feature activation-5.67e-02

Pos logit contributions
 hide+0.050
 hiding+0.047
 tag+0.046
 seeker+0.044
 hidden+0.042
Neg logit contributions
lee-0.158
bled-0.147
alks-0.144
urg-0.142
atted-0.142
Token a
Token ID257
Feature activation-1.32e-01

Pos logit contributions
 seeker+0.076
 hide+0.067
 hiding+0.067
 tag+0.067
 exit+0.066
Neg logit contributions
lee-0.124
bled-0.119
ipt-0.112
rac-0.110
ped-0.109
Token time
Token ID640
Feature activation-1.08e-01

Pos logit contributions
 tag+0.152
 hide+0.151
 seeker+0.146
 hiding+0.137
 exit+0.133
Neg logit contributions
lee-0.341
atted-0.307
bled-0.301
rac-0.296
um-0.296
Token,
Token ID11
Feature activation-1.06e-01

Pos logit contributions
 hide+0.122
 tag+0.121
 seeker+0.116
 exit+0.111
 hiding+0.105
Neg logit contributions
lee-0.285
bled-0.262
atted-0.261
alks-0.254
rac-0.251
Token in
Token ID287
Feature activation-9.48e-02

Pos logit contributions
 tag+0.107
 hide+0.104
 seeker+0.103
 exit+0.094
 library+0.089
Neg logit contributions
bled-0.280
lee-0.273
atted-0.261
rac-0.255
eled-0.249
Token a
Token ID257
Feature activation-1.27e-01

Pos logit contributions
 hide+0.100
 seeker+0.100
 tag+0.095
 hiding+0.092
 exit+0.090
Neg logit contributions
lee-0.240
bled-0.227
atted-0.218
rac-0.218
alks-0.215
Token thick
Token ID6546
Feature activation-8.85e-02

Pos logit contributions
 tag+0.171
 seeker+0.166
 hide+0.166
 hiding+0.152
 exit+0.150
Neg logit contributions
lee-0.308
bled-0.266
alks-0.265
atted-0.264
eled-0.260
Token ball
Token ID2613
Feature activation-5.15e-02

Pos logit contributions
 seeker+0.014
 hide+0.013
 hiding+0.013
 maze+0.012
 chamber+0.012
Neg logit contributions
lee-0.020
atted-0.019
alks-0.018
ums-0.018
bled-0.018
Token.
Token ID13
Feature activation-1.58e-02

Pos logit contributions
 seeker+0.080
 library+0.068
 tag+0.067
 tim+0.066
 hide+0.065
Neg logit contributions
lee-0.107
bled-0.100
atted-0.098
alks-0.096
arted-0.096
Token<|endoftext|>
Token ID50256
Feature activation-5.26e-02

Pos logit contributions
 hide+0.021
 tag+0.020
 hiding+0.018
 gym+0.017
 seeker+0.016
Neg logit contributions
lee-0.041
bled-0.040
urg-0.038
atted-0.037
alks-0.037
TokenOnce
Token ID7454
Feature activation-5.28e-02

Pos logit contributions
 hide+0.093
 seeker+0.092
 tag+0.092
 hiding+0.089
 exit+0.087
Neg logit contributions
lee-0.105
bled-0.093
alks-0.090
atted-0.088
ipt-0.087
Token upon
Token ID2402
Feature activation-5.43e-02

Pos logit contributions
 hide+0.043
 hiding+0.041
 tag+0.038
 seeker+0.036
 hidden+0.035
Neg logit contributions
lee-0.162
bled-0.152
alks-0.148
atted-0.146
urg-0.146
Token a
Token ID257
Feature activation-1.32e-01

Pos logit contributions
 seeker+0.062
 hide+0.058
 tag+0.057
 hiding+0.056
 exit+0.055
Neg logit contributions
lee-0.133
bled-0.126
ipt-0.119
rac-0.118
urg-0.117
Token time
Token ID640
Feature activation-1.07e-01

Pos logit contributions
 hide+0.141
 tag+0.139
 seeker+0.132
 hiding+0.128
 exit+0.122
Neg logit contributions
lee-0.357
atted-0.319
bled-0.319
alks-0.317
rac-0.311
Token,
Token ID11
Feature activation-9.93e-02

Pos logit contributions
 hide+0.116
 tag+0.115
 seeker+0.110
 exit+0.104
 hiding+0.100
Neg logit contributions
lee-0.286
bled-0.265
atted-0.262
alks-0.256
rac-0.253
Token there
Token ID612
Feature activation-4.86e-02

Pos logit contributions
 hide+0.083
 tag+0.083
 seeker+0.079
 exit+0.071
 hiding+0.068
Neg logit contributions
lee-0.285
bled-0.280
atted-0.266
rac-0.264
ipt-0.258
Token was
Token ID373
Feature activation-1.03e-01

Pos logit contributions
 hide+0.059
 seeker+0.059
 tag+0.058
 hiding+0.055
 exit+0.053
Neg logit contributions
lee-0.124
bled-0.114
atted-0.112
alks-0.108
ipt-0.108
Token a
Token ID257
Feature activation-1.24e-01

Pos logit contributions
 hide+0.101
 tag+0.100
 hiding+0.092
 seeker+0.092
 exit+0.087
Neg logit contributions
lee-0.291
bled-0.271
alks-0.262
atted-0.256
urg-0.254
Token face
Token ID1986
Feature activation+9.78e-03

Pos logit contributions
lee+0.029
atted+0.027
bled+0.027
alks+0.026
arted+0.026
Neg logit contributions
 hide-0.031
 tag-0.031
 hiding-0.030
 hidden-0.029
 exit-0.028
Token.
Token ID13
Feature activation-6.65e-03

Pos logit contributions
lee+0.025
bled+0.022
alks+0.022
atted+0.022
ment+0.021
Neg logit contributions
 hide-0.013
 seeker-0.013
 tag-0.012
 hiding-0.012
 maze-0.011
Token<|endoftext|>
Token ID50256
Feature activation-5.75e-02

Pos logit contributions
 hide+0.012
 tag+0.012
 hiding+0.012
 seeker+0.012
 hidden+0.011
Neg logit contributions
lee-0.014
bled-0.012
alks-0.012
atted-0.012
urg-0.011
TokenOnce
Token ID7454
Feature activation-5.53e-02

Pos logit contributions
 hide+0.105
 seeker+0.105
 tag+0.104
 hiding+0.101
 exit+0.098
Neg logit contributions
lee-0.107
bled-0.095
alks-0.091
ipt-0.090
atted-0.089
Token upon
Token ID2402
Feature activation-5.70e-02

Pos logit contributions
 hide+0.049
 hiding+0.047
 tag+0.045
 seeker+0.043
 hidden+0.041
Neg logit contributions
lee-0.166
bled-0.154
alks-0.151
atted-0.149
urg-0.148
Token a
Token ID257
Feature activation-1.32e-01

Pos logit contributions
 seeker+0.068
 tag+0.064
 hide+0.063
 hiding+0.062
 exit+0.060
Neg logit contributions
lee-0.132
bled-0.127
ipt-0.120
rac-0.118
ped-0.117
Token time
Token ID640
Feature activation-1.08e-01

Pos logit contributions
 tag+0.160
 hide+0.158
 seeker+0.153
 hiding+0.143
 exit+0.141
Neg logit contributions
lee-0.323
atted-0.289
um-0.287
bled-0.286
rac-0.283
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
 tag+0.133
 hide+0.131
 seeker+0.127
 exit+0.121
 hidden+0.112
Neg logit contributions
lee-0.267
atted-0.249
bled-0.249
rac-0.241
alks-0.238
Token there
Token ID612
Feature activation-5.30e-02

Pos logit contributions
 tag+0.101
 hide+0.097
 seeker+0.096
 exit+0.084
 hiding+0.082
Neg logit contributions
bled-0.283
lee-0.278
atted-0.266
rac-0.261
ipt-0.254
Token was
Token ID373
Feature activation-1.02e-01

Pos logit contributions
 seeker+0.071
 hide+0.070
 tag+0.069
 hiding+0.064
 exit+0.064
Neg logit contributions
lee-0.125
bled-0.117
atted-0.113
ipt-0.110
umbled-0.109
Token a
Token ID257
Feature activation-1.25e-01

Pos logit contributions
 tag+0.108
 hide+0.107
 hiding+0.101
 seeker+0.099
 exit+0.094
Neg logit contributions
lee-0.269
bled-0.254
alks-0.241
ipt-0.236
atted-0.235
Token tomorrow
Token ID9439
Feature activation-2.69e-02

Pos logit contributions
 tag+0.017
 hide+0.017
 seeker+0.016
 hiding+0.016
 exit+0.015
Neg logit contributions
lee-0.033
bled-0.029
alks-0.029
rac-0.029
atted-0.028
Token.
Token ID13
Feature activation-7.99e-03

Pos logit contributions
 tag+0.034
 hide+0.034
 seeker+0.033
 hiding+0.032
 exit+0.030
Neg logit contributions
lee-0.066
alks-0.058
bled-0.057
rac-0.056
ums-0.055
Token<|endoftext|>
Token ID50256
Feature activation-5.78e-02

Pos logit contributions
 hide+0.014
 tag+0.014
 hiding+0.013
 seeker+0.013
 hidden+0.013
Neg logit contributions
lee-0.017
bled-0.016
atted-0.015
alks-0.015
urg-0.015
TokenOnce
Token ID7454
Feature activation-5.85e-02

Pos logit contributions
 hide+0.105
 seeker+0.105
 tag+0.104
 hiding+0.101
 exit+0.099
Neg logit contributions
lee-0.108
bled-0.096
alks-0.092
ipt-0.091
atted-0.090
Token upon
Token ID2402
Feature activation-5.57e-02

Pos logit contributions
 hide+0.056
 tag+0.052
 hiding+0.051
 seeker+0.049
 hidden+0.045
Neg logit contributions
lee-0.173
bled-0.159
alks-0.156
atted-0.155
urg-0.153
Token a
Token ID257
Feature activation-1.32e-01

Pos logit contributions
 seeker+0.066
 tag+0.061
 hide+0.061
 hiding+0.060
 exit+0.058
Neg logit contributions
lee-0.130
bled-0.126
ipt-0.118
rac-0.117
ped-0.115
Token time
Token ID640
Feature activation-1.08e-01

Pos logit contributions
 tag+0.162
 hide+0.160
 seeker+0.155
 hiding+0.147
 exit+0.144
Neg logit contributions
lee-0.316
atted-0.284
um-0.283
bled-0.282
rac-0.280
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
 tag+0.126
 hide+0.126
 seeker+0.121
 exit+0.115
 hiding+0.108
Neg logit contributions
lee-0.275
bled-0.256
atted-0.255
rac-0.247
alks-0.247
Token there
Token ID612
Feature activation-5.12e-02

Pos logit contributions
 tag+0.106
 seeker+0.104
 hide+0.102
 exit+0.094
 library+0.091
Neg logit contributions
bled-0.271
lee-0.261
rac-0.252
atted-0.251
ipt-0.239
Token was
Token ID373
Feature activation-9.83e-02

Pos logit contributions
 seeker+0.068
 hide+0.067
 tag+0.067
 hiding+0.063
 exit+0.062
Neg logit contributions
lee-0.121
bled-0.114
atted-0.109
ipt-0.107
umbled-0.106
Token a
Token ID257
Feature activation-1.24e-01

Pos logit contributions
 tag+0.107
 hide+0.106
 hiding+0.098
 seeker+0.097
 exit+0.093
Neg logit contributions
lee-0.257
bled-0.243
alks-0.230
atted-0.225
ipt-0.224
Token long
Token ID890
Feature activation+2.25e-02

Pos logit contributions
 hide+0.027
 seeker+0.026
 tag+0.026
 exit+0.025
 hiding+0.024
Neg logit contributions
lee-0.068
bled-0.062
alks-0.062
atted-0.059
ums-0.057
Token.
Token ID13
Feature activation-1.18e-02

Pos logit contributions
lee+0.058
bled+0.053
atted+0.051
alks+0.051
arted+0.048
Neg logit contributions
 seeker-0.029
 hide-0.028
 exit-0.026
 tag-0.025
 maze-0.025
Token<|endoftext|>
Token ID50256
Feature activation-5.45e-02

Pos logit contributions
 hide+0.019
 tag+0.018
 hiding+0.017
 seeker+0.016
 hidden+0.016
Neg logit contributions
lee-0.028
bled-0.026
alks-0.025
urg-0.024
atted-0.024
TokenOnce
Token ID7454
Feature activation-5.75e-02

Pos logit contributions
 hide+0.099
 seeker+0.099
 tag+0.098
 hiding+0.095
 exit+0.093
Neg logit contributions
lee-0.104
bled-0.092
alks-0.089
ipt-0.087
atted-0.086
Token upon
Token ID2402
Feature activation-5.80e-02

Pos logit contributions
 hide+0.047
 hiding+0.045
 tag+0.042
 seeker+0.040
 hidden+0.039
Neg logit contributions
lee-0.176
bled-0.163
alks-0.161
urg-0.159
atted-0.158
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.074
 tag+0.066
 hiding+0.066
 hide+0.066
 exit+0.065
Neg logit contributions
lee-0.131
bled-0.126
ipt-0.117
rac-0.117
ped-0.116
Token time
Token ID640
Feature activation-1.08e-01

Pos logit contributions
 tag+0.153
 hide+0.151
 seeker+0.147
 hiding+0.137
 exit+0.135
Neg logit contributions
lee-0.336
atted-0.302
bled-0.298
um-0.294
alks-0.293
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
 hide+0.118
 tag+0.116
 seeker+0.111
 exit+0.105
 hiding+0.101
Neg logit contributions
lee-0.291
bled-0.269
atted-0.266
alks-0.261
rac-0.257
Token there
Token ID612
Feature activation-5.19e-02

Pos logit contributions
 tag+0.101
 hide+0.099
 seeker+0.099
 exit+0.089
 library+0.085
Neg logit contributions
bled-0.277
lee-0.274
atted-0.259
rac-0.256
ipt-0.245
Token was
Token ID373
Feature activation-1.02e-01

Pos logit contributions
 seeker+0.068
 hide+0.066
 tag+0.066
 hiding+0.062
 exit+0.061
Neg logit contributions
lee-0.126
bled-0.117
atted-0.114
ipt-0.110
umbled-0.109
Token a
Token ID257
Feature activation-1.23e-01

Pos logit contributions
 tag+0.109
 hide+0.108
 seeker+0.103
 hiding+0.102
 exit+0.099
Neg logit contributions
lee-0.267
bled-0.248
alks-0.236
atted-0.230
ipt-0.230
Token song
Token ID3496
Feature activation-2.09e-02

Pos logit contributions
lee+0.013
alks+0.012
bled+0.012
atted+0.011
urg+0.011
Neg logit contributions
 hide-0.011
 seeker-0.011
 tag-0.011
 hiding-0.010
 exit-0.010
Token.
Token ID13
Feature activation-7.16e-03

Pos logit contributions
 seeker+0.029
 hide+0.028
 maze+0.028
 exit+0.026
 hiding+0.026
Neg logit contributions
lee-0.046
bled-0.044
atted-0.042
ped-0.040
alks-0.040
Token<|endoftext|>
Token ID50256
Feature activation-5.31e-02

Pos logit contributions
 hide+0.010
 tag+0.010
 hiding+0.009
 hidden+0.008
 seeker+0.008
Neg logit contributions
lee-0.018
bled-0.017
urg-0.016
atted-0.016
alks-0.016
TokenOnce
Token ID7454
Feature activation-6.17e-02

Pos logit contributions
 hide+0.098
 seeker+0.097
 tag+0.096
 hiding+0.094
 exit+0.092
Neg logit contributions
lee-0.099
bled-0.088
alks-0.085
ipt-0.083
atted-0.082
Token upon
Token ID2402
Feature activation-5.32e-02

Pos logit contributions
 hide+0.055
 hiding+0.052
 tag+0.051
 seeker+0.048
 hidden+0.045
Neg logit contributions
lee-0.186
bled-0.171
alks-0.169
atted-0.167
urg-0.165
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.062
 hide+0.058
 tag+0.057
 hiding+0.056
 exit+0.055
Neg logit contributions
lee-0.129
bled-0.123
rac-0.115
ipt-0.115
urg-0.113
Token time
Token ID640
Feature activation-1.11e-01

Pos logit contributions
 tag+0.152
 hide+0.152
 seeker+0.145
 hiding+0.138
 exit+0.135
Neg logit contributions
lee-0.337
atted-0.301
bled-0.300
rac-0.295
alks-0.294
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
 hide+0.130
 tag+0.129
 seeker+0.125
 exit+0.120
 hiding+0.112
Neg logit contributions
lee-0.285
bled-0.263
atted-0.262
rac-0.255
alks-0.254
Token there
Token ID612
Feature activation-5.23e-02

Pos logit contributions
 tag+0.101
 hide+0.100
 seeker+0.098
 exit+0.089
 library+0.084
Neg logit contributions
bled-0.281
lee-0.275
rac-0.262
atted-0.261
ipt-0.248
Token was
Token ID373
Feature activation-1.01e-01

Pos logit contributions
 seeker+0.070
 hide+0.069
 tag+0.068
 hiding+0.064
 exit+0.063
Neg logit contributions
lee-0.126
bled-0.117
atted-0.113
ipt-0.110
umbled-0.109
Token a
Token ID257
Feature activation-1.22e-01

Pos logit contributions
 tag+0.106
 hide+0.106
 seeker+0.097
 hiding+0.097
 exit+0.094
Neg logit contributions
lee-0.271
bled-0.254
alks-0.242
atted-0.237
urg-0.235
Token faces
Token ID6698
Feature activation+4.59e-03

Pos logit contributions
bled+0.014
lee+0.013
ipt+0.013
atted+0.013
alks+0.013
Neg logit contributions
 hide-0.025
 hiding-0.025
 hidden-0.024
 tag-0.024
 exit-0.024
Token.
Token ID13
Feature activation-7.20e-03

Pos logit contributions
lee+0.012
bled+0.011
atted+0.010
alks+0.010
urg+0.010
Neg logit contributions
 hide-0.006
 seeker-0.006
 tag-0.006
 maze-0.005
 hiding-0.005
Token<|endoftext|>
Token ID50256
Feature activation-5.42e-02

Pos logit contributions
 hide+0.013
 tag+0.012
 seeker+0.012
 hiding+0.012
 hidden+0.011
Neg logit contributions
lee-0.016
bled-0.014
atted-0.014
alks-0.014
urg-0.013
TokenOnce
Token ID7454
Feature activation-5.77e-02

Pos logit contributions
 hide+0.100
 seeker+0.100
 tag+0.098
 hiding+0.096
 exit+0.094
Neg logit contributions
lee-0.101
bled-0.090
alks-0.085
ipt-0.085
atted-0.084
Token upon
Token ID2402
Feature activation-5.56e-02

Pos logit contributions
 hide+0.053
 hiding+0.049
 tag+0.048
 seeker+0.045
 hidden+0.043
Neg logit contributions
lee-0.172
bled-0.159
alks-0.157
atted-0.154
urg-0.154
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.072
 tag+0.064
 hide+0.064
 hiding+0.064
 exit+0.063
Neg logit contributions
lee-0.123
bled-0.119
ipt-0.111
rac-0.110
ped-0.109
Token time
Token ID640
Feature activation-1.07e-01

Pos logit contributions
 tag+0.159
 hide+0.157
 seeker+0.153
 hiding+0.143
 exit+0.141
Neg logit contributions
lee-0.323
atted-0.292
bled-0.287
um-0.285
rac-0.283
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
 tag+0.132
 hide+0.131
 seeker+0.127
 exit+0.122
 hidden+0.113
Neg logit contributions
lee-0.267
atted-0.247
bled-0.246
rac-0.239
alks-0.236
Token there
Token ID612
Feature activation-5.07e-02

Pos logit contributions
 tag+0.109
 seeker+0.107
 hide+0.106
 exit+0.096
 library+0.094
Neg logit contributions
bled-0.272
lee-0.263
atted-0.254
rac-0.252
umbled-0.241
Token was
Token ID373
Feature activation-1.00e-01

Pos logit contributions
 seeker+0.071
 hide+0.069
 tag+0.068
 exit+0.064
 hiding+0.064
Neg logit contributions
lee-0.117
bled-0.109
atted-0.106
ipt-0.103
umbled-0.103
Token a
Token ID257
Feature activation-1.24e-01

Pos logit contributions
 tag+0.115
 hide+0.114
 seeker+0.107
 hiding+0.107
 exit+0.104
Neg logit contributions
lee-0.249
bled-0.235
alks-0.220
um-0.218
ipt-0.217
Token yard
Token ID12699
Feature activation-3.66e-02

Pos logit contributions
 tag+0.020
 seeker+0.020
 hide+0.019
 hiding+0.018
 anywhere+0.018
Neg logit contributions
lee-0.031
alks-0.028
bled-0.027
um-0.026
atted-0.025
Token.
Token ID13
Feature activation-5.46e-03

Pos logit contributions
 seeker+0.047
 tag+0.044
 hide+0.043
 exit+0.042
 hiding+0.040
Neg logit contributions
lee-0.085
bled-0.080
alks-0.077
ums-0.076
atted-0.075
Token<|endoftext|>
Token ID50256
Feature activation-5.41e-02

Pos logit contributions
 hide+0.008
 tag+0.008
 hiding+0.008
 seeker+0.008
 hidden+0.007
Neg logit contributions
lee-0.013
bled-0.012
atted-0.011
alks-0.011
urg-0.011
TokenOnce
Token ID7454
Feature activation-6.00e-02

Pos logit contributions
 hide+0.103
 seeker+0.103
 tag+0.102
 hiding+0.099
 exit+0.098
Neg logit contributions
lee-0.096
bled-0.084
alks-0.081
ipt-0.080
atted-0.078
Token upon
Token ID2402
Feature activation-5.78e-02

Pos logit contributions
 hide+0.055
 hiding+0.051
 tag+0.050
 seeker+0.047
 hidden+0.045
Neg logit contributions
lee-0.180
bled-0.166
alks-0.163
atted-0.162
urg-0.160
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.077
 tag+0.068
 hide+0.067
 hiding+0.067
 exit+0.067
Neg logit contributions
lee-0.126
bled-0.121
ipt-0.113
ped-0.112
rac-0.111
Token time
Token ID640
Feature activation-1.07e-01

Pos logit contributions
 tag+0.161
 hide+0.157
 seeker+0.156
 exit+0.142
 hiding+0.141
Neg logit contributions
lee-0.320
atted-0.286
um-0.282
bled-0.280
rac-0.277
Token,
Token ID11
Feature activation-1.05e-01

Pos logit contributions
 tag+0.125
 hide+0.125
 seeker+0.121
 exit+0.115
 hidden+0.107
Neg logit contributions
lee-0.273
bled-0.251
atted-0.251
alks-0.244
rac-0.241
Token there
Token ID612
Feature activation-5.30e-02

Pos logit contributions
 tag+0.113
 hide+0.109
 seeker+0.107
 library+0.101
 exit+0.101
Neg logit contributions
bled-0.270
lee-0.258
atted-0.247
rac-0.245
eled-0.233
Token was
Token ID373
Feature activation-9.98e-02

Pos logit contributions
 seeker+0.073
 hide+0.071
 tag+0.071
 exit+0.067
 hiding+0.066
Neg logit contributions
lee-0.123
bled-0.114
atted-0.111
ipt-0.108
umbled-0.107
Token a
Token ID257
Feature activation-1.23e-01

Pos logit contributions
 tag+0.114
 hide+0.113
 hiding+0.107
 seeker+0.107
 exit+0.105
Neg logit contributions
lee-0.248
bled-0.235
alks-0.219
um-0.216
ipt-0.215
Token place
Token ID1295
Feature activation-2.65e-02

Pos logit contributions
lee+0.070
lights+0.066
ment+0.064
bled+0.063
atted+0.062
Neg logit contributions
 seeker-0.033
 hide-0.031
 tag-0.030
house-0.028
 library-0.028
Token!
Token ID0
Feature activation-1.97e-03

Pos logit contributions
 seeker+0.044
 tag+0.037
 hide+0.036
 maze+0.034
 exit+0.033
Neg logit contributions
lee-0.059
atted-0.051
bled-0.050
alks-0.049
ment-0.048
Token<|endoftext|>
Token ID50256
Feature activation-5.20e-02

Pos logit contributions
 hide+0.003
 tag+0.003
 hiding+0.003
 seeker+0.003
 exit+0.003
Neg logit contributions
lee-0.005
bled-0.004
atted-0.004
alks-0.004
urg-0.004
TokenOnce
Token ID7454
Feature activation-5.71e-02

Pos logit contributions
 hide+0.104
 seeker+0.103
 tag+0.102
 hiding+0.100
 exit+0.099
Neg logit contributions
lee-0.085
bled-0.073
ipt-0.073
alks-0.071
umbled-0.070
Token upon
Token ID2402
Feature activation-5.68e-02

Pos logit contributions
 hide+0.053
 tag+0.050
 hiding+0.049
 seeker+0.046
 hidden+0.043
Neg logit contributions
lee-0.170
bled-0.157
alks-0.154
atted-0.153
urg-0.150
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.078
 hiding+0.069
 hide+0.068
 tag+0.068
 exit+0.068
Neg logit contributions
lee-0.119
bled-0.115
ipt-0.107
ped-0.106
rac-0.106
Token time
Token ID640
Feature activation-1.06e-01

Pos logit contributions
 tag+0.164
 hide+0.161
 seeker+0.158
 hiding+0.146
 exit+0.145
Neg logit contributions
lee-0.312
atted-0.279
um-0.277
bled-0.272
rac-0.272
Token there
Token ID612
Feature activation-5.15e-02

Pos logit contributions
 hide+0.126
 tag+0.126
 seeker+0.121
 exit+0.116
 hiding+0.108
Neg logit contributions
lee-0.272
atted-0.250
bled-0.250
alks-0.241
rac-0.239
Token was
Token ID373
Feature activation-1.05e-01

Pos logit contributions
 seeker+0.072
 hide+0.071
 tag+0.070
 hiding+0.066
 exit+0.066
Neg logit contributions
lee-0.121
bled-0.110
atted-0.108
umbled-0.104
ipt-0.104
Token a
Token ID257
Feature activation-1.25e-01

Pos logit contributions
 tag+0.120
 hide+0.118
 seeker+0.112
 hiding+0.111
 exit+0.110
Neg logit contributions
lee-0.265
bled-0.252
alks-0.232
ipt-0.231
um-0.231
Token small
Token ID1402
Feature activation-7.77e-02

Pos logit contributions
 anywhere+0.223
 hide+0.221
 tag+0.219
 hiding+0.211
 exit+0.207
Neg logit contributions
lee-0.229
Dad-0.187
um-0.186
bled-0.185
alks-0.177
Token come
Token ID1282
Feature activation+1.15e-02

Pos logit contributions
 seeker+0.009
 hiding+0.009
 tag+0.009
 hide+0.008
 maze+0.008
Neg logit contributions
lee-0.015
rac-0.013
bled-0.013
alks-0.013
urg-0.013
Token.
Token ID13
Feature activation-8.50e-03

Pos logit contributions
lee+0.021
bled+0.017
rac+0.017
um+0.016
alks+0.016
Neg logit contributions
 tag-0.020
 seeker-0.019
 gym-0.019
 library-0.019
 maze-0.018
Token<|endoftext|>
Token ID50256
Feature activation-5.10e-02

Pos logit contributions
 hide+0.016
 hiding+0.015
 tag+0.015
 seeker+0.015
 hidden+0.014
Neg logit contributions
lee-0.017
bled-0.015
alks-0.015
atted-0.015
rac-0.014
TokenOnce
Token ID7454
Feature activation-5.19e-02

Pos logit contributions
 seeker+0.102
 hide+0.101
 hiding+0.098
 tag+0.097
 exit+0.095
Neg logit contributions
lee-0.083
bled-0.072
ipt-0.072
alks-0.070
umbled-0.067
Token upon
Token ID2402
Feature activation-5.80e-02

Pos logit contributions
 hide+0.056
 tag+0.053
 hiding+0.051
 seeker+0.050
 hidden+0.046
Neg logit contributions
lee-0.147
bled-0.135
alks-0.132
atted-0.132
urg-0.129
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.081
 hiding+0.072
 hide+0.070
 tag+0.069
 maze+0.068
Neg logit contributions
lee-0.115
bled-0.111
rac-0.107
ipt-0.106
ped-0.106
Token time
Token ID640
Feature activation-9.78e-02

Pos logit contributions
 tag+0.164
 hide+0.162
 seeker+0.158
 hiding+0.147
 exit+0.143
Neg logit contributions
lee-0.318
um-0.280
atted-0.278
rac-0.277
alks-0.275
Token,
Token ID11
Feature activation-9.84e-02

Pos logit contributions
 hide+0.127
 tag+0.127
 seeker+0.124
 exit+0.117
 hidden+0.109
Neg logit contributions
lee-0.235
atted-0.218
bled-0.216
rac-0.210
alks-0.206
Token there
Token ID612
Feature activation-4.69e-02

Pos logit contributions
 seeker+0.117
 hide+0.106
 tag+0.104
 exit+0.099
 library+0.097
Neg logit contributions
bled-0.239
rac-0.227
lee-0.216
ipt-0.215
atted-0.212
Token was
Token ID373
Feature activation-1.00e-01

Pos logit contributions
 seeker+0.070
 hide+0.067
 tag+0.065
 hiding+0.063
 exit+0.062
Neg logit contributions
lee-0.102
bled-0.096
atted-0.092
ipt-0.092
umbled-0.091
Token a
Token ID257
Feature activation-1.23e-01

Pos logit contributions
 tag+0.114
 hide+0.113
 seeker+0.107
 hiding+0.106
 exit+0.102
Neg logit contributions
lee-0.255
bled-0.236
alks-0.225
ipt-0.224
um-0.222
Token tree
Token ID5509
Feature activation-3.70e-02

Pos logit contributions
 seeker+0.053
 tag+0.050
 hide+0.049
 hiding+0.046
 exit+0.042
Neg logit contributions
lee-0.094
alks-0.081
atted-0.079
arted-0.076
bled-0.076
Token.
Token ID13
Feature activation-1.07e-03

Pos logit contributions
 seeker+0.056
 tag+0.049
 hide+0.047
 exit+0.044
 hiding+0.043
Neg logit contributions
lee-0.086
alks-0.076
atted-0.074
ums-0.070
bled-0.070
Token<|endoftext|>
Token ID50256
Feature activation-5.00e-02

Pos logit contributions
 hide+0.001
 tag+0.001
 hiding+0.001
 seeker+0.001
 hidden+0.001
Neg logit contributions
lee-0.003
bled-0.003
aned-0.003
atted-0.003
urg-0.003
TokenOnce
Token ID7454
Feature activation-6.06e-02

Pos logit contributions
 hide+0.094
 seeker+0.094
 tag+0.092
 hiding+0.090
 exit+0.089
Neg logit contributions
lee-0.091
bled-0.081
alks-0.077
ipt-0.075
atted-0.075
Token upon
Token ID2402
Feature activation-5.39e-02

Pos logit contributions
 hide+0.055
 hiding+0.051
 tag+0.051
 seeker+0.048
 hidden+0.045
Neg logit contributions
lee-0.181
bled-0.167
alks-0.164
atted-0.163
urg-0.161
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.070
 hide+0.062
 exit+0.062
 tag+0.061
 hiding+0.061
Neg logit contributions
lee-0.122
bled-0.117
ipt-0.108
rac-0.108
urg-0.107
Token time
Token ID640
Feature activation-1.09e-01

Pos logit contributions
 tag+0.153
 hide+0.152
 seeker+0.147
 hiding+0.139
 exit+0.136
Neg logit contributions
lee-0.333
atted-0.298
bled-0.295
rac-0.290
um-0.290
Token,
Token ID11
Feature activation-1.06e-01

Pos logit contributions
 tag+0.128
 hide+0.128
 seeker+0.123
 exit+0.119
 hidden+0.109
Neg logit contributions
lee-0.280
bled-0.256
atted-0.256
alks-0.248
rac-0.247
Token there
Token ID612
Feature activation-5.09e-02

Pos logit contributions
 tag+0.111
 hide+0.108
 seeker+0.108
 exit+0.101
 library+0.099
Neg logit contributions
bled-0.270
lee-0.259
atted-0.249
rac-0.248
eled-0.236
Token was
Token ID373
Feature activation-9.93e-02

Pos logit contributions
 seeker+0.070
 hide+0.068
 tag+0.067
 exit+0.064
 hiding+0.063
Neg logit contributions
lee-0.119
bled-0.111
atted-0.107
ipt-0.104
umbled-0.103
Token a
Token ID257
Feature activation-1.23e-01

Pos logit contributions
 hide+0.111
 tag+0.111
 seeker+0.104
 hiding+0.104
 exit+0.102
Neg logit contributions
lee-0.252
bled-0.238
alks-0.222
um-0.220
urg-0.219
Token together
Token ID1978
Feature activation-1.28e-02

Pos logit contributions
 seeker+0.061
 tag+0.057
 exit+0.056
 hide+0.056
 hiding+0.053
Neg logit contributions
lee-0.094
bled-0.085
alks-0.083
atted-0.081
urg-0.079
Token.
Token ID13
Feature activation-5.14e-03

Pos logit contributions
 seeker+0.017
 exit+0.016
 hide+0.016
 tag+0.015
 maze+0.015
Neg logit contributions
lee-0.031
bled-0.029
alks-0.028
atted-0.026
ums-0.026
Token<|endoftext|>
Token ID50256
Feature activation-5.23e-02

Pos logit contributions
 hide+0.007
 tag+0.007
 hiding+0.006
 seeker+0.006
 hidden+0.006
Neg logit contributions
lee-0.013
bled-0.012
atted-0.012
urg-0.012
alks-0.012
TokenOnce
Token ID7454
Feature activation-6.16e-02

Pos logit contributions
 hide+0.096
 seeker+0.095
 tag+0.095
 hiding+0.092
 exit+0.091
Neg logit contributions
lee-0.098
bled-0.088
alks-0.084
ipt-0.082
atted-0.081
Token upon
Token ID2402
Feature activation-5.34e-02

Pos logit contributions
 hide+0.055
 hiding+0.052
 tag+0.051
 seeker+0.048
 hidden+0.046
Neg logit contributions
lee-0.185
bled-0.171
alks-0.168
atted-0.166
urg-0.165
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.069
 tag+0.062
 hide+0.061
 exit+0.061
 hiding+0.061
Neg logit contributions
lee-0.120
bled-0.116
ipt-0.107
rac-0.107
urg-0.106
Token time
Token ID640
Feature activation-1.06e-01

Pos logit contributions
 tag+0.152
 hide+0.151
 seeker+0.146
 hiding+0.137
 exit+0.135
Neg logit contributions
lee-0.333
atted-0.298
bled-0.296
alks-0.290
um-0.290
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
 tag+0.119
 hide+0.119
 seeker+0.114
 exit+0.108
 hiding+0.102
Neg logit contributions
lee-0.279
bled-0.257
atted-0.256
alks-0.249
rac-0.246
Token there
Token ID612
Feature activation-4.94e-02

Pos logit contributions
 tag+0.108
 seeker+0.108
 hide+0.104
 exit+0.097
 library+0.092
Neg logit contributions
bled-0.266
lee-0.256
atted-0.247
rac-0.245
ipt-0.233
Token was
Token ID373
Feature activation-9.95e-02

Pos logit contributions
 seeker+0.064
 hide+0.063
 tag+0.062
 hiding+0.059
 exit+0.058
Neg logit contributions
lee-0.121
bled-0.112
atted-0.109
ipt-0.105
alks-0.104
Token an
Token ID281
Feature activation-9.02e-02

Pos logit contributions
 tag+0.115
 hide+0.114
 seeker+0.108
 hiding+0.107
 exit+0.106
Neg logit contributions
lee-0.246
bled-0.235
alks-0.218
um-0.215
atted-0.214
Token new
Token ID649
Feature activation+2.25e-04

Pos logit contributions
 seeker+0.055
 tag+0.049
 hide+0.044
 gym+0.040
 chamber+0.040
Neg logit contributions
lee-0.076
atted-0.065
bled-0.062
alks-0.061
arted-0.060
Token.
Token ID13
Feature activation-3.41e-03

Pos logit contributions
lee+0.001
ment+0.000
atted+0.000
alks+0.000
iously+0.000
Neg logit contributions
 seeker-0.000
 tag-0.000
 hide-0.000
 exit-0.000
 gym-0.000
Token<|endoftext|>
Token ID50256
Feature activation-4.94e-02

Pos logit contributions
 hide+0.004
 tag+0.004
 hiding+0.004
 seeker+0.004
 gym+0.004
Neg logit contributions
lee-0.009
bled-0.009
atted-0.008
aned-0.008
urg-0.008
TokenOnce
Token ID7454
Feature activation-5.66e-02

Pos logit contributions
 seeker+0.092
 hide+0.091
 tag+0.090
 hiding+0.087
 exit+0.086
Neg logit contributions
lee-0.093
bled-0.082
alks-0.079
ipt-0.077
atted-0.077
Token upon
Token ID2402
Feature activation-5.21e-02

Pos logit contributions
 hide+0.047
 hiding+0.045
 tag+0.042
 seeker+0.040
 hidden+0.039
Neg logit contributions
lee-0.172
bled-0.160
alks-0.158
atted-0.155
urg-0.155
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 seeker+0.062
 hide+0.057
 tag+0.057
 hiding+0.056
 exit+0.055
Neg logit contributions
lee-0.126
bled-0.120
ipt-0.111
urg-0.111
rac-0.110
Token time
Token ID640
Feature activation-1.03e-01

Pos logit contributions
 hide+0.146
 tag+0.145
 seeker+0.138
 hiding+0.132
 exit+0.127
Neg logit contributions
lee-0.345
atted-0.309
bled-0.307
alks-0.302
rac-0.300
Token,
Token ID11
Feature activation-1.01e-01

Pos logit contributions
 hide+0.122
 tag+0.122
 seeker+0.118
 exit+0.113
 hiding+0.105
Neg logit contributions
lee-0.266
bled-0.244
atted-0.244
alks-0.236
rac-0.234
Token there
Token ID612
Feature activation-4.85e-02

Pos logit contributions
 seeker+0.102
 tag+0.102
 hide+0.100
 exit+0.093
 library+0.086
Neg logit contributions
bled-0.266
lee-0.260
atted-0.245
rac-0.244
ipt-0.235
Token were
Token ID547
Feature activation-9.86e-02

Pos logit contributions
 seeker+0.062
 hide+0.061
 tag+0.060
 hiding+0.057
 exit+0.056
Neg logit contributions
lee-0.122
bled-0.112
atted-0.109
ipt-0.105
alks-0.105
Token two
Token ID734
Feature activation-6.87e-02

Pos logit contributions
 hide+0.136
 hiding+0.128
 tag+0.127
 seeker+0.125
 hidden+0.120
Neg logit contributions
lee-0.251
atted-0.228
bled-0.228
alks-0.226
aned-0.222
Token her
Token ID607
Feature activation+2.13e-02

Pos logit contributions
lee+0.047
bled+0.042
alks+0.041
atted+0.040
urg+0.039
Neg logit contributions
 hide-0.038
 tag-0.036
 seeker-0.036
 hiding-0.036
 hidden-0.034
Token.
Token ID13
Feature activation-8.07e-03

Pos logit contributions
lee+0.048
alks+0.041
atted+0.040
bled+0.038
ums+0.038
Neg logit contributions
 seeker-0.032
 hide-0.031
 tag-0.030
 tim-0.030
 maze-0.027
Token<|endoftext|>
Token ID50256
Feature activation-5.30e-02

Pos logit contributions
 hide+0.013
 tag+0.013
 hiding+0.012
 seeker+0.012
 exit+0.011
Neg logit contributions
lee-0.019
bled-0.017
atted-0.016
alks-0.016
urg-0.016
TokenOnce
Token ID7454
Feature activation-5.33e-02

Pos logit contributions
 hide+0.098
 seeker+0.097
 tag+0.096
 hiding+0.094
 exit+0.092
Neg logit contributions
lee-0.098
bled-0.087
alks-0.083
ipt-0.083
atted-0.081
Token upon
Token ID2402
Feature activation-5.59e-02

Pos logit contributions
 hide+0.047
 hiding+0.045
 tag+0.043
 seeker+0.041
 hidden+0.040
Neg logit contributions
lee-0.161
bled-0.148
alks-0.146
urg-0.144
atted-0.144
Token a
Token ID257
Feature activation-1.30e-01

Pos logit contributions
 seeker+0.071
 hide+0.064
 tag+0.064
 hiding+0.063
 exit+0.062
Neg logit contributions
lee-0.124
bled-0.121
ipt-0.113
rac-0.112
ped-0.110
Token time
Token ID640
Feature activation-1.08e-01

Pos logit contributions
 tag+0.150
 hide+0.150
 seeker+0.143
 hiding+0.135
 exit+0.131
Neg logit contributions
lee-0.335
atted-0.301
bled-0.299
rac-0.293
um-0.292
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
 hide+0.122
 tag+0.121
 seeker+0.115
 exit+0.109
 hiding+0.104
Neg logit contributions
lee-0.284
bled-0.263
atted-0.261
alks-0.254
rac-0.253
Token there
Token ID612
Feature activation-5.24e-02

Pos logit contributions
 tag+0.102
 hide+0.100
 seeker+0.097
 exit+0.088
 hiding+0.084
Neg logit contributions
bled-0.277
lee-0.271
atted-0.259
rac-0.256
ipt-0.247
Token was
Token ID373
Feature activation-1.00e-01

Pos logit contributions
 seeker+0.068
 hide+0.068
 tag+0.067
 hiding+0.063
 exit+0.062
Neg logit contributions
lee-0.126
bled-0.118
atted-0.114
ipt-0.111
umbled-0.110
Token a
Token ID257
Feature activation-1.24e-01

Pos logit contributions
 tag+0.114
 hide+0.113
 seeker+0.107
 hiding+0.107
 exit+0.103
Neg logit contributions
lee-0.249
bled-0.236
alks-0.220
ipt-0.218
um-0.217
Token end
Token ID886
Feature activation-1.21e-02

Pos logit contributions
 hide+0.005
 tag+0.005
 seeker+0.005
 hiding+0.005
 exit+0.005
Neg logit contributions
lee-0.007
bled-0.007
atted-0.007
alks-0.006
urg-0.006
Token.
Token ID13
Feature activation-2.03e-02

Pos logit contributions
 seeker+0.020
 hide+0.017
 tag+0.017
 hiding+0.016
 hidden+0.015
Neg logit contributions
lee-0.027
lights-0.024
atted-0.024
ment-0.024
alks-0.023
Token<|endoftext|>
Token ID50256
Feature activation-5.03e-02

Pos logit contributions
 hide+0.025
 tag+0.024
 hiding+0.022
 seeker+0.021
 hidden+0.021
Neg logit contributions
lee-0.056
bled-0.052
atted-0.051
alks-0.050
urg-0.049
TokenOnce
Token ID7454
Feature activation-5.13e-02

Pos logit contributions
 hide+0.093
 seeker+0.093
 tag+0.091
 hiding+0.089
 exit+0.087
Neg logit contributions
lee-0.094
bled-0.083
alks-0.080
ipt-0.078
atted-0.078
Token upon
Token ID2402
Feature activation-5.29e-02

Pos logit contributions
 hide+0.046
 hiding+0.043
 tag+0.042
 seeker+0.040
 hidden+0.038
Neg logit contributions
lee-0.153
bled-0.143
alks-0.140
atted-0.138
urg-0.137
Token a
Token ID257
Feature activation-1.30e-01

Pos logit contributions
 seeker+0.064
 hide+0.059
 tag+0.059
 hiding+0.058
 exit+0.057
Neg logit contributions
lee-0.125
bled-0.119
ipt-0.111
rac-0.111
urg-0.111
Token time
Token ID640
Feature activation-1.02e-01

Pos logit contributions
 hide+0.147
 tag+0.146
 seeker+0.139
 hiding+0.133
 exit+0.128
Neg logit contributions
lee-0.342
atted-0.308
bled-0.304
rac-0.300
alks-0.299
Token,
Token ID11
Feature activation-1.00e-01

Pos logit contributions
 hide+0.121
 tag+0.121
 seeker+0.114
 exit+0.111
 hidden+0.104
Neg logit contributions
lee-0.261
bled-0.240
atted-0.240
alks-0.231
rac-0.230
Token there
Token ID612
Feature activation-4.80e-02

Pos logit contributions
 tag+0.103
 hide+0.101
 seeker+0.100
 exit+0.094
 library+0.087
Neg logit contributions
bled-0.261
lee-0.252
atted-0.241
rac-0.239
ipt-0.232
Token was
Token ID373
Feature activation-9.97e-02

Pos logit contributions
 seeker+0.064
 hide+0.063
 tag+0.062
 hiding+0.059
 exit+0.058
Neg logit contributions
lee-0.115
bled-0.106
atted-0.104
ipt-0.101
umbled-0.100
Token a
Token ID257
Feature activation-1.23e-01

Pos logit contributions
 hide+0.101
 tag+0.099
 hiding+0.092
 seeker+0.091
 exit+0.088
Neg logit contributions
lee-0.275
bled-0.257
alks-0.246
urg-0.241
atted-0.241
Token long
Token ID890
Feature activation+2.20e-02

Pos logit contributions
 hide+0.034
 tag+0.033
 seeker+0.032
 exit+0.030
 hiding+0.030
Neg logit contributions
lee-0.073
bled-0.067
alks-0.065
atted-0.063
urg-0.061
Token.
Token ID13
Feature activation-7.50e-03

Pos logit contributions
lee+0.058
bled+0.052
atted+0.052
alks+0.051
ped+0.048
Neg logit contributions
 seeker-0.028
 hide-0.026
 exit-0.024
 tag-0.024
 maze-0.023
Token<|endoftext|>
Token ID50256
Feature activation-4.53e-02

Pos logit contributions
 hide+0.010
 tag+0.009
 hiding+0.009
 seeker+0.008
 hidden+0.008
Neg logit contributions
lee-0.019
bled-0.018
atted-0.017
alks-0.017
urg-0.017
TokenOnce
Token ID7454
Feature activation-5.44e-02

Pos logit contributions
 hide+0.087
 seeker+0.087
 tag+0.086
 hiding+0.084
 exit+0.083
Neg logit contributions
lee-0.081
bled-0.071
alks-0.068
ipt-0.068
umbled-0.065
Token upon
Token ID2402
Feature activation-5.58e-02

Pos logit contributions
 hide+0.048
 hiding+0.045
 tag+0.043
 seeker+0.040
 hidden+0.040
Neg logit contributions
lee-0.163
bled-0.152
alks-0.150
urg-0.148
atted-0.147
Token a
Token ID257
Feature activation-1.30e-01

Pos logit contributions
 seeker+0.075
 tag+0.067
 hide+0.067
 exit+0.066
 hiding+0.066
Neg logit contributions
lee-0.121
bled-0.117
ipt-0.109
ped-0.107
rac-0.107
Token time
Token ID640
Feature activation-1.07e-01

Pos logit contributions
 tag+0.154
 hide+0.152
 seeker+0.147
 hiding+0.136
 exit+0.135
Neg logit contributions
lee-0.334
atted-0.299
bled-0.295
um-0.290
alks-0.288
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
 tag+0.127
 hide+0.126
 seeker+0.120
 exit+0.116
 hidden+0.107
Neg logit contributions
lee-0.276
bled-0.254
atted-0.253
alks-0.245
rac-0.243
Token there
Token ID612
Feature activation-5.36e-02

Pos logit contributions
 tag+0.099
 hide+0.098
 seeker+0.093
 exit+0.084
 hiding+0.081
Neg logit contributions
bled-0.284
lee-0.283
atted-0.267
rac-0.261
ipt-0.253
Token was
Token ID373
Feature activation-9.93e-02

Pos logit contributions
 seeker+0.071
 hide+0.070
 tag+0.070
 hiding+0.065
 exit+0.065
Neg logit contributions
lee-0.129
bled-0.119
atted-0.116
ipt-0.112
eled-0.111
Token a
Token ID257
Feature activation-1.23e-01

Pos logit contributions
 tag+0.114
 hide+0.114
 seeker+0.108
 hiding+0.107
 exit+0.106
Neg logit contributions
lee-0.247
bled-0.233
alks-0.218
um-0.213
ipt-0.213
Token together
Token ID1978
Feature activation-1.88e-02

Pos logit contributions
bled+0.006
lee+0.006
alks+0.005
ums+0.005
urg+0.005
Neg logit contributions
 seeker-0.004
 exit-0.004
 hiding-0.004
 locked-0.004
 library-0.004
Token.
Token ID13
Feature activation-3.40e-03

Pos logit contributions
 seeker+0.027
 exit+0.025
 hide+0.025
 tag+0.025
 maze+0.024
Neg logit contributions
lee-0.041
bled-0.038
alks-0.037
ums-0.035
rac-0.034
Token<|endoftext|>
Token ID50256
Feature activation-5.59e-02

Pos logit contributions
 hide+0.006
 tag+0.005
 hiding+0.005
 seeker+0.005
 hidden+0.005
Neg logit contributions
lee-0.008
bled-0.007
atted-0.007
urg-0.007
alks-0.007
TokenOnce
Token ID7454
Feature activation-6.02e-02

Pos logit contributions
 hide+0.100
 tag+0.098
 seeker+0.098
 hiding+0.096
 exit+0.093
Neg logit contributions
lee-0.109
bled-0.098
alks-0.093
ipt-0.091
atted-0.091
Token upon
Token ID2402
Feature activation-5.57e-02

Pos logit contributions
 hide+0.057
 hiding+0.053
 tag+0.052
 seeker+0.049
 hidden+0.047
Neg logit contributions
lee-0.178
bled-0.164
alks-0.161
atted-0.160
urg-0.159
Token a
Token ID257
Feature activation-1.30e-01

Pos logit contributions
 seeker+0.071
 tag+0.064
 hide+0.064
 hiding+0.063
 exit+0.063
Neg logit contributions
lee-0.125
bled-0.122
ipt-0.113
rac-0.113
urg-0.110
Token time
Token ID640
Feature activation-1.06e-01

Pos logit contributions
 tag+0.154
 hide+0.152
 seeker+0.148
 hiding+0.138
 exit+0.135
Neg logit contributions
lee-0.329
atted-0.294
bled-0.294
um-0.288
rac-0.288
Token,
Token ID11
Feature activation-1.07e-01

Pos logit contributions
 hide+0.117
 tag+0.117
 seeker+0.111
 exit+0.105
 hiding+0.101
Neg logit contributions
lee-0.283
bled-0.262
atted-0.260
alks-0.254
rac-0.251
Token there
Token ID612
Feature activation-4.85e-02

Pos logit contributions
 tag+0.103
 hide+0.101
 seeker+0.100
 exit+0.089
 hiding+0.086
Neg logit contributions
bled-0.285
lee-0.283
atted-0.270
rac-0.268
ipt-0.256
Token was
Token ID373
Feature activation-1.00e-01

Pos logit contributions
 seeker+0.064
 hide+0.063
 tag+0.062
 hiding+0.058
 exit+0.058
Neg logit contributions
lee-0.116
bled-0.108
atted-0.105
ipt-0.102
umbled-0.101
Token a
Token ID257
Feature activation-1.22e-01

Pos logit contributions
 tag+0.116
 hide+0.114
 seeker+0.108
 hiding+0.108
 exit+0.106
Neg logit contributions
lee-0.246
bled-0.235
alks-0.218
um-0.215
ped-0.214
Token park
Token ID3952
Feature activation-3.76e-02

Pos logit contributions
 tag+0.025
 hide+0.025
 hiding+0.023
 seeker+0.022
 tim+0.020
Neg logit contributions
lee-0.057
bled-0.049
alks-0.049
um-0.048
rac-0.047
Token.
Token ID13
Feature activation-7.21e-03

Pos logit contributions
 seeker+0.054
 hide+0.050
 tag+0.048
 exit+0.047
 hiding+0.047
Neg logit contributions
lee-0.084
alks-0.071
bled-0.070
atted-0.069
ums-0.067
Token<|endoftext|>
Token ID50256
Feature activation-5.16e-02

Pos logit contributions
 hide+0.010
 tag+0.010
 hiding+0.009
 seeker+0.009
 hidden+0.009
Neg logit contributions
lee-0.018
bled-0.017
atted-0.016
alks-0.016
urg-0.016
TokenOnce
Token ID7454
Feature activation-5.99e-02

Pos logit contributions
 seeker+0.098
 hide+0.098
 tag+0.096
 hiding+0.094
 exit+0.093
Neg logit contributions
lee-0.093
bled-0.082
alks-0.079
ipt-0.078
atted-0.076
Token upon
Token ID2402
Feature activation-5.56e-02

Pos logit contributions
 hide+0.055
 tag+0.051
 hiding+0.051
 seeker+0.047
 hidden+0.044
Neg logit contributions
lee-0.179
bled-0.165
alks-0.162
atted-0.161
urg-0.159
Token a
Token ID257
Feature activation-1.30e-01

Pos logit contributions
 seeker+0.075
 exit+0.066
 hiding+0.065
 hide+0.065
 tag+0.065
Neg logit contributions
lee-0.122
bled-0.117
ipt-0.108
rac-0.108
ped-0.107
Token time
Token ID640
Feature activation-1.05e-01

Pos logit contributions
 tag+0.156
 hide+0.155
 seeker+0.152
 hiding+0.142
 exit+0.140
Neg logit contributions
lee-0.323
atted-0.291
bled-0.287
um-0.284
rac-0.282
Token,
Token ID11
Feature activation-1.04e-01

Pos logit contributions
 hide+0.125
 tag+0.125
 seeker+0.122
 exit+0.117
 hidden+0.108
Neg logit contributions
lee-0.267
atted-0.245
bled-0.245
alks-0.237
rac-0.236
Token there
Token ID612
Feature activation-4.98e-02

Pos logit contributions
 seeker+0.117
 tag+0.115
 hide+0.112
 exit+0.108
 library+0.107
Neg logit contributions
bled-0.257
lee-0.246
atted-0.239
rac-0.238
eled-0.226
Token was
Token ID373
Feature activation-1.00e-01

Pos logit contributions
 seeker+0.071
 hide+0.068
 tag+0.067
 exit+0.065
 hiding+0.064
Neg logit contributions
lee-0.115
bled-0.107
atted-0.104
ipt-0.100
umbled-0.099
Token a
Token ID257
Feature activation-1.23e-01

Pos logit contributions
 hide+0.114
 tag+0.113
 seeker+0.108
 hiding+0.107
 exit+0.105
Neg logit contributions
lee-0.253
bled-0.237
alks-0.222
um-0.220
ipt-0.218
Token better
Token ID1365
Feature activation+2.72e-02

Pos logit contributions
lee+0.008
bled+0.007
atted+0.007
alks+0.007
urg+0.007
Neg logit contributions
 hide-0.004
 tag-0.004
 hiding-0.004
 seeker-0.004
 exit-0.004
Token and
Token ID290
Feature activation+3.38e-02

Pos logit contributions
lee+0.067
bled+0.061
atted+0.059
alks+0.058
ums+0.056
Neg logit contributions
 hide-0.039
 tag-0.038
 hiding-0.035
 seeker-0.035
 exit-0.034
Token smiled
Token ID13541
Feature activation+6.70e-02

Pos logit contributions
lee+0.078
alks+0.071
bled+0.070
atted+0.067
urg+0.067
Neg logit contributions
 hide-0.055
 tag-0.053
 hiding-0.052
 seeker-0.049
 hidden-0.049
Token.
Token ID13
Feature activation+2.67e-02

Pos logit contributions
lee+0.185
bled+0.169
atted+0.164
alks+0.161
urg+0.158
Neg logit contributions
 hide-0.079
 tag-0.076
 seeker-0.072
 hiding-0.071
 exit-0.066
Token She
Token ID1375
Feature activation+3.53e-02

Pos logit contributions
bled+0.060
lee+0.060
urg+0.058
aned+0.055
atted+0.055
Neg logit contributions
 hide-0.038
 tag-0.036
 hiding-0.035
 seeker-0.035
 chamber-0.033
Token hugged
Token ID41130
Feature activation+9.32e-02

Pos logit contributions
lee+0.081
bled+0.072
atted+0.070
alks+0.069
urg+0.067
Neg logit contributions
 hide-0.058
 tag-0.056
 hiding-0.055
 seeker-0.054
 exit-0.051
Token her
Token ID607
Feature activation+6.61e-02

Pos logit contributions
lee+0.223
alks+0.214
bled+0.210
aned+0.207
atted+0.206
Neg logit contributions
 hide-0.136
 tag-0.127
 hiding-0.121
 like-0.113
 seeker-0.112
Token mum
Token ID25682
Feature activation+4.47e-02

Pos logit contributions
lee+0.081
bled+0.065
alks+0.063
atted+0.062
urg+0.058
Neg logit contributions
 hide-0.179
 tag-0.175
 hiding-0.172
 seeker-0.171
 hidden-0.167
Token back
Token ID736
Feature activation+3.53e-02

Pos logit contributions
lee+0.112
atted+0.099
bled+0.099
alks+0.094
rac+0.093
Neg logit contributions
 hide-0.064
 tag-0.062
 seeker-0.062
 hiding-0.059
 exit-0.056
Token.
Token ID13
Feature activation+2.41e-02

Pos logit contributions
lee+0.088
atted+0.078
alks+0.074
bled+0.074
ment+0.072
Neg logit contributions
 hide-0.052
 tag-0.050
 seeker-0.049
 hiding-0.046
 exit-0.045
Token␣
Token ID220
Feature activation+5.00e-02

Pos logit contributions
lee+0.054
bled+0.053
urg+0.051
aned+0.050
atted+0.049
Neg logit contributions
 hide-0.034
 tag-0.033
 hiding-0.032
 seeker-0.030
 chamber-0.030
Token Sally
Token ID25737
Feature activation+2.34e-02

Pos logit contributions
lee+0.156
bled+0.135
alks+0.131
atted+0.131
urg+0.124
Neg logit contributions
 hide-0.153
 tag-0.149
 hiding-0.146
 seeker-0.145
 exit-0.140
Token.
Token ID13
Feature activation+6.07e-03

Pos logit contributions
lee+0.067
bled+0.061
atted+0.060
alks+0.060
urg+0.058
Neg logit contributions
 hide-0.024
 tag-0.023
 seeker-0.023
 hiding-0.022
 exit-0.020
Token She
Token ID1375
Feature activation+2.90e-02

Pos logit contributions
lee+0.015
bled+0.014
atted+0.013
alks+0.013
urg+0.013
Neg logit contributions
 hide-0.009
 tag-0.009
 hiding-0.008
 seeker-0.008
 hidden-0.007
Token smiled
Token ID13541
Feature activation+6.19e-02

Pos logit contributions
lee+0.067
bled+0.060
alks+0.059
atted+0.059
urg+0.057
Neg logit contributions
 hide-0.047
 tag-0.045
 hiding-0.043
 seeker-0.042
 hidden-0.041
Token and
Token ID290
Feature activation+4.18e-02

Pos logit contributions
lee+0.171
bled+0.156
alks+0.153
atted+0.153
urg+0.148
Neg logit contributions
 hide-0.072
 tag-0.069
 hiding-0.066
 seeker-0.065
 hidden-0.060
Token hugged
Token ID41130
Feature activation+9.10e-02

Pos logit contributions
lee+0.087
alks+0.081
bled+0.078
urg+0.077
atted+0.076
Neg logit contributions
 hide-0.075
 tag-0.074
 hiding-0.069
 hidden-0.067
 seeker-0.065
Token Jake
Token ID14757
Feature activation+2.86e-02

Pos logit contributions
lee+0.184
bled+0.166
atted+0.162
alks+0.162
urg+0.156
Neg logit contributions
 hide-0.171
 tag-0.167
 hiding-0.160
 seeker-0.157
 hidden-0.150
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
lee+0.076
atted+0.068
bled+0.068
alks+0.065
ums+0.064
Neg logit contributions
 hide-0.036
 seeker-0.036
 tag-0.035
 hiding-0.033
 exit-0.032
Token "
Token ID366
Feature activation+1.97e-02

Pos logit contributions
lee+0.024
bled+0.023
atted+0.022
urg+0.022
alks+0.021
Neg logit contributions
 tag-0.014
 hide-0.014
 hiding-0.014
 seeker-0.013
 hidden-0.012
TokenThis
Token ID1212
Feature activation+2.48e-02

Pos logit contributions
lee+0.046
atted+0.041
alks+0.040
urg+0.040
bled+0.039
Neg logit contributions
 tag-0.032
 hide-0.031
 seeker-0.030
 hidden-0.029
 like-0.029
Token old
Token ID1468
Feature activation+3.54e-02

Pos logit contributions
lee+0.051
alks+0.047
bled+0.046
rac+0.046
um+0.045
Neg logit contributions
 seeker-0.034
 tim-0.033
 hiding-0.030
 chamber-0.030
 hide-0.030
Token look
Token ID804
Feature activation+1.64e-02

Pos logit contributions
lee+0.016
rac+0.013
alks+0.013
bled+0.013
atted+0.013
Neg logit contributions
 hiding-0.012
 tag-0.011
 seeker-0.011
 hide-0.010
 chamber-0.010
Token at
Token ID379
Feature activation+6.97e-02

Pos logit contributions
lee+0.037
atted+0.033
bled+0.032
arted+0.030
rac+0.028
Neg logit contributions
 tag-0.027
 seeker-0.026
 hide-0.026
 hiding-0.024
 exit-0.024
Token their
Token ID511
Feature activation+2.52e-02

Pos logit contributions
lee+0.160
bled+0.144
alks+0.141
atted+0.138
urg+0.137
Neg logit contributions
 hide-0.118
 tag-0.113
 hiding-0.108
 seeker-0.106
 hidden-0.102
Token picture
Token ID4286
Feature activation+2.56e-03

Pos logit contributions
lee+0.058
bled+0.053
atted+0.051
alks+0.050
urg+0.049
Neg logit contributions
 hide-0.041
 tag-0.039
 hiding-0.039
 seeker-0.037
 hidden-0.036
Token and
Token ID290
Feature activation+1.76e-02

Pos logit contributions
lee+0.006
bled+0.006
atted+0.006
alks+0.005
lights+0.005
Neg logit contributions
 seeker-0.004
 hide-0.003
 tag-0.003
 hiding-0.003
 exit-0.003
Token smile
Token ID8212
Feature activation+7.39e-02

Pos logit contributions
lee+0.036
alks+0.031
bled+0.031
atted+0.030
rac+0.030
Neg logit contributions
 hide-0.030
 seeker-0.030
 tag-0.030
 hiding-0.030
 exit-0.028
Token.
Token ID13
Feature activation+8.82e-03

Pos logit contributions
lee+0.188
bled+0.171
atted+0.163
alks+0.162
urg+0.159
Neg logit contributions
 hide-0.100
 seeker-0.093
 tag-0.091
 hiding-0.089
 exit-0.088
Token↵
Token ID198
Feature activation+2.79e-02

Pos logit contributions
lee+0.017
alks+0.015
bled+0.015
atted+0.014
ipt+0.014
Neg logit contributions
 hide-0.017
 tag-0.017
 hiding-0.017
 seeker-0.016
 exit-0.016
Token↵
Token ID198
Feature activation+2.54e-02

Pos logit contributions
lee+0.080
bled+0.075
umm+0.075
atted+0.074
iously+0.073
Neg logit contributions
 hide-0.029
 tag-0.027
 hiding-0.025
 hidden-0.023
 shelter-0.021
TokenThey
Token ID2990
Feature activation+5.82e-03

Pos logit contributions
lee+0.062
bled+0.055
atted+0.054
alks+0.054
urg+0.053
Neg logit contributions
 hide-0.040
 tag-0.039
 hiding-0.036
 seeker-0.035
 hidden-0.035
Token learned
Token ID4499
Feature activation+4.87e-02

Pos logit contributions
lee+0.012
rac+0.010
bled+0.010
alks+0.010
atted+0.010
Neg logit contributions
 hiding-0.010
 tag-0.009
 seeker-0.009
 hide-0.009
 chamber-0.009
Tokenmen
Token ID3653
Feature activation+1.22e-02

Pos logit contributions
lee+0.021
bled+0.019
atted+0.019
alks+0.018
urg+0.018
Neg logit contributions
 hide-0.018
 seeker-0.017
 tag-0.017
 hiding-0.017
 hidden-0.016
Token and
Token ID290
Feature activation+3.23e-02

Pos logit contributions
lee+0.028
atted+0.026
bled+0.025
alks+0.024
ums+0.023
Neg logit contributions
 seeker-0.018
 hide-0.018
 tag-0.017
 hiding-0.017
 maze-0.016
Token said
Token ID531
Feature activation+6.40e-02

Pos logit contributions
lee+0.067
bled+0.061
alks+0.059
atted+0.058
urg+0.057
Neg logit contributions
 hide-0.061
 tag-0.059
 hiding-0.057
 seeker-0.055
 hidden-0.054
Token,
Token ID11
Feature activation+2.47e-02

Pos logit contributions
lee+0.163
bled+0.148
alks+0.146
atted+0.144
urg+0.140
Neg logit contributions
 hide-0.088
 tag-0.085
 hiding-0.081
 seeker-0.079
 hidden-0.075
Token "
Token ID366
Feature activation+2.03e-02

Pos logit contributions
lee+0.062
bled+0.056
atted+0.056
alks+0.055
urg+0.054
Neg logit contributions
 tag-0.034
 hide-0.034
 hiding-0.032
 seeker-0.030
 hidden-0.029
TokenThank
Token ID10449
Feature activation+7.22e-02

Pos logit contributions
lee+0.051
atted+0.045
alks+0.045
urg+0.044
bled+0.043
Neg logit contributions
 tag-0.032
 hide-0.030
 seeker-0.028
 hidden-0.027
 like-0.027
Token you
Token ID345
Feature activation+2.07e-02

Pos logit contributions
lee+0.208
alks+0.186
atted+0.186
aned+0.186
bled+0.181
Neg logit contributions
 hide-0.087
 tag-0.082
 hiding-0.070
 hidden-0.066
 seeker-0.064
Token for
Token ID329
Feature activation+4.08e-02

Pos logit contributions
lee+0.063
bled+0.059
atted+0.058
alks+0.058
urg+0.056
Neg logit contributions
 hide-0.018
 tag-0.017
 hiding-0.016
 seeker-0.015
 hidden-0.014
Token saving
Token ID8914
Feature activation+6.00e-02

Pos logit contributions
lee+0.098
atted+0.088
alks+0.088
bled+0.085
umbled+0.084
Neg logit contributions
 hide-0.067
 tag-0.065
 hiding-0.064
 hidden-0.058
 seeker-0.057
Token the
Token ID262
Feature activation+1.42e-02

Pos logit contributions
lee+0.127
bled+0.114
atted+0.110
alks+0.110
urg+0.104
Neg logit contributions
 hide-0.111
 tag-0.107
 hiding-0.104
 seeker-0.100
 hidden-0.098
Token old
Token ID1468
Feature activation+2.46e-02

Pos logit contributions
lee+0.027
alks+0.026
bled+0.024
iu+0.024
 faces+0.024
Neg logit contributions
 hide-0.020
 seeker-0.020
 tag-0.019
 tim-0.018
 anywhere-0.017
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
lee+0.119
bled+0.110
atted+0.107
ums+0.103
alks+0.102
Neg logit contributions
 hide-0.064
 tag-0.062
 seeker-0.062
 hiding-0.057
 exit-0.056
Token␣
Token ID220
Feature activation+5.15e-02

Pos logit contributions
bled+0.041
lee+0.041
urg+0.039
aned+0.038
atted+0.037
Neg logit contributions
 hide-0.018
 tag-0.016
 hiding-0.016
 jail-0.015
 seeker-0.014
Token↵
Token ID198
Feature activation+3.79e-02

Pos logit contributions
lee+0.134
bled+0.121
atted+0.119
alks+0.118
urg+0.117
Neg logit contributions
 hide-0.066
 tag-0.063
 hiding-0.061
 seeker-0.057
 hidden-0.055
Token↵
Token ID198
Feature activation+3.46e-02

Pos logit contributions
lee+0.116
umm+0.110
urg+0.107
alks+0.107
atted+0.107
Neg logit contributions
 hide-0.033
 tag-0.031
 hiding-0.026
 hidden-0.025
 gym-0.020
TokenThe
Token ID464
Feature activation+4.64e-02

Pos logit contributions
lee+0.078
atted+0.069
alks+0.069
bled+0.067
urg+0.065
Neg logit contributions
 tag-0.061
 hide-0.060
 hidden-0.055
 hiding-0.055
 seeker-0.054
Token gard
Token ID16985
Feature activation+7.24e-02

Pos logit contributions
lee+0.089
bled+0.078
alks+0.075
atted+0.074
urg+0.071
Neg logit contributions
 hide-0.094
 tag-0.091
 hiding-0.089
 seeker-0.088
 hidden-0.084
Tokenener
Token ID877
Feature activation+3.25e-02

Pos logit contributions
lee+0.102
rac+0.099
alks+0.094
urg+0.092
bled+0.092
Neg logit contributions
 hiding-0.154
 hide-0.149
 inside-0.148
 hidden-0.147
 seeker-0.146
Token was
Token ID373
Feature activation+6.00e-03

Pos logit contributions
lee+0.077
bled+0.069
alks+0.067
atted+0.067
urg+0.065
Neg logit contributions
 hide-0.050
 tag-0.049
 hiding-0.047
 seeker-0.046
 hidden-0.044
Token happy
Token ID3772
Feature activation+4.01e-02

Pos logit contributions
lee+0.015
bled+0.014
alks+0.014
atted+0.013
urg+0.013
Neg logit contributions
 hide-0.008
 tag-0.008
 hiding-0.008
 seeker-0.007
 hidden-0.007
Token to
Token ID284
Feature activation+4.65e-02

Pos logit contributions
lee+0.104
bled+0.094
alks+0.092
atted+0.092
urg+0.089
Neg logit contributions
 hide-0.054
 tag-0.052
 hiding-0.050
 seeker-0.049
 hidden-0.046
Token hear
Token ID3285
Feature activation+4.83e-02

Pos logit contributions
lee+0.137
atted+0.134
bled+0.132
ipt+0.132
umbled+0.126
Neg logit contributions
 hide-0.051
 tag-0.043
 hiding-0.042
 seeker-0.039
 hidden-0.037
Token an
Token ID281
Feature activation-4.52e-02

Pos logit contributions
lee+0.013
bled+0.012
atted+0.012
alks+0.012
urg+0.011
Neg logit contributions
 hide-0.008
 tag-0.007
 hiding-0.007
 seeker-0.007
 hidden-0.007
Token orig
Token ID1796
Feature activation+3.28e-02

Pos logit contributions
 hide+0.062
 tag+0.058
 hiding+0.057
 exit+0.055
 seeker+0.055
Neg logit contributions
lee-0.115
bled-0.107
alks-0.104
atted-0.104
urg-0.100
Tokenami
Token ID6277
Feature activation-2.18e-02

Pos logit contributions
lee+0.045
bled+0.043
atted+0.038
alks+0.037
ped+0.036
Neg logit contributions
 seeker-0.073
 hide-0.072
 exit-0.072
 hiding-0.071
 tag-0.069
Token figure
Token ID3785
Feature activation-3.86e-02

Pos logit contributions
 seeker+0.029
 hide+0.027
 tag+0.025
 hiding+0.025
 exit+0.024
Neg logit contributions
lee-0.055
alks-0.049
atted-0.046
bled-0.046
iu-0.045
Token.
Token ID13
Feature activation-8.29e-03

Pos logit contributions
 seeker+0.052
 hide+0.047
 tag+0.045
 chamber+0.043
 hiding+0.043
Neg logit contributions
lee-0.099
bled-0.090
atted-0.089
alks-0.085
aned-0.082
Token But
Token ID887
Feature activation+5.84e-03

Pos logit contributions
 hide+0.015
 tag+0.014
 hiding+0.014
 seeker+0.013
 exit+0.013
Neg logit contributions
lee-0.018
bled-0.016
atted-0.016
alks-0.016
urg-0.015
Token no
Token ID645
Feature activation+2.10e-02

Pos logit contributions
lee+0.013
bled+0.012
atted+0.011
rac+0.011
alks+0.011
Neg logit contributions
 hide-0.009
 tag-0.009
 seeker-0.009
 hiding-0.009
 exit-0.009
Token matter
Token ID2300
Feature activation-3.74e-03

Pos logit contributions
lee+0.040
atted+0.035
bled+0.034
eled+0.034
iously+0.034
Neg logit contributions
 hide-0.037
 tag-0.036
 seeker-0.036
 chamber-0.034
 hiding-0.033
Token how
Token ID703
Feature activation-2.05e-02

Pos logit contributions
 seeker+0.009
 hide+0.008
 tag+0.008
 hiding+0.008
 gym+0.008
Neg logit contributions
lee-0.006
atted-0.005
bled-0.005
rac-0.004
alks-0.004
Token hard
Token ID1327
Feature activation+1.95e-02

Pos logit contributions
 hide+0.027
 tag+0.026
 hiding+0.025
 seeker+0.024
 hidden+0.023
Neg logit contributions
lee-0.054
bled-0.049
alks-0.048
atted-0.048
urg-0.047
Token she
Token ID673
Feature activation+1.01e-03

Pos logit contributions
lee+0.043
bled+0.038
alks+0.038
atted+0.037
urg+0.036
Neg logit contributions
 hide-0.032
 tag-0.031
 seeker-0.031
 hiding-0.030
 exit-0.029
Token too
Token ID1165
Feature activation-5.09e-02

Pos logit contributions
 hide+0.022
 tag+0.022
 seeker+0.022
 maze+0.021
 exit+0.020
Neg logit contributions
lee-0.029
bled-0.026
atted-0.025
alks-0.024
urg-0.024
Token comfortable
Token ID6792
Feature activation-2.58e-04

Pos logit contributions
 hide+0.087
 tag+0.085
 seeker+0.080
 hiding+0.078
 exit+0.076
Neg logit contributions
lee-0.117
alks-0.099
atted-0.099
bled-0.097
urg-0.093
Token here
Token ID994
Feature activation-1.17e-02

Pos logit contributions
 tag+0.000
 hide+0.000
 seeker+0.000
 exit+0.000
 maze+0.000
Neg logit contributions
lee-0.001
bled-0.001
atted-0.001
urg-0.001
alks-0.001
Token."
Token ID526
Feature activation+1.86e-02

Pos logit contributions
 seeker+0.016
 hide+0.016
 tag+0.016
 maze+0.015
 exit+0.015
Neg logit contributions
lee-0.027
bled-0.025
rac-0.023
alks-0.023
atted-0.023
Token↵
Token ID198
Feature activation+2.24e-02

Pos logit contributions
lee+0.040
bled+0.036
alks+0.035
atted+0.035
urg+0.033
Neg logit contributions
 hide-0.033
 tag-0.032
 hiding-0.031
 seeker-0.030
 hidden-0.029
Token↵
Token ID198
Feature activation+2.07e-02

Pos logit contributions
lee+0.062
bled+0.057
atted+0.056
alks+0.056
urg+0.054
Neg logit contributions
 hide-0.027
 tag-0.026
 hiding-0.024
 hidden-0.022
 seeker-0.021
TokenThe
Token ID464
Feature activation+1.95e-02

Pos logit contributions
lee+0.043
atted+0.037
bled+0.037
alks+0.037
urg+0.035
Neg logit contributions
 hide-0.040
 tag-0.039
 hiding-0.037
 seeker-0.036
 hidden-0.036
Token big
Token ID1263
Feature activation+2.35e-02

Pos logit contributions
lee+0.052
bled+0.047
alks+0.046
atted+0.046
urg+0.044
Neg logit contributions
 hide-0.026
 tag-0.025
 hiding-0.025
 seeker-0.023
 hidden-0.023
Token bear
Token ID6842
Feature activation+1.30e-02

Pos logit contributions
lee+0.062
bled+0.056
atted+0.055
alks+0.055
urg+0.053
Neg logit contributions
 hide-0.029
 tag-0.028
 seeker-0.027
 hiding-0.026
 hidden-0.024
Token was
Token ID373
Feature activation-5.69e-03

Pos logit contributions
lee+0.028
atted+0.024
bled+0.024
aned+0.023
alks+0.023
Neg logit contributions
 seeker-0.023
 hide-0.022
 tag-0.022
 hiding-0.021
 exit-0.021
Token very
Token ID845
Feature activation-1.97e-02

Pos logit contributions
 hide+0.008
 tag+0.008
 seeker+0.008
 hiding+0.008
 exit+0.007
Neg logit contributions
lee-0.014
bled-0.012
atted-0.012
alks-0.012
urg-0.011
Token feeling
Token ID4203
Feature activation+1.88e-02

Pos logit contributions
lee+0.053
bled+0.049
alks+0.048
atted+0.047
urg+0.046
Neg logit contributions
 hide-0.024
 tag-0.023
 hiding-0.022
 hidden-0.020
 seeker-0.020
Token grateful
Token ID14066
Feature activation+7.38e-02

Pos logit contributions
lee+0.046
alks+0.041
bled+0.041
urg+0.040
atted+0.040
Neg logit contributions
 hide-0.029
 tag-0.027
 hiding-0.026
 hidden-0.026
 seeker-0.025
Token for
Token ID329
Feature activation+7.33e-02

Pos logit contributions
lee+0.209
bled+0.192
alks+0.189
atted+0.187
urg+0.183
Neg logit contributions
 hide-0.081
 tag-0.077
 hiding-0.075
 seeker-0.069
 hidden-0.068
Token his
Token ID465
Feature activation+5.70e-02

Pos logit contributions
lee+0.167
bled+0.149
atted+0.147
alks+0.147
urg+0.140
Neg logit contributions
 hide-0.123
 tag-0.119
 hiding-0.115
 seeker-0.111
 hidden-0.108
Token friends
Token ID2460
Feature activation+1.18e-02

Pos logit contributions
lee+0.111
bled+0.096
atted+0.095
alks+0.093
urg+0.089
Neg logit contributions
 hide-0.109
 tag-0.107
 seeker-0.107
 hiding-0.101
 exit-0.098
Token.
Token ID13
Feature activation+4.83e-03

Pos logit contributions
lee+0.028
bled+0.024
atted+0.023
ment+0.022
ums+0.022
Neg logit contributions
 seeker-0.018
 hide-0.017
 tag-0.017
 exit-0.016
 maze-0.016
Token From
Token ID3574
Feature activation+2.73e-02

Pos logit contributions
lee+0.010
bled+0.010
atted+0.009
rac+0.009
urg+0.009
Neg logit contributions
 hide-0.008
 tag-0.007
 hiding-0.007
 seeker-0.007
 hidden-0.007
Token that
Token ID326
Feature activation-1.16e-02

Pos logit contributions
lee+0.063
bled+0.057
alks+0.056
atted+0.055
urg+0.053
Neg logit contributions
 hide-0.044
 tag-0.043
 hiding-0.041
 seeker-0.041
 hidden-0.039
Token day
Token ID1110
Feature activation+2.14e-03

Pos logit contributions
 hide+0.014
 tag+0.013
 hiding+0.013
 seeker+0.012
 hidden+0.012
Neg logit contributions
lee-0.032
bled-0.030
alks-0.029
atted-0.029
urg-0.028
Token on
Token ID319
Feature activation+5.19e-03

Pos logit contributions
lee+0.007
bled+0.006
alks+0.006
atted+0.006
urg+0.006
Neg logit contributions
 hide-0.002
 tag-0.001
 hiding-0.001
 seeker-0.001
 hidden-0.001
Token,
Token ID11
Feature activation-5.48e-04

Pos logit contributions
lee+0.015
bled+0.014
atted+0.014
alks+0.013
urg+0.013
Neg logit contributions
 hide-0.005
 tag-0.005
 seeker-0.005
 hiding-0.005
 exit-0.005
Token She
Token ID1375
Feature activation+1.80e-02

Pos logit contributions
 hide+0.012
 tag+0.011
 hiding+0.011
 seeker+0.011
 hidden+0.010
Neg logit contributions
lee-0.016
bled-0.015
atted-0.014
alks-0.014
urg-0.014
Token said
Token ID531
Feature activation+5.06e-02

Pos logit contributions
lee+0.037
bled+0.032
atted+0.032
alks+0.030
aned+0.030
Neg logit contributions
 hide-0.033
 seeker-0.032
 tag-0.032
 hiding-0.032
 exit-0.030
Token,
Token ID11
Feature activation+2.35e-02

Pos logit contributions
lee+0.147
bled+0.134
atted+0.133
alks+0.131
rac+0.126
Neg logit contributions
 hide-0.051
 tag-0.049
 seeker-0.047
 hiding-0.046
 exit-0.044
Token "
Token ID366
Feature activation+6.55e-03

Pos logit contributions
lee+0.053
bled+0.047
atted+0.046
alks+0.046
urg+0.044
Neg logit contributions
 hide-0.040
 tag-0.038
 hiding-0.037
 seeker-0.037
 exit-0.035
TokenNo
Token ID2949
Feature activation+4.34e-02

Pos logit contributions
lee+0.017
urg+0.016
atted+0.015
bled+0.015
alks+0.015
Neg logit contributions
 tag-0.010
 hide-0.010
 seeker-0.010
 like-0.008
 hidden-0.008
Token,
Token ID11
Feature activation+1.26e-02

Pos logit contributions
lee+0.111
atted+0.097
alks+0.095
ums+0.093
um+0.092
Neg logit contributions
 hide-0.054
 hidden-0.052
 chamber-0.051
 hiding-0.050
 seeker-0.050
Token you
Token ID345
Feature activation-2.29e-02

Pos logit contributions
lee+0.026
bled+0.023
alks+0.022
atted+0.022
rac+0.021
Neg logit contributions
 hide-0.023
 tag-0.022
 hiding-0.021
 seeker-0.021
 exit-0.021
Token are
Token ID389
Feature activation+3.97e-03

Pos logit contributions
 hiding+0.030
 hide+0.029
 chamber+0.029
 tag+0.028
 seeker+0.028
Neg logit contributions
lee-0.056
atted-0.050
bled-0.049
rac-0.049
alks-0.048
Token silly
Token ID14397
Feature activation+2.13e-02

Pos logit contributions
lee+0.007
bled+0.006
rac+0.006
alks+0.005
ped+0.005
Neg logit contributions
 cabin-0.007
 library-0.007
 shelter-0.007
house-0.007
 seeker-0.007
Token.
Token ID13
Feature activation-3.40e-02

Pos logit contributions
lee+0.056
atted+0.049
bled+0.048
ums+0.047
alks+0.046
Neg logit contributions
 hide-0.026
 seeker-0.025
 tag-0.025
 hidden-0.024
 hiding-0.023
Token And
Token ID843
Feature activation+1.30e-02

Pos logit contributions
 hide+0.055
 tag+0.054
 hiding+0.052
 seeker+0.051
 hidden+0.049
Neg logit contributions
lee-0.078
bled-0.070
atted-0.068
alks-0.068
urg-0.066
Token fun
Token ID1257
Feature activation-1.25e-02

Pos logit contributions
 seeker+0.012
 hide+0.011
 tag+0.010
 hiding+0.010
 maze+0.010
Neg logit contributions
lee-0.018
bled-0.015
alks-0.015
urg-0.014
rac-0.014
Token and
Token ID290
Feature activation+1.11e-02

Pos logit contributions
 seeker+0.020
 hide+0.018
 tag+0.018
 tim+0.017
 exit+0.017
Neg logit contributions
lee-0.027
atted-0.023
alks-0.023
lights-0.022
ums-0.021
Token cool
Token ID3608
Feature activation+2.22e-02

Pos logit contributions
lee+0.028
bled+0.024
atted+0.024
alks+0.024
rac+0.023
Neg logit contributions
 seeker-0.015
 hide-0.014
 tag-0.014
 hiding-0.014
 exit-0.013
Token.
Token ID13
Feature activation-5.02e-03

Pos logit contributions
lee+0.053
atted+0.047
ment+0.044
bled+0.044
lights+0.044
Neg logit contributions
 seeker-0.033
 hide-0.032
 tag-0.030
 hiding-0.028
 exit-0.027
Token↵
Token ID198
Feature activation+2.47e-03

Pos logit contributions
 hide+0.007
 tag+0.007
 hiding+0.007
 seeker+0.007
 hidden+0.006
Neg logit contributions
lee-0.012
bled-0.011
atted-0.011
alks-0.011
urg-0.011
Token↵
Token ID198
Feature activation+4.51e-03

Pos logit contributions
lee+0.005
bled+0.005
alks+0.005
atted+0.005
ipt+0.004
Neg logit contributions
 hide-0.004
 tag-0.004
 seeker-0.004
 hiding-0.004
 exit-0.004
TokenOne
Token ID3198
Feature activation-2.24e-02

Pos logit contributions
lee+0.010
bled+0.010
alks+0.009
atted+0.009
ipt+0.009
Neg logit contributions
 hide-0.007
 seeker-0.007
 tag-0.007
 hiding-0.006
 exit-0.006
Token day
Token ID1110
Feature activation-2.64e-02

Pos logit contributions
 tag+0.043
 hide+0.042
 hiding+0.040
 seeker+0.040
 anywhere+0.040
Neg logit contributions
lee-0.040
um-0.036
rac-0.035
atted-0.034
alks-0.034
Token,
Token ID11
Feature activation-1.97e-02

Pos logit contributions
 exit+0.040
 seeker+0.038
 hide+0.038
 tag+0.037
 hiding+0.036
Neg logit contributions
lee-0.059
bled-0.054
atted-0.053
um-0.052
alks-0.050
Token Lily
Token ID20037
Feature activation-6.99e-03

Pos logit contributions
 hide+0.035
 exit+0.033
 hiding+0.033
 tag+0.033
 seeker+0.033
Neg logit contributions
lee-0.039
bled-0.036
atted-0.033
alks-0.033
rac-0.032
Token went
Token ID1816
Feature activation-2.45e-02

Pos logit contributions
 tag+0.011
 hide+0.010
 seeker+0.010
 exit+0.010
 hiding+0.010
Neg logit contributions
lee-0.016
atted-0.014
ipt-0.013
eled-0.013
bled-0.013
Token there
Token ID612
Feature activation-4.16e-02

Pos logit contributions
 tag+0.118
 seeker+0.118
 hide+0.110
 exit+0.108
 library+0.107
Neg logit contributions
bled-0.211
lee-0.205
atted-0.194
eled-0.189
ipt-0.185
Token were
Token ID547
Feature activation-8.71e-02

Pos logit contributions
 seeker+0.059
 hide+0.057
 tag+0.056
 exit+0.053
 hiding+0.053
Neg logit contributions
lee-0.098
bled-0.089
atted-0.086
ipt-0.085
eled-0.084
Token two
Token ID734
Feature activation-6.03e-02

Pos logit contributions
 hide+0.125
 tag+0.118
 hiding+0.117
 seeker+0.114
 hidden+0.110
Neg logit contributions
lee-0.216
bled-0.198
atted-0.196
alks-0.196
aned-0.191
Token friends
Token ID2460
Feature activation-5.05e-02

Pos logit contributions
 hide+0.113
 tag+0.110
 hiding+0.107
 seeker+0.105
 hidden+0.101
Neg logit contributions
lee-0.123
bled-0.109
alks-0.106
atted-0.105
urg-0.101
Token named
Token ID3706
Feature activation-2.39e-02

Pos logit contributions
 hide+0.041
 tag+0.037
 hiding+0.036
 seeker+0.032
 hidden+0.031
Neg logit contributions
lee-0.157
bled-0.148
alks-0.145
atted-0.143
urg-0.140
Token Jack
Token ID3619
Feature activation-4.40e-02

Pos logit contributions
 tag+0.045
 seeker+0.044
 hide+0.044
 hidden+0.043
 exit+0.042
Neg logit contributions
lee-0.047
alks-0.035
urg-0.034
arted-0.034
bled-0.034
Token and
Token ID290
Feature activation-3.66e-02

Pos logit contributions
 hide+0.055
 tag+0.054
 seeker+0.052
 hiding+0.049
 exit+0.047
Neg logit contributions
lee-0.118
atted-0.105
bled-0.105
alks-0.104
urg-0.100
Token Jo
Token ID5302
Feature activation-3.22e-02

Pos logit contributions
 hide+0.064
 tag+0.063
 hiding+0.061
 seeker+0.060
 exit+0.058
Neg logit contributions
lee-0.078
bled-0.069
alks-0.067
atted-0.067
urg-0.064
Tokenjo
Token ID7639
Feature activation-4.38e-02

Pos logit contributions
 hide+0.032
 tag+0.030
 seeker+0.028
 hiding+0.028
 hidden+0.025
Neg logit contributions
lee-0.095
bled-0.087
atted-0.086
alks-0.085
urg-0.083
Token.
Token ID13
Feature activation-4.44e-02

Pos logit contributions
 seeker+0.054
 tag+0.051
 hide+0.050
 hiding+0.045
 exit+0.045
Neg logit contributions
lee-0.123
atted-0.110
bled-0.105
alks-0.103
eled-0.100
Token Every
Token ID3887
Feature activation-4.14e-02

Pos logit contributions
 hide+0.091
 tag+0.090
 seeker+0.088
 hiding+0.087
 hidden+0.085
Neg logit contributions
lee-0.083
alks-0.068
bled-0.065
atted-0.064
urg-0.063
Token be
Token ID307
Feature activation-5.24e-03

Pos logit contributions
 chamber+0.012
 tag+0.012
 hiding+0.012
 seeker+0.012
 gym+0.011
Neg logit contributions
lee-0.013
rac-0.012
alks-0.011
urg-0.011
ieties-0.011
Token careful
Token ID8161
Feature activation+3.25e-02

Pos logit contributions
 tag+0.009
 seeker+0.009
 hide+0.009
 gym+0.009
 library+0.009
Neg logit contributions
lee-0.010
bled-0.009
atted-0.008
rac-0.008
alks-0.008
Token.
Token ID13
Feature activation-2.40e-02

Pos logit contributions
lee+0.090
bled+0.083
atted+0.081
alks+0.081
urg+0.078
Neg logit contributions
 hide-0.036
 tag-0.035
 seeker-0.034
 hiding-0.032
 hidden-0.030
Token And
Token ID843
Feature activation+6.04e-03

Pos logit contributions
 hide+0.045
 tag+0.044
 hiding+0.044
 seeker+0.042
 hidden+0.041
Neg logit contributions
lee-0.048
bled-0.042
alks-0.041
atted-0.041
urg-0.039
Token we
Token ID356
Feature activation-1.07e-02

Pos logit contributions
lee+0.012
bled+0.011
rac+0.011
alks+0.011
atted+0.010
Neg logit contributions
 tag-0.010
 hide-0.010
 seeker-0.010
 hiding-0.010
 exit-0.009
Token have
Token ID423
Feature activation-1.33e-02

Pos logit contributions
 hiding+0.015
 tag+0.014
 chamber+0.013
 seeker+0.013
 gym+0.013
Neg logit contributions
lee-0.022
alks-0.020
rac-0.020
um-0.018
eled-0.018
Token to
Token ID284
Feature activation-2.97e-03

Pos logit contributions
 seeker+0.018
 hide+0.018
 tag+0.018
 hiding+0.017
 exit+0.016
Neg logit contributions
lee-0.032
atted-0.029
alks-0.029
rac-0.028
bled-0.028
Token take
Token ID1011
Feature activation+3.96e-02

Pos logit contributions
 chamber+0.005
 seeker+0.005
 tag+0.005
 hiding+0.005
 anywhere+0.004
Neg logit contributions
lee-0.005
rac-0.005
alks-0.005
urg-0.005
ieties-0.004
Token Rex
Token ID17853
Feature activation-1.68e-02

Pos logit contributions
lee+0.092
bled+0.082
alks+0.081
atted+0.081
rac+0.079
Neg logit contributions
 hide-0.059
 tag-0.058
 seeker-0.058
 hiding-0.057
 exit-0.053
Token with
Token ID351
Feature activation+1.73e-02

Pos logit contributions
 seeker+0.026
 maze+0.024
 tag+0.024
 hide+0.023
 library+0.023
Neg logit contributions
lee-0.033
rac-0.031
atted-0.030
bled-0.030
am-0.028
Token us
Token ID514
Feature activation+1.30e-03

Pos logit contributions
lee+0.029
bled+0.025
alks+0.024
atted+0.024
rac+0.023
Neg logit contributions
 seeker-0.036
 hide-0.036
 tag-0.036
 hiding-0.035
 exit-0.034
Token tired
Token ID10032
Feature activation+4.56e-03

Pos logit contributions
 library+0.017
 maze+0.016
 seeker+0.016
 exhibit+0.016
 museum+0.016
Neg logit contributions
lee-0.018
bled-0.017
arted-0.016
atted-0.015
ped-0.015
Token from
Token ID422
Feature activation+1.85e-02

Pos logit contributions
lee+0.012
alks+0.011
atted+0.011
bled+0.011
ums+0.010
Neg logit contributions
 hide-0.005
 seeker-0.005
 tag-0.005
 hiding-0.005
 hidden-0.004
Token running
Token ID2491
Feature activation-1.32e-02

Pos logit contributions
lee+0.037
bled+0.032
arted+0.031
atted+0.030
alks+0.029
Neg logit contributions
 seeker-0.033
 tag-0.031
 hide-0.031
 hidden-0.029
 maze-0.028
Token around
Token ID1088
Feature activation-1.21e-02

Pos logit contributions
 seeker+0.021
 library+0.019
 hiding+0.019
 hide+0.018
 tag+0.018
Neg logit contributions
lee-0.027
arted-0.023
atted-0.023
bled-0.023
ment-0.022
Token,
Token ID11
Feature activation-5.42e-03

Pos logit contributions
 seeker+0.020
 exit+0.017
 tag+0.017
 hide+0.016
 maze+0.016
Neg logit contributions
lee-0.026
alks-0.021
bled-0.021
atted-0.021
arted-0.020
Token she
Token ID673
Feature activation-1.13e-04

Pos logit contributions
 hide+0.009
 tag+0.009
 seeker+0.009
 hiding+0.009
 exit+0.009
Neg logit contributions
lee-0.012
bled-0.010
alks-0.010
atted-0.010
urg-0.010
Token decided
Token ID3066
Feature activation-1.10e-02

Pos logit contributions
 seeker+0.000
 hide+0.000
 tag+0.000
 exit+0.000
 hiding+0.000
Neg logit contributions
lee-0.000
bled-0.000
atted-0.000
eled-0.000
aned-0.000
Token to
Token ID284
Feature activation-1.61e-02

Pos logit contributions
 seeker+0.019
 hide+0.018
 tag+0.017
 tim+0.017
 chamber+0.017
Neg logit contributions
lee-0.024
bled-0.020
atted-0.019
rac-0.018
iously-0.018
Token face
Token ID1986
Feature activation+7.22e-03

Pos logit contributions
 seeker+0.024
 hiding+0.024
 hidden+0.024
 tag+0.023
 anywhere+0.022
Neg logit contributions
lee-0.031
rac-0.028
urg-0.028
um-0.027
bled-0.027
Token the
Token ID262
Feature activation-8.50e-03

Pos logit contributions
lee+0.012
ment+0.009
bled+0.009
atted+0.009
rac+0.009
Neg logit contributions
 seeker-0.016
 hide-0.015
 hiding-0.014
 maze-0.014
 tag-0.014
Token tornado
Token ID33718
Feature activation-4.52e-02

Pos logit contributions
 hide+0.014
 seeker+0.013
 hiding+0.013
 tag+0.013
 hidden+0.011
Neg logit contributions
lee-0.018
bled-0.016
alks-0.016
arted-0.015
 winds-0.015
Token and
Token ID290
Feature activation+2.43e-02

Pos logit contributions
lee+0.059
atted+0.052
bled+0.049
alks+0.049
ment+0.048
Neg logit contributions
 seeker-0.037
 hide-0.036
 tag-0.035
 hiding-0.031
 exit-0.031
Token angry
Token ID7954
Feature activation+5.14e-02

Pos logit contributions
lee+0.052
bled+0.045
atted+0.044
alks+0.042
aned+0.041
Neg logit contributions
 hide-0.043
 tag-0.043
 seeker-0.042
 hiding-0.041
 exit-0.039
Token.
Token ID13
Feature activation-5.99e-03

Pos logit contributions
lee+0.142
atted+0.127
bled+0.127
alks+0.123
ums+0.121
Neg logit contributions
 hide-0.058
 seeker-0.056
 tag-0.055
 hiding-0.052
 hidden-0.050
Token She
Token ID1375
Feature activation+1.08e-02

Pos logit contributions
 hide+0.010
 tag+0.010
 hiding+0.009
 seeker+0.009
 hidden+0.009
Neg logit contributions
lee-0.014
bled-0.012
atted-0.012
alks-0.012
urg-0.012
Token did
Token ID750
Feature activation+1.12e-02

Pos logit contributions
lee+0.023
atted+0.020
bled+0.019
aned+0.019
ped+0.018
Neg logit contributions
 seeker-0.019
 tag-0.018
 hide-0.018
 hiding-0.018
 exit-0.017
Token not
Token ID407
Feature activation+3.40e-03

Pos logit contributions
lee+0.020
bled+0.018
aned+0.017
rac+0.016
atted+0.016
Neg logit contributions
 seeker-0.022
 hide-0.021
 tag-0.021
 chamber-0.021
 hiding-0.020
Token want
Token ID765
Feature activation+2.54e-02

Pos logit contributions
lee+0.007
alks+0.006
atted+0.006
bled+0.006
urg+0.006
Neg logit contributions
 hiding-0.005
 tag-0.005
 seeker-0.005
 chamber-0.005
 hide-0.005
Token to
Token ID284
Feature activation+9.50e-03

Pos logit contributions
lee+0.065
bled+0.058
alks+0.058
atted+0.057
rac+0.055
Neg logit contributions
 hide-0.032
 tag-0.032
 seeker-0.031
 hiding-0.030
 hidden-0.028
Token leave
Token ID2666
Feature activation+3.22e-02

Pos logit contributions
lee+0.021
urg+0.018
alks+0.018
rac+0.018
bled+0.018
Neg logit contributions
 hiding-0.013
 seeker-0.013
 tag-0.013
 hidden-0.013
 hide-0.013
Token her
Token ID607
Feature activation+1.33e-02

Pos logit contributions
lee+0.059
bled+0.050
alks+0.048
atted+0.047
ums+0.045
Neg logit contributions
 seeker-0.059
 hide-0.058
 tag-0.057
 hiding-0.057
 hidden-0.055
Token home
Token ID1363
Feature activation-4.55e-03

Pos logit contributions
lee+0.028
atted+0.025
alks+0.024
lights+0.024
bled+0.024
Neg logit contributions
 hide-0.018
 tag-0.018
 seeker-0.017
 tim-0.016
 hidden-0.015
Token is
Token ID318
Feature activation-2.37e-02

Pos logit contributions
 seeker+0.021
 hide+0.018
 tag+0.018
 hiding+0.017
 exit+0.016
Neg logit contributions
lee-0.031
atted-0.028
rac-0.027
bled-0.027
alks-0.026
Token too
Token ID1165
Feature activation-4.27e-02

Pos logit contributions
 seeker+0.036
 hide+0.034
 tag+0.033
 chamber+0.029
 gym+0.028
Neg logit contributions
lee-0.048
bled-0.043
alks-0.040
urg-0.039
atted-0.039
Token small
Token ID1402
Feature activation-1.93e-04

Pos logit contributions
 seeker+0.074
 hide+0.073
 tag+0.073
 hiding+0.068
house+0.060
Neg logit contributions
lee-0.080
alks-0.069
lights-0.069
um-0.068
 winds-0.067
Token and
Token ID290
Feature activation+7.30e-03

Pos logit contributions
 seeker+0.000
 tag+0.000
 hide+0.000
 exit+0.000
 maze+0.000
Neg logit contributions
lee-0.000
alks-0.000
atted-0.000
iously-0.000
 winds-0.000
Token boring
Token ID14262
Feature activation+1.69e-02

Pos logit contributions
lee+0.016
bled+0.015
alks+0.014
rac+0.014
atted+0.014
Neg logit contributions
 seeker-0.011
 hide-0.011
 tag-0.011
 hiding-0.010
 exit-0.010
Token.
Token ID13
Feature activation-2.71e-02

Pos logit contributions
lee+0.038
bled+0.034
atted+0.033
alks+0.033
iously+0.032
Neg logit contributions
 seeker-0.024
 hide-0.023
 tag-0.023
 exit-0.020
 hidden-0.020
Token We
Token ID775
Feature activation-5.89e-03

Pos logit contributions
 hide+0.048
 tag+0.047
 hiding+0.047
 seeker+0.045
 hidden+0.044
Neg logit contributions
lee-0.057
bled-0.050
alks-0.050
atted-0.049
rac-0.047
Token need
Token ID761
Feature activation+6.62e-03

Pos logit contributions
 hiding+0.009
 tag+0.008
 seeker+0.008
 chamber+0.007
 maze+0.007
Neg logit contributions
lee-0.012
alks-0.011
rac-0.011
atted-0.011
eled-0.010
Token more
Token ID517
Feature activation+1.01e-02

Pos logit contributions
lee+0.016
alks+0.014
bled+0.014
atted+0.014
rac+0.014
Neg logit contributions
 seeker-0.009
 hide-0.009
 tag-0.009
 hiding-0.008
 exit-0.008
Token cars
Token ID5006
Feature activation-1.86e-02

Pos logit contributions
lee+0.023
atted+0.020
alks+0.020
ment+0.019
lights+0.018
Neg logit contributions
 seeker-0.016
 tag-0.014
 hide-0.014
 hiding-0.013
 tim-0.013
Token and
Token ID290
Feature activation+2.60e-02

Pos logit contributions
 hide+0.024
 tag+0.024
 seeker+0.024
 hiding+0.023
 chamber+0.021
Neg logit contributions
lee-0.045
bled-0.041
alks-0.041
atted-0.041
rac-0.040
TokenOnce
Token ID7454
Feature activation-3.64e-02

Pos logit contributions
 hide+0.052
 seeker+0.051
 tag+0.050
 hiding+0.050
 exit+0.049
Neg logit contributions
lee-0.046
ipt-0.040
umbled-0.039
alks-0.039
bled-0.039
Token upon
Token ID2402
Feature activation-3.76e-02

Pos logit contributions
 hide+0.033
 tag+0.031
 hiding+0.030
 seeker+0.029
 hidden+0.027
Neg logit contributions
lee-0.108
bled-0.101
alks-0.098
atted-0.098
urg-0.097
Token a
Token ID257
Feature activation-1.10e-01

Pos logit contributions
 seeker+0.057
 exit+0.049
 hiding+0.048
 tag+0.048
 maze+0.048
Neg logit contributions
lee-0.075
bled-0.071
ment-0.067
ped-0.065
ipt-0.065
Token time
Token ID640
Feature activation-8.29e-02

Pos logit contributions
 tag+0.131
 hide+0.129
 seeker+0.125
 hiding+0.115
 exit+0.113
Neg logit contributions
lee-0.282
atted-0.251
um-0.243
alks-0.242
bled-0.242
Token there
Token ID612
Feature activation-3.20e-02

Pos logit contributions
 seeker+0.109
 tag+0.107
 hide+0.106
 exit+0.103
 hidden+0.093
Neg logit contributions
lee-0.201
atted-0.181
bled-0.180
alks-0.173
rac-0.170
Token was
Token ID373
Feature activation-9.54e-02

Pos logit contributions
 seeker+0.049
 hide+0.046
 tag+0.046
 exit+0.044
 hiding+0.044
Neg logit contributions
lee-0.072
bled-0.064
atted-0.063
eled-0.062
ipt-0.061
Token a
Token ID257
Feature activation-1.11e-01

Pos logit contributions
 hide+0.103
 tag+0.102
 seeker+0.098
 hiding+0.095
 exit+0.094
Neg logit contributions
lee-0.252
bled-0.234
alks-0.222
atted-0.218
ipt-0.217
Token little
Token ID1310
Feature activation-6.94e-02

Pos logit contributions
 anywhere+0.213
 hide+0.211
 tag+0.207
 hiding+0.201
 exit+0.200
Neg logit contributions
lee-0.193
Dad-0.157
um-0.147
bled-0.141
alks-0.140
Token girl
Token ID2576
Feature activation-4.97e-02

Pos logit contributions
 tag+0.137
 hide+0.136
 anywhere+0.134
 hiding+0.131
 seeker+0.126
Neg logit contributions
lee-0.120
atted-0.097
alks-0.094
um-0.092
eled-0.090
Token.
Token ID13
Feature activation-4.43e-02

Pos logit contributions
 tag+0.066
 seeker+0.063
 hide+0.060
 exit+0.058
 hiding+0.056
Neg logit contributions
lee-0.129
atted-0.113
alks-0.107
eled-0.106
ums-0.105
Token Her
Token ID2332
Feature activation-9.94e-03

Pos logit contributions
 tag+0.092
 hide+0.092
 hiding+0.090
 seeker+0.088
 hidden+0.087
Neg logit contributions
lee-0.080
alks-0.065
atted-0.061
bled-0.059
ums-0.059
Token Daisy
Token ID40355
Feature activation-4.42e-02

Pos logit contributions
 seeker+0.114
 library+0.109
 exit+0.108
 hide+0.101
 maze+0.099
Neg logit contributions
bled-0.165
lee-0.154
ped-0.145
ipt-0.140
umbled-0.140
Token and
Token ID290
Feature activation-8.98e-03

Pos logit contributions
 tag+0.069
 seeker+0.069
 hide+0.063
 exit+0.063
 hiding+0.061
Neg logit contributions
lee-0.104
atted-0.087
eled-0.085
ipt-0.080
bled-0.080
Token her
Token ID607
Feature activation-3.00e-03

Pos logit contributions
 hide+0.017
 tag+0.017
 hiding+0.017
 seeker+0.017
 exit+0.016
Neg logit contributions
lee-0.017
bled-0.015
atted-0.014
alks-0.014
rac-0.014
Token mum
Token ID25682
Feature activation-8.78e-03

Pos logit contributions
 hide+0.005
 tag+0.005
 hiding+0.005
 seeker+0.005
 exit+0.005
Neg logit contributions
lee-0.006
bled-0.006
alks-0.005
atted-0.005
rac-0.005
Token took
Token ID1718
Feature activation+7.46e-03

Pos logit contributions
 tag+0.016
 hide+0.016
 hiding+0.016
 seeker+0.015
 exit+0.014
Neg logit contributions
lee-0.018
bled-0.015
atted-0.015
eled-0.015
alks-0.014
Token a
Token ID257
Feature activation-8.89e-02

Pos logit contributions
lee+0.019
bled+0.016
alks+0.016
atted+0.016
urg+0.016
Neg logit contributions
 seeker-0.010
 hide-0.010
 tag-0.010
 hiding-0.009
 exit-0.009
Token drive
Token ID3708
Feature activation-4.30e-02

Pos logit contributions
 seeker+0.137
 tag+0.121
 hide+0.116
 exit+0.106
 like+0.105
Neg logit contributions
lee-0.199
atted-0.168
alks-0.166
rac-0.162
um-0.161
Token.
Token ID13
Feature activation-3.27e-02

Pos logit contributions
 seeker+0.057
 tag+0.056
 hide+0.056
 hiding+0.054
 maze+0.048
Neg logit contributions
lee-0.104
atted-0.096
ipt-0.095
bled-0.095
rac-0.093
Token Daisy
Token ID40355
Feature activation-2.24e-02

Pos logit contributions
 hide+0.062
 tag+0.062
 seeker+0.061
 hiding+0.060
 hidden+0.058
Neg logit contributions
lee-0.064
alks-0.055
bled-0.054
atted-0.053
urg-0.050
Token was
Token ID373
Feature activation-4.06e-02

Pos logit contributions
 seeker+0.035
 tag+0.034
 hide+0.032
 hiding+0.032
 exit+0.032
Neg logit contributions
lee-0.051
atted-0.045
bled-0.043
eled-0.042
ipt-0.041
Token excited
Token ID6568
Feature activation+5.87e-03

Pos logit contributions
 tag+0.052
 hide+0.051
 maze+0.049
 seeker+0.049
 exit+0.047
Neg logit contributions
lee-0.096
bled-0.093
atted-0.086
ums-0.084
rac-0.081
Token a
Token ID257
Feature activation-1.14e-01

Pos logit contributions
 seeker+0.065
 exit+0.056
 maze+0.056
 hiding+0.055
 tag+0.054
Neg logit contributions
lee-0.084
bled-0.080
urg-0.073
ipt-0.073
ment-0.072
Token time
Token ID640
Feature activation-9.07e-02

Pos logit contributions
 tag+0.139
 hide+0.136
 seeker+0.135
 hiding+0.123
 exit+0.123
Neg logit contributions
lee-0.285
atted-0.255
um-0.249
bled-0.249
ums-0.245
Token,
Token ID11
Feature activation-9.35e-02

Pos logit contributions
 tag+0.117
 seeker+0.116
 hide+0.115
 exit+0.110
 hidden+0.100
Neg logit contributions
lee-0.220
atted-0.200
bled-0.199
alks-0.189
rac-0.188
Token there
Token ID612
Feature activation-3.94e-02

Pos logit contributions
 tag+0.108
 seeker+0.108
 hide+0.106
 exit+0.104
 library+0.102
Neg logit contributions
bled-0.228
lee-0.216
atted-0.203
eled-0.196
rac-0.195
Token lived
Token ID5615
Feature activation-7.46e-02

Pos logit contributions
 seeker+0.056
 hide+0.053
 tag+0.053
 exit+0.050
 hiding+0.050
Neg logit contributions
lee-0.092
bled-0.085
atted-0.082
ipt-0.080
eled-0.079
Token a
Token ID257
Feature activation-1.15e-01

Pos logit contributions
 seeker+0.085
 tag+0.083
 hide+0.081
 exit+0.080
 maze+0.079
Neg logit contributions
bled-0.183
lee-0.183
ipt-0.165
rac-0.162
atted-0.160
Token little
Token ID1310
Feature activation-7.18e-02

Pos logit contributions
 anywhere+0.218
 hide+0.211
 exit+0.211
 tag+0.210
 seeker+0.203
Neg logit contributions
lee-0.194
Dad-0.156
bled-0.143
um-0.142
alks-0.139
Token girl
Token ID2576
Feature activation-4.89e-02

Pos logit contributions
 hide+0.137
 tag+0.136
 anywhere+0.134
 hiding+0.132
 exit+0.128
Neg logit contributions
lee-0.130
alks-0.106
atted-0.105
um-0.102
ums-0.100
Token.
Token ID13
Feature activation-4.48e-02

Pos logit contributions
 tag+0.064
 seeker+0.061
 hide+0.059
 exit+0.057
 hiding+0.055
Neg logit contributions
lee-0.126
atted-0.110
alks-0.106
bled-0.104
eled-0.104
Token She
Token ID1375
Feature activation-1.63e-02

Pos logit contributions
 tag+0.091
 hide+0.091
 hiding+0.089
 seeker+0.087
 hidden+0.086
Neg logit contributions
lee-0.082
alks-0.068
atted-0.064
bled-0.063
ums-0.063
Token loved
Token ID6151
Feature activation-1.55e-02

Pos logit contributions
 tag+0.024
 seeker+0.024
 hide+0.023
 exit+0.023
 hiding+0.022
Neg logit contributions
lee-0.038
bled-0.034
atted-0.032
rac-0.031
alks-0.031
Tokenmy
Token ID1820
Feature activation-9.02e-03

Pos logit contributions
lee+0.009
bled+0.008
atted+0.008
alks+0.008
urg+0.008
Neg logit contributions
 hide-0.007
 tag-0.007
 hiding-0.007
 seeker-0.007
 exit-0.006
Token went
Token ID1816
Feature activation-3.54e-02

Pos logit contributions
 tag+0.018
 hiding+0.017
 hide+0.017
 seeker+0.017
 chamber+0.016
Neg logit contributions
lee-0.017
bled-0.014
eled-0.014
atted-0.014
umbled-0.013
Token to
Token ID284
Feature activation-2.88e-02

Pos logit contributions
 tag+0.057
 seeker+0.056
 hide+0.052
 exit+0.049
 tim+0.049
Neg logit contributions
lee-0.072
bled-0.064
atted-0.063
alks-0.062
arted-0.059
Token a
Token ID257
Feature activation-9.08e-02

Pos logit contributions
 seeker+0.052
 tag+0.050
 hide+0.047
 exit+0.047
 hiding+0.047
Neg logit contributions
lee-0.055
alks-0.045
rac-0.044
urg-0.043
bled-0.042
Token big
Token ID1263
Feature activation-7.23e-02

Pos logit contributions
 seeker+0.147
 tim+0.137
 tag+0.130
 like+0.122
adow+0.120
Neg logit contributions
lee-0.146
um-0.125
ths-0.124
Dad-0.123
 Dad-0.117
Token building
Token ID2615
Feature activation-6.58e-02

Pos logit contributions
 tag+0.093
 seeker+0.086
 tim+0.084
 hide+0.077
 like+0.075
Neg logit contributions
lee-0.150
ths-0.123
alks-0.123
um-0.122
ummy-0.118
Token with
Token ID351
Feature activation-2.89e-02

Pos logit contributions
 seeker+0.089
 tag+0.079
 hide+0.078
 hiding+0.073
 shelter+0.069
Neg logit contributions
lee-0.155
bled-0.145
ipt-0.145
atted-0.144
alks-0.141
Token an
Token ID281
Feature activation-5.95e-02

Pos logit contributions
 hide+0.042
 seeker+0.040
 tag+0.040
 hiding+0.040
 exit+0.036
Neg logit contributions
lee-0.070
bled-0.063
alks-0.062
atted-0.061
urg-0.058
Token elevator
Token ID20932
Feature activation-7.82e-02

Pos logit contributions
 hide+0.085
 tag+0.084
 hiding+0.080
 seeker+0.079
 hidden+0.073
Neg logit contributions
lee-0.148
bled-0.131
alks-0.129
atted-0.128
urg-0.125
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
 seeker+0.107
 hide+0.103
 tag+0.100
 hiding+0.090
 chamber+0.084
Neg logit contributions
lee-0.188
atted-0.170
alks-0.165
bled-0.165
ipt-0.161
Token Lily
Token ID20037
Feature activation-3.51e-03

Pos logit contributions
 hide+0.036
 tag+0.035
 hiding+0.034
 seeker+0.034
 hidden+0.033
Neg logit contributions
lee-0.043
bled-0.037
alks-0.037
atted-0.036
ipt-0.035
Token upon
Token ID2402
Feature activation-4.14e-02

Pos logit contributions
 hide+0.041
 hiding+0.038
 tag+0.037
 seeker+0.035
 hidden+0.033
Neg logit contributions
lee-0.142
bled-0.132
alks-0.129
atted-0.128
urg-0.127
Token a
Token ID257
Feature activation-1.15e-01

Pos logit contributions
 seeker+0.058
 exit+0.050
 tag+0.050
 hiding+0.050
 hide+0.050
Neg logit contributions
lee-0.087
bled-0.084
ipt-0.078
ped-0.077
rac-0.077
Token time
Token ID640
Feature activation-9.75e-02

Pos logit contributions
 tag+0.131
 hide+0.130
 seeker+0.123
 hiding+0.117
 exit+0.114
Neg logit contributions
lee-0.299
atted-0.268
bled-0.266
alks-0.261
rac-0.259
Token,
Token ID11
Feature activation-9.30e-02

Pos logit contributions
 tag+0.107
 hide+0.107
 seeker+0.102
 exit+0.095
 hiding+0.092
Neg logit contributions
lee-0.262
bled-0.241
atted-0.239
alks-0.234
rac-0.230
Token there
Token ID612
Feature activation-4.00e-02

Pos logit contributions
 tag+0.094
 seeker+0.092
 hide+0.091
 exit+0.082
 library+0.081
Neg logit contributions
bled-0.248
lee-0.240
atted-0.228
rac-0.225
ipt-0.217
Token was
Token ID373
Feature activation-9.07e-02

Pos logit contributions
 seeker+0.051
 hide+0.051
 tag+0.050
 hiding+0.047
 exit+0.046
Neg logit contributions
lee-0.099
bled-0.091
atted-0.088
ipt-0.086
alks-0.085
Token a
Token ID257
Feature activation-1.14e-01

Pos logit contributions
 tag+0.099
 hide+0.098
 seeker+0.093
 hiding+0.093
 exit+0.090
Neg logit contributions
lee-0.233
bled-0.219
alks-0.207
atted-0.201
ipt-0.200
Token little
Token ID1310
Feature activation-7.17e-02

Pos logit contributions
 hide+0.199
 tag+0.197
 anywhere+0.196
 exit+0.187
 hiding+0.187
Neg logit contributions
lee-0.213
bled-0.176
Dad-0.171
um-0.169
alks-0.167
Token girl
Token ID2576
Feature activation-5.14e-02

Pos logit contributions
 hide+0.135
 tag+0.135
 anywhere+0.130
 hiding+0.129
 exit+0.124
Neg logit contributions
lee-0.127
alks-0.106
atted-0.105
um-0.103
ums-0.101
Token named
Token ID3706
Feature activation-4.20e-02

Pos logit contributions
 tag+0.060
 seeker+0.058
 hide+0.057
 exit+0.053
 hiding+0.053
Neg logit contributions
lee-0.138
atted-0.122
alks-0.119
bled-0.119
ums-0.116
Token Sue
Token ID26089
Feature activation-3.55e-02

Pos logit contributions
 tag+0.074
 seeker+0.072
 hide+0.070
 exit+0.068
 hidden+0.068
Neg logit contributions
lee-0.085
bled-0.068
alks-0.067
atted-0.067
rac-0.067