TOP ACTIVATIONS
MAX = 7.646

time playing with her or nam ents and making her special
the rest of the or nam ents onto the tree .
she had with her or nam ents . <|endoftext|> Once upon
All of these lovely or nam ents reminded him of the
to look at the or nam ents on the tree .
take one of the or nam ents outside for a walk
he especially liked the or nam ents he kept around .
at all the different or nam ents and adm iring the
and play with your or nam ents . Maybe you can
box full of beautiful or nam ents .
a box of shiny or nam ents that spark led in
garden with the other b unn ies . One day she
to help sort the or nam ents . She picked up
park filled with colourful or nam ents . Jimmy looked out
beautiful garden with the or nam ents . The
she could carry the or nam ents to the place .
Fred played with the b unn ies all day , hopping
who liked to collect or nam ents . Everywhere she went
never played with the or nam ents again . They still
the rest of the or nam ents onto the tree .

MIN ACTIVATIONS
MIN = -6.099

was back . H oor ay ! <|endoftext|> Once upon a
Bob the Explorer " and all of his friends cheered him
Hi Jay den ! Ready for school today ?"
he shouted " h oor ay !" The jet
, nine , eight , seven , six , five ,
! Let ’ s ride on ! ”
. " Ready for blast off ?" Lily asks
cake and a bike in front of them . They are
was a good idea after all . Then they
, Anna . You spelled all the words right . You
outside and said " please bird ie , sing again !"
said , " Yes , even though the cow was unknown
buttons , switches and levers all around . Amy started touching
said , " Yes , all the ingredients you need are
went home and told mom all about their fun day .
, five , six , seven , eight , nine ,
and dad , h oor ay ! They pulled
to go back and warn all the other fish about the
hidden , but right in front of them . <|endoftext|> Sam
pet als and a yellow middle . It smells nice .

INTERVAL 4.210 - 7.646
CONTAINS 0.022%

had cleaned up all the trash from the ash tr ay
seen . Em ma was so proud of
's something important . <|endoftext|> Em ma was a 3 year
big bowl of ice - cream ! The girl peek ed
friends . Everyone admired his pants , and he felt very

INTERVAL 0.774 - 4.210
CONTAINS 22.977%

Tim and Lily were playing in the park with their new
in the garden . They had a big drum that made
than anything else . <|endoftext|> Once upon a time there was
yard . He looked up in the sky and it was
to do . She held the jacket close to her and

INTERVAL -2.663 - 0.774
CONTAINS 75.763%

she went , she had a box of colors with her
her , but he knew he couldn 't . He had
dog . She started to cry . <|endoftext|> Sam and Lily
something no one else had done and they always chose him
," her mom said . " Now we can enjoy the

INTERVAL -6.099 - -2.663
CONTAINS 1.238%

would cl ap his wings together in excitement .
. They cut a hole for the door . They tape
he had to pay a lot of money to go to
forth . Anna laughed and cl apped . She said ,
to be mad at someone for sne ez ing .
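
The four INTERVAL ranges above look consistent with equal-width bins spanning MIN to MAX, and the CONTAINS percentages appear to add up to 100%. The sketch below checks both, assuming equal-width binning (the dump does not state how the intervals were chosen). The helper name `equal_width_edges` is hypothetical, and the numbers are copied from the listing above.

```python
# Sanity check of the interval listing, assuming equal-width bins between
# the reported MIN and MAX (an assumption; the dump does not say so).

def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as reported in the dump (already rounded to 3 decimals).
edges = equal_width_edges(-6.099, 7.646, 4)

# Interval boundaries as printed in the INTERVAL headers.
reported_edges = [-6.099, -2.663, 0.774, 4.210, 7.646]

# Largest gap between a computed edge and the printed one; small gaps are
# expected because MIN and MAX were rounded before being printed.
max_gap = max(abs(e - r) for e, r in zip(edges, reported_edges))

# CONTAINS percentages, listed from the highest interval to the lowest.
shares = [0.022, 22.977, 75.763, 1.238]
total_share = sum(shares)

print("largest edge mismatch:", max_gap)
print("sum of CONTAINS percentages:", total_share)
```

If the binning were not equal-width, the mismatch would be much larger than the rounding error in the printed MIN and MAX.
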
Token time
Token ID640
Feature activation+1.51e+00

Pos logit contributions
 door+0.012
 us+0.011
 me+0.011
 too+0.010
 gate+0.010
Neg logit contributions
 secure-0.010
Er-0.009
 blonde-0.009
 shave-0.009
asting-0.009
Token playing
Token ID2712
Feature activation+7.40e-01

Pos logit contributions
 secure+0.057
ge+0.054
asting+0.053
Er+0.053
 blonde+0.053
Neg logit contributions
 door-0.046
 us-0.044
 me-0.041
 gate-0.041
 poster-0.039
Token with
Token ID351
Feature activation-6.66e-02

Pos logit contributions
 secure+0.022
 shave+0.021
asting+0.020
 snug+0.019
 blonde+0.019
Neg logit contributions
 door-0.026
 cho-0.025
ays-0.024
 gate-0.024
 us-0.024
Token her
Token ID607
Feature activation+1.46e+00

Pos logit contributions
 door+0.002
 us+0.002
 gate+0.002
 poster+0.002
 me+0.002
Neg logit contributions
 secure-0.002
 blonde-0.002
Er-0.002
asting-0.002
 Ethan-0.002
Token or
Token ID393
Feature activation+1.97e+00

Pos logit contributions
 secure+0.058
Er+0.053
 blonde+0.053
ge+0.053
 shave+0.053
Neg logit contributions
 door-0.042
 us-0.042
 me-0.038
 cho-0.037
 gate-0.037
Tokennam
Token ID7402
Feature activation+7.65e+00

Pos logit contributions
 secure+0.070
 shave+0.063
 blonde+0.063
Er+0.063
ge+0.063
Neg logit contributions
 door-0.066
 us-0.064
 gate-0.060
 me-0.058
 cho-0.058
Tokenents
Token ID658
Feature activation+2.58e+00

Pos logit contributions
 secure+0.353
Er+0.338
ge+0.332
asting+0.332
 shave+0.327
Neg logit contributions
 door-0.176
 us-0.167
 me-0.154
 too-0.147
 gate-0.146
Token and
Token ID290
Feature activation+7.73e-01

Pos logit contributions
 secure+0.099
ge+0.095
asting+0.095
 energ+0.086
 shave+0.085
Neg logit contributions
 door-0.078
 us-0.078
 me-0.075
 poster-0.069
 gate-0.068
Token making
Token ID1642
Feature activation+1.20e+00

Pos logit contributions
 secure+0.028
 blonde+0.026
Er+0.025
ge+0.025
 shave+0.025
Neg logit contributions
 us-0.025
 door-0.025
 gate-0.023
 me-0.023
ays-0.022
Token her
Token ID607
Feature activation+7.00e-01

Pos logit contributions
 secure+0.049
Er+0.045
asting+0.045
 blonde+0.045
 shave+0.044
Neg logit contributions
 door-0.035
 us-0.032
 gate-0.031
 me-0.030
 too-0.029
Token special
Token ID2041
Feature activation+5.59e-01

Pos logit contributions
 secure+0.021
 blonde+0.019
Er+0.019
 shave+0.018
ge+0.018
Neg logit contributions
 door-0.028
 us-0.027
 me-0.026
 gate-0.025
 cho-0.025
Token the
Token ID262
Feature activation-2.06e-01

Pos logit contributions
 door+0.015
 us+0.014
 poster+0.014
 me+0.013
 gate+0.013
Neg logit contributions
 secure-0.019
asting-0.018
Er-0.017
 shave-0.017
 Ethan-0.017
Token rest
Token ID1334
Feature activation+9.60e-01

Pos logit contributions
 us+0.006
 door+0.006
 me+0.006
 too+0.006
 gate+0.005
Neg logit contributions
 secure-0.008
Er-0.007
asting-0.007
 shave-0.007
ge-0.007
Token of
Token ID286
Feature activation+5.25e-01

Pos logit contributions
 secure+0.038
asting+0.038
ge+0.035
 snug+0.034
 energ+0.033
Neg logit contributions
 door-0.026
 us-0.025
 me-0.025
 poster-0.025
 gate-0.023
Token the
Token ID262
Feature activation+1.16e+00

Pos logit contributions
 secure+0.020
Er+0.019
 shave+0.019
ge+0.019
asting+0.019
Neg logit contributions
 door-0.016
 us-0.015
 me-0.014
 gate-0.014
aw-0.013
Token or
Token ID393
Feature activation+6.40e-01

Pos logit contributions
 secure+0.046
Er+0.043
asting+0.042
ge+0.042
 shave+0.042
Neg logit contributions
 door-0.034
 us-0.033
 me-0.031
 too-0.030
 gate-0.029
Tokennam
Token ID7402
Feature activation+7.00e+00

Pos logit contributions
 secure+0.024
 shave+0.021
ge+0.021
Er+0.021
asting+0.021
Neg logit contributions
 door-0.021
 us-0.020
 me-0.019
 gate-0.019
 cho-0.018
Tokenents
Token ID658
Feature activation+2.07e+00

Pos logit contributions
 secure+0.318
Er+0.308
ge+0.306
asting+0.301
 shave+0.295
Neg logit contributions
 door-0.164
 us-0.155
 me-0.145
 gate-0.137
 too-0.137
Token onto
Token ID4291
Feature activation-3.75e-01

Pos logit contributions
 secure+0.082
asting+0.078
ge+0.078
 energ+0.071
 shave+0.069
Neg logit contributions
 door-0.060
 us-0.060
 me-0.059
 poster-0.054
 gate-0.053
Token the
Token ID262
Feature activation+4.11e-01

Pos logit contributions
 door+0.010
 us+0.009
 gate+0.009
 me+0.009
 poster+0.009
Neg logit contributions
 secure-0.015
Er-0.014
asting-0.014
 Ethan-0.014
 shave-0.014
Token tree
Token ID5509
Feature activation-1.98e+00

Pos logit contributions
 secure+0.016
Er+0.015
asting+0.015
 blonde+0.015
 shave+0.015
Neg logit contributions
 door-0.012
 us-0.012
 me-0.011
 too-0.011
 gate-0.011
Token.
Token ID13
Feature activation+7.77e-02

Pos logit contributions
 us+0.057
 door+0.056
 me+0.053
 poster+0.050
 gate+0.049
Neg logit contributions
 secure-0.080
ge-0.074
asting-0.072
 shave-0.069
 blonde-0.069
Token she
Token ID673
Feature activation-4.47e-01

Pos logit contributions
 secure+0.047
 blonde+0.043
asting+0.043
 Ethan+0.042
Er+0.042
Neg logit contributions
 door-0.036
 us-0.034
 me-0.032
 gate-0.032
 cho-0.031
Token had
Token ID550
Feature activation+8.99e-01

Pos logit contributions
 door+0.019
 us+0.017
 gate+0.017
ays+0.017
 cho+0.016
Neg logit contributions
 secure-0.013
 energ-0.011
asting-0.011
Er-0.011
 Ethan-0.011
Token with
Token ID351
Feature activation-3.24e-02

Pos logit contributions
 secure+0.032
asting+0.030
 blonde+0.029
Er+0.029
ge+0.029
Neg logit contributions
 door-0.031
 us-0.028
 gate-0.027
 cho-0.027
 me-0.026
Token her
Token ID607
Feature activation+1.01e+00

Pos logit contributions
 door+0.001
 us+0.001
 me+0.001
 too+0.001
 gate+0.001
Neg logit contributions
 secure-0.001
Er-0.001
asting-0.001
ge-0.001
 shave-0.001
Token or
Token ID393
Feature activation+1.61e+00

Pos logit contributions
 secure+0.035
 blonde+0.032
Er+0.032
ge+0.031
 shave+0.031
Neg logit contributions
 door-0.033
 us-0.033
 cho-0.031
 me-0.030
 gate-0.030
Tokennam
Token ID7402
Feature activation+6.96e+00

Pos logit contributions
 secure+0.061
 shave+0.056
 blonde+0.055
Er+0.055
asting+0.054
Neg logit contributions
 door-0.050
 us-0.048
 gate-0.046
 cho-0.044
 me-0.044
Tokenents
Token ID658
Feature activation+1.62e+00

Pos logit contributions
 secure+0.325
Er+0.312
ge+0.307
asting+0.306
 blonde+0.302
Neg logit contributions
 door-0.156
 us-0.148
 me-0.136
 too-0.129
 gate-0.129
Token.
Token ID13
Feature activation+2.70e-01

Pos logit contributions
 secure+0.069
ge+0.064
asting+0.064
 energ+0.060
 blonde+0.059
Neg logit contributions
 door-0.043
 us-0.043
 me-0.040
 poster-0.037
ipt-0.037
Token<|endoftext|>
Token ID50256
Feature activation-1.59e+00

Pos logit contributions
 secure+0.010
asting+0.010
ge+0.010
Er+0.009
 shave+0.009
Neg logit contributions
 door-0.009
 us-0.008
 me-0.008
 too-0.008
 gate-0.007
TokenOnce
Token ID7454
Feature activation+1.25e+00

Pos logit contributions
 door+0.046
 us+0.044
 me+0.042
 too+0.041
 gate+0.039
Neg logit contributions
 secure-0.066
ge-0.061
asting-0.061
Er-0.060
 shave-0.060
Token upon
Token ID2402
Feature activation-2.37e+00

Pos logit contributions
 secure+0.045
Er+0.043
ge+0.042
asting+0.042
 shave+0.041
Neg logit contributions
 door-0.040
 us-0.039
 me-0.036
 gate-0.036
 too-0.035
Token All
Token ID1439
Feature activation-4.31e-01

Pos logit contributions
 secure+0.016
Er+0.015
asting+0.015
ge+0.015
 shave+0.014
Neg logit contributions
 door-0.016
 us-0.015
 me-0.015
 too-0.014
 gate-0.014
Token of
Token ID286
Feature activation+1.60e+00

Pos logit contributions
 door+0.013
 us+0.011
 gate+0.011
 me+0.011
 poster+0.011
Neg logit contributions
 secure-0.018
asting-0.016
Er-0.016
 energ-0.016
 shave-0.016
Token these
Token ID777
Feature activation+1.77e+00

Pos logit contributions
 secure+0.074
 energ+0.069
ge+0.069
Er+0.068
asting+0.068
Neg logit contributions
 us-0.039
 door-0.037
 me-0.036
 too-0.034
 them-0.031
Token lovely
Token ID14081
Feature activation+5.13e-01

Pos logit contributions
 secure+0.067
Er+0.063
asting+0.062
 blonde+0.062
ge+0.061
Neg logit contributions
 door-0.055
 us-0.052
 me-0.050
 gate-0.049
 too-0.048
Token or
Token ID393
Feature activation+1.54e+00

Pos logit contributions
 secure+0.022
 blonde+0.020
Er+0.020
ge+0.020
asting+0.020
Neg logit contributions
 door-0.014
 us-0.013
 me-0.012
 gate-0.012
 too-0.012
Tokennam
Token ID7402
Feature activation+6.92e+00

Pos logit contributions
 secure+0.041
Er+0.038
asting+0.038
ge+0.037
tto+0.036
Neg logit contributions
 door-0.066
 us-0.064
 me-0.062
 too-0.061
 gate-0.059
Tokenents
Token ID658
Feature activation+1.85e+00

Pos logit contributions
 secure+0.330
Er+0.314
asting+0.310
ge+0.309
 shave+0.306
Neg logit contributions
 door-0.150
 us-0.141
 me-0.130
 too-0.124
 gate-0.122
Token reminded
Token ID14516
Feature activation-7.41e-01

Pos logit contributions
 secure+0.069
asting+0.064
ge+0.064
 shave+0.060
Er+0.060
Neg logit contributions
 door-0.060
 us-0.058
 me-0.056
 gate-0.053
 poster-0.051
Token him
Token ID683
Feature activation-9.04e-01

Pos logit contributions
 door+0.022
 us+0.020
 gate+0.019
 me+0.019
 poster+0.018
Neg logit contributions
 secure-0.030
Er-0.028
asting-0.028
ge-0.027
 shave-0.027
Token of
Token ID286
Feature activation+2.33e-01

Pos logit contributions
 door+0.020
 us+0.019
 me+0.017
 gate+0.017
 poster+0.016
Neg logit contributions
 secure-0.043
asting-0.040
ge-0.039
Er-0.039
 shave-0.039
Token the
Token ID262
Feature activation+4.83e-01

Pos logit contributions
 secure+0.010
Er+0.009
asting+0.009
ge+0.009
 shave+0.009
Neg logit contributions
 door-0.007
 us-0.006
 me-0.006
 gate-0.006
 too-0.006
Token to
Token ID284
Feature activation+1.43e+00

Pos logit contributions
 shave+0.018
 Ethan+0.018
 secure+0.017
Er+0.017
 Baby+0.017
Neg logit contributions
 door-0.012
 gate-0.011
 us-0.011
ays-0.010
 too-0.010
Token look
Token ID804
Feature activation+6.57e-01

Pos logit contributions
 secure+0.049
Er+0.046
asting+0.046
ge+0.045
 energ+0.043
Neg logit contributions
 door-0.051
 us-0.049
 me-0.048
 too-0.046
 gate-0.045
Token at
Token ID379
Feature activation+1.02e-03

Pos logit contributions
 secure+0.034
Er+0.032
asting+0.032
ge+0.032
 shave+0.032
Neg logit contributions
 door-0.012
 us-0.011
 me-0.009
 gate-0.009
 too-0.009
Token the
Token ID262
Feature activation+1.19e+00

Pos logit contributions
 secure+0.000
 shave+0.000
Er+0.000
 Ethan+0.000
 blonde+0.000
Neg logit contributions
 door-0.000
 us-0.000
 cho-0.000
 gate-0.000
 too-0.000
Token or
Token ID393
Feature activation+1.96e+00

Pos logit contributions
Er+0.047
 secure+0.046
asting+0.045
 shave+0.045
 blonde+0.045
Neg logit contributions
 us-0.030
 door-0.029
 too-0.027
 me-0.026
 cho-0.026
Tokennam
Token ID7402
Feature activation+6.76e+00

Pos logit contributions
 secure+0.055
Er+0.050
asting+0.050
ge+0.049
 blonde+0.047
Neg logit contributions
 door-0.082
 us-0.079
 me-0.077
 too-0.075
 gate-0.073
Tokenents
Token ID658
Feature activation+2.18e+00

Pos logit contributions
 secure+0.311
Er+0.300
ge+0.295
asting+0.294
 shave+0.289
Neg logit contributions
 door-0.155
 us-0.146
 me-0.136
 too-0.129
 gate-0.129
Token on
Token ID319
Feature activation+4.96e-01

Pos logit contributions
 secure+0.085
ge+0.083
asting+0.082
 energ+0.075
Er+0.075
Neg logit contributions
 door-0.064
 us-0.063
 me-0.061
 cho-0.056
 gate-0.055
Token the
Token ID262
Feature activation+6.32e-02

Pos logit contributions
 secure+0.022
Er+0.021
asting+0.021
ge+0.020
 shave+0.020
Neg logit contributions
 door-0.013
 us-0.012
 me-0.011
 too-0.011
 gate-0.011
Token tree
Token ID5509
Feature activation-1.84e+00

Pos logit contributions
 secure+0.002
Er+0.002
asting+0.002
 blonde+0.002
 shave+0.002
Neg logit contributions
 door-0.002
 us-0.002
 me-0.002
 too-0.002
 gate-0.002
Token.
Token ID13
Feature activation+8.14e-01

Pos logit contributions
 door+0.052
 us+0.052
 me+0.048
 cho+0.048
 gate+0.045
Neg logit contributions
 secure-0.071
ge-0.068
asting-0.066
 blonde-0.065
 shave-0.065
Token take
Token ID1011
Feature activation+1.44e+00

Pos logit contributions
 door+0.001
 us+0.001
 me+0.001
 too+0.001
 gate+0.001
Neg logit contributions
 secure-0.001
Er-0.001
ge-0.001
asting-0.001
 energ-0.001
Token one
Token ID530
Feature activation-1.67e+00

Pos logit contributions
 secure+0.060
Er+0.056
ge+0.056
asting+0.056
 shave+0.054
Neg logit contributions
 door-0.041
 us-0.040
 me-0.038
 too-0.037
 gate-0.035
Token of
Token ID286
Feature activation+7.67e-01

Pos logit contributions
 us+0.046
 door+0.044
 cho+0.043
 me+0.040
 comes+0.040
Neg logit contributions
 secure-0.066
 shave-0.062
asting-0.061
ge-0.060
Er-0.060
Token the
Token ID262
Feature activation+1.19e+00

Pos logit contributions
 secure+0.028
Er+0.026
 shave+0.026
asting+0.026
ge+0.026
Neg logit contributions
 door-0.025
 us-0.024
 gate-0.022
 me-0.022
 cho-0.021
Token or
Token ID393
Feature activation+1.28e+00

Pos logit contributions
 secure+0.043
Er+0.040
asting+0.039
 blonde+0.039
ge+0.039
Neg logit contributions
 door-0.040
 us-0.039
 me-0.036
 too-0.035
 gate-0.035
Tokennam
Token ID7402
Feature activation+6.76e+00

Pos logit contributions
 secure+0.038
 shave+0.034
Er+0.034
ge+0.033
asting+0.033
Neg logit contributions
 door-0.051
 us-0.049
 me-0.047
 gate-0.046
 cho-0.046
Tokenents
Token ID658
Feature activation+1.70e+00

Pos logit contributions
 secure+0.313
Er+0.302
ge+0.297
asting+0.295
 shave+0.290
Neg logit contributions
 door-0.153
 us-0.145
 me-0.135
 too-0.128
 gate-0.127
Token outside
Token ID2354
Feature activation+4.21e-01

Pos logit contributions
 secure+0.072
ge+0.069
asting+0.068
 energ+0.063
 shave+0.063
Neg logit contributions
 door-0.044
 us-0.044
 me-0.042
 poster-0.039
 cho-0.038
Token for
Token ID329
Feature activation-2.90e+00

Pos logit contributions
 secure+0.016
ge+0.015
asting+0.015
 shave+0.015
 Ethan+0.014
Neg logit contributions
 door-0.012
 us-0.012
 me-0.011
 gate-0.011
 poster-0.011
Token a
Token ID257
Feature activation-8.77e-01

Pos logit contributions
 door+0.081
 us+0.080
 too+0.075
 me+0.075
 gate+0.065
Neg logit contributions
 secure-0.120
Er-0.114
tto-0.114
asting-0.113
ge-0.112
Token walk
Token ID2513
Feature activation+5.13e-01

Pos logit contributions
 door+0.033
 us+0.031
 me+0.031
 too+0.030
 gate+0.029
Neg logit contributions
 secure-0.028
Er-0.027
asting-0.026
tto-0.026
Marcus-0.025
Token he
Token ID339
Feature activation-3.64e-02

Pos logit contributions
Er+0.005
ge+0.005
 secure+0.005
asting+0.005
Marcus+0.005
Neg logit contributions
 door-0.008
 us-0.008
 me-0.008
 too-0.008
 gate-0.007
Token especially
Token ID2592
Feature activation-9.27e-01

Pos logit contributions
 door+0.001
 us+0.001
 gate+0.001
 poster+0.001
 me+0.001
Neg logit contributions
 secure-0.001
Er-0.001
asting-0.001
 shave-0.001
 energ-0.001
Token liked
Token ID8288
Feature activation-4.23e-01

Pos logit contributions
 door+0.028
 too+0.027
 us+0.027
 me+0.027
 cho+0.024
Neg logit contributions
Er-0.034
 secure-0.034
ge-0.033
asting-0.032
Marcus-0.032
Token the
Token ID262
Feature activation+1.21e+00

Pos logit contributions
 door+0.008
 gate+0.008
 poster+0.008
ays+0.007
 us+0.007
Neg logit contributions
 secure-0.020
 Ethan-0.019
 blonde-0.019
ge-0.019
 shave-0.019
Token or
Token ID393
Feature activation+1.41e+00

Pos logit contributions
 secure+0.046
Er+0.043
asting+0.042
ge+0.042
 shave+0.042
Neg logit contributions
 door-0.039
 us-0.037
 me-0.035
 too-0.034
 gate-0.034
Tokennam
Token ID7402
Feature activation+6.76e+00

Pos logit contributions
 secure+0.051
Er+0.049
asting+0.048
ge+0.047
tto+0.046
Neg logit contributions
 door-0.047
 us-0.045
 me-0.043
 too-0.042
 gate-0.040
Tokenents
Token ID658
Feature activation+1.98e+00

Pos logit contributions
 secure+0.331
Er+0.311
asting+0.310
 shave+0.307
 blonde+0.304
Neg logit contributions
 door-0.140
 us-0.131
 me-0.120
 too-0.115
 gate-0.111
Token he
Token ID339
Feature activation+2.17e-01

Pos logit contributions
 secure+0.083
asting+0.078
ge+0.078
Er+0.075
 shave+0.075
Neg logit contributions
 door-0.055
 us-0.053
 me-0.050
 gate-0.047
 poster-0.046
Token kept
Token ID4030
Feature activation+1.08e+00

Pos logit contributions
 secure+0.008
Er+0.007
asting+0.007
ge+0.007
 shave+0.007
Neg logit contributions
 door-0.007
 us-0.007
 me-0.007
 gate-0.006
 too-0.006
Token around
Token ID1088
Feature activation-9.54e-01

Pos logit contributions
 secure+0.045
asting+0.042
ge+0.042
Er+0.042
 shave+0.041
Neg logit contributions
 door-0.030
 us-0.028
 gate-0.026
 me-0.026
 poster-0.025
Token.
Token ID13
Feature activation+4.57e-01

Pos logit contributions
 door+0.026
 us+0.025
 gate+0.024
 me+0.023
 poster+0.022
Neg logit contributions
 secure-0.039
ge-0.037
Er-0.036
asting-0.036
 shave-0.036
Token at
Token ID379
Feature activation-2.85e-01

Pos logit contributions
 secure+0.106
 energ+0.098
asting+0.098
 snug+0.096
Er+0.094
Neg logit contributions
 us-0.051
 door-0.050
 poster-0.048
 me-0.046
 cho-0.043
Token all
Token ID477
Feature activation-2.82e+00

Pos logit contributions
 door+0.008
 us+0.007
 me+0.007
 gate+0.007
 too+0.007
Neg logit contributions
 secure-0.012
Er-0.011
asting-0.011
ge-0.011
 shave-0.011
Token the
Token ID262
Feature activation+1.44e+00

Pos logit contributions
 door+0.082
 us+0.077
 me+0.072
 gate+0.070
 too+0.070
Neg logit contributions
 secure-0.114
asting-0.107
Er-0.106
ge-0.104
 energ-0.102
Token different
Token ID1180
Feature activation-3.41e-01

Pos logit contributions
 secure+0.058
Er+0.055
asting+0.054
ge+0.054
 shave+0.053
Neg logit contributions
 door-0.042
 us-0.040
 me-0.038
 too-0.036
 gate-0.036
Token or
Token ID393
Feature activation+1.50e+00

Pos logit contributions
 door+0.009
 us+0.009
 me+0.008
 too+0.008
 cho+0.008
Neg logit contributions
 secure-0.014
asting-0.013
Er-0.013
ge-0.013
 blonde-0.013
Tokennam
Token ID7402
Feature activation+6.63e+00

Pos logit contributions
 secure+0.040
Er+0.037
asting+0.036
ge+0.035
 shave+0.035
Neg logit contributions
 door-0.065
 us-0.063
 me-0.061
 too-0.060
 gate-0.058
Tokenents
Token ID658
Feature activation+1.77e+00

Pos logit contributions
 secure+0.314
Er+0.300
asting+0.296
ge+0.295
 shave+0.291
Neg logit contributions
 door-0.146
 us-0.137
 me-0.127
 too-0.121
 gate-0.119
Token and
Token ID290
Feature activation-1.21e-01

Pos logit contributions
 secure+0.074
ge+0.071
asting+0.071
Er+0.066
 energ+0.065
Neg logit contributions
 door-0.048
 us-0.047
 me-0.045
 gate-0.041
 poster-0.041
Token adm
Token ID6178
Feature activation+4.66e-01

Pos logit contributions
 door+0.004
 us+0.004
 me+0.004
 gate+0.004
 too+0.004
Neg logit contributions
 secure-0.004
Er-0.004
ge-0.004
 blonde-0.004
asting-0.004
Tokeniring
Token ID3428
Feature activation+6.63e-01

Pos logit contributions
 secure+0.014
Er+0.013
asting+0.013
ge+0.013
 blonde+0.012
Neg logit contributions
 door-0.018
 us-0.018
 me-0.017
 gate-0.017
 too-0.016
Token the
Token ID262
Feature activation+1.44e+00

Pos logit contributions
 secure+0.029
Er+0.027
asting+0.027
 blonde+0.027
ge+0.027
Neg logit contributions
 door-0.017
 us-0.016
 me-0.015
 gate-0.014
 too-0.014
Token and
Token ID290
Feature activation-2.88e-01

Pos logit contributions
 door+0.038
 us+0.037
 me+0.034
 gate+0.033
 cho+0.032
Neg logit contributions
 secure-0.068
ge-0.063
Er-0.063
asting-0.063
 shave-0.061
Token play
Token ID711
Feature activation-1.30e+00

Pos logit contributions
 door+0.010
 us+0.009
 gate+0.009
 cho+0.009
 poster+0.009
Neg logit contributions
 secure-0.010
 shave-0.009
 Ethan-0.009
 energ-0.009
Er-0.009
Token with
Token ID351
Feature activation-1.37e+00

Pos logit contributions
 door+0.043
 gate+0.040
ipt+0.038
 poster+0.038
ays+0.038
Neg logit contributions
 secure-0.046
zing-0.043
asting-0.043
 shave-0.041
 energ-0.041
Token your
Token ID534
Feature activation+9.69e-01

Pos logit contributions
 door+0.043
 gate+0.039
 us+0.038
 poster+0.038
 cho+0.036
Neg logit contributions
 secure-0.051
 shave-0.047
 Ethan-0.046
Er-0.046
ge-0.046
Token or
Token ID393
Feature activation+7.68e-01

Pos logit contributions
 secure+0.036
 blonde+0.033
 shave+0.033
 snug+0.032
Er+0.032
Neg logit contributions
 us-0.028
 door-0.028
 cho-0.026
ays-0.026
 me-0.026
Tokennam
Token ID7402
Feature activation+6.62e+00

Pos logit contributions
 secure+0.030
 shave+0.028
ge+0.027
Er+0.026
asting+0.026
Neg logit contributions
 door-0.022
 us-0.021
ays-0.021
aw-0.020
 gate-0.020
Tokenents
Token ID658
Feature activation+1.73e+00

Pos logit contributions
 secure+0.294
Er+0.285
ge+0.283
asting+0.279
 shave+0.274
Neg logit contributions
 door-0.162
 us-0.152
 me-0.143
 gate-0.137
 too-0.135
Token.
Token ID13
Feature activation-1.51e-01

Pos logit contributions
ge+0.061
 secure+0.061
asting+0.060
free+0.053
 energ+0.052
Neg logit contributions
 door-0.056
 us-0.052
 me-0.052
 poster-0.050
 gate-0.049
Token Maybe
Token ID6674
Feature activation-1.34e+00

Pos logit contributions
 door+0.004
 us+0.004
 me+0.004
 too+0.004
 gate+0.003
Neg logit contributions
 secure-0.007
asting-0.006
ge-0.006
 shave-0.006
Er-0.006
Token you
Token ID345
Feature activation-4.39e-01

Pos logit contributions
 door+0.034
 us+0.031
 gate+0.029
 me+0.028
 cho+0.027
Neg logit contributions
 secure-0.058
Er-0.055
 shave-0.054
 Ethan-0.054
ge-0.054
Token can
Token ID460
Feature activation-4.56e-01

Pos logit contributions
 door+0.012
 us+0.011
 me+0.011
 cho+0.011
 gate+0.010
Neg logit contributions
 secure-0.018
ge-0.017
Er-0.017
asting-0.017
 shave-0.016
Token box
Token ID3091
Feature activation-1.69e+00

Pos logit contributions
 us+0.028
 door+0.027
 me+0.025
 too+0.025
 cho+0.024
Neg logit contributions
 secure-0.041
 shave-0.038
 blonde-0.037
Er-0.037
asting-0.037
Token full
Token ID1336
Feature activation+1.20e-01

Pos logit contributions
 us+0.044
 door+0.043
 me+0.040
 gate+0.037
 cho+0.037
Neg logit contributions
 secure-0.072
asting-0.068
ge-0.067
Er-0.066
 shave-0.066
Token of
Token ID286
Feature activation+1.59e+00

Pos logit contributions
 secure+0.005
asting+0.005
 shave+0.005
 blonde+0.005
 snug+0.005
Neg logit contributions
 us-0.003
 door-0.003
 poster-0.003
 me-0.003
 cho-0.003
Token beautiful
Token ID4950
Feature activation-8.97e-01

Pos logit contributions
 secure+0.061
ge+0.059
Er+0.059
asting+0.058
tto+0.057
Neg logit contributions
 door-0.051
 us-0.048
 me-0.046
 too-0.044
 gate-0.043
Token or
Token ID393
Feature activation+1.80e+00

Pos logit contributions
 door+0.024
 us+0.022
 me+0.021
 too+0.020
 gate+0.020
Neg logit contributions
 secure-0.039
Er-0.037
asting-0.036
ge-0.036
 shave-0.036
Tokennam
Token ID7402
Feature activation+6.54e+00

Pos logit contributions
tto+0.049
asting+0.048
 secure+0.047
oney+0.046
Marcus+0.045
Neg logit contributions
 door-0.079
 too-0.074
 me-0.072
 us-0.071
 gate-0.069
Tokenents
Token ID658
Feature activation+1.64e+00

Pos logit contributions
 secure+0.306
Er+0.292
asting+0.288
ge+0.287
 shave+0.283
Neg logit contributions
 door-0.146
 us-0.139
 me-0.128
 too-0.122
 gate-0.120
Token.
Token ID13
Feature activation+1.15e+00

Pos logit contributions
 secure+0.067
asting+0.064
ge+0.064
Er+0.060
 energ+0.059
Neg logit contributions
 door-0.046
 us-0.045
 me-0.044
 poster-0.039
 gate-0.039
Token
Token ID220
Feature activation+7.20e-01

Pos logit contributions
 secure+0.052
asting+0.051
ge+0.050
Er+0.048
 shave+0.047
Neg logit contributions
 door-0.031
 me-0.029
 us-0.028
 too-0.027
 gate-0.025
Token
Token ID198
Feature activation+1.47e+00

Pos logit contributions
 secure+0.028
Er+0.026
asting+0.026
ge+0.025
 shave+0.025
Neg logit contributions
 door-0.022
 us-0.022
 me-0.020
 too-0.020
 gate-0.019
Token
Token ID198
Feature activation+1.68e+00

Pos logit contributions
 secure+0.065
asting+0.061
ge+0.060
 energ+0.059
 shave+0.058
Neg logit contributions
 door-0.039
 me-0.039
 us-0.038
 too-0.035
 gate-0.031
Token a
Token ID257
Feature activation+6.74e-02

Pos logit contributions
 secure+0.030
 blonde+0.029
Er+0.027
asting+0.027
 shave+0.026
Neg logit contributions
 door-0.029
 us-0.029
 poster-0.028
ays-0.028
 gate-0.027
Token box
Token ID3091
Feature activation-1.50e+00

Pos logit contributions
 secure+0.002
Er+0.002
 blonde+0.002
 shave+0.002
asting+0.002
Neg logit contributions
 us-0.003
 door-0.002
 me-0.002
 too-0.002
ays-0.002
Token of
Token ID286
Feature activation+6.72e-01

Pos logit contributions
 us+0.043
 door+0.041
 me+0.039
 cho+0.037
 poster+0.036
Neg logit contributions
 secure-0.061
asting-0.056
 blonde-0.055
Er-0.055
 shave-0.054
Token shiny
Token ID22441
Feature activation+1.86e-02

Pos logit contributions
 secure+0.025
Er+0.023
 blonde+0.022
 shave+0.022
asting+0.022
Neg logit contributions
 door-0.022
 us-0.021
 me-0.020
 gate-0.020
 too-0.019
Token or
Token ID393
Feature activation+1.32e+00

Pos logit contributions
 secure+0.001
 blonde+0.001
Er+0.001
 shave+0.001
 Ethan+0.001
Neg logit contributions
 us-0.001
 door-0.001
 me-0.001
 gate-0.001
 cho-0.001
Tokennam
Token ID7402
Feature activation+6.45e+00

Pos logit contributions
 secure+0.038
asting+0.036
Er+0.036
ge+0.035
tto+0.035
Neg logit contributions
 door-0.055
 us-0.052
 me-0.051
 too-0.051
 gate-0.048
Tokenents
Token ID658
Feature activation+1.54e+00

Pos logit contributions
 secure+0.304
Er+0.291
asting+0.286
ge+0.286
 shave+0.282
Neg logit contributions
 door-0.142
 us-0.134
 me-0.124
 too-0.118
 gate-0.116
Token that
Token ID326
Feature activation-2.00e-01

Pos logit contributions
 secure+0.062
asting+0.060
ge+0.060
Er+0.055
 energ+0.054
Neg logit contributions
 us-0.043
 door-0.043
 me-0.042
 poster-0.038
 gate-0.037
Token spark
Token ID9009
Feature activation-1.44e+00

Pos logit contributions
 door+0.006
 us+0.006
 gate+0.006
 me+0.006
 poster+0.005
Neg logit contributions
 secure-0.008
Er-0.007
 blonde-0.007
asting-0.007
 shave-0.007
Tokenled
Token ID992
Feature activation+1.46e-01

Pos logit contributions
 door+0.052
 us+0.049
 me+0.047
 gate+0.047
 poster+0.045
Neg logit contributions
 secure-0.048
ge-0.044
asting-0.044
Er-0.043
 blonde-0.043
Token in
Token ID287
Feature activation+6.69e-01

Pos logit contributions
 secure+0.006
asting+0.005
 blonde+0.005
ge+0.005
 snug+0.005
Neg logit contributions
 door-0.004
 us-0.004
 poster-0.004
 gate-0.004
 me-0.004
Token garden
Token ID11376
Feature activation+7.09e-02

Pos logit contributions
 secure+0.012
Er+0.011
asting+0.011
 blonde+0.011
 shave+0.011
Neg logit contributions
 door-0.011
 us-0.010
 too-0.009
 me-0.009
 gate-0.009
Token with
Token ID351
Feature activation-6.98e-01

Pos logit contributions
 secure+0.002
asting+0.002
 blonde+0.002
ge+0.002
Er+0.002
Neg logit contributions
 door-0.002
 us-0.002
 poster-0.002
 gate-0.002
 cho-0.002
Token the
Token ID262
Feature activation+1.25e+00

Pos logit contributions
 door+0.019
 us+0.018
 me+0.017
 too+0.017
 gate+0.016
Neg logit contributions
 secure-0.030
Er-0.028
asting-0.028
ge-0.028
 shave-0.027
Token other
Token ID584
Feature activation+5.42e-01

Pos logit contributions
 secure+0.048
Er+0.045
asting+0.044
 blonde+0.044
 shave+0.044
Neg logit contributions
 door-0.038
 us-0.037
 me-0.034
 gate-0.034
 too-0.033
Token b
Token ID275
Feature activation+1.56e+00

Pos logit contributions
 secure+0.023
ge+0.023
Er+0.022
 shave+0.022
asting+0.022
Neg logit contributions
 me-0.015
 us-0.015
 door-0.015
 too-0.014
 gate-0.012
Tokenunn
Token ID20935
Feature activation+6.42e+00

Pos logit contributions
 secure+0.034
Er+0.030
 shave+0.029
asting+0.029
ge+0.028
Neg logit contributions
 door-0.075
 us-0.074
 me-0.071
 too-0.070
 gate-0.069
Tokenies
Token ID444
Feature activation+1.08e+00

Pos logit contributions
 secure+0.305
Er+0.289
 shave+0.285
asting+0.284
 blonde+0.281
Neg logit contributions
 door-0.143
 us-0.137
 me-0.127
 too-0.123
 gate-0.117
Token.
Token ID13
Feature activation+6.78e-01

Pos logit contributions
 secure+0.038
ge+0.036
asting+0.036
Er+0.035
 energ+0.035
Neg logit contributions
 door-0.035
 us-0.034
 me-0.031
 poster-0.030
 gate-0.030
Token One
Token ID1881
Feature activation+1.02e-01

Pos logit contributions
 secure+0.026
asting+0.024
Er+0.024
ge+0.024
 shave+0.023
Neg logit contributions
 door-0.022
 us-0.021
 me-0.021
 too-0.020
 gate-0.019
Token day
Token ID1110
Feature activation-6.78e-01

Pos logit contributions
Er+0.004
 secure+0.004
ge+0.004
 Ethan+0.004
 shave+0.004
Neg logit contributions
 me-0.003
 too-0.003
 us-0.003
 door-0.003
 bird-0.002
Token she
Token ID673
Feature activation-1.16e+00

Pos logit contributions
 me+0.010
 too+0.009
 us+0.009
 door+0.009
 to+0.007
Neg logit contributions
 secure-0.040
ge-0.038
asting-0.037
Er-0.037
 shave-0.036
Token to
Token ID284
Feature activation+1.66e+00

Pos logit contributions
 secure+0.031
 shave+0.029
ge+0.029
 energ+0.028
 blonde+0.028
Neg logit contributions
 us-0.018
 door-0.018
 gate-0.017
 cho-0.016
ays-0.016
Token help
Token ID1037
Feature activation+9.71e-01

Pos logit contributions
 secure+0.057
Er+0.055
ge+0.053
asting+0.053
 blonde+0.052
Neg logit contributions
 door-0.058
 us-0.056
 me-0.054
 too-0.053
 gate-0.051
Token sort
Token ID3297
Feature activation+2.51e+00

Pos logit contributions
 secure+0.044
Er+0.042
ge+0.041
asting+0.041
 energ+0.041
Neg logit contributions
 door-0.024
 us-0.023
 me-0.022
 too-0.021
 gate-0.020
Token the
Token ID262
Feature activation+9.62e-01

Pos logit contributions
 secure+0.105
Er+0.099
asting+0.097
ge+0.096
 shave+0.096
Neg logit contributions
 door-0.069
 us-0.065
 me-0.061
 gate-0.059
 too-0.058
Token or
Token ID393
Feature activation+8.23e-01

Pos logit contributions
 secure+0.037
Er+0.035
asting+0.034
 blonde+0.034
ge+0.034
Neg logit contributions
 door-0.029
 us-0.028
 me-0.026
 too-0.026
 gate-0.025
Tokennam
Token ID7402
Feature activation+6.35e+00

Pos logit contributions
 secure+0.023
Er+0.022
asting+0.021
ge+0.021
 shave+0.021
Neg logit contributions
 door-0.034
 us-0.033
 me-0.032
 too-0.031
 gate-0.030
Tokenents
Token ID658
Feature activation+1.41e+00

Pos logit contributions
 secure+0.284
Er+0.277
ge+0.273
asting+0.269
 shave+0.263
Neg logit contributions
 door-0.152
 us-0.144
 me-0.134
 gate-0.128
 too-0.127
Token.
Token ID13
Feature activation+7.55e-01

Pos logit contributions
 secure+0.058
ge+0.056
asting+0.054
 energ+0.051
Er+0.050
Neg logit contributions
 us-0.039
 door-0.038
 me-0.037
 cho-0.034
 poster-0.034
Token She
Token ID1375
Feature activation+3.62e-01

Pos logit contributions
 secure+0.025
 blonde+0.024
Er+0.024
 Ethan+0.023
 energ+0.022
Neg logit contributions
 us-0.026
 door-0.026
 cho-0.024
 me-0.024
 gate-0.023
Token picked
Token ID6497
Feature activation+2.18e+00

Pos logit contributions
 secure+0.011
Er+0.010
 blonde+0.010
 Ethan+0.009
 energ+0.009
Neg logit contributions
 door-0.014
 us-0.014
 gate-0.013
 poster-0.013
 cho-0.012
Token up
Token ID510
Feature activation+4.47e-01

Pos logit contributions
 secure+0.103
Er+0.098
asting+0.097
ge+0.096
 shave+0.096
Neg logit contributions
 door-0.048
 us-0.045
 me-0.041
 too-0.039
 gate-0.039
Token park
Token ID3952
Feature activation-1.81e+00

Pos logit contributions
ays+0.005
 door+0.005
ases+0.005
ucks+0.005
 us+0.005
Neg logit contributions
 secure-0.006
Er-0.005
 blonde-0.005
 snug-0.005
asting-0.005
Token filled
Token ID5901
Feature activation-1.39e+00

Pos logit contributions
 us+0.060
 door+0.057
ays+0.056
 me+0.055
ucks+0.055
Neg logit contributions
 secure-0.059
ge-0.056
asting-0.055
 blonde-0.052
 energ-0.051
Token with
Token ID351
Feature activation-1.79e-01

Pos logit contributions
 door+0.040
 us+0.036
 gate+0.035
 too+0.034
 me+0.033
Neg logit contributions
 secure-0.055
asting-0.054
Er-0.052
 shave-0.052
ge-0.051
Token colourful
Token ID44799
Feature activation-6.53e-01

Pos logit contributions
 door+0.006
 us+0.005
 me+0.005
 gate+0.005
 too+0.005
Neg logit contributions
 secure-0.007
asting-0.006
Er-0.006
 blonde-0.006
 shave-0.006
Token or
Token ID393
Feature activation+1.26e+00

Pos logit contributions
 door+0.018
 us+0.017
 me+0.016
 too+0.015
 gate+0.015
Neg logit contributions
 secure-0.027
 blonde-0.025
asting-0.025
Er-0.025
 shave-0.025
Tokennam
Token ID7402
Feature activation+6.33e+00

Pos logit contributions
 secure+0.039
Er+0.037
ge+0.037
asting+0.037
tto+0.036
Neg logit contributions
 door-0.048
 us-0.046
 me-0.044
 too-0.044
 gate-0.042
Tokenents
Token ID658
Feature activation+1.21e+00

Pos logit contributions
 secure+0.291
Er+0.280
ge+0.276
asting+0.274
 shave+0.269
Neg logit contributions
 door-0.145
 us-0.138
 me-0.128
 too-0.121
 gate-0.121
Token.
Token ID13
Feature activation+6.21e-01

Pos logit contributions
 secure+0.047
asting+0.046
ge+0.045
 energ+0.041
Er+0.041
Neg logit contributions
 door-0.036
 us-0.036
 me-0.035
 poster-0.032
 gate-0.031
Token Jimmy
Token ID12963
Feature activation+2.41e+00

Pos logit contributions
 secure+0.022
asting+0.020
ge+0.020
Er+0.020
 shave+0.019
Neg logit contributions
 door-0.022
 too-0.021
 me-0.021
 us-0.021
 gate-0.019
Token looked
Token ID3114
Feature activation+1.49e+00

Pos logit contributions
 secure+0.086
Er+0.082
ge+0.080
 Ethan+0.079
asting+0.078
Neg logit contributions
 door-0.080
 us-0.076
 gate-0.070
 me-0.069
 too-0.069
Token out
Token ID503
Feature activation+1.38e+00

Pos logit contributions
 secure+0.058
 energ+0.057
Er+0.055
 blonde+0.054
 Ethan+0.053
Neg logit contributions
 door-0.042
 us-0.041
 poster-0.039
 gate-0.037
ipt-0.037
Token beautiful
Token ID4950
Feature activation-1.71e+00

Pos logit contributions
 secure+0.015
Er+0.014
 blonde+0.013
 shave+0.013
asting+0.013
Neg logit contributions
 us-0.016
 door-0.015
 me-0.014
ays-0.014
 too-0.014
Token garden
Token ID11376
Feature activation-3.98e-01

Pos logit contributions
 us+0.060
ays+0.055
 door+0.055
ases+0.053
 cho+0.052
Neg logit contributions
 secure-0.057
 blonde-0.052
Er-0.051
 shave-0.050
 energ-0.049
Token with
Token ID351
Feature activation-6.22e-01

Pos logit contributions
 us+0.012
 me+0.011
 door+0.010
ipt+0.010
ays+0.010
Neg logit contributions
 secure-0.015
asting-0.014
 blonde-0.014
 snug-0.013
 energ-0.013
Token the
Token ID262
Feature activation+1.21e+00

Pos logit contributions
 door+0.018
 us+0.017
 me+0.016
 too+0.016
 gate+0.016
Neg logit contributions
 secure-0.025
Er-0.023
asting-0.023
 blonde-0.023
ge-0.023
Token or
Token ID393
Feature activation+1.06e+00

Pos logit contributions
 secure+0.045
Er+0.041
 blonde+0.041
asting+0.040
 Ethan+0.039
Neg logit contributions
 us-0.038
 door-0.038
 me-0.035
 cho-0.034
 too-0.034
Tokennam
Token ID7402
Feature activation+6.27e+00

Pos logit contributions
 secure+0.021
Er+0.018
asting+0.017
 blonde+0.017
ge+0.017
Neg logit contributions
 door-0.052
 us-0.051
 me-0.049
 gate-0.048
 too-0.048
Tokenents
Token ID658
Feature activation+1.37e+00

Pos logit contributions
 secure+0.281
Er+0.274
ge+0.270
asting+0.267
 blonde+0.262
Neg logit contributions
 door-0.147
 us-0.141
 me-0.131
 too-0.124
 gate-0.123
Token.
Token ID13
Feature activation+4.43e-01

Pos logit contributions
 secure+0.056
ge+0.053
asting+0.053
 energ+0.049
Er+0.048
Neg logit contributions
 us-0.038
 door-0.038
 me-0.037
ipt-0.033
 poster-0.032
Token\n
Token ID198
Feature activation+7.92e-01

Pos logit contributions
 secure+0.017
asting+0.016
Er+0.016
ge+0.016
 shave+0.016
Neg logit contributions
 door-0.014
 us-0.013
 me-0.013
 too-0.012
 gate-0.012
Token\n
Token ID198
Feature activation+1.10e+00

Pos logit contributions
 secure+0.035
 blonde+0.032
ge+0.032
 shave+0.032
 energ+0.032
Neg logit contributions
 door-0.021
 us-0.021
 me-0.020
 too-0.018
 gate-0.017
TokenThe
Token ID464
Feature activation-8.67e-02

Pos logit contributions
 secure+0.033
 shave+0.029
ge+0.029
Er+0.029
 Ethan+0.028
Neg logit contributions
 door-0.044
 us-0.044
 me-0.043
 too-0.041
 gate-0.040
Token she
Token ID673
Feature activation-6.33e-01

Pos logit contributions
 secure+0.057
 Baby+0.056
 Ethan+0.054
 energ+0.053
 blonde+0.053
Neg logit contributions
 door-0.058
ays-0.056
 poster-0.056
 us-0.056
 gate-0.055
Token could
Token ID714
Feature activation+1.01e+00

Pos logit contributions
 door+0.021
 us+0.020
 me+0.019
 gate+0.019
 too+0.019
Neg logit contributions
 secure-0.023
Er-0.021
asting-0.021
ge-0.021
 shave-0.021
Token carry
Token ID3283
Feature activation+1.77e+00

Pos logit contributions
 secure+0.031
Er+0.028
asting+0.027
 shave+0.027
ge+0.027
Neg logit contributions
 door-0.040
 us-0.039
 me-0.037
 gate-0.036
 too-0.036
Token the
Token ID262
Feature activation+9.47e-01

Pos logit contributions
 secure+0.077
Er+0.073
asting+0.072
ge+0.071
 shave+0.071
Neg logit contributions
 door-0.046
 us-0.044
 me-0.041
 too-0.039
 gate-0.039
Token or
Token ID393
Feature activation+1.11e+00

Pos logit contributions
 secure+0.037
Er+0.035
 blonde+0.034
asting+0.034
ge+0.033
Neg logit contributions
 door-0.029
 us-0.028
 me-0.026
 too-0.025
 gate-0.025
Tokennam
Token ID7402
Feature activation+6.22e+00

Pos logit contributions
 secure+0.031
 shave+0.026
Er+0.026
ge+0.026
 energ+0.025
Neg logit contributions
 door-0.046
 us-0.045
 cho-0.043
 gate-0.043
 me-0.042
Tokenents
Token ID658
Feature activation+1.48e+00

Pos logit contributions
 secure+0.280
Er+0.271
ge+0.268
asting+0.264
 blonde+0.258
Neg logit contributions
 door-0.147
 us-0.141
 me-0.131
 gate-0.124
 too-0.123
Token to
Token ID284
Feature activation+2.51e-01

Pos logit contributions
 secure+0.063
ge+0.059
asting+0.059
 snug+0.055
 energ+0.055
Neg logit contributions
 us-0.039
 door-0.038
 me-0.037
ipt-0.035
 poster-0.034
Token the
Token ID262
Feature activation-5.71e-02

Pos logit contributions
 secure+0.010
Er+0.010
asting+0.009
ge+0.009
 shave+0.009
Neg logit contributions
 door-0.007
 us-0.007
 me-0.007
 too-0.006
 gate-0.006
Token place
Token ID1295
Feature activation-7.11e-01

Pos logit contributions
 door+0.002
 us+0.002
 me+0.001
 too+0.001
 cho+0.001
Neg logit contributions
 secure-0.002
Er-0.002
 blonde-0.002
asting-0.002
 Ethan-0.002
Token.
Token ID13
Feature activation+1.58e-01

Pos logit contributions
 us+0.020
 cho+0.019
 door+0.019
 me+0.019
ipt+0.018
Neg logit contributions
 secure-0.029
 snug-0.025
ge-0.025
asting-0.025
 blonde-0.024
TokenFred
Token ID30847
Feature activation+1.59e+00

Pos logit contributions
 secure+0.028
Er+0.026
asting+0.025
ge+0.025
 shave+0.025
Neg logit contributions
 door-0.040
 us-0.039
 me-0.037
 too-0.036
 gate-0.036
Token played
Token ID2826
Feature activation+6.38e-01

Pos logit contributions
 secure+0.060
Er+0.057
asting+0.056
ge+0.055
 shave+0.055
Neg logit contributions
 door-0.050
 us-0.048
 me-0.045
 too-0.044
 gate-0.043
Token with
Token ID351
Feature activation-2.59e-01

Pos logit contributions
 secure+0.024
asting+0.024
 energ+0.023
 snug+0.022
zing+0.022
Neg logit contributions
 door-0.020
 us-0.018
 gate-0.017
 poster-0.017
ays-0.017
Token the
Token ID262
Feature activation+1.26e+00

Pos logit contributions
 us+0.006
 door+0.006
 me+0.006
 too+0.006
 gate+0.005
Neg logit contributions
 secure-0.012
Er-0.011
ge-0.011
asting-0.011
 shave-0.011
Token b
Token ID275
Feature activation+1.28e+00

Pos logit contributions
 secure+0.050
Er+0.047
ge+0.047
asting+0.047
 shave+0.047
Neg logit contributions
 door-0.038
 us-0.037
 me-0.036
 too-0.034
 gate-0.033
Tokenunn
Token ID20935
Feature activation+6.18e+00

Pos logit contributions
 secure+0.032
 shave+0.030
Er+0.030
 blonde+0.029
Marcus+0.028
Neg logit contributions
 us-0.057
 door-0.056
 me-0.055
 too-0.053
 gate-0.051
Tokenies
Token ID444
Feature activation+1.15e+00

Pos logit contributions
 secure+0.306
 shave+0.292
Er+0.292
 blonde+0.288
asting+0.284
Neg logit contributions
 door-0.123
 us-0.122
 me-0.110
 too-0.110
 gate-0.097
Token all
Token ID477
Feature activation-2.69e+00

Pos logit contributions
 secure+0.046
asting+0.043
ge+0.042
 energ+0.042
Er+0.041
Neg logit contributions
 door-0.034
 us-0.033
 me-0.030
 gate-0.029
 poster-0.029
Token day
Token ID1110
Feature activation-1.19e-01

Pos logit contributions
 door+0.093
 us+0.092
 me+0.087
 too+0.086
 gate+0.082
Neg logit contributions
 secure-0.093
Er-0.088
 shave-0.085
ge-0.085
asting-0.084
Token,
Token ID11
Feature activation+2.48e-01

Pos logit contributions
 door+0.003
 us+0.003
 me+0.003
 too+0.003
 gate+0.003
Neg logit contributions
 secure-0.005
Er-0.005
 shave-0.005
asting-0.005
ge-0.005
Token hopping
Token ID47153
Feature activation-8.58e-02

Pos logit contributions
 secure+0.010
Er+0.009
asting+0.009
ge+0.009
 blonde+0.009
Neg logit contributions
 door-0.008
 us-0.007
 me-0.007
 gate-0.007
 too-0.007
Token who
Token ID508
Feature activation+3.01e+00

Pos logit contributions
 door+0.012
 us+0.012
 me+0.011
 too+0.011
 gate+0.010
Neg logit contributions
 secure-0.020
Er-0.019
asting-0.019
ge-0.019
 shave-0.018
Token liked
Token ID8288
Feature activation+2.47e-01

Pos logit contributions
ge+0.160
 afloat+0.148
Er+0.145
Phill+0.144
Marcus+0.144
Neg logit contributions
 me-0.063
 is-0.063
 too-0.062
 comes-0.060
 us-0.056
Token to
Token ID284
Feature activation+1.20e+00

Pos logit contributions
 blonde+0.011
 shave+0.011
 secure+0.011
 Baby+0.011
 energ+0.011
Neg logit contributions
 us-0.005
 door-0.005
ays-0.004
 gate-0.004
 me-0.004
Token collect
Token ID2824
Feature activation+2.48e+00

Pos logit contributions
 secure+0.044
Er+0.043
ge+0.043
asting+0.041
tto+0.040
Neg logit contributions
 door-0.040
 us-0.037
 me-0.037
 too-0.036
 gate-0.034
Token or
Token ID393
Feature activation+1.42e+00

Pos logit contributions
 secure+0.107
Er+0.102
ge+0.101
asting+0.100
 shave+0.099
Neg logit contributions
 door-0.067
 us-0.064
 me-0.061
 too-0.060
 gate-0.057
Tokennam
Token ID7402
Feature activation+6.16e+00

Pos logit contributions
 secure+0.032
Er+0.030
asting+0.029
tto+0.029
ge+0.028
Neg logit contributions
 door-0.068
 us-0.066
 me-0.065
 too-0.063
 gate-0.061
Tokenents
Token ID658
Feature activation+1.21e+00

Pos logit contributions
 secure+0.286
Er+0.275
ge+0.270
asting+0.270
 shave+0.266
Neg logit contributions
 door-0.139
 us-0.131
 me-0.121
 too-0.116
 gate-0.115
Token.
Token ID13
Feature activation+1.18e+00

Pos logit contributions
 secure+0.043
ge+0.042
asting+0.042
free+0.037
Er+0.036
Neg logit contributions
 door-0.039
 us-0.038
 me-0.038
 poster-0.035
 gate-0.034
Token Everywhere
Token ID48201
Feature activation+9.99e-01

Pos logit contributions
 secure+0.050
Er+0.047
asting+0.047
ge+0.046
 shave+0.046
Neg logit contributions
 door-0.034
 us-0.032
 me-0.030
 too-0.030
 gate-0.028
Token she
Token ID673
Feature activation-1.96e+00

Pos logit contributions
 secure+0.032
 blonde+0.030
Er+0.029
 Ethan+0.029
 Baby+0.028
Neg logit contributions
 door-0.037
 us-0.034
 gate-0.033
 poster-0.031
 me-0.031
Token went
Token ID1816
Feature activation+3.62e-01

Pos logit contributions
 door+0.045
 us+0.042
 me+0.041
 too+0.037
 comes+0.035
Neg logit contributions
 secure-0.093
asting-0.089
ge-0.088
Er-0.087
 shave-0.087
Token never
Token ID1239
Feature activation-3.27e+00

Pos logit contributions
 door+0.037
 us+0.035
 me+0.034
 too+0.032
 gate+0.032
Neg logit contributions
 secure-0.043
Er-0.040
asting-0.040
ge-0.039
 shave-0.039
Token played
Token ID2826
Feature activation-9.78e-01

Pos logit contributions
 door+0.106
 us+0.102
 me+0.101
 too+0.100
 gate+0.092
Neg logit contributions
 secure-0.120
Er-0.118
asting-0.114
Marcus-0.112
 Ethan-0.110
Token with
Token ID351
Feature activation-1.93e+00

Pos logit contributions
 door+0.028
 us+0.026
 gate+0.024
 me+0.024
 poster+0.023
Neg logit contributions
 secure-0.040
asting-0.038
Er-0.037
 shave-0.037
 blonde-0.037
Token the
Token ID262
Feature activation+2.23e-01

Pos logit contributions
 door+0.051
 us+0.051
 me+0.047
 too+0.045
 gate+0.043
Neg logit contributions
 secure-0.083
Er-0.079
ge-0.078
asting-0.077
 shave-0.076
Token or
Token ID393
Feature activation+2.51e-01

Pos logit contributions
 secure+0.008
Er+0.008
asting+0.008
 shave+0.008
 blonde+0.008
Neg logit contributions
 door-0.007
 us-0.007
 me-0.006
 too-0.006
 gate-0.006
Tokennam
Token ID7402
Feature activation+6.14e+00

Pos logit contributions
 secure+0.009
asting+0.008
ge+0.008
 shave+0.008
 energ+0.008
Neg logit contributions
 door-0.008
 us-0.008
 gate-0.008
 me-0.007
 poster-0.007
Tokenents
Token ID658
Feature activation+1.28e+00

Pos logit contributions
 secure+0.286
Er+0.274
ge+0.270
asting+0.270
 shave+0.265
Neg logit contributions
 door-0.139
 us-0.131
 me-0.121
 too-0.115
 gate-0.114
Token again
Token ID757
Feature activation-2.40e+00

Pos logit contributions
 secure+0.061
ge+0.060
asting+0.059
 energ+0.056
 shave+0.055
Neg logit contributions
 door-0.027
 us-0.026
 me-0.025
 gate-0.023
 poster-0.022
Token.
Token ID13
Feature activation-5.99e-01

Pos logit contributions
 door+0.060
 us+0.057
 me+0.053
 too+0.051
 gate+0.050
Neg logit contributions
 secure-0.107
Er-0.102
asting-0.100
ge-0.100
 shave-0.099
Token They
Token ID1119
Feature activation+4.24e-01

Pos logit contributions
 door+0.016
 us+0.016
 me+0.015
 too+0.015
 gate+0.013
Neg logit contributions
 secure-0.026
ge-0.025
asting-0.024
 shave-0.024
 energ-0.024
Token still
Token ID991
Feature activation-1.46e+00

Pos logit contributions
 secure+0.014
Er+0.013
asting+0.013
 shave+0.012
ge+0.012
Neg logit contributions
 door-0.016
 us-0.015
 me-0.014
 gate-0.014
 too-0.014
Token the
Token ID262
Feature activation+7.94e-02

Pos logit contributions
 secure+0.002
Er+0.002
asting+0.002
ge+0.002
 blonde+0.002
Neg logit contributions
 door-0.002
 us-0.002
 me-0.001
 gate-0.001
 poster-0.001
Token rest
Token ID1334
Feature activation+1.35e+00

Pos logit contributions
 secure+0.003
asting+0.003
Er+0.003
Marcus+0.003
 shave+0.003
Neg logit contributions
 door-0.003
 us-0.003
 me-0.002
 too-0.002
 gate-0.002
Token of
Token ID286
Feature activation+7.82e-01

Pos logit contributions
 secure+0.056
asting+0.056
ge+0.053
 snug+0.050
 energ+0.050
Neg logit contributions
 door-0.036
 me-0.035
 us-0.035
 poster-0.033
 gate-0.032
Token the
Token ID262
Feature activation+1.14e+00

Pos logit contributions
 secure+0.039
asting+0.038
Er+0.038
Marcus+0.037
tto+0.037
Neg logit contributions
 door-0.016
 us-0.016
 too-0.015
 me-0.014
 them-0.013
Token or
Token ID393
Feature activation+7.12e-01

Pos logit contributions
 secure+0.043
Marcus+0.043
asting+0.042
Er+0.042
 shave+0.041
Neg logit contributions
 door-0.039
 us-0.035
 me-0.035
 too-0.033
 gate-0.032
Tokennam
Token ID7402
Feature activation+6.12e+00

Pos logit contributions
 secure+0.017
Er+0.015
 shave+0.015
ge+0.015
asting+0.015
Neg logit contributions
 door-0.032
 us-0.031
 me-0.030
 gate-0.030
 too-0.029
Tokenents
Token ID658
Feature activation+1.55e+00

Pos logit contributions
 secure+0.280
Er+0.269
ge+0.268
asting+0.264
 shave+0.258
Neg logit contributions
 door-0.141
 us-0.135
 me-0.127
 too-0.118
 gate-0.118
Token onto
Token ID4291
Feature activation-9.62e-01

Pos logit contributions
 secure+0.066
ge+0.062
asting+0.061
 energ+0.057
 snug+0.056
Neg logit contributions
 us-0.042
 me-0.041
 door-0.041
 poster-0.036
 gate-0.035
Token the
Token ID262
Feature activation+1.20e-01

Pos logit contributions
 door+0.025
 us+0.024
 me+0.022
 too+0.021
 gate+0.021
Neg logit contributions
 secure-0.043
Er-0.040
asting-0.040
ge-0.039
 shave-0.039
Token tree
Token ID5509
Feature activation-1.76e+00

Pos logit contributions
 secure+0.005
Er+0.005
asting+0.005
ge+0.005
Marcus+0.005
Neg logit contributions
 door-0.004
 us-0.003
 me-0.003
 too-0.003
 gate-0.003
Token.
Token ID13
Feature activation-3.77e-01

Pos logit contributions
 us+0.052
 door+0.050
 me+0.049
 poster+0.044
 cho+0.043
Neg logit contributions
 secure-0.071
ge-0.065
asting-0.063
 snug-0.060
Er-0.060
Token was
Token ID373
Feature activation+3.13e-02

Pos logit contributions
 door+0.054
 us+0.050
 gate+0.048
aw+0.047
 me+0.047
Neg logit contributions
 secure-0.069
Er-0.065
asting-0.063
ge-0.063
 Ethan-0.062
Token back
Token ID736
Feature activation-1.78e-01

Pos logit contributions
 secure+0.001
asting+0.001
 energ+0.001
 blonde+0.001
 snug+0.001
Neg logit contributions
 door-0.001
 us-0.001
 poster-0.001
 gate-0.001
als-0.001
Token.
Token ID13
Feature activation+1.54e-02

Pos logit contributions
 door+0.004
 us+0.004
 cho+0.004
 me+0.004
 poster+0.004
Neg logit contributions
 secure-0.008
ge-0.007
asting-0.007
Er-0.007
 energ-0.007
Token H
Token ID367
Feature activation+1.09e+00

Pos logit contributions
 secure+0.000
Er+0.000
 Ethan+0.000
 blonde+0.000
Baby+0.000
Neg logit contributions
 door-0.001
 us-0.001
 gate-0.001
 poster-0.001
 cho-0.001
Tokenoor
Token ID2675
Feature activation-2.80e+00

Pos logit contributions
 secure+0.045
 shave+0.043
 blonde+0.042
 energ+0.041
asting+0.041
Neg logit contributions
 door-0.030
 us-0.029
 me-0.028
 too-0.027
 cho-0.026
Tokenay
Token ID323
Feature activation-5.17e+00

Pos logit contributions
 door+0.084
 us+0.082
 gate+0.075
 cho+0.075
 poster+0.074
Neg logit contributions
 secure-0.102
Er-0.097
asting-0.097
ge-0.095
 Ethan-0.095
Token!
Token ID0
Feature activation-2.26e+00

Pos logit contributions
 door+0.166
 us+0.152
 gate+0.148
 poster+0.145
fork+0.144
Neg logit contributions
 secure-0.184
 Ethan-0.177
ge-0.174
asting-0.173
Er-0.171
Token<|endoftext|>
Token ID50256
Feature activation-2.36e+00

Pos logit contributions
 door+0.076
 us+0.073
 me+0.070
 too+0.068
 gate+0.067
Neg logit contributions
 secure-0.081
Er-0.076
asting-0.075
ge-0.074
 shave-0.074
TokenOnce
Token ID7454
Feature activation+8.42e-01

Pos logit contributions
 door+0.069
 us+0.066
 me+0.063
 too+0.061
 gate+0.058
Neg logit contributions
 secure-0.097
ge-0.090
asting-0.089
Er-0.089
 shave-0.088
Token upon
Token ID2402
Feature activation-2.46e+00

Pos logit contributions
 secure+0.031
Er+0.029
ge+0.029
asting+0.028
 shave+0.028
Neg logit contributions
 door-0.027
 us-0.026
 me-0.025
 gate-0.024
 too-0.024
Token a
Token ID257
Feature activation-2.49e-01

Pos logit contributions
 door+0.058
 us+0.057
 too+0.057
 me+0.056
 gate+0.046
Neg logit contributions
 secure-0.113
ge-0.109
Er-0.107
asting-0.103
 shave-0.102
TokenBob
Token ID18861
Feature activation+4.68e-01

Pos logit contributions
 door+0.031
 us+0.028
 gate+0.027
 me+0.027
 too+0.026
Neg logit contributions
 secure-0.024
Er-0.024
asting-0.023
ge-0.023
 shave-0.022
Token the
Token ID262
Feature activation-1.91e-01

Pos logit contributions
 secure+0.019
Er+0.019
asting+0.018
ge+0.018
 shave+0.018
Neg logit contributions
 door-0.013
 us-0.012
 me-0.011
 gate-0.011
 too-0.011
Token Explorer
Token ID19142
Feature activation-1.56e+00

Pos logit contributions
 us+0.006
 door+0.006
 me+0.006
ays+0.006
 too+0.006
Neg logit contributions
 secure-0.007
Er-0.006
asting-0.006
 blonde-0.006
 shave-0.006
Token"
Token ID1
Feature activation-1.50e+00

Pos logit contributions
 door+0.049
 us+0.046
 me+0.043
 gate+0.043
 too+0.041
Neg logit contributions
 secure-0.058
asting-0.057
Er-0.056
ge-0.055
 Ethan-0.054
Token and
Token ID290
Feature activation-1.31e+00

Pos logit contributions
 door+0.042
 us+0.041
 poster+0.038
 gate+0.038
 me+0.036
Neg logit contributions
 secure-0.059
asting-0.058
ge-0.056
Er-0.056
 Ethan-0.054
Token all
Token ID477
Feature activation-5.06e+00

Pos logit contributions
 door+0.043
 us+0.041
 me+0.038
 gate+0.038
 too+0.036
Neg logit contributions
 secure-0.049
Er-0.045
asting-0.045
 shave-0.044
ge-0.044
Token of
Token ID286
Feature activation-1.29e-02

Pos logit contributions
 door+0.147
 us+0.135
 gate+0.127
 me+0.126
 poster+0.125
Neg logit contributions
 secure-0.205
asting-0.194
ge-0.188
Er-0.188
 Ethan-0.184
Token his
Token ID465
Feature activation-1.94e-01

Pos logit contributions
 door+0.000
 us+0.000
 me+0.000
 too+0.000
 gate+0.000
Neg logit contributions
 secure-0.001
Er-0.001
asting-0.001
ge-0.001
 shave-0.001
Token friends
Token ID2460
Feature activation+2.55e-01

Pos logit contributions
 door+0.006
 us+0.006
 me+0.005
 gate+0.005
 too+0.005
Neg logit contributions
 secure-0.008
Er-0.007
asting-0.007
ge-0.007
 blonde-0.007
Token cheered
Token ID34767
Feature activation-1.31e+00

Pos logit contributions
 secure+0.009
asting+0.008
Er+0.008
ge+0.008
 shave+0.008
Neg logit contributions
 door-0.009
 us-0.008
 me-0.008
 gate-0.008
 poster-0.007
Token him
Token ID683
Feature activation-1.68e-01

Pos logit contributions
 door+0.035
 gate+0.032
 poster+0.031
 us+0.030
 me+0.028
Neg logit contributions
 secure-0.054
asting-0.052
Er-0.051
ge-0.051
 Ethan-0.051
TokenHi
Token ID17250
Feature activation+1.81e+00

Pos logit contributions
 door+0.059
 us+0.056
 me+0.053
 gate+0.053
 too+0.052
Neg logit contributions
 secure-0.055
Er-0.052
asting-0.051
ge-0.050
 blonde-0.050
Token Jay
Token ID9180
Feature activation-7.52e-01

Pos logit contributions
 Ethan+0.060
Er+0.060
 blonde+0.059
 secure+0.058
 Em+0.057
Neg logit contributions
 door-0.059
fork-0.054
ays-0.054
 gate-0.052
 us-0.052
Tokenden
Token ID6559
Feature activation+1.61e+00

Pos logit contributions
 door+0.024
 us+0.022
 gate+0.021
 me+0.020
fork+0.020
Neg logit contributions
 secure-0.027
Er-0.026
ge-0.026
 Ethan-0.026
asting-0.025
Token!
Token ID0
Feature activation-1.74e+00

Pos logit contributions
 secure+0.057
ge+0.054
Er+0.053
 Ethan+0.052
 blonde+0.052
Neg logit contributions
 door-0.052
 us-0.048
 gate-0.046
ays-0.044
 me-0.044
Token Ready
Token ID23432
Feature activation+3.60e-01

Pos logit contributions
 door+0.056
 us+0.053
 gate+0.050
 me+0.047
 planes+0.046
Neg logit contributions
Er-0.061
 Ethan-0.061
 secure-0.060
 blonde-0.059
 Em-0.057
Token for
Token ID329
Feature activation-4.94e+00

Pos logit contributions
ge+0.013
 Ethan+0.012
Er+0.012
asting+0.012
 shave+0.012
Neg logit contributions
 door-0.011
fork-0.010
 cho-0.010
 gate-0.010
 houses-0.009
Token school
Token ID1524
Feature activation-1.74e+00

Pos logit contributions
ays+0.166
ipt+0.161
 door+0.158
fork+0.156
outs+0.155
Neg logit contributions
 secure-0.160
 shave-0.143
zing-0.141
asting-0.141
 mischief-0.139
Token today
Token ID1909
Feature activation-3.38e+00

Pos logit contributions
ipt+0.052
 door+0.052
ays+0.049
fork+0.049
ucks+0.048
Neg logit contributions
 Ethan-0.056
 secure-0.054
 blonde-0.053
ge-0.053
asting-0.051
Token?"
Token ID1701
Feature activation-2.81e-01

Pos logit contributions
 cho+0.083
 door+0.081
ipt+0.079
ays+0.079
 us+0.078
Neg logit contributions
 secure-0.131
 Ethan-0.125
asting-0.123
 shave-0.121
 blonde-0.120
Token\n
Token ID198
Feature activation-1.35e-01

Pos logit contributions
 door+0.010
 us+0.010
 gate+0.009
 me+0.009
 too+0.009
Neg logit contributions
 secure-0.009
Er-0.009
 Ethan-0.008
 shave-0.008
 blonde-0.008
Token\n
Token ID198
Feature activation+3.47e-01

Pos logit contributions
 door+0.005
 us+0.005
 gate+0.005
 too+0.005
 me+0.005
Neg logit contributions
Er-0.004
 secure-0.004
 Ethan-0.004
Marcus-0.004
ge-0.004
Token he
Token ID339
Feature activation-9.12e-01

Pos logit contributions
 door+0.041
 us+0.039
 me+0.037
 too+0.036
 gate+0.034
Neg logit contributions
 secure-0.065
ge-0.059
asting-0.059
Er-0.059
 shave-0.059
Token shouted
Token ID17293
Feature activation-6.07e-01

Pos logit contributions
 door+0.032
 us+0.030
 poster+0.029
ays+0.029
 gate+0.029
Neg logit contributions
 secure-0.031
asting-0.027
 shave-0.027
 energ-0.026
 Ethan-0.026
Token "
Token ID366
Feature activation-1.92e+00

Pos logit contributions
 door+0.018
 poster+0.017
 us+0.017
 gate+0.017
ipt+0.017
Neg logit contributions
 secure-0.021
zing-0.021
 Ethan-0.020
 shave-0.020
 blonde-0.020
Tokenh
Token ID71
Feature activation-1.17e+00

Pos logit contributions
 door+0.071
 us+0.064
 gate+0.064
 me+0.061
 too+0.060
Neg logit contributions
 secure-0.062
Er-0.061
asting-0.060
ge-0.059
 shave-0.059
Tokenoor
Token ID2675
Feature activation-2.71e+00

Pos logit contributions
 door+0.035
 us+0.033
 me+0.031
 gate+0.030
 too+0.029
Neg logit contributions
 secure-0.046
asting-0.045
Er-0.044
ge-0.043
 Ethan-0.042
Tokenay
Token ID323
Feature activation-4.90e+00

Pos logit contributions
 door+0.075
 us+0.071
 planes+0.068
 gate+0.068
fork+0.067
Neg logit contributions
asting-0.102
Er-0.098
 secure-0.097
 Ethan-0.096
ge-0.094
Token!"
Token ID2474
Feature activation-1.61e+00

Pos logit contributions
 door+0.147
 gate+0.134
fork+0.132
 poster+0.128
 us+0.126
Neg logit contributions
 Ethan-0.174
asting-0.170
 secure-0.167
zing-0.166
Er-0.162
Token\n
Token ID198
Feature activation-9.80e-01

Pos logit contributions
 door+0.061
 us+0.059
 gate+0.055
 me+0.055
 poster+0.054
Neg logit contributions
 secure-0.049
Er-0.047
 Ethan-0.046
 shave-0.045
asting-0.045
Token\n
Token ID198
Feature activation-4.63e-01

Pos logit contributions
 door+0.033
 us+0.032
 me+0.030
 too+0.029
 gate+0.029
Neg logit contributions
 secure-0.035
Er-0.033
asting-0.032
ge-0.032
 shave-0.032
TokenThe
Token ID464
Feature activation-1.05e+00

Pos logit contributions
 door+0.019
 us+0.019
 me+0.018
 too+0.017
 gate+0.017
Neg logit contributions
 secure-0.013
Er-0.012
asting-0.012
ge-0.011
 shave-0.011
Token jet
Token ID12644
Feature activation-1.38e+00

Pos logit contributions
 us+0.030
 door+0.030
 me+0.027
 too+0.027
ays+0.026
Neg logit contributions
 secure-0.041
Er-0.038
asting-0.038
 shave-0.037
 blonde-0.037
Token,
Token ID11
Feature activation-1.34e+00

Pos logit contributions
 door+0.014
fork+0.013
 us+0.013
 me+0.012
 gate+0.012
Neg logit contributions
ge-0.020
 secure-0.020
Er-0.019
asting-0.019
 blonde-0.019
Token nine
Token ID5193
Feature activation-2.73e+00

Pos logit contributions
 door+0.036
 us+0.036
 me+0.033
 cho+0.031
 too+0.031
Neg logit contributions
 secure-0.055
Er-0.053
 blonde-0.051
 shave-0.051
asting-0.050
Token,
Token ID11
Feature activation-2.16e+00

Pos logit contributions
 us+0.077
 door+0.076
 me+0.071
fork+0.070
 gate+0.067
Neg logit contributions
 secure-0.104
ge-0.099
 blonde-0.097
Er-0.097
 shave-0.094
Token eight
Token ID3624
Feature activation-1.87e+00

Pos logit contributions
 door+0.069
 us+0.067
 me+0.063
ays+0.061
 gate+0.061
Neg logit contributions
 secure-0.075
Er-0.073
 Ethan-0.070
asting-0.069
 shave-0.069
Token,
Token ID11
Feature activation-1.91e+00

Pos logit contributions
 door+0.054
 us+0.052
 me+0.048
 gate+0.048
 cho+0.046
Neg logit contributions
 secure-0.071
Er-0.068
 blonde-0.068
ge-0.067
 shave-0.065
Token seven
Token ID3598
Feature activation-4.89e+00

Pos logit contributions
 door+0.049
 us+0.049
 me+0.044
 too+0.042
 gate+0.042
Neg logit contributions
 secure-0.078
Er-0.077
asting-0.074
 Ethan-0.072
ge-0.072
Token,
Token ID11
Feature activation-2.21e+00

Pos logit contributions
 door+0.158
 us+0.155
fork+0.150
 cho+0.147
ipt+0.147
Neg logit contributions
 blonde-0.155
 secure-0.153
Er-0.146
ge-0.143
 Ethan-0.138
Token six
Token ID2237
Feature activation-2.86e+00

Pos logit contributions
 us+0.064
 door+0.062
 me+0.057
ocked+0.055
 too+0.055
Neg logit contributions
Er-0.084
 secure-0.081
ge-0.077
 blonde-0.077
 Ethan-0.077
Token,
Token ID11
Feature activation-2.02e+00

Pos logit contributions
 door+0.081
 us+0.079
fork+0.073
 cho+0.072
 me+0.072
Neg logit contributions
 secure-0.108
ge-0.103
Er-0.102
 blonde-0.102
asting-0.098
Token five
Token ID1936
Feature activation-2.48e+00

Pos logit contributions
 us+0.044
 door+0.043
 me+0.037
 too+0.036
 gate+0.035
Neg logit contributions
 secure-0.092
Er-0.091
 blonde-0.088
asting-0.085
ge-0.085
Token,
Token ID11
Feature activation-2.04e+00

Pos logit contributions
 door+0.066
 us+0.064
 me+0.059
fork+0.059
 cho+0.058
Neg logit contributions
 secure-0.098
ge-0.095
 blonde-0.091
Er-0.091
asting-0.089
Token!
Token ID0
Feature activation-2.78e+00

Pos logit contributions
 secure+0.026
 shave+0.024
ge+0.024
 blonde+0.023
 Junior+0.023
Neg logit contributions
 me-0.027
 us-0.026
 door-0.026
 too-0.024
 cho-0.023
Token Let
Token ID3914
Feature activation+1.14e+00

Pos logit contributions
 door+0.101
 us+0.096
 gate+0.091
 poster+0.089
 cho+0.087
Neg logit contributions
 secure-0.085
 Ethan-0.084
Er-0.084
 blonde-0.082
 Baby-0.080
Tokenâ
Token ID22940
Feature activation-1.51e+00

Pos logit contributions
 secure+0.036
 Ethan+0.033
asting+0.033
 shave+0.033
 Baby+0.033
Neg logit contributions
 door-0.041
 gate-0.037
 poster-0.036
fork-0.036
 cho-0.035
Token
Token ID26391
Feature activation+7.26e-01

Pos logit contributions
 us+0.055
 me+0.052
 door+0.051
 too+0.049
 poster+0.046
Neg logit contributions
 blonde-0.050
 secure-0.049
Er-0.046
ge-0.045
 snug-0.044
Token
Token ID8151
Feature activation+3.44e-01

Pos logit contributions
 secure+0.022
Er+0.021
ge+0.020
asting+0.020
 blonde+0.020
Neg logit contributions
 door-0.028
 us-0.027
 me-0.026
 gate-0.025
 too-0.025
Tokens
Token ID82
Feature activation-4.84e+00

Pos logit contributions
ge+0.013
 secure+0.013
asting+0.012
Er+0.012
 Ethan+0.012
Neg logit contributions
 door-0.011
 us-0.010
 me-0.010
 gate-0.010
 poster-0.009
Token ride
Token ID6594
Feature activation-1.15e+00

Pos logit contributions
 door+0.171
 us+0.164
 me+0.155
 too+0.152
 gate+0.152
Neg logit contributions
 secure-0.164
Er-0.151
asting-0.150
 shave-0.150
ge-0.147
Token on
Token ID319
Feature activation-1.34e+00

Pos logit contributions
 door+0.035
 gate+0.031
 us+0.028
 window+0.028
 poster+0.027
Neg logit contributions
 Ethan-0.043
 secure-0.042
 shave-0.042
Er-0.042
ge-0.041
Token!
Token ID0
Feature activation-2.45e+00

Pos logit contributions
 door+0.036
 us+0.031
 gate+0.030
 poster+0.029
 me+0.029
Neg logit contributions
 secure-0.056
 Ethan-0.053
Er-0.053
asting-0.052
 shave-0.052
Tokenâ
Token ID22940
Feature activation-6.31e-01

Pos logit contributions
 door+0.085
 us+0.082
 me+0.079
 too+0.078
 gate+0.075
Neg logit contributions
 secure-0.086
asting-0.079
ge-0.078
Er-0.078
 shave-0.076
Token
Token ID26391
Feature activation+1.50e+00

Pos logit contributions
 us+0.025
 me+0.025
 door+0.025
 too+0.024
 poster+0.023
Neg logit contributions
 secure-0.018
 blonde-0.017
Er-0.016
ge-0.016
 shave-0.016
Token.
Token ID13
Feature activation-4.05e-01

Pos logit contributions
 secure+0.060
ge+0.057
 shave+0.056
asting+0.056
Er+0.055
Neg logit contributions
 door-0.044
 us-0.043
 me-0.041
 gate-0.040
 poster-0.039
Token
Token ID198
Feature activation+2.33e-01

Pos logit contributions
 door+0.012
 us+0.011
 me+0.010
 too+0.010
 gate+0.010
Neg logit contributions
 secure-0.017
Er-0.016
asting-0.016
ge-0.016
 shave-0.016
Token
Token ID198
Feature activation+5.60e-01

Pos logit contributions
 secure+0.008
Er+0.008
asting+0.008
ge+0.008
 shave+0.008
Neg logit contributions
 door-0.008
 us-0.008
 me-0.007
 too-0.007
 gate-0.007
Token"
Token ID1
Feature activation-8.25e-01

Pos logit contributions
 secure+0.022
Er+0.021
asting+0.021
ge+0.020
 shave+0.020
Neg logit contributions
 door-0.017
 us-0.016
 me-0.015
 too-0.015
 gate-0.015
TokenReady
Token ID35474
Feature activation+1.73e+00

Pos logit contributions
 door+0.027
 us+0.026
 me+0.025
 too+0.024
 gate+0.024
Neg logit contributions
 secure-0.031
ge-0.028
asting-0.028
Er-0.028
 shave-0.028
Token for
Token ID329
Feature activation-4.80e+00

Pos logit contributions
ge+0.059
asting+0.057
 Ethan+0.054
Er+0.054
 shave+0.053
Neg logit contributions
 door-0.053
 us-0.051
fork-0.050
 gate-0.049
 me-0.049
Token blast
Token ID11975
Feature activation-2.84e+00

Pos logit contributions
fork+0.148
 door+0.148
ays+0.147
ipt+0.144
outs+0.143
Neg logit contributions
 secure-0.155
 shave-0.148
asting-0.141
 Ethan-0.136
 Junior-0.136
Token off
Token ID572
Feature activation-1.17e+00

Pos logit contributions
 door+0.085
fork+0.084
 poster+0.079
 elephants+0.079
 us+0.078
Neg logit contributions
asting-0.093
ge-0.092
 Ethan-0.091
 secure-0.089
 Baby-0.087
Token?"
Token ID1701
Feature activation-5.99e-01

Pos logit contributions
 door+0.030
fork+0.029
ipt+0.029
 poster+0.028
ocked+0.028
Neg logit contributions
 secure-0.044
 Ethan-0.041
 Baby-0.041
asting-0.041
 Junior-0.040
Token Lily
Token ID20037
Feature activation-1.11e+00

Pos logit contributions
 door+0.018
 us+0.017
 gate+0.016
 me+0.016
 cho+0.016
Neg logit contributions
 secure-0.022
Er-0.021
 blonde-0.021
 Ethan-0.021
 shave-0.021
Token asks
Token ID7893
Feature activation+2.29e-01

Pos logit contributions
 cho+0.027
 door+0.026
 us+0.026
 gate+0.022
fork+0.022
Neg logit contributions
Er-0.044
 blonde-0.043
 shave-0.043
 Ethan-0.042
Marcus-0.042
Token cake
Token ID12187
Feature activation-1.49e+00

Pos logit contributions
 us+0.028
ays+0.026
 door+0.025
ipt+0.024
 too+0.024
Neg logit contributions
 secure-0.034
Er-0.033
 shave-0.032
 blonde-0.031
asting-0.030
Token and
Token ID290
Feature activation-1.16e+00

Pos logit contributions
 us+0.044
ipt+0.040
 me+0.040
 door+0.039
 cho+0.039
Neg logit contributions
 secure-0.052
asting-0.050
 shave-0.049
ge-0.048
Er-0.048
Token a
Token ID257
Feature activation-1.55e+00

Pos logit contributions
 door+0.039
 us+0.038
 me+0.036
 gate+0.034
 too+0.034
Neg logit contributions
 secure-0.040
 blonde-0.037
 shave-0.037
Er-0.037
 Ethan-0.036
Token bike
Token ID7161
Feature activation-2.02e+00

Pos logit contributions
 us+0.052
ays+0.048
ipt+0.046
 door+0.045
 me+0.045
Neg logit contributions
 secure-0.053
 blonde-0.051
 shave-0.051
Er-0.051
asting-0.047
Token in
Token ID287
Feature activation-8.12e-01

Pos logit contributions
 us+0.067
ipt+0.063
 door+0.060
 me+0.060
ucks+0.059
Neg logit contributions
ge-0.065
 secure-0.061
 shave-0.057
asting-0.056
Er-0.055
Token front
Token ID2166
Feature activation-4.80e+00

Pos logit contributions
 door+0.022
 us+0.021
 me+0.020
 gate+0.019
 too+0.019
Neg logit contributions
 secure-0.034
Er-0.032
asting-0.032
 shave-0.031
ge-0.031
Token of
Token ID286
Feature activation-1.32e+00

Pos logit contributions
 cho+0.154
rots+0.144
 us+0.143
 poster+0.133
men+0.133
Neg logit contributions
 secure-0.158
 blonde-0.151
 shave-0.146
ge-0.143
asting-0.143
Token them
Token ID606
Feature activation-7.73e-03

Pos logit contributions
 door+0.024
 us+0.020
 me+0.019
 gate+0.019
 too+0.018
Neg logit contributions
 secure-0.067
Er-0.065
asting-0.064
 Ethan-0.064
 shave-0.064
Token.
Token ID13
Feature activation+6.44e-02

Pos logit contributions
 door+0.000
 us+0.000
 me+0.000
 cho+0.000
 gate+0.000
Neg logit contributions
 secure-0.000
ge-0.000
Er-0.000
asting-0.000
 Ethan-0.000
Token They
Token ID1119
Feature activation+7.63e-01

Pos logit contributions
 secure+0.002
Er+0.002
asting+0.002
 blonde+0.002
 Ethan+0.002
Neg logit contributions
 door-0.002
 us-0.002
 me-0.002
 too-0.002
 gate-0.002
Token are
Token ID389
Feature activation-1.99e+00

Pos logit contributions
 secure+0.027
Er+0.025
asting+0.025
 shave+0.024
ge+0.024
Neg logit contributions
 door-0.026
 us-0.025
 me-0.024
 too-0.023
 gate-0.023
Token was
Token ID373
Feature activation-2.07e+00

Pos logit contributions
 door+0.065
 us+0.061
 gate+0.060
 poster+0.058
 me+0.057
Neg logit contributions
 secure-0.065
 shave-0.060
asting-0.059
ge-0.059
 snug-0.056
Token a
Token ID257
Feature activation-9.09e-01

Pos logit contributions
 door+0.069
 us+0.063
 gate+0.062
 poster+0.061
 me+0.059
Neg logit contributions
 secure-0.076
asting-0.071
 shave-0.068
ge-0.067
 energ-0.067
Token good
Token ID922
Feature activation+4.01e-01

Pos logit contributions
 door+0.035
 us+0.034
 me+0.032
 gate+0.031
 too+0.031
Neg logit contributions
 secure-0.028
Er-0.025
asting-0.025
 shave-0.025
ge-0.025
Token idea
Token ID2126
Feature activation-6.20e-01

Pos logit contributions
 secure+0.015
 shave+0.014
asting+0.014
Er+0.013
ge+0.013
Neg logit contributions
 door-0.013
 us-0.013
 me-0.012
 gate-0.011
 cho-0.011
Token after
Token ID706
Feature activation-2.85e+00

Pos logit contributions
 door+0.019
 us+0.017
 gate+0.017
 me+0.016
 poster+0.016
Neg logit contributions
 secure-0.024
asting-0.022
ge-0.022
 shave-0.021
Er-0.021
Token all
Token ID477
Feature activation-4.79e+00

Pos logit contributions
 door+0.078
aw+0.075
 cho+0.073
fork+0.072
 gate+0.072
Neg logit contributions
 secure-0.105
asting-0.102
ge-0.099
 Ethan-0.098
 Baby-0.097
Token.
Token ID13
Feature activation-4.05e-01

Pos logit contributions
 door+0.139
 us+0.125
 cho+0.123
 me+0.120
 gate+0.118
Neg logit contributions
 secure-0.188
asting-0.178
ge-0.172
 shave-0.168
Er-0.165
Token
Token ID198
Feature activation+8.51e-02

Pos logit contributions
 door+0.012
 us+0.011
 me+0.011
 too+0.011
 gate+0.010
Neg logit contributions
 secure-0.017
ge-0.016
asting-0.016
 shave-0.016
Er-0.016
Token
Token ID198
Feature activation+3.80e-01

Pos logit contributions
 secure+0.004
 blonde+0.003
ge+0.003
Er+0.003
 shave+0.003
Neg logit contributions
 us-0.002
 door-0.002
 me-0.002
 too-0.002
 gate-0.002
TokenThen
Token ID6423
Feature activation+9.07e-01

Pos logit contributions
 secure+0.013
 blonde+0.012
 shave+0.012
ge+0.012
 Ethan+0.011
Neg logit contributions
 us-0.014
 me-0.013
 door-0.013
 too-0.013
aw-0.012
Token they
Token ID484
Feature activation-9.80e-01

Pos logit contributions
 secure+0.043
Er+0.040
asting+0.040
ge+0.040
 shave+0.040
Neg logit contributions
 door-0.021
 us-0.020
 me-0.019
 too-0.018
 gate-0.017
Token,
Token ID11
Feature activation-9.51e-01

Pos logit contributions
 door+0.004
 us+0.004
 me+0.004
 too+0.003
 gate+0.003
Neg logit contributions
 secure-0.007
Er-0.007
asting-0.006
ge-0.006
 shave-0.006
Token Anna
Token ID11735
Feature activation-5.32e-01

Pos logit contributions
 door+0.024
 us+0.023
 me+0.021
 gate+0.021
 too+0.020
Neg logit contributions
 secure-0.041
Er-0.040
asting-0.039
ge-0.038
 shave-0.038
Token.
Token ID13
Feature activation-3.13e-01

Pos logit contributions
 door+0.016
 us+0.015
 gate+0.014
 me+0.013
 poster+0.013
Neg logit contributions
 secure-0.021
Er-0.019
asting-0.019
ge-0.019
 blonde-0.018
Token You
Token ID921
Feature activation-1.41e-01

Pos logit contributions
 door+0.007
 us+0.007
 me+0.007
 too+0.006
 gate+0.006
Neg logit contributions
 secure-0.015
asting-0.014
ge-0.014
Er-0.014
 shave-0.014
Token spelled
Token ID32213
Feature activation+7.66e-01

Pos logit contributions
 door+0.005
 us+0.004
 me+0.004
 gate+0.004
 too+0.004
Neg logit contributions
 secure-0.005
Er-0.005
asting-0.005
 shave-0.005
 blonde-0.005
Token all
Token ID477
Feature activation-4.79e+00

Pos logit contributions
Er+0.025
asting+0.025
 secure+0.025
 shave+0.024
 Ethan+0.024
Neg logit contributions
 door-0.024
ipt-0.023
fork-0.022
 gate-0.022
 us-0.021
Token the
Token ID262
Feature activation-1.53e-01

Pos logit contributions
 door+0.152
 gate+0.136
 us+0.132
fork+0.130
ipt+0.130
Neg logit contributions
asting-0.172
 secure-0.169
 shave-0.157
Er-0.157
ge-0.153
Token words
Token ID2456
Feature activation+3.99e-01

Pos logit contributions
 us+0.004
 door+0.004
ipt+0.004
 cho+0.004
 me+0.004
Neg logit contributions
 secure-0.006
Er-0.006
asting-0.006
 shave-0.006
 blonde-0.005
Token right
Token ID826
Feature activation-2.57e-01

Pos logit contributions
 secure+0.014
asting+0.014
ge+0.013
 shave+0.013
Er+0.012
Neg logit contributions
ipt-0.012
 door-0.012
 gate-0.011
 cho-0.011
 us-0.011
Token.
Token ID13
Feature activation-6.10e-01

Pos logit contributions
 door+0.008
 cho+0.007
 gate+0.007
 poster+0.007
 planes+0.007
Neg logit contributions
 secure-0.009
asting-0.009
ge-0.008
Er-0.008
wy-0.008
Token You
Token ID921
Feature activation-2.30e-01

Pos logit contributions
 door+0.016
 us+0.015
 me+0.014
 too+0.014
 gate+0.013
Neg logit contributions
 secure-0.027
ge-0.025
Er-0.025
asting-0.025
 shave-0.025
Token outside
Token ID2354
Feature activation+9.08e-01

Pos logit contributions
 secure+0.022
asting+0.021
ge+0.021
 shave+0.021
Er+0.020
Neg logit contributions
 us-0.011
 door-0.011
aw-0.010
 cho-0.010
 me-0.010
Token and
Token ID290
Feature activation-6.21e-02

Pos logit contributions
ge+0.036
 secure+0.035
 energ+0.034
 shave+0.033
 snug+0.032
Neg logit contributions
aw-0.025
rots-0.025
 cho-0.024
 us-0.024
ays-0.022
Token said
Token ID531
Feature activation-9.41e-01

Pos logit contributions
 door+0.002
 us+0.002
ays+0.002
 gate+0.002
ipt+0.002
Neg logit contributions
 secure-0.002
 shave-0.002
 energ-0.002
 Ethan-0.002
 snug-0.002
Token "
Token ID366
Feature activation-1.39e+00

Pos logit contributions
 us+0.031
 door+0.031
ays+0.029
 gate+0.029
 poster+0.028
Neg logit contributions
 secure-0.031
 blonde-0.030
zing-0.030
Er-0.028
 energ-0.028
Tokenplease
Token ID29688
Feature activation-9.52e-01

Pos logit contributions
 door+0.056
 us+0.051
 gate+0.050
 me+0.048
 too+0.048
Neg logit contributions
 secure-0.041
Er-0.041
asting-0.039
ge-0.039
 shave-0.039
Token bird
Token ID6512
Feature activation-4.77e+00

Pos logit contributions
 door+0.028
fork+0.026
 us+0.026
ays+0.026
 gate+0.025
Neg logit contributions
asting-0.034
Er-0.034
 secure-0.032
 shave-0.031
free-0.031
Tokenie
Token ID494
Feature activation-3.04e+00

Pos logit contributions
 door+0.155
 gate+0.138
 us+0.136
fork+0.130
ipt+0.129
Neg logit contributions
asting-0.163
 secure-0.157
ge-0.156
free-0.153
Er-0.151
Token,
Token ID11
Feature activation-2.01e+00

Pos logit contributions
 door+0.088
 us+0.085
 gate+0.077
 me+0.077
 poster+0.075
Neg logit contributions
 secure-0.115
asting-0.111
Er-0.110
ge-0.109
 blonde-0.107
Token sing
Token ID1702
Feature activation-2.47e+00

Pos logit contributions
ipt+0.060
ays+0.057
fork+0.057
 door+0.055
ucks+0.055
Neg logit contributions
asting-0.069
 secure-0.068
 shave-0.066
Er-0.066
 Baby-0.065
Token again
Token ID757
Feature activation-2.58e+00

Pos logit contributions
ipt+0.080
 door+0.079
ays+0.079
 gate+0.076
fork+0.075
Neg logit contributions
free-0.085
asting-0.084
zing-0.080
 secure-0.078
sy-0.077
Token!"
Token ID2474
Feature activation-1.23e+00

Pos logit contributions
 door+0.066
 us+0.060
 gate+0.057
 poster+0.054
fork+0.053
Neg logit contributions
 secure-0.112
asting-0.107
Er-0.106
ge-0.103
 blonde-0.102
Token said
Token ID531
Feature activation-1.31e+00

Pos logit contributions
 door+0.015
 us+0.014
 gate+0.013
 me+0.013
 too+0.012
Neg logit contributions
 secure-0.021
Er-0.019
 energ-0.019
asting-0.019
 Ethan-0.019
Token,
Token ID11
Feature activation-1.32e+00

Pos logit contributions
 door+0.034
 us+0.033
 me+0.030
 too+0.029
 gate+0.029
Neg logit contributions
 secure-0.057
Er-0.054
asting-0.053
ge-0.053
 shave-0.052
Token "
Token ID366
Feature activation-1.71e+00

Pos logit contributions
 us+0.030
 door+0.030
 gate+0.026
 me+0.025
 poster+0.023
Neg logit contributions
 secure-0.060
Er-0.059
asting-0.057
 Ethan-0.056
 energ-0.056
TokenYes
Token ID5297
Feature activation-2.17e+00

Pos logit contributions
 door+0.064
 us+0.061
 me+0.058
 gate+0.057
 too+0.056
Neg logit contributions
 secure-0.056
Er-0.052
asting-0.051
ge-0.050
 blonde-0.049
Token,
Token ID11
Feature activation-2.38e+00

Pos logit contributions
 door+0.067
fork+0.061
 gate+0.061
 us+0.060
 poster+0.060
Neg logit contributions
 secure-0.079
 Ethan-0.074
asting-0.073
Er-0.073
 Baby-0.072
Token even
Token ID772
Feature activation-4.77e+00

Pos logit contributions
 door+0.062
ays+0.055
 us+0.054
 gate+0.054
 poster+0.054
Neg logit contributions
 secure-0.102
asting-0.095
 Ethan-0.091
 blonde-0.091
 Baby-0.091
Token though
Token ID996
Feature activation+5.02e-01

Pos logit contributions
 door+0.149
 us+0.144
 me+0.138
 too+0.134
 gate+0.129
Neg logit contributions
 secure-0.185
Er-0.174
asting-0.171
 shave-0.169
ge-0.169
Token the
Token ID262
Feature activation-1.22e+00

Pos logit contributions
 secure+0.023
asting+0.022
Er+0.021
 Ethan+0.021
ge+0.021
Neg logit contributions
 door-0.011
 gate-0.010
 us-0.009
 poster-0.009
fork-0.009
Token cow
Token ID9875
Feature activation-1.56e+00

Pos logit contributions
 door+0.033
 us+0.031
ays+0.030
ipt+0.028
fork+0.028
Neg logit contributions
 secure-0.049
 blonde-0.046
Er-0.045
asting-0.045
 shave-0.043
Token was
Token ID373
Feature activation-8.26e-01

Pos logit contributions
 door+0.039
 us+0.034
 gate+0.033
 me+0.030
ipt+0.030
Neg logit contributions
 secure-0.069
asting-0.064
Er-0.062
 shave-0.062
 blonde-0.061
Token unknown
Token ID6439
Feature activation-5.68e-01

Pos logit contributions
 door+0.027
 us+0.026
 me+0.025
 too+0.025
 gate+0.023
Neg logit contributions
 secure-0.030
Er-0.029
ge-0.028
asting-0.028
 shave-0.028
Token buttons
Token ID12163
Feature activation+1.42e+00

Pos logit contributions
 door+0.006
 poster+0.006
 us+0.006
ays+0.006
 me+0.006
Neg logit contributions
 secure-0.007
 blonde-0.007
asting-0.007
Er-0.007
 Ethan-0.006
Token,
Token ID11
Feature activation+4.33e-01

Pos logit contributions
 secure+0.058
ge+0.057
asting+0.056
 shave+0.055
Er+0.055
Neg logit contributions
 us-0.035
 door-0.035
 me-0.033
 poster-0.031
 gate-0.031
Token switches
Token ID18225
Feature activation+1.84e+00

Pos logit contributions
 secure+0.017
 blonde+0.016
 shave+0.016
Er+0.016
asting+0.016
Neg logit contributions
 door-0.012
 us-0.012
 me-0.011
 poster-0.011
 gate-0.011
Token and
Token ID290
Feature activation-4.72e-01

Pos logit contributions
 secure+0.071
ge+0.070
 Ethan+0.067
asting+0.067
 shave+0.066
Neg logit contributions
 door-0.050
 us-0.049
 me-0.047
 gate-0.045
 poster-0.044
Token levers
Token ID48872
Feature activation+2.21e+00

Pos logit contributions
 door+0.014
 us+0.014
 me+0.013
 gate+0.012
 poster+0.012
Neg logit contributions
 secure-0.018
 blonde-0.017
 shave-0.017
Er-0.017
 Ethan-0.016
Token all
Token ID477
Feature activation-4.77e+00

Pos logit contributions
 secure+0.085
ge+0.083
 shave+0.080
asting+0.080
 Ethan+0.079
Neg logit contributions
 door-0.062
 us-0.061
 me-0.058
 gate-0.056
 poster-0.055
Token around
Token ID1088
Feature activation-2.08e+00

Pos logit contributions
 door+0.135
 us+0.125
 gate+0.121
 me+0.119
 poster+0.118
Neg logit contributions
 secure-0.195
ge-0.183
asting-0.183
 shave-0.176
Er-0.174
Token.
Token ID13
Feature activation+2.95e-01

Pos logit contributions
 door+0.051
 gate+0.046
 poster+0.045
 us+0.045
 cho+0.045
Neg logit contributions
 secure-0.088
ge-0.083
 Ethan-0.082
 shave-0.082
Er-0.082
Token Amy
Token ID14235
Feature activation+9.82e-01

Pos logit contributions
 secure+0.010
Er+0.010
 blonde+0.010
 Ethan+0.010
 shave+0.009
Neg logit contributions
 door-0.010
 us-0.009
 me-0.009
 gate-0.009
 poster-0.008
Token started
Token ID2067
Feature activation+6.48e-01

Pos logit contributions
 secure+0.038
Er+0.036
 blonde+0.035
ge+0.035
 shave+0.035
Neg logit contributions
 door-0.030
 us-0.029
 gate-0.027
 me-0.026
 too-0.026
Token touching
Token ID15241
Feature activation+2.08e+00

Pos logit contributions
 secure+0.033
ge+0.032
Er+0.032
 shave+0.032
 blonde+0.031
Neg logit contributions
 us-0.011
 door-0.011
 me-0.010
 gate-0.009
 too-0.008
Token said
Token ID531
Feature activation-8.74e-01

Pos logit contributions
 secure+0.003
Er+0.002
asting+0.002
 shave+0.002
ge+0.002
Neg logit contributions
 door-0.002
 us-0.002
 gate-0.002
 poster-0.002
 me-0.002
Token,
Token ID11
Feature activation-9.50e-01

Pos logit contributions
 door+0.028
 us+0.026
 me+0.025
 gate+0.025
 poster+0.023
Neg logit contributions
 secure-0.033
Er-0.031
asting-0.030
 blonde-0.030
 Ethan-0.030
Token "
Token ID366
Feature activation-1.17e+00

Pos logit contributions
 door+0.022
 us+0.022
 gate+0.020
 me+0.019
 poster+0.019
Neg logit contributions
 secure-0.042
Er-0.041
 Ethan-0.040
 blonde-0.039
asting-0.039
TokenYes
Token ID5297
Feature activation-2.12e+00

Pos logit contributions
 door+0.044
 us+0.040
 gate+0.039
 me+0.039
 too+0.038
Neg logit contributions
 secure-0.038
Er-0.037
asting-0.036
ge-0.035
 shave-0.035
Token,
Token ID11
Feature activation-2.05e+00

Pos logit contributions
 door+0.060
 us+0.055
 gate+0.053
 poster+0.052
 me+0.052
Neg logit contributions
 secure-0.084
asting-0.079
Er-0.079
 Ethan-0.077
 blonde-0.076
Token all
Token ID477
Feature activation-4.74e+00

Pos logit contributions
 door+0.052
 us+0.048
 me+0.045
 gate+0.045
 too+0.044
Neg logit contributions
 secure-0.090
asting-0.084
Er-0.084
 blonde-0.081
 shave-0.081
Token the
Token ID262
Feature activation-2.78e-01

Pos logit contributions
 door+0.130
 us+0.119
 me+0.114
 gate+0.112
 too+0.109
Neg logit contributions
 secure-0.197
asting-0.186
Er-0.179
 energ-0.174
 shave-0.174
Token ingredients
Token ID9391
Feature activation+1.43e+00

Pos logit contributions
 door+0.009
 us+0.009
 me+0.009
 too+0.008
 gate+0.008
Neg logit contributions
 secure-0.010
Er-0.009
asting-0.009
 shave-0.009
ge-0.009
Token you
Token ID345
Feature activation-2.42e-01

Pos logit contributions
 secure+0.058
asting+0.056
Er+0.052
ge+0.052
 shave+0.052
Neg logit contributions
 door-0.039
 gate-0.035
 us-0.035
 me-0.035
 poster-0.034
Token need
Token ID761
Feature activation-1.64e-01

Pos logit contributions
 door+0.008
 us+0.008
 me+0.008
 gate+0.007
 too+0.007
Neg logit contributions
 secure-0.008
asting-0.008
Er-0.008
 shave-0.007
ge-0.007
Token are
Token ID389
Feature activation-2.01e+00

Pos logit contributions
 door+0.004
 poster+0.004
 gate+0.004
ipt+0.004
 cho+0.003
Neg logit contributions
asting-0.007
 secure-0.007
 Ethan-0.006
 Baby-0.006
 shave-0.006
Token went
Token ID1816
Feature activation+6.47e-01

Pos logit contributions
 secure+0.026
Er+0.024
asting+0.024
 Ethan+0.024
 shave+0.023
Neg logit contributions
 door-0.027
 us-0.026
 me-0.024
 gate-0.024
 too-0.023
Token home
Token ID1363
Feature activation-9.09e-01

Pos logit contributions
 secure+0.025
 snug+0.024
asting+0.023
ge+0.023
 energ+0.023
Neg logit contributions
 us-0.017
rots-0.016
aw-0.015
 me-0.015
ipt-0.014
Token and
Token ID290
Feature activation-1.21e-01

Pos logit contributions
 door+0.021
 us+0.021
 me+0.019
 gate+0.018
 too+0.018
Neg logit contributions
 secure-0.042
Er-0.040
ge-0.039
asting-0.039
 shave-0.039
Token told
Token ID1297
Feature activation-1.31e+00

Pos logit contributions
 door+0.004
 us+0.004
 me+0.004
 gate+0.003
 too+0.003
Neg logit contributions
 secure-0.004
Er-0.004
asting-0.004
 shave-0.004
ge-0.004
Token mom
Token ID1995
Feature activation-1.97e+00

Pos logit contributions
 door+0.039
 poster+0.035
 gate+0.035
 us+0.034
ays+0.033
Neg logit contributions
 secure-0.051
 shave-0.048
asting-0.047
 Ethan-0.047
 Baby-0.046
Token all
Token ID477
Feature activation-4.73e+00

Pos logit contributions
 door+0.054
 us+0.053
 me+0.049
 poster+0.049
 gate+0.048
Neg logit contributions
asting-0.076
 secure-0.075
ge-0.071
Er-0.071
 shave-0.070
Token about
Token ID546
Feature activation+1.07e+00

Pos logit contributions
 door+0.118
 us+0.113
 me+0.105
 too+0.102
 gate+0.099
Neg logit contributions
 secure-0.210
Er-0.203
ge-0.198
asting-0.196
 blonde-0.195
Token their
Token ID511
Feature activation+8.96e-01

Pos logit contributions
 secure+0.042
asting+0.039
 shave+0.039
Er+0.038
ge+0.038
Neg logit contributions
 door-0.032
 us-0.030
 gate-0.028
 me-0.028
 poster-0.026
Token fun
Token ID1257
Feature activation+7.57e-01

Pos logit contributions
 secure+0.031
Er+0.028
asting+0.028
ge+0.028
 shave+0.027
Neg logit contributions
 door-0.032
 us-0.031
 me-0.029
 too-0.028
 gate-0.028
Token day
Token ID1110
Feature activation-2.01e-01

Pos logit contributions
 secure+0.025
asting+0.023
ge+0.022
 shave+0.022
Er+0.022
Neg logit contributions
 door-0.028
 us-0.027
 me-0.026
 gate-0.025
 too-0.024
Token.
Token ID13
Feature activation-6.03e-01

Pos logit contributions
 door+0.005
 us+0.005
 me+0.005
 gate+0.005
 poster+0.005
Neg logit contributions
 secure-0.008
asting-0.008
Er-0.008
ge-0.008
 blonde-0.008
Token,
Token ID11
Feature activation-1.28e+00

Pos logit contributions
 door+0.068
 us+0.065
 me+0.060
 gate+0.059
 poster+0.057
Neg logit contributions
 secure-0.087
Er-0.082
ge-0.082
 blonde-0.081
asting-0.080
Token five
Token ID1936
Feature activation-2.02e+00

Pos logit contributions
 door+0.039
 us+0.039
 me+0.035
 gate+0.034
ays+0.033
Neg logit contributions
 secure-0.046
Er-0.045
 blonde-0.044
asting-0.043
 Ethan-0.042
Token,
Token ID11
Feature activation-1.39e+00

Pos logit contributions
 door+0.059
 us+0.057
 me+0.053
 gate+0.051
 cho+0.050
Neg logit contributions
 secure-0.077
ge-0.073
Er-0.072
 blonde-0.072
asting-0.072
Token six
Token ID2237
Feature activation-2.45e+00

Pos logit contributions
 door+0.042
 us+0.042
 me+0.038
 gate+0.036
ays+0.036
Neg logit contributions
Er-0.050
 secure-0.049
 blonde-0.048
asting-0.047
 Ethan-0.046
Token,
Token ID11
Feature activation-1.43e+00

Pos logit contributions
 door+0.070
 us+0.067
 me+0.062
 gate+0.060
 cho+0.059
Neg logit contributions
 secure-0.096
ge-0.091
Er-0.091
 blonde-0.090
asting-0.090
Token seven
Token ID3598
Feature activation-4.73e+00

Pos logit contributions
 door+0.031
 us+0.030
 me+0.026
 too+0.025
 gate+0.024
Neg logit contributions
 secure-0.066
Er-0.064
 blonde-0.063
asting-0.063
ge-0.062
Token,
Token ID11
Feature activation-1.82e+00

Pos logit contributions
 door+0.148
 us+0.144
 cho+0.133
 me+0.131
 poster+0.130
Neg logit contributions
 blonde-0.158
 secure-0.156
Er-0.152
ge-0.150
asting-0.144
Token eight
Token ID3624
Feature activation-1.83e+00

Pos logit contributions
 door+0.055
 us+0.054
 me+0.049
 too+0.047
 gate+0.046
Neg logit contributions
Er-0.067
 secure-0.065
 blonde-0.064
ge-0.063
asting-0.062
Token,
Token ID11
Feature activation-1.67e+00

Pos logit contributions
 door+0.053
 us+0.051
 me+0.047
 gate+0.046
 cho+0.044
Neg logit contributions
 secure-0.069
Er-0.067
 blonde-0.066
ge-0.066
asting-0.064
Token nine
Token ID5193
Feature activation-2.93e+00

Pos logit contributions
 door+0.041
 us+0.041
 me+0.036
 gate+0.034
 cho+0.034
Neg logit contributions
Er-0.070
 secure-0.068
ge-0.066
asting-0.065
 blonde-0.065
Token,
Token ID11
Feature activation-2.12e+00

Pos logit contributions
 door+0.085
 us+0.084
 me+0.077
 gate+0.075
fork+0.073
Neg logit contributions
 secure-0.110
ge-0.105
Er-0.103
 blonde-0.103
asting-0.101
Token and
Token ID290
Feature activation-1.46e+00

Pos logit contributions
 door+0.026
 gate+0.024
 us+0.023
fork+0.022
 cho+0.022
Neg logit contributions
asting-0.043
 secure-0.043
Er-0.042
 Ethan-0.042
ge-0.042
Token dad
Token ID9955
Feature activation+2.74e-02

Pos logit contributions
 door+0.053
 cho+0.047
 us+0.047
ays+0.046
 gate+0.046
Neg logit contributions
 Junior-0.046
 Ethan-0.045
Er-0.044
 Baby-0.044
 Caleb-0.044
Token,
Token ID11
Feature activation-1.58e+00

Pos logit contributions
 secure+0.001
asting+0.001
Er+0.001
 Ethan+0.001
ge+0.001
Neg logit contributions
 door-0.001
 gate-0.001
 us-0.001
 cho-0.001
fork-0.001
Token h
Token ID289
Feature activation-3.41e+00

Pos logit contributions
 door+0.046
ays+0.045
ipt+0.043
ocked+0.043
aw+0.042
Neg logit contributions
 Baby-0.056
 secure-0.054
Baby-0.053
 Ethan-0.053
 Junior-0.053
Tokenoor
Token ID2675
Feature activation-2.90e+00

Pos logit contributions
 door+0.108
 us+0.104
 me+0.099
 too+0.095
 gate+0.094
Neg logit contributions
 secure-0.129
Er-0.122
asting-0.121
ge-0.120
 shave-0.118
Tokenay
Token ID323
Feature activation-4.73e+00

Pos logit contributions
 door+0.081
fork+0.080
 elephants+0.075
 cho+0.075
 monkeys+0.075
Neg logit contributions
asting-0.102
Er-0.098
 secure-0.097
ge-0.095
 Ethan-0.094
Token!
Token ID0
Feature activation-1.94e+00

Pos logit contributions
 door+0.135
fork+0.135
 gate+0.124
 poster+0.117
 us+0.111
Neg logit contributions
asting-0.175
zing-0.170
 Ethan-0.169
Er-0.163
ge-0.162
Token
Token ID198
Feature activation-1.07e-01

Pos logit contributions
 door+0.055
 us+0.053
 me+0.050
 too+0.049
 gate+0.047
Neg logit contributions
 secure-0.082
ge-0.076
asting-0.076
Er-0.074
 shave-0.074
Token
Token ID198
Feature activation+2.04e-01

Pos logit contributions
 door+0.003
 us+0.003
 me+0.003
 too+0.003
 gate+0.003
Neg logit contributions
 secure-0.004
Er-0.004
ge-0.004
asting-0.004
 blonde-0.004
TokenThey
Token ID2990
Feature activation+7.19e-01

Pos logit contributions
 secure+0.008
Er+0.007
asting+0.007
ge+0.007
 shave+0.007
Neg logit contributions
 door-0.007
 us-0.006
 me-0.006
 too-0.006
 gate-0.006
Token pulled
Token ID5954
Feature activation+2.35e+00

Pos logit contributions
 secure+0.024
Er+0.022
asting+0.022
 Ethan+0.022
 energ+0.021
Neg logit contributions
 door-0.025
 us-0.024
 gate-0.023
 cho-0.022
 too-0.022
Token to
Token ID284
Feature activation-5.80e-01

Pos logit contributions
 secure+0.003
asting+0.003
ge+0.003
 shave+0.003
Er+0.003
Neg logit contributions
 door-0.001
 us-0.001
ays-0.001
 poster-0.001
 me-0.001
Token go
Token ID467
Feature activation-6.12e-01

Pos logit contributions
 door+0.021
 us+0.020
 me+0.019
 too+0.019
 gate+0.018
Neg logit contributions
 secure-0.020
Er-0.019
ge-0.018
asting-0.018
 blonde-0.018
Token back
Token ID736
Feature activation-1.58e+00

Pos logit contributions
 door+0.013
 us+0.012
 me+0.011
 gate+0.011
 cho+0.011
Neg logit contributions
 secure-0.029
asting-0.027
ge-0.027
 shave-0.026
Er-0.026
Token and
Token ID290
Feature activation-1.06e-01

Pos logit contributions
 door+0.031
 us+0.030
 cho+0.027
 me+0.025
 gate+0.025
Neg logit contributions
 secure-0.075
ge-0.072
Er-0.071
 shave-0.070
 Ethan-0.070
Token warn
Token ID9828
Feature activation-2.08e+00

Pos logit contributions
 door+0.004
 us+0.004
 me+0.004
 gate+0.004
 too+0.004
Neg logit contributions
 secure-0.003
 shave-0.003
Er-0.003
asting-0.003
ge-0.003
Token all
Token ID477
Feature activation-4.72e+00

Pos logit contributions
 door+0.063
 us+0.058
 gate+0.056
 poster+0.055
 me+0.054
Neg logit contributions
 secure-0.082
asting-0.076
Er-0.074
 Ethan-0.074
ge-0.074
Token the
Token ID262
Feature activation-1.66e-01

Pos logit contributions
 door+0.138
 us+0.131
 me+0.123
 gate+0.119
 too+0.119
Neg logit contributions
 secure-0.189
asting-0.178
Er-0.176
ge-0.175
 shave-0.172
Token other
Token ID584
Feature activation-1.31e+00

Pos logit contributions
 door+0.005
 us+0.004
 me+0.004
 too+0.004
 gate+0.004
Neg logit contributions
 secure-0.007
Er-0.007
ge-0.007
asting-0.007
 blonde-0.007
Token fish
Token ID5916
Feature activation+1.45e-01

Pos logit contributions
 door+0.030
 us+0.028
 me+0.026
 gate+0.025
 too+0.024
Neg logit contributions
 secure-0.059
asting-0.057
Er-0.056
ge-0.055
 Ethan-0.055
Token about
Token ID546
Feature activation+1.14e+00

Pos logit contributions
 secure+0.006
asting+0.006
ge+0.005
Er+0.005
 shave+0.005
Neg logit contributions
 door-0.004
 us-0.004
 me-0.004
 poster-0.004
 gate-0.004
Token the
Token ID262
Feature activation+5.01e-01

Pos logit contributions
 secure+0.046
asting+0.043
ge+0.043
Er+0.042
 shave+0.042
Neg logit contributions
 door-0.033
 us-0.030
 gate-0.029
 me-0.028
 poster-0.028
Token hidden
Token ID7104
Feature activation-1.16e+00

Pos logit contributions
 secure+0.059
Er+0.056
asting+0.055
ge+0.054
 shave+0.054
Neg logit contributions
 door-0.051
 us-0.049
 me-0.046
 too-0.046
 gate-0.044
Token,
Token ID11
Feature activation-1.58e-02

Pos logit contributions
 door+0.032
 us+0.028
 me+0.028
 poster+0.027
 gate+0.027
Neg logit contributions
 secure-0.048
ge-0.046
asting-0.045
Er-0.044
 Ethan-0.043
Token but
Token ID475
Feature activation+3.73e-01

Pos logit contributions
 door+0.001
 us+0.000
 me+0.000
 gate+0.000
 poster+0.000
Neg logit contributions
 secure-0.001
asting-0.001
 shave-0.001
Er-0.001
 blonde-0.001
Token right
Token ID826
Feature activation-1.09e+00

Pos logit contributions
 secure+0.016
Er+0.014
asting+0.014
 shave+0.014
ge+0.014
Neg logit contributions
 door-0.010
 us-0.010
 me-0.009
 gate-0.009
 too-0.009
Token in
Token ID287
Feature activation-4.20e-01

Pos logit contributions
 door+0.028
 us+0.027
 me+0.026
 cho+0.025
 poster+0.025
Neg logit contributions
 secure-0.044
Er-0.040
asting-0.040
ge-0.040
 shave-0.039
Token front
Token ID2166
Feature activation-4.72e+00

Pos logit contributions
 door+0.015
 us+0.015
 me+0.014
 too+0.014
 gate+0.013
Neg logit contributions
 secure-0.014
Er-0.013
asting-0.013
 energ-0.013
ge-0.013
Token of
Token ID286
Feature activation-1.11e+00

Pos logit contributions
 door+0.133
 poster+0.131
 cho+0.129
ipt+0.125
 gate+0.123
Neg logit contributions
 secure-0.171
asting-0.155
ge-0.148
 blonde-0.146
 shave-0.143
Token them
Token ID606
Feature activation+6.26e-01

Pos logit contributions
 door+0.023
 us+0.019
 gate+0.019
 me+0.018
 poster+0.018
Neg logit contributions
 secure-0.054
Er-0.051
asting-0.051
ge-0.050
 shave-0.050
Token.
Token ID13
Feature activation-3.54e-01

Pos logit contributions
 secure+0.026
asting+0.024
ge+0.023
Er+0.023
 shave+0.023
Neg logit contributions
 door-0.017
 us-0.016
 gate-0.015
 me-0.015
 cho-0.015
Token<|endoftext|>
Token ID50256
Feature activation-1.78e+00

Pos logit contributions
 door+0.009
 us+0.009
 me+0.008
 too+0.008
 gate+0.007
Neg logit contributions
 secure-0.016
asting-0.015
ge-0.015
 shave-0.015
 energ-0.015
TokenSam
Token ID16305
Feature activation-2.11e-01

Pos logit contributions
 door+0.054
 us+0.052
 me+0.049
 too+0.047
 gate+0.047
Neg logit contributions
 secure-0.070
Er-0.066
asting-0.065
ge-0.064
 shave-0.064
Token pet
Token ID4273
Feature activation-4.46e-02

Pos logit contributions
 door+0.001
 us+0.001
 me+0.001
 too+0.001
 gate+0.001
Neg logit contributions
 secure-0.003
Er-0.003
asting-0.003
ge-0.003
 shave-0.003
Tokenals
Token ID874
Feature activation-1.83e+00

Pos logit contributions
 door+0.001
 us+0.001
 poster+0.001
ipt+0.001
 me+0.001
Neg logit contributions
 blonde-0.002
 Baby-0.002
asting-0.002
 Ethan-0.002
 secure-0.002
Token and
Token ID290
Feature activation-3.93e-01

Pos logit contributions
 door+0.046
 us+0.045
 me+0.042
 poster+0.039
 gate+0.039
Neg logit contributions
 secure-0.078
asting-0.074
ge-0.072
Er-0.072
 blonde-0.072
Token a
Token ID257
Feature activation-1.08e+00

Pos logit contributions
 door+0.012
 us+0.011
 me+0.011
 gate+0.010
 too+0.010
Neg logit contributions
 secure-0.015
 blonde-0.014
Er-0.014
asting-0.014
 shave-0.014
Token yellow
Token ID7872
Feature activation-2.18e+00

Pos logit contributions
 us+0.039
 door+0.035
 me+0.034
ays+0.034
ipt+0.033
Neg logit contributions
 secure-0.035
 blonde-0.035
Er-0.033
 shave-0.032
 snug-0.031
Token middle
Token ID3504
Feature activation-4.72e+00

Pos logit contributions
 us+0.078
ipt+0.071
men+0.070
ays+0.069
 cho+0.069
Neg logit contributions
 blonde-0.068
 shave-0.064
 secure-0.058
 texture-0.058
 chew-0.057
Token.
Token ID13
Feature activation+1.22e-01

Pos logit contributions
 us+0.149
 cho+0.138
 door+0.136
 me+0.133
 poster+0.130
Neg logit contributions
 secure-0.163
 blonde-0.160
asting-0.150
ge-0.144
 snug-0.142
Token It
Token ID632
Feature activation-3.19e-01

Pos logit contributions
 blonde+0.004
 secure+0.004
Er+0.004
 Ethan+0.004
 Baby+0.004
Neg logit contributions
 door-0.004
 us-0.004
 me-0.004
 poster-0.004
 gate-0.004
Token smells
Token ID25760
Feature activation+3.14e-02

Pos logit contributions
ipt+0.010
 door+0.010
 cho+0.009
 us+0.009
amer+0.008
Neg logit contributions
 blonde-0.010
 secure-0.010
 energ-0.010
chy-0.010
 chew-0.010
Token nice
Token ID3621
Feature activation+2.69e-01

Pos logit contributions
 secure+0.001
ge+0.001
Er+0.001
 shave+0.001
Marcus+0.001
Neg logit contributions
 us-0.001
 door-0.001
 me-0.001
 too-0.001
 gate-0.001
Token.
Token ID13
Feature activation+1.69e-01

Pos logit contributions
asting+0.009
 secure+0.009
 blonde+0.009
 energ+0.008
ge+0.008
Neg logit contributions
 door-0.008
 us-0.008
 poster-0.008
 cho-0.007
 me-0.007
Token had
Token ID550
Feature activation+6.63e-01

Pos logit contributions
 secure+0.070
Er+0.067
asting+0.066
 energ+0.065
zing+0.062
Neg logit contributions
 door-0.063
 us-0.058
 poster-0.054
 gate-0.054
 cho-0.053
Token cleaned
Token ID20750
Feature activation+3.36e+00

Pos logit contributions
 secure+0.028
 blonde+0.026
 energ+0.026
asting+0.025
Er+0.025
Neg logit contributions
 door-0.018
 us-0.017
 poster-0.017
 gate-0.016
ays-0.015
Token up
Token ID510
Feature activation+6.82e-02

Pos logit contributions
 secure+0.159
Er+0.157
ge+0.153
 blonde+0.152
asting+0.152
Neg logit contributions
 door-0.077
 us-0.071
 me-0.070
 too-0.069
 gate-0.059
Token all
Token ID477
Feature activation-3.04e+00

Pos logit contributions
 secure+0.003
Er+0.003
ge+0.003
 shave+0.003
 blonde+0.003
Neg logit contributions
 door-0.002
 us-0.002
 too-0.002
 me-0.002
 gate-0.001
Token the
Token ID262
Feature activation+1.46e+00

Pos logit contributions
 door+0.080
 us+0.076
 me+0.072
 too+0.069
 gate+0.067
Neg logit contributions
 secure-0.132
Er-0.126
asting-0.123
ge-0.123
 shave-0.122
Token trash
Token ID13913
Feature activation+4.36e+00

Pos logit contributions
Marcus+0.067
Er+0.066
Phill+0.065
 Junior+0.064
 Caleb+0.064
Neg logit contributions
 door-0.044
 me-0.040
 us-0.040
 gate-0.034
 too-0.034
Token from
Token ID422
Feature activation+1.91e+00

Pos logit contributions
 secure+0.148
asting+0.135
ge+0.131
 snug+0.130
Er+0.129
Neg logit contributions
 us-0.131
 door-0.125
 poster-0.125
 me-0.121
 cho-0.120
Token the
Token ID262
Feature activation+9.19e-01

Pos logit contributions
 secure+0.084
Er+0.082
asting+0.079
ge+0.079
 shave+0.078
Neg logit contributions
 door-0.050
 us-0.047
 me-0.045
 too-0.043
 gate-0.041
Token ash
Token ID12530
Feature activation+2.23e+00

Pos logit contributions
 secure+0.036
Er+0.034
ge+0.033
asting+0.033
 shave+0.033
Neg logit contributions
 door-0.029
 us-0.028
 me-0.027
 gate-0.025
 too-0.025
Tokentr
Token ID2213
Feature activation+2.64e+00

Pos logit contributions
 secure+0.050
asting+0.044
chy+0.043
sy+0.040
 blonde+0.040
Neg logit contributions
 us-0.098
 poster-0.086
ucks-0.086
 me-0.084
 MacDonald-0.083
Tokenay
Token ID323
Feature activation-2.06e+00

Pos logit contributions
 secure+0.087
innie+0.079
ge+0.078
 Em+0.077
Er+0.076
Neg logit contributions
 us-0.089
 me-0.083
 lions-0.080
 poster-0.080
 monkeys-0.078
Token seen
Token ID1775
Feature activation+1.87e-01

Pos logit contributions
 door+0.090
 us+0.088
 me+0.084
 too+0.082
 gate+0.079
Neg logit contributions
 secure-0.088
Er-0.083
 shave-0.080
asting-0.079
ge-0.079
Token.
Token ID13
Feature activation+1.22e-01

Pos logit contributions
 secure+0.007
asting+0.006
ge+0.006
 blonde+0.006
Er+0.006
Neg logit contributions
 door-0.006
 gate-0.005
 us-0.005
 poster-0.005
 cho-0.005
Token
Token ID220
Feature activation+2.05e-01

Pos logit contributions
 secure+0.004
Er+0.004
asting+0.004
ge+0.004
 shave+0.004
Neg logit contributions
 door-0.004
 us-0.004
 me-0.004
 too-0.004
 gate-0.004
Token
Token ID198
Feature activation+6.05e-01

Pos logit contributions
 secure+0.008
asting+0.008
ge+0.008
 shave+0.007
Er+0.007
Neg logit contributions
 door-0.006
 us-0.006
 me-0.006
 too-0.005
 gate-0.005
Token
Token ID198
Feature activation+8.94e-01

Pos logit contributions
 secure+0.025
ge+0.024
 shave+0.023
 blonde+0.023
asting+0.023
Neg logit contributions
 us-0.017
 door-0.017
 me-0.016
 too-0.015
 them-0.014
TokenEm
Token ID10161
Feature activation+5.11e+00

Pos logit contributions
 secure+0.026
Er+0.023
asting+0.023
ge+0.023
 shave+0.023
Neg logit contributions
 door-0.037
 us-0.036
 me-0.034
 too-0.033
 gate-0.033
Tokenma
Token ID2611
Feature activation+2.78e-01

Pos logit contributions
 secure+0.227
Er+0.213
asting+0.211
 shave+0.210
 blonde+0.208
Neg logit contributions
 door-0.129
 us-0.126
 me-0.117
 too-0.111
 gate-0.107
Token was
Token ID373
Feature activation+7.28e-01

Pos logit contributions
 secure+0.010
Er+0.009
asting+0.009
 Ethan+0.009
 blonde+0.009
Neg logit contributions
 door-0.009
 us-0.009
 gate-0.008
 cho-0.008
 poster-0.008
Token so
Token ID523
Feature activation+3.30e+00

Pos logit contributions
 secure+0.030
Er+0.029
asting+0.028
 blonde+0.028
 energ+0.027
Neg logit contributions
 door-0.021
 us-0.020
 me-0.018
 gate-0.018
 too-0.017
Token proud
Token ID6613
Feature activation-9.78e-01

Pos logit contributions
ge+0.108
 secure+0.108
 shave+0.107
asting+0.102
tto+0.100
Neg logit contributions
 door-0.116
 too-0.112
 us-0.110
 me-0.110
 gate-0.099
Token of
Token ID286
Feature activation+7.93e-01

Pos logit contributions
 us+0.028
 door+0.028
 gate+0.027
 poster+0.026
 cho+0.026
Neg logit contributions
 secure-0.036
asting-0.033
 Ethan-0.033
ge-0.032
Er-0.032
Token's
Token ID338
Feature activation-1.37e+00

Pos logit contributions
 door+0.039
 us+0.036
 gate+0.034
 me+0.033
 cho+0.031
Neg logit contributions
 secure-0.068
Er-0.064
asting-0.063
ge-0.062
 shave-0.062
Token something
Token ID1223
Feature activation+1.57e+00

Pos logit contributions
 too+0.048
 us+0.047
 door+0.047
 me+0.046
 cho+0.040
Neg logit contributions
Er-0.046
 secure-0.045
 blonde-0.045
ge-0.044
Marcus-0.044
Token important
Token ID1593
Feature activation-4.87e-02

Pos logit contributions
 secure+0.059
asting+0.054
Er+0.054
 shave+0.053
 energ+0.052
Neg logit contributions
 door-0.050
 us-0.047
 gate-0.044
 me-0.044
 poster-0.042
Token.
Token ID13
Feature activation-7.31e-02

Pos logit contributions
 door+0.001
 gate+0.001
 us+0.001
 cho+0.001
 poster+0.001
Neg logit contributions
 secure-0.002
ge-0.002
asting-0.002
 shave-0.002
 Ethan-0.002
Token<|endoftext|>
Token ID50256
Feature activation-1.50e+00

Pos logit contributions
 door+0.002
 us+0.002
 me+0.002
 too+0.002
 gate+0.002
Neg logit contributions
 secure-0.003
ge-0.003
asting-0.003
Er-0.003
 shave-0.003
TokenEm
Token ID10161
Feature activation+4.66e+00

Pos logit contributions
 door+0.042
 us+0.041
 me+0.039
 too+0.038
 gate+0.035
Neg logit contributions
 secure-0.064
ge-0.059
asting-0.058
Er-0.058
 shave-0.058
Tokenma
Token ID2611
Feature activation+2.69e-02

Pos logit contributions
 secure+0.250
 afloat+0.233
 blonde+0.232
 energ+0.230
 shave+0.229
Neg logit contributions
 us-0.090
 me-0.089
 door-0.080
 too-0.066
 poster-0.053
Token was
Token ID373
Feature activation+9.37e-01

Pos logit contributions
 secure+0.001
Er+0.001
asting+0.001
ge+0.001
 shave+0.001
Neg logit contributions
 door-0.001
 us-0.001
 me-0.001
 too-0.001
 gate-0.001
Token a
Token ID257
Feature activation+1.83e+00

Pos logit contributions
 secure+0.036
ge+0.035
Er+0.034
asting+0.034
 shave+0.034
Neg logit contributions
 door-0.029
 us-0.028
 me-0.027
 too-0.026
 gate-0.024
Token 3
Token ID513
Feature activation+1.46e+00

Pos logit contributions
 secure+0.062
ge+0.058
asting+0.057
Er+0.056
 shave+0.056
Neg logit contributions
 door-0.066
 us-0.063
 me-0.062
 too-0.061
 gate-0.058
Token year
Token ID614
Feature activation+6.20e-01

Pos logit contributions
 secure+0.055
asting+0.051
Er+0.051
ge+0.050
 blonde+0.050
Neg logit contributions
 door-0.046
 us-0.044
 me-0.042
 gate-0.040
 too-0.040
Token big
Token ID1263
Feature activation-3.32e-01

Pos logit contributions
 secure+0.052
Er+0.049
asting+0.048
 shave+0.047
ge+0.047
Neg logit contributions
 door-0.046
 us-0.045
 me-0.042
 too-0.041
 gate-0.040
Token bowl
Token ID9396
Feature activation+9.11e-02

Pos logit contributions
 us+0.010
ipt+0.009
 door+0.009
ays+0.009
ocked+0.008
Neg logit contributions
 secure-0.013
 shave-0.012
 blonde-0.012
 snug-0.011
 Ethan-0.011
Token of
Token ID286
Feature activation+1.19e+00

Pos logit contributions
 secure+0.004
asting+0.004
ge+0.003
 shave+0.003
Er+0.003
Neg logit contributions
 us-0.002
 door-0.002
 me-0.002
 gate-0.002
 poster-0.002
Token ice
Token ID4771
Feature activation+1.71e+00

Pos logit contributions
 secure+0.041
Er+0.038
ge+0.038
asting+0.037
 shave+0.037
Neg logit contributions
 door-0.042
 us-0.041
 me-0.039
 too-0.038
 gate-0.037
Token-
Token ID12
Feature activation-7.31e-01

Pos logit contributions
 secure+0.068
asting+0.064
Er+0.064
ge+0.063
 shave+0.063
Neg logit contributions
 door-0.050
 us-0.048
 me-0.045
 gate-0.043
 too-0.043
Tokencream
Token ID36277
Feature activation+4.28e+00

Pos logit contributions
 us+0.029
 door+0.028
 cho+0.027
 me+0.027
 too+0.027
Neg logit contributions
 secure-0.022
 blonde-0.021
Er-0.020
Marcus-0.019
ge-0.019
Token!
Token ID0
Feature activation-6.20e-01

Pos logit contributions
 secure+0.140
asting+0.135
 shave+0.127
 Ethan+0.125
 energ+0.123
Neg logit contributions
 us-0.142
 me-0.126
ucks-0.125
 poster-0.121
 planes-0.120
Token The
Token ID383
Feature activation+3.96e-01

Pos logit contributions
 door+0.020
 us+0.020
 me+0.018
 too+0.018
 gate+0.018
Neg logit contributions
 secure-0.023
Er-0.021
asting-0.021
 blonde-0.021
 shave-0.021
Token girl
Token ID2576
Feature activation+4.53e-01

Pos logit contributions
ge+0.018
 secure+0.018
Er+0.017
asting+0.016
 Ethan+0.016
Neg logit contributions
 door-0.012
 too-0.010
 me-0.010
 gate-0.009
 us-0.009
Token peek
Token ID27185
Feature activation+5.25e-01

Pos logit contributions
 secure+0.019
Er+0.017
asting+0.017
ge+0.017
 shave+0.017
Neg logit contributions
 door-0.013
 us-0.012
 me-0.012
 too-0.011
 gate-0.011
Tokened
Token ID276
Feature activation-7.74e-02

Pos logit contributions
 secure+0.023
Er+0.021
asting+0.021
ge+0.021
 shave+0.021
Neg logit contributions
 door-0.014
 us-0.013
 me-0.012
 too-0.012
 gate-0.012
Token friends
Token ID2460
Feature activation+4.31e-02

Pos logit contributions
 secure+0.014
 blonde+0.013
Er+0.012
 shave+0.012
 Ethan+0.012
Neg logit contributions
 us-0.011
 door-0.010
 cho-0.010
ays-0.010
 comes-0.010
Token.
Token ID13
Feature activation+3.48e-01

Pos logit contributions
 secure+0.002
Er+0.002
asting+0.002
ge+0.002
 shave+0.002
Neg logit contributions
 door-0.001
 us-0.001
 me-0.001
 gate-0.001
 too-0.001
Token Everyone
Token ID11075
Feature activation-4.29e-01

Pos logit contributions
 secure+0.010
Er+0.010
 blonde+0.010
 Ethan+0.010
asting+0.009
Neg logit contributions
 us-0.013
 door-0.013
 me-0.012
 gate-0.012
 poster-0.012
Token admired
Token ID29382
Feature activation+1.13e+00

Pos logit contributions
 door+0.015
 us+0.014
 gate+0.014
 poster+0.013
 me+0.013
Neg logit contributions
 secure-0.015
asting-0.014
Er-0.014
 energ-0.014
 Ethan-0.013
Token his
Token ID465
Feature activation+1.88e+00

Pos logit contributions
 secure+0.052
Er+0.049
asting+0.048
ge+0.048
 shave+0.048
Neg logit contributions
 door-0.027
 us-0.026
 me-0.024
 too-0.023
 gate-0.023
Token pants
Token ID12581
Feature activation+4.30e+00

Pos logit contributions
 secure+0.073
Er+0.072
asting+0.069
ge+0.068
tto+0.068
Neg logit contributions
 door-0.059
 us-0.056
 me-0.054
 too-0.054
 gate-0.051
Token,
Token ID11
Feature activation+3.28e-02

Pos logit contributions
 secure+0.207
Er+0.196
asting+0.195
ge+0.193
 shave+0.192
Neg logit contributions
 door-0.092
 us-0.088
 me-0.080
 gate-0.075
 too-0.075
Token and
Token ID290
Feature activation-4.25e-01

Pos logit contributions
 secure+0.001
asting+0.001
 blonde+0.001
 energ+0.001
Er+0.001
Neg logit contributions
 door-0.001
 us-0.001
 me-0.001
 gate-0.001
 poster-0.001
Token he
Token ID339
Feature activation-4.49e-01

Pos logit contributions
 door+0.013
 us+0.012
 me+0.011
 gate+0.011
 poster+0.011
Neg logit contributions
 secure-0.017
asting-0.015
Er-0.015
 Ethan-0.015
 energ-0.015
Token felt
Token ID2936
Feature activation-3.78e-01

Pos logit contributions
 door+0.018
 us+0.016
 gate+0.016
 poster+0.015
ays+0.015
Neg logit contributions
 secure-0.014
Er-0.012
 energ-0.012
asting-0.012
 Ethan-0.012
Token very
Token ID845
Feature activation+5.26e-01

Pos logit contributions
 too+0.010
 us+0.010
 door+0.010
 me+0.010
 cho+0.008
Neg logit contributions
ge-0.016
 secure-0.015
Er-0.015
 shave-0.015
innie-0.015
TokenTim
Token ID14967
Feature activation+7.97e-01

Pos logit contributions
 door+0.052
 us+0.049
 me+0.046
 gate+0.045
 too+0.045
Neg logit contributions
 secure-0.065
Er-0.062
asting-0.061
ge-0.060
 shave-0.059
Token and
Token ID290
Feature activation+6.36e-02

Pos logit contributions
 secure+0.039
Er+0.036
ge+0.036
asting+0.036
 shave+0.035
Neg logit contributions
 door-0.017
 us-0.017
 me-0.016
 too-0.015
 gate-0.014
Token Lily
Token ID20037
Feature activation-1.79e-01

Pos logit contributions
 secure+0.003
ge+0.003
asting+0.003
Er+0.002
 energ+0.002
Neg logit contributions
 door-0.002
 us-0.002
 me-0.002
 too-0.002
 gate-0.002
Token were
Token ID547
Feature activation+2.65e-01

Pos logit contributions
 us+0.004
 me+0.004
 door+0.003
 comes+0.003
 too+0.003
Neg logit contributions
 secure-0.009
ge-0.008
asting-0.008
Er-0.008
innie-0.008
Token playing
Token ID2712
Feature activation+1.41e+00

Pos logit contributions
 secure+0.010
Er+0.009
asting+0.009
 blonde+0.009
 energ+0.009
Neg logit contributions
 door-0.009
 us-0.008
 me-0.008
 gate-0.008
 too-0.008
Token in
Token ID287
Feature activation+9.63e-01

Pos logit contributions
 secure+0.046
 shave+0.045
Er+0.045
asting+0.044
 chew+0.042
Neg logit contributions
 door-0.047
 us-0.044
 gate-0.042
 cho-0.041
 poster-0.040
Token the
Token ID262
Feature activation+2.28e-01

Pos logit contributions
 secure+0.045
asting+0.042
Er+0.041
ge+0.041
 shave+0.041
Neg logit contributions
 door-0.024
 us-0.023
 me-0.022
 too-0.020
 gate-0.019
Token park
Token ID3952
Feature activation-8.94e-01

Pos logit contributions
 secure+0.010
Er+0.009
asting+0.009
ge+0.009
 shave+0.009
Neg logit contributions
 door-0.006
 us-0.006
 me-0.006
 too-0.006
 gate-0.005
Token with
Token ID351
Feature activation-4.07e-01

Pos logit contributions
 door+0.027
 us+0.027
 me+0.025
 cho+0.024
 poster+0.023
Neg logit contributions
 secure-0.033
asting-0.032
ge-0.031
 blonde-0.031
Er-0.031
Token their
Token ID511
Feature activation+1.49e+00

Pos logit contributions
 door+0.008
 us+0.008
 me+0.008
 too+0.007
 gate+0.007
Neg logit contributions
 secure-0.020
asting-0.019
ge-0.019
 energ-0.019
Er-0.019
Token new
Token ID649
Feature activation+4.41e-02

Pos logit contributions
 secure+0.060
 blonde+0.057
Er+0.056
 shave+0.055
 Ethan+0.054
Neg logit contributions
 us-0.038
 door-0.038
 me-0.035
 too-0.035
 cho-0.034
Token in
Token ID287
Feature activation+6.08e-01

Pos logit contributions
 secure+0.039
 shave+0.038
Er+0.038
asting+0.038
 energ+0.036
Neg logit contributions
 door-0.042
 us-0.038
 gate-0.038
 cho-0.037
 poster-0.035
Token the
Token ID262
Feature activation-4.65e-03

Pos logit contributions
 secure+0.026
Er+0.025
asting+0.025
ge+0.024
 shave+0.024
Neg logit contributions
 door-0.016
 us-0.015
 me-0.015
 too-0.014
 gate-0.014
Token garden
Token ID11376
Feature activation+6.66e-02

Pos logit contributions
 door+0.000
 us+0.000
 me+0.000
 too+0.000
 gate+0.000
Neg logit contributions
 secure-0.000
Er-0.000
asting-0.000
ge-0.000
 shave-0.000
Token.
Token ID13
Feature activation+7.64e-01

Pos logit contributions
 secure+0.003
asting+0.002
ge+0.002
Er+0.002
 blonde+0.002
Neg logit contributions
 door-0.002
 us-0.002
 me-0.002
 cho-0.002
 gate-0.002
Token They
Token ID1119
Feature activation+1.21e+00

Pos logit contributions
 secure+0.034
Er+0.032
asting+0.032
ge+0.032
 shave+0.031
Neg logit contributions
 door-0.020
 us-0.019
 me-0.018
 too-0.017
 gate-0.017
Token had
Token ID550
Feature activation+9.62e-01

Pos logit contributions
 secure+0.048
Er+0.045
ge+0.045
asting+0.045
 shave+0.044
Neg logit contributions
 door-0.036
 us-0.035
 me-0.034
 too-0.032
 gate-0.031
Token a
Token ID257
Feature activation+1.86e-01

Pos logit contributions
 secure+0.033
 blonde+0.031
asting+0.031
Er+0.030
 energ+0.029
Neg logit contributions
 door-0.033
 us-0.032
 me-0.030
 poster-0.030
 gate-0.029
Token big
Token ID1263
Feature activation-1.15e+00

Pos logit contributions
 secure+0.007
Er+0.006
asting+0.006
 blonde+0.006
 shave+0.006
Neg logit contributions
 door-0.006
 us-0.006
 me-0.006
 too-0.005
 gate-0.005
Token drum
Token ID13026
Feature activation-1.31e+00

Pos logit contributions
 us+0.034
 door+0.033
ocked+0.030
ays+0.030
ipt+0.030
Neg logit contributions
 secure-0.042
 blonde-0.041
 shave-0.039
Er-0.039
 snug-0.038
Token that
Token ID326
Feature activation-1.09e+00

Pos logit contributions
 us+0.047
 me+0.041
ipt+0.040
ucks+0.039
 door+0.038
Neg logit contributions
 blonde-0.038
 secure-0.036
asting-0.036
 shave-0.036
zing-0.036
Token made
Token ID925
Feature activation-3.41e-01

Pos logit contributions
 door+0.037
 us+0.035
 gate+0.033
 me+0.032
 poster+0.032
Neg logit contributions
 secure-0.037
 blonde-0.034
 shave-0.033
 snug-0.033
 Ethan-0.033
Token than
Token ID621
Feature activation-7.42e-01

Pos logit contributions
 secure+0.058
ge+0.055
Er+0.054
asting+0.054
 shave+0.053
Neg logit contributions
 door-0.033
 us-0.032
 me-0.030
 gate-0.029
 cho-0.027
Token anything
Token ID1997
Feature activation+1.77e+00

Pos logit contributions
 door+0.022
 us+0.021
 me+0.020
 gate+0.020
 too+0.019
Neg logit contributions
 secure-0.029
Er-0.027
asting-0.027
ge-0.027
 shave-0.027
Token else
Token ID2073
Feature activation+3.43e-01

Pos logit contributions
 secure+0.072
asting+0.066
Er+0.066
ge+0.065
 shave+0.064
Neg logit contributions
 door-0.052
 us-0.048
 gate-0.046
 me-0.046
 poster-0.043
Token.
Token ID13
Feature activation-5.41e-01

Pos logit contributions
 secure+0.015
Er+0.014
asting+0.014
ge+0.014
 shave+0.014
Neg logit contributions
 door-0.009
 us-0.008
 me-0.008
 gate-0.008
 too-0.007
Token<|endoftext|>
Token ID50256
Feature activation-1.54e+00

Pos logit contributions
 door+0.016
 us+0.015
 me+0.015
 too+0.014
 gate+0.012
Neg logit contributions
 secure-0.023
ge-0.022
asting-0.021
 blonde-0.021
 shave-0.020
TokenOnce
Token ID7454
Feature activation+9.58e-01

Pos logit contributions
 door+0.044
 us+0.043
 me+0.041
 too+0.040
 gate+0.037
Neg logit contributions
 secure-0.064
ge-0.059
Er-0.059
asting-0.058
 shave-0.058
Token upon
Token ID2402
Feature activation-2.74e+00

Pos logit contributions
 secure+0.036
Er+0.034
ge+0.033
asting+0.033
 shave+0.033
Neg logit contributions
 door-0.030
 us-0.029
 me-0.028
 gate-0.027
 too-0.026
Token a
Token ID257
Feature activation-4.85e-01

Pos logit contributions
 too+0.058
 us+0.052
 me+0.049
 door+0.049
 them+0.040
Neg logit contributions
 secure-0.139
Er-0.133
ge-0.133
Marcus-0.127
 afloat-0.127
Token time
Token ID640
Feature activation+1.24e+00

Pos logit contributions
 us+0.016
 me+0.016
 too+0.016
 door+0.014
aw+0.013
Neg logit contributions
 secure-0.019
Er-0.017
ge-0.017
 Ethan-0.017
 shave-0.017
Token there
Token ID612
Feature activation-2.04e-01

Pos logit contributions
 secure+0.055
Er+0.052
asting+0.051
ge+0.051
 shave+0.051
Neg logit contributions
 door-0.032
 us-0.030
 me-0.028
 too-0.027
 gate-0.027
Token was
Token ID373
Feature activation-1.20e+00

Pos logit contributions
 us+0.004
 door+0.004
 me+0.004
 too+0.003
 comes+0.003
Neg logit contributions
 secure-0.011
Er-0.010
ge-0.010
asting-0.010
 energ-0.010
Token yard
Token ID12699
Feature activation+3.99e-01

Pos logit contributions
 secure+0.059
 blonde+0.055
Er+0.055
ge+0.054
 shave+0.053
Neg logit contributions
 us-0.039
 door-0.037
 cho-0.035
 me-0.034
 too-0.033
Token.
Token ID13
Feature activation+1.05e+00

Pos logit contributions
 secure+0.015
ge+0.014
 blonde+0.014
asting+0.013
 energ+0.013
Neg logit contributions
 us-0.013
 door-0.011
 me-0.011
ays-0.011
 cho-0.011
Token He
Token ID679
Feature activation+1.58e+00

Pos logit contributions
 secure+0.035
Er+0.033
asting+0.032
ge+0.032
 shave+0.032
Neg logit contributions
 door-0.038
 us-0.036
 me-0.035
 too-0.034
 gate-0.033
Token looked
Token ID3114
Feature activation+1.76e+00

Pos logit contributions
 secure+0.059
 blonde+0.054
Er+0.053
asting+0.053
ge+0.053
Neg logit contributions
 door-0.050
 us-0.048
 poster-0.044
 gate-0.044
ays-0.043
Token up
Token ID510
Feature activation+1.88e+00

Pos logit contributions
 secure+0.069
 energ+0.065
asting+0.064
 snug+0.063
Er+0.061
Neg logit contributions
 door-0.050
 us-0.049
 poster-0.046
 gate-0.044
 cho-0.044
Token in
Token ID287
Feature activation+1.91e+00

Pos logit contributions
 secure+0.079
 energ+0.076
Er+0.075
ge+0.075
 blonde+0.074
Neg logit contributions
 us-0.045
 door-0.043
 poster-0.042
 cho-0.042
 gate-0.038
Token the
Token ID262
Feature activation+1.60e+00

Pos logit contributions
 secure+0.077
Er+0.071
 blonde+0.070
asting+0.070
 shave+0.070
Neg logit contributions
 door-0.055
 us-0.053
 me-0.049
 cho-0.047
 too-0.047
Token sky
Token ID6766
Feature activation-2.53e-01

Pos logit contributions
 secure+0.069
 blonde+0.065
Er+0.064
asting+0.064
 snug+0.063
Neg logit contributions
 us-0.038
 door-0.037
 too-0.035
 cho-0.034
 poster-0.032
Token and
Token ID290
Feature activation+2.00e-01

Pos logit contributions
 door+0.006
 us+0.006
 me+0.005
 gate+0.005
 poster+0.005
Neg logit contributions
 secure-0.012
asting-0.011
Er-0.011
ge-0.011
 shave-0.011
Token it
Token ID340
Feature activation+2.92e-01

Pos logit contributions
 secure+0.008
Er+0.008
 shave+0.008
 energ+0.008
 Ethan+0.007
Neg logit contributions
 door-0.006
 us-0.005
 gate-0.005
 poster-0.005
 cho-0.005
Token was
Token ID373
Feature activation-2.09e-01

Pos logit contributions
 secure+0.010
 energ+0.009
asting+0.009
Er+0.009
 snug+0.009
Neg logit contributions
 door-0.010
 us-0.010
 gate-0.009
 poster-0.009
ays-0.008
Token to
Token ID284
Feature activation-3.06e-01

Pos logit contributions
 secure+0.140
 blonde+0.132
Er+0.131
 shave+0.131
asting+0.130
Neg logit contributions
 door-0.061
 us-0.056
 gate-0.050
 me-0.049
 poster-0.048
Token do
Token ID466
Feature activation-6.53e-01

Pos logit contributions
 us+0.011
 door+0.011
 me+0.010
 too+0.010
 gate+0.009
Neg logit contributions
Er-0.011
 secure-0.011
 energ-0.010
 blonde-0.010
asting-0.010
Token.
Token ID13
Feature activation+6.68e-01

Pos logit contributions
 door+0.019
 us+0.018
 gate+0.017
 me+0.017
 cho+0.017
Neg logit contributions
 secure-0.025
ge-0.024
asting-0.024
Er-0.024
 shave-0.023
Token She
Token ID1375
Feature activation+3.72e-01

Pos logit contributions
 secure+0.025
Er+0.023
asting+0.023
ge+0.023
 shave+0.023
Neg logit contributions
 door-0.022
 us-0.021
 me-0.020
 too-0.019
 gate-0.019
Token held
Token ID2714
Feature activation+2.12e+00

Pos logit contributions
 secure+0.013
 blonde+0.011
Er+0.011
 shave+0.011
asting+0.011
Neg logit contributions
 door-0.013
 us-0.012
 poster-0.012
 gate-0.012
ays-0.011
Token the
Token ID262
Feature activation+2.13e+00

Pos logit contributions
 secure+0.104
asting+0.101
ge+0.100
Er+0.100
 shave+0.097
Neg logit contributions
 door-0.045
 us-0.042
 me-0.040
 too-0.039
 gate-0.035
Token jacket
Token ID15224
Feature activation+2.93e+00

Pos logit contributions
 secure+0.069
Er+0.065
ge+0.064
asting+0.064
 shave+0.062
Neg logit contributions
 door-0.081
 us-0.076
 me-0.074
 too-0.072
 gate-0.071
Token close
Token ID1969
Feature activation+9.76e-01

Pos logit contributions
 secure+0.111
Er+0.106
asting+0.103
 shave+0.102
ge+0.102
Neg logit contributions
 door-0.094
 us-0.087
 me-0.084
 too-0.084
 gate-0.081
Token to
Token ID284
Feature activation+1.08e+00

Pos logit contributions
 secure+0.049
ge+0.046
Er+0.045
 blonde+0.045
asting+0.045
Neg logit contributions
 door-0.018
 us-0.018
 me-0.016
 poster-0.015
 gate-0.015
Token her
Token ID607
Feature activation+2.26e-01

Pos logit contributions
 secure+0.048
Er+0.046
ge+0.046
asting+0.045
tto+0.045
Neg logit contributions
 door-0.028
 us-0.027
 me-0.026
 too-0.025
 gate-0.023
Token and
Token ID290
Feature activation-3.60e-02

Pos logit contributions
 secure+0.010
Er+0.009
asting+0.009
ge+0.009
 shave+0.009
Neg logit contributions
 door-0.006
 us-0.006
 me-0.005
 too-0.005
 gate-0.005
Token she
Token ID673
Feature activation-1.94e+00

Pos logit contributions
 secure+0.030
 blonde+0.029
 Baby+0.027
 Ethan+0.026
 energ+0.026
Neg logit contributions
 door-0.041
 gate-0.037
 us-0.037
 cho-0.037
ucks-0.036
Token went
Token ID1816
Feature activation+4.17e-01

Pos logit contributions
 door+0.053
 us+0.050
 me+0.048
 too+0.045
 gate+0.044
Neg logit contributions
 secure-0.083
Er-0.078
asting-0.078
ge-0.078
 shave-0.077
Token,
Token ID11
Feature activation+4.63e-01

Pos logit contributions
 secure+0.015
Er+0.015
 blonde+0.014
 Ethan+0.014
asting+0.014
Neg logit contributions
 door-0.013
 us-0.012
 gate-0.011
 me-0.011
 too-0.011
Token she
Token ID673
Feature activation-8.80e-01

Pos logit contributions
 secure+0.019
Er+0.019
asting+0.018
ge+0.018
 shave+0.018
Neg logit contributions
 door-0.013
 us-0.012
 me-0.012
 too-0.011
 gate-0.011
Token had
Token ID550
Feature activation+1.05e+00

Pos logit contributions
 door+0.031
 us+0.030
 gate+0.028
 me+0.027
 too+0.027
Neg logit contributions
 secure-0.029
Er-0.028
 blonde-0.027
asting-0.027
 energ-0.026
Token a
Token ID257
Feature activation+3.60e-01

Pos logit contributions
 blonde+0.035
 secure+0.033
Er+0.032
asting+0.031
 Ethan+0.030
Neg logit contributions
 door-0.036
 us-0.036
 poster-0.034
ays-0.033
 me-0.033
Token box
Token ID3091
Feature activation-1.33e+00

Pos logit contributions
 secure+0.011
 blonde+0.011
Er+0.011
 shave+0.010
 Ethan+0.009
Neg logit contributions
 us-0.014
 door-0.012
ays-0.012
ipt-0.012
 me-0.012
Token of
Token ID286
Feature activation+9.27e-01

Pos logit contributions
 us+0.036
 door+0.033
 me+0.033
 cho+0.030
 poster+0.030
Neg logit contributions
 secure-0.056
asting-0.051
 blonde-0.050
ge-0.050
Er-0.050
Token colors
Token ID7577
Feature activation+4.07e-02

Pos logit contributions
 secure+0.034
Er+0.031
 blonde+0.031
asting+0.031
 shave+0.030
Neg logit contributions
 door-0.030
 us-0.029
 me-0.028
 gate-0.027
 too-0.027
Token with
Token ID351
Feature activation-3.41e-01

Pos logit contributions
 secure+0.002
asting+0.002
 blonde+0.001
ge+0.001
Er+0.001
Neg logit contributions
 us-0.001
 me-0.001
 door-0.001
 poster-0.001
 cho-0.001
Token her
Token ID607
Feature activation+1.19e+00

Pos logit contributions
 door+0.009
 us+0.009
 me+0.008
 gate+0.008
 too+0.008
Neg logit contributions
 secure-0.014
Er-0.013
 blonde-0.013
asting-0.013
ge-0.013
Token her
Token ID607
Feature activation-4.06e-01

Pos logit contributions
 door+0.000
 gate+0.000
 cho+0.000
 poster+0.000
aw+0.000
Neg logit contributions
asting-0.000
 Ethan-0.000
ge-0.000
 shave-0.000
 secure-0.000
Token,
Token ID11
Feature activation-5.86e-01

Pos logit contributions
 cho+0.012
 us+0.012
ipt+0.012
 door+0.012
aw+0.011
Neg logit contributions
 secure-0.014
ge-0.014
 shave-0.013
 blonde-0.013
asting-0.013
Token but
Token ID475
Feature activation+1.79e-01

Pos logit contributions
ays+0.021
 door+0.021
 us+0.021
ipt+0.020
aw+0.020
Neg logit contributions
 secure-0.019
 shave-0.017
 energ-0.016
 snug-0.016
 blonde-0.016
Token he
Token ID339
Feature activation-1.29e+00

Pos logit contributions
 secure+0.006
Er+0.006
 shave+0.006
asting+0.006
ge+0.006
Neg logit contributions
 door-0.006
 us-0.006
 gate-0.005
 me-0.005
 poster-0.005
Token knew
Token ID2993
Feature activation-1.74e+00

Pos logit contributions
 door+0.048
 us+0.045
 gate+0.043
 poster+0.041
 me+0.041
Neg logit contributions
 secure-0.043
asting-0.039
tto-0.038
Er-0.038
 shave-0.038
Token he
Token ID339
Feature activation-1.94e+00

Pos logit contributions
 door+0.056
 us+0.053
 gate+0.051
 poster+0.049
 me+0.049
Neg logit contributions
 secure-0.064
Er-0.058
 shave-0.058
asting-0.058
 Ethan-0.057
Token couldn
Token ID3521
Feature activation-1.35e+00

Pos logit contributions
 door+0.056
 us+0.053
 me+0.050
 gate+0.048
 too+0.047
Neg logit contributions
 secure-0.079
Er-0.074
asting-0.074
ge-0.073
 shave-0.073
Token't
Token ID470
Feature activation-1.73e+00

Pos logit contributions
 door+0.044
 us+0.041
 gate+0.040
 me+0.038
 cho+0.037
Neg logit contributions
 secure-0.047
Er-0.046
asting-0.045
 shave-0.044
ge-0.044
Token.
Token ID13
Feature activation-7.64e-01

Pos logit contributions
 door+0.049
 us+0.047
 me+0.045
 too+0.044
 gate+0.041
Neg logit contributions
 secure-0.072
Er-0.069
asting-0.067
 blonde-0.067
ge-0.067
Token He
Token ID679
Feature activation+3.13e-01

Pos logit contributions
 door+0.020
 too+0.020
 me+0.019
 us+0.019
 gate+0.016
Neg logit contributions
 secure-0.034
asting-0.033
ge-0.033
Er-0.031
 energ-0.031
Token had
Token ID550
Feature activation+3.97e-02

Pos logit contributions
 secure+0.009
Er+0.008
 shave+0.008
 Ethan+0.008
asting+0.008
Neg logit contributions
 door-0.012
 us-0.012
 poster-0.011
 gate-0.011
 cho-0.011
Token dog
Token ID3290
Feature activation-1.40e+00

Pos logit contributions
 secure+0.004
 shave+0.004
 blonde+0.004
 snug+0.004
Er+0.004
Neg logit contributions
 us-0.004
 door-0.004
aw-0.004
 me-0.004
 cho-0.004
Token.
Token ID13
Feature activation-4.94e-01

Pos logit contributions
 door+0.036
 us+0.035
 poster+0.032
 me+0.031
ays+0.031
Neg logit contributions
 secure-0.060
asting-0.055
ge-0.054
 blonde-0.053
 snug-0.053
Token She
Token ID1375
Feature activation-3.40e-02

Pos logit contributions
 door+0.012
 me+0.011
 us+0.011
 too+0.010
 gate+0.009
Neg logit contributions
 secure-0.024
asting-0.024
ge-0.023
 energ-0.022
 shave-0.022
Token started
Token ID2067
Feature activation+2.63e-01

Pos logit contributions
 door+0.001
 us+0.001
 gate+0.001
 me+0.001
 poster+0.001
Neg logit contributions
 secure-0.001
Er-0.001
 blonde-0.001
 shave-0.001
 Ethan-0.001
Token to
Token ID284
Feature activation+1.23e+00

Pos logit contributions
 secure+0.014
ge+0.013
 blonde+0.013
 shave+0.013
Er+0.013
Neg logit contributions
 us-0.004
 door-0.004
 me-0.003
 gate-0.003
 poster-0.003
Token cry
Token ID3960
Feature activation-2.48e-01

Pos logit contributions
 secure+0.036
Er+0.036
asting+0.036
tto+0.034
 blonde+0.034
Neg logit contributions
 door-0.048
 us-0.046
 me-0.046
 too-0.044
 gate-0.043
Token.
Token ID13
Feature activation-4.14e-01

Pos logit contributions
 door+0.007
 us+0.007
 me+0.006
 gate+0.006
 poster+0.006
Neg logit contributions
 secure-0.010
Er-0.010
ge-0.009
asting-0.009
 shave-0.009
Token<|endoftext|>
Token ID50256
Feature activation-1.62e+00

Pos logit contributions
 door+0.011
 me+0.010
 us+0.010
 too+0.010
 gate+0.009
Neg logit contributions
 secure-0.019
asting-0.018
ge-0.018
 energ-0.017
Er-0.017
TokenSam
Token ID16305
Feature activation-4.49e-03

Pos logit contributions
 door+0.049
 us+0.047
 me+0.044
 gate+0.043
 too+0.043
Neg logit contributions
 secure-0.062
Er-0.059
asting-0.058
ge-0.057
 shave-0.057
Token and
Token ID290
Feature activation-3.10e-01

Pos logit contributions
 door+0.000
 us+0.000
 me+0.000
 too+0.000
 gate+0.000
Neg logit contributions
 secure-0.000
Er-0.000
ge-0.000
asting-0.000
 shave-0.000
Token Lily
Token ID20037
Feature activation-1.82e-01

Pos logit contributions
 door+0.009
 us+0.009
 me+0.008
 too+0.008
 gate+0.008
Neg logit contributions
 secure-0.012
Er-0.012
 shave-0.011
asting-0.011
 blonde-0.011
Token something
Token ID1223
Feature activation+5.83e-01

Pos logit contributions
 door+0.047
 us+0.045
 me+0.042
 too+0.041
 gate+0.040
Neg logit contributions
 secure-0.072
Er-0.068
asting-0.067
ge-0.067
 shave-0.066
Token no
Token ID645
Feature activation-1.53e+00

Pos logit contributions
 secure+0.024
Er+0.022
asting+0.022
ge+0.022
 shave+0.022
Neg logit contributions
 door-0.017
 us-0.016
 me-0.015
 too-0.015
 gate-0.014
Token one
Token ID530
Feature activation-2.97e+00

Pos logit contributions
 door+0.060
 us+0.059
 me+0.057
 too+0.056
 gate+0.053
Neg logit contributions
 secure-0.047
Er-0.045
asting-0.043
 blonde-0.042
ge-0.042
Token else
Token ID2073
Feature activation+5.50e-01

Pos logit contributions
 door+0.090
 us+0.081
 gate+0.079
 cho+0.078
 poster+0.075
Neg logit contributions
 secure-0.116
asting-0.110
ge-0.107
 shave-0.106
 energ-0.105
Token had
Token ID550
Feature activation-4.20e-01

Pos logit contributions
 secure+0.023
asting+0.021
ge+0.021
Er+0.021
 shave+0.021
Neg logit contributions
 door-0.016
 us-0.014
 gate-0.013
 me-0.013
 cho-0.013
Token done
Token ID1760
Feature activation+2.59e-01

Pos logit contributions
 door+0.013
 us+0.012
 gate+0.012
 me+0.012
 poster+0.012
Neg logit contributions
 secure-0.016
asting-0.015
ge-0.015
 shave-0.015
Er-0.014
Token and
Token ID290
Feature activation-7.88e-01

Pos logit contributions
 secure+0.011
asting+0.010
Er+0.010
ge+0.010
 shave+0.010
Neg logit contributions
 door-0.007
 us-0.007
 me-0.006
 gate-0.006
 too-0.006
Token they
Token ID484
Feature activation-1.13e+00

Pos logit contributions
 door+0.027
 us+0.025
 gate+0.024
 poster+0.023
 me+0.023
Neg logit contributions
 secure-0.029
asting-0.025
 energ-0.025
ge-0.025
Er-0.025
Token always
Token ID1464
Feature activation-8.18e-01

Pos logit contributions
 door+0.041
 us+0.036
 gate+0.036
 poster+0.036
ucks+0.035
Neg logit contributions
 secure-0.039
asting-0.036
Er-0.035
 Ethan-0.033
 energ-0.033
Token chose
Token ID7690
Feature activation+8.36e-01

Pos logit contributions
 door+0.029
 us+0.028
 gate+0.026
 me+0.026
 too+0.025
Neg logit contributions
 secure-0.029
Er-0.026
asting-0.025
ge-0.025
 shave-0.025
Token him
Token ID683
Feature activation+8.37e-01

Pos logit contributions
 secure+0.038
asting+0.036
Er+0.035
 shave+0.035
ge+0.034
Neg logit contributions
 door-0.020
 us-0.018
 poster-0.017
 gate-0.017
 me-0.017
Token,"
Token ID553
Feature activation+8.57e-01

Pos logit contributions
 door+0.006
 us+0.006
 gate+0.005
 me+0.005
 poster+0.005
Neg logit contributions
 secure-0.010
Er-0.009
asting-0.009
ge-0.009
 shave-0.009
Token her
Token ID607
Feature activation-1.22e+00

Pos logit contributions
 secure+0.031
 shave+0.028
Er+0.028
 Ethan+0.028
 blonde+0.027
Neg logit contributions
 door-0.029
 us-0.027
 gate-0.026
 poster-0.024
 me-0.024
Token mom
Token ID1995
Feature activation-1.04e+00

Pos logit contributions
 us+0.034
 door+0.033
 me+0.030
 gate+0.029
 too+0.029
Neg logit contributions
 secure-0.050
Er-0.047
 blonde-0.046
 shave-0.045
asting-0.045
Token said
Token ID531
Feature activation-1.53e+00

Pos logit contributions
 door+0.024
 us+0.022
 me+0.021
 too+0.020
 gate+0.019
Neg logit contributions
 secure-0.048
Er-0.046
asting-0.046
ge-0.045
 shave-0.045
Token.
Token ID13
Feature activation-5.36e-01

Pos logit contributions
 door+0.032
 too+0.029
 us+0.028
 me+0.026
 cho+0.023
Neg logit contributions
 secure-0.073
Er-0.073
asting-0.071
Marcus-0.070
tto-0.070
Token "
Token ID366
Feature activation-1.58e+00

Pos logit contributions
 door+0.018
 us+0.018
 gate+0.016
 me+0.016
 poster+0.016
Neg logit contributions
 secure-0.018
Er-0.018
 Ethan-0.017
 blonde-0.017
Baby-0.017
TokenNow
Token ID3844
Feature activation-5.53e-01

Pos logit contributions
 us+0.047
 door+0.047
 me+0.045
 too+0.044
 cho+0.041
Neg logit contributions
 secure-0.062
ge-0.057
 shave-0.056
Er-0.056
asting-0.055
Token we
Token ID356
Feature activation-1.42e+00

Pos logit contributions
 us+0.012
 door+0.012
 me+0.012
 too+0.011
 gate+0.010
Neg logit contributions
 secure-0.026
Er-0.025
ge-0.025
asting-0.025
 shave-0.025
Token can
Token ID460
Feature activation-6.24e-02

Pos logit contributions
 door+0.041
 us+0.040
 me+0.038
 too+0.037
 gate+0.035
Neg logit contributions
 secure-0.058
ge-0.055
Er-0.054
 shave-0.054
 blonde-0.053
Token enjoy
Token ID2883
Feature activation+6.03e-01

Pos logit contributions
 door+0.002
 us+0.002
 me+0.002
 too+0.002
 gate+0.002
Neg logit contributions
 secure-0.002
ge-0.002
Er-0.002
 blonde-0.002
 energ-0.002
Token the
Token ID262
Feature activation+7.86e-01

Pos logit contributions
 secure+0.025
Er+0.023
asting+0.023
 shave+0.023
 Ethan+0.023
Neg logit contributions
 door-0.017
 us-0.016
 gate-0.015
 me-0.015
 too-0.014
Token would
Token ID561
Feature activation-1.45e+00

Pos logit contributions
 door+0.003
 us+0.003
 gate+0.003
 me+0.003
 poster+0.003
Neg logit contributions
 secure-0.003
asting-0.003
 shave-0.003
 energ-0.003
Er-0.003
Token cl
Token ID537
Feature activation-3.03e+00

Pos logit contributions
 door+0.056
 us+0.053
 gate+0.051
 me+0.049
 poster+0.049
Neg logit contributions
 secure-0.045
asting-0.040
 shave-0.040
Er-0.040
 blonde-0.039
Tokenap
Token ID499
Feature activation-1.80e+00

Pos logit contributions
 door+0.070
 us+0.065
 me+0.060
 gate+0.057
 too+0.057
Neg logit contributions
 secure-0.144
Er-0.137
asting-0.136
ge-0.134
 shave-0.133
Token his
Token ID465
Feature activation+1.29e+00

Pos logit contributions
 door+0.061
 gate+0.058
 poster+0.058
ipt+0.053
ays+0.053
Neg logit contributions
 secure-0.061
asting-0.061
 shave-0.058
 snug-0.057
zing-0.057
Token wings
Token ID12098
Feature activation-6.59e-01

Pos logit contributions
 secure+0.051
Er+0.049
asting+0.048
ge+0.048
 shave+0.047
Neg logit contributions
 door-0.039
 us-0.037
 me-0.036
 too-0.035
 gate-0.034
Token together
Token ID1978
Feature activation-2.67e+00

Pos logit contributions
 door+0.019
 us+0.018
 me+0.017
 poster+0.017
 gate+0.017
Neg logit contributions
 secure-0.027
asting-0.025
ge-0.024
 shave-0.024
 energ-0.024
Token in
Token ID287
Feature activation-3.46e-01

Pos logit contributions
 door+0.085
 us+0.081
 gate+0.078
 poster+0.077
 cho+0.076
Neg logit contributions
 secure-0.097
ge-0.090
asting-0.090
 shave-0.088
 blonde-0.086
Token excitement
Token ID14067
Feature activation-4.16e-01

Pos logit contributions
 door+0.010
 us+0.009
 me+0.009
 gate+0.009
 too+0.008
Neg logit contributions
 secure-0.014
Er-0.013
asting-0.013
ge-0.013
 blonde-0.013
Token.
Token ID13
Feature activation+2.47e-01

Pos logit contributions
 door+0.013
 us+0.013
 poster+0.013
 gate+0.012
 cho+0.012
Neg logit contributions
 secure-0.015
ge-0.014
asting-0.014
 blonde-0.013
 shave-0.013
Token\n
Token ID198
Feature activation+6.71e-01

Pos logit contributions
 secure+0.009
Er+0.008
asting+0.008
ge+0.008
 shave+0.008
Neg logit contributions
 door-0.009
 us-0.008
 me-0.008
 too-0.008
 gate-0.008
Token\n
Token ID198
Feature activation+9.66e-01

Pos logit contributions
 secure+0.029
 blonde+0.026
Er+0.026
ge+0.026
 shave+0.026
Neg logit contributions
 door-0.019
 us-0.018
 me-0.018
 too-0.017
 gate-0.015
Token.
Token ID13
Feature activation+4.87e-02

Pos logit contributions
 secure+0.005
asting+0.005
ge+0.005
Er+0.005
 shave+0.005
Neg logit contributions
 door-0.004
 us-0.004
 me-0.004
 poster-0.004
 gate-0.004
Token They
Token ID1119
Feature activation+9.14e-01

Pos logit contributions
 secure+0.002
Er+0.002
asting+0.002
ge+0.002
 shave+0.002
Neg logit contributions
 door-0.001
 us-0.001
 me-0.001
 too-0.001
 gate-0.001
Token cut
Token ID2005
Feature activation+7.45e-01

Pos logit contributions
 secure+0.028
Er+0.026
asting+0.026
ge+0.025
 shave+0.025
Neg logit contributions
 door-0.035
 us-0.034
 me-0.033
 too-0.032
 gate-0.032
Token a
Token ID257
Feature activation-6.66e-01

Pos logit contributions
 secure+0.033
Er+0.031
asting+0.031
 shave+0.030
ge+0.030
Neg logit contributions
 door-0.019
 us-0.018
 me-0.017
 gate-0.016
 too-0.016
Token hole
Token ID7604
Feature activation-2.00e+00

Pos logit contributions
 us+0.021
ays+0.020
ipt+0.019
ocked+0.019
 me+0.019
Neg logit contributions
 secure-0.024
 shave-0.023
Er-0.022
asting-0.021
 blonde-0.021
Token for
Token ID329
Feature activation-3.02e+00

Pos logit contributions
 us+0.065
 me+0.061
 door+0.059
 poster+0.058
ipt+0.057
Neg logit contributions
 secure-0.067
asting-0.066
ge-0.063
 snug-0.059
 energ-0.056
Token the
Token ID262
Feature activation-3.52e-02

Pos logit contributions
 door+0.083
 us+0.079
 me+0.074
 gate+0.072
 too+0.070
Neg logit contributions
 secure-0.125
Er-0.118
ge-0.116
asting-0.116
 shave-0.116
Token door
Token ID3420
Feature activation-1.19e+00

Pos logit contributions
 us+0.001
 me+0.001
ocked+0.001
ays+0.001
 door+0.001
Neg logit contributions
 secure-0.001
 shave-0.001
Er-0.001
asting-0.001
ge-0.001
Token.
Token ID13
Feature activation-4.64e-02

Pos logit contributions
 us+0.035
 me+0.033
 door+0.033
 poster+0.030
 cho+0.030
Neg logit contributions
 secure-0.047
ge-0.044
asting-0.043
Er-0.041
 shave-0.041
Token They
Token ID1119
Feature activation+8.56e-01

Pos logit contributions
 door+0.001
 us+0.001
 me+0.001
 too+0.001
 gate+0.001
Neg logit contributions
 secure-0.002
Er-0.002
asting-0.002
ge-0.002
 shave-0.002
Token tape
Token ID9154
Feature activation+1.06e+00

Pos logit contributions
 secure+0.027
Er+0.025
asting+0.024
ge+0.024
 shave+0.024
Neg logit contributions
 door-0.033
 us-0.032
 me-0.031
 too-0.030
 gate-0.029
Token he
Token ID339
Feature activation-1.12e+00

Pos logit contributions
 door+0.027
 us+0.026
 me+0.024
 gate+0.024
 poster+0.023
Neg logit contributions
 secure-0.032
asting-0.030
 shave-0.030
Er-0.030
 Ethan-0.029
Token had
Token ID550
Feature activation+3.32e-01

Pos logit contributions
 door+0.037
 us+0.035
 me+0.033
 gate+0.032
 too+0.031
Neg logit contributions
 secure-0.042
Er-0.039
asting-0.038
 shave-0.038
ge-0.037
Token to
Token ID284
Feature activation+2.45e-01

Pos logit contributions
 secure+0.014
asting+0.013
 shave+0.013
 blonde+0.013
 Ethan+0.013
Neg logit contributions
 door-0.009
 poster-0.008
 us-0.008
 gate-0.008
 me-0.007
Token pay
Token ID1414
Feature activation+2.79e+00

Pos logit contributions
Er+0.009
 secure+0.009
ge+0.009
 blonde+0.009
 energ+0.009
Neg logit contributions
 us-0.008
 door-0.008
 me-0.007
 too-0.007
 gate-0.007
Token a
Token ID257
Feature activation-9.57e-01

Pos logit contributions
 secure+0.133
asting+0.126
Er+0.124
 shave+0.124
ge+0.123
Neg logit contributions
 door-0.060
 us-0.055
 gate-0.051
 me-0.050
 poster-0.049
Token lot
Token ID1256
Feature activation-2.77e+00

Pos logit contributions
 door+0.032
 us+0.031
 me+0.029
 too+0.028
 gate+0.028
Neg logit contributions
 secure-0.035
Er-0.032
asting-0.032
ge-0.032
 shave-0.031
Token of
Token ID286
Feature activation+1.90e-01

Pos logit contributions
 door+0.059
 us+0.056
 me+0.051
 too+0.049
 gate+0.048
Neg logit contributions
 secure-0.134
Er-0.127
asting-0.126
ge-0.125
 shave-0.124
Token money
Token ID1637
Feature activation+2.56e+00

Pos logit contributions
Er+0.005
 secure+0.005
ge+0.005
asting+0.005
tto+0.005
Neg logit contributions
 door-0.008
 us-0.007
 me-0.007
 too-0.007
 gate-0.007
Token to
Token ID284
Feature activation-6.72e-01

Pos logit contributions
 secure+0.115
asting+0.109
ge+0.108
Er+0.106
 shave+0.106
Neg logit contributions
 door-0.061
 us-0.057
 me-0.053
 gate-0.052
 poster-0.051
Token go
Token ID467
Feature activation-3.71e-01

Pos logit contributions
 door+0.027
 us+0.027
 me+0.026
 too+0.025
 gate+0.024
Neg logit contributions
 secure-0.019
Er-0.019
 blonde-0.019
 energ-0.018
ge-0.018
Token to
Token ID284
Feature activation+3.26e-02

Pos logit contributions
 door+0.009
 us+0.008
 me+0.008
 gate+0.007
 poster+0.007
Neg logit contributions
 secure-0.017
ge-0.016
asting-0.016
Er-0.016
 shave-0.015
Token forth
Token ID6071
Feature activation-4.65e-01

Pos logit contributions
 secure+0.008
 energ+0.007
ge+0.007
 snug+0.007
 shave+0.007
Neg logit contributions
 us-0.009
 door-0.009
 gate-0.008
ays-0.008
 me-0.007
Token.
Token ID13
Feature activation-1.66e-01

Pos logit contributions
 us+0.016
 door+0.015
 gate+0.014
ays+0.014
ucks+0.014
Neg logit contributions
 secure-0.015
ge-0.015
Er-0.014
 blonde-0.014
asting-0.014
Token Anna
Token ID11735
Feature activation+2.23e-01

Pos logit contributions
 door+0.005
 us+0.005
 me+0.005
 gate+0.005
 too+0.005
Neg logit contributions
 secure-0.006
Er-0.006
 blonde-0.006
asting-0.006
 shave-0.006
Token laughed
Token ID13818
Feature activation-1.19e+00

Pos logit contributions
 secure+0.006
 blonde+0.006
 energ+0.006
Er+0.006
 snug+0.006
Neg logit contributions
 door-0.009
 us-0.008
 gate-0.008
fork-0.007
 me-0.007
Token and
Token ID290
Feature activation-3.90e-01

Pos logit contributions
 door+0.040
 poster+0.040
 us+0.039
 gate+0.038
ipt+0.036
Neg logit contributions
 secure-0.036
 energ-0.035
 snug-0.034
 blonde-0.033
asting-0.032
Token cl
Token ID537
Feature activation-2.78e+00

Pos logit contributions
 door+0.012
 us+0.011
 gate+0.010
 me+0.010
 poster+0.009
Neg logit contributions
 secure-0.016
Er-0.014
 energ-0.014
 blonde-0.014
 shave-0.014
Tokenapped
Token ID6320
Feature activation-3.65e+00

Pos logit contributions
 door+0.044
 us+0.039
 me+0.034
 gate+0.032
 too+0.029
Neg logit contributions
 secure-0.152
Er-0.145
 blonde-0.144
asting-0.143
ge-0.142
Token.
Token ID13
Feature activation-2.27e-01

Pos logit contributions
 poster+0.147
ays+0.144
 door+0.143
anes+0.140
ane+0.139
Neg logit contributions
 secure-0.098
 snug-0.089
 blonde-0.083
asting-0.080
 cozy-0.075
Token She
Token ID1375
Feature activation+2.64e-01

Pos logit contributions
 door+0.007
 us+0.007
 me+0.006
 gate+0.006
 too+0.006
Neg logit contributions
 secure-0.008
Er-0.008
 blonde-0.008
 Ethan-0.008
asting-0.008
Token said
Token ID531
Feature activation-9.17e-01

Pos logit contributions
 secure+0.008
 blonde+0.008
Er+0.007
 energ+0.007
 shave+0.007
Neg logit contributions
 door-0.010
 us-0.010
 gate-0.009
ays-0.009
 poster-0.009
Token,
Token ID11
Feature activation-1.09e+00

Pos logit contributions
 door+0.035
 us+0.035
ipt+0.034
 gate+0.033
ays+0.032
Neg logit contributions
 blonde-0.023
zing-0.022
 secure-0.022
 energ-0.021
free-0.021
Token to
Token ID284
Feature activation+7.41e-01

Pos logit contributions
 secure+0.000
ge+0.000
asting+0.000
 shave+0.000
 blonde+0.000
Neg logit contributions
 door-0.000
 us-0.000
 poster-0.000
 me-0.000
 gate-0.000
Token be
Token ID307
Feature activation+5.17e-02

Pos logit contributions
 secure+0.029
Er+0.027
asting+0.027
ge+0.026
 shave+0.026
Neg logit contributions
 door-0.023
 us-0.022
 me-0.021
 too-0.020
 gate-0.020
Token mad
Token ID8805
Feature activation+4.18e-01

Pos logit contributions
 secure+0.002
Er+0.002
asting+0.002
ge+0.002
 shave+0.002
Neg logit contributions
 door-0.002
 us-0.002
 me-0.002
 too-0.001
 gate-0.001
Token at
Token ID379
Feature activation-1.84e+00

Pos logit contributions
 secure+0.016
ge+0.015
Er+0.015
asting+0.015
 Ethan+0.015
Neg logit contributions
 door-0.012
 us-0.011
 poster-0.011
 gate-0.011
 me-0.010
Token someone
Token ID2130
Feature activation+1.15e+00

Pos logit contributions
 door+0.044
 gate+0.039
 poster+0.037
 cho+0.036
 us+0.036
Neg logit contributions
 secure-0.081
ge-0.078
asting-0.077
Er-0.077
 Ethan-0.075
Token for
Token ID329
Feature activation-2.97e+00

Pos logit contributions
 secure+0.046
ge+0.043
asting+0.043
Er+0.042
 shave+0.042
Neg logit contributions
 door-0.033
 us-0.030
 gate-0.029
 me-0.029
 cho-0.027
Token sne
Token ID10505
Feature activation-1.37e+00

Pos logit contributions
 door+0.087
 us+0.083
 me+0.078
 too+0.076
 gate+0.074
Neg logit contributions
 secure-0.121
Er-0.114
asting-0.112
ge-0.111
 shave-0.110
Tokenez
Token ID8471
Feature activation-9.63e-01

Pos logit contributions
 door+0.056
 us+0.053
 me+0.051
 gate+0.050
 too+0.050
Neg logit contributions
 secure-0.040
Er-0.037
asting-0.037
ge-0.036
 shave-0.035
Tokening
Token ID278
Feature activation-5.96e-01

Pos logit contributions
 door+0.025
 us+0.023
 gate+0.022
 me+0.022
 poster+0.021
Neg logit contributions
 secure-0.041
asting-0.040
Er-0.039
ge-0.039
 shave-0.038
Token.
Token ID13
Feature activation-3.35e-01

Pos logit contributions
 door+0.018
 gate+0.016
 us+0.016
 poster+0.016
 me+0.015
Neg logit contributions
 secure-0.023
asting-0.022
ge-0.022
Er-0.021
 shave-0.021
Token\n
Token ID198
Feature activation+2.33e-01

Pos logit contributions
 door+0.009
 us+0.008
 me+0.008
 too+0.008
 gate+0.007
Neg logit contributions
 secure-0.015
asting-0.014
ge-0.014
Er-0.013
 shave-0.013