TOP ACTIVATIONS
MAX = 0.304

it to me for safe keeping . He told me to
grandma 's house for safe keeping . When
and Ben sit at their desks . They look at the
But when she opened her closet , she was in for
and a scarf in her closet , but they were too
- a box of mult icol ored bubbles !
. They sit at their desks and wait for the exam
and put it in the closet . You will have to
was very curious about the closet . " What
floors , scrub bed the counters and bathrooms and d usted
. They sit at their desks and draw with cr ay
. They have their own beds in the same room .
has another car in his closet . He says , "
friends sitting at the new desks . She sees her old
The teacher comes to their desks . She smiles at their
. She has made their beds . She has left a
the dishes and wiped the counters . When they were done
took her umbrella from the closet . It was blue with
draw ers and on the counters . They even looked in
a h anger in the closet . It was shiny and

MIN ACTIVATIONS
MIN = -0.262

s join as one . " The
b ble is a pretty one ," said the old man
you are , my little one ! "
with a bright smile , one of the friends held a
, " Hello , little one ! What have you got
crib for Sarah . This one is too small for her
speaking a magical language - one that only the two of
on an adventure that no one expected . <|endoftext|> Once upon
just as beautiful as the one in the sky . "
said ' Mom , this one !' Max 's
long leaves , and this one probably used to belong to
The tall one stood on one leg and said , "
it is not the easiest one . At
decided to take the youngest one with them on a fun
" He pointed to one of the pictures on the
that 's the most important one ," he said .
the elephant was a friendly one . "
landed the little girl said one final thank you to the
!" The tall one stood on one leg and
variety of colors . Every one of the c act i

INTERVAL 0.162 - 0.304
CONTAINS 0.414%

very young , only three years old . Sarah liked to
found a lemon in the kitchen . They both wanted to
" M um , what
for being so kind . From that day on , the
It also had a big keyboard where one could press lots

INTERVAL 0.021 - 0.162
CONTAINS 33.258%

them the iron was very old and came from a ship
. She helped me make a rainbow cake . Do you
. My shark is boring and gray ," Tom said ,
and smiled . When it was time to go
and Tom looked up and saw the woman . They did

INTERVAL -0.121 - 0.021
CONTAINS 64.214%

she could say anything . As she continued walking
umm iest treats and sharing them with one another . As
have a lot of fun . They share everything . They
at the desk , she would play with her toys .
a button on the side . She pressed it and heard

INTERVAL -0.262 - -0.121
CONTAINS 2.114%

. The two animals were never friends again
Harry thought to himself , " I want to
felt right . One day , the onion saw
smile . They became the best of friends and the bird
today , Lily . Maybe next time if you behave ."
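
NOTE ON THE INTERVALS
The four INTERVAL ranges above look like equal-width bins between MIN (-0.262) and MAX (0.304), each about 0.1415 wide, and the four CONTAINS percentages sum to 100%. The sketch below shows one way such edges and shares could be computed. It is an assumption about how the page was produced, not the tool's actual code, and the function names interval_edges and interval_shares are hypothetical.

```python
def interval_edges(act_min, act_max, n_bins=4):
    """Return n_bins + 1 equally spaced edges from act_min to act_max."""
    width = (act_max - act_min) / n_bins
    # Pin the last edge to act_max so float rounding cannot exclude the maximum.
    return [act_min + i * width for i in range(n_bins)] + [act_max]


def interval_shares(activations, edges):
    """Percentage of activations in each bin; the last bin includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for a in activations:
        for i in range(len(counts)):
            in_bin = edges[i] <= a < edges[i + 1]
            if in_bin or (i == last and a == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [100.0 * c / total for c in counts]


# Edges implied by the MIN and MAX values reported above.
edges = interval_edges(-0.262, 0.304)

# Hypothetical activations, only to exercise the binning.
sample = [-0.262, -0.2, 0.0, 0.01, 0.1, 0.304]
shares = interval_shares(sample, edges)
```

With the reported MIN and MAX, the interior edges fall at about -0.1205, 0.021, and 0.1625, which agree with the listed boundaries to within rounding.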
Token it
Token ID340
Feature activation-2.54e-02

Pos logit contributions
 loser+0.009
in+0.008
 frog+0.008
worm+0.008
 goose+0.008
Neg logit contributions
 counters-0.014
 magazines-0.013
 envelop-0.013
keeping-0.013
 computers-0.012
Token to
Token ID284
Feature activation+3.52e-02

Pos logit contributions
 loser+0.025
 friend+0.024
 leader+0.024
 lesson+0.023
 frog+0.023
Neg logit contributions
 counters-0.062
keeping-0.060
 magazines-0.060
 envelop-0.058
array-0.057
Token me
Token ID502
Feature activation+4.03e-02

Pos logit contributions
 counters+0.071
 magazines+0.069
keeping+0.068
 envelop+0.063
 illustrations+0.062
Neg logit contributions
 loser-0.052
 leader-0.048
 frog-0.048
 friend-0.047
 first-0.047
Token for
Token ID329
Feature activation+3.68e-02

Pos logit contributions
 magazines+0.091
 counters+0.091
keeping+0.088
 envelop+0.088
 candles+0.084
Neg logit contributions
 loser-0.049
 leader-0.044
in-0.042
 frog-0.042
worm-0.042
Token safe
Token ID3338
Feature activation+8.63e-02

Pos logit contributions
 counters+0.077
 magazines+0.073
icol+0.069
keeping+0.069
 album+0.068
Neg logit contributions
 loser-0.051
 first-0.050
 leader-0.046
 one-0.046
 lesson-0.046
Tokenkeeping
Token ID19934
Feature activation+3.04e-01

Pos logit contributions
 counters+0.119
 magazines+0.116
 brand+0.112
 envelop+0.109
 illustrations+0.107
Neg logit contributions
 loser-0.175
 leader-0.163
 frog-0.162
 thing-0.160
 lesson-0.158
Token.
Token ID13
Feature activation+7.89e-03

Pos logit contributions
 magazines+0.682
 counters+0.668
 envelop+0.653
keeping+0.635
 candles+0.626
Neg logit contributions
 loser-0.376
 leader-0.328
 winner-0.322
 frog-0.318
worm-0.316
Token He
Token ID679
Feature activation+1.98e-03

Pos logit contributions
 magazines+0.018
 counters+0.017
 envelop+0.017
keeping+0.017
 desks+0.016
Neg logit contributions
 loser-0.010
 frog-0.009
 winner-0.009
 leader-0.008
 coward-0.008
Token told
Token ID1297
Feature activation+1.13e-02

Pos logit contributions
 magazines+0.004
 counters+0.004
keeping+0.004
 envelop+0.004
 computers+0.004
Neg logit contributions
 loser-0.003
 leader-0.002
 frog-0.002
 winner-0.002
worm-0.002
Token me
Token ID502
Feature activation+4.11e-02

Pos logit contributions
 counters+0.022
 magazines+0.022
keeping+0.021
 envelop+0.021
 illustrations+0.020
Neg logit contributions
 loser-0.016
 frog-0.016
 leader-0.015
 friend-0.015
 first-0.015
Token to
Token ID284
Feature activation+3.63e-02

Pos logit contributions
 counters+0.091
 magazines+0.090
 envelop+0.090
keeping+0.089
array+0.083
Neg logit contributions
 loser-0.051
worm-0.046
 leader-0.046
 frog-0.046
 lesson-0.046
Token grandma
Token ID49890
Feature activation+1.53e-01

Pos logit contributions
 loser+0.051
 leader+0.043
 winner+0.042
worm+0.042
 frog+0.041
Neg logit contributions
 magazines-0.134
 counters-0.132
 envelop-0.129
keeping-0.128
 candles-0.125
Token's
Token ID338
Feature activation-3.03e-02

Pos logit contributions
 counters+0.387
keeping+0.379
 magazines+0.379
 candles+0.370
 desks+0.367
Neg logit contributions
 friend-0.124
 first-0.120
 frog-0.117
 leader-0.117
ny-0.115
Token house
Token ID2156
Feature activation+1.13e-01

Pos logit contributions
 loser+0.041
 leader+0.039
 friend+0.039
 problem+0.039
 best+0.038
Neg logit contributions
 magazines-0.061
 counters-0.061
keeping-0.059
 illustrations-0.058
 candles-0.058
Token for
Token ID329
Feature activation+3.13e-02

Pos logit contributions
 counters+0.268
 magazines+0.266
 envelop+0.258
 candles+0.252
keeping+0.252
Neg logit contributions
 loser-0.116
 leader-0.110
 first-0.103
 winner-0.101
 friend-0.101
Token safe
Token ID3338
Feature activation+7.71e-02

Pos logit contributions
 counters+0.073
 album+0.070
 magazines+0.067
 illustrations+0.066
etti+0.065
Neg logit contributions
 first-0.038
 one-0.038
 problem-0.032
 lesson-0.031
 best-0.031
Tokenkeeping
Token ID19934
Feature activation+3.01e-01

Pos logit contributions
 counters+0.145
 magazines+0.143
 illustrations+0.142
 candles+0.139
 envelop+0.139
Neg logit contributions
 loser-0.105
 thing-0.103
 first-0.101
 friend-0.100
 best-0.099
Token.
Token ID13
Feature activation-1.45e-02

Pos logit contributions
 magazines+0.715
 counters+0.703
 envelop+0.685
 candles+0.663
 desks+0.657
Neg logit contributions
 loser-0.328
 leader-0.284
 winner-0.274
zag-0.272
worm-0.269
Token
Token ID220
Feature activation+5.15e-03

Pos logit contributions
 friend+0.017
Only+0.017
ny+0.017
 frog+0.017
 neither+0.017
Neg logit contributions
 magazines-0.031
 candles-0.030
 envelop-0.029
 counters-0.029
keeping-0.028
Token
Token ID198
Feature activation-4.63e-02

Pos logit contributions
 magazines+0.013
 counters+0.013
 envelop+0.013
keeping+0.013
 candles+0.013
Neg logit contributions
 loser-0.005
 leader-0.004
in-0.004
 frog-0.004
worm-0.004
Token
Token ID198
Feature activation-4.71e-02

Pos logit contributions
 loser+0.045
 leader+0.037
worm+0.037
 frog+0.037
 winner+0.036
Neg logit contributions
 magazines-0.117
 counters-0.114
 envelop-0.112
keeping-0.110
 candles-0.109
TokenWhen
Token ID2215
Feature activation+1.80e-02

Pos logit contributions
 loser+0.062
 leader+0.055
in+0.055
worm+0.054
 winner+0.054
Neg logit contributions
 magazines-0.103
 counters-0.100
 envelop-0.098
 candles-0.098
keeping-0.097
Token and
Token ID290
Feature activation+3.88e-02

Pos logit contributions
 magazines+0.011
 counters+0.010
 envelop+0.010
keeping+0.010
 candles+0.010
Neg logit contributions
 loser-0.006
 leader-0.006
 frog-0.006
 winner-0.006
worm-0.005
Token Ben
Token ID3932
Feature activation+7.02e-03

Pos logit contributions
 counters+0.074
 magazines+0.073
 envelop+0.070
keeping+0.070
 candles+0.067
Neg logit contributions
 loser-0.061
 frog-0.057
 goose-0.055
 leader-0.055
 coward-0.054
Token sit
Token ID1650
Feature activation-1.49e-02

Pos logit contributions
 magazines+0.014
 counters+0.014
 envelop+0.014
keeping+0.014
 candles+0.013
Neg logit contributions
 loser-0.010
worm-0.009
 leader-0.009
 winner-0.009
 frog-0.009
Token at
Token ID379
Feature activation-2.96e-02

Pos logit contributions
 loser+0.022
 winner+0.019
 leader+0.019
worm+0.019
 frog+0.019
Neg logit contributions
 magazines-0.030
 counters-0.030
 envelop-0.029
keeping-0.029
 candles-0.028
Token their
Token ID511
Feature activation+5.47e-02

Pos logit contributions
 loser+0.037
 frog+0.036
 first+0.035
 goose+0.034
 leader+0.033
Neg logit contributions
 counters-0.066
 magazines-0.060
 illustrations-0.058
 envelop-0.058
keeping-0.058
Token desks
Token ID44301
Feature activation+2.71e-01

Pos logit contributions
 magazines+0.104
 counters+0.102
 envelop+0.099
keeping+0.098
 candles+0.095
Neg logit contributions
 loser-0.086
 leader-0.078
 frog-0.077
 winner-0.077
worm-0.076
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
 magazines+0.666
 counters+0.657
keeping+0.645
 envelop+0.644
 Emmy+0.628
Neg logit contributions
 loser-0.247
 leader-0.234
 first-0.216
 winner-0.213
zag-0.209
Token They
Token ID1119
Feature activation+6.49e-02

Pos logit contributions
keeping+0.023
ons+0.023
 envelop+0.022
 candles+0.022
years+0.022
Neg logit contributions
 loser-0.013
 frog-0.012
 leader-0.012
 bet-0.011
ny-0.011
Token look
Token ID804
Feature activation+4.80e-03

Pos logit contributions
 magazines+0.138
 counters+0.135
 envelop+0.132
keeping+0.131
 candles+0.127
Neg logit contributions
 loser-0.088
 leader-0.078
 winner-0.077
worm-0.076
 frog-0.076
Token at
Token ID379
Feature activation-3.41e-02

Pos logit contributions
keeping+0.011
 magazines+0.011
 counters+0.011
 envelop+0.011
 computers+0.011
Neg logit contributions
 loser-0.005
 leader-0.004
 frog-0.004
 problem-0.004
 winner-0.004
Token the
Token ID262
Feature activation+7.66e-03

Pos logit contributions
 loser+0.051
 frog+0.046
 leader+0.046
 goose+0.046
worm+0.045
Neg logit contributions
 counters-0.066
 magazines-0.066
 envelop-0.064
keeping-0.063
 candles-0.060
Token But
Token ID887
Feature activation-8.49e-03

Pos logit contributions
 magazines+0.003
 counters+0.003
 envelop+0.003
keeping+0.003
 candles+0.003
Neg logit contributions
 loser-0.002
 frog-0.002
 leader-0.002
 winner-0.002
 goose-0.002
Token when
Token ID618
Feature activation-3.64e-02

Pos logit contributions
 loser+0.013
 leader+0.012
worm+0.012
 winner+0.012
 coward+0.011
Neg logit contributions
 magazines-0.016
 counters-0.016
 envelop-0.016
keeping-0.015
 candles-0.015
Token she
Token ID673
Feature activation-1.18e-02

Pos logit contributions
 loser+0.046
 frog+0.041
 leader+0.041
 winner+0.041
 goose+0.040
Neg logit contributions
 magazines-0.080
 counters-0.079
 envelop-0.077
keeping-0.076
 candles-0.073
Token opened
Token ID4721
Feature activation+7.53e-02

Pos logit contributions
 loser+0.018
 leader+0.015
worm+0.015
 winner+0.015
 lesson+0.015
Neg logit contributions
 magazines-0.023
 counters-0.023
 envelop-0.023
 candles-0.022
keeping-0.022
Token her
Token ID607
Feature activation+3.22e-02

Pos logit contributions
 magazines+0.144
 counters+0.142
 envelop+0.137
keeping+0.136
 candles+0.131
Neg logit contributions
 loser-0.115
 frog-0.105
worm-0.104
 leader-0.104
 winner-0.103
Token closet
Token ID21615
Feature activation+2.65e-01

Pos logit contributions
keeping+0.042
 counters+0.041
 magazines+0.040
 envelop+0.037
 illustrations+0.035
Neg logit contributions
 problem-0.066
 loser-0.066
 friend-0.065
 frog-0.065
 best-0.065
Token,
Token ID11
Feature activation+1.37e-02

Pos logit contributions
 magazines+0.687
 counters+0.673
 envelop+0.664
keeping+0.659
 videos+0.641
Neg logit contributions
 loser-0.216
 frog-0.195
 leader-0.188
 winner-0.185
 problem-0.185
Token she
Token ID673
Feature activation-1.79e-02

Pos logit contributions
 counters+0.030
 magazines+0.030
keeping+0.029
 envelop+0.029
 candles+0.028
Neg logit contributions
 loser-0.016
 frog-0.015
worm-0.015
 best-0.014
 goose-0.014
Token was
Token ID373
Feature activation+3.02e-02

Pos logit contributions
 loser+0.019
 leader+0.019
 frog+0.018
in+0.018
worm+0.018
Neg logit contributions
 magazines-0.042
keeping-0.041
 counters-0.040
 envelop-0.040
 videos-0.038
Token in
Token ID287
Feature activation-7.56e-03

Pos logit contributions
 magazines+0.069
keeping+0.067
 counters+0.064
 envelop+0.064
 videos+0.061
Neg logit contributions
 loser-0.037
 best-0.035
 winner-0.035
 leader-0.035
 friend-0.034
Token for
Token ID329
Feature activation+4.17e-02

Pos logit contributions
 loser+0.011
 problem+0.010
 frog+0.010
 thing+0.010
 lesson+0.009
Neg logit contributions
 counters-0.015
 magazines-0.015
keeping-0.014
 envelop-0.014
etti-0.014
Token and
Token ID290
Feature activation+5.59e-02

Pos logit contributions
 magazines+0.034
 counters+0.034
 envelop+0.033
keeping+0.032
 computers+0.032
Neg logit contributions
 loser-0.019
 leader-0.017
 frog-0.016
worm-0.016
 winner-0.016
Token a
Token ID257
Feature activation+5.28e-02

Pos logit contributions
 magazines+0.113
 counters+0.112
keeping+0.108
 envelop+0.107
 illustrations+0.103
Neg logit contributions
 loser-0.073
 frog-0.073
 goose-0.070
 leader-0.070
 winner-0.068
Token scarf
Token ID36953
Feature activation+5.10e-02

Pos logit contributions
 magazines+0.105
 counters+0.103
 envelop+0.101
keeping+0.098
 candles+0.097
Neg logit contributions
 loser-0.078
 leader-0.069
worm-0.069
 winner-0.069
 frog-0.068
Token in
Token ID287
Feature activation-3.46e-03

Pos logit contributions
 magazines+0.113
 counters+0.111
 envelop+0.108
keeping+0.106
 candles+0.103
Neg logit contributions
 loser-0.065
 leader-0.058
 winner-0.057
worm-0.057
 frog-0.056
Token her
Token ID607
Feature activation+2.89e-02

Pos logit contributions
 loser+0.005
 frog+0.004
 leader+0.004
 winner+0.004
worm+0.004
Neg logit contributions
 counters-0.007
 magazines-0.007
keeping-0.006
 envelop-0.006
 candles-0.006
Token closet
Token ID21615
Feature activation+2.64e-01

Pos logit contributions
 magazines+0.051
 counters+0.051
keeping+0.050
 envelop+0.047
 videos+0.047
Neg logit contributions
 loser-0.046
 friend-0.045
 best-0.044
 leader-0.043
 frog-0.043
Token,
Token ID11
Feature activation+1.36e-02

Pos logit contributions
 magazines+0.670
 counters+0.645
 envelop+0.643
 videos+0.632
keeping+0.629
Neg logit contributions
 loser-0.231
 leader-0.220
 first-0.209
 friend-0.202
 winner-0.201
Token but
Token ID475
Feature activation-2.20e-02

Pos logit contributions
keeping+0.032
 magazines+0.031
 counters+0.030
etti+0.030
 envelop+0.030
Neg logit contributions
 only-0.013
 frog-0.013
 one-0.013
 goose-0.012
 also-0.012
Token they
Token ID484
Feature activation-2.73e-02

Pos logit contributions
 loser+0.030
 leader+0.027
worm+0.026
 frog+0.026
 winner+0.026
Neg logit contributions
 magazines-0.046
 counters-0.045
 envelop-0.044
keeping-0.044
 candles-0.042
Token were
Token ID547
Feature activation+9.65e-02

Pos logit contributions
 loser+0.041
 winner+0.036
 leader+0.035
worm+0.035
in+0.035
Neg logit contributions
 magazines-0.054
 counters-0.053
 envelop-0.051
 candles-0.050
keeping-0.050
Token too
Token ID1165
Feature activation-1.50e-01

Pos logit contributions
 magazines+0.208
 counters+0.204
 envelop+0.199
keeping+0.198
 candles+0.190
Neg logit contributions
 loser-0.128
 leader-0.114
 winner-0.113
worm-0.112
 frog-0.112
Token -
Token ID532
Feature activation+9.26e-02

Pos logit contributions
 loser+0.035
 frog+0.032
 winner+0.032
 leader+0.032
 problem+0.032
Neg logit contributions
 counters-0.076
 magazines-0.076
 envelop-0.074
keeping-0.072
 candles-0.070
Token a
Token ID257
Feature activation+3.42e-02

Pos logit contributions
keeping+0.195
 counters+0.195
 magazines+0.192
 envelop+0.186
 desks+0.178
Neg logit contributions
 loser-0.120
 winner-0.114
 frog-0.114
 best-0.112
worm-0.110
Token box
Token ID3091
Feature activation+7.52e-02

Pos logit contributions
 magazines+0.072
 counters+0.071
keeping+0.069
 envelop+0.069
 candles+0.066
Neg logit contributions
 loser-0.047
 winner-0.042
 frog-0.042
 leader-0.042
worm-0.041
Token of
Token ID286
Feature activation+2.94e-02

Pos logit contributions
 magazines+0.172
 counters+0.166
keeping+0.160
 envelop+0.159
 computers+0.157
Neg logit contributions
 winner-0.089
 loser-0.088
worm-0.086
 frog-0.084
 leader-0.084
Token mult
Token ID1963
Feature activation+1.38e-01

Pos logit contributions
keeping+0.058
 counters+0.053
amas+0.053
etti+0.051
 magazines+0.049
Neg logit contributions
 frog-0.044
 thing-0.043
 best-0.042
 problem-0.042
 loser-0.042
Tokenicol
Token ID27045
Feature activation+2.64e-01

Pos logit contributions
 magazines+0.140
 counters+0.133
keeping+0.129
 computers+0.127
nam+0.124
Neg logit contributions
 loser-0.315
 frog-0.307
 friend-0.306
 winner-0.304
ah-0.303
Tokenored
Token ID1850
Feature activation+6.29e-02

Pos logit contributions
 magazines+0.449
 counters+0.429
keeping+0.416
 envelop+0.415
 computers+0.408
Neg logit contributions
 loser-0.464
in-0.450
ah-0.441
 winner-0.434
Only-0.422
Token bubbles
Token ID25037
Feature activation-1.27e-02

Pos logit contributions
 counters+0.113
 magazines+0.108
keeping+0.107
years+0.105
array+0.103
Neg logit contributions
 thing-0.098
 problem-0.092
worm-0.090
 leader-0.089
 loser-0.089
Token!
Token ID0
Feature activation+2.62e-02

Pos logit contributions
 loser+0.016
 frog+0.015
 winner+0.015
worm+0.015
 leader+0.015
Neg logit contributions
 magazines-0.028
 counters-0.027
 envelop-0.026
keeping-0.026
 desks-0.025
Token
Token ID198
Feature activation-3.98e-02

Pos logit contributions
 envelop+0.056
 counters+0.055
 magazines+0.055
keeping+0.054
 candles+0.053
Neg logit contributions
Only-0.032
 frog-0.029
 lesson-0.028
 bet-0.028
 friend-0.028
Token
Token ID198
Feature activation-3.92e-02

Pos logit contributions
 loser+0.025
in+0.024
 frog+0.024
worm+0.023
 lesson+0.023
Neg logit contributions
 magazines-0.110
 counters-0.108
 envelop-0.107
keeping-0.105
 candles-0.104
Token.
Token ID13
Feature activation+7.02e-03

Pos logit contributions
 loser+0.021
 leader+0.019
 frog+0.019
in+0.019
 winner+0.018
Neg logit contributions
 counters-0.050
 magazines-0.049
keeping-0.048
 envelop-0.047
 candles-0.046
Token They
Token ID1119
Feature activation+6.31e-02

Pos logit contributions
 counters+0.015
keeping+0.014
 magazines+0.014
 envelop+0.014
 candles+0.014
Neg logit contributions
 loser-0.009
 leader-0.008
 frog-0.008
 winner-0.008
Only-0.008
Token sit
Token ID1650
Feature activation-1.97e-02

Pos logit contributions
 magazines+0.115
 envelop+0.111
 counters+0.106
 candles+0.104
 desks+0.101
Neg logit contributions
 loser-0.105
worm-0.098
in-0.093
 goose-0.091
 problem-0.091
Token at
Token ID379
Feature activation-3.67e-02

Pos logit contributions
 loser+0.025
 leader+0.022
 winner+0.022
 frog+0.022
in+0.022
Neg logit contributions
 magazines-0.044
 counters-0.043
 envelop-0.042
keeping-0.041
 candles-0.040
Token their
Token ID511
Feature activation+4.34e-02

Pos logit contributions
 loser+0.051
 leader+0.046
 frog+0.046
 first+0.045
 lesson+0.045
Neg logit contributions
 counters-0.077
 magazines-0.072
keeping-0.070
 envelop-0.069
 illustrations-0.069
Token desks
Token ID44301
Feature activation+2.63e-01

Pos logit contributions
 magazines+0.059
 counters+0.058
keeping+0.056
 envelop+0.056
 candles+0.052
Neg logit contributions
 loser-0.091
 leader-0.085
 winner-0.084
 problem-0.084
 frog-0.083
Token and
Token ID290
Feature activation+3.72e-02

Pos logit contributions
 magazines+0.632
 counters+0.627
keeping+0.619
 envelop+0.607
 Emmy+0.588
Neg logit contributions
 loser-0.263
 leader-0.241
 winner-0.229
in-0.218
 first-0.217
Token wait
Token ID4043
Feature activation-9.55e-02

Pos logit contributions
 magazines+0.070
 counters+0.069
keeping+0.067
 envelop+0.066
 candles+0.063
Neg logit contributions
 loser-0.058
 leader-0.053
 winner-0.052
 frog-0.052
 goose-0.051
Token for
Token ID329
Feature activation+3.74e-02

Pos logit contributions
 loser+0.107
 leader+0.098
 first+0.096
 best+0.096
 winner+0.094
Neg logit contributions
 magazines-0.223
keeping-0.218
 envelop-0.216
 counters-0.215
 computers-0.203
Token the
Token ID262
Feature activation+5.29e-03

Pos logit contributions
 counters+0.090
keeping+0.088
amas+0.086
array+0.082
 magazines+0.082
Neg logit contributions
 first-0.041
 one-0.040
 loser-0.038
 lesson-0.038
 friend-0.037
Token exam
Token ID2814
Feature activation+1.84e-02

Pos logit contributions
 magazines+0.011
 counters+0.011
keeping+0.011
 envelop+0.011
 candles+0.010
Neg logit contributions
 loser-0.007
 winner-0.007
 leader-0.007
 frog-0.006
 lesson-0.006
Token and
Token ID290
Feature activation+4.16e-02

Pos logit contributions
 magazines+0.096
 counters+0.095
keeping+0.093
 envelop+0.092
 candles+0.088
Neg logit contributions
 loser-0.074
 winner-0.066
 frog-0.065
 leader-0.065
 goose-0.064
Token put
Token ID1234
Feature activation+1.81e-02

Pos logit contributions
 magazines+0.083
 counters+0.082
 envelop+0.080
keeping+0.079
 candles+0.076
Neg logit contributions
 loser-0.058
 frog-0.055
 leader-0.054
 goose-0.053
 winner-0.053
Token it
Token ID340
Feature activation-8.26e-03

Pos logit contributions
 magazines+0.036
 counters+0.036
keeping+0.035
 envelop+0.034
 illustrations+0.033
Neg logit contributions
 loser-0.025
 frog-0.023
in-0.023
 goose-0.022
 best-0.022
Token in
Token ID287
Feature activation+2.65e-03

Pos logit contributions
 loser+0.011
 leader+0.009
worm+0.009
 winner+0.009
 frog+0.009
Neg logit contributions
 magazines-0.018
 counters-0.018
 envelop-0.017
keeping-0.017
 computers-0.017
Token the
Token ID262
Feature activation+1.09e-02

Pos logit contributions
 counters+0.006
 magazines+0.005
keeping+0.005
 desks+0.005
 illustrations+0.005
Neg logit contributions
 loser-0.004
 frog-0.003
 lesson-0.003
 thing-0.003
 first-0.003
Token closet
Token ID21615
Feature activation+2.62e-01

Pos logit contributions
keeping+0.021
 magazines+0.021
 counters+0.020
 illustrations+0.020
years+0.020
Neg logit contributions
 loser-0.016
 frog-0.015
 winner-0.015
 problem-0.015
 best-0.015
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
 magazines+0.673
 envelop+0.659
 counters+0.652
keeping+0.646
 illustrations+0.642
Neg logit contributions
 loser-0.213
 first-0.209
 leader-0.199
 best-0.191
 winner-0.182
Token You
Token ID921
Feature activation+5.43e-02

Pos logit contributions
 magazines+0.025
 counters+0.024
 envelop+0.024
 candles+0.023
keeping+0.023
Neg logit contributions
 loser-0.016
 frog-0.015
 winner-0.014
 bet-0.014
 goose-0.014
Token will
Token ID481
Feature activation-3.28e-02

Pos logit contributions
 magazines+0.126
 counters+0.123
 envelop+0.121
keeping+0.119
 candles+0.116
Neg logit contributions
 loser-0.064
 leader-0.055
 winner-0.055
worm-0.054
 frog-0.054
Token have
Token ID423
Feature activation+2.98e-03

Pos logit contributions
 loser+0.040
 leader+0.035
 winner+0.034
worm+0.034
 frog+0.034
Neg logit contributions
 magazines-0.074
 counters-0.073
 envelop-0.071
keeping-0.070
 candles-0.069
Token to
Token ID284
Feature activation+3.83e-02

Pos logit contributions
 magazines+0.007
 counters+0.007
 envelop+0.007
keeping+0.007
 videos+0.006
Neg logit contributions
 loser-0.003
 problem-0.003
 lesson-0.003
 best-0.003
 winner-0.003
Token was
Token ID373
Feature activation+2.65e-02

Pos logit contributions
 loser+0.022
 leader+0.019
 winner+0.019
worm+0.019
 frog+0.019
Neg logit contributions
 magazines-0.032
 counters-0.031
 envelop-0.031
keeping-0.030
 candles-0.029
Token very
Token ID845
Feature activation+3.67e-02

Pos logit contributions
 magazines+0.053
 counters+0.052
 envelop+0.050
keeping+0.050
 candles+0.048
Neg logit contributions
 loser-0.040
 leader-0.036
 winner-0.035
worm-0.035
 frog-0.035
Token curious
Token ID11040
Feature activation+3.04e-02

Pos logit contributions
 counters+0.065
 candles+0.061
 magazines+0.061
 brand+0.059
 envelop+0.058
Neg logit contributions
 loser-0.058
ah-0.057
zag-0.055
worm-0.052
igger-0.051
Token about
Token ID546
Feature activation+1.00e-01

Pos logit contributions
 magazines+0.069
 counters+0.069
 envelop+0.068
keeping+0.067
 candles+0.064
Neg logit contributions
 loser-0.034
 leader-0.031
 frog-0.030
 problem-0.030
 winner-0.030
Token the
Token ID262
Feature activation+2.01e-05

Pos logit contributions
 counters+0.194
 magazines+0.189
keeping+0.185
 envelop+0.184
 candles+0.177
Neg logit contributions
 loser-0.144
 frog-0.140
 goose-0.137
 leader-0.136
 problem-0.135
Token closet
Token ID21615
Feature activation+2.62e-01

Pos logit contributions
 magazines+0.000
 counters+0.000
 envelop+0.000
keeping+0.000
 candles+0.000
Neg logit contributions
 loser-0.000
 leader-0.000
 frog-0.000
 winner-0.000
worm-0.000
Token.
Token ID13
Feature activation+5.57e-03

Pos logit contributions
 magazines+0.599
 counters+0.598
 envelop+0.590
keeping+0.578
 illustrations+0.560
Neg logit contributions
 loser-0.288
 leader-0.268
 winner-0.263
worm-0.257
 frog-0.253
Token
Token ID198
Feature activation-3.51e-02

Pos logit contributions
 magazines+0.012
 counters+0.012
 envelop+0.012
keeping+0.012
 candles+0.012
Neg logit contributions
 loser-0.007
 frog-0.006
 leader-0.006
 winner-0.006
worm-0.006
Token
Token ID198
Feature activation-3.81e-02

Pos logit contributions
 loser+0.047
 winner+0.042
 leader+0.041
worm+0.040
 problem+0.040
Neg logit contributions
 magazines-0.074
 counters-0.073
 envelop-0.072
keeping-0.071
 computers-0.068
Token"
Token ID1
Feature activation-3.50e-02

Pos logit contributions
 loser+0.044
 problem+0.038
 winner+0.038
 frog+0.038
 coward+0.037
Neg logit contributions
 magazines-0.087
 counters-0.085
 envelop-0.084
keeping-0.082
 computers-0.080
TokenWhat
Token ID2061
Feature activation+2.53e-02

Pos logit contributions
 problem+0.060
 loser+0.059
zag+0.055
 coward+0.054
 best+0.054
Neg logit contributions
 magazines-0.062
 envelop-0.060
 counters-0.059
keeping-0.057
 candles-0.056
Token floors
Token ID18570
Feature activation+1.50e-01

Pos logit contributions
 magazines+0.008
 counters+0.008
 envelop+0.008
keeping+0.008
 illustrations+0.008
Neg logit contributions
 loser-0.006
 frog-0.005
 goose-0.005
worm-0.005
 winner-0.005
Token,
Token ID11
Feature activation+6.89e-03

Pos logit contributions
 magazines+0.323
 counters+0.316
 envelop+0.310
keeping+0.305
 candles+0.298
Neg logit contributions
 loser-0.200
 leader-0.176
 winner-0.174
worm-0.174
 frog-0.172
Token scrub
Token ID27268
Feature activation+2.68e-02

Pos logit contributions
 magazines+0.015
 counters+0.015
 envelop+0.015
keeping+0.014
 illustrations+0.014
Neg logit contributions
 loser-0.009
 frog-0.008
worm-0.008
 goose-0.008
 leader-0.007
Tokenbed
Token ID3077
Feature activation+2.72e-02

Pos logit contributions
 magazines+0.055
 envelop+0.053
 counters+0.052
keeping+0.051
 illustrations+0.051
Neg logit contributions
 loser-0.036
in-0.033
 frog-0.033
 best-0.032
ah-0.032
Token the
Token ID262
Feature activation-1.78e-04

Pos logit contributions
 magazines+0.056
 counters+0.054
 envelop+0.053
keeping+0.052
 candles+0.051
Neg logit contributions
 loser-0.039
 frog-0.034
worm-0.034
 leader-0.034
 winner-0.034
Token counters
Token ID21154
Feature activation+2.62e-01

Pos logit contributions
 loser+0.000
 frog+0.000
worm+0.000
 goose+0.000
 winner+0.000
Neg logit contributions
 magazines-0.000
keeping-0.000
 envelop-0.000
 illustrations-0.000
 counters-0.000
Token and
Token ID290
Feature activation+2.88e-02

Pos logit contributions
 magazines+0.603
 counters+0.578
 envelop+0.576
keeping+0.568
 illustrations+0.551
Neg logit contributions
 loser-0.305
 leader-0.266
 frog-0.265
 winner-0.265
worm-0.263
Token bathrooms
Token ID32601
Feature activation+1.27e-01

Pos logit contributions
 magazines+0.059
 counters+0.055
 illustrations+0.055
keeping+0.054
 envelop+0.054
Neg logit contributions
 loser-0.037
 frog-0.036
 neither-0.035
in-0.034
 best-0.034
Token and
Token ID290
Feature activation+3.08e-02

Pos logit contributions
 magazines+0.284
 counters+0.272
 envelop+0.271
keeping+0.262
 computers+0.258
Neg logit contributions
 loser-0.161
in-0.145
 winner-0.142
ah-0.142
 frog-0.141
Token d
Token ID288
Feature activation+1.22e-02

Pos logit contributions
 magazines+0.058
 counters+0.055
 illustrations+0.054
 envelop+0.054
keeping+0.053
Neg logit contributions
 loser-0.044
 frog-0.042
 neither-0.041
 best-0.041
 lesson-0.040
Tokenusted
Token ID8459
Feature activation+3.67e-02

Pos logit contributions
 magazines+0.023
keeping+0.021
 envelop+0.021
 computers+0.021
 illustrations+0.020
Neg logit contributions
 loser-0.020
in-0.019
ny-0.018
 winner-0.018
ah-0.018
Token.
Token ID13
Feature activation+9.42e-04

Pos logit contributions
keeping+0.319
 counters+0.309
 magazines+0.307
 envelop+0.300
 illustrations+0.289
Neg logit contributions
 loser-0.104
 leader-0.102
 first-0.098
 best-0.096
 winner-0.095
Token They
Token ID1119
Feature activation+5.54e-02

Pos logit contributions
 magazines+0.002
 counters+0.002
keeping+0.002
 envelop+0.002
 candles+0.002
Neg logit contributions
 loser-0.002
 winner-0.001
 leader-0.001
 goose-0.001
 frog-0.001
Token sit
Token ID1650
Feature activation-1.99e-02

Pos logit contributions
 magazines+0.131
 counters+0.130
keeping+0.129
 envelop+0.127
 candles+0.122
Neg logit contributions
 loser-0.062
 leader-0.056
 winner-0.055
 frog-0.052
 bet-0.051
Token at
Token ID379
Feature activation-3.64e-02

Pos logit contributions
 loser+0.022
in+0.021
 leader+0.020
 best+0.019
 frog+0.019
Neg logit contributions
 counters-0.045
 magazines-0.045
 envelop-0.045
keeping-0.045
 computers-0.043
Token their
Token ID511
Feature activation+4.49e-02

Pos logit contributions
 loser+0.049
 first+0.046
 leader+0.045
 lesson+0.044
 friend+0.044
Neg logit contributions
 counters-0.077
 magazines-0.071
keeping-0.071
 envelop-0.068
 illustrations-0.068
Token desks
Token ID44301
Feature activation+2.62e-01

Pos logit contributions
 counters+0.071
keeping+0.071
 magazines+0.071
 envelop+0.070
 candles+0.065
Neg logit contributions
 loser-0.081
 leader-0.078
 problem-0.076
 winner-0.075
 frog-0.074
Token and
Token ID290
Feature activation+3.42e-02

Pos logit contributions
keeping+0.666
 counters+0.647
 magazines+0.646
 Emmy+0.632
 envelop+0.631
Neg logit contributions
 leader-0.209
 loser-0.209
 first-0.204
 winner-0.191
 best-0.186
Token draw
Token ID3197
Feature activation+1.18e-01

Pos logit contributions
 magazines+0.069
keeping+0.069
 counters+0.068
 envelop+0.066
 brand+0.063
Neg logit contributions
 loser-0.047
 leader-0.043
 winner-0.043
 first-0.040
 frog-0.040
Token with
Token ID351
Feature activation-1.51e-02

Pos logit contributions
keeping+0.243
 counters+0.211
 Emmy+0.209
array+0.207
years+0.201
Neg logit contributions
 best-0.179
 leader-0.164
 one-0.161
 first-0.160
 loser-0.160
Token cr
Token ID1067
Feature activation+8.02e-02

Pos logit contributions
 loser+0.023
 best+0.021
 leader+0.021
 one+0.021
 thing+0.021
Neg logit contributions
keeping-0.029
 counters-0.028
amas-0.026
 magazines-0.026
array-0.025
Tokenay
Token ID323
Feature activation+1.25e-01

Pos logit contributions
keeping+0.153
nam+0.152
 magazines+0.149
 counters+0.145
 envelop+0.141
Neg logit contributions
ah-0.111
zag-0.111
 loser-0.110
in-0.108
 leader-0.101
Token.
Token ID13
Feature activation+1.64e-03

Pos logit contributions
 magazines+0.278
 counters+0.269
 envelop+0.265
keeping+0.263
 candles+0.257
Neg logit contributions
 loser-0.151
 leader-0.133
worm-0.131
 winner-0.129
 coward-0.127
Token They
Token ID1119
Feature activation+8.14e-02

Pos logit contributions
 magazines+0.003
 counters+0.003
 envelop+0.003
keeping+0.003
 candles+0.003
Neg logit contributions
 loser-0.002
 frog-0.002
 leader-0.002
 winner-0.002
 goose-0.002
Token have
Token ID423
Feature activation+1.15e-02

Pos logit contributions
 magazines+0.169
 counters+0.166
 envelop+0.162
keeping+0.161
 candles+0.155
Neg logit contributions
 loser-0.113
 leader-0.102
 frog-0.101
 winner-0.100
worm-0.099
Token their
Token ID511
Feature activation+6.34e-02

Pos logit contributions
 counters+0.024
keeping+0.024
 magazines+0.023
etti+0.023
 envelop+0.023
Neg logit contributions
 loser-0.015
 best-0.014
 winner-0.014
 lesson-0.014
 leader-0.014
Token own
Token ID898
Feature activation-3.11e-02

Pos logit contributions
 magazines+0.119
 counters+0.118
keeping+0.114
 envelop+0.114
 candles+0.108
Neg logit contributions
 loser-0.100
 leader-0.090
 frog-0.090
 winner-0.089
 goose-0.088
Token beds
Token ID20237
Feature activation+2.61e-01

Pos logit contributions
 loser+0.047
 leader+0.043
 frog+0.043
 problem+0.043
 winner+0.043
Neg logit contributions
 counters-0.060
keeping-0.058
 magazines-0.058
 envelop-0.056
 illustrations-0.054
Token in
Token ID287
Feature activation-7.96e-03

Pos logit contributions
 counters+0.615
 magazines+0.613
 envelop+0.603
keeping+0.580
 illustrations+0.576
Neg logit contributions
 loser-0.284
 leader-0.247
 best-0.239
 winner-0.235
in-0.234
Token the
Token ID262
Feature activation+6.66e-03

Pos logit contributions
 loser+0.011
 lesson+0.009
 thing+0.009
 frog+0.009
 first+0.009
Neg logit contributions
 counters-0.017
 magazines-0.016
 envelop-0.016
keeping-0.015
 candles-0.015
Token same
Token ID976
Feature activation-5.09e-02

Pos logit contributions
keeping+0.013
 magazines+0.012
 counters+0.012
etti+0.011
 illustrations+0.011
Neg logit contributions
 loser-0.011
 best-0.010
 problem-0.010
 frog-0.010
 winner-0.010
Token room
Token ID2119
Feature activation+1.79e-01

Pos logit contributions
 loser+0.061
 problem+0.060
 friend+0.059
 frog+0.058
 leader+0.058
Neg logit contributions
keeping-0.107
 counters-0.107
 magazines-0.106
years-0.102
 envelop-0.101
Token.
Token ID13
Feature activation+5.98e-03

Pos logit contributions
 magazines+0.432
 counters+0.425
 envelop+0.423
keeping+0.410
 illustrations+0.397
Neg logit contributions
 loser-0.179
 leader-0.165
 winner-0.159
 best-0.155
 first-0.154
Token has
Token ID468
Feature activation+4.09e-02

Pos logit contributions
 loser+0.115
 leader+0.100
worm+0.100
 winner+0.099
 goose+0.098
Neg logit contributions
 magazines-0.186
 counters-0.181
 envelop-0.178
keeping-0.174
 candles-0.172
Token another
Token ID1194
Feature activation-8.35e-02

Pos logit contributions
 counters+0.080
 magazines+0.079
keeping+0.076
 envelop+0.076
 candles+0.072
Neg logit contributions
 loser-0.062
 leader-0.056
 winner-0.055
 frog-0.055
 problem-0.054
Token car
Token ID1097
Feature activation+8.15e-04

Pos logit contributions
 loser+0.095
Le+0.089
in+0.089
Only+0.089
igger+0.087
Neg logit contributions
 magazines-0.188
 envelop-0.181
 candles-0.175
 presents-0.172
 counters-0.170
Token in
Token ID287
Feature activation-9.57e-03

Pos logit contributions
 counters+0.002
 magazines+0.002
 envelop+0.002
keeping+0.002
 candles+0.002
Neg logit contributions
 loser-0.001
 leader-0.001
 winner-0.001
 frog-0.001
worm-0.001
Token his
Token ID465
Feature activation-3.76e-02

Pos logit contributions
 loser+0.016
worm+0.014
 leader+0.014
 winner+0.014
 goose+0.014
Neg logit contributions
 magazines-0.018
 envelop-0.017
 counters-0.017
keeping-0.017
 candles-0.016
Token closet
Token ID21615
Feature activation+2.61e-01

Pos logit contributions
 loser+0.064
 leader+0.058
 winner+0.057
 frog+0.057
 problem+0.057
Neg logit contributions
 counters-0.067
 magazines-0.066
keeping-0.064
 envelop-0.062
 candles-0.061
Token.
Token ID13
Feature activation+9.27e-03

Pos logit contributions
 magazines+0.617
 counters+0.611
 envelop+0.599
keeping+0.581
 illustrations+0.572
Neg logit contributions
 loser-0.288
 leader-0.255
 winner-0.245
 frog-0.236
in-0.235
Token He
Token ID679
Feature activation+1.46e-03

Pos logit contributions
 magazines+0.019
 counters+0.019
 envelop+0.018
keeping+0.018
 candles+0.017
Neg logit contributions
 loser-0.013
 leader-0.012
 winner-0.012
worm-0.012
 goose-0.011
Token says
Token ID1139
Feature activation+2.04e-03

Pos logit contributions
 magazines+0.003
 counters+0.003
 envelop+0.003
 candles+0.003
keeping+0.003
Neg logit contributions
 loser-0.002
worm-0.002
 leader-0.002
 winner-0.002
 goose-0.002
Token,
Token ID11
Feature activation+1.12e-02

Pos logit contributions
 magazines+0.005
 counters+0.005
 envelop+0.004
keeping+0.004
 candles+0.004
Neg logit contributions
 loser-0.002
 leader-0.002
worm-0.002
 winner-0.002
 goose-0.002
Token "
Token ID366
Feature activation+9.45e-03

Pos logit contributions
 magazines+0.022
 counters+0.022
 envelop+0.021
keeping+0.021
 candles+0.020
Neg logit contributions
 loser-0.017
 leader-0.015
worm-0.015
 winner-0.015
 frog-0.015
Token friends
Token ID2460
Feature activation-1.80e-02

Pos logit contributions
 counters+0.083
 magazines+0.083
keeping+0.082
 envelop+0.080
 candles+0.076
Neg logit contributions
 loser-0.093
 leader-0.086
 problem-0.085
 winner-0.085
 frog-0.085
Token sitting
Token ID5586
Feature activation+9.10e-03

Pos logit contributions
 loser+0.025
 winner+0.024
 leader+0.023
 frog+0.023
 problem+0.023
Neg logit contributions
 magazines-0.036
 counters-0.036
keeping-0.035
 envelop-0.035
 candles-0.033
Token at
Token ID379
Feature activation-3.75e-02

Pos logit contributions
 magazines+0.021
 counters+0.021
 envelop+0.021
keeping+0.020
 candles+0.020
Neg logit contributions
 loser-0.011
in-0.009
 leader-0.009
 winner-0.009
 frog-0.009
Token the
Token ID262
Feature activation+2.75e-04

Pos logit contributions
 first+0.052
 one+0.048
 loser+0.047
 moment+0.046
 lesson+0.045
Neg logit contributions
 counters-0.083
 magazines-0.072
 envelop-0.072
keeping-0.071
 illustrations-0.071
Token new
Token ID649
Feature activation+1.71e-02

Pos logit contributions
keeping+0.001
 counters+0.001
 magazines+0.001
 envelop+0.000
 candles+0.000
Neg logit contributions
 loser-0.000
 leader-0.000
 winner-0.000
 first-0.000
 problem-0.000
Token desks
Token ID44301
Feature activation+2.60e-01

Pos logit contributions
 magazines+0.025
 counters+0.025
 envelop+0.024
keeping+0.024
 candles+0.023
Neg logit contributions
 loser-0.034
 leader-0.032
 winner-0.031
 frog-0.031
worm-0.031
Token.
Token ID13
Feature activation+5.36e-03

Pos logit contributions
 magazines+0.610
 counters+0.606
 envelop+0.592
keeping+0.591
 illustrations+0.563
Neg logit contributions
 loser-0.281
 leader-0.249
 winner-0.246
in-0.236
worm-0.233
Token She
Token ID1375
Feature activation+6.80e-02

Pos logit contributions
 magazines+0.009
 counters+0.008
 envelop+0.008
keeping+0.008
 candles+0.008
Neg logit contributions
 loser-0.010
 leader-0.009
 winner-0.009
worm-0.009
 frog-0.009
Token sees
Token ID7224
Feature activation+6.88e-02

Pos logit contributions
 magazines+0.144
 counters+0.139
 envelop+0.138
 candles+0.134
 desks+0.132
Neg logit contributions
 loser-0.097
worm-0.085
 leader-0.082
 goose-0.081
 winner-0.081
Token her
Token ID607
Feature activation+1.54e-02

Pos logit contributions
 counters+0.124
keeping+0.122
 magazines+0.121
 envelop+0.117
 illustrations+0.111
Neg logit contributions
 loser-0.110
 frog-0.104
 goose-0.103
in-0.102
 leader-0.102
Token old
Token ID1468
Feature activation+5.20e-02

Pos logit contributions
 magazines+0.030
 counters+0.029
 envelop+0.028
keeping+0.028
 candles+0.027
Neg logit contributions
 loser-0.024
 leader-0.022
 winner-0.022
 frog-0.021
worm-0.021
TokenThe
Token ID464
Feature activation+1.76e-02

Pos logit contributions
 loser+0.042
 winner+0.038
 problem+0.037
 leader+0.037
 goose+0.037
Neg logit contributions
 counters-0.073
 magazines-0.072
 envelop-0.071
keeping-0.069
 computers-0.068
Token teacher
Token ID4701
Feature activation-2.76e-02

Pos logit contributions
 magazines+0.041
 counters+0.040
 envelop+0.040
keeping+0.039
 candles+0.039
Neg logit contributions
 loser-0.020
worm-0.017
 leader-0.017
 winner-0.017
ah-0.017
Token comes
Token ID2058
Feature activation-7.97e-03

Pos logit contributions
 loser+0.037
 leader+0.033
 winner+0.033
worm+0.033
 frog+0.032
Neg logit contributions
 magazines-0.059
 counters-0.057
 envelop-0.056
keeping-0.056
 candles-0.054
Token to
Token ID284
Feature activation+3.04e-02

Pos logit contributions
 loser+0.009
 leader+0.009
 frog+0.008
 winner+0.008
 friend+0.008
Neg logit contributions
 magazines-0.019
 counters-0.018
 envelop-0.017
keeping-0.017
 candles-0.017
Token their
Token ID511
Feature activation+4.19e-02

Pos logit contributions
 magazines+0.061
 counters+0.060
keeping+0.057
 envelop+0.057
 candles+0.056
Neg logit contributions
 loser-0.045
 frog-0.041
 leader-0.040
 goose-0.040
 winner-0.039
Token desks
Token ID44301
Feature activation+2.60e-01

Pos logit contributions
 magazines+0.063
 counters+0.062
keeping+0.061
 envelop+0.059
 illustrations+0.058
Neg logit contributions
 loser-0.078
 leader-0.075
 frog-0.075
 friend-0.075
 problem-0.073
Token.
Token ID13
Feature activation+2.29e-03

Pos logit contributions
 magazines+0.644
 counters+0.629
keeping+0.624
 envelop+0.619
 illustrations+0.596
Neg logit contributions
 loser-0.242
 leader-0.213
 first-0.206
 winner-0.205
in-0.197
Token She
Token ID1375
Feature activation+6.72e-02

Pos logit contributions
 magazines+0.004
 envelop+0.004
 counters+0.004
keeping+0.004
 candles+0.004
Neg logit contributions
 loser-0.004
 frog-0.004
 leader-0.004
 winner-0.004
 goose-0.003
Token smiles
Token ID21845
Feature activation+5.19e-02

Pos logit contributions
 magazines+0.148
 counters+0.144
 envelop+0.142
 candles+0.137
keeping+0.136
Neg logit contributions
 loser-0.089
worm-0.076
 leader-0.076
 winner-0.075
 goose-0.074
Token at
Token ID379
Feature activation-3.91e-02

Pos logit contributions
 magazines+0.106
 counters+0.101
 envelop+0.101
 candles+0.098
keeping+0.096
Neg logit contributions
 loser-0.076
worm-0.070
 goose-0.067
 coward-0.066
 winner-0.066
Token their
Token ID511
Feature activation+3.94e-02

Pos logit contributions
 loser+0.059
 frog+0.055
in+0.053
 goose+0.053
 leader+0.053
Neg logit contributions
 magazines-0.076
 counters-0.075
 envelop-0.073
keeping-0.072
 candles-0.069
Token.
Token ID13
Feature activation+2.28e-02

Pos logit contributions
 counters+0.450
 magazines+0.440
 envelop+0.429
keeping+0.419
 computers+0.414
Neg logit contributions
 loser-0.196
 leader-0.179
worm-0.174
 winner-0.173
zag-0.170
Token She
Token ID1375
Feature activation+8.86e-02

Pos logit contributions
 magazines+0.048
 counters+0.048
 envelop+0.047
keeping+0.046
 candles+0.046
Neg logit contributions
 loser-0.030
 frog-0.028
 leader-0.026
 goose-0.026
 winner-0.026
Token has
Token ID468
Feature activation+5.11e-02

Pos logit contributions
 magazines+0.194
 counters+0.191
keeping+0.191
 envelop+0.186
 candles+0.177
Neg logit contributions
 loser-0.108
 leader-0.101
 frog-0.099
 winner-0.098
 goose-0.097
Token made
Token ID925
Feature activation-4.01e-02

Pos logit contributions
 counters+0.116
etti+0.115
keeping+0.113
amas+0.108
 magazines+0.105
Neg logit contributions
 best-0.056
 loser-0.055
 only-0.055
 one-0.055
 lesson-0.054
Token their
Token ID511
Feature activation+6.16e-02

Pos logit contributions
 best+0.051
 friend+0.050
 lesson+0.050
 leader+0.049
 one+0.049
Neg logit contributions
keeping-0.081
 counters-0.078
 magazines-0.078
etti-0.075
array-0.074
Token beds
Token ID20237
Feature activation+2.60e-01

Pos logit contributions
 counters+0.107
 magazines+0.105
keeping+0.104
 envelop+0.100
 Emmy+0.097
Neg logit contributions
 loser-0.104
 leader-0.096
 problem-0.096
 friend-0.095
 frog-0.095
Token.
Token ID13
Feature activation+1.86e-02

Pos logit contributions
 magazines+0.617
 envelop+0.607
 counters+0.602
keeping+0.594
 computers+0.578
Neg logit contributions
 loser-0.272
 first-0.247
 best-0.245
 leader-0.242
zag-0.230
Token She
Token ID1375
Feature activation+8.19e-02

Pos logit contributions
 magazines+0.037
 counters+0.037
keeping+0.037
 envelop+0.036
 candles+0.035
Neg logit contributions
 loser-0.027
 frog-0.024
Only-0.023
 goose-0.023
 winner-0.023
Token has
Token ID468
Feature activation+4.38e-02

Pos logit contributions
keeping+0.200
 magazines+0.194
 counters+0.189
 envelop+0.188
 Emmy+0.178
Neg logit contributions
 loser-0.080
 frog-0.074
 problem-0.073
 leader-0.073
 am-0.072
Token left
Token ID1364
Feature activation-1.99e-02

Pos logit contributions
etti+0.094
keeping+0.093
 counters+0.091
 Emmy+0.089
amas+0.087
Neg logit contributions
 best-0.053
 one-0.053
 only-0.052
 first-0.050
 loser-0.050
Token a
Token ID257
Feature activation+4.54e-02

Pos logit contributions
 loser+0.024
 first+0.023
 one+0.023
 thing+0.022
 best+0.021
Neg logit contributions
 counters-0.042
keeping-0.041
 magazines-0.041
array-0.040
 illustrations-0.040
Token the
Token ID262
Feature activation+8.14e-03

Pos logit contributions
 magazines+0.054
 counters+0.053
 envelop+0.052
keeping+0.051
 illustrations+0.049
Neg logit contributions
 loser-0.047
 frog-0.044
in-0.043
 lesson-0.043
worm-0.043
Token dishes
Token ID16759
Feature activation+8.69e-02

Pos logit contributions
 magazines+0.015
keeping+0.015
 counters+0.014
 envelop+0.014
 illustrations+0.014
Neg logit contributions
 loser-0.013
 frog-0.012
 goose-0.012
 problem-0.012
worm-0.012
Token and
Token ID290
Feature activation+3.62e-02

Pos logit contributions
 magazines+0.175
 counters+0.173
 envelop+0.168
keeping+0.165
 candles+0.163
Neg logit contributions
 loser-0.128
 leader-0.113
 coward-0.113
 winner-0.113
 goose-0.112
Token wiped
Token ID21122
Feature activation+5.68e-02

Pos logit contributions
 magazines+0.072
keeping+0.070
 illustrations+0.069
 envelop+0.068
etti+0.067
Neg logit contributions
 frog-0.044
 lesson-0.043
 neither-0.043
 loser-0.043
 best-0.042
Token the
Token ID262
Feature activation+3.48e-03

Pos logit contributions
 magazines+0.103
 counters+0.100
 envelop+0.098
keeping+0.096
 candles+0.093
Neg logit contributions
 loser-0.096
 leader-0.087
 winner-0.087
worm-0.087
 frog-0.085
Token counters
Token ID21154
Feature activation+2.60e-01

Pos logit contributions
 magazines+0.005
keeping+0.005
 envelop+0.005
 illustrations+0.005
etti+0.005
Neg logit contributions
 loser-0.007
 frog-0.006
 lesson-0.006
 problem-0.006
 goose-0.006
Token.
Token ID13
Feature activation+1.02e-03

Pos logit contributions
 magazines+0.597
 counters+0.570
 envelop+0.570
keeping+0.565
 illustrations+0.546
Neg logit contributions
 loser-0.301
 leader-0.264
 winner-0.261
worm-0.259
 frog-0.258
Token When
Token ID1649
Feature activation+4.50e-02

Pos logit contributions
 magazines+0.002
 envelop+0.002
 counters+0.002
keeping+0.002
 candles+0.002
Neg logit contributions
 loser-0.001
 frog-0.001
Only-0.001
 bet-0.001
 goose-0.001
Token they
Token ID484
Feature activation-3.25e-02

Pos logit contributions
 counters+0.111
 magazines+0.109
keeping+0.108
 envelop+0.106
 illustrations+0.102
Neg logit contributions
 loser-0.044
 problem-0.040
 frog-0.040
 neither-0.039
 lesson-0.039
Token were
Token ID547
Feature activation+8.69e-02

Pos logit contributions
 loser+0.057
 leader+0.052
worm+0.052
 winner+0.051
 goose+0.051
Neg logit contributions
 magazines-0.056
 counters-0.055
 envelop-0.054
keeping-0.052
 candles-0.052
Token done
Token ID1760
Feature activation+4.25e-02

Pos logit contributions
keeping+0.177
 magazines+0.177
 envelop+0.165
 counters+0.162
 illustrations+0.160
Neg logit contributions
 loser-0.123
 best-0.121
 problem-0.121
 winner-0.116
in-0.116
Token took
Token ID1718
Feature activation+7.53e-03

Pos logit contributions
 magazines+0.187
 counters+0.183
keeping+0.180
 envelop+0.178
 candles+0.170
Neg logit contributions
 loser-0.114
 leader-0.105
 frog-0.104
 winner-0.103
worm-0.102
Token her
Token ID607
Feature activation+3.38e-02

Pos logit contributions
 magazines+0.014
 counters+0.014
keeping+0.013
 envelop+0.013
 illustrations+0.013
Neg logit contributions
 loser-0.012
 frog-0.011
 winner-0.011
 leader-0.011
worm-0.010
Token umbrella
Token ID25510
Feature activation+8.24e-02

Pos logit contributions
 counters+0.066
 magazines+0.065
keeping+0.064
 envelop+0.062
 illustrations+0.061
Neg logit contributions
 loser-0.048
 frog-0.046
 friend-0.045
worm-0.045
 goose-0.044
Token from
Token ID422
Feature activation+2.39e-02

Pos logit contributions
 counters+0.192
 magazines+0.191
 envelop+0.186
 illustrations+0.182
keeping+0.180
Neg logit contributions
 loser-0.085
worm-0.080
 leader-0.079
zag-0.079
ny-0.077
Token the
Token ID262
Feature activation+5.69e-03

Pos logit contributions
 counters+0.048
 magazines+0.045
 candles+0.043
 envelop+0.043
 illustrations+0.043
Neg logit contributions
 loser-0.034
 first-0.033
 frog-0.032
 goose-0.032
 one-0.031
Token closet
Token ID21615
Feature activation+2.60e-01

Pos logit contributions
 magazines+0.011
 counters+0.011
keeping+0.011
 envelop+0.011
 illustrations+0.011
Neg logit contributions
 loser-0.008
 frog-0.008
 winner-0.007
 goose-0.007
 leader-0.007
Token.
Token ID13
Feature activation+1.11e-02

Pos logit contributions
 magazines+0.606
 counters+0.592
 envelop+0.579
keeping+0.572
 illustrations+0.572
Neg logit contributions
 loser-0.277
 leader-0.268
 friend-0.249
 first-0.248
 winner-0.246
Token It
Token ID632
Feature activation+6.61e-02

Pos logit contributions
 magazines+0.024
 counters+0.023
 candles+0.023
 envelop+0.023
keeping+0.022
Neg logit contributions
 loser-0.014
 frog-0.013
Only-0.013
 bet-0.013
 goose-0.012
Token was
Token ID373
Feature activation+3.11e-02

Pos logit contributions
 magazines+0.136
 counters+0.133
 envelop+0.129
 candles+0.127
keeping+0.125
Neg logit contributions
 loser-0.096
 leader-0.082
 fisherman-0.082
 winner-0.082
 coward-0.081
Token blue
Token ID4171
Feature activation-5.34e-03

Pos logit contributions
keeping+0.071
 magazines+0.069
 counters+0.069
 illustrations+0.066
 envelop+0.066
Neg logit contributions
 best-0.036
 loser-0.036
 first-0.034
 only-0.034
 problem-0.034
Token with
Token ID351
Feature activation-1.03e-02

Pos logit contributions
 loser+0.005
 frog+0.004
 best+0.004
 winner+0.004
ah+0.004
Neg logit contributions
 envelop-0.013
 magazines-0.013
 counters-0.013
keeping-0.013
 computers-0.012
Token draw
Token ID3197
Feature activation+1.17e-01

Pos logit contributions
 loser+0.001
 frog+0.001
 problem+0.001
 winner+0.001
 goose+0.001
Neg logit contributions
keeping-0.001
 magazines-0.001
 counters-0.001
 envelop-0.001
 computers-0.001
Tokeners
Token ID364
Feature activation+2.20e-02

Pos logit contributions
keeping+0.208
 magazines+0.207
 counters+0.199
nam+0.194
 videos+0.192
Neg logit contributions
 loser-0.184
 best-0.178
in-0.174
 winner-0.173
worm-0.170
Token and
Token ID290
Feature activation+3.23e-02

Pos logit contributions
 magazines+0.052
 counters+0.051
keeping+0.050
 envelop+0.049
 candles+0.047
Neg logit contributions
 loser-0.024
 leader-0.021
 winner-0.021
 frog-0.021
worm-0.020
Token on
Token ID319
Feature activation-2.19e-02

Pos logit contributions
 magazines+0.067
 counters+0.066
keeping+0.064
amas+0.062
 illustrations+0.061
Neg logit contributions
 loser-0.041
 frog-0.041
 leader-0.039
worm-0.038
 goose-0.038
Token the
Token ID262
Feature activation-2.05e-04

Pos logit contributions
 loser+0.028
 frog+0.027
 goose+0.026
 leader+0.026
 problem+0.025
Neg logit contributions
 counters-0.045
keeping-0.044
 magazines-0.044
 envelop-0.041
 illustrations-0.041
Token counters
Token ID21154
Feature activation+2.59e-01

Pos logit contributions
 loser+0.000
 frog+0.000
worm+0.000
 best+0.000
 goose+0.000
Neg logit contributions
 magazines-0.000
keeping-0.000
 envelop-0.000
 counters-0.000
 illustrations-0.000
Token.
Token ID13
Feature activation-6.19e-05

Pos logit contributions
 magazines+0.650
 counters+0.614
 envelop+0.611
keeping+0.597
 videos+0.592
Neg logit contributions
 loser-0.245
 leader-0.220
 frog-0.210
 winner-0.209
 best-0.202
Token They
Token ID1119
Feature activation+5.76e-02

Pos logit contributions
 loser+0.000
 frog+0.000
 leader+0.000
 bet+0.000
 winner+0.000
Neg logit contributions
 magazines-0.000
 counters-0.000
 envelop-0.000
 candles-0.000
keeping-0.000
Token even
Token ID772
Feature activation-4.47e-02

Pos logit contributions
 magazines+0.124
 counters+0.121
keeping+0.120
 envelop+0.117
 videos+0.112
Neg logit contributions
 loser-0.075
 leader-0.069
 frog-0.069
worm-0.066
 winner-0.066
Token looked
Token ID3114
Feature activation+3.04e-02

Pos logit contributions
 loser+0.056
 leader+0.053
 frog+0.053
in+0.052
 winner+0.052
Neg logit contributions
 magazines-0.095
keeping-0.095
 counters-0.093
 envelop-0.088
 candles-0.086
Token in
Token ID287
Feature activation-1.63e-02

Pos logit contributions
keeping+0.080
 magazines+0.076
 counters+0.075
 envelop+0.073
bons+0.073
Neg logit contributions
 best-0.025
 loser-0.024
 problem-0.024
 frog-0.024
 closer-0.024
Token a
Token ID257
Feature activation+4.70e-02

Pos logit contributions
 counters+0.225
keeping+0.221
 magazines+0.216
 envelop+0.212
amas+0.207
Neg logit contributions
 frog-0.157
 goose-0.156
 loser-0.154
 best-0.150
 leader-0.147
Token h
Token ID289
Feature activation-5.40e-03

Pos logit contributions
 magazines+0.106
 counters+0.103
 envelop+0.102
keeping+0.099
 candles+0.098
Neg logit contributions
 loser-0.057
 leader-0.050
worm-0.049
 winner-0.049
 coward-0.048
Tokenanger
Token ID2564
Feature activation+2.59e-02

Pos logit contributions
 loser+0.008
 leader+0.007
 winner+0.007
worm+0.007
 frog+0.007
Neg logit contributions
 magazines-0.010
 counters-0.010
 envelop-0.010
keeping-0.010
 candles-0.010
Token in
Token ID287
Feature activation-8.30e-03

Pos logit contributions
 magazines+0.055
 counters+0.053
 envelop+0.052
keeping+0.052
 illustrations+0.049
Neg logit contributions
 loser-0.035
 leader-0.032
 winner-0.032
worm-0.032
 frog-0.032
Token the
Token ID262
Feature activation+6.03e-03

Pos logit contributions
 loser+0.011
 frog+0.010
 thing+0.010
 leader+0.010
 goose+0.010
Neg logit contributions
 counters-0.017
 magazines-0.017
keeping-0.016
 envelop-0.016
 candles-0.016
Token closet
Token ID21615
Feature activation+2.59e-01

Pos logit contributions
 magazines+0.011
keeping+0.011
 counters+0.011
etti+0.010
 illustrations+0.010
Neg logit contributions
 loser-0.010
 winner-0.009
 frog-0.009
 goose-0.009
 best-0.009
Token.
Token ID13
Feature activation+2.56e-03

Pos logit contributions
 magazines+0.638
 counters+0.612
 envelop+0.612
keeping+0.597
 videos+0.583
Neg logit contributions
 loser-0.261
 leader-0.246
 winner-0.230
 frog-0.225
 first-0.222
Token It
Token ID632
Feature activation+6.00e-02

Pos logit contributions
 magazines+0.005
 counters+0.005
keeping+0.005
 candles+0.005
 envelop+0.005
Neg logit contributions
 loser-0.003
 frog-0.003
Only-0.003
 bet-0.003
 goose-0.003
Token was
Token ID373
Feature activation+3.00e-02

Pos logit contributions
 magazines+0.127
 counters+0.124
 envelop+0.122
 candles+0.118
 computers+0.116
Neg logit contributions
 loser-0.085
 leader-0.072
 winner-0.071
 coward-0.071
 goose-0.070
Token shiny
Token ID22441
Feature activation+1.62e-02

Pos logit contributions
keeping+0.069
 counters+0.066
 magazines+0.065
amas+0.063
 illustrations+0.063
Neg logit contributions
 best-0.037
 only-0.035
 loser-0.035
 first-0.034
 frog-0.033
Token and
Token ID290
Feature activation+3.44e-02

Pos logit contributions
 magazines+0.037
 counters+0.037
keeping+0.037
 envelop+0.036
 computers+0.035
Neg logit contributions
 loser-0.018
 frog-0.016
worm-0.016
 problem-0.016
 winner-0.016
Token
Token ID26391
Feature activation+4.61e-03

Pos logit contributions
 loser+0.040
worm+0.037
 coward+0.036
 frog+0.036
 leader+0.036
Neg logit contributions
 magazines-0.043
 counters-0.042
keeping-0.041
 envelop-0.041
 candles-0.038
Token
Token ID8151
Feature activation-6.53e-03

Pos logit contributions
 magazines+0.009
 counters+0.009
 envelop+0.008
keeping+0.008
 candles+0.008
Neg logit contributions
 loser-0.007
 leader-0.006
 winner-0.006
 frog-0.006
 goose-0.006
Tokens
Token ID82
Feature activation-3.81e-02

Pos logit contributions
 loser+0.006
 leader+0.005
 winner+0.005
 frog+0.005
worm+0.005
Neg logit contributions
 magazines-0.016
keeping-0.015
 counters-0.015
 envelop-0.015
 Emmy-0.015
Token join
Token ID4654
Feature activation-9.93e-02

Pos logit contributions
 loser+0.053
worm+0.047
 coward+0.045
 goose+0.045
ah+0.045
Neg logit contributions
 magazines-0.078
 counters-0.077
 envelop-0.076
keeping-0.075
 candles-0.073
Token as
Token ID355
Feature activation-1.42e-01

Pos logit contributions
 loser+0.141
 frog+0.128
in+0.128
 leader+0.126
 goose+0.126
Neg logit contributions
 counters-0.204
 magazines-0.202
 envelop-0.194
keeping-0.192
 candles-0.186
Token one
Token ID530
Feature activation-2.62e-01

Pos logit contributions
 loser+0.203
 leader+0.180
 winner+0.179
 frog+0.177
 goose+0.177
Neg logit contributions
 magazines-0.294
 counters-0.292
 envelop-0.284
keeping-0.279
 candles-0.269
Token.
Token ID13
Feature activation-2.66e-02

Pos logit contributions
 loser+0.405
 coward+0.360
worm+0.342
 leader+0.339
igger+0.339
Neg logit contributions
 magazines-0.486
 counters-0.460
 candles-0.454
 envelop-0.451
 videos-0.443
Tokenâ
Token ID22940
Feature activation-2.47e-02

Pos logit contributions
 loser+0.042
 winner+0.037
 frog+0.037
 leader+0.036
worm+0.036
Neg logit contributions
 magazines-0.051
 counters-0.050
 envelop-0.049
 candles-0.048
keeping-0.048
Token
Token ID26391
Feature activation+3.44e-03

Pos logit contributions
 loser+0.039
 winner+0.035
 leader+0.035
ah+0.035
worm+0.034
Neg logit contributions
 magazines-0.048
 counters-0.046
 envelop-0.045
keeping-0.044
 candles-0.044
Token\n
Token ID198
Feature activation-5.22e-02

Pos logit contributions
 magazines+0.007
 counters+0.007
 envelop+0.006
keeping+0.006
 candles+0.006
Neg logit contributions
 loser-0.005
 leader-0.005
 winner-0.005
 goose-0.005
 frog-0.005
TokenThe
Token ID464
Feature activation+4.72e-03

Pos logit contributions
 loser+0.069
 leader+0.060
 winner+0.060
worm+0.060
 problem+0.059
Neg logit contributions
 magazines-0.112
 counters-0.110
 envelop-0.108
keeping-0.107
 candles-0.103
Tokenb
Token ID65
Feature activation-1.87e-01

Pos logit contributions
 loser+0.163
 leader+0.137
 coward+0.135
 frog+0.134
 lesson+0.133
Neg logit contributions
 counters-0.245
 magazines-0.244
 envelop-0.241
 candles-0.239
years-0.235
Tokenble
Token ID903
Feature activation-1.56e-01

Pos logit contributions
 loser+0.208
 coward+0.177
 leader+0.174
 winner+0.170
 goose+0.169
Neg logit contributions
 magazines-0.433
 counters-0.431
 envelop-0.418
keeping-0.415
 candles-0.411
Token is
Token ID318
Feature activation-2.34e-02

Pos logit contributions
 loser+0.204
 leader+0.172
 winner+0.170
 coward+0.169
 goose+0.166
Neg logit contributions
 magazines-0.341
 counters-0.334
 envelop-0.324
 candles-0.319
keeping-0.314
Token a
Token ID257
Feature activation+2.32e-02

Pos logit contributions
 loser+0.035
 winner+0.032
 leader+0.032
 frog+0.032
worm+0.032
Neg logit contributions
 magazines-0.046
 counters-0.046
keeping-0.044
 envelop-0.044
 desks-0.042
Token pretty
Token ID2495
Feature activation+4.56e-02

Pos logit contributions
 magazines+0.047
 envelop+0.046
 counters+0.046
 candles+0.044
keeping+0.044
Neg logit contributions
 loser-0.031
 leader-0.028
 neither-0.027
worm-0.027
in-0.027
Token one
Token ID530
Feature activation-2.62e-01

Pos logit contributions
 magazines+0.096
 envelop+0.095
 counters+0.095
 candles+0.092
keeping+0.091
Neg logit contributions
 loser-0.059
 coward-0.052
 leader-0.050
 neither-0.050
 winner-0.048
Token,"
Token ID553
Feature activation+8.39e-02

Pos logit contributions
 loser+0.477
 coward+0.433
 leader+0.414
 winner+0.407
Only+0.405
Neg logit contributions
 magazines-0.434
 candles-0.414
 envelop-0.413
 counters-0.410
keeping-0.393
Token said
Token ID531
Feature activation-1.74e-02

Pos logit contributions
 magazines+0.136
 counters+0.134
 envelop+0.132
keeping+0.127
 candles+0.125
Neg logit contributions
 loser-0.159
 winner-0.141
 leader-0.139
 coward-0.139
zag-0.137
Token the
Token ID262
Feature activation-1.83e-02

Pos logit contributions
 loser+0.035
 winner+0.032
 coward+0.032
 problem+0.031
 fastest+0.031
Neg logit contributions
 magazines-0.025
 envelop-0.025
 counters-0.024
keeping-0.023
 candles-0.022
Token old
Token ID1468
Feature activation+3.78e-02

Pos logit contributions
 loser+0.031
zag+0.028
ah+0.028
 coward+0.028
 neither+0.027
Neg logit contributions
 magazines-0.032
 counters-0.031
 envelop-0.031
 candles-0.029
keeping-0.029
Token man
Token ID582
Feature activation-5.39e-02

Pos logit contributions
 magazines+0.084
 envelop+0.084
keeping+0.081
 counters+0.081
 desks+0.078
Neg logit contributions
 loser-0.047
 coward-0.040
 fastest-0.039
 winner-0.038
Only-0.036
Token you
Token ID345
Feature activation-1.03e-03

Pos logit contributions
 loser+0.142
 leader+0.127
 winner+0.125
worm+0.125
 coward+0.123
Neg logit contributions
 magazines-0.175
 counters-0.168
 envelop-0.165
keeping-0.161
 candles-0.159
Token are
Token ID389
Feature activation-5.30e-03

Pos logit contributions
 loser+0.001
 leader+0.001
worm+0.001
 winner+0.001
 goose+0.001
Neg logit contributions
 magazines-0.002
 counters-0.002
 envelop-0.002
keeping-0.002
 candles-0.002
Token,
Token ID11
Feature activation-1.57e-02

Pos logit contributions
 loser+0.008
worm+0.007
 leader+0.007
 winner+0.007
 goose+0.007
Neg logit contributions
 magazines-0.010
 counters-0.010
 envelop-0.010
keeping-0.010
 candles-0.009
Token my
Token ID616
Feature activation-2.05e-02

Pos logit contributions
 loser+0.022
 leader+0.020
worm+0.020
 winner+0.020
 frog+0.019
Neg logit contributions
 magazines-0.032
 counters-0.031
 envelop-0.031
keeping-0.030
 candles-0.029
Token little
Token ID1310
Feature activation-8.77e-03

Pos logit contributions
 loser+0.032
 leader+0.029
 winner+0.029
worm+0.029
 frog+0.029
Neg logit contributions
 magazines-0.039
 counters-0.038
 envelop-0.037
keeping-0.037
 candles-0.036
Token one
Token ID530
Feature activation-2.62e-01

Pos logit contributions
 loser+0.011
worm+0.010
 coward+0.010
 leader+0.010
 winner+0.009
Neg logit contributions
 magazines-0.019
 counters-0.019
 envelop-0.018
keeping-0.018
 candles-0.018
Token!
Token ID0
Feature activation+2.52e-03

Pos logit contributions
 loser+0.379
 coward+0.321
 leader+0.320
worm+0.313
 goose+0.311
Neg logit contributions
 magazines-0.529
 counters-0.513
 candles-0.498
 envelop-0.495
keeping-0.488
Tokenâ
Token ID22940
Feature activation-2.39e-02

Pos logit contributions
 magazines+0.005
 counters+0.004
 envelop+0.004
keeping+0.004
 candles+0.004
Neg logit contributions
 loser-0.004
 winner-0.004
 leader-0.004
 frog-0.004
worm-0.004
Token
Token ID26391
Feature activation+3.85e-03

Pos logit contributions
 loser+0.038
 winner+0.034
 leader+0.034
worm+0.034
ah+0.034
Neg logit contributions
 magazines-0.046
 counters-0.044
 envelop-0.044
keeping-0.042
 candles-0.042
Token
Token ID220
Feature activation-9.56e-04

Pos logit contributions
 magazines+0.008
 counters+0.008
 envelop+0.008
keeping+0.008
 candles+0.008
Neg logit contributions
 loser-0.005
 leader-0.005
 winner-0.005
 frog-0.005
 goose-0.005
Token\n
Token ID198
Feature activation-5.49e-02

Pos logit contributions
 loser+0.001
 leader+0.001
 winner+0.001
worm+0.001
 frog+0.001
Neg logit contributions
 magazines-0.002
 counters-0.002
 envelop-0.002
keeping-0.002
 candles-0.002
Token with
Token ID351
Feature activation-2.91e-02

Pos logit contributions
 loser+0.015
worm+0.013
 leader+0.013
 winner+0.013
 coward+0.013
Neg logit contributions
 magazines-0.021
 counters-0.020
 envelop-0.020
keeping-0.019
 candles-0.019
Token a
Token ID257
Feature activation+2.39e-02

Pos logit contributions
 loser+0.046
 leader+0.042
 winner+0.042
 frog+0.042
worm+0.041
Neg logit contributions
 counters-0.055
 magazines-0.054
keeping-0.052
 envelop-0.052
 candles-0.049
Token bright
Token ID6016
Feature activation+4.29e-02

Pos logit contributions
 magazines+0.048
 counters+0.047
 envelop+0.047
 candles+0.045
keeping+0.044
Neg logit contributions
 loser-0.035
worm-0.031
 leader-0.030
in-0.030
 winner-0.030
Token smile
Token ID8212
Feature activation+2.24e-02

Pos logit contributions
 magazines+0.093
 counters+0.092
keeping+0.092
 envelop+0.088
 desks+0.086
Neg logit contributions
 loser-0.054
 leader-0.050
 winner-0.050
 frog-0.050
 problem-0.049
Token,
Token ID11
Feature activation-1.20e-02

Pos logit contributions
 magazines+0.048
 counters+0.047
 envelop+0.046
keeping+0.045
 candles+0.044
Neg logit contributions
 loser-0.030
 leader-0.026
worm-0.026
 winner-0.026
 frog-0.025
Token one
Token ID530
Feature activation-2.62e-01

Pos logit contributions
 loser+0.015
 frog+0.014
 leader+0.014
 goose+0.014
 winner+0.014
Neg logit contributions
 magazines-0.026
 counters-0.026
keeping-0.025
 envelop-0.025
 candles-0.024
Token of
Token ID286
Feature activation+1.80e-02

Pos logit contributions
 loser+0.383
 coward+0.331
 leader+0.310
worm+0.309
 winner+0.308
Neg logit contributions
 magazines-0.534
 envelop-0.520
 counters-0.516
 candles-0.513
 illustrations-0.488
Token the
Token ID262
Feature activation-1.72e-02

Pos logit contributions
 magazines+0.032
 counters+0.031
 envelop+0.030
keeping+0.029
 candles+0.029
Neg logit contributions
 loser-0.031
 leader-0.028
worm-0.028
 winner-0.028
 coward-0.027
Token friends
Token ID2460
Feature activation-3.47e-02

Pos logit contributions
 loser+0.024
 leader+0.021
 winner+0.021
worm+0.021
 frog+0.021
Neg logit contributions
 magazines-0.036
 counters-0.035
 envelop-0.035
keeping-0.034
 candles-0.033
Token held
Token ID2714
Feature activation-6.33e-02

Pos logit contributions
 loser+0.050
 leader+0.046
 winner+0.046
 frog+0.046
worm+0.045
Neg logit contributions
 magazines-0.070
 counters-0.068
 envelop-0.066
keeping-0.066
 candles-0.063
Token a
Token ID257
Feature activation+2.12e-02

Pos logit contributions
 loser+0.091
 frog+0.083
 leader+0.082
 best+0.082
 winner+0.082
Neg logit contributions
 magazines-0.126
 counters-0.126
keeping-0.122
 envelop-0.118
 desks-0.115
Token,
Token ID11
Feature activation-9.30e-03

Pos logit contributions
 magazines+0.092
 envelop+0.089
 counters+0.088
keeping+0.085
 candles+0.084
Neg logit contributions
 loser-0.062
 winner-0.053
 coward-0.052
 leader-0.051
worm-0.050
Token "
Token ID366
Feature activation-9.97e-03

Pos logit contributions
 loser+0.013
 leader+0.012
 winner+0.012
 frog+0.012
worm+0.012
Neg logit contributions
 magazines-0.019
 counters-0.019
 envelop-0.018
keeping-0.018
 candles-0.017
TokenHello
Token ID15496
Feature activation+4.80e-02

Pos logit contributions
 loser+0.016
 problem+0.015
 coward+0.014
 leader+0.014
 best+0.014
Neg logit contributions
 magazines-0.018
 envelop-0.018
keeping-0.018
 counters-0.018
 candles-0.017
Token,
Token ID11
Feature activation-1.26e-02

Pos logit contributions
 magazines+0.107
 counters+0.106
 envelop+0.103
keeping+0.103
 candles+0.098
Neg logit contributions
 loser-0.060
 frog-0.055
 winner-0.053
 leader-0.053
in-0.052
Token little
Token ID1310
Feature activation-6.58e-03

Pos logit contributions
 loser+0.018
 frog+0.016
 winner+0.016
 leader+0.016
worm+0.015
Neg logit contributions
 magazines-0.026
 counters-0.026
 envelop-0.025
keeping-0.025
 candles-0.024
Token one
Token ID530
Feature activation-2.62e-01

Pos logit contributions
 loser+0.007
zag+0.006
 leader+0.006
 coward+0.006
worm+0.006
Neg logit contributions
 magazines-0.016
 counters-0.015
 envelop-0.015
 candles-0.015
keeping-0.015
Token!
Token ID0
Feature activation+3.93e-03

Pos logit contributions
 loser+0.405
 leader+0.355
 coward+0.352
 winner+0.351
worm+0.348
Neg logit contributions
 magazines-0.510
 counters-0.490
 envelop-0.481
keeping-0.471
 candles-0.468
Token What
Token ID1867
Feature activation+6.27e-02

Pos logit contributions
 magazines+0.007
 counters+0.006
 envelop+0.006
keeping+0.006
 candles+0.006
Neg logit contributions
 loser-0.007
 leader-0.006
 winner-0.006
 frog-0.006
worm-0.006
Token have
Token ID423
Feature activation-2.28e-02

Pos logit contributions
 magazines+0.128
 envelop+0.119
 counters+0.118
 computers+0.116
 candles+0.116
Neg logit contributions
 loser-0.091
worm-0.083
ah-0.079
 goose-0.079
 leader-0.079
Token you
Token ID345
Feature activation-1.44e-03

Pos logit contributions
 loser+0.034
 winner+0.032
 leader+0.031
 frog+0.031
 goose+0.031
Neg logit contributions
 counters-0.044
 magazines-0.044
 envelop-0.042
keeping-0.042
 candles-0.040
Token got
Token ID1392
Feature activation-2.19e-02

Pos logit contributions
 loser+0.002
 winner+0.002
 leader+0.002
 frog+0.002
 problem+0.002
Neg logit contributions
 magazines-0.003
 counters-0.003
 envelop-0.003
keeping-0.003
 candles-0.003
Token crib
Token ID48083
Feature activation+1.49e-02

Pos logit contributions
 loser+0.043
 leader+0.039
 winner+0.039
worm+0.039
 coward+0.039
Neg logit contributions
 magazines-0.042
 counters-0.041
 envelop-0.040
keeping-0.039
 candles-0.038
Token for
Token ID329
Feature activation+1.99e-02

Pos logit contributions
 magazines+0.031
 counters+0.030
 envelop+0.029
keeping+0.029
 candles+0.028
Neg logit contributions
 loser-0.021
 leader-0.019
 winner-0.019
worm-0.019
 frog-0.019
Token Sarah
Token ID10490
Feature activation+1.46e-01

Pos logit contributions
 counters+0.043
 magazines+0.041
keeping+0.040
 illustrations+0.040
 envelop+0.040
Neg logit contributions
 first-0.025
 loser-0.023
 best-0.023
 one-0.023
 friend-0.022
Token.
Token ID13
Feature activation-2.61e-02

Pos logit contributions
 magazines+0.329
 counters+0.327
 envelop+0.316
keeping+0.309
 candles+0.307
Neg logit contributions
 loser-0.170
 leader-0.153
worm-0.151
 winner-0.150
 frog-0.148
Token This
Token ID770
Feature activation+2.14e-02

Pos logit contributions
 loser+0.039
Only+0.036
 bet+0.036
 frog+0.036
 winner+0.036
Neg logit contributions
 magazines-0.050
 candles-0.049
 envelop-0.048
 counters-0.048
keeping-0.048
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 magazines+0.050
 counters+0.047
 candles+0.046
 presents+0.046
 envelop+0.045
Neg logit contributions
 loser-0.027
worm-0.023
 goose-0.023
 coward-0.023
ah-0.023
Token is
Token ID318
Feature activation-2.46e-02

Pos logit contributions
 loser+0.317
 coward+0.263
 leader+0.260
 winner+0.258
worm+0.255
Neg logit contributions
 magazines-0.601
 counters-0.582
 envelop-0.573
 candles-0.562
keeping-0.556
Token too
Token ID1165
Feature activation-1.69e-01

Pos logit contributions
 best+0.033
 loser+0.032
 lesson+0.031
 problem+0.031
 winner+0.031
Neg logit contributions
 counters-0.051
keeping-0.051
 magazines-0.049
 envelop-0.048
 desks-0.046
Token small
Token ID1402
Feature activation-2.45e-02

Pos logit contributions
 loser+0.209
 leader+0.187
 problem+0.186
worm+0.185
 lesson+0.184
Neg logit contributions
 magazines-0.370
 counters-0.368
keeping-0.363
 envelop-0.361
 candles-0.342
Token for
Token ID329
Feature activation+2.10e-02

Pos logit contributions
 loser+0.031
 leader+0.027
 winner+0.027
worm+0.027
 frog+0.027
Neg logit contributions
 magazines-0.054
 counters-0.053
 envelop-0.052
keeping-0.051
 candles-0.050
Token her
Token ID607
Feature activation-4.18e-03

Pos logit contributions
 counters+0.041
keeping+0.039
 magazines+0.038
 envelop+0.038
 illustrations+0.037
Neg logit contributions
 first-0.030
 one-0.028
 best-0.028
 loser-0.027
 problem-0.027
Token speaking
Token ID5486
Feature activation-3.76e-02

Pos logit contributions
 magazines+0.004
 counters+0.004
 envelop+0.003
keeping+0.003
 candles+0.003
Neg logit contributions
 loser-0.003
 leader-0.003
 winner-0.003
worm-0.003
 frog-0.003
Token a
Token ID257
Feature activation+2.43e-02

Pos logit contributions
 loser+0.050
 leader+0.044
worm+0.044
 winner+0.043
 frog+0.043
Neg logit contributions
 magazines-0.081
 counters-0.079
 envelop-0.078
keeping-0.076
 candles-0.075
Token magical
Token ID10883
Feature activation-6.11e-03

Pos logit contributions
 magazines+0.053
 envelop+0.051
 counters+0.049
 brand+0.048
 computers+0.048
Neg logit contributions
 loser-0.031
in-0.028
 neither-0.027
 coward-0.026
worm-0.026
Token language
Token ID3303
Feature activation-1.46e-04

Pos logit contributions
 loser+0.008
 coward+0.007
in+0.007
worm+0.007
 winner+0.007
Neg logit contributions
 magazines-0.013
 envelop-0.013
 counters-0.013
keeping-0.012
 candles-0.012
Token -
Token ID532
Feature activation+7.71e-02

Pos logit contributions
 loser+0.000
 leader+0.000
 winner+0.000
worm+0.000
 frog+0.000
Neg logit contributions
 magazines-0.000
 counters-0.000
 envelop-0.000
keeping-0.000
 candles-0.000
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 magazines+0.158
 counters+0.155
 envelop+0.151
keeping+0.149
 candles+0.145
Neg logit contributions
 loser-0.110
 leader-0.098
 winner-0.097
worm-0.097
 frog-0.096
Token that
Token ID326
Feature activation-8.19e-02

Pos logit contributions
 loser+0.421
 coward+0.364
 goose+0.343
igger+0.343
 winner+0.343
Neg logit contributions
 magazines-0.496
 envelop-0.471
 candles-0.459
 counters-0.457
 videos-0.448
Token only
Token ID691
Feature activation-2.36e-01

Pos logit contributions
 loser+0.129
 winner+0.106
in+0.106
 coward+0.106
 goose+0.105
Neg logit contributions
 magazines-0.160
 envelop-0.152
keeping-0.146
 computers-0.146
 counters-0.144
Token the
Token ID262
Feature activation-1.84e-02

Pos logit contributions
 loser+0.314
 leader+0.272
 winner+0.271
worm+0.270
 frog+0.267
Neg logit contributions
 magazines-0.510
 counters-0.495
 envelop-0.489
keeping-0.481
 candles-0.469
Token two
Token ID734
Feature activation-1.43e-01

Pos logit contributions
 loser+0.021
in+0.020
ah+0.020
zag+0.019
worm+0.019
Neg logit contributions
 magazines-0.042
 envelop-0.040
 counters-0.039
keeping-0.038
 computers-0.038
Token of
Token ID286
Feature activation+1.87e-02

Pos logit contributions
 loser+0.215
 leader+0.185
worm+0.185
 winner+0.185
 frog+0.183
Neg logit contributions
 magazines-0.288
 counters-0.277
 envelop-0.274
keeping-0.264
 candles-0.263
Token on
Token ID319
Feature activation-3.46e-02

Pos logit contributions
 loser+0.034
ah+0.029
 goose+0.029
 coward+0.029
 winner+0.029
Neg logit contributions
 magazines-0.045
 counters-0.044
keeping-0.044
 envelop-0.043
 candles-0.041
Token an
Token ID281
Feature activation+1.26e-02

Pos logit contributions
 loser+0.045
 lesson+0.041
 leader+0.041
 problem+0.040
 frog+0.040
Neg logit contributions
 counters-0.075
 magazines-0.072
 candles-0.069
 envelop-0.069
keeping-0.068
Token adventure
Token ID8855
Feature activation-2.35e-03

Pos logit contributions
 counters+0.027
 magazines+0.027
keeping+0.026
 envelop+0.026
 candles+0.025
Neg logit contributions
 loser-0.015
 leader-0.015
 lesson-0.015
 problem-0.015
worm-0.015
Token that
Token ID326
Feature activation-7.93e-02

Pos logit contributions
 loser+0.003
 leader+0.003
worm+0.003
 winner+0.003
 frog+0.002
Neg logit contributions
 magazines-0.005
 counters-0.005
 envelop-0.005
keeping-0.005
 candles-0.005
Token no
Token ID645
Feature activation-1.21e-01

Pos logit contributions
 loser+0.114
worm+0.095
 goose+0.095
ah+0.095
 winner+0.095
Neg logit contributions
 magazines-0.164
 counters-0.159
 envelop-0.158
keeping-0.156
 candles-0.152
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.140
worm+0.124
in+0.124
zag+0.122
 goose+0.119
Neg logit contributions
 magazines-0.276
 envelop-0.267
 counters-0.261
 candles-0.259
keeping-0.255
Token expected
Token ID2938
Feature activation-1.50e-02

Pos logit contributions
 loser+0.434
igger+0.347
 winner+0.345
 coward+0.345
zag+0.344
Neg logit contributions
keeping-0.470
 magazines-0.470
 counters-0.452
 candles-0.452
 envelop-0.450
Token.
Token ID13
Feature activation-2.10e-02

Pos logit contributions
 loser+0.020
 leader+0.017
 winner+0.017
worm+0.017
 frog+0.017
Neg logit contributions
 magazines-0.032
 counters-0.031
 envelop-0.031
keeping-0.030
 candles-0.030
Token<|endoftext|>
Token ID50256
Feature activation+1.09e-01

Pos logit contributions
 loser+0.033
worm+0.031
 leader+0.030
 goose+0.030
 problem+0.030
Neg logit contributions
 magazines-0.038
 counters-0.037
 envelop-0.036
keeping-0.036
 Emmy-0.035
TokenOnce
Token ID7454
Feature activation-6.34e-02

Pos logit contributions
 magazines+0.233
 counters+0.227
keeping+0.226
 envelop+0.221
 candles+0.213
Neg logit contributions
 loser-0.144
 leader-0.128
worm-0.125
 winner-0.124
 coward-0.122
Token upon
Token ID2402
Feature activation+7.80e-02

Pos logit contributions
 loser+0.138
 winner+0.126
 hawk+0.122
 fastest+0.120
 leader+0.120
Neg logit contributions
 Emmy-0.083
 Sarah-0.080
 magazines-0.078
 desks-0.074
 bed-0.073
Token just
Token ID655
Feature activation-1.36e-01

Pos logit contributions
 loser+0.140
 winner+0.125
 frog+0.124
 leader+0.124
 problem+0.121
Neg logit contributions
 magazines-0.231
 counters-0.224
 envelop-0.220
keeping-0.218
 computers-0.209
Token as
Token ID355
Feature activation-1.31e-01

Pos logit contributions
 loser+0.183
 winner+0.164
 leader+0.163
 frog+0.163
worm+0.160
Neg logit contributions
 magazines-0.293
 counters-0.285
 envelop-0.279
keeping-0.274
 desks-0.264
Token beautiful
Token ID4950
Feature activation+5.41e-02

Pos logit contributions
 loser+0.183
 leader+0.162
 winner+0.161
worm+0.160
 frog+0.160
Neg logit contributions
 magazines-0.275
 counters-0.270
 envelop-0.265
keeping-0.260
 candles-0.252
Token as
Token ID355
Feature activation-1.36e-01

Pos logit contributions
 magazines+0.115
 counters+0.112
 envelop+0.112
keeping+0.108
 computers+0.105
Neg logit contributions
 loser-0.070
 frog-0.067
 goose-0.065
worm-0.064
 winner-0.064
Token the
Token ID262
Feature activation-1.64e-02

Pos logit contributions
 loser+0.203
 leader+0.182
 winner+0.182
 frog+0.182
worm+0.181
Neg logit contributions
 magazines-0.271
 counters-0.266
 envelop-0.260
keeping-0.255
 candles-0.245
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.022
 leader+0.019
 winner+0.019
 frog+0.019
worm+0.019
Neg logit contributions
 magazines-0.035
 counters-0.035
 envelop-0.034
keeping-0.034
 candles-0.032
Token in
Token ID287
Feature activation-3.16e-02

Pos logit contributions
 loser+0.351
 coward+0.297
 leader+0.295
 winner+0.290
worm+0.289
Neg logit contributions
 magazines-0.560
 counters-0.551
 envelop-0.535
 candles-0.527
keeping-0.523
Token the
Token ID262
Feature activation-1.81e-02

Pos logit contributions
 loser+0.050
worm+0.045
 leader+0.045
 winner+0.045
 frog+0.044
Neg logit contributions
 magazines-0.060
 counters-0.058
 envelop-0.057
keeping-0.056
 candles-0.054
Token sky
Token ID6766
Feature activation+3.70e-02

Pos logit contributions
 loser+0.026
 winner+0.025
 frog+0.024
 problem+0.024
 leader+0.024
Neg logit contributions
 magazines-0.036
keeping-0.036
 counters-0.035
 envelop-0.034
 candles-0.032
Token.
Token ID13
Feature activation-2.36e-02

Pos logit contributions
 magazines+0.080
 counters+0.078
 envelop+0.077
keeping+0.075
 candles+0.073
Neg logit contributions
 loser-0.049
 leader-0.043
 winner-0.043
worm-0.042
 frog-0.042
Tokenâ
Token ID22940
Feature activation-2.47e-02

Pos logit contributions
 loser+0.034
 frog+0.031
 winner+0.030
 leader+0.030
 goose+0.030
Neg logit contributions
 magazines-0.048
 counters-0.047
 envelop-0.046
 candles-0.045
keeping-0.045
Token said
Token ID531
Feature activation-7.36e-03

Pos logit contributions
 magazines+0.031
 counters+0.031
 envelop+0.030
keeping+0.029
 computers+0.028
Neg logit contributions
 loser-0.023
 frog-0.021
 leader-0.021
 winner-0.021
 goose-0.021
Token '
Token ID705
Feature activation-4.13e-02

Pos logit contributions
 loser+0.009
 leader+0.008
worm+0.008
 winner+0.008
 goose+0.008
Neg logit contributions
 magazines-0.016
 counters-0.016
 envelop-0.016
keeping-0.015
 candles-0.015
TokenMom
Token ID29252
Feature activation+1.28e-02

Pos logit contributions
 loser+0.063
 leader+0.057
 coward+0.056
 problem+0.056
worm+0.055
Neg logit contributions
 magazines-0.080
 envelop-0.077
 counters-0.077
keeping-0.076
 candles-0.073
Token,
Token ID11
Feature activation-1.30e-02

Pos logit contributions
 magazines+0.029
 counters+0.028
keeping+0.027
 envelop+0.027
 desks+0.026
Neg logit contributions
 loser-0.016
ah-0.014
 winner-0.014
 leader-0.014
in-0.014
Token this
Token ID428
Feature activation-5.63e-02

Pos logit contributions
 loser+0.019
 leader+0.016
 winner+0.016
worm+0.016
 frog+0.016
Neg logit contributions
 magazines-0.027
 counters-0.026
 envelop-0.026
keeping-0.025
 candles-0.025
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.083
 coward+0.071
 leader+0.069
worm+0.069
 neither+0.066
Neg logit contributions
 magazines-0.121
 counters-0.110
 presents-0.108
 envelop-0.108
 candles-0.108
Token!'
Token ID13679
Feature activation+1.07e-01

Pos logit contributions
 loser+0.321
 coward+0.271
 leader+0.269
 winner+0.265
worm+0.264
Neg logit contributions
 magazines-0.592
 counters-0.572
 envelop-0.566
 candles-0.553
keeping-0.549
Token
Token ID198
Feature activation-5.46e-02

Pos logit contributions
 magazines+0.231
 counters+0.227
 envelop+0.222
keeping+0.219
 candles+0.214
Neg logit contributions
 loser-0.139
 leader-0.123
 frog-0.122
 winner-0.121
worm-0.121
Token
Token ID198
Feature activation-5.46e-02

Pos logit contributions
 loser+0.060
 leader+0.051
 winner+0.051
worm+0.051
 frog+0.050
Neg logit contributions
 magazines-0.131
 counters-0.128
 envelop-0.126
keeping-0.124
 candles-0.121
TokenMax
Token ID11518
Feature activation+2.35e-02

Pos logit contributions
 loser+0.079
worm+0.070
 winner+0.070
 leader+0.070
 goose+0.069
Neg logit contributions
 magazines-0.112
 counters-0.109
 envelop-0.107
keeping-0.104
 candles-0.102
Token's
Token ID338
Feature activation-3.90e-02

Pos logit contributions
 magazines+0.052
 counters+0.050
keeping+0.050
 envelop+0.049
 computers+0.047
Neg logit contributions
 loser-0.029
 frog-0.028
 leader-0.028
ah-0.028
 friend-0.027
Token long
Token ID890
Feature activation+4.94e-02

Pos logit contributions
 loser+0.036
 frog+0.034
 leader+0.034
 winner+0.034
worm+0.034
Neg logit contributions
 magazines-0.050
 counters-0.050
 envelop-0.048
keeping-0.047
 videos-0.045
Token leaves
Token ID5667
Feature activation-4.64e-02

Pos logit contributions
 magazines+0.101
 counters+0.098
 envelop+0.097
keeping+0.097
 videos+0.093
Neg logit contributions
 loser-0.067
 frog-0.063
 goose-0.062
worm-0.061
 problem-0.061
Token,
Token ID11
Feature activation-1.26e-02

Pos logit contributions
 loser+0.063
 leader+0.055
 winner+0.055
worm+0.055
 frog+0.054
Neg logit contributions
 magazines-0.098
 counters-0.096
 envelop-0.094
keeping-0.093
 candles-0.091
Token and
Token ID290
Feature activation+1.36e-02

Pos logit contributions
 loser+0.017
 frog+0.016
 goose+0.015
 winner+0.015
 leader+0.015
Neg logit contributions
 magazines-0.026
 counters-0.026
keeping-0.026
 envelop-0.026
 desks-0.024
Token this
Token ID428
Feature activation-5.78e-02

Pos logit contributions
 magazines+0.028
 counters+0.027
 envelop+0.027
keeping+0.026
 candles+0.025
Neg logit contributions
 loser-0.019
 winner-0.017
 frog-0.017
 leader-0.017
worm-0.017
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.088
 coward+0.077
 leader+0.073
in+0.072
 winner+0.070
Neg logit contributions
 magazines-0.115
 counters-0.113
 envelop-0.112
 candles-0.111
keeping-0.107
Token probably
Token ID2192
Feature activation-2.92e-02

Pos logit contributions
 loser+0.368
 coward+0.308
 leader+0.298
 winner+0.295
igger+0.282
Neg logit contributions
 magazines-0.551
 counters-0.537
 envelop-0.527
 candles-0.527
keeping-0.513
Token used
Token ID973
Feature activation-6.30e-02

Pos logit contributions
 loser+0.041
 leader+0.035
 winner+0.035
in+0.035
 coward+0.035
Neg logit contributions
 magazines-0.060
 counters-0.059
 envelop-0.058
keeping-0.057
 candles-0.056
Token to
Token ID284
Feature activation+1.52e-02

Pos logit contributions
 loser+0.083
worm+0.075
 leader+0.075
 frog+0.074
 winner+0.074
Neg logit contributions
 magazines-0.134
 counters-0.133
 envelop-0.128
keeping-0.127
 candles-0.123
Token belong
Token ID5594
Feature activation-4.45e-02

Pos logit contributions
 magazines+0.034
 counters+0.033
 envelop+0.032
keeping+0.032
 candles+0.031
Neg logit contributions
 loser-0.019
 leader-0.017
 frog-0.017
 winner-0.016
worm-0.016
Token to
Token ID284
Feature activation+1.52e-02

Pos logit contributions
 loser+0.055
 leader+0.049
worm+0.049
 winner+0.049
 frog+0.048
Neg logit contributions
 magazines-0.099
 counters-0.098
 envelop-0.095
keeping-0.092
 candles-0.091
TokenThe
Token ID464
Feature activation+8.59e-03

Pos logit contributions
 loser+0.075
 problem+0.066
 winner+0.066
 coward+0.065
worm+0.065
Neg logit contributions
 magazines-0.107
 counters-0.104
 envelop-0.102
keeping-0.099
 computers-0.097
Token tall
Token ID7331
Feature activation+3.29e-02

Pos logit contributions
 magazines+0.020
 counters+0.019
 candles+0.018
 desks+0.018
 envelop+0.018
Neg logit contributions
 loser-0.010
ah-0.009
zag-0.009
 coward-0.009
igger-0.009
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 magazines+0.079
 counters+0.077
 envelop+0.075
 desks+0.074
 candles+0.074
Neg logit contributions
 loser-0.037
 coward-0.032
zag-0.030
 fastest-0.029
 winner-0.029
Token stood
Token ID6204
Feature activation-2.60e-02

Pos logit contributions
 loser+0.449
 coward+0.395
 fastest+0.367
 winner+0.364
 leader+0.361
Neg logit contributions
 magazines-0.462
 counters-0.453
 envelop-0.447
 candles-0.443
 illustrations-0.418
Token on
Token ID319
Feature activation-3.70e-02

Pos logit contributions
 loser+0.032
 leader+0.028
worm+0.028
 frog+0.028
 winner+0.028
Neg logit contributions
 magazines-0.059
 counters-0.057
 envelop-0.056
keeping-0.056
 candles-0.054
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.053
 frog+0.049
 leader+0.048
 goose+0.047
 winner+0.047
Neg logit contributions
 counters-0.075
 magazines-0.074
keeping-0.071
 envelop-0.069
 candles-0.068
Token leg
Token ID1232
Feature activation-7.49e-02

Pos logit contributions
 loser+0.359
 coward+0.327
 leader+0.301
 winner+0.301
 goose+0.297
Neg logit contributions
 magazines-0.552
 counters-0.540
 envelop-0.536
 candles-0.515
 desks-0.506
Token and
Token ID290
Feature activation+1.59e-02

Pos logit contributions
 loser+0.096
 leader+0.084
worm+0.083
 winner+0.083
 frog+0.083
Neg logit contributions
 magazines-0.164
 counters-0.161
 envelop-0.158
keeping-0.155
 candles-0.152
Token said
Token ID531
Feature activation-6.66e-03

Pos logit contributions
 magazines+0.031
 counters+0.031
 envelop+0.030
keeping+0.030
 candles+0.029
Neg logit contributions
 loser-0.024
 leader-0.022
worm-0.021
 winner-0.021
 frog-0.021
Token,
Token ID11
Feature activation-1.04e-02

Pos logit contributions
 loser+0.009
worm+0.007
 winner+0.007
 leader+0.007
 coward+0.007
Neg logit contributions
 magazines-0.014
 counters-0.014
 envelop-0.014
 candles-0.013
keeping-0.013
Token "
Token ID366
Feature activation-8.90e-03

Pos logit contributions
 loser+0.016
worm+0.015
 leader+0.015
 winner+0.014
 coward+0.014
Neg logit contributions
 magazines-0.020
 counters-0.019
 envelop-0.019
keeping-0.018
 candles-0.018
Token it
Token ID340
Feature activation-3.78e-02

Pos logit contributions
 loser+0.113
worm+0.097
ah+0.097
 winner+0.097
 leader+0.096
Neg logit contributions
 magazines-0.192
 counters-0.186
 envelop-0.183
keeping-0.181
 candles-0.177
Token is
Token ID318
Feature activation-1.80e-02

Pos logit contributions
 loser+0.053
 goose+0.045
 fisherman+0.044
igger+0.043
 coward+0.043
Neg logit contributions
 magazines-0.079
 counters-0.074
 computers-0.074
 candles-0.073
 envelop-0.072
Token not
Token ID407
Feature activation-1.26e-01

Pos logit contributions
 loser+0.025
worm+0.022
 goose+0.022
in+0.022
ah+0.022
Neg logit contributions
 magazines-0.037
 envelop-0.035
 counters-0.035
keeping-0.034
 candles-0.034
Token the
Token ID262
Feature activation-1.41e-02

Pos logit contributions
 loser+0.185
worm+0.163
 leader+0.161
 goose+0.161
 winner+0.161
Neg logit contributions
 magazines-0.255
 counters-0.246
 envelop-0.243
keeping-0.238
 candles-0.232
Token easiest
Token ID16638
Feature activation-1.36e-01

Pos logit contributions
in+0.014
 loser+0.014
zag+0.014
ah+0.013
igger+0.013
Neg logit contributions
 magazines-0.033
 envelop-0.032
 counters-0.030
keeping-0.030
 candles-0.030
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.186
ah+0.170
in+0.167
 coward+0.162
igger+0.162
Neg logit contributions
 magazines-0.283
 envelop-0.269
keeping-0.266
 counters-0.264
 computers-0.259
Token.
Token ID13
Feature activation-1.97e-02

Pos logit contributions
 loser+0.464
zag+0.397
 coward+0.393
igger+0.392
ah+0.392
Neg logit contributions
 magazines-0.445
 candles-0.411
 envelop-0.407
 presents-0.406
 album-0.402
Token
Token ID220
Feature activation-8.35e-04

Pos logit contributions
 loser+0.027
 leader+0.024
 winner+0.024
worm+0.024
 frog+0.024
Neg logit contributions
 magazines-0.041
 counters-0.040
 envelop-0.040
keeping-0.039
 candles-0.038
Token
Token ID198
Feature activation-5.36e-02

Pos logit contributions
 loser+0.001
 winner+0.001
 goose+0.001
worm+0.001
 leader+0.001
Neg logit contributions
 magazines-0.002
 counters-0.002
 envelop-0.002
keeping-0.002
 computers-0.002
Token
Token ID198
Feature activation-5.38e-02

Pos logit contributions
 loser+0.075
 winner+0.066
 problem+0.066
 leader+0.064
 coward+0.063
Neg logit contributions
 magazines-0.112
 counters-0.108
 envelop-0.108
keeping-0.105
 computers-0.101
TokenAt
Token ID2953
Feature activation-8.91e-03

Pos logit contributions
 loser+0.088
 problem+0.082
 winner+0.078
 coward+0.078
worm+0.077
Neg logit contributions
 magazines-0.098
 envelop-0.094
 counters-0.093
keeping-0.089
 computers-0.087
Token decided
Token ID3066
Feature activation+5.22e-02

Pos logit contributions
 counters+0.123
 magazines+0.123
 envelop+0.117
keeping+0.117
 candles+0.112
Neg logit contributions
 loser-0.076
 leader-0.075
 winner-0.073
 problem-0.069
in-0.068
Token to
Token ID284
Feature activation+2.08e-02

Pos logit contributions
 magazines+0.121
 counters+0.121
 envelop+0.117
keeping+0.115
 candles+0.114
Neg logit contributions
 loser-0.058
 leader-0.053
 winner-0.051
worm-0.050
 problem-0.050
Token take
Token ID1011
Feature activation-4.20e-02

Pos logit contributions
 magazines+0.044
 counters+0.044
 envelop+0.042
keeping+0.042
 candles+0.041
Neg logit contributions
 loser-0.028
 leader-0.026
 winner-0.025
 frog-0.024
worm-0.024
Token the
Token ID262
Feature activation-1.45e-02

Pos logit contributions
 loser+0.059
 best+0.057
 leader+0.056
 winner+0.055
 lesson+0.055
Neg logit contributions
 counters-0.083
keeping-0.079
 magazines-0.078
 candles-0.075
 envelop-0.075
Token youngest
Token ID18887
Feature activation-1.07e-01

Pos logit contributions
 loser+0.023
 winner+0.021
 leader+0.021
 problem+0.021
worm+0.020
Neg logit contributions
 magazines-0.027
 counters-0.027
keeping-0.026
 envelop-0.026
 candles-0.025
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.175
 leader+0.158
 winner+0.158
worm+0.154
 frog+0.154
Neg logit contributions
 magazines-0.199
 counters-0.195
 envelop-0.189
keeping-0.187
 candles-0.181
Token with
Token ID351
Feature activation-3.07e-02

Pos logit contributions
 loser+0.367
 coward+0.319
 frog+0.312
 leader+0.310
 goose+0.308
Neg logit contributions
 magazines-0.552
 counters-0.529
 envelop-0.525
keeping-0.510
 candles-0.504
Token them
Token ID606
Feature activation-2.22e-02

Pos logit contributions
 loser+0.045
 leader+0.043
 winner+0.041
 lesson+0.040
 best+0.040
Neg logit contributions
 counters-0.061
 magazines-0.058
keeping-0.057
 envelop-0.055
 candles-0.054
Token on
Token ID319
Feature activation-3.66e-02

Pos logit contributions
 loser+0.024
 leader+0.022
 winner+0.022
worm+0.021
 best+0.021
Neg logit contributions
 counters-0.051
 magazines-0.051
 envelop-0.050
keeping-0.049
 candles-0.048
Token a
Token ID257
Feature activation+2.43e-02

Pos logit contributions
 loser+0.055
 leader+0.050
 winner+0.048
 frog+0.047
 lesson+0.047
Neg logit contributions
 counters-0.073
 magazines-0.071
keeping-0.069
 envelop-0.068
 candles-0.066
Token fun
Token ID1257
Feature activation-4.41e-03

Pos logit contributions
 magazines+0.050
 counters+0.050
keeping+0.048
 envelop+0.048
 candles+0.046
Neg logit contributions
 loser-0.034
 leader-0.031
 winner-0.030
worm-0.030
 problem-0.030
Tokenâ
Token ID22940
Feature activation-2.33e-02

Pos logit contributions
 loser+0.040
 winner+0.036
 frog+0.035
 leader+0.035
ah+0.034
Neg logit contributions
 magazines-0.043
 counters-0.042
 envelop-0.041
keeping-0.040
 candles-0.040
Token
Token ID26391
Feature activation+5.24e-03

Pos logit contributions
 loser+0.037
 leader+0.034
 winner+0.034
worm+0.033
 frog+0.033
Neg logit contributions
 magazines-0.044
 counters-0.043
 envelop-0.042
keeping-0.041
 candles-0.040
Token He
Token ID679
Feature activation-2.41e-02

Pos logit contributions
 magazines+0.011
 counters+0.011
 envelop+0.010
keeping+0.010
 candles+0.010
Neg logit contributions
 loser-0.008
 winner-0.007
 leader-0.007
 goose-0.007
ah-0.007
Token pointed
Token ID6235
Feature activation-5.11e-02

Pos logit contributions
 loser+0.034
 goose+0.029
worm+0.029
 coward+0.028
 frog+0.028
Neg logit contributions
 magazines-0.049
 counters-0.048
 envelop-0.047
 candles-0.046
 computers-0.045
Token to
Token ID284
Feature activation+1.56e-02

Pos logit contributions
 loser+0.065
 leader+0.058
 winner+0.058
worm+0.057
 frog+0.057
Neg logit contributions
 magazines-0.111
 counters-0.109
 envelop-0.106
keeping-0.106
 candles-0.102
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 counters+0.031
 magazines+0.030
keeping+0.030
 envelop+0.029
 candles+0.029
Neg logit contributions
 loser-0.022
 goose-0.020
 frog-0.020
 leader-0.020
 lesson-0.020
Token of
Token ID286
Feature activation+1.79e-02

Pos logit contributions
 loser+0.358
 coward+0.320
 leader+0.304
 winner+0.298
 goose+0.291
Neg logit contributions
 magazines-0.552
 counters-0.530
 envelop-0.525
 candles-0.511
 desks-0.503
Token the
Token ID262
Feature activation-1.73e-02

Pos logit contributions
 counters+0.035
 magazines+0.035
keeping+0.034
 envelop+0.033
 candles+0.032
Neg logit contributions
 loser-0.027
 frog-0.024
 goose-0.024
 winner-0.024
 leader-0.024
Token pictures
Token ID5986
Feature activation+1.13e-01

Pos logit contributions
 loser+0.025
 leader+0.023
 winner+0.022
worm+0.022
 frog+0.022
Neg logit contributions
 magazines-0.035
 counters-0.034
 envelop-0.034
keeping-0.033
 candles-0.032
Token on
Token ID319
Feature activation-3.75e-02

Pos logit contributions
 magazines+0.249
 counters+0.245
 envelop+0.240
keeping+0.237
 candles+0.228
Neg logit contributions
 loser-0.142
 winner-0.126
 leader-0.126
worm-0.124
 frog-0.123
Token the
Token ID262
Feature activation-1.70e-02

Pos logit contributions
 loser+0.055
 frog+0.049
 goose+0.049
 leader+0.048
in+0.048
Neg logit contributions
 counters-0.075
 magazines-0.073
keeping-0.072
 envelop-0.070
 candles-0.068
Token that
Token ID326
Feature activation-8.27e-02

Pos logit contributions
 loser+0.158
 leader+0.141
 winner+0.140
worm+0.140
 frog+0.139
Neg logit contributions
 magazines-0.201
 counters-0.196
 envelop-0.192
keeping-0.188
 candles-0.184
Token's
Token ID338
Feature activation-3.97e-02

Pos logit contributions
 loser+0.117
 coward+0.100
 leader+0.099
worm+0.099
 winner+0.098
Neg logit contributions
 magazines-0.173
 counters-0.166
 envelop-0.165
 candles-0.161
 computers-0.158
Token the
Token ID262
Feature activation-1.47e-02

Pos logit contributions
 loser+0.065
 leader+0.059
 winner+0.059
worm+0.059
 frog+0.058
Neg logit contributions
 magazines-0.073
 counters-0.071
 envelop-0.069
keeping-0.068
 candles-0.066
Token most
Token ID749
Feature activation-5.27e-02

Pos logit contributions
 loser+0.019
worm+0.017
in+0.017
 coward+0.016
 leader+0.016
Neg logit contributions
 magazines-0.032
 counters-0.031
 envelop-0.030
 candles-0.030
keeping-0.029
Token important
Token ID1593
Feature activation-1.63e-02

Pos logit contributions
 loser+0.083
 leader+0.073
 winner+0.072
 coward+0.072
worm+0.072
Neg logit contributions
 magazines-0.101
 counters-0.099
 envelop-0.097
keeping-0.094
 candles-0.093
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.019
 coward+0.016
 leader+0.015
in+0.015
 winner+0.015
Neg logit contributions
 magazines-0.038
 counters-0.037
keeping-0.037
 envelop-0.036
 candles-0.036
Token,"
Token ID553
Feature activation+9.76e-02

Pos logit contributions
 loser+0.422
 coward+0.365
 leader+0.354
igger+0.351
 winner+0.344
Neg logit contributions
 magazines-0.497
 candles-0.477
 counters-0.471
 envelop-0.469
keeping-0.453
Token he
Token ID339
Feature activation-1.11e-01

Pos logit contributions
 magazines+0.222
 counters+0.216
 envelop+0.213
keeping+0.212
 candles+0.204
Neg logit contributions
 loser-0.115
 frog-0.103
 leader-0.101
worm-0.101
 winner-0.100
Token said
Token ID531
Feature activation-1.31e-02

Pos logit contributions
 loser+0.213
 coward+0.182
 winner+0.181
 fastest+0.178
 goose+0.178
Neg logit contributions
 magazines-0.174
 envelop-0.174
 counters-0.171
 candles-0.163
 illustrations-0.157
Token.
Token ID13
Feature activation-2.86e-02

Pos logit contributions
 loser+0.017
 leader+0.015
worm+0.014
 winner+0.014
 frog+0.014
Neg logit contributions
 magazines-0.029
 counters-0.028
 envelop-0.028
keeping-0.027
 candles-0.027
Token\n
Token ID198
Feature activation-5.64e-02

Pos logit contributions
 loser+0.041
 leader+0.037
 frog+0.037
 winner+0.037
worm+0.036
Neg logit contributions
 magazines-0.058
 counters-0.056
 envelop-0.055
keeping-0.055
 candles-0.053
Token the
Token ID262
Feature activation-1.19e-02

Pos logit contributions
 loser+0.001
 leader+0.000
 winner+0.000
 frog+0.000
worm+0.000
Neg logit contributions
 magazines-0.001
 counters-0.001
 envelop-0.001
keeping-0.001
 candles-0.001
Token elephant
Token ID20950
Feature activation-9.39e-02

Pos logit contributions
 loser+0.020
in+0.018
worm+0.018
 coward+0.018
 goose+0.017
Neg logit contributions
 magazines-0.021
 envelop-0.020
 counters-0.020
 candles-0.020
keeping-0.020
Token was
Token ID373
Feature activation+4.96e-03

Pos logit contributions
 loser+0.125
 winner+0.106
 leader+0.106
 coward+0.105
worm+0.104
Neg logit contributions
 magazines-0.203
 counters-0.195
 envelop-0.194
keeping-0.190
 candles-0.187
Token a
Token ID257
Feature activation+2.44e-02

Pos logit contributions
 magazines+0.009
 envelop+0.009
 counters+0.009
keeping+0.008
 candles+0.008
Neg logit contributions
 loser-0.008
worm-0.007
in-0.007
 goose-0.007
 winner-0.007
Token friendly
Token ID8030
Feature activation+3.38e-03

Pos logit contributions
 magazines+0.049
 envelop+0.049
 brand+0.048
 counters+0.046
 candles+0.046
Neg logit contributions
 loser-0.031
 neither-0.028
in-0.028
worm-0.028
igger-0.028
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 magazines+0.007
 envelop+0.007
 counters+0.007
keeping+0.007
 candles+0.007
Neg logit contributions
 loser-0.004
worm-0.004
in-0.004
 coward-0.004
 winner-0.004
Token.
Token ID13
Feature activation-2.32e-02

Pos logit contributions
 loser+0.450
 coward+0.409
 leader+0.371
Only+0.366
igger+0.364
Neg logit contributions
 magazines-0.454
 envelop-0.434
 candles-0.428
 album-0.413
 counters-0.412
Token\n
Token ID198
Feature activation-5.47e-02

Pos logit contributions
 loser+0.030
 frog+0.028
 winner+0.027
 leader+0.027
worm+0.027
Neg logit contributions
 magazines-0.049
 counters-0.048
 envelop-0.047
keeping-0.046
 candles-0.046
Token\n
Token ID198
Feature activation-5.53e-02

Pos logit contributions
 loser+0.083
 leader+0.074
worm+0.073
 winner+0.073
 frog+0.073
Neg logit contributions
 magazines-0.108
 counters-0.106
 envelop-0.103
keeping-0.102
 candles-0.099
Tokenâ
Token ID22940
Feature activation-2.15e-02

Pos logit contributions
 loser+0.081
 leader+0.071
 winner+0.071
worm+0.070
 frog+0.070
Neg logit contributions
 magazines-0.112
 counters-0.110
 envelop-0.108
keeping-0.105
 candles-0.103
Token
Token ID26391
Feature activation+7.40e-03

Pos logit contributions
 loser+0.033
 leader+0.030
 winner+0.030
ah+0.030
worm+0.030
Neg logit contributions
 magazines-0.042
 counters-0.041
 envelop-0.040
keeping-0.039
 candles-0.038
Token landed
Token ID11406
Feature activation-3.52e-02

Pos logit contributions
 loser+0.085
worm+0.074
 fastest+0.073
 coward+0.073
 leader+0.072
Neg logit contributions
 magazines-0.074
 envelop-0.072
 candles-0.070
 counters-0.068
keeping-0.068
Token the
Token ID262
Feature activation-1.50e-02

Pos logit contributions
 loser+0.046
 leader+0.039
ah+0.039
worm+0.039
 winner+0.039
Neg logit contributions
 magazines-0.077
 counters-0.076
 envelop-0.074
keeping-0.072
 candles-0.072
Token little
Token ID1310
Feature activation-7.93e-03

Pos logit contributions
 loser+0.021
 leader+0.018
 winner+0.018
worm+0.018
 frog+0.018
Neg logit contributions
 magazines-0.032
 counters-0.031
 envelop-0.030
keeping-0.030
 candles-0.029
Token girl
Token ID2576
Feature activation-1.75e-02

Pos logit contributions
 loser+0.007
 leader+0.006
worm+0.006
 coward+0.006
 winner+0.006
Neg logit contributions
 magazines-0.021
 counters-0.020
 envelop-0.020
keeping-0.019
 candles-0.019
Token said
Token ID531
Feature activation-5.94e-03

Pos logit contributions
 loser+0.020
 leader+0.019
worm+0.019
 winner+0.019
 frog+0.019
Neg logit contributions
 magazines-0.040
 counters-0.039
 envelop-0.038
keeping-0.038
 candles-0.036
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.008
 leader+0.008
 frog+0.008
worm+0.008
 winner+0.008
Neg logit contributions
 magazines-0.012
 counters-0.012
keeping-0.012
 envelop-0.012
 computers-0.011
Token final
Token ID2457
Feature activation-4.63e-02

Pos logit contributions
 loser+0.373
 coward+0.321
 leader+0.306
 winner+0.303
 frog+0.298
Neg logit contributions
 magazines-0.545
 envelop-0.524
 counters-0.521
 candles-0.513
 illustrations-0.491
Token thank
Token ID5875
Feature activation-1.61e-02

Pos logit contributions
 loser+0.065
in+0.056
 coward+0.056
worm+0.055
 leader+0.055
Neg logit contributions
 magazines-0.096
 envelop-0.093
 counters-0.093
keeping-0.090
 candles-0.089
Token you
Token ID345
Feature activation-1.79e-03

Pos logit contributions
 loser+0.021
in+0.020
 leader+0.020
worm+0.020
 goose+0.020
Neg logit contributions
 magazines-0.034
 counters-0.034
keeping-0.032
 envelop-0.032
 desks-0.031
Token to
Token ID284
Feature activation+1.64e-02

Pos logit contributions
 loser+0.002
 leader+0.002
worm+0.002
 winner+0.002
 frog+0.002
Neg logit contributions
 magazines-0.004
 counters-0.004
 envelop-0.004
keeping-0.004
 candles-0.004
Token the
Token ID262
Feature activation-1.63e-02

Pos logit contributions
 counters+0.033
 magazines+0.033
keeping+0.031
 envelop+0.030
 candles+0.030
Neg logit contributions
 loser-0.023
 leader-0.022
 frog-0.022
 goose-0.021
 friend-0.021
Token!"
Token ID2474
Feature activation+5.06e-02

Pos logit contributions
 magazines+0.000
 counters+0.000
 envelop+0.000
keeping+0.000
 candles+0.000
Neg logit contributions
 loser-0.000
worm-0.000
 leader-0.000
 winner-0.000
 coward-0.000
Token\n
Token ID198
Feature activation-5.20e-02

Pos logit contributions
 magazines+0.103
 counters+0.101
 envelop+0.099
keeping+0.098
 candles+0.096
Neg logit contributions
 loser-0.067
 frog-0.064
 leader-0.063
Only-0.062
 goose-0.062
Token\n
Token ID198
Feature activation-5.20e-02

Pos logit contributions
 loser+0.066
 winner+0.057
 leader+0.055
 problem+0.055
worm+0.055
Neg logit contributions
 magazines-0.117
 counters-0.114
 envelop-0.113
keeping-0.111
 candles-0.107
TokenThe
Token ID464
Feature activation+8.59e-03

Pos logit contributions
 loser+0.075
 problem+0.066
 winner+0.066
 coward+0.065
worm+0.065
Neg logit contributions
 magazines-0.107
 counters-0.104
 envelop-0.102
keeping-0.099
 computers-0.097
Token tall
Token ID7331
Feature activation+3.29e-02

Pos logit contributions
 magazines+0.020
 counters+0.019
 candles+0.018
 desks+0.018
 envelop+0.018
Neg logit contributions
 loser-0.010
ah-0.009
zag-0.009
 coward-0.009
igger-0.009
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 magazines+0.079
 counters+0.077
 envelop+0.075
 desks+0.074
 candles+0.074
Neg logit contributions
 loser-0.037
 coward-0.032
zag-0.030
 fastest-0.029
 winner-0.029
Token stood
Token ID6204
Feature activation-2.60e-02

Pos logit contributions
 loser+0.449
 coward+0.395
 fastest+0.367
 winner+0.364
 leader+0.361
Neg logit contributions
 magazines-0.462
 counters-0.453
 envelop-0.447
 candles-0.443
 illustrations-0.418
Token on
Token ID319
Feature activation-3.70e-02

Pos logit contributions
 loser+0.032
 leader+0.028
worm+0.028
 frog+0.028
 winner+0.028
Neg logit contributions
 magazines-0.059
 counters-0.057
 envelop-0.056
keeping-0.056
 candles-0.054
Token one
Token ID530
Feature activation-2.61e-01

Pos logit contributions
 loser+0.053
 frog+0.049
 leader+0.048
 goose+0.047
 winner+0.047
Neg logit contributions
 counters-0.075
 magazines-0.074
keeping-0.071
 envelop-0.069
 candles-0.068
Token leg
Token ID1232
Feature activation-7.49e-02

Pos logit contributions
 loser+0.359
 coward+0.327
 leader+0.301
 winner+0.301
 goose+0.297
Neg logit contributions
 magazines-0.552
 counters-0.540
 envelop-0.536
 candles-0.515
 desks-0.506
Token and
Token ID290
Feature activation+1.59e-02

Pos logit contributions
 loser+0.096
 leader+0.084
worm+0.083
 winner+0.083
 frog+0.083
Neg logit contributions
 magazines-0.164
 counters-0.161
 envelop-0.158
keeping-0.155
 candles-0.152
Token variety
Token ID4996
Feature activation+6.01e-02

Pos logit contributions
 magazines+0.050
 counters+0.049
keeping+0.048
 envelop+0.048
 candles+0.046
Neg logit contributions
 loser-0.033
 winner-0.029
 frog-0.029
 leader-0.029
worm-0.029
Token of
Token ID286
Feature activation+1.71e-02

Pos logit contributions
 magazines+0.134
 counters+0.132
 envelop+0.129
keeping+0.126
 candles+0.123
Neg logit contributions
 loser-0.073
worm-0.066
 leader-0.065
 frog-0.064
 winner-0.064
Token colors
Token ID7577
Feature activation+3.38e-04

Pos logit contributions
 counters+0.035
keeping+0.034
amas+0.034
etti+0.034
 magazines+0.034
Neg logit contributions
 lesson-0.022
 frog-0.022
 best-0.021
 thing-0.021
 goose-0.021
Token.
Token ID13
Feature activation-2.48e-02

Pos logit contributions
 magazines+0.001
 counters+0.001
 envelop+0.001
keeping+0.001
 computers+0.001
Neg logit contributions
 loser-0.000
worm-0.000
 leader-0.000
 frog-0.000
 winner-0.000
Token Every
Token ID3887
Feature activation-4.80e-02

Pos logit contributions
 loser+0.041
worm+0.038
 leader+0.038
 frog+0.038
 goose+0.038
Neg logit contributions
 magazines-0.045
 counters-0.043
 envelop-0.042
keeping-0.041
 candles-0.040
Token one
Token ID530
Feature activation-2.60e-01

Pos logit contributions
 loser+0.051
 leader+0.041
 coward+0.041
 winner+0.040
worm+0.040
Neg logit contributions
 magazines-0.117
 counters-0.113
 envelop-0.112
keeping-0.110
 candles-0.110
Token of
Token ID286
Feature activation+1.76e-02

Pos logit contributions
 loser+0.358
 coward+0.307
igger+0.279
zag+0.276
 fastest+0.271
Neg logit contributions
 magazines-0.562
 candles-0.545
 envelop-0.544
 counters-0.531
 presents-0.523
Token the
Token ID262
Feature activation-1.88e-02

Pos logit contributions
 magazines+0.024
 envelop+0.023
 candles+0.022
keeping+0.022
 counters+0.022
Neg logit contributions
 loser-0.039
 winner-0.035
 coward-0.035
 leader-0.035
 fastest-0.034
Token c
Token ID269
Feature activation+6.81e-02

Pos logit contributions
 loser+0.027
 frog+0.025
 leader+0.025
 winner+0.025
worm+0.024
Neg logit contributions
 magazines-0.038
 counters-0.037
 envelop-0.036
keeping-0.036
 computers-0.034
Tokenact
Token ID529
Feature activation-2.27e-02

Pos logit contributions
 magazines+0.155
 counters+0.151
 envelop+0.151
keeping+0.146
 computers+0.144
Neg logit contributions
 loser-0.081
worm-0.073
in-0.072
 frog-0.072
 leader-0.072
Tokeni
Token ID72
Feature activation-9.93e-02

Pos logit contributions
 loser+0.036
 winner+0.032
 leader+0.032
 coward+0.031
 lesson+0.031
Neg logit contributions
 magazines-0.044
 counters-0.043
keeping-0.042
 envelop-0.042
 candles-0.041
Token very
Token ID845
Feature activation+9.29e-03

Pos logit contributions
keeping+0.018
 counters+0.018
 magazines+0.018
 envelop+0.017
 illustrations+0.017
Neg logit contributions
 best-0.009
 loser-0.008
 only-0.008
 leader-0.008
 frog-0.008
Token young
Token ID1862
Feature activation+8.78e-03

Pos logit contributions
 magazines+0.018
 counters+0.018
 envelop+0.017
 candles+0.017
 desks+0.017
Neg logit contributions
 loser-0.014
ah-0.012
worm-0.012
 winner-0.012
 leader-0.012
Token,
Token ID11
Feature activation-2.31e-03

Pos logit contributions
 magazines+0.019
 counters+0.018
 envelop+0.018
keeping+0.018
 candles+0.017
Neg logit contributions
 loser-0.012
 leader-0.010
 winner-0.010
worm-0.010
 frog-0.010
Token only
Token ID691
Feature activation-2.33e-01

Pos logit contributions
 loser+0.002
 frog+0.002
 winner+0.002
 goose+0.002
 leader+0.002
Neg logit contributions
 counters-0.006
 magazines-0.005
 envelop-0.005
keeping-0.005
 candles-0.005
Token three
Token ID1115
Feature activation-5.80e-02

Pos logit contributions
 loser+0.309
 leader+0.271
 winner+0.270
worm+0.268
 frog+0.266
Neg logit contributions
 magazines-0.504
 counters-0.494
 envelop-0.484
keeping-0.477
 candles-0.465
Token years
Token ID812
Feature activation+1.88e-01

Pos logit contributions
 loser+0.133
 goose+0.126
 leader+0.126
in+0.125
 problem+0.125
Neg logit contributions
 magazines-0.064
 counters-0.062
keeping-0.062
 envelop-0.061
 candles-0.055
Token old
Token ID1468
Feature activation+4.39e-02

Pos logit contributions
keeping+0.418
 counters+0.410
 magazines+0.409
 envelop+0.403
 illustrations+0.391
Neg logit contributions
 problem-0.211
 loser-0.208
 leader-0.208
worm-0.203
 goose-0.198
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
 counters+0.104
 magazines+0.102
 envelop+0.101
keeping+0.101
 candles+0.098
Neg logit contributions
 loser-0.046
 frog-0.042
 goose-0.041
 leader-0.041
 problem-0.040
Token Sarah
Token ID10490
Feature activation+1.58e-01

Pos logit contributions
 loser+0.011
 moment+0.011
 best+0.011
 lesson+0.010
 friend+0.010
Neg logit contributions
keeping-0.027
 counters-0.027
 candles-0.026
 envelop-0.026
 magazines-0.025
Token liked
Token ID8288
Feature activation+3.60e-02

Pos logit contributions
 magazines+0.339
 counters+0.336
keeping+0.333
 envelop+0.331
 candles+0.315
Neg logit contributions
 loser-0.191
 leader-0.183
in-0.180
 goose-0.180
 frog-0.179
Token to
Token ID284
Feature activation+2.35e-02

Pos logit contributions
 counters+0.094
keeping+0.090
 envelop+0.090
etti+0.089
 magazines+0.086
Neg logit contributions
 goose-0.027
 frog-0.026
 problem-0.024
 lesson-0.024
 friend-0.024
Token found
Token ID1043
Feature activation+9.50e-02

Pos logit contributions
 loser+0.047
 leader+0.043
 winner+0.042
 frog+0.041
in+0.040
Neg logit contributions
 magazines-0.074
 counters-0.073
keeping-0.071
 envelop-0.071
 candles-0.068
Token a
Token ID257
Feature activation+3.91e-02

Pos logit contributions
 magazines+0.169
 counters+0.168
keeping+0.161
 envelop+0.160
 illustrations+0.152
Neg logit contributions
 loser-0.162
 frog-0.147
 leader-0.146
 goose-0.146
 winner-0.146
Token lemon
Token ID18873
Feature activation+2.86e-03

Pos logit contributions
 magazines+0.084
 counters+0.081
 envelop+0.081
 candles+0.078
keeping+0.077
Neg logit contributions
 loser-0.052
 leader-0.045
worm-0.044
 winner-0.044
ah-0.044
Token in
Token ID287
Feature activation-1.79e-02

Pos logit contributions
 magazines+0.005
 counters+0.005
 envelop+0.005
 candles+0.005
keeping+0.005
Neg logit contributions
 loser-0.005
 leader-0.004
 coward-0.004
 bet-0.004
 winner-0.004
Token the
Token ID262
Feature activation-5.35e-04

Pos logit contributions
 loser+0.028
 frog+0.025
 leader+0.025
worm+0.025
 goose+0.025
Neg logit contributions
 magazines-0.034
 counters-0.034
 envelop-0.032
keeping-0.032
 candles-0.031
Token kitchen
Token ID9592
Feature activation+1.63e-01

Pos logit contributions
 loser+0.001
 frog+0.001
 winner+0.001
 goose+0.001
 leader+0.001
Neg logit contributions
 magazines-0.001
 counters-0.001
keeping-0.001
 envelop-0.001
 illustrations-0.001
Token.
Token ID13
Feature activation-4.62e-03

Pos logit contributions
 magazines+0.361
 counters+0.351
 envelop+0.346
keeping+0.344
 candles+0.330
Neg logit contributions
 loser-0.202
 leader-0.182
 winner-0.178
 frog-0.177
 goose-0.175
Token They
Token ID1119
Feature activation+4.98e-02

Pos logit contributions
 loser+0.008
 frog+0.008
 leader+0.008
 winner+0.008
 goose+0.008
Neg logit contributions
 magazines-0.008
 counters-0.007
 envelop-0.007
keeping-0.007
 candles-0.007
Token both
Token ID1111
Feature activation-9.62e-02

Pos logit contributions
 magazines+0.114
keeping+0.114
 counters+0.111
 envelop+0.109
 illustrations+0.105
Neg logit contributions
 loser-0.054
 frog-0.052
 leader-0.051
 neither-0.050
 goose-0.049
Token wanted
Token ID2227
Feature activation+7.07e-03

Pos logit contributions
 loser+0.118
 leader+0.104
 winner+0.102
worm+0.102
 frog+0.102
Neg logit contributions
 magazines-0.218
 counters-0.211
 envelop-0.208
keeping-0.206
 candles-0.199
Token to
Token ID284
Feature activation+2.86e-02

Pos logit contributions
 counters+0.016
 magazines+0.016
keeping+0.015
 envelop+0.015
 illustrations+0.014
Neg logit contributions
 loser-0.009
 frog-0.008
 goose-0.008
 problem-0.008
 winner-0.008
Token
Token ID220
Feature activation+6.41e-03

Pos logit contributions
 loser+0.013
 frog+0.013
Only+0.013
 leader+0.013
 lesson+0.013
Neg logit contributions
 magazines-0.017
 envelop-0.017
 counters-0.016
keeping-0.016
 computers-0.016
Token\n
Token ID198
Feature activation-4.14e-02

Pos logit contributions
 magazines+0.016
 counters+0.016
 envelop+0.016
keeping+0.015
 candles+0.015
Neg logit contributions
 loser-0.006
 leader-0.005
worm-0.005
 frog-0.005
 winner-0.005
Token\n
Token ID198
Feature activation-4.12e-02

Pos logit contributions
 loser+0.040
 leader+0.034
worm+0.033
 frog+0.033
 winner+0.033
Neg logit contributions
 magazines-0.104
 counters-0.101
 envelop-0.100
keeping-0.098
 candles-0.097
Tokenâ
Token ID22940
Feature activation-1.73e-02

Pos logit contributions
 loser+0.048
 leader+0.043
worm+0.042
 winner+0.042
in+0.042
Neg logit contributions
 magazines-0.095
 counters-0.092
 envelop-0.091
keeping-0.090
 candles-0.087
Token
Token ID26391
Feature activation+1.24e-02

Pos logit contributions
 loser+0.027
ny+0.025
ah+0.025
 leader+0.025
 winner+0.024
Neg logit contributions
 magazines-0.033
 envelop-0.032
 counters-0.031
 candles-0.031
 desks-0.031
Token
Token ID129
Feature activation+1.71e-01

Pos logit contributions
 magazines+0.024
 envelop+0.023
 counters+0.022
 candles+0.022
 videos+0.022
Neg logit contributions
 loser-0.018
zag-0.018
 am-0.017
 leader-0.017
 moment-0.016
Token
Token ID241
Feature activation-5.40e-02

Pos logit contributions
 magazines+0.321
 counters+0.305
 envelop+0.304
keeping+0.299
 candles+0.296
Neg logit contributions
 loser-0.260
 leader-0.255
 goose-0.249
 lesson-0.243
zag-0.242
TokenM
Token ID44
Feature activation-6.21e-02

Pos logit contributions
 loser+0.067
 leader+0.059
 winner+0.058
worm+0.058
 frog+0.058
Neg logit contributions
 magazines-0.120
 counters-0.118
 envelop-0.115
keeping-0.114
 candles-0.111
Tokenum
Token ID388
Feature activation+1.46e-02

Pos logit contributions
 loser+0.098
 leader+0.088
worm+0.087
 winner+0.087
 frog+0.087
Neg logit contributions
 magazines-0.118
 counters-0.116
 envelop-0.112
keeping-0.110
 candles-0.109
Token,
Token ID11
Feature activation-4.18e-03

Pos logit contributions
 magazines+0.035
 counters+0.034
keeping+0.034
 envelop+0.033
 desks+0.032
Neg logit contributions
 loser-0.015
 frog-0.015
in-0.014
ah-0.014
 winner-0.014
Token what
Token ID644
Feature activation+1.00e-02

Pos logit contributions
 loser+0.006
 frog+0.006
 goose+0.006
 problem+0.006
 winner+0.006
Neg logit contributions
 magazines-0.008
keeping-0.008
 counters-0.008
 envelop-0.008
 desks-0.007
Token for
Token ID329
Feature activation+2.81e-02

Pos logit contributions
 loser+0.141
 winner+0.123
 coward+0.123
 leader+0.123
 goose+0.122
Neg logit contributions
 magazines-0.185
 counters-0.177
 envelop-0.177
keeping-0.172
 candles-0.169
Token being
Token ID852
Feature activation-7.36e-02

Pos logit contributions
 counters+0.057
 magazines+0.055
 desks+0.052
 illustrations+0.050
 computers+0.050
Neg logit contributions
 loser-0.041
 leader-0.040
 frog-0.040
 best-0.039
 first-0.039
Token so
Token ID523
Feature activation+3.31e-02

Pos logit contributions
 loser+0.125
 winner+0.118
 leader+0.117
 best+0.116
in+0.116
Neg logit contributions
 counters-0.134
 magazines-0.133
 envelop-0.124
keeping-0.124
 desks-0.122
Token kind
Token ID1611
Feature activation-1.56e-02

Pos logit contributions
 magazines+0.078
 counters+0.076
 envelop+0.075
keeping+0.074
 candles+0.071
Neg logit contributions
 loser-0.039
 leader-0.035
 winner-0.034
worm-0.033
 frog-0.033
Token.
Token ID13
Feature activation-1.57e-02

Pos logit contributions
 loser+0.023
 winner+0.020
worm+0.020
 leader+0.020
 coward+0.019
Neg logit contributions
 magazines-0.032
keeping-0.031
 counters-0.031
 envelop-0.030
 candles-0.029
Token From
Token ID3574
Feature activation+1.71e-01

Pos logit contributions
 loser+0.028
 frog+0.027
worm+0.026
 leader+0.026
 goose+0.026
Neg logit contributions
 magazines-0.026
 counters-0.025
 envelop-0.024
 candles-0.023
 computers-0.023
Token that
Token ID326
Feature activation-7.38e-02

Pos logit contributions
 magazines+0.333
 counters+0.324
 envelop+0.318
keeping+0.312
 candles+0.304
Neg logit contributions
 loser-0.266
 leader-0.237
 winner-0.236
worm-0.234
 frog-0.232
Token day
Token ID1110
Feature activation-6.43e-02

Pos logit contributions
 loser+0.103
 coward+0.096
worm+0.092
 goose+0.090
 fisherman+0.088
Neg logit contributions
 magazines-0.150
 envelop-0.142
 Emmy-0.139
 videos-0.138
 wrapping-0.134
Token on
Token ID319
Feature activation-3.00e-02

Pos logit contributions
 loser+0.093
worm+0.089
 frog+0.087
 goose+0.087
 leader+0.086
Neg logit contributions
 magazines-0.125
 envelop-0.120
 counters-0.118
 candles-0.114
keeping-0.114
Token,
Token ID11
Feature activation-7.70e-03

Pos logit contributions
 loser+0.042
 leader+0.037
 winner+0.037
 coward+0.036
worm+0.036
Neg logit contributions
 magazines-0.063
 counters-0.060
 envelop-0.060
keeping-0.058
 candles-0.057
Token the
Token ID262
Feature activation-1.04e-02

Pos logit contributions
 loser+0.013
worm+0.012
 winner+0.012
 leader+0.012
 coward+0.012
Neg logit contributions
 magazines-0.014
 envelop-0.013
 counters-0.013
keeping-0.013
 candles-0.012
Token It
Token ID632
Feature activation+5.77e-02

Pos logit contributions
 loser+0.009
 frog+0.009
 leader+0.008
worm+0.008
 goose+0.008
Neg logit contributions
 magazines-0.013
 counters-0.013
 envelop-0.013
keeping-0.013
 candles-0.013
Token also
Token ID635
Feature activation-1.59e-01

Pos logit contributions
 magazines+0.111
 counters+0.110
 envelop+0.108
 candles+0.106
 computers+0.102
Neg logit contributions
 loser-0.091
 coward-0.075
 goose-0.074
 winner-0.074
 leader-0.073
Token had
Token ID550
Feature activation+5.56e-02

Pos logit contributions
 loser+0.235
 leader+0.216
 winner+0.214
 frog+0.212
worm+0.210
Neg logit contributions
 magazines-0.316
 counters-0.311
 envelop-0.301
keeping-0.300
 candles-0.283
Token a
Token ID257
Feature activation+3.83e-02

Pos logit contributions
 magazines+0.112
 counters+0.112
keeping+0.105
 envelop+0.104
amas+0.100
Neg logit contributions
 leader-0.076
 loser-0.075
 winner-0.073
 best-0.071
 lesson-0.070
Token big
Token ID1263
Feature activation+3.16e-02

Pos logit contributions
 magazines+0.079
 counters+0.078
 envelop+0.076
keeping+0.074
 candles+0.073
Neg logit contributions
 loser-0.054
 leader-0.048
worm-0.047
 winner-0.047
 frog-0.047
Token keyboard
Token ID10586
Feature activation+1.98e-01

Pos logit contributions
 magazines+0.066
 counters+0.065
keeping+0.064
 envelop+0.062
 candles+0.059
Neg logit contributions
 loser-0.045
 winner-0.040
 problem-0.040
 frog-0.040
 leader-0.040
Token where
Token ID810
Feature activation-2.99e-03

Pos logit contributions
keeping+0.449
 magazines+0.448
 counters+0.445
 envelop+0.434
years+0.412
Neg logit contributions
 loser-0.220
 winner-0.215
 leader-0.212
 problem-0.195
zag-0.194
Token one
Token ID530
Feature activation-2.47e-01

Pos logit contributions
 loser+0.004
 leader+0.003
worm+0.003
 frog+0.003
 winner+0.003
Neg logit contributions
 counters-0.006
 magazines-0.006
keeping-0.006
 envelop-0.006
 illustrations-0.006
Token could
Token ID714
Feature activation+2.60e-02

Pos logit contributions
 loser+0.358
 leader+0.309
 winner+0.307
 coward+0.307
worm+0.307
Neg logit contributions
 magazines-0.504
 counters-0.491
 envelop-0.484
keeping-0.470
 candles-0.469
Token press
Token ID1803
Feature activation-7.32e-02

Pos logit contributions
 magazines+0.053
 counters+0.052
 envelop+0.051
keeping+0.050
 candles+0.049
Neg logit contributions
 loser-0.037
 leader-0.033
 winner-0.032
worm-0.032
 frog-0.032
Token lots
Token ID6041
Feature activation+8.49e-02

Pos logit contributions
 loser+0.092
 leader+0.083
 problem+0.081
 winner+0.081
ah+0.080
Neg logit contributions
 counters-0.158
 magazines-0.156
keeping-0.156
 envelop-0.150
 candles-0.143
Token them
Token ID606
Feature activation-9.39e-03

Pos logit contributions
 magazines+0.008
 counters+0.008
 candles+0.008
 envelop+0.008
 desks+0.007
Neg logit contributions
 loser-0.010
 winner-0.009
worm-0.009
zag-0.009
 leader-0.009
Token the
Token ID262
Feature activation+6.30e-04

Pos logit contributions
 loser+0.012
 frog+0.011
 lesson+0.011
 goose+0.011
 leader+0.011
Neg logit contributions
keeping-0.020
 counters-0.019
 envelop-0.019
 magazines-0.019
bons-0.018
Token iron
Token ID6953
Feature activation-2.05e-02

Pos logit contributions
 magazines+0.001
 counters+0.001
 envelop+0.001
keeping+0.001
 computers+0.001
Neg logit contributions
 loser-0.001
worm-0.001
ah-0.001
zag-0.001
 coward-0.001
Token was
Token ID373
Feature activation+2.02e-02

Pos logit contributions
 loser+0.027
 leader+0.024
 winner+0.023
worm+0.023
 frog+0.023
Neg logit contributions
 magazines-0.044
 counters-0.044
 envelop-0.043
keeping-0.042
 candles-0.041
Token very
Token ID845
Feature activation+1.95e-02

Pos logit contributions
 counters+0.048
 magazines+0.048
keeping+0.048
 desks+0.046
 envelop+0.045
Neg logit contributions
 loser-0.021
 best-0.020
 leader-0.019
 frog-0.019
 winner-0.019
Token old
Token ID1468
Feature activation+5.00e-02

Pos logit contributions
 magazines+0.041
 counters+0.040
 envelop+0.039
keeping+0.038
 candles+0.038
Neg logit contributions
 loser-0.027
 winner-0.024
 leader-0.024
worm-0.024
 frog-0.023
Token and
Token ID290
Feature activation+3.05e-02

Pos logit contributions
 magazines+0.102
 counters+0.096
 envelop+0.095
keeping+0.094
 candles+0.092
Neg logit contributions
 loser-0.075
 winner-0.067
ah-0.066
worm-0.066
 coward-0.066
Token came
Token ID1625
Feature activation+3.56e-02

Pos logit contributions
keeping+0.068
 counters+0.068
 magazines+0.068
 envelop+0.066
 illustrations+0.063
Neg logit contributions
 loser-0.035
 frog-0.033
 lesson-0.032
 problem-0.032
 best-0.032
Token from
Token ID422
Feature activation+1.17e-02

Pos logit contributions
 magazines+0.081
 counters+0.079
 envelop+0.077
keeping+0.075
 candles+0.075
Neg logit contributions
 loser-0.044
 leader-0.040
 frog-0.039
 winner-0.038
worm-0.038
Token a
Token ID257
Feature activation+3.79e-02

Pos logit contributions
 counters+0.026
 magazines+0.024
keeping+0.023
 illustrations+0.023
etti+0.023
Neg logit contributions
 loser-0.015
 first-0.014
 problem-0.014
 frog-0.014
 leader-0.013
Token ship
Token ID4074
Feature activation+3.50e-02

Pos logit contributions
 magazines+0.079
 counters+0.077
keeping+0.075
 envelop+0.075
 candles+0.073
Neg logit contributions
 loser-0.054
 leader-0.048
 frog-0.047
 winner-0.047
 goose-0.047
Token.
Token ID13
Feature activation-7.10e-04

Pos logit contributions
 loser+0.084
 leader+0.074
 winner+0.072
 frog+0.072
 problem+0.070
Neg logit contributions
 magazines-0.147
 counters-0.146
 envelop-0.143
keeping-0.139
 computers-0.135
Token She
Token ID1375
Feature activation+6.56e-02

Pos logit contributions
 loser+0.001
 frog+0.001
 winner+0.001
 leader+0.001
 coward+0.001
Neg logit contributions
 magazines-0.001
 counters-0.001
 envelop-0.001
keeping-0.001
 candles-0.001
Token helped
Token ID4193
Feature activation-1.44e-02

Pos logit contributions
 magazines+0.148
 counters+0.145
 envelop+0.143
keeping+0.140
 candles+0.138
Neg logit contributions
 loser-0.080
 leader-0.069
worm-0.068
 winner-0.068
 frog-0.067
Token me
Token ID502
Feature activation+3.93e-02

Pos logit contributions
 loser+0.024
 leader+0.022
 frog+0.022
 goose+0.022
worm+0.021
Neg logit contributions
 magazines-0.026
 counters-0.025
 envelop-0.024
keeping-0.024
 candles-0.023
Token make
Token ID787
Feature activation-3.75e-02

Pos logit contributions
 magazines+0.081
 counters+0.079
 envelop+0.077
keeping+0.076
 candles+0.073
Neg logit contributions
 loser-0.056
 leader-0.050
 frog-0.050
 winner-0.049
worm-0.049
Token a
Token ID257
Feature activation+4.30e-02

Pos logit contributions
 loser+0.055
 best+0.054
 frog+0.053
in+0.053
 leader+0.052
Neg logit contributions
 magazines-0.069
keeping-0.069
 counters-0.069
 envelop-0.065
 computers-0.063
Token rainbow
Token ID27223
Feature activation+2.49e-02

Pos logit contributions
 magazines+0.083
 counters+0.081
keeping+0.079
 envelop+0.078
 computers+0.074
Neg logit contributions
 loser-0.067
 leader-0.061
 winner-0.060
 frog-0.060
worm-0.059
Token cake
Token ID12187
Feature activation+1.97e-02

Pos logit contributions
 magazines+0.052
 counters+0.051
 envelop+0.050
keeping+0.050
 computers+0.048
Neg logit contributions
 loser-0.034
 leader-0.031
 winner-0.031
 frog-0.030
worm-0.030
Token.
Token ID13
Feature activation+3.16e-03

Pos logit contributions
 magazines+0.041
 counters+0.040
 envelop+0.040
keeping+0.039
 candles+0.038
Neg logit contributions
 loser-0.027
 leader-0.024
 winner-0.024
worm-0.024
 frog-0.024
Token Do
Token ID2141
Feature activation+1.33e-01

Pos logit contributions
 magazines+0.006
 counters+0.006
keeping+0.006
 envelop+0.006
 desks+0.005
Neg logit contributions
 loser-0.005
 winner-0.005
 frog-0.004
Only-0.004
 leader-0.004
Token you
Token ID345
Feature activation+1.31e-02

Pos logit contributions
 magazines+0.231
 counters+0.222
 envelop+0.218
keeping+0.211
 candles+0.209
Neg logit contributions
 loser-0.235
 leader-0.212
worm-0.212
 winner-0.212
 frog-0.209
Token.
Token ID13
Feature activation-7.44e-03

Pos logit contributions
 loser+0.027
 leader+0.024
 winner+0.024
worm+0.023
 coward+0.023
Neg logit contributions
 magazines-0.038
 counters-0.036
 envelop-0.035
 candles-0.035
keeping-0.035
Token My
Token ID2011
Feature activation+4.77e-02

Pos logit contributions
 loser+0.013
 frog+0.012
 goose+0.011
 leader+0.011
 winner+0.011
Neg logit contributions
 magazines-0.013
 counters-0.013
keeping-0.013
 envelop-0.013
 desks-0.012
Token shark
Token ID21027
Feature activation-7.80e-02

Pos logit contributions
 magazines+0.105
 envelop+0.100
 counters+0.100
 candles+0.099
keeping+0.097
Neg logit contributions
 loser-0.058
 winner-0.052
worm-0.051
ah-0.050
in-0.050
Token is
Token ID318
Feature activation-3.60e-03

Pos logit contributions
 loser+0.102
 winner+0.086
worm+0.086
 leader+0.085
igger+0.084
Neg logit contributions
 magazines-0.169
 candles-0.162
 counters-0.160
 envelop-0.159
 videos-0.152
Token boring
Token ID14262
Feature activation-3.05e-02

Pos logit contributions
 loser+0.005
worm+0.004
 leader+0.004
 winner+0.004
in+0.004
Neg logit contributions
 magazines-0.007
 counters-0.007
 envelop-0.007
 candles-0.007
keeping-0.007
Token and
Token ID290
Feature activation+3.01e-02

Pos logit contributions
 loser+0.043
 leader+0.038
 winner+0.038
worm+0.038
 frog+0.037
Neg logit contributions
 magazines-0.063
 counters-0.062
 envelop-0.060
keeping-0.059
 candles-0.058
Token gray
Token ID12768
Feature activation-6.13e-03

Pos logit contributions
 magazines+0.063
 counters+0.060
 envelop+0.059
 candles+0.059
 videos+0.058
Neg logit contributions
 loser-0.042
 winner-0.037
 leader-0.037
ah-0.036
zag-0.036
Token,"
Token ID553
Feature activation+1.03e-01

Pos logit contributions
 loser+0.009
 winner+0.008
 leader+0.008
worm+0.008
 coward+0.008
Neg logit contributions
 magazines-0.013
 counters-0.012
 envelop-0.012
keeping-0.012
 candles-0.012
Token Tom
Token ID4186
Feature activation+2.39e-02

Pos logit contributions
keeping+0.204
 counters+0.202
 magazines+0.196
 envelop+0.189
 illustrations+0.183
Neg logit contributions
 loser-0.156
 frog-0.148
 neither-0.143
 goose-0.143
 problem-0.137
Token said
Token ID531
Feature activation-4.15e-03

Pos logit contributions
 magazines+0.037
 counters+0.036
 envelop+0.035
 candles+0.033
 videos+0.032
Neg logit contributions
 loser-0.048
 winner-0.043
worm-0.043
 fastest-0.042
 coward-0.042
Token,
Token ID11
Feature activation+3.57e-06

Pos logit contributions
 loser+0.007
 goose+0.006
 fastest+0.006
 winner+0.006
 hawk+0.006
Neg logit contributions
 counters-0.008
 magazines-0.007
 envelop-0.007
 candles-0.007
 videos-0.007
Token and
Token ID290
Feature activation+2.94e-02

Pos logit contributions
 loser+0.030
 leader+0.026
 frog+0.026
 winner+0.026
worm+0.026
Neg logit contributions
 magazines-0.054
 counters-0.053
 envelop-0.052
keeping-0.051
 candles-0.050
Token smiled
Token ID13541
Feature activation+6.44e-02

Pos logit contributions
 magazines+0.069
 counters+0.068
 envelop+0.067
keeping+0.066
 candles+0.063
Neg logit contributions
 frog-0.032
 loser-0.032
 goose-0.031
 leader-0.030
worm-0.029
Token.
Token ID13
Feature activation-1.88e-03

Pos logit contributions
 magazines+0.126
 counters+0.122
 envelop+0.118
keeping+0.117
 candles+0.116
Neg logit contributions
 loser-0.097
 winner-0.086
worm-0.085
 coward-0.084
 problem-0.083
Token
Token ID198
Feature activation-3.20e-02

Pos logit contributions
 loser+0.003
 frog+0.003
 leader+0.003
 goose+0.003
 winner+0.003
Neg logit contributions
 magazines-0.004
 counters-0.003
 envelop-0.003
keeping-0.003
 candles-0.003
Token
Token ID198
Feature activation-3.12e-02

Pos logit contributions
 loser+0.041
 winner+0.036
 leader+0.035
worm+0.035
 frog+0.034
Neg logit contributions
 magazines-0.071
 counters-0.069
 envelop-0.068
keeping-0.067
 desks-0.065
TokenWhen
Token ID2215
Feature activation+2.82e-02

Pos logit contributions
 loser+0.044
worm+0.039
 leader+0.038
 winner+0.038
 frog+0.038
Neg logit contributions
 magazines-0.065
 counters-0.064
 envelop-0.062
keeping-0.061
 candles-0.059
Token it
Token ID340
Feature activation-2.51e-02

Pos logit contributions
 magazines+0.050
 counters+0.048
 envelop+0.048
 candles+0.046
keeping+0.046
Neg logit contributions
 loser-0.048
worm-0.044
 leader-0.044
 winner-0.043
in-0.042
Token was
Token ID373
Feature activation+2.03e-02

Pos logit contributions
 loser+0.047
 coward+0.040
 leader+0.040
worm+0.040
 winner+0.039
Neg logit contributions
 counters-0.038
 candles-0.037
 magazines-0.036
 brand-0.035
 videos-0.035
Token time
Token ID640
Feature activation-6.07e-02

Pos logit contributions
 magazines+0.048
 counters+0.047
keeping+0.046
 envelop+0.046
 desks+0.044
Neg logit contributions
 loser-0.023
 frog-0.021
 winner-0.020
 leader-0.020
 best-0.020
Token to
Token ID284
Feature activation+3.51e-02

Pos logit contributions
 loser+0.083
 leader+0.073
worm+0.072
 winner+0.072
 frog+0.071
Neg logit contributions
 magazines-0.129
 counters-0.126
 envelop-0.124
keeping-0.122
 candles-0.119
Token go
Token ID467
Feature activation-4.69e-02

Pos logit contributions
 magazines+0.078
 counters+0.077
 envelop+0.075
keeping+0.074
 candles+0.072
Neg logit contributions
 loser-0.044
 leader-0.039
 winner-0.039
 frog-0.038
worm-0.038
Token and
Token ID290
Feature activation+5.12e-02

Pos logit contributions
 loser+0.083
 leader+0.073
worm+0.073
 winner+0.073
 frog+0.071
Neg logit contributions
 magazines-0.117
 counters-0.116
 envelop-0.113
keeping-0.111
 candles-0.109
Token Tom
Token ID4186
Feature activation+5.29e-02

Pos logit contributions
 magazines+0.096
 counters+0.094
 envelop+0.091
keeping+0.090
 candles+0.087
Neg logit contributions
 loser-0.083
 leader-0.075
 winner-0.074
 frog-0.074
worm-0.074
Token looked
Token ID3114
Feature activation+5.06e-02

Pos logit contributions
 magazines+0.102
 counters+0.099
 envelop+0.097
keeping+0.095
 candles+0.093
Neg logit contributions
 loser-0.082
worm-0.074
 winner-0.073
 leader-0.072
 frog-0.072
Token up
Token ID510
Feature activation-5.43e-02

Pos logit contributions
 magazines+0.114
 counters+0.111
keeping+0.110
 envelop+0.110
 candles+0.105
Neg logit contributions
 loser-0.060
 frog-0.055
 leader-0.055
 winner-0.054
 goose-0.054
Token and
Token ID290
Feature activation+4.21e-02

Pos logit contributions
 loser+0.079
 winner+0.069
worm+0.068
 coward+0.068
 leader+0.068
Neg logit contributions
 magazines-0.112
 counters-0.109
 envelop-0.107
keeping-0.105
 candles-0.104
Token saw
Token ID2497
Feature activation+1.23e-01

Pos logit contributions
 magazines+0.091
 counters+0.088
 envelop+0.087
keeping+0.086
 candles+0.083
Neg logit contributions
 loser-0.055
 leader-0.050
 frog-0.050
 winner-0.049
 goose-0.049
Token the
Token ID262
Feature activation+1.18e-02

Pos logit contributions
 magazines+0.220
 counters+0.216
 envelop+0.209
keeping+0.206
 candles+0.197
Neg logit contributions
 loser-0.205
 frog-0.187
worm-0.187
 leader-0.187
 goose-0.186
Token woman
Token ID2415
Feature activation+7.67e-02

Pos logit contributions
 magazines+0.023
 counters+0.023
 envelop+0.022
keeping+0.022
 candles+0.021
Neg logit contributions
 loser-0.018
 frog-0.016
 winner-0.016
 leader-0.016
worm-0.016
Token.
Token ID13
Feature activation+8.16e-03

Pos logit contributions
 magazines+0.157
 counters+0.153
keeping+0.150
 envelop+0.147
 computers+0.144
Neg logit contributions
 loser-0.104
 frog-0.099
worm-0.097
 winner-0.097
 leader-0.097
Token They
Token ID1119
Feature activation+6.56e-02

Pos logit contributions
 magazines+0.015
 counters+0.015
 envelop+0.015
keeping+0.015
 candles+0.014
Neg logit contributions
 loser-0.012
 frog-0.012
 goose-0.012
 leader-0.011
worm-0.011
Token did
Token ID750
Feature activation+6.40e-03

Pos logit contributions
 magazines+0.133
 counters+0.130
 envelop+0.127
keeping+0.126
 candles+0.121
Neg logit contributions
 loser-0.095
 leader-0.087
 frog-0.086
 goose-0.085
 winner-0.084
Token she
Token ID673
Feature activation-3.53e-02

Pos logit contributions
 frog+0.060
 first+0.056
 goose+0.056
 moment+0.055
 one+0.055
Neg logit contributions
 counters-0.130
 magazines-0.127
 envelop-0.125
keeping-0.121
 desks-0.119
Token could
Token ID714
Feature activation+1.91e-02

Pos logit contributions
 loser+0.055
 winner+0.048
 leader+0.048
worm+0.047
 goose+0.047
Neg logit contributions
 magazines-0.067
 counters-0.066
 envelop-0.065
keeping-0.063
 candles-0.063
Token say
Token ID910
Feature activation-1.71e-02

Pos logit contributions
 magazines+0.029
keeping+0.029
 envelop+0.029
 bed+0.028
 counters+0.028
Neg logit contributions
 loser-0.032
in-0.030
worm-0.030
 problem-0.029
 winner-0.029
Token anything
Token ID1997
Feature activation-4.38e-02

Pos logit contributions
 loser+0.030
worm+0.027
 leader+0.027
 winner+0.027
 frog+0.026
Neg logit contributions
 magazines-0.030
 counters-0.028
 envelop-0.028
keeping-0.027
 candles-0.027
Token.
Token ID13
Feature activation-1.18e-02

Pos logit contributions
 loser+0.065
 winner+0.056
 leader+0.055
 coward+0.055
ah+0.054
Neg logit contributions
 magazines-0.089
 counters-0.086
 envelop-0.084
keeping-0.082
 candles-0.082
Token
Token ID198
Feature activation-4.56e-02

Pos logit contributions
 loser+0.014
 leader+0.012
 frog+0.012
 winner+0.012
worm+0.012
Neg logit contributions
 magazines-0.027
 counters-0.026
 envelop-0.026
keeping-0.025
 candles-0.025
Token
Token ID198
Feature activation-4.64e-02

Pos logit contributions
 loser+0.043
 leader+0.037
worm+0.037
in+0.036
 frog+0.036
Neg logit contributions
 magazines-0.114
 counters-0.112
 envelop-0.110
keeping-0.109
 candles-0.107
TokenAs
Token ID1722
Feature activation-1.41e-02

Pos logit contributions
 loser+0.071
 leader+0.064
 winner+0.063
worm+0.063
in+0.062
Neg logit contributions
 magazines-0.091
 counters-0.089
 envelop-0.087
keeping-0.087
 candles-0.084
Token she
Token ID673
Feature activation-3.38e-02

Pos logit contributions
 loser+0.023
 leader+0.021
 winner+0.021
worm+0.020
 frog+0.020
Neg logit contributions
 magazines-0.026
 counters-0.026
 envelop-0.025
keeping-0.025
 candles-0.024
Token continued
Token ID3767
Feature activation-2.49e-02

Pos logit contributions
 loser+0.060
 winner+0.052
 coward+0.052
 leader+0.052
worm+0.052
Neg logit contributions
 magazines-0.059
 counters-0.057
 envelop-0.055
keeping-0.054
 candles-0.054
Token walking
Token ID6155
Feature activation+3.43e-02

Pos logit contributions
 loser+0.044
 winner+0.039
worm+0.039
 leader+0.039
 coward+0.039
Neg logit contributions
 magazines-0.044
 counters-0.042
 envelop-0.041
keeping-0.040
 candles-0.039
Tokenumm
Token ID13929
Feature activation+1.26e-02

Pos logit contributions
 magazines+0.014
 counters+0.013
 envelop+0.013
keeping+0.013
 candles+0.012
Neg logit contributions
 loser-0.017
 frog-0.015
 winner-0.015
 leader-0.015
worm-0.015
Tokeniest
Token ID6386
Feature activation-1.09e-01

Pos logit contributions
 counters+0.026
 candles+0.026
 magazines+0.025
keeping+0.025
 presents+0.025
Neg logit contributions
 loser-0.018
 leader-0.016
 coward-0.016
 bet-0.016
 hawk-0.015
Token treats
Token ID18432
Feature activation+4.64e-02

Pos logit contributions
 loser+0.169
 leader+0.151
worm+0.150
 winner+0.148
 coward+0.147
Neg logit contributions
 magazines-0.213
 counters-0.209
 envelop-0.202
keeping-0.195
 candles-0.194
Token and
Token ID290
Feature activation+2.43e-02

Pos logit contributions
 magazines+0.102
 counters+0.100
 envelop+0.098
keeping+0.097
 candles+0.095
Neg logit contributions
 loser-0.059
 leader-0.051
 winner-0.051
worm-0.051
 frog-0.051
Token sharing
Token ID7373
Feature activation+2.31e-02

Pos logit contributions
 magazines+0.046
 counters+0.045
 envelop+0.044
keeping+0.043
amas+0.043
Neg logit contributions
 frog-0.034
 lesson-0.034
 best-0.033
 goose-0.033
 loser-0.032
Token them
Token ID606
Feature activation-1.42e-02

Pos logit contributions
 magazines+0.044
 counters+0.043
 envelop+0.042
keeping+0.041
 candles+0.040
Neg logit contributions
 loser-0.036
 frog-0.032
 leader-0.032
 winner-0.032
worm-0.032
Token with
Token ID351
Feature activation-2.25e-02

Pos logit contributions
 loser+0.022
 coward+0.020
 winner+0.020
 goose+0.020
ah+0.019
Neg logit contributions
 magazines-0.028
 counters-0.026
 envelop-0.026
keeping-0.026
 candles-0.025
Token one
Token ID530
Feature activation-2.51e-01

Pos logit contributions
 loser+0.031
 frog+0.028
in+0.028
 leader+0.028
 goose+0.027
Neg logit contributions
 counters-0.047
 magazines-0.046
keeping-0.044
 envelop-0.044
 desks-0.042
Token another
Token ID1194
Feature activation-9.55e-02

Pos logit contributions
 loser+0.373
zag+0.339
 coward+0.334
 hawk+0.334
 leader+0.319
Neg logit contributions
 magazines-0.487
 envelop-0.467
 candles-0.467
 presents-0.467
 counters-0.455
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
 loser+0.139
 hawk+0.127
Le+0.125
zag+0.124
worm+0.123
Neg logit contributions
 magazines-0.181
 envelop-0.175
 presents-0.175
 brand-0.174
 counters-0.173
Token As
Token ID1081
Feature activation+6.74e-03

Pos logit contributions
 loser+0.017
 frog+0.016
ny+0.016
 winner+0.016
 lesson+0.016
Neg logit contributions
 counters-0.022
 envelop-0.022
 magazines-0.022
keeping-0.021
 candles-0.021
Token have
Token ID423
Feature activation-2.89e-03

Pos logit contributions
 magazines+0.118
 envelop+0.113
 counters+0.112
 candles+0.110
 desks+0.105
Neg logit contributions
 loser-0.095
worm-0.089
in-0.086
 problem-0.084
 coward-0.084
Token a
Token ID257
Feature activation+4.35e-02

Pos logit contributions
 loser+0.004
 best+0.003
 winner+0.003
 leader+0.003
 lesson+0.003
Neg logit contributions
 counters-0.006
keeping-0.006
 magazines-0.006
 envelop-0.006
 videos-0.006
Token lot
Token ID1256
Feature activation+3.64e-02

Pos logit contributions
 magazines+0.091
 envelop+0.087
 candles+0.084
 presents+0.081
 brand+0.081
Neg logit contributions
worm-0.058
 loser-0.058
 neither-0.055
igger-0.055
in-0.054
Token of
Token ID286
Feature activation+3.53e-02

Pos logit contributions
 magazines+0.076
 counters+0.073
 envelop+0.072
 candles+0.070
keeping+0.070
Neg logit contributions
 loser-0.052
 leader-0.046
worm-0.045
 winner-0.045
 coward-0.045
Token fun
Token ID1257
Feature activation+6.58e-03

Pos logit contributions
 counters+0.077
keeping+0.076
 magazines+0.074
 envelop+0.073
etti+0.070
Neg logit contributions
 loser-0.043
 lesson-0.041
 problem-0.039
 best-0.039
 leader-0.039
Token.
Token ID13
Feature activation+6.31e-03

Pos logit contributions
 magazines+0.014
 counters+0.013
 envelop+0.013
keeping+0.013
 candles+0.013
Neg logit contributions
 loser-0.009
worm-0.008
 winner-0.008
 coward-0.008
 leader-0.008
Token They
Token ID1119
Feature activation+6.17e-02

Pos logit contributions
 magazines+0.011
 counters+0.010
 envelop+0.010
keeping+0.010
 candles+0.009
Neg logit contributions
 loser-0.012
worm-0.011
 leader-0.011
 winner-0.011
in-0.010
Token share
Token ID2648
Feature activation-2.40e-02

Pos logit contributions
 magazines+0.117
 envelop+0.111
 counters+0.110
 candles+0.107
 desks+0.104
Neg logit contributions
 loser-0.094
worm-0.090
in-0.087
 goose-0.085
 problem-0.085
Token everything
Token ID2279
Feature activation-1.70e-02

Pos logit contributions
 loser+0.037
 frog+0.033
 leader+0.033
 goose+0.033
 winner+0.033
Neg logit contributions
 magazines-0.045
 counters-0.045
keeping-0.044
 envelop-0.043
 candles-0.041
Token.
Token ID13
Feature activation+7.18e-03

Pos logit contributions
 loser+0.031
ah+0.028
 coward+0.028
 hawk+0.027
 bet+0.027
Neg logit contributions
 magazines-0.029
 candles-0.029
 counters-0.028
 envelop-0.027
 presents-0.027
Token They
Token ID1119
Feature activation+6.28e-02

Pos logit contributions
 magazines+0.012
 counters+0.012
 envelop+0.012
keeping+0.011
 candles+0.011
Neg logit contributions
 loser-0.013
 leader-0.012
 winner-0.012
 frog-0.012
worm-0.012
Token at
Token ID379
Feature activation-3.74e-02

Pos logit contributions
 magazines+0.047
 counters+0.045
 envelop+0.044
keeping+0.043
 candles+0.043
Neg logit contributions
 loser-0.031
 leader-0.027
 winner-0.027
worm-0.027
 frog-0.027
Token the
Token ID262
Feature activation+7.60e-04

Pos logit contributions
 loser+0.050
 first+0.047
 lesson+0.045
 frog+0.044
 friend+0.044
Neg logit contributions
 counters-0.080
 magazines-0.074
keeping-0.074
 envelop-0.072
 illustrations-0.071
Token desk
Token ID6915
Feature activation+1.51e-01

Pos logit contributions
keeping+0.001
 magazines+0.001
 counters+0.001
 envelop+0.001
etti+0.001
Neg logit contributions
 loser-0.002
 leader-0.002
 problem-0.002
 first-0.002
 winner-0.002
Token,
Token ID11
Feature activation+5.42e-03

Pos logit contributions
 counters+0.381
keeping+0.380
 magazines+0.379
 envelop+0.373
 candles+0.355
Neg logit contributions
 loser-0.137
 leader-0.126
 winner-0.120
 problem-0.118
 frog-0.116
Token she
Token ID673
Feature activation-2.63e-02

Pos logit contributions
keeping+0.013
 counters+0.013
 envelop+0.012
 magazines+0.012
bons+0.012
Neg logit contributions
 loser-0.006
 frog-0.005
 thing-0.005
 goose-0.005
 neither-0.005
Token would
Token ID561
Feature activation+2.62e-03

Pos logit contributions
 loser+0.032
 leader+0.029
 winner+0.028
 frog+0.028
in+0.028
Neg logit contributions
 magazines-0.059
 counters-0.058
keeping-0.058
 envelop-0.057
 candles-0.054
Token play
Token ID711
Feature activation+6.03e-02

Pos logit contributions
 magazines+0.006
keeping+0.006
 counters+0.006
 envelop+0.006
 candles+0.005
Neg logit contributions
 loser-0.003
 leader-0.003
 winner-0.003
 frog-0.003
 goose-0.003
Token with
Token ID351
Feature activation-1.71e-02

Pos logit contributions
keeping+0.134
 counters+0.132
 magazines+0.128
 envelop+0.127
 candles+0.126
Neg logit contributions
 loser-0.073
 thing-0.069
 frog-0.068
 leader-0.068
in-0.065
Token her
Token ID607
Feature activation+1.27e-02

Pos logit contributions
 loser+0.030
 frog+0.028
worm+0.027
in+0.027
 leader+0.027
Neg logit contributions
 counters-0.029
 magazines-0.028
keeping-0.028
 envelop-0.027
 candles-0.026
Token toys
Token ID14958
Feature activation+6.62e-02

Pos logit contributions
 magazines+0.025
 counters+0.024
 envelop+0.024
keeping+0.023
 candles+0.023
Neg logit contributions
 loser-0.020
worm-0.018
 leader-0.017
 winner-0.017
 goose-0.017
Token.
Token ID13
Feature activation-1.69e-03

Pos logit contributions
 magazines+0.140
 counters+0.134
 envelop+0.132
keeping+0.129
 candles+0.128
Neg logit contributions
 loser-0.092
worm-0.080
 leader-0.080
 winner-0.080
 goose-0.080
Token a
Token ID257
Feature activation+4.89e-02

Pos logit contributions
 magazines+0.141
 counters+0.140
keeping+0.135
 videos+0.128
 envelop+0.128
Neg logit contributions
 loser-0.089
 winner-0.081
 lesson-0.081
 leader-0.080
 only-0.079
Token button
Token ID4936
Feature activation-6.05e-02

Pos logit contributions
 magazines+0.095
 envelop+0.094
 counters+0.092
 candles+0.089
 brand+0.088
Neg logit contributions
 loser-0.070
worm-0.066
in-0.065
 leader-0.064
 goose-0.063
Token on
Token ID319
Feature activation-2.10e-02

Pos logit contributions
 loser+0.076
 winner+0.067
 leader+0.066
worm+0.065
 frog+0.065
Neg logit contributions
 magazines-0.135
 counters-0.132
keeping-0.129
 envelop-0.129
 candles-0.122
Token the
Token ID262
Feature activation+6.84e-03

Pos logit contributions
 loser+0.031
 frog+0.028
 leader+0.028
worm+0.027
 winner+0.027
Neg logit contributions
 magazines-0.041
 counters-0.041
keeping-0.039
 envelop-0.039
 candles-0.038
Token side
Token ID1735
Feature activation-8.28e-02

Pos logit contributions
 magazines+0.012
 counters+0.012
keeping+0.012
 envelop+0.011
 illustrations+0.011
Neg logit contributions
 loser-0.012
 leader-0.010
 winner-0.010
 first-0.010
 best-0.010
Token.
Token ID13
Feature activation+7.82e-03

Pos logit contributions
 loser+0.104
 leader+0.091
 winner+0.090
worm+0.090
 frog+0.089
Neg logit contributions
 magazines-0.184
 counters-0.181
 envelop-0.177
keeping-0.174
 candles-0.170
Token She
Token ID1375
Feature activation+7.15e-02

Pos logit contributions
 magazines+0.014
 counters+0.014
 envelop+0.014
keeping+0.014
 candles+0.013
Neg logit contributions
 loser-0.013
 frog-0.011
 coward-0.011
 leader-0.011
 winner-0.011
Token pressed
Token ID12070
Feature activation-7.32e-02

Pos logit contributions
 magazines+0.153
 counters+0.150
 envelop+0.147
keeping+0.144
 candles+0.141
Neg logit contributions
 loser-0.097
 leader-0.085
 winner-0.085
worm-0.084
 frog-0.084
Token it
Token ID340
Feature activation-2.01e-02

Pos logit contributions
 loser+0.114
 leader+0.102
 winner+0.102
worm+0.101
 frog+0.101
Neg logit contributions
 magazines-0.141
 counters-0.138
 envelop-0.134
keeping-0.133
 candles-0.128
Token and
Token ID290
Feature activation+4.03e-02

Pos logit contributions
 loser+0.030
 goose+0.027
 coward+0.027
 fisherman+0.026
worm+0.026
Neg logit contributions
 magazines-0.040
 counters-0.039
 envelop-0.038
 candles-0.037
keeping-0.036
Token heard
Token ID2982
Feature activation+1.12e-01

Pos logit contributions
 magazines+0.081
 counters+0.079
 envelop+0.078
keeping+0.076
 candles+0.074
Neg logit contributions
 loser-0.060
 leader-0.053
 winner-0.053
worm-0.053
 frog-0.052
Token.
Token ID13
Feature activation-1.88e-02

Pos logit contributions
 magazines+0.140
 counters+0.137
 envelop+0.133
keeping+0.129
 candles+0.128
Neg logit contributions
 loser-0.094
 frog-0.084
worm-0.083
 leader-0.083
 goose-0.082
Token
Token ID220
Feature activation+2.35e-03

Pos logit contributions
 loser+0.026
 leader+0.023
 winner+0.023
worm+0.023
 frog+0.023
Neg logit contributions
 magazines-0.039
 counters-0.038
 envelop-0.038
keeping-0.037
 candles-0.036
Token
Token ID198
Feature activation-5.08e-02

Pos logit contributions
 magazines+0.005
 counters+0.005
 envelop+0.005
keeping+0.005
 computers+0.005
Neg logit contributions
 loser-0.003
 winner-0.003
worm-0.003
 leader-0.003
 goose-0.003
Token
Token ID198
Feature activation-5.11e-02

Pos logit contributions
 loser+0.071
 winner+0.065
 problem+0.063
 leader+0.063
worm+0.062
Neg logit contributions
 counters-0.103
 magazines-0.103
 envelop-0.101
keeping-0.100
 computers-0.094
TokenThe
Token ID464
Feature activation+1.25e-02

Pos logit contributions
 loser+0.081
 problem+0.078
 winner+0.075
 leader+0.074
worm+0.074
Neg logit contributions
 magazines-0.091
 counters-0.091
 envelop-0.087
keeping-0.085
 computers-0.082
Token two
Token ID734
Feature activation-1.35e-01

Pos logit contributions
 magazines+0.029
 counters+0.028
 envelop+0.028
 candles+0.027
keeping+0.026
Neg logit contributions
 loser-0.015
worm-0.013
 leader-0.013
in-0.013
 winner-0.013
Token animals
Token ID4695
Feature activation-4.26e-02

Pos logit contributions
 loser+0.176
worm+0.153
 leader+0.152
 winner+0.151
 frog+0.148
Neg logit contributions
 magazines-0.298
 counters-0.289
 envelop-0.285
 candles-0.275
keeping-0.273
Token were
Token ID547
Feature activation+7.17e-02

Pos logit contributions
 loser+0.071
 winner+0.062
 leader+0.062
worm+0.062
 frog+0.061
Neg logit contributions
 magazines-0.078
 counters-0.075
 envelop-0.075
keeping-0.073
 candles-0.071
Token never
Token ID1239
Feature activation-1.11e-01

Pos logit contributions
 magazines+0.139
 counters+0.135
 envelop+0.133
keeping+0.130
 candles+0.128
Neg logit contributions
 loser-0.110
worm-0.099
ah-0.097
 leader-0.096
 winner-0.096
Token friends
Token ID2460
Feature activation-2.81e-02

Pos logit contributions
 loser+0.214
 winner+0.200
 leader+0.197
 coward+0.193
 problem+0.192
Neg logit contributions
 magazines-0.180
 counters-0.175
 envelop-0.167
 candles-0.162
keeping-0.158
Token again
Token ID757
Feature activation-1.13e-01

Pos logit contributions
 loser+0.035
zag+0.030
ah+0.030
worm+0.030
 leader+0.030
Neg logit contributions
 magazines-0.060
 counters-0.059
keeping-0.058
 envelop-0.058
 candles-0.056
Token
Token ID198
Feature activation-5.26e-02

Pos logit contributions
 loser+0.037
worm+0.032
 winner+0.032
 leader+0.032
 coward+0.032
Neg logit contributions
 magazines-0.043
 counters-0.043
keeping-0.042
 envelop-0.041
 computers-0.039
Token
Token ID198
Feature activation-5.42e-02

Pos logit contributions
 loser+0.072
 winner+0.063
 problem+0.062
 leader+0.061
worm+0.060
Neg logit contributions
 magazines-0.111
 counters-0.109
 envelop-0.107
keeping-0.107
 computers-0.102
TokenHarry
Token ID18308
Feature activation+3.49e-02

Pos logit contributions
 loser+0.085
 problem+0.076
 winner+0.074
 coward+0.074
 frog+0.074
Neg logit contributions
 magazines-0.103
 counters-0.101
 envelop-0.099
keeping-0.097
 computers-0.094
Token thought
Token ID1807
Feature activation-6.95e-02

Pos logit contributions
 magazines+0.074
 counters+0.072
 envelop+0.071
keeping+0.070
 candles+0.068
Neg logit contributions
 loser-0.047
 leader-0.043
worm-0.042
 frog-0.042
 winner-0.042
Token to
Token ID284
Feature activation+2.11e-02

Pos logit contributions
 loser+0.099
worm+0.090
 leader+0.090
 lesson+0.089
 winner+0.089
Neg logit contributions
 magazines-0.141
 counters-0.138
keeping-0.136
 envelop-0.135
 candles-0.129
Token himself
Token ID2241
Feature activation-1.21e-01

Pos logit contributions
 magazines+0.040
 counters+0.039
 envelop+0.038
keeping+0.038
 candles+0.036
Neg logit contributions
 loser-0.034
 leader-0.031
 winner-0.031
worm-0.031
 frog-0.030
Token,
Token ID11
Feature activation-1.22e-02

Pos logit contributions
 loser+0.216
 coward+0.187
 goose+0.174
 winner+0.172
Only+0.171
Neg logit contributions
 magazines-0.212
 envelop-0.203
 counters-0.197
 Emmy-0.192
 computers-0.191
Token "
Token ID366
Feature activation-1.08e-02

Pos logit contributions
 loser+0.016
 leader+0.015
 frog+0.015
 winner+0.015
worm+0.015
Neg logit contributions
 magazines-0.026
 counters-0.025
 envelop-0.025
keeping-0.025
 candles-0.024
TokenI
Token ID40
Feature activation-1.85e-02

Pos logit contributions
 loser+0.017
 problem+0.017
 coward+0.016
 goose+0.016
 frog+0.015
Neg logit contributions
keeping-0.019
 magazines-0.019
 envelop-0.019
 counters-0.019
bons-0.018
Token want
Token ID765
Feature activation-7.08e-02

Pos logit contributions
 loser+0.025
worm+0.024
 goose+0.024
igger+0.023
 hawk+0.022
Neg logit contributions
 envelop-0.034
 counters-0.034
 illustrations-0.033
 wrote-0.033
keeping-0.033
Token to
Token ID284
Feature activation+1.99e-02

Pos logit contributions
 frog+0.058
 loser+0.057
 friend+0.056
 lesson+0.056
 first+0.056
Neg logit contributions
 counters-0.181
 magazines-0.176
 envelop-0.173
keeping-0.173
 illustrations-0.165
Token felt
Token ID2936
Feature activation+4.14e-02

Pos logit contributions
 loser+0.102
 winner+0.086
 leader+0.086
ah+0.086
worm+0.086
Neg logit contributions
 magazines-0.151
 counters-0.147
 envelop-0.145
keeping-0.142
 candles-0.141
Token right
Token ID826
Feature activation-1.26e-01

Pos logit contributions
 magazines+0.086
 counters+0.084
 envelop+0.082
keeping+0.081
 candles+0.079
Neg logit contributions
 loser-0.059
 leader-0.052
 winner-0.051
worm-0.051
 frog-0.051
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
 loser+0.177
 leader+0.156
 winner+0.155
worm+0.155
 frog+0.154
Neg logit contributions
 magazines-0.261
 counters-0.255
 envelop-0.250
keeping-0.246
 candles-0.240
Token
Token ID198
Feature activation-4.08e-02

Pos logit contributions
 loser+0.014
 leader+0.012
 frog+0.012
worm+0.012
 winner+0.012
Neg logit contributions
 magazines-0.030
 counters-0.030
 envelop-0.029
keeping-0.029
 candles-0.028
Token
Token ID198
Feature activation-4.11e-02

Pos logit contributions
 loser+0.066
 winner+0.062
 leader+0.057
 fastest+0.056
 coward+0.055
Neg logit contributions
 magazines-0.077
 counters-0.073
 envelop-0.073
keeping-0.072
 computers-0.069
TokenOne
Token ID3198
Feature activation-1.23e-01

Pos logit contributions
 loser+0.049
 winner+0.043
 coward+0.041
 problem+0.040
 leader+0.039
Neg logit contributions
 magazines-0.093
 counters-0.089
 envelop-0.087
 computers-0.086
 illustrations-0.084
Token day
Token ID1110
Feature activation-6.44e-02

Pos logit contributions
 loser+0.147
 winner+0.127
 coward+0.124
 leader+0.120
 fastest+0.118
Neg logit contributions
 magazines-0.277
 counters-0.264
 candles-0.259
 envelop-0.258
 videos-0.258
Token,
Token ID11
Feature activation-2.86e-03

Pos logit contributions
 loser+0.084
 leader+0.075
worm+0.075
 winner+0.074
 fastest+0.072
Neg logit contributions
 magazines-0.138
 counters-0.134
 envelop-0.132
 candles-0.129
keeping-0.125
Token the
Token ID262
Feature activation-7.04e-03

Pos logit contributions
 loser+0.005
 frog+0.004
 leader+0.004
worm+0.004
 winner+0.004
Neg logit contributions
 magazines-0.005
 counters-0.005
 envelop-0.005
keeping-0.005
 candles-0.005
Token onion
Token ID21670
Feature activation-9.50e-02

Pos logit contributions
 loser+0.014
ah+0.013
zag+0.013
 coward+0.012
 leader+0.012
Neg logit contributions
 counters-0.012
 magazines-0.012
 candles-0.011
 envelop-0.010
 fridge-0.010
Token saw
Token ID2497
Feature activation+1.01e-01

Pos logit contributions
 loser+0.158
 winner+0.136
 fastest+0.133
 coward+0.133
worm+0.132
Neg logit contributions
 magazines-0.170
 counters-0.169
 envelop-0.163
keeping-0.160
 candles-0.157
Token smile
Token ID8212
Feature activation+3.50e-02

Pos logit contributions
 magazines+0.022
 counters+0.021
 envelop+0.020
keeping+0.020
 candles+0.020
Neg logit contributions
 loser-0.016
worm-0.014
 winner-0.014
 leader-0.014
in-0.014
Token.
Token ID13
Feature activation-2.08e-03

Pos logit contributions
 magazines+0.078
 counters+0.077
 envelop+0.075
keeping+0.074
 candles+0.072
Neg logit contributions
 loser-0.044
 leader-0.038
 winner-0.038
worm-0.038
 frog-0.037
Token They
Token ID1119
Feature activation+4.78e-02

Pos logit contributions
 loser+0.002
 frog+0.002
 leader+0.002
 winner+0.002
 goose+0.002
Neg logit contributions
 counters-0.005
 magazines-0.005
 envelop-0.005
keeping-0.005
 candles-0.005
Token became
Token ID2627
Feature activation+5.71e-03

Pos logit contributions
 magazines+0.082
 envelop+0.078
 counters+0.077
 candles+0.075
keeping+0.074
Neg logit contributions
 loser-0.085
worm-0.077
 winner-0.076
 leader-0.075
 coward-0.074
Token the
Token ID262
Feature activation-1.69e-03

Pos logit contributions
 magazines+0.015
 envelop+0.014
 counters+0.014
keeping+0.014
 candles+0.014
Neg logit contributions
 loser-0.005
worm-0.004
 leader-0.004
ah-0.004
 winner-0.004
Token best
Token ID1266
Feature activation-1.75e-01

Pos logit contributions
 loser+0.001
worm+0.001
in+0.001
 coward+0.001
igger+0.001
Neg logit contributions
 magazines-0.004
 candles-0.004
 envelop-0.004
 presents-0.004
 videos-0.004
Token of
Token ID286
Feature activation+3.37e-02

Pos logit contributions
 loser+0.246
worm+0.220
 coward+0.215
ah+0.214
 winner+0.211
Neg logit contributions
 magazines-0.363
 candles-0.344
 envelop-0.342
 counters-0.341
 videos-0.332
Token friends
Token ID2460
Feature activation-1.72e-02

Pos logit contributions
 magazines+0.079
 envelop+0.076
 counters+0.075
 candles+0.074
keeping+0.074
Neg logit contributions
 loser-0.039
worm-0.034
ah-0.033
 leader-0.033
 winner-0.033
Token and
Token ID290
Feature activation+2.84e-02

Pos logit contributions
 loser+0.023
 leader+0.020
worm+0.020
 winner+0.020
 frog+0.020
Neg logit contributions
 magazines-0.037
 counters-0.036
 envelop-0.035
keeping-0.035
 candles-0.034
Token the
Token ID262
Feature activation-1.75e-03

Pos logit contributions
 counters+0.060
 magazines+0.058
keeping+0.056
 envelop+0.056
 desks+0.053
Neg logit contributions
 loser-0.039
 leader-0.038
 frog-0.038
 goose-0.036
 lesson-0.036
Token bird
Token ID6512
Feature activation-1.07e-01

Pos logit contributions
 loser+0.002
zag+0.002
worm+0.002
igger+0.001
 coward+0.001
Neg logit contributions
 magazines-0.004
 counters-0.004
 candles-0.004
 envelop-0.004
 computers-0.004
Token today
Token ID1909
Feature activation-1.86e-02

Pos logit contributions
 loser+0.053
 problem+0.048
 leader+0.046
 first+0.046
 frog+0.045
Neg logit contributions
 magazines-0.085
 counters-0.083
 envelop-0.082
keeping-0.079
 candles-0.077
Token,
Token ID11
Feature activation+4.52e-03

Pos logit contributions
 loser+0.017
 problem+0.015
 leader+0.014
 best+0.014
 winner+0.014
Neg logit contributions
 magazines-0.047
 counters-0.047
 envelop-0.047
keeping-0.046
 illustrations-0.044
Token Lily
Token ID20037
Feature activation+6.81e-02

Pos logit contributions
keeping+0.009
 magazines+0.009
 envelop+0.009
 counters+0.008
 illustrations+0.008
Neg logit contributions
 loser-0.007
 frog-0.007
 problem-0.006
 best-0.006
 lesson-0.006
Token.
Token ID13
Feature activation-1.14e-02

Pos logit contributions
 magazines+0.147
 counters+0.144
 envelop+0.141
keeping+0.139
 candles+0.136
Neg logit contributions
 loser-0.090
 leader-0.079
 winner-0.078
worm-0.078
 frog-0.077
Token Maybe
Token ID6674
Feature activation-3.40e-03

Pos logit contributions
 loser+0.019
 winner+0.017
 frog+0.017
 leader+0.017
Only+0.016
Neg logit contributions
 magazines-0.021
 counters-0.021
 envelop-0.021
keeping-0.020
 candles-0.020
Token next
Token ID1306
Feature activation-1.62e-01

Pos logit contributions
 loser+0.004
 leader+0.004
 winner+0.004
 frog+0.004
worm+0.004
Neg logit contributions
 magazines-0.007
 counters-0.007
 envelop-0.007
keeping-0.007
 candles-0.007
Token time
Token ID640
Feature activation-6.84e-02

Pos logit contributions
 loser+0.092
 problem+0.087
 lesson+0.077
 winner+0.071
 leader+0.069
Neg logit contributions
 magazines-0.471
 counters-0.467
 envelop-0.463
keeping-0.457
nam-0.444
Token if
Token ID611
Feature activation-8.30e-02

Pos logit contributions
 loser+0.061
 first+0.060
 best+0.056
 problem+0.055
 winner+0.053
Neg logit contributions
 counters-0.172
keeping-0.171
 magazines-0.169
 envelop-0.167
 illustrations-0.161
Token you
Token ID345
Feature activation+7.82e-03

Pos logit contributions
 loser+0.106
 problem+0.101
 lesson+0.097
 best+0.097
 frog+0.096
Neg logit contributions
 counters-0.178
 magazines-0.175
keeping-0.173
 envelop-0.172
 illustrations-0.164
Token behave
Token ID17438
Feature activation-3.08e-02

Pos logit contributions
 magazines+0.017
 counters+0.016
 envelop+0.016
keeping+0.016
 candles+0.016
Neg logit contributions
 loser-0.010
worm-0.009
 winner-0.009
 leader-0.009
 goose-0.009
Token."
Token ID526
Feature activation+2.34e-02

Pos logit contributions
 loser+0.040
 leader+0.035
 winner+0.034
in+0.034
worm+0.033
Neg logit contributions
 magazines-0.067
 counters-0.066
 envelop-0.065
keeping-0.064
 candles-0.063