TOP ACTIVATIONS
MAX = 14.620

laughed and said , " Looks like that apple made you
the owner smiled . " Looks like you two are friends
smiled and said , " Looks like you like the wardrobe
smiled . " Looks like the frog likes you
's mom my . " Looks like the wave destroyed all
table . " Looks like mom has some mail
smiled and said , " Sounds like you had the best
messy , the walls were grim y , and the furniture
The shape was still unsure but it wanted to be
The shape was a little unsure , but the mouse said
. The water was so grim y and gross , but
and he thought it was unfair . So he
the rabbit but he was unsure . He struggled
the rabbit but he was unsure . He struggled
adventure !" Bill was still unsure , but Sam convinced him
park . Joe thought this seemed like a good idea .
trip !" " Sounds like fun !" said her
Her friends were still unsure , but Julia was so
Yes , Molly " it seemed to say . " There
Sam was unsure at first , but then

MIN ACTIVATIONS
MIN = -10.992

shell and a y olk inside ," she said .
cheese , tomato and lettuce inside . When the sandwich was
big cookie on the table inside the house .
placed small pieces of wood inside and soon the fire was
big box with many balls inside . The man told the
cage nearby with another bird inside . R alph
cage nearby with another bird inside . R alph
found a lot of water inside . He had never seen
found a lot of water inside . He had never seen
a box with the computer inside . It had been hidden
put their dolls and cars inside and pretended to be kings
box with lots of equipment inside . They tried to open
found a family of birds inside , ch ir ping loudly
had many clothes and shoes inside . Sometimes they would squeeze
with some y ummy apples inside . John smiled . He
and forks and sp oons inside the dish washer . She
they found many colorful decorations inside . Jack said
and needed all the dishes inside to work properly . Lily
pond with lots of fish inside .
with a king and queen inside . She wanted to paint

INTERVAL 8.217 - 14.620
CONTAINS 0.163%

, but it still felt happy . It had done something
join in . Tim my didn 't want to be selfish
at it , but it didn 't work .
They realized they had been selfish and rude . They saw
He learnt that it was wrong to kick the ash ,

INTERVAL 1.814 - 8.217
CONTAINS 24.861%

saw . They learned and laughed and had fun . They
I promise to be very good in the car ."
and thought it was the best seat ever . John felt
had an impressive , wide smile on his face .
had a pencil . He loved to draw with it .

INTERVAL -4.589 - 1.814
CONTAINS 65.569%

got scared and started to run as fast as she could
As they played , Lily noticed a big rock nearby .
. They made their own story . <|endoftext|> Ben and Mia
better . <|endoftext|> Once upon a time , there was a
a time , there was a big dog . He had

INTERVAL -10.992 - -4.589
CONTAINS 9.406%

eagerly and went back outside to play . He was so
owner . Suddenly , there was an emergency !
't spoil the cater pillar or else he won 't be
bright , green grass . In the middle of the big
fairy waved her magic wand and said
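
NOTE ON INTERVALS
The four interval boundaries listed above are consistent with splitting the range from MIN (-10.992) to MAX (14.620) into four equal-width bins. The sketch below reproduces the boundaries under that assumption. The function name `interval_edges` and the choice of four bins are illustrative and are not taken from the tool that produced this dump.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (3 decimals).

    Assumption: the dashboard uses equal-width bins between MIN and MAX.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in this dump.
edges = interval_edges(-10.992, 14.620)

# Share of tokens per interval as reported above, highest interval first.
shares = [0.163, 24.861, 65.569, 9.406]
total_share = round(sum(shares), 3)

print(edges)
print(total_share)
```

The reported shares sum to just under 100%, which is what rounding each of the four shares to three decimals would produce.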
Token laughed
Token ID13818
Feature activation+5.72e+00

Pos logit contributions
 important+0.034
 valuable+0.031
 best+0.029
 mistakes+0.028
 awful+0.028
Neg logit contributions
 benches-0.030
 bench-0.029
 Zig-0.028
 slide-0.028
round-0.027
Token and
Token ID290
Feature activation-3.95e+00

Pos logit contributions
 important+0.247
 valuable+0.237
 mistakes+0.226
 virtue+0.226
arre+0.225
Neg logit contributions
 benches-0.099
 Zig-0.098
 bench-0.096
 sit-0.090
 walk-0.088
Token said
Token ID531
Feature activation+2.49e+00

Pos logit contributions
 benches+0.137
 bench+0.129
 Zig+0.124
round+0.124
 enclosure+0.122
Neg logit contributions
 important-0.128
 valuable-0.112
 best-0.108
arre-0.104
 ok-0.103
Token,
Token ID11
Feature activation-2.24e+00

Pos logit contributions
 important+0.094
 valuable+0.084
 best+0.082
arre+0.080
 mistakes+0.078
Neg logit contributions
 benches-0.072
 bench-0.068
 Zig-0.065
round-0.064
 slide-0.063
Token "
Token ID366
Feature activation+1.01e+00

Pos logit contributions
 benches+0.077
 bench+0.072
round+0.070
 Zig+0.069
 enclosure+0.069
Neg logit contributions
 important-0.073
 valuable-0.064
 best-0.062
 ok-0.060
arre-0.059
TokenLooks
Token ID41102
Feature activation+1.42e+01

Pos logit contributions
 important+0.036
 valuable+0.032
 best+0.031
 mistakes+0.030
 awful+0.029
Neg logit contributions
 benches-0.031
 bench-0.029
round-0.029
 Zig-0.028
 slide-0.028
Token like
Token ID588
Feature activation+1.91e+00

Pos logit contributions
 important+0.555
arre+0.544
ushing+0.521
 mistakes+0.515
 likely+0.511
Neg logit contributions
 benches-0.291
 Zig-0.259
 sit-0.256
 climb-0.248
 slide-0.247
Token that
Token ID326
Feature activation-1.33e+00

Pos logit contributions
 important+0.066
 valuable+0.058
 best+0.056
 ok+0.054
 mistakes+0.054
Neg logit contributions
 benches-0.063
 bench-0.059
round-0.057
 Zig-0.056
 enclosure-0.056
Token apple
Token ID17180
Feature activation-3.00e+00

Pos logit contributions
 bench+0.038
 slide+0.037
 benches+0.036
round+0.034
 climb+0.034
Neg logit contributions
 important-0.048
 valuable-0.044
arre-0.041
ushing-0.041
ealing-0.041
Token made
Token ID925
Feature activation+2.12e+00

Pos logit contributions
 benches+0.094
 bench+0.092
 slide+0.087
round+0.086
 Zig+0.085
Neg logit contributions
 important-0.102
 valuable-0.093
arre-0.088
 best-0.086
 mistakes-0.086
Token you
Token ID345
Feature activation+4.43e+00

Pos logit contributions
 important+0.078
 valuable+0.072
 best+0.067
arre+0.067
 likely+0.065
Neg logit contributions
 benches-0.060
 bench-0.056
 Zig-0.055
round-0.053
 slide-0.053
Token the
Token ID262
Feature activation+6.39e-01

Pos logit contributions
 benches+0.093
 bench+0.088
 Zig+0.084
round+0.084
 enclosure+0.081
Neg logit contributions
 important-0.100
 valuable-0.089
 best-0.086
arre-0.083
 mistakes-0.083
Token owner
Token ID4870
Feature activation-3.84e+00

Pos logit contributions
 important+0.028
 valuable+0.025
arre+0.025
 best+0.024
 ok+0.024
Neg logit contributions
 benches-0.015
 bench-0.015
 Zig-0.013
 slide-0.013
round-0.013
Token smiled
Token ID13541
Feature activation+7.00e+00

Pos logit contributions
 benches+0.141
 bench+0.130
round+0.126
 enclosure+0.125
 Zig+0.125
Neg logit contributions
 important-0.117
 best-0.102
 valuable-0.101
 ok-0.094
arre-0.093
Token.
Token ID13
Feature activation+1.26e+00

Pos logit contributions
 important+0.298
arre+0.292
 mistakes+0.279
 valuable+0.277
 awful+0.273
Neg logit contributions
 Zig-0.122
 bench-0.118
 benches-0.116
 walk-0.113
 towards-0.110
Token "
Token ID366
Feature activation+9.18e-01

Pos logit contributions
 important+0.043
 valuable+0.038
 best+0.037
arre+0.035
 ok+0.035
Neg logit contributions
 benches-0.042
 bench-0.039
round-0.037
 Zig-0.037
 enclosure-0.036
TokenLooks
Token ID41102
Feature activation+1.42e+01

Pos logit contributions
 important+0.030
 valuable+0.027
 best+0.026
 mistakes+0.026
arre+0.025
Neg logit contributions
 benches-0.030
 bench-0.029
round-0.028
 Zig-0.028
 slide-0.027
Token like
Token ID588
Feature activation+2.08e+00

Pos logit contributions
arre+0.559
 important+0.551
ushing+0.533
 mistakes+0.525
iced+0.522
Neg logit contributions
 benches-0.277
 Zig-0.267
 sit-0.250
 climb-0.244
 slide-0.239
Token you
Token ID345
Feature activation+4.50e+00

Pos logit contributions
 important+0.073
 valuable+0.064
 best+0.063
 ok+0.059
 mistakes+0.059
Neg logit contributions
 benches-0.069
 bench-0.064
round-0.062
 enclosure-0.061
 Zig-0.060
Token two
Token ID734
Feature activation+1.13e+00

Pos logit contributions
 important+0.152
 valuable+0.136
 best+0.128
 mistakes+0.128
arre+0.123
Neg logit contributions
 benches-0.144
 bench-0.139
round-0.135
 Zig-0.135
 slide-0.133
Token are
Token ID389
Feature activation+5.08e+00

Pos logit contributions
 important+0.037
 valuable+0.034
arre+0.032
 mistakes+0.030
 best+0.030
Neg logit contributions
 benches-0.037
 bench-0.036
 slide-0.034
 Zig-0.034
round-0.034
Token friends
Token ID2460
Feature activation+1.74e-01

Pos logit contributions
 important+0.170
 valuable+0.159
arre+0.155
 mistakes+0.147
ushing+0.147
Neg logit contributions
 Zig-0.151
 benches-0.150
 bench-0.144
 slide-0.138
round-0.138
Token smiled
Token ID13541
Feature activation+6.59e+00

Pos logit contributions
 benches+0.053
 bench+0.050
round+0.047
 Zig+0.047
 slide+0.047
Neg logit contributions
 important-0.060
 valuable-0.054
 best-0.051
arre-0.050
 mistakes-0.050
Token and
Token ID290
Feature activation-2.95e+00

Pos logit contributions
 important+0.295
arre+0.275
 valuable+0.272
 unfair+0.267
 virtue+0.267
Neg logit contributions
 bench-0.114
 benches-0.110
 walk-0.107
 sit-0.097
 touch-0.095
Token said
Token ID531
Feature activation+2.41e+00

Pos logit contributions
 benches+0.111
 bench+0.105
round+0.102
 Zig+0.102
 enclosure+0.100
Neg logit contributions
 important-0.087
 valuable-0.075
 best-0.072
arre-0.069
 ok-0.068
Token,
Token ID11
Feature activation-2.10e+00

Pos logit contributions
 important+0.097
 valuable+0.087
arre+0.084
 best+0.083
 mistakes+0.082
Neg logit contributions
 benches-0.062
 bench-0.059
 Zig-0.056
round-0.054
 slide-0.054
Token "
Token ID366
Feature activation+1.49e+00

Pos logit contributions
 benches+0.076
 bench+0.070
round+0.068
 Zig+0.068
 enclosure+0.068
Neg logit contributions
 important-0.066
 valuable-0.058
 best-0.055
 ok-0.053
arre-0.053
TokenLooks
Token ID41102
Feature activation+1.42e+01

Pos logit contributions
 important+0.051
 valuable+0.045
 best+0.044
arre+0.043
 mistakes+0.043
Neg logit contributions
 benches-0.047
 bench-0.045
round-0.044
 Zig-0.043
 slide-0.042
Token like
Token ID588
Feature activation+2.01e+00

Pos logit contributions
arre+0.576
 important+0.560
 mistakes+0.539
ushing+0.536
iced+0.523
Neg logit contributions
 benches-0.256
 sit-0.235
 climb-0.230
 Zig-0.221
 slide-0.216
Token you
Token ID345
Feature activation+4.57e+00

Pos logit contributions
 important+0.071
 valuable+0.064
arre+0.062
 best+0.061
 awful+0.059
Neg logit contributions
 benches-0.061
 bench-0.058
 Zig-0.055
round-0.054
 slide-0.053
Token like
Token ID588
Feature activation+1.76e+00

Pos logit contributions
 important+0.161
 valuable+0.146
 mistakes+0.141
 best+0.136
arre+0.136
Neg logit contributions
 benches-0.131
 bench-0.130
 slide-0.126
round-0.124
 enclosure-0.121
Token the
Token ID262
Feature activation-5.96e-02

Pos logit contributions
 important+0.066
 valuable+0.060
 best+0.058
arre+0.056
 ok+0.055
Neg logit contributions
 benches-0.051
 bench-0.048
 Zig-0.046
round-0.046
 enclosure-0.044
Token wardrobe
Token ID35593
Feature activation-1.94e+00

Pos logit contributions
 benches+0.002
 bench+0.002
 slide+0.002
 Zig+0.002
 enclosure+0.002
Neg logit contributions
 important-0.002
 valuable-0.002
arre-0.002
ushing-0.002
ealing-0.002
Token smiled
Token ID13541
Feature activation+6.64e+00

Pos logit contributions
 benches+0.014
 bench+0.013
 Zig+0.013
round+0.013
 enclosure+0.012
Neg logit contributions
 important-0.012
 best-0.011
 valuable-0.011
arre-0.010
 ok-0.010
Token.
Token ID13
Feature activation+1.27e+00

Pos logit contributions
 important+0.289
 valuable+0.269
arre+0.268
 virtue+0.259
 mistakes+0.259
Neg logit contributions
 bench-0.127
 benches-0.127
 walk-0.111
 sit-0.109
 Zig-0.108
Token
Token ID198
Feature activation-6.17e-01

Pos logit contributions
 important+0.049
 valuable+0.044
 best+0.043
arre+0.042
 ok+0.041
Neg logit contributions
 benches-0.036
 bench-0.033
 Zig-0.032
round-0.032
 enclosure-0.031
Token
Token ID198
Feature activation-5.45e-01

Pos logit contributions
 benches+0.019
 bench+0.018
round+0.018
 enclosure+0.017
 Zig+0.017
Neg logit contributions
 important-0.022
 valuable-0.020
 best-0.018
 mistakes-0.018
arre-0.018
Token"
Token ID1
Feature activation+1.22e-01

Pos logit contributions
 benches+0.018
 bench+0.018
round+0.017
 slide+0.017
 Zig+0.016
Neg logit contributions
 important-0.018
 best-0.016
 valuable-0.016
 ok-0.015
 mistakes-0.015
TokenLooks
Token ID41102
Feature activation+1.36e+01

Pos logit contributions
 important+0.005
 valuable+0.004
 best+0.004
arre+0.004
 mistakes+0.004
Neg logit contributions
 benches-0.004
 bench-0.003
round-0.003
 Zig-0.003
 slide-0.003
Token like
Token ID588
Feature activation+1.92e+00

Pos logit contributions
 important+0.525
arre+0.505
ushing+0.487
iced+0.486
 likely+0.476
Neg logit contributions
 benches-0.280
 sit-0.257
 Zig-0.251
 climb-0.249
 slide-0.240
Token the
Token ID262
Feature activation+3.77e-02

Pos logit contributions
 important+0.067
 valuable+0.058
 best+0.057
 mistakes+0.054
 ok+0.054
Neg logit contributions
 benches-0.063
 bench-0.060
round-0.060
 enclosure-0.058
 Zig-0.057
Token frog
Token ID21264
Feature activation-1.73e+00

Pos logit contributions
 important+0.002
 valuable+0.002
arre+0.002
 best+0.002
 ok+0.001
Neg logit contributions
 benches-0.001
 bench-0.001
 Zig-0.001
 slide-0.001
round-0.001
Token likes
Token ID7832
Feature activation+3.36e+00

Pos logit contributions
 benches+0.051
 bench+0.050
 Zig+0.047
 slide+0.047
 enclosure+0.047
Neg logit contributions
 important-0.061
 valuable-0.056
arre-0.053
 best-0.052
 ok-0.052
Token you
Token ID345
Feature activation+4.16e+00

Pos logit contributions
 important+0.126
 valuable+0.112
 best+0.109
 mistakes+0.104
 ok+0.102
Neg logit contributions
 benches-0.100
 bench-0.094
round-0.091
 Zig-0.089
 enclosure-0.088
Token's
Token ID338
Feature activation+2.65e+00

Pos logit contributions
 important+0.044
 valuable+0.039
arre+0.038
 mistakes+0.037
 best+0.037
Neg logit contributions
 benches-0.034
 bench-0.034
 Zig-0.033
round-0.032
 enclosure-0.031
Token mom
Token ID1995
Feature activation-2.02e+00

Pos logit contributions
 important+0.111
arre+0.103
 valuable+0.103
 likely+0.096
 best+0.096
Neg logit contributions
 bench-0.063
 benches-0.061
 slide-0.058
 throne-0.054
 Zig-0.052
Tokenmy
Token ID1820
Feature activation+8.06e-01

Pos logit contributions
 benches+0.059
 bench+0.056
 Zig+0.053
round+0.053
 enclosure+0.052
Neg logit contributions
 important-0.075
 valuable-0.068
 best-0.065
arre-0.064
 mistakes-0.062
Token.
Token ID13
Feature activation+7.25e-01

Pos logit contributions
 important+0.029
 valuable+0.026
arre+0.024
 best+0.024
 mistakes+0.024
Neg logit contributions
 benches-0.025
 bench-0.023
 Zig-0.023
 enclosure-0.022
round-0.022
Token "
Token ID366
Feature activation+5.71e-01

Pos logit contributions
 important+0.024
 valuable+0.021
 best+0.020
 ok+0.020
arre+0.020
Neg logit contributions
 benches-0.024
 bench-0.023
round-0.022
 Zig-0.022
 enclosure-0.022
TokenLooks
Token ID41102
Feature activation+1.35e+01

Pos logit contributions
 important+0.018
 valuable+0.016
 best+0.016
 mistakes+0.015
 ok+0.015
Neg logit contributions
 benches-0.020
 bench-0.019
round-0.018
 Zig-0.018
 slide-0.018
Token like
Token ID588
Feature activation+1.75e+00

Pos logit contributions
arre+0.460
 important+0.459
ushing+0.445
iced+0.441
omm+0.427
Neg logit contributions
 benches-0.320
 climb-0.288
 Zig-0.286
 sit-0.285
 slide-0.280
Token the
Token ID262
Feature activation-1.40e-01

Pos logit contributions
 important+0.065
 valuable+0.059
 best+0.057
arre+0.055
 ok+0.054
Neg logit contributions
 benches-0.052
 bench-0.048
 Zig-0.046
round-0.046
 enclosure-0.045
Token wave
Token ID6769
Feature activation+1.48e+00

Pos logit contributions
 benches+0.003
 bench+0.003
 slide+0.003
 enclosure+0.002
 Zig+0.002
Neg logit contributions
 important-0.006
 valuable-0.006
arre-0.006
ushing-0.005
 ok-0.005
Token destroyed
Token ID6572
Feature activation+1.85e-01

Pos logit contributions
 important+0.052
 valuable+0.047
 best+0.044
arre+0.044
 mistakes+0.043
Neg logit contributions
 benches-0.046
 bench-0.044
round-0.042
 Zig-0.041
 slide-0.041
Token all
Token ID477
Feature activation-1.27e+00

Pos logit contributions
 important+0.007
 valuable+0.006
 best+0.006
arre+0.006
 ok+0.005
Neg logit contributions
 benches-0.006
 bench-0.005
 Zig-0.005
round-0.005
 enclosure-0.005
Token table
Token ID3084
Feature activation-5.44e+00

Pos logit contributions
 benches+0.099
 Zig+0.098
round+0.097
 bench+0.092
 enclosure+0.091
Neg logit contributions
 important-0.129
 best-0.115
 valuable-0.114
 awful-0.107
 mistakes-0.104
Token.
Token ID13
Feature activation+8.48e-01

Pos logit contributions
 benches+0.176
 bench+0.167
round+0.155
 enclosure+0.154
 Zig+0.154
Neg logit contributions
 important-0.190
 valuable-0.170
 best-0.161
arre-0.158
 ok-0.156
Token
Token ID198
Feature activation-7.89e-01

Pos logit contributions
 important+0.031
 valuable+0.028
 best+0.027
arre+0.026
 ok+0.026
Neg logit contributions
 benches-0.026
 bench-0.024
 Zig-0.023
round-0.023
 enclosure-0.022
Token
Token ID198
Feature activation-6.53e-01

Pos logit contributions
 benches+0.025
round+0.024
 bench+0.023
 enclosure+0.023
 Zig+0.022
Neg logit contributions
 important-0.028
 valuable-0.025
 best-0.023
 mistakes-0.022
arre-0.022
Token"
Token ID1
Feature activation-2.80e-01

Pos logit contributions
 benches+0.025
 bench+0.024
 Zig+0.023
round+0.023
 enclosure+0.023
Neg logit contributions
 important-0.019
 valuable-0.016
 best-0.015
arre-0.015
 ok-0.014
TokenLooks
Token ID41102
Feature activation+1.33e+01

Pos logit contributions
 benches+0.008
 Zig+0.008
 bench+0.008
 enclosure+0.008
round+0.008
Neg logit contributions
 important-0.010
 valuable-0.009
 best-0.009
arre-0.009
 awful-0.008
Token like
Token ID588
Feature activation+1.56e+00

Pos logit contributions
 important+0.501
arre+0.480
ushing+0.470
iced+0.458
 likely+0.457
Neg logit contributions
 benches-0.293
 sit-0.252
 climb-0.249
 Zig-0.241
 slide-0.240
Token mom
Token ID1995
Feature activation-2.03e+00

Pos logit contributions
 important+0.060
 valuable+0.053
 best+0.051
 mistakes+0.049
 ok+0.049
Neg logit contributions
 benches-0.046
 bench-0.043
round-0.042
 Zig-0.041
 enclosure-0.041
Token has
Token ID468
Feature activation+2.52e+00

Pos logit contributions
 benches+0.058
 bench+0.055
round+0.053
 slide+0.052
 enclosure+0.052
Neg logit contributions
 important-0.076
 valuable-0.070
 best-0.065
arre-0.064
 awful-0.063
Token some
Token ID617
Feature activation-1.20e-01

Pos logit contributions
 important+0.085
 valuable+0.076
arre+0.073
 best+0.072
 ok+0.070
Neg logit contributions
 benches-0.082
 bench-0.077
 Zig-0.072
round-0.072
 enclosure-0.071
Token mail
Token ID6920
Feature activation-2.83e+00

Pos logit contributions
 benches+0.003
 bench+0.003
round+0.003
 slide+0.003
 Zig+0.003
Neg logit contributions
 important-0.005
 valuable-0.004
arre-0.004
 best-0.004
 ok-0.004
Token smiled
Token ID13541
Feature activation+6.69e+00

Pos logit contributions
 important+0.035
 mistakes+0.032
arre+0.032
 valuable+0.031
 awful+0.031
Neg logit contributions
 bench-0.031
 slide-0.029
 benches-0.029
 Zig-0.028
round-0.026
Token and
Token ID290
Feature activation-2.64e+00

Pos logit contributions
 important+0.272
arre+0.257
 valuable+0.250
 mistakes+0.249
 awful+0.248
Neg logit contributions
 benches-0.142
 bench-0.141
 Zig-0.139
 throne-0.126
 walk-0.125
Token said
Token ID531
Feature activation+2.75e+00

Pos logit contributions
 benches+0.084
 bench+0.080
 Zig+0.077
round+0.075
 slide+0.074
Neg logit contributions
 important-0.093
 valuable-0.082
arre-0.079
 mistakes-0.078
 best-0.078
Token,
Token ID11
Feature activation-1.90e+00

Pos logit contributions
 important+0.108
 valuable+0.097
 best+0.094
arre+0.091
 ok+0.090
Neg logit contributions
 benches-0.077
 bench-0.072
 Zig-0.068
round-0.068
 enclosure-0.066
Token "
Token ID366
Feature activation+1.38e+00

Pos logit contributions
 benches+0.070
 bench+0.065
round+0.063
 enclosure+0.063
 Zig+0.063
Neg logit contributions
 important-0.058
 valuable-0.051
 best-0.048
 ok-0.047
arre-0.045
TokenSounds
Token ID40825
Feature activation+1.25e+01

Pos logit contributions
 important+0.046
 valuable+0.040
 best+0.039
arre+0.037
 mistakes+0.037
Neg logit contributions
 benches-0.046
 bench-0.044
round-0.042
 Zig-0.042
 enclosure-0.041
Token like
Token ID588
Feature activation+3.62e+00

Pos logit contributions
arre+0.429
 mistakes+0.405
 important+0.377
ushing+0.361
 valuable+0.337
Neg logit contributions
 Zig-0.341
 benches-0.339
 bench-0.338
 sit-0.323
 slide-0.317
Token you
Token ID345
Feature activation+5.61e+00

Pos logit contributions
 important+0.127
 valuable+0.112
 best+0.109
 ok+0.103
 mistakes+0.103
Neg logit contributions
 benches-0.116
 bench-0.109
round-0.105
 Zig-0.104
 enclosure-0.103
Token had
Token ID550
Feature activation+4.35e+00

Pos logit contributions
 important+0.200
 valuable+0.181
 mistakes+0.178
arre+0.172
 awful+0.169
Neg logit contributions
 benches-0.166
 bench-0.158
 Zig-0.156
 slide-0.154
 enclosure-0.152
Token the
Token ID262
Feature activation+7.47e-01

Pos logit contributions
 important+0.134
 valuable+0.118
arre+0.115
 best+0.112
 mistakes+0.110
Neg logit contributions
 benches-0.155
 bench-0.145
 Zig-0.142
round-0.138
 enclosure-0.138
Token best
Token ID1266
Feature activation+6.00e+00

Pos logit contributions
 important+0.016
 valuable+0.013
arre+0.013
ushing+0.012
 awful+0.012
Neg logit contributions
 benches-0.033
 bench-0.032
 slide-0.031
 Zig-0.031
 enclosure-0.030
Token messy
Token ID24097
Feature activation+4.98e+00

Pos logit contributions
 important+0.108
 valuable+0.096
arre+0.096
ealing+0.095
 best+0.092
Neg logit contributions
 benches-0.142
 bench-0.137
 Zig-0.127
 slide-0.127
 throne-0.120
Token,
Token ID11
Feature activation-2.62e+00

Pos logit contributions
 important+0.198
 valuable+0.190
ealing+0.178
arre+0.175
 ok+0.175
Neg logit contributions
 benches-0.132
 bench-0.117
 slide-0.111
 Zig-0.104
round-0.104
Token the
Token ID262
Feature activation+2.01e-01

Pos logit contributions
 benches+0.075
 bench+0.071
 slide+0.065
 Zig+0.063
round+0.062
Neg logit contributions
 important-0.099
 valuable-0.090
arre-0.087
ealing-0.087
 best-0.087
Token walls
Token ID7714
Feature activation-3.80e+00

Pos logit contributions
 important+0.008
 valuable+0.007
 best+0.007
 mistakes+0.007
arre+0.006
Neg logit contributions
 benches-0.006
 bench-0.005
round-0.005
 Zig-0.005
 enclosure-0.005
Token were
Token ID547
Feature activation+3.87e+00

Pos logit contributions
 benches+0.118
 bench+0.111
 Zig+0.105
round+0.105
 slide+0.103
Neg logit contributions
 important-0.136
 valuable-0.122
 best-0.117
arre-0.114
 ok-0.112
Token grim
Token ID18288
Feature activation+1.18e+01

Pos logit contributions
 important+0.113
 valuable+0.100
arre+0.098
ealing+0.096
 best+0.095
Neg logit contributions
 benches-0.141
 bench-0.135
 slide-0.126
 Zig-0.125
 throne-0.120
Tokeny
Token ID88
Feature activation+5.20e+00

Pos logit contributions
 important+0.357
 valuable+0.345
 unfair+0.306
 certainly+0.298
 mistakes+0.296
Neg logit contributions
 benches-0.392
round-0.368
 bench-0.362
 Zig-0.344
 enclosure-0.339
Token,
Token ID11
Feature activation-1.22e+00

Pos logit contributions
 important+0.198
 valuable+0.183
arre+0.177
ealing+0.172
 unfair+0.171
Neg logit contributions
 benches-0.147
 bench-0.139
 slide-0.130
round-0.123
 enclosure-0.123
Token and
Token ID290
Feature activation-2.42e+00

Pos logit contributions
 benches+0.035
 bench+0.034
 slide+0.031
 Zig+0.030
round+0.030
Neg logit contributions
 important-0.045
 valuable-0.040
arre-0.040
 best-0.040
 mistakes-0.039
Token the
Token ID262
Feature activation+3.81e-01

Pos logit contributions
 benches+0.064
 bench+0.060
 Zig+0.056
 slide+0.055
round+0.051
Neg logit contributions
 important-0.096
 valuable-0.088
arre-0.087
 best-0.084
 mistakes-0.083
Token furniture
Token ID13091
Feature activation-3.64e+00

Pos logit contributions
 important+0.015
 valuable+0.014
 best+0.013
 mistakes+0.013
 awful+0.013
Neg logit contributions
 benches-0.010
 bench-0.010
round-0.009
 Zig-0.009
 enclosure-0.009
Token
Token ID26391
Feature activation+4.54e+00

Pos logit contributions
 important+0.205
 valuable+0.183
 best+0.179
 mistakes+0.171
arre+0.169
Neg logit contributions
 benches-0.180
round-0.171
 bench-0.171
 Zig-0.164
 slide-0.162
Token The
Token ID383
Feature activation+5.88e-01

Pos logit contributions
 important+0.158
 valuable+0.140
 best+0.134
 awful+0.130
arre+0.129
Neg logit contributions
 benches-0.147
 bench-0.139
 Zig-0.134
round-0.132
 slide-0.130
Token shape
Token ID5485
Feature activation+3.42e-01

Pos logit contributions
 important+0.021
arre+0.020
 valuable+0.020
ealing+0.019
ushing+0.019
Neg logit contributions
 benches-0.016
 slide-0.016
 bench-0.016
 climb-0.014
 enclosure-0.014
Token was
Token ID373
Feature activation+6.03e+00

Pos logit contributions
 important+0.011
 valuable+0.010
arre+0.010
 ok+0.009
 awful+0.009
Neg logit contributions
 slide-0.010
 bench-0.010
 benches-0.010
round-0.010
 enclosure-0.010
Token still
Token ID991
Feature activation+5.10e+00

Pos logit contributions
arre+0.180
 important+0.174
ushing+0.168
 mistakes+0.166
ealing+0.164
Neg logit contributions
 bench-0.184
 benches-0.184
 Zig-0.181
 slide-0.174
 gra-0.171
Token unsure
Token ID22147
Feature activation+1.16e+01

Pos logit contributions
arre+0.152
 important+0.150
ealing+0.145
 valuable+0.141
ushing+0.140
Neg logit contributions
 bench-0.151
 slide-0.147
 benches-0.146
 Zig-0.139
 climb-0.136
Token but
Token ID475
Feature activation+8.60e-01

Pos logit contributions
 important+0.407
 valuable+0.363
 best+0.349
arre+0.342
 ok+0.334
Neg logit contributions
 benches-0.366
 bench-0.343
 Zig-0.328
round-0.327
 enclosure-0.319
Token it
Token ID340
Feature activation+5.28e+00

Pos logit contributions
 important+0.031
arre+0.029
 mistakes+0.028
 valuable+0.028
 awful+0.028
Neg logit contributions
 bench-0.022
 Zig-0.022
 benches-0.022
 slide-0.021
round-0.020
Token wanted
Token ID2227
Feature activation+5.96e+00

Pos logit contributions
 important+0.175
 mistakes+0.166
arre+0.165
ealing+0.159
 awful+0.155
Neg logit contributions
 bench-0.148
 Zig-0.144
 slide-0.144
 walk-0.139
 climb-0.137
Token to
Token ID284
Feature activation-5.19e+00

Pos logit contributions
 important+0.273
 valuable+0.250
arre+0.246
 best+0.244
 awful+0.238
Neg logit contributions
 benches-0.121
 bench-0.109
 Zig-0.103
round-0.099
 slide-0.096
Token be
Token ID307
Feature activation+5.14e+00

Pos logit contributions
 benches+0.133
 bench+0.126
 Zig+0.119
round+0.117
 slide+0.117
Neg logit contributions
 important-0.212
 valuable-0.193
 best-0.186
arre-0.183
 mistakes-0.179
TokenThe
Token ID464
Feature activation+5.91e-01

Pos logit contributions
 benches+0.008
 bench+0.007
round+0.007
 Zig+0.007
 slide+0.007
Neg logit contributions
 important-0.012
 valuable-0.011
 best-0.011
 mistakes-0.010
 awful-0.010
Token shape
Token ID5485
Feature activation-1.84e-02

Pos logit contributions
 important+0.021
arre+0.020
ealing+0.020
 valuable+0.019
ushing+0.019
Neg logit contributions
 slide-0.016
 benches-0.016
 bench-0.016
 climb-0.015
 enclosure-0.014
Token was
Token ID373
Feature activation+5.97e+00

Pos logit contributions
 bench+0.001
 benches+0.001
 slide+0.001
 enclosure+0.001
round+0.001
Neg logit contributions
 important-0.001
 valuable-0.001
arre-0.001
 mistakes-0.001
 best-0.001
Token a
Token ID257
Feature activation+3.29e+00

Pos logit contributions
arre+0.165
 important+0.161
 mistakes+0.153
ushing+0.151
 valuable+0.147
Neg logit contributions
 bench-0.191
 Zig-0.190
 benches-0.189
 gra-0.180
 slide-0.176
Token little
Token ID1310
Feature activation+3.98e+00

Pos logit contributions
 important+0.088
arre+0.082
ushing+0.078
 mistakes+0.077
ealing+0.076
Neg logit contributions
 bench-0.121
 benches-0.116
 slide-0.114
 throne-0.110
 Zig-0.110
Token unsure
Token ID22147
Feature activation+1.14e+01

Pos logit contributions
 important+0.137
arre+0.128
 valuable+0.120
ushing+0.119
ealing+0.117
Neg logit contributions
 bench-0.116
 benches-0.113
 slide-0.112
 enclosure-0.107
 walk-0.106
Token,
Token ID11
Feature activation-1.20e+00

Pos logit contributions
 important+0.399
 valuable+0.357
 best+0.344
arre+0.340
 mistakes+0.329
Neg logit contributions
 benches-0.359
 bench-0.339
 Zig-0.322
round-0.321
 enclosure-0.314
Token but
Token ID475
Feature activation+9.40e-02

Pos logit contributions
 benches+0.030
 bench+0.029
 Zig+0.027
round+0.026
 slide+0.026
Neg logit contributions
 important-0.048
arre-0.045
 valuable-0.044
 best-0.043
 mistakes-0.042
Token the
Token ID262
Feature activation+1.03e+00

Pos logit contributions
 important+0.004
arre+0.003
 valuable+0.003
 mistakes+0.003
 best+0.003
Neg logit contributions
 benches-0.002
 bench-0.002
 Zig-0.002
 slide-0.002
round-0.002
Token mouse
Token ID10211
Feature activation-1.47e+00

Pos logit contributions
 important+0.036
arre+0.035
ealing+0.033
 valuable+0.033
ushing+0.033
Neg logit contributions
 benches-0.029
 bench-0.029
 slide-0.028
 climb-0.026
 enclosure-0.025
Token said
Token ID531
Feature activation+3.40e+00

Pos logit contributions
 bench+0.047
 benches+0.046
 slide+0.044
 Zig+0.044
 enclosure+0.044
Neg logit contributions
 important-0.046
 valuable-0.041
arre-0.040
 mistakes-0.040
 best-0.039
Token.
Token ID13
Feature activation+5.12e-01

Pos logit contributions
 benches+0.306
round+0.288
 Zig+0.287
 bench+0.280
 enclosure+0.278
Neg logit contributions
 important-0.303
 valuable-0.262
 best-0.261
 mistakes-0.248
arre-0.247
Token The
Token ID383
Feature activation+5.00e-01

Pos logit contributions
 important+0.019
 valuable+0.017
 best+0.016
arre+0.016
 ok+0.016
Neg logit contributions
 benches-0.015
 bench-0.014
round-0.014
 Zig-0.014
 enclosure-0.013
Token water
Token ID1660
Feature activation-1.16e+00

Pos logit contributions
 important+0.018
 valuable+0.016
 best+0.016
arre+0.015
 ok+0.015
Neg logit contributions
 benches-0.015
 bench-0.014
 Zig-0.013
round-0.013
 slide-0.013
Token was
Token ID373
Feature activation+5.36e+00

Pos logit contributions
 benches+0.038
 bench+0.037
 slide+0.035
 Zig+0.034
round+0.033
Neg logit contributions
 important-0.039
 valuable-0.035
 best-0.033
 mistakes-0.032
arre-0.032
Token so
Token ID523
Feature activation+4.55e+00

Pos logit contributions
 important+0.160
arre+0.151
ealing+0.149
 valuable+0.147
 mistakes+0.146
Neg logit contributions
 benches-0.172
 bench-0.168
 Zig-0.157
 slide-0.156
 distance-0.150
Token grim
Token ID18288
Feature activation+1.14e+01

Pos logit contributions
arre+0.138
 important+0.129
 mistakes+0.129
 best+0.126
ushing+0.123
Neg logit contributions
 bench-0.133
 benches-0.131
 Zig-0.124
 slide-0.124
 gra-0.120
Tokeny
Token ID88
Feature activation+5.02e+00

Pos logit contributions
 important+0.325
 valuable+0.316
 unfair+0.288
 mistakes+0.273
arre+0.273
Neg logit contributions
 benches-0.373
round-0.350
 Zig-0.342
 bench-0.342
 enclosure-0.326
Token and
Token ID290
Feature activation-1.65e+00

Pos logit contributions
 important+0.186
 valuable+0.170
arre+0.167
 unfair+0.160
 best+0.156
Neg logit contributions
 benches-0.141
 bench-0.138
 slide-0.126
 enclosure-0.125
round-0.123
Token gross
Token ID10319
Feature activation+6.81e+00

Pos logit contributions
 benches+0.050
 bench+0.050
 slide+0.048
 Zig+0.047
 gra+0.047
Neg logit contributions
 important-0.054
arre-0.049
 valuable-0.049
 mistakes-0.047
 best-0.046
Token,
Token ID11
Feature activation-2.10e+00

Pos logit contributions
 important+0.267
 valuable+0.247
arre+0.235
 best+0.229
 unfair+0.228
Neg logit contributions
 benches-0.181
 bench-0.169
 slide-0.156
 Zig-0.149
round-0.146
Token but
Token ID475
Feature activation-6.51e-01

Pos logit contributions
 benches+0.061
 bench+0.060
 Zig+0.054
 slide+0.054
round+0.054
Neg logit contributions
 important-0.077
 valuable-0.069
 best-0.068
arre-0.067
 mistakes-0.065
Token and
Token ID290
Feature activation-3.08e+00

Pos logit contributions
 important+0.111
 valuable+0.099
 best+0.095
arre+0.094
 mistakes+0.092
Neg logit contributions
 benches-0.104
 bench-0.100
 Zig-0.094
round-0.094
 slide-0.092
Token he
Token ID339
Feature activation+2.34e+00

Pos logit contributions
 benches+0.100
 bench+0.095
 Zig+0.091
round+0.090
 enclosure+0.088
Neg logit contributions
 important-0.106
 valuable-0.094
 best-0.091
arre-0.088
 mistakes-0.087
Token thought
Token ID1807
Feature activation+7.88e+00

Pos logit contributions
 important+0.073
 valuable+0.066
arre+0.064
 best+0.064
 mistakes+0.063
Neg logit contributions
 bench-0.077
 benches-0.076
 Zig-0.075
 slide-0.071
 walk-0.071
Token it
Token ID340
Feature activation+5.82e+00

Pos logit contributions
 important+0.296
 valuable+0.262
 best+0.254
 ok+0.240
 awful+0.237
Neg logit contributions
 benches-0.239
round-0.222
 bench-0.220
 enclosure-0.218
 Zig-0.208
Token was
Token ID373
Feature activation+7.82e+00

Pos logit contributions
 important+0.194
arre+0.184
 mistakes+0.180
 valuable+0.179
ealing+0.175
Neg logit contributions
 Zig-0.168
 benches-0.168
 bench-0.163
 slide-0.158
 walk-0.156
Token unfair
Token ID11675
Feature activation+1.13e+01

Pos logit contributions
arre+0.224
ushing+0.200
ealing+0.192
 mistakes+0.185
inking+0.185
Neg logit contributions
 Zig-0.277
 bench-0.270
 benches-0.269
 Sue-0.253
 bamboo-0.250
Token.
Token ID13
Feature activation+5.77e-01

Pos logit contributions
arre+0.470
 important+0.446
ealing+0.439
aming+0.438
amed+0.429
Neg logit contributions
 bench-0.245
 benches-0.241
 Zig-0.225
 slide-0.213
 Ann-0.204
Token\n
Token ID198
Feature activation-8.69e-01

Pos logit contributions
 important+0.019
 valuable+0.017
 best+0.017
 ok+0.015
 mistakes+0.015
Neg logit contributions
 benches-0.020
 bench-0.018
round-0.018
 throne-0.017
 slide-0.017
Token\n
Token ID198
Feature activation-7.69e-01

Pos logit contributions
 benches+0.030
 bench+0.027
round+0.027
 enclosure+0.026
 throne+0.026
Neg logit contributions
 important-0.029
 valuable-0.026
 best-0.025
 ok-0.023
 mistakes-0.023
TokenSo
Token ID2396
Feature activation+1.73e+00

Pos logit contributions
 benches+0.021
 bench+0.020
 Zig+0.019
round+0.019
 enclosure+0.018
Neg logit contributions
 important-0.031
 valuable-0.028
 best-0.027
arre-0.026
 ok-0.026
Token he
Token ID339
Feature activation+1.94e+00

Pos logit contributions
 important+0.066
 best+0.061
 valuable+0.058
 ok+0.053
 awful+0.051
Neg logit contributions
 benches-0.053
 bench-0.048
round-0.048
 gra-0.046
 enclosure-0.045
Token the
Token ID262
Feature activation-2.12e+00

Pos logit contributions
 benches+0.163
 throne+0.162
round+0.157
 bench+0.157
 enclosure+0.154
Neg logit contributions
 important-0.163
 best-0.152
 ok-0.147
 valuable-0.142
 mistakes-0.139
Token rabbit
Token ID22746
Feature activation-4.01e+00

Pos logit contributions
 benches+0.052
 bench+0.050
 slide+0.045
 walk+0.044
 Zig+0.043
Neg logit contributions
 important-0.087
 valuable-0.080
arre-0.078
ushing-0.077
 likely-0.076
Token but
Token ID475
Feature activation-1.72e+00

Pos logit contributions
 benches+0.132
 bench+0.122
round+0.119
 Zig+0.118
 enclosure+0.114
Neg logit contributions
 important-0.137
 valuable-0.122
 best-0.119
 ok-0.113
arre-0.111
Token he
Token ID339
Feature activation+2.34e+00

Pos logit contributions
 benches+0.036
 bench+0.034
 Zig+0.033
 slide+0.030
round+0.029
Neg logit contributions
 important-0.077
 valuable-0.071
arre-0.071
 mistakes-0.069
 best-0.069
Token was
Token ID373
Feature activation+5.97e+00

Pos logit contributions
 important+0.071
arre+0.065
 valuable+0.065
 mistakes+0.063
 best+0.059
Neg logit contributions
 bench-0.078
 Zig-0.078
 benches-0.076
 slide-0.074
 climb-0.071
Token unsure
Token ID22147
Feature activation+1.12e+01

Pos logit contributions
 important+0.160
arre+0.158
ushing+0.157
 mistakes+0.155
 valuable+0.149
Neg logit contributions
 bench-0.203
 benches-0.194
 Zig-0.193
 slide-0.184
 climb-0.183
Token.
Token ID13
Feature activation+1.20e+00

Pos logit contributions
 important+0.384
 valuable+0.339
 best+0.327
 ok+0.311
 mistakes+0.310
Neg logit contributions
 benches-0.370
 bench-0.348
round-0.336
 Zig-0.335
 enclosure-0.328
Token\n
Token ID198
Feature activation-4.98e-01

Pos logit contributions
 important+0.039
 valuable+0.035
 best+0.035
arre+0.033
 ok+0.033
Neg logit contributions
 benches-0.040
 bench-0.038
round-0.037
 throne-0.037
 slide-0.035
Token\n
Token ID198
Feature activation-4.21e-01

Pos logit contributions
 benches+0.017
 bench+0.015
round+0.015
 throne+0.015
 enclosure+0.015
Neg logit contributions
 important-0.017
 valuable-0.015
 best-0.014
arre-0.014
 ok-0.013
TokenHe
Token ID1544
Feature activation+1.07e+00

Pos logit contributions
 benches+0.012
 bench+0.011
round+0.011
 Zig+0.011
 slide+0.010
Neg logit contributions
 important-0.016
 best-0.014
 valuable-0.014
arre-0.014
 mistakes-0.014
Token struggled
Token ID11615
Feature activation+4.88e+00

Pos logit contributions
 important+0.038
 valuable+0.035
 best+0.033
arre+0.033
 mistakes+0.033
Neg logit contributions
 benches-0.032
 bench-0.031
 Zig-0.030
round-0.029
 enclosure-0.028
Token the
Token ID262
Feature activation-2.12e+00

Pos logit contributions
 benches+0.163
 throne+0.162
round+0.157
 bench+0.157
 enclosure+0.154
Neg logit contributions
 important-0.163
 best-0.152
 ok-0.147
 valuable-0.142
 mistakes-0.139
Token rabbit
Token ID22746
Feature activation-4.01e+00

Pos logit contributions
 benches+0.052
 bench+0.050
 slide+0.045
 walk+0.044
 Zig+0.043
Neg logit contributions
 important-0.087
 valuable-0.080
arre-0.078
ushing-0.077
 likely-0.076
Token but
Token ID475
Feature activation-1.72e+00

Pos logit contributions
 benches+0.132
 bench+0.122
round+0.119
 Zig+0.118
 enclosure+0.114
Neg logit contributions
 important-0.137
 valuable-0.122
 best-0.119
 ok-0.113
arre-0.111
Token he
Token ID339
Feature activation+2.34e+00

Pos logit contributions
 benches+0.036
 bench+0.034
 Zig+0.033
 slide+0.030
round+0.029
Neg logit contributions
 important-0.077
 valuable-0.071
arre-0.071
 mistakes-0.069
 best-0.069
Token was
Token ID373
Feature activation+5.97e+00

Pos logit contributions
 important+0.071
arre+0.065
 valuable+0.065
 mistakes+0.063
 best+0.059
Neg logit contributions
 bench-0.078
 Zig-0.078
 benches-0.076
 slide-0.074
 climb-0.071
Token unsure
Token ID22147
Feature activation+1.12e+01

Pos logit contributions
 important+0.160
arre+0.158
ushing+0.157
 mistakes+0.155
 valuable+0.149
Neg logit contributions
 bench-0.203
 benches-0.194
 Zig-0.193
 slide-0.184
 climb-0.183
Token.
Token ID13
Feature activation+1.20e+00

Pos logit contributions
 important+0.384
 valuable+0.339
 best+0.327
 ok+0.311
 mistakes+0.310
Neg logit contributions
 benches-0.370
 bench-0.348
round-0.336
 Zig-0.335
 enclosure-0.328
Token\n
Token ID198
Feature activation-4.98e-01

Pos logit contributions
 important+0.039
 valuable+0.035
 best+0.035
arre+0.033
 ok+0.033
Neg logit contributions
 benches-0.040
 bench-0.038
round-0.037
 throne-0.037
 slide-0.035
Token\n
Token ID198
Feature activation-4.21e-01

Pos logit contributions
 benches+0.017
 bench+0.015
round+0.015
 throne+0.015
 enclosure+0.015
Neg logit contributions
 important-0.017
 valuable-0.015
 best-0.014
arre-0.014
 ok-0.013
TokenHe
Token ID1544
Feature activation+1.07e+00

Pos logit contributions
 benches+0.012
 bench+0.011
round+0.011
 Zig+0.011
 slide+0.010
Neg logit contributions
 important-0.016
 best-0.014
 valuable-0.014
arre-0.014
 mistakes-0.014
Token struggled
Token ID11615
Feature activation+4.88e+00

Pos logit contributions
 important+0.038
 valuable+0.035
 best+0.033
arre+0.033
 mistakes+0.033
Neg logit contributions
 benches-0.032
 bench-0.031
 Zig-0.030
round-0.029
 enclosure-0.028
Token adventure
Token ID8855
Feature activation-8.01e-01

Pos logit contributions
 important+0.161
 best+0.144
 valuable+0.139
 mistakes+0.130
arre+0.129
Neg logit contributions
 benches-0.091
 bench-0.077
 Zig-0.077
 gra-0.077
 enclosure-0.076
Token!"
Token ID2474
Feature activation-1.49e+00

Pos logit contributions
 benches+0.027
 bench+0.025
round+0.025
 Zig+0.025
 enclosure+0.024
Neg logit contributions
 important-0.026
 valuable-0.023
 best-0.022
arre-0.022
 ok-0.021
Token Bill
Token ID3941
Feature activation-1.64e+00

Pos logit contributions
 benches+0.044
 throne+0.042
 bench+0.042
 gra+0.040
round+0.039
Neg logit contributions
 important-0.051
 best-0.047
arre-0.045
 valuable-0.045
 mistakes-0.044
Token was
Token ID373
Feature activation+5.35e+00

Pos logit contributions
 benches+0.058
 bench+0.055
round+0.053
 Zig+0.053
 slide+0.052
Neg logit contributions
 important-0.052
 valuable-0.046
 best-0.044
 mistakes-0.042
arre-0.042
Token still
Token ID991
Feature activation+4.88e+00

Pos logit contributions
 important+0.139
ushing+0.136
 valuable+0.133
arre+0.128
 mistakes+0.127
Neg logit contributions
 bench-0.190
 Zig-0.187
 benches-0.179
 Sue-0.177
 slide-0.176
Token unsure
Token ID22147
Feature activation+1.12e+01

Pos logit contributions
 important+0.130
 valuable+0.125
 awful+0.120
ushing+0.118
arre+0.116
Neg logit contributions
 bench-0.169
 benches-0.160
 slide-0.159
 Zig-0.159
 Sue-0.154
Token,
Token ID11
Feature activation-1.43e+00

Pos logit contributions
 important+0.396
 valuable+0.349
 best+0.340
 ok+0.324
 mistakes+0.323
Neg logit contributions
 benches-0.358
 bench-0.339
round-0.331
 Zig-0.327
 enclosure-0.321
Token but
Token ID475
Feature activation-8.92e-02

Pos logit contributions
 benches+0.038
 bench+0.035
 Zig+0.034
round+0.033
 slide+0.032
Neg logit contributions
 important-0.058
 valuable-0.052
 best-0.050
arre-0.050
 mistakes-0.049
Token Sam
Token ID3409
Feature activation+1.62e+00

Pos logit contributions
 benches+0.002
 bench+0.002
 Zig+0.002
round+0.002
 slide+0.002
Neg logit contributions
 important-0.004
 valuable-0.003
arre-0.003
 best-0.003
 mistakes-0.003
Token convinced
Token ID9431
Feature activation+2.21e+00

Pos logit contributions
 important+0.051
 valuable+0.047
 mistakes+0.046
arre+0.044
 awful+0.043
Neg logit contributions
 bench-0.052
 slide-0.050
 benches-0.049
 walk-0.048
 Zig-0.047
Token him
Token ID683
Feature activation+4.40e+00

Pos logit contributions
 important+0.088
 best+0.080
 ok+0.077
 valuable+0.076
 mistakes+0.075
Neg logit contributions
 benches-0.059
 bench-0.056
round-0.056
 enclosure-0.055
 throne-0.054
Token park
Token ID3952
Feature activation-5.41e+00

Pos logit contributions
 benches+0.047
 bench+0.043
 Zig+0.040
round+0.039
 enclosure+0.038
Neg logit contributions
 important-0.100
 valuable-0.091
 best-0.089
arre-0.087
 ok-0.087
Token.
Token ID13
Feature activation+2.53e-01

Pos logit contributions
 benches+0.200
 Zig+0.181
 bench+0.181
round+0.178
 gra+0.176
Neg logit contributions
 important-0.164
 best-0.139
 valuable-0.139
 ok-0.127
 mistakes-0.124
Token Joe
Token ID5689
Feature activation-3.73e-01

Pos logit contributions
 important+0.009
 valuable+0.008
 best+0.008
arre+0.008
 mistakes+0.008
Neg logit contributions
 benches-0.008
 bench-0.007
round-0.007
 throne-0.007
 Zig-0.006
Token thought
Token ID1807
Feature activation+7.73e+00

Pos logit contributions
 benches+0.013
 bench+0.012
 Zig+0.012
 slide+0.012
round+0.012
Neg logit contributions
 important-0.012
 valuable-0.011
 best-0.010
arre-0.010
 mistakes-0.010
Token this
Token ID428
Feature activation+4.95e+00

Pos logit contributions
 important+0.283
 valuable+0.248
 best+0.243
 ok+0.229
 mistakes+0.229
Neg logit contributions
 benches-0.241
 bench-0.227
round-0.223
 enclosure-0.217
 Zig-0.213
Token seemed
Token ID3947
Feature activation+1.12e+01

Pos logit contributions
 important+0.151
arre+0.145
ealing+0.140
ushing+0.138
 valuable+0.136
Neg logit contributions
 bench-0.165
 benches-0.158
 slide-0.155
 Zig-0.152
round-0.147
Token like
Token ID588
Feature activation+3.57e+00

Pos logit contributions
arre+0.358
 important+0.333
ushing+0.330
 mistakes+0.330
ealing+0.315
Neg logit contributions
 benches-0.332
 bench-0.328
 Zig-0.325
 slide-0.303
 walk-0.298
Token a
Token ID257
Feature activation+2.26e+00

Pos logit contributions
 important+0.128
 valuable+0.113
 best+0.110
 ok+0.104
 mistakes+0.103
Neg logit contributions
 benches-0.113
 bench-0.106
round-0.104
 enclosure-0.101
 Zig-0.100
Token good
Token ID922
Feature activation+7.02e+00

Pos logit contributions
 important+0.072
 valuable+0.063
arre+0.061
 best+0.060
 mistakes+0.060
Neg logit contributions
 benches-0.076
 bench-0.075
 slide-0.071
 Zig-0.070
round-0.069
Token idea
Token ID2126
Feature activation+7.94e+00

Pos logit contributions
 important+0.210
 valuable+0.182
 best+0.175
arre+0.166
 ok+0.164
Neg logit contributions
 benches-0.264
 bench-0.247
 Zig-0.241
round-0.240
 enclosure-0.236
Token.
Token ID13
Feature activation+8.70e-01

Pos logit contributions
 important+0.274
 valuable+0.242
 best+0.235
 ok+0.223
 mistakes+0.220
Neg logit contributions
 benches-0.259
 bench-0.243
round-0.234
 Zig-0.234
 enclosure-0.229
Token trip
Token ID5296
Feature activation-3.01e+00

Pos logit contributions
 important+0.000
 valuable+0.000
 best+0.000
arre+0.000
 mistakes+0.000
Neg logit contributions
 benches-0.000
 bench-0.000
round-0.000
 Zig-0.000
 enclosure-0.000
Token!"
Token ID2474
Feature activation-2.73e+00

Pos logit contributions
 benches+0.091
 bench+0.085
 Zig+0.081
round+0.081
 enclosure+0.079
Neg logit contributions
 important-0.109
 valuable-0.098
 best-0.094
arre-0.092
 ok-0.091
Token\n
Token ID198
Feature activation-1.25e+00

Pos logit contributions
 benches+0.080
 bench+0.072
round+0.071
 Zig+0.069
 enclosure+0.069
Neg logit contributions
 important-0.100
 best-0.089
 valuable-0.089
 ok-0.085
arre-0.085
Token\n
Token ID198
Feature activation-9.82e-01

Pos logit contributions
 benches+0.041
 bench+0.037
round+0.036
 enclosure+0.035
 Zig+0.035
Neg logit contributions
 important-0.043
 valuable-0.038
 best-0.036
arre-0.036
 mistakes-0.035
Token"
Token ID1
Feature activation-7.25e-01

Pos logit contributions
 benches+0.033
 bench+0.031
 Zig+0.030
round+0.029
 enclosure+0.029
Neg logit contributions
 important-0.032
 valuable-0.028
arre-0.027
 best-0.027
 ok-0.026
TokenSounds
Token ID40825
Feature activation+1.11e+01

Pos logit contributions
 benches+0.023
 bench+0.021
 Zig+0.021
 enclosure+0.021
round+0.020
Neg logit contributions
 important-0.025
 valuable-0.022
 best-0.022
arre-0.021
 ok-0.021
Token like
Token ID588
Feature activation+3.63e+00

Pos logit contributions
arre+0.347
 mistakes+0.328
 important+0.325
ushing+0.312
 valuable+0.289
Neg logit contributions
 bench-0.330
 benches-0.329
 slide-0.317
 Zig-0.312
round-0.309
Token fun
Token ID1257
Feature activation+5.93e+00

Pos logit contributions
 important+0.111
 valuable+0.098
 best+0.093
arre+0.093
 mistakes+0.089
Neg logit contributions
 benches-0.130
 bench-0.123
 Zig-0.119
round-0.117
 slide-0.115
Token!"
Token ID2474
Feature activation-3.46e+00

Pos logit contributions
 important+0.217
 valuable+0.198
arre+0.191
 best+0.188
 awful+0.186
Neg logit contributions
 benches-0.174
 bench-0.165
round-0.158
 Zig-0.156
 slide-0.156
Token said
Token ID531
Feature activation+8.28e-01

Pos logit contributions
 benches+0.103
 bench+0.094
round+0.092
 enclosure+0.088
 gra+0.087
Neg logit contributions
 important-0.124
 best-0.110
 valuable-0.110
 ok-0.105
 mistakes-0.103
Token her
Token ID607
Feature activation+2.81e+00

Pos logit contributions
 important+0.033
 best+0.030
 valuable+0.029
 ok+0.029
 mistakes+0.027
Neg logit contributions
 benches-0.022
 bench-0.020
round-0.019
 enclosure-0.019
 slide-0.018
Token\n
Token ID198
Feature activation-4.58e-01

Pos logit contributions
 benches+0.020
round+0.018
 bench+0.018
 Zig+0.017
 enclosure+0.017
Neg logit contributions
 important-0.021
 valuable-0.019
 best-0.018
arre-0.017
 mistakes-0.017
TokenHer
Token ID9360
Feature activation+2.80e+00

Pos logit contributions
 benches+0.014
 bench+0.014
round+0.013
 Zig+0.013
 enclosure+0.013
Neg logit contributions
 important-0.016
 valuable-0.014
 best-0.014
 ok-0.014
 awful-0.014
Token friends
Token ID2460
Feature activation-2.18e+00

Pos logit contributions
 important+0.074
 valuable+0.068
 awful+0.065
arre+0.065
 likely+0.061
Neg logit contributions
 benches-0.106
 bench-0.103
 slide-0.100
 Zig-0.098
 walk-0.094
Token were
Token ID547
Feature activation+3.67e+00

Pos logit contributions
 benches+0.067
 bench+0.064
 slide+0.061
 walk+0.060
round+0.059
Neg logit contributions
 important-0.078
 valuable-0.071
 mistakes-0.065
 best-0.065
 awful-0.063
Token still
Token ID991
Feature activation+4.52e+00

Pos logit contributions
 important+0.094
 valuable+0.090
 mistakes+0.090
ushing+0.087
arre+0.082
Neg logit contributions
 bench-0.126
 benches-0.120
 sit-0.116
 Zig-0.114
 Sue-0.114
Token unsure
Token ID22147
Feature activation+1.11e+01

Pos logit contributions
 important+0.121
 valuable+0.115
 awful+0.114
 certainly+0.108
ience+0.108
Neg logit contributions
 bench-0.147
 slide-0.141
 benches-0.140
 climb-0.139
 walk-0.133
Token,
Token ID11
Feature activation-1.45e+00

Pos logit contributions
 important+0.402
 valuable+0.359
 best+0.347
arre+0.338
 ok+0.332
Neg logit contributions
 benches-0.342
 bench-0.320
 Zig-0.305
round-0.304
 enclosure-0.297
Token but
Token ID475
Feature activation+6.87e-02

Pos logit contributions
 benches+0.034
 bench+0.031
 Zig+0.029
round+0.028
 slide+0.027
Neg logit contributions
 important-0.063
 valuable-0.057
arre-0.056
 best-0.055
 mistakes-0.055
Token Julia
Token ID22300
Feature activation+2.58e+00

Pos logit contributions
 important+0.003
arre+0.003
 valuable+0.003
 mistakes+0.003
 awful+0.003
Neg logit contributions
 benches-0.002
 bench-0.001
 Zig-0.001
 Ann-0.001
 slide-0.001
Token was
Token ID373
Feature activation+6.77e+00

Pos logit contributions
 important+0.086
 valuable+0.081
 mistakes+0.078
 awful+0.074
arre+0.074
Neg logit contributions
 bench-0.072
 walk-0.071
 slide-0.067
 climb-0.067
 benches-0.066
Token so
Token ID523
Feature activation+5.46e+00

Pos logit contributions
arre+0.175
 mistakes+0.173
ushing+0.166
 important+0.161
/+0.154
Neg logit contributions
 bench-0.238
 sit-0.224
 climb-0.218
 benches-0.215
 slide-0.214
TokenYes
Token ID5297
Feature activation+6.55e+00

Pos logit contributions
 important+0.028
 valuable+0.025
 ok+0.023
 best+0.023
arre+0.023
Neg logit contributions
 benches-0.027
 bench-0.026
 Zig-0.025
 enclosure-0.025
 throne-0.024
Token,
Token ID11
Feature activation-1.66e+00

Pos logit contributions
 important+0.227
 valuable+0.200
 best+0.195
 ok+0.189
 awful+0.185
Neg logit contributions
 benches-0.214
 bench-0.200
round-0.192
 Zig-0.191
 enclosure-0.189
Token Molly
Token ID30236
Feature activation+3.66e+00

Pos logit contributions
 benches+0.052
 bench+0.048
round+0.047
 Zig+0.047
 enclosure+0.046
Neg logit contributions
 important-0.060
 valuable-0.054
 best-0.052
 ok-0.051
 awful-0.049
Token"
Token ID1
Feature activation+1.28e+00

Pos logit contributions
 important+0.139
 valuable+0.127
arre+0.126
 best+0.121
 mistakes+0.119
Neg logit contributions
 benches-0.099
 bench-0.096
 Zig-0.092
 enclosure-0.089
 slide-0.089
Token it
Token ID340
Feature activation+4.42e+00

Pos logit contributions
 important+0.047
 valuable+0.042
 best+0.041
arre+0.039
 ok+0.039
Neg logit contributions
 benches-0.038
 bench-0.036
 Zig-0.035
round-0.034
 enclosure-0.034
Token seemed
Token ID3947
Feature activation+1.10e+01

Pos logit contributions
 important+0.142
arre+0.130
 valuable+0.128
 mistakes+0.123
 best+0.121
Neg logit contributions
 bench-0.135
 Zig-0.133
 benches-0.133
 enclosure-0.128
 gra-0.128
Token to
Token ID284
Feature activation-4.15e+00

Pos logit contributions
 important+0.444
arre+0.437
ushing+0.409
 mistakes+0.409
 best+0.401
Neg logit contributions
 benches-0.245
 bench-0.232
 Zig-0.226
 gra-0.205
 slide-0.202
Token say
Token ID910
Feature activation+3.92e+00

Pos logit contributions
 benches+0.122
 bench+0.115
 Zig+0.110
round+0.109
 enclosure+0.107
Neg logit contributions
 important-0.154
 valuable-0.138
 best-0.134
arre-0.130
 mistakes-0.128
Token.
Token ID13
Feature activation+1.85e+00

Pos logit contributions
 important+0.145
 valuable+0.130
 best+0.126
 ok+0.122
 mistakes+0.120
Neg logit contributions
 benches-0.118
 bench-0.109
round-0.105
 Zig-0.104
 enclosure-0.103
Token "
Token ID366
Feature activation+1.07e+00

Pos logit contributions
 important+0.062
 valuable+0.055
 ok+0.053
 best+0.053
 mistakes+0.050
Neg logit contributions
 benches-0.062
 bench-0.058
round-0.055
 slide-0.054
 Zig-0.054
TokenThere
Token ID1858
Feature activation-1.87e+00

Pos logit contributions
 important+0.035
 valuable+0.031
 best+0.030
arre+0.029
 ok+0.029
Neg logit contributions
 benches-0.036
 bench-0.034
 Zig-0.033
 enclosure-0.033
 throne-0.032
Token
Token ID26391
Feature activation+3.78e+00

Pos logit contributions
 important+0.178
 valuable+0.157
 best+0.151
arre+0.147
 ok+0.145
Neg logit contributions
 benches-0.171
 bench-0.160
 Zig-0.154
round-0.151
 enclosure-0.149
Token\n
Token ID198
Feature activation-9.09e-01

Pos logit contributions
 important+0.145
 valuable+0.130
 best+0.125
arre+0.121
 awful+0.120
Neg logit contributions
 benches-0.109
 bench-0.102
 Zig-0.098
round-0.097
 enclosure-0.094
Token\n
Token ID198
Feature activation-7.80e-01

Pos logit contributions
 benches+0.028
 bench+0.026
 enclosure+0.025
round+0.025
 throne+0.025
Neg logit contributions
 important-0.032
 valuable-0.029
 best-0.028
arre-0.028
 ok-0.027
TokenSam
Token ID16305
Feature activation-7.05e-01

Pos logit contributions
 benches+0.023
 bench+0.022
 Zig+0.021
round+0.021
 enclosure+0.021
Neg logit contributions
 important-0.029
 valuable-0.026
 best-0.025
arre-0.024
 ok-0.024
Token was
Token ID373
Feature activation+5.47e+00

Pos logit contributions
 benches+0.024
 bench+0.023
round+0.022
 Zig+0.022
 slide+0.022
Neg logit contributions
 important-0.023
 valuable-0.020
 best-0.019
 mistakes-0.019
 awful-0.018
Token unsure
Token ID22147
Feature activation+1.10e+01

Pos logit contributions
 important+0.139
 valuable+0.127
arre+0.119
 mistakes+0.118
ushing+0.116
Neg logit contributions
 bench-0.209
 benches-0.208
 Zig-0.199
 slide-0.194
 Ann-0.193
Token at
Token ID379
Feature activation-3.22e+00

Pos logit contributions
 important+0.385
 valuable+0.341
 best+0.330
 ok+0.315
arre+0.313
Neg logit contributions
 benches-0.357
 bench-0.335
 Zig-0.323
round-0.323
 enclosure-0.315
Token first
Token ID717
Feature activation+4.30e+00

Pos logit contributions
 benches+0.119
round+0.115
 bench+0.111
 enclosure+0.109
 Zig+0.109
Neg logit contributions
 important-0.096
 best-0.084
 valuable-0.079
 ok-0.077
 mistakes-0.076
Token,
Token ID11
Feature activation-2.22e+00

Pos logit contributions
 important+0.149
 valuable+0.132
 best+0.128
arre+0.123
 ok+0.122
Neg logit contributions
 benches-0.138
 bench-0.130
 Zig-0.125
round-0.125
 enclosure-0.122
Token but
Token ID475
Feature activation-3.72e-01

Pos logit contributions
 benches+0.064
 bench+0.060
 Zig+0.057
round+0.057
 slide+0.055
Neg logit contributions
 important-0.084
 valuable-0.075
 best-0.073
arre-0.073
 mistakes-0.071
Token then
Token ID788
Feature activation-1.54e+00

Pos logit contributions
 benches+0.009
 bench+0.008
 Zig+0.008
round+0.007
 slide+0.007
Neg logit contributions
 important-0.016
 valuable-0.015
 best-0.014
arre-0.014
 mistakes-0.014
Token shell
Token ID7582
Feature activation+4.23e-01

Pos logit contributions
 important+0.055
arre+0.051
 best+0.048
 valuable+0.048
 mistakes+0.048
Neg logit contributions
 bench-0.046
 benches-0.043
 slide-0.042
 Zig-0.040
 enclosure-0.040
Token and
Token ID290
Feature activation-4.03e+00

Pos logit contributions
 important+0.015
 valuable+0.013
 best+0.013
arre+0.013
 ok+0.012
Neg logit contributions
 benches-0.013
 bench-0.013
round-0.012
 Zig-0.012
 enclosure-0.012
Token a
Token ID257
Feature activation+1.10e+00

Pos logit contributions
 benches+0.130
 bench+0.123
 Zig+0.117
 slide+0.115
 enclosure+0.114
Neg logit contributions
 important-0.138
 valuable-0.122
arre-0.121
 best-0.118
 ok-0.114
Token y
Token ID331
Feature activation+5.14e+00

Pos logit contributions
 important+0.040
arre+0.035
 valuable+0.035
 best+0.034
 mistakes+0.033
Neg logit contributions
 bench-0.032
 benches-0.032
 slide-0.030
 Zig-0.029
round-0.029
Tokenolk
Token ID13597
Feature activation-4.90e+00

Pos logit contributions
 important+0.133
 valuable+0.115
 best+0.107
arre+0.103
 ok+0.103
Neg logit contributions
 benches-0.208
 bench-0.201
round-0.197
 enclosure-0.191
 slide-0.191
Token inside
Token ID2641
Feature activation-1.10e+01

Pos logit contributions
 benches+0.153
 bench+0.145
round+0.137
 Zig+0.135
 slide+0.134
Neg logit contributions
 important-0.176
 valuable-0.158
 best-0.150
arre-0.147
 ok-0.145
Token,"
Token ID553
Feature activation-3.52e+00

Pos logit contributions
 benches+0.376
 bench+0.353
 Zig+0.350
round+0.347
 enclosure+0.336
Neg logit contributions
 important-0.354
 valuable-0.308
 best-0.303
 mistakes-0.289
 awful-0.284
Token she
Token ID673
Feature activation+1.20e+00

Pos logit contributions
 benches+0.105
round+0.102
 enclosure+0.101
 bench+0.095
 Zig+0.094
Neg logit contributions
 important-0.133
 valuable-0.117
 best-0.114
 awful-0.110
 unfair-0.108
Token said
Token ID531
Feature activation+1.11e+00

Pos logit contributions
 important+0.047
 valuable+0.043
 best+0.042
arre+0.041
 mistakes+0.040
Neg logit contributions
 benches-0.031
 bench-0.030
 Zig-0.029
 slide-0.028
 Ann-0.027
Token.
Token ID13
Feature activation+9.72e-01

Pos logit contributions
 important+0.039
 valuable+0.035
 best+0.033
arre+0.033
 awful+0.032
Neg logit contributions
 benches-0.034
 bench-0.033
 Zig-0.031
round-0.031
 slide-0.030
Token\n
Token ID198
Feature activation-5.64e-01

Pos logit contributions
 important+0.030
 valuable+0.027
 best+0.026
 ok+0.025
arre+0.025
Neg logit contributions
 benches-0.033
 enclosure-0.031
round-0.031
 throne-0.030
 bench-0.030
Token cheese
Token ID9891
Feature activation-4.97e+00

Pos logit contributions
 benches+0.039
 bench+0.037
 slide+0.034
round+0.033
 Zig+0.033
Neg logit contributions
 important-0.055
 valuable-0.050
arre-0.049
 ok-0.048
 best-0.047
Token,
Token ID11
Feature activation-5.25e+00

Pos logit contributions
 benches+0.148
 bench+0.138
 Zig+0.132
round+0.132
 enclosure+0.129
Neg logit contributions
 important-0.184
 valuable-0.165
 best-0.160
arre-0.154
 ok-0.153
Token tomato
Token ID24240
Feature activation-3.29e+00

Pos logit contributions
 benches+0.166
 bench+0.156
 Zig+0.153
round+0.153
 enclosure+0.148
Neg logit contributions
 important-0.185
 valuable-0.164
 best-0.159
 ok-0.151
 awful-0.151
Token and
Token ID290
Feature activation-5.65e+00

Pos logit contributions
 benches+0.103
 bench+0.096
round+0.093
 Zig+0.093
 enclosure+0.090
Neg logit contributions
 important-0.117
 valuable-0.104
 best-0.101
arre-0.096
 ok-0.096
Token lettuce
Token ID39406
Feature activation-4.35e+00

Pos logit contributions
 benches+0.180
 bench+0.169
 Zig+0.162
round+0.162
 enclosure+0.158
Neg logit contributions
 important-0.198
 valuable-0.176
 best-0.170
arre-0.164
 ok-0.162
Token inside
Token ID2641
Feature activation-1.08e+01

Pos logit contributions
 benches+0.132
 bench+0.124
 Zig+0.117
round+0.115
 enclosure+0.115
Neg logit contributions
 important-0.160
 valuable-0.143
 best-0.135
arre-0.135
 mistakes-0.133
Token.
Token ID13
Feature activation+1.15e-01

Pos logit contributions
 benches+0.348
 bench+0.325
round+0.323
 Zig+0.322
 enclosure+0.311
Neg logit contributions
 important-0.377
 valuable-0.332
 best-0.326
 mistakes-0.307
 ok-0.307
Token When
Token ID1649
Feature activation-1.77e+00

Pos logit contributions
 important+0.004
 valuable+0.003
 best+0.003
 ok+0.003
arre+0.003
Neg logit contributions
 benches-0.004
 bench-0.004
round-0.004
 enclosure-0.003
 throne-0.003
Token the
Token ID262
Feature activation-3.60e-01

Pos logit contributions
round+0.057
 benches+0.055
 enclosure+0.054
 bench+0.051
 Zig+0.050
Neg logit contributions
 important-0.061
 best-0.055
 valuable-0.053
 mistakes-0.050
 ok-0.050
Token sandwich
Token ID20433
Feature activation-2.24e+00

Pos logit contributions
 benches+0.010
 bench+0.010
 slide+0.009
 walk+0.008
 enclosure+0.008
Neg logit contributions
 important-0.014
arre-0.013
 valuable-0.013
 likely-0.012
 mistakes-0.012
Token was
Token ID373
Feature activation+4.26e+00

Pos logit contributions
 benches+0.085
 bench+0.081
round+0.078
 Zig+0.078
 enclosure+0.077
Neg logit contributions
 important-0.065
 valuable-0.056
 best-0.054
arre-0.051
 ok-0.050
Token big
Token ID1263
Feature activation-1.29e-01

Pos logit contributions
 important+0.004
 valuable+0.003
 best+0.003
arre+0.003
 ok+0.003
Neg logit contributions
 benches-0.004
 bench-0.004
 Zig-0.003
round-0.003
 slide-0.003
Token cookie
Token ID19751
Feature activation-3.29e+00

Pos logit contributions
 benches+0.003
 Zig+0.003
round+0.003
 bench+0.003
 enclosure+0.003
Neg logit contributions
 important-0.005
 valuable-0.005
 best-0.005
 awful-0.004
arre-0.004
Token on
Token ID319
Feature activation-7.30e+00

Pos logit contributions
 benches+0.097
 bench+0.093
 slide+0.086
round+0.085
 Zig+0.083
Neg logit contributions
 important-0.124
 valuable-0.113
arre-0.106
 best-0.106
 ok-0.105
Token the
Token ID262
Feature activation-3.25e+00

Pos logit contributions
 benches+0.203
round+0.197
 enclosure+0.189
 Zig+0.188
 bench+0.183
Neg logit contributions
 important-0.268
 valuable-0.236
 awful-0.228
 best-0.227
 mistakes-0.219
Token table
Token ID3084
Feature activation-5.70e+00

Pos logit contributions
 benches+0.099
 Zig+0.097
round+0.093
 bench+0.089
 enclosure+0.089
Neg logit contributions
 important-0.115
 best-0.101
 valuable-0.101
 awful-0.094
 mistakes-0.090
Token inside
Token ID2641
Feature activation-1.08e+01

Pos logit contributions
 benches+0.186
 bench+0.177
round+0.167
 enclosure+0.164
 Zig+0.164
Neg logit contributions
 important-0.197
 valuable-0.176
 best-0.167
arre-0.163
 ok-0.162
Token the
Token ID262
Feature activation-4.21e+00

Pos logit contributions
 benches+0.388
 bench+0.362
round+0.361
 Zig+0.360
 enclosure+0.350
Neg logit contributions
 important-0.335
 valuable-0.287
 best-0.281
 mistakes-0.266
 ok-0.262
Token house
Token ID2156
Feature activation-7.87e+00

Pos logit contributions
 benches+0.111
 Zig+0.104
round+0.101
 bench+0.101
 enclosure+0.096
Neg logit contributions
 important-0.170
 valuable-0.153
 best-0.150
arre-0.142
 mistakes-0.142
Token.
Token ID13
Feature activation+2.85e-01

Pos logit contributions
 benches+0.267
 bench+0.252
 Zig+0.242
round+0.242
 enclosure+0.237
Neg logit contributions
 important-0.260
 valuable-0.229
 best-0.220
arre-0.212
 ok-0.210
Token\n
Token ID198
Feature activation-9.69e-01

Pos logit contributions
 important+0.010
 valuable+0.009
 ok+0.008
 best+0.008
arre+0.008
Neg logit contributions
 benches-0.009
 throne-0.009
 bench-0.008
round-0.008
 enclosure-0.008
Token\n
Token ID198
Feature activation-8.05e-01

Pos logit contributions
 benches+0.034
 throne+0.031
 enclosure+0.031
 bench+0.031
round+0.030
Neg logit contributions
 important-0.031
 valuable-0.028
 best-0.025
arre-0.025
 ok-0.025
Token placed
Token ID4624
Feature activation-2.17e+00

Pos logit contributions
 important+0.008
 valuable+0.007
 best+0.007
arre+0.007
 mistakes+0.006
Neg logit contributions
 benches-0.008
 bench-0.008
 enclosure-0.007
 Zig-0.007
 slide-0.007
Token small
Token ID1402
Feature activation+3.35e-01

Pos logit contributions
 benches+0.064
 bench+0.062
round+0.062
 enclosure+0.061
 Zig+0.058
Neg logit contributions
 important-0.077
 best-0.070
 valuable-0.068
 ok-0.064
 awful-0.063
Token pieces
Token ID5207
Feature activation-2.07e+00

Pos logit contributions
 important+0.012
 valuable+0.011
 best+0.011
 mistakes+0.010
arre+0.010
Neg logit contributions
 benches-0.010
 bench-0.009
 Zig-0.009
round-0.009
 enclosure-0.009
Token of
Token ID286
Feature activation-6.79e+00

Pos logit contributions
 benches+0.060
 bench+0.056
 Zig+0.053
round+0.053
 enclosure+0.052
Neg logit contributions
 important-0.078
 valuable-0.070
 best-0.067
arre-0.066
 mistakes-0.065
Token wood
Token ID4898
Feature activation-5.80e+00

Pos logit contributions
 benches+0.198
round+0.189
 bench+0.188
 Zig+0.183
 enclosure+0.178
Neg logit contributions
 important-0.253
 valuable-0.227
 best-0.221
 mistakes-0.209
 awful-0.207
Token inside
Token ID2641
Feature activation-1.07e+01

Pos logit contributions
 benches+0.183
 bench+0.172
 Zig+0.165
round+0.165
 enclosure+0.160
Neg logit contributions
 important-0.204
 valuable-0.181
 best-0.176
arre-0.168
 ok-0.167
Token and
Token ID290
Feature activation-6.03e+00

Pos logit contributions
 benches+0.303
round+0.288
 Zig+0.283
 bench+0.281
 enclosure+0.270
Neg logit contributions
 important-0.406
 valuable-0.362
 best-0.357
 ok-0.334
 mistakes-0.334
Token soon
Token ID2582
Feature activation+7.59e-01

Pos logit contributions
 benches+0.192
 bench+0.181
round+0.174
 Zig+0.174
 enclosure+0.170
Neg logit contributions
 important-0.211
 valuable-0.187
 best-0.181
arre-0.173
 ok-0.172
Token the
Token ID262
Feature activation-1.09e+00

Pos logit contributions
 important+0.029
 valuable+0.026
arre+0.025
 best+0.025
 mistakes+0.024
Neg logit contributions
 benches-0.022
 bench-0.020
 Zig-0.019
 slide-0.018
 enclosure-0.018
Token fire
Token ID2046
Feature activation-5.47e+00

Pos logit contributions
 benches+0.032
 bench+0.030
 Zig+0.028
round+0.028
 slide+0.028
Neg logit contributions
 important-0.041
 valuable-0.037
 best-0.035
arre-0.035
 ok-0.034
Token was
Token ID373
Feature activation+5.66e+00

Pos logit contributions
 benches+0.182
 bench+0.172
 Zig+0.164
round+0.164
 enclosure+0.162
Neg logit contributions
 important-0.184
 valuable-0.163
 best-0.154
arre-0.151
 mistakes-0.149
Token big
Token ID1263
Feature activation+1.36e+00

Pos logit contributions
 important+0.091
 valuable+0.080
 best+0.076
arre+0.076
 mistakes+0.075
Neg logit contributions
 benches-0.098
 bench-0.096
 slide-0.089
round-0.088
 Zig-0.088
Token box
Token ID3091
Feature activation-1.74e+00

Pos logit contributions
 important+0.053
 valuable+0.047
 best+0.047
 ok+0.044
arre+0.044
Neg logit contributions
 benches-0.038
 bench-0.034
 Zig-0.034
round-0.033
 enclosure-0.032
Token with
Token ID351
Feature activation-2.35e+00

Pos logit contributions
 benches+0.055
 bench+0.052
round+0.050
 Zig+0.050
 enclosure+0.049
Neg logit contributions
 important-0.061
 valuable-0.054
 best-0.052
arre-0.051
 mistakes-0.050
Token many
Token ID867
Feature activation-1.52e+00

Pos logit contributions
 benches+0.080
 bench+0.075
 Zig+0.072
round+0.071
 enclosure+0.070
Neg logit contributions
 important-0.078
 valuable-0.069
 best-0.066
arre-0.065
 ok-0.063
Token balls
Token ID11333
Feature activation-3.91e+00

Pos logit contributions
 benches+0.046
 bench+0.041
 slide+0.038
 Zig+0.038
 enclosure+0.037
Neg logit contributions
 important-0.056
arre-0.051
 valuable-0.050
 ok-0.049
 best-0.049
Token inside
Token ID2641
Feature activation-1.06e+01

Pos logit contributions
 benches+0.126
 bench+0.118
round+0.113
 Zig+0.112
 enclosure+0.110
Neg logit contributions
 important-0.136
 valuable-0.121
 best-0.116
arre-0.113
 ok-0.112
Token.
Token ID13
Feature activation+9.30e-01

Pos logit contributions
 benches+0.367
 bench+0.344
 Zig+0.340
round+0.338
 enclosure+0.328
Neg logit contributions
 important-0.340
 valuable-0.296
 best-0.289
 mistakes-0.270
 ok-0.267
Token The
Token ID383
Feature activation+2.28e-01

Pos logit contributions
 important+0.031
 valuable+0.028
 best+0.027
 ok+0.026
 mistakes+0.025
Neg logit contributions
 benches-0.029
 bench-0.027
round-0.026
 gra-0.025
 slide-0.025
Token man
Token ID582
Feature activation-2.50e+00

Pos logit contributions
 important+0.009
 valuable+0.008
 best+0.008
arre+0.007
 ok+0.007
Neg logit contributions
 benches-0.007
 bench-0.006
 Zig-0.006
round-0.006
 enclosure-0.006
Token told
Token ID1297
Feature activation+1.37e+00

Pos logit contributions
 benches+0.091
 bench+0.087
 Zig+0.084
round+0.084
 enclosure+0.082
Neg logit contributions
 important-0.075
 valuable-0.066
 best-0.063
arre-0.060
 ok-0.060
Token the
Token ID262
Feature activation+1.06e+00

Pos logit contributions
 important+0.051
 awful+0.045
 mistakes+0.045
 best+0.044
 valuable+0.043
Neg logit contributions
round-0.038
 benches-0.037
 bench-0.036
stack-0.035
 enclosure-0.034
Token cage
Token ID17186
Feature activation-4.73e+00

Pos logit contributions
 important+0.024
 valuable+0.021
 best+0.020
arre+0.019
 ok+0.019
Neg logit contributions
 benches-0.026
 bench-0.024
 Zig-0.023
round-0.023
 enclosure-0.023
Token nearby
Token ID6716
Feature activation-1.88e+00

Pos logit contributions
 benches+0.152
 bench+0.140
 Zig+0.139
round+0.136
 enclosure+0.133
Neg logit contributions
 important-0.165
 valuable-0.145
 best-0.142
 ok-0.134
 awful-0.134
Token with
Token ID351
Feature activation-2.96e+00

Pos logit contributions
 benches+0.058
 Zig+0.053
round+0.052
 bench+0.050
 enclosure+0.048
Neg logit contributions
 important-0.066
 best-0.057
 valuable-0.057
 awful-0.054
 ok-0.054
Token another
Token ID1194
Feature activation-2.02e-01

Pos logit contributions
 benches+0.104
 bench+0.099
 Zig+0.095
round+0.095
 enclosure+0.093
Neg logit contributions
 important-0.093
 valuable-0.082
 best-0.079
arre-0.076
 ok-0.075
Token bird
Token ID6512
Feature activation-4.32e+00

Pos logit contributions
 benches+0.006
 bench+0.005
 Zig+0.005
round+0.005
 enclosure+0.005
Neg logit contributions
 important-0.008
 valuable-0.007
 best-0.007
arre-0.007
 ok-0.006
Token inside
Token ID2641
Feature activation-1.06e+01

Pos logit contributions
 benches+0.138
 bench+0.128
round+0.125
 Zig+0.125
 enclosure+0.119
Neg logit contributions
 important-0.151
 valuable-0.133
 best-0.130
arre-0.123
 ok-0.123
Token.
Token ID13
Feature activation+1.01e-01

Pos logit contributions
 benches+0.385
round+0.365
 Zig+0.361
 bench+0.350
 enclosure+0.345
Neg logit contributions
 important-0.318
 best-0.269
 valuable-0.268
 mistakes-0.248
 ok-0.248
Token
Token ID198
Feature activation-1.01e+00

Pos logit contributions
 important+0.003
 valuable+0.003
 ok+0.003
 best+0.003
arre+0.003
Neg logit contributions
 benches-0.003
 bench-0.003
round-0.003
 slide-0.003
 walk-0.003
Token
Token ID198
Feature activation-8.21e-01

Pos logit contributions
 benches+0.034
 bench+0.031
round+0.030
 throne+0.030
 enclosure+0.030
Neg logit contributions
 important-0.034
 valuable-0.031
 best-0.028
 certainly-0.027
 ok-0.027
TokenR
Token ID49
Feature activation+1.01e+00

Pos logit contributions
 benches+0.030
 bench+0.027
 Zig+0.026
round+0.026
 enclosure+0.026
Neg logit contributions
 important-0.026
 valuable-0.023
 best-0.021
arre-0.021
 ok-0.021
Tokenalph
Token ID17307
Feature activation-1.82e+00

Pos logit contributions
 important+0.033
 valuable+0.029
 best+0.028
arre+0.027
 awful+0.026
Neg logit contributions
 benches-0.035
 bench-0.033
 Zig-0.032
round-0.032
 enclosure-0.032
Token cage
Token ID17186
Feature activation-4.73e+00

Pos logit contributions
 important+0.024
 valuable+0.021
 best+0.020
arre+0.019
 ok+0.019
Neg logit contributions
 benches-0.026
 bench-0.024
 Zig-0.023
round-0.023
 enclosure-0.023
Token nearby
Token ID6716
Feature activation-1.88e+00

Pos logit contributions
 benches+0.152
 bench+0.140
 Zig+0.139
round+0.136
 enclosure+0.133
Neg logit contributions
 important-0.165
 valuable-0.145
 best-0.142
 ok-0.134
 awful-0.134
Token with
Token ID351
Feature activation-2.96e+00

Pos logit contributions
 benches+0.058
 Zig+0.053
round+0.052
 bench+0.050
 enclosure+0.048
Neg logit contributions
 important-0.066
 best-0.057
 valuable-0.057
 awful-0.054
 ok-0.054
Token another
Token ID1194
Feature activation-2.02e-01

Pos logit contributions
 benches+0.104
 bench+0.099
 Zig+0.095
round+0.095
 enclosure+0.093
Neg logit contributions
 important-0.093
 valuable-0.082
 best-0.079
arre-0.076
 ok-0.075
Token bird
Token ID6512
Feature activation-4.32e+00

Pos logit contributions
 benches+0.006
 bench+0.005
 Zig+0.005
round+0.005
 enclosure+0.005
Neg logit contributions
 important-0.008
 valuable-0.007
 best-0.007
arre-0.007
 ok-0.006
Token inside
Token ID2641
Feature activation-1.06e+01

Pos logit contributions
 benches+0.138
 bench+0.128
round+0.125
 Zig+0.125
 enclosure+0.119
Neg logit contributions
 important-0.151
 valuable-0.133
 best-0.130
arre-0.123
 ok-0.123
Token.
Token ID13
Feature activation+1.01e-01

Pos logit contributions
 benches+0.385
round+0.365
 Zig+0.361
 bench+0.350
 enclosure+0.345
Neg logit contributions
 important-0.318
 best-0.269
 valuable-0.268
 mistakes-0.248
 ok-0.248
Token
Token ID198
Feature activation-1.01e+00

Pos logit contributions
 important+0.003
 valuable+0.003
 ok+0.003
 best+0.003
arre+0.003
Neg logit contributions
 benches-0.003
 bench-0.003
round-0.003
 slide-0.003
 walk-0.003
Token
Token ID198
Feature activation-8.21e-01

Pos logit contributions
 benches+0.034
 bench+0.031
round+0.030
 throne+0.030
 enclosure+0.030
Neg logit contributions
 important-0.034
 valuable-0.031
 best-0.028
 certainly-0.027
 ok-0.027
TokenR
Token ID49
Feature activation+1.01e+00

Pos logit contributions
 benches+0.030
 bench+0.027
 Zig+0.026
round+0.026
 enclosure+0.026
Neg logit contributions
 important-0.026
 valuable-0.023
 best-0.021
arre-0.021
 ok-0.021
Tokenalph
Token ID17307
Feature activation-1.82e+00

Pos logit contributions
 important+0.033
 valuable+0.029
 best+0.028
arre+0.027
 awful+0.026
Neg logit contributions
 benches-0.035
 bench-0.033
 Zig-0.032
round-0.032
 enclosure-0.032
Token found
Token ID1043
Feature activation-4.76e-01

Pos logit contributions
 important+0.031
 valuable+0.028
 best+0.026
arre+0.026
 mistakes+0.025
Neg logit contributions
 benches-0.031
 bench-0.030
 Zig-0.028
 slide-0.028
 enclosure-0.028
Token a
Token ID257
Feature activation+9.75e-01

Pos logit contributions
 benches+0.015
 bench+0.014
 Zig+0.013
 enclosure+0.013
 slide+0.013
Neg logit contributions
 important-0.017
 valuable-0.015
arre-0.015
 best-0.015
 mistakes-0.014
Token lot
Token ID1256
Feature activation+2.64e+00

Pos logit contributions
 important+0.030
 valuable+0.026
arre+0.025
 best+0.025
 mistakes+0.024
Neg logit contributions
 benches-0.034
 bench-0.034
 slide-0.031
 Zig-0.031
round-0.031
Token of
Token ID286
Feature activation-4.85e+00

Pos logit contributions
 important+0.100
 valuable+0.089
 best+0.088
arre+0.087
 ok+0.085
Neg logit contributions
 benches-0.076
 bench-0.070
round-0.067
 Zig-0.066
 enclosure-0.065
Token water
Token ID1660
Feature activation-2.30e+00

Pos logit contributions
 benches+0.166
 bench+0.157
 Zig+0.151
round+0.150
 enclosure+0.147
Neg logit contributions
 important-0.159
 valuable-0.140
 best-0.135
arre-0.129
 ok-0.128
Token inside
Token ID2641
Feature activation-1.05e+01

Pos logit contributions
 benches+0.072
 bench+0.068
 Zig+0.065
round+0.065
 enclosure+0.063
Neg logit contributions
 important-0.082
 valuable-0.073
 best-0.070
arre-0.068
 ok-0.067
Token.
Token ID13
Feature activation+5.38e-01

Pos logit contributions
 benches+0.352
 bench+0.330
 Zig+0.322
round+0.322
 enclosure+0.312
Neg logit contributions
 important-0.353
 valuable-0.310
 best-0.300
 ok-0.285
 mistakes-0.284
Token He
Token ID679
Feature activation+8.87e-01

Pos logit contributions
 important+0.018
 valuable+0.016
 ok+0.015
 best+0.015
arre+0.015
Neg logit contributions
 benches-0.018
 bench-0.016
round-0.015
 slide-0.015
 Zig-0.015
Token had
Token ID550
Feature activation+2.17e+00

Pos logit contributions
 important+0.031
 valuable+0.027
 best+0.027
 mistakes+0.026
arre+0.026
Neg logit contributions
 benches-0.028
 bench-0.027
 slide-0.025
 Zig-0.025
 enclosure-0.025
Token never
Token ID1239
Feature activation+2.74e+00

Pos logit contributions
 important+0.069
arre+0.061
 valuable+0.061
ushing+0.060
 best+0.059
Neg logit contributions
 benches-0.075
 bench-0.070
 Zig-0.067
 slide-0.065
 enclosure-0.064
Token seen
Token ID1775
Feature activation-5.46e-01

Pos logit contributions
 important+0.104
 valuable+0.093
 best+0.090
arre+0.087
 ok+0.086
Neg logit contributions
 benches-0.080
 bench-0.074
 Zig-0.071
round-0.071
 enclosure-0.069
Token found
Token ID1043
Feature activation-4.76e-01

Pos logit contributions
 important+0.031
 valuable+0.028
 best+0.026
arre+0.026
 mistakes+0.025
Neg logit contributions
 benches-0.031
 bench-0.030
 Zig-0.028
 slide-0.028
 enclosure-0.028
Token a
Token ID257
Feature activation+9.75e-01

Pos logit contributions
 benches+0.015
 bench+0.014
 Zig+0.013
 enclosure+0.013
 slide+0.013
Neg logit contributions
 important-0.017
 valuable-0.015
arre-0.015
 best-0.015
 mistakes-0.014
Token lot
Token ID1256
Feature activation+2.64e+00

Pos logit contributions
 important+0.030
 valuable+0.026
arre+0.025
 best+0.025
 mistakes+0.024
Neg logit contributions
 benches-0.034
 bench-0.034
 slide-0.031
 Zig-0.031
round-0.031
Token of
Token ID286
Feature activation-4.85e+00

Pos logit contributions
 important+0.100
 valuable+0.089
 best+0.088
arre+0.087
 ok+0.085
Neg logit contributions
 benches-0.076
 bench-0.070
round-0.067
 Zig-0.066
 enclosure-0.065
Token water
Token ID1660
Feature activation-2.30e+00

Pos logit contributions
 benches+0.166
 bench+0.157
 Zig+0.151
round+0.150
 enclosure+0.147
Neg logit contributions
 important-0.159
 valuable-0.140
 best-0.135
arre-0.129
 ok-0.128
Token inside
Token ID2641
Feature activation-1.05e+01

Pos logit contributions
 benches+0.072
 bench+0.068
 Zig+0.065
round+0.065
 enclosure+0.063
Neg logit contributions
 important-0.082
 valuable-0.073
 best-0.070
arre-0.068
 ok-0.067
Token.
Token ID13
Feature activation+5.38e-01

Pos logit contributions
 benches+0.352
 bench+0.330
 Zig+0.322
round+0.322
 enclosure+0.312
Neg logit contributions
 important-0.353
 valuable-0.310
 best-0.300
 ok-0.285
 mistakes-0.284
Token He
Token ID679
Feature activation+8.87e-01

Pos logit contributions
 important+0.018
 valuable+0.016
 ok+0.015
 best+0.015
arre+0.015
Neg logit contributions
 benches-0.018
 bench-0.016
round-0.015
 slide-0.015
 Zig-0.015
Token had
Token ID550
Feature activation+2.17e+00

Pos logit contributions
 important+0.031
 valuable+0.027
 best+0.027
 mistakes+0.026
arre+0.026
Neg logit contributions
 benches-0.028
 bench-0.027
 slide-0.025
 Zig-0.025
 enclosure-0.025
Token never
Token ID1239
Feature activation+2.74e+00

Pos logit contributions
 important+0.069
arre+0.061
 valuable+0.061
ushing+0.060
 best+0.059
Neg logit contributions
 benches-0.075
 bench-0.070
 Zig-0.067
 slide-0.065
 enclosure-0.064
Token seen
Token ID1775
Feature activation-5.46e-01

Pos logit contributions
 important+0.104
 valuable+0.093
 best+0.090
arre+0.087
 ok+0.086
Neg logit contributions
 benches-0.080
 bench-0.074
 Zig-0.071
round-0.071
 enclosure-0.069
Token a
Token ID257
Feature activation+6.25e-01

Pos logit contributions
 benches+0.017
 bench+0.016
 Zig+0.015
round+0.015
 enclosure+0.015
Neg logit contributions
 important-0.017
 valuable-0.015
 best-0.014
arre-0.014
 ok-0.014
Token box
Token ID3091
Feature activation-1.49e+00

Pos logit contributions
 important+0.018
 valuable+0.016
 best+0.015
arre+0.015
 mistakes+0.015
Neg logit contributions
 benches-0.023
 bench-0.022
 Zig-0.021
round-0.021
 slide-0.021
Token with
Token ID351
Feature activation-2.70e+00

Pos logit contributions
 benches+0.044
 bench+0.042
round+0.040
 Zig+0.039
 enclosure+0.039
Neg logit contributions
 important-0.055
 valuable-0.049
 best-0.047
arre-0.047
 mistakes-0.046
Token the
Token ID262
Feature activation-1.29e+00

Pos logit contributions
 benches+0.090
 bench+0.084
 Zig+0.080
round+0.079
 enclosure+0.078
Neg logit contributions
 important-0.091
 valuable-0.080
arre-0.077
 best-0.077
 ok-0.075
Token computer
Token ID3644
Feature activation-3.78e+00

Pos logit contributions
 benches+0.040
 bench+0.038
 Zig+0.035
 slide+0.035
round+0.035
Neg logit contributions
 important-0.046
 valuable-0.041
arre-0.040
 best-0.039
 ok-0.038
Token inside
Token ID2641
Feature activation-1.05e+01

Pos logit contributions
 benches+0.117
 bench+0.111
 slide+0.104
round+0.103
 enclosure+0.102
Neg logit contributions
 important-0.136
 valuable-0.122
arre-0.116
 best-0.115
 ok-0.114
Token.
Token ID13
Feature activation+2.10e-01

Pos logit contributions
 benches+0.341
 bench+0.322
round+0.305
 Zig+0.305
 enclosure+0.299
Neg logit contributions
 important-0.360
 valuable-0.320
 best-0.307
arre-0.301
 ok-0.295
Token It
Token ID632
Feature activation+1.49e+00

Pos logit contributions
 important+0.008
 valuable+0.007
 best+0.007
 likely+0.006
 ok+0.006
Neg logit contributions
 benches-0.006
 bench-0.005
 gra-0.005
round-0.005
 walk-0.005
Token had
Token ID550
Feature activation+2.82e+00

Pos logit contributions
 important+0.050
arre+0.047
ealing+0.046
 mistakes+0.045
 valuable+0.045
Neg logit contributions
 benches-0.045
 bench-0.045
 slide-0.044
 Zig-0.043
 enclosure-0.041
Token been
Token ID587
Feature activation+7.79e+00

Pos logit contributions
 important+0.091
 valuable+0.080
 best+0.077
arre+0.074
 ok+0.074
Neg logit contributions
 benches-0.097
 bench-0.092
 Zig-0.088
round-0.088
 enclosure-0.086
Token hidden
Token ID7104
Feature activation+2.67e+00

Pos logit contributions
 important+0.247
 valuable+0.223
arre+0.219
 mistakes+0.211
 best+0.211
Neg logit contributions
 benches-0.265
 bench-0.246
 Zig-0.238
round-0.230
 slide-0.228
Token put
Token ID1234
Feature activation-1.88e+00

Pos logit contributions
 benches+0.029
 bench+0.027
round+0.027
 Zig+0.027
 enclosure+0.026
Neg logit contributions
 important-0.027
 valuable-0.024
 best-0.024
arre-0.022
 awful-0.022
Token their
Token ID511
Feature activation-4.17e-02

Pos logit contributions
round+0.054
 benches+0.053
 bench+0.052
 enclosure+0.050
 Zig+0.050
Neg logit contributions
 important-0.069
 best-0.066
 valuable-0.060
 precious-0.056
 awful-0.055
Token dolls
Token ID36062
Feature activation-3.69e+00

Pos logit contributions
round+0.001
 benches+0.001
 Zig+0.001
 bench+0.001
 gra+0.001
Neg logit contributions
 important-0.001
 best-0.001
 valuable-0.001
 precious-0.001
 mistakes-0.001
Token and
Token ID290
Feature activation-6.00e+00

Pos logit contributions
 benches+0.128
 Zig+0.126
 gra+0.122
round+0.121
 bench+0.120
Neg logit contributions
 important-0.108
 best-0.103
 valuable-0.092
 precious-0.085
arre-0.083
Token cars
Token ID5006
Feature activation-4.44e+00

Pos logit contributions
round+0.195
 bench+0.170
stack+0.170
 Zig+0.170
 enclosure+0.169
Neg logit contributions
 important-0.196
 best-0.185
 valuable-0.166
 precious-0.162
 mistakes-0.159
Token inside
Token ID2641
Feature activation-1.04e+01

Pos logit contributions
 benches+0.167
 gra+0.163
 bench+0.161
 enclosure+0.157
 Fif+0.156
Neg logit contributions
 important-0.112
 best-0.106
 valuable-0.092
arre-0.081
/-0.080
Token and
Token ID290
Feature activation-6.11e+00

Pos logit contributions
 benches+0.359
round+0.345
 Zig+0.335
 bench+0.333
 enclosure+0.326
Neg logit contributions
 important-0.341
 best-0.298
 valuable-0.296
 mistakes-0.271
 ok-0.267
Token pretended
Token ID36766
Feature activation+9.25e-01

Pos logit contributions
round+0.193
 Zig+0.174
 enclosure+0.174
 benches+0.174
 bench+0.170
Neg logit contributions
 important-0.211
 best-0.196
 valuable-0.180
 precious-0.167
 focused-0.164
Token to
Token ID284
Feature activation-5.82e+00

Pos logit contributions
 important+0.035
 valuable+0.031
 best+0.031
 awful+0.028
 mistakes+0.028
Neg logit contributions
 benches-0.026
round-0.025
 enclosure-0.024
 bench-0.024
 Zig-0.023
Token be
Token ID307
Feature activation+4.04e+00

Pos logit contributions
 benches+0.198
round+0.196
 Zig+0.180
 bench+0.177
stack+0.175
Neg logit contributions
 important-0.183
 valuable-0.154
 best-0.153
 awful-0.139
 mistakes-0.135
Token kings
Token ID21557
Feature activation-1.47e+00

Pos logit contributions
 important+0.141
 valuable+0.124
 best+0.122
 ok+0.113
arre+0.112
Neg logit contributions
 benches-0.131
 bench-0.125
round-0.122
 Zig-0.117
 enclosure-0.117
Token box
Token ID3091
Feature activation-1.53e+00

Pos logit contributions
 important+0.009
 valuable+0.008
 best+0.008
arre+0.007
 ok+0.007
Neg logit contributions
 benches-0.006
 bench-0.006
 Zig-0.006
round-0.006
 enclosure-0.005
Token with
Token ID351
Feature activation-2.42e+00

Pos logit contributions
 benches+0.043
 bench+0.041
round+0.039
 Zig+0.038
 enclosure+0.038
Neg logit contributions
 important-0.059
 valuable-0.053
 best-0.051
arre-0.050
 mistakes-0.050
Token lots
Token ID6041
Feature activation+2.81e+00

Pos logit contributions
 benches+0.082
 bench+0.075
 Zig+0.073
 slide+0.070
 enclosure+0.069
Neg logit contributions
 important-0.080
arre-0.072
 valuable-0.071
 ok-0.068
 best-0.068
Token of
Token ID286
Feature activation-5.84e+00

Pos logit contributions
 important+0.107
 valuable+0.095
 best+0.094
arre+0.092
 ok+0.091
Neg logit contributions
 benches-0.082
 bench-0.076
round-0.072
 Zig-0.071
 enclosure-0.070
Token equipment
Token ID5112
Feature activation-3.17e+00

Pos logit contributions
 benches+0.190
 bench+0.179
 Zig+0.172
round+0.172
 enclosure+0.168
Neg logit contributions
 important-0.201
 valuable-0.178
 best-0.172
arre-0.165
 ok-0.164
Token inside
Token ID2641
Feature activation-1.04e+01

Pos logit contributions
 benches+0.092
 bench+0.086
round+0.081
 slide+0.080
 Zig+0.079
Neg logit contributions
 important-0.120
 valuable-0.109
arre-0.105
 best-0.104
 ok-0.103
Token.
Token ID13
Feature activation+7.57e-02

Pos logit contributions
 benches+0.335
 bench+0.317
 Zig+0.297
round+0.297
 slide+0.293
Neg logit contributions
 important-0.362
 valuable-0.323
 best-0.310
arre-0.307
 ok-0.300
Token They
Token ID1119
Feature activation-3.69e-01

Pos logit contributions
 important+0.003
 valuable+0.002
 best+0.002
arre+0.002
 ok+0.002
Neg logit contributions
 benches-0.002
 bench-0.002
round-0.002
 Zig-0.002
 slide-0.002
Token tried
Token ID3088
Feature activation+2.42e+00

Pos logit contributions
 benches+0.012
 bench+0.012
 Zig+0.011
round+0.011
 slide+0.011
Neg logit contributions
 important-0.013
 valuable-0.011
 best-0.011
 mistakes-0.010
arre-0.010
Token to
Token ID284
Feature activation-7.00e+00

Pos logit contributions
 important+0.102
 valuable+0.093
 best+0.091
 mistakes+0.086
arre+0.086
Neg logit contributions
 benches-0.060
round-0.056
 bench-0.055
 Zig-0.053
 enclosure-0.053
Token open
Token ID1280
Feature activation-3.81e+00

Pos logit contributions
 benches+0.188
round+0.172
 Zig+0.169
 bench+0.168
 enclosure+0.167
Neg logit contributions
 important-0.277
 valuable-0.246
 best-0.241
 ok-0.238
 awful-0.229
Token found
Token ID1043
Feature activation-2.05e-01

Pos logit contributions
 important+0.013
 valuable+0.012
 best+0.012
arre+0.011
 mistakes+0.011
Neg logit contributions
 benches-0.013
 bench-0.012
 Zig-0.011
 enclosure-0.011
round-0.011
Token a
Token ID257
Feature activation+6.50e-01

Pos logit contributions
 benches+0.007
 bench+0.007
round+0.007
 Zig+0.007
 enclosure+0.007
Neg logit contributions
 important-0.007
 valuable-0.006
 best-0.006
 ok-0.005
 awful-0.005
Token family
Token ID1641
Feature activation-1.77e+00

Pos logit contributions
 important+0.019
 valuable+0.017
 best+0.016
 ok+0.015
 awful+0.014
Neg logit contributions
 benches-0.025
 bench-0.023
round-0.023
 Zig-0.023
 enclosure-0.022
Token of
Token ID286
Feature activation-5.28e+00

Pos logit contributions
 benches+0.056
 bench+0.052
 Zig+0.051
round+0.050
 enclosure+0.048
Neg logit contributions
 important-0.063
 valuable-0.056
 best-0.055
 ok-0.051
arre-0.051
Token birds
Token ID10087
Feature activation-5.54e+00

Pos logit contributions
round+0.172
 benches+0.171
 bench+0.160
 Zig+0.158
 enclosure+0.156
Neg logit contributions
 important-0.185
 valuable-0.160
 best-0.158
 mistakes-0.146
 precious-0.143
Token inside
Token ID2641
Feature activation-1.04e+01

Pos logit contributions
 benches+0.188
 bench+0.174
round+0.173
 Zig+0.172
 enclosure+0.161
Neg logit contributions
 important-0.175
 best-0.153
 valuable-0.151
arre-0.144
 ok-0.141
Token,
Token ID11
Feature activation-5.12e+00

Pos logit contributions
 benches+0.314
round+0.304
 Zig+0.295
 bench+0.289
 enclosure+0.281
Neg logit contributions
 important-0.377
 best-0.332
 valuable-0.329
 mistakes-0.307
 ok-0.306
Token ch
Token ID442
Feature activation+5.84e-01

Pos logit contributions
 benches+0.161
round+0.158
 Zig+0.151
 bench+0.149
 enclosure+0.145
Neg logit contributions
 important-0.175
 valuable-0.153
 best-0.151
 ok-0.142
 precious-0.138
Tokenir
Token ID343
Feature activation+6.53e-01

Pos logit contributions
 important+0.020
 valuable+0.017
 best+0.017
arre+0.017
 mistakes+0.017
Neg logit contributions
 benches-0.018
 bench-0.018
round-0.017
 Zig-0.017
 slide-0.017
Tokenping
Token ID13886
Feature activation-2.95e-01

Pos logit contributions
 important+0.022
 valuable+0.020
 best+0.019
arre+0.018
 mistakes+0.018
Neg logit contributions
 benches-0.021
 bench-0.020
round-0.019
 Zig-0.019
 enclosure-0.019
Token loudly
Token ID23112
Feature activation+9.74e-01

Pos logit contributions
 benches+0.009
 bench+0.009
round+0.009
 Zig+0.008
 enclosure+0.008
Neg logit contributions
 important-0.010
 valuable-0.009
 best-0.009
arre-0.008
 ok-0.008
Token had
Token ID550
Feature activation+2.30e+00

Pos logit contributions
 benches+0.118
 bench+0.113
 Zig+0.109
 slide+0.106
round+0.105
Neg logit contributions
 important-0.102
arre-0.091
 valuable-0.090
 mistakes-0.087
 best-0.087
Token many
Token ID867
Feature activation-6.64e-01

Pos logit contributions
 important+0.073
 valuable+0.064
 best+0.062
arre+0.058
 ok+0.058
Neg logit contributions
 benches-0.080
 bench-0.076
round-0.074
 Zig-0.073
 enclosure-0.072
Token clothes
Token ID8242
Feature activation-2.25e+00

Pos logit contributions
 benches+0.020
 bench+0.018
 Zig+0.017
 slide+0.017
 throne+0.016
Neg logit contributions
 important-0.024
arre-0.022
 valuable-0.022
 ok-0.021
 best-0.021
Token and
Token ID290
Feature activation-5.09e+00

Pos logit contributions
 benches+0.066
 bench+0.062
round+0.059
 slide+0.058
 Zig+0.057
Neg logit contributions
 important-0.084
 valuable-0.076
arre-0.072
 best-0.072
 mistakes-0.072
Token shoes
Token ID10012
Feature activation-2.65e+00

Pos logit contributions
 benches+0.143
 bench+0.133
 Zig+0.125
round+0.123
 slide+0.121
Neg logit contributions
 important-0.198
 valuable-0.178
 best-0.173
arre-0.172
 ok-0.168
Token inside
Token ID2641
Feature activation-1.04e+01

Pos logit contributions
 benches+0.081
 bench+0.078
round+0.073
 slide+0.072
 Zig+0.071
Neg logit contributions
 important-0.095
 valuable-0.085
arre-0.082
 best-0.082
 mistakes-0.081
Token.
Token ID13
Feature activation+3.45e-01

Pos logit contributions
 benches+0.344
 bench+0.323
 Zig+0.311
round+0.311
 enclosure+0.303
Neg logit contributions
 important-0.352
 valuable-0.312
 best-0.301
arre-0.289
 ok-0.286
Token Sometimes
Token ID8975
Feature activation+1.42e+00

Pos logit contributions
 important+0.013
 valuable+0.012
 best+0.011
 ok+0.011
arre+0.011
Neg logit contributions
 benches-0.010
 bench-0.009
round-0.009
 slide-0.009
 Zig-0.009
Token they
Token ID484
Feature activation+5.36e-01

Pos logit contributions
 important+0.054
 valuable+0.048
 best+0.048
 ok+0.045
 mistakes+0.045
Neg logit contributions
 benches-0.041
 bench-0.039
round-0.038
 Zig-0.036
 enclosure-0.036
Token would
Token ID561
Feature activation-3.28e-01

Pos logit contributions
 important+0.016
 valuable+0.013
 best+0.013
arre+0.012
 ok+0.012
Neg logit contributions
 benches-0.020
 Zig-0.019
 bench-0.019
round-0.019
 enclosure-0.018
Token squeeze
Token ID21229
Feature activation+3.47e-03

Pos logit contributions
 benches+0.009
round+0.008
 Zig+0.008
 bench+0.007
 enclosure+0.007
Neg logit contributions
 important-0.013
 valuable-0.012
 best-0.012
 ok-0.011
 likely-0.011
Token with
Token ID351
Feature activation-1.99e+00

Pos logit contributions
 benches+0.005
 bench+0.005
round+0.005
 slide+0.005
 Zig+0.005
Neg logit contributions
 important-0.006
 valuable-0.005
 best-0.005
 mistakes-0.005
arre-0.005
Token some
Token ID617
Feature activation-5.30e-01

Pos logit contributions
 benches+0.065
 bench+0.061
 Zig+0.058
round+0.058
 enclosure+0.057
Neg logit contributions
 important-0.068
 valuable-0.061
 best-0.059
arre-0.057
 ok-0.056
Token y
Token ID331
Feature activation+2.63e+00

Pos logit contributions
 benches+0.015
 bench+0.014
 Zig+0.013
round+0.013
 enclosure+0.013
Neg logit contributions
 important-0.021
 valuable-0.019
 best-0.018
arre-0.018
 ok-0.017
Tokenummy
Token ID13513
Feature activation+1.13e+00

Pos logit contributions
 important+0.061
 valuable+0.051
 best+0.048
 ok+0.045
arre+0.045
Neg logit contributions
 benches-0.116
 bench-0.110
round-0.107
 Zig-0.106
 enclosure-0.105
Token apples
Token ID22514
Feature activation-5.93e+00

Pos logit contributions
 important+0.046
 valuable+0.042
 best+0.040
arre+0.039
 ok+0.039
Neg logit contributions
 benches-0.029
 bench-0.027
 Zig-0.026
round-0.026
 enclosure-0.025
Token inside
Token ID2641
Feature activation-1.04e+01

Pos logit contributions
 benches+0.188
 bench+0.177
round+0.168
 Zig+0.168
 enclosure+0.164
Neg logit contributions
 important-0.209
 valuable-0.186
 best-0.179
arre-0.175
 mistakes-0.173
Token.
Token ID13
Feature activation+6.45e-01

Pos logit contributions
 benches+0.350
 bench+0.326
 Zig+0.324
round+0.322
 enclosure+0.312
Neg logit contributions
 important-0.343
 valuable-0.300
 best-0.293
 mistakes-0.276
 ok-0.275
Token John
Token ID1757
Feature activation+6.49e-01

Pos logit contributions
 important+0.022
 valuable+0.019
 best+0.019
 ok+0.018
arre+0.018
Neg logit contributions
 benches-0.021
 bench-0.019
round-0.018
 slide-0.018
 Zig-0.018
Token smiled
Token ID13541
Feature activation+6.32e+00

Pos logit contributions
 important+0.021
 valuable+0.018
 best+0.018
 ok+0.017
 mistakes+0.017
Neg logit contributions
 benches-0.022
 bench-0.022
 enclosure-0.021
round-0.021
 Zig-0.020
Token.
Token ID13
Feature activation+9.88e-01

Pos logit contributions
 important+0.242
 valuable+0.219
arre+0.214
 mistakes+0.210
 awful+0.208
Neg logit contributions
 benches-0.170
 bench-0.167
 Zig-0.158
round-0.155
 enclosure-0.150
Token He
Token ID679
Feature activation+1.79e+00

Pos logit contributions
 important+0.034
 valuable+0.030
 best+0.029
 ok+0.028
arre+0.028
Neg logit contributions
 benches-0.032
 bench-0.029
round-0.028
 Zig-0.028
 enclosure-0.027
Token and
Token ID290
Feature activation-5.53e+00

Pos logit contributions
 benches+0.079
 bench+0.075
 Zig+0.074
round+0.073
 enclosure+0.072
Neg logit contributions
 important-0.076
 best-0.068
 valuable-0.067
 ok-0.062
arre-0.062
Token forks
Token ID43378
Feature activation-3.02e+00

Pos logit contributions
 benches+0.157
 bench+0.146
round+0.142
 Zig+0.142
 enclosure+0.137
Neg logit contributions
 important-0.213
 valuable-0.191
 best-0.186
arre-0.178
 ok-0.177
Token and
Token ID290
Feature activation-5.69e+00

Pos logit contributions
 benches+0.096
 bench+0.089
 Zig+0.086
round+0.086
 enclosure+0.084
Neg logit contributions
 important-0.106
 valuable-0.094
 best-0.092
arre-0.088
 ok-0.087
Token sp
Token ID599
Feature activation+7.12e-01

Pos logit contributions
 benches+0.168
round+0.158
 Zig+0.157
 bench+0.156
 enclosure+0.151
Neg logit contributions
 important-0.211
 valuable-0.187
 best-0.184
 ok-0.172
 mistakes-0.171
Tokenoons
Token ID13022
Feature activation-3.69e+00

Pos logit contributions
 important+0.029
 valuable+0.027
 ok+0.026
 best+0.026
 awful+0.024
Neg logit contributions
 benches-0.018
round-0.016
 bench-0.016
 Zig-0.015
 slide-0.014
Token inside
Token ID2641
Feature activation-1.04e+01

Pos logit contributions
 benches+0.122
 bench+0.114
round+0.111
 Zig+0.111
 enclosure+0.109
Neg logit contributions
 important-0.124
 valuable-0.110
 best-0.107
arre-0.101
 ok-0.101
Token the
Token ID262
Feature activation-4.40e+00

Pos logit contributions
 benches+0.351
 bench+0.328
round+0.325
 Zig+0.324
 enclosure+0.316
Neg logit contributions
 important-0.346
 valuable-0.303
 best-0.297
 mistakes-0.278
 ok-0.277
Token dish
Token ID9433
Feature activation-3.85e+00

Pos logit contributions
 benches+0.138
 bench+0.130
 slide+0.122
 enclosure+0.120
round+0.119
Neg logit contributions
 important-0.156
 valuable-0.140
 best-0.132
 ok-0.132
arre-0.131
Tokenwasher
Token ID45146
Feature activation-3.70e+00

Pos logit contributions
 benches+0.143
 bench+0.139
 slide+0.132
round+0.131
 enclosure+0.131
Neg logit contributions
 important-0.112
 valuable-0.099
 ok-0.093
 awful-0.090
 best-0.089
Token.
Token ID13
Feature activation+1.04e+00

Pos logit contributions
 benches+0.116
 bench+0.111
round+0.104
 slide+0.103
 Zig+0.102
Neg logit contributions
 important-0.130
 valuable-0.117
 best-0.110
arre-0.109
 ok-0.108
Token She
Token ID1375
Feature activation+6.61e-01

Pos logit contributions
 important+0.037
 valuable+0.033
 best+0.032
arre+0.031
 ok+0.031
Neg logit contributions
 benches-0.033
 bench-0.031
round-0.030
 enclosure-0.030
 Zig-0.029
Token they
Token ID484
Feature activation-4.63e-01

Pos logit contributions
 important+0.080
 valuable+0.072
arre+0.072
 awful+0.069
 ok+0.068
Neg logit contributions
 benches-0.067
 bench-0.063
 Zig-0.060
round-0.059
 slide-0.059
Token found
Token ID1043
Feature activation-8.51e-01

Pos logit contributions
 benches+0.015
 bench+0.015
 Zig+0.014
round+0.014
 enclosure+0.014
Neg logit contributions
 important-0.015
 valuable-0.014
 best-0.013
arre-0.013
 mistakes-0.012
Token many
Token ID867
Feature activation-1.53e+00

Pos logit contributions
 benches+0.026
 bench+0.024
 Zig+0.023
round+0.023
 enclosure+0.022
Neg logit contributions
 important-0.031
 valuable-0.028
 best-0.027
arre-0.026
 ok-0.026
Token colorful
Token ID20239
Feature activation+1.98e+00

Pos logit contributions
 benches+0.049
 bench+0.046
 Zig+0.043
round+0.043
 enclosure+0.043
Neg logit contributions
 important-0.053
 valuable-0.047
 best-0.046
arre-0.045
 ok-0.044
Token decorations
Token ID39011
Feature activation-3.23e+00

Pos logit contributions
 important+0.077
 valuable+0.069
 best+0.067
 mistakes+0.064
 awful+0.062
Neg logit contributions
 benches-0.054
 bench-0.051
 Zig-0.051
round-0.050
 enclosure-0.047
Token inside
Token ID2641
Feature activation-1.04e+01

Pos logit contributions
 benches+0.103
 bench+0.097
round+0.092
 Zig+0.092
 enclosure+0.090
Neg logit contributions
 important-0.113
 valuable-0.100
 best-0.096
arre-0.094
 mistakes-0.093
Token.
Token ID13
Feature activation+5.42e-01

Pos logit contributions
 benches+0.317
 bench+0.297
 Zig+0.284
round+0.284
 enclosure+0.277
Neg logit contributions
 important-0.376
 valuable-0.336
 best-0.325
arre-0.313
 ok-0.311
Token
Token ID198
Feature activation-6.21e-01

Pos logit contributions
 important+0.019
 valuable+0.017
 best+0.016
arre+0.016
 ok+0.016
Neg logit contributions
 benches-0.017
 bench-0.016
round-0.015
 Zig-0.015
 slide-0.015
Token
Token ID198
Feature activation-5.08e-01

Pos logit contributions
 benches+0.020
 bench+0.018
round+0.018
 enclosure+0.017
 Zig+0.017
Neg logit contributions
 important-0.022
 valuable-0.019
 best-0.018
 mistakes-0.018
arre-0.018
TokenJack
Token ID14295
Feature activation+9.26e-01

Pos logit contributions
 benches+0.017
 bench+0.015
 Zig+0.015
round+0.015
 enclosure+0.015
Neg logit contributions
 important-0.018
 valuable-0.016
 best-0.015
arre-0.014
 ok-0.014
Token said
Token ID531
Feature activation+2.66e+00

Pos logit contributions
 important+0.031
 valuable+0.028
 best+0.026
 mistakes+0.026
 awful+0.025
Neg logit contributions
 benches-0.030
 bench-0.029
 enclosure-0.028
 slide-0.028
 Zig-0.028
Token and
Token ID290
Feature activation-3.52e+00

Pos logit contributions
 important+0.226
arre+0.215
ealing+0.214
 valuable+0.212
 awful+0.206
Neg logit contributions
 benches-0.116
 bench-0.113
 slide-0.102
 walk-0.100
round-0.096
Token needed
Token ID2622
Feature activation+3.03e+00

Pos logit contributions
 bench+0.108
 benches+0.107
 slide+0.103
 Zig+0.103
 climb+0.101
Neg logit contributions
arre-0.106
 important-0.095
ealing-0.094
ushing-0.091
 mistakes-0.091
Token all
Token ID477
Feature activation-1.06e+00

Pos logit contributions
 important+0.127
 valuable+0.116
 best+0.113
arre+0.112
 awful+0.108
Neg logit contributions
 benches-0.074
 bench-0.067
 Zig-0.064
round-0.062
 enclosure-0.061
Token the
Token ID262
Feature activation-1.02e-01

Pos logit contributions
 benches+0.024
 bench+0.022
 Zig+0.021
round+0.021
 slide+0.020
Neg logit contributions
 important-0.045
 valuable-0.042
 best-0.041
arre-0.040
 ok-0.039
Token dishes
Token ID16759
Feature activation-2.13e+00

Pos logit contributions
 benches+0.003
 bench+0.003
 slide+0.003
 walk+0.003
 enclosure+0.003
Neg logit contributions
ushing-0.003
 important-0.003
arre-0.003
ealing-0.003
 valuable-0.003
Token inside
Token ID2641
Feature activation-1.04e+01

Pos logit contributions
 benches+0.053
 bench+0.051
 slide+0.048
round+0.045
 walk+0.043
Neg logit contributions
 important-0.083
arre-0.076
 valuable-0.076
 mistakes-0.075
ealing-0.073
Token to
Token ID284
Feature activation-9.09e+00

Pos logit contributions
 benches+0.288
 bench+0.271
 slide+0.247
 Zig+0.244
round+0.244
Neg logit contributions
 important-0.392
 valuable-0.356
arre-0.347
 best-0.343
 awful-0.332
Token work
Token ID670
Feature activation+6.00e-02

Pos logit contributions
 benches+0.211
 bench+0.194
 Zig+0.182
round+0.181
 enclosure+0.176
Neg logit contributions
 important-0.397
 valuable-0.362
 best-0.352
arre-0.343
 ok-0.339
Token properly
Token ID6105
Feature activation+4.24e-01

Pos logit contributions
 important+0.002
arre+0.002
 valuable+0.002
ealing+0.002
ushing+0.002
Neg logit contributions
 benches-0.002
 bench-0.002
 slide-0.001
round-0.001
 Zig-0.001
Token.
Token ID13
Feature activation+5.33e-01

Pos logit contributions
 important+0.016
 valuable+0.014
arre+0.014
 awful+0.014
 best+0.014
Neg logit contributions
 benches-0.011
 bench-0.011
 slide-0.010
round-0.010
 Zig-0.009
Token Lily
Token ID20037
Feature activation+2.04e+00

Pos logit contributions
 important+0.020
 valuable+0.018
 best+0.017
arre+0.017
 ok+0.017
Neg logit contributions
 benches-0.016
 bench-0.014
round-0.014
 Zig-0.014
 enclosure-0.013
Token pond
Token ID16723
Feature activation-2.99e+00

Pos logit contributions
 important+0.037
 valuable+0.032
 best+0.032
 ok+0.029
 awful+0.029
Neg logit contributions
 benches-0.033
 Zig-0.031
round-0.031
 bench-0.030
 gra-0.029
Token with
Token ID351
Feature activation-2.69e+00

Pos logit contributions
 benches+0.087
 bench+0.080
 Zig+0.078
round+0.077
 enclosure+0.075
Neg logit contributions
 important-0.113
 valuable-0.101
 best-0.098
arre-0.094
 ok-0.094
Token lots
Token ID6041
Feature activation+2.73e+00

Pos logit contributions
 benches+0.087
 bench+0.082
round+0.080
 Zig+0.079
 enclosure+0.077
Neg logit contributions
 important-0.093
 valuable-0.082
 best-0.080
 awful-0.075
 mistakes-0.075
Token of
Token ID286
Feature activation-5.70e+00

Pos logit contributions
 important+0.091
 valuable+0.080
 best+0.075
 mistakes+0.072
 awful+0.072
Neg logit contributions
 benches-0.090
 bench-0.087
 Zig-0.085
round-0.085
 gra-0.083
Token fish
Token ID5916
Feature activation-4.36e+00

Pos logit contributions
 benches+0.152
 bench+0.143
round+0.141
 Zig+0.136
 enclosure+0.134
Neg logit contributions
 important-0.230
 valuable-0.206
 best-0.200
 mistakes-0.190
 awful-0.190
Token inside
Token ID2641
Feature activation-1.03e+01

Pos logit contributions
 benches+0.136
 bench+0.128
 Zig+0.122
round+0.121
 enclosure+0.120
Neg logit contributions
 important-0.156
 valuable-0.140
 best-0.134
arre-0.130
 ok-0.130
Token.
Token ID13
Feature activation+4.29e-01

Pos logit contributions
 benches+0.367
round+0.347
 Zig+0.344
 bench+0.342
 enclosure+0.329
Neg logit contributions
 important-0.322
 valuable-0.276
 best-0.273
 mistakes-0.254
 ok-0.251
Token
Token ID198
Feature activation-7.32e-01

Pos logit contributions
 important+0.013
 valuable+0.011
 best+0.011
 ok+0.011
arre+0.011
Neg logit contributions
 benches-0.015
 bench-0.014
round-0.013
 throne-0.013
 walk-0.013
Token
Token ID198
Feature activation-5.76e-01

Pos logit contributions
 benches+0.026
 bench+0.023
round+0.023
 throne+0.023
 enclosure+0.023
Neg logit contributions
 important-0.023
 valuable-0.021
 best-0.019
 certainly-0.018
 ok-0.018
Tokenâ
Token ID22940
Feature activation+5.31e+00

Pos logit contributions
 benches+0.019
 bench+0.018
 Zig+0.017
 enclosure+0.017
round+0.017
Neg logit contributions
 important-0.020
 valuable-0.017
arre-0.016
 best-0.016
 ok-0.016
Token
Token ID26391
Feature activation+4.31e+00

Pos logit contributions
 important+0.178
 valuable+0.155
 best+0.149
arre+0.145
 ok+0.144
Neg logit contributions
 benches-0.180
 bench-0.169
 Zig-0.161
 enclosure-0.156
round-0.156
Token with
Token ID351
Feature activation-2.73e+00

Pos logit contributions
 benches+0.111
 bench+0.104
 Zig+0.100
round+0.100
 enclosure+0.097
Neg logit contributions
 important-0.113
 valuable-0.099
 best-0.096
arre-0.092
 ok-0.091
Token a
Token ID257
Feature activation+5.72e-01

Pos logit contributions
 benches+0.091
 bench+0.085
 Zig+0.082
round+0.082
 enclosure+0.080
Neg logit contributions
 important-0.092
 valuable-0.081
 best-0.078
arre-0.076
 ok-0.075
Token king
Token ID5822
Feature activation-2.06e+00

Pos logit contributions
 important+0.020
 valuable+0.018
arre+0.017
 best+0.017
 mistakes+0.017
Neg logit contributions
 benches-0.018
 bench-0.017
 slide-0.016
 Zig-0.016
round-0.016
Token and
Token ID290
Feature activation-5.32e+00

Pos logit contributions
 benches+0.069
 bench+0.065
 Zig+0.063
round+0.062
 enclosure+0.061
Neg logit contributions
 important-0.068
 valuable-0.060
 best-0.058
arre-0.055
 ok-0.055
Token queen
Token ID16599
Feature activation-3.44e+00

Pos logit contributions
 benches+0.172
 bench+0.162
 Zig+0.155
round+0.154
 enclosure+0.151
Neg logit contributions
 important-0.183
 valuable-0.162
 best-0.156
arre-0.152
 ok-0.150
Token inside
Token ID2641
Feature activation-1.03e+01

Pos logit contributions
 benches+0.118
 bench+0.112
round+0.109
 Zig+0.108
 enclosure+0.104
Neg logit contributions
 important-0.110
 best-0.096
 valuable-0.096
 mistakes-0.087
arre-0.087
Token.
Token ID13
Feature activation+4.74e-01

Pos logit contributions
 benches+0.353
round+0.335
 Zig+0.329
 bench+0.324
 enclosure+0.316
Neg logit contributions
 important-0.342
 valuable-0.296
 best-0.293
 mistakes-0.273
 ok-0.268
Token She
Token ID1375
Feature activation+2.13e-01

Pos logit contributions
 important+0.016
 valuable+0.014
 best+0.013
 ok+0.013
arre+0.013
Neg logit contributions
 benches-0.016
 bench-0.014
round-0.014
 Zig-0.013
 slide-0.013
Token wanted
Token ID2227
Feature activation+4.60e+00

Pos logit contributions
 important+0.007
 valuable+0.006
 ok+0.006
 best+0.006
arre+0.006
Neg logit contributions
 benches-0.007
 bench-0.007
 slide-0.007
 enclosure-0.007
 Zig-0.007
Token to
Token ID284
Feature activation-6.33e+00

Pos logit contributions
 important+0.180
 best+0.160
 valuable+0.156
 mistakes+0.145
 awful+0.143
Neg logit contributions
round-0.131
 benches-0.128
 bench-0.122
 enclosure-0.122
 Zig-0.120
Token paint
Token ID7521
Feature activation-3.55e+00

Pos logit contributions
 benches+0.188
round+0.172
 Zig+0.168
 enclosure+0.162
 bench+0.159
Neg logit contributions
 important-0.234
 valuable-0.202
 best-0.200
 certainly-0.199
 ok-0.192
Token,
Token ID11
Feature activation-3.72e+00

Pos logit contributions
 benches+0.131
 bench+0.124
 Zig+0.118
round+0.117
 enclosure+0.115
Neg logit contributions
 important-0.141
 valuable-0.126
 best-0.120
arre-0.117
 mistakes-0.115
Token but
Token ID475
Feature activation-1.08e+00

Pos logit contributions
 benches+0.122
 bench+0.115
 Zig+0.110
round+0.110
 enclosure+0.107
Neg logit contributions
 important-0.127
 valuable-0.113
 best-0.109
arre-0.105
 ok-0.103
Token it
Token ID340
Feature activation+3.51e+00

Pos logit contributions
 benches+0.027
 bench+0.026
 Zig+0.025
 slide+0.024
round+0.023
Neg logit contributions
 important-0.043
 valuable-0.040
arre-0.039
 best-0.038
 mistakes-0.037
Token still
Token ID991
Feature activation+3.90e+00

Pos logit contributions
 important+0.108
 valuable+0.097
arre+0.093
 best+0.093
 mistakes+0.092
Neg logit contributions
 bench-0.115
 benches-0.113
 Zig-0.111
 slide-0.110
 gra-0.105
Token felt
Token ID2936
Feature activation+8.17e+00

Pos logit contributions
 important+0.118
 valuable+0.105
arre+0.104
 best+0.103
 mistakes+0.099
Neg logit contributions
 bench-0.130
 benches-0.129
 slide-0.125
 Zig-0.123
 walk-0.119
Token happy
Token ID3772
Feature activation+8.33e+00

Pos logit contributions
arre+0.221
ushing+0.213
 important+0.209
ealing+0.192
/+0.188
Neg logit contributions
 bench-0.266
 benches-0.265
 distance-0.252
 gra-0.245
 slide-0.242
Token.
Token ID13
Feature activation+1.05e+00

Pos logit contributions
 important+0.317
 valuable+0.289
arre+0.282
 best+0.276
ealing+0.271
Neg logit contributions
 benches-0.233
 bench-0.220
 Zig-0.204
 slide-0.201
round-0.198
Token It
Token ID632
Feature activation+2.08e+00

Pos logit contributions
 important+0.042
 valuable+0.038
 best+0.037
arre+0.036
 ok+0.035
Neg logit contributions
 benches-0.028
 bench-0.026
 Zig-0.025
round-0.025
 enclosure-0.024
Token had
Token ID550
Feature activation+3.15e+00

Pos logit contributions
 important+0.069
 mistakes+0.061
 valuable+0.061
 best+0.061
arre+0.061
Neg logit contributions
 bench-0.064
 benches-0.063
 Zig-0.061
 slide-0.060
 walk-0.059
Token done
Token ID1760
Feature activation+4.77e+00

Pos logit contributions
 important+0.100
 valuable+0.089
 best+0.086
arre+0.086
ushing+0.085
Neg logit contributions
 benches-0.106
 bench-0.101
 Zig-0.097
 slide-0.094
 gra-0.091
Token something
Token ID1223
Feature activation+3.69e+00

Pos logit contributions
 important+0.180
 valuable+0.164
arre+0.163
ushing+0.160
 best+0.160
Neg logit contributions
 benches-0.128
 bench-0.122
 Zig-0.117
 slide-0.109
 gra-0.106
Token join
Token ID4654
Feature activation-1.51e+00

Pos logit contributions
 important+0.025
 valuable+0.023
 best+0.023
arre+0.022
 ok+0.022
Neg logit contributions
 benches-0.014
 Zig-0.013
 bench-0.013
round-0.012
 enclosure-0.012
Token in
Token ID287
Feature activation-6.97e+00

Pos logit contributions
 benches+0.047
 bench+0.044
 enclosure+0.043
round+0.042
 Zig+0.042
Neg logit contributions
 important-0.055
 best-0.048
 valuable-0.047
 ok-0.046
 mistakes-0.046
Token.
Token ID13
Feature activation+6.34e-01

Pos logit contributions
 benches+0.218
 bench+0.204
 Zig+0.193
round+0.189
 slide+0.189
Neg logit contributions
 important-0.250
 valuable-0.223
 best-0.215
arre-0.210
 ok-0.207
Token Tim
Token ID5045
Feature activation+4.21e-02

Pos logit contributions
 important+0.026
 valuable+0.024
 best+0.024
ushing+0.023
arre+0.023
Neg logit contributions
 benches-0.015
 bench-0.014
 Zig-0.014
round-0.014
 slide-0.013
Tokenmy
Token ID1820
Feature activation+1.49e+00

Pos logit contributions
 important+0.002
 valuable+0.002
 awful+0.002
 precious+0.002
 likely+0.002
Neg logit contributions
 slide-0.001
 benches-0.001
 bench-0.001
round-0.001
 climb-0.001
Token didn
Token ID1422
Feature activation+8.26e+00

Pos logit contributions
 important+0.050
 valuable+0.048
 mistakes+0.045
 awful+0.045
 best+0.044
Neg logit contributions
 bench-0.042
 slide-0.041
 walk-0.040
 climb-0.039
 benches-0.038
Token't
Token ID470
Feature activation+5.12e+00

Pos logit contributions
 important+0.368
 valuable+0.340
 best+0.331
ealing+0.329
ushing+0.328
Neg logit contributions
 benches-0.161
 Zig-0.155
 bench-0.152
 slide-0.151
round-0.141
Token want
Token ID765
Feature activation+5.74e+00

Pos logit contributions
 important+0.179
 valuable+0.160
 best+0.156
 mistakes+0.150
arre+0.149
Neg logit contributions
 benches-0.159
 bench-0.152
round-0.147
 Zig-0.144
 slide-0.143
Token to
Token ID284
Feature activation-4.87e+00

Pos logit contributions
 important+0.244
 valuable+0.218
 best+0.214
 mistakes+0.204
 unfair+0.202
Neg logit contributions
 benches-0.146
 bench-0.139
round-0.135
 enclosure-0.134
stack-0.132
Token be
Token ID307
Feature activation+5.18e+00

Pos logit contributions
 benches+0.112
 bench+0.101
 Zig+0.096
round+0.095
 enclosure+0.094
Neg logit contributions
 important-0.218
 valuable-0.196
 best-0.193
 ok-0.187
 mistakes-0.187
Token selfish
Token ID20363
Feature activation+8.33e+00

Pos logit contributions
 important+0.116
 valuable+0.108
 certainly+0.098
ushing+0.097
 mistakes+0.094
Neg logit contributions
 benches-0.205
 bench-0.194
 Zig-0.193
 slide-0.192
 Sue-0.189
Token at
Token ID379
Feature activation-7.39e+00

Pos logit contributions
 benches+0.037
 bench+0.035
 Zig+0.033
round+0.033
 slide+0.032
Neg logit contributions
 important-0.053
 valuable-0.048
 best-0.046
arre-0.045
 ok-0.044
Token it
Token ID340
Feature activation+7.52e-01

Pos logit contributions
 benches+0.169
 bench+0.155
 Zig+0.146
round+0.143
 slide+0.139
Neg logit contributions
 important-0.325
 valuable-0.298
 best-0.287
arre-0.284
 ok-0.279
Token,
Token ID11
Feature activation-4.10e+00

Pos logit contributions
 important+0.026
 valuable+0.023
 best+0.023
 ok+0.022
 mistakes+0.021
Neg logit contributions
 benches-0.024
 bench-0.023
round-0.023
 Zig-0.022
 enclosure-0.022
Token but
Token ID475
Feature activation-1.36e+00

Pos logit contributions
 benches+0.112
round+0.102
 bench+0.101
 Zig+0.101
 enclosure+0.098
Neg logit contributions
 important-0.163
 valuable-0.147
 best-0.143
 ok-0.140
 awful-0.137
Token it
Token ID340
Feature activation+3.29e+00

Pos logit contributions
 benches+0.037
 bench+0.035
 Zig+0.033
round+0.032
 slide+0.032
Neg logit contributions
 important-0.054
 valuable-0.049
 best-0.047
arre-0.047
 mistakes-0.046
Token didn
Token ID1422
Feature activation+8.54e+00

Pos logit contributions
 important+0.105
 valuable+0.097
arre+0.091
ealing+0.090
 mistakes+0.089
Neg logit contributions
 benches-0.106
 bench-0.105
 Zig-0.104
 slide-0.103
 climb-0.097
Token't
Token ID470
Feature activation+5.63e+00

Pos logit contributions
 important+0.387
 valuable+0.359
arre+0.351
 best+0.350
ealing+0.349
Neg logit contributions
 benches-0.160
 Zig-0.153
 bench-0.151
 slide-0.149
 climb-0.143
Token work
Token ID670
Feature activation+4.06e+00

Pos logit contributions
 important+0.226
 valuable+0.204
 best+0.199
arre+0.194
 mistakes+0.192
Neg logit contributions
 benches-0.150
 bench-0.140
 Zig-0.133
round-0.133
 slide-0.129
Token.
Token ID13
Feature activation+8.47e-01

Pos logit contributions
 important+0.152
arre+0.145
ealing+0.141
 valuable+0.140
 mistakes+0.138
Neg logit contributions
 benches-0.098
 bench-0.093
 Zig-0.087
round-0.084
 slide-0.084
Token
Token ID198
Feature activation-5.39e-01

Pos logit contributions
 important+0.034
 valuable+0.031
 best+0.030
arre+0.029
 ok+0.028
Neg logit contributions
 benches-0.023
 bench-0.021
 Zig-0.020
round-0.020
 enclosure-0.020
Token
Token ID198
Feature activation-4.65e-01

Pos logit contributions
 benches+0.017
 bench+0.016
round+0.016
 enclosure+0.015
 throne+0.015
Neg logit contributions
 important-0.019
 valuable-0.017
 best-0.016
 mistakes-0.015
arre-0.015
Token They
Token ID1119
Feature activation+3.43e-01

Pos logit contributions
 important+0.039
 valuable+0.035
 best+0.034
 ok+0.031
arre+0.031
Neg logit contributions
 benches-0.039
 bench-0.037
round-0.036
 enclosure-0.035
 Zig-0.035
Token realized
Token ID6939
Feature activation+2.28e+00

Pos logit contributions
 important+0.011
 valuable+0.010
 mistakes+0.010
 best+0.010
arre+0.010
Neg logit contributions
 benches-0.011
 bench-0.011
 slide-0.011
 Zig-0.010
 walk-0.010
Token they
Token ID484
Feature activation+2.27e+00

Pos logit contributions
 important+0.090
 valuable+0.081
 best+0.078
arre+0.076
 ok+0.075
Neg logit contributions
 benches-0.063
 bench-0.059
 Zig-0.056
round-0.055
 enclosure-0.054
Token had
Token ID550
Feature activation+3.01e+00

Pos logit contributions
 important+0.082
 mistakes+0.078
 valuable+0.077
arre+0.075
ushing+0.073
Neg logit contributions
 benches-0.059
 bench-0.059
 slide-0.057
 walk-0.057
 sit-0.053
Token been
Token ID587
Feature activation+7.51e+00

Pos logit contributions
 important+0.089
 valuable+0.082
ushing+0.079
arre+0.078
ealing+0.077
Neg logit contributions
 benches-0.104
 bench-0.100
 slide-0.094
 Zig-0.092
 walk-0.091
Token selfish
Token ID20363
Feature activation+8.62e+00

Pos logit contributions
 important+0.152
arre+0.149
 valuable+0.148
ushing+0.144
 mistakes+0.135
Neg logit contributions
 benches-0.303
 bench-0.286
 Zig-0.280
 slide-0.278
 Sue-0.273
Token and
Token ID290
Feature activation-3.17e+00

Pos logit contributions
 important+0.355
 valuable+0.332
 certainly+0.325
 awful+0.316
 ok+0.316
Neg logit contributions
 benches-0.211
 bench-0.199
 slide-0.176
round-0.174
 enclosure-0.164
Token rude
Token ID22066
Feature activation+6.18e+00

Pos logit contributions
 benches+0.127
 bench+0.126
 slide+0.122
 Zig+0.120
round+0.117
Neg logit contributions
 important-0.066
arre-0.062
 mistakes-0.060
 valuable-0.057
 best-0.057
Token.
Token ID13
Feature activation+9.83e-01

Pos logit contributions
 important+0.240
 valuable+0.221
arre+0.214
 certainly+0.209
 ok+0.207
Neg logit contributions
 benches-0.170
 bench-0.157
round-0.146
 slide-0.144
 Zig-0.141
Token They
Token ID1119
Feature activation+9.22e-02

Pos logit contributions
 important+0.037
 valuable+0.033
 best+0.032
arre+0.031
 ok+0.030
Neg logit contributions
 benches-0.029
 bench-0.027
 Zig-0.026
round-0.026
 enclosure-0.026
Token saw
Token ID2497
Feature activation-1.57e+00

Pos logit contributions
 important+0.003
 mistakes+0.003
 valuable+0.003
 best+0.003
arre+0.003
Neg logit contributions
 benches-0.003
 bench-0.003
 slide-0.003
 Zig-0.003
 walk-0.003
Token He
Token ID679
Feature activation+1.33e+00

Pos logit contributions
 important+0.031
 valuable+0.028
 best+0.027
 ok+0.026
arre+0.025
Neg logit contributions
 benches-0.027
 bench-0.025
round-0.024
 enclosure-0.024
 Zig-0.024
Token learnt
Token ID26338
Feature activation+4.43e+00

Pos logit contributions
 important+0.046
 mistakes+0.042
arre+0.042
ealing+0.041
 valuable+0.041
Neg logit contributions
 bench-0.039
 benches-0.037
 slide-0.037
 Zig-0.036
 enclosure-0.035
Token that
Token ID326
Feature activation+9.40e-03

Pos logit contributions
 important+0.162
arre+0.151
 valuable+0.146
ushing+0.146
 best+0.145
Neg logit contributions
 benches-0.123
 bench-0.115
 Zig-0.112
round-0.103
 slide-0.103
Token it
Token ID340
Feature activation+4.71e+00

Pos logit contributions
 important+0.000
 valuable+0.000
arre+0.000
ealing+0.000
 best+0.000
Neg logit contributions
 benches-0.000
 bench-0.000
 Zig-0.000
 slide-0.000
round-0.000
Token was
Token ID373
Feature activation+6.92e+00

Pos logit contributions
istered+0.166
ealing+0.153
aming+0.148
ushing+0.146
ady+0.146
Neg logit contributions
 climb-0.102
 slide-0.100
 Zig-0.099
 benches-0.094
 Sky-0.094
Token wrong
Token ID2642
Feature activation+8.42e+00

Pos logit contributions
arre+0.182
ushing+0.170
ealing+0.161
istered+0.151
inking+0.148
Neg logit contributions
 Sue-0.232
 climb-0.227
 Zig-0.224
 visit-0.220
 bench-0.219
Token to
Token ID284
Feature activation-4.96e+00

Pos logit contributions
arre+0.413
 important+0.402
 awful+0.402
ealing+0.401
 unfair+0.392
Neg logit contributions
 benches-0.093
 bench-0.082
 slide-0.067
 Zig-0.067
 climb-0.063
Token kick
Token ID4829
Feature activation+6.21e-01

Pos logit contributions
 benches+0.136
 bench+0.126
 Zig+0.120
round+0.120
 enclosure+0.117
Neg logit contributions
 important-0.197
 valuable-0.177
 best-0.172
arre-0.167
 ok-0.165
Token the
Token ID262
Feature activation-1.48e+00

Pos logit contributions
 important+0.023
 valuable+0.020
arre+0.020
 best+0.019
 certainly+0.019
Neg logit contributions
 benches-0.019
 bench-0.017
 slide-0.016
 Zig-0.016
 climb-0.015
Token ash
Token ID12530
Feature activation-2.68e+00

Pos logit contributions
 benches+0.041
 bench+0.038
round+0.036
 Zig+0.036
 enclosure+0.035
Neg logit contributions
 important-0.059
 valuable-0.053
 best-0.051
arre-0.049
 ok-0.049
Token,
Token ID11
Feature activation-3.37e+00

Pos logit contributions
 benches+0.081
 bench+0.076
round+0.073
 Zig+0.073
 gra+0.071
Neg logit contributions
 important-0.096
 best-0.087
 valuable-0.085
 mistakes-0.079
arre-0.079
Token saw
Token ID2497
Feature activation-1.45e+00

Pos logit contributions
 benches+0.009
 bench+0.009
round+0.009
 Zig+0.009
 enclosure+0.008
Neg logit contributions
 important-0.009
 valuable-0.008
 best-0.007
arre-0.007
 ok-0.007
Token.
Token ID13
Feature activation+7.82e-01

Pos logit contributions
 benches+0.048
 bench+0.045
 Zig+0.043
round+0.043
 enclosure+0.042
Neg logit contributions
 important-0.049
 valuable-0.044
 best-0.042
arre-0.041
 ok-0.040
Token They
Token ID1119
Feature activation-1.46e-01

Pos logit contributions
 important+0.027
 valuable+0.024
 best+0.023
arre+0.022
 mistakes+0.022
Neg logit contributions
 benches-0.025
 bench-0.024
round-0.023
 Zig-0.023
 enclosure-0.022
Token learned
Token ID4499
Feature activation+3.69e+00

Pos logit contributions
 benches+0.005
round+0.005
 Zig+0.005
 bench+0.005
 enclosure+0.005
Neg logit contributions
 important-0.004
 valuable-0.004
 best-0.004
arre-0.004
 awful-0.003
Token and
Token ID290
Feature activation-3.72e+00

Pos logit contributions
 important+0.135
 valuable+0.120
 best+0.117
arre+0.113
 ok+0.112
Neg logit contributions
 benches-0.113
 bench-0.106
 Zig-0.102
round-0.101
 enclosure-0.099
Token laughed
Token ID13818
Feature activation+5.50e+00

Pos logit contributions
 benches+0.126
 bench+0.120
 Zig+0.113
 slide+0.111
round+0.111
Neg logit contributions
 important-0.122
 valuable-0.108
 best-0.104
arre-0.103
 mistakes-0.101
Token and
Token ID290
Feature activation-3.73e+00

Pos logit contributions
 important+0.218
arre+0.208
 valuable+0.204
 likely+0.197
 certainly+0.193
Neg logit contributions
 benches-0.130
 bench-0.120
 Zig-0.117
 slide-0.107
round-0.100
Token had
Token ID550
Feature activation+2.65e+00

Pos logit contributions
 benches+0.116
 bench+0.109
 Zig+0.104
round+0.103
 slide+0.101
Neg logit contributions
 important-0.134
 valuable-0.120
 best-0.115
arre-0.112
 mistakes-0.111
Token fun
Token ID1257
Feature activation+4.69e+00

Pos logit contributions
 important+0.073
ushing+0.068
arre+0.067
 mistakes+0.063
ealing+0.063
Neg logit contributions
 benches-0.100
 bench-0.094
 Zig-0.089
 slide-0.087
 walk-0.084
Token.
Token ID13
Feature activation+1.04e+00

Pos logit contributions
 important+0.162
 valuable+0.145
 best+0.138
arre+0.135
 ok+0.134
Neg logit contributions
 benches-0.153
 bench-0.143
 Zig-0.136
round-0.135
 enclosure-0.134
Token They
Token ID1119
Feature activation+2.22e-01

Pos logit contributions
 important+0.042
 valuable+0.038
 best+0.037
arre+0.036
 ok+0.036
Neg logit contributions
 benches-0.028
 bench-0.026
 Zig-0.025
round-0.024
 enclosure-0.023
Token I
Token ID314
Feature activation+4.92e-01

Pos logit contributions
 important+0.026
 valuable+0.023
 best+0.022
arre+0.021
 ok+0.021
Neg logit contributions
 benches-0.024
 bench-0.023
 Zig-0.022
round-0.022
 enclosure-0.021
Token promise
Token ID6991
Feature activation+5.03e+00

Pos logit contributions
 important+0.018
 mistakes+0.018
 awful+0.016
 valuable+0.016
ushing+0.016
Neg logit contributions
 slide-0.012
 walk-0.012
 bench-0.012
 sit-0.012
 ride-0.012
Token to
Token ID284
Feature activation-4.78e+00

Pos logit contributions
 important+0.196
 valuable+0.180
arre+0.176
 best+0.171
 awful+0.170
Neg logit contributions
 benches-0.131
 bench-0.128
round-0.119
 Zig-0.119
 slide-0.119
Token be
Token ID307
Feature activation+4.70e+00

Pos logit contributions
 benches+0.150
 bench+0.136
round+0.133
 Zig+0.131
 enclosure+0.129
Neg logit contributions
 important-0.176
 valuable-0.154
 best-0.152
 ok-0.147
 awful-0.142
Token very
Token ID845
Feature activation+5.98e+00

Pos logit contributions
 important+0.112
arre+0.110
 valuable+0.103
ushing+0.101
 mistakes+0.101
Neg logit contributions
 bench-0.171
 benches-0.170
 slide-0.164
 Zig-0.160
 Sue-0.158
Token good
Token ID922
Feature activation+6.16e+00

Pos logit contributions
 important+0.118
arre+0.115
 mistakes+0.105
 valuable+0.101
 best+0.100
Neg logit contributions
 bench-0.245
 benches-0.240
round-0.237
 Zig-0.235
 slide-0.235
Token in
Token ID287
Feature activation-4.73e+00

Pos logit contributions
 important+0.220
 valuable+0.208
 awful+0.194
arre+0.187
 unfair+0.182
Neg logit contributions
 benches-0.176
round-0.170
 bench-0.170
 slide-0.163
 Zig-0.162
Token the
Token ID262
Feature activation-2.28e+00

Pos logit contributions
 benches+0.129
 bench+0.120
 Zig+0.114
round+0.114
 enclosure+0.111
Neg logit contributions
 important-0.189
 valuable-0.170
 best-0.165
arre-0.159
 ok-0.159
Token car
Token ID1097
Feature activation-3.78e+00

Pos logit contributions
 benches+0.067
 bench+0.061
 Zig+0.061
round+0.060
 enclosure+0.057
Neg logit contributions
 important-0.088
 valuable-0.078
 best-0.077
 mistakes-0.073
arre-0.072
Token."
Token ID526
Feature activation-8.14e-01

Pos logit contributions
 benches+0.120
 bench+0.115
round+0.111
 Zig+0.110
 enclosure+0.108
Neg logit contributions
 important-0.129
 valuable-0.116
 best-0.109
arre-0.108
 mistakes-0.107
Token
Token ID198
Feature activation-6.90e-01

Pos logit contributions
 benches+0.025
 bench+0.023
round+0.023
 Zig+0.023
 enclosure+0.022
Neg logit contributions
 important-0.029
 valuable-0.026
 best-0.026
arre-0.025
 mistakes-0.024
Token and
Token ID290
Feature activation-3.93e+00

Pos logit contributions
 benches+0.060
 bench+0.058
 slide+0.054
 Zig+0.052
round+0.052
Neg logit contributions
 important-0.078
arre-0.071
 valuable-0.070
 mistakes-0.068
ealing-0.066
Token thought
Token ID1807
Feature activation+7.22e+00

Pos logit contributions
 benches+0.123
 bench+0.115
 Zig+0.111
round+0.110
 enclosure+0.108
Neg logit contributions
 important-0.140
 valuable-0.125
 best-0.121
arre-0.116
 ok-0.115
Token it
Token ID340
Feature activation+4.94e+00

Pos logit contributions
 important+0.272
 valuable+0.243
 best+0.235
 ok+0.226
 mistakes+0.224
Neg logit contributions
 benches-0.214
 bench-0.199
round-0.192
 Zig-0.190
 enclosure-0.188
Token was
Token ID373
Feature activation+7.16e+00

Pos logit contributions
 important+0.174
arre+0.169
 mistakes+0.165
ealing+0.163
 valuable+0.160
Neg logit contributions
 benches-0.129
 Zig-0.128
 bench-0.126
 slide-0.125
 walk-0.122
Token the
Token ID262
Feature activation+2.02e+00

Pos logit contributions
arre+0.234
ushing+0.208
ealing+0.205
 important+0.205
 mistakes+0.200
Neg logit contributions
 benches-0.222
 Zig-0.219
 bench-0.217
 slide-0.203
 Sue-0.201
Token best
Token ID1266
Feature activation+7.38e+00

Pos logit contributions
 important+0.036
arre+0.031
 valuable+0.030
ushing+0.029
ealing+0.028
Neg logit contributions
 benches-0.095
 bench-0.094
 slide-0.091
 Zig-0.087
 walk-0.086
Token seat
Token ID5852
Feature activation-1.15e+00

Pos logit contributions
 important+0.278
 valuable+0.245
 best+0.243
 mistakes+0.227
 ok+0.226
Neg logit contributions
 benches-0.222
 Zig-0.201
 bench-0.200
round-0.199
 enclosure-0.198
Token ever
Token ID1683
Feature activation+4.97e+00

Pos logit contributions
 benches+0.035
 bench+0.033
round+0.031
 Zig+0.031
 slide+0.030
Neg logit contributions
 important-0.042
 valuable-0.037
 best-0.036
arre-0.036
 mistakes-0.035
Token.
Token ID13
Feature activation+8.93e-01

Pos logit contributions
 important+0.181
arre+0.174
 valuable+0.165
 mistakes+0.163
ealing+0.160
Neg logit contributions
 benches-0.138
 bench-0.132
 slide-0.122
round-0.120
 Zig-0.120
Token John
Token ID1757
Feature activation+3.53e-01

Pos logit contributions
 important+0.033
 valuable+0.030
 best+0.029
arre+0.028
 ok+0.027
Neg logit contributions
 benches-0.027
 bench-0.025
 Zig-0.024
round-0.024
 enclosure-0.023
Token felt
Token ID2936
Feature activation+7.20e+00

Pos logit contributions
 important+0.012
 valuable+0.011
 best+0.011
arre+0.011
 mistakes+0.010
Neg logit contributions
 benches-0.011
 bench-0.011
round-0.010
 Zig-0.010
 slide-0.010
Token had
Token ID550
Feature activation+2.29e+00

Pos logit contributions
 benches+0.118
 bench+0.113
 Zig+0.109
round+0.108
 enclosure+0.106
Neg logit contributions
 important-0.115
 valuable-0.102
 best-0.097
arre-0.095
 mistakes-0.093
Token an
Token ID281
Feature activation+2.62e+00

Pos logit contributions
 important+0.077
ushing+0.069
arre+0.069
 valuable+0.068
ealing+0.067
Neg logit contributions
 benches-0.071
 bench-0.068
 Zig-0.065
 slide-0.063
 walk-0.061
Token impressive
Token ID8036
Feature activation+4.29e+00

Pos logit contributions
arre+0.058
ushing+0.050
 valuable+0.049
 important+0.049
istered+0.047
Neg logit contributions
 bench-0.098
 benches-0.097
 sit-0.097
 slide-0.096
 climb-0.095
Token,
Token ID11
Feature activation-2.64e+00

Pos logit contributions
 important+0.091
 valuable+0.079
 awful+0.074
arre+0.073
 mistakes+0.072
Neg logit contributions
 bench-0.189
 slide-0.183
 benches-0.183
round-0.174
 walk-0.174
Token wide
Token ID3094
Feature activation+3.35e+00

Pos logit contributions
 bench+0.093
 benches+0.092
 slide+0.085
round+0.084
 Zig+0.083
Neg logit contributions
 important-0.076
arre-0.071
 mistakes-0.068
 best-0.066
ushing-0.066
Token smile
Token ID8212
Feature activation+3.77e+00

Pos logit contributions
 important+0.128
 valuable+0.114
 best+0.111
arre+0.107
 ok+0.107
Neg logit contributions
 benches-0.097
 bench-0.089
 Zig-0.086
round-0.086
 enclosure-0.083
Token on
Token ID319
Feature activation-4.95e+00

Pos logit contributions
 important+0.144
 valuable+0.132
 mistakes+0.124
 best+0.123
arre+0.123
Neg logit contributions
 benches-0.104
 bench-0.103
round-0.098
 Zig-0.094
 slide-0.094
Token his
Token ID465
Feature activation+1.08e+00

Pos logit contributions
 benches+0.113
 bench+0.103
 Zig+0.097
round+0.097
 enclosure+0.094
Neg logit contributions
 important-0.220
 valuable-0.200
 best-0.195
arre-0.189
 ok-0.188
Token face
Token ID1986
Feature activation+4.35e-01

Pos logit contributions
 important+0.038
 valuable+0.034
arre+0.034
 likely+0.033
 best+0.032
Neg logit contributions
 benches-0.034
 bench-0.033
 slide-0.030
round-0.030
 Zig-0.029
Token.
Token ID13
Feature activation+9.86e-01

Pos logit contributions
 important+0.015
 valuable+0.013
 best+0.013
arre+0.013
 mistakes+0.012
Neg logit contributions
 benches-0.014
 bench-0.013
round-0.013
 Zig-0.012
 enclosure-0.012
Token
Token ID220
Feature activation+4.16e+00

Pos logit contributions
 important+0.039
 valuable+0.034
 best+0.034
 awful+0.033
 mistakes+0.033
Neg logit contributions
 benches-0.027
 bench-0.026
round-0.025
 Zig-0.024
 enclosure-0.023
Token had
Token ID550
Feature activation+2.24e+00

Pos logit contributions
 benches+0.081
 bench+0.080
 Zig+0.076
 slide+0.076
 enclosure+0.074
Neg logit contributions
 important-0.079
 valuable-0.071
 mistakes-0.070
arre-0.069
 ok-0.069
Token a
Token ID257
Feature activation+1.44e+00

Pos logit contributions
 important+0.071
 valuable+0.063
arre+0.061
 best+0.059
 mistakes+0.059
Neg logit contributions
 benches-0.078
 bench-0.074
 Zig-0.072
 slide-0.069
round-0.068
Token pencil
Token ID21613
Feature activation-9.54e-01

Pos logit contributions
 important+0.043
arre+0.041
 mistakes+0.040
ushing+0.039
ealing+0.039
Neg logit contributions
 bench-0.049
 slide-0.046
 benches-0.045
 walk-0.044
 Zig-0.044
Token.
Token ID13
Feature activation+4.24e-01

Pos logit contributions
 bench+0.025
 benches+0.024
round+0.023
 slide+0.023
 enclosure+0.021
Neg logit contributions
 important-0.038
 ok-0.035
arre-0.034
 mistakes-0.034
 valuable-0.034
Token He
Token ID679
Feature activation+9.43e-01

Pos logit contributions
 important+0.012
 valuable+0.010
 best+0.010
 certainly+0.009
 ok+0.009
Neg logit contributions
 benches-0.016
round-0.015
 bench-0.015
 enclosure-0.014
aro-0.014
Token loved
Token ID6151
Feature activation+3.60e+00

Pos logit contributions
 important+0.028
arre+0.025
 valuable+0.025
 mistakes+0.025
 best+0.025
Neg logit contributions
 benches-0.032
 bench-0.032
 Zig-0.031
 slide-0.030
round-0.030
Token to
Token ID284
Feature activation-6.10e+00

Pos logit contributions
 important+0.147
 valuable+0.130
 best+0.124
 mistakes+0.116
 precious+0.112
Neg logit contributions
 benches-0.105
 enclosure-0.097
round-0.097
 bench-0.093
stack-0.093
Token draw
Token ID3197
Feature activation-9.74e-01

Pos logit contributions
 benches+0.173
 Zig+0.153
round+0.148
 enclosure+0.148
 bench+0.144
Neg logit contributions
 important-0.245
 valuable-0.216
 best-0.207
 mistakes-0.196
 certainly-0.196
Token with
Token ID351
Feature activation-3.12e+00

Pos logit contributions
 benches+0.027
 bench+0.025
 Zig+0.024
 slide+0.023
round+0.023
Neg logit contributions
 important-0.037
 valuable-0.034
arre-0.033
 ok-0.033
 best-0.032
Token it
Token ID340
Feature activation+4.71e-01

Pos logit contributions
 benches+0.077
 bench+0.072
round+0.069
 Zig+0.068
 enclosure+0.067
Neg logit contributions
 important-0.134
 valuable-0.121
 best-0.118
 mistakes-0.113
 awful-0.112
Token.
Token ID13
Feature activation+3.63e-01

Pos logit contributions
 important+0.017
arre+0.015
 valuable+0.015
 mistakes+0.015
 ok+0.015
Neg logit contributions
 benches-0.014
 bench-0.013
 Zig-0.012
 slide-0.012
 enclosure-0.012
Token got
Token ID1392
Feature activation+1.89e+00

Pos logit contributions
 important+0.038
 valuable+0.034
 best+0.033
 mistakes+0.033
 awful+0.032
Neg logit contributions
 bench-0.037
 benches-0.036
 Zig-0.035
 slide-0.034
 walk-0.033
Token scared
Token ID12008
Feature activation+5.19e+00

Pos logit contributions
 important+0.043
 valuable+0.036
arre+0.034
 best+0.034
 mistakes+0.032
Neg logit contributions
 benches-0.081
 bench-0.078
 Zig-0.077
round-0.075
 slide-0.075
Token and
Token ID290
Feature activation-3.34e+00

Pos logit contributions
 important+0.151
 valuable+0.125
 best+0.125
 mistakes+0.120
 awful+0.118
Neg logit contributions
 benches-0.196
 bench-0.191
 gra-0.187
 enclosure-0.187
stack-0.185
Token started
Token ID2067
Feature activation+6.14e-01

Pos logit contributions
 benches+0.106
 bench+0.098
round+0.095
 Zig+0.094
 enclosure+0.093
Neg logit contributions
 important-0.120
 valuable-0.106
 best-0.102
 ok-0.098
arre-0.098
Token to
Token ID284
Feature activation-6.39e+00

Pos logit contributions
 important+0.024
 valuable+0.021
 best+0.020
 mistakes+0.020
 ok+0.020
Neg logit contributions
 benches-0.018
round-0.017
 enclosure-0.016
 throne-0.016
stack-0.015
Token run
Token ID1057
Feature activation+3.13e-01

Pos logit contributions
 benches+0.201
round+0.195
 enclosure+0.179
 throne+0.177
stack+0.173
Neg logit contributions
 important-0.222
 ok-0.198
 best-0.193
 stubborn-0.188
 mistakes-0.186
Token as
Token ID355
Feature activation-2.81e+00

Pos logit contributions
 important+0.011
 valuable+0.010
 best+0.010
 ok+0.009
 awful+0.009
Neg logit contributions
 benches-0.009
 bench-0.009
 gra-0.009
 enclosure-0.008
 throne-0.008
Token fast
Token ID3049
Feature activation+2.13e+00

Pos logit contributions
 benches+0.102
 enclosure+0.096
round+0.094
 bench+0.093
 throne+0.090
Neg logit contributions
 important-0.091
 valuable-0.079
 best-0.073
 stubborn-0.068
 awful-0.068
Token as
Token ID355
Feature activation-2.19e+00

Pos logit contributions
 important+0.070
 valuable+0.061
 best+0.059
 mistakes+0.057
 awful+0.056
Neg logit contributions
 benches-0.071
 bench-0.068
 enclosure-0.065
round-0.065
 gra-0.064
Token she
Token ID673
Feature activation+1.30e+00

Pos logit contributions
round+0.071
 benches+0.070
 enclosure+0.070
 bench+0.067
 gra+0.064
Neg logit contributions
 important-0.077
 best-0.067
 valuable-0.067
 ok-0.061
 precious-0.059
Token could
Token ID714
Feature activation-1.58e+00

Pos logit contributions
 important+0.037
 valuable+0.031
 best+0.030
 ok+0.028
arre+0.028
Neg logit contributions
 benches-0.050
 bench-0.048
round-0.047
 enclosure-0.046
 Zig-0.046
TokenAs
Token ID1722
Feature activation-5.03e-01

Pos logit contributions
 benches+0.011
 bench+0.011
round+0.010
 slide+0.010
 Zig+0.010
Neg logit contributions
 important-0.018
 best-0.017
 valuable-0.016
 mistakes-0.016
 ok-0.015
Token they
Token ID484
Feature activation-2.26e-02

Pos logit contributions
 benches+0.014
round+0.013
 bench+0.012
 enclosure+0.012
 Zig+0.012
Neg logit contributions
 important-0.021
 valuable-0.018
 best-0.018
 ok-0.017
 awful-0.017
Token played
Token ID2826
Feature activation-3.84e-01

Pos logit contributions
 benches+0.001
 bench+0.001
round+0.001
 Zig+0.001
 enclosure+0.001
Neg logit contributions
 important-0.001
 valuable-0.001
 best-0.001
arre-0.001
 awful-0.001
Token,
Token ID11
Feature activation-4.52e+00

Pos logit contributions
 benches+0.013
round+0.013
 bench+0.012
 enclosure+0.012
 Zig+0.012
Neg logit contributions
 important-0.013
 best-0.011
 valuable-0.010
 ok-0.010
 mistakes-0.010
Token Lily
Token ID20037
Feature activation+1.36e+00

Pos logit contributions
 benches+0.118
round+0.112
 bench+0.110
 enclosure+0.109
 Zig+0.102
Neg logit contributions
 important-0.183
 best-0.164
 valuable-0.163
 ok-0.154
 awful-0.154
Token noticed
Token ID6810
Feature activation+8.04e-02

Pos logit contributions
 important+0.048
 valuable+0.044
 awful+0.040
 best+0.040
 mistakes+0.039
Neg logit contributions
 bench-0.042
 benches-0.042
 slide-0.040
 walk-0.039
round-0.037
Token a
Token ID257
Feature activation+3.36e-01

Pos logit contributions
 important+0.003
 valuable+0.002
 best+0.002
 mistakes+0.002
 ok+0.002
Neg logit contributions
 benches-0.003
 bench-0.002
round-0.002
 Zig-0.002
 enclosure-0.002
Token big
Token ID1263
Feature activation+2.35e-01

Pos logit contributions
 important+0.011
 valuable+0.010
arre+0.009
 best+0.009
 mistakes+0.009
Neg logit contributions
 bench-0.011
 benches-0.011
 slide-0.010
 Zig-0.010
round-0.010
Token rock
Token ID3881
Feature activation-2.17e+00

Pos logit contributions
 important+0.009
 valuable+0.008
 best+0.008
arre+0.008
 ok+0.008
Neg logit contributions
 benches-0.007
 bench-0.006
round-0.006
 Zig-0.006
 slide-0.006
Token nearby
Token ID6716
Feature activation-1.50e+00

Pos logit contributions
 benches+0.067
 bench+0.062
 Zig+0.060
round+0.060
 enclosure+0.058
Neg logit contributions
 important-0.078
 valuable-0.069
 best-0.068
arre-0.064
 awful-0.064
Token.
Token ID13
Feature activation+3.28e-01

Pos logit contributions
 benches+0.051
 bench+0.048
 Zig+0.047
round+0.047
 enclosure+0.046
Neg logit contributions
 important-0.049
 valuable-0.043
 best-0.042
arre-0.040
 ok-0.040
Token.
Token ID13
Feature activation+1.45e+00

Pos logit contributions
 important+0.201
 valuable+0.184
arre+0.178
 awful+0.172
 mistakes+0.171
Neg logit contributions
 benches-0.153
 bench-0.144
 Zig-0.140
 slide-0.132
round-0.130
Token They
Token ID1119
Feature activation+2.74e-01

Pos logit contributions
 important+0.050
 valuable+0.044
 best+0.043
arre+0.040
 ok+0.040
Neg logit contributions
 benches-0.048
 bench-0.045
round-0.044
 enclosure-0.043
 Zig-0.042
Token made
Token ID925
Feature activation+1.34e+00

Pos logit contributions
 important+0.008
 valuable+0.007
 best+0.007
arre+0.007
 ok+0.006
Neg logit contributions
 benches-0.010
 bench-0.009
round-0.009
 Zig-0.009
 enclosure-0.009
Token their
Token ID511
Feature activation+8.35e-01

Pos logit contributions
 important+0.044
 valuable+0.039
 best+0.037
arre+0.037
 ok+0.036
Neg logit contributions
 benches-0.045
 bench-0.043
 Zig-0.042
 slide-0.040
 enclosure-0.039
Token own
Token ID898
Feature activation+1.86e+00

Pos logit contributions
 important+0.027
 best+0.024
 valuable+0.024
 mistakes+0.023
 awful+0.021
Neg logit contributions
 benches-0.029
round-0.027
 bench-0.027
 Zig-0.026
 enclosure-0.025
Token story
Token ID1621
Feature activation-3.85e-02

Pos logit contributions
 important+0.063
 valuable+0.056
 best+0.054
 mistakes+0.052
arre+0.051
Neg logit contributions
 benches-0.062
 bench-0.058
round-0.057
 Zig-0.057
 enclosure-0.055
Token.
Token ID13
Feature activation+1.29e+00

Pos logit contributions
 benches+0.001
 bench+0.001
round+0.001
 Zig+0.001
 slide+0.001
Neg logit contributions
 important-0.001
 valuable-0.001
 best-0.001
arre-0.001
 ok-0.001
Token<|endoftext|>
Token ID50256
Feature activation-7.19e-01

Pos logit contributions
 important+0.046
 valuable+0.041
 best+0.040
arre+0.038
 ok+0.038
Neg logit contributions
 benches-0.040
 bench-0.038
round-0.036
 Zig-0.036
 enclosure-0.036
TokenBen
Token ID11696
Feature activation-1.29e+00

Pos logit contributions
 benches+0.022
round+0.021
 bench+0.021
 throne+0.020
 enclosure+0.020
Neg logit contributions
 important-0.024
 best-0.023
 valuable-0.021
arre-0.020
 mistakes-0.020
Token and
Token ID290
Feature activation-4.97e+00

Pos logit contributions
 benches+0.044
 bench+0.042
 Zig+0.040
round+0.040
 enclosure+0.039
Neg logit contributions
 important-0.042
 valuable-0.037
 best-0.036
arre-0.034
 mistakes-0.034
Token Mia
Token ID32189
Feature activation-2.64e-01

Pos logit contributions
 benches+0.163
 enclosure+0.149
 bench+0.144
round+0.143
 throne+0.139
Neg logit contributions
 important-0.178
 valuable-0.155
 best-0.154
 ok-0.143
 unfair-0.138
Token better
Token ID1365
Feature activation+4.47e+00

Pos logit contributions
arre+0.147
 important+0.127
ushing+0.124
 mistakes+0.119
/+0.118
Neg logit contributions
 benches-0.164
 bench-0.162
 sit-0.154
 distance-0.153
 gra-0.152
Token.
Token ID13
Feature activation+9.01e-01

Pos logit contributions
 important+0.177
arre+0.176
 valuable+0.165
 unfair+0.161
 mistakes+0.159
Neg logit contributions
 benches-0.106
 bench-0.106
 slide-0.092
round-0.088
 enclosure-0.086
Token<|endoftext|>
Token ID50256
Feature activation-8.32e-01

Pos logit contributions
 important+0.033
 valuable+0.030
 best+0.028
arre+0.027
 mistakes+0.027
Neg logit contributions
 benches-0.028
 bench-0.026
round-0.025
 Zig-0.024
 enclosure-0.024
TokenOnce
Token ID7454
Feature activation-2.85e+00

Pos logit contributions
 benches+0.029
round+0.028
 bench+0.027
 throne+0.027
 enclosure+0.026
Neg logit contributions
 important-0.025
 best-0.023
 valuable-0.022
arre-0.020
 mistakes-0.020
Token upon
Token ID2402
Feature activation+1.56e+00

Pos logit contributions
 benches+0.078
 bench+0.072
round+0.069
 Zig+0.068
 enclosure+0.067
Neg logit contributions
 important-0.114
 valuable-0.102
 best-0.100
arre-0.096
 ok-0.095
Token a
Token ID257
Feature activation-2.42e-01

Pos logit contributions
 important+0.047
 valuable+0.040
 best+0.039
arre+0.037
 ok+0.037
Neg logit contributions
 benches-0.057
 bench-0.054
round-0.053
 Zig-0.052
 enclosure-0.052
Token time
Token ID640
Feature activation-9.73e-01

Pos logit contributions
 benches+0.009
 bench+0.009
round+0.008
 slide+0.008
 Zig+0.008
Neg logit contributions
 important-0.007
 valuable-0.006
 best-0.006
 mistakes-0.006
arre-0.006
Token,
Token ID11
Feature activation-4.97e+00

Pos logit contributions
 benches+0.034
 bench+0.032
round+0.031
 enclosure+0.030
 gra+0.030
Neg logit contributions
 important-0.032
 best-0.028
 valuable-0.028
 ok-0.026
arre-0.026
Token there
Token ID612
Feature activation-2.91e+00

Pos logit contributions
 benches+0.131
 bench+0.123
 Zig+0.118
round+0.115
 slide+0.113
Neg logit contributions
 important-0.198
 valuable-0.180
 best-0.173
arre-0.172
 mistakes-0.170
Token was
Token ID373
Feature activation+4.00e+00

Pos logit contributions
 benches+0.117
 bench+0.111
round+0.109
 Zig+0.107
 enclosure+0.106
Neg logit contributions
 important-0.078
 valuable-0.066
 best-0.064
arre-0.059
 ok-0.059
Token a
Token ID257
Feature activation+1.66e+00

Pos logit contributions
 important+0.128
 valuable+0.114
arre+0.107
 best+0.107
 ok+0.105
Neg logit contributions
 benches-0.138
 bench-0.131
 Zig-0.126
round-0.123
 slide-0.122
Token a
Token ID257
Feature activation-3.64e-01

Pos logit contributions
 important+0.046
 valuable+0.040
 best+0.039
arre+0.037
 mistakes+0.037
Neg logit contributions
 benches-0.052
 bench-0.049
 Zig-0.047
round-0.047
 enclosure-0.046
Token time
Token ID640
Feature activation-1.28e+00

Pos logit contributions
 benches+0.012
 bench+0.012
 slide+0.012
 walk+0.011
 Zig+0.011
Neg logit contributions
 important-0.011
 valuable-0.010
 mistakes-0.009
 best-0.009
arre-0.009
Token,
Token ID11
Feature activation-5.12e+00

Pos logit contributions
 benches+0.043
 bench+0.040
round+0.040
 enclosure+0.038
 gra+0.037
Neg logit contributions
 important-0.043
 best-0.038
 valuable-0.037
 ok-0.035
arre-0.035
Token there
Token ID612
Feature activation-2.85e+00

Pos logit contributions
 benches+0.142
 bench+0.132
 Zig+0.126
round+0.126
 enclosure+0.122
Neg logit contributions
 important-0.201
 valuable-0.181
 best-0.175
arre-0.170
 ok-0.169
Token was
Token ID373
Feature activation+3.94e+00

Pos logit contributions
 benches+0.116
 bench+0.109
round+0.108
 Zig+0.106
 enclosure+0.105
Neg logit contributions
 important-0.075
 valuable-0.064
 best-0.062
arre-0.057
 ok-0.057
Token a
Token ID257
Feature activation+1.43e+00

Pos logit contributions
 important+0.131
 valuable+0.119
arre+0.113
 best+0.110
 mistakes+0.110
Neg logit contributions
 benches-0.129
 bench-0.122
 Zig-0.119
 slide-0.114
round-0.112
Token big
Token ID1263
Feature activation+1.08e+00

Pos logit contributions
 important+0.040
 valuable+0.035
 best+0.033
arre+0.031
 ok+0.031
Neg logit contributions
 benches-0.056
 bench-0.053
 Zig-0.051
round-0.051
 enclosure-0.050
Token dog
Token ID3290
Feature activation-2.32e+00

Pos logit contributions
 important+0.042
 best+0.037
 valuable+0.036
 awful+0.034
arre+0.034
Neg logit contributions
 benches-0.031
 Zig-0.028
round-0.027
 bench-0.027
 enclosure-0.026
Token.
Token ID13
Feature activation+4.51e-01

Pos logit contributions
 benches+0.083
 inspection+0.070
 gra+0.068
 bench+0.067
round+0.067
Neg logit contributions
 important-0.079
 valuable-0.064
/-0.064
 best-0.061
 stubborn-0.059
Token He
Token ID679
Feature activation+8.09e-01

Pos logit contributions
 important+0.014
 valuable+0.012
 stubborn+0.011
 best+0.011
 unfair+0.011
Neg logit contributions
 benches-0.015
aro-0.014
round-0.013
 enclosure-0.013
 bench-0.013
Token had
Token ID550
Feature activation+2.47e+00

Pos logit contributions
 important+0.023
 valuable+0.020
arre+0.020
 mistakes+0.019
 best+0.019
Neg logit contributions
 benches-0.030
 bench-0.029
 Zig-0.028
round-0.028
 slide-0.028
Token eagerly
Token ID30130
Feature activation+2.32e+00

Pos logit contributions
 important+0.107
 valuable+0.097
 best+0.092
arre+0.092
 awful+0.090
Neg logit contributions
 benches-0.080
 bench-0.076
 Zig-0.072
round-0.071
 slide-0.069
Token and
Token ID290
Feature activation-4.07e+00

Pos logit contributions
 important+0.086
 valuable+0.077
 best+0.073
arre+0.073
 awful+0.072
Neg logit contributions
 benches-0.068
 bench-0.066
round-0.063
 enclosure-0.061
 slide-0.061
Token went
Token ID1816
Feature activation-1.63e+00

Pos logit contributions
 benches+0.127
 bench+0.121
 Zig+0.116
round+0.115
 slide+0.113
Neg logit contributions
 important-0.144
 valuable-0.128
 best-0.123
arre-0.120
 mistakes-0.119
Token back
Token ID736
Feature activation-2.31e+00

Pos logit contributions
 benches+0.051
 Zig+0.048
 bench+0.047
round+0.047
 enclosure+0.046
Neg logit contributions
 important-0.056
 best-0.050
 valuable-0.049
arre-0.044
 mistakes-0.044
Token outside
Token ID2354
Feature activation-7.38e+00

Pos logit contributions
 benches+0.067
round+0.061
 bench+0.061
 Zig+0.061
 gra+0.057
Neg logit contributions
 important-0.086
 best-0.076
 valuable-0.075
 mistakes-0.071
arre-0.069
Token to
Token ID284
Feature activation-8.98e+00

Pos logit contributions
 benches+0.234
 bench+0.217
 Zig+0.211
round+0.211
 enclosure+0.204
Neg logit contributions
 important-0.262
 valuable-0.231
 best-0.226
 mistakes-0.215
 ok-0.214
Token play
Token ID711
Feature activation-1.59e+00

Pos logit contributions
 benches+0.180
round+0.170
 Zig+0.155
 bench+0.150
 enclosure+0.142
Neg logit contributions
 important-0.422
 best-0.379
 valuable-0.371
 mistakes-0.366
 ok-0.364
Token.
Token ID13
Feature activation+5.30e-01

Pos logit contributions
 benches+0.052
 bench+0.049
round+0.047
 Zig+0.047
 enclosure+0.046
Neg logit contributions
 important-0.055
 valuable-0.048
 best-0.047
arre-0.045
 ok-0.045
Token He
Token ID679
Feature activation+1.20e+00

Pos logit contributions
 important+0.019
 valuable+0.018
 best+0.017
arre+0.016
 ok+0.016
Neg logit contributions
 benches-0.016
 bench-0.015
round-0.014
 Zig-0.014
 enclosure-0.014
Token was
Token ID373
Feature activation+5.34e+00

Pos logit contributions
 important+0.041
 valuable+0.037
 mistakes+0.035
 awful+0.035
arre+0.035
Neg logit contributions
 benches-0.037
 bench-0.037
 slide-0.036
 Zig-0.035
round-0.034
Token so
Token ID523
Feature activation+4.57e+00

Pos logit contributions
 important+0.149
 valuable+0.137
arre+0.131
ushing+0.129
 mistakes+0.128
Neg logit contributions
 benches-0.190
 bench-0.189
 Zig-0.184
 slide-0.179
round-0.174
Token owner
Token ID4870
Feature activation-3.47e+00

Pos logit contributions
 important+0.052
arre+0.051
 valuable+0.046
ealing+0.044
ushing+0.044
Neg logit contributions
 benches-0.049
 bench-0.049
 Zig-0.045
 slide-0.045
 enclosure-0.044
Token.
Token ID13
Feature activation+1.53e-01

Pos logit contributions
 benches+0.126
 bench+0.119
 Zig+0.116
round+0.115
 enclosure+0.112
Neg logit contributions
 important-0.107
 valuable-0.093
 best-0.090
 ok-0.085
arre-0.084
Token
Token ID198
Feature activation-1.08e+00

Pos logit contributions
 important+0.005
 valuable+0.005
 best+0.005
arre+0.005
 ok+0.005
Neg logit contributions
 benches-0.005
round-0.004
 bench-0.004
 enclosure-0.004
 Zig-0.004
Token
Token ID198
Feature activation-8.97e-01

Pos logit contributions
 benches+0.035
 bench+0.032
round+0.032
 enclosure+0.031
 Zig+0.031
Neg logit contributions
 important-0.038
 valuable-0.034
 best-0.032
arre-0.031
 ok-0.030
TokenSuddenly
Token ID38582
Feature activation-1.71e+00

Pos logit contributions
 bench+0.019
 benches+0.019
round+0.017
 slide+0.017
 Zig+0.016
Neg logit contributions
 important-0.038
 best-0.036
 valuable-0.035
 mistakes-0.035
 ok-0.035
Token,
Token ID11
Feature activation-4.67e+00

Pos logit contributions
 benches+0.052
 bench+0.048
 Zig+0.046
round+0.046
 enclosure+0.045
Neg logit contributions
 important-0.063
 valuable-0.056
 best-0.054
arre-0.052
 ok-0.052
Token there
Token ID612
Feature activation-2.09e+00

Pos logit contributions
 benches+0.131
round+0.127
 enclosure+0.126
 bench+0.120
 Zig+0.114
Neg logit contributions
 important-0.182
 best-0.163
 valuable-0.159
 ok-0.154
 awful-0.150
Token was
Token ID373
Feature activation+3.94e+00

Pos logit contributions
 benches+0.078
 bench+0.074
 Zig+0.072
round+0.070
 slide+0.070
Neg logit contributions
 important-0.062
 valuable-0.055
 best-0.051
arre-0.050
 mistakes-0.049
Token an
Token ID281
Feature activation+3.31e+00

Pos logit contributions
 important+0.126
 valuable+0.112
 best+0.106
arre+0.105
 mistakes+0.102
Neg logit contributions
 benches-0.138
 bench-0.130
 Zig-0.126
round-0.122
 slide-0.121
Token emergency
Token ID6334
Feature activation-1.08e+00

Pos logit contributions
 important+0.094
 valuable+0.091
arre+0.088
 ok+0.086
 certainly+0.085
Neg logit contributions
 benches-0.116
 bench-0.113
round-0.107
 slide-0.107
 enclosure-0.103
Token!
Token ID0
Feature activation+6.11e-01

Pos logit contributions
 benches+0.036
 bench+0.033
 Zig+0.033
round+0.033
 gra+0.032
Neg logit contributions
 important-0.035
 best-0.030
 valuable-0.029
 awful-0.028
 mistakes-0.028
Token't
Token ID470
Feature activation+5.24e+00

Pos logit contributions
 important+0.242
 valuable+0.220
 best+0.216
 precious+0.211
ushing+0.208
Neg logit contributions
 benches-0.136
 Zig-0.129
 bench-0.129
 slide-0.125
round-0.120
Token spoil
Token ID20851
Feature activation+1.69e+00

Pos logit contributions
 important+0.199
 valuable+0.178
 best+0.171
 ok+0.165
arre+0.163
Neg logit contributions
 benches-0.156
 bench-0.142
round-0.138
 Zig-0.138
 enclosure-0.136
Token the
Token ID262
Feature activation+1.60e-01

Pos logit contributions
 important+0.064
 valuable+0.058
 best+0.056
arre+0.054
 ok+0.054
Neg logit contributions
 benches-0.048
 bench-0.045
 Zig-0.043
round-0.043
 enclosure-0.042
Token cater
Token ID20825
Feature activation-1.37e-01

Pos logit contributions
 important+0.005
 valuable+0.005
 best+0.004
arre+0.004
 ok+0.004
Neg logit contributions
 benches-0.006
 bench-0.005
 Zig-0.005
round-0.005
 enclosure-0.005
Tokenpillar
Token ID41643
Feature activation-1.01e+00

Pos logit contributions
 benches+0.005
 bench+0.005
round+0.005
 Zig+0.004
 slide+0.004
Neg logit contributions
 important-0.004
 valuable-0.004
 best-0.004
 ok-0.003
 mistakes-0.003
Token or
Token ID393
Feature activation-4.84e+00

Pos logit contributions
 benches+0.034
 bench+0.033
 slide+0.031
 enclosure+0.030
 Zig+0.030
Neg logit contributions
 important-0.033
 valuable-0.029
arre-0.028
 best-0.027
 unfair-0.027
Token else
Token ID2073
Feature activation-1.45e-01

Pos logit contributions
 benches+0.145
 bench+0.135
round+0.132
 Zig+0.131
 enclosure+0.127
Neg logit contributions
 important-0.180
 valuable-0.160
 best-0.155
 ok-0.148
 mistakes-0.148
Token he
Token ID339
Feature activation+2.13e+00

Pos logit contributions
 benches+0.004
 bench+0.004
 slide+0.004
 Zig+0.004
 enclosure+0.004
Neg logit contributions
 important-0.005
 valuable-0.005
arre-0.004
 best-0.004
 awful-0.004
Token won
Token ID1839
Feature activation+1.52e+00

Pos logit contributions
 important+0.063
arre+0.058
 valuable+0.057
ushing+0.055
 mistakes+0.055
Neg logit contributions
 bench-0.066
 benches-0.065
 walk-0.065
 slide-0.065
 climb-0.064
Token't
Token ID470
Feature activation+3.86e+00

Pos logit contributions
 important+0.055
 valuable+0.049
 best+0.047
arre+0.046
ushing+0.045
Neg logit contributions
 benches-0.046
 bench-0.043
 Zig-0.042
 slide-0.041
round-0.040
Token be
Token ID307
Feature activation+5.54e+00

Pos logit contributions
 important+0.151
 valuable+0.136
 best+0.131
arre+0.127
 ok+0.126
Neg logit contributions
 benches-0.109
 bench-0.101
 Zig-0.096
round-0.096
 enclosure-0.094
Token bright
Token ID6016
Feature activation+3.00e+00

Pos logit contributions
 benches+0.118
 bench+0.110
round+0.107
 Zig+0.106
 enclosure+0.103
Neg logit contributions
 important-0.145
 valuable-0.129
 best-0.125
arre-0.119
 mistakes-0.119
Token,
Token ID11
Feature activation-3.17e+00

Pos logit contributions
 important+0.104
 valuable+0.092
 best+0.089
arre+0.087
 ok+0.086
Neg logit contributions
 benches-0.098
 bench-0.091
round-0.087
 Zig-0.087
 enclosure-0.085
Token green
Token ID4077
Feature activation+1.87e+00

Pos logit contributions
 benches+0.117
 bench+0.112
 Zig+0.106
round+0.105
 slide+0.105
Neg logit contributions
 important-0.094
 valuable-0.082
 best-0.079
arre-0.079
 mistakes-0.077
Token grass
Token ID8701
Feature activation-2.95e+00

Pos logit contributions
 important+0.076
 best+0.068
 valuable+0.067
 mistakes+0.064
arre+0.061
Neg logit contributions
 benches-0.045
 Zig-0.044
round-0.043
 bench-0.042
 gra-0.041
Token.
Token ID13
Feature activation+5.54e-01

Pos logit contributions
 benches+0.100
 bench+0.094
 Zig+0.091
round+0.091
 enclosure+0.089
Neg logit contributions
 important-0.097
 valuable-0.085
 best-0.082
arre-0.079
 ok-0.078
Token In
Token ID554
Feature activation-5.60e+00

Pos logit contributions
 important+0.019
 valuable+0.017
 best+0.016
 ok+0.016
 mistakes+0.016
Neg logit contributions
 benches-0.017
 bench-0.016
round-0.016
 Zig-0.016
 throne-0.015
Token the
Token ID262
Feature activation-3.25e+00

Pos logit contributions
 benches+0.122
 bench+0.109
round+0.108
 Zig+0.106
 enclosure+0.100
Neg logit contributions
 important-0.253
 valuable-0.230
 best-0.227
 ok-0.215
 mistakes-0.215
Token middle
Token ID3504
Feature activation-2.66e+00

Pos logit contributions
 benches+0.083
 bench+0.077
 Zig+0.074
round+0.074
 enclosure+0.071
Neg logit contributions
 important-0.135
 valuable-0.122
 best-0.119
arre-0.114
 ok-0.114
Token of
Token ID286
Feature activation-6.70e+00

Pos logit contributions
 benches+0.089
round+0.086
 Zig+0.085
 bench+0.083
 gra+0.080
Neg logit contributions
 important-0.087
 best-0.076
 valuable-0.075
 mistakes-0.069
 awful-0.068
Token the
Token ID262
Feature activation-2.47e+00

Pos logit contributions
round+0.193
 benches+0.177
 throne+0.174
 enclosure+0.171
 bench+0.168
Neg logit contributions
 important-0.249
 best-0.225
 valuable-0.210
 certainly-0.199
 ok-0.197
Token big
Token ID1263
Feature activation-2.52e+00

Pos logit contributions
 benches+0.058
 Zig+0.052
 bench+0.052
round+0.051
 enclosure+0.049
Neg logit contributions
 important-0.108
 valuable-0.098
 best-0.097
arre-0.091
 mistakes-0.091
Token fairy
Token ID25607
Feature activation-2.00e+00

Pos logit contributions
 important+0.007
 valuable+0.006
 best+0.006
arre+0.006
 ok+0.005
Neg logit contributions
 benches-0.007
 bench-0.007
 Zig-0.006
round-0.006
 slide-0.006
Token waved
Token ID26834
Feature activation+2.73e+00

Pos logit contributions
 benches+0.069
 bench+0.065
 Zig+0.063
round+0.063
 enclosure+0.061
Neg logit contributions
 important-0.064
 valuable-0.057
 best-0.055
arre-0.052
 ok-0.052
Token her
Token ID607
Feature activation+2.60e+00

Pos logit contributions
 important+0.113
 valuable+0.102
 best+0.099
 ok+0.096
arre+0.095
Neg logit contributions
 benches-0.070
 bench-0.065
round-0.062
 Zig-0.061
 enclosure-0.060
Token magic
Token ID5536
Feature activation+3.20e+00

Pos logit contributions
 important+0.076
 valuable+0.066
arre+0.064
 best+0.061
 awful+0.061
Neg logit contributions
 benches-0.098
 bench-0.093
 Zig-0.088
 slide-0.087
 enclosure-0.087
Token wand
Token ID11569
Feature activation-1.48e+00

Pos logit contributions
 important+0.090
 valuable+0.077
 best+0.073
arre+0.072
 ok+0.070
Neg logit contributions
 benches-0.124
 bench-0.119
round-0.113
 Zig-0.113
 enclosure-0.111
Token and
Token ID290
Feature activation-5.09e+00

Pos logit contributions
 benches+0.042
 bench+0.040
 Zig+0.036
 enclosure+0.036
round+0.036
Neg logit contributions
 important-0.056
 valuable-0.050
arre-0.049
 mistakes-0.048
 awful-0.047
Token said
Token ID531
Feature activation+2.22e+00

Pos logit contributions
 benches+0.170
 bench+0.160
round+0.159
 enclosure+0.155
 Zig+0.151
Neg logit contributions
 important-0.170
 valuable-0.150
 best-0.147
 ok-0.137
 awful-0.135
Token
Token ID6184
Feature activation+3.73e+00

Pos logit contributions
 important+0.081
 valuable+0.072
 best+0.070
 ok+0.067
arre+0.067
Neg logit contributions
 benches-0.068
 bench-0.064
round-0.061
 Zig-0.060
 enclosure-0.060
Token
Token ID95
Feature activation+3.02e+00

Pos logit contributions
 important+0.136
 valuable+0.121
 best+0.117
arre+0.113
 ok+0.112
Neg logit contributions
 benches-0.114
 bench-0.107
 Zig-0.102
round-0.102
 enclosure-0.099
Token
Token ID26391
Feature activation+5.02e+00

Pos logit contributions
 important+0.110
 valuable+0.098
 best+0.098
arre+0.093
 mistakes+0.092
Neg logit contributions
 benches-0.091
 bench-0.085
round-0.083
 Zig-0.082
 enclosure-0.081
Token
Token ID129
Feature activation+2.13e+00

Pos logit contributions
 important+0.170
 valuable+0.151
 best+0.145
arre+0.140
 ok+0.139
Neg logit contributions
 benches-0.165
 bench-0.155
round-0.149
 Zig-0.149
 enclosure-0.146