TOP ACTIVATIONS
MAX = 13.136

laughed and said , " Looks like that apple made you
smiled and said , " Looks like you like the wardrobe
smiled . " Looks like the frog likes you
the owner smiled . " Looks like you two are friends
smiled and said , " Sounds like you had the best
table . " Looks like mom has some mail
's mom my . " Looks like the wave destroyed all
, but he didn 't seem to win very often .
, but it didn 't seem to work . Lily laughed
, but he didn 't seem to be getting bigger or
, but it didn 't seem to work . The star
The sw an didn 't seem to mind and just kept
her . It didn 't seem like it wanted to hurt
wind . Billy didn 't seem to notice anything out of
new toy she had been longing for - a mixing machine
, but it didn 't seem to work . He got
Yes , Molly " it seemed to say . " There
. Mitt ens didn 't seem to mind and continued to
move . It didn 't seem to care about L ila
why everyone else didn 't seem to mind . He tried

MIN ACTIVATIONS
MIN = -10.864

L ily and Mom go to the store . Mom wants
L ily and Mom went to the shop . Lily saw
S ara and Ben went to see a show with their
and Sam and Rex walk to the pond . They see
S ara and Tom went to the zoo with Mom and
girl named Emma walked outside to the garden . On the
and her mom would sign to each other every day .
L ily and Ben went to the park with Mom .
Jim walked into the kitchen to find a messy pizza on
L ily and Ben went to the park with Mom .
ily and her mom went to the mall . Lily liked
L ily and Ben went to the beach with Mom and
and her mom went outside to the park . It was
<|endoftext|> Tim and Lily went to the zoo with Mom and
. <|endoftext|> Mom my went to the kitchen . She opened
<|endoftext|> Anna and Ben went to the shop with Mom .
<|endoftext|> Tim and Mia went to the zoo with their mom
would <|endoftext|> John b iked to the park with his friends
John and his mom went to the store . When they
again . <|endoftext|> John went to the station with his dad

INTERVAL 7.136 - 13.136
CONTAINS 1.220%

proud . She had learned something new and it was very
home and the puppy was very happy . They wanted to
her feel sad . Lily didn 't like Tim my and
Sam . You are good too . You are my friend
warned him to not go too far . As

INTERVAL 1.136 - 7.136
CONTAINS 39.564%

kept singing her happy songs each morning in the quiet forest
take my boot . It is mine . Go away ."
she gave him the dinner . Her dad was very happy
mom said they had no spaghetti left . She said they
the flowers . They could not see the shed . They

INTERVAL -4.864 - 1.136
CONTAINS 51.000%

, as they passed the playground . " Okay
Everything is harmless ." The boy relaxed and soon
curtain ." S ue 's mom thought for a
It was so mild that Mom my bear didn 't worry
so many new things from the test . And Bobby was

INTERVAL -10.864 - -4.864
CONTAINS 8.216%

, but he kept talking in Spanish . Finally , Mary
to many houses . They knock on the doors and give
. She was so proud of herself , and so were
mess . Mom and Dad hear them and come
statue . They go back to their mom , who is
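
NOTE ON THE INTERVALS
The four INTERVAL blocks above appear to be equal-width bins spanning MIN to MAX, and their CONTAINS percentages appear to cover all tokens. The sketch below checks both readings against the listed numbers. The equal-width assumption, the function name `interval_edges`, and the variable names are illustrative and not taken from the tool that produced this dump.

```python
def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# Values copied from the MIN / MAX lines of this dump.
MIN_ACT = -10.864
MAX_ACT = 13.136

# Assumption: four bins, matching the four INTERVAL blocks listed above.
edges = interval_edges(MIN_ACT, MAX_ACT, 4)

# CONTAINS percentages, ordered from the lowest interval to the highest.
shares = [8.216, 51.000, 39.564, 1.220]
total = round(sum(shares), 3)

print(edges)
print(total)
```

Under these assumptions each bin is 6.0 wide, and about 90% of tokens fall in the two middle intervals, so the strongly positive and strongly negative examples listed first are the rare tails.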
PER-TOKEN DETAILS FOR TOP ACTIVATIONS

Token laughed
Token ID13818
Feature activation+3.30e+00

Pos logit contributions
 benches+0.032
 greet+0.030
aster+0.029
 Sue+0.029
 enclosure+0.028
Neg logit contributions
 important-0.035
 valuable-0.035
 best-0.031
 precious-0.027
 certainly-0.027
Token and
Token ID290
Feature activation-3.61e+00

Pos logit contributions
 valuable+0.145
 important+0.138
 virtue+0.138
 precious+0.126
 best+0.124
Neg logit contributions
 Sue-0.043
 sit-0.041
 Ann-0.037
 her-0.037
 benches-0.036
Token said
Token ID531
Feature activation-1.02e-01

Pos logit contributions
 benches+0.097
 greet+0.089
aster+0.088
 Sue+0.086
 enclosure+0.086
Neg logit contributions
 important-0.119
 valuable-0.118
 best-0.108
 precious-0.095
 certainly-0.095
Token,
Token ID11
Feature activation-3.04e+00

Pos logit contributions
 benches+0.002
 Sue+0.002
 greet+0.002
 Ann+0.002
aster+0.002
Neg logit contributions
 important-0.004
 valuable-0.004
 best-0.003
 precious-0.003
 certainly-0.003
Token "
Token ID366
Feature activation+1.41e+00

Pos logit contributions
 benches+0.089
 greet+0.083
 Sue+0.082
aster+0.081
 Ann+0.081
Neg logit contributions
 important-0.091
 valuable-0.091
 best-0.082
 precious-0.072
 certainly-0.071
TokenLooks
Token ID41102
Feature activation+1.24e+01

Pos logit contributions
 important+0.044
 valuable+0.043
 best+0.040
 precious+0.035
 impossible+0.033
Neg logit contributions
 benches-0.039
 greet-0.037
 Sue-0.036
 Ann-0.036
aster-0.036
Token like
Token ID588
Feature activation+1.66e+00

Pos logit contributions
 valuable+0.410
 important+0.409
 virtue+0.381
 best+0.365
iced+0.362
Neg logit contributions
 Sue-0.259
 sit-0.242
 benches-0.240
 stroke-0.232
 climb-0.227
Token that
Token ID326
Feature activation+9.74e-02

Pos logit contributions
 important+0.053
 valuable+0.052
 best+0.048
 precious+0.042
 certainly+0.042
Neg logit contributions
 benches-0.047
 greet-0.043
 enclosure-0.043
aster-0.042
 bench-0.041
Token apple
Token ID17180
Feature activation-1.69e+00

Pos logit contributions
 valuable+0.003
 important+0.003
 best+0.003
 precious+0.003
 certainly+0.003
Neg logit contributions
 Sue-0.002
 benches-0.002
 bench-0.002
 climb-0.002
 Ann-0.002
Token made
Token ID925
Feature activation+1.18e+00

Pos logit contributions
 benches+0.046
 Sue+0.044
 bench+0.043
 Ann+0.042
aster+0.042
Neg logit contributions
 valuable-0.053
 important-0.052
 best-0.046
 precious-0.042
 certainly-0.041
Token you
Token ID345
Feature activation+4.10e+00

Pos logit contributions
 valuable+0.040
 important+0.040
 best+0.036
 precious+0.032
 certainly+0.032
Neg logit contributions
 benches-0.030
 greet-0.028
 Sue-0.028
aster-0.027
 Ann-0.027
Token smiled
Token ID13541
Feature activation+4.18e+00

Pos logit contributions
 benches+0.124
 greet+0.113
aster+0.110
 enclosure+0.109
 bench+0.108
Neg logit contributions
 important-0.145
 valuable-0.144
 best-0.133
 certainly-0.120
 precious-0.117
Token and
Token ID290
Feature activation-2.70e+00

Pos logit contributions
 valuable+0.180
 important+0.179
 virtue+0.174
 best+0.157
 certainly+0.155
Neg logit contributions
 Sue-0.052
 greet-0.051
 sit-0.051
 stroke-0.049
 benches-0.047
Token said
Token ID531
Feature activation-1.63e-01

Pos logit contributions
 benches+0.079
 greet+0.073
aster+0.072
 enclosure+0.071
 bench+0.070
Neg logit contributions
 important-0.082
 valuable-0.082
 best-0.074
 precious-0.065
 certainly-0.065
Token,
Token ID11
Feature activation-2.78e+00

Pos logit contributions
 benches+0.003
 Sue+0.003
 Ann+0.003
 greet+0.003
aster+0.003
Neg logit contributions
 important-0.006
 valuable-0.006
 best-0.005
 precious-0.005
 certainly-0.005
Token "
Token ID366
Feature activation+1.74e+00

Pos logit contributions
 benches+0.089
 greet+0.083
aster+0.082
 enclosure+0.081
 Sue+0.080
Neg logit contributions
 important-0.076
 valuable-0.076
 best-0.068
 certainly-0.058
 precious-0.058
TokenLooks
Token ID41102
Feature activation+1.22e+01

Pos logit contributions
 important+0.053
 valuable+0.052
 best+0.048
 precious+0.042
 problem+0.041
Neg logit contributions
 benches-0.048
 Sue-0.047
 greet-0.046
 Ann-0.045
aster-0.044
Token like
Token ID588
Feature activation+1.90e+00

Pos logit contributions
 important+0.409
 valuable+0.404
 virtue+0.385
 best+0.371
iced+0.367
Neg logit contributions
 Sue-0.238
 sit-0.225
 stroke-0.214
 benches-0.212
 greet-0.212
Token you
Token ID345
Feature activation+3.96e+00

Pos logit contributions
 important+0.062
 valuable+0.062
 best+0.056
 precious+0.049
 certainly+0.049
Neg logit contributions
 benches-0.051
 greet-0.047
 Sue-0.046
aster-0.046
 Ann-0.045
Token like
Token ID588
Feature activation+1.82e+00

Pos logit contributions
 valuable+0.127
 important+0.126
 best+0.112
 precious+0.102
 virtue+0.097
Neg logit contributions
 benches-0.101
 Sue-0.100
 greet-0.100
 Ann-0.096
aster-0.096
Token the
Token ID262
Feature activation+9.68e-01

Pos logit contributions
 important+0.064
 valuable+0.063
 best+0.058
 precious+0.051
 certainly+0.051
Neg logit contributions
 benches-0.046
 greet-0.042
aster-0.041
 enclosure-0.040
 bench-0.040
Token wardrobe
Token ID35593
Feature activation+1.90e-01

Pos logit contributions
 valuable+0.026
 important+0.025
 best+0.021
 certainly+0.020
 virtue+0.019
Neg logit contributions
 benches-0.031
 greet-0.029
 bench-0.029
 Sue-0.028
 Ann-0.028
Token smiled
Token ID13541
Feature activation+4.16e+00

Pos logit contributions
 benches+0.102
aster+0.095
 restaurants+0.092
 arrest+0.090
 greet+0.089
Neg logit contributions
 important-0.094
 valuable-0.092
 best-0.091
 certainly-0.086
 really-0.078
Token.
Token ID13
Feature activation+2.26e+00

Pos logit contributions
 valuable+0.177
 important+0.174
 virtue+0.168
 best+0.151
 certainly+0.151
Neg logit contributions
 Sue-0.058
 sit-0.056
 benches-0.055
 greet-0.055
 stroke-0.055
Token
Token ID198
Feature activation-3.42e-01

Pos logit contributions
 important+0.076
 valuable+0.076
 best+0.069
 precious+0.062
 certainly+0.061
Neg logit contributions
 benches-0.058
 greet-0.054
 Sue-0.053
aster-0.052
 Ann-0.052
Token
Token ID198
Feature activation-2.21e-01

Pos logit contributions
 benches+0.011
 enclosure+0.011
 greet+0.011
aster+0.010
 stroke+0.010
Neg logit contributions
 valuable-0.009
 important-0.009
 best-0.008
 certainly-0.007
 precious-0.006
Token"
Token ID1
Feature activation+6.65e-02

Pos logit contributions
 benches+0.007
 greet+0.006
aster+0.006
 Sue+0.006
 bench+0.006
Neg logit contributions
 important-0.006
 valuable-0.006
 best-0.006
 precious-0.005
 certainly-0.005
TokenLooks
Token ID41102
Feature activation+1.22e+01

Pos logit contributions
 important+0.002
 valuable+0.002
 best+0.002
 impossible+0.002
 virtue+0.002
Neg logit contributions
 benches-0.002
 greet-0.002
 Sue-0.002
 bench-0.001
aster-0.001
Token like
Token ID588
Feature activation+1.80e+00

Pos logit contributions
 important+0.396
 valuable+0.391
 virtue+0.372
 best+0.348
iced+0.345
Neg logit contributions
 Sue-0.257
 stroke-0.251
 sit-0.250
 benches-0.243
 greet-0.240
Token the
Token ID262
Feature activation+9.23e-01

Pos logit contributions
 important+0.057
 valuable+0.055
 best+0.051
 precious+0.044
 certainly+0.042
Neg logit contributions
 benches-0.052
 enclosure-0.049
aster-0.049
 greet-0.049
 slam-0.047
Token frog
Token ID21264
Feature activation-1.23e+00

Pos logit contributions
 valuable+0.036
 important+0.035
 best+0.032
 certainly+0.030
 precious+0.029
Neg logit contributions
 benches-0.019
 greet-0.018
 Sue-0.017
 bench-0.017
 Ann-0.017
Token likes
Token ID7832
Feature activation+2.38e+00

Pos logit contributions
 benches+0.032
 Sue+0.029
 greet+0.029
 enclosure+0.029
 bench+0.029
Neg logit contributions
 valuable-0.041
 important-0.040
 best-0.036
 certainly-0.033
 precious-0.033
Token you
Token ID345
Feature activation+4.21e+00

Pos logit contributions
 important+0.080
 valuable+0.079
 best+0.073
 precious+0.064
 certainly+0.062
Neg logit contributions
 benches-0.063
aster-0.058
 greet-0.058
 enclosure-0.056
 bench-0.055
Token the
Token ID262
Feature activation+5.81e-01

Pos logit contributions
 benches+0.060
 Sue+0.057
 greet+0.056
 Ann+0.055
 bench+0.055
Neg logit contributions
 important-0.087
 valuable-0.087
 best-0.079
 precious-0.071
 certainly-0.071
Token owner
Token ID4870
Feature activation-6.21e+00

Pos logit contributions
 valuable+0.022
 important+0.022
 best+0.020
 certainly+0.019
 precious+0.018
Neg logit contributions
 benches-0.012
 bench-0.011
 greet-0.011
 Sue-0.011
 Ann-0.011
Token smiled
Token ID13541
Feature activation+4.37e+00

Pos logit contributions
 benches+0.188
aster+0.174
 greet+0.172
 restaurants+0.170
 enclosure+0.167
Neg logit contributions
 important-0.184
 valuable-0.181
 best-0.173
 certainly-0.150
 precious-0.145
Token.
Token ID13
Feature activation+2.36e+00

Pos logit contributions
 valuable+0.182
 important+0.178
 virtue+0.176
 certainly+0.160
 awful+0.157
Neg logit contributions
 Sue-0.061
 sit-0.054
 greet-0.052
 stroke-0.052
 towards-0.048
Token "
Token ID366
Feature activation+1.29e+00

Pos logit contributions
 important+0.070
 valuable+0.069
 best+0.062
 certainly+0.054
 precious+0.054
Neg logit contributions
 benches-0.071
 greet-0.066
aster-0.066
 bench-0.063
 enclosure-0.063
TokenLooks
Token ID41102
Feature activation+1.21e+01

Pos logit contributions
 important+0.039
 valuable+0.038
 best+0.036
 problem+0.032
 precious+0.031
Neg logit contributions
 benches-0.034
 greet-0.034
aster-0.034
 Sue-0.032
 bench-0.032
Token like
Token ID588
Feature activation+1.91e+00

Pos logit contributions
 valuable+0.403
 important+0.401
 virtue+0.385
iced+0.370
 best+0.366
Neg logit contributions
 Sue-0.248
 sit-0.230
 stroke-0.220
 benches-0.219
 climb-0.215
Token you
Token ID345
Feature activation+3.83e+00

Pos logit contributions
 important+0.062
 valuable+0.061
 best+0.056
 precious+0.049
 certainly+0.049
Neg logit contributions
 benches-0.053
 greet-0.049
aster-0.048
 enclosure-0.048
 bench-0.047
Token two
Token ID734
Feature activation-9.95e-02

Pos logit contributions
 valuable+0.117
 important+0.117
 best+0.104
 precious+0.093
 certainly+0.090
Neg logit contributions
 benches-0.108
 greet-0.104
 Sue-0.102
 Ann-0.099
aster-0.099
Token are
Token ID389
Feature activation+5.22e+00

Pos logit contributions
 benches+0.003
 greet+0.003
 Sue+0.003
 bench+0.003
 Ann+0.003
Neg logit contributions
 valuable-0.003
 important-0.003
 best-0.003
 certainly-0.002
 precious-0.002
Token friends
Token ID2460
Feature activation-1.37e+00

Pos logit contributions
 valuable+0.163
 important+0.160
 best+0.141
 precious+0.129
 virtue+0.128
Neg logit contributions
 benches-0.141
 Sue-0.137
 greet-0.135
 Ann-0.132
 enclosure-0.129
Token smiled
Token ID13541
Feature activation+4.40e+00

Pos logit contributions
 important+0.069
 valuable+0.069
 best+0.061
 impossible+0.058
 virtue+0.056
Neg logit contributions
 benches-0.051
 greet-0.050
 Ann-0.050
 bench-0.050
 Sue-0.050
Token and
Token ID290
Feature activation-2.36e+00

Pos logit contributions
 valuable+0.180
 important+0.178
 virtue+0.167
 awful+0.159
 best+0.153
Neg logit contributions
 Sue-0.064
 greet-0.062
 sit-0.059
 Mary-0.059
 Ann-0.058
Token said
Token ID531
Feature activation+3.24e-01

Pos logit contributions
 benches+0.057
 greet+0.053
 Sue+0.053
 Ann+0.052
 bench+0.051
Neg logit contributions
 important-0.083
 valuable-0.083
 best-0.075
 precious-0.068
 virtue-0.067
Token,
Token ID11
Feature activation-2.57e+00

Pos logit contributions
 important+0.011
 valuable+0.011
 best+0.010
 precious+0.009
 certainly+0.009
Neg logit contributions
 benches-0.008
 greet-0.007
 Sue-0.007
 Ann-0.007
aster-0.007
Token "
Token ID366
Feature activation+1.76e+00

Pos logit contributions
 benches+0.083
 greet+0.077
aster+0.076
 enclosure+0.075
 Sue+0.074
Neg logit contributions
 important-0.071
 valuable-0.071
 best-0.063
 certainly-0.054
 precious-0.053
TokenSounds
Token ID40825
Feature activation+1.19e+01

Pos logit contributions
 important+0.052
 valuable+0.052
 best+0.046
 precious+0.041
 awful+0.039
Neg logit contributions
 benches-0.052
 greet-0.050
aster-0.049
 Sue-0.048
 Ann-0.048
Token like
Token ID588
Feature activation+2.86e+00

Pos logit contributions
 valuable+0.312
 important+0.304
arre+0.300
der+0.286
 best+0.280
Neg logit contributions
 Sue-0.314
 sit-0.308
 greet-0.304
 Ann-0.289
 climb-0.280
Token you
Token ID345
Feature activation+4.84e+00

Pos logit contributions
 important+0.092
 valuable+0.092
 best+0.083
 precious+0.073
 certainly+0.073
Neg logit contributions
 benches-0.078
 greet-0.073
aster-0.071
 Sue-0.070
 enclosure-0.070
Token had
Token ID550
Feature activation+5.23e+00

Pos logit contributions
 valuable+0.158
 important+0.157
 best+0.137
 precious+0.126
 virtue+0.123
Neg logit contributions
 greet-0.124
 benches-0.124
 Sue-0.117
 Ann-0.117
 enter-0.116
Token the
Token ID262
Feature activation+1.85e+00

Pos logit contributions
 important+0.143
 valuable+0.142
 best+0.126
 precious+0.109
 certainly+0.108
Neg logit contributions
 benches-0.168
 greet-0.158
 Sue-0.155
aster-0.154
 Ann-0.153
Token best
Token ID1266
Feature activation+5.85e+00

Pos logit contributions
 valuable+0.034
 important+0.033
 best+0.024
 certainly+0.022
 awful+0.021
Neg logit contributions
 benches-0.074
 greet-0.072
 bench-0.070
 enclosure-0.070
 Sue-0.069
Token table
Token ID3084
Feature activation-2.74e+00

Pos logit contributions
 benches+0.036
aster+0.035
 greet+0.035
 Sue+0.034
 Ann+0.033
Neg logit contributions
 important-0.051
 valuable-0.051
 best-0.047
 precious-0.042
 certainly-0.040
Token.
Token ID13
Feature activation+1.98e+00

Pos logit contributions
 benches+0.081
 greet+0.077
aster+0.075
 Sue+0.074
 enclosure+0.073
Neg logit contributions
 important-0.080
 valuable-0.079
 best-0.072
 precious-0.062
 certainly-0.061
Token
Token ID198
Feature activation-4.51e-01

Pos logit contributions
 important+0.057
 valuable+0.056
 best+0.050
 precious+0.044
 certainly+0.043
Neg logit contributions
 benches-0.061
 greet-0.057
aster-0.056
 Sue-0.056
 Ann-0.056
Token
Token ID198
Feature activation-3.28e-01

Pos logit contributions
 benches+0.015
 greet+0.014
 enclosure+0.014
 stroke+0.014
aster+0.013
Neg logit contributions
 valuable-0.012
 important-0.012
 best-0.010
 precious-0.009
 certainly-0.009
Token"
Token ID1
Feature activation-1.63e-01

Pos logit contributions
 benches+0.011
 greet+0.010
aster+0.010
 enclosure+0.010
 Sue+0.010
Neg logit contributions
 important-0.009
 valuable-0.009
 best-0.008
 precious-0.007
 certainly-0.007
TokenLooks
Token ID41102
Feature activation+1.19e+01

Pos logit contributions
 benches+0.005
 enclosure+0.004
 greet+0.004
aster+0.004
 Sue+0.004
Neg logit contributions
 important-0.005
 valuable-0.005
 best-0.005
 precious-0.004
 certainly-0.004
Token like
Token ID588
Feature activation+1.55e+00

Pos logit contributions
 valuable+0.380
 important+0.377
 virtue+0.352
 certainly+0.337
 best+0.334
Neg logit contributions
 Sue-0.263
 benches-0.260
 sit-0.250
 climb-0.240
 greet-0.240
Token mom
Token ID1995
Feature activation-4.12e+00

Pos logit contributions
 important+0.055
 valuable+0.053
 best+0.050
 precious+0.044
 certainly+0.042
Neg logit contributions
 benches-0.039
 greet-0.036
 enclosure-0.036
aster-0.035
 bench-0.035
Token has
Token ID468
Feature activation+3.68e+00

Pos logit contributions
 benches+0.112
 greet+0.104
aster+0.100
 bench+0.098
 enclosure+0.098
Neg logit contributions
 important-0.136
 valuable-0.132
 best-0.124
 precious-0.108
 certainly-0.107
Token some
Token ID617
Feature activation+1.06e+00

Pos logit contributions
 important+0.113
 valuable+0.112
 best+0.101
 precious+0.089
 certainly+0.089
Neg logit contributions
 benches-0.106
 greet-0.098
aster-0.096
 Sue-0.096
 bench-0.095
Token mail
Token ID6920
Feature activation-1.47e+00

Pos logit contributions
 valuable+0.037
 important+0.037
 best+0.034
 certainly+0.031
 precious+0.030
Neg logit contributions
 benches-0.026
 greet-0.024
aster-0.023
 Sue-0.023
 bench-0.023
Token's
Token ID338
Feature activation+2.03e+00

Pos logit contributions
 benches+0.078
 greet+0.073
 Sue+0.073
 Ann+0.073
 bench+0.071
Neg logit contributions
 important-0.097
 valuable-0.097
 best-0.087
 precious-0.079
 certainly-0.077
Token mom
Token ID1995
Feature activation-5.77e+00

Pos logit contributions
 valuable+0.080
 important+0.079
 best+0.071
 precious+0.066
 impossible+0.066
Neg logit contributions
 benches-0.039
 Ann-0.037
 Sue-0.037
 bench-0.037
 greet-0.036
Tokenmy
Token ID1820
Feature activation-1.51e+00

Pos logit contributions
 benches+0.136
 greet+0.118
aster+0.118
 bench+0.114
 enclosure+0.114
Neg logit contributions
 important-0.212
 valuable-0.209
 best-0.196
 certainly-0.175
 precious-0.173
Token.
Token ID13
Feature activation+1.97e+00

Pos logit contributions
 benches+0.044
 greet+0.041
aster+0.040
 Sue+0.039
 bench+0.039
Neg logit contributions
 important-0.045
 valuable-0.045
 best-0.041
 precious-0.036
 certainly-0.036
Token "
Token ID366
Feature activation+9.67e-01

Pos logit contributions
 important+0.056
 valuable+0.056
 best+0.050
 certainly+0.043
 precious+0.043
Neg logit contributions
 benches-0.061
 greet-0.057
aster-0.057
 enclosure-0.055
 bench-0.054
TokenLooks
Token ID41102
Feature activation+1.16e+01

Pos logit contributions
 important+0.026
 valuable+0.026
 best+0.023
 precious+0.019
 certainly+0.019
Neg logit contributions
 benches-0.032
 greet-0.029
aster-0.029
 Sue-0.029
 bench-0.029
Token like
Token ID588
Feature activation+1.60e+00

Pos logit contributions
 valuable+0.343
 important+0.339
 virtue+0.321
iced+0.312
 best+0.304
Neg logit contributions
 Sue-0.264
 benches-0.252
 sit-0.252
 climb-0.245
 stroke-0.241
Token the
Token ID262
Feature activation+4.92e-01

Pos logit contributions
 important+0.055
 valuable+0.054
 best+0.049
 precious+0.044
 certainly+0.043
Neg logit contributions
 benches-0.042
 greet-0.038
aster-0.037
 enclosure-0.037
 Sue-0.037
Token wave
Token ID6769
Feature activation+1.94e+00

Pos logit contributions
 valuable+0.018
 important+0.018
 best+0.016
 virtue+0.015
 certainly+0.015
Neg logit contributions
 benches-0.011
 bench-0.010
 greet-0.010
 Sue-0.009
aster-0.009
Token destroyed
Token ID6572
Feature activation+5.78e-02

Pos logit contributions
 valuable+0.062
 important+0.062
 best+0.055
 precious+0.050
 certainly+0.049
Neg logit contributions
 benches-0.052
 greet-0.048
 Sue-0.048
aster-0.047
 bench-0.047
Token all
Token ID477
Feature activation+6.96e-02

Pos logit contributions
 important+0.002
 valuable+0.002
 best+0.002
 precious+0.002
 certainly+0.002
Neg logit contributions
 benches-0.001
 greet-0.001
aster-0.001
 enclosure-0.001
 Sue-0.001
Token,
Token ID11
Feature activation-3.44e+00

Pos logit contributions
 important+0.178
 valuable+0.177
 best+0.161
 precious+0.142
 certainly+0.141
Neg logit contributions
 benches-0.144
 greet-0.134
aster-0.131
 Sue-0.129
 enclosure-0.128
Token but
Token ID475
Feature activation-5.13e-01

Pos logit contributions
 benches+0.085
 greet+0.079
aster+0.078
 enclosure+0.076
 Sue+0.076
Neg logit contributions
 important-0.119
 valuable-0.118
 best-0.110
 certainly-0.095
 impossible-0.095
Token he
Token ID339
Feature activation+3.57e+00

Pos logit contributions
 benches+0.013
 greet+0.012
aster+0.012
 Sue+0.012
 bench+0.011
Neg logit contributions
 important-0.018
 valuable-0.018
 best-0.016
 precious-0.014
 certainly-0.014
Token didn
Token ID1422
Feature activation+7.26e+00

Pos logit contributions
 valuable+0.100
 important+0.095
 best+0.083
 precious+0.081
 certainly+0.075
Neg logit contributions
 Sue-0.101
 benches-0.101
 Ann-0.101
 climb-0.100
aster-0.100
Token't
Token ID470
Feature activation+3.85e+00

Pos logit contributions
 valuable+0.294
 important+0.291
 best+0.268
 precious+0.265
 virtue+0.246
Neg logit contributions
aster-0.125
 benches-0.120
 climb-0.120
 Ann-0.115
 Sue-0.111
Token seem
Token ID1283
Feature activation+1.11e+01

Pos logit contributions
 valuable+0.144
 important+0.144
 best+0.134
 precious+0.123
 certainly+0.121
Neg logit contributions
 benches-0.079
 greet-0.076
aster-0.074
 Sue-0.073
 climb-0.071
Token to
Token ID284
Feature activation-5.33e+00

Pos logit contributions
 important+0.363
 valuable+0.363
 best+0.336
 certainly+0.321
 virtue+0.305
Neg logit contributions
 Sue-0.247
 benches-0.246
 greet-0.237
 bench-0.236
 climb-0.234
Token win
Token ID1592
Feature activation+2.13e+00

Pos logit contributions
 benches+0.116
 greet+0.107
aster+0.104
 Sue+0.102
 bench+0.101
Neg logit contributions
 valuable-0.199
 important-0.199
 best-0.182
 precious-0.166
 certainly-0.164
Token very
Token ID845
Feature activation+7.17e+00

Pos logit contributions
 valuable+0.076
 important+0.075
 best+0.068
 certainly+0.063
 precious+0.063
Neg logit contributions
 benches-0.050
 Sue-0.045
 greet-0.044
 Ann-0.044
aster-0.044
Token often
Token ID1690
Feature activation+3.26e+00

Pos logit contributions
 valuable+0.166
 important+0.163
 best+0.148
 certainly+0.142
 precious+0.129
Neg logit contributions
 benches-0.237
aster-0.231
 climb-0.227
 Sue-0.225
 bench-0.222
Token.
Token ID13
Feature activation+2.13e+00

Pos logit contributions
 important+0.096
 valuable+0.095
 best+0.086
 precious+0.073
 certainly+0.072
Neg logit contributions
 benches-0.098
 greet-0.092
aster-0.091
 enclosure-0.089
 Sue-0.089
Token,
Token ID11
Feature activation-4.02e+00

Pos logit contributions
 important+0.073
 valuable+0.073
 best+0.064
 precious+0.060
 virtue+0.060
Neg logit contributions
 benches-0.047
 greet-0.042
 Sue-0.042
 bench-0.041
 Ann-0.040
Token but
Token ID475
Feature activation-6.16e-01

Pos logit contributions
 benches+0.117
aster+0.117
 greet+0.109
 restaurants+0.108
 enclosure+0.108
Neg logit contributions
 important-0.118
 valuable-0.116
 best-0.112
 precious-0.091
 certainly-0.091
Token it
Token ID340
Feature activation+5.01e+00

Pos logit contributions
 benches+0.016
 greet+0.015
 enclosure+0.014
aster+0.014
 bench+0.013
Neg logit contributions
 important-0.022
 valuable-0.021
 best-0.020
 precious-0.017
 certainly-0.017
Token didn
Token ID1422
Feature activation+7.45e+00

Pos logit contributions
 valuable+0.152
 important+0.147
 best+0.129
 precious+0.119
 virtue+0.118
Neg logit contributions
 benches-0.140
 Ann-0.133
 climb-0.132
 Sue-0.131
 bench-0.130
Token't
Token ID470
Feature activation+3.89e+00

Pos logit contributions
 valuable+0.294
 important+0.293
 best+0.272
 precious+0.258
 virtue+0.246
Neg logit contributions
aster-0.138
 benches-0.137
 Sue-0.133
 climb-0.132
 Ann-0.131
Token seem
Token ID1283
Feature activation+1.11e+01

Pos logit contributions
 important+0.140
 valuable+0.139
 best+0.128
 precious+0.114
 certainly+0.114
Neg logit contributions
 benches-0.093
 greet-0.083
 enclosure-0.083
aster-0.082
 bench-0.080
Token to
Token ID284
Feature activation-5.41e+00

Pos logit contributions
 important+0.377
 valuable+0.375
 best+0.346
 certainly+0.321
 virtue+0.312
Neg logit contributions
 benches-0.256
 Sue-0.248
 greet-0.241
 bench-0.234
 Ann-0.234
Token work
Token ID670
Feature activation+3.39e+00

Pos logit contributions
 benches+0.137
 enclosure+0.121
 greet+0.120
aster+0.119
 Ann+0.118
Neg logit contributions
 important-0.190
 valuable-0.187
 best-0.174
 precious-0.153
 certainly-0.152
Token.
Token ID13
Feature activation+2.26e+00

Pos logit contributions
 valuable+0.111
 important+0.109
 best+0.099
 certainly+0.092
 precious+0.090
Neg logit contributions
 benches-0.088
 greet-0.080
 Sue-0.079
 Ann-0.078
aster-0.078
Token Lily
Token ID20037
Feature activation-1.35e+00

Pos logit contributions
 important+0.073
 valuable+0.073
 best+0.066
 certainly+0.059
 precious+0.058
Neg logit contributions
 benches-0.062
 greet-0.058
aster-0.057
 enclosure-0.055
 bench-0.055
Token laughed
Token ID13818
Feature activation+3.58e+00

Pos logit contributions
 benches+0.040
 greet+0.037
 bench+0.036
 Ann+0.036
 Sue+0.036
Neg logit contributions
 valuable-0.040
 important-0.040
 best-0.036
 precious-0.032
 certainly-0.031
Token,
Token ID11
Feature activation-3.85e+00

Pos logit contributions
 valuable+0.160
 important+0.159
 best+0.140
 precious+0.124
 certainly+0.119
Neg logit contributions
 benches-0.156
 greet-0.148
aster-0.143
 Ann-0.141
 enclosure-0.141
Token but
Token ID475
Feature activation-7.50e-01

Pos logit contributions
 benches+0.100
 greet+0.093
aster+0.093
 enclosure+0.090
 Ann+0.090
Neg logit contributions
 important-0.129
 valuable-0.127
 best-0.119
 certainly-0.102
 precious-0.101
Token he
Token ID339
Feature activation+3.17e+00

Pos logit contributions
 benches+0.019
 greet+0.017
aster+0.017
 Sue+0.017
 bench+0.017
Neg logit contributions
 important-0.026
 valuable-0.026
 best-0.023
 precious-0.021
 certainly-0.021
Token didn
Token ID1422
Feature activation+7.28e+00

Pos logit contributions
 valuable+0.085
 important+0.082
 best+0.072
 precious+0.064
 certainly+0.062
Neg logit contributions
 benches-0.101
 Sue-0.097
aster-0.097
 bench-0.096
 Ann-0.095
Token't
Token ID470
Feature activation+3.84e+00

Pos logit contributions
 valuable+0.291
 important+0.290
 best+0.270
 precious+0.259
 virtue+0.244
Neg logit contributions
aster-0.127
 benches-0.125
 climb-0.123
 Sue-0.121
 Ann-0.117
Token seem
Token ID1283
Feature activation+1.11e+01

Pos logit contributions
 important+0.138
 valuable+0.137
 best+0.125
 precious+0.111
 certainly+0.109
Neg logit contributions
 benches-0.096
 enclosure-0.086
 greet-0.081
aster-0.081
 Ann-0.080
Token to
Token ID284
Feature activation-5.62e+00

Pos logit contributions
 important+0.402
 valuable+0.399
 best+0.374
 certainly+0.348
 precious+0.334
Neg logit contributions
 benches-0.229
 Sue-0.220
 greet-0.216
 bench-0.211
 Ann-0.203
Token be
Token ID307
Feature activation+4.85e+00

Pos logit contributions
 benches+0.121
 enclosure+0.103
 Ann+0.101
 greet+0.101
aster+0.099
Neg logit contributions
 important-0.220
 valuable-0.216
 best-0.204
 precious-0.180
 certainly-0.179
Token getting
Token ID1972
Feature activation+3.52e+00

Pos logit contributions
 important+0.128
 valuable+0.126
 best+0.113
 precious+0.096
 certainly+0.094
Neg logit contributions
 benches-0.164
 greet-0.154
aster-0.151
 enclosure-0.149
 bench-0.149
Token bigger
Token ID5749
Feature activation+2.70e+00

Pos logit contributions
 valuable+0.094
 important+0.094
 best+0.083
 certainly+0.074
 precious+0.071
Neg logit contributions
 benches-0.112
 greet-0.106
 Sue-0.105
aster-0.104
 Ann-0.101
Token or
Token ID393
Feature activation-1.93e+00

Pos logit contributions
 valuable+0.089
 important+0.089
 best+0.079
 certainly+0.075
 precious+0.072
Neg logit contributions
 benches-0.072
 Sue-0.063
 greet-0.063
 enclosure-0.062
 bench-0.062
Token,
Token ID11
Feature activation-3.15e+00

Pos logit contributions
 important+0.125
 valuable+0.124
 best+0.112
 precious+0.099
 certainly+0.098
Neg logit contributions
 benches-0.108
 greet-0.100
aster-0.098
 Sue-0.097
 enclosure-0.096
Token but
Token ID475
Feature activation-6.15e-01

Pos logit contributions
 benches+0.088
 greet+0.082
aster+0.080
 Sue+0.079
 bench+0.079
Neg logit contributions
 important-0.099
 valuable-0.099
 best-0.089
 precious-0.079
 certainly-0.078
Token it
Token ID340
Feature activation+4.87e+00

Pos logit contributions
 benches+0.013
 greet+0.012
aster+0.012
 Sue+0.012
 enclosure+0.012
Neg logit contributions
 important-0.023
 valuable-0.023
 best-0.021
 precious-0.019
 certainly-0.019
Token didn
Token ID1422
Feature activation+7.38e+00

Pos logit contributions
 valuable+0.141
 important+0.133
 best+0.119
 virtue+0.110
 precious+0.108
Neg logit contributions
 Sue-0.131
 climb-0.130
 benches-0.128
 Ann-0.128
 stroke-0.125
Token't
Token ID470
Feature activation+3.95e+00

Pos logit contributions
 valuable+0.303
 important+0.301
 best+0.282
 precious+0.270
 virtue+0.259
Neg logit contributions
aster-0.119
 climb-0.115
 benches-0.110
 enter-0.108
 Sue-0.107
Token seem
Token ID1283
Feature activation+1.10e+01

Pos logit contributions
 valuable+0.149
 important+0.148
 best+0.139
 virtue+0.126
 precious+0.126
Neg logit contributions
 greet-0.076
 benches-0.074
 climb-0.073
 Sue-0.072
 sit-0.071
Token to
Token ID284
Feature activation-5.38e+00

Pos logit contributions
 important+0.283
 valuable+0.282
 best+0.263
 certainly+0.249
 virtue+0.244
Neg logit contributions
 Sue-0.317
 greet-0.307
 benches-0.306
 climb-0.297
 bench-0.296
Token work
Token ID670
Feature activation+3.29e+00

Pos logit contributions
 benches+0.126
 greet+0.115
aster+0.112
 Sue+0.111
 bench+0.109
Neg logit contributions
 important-0.194
 valuable-0.194
 best-0.177
 precious-0.159
 certainly-0.158
Token.
Token ID13
Feature activation+1.94e+00

Pos logit contributions
 valuable+0.109
 important+0.107
 best+0.099
 certainly+0.092
 precious+0.090
Neg logit contributions
 benches-0.080
 greet-0.073
 Sue-0.073
 Ann-0.072
aster-0.071
Token The
Token ID383
Feature activation-1.37e-01

Pos logit contributions
 important+0.061
 valuable+0.060
 best+0.055
 certainly+0.048
 precious+0.047
Neg logit contributions
 benches-0.054
 greet-0.052
aster-0.052
 enclosure-0.049
 bench-0.048
Token star
Token ID3491
Feature activation+1.70e-01

Pos logit contributions
 benches+0.004
 greet+0.004
 bench+0.004
aster+0.004
 Ann+0.004
Neg logit contributions
 valuable-0.004
 important-0.004
 best-0.003
 certainly-0.003
 virtue-0.003
Token The
Token ID383
Feature activation+2.06e-01

Pos logit contributions
 valuable+0.072
 important+0.072
 best+0.063
 certainly+0.057
 precious+0.055
Neg logit contributions
 benches-0.071
 greet-0.065
aster-0.065
 enclosure-0.063
 restaurants-0.061
Token sw
Token ID1509
Feature activation+1.47e+00

Pos logit contributions
 valuable+0.007
 important+0.007
 best+0.006
 virtue+0.006
 certainly+0.006
Neg logit contributions
 benches-0.005
 greet-0.004
 bench-0.004
 enclosure-0.004
 Sue-0.004
Tokenan
Token ID272
Feature activation-3.24e+00

Pos logit contributions
 valuable+0.048
 important+0.047
 best+0.042
 precious+0.038
 certainly+0.038
Neg logit contributions
 benches-0.039
 enclosure-0.037
 greet-0.036
 Sue-0.034
 bench-0.034
Token didn
Token ID1422
Feature activation+7.40e+00

Pos logit contributions
 benches+0.092
 greet+0.086
 Sue+0.085
 enclosure+0.084
 Ann+0.084
Neg logit contributions
 important-0.099
 valuable-0.098
 best-0.088
 precious-0.078
 virtue-0.076
Token't
Token ID470
Feature activation+3.80e+00

Pos logit contributions
 important+0.290
 valuable+0.289
 best+0.266
 precious+0.250
 virtue+0.240
Neg logit contributions
 benches-0.138
aster-0.135
 climb-0.134
 greet-0.130
 Sue-0.130
Token seem
Token ID1283
Feature activation+1.10e+01

Pos logit contributions
 important+0.138
 valuable+0.138
 best+0.127
 precious+0.114
 certainly+0.113
Neg logit contributions
 benches-0.088
 greet-0.081
aster-0.079
 Sue-0.078
 bench-0.077
Token to
Token ID284
Feature activation-5.26e+00

Pos logit contributions
 important+0.439
 valuable+0.433
 best+0.411
 certainly+0.383
 virtue+0.376
Neg logit contributions
 benches-0.184
 Sue-0.182
 greet-0.175
 Ann-0.164
 bench-0.162
Token mind
Token ID2000
Feature activation+1.08e+00

Pos logit contributions
 benches+0.122
 greet+0.104
 enclosure+0.104
 bench+0.103
aster+0.103
Neg logit contributions
 important-0.197
 valuable-0.194
 best-0.181
 certainly-0.162
 precious-0.161
Token and
Token ID290
Feature activation-2.83e+00

Pos logit contributions
 important+0.037
 valuable+0.037
 best+0.034
 precious+0.030
 certainly+0.029
Neg logit contributions
 benches-0.027
 greet-0.026
aster-0.025
 enclosure-0.024
 bench-0.024
Token just
Token ID655
Feature activation+3.39e+00

Pos logit contributions
 benches+0.080
 greet+0.073
aster+0.073
 enclosure+0.072
 restaurants+0.069
Neg logit contributions
 important-0.090
 valuable-0.090
 best-0.081
 precious-0.071
 certainly-0.071
Token kept
Token ID4030
Feature activation-7.64e-01

Pos logit contributions
 important+0.105
 valuable+0.103
 best+0.093
 precious+0.083
 certainly+0.082
Neg logit contributions
 benches-0.099
 greet-0.090
aster-0.088
 enclosure-0.088
 bench-0.087
Token her
Token ID607
Feature activation+1.25e+00

Pos logit contributions
aster+0.132
 benches+0.131
 greet+0.123
 enclosure+0.121
 slam+0.116
Neg logit contributions
 important-0.271
 best-0.254
 valuable-0.254
 certainly-0.221
 precious-0.221
Token.
Token ID13
Feature activation+2.01e+00

Pos logit contributions
 important+0.037
 valuable+0.037
 best+0.034
 precious+0.029
 certainly+0.029
Neg logit contributions
 benches-0.037
aster-0.035
 greet-0.035
 Sue-0.033
 Ann-0.033
Token It
Token ID632
Feature activation+4.41e+00

Pos logit contributions
 important+0.087
 valuable+0.087
 best+0.077
 certainly+0.073
 precious+0.072
Neg logit contributions
 benches-0.033
aster-0.031
 greet-0.030
 stroke-0.027
 enclosure-0.027
Token didn
Token ID1422
Feature activation+7.42e+00

Pos logit contributions
 valuable+0.132
 important+0.130
 best+0.119
 virtue+0.110
 precious+0.103
Neg logit contributions
 benches-0.124
 Sue-0.121
 greet-0.118
 Ann-0.118
 climb-0.113
Token't
Token ID470
Feature activation+3.91e+00

Pos logit contributions
 valuable+0.276
 important+0.275
 best+0.255
 precious+0.235
 virtue+0.226
Neg logit contributions
 benches-0.158
aster-0.152
 Sue-0.147
 Ann-0.146
 greet-0.146
Token seem
Token ID1283
Feature activation+1.10e+01

Pos logit contributions
 important+0.132
 valuable+0.130
 best+0.116
 precious+0.105
 certainly+0.102
Neg logit contributions
 benches-0.106
 enclosure-0.098
aster-0.093
 bench-0.092
 greet-0.091
Token like
Token ID588
Feature activation+2.41e+00

Pos logit contributions
 important+0.343
 valuable+0.341
 best+0.316
 certainly+0.286
 virtue+0.279
Neg logit contributions
 benches-0.290
 Sue-0.281
 greet-0.274
 Ann-0.267
 bench-0.266
Token it
Token ID340
Feature activation+5.84e+00

Pos logit contributions
 important+0.090
 valuable+0.088
 best+0.082
 precious+0.074
 impossible+0.072
Neg logit contributions
 benches-0.054
aster-0.051
 enclosure-0.051
 greet-0.051
 bench-0.047
Token wanted
Token ID2227
Feature activation+4.39e+00

Pos logit contributions
 valuable+0.162
 important+0.161
 best+0.143
 precious+0.125
 certainly+0.123
Neg logit contributions
 benches-0.184
 greet-0.172
 Sue-0.171
 Ann-0.168
 bench-0.168
Token to
Token ID284
Feature activation-6.33e+00

Pos logit contributions
 important+0.152
 valuable+0.146
 best+0.139
 precious+0.118
 impossible+0.115
Neg logit contributions
aster-0.113
 benches-0.111
 enclosure-0.108
 greet-0.105
 restaurants-0.102
Token hurt
Token ID5938
Feature activation+1.57e+00

Pos logit contributions
 benches+0.171
 enclosure+0.149
aster+0.148
 restaurants+0.145
 Ann+0.144
Neg logit contributions
 important-0.202
 valuable-0.190
 best-0.180
 really-0.160
 impossible-0.159
Token wind
Token ID2344
Feature activation+8.12e-01

Pos logit contributions
 benches+0.030
 greet+0.028
aster+0.027
 Sue+0.027
 bench+0.027
Neg logit contributions
 important-0.029
 valuable-0.029
 best-0.026
 precious-0.022
 certainly-0.022
Token.
Token ID13
Feature activation+1.86e+00

Pos logit contributions
 valuable+0.027
 important+0.027
 best+0.024
 precious+0.022
 certainly+0.022
Neg logit contributions
 benches-0.021
 greet-0.020
 Sue-0.019
 bench-0.019
aster-0.019
Token Billy
Token ID15890
Feature activation-4.28e+00

Pos logit contributions
 valuable+0.079
 important+0.079
 best+0.074
 precious+0.069
 certainly+0.068
Neg logit contributions
 benches-0.031
 Sue-0.028
 greet-0.026
 Ann-0.025
 enclosure-0.025
Token didn
Token ID1422
Feature activation+7.27e+00

Pos logit contributions
 benches+0.098
 Sue+0.091
 greet+0.090
 bench+0.090
 Ann+0.089
Neg logit contributions
 valuable-0.152
 important-0.151
 best-0.139
 virtue-0.127
 precious-0.125
Token't
Token ID470
Feature activation+3.71e+00

Pos logit contributions
 valuable+0.279
 important+0.279
 best+0.261
 precious+0.243
 virtue+0.229
Neg logit contributions
aster-0.141
 benches-0.140
 climb-0.132
 greet-0.132
 Sue-0.131
Token seem
Token ID1283
Feature activation+1.10e+01

Pos logit contributions
 important+0.120
 valuable+0.119
 best+0.107
 precious+0.094
 certainly+0.094
Neg logit contributions
 benches-0.104
 enclosure-0.094
 greet-0.093
aster-0.093
 bench-0.091
Token to
Token ID284
Feature activation-5.40e+00

Pos logit contributions
 important+0.330
 valuable+0.325
 best+0.305
 certainly+0.284
 virtue+0.280
Neg logit contributions
 Sue-0.284
 benches-0.276
 greet-0.275
 bench-0.266
 Ann-0.264
Token notice
Token ID4003
Feature activation-2.08e+00

Pos logit contributions
 benches+0.133
 enclosure+0.119
aster+0.114
 Ann+0.112
 bench+0.112
Neg logit contributions
 important-0.195
 valuable-0.189
 best-0.179
 certainly-0.156
 precious-0.156
Token anything
Token ID1997
Feature activation+6.06e+00

Pos logit contributions
 benches+0.051
 greet+0.049
aster+0.048
 enclosure+0.047
 bench+0.047
Neg logit contributions
 important-0.072
 valuable-0.071
 best-0.065
 precious-0.058
 awful-0.056
Token out
Token ID503
Feature activation-2.25e+00

Pos logit contributions
 important+0.187
 valuable+0.187
 best+0.174
 virtue+0.161
 certainly+0.157
Neg logit contributions
 benches-0.161
 Sue-0.148
 greet-0.147
 bench-0.146
 Ann-0.143
Token of
Token ID286
Feature activation-7.32e+00

Pos logit contributions
 benches+0.061
 greet+0.056
aster+0.055
 bench+0.054
 Sue+0.054
Neg logit contributions
 important-0.073
 valuable-0.073
 best-0.066
 precious-0.059
 certainly-0.058
Token new
Token ID649
Feature activation+3.99e+00

Pos logit contributions
 important+0.086
 valuable+0.085
 best+0.076
 certainly+0.072
der+0.072
Neg logit contributions
 bench-0.074
 benches-0.071
aster-0.071
 climb-0.071
 greet-0.071
Token toy
Token ID13373
Feature activation+1.45e+00

Pos logit contributions
 valuable+0.147
 important+0.145
 best+0.129
 certainly+0.123
 precious+0.121
Neg logit contributions
 benches-0.086
 bench-0.083
 Sue-0.081
aster-0.081
 greet-0.080
Token she
Token ID673
Feature activation+3.02e+00

Pos logit contributions
 valuable+0.051
 important+0.050
 best+0.044
 precious+0.041
 certainly+0.041
Neg logit contributions
 benches-0.036
 bench-0.033
aster-0.032
 Sue-0.032
 greet-0.032
Token had
Token ID550
Feature activation+3.42e+00

Pos logit contributions
 valuable+0.090
 important+0.084
 best+0.080
der+0.075
 virtue+0.074
Neg logit contributions
 greet-0.075
 sit-0.073
 climb-0.073
 benches-0.072
aster-0.071
Token been
Token ID587
Feature activation+7.94e+00

Pos logit contributions
 important+0.103
 valuable+0.103
 best+0.093
 precious+0.081
 certainly+0.080
Neg logit contributions
 benches-0.100
 greet-0.093
aster-0.091
 Sue-0.090
 bench-0.089
Token longing
Token ID40314
Feature activation+1.10e+01

Pos logit contributions
 valuable+0.277
 important+0.272
 best+0.245
 certainly+0.224
 precious+0.223
Neg logit contributions
 benches-0.192
 greet-0.178
aster-0.177
 Sue-0.173
 Ann-0.170
Token for
Token ID329
Feature activation-2.33e+00

Pos logit contributions
 important+0.348
 valuable+0.347
 best+0.314
 precious+0.277
 certainly+0.276
Neg logit contributions
 benches-0.305
 greet-0.283
aster-0.277
 Sue-0.274
 bench-0.272
Token -
Token ID532
Feature activation-4.82e-01

Pos logit contributions
 benches+0.062
 greet+0.059
 enclosure+0.059
aster+0.057
 bench+0.056
Neg logit contributions
 important-0.077
 valuable-0.075
 best-0.069
 precious-0.061
 certainly-0.058
Token a
Token ID257
Feature activation+3.37e+00

Pos logit contributions
 benches+0.013
 greet+0.012
aster+0.012
 Sue+0.012
 bench+0.011
Neg logit contributions
 important-0.016
 valuable-0.016
 best-0.014
 precious-0.013
 certainly-0.013
Token mixing
Token ID17090
Feature activation+1.12e+00

Pos logit contributions
 important+0.092
 valuable+0.091
 best+0.081
 certainly+0.076
der+0.074
Neg logit contributions
 bench-0.099
 benches-0.096
 greet-0.094
 Sue-0.093
aster-0.091
Token machine
Token ID4572
Feature activation+5.30e-01

Pos logit contributions
 valuable+0.037
 important+0.037
 best+0.033
 precious+0.030
 certainly+0.030
Neg logit contributions
 benches-0.029
aster-0.027
 greet-0.027
 Sue-0.026
 bench-0.026
Token,
Token ID11
Feature activation-3.92e+00

Pos logit contributions
 valuable+0.099
 important+0.098
 best+0.087
 precious+0.083
 certainly+0.081
Neg logit contributions
 benches-0.058
 Sue-0.052
aster-0.052
 greet-0.052
 Ann-0.051
Token but
Token ID475
Feature activation-9.12e-01

Pos logit contributions
 benches+0.083
 greet+0.075
aster+0.073
 Sue+0.073
 bench+0.071
Neg logit contributions
 important-0.150
 valuable-0.150
 best-0.137
 precious-0.125
 certainly-0.123
Token it
Token ID340
Feature activation+4.76e+00

Pos logit contributions
 benches+0.021
aster+0.019
 Sue+0.019
 greet+0.019
 Ann+0.019
Neg logit contributions
 valuable-0.033
 important-0.033
 best-0.030
 precious-0.027
 certainly-0.027
Token didn
Token ID1422
Feature activation+7.29e+00

Pos logit contributions
 valuable+0.151
 important+0.145
 best+0.125
 precious+0.122
 virtue+0.120
Neg logit contributions
 climb-0.120
 Sue-0.118
 Ann-0.117
 benches-0.114
 enter-0.110
Token't
Token ID470
Feature activation+3.94e+00

Pos logit contributions
 valuable+0.294
 important+0.292
 best+0.274
 precious+0.262
 virtue+0.250
Neg logit contributions
aster-0.128
 climb-0.123
 benches-0.118
 Ann-0.116
 Sue-0.114
Token seem
Token ID1283
Feature activation+1.10e+01

Pos logit contributions
 valuable+0.151
 important+0.150
 best+0.143
 virtue+0.131
 precious+0.130
Neg logit contributions
 climb-0.073
 greet-0.070
 sit-0.070
aster-0.069
 enter-0.069
Token to
Token ID284
Feature activation-5.49e+00

Pos logit contributions
 important+0.347
 valuable+0.343
 best+0.329
 certainly+0.316
 virtue+0.306
Neg logit contributions
 Sue-0.248
 benches-0.238
 greet-0.236
 climb-0.233
 bench-0.228
Token work
Token ID670
Feature activation+3.23e+00

Pos logit contributions
 benches+0.106
 greet+0.104
aster+0.103
 climb+0.099
 Sue+0.098
Neg logit contributions
 valuable-0.214
 important-0.211
 best-0.195
 precious-0.179
 certainly-0.176
Token.
Token ID13
Feature activation+1.90e+00

Pos logit contributions
 valuable+0.110
 important+0.108
 best+0.101
 certainly+0.096
 precious+0.094
Neg logit contributions
 benches-0.071
 Ann-0.065
 Sue-0.065
aster-0.064
 greet-0.064
Token He
Token ID679
Feature activation+2.37e+00

Pos logit contributions
 important+0.051
 valuable+0.051
 best+0.045
 certainly+0.040
 precious+0.037
Neg logit contributions
 benches-0.061
aster-0.059
 greet-0.059
 stroke-0.055
 bench-0.055
Token got
Token ID1392
Feature activation+1.90e+00

Pos logit contributions
 valuable+0.087
 important+0.085
 best+0.080
 precious+0.073
 virtue+0.071
Neg logit contributions
 benches-0.047
 climb-0.047
 Ann-0.046
aster-0.045
 Sue-0.044
TokenYes
Token ID5297
Feature activation+3.89e+00

Pos logit contributions
 valuable+0.030
 important+0.030
 best+0.027
 certainly+0.024
 precious+0.024
Neg logit contributions
 benches-0.030
 greet-0.027
 enclosure-0.027
 Sue-0.027
aster-0.027
Token,
Token ID11
Feature activation-3.03e+00

Pos logit contributions
 important+0.124
 valuable+0.122
 best+0.113
 certainly+0.099
 precious+0.098
Neg logit contributions
 benches-0.109
 greet-0.101
 enclosure-0.097
aster-0.097
 bench-0.096
Token Molly
Token ID30236
Feature activation-6.25e-02

Pos logit contributions
 benches+0.086
 greet+0.079
 enclosure+0.077
aster+0.077
 bench+0.075
Neg logit contributions
 important-0.097
 valuable-0.096
 best-0.087
 precious-0.077
 certainly-0.076
Token"
Token ID1
Feature activation+5.81e-02

Pos logit contributions
 benches+0.002
 greet+0.001
 Sue+0.001
 Ann+0.001
aster+0.001
Neg logit contributions
 valuable-0.002
 important-0.002
 best-0.002
 precious-0.002
 certainly-0.002
Token it
Token ID340
Feature activation+5.83e+00

Pos logit contributions
 important+0.002
 valuable+0.002
 best+0.002
 precious+0.002
 certainly+0.002
Neg logit contributions
 benches-0.002
 greet-0.001
aster-0.001
 enclosure-0.001
 bench-0.001
Token seemed
Token ID3947
Feature activation+1.10e+01

Pos logit contributions
 valuable+0.190
 important+0.188
 best+0.171
 virtue+0.154
 precious+0.153
Neg logit contributions
 greet-0.144
 Sue-0.141
 benches-0.141
 Ann-0.139
 enclosure-0.134
Token to
Token ID284
Feature activation-5.60e+00

Pos logit contributions
 important+0.383
 valuable+0.382
 best+0.361
 certainly+0.334
 virtue+0.332
Neg logit contributions
 Sue-0.221
 benches-0.221
 greet-0.220
 Ann-0.205
 bench-0.195
Token say
Token ID910
Feature activation+1.73e+00

Pos logit contributions
 benches+0.136
 greet+0.123
aster+0.121
 Sue+0.119
 enclosure+0.119
Neg logit contributions
 important-0.198
 valuable-0.197
 best-0.180
 precious-0.161
 certainly-0.161
Token.
Token ID13
Feature activation+3.11e+00

Pos logit contributions
 important+0.058
 valuable+0.058
 best+0.052
 precious+0.047
 certainly+0.046
Neg logit contributions
 benches-0.045
 greet-0.041
aster-0.041
 Sue-0.040
 enclosure-0.040
Token "
Token ID366
Feature activation+1.53e+00

Pos logit contributions
 important+0.089
 valuable+0.089
 best+0.080
 certainly+0.071
 precious+0.068
Neg logit contributions
 benches-0.096
aster-0.089
 greet-0.087
 bench-0.085
 enclosure-0.085
TokenThere
Token ID1858
Feature activation+3.83e-01

Pos logit contributions
 important+0.044
 valuable+0.044
 best+0.039
 certainly+0.035
 precious+0.034
Neg logit contributions
 benches-0.047
 greet-0.043
aster-0.042
 Sue-0.042
 enclosure-0.042
Token.
Token ID13
Feature activation+1.90e+00

Pos logit contributions
 important+0.046
 valuable+0.046
 best+0.042
 precious+0.037
 certainly+0.036
Neg logit contributions
 benches-0.038
 greet-0.035
aster-0.035
 enclosure-0.034
 bench-0.034
Token Mitt
Token ID16627
Feature activation-3.47e+00

Pos logit contributions
 valuable+0.063
 important+0.063
 best+0.057
 certainly+0.051
 precious+0.050
Neg logit contributions
 benches-0.050
 greet-0.046
aster-0.046
 enclosure-0.044
 bench-0.043
Tokenens
Token ID641
Feature activation-4.20e+00

Pos logit contributions
 benches+0.026
 Sue+0.020
 greet+0.017
aster+0.017
 Ann+0.017
Neg logit contributions
 valuable-0.180
 important-0.179
 best-0.168
 precious-0.158
 certainly-0.156
Token didn
Token ID1422
Feature activation+7.24e+00

Pos logit contributions
 benches+0.118
 greet+0.110
aster+0.108
 Sue+0.106
 enclosure+0.105
Neg logit contributions
 important-0.132
 valuable-0.131
 best-0.118
 precious-0.104
 certainly-0.104
Token't
Token ID470
Feature activation+3.72e+00

Pos logit contributions
 important+0.258
 valuable+0.257
 best+0.236
 precious+0.214
 certainly+0.204
Neg logit contributions
 benches-0.169
aster-0.158
 greet-0.157
 Sue-0.155
 Ann-0.153
Token seem
Token ID1283
Feature activation+1.10e+01

Pos logit contributions
 important+0.110
 valuable+0.110
 best+0.097
 precious+0.085
 certainly+0.083
Neg logit contributions
 benches-0.115
 enclosure-0.106
 greet-0.103
aster-0.103
 bench-0.101
Token to
Token ID284
Feature activation-5.67e+00

Pos logit contributions
 important+0.344
 valuable+0.340
 best+0.319
 certainly+0.294
 virtue+0.288
Neg logit contributions
 Sue-0.279
 benches-0.267
 greet-0.261
 Ann-0.257
 bench-0.254
Token mind
Token ID2000
Feature activation+6.23e-01

Pos logit contributions
 benches+0.153
 enclosure+0.139
aster+0.134
 bench+0.130
 Ann+0.127
Neg logit contributions
 important-0.193
 valuable-0.186
 best-0.177
 precious-0.153
 impossible-0.151
Token and
Token ID290
Feature activation-3.27e+00

Pos logit contributions
 important+0.021
 valuable+0.020
 best+0.019
 precious+0.016
 certainly+0.016
Neg logit contributions
 benches-0.017
 greet-0.016
aster-0.016
 enclosure-0.015
 bench-0.015
Token continued
Token ID3767
Feature activation-1.94e+00

Pos logit contributions
 benches+0.096
 enclosure+0.088
aster+0.087
 greet+0.086
 restaurants+0.084
Neg logit contributions
 important-0.103
 valuable-0.102
 best-0.090
 precious-0.079
 certainly-0.079
Token to
Token ID284
Feature activation-8.29e+00

Pos logit contributions
 benches+0.053
 enclosure+0.049
aster+0.049
 greet+0.047
 restaurants+0.045
Neg logit contributions
 important-0.064
 valuable-0.062
 best-0.058
 precious-0.046
 very-0.046
Token move
Token ID1445
Feature activation+1.75e+00

Pos logit contributions
 important+0.120
 valuable+0.119
 best+0.107
 precious+0.095
 certainly+0.093
Neg logit contributions
 benches-0.102
 greet-0.092
aster-0.091
 enclosure-0.091
 bench-0.090
Token.
Token ID13
Feature activation+2.20e+00

Pos logit contributions
 important+0.059
 valuable+0.059
 best+0.053
 certainly+0.048
 precious+0.048
Neg logit contributions
 benches-0.045
 greet-0.042
 Sue-0.041
 Ann-0.040
aster-0.040
Token It
Token ID632
Feature activation+4.80e+00

Pos logit contributions
 important+0.068
 valuable+0.067
 best+0.059
 precious+0.054
 virtue+0.053
Neg logit contributions
 benches-0.062
 Sue-0.061
 Ann-0.059
 Jen-0.058
 greet-0.058
Token didn
Token ID1422
Feature activation+7.53e+00

Pos logit contributions
 valuable+0.153
 important+0.152
 best+0.137
 virtue+0.121
 precious+0.120
Neg logit contributions
 benches-0.128
 greet-0.122
 Sue-0.120
 bench-0.119
 Ann-0.118
Token't
Token ID470
Feature activation+3.83e+00

Pos logit contributions
 important+0.279
 valuable+0.279
 best+0.255
 precious+0.236
 virtue+0.230
Neg logit contributions
 benches-0.163
aster-0.157
 Ann-0.153
 Sue-0.152
 greet-0.151
Token seem
Token ID1283
Feature activation+1.10e+01

Pos logit contributions
 important+0.126
 valuable+0.125
 best+0.113
 precious+0.101
 certainly+0.099
Neg logit contributions
 benches-0.104
 greet-0.095
aster-0.094
 enclosure-0.093
 bench-0.092
Token to
Token ID284
Feature activation-5.48e+00

Pos logit contributions
 important+0.352
 valuable+0.351
 best+0.323
 certainly+0.297
 virtue+0.289
Neg logit contributions
 benches-0.277
 Sue-0.269
 greet-0.263
 Ann-0.254
 bench-0.252
Token care
Token ID1337
Feature activation+3.23e+00

Pos logit contributions
 benches+0.137
 enclosure+0.118
aster+0.113
 bench+0.111
 restaurants+0.110
Neg logit contributions
 important-0.199
 valuable-0.192
 best-0.182
 precious-0.161
 certainly-0.158
Token about
Token ID546
Feature activation-3.50e+00

Pos logit contributions
 important+0.102
 valuable+0.102
 best+0.092
 precious+0.081
 certainly+0.080
Neg logit contributions
 benches-0.091
 greet-0.085
aster-0.083
 bench-0.081
 Sue-0.081
Token L
Token ID406
Feature activation-1.91e-01

Pos logit contributions
 benches+0.077
 greet+0.074
aster+0.072
 enclosure+0.071
 bench+0.068
Neg logit contributions
 important-0.131
 valuable-0.128
 best-0.121
 precious-0.108
 impossible-0.102
Tokenila
Token ID10102
Feature activation-1.04e+00

Pos logit contributions
 benches+0.004
 greet+0.004
 enclosure+0.004
 bench+0.004
 Sue+0.004
Neg logit contributions
 important-0.007
 valuable-0.007
 best-0.007
 certainly-0.006
 precious-0.006
Token why
Token ID1521
Feature activation+2.47e-01

Pos logit contributions
 important+0.050
 valuable+0.048
 best+0.045
 precious+0.039
 awful+0.039
Neg logit contributions
 benches-0.034
 greet-0.031
 enclosure-0.031
 arrest-0.029
aster-0.029
Token everyone
Token ID2506
Feature activation+5.33e-01

Pos logit contributions
 important+0.009
 valuable+0.009
 best+0.008
 precious+0.008
 certainly+0.008
Neg logit contributions
 benches-0.005
 greet-0.005
aster-0.005
 Sue-0.005
 Ann-0.005
Token else
Token ID2073
Feature activation+2.86e-01

Pos logit contributions
 valuable+0.015
 important+0.015
 best+0.013
 precious+0.012
 virtue+0.011
Neg logit contributions
 benches-0.016
 greet-0.015
 Sue-0.015
 Ann-0.015
 bench-0.015
Token didn
Token ID1422
Feature activation+7.01e+00

Pos logit contributions
 valuable+0.008
 important+0.008
 best+0.007
 precious+0.007
 certainly+0.007
Neg logit contributions
 benches-0.008
 greet-0.008
 Sue-0.008
 Ann-0.008
aster-0.008
Token't
Token ID470
Feature activation+3.73e+00

Pos logit contributions
 valuable+0.270
 important+0.265
 best+0.243
 precious+0.236
 virtue+0.221
Neg logit contributions
 benches-0.141
aster-0.137
 climb-0.132
 greet-0.131
 Ann-0.131
Token seem
Token ID1283
Feature activation+1.10e+01

Pos logit contributions
 important+0.124
 valuable+0.121
 best+0.110
 certainly+0.096
 precious+0.094
Neg logit contributions
 benches-0.104
 enclosure-0.096
aster-0.092
 greet-0.091
 Ann-0.088
Token to
Token ID284
Feature activation-5.37e+00

Pos logit contributions
 valuable+0.378
 important+0.376
 best+0.348
 certainly+0.325
 virtue+0.319
Neg logit contributions
 benches-0.244
 Sue-0.234
 greet-0.233
 bench-0.227
 Ann-0.222
Token mind
Token ID2000
Feature activation+9.44e-01

Pos logit contributions
 benches+0.134
 enclosure+0.121
aster+0.113
 greet+0.112
 Ann+0.110
Neg logit contributions
 important-0.192
 valuable-0.184
 best-0.177
 certainly-0.152
 impossible-0.151
Token.
Token ID13
Feature activation+1.92e+00

Pos logit contributions
 important+0.032
 valuable+0.032
 best+0.029
 precious+0.026
 certainly+0.025
Neg logit contributions
 benches-0.024
 greet-0.023
aster-0.022
 enclosure-0.022
 Sue-0.022
Token He
Token ID679
Feature activation+2.36e+00

Pos logit contributions
 valuable+0.063
 important+0.062
 best+0.057
 precious+0.052
 certainly+0.050
Neg logit contributions
 benches-0.051
 Ann-0.047
 greet-0.047
 Sue-0.046
 enclosure-0.045
Token tried
Token ID3088
Feature activation+3.45e+00

Pos logit contributions
 valuable+0.078
 important+0.075
 best+0.070
 virtue+0.064
 precious+0.063
Neg logit contributions
 benches-0.059
 Ann-0.056
 Sue-0.055
 bench-0.055
 greet-0.054
TokenL
Token ID43
Feature activation-4.08e+00

Pos logit contributions
 benches+0.098
 greet+0.093
aster+0.090
 Sue+0.088
 enclosure+0.088
Neg logit contributions
 important-0.115
 valuable-0.113
 best-0.108
 certainly-0.090
 precious-0.089
Tokenily
Token ID813
Feature activation-4.71e+00

Pos logit contributions
 greet+0.107
 benches+0.106
 gra+0.101
 stroke+0.099
 slam+0.097
Neg logit contributions
 important-0.129
 best-0.125
 valuable-0.120
 certainly-0.105
 precious-0.099
Token and
Token ID290
Feature activation-6.80e+00

Pos logit contributions
 benches+0.147
 greet+0.137
aster+0.134
 Sue+0.133
 Ann+0.132
Neg logit contributions
 important-0.133
 valuable-0.131
 best-0.119
 precious-0.102
 certainly-0.102
Token Mom
Token ID11254
Feature activation-5.37e+00

Pos logit contributions
 benches+0.160
 enclosure+0.146
 greet+0.142
aster+0.134
 bench+0.131
Neg logit contributions
 important-0.249
 valuable-0.243
 best-0.233
 precious-0.203
 certainly-0.202
Token go
Token ID467
Feature activation-4.57e+00

Pos logit contributions
 benches+0.159
 greet+0.139
aster+0.138
 enclosure+0.137
 bench+0.137
Neg logit contributions
 important-0.161
 valuable-0.152
 best-0.149
 certainly-0.131
 precious-0.124
Token to
Token ID284
Feature activation-1.09e+01

Pos logit contributions
 benches+0.130
aster+0.124
 arrest+0.113
 greet+0.113
 Ann+0.112
Neg logit contributions
 important-0.140
 best-0.138
 valuable-0.137
 precious-0.106
 much-0.102
Token the
Token ID262
Feature activation-2.92e+00

Pos logit contributions
 benches+0.286
aro+0.270
talk+0.267
 enclosure+0.255
 slam+0.254
Neg logit contributions
 important-0.330
 valuable-0.311
 best-0.304
 great-0.274
 really-0.269
Token store
Token ID3650
Feature activation-4.71e+00

Pos logit contributions
 benches+0.063
aster+0.059
 greet+0.059
 Sue+0.058
 Ann+0.057
Neg logit contributions
 important-0.108
 valuable-0.106
 best-0.101
 precious-0.090
 truly-0.084
Token.
Token ID13
Feature activation+5.93e-01

Pos logit contributions
 benches+0.153
aster+0.147
 greet+0.147
 stroke+0.141
 gra+0.139
Neg logit contributions
 important-0.127
 valuable-0.123
 best-0.115
 precious-0.093
 certainly-0.087
Token Mom
Token ID11254
Feature activation-4.00e+00

Pos logit contributions
 important+0.018
 valuable+0.018
 best+0.016
 precious+0.014
 certainly+0.014
Neg logit contributions
 benches-0.016
aster-0.016
 greet-0.015
 enclosure-0.015
 arrest-0.015
Token wants
Token ID3382
Feature activation+2.21e+00

Pos logit contributions
 benches+0.148
aster+0.137
 greet+0.136
 Sue+0.132
 enclosure+0.131
Neg logit contributions
 important-0.094
 valuable-0.091
 best-0.082
 certainly-0.068
 precious-0.068
TokenL
Token ID43
Feature activation-4.21e+00

Pos logit contributions
 benches+0.100
 greet+0.096
aster+0.095
 enclosure+0.092
 bench+0.091
Neg logit contributions
 important-0.106
 valuable-0.104
 best-0.101
 certainly-0.084
 precious-0.081
Tokenily
Token ID813
Feature activation-4.99e+00

Pos logit contributions
 greet+0.114
 benches+0.111
 gra+0.108
 stroke+0.107
 slam+0.105
Neg logit contributions
 important-0.131
 best-0.122
 valuable-0.121
 certainly-0.103
 precious-0.097
Token and
Token ID290
Feature activation-7.24e+00

Pos logit contributions
 benches+0.162
 greet+0.153
aster+0.149
 Sue+0.145
 enclosure+0.144
Neg logit contributions
 important-0.135
 valuable-0.130
 best-0.119
 precious-0.101
 certainly-0.100
Token Mom
Token ID11254
Feature activation-5.39e+00

Pos logit contributions
 benches+0.154
 enclosure+0.137
 greet+0.133
 bench+0.124
aster+0.122
Neg logit contributions
 important-0.280
 valuable-0.271
 best-0.265
 certainly-0.232
 precious-0.230
Token went
Token ID1816
Feature activation-4.49e+00

Pos logit contributions
 benches+0.162
 greet+0.147
aster+0.146
 bench+0.143
 enclosure+0.141
Neg logit contributions
 important-0.158
 valuable-0.150
 best-0.147
 certainly-0.128
 precious-0.122
Token to
Token ID284
Feature activation-1.08e+01

Pos logit contributions
 benches+0.118
aster+0.113
 greet+0.104
 Sue+0.103
 Ann+0.102
Neg logit contributions
 important-0.151
 valuable-0.147
 best-0.141
 precious-0.116
 certainly-0.110
Token the
Token ID262
Feature activation-3.09e+00

Pos logit contributions
 benches+0.298
aro+0.271
 enclosure+0.270
talk+0.267
 slam+0.266
Neg logit contributions
 important-0.318
 valuable-0.297
 best-0.292
 great-0.270
 really-0.268
Token shop
Token ID6128
Feature activation-3.91e+00

Pos logit contributions
 benches+0.066
aster+0.061
 greet+0.061
 Sue+0.060
 Ann+0.058
Neg logit contributions
 important-0.117
 valuable-0.116
 best-0.109
 precious-0.097
 certainly-0.094
Token.
Token ID13
Feature activation+5.74e-01

Pos logit contributions
 benches+0.114
aster+0.107
 greet+0.106
 Sue+0.101
 stroke+0.100
Neg logit contributions
 important-0.119
 valuable-0.118
 best-0.108
 precious-0.094
 certainly-0.090
Token Lily
Token ID20037
Feature activation-2.43e+00

Pos logit contributions
 important+0.014
 valuable+0.014
 best+0.013
 precious+0.010
 certainly+0.010
Neg logit contributions
 benches-0.019
aster-0.019
 greet-0.018
 arrest-0.018
hop-0.018
Token saw
Token ID2497
Feature activation-3.72e+00

Pos logit contributions
 benches+0.077
aster+0.072
 greet+0.072
 Sue+0.069
 enclosure+0.068
Neg logit contributions
 important-0.068
 valuable-0.067
 best-0.060
 precious-0.052
 certainly-0.052
TokenS
Token ID50
Feature activation-6.86e+00

Pos logit contributions
 benches+0.105
 greet+0.101
aster+0.099
 enclosure+0.096
 bench+0.096
Neg logit contributions
 important-0.111
 valuable-0.108
 best-0.106
 certainly-0.086
 precious-0.084
Tokenara
Token ID3301
Feature activation-5.91e+00

Pos logit contributions
 benches+0.192
 greet+0.182
keeping+0.179
 slam+0.174
 arrest+0.164
Neg logit contributions
 best-0.199
 important-0.196
 valuable-0.193
 certainly-0.175
 precious-0.160
Token and
Token ID290
Feature activation-7.62e+00

Pos logit contributions
 benches+0.183
 greet+0.170
aster+0.166
 Sue+0.163
 enclosure+0.162
Neg logit contributions
 important-0.172
 valuable-0.168
 best-0.155
 precious-0.132
 certainly-0.131
Token Ben
Token ID3932
Feature activation-5.37e+00

Pos logit contributions
 benches+0.180
 enclosure+0.163
 greet+0.150
aster+0.141
 bench+0.139
Neg logit contributions
 important-0.276
 valuable-0.265
 best-0.261
 certainly-0.229
 precious-0.225
Token went
Token ID1816
Feature activation-4.51e+00

Pos logit contributions
 benches+0.153
 greet+0.141
aster+0.139
 enclosure+0.136
 Sue+0.136
Neg logit contributions
 important-0.166
 valuable-0.164
 best-0.151
 precious-0.132
 certainly-0.131
Token to
Token ID284
Feature activation-1.08e+01

Pos logit contributions
 benches+0.127
aster+0.123
 greet+0.114
 slam+0.110
 arrest+0.110
Neg logit contributions
 important-0.141
 best-0.136
 valuable-0.135
 precious-0.103
 much-0.099
Token see
Token ID766
Feature activation-5.54e+00

Pos logit contributions
 benches+0.300
talk+0.287
keeping+0.278
 slam+0.271
aro+0.269
Neg logit contributions
 important-0.304
 best-0.294
 valuable-0.279
 great-0.263
 really-0.259
Token a
Token ID257
Feature activation+1.10e+00

Pos logit contributions
 benches+0.118
 greet+0.107
aster+0.106
 enclosure+0.104
 arrest+0.100
Neg logit contributions
 important-0.216
 valuable-0.212
 best-0.202
 precious-0.178
 certainly-0.172
Token show
Token ID905
Feature activation-8.96e-02

Pos logit contributions
 important+0.029
 valuable+0.029
 best+0.025
 certainly+0.023
 virtue+0.022
Neg logit contributions
 benches-0.035
 bench-0.034
 greet-0.033
 Sue-0.032
aster-0.032
Token with
Token ID351
Feature activation-3.74e+00

Pos logit contributions
 benches+0.003
 greet+0.003
aster+0.003
 enclosure+0.002
 Sue+0.002
Neg logit contributions
 important-0.003
 valuable-0.002
 best-0.002
 certainly-0.002
 precious-0.002
Token their
Token ID511
Feature activation-4.91e-02

Pos logit contributions
 benches+0.105
 greet+0.098
aster+0.095
 enclosure+0.095
 bench+0.094
Neg logit contributions
 important-0.118
 valuable-0.117
 best-0.107
 precious-0.094
 certainly-0.092
Token and
Token ID290
Feature activation-3.88e+00

Pos logit contributions
 benches+0.132
 greet+0.122
 enclosure+0.120
 bench+0.119
 Ann+0.118
Neg logit contributions
 valuable-0.169
 important-0.168
 best-0.148
 precious-0.135
 virtue-0.131
Token Sam
Token ID3409
Feature activation-3.78e+00

Pos logit contributions
 benches+0.097
 greet+0.093
 enclosure+0.093
hop+0.090
 bench+0.087
Neg logit contributions
 important-0.124
 valuable-0.120
 best-0.119
 impossible-0.097
 precious-0.095
Token and
Token ID290
Feature activation-3.87e+00

Pos logit contributions
 benches+0.084
aster+0.075
 greet+0.075
 Sue+0.074
 enclosure+0.073
Neg logit contributions
 important-0.138
 valuable-0.138
 best-0.129
 certainly-0.116
 precious-0.115
Token Rex
Token ID17853
Feature activation-4.73e+00

Pos logit contributions
 benches+0.091
 enclosure+0.083
hop+0.082
 bench+0.082
 restaurants+0.079
Neg logit contributions
 important-0.133
 valuable-0.130
 best-0.125
 impossible-0.105
 precious-0.105
Token walk
Token ID2513
Feature activation-4.67e+00

Pos logit contributions
 benches+0.099
aster+0.090
 Sue+0.087
 restaurants+0.087
 enclosure+0.087
Neg logit contributions
 important-0.174
 valuable-0.173
 best-0.164
 certainly-0.147
 precious-0.145
Token to
Token ID284
Feature activation-1.08e+01

Pos logit contributions
 benches+0.126
aster+0.124
 greet+0.116
 enclosure+0.115
 arrest+0.115
Neg logit contributions
 important-0.150
 valuable-0.148
 best-0.140
 precious-0.117
 impossible-0.111
Token the
Token ID262
Feature activation-2.01e+00

Pos logit contributions
 benches+0.253
hop+0.235
 enclosure+0.235
talk+0.228
aro+0.228
Neg logit contributions
 important-0.335
 best-0.325
 valuable-0.323
 great-0.291
 really-0.284
Token pond
Token ID16723
Feature activation-2.97e+00

Pos logit contributions
 benches+0.042
 bench+0.038
 greet+0.037
 enclosure+0.036
 Sue+0.036
Neg logit contributions
 valuable-0.078
 important-0.078
 best-0.071
 certainly-0.067
 precious-0.065
Token.
Token ID13
Feature activation+1.43e+00

Pos logit contributions
 benches+0.098
aster+0.093
 greet+0.093
 Ann+0.091
 arrest+0.089
Neg logit contributions
 important-0.077
 valuable-0.075
 best-0.071
 precious-0.055
 certainly-0.053
Token They
Token ID1119
Feature activation+5.68e-01

Pos logit contributions
 important+0.042
 valuable+0.042
 best+0.038
 precious+0.033
 certainly+0.033
Neg logit contributions
 benches-0.044
 greet-0.040
 Sue-0.040
aster-0.039
 Ann-0.039
Token see
Token ID766
Feature activation-4.22e+00

Pos logit contributions
 valuable+0.020
 important+0.020
 best+0.019
 certainly+0.017
 precious+0.017
Neg logit contributions
 benches-0.012
aster-0.012
 Sue-0.011
 Ann-0.011
 enclosure-0.011
TokenS
Token ID50
Feature activation-6.85e+00

Pos logit contributions
 benches+0.114
 greet+0.110
aster+0.109
 enclosure+0.105
 slam+0.103
Neg logit contributions
 important-0.106
 valuable-0.104
 best-0.101
 certainly-0.083
 precious-0.078
Tokenara
Token ID3301
Feature activation-6.15e+00

Pos logit contributions
 benches+0.202
 greet+0.188
 slam+0.182
keeping+0.182
 Ann+0.177
Neg logit contributions
 important-0.194
 valuable-0.192
 best-0.190
 certainly-0.170
 truly-0.156
Token and
Token ID290
Feature activation-7.51e+00

Pos logit contributions
 benches+0.200
 greet+0.184
aster+0.180
 enclosure+0.176
 Sue+0.175
Neg logit contributions
 important-0.171
 valuable-0.165
 best-0.151
 certainly-0.129
 precious-0.129
Token Tom
Token ID4186
Feature activation-6.63e+00

Pos logit contributions
 benches+0.173
 enclosure+0.157
 greet+0.147
aster+0.139
 restaurants+0.138
Neg logit contributions
 important-0.279
 valuable-0.266
 best-0.264
 certainly-0.229
 precious-0.227
Token went
Token ID1816
Feature activation-4.59e+00

Pos logit contributions
 benches+0.189
 greet+0.174
aster+0.171
 enclosure+0.168
 bench+0.168
Neg logit contributions
 important-0.204
 valuable-0.201
 best-0.186
 certainly-0.163
 precious-0.162
Token to
Token ID284
Feature activation-1.08e+01

Pos logit contributions
 benches+0.125
aster+0.119
 greet+0.112
 Ann+0.109
 Sue+0.108
Neg logit contributions
 important-0.149
 valuable-0.143
 best-0.141
 precious-0.111
 certainly-0.107
Token the
Token ID262
Feature activation-3.15e+00

Pos logit contributions
 benches+0.298
talk+0.285
keeping+0.270
aro+0.267
 slam+0.265
Neg logit contributions
 important-0.311
 best-0.299
 valuable-0.286
 great-0.269
 really-0.264
Token zoo
Token ID26626
Feature activation-5.96e+00

Pos logit contributions
 benches+0.064
 greet+0.059
aster+0.058
 Ann+0.056
 Sue+0.056
Neg logit contributions
 important-0.122
 valuable-0.121
 best-0.114
 precious-0.102
 certainly-0.099
Token with
Token ID351
Feature activation-5.42e+00

Pos logit contributions
 greet+0.198
 benches+0.197
aster+0.193
 stroke+0.192
 arrest+0.185
Neg logit contributions
 important-0.148
 best-0.138
 valuable-0.137
 much-0.098
 precious-0.098
Token Mom
Token ID11254
Feature activation-4.75e+00

Pos logit contributions
 benches+0.151
hop+0.146
 enclosure+0.144
 greet+0.134
aster+0.133
Neg logit contributions
 important-0.171
 best-0.166
 valuable-0.163
 precious-0.138
 much-0.131
Token and
Token ID290
Feature activation-6.34e+00

Pos logit contributions
 benches+0.158
aster+0.146
 greet+0.145
 arrest+0.142
 Sue+0.137
Neg logit contributions
 important-0.128
 valuable-0.123
 best-0.121
 certainly-0.095
 precious-0.093
Token girl
Token ID2576
Feature activation-2.14e+00

Pos logit contributions
 valuable+0.006
 important+0.006
 best+0.005
 certainly+0.005
 precious+0.005
Neg logit contributions
 benches-0.006
 greet-0.005
 Sue-0.005
aster-0.005
 bench-0.005
Token named
Token ID3706
Feature activation-7.95e+00

Pos logit contributions
 benches+0.055
 greet+0.048
 enclosure+0.047
 bench+0.047
 contain+0.046
Neg logit contributions
 important-0.071
 valuable-0.065
 best-0.064
 precious-0.056
 really-0.055
Token Emma
Token ID18966
Feature activation-3.48e+00

Pos logit contributions
 benches+0.203
aster+0.193
 enclosure+0.191
 greet+0.182
hop+0.180
Neg logit contributions
 important-0.277
 valuable-0.268
 best-0.251
 precious-0.219
 certainly-0.218
Token walked
Token ID6807
Feature activation-4.24e+00

Pos logit contributions
 benches+0.105
 greet+0.097
aster+0.095
 enclosure+0.094
 Ann+0.093
Neg logit contributions
 important-0.103
 valuable-0.101
 best-0.091
 precious-0.080
 certainly-0.079
Token outside
Token ID2354
Feature activation-7.85e+00

Pos logit contributions
 benches+0.137
aster+0.135
 slam+0.128
 arrest+0.127
 enclosure+0.127
Neg logit contributions
 important-0.122
 valuable-0.110
 best-0.106
 really-0.087
 very-0.086
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.255
 enclosure+0.226
isk+0.221
aster+0.212
 slam+0.210
Neg logit contributions
 important-0.221
 valuable-0.190
 best-0.186
 really-0.171
 very-0.170
Token the
Token ID262
Feature activation-3.06e+00

Pos logit contributions
 benches+0.269
 enclosure+0.248
aro+0.248
 inspection+0.240
keeping+0.240
Neg logit contributions
 important-0.361
 valuable-0.326
 best-0.324
 great-0.308
 really-0.299
Token garden
Token ID11376
Feature activation-4.08e+00

Pos logit contributions
 benches+0.066
 greet+0.061
aster+0.060
 Ann+0.059
 Sue+0.058
Neg logit contributions
 important-0.115
 valuable-0.113
 best-0.106
 precious-0.094
 certainly-0.092
Token.
Token ID13
Feature activation+8.45e-01

Pos logit contributions
 benches+0.149
 greet+0.145
aster+0.141
 Ann+0.139
 arrest+0.136
Neg logit contributions
 important-0.090
 valuable-0.081
 best-0.080
 much-0.057
 precious-0.055
Token On
Token ID1550
Feature activation-3.63e+00

Pos logit contributions
 important+0.028
 valuable+0.028
 best+0.025
 precious+0.022
 certainly+0.022
Neg logit contributions
 benches-0.022
 greet-0.021
aster-0.020
 enclosure-0.020
 Sue-0.020
Token the
Token ID262
Feature activation-2.17e+00

Pos logit contributions
 benches+0.065
 greet+0.064
 enclosure+0.059
 stroke+0.056
hop+0.054
Neg logit contributions
 important-0.147
 valuable-0.143
 best-0.138
 certainly-0.122
 precious-0.121
Token and
Token ID290
Feature activation-3.76e+00

Pos logit contributions
 benches+0.048
 greet+0.046
 Ann+0.046
 Sue+0.045
 bench+0.045
Neg logit contributions
 valuable-0.057
 important-0.056
 best-0.049
 virtue-0.045
 awful-0.045
Token her
Token ID607
Feature activation+1.27e+00

Pos logit contributions
 enclosure+0.091
 benches+0.090
 greet+0.082
 arrest+0.078
 slam+0.076
Neg logit contributions
 important-0.141
 best-0.132
 valuable-0.129
 unfair-0.106
 precious-0.105
Token mom
Token ID1995
Feature activation-5.66e+00

Pos logit contributions
 important+0.040
 valuable+0.039
 best+0.037
 precious+0.032
 certainly+0.030
Neg logit contributions
 benches-0.036
aster-0.035
 greet-0.034
 enclosure-0.033
 Sue-0.032
Token would
Token ID561
Feature activation+6.83e-02

Pos logit contributions
 benches+0.162
aster+0.149
 greet+0.148
 Sue+0.145
 bench+0.144
Neg logit contributions
 important-0.176
 valuable-0.174
 best-0.160
 precious-0.139
 certainly-0.139
Token sign
Token ID1051
Feature activation-1.97e+00

Pos logit contributions
 important+0.003
 valuable+0.003
 best+0.002
 certainly+0.002
 precious+0.002
Neg logit contributions
 benches-0.001
aster-0.001
 Ann-0.001
 enclosure-0.001
 Sue-0.001
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.051
aster+0.048
 enclosure+0.047
 greet+0.047
 bench+0.045
Neg logit contributions
 important-0.067
 valuable-0.066
 best-0.062
 precious-0.054
 impossible-0.051
Token each
Token ID1123
Feature activation-1.87e+00

Pos logit contributions
 benches+0.293
 enclosure+0.284
 greet+0.258
hop+0.256
 arrest+0.255
Neg logit contributions
 important-0.355
 valuable-0.338
 best-0.326
 precious-0.279
 truly-0.266
Token other
Token ID584
Feature activation-2.12e+00

Pos logit contributions
 benches+0.062
 enclosure+0.057
 greet+0.055
 Ann+0.055
 Sue+0.054
Neg logit contributions
 important-0.051
 valuable-0.049
 best-0.047
 precious-0.037
 certainly-0.035
Token every
Token ID790
Feature activation+1.86e+00

Pos logit contributions
 benches+0.059
aster+0.055
 greet+0.054
 Sue+0.053
 enclosure+0.052
Neg logit contributions
 important-0.067
 valuable-0.067
 best-0.061
 precious-0.053
 certainly-0.052
Token day
Token ID1110
Feature activation+6.53e-02

Pos logit contributions
 important+0.051
 valuable+0.049
 best+0.044
 precious+0.037
 certainly+0.036
Neg logit contributions
 benches-0.061
 greet-0.057
 Sue-0.055
 Ann-0.055
 enclosure-0.055
Token.
Token ID13
Feature activation+1.52e+00

Pos logit contributions
 important+0.002
 valuable+0.002
 best+0.002
 precious+0.001
 certainly+0.001
Neg logit contributions
 benches-0.002
 greet-0.002
aster-0.002
 enclosure-0.002
 bench-0.002
TokenL
Token ID43
Feature activation-4.33e+00

Pos logit contributions
 benches+0.107
 greet+0.103
aster+0.101
 enclosure+0.097
 bench+0.097
Neg logit contributions
 important-0.123
 valuable-0.121
 best-0.116
 certainly-0.097
 precious-0.096
Tokenily
Token ID813
Feature activation-5.08e+00

Pos logit contributions
 benches+0.097
 greet+0.095
 gra+0.090
 bench+0.088
 enclosure+0.088
Neg logit contributions
 important-0.154
 valuable-0.149
 best-0.147
 certainly-0.129
 precious-0.124
Token and
Token ID290
Feature activation-6.93e+00

Pos logit contributions
 benches+0.149
 greet+0.138
aster+0.135
 Sue+0.133
 enclosure+0.132
Neg logit contributions
 important-0.153
 valuable-0.151
 best-0.138
 precious-0.120
 certainly-0.120
Token Ben
Token ID3932
Feature activation-5.02e+00

Pos logit contributions
 benches+0.159
 enclosure+0.144
 greet+0.140
 bench+0.133
aster+0.133
Neg logit contributions
 important-0.252
 valuable-0.247
 best-0.239
 certainly-0.210
 precious-0.207
Token went
Token ID1816
Feature activation-4.54e+00

Pos logit contributions
 benches+0.135
 greet+0.123
aster+0.122
 Ann+0.120
 enclosure+0.120
Neg logit contributions
 important-0.162
 valuable-0.160
 best-0.148
 precious-0.130
 certainly-0.130
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.120
aster+0.117
 greet+0.108
 arrest+0.106
 Ann+0.105
Neg logit contributions
 important-0.149
 valuable-0.146
 best-0.142
 precious-0.115
 certainly-0.107
Token the
Token ID262
Feature activation-3.17e+00

Pos logit contributions
 benches+0.267
talk+0.245
 slam+0.242
keeping+0.237
aro+0.236
Neg logit contributions
 important-0.336
 best-0.326
 valuable-0.324
 really-0.284
 great-0.283
Token park
Token ID3952
Feature activation-4.41e+00

Pos logit contributions
 benches+0.067
aster+0.065
 greet+0.064
 Ann+0.062
 Sue+0.061
Neg logit contributions
 important-0.117
 valuable-0.116
 best-0.110
 precious-0.098
 certainly-0.092
Token with
Token ID351
Feature activation-5.77e+00

Pos logit contributions
 benches+0.156
aster+0.150
 greet+0.149
 arrest+0.146
 gra+0.144
Neg logit contributions
 important-0.098
 valuable-0.096
 best-0.091
 much-0.068
 really-0.066
Token Mom
Token ID11254
Feature activation-4.75e+00

Pos logit contributions
 benches+0.144
 enclosure+0.142
hop+0.136
 greet+0.130
 arrest+0.130
Neg logit contributions
 important-0.194
 valuable-0.190
 best-0.189
 precious-0.160
 great-0.147
Token.
Token ID13
Feature activation+5.74e-01

Pos logit contributions
 benches+0.162
aster+0.151
 arrest+0.150
 greet+0.149
 slam+0.144
Neg logit contributions
 important-0.123
 valuable-0.118
 best-0.115
 certainly-0.088
 precious-0.087
TokenJim
Token ID18050
Feature activation-6.87e+00

Pos logit contributions
 benches+0.115
aster+0.111
 greet+0.110
 enclosure+0.105
 bench+0.104
Neg logit contributions
 important-0.117
 valuable-0.115
 best-0.110
 certainly-0.094
 precious-0.088
Token walked
Token ID6807
Feature activation-4.41e+00

Pos logit contributions
 benches+0.205
 greet+0.189
aster+0.184
 bench+0.181
 Sue+0.181
Neg logit contributions
 important-0.207
 valuable-0.203
 best-0.185
 certainly-0.162
 precious-0.161
Token into
Token ID656
Feature activation-9.25e+00

Pos logit contributions
 benches+0.126
aster+0.120
 greet+0.114
 Ann+0.113
 slam+0.112
Neg logit contributions
 important-0.139
 valuable-0.135
 best-0.126
 precious-0.104
 certainly-0.102
Token the
Token ID262
Feature activation-3.18e+00

Pos logit contributions
talk+0.223
 enclosure+0.222
isk+0.222
 benches+0.216
hop+0.211
Neg logit contributions
 important-0.303
 valuable-0.293
 best-0.273
 amazing-0.240
 great-0.238
Token kitchen
Token ID9592
Feature activation-3.53e+00

Pos logit contributions
 benches+0.073
 greet+0.068
aster+0.066
 Sue+0.065
 Ann+0.064
Neg logit contributions
 important-0.116
 valuable-0.115
 best-0.107
 precious-0.095
 certainly-0.093
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.095
 greet+0.089
aster+0.086
 Sue+0.085
 Ann+0.084
Neg logit contributions
 important-0.115
 valuable-0.114
 best-0.104
 precious-0.091
 certainly-0.091
Token find
Token ID1064
Feature activation-2.24e+00

Pos logit contributions
 benches+0.234
 enclosure+0.200
aro+0.195
 Ann+0.185
 arrest+0.178
Neg logit contributions
 important-0.422
 valuable-0.401
 best-0.377
 certainly-0.350
 really-0.346
Token a
Token ID257
Feature activation+2.40e+00

Pos logit contributions
 benches+0.071
 greet+0.067
 enclosure+0.066
aster+0.065
 bench+0.065
Neg logit contributions
 important-0.063
 valuable-0.062
 best-0.056
 precious-0.048
 certainly-0.047
Token messy
Token ID24097
Feature activation+5.23e+00

Pos logit contributions
 important+0.066
 valuable+0.066
 best+0.058
 certainly+0.054
der+0.053
Neg logit contributions
 bench-0.071
 benches-0.070
 greet-0.068
 Sue-0.066
 climb-0.065
Token pizza
Token ID14256
Feature activation+7.51e-01

Pos logit contributions
 valuable+0.184
 important+0.184
 best+0.168
 precious+0.151
 certainly+0.151
Neg logit contributions
 benches-0.126
 greet-0.115
aster-0.113
 bench-0.112
 Sue-0.112
Token on
Token ID319
Feature activation-6.78e+00

Pos logit contributions
 valuable+0.024
 important+0.024
 best+0.022
 precious+0.020
 certainly+0.020
Neg logit contributions
 benches-0.020
 greet-0.018
 bench-0.018
 Sue-0.018
aster-0.018
TokenL
Token ID43
Feature activation-4.21e+00

Pos logit contributions
 benches+0.098
 greet+0.095
aster+0.094
 enclosure+0.091
 slam+0.089
Neg logit contributions
 important-0.098
 valuable-0.097
 best-0.093
 certainly-0.077
 precious-0.074
Tokenily
Token ID813
Feature activation-4.95e+00

Pos logit contributions
 benches+0.135
 greet+0.135
 gra+0.130
 enclosure+0.128
 stroke+0.127
Neg logit contributions
 important-0.115
 valuable-0.107
 best-0.103
 certainly-0.083
 precious-0.083
Token and
Token ID290
Feature activation-7.13e+00

Pos logit contributions
 benches+0.154
 greet+0.144
aster+0.141
 Sue+0.138
 enclosure+0.137
Neg logit contributions
 important-0.142
 valuable-0.138
 best-0.126
 certainly-0.108
 precious-0.108
Token Ben
Token ID3932
Feature activation-5.07e+00

Pos logit contributions
 benches+0.160
 enclosure+0.149
 greet+0.140
hop+0.127
aster+0.124
Neg logit contributions
 important-0.268
 valuable-0.256
 best-0.255
 certainly-0.220
 precious-0.220
Token went
Token ID1816
Feature activation-4.45e+00

Pos logit contributions
 benches+0.149
 greet+0.139
aster+0.137
 enclosure+0.134
 Sue+0.134
Neg logit contributions
 important-0.151
 valuable-0.150
 best-0.136
 precious-0.119
 certainly-0.118
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.120
aster+0.118
 greet+0.109
 arrest+0.105
 Ann+0.104
Neg logit contributions
 important-0.143
 valuable-0.138
 best-0.138
 precious-0.108
 much-0.101
Token the
Token ID262
Feature activation-3.08e+00

Pos logit contributions
 benches+0.295
talk+0.287
keeping+0.284
aro+0.282
hop+0.265
Neg logit contributions
 important-0.310
 best-0.300
 valuable-0.288
 great-0.272
 really-0.268
Token park
Token ID3952
Feature activation-4.16e+00

Pos logit contributions
 benches+0.065
aster+0.062
 greet+0.061
 Sue+0.058
 Ann+0.058
Neg logit contributions
 important-0.116
 valuable-0.114
 best-0.109
 precious-0.097
 certainly-0.091
Token with
Token ID351
Feature activation-5.73e+00

Pos logit contributions
 benches+0.148
aster+0.144
 greet+0.143
 arrest+0.141
 gra+0.137
Neg logit contributions
 important-0.090
 best-0.086
 valuable-0.085
 much-0.062
 really-0.060
Token Mom
Token ID11254
Feature activation-4.66e+00

Pos logit contributions
 benches+0.144
 enclosure+0.142
hop+0.134
 greet+0.130
 slam+0.130
Neg logit contributions
 important-0.194
 best-0.188
 valuable-0.186
 precious-0.158
 great-0.147
Token.
Token ID13
Feature activation+5.46e-01

Pos logit contributions
 benches+0.161
aster+0.149
 greet+0.148
 arrest+0.148
 slam+0.142
Neg logit contributions
 important-0.120
 best-0.114
 valuable-0.114
 certainly-0.086
 precious-0.084
Tokenily
Token ID813
Feature activation-4.88e+00

Pos logit contributions
 greet+0.112
 benches+0.109
 gra+0.106
 stroke+0.103
 slam+0.102
Neg logit contributions
 important-0.128
 best-0.121
 valuable-0.119
 certainly-0.101
 precious-0.097
Token and
Token ID290
Feature activation-7.04e+00

Pos logit contributions
 benches+0.162
 greet+0.152
aster+0.149
 Sue+0.147
 Ann+0.146
Neg logit contributions
 important-0.129
 valuable-0.126
 best-0.114
 precious-0.096
 certainly-0.096
Token her
Token ID607
Feature activation-1.28e+00

Pos logit contributions
 benches+0.153
 enclosure+0.139
 greet+0.132
aster+0.123
 bench+0.120
Neg logit contributions
 important-0.269
 valuable-0.261
 best-0.255
 certainly-0.223
 precious-0.222
Token mom
Token ID1995
Feature activation-7.52e+00

Pos logit contributions
 benches+0.029
 greet+0.027
 bench+0.026
 Sue+0.026
 Ann+0.026
Neg logit contributions
 valuable-0.047
 important-0.047
 best-0.042
 certainly-0.039
 impossible-0.038
Token went
Token ID1816
Feature activation-4.56e+00

Pos logit contributions
 benches+0.241
 greet+0.215
 enclosure+0.213
aster+0.212
 bench+0.209
Neg logit contributions
 important-0.210
 best-0.201
 valuable-0.192
 certainly-0.165
 precious-0.157
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.121
aster+0.116
 Sue+0.106
 Ann+0.106
 greet+0.105
Neg logit contributions
 important-0.153
 valuable-0.149
 best-0.142
 precious-0.118
 certainly-0.111
Token the
Token ID262
Feature activation-3.03e+00

Pos logit contributions
 benches+0.290
aro+0.266
 enclosure+0.261
 slam+0.255
talk+0.255
Neg logit contributions
 important-0.321
 valuable-0.305
 best-0.295
 really-0.277
 great-0.274
Token mall
Token ID17374
Feature activation-2.49e+00

Pos logit contributions
 benches+0.062
aster+0.058
 greet+0.058
 Ann+0.056
 Sue+0.056
Neg logit contributions
 important-0.116
 valuable-0.114
 best-0.108
 precious-0.097
 certainly-0.092
Token.
Token ID13
Feature activation+5.94e-01

Pos logit contributions
 benches+0.077
 greet+0.073
aster+0.073
 stroke+0.070
 Ann+0.068
Neg logit contributions
 important-0.071
 valuable-0.069
 best-0.063
 precious-0.053
 certainly-0.050
Token Lily
Token ID20037
Feature activation-2.51e+00

Pos logit contributions
 important+0.017
 valuable+0.017
 best+0.015
 precious+0.013
 certainly+0.013
Neg logit contributions
 benches-0.017
aster-0.017
 greet-0.016
 arrest-0.016
 enclosure-0.016
Token liked
Token ID8288
Feature activation+1.37e+00

Pos logit contributions
 benches+0.074
aster+0.069
 greet+0.069
 Sue+0.066
 enclosure+0.065
Neg logit contributions
 important-0.076
 valuable-0.075
 best-0.068
 precious-0.060
 certainly-0.059
TokenL
Token ID43
Feature activation-4.12e+00

Pos logit contributions
 benches+0.097
 greet+0.094
aster+0.092
 enclosure+0.089
 bench+0.089
Neg logit contributions
 important-0.099
 valuable-0.097
 best-0.095
 certainly-0.077
 precious-0.076
Tokenily
Token ID813
Feature activation-4.87e+00

Pos logit contributions
 greet+0.110
 benches+0.106
 gra+0.104
 slam+0.103
 stroke+0.101
Neg logit contributions
 important-0.130
 best-0.125
 valuable-0.121
 certainly-0.101
 precious-0.100
Token and
Token ID290
Feature activation-7.42e+00

Pos logit contributions
 benches+0.161
 greet+0.150
aster+0.148
 Sue+0.145
 enclosure+0.143
Neg logit contributions
 important-0.130
 valuable-0.126
 best-0.116
 precious-0.097
 certainly-0.097
Token Ben
Token ID3932
Feature activation-5.29e+00

Pos logit contributions
 benches+0.168
 enclosure+0.155
 greet+0.143
aster+0.136
 bench+0.134
Neg logit contributions
 important-0.276
 valuable-0.265
 best-0.265
 precious-0.227
 certainly-0.226
Token went
Token ID1816
Feature activation-4.46e+00

Pos logit contributions
 benches+0.151
 greet+0.138
aster+0.137
 enclosure+0.134
 bench+0.133
Neg logit contributions
 important-0.163
 valuable-0.160
 best-0.148
 precious-0.129
 certainly-0.129
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.124
aster+0.123
 greet+0.114
 gra+0.111
 arrest+0.110
Neg logit contributions
 important-0.137
 best-0.135
 valuable-0.132
 precious-0.101
 much-0.097
Token the
Token ID262
Feature activation-3.12e+00

Pos logit contributions
 benches+0.299
talk+0.292
keeping+0.280
aro+0.280
 slam+0.276
Neg logit contributions
 important-0.293
 best-0.286
 valuable-0.271
 great-0.256
 really-0.250
Token beach
Token ID10481
Feature activation-2.50e+00

Pos logit contributions
 benches+0.066
 greet+0.064
aster+0.063
 Sue+0.060
 Ann+0.059
Neg logit contributions
 important-0.116
 valuable-0.114
 best-0.111
 precious-0.097
 certainly-0.090
Token with
Token ID351
Feature activation-5.77e+00

Pos logit contributions
 benches+0.087
aster+0.083
 greet+0.082
 Ann+0.079
 arrest+0.078
Neg logit contributions
 important-0.060
 best-0.056
 valuable-0.056
 precious-0.040
 really-0.039
Token Mom
Token ID11254
Feature activation-4.81e+00

Pos logit contributions
 benches+0.141
 enclosure+0.141
hop+0.136
 greet+0.126
 slam+0.124
Neg logit contributions
 important-0.196
 best-0.191
 valuable-0.189
 precious-0.161
 much-0.151
Token and
Token ID290
Feature activation-6.43e+00

Pos logit contributions
 benches+0.158
aster+0.145
 greet+0.145
 arrest+0.143
 Ann+0.139
Neg logit contributions
 important-0.132
 valuable-0.127
 best-0.125
 certainly-0.098
 precious-0.096
Token and
Token ID290
Feature activation-6.59e+00

Pos logit contributions
 benches+0.137
 greet+0.128
aster+0.124
 Sue+0.122
 enclosure+0.122
Neg logit contributions
 important-0.145
 valuable-0.143
 best-0.130
 certainly-0.114
 precious-0.113
Token her
Token ID607
Feature activation-1.09e+00

Pos logit contributions
 benches+0.145
 enclosure+0.129
 greet+0.126
aster+0.123
 bench+0.117
Neg logit contributions
 important-0.250
 valuable-0.243
 best-0.237
 certainly-0.208
 precious-0.202
Token mom
Token ID1995
Feature activation-7.75e+00

Pos logit contributions
 benches+0.024
 greet+0.022
 Sue+0.022
 bench+0.022
 Ann+0.022
Neg logit contributions
 valuable-0.040
 important-0.040
 best-0.034
 virtue-0.033
 certainly-0.033
Token went
Token ID1816
Feature activation-4.70e+00

Pos logit contributions
 benches+0.190
 greet+0.169
aster+0.168
 enclosure+0.163
 bench+0.161
Neg logit contributions
 important-0.276
 valuable-0.263
 best-0.262
 certainly-0.229
 precious-0.220
Token outside
Token ID2354
Feature activation-7.95e+00

Pos logit contributions
 benches+0.122
aster+0.115
 greet+0.107
 Ann+0.107
 Sue+0.106
Neg logit contributions
 important-0.160
 valuable-0.156
 best-0.148
 precious-0.124
 certainly-0.120
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.240
aster+0.214
 enclosure+0.211
 greet+0.210
isk+0.206
Neg logit contributions
 important-0.237
 valuable-0.220
 best-0.215
 really-0.172
 certainly-0.172
Token the
Token ID262
Feature activation-3.27e+00

Pos logit contributions
 benches+0.231
keeping+0.195
 enclosure+0.193
 arrest+0.193
 Ann+0.186
Neg logit contributions
 important-0.391
 valuable-0.369
 best-0.361
 really-0.336
 certainly-0.334
Token park
Token ID3952
Feature activation-4.81e+00

Pos logit contributions
 benches+0.069
 greet+0.064
aster+0.063
 Ann+0.061
 Sue+0.060
Neg logit contributions
 important-0.125
 valuable-0.123
 best-0.115
 precious-0.103
 certainly-0.101
Token.
Token ID13
Feature activation+8.66e-01

Pos logit contributions
 benches+0.163
 greet+0.158
aster+0.157
 arrest+0.151
 restaurants+0.147
Neg logit contributions
 important-0.120
 valuable-0.115
 best-0.108
 precious-0.081
 certainly-0.081
Token It
Token ID632
Feature activation+4.02e+00

Pos logit contributions
 important+0.028
 valuable+0.028
 best+0.026
 precious+0.023
 certainly+0.023
Neg logit contributions
 benches-0.023
 greet-0.021
aster-0.021
 Sue-0.021
 bench-0.021
Token was
Token ID373
Feature activation+4.66e+00

Pos logit contributions
 valuable+0.118
 important+0.117
 best+0.103
 virtue+0.095
 precious+0.094
Neg logit contributions
 benches-0.120
 Sue-0.115
 bench-0.113
 Ann-0.112
 climb-0.108
Token<|endoftext|>
Token ID50256
Feature activation-3.29e+00

Pos logit contributions
 valuable+0.067
 important+0.066
 best+0.060
 certainly+0.053
 precious+0.053
Neg logit contributions
 benches-0.062
 greet-0.058
aster-0.056
 bench-0.056
 enclosure-0.055
TokenTim
Token ID14967
Feature activation-5.98e+00

Pos logit contributions
 benches+0.086
 greet+0.081
aster+0.080
 Sue+0.077
 bench+0.077
Neg logit contributions
 important-0.106
 valuable-0.105
 best-0.099
 certainly-0.085
 precious-0.084
Token and
Token ID290
Feature activation-6.80e+00

Pos logit contributions
 benches+0.191
 greet+0.176
aster+0.169
 bench+0.165
 enclosure+0.163
Neg logit contributions
 important-0.169
 valuable-0.161
 best-0.152
 certainly-0.130
 precious-0.127
Token Lily
Token ID20037
Feature activation-3.71e+00

Pos logit contributions
 benches+0.165
 enclosure+0.147
 greet+0.144
 bench+0.136
aster+0.135
Neg logit contributions
 important-0.247
 valuable-0.244
 best-0.229
 precious-0.200
 certainly-0.198
Token went
Token ID1816
Feature activation-4.43e+00

Pos logit contributions
 benches+0.103
 greet+0.096
aster+0.094
 Sue+0.092
 Ann+0.092
Neg logit contributions
 important-0.117
 valuable-0.117
 best-0.106
 precious-0.093
 certainly-0.093
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.108
aster+0.103
 greet+0.098
 Ann+0.096
 Sue+0.095
Neg logit contributions
 important-0.155
 valuable-0.153
 best-0.146
 precious-0.123
 certainly-0.118
Token the
Token ID262
Feature activation-3.32e+00

Pos logit contributions
 benches+0.265
aro+0.235
talk+0.233
 slam+0.230
 enclosure+0.223
Neg logit contributions
 important-0.344
 valuable-0.334
 best-0.330
 great-0.291
 really-0.290
Token zoo
Token ID26626
Feature activation-5.95e+00

Pos logit contributions
 benches+0.065
 greet+0.060
aster+0.059
 Ann+0.058
 Sue+0.058
Neg logit contributions
 important-0.131
 valuable-0.130
 best-0.122
 precious-0.110
 certainly-0.107
Token with
Token ID351
Feature activation-5.62e+00

Pos logit contributions
 benches+0.204
 greet+0.203
aster+0.201
 stroke+0.199
 arrest+0.192
Neg logit contributions
 important-0.140
 valuable-0.131
 best-0.128
 precious-0.093
 much-0.089
Token Mom
Token ID11254
Feature activation-4.77e+00

Pos logit contributions
 benches+0.153
hop+0.151
 enclosure+0.148
 greet+0.139
 bench+0.137
Neg logit contributions
 important-0.177
 valuable-0.171
 best-0.171
 precious-0.145
 great-0.135
Token and
Token ID290
Feature activation-6.30e+00

Pos logit contributions
 benches+0.155
aster+0.142
 greet+0.142
 arrest+0.138
 enclosure+0.135
Neg logit contributions
 important-0.133
 valuable-0.128
 best-0.123
 certainly-0.100
 precious-0.098
Token.
Token ID13
Feature activation+2.22e+00

Pos logit contributions
 benches+0.081
 greet+0.078
aster+0.076
 stroke+0.073
 enclosure+0.073
Neg logit contributions
 important-0.076
 valuable-0.075
 best-0.069
 precious-0.057
 certainly-0.056
Token<|endoftext|>
Token ID50256
Feature activation-3.87e+00

Pos logit contributions
 important+0.074
 valuable+0.074
 best+0.067
 precious+0.059
 certainly+0.059
Neg logit contributions
 benches-0.059
 greet-0.055
aster-0.053
 bench-0.052
 enclosure-0.052
TokenMom
Token ID29252
Feature activation-5.31e+00

Pos logit contributions
 benches+0.129
 greet+0.125
aster+0.124
 enclosure+0.121
 stroke+0.119
Neg logit contributions
 important-0.094
 valuable-0.091
 best-0.089
 certainly-0.071
 much-0.066
Tokenmy
Token ID1820
Feature activation-3.03e+00

Pos logit contributions
 benches+0.171
 greet+0.153
aster+0.147
 bench+0.146
 slam+0.146
Neg logit contributions
 important-0.151
 valuable-0.140
 best-0.137
 certainly-0.122
 precious-0.111
Token went
Token ID1816
Feature activation-4.45e+00

Pos logit contributions
 benches+0.095
 greet+0.087
 Sue+0.083
 enclosure+0.082
aster+0.082
Neg logit contributions
 important-0.087
 valuable-0.083
 best-0.077
 certainly-0.067
 precious-0.065
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.125
aster+0.113
 greet+0.109
 enclosure+0.106
 Sue+0.104
Neg logit contributions
 important-0.147
 valuable-0.139
 best-0.134
 precious-0.110
 certainly-0.107
Token the
Token ID262
Feature activation-3.19e+00

Pos logit contributions
 benches+0.320
aro+0.303
talk+0.293
 enclosure+0.285
keeping+0.284
Neg logit contributions
 important-0.288
 really-0.261
 valuable-0.256
 best-0.250
 great-0.245
Token kitchen
Token ID9592
Feature activation-3.34e+00

Pos logit contributions
 benches+0.069
 greet+0.065
aster+0.064
 Ann+0.061
 Sue+0.060
Neg logit contributions
 important-0.119
 valuable-0.116
 best-0.112
 precious-0.098
 certainly-0.094
Token.
Token ID13
Feature activation+6.22e-01

Pos logit contributions
 benches+0.111
 greet+0.103
aster+0.099
 enclosure+0.098
 stroke+0.097
Neg logit contributions
 important-0.090
 valuable-0.086
 best-0.080
 certainly-0.065
 precious-0.064
Token She
Token ID1375
Feature activation+1.02e+00

Pos logit contributions
 important+0.018
 valuable+0.018
 best+0.016
 certainly+0.014
 precious+0.014
Neg logit contributions
 benches-0.019
 greet-0.017
aster-0.017
 enclosure-0.017
 bench-0.017
Token opened
Token ID4721
Feature activation-4.17e+00

Pos logit contributions
 important+0.029
 valuable+0.028
 best+0.025
 certainly+0.022
 precious+0.022
Neg logit contributions
 benches-0.033
 greet-0.031
 enclosure-0.030
aster-0.030
 bench-0.029
Token<|endoftext|>
Token ID50256
Feature activation-3.34e+00

Pos logit contributions
 important+0.084
 valuable+0.083
 best+0.076
 precious+0.067
 certainly+0.067
Neg logit contributions
 benches-0.069
 greet-0.064
aster-0.062
 Sue-0.062
 Ann-0.061
TokenAnna
Token ID31160
Feature activation-4.88e+00

Pos logit contributions
 benches+0.091
 greet+0.088
aster+0.086
 Sue+0.083
 bench+0.083
Neg logit contributions
 important-0.102
 valuable-0.100
 best-0.096
 certainly-0.080
 precious-0.079
Token and
Token ID290
Feature activation-7.34e+00

Pos logit contributions
 benches+0.154
 greet+0.143
aster+0.141
 Sue+0.139
 bench+0.138
Neg logit contributions
 important-0.137
 valuable-0.135
 best-0.123
 precious-0.106
 certainly-0.104
Token Ben
Token ID3932
Feature activation-5.25e+00

Pos logit contributions
 benches+0.178
 enclosure+0.165
 greet+0.149
hop+0.147
 bench+0.145
Neg logit contributions
 important-0.253
 best-0.250
 valuable-0.238
 precious-0.210
 certainly-0.206
Token went
Token ID1816
Feature activation-4.46e+00

Pos logit contributions
 benches+0.143
 greet+0.132
aster+0.131
 enclosure+0.128
 Sue+0.128
Neg logit contributions
 important-0.166
 valuable-0.164
 best-0.153
 precious-0.134
 certainly-0.133
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
aster+0.124
 benches+0.123
 greet+0.112
 gra+0.110
 arrest+0.109
Neg logit contributions
 important-0.139
 best-0.135
 valuable-0.134
 precious-0.103
 much-0.097
Token the
Token ID262
Feature activation-3.10e+00

Pos logit contributions
 benches+0.292
talk+0.287
keeping+0.279
aro+0.274
 slam+0.272
Neg logit contributions
 important-0.297
 best-0.291
 valuable-0.274
 great-0.260
 really-0.255
Token shop
Token ID6128
Feature activation-3.81e+00

Pos logit contributions
 benches+0.065
 greet+0.063
aster+0.062
 Ann+0.060
 Sue+0.059
Neg logit contributions
 important-0.116
 valuable-0.114
 best-0.110
 precious-0.097
 certainly-0.091
Token with
Token ID351
Feature activation-5.76e+00

Pos logit contributions
 benches+0.113
aster+0.106
 greet+0.105
 stroke+0.099
 Sue+0.098
Neg logit contributions
 important-0.115
 valuable-0.113
 best-0.106
 precious-0.090
 certainly-0.084
Token Mom
Token ID11254
Feature activation-4.82e+00

Pos logit contributions
 benches+0.140
 enclosure+0.138
hop+0.132
 greet+0.126
aster+0.124
Neg logit contributions
 important-0.203
 valuable-0.196
 best-0.195
 precious-0.167
 great-0.154
Token.
Token ID13
Feature activation+3.72e-01

Pos logit contributions
 benches+0.158
aster+0.146
 greet+0.145
 arrest+0.142
 Sue+0.138
Neg logit contributions
 important-0.134
 valuable-0.130
 best-0.125
 precious-0.099
 certainly-0.098
Token<|endoftext|>
Token ID50256
Feature activation-3.32e+00

Pos logit contributions
 important+0.081
 valuable+0.081
 best+0.073
 precious+0.064
 certainly+0.063
Neg logit contributions
 benches-0.077
 greet-0.072
aster-0.070
 Sue-0.070
 enclosure-0.069
TokenTim
Token ID14967
Feature activation-5.67e+00

Pos logit contributions
 benches+0.091
 greet+0.087
aster+0.085
 enclosure+0.082
 bench+0.082
Neg logit contributions
 important-0.101
 valuable-0.100
 best-0.095
 certainly-0.080
 precious-0.078
Token and
Token ID290
Feature activation-7.24e+00

Pos logit contributions
 benches+0.192
 greet+0.174
aster+0.168
 Sue+0.167
 Ann+0.165
Neg logit contributions
 important-0.152
 valuable-0.141
 best-0.138
 certainly-0.113
 precious-0.110
Token Mia
Token ID32189
Feature activation-4.33e+00

Pos logit contributions
 benches+0.180
 enclosure+0.160
 greet+0.151
 bench+0.141
hop+0.140
Neg logit contributions
 important-0.258
 valuable-0.252
 best-0.240
 certainly-0.210
 precious-0.209
Token went
Token ID1816
Feature activation-4.41e+00

Pos logit contributions
 benches+0.123
 greet+0.112
aster+0.110
 Sue+0.110
 enclosure+0.109
Neg logit contributions
 important-0.133
 valuable-0.130
 best-0.122
 certainly-0.106
 precious-0.106
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.117
aster+0.116
 greet+0.107
 restaurants+0.103
 Ann+0.103
Neg logit contributions
 important-0.143
 valuable-0.139
 best-0.137
 precious-0.109
 much-0.102
Token the
Token ID262
Feature activation-3.02e+00

Pos logit contributions
 benches+0.294
talk+0.288
aro+0.280
keeping+0.272
 slam+0.266
Neg logit contributions
 important-0.303
 best-0.292
 valuable-0.278
 great-0.263
 really-0.256
Token zoo
Token ID26626
Feature activation-5.70e+00

Pos logit contributions
 benches+0.062
 greet+0.058
aster+0.057
 Ann+0.055
 Sue+0.055
Neg logit contributions
 important-0.117
 valuable-0.115
 best-0.109
 precious-0.098
 certainly-0.094
Token with
Token ID351
Feature activation-5.43e+00

Pos logit contributions
 greet+0.202
 benches+0.199
aster+0.198
 stroke+0.197
 gra+0.190
Neg logit contributions
 important-0.127
 best-0.118
 valuable-0.116
 very-0.081
 precious-0.081
Token their
Token ID511
Feature activation-9.72e-01

Pos logit contributions
 benches+0.138
hop+0.137
 enclosure+0.136
round+0.126
 greet+0.122
Neg logit contributions
 important-0.177
 best-0.172
 valuable-0.165
 precious-0.145
 much-0.142
Token mom
Token ID1995
Feature activation-5.97e+00

Pos logit contributions
aster+0.032
 benches+0.030
hop+0.029
 greet+0.029
 stroke+0.027
Neg logit contributions
 best-0.028
 important-0.027
 valuable-0.027
 precious-0.023
 great-0.019
Token would
Token ID561
Feature activation+7.92e-01

Pos logit contributions
 valuable+0.077
 important+0.076
 best+0.067
 precious+0.059
 awful+0.059
Neg logit contributions
 benches-0.080
 greet-0.078
 Ann-0.075
 bench-0.074
 Sue-0.074
Token<|endoftext|>
Token ID50256
Feature activation-4.71e+00

Pos logit contributions
 important+0.023
 valuable+0.023
 best+0.021
 certainly+0.019
 precious+0.018
Neg logit contributions
 benches-0.025
 Sue-0.022
aster-0.022
 Ann-0.021
 bench-0.021
TokenJohn
Token ID7554
Feature activation-6.32e+00

Pos logit contributions
 benches+0.134
 greet+0.130
aster+0.130
 bench+0.123
 enclosure+0.123
Neg logit contributions
 important-0.136
 valuable-0.133
 best-0.129
 certainly-0.109
 truly-0.102
Token b
Token ID275
Feature activation-3.42e+00

Pos logit contributions
 benches+0.197
 greet+0.176
aster+0.174
 bench+0.171
 Sue+0.170
Neg logit contributions
 important-0.184
 valuable-0.180
 best-0.164
 certainly-0.146
 precious-0.141
Tokeniked
Token ID17951
Feature activation-6.93e+00

Pos logit contributions
 benches+0.104
 greet+0.097
 bench+0.093
 Ann+0.092
 slam+0.091
Neg logit contributions
 important-0.095
 valuable-0.092
 best-0.084
attered-0.074
 certainly-0.073
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.206
aster+0.204
 greet+0.195
 enclosure+0.194
 arrest+0.191
Neg logit contributions
 important-0.202
 valuable-0.196
 best-0.187
 certainly-0.148
 precious-0.147
Token the
Token ID262
Feature activation-2.80e+00

Pos logit contributions
talk+0.256
 benches+0.249
aro+0.245
keeping+0.237
ges+0.224
Neg logit contributions
 important-0.341
 valuable-0.332
 best-0.328
 great-0.299
 perfect-0.297
Token park
Token ID3952
Feature activation-4.42e+00

Pos logit contributions
 benches+0.054
 greet+0.049
aster+0.048
 Sue+0.047
 Ann+0.046
Neg logit contributions
 important-0.112
 valuable-0.112
 best-0.104
 precious-0.094
 certainly-0.093
Token with
Token ID351
Feature activation-5.79e+00

Pos logit contributions
 benches+0.143
aster+0.141
 greet+0.139
 arrest+0.139
 slam+0.131
Neg logit contributions
 important-0.112
 valuable-0.107
 best-0.107
 certainly-0.079
 truly-0.078
Token his
Token ID465
Feature activation-4.47e-01

Pos logit contributions
 benches+0.135
 enclosure+0.130
 slam+0.125
aster+0.118
 arrest+0.117
Neg logit contributions
 important-0.213
 valuable-0.207
 best-0.202
 certainly-0.171
 precious-0.171
Token friends
Token ID2460
Feature activation-5.26e+00

Pos logit contributions
 benches+0.013
aster+0.012
 greet+0.012
 enclosure+0.011
 Sue+0.011
Neg logit contributions
 important-0.014
 valuable-0.014
 best-0.013
 precious-0.011
 certainly-0.011
TokenJohn
Token ID7554
Feature activation-6.17e+00

Pos logit contributions
 benches+0.115
 greet+0.111
aster+0.111
 enclosure+0.105
 bench+0.104
Neg logit contributions
 important-0.124
 valuable-0.122
 best-0.116
 certainly-0.099
 precious-0.095
Token and
Token ID290
Feature activation-6.88e+00

Pos logit contributions
 benches+0.192
 greet+0.177
aster+0.174
 Sue+0.171
 bench+0.170
Neg logit contributions
 important-0.177
 valuable-0.174
 best-0.158
 certainly-0.137
 precious-0.136
Token his
Token ID465
Feature activation-8.71e-01

Pos logit contributions
 benches+0.165
 enclosure+0.146
 greet+0.137
 slam+0.136
aster+0.133
Neg logit contributions
 important-0.254
 valuable-0.245
 best-0.235
 certainly-0.202
 precious-0.199
Token mom
Token ID1995
Feature activation-7.60e+00

Pos logit contributions
 benches+0.020
 bench+0.018
 Sue+0.018
 greet+0.018
 Ann+0.018
Neg logit contributions
 valuable-0.032
 important-0.032
 best-0.027
 certainly-0.026
 impossible-0.026
Token went
Token ID1816
Feature activation-4.65e+00

Pos logit contributions
 benches+0.211
 greet+0.190
aster+0.188
 enclosure+0.183
 bench+0.182
Neg logit contributions
 important-0.245
 valuable-0.233
 best-0.231
 certainly-0.199
 precious-0.191
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.120
aster+0.112
 greet+0.107
 Sue+0.105
 Ann+0.105
Neg logit contributions
 important-0.159
 valuable-0.156
 best-0.147
 precious-0.125
 certainly-0.122
Token the
Token ID262
Feature activation-3.15e+00

Pos logit contributions
 benches+0.281
aro+0.253
talk+0.252
 slam+0.245
 enclosure+0.243
Neg logit contributions
 important-0.331
 valuable-0.314
 best-0.309
 great-0.280
 really-0.278
Token store
Token ID3650
Feature activation-4.87e+00

Pos logit contributions
 benches+0.061
 greet+0.057
aster+0.056
 Sue+0.054
 Ann+0.054
Neg logit contributions
 important-0.125
 valuable-0.123
 best-0.116
 precious-0.104
 certainly-0.102
Token.
Token ID13
Feature activation+4.61e-01

Pos logit contributions
 benches+0.142
 greet+0.133
aster+0.132
 Ann+0.126
 enclosure+0.126
Neg logit contributions
 important-0.150
 valuable-0.147
 best-0.136
 precious-0.116
 certainly-0.112
Token When
Token ID1649
Feature activation-8.18e-01

Pos logit contributions
 important+0.012
 valuable+0.012
 best+0.011
 certainly+0.009
 precious+0.009
Neg logit contributions
 benches-0.015
aster-0.014
 greet-0.014
 stroke-0.014
 enclosure-0.014
Token they
Token ID484
Feature activation+4.15e-01

Pos logit contributions
 benches+0.024
 greet+0.023
aster+0.022
 enclosure+0.022
 bench+0.022
Neg logit contributions
 important-0.025
 valuable-0.024
 best-0.022
 precious-0.019
 certainly-0.019
Token again
Token ID757
Feature activation-5.97e-01

Pos logit contributions
 important+0.028
 valuable+0.027
 best+0.025
 precious+0.022
 certainly+0.022
Neg logit contributions
 benches-0.022
 greet-0.021
aster-0.020
 enclosure-0.020
 Sue-0.020
Token.
Token ID13
Feature activation+2.16e+00

Pos logit contributions
 benches+0.018
 greet+0.016
aster+0.016
 enclosure+0.016
 Sue+0.016
Neg logit contributions
 important-0.018
 valuable-0.018
 best-0.016
 precious-0.014
 certainly-0.014
Token<|endoftext|>
Token ID50256
Feature activation-3.88e+00

Pos logit contributions
 valuable+0.065
 important+0.064
 best+0.058
 certainly+0.052
 precious+0.050
Neg logit contributions
 benches-0.064
aster-0.061
 greet-0.058
 bench-0.057
 restaurants-0.056
TokenJohn
Token ID7554
Feature activation-6.26e+00

Pos logit contributions
aster+0.119
 benches+0.119
 greet+0.115
 enclosure+0.111
 arrest+0.110
Neg logit contributions
 important-0.101
 valuable-0.100
 best-0.097
 certainly-0.082
 truly-0.075
Token went
Token ID1816
Feature activation-4.47e+00

Pos logit contributions
 benches+0.199
 greet+0.182
aster+0.179
 bench+0.176
 Sue+0.175
Neg logit contributions
 important-0.178
 valuable-0.173
 best-0.158
 certainly-0.138
 precious-0.136
Token to
Token ID284
Feature activation-1.07e+01

Pos logit contributions
 benches+0.122
aster+0.111
 greet+0.108
 Ann+0.106
 enclosure+0.104
Neg logit contributions
 important-0.149
 valuable-0.144
 best-0.136
 precious-0.113
 certainly-0.112
Token the
Token ID262
Feature activation-3.29e+00

Pos logit contributions
 benches+0.289
talk+0.279
aro+0.264
keeping+0.261
 slam+0.257
Neg logit contributions
 important-0.323
 valuable-0.298
 best-0.292
 really-0.274
 great-0.265
Token station
Token ID4429
Feature activation-5.80e+00

Pos logit contributions
 benches+0.067
 greet+0.062
aster+0.061
 Ann+0.058
 Sue+0.058
Neg logit contributions
 important-0.128
 valuable-0.127
 best-0.119
 precious-0.107
 certainly-0.104
Token with
Token ID351
Feature activation-5.81e+00

Pos logit contributions
 benches+0.176
 greet+0.160
aster+0.159
 stroke+0.153
 Ann+0.153
Neg logit contributions
 important-0.170
 valuable-0.163
 best-0.154
 certainly-0.127
 truly-0.122
Token his
Token ID465
Feature activation-6.51e-01

Pos logit contributions
 benches+0.129
 enclosure+0.119
 greet+0.112
 slam+0.109
 bench+0.109
Neg logit contributions
 important-0.225
 valuable-0.217
 best-0.210
 precious-0.182
 certainly-0.181
Token dad
Token ID9955
Feature activation-5.72e+00

Pos logit contributions
 benches+0.016
 greet+0.015
aster+0.015
 Sue+0.015
 bench+0.015
Neg logit contributions
 important-0.022
 valuable-0.022
 best-0.020
 precious-0.018
 certainly-0.018
Token proud
Token ID6613
Feature activation+5.65e+00

Pos logit contributions
 valuable+0.142
 important+0.140
 best+0.123
 virtue+0.107
 certainly+0.103
Neg logit contributions
 benches-0.199
 greet-0.197
 Sue-0.195
 Ann-0.194
 bench-0.191
Token.
Token ID13
Feature activation+2.18e+00

Pos logit contributions
 valuable+0.195
 important+0.195
 best+0.179
 certainly+0.159
 precious+0.159
Neg logit contributions
 benches-0.139
 greet-0.128
 Sue-0.126
aster-0.124
 Ann-0.124
Token She
Token ID1375
Feature activation+2.06e+00

Pos logit contributions
 important+0.069
 valuable+0.069
 best+0.062
 certainly+0.055
 precious+0.055
Neg logit contributions
 benches-0.060
 greet-0.056
aster-0.055
 enclosure-0.053
 bench-0.053
Token had
Token ID550
Feature activation+2.95e+00

Pos logit contributions
 important+0.070
 valuable+0.070
 best+0.064
 precious+0.056
 impossible+0.056
Neg logit contributions
 benches-0.050
 greet-0.048
 Sue-0.047
 Ann-0.046
 bench-0.046
Token learned
Token ID4499
Feature activation+2.04e+00

Pos logit contributions
 important+0.088
 valuable+0.088
 best+0.079
 precious+0.069
 certainly+0.069
Neg logit contributions
 benches-0.087
 greet-0.082
aster-0.080
 Sue-0.079
 bench-0.078
Token something
Token ID1223
Feature activation+7.49e+00

Pos logit contributions
 valuable+0.063
 important+0.063
 best+0.058
 precious+0.051
 certainly+0.051
Neg logit contributions
 benches-0.055
 Sue-0.051
 greet-0.051
 Ann-0.050
 bench-0.050
Token new
Token ID649
Feature activation+3.29e+00

Pos logit contributions
istered+0.172
 valuable+0.170
 important+0.165
 virtue+0.165
attered+0.161
Neg logit contributions
 Sue-0.203
 benches-0.200
 Ann-0.197
 bench-0.193
 greet-0.189
Token and
Token ID290
Feature activation-2.13e+00

Pos logit contributions
 valuable+0.114
 important+0.113
 best+0.102
 awful+0.096
 certainly+0.095
Neg logit contributions
 benches-0.073
 Sue-0.069
 Ann-0.067
 bench-0.067
 greet-0.066
Token it
Token ID340
Feature activation+4.71e+00

Pos logit contributions
 benches+0.071
 greet+0.069
 Sue+0.069
 Ann+0.068
 bench+0.066
Neg logit contributions
 valuable-0.049
 important-0.049
 best-0.043
 virtue-0.038
 certainly-0.037
Token was
Token ID373
Feature activation+5.36e+00

Pos logit contributions
 valuable+0.140
 important+0.136
 best+0.122
 virtue+0.117
istered+0.111
Neg logit contributions
 benches-0.119
 greet-0.118
 Ann-0.116
 Sue-0.116
 climb-0.116
Token very
Token ID845
Feature activation+8.57e+00

Pos logit contributions
 valuable+0.129
 important+0.123
 best+0.111
 virtue+0.101
der+0.100
Neg logit contributions
 Sue-0.171
 greet-0.166
 benches-0.166
 Ann-0.162
aster-0.161
Token home
Token ID1363
Feature activation-3.58e+00

Pos logit contributions
aster+0.106
 benches+0.105
 greet+0.100
 Ann+0.098
 restaurants+0.093
Neg logit contributions
 important-0.123
 valuable-0.122
 best-0.115
 precious-0.097
 certainly-0.089
Token and
Token ID290
Feature activation-4.61e+00

Pos logit contributions
 benches+0.110
 greet+0.103
aster+0.102
 enclosure+0.099
 Ann+0.097
Neg logit contributions
 important-0.105
 valuable-0.104
 best-0.095
 precious-0.081
 certainly-0.077
Token the
Token ID262
Feature activation-9.94e-01

Pos logit contributions
 benches+0.132
aster+0.123
 enclosure+0.121
 greet+0.117
 restaurants+0.111
Neg logit contributions
 important-0.144
 valuable-0.140
 best-0.130
 precious-0.113
 certainly-0.109
Token puppy
Token ID26188
Feature activation-4.72e+00

Pos logit contributions
 benches+0.024
 bench+0.023
 Sue+0.022
 greet+0.022
 walk+0.021
Neg logit contributions
 valuable-0.034
 important-0.033
 best-0.030
 certainly-0.029
 impossible-0.028
Token was
Token ID373
Feature activation+4.62e+00

Pos logit contributions
 benches+0.152
aster+0.144
 greet+0.141
 Sue+0.136
 enclosure+0.135
Neg logit contributions
 important-0.130
 valuable-0.128
 best-0.117
 certainly-0.099
 precious-0.098
Token very
Token ID845
Feature activation+7.91e+00

Pos logit contributions
 valuable+0.096
 important+0.094
 best+0.079
 precious+0.066
 virtue+0.065
Neg logit contributions
 benches-0.172
 Sue-0.168
 greet-0.167
 Ann-0.165
 bench-0.163
Token happy
Token ID3772
Feature activation+6.63e+00

Pos logit contributions
 valuable+0.198
 important+0.195
 best+0.175
 certainly+0.154
 virtue+0.154
Neg logit contributions
 Sue-0.250
 benches-0.249
 greet-0.246
 bench-0.241
 Ann-0.240
Token.
Token ID13
Feature activation+2.08e+00

Pos logit contributions
 valuable+0.221
 important+0.220
 best+0.199
 precious+0.178
 certainly+0.177
Neg logit contributions
 benches-0.173
 greet-0.160
 Sue-0.159
 Ann-0.156
 bench-0.155
Token They
Token ID1119
Feature activation+1.12e+00

Pos logit contributions
 important+0.068
 valuable+0.068
 best+0.061
 certainly+0.054
 precious+0.054
Neg logit contributions
 benches-0.056
 greet-0.052
aster-0.051
 enclosure-0.050
 bench-0.050
Token wanted
Token ID2227
Feature activation+3.90e+00

Pos logit contributions
 important+0.034
 valuable+0.034
 best+0.031
 certainly+0.028
 precious+0.027
Neg logit contributions
 benches-0.033
aster-0.031
 enclosure-0.029
 greet-0.029
 Ann-0.028
Token to
Token ID284
Feature activation-6.79e+00

Pos logit contributions
 important+0.135
 valuable+0.132
 best+0.121
 precious+0.105
 truly+0.098
Neg logit contributions
 benches-0.102
 enclosure-0.101
aster-0.099
 slam-0.094
 greet-0.093
Token her
Token ID607
Feature activation+2.96e+00

Pos logit contributions
 valuable+0.035
 important+0.035
 best+0.032
 precious+0.029
 certainly+0.029
Neg logit contributions
 benches-0.022
 greet-0.021
aster-0.020
 Sue-0.020
 Ann-0.020
Token feel
Token ID1254
Feature activation+4.91e+00

Pos logit contributions
 important+0.094
 valuable+0.093
 best+0.084
 precious+0.075
 certainly+0.074
Neg logit contributions
 benches-0.082
 greet-0.076
aster-0.074
 Sue-0.074
 bench-0.074
Token sad
Token ID6507
Feature activation+6.24e+00

Pos logit contributions
 valuable+0.106
 important+0.099
 virtue+0.098
 best+0.096
ometer+0.094
Neg logit contributions
 sit-0.144
 greet-0.143
 stroke-0.141
aster-0.139
 Sue-0.139
Token.
Token ID13
Feature activation+2.43e+00

Pos logit contributions
 valuable+0.235
 important+0.227
 best+0.205
 certainly+0.199
 precious+0.197
Neg logit contributions
 benches-0.127
 Sue-0.121
 bench-0.115
 Ann-0.114
 sit-0.113
Token Lily
Token ID20037
Feature activation-8.11e-01

Pos logit contributions
 important+0.076
 valuable+0.076
 best+0.068
 certainly+0.061
 precious+0.059
Neg logit contributions
 benches-0.069
 greet-0.064
aster-0.062
 bench-0.060
 enclosure-0.060
Token didn
Token ID1422
Feature activation+7.39e+00

Pos logit contributions
 benches+0.021
 greet+0.021
 Sue+0.020
 sit+0.020
 climb+0.020
Neg logit contributions
 valuable-0.026
 important-0.025
 best-0.023
 precious-0.021
 virtue-0.020
Token't
Token ID470
Feature activation+3.61e+00

Pos logit contributions
 valuable+0.292
 important+0.290
 best+0.274
 precious+0.259
 virtue+0.243
Neg logit contributions
aster-0.137
 Sue-0.127
 Ann-0.126
 climb-0.123
 benches-0.121
Token like
Token ID588
Feature activation+1.30e+00

Pos logit contributions
 valuable+0.126
 important+0.125
 best+0.117
 certainly+0.106
 precious+0.105
Neg logit contributions
 benches-0.081
 greet-0.081
 Sue-0.079
aster-0.078
 climb-0.076
Token Tim
Token ID5045
Feature activation-2.13e+00

Pos logit contributions
 important+0.046
 valuable+0.044
 best+0.039
 awful+0.036
 precious+0.035
Neg logit contributions
 enclosure-0.034
 benches-0.033
 greet-0.032
talk-0.029
aster-0.029
Tokenmy
Token ID1820
Feature activation+9.51e-01

Pos logit contributions
 benches+0.049
 greet+0.045
aster+0.043
 enclosure+0.043
 bench+0.041
Neg logit contributions
 important-0.080
 valuable-0.078
 best-0.075
 certainly-0.065
 precious-0.065
Token and
Token ID290
Feature activation-3.21e+00

Pos logit contributions
 important+0.023
 valuable+0.023
 best+0.020
 precious+0.017
 certainly+0.016
Neg logit contributions
 benches-0.034
 greet-0.032
aster-0.031
 enclosure-0.031
 Sue-0.031
Token Sam
Token ID3409
Feature activation-2.11e+00

Pos logit contributions
 benches+0.043
 greet+0.037
aster+0.036
 enclosure+0.034
 Sue+0.034
Neg logit contributions
 important-0.122
 valuable-0.121
 best-0.113
 precious-0.104
 certainly-0.103
Token.
Token ID13
Feature activation+2.94e+00

Pos logit contributions
 benches+0.051
 Sue+0.049
 Ann+0.047
 greet+0.047
 bench+0.046
Neg logit contributions
 valuable-0.072
 important-0.071
 best-0.064
 certainly-0.059
 precious-0.059
Token You
Token ID921
Feature activation+6.41e+00

Pos logit contributions
 important+0.071
 valuable+0.070
 best+0.061
 precious+0.052
 certainly+0.051
Neg logit contributions
 benches-0.104
 greet-0.098
aster-0.097
 Sue-0.096
 bench-0.095
Token are
Token ID389
Feature activation+4.81e+00

Pos logit contributions
 valuable+0.270
 important+0.269
 best+0.231
der+0.227
 virtue+0.224
Neg logit contributions
 benches-0.100
 greet-0.098
 sit-0.098
 climb-0.097
 enter-0.093
Token good
Token ID922
Feature activation+5.28e+00

Pos logit contributions
 valuable+0.135
 important+0.128
der+0.113
 virtue+0.107
 best+0.102
Neg logit contributions
 Sue-0.145
 benches-0.139
 Ann-0.138
 greet-0.136
 sit-0.133
Token too
Token ID1165
Feature activation+8.90e+00

Pos logit contributions
 valuable+0.167
 important+0.163
 best+0.143
 certainly+0.133
 precious+0.131
Neg logit contributions
 benches-0.149
 Sue-0.138
 greet-0.138
 Ann-0.136
aster-0.134
Token.
Token ID13
Feature activation+3.33e+00

Pos logit contributions
 valuable+0.328
 important+0.321
 best+0.297
 certainly+0.291
 virtue+0.275
Neg logit contributions
 benches-0.183
 Sue-0.182
 Ann-0.175
 bench-0.162
 greet-0.160
Token You
Token ID921
Feature activation+6.49e+00

Pos logit contributions
 important+0.083
 valuable+0.083
 best+0.073
 precious+0.062
 certainly+0.061
Neg logit contributions
 benches-0.115
 greet-0.108
aster-0.106
 bench-0.105
 Sue-0.105
Token are
Token ID389
Feature activation+4.82e+00

Pos logit contributions
 valuable+0.250
 important+0.249
 best+0.216
der+0.209
 virtue+0.206
Neg logit contributions
 sit-0.121
 greet-0.120
 climb-0.120
 benches-0.117
 enter-0.116
Token my
Token ID616
Feature activation+3.12e+00

Pos logit contributions
 valuable+0.135
 important+0.128
der+0.109
 virtue+0.106
 best+0.106
Neg logit contributions
 Sue-0.143
 benches-0.139
 Ann-0.137
 greet-0.137
 sit-0.132
Token friend
Token ID1545
Feature activation-1.38e+00

Pos logit contributions
 valuable+0.080
 important+0.078
 awful+0.065
 certainly+0.062
der+0.061
Neg logit contributions
 benches-0.093
 climb-0.093
 Sue-0.093
 bench-0.090
 walk-0.090
Token warned
Token ID7728
Feature activation-5.23e-01

Pos logit contributions
 benches+0.150
 greet+0.141
 Sue+0.139
aster+0.137
 bench+0.137
Neg logit contributions
 valuable-0.180
 important-0.179
 best-0.161
 precious-0.144
 certainly-0.141
Token him
Token ID683
Feature activation+1.87e+00

Pos logit contributions
 benches+0.010
 greet+0.010
 enclosure+0.010
 arrest+0.009
 bench+0.009
Neg logit contributions
 important-0.021
 best-0.020
 valuable-0.020
 awful-0.017
 certainly-0.017
Token to
Token ID284
Feature activation-7.70e+00

Pos logit contributions
 valuable+0.060
 important+0.060
 best+0.054
 precious+0.050
 certainly+0.049
Neg logit contributions
 benches-0.049
 Sue-0.046
 Ann-0.045
 greet-0.045
 bench-0.045
Token not
Token ID407
Feature activation+3.39e+00

Pos logit contributions
 benches+0.220
 enclosure+0.195
 greet+0.191
 Ann+0.186
aster+0.186
Neg logit contributions
 important-0.250
 valuable-0.239
 best-0.225
 certainly-0.201
 precious-0.189
Token go
Token ID467
Feature activation-1.94e+00

Pos logit contributions
 important+0.130
 valuable+0.128
 best+0.118
 precious+0.104
 certainly+0.104
Neg logit contributions
 benches-0.077
 greet-0.067
 enclosure-0.066
aster-0.065
 Ann-0.064
Token too
Token ID1165
Feature activation+7.40e+00

Pos logit contributions
 benches+0.056
 greet+0.052
aster+0.051
 enclosure+0.051
 Ann+0.051
Neg logit contributions
 important-0.061
 valuable-0.059
 best-0.055
 precious-0.047
 impossible-0.045
Token far
Token ID1290
Feature activation+1.36e+00

Pos logit contributions
 important+0.236
 valuable+0.232
 best+0.209
 precious+0.185
 impossible+0.174
Neg logit contributions
 benches-0.212
 greet-0.202
 enclosure-0.193
aster-0.192
 Ann-0.189
Token.
Token ID13
Feature activation+1.84e+00

Pos logit contributions
 important+0.043
 valuable+0.042
 best+0.039
 precious+0.034
 impossible+0.033
Neg logit contributions
 benches-0.038
 greet-0.036
aster-0.035
 Ann-0.035
 enclosure-0.035
Token\n
Token ID198
Feature activation-5.71e-01

Pos logit contributions
 important+0.064
 valuable+0.063
 best+0.060
 certainly+0.056
 shouldn+0.051
Neg logit contributions
aster-0.041
 benches-0.041
 greet-0.040
 arrest-0.035
 stroke-0.035
Token\n
Token ID198
Feature activation-4.70e-01

Pos logit contributions
 benches+0.020
 greet+0.019
aster+0.018
 arrest+0.018
 enclosure+0.018
Neg logit contributions
 important-0.014
 valuable-0.014
 best-0.012
 certainly-0.011
 much-0.010
TokenAs
Token ID1722
Feature activation+1.23e-01

Pos logit contributions
 benches+0.010
 greet+0.009
aster+0.009
 enclosure+0.009
 Sue+0.009
Neg logit contributions
 important-0.018
 valuable-0.018
 best-0.016
 certainly-0.015
 precious-0.015
Token kept
Token ID4030
Feature activation-6.12e-01

Pos logit contributions
 benches+0.105
 greet+0.102
aster+0.097
 Ann+0.096
 enclosure+0.096
Neg logit contributions
 valuable-0.104
 important-0.104
 best-0.092
 precious-0.081
 virtue-0.079
Token singing
Token ID13777
Feature activation+8.26e-01

Pos logit contributions
 benches+0.013
 greet+0.012
aster+0.012
 Sue+0.012
 Ann+0.011
Neg logit contributions
 valuable-0.023
 important-0.023
 best-0.021
 precious-0.019
 certainly-0.019
Token her
Token ID607
Feature activation+2.01e+00

Pos logit contributions
 valuable+0.028
 important+0.027
 best+0.025
 precious+0.022
 certainly+0.022
Neg logit contributions
 benches-0.021
 greet-0.020
 Sue-0.019
 Ann-0.019
aster-0.019
Token happy
Token ID3772
Feature activation+6.33e+00

Pos logit contributions
 valuable+0.060
 important+0.060
 best+0.053
 certainly+0.047
 precious+0.046
Neg logit contributions
 benches-0.059
 greet-0.055
 bench-0.054
aster-0.053
 enclosure-0.053
Token songs
Token ID7259
Feature activation-3.26e-01

Pos logit contributions
 valuable+0.199
 important+0.198
 best+0.177
 certainly+0.159
 precious+0.159
Neg logit contributions
 benches-0.175
 greet-0.164
 bench-0.159
aster-0.158
 Sue-0.158
Token each
Token ID1123
Feature activation+1.98e+00

Pos logit contributions
 benches+0.008
 greet+0.008
 Sue+0.007
 bench+0.007
 Ann+0.007
Neg logit contributions
 important-0.011
 valuable-0.011
 best-0.010
 precious-0.009
 certainly-0.009
Token morning
Token ID3329
Feature activation+4.23e-01

Pos logit contributions
 important+0.051
 valuable+0.049
 best+0.046
 certainly+0.037
 precious+0.036
Neg logit contributions
 benches-0.069
 greet-0.063
 enclosure-0.062
 Sue-0.062
 Ann-0.061
Token in
Token ID287
Feature activation-6.95e+00

Pos logit contributions
 valuable+0.014
 important+0.014
 best+0.012
 precious+0.011
 certainly+0.011
Neg logit contributions
 benches-0.011
 greet-0.011
aster-0.010
 Sue-0.010
 bench-0.010
Token the
Token ID262
Feature activation-1.48e+00

Pos logit contributions
 benches+0.143
 enclosure+0.125
 greet+0.120
 bench+0.119
aster+0.118
Neg logit contributions
 important-0.272
 valuable-0.268
 best-0.256
 precious-0.225
 certainly-0.225
Token quiet
Token ID5897
Feature activation+1.25e+00

Pos logit contributions
 benches+0.040
 greet+0.038
 bench+0.037
aster+0.036
 enclosure+0.036
Neg logit contributions
 valuable-0.047
 important-0.045
 best-0.040
 certainly-0.039
 precious-0.037
Token forest
Token ID8222
Feature activation-3.32e+00

Pos logit contributions
 important+0.039
 valuable+0.039
 best+0.035
 precious+0.031
 certainly+0.030
Neg logit contributions
 benches-0.035
 greet-0.033
aster-0.033
 Sue-0.032
 Ann-0.032
Token take
Token ID1011
Feature activation-6.67e-01

Pos logit contributions
 important+0.111
 valuable+0.111
 best+0.102
 certainly+0.093
 precious+0.092
Neg logit contributions
 benches-0.058
 greet-0.052
aster-0.051
 Sue-0.050
 enclosure-0.050
Token my
Token ID616
Feature activation+8.95e-01

Pos logit contributions
 benches+0.018
 greet+0.017
aster+0.016
 bench+0.016
 enclosure+0.016
Neg logit contributions
 important-0.022
 valuable-0.022
 best-0.020
 precious-0.018
 certainly-0.017
Token boot
Token ID6297
Feature activation+1.22e+00

Pos logit contributions
 important+0.029
 valuable+0.028
 best+0.027
 precious+0.024
 impossible+0.021
Neg logit contributions
 benches-0.024
aster-0.023
 greet-0.023
 Ann-0.023
 Sue-0.022
Token.
Token ID13
Feature activation+2.39e+00

Pos logit contributions
 important+0.038
 valuable+0.038
 best+0.034
 precious+0.030
 certainly+0.030
Neg logit contributions
 benches-0.035
 greet-0.032
aster-0.031
 Sue-0.031
 enclosure-0.031
Token It
Token ID632
Feature activation+5.05e+00

Pos logit contributions
 important+0.068
 valuable+0.067
 best+0.060
 precious+0.052
 certainly+0.051
Neg logit contributions
 benches-0.073
 Sue-0.070
 greet-0.070
aster-0.069
 Ann-0.069
Token is
Token ID318
Feature activation+5.67e+00

Pos logit contributions
 valuable+0.152
 important+0.146
 best+0.122
 virtue+0.119
 fantastic+0.114
Neg logit contributions
 climb-0.135
 Sue-0.133
 enclosure-0.132
 Ann-0.131
 enter-0.129
Token mine
Token ID6164
Feature activation+4.09e+00

Pos logit contributions
 valuable+0.152
 important+0.148
 best+0.136
 certainly+0.124
 virtue+0.119
Neg logit contributions
 Sue-0.167
 benches-0.166
aster-0.158
 Ann-0.157
 greet-0.156
Token.
Token ID13
Feature activation+2.58e+00

Pos logit contributions
 valuable+0.137
 important+0.135
 best+0.121
 certainly+0.111
 precious+0.109
Neg logit contributions
 benches-0.105
 Sue-0.098
 Ann-0.097
 bench-0.095
 greet-0.093
Token Go
Token ID1514
Feature activation-1.81e+00

Pos logit contributions
 important+0.072
 valuable+0.072
 best+0.064
 precious+0.055
 certainly+0.055
Neg logit contributions
 benches-0.082
 greet-0.076
 bench-0.074
aster-0.074
 enclosure-0.073
Token away
Token ID1497
Feature activation-4.29e-02

Pos logit contributions
 benches+0.052
aster+0.044
 Sue+0.043
 Ann+0.043
 restaurants+0.043
Neg logit contributions
 important-0.057
 valuable-0.055
 best-0.052
 really-0.046
 virtue-0.046
Token."
Token ID526
Feature activation-1.25e+00

Pos logit contributions
 benches+0.001
aster+0.001
 greet+0.001
 gra+0.001
 stroke+0.001
Neg logit contributions
 important-0.001
 valuable-0.001
 best-0.001
 precious-0.001
 virtue-0.001
Token she
Token ID673
Feature activation+2.43e+00

Pos logit contributions
 benches+0.093
 greet+0.087
 enclosure+0.086
aster+0.083
 bench+0.079
Neg logit contributions
 important-0.133
 valuable-0.131
 best-0.123
 precious-0.107
 certainly-0.106
Token gave
Token ID2921
Feature activation-2.91e+00

Pos logit contributions
 valuable+0.089
 important+0.089
 best+0.081
 precious+0.073
 certainly+0.072
Neg logit contributions
 benches-0.055
 greet-0.050
aster-0.049
 Sue-0.049
 bench-0.049
Token him
Token ID683
Feature activation+7.24e-01

Pos logit contributions
 benches+0.063
 enclosure+0.061
 bench+0.055
 slam+0.054
 greet+0.054
Neg logit contributions
 important-0.113
 valuable-0.109
 best-0.108
 precious-0.092
 certainly-0.091
Token the
Token ID262
Feature activation-1.87e-02

Pos logit contributions
 important+0.019
 valuable+0.018
 best+0.017
 precious+0.014
 certainly+0.013
Neg logit contributions
 benches-0.025
 greet-0.023
 enclosure-0.023
 bench-0.022
aster-0.022
Token dinner
Token ID8073
Feature activation+6.30e-01

Pos logit contributions
 benches+0.000
 greet+0.000
 bench+0.000
aster+0.000
 Sue+0.000
Neg logit contributions
 valuable-0.001
 important-0.001
 best-0.001
 certainly-0.001
 precious-0.001
Token.
Token ID13
Feature activation+2.00e+00

Pos logit contributions
 important+0.021
 valuable+0.021
 best+0.019
 precious+0.017
 certainly+0.017
Neg logit contributions
 benches-0.016
 greet-0.015
 bench-0.015
aster-0.015
 Sue-0.015
Token Her
Token ID2332
Feature activation+1.97e+00

Pos logit contributions
 important+0.065
 valuable+0.064
 best+0.057
 certainly+0.051
 precious+0.050
Neg logit contributions
 benches-0.055
 greet-0.051
aster-0.050
 enclosure-0.049
 bench-0.048
Token dad
Token ID9955
Feature activation-4.61e+00

Pos logit contributions
 valuable+0.078
 important+0.075
 best+0.067
 impossible+0.067
 certainly+0.066
Neg logit contributions
 benches-0.037
 bench-0.037
 Sue-0.033
aster-0.032
 Ann-0.032
Token was
Token ID373
Feature activation+4.83e+00

Pos logit contributions
 benches+0.136
 greet+0.126
aster+0.124
 Sue+0.123
 enclosure+0.122
Neg logit contributions
 important-0.139
 valuable-0.138
 best-0.124
 precious-0.109
 certainly-0.108
Token very
Token ID845
Feature activation+8.19e+00

Pos logit contributions
 valuable+0.103
 important+0.099
 best+0.084
 virtue+0.072
 precious+0.072
Neg logit contributions
 benches-0.174
 Sue-0.173
 greet-0.171
aster-0.169
 Ann-0.169
Token happy
Token ID3772
Feature activation+7.09e+00

Pos logit contributions
 valuable+0.181
 important+0.176
 best+0.161
der+0.149
 certainly+0.145
Neg logit contributions
 Sue-0.274
 greet-0.269
 bench-0.264
 climb-0.262
 Ann-0.262
Token mom
Token ID1995
Feature activation-6.01e+00

Pos logit contributions
 benches+0.105
 Sue+0.098
 Ann+0.096
 greet+0.095
aster+0.095
Neg logit contributions
 valuable-0.170
 important-0.169
 best-0.151
 precious-0.139
 certainly-0.138
Token said
Token ID531
Feature activation-5.64e-01

Pos logit contributions
 benches+0.174
 greet+0.160
aster+0.157
 enclosure+0.154
 bench+0.152
Neg logit contributions
 important-0.187
 valuable-0.184
 best-0.168
 certainly-0.147
 precious-0.147
Token they
Token ID484
Feature activation+2.38e+00

Pos logit contributions
 benches+0.014
 greet+0.013
aster+0.013
 Sue+0.013
 Ann+0.013
Neg logit contributions
 important-0.019
 valuable-0.019
 best-0.017
 precious-0.016
 certainly-0.015
Token had
Token ID550
Feature activation+3.58e+00

Pos logit contributions
 valuable+0.081
 important+0.079
 best+0.072
 virtue+0.066
 precious+0.065
Neg logit contributions
 sit-0.055
 benches-0.053
 greet-0.053
 Sue-0.053
 buy-0.052
Token no
Token ID645
Feature activation+3.26e+00

Pos logit contributions
 valuable+0.144
 important+0.143
 best+0.133
 certainly+0.123
 precious+0.122
Neg logit contributions
 benches-0.064
aster-0.059
 Sue-0.057
 greet-0.056
 bench-0.056
Token spaghetti
Token ID48241
Feature activation+3.57e+00

Pos logit contributions
 valuable+0.112
 important+0.107
 best+0.094
 certainly+0.094
 virtue+0.091
Neg logit contributions
 benches-0.076
aster-0.074
 greet-0.072
 Sue-0.071
 bench-0.069
Token left
Token ID1364
Feature activation-3.91e+00

Pos logit contributions
 valuable+0.121
 important+0.120
 best+0.108
 precious+0.098
 certainly+0.097
Neg logit contributions
 benches-0.090
 greet-0.083
 Sue-0.082
aster-0.082
 bench-0.081
Token.
Token ID13
Feature activation+1.75e+00

Pos logit contributions
 benches+0.112
 greet+0.107
aster+0.102
 enclosure+0.101
 stroke+0.099
Neg logit contributions
 important-0.121
 valuable-0.116
 best-0.111
 precious-0.093
 certainly-0.091
Token She
Token ID1375
Feature activation+1.71e+00

Pos logit contributions
 important+0.061
 valuable+0.061
 best+0.055
 precious+0.049
 certainly+0.049
Neg logit contributions
 benches-0.044
 greet-0.040
 Sue-0.039
aster-0.039
 Ann-0.039
Token said
Token ID531
Feature activation+2.01e-01

Pos logit contributions
 valuable+0.060
 important+0.059
 best+0.055
 impossible+0.049
 virtue+0.049
Neg logit contributions
 Sue-0.038
 Ann-0.037
 benches-0.037
 greet-0.037
 bench-0.036
Token they
Token ID484
Feature activation+2.68e+00

Pos logit contributions
 important+0.006
 valuable+0.006
 best+0.006
 precious+0.005
 certainly+0.005
Neg logit contributions
 benches-0.006
 greet-0.005
 enclosure-0.005
aster-0.005
 bench-0.005
Token the
Token ID262
Feature activation+2.74e-01

Pos logit contributions
 benches+0.087
 greet+0.082
aster+0.079
 enclosure+0.078
 bench+0.076
Neg logit contributions
 important-0.160
 valuable-0.157
 best-0.149
 precious-0.133
 certainly-0.128
Token flowers
Token ID12734
Feature activation-3.22e+00

Pos logit contributions
 important+0.009
 valuable+0.009
 best+0.008
 certainly+0.007
 precious+0.007
Neg logit contributions
 benches-0.008
 greet-0.007
 bench-0.007
aster-0.007
 Sue-0.007
Token.
Token ID13
Feature activation+1.71e+00

Pos logit contributions
 benches+0.090
 greet+0.085
aster+0.084
 Ann+0.082
 Sue+0.082
Neg logit contributions
 important-0.099
 valuable-0.098
 best-0.091
 precious-0.078
 certainly-0.077
Token They
Token ID1119
Feature activation+1.03e+00

Pos logit contributions
 valuable+0.047
 important+0.046
 best+0.041
 precious+0.035
 certainly+0.035
Neg logit contributions
 benches-0.055
 greet-0.051
aster-0.051
 enclosure-0.050
 Sue-0.049
Token could
Token ID714
Feature activation+9.13e-02

Pos logit contributions
 important+0.029
 valuable+0.029
 best+0.026
 precious+0.023
 certainly+0.023
Neg logit contributions
 benches-0.032
 greet-0.030
aster-0.029
 enclosure-0.029
 Sue-0.029
Token not
Token ID407
Feature activation+3.26e+00

Pos logit contributions
 important+0.002
 valuable+0.002
 best+0.002
 certainly+0.001
 precious+0.001
Neg logit contributions
 benches-0.004
 enclosure-0.003
 restaurants-0.003
aster-0.003
hop-0.003
Token see
Token ID766
Feature activation-4.08e+00

Pos logit contributions
 important+0.118
 valuable+0.115
 best+0.110
 truly+0.103
 really+0.102
Neg logit contributions
 benches-0.069
 enclosure-0.061
 restaurants-0.059
 Ann-0.058
 arrest-0.056
Token the
Token ID262
Feature activation+2.97e-01

Pos logit contributions
 benches+0.080
 greet+0.078
aster+0.077
 enclosure+0.073
 bench+0.070
Neg logit contributions
 important-0.157
 valuable-0.153
 best-0.150
 precious-0.133
 awful-0.127
Token shed
Token ID14999
Feature activation-2.24e+00

Pos logit contributions
 important+0.010
 valuable+0.010
 best+0.009
 precious+0.008
 certainly+0.008
Neg logit contributions
 benches-0.007
 greet-0.007
aster-0.007
 Sue-0.007
 enclosure-0.007
Token.
Token ID13
Feature activation+2.02e+00

Pos logit contributions
 benches+0.064
 greet+0.060
aster+0.058
 Sue+0.058
 Ann+0.057
Neg logit contributions
 important-0.069
 valuable-0.069
 best-0.063
 precious-0.055
 certainly-0.054
Token They
Token ID1119
Feature activation+1.04e+00

Pos logit contributions
 valuable+0.058
 important+0.058
 best+0.052
 precious+0.045
 certainly+0.045
Neg logit contributions
 benches-0.061
 greet-0.058
aster-0.057
 enclosure-0.056
 bench-0.055
Token,
Token ID11
Feature activation-4.44e+00

Pos logit contributions
 benches+0.053
 greet+0.049
aster+0.048
 enclosure+0.048
 bench+0.047
Neg logit contributions
 important-0.058
 valuable-0.058
 best-0.053
 precious-0.046
 certainly-0.045
Token as
Token ID355
Feature activation+2.67e-01

Pos logit contributions
 benches+0.127
 greet+0.115
aster+0.114
 enclosure+0.113
 restaurants+0.110
Neg logit contributions
 important-0.136
 valuable-0.131
 best-0.125
 certainly-0.109
 precious-0.104
Token they
Token ID484
Feature activation+5.89e-01

Pos logit contributions
 important+0.010
 valuable+0.010
 best+0.009
 precious+0.008
 certainly+0.008
Neg logit contributions
 benches-0.006
 greet-0.005
aster-0.005
 Sue-0.005
 bench-0.005
Token passed
Token ID3804
Feature activation-2.85e+00

Pos logit contributions
 valuable+0.018
 important+0.018
 best+0.018
 certainly+0.016
 truly+0.015
Neg logit contributions
 benches-0.015
 greet-0.014
 restaurants-0.014
keeping-0.014
 enclosure-0.014
Token the
Token ID262
Feature activation-6.50e-01

Pos logit contributions
aster+0.076
 greet+0.073
 benches+0.072
 enclosure+0.072
 arrest+0.069
Neg logit contributions
 important-0.093
 valuable-0.090
 best-0.087
 precious-0.074
 brightly-0.070
Token playground
Token ID24817
Feature activation-2.70e+00

Pos logit contributions
 benches+0.009
 greet+0.008
aster+0.008
 Sue+0.008
 Ann+0.008
Neg logit contributions
 important-0.029
 valuable-0.029
 best-0.028
 precious-0.025
 certainly-0.025
Token.
Token ID13
Feature activation+1.52e+00

Pos logit contributions
 benches+0.082
aster+0.081
 greet+0.081
 stroke+0.075
 restaurants+0.074
Neg logit contributions
 important-0.076
 valuable-0.074
 best-0.070
 precious-0.057
 certainly-0.055
Token\n
Token ID198
Feature activation-6.39e-01

Pos logit contributions
 important+0.049
 valuable+0.048
 best+0.044
 certainly+0.039
 precious+0.038
Neg logit contributions
 benches-0.042
 greet-0.039
aster-0.039
 enclosure-0.038
 bench-0.037
Token\n
Token ID198
Feature activation-5.09e-01

Pos logit contributions
 benches+0.020
 greet+0.018
 enclosure+0.018
aster+0.017
 bench+0.017
Neg logit contributions
 important-0.019
 valuable-0.019
 best-0.017
 certainly-0.015
 precious-0.015
Token"
Token ID1
Feature activation-2.98e-01

Pos logit contributions
 benches+0.014
 greet+0.013
 bench+0.013
aster+0.013
 Sue+0.013
Neg logit contributions
 valuable-0.016
 important-0.015
 best-0.014
 precious-0.013
 impossible-0.012
TokenOkay
Token ID16454
Feature activation+3.89e+00

Pos logit contributions
 benches+0.008
 greet+0.007
 bench+0.007
 Sue+0.007
aster+0.007
Neg logit contributions
 important-0.010
 valuable-0.010
 best-0.009
 precious-0.008
 certainly-0.008
Token Everything
Token ID11391
Feature activation+3.87e+00

Pos logit contributions
 important+0.062
 valuable+0.062
 best+0.055
 certainly+0.048
 precious+0.048
Neg logit contributions
 benches-0.069
 greet-0.064
aster-0.062
 bench-0.061
 enclosure-0.060
Token is
Token ID318
Feature activation+5.26e+00

Pos logit contributions
 valuable+0.108
 important+0.104
 best+0.093
 virtue+0.083
 precious+0.082
Neg logit contributions
 benches-0.112
 greet-0.108
 Ann-0.107
 Sue-0.106
 climb-0.105
Token harmless
Token ID23585
Feature activation+4.76e+00

Pos logit contributions
 valuable+0.119
 important+0.110
 best+0.102
 virtue+0.093
 certainly+0.092
Neg logit contributions
 Sue-0.175
 greet-0.167
 benches-0.166
 Ann-0.163
aster-0.157
Token."
Token ID526
Feature activation-1.74e+00

Pos logit contributions
 valuable+0.170
 important+0.167
 best+0.153
 certainly+0.142
 precious+0.140
Neg logit contributions
 benches-0.110
 Sue-0.100
 Ann-0.097
 greet-0.097
 bench-0.095
Token\n
Token ID198
Feature activation-9.52e-01

Pos logit contributions
 benches+0.054
 greet+0.049
 enclosure+0.048
aster+0.047
 restaurants+0.047
Neg logit contributions
 important-0.048
 valuable-0.048
 best-0.044
 certainly-0.038
 precious-0.035
Token\n
Token ID198
Feature activation-7.16e-01

Pos logit contributions
 benches+0.031
 greet+0.028
 enclosure+0.028
 restaurants+0.028
 arrest+0.027
Neg logit contributions
 valuable-0.026
 important-0.026
 best-0.022
 certainly-0.020
 precious-0.019
TokenThe
Token ID464
Feature activation-2.29e-01

Pos logit contributions
 benches+0.020
 greet+0.018
 enclosure+0.017
aster+0.017
 bench+0.017
Neg logit contributions
 important-0.023
 valuable-0.023
 best-0.020
 certainly-0.018
 precious-0.018
Token boy
Token ID2933
Feature activation-2.86e+00

Pos logit contributions
 benches+0.006
 greet+0.006
 bench+0.006
aster+0.006
 Sue+0.006
Neg logit contributions
 valuable-0.007
 important-0.007
 best-0.006
 certainly-0.006
 precious-0.006
Token relaxed
Token ID18397
Feature activation+2.39e+00

Pos logit contributions
 benches+0.084
 greet+0.078
 Sue+0.077
aster+0.077
 enclosure+0.076
Neg logit contributions
 important-0.086
 valuable-0.085
 best-0.077
 precious-0.067
 certainly-0.066
Token and
Token ID290
Feature activation-3.66e+00

Pos logit contributions
 important+0.080
 valuable+0.080
 best+0.073
 precious+0.065
 certainly+0.065
Neg logit contributions
 benches-0.062
 greet-0.057
aster-0.056
 Sue-0.055
 bench-0.055
Token soon
Token ID2582
Feature activation+1.28e+00

Pos logit contributions
 benches+0.096
 greet+0.088
aster+0.086
 Sue+0.086
 bench+0.085
Neg logit contributions
 important-0.122
 valuable-0.122
 best-0.110
 precious-0.098
 certainly-0.098
Token curtain
Token ID29461
Feature activation-2.09e-01

Pos logit contributions
 important+0.014
 valuable+0.014
 best+0.013
 precious+0.012
 certainly+0.012
Neg logit contributions
 benches-0.007
 greet-0.006
 Sue-0.006
aster-0.006
 bench-0.006
Token."
Token ID526
Feature activation-1.04e+00

Pos logit contributions
 benches+0.006
 greet+0.005
aster+0.005
 Sue+0.005
 bench+0.005
Neg logit contributions
 important-0.006
 valuable-0.006
 best-0.006
 precious-0.005
 certainly-0.005
Token\n
Token ID198
Feature activation-9.88e-01

Pos logit contributions
 greet+0.029
 benches+0.029
aster+0.027
 restaurants+0.027
 enclosure+0.026
Neg logit contributions
 important-0.031
 best-0.030
 valuable-0.030
 certainly-0.025
 awful-0.023
Token\n
Token ID198
Feature activation-6.68e-01

Pos logit contributions
 benches+0.035
 greet+0.033
 enclosure+0.033
 restaurants+0.032
 arrest+0.031
Neg logit contributions
 important-0.024
 valuable-0.024
 best-0.021
 certainly-0.018
 much-0.017
TokenS
Token ID50
Feature activation-3.47e+00

Pos logit contributions
 benches+0.018
 greet+0.017
 enclosure+0.016
 restaurants+0.016
aster+0.015
Neg logit contributions
 important-0.022
 valuable-0.022
 best-0.020
 certainly-0.018
 precious-0.017
Tokenue
Token ID518
Feature activation-3.60e+00

Pos logit contributions
 benches+0.064
 greet+0.061
 restaurants+0.054
aster+0.054
 bench+0.054
Neg logit contributions
 important-0.143
 valuable-0.140
 best-0.133
 certainly-0.123
 precious-0.120
Token's
Token ID338
Feature activation+2.34e+00

Pos logit contributions
 benches+0.109
 greet+0.102
aster+0.100
 Sue+0.099
 enclosure+0.098
Neg logit contributions
 important-0.105
 valuable-0.104
 best-0.094
 precious-0.082
 certainly-0.081
Token mom
Token ID1995
Feature activation-5.20e+00

Pos logit contributions
 important+0.080
 valuable+0.079
 best+0.073
 precious+0.065
 certainly+0.064
Neg logit contributions
 benches-0.060
 greet-0.056
aster-0.054
 Sue-0.053
 enclosure-0.053
Token thought
Token ID1807
Feature activation+6.43e+00

Pos logit contributions
 benches+0.153
 greet+0.142
aster+0.138
 enclosure+0.137
 restaurants+0.135
Neg logit contributions
 important-0.158
 valuable-0.154
 best-0.144
 certainly-0.128
 precious-0.123
Token for
Token ID329
Feature activation-3.10e+00

Pos logit contributions
 important+0.201
 valuable+0.194
 best+0.178
 precious+0.152
 certainly+0.151
Neg logit contributions
 benches-0.189
 greet-0.183
 enclosure-0.181
aster-0.171
 arrest-0.169
Token a
Token ID257
Feature activation+2.91e+00

Pos logit contributions
 benches+0.108
 greet+0.105
 enclosure+0.105
aster+0.101
 bench+0.101
Neg logit contributions
 important-0.075
 valuable-0.072
 best-0.066
 certainly-0.055
 precious-0.055
Token It
Token ID632
Feature activation+4.24e+00

Pos logit contributions
 important+0.069
 valuable+0.069
 best+0.063
 certainly+0.056
 precious+0.056
Neg logit contributions
 benches-0.053
 greet-0.049
aster-0.047
 enclosure-0.046
 bench-0.046
Token was
Token ID373
Feature activation+4.88e+00

Pos logit contributions
 valuable+0.123
 important+0.121
 best+0.109
 virtue+0.096
 precious+0.095
Neg logit contributions
 benches-0.125
 greet-0.120
 Ann-0.119
 Sue-0.119
 bench-0.116
Token so
Token ID523
Feature activation+6.36e+00

Pos logit contributions
 valuable+0.140
 important+0.138
 best+0.124
 certainly+0.109
 precious+0.108
Neg logit contributions
 benches-0.145
 Sue-0.138
 greet-0.137
 Ann-0.135
aster-0.134
Token mild
Token ID11607
Feature activation+5.02e+00

Pos logit contributions
 valuable+0.142
 important+0.137
 best+0.135
 virtue+0.118
 certainly+0.112
Neg logit contributions
 Sue-0.213
 benches-0.206
 Ann-0.206
 greet-0.200
 bench-0.197
Token that
Token ID326
Feature activation+6.48e-01

Pos logit contributions
 valuable+0.174
 important+0.171
 best+0.157
 precious+0.142
 certainly+0.141
Neg logit contributions
 benches-0.125
 Sue-0.114
 Ann-0.111
 greet-0.111
 bench-0.110
Token Mom
Token ID11254
Feature activation-3.34e+00

Pos logit contributions
 important+0.023
 valuable+0.023
 best+0.021
 precious+0.019
 certainly+0.019
Neg logit contributions
 benches-0.015
 greet-0.014
aster-0.014
 Sue-0.013
 bench-0.013
Tokenmy
Token ID1820
Feature activation-8.55e-01

Pos logit contributions
 benches+0.070
 greet+0.062
aster+0.061
 enclosure+0.059
 bench+0.059
Neg logit contributions
 important-0.130
 valuable-0.129
 best-0.120
 certainly-0.109
 precious-0.108
Token bear
Token ID6842
Feature activation-3.54e+00

Pos logit contributions
 benches+0.025
 greet+0.024
aster+0.023
 Sue+0.023
 Ann+0.023
Neg logit contributions
 valuable-0.025
 important-0.025
 best-0.023
 precious-0.020
 certainly-0.020
Token didn
Token ID1422
Feature activation+6.64e+00

Pos logit contributions
 benches+0.112
 greet+0.106
aster+0.104
 Sue+0.103
 bench+0.103
Neg logit contributions
 important-0.097
 valuable-0.096
 best-0.086
 precious-0.074
 certainly-0.073
Token't
Token ID470
Feature activation+3.39e+00

Pos logit contributions
 valuable+0.259
 important+0.257
 best+0.240
 precious+0.226
 virtue+0.212
Neg logit contributions
aster-0.128
 benches-0.125
 greet-0.122
 Ann-0.121
 climb-0.119
Token worry
Token ID5490
Feature activation+5.65e+00

Pos logit contributions
 important+0.115
 valuable+0.114
 best+0.103
 certainly+0.092
 precious+0.092
Neg logit contributions
 benches-0.090
 enclosure-0.082
aster-0.078
 greet-0.078
 Ann-0.076
Token so
Token ID523
Feature activation+5.04e+00

Pos logit contributions
 valuable+0.068
 important+0.068
 best+0.064
 precious+0.059
 certainly+0.059
Neg logit contributions
 benches-0.046
 Sue-0.044
 greet-0.042
 Ann-0.042
 bench-0.041
Token many
Token ID867
Feature activation+1.56e-03

Pos logit contributions
 best+0.088
 valuable+0.087
 important+0.084
 certainly+0.078
 impossible+0.076
Neg logit contributions
 benches-0.184
 Sue-0.182
aster-0.182
 Ann-0.178
 bench-0.172
Token new
Token ID649
Feature activation+3.25e+00

Pos logit contributions
 valuable+0.000
 important+0.000
 best+0.000
 certainly+0.000
 precious+0.000
Neg logit contributions
 benches-0.000
 greet-0.000
 Sue-0.000
 Ann-0.000
aster-0.000
Token things
Token ID1243
Feature activation+2.80e+00

Pos logit contributions
 valuable+0.096
 important+0.094
 best+0.084
 certainly+0.080
 awful+0.079
Neg logit contributions
 benches-0.091
 greet-0.085
 Ann-0.084
 Sue-0.083
 bench-0.082
Token from
Token ID422
Feature activation-6.54e+00

Pos logit contributions
 valuable+0.096
 important+0.095
 best+0.086
 virtue+0.082
 precious+0.079
Neg logit contributions
 benches-0.061
 Sue-0.057
 greet-0.057
 Ann-0.055
 bench-0.054
Token the
Token ID262
Feature activation-9.54e-01

Pos logit contributions
 benches+0.146
 greet+0.133
 enclosure+0.128
aster+0.126
 bench+0.125
Neg logit contributions
 important-0.251
 valuable-0.242
 best-0.229
 precious-0.203
 impossible-0.196
Token test
Token ID1332
Feature activation-1.77e+00

Pos logit contributions
 benches+0.021
 greet+0.019
aster+0.019
 Sue+0.019
 bench+0.019
Neg logit contributions
 important-0.035
 valuable-0.035
 best-0.032
 precious-0.029
 certainly-0.029
Token.
Token ID13
Feature activation+2.05e+00

Pos logit contributions
 benches+0.046
 Sue+0.042
 greet+0.042
 Ann+0.042
 bench+0.041
Neg logit contributions
 valuable-0.058
 important-0.057
 best-0.050
 precious-0.047
 certainly-0.046
Token And
Token ID843
Feature activation-4.83e-01

Pos logit contributions
 important+0.064
 valuable+0.063
 best+0.058
 precious+0.051
 certainly+0.048
Neg logit contributions
 benches-0.057
 Sue-0.055
 Ann-0.054
 greet-0.054
 enclosure-0.053
Token Bobby
Token ID17155
Feature activation-1.84e+00

Pos logit contributions
 benches+0.013
 greet+0.012
aster+0.012
 Sue+0.011
 enclosure+0.011
Neg logit contributions
 important-0.016
 valuable-0.016
 best-0.015
 precious-0.013
 certainly-0.013
Token was
Token ID373
Feature activation+5.09e+00

Pos logit contributions
 benches+0.049
 greet+0.048
 Ann+0.048
 Sue+0.047
 climb+0.047
Neg logit contributions
 valuable-0.055
 important-0.054
 best-0.047
 precious-0.044
 awful-0.042
Token,
Token ID11
Feature activation-3.75e+00

Pos logit contributions
 benches+0.021
 greet+0.019
aster+0.019
 enclosure+0.019
 bench+0.019
Neg logit contributions
 important-0.023
 valuable-0.022
 best-0.021
 precious-0.018
 certainly-0.017
Token but
Token ID475
Feature activation-6.76e-01

Pos logit contributions
 benches+0.093
 greet+0.086
aster+0.086
 enclosure+0.083
 Sue+0.082
Neg logit contributions
 important-0.130
 valuable-0.129
 best-0.119
 precious-0.105
 certainly-0.105
Token he
Token ID339
Feature activation+3.20e+00

Pos logit contributions
 benches+0.017
 greet+0.015
aster+0.015
 Sue+0.015
 Ann+0.015
Neg logit contributions
 important-0.023
 valuable-0.023
 best-0.021
 precious-0.019
 certainly-0.019
Token kept
Token ID4030
Feature activation-4.61e-01

Pos logit contributions
 valuable+0.084
 important+0.082
 best+0.071
 precious+0.063
 virtue+0.060
Neg logit contributions
 benches-0.101
 Sue-0.099
 Ann-0.098
 greet-0.097
 bench-0.095
Token talking
Token ID3375
Feature activation+2.94e+00

Pos logit contributions
 benches+0.011
 greet+0.010
aster+0.010
 Sue+0.010
 Ann+0.010
Neg logit contributions
 valuable-0.016
 important-0.016
 best-0.015
 precious-0.013
 certainly-0.013
Token in
Token ID287
Feature activation-6.29e+00

Pos logit contributions
 important+0.099
 valuable+0.099
 best+0.090
 precious+0.082
 certainly+0.082
Neg logit contributions
 benches-0.073
 greet-0.067
 Sue-0.066
 Ann-0.065
 bench-0.065
Token Spanish
Token ID7897
Feature activation+1.67e+00

Pos logit contributions
 benches+0.156
 enclosure+0.144
 greet+0.141
aster+0.141
 bench+0.138
Neg logit contributions
 important-0.220
 valuable-0.218
 best-0.201
 precious-0.177
 certainly-0.173
Token.
Token ID13
Feature activation+2.08e+00

Pos logit contributions
 important+0.057
 valuable+0.057
 best+0.052
 certainly+0.050
 virtue+0.048
Neg logit contributions
 benches-0.038
 Ann-0.035
 greet-0.034
 bench-0.033
 Sue-0.033
Token Finally
Token ID9461
Feature activation+1.02e+00

Pos logit contributions
 important+0.070
 valuable+0.070
 best+0.064
 certainly+0.058
 precious+0.055
Neg logit contributions
 benches-0.052
 greet-0.049
aster-0.048
 bench-0.047
 enclosure-0.046
Token,
Token ID11
Feature activation-4.00e+00

Pos logit contributions
 important+0.032
 valuable+0.031
 best+0.030
 certainly+0.025
 precious+0.025
Neg logit contributions
 benches-0.029
 greet-0.027
 enclosure-0.027
 bench-0.026
aster-0.026
Token Mary
Token ID5335
Feature activation-3.33e+00

Pos logit contributions
 benches+0.103
 enclosure+0.100
 greet+0.097
 bench+0.091
 gra+0.090
Neg logit contributions
 important-0.131
 valuable-0.127
 best-0.125
 certainly-0.107
 truly-0.104
Token to
Token ID284
Feature activation-9.55e+00

Pos logit contributions
 benches+0.087
aster+0.080
 Sue+0.076
 stroke+0.076
 gra+0.076
Neg logit contributions
 important-0.091
 valuable-0.086
 best-0.085
 very-0.072
 really-0.072
Token many
Token ID867
Feature activation-2.26e+00

Pos logit contributions
aro+0.253
 benches+0.248
talk+0.223
 enclosure+0.222
round+0.219
Neg logit contributions
 important-0.272
 valuable-0.255
 best-0.248
 great-0.233
 really-0.231
Token houses
Token ID7777
Feature activation-5.14e+00

Pos logit contributions
 benches+0.067
 Sue+0.065
 greet+0.065
 Ann+0.063
 enclosure+0.062
Neg logit contributions
 important-0.068
 valuable-0.066
 best-0.061
 precious-0.051
 awful-0.048
Token.
Token ID13
Feature activation+1.99e+00

Pos logit contributions
 benches+0.160
 greet+0.152
aster+0.148
 Sue+0.146
 Ann+0.145
Neg logit contributions
 important-0.141
 valuable-0.138
 best-0.131
 precious-0.105
 certainly-0.104
Token They
Token ID1119
Feature activation+1.07e+00

Pos logit contributions
 important+0.065
 valuable+0.064
 best+0.058
 precious+0.052
 certainly+0.051
Neg logit contributions
 benches-0.054
 greet-0.050
aster-0.049
 Sue-0.049
 Ann-0.048
Token knock
Token ID10643
Feature activation-5.99e+00

Pos logit contributions
 important+0.037
 valuable+0.037
 best+0.035
 certainly+0.032
 truly+0.031
Neg logit contributions
 benches-0.024
 Sue-0.022
 Ann-0.020
aster-0.020
 bench-0.019
Token on
Token ID319
Feature activation-8.32e+00

Pos logit contributions
 arrest+0.194
 benches+0.193
 enclosure+0.193
 greet+0.191
 gra+0.185
Neg logit contributions
 important-0.142
 best-0.142
 valuable-0.133
 really-0.102
 hard-0.099
Token the
Token ID262
Feature activation-1.95e+00

Pos logit contributions
 greet+0.170
 benches+0.169
 enclosure+0.159
aster+0.156
 slam+0.154
Neg logit contributions
 important-0.305
 valuable-0.301
 best-0.290
 precious-0.244
 really-0.240
Token doors
Token ID8215
Feature activation-2.75e+00

Pos logit contributions
 benches+0.052
 greet+0.050
 Sue+0.049
aster+0.049
 Ann+0.049
Neg logit contributions
 valuable-0.062
 important-0.062
 best-0.056
 precious-0.049
 awful-0.047
Token and
Token ID290
Feature activation-5.24e+00

Pos logit contributions
 benches+0.084
 greet+0.079
aster+0.077
 Sue+0.076
 enclosure+0.076
Neg logit contributions
 important-0.076
 valuable-0.075
 best-0.070
 certainly-0.059
 precious-0.058
Token give
Token ID1577
Feature activation-2.69e+00

Pos logit contributions
isk+0.106
 benches+0.105
hop+0.101
 enclosure+0.100
ifi+0.099
Neg logit contributions
 important-0.170
 valuable-0.168
 best-0.167
 precious-0.148
 awful-0.138
Token.
Token ID13
Feature activation+2.20e+00

Pos logit contributions
 benches+0.046
 greet+0.042
aster+0.041
 Sue+0.041
 bench+0.040
Neg logit contributions
 important-0.060
 valuable-0.060
 best-0.054
 precious-0.048
 certainly-0.048
Token She
Token ID1375
Feature activation+2.03e+00

Pos logit contributions
 important+0.071
 valuable+0.071
 best+0.064
 certainly+0.057
 precious+0.057
Neg logit contributions
 benches-0.060
 greet-0.055
aster-0.054
 enclosure-0.053
 bench-0.053
Token was
Token ID373
Feature activation+4.84e+00

Pos logit contributions
 important+0.059
 valuable+0.059
 best+0.053
 impossible+0.045
 precious+0.045
Neg logit contributions
 benches-0.059
 greet-0.057
 bench-0.055
 Ann-0.055
 Sue-0.055
Token so
Token ID523
Feature activation+6.43e+00

Pos logit contributions
 valuable+0.112
 important+0.111
 best+0.094
 precious+0.081
 virtue+0.078
Neg logit contributions
 benches-0.170
 greet-0.166
 Sue-0.165
 Ann-0.163
aster-0.161
Token proud
Token ID6613
Feature activation+5.90e+00

Pos logit contributions
 valuable+0.152
 important+0.151
 best+0.134
der+0.119
 virtue+0.115
Neg logit contributions
 greet-0.207
 Sue-0.206
 benches-0.202
 Ann-0.201
aster-0.199
Token of
Token ID286
Feature activation-5.15e+00

Pos logit contributions
 important+0.201
 valuable+0.201
 best+0.183
 precious+0.163
 certainly+0.162
Neg logit contributions
 benches-0.149
 greet-0.137
aster-0.134
 Sue-0.134
 Ann-0.132
Token herself
Token ID5223
Feature activation+4.50e+00

Pos logit contributions
 benches+0.116
aster+0.108
 enclosure+0.107
 greet+0.106
 bench+0.102
Neg logit contributions
 important-0.193
 valuable-0.190
 best-0.178
 certainly-0.157
 precious-0.156
Token,
Token ID11
Feature activation-2.42e+00

Pos logit contributions
 valuable+0.146
 important+0.146
 best+0.130
 precious+0.120
 virtue+0.118
Neg logit contributions
 benches-0.112
 Sue-0.106
 greet-0.105
 Ann-0.104
 bench-0.104
Token and
Token ID290
Feature activation-2.44e+00

Pos logit contributions
 benches+0.064
 greet+0.059
aster+0.058
 Sue+0.057
 bench+0.057
Neg logit contributions
 important-0.080
 valuable-0.080
 best-0.072
 precious-0.064
 certainly-0.064
Token so
Token ID523
Feature activation+4.59e+00

Pos logit contributions
 benches+0.060
 greet+0.056
 Sue+0.055
 Ann+0.053
 bench+0.053
Neg logit contributions
 important-0.084
 valuable-0.084
 best-0.076
 precious-0.068
 certainly-0.068
Token were
Token ID547
Feature activation+3.37e+00

Pos logit contributions
 important+0.159
 valuable+0.159
 best+0.144
 precious+0.131
 certainly+0.130
Neg logit contributions
 benches-0.109
 Sue-0.102
 greet-0.102
 Ann-0.100
aster-0.098
Token mess
Token ID2085
Feature activation+2.16e-01

Pos logit contributions
 benches+0.079
 greet+0.074
 enclosure+0.073
aster+0.072
 Sue+0.070
Neg logit contributions
 important-0.130
 valuable-0.129
 best-0.121
 precious-0.107
 awful-0.104
Token.
Token ID13
Feature activation+2.28e+00

Pos logit contributions
 valuable+0.008
 important+0.008
 best+0.007
 certainly+0.007
 virtue+0.007
Neg logit contributions
 benches-0.005
 Ann-0.005
 Sue-0.005
aster-0.004
 Jen-0.004
Token\n
Token ID198
Feature activation-2.39e-01

Pos logit contributions
 valuable+0.067
 important+0.067
 best+0.061
 precious+0.052
 certainly+0.051
Neg logit contributions
 benches-0.067
 greet-0.065
aster-0.063
 enclosure-0.063
 bench-0.061
Token\n
Token ID198
Feature activation-1.22e-01

Pos logit contributions
 benches+0.007
 greet+0.006
 enclosure+0.006
aster+0.006
 bench+0.006
Neg logit contributions
 important-0.007
 valuable-0.007
 best-0.007
 precious-0.006
 certainly-0.006
TokenMom
Token ID29252
Feature activation-3.36e+00

Pos logit contributions
 benches+0.004
aster+0.003
 greet+0.003
 Sue+0.003
 Ann+0.003
Neg logit contributions
 important-0.004
 valuable-0.004
 best-0.003
 certainly-0.003
 precious-0.003
Token and
Token ID290
Feature activation-4.87e+00

Pos logit contributions
 benches+0.081
aster+0.079
 greet+0.078
keeping+0.078
 Sue+0.077
Neg logit contributions
 important-0.104
 best-0.101
 valuable-0.097
 certainly-0.090
 problem-0.086
Token Dad
Token ID17415
Feature activation-4.45e+00

Pos logit contributions
 benches+0.145
 enclosure+0.144
 greet+0.138
aster+0.136
 gra+0.134
Neg logit contributions
 important-0.136
 valuable-0.130
 best-0.126
 precious-0.110
 unfair-0.100
Token hear
Token ID3285
Feature activation-3.46e+00

Pos logit contributions
keeping+0.104
 benches+0.104
 Sue+0.102
 enclosure+0.102
aster+0.096
Neg logit contributions
 best-0.138
 important-0.134
 valuable-0.128
 certainly-0.122
 really-0.120
Token them
Token ID606
Feature activation-1.29e-01

Pos logit contributions
 enclosure+0.080
 greet+0.077
 gra+0.076
hop+0.074
 benches+0.073
Neg logit contributions
 important-0.112
 valuable-0.108
 best-0.101
 awful-0.096
 horrible-0.093
Token and
Token ID290
Feature activation-4.30e+00

Pos logit contributions
 benches+0.004
aster+0.004
 greet+0.004
 enclosure+0.004
 bench+0.004
Neg logit contributions
 important-0.003
 valuable-0.003
 best-0.003
 certainly-0.002
 awful-0.002
Token come
Token ID1282
Feature activation-2.92e+00

Pos logit contributions
 benches+0.102
aster+0.101
 enclosure+0.100
keeping+0.096
hop+0.093
Neg logit contributions
 important-0.134
 best-0.129
 valuable-0.128
 really-0.115
 certainly-0.110
Token statue
Token ID15207
Feature activation-2.06e+00

Pos logit contributions
 important+0.008
 valuable+0.008
 best+0.008
 precious+0.007
 certainly+0.006
Neg logit contributions
 benches-0.007
aster-0.007
 greet-0.007
 Sue-0.007
 Ann-0.007
Token.
Token ID13
Feature activation+1.90e+00

Pos logit contributions
 benches+0.061
 greet+0.057
aster+0.056
 Sue+0.055
 bench+0.054
Neg logit contributions
 important-0.062
 valuable-0.061
 best-0.056
 precious-0.048
 certainly-0.047
Token They
Token ID1119
Feature activation+7.18e-01

Pos logit contributions
 important+0.048
 valuable+0.048
 best+0.042
 precious+0.036
 certainly+0.035
Neg logit contributions
 benches-0.064
aster-0.062
 greet-0.061
 bench-0.059
 stroke-0.058
Token go
Token ID467
Feature activation-2.98e+00

Pos logit contributions
 important+0.025
 valuable+0.025
 best+0.024
 certainly+0.021
 precious+0.021
Neg logit contributions
 benches-0.017
aster-0.015
 Sue-0.015
 Ann-0.015
 enclosure-0.014
Token back
Token ID736
Feature activation-3.52e+00

Pos logit contributions
aster+0.103
 benches+0.094
 gra+0.094
 restaurants+0.089
growth+0.089
Neg logit contributions
 best-0.072
 important-0.069
 valuable-0.066
 very-0.062
 really-0.057
Token to
Token ID284
Feature activation-9.56e+00

Pos logit contributions
aster+0.127
 gra+0.116
 benches+0.115
 greet+0.111
 stroke+0.107
Neg logit contributions
 important-0.082
 valuable-0.078
 best-0.076
 very-0.065
 really-0.059
Token their
Token ID511
Feature activation-1.04e+00

Pos logit contributions
talk+0.281
 benches+0.278
round+0.273
aro+0.271
of+0.267
Neg logit contributions
 important-0.248
 best-0.233
 valuable-0.224
 great-0.196
 very-0.189
Token mom
Token ID1995
Feature activation-6.50e+00

Pos logit contributions
aster+0.030
 benches+0.028
 greet+0.028
 Sue+0.027
 gra+0.026
Neg logit contributions
 important-0.033
 valuable-0.031
 best-0.031
 precious-0.027
 problem-0.024
Token,
Token ID11
Feature activation-5.09e+00

Pos logit contributions
 benches+0.246
aster+0.228
aro+0.228
 greet+0.226
 Sue+0.219
Neg logit contributions
 important-0.130
 best-0.122
 valuable-0.118
 precious-0.083
 very-0.082
Token who
Token ID508
Feature activation-1.51e+00

Pos logit contributions
 benches+0.153
aster+0.145
 enclosure+0.141
 restaurants+0.137
 arrest+0.136
Neg logit contributions
 best-0.139
 important-0.138
 valuable-0.131
 very-0.116
 really-0.114
Token is
Token ID318
Feature activation+4.57e+00

Pos logit contributions
 benches+0.057
 Sue+0.052
aster+0.051
keeping+0.051
 greet+0.050
Neg logit contributions
 important-0.033
 valuable-0.031
 best-0.029
 precious-0.024
 certainly-0.023