TOP ACTIVATIONS
MAX = 0.170

her feet and threw a tant rum . She grabbed the
listen . She threw a tant rum in the store .
sad . She threw a tant rum and cried loudly .
She cried and threw a tant rum . Her mom tried
to cry and threw a tant rum . She didn 't
kept crying and throwing a tant rum . Finally
. He kept throwing a tant rum . His mom was
closer look . He leaned forward and realised it was a
had taken a big step forward , and he knew that
him inside . Bobby leaned forward , his face getting closer
Jeff took a step forward and the rabbit began to
the glass . Lily leaned forward and kissed the oct opus
Sam took a step forward and bowed . He saw
the boy took a step forward . Then the bird flew
A little girl quickly ran forward . She was excited to
and started to throw a tant rum . Suddenly , she
. She took a step forward to get a closer look
Suddenly , the bird leaned forward and bit Tom ! It
was amazed . He leaned forward in his seat and asked
the road . She ran forward and started collecting them .

MIN ACTIVATIONS
MIN = -0.191

Did you know that cater pill ars turn into beautiful butterflies
, " I love cater pill ars ! You can be
but he also knew cater pill ars were not pets .
know , Lily . Cater pill ars need leaves and air
careful , Lily . Cater pill ars can be tricky .
you , Lily . Cater pill ars can leave and never
Lily , no . Cater pill ars are not for keeping
ily liked to keep cater pill ars in a jar .
said , " C ater pill ars are not meant to
Do you know what b unn ies eat ?"
for me ! Aren ’ t we both so
religion . After church , Lily and her family
spoke . " Why am I so poor ?" he
my , what do b unn ies do ?"
" Because it wasn 't pe eling well !"
said , " Hello , church ! I like to sail
" What is opera ?" Jen asks Tom .
mother asked , " Is land of seas hell s ?
children , " Do not slide too fast , you need
put it back . Cater pill ars need to eat leaves

INTERVAL 0.080 - 0.170
CONTAINS 0.374%

many green leaves and a brown trunk . It was very
ran over and grabbed the naughty boy , and gave him
Jack gently reached out his hand and said " Hello ,
Lu cy kne eled down next to him and closed her
But then a mean spider came along . It

INTERVAL -0.010 - 0.080
CONTAINS 65.166%

a good friend ." Mia says . They sit
eyes on their arms and legs and even their shirts !
something crazy . It was a dress with spark les and
and sad . Tom said , " Hi ,
promised to use the magic wand wisely . <|endoftext|> Once upon

INTERVAL -0.100 - -0.010
CONTAINS 34.417%

could go outside . " You can come ,
" Hello , little ones ," he said .
are in an open field ," said the farmer . "
chasing one another through the forest . Soon , they started
and thanked her for the water and the hats . They

INTERVAL -0.191 - -0.100
CONTAINS 0.043%

a friend to help you achieve your goals . Friends can
to be kind and consider ate . When we use kind
said , " Let ’ s play catch .
. She loved eating o atmeal for breakfast every day .
brave ly asked , " Who are you ?"
Token her
Token ID607
Feature activation+4.31e-02

Pos logit contributions
ed+0.126
apped+0.122
uttered+0.118
w+0.117
rified+0.117
Neg logit contributions
 solve-0.071
 read-0.068
 count-0.068
 earn-0.067
 seem-0.065
Token feet
Token ID3625
Feature activation+6.71e-02

Pos logit contributions
ed+0.065
rified+0.055
apped+0.053
ing+0.051
uttered+0.050
Neg logit contributions
 earn-0.083
 solve-0.083
 actually-0.080
 count-0.079
 read-0.078
Token and
Token ID290
Feature activation+2.22e-02

Pos logit contributions
ed+0.122
rified+0.106
apped+0.105
ing+0.100
uttered+0.099
Neg logit contributions
 earn-0.106
 solve-0.104
 actually-0.102
 count-0.098
 seem-0.098
Token threw
Token ID9617
Feature activation+2.79e-02

Pos logit contributions
ed+0.043
rified+0.038
apped+0.037
ing+0.036
uttered+0.035
Neg logit contributions
 earn-0.033
 solve-0.032
 actually-0.032
 count-0.030
 seem-0.030
Token a
Token ID257
Feature activation+6.93e-02

Pos logit contributions
ed+0.055
apped+0.049
rified+0.049
uttered+0.047
ing+0.047
Neg logit contributions
 earn-0.038
 solve-0.038
 actually-0.038
 count-0.037
 seem-0.037
Token tant
Token ID24246
Feature activation+1.70e-01

Pos logit contributions
ed+0.078
apped+0.065
rified+0.064
uttered+0.060
ipt+0.060
Neg logit contributions
 earn-0.156
 solve-0.156
 read-0.150
 count-0.149
 actually-0.149
Tokenrum
Token ID6582
Feature activation+3.06e-02

Pos logit contributions
ed+0.350
rified+0.327
apped+0.316
uttered+0.306
attered+0.300
Neg logit contributions
 earn-0.206
 solve-0.203
 actually-0.194
 seem-0.192
 count-0.191
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
ed+0.076
rified+0.071
apped+0.070
uttered+0.068
ipt+0.067
Neg logit contributions
 earn-0.023
 actually-0.023
 solve-0.022
 read-0.020
 seem-0.020
Token She
Token ID1375
Feature activation-3.93e-03

Pos logit contributions
ed+0.023
apped+0.020
rified+0.020
uttered+0.019
ing+0.019
Neg logit contributions
 earn-0.015
 solve-0.015
 actually-0.015
 rely-0.014
 count-0.014
Token grabbed
Token ID13176
Feature activation+2.92e-02

Pos logit contributions
 earn+0.006
 solve+0.006
 actually+0.006
 seem+0.006
 rely+0.006
Neg logit contributions
ed-0.007
rified-0.006
apped-0.006
ing-0.006
uttered-0.006
Token the
Token ID262
Feature activation+1.39e-02

Pos logit contributions
ed+0.061
apped+0.056
rified+0.055
attered+0.053
ing+0.053
Neg logit contributions
 earn-0.036
 solve-0.035
 actually-0.034
 seem-0.034
 count-0.034
Token listen
Token ID6004
Feature activation-1.61e-02

Pos logit contributions
ed+0.054
rified+0.046
 away+0.045
ing+0.045
w+0.045
Neg logit contributions
 actually-0.048
lf-0.046
 earn-0.045
 solve-0.044
case-0.042
Token.
Token ID13
Feature activation-1.54e-03

Pos logit contributions
 earn+0.032
 solve+0.031
 actually+0.031
 rely+0.031
 count+0.030
Neg logit contributions
ed-0.024
ing-0.019
rified-0.019
apped-0.019
uttered-0.017
Token She
Token ID1375
Feature activation-1.39e-02

Pos logit contributions
 earn+0.002
 solve+0.002
 actually+0.002
 rely+0.002
 count+0.002
Neg logit contributions
ed-0.003
apped-0.002
rified-0.002
ing-0.002
elt-0.002
Token threw
Token ID9617
Feature activation+2.68e-02

Pos logit contributions
 earn+0.023
 solve+0.023
 seem+0.022
 rely+0.021
 count+0.020
Neg logit contributions
ed-0.025
rified-0.021
apped-0.020
ted-0.019
aked-0.019
Token a
Token ID257
Feature activation+6.99e-02

Pos logit contributions
ed+0.055
apped+0.052
uttered+0.051
ipt+0.050
rified+0.050
Neg logit contributions
 solve-0.032
 earn-0.032
 read-0.031
 seem-0.031
 actually-0.031
Token tant
Token ID24246
Feature activation+1.66e-01

Pos logit contributions
ed+0.101
rified+0.089
apped+0.086
ipt+0.085
uttered+0.084
Neg logit contributions
 solve-0.133
 earn-0.131
 read-0.126
 shelf-0.125
 count-0.125
Tokenrum
Token ID6582
Feature activation+4.11e-02

Pos logit contributions
ed+0.336
rified+0.321
apped+0.306
uttered+0.300
attered+0.295
Neg logit contributions
 earn-0.197
 solve-0.194
 seem-0.184
 actually-0.183
 count-0.183
Token in
Token ID287
Feature activation+1.31e-02

Pos logit contributions
ed+0.085
rified+0.081
apped+0.077
uttered+0.077
xc+0.076
Neg logit contributions
 actually-0.046
 earn-0.045
 solve-0.045
 read-0.043
 seem-0.042
Token the
Token ID262
Feature activation+1.87e-02

Pos logit contributions
ed+0.027
apped+0.024
rified+0.024
uttered+0.023
ing+0.023
Neg logit contributions
 earn-0.018
 solve-0.018
 actually-0.017
 count-0.017
 read-0.017
Token store
Token ID3650
Feature activation-3.32e-02

Pos logit contributions
ed+0.038
uttered+0.033
rified+0.033
ho+0.032
elt+0.032
Neg logit contributions
 earn-0.027
 solve-0.026
 actually-0.025
 count-0.024
 cost-0.024
Token.
Token ID13
Feature activation+1.48e-02

Pos logit contributions
 earn+0.054
 solve+0.052
 actually+0.052
 rely+0.051
 seem+0.050
Neg logit contributions
ed-0.061
apped-0.051
rified-0.051
ing-0.050
ho-0.048
Token sad
Token ID6507
Feature activation+1.96e-02

Pos logit contributions
 earn+0.027
 solve+0.027
 count+0.026
 rely+0.026
 actually+0.026
Neg logit contributions
ed-0.030
rified-0.025
apped-0.025
ing-0.023
uttered-0.023
Token.
Token ID13
Feature activation+6.22e-03

Pos logit contributions
ed+0.044
ing+0.038
apped+0.037
rified+0.037
ted+0.034
Neg logit contributions
 earn-0.027
 solve-0.025
 actually-0.024
 rely-0.024
 count-0.024
Token She
Token ID1375
Feature activation-2.92e-02

Pos logit contributions
ed+0.012
apped+0.010
rified+0.010
elt+0.010
ipt+0.010
Neg logit contributions
 earn-0.009
 solve-0.009
 actually-0.009
 rely-0.009
 count-0.009
Token threw
Token ID9617
Feature activation+2.21e-02

Pos logit contributions
 earn+0.047
 solve+0.046
 rely+0.046
 seem+0.046
 count+0.044
Neg logit contributions
ed-0.052
rified-0.044
apped-0.042
attered-0.041
ted-0.040
Token a
Token ID257
Feature activation+6.07e-02

Pos logit contributions
ed+0.050
apped+0.048
uttered+0.048
attered+0.047
elt+0.046
Neg logit contributions
 solve-0.022
 read-0.022
 earn-0.022
 count-0.021
 seem-0.020
Token tant
Token ID24246
Feature activation+1.60e-01

Pos logit contributions
ed+0.095
apped+0.084
rified+0.083
ipt+0.081
uttered+0.079
Neg logit contributions
 solve-0.113
 earn-0.109
 read-0.107
 count-0.105
 shelf-0.105
Tokenrum
Token ID6582
Feature activation+4.20e-02

Pos logit contributions
ed+0.332
rified+0.315
apped+0.303
uttered+0.293
attered+0.291
Neg logit contributions
 earn-0.184
 solve-0.181
 actually-0.174
 seem-0.173
 count-0.172
Token and
Token ID290
Feature activation+1.31e-02

Pos logit contributions
ed+0.077
rified+0.075
xc+0.072
uttered+0.071
suits+0.071
Neg logit contributions
 actually-0.051
 read-0.050
 earn-0.049
 solve-0.049
 also-0.047
Token cried
Token ID16896
Feature activation+3.38e-02

Pos logit contributions
ed+0.025
rified+0.022
apped+0.022
ing+0.021
uttered+0.021
Neg logit contributions
 earn-0.019
 solve-0.019
 actually-0.019
 seem-0.018
 count-0.018
Token loudly
Token ID23112
Feature activation+4.80e-02

Pos logit contributions
ed+0.075
rified+0.068
apped+0.067
uttered+0.065
ipt+0.064
Neg logit contributions
 earn-0.039
 solve-0.039
 actually-0.037
 seem-0.035
 count-0.035
Token.
Token ID13
Feature activation+1.98e-02

Pos logit contributions
ed+0.108
rified+0.095
apped+0.094
ing+0.091
uttered+0.089
Neg logit contributions
 earn-0.059
 solve-0.056
 actually-0.055
 rely-0.052
 count-0.052
Token She
Token ID1375
Feature activation-2.47e-02

Pos logit contributions
 earn+0.009
 actually+0.009
 solve+0.009
 rely+0.008
 count+0.008
Neg logit contributions
ed-0.012
apped-0.010
rified-0.010
ing-0.010
uttered-0.009
Token cried
Token ID16896
Feature activation+3.44e-02

Pos logit contributions
 earn+0.039
 solve+0.038
 seem+0.038
 rely+0.037
 count+0.036
Neg logit contributions
ed-0.047
apped-0.039
rified-0.038
eled-0.036
ted-0.036
Token and
Token ID290
Feature activation+7.31e-03

Pos logit contributions
ed+0.069
ipt+0.066
rified+0.065
ho+0.065
w+0.065
Neg logit contributions
 solve-0.041
 read-0.040
 earn-0.039
 actually-0.038
 seem-0.038
Token threw
Token ID9617
Feature activation+1.51e-02

Pos logit contributions
ed+0.015
rified+0.013
apped+0.013
ing+0.012
uttered+0.012
Neg logit contributions
 earn-0.010
 solve-0.010
 actually-0.010
 seem-0.009
 count-0.009
Token a
Token ID257
Feature activation+5.75e-02

Pos logit contributions
ed+0.032
apped+0.030
rified+0.029
uttered+0.029
ipt+0.029
Neg logit contributions
 solve-0.018
 earn-0.018
 read-0.018
 count-0.017
 actually-0.017
Token tant
Token ID24246
Feature activation+1.60e-01

Pos logit contributions
ed+0.089
apped+0.079
rified+0.077
uttered+0.076
ipt+0.075
Neg logit contributions
 solve-0.110
 earn-0.108
 read-0.106
 count-0.102
 shelf-0.101
Tokenrum
Token ID6582
Feature activation+2.80e-02

Pos logit contributions
ed+0.323
rified+0.301
apped+0.289
uttered+0.282
attered+0.276
Neg logit contributions
 earn-0.199
 solve-0.197
 actually-0.188
 seem-0.187
 count-0.185
Token.
Token ID13
Feature activation+8.60e-03

Pos logit contributions
ed+0.071
rified+0.066
apped+0.065
uttered+0.063
ing+0.063
Neg logit contributions
 earn-0.021
 solve-0.021
 actually-0.021
 read-0.019
 seem-0.019
Token Her
Token ID2332
Feature activation-6.75e-03

Pos logit contributions
ed+0.019
apped+0.017
rified+0.017
uttered+0.016
ing+0.016
Neg logit contributions
 earn-0.011
 solve-0.010
 actually-0.010
 rely-0.010
 count-0.010
Token mom
Token ID1995
Feature activation+5.20e-02

Pos logit contributions
 solve+0.008
 earn+0.008
 read+0.008
 count+0.008
 actually+0.007
Neg logit contributions
ed-0.015
rified-0.014
ing-0.013
apped-0.013
uttered-0.013
Token tried
Token ID3088
Feature activation+2.03e-02

Pos logit contributions
ed+0.109
rified+0.099
apped+0.097
uttered+0.095
attered+0.093
Neg logit contributions
 earn-0.066
 solve-0.065
 actually-0.065
 read-0.063
 count-0.060
Token to
Token ID284
Feature activation-1.75e-02

Pos logit contributions
ed+0.018
rified+0.015
apped+0.015
ing+0.014
uttered+0.014
Neg logit contributions
 earn-0.022
 solve-0.022
 actually-0.022
 count-0.021
 read-0.021
Token cry
Token ID3960
Feature activation+2.12e-02

Pos logit contributions
 actually+0.024
 earn+0.024
 solve+0.023
 seem+0.022
 rely+0.022
Neg logit contributions
ed-0.032
rified-0.029
apped-0.028
 breath-0.027
xc-0.027
Token and
Token ID290
Feature activation-1.61e-03

Pos logit contributions
ed+0.042
rified+0.036
apped+0.036
ing+0.035
uttered+0.033
Neg logit contributions
 earn-0.032
 solve-0.031
 actually-0.030
 rely-0.029
 count-0.029
Token threw
Token ID9617
Feature activation+1.51e-02

Pos logit contributions
 earn+0.002
 solve+0.002
 actually+0.002
 rely+0.002
 seem+0.002
Neg logit contributions
ed-0.003
rified-0.003
apped-0.003
ing-0.003
uttered-0.003
Token a
Token ID257
Feature activation+5.91e-02

Pos logit contributions
ed+0.032
apped+0.030
uttered+0.030
rified+0.029
attered+0.029
Neg logit contributions
 solve-0.018
 earn-0.018
 read-0.018
 count-0.017
 actually-0.017
Token tant
Token ID24246
Feature activation+1.59e-01

Pos logit contributions
ed+0.108
apped+0.098
rified+0.094
ipt+0.094
uttered+0.093
Neg logit contributions
 solve-0.097
 earn-0.091
 read-0.091
 else-0.090
 count-0.088
Tokenrum
Token ID6582
Feature activation+3.42e-02

Pos logit contributions
ed+0.320
rified+0.295
apped+0.285
uttered+0.276
attered+0.272
Neg logit contributions
 earn-0.203
 solve-0.201
 actually-0.192
 seem-0.190
 count-0.188
Token.
Token ID13
Feature activation+3.63e-03

Pos logit contributions
ed+0.088
rified+0.082
apped+0.080
uttered+0.078
ipt+0.077
Neg logit contributions
 earn-0.024
 solve-0.024
 actually-0.024
 read-0.022
 count-0.021
Token She
Token ID1375
Feature activation-1.36e-02

Pos logit contributions
ed+0.008
apped+0.007
rified+0.007
uttered+0.007
ing+0.007
Neg logit contributions
 earn-0.005
 solve-0.004
 actually-0.004
 rely-0.004
 count-0.004
Token didn
Token ID1422
Feature activation+5.80e-02

Pos logit contributions
 earn+0.021
 solve+0.020
 actually+0.020
 seem+0.020
 rely+0.020
Neg logit contributions
ed-0.026
rified-0.022
apped-0.022
ing-0.021
uttered-0.020
Token't
Token ID470
Feature activation+2.17e-02

Pos logit contributions
ed+0.110
ing+0.096
uck+0.094
apped+0.090
uttered+0.090
Neg logit contributions
 rely-0.082
 earn-0.080
 count-0.079
uries-0.075
 solve-0.074
Token kept
Token ID4030
Feature activation+1.44e-02

Pos logit contributions
 earn+0.051
 solve+0.050
 rely+0.047
 seem+0.047
 actually+0.047
Neg logit contributions
ed-0.064
rified-0.055
apped-0.053
ing-0.051
attered-0.051
Token crying
Token ID13774
Feature activation-2.37e-03

Pos logit contributions
ed+0.030
rified+0.027
apped+0.027
ing+0.026
uttered+0.025
Neg logit contributions
 earn-0.019
 solve-0.018
 actually-0.018
 count-0.017
 seem-0.017
Token and
Token ID290
Feature activation-1.47e-02

Pos logit contributions
 earn+0.004
 solve+0.004
 actually+0.004
 count+0.003
 rely+0.003
Neg logit contributions
ed-0.004
rified-0.004
apped-0.004
ing-0.004
uttered-0.004
Token throwing
Token ID9644
Feature activation-1.08e-02

Pos logit contributions
 earn+0.020
 solve+0.019
 actually+0.019
 seem+0.018
 count+0.018
Neg logit contributions
ed-0.031
rified-0.027
apped-0.027
ing-0.026
uttered-0.025
Token a
Token ID257
Feature activation+5.32e-02

Pos logit contributions
 earn+0.016
 solve+0.015
 actually+0.015
 seem+0.014
 count+0.014
Neg logit contributions
ed-0.021
apped-0.019
rified-0.019
uttered-0.018
ing-0.018
Token tant
Token ID24246
Feature activation+1.57e-01

Pos logit contributions
ed+0.075
rified+0.065
apped+0.064
uttered+0.060
ipt+0.058
Neg logit contributions
 earn-0.105
 solve-0.105
 actually-0.101
 read-0.100
 count-0.098
Tokenrum
Token ID6582
Feature activation+2.50e-02

Pos logit contributions
ed+0.316
rified+0.292
apped+0.282
uttered+0.272
attered+0.268
Neg logit contributions
 earn-0.203
 solve-0.200
 actually-0.193
 seem-0.189
 count-0.188
Token.
Token ID13
Feature activation+7.08e-03

Pos logit contributions
ed+0.061
rified+0.057
apped+0.056
uttered+0.054
ipt+0.054
Neg logit contributions
 earn-0.020
 actually-0.020
 solve-0.020
 read-0.018
 seem-0.018
Token
Token ID198
Feature activation+1.44e-03

Pos logit contributions
ed+0.015
apped+0.013
rified+0.013
uttered+0.013
ing+0.013
Neg logit contributions
 earn-0.009
 solve-0.009
 actually-0.009
 rely-0.009
 count-0.009
Token
Token ID198
Feature activation-1.09e-02

Pos logit contributions
ed+0.003
apped+0.003
ing+0.003
rified+0.003
uttered+0.002
Neg logit contributions
 earn-0.002
 actually-0.002
 solve-0.002
 count-0.002
 rely-0.002
TokenFinally
Token ID11158
Feature activation-2.48e-02

Pos logit contributions
 rely+0.018
 earn+0.018
 count+0.017
 actually+0.017
 seem+0.017
Neg logit contributions
ed-0.019
ing-0.016
apped-0.016
rified-0.016
attered-0.015
Token.
Token ID13
Feature activation+1.33e-03

Pos logit contributions
 earn+0.019
 solve+0.018
 actually+0.018
 rely+0.017
 count+0.017
Neg logit contributions
ed-0.036
ing-0.032
rified-0.031
apped-0.031
ho-0.030
Token He
Token ID679
Feature activation+1.12e-02

Pos logit contributions
ed+0.002
apped+0.002
rified+0.002
elt+0.002
ipt+0.002
Neg logit contributions
 actually-0.002
 solve-0.002
 earn-0.002
 rely-0.002
 count-0.002
Token kept
Token ID4030
Feature activation+1.60e-02

Pos logit contributions
ed+0.021
rified+0.017
apped+0.017
uttered+0.016
ted+0.016
Neg logit contributions
 earn-0.018
 solve-0.018
 seem-0.017
 rely-0.017
 count-0.017
Token throwing
Token ID9644
Feature activation-6.30e-03

Pos logit contributions
ed+0.032
apped+0.027
rified+0.027
ing+0.026
eled+0.025
Neg logit contributions
 earn-0.023
 solve-0.022
 actually-0.022
 seem-0.022
 count-0.021
Token a
Token ID257
Feature activation+6.57e-02

Pos logit contributions
 earn+0.010
 solve+0.009
 actually+0.009
 count+0.009
 read+0.009
Neg logit contributions
ed-0.012
apped-0.010
rified-0.010
uttered-0.010
ing-0.010
Token tant
Token ID24246
Feature activation+1.57e-01

Pos logit contributions
ed+0.123
rified+0.111
apped+0.110
uttered+0.107
ipt+0.103
Neg logit contributions
 earn-0.100
 solve-0.100
 read-0.095
 count-0.093
 actually-0.091
Tokenrum
Token ID6582
Feature activation+3.16e-02

Pos logit contributions
ed+0.318
rified+0.292
apped+0.283
uttered+0.275
attered+0.270
Neg logit contributions
 earn-0.204
 solve-0.199
 actually-0.191
 seem-0.188
 count-0.188
Token.
Token ID13
Feature activation+1.07e-02

Pos logit contributions
ed+0.081
rified+0.075
apped+0.073
uttered+0.072
ing+0.070
Neg logit contributions
 earn-0.024
 solve-0.024
 actually-0.023
 count-0.021
 seem-0.021
Token His
Token ID2399
Feature activation+2.53e-02

Pos logit contributions
ed+0.019
apped+0.016
rified+0.016
elt+0.015
ipt+0.015
Neg logit contributions
 solve-0.017
 actually-0.017
 rely-0.017
 earn-0.017
 count-0.017
Token mom
Token ID1995
Feature activation+4.76e-02

Pos logit contributions
ed+0.059
rified+0.055
ing+0.052
apped+0.051
uttered+0.050
Neg logit contributions
 earn-0.028
 solve-0.027
 count-0.026
 read-0.026
 rely-0.025
Token was
Token ID373
Feature activation+1.61e-02

Pos logit contributions
ed+0.093
rified+0.082
apped+0.081
ing+0.078
uttered+0.077
Neg logit contributions
 earn-0.069
 solve-0.068
 actually-0.067
 count-0.064
 seem-0.063
Token closer
Token ID5699
Feature activation+6.43e-02

Pos logit contributions
ed+0.057
apped+0.050
rified+0.048
ipt+0.046
attered+0.045
Neg logit contributions
 solve-0.078
 earn-0.076
 seem-0.075
 read-0.074
 actually-0.073
Token look
Token ID804
Feature activation+1.61e-03

Pos logit contributions
ed+0.134
rified+0.120
apped+0.119
attered+0.112
uttered+0.112
Neg logit contributions
 solve-0.084
 earn-0.083
 seem-0.082
 count-0.081
 actually-0.081
Token.
Token ID13
Feature activation+2.25e-02

Pos logit contributions
ed+0.004
ing+0.003
rified+0.003
apped+0.003
ted+0.003
Neg logit contributions
 earn-0.002
 solve-0.002
 actually-0.002
 rely-0.002
 count-0.002
Token He
Token ID679
Feature activation+1.50e-02

Pos logit contributions
ed+0.042
rified+0.036
apped+0.036
uttered+0.034
elt+0.034
Neg logit contributions
 earn-0.036
 solve-0.035
 actually-0.034
 rely-0.033
 count-0.033
Token leaned
Token ID23831
Feature activation+4.09e-02

Pos logit contributions
ed+0.025
rified+0.021
apped+0.021
ing+0.020
uttered+0.019
Neg logit contributions
 earn-0.027
 solve-0.026
 actually-0.025
 seem-0.025
 rely-0.025
Token forward
Token ID2651
Feature activation+1.57e-01

Pos logit contributions
suits+0.065
attered+0.064
 tant+0.062
rified+0.061
w+0.061
Neg logit contributions
 actually-0.065
 also-0.064
 read-0.061
 count-0.055
 not-0.054
Token and
Token ID290
Feature activation+2.78e-02

Pos logit contributions
ed+0.277
rified+0.238
apped+0.235
ing+0.225
uttered+0.222
Neg logit contributions
 earn-0.258
 solve-0.252
 actually-0.248
 count-0.239
 seem-0.239
Token realised
Token ID19169
Feature activation-2.54e-02

Pos logit contributions
ed+0.048
rified+0.040
apped+0.039
ing+0.038
uttered+0.037
Neg logit contributions
 earn-0.048
 solve-0.047
 actually-0.046
 rely-0.045
 seem-0.045
Token it
Token ID340
Feature activation-1.43e-02

Pos logit contributions
 earn+0.042
 solve+0.041
 actually+0.040
 rely+0.039
 count+0.039
Neg logit contributions
ed-0.045
rified-0.038
ing-0.037
apped-0.037
uttered-0.035
Token was
Token ID373
Feature activation-1.79e-03

Pos logit contributions
 earn+0.026
 solve+0.024
 rely+0.023
 count+0.022
 read+0.022
Neg logit contributions
ed-0.025
rified-0.021
ing-0.020
uttered-0.019
apped-0.019
Token a
Token ID257
Feature activation+3.96e-03

Pos logit contributions
 earn+0.003
 solve+0.003
 seem+0.003
 read+0.003
 rely+0.003
Neg logit contributions
ed-0.003
rified-0.003
apped-0.002
ing-0.002
uttered-0.002
Token had
Token ID550
Feature activation+4.69e-02

Pos logit contributions
ed+0.039
apped+0.032
rified+0.032
ing+0.031
uttered+0.030
Neg logit contributions
 earn-0.035
 solve-0.033
 seem-0.032
 rely-0.031
 count-0.030
Token taken
Token ID2077
Feature activation+4.61e-02

Pos logit contributions
ed+0.090
rified+0.076
apped+0.075
ing+0.072
uttered+0.071
Neg logit contributions
 earn-0.073
 solve-0.071
 actually-0.067
 rely-0.066
 seem-0.066
Token a
Token ID257
Feature activation+4.69e-02

Pos logit contributions
ed+0.095
apped+0.090
rified+0.089
uttered+0.087
ho+0.087
Neg logit contributions
 earn-0.057
 solve-0.055
 actually-0.054
 seem-0.052
 count-0.052
Token big
Token ID1263
Feature activation+6.81e-02

Pos logit contributions
ed+0.093
rified+0.080
apped+0.079
ipt+0.076
attered+0.074
Neg logit contributions
 solve-0.073
 earn-0.072
 actually-0.069
 read-0.069
 count-0.068
Token step
Token ID2239
Feature activation+5.57e-02

Pos logit contributions
ed+0.147
rified+0.126
apped+0.122
attered+0.121
ipt+0.120
Neg logit contributions
 count-0.094
 solve-0.092
 read-0.091
 earn-0.090
 rely-0.086
Token forward
Token ID2651
Feature activation+1.55e-01

Pos logit contributions
ed+0.094
apped+0.086
rified+0.084
ipt+0.084
uttered+0.084
Neg logit contributions
 actually-0.083
 solve-0.082
 count-0.081
 earn-0.079
 read-0.078
Token,
Token ID11
Feature activation+4.08e-02

Pos logit contributions
ed+0.290
rified+0.255
apped+0.251
uttered+0.238
ing+0.238
Neg logit contributions
 earn-0.237
 solve-0.234
 actually-0.230
 count-0.223
 seem-0.222
Token and
Token ID290
Feature activation-4.05e-03

Pos logit contributions
ed+0.081
rified+0.071
apped+0.070
ing+0.068
uttered+0.067
Neg logit contributions
 earn-0.058
 actually-0.057
 solve-0.057
 count-0.054
 seem-0.054
Token he
Token ID339
Feature activation+3.76e-02

Pos logit contributions
 earn+0.005
 solve+0.005
 actually+0.004
 count+0.004
 seem+0.004
Neg logit contributions
ed-0.009
rified-0.008
apped-0.008
ing-0.008
uttered-0.008
Token knew
Token ID2993
Feature activation+1.54e-02

Pos logit contributions
ed+0.071
apped+0.060
rified+0.059
ing+0.057
uttered+0.056
Neg logit contributions
 earn-0.061
 solve-0.059
 seem-0.055
 count-0.054
 actually-0.054
Token that
Token ID326
Feature activation+8.33e-03

Pos logit contributions
ed+0.031
ing+0.028
rified+0.026
apped+0.026
ted+0.025
Neg logit contributions
 earn-0.021
 solve-0.020
 count-0.019
 seem-0.019
 read-0.019
Token him
Token ID683
Feature activation+3.23e-02

Pos logit contributions
 solve+0.005
 actually+0.005
 earn+0.005
 seem+0.005
 read+0.005
Neg logit contributions
ed-0.007
apped-0.006
rified-0.006
w-0.006
ho-0.006
Token inside
Token ID2641
Feature activation-4.78e-02

Pos logit contributions
ed+0.067
apped+0.057
rified+0.056
ing+0.056
uttered+0.054
Neg logit contributions
 earn-0.045
 actually-0.044
 solve-0.042
 rely-0.041
 seem-0.040
Token.
Token ID13
Feature activation+9.00e-03

Pos logit contributions
 earn+0.065
 rely+0.062
 solve+0.060
 actually+0.060
 seem+0.058
Neg logit contributions
ed-0.093
ing-0.085
apped-0.081
rified-0.080
 down-0.078
Token Bobby
Token ID17155
Feature activation-4.11e-03

Pos logit contributions
ed+0.016
apped+0.013
rified+0.013
elt+0.013
uttered+0.013
Neg logit contributions
 earn-0.016
 solve-0.015
 rely-0.015
 actually-0.014
 count-0.014
Token leaned
Token ID23831
Feature activation+4.19e-02

Pos logit contributions
 earn+0.007
 solve+0.006
 rely+0.006
 actually+0.006
 seem+0.006
Neg logit contributions
ed-0.008
apped-0.006
rified-0.006
ing-0.006
uttered-0.006
Token forward
Token ID2651
Feature activation+1.55e-01

Pos logit contributions
w+0.073
attered+0.072
rified+0.072
suits+0.070
ed+0.069
Neg logit contributions
 actually-0.056
 read-0.056
 also-0.054
 not-0.049
 count-0.048
Token,
Token ID11
Feature activation+5.86e-02

Pos logit contributions
ed+0.283
rified+0.247
apped+0.243
ing+0.231
uttered+0.230
Neg logit contributions
 earn-0.245
 solve-0.241
 actually-0.236
 count-0.228
 seem-0.228
Token his
Token ID465
Feature activation+5.54e-02

Pos logit contributions
ed+0.109
rified+0.091
apped+0.090
elt+0.088
ing+0.087
Neg logit contributions
 earn-0.095
 solve-0.091
 rely-0.086
 seem-0.084
 actually-0.083
Token face
Token ID1986
Feature activation+3.82e-02

Pos logit contributions
ed+0.102
rified+0.090
apped+0.088
ing+0.084
uttered+0.083
Neg logit contributions
 earn-0.087
 solve-0.087
 count-0.085
 actually-0.084
 rely-0.081
Token getting
Token ID1972
Feature activation+4.96e-03

Pos logit contributions
ed+0.071
rified+0.060
apped+0.060
ing+0.058
uttered+0.057
Neg logit contributions
 earn-0.062
 solve-0.059
 actually-0.058
 rely-0.056
 seem-0.056
Token closer
Token ID5699
Feature activation+2.02e-02

Pos logit contributions
ed+0.009
rified+0.009
apped+0.008
uttered+0.008
ho+0.008
Neg logit contributions
 actually-0.008
 read-0.007
 also-0.007
 solve-0.007
 earn-0.007
Token
Token ID198
Feature activation-1.01e-02

Pos logit contributions
ed+0.007
apped+0.006
rified+0.006
ing+0.006
uttered+0.006
Neg logit contributions
 earn-0.004
 solve-0.004
 actually-0.004
 rely-0.003
 count-0.003
TokenJeff
Token ID19139
Feature activation-4.38e-02

Pos logit contributions
 earn+0.014
 solve+0.013
 actually+0.013
 count+0.012
 seem+0.012
Neg logit contributions
ed-0.021
rified-0.019
apped-0.018
ing-0.018
uttered-0.018
Token took
Token ID1718
Feature activation+2.23e-02

Pos logit contributions
 earn+0.069
 solve+0.066
 actually+0.064
 seem+0.062
 rely+0.062
Neg logit contributions
ed-0.084
rified-0.071
apped-0.071
ing-0.069
uttered-0.067
Token a
Token ID257
Feature activation+4.90e-02

Pos logit contributions
ed+0.046
attered+0.044
apped+0.043
rified+0.043
elt+0.043
Neg logit contributions
 solve-0.027
 count-0.026
 read-0.025
 earn-0.025
 seem-0.025
Token step
Token ID2239
Feature activation+5.38e-02

Pos logit contributions
ed+0.092
rified+0.085
apped+0.085
ipt+0.082
attered+0.080
Neg logit contributions
 solve-0.076
 read-0.073
 count-0.072
 seem-0.070
 earn-0.070
Token forward
Token ID2651
Feature activation+1.55e-01

Pos logit contributions
rified+0.078
ed+0.078
apped+0.078
Look+0.077
uttered+0.076
Neg logit contributions
-0.081
 count-0.081
 read-0.081
 solve-0.080
 also-0.079
Token and
Token ID290
Feature activation+1.67e-02

Pos logit contributions
ed+0.274
rified+0.234
apped+0.230
ing+0.224
uttered+0.217
Neg logit contributions
 earn-0.258
 solve-0.249
 actually-0.247
 rely-0.236
 seem-0.236
Token the
Token ID262
Feature activation-1.83e-03

Pos logit contributions
ed+0.032
rified+0.028
apped+0.028
ing+0.027
uttered+0.027
Neg logit contributions
 earn-0.025
 solve-0.024
 actually-0.024
 count-0.023
 seem-0.023
Token rabbit
Token ID22746
Feature activation+3.96e-02

Pos logit contributions
 read+0.003
 actually+0.003
 count+0.003
 solve+0.003
 cost+0.003
Neg logit contributions
ed-0.003
apped-0.003
uttered-0.003
rified-0.003
xc-0.003
Token began
Token ID2540
Feature activation+2.85e-02

Pos logit contributions
ed+0.069
rified+0.055
apped+0.052
ted+0.052
ho+0.052
Neg logit contributions
 earn-0.075
 solve-0.068
 rely-0.067
 seem-0.066
 actually-0.065
Token to
Token ID284
Feature activation-8.98e-03

Pos logit contributions
ed+0.043
rified+0.037
apped+0.036
ing+0.034
uttered+0.033
Neg logit contributions
 earn-0.054
 solve-0.053
 actually-0.053
 count-0.051
 seem-0.051
Token the
Token ID262
Feature activation+1.32e-02

Pos logit contributions
ed+0.062
 breath+0.054
rified+0.053
ted+0.052
apped+0.051
Neg logit contributions
 earn-0.048
 actually-0.045
 solve-0.044
case-0.044
 cost-0.043
Token glass
Token ID5405
Feature activation+3.82e-03

Pos logit contributions
ed+0.024
rified+0.021
apped+0.020
ing+0.019
uttered+0.019
Neg logit contributions
 earn-0.021
 solve-0.021
 actually-0.021
 count-0.020
 seem-0.020
Token.
Token ID13
Feature activation+1.61e-02

Pos logit contributions
ed+0.008
rified+0.007
apped+0.007
ing+0.007
uttered+0.007
Neg logit contributions
 earn-0.005
 solve-0.004
 actually-0.004
 rely-0.004
 seem-0.004
Token Lily
Token ID20037
Feature activation-1.39e-02

Pos logit contributions
ed+0.030
apped+0.026
rified+0.025
uttered+0.025
elt+0.024
Neg logit contributions
 earn-0.026
 solve-0.025
 actually-0.024
 count-0.024
 rely-0.024
Token leaned
Token ID23831
Feature activation+3.60e-02

Pos logit contributions
 earn+0.024
 solve+0.023
 actually+0.022
 rely+0.022
 seem+0.022
Neg logit contributions
ed-0.025
rified-0.021
apped-0.021
ing-0.020
uttered-0.019
Token forward
Token ID2651
Feature activation+1.54e-01

Pos logit contributions
attered+0.062
arp+0.059
rified+0.059
suits+0.058
ed+0.055
Neg logit contributions
 read-0.052
 also-0.051
 actually-0.051
 not-0.046
 count-0.043
Token and
Token ID290
Feature activation+2.87e-02

Pos logit contributions
ed+0.280
rified+0.245
apped+0.241
ing+0.229
uttered+0.228
Neg logit contributions
 earn-0.242
 solve-0.238
 actually-0.234
 count-0.225
 seem-0.225
Token kissed
Token ID27946
Feature activation+4.76e-02

Pos logit contributions
ed+0.055
rified+0.047
apped+0.045
ing+0.044
ted+0.044
Neg logit contributions
 earn-0.045
 solve-0.044
 actually-0.042
 rely-0.042
 seem-0.041
Token the
Token ID262
Feature activation+1.20e-02

Pos logit contributions
ed+0.102
attered+0.095
apped+0.093
aked+0.091
rified+0.091
Neg logit contributions
 solve-0.056
 read-0.053
 actually-0.053
 earn-0.052
 seem-0.051
Token oct
Token ID19318
Feature activation+2.97e-02

Pos logit contributions
ed+0.021
apped+0.018
rified+0.018
uttered+0.017
attered+0.017
Neg logit contributions
 earn-0.020
 solve-0.020
 actually-0.019
 count-0.019
 read-0.019
Tokenopus
Token ID25790
Feature activation+4.74e-02

Pos logit contributions
ed+0.050
ing+0.042
apped+0.041
rified+0.041
ho+0.039
Neg logit contributions
 earn-0.054
 solve-0.053
 actually-0.052
 count-0.051
 seem-0.051
Token\n
Token ID198
Feature activation-1.90e-02

Pos logit contributions
 earn+0.008
 solve+0.008
 actually+0.008
 read+0.007
 count+0.007
Neg logit contributions
ed-0.022
rified-0.020
apped-0.020
attered-0.020
uttered-0.019
TokenSam
Token ID16305
Feature activation+5.80e-03

Pos logit contributions
 solve+0.033
 earn+0.032
 read+0.031
 count+0.031
 actually+0.030
Neg logit contributions
ed-0.033
apped-0.030
rified-0.029
uttered-0.029
attered-0.028
Token took
Token ID1718
Feature activation+1.80e-02

Pos logit contributions
ed+0.011
rified+0.009
apped+0.009
ing+0.009
uttered+0.009
Neg logit contributions
 earn-0.009
 solve-0.009
 actually-0.009
 rely-0.009
 seem-0.008
Token a
Token ID257
Feature activation+4.47e-02

Pos logit contributions
ed+0.036
attered+0.034
ho+0.034
rified+0.033
apped+0.033
Neg logit contributions
 solve-0.025
 count-0.023
 earn-0.023
 read-0.022
 seem-0.021
Token step
Token ID2239
Feature activation+5.07e-02

Pos logit contributions
ed+0.074
apped+0.066
rified+0.064
attered+0.063
ipt+0.063
Neg logit contributions
 solve-0.085
 read-0.082
 earn-0.080
 count-0.080
 seem-0.078
Token forward
Token ID2651
Feature activation+1.53e-01

Pos logit contributions
ed+0.084
 Ow+0.082
apped+0.082
Look+0.081
rified+0.081
Neg logit contributions
 solve-0.069
 count-0.068
-0.067
 read-0.067
 also-0.066
Token and
Token ID290
Feature activation+1.45e-02

Pos logit contributions
ed+0.267
rified+0.227
apped+0.224
ing+0.218
uttered+0.211
Neg logit contributions
 earn-0.259
 solve-0.249
 actually-0.248
 rely-0.237
 seem-0.237
Token bowed
Token ID35737
Feature activation+2.27e-02

Pos logit contributions
ed+0.028
apped+0.026
rified+0.026
attered+0.025
uttered+0.025
Neg logit contributions
 solve-0.020
 actually-0.020
 read-0.019
 earn-0.019
 count-0.018
Token.
Token ID13
Feature activation+9.97e-03

Pos logit contributions
rified+0.043
attered+0.042
w+0.042
ed+0.041
apped+0.041
Neg logit contributions
 read-0.028
 actually-0.028
 solve-0.027
 count-0.026
 seem-0.025
Token He
Token ID679
Feature activation+1.69e-02

Pos logit contributions
ed+0.020
rified+0.018
apped+0.018
uttered+0.017
ing+0.017
Neg logit contributions
 earn-0.014
 solve-0.013
 actually-0.013
 rely-0.013
 count-0.013
Token saw
Token ID2497
Feature activation+2.74e-02

Pos logit contributions
ed+0.032
rified+0.026
apped+0.026
ing+0.025
uttered+0.024
Neg logit contributions
 earn-0.028
 solve-0.026
 seem-0.025
 rely-0.025
 actually-0.025
Token the
Token ID262
Feature activation-1.87e-02

Pos logit contributions
 earn+0.007
 solve+0.007
 actually+0.007
 seem+0.007
 count+0.007
Neg logit contributions
ed-0.010
rified-0.009
apped-0.009
ing-0.009
uttered-0.008
Token boy
Token ID2933
Feature activation+5.19e-03

Pos logit contributions
 earn+0.028
 solve+0.028
 actually+0.027
 read+0.027
 seem+0.026
Neg logit contributions
ed-0.036
apped-0.031
rified-0.031
uttered-0.030
attered-0.030
Token took
Token ID1718
Feature activation+2.05e-02

Pos logit contributions
ed+0.009
apped+0.007
rified+0.007
eled+0.007
ted+0.007
Neg logit contributions
 earn-0.010
 solve-0.009
 seem-0.009
 rely-0.009
 actually-0.009
Token a
Token ID257
Feature activation+4.76e-02

Pos logit contributions
ed+0.040
rified+0.037
apped+0.037
attered+0.037
uttered+0.036
Neg logit contributions
 solve-0.028
 earn-0.027
 read-0.026
 count-0.026
 seem-0.026
Token step
Token ID2239
Feature activation+5.36e-02

Pos logit contributions
ed+0.089
apped+0.080
rified+0.079
ipt+0.076
attered+0.075
Neg logit contributions
 solve-0.078
 read-0.074
 earn-0.073
 count-0.073
 seem-0.072
Token forward
Token ID2651
Feature activation+1.51e-01

Pos logit contributions
ed+0.079
rified+0.074
apped+0.073
Look+0.071
uttered+0.070
Neg logit contributions
 solve-0.088
 count-0.086
 actually-0.085
 read-0.084
 seem-0.083
Token.
Token ID13
Feature activation+4.72e-03

Pos logit contributions
ed+0.299
ing+0.256
rified+0.254
apped+0.251
uttered+0.238
Neg logit contributions
 earn-0.225
 actually-0.208
 solve-0.206
 rely-0.196
 cost-0.194
Token Then
Token ID3244
Feature activation-3.25e-02

Pos logit contributions
ed+0.009
apped+0.008
rified+0.008
elt+0.007
uttered+0.007
Neg logit contributions
 earn-0.007
 actually-0.007
 solve-0.007
 rely-0.007
 count-0.007
Token the
Token ID262
Feature activation-3.12e-02

Pos logit contributions
 solve+0.037
 read+0.035
 earn+0.034
 actually+0.034
 count+0.034
Neg logit contributions
ed-0.073
attered-0.068
uttered-0.067
rified-0.067
apped-0.066
Token bird
Token ID6512
Feature activation+1.04e-02

Pos logit contributions
 earn+0.044
 solve+0.043
 actually+0.043
 read+0.042
 seem+0.041
Neg logit contributions
ed-0.062
apped-0.055
rified-0.054
uttered-0.053
attered-0.052
Token flew
Token ID13112
Feature activation+5.30e-02

Pos logit contributions
ed+0.018
rified+0.015
apped+0.014
ing+0.014
ted+0.014
Neg logit contributions
 earn-0.019
 solve-0.018
 actually-0.017
 seem-0.017
 rely-0.017
TokenA
Token ID32
Feature activation+7.08e-03

Pos logit contributions
 actually+0.018
 seem+0.018
 rely+0.018
 earn+0.018
 count+0.017
Neg logit contributions
ed-0.019
rified-0.016
apped-0.016
ing-0.016
ipt-0.015
Token little
Token ID1310
Feature activation+3.15e-02

Pos logit contributions
ed+0.013
rified+0.011
apped+0.011
ing+0.011
uttered+0.011
Neg logit contributions
 earn-0.011
 solve-0.011
 actually-0.011
 seem-0.010
 rely-0.010
Token girl
Token ID2576
Feature activation-9.26e-03

Pos logit contributions
ed+0.056
attered+0.050
umbled+0.049
uttered+0.048
w+0.047
Neg logit contributions
 read-0.050
 solve-0.049
 count-0.049
 seem-0.046
 rely-0.045
Token quickly
Token ID2952
Feature activation+2.11e-02

Pos logit contributions
 solve+0.011
 earn+0.011
 actually+0.011
 read+0.011
 count+0.011
Neg logit contributions
ed-0.020
rified-0.018
apped-0.017
ing-0.017
attered-0.017
Token ran
Token ID4966
Feature activation+3.77e-02

Pos logit contributions
ed+0.038
rified+0.033
apped+0.032
ing+0.031
uttered+0.031
Neg logit contributions
 earn-0.034
 solve-0.033
 actually-0.033
 seem-0.032
 count-0.032
Token forward
Token ID2651
Feature activation+1.51e-01

Pos logit contributions
w+0.068
rified+0.066
eled+0.066
ed+0.065
iated+0.064
Neg logit contributions
 solve-0.055
 read-0.055
 count-0.049
 earn-0.049
 learn-0.045
Token.
Token ID13
Feature activation-2.45e-03

Pos logit contributions
ed+0.242
ing+0.195
apped+0.192
rified+0.190
ho+0.181
Neg logit contributions
 earn-0.284
 actually-0.270
 solve-0.261
 rely-0.260
 cost-0.255
Token She
Token ID1375
Feature activation-1.51e-02

Pos logit contributions
 earn+0.004
 actually+0.004
 solve+0.004
 rely+0.004
 count+0.004
Neg logit contributions
ed-0.005
apped-0.004
rified-0.004
elt-0.004
uttered-0.004
Token was
Token ID373
Feature activation+1.57e-02

Pos logit contributions
 earn+0.026
 solve+0.025
 seem+0.025
 rely+0.024
 actually+0.023
Neg logit contributions
ed-0.027
apped-0.022
rified-0.021
ho-0.021
uttered-0.021
Token excited
Token ID6568
Feature activation+2.76e-03

Pos logit contributions
ed+0.028
apped+0.025
rified+0.024
ing+0.023
uttered+0.023
Neg logit contributions
 earn-0.026
 solve-0.025
 seem-0.024
 rely-0.023
 actually-0.023
Token to
Token ID284
Feature activation-2.56e-02

Pos logit contributions
ed+0.004
rified+0.003
apped+0.003
ing+0.003
uttered+0.003
Neg logit contributions
 earn-0.005
 solve-0.005
 actually-0.005
 rely-0.005
 count-0.005
Token and
Token ID290
Feature activation-5.91e-03

Pos logit contributions
ed+0.018
rified+0.015
apped+0.015
ing+0.015
elt+0.014
Neg logit contributions
 earn-0.018
 solve-0.017
 actually-0.017
 rely-0.016
 count-0.016
Token started
Token ID2067
Feature activation+1.02e-02

Pos logit contributions
 earn+0.009
 solve+0.009
 actually+0.009
 rely+0.009
 count+0.009
Neg logit contributions
ed-0.011
rified-0.009
apped-0.009
ing-0.009
uttered-0.009
Token to
Token ID284
Feature activation-1.81e-02

Pos logit contributions
ed+0.014
rified+0.011
apped+0.011
ing+0.011
uttered+0.010
Neg logit contributions
 earn-0.021
 solve-0.020
 actually-0.020
 rely-0.020
 seem-0.020
Token throw
Token ID3714
Feature activation+2.96e-03

Pos logit contributions
 actually+0.027
 earn+0.027
 solve+0.026
 rely+0.026
 seem+0.025
Neg logit contributions
ed-0.031
rified-0.027
 breath-0.026
ho-0.025
apped-0.025
Token a
Token ID257
Feature activation+5.36e-02

Pos logit contributions
ed+0.006
apped+0.006
uttered+0.005
rified+0.005
ipt+0.005
Neg logit contributions
 earn-0.004
 solve-0.004
 actually-0.004
 read-0.004
 seem-0.004
Token tant
Token ID24246
Feature activation+1.50e-01

Pos logit contributions
ed+0.108
apped+0.101
rified+0.098
uttered+0.097
ipt+0.096
Neg logit contributions
 solve-0.072
 earn-0.071
 read-0.069
 else-0.068
 seem-0.067
Tokenrum
Token ID6582
Feature activation+2.98e-02

Pos logit contributions
ed+0.300
rified+0.271
apped+0.265
uttered+0.254
attered+0.250
Neg logit contributions
 earn-0.204
 solve-0.201
 actually-0.193
 seem-0.188
 count-0.188
Token.
Token ID13
Feature activation+4.47e-03

Pos logit contributions
ed+0.077
rified+0.071
apped+0.069
uttered+0.067
ing+0.067
Neg logit contributions
 earn-0.023
 solve-0.022
 actually-0.022
 seem-0.020
 count-0.020
Token Suddenly
Token ID24975
Feature activation-6.93e-03

Pos logit contributions
ed+0.010
apped+0.009
rified+0.009
uttered+0.008
ing+0.008
Neg logit contributions
 earn-0.006
 actually-0.005
 solve-0.005
 rely-0.005
 count-0.005
Token,
Token ID11
Feature activation+7.90e-03

Pos logit contributions
 solve+0.008
 read+0.007
 count+0.007
 earn+0.007
 seem+0.007
Neg logit contributions
ed-0.015
rified-0.014
ipt-0.014
attered-0.014
apped-0.014
Token she
Token ID673
Feature activation+6.84e-04

Pos logit contributions
ed+0.020
rified+0.018
apped+0.018
attered+0.018
ipt+0.018
Neg logit contributions
 solve-0.007
 earn-0.007
 read-0.006
 actually-0.006
 seem-0.006
Token.
Token ID13
Feature activation-6.18e-04

Pos logit contributions
ed+0.013
ing+0.011
 down+0.010
rified+0.010
 away+0.010
Neg logit contributions
 earn-0.013
 dentist-0.011
 solve-0.011
uries-0.011
ird-0.011
Token She
Token ID1375
Feature activation-2.36e-02

Pos logit contributions
 earn+0.001
 solve+0.001
 actually+0.001
 cost+0.001
aches+0.001
Neg logit contributions
ed-0.001
apped-0.001
rified-0.001
elt-0.001
uttered-0.001
Token took
Token ID1718
Feature activation+2.16e-02

Pos logit contributions
 earn+0.042
 solve+0.039
 seem+0.039
 rely+0.038
 count+0.036
Neg logit contributions
ed-0.041
apped-0.033
ho-0.032
rified-0.032
ted-0.032
Token a
Token ID257
Feature activation+4.67e-02

Pos logit contributions
attered+0.045
ed+0.043
apped+0.043
ho+0.043
rified+0.043
Neg logit contributions
 solve-0.024
 read-0.024
 seem-0.023
 count-0.022
 earn-0.022
Token step
Token ID2239
Feature activation+5.34e-02

Pos logit contributions
ed+0.092
apped+0.083
rified+0.083
ipt+0.080
attered+0.079
Neg logit contributions
 solve-0.073
 read-0.071
 count-0.068
 earn-0.067
 seem-0.066
Token forward
Token ID2651
Feature activation+1.50e-01

Pos logit contributions
rified+0.087
 Ow+0.086
Look+0.083
apped+0.082
ASH+0.081
Neg logit contributions
-0.071
 also-0.071
 read-0.070
 count-0.069
 solve-0.066
Token to
Token ID284
Feature activation+2.09e-02

Pos logit contributions
ed+0.270
rified+0.234
apped+0.230
ing+0.219
uttered+0.217
Neg logit contributions
 earn-0.243
 solve-0.238
 actually-0.234
 count-0.226
 seem-0.225
Token get
Token ID651
Feature activation-1.40e-02

Pos logit contributions
ed+0.029
 breath+0.026
rified+0.025
 down+0.025
ted+0.025
Neg logit contributions
 earn-0.037
 cost-0.035
 actually-0.035
case-0.035
uries-0.034
Token a
Token ID257
Feature activation+3.43e-02

Pos logit contributions
 actually+0.019
 solve+0.018
 earn+0.018
 seem+0.018
 read+0.018
Neg logit contributions
ed-0.027
rified-0.026
apped-0.025
ipt-0.024
attered-0.024
Token closer
Token ID5699
Feature activation+4.93e-02

Pos logit contributions
ed+0.070
rified+0.062
apped+0.061
ipt+0.059
attered+0.058
Neg logit contributions
 solve-0.050
 earn-0.049
 actually-0.048
 count-0.048
 read-0.047
Token look
Token ID804
Feature activation-4.29e-03

Pos logit contributions
ed+0.101
rified+0.089
apped+0.088
uttered+0.084
ing+0.084
Neg logit contributions
 solve-0.067
 earn-0.067
 actually-0.065
 count-0.064
 seem-0.064
Token Suddenly
Token ID24975
Feature activation-1.17e-02

Pos logit contributions
 rely+0.008
 actually+0.008
 earn+0.008
 solve+0.008
 seem+0.007
Neg logit contributions
ed-0.009
rified-0.008
apped-0.007
elt-0.007
attered-0.007
Token,
Token ID11
Feature activation-2.37e-03

Pos logit contributions
 solve+0.012
 count+0.011
 read+0.011
 actually+0.011
 earn+0.011
Neg logit contributions
ed-0.026
attered-0.025
rified-0.025
uttered-0.025
ipt-0.024
Token the
Token ID262
Feature activation-4.07e-02

Pos logit contributions
 earn+0.002
 solve+0.002
 actually+0.002
 read+0.002
 count+0.002
Neg logit contributions
ed-0.006
apped-0.005
rified-0.005
attered-0.005
uttered-0.005
Token bird
Token ID6512
Feature activation+1.75e-03

Pos logit contributions
 solve+0.061
 actually+0.060
 earn+0.059
 read+0.059
 count+0.058
Neg logit contributions
ed-0.079
rified-0.069
apped-0.067
uttered-0.065
ing-0.065
Token leaned
Token ID23831
Feature activation+3.74e-02

Pos logit contributions
ed+0.003
rified+0.003
apped+0.003
ing+0.003
uttered+0.002
Neg logit contributions
 earn-0.003
 solve-0.003
 actually-0.003
 seem-0.003
 count-0.003
Token forward
Token ID2651
Feature activation+1.50e-01

Pos logit contributions
suits+0.058
rified+0.056
 tant+0.055
arp+0.055
icol+0.054
Neg logit contributions
 actually-0.055
 also-0.054
 read-0.053
 not-0.049
 always-0.047
Token and
Token ID290
Feature activation+2.15e-02

Pos logit contributions
ed+0.269
rified+0.240
apped+0.234
uttered+0.223
attered+0.222
Neg logit contributions
 solve-0.234
 earn-0.232
 actually-0.227
 count-0.222
 seem-0.220
Token bit
Token ID1643
Feature activation+4.47e-02

Pos logit contributions
ed+0.039
rified+0.033
apped+0.032
ted+0.032
ing+0.031
Neg logit contributions
 earn-0.035
 solve-0.033
 actually-0.033
 rely-0.033
 seem-0.032
Token Tom
Token ID4186
Feature activation+1.65e-02

Pos logit contributions
ed+0.096
apped+0.091
uttered+0.091
rified+0.087
xc+0.084
Neg logit contributions
 actually-0.048
 solve-0.046
 earn-0.045
 seem-0.044
 also-0.044
Token!
Token ID0
Feature activation+1.47e-02

Pos logit contributions
ed+0.036
rified+0.030
ing+0.030
apped+0.029
ho+0.028
Neg logit contributions
 earn-0.023
 solve-0.021
 actually-0.020
 rely-0.020
 seem-0.019
Token It
Token ID632
Feature activation-1.85e-03

Pos logit contributions
ed+0.028
rified+0.025
apped+0.025
ing+0.023
elt+0.023
Neg logit contributions
 earn-0.022
 solve-0.022
 actually-0.021
 rely-0.021
 count-0.020
Token was
Token ID373
Feature activation+1.36e-03

Pos logit contributions
 earn+0.021
 solve+0.020
 actually+0.019
 seem+0.019
 rely+0.019
Neg logit contributions
ed-0.025
rified-0.021
apped-0.021
ing-0.021
uttered-0.020
Token amazed
Token ID24562
Feature activation-2.60e-02

Pos logit contributions
ed+0.002
apped+0.002
rified+0.002
ing+0.002
uttered+0.002
Neg logit contributions
 earn-0.002
 solve-0.002
 seem-0.002
 actually-0.002
 count-0.002
Token.
Token ID13
Feature activation+9.51e-03

Pos logit contributions
 actually+0.032
 solve+0.031
 seem+0.029
 read+0.028
 earn+0.027
Neg logit contributions
ed-0.055
apped-0.051
uttered-0.050
aked-0.049
w-0.049
Token He
Token ID679
Feature activation+2.82e-04

Pos logit contributions
ed+0.017
apped+0.015
rified+0.015
ipt+0.014
elt+0.014
Neg logit contributions
 earn-0.016
 solve-0.016
 count-0.016
 actually-0.015
aches-0.015
Token leaned
Token ID23831
Feature activation+3.20e-02

Pos logit contributions
ed+0.001
apped+0.000
rified+0.000
eled+0.000
uttered+0.000
Neg logit contributions
 earn-0.000
 solve-0.000
 seem-0.000
 rely-0.000
 count-0.000
Token forward
Token ID2651
Feature activation+1.50e-01

Pos logit contributions
attered+0.047
suits+0.046
rified+0.045
w+0.045
ed+0.044
Neg logit contributions
 actually-0.051
 read-0.049
 also-0.049
 solve-0.045
 not-0.045
Token in
Token ID287
Feature activation+2.16e-02

Pos logit contributions
ed+0.269
rified+0.237
apped+0.233
uttered+0.220
ing+0.219
Neg logit contributions
 earn-0.237
 solve-0.235
 actually-0.229
 count-0.223
 seem-0.222
Token his
Token ID465
Feature activation+6.52e-02

Pos logit contributions
ed+0.044
apped+0.039
rified+0.039
attered+0.038
uttered+0.037
Neg logit contributions
 solve-0.030
 earn-0.030
 actually-0.029
 count-0.029
 read-0.029
Token seat
Token ID5852
Feature activation+1.28e-02

Pos logit contributions
ed+0.101
rified+0.083
apped+0.083
ing+0.080
uttered+0.077
Neg logit contributions
 earn-0.120
 actually-0.118
 solve-0.117
 seem-0.114
 read-0.110
Token and
Token ID290
Feature activation+2.89e-02

Pos logit contributions
ed+0.024
rified+0.022
apped+0.021
uttered+0.020
attered+0.020
Neg logit contributions
 earn-0.018
 solve-0.018
 actually-0.018
 read-0.017
 count-0.017
Token asked
Token ID1965
Feature activation-2.89e-02

Pos logit contributions
ed+0.056
rified+0.047
apped+0.046
ing+0.045
uttered+0.044
Neg logit contributions
 earn-0.045
 solve-0.043
 actually-0.043
 seem-0.041
 rely-0.041
Token the
Token ID262
Feature activation+1.77e-02

Pos logit contributions
ed+0.070
rified+0.062
apped+0.061
ing+0.059
uttered+0.058
Neg logit contributions
 earn-0.053
 solve-0.052
 actually-0.051
 count-0.049
 seem-0.049
Token road
Token ID2975
Feature activation-8.08e-03

Pos logit contributions
ed+0.036
rified+0.032
apped+0.031
ing+0.030
uttered+0.030
Neg logit contributions
 earn-0.025
 solve-0.024
 actually-0.023
 rely-0.022
 seem-0.022
Token.
Token ID13
Feature activation+4.06e-03

Pos logit contributions
 earn+0.010
 solve+0.009
 actually+0.009
 rely+0.009
 count+0.008
Neg logit contributions
ed-0.018
ing-0.016
rified-0.016
apped-0.016
ho-0.015
Token She
Token ID1375
Feature activation-1.44e-02

Pos logit contributions
ed+0.008
rified+0.007
apped+0.007
elt+0.006
uttered+0.006
Neg logit contributions
 earn-0.007
 solve-0.006
 actually-0.006
 rely-0.006
 count-0.006
Token ran
Token ID4966
Feature activation+4.16e-02

Pos logit contributions
 earn+0.025
 solve+0.024
 seem+0.023
 rely+0.023
 actually+0.022
Neg logit contributions
ed-0.026
apped-0.021
rified-0.021
uttered-0.020
ing-0.020
Token forward
Token ID2651
Feature activation+1.49e-01

Pos logit contributions
ed+0.066
apped+0.064
rified+0.064
w+0.062
eled+0.060
Neg logit contributions
 solve-0.071
 count-0.068
 read-0.066
 actually-0.066
 earn-0.065
Token and
Token ID290
Feature activation+1.54e-02

Pos logit contributions
ed+0.255
rified+0.213
apped+0.209
ing+0.207
uttered+0.197
Neg logit contributions
 earn-0.261
 actually-0.249
 solve-0.249
 rely-0.240
 seem-0.237
Token started
Token ID2067
Feature activation+1.85e-02

Pos logit contributions
ed+0.029
rified+0.026
apped+0.025
ing+0.024
uttered+0.024
Neg logit contributions
 earn-0.023
 solve-0.023
 actually-0.023
 count-0.022
 seem-0.021
Token collecting
Token ID13157
Feature activation-5.59e-02

Pos logit contributions
ed+0.029
rified+0.025
apped+0.024
ing+0.023
uttered+0.023
Neg logit contributions
 earn-0.034
 solve-0.033
 actually-0.033
 seem-0.032
 count-0.032
Token them
Token ID606
Feature activation+1.98e-02

Pos logit contributions
 earn+0.091
 solve+0.090
 actually+0.088
 count+0.086
 seem+0.085
Neg logit contributions
ed-0.098
rified-0.086
apped-0.085
uttered-0.080
ing-0.079
Token.
Token ID13
Feature activation+1.18e-02

Pos logit contributions
ed+0.040
ing+0.035
rified+0.033
apped+0.032
 down+0.031
Neg logit contributions
 earn-0.030
 actually-0.026
 rely-0.026
 solve-0.026
 dentist-0.025
Token Did
Token ID7731
Feature activation-3.82e-02

Pos logit contributions
 earn+0.057
 solve+0.055
 actually+0.055
 seem+0.052
 count+0.051
Neg logit contributions
ed-0.067
apped-0.059
rified-0.058
uttered-0.055
ing-0.055
Token you
Token ID345
Feature activation-3.81e-02

Pos logit contributions
 earn+0.063
 actually+0.059
 solve+0.056
 rely+0.055
 cost+0.055
Neg logit contributions
ed-0.061
ing-0.053
w-0.052
rified-0.052
apped-0.051
Token know
Token ID760
Feature activation-8.28e-02

Pos logit contributions
 earn+0.046
 dentist+0.042
 actually+0.042
lf+0.041
 solve+0.041
Neg logit contributions
ed-0.075
apped-0.066
 breath-0.065
w-0.064
rified-0.063
Token that
Token ID326
Feature activation-1.17e-02

Pos logit contributions
 earn+0.125
 seem+0.111
 rely+0.111
 cost+0.110
 solve+0.110
Neg logit contributions
ed-0.152
ing-0.146
rified-0.130
apped-0.129
 down-0.128
Token cater
Token ID20825
Feature activation-6.16e-02

Pos logit contributions
 earn+0.019
 solve+0.017
 actually+0.017
 rely+0.017
 seem+0.017
Neg logit contributions
ed-0.020
ing-0.018
rified-0.017
apped-0.016
w-0.016
Tokenpill
Token ID27215
Feature activation-1.91e-01

Pos logit contributions
 earn+0.116
 solve+0.105
 count+0.104
 actually+0.100
 dentist+0.098
Neg logit contributions
ed-0.110
ing-0.096
rified-0.093
ted-0.088
w-0.088
Tokenars
Token ID945
Feature activation-4.17e-02

Pos logit contributions
 dentist+0.399
 earn+0.370
lf+0.356
 shelf+0.354
 read+0.346
Neg logit contributions
ed-0.280
w-0.237
aked-0.230
apped-0.215
rified-0.204
Token turn
Token ID1210
Feature activation-1.52e-02

Pos logit contributions
 earn+0.064
 solve+0.059
 rely+0.059
 actually+0.057
 count+0.056
Neg logit contributions
ed-0.079
ing-0.068
rified-0.064
w-0.063
apped-0.063
Token into
Token ID656
Feature activation-4.86e-02

Pos logit contributions
 actually+0.026
 earn+0.026
 solve+0.026
 seem+0.025
 count+0.025
Neg logit contributions
ed-0.025
rified-0.022
apped-0.022
ipt-0.020
uttered-0.020
Token beautiful
Token ID4950
Feature activation-1.86e-02

Pos logit contributions
 earn+0.085
 solve+0.079
 seem+0.078
 rely+0.077
 actually+0.077
Neg logit contributions
ed-0.077
ing-0.064
rified-0.064
apped-0.062
 breath-0.059
Token butterflies
Token ID44571
Feature activation-3.18e-02

Pos logit contributions
 earn+0.031
 solve+0.031
 actually+0.030
 count+0.029
 seem+0.029
Neg logit contributions
ed-0.032
rified-0.028
apped-0.028
ing-0.026
uttered-0.026
Token,
Token ID11
Feature activation+2.39e-02

Pos logit contributions
 solve+0.006
 seem+0.006
 earn+0.005
 actually+0.005
 count+0.005
Neg logit contributions
ed-0.010
rified-0.009
attered-0.009
ipt-0.009
eled-0.009
Token "
Token ID366
Feature activation+4.25e-02

Pos logit contributions
ed+0.043
ing+0.037
apped+0.037
ho+0.036
rified+0.035
Neg logit contributions
 earn-0.042
 solve-0.039
 rely-0.038
 count-0.036
 actually-0.036
TokenI
Token ID40
Feature activation+4.90e-03

Pos logit contributions
ed+0.068
apped+0.062
uttered+0.059
rified+0.056
ing+0.055
Neg logit contributions
 earn-0.076
 read-0.074
 actually-0.073
 solve-0.073
 seem-0.071
Token love
Token ID1842
Feature activation+1.62e-02

Pos logit contributions
ed+0.010
rified+0.009
apped+0.009
ing+0.009
uttered+0.009
Neg logit contributions
 earn-0.007
 solve-0.006
 actually-0.006
 seem-0.006
 count-0.006
Token cater
Token ID20825
Feature activation-5.77e-02

Pos logit contributions
ed+0.030
ing+0.026
rified+0.026
apped+0.025
ted+0.025
Neg logit contributions
 earn-0.026
 solve-0.024
 actually-0.024
 rely-0.023
 seem-0.023
Tokenpill
Token ID27215
Feature activation-1.76e-01

Pos logit contributions
 earn+0.075
 solve+0.069
 actually+0.068
 count+0.067
 rely+0.064
Neg logit contributions
ed-0.130
rified-0.116
ing-0.116
ted-0.111
apped-0.110
Tokenars
Token ID945
Feature activation-3.26e-02

Pos logit contributions
 dentist+0.339
 earn+0.335
 actually+0.334
 count+0.316
lf+0.308
Neg logit contributions
ed-0.265
w-0.222
apped-0.213
aked-0.211
ing-0.206
Token!
Token ID0
Feature activation-1.88e-02

Pos logit contributions
 earn+0.049
 solve+0.046
 actually+0.045
 rely+0.044
 seem+0.043
Neg logit contributions
ed-0.066
ing-0.058
apped-0.056
rified-0.056
w-0.052
Token You
Token ID921
Feature activation+1.47e-02

Pos logit contributions
 earn+0.034
 solve+0.033
 actually+0.033
 seem+0.032
aches+0.032
Neg logit contributions
ed-0.029
rified-0.026
apped-0.026
elt-0.024
ing-0.024
Token can
Token ID460
Feature activation+2.01e-02

Pos logit contributions
ed+0.024
rified+0.020
apped+0.019
w+0.018
ing+0.018
Neg logit contributions
 earn-0.026
 dentist-0.024
lf-0.024
 actually-0.023
aster-0.023
Token be
Token ID307
Feature activation-6.39e-02

Pos logit contributions
ed+0.033
ho+0.032
uck+0.031
apped+0.030
w+0.030
Neg logit contributions
 dentist-0.025
 earn-0.025
lf-0.024
ird-0.023
 homework-0.022
Token but
Token ID475
Feature activation-5.36e-03

Pos logit contributions
ed+0.006
rified+0.006
apped+0.005
ing+0.005
elt+0.005
Neg logit contributions
 earn-0.006
 solve-0.006
 rely-0.006
 count-0.006
 actually-0.006
Token he
Token ID339
Feature activation-6.46e-03

Pos logit contributions
 earn+0.006
 solve+0.006
 rely+0.005
 seem+0.005
 count+0.005
Neg logit contributions
ed-0.012
apped-0.011
rified-0.011
ing-0.011
ho-0.010
Token also
Token ID635
Feature activation-5.17e-02

Pos logit contributions
 earn+0.010
 solve+0.009
 seem+0.009
 count+0.009
 rely+0.009
Neg logit contributions
ed-0.013
rified-0.011
apped-0.010
ho-0.010
ing-0.010
Token knew
Token ID2993
Feature activation-2.41e-03

Pos logit contributions
 solve+0.091
 earn+0.090
 count+0.084
 seem+0.084
 rely+0.083
Neg logit contributions
ed-0.091
apped-0.073
rified-0.069
ho-0.066
elt-0.066
Token cater
Token ID20825
Feature activation-4.22e-02

Pos logit contributions
 earn+0.004
 solve+0.004
 seem+0.003
 rely+0.003
 count+0.003
Neg logit contributions
ed-0.004
ing-0.004
ted-0.004
rified-0.004
apped-0.004
Tokenpill
Token ID27215
Feature activation-1.69e-01

Pos logit contributions
 earn+0.057
 solve+0.054
 count+0.052
 actually+0.052
 rely+0.049
Neg logit contributions
ed-0.095
rified-0.083
ing-0.082
apped-0.079
ted-0.078
Tokenars
Token ID945
Feature activation-3.74e-02

Pos logit contributions
 dentist+0.328
 earn+0.317
lf+0.313
 count+0.307
 read+0.307
Neg logit contributions
ed-0.262
aked-0.218
w-0.205
apped-0.201
rified-0.199
Token were
Token ID547
Feature activation-2.09e-02

Pos logit contributions
 earn+0.047
 solve+0.045
 rely+0.043
 actually+0.043
 count+0.042
Neg logit contributions
ed-0.084
ing-0.071
rified-0.071
apped-0.070
ted-0.067
Token not
Token ID407
Feature activation+1.35e-02

Pos logit contributions
 earn+0.036
 solve+0.033
ird+0.032
 achieve+0.031
lf+0.031
Neg logit contributions
ed-0.032
 away-0.027
ing-0.026
 down-0.026
aked-0.026
Token pets
Token ID17252
Feature activation-6.24e-02

Pos logit contributions
ed+0.022
ing+0.019
 down+0.019
 away+0.018
 breath+0.018
Neg logit contributions
 earn-0.024
 solve-0.021
lf-0.021
case-0.021
 concentrate-0.020
Token.
Token ID13
Feature activation-7.88e-03

Pos logit contributions
 earn+0.084
 solve+0.074
 rely+0.073
 count+0.072
 actually+0.071
Neg logit contributions
ed-0.137
ing-0.122
rified-0.120
ho-0.112
apped-0.111
Token know
Token ID760
Feature activation-8.81e-02

Pos logit contributions
lf+0.011
 earn+0.010
 actually+0.010
 solve+0.010
 dentist+0.010
Neg logit contributions
ed-0.015
rified-0.013
w-0.012
 away-0.012
 breath-0.012
Token,
Token ID11
Feature activation-2.41e-02

Pos logit contributions
 earn+0.151
 rely+0.134
 solve+0.133
 actually+0.132
 cost+0.132
Neg logit contributions
ed-0.157
ing-0.147
rified-0.138
ted-0.129
 down-0.122
Token Lily
Token ID20037
Feature activation-4.46e-02

Pos logit contributions
 earn+0.039
 solve+0.037
 rely+0.037
 actually+0.035
 count+0.034
Neg logit contributions
ed-0.041
rified-0.039
ing-0.035
apped-0.034
ho-0.034
Token.
Token ID13
Feature activation-2.38e-02

Pos logit contributions
 earn+0.068
 solve+0.061
 actually+0.061
 seem+0.057
 count+0.057
Neg logit contributions
ed-0.093
ing-0.080
rified-0.076
apped-0.072
 down-0.072
Token Cater
Token ID40742
Feature activation-4.46e-02

Pos logit contributions
 earn+0.034
 solve+0.033
 actually+0.033
 count+0.031
 seem+0.031
Neg logit contributions
ed-0.047
rified-0.041
apped-0.041
ing-0.039
uttered-0.039
Tokenpill
Token ID27215
Feature activation-1.68e-01

Pos logit contributions
 earn+0.083
 solve+0.078
 actually+0.077
 count+0.075
 rely+0.073
Neg logit contributions
ed-0.079
rified-0.067
ing-0.067
apped-0.063
elt-0.060
Tokenars
Token ID945
Feature activation-1.80e-02

Pos logit contributions
 dentist+0.330
 earn+0.314
lf+0.312
 actually+0.305
 read+0.300
Neg logit contributions
ed-0.250
w-0.207
aked-0.204
apped-0.198
rified-0.192
Token need
Token ID761
Feature activation+2.41e-02

Pos logit contributions
 earn+0.028
 solve+0.027
 actually+0.027
 rely+0.026
 count+0.025
Neg logit contributions
ed-0.033
ing-0.027
rified-0.027
apped-0.027
w-0.026
Token leaves
Token ID5667
Feature activation-5.85e-02

Pos logit contributions
ed+0.044
rified+0.037
apped+0.036
ing+0.035
ted+0.034
Neg logit contributions
 earn-0.039
 solve-0.037
 actually-0.037
 rely-0.036
 seem-0.036
Token and
Token ID290
Feature activation-8.56e-03

Pos logit contributions
 earn+0.090
 solve+0.086
 actually+0.084
 rely+0.082
 count+0.081
Neg logit contributions
ed-0.114
ing-0.095
rified-0.095
apped-0.094
ho-0.089
Token air
Token ID1633
Feature activation-1.75e-02

Pos logit contributions
 earn+0.016
 solve+0.015
 actually+0.015
 rely+0.015
 seem+0.014
Neg logit contributions
ed-0.013
rified-0.010
apped-0.010
ted-0.010
elt-0.010
Token careful
Token ID8161
Feature activation-7.86e-03

Pos logit contributions
 earn+0.079
 solve+0.077
 actually+0.075
 count+0.072
 rely+0.072
Neg logit contributions
ed-0.100
rified-0.088
apped-0.087
ing-0.083
uttered-0.083
Token,
Token ID11
Feature activation-3.31e-02

Pos logit contributions
 earn+0.012
 solve+0.011
 actually+0.011
 rely+0.011
 count+0.011
Neg logit contributions
ed-0.015
ing-0.013
rified-0.013
apped-0.013
uttered-0.012
Token Lily
Token ID20037
Feature activation-5.67e-02

Pos logit contributions
 earn+0.048
 solve+0.046
 actually+0.045
 rely+0.044
 count+0.044
Neg logit contributions
ed-0.065
rified-0.057
apped-0.055
ing-0.055
uttered-0.052
Token.
Token ID13
Feature activation-3.30e-02

Pos logit contributions
 earn+0.082
 solve+0.078
 actually+0.075
 count+0.075
 seem+0.074
Neg logit contributions
ed-0.117
ing-0.103
rified-0.099
apped-0.096
w-0.091
Token Cater
Token ID40742
Feature activation-5.02e-02

Pos logit contributions
 earn+0.054
 solve+0.053
 actually+0.053
 rely+0.050
 count+0.050
Neg logit contributions
ed-0.058
rified-0.052
apped-0.051
uttered-0.048
elt-0.048
Tokenpill
Token ID27215
Feature activation-1.66e-01

Pos logit contributions
 earn+0.096
 actually+0.091
 solve+0.091
 count+0.089
 rely+0.086
Neg logit contributions
ed-0.086
ing-0.073
rified-0.072
apped-0.068
ted-0.064
Tokenars
Token ID945
Feature activation-2.53e-02

Pos logit contributions
 dentist+0.306
 earn+0.303
 read+0.293
 actually+0.291
 count+0.288
Neg logit contributions
ed-0.265
w-0.220
aked-0.217
apped-0.214
rified-0.209
Token can
Token ID460
Feature activation+1.85e-02

Pos logit contributions
 earn+0.037
 solve+0.036
 actually+0.036
 rely+0.035
 read+0.034
Neg logit contributions
ed-0.048
rified-0.042
apped-0.041
ing-0.041
uttered-0.039
Token be
Token ID307
Feature activation-5.63e-02

Pos logit contributions
ho+0.032
ed+0.031
w+0.030
uck+0.029
ing+0.029
Neg logit contributions
 earn-0.024
 dentist-0.023
uries-0.023
lf-0.023
ird-0.022
Token tricky
Token ID17198
Feature activation-3.47e-02

Pos logit contributions
 earn+0.097
 solve+0.090
 seem+0.087
 read+0.085
 rely+0.085
Neg logit contributions
ed-0.097
ing-0.079
rified-0.078
apped-0.076
aked-0.076
Token.
Token ID13
Feature activation-2.38e-02

Pos logit contributions
 earn+0.052
 actually+0.049
 solve+0.049
 read+0.046
 seem+0.046
Neg logit contributions
ed-0.068
ing-0.060
rified-0.056
apped-0.055
attered-0.051
Token you
Token ID345
Feature activation-1.11e-02

Pos logit contributions
 earn+0.058
 actually+0.056
 solve+0.055
 rely+0.054
 count+0.053
Neg logit contributions
ed-0.095
rified-0.085
ing-0.083
apped-0.080
ted-0.079
Token,
Token ID11
Feature activation-1.23e-02

Pos logit contributions
 earn+0.017
 rely+0.016
 actually+0.016
 solve+0.016
 count+0.015
Neg logit contributions
ed-0.021
rified-0.018
ing-0.017
apped-0.017
uttered-0.016
Token Lily
Token ID20037
Feature activation-5.00e-02

Pos logit contributions
 earn+0.020
 rely+0.019
 solve+0.018
 actually+0.018
 count+0.018
Neg logit contributions
ed-0.020
rified-0.018
ing-0.018
apped-0.017
 bree-0.016
Token.
Token ID13
Feature activation-2.52e-02

Pos logit contributions
 earn+0.080
 actually+0.074
 solve+0.073
 count+0.071
 rely+0.071
Neg logit contributions
ed-0.096
ing-0.084
rified-0.079
apped-0.078
uck-0.072
Token Cater
Token ID40742
Feature activation-4.46e-02

Pos logit contributions
 earn+0.039
 solve+0.038
 actually+0.038
 count+0.036
 read+0.036
Neg logit contributions
ed-0.047
rified-0.042
apped-0.041
ing-0.039
uttered-0.038
Tokenpill
Token ID27215
Feature activation-1.62e-01

Pos logit contributions
 earn+0.062
 solve+0.059
 actually+0.058
 count+0.057
 rely+0.054
Neg logit contributions
ed-0.100
ing-0.087
rified-0.086
apped-0.083
ted-0.080
Tokenars
Token ID945
Feature activation-2.30e-02

Pos logit contributions
 earn+0.272
 dentist+0.260
 actually+0.260
 count+0.258
 solve+0.257
Neg logit contributions
ed-0.283
apped-0.238
rified-0.232
aked-0.226
w-0.225
Token can
Token ID460
Feature activation+2.40e-02

Pos logit contributions
 earn+0.034
 actually+0.033
 solve+0.033
 rely+0.032
 count+0.031
Neg logit contributions
ed-0.044
rified-0.038
apped-0.037
ing-0.037
uttered-0.035
Token leave
Token ID2666
Feature activation-8.44e-03

Pos logit contributions
ed+0.043
ho+0.039
 breath+0.038
w+0.037
uck+0.037
Neg logit contributions
 earn-0.029
 dentist-0.028
 library-0.027
uries-0.027
 reading-0.027
Token and
Token ID290
Feature activation-5.42e-03

Pos logit contributions
 earn+0.013
 solve+0.013
 actually+0.013
 rely+0.012
 count+0.012
Neg logit contributions
ed-0.016
rified-0.014
ing-0.013
apped-0.013
ted-0.013
Token never
Token ID1239
Feature activation+5.87e-02

Pos logit contributions
 earn+0.006
 actually+0.005
 solve+0.005
 seem+0.005
 rely+0.005
Neg logit contributions
ed-0.012
rified-0.010
apped-0.010
ing-0.010
ho-0.010
Token Lily
Token ID20037
Feature activation-5.24e-02

Pos logit contributions
 earn+0.049
 rely+0.046
 solve+0.045
 actually+0.043
 achieve+0.041
Neg logit contributions
ed-0.052
rified-0.048
apped-0.045
ho-0.044
ted-0.043
Token,
Token ID11
Feature activation-1.02e-02

Pos logit contributions
 earn+0.093
 actually+0.087
 solve+0.086
 rely+0.082
 seem+0.082
Neg logit contributions
ed-0.095
rified-0.078
ing-0.077
apped-0.073
ho-0.070
Token no
Token ID645
Feature activation+6.22e-02

Pos logit contributions
 earn+0.015
 actually+0.013
 rely+0.013
 solve+0.013
 read+0.012
Neg logit contributions
ed-0.018
rified-0.017
ho-0.016
ing-0.016
apped-0.016
Token.
Token ID13
Feature activation-3.22e-02

Pos logit contributions
ed+0.091
ing+0.090
rified+0.078
w+0.077
apped+0.075
Neg logit contributions
 earn-0.117
 actually-0.108
 solve-0.101
uries-0.100
 rely-0.100
Token Cater
Token ID40742
Feature activation-4.34e-02

Pos logit contributions
 earn+0.051
 actually+0.050
 solve+0.050
 rely+0.047
 read+0.047
Neg logit contributions
ed-0.060
rified-0.053
apped-0.053
ing-0.049
uttered-0.049
Tokenpill
Token ID27215
Feature activation-1.60e-01

Pos logit contributions
 earn+0.064
 solve+0.059
 actually+0.058
 count+0.056
 rely+0.055
Neg logit contributions
ed-0.094
ing-0.081
rified-0.081
apped-0.078
ted-0.076
Tokenars
Token ID945
Feature activation-2.61e-02

Pos logit contributions
 dentist+0.322
 earn+0.308
 read+0.297
 actually+0.296
 count+0.296
Neg logit contributions
ed-0.238
w-0.197
aked-0.192
apped-0.188
rified-0.176
Token are
Token ID389
Feature activation-2.78e-02

Pos logit contributions
 earn+0.041
 solve+0.040
 actually+0.040
 rely+0.039
 count+0.038
Neg logit contributions
ed-0.048
rified-0.040
ing-0.040
apped-0.040
w-0.039
Token not
Token ID407
Feature activation+9.24e-03

Pos logit contributions
 earn+0.045
 solve+0.039
ird+0.038
 concentrate+0.038
 rely+0.038
Neg logit contributions
ed-0.046
 down-0.040
 away-0.040
 breath-0.040
ho-0.039
Token for
Token ID329
Feature activation-2.86e-02

Pos logit contributions
ed+0.014
ing+0.012
 down+0.012
 away+0.011
apped+0.011
Neg logit contributions
 earn-0.018
 solve-0.015
 concentrate-0.015
lf-0.015
 read-0.015
Token keeping
Token ID5291
Feature activation-3.66e-02

Pos logit contributions
 earn+0.049
 rely+0.045
 actually+0.044
case+0.044
 solve+0.043
Neg logit contributions
ed-0.046
ing-0.041
apped-0.039
 breath-0.039
ted-0.037
Tokenily
Token ID813
Feature activation-2.86e-02

Pos logit contributions
ed+0.017
rified+0.016
apped+0.015
uttered+0.014
attered+0.014
Neg logit contributions
 earn-0.011
 solve-0.011
 actually-0.011
 count-0.010
 seem-0.010
Token liked
Token ID8288
Feature activation+2.29e-02

Pos logit contributions
 earn+0.047
 solve+0.045
 actually+0.045
 seem+0.043
 rely+0.043
Neg logit contributions
ed-0.052
rified-0.044
apped-0.044
ing-0.042
uttered-0.041
Token to
Token ID284
Feature activation-3.68e-02

Pos logit contributions
ed+0.033
rified+0.027
apped+0.027
ing+0.025
uttered+0.025
Neg logit contributions
 earn-0.045
 solve-0.044
 actually-0.044
 count-0.042
 seem-0.042
Token keep
Token ID1394
Feature activation-9.45e-03

Pos logit contributions
 actually+0.054
 earn+0.050
uries+0.047
 rely+0.046
isy+0.046
Neg logit contributions
ed-0.059
 breath-0.056
apped-0.052
ibly-0.050
 bree-0.050
Token cater
Token ID20825
Feature activation-4.28e-02

Pos logit contributions
 earn+0.014
 solve+0.013
 actually+0.013
 seem+0.013
 rely+0.012
Neg logit contributions
ed-0.019
apped-0.016
ing-0.016
rified-0.016
ted-0.015
Tokenpill
Token ID27215
Feature activation-1.59e-01

Pos logit contributions
 earn+0.056
 solve+0.054
 actually+0.053
 count+0.051
 rely+0.050
Neg logit contributions
ed-0.092
rified-0.082
ing-0.081
apped-0.080
ted-0.077
Tokenars
Token ID945
Feature activation-3.41e-03

Pos logit contributions
 earn+0.305
 actually+0.305
 dentist+0.296
 cost+0.293
 count+0.291
Neg logit contributions
ed-0.223
w-0.184
apped-0.179
ing-0.179
aked-0.178
Token in
Token ID287
Feature activation-9.99e-03

Pos logit contributions
 earn+0.006
 solve+0.006
 actually+0.006
 seem+0.005
 rely+0.005
Neg logit contributions
ed-0.006
apped-0.005
rified-0.005
ing-0.005
uttered-0.005
Token a
Token ID257
Feature activation+3.85e-02

Pos logit contributions
 earn+0.016
 actually+0.015
 solve+0.015
 rely+0.014
 seem+0.014
Neg logit contributions
ed-0.018
rified-0.015
apped-0.015
ing-0.015
uttered-0.014
Token jar
Token ID17379
Feature activation-5.22e-03

Pos logit contributions
ed+0.066
rified+0.056
apped+0.055
ing+0.053
uttered+0.052
Neg logit contributions
 earn-0.066
 solve-0.065
 actually-0.064
 count-0.062
 seem-0.061
Token.
Token ID13
Feature activation+1.43e-02

Pos logit contributions
 earn+0.008
 actually+0.007
 solve+0.007
 seem+0.007
 rely+0.007
Neg logit contributions
ed-0.010
ing-0.009
apped-0.009
rified-0.008
attered-0.008
Token said
Token ID531
Feature activation+1.10e-02

Pos logit contributions
ed+0.061
rified+0.053
apped+0.052
ing+0.050
uttered+0.049
Neg logit contributions
 earn-0.048
 solve-0.047
 actually-0.046
 seem-0.045
 rely-0.045
Token,
Token ID11
Feature activation+3.62e-02

Pos logit contributions
ed+0.022
rified+0.020
attered+0.020
ipt+0.019
apped+0.019
Neg logit contributions
 solve-0.014
 earn-0.014
 seem-0.014
 actually-0.013
 count-0.013
Token "
Token ID366
Feature activation+5.77e-02

Pos logit contributions
ed+0.066
apped+0.054
ho+0.054
rified+0.053
ing+0.052
Neg logit contributions
 earn-0.062
 solve-0.058
 rely-0.058
 actually-0.054
 cost-0.053
TokenC
Token ID34
Feature activation-4.72e-02

Pos logit contributions
ed+0.104
apped+0.090
rified+0.089
uttered+0.085
ing+0.084
Neg logit contributions
 earn-0.093
 actually-0.092
 solve-0.091
 seem-0.087
 read-0.086
Tokenater
Token ID729
Feature activation-2.42e-02

Pos logit contributions
 earn+0.089
 solve+0.087
 actually+0.085
 seem+0.082
 rely+0.082
Neg logit contributions
ed-0.074
rified-0.063
apped-0.061
uttered-0.057
ing-0.057
Tokenpill
Token ID27215
Feature activation-1.57e-01

Pos logit contributions
 earn+0.044
 solve+0.043
 actually+0.042
 rely+0.041
 count+0.041
Neg logit contributions
ed-0.041
rified-0.034
ing-0.033
apped-0.033
uttered-0.031
Tokenars
Token ID945
Feature activation-2.24e-02

Pos logit contributions
 earn+0.274
 dentist+0.268
 actually+0.265
 read+0.257
 solve+0.257
Neg logit contributions
ed-0.257
apped-0.210
w-0.207
aked-0.205
rified-0.200
Token are
Token ID389
Feature activation-1.53e-02

Pos logit contributions
 earn+0.033
 actually+0.032
 solve+0.032
 rely+0.031
 count+0.030
Neg logit contributions
ed-0.044
rified-0.038
apped-0.037
ing-0.037
uttered-0.035
Token not
Token ID407
Feature activation+1.61e-02

Pos logit contributions
 earn+0.026
 seem+0.025
 concentrate+0.023
 solve+0.023
 achieve+0.023
Neg logit contributions
ed-0.024
 down-0.020
 away-0.019
aked-0.019
apped-0.018
Token meant
Token ID4001
Feature activation-3.32e-02

Pos logit contributions
ed+0.027
ing+0.021
 down+0.021
elt+0.020
apped+0.020
Neg logit contributions
 earn-0.030
 solve-0.027
case-0.026
lf-0.026
 seem-0.025
Token to
Token ID284
Feature activation-5.02e-02

Pos logit contributions
 earn+0.070
 actually+0.063
 rely+0.063
lf+0.062
 read+0.062
Neg logit contributions
ed-0.048
ing-0.039
less-0.034
rified-0.033
apped-0.032
TokenDo
Token ID5211
Feature activation-5.29e-02

Pos logit contributions
ed+0.075
rified+0.063
Look+0.061
elt+0.059
ipt+0.057
Neg logit contributions
 solve-0.099
 earn-0.098
 actually-0.097
 count-0.093
 rely-0.093
Token you
Token ID345
Feature activation-4.66e-02

Pos logit contributions
 earn+0.073
 actually+0.070
 rely+0.070
uries+0.066
penter+0.064
Neg logit contributions
ed-0.093
w-0.091
ho-0.086
ing-0.084
rified-0.083
Token know
Token ID760
Feature activation-9.68e-02

Pos logit contributions
lf+0.092
 earn+0.089
 dentist+0.083
penter+0.083
 actually+0.081
Neg logit contributions
ed-0.054
w-0.047
rified-0.047
aked-0.045
ee-0.044
Token what
Token ID644
Feature activation-5.35e-02

Pos logit contributions
 earn+0.151
 seem+0.139
 rely+0.138
 cost+0.136
 actually+0.134
Neg logit contributions
ing-0.164
ed-0.160
rified-0.146
ted-0.143
apped-0.136
Token b
Token ID275
Feature activation-5.69e-02

Pos logit contributions
 earn+0.101
uries+0.094
 actually+0.092
 concentrate+0.091
isk+0.090
Neg logit contributions
ed-0.078
ing-0.072
w-0.062
 down-0.059
ted-0.059
Tokenunn
Token ID20935
Feature activation-1.53e-01

Pos logit contributions
 actually+0.118
 earn+0.116
 solve+0.115
 count+0.110
lf+0.108
Neg logit contributions
ed-0.084
ing-0.071
w-0.069
rified-0.066
apped-0.064
Tokenies
Token ID444
Feature activation-7.52e-02

Pos logit contributions
 earn+0.385
 solve+0.376
 actually+0.373
 count+0.358
 read+0.357
Neg logit contributions
ed-0.156
ing-0.108
apped-0.103
rified-0.103
w-0.092
Token eat
Token ID4483
Feature activation-6.83e-02

Pos logit contributions
 earn+0.104
 dentist+0.101
penter+0.098
lf+0.097
uries+0.097
Neg logit contributions
ed-0.131
w-0.119
ing-0.114
ho-0.108
 down-0.107
Token?"
Token ID1701
Feature activation-5.16e-02

Pos logit contributions
 earn+0.087
 solve+0.083
 actually+0.081
 rely+0.079
 seem+0.078
Neg logit contributions
ed-0.145
ing-0.128
apped-0.124
rified-0.124
w-0.120
Token
Token ID198
Feature activation+1.06e-02

Pos logit contributions
 earn+0.078
 solve+0.075
 actually+0.074
 cost+0.073
uries+0.072
Neg logit contributions
ed-0.099
rified-0.085
elt-0.084
apped-0.084
ing-0.083
Token
Token ID198
Feature activation-1.09e-03

Pos logit contributions
ed+0.023
apped+0.020
rified+0.020
ing+0.020
uttered+0.019
Neg logit contributions
 earn-0.013
 actually-0.013
 solve-0.013
 count-0.012
 rely-0.012
Token for
Token ID329
Feature activation-3.10e-02

Pos logit contributions
 earn+0.064
 solve+0.061
 seem+0.059
 actually+0.057
 rely+0.057
Neg logit contributions
ed-0.059
ing-0.049
apped-0.049
ka-0.044
uttered-0.044
Token me
Token ID502
Feature activation-1.38e-02

Pos logit contributions
 earn+0.049
 solve+0.048
 actually+0.048
 rely+0.046
 seem+0.046
Neg logit contributions
ed-0.054
ing-0.046
apped-0.046
rified-0.043
ted-0.042
Token!
Token ID0
Feature activation+1.25e-02

Pos logit contributions
 earn+0.022
 solve+0.021
 actually+0.020
 cost+0.019
 seem+0.019
Neg logit contributions
ed-0.026
ing-0.023
apped-0.022
uck-0.020
elt-0.020
Token Aren
Token ID9843
Feature activation-6.67e-02

Pos logit contributions
ed+0.022
rified+0.019
apped+0.019
elt+0.018
ing+0.018
Neg logit contributions
 solve-0.021
 earn-0.021
 actually-0.020
 count-0.019
 rely-0.019
Tokenâ
Token ID22940
Feature activation-7.06e-02

Pos logit contributions
 rely+0.095
 earn+0.091
 count+0.089
 solve+0.087
 actually+0.087
Neg logit contributions
ed-0.123
uck-0.115
ing-0.112
ipt-0.107
xc-0.107
Token
Token ID26391
Feature activation-1.53e-01

Pos logit contributions
 actually+0.094
 solve+0.090
 count+0.087
 earn+0.087
 rely+0.080
Neg logit contributions
ed-0.142
ing-0.128
rified-0.126
apped-0.123
uck-0.123
Token
Token ID8151
Feature activation-6.94e-02

Pos logit contributions
 earn+0.220
 solve+0.215
 actually+0.207
 count+0.202
 rely+0.196
Neg logit contributions
ed-0.299
apped-0.265
ing-0.257
rified-0.252
elt-0.247
Tokent
Token ID83
Feature activation-2.03e-02

Pos logit contributions
 earn+0.128
 solve+0.128
 actually+0.124
 count+0.123
 rely+0.122
Neg logit contributions
ed-0.114
ing-0.094
rified-0.093
apped-0.092
elt-0.086
Token we
Token ID356
Feature activation-1.61e-02

Pos logit contributions
 solve+0.034
 count+0.033
 earn+0.032
uries+0.031
 rely+0.031
Neg logit contributions
ing-0.032
ed-0.031
uck-0.027
w-0.026
apped-0.026
Token both
Token ID1111
Feature activation-1.08e-02

Pos logit contributions
 earn+0.022
 solve+0.021
lf+0.021
 dentist+0.020
 actually+0.020
Neg logit contributions
ed-0.030
apped-0.027
ing-0.025
 down-0.025
elt-0.024
Token so
Token ID523
Feature activation-3.49e-02

Pos logit contributions
 earn+0.015
 solve+0.014
 actually+0.014
 dentist+0.013
isy+0.013
Neg logit contributions
ed-0.020
apped-0.018
 down-0.017
ing-0.017
 breath-0.017
Token religion
Token ID5737
Feature activation-2.75e-02

Pos logit contributions
 earn+0.023
 solve+0.022
 actually+0.022
 seem+0.021
 count+0.021
Neg logit contributions
ed-0.032
rified-0.028
apped-0.028
ing-0.027
uttered-0.026
Token.
Token ID13
Feature activation+5.10e-03

Pos logit contributions
 earn+0.040
 solve+0.036
 actually+0.036
 seem+0.034
 rely+0.033
Neg logit contributions
ed-0.054
ing-0.049
apped-0.046
rified-0.044
ho-0.044
Token
Token ID198
Feature activation-5.50e-03

Pos logit contributions
ed+0.009
apped+0.008
rified+0.008
ipt+0.008
uttered+0.008
Neg logit contributions
 earn-0.009
 solve-0.008
 rely-0.008
 count-0.008
 actually-0.008
Token
Token ID198
Feature activation-1.93e-02

Pos logit contributions
 earn+0.008
 count+0.008
 actually+0.008
 solve+0.008
 seem+0.008
Neg logit contributions
ed-0.010
apped-0.009
ipt-0.009
Look-0.009
uttered-0.009
TokenAfter
Token ID3260
Feature activation-5.13e-02

Pos logit contributions
 seem+0.033
 earn+0.032
 actually+0.032
 count+0.032
 rely+0.032
Neg logit contributions
ed-0.030
ipt-0.026
apped-0.026
Look-0.025
rified-0.025
Token church
Token ID4928
Feature activation-1.51e-01

Pos logit contributions
 earn+0.078
 actually+0.074
 solve+0.072
 rely+0.072
 seem+0.069
Neg logit contributions
ed-0.098
apped-0.085
ing-0.081
rified-0.080
uttered-0.079
Token,
Token ID11
Feature activation-3.10e-02

Pos logit contributions
 earn+0.246
 solve+0.238
 actually+0.237
 rely+0.227
 count+0.225
Neg logit contributions
ed-0.276
apped-0.234
rified-0.232
ing-0.227
uttered-0.220
Token Lily
Token ID20037
Feature activation-4.28e-02

Pos logit contributions
 earn+0.042
 solve+0.040
 actually+0.040
 rely+0.038
 count+0.038
Neg logit contributions
ed-0.064
rified-0.057
apped-0.056
ing-0.054
uttered-0.053
Token and
Token ID290
Feature activation-2.55e-02

Pos logit contributions
 earn+0.066
 solve+0.063
 seem+0.062
 rely+0.061
 actually+0.060
Neg logit contributions
ed-0.085
apped-0.068
ing-0.067
rified-0.066
elt-0.064
Token her
Token ID607
Feature activation-3.25e-02

Pos logit contributions
 earn+0.044
 actually+0.043
 solve+0.042
 seem+0.041
 rely+0.041
Neg logit contributions
ed-0.043
rified-0.036
apped-0.035
ted-0.034
ing-0.033
Token family
Token ID1641
Feature activation+6.07e-03

Pos logit contributions
 earn+0.051
 solve+0.049
 actually+0.049
 seem+0.047
 count+0.046
Neg logit contributions
ed-0.061
rified-0.053
apped-0.053
ing-0.050
uttered-0.050
Token spoke
Token ID5158
Feature activation-2.27e-03

Pos logit contributions
ed+0.019
apped+0.017
rified+0.016
ing+0.016
uttered+0.016
Neg logit contributions
 solve-0.019
 earn-0.019
 actually-0.019
 read-0.019
 count-0.018
Token.
Token ID13
Feature activation-3.43e-03

Pos logit contributions
 solve+0.004
 actually+0.004
 earn+0.004
 count+0.003
 read+0.003
Neg logit contributions
ed-0.004
rified-0.003
attered-0.003
apped-0.003
ipt-0.003
Token
Token ID198
Feature activation-6.76e-03

Pos logit contributions
 earn+0.006
 rely+0.006
 actually+0.006
 solve+0.006
aches+0.006
Neg logit contributions
ed-0.006
apped-0.005
elt-0.005
rified-0.005
uttered-0.005
Token"
Token ID1
Feature activation+2.05e-02

Pos logit contributions
 earn+0.009
 actually+0.009
 solve+0.008
 rely+0.008
 seem+0.008
Neg logit contributions
ed-0.014
apped-0.012
ing-0.012
rified-0.012
ipt-0.012
TokenWhy
Token ID5195
Feature activation-5.79e-02

Pos logit contributions
ed+0.024
rified+0.019
Look+0.018
apped+0.017
elt+0.017
Neg logit contributions
 earn-0.047
 solve-0.045
 count-0.044
 actually-0.043
 rely-0.043
Token am
Token ID716
Feature activation-1.49e-01

Pos logit contributions
 earn+0.098
 actually+0.092
 rely+0.089
 solve+0.088
uries+0.088
Neg logit contributions
ed-0.095
ing-0.079
rified-0.078
apped-0.078
w-0.077
Token I
Token ID314
Feature activation-2.30e-02

Pos logit contributions
 earn+0.200
 rely+0.185
 solve+0.184
 actually+0.180
 count+0.178
Neg logit contributions
ed-0.304
rified-0.275
apped-0.269
ing-0.266
uttered-0.260
Token so
Token ID523
Feature activation-6.43e-02

Pos logit contributions
 earn+0.037
 solve+0.033
 actually+0.032
 rely+0.032
 seem+0.031
Neg logit contributions
ed-0.042
ing-0.035
rified-0.035
apped-0.035
w-0.034
Token poor
Token ID3595
Feature activation-2.76e-02

Pos logit contributions
 earn+0.112
 solve+0.109
 actually+0.107
 count+0.104
 rely+0.103
Neg logit contributions
ed-0.110
apped-0.093
rified-0.091
ing-0.089
uttered-0.087
Token?"
Token ID1701
Feature activation-7.46e-02

Pos logit contributions
 earn+0.044
 solve+0.039
 actually+0.039
 seem+0.038
 rely+0.037
Neg logit contributions
ed-0.049
ing-0.044
apped-0.042
rified-0.042
elt-0.040
Token he
Token ID339
Feature activation+1.05e-02

Pos logit contributions
 earn+0.113
 actually+0.110
 solve+0.107
 rely+0.105
 count+0.104
Neg logit contributions
ed-0.140
rified-0.123
apped-0.117
ted-0.112
ing-0.110
Tokenmy
Token ID1820
Feature activation-2.58e-02

Pos logit contributions
ed+0.025
rified+0.022
apped+0.022
ing+0.021
uttered+0.020
Neg logit contributions
 earn-0.020
 solve-0.019
 actually-0.019
 seem-0.018
 rely-0.018
Token,
Token ID11
Feature activation-1.20e-02

Pos logit contributions
 earn+0.041
 solve+0.040
 actually+0.039
 seem+0.037
 rely+0.037
Neg logit contributions
ed-0.048
rified-0.041
apped-0.040
ing-0.039
uttered-0.038
Token what
Token ID644
Feature activation-5.27e-02

Pos logit contributions
 earn+0.014
 solve+0.014
 actually+0.013
 count+0.013
 seem+0.013
Neg logit contributions
ed-0.027
rified-0.024
apped-0.024
uttered-0.023
ing-0.023
Token do
Token ID466
Feature activation-7.49e-02

Pos logit contributions
 earn+0.074
 solve+0.069
 actually+0.069
 rely+0.066
 count+0.064
Neg logit contributions
ed-0.107
apped-0.093
rified-0.091
ing-0.091
w-0.086
Token b
Token ID275
Feature activation-6.01e-02

Pos logit contributions
 earn+0.105
 actually+0.100
 solve+0.096
 count+0.094
 rely+0.093
Neg logit contributions
ed-0.147
ing-0.127
rified-0.126
apped-0.125
w-0.124
Tokenunn
Token ID20935
Feature activation-1.49e-01

Pos logit contributions
 actually+0.123
 earn+0.120
 solve+0.119
 count+0.118
 read+0.117
Neg logit contributions
ed-0.090
ing-0.081
w-0.074
rified-0.074
ted-0.069
Tokenies
Token ID444
Feature activation-7.18e-02

Pos logit contributions
 earn+0.356
 solve+0.349
 actually+0.346
 count+0.335
 read+0.333
Neg logit contributions
ed-0.161
rified-0.118
apped-0.116
ing-0.115
uttered-0.101
Token do
Token ID466
Feature activation-5.15e-02

Pos logit contributions
 earn+0.089
 dentist+0.080
uries+0.078
 rely+0.078
 solve+0.078
Neg logit contributions
ed-0.138
ing-0.122
w-0.119
apped-0.116
 down-0.116
Token?"
Token ID1701
Feature activation-5.33e-02

Pos logit contributions
 earn+0.077
 actually+0.069
 rely+0.068
 solve+0.068
uries+0.066
Neg logit contributions
ed-0.096
ing-0.085
apped-0.079
uck-0.077
w-0.075
Token
Token ID198
Feature activation+5.86e-03

Pos logit contributions
 earn+0.079
 solve+0.077
 actually+0.076
 rely+0.075
 count+0.074
Neg logit contributions
ed-0.105
apped-0.095
elt-0.090
uttered-0.085
ted-0.085
Token
Token ID198
Feature activation-5.68e-03

Pos logit contributions
ed+0.011
apped+0.010
uttered+0.010
ipt+0.010
ing+0.010
Neg logit contributions
 earn-0.008
 seem-0.008
 actually-0.008
 cost-0.008
 solve-0.008
Token "
Token ID366
Feature activation+3.61e-02

Pos logit contributions
ed+0.023
apped+0.019
ing+0.019
rified+0.019
elt+0.018
Neg logit contributions
 earn-0.022
 rely-0.021
 solve-0.021
 seem-0.019
 actually-0.019
TokenBecause
Token ID8128
Feature activation-8.30e-02

Pos logit contributions
ed+0.063
rified+0.054
apped+0.052
ing+0.052
elt+0.050
Neg logit contributions
 earn-0.061
 solve-0.060
 count-0.058
 actually-0.058
 rely-0.056
Token it
Token ID340
Feature activation-5.19e-02

Pos logit contributions
 earn+0.132
 rely+0.126
 solve+0.125
 actually+0.123
uries+0.116
Neg logit contributions
ed-0.149
ing-0.135
rified-0.122
apped-0.116
ted-0.114
Token wasn
Token ID2492
Feature activation-4.47e-03

Pos logit contributions
 earn+0.084
 solve+0.074
 rely+0.074
uries+0.072
 concentrate+0.070
Neg logit contributions
ed-0.099
ing-0.084
rified-0.083
w-0.076
ted-0.073
Token't
Token ID470
Feature activation-2.22e-02

Pos logit contributions
 earn+0.008
 rely+0.007
 solve+0.007
 count+0.007
uries+0.007
Neg logit contributions
ed-0.007
w-0.006
rified-0.006
ing-0.006
uck-0.006
Token pe
Token ID613
Feature activation-1.49e-01

Pos logit contributions
 earn+0.042
 solve+0.038
 rely+0.036
 concentrate+0.036
 achieve+0.035
Neg logit contributions
ed-0.033
 down-0.029
ing-0.029
 away-0.027
 breath-0.026
Tokeneling
Token ID10809
Feature activation-8.51e-02

Pos logit contributions
 earn+0.234
 solve+0.228
 read+0.222
 actually+0.219
 seem+0.213
Neg logit contributions
ed-0.272
ing-0.242
rified-0.232
w-0.219
apped-0.212
Token well
Token ID880
Feature activation-9.63e-02

Pos logit contributions
 earn+0.138
 solve+0.130
 count+0.125
 actually+0.125
 read+0.125
Neg logit contributions
ed-0.159
ing-0.134
rified-0.133
apped-0.126
ted-0.119
Token!"
Token ID2474
Feature activation-4.02e-02

Pos logit contributions
 earn+0.141
 solve+0.135
 actually+0.128
 rely+0.127
 count+0.126
Neg logit contributions
ed-0.194
ing-0.165
rified-0.163
apped-0.160
attered-0.152
Token
Token ID198
Feature activation-9.99e-04

Pos logit contributions
 earn+0.066
 solve+0.062
 actually+0.062
 count+0.061
 rely+0.061
Neg logit contributions
ed-0.074
apped-0.064
rified-0.062
elt-0.061
ing-0.059
Token
Token ID198
Feature activation-1.15e-02

Pos logit contributions
 seem+0.002
 earn+0.001
 count+0.001
 cost+0.001
 actually+0.001
Neg logit contributions
ed-0.002
apped-0.002
ipt-0.002
ing-0.002
Look-0.002
Token said
Token ID531
Feature activation-1.30e-02

Pos logit contributions
 earn+0.044
 solve+0.041
 rely+0.040
 actually+0.039
 seem+0.039
Neg logit contributions
ed-0.044
apped-0.037
rified-0.034
ing-0.034
uttered-0.033
Token,
Token ID11
Feature activation+1.16e-02

Pos logit contributions
 solve+0.015
 seem+0.013
 count+0.013
 read+0.013
 actually+0.013
Neg logit contributions
ed-0.028
rified-0.027
attered-0.027
eled-0.027
ipt-0.026
Token "
Token ID366
Feature activation+3.00e-02

Pos logit contributions
ed+0.026
apped+0.022
rified+0.022
ing+0.022
uttered+0.021
Neg logit contributions
 earn-0.015
 solve-0.014
 actually-0.014
 rely-0.013
 count-0.013
TokenHello
Token ID15496
Feature activation-6.77e-02

Pos logit contributions
ed+0.045
apped+0.039
uttered+0.038
rified+0.038
ing+0.036
Neg logit contributions
 read-0.056
 earn-0.055
 solve-0.054
 actually-0.053
 seem-0.053
Token,
Token ID11
Feature activation-2.86e-02

Pos logit contributions
 solve+0.094
 earn+0.093
 actually+0.090
 count+0.089
 read+0.088
Neg logit contributions
ed-0.136
rified-0.120
apped-0.117
uttered-0.113
attered-0.113
Token church
Token ID4928
Feature activation-1.49e-01

Pos logit contributions
 earn+0.043
 solve+0.042
 actually+0.041
 seem+0.040
 count+0.039
Neg logit contributions
ed-0.055
rified-0.048
apped-0.047
ing-0.046
uttered-0.045
Token!
Token ID0
Feature activation-3.59e-02

Pos logit contributions
 earn+0.255
 actually+0.241
 seem+0.226
 solve+0.225
 rely+0.216
Neg logit contributions
ed-0.260
ing-0.245
apped-0.231
cy-0.224
ho-0.210
Token I
Token ID314
Feature activation+7.64e-03

Pos logit contributions
 earn+0.051
 solve+0.050
 actually+0.049
 seem+0.048
 count+0.047
Neg logit contributions
ed-0.071
rified-0.062
apped-0.061
ing-0.059
uttered-0.058
Token like
Token ID588
Feature activation-6.31e-05

Pos logit contributions
ed+0.017
rified+0.015
apped+0.015
ing+0.014
uttered+0.014
Neg logit contributions
 earn-0.009
 solve-0.009
 actually-0.009
 seem-0.009
 count-0.009
Token to
Token ID284
Feature activation-5.79e-02

Pos logit contributions
 earn+0.000
 actually+0.000
 solve+0.000
 seem+0.000
 rely+0.000
Neg logit contributions
ed-0.000
ing-0.000
apped-0.000
elt-0.000
ted-0.000
Token sail
Token ID14936
Feature activation-3.76e-02

Pos logit contributions
 actually+0.089
 earn+0.085
idge+0.081
 dentist+0.080
uries+0.080
Neg logit contributions
 breath-0.087
ed-0.081
 bree-0.076
apped-0.075
 down-0.071
Token
Token ID198
Feature activation-4.77e-03

Pos logit contributions
 earn+0.052
 solve+0.050
 actually+0.050
 seem+0.048
 count+0.048
Neg logit contributions
ed-0.066
rified-0.058
apped-0.056
ing-0.055
uttered-0.054
Token
Token ID198
Feature activation-1.65e-02

Pos logit contributions
 earn+0.005
 solve+0.005
 actually+0.005
 rely+0.005
 count+0.005
Neg logit contributions
ed-0.011
apped-0.010
rified-0.010
ing-0.009
uttered-0.009
Token"
Token ID1
Feature activation+2.20e-02

Pos logit contributions
 earn+0.026
 solve+0.025
 actually+0.025
 seem+0.024
 rely+0.024
Neg logit contributions
ed-0.031
rified-0.027
apped-0.026
ing-0.025
uttered-0.025
TokenWhat
Token ID2061
Feature activation-7.50e-02

Pos logit contributions
ed+0.029
rified+0.024
apped+0.024
ing+0.022
uttered+0.022
Neg logit contributions
 earn-0.046
 solve-0.045
 actually-0.045
 count-0.043
 seem-0.043
Token is
Token ID318
Feature activation-7.22e-02

Pos logit contributions
 earn+0.110
 actually+0.108
 solve+0.103
 rely+0.101
 seem+0.098
Neg logit contributions
ed-0.143
apped-0.125
ing-0.123
rified-0.120
w-0.115
Token opera
Token ID27296
Feature activation-1.47e-01

Pos logit contributions
 earn+0.091
 solve+0.088
 actually+0.087
 seem+0.082
 count+0.082
Neg logit contributions
ed-0.156
rified-0.138
apped-0.137
ing-0.133
uttered-0.131
Token?"
Token ID1701
Feature activation-7.17e-02

Pos logit contributions
 earn+0.200
 actually+0.187
 solve+0.182
 seem+0.176
 rely+0.174
Neg logit contributions
ed-0.301
ing-0.269
apped-0.264
rified-0.251
uck-0.240
Token Jen
Token ID13364
Feature activation-2.50e-02

Pos logit contributions
 earn+0.129
 solve+0.128
 actually+0.125
 rely+0.123
 count+0.119
Neg logit contributions
ed-0.117
apped-0.097
rified-0.096
ipt-0.090
uttered-0.090
Token asks
Token ID7893
Feature activation-4.03e-02

Pos logit contributions
 earn+0.046
 solve+0.045
 actually+0.042
uries+0.041
 rely+0.041
Neg logit contributions
ed-0.039
 down-0.032
rified-0.031
ipt-0.031
ing-0.030
Token Tom
Token ID4186
Feature activation+4.11e-03

Pos logit contributions
 earn+0.042
 actually+0.039
 solve+0.039
 rely+0.037
 count+0.036
Neg logit contributions
ed-0.098
apped-0.087
rified-0.086
ing-0.085
uttered-0.082
Token.
Token ID13
Feature activation+6.19e-03

Pos logit contributions
ed+0.007
 down+0.007
ing+0.006
 away+0.006
 towards+0.006
Neg logit contributions
 earn-0.007
 actually-0.006
 cost-0.006
 solve-0.006
 involve-0.006
Token mother
Token ID2802
Feature activation+5.67e-02

Pos logit contributions
 earn+0.015
 solve+0.015
 actually+0.015
 count+0.014
 read+0.014
Neg logit contributions
ed-0.018
rified-0.016
apped-0.015
ing-0.015
uttered-0.014
Token asked
Token ID1965
Feature activation-2.06e-02

Pos logit contributions
ed+0.104
rified+0.090
apped+0.089
ing+0.085
uttered+0.084
Neg logit contributions
 earn-0.091
 solve-0.088
 actually-0.087
 seem-0.084
 count-0.083
Token,
Token ID11
Feature activation+6.82e-03

Pos logit contributions
 solve+0.026
 earn+0.025
 seem+0.024
 read+0.024
 count+0.023
Neg logit contributions
ed-0.044
rified-0.040
attered-0.040
apped-0.040
uttered-0.040
Token "
Token ID366
Feature activation+3.42e-02

Pos logit contributions
ed+0.013
apped+0.011
ing+0.011
elt+0.011
rified+0.010
Neg logit contributions
 earn-0.011
 solve-0.010
 rely-0.010
 actually-0.010
 seem-0.010
TokenIs
Token ID3792
Feature activation-5.76e-02

Pos logit contributions
ed+0.066
apped+0.058
rified+0.056
uttered+0.055
attered+0.053
Neg logit contributions
 earn-0.053
 actually-0.052
 solve-0.051
 read-0.050
 count-0.048
Tokenland
Token ID1044
Feature activation-1.47e-01

Pos logit contributions
 earn+0.095
 solve+0.091
 rely+0.089
 actually+0.088
 count+0.087
Neg logit contributions
ed-0.105
rified-0.090
ing-0.087
apped-0.085
ted-0.084
Token of
Token ID286
Feature activation-4.18e-02

Pos logit contributions
 earn+0.224
 actually+0.205
 solve+0.205
 rely+0.203
 seem+0.192
Neg logit contributions
ed-0.290
ing-0.247
apped-0.246
rified-0.235
uttered-0.225
Token seas
Token ID21547
Feature activation-7.37e-02

Pos logit contributions
 earn+0.065
 solve+0.062
 rely+0.060
 count+0.060
 actually+0.059
Neg logit contributions
ed-0.076
rified-0.064
ing-0.063
apped-0.063
ted-0.062
Tokenhell
Token ID12758
Feature activation-6.48e-02

Pos logit contributions
 earn+0.091
 solve+0.085
 actually+0.082
 rely+0.080
 count+0.080
Neg logit contributions
ed-0.166
ing-0.148
rified-0.140
apped-0.140
w-0.140
Tokens
Token ID82
Feature activation-2.54e-03

Pos logit contributions
 earn+0.155
 rely+0.149
 solve+0.147
 actually+0.147
 count+0.145
Neg logit contributions
ed-0.075
ing-0.061
rified-0.052
apped-0.050
ho-0.049
Token?
Token ID30
Feature activation-5.92e-02

Pos logit contributions
 earn+0.004
 solve+0.004
 rely+0.003
 actually+0.003
 seem+0.003
Neg logit contributions
ed-0.005
ing-0.004
apped-0.004
 down-0.004
uttered-0.004
Token children
Token ID1751
Feature activation-1.97e-02

Pos logit contributions
 earn+0.055
 actually+0.055
 solve+0.054
 rely+0.051
 read+0.050
Neg logit contributions
ed-0.070
rified-0.061
ing-0.059
apped-0.059
ipt-0.056
Token,
Token ID11
Feature activation+1.19e-02

Pos logit contributions
 earn+0.033
 actually+0.033
 solve+0.032
 dentist+0.032
 rely+0.031
Neg logit contributions
ed-0.034
ing-0.028
apped-0.028
rified-0.026
uttered-0.025
Token "
Token ID366
Feature activation+4.01e-02

Pos logit contributions
ed+0.022
rified+0.020
apped+0.019
ing+0.019
elt+0.018
Neg logit contributions
 earn-0.018
 solve-0.017
 actually-0.017
 rely-0.016
 seem-0.016
TokenDo
Token ID5211
Feature activation-6.85e-02

Pos logit contributions
ed+0.056
rified+0.046
apped+0.045
ing+0.043
uttered+0.042
Neg logit contributions
 earn-0.081
 solve-0.080
 actually-0.079
 seem-0.076
 count-0.076
Token not
Token ID407
Feature activation-3.47e-02

Pos logit contributions
 actually+0.109
 dentist+0.107
 rely+0.104
 earn+0.104
 cost+0.101
Neg logit contributions
ed-0.092
w-0.092
ing-0.086
apped-0.083
uck-0.080
Token slide
Token ID10649
Feature activation-1.46e-01

Pos logit contributions
 actually+0.048
isy+0.047
 dentist+0.046
lf+0.045
 earn+0.045
Neg logit contributions
 down-0.055
ed-0.054
ing-0.053
 breath-0.051
uck-0.051
Token too
Token ID1165
Feature activation-4.97e-02

Pos logit contributions
 earn+0.251
 solve+0.247
 actually+0.244
 rely+0.239
 count+0.234
Neg logit contributions
ed-0.250
rified-0.212
ing-0.211
apped-0.204
attered-0.189
Token fast
Token ID3049
Feature activation-2.53e-02

Pos logit contributions
 earn+0.091
 solve+0.089
 actually+0.088
 rely+0.085
 count+0.085
Neg logit contributions
ed-0.079
rified-0.067
apped-0.066
ing-0.063
uttered-0.061
Token,
Token ID11
Feature activation-2.93e-02

Pos logit contributions
 earn+0.036
 solve+0.035
 actually+0.034
 seem+0.033
 count+0.032
Neg logit contributions
ed-0.051
rified-0.044
apped-0.044
ing-0.043
uttered-0.041
Token you
Token ID345
Feature activation-5.47e-04

Pos logit contributions
 earn+0.040
 solve+0.039
 rely+0.039
 actually+0.038
 count+0.036
Neg logit contributions
ed-0.057
rified-0.053
ing-0.051
apped-0.050
ho-0.047
Token need
Token ID761
Feature activation+1.67e-02

Pos logit contributions
 earn+0.001
 actually+0.001
 solve+0.001
 rely+0.001
lf+0.001
Neg logit contributions
ed-0.001
rified-0.001
apped-0.001
ing-0.001
w-0.001
Token put
Token ID1234
Feature activation-2.82e-02

Pos logit contributions
lf+0.034
 actually+0.033
case+0.033
penter+0.032
 dentist+0.032
Neg logit contributions
 breath-0.051
 away-0.049
ed-0.047
uck-0.045
w-0.045
Token it
Token ID340
Feature activation+2.97e-02

Pos logit contributions
 earn+0.045
 solve+0.044
 actually+0.044
 count+0.042
 rely+0.042
Neg logit contributions
ed-0.051
rified-0.044
apped-0.044
uttered-0.042
ing-0.042
Token back
Token ID736
Feature activation+2.06e-02

Pos logit contributions
ed+0.042
rified+0.038
apped+0.036
uttered+0.034
ipt+0.034
Neg logit contributions
 solve-0.057
 actually-0.055
 earn-0.055
 count-0.054
 read-0.054
Token.
Token ID13
Feature activation-9.07e-03

Pos logit contributions
ed+0.033
rified+0.027
apped+0.027
ing+0.026
uttered+0.025
Neg logit contributions
 earn-0.038
 solve-0.037
 actually-0.037
 seem-0.036
 rely-0.036
Token Cater
Token ID40742
Feature activation-3.63e-02

Pos logit contributions
 earn+0.015
 solve+0.014
 actually+0.014
 seem+0.013
 count+0.013
Neg logit contributions
ed-0.016
rified-0.014
apped-0.014
ing-0.013
uttered-0.013
Tokenpill
Token ID27215
Feature activation-1.45e-01

Pos logit contributions
 earn+0.052
 solve+0.049
 actually+0.049
 count+0.046
 rely+0.045
Neg logit contributions
ed-0.078
rified-0.068
ing-0.067
apped-0.066
ted-0.063
Tokenars
Token ID945
Feature activation-6.42e-03

Pos logit contributions
 dentist+0.256
 earn+0.255
 actually+0.248
 count+0.242
 solve+0.242
Neg logit contributions
ed-0.235
apped-0.193
aked-0.192
w-0.189
rified-0.181
Token need
Token ID761
Feature activation+3.56e-02

Pos logit contributions
 earn+0.010
 solve+0.009
 actually+0.009
 rely+0.009
 count+0.009
Neg logit contributions
ed-0.012
apped-0.010
rified-0.010
ing-0.010
uttered-0.010
Token to
Token ID284
Feature activation-1.66e-02

Pos logit contributions
ed+0.061
rified+0.051
apped+0.051
ing+0.049
uttered+0.048
Neg logit contributions
 earn-0.061
 solve-0.060
 actually-0.059
 seem-0.057
 count-0.057
Token eat
Token ID4483
Feature activation-4.82e-02

Pos logit contributions
case+0.020
 actually+0.019
 seem+0.019
lf+0.019
idge+0.019
Neg logit contributions
 breath-0.029
 away-0.027
ed-0.025
ho-0.025
w-0.025
Token leaves
Token ID5667
Feature activation-4.27e-02

Pos logit contributions
 earn+0.079
 solve+0.075
 actually+0.075
 rely+0.075
 seem+0.073
Neg logit contributions
ed-0.086
rified-0.072
ing-0.072
apped-0.070
ted-0.067
Token many
Token ID867
Feature activation-1.74e-02

Pos logit contributions
ed+0.099
rified+0.083
apped+0.083
ing+0.079
uttered+0.078
Neg logit contributions
 earn-0.090
 solve-0.088
 actually-0.085
 rely-0.083
 seem-0.082
Token green
Token ID4077
Feature activation+6.04e-04

Pos logit contributions
 earn+0.033
 solve+0.032
 actually+0.031
 rely+0.031
 seem+0.031
Neg logit contributions
ed-0.027
rified-0.022
apped-0.021
ing-0.021
uttered-0.020
Token leaves
Token ID5667
Feature activation-3.53e-02

Pos logit contributions
ed+0.001
rified+0.001
apped+0.001
ing+0.001
uttered+0.001
Neg logit contributions
 earn-0.001
 solve-0.001
 actually-0.001
 count-0.001
 seem-0.001
Token and
Token ID290
Feature activation+7.99e-03

Pos logit contributions
 earn+0.057
 solve+0.053
 rely+0.051
 actually+0.050
 count+0.050
Neg logit contributions
ed-0.069
ing-0.058
apped-0.057
rified-0.056
ho-0.055
Token a
Token ID257
Feature activation+4.49e-02

Pos logit contributions
ed+0.013
rified+0.011
apped+0.011
ing+0.010
ho+0.010
Neg logit contributions
 earn-0.015
 solve-0.014
 actually-0.014
 rely-0.014
 seem-0.013
Token brown
Token ID7586
Feature activation+8.36e-02

Pos logit contributions
ed+0.075
rified+0.064
apped+0.063
ing+0.061
uttered+0.059
Neg logit contributions
 earn-0.078
 solve-0.076
 actually-0.074
 seem-0.072
 rely-0.072
Token trunk
Token ID21427
Feature activation+5.32e-03

Pos logit contributions
ed+0.152
rified+0.135
apped+0.131
ing+0.127
uttered+0.126
Neg logit contributions
 earn-0.128
 solve-0.128
 actually-0.126
 count-0.126
 read-0.123
Token.
Token ID13
Feature activation+1.00e-02

Pos logit contributions
ed+0.012
ing+0.010
apped+0.010
rified+0.010
ho+0.010
Neg logit contributions
 earn-0.007
 solve-0.006
 rely-0.006
 actually-0.006
 seem-0.006
Token It
Token ID632
Feature activation+1.25e-02

Pos logit contributions
ed+0.019
apped+0.017
rified+0.017
uttered+0.016
ing+0.016
Neg logit contributions
 earn-0.015
 solve-0.015
 actually-0.014
 rely-0.014
 count-0.014
Token was
Token ID373
Feature activation+8.93e-03

Pos logit contributions
ed+0.022
apped+0.017
ho+0.017
rified+0.016
uttered+0.016
Neg logit contributions
 earn-0.024
 solve-0.022
 rely-0.021
 seem-0.020
 achieve-0.020
Token very
Token ID845
Feature activation-2.88e-02

Pos logit contributions
ed+0.015
apped+0.012
rified+0.012
uttered+0.012
ing+0.012
Neg logit contributions
 earn-0.016
 solve-0.015
 seem-0.014
 rely-0.014
 count-0.013
Token ran
Token ID4966
Feature activation+4.30e-02

Pos logit contributions
 earn+0.017
 solve+0.017
 actually+0.016
 seem+0.016
 rely+0.016
Neg logit contributions
ed-0.022
rified-0.019
apped-0.018
ing-0.018
uttered-0.017
Token over
Token ID625
Feature activation+5.29e-02

Pos logit contributions
ed+0.067
apped+0.067
uttered+0.063
w+0.063
rified+0.063
Neg logit contributions
 solve-0.074
 read-0.072
 earn-0.071
 actually-0.070
 count-0.070
Token and
Token ID290
Feature activation+3.15e-02

Pos logit contributions
ed+0.101
attered+0.099
apped+0.096
rified+0.094
uttered+0.093
Neg logit contributions
 solve-0.077
 read-0.073
 count-0.073
 earn-0.070
 actually-0.068
Token grabbed
Token ID13176
Feature activation+2.48e-02

Pos logit contributions
ed+0.062
rified+0.055
apped+0.054
ing+0.052
uttered+0.051
Neg logit contributions
 earn-0.045
 solve-0.044
 actually-0.044
 count-0.042
 seem-0.041
Token the
Token ID262
Feature activation+1.84e-02

Pos logit contributions
ed+0.052
attered+0.051
apped+0.049
elt+0.049
uttered+0.049
Neg logit contributions
 solve-0.028
 count-0.028
 earn-0.028
 seem-0.027
 actually-0.026
Token naughty
Token ID45128
Feature activation+8.42e-02

Pos logit contributions
ed+0.039
apped+0.038
uttered+0.037
ho+0.036
elt+0.035
Neg logit contributions
 solve-0.022
 earn-0.022
 seem-0.021
 shelf-0.021
 actually-0.021
Token boy
Token ID2933
Feature activation+4.79e-02

Pos logit contributions
uttered+0.150
apped+0.147
ed+0.145
ipt+0.135
eled+0.133
Neg logit contributions
 seem-0.127
 earn-0.127
 count-0.123
 solve-0.122
 shelf-0.119
Token,
Token ID11
Feature activation+5.54e-02

Pos logit contributions
ed+0.100
rified+0.086
ing+0.085
apped+0.082
ho+0.078
Neg logit contributions
 earn-0.067
 actually-0.063
 solve-0.061
 rely-0.060
 seem-0.059
Token and
Token ID290
Feature activation+1.43e-02

Pos logit contributions
ed+0.102
rified+0.088
apped+0.085
ing+0.082
elt+0.081
Neg logit contributions
 earn-0.087
 solve-0.085
 seem-0.080
 rely-0.080
 actually-0.079
Token gave
Token ID2921
Feature activation+1.46e-02

Pos logit contributions
ed+0.029
rified+0.025
apped+0.025
ing+0.024
uttered+0.024
Neg logit contributions
 earn-0.020
 solve-0.020
 actually-0.019
 rely-0.019
 seem-0.019
Token him
Token ID683
Feature activation+4.28e-02

Pos logit contributions
ed+0.029
attered+0.028
apped+0.027
uttered+0.027
suits+0.026
Neg logit contributions
 actually-0.018
 solve-0.018
 earn-0.018
 count-0.018
 seem-0.017
TokenJack
Token ID14295
Feature activation-2.09e-03

Pos logit contributions
 earn+0.030
 solve+0.029
 actually+0.029
 rely+0.029
 count+0.028
Neg logit contributions
ed-0.034
rified-0.029
apped-0.028
ing-0.028
uttered-0.026
Token gently
Token ID15165
Feature activation-4.65e-03

Pos logit contributions
 earn+0.003
 solve+0.003
 actually+0.003
 seem+0.003
 count+0.003
Neg logit contributions
ed-0.004
rified-0.004
apped-0.003
ing-0.003
uttered-0.003
Token reached
Token ID4251
Feature activation+1.38e-02

Pos logit contributions
 earn+0.007
 solve+0.007
 actually+0.007
 rely+0.007
 seem+0.007
Neg logit contributions
ed-0.009
rified-0.008
apped-0.007
ing-0.007
uttered-0.007
Token out
Token ID503
Feature activation+2.51e-02

Pos logit contributions
ed+0.019
rified+0.016
apped+0.015
ing+0.014
uttered+0.014
Neg logit contributions
 earn-0.028
 solve-0.028
 actually-0.028
 count-0.027
 seem-0.027
Token his
Token ID465
Feature activation+6.27e-02

Pos logit contributions
ed+0.051
attered+0.048
w+0.048
rified+0.047
apped+0.046
Neg logit contributions
 solve-0.035
 count-0.033
 actually-0.033
 read-0.032
 seem-0.031
Token hand
Token ID1021
Feature activation+1.01e-01

Pos logit contributions
ed+0.087
rified+0.073
apped+0.070
ing+0.068
w+0.065
Neg logit contributions
 solve-0.128
 count-0.127
 actually-0.124
 earn-0.124
 rely-0.121
Token and
Token ID290
Feature activation+2.08e-02

Pos logit contributions
ed+0.200
rified+0.185
w+0.183
apped+0.183
ipt+0.180
Neg logit contributions
 actually-0.126
 count-0.125
 solve-0.124
 read-0.121
 seem-0.115
Token said
Token ID531
Feature activation-7.54e-03

Pos logit contributions
ed+0.041
rified+0.036
apped+0.035
ing+0.034
uttered+0.033
Neg logit contributions
 earn-0.031
 solve-0.030
 actually-0.029
 rely-0.028
 seem-0.028
Token "
Token ID366
Feature activation+4.59e-02

Pos logit contributions
 solve+0.010
 earn+0.010
 actually+0.010
 seem+0.010
 count+0.010
Neg logit contributions
ed-0.015
rified-0.014
apped-0.013
attered-0.013
ipt-0.013
TokenHello
Token ID15496
Feature activation-6.25e-02

Pos logit contributions
ed+0.069
apped+0.064
uttered+0.058
rified+0.057
ing+0.057
Neg logit contributions
 actually-0.083
 earn-0.082
 solve-0.081
 read-0.081
 seem-0.080
Token,
Token ID11
Feature activation-2.31e-02

Pos logit contributions
 earn+0.096
 solve+0.094
 actually+0.093
 count+0.089
 seem+0.089
Neg logit contributions
ed-0.117
rified-0.102
apped-0.101
ing-0.097
uttered-0.095
TokenLu
Token ID25596
Feature activation-4.91e-02

Pos logit contributions
 earn+0.018
 rely+0.018
 seem+0.017
 solve+0.017
 actually+0.017
Neg logit contributions
ed-0.020
rified-0.017
ing-0.016
apped-0.016
attered-0.016
Tokency
Token ID948
Feature activation-2.97e-02

Pos logit contributions
 read+0.107
 solve+0.097
 earn+0.096
 count+0.096
 reads+0.093
Neg logit contributions
rified-0.039
apped-0.039
 trigger-0.039
ed-0.039
 downstream-0.038
Token kne
Token ID24813
Feature activation+1.13e-02

Pos logit contributions
 earn+0.052
 solve+0.049
 actually+0.048
 seem+0.047
 rely+0.047
Neg logit contributions
ed-0.055
apped-0.044
rified-0.043
ing-0.041
uttered-0.041
Tokeneled
Token ID18449
Feature activation+8.18e-02

Pos logit contributions
rified+0.012
ed+0.012
less+0.012
icol+0.012
 bree+0.012
Neg logit contributions
 read-0.021
 also-0.021
 actually-0.020
aches-0.019
 earn-0.019
Token down
Token ID866
Feature activation+3.40e-02

Pos logit contributions
attered+0.090
rified+0.087
uttered+0.087
apped+0.084
w+0.081
Neg logit contributions
 read-0.168
 count-0.156
 solve-0.155
 earn-0.151
 also-0.148
Token next
Token ID1306
Feature activation+8.40e-02

Pos logit contributions
ed+0.065
rified+0.062
apped+0.059
attered+0.058
w+0.057
Neg logit contributions
 solve-0.046
 earn-0.045
 actually-0.045
 read-0.045
 count-0.043
Token to
Token ID284
Feature activation+1.55e-02

Pos logit contributions
ed+0.146
apped+0.129
rified+0.129
uttered+0.123
attered+0.122
Neg logit contributions
 solve-0.138
 earn-0.136
 read-0.135
 count-0.133
 seem-0.129
Token him
Token ID683
Feature activation+6.63e-02

Pos logit contributions
ed+0.022
rified+0.018
 down+0.018
 breath+0.018
apped+0.018
Neg logit contributions
 actually-0.028
 earn-0.028
 rely-0.027
 cost-0.027
uries-0.027
Token and
Token ID290
Feature activation+2.21e-02

Pos logit contributions
ed+0.095
ing+0.079
 down+0.073
rified+0.073
apped+0.070
Neg logit contributions
 earn-0.135
 actually-0.127
 solve-0.123
 rely-0.120
 seem-0.119
Token closed
Token ID4838
Feature activation+5.41e-04

Pos logit contributions
ed+0.043
rified+0.036
apped+0.036
ing+0.035
uttered+0.034
Neg logit contributions
 earn-0.033
 solve-0.032
 actually-0.032
 rely-0.031
 seem-0.031
Token her
Token ID607
Feature activation+3.24e-02

Pos logit contributions
ed+0.001
rified+0.001
apped+0.001
attered+0.001
w+0.001
Neg logit contributions
 solve-0.001
 actually-0.001
 earn-0.001
 read-0.001
 count-0.001
Token\n
Token ID198
Feature activation+4.29e-04

Pos logit contributions
ed+0.015
apped+0.013
ipt+0.012
rified+0.012
elt+0.012
Neg logit contributions
 actually-0.015
 earn-0.015
 solve-0.014
 rely-0.014
aches-0.014
Token\n
Token ID198
Feature activation-1.28e-02

Pos logit contributions
ed+0.001
apped+0.001
rified+0.001
ing+0.001
uttered+0.001
Neg logit contributions
 earn-0.001
 actually-0.001
 solve-0.001
 count-0.001
 rely-0.001
TokenBut
Token ID1537
Feature activation-2.51e-02

Pos logit contributions
 rely+0.020
 earn+0.020
 seem+0.020
 actually+0.020
 count+0.019
Neg logit contributions
ed-0.021
rified-0.018
Look-0.017
ing-0.017
ipt-0.017
Token then
Token ID788
Feature activation-2.76e-02

Pos logit contributions
 earn+0.032
 solve+0.030
 actually+0.029
 seem+0.029
 rely+0.028
Neg logit contributions
ed-0.055
rified-0.049
apped-0.048
ing-0.047
uttered-0.045
Token a
Token ID257
Feature activation+3.34e-03

Pos logit contributions
 earn+0.036
 solve+0.035
 actually+0.034
 count+0.033
 read+0.033
Neg logit contributions
ed-0.058
rified-0.052
apped-0.051
uttered-0.050
attered-0.049
Token mean
Token ID1612
Feature activation+8.55e-02

Pos logit contributions
ed+0.007
rified+0.006
apped+0.006
attered+0.006
ipt+0.006
Neg logit contributions
 solve-0.005
 earn-0.005
 actually-0.005
 read-0.004
 count-0.004
Token spider
Token ID19230
Feature activation+1.87e-02

Pos logit contributions
ed+0.154
rified+0.133
apped+0.132
attered+0.129
uttered+0.128
Neg logit contributions
 solve-0.134
 earn-0.133
 actually-0.132
 seem-0.131
 read-0.127
Token came
Token ID1625
Feature activation+4.23e-02

Pos logit contributions
ed+0.030
rified+0.025
apped+0.025
ing+0.024
uttered+0.023
Neg logit contributions
 earn-0.034
 solve-0.033
 actually-0.033
 seem-0.032
 rely-0.032
Token along
Token ID1863
Feature activation+2.38e-02

Pos logit contributions
ed+0.078
rified+0.073
apped+0.073
attered+0.071
uttered+0.068
Neg logit contributions
 actually-0.061
 solve-0.061
 seem-0.057
 count-0.057
 read-0.055
Token.
Token ID13
Feature activation-1.07e-02

Pos logit contributions
ed+0.051
rified+0.045
apped+0.044
ing+0.043
uttered+0.042
Neg logit contributions
 earn-0.031
 solve-0.029
 actually-0.029
 rely-0.028
 count-0.028
Token It
Token ID632
Feature activation-4.01e-03

Pos logit contributions
 earn+0.019
 rely+0.018
 actually+0.018
 solve+0.018
 count+0.017
Neg logit contributions
ed-0.018
rified-0.016
apped-0.015
elt-0.015
ing-0.014
Token a
Token ID257
Feature activation-1.29e-02

Pos logit contributions
 earn+0.050
 seem+0.046
 solve+0.045
 cost+0.044
 rely+0.043
Neg logit contributions
ed-0.040
apped-0.034
rified-0.033
 down-0.031
ing-0.030
Token good
Token ID922
Feature activation+1.85e-02

Pos logit contributions
 earn+0.020
 solve+0.019
 actually+0.019
 count+0.018
 rely+0.018
Neg logit contributions
ed-0.025
rified-0.022
apped-0.021
ing-0.021
uttered-0.020
Token friend
Token ID1545
Feature activation+3.95e-03

Pos logit contributions
ed+0.033
rified+0.029
apped+0.028
uttered+0.027
attered+0.027
Neg logit contributions
 earn-0.030
 solve-0.030
 count-0.029
 actually-0.029
 read-0.029
Token."
Token ID526
Feature activation-4.14e-02

Pos logit contributions
ed+0.009
rified+0.008
apped+0.008
ing+0.008
elt+0.007
Neg logit contributions
 earn-0.005
 solve-0.005
 actually-0.005
 seem-0.004
 rely-0.004
Token Mia
Token ID32189
Feature activation-1.73e-02

Pos logit contributions
 earn+0.077
 actually+0.074
 cost+0.072
 solve+0.072
uries+0.072
Neg logit contributions
ed-0.065
apped-0.055
rified-0.052
elt-0.051
ing-0.050
Token says
Token ID1139
Feature activation-6.58e-03

Pos logit contributions
 earn+0.034
uries+0.033
 solve+0.032
 actually+0.031
lf+0.031
Neg logit contributions
ed-0.025
apped-0.021
ted-0.020
ing-0.020
rified-0.020
Token.
Token ID13
Feature activation+1.57e-02

Pos logit contributions
 earn+0.007
 actually+0.006
 solve+0.006
 count+0.006
 rely+0.006
Neg logit contributions
ed-0.016
apped-0.015
rified-0.014
ing-0.014
uttered-0.014
Token\n
Token ID198
Feature activation+6.63e-03

Pos logit contributions
ed+0.033
apped+0.029
elt+0.028
rified+0.028
ing+0.028
Neg logit contributions
 earn-0.023
 solve-0.022
 actually-0.021
 rely-0.020
 cost-0.020
Token\n
Token ID198
Feature activation-4.66e-03

Pos logit contributions
ed+0.013
apped+0.011
ing+0.011
ipt+0.011
uttered+0.010
Neg logit contributions
 earn-0.010
 cost-0.010
 actually-0.010
 solve-0.009
 seem-0.009
TokenThey
Token ID2990
Feature activation+1.61e-02

Pos logit contributions
 earn+0.007
 actually+0.007
 solve+0.007
 seem+0.007
 rely+0.006
Neg logit contributions
ed-0.009
rified-0.008
apped-0.008
ing-0.008
ipt-0.007
Token sit
Token ID1650
Feature activation-2.02e-03

Pos logit contributions
ed+0.028
apped+0.025
rified+0.025
ho+0.024
 breath+0.024
Neg logit contributions
 actually-0.024
 earn-0.024
 seem-0.022
 rely-0.022
 solve-0.022
Token eyes
Token ID2951
Feature activation+5.86e-02

Pos logit contributions
ed+0.109
rified+0.095
apped+0.093
ing+0.088
uttered+0.088
Neg logit contributions
 earn-0.105
 solve-0.103
 actually-0.102
 count-0.098
 seem-0.098
Token on
Token ID319
Feature activation+4.66e-02

Pos logit contributions
ed+0.101
rified+0.088
apped+0.087
ing+0.083
uttered+0.082
Neg logit contributions
 earn-0.096
 solve-0.094
 actually-0.094
 seem-0.090
 count-0.090
Token their
Token ID511
Feature activation+5.74e-02

Pos logit contributions
ed+0.082
rified+0.071
apped+0.068
ing+0.068
uttered+0.065
Neg logit contributions
 earn-0.077
 solve-0.073
 actually-0.072
 rely-0.071
 seem-0.070
Token arms
Token ID5101
Feature activation+9.06e-02

Pos logit contributions
ed+0.087
rified+0.074
apped+0.073
ing+0.069
uttered+0.068
Neg logit contributions
 earn-0.109
 solve-0.107
 actually-0.105
 count-0.103
 rely-0.102
Token and
Token ID290
Feature activation+1.85e-02

Pos logit contributions
ed+0.198
rified+0.172
apped+0.171
ing+0.168
uttered+0.163
Neg logit contributions
 earn-0.117
 solve-0.112
 actually-0.108
 rely-0.104
 count-0.102
Token legs
Token ID7405
Feature activation+4.74e-02

Pos logit contributions
ed+0.032
apped+0.027
rified+0.027
ing+0.026
uttered+0.025
Neg logit contributions
 earn-0.031
 solve-0.031
 actually-0.030
 seem-0.029
 rely-0.029
Token and
Token ID290
Feature activation+3.21e-03

Pos logit contributions
ed+0.113
rified+0.100
apped+0.099
ing+0.098
uttered+0.095
Neg logit contributions
 earn-0.051
 solve-0.049
 actually-0.047
 rely-0.045
 count-0.045
Token even
Token ID772
Feature activation+5.72e-03

Pos logit contributions
ed+0.005
apped+0.004
rified+0.004
ing+0.004
uttered+0.004
Neg logit contributions
 earn-0.006
 solve-0.006
 actually-0.006
 seem-0.006
 rely-0.006
Token their
Token ID511
Feature activation+5.72e-03

Pos logit contributions
ed+0.011
rified+0.009
apped+0.009
ing+0.009
aked+0.008
Neg logit contributions
 earn-0.009
 solve-0.009
 actually-0.009
 seem-0.008
 rely-0.008
Token shirts
Token ID20858
Feature activation-6.84e-03

Pos logit contributions
ed+0.009
rified+0.008
apped+0.008
ing+0.007
uttered+0.007
Neg logit contributions
 earn-0.010
 solve-0.010
 actually-0.010
 count-0.010
 rely-0.010
Token!
Token ID0
Feature activation+2.40e-02

Pos logit contributions
 earn+0.008
 solve+0.008
 actually+0.007
 rely+0.007
 count+0.007
Neg logit contributions
ed-0.016
apped-0.014
rified-0.014
ing-0.013
uttered-0.013
Token something
Token ID1223
Feature activation-9.96e-03

Pos logit contributions
ed+0.050
apped+0.045
rified+0.045
attered+0.045
uttered+0.043
Neg logit contributions
 solve-0.027
 actually-0.026
 earn-0.026
 read-0.026
 count-0.025
Token crazy
Token ID7165
Feature activation+3.92e-02

Pos logit contributions
 earn+0.016
 solve+0.016
 actually+0.015
 seem+0.015
 rely+0.015
Neg logit contributions
ed-0.018
rified-0.015
apped-0.015
ing-0.015
uttered-0.014
Token.
Token ID13
Feature activation-5.84e-03

Pos logit contributions
ed+0.089
rified+0.080
apped+0.079
ing+0.076
uttered+0.075
Neg logit contributions
 earn-0.044
 solve-0.043
 actually-0.042
 seem-0.040
 count-0.040
Token It
Token ID632
Feature activation+1.02e-03

Pos logit contributions
 earn+0.010
 solve+0.010
 actually+0.010
 rely+0.010
 count+0.009
Neg logit contributions
ed-0.010
rified-0.009
apped-0.009
uttered-0.008
ing-0.008
Token was
Token ID373
Feature activation+3.66e-03

Pos logit contributions
ed+0.002
rified+0.002
apped+0.002
ing+0.002
uttered+0.002
Neg logit contributions
 earn-0.002
 solve-0.002
 actually-0.001
 rely-0.001
 count-0.001
Token a
Token ID257
Feature activation+4.29e-03

Pos logit contributions
ed+0.008
rified+0.007
apped+0.007
ing+0.007
uttered+0.006
Neg logit contributions
 actually-0.005
 solve-0.005
 earn-0.005
 count-0.005
 read-0.004
Token dress
Token ID6576
Feature activation-2.61e-02

Pos logit contributions
ed+0.008
rified+0.007
apped+0.007
uttered+0.007
ipt+0.007
Neg logit contributions
 solve-0.007
 earn-0.007
 actually-0.007
 read-0.006
 count-0.006
Token with
Token ID351
Feature activation+6.61e-04

Pos logit contributions
 earn+0.036
 solve+0.035
 actually+0.034
 seem+0.033
 count+0.033
Neg logit contributions
ed-0.053
rified-0.047
apped-0.046
ing-0.045
uttered-0.044
Token spark
Token ID9009
Feature activation-3.70e-03

Pos logit contributions
ed+0.001
rified+0.001
apped+0.001
ipt+0.001
uttered+0.001
Neg logit contributions
 actually-0.001
 read-0.001
 solve-0.001
 count-0.001
 earn-0.001
Tokenles
Token ID829
Feature activation+1.10e-02

Pos logit contributions
 actually+0.007
 solve+0.007
 seem+0.007
 earn+0.007
 count+0.006
Neg logit contributions
ed-0.005
apped-0.005
rified-0.005
ipt-0.005
uttered-0.005
Token and
Token ID290
Feature activation+8.26e-03

Pos logit contributions
ed+0.019
rified+0.016
apped+0.016
ing+0.015
uttered+0.015
Neg logit contributions
 earn-0.018
 solve-0.018
 actually-0.018
 count-0.017
 seem-0.017
Token and
Token ID290
Feature activation-3.63e-02

Pos logit contributions
 earn+0.007
 actually+0.007
 solve+0.007
 seem+0.006
 count+0.006
Neg logit contributions
ed-0.013
rified-0.011
apped-0.011
uttered-0.011
ipt-0.011
Token sad
Token ID6507
Feature activation+9.60e-03

Pos logit contributions
 earn+0.052
 solve+0.050
 actually+0.048
 rely+0.047
 count+0.047
Neg logit contributions
ed-0.073
rified-0.064
apped-0.063
ing-0.061
uttered-0.060
Token.
Token ID13
Feature activation-3.06e-03

Pos logit contributions
ed+0.022
ing+0.019
rified+0.019
apped+0.019
ted+0.018
Neg logit contributions
 earn-0.012
 solve-0.012
 actually-0.011
 rely-0.011
 count-0.011
Token\n
Token ID198
Feature activation-8.84e-03

Pos logit contributions
 earn+0.006
 solve+0.005
 rely+0.005
 count+0.005
 actually+0.005
Neg logit contributions
ed-0.005
rified-0.005
apped-0.004
ipt-0.004
elt-0.004
Token\n
Token ID198
Feature activation-2.03e-02

Pos logit contributions
 earn+0.012
 solve+0.012
 actually+0.012
 count+0.012
 seem+0.011
Neg logit contributions
ed-0.017
apped-0.015
ipt-0.015
rified-0.015
ing-0.015
TokenTom
Token ID13787
Feature activation-6.85e-03

Pos logit contributions
 earn+0.030
 actually+0.030
 seem+0.030
 solve+0.030
 rely+0.030
Neg logit contributions
ed-0.036
rified-0.032
apped-0.032
ipt-0.032
aked-0.031
Token said
Token ID531
Feature activation-1.24e-02

Pos logit contributions
 earn+0.010
 solve+0.010
 actually+0.010
 seem+0.009
 rely+0.009
Neg logit contributions
ed-0.013
rified-0.012
apped-0.011
ing-0.011
uttered-0.011
Token,
Token ID11
Feature activation+1.06e-02

Pos logit contributions
 solve+0.016
 earn+0.015
 actually+0.015
 seem+0.015
 count+0.015
Neg logit contributions
ed-0.024
rified-0.023
attered-0.023
ipt-0.023
eled-0.022
Token "
Token ID366
Feature activation+3.31e-02

Pos logit contributions
ed+0.021
apped+0.018
rified+0.018
ing+0.018
elt+0.017
Neg logit contributions
 earn-0.016
 solve-0.015
 rely-0.015
 actually-0.014
 count-0.014
TokenHi
Token ID17250
Feature activation-5.76e-02

Pos logit contributions
ed+0.058
apped+0.051
rified+0.049
uttered+0.049
ing+0.047
Neg logit contributions
 earn-0.056
 actually-0.054
 solve-0.053
 read-0.052
 seem-0.051
Token,
Token ID11
Feature activation-3.11e-02

Pos logit contributions
 solve+0.081
 earn+0.080
 actually+0.079
 count+0.077
 seem+0.075
Neg logit contributions
ed-0.115
rified-0.101
apped-0.100
uttered-0.097
attered-0.096
Token promised
Token ID8072
Feature activation+4.00e-02

Pos logit contributions
ed+0.051
rified+0.045
apped+0.045
ing+0.043
uttered+0.042
Neg logit contributions
 earn-0.033
 solve-0.032
 actually-0.031
 seem-0.030
 count-0.030
Token to
Token ID284
Feature activation-3.77e-02

Pos logit contributions
ed+0.050
ing+0.036
apped+0.034
rified+0.033
ho+0.033
Neg logit contributions
 earn-0.085
 actually-0.081
 solve-0.080
 rely-0.080
 count-0.080
Token use
Token ID779
Feature activation-2.10e-02

Pos logit contributions
 earn+0.041
 seem+0.039
 actually+0.037
 solve+0.035
 cost+0.034
Neg logit contributions
ed-0.074
ho-0.072
apped-0.070
 breath-0.068
 bree-0.067
Token the
Token ID262
Feature activation-1.26e-02

Pos logit contributions
 earn+0.038
 solve+0.036
 actually+0.036
 seem+0.035
 rely+0.035
Neg logit contributions
ed-0.033
apped-0.027
rified-0.027
ing-0.027
 breath-0.026
Token magic
Token ID5536
Feature activation+4.24e-03

Pos logit contributions
 earn+0.021
 solve+0.021
 actually+0.021
 seem+0.020
 rely+0.020
Neg logit contributions
ed-0.021
rified-0.019
apped-0.018
ing-0.017
uttered-0.017
Token wand
Token ID11569
Feature activation+8.44e-03

Pos logit contributions
ed+0.006
ing+0.005
apped+0.005
rified+0.004
uttered+0.004
Neg logit contributions
 earn-0.009
 actually-0.008
 rely-0.008
 solve-0.008
 read-0.008
Token wisely
Token ID32773
Feature activation+2.56e-02

Pos logit contributions
ed+0.015
rified+0.012
apped+0.012
ing+0.012
uttered+0.011
Neg logit contributions
 earn-0.015
 solve-0.014
 actually-0.014
 rely-0.013
 seem-0.013
Token.
Token ID13
Feature activation+2.00e-02

Pos logit contributions
ed+0.055
ing+0.047
rified+0.044
apped+0.044
ho+0.042
Neg logit contributions
 earn-0.037
 solve-0.033
 count-0.032
 rely-0.032
 actually-0.032
Token<|endoftext|>
Token ID50256
Feature activation+6.06e-02

Pos logit contributions
ed+0.035
apped+0.031
ipt+0.028
elt+0.028
uttered+0.028
Neg logit contributions
 earn-0.037
 count-0.035
 actually-0.034
 solve-0.034
 seem-0.033
TokenOnce
Token ID7454
Feature activation+2.78e-02

Pos logit contributions
ed+0.138
rified+0.121
apped+0.121
ing+0.117
uttered+0.115
Neg logit contributions
 earn-0.073
 solve-0.068
 actually-0.067
 count-0.064
 rely-0.064
Token upon
Token ID2402
Feature activation+9.14e-03

Pos logit contributions
ed+0.034
rified+0.027
apped+0.026
ing+0.025
uttered+0.024
Neg logit contributions
 earn-0.062
 solve-0.060
 actually-0.060
 count-0.058
 rely-0.058
Token could
Token ID714
Feature activation+2.29e-02

Pos logit contributions
 earn+0.050
 solve+0.046
 actually+0.045
 rely+0.044
 count+0.044
Neg logit contributions
ed-0.055
rified-0.044
apped-0.043
uttered-0.041
ho-0.041
Token go
Token ID467
Feature activation-8.58e-03

Pos logit contributions
ed+0.038
apped+0.034
uck+0.033
uttered+0.033
ho+0.032
Neg logit contributions
 earn-0.032
lf-0.031
 dentist-0.031
uries-0.030
 actually-0.029
Token outside
Token ID2354
Feature activation-3.54e-02

Pos logit contributions
 earn+0.016
 actually+0.015
 solve+0.015
 seem+0.015
 rely+0.014
Neg logit contributions
ed-0.013
ing-0.011
apped-0.011
rified-0.010
elt-0.010
Token.
Token ID13
Feature activation+4.27e-03

Pos logit contributions
 earn+0.061
 solve+0.058
 actually+0.058
 seem+0.056
 rely+0.056
Neg logit contributions
ed-0.060
apped-0.050
rified-0.050
ing-0.049
uttered-0.046
Token\n
Token ID198
Feature activation-2.86e-03

Pos logit contributions
ed+0.009
apped+0.008
rified+0.008
ipt+0.008
elt+0.007
Neg logit contributions
 earn-0.006
 solve-0.006
 actually-0.006
 rely-0.005
 count-0.005
Token\n
Token ID198
Feature activation-1.67e-02

Pos logit contributions
 earn+0.003
 solve+0.003
 actually+0.003
 count+0.003
 seem+0.003
Neg logit contributions
ed-0.007
rified-0.006
apped-0.006
ing-0.006
uttered-0.006
Token"
Token ID1
Feature activation+1.13e-02

Pos logit contributions
 earn+0.023
 solve+0.023
 actually+0.023
 seem+0.022
 rely+0.022
Neg logit contributions
ed-0.033
rified-0.029
apped-0.029
ing-0.028
ipt-0.027
TokenYou
Token ID1639
Feature activation-7.13e-03

Pos logit contributions
ed+0.018
apped+0.016
rified+0.016
uttered+0.015
ing+0.015
Neg logit contributions
 earn-0.021
 read-0.020
 actually-0.020
 solve-0.020
 count-0.019
Token can
Token ID460
Feature activation+1.10e-03

Pos logit contributions
 earn+0.011
 solve+0.011
 actually+0.011
 rely+0.010
 count+0.010
Neg logit contributions
ed-0.013
rified-0.011
apped-0.011
ing-0.011
uttered-0.011
Token come
Token ID1282
Feature activation-1.93e-02

Pos logit contributions
ed+0.002
uck+0.002
ho+0.002
 breath+0.002
rified+0.002
Neg logit contributions
 earn-0.001
 actually-0.001
 dentist-0.001
uries-0.001
 teaches-0.001
Token,
Token ID11
Feature activation-3.12e-02

Pos logit contributions
 earn+0.036
 solve+0.036
 actually+0.036
 count+0.035
 read+0.034
Neg logit contributions
ed-0.029
rified-0.025
apped-0.024
ing-0.023
uttered-0.023
Token\n
Token ID198
Feature activation+7.87e-03

Pos logit contributions
ed+0.033
apped+0.029
rified+0.027
elt+0.027
ing+0.026
Neg logit contributions
 earn-0.025
 solve-0.025
 rely-0.025
 count-0.024
 actually-0.023
Token\n
Token ID198
Feature activation-5.25e-03

Pos logit contributions
ed+0.015
apped+0.013
ing+0.012
ipt+0.012
Look+0.012
Neg logit contributions
 earn-0.012
 rely-0.011
 count-0.011
 actually-0.011
 solve-0.011
Token"
Token ID1
Feature activation+3.28e-02

Pos logit contributions
 rely+0.008
 actually+0.008
 earn+0.008
 solve+0.008
 seem+0.008
Neg logit contributions
ed-0.010
ing-0.009
rified-0.009
apped-0.008
aked-0.008
TokenHello
Token ID15496
Feature activation-5.20e-02

Pos logit contributions
ed+0.040
rified+0.031
apped+0.029
ing+0.028
elt+0.027
Neg logit contributions
 earn-0.073
 solve-0.073
 actually-0.072
 seem-0.070
 rely-0.070
Token,
Token ID11
Feature activation-1.08e-02

Pos logit contributions
 earn+0.086
 solve+0.082
 actually+0.081
 rely+0.078
 seem+0.078
Neg logit contributions
ed-0.094
rified-0.081
apped-0.079
ing-0.079
uttered-0.074
Token little
Token ID1310
Feature activation-1.52e-02

Pos logit contributions
 earn+0.018
 solve+0.018
 rely+0.018
 actually+0.017
 seem+0.017
Neg logit contributions
ed-0.018
rified-0.016
ing-0.016
apped-0.015
ted-0.014
Token ones
Token ID3392
Feature activation-1.42e-02

Pos logit contributions
 solve+0.025
 earn+0.025
 count+0.025
 read+0.024
 seem+0.024
Neg logit contributions
ed-0.026
uttered-0.023
apped-0.023
attered-0.022
rified-0.022
Token,"
Token ID553
Feature activation-6.24e-02

Pos logit contributions
 actually+0.025
 earn+0.024
 solve+0.023
 rely+0.022
 seem+0.022
Neg logit contributions
ed-0.024
ing-0.024
rified-0.020
w-0.020
apped-0.019
Token he
Token ID339
Feature activation+1.82e-02

Pos logit contributions
 earn+0.080
 actually+0.079
 solve+0.078
 rely+0.078
 count+0.074
Neg logit contributions
ed-0.129
rified-0.115
apped-0.111
ted-0.108
ing-0.107
Token said
Token ID531
Feature activation+1.26e-02

Pos logit contributions
ed+0.035
rified+0.028
apped+0.028
ted+0.026
ing+0.026
Neg logit contributions
 earn-0.031
 solve-0.029
 rely-0.028
 actually-0.027
 seem-0.027
Token.
Token ID13
Feature activation+6.31e-03

Pos logit contributions
ed+0.031
apped+0.027
rified+0.026
ing+0.026
uttered+0.025
Neg logit contributions
 earn-0.014
 actually-0.013
 solve-0.013
 rely-0.012
 count-0.012
Token are
Token ID389
Feature activation-4.88e-02

Pos logit contributions
 earn+0.003
 solve+0.003
 actually+0.003
 rely+0.003
 count+0.002
Neg logit contributions
ed-0.004
rified-0.003
apped-0.003
ing-0.003
uttered-0.003
Token in
Token ID287
Feature activation-5.63e-02

Pos logit contributions
 earn+0.086
 rely+0.075
 solve+0.073
 concentrate+0.073
 seem+0.072
Neg logit contributions
ed-0.078
rified-0.066
apped-0.063
ing-0.062
 down-0.062
Token an
Token ID281
Feature activation-2.74e-03

Pos logit contributions
 earn+0.089
 solve+0.087
 actually+0.085
 count+0.082
 seem+0.082
Neg logit contributions
ed-0.104
rified-0.090
apped-0.089
ing-0.086
uttered-0.084
Token open
Token ID1280
Feature activation-2.84e-02

Pos logit contributions
 earn+0.004
 solve+0.004
 actually+0.004
 seem+0.004
 rely+0.004
Neg logit contributions
ed-0.005
rified-0.004
apped-0.004
eled-0.004
ing-0.004
Token field
Token ID2214
Feature activation-4.75e-02

Pos logit contributions
 earn+0.047
 solve+0.046
 actually+0.046
 seem+0.045
 rely+0.044
Neg logit contributions
ed-0.049
rified-0.042
apped-0.041
ing-0.040
uttered-0.038
Token,"
Token ID553
Feature activation-6.95e-02

Pos logit contributions
 earn+0.067
 solve+0.064
 actually+0.064
 seem+0.061
 rely+0.060
Neg logit contributions
ed-0.095
rified-0.083
apped-0.082
ing-0.081
uttered-0.077
Token said
Token ID531
Feature activation-6.12e-03

Pos logit contributions
 earn+0.096
 solve+0.093
 actually+0.092
 count+0.088
 rely+0.088
Neg logit contributions
ed-0.142
rified-0.124
apped-0.122
ing-0.118
ted-0.115
Token the
Token ID262
Feature activation-2.92e-02

Pos logit contributions
 earn+0.009
 solve+0.009
 actually+0.009
 count+0.008
 seem+0.008
Neg logit contributions
ed-0.012
rified-0.010
apped-0.010
ing-0.010
uttered-0.010
Token farmer
Token ID18739
Feature activation+1.49e-02

Pos logit contributions
 earn+0.051
 solve+0.051
 seem+0.050
 actually+0.050
 count+0.047
Neg logit contributions
ed-0.047
rified-0.039
ing-0.039
apped-0.037
ho-0.037
Token.
Token ID13
Feature activation-2.91e-03

Pos logit contributions
ed+0.032
ing+0.027
apped+0.026
rified+0.026
ho+0.025
Neg logit contributions
 earn-0.023
 actually-0.019
 solve-0.019
 rely-0.018
 achieve-0.018
Token "
Token ID366
Feature activation+5.20e-02

Pos logit contributions
 earn+0.005
 solve+0.005
 rely+0.005
 count+0.004
 actually+0.004
Neg logit contributions
ed-0.005
apped-0.005
elt-0.004
ing-0.004
rified-0.004
Token chasing
Token ID20023
Feature activation+3.43e-02

Pos logit contributions
ed+0.056
rified+0.049
apped+0.048
ing+0.047
uttered+0.046
Neg logit contributions
 earn-0.037
 solve-0.036
 actually-0.036
 count-0.034
 seem-0.034
Token one
Token ID530
Feature activation+6.86e-02

Pos logit contributions
ed+0.067
apped+0.066
ipt+0.066
uttered+0.065
elt+0.065
Neg logit contributions
 solve-0.045
 read-0.039
 seem-0.039
 actually-0.037
 count-0.037
Token another
Token ID1194
Feature activation+3.89e-02

Pos logit contributions
ed+0.135
rified+0.117
apped+0.116
ing+0.111
uttered+0.110
Neg logit contributions
 earn-0.100
 solve-0.098
 actually-0.097
 count-0.092
 seem-0.092
Token through
Token ID832
Feature activation+1.66e-02

Pos logit contributions
ed+0.063
ing+0.056
rified+0.055
cy+0.050
ho+0.049
Neg logit contributions
 earn-0.076
 solve-0.066
 actually-0.063
 rely-0.062
 dentist-0.062
Token the
Token ID262
Feature activation+2.10e-02

Pos logit contributions
ed+0.034
rified+0.030
apped+0.030
ing+0.028
uttered+0.028
Neg logit contributions
 solve-0.022
 earn-0.022
 actually-0.022
 seem-0.021
 count-0.021
Token forest
Token ID8222
Feature activation-1.49e-02

Pos logit contributions
ed+0.036
rified+0.031
apped+0.030
ing+0.030
attered+0.028
Neg logit contributions
 earn-0.036
 solve-0.035
 actually-0.035
 seem-0.034
 rely-0.034
Token.
Token ID13
Feature activation+9.51e-03

Pos logit contributions
 earn+0.019
 solve+0.016
 actually+0.016
 rely+0.016
 seem+0.015
Neg logit contributions
ed-0.034
apped-0.030
rified-0.030
ing-0.029
ho-0.029
Token Soon
Token ID15894
Feature activation-2.70e-02

Pos logit contributions
ed+0.015
apped+0.014
rified+0.013
uttered+0.013
ted+0.012
Neg logit contributions
 earn-0.018
aches-0.017
 solve-0.017
 rely-0.017
 actually-0.017
Token,
Token ID11
Feature activation+1.07e-02

Pos logit contributions
 solve+0.031
 seem+0.030
 actually+0.029
 count+0.027
 read+0.026
Neg logit contributions
ed-0.059
attered-0.057
ipt-0.055
rified-0.055
apped-0.054
Token they
Token ID484
Feature activation+2.49e-02

Pos logit contributions
ed+0.026
attered+0.023
aked+0.023
rified+0.023
elt+0.023
Neg logit contributions
 solve-0.012
 actually-0.011
 earn-0.010
 read-0.010
 seem-0.010
Token started
Token ID2067
Feature activation+2.04e-02

Pos logit contributions
ed+0.047
rified+0.041
apped+0.040
ing+0.039
uttered+0.038
Neg logit contributions
 earn-0.038
 solve-0.038
 actually-0.037
 count-0.035
 seem-0.035
Token and
Token ID290
Feature activation+7.96e-03

Pos logit contributions
ed+0.118
ing+0.100
rified+0.098
apped+0.096
ho+0.091
Neg logit contributions
 earn-0.112
 solve-0.108
 actually-0.107
 rely-0.105
 seem-0.101
Token thanked
Token ID26280
Feature activation+1.27e-03

Pos logit contributions
ed+0.014
apped+0.012
rified+0.011
ing+0.011
uttered+0.011
Neg logit contributions
 earn-0.014
 solve-0.013
 actually-0.013
 seem-0.013
 rely-0.013
Token her
Token ID607
Feature activation-2.06e-02

Pos logit contributions
ed+0.002
rified+0.001
apped+0.001
ing+0.001
ted+0.001
Neg logit contributions
 actually-0.002
 earn-0.002
 solve-0.002
 rely-0.002
 count-0.002
Token for
Token ID329
Feature activation+2.54e-03

Pos logit contributions
 earn+0.032
 actually+0.031
 rely+0.030
 seem+0.030
 cost+0.029
Neg logit contributions
ed-0.038
ing-0.035
apped-0.033
 down-0.032
rified-0.030
Token the
Token ID262
Feature activation-2.30e-02

Pos logit contributions
ed+0.005
apped+0.005
ing+0.004
rified+0.004
ipt+0.004
Neg logit contributions
 seem-0.003
 earn-0.003
 actually-0.003
 solve-0.003
 count-0.003
Token water
Token ID1660
Feature activation-1.13e-02

Pos logit contributions
 earn+0.033
 solve+0.032
 actually+0.031
 count+0.031
 seem+0.030
Neg logit contributions
ed-0.047
rified-0.041
apped-0.040
ing-0.039
uttered-0.038
Token and
Token ID290
Feature activation+1.19e-02

Pos logit contributions
 earn+0.012
 solve+0.012
 actually+0.011
 rely+0.011
 seem+0.011
Neg logit contributions
ed-0.027
rified-0.024
apped-0.024
ing-0.023
uttered-0.023
Token the
Token ID262
Feature activation-9.81e-03

Pos logit contributions
ed+0.020
apped+0.017
rified+0.016
ing+0.015
uttered+0.015
Neg logit contributions
 earn-0.020
 actually-0.020
 solve-0.020
 seem-0.019
 rely-0.019
Token hats
Token ID23910
Feature activation+1.20e-02

Pos logit contributions
 earn+0.016
 solve+0.016
 actually+0.015
 count+0.015
 rely+0.015
Neg logit contributions
ed-0.018
rified-0.016
apped-0.015
ing-0.015
uttered-0.015
Token.
Token ID13
Feature activation+9.55e-03

Pos logit contributions
ed+0.028
apped+0.025
rified+0.024
ing+0.024
uttered+0.023
Neg logit contributions
 earn-0.014
 solve-0.013
 actually-0.013
 rely-0.012
 count-0.012
Token They
Token ID1119
Feature activation+1.05e-02

Pos logit contributions
ed+0.017
apped+0.014
rified+0.014
uttered+0.013
ipt+0.013
Neg logit contributions
 earn-0.016
 solve-0.016
 actually-0.016
 count-0.015
 rely-0.015
Token a
Token ID257
Feature activation-6.83e-03

Pos logit contributions
ed+0.015
rified+0.013
apped+0.012
ing+0.012
uttered+0.012
Neg logit contributions
 earn-0.020
 solve-0.020
 actually-0.019
 count-0.019
 seem-0.019
Token friend
Token ID1545
Feature activation+1.82e-02

Pos logit contributions
 earn+0.010
 solve+0.010
 actually+0.010
 seem+0.010
 count+0.009
Neg logit contributions
ed-0.013
rified-0.011
apped-0.011
ing-0.011
uttered-0.010
Token to
Token ID284
Feature activation-1.91e-02

Pos logit contributions
ed+0.031
rified+0.025
ing+0.025
apped+0.024
elt+0.022
Neg logit contributions
 earn-0.033
 actually-0.031
 solve-0.031
 seem-0.030
 rely-0.029
Token help
Token ID1037
Feature activation+7.27e-03

Pos logit contributions
idge+0.022
 actually+0.022
lf+0.022
 dentist+0.022
 earn+0.021
Neg logit contributions
ed-0.032
 away-0.031
 breath-0.031
apped-0.029
 tones-0.028
Token you
Token ID345
Feature activation+7.85e-03

Pos logit contributions
ed+0.012
ing+0.011
rified+0.010
 down+0.009
 away+0.009
Neg logit contributions
 earn-0.012
 actually-0.011
 cost-0.011
 rely-0.011
case-0.011
Token achieve
Token ID4620
Feature activation-1.06e-01

Pos logit contributions
ed+0.014
ing+0.013
 away+0.012
 down+0.012
apped+0.011
Neg logit contributions
 dentist-0.011
 earn-0.011
lf-0.010
 actually-0.010
 cost-0.010
Token your
Token ID534
Feature activation-1.80e-02

Pos logit contributions
 earn+0.190
 actually+0.181
 solve+0.181
 seem+0.180
 cost+0.179
Neg logit contributions
ed-0.156
ing-0.129
rified-0.126
apped-0.120
 down-0.119
Token goals
Token ID4661
Feature activation-7.57e-03

Pos logit contributions
 earn+0.029
 actually+0.029
 solve+0.027
 seem+0.027
 read+0.027
Neg logit contributions
ed-0.030
apped-0.026
rified-0.025
ing-0.024
uttered-0.024
Token.
Token ID13
Feature activation-7.58e-03

Pos logit contributions
 earn+0.010
 actually+0.010
 dentist+0.010
 cost+0.009
 read+0.009
Neg logit contributions
ed-0.014
ing-0.013
apped-0.012
 down-0.012
 away-0.011
Token Friends
Token ID14213
Feature activation+3.66e-02

Pos logit contributions
lf+0.013
 earn+0.013
aches+0.013
 solve+0.012
 cost+0.012
Neg logit contributions
apped-0.011
ed-0.011
rified-0.010
elt-0.010
ipt-0.010
Token can
Token ID460
Feature activation+3.35e-02

Pos logit contributions
ed+0.072
apped+0.064
rified+0.063
ing+0.060
w+0.059
Neg logit contributions
 earn-0.047
 actually-0.047
 dentist-0.044
 solve-0.044
 seem-0.044
Token to
Token ID284
Feature activation-5.77e-02

Pos logit contributions
 earn+0.015
 solve+0.015
 actually+0.015
 count+0.014
 seem+0.014
Neg logit contributions
ed-0.010
rified-0.008
apped-0.008
ing-0.008
uttered-0.007
Token be
Token ID307
Feature activation-5.80e-02

Pos logit contributions
case+0.069
 dentist+0.065
lf+0.064
penter+0.062
ird+0.062
Neg logit contributions
ed-0.095
w-0.089
 breath-0.089
 away-0.088
ing-0.087
Token kind
Token ID1611
Feature activation+4.55e-03

Pos logit contributions
 earn+0.093
 seem+0.091
 solve+0.090
 actually+0.085
 cost+0.085
Neg logit contributions
ed-0.105
apped-0.087
ing-0.084
aked-0.083
rified-0.082
Token and
Token ID290
Feature activation-2.89e-02

Pos logit contributions
ed+0.009
rified+0.008
apped+0.008
uttered+0.007
eled+0.007
Neg logit contributions
 solve-0.006
 earn-0.006
 actually-0.006
 count-0.006
 seem-0.006
Token consider
Token ID2074
Feature activation-7.72e-02

Pos logit contributions
 actually+0.035
 dentist+0.035
 earn+0.034
lf+0.034
 seem+0.033
Neg logit contributions
ed-0.056
apped-0.050
ing-0.048
ho-0.048
uttered-0.046
Tokenate
Token ID378
Feature activation-1.18e-01

Pos logit contributions
 earn+0.123
 actually+0.120
 seem+0.119
 solve+0.119
 read+0.115
Neg logit contributions
ed-0.139
rified-0.119
apped-0.118
ing-0.116
uttered-0.110
Token.
Token ID13
Feature activation-1.52e-02

Pos logit contributions
 earn+0.177
 solve+0.173
 actually+0.170
 seem+0.163
 count+0.163
Neg logit contributions
ed-0.224
rified-0.196
apped-0.194
ing-0.185
uttered-0.184
Token When
Token ID1649
Feature activation-3.01e-02

Pos logit contributions
 solve+0.026
lf+0.025
aches+0.025
 earn+0.025
 cost+0.025
Neg logit contributions
apped-0.023
ed-0.023
ipt-0.021
rified-0.021
elt-0.020
Token we
Token ID356
Feature activation-4.24e-04

Pos logit contributions
 earn+0.041
 actually+0.040
case+0.038
 solve+0.038
 count+0.038
Neg logit contributions
ed-0.054
ing-0.048
apped-0.047
rified-0.045
w-0.045
Token use
Token ID779
Feature activation-3.23e-02

Pos logit contributions
 earn+0.001
 dentist+0.000
lf+0.000
 actually+0.000
 solve+0.000
Neg logit contributions
ed-0.001
rified-0.001
apped-0.001
ho-0.001
uttered-0.001
Token kind
Token ID1611
Feature activation+2.95e-02

Pos logit contributions
 earn+0.054
 solve+0.052
 actually+0.051
 seem+0.049
 count+0.049
Neg logit contributions
ed-0.056
rified-0.048
apped-0.047
ing-0.046
uttered-0.045
Token said
Token ID531
Feature activation-1.33e-02

Pos logit contributions
ed+0.011
apped+0.010
rified+0.010
ing+0.010
uttered+0.009
Neg logit contributions
 earn-0.008
 solve-0.007
 actually-0.007
 seem-0.007
 rely-0.007
Token,
Token ID11
Feature activation+9.57e-03

Pos logit contributions
 solve+0.016
 earn+0.016
 actually+0.016
 seem+0.015
 count+0.015
Neg logit contributions
ed-0.027
rified-0.026
attered-0.025
ipt-0.025
apped-0.024
Token "
Token ID366
Feature activation+3.35e-02

Pos logit contributions
ed+0.020
apped+0.017
ing+0.017
rified+0.016
elt+0.016
Neg logit contributions
 earn-0.014
 solve-0.013
 rely-0.013
 actually-0.013
 seem-0.012
TokenLet
Token ID5756
Feature activation+1.58e-02

Pos logit contributions
ed+0.049
rified+0.041
apped+0.040
ing+0.037
uttered+0.037
Neg logit contributions
 earn-0.066
 solve-0.064
 actually-0.064
 seem-0.062
 count-0.062
Tokenâ
Token ID22940
Feature activation-1.16e-02

Pos logit contributions
ed+0.041
rified+0.037
apped+0.037
ing+0.036
uttered+0.036
Neg logit contributions
 earn-0.013
 solve-0.012
 actually-0.012
 seem-0.011
 count-0.011
Token
Token ID26391
Feature activation-1.04e-01

Pos logit contributions
 actually+0.015
 dentist+0.014
 else+0.013
 solve+0.013
lf+0.013
Neg logit contributions
ed-0.023
uck-0.021
apped-0.020
rified-0.020
aked-0.020
Token
Token ID8151
Feature activation-1.89e-02

Pos logit contributions
 earn+0.155
 actually+0.150
 solve+0.150
 rely+0.140
 count+0.140
Neg logit contributions
ed-0.204
rified-0.175
apped-0.174
ing-0.167
uttered-0.163
Tokens
Token ID82
Feature activation-3.34e-02

Pos logit contributions
 earn+0.035
 solve+0.035
 actually+0.034
 rely+0.033
 count+0.033
Neg logit contributions
ed-0.030
rified-0.025
apped-0.025
ing-0.024
uttered-0.023
Token play
Token ID711
Feature activation-4.68e-02

Pos logit contributions
 earn+0.039
 actually+0.038
 solve+0.037
 seem+0.036
 rely+0.034
Neg logit contributions
ed-0.070
apped-0.062
rified-0.061
ing-0.060
uttered-0.059
Token catch
Token ID4929
Feature activation-3.79e-02

Pos logit contributions
 earn+0.068
 solve+0.068
 actually+0.066
 count+0.064
 read+0.063
Neg logit contributions
ed-0.091
rified-0.081
apped-0.079
uttered-0.077
ing-0.076
Token.
Token ID13
Feature activation-2.99e-02

Pos logit contributions
 earn+0.055
 solve+0.054
 actually+0.053
 count+0.050
 seem+0.050
Neg logit contributions
ed-0.074
rified-0.065
apped-0.065
uttered-0.062
ing-0.061
Token.
Token ID13
Feature activation-1.61e-02

Pos logit contributions
 earn+0.052
 actually+0.049
 seem+0.048
 solve+0.046
 count+0.044
Neg logit contributions
ed-0.076
ing-0.066
apped-0.065
cy-0.063
attered-0.063
Token She
Token ID1375
Feature activation-3.89e-02

Pos logit contributions
 earn+0.024
 actually+0.023
 solve+0.023
 rely+0.022
 count+0.021
Neg logit contributions
ed-0.032
rified-0.028
apped-0.028
ing-0.027
uttered-0.026
Token loved
Token ID6151
Feature activation+1.94e-02

Pos logit contributions
 earn+0.064
 solve+0.060
 actually+0.059
 seem+0.058
 count+0.058
Neg logit contributions
ed-0.073
apped-0.060
rified-0.057
w-0.057
ing-0.056
Token eating
Token ID6600
Feature activation-3.60e-02

Pos logit contributions
ed+0.029
rified+0.024
apped+0.024
ing+0.023
uttered+0.022
Neg logit contributions
 earn-0.037
 solve-0.036
 actually-0.036
 seem-0.035
 count-0.035
Token o
Token ID267
Feature activation-4.14e-02

Pos logit contributions
 earn+0.050
 solve+0.048
 actually+0.048
 rely+0.046
 seem+0.046
Neg logit contributions
ed-0.073
rified-0.064
apped-0.063
ing-0.062
uttered-0.060
Tokenatmeal
Token ID45280
Feature activation-1.02e-01

Pos logit contributions
 actually+0.045
 earn+0.042
 solve+0.041
 cost+0.039
 seem+0.038
Neg logit contributions
ed-0.099
ing-0.091
w-0.085
uck-0.085
apped-0.085
Token for
Token ID329
Feature activation-6.71e-04

Pos logit contributions
 earn+0.172
 actually+0.169
 solve+0.166
 rely+0.165
 seem+0.164
Neg logit contributions
ed-0.184
ing-0.156
apped-0.148
ho-0.138
rified-0.137
Token breakfast
Token ID12607
Feature activation-5.90e-02

Pos logit contributions
 solve+0.001
 rely+0.001
 earn+0.001
 actually+0.001
case+0.001
Neg logit contributions
ed-0.001
ted-0.001
ing-0.001
w-0.001
eled-0.001
Token every
Token ID790
Feature activation+6.51e-02

Pos logit contributions
 actually+0.095
 seem+0.093
 rely+0.092
 earn+0.092
 solve+0.090
Neg logit contributions
ed-0.102
ing-0.096
apped-0.085
uck-0.079
w-0.079
Token day
Token ID1110
Feature activation+4.71e-02

Pos logit contributions
ed+0.111
apped+0.094
rified+0.094
ing+0.090
uttered+0.088
Neg logit contributions
 earn-0.112
 solve-0.109
 actually-0.107
 seem-0.102
 rely-0.102
Token.
Token ID13
Feature activation+5.30e-03

Pos logit contributions
ed+0.094
ing+0.081
apped+0.079
elt+0.074
attered+0.074
Neg logit contributions
 earn-0.075
 solve-0.071
 actually-0.069
 count-0.066
 rely-0.065
Token brave
Token ID14802
Feature activation+6.52e-02

Pos logit contributions
ed+0.113
apped+0.092
rified+0.091
uttered+0.085
ho+0.084
Neg logit contributions
 earn-0.100
 solve-0.096
 actually-0.092
 seem-0.091
 rely-0.090
Tokenly
Token ID306
Feature activation+4.05e-02

Pos logit contributions
ed+0.109
rified+0.094
apped+0.093
uttered+0.088
ing+0.088
Neg logit contributions
 earn-0.111
 solve-0.110
 actually-0.108
 seem-0.105
 count-0.105
Token asked
Token ID1965
Feature activation-2.58e-02

Pos logit contributions
ed+0.075
rified+0.061
apped+0.060
ing+0.059
uttered+0.057
Neg logit contributions
 earn-0.067
 solve-0.064
 actually-0.063
 seem-0.061
 rely-0.061
Token,
Token ID11
Feature activation-5.64e-03

Pos logit contributions
 earn+0.041
 actually+0.039
 solve+0.039
 seem+0.037
 rely+0.037
Neg logit contributions
ed-0.049
rified-0.041
apped-0.041
ing-0.040
uttered-0.038
Token "
Token ID366
Feature activation+2.66e-02

Pos logit contributions
 earn+0.009
 solve+0.008
 rely+0.008
 seem+0.008
 cost+0.008
Neg logit contributions
ed-0.011
apped-0.009
ing-0.009
rified-0.009
elt-0.008
TokenWho
Token ID8241
Feature activation-1.11e-01

Pos logit contributions
ed+0.045
apped+0.038
rified+0.038
uttered+0.036
ing+0.036
Neg logit contributions
 earn-0.046
 solve-0.045
 actually-0.045
 count-0.043
 read-0.043
Token are
Token ID389
Feature activation-6.66e-02

Pos logit contributions
 earn+0.173
 actually+0.166
 rely+0.160
 solve+0.157
 seem+0.154
Neg logit contributions
ed-0.210
apped-0.169
rified-0.166
ing-0.165
w-0.162
Token you
Token ID345
Feature activation-3.92e-02

Pos logit contributions
 earn+0.096
 actually+0.092
 rely+0.089
 count+0.088
 read+0.088
Neg logit contributions
ed-0.126
ing-0.108
apped-0.105
rified-0.104
w-0.099
Token?"
Token ID1701
Feature activation-5.16e-02

Pos logit contributions
 earn+0.061
 solve+0.054
 actually+0.052
 seem+0.051
 rely+0.051
Neg logit contributions
ed-0.076
apped-0.066
ing-0.064
rified-0.063
uttered-0.059
Token
Token ID220
Feature activation+1.68e-02

Pos logit contributions
 earn+0.077
 solve+0.073
 actually+0.072
 seem+0.071
 count+0.071
Neg logit contributions
ed-0.102
apped-0.091
rified-0.087
elt-0.083
uttered-0.082
Token
Token ID198
Feature activation+1.52e-03

Pos logit contributions
ed+0.031
apped+0.027
ing+0.027
uttered+0.026
ipt+0.024
Neg logit contributions
 earn-0.028
 count-0.027
 read-0.027
 cost-0.027
 solve-0.027