TOP ACTIVATIONS
MAX = 0.253

as Peter faded into the distance . She had
alarm go off in the distance . It was a loud
though we are separated by distance , we can still be
as he disappeared into the distance . Gig i knew that
it flying off into the distance . He stayed there ,
best to keep a normal distance . â
noticed a boat in the distance . As it got closer
a shiny object in the distance . He rolled even faster
said something she 'll never forget : " Little
a field off in the distance . She decided to explore
aroo watched them from a distance . She was not angry
They hurt their knees and elbows . They cry and hug
distance . Everyone was over joy ed that the cow had
to see who was the strongest . They decided to play
. Tim my was over joy ed and said , "
a big basket in the distance . He slowly crawled up
of the ocean in the distance . James decided
some strange noises in the distance . The
the blue sea in the distance . Finally , they reached
, but from a safe distance . They said they would

MIN ACTIVATIONS
MIN = -0.226

," Mom says . " Electric ity is a kind of
M oral : Team work and using our brains
. Brown meant by " sp oil ", but she knew
many books on her books he lf in her room .
he lf . The books he lf was smooth and shiny
Yes , we are . Team work makes difficult work easier
. He tried to un plug the mixer , but he
he lf . His books he lf is important to him
that says " Cl ay Factory ". Tom and Sam are
L ily , your books he lf is very messy .
many books on his books he lf . His books he
, you need to bat he . You are too dirty
have a tum my ac he . Can I stay home
put them on the books he lf . The books he
have a tum my ac he . Do you want that
it 's time to bat he !" Max was obedient and
should not climb the books he lf . It is dangerous
had a tum my ac he and a headache . They
new book on his books he lf . It is big
the books on the books he lf and the clothes in

INTERVAL 0.133 to 0.253
CONTAINS 0.152%

didn 't want to wear anything heavy . So , she
we 'll just have to forget about it ."
the wreck out of the way . Red was very thankful
" That 's a good choice . You can both express
but he couldn 't find anywhere to buy a television .

INTERVAL 0.013 to 0.133
CONTAINS 30.701%

She was demanding her to come down . Lily ran down
dropped the soap and ran to the oven . She tried
, Sara , this is your ring . Now we are
she decided to sit in the grass and close her eyes
and broken . Some of the plan ks were missing and

INTERVAL -0.106 to 0.013
CONTAINS 68.656%

about the old man . They were happy . <|endoftext|> Tim
to her mom . " Look , mom , I made
eating the cake . L ily was sad that
enough money . The store manager offered to lend him some
Her mother hugged her and said , " I 'm glad

INTERVAL -0.226 to -0.106
CONTAINS 0.492%

her favorite show on the TV . But it was too
," Tom said . " It has fast cars and loud
Lily and says , " It 's okay , sweet ie
who had come home from work . She saw the stove
friends and they all played together with the shiny trophy .
Token as
Token ID355
Feature activation+4.26e-02

Pos logit contributions
awks+0.029
 Leave+0.029
vious+0.028
joy+0.028
 Fro+0.028
Neg logit contributions
pack-0.021
ve-0.019
her-0.019
--0.019
"-0.019
Token Peter
Token ID5613
Feature activation-4.28e-03

Pos logit contributions
vious+0.088
awks+0.085
 Leave+0.081
trained+0.079
 Fro+0.078
Neg logit contributions
--0.057
pack-0.056
 carefully-0.053
"-0.052
ve-0.051
Token faded
Token ID24887
Feature activation+3.14e-03

Pos logit contributions
-+0.006
 carefully+0.006
pack+0.006
 together+0.006
ator+0.006
Neg logit contributions
vious-0.008
awks-0.008
 Leave-0.008
trained-0.007
 Fro-0.007
Token into
Token ID656
Feature activation+5.18e-02

Pos logit contributions
vious+0.007
awks+0.007
 Leave+0.006
irms+0.006
joy+0.006
Neg logit contributions
 carefully-0.004
pack-0.004
--0.003
ve-0.003
"-0.003
Token the
Token ID262
Feature activation+5.67e-02

Pos logit contributions
vious+0.118
awks+0.115
 Leave+0.111
joy+0.107
irms+0.107
Neg logit contributions
--0.062
 carefully-0.061
pack-0.057
"-0.056
ve-0.056
Token distance
Token ID5253
Feature activation+2.07e-01

Pos logit contributions
vious+0.108
awks+0.103
 Leave+0.100
trained+0.094
irms+0.093
Neg logit contributions
 carefully-0.092
--0.092
 together-0.086
"-0.084
pack-0.083
Token.
Token ID13
Feature activation+2.19e-02

Pos logit contributions
vious+0.564
awks+0.545
 Leave+0.530
joy+0.513
trained+0.507
Neg logit contributions
pack-0.145
 carefully-0.141
--0.139
ve-0.129
lf-0.127
Token
Token ID198
Feature activation+2.01e-02

Pos logit contributions
 Leave+0.041
vious+0.041
awks+0.040
joy+0.037
 Fro+0.037
Neg logit contributions
--0.035
 carefully-0.033
pack-0.031
ve-0.031
lf-0.030
Token
Token ID198
Feature activation+1.76e-02

Pos logit contributions
 Leave+0.035
vious+0.033
awks+0.032
joy+0.032
Leave+0.031
Neg logit contributions
--0.032
 work-0.029
pack-0.029
 carefully-0.029
ator-0.029
TokenShe
Token ID3347
Feature activation+1.04e-02

Pos logit contributions
 Leave+0.030
Leave+0.029
vious+0.028
joy+0.026
awks+0.025
Neg logit contributions
--0.029
 work-0.028
ator-0.026
 carefully-0.025
 creation-0.025
Token had
Token ID550
Feature activation+3.37e-02

Pos logit contributions
vious+0.020
 Leave+0.020
awks+0.019
Leave+0.018
 Fro+0.017
Neg logit contributions
--0.015
pack-0.015
ator-0.014
 together-0.014
 carefully-0.014
Token alarm
Token ID10436
Feature activation-7.10e-02

Pos logit contributions
vious+0.180
awks+0.165
 Leave+0.154
 ze+0.146
 hesitation+0.144
Neg logit contributions
 together-0.122
pack-0.113
 carefully-0.112
--0.112
"-0.104
Token go
Token ID467
Feature activation+9.15e-02

Pos logit contributions
 carefully+0.082
pack+0.078
ve+0.072
ator+0.070
 together+0.070
Neg logit contributions
vious-0.163
 Leave-0.153
awks-0.147
trained-0.138
Leave-0.137
Token off
Token ID572
Feature activation+9.86e-02

Pos logit contributions
vious+0.190
awks+0.180
 Leave+0.179
trained+0.170
Leave+0.166
Neg logit contributions
lf-0.110
ator-0.109
abies-0.100
--0.099
 carefully-0.098
Token in
Token ID287
Feature activation+2.48e-02

Pos logit contributions
vious+0.218
awks+0.214
 Leave+0.211
joy+0.206
irms+0.196
Neg logit contributions
--0.099
ator-0.096
pack-0.095
lf-0.092
 carefully-0.089
Token the
Token ID262
Feature activation+5.71e-02

Pos logit contributions
vious+0.049
 Leave+0.049
joy+0.049
awks+0.048
irms+0.046
Neg logit contributions
--0.031
ve-0.030
 carefully-0.030
iced-0.029
her-0.029
Token distance
Token ID5253
Feature activation+2.04e-01

Pos logit contributions
vious+0.124
awks+0.120
 Leave+0.116
trained+0.111
irms+0.110
Neg logit contributions
 carefully-0.077
--0.077
 together-0.071
"-0.069
pack-0.068
Token.
Token ID13
Feature activation+2.15e-02

Pos logit contributions
vious+0.552
awks+0.529
 Leave+0.518
joy+0.488
trained+0.488
Neg logit contributions
--0.145
 carefully-0.144
pack-0.142
ve-0.129
lf-0.122
Token It
Token ID632
Feature activation-2.74e-02

Pos logit contributions
vious+0.042
awks+0.040
 Leave+0.040
trained+0.037
irms+0.037
Neg logit contributions
--0.034
 carefully-0.033
 together-0.031
pack-0.030
"-0.030
Token was
Token ID373
Feature activation+1.21e-02

Pos logit contributions
-+0.026
pack+0.026
 carefully+0.026
 together+0.024
ator+0.023
Neg logit contributions
vious-0.070
awks-0.066
 Leave-0.065
trained-0.061
Leave-0.061
Token a
Token ID257
Feature activation-1.18e-03

Pos logit contributions
vious+0.027
awks+0.026
 Leave+0.025
 Fro+0.023
irms+0.023
Neg logit contributions
--0.015
 carefully-0.015
pack-0.014
ve-0.014
"-0.014
Token loud
Token ID7812
Feature activation+1.99e-02

Pos logit contributions
pack+0.002
 carefully+0.002
-+0.001
 together+0.001
ator+0.001
Neg logit contributions
vious-0.002
awks-0.002
 Leave-0.002
joy-0.002
 Fro-0.002
Token though
Token ID996
Feature activation+7.68e-02

Pos logit contributions
vious+0.109
awks+0.108
 Leave+0.107
trained+0.101
irms+0.099
Neg logit contributions
 carefully-0.064
--0.062
 together-0.060
 next-0.056
 work-0.054
Token we
Token ID356
Feature activation+8.03e-02

Pos logit contributions
vious+0.161
 Leave+0.157
awks+0.156
 Fro+0.147
joy+0.144
Neg logit contributions
pack-0.098
--0.097
 carefully-0.096
ve-0.090
her-0.088
Token are
Token ID389
Feature activation+5.86e-02

Pos logit contributions
vious+0.140
awks+0.127
 Leave+0.125
trained+0.112
Leave+0.110
Neg logit contributions
--0.150
 carefully-0.141
 together-0.138
pack-0.138
lf-0.136
Token separated
Token ID11266
Feature activation-1.68e-02

Pos logit contributions
vious+0.121
awks+0.116
 Leave+0.110
irms+0.104
trained+0.103
Neg logit contributions
--0.084
 carefully-0.081
pack-0.079
"-0.076
lf-0.076
Token by
Token ID416
Feature activation-2.35e-02

Pos logit contributions
 carefully+0.014
-+0.014
pack+0.014
lf+0.013
ve+0.012
Neg logit contributions
vious-0.044
awks-0.041
 Leave-0.041
irms-0.039
trained-0.038
Token distance
Token ID5253
Feature activation+2.04e-01

Pos logit contributions
-+0.030
pack+0.030
 carefully+0.029
ve+0.029
lf+0.027
Neg logit contributions
vious-0.047
awks-0.047
 Leave-0.045
irms-0.042
 Fro-0.042
Token,
Token ID11
Feature activation+3.60e-02

Pos logit contributions
vious+0.516
awks+0.497
 Leave+0.488
irms+0.457
trained+0.457
Neg logit contributions
 carefully-0.184
pack-0.182
--0.178
ve-0.167
lf-0.163
Token we
Token ID356
Feature activation+6.18e-02

Pos logit contributions
vious+0.074
 Leave+0.074
awks+0.073
 Fro+0.069
Leave+0.068
Neg logit contributions
--0.048
pack-0.045
ve-0.043
 carefully-0.043
ator-0.041
Token can
Token ID460
Feature activation-2.63e-03

Pos logit contributions
vious+0.118
 Leave+0.105
awks+0.104
trained+0.090
rified+0.089
Neg logit contributions
--0.106
lf-0.100
pack-0.099
 carefully-0.097
ator-0.094
Token still
Token ID991
Feature activation-5.97e-02

Pos logit contributions
-+0.004
lf+0.004
ve+0.004
plug+0.003
 together+0.003
Neg logit contributions
 Leave-0.005
vious-0.005
awks-0.004
 realize-0.004
Leave-0.004
Token be
Token ID307
Feature activation+4.52e-02

Pos logit contributions
-+0.092
lf+0.088
ve+0.087
 carefully+0.085
her+0.083
Neg logit contributions
vious-0.109
 Leave-0.109
awks-0.105
Leave-0.095
 realize-0.092
Token as
Token ID355
Feature activation+4.26e-02

Pos logit contributions
awks+0.029
 Leave+0.029
vious+0.028
joy+0.028
 Fro+0.028
Neg logit contributions
pack-0.021
ve-0.019
her-0.019
--0.019
"-0.019
Token he
Token ID339
Feature activation+6.64e-02

Pos logit contributions
vious+0.088
awks+0.085
 Leave+0.081
trained+0.079
 Fro+0.078
Neg logit contributions
--0.057
pack-0.056
 carefully-0.053
"-0.052
ve-0.051
Token disappeared
Token ID12120
Feature activation+1.51e-02

Pos logit contributions
vious+0.116
awks+0.109
 Leave+0.106
trained+0.097
 Fro+0.097
Neg logit contributions
--0.110
pack-0.108
 carefully-0.107
 together-0.106
ator-0.106
Token into
Token ID656
Feature activation+5.01e-02

Pos logit contributions
vious+0.036
awks+0.036
joy+0.034
 Leave+0.033
irms+0.032
Neg logit contributions
pack-0.014
 carefully-0.014
"-0.014
--0.013
lf-0.013
Token the
Token ID262
Feature activation+5.84e-02

Pos logit contributions
vious+0.117
awks+0.113
 Leave+0.110
joy+0.106
irms+0.105
Neg logit contributions
--0.057
 carefully-0.056
pack-0.052
"-0.052
ve-0.051
Token distance
Token ID5253
Feature activation+2.04e-01

Pos logit contributions
vious+0.112
awks+0.107
 Leave+0.104
trained+0.098
irms+0.097
Neg logit contributions
 carefully-0.094
--0.094
 together-0.087
"-0.085
 next-0.084
Token.
Token ID13
Feature activation+2.25e-02

Pos logit contributions
vious+0.555
awks+0.535
 Leave+0.520
joy+0.503
trained+0.496
Neg logit contributions
pack-0.144
 carefully-0.140
--0.137
ve-0.126
lf-0.122
Token Gig
Token ID19525
Feature activation+9.35e-02

Pos logit contributions
vious+0.051
awks+0.049
 Leave+0.048
trained+0.045
irms+0.045
Neg logit contributions
 carefully-0.029
--0.029
 together-0.026
"-0.025
pack-0.025
Tokeni
Token ID72
Feature activation+4.25e-02

Pos logit contributions
vious+0.129
awks+0.122
 Leave+0.114
trained+0.107
irms+0.104
Neg logit contributions
 carefully-0.197
--0.196
 together-0.189
pack-0.188
"-0.184
Token knew
Token ID2993
Feature activation-4.09e-02

Pos logit contributions
vious+0.081
 Leave+0.077
awks+0.076
 Fro+0.070
Leave+0.069
Neg logit contributions
pack-0.062
--0.059
 together-0.057
ator-0.057
 carefully-0.056
Token that
Token ID326
Feature activation+6.38e-04

Pos logit contributions
pack+0.057
-+0.056
 carefully+0.054
lf+0.053
 together+0.052
Neg logit contributions
vious-0.082
 Leave-0.077
awks-0.076
 Fro-0.073
trained-0.071
Token it
Token ID340
Feature activation-2.42e-03

Pos logit contributions
vious+0.179
awks+0.169
 Leave+0.165
irms+0.157
trained+0.156
Neg logit contributions
--0.076
 carefully-0.073
pack-0.070
ve-0.067
"-0.066
Token flying
Token ID7348
Feature activation+9.81e-02

Pos logit contributions
-+0.002
 carefully+0.002
pack+0.002
"+0.002
ve+0.002
Neg logit contributions
vious-0.006
awks-0.006
 Leave-0.006
trained-0.005
Leave-0.005
Token off
Token ID572
Feature activation+8.33e-02

Pos logit contributions
vious+0.233
awks+0.221
 Leave+0.209
rified+0.202
 Fro+0.201
Neg logit contributions
 carefully-0.104
--0.103
pack-0.102
lf-0.099
ve-0.097
Token into
Token ID656
Feature activation+4.80e-02

Pos logit contributions
vious+0.196
awks+0.187
 Leave+0.180
joy+0.178
 Fro+0.172
Neg logit contributions
pack-0.087
--0.085
 carefully-0.082
ve-0.081
lf-0.076
Token the
Token ID262
Feature activation+5.10e-02

Pos logit contributions
vious+0.104
awks+0.102
joy+0.097
 Leave+0.095
irms+0.094
Neg logit contributions
--0.062
ve-0.058
pack-0.057
 carefully-0.056
"-0.053
Token distance
Token ID5253
Feature activation+2.00e-01

Pos logit contributions
vious+0.096
awks+0.092
 Leave+0.090
trained+0.084
irms+0.084
Neg logit contributions
 carefully-0.084
--0.084
 together-0.078
"-0.076
 next-0.075
Token.
Token ID13
Feature activation+2.55e-02

Pos logit contributions
vious+0.544
awks+0.523
 Leave+0.506
joy+0.482
trained+0.481
Neg logit contributions
 carefully-0.145
pack-0.144
--0.140
ve-0.131
lf-0.125
Token He
Token ID679
Feature activation+3.79e-02

Pos logit contributions
vious+0.048
awks+0.047
 Leave+0.044
trained+0.042
irms+0.042
Neg logit contributions
 carefully-0.042
--0.041
 together-0.039
"-0.038
pack-0.037
Token stayed
Token ID9658
Feature activation-2.59e-03

Pos logit contributions
vious+0.073
awks+0.070
 Leave+0.069
 Fro+0.064
Leave+0.063
Neg logit contributions
pack-0.056
--0.054
ator-0.054
 together-0.052
 carefully-0.051
Token there
Token ID612
Feature activation+1.87e-02

Pos logit contributions
-+0.003
 carefully+0.003
lf+0.003
pack+0.002
"+0.002
Neg logit contributions
awks-0.006
vious-0.006
joy-0.006
 Leave-0.006
irms-0.005
Token,
Token ID11
Feature activation+1.48e-02

Pos logit contributions
vious+0.040
joy+0.040
awks+0.040
 Leave+0.038
 Fro+0.037
Neg logit contributions
--0.021
lf-0.020
pack-0.020
ator-0.019
 carefully-0.019
Token best
Token ID1266
Feature activation+1.36e-01

Pos logit contributions
-+0.009
 carefully+0.009
 together+0.008
pack+0.008
"+0.008
Neg logit contributions
vious-0.016
awks-0.016
 Leave-0.015
irms-0.014
trained-0.014
Token to
Token ID284
Feature activation+5.56e-02

Pos logit contributions
vious+0.337
awks+0.324
 Leave+0.319
irms+0.300
trained+0.299
Neg logit contributions
 carefully-0.134
--0.130
pack-0.122
 together-0.115
"-0.115
Token keep
Token ID1394
Feature activation+7.67e-02

Pos logit contributions
vious+0.104
 Leave+0.103
awks+0.099
 Fro+0.094
Leave+0.092
Neg logit contributions
--0.086
ve-0.081
 carefully-0.075
pack-0.074
ator-0.073
Token a
Token ID257
Feature activation+1.90e-02

Pos logit contributions
 Leave+0.141
awks+0.138
vious+0.133
trained+0.131
irms+0.128
Neg logit contributions
lf-0.105
ator-0.102
pack-0.101
"-0.098
ork-0.097
Token normal
Token ID3487
Feature activation+1.51e-03

Pos logit contributions
awks+0.036
vious+0.034
 Leave+0.033
joy+0.033
irms+0.032
Neg logit contributions
lf-0.025
ator-0.025
pack-0.025
her-0.024
 together-0.024
Token distance
Token ID5253
Feature activation+1.99e-01

Pos logit contributions
vious+0.003
awks+0.003
 Leave+0.003
joy+0.003
irms+0.003
Neg logit contributions
 carefully-0.002
ve-0.002
pack-0.002
her-0.002
 together-0.002
Token.
Token ID13
Feature activation+1.83e-02

Pos logit contributions
vious+0.508
awks+0.486
 Leave+0.477
trained+0.449
irms+0.442
Neg logit contributions
pack-0.173
 carefully-0.167
--0.166
lf-0.164
ve-0.162
Tokenâ
Token ID22940
Feature activation-2.73e-02

Pos logit contributions
vious+0.041
awks+0.039
 Leave+0.039
trained+0.036
irms+0.036
Neg logit contributions
--0.024
 carefully-0.023
 together-0.021
pack-0.021
"-0.020
Token
Token ID26391
Feature activation+2.32e-02

Pos logit contributions
 carefully+0.059
-+0.059
 together+0.056
lf+0.054
"+0.054
Neg logit contributions
vious-0.039
awks-0.034
 Leave-0.033
trained-0.031
irms-0.029
Token
Token ID198
Feature activation+1.08e-02

Pos logit contributions
vious+0.052
 Leave+0.052
awks+0.051
trained+0.047
joy+0.047
Neg logit contributions
 carefully-0.029
--0.028
 together-0.025
pack-0.024
 work-0.024
Token
Token ID198
Feature activation+9.61e-03

Pos logit contributions
vious+0.024
 Leave+0.024
awks+0.023
 Fro+0.022
Leave+0.022
Neg logit contributions
--0.013
 carefully-0.012
 together-0.011
 work-0.011
pack-0.011
Token noticed
Token ID6810
Feature activation+6.61e-02

Pos logit contributions
ator+0.024
pack+0.024
lf+0.022
-+0.022
abies+0.022
Neg logit contributions
awks-0.031
 ze-0.027
 Leave-0.027
 Fro-0.027
vious-0.026
Token a
Token ID257
Feature activation+3.69e-02

Pos logit contributions
vious+0.166
awks+0.160
 Leave+0.154
irms+0.147
trained+0.147
Neg logit contributions
--0.064
 carefully-0.063
pack-0.059
"-0.057
 together-0.057
Token boat
Token ID8848
Feature activation+5.45e-02

Pos logit contributions
vious+0.076
awks+0.075
 Leave+0.071
joy+0.071
 Fro+0.069
Neg logit contributions
 carefully-0.046
 together-0.046
--0.046
pack-0.045
ator-0.043
Token in
Token ID287
Feature activation+2.16e-02

Pos logit contributions
vious+0.130
awks+0.123
 Leave+0.122
trained+0.115
Leave+0.115
Neg logit contributions
 carefully-0.056
--0.054
pack-0.050
"-0.049
ve-0.049
Token the
Token ID262
Feature activation+5.53e-02

Pos logit contributions
vious+0.042
awks+0.041
joy+0.039
Leave+0.039
 Leave+0.039
Neg logit contributions
--0.031
ve-0.031
 carefully-0.029
pack-0.028
iced-0.027
Token distance
Token ID5253
Feature activation+1.99e-01

Pos logit contributions
vious+0.105
awks+0.101
 Leave+0.097
trained+0.092
irms+0.091
Neg logit contributions
 carefully-0.090
--0.090
 together-0.084
"-0.082
pack-0.081
Token.
Token ID13
Feature activation+2.16e-02

Pos logit contributions
vious+0.547
awks+0.530
 Leave+0.511
trained+0.489
joy+0.488
Neg logit contributions
--0.135
 carefully-0.134
pack-0.131
ve-0.116
lf-0.115
Token As
Token ID1081
Feature activation+3.83e-02

Pos logit contributions
vious+0.040
awks+0.039
 Leave+0.037
trained+0.035
irms+0.035
Neg logit contributions
--0.036
 carefully-0.036
 together-0.033
pack-0.032
"-0.032
Token it
Token ID340
Feature activation-2.17e-02

Pos logit contributions
vious+0.074
awks+0.071
 Leave+0.069
 Fro+0.066
trained+0.066
Neg logit contributions
--0.057
pack-0.055
 carefully-0.055
ve-0.052
"-0.051
Token got
Token ID1392
Feature activation+8.96e-03

Pos logit contributions
pack+0.034
ator+0.032
lf+0.031
 carefully+0.031
 work+0.030
Neg logit contributions
vious-0.040
awks-0.034
Leave-0.034
 Leave-0.033
 Fro-0.033
Token closer
Token ID5699
Feature activation+2.95e-02

Pos logit contributions
vious+0.021
awks+0.020
 Leave+0.019
trained+0.018
irms+0.018
Neg logit contributions
--0.011
 carefully-0.011
pack-0.010
lf-0.010
"-0.010
Token a
Token ID257
Feature activation+3.11e-02

Pos logit contributions
vious+0.027
 Leave+0.026
awks+0.026
irms+0.022
 Fro+0.022
Neg logit contributions
--0.035
 carefully-0.033
pack-0.032
lf-0.031
ve-0.031
Token shiny
Token ID22441
Feature activation+7.98e-02

Pos logit contributions
vious+0.069
awks+0.067
 Leave+0.065
joy+0.061
irms+0.061
Neg logit contributions
 carefully-0.039
--0.038
 together-0.036
pack-0.035
"-0.034
Token object
Token ID2134
Feature activation-4.49e-04

Pos logit contributions
vious+0.176
awks+0.169
 Leave+0.163
irms+0.154
trained+0.153
Neg logit contributions
 carefully-0.103
--0.099
 together-0.096
pack-0.094
"-0.092
Token in
Token ID287
Feature activation+1.94e-02

Pos logit contributions
 carefully+0.000
-+0.000
pack+0.000
"+0.000
 together+0.000
Neg logit contributions
vious-0.001
 Leave-0.001
awks-0.001
trained-0.001
Leave-0.001
Token the
Token ID262
Feature activation+5.31e-02

Pos logit contributions
vious+0.039
awks+0.037
 Leave+0.036
joy+0.034
Leave+0.034
Neg logit contributions
--0.027
ve-0.026
 carefully-0.025
pack-0.025
 together-0.024
Token distance
Token ID5253
Feature activation+1.98e-01

Pos logit contributions
vious+0.114
awks+0.110
 Leave+0.107
trained+0.102
irms+0.101
Neg logit contributions
--0.074
 carefully-0.074
 together-0.067
"-0.066
 next-0.065
Token.
Token ID13
Feature activation+1.84e-02

Pos logit contributions
vious+0.554
awks+0.535
 Leave+0.519
trained+0.497
irms+0.492
Neg logit contributions
--0.129
 carefully-0.128
pack-0.120
ve-0.108
lf-0.104
Token He
Token ID679
Feature activation+3.26e-02

Pos logit contributions
vious+0.035
awks+0.034
 Leave+0.032
trained+0.030
irms+0.030
Neg logit contributions
--0.030
 carefully-0.030
 together-0.028
"-0.027
pack-0.027
Token rolled
Token ID11686
Feature activation+4.95e-02

Pos logit contributions
vious+0.068
awks+0.065
 Leave+0.065
trained+0.059
 Fro+0.058
Neg logit contributions
--0.044
 together-0.042
pack-0.041
 carefully-0.041
ator-0.041
Token even
Token ID772
Feature activation+4.51e-02

Pos logit contributions
vious+0.107
awks+0.105
 Fro+0.102
joy+0.098
 Leave+0.098
Neg logit contributions
pack-0.056
lf-0.055
ator-0.054
her-0.054
--0.053
Token faster
Token ID5443
Feature activation+8.31e-02

Pos logit contributions
 Leave+0.095
vious+0.093
awks+0.092
trained+0.086
irk+0.085
Neg logit contributions
 carefully-0.069
--0.065
 together-0.062
 next-0.059
 work-0.055
Token said
Token ID531
Feature activation+2.93e-02

Pos logit contributions
 carefully+0.013
pack+0.013
ator+0.012
lf+0.012
her+0.012
Neg logit contributions
vious-0.032
awks-0.031
 Leave-0.030
 Fro-0.029
joy-0.029
Token something
Token ID1223
Feature activation+9.68e-02

Pos logit contributions
vious+0.071
 Leave+0.069
awks+0.069
 Fro+0.066
joy+0.065
Neg logit contributions
 carefully-0.028
pack-0.028
--0.027
lf-0.025
ator-0.025
Token she
Token ID673
Feature activation+5.93e-02

Pos logit contributions
vious+0.224
awks+0.211
 Leave+0.204
 Fro+0.199
trained+0.193
Neg logit contributions
pack-0.107
--0.103
 carefully-0.103
ator-0.100
 together-0.098
Token'll
Token ID1183
Feature activation+1.26e-02

Pos logit contributions
vious+0.111
awks+0.102
 Leave+0.101
trained+0.093
 Fro+0.091
Neg logit contributions
--0.095
 carefully-0.091
pack-0.090
 together-0.090
ator-0.089
Token never
Token ID1239
Feature activation+4.16e-02

Pos logit contributions
vious+0.024
 Leave+0.023
awks+0.022
Leave+0.021
 realize+0.020
Neg logit contributions
--0.018
 together-0.018
pack-0.017
ator-0.017
lf-0.017
Token forget
Token ID6044
Feature activation+1.97e-01

Pos logit contributions
vious+0.088
awks+0.086
 Leave+0.083
irms+0.080
trained+0.079
Neg logit contributions
 carefully-0.059
--0.058
 together-0.053
"-0.052
 next-0.052
Token:
Token ID25
Feature activation-1.03e-02

Pos logit contributions
vious+0.466
 Leave+0.449
awks+0.438
 Fro+0.417
joy+0.415
Neg logit contributions
pack-0.187
--0.185
 carefully-0.180
ve-0.180
ator-0.160
Token
Token ID198
Feature activation+2.84e-02

Pos logit contributions
-+0.013
 carefully+0.013
pack+0.012
 together+0.012
"+0.012
Neg logit contributions
vious-0.023
awks-0.022
 Leave-0.022
trained-0.020
irms-0.020
Token
Token ID198
Feature activation+2.43e-02

Pos logit contributions
vious+0.064
 Leave+0.062
awks+0.062
 Fro+0.058
trained+0.057
Neg logit contributions
--0.034
 carefully-0.033
 together-0.031
 work-0.030
pack-0.030
Token"
Token ID1
Feature activation-3.73e-02

Pos logit contributions
vious+0.049
 Leave+0.048
awks+0.047
Leave+0.044
 Fro+0.043
Neg logit contributions
--0.035
 carefully-0.034
 work-0.032
 together-0.032
pack-0.031
TokenLittle
Token ID22253
Feature activation-6.38e-02

Pos logit contributions
-+0.074
 carefully+0.071
 work+0.068
lf+0.067
 together+0.067
Neg logit contributions
vious-0.060
awks-0.054
 Leave-0.052
Leave-0.051
joy-0.049
Token a
Token ID257
Feature activation+2.24e-02

Pos logit contributions
vious+0.098
awks+0.093
 Leave+0.089
 Fro+0.086
joy+0.086
Neg logit contributions
pack-0.046
--0.045
 carefully-0.043
ve-0.042
"-0.042
Token field
Token ID2214
Feature activation+7.94e-03

Pos logit contributions
vious+0.047
awks+0.047
joy+0.044
 Leave+0.044
 Fro+0.042
Neg logit contributions
 carefully-0.028
 together-0.027
pack-0.027
--0.027
ator-0.025
Token off
Token ID572
Feature activation+8.31e-02

Pos logit contributions
vious+0.019
awks+0.018
 Leave+0.017
trained+0.016
Leave+0.016
Neg logit contributions
 carefully-0.009
--0.009
pack-0.008
 together-0.008
ve-0.008
Token in
Token ID287
Feature activation+1.52e-02

Pos logit contributions
vious+0.193
awks+0.189
 Leave+0.180
joy+0.176
irms+0.172
Neg logit contributions
--0.089
pack-0.088
 carefully-0.086
ve-0.081
ator-0.078
Token the
Token ID262
Feature activation+4.48e-02

Pos logit contributions
vious+0.030
awks+0.030
joy+0.030
Leave+0.028
 Leave+0.028
Neg logit contributions
--0.021
ve-0.021
pack-0.019
 carefully-0.019
abies-0.018
Token distance
Token ID5253
Feature activation+1.96e-01

Pos logit contributions
vious+0.076
awks+0.072
 Leave+0.069
trained+0.064
irms+0.064
Neg logit contributions
--0.083
 carefully-0.083
 together-0.078
"-0.076
pack-0.076
Token.
Token ID13
Feature activation+1.90e-02

Pos logit contributions
vious+0.535
awks+0.515
 Leave+0.500
joy+0.477
trained+0.475
Neg logit contributions
 carefully-0.139
--0.139
pack-0.136
ve-0.120
lf-0.117
Token She
Token ID1375
Feature activation+1.16e-02

Pos logit contributions
vious+0.035
awks+0.034
 Leave+0.033
irms+0.030
trained+0.030
Neg logit contributions
--0.032
 carefully-0.031
 together-0.029
pack-0.029
"-0.028
Token decided
Token ID3066
Feature activation+2.25e-02

Pos logit contributions
vious+0.020
awks+0.019
 Leave+0.018
 Fro+0.017
 regretted+0.017
Neg logit contributions
ator-0.019
pack-0.019
--0.018
 together-0.018
 work-0.017
Token to
Token ID284
Feature activation+2.96e-02

Pos logit contributions
vious+0.052
 Leave+0.050
awks+0.050
joy+0.047
trained+0.047
Neg logit contributions
--0.024
ator-0.023
pack-0.023
 carefully-0.022
lf-0.022
Token explore
Token ID7301
Feature activation+2.28e-02

Pos logit contributions
vious+0.052
 Leave+0.052
awks+0.049
 Fro+0.045
Leave+0.045
Neg logit contributions
--0.050
ve-0.048
 together-0.046
her-0.044
lf-0.043
Tokenaroo
Token ID38049
Feature activation-1.32e-02

Pos logit contributions
 carefully+0.018
-+0.017
pack+0.017
 work+0.017
 together+0.017
Neg logit contributions
vious-0.016
 Leave-0.014
awks-0.014
trained-0.013
rified-0.013
Token watched
Token ID7342
Feature activation-3.52e-03

Pos logit contributions
-+0.017
 carefully+0.016
pack+0.015
 together+0.015
"+0.015
Neg logit contributions
vious-0.029
awks-0.028
 Leave-0.027
trained-0.026
 Fro-0.025
Token them
Token ID606
Feature activation+1.94e-02

Pos logit contributions
pack+0.005
lf+0.004
ator+0.004
her+0.004
-+0.004
Neg logit contributions
vious-0.007
awks-0.007
 Leave-0.007
 Fro-0.007
joy-0.007
Token from
Token ID422
Feature activation+5.41e-02

Pos logit contributions
awks+0.042
vious+0.041
 Leave+0.040
 Fro+0.039
irms+0.036
Neg logit contributions
lf-0.023
ator-0.022
--0.022
ve-0.021
her-0.021
Token a
Token ID257
Feature activation+5.47e-03

Pos logit contributions
awks+0.119
vious+0.116
irms+0.111
 Leave+0.109
joy+0.109
Neg logit contributions
--0.068
 carefully-0.064
"-0.063
ve-0.063
pack-0.062
Token distance
Token ID5253
Feature activation+1.95e-01

Pos logit contributions
awks+0.010
vious+0.010
joy+0.009
 Leave+0.009
 Fro+0.009
Neg logit contributions
--0.008
ator-0.008
pack-0.008
 together-0.008
 carefully-0.008
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
vious+0.509
awks+0.498
 Leave+0.479
irms+0.455
joy+0.454
Neg logit contributions
--0.156
pack-0.154
 carefully-0.151
lf-0.144
ve-0.143
Token She
Token ID1375
Feature activation+1.25e-02

Pos logit contributions
vious+0.021
awks+0.020
 Leave+0.019
trained+0.018
irms+0.018
Neg logit contributions
 carefully-0.018
--0.017
 together-0.017
"-0.016
 next-0.016
Token was
Token ID373
Feature activation+1.09e-02

Pos logit contributions
vious+0.022
awks+0.020
 Leave+0.020
 Fro+0.019
 ze+0.019
Neg logit contributions
pack-0.020
ator-0.020
--0.019
 work-0.018
 together-0.018
Token not
Token ID407
Feature activation+3.53e-03

Pos logit contributions
vious+0.020
awks+0.019
 Leave+0.018
Leave+0.017
joy+0.017
Neg logit contributions
--0.018
 carefully-0.018
lf-0.017
pack-0.017
"-0.017
Token angry
Token ID7954
Feature activation+5.30e-02

Pos logit contributions
vious+0.006
awks+0.006
 Leave+0.006
joy+0.005
trained+0.005
Neg logit contributions
 carefully-0.006
--0.006
pack-0.006
lf-0.006
ator-0.005
Token They
Token ID1119
Feature activation+6.18e-03

Pos logit contributions
vious+0.002
awks+0.002
 Leave+0.002
trained+0.002
irms+0.002
Neg logit contributions
 carefully-0.001
--0.001
 together-0.001
"-0.001
pack-0.001
Token hurt
Token ID5938
Feature activation+3.18e-02

Pos logit contributions
vious+0.012
awks+0.012
 Leave+0.011
joy+0.011
trained+0.011
Neg logit contributions
 carefully-0.010
--0.009
 work-0.009
 together-0.009
"-0.008
Token their
Token ID511
Feature activation-1.55e-02

Pos logit contributions
vious+0.075
awks+0.075
 Leave+0.073
irms+0.070
 Fro+0.068
Neg logit contributions
 carefully-0.032
--0.032
ve-0.031
pack-0.031
lf-0.030
Token knees
Token ID14475
Feature activation+7.37e-02

Pos logit contributions
 carefully+0.019
-+0.019
"+0.018
 together+0.018
pack+0.018
Neg logit contributions
awks-0.034
vious-0.034
 Leave-0.032
irms-0.032
 Fro-0.030
Token and
Token ID290
Feature activation+4.14e-03

Pos logit contributions
vious+0.178
 Leave+0.170
awks+0.167
Leave+0.158
trained+0.154
Neg logit contributions
pack-0.072
ator-0.071
lf-0.070
--0.068
 carefully-0.067
Token elbows
Token ID37076
Feature activation+1.93e-01

Pos logit contributions
vious+0.005
awks+0.005
 Leave+0.005
trained+0.004
joy+0.004
Neg logit contributions
 carefully-0.009
--0.009
 together-0.009
 work-0.009
 next-0.009
Token.
Token ID13
Feature activation-7.52e-03

Pos logit contributions
vious+0.439
 Leave+0.432
awks+0.416
Leave+0.396
irms+0.395
Neg logit contributions
ator-0.204
lf-0.194
pack-0.192
 carefully-0.175
--0.168
Token They
Token ID1119
Feature activation+1.35e-03

Pos logit contributions
 carefully+0.009
-+0.008
 together+0.008
"+0.008
 next+0.007
Neg logit contributions
vious-0.018
awks-0.017
 Leave-0.017
trained-0.016
irms-0.016
Token cry
Token ID3960
Feature activation+5.18e-02

Pos logit contributions
vious+0.003
joy+0.003
awks+0.003
trained+0.003
 Leave+0.003
Neg logit contributions
 carefully-0.002
 work-0.002
--0.002
"-0.002
 together-0.002
Token and
Token ID290
Feature activation-2.08e-04

Pos logit contributions
vious+0.134
awks+0.132
 Leave+0.130
irms+0.124
trained+0.119
Neg logit contributions
pack-0.042
lf-0.042
 carefully-0.041
ator-0.040
--0.038
Token hug
Token ID16225
Feature activation+5.41e-02

Pos logit contributions
 carefully+0.000
 work+0.000
-+0.000
 together+0.000
 next+0.000
Neg logit contributions
vious-0.000
awks-0.000
joy-0.000
 Leave-0.000
trained-0.000
Token distance
Token ID5253
Feature activation+2.26e-01

Pos logit contributions
vious+0.113
awks+0.110
 Leave+0.105
irms+0.099
trained+0.098
Neg logit contributions
--0.078
 carefully-0.078
 together-0.073
pack-0.072
"-0.070
Token.
Token ID13
Feature activation+2.96e-02

Pos logit contributions
vious+0.611
awks+0.591
 Leave+0.574
trained+0.545
irms+0.543
Neg logit contributions
--0.171
 carefully-0.169
pack-0.157
"-0.141
lf-0.141
Token Everyone
Token ID11075
Feature activation+1.64e-03

Pos logit contributions
vious+0.057
awks+0.055
 Leave+0.054
irms+0.050
trained+0.049
Neg logit contributions
--0.047
 carefully-0.046
 together-0.043
pack-0.042
 work-0.041
Token was
Token ID373
Feature activation+2.41e-02

Pos logit contributions
vious+0.004
awks+0.004
 Leave+0.003
trained+0.003
Leave+0.003
Neg logit contributions
--0.002
 carefully-0.002
pack-0.002
 together-0.002
"-0.002
Token over
Token ID625
Feature activation+3.18e-02

Pos logit contributions
vious+0.043
awks+0.040
 Leave+0.040
joy+0.037
Leave+0.036
Neg logit contributions
--0.039
 carefully-0.038
lf-0.038
pack-0.038
"-0.036
Tokenjoy
Token ID2633
Feature activation+1.92e-01

Pos logit contributions
vious+0.069
awks+0.067
joy+0.066
 Leave+0.064
irms+0.061
Neg logit contributions
 carefully-0.040
--0.040
pack-0.039
lf-0.038
her-0.036
Tokened
Token ID276
Feature activation-1.87e-02

Pos logit contributions
vious+0.344
 Leave+0.312
awks+0.310
joy+0.287
trained+0.282
Neg logit contributions
 carefully-0.343
lf-0.321
pack-0.321
--0.314
ator-0.309
Token that
Token ID326
Feature activation+1.06e-02

Pos logit contributions
pack+0.015
-+0.015
 carefully+0.015
lf+0.015
ator+0.013
Neg logit contributions
vious-0.048
awks-0.047
 Leave-0.045
joy-0.044
trained-0.043
Token the
Token ID262
Feature activation+4.55e-02

Pos logit contributions
vious+0.022
awks+0.020
 Leave+0.020
joy+0.019
 Fro+0.019
Neg logit contributions
--0.015
pack-0.014
 carefully-0.014
ve-0.013
her-0.013
Token cow
Token ID9875
Feature activation-3.98e-03

Pos logit contributions
vious+0.091
awks+0.087
 Leave+0.082
joy+0.079
irms+0.078
Neg logit contributions
--0.066
 carefully-0.065
pack-0.063
 together-0.063
"-0.061
Token had
Token ID550
Feature activation+3.55e-02

Pos logit contributions
pack+0.005
-+0.004
 carefully+0.004
lf+0.004
ator+0.004
Neg logit contributions
vious-0.009
 Leave-0.009
awks-0.008
trained-0.008
rified-0.008
Token to
Token ID284
Feature activation+6.81e-02

Pos logit contributions
vious+0.169
awks+0.163
 Leave+0.161
trained+0.153
irms+0.151
Neg logit contributions
 carefully-0.063
--0.063
pack-0.055
 together-0.055
"-0.054
Token see
Token ID766
Feature activation+9.78e-02

Pos logit contributions
vious+0.133
 Leave+0.131
awks+0.128
 Fro+0.121
Leave+0.117
Neg logit contributions
--0.104
ve-0.098
 carefully-0.097
 together-0.095
ator-0.088
Token who
Token ID508
Feature activation+5.01e-02

Pos logit contributions
vious+0.211
awks+0.204
 Leave+0.199
irms+0.187
trained+0.186
Neg logit contributions
--0.125
 carefully-0.122
pack-0.118
ve-0.114
"-0.112
Token was
Token ID373
Feature activation+3.28e-02

Pos logit contributions
vious+0.118
 Leave+0.111
awks+0.111
trained+0.105
irms+0.102
Neg logit contributions
--0.057
 carefully-0.057
pack-0.052
 together-0.051
ator-0.049
Token the
Token ID262
Feature activation+5.51e-02

Pos logit contributions
vious+0.073
awks+0.070
 Leave+0.068
trained+0.065
 Fro+0.064
Neg logit contributions
--0.041
 carefully-0.040
pack-0.037
"-0.036
 together-0.036
Token strongest
Token ID12841
Feature activation+1.92e-01

Pos logit contributions
vious+0.111
awks+0.107
 Leave+0.102
trained+0.097
irms+0.097
Neg logit contributions
--0.083
 carefully-0.082
 together-0.077
"-0.075
pack-0.075
Token.
Token ID13
Feature activation+1.79e-02

Pos logit contributions
vious+0.491
awks+0.465
 Leave+0.459
trained+0.432
irms+0.431
Neg logit contributions
 carefully-0.186
--0.176
pack-0.165
 together-0.163
"-0.156
Token They
Token ID1119
Feature activation+1.07e-02

Pos logit contributions
vious+0.035
awks+0.034
 Leave+0.032
trained+0.031
irms+0.031
Neg logit contributions
 carefully-0.028
--0.028
 together-0.026
"-0.026
pack-0.025
Token decided
Token ID3066
Feature activation+1.97e-02

Pos logit contributions
vious+0.019
awks+0.018
 Leave+0.018
trained+0.016
 Fro+0.016
Neg logit contributions
--0.019
ator-0.017
pack-0.017
 next-0.017
 together-0.016
Token to
Token ID284
Feature activation+2.82e-02

Pos logit contributions
vious+0.045
 Leave+0.044
awks+0.043
trained+0.041
joy+0.040
Neg logit contributions
--0.022
ator-0.021
lf-0.021
pack-0.020
 carefully-0.020
Token play
Token ID711
Feature activation+8.73e-03

Pos logit contributions
vious+0.051
 Leave+0.050
awks+0.047
 Fro+0.043
 realize+0.043
Neg logit contributions
--0.048
ve-0.045
 together-0.042
 carefully-0.040
lf-0.040
Token.
Token ID13
Feature activation+2.66e-02

Pos logit contributions
vious+0.088
awks+0.083
 Leave+0.083
trained+0.078
 Fro+0.077
Neg logit contributions
--0.053
 carefully-0.052
pack-0.049
 together-0.048
ator-0.047
Token Tim
Token ID5045
Feature activation+1.40e-02

Pos logit contributions
vious+0.053
awks+0.051
 Leave+0.049
trained+0.046
irms+0.046
Neg logit contributions
 carefully-0.041
--0.041
 together-0.038
"-0.037
pack-0.036
Tokenmy
Token ID1820
Feature activation+4.43e-02

Pos logit contributions
vious+0.030
awks+0.030
 Leave+0.028
trained+0.027
irms+0.027
Neg logit contributions
 carefully-0.020
--0.020
"-0.018
 together-0.018
 work-0.017
Token was
Token ID373
Feature activation+2.14e-02

Pos logit contributions
vious+0.104
awks+0.101
 Leave+0.098
trained+0.093
irms+0.093
Neg logit contributions
--0.052
 carefully-0.052
 together-0.047
"-0.045
pack-0.045
Token over
Token ID625
Feature activation+2.77e-02

Pos logit contributions
vious+0.048
awks+0.046
 Leave+0.045
trained+0.043
irms+0.042
Neg logit contributions
 carefully-0.028
--0.028
 together-0.026
"-0.025
pack-0.024
Tokenjoy
Token ID2633
Feature activation+1.92e-01

Pos logit contributions
vious+0.048
awks+0.046
 Leave+0.045
trained+0.042
irms+0.041
Neg logit contributions
--0.050
 carefully-0.050
 together-0.047
"-0.046
pack-0.045
Tokened
Token ID276
Feature activation-1.81e-02

Pos logit contributions
vious+0.393
awks+0.377
 Leave+0.366
trained+0.345
irms+0.343
Neg logit contributions
 carefully-0.287
--0.284
 together-0.264
"-0.257
pack-0.257
Token and
Token ID290
Feature activation+2.50e-02

Pos logit contributions
 carefully+0.014
-+0.014
pack+0.013
 together+0.012
lf+0.012
Neg logit contributions
vious-0.048
awks-0.047
 Leave-0.045
trained-0.043
irms-0.043
Token said
Token ID531
Feature activation-3.82e-03

Pos logit contributions
vious+0.055
awks+0.053
 Leave+0.051
irms+0.049
trained+0.048
Neg logit contributions
 carefully-0.036
--0.035
 together-0.032
"-0.031
 next-0.030
Token,
Token ID11
Feature activation+1.90e-02

Pos logit contributions
pack+0.004
 carefully+0.004
-+0.004
her+0.004
ator+0.004
Neg logit contributions
vious-0.009
 Leave-0.008
awks-0.008
 Fro-0.008
joy-0.008
Token "
Token ID366
Feature activation-2.80e-02

Pos logit contributions
 Leave+0.036
vious+0.035
awks+0.035
joy+0.034
 Fro+0.034
Neg logit contributions
--0.027
pack-0.027
ator-0.026
lf-0.025
 carefully-0.024
Token a
Token ID257
Feature activation+3.11e-02

Pos logit contributions
vious+0.066
awks+0.063
 Leave+0.061
joy+0.059
irms+0.059
Neg logit contributions
 carefully-0.019
--0.019
pack-0.018
lf-0.017
"-0.017
Token big
Token ID1263
Feature activation+6.11e-02

Pos logit contributions
vious+0.069
awks+0.067
 Leave+0.065
joy+0.061
irms+0.061
Neg logit contributions
 carefully-0.039
--0.038
 together-0.036
pack-0.035
"-0.034
Token basket
Token ID7988
Feature activation+4.35e-02

Pos logit contributions
vious+0.136
awks+0.131
 Leave+0.128
trained+0.122
irms+0.121
Neg logit contributions
--0.080
 carefully-0.079
 together-0.072
"-0.070
pack-0.069
Token in
Token ID287
Feature activation+1.25e-02

Pos logit contributions
vious+0.108
 Leave+0.099
awks+0.098
trained+0.092
 Fro+0.091
Neg logit contributions
 carefully-0.041
--0.040
lf-0.037
pack-0.036
ve-0.036
Token the
Token ID262
Feature activation+4.60e-02

Pos logit contributions
vious+0.025
 Leave+0.024
awks+0.024
joy+0.023
Leave+0.022
Neg logit contributions
--0.017
ve-0.017
 carefully-0.016
pack-0.015
abies-0.014
Token distance
Token ID5253
Feature activation+1.92e-01

Pos logit contributions
vious+0.101
awks+0.097
 Leave+0.094
trained+0.089
irms+0.089
Neg logit contributions
--0.062
 carefully-0.062
 together-0.057
"-0.055
pack-0.054
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
vious+0.534
awks+0.515
 Leave+0.499
trained+0.477
irms+0.475
Neg logit contributions
--0.127
 carefully-0.125
pack-0.118
ve-0.108
lf-0.102
Token He
Token ID679
Feature activation+2.80e-02

Pos logit contributions
vious+0.026
awks+0.025
 Leave+0.024
trained+0.023
irms+0.023
Neg logit contributions
 carefully-0.022
--0.021
 together-0.020
"-0.020
pack-0.019
Token slowly
Token ID6364
Feature activation-3.87e-02

Pos logit contributions
vious+0.055
awks+0.052
 Leave+0.052
 Fro+0.047
irms+0.047
Neg logit contributions
--0.041
 together-0.039
pack-0.039
ator-0.039
 carefully-0.038
Token crawled
Token ID45668
Feature activation+5.99e-02

Pos logit contributions
-+0.066
 carefully+0.065
 together+0.063
pack+0.063
ator+0.062
Neg logit contributions
vious-0.068
awks-0.065
 Leave-0.062
trained-0.057
irms-0.057
Token up
Token ID510
Feature activation+1.39e-02

Pos logit contributions
vious+0.129
awks+0.124
 Leave+0.117
 Fro+0.116
joy+0.114
Neg logit contributions
lf-0.076
--0.075
pack-0.073
ator-0.069
 carefully-0.068
Token of
Token ID286
Feature activation+1.67e-02

Pos logit contributions
vious+0.121
awks+0.115
 Leave+0.111
joy+0.104
irms+0.103
Neg logit contributions
 carefully-0.091
pack-0.087
--0.086
ve-0.085
 together-0.083
Token the
Token ID262
Feature activation+5.58e-02

Pos logit contributions
vious+0.034
awks+0.033
 Leave+0.032
Leave+0.030
joy+0.030
Neg logit contributions
--0.023
pack-0.023
ve-0.022
 carefully-0.022
"-0.021
Token ocean
Token ID9151
Feature activation+5.17e-02

Pos logit contributions
vious+0.110
awks+0.107
 Leave+0.100
joy+0.096
irms+0.095
Neg logit contributions
--0.080
 carefully-0.080
pack-0.078
 together-0.075
"-0.074
Token in
Token ID287
Feature activation+1.08e-02

Pos logit contributions
vious+0.132
awks+0.125
 Leave+0.121
joy+0.116
trained+0.115
Neg logit contributions
pack-0.040
 carefully-0.040
--0.039
ve-0.037
ator-0.035
Token the
Token ID262
Feature activation+4.30e-02

Pos logit contributions
vious+0.024
awks+0.023
 Leave+0.023
joy+0.022
Leave+0.021
Neg logit contributions
--0.014
 carefully-0.013
ve-0.013
pack-0.012
lf-0.011
Token distance
Token ID5253
Feature activation+1.91e-01

Pos logit contributions
vious+0.077
awks+0.073
 Leave+0.071
trained+0.066
irms+0.066
Neg logit contributions
--0.075
 carefully-0.075
 together-0.070
"-0.069
pack-0.068
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
vious+0.520
awks+0.501
 Leave+0.487
joy+0.467
trained+0.461
Neg logit contributions
pack-0.135
--0.133
 carefully-0.131
lf-0.119
ve-0.117
Token
Token ID198
Feature activation+1.33e-02

Pos logit contributions
vious+0.024
awks+0.023
 Leave+0.022
irms+0.021
trained+0.021
Neg logit contributions
--0.018
 carefully-0.018
 together-0.017
pack-0.016
"-0.016
Token
Token ID198
Feature activation+1.12e-02

Pos logit contributions
 Leave+0.028
vious+0.028
awks+0.027
 Fro+0.026
joy+0.025
Neg logit contributions
--0.017
 carefully-0.016
 work-0.016
pack-0.015
ator-0.015
TokenJames
Token ID14731
Feature activation-7.63e-02

Pos logit contributions
vious+0.022
 Leave+0.022
awks+0.021
Leave+0.021
 Fro+0.020
Neg logit contributions
--0.017
 work-0.016
 carefully-0.015
ator-0.014
 together-0.014
Token decided
Token ID3066
Feature activation+1.55e-02

Pos logit contributions
-+0.087
 carefully+0.084
pack+0.083
 together+0.082
ator+0.081
Neg logit contributions
vious-0.174
 Leave-0.163
awks-0.161
trained-0.149
rified-0.147
Token some
Token ID617
Feature activation+4.89e-02

Pos logit contributions
vious+0.111
awks+0.106
 Leave+0.105
trained+0.098
irms+0.098
Neg logit contributions
--0.054
 carefully-0.054
pack-0.051
"-0.050
ve-0.050
Token strange
Token ID6283
Feature activation+2.36e-02

Pos logit contributions
vious+0.104
awks+0.100
 Leave+0.096
irms+0.091
trained+0.090
Neg logit contributions
 carefully-0.068
--0.067
pack-0.064
 together-0.064
"-0.064
Token noises
Token ID26782
Feature activation+3.62e-02

Pos logit contributions
vious+0.047
awks+0.045
 Leave+0.043
irms+0.040
Leave+0.039
Neg logit contributions
 carefully-0.036
pack-0.035
"-0.034
--0.033
 together-0.033
Token in
Token ID287
Feature activation-7.35e-03

Pos logit contributions
vious+0.097
awks+0.091
 Leave+0.091
Leave+0.086
trained+0.086
Neg logit contributions
 carefully-0.029
--0.029
pack-0.027
"-0.025
ator-0.025
Token the
Token ID262
Feature activation+3.76e-02

Pos logit contributions
 carefully+0.008
-+0.008
pack+0.007
 together+0.007
"+0.007
Neg logit contributions
vious-0.018
awks-0.017
 Leave-0.017
irms-0.016
Leave-0.016
Token distance
Token ID5253
Feature activation+1.89e-01

Pos logit contributions
vious+0.067
awks+0.065
 Leave+0.063
trained+0.059
irms+0.059
Neg logit contributions
 carefully-0.066
--0.066
 next-0.061
 together-0.061
"-0.060
Token.
Token ID13
Feature activation+1.60e-02

Pos logit contributions
vious+0.519
awks+0.503
 Leave+0.488
trained+0.467
joy+0.464
Neg logit contributions
pack-0.135
 carefully-0.130
--0.122
ve-0.119
lf-0.112
Token
Token ID220
Feature activation+8.19e-02

Pos logit contributions
vious+0.029
 Leave+0.029
awks+0.028
 Fro+0.026
joy+0.026
Neg logit contributions
--0.026
 carefully-0.026
lf-0.024
pack-0.024
 together-0.023
Token
Token ID198
Feature activation+2.06e-02

Pos logit contributions
 Leave+0.174
vious+0.168
awks+0.166
 Fro+0.157
irms+0.152
Neg logit contributions
--0.111
 carefully-0.105
lf-0.104
 work-0.094
pack-0.092
Token
Token ID198
Feature activation+2.06e-02

Pos logit contributions
 Leave+0.041
Leave+0.038
vious+0.038
 Fro+0.038
awks+0.037
Neg logit contributions
--0.030
 work-0.027
 carefully-0.026
ator-0.025
 creation-0.025
TokenThe
Token ID464
Feature activation-6.54e-03

Pos logit contributions
Leave+0.036
 Leave+0.035
vious+0.033
Got+0.029
 Fro+0.028
Neg logit contributions
--0.034
 work-0.034
 creation-0.031
ator-0.031
 carefully-0.030
Token the
Token ID262
Feature activation+3.49e-02

Pos logit contributions
vious+0.141
awks+0.137
 Leave+0.134
trained+0.128
irms+0.128
Neg logit contributions
 carefully-0.051
--0.050
 together-0.045
 next-0.041
"-0.041
Token blue
Token ID4171
Feature activation+3.52e-02

Pos logit contributions
vious+0.069
awks+0.066
 Leave+0.065
trained+0.061
irms+0.061
Neg logit contributions
--0.055
 carefully-0.055
 together-0.050
 next-0.050
"-0.049
Token sea
Token ID5417
Feature activation+2.26e-02

Pos logit contributions
vious+0.063
awks+0.061
 Leave+0.059
trained+0.056
irms+0.056
Neg logit contributions
--0.062
 carefully-0.061
 together-0.057
"-0.056
 next-0.055
Token in
Token ID287
Feature activation-7.00e-03

Pos logit contributions
vious+0.050
awks+0.048
 Leave+0.046
trained+0.044
joy+0.044
Neg logit contributions
 carefully-0.027
--0.026
pack-0.026
ve-0.025
"-0.024
Token the
Token ID262
Feature activation+3.57e-02

Pos logit contributions
-+0.008
 carefully+0.008
ve+0.007
pack+0.007
 together+0.007
Neg logit contributions
vious-0.016
awks-0.016
 Leave-0.015
joy-0.015
irms-0.015
Token distance
Token ID5253
Feature activation+1.89e-01

Pos logit contributions
vious+0.072
awks+0.071
 Leave+0.070
trained+0.068
irms+0.065
Neg logit contributions
 carefully-0.055
--0.053
 next-0.052
 together-0.049
"-0.048
Token.
Token ID13
Feature activation+8.20e-03

Pos logit contributions
vious+0.519
awks+0.500
 Leave+0.482
joy+0.465
trained+0.463
Neg logit contributions
pack-0.132
--0.127
 carefully-0.127
ve-0.118
lf-0.110
Token Finally
Token ID9461
Feature activation-1.17e-02

Pos logit contributions
vious+0.019
awks+0.019
trained+0.018
 Leave+0.018
irms+0.017
Neg logit contributions
 carefully-0.010
--0.010
 together-0.009
"-0.009
 next-0.009
Token,
Token ID11
Feature activation+2.36e-02

Pos logit contributions
-+0.010
 carefully+0.010
pack+0.009
 together+0.009
"+0.008
Neg logit contributions
vious-0.030
awks-0.029
 Leave-0.029
irms-0.027
trained-0.027
Token they
Token ID484
Feature activation+2.69e-02

Pos logit contributions
vious+0.049
awks+0.046
 Leave+0.045
joy+0.043
trained+0.042
Neg logit contributions
--0.033
 carefully-0.032
pack-0.030
"-0.030
 together-0.030
Token reached
Token ID4251
Feature activation+7.99e-03

Pos logit contributions
vious+0.053
awks+0.050
 Leave+0.049
trained+0.046
irms+0.046
Neg logit contributions
--0.043
 carefully-0.042
 together-0.039
"-0.038
pack-0.038
Token,
Token ID11
Feature activation+2.69e-02

Pos logit contributions
vious+0.090
awks+0.086
 Leave+0.084
trained+0.079
 Fro+0.078
Neg logit contributions
--0.034
 carefully-0.032
pack-0.032
lf-0.031
ator-0.030
Token but
Token ID475
Feature activation+8.28e-02

Pos logit contributions
vious+0.061
awks+0.059
 Leave+0.057
irms+0.055
trained+0.055
Neg logit contributions
 carefully-0.035
--0.034
 together-0.032
 next-0.030
"-0.030
Token from
Token ID422
Feature activation+4.85e-02

Pos logit contributions
vious+0.173
awks+0.168
 Leave+0.164
 Fro+0.163
trained+0.156
Neg logit contributions
--0.108
pack-0.103
ve-0.099
 carefully-0.099
ator-0.096
Token a
Token ID257
Feature activation+1.85e-03

Pos logit contributions
vious+0.110
awks+0.108
 Leave+0.102
irms+0.100
trained+0.099
Neg logit contributions
--0.056
ve-0.053
 carefully-0.053
pack-0.053
"-0.052
Token safe
Token ID3338
Feature activation+2.94e-02

Pos logit contributions
vious+0.004
awks+0.004
 Leave+0.003
joy+0.003
Leave+0.003
Neg logit contributions
--0.003
 carefully-0.003
pack-0.003
ator-0.002
 together-0.002
Token distance
Token ID5253
Feature activation+1.88e-01

Pos logit contributions
vious+0.052
awks+0.051
 Leave+0.049
irms+0.045
trained+0.045
Neg logit contributions
 carefully-0.049
--0.049
 together-0.046
pack-0.046
ve-0.045
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
vious+0.521
awks+0.507
 Leave+0.497
trained+0.474
irms+0.469
Neg logit contributions
 carefully-0.123
--0.122
pack-0.117
ve-0.107
lf-0.106
Token They
Token ID1119
Feature activation+9.63e-03

Pos logit contributions
vious+0.034
awks+0.033
 Leave+0.031
trained+0.031
irms+0.030
Neg logit contributions
 carefully-0.019
--0.018
 together-0.018
"-0.017
 next-0.016
Token said
Token ID531
Feature activation-2.28e-02

Pos logit contributions
vious+0.018
awks+0.017
 Leave+0.016
trained+0.015
 Fro+0.015
Neg logit contributions
--0.016
 carefully-0.015
pack-0.015
 together-0.015
ator-0.015
Token they
Token ID484
Feature activation+1.84e-02

Pos logit contributions
-+0.026
 carefully+0.026
pack+0.024
 together+0.024
"+0.023
Neg logit contributions
vious-0.053
awks-0.052
 Leave-0.051
trained-0.048
irms-0.047
Token would
Token ID561
Feature activation+2.29e-02

Pos logit contributions
vious+0.035
 Leave+0.032
awks+0.032
trained+0.030
irms+0.029
Neg logit contributions
--0.030
 carefully-0.029
pack-0.028
lf-0.028
ator-0.028
Token,"
Token ID553
Feature activation+4.73e-02

Pos logit contributions
 carefully+0.108
pack+0.103
lf+0.102
ator+0.098
-+0.097
Neg logit contributions
vious-0.257
awks-0.238
 Leave-0.236
irms-0.220
trained-0.214
Token Mom
Token ID11254
Feature activation-4.14e-02

Pos logit contributions
vious+0.112
awks+0.108
 Leave+0.106
trained+0.101
irms+0.099
Neg logit contributions
 carefully-0.058
--0.056
 together-0.052
"-0.051
 next-0.050
Token says
Token ID1139
Feature activation-2.99e-02

Pos logit contributions
-+0.086
 carefully+0.085
pack+0.081
 together+0.081
ve+0.080
Neg logit contributions
vious-0.061
awks-0.057
 Leave-0.053
irms-0.050
trained-0.048
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
lf+0.024
pack+0.023
ator+0.023
 carefully+0.022
-+0.021
Neg logit contributions
vious-0.078
awks-0.076
 Leave-0.075
irms-0.072
trained-0.071
Token "
Token ID366
Feature activation-5.63e-02

Pos logit contributions
-+0.022
 carefully+0.022
pack+0.020
 together+0.020
lf+0.020
Neg logit contributions
vious-0.027
awks-0.026
 Leave-0.025
irms-0.024
trained-0.023
TokenElectric
Token ID44132
Feature activation-2.26e-01

Pos logit contributions
-+0.087
 carefully+0.086
 together+0.079
pack+0.078
lf+0.078
Neg logit contributions
vious-0.114
awks-0.109
 Leave-0.106
Leave-0.100
trained-0.098
Tokenity
Token ID414
Feature activation-9.90e-03

Pos logit contributions
 carefully+0.360
pack+0.338
 together+0.330
 next+0.303
lf+0.300
Neg logit contributions
vious-0.428
awks-0.418
 Leave-0.385
irms-0.379
irk-0.362
Token is
Token ID318
Feature activation+1.16e-02

Pos logit contributions
pack+0.013
 carefully+0.013
-+0.013
lf+0.013
ator+0.012
Neg logit contributions
vious-0.021
awks-0.020
 Leave-0.020
trained-0.018
irms-0.018
Token a
Token ID257
Feature activation-3.70e-02

Pos logit contributions
vious+0.026
awks+0.025
 Leave+0.024
 Fro+0.023
irms+0.023
Neg logit contributions
pack-0.014
--0.014
lf-0.013
 carefully-0.013
ator-0.013
Token kind
Token ID1611
Feature activation+2.21e-02

Pos logit contributions
 carefully+0.052
pack+0.051
 together+0.048
-+0.047
ve+0.046
Neg logit contributions
 liar-0.069
awks-0.069
vious-0.068
 Leave-0.068
 foe-0.066
Token of
Token ID286
Feature activation-2.62e-02

Pos logit contributions
vious+0.048
 Leave+0.044
awks+0.044
irk+0.042
irms+0.041
Neg logit contributions
 carefully-0.030
pack-0.029
 together-0.028
ve-0.027
 next-0.026
Token
Token ID198
Feature activation-7.57e-03

Pos logit contributions
 carefully+0.008
-+0.008
 together+0.007
"+0.007
pack+0.006
Neg logit contributions
vious-0.022
awks-0.021
 Leave-0.020
trained-0.020
irms-0.020
Token
Token ID198
Feature activation-7.64e-03

Pos logit contributions
-+0.010
 carefully+0.010
pack+0.009
 together+0.009
ve+0.009
Neg logit contributions
vious-0.015
awks-0.015
 Leave-0.015
 Fro-0.014
irms-0.014
TokenM
Token ID44
Feature activation-4.83e-02

Pos logit contributions
-+0.012
 work+0.011
 carefully+0.011
pack+0.011
 together+0.011
Neg logit contributions
vious-0.013
 Leave-0.013
Leave-0.013
 Fro-0.013
awks-0.012
Tokenoral
Token ID6864
Feature activation-5.84e-02

Pos logit contributions
 carefully+0.090
 together+0.088
pack+0.086
-+0.084
lf+0.082
Neg logit contributions
vious-0.082
awks-0.072
 Leave-0.068
irk-0.064
 Fro-0.061
Token:
Token ID25
Feature activation-4.87e-02

Pos logit contributions
 carefully+0.080
-+0.074
pack+0.073
lf+0.073
 together+0.072
Neg logit contributions
vious-0.128
 Leave-0.117
awks-0.114
trained-0.108
irk-0.106
Token Team
Token ID4816
Feature activation-2.06e-01

Pos logit contributions
-+0.104
 carefully+0.099
ve+0.097
lf+0.097
pack+0.094
Neg logit contributions
 Leave-0.065
vious-0.065
awks-0.061
 Fro-0.055
joy-0.052
Tokenwork
Token ID1818
Feature activation-1.57e-01

Pos logit contributions
 carefully+0.281
-+0.264
lf+0.261
ve+0.245
pack+0.235
Neg logit contributions
vious-0.434
 Leave-0.409
awks-0.378
rified-0.364
trained-0.353
Token and
Token ID290
Feature activation-7.16e-03

Pos logit contributions
 carefully+0.168
-+0.160
pack+0.159
ve+0.152
lf+0.149
Neg logit contributions
vious-0.372
 Leave-0.348
awks-0.342
 Fro-0.320
trained-0.318
Token using
Token ID1262
Feature activation-6.72e-04

Pos logit contributions
-+0.010
 carefully+0.009
ve+0.009
pack+0.009
her+0.009
Neg logit contributions
vious-0.015
awks-0.014
 Leave-0.014
 Fro-0.013
 crabs-0.013
Token our
Token ID674
Feature activation+2.22e-03

Pos logit contributions
ve+0.001
lf+0.001
-+0.001
pack+0.001
ator+0.001
Neg logit contributions
awks-0.001
vious-0.001
 Leave-0.001
irms-0.001
 unnecessary-0.001
Token brains
Token ID14290
Feature activation-7.83e-03

Pos logit contributions
vious+0.003
awks+0.003
 Leave+0.003
joy+0.003
irms+0.003
Neg logit contributions
--0.004
pack-0.004
her-0.004
 together-0.004
"-0.004
Token.
Token ID13
Feature activation-2.26e-02

Pos logit contributions
 carefully+0.011
pack+0.010
 together+0.009
-+0.009
"+0.009
Neg logit contributions
vious-0.026
awks-0.025
 Leave-0.025
joy-0.024
trained-0.024
Token Brown
Token ID4373
Feature activation-5.44e-02

Pos logit contributions
-+0.039
 carefully+0.039
 together+0.036
pack+0.036
her+0.035
Neg logit contributions
joy-0.038
awks-0.037
vious-0.037
 Leave-0.036
 Fro-0.035
Token meant
Token ID4001
Feature activation-1.17e-01

Pos logit contributions
pack+0.071
 carefully+0.067
ator+0.066
-+0.066
"+0.065
Neg logit contributions
vious-0.117
trained-0.105
awks-0.105
 Leave-0.104
 Fro-0.099
Token by
Token ID416
Feature activation-7.04e-02

Pos logit contributions
pack+0.162
ator+0.153
lf+0.146
-+0.146
 carefully+0.142
Neg logit contributions
vious-0.244
 Leave-0.223
joy-0.217
awks-0.207
 hesitation-0.206
Token "
Token ID366
Feature activation-6.05e-02

Pos logit contributions
-+0.114
 carefully+0.113
pack+0.111
 together+0.110
ve+0.104
Neg logit contributions
vious-0.125
 Leave-0.121
awks-0.120
 Fro-0.113
Leave-0.111
Tokensp
Token ID2777
Feature activation-2.04e-01

Pos logit contributions
 carefully+0.110
-+0.107
lf+0.104
 together+0.101
pack+0.101
Neg logit contributions
vious-0.107
awks-0.097
 Leave-0.092
trained-0.089
Leave-0.087
Tokenoil
Token ID9437
Feature activation-1.38e-01

Pos logit contributions
 carefully+0.422
 work+0.393
 together+0.360
lf+0.356
 next+0.355
Neg logit contributions
vious-0.311
 Leave-0.288
awks-0.280
trained-0.268
irk-0.256
Token",
Token ID1600
Feature activation-2.89e-02

Pos logit contributions
lf+0.157
 carefully+0.156
pack+0.152
ator+0.145
 together+0.145
Neg logit contributions
vious-0.326
awks-0.296
 Leave-0.295
trained-0.273
 pige-0.269
Token but
Token ID475
Feature activation+4.58e-02

Pos logit contributions
-+0.041
 carefully+0.040
pack+0.038
 together+0.037
ve+0.037
Neg logit contributions
vious-0.059
awks-0.057
 Leave-0.055
trained-0.052
 Fro-0.051
Token she
Token ID673
Feature activation+3.45e-03

Pos logit contributions
vious+0.079
awks+0.071
 Fro+0.070
 Leave+0.070
trained+0.069
Neg logit contributions
pack-0.077
--0.077
ator-0.074
ve-0.074
 carefully-0.072
Token knew
Token ID2993
Feature activation-7.52e-02

Pos logit contributions
vious+0.006
 regretted+0.005
 Fro+0.005
 mun+0.005
rified+0.005
Neg logit contributions
pack-0.006
ator-0.006
 work-0.005
 together-0.005
 instructions-0.005
Token many
Token ID867
Feature activation-7.04e-03

Pos logit contributions
 carefully+0.006
-+0.006
 together+0.005
"+0.004
 next+0.004
Neg logit contributions
vious-0.025
awks-0.025
 Leave-0.024
trained-0.023
irms-0.023
Token books
Token ID3835
Feature activation-2.66e-02

Pos logit contributions
 carefully+0.009
pack+0.009
-+0.008
 together+0.008
ator+0.008
Neg logit contributions
vious-0.015
awks-0.015
 Leave-0.014
irms-0.014
trained-0.013
Token on
Token ID319
Feature activation-9.34e-03

Pos logit contributions
 carefully+0.033
ve+0.033
 work+0.031
plug+0.031
-+0.031
Neg logit contributions
vious-0.055
 Leave-0.049
awks-0.047
irk-0.046
trained-0.046
Token her
Token ID607
Feature activation-3.45e-02

Pos logit contributions
ve+0.016
pack+0.014
plug+0.014
-+0.014
 together+0.014
Neg logit contributions
 Leave-0.015
vious-0.015
awks-0.014
 crowds-0.014
 ze-0.013
Token books
Token ID3835
Feature activation-2.86e-02

Pos logit contributions
 together+0.046
 carefully+0.045
ve+0.043
-+0.043
"+0.043
Neg logit contributions
awks-0.069
vious-0.068
 Leave-0.067
irms-0.062
trained-0.061
Tokenhe
Token ID258
Feature activation-2.01e-01

Pos logit contributions
ve+0.028
 carefully+0.027
-+0.027
 work+0.024
pack+0.024
Neg logit contributions
vious-0.068
 Leave-0.061
awks-0.060
irk-0.057
trained-0.057
Tokenlf
Token ID1652
Feature activation-1.44e-01

Pos logit contributions
 carefully+0.152
pack+0.137
 together+0.136
ve+0.136
 handy+0.133
Neg logit contributions
vious-0.488
 Leave-0.470
awks-0.461
Leave-0.443
 crowds-0.442
Token in
Token ID287
Feature activation-4.36e-02

Pos logit contributions
ve+0.125
 carefully+0.121
pack+0.112
 work+0.108
"+0.108
Neg logit contributions
vious-0.346
 Leave-0.315
awks-0.314
trained-0.304
 pige-0.302
Token her
Token ID607
Feature activation-3.66e-02

Pos logit contributions
ve+0.065
-+0.061
 carefully+0.059
 together+0.055
iced+0.054
Neg logit contributions
vious-0.081
 Leave-0.077
awks-0.076
joy-0.074
irms-0.072
Token room
Token ID2119
Feature activation-9.20e-02

Pos logit contributions
 carefully+0.034
-+0.031
 together+0.031
ve+0.030
pack+0.028
Neg logit contributions
vious-0.090
awks-0.090
 Leave-0.088
irms-0.085
joy-0.084
Token.
Token ID13
Feature activation-2.72e-02

Pos logit contributions
 carefully+0.075
ve+0.070
-+0.068
pack+0.066
 together+0.059
Neg logit contributions
vious-0.233
 Leave-0.220
awks-0.216
joy-0.213
trained-0.203
Tokenhe
Token ID258
Feature activation-1.96e-01

Pos logit contributions
ve+0.039
-+0.038
ator+0.036
pack+0.034
abies+0.034
Neg logit contributions
vious-0.053
awks-0.048
 Leave-0.045
rified-0.044
irms-0.044
Tokenlf
Token ID1652
Feature activation-1.42e-01

Pos logit contributions
ve+0.153
 carefully+0.146
pack+0.144
 together+0.133
her+0.131
Neg logit contributions
vious-0.461
awks-0.450
 Leave-0.444
 crowds-0.427
irms-0.427
Token.
Token ID13
Feature activation-2.27e-02

Pos logit contributions
"+0.127
ve+0.125
ator+0.116
pack+0.110
-+0.107
Neg logit contributions
vious-0.330
awks-0.310
trained-0.290
joy-0.285
 Leave-0.284
Token The
Token ID383
Feature activation-1.87e-02

Pos logit contributions
 carefully+0.033
-+0.032
 together+0.031
"+0.030
 next+0.029
Neg logit contributions
vious-0.048
awks-0.046
 Leave-0.044
trained-0.042
irms-0.042
Token books
Token ID3835
Feature activation-2.92e-02

Pos logit contributions
-+0.022
 carefully+0.022
 together+0.021
pack+0.020
"+0.019
Neg logit contributions
vious-0.043
awks-0.042
 Leave-0.041
irms-0.039
trained-0.039
Tokenhe
Token ID258
Feature activation-1.99e-01

Pos logit contributions
-+0.048
pack+0.046
ator+0.046
 carefully+0.044
ve+0.044
Neg logit contributions
vious-0.053
awks-0.045
 Leave-0.045
Leave-0.043
trained-0.042
Tokenlf
Token ID1652
Feature activation-1.44e-01

Pos logit contributions
pack+0.158
ve+0.157
 carefully+0.154
 together+0.149
her+0.149
Neg logit contributions
vious-0.444
awks-0.418
 Leave-0.418
joy-0.413
 liar-0.412
Token was
Token ID373
Feature activation-1.32e-02

Pos logit contributions
pack+0.176
"+0.173
ve+0.170
ator+0.168
-+0.167
Neg logit contributions
vious-0.310
awks-0.276
 Leave-0.270
trained-0.265
rified-0.256
Token smooth
Token ID7209
Feature activation+6.33e-03

Pos logit contributions
-+0.018
ve+0.017
pack+0.017
"+0.016
ator+0.015
Neg logit contributions
vious-0.026
awks-0.026
irms-0.024
 Leave-0.023
 Fro-0.023
Token and
Token ID290
Feature activation-1.42e-02

Pos logit contributions
vious+0.015
awks+0.014
 Leave+0.013
irms+0.013
joy+0.013
Neg logit contributions
pack-0.006
 carefully-0.006
ve-0.006
"-0.006
lf-0.006
Token shiny
Token ID22441
Feature activation+1.72e-02

Pos logit contributions
-+0.021
pack+0.021
ve+0.020
ator+0.020
 carefully+0.019
Neg logit contributions
vious-0.026
awks-0.025
trained-0.024
irms-0.024
 Leave-0.024
TokenYes
Token ID5297
Feature activation-6.33e-03

Pos logit contributions
-+0.060
 carefully+0.060
 together+0.056
"+0.054
pack+0.054
Neg logit contributions
vious-0.066
awks-0.063
 Leave-0.061
trained-0.057
irms-0.057
Token,
Token ID11
Feature activation+9.19e-04

Pos logit contributions
pack+0.007
 carefully+0.006
-+0.006
her+0.006
 together+0.006
Neg logit contributions
vious-0.015
 Leave-0.014
awks-0.014
rified-0.013
 Fro-0.013
Token we
Token ID356
Feature activation+3.74e-02

Pos logit contributions
vious+0.002
 Leave+0.002
awks+0.002
 Fro+0.002
Leave+0.002
Neg logit contributions
--0.001
 carefully-0.001
pack-0.001
 together-0.001
ve-0.001
Token are
Token ID389
Feature activation+1.84e-02

Pos logit contributions
vious+0.069
awks+0.058
 Leave+0.058
 realize+0.053
irms+0.053
Neg logit contributions
--0.069
lf-0.064
ator-0.060
 next-0.060
 carefully-0.059
Token.
Token ID13
Feature activation-1.73e-02

Pos logit contributions
vious+0.035
joy+0.033
awks+0.032
 Leave+0.031
 Fro+0.030
Neg logit contributions
--0.026
pack-0.026
lf-0.025
 carefully-0.025
ork-0.024
Token Team
Token ID4816
Feature activation-1.99e-01

Pos logit contributions
 carefully+0.033
-+0.033
 together+0.031
"+0.030
pack+0.030
Neg logit contributions
vious-0.028
awks-0.027
 Leave-0.026
trained-0.024
irms-0.024
Tokenwork
Token ID1818
Feature activation-1.50e-01

Pos logit contributions
 carefully+0.276
lf+0.259
-+0.241
ork+0.233
ve+0.229
Neg logit contributions
vious-0.399
 Leave-0.370
rified-0.329
awks-0.324
joy-0.308
Token makes
Token ID1838
Feature activation+3.70e-02

Pos logit contributions
pack+0.182
 carefully+0.179
ve+0.177
lf+0.176
-+0.173
Neg logit contributions
vious-0.318
 Leave-0.285
awks-0.279
 ze-0.261
trained-0.253
Token difficult
Token ID2408
Feature activation+1.83e-02

Pos logit contributions
vious+0.074
awks+0.071
 ze+0.067
 Leave+0.067
 Fro+0.066
Neg logit contributions
pack-0.049
ve-0.046
lf-0.045
oe-0.043
--0.042
Token work
Token ID670
Feature activation-1.53e-01

Pos logit contributions
vious+0.041
awks+0.039
 Leave+0.037
irms+0.036
joy+0.035
Neg logit contributions
pack-0.022
 carefully-0.022
her-0.021
"-0.021
ve-0.020
Token easier
Token ID4577
Feature activation-3.44e-02

Pos logit contributions
pack+0.202
lf+0.188
ve+0.182
ator+0.176
 carefully+0.174
Neg logit contributions
vious-0.313
 Leave-0.264
awks-0.264
 crowds-0.253
 loses-0.251
Token.
Token ID13
Feature activation-2.35e-02

Pos logit contributions
vious+0.022
 Leave+0.020
awks+0.020
 Fro+0.020
joy+0.020
Neg logit contributions
pack-0.013
ator-0.011
ve-0.011
--0.010
ate-0.010
Token He
Token ID679
Feature activation+1.83e-03

Pos logit contributions
 carefully+0.042
-+0.041
 together+0.039
"+0.039
 next+0.038
Neg logit contributions
vious-0.042
awks-0.040
 Leave-0.038
trained-0.037
irms-0.036
Token tried
Token ID3088
Feature activation-4.10e-02

Pos logit contributions
vious+0.004
awks+0.004
 Leave+0.004
trained+0.003
Leave+0.003
Neg logit contributions
--0.003
 carefully-0.002
 together-0.002
pack-0.002
ator-0.002
Token to
Token ID284
Feature activation-8.56e-04

Pos logit contributions
lf+0.047
-+0.047
ator+0.047
ve+0.045
 carefully+0.044
Neg logit contributions
vious-0.089
awks-0.089
 Fro-0.086
 Leave-0.086
joy-0.085
Token un
Token ID555
Feature activation-1.35e-01

Pos logit contributions
-+0.001
ve+0.001
 together+0.001
iced+0.001
ator+0.001
Neg logit contributions
awks-0.001
vious-0.001
Leave-0.001
 Leave-0.001
 Fro-0.001
Tokenplug
Token ID16875
Feature activation-1.99e-01

Pos logit contributions
 special+0.135
lf+0.130
 creation+0.126
 carefully+0.121
 decoration+0.121
Neg logit contributions
trained-0.289
 Leave-0.285
vious-0.273
awks-0.264
irms-0.263
Token the
Token ID262
Feature activation+7.02e-03

Pos logit contributions
pack+0.254
-+0.252
ator+0.249
 carefully+0.239
 creation+0.238
Neg logit contributions
 Leave-0.377
vious-0.372
trained-0.360
 Fro-0.358
Leave-0.356
Token mixer
Token ID33938
Feature activation-9.44e-02

Pos logit contributions
vious+0.012
awks+0.011
 pige+0.010
 Leave+0.010
Leave+0.010
Neg logit contributions
 together-0.011
--0.010
 carefully-0.010
pack-0.010
ate-0.010
Token,
Token ID11
Feature activation-1.13e-02

Pos logit contributions
ve+0.118
pack+0.108
 carefully+0.105
lf+0.105
-+0.104
Neg logit contributions
vious-0.203
awks-0.186
 Fro-0.184
 Leave-0.182
rified-0.180
Token but
Token ID475
Feature activation+4.26e-02

Pos logit contributions
 carefully+0.020
-+0.020
 together+0.019
"+0.018
pack+0.018
Neg logit contributions
vious-0.020
awks-0.019
 Leave-0.019
trained-0.017
irms-0.017
Token he
Token ID339
Feature activation+9.07e-03

Pos logit contributions
vious+0.093
 Leave+0.090
awks+0.088
 Fro+0.088
trained+0.086
Neg logit contributions
--0.051
pack-0.048
 carefully-0.047
ator-0.046
ve-0.045
Tokenhe
Token ID258
Feature activation-1.98e-01

Pos logit contributions
ve+0.028
-+0.027
 carefully+0.025
ator+0.024
abies+0.024
Neg logit contributions
vious-0.056
awks-0.050
 Leave-0.049
trained-0.046
irms-0.046
Tokenlf
Token ID1652
Feature activation-1.39e-01

Pos logit contributions
 carefully+0.153
 together+0.144
pack+0.140
ve+0.137
 handy+0.127
Neg logit contributions
vious-0.479
awks-0.467
 Leave-0.457
irms-0.440
 crabs-0.438
Token.
Token ID13
Feature activation-1.78e-02

Pos logit contributions
ve+0.117
 carefully+0.109
"+0.106
pack+0.105
-+0.102
Neg logit contributions
vious-0.345
awks-0.320
 Leave-0.308
trained-0.307
 ze-0.297
Token His
Token ID2399
Feature activation-2.76e-02

Pos logit contributions
 carefully+0.023
"+0.022
 next+0.020
 together+0.020
.+0.020
Neg logit contributions
vious-0.040
trained-0.039
awks-0.039
 Leave-0.038
Leave-0.038
Token books
Token ID3835
Feature activation-2.56e-02

Pos logit contributions
 carefully+0.032
 together+0.031
-+0.031
pack+0.030
"+0.029
Neg logit contributions
vious-0.063
awks-0.062
 Leave-0.059
irms-0.058
joy-0.057
Tokenhe
Token ID258
Feature activation-1.99e-01

Pos logit contributions
-+0.049
 carefully+0.047
ve+0.046
pack+0.045
ator+0.045
Neg logit contributions
vious-0.039
awks-0.035
 Leave-0.035
trained-0.030
Leave-0.030
Tokenlf
Token ID1652
Feature activation-1.40e-01

Pos logit contributions
 together+0.140
pack+0.131
 carefully+0.129
ve+0.129
her+0.123
Neg logit contributions
vious-0.494
awks-0.473
 Leave-0.461
 crabs-0.450
 ze-0.450
Token is
Token ID318
Feature activation-3.29e-03

Pos logit contributions
-+0.178
 carefully+0.178
pack+0.173
ve+0.170
 together+0.165
Neg logit contributions
vious-0.301
awks-0.281
 Leave-0.276
trained-0.260
irms-0.253
Token important
Token ID1593
Feature activation-5.40e-02

Pos logit contributions
-+0.004
ve+0.004
 carefully+0.004
pack+0.004
 together+0.004
Neg logit contributions
vious-0.007
awks-0.006
 Leave-0.006
irms-0.006
joy-0.006
Token to
Token ID284
Feature activation-4.35e-03

Pos logit contributions
 carefully+0.054
pack+0.052
ator+0.052
ve+0.052
"+0.050
Neg logit contributions
vious-0.123
awks-0.114
 Leave-0.113
joy-0.113
 ze-0.109
Token him
Token ID683
Feature activation-4.32e-02

Pos logit contributions
ve+0.007
-+0.006
 together+0.006
iced+0.006
pack+0.006
Neg logit contributions
vious-0.008
awks-0.008
 Leave-0.008
 Fro-0.007
irms-0.007
Token that
Token ID326
Feature activation-3.80e-02

Pos logit contributions
 carefully+0.015
-+0.015
pack+0.014
ve+0.014
ator+0.014
Neg logit contributions
vious-0.035
 Leave-0.033
awks-0.033
trained-0.031
Leave-0.030
Token says
Token ID1139
Feature activation-2.69e-02

Pos logit contributions
-+0.084
ve+0.080
pack+0.078
 carefully+0.078
ator+0.077
Neg logit contributions
vious-0.049
awks-0.047
 Leave-0.045
trained-0.040
irms-0.039
Token "
Token ID366
Feature activation-6.19e-02

Pos logit contributions
pack+0.040
 carefully+0.039
ve+0.037
-+0.036
 work+0.035
Neg logit contributions
 Leave-0.052
vious-0.049
 liar-0.046
 Fro-0.045
awks-0.045
TokenCl
Token ID2601
Feature activation-7.60e-02

Pos logit contributions
 carefully+0.117
-+0.110
lf+0.109
ator+0.107
 together+0.106
Neg logit contributions
vious-0.106
awks-0.097
 Leave-0.097
Leave-0.091
trained-0.090
Tokenay
Token ID323
Feature activation-1.21e-01

Pos logit contributions
 carefully+0.121
-+0.114
 together+0.109
lf+0.107
pack+0.105
Neg logit contributions
vious-0.151
awks-0.146
 Leave-0.142
irms-0.132
trained-0.131
Token Factory
Token ID19239
Feature activation-1.98e-01

Pos logit contributions
 carefully+0.128
pack+0.101
 together+0.099
lf+0.097
 patiently+0.096
Neg logit contributions
 Leave-0.301
vious-0.291
awks-0.255
 pige-0.254
 Drop-0.254
Token".
Token ID1911
Feature activation-1.45e-02

Pos logit contributions
 carefully+0.294
lf+0.288
 together+0.259
-+0.254
 creation+0.253
Neg logit contributions
vious-0.377
 Leave-0.365
rified-0.315
 ze-0.314
awks-0.305
Token Tom
Token ID4186
Feature activation+1.73e-03

Pos logit contributions
 carefully+0.025
-+0.024
 together+0.023
"+0.023
 next+0.022
Neg logit contributions
vious-0.027
awks-0.026
 Leave-0.024
trained-0.024
irms-0.023
Token and
Token ID290
Feature activation-1.26e-02

Pos logit contributions
vious+0.004
awks+0.004
 Leave+0.004
trained+0.004
irms+0.004
Neg logit contributions
 carefully-0.002
--0.002
 together-0.002
"-0.002
pack-0.001
Token Sam
Token ID3409
Feature activation+2.17e-02

Pos logit contributions
 carefully+0.019
-+0.019
"+0.017
 work+0.017
 together+0.017
Neg logit contributions
vious-0.026
awks-0.025
 Leave-0.024
trained-0.023
irms-0.022
Token are
Token ID389
Feature activation+4.86e-04

Pos logit contributions
vious+0.041
awks+0.040
 Leave+0.038
trained+0.036
irms+0.036
Neg logit contributions
 carefully-0.036
--0.035
 together-0.033
"-0.032
 work-0.032
TokenL
Token ID43
Feature activation-1.50e-02

Pos logit contributions
-+0.085
 carefully+0.084
"+0.079
 together+0.078
 next+0.077
Neg logit contributions
vious-0.126
awks-0.123
 Leave-0.119
trained-0.113
irms-0.112
Tokenily
Token ID813
Feature activation-5.14e-02

Pos logit contributions
 carefully+0.028
-+0.027
 together+0.027
pack+0.026
"+0.025
Neg logit contributions
vious-0.025
awks-0.023
 Leave-0.022
irms-0.021
trained-0.020
Token,
Token ID11
Feature activation-1.70e-02

Pos logit contributions
 carefully+0.056
pack+0.050
-+0.048
 together+0.047
her+0.046
Neg logit contributions
vious-0.120
 Leave-0.117
awks-0.114
joy-0.111
irms-0.107
Token your
Token ID534
Feature activation+1.25e-02

Pos logit contributions
 carefully+0.021
-+0.021
 together+0.020
"+0.019
pack+0.019
Neg logit contributions
vious-0.039
awks-0.038
 Leave-0.036
trained-0.035
irms-0.035
Token books
Token ID3835
Feature activation-3.06e-02

Pos logit contributions
vious+0.027
awks+0.026
 Leave+0.025
joy+0.025
irms+0.024
Neg logit contributions
 together-0.016
 carefully-0.016
--0.015
pack-0.014
"-0.014
Tokenhe
Token ID258
Feature activation-1.98e-01

Pos logit contributions
 carefully+0.047
-+0.045
 together+0.042
pack+0.042
ator+0.041
Neg logit contributions
vious-0.062
 Leave-0.056
awks-0.055
irms-0.050
trained-0.050
Tokenlf
Token ID1652
Feature activation-1.42e-01

Pos logit contributions
 carefully+0.168
 together+0.167
her+0.159
pack+0.127
iced+0.125
Neg logit contributions
vious-0.476
 Leave-0.458
joy-0.434
 liar-0.429
awks-0.426
Token is
Token ID318
Feature activation-6.91e-03

Pos logit contributions
 carefully+0.161
pack+0.145
ator+0.137
-+0.134
her+0.133
Neg logit contributions
vious-0.335
 Leave-0.308
awks-0.296
trained-0.281
irms-0.276
Token very
Token ID845
Feature activation-3.80e-02

Pos logit contributions
-+0.010
 carefully+0.010
ve+0.009
pack+0.009
ator+0.009
Neg logit contributions
vious-0.014
 Leave-0.012
awks-0.011
 stub-0.011
 lonely-0.011
Token messy
Token ID24097
Feature activation-4.37e-02

Pos logit contributions
ator+0.063
pack+0.060
 together+0.059
-+0.058
abies+0.056
Neg logit contributions
vious-0.067
awks-0.059
 lonely-0.058
 sweaty-0.056
 thirsty-0.054
Token.
Token ID13
Feature activation-3.66e-02

Pos logit contributions
 carefully+0.047
pack+0.043
ve+0.042
-+0.040
ator+0.039
Neg logit contributions
vious-0.102
 Leave-0.091
awks-0.091
irms-0.089
joy-0.085
Token many
Token ID867
Feature activation-4.15e-03

Pos logit contributions
vious+0.049
awks+0.047
 Leave+0.046
trained+0.045
irms+0.045
Neg logit contributions
 carefully-0.011
--0.011
 together-0.009
"-0.009
 next-0.009
Token books
Token ID3835
Feature activation-2.29e-02

Pos logit contributions
 carefully+0.005
-+0.005
pack+0.005
 together+0.005
"+0.005
Neg logit contributions
vious-0.009
awks-0.009
 Leave-0.008
irms-0.008
trained-0.008
Token on
Token ID319
Feature activation-6.69e-03

Pos logit contributions
ve+0.027
-+0.026
 carefully+0.025
ator+0.025
abies+0.025
Neg logit contributions
vious-0.049
awks-0.044
 Leave-0.043
trained-0.042
irk-0.040
Token his
Token ID465
Feature activation-1.57e-03

Pos logit contributions
ve+0.012
pack+0.011
iced+0.011
 together+0.011
-+0.011
Neg logit contributions
awks-0.010
vious-0.010
 Leave-0.010
 ze-0.009
 crowds-0.009
Token books
Token ID3835
Feature activation-2.45e-02

Pos logit contributions
 carefully+0.002
 together+0.002
-+0.002
pack+0.002
"+0.002
Neg logit contributions
awks-0.003
vious-0.003
 Leave-0.003
irms-0.003
trained-0.003
Tokenhe
Token ID258
Feature activation-1.98e-01

Pos logit contributions
ve+0.028
-+0.027
 carefully+0.025
ator+0.024
abies+0.024
Neg logit contributions
vious-0.056
awks-0.050
 Leave-0.049
trained-0.046
irms-0.046
Tokenlf
Token ID1652
Feature activation-1.39e-01

Pos logit contributions
 carefully+0.153
 together+0.144
pack+0.140
ve+0.137
 handy+0.127
Neg logit contributions
vious-0.479
awks-0.467
 Leave-0.457
irms-0.440
 crabs-0.438
Token.
Token ID13
Feature activation-1.78e-02

Pos logit contributions
ve+0.117
 carefully+0.109
"+0.106
pack+0.105
-+0.102
Neg logit contributions
vious-0.345
awks-0.320
 Leave-0.308
trained-0.307
 ze-0.297
Token His
Token ID2399
Feature activation-2.76e-02

Pos logit contributions
 carefully+0.023
"+0.022
 next+0.020
 together+0.020
.+0.020
Neg logit contributions
vious-0.040
trained-0.039
awks-0.039
 Leave-0.038
Leave-0.038
Token books
Token ID3835
Feature activation-2.56e-02

Pos logit contributions
 carefully+0.032
 together+0.031
-+0.031
pack+0.030
"+0.029
Neg logit contributions
vious-0.063
awks-0.062
 Leave-0.059
irms-0.058
joy-0.057
Tokenhe
Token ID258
Feature activation-1.99e-01

Pos logit contributions
-+0.049
 carefully+0.047
ve+0.046
pack+0.045
ator+0.045
Neg logit contributions
vious-0.039
awks-0.035
 Leave-0.035
trained-0.030
Leave-0.030
Token,
Token ID11
Feature activation-1.82e-02

Pos logit contributions
vious+0.005
 Leave+0.005
awks+0.004
trained+0.004
 Fro+0.004
Neg logit contributions
 carefully-0.002
--0.002
 together-0.001
pack-0.001
lf-0.001
Token you
Token ID345
Feature activation-1.61e-02

Pos logit contributions
 carefully+0.020
-+0.020
 together+0.018
"+0.018
 next+0.018
Neg logit contributions
vious-0.045
awks-0.043
 Leave-0.041
trained-0.040
irms-0.040
Token need
Token ID761
Feature activation-3.60e-02

Pos logit contributions
 carefully+0.027
-+0.027
 together+0.025
"+0.025
pack+0.024
Neg logit contributions
vious-0.030
awks-0.029
 Leave-0.028
trained-0.026
irms-0.026
Token to
Token ID284
Feature activation-8.26e-03

Pos logit contributions
-+0.039
 carefully+0.038
lf+0.037
ator+0.037
ve+0.036
Neg logit contributions
vious-0.085
 Leave-0.081
awks-0.079
irms-0.075
 Fro-0.074
Token bat
Token ID7365
Feature activation+5.50e-03

Pos logit contributions
ve+0.011
-+0.011
 together+0.010
lf+0.010
ator+0.010
Neg logit contributions
vious-0.016
 Leave-0.016
awks-0.015
 Fro-0.015
irms-0.015
Tokenhe
Token ID258
Feature activation-1.97e-01

Pos logit contributions
vious+0.013
awks+0.012
 Leave+0.012
irms+0.012
trained+0.012
Neg logit contributions
 carefully-0.006
--0.006
 together-0.006
"-0.005
pack-0.005
Token.
Token ID13
Feature activation-3.58e-02

Pos logit contributions
 carefully+0.193
pack+0.193
-+0.182
her+0.172
ve+0.171
Neg logit contributions
vious-0.468
 Leave-0.428
awks-0.423
irms-0.405
 Fro-0.401
Token You
Token ID921
Feature activation+1.33e-03

Pos logit contributions
 carefully+0.066
-+0.066
 together+0.062
"+0.062
pack+0.061
Neg logit contributions
vious-0.060
awks-0.057
 Leave-0.053
trained-0.051
irms-0.050
Token are
Token ID389
Feature activation-3.24e-03

Pos logit contributions
vious+0.002
 Leave+0.002
awks+0.002
trained+0.002
Leave+0.002
Neg logit contributions
--0.002
lf-0.002
 carefully-0.002
ator-0.002
 together-0.002
Token too
Token ID1165
Feature activation-3.56e-02

Pos logit contributions
-+0.004
 carefully+0.004
 together+0.004
pack+0.004
"+0.004
Neg logit contributions
vious-0.007
awks-0.007
 Leave-0.007
irms-0.006
trained-0.006
Token dirty
Token ID11841
Feature activation+4.48e-03

Pos logit contributions
 carefully+0.055
-+0.053
ator+0.052
lf+0.051
 together+0.051
Neg logit contributions
vious-0.068
awks-0.064
 Leave-0.063
irms-0.059
trained-0.058
Token have
Token ID423
Feature activation-3.26e-02

Pos logit contributions
vious+0.087
awks+0.081
 Leave+0.079
trained+0.073
irms+0.072
Neg logit contributions
 carefully-0.084
--0.083
lf-0.082
ator-0.080
 together-0.080
Token a
Token ID257
Feature activation-4.56e-02

Pos logit contributions
 carefully+0.031
-+0.031
 together+0.027
"+0.026
pack+0.026
Neg logit contributions
vious-0.084
awks-0.082
 Leave-0.080
trained-0.076
irms-0.076
Token tum
Token ID11814
Feature activation+1.11e-01

Pos logit contributions
 carefully+0.072
-+0.070
 together+0.070
pack+0.067
"+0.067
Neg logit contributions
vious-0.084
awks-0.082
 Leave-0.079
irms-0.074
joy-0.074
Tokenmy
Token ID1820
Feature activation-2.02e-03

Pos logit contributions
vious+0.195
awks+0.183
 Leave+0.177
trained+0.166
irms+0.164
Neg logit contributions
 carefully-0.199
--0.194
 together-0.184
pack-0.180
"-0.177
Token ac
Token ID936
Feature activation+2.14e-02

Pos logit contributions
 carefully+0.002
pack+0.002
ator+0.002
 together+0.002
ve+0.002
Neg logit contributions
vious-0.005
 Leave-0.004
awks-0.004
trained-0.004
Leave-0.004
Tokenhe
Token ID258
Feature activation-1.96e-01

Pos logit contributions
vious+0.047
awks+0.043
 Leave+0.042
trained+0.041
irms+0.040
Neg logit contributions
 carefully-0.030
 together-0.028
--0.026
pack-0.026
 work-0.025
Token.
Token ID13
Feature activation-4.48e-02

Pos logit contributions
 carefully+0.184
pack+0.170
 together+0.168
-+0.166
ve+0.157
Neg logit contributions
vious-0.489
 Leave-0.459
awks-0.456
Leave-0.432
trained-0.427
Token Can
Token ID1680
Feature activation+1.24e-02

Pos logit contributions
 carefully+0.073
"+0.072
-+0.072
 next+0.069
 together+0.067
Neg logit contributions
vious-0.082
awks-0.081
joy-0.073
trained-0.072
irms-0.070
Token I
Token ID314
Feature activation+4.98e-02

Pos logit contributions
vious+0.021
 Leave+0.021
awks+0.020
trained+0.019
Leave+0.018
Neg logit contributions
--0.022
 carefully-0.022
pack-0.021
 together-0.020
ve-0.020
Token stay
Token ID2652
Feature activation-3.39e-02

Pos logit contributions
 Leave+0.093
vious+0.093
awks+0.086
 Fro+0.083
trained+0.080
Neg logit contributions
 together-0.074
ator-0.073
lf-0.072
--0.072
 carefully-0.069
Token home
Token ID1363
Feature activation-6.60e-02

Pos logit contributions
ve+0.030
 carefully+0.030
lf+0.030
ator+0.029
-+0.028
Neg logit contributions
awks-0.080
vious-0.079
 Leave-0.076
irms-0.075
joy-0.074
Token put
Token ID1234
Feature activation+7.31e-03

Pos logit contributions
-+0.018
 carefully+0.017
 together+0.016
"+0.016
pack+0.016
Neg logit contributions
vious-0.023
awks-0.022
 Leave-0.021
trained-0.020
irms-0.020
Token them
Token ID606
Feature activation-2.22e-02

Pos logit contributions
awks+0.016
vious+0.016
irms+0.015
trained+0.014
 Leave+0.014
Neg logit contributions
lf-0.009
pack-0.009
ve-0.009
--0.008
"-0.008
Token on
Token ID319
Feature activation-4.60e-03

Pos logit contributions
lf+0.034
ator+0.032
ve+0.031
"+0.030
-+0.030
Neg logit contributions
vious-0.040
awks-0.040
irms-0.036
 Leave-0.035
trained-0.034
Token the
Token ID262
Feature activation+7.88e-03

Pos logit contributions
ve+0.007
pack+0.006
-+0.006
ator+0.006
"+0.006
Neg logit contributions
awks-0.009
 Fro-0.008
vious-0.008
joy-0.008
irms-0.008
Token books
Token ID3835
Feature activation-2.73e-02

Pos logit contributions
vious+0.018
awks+0.018
 Leave+0.017
irms+0.017
trained+0.016
Neg logit contributions
--0.009
 carefully-0.009
 together-0.009
"-0.008
pack-0.008
Tokenhe
Token ID258
Feature activation-1.96e-01

Pos logit contributions
ve+0.039
-+0.038
ator+0.036
pack+0.034
abies+0.034
Neg logit contributions
vious-0.053
awks-0.048
 Leave-0.045
rified-0.044
irms-0.044
Tokenlf
Token ID1652
Feature activation-1.42e-01

Pos logit contributions
ve+0.153
 carefully+0.146
pack+0.144
 together+0.133
her+0.131
Neg logit contributions
vious-0.461
awks-0.450
 Leave-0.444
 crowds-0.427
irms-0.427
Token.
Token ID13
Feature activation-2.27e-02

Pos logit contributions
"+0.127
ve+0.125
ator+0.116
pack+0.110
-+0.107
Neg logit contributions
vious-0.330
awks-0.310
trained-0.290
joy-0.285
 Leave-0.284
Token The
Token ID383
Feature activation-1.87e-02

Pos logit contributions
 carefully+0.033
-+0.032
 together+0.031
"+0.030
 next+0.029
Neg logit contributions
vious-0.048
awks-0.046
 Leave-0.044
trained-0.042
irms-0.042
Token books
Token ID3835
Feature activation-2.92e-02

Pos logit contributions
-+0.022
 carefully+0.022
 together+0.021
pack+0.020
"+0.019
Neg logit contributions
vious-0.043
awks-0.042
 Leave-0.041
irms-0.039
trained-0.039
Tokenhe
Token ID258
Feature activation-1.99e-01

Pos logit contributions
-+0.048
pack+0.046
ator+0.046
 carefully+0.044
ve+0.044
Neg logit contributions
vious-0.053
awks-0.045
 Leave-0.045
Leave-0.043
trained-0.042
Token have
Token ID423
Feature activation-3.27e-02

Pos logit contributions
-+0.047
 carefully+0.045
 together+0.042
pack+0.041
lf+0.040
Neg logit contributions
vious-0.074
awks-0.072
 Leave-0.071
trained-0.065
irms-0.065
Token a
Token ID257
Feature activation-4.55e-02

Pos logit contributions
-+0.034
 carefully+0.033
pack+0.029
 together+0.029
lf+0.029
Neg logit contributions
vious-0.082
awks-0.079
 Leave-0.077
trained-0.073
irms-0.073
Token tum
Token ID11814
Feature activation+1.14e-01

Pos logit contributions
-+0.055
 carefully+0.055
 together+0.051
pack+0.050
"+0.049
Neg logit contributions
vious-0.102
awks-0.099
 Leave-0.095
irms-0.091
joy-0.091
Tokenmy
Token ID1820
Feature activation-6.22e-03

Pos logit contributions
vious+0.199
awks+0.185
 Leave+0.178
trained+0.169
irms+0.165
Neg logit contributions
 carefully-0.207
--0.201
 together-0.191
pack-0.187
ve-0.184
Token ac
Token ID936
Feature activation+2.35e-02

Pos logit contributions
 carefully+0.007
pack+0.007
lf+0.007
ator+0.007
ve+0.006
Neg logit contributions
vious-0.015
awks-0.014
 Leave-0.014
trained-0.013
Leave-0.012
Tokenhe
Token ID258
Feature activation-1.96e-01

Pos logit contributions
vious+0.054
awks+0.051
 Leave+0.049
trained+0.048
irms+0.047
Neg logit contributions
 carefully-0.030
--0.028
 together-0.027
pack-0.026
 work-0.025
Token.
Token ID13
Feature activation-3.72e-02

Pos logit contributions
 carefully+0.187
-+0.183
pack+0.173
 together+0.163
ve+0.158
Neg logit contributions
vious-0.490
awks-0.466
 Leave-0.459
irms-0.433
Leave-0.430
Token Do
Token ID2141
Feature activation+8.11e-02

Pos logit contributions
 carefully+0.062
"+0.062
-+0.061
 together+0.059
 next+0.059
Neg logit contributions
vious-0.066
awks-0.064
trained-0.059
 Leave-0.056
irms-0.056
Token you
Token ID345
Feature activation-1.52e-02

Pos logit contributions
vious+0.186
awks+0.179
 Leave+0.178
trained+0.168
irms+0.164
Neg logit contributions
--0.098
 carefully-0.098
pack-0.089
 together-0.089
ator-0.083
Token want
Token ID765
Feature activation-1.87e-02

Pos logit contributions
-+0.030
 carefully+0.030
 together+0.029
pack+0.028
lf+0.028
Neg logit contributions
vious-0.024
awks-0.022
 Leave-0.021
irms-0.020
trained-0.019
Token that
Token ID326
Feature activation-4.14e-02

Pos logit contributions
-+0.020
 carefully+0.020
ve+0.019
pack+0.019
ator+0.019
Neg logit contributions
vious-0.044
awks-0.041
 Leave-0.041
 Fro-0.039
trained-0.038
Token it
Token ID340
Feature activation-5.82e-02

Pos logit contributions
-+0.021
 carefully+0.021
 together+0.020
pack+0.019
"+0.019
Neg logit contributions
vious-0.033
awks-0.031
 Leave-0.031
irms-0.029
trained-0.028
Token's
Token ID338
Feature activation+3.47e-03

Pos logit contributions
 carefully+0.075
-+0.073
lf+0.071
pack+0.069
 together+0.067
Neg logit contributions
vious-0.130
 Leave-0.119
awks-0.117
trained-0.108
irms-0.108
Token time
Token ID640
Feature activation+6.67e-03

Pos logit contributions
vious+0.007
awks+0.007
 Leave+0.006
irms+0.006
 Fro+0.006
Neg logit contributions
--0.005
 carefully-0.005
ve-0.005
lf-0.005
pack-0.005
Token to
Token ID284
Feature activation-4.87e-03

Pos logit contributions
vious+0.016
awks+0.015
 Leave+0.015
irms+0.014
joy+0.014
Neg logit contributions
--0.007
 carefully-0.007
lf-0.007
ator-0.007
pack-0.006
Token bat
Token ID7365
Feature activation+5.61e-03

Pos logit contributions
-+0.007
ve+0.007
lf+0.007
 carefully+0.007
 together+0.007
Neg logit contributions
vious-0.009
 Leave-0.008
awks-0.008
 Fro-0.008
irms-0.008
Tokenhe
Token ID258
Feature activation-1.96e-01

Pos logit contributions
vious+0.012
 Leave+0.011
awks+0.011
irms+0.010
rified+0.010
Neg logit contributions
 carefully-0.008
--0.007
lf-0.007
 work-0.007
ator-0.007
Token!"
Token ID2474
Feature activation+2.86e-02

Pos logit contributions
pack+0.177
 carefully+0.169
-+0.166
her+0.155
ve+0.152
Neg logit contributions
vious-0.475
 Leave-0.440
awks-0.437
 Fro-0.414
irms-0.414
Token Max
Token ID5436
Feature activation-5.58e-03

Pos logit contributions
vious+0.051
awks+0.050
 Leave+0.050
irms+0.046
trained+0.045
Neg logit contributions
--0.049
 carefully-0.047
pack-0.045
ator-0.044
ve-0.044
Token was
Token ID373
Feature activation-1.45e-02

Pos logit contributions
-+0.008
ator+0.008
 carefully+0.008
pack+0.008
 together+0.008
Neg logit contributions
vious-0.011
awks-0.010
 Leave-0.010
 Fro-0.010
irms-0.009
Token obedient
Token ID41652
Feature activation-4.26e-02

Pos logit contributions
-+0.024
 carefully+0.024
 together+0.022
pack+0.022
"+0.022
Neg logit contributions
vious-0.027
awks-0.026
 Leave-0.025
irms-0.023
trained-0.023
Token and
Token ID290
Feature activation-1.07e-02

Pos logit contributions
-+0.029
 carefully+0.029
 together+0.024
"+0.022
 next+0.022
Neg logit contributions
vious-0.122
awks-0.120
 Leave-0.117
trained-0.112
irms-0.112
Token should
Token ID815
Feature activation-4.89e-02

Pos logit contributions
vious+0.018
awks+0.017
 Leave+0.017
trained+0.016
irms+0.016
Neg logit contributions
--0.014
 carefully-0.014
 together-0.013
"-0.013
pack-0.013
Token not
Token ID407
Feature activation-2.19e-02

Pos logit contributions
-+0.061
 carefully+0.059
 together+0.055
ve+0.053
pack+0.051
Neg logit contributions
vious-0.110
 Leave-0.108
awks-0.107
trained-0.099
irms-0.099
Token climb
Token ID12080
Feature activation-1.42e-02

Pos logit contributions
 carefully+0.030
-+0.029
 together+0.027
ve+0.026
ator+0.026
Neg logit contributions
vious-0.047
 Leave-0.045
awks-0.045
trained-0.041
irms-0.041
Token the
Token ID262
Feature activation+4.59e-03

Pos logit contributions
pack+0.018
 carefully+0.018
ve+0.017
lf+0.017
ator+0.017
Neg logit contributions
vious-0.030
awks-0.029
 Leave-0.028
irms-0.027
trained-0.026
Token books
Token ID3835
Feature activation-2.36e-02

Pos logit contributions
vious+0.011
awks+0.011
 Leave+0.011
irms+0.010
trained+0.010
Neg logit contributions
--0.005
 carefully-0.005
 together-0.004
"-0.004
pack-0.004
Tokenhe
Token ID258
Feature activation-1.96e-01

Pos logit contributions
-+0.029
ve+0.029
 carefully+0.029
ator+0.028
pack+0.028
Neg logit contributions
vious-0.053
 Leave-0.046
awks-0.046
rified-0.043
irms-0.042
Tokenlf
Token ID1652
Feature activation-1.40e-01

Pos logit contributions
 carefully+0.150
her+0.123
ve+0.118
pack+0.112
 together+0.111
Neg logit contributions
vious-0.501
 Leave-0.466
awks-0.453
irms-0.438
 crowds-0.436
Token.
Token ID13
Feature activation-2.78e-02

Pos logit contributions
 carefully+0.142
ve+0.141
pack+0.137
-+0.134
ator+0.126
Neg logit contributions
vious-0.335
awks-0.298
 Leave-0.297
trained-0.281
irms-0.277
Token It
Token ID632
Feature activation-5.64e-02

Pos logit contributions
-+0.051
 carefully+0.051
"+0.049
 together+0.049
 next+0.048
Neg logit contributions
vious-0.045
awks-0.043
 Leave-0.039
trained-0.039
joy-0.038
Token is
Token ID318
Feature activation+6.04e-03

Pos logit contributions
 carefully+0.062
-+0.060
pack+0.055
 together+0.055
"+0.053
Neg logit contributions
vious-0.139
 Leave-0.130
awks-0.129
trained-0.122
Leave-0.118
Token dangerous
Token ID4923
Feature activation+2.58e-02

Pos logit contributions
vious+0.013
awks+0.012
 Leave+0.012
irms+0.011
 Fro+0.011
Neg logit contributions
--0.008
pack-0.008
 carefully-0.008
ve-0.008
ator-0.008
Token had
Token ID550
Feature activation+7.07e-03

Pos logit contributions
 carefully+0.020
-+0.017
 quickly+0.016
 was+0.016
 together+0.016
Neg logit contributions
vious-0.027
awks-0.027
joy-0.026
 Leave-0.026
irms-0.025
Token a
Token ID257
Feature activation-3.71e-02

Pos logit contributions
vious+0.017
awks+0.017
 Leave+0.016
trained+0.016
irms+0.016
Neg logit contributions
 carefully-0.008
--0.008
 together-0.007
 next-0.007
 work-0.007
Token tum
Token ID11814
Feature activation+1.13e-01

Pos logit contributions
-+0.053
 carefully+0.052
 together+0.048
"+0.046
 work+0.046
Neg logit contributions
vious-0.081
awks-0.076
 Leave-0.074
trained-0.071
irms-0.070
Tokenmy
Token ID1820
Feature activation-8.76e-04

Pos logit contributions
vious+0.197
awks+0.187
 Leave+0.179
trained+0.169
irms+0.167
Neg logit contributions
 carefully-0.204
--0.201
 together-0.189
pack-0.186
"-0.184
Token ac
Token ID936
Feature activation+2.32e-02

Pos logit contributions
 carefully+0.001
-+0.001
 together+0.001
pack+0.001
"+0.001
Neg logit contributions
vious-0.002
awks-0.002
 Leave-0.002
trained-0.002
irms-0.002
Tokenhe
Token ID258
Feature activation-1.95e-01

Pos logit contributions
vious+0.054
awks+0.052
 Leave+0.051
trained+0.049
irms+0.048
Neg logit contributions
 carefully-0.028
--0.027
 together-0.025
pack-0.024
"-0.024
Token and
Token ID290
Feature activation-1.35e-02

Pos logit contributions
 carefully+0.160
pack+0.155
-+0.151
ve+0.138
 together+0.137
Neg logit contributions
vious-0.496
awks-0.479
 Leave-0.469
Leave-0.451
 Fro-0.448
Token a
Token ID257
Feature activation-3.86e-02

Pos logit contributions
 carefully+0.016
-+0.016
 together+0.014
 next+0.014
 work+0.014
Neg logit contributions
vious-0.032
awks-0.031
 Leave-0.030
irms-0.029
trained-0.028
Token headache
Token ID24902
Feature activation+8.25e-03

Pos logit contributions
 carefully+0.054
-+0.053
 together+0.048
"+0.048
 next+0.047
Neg logit contributions
vious-0.085
awks-0.080
 Leave-0.078
trained-0.075
irms-0.074
Token.
Token ID13
Feature activation-2.64e-02

Pos logit contributions
vious+0.021
 Leave+0.020
awks+0.019
Leave+0.018
trained+0.018
Neg logit contributions
pack-0.007
 carefully-0.007
ator-0.006
--0.006
ve-0.006
Token They
Token ID1119
Feature activation-1.44e-02

Pos logit contributions
 carefully+0.030
 together+0.029
-+0.029
"+0.028
 next+0.026
Neg logit contributions
vious-0.064
awks-0.061
trained-0.058
 Leave-0.058
irms-0.056
Token new
Token ID649
Feature activation-6.27e-04

Pos logit contributions
 carefully+0.041
-+0.040
 together+0.038
pack+0.036
"+0.035
Neg logit contributions
vious-0.095
awks-0.092
 Leave-0.089
trained-0.085
irms-0.085
Token book
Token ID1492
Feature activation-8.16e-03

Pos logit contributions
 carefully+0.001
 together+0.001
pack+0.001
ve+0.001
her+0.001
Neg logit contributions
vious-0.001
awks-0.001
joy-0.001
 Leave-0.001
irms-0.001
Token on
Token ID319
Feature activation-6.98e-03

Pos logit contributions
 carefully+0.008
-+0.008
ve+0.008
ator+0.007
pack+0.007
Neg logit contributions
vious-0.020
 Leave-0.018
awks-0.018
trained-0.017
Leave-0.017
Token his
Token ID465
Feature activation-1.77e-03

Pos logit contributions
ve+0.010
-+0.009
 carefully+0.009
 together+0.009
pack+0.009
Neg logit contributions
vious-0.014
awks-0.014
 Leave-0.013
 Fro-0.013
irms-0.013
Token books
Token ID3835
Feature activation-2.49e-02

Pos logit contributions
-+0.002
 carefully+0.002
 together+0.002
"+0.002
ve+0.002
Neg logit contributions
vious-0.004
awks-0.004
 Leave-0.004
irms-0.004
trained-0.003
Tokenhe
Token ID258
Feature activation-1.95e-01

Pos logit contributions
-+0.029
 carefully+0.028
ve+0.027
 together+0.025
ator+0.025
Neg logit contributions
vious-0.059
awks-0.053
 Leave-0.052
irms-0.050
trained-0.049
Tokenlf
Token ID1652
Feature activation-1.37e-01

Pos logit contributions
 together+0.157
 carefully+0.152
ve+0.138
pack+0.128
her+0.126
Neg logit contributions
vious-0.485
awks-0.455
 Leave-0.449
irms-0.440
joy-0.438
Token.
Token ID13
Feature activation-1.05e-02

Pos logit contributions
 carefully+0.104
-+0.096
ve+0.096
"+0.092
pack+0.087
Neg logit contributions
vious-0.366
awks-0.338
 Leave-0.333
trained-0.323
irms-0.318
Token It
Token ID632
Feature activation-5.90e-02

Pos logit contributions
 carefully+0.014
-+0.014
"+0.013
 next+0.012
 work+0.012
Neg logit contributions
awks-0.022
vious-0.022
trained-0.022
 Leave-0.021
Leave-0.021
Token is
Token ID318
Feature activation+4.04e-03

Pos logit contributions
-+0.087
 carefully+0.085
pack+0.084
 together+0.082
ator+0.080
Neg logit contributions
vious-0.119
awks-0.112
 Leave-0.111
trained-0.102
irms-0.100
Token big
Token ID1263
Feature activation+4.81e-03

Pos logit contributions
vious+0.010
awks+0.009
 Leave+0.009
irms+0.009
 Fro+0.008
Neg logit contributions
--0.004
 carefully-0.004
ve-0.004
pack-0.004
 together-0.004
Token the
Token ID262
Feature activation+4.77e-03

Pos logit contributions
vious+0.009
awks+0.008
irms+0.008
joy+0.008
 Leave+0.008
Neg logit contributions
lf-0.007
ve-0.007
pack-0.007
"-0.006
--0.006
Token books
Token ID3835
Feature activation-2.75e-02

Pos logit contributions
vious+0.011
awks+0.010
 Leave+0.010
irms+0.010
joy+0.010
Neg logit contributions
--0.006
 carefully-0.006
"-0.005
pack-0.005
 together-0.005
Token on
Token ID319
Feature activation-8.45e-03

Pos logit contributions
lf+0.040
ator+0.040
abies+0.039
ve+0.038
innie+0.038
Neg logit contributions
vious-0.046
awks-0.038
 Leave-0.037
 sweaty-0.035
rified-0.034
Token the
Token ID262
Feature activation+2.86e-03

Pos logit contributions
ve+0.014
pack+0.013
-+0.012
ator+0.012
sh+0.011
Neg logit contributions
vious-0.012
awks-0.012
 Fro-0.012
 Pip-0.011
rified-0.011
Token books
Token ID3835
Feature activation-2.96e-02

Pos logit contributions
awks+0.005
vious+0.005
 Leave+0.005
irms+0.005
rified+0.005
Neg logit contributions
--0.004
 together-0.004
"-0.004
pack-0.004
 carefully-0.003
Tokenhe
Token ID258
Feature activation-1.95e-01

Pos logit contributions
ve+0.043
-+0.042
abies+0.041
ator+0.040
 work+0.038
Neg logit contributions
vious-0.053
awks-0.044
rified-0.044
 Leave-0.044
irms-0.039
Tokenlf
Token ID1652
Feature activation-1.41e-01

Pos logit contributions
ve+0.206
 together+0.194
rave+0.192
 carefully+0.187
pack+0.187
Neg logit contributions
vious-0.348
 crowds-0.340
 Leave-0.336
 crabs-0.331
awks-0.326
Token and
Token ID290
Feature activation-1.35e-02

Pos logit contributions
"+0.142
ve+0.130
ator+0.128
Mom+0.125
pack+0.120
Neg logit contributions
vious-0.302
awks-0.270
joy-0.266
 pads-0.263
 Leave-0.263
Token the
Token ID262
Feature activation+9.89e-04

Pos logit contributions
-+0.019
ve+0.019
"+0.018
ator+0.018
pack+0.017
Neg logit contributions
vious-0.026
awks-0.025
 Leave-0.024
irms-0.024
trained-0.023
Token clothes
Token ID8242
Feature activation-3.48e-04

Pos logit contributions
vious+0.002
awks+0.002
 Leave+0.002
irms+0.002
trained+0.002
Neg logit contributions
--0.001
 carefully-0.001
 together-0.001
"-0.001
pack-0.001
Token in
Token ID287
Feature activation-4.19e-02

Pos logit contributions
lf+0.000
ator+0.000
"+0.000
-+0.000
pack+0.000
Neg logit contributions
vious-0.001
 Leave-0.001
awks-0.001
joy-0.001
trained-0.001
Token didn
Token ID1422
Feature activation-5.16e-02

Pos logit contributions
vious+0.005
 regretted+0.005
 Leave+0.005
 Fro+0.005
trained+0.005
Neg logit contributions
ator-0.006
pack-0.005
 together-0.005
lf-0.005
"-0.005
Token't
Token ID470
Feature activation-6.12e-02

Pos logit contributions
-+0.066
 together+0.063
 carefully+0.062
ator+0.060
lf+0.060
Neg logit contributions
 Leave-0.110
vious-0.107
Leave-0.096
 Fro-0.094
awks-0.092
Token want
Token ID765
Feature activation+6.27e-03

Pos logit contributions
 carefully+0.120
-+0.120
 together+0.113
"+0.111
pack+0.110
Neg logit contributions
vious-0.097
awks-0.092
 Leave-0.088
trained-0.082
irms-0.081
Token to
Token ID284
Feature activation+2.08e-02

Pos logit contributions
vious+0.018
awks+0.018
 Leave+0.017
irms+0.017
trained+0.017
Neg logit contributions
--0.005
 carefully-0.005
 together-0.004
 next-0.004
 work-0.003
Token wear
Token ID5806
Feature activation+3.54e-02

Pos logit contributions
vious+0.042
awks+0.040
 Leave+0.039
trained+0.037
irms+0.036
Neg logit contributions
--0.031
 carefully-0.031
 together-0.029
pack-0.028
ve-0.028
Token anything
Token ID1997
Feature activation+1.41e-01

Pos logit contributions
vious+0.085
awks+0.081
 Leave+0.079
irms+0.075
trained+0.074
Neg logit contributions
 carefully-0.038
pack-0.037
--0.037
lf-0.036
ve-0.035
Token heavy
Token ID4334
Feature activation-2.27e-03

Pos logit contributions
vious+0.339
awks+0.324
 Leave+0.319
trained+0.301
irms+0.299
Neg logit contributions
--0.151
 carefully-0.151
pack-0.141
"-0.134
ator-0.133
Token.
Token ID13
Feature activation+1.22e-02

Pos logit contributions
 carefully+0.002
-+0.002
pack+0.002
"+0.002
 together+0.002
Neg logit contributions
vious-0.006
awks-0.006
 Leave-0.006
irms-0.005
trained-0.005
Token So
Token ID1406
Feature activation+4.42e-02

Pos logit contributions
vious+0.022
 Leave+0.021
awks+0.021
irms+0.019
joy+0.019
Neg logit contributions
--0.021
 carefully-0.020
lf-0.020
ve-0.019
pack-0.019
Token,
Token ID11
Feature activation+1.33e-02

Pos logit contributions
vious+0.093
 Leave+0.090
awks+0.090
joy+0.085
 Fro+0.085
Neg logit contributions
pack-0.054
--0.051
ve-0.050
her-0.050
 carefully-0.049
Token she
Token ID673
Feature activation+2.65e-02

Pos logit contributions
vious+0.028
awks+0.027
 Leave+0.026
trained+0.025
Leave+0.025
Neg logit contributions
--0.018
 carefully-0.017
pack-0.017
ve-0.016
ator-0.016
Token we
Token ID356
Feature activation+3.59e-02

Pos logit contributions
vious+0.007
awks+0.007
 Leave+0.007
irms+0.006
trained+0.006
Neg logit contributions
--0.005
 carefully-0.005
 together-0.005
pack-0.005
"-0.004
Token'll
Token ID1183
Feature activation-2.68e-02

Pos logit contributions
vious+0.069
 Leave+0.063
awks+0.060
trained+0.054
irms+0.053
Neg logit contributions
--0.058
lf-0.058
 carefully-0.056
ator-0.054
 next-0.054
Token just
Token ID655
Feature activation-3.46e-02

Pos logit contributions
-+0.040
 together+0.035
 carefully+0.035
lf+0.035
ve+0.033
Neg logit contributions
 Leave-0.053
vious-0.052
awks-0.050
Leave-0.047
 Fro-0.046
Token have
Token ID423
Feature activation-1.59e-02

Pos logit contributions
-+0.050
 carefully+0.048
 together+0.045
"+0.045
lf+0.044
Neg logit contributions
vious-0.073
awks-0.070
 Leave-0.068
trained-0.063
irms-0.063
Token to
Token ID284
Feature activation+1.09e-02

Pos logit contributions
lf+0.019
-+0.019
 carefully+0.018
pack+0.018
ator+0.018
Neg logit contributions
vious-0.036
awks-0.034
 Leave-0.033
joy-0.032
irms-0.031
Token forget
Token ID6044
Feature activation+1.49e-01

Pos logit contributions
vious+0.021
awks+0.021
 Leave+0.020
Leave+0.019
irms+0.019
Neg logit contributions
--0.016
ve-0.015
 carefully-0.014
 together-0.014
lf-0.014
Token about
Token ID546
Feature activation-2.74e-02

Pos logit contributions
vious+0.278
 Leave+0.272
awks+0.255
irms+0.233
joy+0.233
Neg logit contributions
ve-0.213
lf-0.200
--0.199
pack-0.198
ator-0.191
Token it
Token ID340
Feature activation-3.89e-02

Pos logit contributions
-+0.037
pack+0.035
ve+0.035
lf+0.034
 carefully+0.034
Neg logit contributions
awks-0.052
vious-0.051
joy-0.051
 Leave-0.050
 crabs-0.049
Token."
Token ID526
Feature activation+2.89e-02

Pos logit contributions
pack+0.048
-+0.048
lf+0.047
 carefully+0.046
ator+0.045
Neg logit contributions
vious-0.085
 Leave-0.075
awks-0.074
 loses-0.067
rified-0.067
Token
Token ID220
Feature activation+7.00e-02

Pos logit contributions
vious+0.071
awks+0.069
 Leave+0.066
trained+0.065
irms+0.064
Neg logit contributions
 carefully-0.033
--0.032
 together-0.031
"-0.029
 next-0.027
Token
Token ID198
Feature activation+9.40e-03

Pos logit contributions
vious+0.155
awks+0.153
 Leave+0.153
 Fro+0.144
irms+0.141
Neg logit contributions
--0.086
 carefully-0.080
lf-0.080
pack-0.072
 next-0.070
Token the
Token ID262
Feature activation+1.17e-02

Pos logit contributions
pack+0.063
lf+0.061
ator+0.061
ve+0.060
iced+0.058
Neg logit contributions
vious-0.093
awks-0.089
 Leave-0.089
trained-0.084
irms-0.083
Token wreck
Token ID15228
Feature activation-2.21e-02

Pos logit contributions
vious+0.026
awks+0.024
 Leave+0.024
trained+0.022
irms+0.022
Neg logit contributions
 carefully-0.016
--0.015
 together-0.014
"-0.014
pack-0.014
Token out
Token ID503
Feature activation+4.14e-03

Pos logit contributions
ve+0.023
ator+0.021
abies+0.020
lf+0.020
innie+0.020
Neg logit contributions
vious-0.050
awks-0.045
trained-0.042
 ze-0.041
 loses-0.041
Token of
Token ID286
Feature activation-2.21e-02

Pos logit contributions
vious+0.009
awks+0.008
 Leave+0.008
trained+0.007
Leave+0.007
Neg logit contributions
pack-0.006
--0.005
 carefully-0.005
ve-0.005
lf-0.005
Token the
Token ID262
Feature activation+1.31e-02

Pos logit contributions
-+0.030
ve+0.030
 carefully+0.029
pack+0.027
iced+0.027
Neg logit contributions
vious-0.045
awks-0.044
 Leave-0.043
joy-0.042
irms-0.042
Token way
Token ID835
Feature activation+1.44e-01

Pos logit contributions
vious+0.027
awks+0.025
 Leave+0.025
trained+0.023
irms+0.023
Neg logit contributions
 carefully-0.020
--0.019
 together-0.018
"-0.018
pack-0.017
Token.
Token ID13
Feature activation-1.57e-03

Pos logit contributions
vious+0.337
 Leave+0.326
awks+0.319
trained+0.311
rified+0.302
Neg logit contributions
lf-0.129
pack-0.126
--0.124
 carefully-0.118
ve-0.115
Token Red
Token ID2297
Feature activation-2.96e-02

Pos logit contributions
-+0.003
 carefully+0.002
pack+0.002
lf+0.002
ator+0.002
Neg logit contributions
vious-0.003
 Leave-0.003
awks-0.003
trained-0.003
irms-0.003
Token was
Token ID373
Feature activation-3.83e-04

Pos logit contributions
ator+0.034
pack+0.033
 carefully+0.032
 together+0.031
-+0.030
Neg logit contributions
vious-0.064
 Leave-0.060
trained-0.058
awks-0.058
 Fro-0.057
Token very
Token ID845
Feature activation-1.85e-02

Pos logit contributions
-+0.001
 carefully+0.001
pack+0.001
lf+0.000
ve+0.000
Neg logit contributions
vious-0.001
 Leave-0.001
awks-0.001
Leave-0.001
trained-0.001
Token thankful
Token ID25535
Feature activation+1.13e-02

Pos logit contributions
ator+0.028
-+0.027
 carefully+0.027
lf+0.027
pack+0.027
Neg logit contributions
vious-0.035
awks-0.034
 Leave-0.033
trained-0.031
 Fro-0.031
Token "
Token ID366
Feature activation-3.01e-02

Pos logit contributions
 Leave+0.043
vious+0.041
joy+0.040
Leave+0.040
 Fro+0.039
Neg logit contributions
--0.032
pack-0.031
 carefully-0.031
ator-0.030
 work-0.030
TokenThat
Token ID2504
Feature activation-8.27e-02

Pos logit contributions
 carefully+0.046
-+0.045
 together+0.043
"+0.043
 next+0.042
Neg logit contributions
vious-0.060
awks-0.059
 Leave-0.056
irms-0.054
trained-0.053
Token's
Token ID338
Feature activation+3.61e-02

Pos logit contributions
ve+0.111
pack+0.109
 carefully+0.109
-+0.106
 together+0.105
Neg logit contributions
vious-0.175
awks-0.156
 Leave-0.155
trained-0.143
 ze-0.141
Token a
Token ID257
Feature activation-1.70e-02

Pos logit contributions
vious+0.066
awks+0.064
irms+0.061
 Fro+0.061
 Leave+0.060
Neg logit contributions
--0.051
ve-0.049
 carefully-0.049
lf-0.047
pack-0.046
Token good
Token ID922
Feature activation+4.92e-02

Pos logit contributions
 carefully+0.020
-+0.019
pack+0.019
 together+0.018
ve+0.017
Neg logit contributions
vious-0.038
awks-0.036
 Leave-0.035
joy-0.034
 liar-0.033
Token choice
Token ID3572
Feature activation+1.46e-01

Pos logit contributions
vious+0.108
awks+0.101
 Leave+0.096
irk+0.094
irms+0.094
Neg logit contributions
 carefully-0.064
pack-0.060
her-0.059
 together-0.058
--0.058
Token.
Token ID13
Feature activation-7.53e-04

Pos logit contributions
vious+0.382
 Leave+0.363
awks+0.357
trained+0.339
joy+0.337
Neg logit contributions
 carefully-0.123
pack-0.119
--0.112
ator-0.102
ve-0.102
Token You
Token ID921
Feature activation+4.18e-02

Pos logit contributions
 carefully+0.001
-+0.001
"+0.001
 together+0.001
 next+0.001
Neg logit contributions
vious-0.001
awks-0.001
 Leave-0.001
trained-0.001
irms-0.001
Token can
Token ID460
Feature activation-7.35e-03

Pos logit contributions
vious+0.076
 Leave+0.066
awks+0.066
irms+0.059
 realize+0.058
Neg logit contributions
--0.077
 carefully-0.068
lf-0.068
ator-0.066
pack-0.066
Token both
Token ID1111
Feature activation-1.14e-02

Pos logit contributions
-+0.012
ve+0.012
 together+0.011
lf+0.010
ator+0.010
Neg logit contributions
vious-0.013
 Leave-0.012
awks-0.011
 realize-0.011
trained-0.010
Token express
Token ID4911
Feature activation+6.99e-03

Pos logit contributions
-+0.017
ve+0.016
 carefully+0.016
ator+0.016
pack+0.015
Neg logit contributions
vious-0.023
awks-0.021
 Leave-0.021
trained-0.019
Leave-0.018
Token but
Token ID475
Feature activation+7.00e-02

Pos logit contributions
ator+0.037
lf+0.036
pack+0.033
ve+0.032
iced+0.031
Neg logit contributions
vious-0.064
 Fro-0.061
awks-0.060
 Leave-0.058
trained-0.058
Token he
Token ID339
Feature activation+3.45e-02

Pos logit contributions
vious+0.100
 Fro+0.099
rified+0.092
 Leave+0.088
awks+0.086
Neg logit contributions
--0.129
ator-0.129
pack-0.126
ve-0.121
her-0.119
Token couldn
Token ID3521
Feature activation-2.67e-02

Pos logit contributions
vious+0.064
 Fro+0.060
awks+0.058
 Leave+0.058
trained+0.055
Neg logit contributions
--0.051
pack-0.051
 together-0.051
 carefully-0.051
ator-0.049
Token't
Token ID470
Feature activation-5.59e-02

Pos logit contributions
 carefully+0.035
-+0.034
 together+0.033
pack+0.033
ator+0.031
Neg logit contributions
 Leave-0.057
vious-0.057
Leave-0.049
 Fro-0.048
trained-0.047
Token find
Token ID1064
Feature activation+3.18e-02

Pos logit contributions
 together+0.093
-+0.092
iced+0.089
ator+0.087
abies+0.086
Neg logit contributions
vious-0.093
 realize-0.086
 Leave-0.083
awks-0.083
 loses-0.077
Token anywhere
Token ID6609
Feature activation+1.34e-01

Pos logit contributions
vious+0.077
awks+0.073
 Leave+0.072
trained+0.069
irms+0.068
Neg logit contributions
 carefully-0.032
pack-0.031
ve-0.031
--0.031
lf-0.028
Token to
Token ID284
Feature activation+2.62e-02

Pos logit contributions
vious+0.382
awks+0.367
 Leave+0.366
trained+0.352
irms+0.347
Neg logit contributions
 carefully-0.081
--0.078
pack-0.069
 together-0.064
"-0.063
Token buy
Token ID2822
Feature activation-2.64e-02

Pos logit contributions
 Leave+0.042
vious+0.042
 realize+0.040
awks+0.039
trained+0.038
Neg logit contributions
 together-0.044
--0.042
abies-0.040
ve-0.040
 carefully-0.039
Token a
Token ID257
Feature activation-1.65e-02

Pos logit contributions
pack+0.030
ve+0.030
lf+0.029
ator+0.028
abies+0.028
Neg logit contributions
vious-0.060
awks-0.053
joy-0.053
trained-0.052
 Leave-0.052
Token television
Token ID5581
Feature activation-7.02e-02

Pos logit contributions
pack+0.021
-+0.021
ve+0.021
 carefully+0.021
lf+0.020
Neg logit contributions
vious-0.030
 foe-0.028
 liar-0.028
 Leave-0.027
awks-0.027
Token.
Token ID13
Feature activation+1.40e-02

Pos logit contributions
ve+0.063
 carefully+0.061
-+0.059
pack+0.057
ator+0.056
Neg logit contributions
vious-0.177
awks-0.155
 Leave-0.155
trained-0.150
joy-0.147
Token She
Token ID1375
Feature activation+5.80e-04

Pos logit contributions
-+0.011
 carefully+0.011
pack+0.010
 together+0.010
lf+0.010
Neg logit contributions
vious-0.014
awks-0.013
 Leave-0.013
irms-0.012
trained-0.012
Token was
Token ID373
Feature activation+1.58e-03

Pos logit contributions
 Fro+0.001
vious+0.001
 Leave+0.001
 regretted+0.001
awks+0.001
Neg logit contributions
pack-0.001
ator-0.001
 project-0.001
 work-0.001
 together-0.001
Token demanding
Token ID11334
Feature activation-9.45e-03

Pos logit contributions
vious+0.003
awks+0.003
 Leave+0.003
 Fro+0.003
joy+0.002
Neg logit contributions
 carefully-0.003
--0.003
 work-0.002
ve-0.002
lf-0.002
Token her
Token ID607
Feature activation-1.36e-02

Pos logit contributions
-+0.010
 carefully+0.010
pack+0.010
 together+0.009
"+0.009
Neg logit contributions
vious-0.022
awks-0.022
 Leave-0.021
trained-0.020
irms-0.020
Token to
Token ID284
Feature activation+1.45e-02

Pos logit contributions
-+0.015
lf+0.014
 carefully+0.014
ator+0.014
ve+0.014
Neg logit contributions
vious-0.031
 Leave-0.030
awks-0.029
 Fro-0.028
trained-0.027
Token come
Token ID1282
Feature activation+4.66e-02

Pos logit contributions
vious+0.026
awks+0.026
 Leave+0.026
Leave+0.024
 Fro+0.024
Neg logit contributions
--0.022
 together-0.022
ve-0.022
 carefully-0.020
iced-0.020
Token down
Token ID866
Feature activation+4.81e-02

Pos logit contributions
awks+0.086
 Fro+0.084
 Leave+0.084
vious+0.082
joy+0.081
Neg logit contributions
lf-0.070
her-0.058
ves-0.056
ator-0.054
ve-0.054
Token.
Token ID13
Feature activation-6.10e-03

Pos logit contributions
vious+0.106
awks+0.098
joy+0.098
 Fro+0.095
 Leave+0.094
Neg logit contributions
lf-0.055
pack-0.055
ve-0.052
ator-0.052
--0.050
Token Lily
Token ID20037
Feature activation+3.05e-02

Pos logit contributions
-+0.010
 carefully+0.010
pack+0.009
 together+0.009
lf+0.009
Neg logit contributions
vious-0.011
awks-0.011
 Leave-0.011
irms-0.010
trained-0.010
Token ran
Token ID4966
Feature activation+5.12e-02

Pos logit contributions
vious+0.056
 Leave+0.053
awks+0.051
 Fro+0.051
 regretted+0.049
Neg logit contributions
ator-0.045
pack-0.045
 together-0.044
--0.042
 work-0.042
Token down
Token ID866
Feature activation+4.94e-02

Pos logit contributions
vious+0.103
awks+0.100
 Fro+0.095
 Leave+0.095
joy+0.094
Neg logit contributions
lf-0.071
ator-0.066
ve-0.061
--0.060
iced-0.059
Token dropped
Token ID5710
Feature activation+4.42e-03

Pos logit contributions
vious+0.009
awks+0.009
 Leave+0.009
irms+0.008
trained+0.008
Neg logit contributions
--0.007
 carefully-0.006
 together-0.006
pack-0.006
"-0.006
Token the
Token ID262
Feature activation+2.42e-02

Pos logit contributions
vious+0.011
awks+0.011
 Leave+0.010
irms+0.010
trained+0.010
Neg logit contributions
--0.004
 carefully-0.004
pack-0.004
 together-0.004
lf-0.004
Token soap
Token ID19533
Feature activation-1.75e-02

Pos logit contributions
vious+0.049
awks+0.047
 Leave+0.043
irms+0.043
Leave+0.042
Neg logit contributions
--0.034
 carefully-0.034
"-0.032
pack-0.032
ve-0.032
Token and
Token ID290
Feature activation+8.69e-03

Pos logit contributions
 carefully+0.016
-+0.016
pack+0.016
ve+0.015
lf+0.015
Neg logit contributions
vious-0.043
awks-0.041
 Leave-0.040
Leave-0.038
irms-0.038
Token ran
Token ID4966
Feature activation+5.19e-02

Pos logit contributions
vious+0.021
awks+0.020
 Leave+0.020
trained+0.019
joy+0.019
Neg logit contributions
 carefully-0.010
--0.010
 together-0.009
 next-0.009
 work-0.008
Token to
Token ID284
Feature activation+1.77e-02

Pos logit contributions
vious+0.120
awks+0.114
 Leave+0.111
irms+0.107
Leave+0.104
Neg logit contributions
lf-0.062
ator-0.058
--0.056
ve-0.055
 carefully-0.055
Token the
Token ID262
Feature activation+2.55e-02

Pos logit contributions
awks+0.037
vious+0.037
 Leave+0.036
 Fro+0.035
irms+0.034
Neg logit contributions
--0.024
lf-0.022
ve-0.022
 carefully-0.022
ator-0.022
Token oven
Token ID14361
Feature activation-7.77e-02

Pos logit contributions
vious+0.046
awks+0.045
 Leave+0.044
irms+0.042
Leave+0.041
Neg logit contributions
--0.041
 carefully-0.039
ve-0.038
 together-0.038
"-0.036
Token.
Token ID13
Feature activation-4.96e-03

Pos logit contributions
ve+0.074
ator+0.065
 carefully+0.064
lf+0.064
-+0.064
Neg logit contributions
vious-0.190
awks-0.187
 Leave-0.180
irms-0.174
joy-0.169
Token She
Token ID1375
Feature activation+4.90e-03

Pos logit contributions
 carefully+0.008
-+0.008
 together+0.007
"+0.007
 next+0.007
Neg logit contributions
vious-0.010
awks-0.009
 Leave-0.009
trained-0.009
irms-0.009
Token tried
Token ID3088
Feature activation-2.27e-02

Pos logit contributions
vious+0.009
awks+0.009
 Leave+0.009
irms+0.008
trained+0.008
Neg logit contributions
--0.008
 carefully-0.008
 together-0.007
ator-0.007
pack-0.007
Token,
Token ID11
Feature activation-7.44e-03

Pos logit contributions
 carefully+0.063
pack+0.062
-+0.058
 together+0.057
her+0.055
Neg logit contributions
vious-0.134
 Leave-0.132
awks-0.129
 Fro-0.123
trained-0.122
Token Sara
Token ID24799
Feature activation+2.17e-02

Pos logit contributions
-+0.013
 carefully+0.012
 together+0.012
pack+0.012
lf+0.011
Neg logit contributions
vious-0.013
awks-0.013
 Leave-0.013
 Fro-0.012
trained-0.011
Token,
Token ID11
Feature activation-8.43e-03

Pos logit contributions
vious+0.053
awks+0.051
 Leave+0.051
 Fro+0.049
trained+0.048
Neg logit contributions
 carefully-0.021
pack-0.020
lf-0.018
ve-0.018
ator-0.018
Token this
Token ID428
Feature activation-3.88e-02

Pos logit contributions
-+0.011
 carefully+0.011
 together+0.010
pack+0.010
"+0.010
Neg logit contributions
vious-0.018
awks-0.018
 Leave-0.017
irms-0.016
trained-0.016
Token is
Token ID318
Feature activation+6.83e-03

Pos logit contributions
 carefully+0.047
pack+0.044
 together+0.043
-+0.043
lf+0.041
Neg logit contributions
vious-0.089
awks-0.086
 Leave-0.083
 Fro-0.078
irms-0.077
Token your
Token ID534
Feature activation+2.01e-02

Pos logit contributions
vious+0.014
awks+0.014
 Leave+0.013
 Fro+0.013
irms+0.013
Neg logit contributions
--0.009
pack-0.009
ve-0.009
 carefully-0.009
lf-0.008
Token ring
Token ID5858
Feature activation-1.56e-02

Pos logit contributions
awks+0.040
vious+0.040
 Leave+0.038
 Fro+0.037
joy+0.036
Neg logit contributions
 carefully-0.028
 together-0.027
lf-0.027
pack-0.026
--0.026
Token.
Token ID13
Feature activation-2.94e-02

Pos logit contributions
pack+0.015
 carefully+0.015
ve+0.014
lf+0.013
her+0.013
Neg logit contributions
vious-0.037
 Leave-0.034
awks-0.034
 Fro-0.032
trained-0.032
Token Now
Token ID2735
Feature activation-1.20e-01

Pos logit contributions
 carefully+0.045
-+0.045
 together+0.042
"+0.041
pack+0.040
Neg logit contributions
vious-0.059
awks-0.057
 Leave-0.053
trained-0.052
irms-0.052
Token we
Token ID356
Feature activation+2.73e-02

Pos logit contributions
-+0.136
 carefully+0.133
pack+0.126
ve+0.122
 together+0.121
Neg logit contributions
vious-0.272
 Leave-0.265
awks-0.265
trained-0.246
 Fro-0.244
Token are
Token ID389
Feature activation+1.27e-02

Pos logit contributions
vious+0.049
 Leave+0.044
awks+0.044
irms+0.038
trained+0.037
Neg logit contributions
--0.049
lf-0.046
 carefully-0.045
ator-0.044
her-0.044
Token she
Token ID673
Feature activation+1.65e-02

Pos logit contributions
vious+0.146
awks+0.142
 Leave+0.140
 Fro+0.132
joy+0.131
Neg logit contributions
pack-0.112
--0.112
 carefully-0.107
ator-0.104
ve-0.101
Token decided
Token ID3066
Feature activation+1.05e-02

Pos logit contributions
vious+0.028
awks+0.025
 Leave+0.025
 Fro+0.024
trained+0.023
Neg logit contributions
--0.028
ator-0.028
pack-0.027
 together-0.027
 work-0.026
Token to
Token ID284
Feature activation+1.58e-02

Pos logit contributions
vious+0.023
 Leave+0.022
awks+0.022
joy+0.021
Leave+0.021
Neg logit contributions
ator-0.012
--0.012
pack-0.012
lf-0.011
 carefully-0.011
Token sit
Token ID1650
Feature activation-1.98e-02

Pos logit contributions
vious+0.029
 Leave+0.029
awks+0.027
 Fro+0.026
Leave+0.025
Neg logit contributions
--0.025
ve-0.025
 together-0.024
 carefully-0.022
her-0.022
Token in
Token ID287
Feature activation-2.00e-02

Pos logit contributions
lf+0.025
ator+0.023
 work+0.023
 project+0.023
ve+0.022
Neg logit contributions
vious-0.040
awks-0.039
joy-0.038
irms-0.037
Leave-0.035
Token the
Token ID262
Feature activation+1.72e-02

Pos logit contributions
ve+0.028
-+0.026
iced+0.026
ator+0.025
pack+0.025
Neg logit contributions
joy-0.039
vious-0.038
awks-0.038
 Leave-0.036
 Fro-0.035
Token grass
Token ID8701
Feature activation+8.69e-03

Pos logit contributions
vious+0.036
awks+0.034
 Leave+0.033
irms+0.031
trained+0.031
Neg logit contributions
--0.025
 carefully-0.025
 together-0.023
"-0.022
pack-0.022
Token and
Token ID290
Feature activation-1.62e-03

Pos logit contributions
vious+0.023
awks+0.021
 Leave+0.020
 Fro+0.020
rified+0.020
Neg logit contributions
--0.007
 carefully-0.007
ve-0.007
lf-0.007
pack-0.007
Token close
Token ID1969
Feature activation-1.19e-02

Pos logit contributions
-+0.002
 carefully+0.002
ve+0.002
"+0.002
 together+0.002
Neg logit contributions
vious-0.003
awks-0.003
 Leave-0.003
irms-0.003
trained-0.003
Token her
Token ID607
Feature activation-1.99e-02

Pos logit contributions
 carefully+0.010
-+0.010
 together+0.009
"+0.008
 next+0.008
Neg logit contributions
vious-0.032
awks-0.031
 Leave-0.030
trained-0.029
irms-0.029
Token eyes
Token ID2951
Feature activation+9.18e-02

Pos logit contributions
pack+0.045
ator+0.044
ve+0.043
lf+0.042
 together+0.042
Neg logit contributions
awks-0.022
 Leave-0.019
irms-0.018
vious-0.018
joy-0.018
Token and
Token ID290
Feature activation+1.77e-02

Pos logit contributions
vious+0.004
awks+0.004
 Leave+0.004
irms+0.004
trained+0.004
Neg logit contributions
 carefully-0.001
--0.001
 together-0.001
"-0.001
pack-0.001
Token broken
Token ID5445
Feature activation-2.51e-03

Pos logit contributions
vious+0.032
awks+0.030
 Leave+0.029
trained+0.028
Leave+0.028
Neg logit contributions
--0.029
pack-0.029
ator-0.028
 carefully-0.028
 together-0.027
Token.
Token ID13
Feature activation+1.35e-02

Pos logit contributions
pack+0.002
ve+0.002
 carefully+0.002
lf+0.002
"+0.002
Neg logit contributions
vious-0.006
awks-0.006
 Leave-0.006
rified-0.006
irms-0.006
Token Some
Token ID2773
Feature activation+1.36e-02

Pos logit contributions
vious+0.026
awks+0.025
 Leave+0.023
trained+0.023
irms+0.022
Neg logit contributions
 carefully-0.022
--0.022
 together-0.021
"-0.021
 next-0.020
Token of
Token ID286
Feature activation-4.87e-03

Pos logit contributions
vious+0.029
awks+0.028
 Leave+0.027
trained+0.026
Leave+0.025
Neg logit contributions
 carefully-0.019
--0.018
pack-0.018
 together-0.017
"-0.017
Token the
Token ID262
Feature activation+2.88e-02

Pos logit contributions
-+0.006
pack+0.006
 carefully+0.006
ve+0.006
 together+0.006
Neg logit contributions
vious-0.010
awks-0.010
 Leave-0.010
 Fro-0.010
Leave-0.010
Token plan
Token ID1410
Feature activation-9.57e-04

Pos logit contributions
vious+0.039
awks+0.038
 Leave+0.036
Leave+0.033
trained+0.032
Neg logit contributions
--0.062
 carefully-0.062
pack-0.061
 together-0.060
lf-0.059
Tokenks
Token ID591
Feature activation+2.19e-02

Pos logit contributions
ator+0.001
pack+0.001
"+0.001
ve+0.001
-+0.001
Neg logit contributions
vious-0.002
 Leave-0.002
awks-0.002
Leave-0.002
 regretted-0.002
Token were
Token ID547
Feature activation+7.86e-03

Pos logit contributions
vious+0.053
 Leave+0.049
awks+0.047
Leave+0.045
rified+0.044
Neg logit contributions
pack-0.026
ator-0.025
lf-0.023
 carefully-0.021
ve-0.021
Token missing
Token ID4814
Feature activation+1.91e-02

Pos logit contributions
vious+0.014
awks+0.014
 Leave+0.013
 hesitation+0.012
joy+0.012
Neg logit contributions
ator-0.012
pack-0.012
--0.012
lf-0.012
 carefully-0.011
Token and
Token ID290
Feature activation+1.95e-02

Pos logit contributions
 Leave+0.041
vious+0.040
awks+0.039
rified+0.037
 Fro+0.036
Neg logit contributions
pack-0.023
lf-0.023
ator-0.021
ve-0.021
 carefully-0.021
Token about
Token ID546
Feature activation-3.94e-02

Pos logit contributions
vious+0.208
awks+0.197
 Leave+0.192
irms+0.178
trained+0.176
Neg logit contributions
--0.209
 carefully-0.201
pack-0.196
ve-0.196
"-0.190
Token the
Token ID262
Feature activation+1.09e-02

Pos logit contributions
-+0.041
 carefully+0.041
 together+0.036
"+0.036
pack+0.035
Neg logit contributions
vious-0.097
awks-0.095
 Leave-0.092
irms-0.088
trained-0.088
Token old
Token ID1468
Feature activation-1.84e-02

Pos logit contributions
vious+0.024
awks+0.023
 Leave+0.022
trained+0.021
irms+0.021
Neg logit contributions
 carefully-0.015
--0.015
 together-0.014
"-0.013
pack-0.013
Token man
Token ID582
Feature activation-2.94e-02

Pos logit contributions
 carefully+0.020
-+0.020
 together+0.018
pack+0.018
"+0.018
Neg logit contributions
vious-0.045
awks-0.043
 Leave-0.042
trained-0.040
irms-0.040
Token.
Token ID13
Feature activation-2.01e-02

Pos logit contributions
 carefully+0.028
ator+0.027
pack+0.027
ve+0.027
-+0.026
Neg logit contributions
vious-0.072
 Leave-0.069
awks-0.068
trained-0.065
 Fro-0.062
Token They
Token ID1119
Feature activation-7.73e-03

Pos logit contributions
 carefully+0.022
 together+0.022
-+0.021
"+0.021
 next+0.020
Neg logit contributions
vious-0.050
awks-0.047
trained-0.044
 Leave-0.044
rified-0.042
Token were
Token ID547
Feature activation-9.27e-03

Pos logit contributions
 carefully+0.011
-+0.011
 together+0.010
"+0.010
 work+0.010
Neg logit contributions
vious-0.016
awks-0.016
 Leave-0.015
irms-0.015
trained-0.015
Token happy
Token ID3772
Feature activation-2.22e-03

Pos logit contributions
-+0.011
 carefully+0.011
 together+0.010
"+0.010
pack+0.010
Neg logit contributions
vious-0.021
awks-0.021
 Leave-0.020
trained-0.019
irms-0.019
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
 carefully+0.001
-+0.001
pack+0.001
"+0.001
ve+0.001
Neg logit contributions
vious-0.006
awks-0.006
 Leave-0.006
joy-0.006
irms-0.006
Token<|endoftext|>
Token ID50256
Feature activation-1.18e-03

Pos logit contributions
 together+0.014
"+0.014
 carefully+0.014
-+0.013
 next+0.013
Neg logit contributions
vious-0.057
awks-0.054
trained-0.052
 Leave-0.050
rified-0.050
TokenTim
Token ID14967
Feature activation-5.61e-02

Pos logit contributions
 carefully+0.002
-+0.002
pack+0.002
 together+0.002
lf+0.002
Neg logit contributions
vious-0.002
awks-0.002
 Leave-0.002
irms-0.002
joy-0.002
Token to
Token ID284
Feature activation+2.62e-02

Pos logit contributions
vious+0.042
awks+0.040
 Leave+0.039
 Fro+0.038
 ze+0.037
Neg logit contributions
ve-0.024
lf-0.023
ator-0.023
pack-0.021
--0.021
Token her
Token ID607
Feature activation-1.16e-03

Pos logit contributions
awks+0.055
vious+0.055
 Leave+0.053
 Fro+0.053
irms+0.050
Neg logit contributions
--0.035
ve-0.032
pack-0.032
 carefully-0.031
"-0.031
Token mom
Token ID1995
Feature activation+2.37e-03

Pos logit contributions
-+0.001
pack+0.001
ve+0.001
"+0.001
 carefully+0.001
Neg logit contributions
awks-0.003
vious-0.003
 Leave-0.003
 Fro-0.003
joy-0.003
Token.
Token ID13
Feature activation+1.07e-03

Pos logit contributions
vious+0.006
awks+0.005
 Leave+0.005
trained+0.005
 Fro+0.005
Neg logit contributions
ve-0.002
--0.002
 carefully-0.002
lf-0.002
pack-0.002
Token "
Token ID366
Feature activation-4.65e-02

Pos logit contributions
vious+0.002
 Leave+0.002
awks+0.002
 Fro+0.002
irms+0.002
Neg logit contributions
--0.002
 carefully-0.002
ve-0.002
pack-0.002
 together-0.002
TokenLook
Token ID8567
Feature activation+2.11e-03

Pos logit contributions
 carefully+0.047
-+0.047
 together+0.042
"+0.041
pack+0.040
Neg logit contributions
vious-0.118
awks-0.114
 Leave-0.111
trained-0.106
irms-0.106
Token,
Token ID11
Feature activation+2.24e-03

Pos logit contributions
vious+0.005
awks+0.005
 Leave+0.005
 Fro+0.004
irms+0.004
Neg logit contributions
 carefully-0.002
pack-0.002
her-0.002
--0.002
ator-0.002
Token mom
Token ID1995
Feature activation-1.12e-02

Pos logit contributions
vious+0.006
awks+0.005
 Leave+0.005
trained+0.005
irms+0.005
Neg logit contributions
 carefully-0.002
--0.002
 together-0.002
pack-0.002
"-0.002
Token,
Token ID11
Feature activation-2.01e-03

Pos logit contributions
 carefully+0.013
-+0.012
pack+0.011
ve+0.011
 together+0.011
Neg logit contributions
vious-0.026
awks-0.025
 Leave-0.024
trained-0.023
joy-0.023
Token I
Token ID314
Feature activation+5.84e-02

Pos logit contributions
 carefully+0.003
-+0.003
 together+0.002
"+0.002
 next+0.002
Neg logit contributions
vious-0.004
awks-0.004
 Leave-0.004
trained-0.004
irms-0.004
Token made
Token ID925
Feature activation+5.79e-03

Pos logit contributions
vious+0.120
awks+0.112
 Leave+0.110
trained+0.101
irms+0.101
Neg logit contributions
 carefully-0.088
--0.086
 together-0.080
lf-0.079
pack-0.078
Token eating
Token ID6600
Feature activation+4.72e-02

Pos logit contributions
awks+0.012
vious+0.012
joy+0.012
 Leave+0.012
 Fro+0.012
Neg logit contributions
--0.012
pack-0.012
 carefully-0.012
ve-0.011
ator-0.011
Token the
Token ID262
Feature activation+7.19e-03

Pos logit contributions
vious+0.096
awks+0.093
irms+0.086
 Fro+0.086
 Leave+0.086
Neg logit contributions
pack-0.064
ve-0.061
ator-0.057
lf-0.056
--0.056
Token cake
Token ID12187
Feature activation-5.41e-02

Pos logit contributions
vious+0.014
awks+0.013
 Leave+0.012
irms+0.012
joy+0.011
Neg logit contributions
 carefully-0.011
pack-0.011
--0.011
 together-0.011
"-0.011
Token.
Token ID13
Feature activation-1.26e-02

Pos logit contributions
pack+0.058
ve+0.056
-+0.050
ator+0.050
 carefully+0.048
Neg logit contributions
vious-0.122
awks-0.108
 Fro-0.102
 Leave-0.101
trained-0.099
Token
Token ID198
Feature activation-7.93e-03

Pos logit contributions
-+0.021
 carefully+0.020
pack+0.019
lf+0.019
ve+0.018
Neg logit contributions
vious-0.024
awks-0.023
 Leave-0.023
irms-0.021
 Fro-0.021
Token
Token ID198
Feature activation-8.57e-03

Pos logit contributions
-+0.010
 carefully+0.009
ator+0.009
pack+0.009
 work+0.009
Neg logit contributions
 Leave-0.017
vious-0.017
awks-0.016
 Fro-0.016
irms-0.015
TokenL
Token ID43
Feature activation-9.54e-03

Pos logit contributions
-+0.015
ator+0.014
 work+0.014
 creation+0.013
 carefully+0.013
Neg logit contributions
 Leave-0.015
vious-0.014
Leave-0.014
awks-0.013
Got-0.012
Tokenily
Token ID813
Feature activation-4.47e-02

Pos logit contributions
 carefully+0.018
-+0.018
 together+0.017
pack+0.016
"+0.016
Neg logit contributions
vious-0.016
awks-0.015
 Leave-0.014
irms-0.013
trained-0.013
Token was
Token ID373
Feature activation-7.57e-03

Pos logit contributions
ator+0.063
 together+0.061
pack+0.059
-+0.058
ve+0.058
Neg logit contributions
vious-0.091
 Leave-0.081
awks-0.080
 regretted-0.076
irms-0.075
Token sad
Token ID6507
Feature activation+8.16e-02

Pos logit contributions
-+0.013
 carefully+0.013
 together+0.012
pack+0.012
"+0.012
Neg logit contributions
vious-0.013
awks-0.013
 Leave-0.012
irms-0.011
trained-0.011
Token that
Token ID326
Feature activation-2.91e-02

Pos logit contributions
vious+0.219
awks+0.215
 Leave+0.209
trained+0.203
irms+0.199
Neg logit contributions
--0.073
 carefully-0.073
 together-0.065
 next-0.059
"-0.059
Token enough
Token ID1576
Feature activation+6.22e-02

Pos logit contributions
vious+0.048
awks+0.046
 Leave+0.045
joy+0.043
trained+0.042
Neg logit contributions
--0.026
 carefully-0.026
ator-0.024
 together-0.024
pack-0.024
Token money
Token ID1637
Feature activation-3.10e-03

Pos logit contributions
vious+0.143
awks+0.135
 Leave+0.131
joy+0.126
irms+0.124
Neg logit contributions
 carefully-0.069
--0.068
ator-0.067
lf-0.067
pack-0.067
Token.
Token ID13
Feature activation+3.84e-04

Pos logit contributions
lf+0.003
ator+0.003
 carefully+0.003
pack+0.003
-+0.003
Neg logit contributions
vious-0.007
 Leave-0.007
trained-0.006
awks-0.006
joy-0.006
Token The
Token ID383
Feature activation-1.86e-03

Pos logit contributions
vious+0.001
awks+0.001
 Leave+0.001
trained+0.001
Leave+0.001
Neg logit contributions
 carefully-0.000
--0.000
"-0.000
 together-0.000
 next-0.000
Token store
Token ID3650
Feature activation-4.55e-02

Pos logit contributions
-+0.003
 carefully+0.003
 together+0.002
pack+0.002
"+0.002
Neg logit contributions
vious-0.004
awks-0.004
 Leave-0.004
irms-0.003
joy-0.003
Token manager
Token ID4706
Feature activation-8.15e-02

Pos logit contributions
 carefully+0.056
pack+0.055
ve+0.054
-+0.053
 work+0.053
Neg logit contributions
vious-0.093
 Leave-0.086
trained-0.083
rified-0.082
 regretted-0.081
Token offered
Token ID4438
Feature activation-1.52e-03

Pos logit contributions
-+0.106
 carefully+0.103
pack+0.101
ator+0.100
"+0.099
Neg logit contributions
vious-0.170
 Leave-0.154
awks-0.147
trained-0.145
rified-0.144
Token to
Token ID284
Feature activation+1.45e-02

Pos logit contributions
-+0.002
 carefully+0.002
pack+0.002
 together+0.002
"+0.002
Neg logit contributions
vious-0.003
awks-0.003
 Leave-0.003
trained-0.003
 Fro-0.003
Token lend
Token ID22096
Feature activation+3.26e-02

Pos logit contributions
 Leave+0.025
vious+0.024
 Fro+0.023
awks+0.023
Leave+0.023
Neg logit contributions
--0.023
iced-0.021
 together-0.021
ve-0.021
her-0.020
Token him
Token ID683
Feature activation-2.57e-02

Pos logit contributions
vious+0.070
awks+0.067
 Fro+0.066
 Leave+0.065
trained+0.064
Neg logit contributions
lf-0.040
ator-0.040
pack-0.039
--0.038
ve-0.037
Token some
Token ID617
Feature activation+3.57e-02

Pos logit contributions
lf+0.027
 carefully+0.027
-+0.026
ve+0.026
ator+0.025
Neg logit contributions
vious-0.061
awks-0.057
 Leave-0.057
trained-0.054
 Fro-0.054
Token Her
Token ID2332
Feature activation-1.09e-02

Pos logit contributions
-+0.016
 carefully+0.015
 together+0.014
pack+0.014
"+0.013
Neg logit contributions
vious-0.028
awks-0.027
 Leave-0.026
irms-0.025
trained-0.025
Token mother
Token ID2802
Feature activation+2.13e-02

Pos logit contributions
pack+0.020
 together+0.019
her+0.018
"+0.018
lf+0.018
Neg logit contributions
joy-0.018
awks-0.017
 hesitation-0.016
irk-0.015
vious-0.015
Token hugged
Token ID41130
Feature activation+1.58e-02

Pos logit contributions
vious+0.029
 Leave+0.025
 Fro+0.025
joy+0.024
 mun+0.024
Neg logit contributions
pack-0.040
 work-0.039
ator-0.039
her-0.038
 together-0.037
Token her
Token ID607
Feature activation-1.68e-02

Pos logit contributions
vious+0.041
awks+0.040
 Leave+0.039
trained+0.037
irms+0.037
Neg logit contributions
 carefully-0.015
--0.015
 together-0.013
pack-0.013
"-0.012
Token and
Token ID290
Feature activation+1.89e-03

Pos logit contributions
-+0.014
 carefully+0.014
pack+0.014
lf+0.013
 together+0.013
Neg logit contributions
vious-0.043
awks-0.042
 Leave-0.040
irms-0.039
joy-0.039
Token said
Token ID531
Feature activation-3.69e-02

Pos logit contributions
vious+0.004
awks+0.004
 Leave+0.004
trained+0.003
 Fro+0.003
Neg logit contributions
--0.003
 carefully-0.003
pack-0.002
"-0.002
ve-0.002
Token,
Token ID11
Feature activation+4.31e-03

Pos logit contributions
pack+0.036
 carefully+0.036
-+0.035
ator+0.033
lf+0.033
Neg logit contributions
vious-0.088
 Leave-0.087
awks-0.083
joy-0.080
 Fro-0.079
Token "
Token ID366
Feature activation-4.41e-02

Pos logit contributions
 Leave+0.008
joy+0.007
vious+0.007
awks+0.007
 Fro+0.007
Neg logit contributions
--0.006
pack-0.006
ator-0.006
 work-0.006
 carefully-0.006
TokenI
Token ID40
Feature activation+3.05e-02

Pos logit contributions
 carefully+0.065
-+0.064
 together+0.060
"+0.060
 next+0.059
Neg logit contributions
vious-0.090
awks-0.089
 Leave-0.085
irms-0.082
trained-0.081
Token'm
Token ID1101
Feature activation+6.04e-03

Pos logit contributions
vious+0.055
awks+0.050
 Leave+0.049
 Fro+0.045
joy+0.045
Neg logit contributions
--0.053
 carefully-0.052
lf-0.050
 together-0.050
pack-0.049
Token glad
Token ID9675
Feature activation+4.58e-02

Pos logit contributions
vious+0.008
awks+0.008
 Leave+0.008
trained+0.007
 Fro+0.007
Neg logit contributions
--0.013
 carefully-0.013
pack-0.012
 together-0.012
lf-0.012
Token her
Token ID607
Feature activation-1.44e-02

Pos logit contributions
ve+0.011
pack+0.010
-+0.009
 carefully+0.009
ator+0.009
Neg logit contributions
awks-0.016
vious-0.016
 ze-0.015
 Fro-0.015
 Leave-0.015
Token favorite
Token ID4004
Feature activation-1.89e-02

Pos logit contributions
ve+0.017
pack+0.017
"+0.017
 carefully+0.017
-+0.016
Neg logit contributions
awks-0.030
vious-0.030
 Fro-0.028
 ze-0.028
 Leave-0.028
Token show
Token ID905
Feature activation+2.77e-03

Pos logit contributions
 carefully+0.027
ve+0.027
pack+0.026
her+0.026
 together+0.026
Neg logit contributions
vious-0.037
awks-0.036
 Leave-0.034
 ze-0.033
 pige-0.033
Token on
Token ID319
Feature activation+8.48e-03

Pos logit contributions
vious+0.007
 Leave+0.007
awks+0.006
 Fro+0.006
 ze+0.006
Neg logit contributions
ve-0.002
 carefully-0.002
--0.002
pack-0.002
ator-0.002
Token the
Token ID262
Feature activation+1.55e-02

Pos logit contributions
 ze+0.015
vious+0.015
awks+0.014
 Leave+0.014
 Fro+0.014
Neg logit contributions
ve-0.014
pack-0.012
--0.012
 together-0.012
ator-0.012
Token TV
Token ID3195
Feature activation-1.10e-01

Pos logit contributions
vious+0.031
awks+0.030
 ze+0.028
 Leave+0.027
 pige+0.027
Neg logit contributions
 carefully-0.022
 together-0.022
--0.021
pack-0.021
"-0.021
Token.
Token ID13
Feature activation-1.10e-02

Pos logit contributions
ve+0.118
-+0.100
 carefully+0.098
pack+0.097
ator+0.094
Neg logit contributions
vious-0.256
 Leave-0.239
awks-0.237
 ze-0.227
 crowds-0.225
Token But
Token ID887
Feature activation-1.24e-02

Pos logit contributions
-+0.025
ve+0.025
 carefully+0.025
 together+0.024
lf+0.024
Neg logit contributions
 Leave-0.013
vious-0.012
awks-0.011
 Nice-0.010
irms-0.010
Token it
Token ID340
Feature activation-4.84e-02

Pos logit contributions
ve+0.020
ator+0.019
pack+0.018
iced+0.018
her+0.018
Neg logit contributions
vious-0.021
 Leave-0.020
 Fro-0.020
Leave-0.019
awks-0.019
Token was
Token ID373
Feature activation-1.16e-02

Pos logit contributions
pack+0.046
 together+0.045
 carefully+0.044
ator+0.043
ve+0.041
Neg logit contributions
vious-0.121
 Leave-0.109
 Fro-0.108
 stub-0.108
awks-0.107
Token too
Token ID1165
Feature activation-2.21e-02

Pos logit contributions
-+0.014
 carefully+0.013
pack+0.013
 together+0.012
ator+0.012
Neg logit contributions
vious-0.027
awks-0.025
 Leave-0.024
 Fro-0.024
irms-0.024
Token,"
Token ID553
Feature activation+4.39e-02

Pos logit contributions
ator+0.089
pack+0.088
 carefully+0.086
lf+0.085
-+0.082
Neg logit contributions
vious-0.126
awks-0.118
 Leave-0.113
irms-0.106
 pige-0.104
Token Tom
Token ID4186
Feature activation+6.64e-03

Pos logit contributions
vious+0.080
awks+0.077
 Leave+0.074
trained+0.069
irms+0.069
Neg logit contributions
 carefully-0.077
--0.076
 together-0.072
"-0.070
 next-0.069
Token said
Token ID531
Feature activation-5.35e-02

Pos logit contributions
vious+0.014
awks+0.013
 Leave+0.013
trained+0.012
irms+0.012
Neg logit contributions
ator-0.009
--0.009
 carefully-0.009
pack-0.009
lf-0.008
Token.
Token ID13
Feature activation-1.68e-02

Pos logit contributions
lf+0.048
ator+0.047
pack+0.045
ve+0.043
 carefully+0.041
Neg logit contributions
vious-0.134
awks-0.125
 Leave-0.124
trained-0.123
irms-0.120
Token "
Token ID366
Feature activation-5.61e-02

Pos logit contributions
pack+0.028
-+0.028
 carefully+0.028
ve+0.027
ator+0.027
Neg logit contributions
vious-0.029
 Leave-0.028
awks-0.028
 Fro-0.026
irms-0.026
TokenIt
Token ID1026
Feature activation-1.10e-01

Pos logit contributions
-+0.088
lf+0.087
 carefully+0.085
ator+0.085
 work+0.080
Neg logit contributions
vious-0.109
awks-0.103
 Leave-0.102
Leave-0.098
trained-0.093
Token has
Token ID468
Feature activation+2.39e-02

Pos logit contributions
pack+0.164
lf+0.163
 carefully+0.162
-+0.159
ator+0.157
Neg logit contributions
vious-0.220
 Leave-0.201
awks-0.200
trained-0.181
 ze-0.178
Token fast
Token ID3049
Feature activation-1.53e-02

Pos logit contributions
vious+0.059
awks+0.057
 Leave+0.055
trained+0.054
irms+0.053
Neg logit contributions
--0.025
 carefully-0.025
 together-0.023
"-0.021
 next-0.021
Token cars
Token ID5006
Feature activation-2.29e-02

Pos logit contributions
-+0.021
 carefully+0.020
 together+0.019
"+0.018
 next+0.018
Neg logit contributions
vious-0.034
awks-0.032
 Leave-0.031
trained-0.030
irms-0.030
Token and
Token ID290
Feature activation-1.07e-02

Pos logit contributions
ator+0.027
 carefully+0.026
pack+0.026
lf+0.025
ve+0.025
Neg logit contributions
vious-0.054
 Leave-0.049
awks-0.048
trained-0.044
 ze-0.043
Token loud
Token ID7812
Feature activation-4.46e-03

Pos logit contributions
-+0.017
 carefully+0.016
pack+0.016
 together+0.015
ve+0.015
Neg logit contributions
vious-0.021
awks-0.020
 Leave-0.020
irms-0.018
trained-0.018
Token Lily
Token ID20037
Feature activation+1.86e-02

Pos logit contributions
vious+0.076
awks+0.073
 Leave+0.070
trained+0.066
irms+0.066
Neg logit contributions
 carefully-0.059
--0.059
 together-0.055
"-0.053
pack-0.053
Token and
Token ID290
Feature activation-2.31e-03

Pos logit contributions
vious+0.052
awks+0.051
 Leave+0.050
trained+0.048
irms+0.048
Neg logit contributions
 carefully-0.013
--0.013
 together-0.011
"-0.011
pack-0.010
Token says
Token ID1139
Feature activation-2.00e-02

Pos logit contributions
 carefully+0.004
 work+0.003
 together+0.003
-+0.003
 next+0.003
Neg logit contributions
vious-0.005
awks-0.004
 Leave-0.004
trained-0.004
rified-0.004
Token,
Token ID11
Feature activation-2.46e-03

Pos logit contributions
 carefully+0.018
-+0.018
pack+0.018
lf+0.016
ator+0.016
Neg logit contributions
vious-0.051
 Leave-0.049
awks-0.049
joy-0.046
trained-0.046
Token "
Token ID366
Feature activation-5.38e-02

Pos logit contributions
-+0.003
pack+0.003
 carefully+0.003
ator+0.003
 work+0.003
Neg logit contributions
 Leave-0.005
vious-0.005
awks-0.005
joy-0.005
Leave-0.005
TokenIt
Token ID1026
Feature activation-1.08e-01

Pos logit contributions
 carefully+0.066
-+0.065
"+0.062
 together+0.061
 next+0.061
Neg logit contributions
vious-0.124
awks-0.122
 Leave-0.117
irms-0.113
trained-0.112
Token's
Token ID338
Feature activation+1.36e-02

Pos logit contributions
lf+0.156
pack+0.153
ator+0.150
 carefully+0.150
-+0.149
Neg logit contributions
vious-0.217
 Leave-0.203
awks-0.191
trained-0.183
rified-0.175
Token okay
Token ID8788
Feature activation+3.98e-02

Pos logit contributions
vious+0.016
awks+0.016
irms+0.014
 Leave+0.013
 Fro+0.013
Neg logit contributions
pack-0.030
--0.030
lf-0.030
ve-0.030
 carefully-0.030
Token,
Token ID11
Feature activation-5.92e-03

Pos logit contributions
vious+0.086
 Leave+0.086
awks+0.084
 Fro+0.080
joy+0.079
Neg logit contributions
pack-0.047
 carefully-0.044
ator-0.044
her-0.043
lf-0.043
Token sweet
Token ID6029
Feature activation+3.39e-02

Pos logit contributions
 carefully+0.009
-+0.009
 together+0.009
"+0.008
 next+0.008
Neg logit contributions
vious-0.012
awks-0.011
 Leave-0.011
trained-0.010
irms-0.010
Tokenie
Token ID494
Feature activation-1.57e-02

Pos logit contributions
vious+0.074
awks+0.072
 Leave+0.070
trained+0.066
irms+0.066
Neg logit contributions
--0.046
 carefully-0.045
 together-0.041
"-0.041
 next-0.040
Token who
Token ID508
Feature activation-3.81e-03

Pos logit contributions
 carefully+0.011
-+0.011
 together+0.010
"+0.010
pack+0.010
Neg logit contributions
vious-0.018
awks-0.018
 Leave-0.017
trained-0.016
irms-0.016
Token had
Token ID550
Feature activation+3.56e-03

Pos logit contributions
 carefully+0.005
-+0.005
pack+0.004
ator+0.004
 together+0.004
Neg logit contributions
vious-0.008
 Leave-0.008
awks-0.008
trained-0.007
Leave-0.007
Token come
Token ID1282
Feature activation+3.38e-02

Pos logit contributions
vious+0.007
awks+0.007
 Leave+0.007
joy+0.006
Leave+0.006
Neg logit contributions
 carefully-0.005
--0.005
 together-0.005
pack-0.005
"-0.005
Token home
Token ID1363
Feature activation-6.71e-02

Pos logit contributions
 Leave+0.064
awks+0.061
vious+0.060
 Fro+0.060
joy+0.060
Neg logit contributions
lf-0.050
ator-0.043
her-0.042
ves-0.040
 carefully-0.040
Token from
Token ID422
Feature activation+9.23e-03

Pos logit contributions
lf+0.077
ve+0.076
ator+0.074
pack+0.069
abies+0.069
Neg logit contributions
vious-0.134
awks-0.126
joy-0.122
 Fro-0.117
 Leave-0.114
Token work
Token ID670
Feature activation-1.62e-01

Pos logit contributions
awks+0.018
joy+0.018
irms+0.017
vious+0.017
 Leave+0.016
Neg logit contributions
iced-0.011
ve-0.011
--0.010
ator-0.010
"-0.010
Token.
Token ID13
Feature activation-1.81e-02

Pos logit contributions
pack+0.161
ator+0.155
ve+0.152
abies+0.151
iced+0.148
Neg logit contributions
vious-0.342
awks-0.321
joy-0.314
 Leave-0.310
trained-0.306
Token She
Token ID1375
Feature activation-1.49e-02

Pos logit contributions
 carefully+0.025
"+0.025
-+0.025
 together+0.024
 next+0.022
Neg logit contributions
vious-0.040
awks-0.037
trained-0.035
 Leave-0.034
irms-0.034
Token saw
Token ID2497
Feature activation+1.46e-02

Pos logit contributions
 together+0.020
 carefully+0.020
pack+0.020
-+0.020
ator+0.019
Neg logit contributions
vious-0.029
 Leave-0.028
awks-0.028
 Fro-0.026
Leave-0.026
Token the
Token ID262
Feature activation+6.16e-03

Pos logit contributions
vious+0.036
awks+0.035
 Leave+0.034
trained+0.033
irms+0.033
Neg logit contributions
--0.015
 carefully-0.015
 together-0.013
pack-0.013
"-0.013
Token stove
Token ID27635
Feature activation-1.01e-01

Pos logit contributions
vious+0.014
awks+0.013
 Leave+0.013
irms+0.012
trained+0.012
Neg logit contributions
 carefully-0.008
--0.008
 together-0.008
"-0.007
pack-0.007
Token friends
Token ID2460
Feature activation+5.44e-02

Pos logit contributions
vious+0.073
awks+0.070
 Leave+0.069
trained+0.066
irms+0.066
Neg logit contributions
--0.030
 carefully-0.029
 together-0.027
"-0.025
 next-0.024
Token and
Token ID290
Feature activation+1.88e-02

Pos logit contributions
vious+0.148
awks+0.143
 Leave+0.140
joy+0.135
irms+0.133
Neg logit contributions
 carefully-0.039
--0.037
pack-0.035
her-0.033
lf-0.032
Token they
Token ID484
Feature activation+2.43e-02

Pos logit contributions
vious+0.044
awks+0.043
 Leave+0.041
irms+0.039
trained+0.039
Neg logit contributions
 carefully-0.024
--0.023
 together-0.022
 work-0.020
"-0.020
Token all
Token ID477
Feature activation+2.25e-02

Pos logit contributions
vious+0.047
awks+0.045
 Leave+0.043
trained+0.041
irms+0.041
Neg logit contributions
--0.039
 carefully-0.038
 together-0.035
pack-0.035
"-0.035
Token played
Token ID2826
Feature activation+1.79e-02

Pos logit contributions
awks+0.046
vious+0.045
 Leave+0.044
irms+0.042
trained+0.041
Neg logit contributions
 carefully-0.034
--0.033
 together-0.033
 work-0.030
"-0.030
Token together
Token ID1978
Feature activation-1.07e-01

Pos logit contributions
awks+0.047
vious+0.046
irms+0.045
joy+0.045
 Leave+0.045
Neg logit contributions
ator-0.011
pack-0.011
her-0.011
ve-0.010
lf-0.010
Token with
Token ID351
Feature activation+3.06e-02

Pos logit contributions
pack+0.110
-+0.099
ve+0.098
ator+0.097
her+0.093
Neg logit contributions
vious-0.245
awks-0.239
 Fro-0.225
 Leave-0.223
joy-0.222
Token the
Token ID262
Feature activation+3.33e-02

Pos logit contributions
vious+0.071
awks+0.069
 Leave+0.066
trained+0.063
irms+0.063
Neg logit contributions
pack-0.036
--0.035
ve-0.034
 carefully-0.033
"-0.031
Token shiny
Token ID22441
Feature activation+5.04e-02

Pos logit contributions
vious+0.070
awks+0.066
 Leave+0.064
trained+0.060
irms+0.060
Neg logit contributions
--0.049
 carefully-0.048
pack-0.045
 together-0.044
"-0.043
Token trophy
Token ID16383
Feature activation+4.09e-03

Pos logit contributions
vious+0.097
awks+0.087
 Leave+0.082
joy+0.081
irms+0.079
Neg logit contributions
 carefully-0.081
pack-0.078
her-0.077
--0.076
 together-0.075
Token.
Token ID13
Feature activation+1.71e-02

Pos logit contributions
vious+0.011
awks+0.010
 Leave+0.010
trained+0.009
 Fro+0.009
Neg logit contributions
pack-0.003
--0.003
ve-0.003
 carefully-0.003
her-0.003