TOP ACTIVATIONS
MAX = 0.118

knew it was her own fault . The girl
awn ’s fault . The
realised it wasn 't her fault after all . <|endoftext|> Once
them it was all their fault . His friends were so
Be careful not to touch it
ending was no one 's fault , but it left the
it was Tim my 's fault but Tim my felt guilty
. He said , " Be smart and always think before
creature simply replied , " Be kind to others in the
everyone they met : ' Be patient , be kind and
warned the skeleton , ' Be careful ! The dark woods
Be careful , that 's a
" It 's your own fault , you should 've been
and said it was their fault . Mom said they were
it was all Joe 's fault . Joe tried to talk
and it was his own fault . Johnny smiled
smiled and said , " Be careful when drawing , okay
heard a voice . " Be careful ," it said .
and it 's all my fault ," he said , weeping
had told her : " Be have , little one ".

MIN ACTIVATIONS
MIN = -0.095

quickly helped Tim my get unst uck . She said ,
knows how to get things unst uck ." So , they
We 'll help you get unst uck ." But
that the zipper had become unst uck . She was relieved
they didn 't get it unst uck . Just
could help get the car unst uck . The
before all the fur was unst uck ! The
together to get the bird unst uck and freed it from
scared of the cold por cel ain bowl !
Fl oppy saw a mar vell ous view . He also
Benny helped the bear get unst uck and helped the little
your fault . Sometimes things happen that we cannot control .
. She had heard about att ics and wanted to explore
find them ? What will happen in the forest ? <|endoftext|>
first . You look yes fil thy !" The
grandma . How did this happen ?" she asks .
the buttons and make things happen on the screen . She
magic . Maybe it will happen again ." They
I did not mean to happen it ." Mom
's okay . Acc idents happen . Just be more careful

INTERVAL 0.065 - 0.118
CONTAINS 0.042%

couldn 't run away fast enough and was arrested . The
them are very dark . Some of them are very hot
had looked at the tree enough , Toby said goodbye and
cleaned the house and sure enough , her Mom had a
inside and waited . Sure enough , the guard soon saw

INTERVAL 0.012 - 0.065
CONTAINS 15.017%

the forest . The bear saw the cricket eating his b
. She ran to the park to show all the other
, no toys , and no mom . He looked sad
The fox agreed and waited while Woo ly went to get
until it was as wide as a plate . The flower

INTERVAL -0.042 - 0.012
CONTAINS 84.346%

He had a big bed with a soft mattress that he
take Tim my 's ball , but Tim my didn 't
looked closer and he saw lots of little fishes swimming around
deep as he could until he saw a gl immer of
Robert smiled and said , " Me too ". <|endoftext|> Jim

INTERVAL -0.095 - -0.042
CONTAINS 0.595%

of fun together . <|endoftext|> S ara and Ben are friends
out and see what would happen . So ,
and Daddy say , " Vol can oes are not for
" Look , a coin !" Tim said . " Let
Tom said . " By e , Lily and
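
The four INTERVAL / CONTAINS summaries above read like histogram buckets over the feature's activation range. A minimal sketch, assuming that reading, parses the header pairs and checks that the buckets tile the range from MIN to MAX and that their shares add up to 100%. The names `SUMMARY`, `INTERVAL_RE` and `parse_intervals` are illustrative only and are not part of the tool that produced this dump.

```python
import re

# Interval headers copied from the dump above (example contexts omitted).
SUMMARY = """\
INTERVAL 0.065 - 0.118
CONTAINS 0.042%
INTERVAL 0.012 - 0.065
CONTAINS 15.017%
INTERVAL -0.042 - 0.012
CONTAINS 84.346%
INTERVAL -0.095 - -0.042
CONTAINS 0.595%
"""

# A signed decimal number such as -0.095 or 15.017.
NUM = r"[-+]?\d+(?:\.\d+)?"
INTERVAL_RE = re.compile(rf"INTERVAL ({NUM}) - ({NUM})\s+CONTAINS ({NUM})%")


def parse_intervals(text):
    """Return (low, high, percent) tuples, one per INTERVAL/CONTAINS pair."""
    return [tuple(float(g) for g in m.groups()) for m in INTERVAL_RE.finditer(text)]


intervals = sorted(parse_intervals(SUMMARY))
total_percent = sum(percent for _, _, percent in intervals)

# Adjacent buckets should share an edge if they partition the range.
contiguous = all(a[1] == b[0] for a, b in zip(intervals, intervals[1:]))

print(f"buckets={len(intervals)} total={total_percent:.3f}% contiguous={contiguous}")
```

Under that assumption, about 84% of sampled tokens sit in the bucket that straddles zero, and only 0.042% reach the top bucket.
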
Token knew
Token ID2993
Feature activation-1.15e-02

Pos logit contributions
itting+0.013
uttering+0.012
ability+0.011
over+0.011
ooks+0.011
Neg logit contributions
 fastest-0.009
 brakes-0.008
 highest-0.008
 heads-0.007
 fart-0.007
Token it
Token ID340
Feature activation+1.12e-02

Pos logit contributions
 fastest+0.020
 brakes+0.019
 highest+0.019
 hurting+0.018
 ruining+0.017
Neg logit contributions
itting-0.022
uttering-0.020
ability-0.019
over-0.018
ooks-0.017
Token was
Token ID373
Feature activation+7.37e-03

Pos logit contributions
itting+0.021
uttering+0.019
ability+0.019
over+0.019
ooks+0.018
Neg logit contributions
 fastest-0.019
 brakes-0.018
 highest-0.018
 fart-0.017
 hurting-0.016
Token her
Token ID607
Feature activation+9.37e-03

Pos logit contributions
itting+0.015
ability+0.014
uttering+0.013
over+0.013
ooks+0.012
Neg logit contributions
 fastest-0.012
 highest-0.012
 hurting-0.011
 brakes-0.011
 fart-0.011
Token own
Token ID898
Feature activation-9.80e-03

Pos logit contributions
itting+0.019
ability+0.017
uttering+0.017
over+0.016
ooks+0.016
Neg logit contributions
 fastest-0.015
 brakes-0.015
 highest-0.015
 fart-0.013
 heads-0.013
Token fault
Token ID8046
Feature activation+1.04e-01

Pos logit contributions
 fastest+0.016
 brakes+0.015
 highest+0.015
 fart+0.013
 heads+0.013
Neg logit contributions
itting-0.020
uttering-0.018
ability-0.018
ooks-0.017
over-0.017
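
Each per-token record in this dump fuses its labels and values ("Token ID8046", "Feature activation+1.04e-01", " fastest+0.016"). The sketch below is one possible way to split a record back into fields, assuming the layout seen in the record just above: three header lines, then the "Pos logit contributions" and "Neg logit contributions" lists. `parse_record` and `CONTRIB_RE` are hypothetical helper names, not part of the original tool.

```python
import re

# One record copied verbatim from the dump (the first " fault" token).
RECORD = """\
Token fault
Token ID8046
Feature activation+1.04e-01

Pos logit contributions
 fastest+0.016
 brakes+0.015
 highest+0.015
 fart+0.013
 heads+0.013
Neg logit contributions
itting-0.020
uttering-0.018
ability-0.018
ooks-0.017
over-0.017
"""

# A contribution line is a token string fused to a signed decimal value.
CONTRIB_RE = re.compile(r"^(.*?)([+-]\d+\.\d+)$")


def parse_record(text):
    """Split one fused record into token, id, activation and logit lists."""
    lines = text.splitlines()
    record = {
        # Slicing off the label keeps a leading space, which is part of the token.
        "token": lines[0][len("Token"):],
        "token_id": int(lines[1][len("Token ID"):]),
        "activation": float(lines[2][len("Feature activation"):]),
        "pos": [],
        "neg": [],
    }
    bucket = None
    for line in lines[3:]:
        if line == "Pos logit contributions":
            bucket = "pos"
        elif line == "Neg logit contributions":
            bucket = "neg"
        elif bucket and (m := CONTRIB_RE.match(line)):
            record[bucket].append((m.group(1), float(m.group(2))))
    return record


parsed = parse_record(RECORD)
print(repr(parsed["token"]), parsed["token_id"], parsed["activation"])
```

The leading space is kept on purpose: " fastest" and "itting" are different kinds of token (a word start versus a word continuation), and the dump preserves that distinction only through the space.
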
Token.
Token ID13
Feature activation+1.68e-03

Pos logit contributions
itting+0.196
ability+0.188
uttering+0.177
ooks+0.162
over+0.157
Neg logit contributions
 fastest-0.180
 brakes-0.166
 highest-0.163
 hurting-0.162
 fart-0.156
Token
Token ID198
Feature activation-3.90e-03

Pos logit contributions
itting+0.003
uttering+0.003
ability+0.002
over+0.002
adow+0.002
Neg logit contributions
 fastest-0.003
 brakes-0.003
 highest-0.003
 fart-0.003
 nearest-0.003
Token
Token ID198
Feature activation-4.70e-03

Pos logit contributions
 me+0.006
 brakes+0.006
 fastest+0.006
 highest+0.006
 nearest+0.006
Neg logit contributions
itting-0.007
uttering-0.007
ability-0.006
over-0.006
ment-0.006
TokenThe
Token ID464
Feature activation-1.50e-02

Pos logit contributions
 fastest+0.007
 highest+0.006
 brakes+0.006
 nearest+0.006
 myself+0.006
Neg logit contributions
itting-0.009
uttering-0.009
ment-0.008
ability-0.008
adow-0.008
Token girl
Token ID2576
Feature activation+1.43e-02

Pos logit contributions
 fastest+0.022
 brakes+0.022
 highest+0.021
 fart+0.019
 ruining+0.018
Neg logit contributions
itting-0.031
uttering-0.028
ability-0.028
over-0.028
adow-0.026
Tokenawn
Token ID3832
Feature activation+1.61e-02

Pos logit contributions
 fastest+0.001
 highest+0.001
chie+0.001
 brakes+0.001
 heads+0.001
Neg logit contributions
itting-0.000
ability-0.000
 early-0.000
uttering-0.000
over-0.000
Tokenâ
Token ID22940
Feature activation+3.76e-02

Pos logit contributions
itting+0.028
ability+0.025
uttering+0.025
over+0.023
ooks+0.023
Neg logit contributions
 fastest-0.030
 brakes-0.029
 highest-0.029
 fart-0.026
 hurting-0.026
Token
Token ID26391
Feature activation-1.41e-02

Pos logit contributions
itting+0.079
uttering+0.073
ability+0.072
over+0.069
ooks+0.067
Neg logit contributions
 fastest-0.057
 brakes-0.054
 highest-0.053
 fart-0.049
 hurting-0.046
Token
Token ID8151
Feature activation-4.26e-02

Pos logit contributions
 fastest+0.026
 brakes+0.025
 highest+0.025
 fart+0.023
 heads+0.022
Neg logit contributions
itting-0.024
ability-0.022
uttering-0.021
over-0.020
 early-0.019
Tokens
Token ID82
Feature activation-1.43e-02

Pos logit contributions
 fastest+0.069
 highest+0.065
 brakes+0.061
 fart+0.057
 hurting+0.055
Neg logit contributions
itting-0.081
ability-0.077
uttering-0.075
over-0.070
ment-0.070
Token fault
Token ID8046
Feature activation+1.02e-01

Pos logit contributions
 fastest+0.023
 brakes+0.023
 highest+0.022
 heads+0.020
 hurting+0.020
Neg logit contributions
itting-0.029
uttering-0.026
ability-0.026
over-0.025
ooks-0.024
Token.
Token ID13
Feature activation+7.39e-03

Pos logit contributions
itting+0.180
ability+0.173
uttering+0.161
ooks+0.149
over+0.148
Neg logit contributions
 fastest-0.184
 brakes-0.178
 highest-0.171
 hurting-0.166
 heads-0.161
Token
Token ID220
Feature activation-1.29e-02

Pos logit contributions
itting+0.012
uttering+0.010
ability+0.010
over+0.010
adow+0.009
Neg logit contributions
 fastest-0.015
 brakes-0.015
 highest-0.014
 fart-0.014
 hurting-0.013
Token
Token ID198
Feature activation+7.23e-03

Pos logit contributions
 fastest+0.026
 highest+0.024
 brakes+0.024
 fart+0.022
 nearest+0.022
Neg logit contributions
itting-0.020
ability-0.019
uttering-0.018
over-0.017
ment-0.016
Token
Token ID198
Feature activation+6.16e-03

Pos logit contributions
itting+0.012
uttering+0.010
ability+0.010
adow+0.009
over+0.009
Neg logit contributions
 fastest-0.015
 brakes-0.014
 highest-0.014
 fart-0.013
 nearest-0.013
TokenThe
Token ID464
Feature activation-9.78e-03

Pos logit contributions
itting+0.012
uttering+0.011
ability+0.011
adow+0.010
over+0.010
Neg logit contributions
 fastest-0.010
 brakes-0.010
 highest-0.009
 fart-0.009
 nearest-0.008
Token realised
Token ID19169
Feature activation+1.13e-02

Pos logit contributions
 fastest+0.050
 brakes+0.046
 highest+0.046
 fart+0.043
 hurting+0.042
Neg logit contributions
itting-0.070
uttering-0.065
ability-0.060
over-0.059
ooks-0.058
Token it
Token ID340
Feature activation+7.48e-03

Pos logit contributions
itting+0.023
uttering+0.021
ability+0.021
over+0.020
ooks+0.020
Neg logit contributions
 fastest-0.017
 brakes-0.017
 highest-0.016
 hurting-0.016
 ruining-0.015
Token wasn
Token ID2492
Feature activation-6.10e-04

Pos logit contributions
itting+0.013
ability+0.012
uttering+0.012
over+0.012
adow+0.011
Neg logit contributions
 fastest-0.013
 brakes-0.013
 highest-0.013
 fart-0.012
 heads-0.011
Token't
Token ID470
Feature activation-1.29e-02

Pos logit contributions
 brakes+0.001
 heads+0.001
 fastest+0.001
 highest+0.001
 forth+0.001
Neg logit contributions
itting-0.001
uttering-0.001
over-0.001
 about-0.001
 early-0.001
Token her
Token ID607
Feature activation+4.58e-03

Pos logit contributions
 fastest+0.020
 brakes+0.018
 highest+0.018
 hurting+0.017
 fart+0.017
Neg logit contributions
itting-0.028
ability-0.026
uttering-0.026
over-0.024
ooks-0.024
Token fault
Token ID8046
Feature activation+9.93e-02

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.008
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.007
 hurting-0.007
Token after
Token ID706
Feature activation-3.39e-03

Pos logit contributions
itting+0.167
ability+0.157
uttering+0.148
over+0.140
adow+0.136
Neg logit contributions
 fastest-0.191
 brakes-0.180
 highest-0.178
 hurting-0.167
 fart-0.167
Token all
Token ID477
Feature activation-2.04e-03

Pos logit contributions
 fastest+0.006
 brakes+0.006
 highest+0.006
 hurting+0.006
 ruining+0.005
Neg logit contributions
itting-0.006
ability-0.006
uttering-0.006
over-0.005
ooks-0.005
Token.
Token ID13
Feature activation+7.23e-03

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 hurting+0.004
 ruining+0.003
Neg logit contributions
itting-0.003
uttering-0.003
ability-0.003
over-0.003
ooks-0.003
Token<|endoftext|>
Token ID50256
Feature activation+1.32e-02

Pos logit contributions
itting+0.012
uttering+0.012
over+0.011
ability+0.010
 early+0.010
Neg logit contributions
 fastest-0.013
 highest-0.013
 brakes-0.012
 fart-0.011
 nearest-0.011
TokenOnce
Token ID7454
Feature activation-1.13e-02

Pos logit contributions
itting+0.026
uttering+0.024
ooks+0.022
ability+0.022
over+0.021
Neg logit contributions
 fastest-0.021
 highest-0.020
 brakes-0.019
 fart-0.019
 nearest-0.017
Token them
Token ID606
Feature activation+1.73e-02

Pos logit contributions
itting+0.057
uttering+0.052
ability+0.050
over+0.048
adow+0.048
Neg logit contributions
 brakes-0.052
 fastest-0.050
 highest-0.047
 hurting-0.046
 heads-0.045
Token it
Token ID340
Feature activation+8.34e-03

Pos logit contributions
itting+0.028
uttering+0.026
over+0.025
ability+0.025
adow+0.022
Neg logit contributions
 brakes-0.034
 fastest-0.034
 highest-0.032
 hurting-0.031
 ruining-0.030
Token was
Token ID373
Feature activation+8.28e-03

Pos logit contributions
itting+0.016
uttering+0.015
ability+0.015
over+0.014
adow+0.014
Neg logit contributions
 fastest-0.014
 brakes-0.014
 highest-0.013
 fart-0.012
 hurting-0.012
Token all
Token ID477
Feature activation-1.35e-03

Pos logit contributions
itting+0.016
ability+0.014
uttering+0.014
over+0.014
adow+0.013
Neg logit contributions
 fastest-0.014
 brakes-0.014
 highest-0.013
 fart-0.013
 hurting-0.012
Token their
Token ID511
Feature activation+1.25e-02

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 hurting+0.002
 fart+0.002
Neg logit contributions
itting-0.003
over-0.003
uttering-0.003
ability-0.003
ooks-0.003
Token fault
Token ID8046
Feature activation+9.85e-02

Pos logit contributions
itting+0.022
ability+0.019
uttering+0.019
over+0.019
adow+0.018
Neg logit contributions
 fastest-0.023
 brakes-0.023
 highest-0.022
 fart-0.021
 hurting-0.020
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
itting+0.172
ability+0.168
uttering+0.152
adow+0.143
ooks+0.142
Neg logit contributions
 fastest-0.178
 brakes-0.174
 hurting-0.168
 highest-0.163
 heads-0.157
Token His
Token ID2399
Feature activation+5.01e-03

Pos logit contributions
itting+0.019
uttering+0.018
ability+0.017
adow+0.017
over+0.016
Neg logit contributions
 brakes-0.020
 fastest-0.019
 highest-0.019
 fart-0.017
 nearest-0.017
Token friends
Token ID2460
Feature activation-5.35e-03

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.010
 brakes-0.010
 highest-0.010
 fart-0.009
 ruining-0.009
Token were
Token ID547
Feature activation-3.04e-03

Pos logit contributions
 brakes+0.009
 fastest+0.009
 highest+0.008
 hurting+0.008
 fart+0.008
Neg logit contributions
itting-0.010
uttering-0.009
ability-0.009
over-0.009
ooks-0.009
Token so
Token ID523
Feature activation+1.28e-02

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.004
 hurting+0.004
Neg logit contributions
itting-0.007
ability-0.006
uttering-0.006
over-0.006
adow-0.006
Token
Token ID6184
Feature activation+2.26e-02

Pos logit contributions
itting+0.025
ability+0.024
uttering+0.022
over+0.022
adow+0.021
Neg logit contributions
 brakes-0.029
 fastest-0.028
 highest-0.027
 hurting-0.026
 fart-0.026
Token
Token ID95
Feature activation+2.73e-02

Pos logit contributions
itting+0.044
uttering+0.040
over+0.039
ability+0.038
adow+0.038
Neg logit contributions
 fastest-0.038
 brakes-0.037
 highest-0.036
 fart-0.034
 hurting-0.031
Token
Token ID26391
Feature activation-1.57e-02

Pos logit contributions
itting+0.053
ability+0.048
over+0.048
uttering+0.048
adow+0.046
Neg logit contributions
 fastest-0.045
 brakes-0.044
 highest-0.043
 fart-0.040
 hurting-0.038
Token
Token ID129
Feature activation-2.21e-02

Pos logit contributions
 fastest+0.028
 brakes+0.028
 highest+0.027
 fart+0.025
 heads+0.024
Neg logit contributions
itting-0.027
ability-0.025
uttering-0.025
over-0.023
adow-0.022
Token
Token ID241
Feature activation+1.73e-02

Pos logit contributions
 fastest+0.035
 brakes+0.033
 highest+0.033
 fart+0.031
 longest+0.028
Neg logit contributions
itting-0.044
uttering-0.041
over-0.040
ability-0.039
adow-0.039
TokenBe
Token ID3856
Feature activation+9.75e-02

Pos logit contributions
itting+0.036
ooks+0.033
uttering+0.033
ability+0.032
ook+0.031
Neg logit contributions
 brakes-0.024
 fastest-0.022
 highest-0.022
 nearest-0.021
 brake-0.020
Token careful
Token ID8161
Feature activation+2.55e-02

Pos logit contributions
itting+0.212
ability+0.194
uttering+0.191
over+0.178
ooks+0.176
Neg logit contributions
 fastest-0.141
 brakes-0.136
 highest-0.130
 fart-0.118
 hurting-0.116
Token not
Token ID407
Feature activation+4.94e-03

Pos logit contributions
itting+0.044
ability+0.042
uttering+0.039
over+0.036
adow+0.035
Neg logit contributions
 fastest-0.048
 brakes-0.047
 highest-0.045
 hurting-0.043
 fart-0.042
Token to
Token ID284
Feature activation+1.05e-02

Pos logit contributions
itting+0.010
ability+0.009
uttering+0.009
over+0.009
ooks+0.008
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.007
 hurting-0.007
Token touch
Token ID3638
Feature activation-1.29e-03

Pos logit contributions
itting+0.021
ability+0.019
uttering+0.019
over+0.019
adow+0.019
Neg logit contributions
 fastest-0.015
 brakes-0.015
 highest-0.015
 fart-0.014
 heads-0.013
Token it
Token ID340
Feature activation+8.35e-03

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 fart+0.002
 heads+0.002
Neg logit contributions
itting-0.002
ability-0.002
over-0.002
uttering-0.002
adow-0.002
Token ending
Token ID7464
Feature activation+9.92e-03

Pos logit contributions
 fastest+0.024
 brakes+0.024
 highest+0.022
 fart+0.021
 hurting+0.020
Neg logit contributions
itting-0.042
ability-0.039
uttering-0.039
over-0.037
adow-0.037
Token was
Token ID373
Feature activation+5.02e-03

Pos logit contributions
itting+0.020
uttering+0.018
ability+0.018
over+0.018
ooks+0.017
Neg logit contributions
 fastest-0.015
 brakes-0.015
 highest-0.014
 hurting-0.013
 fart-0.013
Token no
Token ID645
Feature activation+3.15e-02

Pos logit contributions
itting+0.010
ability+0.009
uttering+0.009
over+0.009
adow+0.008
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.007
 hurting-0.007
Token one
Token ID530
Feature activation+4.08e-03

Pos logit contributions
itting+0.063
uttering+0.057
ability+0.057
over+0.054
adow+0.054
Neg logit contributions
 fastest-0.052
 brakes-0.050
 highest-0.049
 fart-0.045
 hurting-0.044
Token's
Token ID338
Feature activation+4.00e-03

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.007
 brakes-0.007
 highest-0.006
 fart-0.006
 heads-0.006
Token fault
Token ID8046
Feature activation+9.70e-02

Pos logit contributions
itting+0.008
ability+0.008
uttering+0.007
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.005
 hurting-0.005
Token,
Token ID11
Feature activation+1.04e-02

Pos logit contributions
itting+0.174
ability+0.166
uttering+0.154
over+0.145
ooks+0.143
Neg logit contributions
 fastest-0.173
 brakes-0.168
 highest-0.160
 hurting-0.155
 fart-0.149
Token but
Token ID475
Feature activation-2.89e-02

Pos logit contributions
itting+0.020
uttering+0.018
ability+0.018
over+0.017
ooks+0.017
Neg logit contributions
 fastest-0.018
 brakes-0.018
 highest-0.017
 fart-0.016
 hurting-0.016
Token it
Token ID340
Feature activation+4.61e-03

Pos logit contributions
 fastest+0.052
 brakes+0.051
 highest+0.050
 fart+0.047
 hurting+0.045
Neg logit contributions
itting-0.053
uttering-0.047
ability-0.046
over-0.045
adow-0.043
Token left
Token ID1364
Feature activation-9.83e-03

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
adow+0.007
Neg logit contributions
 brakes-0.008
 fastest-0.008
 highest-0.008
 fart-0.007
 heads-0.007
Token the
Token ID262
Feature activation+5.58e-03

Pos logit contributions
 fastest+0.016
 brakes+0.016
 highest+0.015
 hurting+0.015
 fart+0.014
Neg logit contributions
itting-0.020
uttering-0.018
ability-0.017
ooks-0.016
over-0.016
Token it
Token ID340
Feature activation+3.90e-03

Pos logit contributions
itting+0.017
ability+0.016
uttering+0.016
adow+0.015
over+0.015
Neg logit contributions
 brakes-0.020
 fastest-0.020
 hurting-0.019
 highest-0.019
 ruining-0.018
Token was
Token ID373
Feature activation+3.65e-03

Pos logit contributions
itting+0.004
 about+0.004
ability+0.004
 early+0.003
book+0.003
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.009
 fart-0.009
 longest-0.008
Token Tim
Token ID5045
Feature activation-1.25e-02

Pos logit contributions
itting+0.007
uttering+0.006
ability+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.005
 heads-0.005
Tokenmy
Token ID1820
Feature activation+5.74e-03

Pos logit contributions
 fastest+0.017
 brakes+0.016
 highest+0.016
 fart+0.015
 hurting+0.014
Neg logit contributions
itting-0.028
uttering-0.026
ability-0.025
over-0.025
ooks-0.024
Token's
Token ID338
Feature activation+2.83e-03

Pos logit contributions
itting+0.014
ability+0.013
uttering+0.013
over+0.013
adow+0.012
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.006
 heads-0.005
Token fault
Token ID8046
Feature activation+9.62e-02

Pos logit contributions
itting+0.007
uttering+0.006
ability+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.003
 brakes-0.003
 highest-0.003
 fart-0.003
 ruining-0.003
Token but
Token ID475
Feature activation-2.87e-02

Pos logit contributions
itting+0.155
ability+0.147
uttering+0.136
over+0.128
adow+0.128
Neg logit contributions
 fastest-0.190
 brakes-0.188
 highest-0.180
 hurting-0.175
 fart-0.171
Token Tim
Token ID5045
Feature activation-1.17e-02

Pos logit contributions
 fastest+0.060
 brakes+0.059
 highest+0.057
 fart+0.054
 hurting+0.054
Neg logit contributions
itting-0.044
uttering-0.039
ability-0.039
over-0.037
adow-0.035
Tokenmy
Token ID1820
Feature activation+6.75e-03

Pos logit contributions
 fastest+0.017
 brakes+0.016
 highest+0.015
 fart+0.014
 hurting+0.014
Neg logit contributions
itting-0.026
uttering-0.024
ability-0.023
over-0.023
ooks-0.022
Token felt
Token ID2936
Feature activation+1.51e-02

Pos logit contributions
itting+0.013
uttering+0.012
ability+0.011
over+0.011
adow+0.011
Neg logit contributions
 fastest-0.011
 brakes-0.011
 highest-0.011
 fart-0.010
 hurting-0.010
Token guilty
Token ID6717
Feature activation-2.03e-02

Pos logit contributions
itting+0.038
uttering+0.035
ability+0.034
over+0.034
adow+0.033
Neg logit contributions
 fastest-0.017
 brakes-0.017
 highest-0.016
 hurting-0.015
 fart-0.014
Token.
Token ID13
Feature activation+1.77e-02

Pos logit contributions
itting+0.045
uttering+0.041
ability+0.039
ooks+0.038
icker+0.036
Neg logit contributions
 fastest-0.035
 brakes-0.033
 highest-0.032
 hurting-0.031
 heads-0.029
Token He
Token ID679
Feature activation+1.33e-02

Pos logit contributions
itting+0.035
uttering+0.032
ability+0.030
over+0.029
adow+0.029
Neg logit contributions
 brakes-0.030
 fastest-0.029
 highest-0.029
 fart-0.026
 nearest-0.025
Token said
Token ID531
Feature activation+1.64e-02

Pos logit contributions
itting+0.027
uttering+0.025
ability+0.023
over+0.023
ooks+0.022
Neg logit contributions
 fastest-0.021
 brakes-0.021
 highest-0.020
 fart-0.019
 heads-0.018
Token,
Token ID11
Feature activation+1.02e-02

Pos logit contributions
itting+0.034
uttering+0.031
ability+0.030
adow+0.028
over+0.027
Neg logit contributions
 brakes-0.027
 fastest-0.025
 highest-0.024
 hurting-0.023
 mine-0.022
Token "
Token ID366
Feature activation+5.81e-03

Pos logit contributions
itting+0.019
ability+0.018
uttering+0.017
adow+0.016
over+0.016
Neg logit contributions
 brakes-0.018
 fastest-0.017
 highest-0.017
 hurting-0.016
 fart-0.015
TokenBe
Token ID3856
Feature activation+9.53e-02

Pos logit contributions
itting+0.011
uttering+0.011
adow+0.010
ooks+0.010
ability+0.010
Neg logit contributions
 fastest-0.009
 highest-0.009
 brakes-0.008
 fart-0.008
 longest-0.008
Token smart
Token ID4451
Feature activation-1.22e-02

Pos logit contributions
itting+0.225
uttering+0.206
ability+0.195
ooks+0.188
adow+0.181
Neg logit contributions
 fastest-0.127
 brakes-0.119
 highest-0.113
 forth-0.105
 fart-0.105
Token and
Token ID290
Feature activation+7.90e-03

Pos logit contributions
 fastest+0.024
 brakes+0.023
 highest+0.022
 fart+0.021
 hurting+0.021
Neg logit contributions
itting-0.021
uttering-0.019
ability-0.018
ooks-0.017
over-0.017
Token always
Token ID1464
Feature activation+3.48e-02

Pos logit contributions
itting+0.017
uttering+0.016
ability+0.015
over+0.015
adow+0.014
Neg logit contributions
 fastest-0.012
 brakes-0.011
 highest-0.011
 fart-0.010
 hurting-0.010
Token think
Token ID892
Feature activation+2.82e-04

Pos logit contributions
itting+0.079
uttering+0.073
ability+0.072
over+0.070
ooks+0.068
Neg logit contributions
 fastest-0.047
 brakes-0.046
 highest-0.044
 fart-0.040
 hurting-0.038
Token before
Token ID878
Feature activation-1.03e-02

Pos logit contributions
itting+0.000
uttering+0.000
ability+0.000
over+0.000
adow+0.000
Neg logit contributions
 fastest-0.001
 brakes-0.001
 highest-0.001
 fart-0.001
 hurting-0.000
Token creature
Token ID7185
Feature activation+1.00e-03

Pos logit contributions
itting+0.004
ability+0.004
uttering+0.004
over+0.004
adow+0.003
Neg logit contributions
 brakes-0.003
 fastest-0.003
 highest-0.003
 fart-0.003
 ruining-0.003
Token simply
Token ID2391
Feature activation+1.40e-02

Pos logit contributions
itting+0.002
uttering+0.002
ability+0.002
over+0.002
ooks+0.002
Neg logit contributions
 fastest-0.002
 brakes-0.002
 highest-0.002
 fart-0.002
 hurting-0.001
Token replied
Token ID8712
Feature activation+3.86e-02

Pos logit contributions
itting+0.025
uttering+0.023
ability+0.023
over+0.022
ooks+0.022
Neg logit contributions
 brakes-0.025
 fastest-0.025
 highest-0.024
 heads-0.022
 hurting-0.022
Token,
Token ID11
Feature activation+1.65e-02

Pos logit contributions
itting+0.074
ability+0.071
uttering+0.068
adow+0.065
ooks+0.063
Neg logit contributions
 fastest-0.058
 brakes-0.057
 highest-0.057
 hurting-0.055
 nearest-0.052
Token "
Token ID366
Feature activation+6.74e-03

Pos logit contributions
itting+0.029
ability+0.028
uttering+0.027
adow+0.026
over+0.025
Neg logit contributions
 brakes-0.029
 fastest-0.028
 highest-0.028
 hurting-0.026
 fart-0.025
TokenBe
Token ID3856
Feature activation+9.53e-02

Pos logit contributions
itting+0.013
uttering+0.012
ability+0.012
ooks+0.012
adow+0.012
Neg logit contributions
 fastest-0.011
 highest-0.010
 brakes-0.010
 fart-0.009
 nearest-0.009
Token kind
Token ID1611
Feature activation+2.08e-02

Pos logit contributions
itting+0.226
uttering+0.207
ability+0.199
ooks+0.189
icker+0.184
Neg logit contributions
 fastest-0.124
 brakes-0.115
 highest-0.110
 fart-0.100
 forth-0.098
Token to
Token ID284
Feature activation+1.06e-02

Pos logit contributions
itting+0.036
uttering+0.034
ability+0.032
ooks+0.030
over+0.029
Neg logit contributions
 fastest-0.038
 brakes-0.037
 highest-0.036
 fart-0.034
 heads-0.034
Token others
Token ID1854
Feature activation+5.70e-03

Pos logit contributions
itting+0.020
ability+0.018
over+0.018
uttering+0.018
 early+0.017
Neg logit contributions
 fastest-0.017
 brakes-0.017
 highest-0.016
 fart-0.016
 heads-0.014
Token in
Token ID287
Feature activation+6.85e-03

Pos logit contributions
itting+0.010
uttering+0.010
ability+0.009
ooks+0.009
over+0.008
Neg logit contributions
 fastest-0.011
 highest-0.010
 brakes-0.010
 fart-0.009
 hurting-0.009
Token the
Token ID262
Feature activation+9.23e-03

Pos logit contributions
itting+0.014
uttering+0.013
ability+0.013
over+0.012
adow+0.012
Neg logit contributions
 fastest-0.011
 brakes-0.010
 highest-0.010
 fart-0.009
 hurting-0.009
Token everyone
Token ID2506
Feature activation-1.50e-02

Pos logit contributions
itting+0.027
uttering+0.024
ability+0.023
ooks+0.023
adow+0.023
Neg logit contributions
 brakes-0.027
 fastest-0.027
 highest-0.026
 hurting-0.025
 heads-0.024
Token they
Token ID484
Feature activation+5.58e-04

Pos logit contributions
 fastest+0.029
 highest+0.028
 brakes+0.028
 hurting+0.025
 fart+0.025
Neg logit contributions
itting-0.025
uttering-0.022
ability-0.022
ooks-0.021
over-0.021
Token met
Token ID1138
Feature activation+1.36e-02

Pos logit contributions
itting+0.001
uttering+0.001
ability+0.001
over+0.001
adow+0.001
Neg logit contributions
 fastest-0.001
 brakes-0.001
 highest-0.001
 fart-0.001
 hurting-0.001
Token:
Token ID25
Feature activation+4.35e-02

Pos logit contributions
itting+0.024
ability+0.022
uttering+0.022
ooks+0.021
ook+0.019
Neg logit contributions
 fastest-0.024
 highest-0.023
 brakes-0.022
 hurting-0.021
 fart-0.021
Token '
Token ID705
Feature activation+9.88e-03

Pos logit contributions
itting+0.079
ability+0.075
uttering+0.072
over+0.070
ooks+0.069
Neg logit contributions
 fastest-0.076
 highest-0.074
 brakes-0.074
 fart-0.065
 hurting-0.065
TokenBe
Token ID3856
Feature activation+9.51e-02

Pos logit contributions
itting+0.019
uttering+0.017
ability+0.017
over+0.016
adow+0.016
Neg logit contributions
 fastest-0.016
 brakes-0.016
 highest-0.016
 fart-0.015
 hurting-0.014
Token patient
Token ID5827
Feature activation+3.21e-02

Pos logit contributions
itting+0.209
uttering+0.189
ability+0.187
ooks+0.178
adow+0.170
Neg logit contributions
 fastest-0.139
 brakes-0.129
 highest-0.125
 fart-0.116
 forth-0.114
Token,
Token ID11
Feature activation+1.32e-02

Pos logit contributions
itting+0.056
uttering+0.053
ability+0.052
ooks+0.049
ook+0.046
Neg logit contributions
 fastest-0.057
 brakes-0.053
 highest-0.053
 heads-0.052
 hurting-0.051
Token be
Token ID307
Feature activation+1.81e-02

Pos logit contributions
itting+0.027
ability+0.025
uttering+0.025
over+0.024
adow+0.023
Neg logit contributions
 fastest-0.020
 brakes-0.020
 highest-0.019
 fart-0.018
 hurting-0.017
Token kind
Token ID1611
Feature activation+2.11e-02

Pos logit contributions
itting+0.036
uttering+0.033
ability+0.033
ooks+0.030
ment+0.029
Neg logit contributions
 fastest-0.030
 highest-0.028
 brakes-0.028
 fart-0.026
 hurting-0.025
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
itting+0.039
uttering+0.037
ability+0.037
ooks+0.035
icker+0.033
Neg logit contributions
 fastest-0.033
 highest-0.032
 brakes-0.030
 me-0.030
 heads-0.029
Token warned
Token ID7728
Feature activation+1.29e-02

Pos logit contributions
itting+0.013
uttering+0.012
ability+0.012
over+0.011
adow+0.011
Neg logit contributions
 fastest-0.012
 brakes-0.012
 highest-0.012
 fart-0.011
 hurting-0.010
Token the
Token ID262
Feature activation+3.31e-03

Pos logit contributions
itting+0.021
uttering+0.019
over+0.019
ability+0.019
 early+0.018
Neg logit contributions
 fastest-0.025
 brakes-0.024
 highest-0.024
 fart-0.023
 ruining-0.021
Token skeleton
Token ID18328
Feature activation-1.04e-02

Pos logit contributions
itting+0.006
ability+0.006
 early+0.005
uttering+0.005
adow+0.005
Neg logit contributions
 brakes-0.005
 fastest-0.005
 highest-0.005
 fart-0.005
 forth-0.004
Token,
Token ID11
Feature activation+9.47e-03

Pos logit contributions
 fastest+0.020
 brakes+0.019
 highest+0.019
 fart+0.018
 hurting+0.017
Neg logit contributions
itting-0.018
uttering-0.016
ability-0.016
over-0.015
ooks-0.015
Token '
Token ID705
Feature activation+5.87e-03

Pos logit contributions
itting+0.017
ability+0.016
uttering+0.015
over+0.015
ooks+0.015
Neg logit contributions
 brakes-0.017
 fastest-0.017
 highest-0.016
 hurting-0.016
 fart-0.015
TokenBe
Token ID3856
Feature activation+9.49e-02

Pos logit contributions
itting+0.012
ability+0.011
uttering+0.011
over+0.010
adow+0.010
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.009
 fart-0.008
 heads-0.008
Token careful
Token ID8161
Feature activation+2.22e-02

Pos logit contributions
itting+0.208
ability+0.189
uttering+0.187
ooks+0.175
over+0.174
Neg logit contributions
 fastest-0.137
 brakes-0.133
 highest-0.124
 fart-0.118
 hurting-0.115
Token!
Token ID0
Feature activation+2.79e-03

Pos logit contributions
itting+0.040
ability+0.039
uttering+0.036
ooks+0.033
adow+0.033
Neg logit contributions
 fastest-0.039
 brakes-0.037
 highest-0.036
 heads-0.035
 hurting-0.034
Token The
Token ID383
Feature activation-4.43e-03

Pos logit contributions
itting+0.006
uttering+0.005
ability+0.005
over+0.005
adow+0.005
Neg logit contributions
 fastest-0.004
 brakes-0.004
 highest-0.004
 fart-0.004
 hurting-0.004
Token dark
Token ID3223
Feature activation-1.34e-02

Pos logit contributions
 brakes+0.006
 fastest+0.006
 highest+0.006
 fart+0.006
 ruining+0.006
Neg logit contributions
itting-0.008
ability-0.008
uttering-0.007
over-0.007
 early-0.007
Token woods
Token ID16479
Feature activation+2.69e-02

Pos logit contributions
 fastest+0.020
 brakes+0.019
 highest+0.018
 fart+0.017
 ruining+0.017
Neg logit contributions
itting-0.027
over-0.025
adow-0.025
ability-0.024
ment-0.024
Token
Token ID6184
Feature activation+2.32e-02

Pos logit contributions
itting+0.081
ability+0.078
uttering+0.074
adow+0.073
over+0.072
Neg logit contributions
 brakes-0.063
 highest-0.062
 fastest-0.062
 hurting-0.057
 nearest-0.056
Token
Token ID95
Feature activation+2.74e-02

Pos logit contributions
itting+0.046
uttering+0.042
ability+0.041
over+0.041
adow+0.039
Neg logit contributions
 fastest-0.038
 brakes-0.038
 highest-0.036
 fart-0.034
 heads-0.032
Token
Token ID26391
Feature activation-1.48e-02

Pos logit contributions
itting+0.054
ability+0.048
uttering+0.048
over+0.047
adow+0.045
Neg logit contributions
 fastest-0.045
 brakes-0.045
 highest-0.043
 fart-0.041
 heads-0.038
Token
Token ID129
Feature activation-2.09e-02

Pos logit contributions
 fastest+0.026
 brakes+0.026
 highest+0.025
 fart+0.023
 hurting+0.022
Neg logit contributions
itting-0.027
uttering-0.024
ability-0.024
over-0.024
adow-0.023
Token
Token ID241
Feature activation+1.88e-02

Pos logit contributions
 fastest+0.031
 brakes+0.031
 highest+0.030
 fart+0.028
 heads+0.026
Neg logit contributions
itting-0.043
uttering-0.040
ability-0.039
over-0.039
adow-0.037
TokenBe
Token ID3856
Feature activation+9.48e-02

Pos logit contributions
itting+0.040
ooks+0.038
uttering+0.037
icker+0.037
adow+0.037
Neg logit contributions
 fastest-0.024
 highest-0.022
 brakes-0.022
 longest-0.020
 fart-0.018
Token careful
Token ID8161
Feature activation+2.32e-02

Pos logit contributions
itting+0.214
uttering+0.194
ability+0.194
 early+0.182
ooks+0.180
Neg logit contributions
 fastest-0.126
 highest-0.111
 brakes-0.110
 myself-0.100
 longest-0.099
Token,
Token ID11
Feature activation+1.22e-02

Pos logit contributions
itting+0.042
ability+0.039
uttering+0.038
over+0.035
ooks+0.034
Neg logit contributions
 fastest-0.041
 brakes-0.040
 highest-0.039
 hurting-0.036
 fart-0.036
Token that
Token ID326
Feature activation+1.37e-02

Pos logit contributions
itting+0.023
uttering+0.021
ability+0.021
over+0.020
adow+0.019
Neg logit contributions
 fastest-0.021
 brakes-0.021
 highest-0.020
 fart-0.019
 hurting-0.018
Token's
Token ID338
Feature activation+5.21e-03

Pos logit contributions
itting+0.032
uttering+0.029
ability+0.029
over+0.028
ooks+0.027
Neg logit contributions
 fastest-0.018
 brakes-0.017
 highest-0.017
 fart-0.016
 hurting-0.015
Token a
Token ID257
Feature activation+5.05e-03

Pos logit contributions
itting+0.009
ability+0.008
uttering+0.008
over+0.008
adow+0.007
Neg logit contributions
 fastest-0.010
 brakes-0.010
 highest-0.009
 fart-0.009
 hurting-0.009
Token "
Token ID366
Feature activation+4.94e-03

Pos logit contributions
itting+0.040
ability+0.038
uttering+0.036
ooks+0.034
over+0.033
Neg logit contributions
 fastest-0.036
 brakes-0.036
 highest-0.035
 hurting-0.033
 fart-0.031
TokenIt
Token ID1026
Feature activation+6.59e-03

Pos logit contributions
itting+0.009
uttering+0.009
ability+0.009
over+0.008
adow+0.008
Neg logit contributions
 fastest-0.008
 highest-0.008
 brakes-0.008
 fart-0.007
 hurting-0.007
Token's
Token ID338
Feature activation-1.11e-05

Pos logit contributions
itting+0.015
ability+0.013
uttering+0.013
over+0.013
adow+0.013
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.008
 fart-0.008
 hurting-0.007
Token your
Token ID534
Feature activation+3.47e-02

Pos logit contributions
 fastest+0.000
 brakes+0.000
 highest+0.000
 fart+0.000
 hurting+0.000
Neg logit contributions
itting-0.000
ability-0.000
uttering-0.000
over-0.000
adow-0.000
Token own
Token ID898
Feature activation-1.80e-02

Pos logit contributions
itting+0.067
ability+0.061
uttering+0.061
over+0.058
adow+0.056
Neg logit contributions
 fastest-0.058
 brakes-0.057
 highest-0.056
 fart-0.052
 heads-0.049
Token fault
Token ID8046
Feature activation+9.47e-02

Pos logit contributions
 fastest+0.029
 brakes+0.028
 highest+0.028
 fart+0.025
 hurting+0.024
Neg logit contributions
itting-0.037
uttering-0.034
ability-0.033
ooks-0.031
over-0.031
Token,
Token ID11
Feature activation+4.90e-03

Pos logit contributions
itting+0.204
ability+0.191
uttering+0.189
over+0.173
ooks+0.170
Neg logit contributions
 fastest-0.141
 brakes-0.132
 highest-0.128
 fart-0.120
 hurting-0.118
Token you
Token ID345
Feature activation+2.40e-02

Pos logit contributions
itting+0.010
ability+0.009
uttering+0.009
over+0.009
adow+0.008
Neg logit contributions
 fastest-0.008
 brakes-0.007
 highest-0.007
 fart-0.007
 heads-0.006
Token should
Token ID815
Feature activation-2.10e-03

Pos logit contributions
itting+0.052
uttering+0.048
ability+0.048
over+0.046
adow+0.045
Neg logit contributions
 fastest-0.034
 brakes-0.033
 highest-0.032
 fart-0.030
 hurting-0.028
Token've
Token ID1053
Feature activation-2.19e-02

Pos logit contributions
 fastest+0.003
 brakes+0.003
 highest+0.003
 fart+0.003
 hurting+0.002
Neg logit contributions
itting-0.005
uttering-0.004
ability-0.004
over-0.004
adow-0.004
Token been
Token ID587
Feature activation-6.27e-03

Pos logit contributions
 fastest+0.030
 brakes+0.029
 highest+0.028
 fart+0.026
 hurting+0.025
Neg logit contributions
itting-0.050
uttering-0.046
ability-0.045
over-0.044
adow-0.042
Token and
Token ID290
Feature activation+8.25e-03

Pos logit contributions
 fastest+0.046
 brakes+0.045
 highest+0.044
 fart+0.042
 heads+0.041
Neg logit contributions
itting-0.037
uttering-0.033
ability-0.033
over-0.031
ooks-0.030
Token said
Token ID531
Feature activation+7.70e-03

Pos logit contributions
itting+0.016
ability+0.014
uttering+0.014
over+0.014
adow+0.013
Neg logit contributions
 fastest-0.014
 brakes-0.014
 highest-0.013
 fart-0.013
 hurting-0.012
Token it
Token ID340
Feature activation+4.29e-03

Pos logit contributions
itting+0.015
ability+0.014
uttering+0.014
over+0.013
adow+0.013
Neg logit contributions
 brakes-0.012
 fastest-0.012
 highest-0.012
 hurting-0.011
 mine-0.011
Token was
Token ID373
Feature activation+2.29e-03

Pos logit contributions
itting+0.007
ability+0.006
uttering+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.007
 hurting-0.007
Token their
Token ID511
Feature activation+9.57e-03

Pos logit contributions
itting+0.004
ability+0.003
uttering+0.003
over+0.003
adow+0.003
Neg logit contributions
 fastest-0.004
 brakes-0.004
 highest-0.004
 fart-0.004
 hurting-0.004
Token fault
Token ID8046
Feature activation+9.47e-02

Pos logit contributions
itting+0.018
uttering+0.017
ability+0.016
over+0.016
adow+0.015
Neg logit contributions
 fastest-0.016
 brakes-0.016
 highest-0.015
 fart-0.014
 hurting-0.014
Token.
Token ID13
Feature activation+1.99e-02

Pos logit contributions
itting+0.161
ability+0.158
uttering+0.141
over+0.134
adow+0.131
Neg logit contributions
 fastest-0.179
 brakes-0.174
 highest-0.168
 hurting-0.160
 fart-0.158
Token Mom
Token ID11254
Feature activation+1.02e-02

Pos logit contributions
itting+0.027
uttering+0.024
ability+0.023
over+0.022
adow+0.021
Neg logit contributions
 brakes-0.044
 fastest-0.044
 highest-0.043
 fart-0.041
 heads-0.039
Token said
Token ID531
Feature activation+8.41e-03

Pos logit contributions
itting+0.018
ability+0.016
uttering+0.016
over+0.015
adow+0.015
Neg logit contributions
 fastest-0.019
 brakes-0.018
 highest-0.018
 fart-0.017
 heads-0.016
Token they
Token ID484
Feature activation-3.51e-03

Pos logit contributions
itting+0.015
ability+0.013
uttering+0.013
ooks+0.012
over+0.012
Neg logit contributions
 brakes-0.016
 fastest-0.015
 hurting-0.015
 highest-0.014
 ruining-0.014
Token were
Token ID547
Feature activation-9.00e-03

Pos logit contributions
 fastest+0.006
 brakes+0.006
 highest+0.005
 fart+0.005
 hurting+0.005
Neg logit contributions
itting-0.007
uttering-0.006
ability-0.006
over-0.006
ooks-0.006
Token it
Token ID340
Feature activation+2.68e-03

Pos logit contributions
 brakes+0.018
 fastest+0.018
 highest+0.017
 hurting+0.016
 ruining+0.015
Neg logit contributions
itting-0.021
uttering-0.019
ability-0.019
over-0.018
illed-0.017
Token was
Token ID373
Feature activation+4.37e-03

Pos logit contributions
itting+0.004
ability+0.004
 about+0.003
uttering+0.003
over+0.003
Neg logit contributions
 brakes-0.005
 fastest-0.005
 highest-0.005
 fart-0.005
 heads-0.005
Token all
Token ID477
Feature activation-4.78e-03

Pos logit contributions
itting+0.008
uttering+0.007
ability+0.007
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.008
 brakes-0.007
 highest-0.007
 fart-0.007
 hurting-0.006
Token Joe
Token ID5689
Feature activation-9.47e-03

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.008
 hurting+0.007
Neg logit contributions
itting-0.009
uttering-0.008
ability-0.008
over-0.008
adow-0.007
Token's
Token ID338
Feature activation+3.95e-03

Pos logit contributions
 fastest+0.010
 brakes+0.010
 highest+0.009
 fart+0.008
 hurting+0.008
Neg logit contributions
itting-0.024
uttering-0.022
ability-0.022
over-0.021
ooks-0.021
Token fault
Token ID8046
Feature activation+9.47e-02

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
ooks+0.007
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.006
 hurting-0.005
Token.
Token ID13
Feature activation+4.63e-03

Pos logit contributions
itting+0.167
ability+0.158
uttering+0.147
adow+0.136
over+0.135
Neg logit contributions
 fastest-0.175
 brakes-0.171
 highest-0.163
 hurting-0.159
 fart-0.153
Token Joe
Token ID5689
Feature activation-9.36e-03

Pos logit contributions
itting+0.007
uttering+0.006
ability+0.006
over+0.006
adow+0.005
Neg logit contributions
 fastest-0.010
 brakes-0.010
 highest-0.009
 fart-0.009
 heads-0.008
Token tried
Token ID3088
Feature activation+1.41e-02

Pos logit contributions
 fastest+0.016
 brakes+0.016
 highest+0.016
 fart+0.015
 hurting+0.014
Neg logit contributions
itting-0.017
ability-0.015
uttering-0.015
over-0.015
adow-0.014
Token to
Token ID284
Feature activation+7.31e-03

Pos logit contributions
itting+0.027
over+0.025
uttering+0.024
adow+0.024
ability+0.023
Neg logit contributions
 brakes-0.025
 hurting-0.025
 fastest-0.024
 highest-0.023
 ruining-0.023
Token talk
Token ID1561
Feature activation-5.29e-03

Pos logit contributions
itting+0.013
ability+0.012
uttering+0.012
 early+0.012
adow+0.012
Neg logit contributions
 fastest-0.012
 brakes-0.011
 highest-0.011
 fart-0.010
 longest-0.009
Token and
Token ID290
Feature activation+7.13e-03

Pos logit contributions
itting+0.028
ability+0.026
uttering+0.025
ooks+0.024
adow+0.023
Neg logit contributions
 fastest-0.033
 brakes-0.032
 hurting-0.030
 highest-0.030
 heads-0.029
Token it
Token ID340
Feature activation+4.83e-03

Pos logit contributions
itting+0.013
ability+0.012
uttering+0.012
over+0.011
adow+0.011
Neg logit contributions
 fastest-0.012
 brakes-0.012
 highest-0.012
 fart-0.011
 heads-0.011
Token was
Token ID373
Feature activation+4.98e-03

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
adow+0.007
Neg logit contributions
 brakes-0.009
 fastest-0.009
 highest-0.009
 fart-0.008
 heads-0.008
Token his
Token ID465
Feature activation+5.82e-03

Pos logit contributions
itting+0.010
uttering+0.009
ability+0.009
over+0.009
adow+0.009
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 hurting-0.007
 fart-0.007
Token own
Token ID898
Feature activation-1.98e-02

Pos logit contributions
itting+0.012
uttering+0.011
ability+0.011
over+0.011
adow+0.010
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.008
 fart-0.008
 hurting-0.007
Token fault
Token ID8046
Feature activation+9.46e-02

Pos logit contributions
 fastest+0.032
 brakes+0.030
 highest+0.030
 fart+0.027
 heads+0.026
Neg logit contributions
itting-0.040
uttering-0.036
ability-0.036
over-0.035
adow-0.034
Token.
Token ID13
Feature activation+4.30e-03

Pos logit contributions
itting+0.163
ability+0.152
uttering+0.142
adow+0.132
ooks+0.130
Neg logit contributions
 fastest-0.182
 brakes-0.176
 highest-0.169
 hurting-0.163
 fart-0.159
Token
Token ID198
Feature activation+7.03e-03

Pos logit contributions
itting+0.006
uttering+0.006
ability+0.006
over+0.005
adow+0.005
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.009
 fart-0.008
 hurting-0.008
Token
Token ID198
Feature activation+7.24e-03

Pos logit contributions
itting+0.011
uttering+0.010
over+0.010
ability+0.010
adow+0.009
Neg logit contributions
 brakes-0.014
 fastest-0.014
 highest-0.013
 nearest-0.012
 fart-0.012
TokenJohnny
Token ID44960
Feature activation-3.63e-03

Pos logit contributions
itting+0.013
uttering+0.012
ability+0.012
over+0.011
adow+0.011
Neg logit contributions
 fastest-0.013
 brakes-0.013
 highest-0.012
 fart-0.012
 hurting-0.011
Token smiled
Token ID13541
Feature activation+3.12e-03

Pos logit contributions
 fastest+0.006
 brakes+0.006
 highest+0.006
 fart+0.006
 longest+0.005
Neg logit contributions
itting-0.006
ability-0.006
over-0.006
uttering-0.006
adow-0.006
Token smiled
Token ID13541
Feature activation+5.74e-03

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
adow+0.006
Neg logit contributions
 fastest-0.007
 brakes-0.006
 highest-0.006
 fart-0.006
 hurting-0.006
Token and
Token ID290
Feature activation+1.35e-02

Pos logit contributions
ability+0.011
itting+0.011
ooks+0.010
uttering+0.009
adow+0.009
Neg logit contributions
 highest-0.009
 fastest-0.009
 back-0.009
 brakes-0.009
 heads-0.008
Token said
Token ID531
Feature activation+1.57e-02

Pos logit contributions
itting+0.028
uttering+0.025
ability+0.024
over+0.024
adow+0.023
Neg logit contributions
 fastest-0.021
 brakes-0.021
 highest-0.020
 fart-0.019
 heads-0.018
Token,
Token ID11
Feature activation+1.49e-02

Pos logit contributions
itting+0.029
uttering+0.026
ability+0.025
over+0.024
adow+0.024
Neg logit contributions
 fastest-0.027
 brakes-0.027
 highest-0.027
 fart-0.024
 hurting-0.024
Token "
Token ID366
Feature activation+1.37e-03

Pos logit contributions
itting+0.027
ability+0.026
uttering+0.024
adow+0.023
over+0.023
Neg logit contributions
 brakes-0.026
 fastest-0.026
 highest-0.025
 hurting-0.023
 fart-0.023
TokenBe
Token ID3856
Feature activation+9.45e-02

Pos logit contributions
itting+0.003
uttering+0.003
ability+0.002
over+0.002
adow+0.002
Neg logit contributions
 fastest-0.002
 highest-0.002
 brakes-0.002
 fart-0.002
 longest-0.002
Token careful
Token ID8161
Feature activation+2.05e-02

Pos logit contributions
itting+0.212
uttering+0.192
ability+0.186
ooks+0.175
over+0.172
Neg logit contributions
 fastest-0.137
 highest-0.125
 brakes-0.122
 fart-0.114
 forth-0.111
Token when
Token ID618
Feature activation-9.28e-03

Pos logit contributions
itting+0.034
ability+0.030
uttering+0.030
over+0.028
adow+0.027
Neg logit contributions
 fastest-0.040
 brakes-0.039
 highest-0.038
 fart-0.036
 hurting-0.035
Token drawing
Token ID8263
Feature activation+6.53e-03

Pos logit contributions
 fastest+0.010
 brakes+0.010
 highest+0.009
 fart+0.008
 heads+0.007
Neg logit contributions
itting-0.023
uttering-0.022
ability-0.021
over-0.021
adow-0.021
Token,
Token ID11
Feature activation+7.97e-03

Pos logit contributions
itting+0.011
uttering+0.010
ability+0.010
over+0.009
adow+0.009
Neg logit contributions
 fastest-0.012
 brakes-0.012
 highest-0.012
 fart-0.011
 hurting-0.011
Token okay
Token ID8788
Feature activation-6.17e-03

Pos logit contributions
itting+0.018
ability+0.016
uttering+0.016
over+0.016
adow+0.015
Neg logit contributions
 fastest-0.011
 brakes-0.011
 highest-0.010
 fart-0.009
 heads-0.009
Token heard
Token ID2982
Feature activation+2.94e-02

Pos logit contributions
 fastest+0.005
 brakes+0.005
 highest+0.005
 fart+0.004
 ruining+0.004
Neg logit contributions
itting-0.003
uttering-0.003
over-0.003
ability-0.003
adow-0.003
Token a
Token ID257
Feature activation+5.70e-03

Pos logit contributions
itting+0.049
uttering+0.043
ability+0.041
over+0.038
adow+0.038
Neg logit contributions
 brakes-0.060
 fastest-0.057
 highest-0.057
 fart-0.053
 hurting-0.052
Token voice
Token ID3809
Feature activation+1.61e-02

Pos logit contributions
itting+0.011
uttering+0.010
 early+0.010
over+0.010
ability+0.009
Neg logit contributions
 fastest-0.009
 highest-0.008
 brakes-0.008
 fart-0.008
 longest-0.007
Token.
Token ID13
Feature activation+4.35e-03

Pos logit contributions
itting+0.029
uttering+0.026
ability+0.026
over+0.025
adow+0.025
Neg logit contributions
 fastest-0.029
 brakes-0.028
 highest-0.027
 fart-0.026
 heads-0.024
Token "
Token ID366
Feature activation-4.14e-03

Pos logit contributions
itting+0.008
uttering+0.008
ability+0.008
over+0.007
adow+0.007
Neg logit contributions
 brakes-0.008
 highest-0.007
 fastest-0.007
 nearest-0.006
 hurting-0.006
TokenBe
Token ID3856
Feature activation+9.44e-02

Pos logit contributions
 fastest+0.007
 brakes+0.006
 highest+0.006
 fart+0.006
 heads+0.006
Neg logit contributions
itting-0.008
ability-0.008
uttering-0.008
over-0.007
adow-0.007
Token careful
Token ID8161
Feature activation+2.21e-02

Pos logit contributions
itting+0.223
ability+0.204
uttering+0.200
over+0.188
ooks+0.186
Neg logit contributions
 fastest-0.121
 brakes-0.116
 highest-0.109
 fart-0.099
 hurting-0.097
Token,"
Token ID553
Feature activation-7.97e-03

Pos logit contributions
itting+0.039
ability+0.035
uttering+0.034
over+0.032
adow+0.031
Neg logit contributions
 fastest-0.041
 brakes-0.041
 highest-0.040
 fart-0.037
 hurting-0.036
Token it
Token ID340
Feature activation+5.73e-03

Pos logit contributions
 brakes+0.014
 fastest+0.014
 highest+0.013
 fart+0.012
 hurting+0.012
Neg logit contributions
itting-0.015
uttering-0.014
ability-0.013
over-0.013
ooks-0.012
Token said
Token ID531
Feature activation+9.60e-03

Pos logit contributions
itting+0.008
adow+0.007
over+0.007
ability+0.007
uttering+0.007
Neg logit contributions
 fastest-0.012
 highest-0.012
 brakes-0.012
 fart-0.011
 ruining-0.010
Token.
Token ID13
Feature activation+7.82e-03

Pos logit contributions
itting+0.019
ability+0.017
uttering+0.017
ooks+0.015
adow+0.015
Neg logit contributions
 brakes-0.016
 fastest-0.015
 highest-0.015
 hurting-0.014
 fart-0.014
Token and
Token ID290
Feature activation+2.79e-03

Pos logit contributions
itting+0.007
uttering+0.006
ability+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.006
 heads-0.006
Token it
Token ID340
Feature activation-1.60e-04

Pos logit contributions
itting+0.005
ability+0.005
uttering+0.005
over+0.005
 early+0.004
Neg logit contributions
 fastest-0.004
 brakes-0.004
 highest-0.004
 fart-0.004
 heads-0.004
Token's
Token ID338
Feature activation-7.03e-03

Pos logit contributions
 brakes+0.000
 fastest+0.000
 highest+0.000
 fart+0.000
 brake+0.000
Neg logit contributions
itting-0.000
adow-0.000
 about-0.000
ability-0.000
 early-0.000
Token all
Token ID477
Feature activation-1.02e-02

Pos logit contributions
 fastest+0.011
 brakes+0.010
 highest+0.010
 fart+0.009
 heads+0.009
Neg logit contributions
itting-0.015
uttering-0.013
ability-0.013
over-0.013
adow-0.013
Token my
Token ID616
Feature activation+1.77e-02

Pos logit contributions
 brakes+0.014
 fastest+0.013
 highest+0.013
 fart+0.012
 heads+0.012
Neg logit contributions
itting-0.022
ability-0.020
adow-0.019
uttering-0.019
 about-0.019
Token fault
Token ID8046
Feature activation+9.44e-02

Pos logit contributions
itting+0.035
uttering+0.032
ability+0.031
over+0.031
adow+0.030
Neg logit contributions
 fastest-0.029
 brakes-0.028
 highest-0.027
 fart-0.026
 hurting-0.024
Token,"
Token ID553
Feature activation-6.83e-03

Pos logit contributions
itting+0.211
ability+0.196
uttering+0.193
over+0.183
ooks+0.178
Neg logit contributions
 fastest-0.132
 brakes-0.128
 highest-0.123
 fart-0.114
 hurting-0.112
Token he
Token ID339
Feature activation-8.82e-03

Pos logit contributions
 brakes+0.012
 fastest+0.012
 highest+0.011
 fart+0.011
 hurting+0.010
Neg logit contributions
itting-0.013
uttering-0.012
ability-0.011
over-0.011
adow-0.011
Token said
Token ID531
Feature activation+7.80e-03

Pos logit contributions
 fastest+0.018
 brakes+0.018
 highest+0.017
 fart+0.016
 ruining+0.015
Neg logit contributions
itting-0.013
ability-0.012
over-0.012
uttering-0.011
adow-0.011
Token,
Token ID11
Feature activation+6.25e-03

Pos logit contributions
itting+0.015
ability+0.013
uttering+0.013
ooks+0.012
adow+0.012
Neg logit contributions
 brakes-0.014
 fastest-0.013
 highest-0.013
 hurting-0.013
 heads-0.012
Token weeping
Token ID45705
Feature activation-5.03e-03

Pos logit contributions
itting+0.011
ability+0.010
uttering+0.010
over+0.010
adow+0.009
Neg logit contributions
 brakes-0.011
 fastest-0.011
 highest-0.011
 hurting-0.010
 fart-0.010
Token had
Token ID550
Feature activation+1.28e-02

Pos logit contributions
itting+0.012
uttering+0.011
ability+0.011
over+0.010
ooks+0.010
Neg logit contributions
 fastest-0.010
 brakes-0.009
 highest-0.009
 hurting-0.008
 fart-0.008
Token told
Token ID1297
Feature activation+4.75e-03

Pos logit contributions
itting+0.027
ability+0.026
over+0.026
uttering+0.025
ooks+0.025
Neg logit contributions
 fastest-0.017
 hurting-0.017
 brakes-0.016
 highest-0.016
 fart-0.015
Token her
Token ID607
Feature activation+2.49e-03

Pos logit contributions
itting+0.008
uttering+0.007
ability+0.007
ooks+0.007
over+0.007
Neg logit contributions
 hurting-0.008
 fastest-0.008
 brakes-0.008
 heads-0.008
 highest-0.008
Token:
Token ID25
Feature activation+3.67e-02

Pos logit contributions
itting+0.005
uttering+0.004
ability+0.004
over+0.004
ooks+0.004
Neg logit contributions
 brakes-0.004
 fastest-0.004
 hurting-0.004
 highest-0.004
 heads-0.004
Token "
Token ID366
Feature activation+2.73e-03

Pos logit contributions
ability+0.071
itting+0.070
over+0.064
uttering+0.063
ooks+0.061
Neg logit contributions
 brakes-0.059
 fastest-0.056
 hurting-0.056
 highest-0.055
 ruining-0.052
TokenBe
Token ID3856
Feature activation+9.44e-02

Pos logit contributions
itting+0.005
uttering+0.005
ability+0.005
over+0.004
adow+0.004
Neg logit contributions
 fastest-0.005
 highest-0.005
 brakes-0.004
 fart-0.004
 hurting-0.004
Tokenhave
Token ID14150
Feature activation+3.33e-02

Pos logit contributions
itting+0.217
ability+0.196
uttering+0.195
ooks+0.180
book+0.179
Neg logit contributions
 fastest-0.128
 brakes-0.120
 highest-0.112
 hurting-0.108
 fart-0.108
Token,
Token ID11
Feature activation+5.49e-03

Pos logit contributions
itting+0.062
uttering+0.057
ability+0.055
ooks+0.052
over+0.050
Neg logit contributions
 fastest-0.059
 brakes-0.057
 highest-0.056
 fart-0.053
 hurting-0.051
Token little
Token ID1310
Feature activation+8.58e-03

Pos logit contributions
itting+0.011
uttering+0.010
ability+0.010
over+0.010
adow+0.010
Neg logit contributions
 fastest-0.009
 brakes-0.008
 highest-0.008
 fart-0.008
 heads-0.007
Token one
Token ID530
Feature activation-1.58e-03

Pos logit contributions
itting+0.016
ability+0.015
uttering+0.015
over+0.014
adow+0.014
Neg logit contributions
 fastest-0.015
 brakes-0.014
 highest-0.014
 fart-0.013
 heads-0.013
Token".
Token ID1911
Feature activation-9.14e-03

Pos logit contributions
 fastest+0.003
 brakes+0.003
 highest+0.003
 fart+0.003
 hurting+0.003
Neg logit contributions
itting-0.002
uttering-0.002
ability-0.002
over-0.002
ooks-0.002
Token quickly
Token ID2952
Feature activation-7.15e-03

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.008
over+0.008
adow+0.007
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.008
 fart-0.008
 hurting-0.008
Token helped
Token ID4193
Feature activation-5.49e-03

Pos logit contributions
 fastest+0.011
 brakes+0.011
 highest+0.010
 fart+0.009
 hurting+0.009
Neg logit contributions
itting-0.015
uttering-0.014
ability-0.014
over-0.013
adow-0.013
Token Tim
Token ID5045
Feature activation-2.06e-02

Pos logit contributions
 fastest+0.010
 brakes+0.009
 highest+0.009
 fart+0.009
 hurting+0.008
Neg logit contributions
itting-0.010
uttering-0.009
ability-0.009
over-0.009
adow-0.009
Tokenmy
Token ID1820
Feature activation+1.60e-03

Pos logit contributions
 fastest+0.031
 brakes+0.029
 highest+0.029
 fart+0.027
 heads+0.025
Neg logit contributions
itting-0.044
uttering-0.040
ability-0.040
over-0.039
ooks-0.037
Token get
Token ID651
Feature activation-4.52e-03

Pos logit contributions
itting+0.004
uttering+0.003
ability+0.003
over+0.003
adow+0.003
Neg logit contributions
 fastest-0.002
 brakes-0.002
 highest-0.002
 fart-0.002
 hurting-0.002
Token unst
Token ID15014
Feature activation-9.51e-02

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.007
 heads+0.007
Neg logit contributions
itting-0.008
ability-0.007
uttering-0.007
over-0.007
adow-0.007
Tokenuck
Token ID1347
Feature activation+2.15e-02

Pos logit contributions
 fastest+0.251
 brakes+0.250
 highest+0.236
 fart+0.230
 ruining+0.228
Neg logit contributions
itting-0.092
over-0.075
ability-0.073
uttering-0.073
ook-0.058
Token.
Token ID13
Feature activation+4.05e-03

Pos logit contributions
itting+0.036
uttering+0.032
ability+0.031
over+0.029
adow+0.028
Neg logit contributions
 fastest-0.042
 brakes-0.042
 highest-0.041
 fart-0.039
 heads-0.038
Token She
Token ID1375
Feature activation-1.93e-04

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
adow+0.007
Neg logit contributions
 brakes-0.007
 fastest-0.007
 highest-0.007
 fart-0.006
 ruining-0.006
Token said
Token ID531
Feature activation+5.62e-03

Pos logit contributions
 fastest+0.000
 brakes+0.000
 highest+0.000
 fart+0.000
 hurting+0.000
Neg logit contributions
itting-0.000
ability-0.000
uttering-0.000
over-0.000
adow-0.000
Token,
Token ID11
Feature activation+3.14e-03

Pos logit contributions
itting+0.010
ability+0.009
uttering+0.009
adow+0.009
over+0.009
Neg logit contributions
 brakes-0.010
 fastest-0.010
 highest-0.009
 hurting-0.009
 fart-0.009
Token knows
Token ID4206
Feature activation-3.53e-03

Pos logit contributions
itting+0.008
ability+0.007
adow+0.007
uttering+0.007
over+0.007
Neg logit contributions
 fastest-0.007
 brakes-0.007
 highest-0.007
 fart-0.006
 ruining-0.006
Token how
Token ID703
Feature activation-9.25e-03

Pos logit contributions
 fastest+0.007
 brakes+0.007
 highest+0.007
 fart+0.006
 hurting+0.006
Neg logit contributions
itting-0.006
uttering-0.005
ability-0.005
over-0.005
adow-0.005
Token to
Token ID284
Feature activation+3.45e-04

Pos logit contributions
 fastest+0.019
 brakes+0.018
 highest+0.018
 fart+0.017
 hurting+0.016
Neg logit contributions
itting-0.015
uttering-0.013
ability-0.013
over-0.013
adow-0.012
Token get
Token ID651
Feature activation-5.77e-03

Pos logit contributions
itting+0.001
uttering+0.001
adow+0.001
over+0.001
ability+0.001
Neg logit contributions
 brakes-0.001
 fastest-0.001
 highest-0.001
 fart-0.000
 heads-0.000
Token things
Token ID1243
Feature activation+4.73e-03

Pos logit contributions
 fastest+0.010
 brakes+0.010
 highest+0.010
 fart+0.009
 heads+0.009
Neg logit contributions
itting-0.010
uttering-0.009
ability-0.009
 early-0.008
over-0.008
Token unst
Token ID15014
Feature activation-9.32e-02

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.008
over+0.008
adow+0.007
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.007
 heads-0.007
Tokenuck
Token ID1347
Feature activation+2.31e-02

Pos logit contributions
 fastest+0.234
 brakes+0.232
 highest+0.223
 fart+0.214
 ruining+0.210
Neg logit contributions
itting-0.101
over-0.085
uttering-0.084
ability-0.081
ook-0.068
Token."
Token ID526
Feature activation-8.80e-03

Pos logit contributions
itting+0.047
uttering+0.043
ability+0.042
over+0.041
adow+0.039
Neg logit contributions
 fastest-0.037
 brakes-0.035
 highest-0.035
 fart-0.033
 heads-0.031
Token So
Token ID1406
Feature activation-7.79e-03

Pos logit contributions
 fastest+0.015
 highest+0.015
 brakes+0.015
 fart+0.015
 heads+0.014
Neg logit contributions
itting-0.016
uttering-0.014
over-0.013
ability-0.013
adow-0.013
Token,
Token ID11
Feature activation+6.54e-03

Pos logit contributions
 fastest+0.016
 brakes+0.016
 highest+0.016
 fart+0.015
 hurting+0.014
Neg logit contributions
itting-0.012
ability-0.010
uttering-0.010
over-0.010
adow-0.009
Token they
Token ID484
Feature activation-9.23e-03

Pos logit contributions
itting+0.011
uttering+0.010
ability+0.010
over+0.010
adow+0.009
Neg logit contributions
 fastest-0.012
 brakes-0.012
 highest-0.012
 fart-0.011
 hurting-0.011
TokenWe
Token ID1135
Feature activation+9.54e-03

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.004
 hurting+0.004
Neg logit contributions
itting-0.005
uttering-0.004
ability-0.004
over-0.004
adow-0.004
Token'll
Token ID1183
Feature activation+1.34e-02

Pos logit contributions
itting+0.020
ability+0.017
uttering+0.017
adow+0.017
over+0.016
Neg logit contributions
 brakes-0.015
 fastest-0.014
 highest-0.014
 fart-0.013
 hurting-0.012
Token help
Token ID1037
Feature activation-1.42e-02

Pos logit contributions
itting+0.025
uttering+0.023
adow+0.022
over+0.022
ook+0.022
Neg logit contributions
 brakes-0.021
 fastest-0.020
 highest-0.020
 fart-0.019
 heads-0.017
Token you
Token ID345
Feature activation+2.86e-02

Pos logit contributions
 fastest+0.019
 brakes+0.018
 highest+0.017
 fart+0.016
 longest+0.013
Neg logit contributions
itting-0.032
over-0.029
uttering-0.029
ability-0.029
 early-0.028
Token get
Token ID651
Feature activation-3.90e-03

Pos logit contributions
itting+0.055
uttering+0.050
adow+0.049
over+0.049
ability+0.047
Neg logit contributions
 brakes-0.045
 fastest-0.045
 highest-0.042
 fart-0.040
 heads-0.039
Token unst
Token ID15014
Feature activation-9.32e-02

Pos logit contributions
 fastest+0.007
 brakes+0.006
 highest+0.006
 fart+0.006
 heads+0.006
Neg logit contributions
itting-0.007
uttering-0.006
ability-0.006
 early-0.006
over-0.006
Tokenuck
Token ID1347
Feature activation+2.36e-02

Pos logit contributions
 fastest+0.258
 brakes+0.256
 highest+0.244
 ruining+0.237
 ow+0.237
Neg logit contributions
itting-0.071
over-0.060
uttering-0.053
ability-0.053
ook-0.042
Token."
Token ID526
Feature activation-1.67e-02

Pos logit contributions
itting+0.046
uttering+0.041
ability+0.041
over+0.040
adow+0.038
Neg logit contributions
 fastest-0.039
 brakes-0.039
 highest-0.037
 fart-0.035
 hurting-0.033
Token
Token ID198
Feature activation+5.28e-03

Pos logit contributions
 brakes+0.028
 highest+0.027
 fastest+0.027
 fart+0.026
 nearest+0.025
Neg logit contributions
itting-0.030
uttering-0.027
adow-0.026
over-0.026
ability-0.026
Token
Token ID198
Feature activation+5.52e-03

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.007
over+0.007
adow+0.007
Neg logit contributions
 brakes-0.010
 fastest-0.010
 highest-0.010
 fart-0.009
 nearest-0.009
TokenBut
Token ID1537
Feature activation-3.62e-02

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.008
over+0.008
adow+0.007
Neg logit contributions
 fastest-0.011
 brakes-0.011
 highest-0.010
 fart-0.010
 hurting-0.009
Token that
Token ID326
Feature activation+1.00e-02

Pos logit contributions
itting+0.045
ability+0.042
uttering+0.041
ooks+0.040
over+0.040
Neg logit contributions
 hurting-0.031
 brakes-0.031
 fastest-0.031
 highest-0.029
 ruining-0.029
Token the
Token ID262
Feature activation+3.41e-03

Pos logit contributions
itting+0.017
uttering+0.015
ability+0.015
over+0.015
adow+0.014
Neg logit contributions
 fastest-0.019
 brakes-0.018
 highest-0.018
 fart-0.017
 hurting-0.017
Token zipper
Token ID48992
Feature activation+6.57e-03

Pos logit contributions
itting+0.007
uttering+0.006
ability+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.005
 brakes-0.005
 highest-0.005
 fart-0.005
 ruining-0.004
Token had
Token ID550
Feature activation+1.07e-02

Pos logit contributions
itting+0.012
ability+0.011
uttering+0.011
over+0.010
adow+0.010
Neg logit contributions
 fastest-0.011
 brakes-0.011
 highest-0.011
 fart-0.011
 heads-0.010
Token become
Token ID1716
Feature activation+1.50e-02

Pos logit contributions
itting+0.021
uttering+0.019
ability+0.019
over+0.019
adow+0.018
Neg logit contributions
 fastest-0.018
 brakes-0.017
 highest-0.017
 fart-0.016
 hurting-0.015
Token unst
Token ID15014
Feature activation-9.24e-02

Pos logit contributions
itting+0.031
ability+0.028
uttering+0.028
over+0.027
adow+0.026
Neg logit contributions
 fastest-0.024
 brakes-0.023
 highest-0.022
 fart-0.021
 hurting-0.020
Tokenuck
Token ID1347
Feature activation+2.49e-02

Pos logit contributions
 brakes+0.203
 fastest+0.200
 highest+0.189
 fart+0.183
 ruining+0.182
Neg logit contributions
itting-0.138
uttering-0.120
over-0.115
ability-0.111
illed-0.101
Token.
Token ID13
Feature activation+4.46e-03

Pos logit contributions
itting+0.039
uttering+0.035
ability+0.035
over+0.033
ooks+0.032
Neg logit contributions
 fastest-0.051
 highest-0.048
 brakes-0.048
 fart-0.046
 hurting-0.045
Token She
Token ID1375
Feature activation-2.30e-03

Pos logit contributions
itting+0.008
uttering+0.007
over+0.007
ability+0.007
adow+0.006
Neg logit contributions
 brakes-0.008
 highest-0.008
 fastest-0.007
 hurting-0.007
 fart-0.007
Token was
Token ID373
Feature activation+1.91e-03

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.004
 ruining+0.004
Neg logit contributions
itting-0.004
ability-0.004
uttering-0.004
over-0.003
adow-0.003
Token relieved
Token ID22598
Feature activation-1.66e-02

Pos logit contributions
itting+0.004
ability+0.004
uttering+0.003
over+0.003
adow+0.003
Neg logit contributions
 fastest-0.003
 brakes-0.003
 highest-0.003
 fart-0.003
 hurting-0.003
Token they
Token ID484
Feature activation-5.87e-03

Pos logit contributions
itting+0.001
uttering+0.001
ability+0.000
over+0.000
adow+0.000
Neg logit contributions
 fastest-0.001
 brakes-0.001
 highest-0.001
 fart-0.001
 hurting-0.001
Token didn
Token ID1422
Feature activation+5.29e-03

Pos logit contributions
 fastest+0.011
 brakes+0.011
 highest+0.011
 fart+0.010
 longest+0.009
Neg logit contributions
itting-0.010
ability-0.009
adow-0.009
uttering-0.009
illed-0.008
Token't
Token ID470
Feature activation-1.85e-02

Pos logit contributions
itting+0.012
uttering+0.011
over+0.011
ability+0.011
 early+0.010
Neg logit contributions
 fastest-0.007
 brakes-0.007
 highest-0.006
 heads-0.006
 fart-0.006
Token get
Token ID651
Feature activation-3.29e-03

Pos logit contributions
 brakes+0.031
 highest+0.031
 fastest+0.031
 fart+0.028
 heads+0.027
Neg logit contributions
itting-0.034
uttering-0.030
ability-0.029
 about-0.029
adow-0.029
Token it
Token ID340
Feature activation+3.14e-03

Pos logit contributions
 fastest+0.006
 brakes+0.006
 highest+0.005
 fart+0.005
 heads+0.005
Neg logit contributions
itting-0.006
uttering-0.005
ability-0.005
over-0.005
 early-0.005
Token unst
Token ID15014
Feature activation-8.94e-02

Pos logit contributions
itting+0.005
uttering+0.005
ability+0.004
over+0.004
adow+0.004
Neg logit contributions
 brakes-0.006
 fastest-0.006
 highest-0.006
 fart-0.005
 heads-0.005
Tokenuck
Token ID1347
Feature activation+2.45e-02

Pos logit contributions
 fastest+0.222
 brakes+0.222
 highest+0.209
 ruining+0.203
 fart+0.201
Neg logit contributions
itting-0.098
over-0.081
uttering-0.081
ability-0.075
ook-0.066
Token.
Token ID13
Feature activation+5.91e-03

Pos logit contributions
itting+0.035
uttering+0.030
ability+0.030
over+0.029
adow+0.028
Neg logit contributions
 fastest-0.053
 brakes-0.052
 highest-0.051
 fart-0.049
 hurting-0.047
Token
Token ID198
Feature activation+5.93e-03

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.007
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.013
 brakes-0.013
 highest-0.012
 fart-0.012
 heads-0.011
Token
Token ID198
Feature activation+6.04e-03

Pos logit contributions
itting+0.010
uttering+0.009
over+0.008
ability+0.008
adow+0.008
Neg logit contributions
 fastest-0.011
 brakes-0.011
 highest-0.011
 fart-0.010
 nearest-0.010
TokenJust
Token ID5703
Feature activation-2.69e-02

Pos logit contributions
itting+0.010
uttering+0.009
ability+0.009
over+0.009
adow+0.009
Neg logit contributions
 fastest-0.011
 brakes-0.011
 highest-0.011
 fart-0.010
 nearest-0.010
Token could
Token ID714
Feature activation-9.85e-03

Pos logit contributions
itting+0.004
ability+0.004
over+0.004
uttering+0.004
 early+0.004
Neg logit contributions
 fastest-0.005
 brakes-0.004
 highest-0.004
 fart-0.004
 longest-0.004
Token help
Token ID1037
Feature activation-1.06e-02

Pos logit contributions
 fastest+0.013
 brakes+0.013
 highest+0.012
 fart+0.012
 heads+0.011
Neg logit contributions
itting-0.021
uttering-0.020
ability-0.019
over-0.019
adow-0.019
Token get
Token ID651
Feature activation-2.03e-03

Pos logit contributions
 fastest+0.020
 brakes+0.020
 highest+0.019
 fart+0.018
 heads+0.017
Neg logit contributions
itting-0.018
uttering-0.016
ability-0.016
over-0.016
adow-0.015
Token the
Token ID262
Feature activation+2.89e-03

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.003
 fart+0.003
 heads+0.003
Neg logit contributions
itting-0.003
uttering-0.003
ability-0.003
over-0.003
 early-0.003
Token car
Token ID1097
Feature activation+3.01e-02

Pos logit contributions
itting+0.006
uttering+0.005
over+0.005
ability+0.005
 early+0.005
Neg logit contributions
 fastest-0.004
 brakes-0.004
 highest-0.004
 fart-0.004
 ruining-0.004
Token unst
Token ID15014
Feature activation-8.87e-02

Pos logit contributions
itting+0.060
uttering+0.054
ability+0.054
over+0.053
adow+0.051
Neg logit contributions
 fastest-0.048
 brakes-0.047
 highest-0.046
 fart-0.043
 hurting-0.040
Tokenuck
Token ID1347
Feature activation+2.56e-02

Pos logit contributions
 fastest+0.214
 brakes+0.211
 highest+0.200
 ruining+0.195
 fart+0.193
Neg logit contributions
itting-0.102
over-0.090
uttering-0.088
ability-0.081
ook-0.073
Token.
Token ID13
Feature activation+5.17e-03

Pos logit contributions
itting+0.046
uttering+0.042
ability+0.041
over+0.039
adow+0.037
Neg logit contributions
 fastest-0.047
 brakes-0.045
 highest-0.045
 fart-0.042
 heads-0.040
Token
Token ID198
Feature activation+5.42e-03

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.008
adow+0.008
over+0.007
Neg logit contributions
 brakes-0.009
 fastest-0.009
 highest-0.009
 fart-0.008
 nearest-0.008
Token
Token ID198
Feature activation+5.56e-03

Pos logit contributions
itting+0.010
uttering+0.008
ability+0.008
adow+0.008
over+0.008
Neg logit contributions
 brakes-0.010
 fastest-0.010
 highest-0.009
 nearest-0.009
 fart-0.008
TokenThe
Token ID464
Feature activation-9.67e-03

Pos logit contributions
itting+0.011
ability+0.010
uttering+0.010
adow+0.009
ooks+0.009
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.009
 fart-0.008
 nearest-0.008
Token before
Token ID878
Feature activation-1.03e-02

Pos logit contributions
 fastest+0.016
 brakes+0.015
 highest+0.015
 fart+0.014
 hurting+0.014
Neg logit contributions
itting-0.014
ability-0.013
uttering-0.013
over-0.012
ooks-0.012
Token all
Token ID477
Feature activation-2.07e-03

Pos logit contributions
 brakes+0.018
 fastest+0.018
 highest+0.017
 hurting+0.017
 fart+0.016
Neg logit contributions
itting-0.020
ability-0.018
uttering-0.017
over-0.016
adow-0.016
Token the
Token ID262
Feature activation+6.73e-03

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.004
 heads+0.004
Neg logit contributions
itting-0.003
ability-0.003
uttering-0.003
over-0.003
adow-0.003
Token fur
Token ID9230
Feature activation+1.39e-02

Pos logit contributions
itting+0.014
ability+0.013
uttering+0.012
over+0.012
adow+0.012
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.009
 fart-0.008
 ruining-0.008
Token was
Token ID373
Feature activation+9.49e-03

Pos logit contributions
itting+0.027
uttering+0.025
ability+0.025
over+0.024
ooks+0.023
Neg logit contributions
 fastest-0.023
 brakes-0.022
 highest-0.022
 fart-0.020
 hurting-0.020
Token unst
Token ID15014
Feature activation-8.64e-02

Pos logit contributions
itting+0.021
ability+0.019
uttering+0.019
over+0.019
adow+0.018
Neg logit contributions
 fastest-0.013
 brakes-0.013
 highest-0.012
 fart-0.011
 hurting-0.011
Tokenuck
Token ID1347
Feature activation+3.03e-02

Pos logit contributions
 brakes+0.195
 fastest+0.194
 highest+0.182
 fart+0.178
 ruining+0.177
Neg logit contributions
itting-0.120
uttering-0.104
over-0.100
ability-0.096
illed-0.086
Token!
Token ID0
Feature activation+5.39e-03

Pos logit contributions
itting+0.053
ability+0.047
uttering+0.047
over+0.044
adow+0.043
Neg logit contributions
 fastest-0.057
 brakes-0.055
 highest-0.054
 fart-0.051
 hurting-0.050
Token
Token ID198
Feature activation+1.21e-02

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.007
over+0.007
adow+0.007
Neg logit contributions
 brakes-0.011
 fastest-0.011
 highest-0.010
 fart-0.010
 nearest-0.009
Token
Token ID198
Feature activation+1.15e-02

Pos logit contributions
itting+0.021
uttering+0.019
ability+0.017
over+0.017
adow+0.016
Neg logit contributions
 brakes-0.022
 fastest-0.022
 highest-0.021
 nearest-0.020
 fart-0.019
TokenThe
Token ID464
Feature activation-7.23e-03

Pos logit contributions
itting+0.023
uttering+0.021
ability+0.021
adow+0.020
over+0.020
Neg logit contributions
 fastest-0.018
 brakes-0.017
 highest-0.017
 fart-0.016
 nearest-0.015
Token together
Token ID1978
Feature activation-4.65e-03

Pos logit contributions
itting+0.033
ability+0.029
adow+0.028
uttering+0.028
over+0.027
Neg logit contributions
 fastest-0.056
 highest-0.056
 brakes-0.055
 fart-0.053
 hurting-0.052
Token to
Token ID284
Feature activation+4.87e-03

Pos logit contributions
 highest+0.007
 fastest+0.007
 separate+0.007
 hurting+0.007
 brakes+0.007
Neg logit contributions
itting-0.008
ability-0.008
over-0.008
ooks-0.007
uttering-0.007
Token get
Token ID651
Feature activation-1.80e-03

Pos logit contributions
itting+0.011
uttering+0.010
ability+0.010
over+0.010
adow+0.010
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.006
 hurting-0.005
Token the
Token ID262
Feature activation+2.24e-03

Pos logit contributions
 fastest+0.003
 brakes+0.003
 highest+0.003
 fart+0.003
 hurting+0.003
Neg logit contributions
itting-0.004
uttering-0.003
ability-0.003
over-0.003
adow-0.003
Token bird
Token ID6512
Feature activation+3.72e-03

Pos logit contributions
itting+0.005
uttering+0.005
ability+0.005
over+0.005
adow+0.005
Neg logit contributions
 fastest-0.003
 brakes-0.003
 highest-0.003
 fart-0.002
 heads-0.002
Token unst
Token ID15014
Feature activation-8.53e-02

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
ooks+0.007
Neg logit contributions
 fastest-0.006
 highest-0.005
 brakes-0.005
 fart-0.005
 hurting-0.005
Tokenuck
Token ID1347
Feature activation+2.95e-02

Pos logit contributions
 fastest+0.217
 brakes+0.215
 highest+0.202
 ruining+0.198
 fart+0.197
Neg logit contributions
itting-0.090
over-0.076
uttering-0.076
ability-0.072
apped-0.062
Token and
Token ID290
Feature activation+1.15e-02

Pos logit contributions
itting+0.048
uttering+0.042
ability+0.042
over+0.038
adow+0.038
Neg logit contributions
 fastest-0.058
 highest-0.057
 brakes-0.056
 fart-0.053
 heads-0.053
Token freed
Token ID13459
Feature activation-4.57e-02

Pos logit contributions
itting+0.022
uttering+0.020
ability+0.019
over+0.019
adow+0.018
Neg logit contributions
 fastest-0.020
 brakes-0.019
 highest-0.019
 fart-0.018
 hurting-0.017
Token it
Token ID340
Feature activation+7.10e-03

Pos logit contributions
 fastest+0.085
 brakes+0.084
 highest+0.081
 fart+0.077
 hurting+0.073
Neg logit contributions
itting-0.079
uttering-0.071
ability-0.070
over-0.068
adow-0.065
Token from
Token ID422
Feature activation-1.74e-02

Pos logit contributions
itting+0.011
uttering+0.010
ability+0.010
over+0.009
adow+0.009
Neg logit contributions
 fastest-0.015
 brakes-0.014
 highest-0.014
 fart-0.013
 hurting-0.013
Token scared
Token ID12008
Feature activation+1.29e-03

Pos logit contributions
itting+0.004
ability+0.003
uttering+0.003
over+0.003
adow+0.003
Neg logit contributions
 fastest-0.003
 brakes-0.003
 highest-0.003
 fart-0.002
 hurting-0.002
Token of
Token ID286
Feature activation+4.09e-03

Pos logit contributions
itting+0.002
ability+0.002
uttering+0.002
over+0.002
adow+0.001
Neg logit contributions
 fastest-0.003
 brakes-0.003
 highest-0.003
 fart-0.003
 hurting-0.002
Token the
Token ID262
Feature activation+4.62e-03

Pos logit contributions
itting+0.007
uttering+0.006
ability+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.007
 brakes-0.007
 highest-0.007
 fart-0.007
 heads-0.006
Token cold
Token ID4692
Feature activation-1.37e-02

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.008
 early+0.008
over+0.008
Neg logit contributions
 brakes-0.007
 fastest-0.006
 highest-0.006
 fart-0.006
 heads-0.006
Token por
Token ID16964
Feature activation-4.53e-02

Pos logit contributions
 fastest+0.025
 brakes+0.025
 highest+0.024
 fart+0.022
 hurting+0.021
Neg logit contributions
itting-0.024
ability-0.021
uttering-0.021
over-0.021
adow-0.020
Tokencel
Token ID5276
Feature activation-8.45e-02

Pos logit contributions
 fastest+0.061
 brakes+0.061
 highest+0.057
 fart+0.056
 heads+0.052
Neg logit contributions
itting-0.099
over-0.093
adow-0.089
uttering-0.086
book-0.083
Tokenain
Token ID391
Feature activation-1.78e-02

Pos logit contributions
 fastest+0.121
 highest+0.116
 brakes+0.114
 fart+0.103
 heads+0.099
Neg logit contributions
itting-0.180
ability-0.167
uttering-0.165
over-0.161
adow-0.154
Token bowl
Token ID9396
Feature activation+1.03e-02

Pos logit contributions
 brakes+0.030
 fastest+0.030
 highest+0.028
 fart+0.027
 heads+0.025
Neg logit contributions
itting-0.034
uttering-0.030
ability-0.030
over-0.030
adow-0.029
Token!
Token ID0
Feature activation+3.40e-03

Pos logit contributions
itting+0.018
uttering+0.016
ability+0.016
over+0.015
ooks+0.015
Neg logit contributions
 fastest-0.020
 highest-0.019
 brakes-0.019
 fart-0.018
 hurting-0.017
Token
Token ID220
Feature activation-1.08e-02

Pos logit contributions
itting+0.005
uttering+0.005
ability+0.004
over+0.004
adow+0.004
Neg logit contributions
 fastest-0.007
 brakes-0.007
 highest-0.007
 fart-0.006
 heads-0.006
Token
Token ID198
Feature activation+8.27e-03

Pos logit contributions
 fastest+0.018
 highest+0.018
 brakes+0.017
 fart+0.015
 nearest+0.015
Neg logit contributions
itting-0.019
ability-0.018
uttering-0.017
over-0.016
ment-0.015
Token Fl
Token ID1610
Feature activation-2.73e-02

Pos logit contributions
itting+0.012
uttering+0.011
ability+0.011
over+0.010
adow+0.010
Neg logit contributions
 fastest-0.011
 brakes-0.011
 highest-0.011
 fart-0.010
 hurting-0.010
Tokenoppy
Token ID26696
Feature activation+5.03e-03

Pos logit contributions
 fastest+0.053
 brakes+0.051
 highest+0.051
 heads+0.047
 fart+0.047
Neg logit contributions
itting-0.043
ability-0.040
uttering-0.038
over-0.036
 early-0.035
Token saw
Token ID2497
Feature activation+1.61e-02

Pos logit contributions
itting+0.010
uttering+0.009
ability+0.009
over+0.008
adow+0.008
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.008
 hurting-0.007
Token a
Token ID257
Feature activation-3.59e-03

Pos logit contributions
itting+0.032
uttering+0.029
over+0.027
ability+0.027
ooks+0.026
Neg logit contributions
 brakes-0.029
 fastest-0.028
 highest-0.027
 hurting-0.025
 nearest-0.025
Token mar
Token ID1667
Feature activation+4.04e-03

Pos logit contributions
 fastest+0.005
 brakes+0.005
 highest+0.005
 fart+0.005
 heads+0.005
Neg logit contributions
itting-0.007
uttering-0.007
ability-0.007
over-0.006
 early-0.006
Tokenvell
Token ID29333
Feature activation-8.28e-02

Pos logit contributions
itting+0.008
uttering+0.007
ability+0.007
over+0.007
adow+0.007
Neg logit contributions
 brakes-0.007
 fastest-0.007
 highest-0.006
 fart-0.006
 hurting-0.006
Tokenous
Token ID516
Feature activation+1.29e-02

Pos logit contributions
 fastest+0.158
 brakes+0.158
 highest+0.151
 fart+0.146
 hurting+0.139
Neg logit contributions
itting-0.142
uttering-0.126
ability-0.118
over-0.118
adow-0.115
Token view
Token ID1570
Feature activation-8.15e-03

Pos logit contributions
itting+0.027
uttering+0.025
ability+0.024
over+0.024
adow+0.023
Neg logit contributions
 fastest-0.020
 brakes-0.019
 highest-0.019
 fart-0.017
 hurting-0.017
Token.
Token ID13
Feature activation+1.99e-03

Pos logit contributions
 fastest+0.015
 highest+0.015
 brakes+0.013
 fart+0.013
 hurting+0.013
Neg logit contributions
itting-0.014
ability-0.013
ooks-0.013
uttering-0.013
over-0.012
Token He
Token ID679
Feature activation-5.40e-03

Pos logit contributions
itting+0.003
uttering+0.003
ability+0.003
over+0.003
adow+0.003
Neg logit contributions
 fastest-0.004
 brakes-0.004
 highest-0.004
 fart-0.003
 heads-0.003
Token also
Token ID635
Feature activation+5.73e-03

Pos logit contributions
 fastest+0.010
 brakes+0.010
 highest+0.009
 fart+0.009
 hurting+0.008
Neg logit contributions
itting-0.010
ability-0.009
uttering-0.008
over-0.008
adow-0.008
Token Benny
Token ID44275
Feature activation+3.58e-03

Pos logit contributions
itting+0.007
uttering+0.007
ability+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.005
 hurting-0.005
Token helped
Token ID4193
Feature activation-1.23e-03

Pos logit contributions
itting+0.007
uttering+0.007
ability+0.006
over+0.006
ooks+0.006
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.005
 hurting-0.005
Token the
Token ID262
Feature activation+4.18e-03

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 fart+0.002
 heads+0.002
Neg logit contributions
itting-0.003
uttering-0.002
ability-0.002
over-0.002
adow-0.002
Token bear
Token ID6842
Feature activation-1.11e-02

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.008
over+0.008
adow+0.007
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 fart-0.005
 hurting-0.005
Token get
Token ID651
Feature activation-2.29e-04

Pos logit contributions
 fastest+0.016
 highest+0.016
 brakes+0.015
 fart+0.014
 separate+0.014
Neg logit contributions
itting-0.025
uttering-0.023
ability-0.022
over-0.021
ooks-0.021
Token unst
Token ID15014
Feature activation-8.21e-02

Pos logit contributions
 fastest+0.000
 brakes+0.000
 highest+0.000
 fart+0.000
 hurting+0.000
Neg logit contributions
itting-0.000
uttering-0.000
ability-0.000
over-0.000
adow-0.000
Tokenuck
Token ID1347
Feature activation+3.31e-02

Pos logit contributions
 brakes+0.206
 fastest+0.205
 highest+0.197
 fart+0.191
 ruining+0.186
Neg logit contributions
itting-0.090
over-0.075
uttering-0.074
ability-0.074
ooks-0.060
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
itting+0.058
uttering+0.052
ability+0.050
over+0.048
adow+0.047
Neg logit contributions
 fastest-0.063
 brakes-0.061
 highest-0.060
 fart-0.057
 heads-0.055
Token helped
Token ID4193
Feature activation+2.80e-03

Pos logit contributions
itting+0.021
uttering+0.019
ability+0.018
over+0.018
adow+0.017
Neg logit contributions
 fastest-0.020
 brakes-0.020
 highest-0.019
 fart-0.018
 hurting-0.018
Token the
Token ID262
Feature activation+6.75e-03

Pos logit contributions
itting+0.005
uttering+0.005
ability+0.005
adow+0.005
over+0.004
Neg logit contributions
 fastest-0.005
 brakes-0.005
 highest-0.005
 fart-0.004
 hurting-0.004
Token little
Token ID1310
Feature activation+1.85e-02

Pos logit contributions
itting+0.013
ability+0.012
over+0.012
uttering+0.012
 early+0.012
Neg logit contributions
 brakes-0.009
 fastest-0.009
 highest-0.008
 fart-0.008
 ruining-0.008
Token your
Token ID534
Feature activation+1.40e-02

Pos logit contributions
 fastest+0.029
 brakes+0.028
 highest+0.027
 fart+0.026
 hurting+0.026
Neg logit contributions
itting-0.036
ability-0.033
uttering-0.032
over-0.031
ooks-0.030
Token fault
Token ID8046
Feature activation+7.65e-02

Pos logit contributions
itting+0.025
uttering+0.023
ability+0.022
over+0.022
adow+0.021
Neg logit contributions
 fastest-0.025
 brakes-0.025
 highest-0.024
 fart-0.023
 hurting-0.022
Token.
Token ID13
Feature activation-8.95e-03

Pos logit contributions
itting+0.111
uttering+0.096
ability+0.095
over+0.093
adow+0.088
Neg logit contributions
 fastest-0.163
 brakes-0.162
 highest-0.158
 fart-0.150
 heads-0.143
Token Sometimes
Token ID8975
Feature activation+2.28e-02

Pos logit contributions
 fastest+0.018
 brakes+0.017
 highest+0.017
 fart+0.016
 heads+0.016
Neg logit contributions
itting-0.014
ability-0.012
uttering-0.012
over-0.012
adow-0.011
Token things
Token ID1243
Feature activation-3.28e-03

Pos logit contributions
itting+0.039
ability+0.035
uttering+0.034
over+0.034
adow+0.033
Neg logit contributions
 fastest-0.042
 brakes-0.041
 highest-0.041
 fart-0.038
 heads-0.036
Token happen
Token ID1645
Feature activation-8.06e-02

Pos logit contributions
 fastest+0.005
 brakes+0.004
 highest+0.004
 fart+0.004
 heads+0.004
Neg logit contributions
itting-0.007
uttering-0.007
ability-0.006
over-0.006
adow-0.006
Token that
Token ID326
Feature activation-5.93e-03

Pos logit contributions
 fastest+0.156
 brakes+0.149
 highest+0.148
 fart+0.139
 hurting+0.136
Neg logit contributions
itting-0.135
ability-0.125
uttering-0.120
over-0.115
adow-0.108
Token we
Token ID356
Feature activation-1.85e-02

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.007
 nearest+0.007
Neg logit contributions
itting-0.013
ability-0.012
uttering-0.011
over-0.011
adow-0.011
Token cannot
Token ID2314
Feature activation-8.24e-03

Pos logit contributions
 fastest+0.030
 brakes+0.030
 highest+0.029
 fart+0.026
 heads+0.025
Neg logit contributions
itting-0.036
ability-0.032
uttering-0.031
adow-0.030
over-0.029
Token control
Token ID1630
Feature activation-3.60e-03

Pos logit contributions
 fastest+0.012
 brakes+0.011
 highest+0.011
 fart+0.010
 hurting+0.010
Neg logit contributions
itting-0.018
ability-0.016
uttering-0.016
over-0.016
adow-0.015
Token.
Token ID13
Feature activation-1.07e-02

Pos logit contributions
 fastest+0.007
 brakes+0.007
 highest+0.006
 fart+0.006
 hurting+0.006
Neg logit contributions
itting-0.006
uttering-0.006
ability-0.006
over-0.005
adow-0.005
Token.
Token ID13
Feature activation-9.69e-03

Pos logit contributions
 fastest+0.027
 brakes+0.026
 highest+0.025
 fart+0.023
 hurting+0.023
Neg logit contributions
itting-0.049
uttering-0.047
over-0.044
ability-0.043
ooks-0.042
Token She
Token ID1375
Feature activation-1.52e-02

Pos logit contributions
 fastest+0.017
 brakes+0.017
 highest+0.017
 fart+0.016
 nearest+0.015
Neg logit contributions
itting-0.017
uttering-0.016
ability-0.015
over-0.014
adow-0.014
Token had
Token ID550
Feature activation-8.28e-03

Pos logit contributions
 brakes+0.023
 fastest+0.022
 highest+0.021
 heads+0.020
 fart+0.019
Neg logit contributions
itting-0.033
uttering-0.031
over-0.029
ability-0.029
ooks-0.027
Token heard
Token ID2982
Feature activation+2.91e-03

Pos logit contributions
 fastest+0.016
 brakes+0.016
 highest+0.016
 fart+0.015
 hurting+0.015
Neg logit contributions
itting-0.014
uttering-0.012
ability-0.012
over-0.012
adow-0.011
Token about
Token ID546
Feature activation-4.30e-03

Pos logit contributions
itting+0.005
uttering+0.004
ability+0.004
over+0.004
adow+0.004
Neg logit contributions
 brakes-0.006
 fastest-0.005
 highest-0.005
 fart-0.005
 hurting-0.005
Token att
Token ID708
Feature activation-7.92e-02

Pos logit contributions
 brakes+0.008
 fastest+0.007
 highest+0.007
 hurting+0.007
 ruining+0.007
Neg logit contributions
itting-0.008
uttering-0.007
over-0.007
ability-0.007
ooks-0.007
Tokenics
Token ID873
Feature activation+1.25e-02

Pos logit contributions
 fastest+0.138
 brakes+0.135
 highest+0.131
 fart+0.122
 hurting+0.117
Neg logit contributions
itting-0.147
ability-0.133
uttering-0.131
over-0.127
adow-0.123
Token and
Token ID290
Feature activation-1.08e-02

Pos logit contributions
itting+0.024
ability+0.022
ooks+0.021
uttering+0.021
icker+0.020
Neg logit contributions
 fastest-0.021
 highest-0.020
 hurting-0.018
 brakes-0.018
 forth-0.018
Token wanted
Token ID2227
Feature activation-1.47e-02

Pos logit contributions
 fastest+0.018
 brakes+0.017
 highest+0.016
 hurting+0.016
 ruining+0.016
Neg logit contributions
itting-0.022
uttering-0.020
ability-0.019
over-0.019
ooks-0.018
Token to
Token ID284
Feature activation-1.02e-02

Pos logit contributions
 fastest+0.030
 brakes+0.029
 highest+0.029
 fart+0.027
 hurting+0.026
Neg logit contributions
itting-0.024
uttering-0.021
ability-0.021
over-0.020
adow-0.019
Token explore
Token ID7301
Feature activation-1.58e-02

Pos logit contributions
 fastest+0.013
 brakes+0.013
 highest+0.012
 fart+0.011
 hurting+0.011
Neg logit contributions
itting-0.024
uttering-0.022
ability-0.022
over-0.021
ooks-0.020
Token find
Token ID1064
Feature activation-2.96e-02

Pos logit contributions
itting+0.038
uttering+0.034
ability+0.034
adow+0.034
over+0.033
Neg logit contributions
 fastest-0.032
 brakes-0.032
 highest-0.031
 fart-0.029
 heads-0.027
Token them
Token ID606
Feature activation-3.17e-04

Pos logit contributions
 fastest+0.063
 brakes+0.062
 highest+0.060
 fart+0.058
 hurting+0.055
Neg logit contributions
itting-0.042
ability-0.037
uttering-0.036
over-0.036
adow-0.033
Token?
Token ID30
Feature activation-1.06e-02

Pos logit contributions
 fastest+0.001
 highest+0.001
 brakes+0.001
 fart+0.001
 hurting+0.001
Neg logit contributions
itting-0.001
ability-0.001
uttering-0.001
over-0.000
ooks-0.000
Token What
Token ID1867
Feature activation-1.39e-02

Pos logit contributions
 fastest+0.021
 brakes+0.021
 highest+0.020
 fart+0.019
 heads+0.018
Neg logit contributions
itting-0.017
uttering-0.015
ability-0.015
over-0.014
adow-0.014
Token will
Token ID481
Feature activation+1.59e-02

Pos logit contributions
 fastest+0.019
 brakes+0.018
 highest+0.017
 fart+0.016
 hurting+0.016
Neg logit contributions
itting-0.031
uttering-0.030
over-0.028
ability-0.027
ment-0.026
Token happen
Token ID1645
Feature activation-7.85e-02

Pos logit contributions
itting+0.030
over+0.027
uttering+0.027
 about+0.027
ability+0.027
Neg logit contributions
 fastest-0.025
 brakes-0.025
 highest-0.024
 fart-0.022
 forth-0.020
Token in
Token ID287
Feature activation-9.65e-03

Pos logit contributions
 fastest+0.141
 highest+0.131
 brakes+0.126
 fart+0.121
 hurting+0.119
Neg logit contributions
itting-0.142
ability-0.137
uttering-0.128
over-0.121
adow-0.117
Token the
Token ID262
Feature activation-1.12e-02

Pos logit contributions
 fastest+0.014
 brakes+0.014
 highest+0.013
 fart+0.012
 hurting+0.012
Neg logit contributions
itting-0.021
uttering-0.019
ability-0.019
over-0.018
adow-0.018
Token forest
Token ID8222
Feature activation-4.21e-03

Pos logit contributions
 fastest+0.016
 brakes+0.016
 highest+0.015
 fart+0.014
 heads+0.013
Neg logit contributions
itting-0.025
uttering-0.022
ability-0.022
over-0.022
adow-0.021
Token?
Token ID30
Feature activation-9.95e-03

Pos logit contributions
 fastest+0.008
 highest+0.007
 brakes+0.007
 fart+0.007
 hurting+0.007
Neg logit contributions
itting-0.008
uttering-0.007
ability-0.007
over-0.006
ooks-0.006
Token<|endoftext|>
Token ID50256
Feature activation+4.62e-03

Pos logit contributions
 fastest+0.018
 brakes+0.017
 highest+0.017
 fart+0.016
 hurting+0.015
Neg logit contributions
itting-0.018
uttering-0.016
ability-0.016
over-0.015
adow-0.015
Token first
Token ID717
Feature activation-4.19e-02

Pos logit contributions
 fastest+0.013
 brakes+0.013
 highest+0.013
 fart+0.012
 hurting+0.011
Neg logit contributions
itting-0.018
uttering-0.017
ability-0.016
over-0.016
adow-0.015
Token.
Token ID13
Feature activation-8.70e-04

Pos logit contributions
 fastest+0.059
 brakes+0.054
 highest+0.054
 fart+0.051
 heads+0.047
Neg logit contributions
itting-0.096
uttering-0.088
ability-0.087
ooks-0.081
over-0.080
Token You
Token ID921
Feature activation+1.97e-02

Pos logit contributions
 fastest+0.001
 brakes+0.001
 highest+0.001
 fart+0.001
 hurting+0.001
Neg logit contributions
itting-0.002
uttering-0.002
ability-0.002
over-0.002
adow-0.001
Token look
Token ID804
Feature activation-2.98e-02

Pos logit contributions
itting+0.040
ability+0.036
adow+0.035
uttering+0.035
over+0.034
Neg logit contributions
 brakes-0.030
 fastest-0.029
 highest-0.028
 fart-0.026
 hurting-0.025
Token yes
Token ID3763
Feature activation-2.61e-03

Pos logit contributions
 fastest+0.050
 brakes+0.050
 highest+0.048
 fart+0.045
 heads+0.043
Neg logit contributions
itting-0.057
uttering-0.051
ability-0.050
over-0.049
adow-0.047
Tokenfil
Token ID10379
Feature activation-7.84e-02

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.003
 fart+0.003
 hurting+0.003
Neg logit contributions
itting-0.006
ability-0.005
uttering-0.005
over-0.005
adow-0.005
Tokenthy
Token ID20057
Feature activation-7.62e-03

Pos logit contributions
 brakes+0.127
 fastest+0.126
 highest+0.120
 fart+0.114
 heads+0.110
Neg logit contributions
itting-0.157
uttering-0.141
over-0.137
ability-0.134
adow-0.129
Token!"
Token ID2474
Feature activation-1.37e-02

Pos logit contributions
 fastest+0.012
 brakes+0.011
 highest+0.011
 fart+0.010
 hurting+0.010
Neg logit contributions
itting-0.016
uttering-0.014
ability-0.014
over-0.013
adow-0.013
Token
Token ID198
Feature activation+4.95e-03

Pos logit contributions
 brakes+0.024
 highest+0.023
 fastest+0.022
 nearest+0.021
 heads+0.021
Neg logit contributions
itting-0.025
uttering-0.022
ability-0.022
over-0.021
ooks-0.021
Token
Token ID198
Feature activation+4.89e-03

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.010
 brakes-0.010
 highest-0.010
 nearest-0.009
 fart-0.009
TokenThe
Token ID464
Feature activation-1.33e-02

Pos logit contributions
itting+0.010
uttering+0.009
ability+0.009
over+0.008
adow+0.008
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.007
 heads-0.007
Token grandma
Token ID49890
Feature activation-8.38e-03

Pos logit contributions
 fastest+0.040
 brakes+0.039
 highest+0.037
 fart+0.034
 heads+0.031
Neg logit contributions
itting-0.063
ability-0.058
uttering-0.057
over-0.056
 early-0.054
Token.
Token ID13
Feature activation-4.85e-03

Pos logit contributions
 fastest+0.016
 brakes+0.016
 highest+0.016
 fart+0.015
 hurting+0.014
Neg logit contributions
itting-0.014
uttering-0.012
ability-0.012
over-0.011
ooks-0.011
Token How
Token ID1374
Feature activation+3.57e-03

Pos logit contributions
 fastest+0.009
 brakes+0.009
 highest+0.009
 fart+0.008
 hurting+0.008
Neg logit contributions
itting-0.008
ability-0.007
over-0.007
uttering-0.007
adow-0.007
Token did
Token ID750
Feature activation-9.38e-03

Pos logit contributions
itting+0.008
uttering+0.008
over+0.008
ability+0.007
ment+0.007
Neg logit contributions
 fastest-0.004
 highest-0.004
 brakes-0.004
 fart-0.004
 hurts-0.004
Token this
Token ID428
Feature activation-1.36e-03

Pos logit contributions
 fastest+0.012
 brakes+0.011
 highest+0.011
 fart+0.010
 heads+0.009
Neg logit contributions
itting-0.021
ability-0.020
over-0.019
 about-0.019
 early-0.019
Token happen
Token ID1645
Feature activation-7.83e-02

Pos logit contributions
 fastest+0.003
 brakes+0.002
 highest+0.002
 fart+0.002
 hurting+0.002
Neg logit contributions
itting-0.002
uttering-0.002
ability-0.002
over-0.002
adow-0.002
Token?"
Token ID1701
Feature activation-2.77e-02

Pos logit contributions
 fastest+0.126
 highest+0.118
 brakes+0.112
 hurting+0.110
 fart+0.109
Neg logit contributions
itting-0.153
ability-0.149
uttering-0.140
over-0.131
ment-0.129
Token she
Token ID673
Feature activation-1.41e-02

Pos logit contributions
 brakes+0.046
 fastest+0.045
 highest+0.044
 hurting+0.040
 fart+0.040
Neg logit contributions
itting-0.055
uttering-0.049
ability-0.047
adow-0.046
over-0.046
Token asks
Token ID7893
Feature activation-1.62e-02

Pos logit contributions
 fastest+0.027
 highest+0.026
 fart+0.024
 brakes+0.024
chie+0.023
Neg logit contributions
itting-0.020
 early-0.020
 about-0.019
ability-0.018
over-0.018
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
 brakes+0.030
 fastest+0.029
 highest+0.028
 hurting+0.027
 heads+0.027
Neg logit contributions
itting-0.029
ability-0.025
uttering-0.025
adow-0.023
over-0.023
Token
Token ID198
Feature activation+2.65e-03

Pos logit contributions
itting+0.019
adow+0.017
ability+0.017
uttering+0.017
over+0.016
Neg logit contributions
 brakes-0.018
 highest-0.016
 fastest-0.016
 hurting-0.016
 ow-0.015
Token the
Token ID262
Feature activation-1.80e-02

Pos logit contributions
 fastest+0.005
 brakes+0.005
 highest+0.005
 fart+0.005
 hurting+0.005
Neg logit contributions
itting-0.005
uttering-0.005
ability-0.005
over-0.004
adow-0.004
Token buttons
Token ID12163
Feature activation-9.01e-04

Pos logit contributions
 fastest+0.027
 brakes+0.025
 highest+0.024
 ruining+0.023
 hurting+0.023
Neg logit contributions
itting-0.034
ability-0.031
uttering-0.031
over-0.030
book-0.030
Token and
Token ID290
Feature activation-1.35e-02

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 fart+0.002
 hurting+0.002
Neg logit contributions
itting-0.001
uttering-0.001
ability-0.001
over-0.001
adow-0.001
Token make
Token ID787
Feature activation-1.07e-02

Pos logit contributions
 fastest+0.022
 highest+0.020
 brakes+0.020
 fart+0.019
 hurting+0.018
Neg logit contributions
itting-0.024
ability-0.022
uttering-0.022
 early-0.021
over-0.021
Token things
Token ID1243
Feature activation-1.61e-03

Pos logit contributions
 fastest+0.019
 brakes+0.018
 highest+0.018
 fart+0.017
 hurting+0.016
Neg logit contributions
itting-0.020
uttering-0.018
ability-0.018
over-0.017
adow-0.016
Token happen
Token ID1645
Feature activation-7.82e-02

Pos logit contributions
 fastest+0.003
 brakes+0.003
 highest+0.003
 fart+0.003
 heads+0.002
Neg logit contributions
itting-0.003
uttering-0.003
ability-0.003
over-0.003
adow-0.002
Token on
Token ID319
Feature activation+1.09e-03

Pos logit contributions
 fastest+0.152
 highest+0.145
 brakes+0.141
 fart+0.135
 hurting+0.128
Neg logit contributions
itting-0.133
ability-0.124
uttering-0.119
over-0.110
adow-0.107
Token the
Token ID262
Feature activation-1.22e-02

Pos logit contributions
itting+0.002
uttering+0.002
ability+0.002
over+0.002
adow+0.002
Neg logit contributions
 fastest-0.002
 brakes-0.002
 highest-0.002
 fart-0.002
 heads-0.002
Token screen
Token ID3159
Feature activation-7.00e-03

Pos logit contributions
 fastest+0.018
 brakes+0.018
 highest+0.017
 fart+0.015
 ruining+0.015
Neg logit contributions
itting-0.023
over-0.021
ability-0.021
uttering-0.021
 early-0.021
Token.
Token ID13
Feature activation-4.42e-03

Pos logit contributions
 fastest+0.015
 brakes+0.014
 highest+0.014
 fart+0.013
 hurting+0.013
Neg logit contributions
itting-0.011
uttering-0.009
ability-0.009
over-0.009
adow-0.008
Token She
Token ID1375
Feature activation-1.32e-02

Pos logit contributions
 fastest+0.009
 brakes+0.008
 highest+0.008
 fart+0.008
 longest+0.007
Neg logit contributions
itting-0.007
uttering-0.007
ability-0.006
over-0.006
adow-0.006
Token magic
Token ID5536
Feature activation-1.16e-02

Pos logit contributions
 fastest+0.005
 brakes+0.005
 highest+0.005
 fart+0.005
 hurting+0.004
Neg logit contributions
itting-0.006
ability-0.006
uttering-0.006
over-0.005
adow-0.005
Token.
Token ID13
Feature activation-3.60e-03

Pos logit contributions
 fastest+0.023
 brakes+0.022
 highest+0.022
 fart+0.021
 hurting+0.020
Neg logit contributions
itting-0.019
uttering-0.017
ability-0.017
over-0.016
adow-0.015
Token Maybe
Token ID6674
Feature activation-3.69e-02

Pos logit contributions
 fastest+0.007
 brakes+0.007
 highest+0.007
 fart+0.006
 hurting+0.006
Neg logit contributions
itting-0.006
ability-0.005
over-0.005
uttering-0.005
adow-0.005
Token it
Token ID340
Feature activation-9.28e-03

Pos logit contributions
 fastest+0.068
 brakes+0.068
 highest+0.065
 fart+0.062
 heads+0.059
Neg logit contributions
itting-0.063
uttering-0.056
ability-0.056
over-0.054
adow-0.053
Token will
Token ID481
Feature activation+1.76e-02

Pos logit contributions
 brakes+0.016
 fastest+0.015
 highest+0.015
 fart+0.014
 forth+0.013
Neg logit contributions
itting-0.017
adow-0.015
ability-0.015
 about-0.014
uttering-0.014
Token happen
Token ID1645
Feature activation-7.81e-02

Pos logit contributions
itting+0.041
uttering+0.038
ability+0.037
over+0.036
adow+0.036
Neg logit contributions
 fastest-0.022
 brakes-0.022
 highest-0.021
 fart-0.019
 heads-0.018
Token again
Token ID757
Feature activation-1.90e-02

Pos logit contributions
 fastest+0.129
 highest+0.123
 brakes+0.118
 fart+0.111
 hurting+0.107
Neg logit contributions
itting-0.154
ability-0.143
uttering-0.139
over-0.132
adow-0.127
Token."
Token ID526
Feature activation-1.93e-02

Pos logit contributions
 fastest+0.034
 brakes+0.033
 highest+0.032
 fart+0.030
 hurting+0.029
Neg logit contributions
itting-0.035
uttering-0.031
ability-0.031
over-0.030
adow-0.029
Token\n
Token ID198
Feature activation-1.87e-03

Pos logit contributions
 brakes+0.037
 fastest+0.035
 highest+0.035
 fart+0.033
 heads+0.032
Neg logit contributions
itting-0.035
uttering-0.031
over-0.028
ability-0.028
adow-0.026
Token\n
Token ID198
Feature activation-1.88e-03

Pos logit contributions
 brakes+0.003
 highest+0.003
 me+0.003
 fastest+0.003
 hurts+0.003
Neg logit contributions
itting-0.004
uttering-0.003
over-0.003
ability-0.003
adow-0.003
TokenThey
Token ID2990
Feature activation-5.97e-03

Pos logit contributions
 fastest+0.004
 brakes+0.003
 highest+0.003
 fart+0.003
 nearest+0.003
Neg logit contributions
itting-0.003
uttering-0.003
over-0.003
ability-0.003
adow-0.003
Token I
Token ID314
Feature activation-1.29e-02

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.007
 hurting+0.007
Neg logit contributions
itting-0.010
ability-0.010
over-0.009
uttering-0.009
adow-0.009
Token did
Token ID750
Feature activation-9.81e-03

Pos logit contributions
 fastest+0.020
 brakes+0.020
 highest+0.019
 fart+0.018
 hurting+0.018
Neg logit contributions
itting-0.026
uttering-0.024
over-0.023
ability-0.023
ooks-0.022
Token not
Token ID407
Feature activation-1.79e-02

Pos logit contributions
 fastest+0.017
 brakes+0.016
 highest+0.016
 fart+0.015
 heads+0.014
Neg logit contributions
itting-0.019
uttering-0.017
ability-0.017
over-0.016
adow-0.016
Token mean
Token ID1612
Feature activation-9.87e-03

Pos logit contributions
 brakes+0.034
 fastest+0.034
 highest+0.033
 fart+0.030
 nearest+0.030
Neg logit contributions
itting-0.029
ability-0.026
over-0.026
uttering-0.025
adow-0.025
Token to
Token ID284
Feature activation-1.39e-02

Pos logit contributions
 brakes+0.021
 fastest+0.021
 highest+0.020
 fart+0.020
 hurting+0.019
Neg logit contributions
itting-0.015
uttering-0.013
ability-0.013
over-0.012
adow-0.012
Token happen
Token ID1645
Feature activation-7.79e-02

Pos logit contributions
 fastest+0.021
 highest+0.021
 brakes+0.020
 brav+0.018
 fart+0.018
Neg logit contributions
itting-0.022
 early-0.022
 about-0.022
ability-0.022
over-0.021
Token it
Token ID340
Feature activation-9.70e-03

Pos logit contributions
 brakes+0.148
 fastest+0.147
 highest+0.142
 fart+0.135
 heads+0.128
Neg logit contributions
itting-0.132
uttering-0.117
over-0.114
ability-0.113
adow-0.107
Token."
Token ID526
Feature activation-1.51e-02

Pos logit contributions
 brakes+0.020
 fastest+0.019
 highest+0.018
 fart+0.017
 heads+0.017
Neg logit contributions
itting-0.015
over-0.013
uttering-0.013
 early-0.013
adow-0.012
Token\n
Token ID198
Feature activation-3.30e-03

Pos logit contributions
 brakes+0.028
 highest+0.027
 fastest+0.026
 fart+0.025
 heads+0.025
Neg logit contributions
itting-0.028
uttering-0.025
ability-0.023
over-0.023
adow-0.022
Token\n
Token ID198
Feature activation-3.48e-03

Pos logit contributions
 brakes+0.006
 highest+0.006
 fastest+0.006
 me+0.005
 hurts+0.005
Neg logit contributions
itting-0.006
uttering-0.005
ability-0.005
over-0.005
adow-0.005
TokenMom
Token ID29252
Feature activation-7.49e-03

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.004
 heads+0.003
Neg logit contributions
itting-0.008
uttering-0.008
ability-0.008
adow-0.007
over-0.007
Token's
Token ID338
Feature activation-1.81e-02

Pos logit contributions
 brakes+0.069
 fastest+0.064
 highest+0.063
 fart+0.060
 brake+0.058
Neg logit contributions
itting-0.045
ability-0.041
 about-0.039
adow-0.038
 early-0.034
Token okay
Token ID8788
Feature activation-2.22e-02

Pos logit contributions
 fastest+0.028
 brakes+0.027
 highest+0.026
 fart+0.025
 hurting+0.024
Neg logit contributions
itting-0.039
uttering-0.035
ability-0.035
over-0.034
ment-0.032
Token.
Token ID13
Feature activation-6.06e-03

Pos logit contributions
 fastest+0.044
 brakes+0.043
 highest+0.042
 fart+0.040
 hurting+0.039
Neg logit contributions
itting-0.036
ability-0.032
uttering-0.032
over-0.031
adow-0.029
Token Acc
Token ID6366
Feature activation-1.46e-02

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.007
 hurting+0.007
Neg logit contributions
itting-0.014
ability-0.012
uttering-0.012
over-0.012
adow-0.012
Tokenidents
Token ID3231
Feature activation-2.12e-02

Pos logit contributions
 fastest+0.020
 brakes+0.020
 highest+0.019
 fart+0.018
 hurting+0.017
Neg logit contributions
itting-0.032
uttering-0.029
ability-0.029
over-0.029
adow-0.028
Token happen
Token ID1645
Feature activation-7.77e-02

Pos logit contributions
 fastest+0.021
 brakes+0.021
 highest+0.020
 fart+0.018
 heads+0.016
Neg logit contributions
itting-0.055
uttering-0.051
ability-0.051
over-0.050
adow-0.048
Token.
Token ID13
Feature activation-6.68e-03

Pos logit contributions
 fastest+0.162
 brakes+0.159
 highest+0.155
 fart+0.148
 hurting+0.142
Neg logit contributions
itting-0.118
uttering-0.103
ability-0.103
over-0.099
adow-0.094
Token Just
Token ID2329
Feature activation-3.45e-02

Pos logit contributions
 fastest+0.009
 brakes+0.009
 highest+0.009
 fart+0.008
 hurting+0.008
Neg logit contributions
itting-0.015
ability-0.013
uttering-0.013
over-0.013
adow-0.013
Token be
Token ID307
Feature activation-2.49e-03

Pos logit contributions
 fastest+0.051
 brakes+0.051
 highest+0.049
 fart+0.045
 hurting+0.043
Neg logit contributions
itting-0.073
uttering-0.066
ability-0.066
over-0.064
adow-0.062
Token more
Token ID517
Feature activation+2.39e-02

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.003
 hurting+0.003
Neg logit contributions
itting-0.005
uttering-0.005
ability-0.005
over-0.004
adow-0.004
Token careful
Token ID8161
Feature activation+4.06e-03

Pos logit contributions
itting+0.051
ability+0.047
uttering+0.046
over+0.044
adow+0.043
Neg logit contributions
 fastest-0.037
 brakes-0.035
 highest-0.033
 hurting-0.032
 fart-0.031
Token couldn
Token ID3521
Feature activation-2.93e-02

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.004
 longest+0.004
Neg logit contributions
itting-0.004
ability-0.003
over-0.003
adow-0.003
uttering-0.003
Token't
Token ID470
Feature activation-1.51e-02

Pos logit contributions
 brakes+0.037
 fastest+0.037
 highest+0.034
 heads+0.033
 fart+0.033
Neg logit contributions
itting-0.065
over-0.060
uttering-0.059
ability-0.058
 early-0.057
Token run
Token ID1057
Feature activation-4.98e-03

Pos logit contributions
 fastest+0.026
 brakes+0.025
 highest+0.024
 fart+0.023
 heads+0.022
Neg logit contributions
itting-0.027
over-0.024
ability-0.024
uttering-0.024
adow-0.024
Token away
Token ID1497
Feature activation+1.32e-03

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.007
 heads+0.007
Neg logit contributions
itting-0.009
uttering-0.009
over-0.009
ability-0.008
 early-0.008
Token fast
Token ID3049
Feature activation-1.42e-02

Pos logit contributions
itting+0.003
ability+0.002
uttering+0.002
over+0.002
ooks+0.002
Neg logit contributions
 fastest-0.002
 brakes-0.002
 highest-0.002
 hurting-0.002
 fart-0.002
Token enough
Token ID1576
Feature activation+6.90e-02

Pos logit contributions
 fastest+0.027
 brakes+0.026
 highest+0.025
 fart+0.024
 hurting+0.024
Neg logit contributions
itting-0.025
uttering-0.022
ability-0.022
over-0.021
adow-0.021
Token and
Token ID290
Feature activation+1.08e-02

Pos logit contributions
itting+0.109
uttering+0.096
ability+0.096
over+0.093
adow+0.089
Neg logit contributions
 fastest-0.137
 brakes-0.136
 highest-0.132
 fart-0.125
 heads-0.120
Token was
Token ID373
Feature activation+8.78e-03

Pos logit contributions
itting+0.020
ability+0.018
uttering+0.017
over+0.017
adow+0.017
Neg logit contributions
 fastest-0.018
 brakes-0.018
 highest-0.018
 fart-0.017
 heads-0.016
Token arrested
Token ID5169
Feature activation-1.19e-02

Pos logit contributions
itting+0.018
uttering+0.016
ability+0.016
over+0.015
adow+0.015
Neg logit contributions
 fastest-0.014
 brakes-0.014
 highest-0.013
 fart-0.012
 hurting-0.012
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
 fastest+0.022
 brakes+0.022
 highest+0.021
 fart+0.020
 hurting+0.019
Neg logit contributions
itting-0.021
uttering-0.019
ability-0.019
over-0.017
adow-0.017
Token The
Token ID383
Feature activation-5.04e-03

Pos logit contributions
itting+0.020
uttering+0.018
ability+0.017
over+0.016
adow+0.016
Neg logit contributions
 brakes-0.025
 fastest-0.025
 highest-0.024
 fart-0.023
 hurting-0.021
Token them
Token ID606
Feature activation+1.28e-02

Pos logit contributions
itting+0.000
uttering+0.000
ability+0.000
over+0.000
adow+0.000
Neg logit contributions
 fastest-0.000
 brakes-0.000
 highest-0.000
 fart-0.000
 hurting-0.000
Token are
Token ID389
Feature activation-1.00e-02

Pos logit contributions
itting+0.025
ability+0.023
uttering+0.023
over+0.022
ooks+0.021
Neg logit contributions
 fastest-0.021
 brakes-0.020
 highest-0.020
 fart-0.018
 heads-0.018
Token very
Token ID845
Feature activation-5.09e-03

Pos logit contributions
 fastest+0.016
 brakes+0.016
 highest+0.015
 fart+0.014
 hurting+0.014
Neg logit contributions
itting-0.020
ability-0.019
uttering-0.018
over-0.018
adow-0.017
Token dark
Token ID3223
Feature activation-1.07e-02

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.007
 hurting+0.007
Neg logit contributions
itting-0.010
uttering-0.009
ability-0.009
over-0.009
adow-0.009
Token.
Token ID13
Feature activation+6.70e-03

Pos logit contributions
 fastest+0.021
 brakes+0.021
 highest+0.020
 fart+0.019
 hurting+0.019
Neg logit contributions
itting-0.018
ability-0.016
uttering-0.016
over-0.014
ooks-0.014
Token Some
Token ID2773
Feature activation+6.79e-02

Pos logit contributions
itting+0.009
uttering+0.008
ability+0.007
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.015
 brakes-0.015
 highest-0.015
 fart-0.014
 hurting-0.014
Token of
Token ID286
Feature activation-1.02e-03

Pos logit contributions
itting+0.132
uttering+0.119
ability+0.119
over+0.115
adow+0.111
Neg logit contributions
 fastest-0.111
 brakes-0.110
 highest-0.106
 fart-0.099
 hurting-0.094
Token them
Token ID606
Feature activation+1.21e-02

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 fart+0.002
 hurting+0.002
Neg logit contributions
itting-0.002
uttering-0.002
ability-0.002
over-0.002
adow-0.002
Token are
Token ID389
Feature activation-1.09e-02

Pos logit contributions
itting+0.025
uttering+0.022
ability+0.022
over+0.021
adow+0.021
Neg logit contributions
 fastest-0.019
 brakes-0.018
 highest-0.018
 fart-0.017
 hurting-0.016
Token very
Token ID845
Feature activation-6.21e-03

Pos logit contributions
 fastest+0.017
 brakes+0.016
 highest+0.016
 fart+0.014
 hurting+0.014
Neg logit contributions
itting-0.023
ability-0.021
uttering-0.021
over-0.020
adow-0.020
Token hot
Token ID3024
Feature activation-2.65e-02

Pos logit contributions
 fastest+0.009
 brakes+0.009
 highest+0.009
 fart+0.008
 hurting+0.008
Neg logit contributions
itting-0.013
uttering-0.012
ability-0.012
over-0.011
adow-0.011
Token had
Token ID550
Feature activation+1.37e-02

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.008
 hurting+0.007
Neg logit contributions
itting-0.009
uttering-0.008
ability-0.008
over-0.008
adow-0.008
Token looked
Token ID3114
Feature activation+1.00e-03

Pos logit contributions
itting+0.028
uttering+0.025
ability+0.024
over+0.023
ute+0.023
Neg logit contributions
 fastest-0.022
 brakes-0.022
 highest-0.021
 fart-0.020
 hurting-0.020
Token at
Token ID379
Feature activation+5.69e-03

Pos logit contributions
itting+0.002
ability+0.002
uttering+0.002
over+0.002
ooks+0.002
Neg logit contributions
 fastest-0.001
 highest-0.001
 brakes-0.001
 fart-0.001
 hurting-0.001
Token the
Token ID262
Feature activation+5.27e-03

Pos logit contributions
itting+0.011
uttering+0.010
ability+0.010
over+0.010
adow+0.009
Neg logit contributions
 fastest-0.010
 brakes-0.009
 highest-0.009
 fart-0.008
 hurting-0.008
Token tree
Token ID5509
Feature activation-1.25e-04

Pos logit contributions
itting+0.010
uttering+0.009
ability+0.009
adow+0.009
over+0.009
Neg logit contributions
 brakes-0.008
 fastest-0.008
 fart-0.008
 ruining-0.007
 highest-0.007
Token enough
Token ID1576
Feature activation+6.88e-02

Pos logit contributions
 fastest+0.000
 brakes+0.000
 highest+0.000
 fart+0.000
 hurting+0.000
Neg logit contributions
itting-0.000
ability-0.000
uttering-0.000
over-0.000
ooks-0.000
Token,
Token ID11
Feature activation+1.40e-02

Pos logit contributions
itting+0.122
uttering+0.108
ability+0.107
over+0.105
ooks+0.098
Neg logit contributions
 fastest-0.130
 brakes-0.128
 highest-0.126
 fart-0.116
 hurting-0.112
Token Toby
Token ID37578
Feature activation+1.54e-02

Pos logit contributions
itting+0.027
uttering+0.024
ability+0.023
over+0.022
adow+0.022
Neg logit contributions
 fastest-0.025
 brakes-0.025
 highest-0.024
 fart-0.022
 hurting-0.021
Token said
Token ID531
Feature activation+1.38e-02

Pos logit contributions
itting+0.031
uttering+0.029
ability+0.028
over+0.027
ooks+0.027
Neg logit contributions
 fastest-0.024
 brakes-0.024
 highest-0.023
 fart-0.021
 hurting-0.021
Token goodbye
Token ID24829
Feature activation+1.68e-03

Pos logit contributions
itting+0.029
uttering+0.026
ability+0.026
ooks+0.025
over+0.025
Neg logit contributions
 brakes-0.020
 fastest-0.020
 highest-0.020
 back-0.018
 hurting-0.018
Token and
Token ID290
Feature activation+8.64e-03

Pos logit contributions
itting+0.003
ability+0.003
uttering+0.003
ooks+0.003
over+0.002
Neg logit contributions
 highest-0.003
 fastest-0.003
 brakes-0.003
 heads-0.003
 back-0.003
Token cleaned
Token ID20750
Feature activation-2.06e-02

Pos logit contributions
itting+0.014
uttering+0.014
over+0.013
ability+0.013
ooks+0.012
Neg logit contributions
 fastest-0.008
 heads-0.008
 brakes-0.008
 highest-0.008
 separate-0.008
Token the
Token ID262
Feature activation+2.87e-03

Pos logit contributions
 fastest+0.035
 brakes+0.034
 highest+0.033
 fart+0.031
 heads+0.030
Neg logit contributions
itting-0.040
uttering-0.036
ability-0.035
over-0.034
adow-0.033
Token house
Token ID2156
Feature activation-5.47e-05

Pos logit contributions
itting+0.005
uttering+0.004
ability+0.004
over+0.004
adow+0.004
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.005
 fart-0.005
 hurting-0.005
Token and
Token ID290
Feature activation+8.09e-03

Pos logit contributions
 fastest+0.000
 highest+0.000
 brakes+0.000
 heads+0.000
 hurting+0.000
Neg logit contributions
itting-0.000
ability-0.000
uttering-0.000
over-0.000
adow-0.000
Token sure
Token ID1654
Feature activation+8.34e-03

Pos logit contributions
itting+0.019
uttering+0.017
over+0.016
ability+0.016
adow+0.015
Neg logit contributions
 forth-0.010
 brakes-0.010
 heads-0.010
 separate-0.010
 fastest-0.010
Token enough
Token ID1576
Feature activation+6.61e-02

Pos logit contributions
itting+0.018
uttering+0.017
ooks+0.015
ability+0.015
over+0.015
Neg logit contributions
 fastest-0.012
 brakes-0.012
 highest-0.011
 separate-0.011
 hurting-0.011
Token,
Token ID11
Feature activation+1.07e-02

Pos logit contributions
itting+0.100
uttering+0.087
ability+0.087
over+0.083
adow+0.079
Neg logit contributions
 fastest-0.139
 brakes-0.137
 highest-0.134
 fart-0.127
 hurting-0.123
Token her
Token ID607
Feature activation+1.70e-03

Pos logit contributions
itting+0.019
uttering+0.017
ability+0.016
over+0.015
adow+0.015
Neg logit contributions
 fastest-0.020
 highest-0.019
 brakes-0.019
 fart-0.018
 hurting-0.018
Token Mom
Token ID11254
Feature activation+1.21e-02

Pos logit contributions
itting+0.004
uttering+0.003
ability+0.003
over+0.003
adow+0.003
Neg logit contributions
 fastest-0.002
 brakes-0.002
 highest-0.002
 fart-0.002
 hurting-0.002
Token had
Token ID550
Feature activation+1.15e-02

Pos logit contributions
itting+0.023
uttering+0.021
ability+0.021
over+0.020
adow+0.019
Neg logit contributions
 fastest-0.020
 brakes-0.020
 highest-0.019
 fart-0.018
 hurting-0.017
Token a
Token ID257
Feature activation+5.98e-04

Pos logit contributions
itting+0.022
uttering+0.020
over+0.019
ability+0.019
ute+0.018
Neg logit contributions
 highest-0.019
 fastest-0.019
 brakes-0.019
 hurting-0.018
 separate-0.018
Token inside
Token ID2641
Feature activation-1.41e-02

Pos logit contributions
 brakes+0.017
 fastest+0.017
 highest+0.016
 nearest+0.015
 separate+0.015
Neg logit contributions
itting-0.018
ability-0.016
over-0.015
uttering-0.015
ooks-0.015
Token and
Token ID290
Feature activation+7.84e-03

Pos logit contributions
 hurting+0.018
 ruining+0.018
 separate+0.017
 brakes+0.017
 fastest+0.017
Neg logit contributions
itting-0.031
uttering-0.028
ability-0.027
over-0.027
ooks-0.026
Token waited
Token ID13488
Feature activation+2.25e-03

Pos logit contributions
itting+0.013
ability+0.012
uttering+0.012
over+0.011
adow+0.011
Neg logit contributions
 fastest-0.015
 brakes-0.014
 highest-0.014
 fart-0.013
 hurting-0.012
Token.
Token ID13
Feature activation+4.00e-03

Pos logit contributions
ability+0.005
itting+0.004
over+0.004
adow+0.004
uttering+0.004
Neg logit contributions
 hurting-0.003
 brakes-0.003
 highest-0.003
 fastest-0.003
 back-0.003
Token Sure
Token ID10889
Feature activation-9.47e-03

Pos logit contributions
itting+0.006
uttering+0.006
ability+0.005
over+0.005
adow+0.005
Neg logit contributions
 brakes-0.008
 fastest-0.008
 highest-0.008
 fart-0.008
 heads-0.007
Token enough
Token ID1576
Feature activation+6.54e-02

Pos logit contributions
 brakes+0.018
 fastest+0.017
 highest+0.017
 hurting+0.016
 fart+0.016
Neg logit contributions
itting-0.016
uttering-0.015
ability-0.014
over-0.014
adow-0.013
Token,
Token ID11
Feature activation+7.42e-03

Pos logit contributions
itting+0.118
uttering+0.105
ability+0.103
over+0.098
adow+0.094
Neg logit contributions
 brakes-0.122
 fastest-0.121
 highest-0.114
 fart-0.109
 hurting-0.108
Token the
Token ID262
Feature activation+2.75e-04

Pos logit contributions
itting+0.014
ability+0.013
uttering+0.012
over+0.012
adow+0.012
Neg logit contributions
 fastest-0.013
 brakes-0.013
 highest-0.013
 fart-0.012
 hurting-0.011
Token guard
Token ID4860
Feature activation+2.92e-03

Pos logit contributions
itting+0.000
uttering+0.000
 early+0.000
over+0.000
 about+0.000
Neg logit contributions
 fastest-0.000
 fart-0.000
 highest-0.000
 brakes-0.000
 forth-0.000
Token soon
Token ID2582
Feature activation-3.69e-02

Pos logit contributions
itting+0.005
ability+0.005
uttering+0.005
over+0.005
adow+0.004
Neg logit contributions
 fastest-0.005
 brakes-0.005
 highest-0.005
 fart-0.005
 heads-0.004
Token saw
Token ID2497
Feature activation+2.10e-02

Pos logit contributions
 fastest+0.056
 brakes+0.055
 highest+0.053
 fart+0.049
 hurting+0.049
Neg logit contributions
itting-0.077
uttering-0.070
ability-0.069
over-0.066
ooks-0.063
Token the
Token ID262
Feature activation+7.67e-03

Pos logit contributions
itting+0.014
uttering+0.012
ability+0.012
adow+0.011
over+0.011
Neg logit contributions
 fastest-0.011
 brakes-0.011
 highest-0.011
 fart-0.010
 hurting-0.010
Token forest
Token ID8222
Feature activation+8.16e-03

Pos logit contributions
itting+0.013
ability+0.012
uttering+0.012
over+0.012
adow+0.011
Neg logit contributions
 brakes-0.014
 fastest-0.013
 highest-0.012
 fart-0.012
 heads-0.012
Token.
Token ID13
Feature activation+8.87e-03

Pos logit contributions
itting+0.015
ability+0.014
uttering+0.013
ooks+0.012
over+0.012
Neg logit contributions
 fastest-0.015
 highest-0.014
 brakes-0.013
 hurting-0.012
 fart-0.012
Token The
Token ID383
Feature activation-4.36e-03

Pos logit contributions
itting+0.016
uttering+0.015
ability+0.014
over+0.014
adow+0.013
Neg logit contributions
 fastest-0.016
 brakes-0.015
 highest-0.015
 fart-0.014
 heads-0.013
Token bear
Token ID6842
Feature activation-1.63e-02

Pos logit contributions
 brakes+0.006
 fastest+0.006
 highest+0.005
 fart+0.005
 ruining+0.005
Neg logit contributions
itting-0.008
ability-0.008
 early-0.008
over-0.008
uttering-0.008
Token saw
Token ID2497
Feature activation+2.23e-02

Pos logit contributions
 fastest+0.028
 brakes+0.027
 highest+0.026
 fart+0.025
 heads+0.024
Neg logit contributions
itting-0.031
uttering-0.028
ability-0.028
over-0.027
ooks-0.025
Token the
Token ID262
Feature activation+4.49e-03

Pos logit contributions
itting+0.046
uttering+0.042
ability+0.040
over+0.040
adow+0.038
Neg logit contributions
 fastest-0.035
 brakes-0.035
 highest-0.034
 hurting-0.031
 fart-0.031
Token cricket
Token ID18836
Feature activation+1.49e-02

Pos logit contributions
itting+0.009
ability+0.008
uttering+0.008
over+0.008
 early+0.008
Neg logit contributions
 brakes-0.006
 fastest-0.006
 highest-0.006
 fart-0.005
 ruining-0.005
Token eating
Token ID6600
Feature activation-8.82e-03

Pos logit contributions
itting+0.028
uttering+0.025
ability+0.025
ooks+0.023
over+0.023
Neg logit contributions
 fastest-0.026
 brakes-0.025
 highest-0.025
 hurting-0.024
 fart-0.024
Token his
Token ID465
Feature activation+4.59e-03

Pos logit contributions
 highest+0.013
 fart+0.012
 fastest+0.012
 brakes+0.012
 hurting+0.012
Neg logit contributions
itting-0.020
uttering-0.018
ooks-0.017
over-0.017
ability-0.016
Token b
Token ID275
Feature activation-2.82e-02

Pos logit contributions
itting+0.008
ability+0.007
uttering+0.007
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.007
 hurting-0.007
Token.
Token ID13
Feature activation+7.18e-03

Pos logit contributions
itting+0.005
uttering+0.004
ability+0.004
ooks+0.004
over+0.004
Neg logit contributions
 fastest-0.005
 highest-0.005
 brakes-0.005
 fart-0.005
 hurting-0.005
Token She
Token ID1375
Feature activation+8.46e-04

Pos logit contributions
itting+0.012
uttering+0.012
over+0.011
ability+0.011
adow+0.010
Neg logit contributions
 highest-0.013
 fastest-0.013
 brakes-0.013
 heads-0.012
 fart-0.011
Token ran
Token ID4966
Feature activation-9.19e-03

Pos logit contributions
itting+0.002
ability+0.001
uttering+0.001
over+0.001
adow+0.001
Neg logit contributions
 fastest-0.001
 brakes-0.001
 highest-0.001
 fart-0.001
 hurting-0.001
Token to
Token ID284
Feature activation+3.02e-03

Pos logit contributions
 fastest+0.015
 brakes+0.014
 highest+0.014
 fart+0.013
 heads+0.012
Neg logit contributions
itting-0.019
ability-0.017
uttering-0.017
over-0.016
adow-0.016
Token the
Token ID262
Feature activation+1.14e-03

Pos logit contributions
itting+0.006
uttering+0.005
over+0.005
ability+0.005
adow+0.005
Neg logit contributions
 brakes-0.005
 fastest-0.005
 highest-0.005
 hurting-0.004
 fart-0.004
Token park
Token ID3952
Feature activation+2.49e-02

Pos logit contributions
itting+0.002
uttering+0.002
ability+0.002
 early+0.002
over+0.002
Neg logit contributions
 brakes-0.002
 fastest-0.002
 highest-0.002
 fart-0.002
 ruining-0.002
Token to
Token ID284
Feature activation+3.42e-03

Pos logit contributions
itting+0.045
ability+0.040
uttering+0.038
ooks+0.036
over+0.036
Neg logit contributions
 fastest-0.044
 highest-0.044
 brakes-0.043
 separate-0.041
 heads-0.040
Token show
Token ID905
Feature activation-5.52e-02

Pos logit contributions
itting+0.008
uttering+0.007
ability+0.007
over+0.007
adow+0.007
Neg logit contributions
 fastest-0.005
 brakes-0.004
 highest-0.004
 fart-0.004
 hurting-0.004
Token all
Token ID477
Feature activation-5.84e-03

Pos logit contributions
 brakes+0.100
 fastest+0.098
 highest+0.095
 hurting+0.091
 heads+0.090
Neg logit contributions
itting-0.102
uttering-0.089
ability-0.086
over-0.081
adow-0.081
Token the
Token ID262
Feature activation+2.24e-03

Pos logit contributions
 my+0.007
 me+0.006
 highest+0.006
 fastest+0.006
 brakes+0.006
Neg logit contributions
itting-0.014
uttering-0.013
over-0.013
icker-0.012
ute-0.012
Token other
Token ID584
Feature activation-2.19e-02

Pos logit contributions
itting+0.005
ability+0.004
uttering+0.004
over+0.004
adow+0.004
Neg logit contributions
 fastest-0.003
 brakes-0.003
 highest-0.003
 fart-0.003
 hurting-0.003
Token,
Token ID11
Feature activation-2.39e-04

Pos logit contributions
itting+0.006
ability+0.005
uttering+0.005
ooks+0.005
over+0.005
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.006
 hurting-0.005
 fart-0.005
Token no
Token ID645
Feature activation+2.42e-02

Pos logit contributions
 brakes+0.000
 fastest+0.000
 highest+0.000
 fart+0.000
 hurting+0.000
Neg logit contributions
itting-0.000
uttering-0.000
ability-0.000
over-0.000
ooks-0.000
Token toys
Token ID14958
Feature activation-5.67e-03

Pos logit contributions
itting+0.050
uttering+0.045
ability+0.044
over+0.043
adow+0.041
Neg logit contributions
 brakes-0.039
 fastest-0.038
 highest-0.036
 fart-0.034
 heads-0.033
Token,
Token ID11
Feature activation-3.73e-04

Pos logit contributions
 fastest+0.011
 brakes+0.011
 highest+0.011
 hurting+0.010
 fart+0.010
Neg logit contributions
itting-0.009
ability-0.008
uttering-0.008
over-0.008
ooks-0.007
Token and
Token ID290
Feature activation-3.61e-03

Pos logit contributions
 fastest+0.001
 brakes+0.001
 highest+0.001
 fart+0.001
 hurting+0.001
Neg logit contributions
itting-0.001
uttering-0.001
ability-0.001
over-0.001
ooks-0.001
Token no
Token ID645
Feature activation+2.37e-02

Pos logit contributions
 fastest+0.007
 brakes+0.007
 highest+0.007
 fart+0.006
 hurting+0.006
Neg logit contributions
itting-0.006
uttering-0.006
ability-0.005
over-0.005
adow-0.005
Token mom
Token ID1995
Feature activation+8.56e-03

Pos logit contributions
itting+0.047
uttering+0.042
ability+0.042
over+0.041
adow+0.039
Neg logit contributions
 fastest-0.039
 brakes-0.039
 highest-0.037
 fart-0.035
 hurting-0.033
Token.
Token ID13
Feature activation-1.21e-03

Pos logit contributions
itting+0.016
uttering+0.014
ability+0.014
over+0.014
adow+0.013
Neg logit contributions
 fastest-0.015
 brakes-0.015
 highest-0.014
 fart-0.013
 hurting-0.013
Token He
Token ID679
Feature activation-6.56e-03

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 fart+0.002
 hurting+0.002
Neg logit contributions
itting-0.002
ability-0.002
uttering-0.002
over-0.002
ooks-0.002
Token looked
Token ID3114
Feature activation-1.04e-02

Pos logit contributions
 fastest+0.012
 brakes+0.011
 highest+0.011
 fart+0.010
 hurting+0.010
Neg logit contributions
itting-0.012
ability-0.011
uttering-0.011
over-0.010
adow-0.010
Token sad
Token ID6507
Feature activation+2.12e-02

Pos logit contributions
 fastest+0.017
 brakes+0.017
 highest+0.016
 fart+0.015
 heads+0.014
Neg logit contributions
itting-0.021
uttering-0.019
ability-0.018
over-0.018
adow-0.017
Token The
Token ID383
Feature activation-1.28e-03

Pos logit contributions
itting+0.026
uttering+0.023
ability+0.022
over+0.021
adow+0.021
Neg logit contributions
 brakes-0.027
 fastest-0.027
 highest-0.026
 fart-0.024
 heads-0.023
Token fox
Token ID21831
Feature activation-1.37e-03

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 fart+0.002
 ruining+0.001
Neg logit contributions
itting-0.003
over-0.003
ability-0.003
uttering-0.003
adow-0.002
Token agreed
Token ID4987
Feature activation+3.28e-02

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 fart+0.002
 heads+0.002
Neg logit contributions
itting-0.003
uttering-0.002
ability-0.002
over-0.002
adow-0.002
Token and
Token ID290
Feature activation+1.37e-02

Pos logit contributions
itting+0.063
uttering+0.057
ability+0.055
ooks+0.052
adow+0.051
Neg logit contributions
 brakes-0.056
 fastest-0.054
 hurting-0.054
 highest-0.052
 separate-0.051
Token waited
Token ID13488
Feature activation+3.60e-03

Pos logit contributions
itting+0.023
ability+0.020
uttering+0.020
over+0.020
adow+0.019
Neg logit contributions
 fastest-0.026
 brakes-0.026
 highest-0.025
 fart-0.024
 heads-0.023
Token while
Token ID981
Feature activation+1.51e-02

Pos logit contributions
ability+0.007
itting+0.007
uttering+0.006
adow+0.006
over+0.006
Neg logit contributions
 highest-0.006
 fastest-0.006
 hurting-0.005
 brakes-0.005
 ruining-0.005
Token Woo
Token ID39832
Feature activation+3.49e-02

Pos logit contributions
itting+0.028
ability+0.025
uttering+0.025
over+0.024
adow+0.024
Neg logit contributions
 fastest-0.026
 brakes-0.026
 highest-0.025
 fart-0.023
 hurting-0.022
Tokenly
Token ID306
Feature activation+2.56e-02

Pos logit contributions
itting+0.077
uttering+0.071
ability+0.068
ooks+0.067
adow+0.066
Neg logit contributions
 fastest-0.047
 brakes-0.045
 highest-0.043
 heads-0.042
 hurting-0.041
Token went
Token ID1816
Feature activation-7.83e-03

Pos logit contributions
itting+0.057
uttering+0.053
ability+0.049
over+0.048
ooks+0.048
Neg logit contributions
 fastest-0.035
 brakes-0.035
 highest-0.033
 heads-0.031
 fart-0.031
Token to
Token ID284
Feature activation+8.14e-03

Pos logit contributions
 fastest+0.011
 brakes+0.011
 highest+0.011
 fart+0.009
 hurting+0.009
Neg logit contributions
itting-0.018
ability-0.016
uttering-0.016
over-0.015
adow-0.015
Token get
Token ID651
Feature activation-4.33e-04

Pos logit contributions
itting+0.018
uttering+0.016
ability+0.016
over+0.016
adow+0.015
Neg logit contributions
 fastest-0.012
 brakes-0.012
 highest-0.011
 fart-0.010
 hurting-0.010
Token until
Token ID1566
Feature activation-2.09e-02

Pos logit contributions
itting+0.013
uttering+0.011
ability+0.011
over+0.011
adow+0.010
Neg logit contributions
 fastest-0.021
 brakes-0.021
 highest-0.020
 fart-0.019
 hurting-0.019
Token it
Token ID340
Feature activation+2.15e-03

Pos logit contributions
 highest+0.036
 fastest+0.036
 brakes+0.036
 fart+0.033
 hurting+0.032
Neg logit contributions
itting-0.040
ability-0.036
uttering-0.035
over-0.034
adow-0.033
Token was
Token ID373
Feature activation+5.89e-04

Pos logit contributions
itting+0.004
uttering+0.003
ability+0.003
over+0.003
adow+0.003
Neg logit contributions
 brakes-0.004
 fastest-0.004
 highest-0.004
 fart-0.004
 brake-0.003
Token as
Token ID355
Feature activation+2.34e-02

Pos logit contributions
itting+0.001
ability+0.001
uttering+0.001
over+0.001
adow+0.001
Neg logit contributions
 fastest-0.001
 brakes-0.001
 highest-0.001
 fart-0.001
 hurting-0.001
Token wide
Token ID3094
Feature activation-4.54e-03

Pos logit contributions
itting+0.052
ability+0.048
over+0.047
uttering+0.047
adow+0.045
Neg logit contributions
 fastest-0.033
 highest-0.032
 brakes-0.030
 fart-0.029
 hurting-0.027
Token as
Token ID355
Feature activation+2.37e-02

Pos logit contributions
 brakes+0.009
 fastest+0.009
 highest+0.009
 hurting+0.008
 fart+0.008
Neg logit contributions
itting-0.007
ability-0.007
uttering-0.006
over-0.006
adow-0.006
Token a
Token ID257
Feature activation-2.40e-03

Pos logit contributions
itting+0.042
ability+0.037
uttering+0.037
over+0.036
adow+0.034
Neg logit contributions
 fastest-0.044
 highest-0.043
 brakes-0.043
 fart-0.040
 hurting-0.038
Token plate
Token ID7480
Feature activation+7.70e-03

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.004
 heads+0.004
Neg logit contributions
itting-0.004
uttering-0.004
over-0.004
ability-0.004
 early-0.004
Token.
Token ID13
Feature activation+6.46e-03

Pos logit contributions
itting+0.013
ability+0.013
uttering+0.012
ooks+0.011
over+0.011
Neg logit contributions
 fastest-0.014
 highest-0.013
 brakes-0.013
 hurting-0.013
 fart-0.013
Token The
Token ID383
Feature activation-8.70e-03

Pos logit contributions
itting+0.014
uttering+0.013
ability+0.012
over+0.012
adow+0.011
Neg logit contributions
 highest-0.010
 brakes-0.009
 fastest-0.009
 fart-0.009
 nearest-0.008
Token flower
Token ID15061
Feature activation-1.34e-02

Pos logit contributions
 brakes+0.016
 fastest+0.015
 highest+0.014
 fart+0.014
 brake+0.013
Neg logit contributions
itting-0.015
uttering-0.014
over-0.013
ability-0.013
adow-0.013
Token He
Token ID679
Feature activation-8.31e-03

Pos logit contributions
itting+0.000
uttering+0.000
over+0.000
ability+0.000
ment+0.000
Neg logit contributions
 fastest-0.000
 brakes-0.000
 highest-0.000
 fart-0.000
 heads-0.000
Token had
Token ID550
Feature activation+9.05e-04

Pos logit contributions
 fastest+0.014
 brakes+0.013
 highest+0.013
 fart+0.012
 hurting+0.012
Neg logit contributions
itting-0.016
uttering-0.015
ability-0.014
over-0.014
ment-0.014
Token a
Token ID257
Feature activation-1.35e-02

Pos logit contributions
itting+0.001
uttering+0.001
ability+0.001
over+0.001
adow+0.001
Neg logit contributions
 fastest-0.002
 brakes-0.002
 highest-0.002
 fart-0.002
 hurting-0.002
Token big
Token ID1263
Feature activation-2.32e-02

Pos logit contributions
 fastest+0.020
 brakes+0.019
 highest+0.019
 fart+0.017
 hurting+0.016
Neg logit contributions
itting-0.029
uttering-0.026
ability-0.026
over-0.026
adow-0.025
Token bed
Token ID3996
Feature activation-2.21e-02

Pos logit contributions
 fastest+0.036
 brakes+0.035
 highest+0.034
 fart+0.032
 heads+0.030
Neg logit contributions
itting-0.049
uttering-0.044
ability-0.044
over-0.042
ment-0.041
Token with
Token ID351
Feature activation+3.13e-03

Pos logit contributions
 fastest+0.038
 highest+0.035
 hurting+0.034
 brakes+0.033
 ruining+0.032
Neg logit contributions
ability-0.040
itting-0.039
uttering-0.036
ooks-0.034
illed-0.034
Token a
Token ID257
Feature activation-1.19e-02

Pos logit contributions
itting+0.005
uttering+0.004
ability+0.004
over+0.004
adow+0.004
Neg logit contributions
 fastest-0.007
 brakes-0.007
 highest-0.007
 fart-0.006
 hurting-0.006
Token soft
Token ID2705
Feature activation+1.22e-02

Pos logit contributions
 fastest+0.017
 brakes+0.017
 highest+0.017
 fart+0.015
 hurting+0.014
Neg logit contributions
itting-0.025
uttering-0.023
ability-0.023
over-0.022
adow-0.022
Token mattress
Token ID33388
Feature activation+1.10e-02

Pos logit contributions
itting+0.032
uttering+0.029
ability+0.029
ooks+0.028
ook+0.028
Neg logit contributions
 brakes-0.013
 head-0.012
 me-0.012
 bear-0.012
 heads-0.012
Token that
Token ID326
Feature activation+4.37e-03

Pos logit contributions
itting+0.020
ability+0.019
uttering+0.018
ooks+0.018
illed+0.017
Neg logit contributions
 fastest-0.020
 highest-0.019
 brakes-0.019
 hurting-0.018
 fart-0.017
Token he
Token ID339
Feature activation-1.49e-02

Pos logit contributions
itting+0.007
ability+0.006
uttering+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.008
 fart-0.008
 hurting-0.007
Token take
Token ID1011
Feature activation+5.72e-03

Pos logit contributions
itting+0.007
uttering+0.006
ability+0.006
over+0.006
 early+0.006
Neg logit contributions
 fastest-0.006
 brakes-0.005
 highest-0.005
 fart-0.005
 longest-0.004
Token Tim
Token ID5045
Feature activation-1.79e-02

Pos logit contributions
itting+0.008
ability+0.008
over+0.008
uttering+0.007
 about+0.007
Neg logit contributions
 fastest-0.012
 brakes-0.011
 highest-0.011
 fart-0.010
 nearest-0.010
Tokenmy
Token ID1820
Feature activation-1.75e-03

Pos logit contributions
 fastest+0.027
 brakes+0.025
 highest+0.025
 fart+0.023
 hurting+0.022
Neg logit contributions
itting-0.038
uttering-0.035
ability-0.034
over-0.033
ooks-0.032
Token's
Token ID338
Feature activation-3.29e-03

Pos logit contributions
 fastest+0.003
 brakes+0.002
 highest+0.002
 fart+0.002
 hurting+0.002
Neg logit contributions
itting-0.004
uttering-0.003
ability-0.003
over-0.003
adow-0.003
Token ball
Token ID2613
Feature activation+1.04e-02

Pos logit contributions
 brakes+0.006
 fastest+0.006
 highest+0.006
 fart+0.005
 ruining+0.005
Neg logit contributions
itting-0.006
ability-0.005
uttering-0.005
over-0.005
adow-0.005
Token,
Token ID11
Feature activation+6.06e-03

Pos logit contributions
itting+0.016
uttering+0.014
ability+0.014
over+0.014
adow+0.013
Neg logit contributions
 fastest-0.021
 brakes-0.021
 highest-0.020
 fart-0.019
 heads-0.018
Token but
Token ID475
Feature activation-3.36e-02

Pos logit contributions
itting+0.011
ability+0.010
uttering+0.010
over+0.010
adow+0.009
Neg logit contributions
 fastest-0.010
 brakes-0.010
 highest-0.010
 fart-0.009
 heads-0.009
Token Tim
Token ID5045
Feature activation-1.77e-02

Pos logit contributions
 fastest+0.075
 brakes+0.074
 highest+0.074
 fart+0.069
 heads+0.067
Neg logit contributions
itting-0.043
ability-0.037
uttering-0.036
over-0.034
 about-0.034
Tokenmy
Token ID1820
Feature activation-1.56e-03

Pos logit contributions
 fastest+0.029
 brakes+0.029
 highest+0.028
 fart+0.026
 heads+0.024
Neg logit contributions
itting-0.035
ability-0.031
uttering-0.031
over-0.030
adow-0.029
Token didn
Token ID1422
Feature activation+1.88e-03

Pos logit contributions
 fastest+0.003
 brakes+0.003
 highest+0.003
 fart+0.002
 heads+0.002
Neg logit contributions
itting-0.003
ability-0.003
uttering-0.003
over-0.002
adow-0.002
Token't
Token ID470
Feature activation-2.37e-02

Pos logit contributions
itting+0.004
uttering+0.004
over+0.004
 early+0.004
 about+0.004
Neg logit contributions
 fastest-0.003
 brakes-0.003
 heads-0.002
 highest-0.002
 fart-0.002
Token looked
Token ID3114
Feature activation-2.23e-03

Pos logit contributions
 fastest+0.000
 highest+0.000
 brakes+0.000
 fart+0.000
 longest+0.000
Neg logit contributions
itting-0.000
ability-0.000
uttering-0.000
adow-0.000
 early-0.000
Token closer
Token ID5699
Feature activation-3.68e-03

Pos logit contributions
 fastest+0.004
 brakes+0.004
 highest+0.004
 fart+0.003
 hurting+0.003
Neg logit contributions
itting-0.004
ability-0.004
uttering-0.004
over-0.004
adow-0.004
Token and
Token ID290
Feature activation+4.10e-03

Pos logit contributions
 highest+0.007
 fastest+0.007
 brakes+0.007
 heads+0.006
 fart+0.006
Neg logit contributions
itting-0.006
ability-0.006
over-0.006
uttering-0.006
ooks-0.005
Token he
Token ID339
Feature activation-6.45e-03

Pos logit contributions
itting+0.007
ability+0.006
uttering+0.006
adow+0.006
over+0.006
Neg logit contributions
 fastest-0.008
 brakes-0.008
 highest-0.007
 fart-0.007
 hurting-0.007
Token saw
Token ID2497
Feature activation+2.40e-02

Pos logit contributions
 fastest+0.014
 brakes+0.014
 highest+0.014
 fart+0.013
 ruining+0.013
Neg logit contributions
itting-0.009
uttering-0.007
adow-0.007
ability-0.007
over-0.006
Token lots
Token ID6041
Feature activation-1.02e-03

Pos logit contributions
itting+0.036
uttering+0.031
ability+0.030
over+0.030
adow+0.028
Neg logit contributions
 brakes-0.052
 fastest-0.052
 highest-0.051
 fart-0.047
 hurting-0.046
Token of
Token ID286
Feature activation+4.17e-03

Pos logit contributions
 fastest+0.003
 brakes+0.002
 highest+0.002
 fart+0.002
 heads+0.002
Neg logit contributions
itting-0.001
uttering-0.001
ability-0.001
over-0.001
adow-0.001
Token little
Token ID1310
Feature activation+1.07e-02

Pos logit contributions
itting+0.009
ability+0.008
uttering+0.008
over+0.008
adow+0.008
Neg logit contributions
 fastest-0.006
 brakes-0.006
 highest-0.005
 fart-0.005
 hurting-0.005
Token fishes
Token ID49765
Feature activation+1.47e-03

Pos logit contributions
itting+0.019
uttering+0.018
ability+0.017
over+0.017
adow+0.017
Neg logit contributions
 fastest-0.017
 brakes-0.017
 highest-0.017
 fart-0.016
 hurting-0.015
Token swimming
Token ID14899
Feature activation-2.61e-02

Pos logit contributions
itting+0.003
ability+0.003
over+0.003
uttering+0.003
ooks+0.002
Neg logit contributions
 fastest-0.002
 brakes-0.002
 highest-0.002
 hurting-0.002
 fart-0.002
Token around
Token ID1088
Feature activation+2.52e-02

Pos logit contributions
 fastest+0.044
 highest+0.043
 brakes+0.042
 fart+0.040
 heads+0.038
Neg logit contributions
itting-0.050
ability-0.046
uttering-0.044
over-0.044
ooks-0.042
Token deep
Token ID2769
Feature activation-1.26e-02

Pos logit contributions
itting+0.056
ability+0.050
uttering+0.049
over+0.048
adow+0.047
Neg logit contributions
 fastest-0.040
 highest-0.038
 brakes-0.036
 fart-0.034
 hurting-0.032
Token as
Token ID355
Feature activation+2.58e-02

Pos logit contributions
 fastest+0.027
 brakes+0.026
 highest+0.026
 fart+0.024
 hurting+0.024
Neg logit contributions
itting-0.019
ability-0.016
uttering-0.016
over-0.016
adow-0.015
Token he
Token ID339
Feature activation-7.26e-03

Pos logit contributions
itting+0.039
uttering+0.034
ability+0.034
over+0.033
adow+0.031
Neg logit contributions
 fastest-0.054
 brakes-0.053
 highest-0.052
 fart-0.049
 heads-0.047
Token could
Token ID714
Feature activation-7.03e-03

Pos logit contributions
 fastest+0.013
 brakes+0.013
 highest+0.013
 ruining+0.012
 fart+0.012
Neg logit contributions
itting-0.013
uttering-0.011
ability-0.011
adow-0.011
over-0.011
Token until
Token ID1566
Feature activation-1.85e-02

Pos logit contributions
 fastest+0.010
 highest+0.010
 brakes+0.010
 nearest+0.009
 hurting+0.009
Neg logit contributions
itting-0.015
ability-0.014
over-0.013
uttering-0.012
uggets-0.012
Token he
Token ID339
Feature activation-7.06e-03

Pos logit contributions
 brakes+0.031
 highest+0.031
 fastest+0.031
 fart+0.028
 hurting+0.027
Neg logit contributions
itting-0.037
ability-0.033
uttering-0.033
over-0.031
adow-0.030
Token saw
Token ID2497
Feature activation+2.26e-02

Pos logit contributions
 brakes+0.013
 fastest+0.013
 highest+0.013
 fart+0.012
 heads+0.012
Neg logit contributions
itting-0.012
ability-0.011
uttering-0.011
over-0.010
adow-0.010
Token a
Token ID257
Feature activation-3.26e-03

Pos logit contributions
itting+0.041
uttering+0.035
over+0.034
ability+0.034
illed+0.031
Neg logit contributions
 brakes-0.045
 highest-0.044
 fastest-0.044
 nearest-0.040
 ow-0.040
Token gl
Token ID1278
Feature activation-3.80e-02

Pos logit contributions
 fastest+0.005
 brakes+0.005
 highest+0.005
 fart+0.004
 heads+0.004
Neg logit contributions
itting-0.007
uttering-0.006
ability-0.006
over-0.006
adow-0.006
Tokenimmer
Token ID10957
Feature activation+2.37e-02

Pos logit contributions
 fastest+0.082
 brakes+0.080
 highest+0.080
 fart+0.075
 heads+0.072
Neg logit contributions
itting-0.052
ability-0.048
uttering-0.043
over-0.043
 early-0.040
Token of
Token ID286
Feature activation+8.14e-04

Pos logit contributions
itting+0.034
ability+0.032
uttering+0.030
over+0.029
adow+0.028
Neg logit contributions
 fastest-0.050
 highest-0.049
 brakes-0.048
 fart-0.046
 hurting-0.043
Token Robert
Token ID5199
Feature activation+6.42e-03

Pos logit contributions
 highest+0.015
 brakes+0.014
 fastest+0.013
 heads+0.013
 nearest+0.013
Neg logit contributions
itting-0.018
uttering-0.017
over-0.016
ability-0.016
ute-0.015
Token smiled
Token ID13541
Feature activation+1.04e-03

Pos logit contributions
itting+0.013
uttering+0.011
ability+0.011
over+0.011
ooks+0.010
Neg logit contributions
 fastest-0.010
 brakes-0.010
 highest-0.010
 fart-0.009
 heads-0.009
Token and
Token ID290
Feature activation+9.16e-03

Pos logit contributions
itting+0.002
ability+0.002
ooks+0.002
uttering+0.002
adow+0.002
Neg logit contributions
 highest-0.002
 back-0.002
 brakes-0.002
 fastest-0.002
 heads-0.001
Token said
Token ID531
Feature activation+1.17e-02

Pos logit contributions
itting+0.019
uttering+0.017
ability+0.016
over+0.016
ooks+0.016
Neg logit contributions
 fastest-0.014
 brakes-0.014
 highest-0.014
 fart-0.013
 heads-0.012
Token,
Token ID11
Feature activation+1.07e-02

Pos logit contributions
itting+0.023
uttering+0.021
ability+0.020
over+0.019
adow+0.019
Neg logit contributions
 fastest-0.019
 brakes-0.019
 highest-0.019
 hurting-0.017
 fart-0.017
Token "
Token ID366
Feature activation-2.21e-03

Pos logit contributions
itting+0.019
ability+0.017
uttering+0.017
over+0.016
adow+0.016
Neg logit contributions
 brakes-0.020
 fastest-0.019
 highest-0.019
 fart-0.018
 hurting-0.018
TokenMe
Token ID5308
Feature activation-3.86e-03

Pos logit contributions
 fastest+0.004
 highest+0.003
 brakes+0.003
 fart+0.003
 longest+0.003
Neg logit contributions
itting-0.004
uttering-0.004
ability-0.004
adow-0.004
over-0.004
Token too
Token ID1165
Feature activation+1.39e-02

Pos logit contributions
 brakes+0.006
 fastest+0.006
 highest+0.005
 fart+0.005
 heads+0.005
Neg logit contributions
itting-0.008
uttering-0.007
ability-0.007
over-0.007
adow-0.007
Token".
Token ID1911
Feature activation-1.02e-02

Pos logit contributions
itting+0.025
uttering+0.023
ability+0.022
over+0.022
adow+0.021
Neg logit contributions
 brakes-0.024
 fastest-0.024
 highest-0.023
 fart-0.022
 heads-0.021
Token<|endoftext|>
Token ID50256
Feature activation+1.33e-02

Pos logit contributions
 highest+0.017
 brakes+0.016
 fastest+0.015
 heads+0.014
 fart+0.014
Neg logit contributions
itting-0.020
uttering-0.019
over-0.017
ute-0.017
ability-0.016
TokenJim
Token ID18050
Feature activation-4.88e-03

Pos logit contributions
itting+0.024
uttering+0.022
ability+0.021
ooks+0.020
over+0.020
Neg logit contributions
 fastest-0.023
 highest-0.022
 brakes-0.021
 fart-0.021
 nearest-0.019
Token of
Token ID286
Feature activation-5.73e-03

Pos logit contributions
itting+0.005
uttering+0.005
ability+0.004
over+0.004
illed+0.004
Neg logit contributions
 fastest-0.005
 highest-0.005
 brakes-0.005
 fart-0.005
 nearest-0.005
Token fun
Token ID1257
Feature activation+1.38e-02

Pos logit contributions
 brakes+0.009
 fastest+0.008
 highest+0.008
 fart+0.007
 hurting+0.007
Neg logit contributions
itting-0.013
uttering-0.012
over-0.011
ability-0.011
ooks-0.011
Token together
Token ID1978
Feature activation-1.42e-02

Pos logit contributions
itting+0.027
ability+0.027
book+0.024
shop+0.024
ooks+0.023
Neg logit contributions
 fastest-0.020
 brakes-0.020
 hurting-0.019
 highest-0.018
 forth-0.018
Token.
Token ID13
Feature activation+3.58e-04

Pos logit contributions
 fastest+0.024
 highest+0.023
 brakes+0.023
 hurting+0.022
 heads+0.021
Neg logit contributions
itting-0.026
ability-0.025
uttering-0.023
ooks-0.023
adow-0.021
Token<|endoftext|>
Token ID50256
Feature activation+1.24e-02

Pos logit contributions
itting+0.001
uttering+0.000
ability+0.000
over+0.000
adow+0.000
Neg logit contributions
 brakes-0.001
 fastest-0.001
 highest-0.001
 fart-0.001
 heads-0.001
TokenS
Token ID50
Feature activation-4.18e-02

Pos logit contributions
itting+0.024
uttering+0.022
ability+0.021
ooks+0.020
over+0.019
Neg logit contributions
 fastest-0.021
 highest-0.020
 brakes-0.019
 fart-0.019
 nearest-0.017
Tokenara
Token ID3301
Feature activation+5.42e-03

Pos logit contributions
 fastest+0.097
 brakes+0.095
 highest+0.095
 fart+0.090
 heads+0.087
Neg logit contributions
itting-0.052
uttering-0.044
ability-0.044
over-0.042
adow-0.039
Token and
Token ID290
Feature activation-2.68e-03

Pos logit contributions
itting+0.011
uttering+0.010
ability+0.009
over+0.009
ooks+0.009
Neg logit contributions
 fastest-0.008
 heads-0.008
 brakes-0.008
 highest-0.007
 hurting-0.007
Token Ben
Token ID3932
Feature activation+1.15e-02

Pos logit contributions
 brakes+0.003
 forget+0.003
 fastest+0.003
 highest+0.003
 hurting+0.003
Neg logit contributions
itting-0.006
uttering-0.006
ooks-0.006
ability-0.005
over-0.005
Token are
Token ID389
Feature activation-1.82e-02

Pos logit contributions
itting+0.025
uttering+0.023
over+0.022
ability+0.022
ooks+0.021
Neg logit contributions
 fastest-0.016
 brakes-0.016
 highest-0.015
 heads-0.015
 hurting-0.014
Token friends
Token ID2460
Feature activation-1.44e-02

Pos logit contributions
 fastest+0.026
 highest+0.025
 brakes+0.022
 forth+0.021
 hurting+0.021
Neg logit contributions
ability-0.038
itting-0.037
ment-0.035
uttering-0.032
over-0.032
Token out
Token ID503
Feature activation+2.42e-03

Pos logit contributions
itting+0.025
ability+0.023
uttering+0.022
over+0.022
adow+0.021
Neg logit contributions
 fastest-0.027
 brakes-0.026
 highest-0.026
 fart-0.024
 hurting-0.023
Token and
Token ID290
Feature activation+3.25e-03

Pos logit contributions
itting+0.004
uttering+0.003
ability+0.003
over+0.003
adow+0.003
Neg logit contributions
 fastest-0.005
 brakes-0.005
 highest-0.005
 fart-0.004
 hurting-0.004
Token see
Token ID766
Feature activation+9.50e-03

Pos logit contributions
itting+0.007
ability+0.006
uttering+0.006
over+0.006
adow+0.006
Neg logit contributions
 fastest-0.005
 brakes-0.005
 highest-0.004
 fart-0.004
 hurting-0.004
Token what
Token ID644
Feature activation-4.90e-03

Pos logit contributions
itting+0.015
uttering+0.014
ability+0.013
over+0.013
adow+0.012
Neg logit contributions
 fastest-0.019
 brakes-0.019
 highest-0.019
 fart-0.017
 hurting-0.017
Token would
Token ID561
Feature activation+3.23e-02

Pos logit contributions
 fastest+0.009
 brakes+0.008
 highest+0.008
 fart+0.008
 hurting+0.007
Neg logit contributions
itting-0.009
uttering-0.008
over-0.008
ability-0.008
adow-0.008
Token happen
Token ID1645
Feature activation-6.11e-02

Pos logit contributions
itting+0.071
ability+0.066
uttering+0.065
 about+0.064
over+0.063
Neg logit contributions
 brakes-0.041
 fastest-0.040
 highest-0.039
 fart-0.034
 heads-0.032
Token.
Token ID13
Feature activation+2.28e-04

Pos logit contributions
 fastest+0.108
 highest+0.102
 brakes+0.091
 fart+0.090
 longest+0.089
Neg logit contributions
itting-0.113
ability-0.112
uttering-0.101
adow-0.096
over-0.096
Token\n
Token ID198
Feature activation+2.02e-03

Pos logit contributions
itting+0.000
uttering+0.000
over+0.000
ability+0.000
adow+0.000
Neg logit contributions
 fastest-0.000
 brakes-0.000
 highest-0.000
 fart-0.000
 heads-0.000
Token\n
Token ID198
Feature activation+2.60e-03

Pos logit contributions
itting+0.004
uttering+0.003
over+0.003
ability+0.003
ment+0.003
Neg logit contributions
 fastest-0.003
 highest-0.003
 brakes-0.003
 fart-0.003
 nearest-0.003
TokenSo
Token ID2396
Feature activation-2.17e-02

Pos logit contributions
itting+0.004
uttering+0.004
ability+0.004
ment+0.004
adow+0.004
Neg logit contributions
 fastest-0.005
 highest-0.005
 brakes-0.005
 fart-0.005
 nearest-0.004
Token,
Token ID11
Feature activation+7.28e-03

Pos logit contributions
 fastest+0.043
 brakes+0.042
 highest+0.041
 fart+0.039
 heads+0.038
Neg logit contributions
itting-0.035
uttering-0.032
ability-0.030
over-0.030
adow-0.029
Token and
Token ID290
Feature activation-5.28e-03

Pos logit contributions
 fastest+0.015
 highest+0.014
 brakes+0.014
 fart+0.014
 longest+0.013
Neg logit contributions
itting-0.015
ability-0.014
uttering-0.013
over-0.013
adow-0.013
Token Daddy
Token ID26926
Feature activation+8.53e-04

Pos logit contributions
 fastest+0.011
 highest+0.010
 brakes+0.010
 fart+0.010
 longest+0.009
Neg logit contributions
itting-0.009
ability-0.008
uttering-0.007
 early-0.007
over-0.007
Token say
Token ID910
Feature activation+1.07e-02

Pos logit contributions
itting+0.002
uttering+0.002
ability+0.002
over+0.002
ooks+0.002
Neg logit contributions
 brakes-0.001
 fastest-0.001
 highest-0.001
 fart-0.001
 heads-0.001
Token,
Token ID11
Feature activation-9.78e-04

Pos logit contributions
itting+0.017
uttering+0.015
ability+0.015
over+0.014
adow+0.014
Neg logit contributions
 fastest-0.021
 brakes-0.021
 highest-0.020
 fart-0.019
 hurting-0.019
Token "
Token ID366
Feature activation-1.37e-02

Pos logit contributions
 fastest+0.002
 brakes+0.002
 highest+0.002
 fart+0.002
 heads+0.002
Neg logit contributions
itting-0.001
uttering-0.001
ability-0.001
over-0.001
adow-0.001
TokenVol
Token ID16598
Feature activation-4.84e-02

Pos logit contributions
 fastest+0.023
 brakes+0.023
 highest+0.022
 fart+0.020
 hurting+0.019
Neg logit contributions
itting-0.026
uttering-0.024
ability-0.024
over-0.023
adow-0.022
Tokencan
Token ID5171
Feature activation-2.58e-02

Pos logit contributions
 brakes+0.076
 fastest+0.075
 highest+0.071
 fart+0.068
 hurting+0.065
Neg logit contributions
itting-0.102
uttering-0.090
ability-0.086
over-0.085
adow-0.083
Tokenoes
Token ID3028
Feature activation+5.88e-03

Pos logit contributions
 brakes+0.033
 fastest+0.032
 highest+0.031
 fart+0.028
 ruining+0.026
Neg logit contributions
itting-0.061
uttering-0.054
ability-0.054
adow-0.052
over-0.052
Token are
Token ID389
Feature activation-2.00e-02

Pos logit contributions
itting+0.012
uttering+0.011
ability+0.011
over+0.011
adow+0.010
Neg logit contributions
 fastest-0.009
 brakes-0.009
 highest-0.009
 fart-0.008
 hurting-0.007
Token not
Token ID407
Feature activation-1.11e-02

Pos logit contributions
 brakes+0.036
 fastest+0.035
 highest+0.034
 fart+0.032
 forth+0.032
Neg logit contributions
itting-0.033
 about-0.029
uttering-0.028
 early-0.028
ability-0.028
Token for
Token ID329
Feature activation-7.84e-03

Pos logit contributions
 fastest+0.018
 brakes+0.018
 highest+0.017
 fart+0.017
 hurting+0.016
Neg logit contributions
itting-0.022
ability-0.021
uttering-0.020
over-0.019
adow-0.018
Token"
Token ID1
Feature activation-3.93e-02

Pos logit contributions
 fastest+0.010
 brakes+0.010
 highest+0.010
 fart+0.009
 hurting+0.009
Neg logit contributions
itting-0.008
uttering-0.007
ability-0.007
over-0.007
adow-0.007
TokenLook
Token ID8567
Feature activation+2.48e-02

Pos logit contributions
 fastest+0.083
 brakes+0.083
 highest+0.078
 hurting+0.076
 ruining+0.074
Neg logit contributions
itting-0.057
ability-0.050
uttering-0.049
over-0.045
illed-0.043
Token,
Token ID11
Feature activation-6.49e-03

Pos logit contributions
itting+0.023
adow+0.019
uttering+0.018
 about+0.018
ook+0.016
Neg logit contributions
 fastest-0.068
 brakes-0.066
 highest-0.064
 fart-0.061
 nearest-0.060
Token a
Token ID257
Feature activation-1.49e-02

Pos logit contributions
 fastest+0.013
 brakes+0.012
 highest+0.012
 fart+0.011
 hurting+0.011
Neg logit contributions
itting-0.010
uttering-0.009
adow-0.009
ability-0.009
ment-0.008
Token coin
Token ID10752
Feature activation+5.11e-04

Pos logit contributions
 fastest+0.029
 brakes+0.028
 highest+0.027
 fart+0.026
 heads+0.026
Neg logit contributions
itting-0.023
ability-0.021
uttering-0.021
over-0.020
adow-0.019
Token!"
Token ID2474
Feature activation-4.64e-02

Pos logit contributions
itting+0.001
uttering+0.001
ability+0.001
over+0.001
adow+0.001
Neg logit contributions
 fastest-0.001
 brakes-0.001
 highest-0.001
 fart-0.001
 heads-0.001
Token Tim
Token ID5045
Feature activation-2.76e-02

Pos logit contributions
 brakes+0.068
 fastest+0.067
 highest+0.064
 fart+0.059
 heads+0.057
Neg logit contributions
itting-0.100
uttering-0.091
ability-0.090
over-0.088
adow-0.085
Token said
Token ID531
Feature activation-1.09e-02

Pos logit contributions
 fastest+0.055
 brakes+0.054
 highest+0.053
 fart+0.050
 hurting+0.047
Neg logit contributions
itting-0.044
ability-0.039
uttering-0.038
adow-0.037
over-0.037
Token.
Token ID13
Feature activation+4.74e-04

Pos logit contributions
 fastest+0.021
 brakes+0.021
 highest+0.020
 fart+0.019
 heads+0.019
Neg logit contributions
itting-0.018
ability-0.016
uttering-0.016
over-0.015
adow-0.015
Token "
Token ID366
Feature activation-1.40e-02

Pos logit contributions
itting+0.001
ability+0.001
uttering+0.001
over+0.001
adow+0.001
Neg logit contributions
 brakes-0.001
 fastest-0.001
 highest-0.001
 fart-0.001
 heads-0.001
TokenLet
Token ID5756
Feature activation+8.13e-03

Pos logit contributions
 fastest+0.024
 brakes+0.024
 highest+0.024
 fart+0.022
 heads+0.020
Neg logit contributions
itting-0.027
ability-0.024
uttering-0.024
over-0.023
adow-0.023
Token Tom
Token ID4186
Feature activation-1.08e-02

Pos logit contributions
 brakes+0.042
 fastest+0.041
 highest+0.040
 hurting+0.037
 fart+0.036
Neg logit contributions
itting-0.061
ability-0.055
uttering-0.054
over-0.053
adow-0.052
Token said
Token ID531
Feature activation-1.43e-02

Pos logit contributions
 fastest+0.019
 brakes+0.019
 highest+0.018
 fart+0.017
 heads+0.017
Neg logit contributions
itting-0.020
uttering-0.018
ability-0.018
over-0.017
ooks-0.016
Token.
Token ID13
Feature activation+2.70e-03

Pos logit contributions
 brakes+0.025
 fastest+0.025
 highest+0.024
 heads+0.023
 hurting+0.023
Neg logit contributions
itting-0.026
ability-0.023
uttering-0.022
over-0.020
adow-0.020
Token\n
Token ID198
Feature activation-1.19e-03

Pos logit contributions
itting+0.005
uttering+0.004
adow+0.004
ability+0.004
illed+0.004
Neg logit contributions
 brakes-0.005
 fastest-0.005
 highest-0.005
 fart-0.004
 nearest-0.004
Token\n
Token ID198
Feature activation-1.42e-03

Pos logit contributions
 brakes+0.002
 highest+0.002
 fastest+0.002
 nearest+0.002
 me+0.002
Neg logit contributions
itting-0.002
uttering-0.002
ability-0.002
over-0.002
adow-0.002
Token"
Token ID1
Feature activation-4.38e-02

Pos logit contributions
 fastest+0.003
 brakes+0.002
 highest+0.002
 fart+0.002
 nearest+0.002
Neg logit contributions
itting-0.003
adow-0.002
uttering-0.002
ability-0.002
illed-0.002
TokenBy
Token ID3886
Feature activation-2.65e-02

Pos logit contributions
 fastest+0.073
 brakes+0.071
 highest+0.069
 fart+0.065
 hurting+0.061
Neg logit contributions
itting-0.085
ability-0.077
uttering-0.077
adow-0.074
over-0.074
Tokene
Token ID68
Feature activation-6.08e-02

Pos logit contributions
 brakes+0.047
 fastest+0.046
 highest+0.045
 fart+0.040
 heads+0.038
Neg logit contributions
itting-0.048
over-0.045
uttering-0.044
ability-0.043
adow-0.041
Token,
Token ID11
Feature activation-8.24e-03

Pos logit contributions
 fastest+0.122
 brakes+0.121
 highest+0.117
 fart+0.112
 hurting+0.107
Neg logit contributions
itting-0.097
uttering-0.085
ability-0.085
over-0.081
adow-0.078
Token Lily
Token ID20037
Feature activation-4.57e-03

Pos logit contributions
 fastest+0.013
 brakes+0.013
 highest+0.013
 fart+0.012
 heads+0.011
Neg logit contributions
itting-0.016
ability-0.014
uttering-0.014
over-0.014
 early-0.013
Token and
Token ID290
Feature activation-1.18e-02

Pos logit contributions
 fastest+0.008
 brakes+0.008
 highest+0.008
 fart+0.007
 hurting+0.007
Neg logit contributions
itting-0.009
uttering-0.008
ability-0.007
over-0.007
ooks-0.007