TOP ACTIVATIONS
MAX = 0.083

ugly . They spent the day playing games and having fun
car . He spent the afternoon racing it around the garden
It spent the day with the animals . They
happy . They spent the day sharing the poem with all
The goat spent her day wandering around the field ,
They spent the day playing games and eating cake
The gang spent the day having fun playing with the
much . They spent the day playing on the beach ,
bored . She spent the day talking to the v ase
idea . She spent the day with Rob by and taught
ite . He spent the day playing with it , but
it . He spent all day hard at work , rubbing
The sisters spent the day playing in the closet .
They spent the day talking and playing together .
baking . She spent the day writing a reliable poem about
. She would spend the day eating and resting - that
They spent the day exploring the lake together .
. She often spent her day playing in the park .
is how Tim spent his day , sitting under the tree
back . They spent the day exploring together . They had

MIN ACTIVATIONS
MIN = -0.088

" WHAT ARE YOU DO ING ?" Mom yells .
to drink it to stay hyd rated !" Billy drank the
me to the nearest fire hyd rant ?"
The car hits a fire hyd rant . The fire hyd
spray from a nearby fire hyd rant to keep her dry
It 's important to stay hyd rated and healthy ," mom
relax together . How does that sound ?" Rex
and the lightning struck the hyd rant . Suddenly water was
ice cream , if there 's any left ." Lily nodded
. It will keep you hyd rated . You have to
rain that way , Lily and Sam ?" L
will never go near a crocod ile again ." Lily said
to go . Nothing can prevent it . Now , move
" Hey , what 's wrong ?" Lily 's mom
" Hey , what 's going on here ?" a
Tom , Mia , what 's wrong ?" Mom asks .
the ring as a souven ir . They thanked the dove
and clean . How does that sound ?" S
keep them as a souven ir of Frost y . They
it 's important to stay hyd rated and drink water when

INTERVAL 0.041 - 0.083
CONTAINS 0.820%

her crab family . One day , Sally wanted to go
a way to make vegetables fun by cutting them into fun
colors and shapes . She can make towers , bridges ,
they spl ashed and had fun . The little
pink dress and her spark ly shoes every day . One

INTERVAL -0.002 - 0.041
CONTAINS 55.395%

!" Sp ike thought for a moment and then
, bigger and brighter than Lily and Ben 's . He
mean monkey named Bob . Bob was always very naughty and
cozy . <|endoftext|> Jen loved her horse , Max . He
to use its be ak to pick up food instead of

INTERVAL -0.045 - -0.002
CONTAINS 43.565%

and cried because she didn 't have her toy bunny anymore
looked everywhere , but it was hidden so well that she
But then , the wind blows and the door
welcome . I 'm so glad I could help you ."
too . It could not stop in time . It hit

INTERVAL -0.088 - -0.045
CONTAINS 0.220%

But this time , let 's not vanish . Let 's
. L ily and Ben take turns flying their
something else ." S ara and Tom were very
home . <|endoftext|> Once upon a time , there was a
!" L ily and Tom hug and laugh .
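
The four INTERVAL blocks above look like an equal-width split of the range from MIN to MAX, with CONTAINS giving the share of token activations that fall in each bin (the four percentages sum to 100). The dashboard does not say how its bins are built, so the following is only a minimal sketch under that equal-width assumption. The function name `interval_shares` and the sample values are hypothetical and are not taken from the dashboard.

```python
def interval_shares(activations, n_bins=4):
    """Split [min, max] into n_bins equal-width intervals and return
    (low, high, percent_of_values) per interval, highest interval first.

    Assumption: equal-width bins; the dashboard's real rule is not stated.
    """
    lo, hi = min(activations), max(activations)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        # Clamp the index so the maximum value lands in the last bin.
        # If every value is identical (width == 0), use bin 0.
        idx = min(int((a - lo) / width), n_bins - 1) if width > 0 else 0
        counts[idx] += 1
    total = len(activations)
    rows = [
        (lo + i * width, lo + (i + 1) * width, 100.0 * counts[i] / total)
        for i in range(n_bins)
    ]
    return rows[::-1]


# Hypothetical activations chosen to span the MIN and MAX shown above.
sample = [-0.088, -0.030, -0.010, 0.005, 0.020, 0.030, 0.050, 0.083]
for low, high, pct in interval_shares(sample):
    print(f"INTERVAL {low:.3f} - {high:.3f}")
    print(f"CONTAINS {pct:.3f}%")
```

Run over the full set of per-token activations for a feature, a routine like this would report how heavily the mass concentrates near zero, as in the two middle intervals above.
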
Token ugly
Token ID13400
Feature activation+5.82e-03

Pos logit contributions
ah+0.025
 mate+0.023
ff+0.022
bar+0.022
 One+0.022
Neg logit contributions
bes-0.027
we-0.027
kins-0.024
irling-0.023
 wiping-0.023
Token.
Token ID13
Feature activation+6.37e-03

Pos logit contributions
bes+0.013
we+0.012
irling+0.012
kins+0.012
 sudden+0.011
Neg logit contributions
ah-0.007
 enough-0.007
 mate-0.006
 One-0.006
ff-0.006
Token They
Token ID1119
Feature activation+2.23e-03

Pos logit contributions
bes+0.012
we+0.011
 wearing+0.010
 wiping+0.010
irling+0.010
Neg logit contributions
ah-0.010
 mate-0.010
ff-0.009
bar-0.009
ow-0.009
Token spent
Token ID3377
Feature activation+2.63e-02

Pos logit contributions
bes+0.004
we+0.003
 wearing+0.003
kins+0.003
irling+0.003
Neg logit contributions
ah-0.004
ff-0.004
 mate-0.004
bar-0.003
 One-0.003
Token the
Token ID262
Feature activation+1.59e-02

Pos logit contributions
we+0.069
bes+0.065
kins+0.062
ini+0.061
keeping+0.059
Neg logit contributions
 enough-0.023
ah-0.020
 One-0.019
 go-0.016
 mate-0.015
Token day
Token ID1110
Feature activation+8.33e-02

Pos logit contributions
bes+0.037
we+0.036
kins+0.035
keeping+0.034
 complexion+0.033
Neg logit contributions
ah-0.018
 go-0.014
 mate-0.013
bar-0.012
pole-0.012
Token playing
Token ID2712
Feature activation+4.38e-02

Pos logit contributions
bes+0.133
we+0.110
keeping+0.109
kins+0.103
 slug+0.099
Neg logit contributions
 mate-0.137
ow-0.136
 enough-0.129
bar-0.129
 come-0.129
Token games
Token ID1830
Feature activation+2.98e-02

Pos logit contributions
bes+0.077
we+0.076
irling+0.067
kins+0.065
 complexion+0.065
Neg logit contributions
ah-0.066
 mate-0.063
 One-0.060
 enough-0.060
ow-0.059
Token and
Token ID290
Feature activation+1.70e-02

Pos logit contributions
bes+0.057
we+0.056
irling+0.049
kins+0.049
 wiping+0.047
Neg logit contributions
ah-0.045
 mate-0.043
 enough-0.040
 One-0.039
bar-0.039
Token having
Token ID1719
Feature activation+2.55e-02

Pos logit contributions
bes+0.027
kins+0.023
we+0.023
irling+0.020
ini+0.020
Neg logit contributions
ah-0.032
 mate-0.030
 enough-0.029
 One-0.029
 come-0.029
Token fun
Token ID1257
Feature activation+5.52e-02

Pos logit contributions
bes+0.049
we+0.048
kins+0.044
irling+0.042
 wiping+0.041
Neg logit contributions
ah-0.038
 mate-0.036
 One-0.036
ff-0.035
bar-0.033
Token car
Token ID1097
Feature activation+2.23e-03

Pos logit contributions
bes+0.036
we+0.035
kins+0.032
irling+0.031
 wiping+0.030
Neg logit contributions
ah-0.028
 mate-0.025
bar-0.025
ff-0.025
 One-0.025
Token.
Token ID13
Feature activation+8.35e-03

Pos logit contributions
bes+0.004
we+0.004
kins+0.004
irling+0.004
 wiping+0.003
Neg logit contributions
ah-0.004
 mate-0.003
ff-0.003
 One-0.003
bar-0.003
Token He
Token ID679
Feature activation+1.03e-02

Pos logit contributions
bes+0.019
we+0.018
 wearing+0.017
 wiping+0.016
kins+0.016
Neg logit contributions
ah-0.010
 mate-0.009
ff-0.009
bar-0.008
ow-0.008
Token spent
Token ID3377
Feature activation+2.73e-02

Pos logit contributions
bes+0.017
we+0.017
kins+0.015
irling+0.014
 wiping+0.014
Neg logit contributions
ah-0.019
 mate-0.017
bar-0.017
 One-0.017
ff-0.017
Token the
Token ID262
Feature activation+1.43e-02

Pos logit contributions
we+0.065
bes+0.061
kins+0.060
ini+0.056
 wiping+0.055
Neg logit contributions
ah-0.027
 enough-0.027
 One-0.026
 mate-0.024
bar-0.022
Token afternoon
Token ID6672
Feature activation+8.26e-02

Pos logit contributions
bes+0.030
we+0.029
kins+0.029
irling+0.026
 wearing+0.025
Neg logit contributions
ah-0.019
bar-0.017
 mate-0.016
 One-0.016
pole-0.015
Token racing
Token ID11717
Feature activation+3.17e-02

Pos logit contributions
bes+0.108
we+0.098
kins+0.087
keeping+0.078
irling+0.077
Neg logit contributions
 mate-0.169
bar-0.169
ah-0.161
 enough-0.156
 go-0.150
Token it
Token ID340
Feature activation+2.09e-02

Pos logit contributions
bes+0.054
we+0.053
kins+0.046
irling+0.046
 wearing+0.046
Neg logit contributions
ah-0.055
ff-0.052
 mate-0.051
 One-0.050
bar-0.049
Token around
Token ID1088
Feature activation+2.15e-02

Pos logit contributions
bes+0.036
we+0.036
kins+0.031
irling+0.031
 wearing+0.031
Neg logit contributions
ah-0.035
ff-0.033
 mate-0.033
 One-0.032
bar-0.031
Token the
Token ID262
Feature activation-1.09e-02

Pos logit contributions
bes+0.043
we+0.042
kins+0.038
irling+0.037
 wiping+0.036
Neg logit contributions
ah-0.032
 mate-0.029
 One-0.029
ff-0.028
bar-0.028
Token garden
Token ID11376
Feature activation-4.98e-03

Pos logit contributions
ah+0.019
 mate+0.018
 One+0.018
ff+0.018
bar+0.018
Neg logit contributions
bes-0.018
we-0.018
kins-0.016
irling-0.015
 wiping-0.015
Token
Token ID198
Feature activation+6.34e-03

Pos logit contributions
bes+0.022
we+0.021
 wiping+0.019
 wearing+0.019
kins+0.019
Neg logit contributions
ah-0.017
 mate-0.016
bar-0.014
ff-0.014
ow-0.014
Token
Token ID198
Feature activation+1.68e-02

Pos logit contributions
bes+0.013
we+0.012
kins+0.011
irling+0.011
 wiping+0.011
Neg logit contributions
ah-0.009
 mate-0.009
ff-0.008
 One-0.008
bar-0.008
TokenIt
Token ID1026
Feature activation+1.29e-02

Pos logit contributions
bes+0.029
we+0.029
kins+0.025
irling+0.025
 wiping+0.024
Neg logit contributions
ah-0.029
 mate-0.027
ff-0.026
 One-0.026
bar-0.026
Token spent
Token ID3377
Feature activation+2.31e-02

Pos logit contributions
bes+0.020
we+0.020
kins+0.017
irling+0.017
 wiping+0.016
Neg logit contributions
ah-0.024
 mate-0.023
ff-0.022
bar-0.022
 One-0.022
Token the
Token ID262
Feature activation+1.81e-02

Pos logit contributions
we+0.052
bes+0.051
kins+0.046
 wiping+0.045
irling+0.045
Neg logit contributions
ah-0.028
 One-0.025
 enough-0.024
 mate-0.023
ff-0.022
Token day
Token ID1110
Feature activation+8.22e-02

Pos logit contributions
bes+0.034
we+0.033
kins+0.030
irling+0.029
 wiping+0.029
Neg logit contributions
ah-0.029
 mate-0.026
 One-0.025
ff-0.025
bar-0.024
Token with
Token ID351
Feature activation+1.97e-02

Pos logit contributions
bes+0.117
we+0.102
kins+0.090
keeping+0.085
irling+0.084
Neg logit contributions
ah-0.161
 mate-0.160
bar-0.157
ow-0.152
ff-0.150
Token the
Token ID262
Feature activation+6.12e-03

Pos logit contributions
bes+0.039
we+0.038
kins+0.035
irling+0.034
 wiping+0.033
Neg logit contributions
ah-0.029
 mate-0.026
ff-0.026
 One-0.025
bar-0.025
Token animals
Token ID4695
Feature activation+1.61e-02

Pos logit contributions
bes+0.009
we+0.009
kins+0.008
 wearing+0.007
 wrapping+0.007
Neg logit contributions
ah-0.012
bar-0.011
ff-0.011
 mate-0.011
ow-0.011
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
bes+0.027
we+0.026
kins+0.023
irling+0.023
 wiping+0.022
Neg logit contributions
ah-0.029
 mate-0.027
ff-0.026
 One-0.026
bar-0.026
Token They
Token ID1119
Feature activation+6.29e-03

Pos logit contributions
bes+0.021
we+0.020
 wiping+0.018
 wearing+0.018
kins+0.017
Neg logit contributions
ah-0.016
 mate-0.015
bar-0.014
ff-0.013
ow-0.013
Token happy
Token ID3772
Feature activation-5.04e-03

Pos logit contributions
bes+0.040
we+0.039
irling+0.034
kins+0.033
 wiping+0.032
Neg logit contributions
ah-0.044
 One-0.043
 mate-0.043
 enough-0.040
ff-0.040
Token.
Token ID13
Feature activation-6.48e-03

Pos logit contributions
ah+0.007
 mate+0.007
 One+0.007
ff+0.006
bar+0.006
Neg logit contributions
bes-0.010
we-0.010
irling-0.009
kins-0.009
 wiping-0.008
Token They
Token ID1119
Feature activation-6.30e-03

Pos logit contributions
ah+0.010
 mate+0.009
ff+0.009
ow+0.008
bar+0.008
Neg logit contributions
bes-0.012
we-0.012
 wearing-0.011
 wiping-0.011
kins-0.010
Token spent
Token ID3377
Feature activation+1.63e-02

Pos logit contributions
ah+0.011
bar+0.011
 mate+0.010
ff+0.010
ow+0.010
Neg logit contributions
bes-0.009
kins-0.008
we-0.008
 wearing-0.008
irling-0.008
Token the
Token ID262
Feature activation+1.55e-02

Pos logit contributions
we+0.039
bes+0.038
kins+0.036
ini+0.034
 wiping+0.034
Neg logit contributions
ah-0.016
 enough-0.016
 One-0.014
 mate-0.014
ff-0.012
Token day
Token ID1110
Feature activation+8.11e-02

Pos logit contributions
we+0.035
bes+0.035
kins+0.033
keeping+0.031
 wearing+0.031
Neg logit contributions
ah-0.019
 mate-0.015
bar-0.014
 go-0.014
ny-0.013
Token sharing
Token ID7373
Feature activation+1.93e-02

Pos logit contributions
bes+0.112
we+0.098
keeping+0.089
kins+0.088
 complexion+0.083
Neg logit contributions
bar-0.149
 mate-0.149
ow-0.143
ah-0.142
 enough-0.139
Token the
Token ID262
Feature activation+2.35e-02

Pos logit contributions
bes+0.037
we+0.036
irling+0.033
kins+0.032
 wiping+0.031
Neg logit contributions
ah-0.029
 mate-0.026
 One-0.026
ff-0.026
bar-0.025
Token poem
Token ID21247
Feature activation+2.29e-02

Pos logit contributions
bes+0.042
we+0.042
irling+0.038
kins+0.036
 wiping+0.034
Neg logit contributions
ah-0.039
 mate-0.036
 One-0.034
bar-0.033
 enough-0.032
Token with
Token ID351
Feature activation+1.62e-03

Pos logit contributions
bes+0.039
we+0.038
kins+0.034
irling+0.034
 wiping+0.032
Neg logit contributions
ah-0.039
 mate-0.037
ff-0.036
 One-0.036
bar-0.035
Token all
Token ID477
Feature activation+2.86e-03

Pos logit contributions
bes+0.003
we+0.003
kins+0.003
irling+0.003
 wiping+0.003
Neg logit contributions
ah-0.002
 mate-0.002
ff-0.002
bar-0.002
 One-0.002
Token
Token ID198
Feature activation+6.05e-03

Pos logit contributions
ah+0.008
 One+0.008
ff+0.007
 mate+0.007
ow+0.007
Neg logit contributions
bes-0.015
we-0.015
kins-0.014
irling-0.013
 wiping-0.013
TokenThe
Token ID464
Feature activation-1.79e-02

Pos logit contributions
bes+0.012
we+0.012
kins+0.011
 wiping+0.010
irling+0.010
Neg logit contributions
ah-0.009
 One-0.008
 mate-0.008
ff-0.008
bar-0.008
Token goat
Token ID26917
Feature activation+1.52e-02

Pos logit contributions
ah+0.031
 mate+0.029
ff+0.029
bar+0.028
 One+0.028
Neg logit contributions
bes-0.031
we-0.030
kins-0.027
irling-0.026
 wiping-0.026
Token spent
Token ID3377
Feature activation+2.42e-02

Pos logit contributions
bes+0.025
we+0.024
kins+0.022
irling+0.021
 wearing+0.021
Neg logit contributions
ah-0.026
 mate-0.025
ff-0.025
bar-0.024
 One-0.024
Token her
Token ID607
Feature activation+5.04e-02

Pos logit contributions
we+0.059
bes+0.056
kins+0.054
ini+0.051
 wiping+0.051
Neg logit contributions
ah-0.025
 One-0.024
 enough-0.023
 mate-0.020
ow-0.018
Token day
Token ID1110
Feature activation+8.11e-02

Pos logit contributions
we+0.115
bes+0.105
kins+0.102
irling+0.100
keeping+0.100
Neg logit contributions
ah-0.062
 One-0.054
 mate-0.050
ow-0.047
 go-0.045
Token wandering
Token ID24504
Feature activation+1.60e-02

Pos logit contributions
bes+0.106
we+0.104
keeping+0.085
kins+0.084
irling+0.075
Neg logit contributions
 mate-0.159
ah-0.152
bar-0.152
ow-0.151
 go-0.144
Token around
Token ID1088
Feature activation+2.64e-02

Pos logit contributions
bes+0.029
we+0.029
kins+0.026
irling+0.025
 wiping+0.025
Neg logit contributions
ah-0.026
 mate-0.024
 One-0.023
ff-0.023
bar-0.023
Token the
Token ID262
Feature activation+1.60e-03

Pos logit contributions
bes+0.057
we+0.055
kins+0.051
irling+0.050
 wiping+0.048
Neg logit contributions
ah-0.034
 One-0.032
 mate-0.032
ff-0.031
 enough-0.029
Token field
Token ID2214
Feature activation-1.05e-02

Pos logit contributions
bes+0.003
we+0.003
irling+0.002
kins+0.002
 wearing+0.002
Neg logit contributions
ah-0.003
 mate-0.003
ff-0.003
bar-0.003
ow-0.003
Token,
Token ID11
Feature activation+1.54e-02

Pos logit contributions
ah+0.018
ff+0.017
 mate+0.016
bar+0.016
 One+0.016
Neg logit contributions
bes-0.018
we-0.018
kins-0.016
 wearing-0.016
irling-0.015
Token
Token ID198
Feature activation-3.41e-03

Pos logit contributions
bes+0.009
we+0.009
kins+0.008
 wearing+0.008
irling+0.008
Neg logit contributions
ah-0.006
bar-0.005
 mate-0.005
ff-0.005
 One-0.005
Token
Token ID198
Feature activation+1.03e-02

Pos logit contributions
ah+0.006
bar+0.005
 mate+0.005
ff+0.005
ow+0.005
Neg logit contributions
bes-0.006
we-0.006
kins-0.005
irling-0.005
 wearing-0.005
TokenThey
Token ID2990
Feature activation+5.68e-03

Pos logit contributions
bes+0.015
we+0.015
irling+0.013
kins+0.013
lf+0.012
Neg logit contributions
bar-0.020
ah-0.020
ow-0.019
ff-0.019
 mate-0.019
Token spent
Token ID3377
Feature activation+3.10e-02

Pos logit contributions
bes+0.009
we+0.009
kins+0.009
 wearing+0.008
 wiping+0.008
Neg logit contributions
ah-0.010
bar-0.009
ff-0.009
 mate-0.009
 One-0.009
Token the
Token ID262
Feature activation+2.33e-02

Pos logit contributions
we+0.075
bes+0.073
kins+0.066
ini+0.066
irling+0.065
Neg logit contributions
ah-0.030
 enough-0.030
 One-0.028
 mate-0.027
ow-0.024
Token day
Token ID1110
Feature activation+8.11e-02

Pos logit contributions
bes+0.046
we+0.044
kins+0.041
irling+0.040
keeping+0.040
Neg logit contributions
ah-0.032
 mate-0.028
 go-0.027
 One-0.026
ow-0.025
Token playing
Token ID2712
Feature activation+4.66e-02

Pos logit contributions
bes+0.123
we+0.108
keeping+0.098
web+0.096
 slug+0.095
Neg logit contributions
 mate-0.142
 enough-0.136
ow-0.135
ah-0.132
 go-0.128
Token games
Token ID1830
Feature activation+2.69e-02

Pos logit contributions
we+0.074
bes+0.074
irling+0.064
kins+0.062
 wiping+0.061
Neg logit contributions
ah-0.080
 mate-0.078
 One-0.076
 enough-0.073
ff-0.072
Token and
Token ID290
Feature activation+1.74e-02

Pos logit contributions
bes+0.052
we+0.051
irling+0.046
kins+0.045
 wiping+0.044
Neg logit contributions
ah-0.039
 mate-0.038
 One-0.036
 enough-0.035
ow-0.034
Token eating
Token ID6600
Feature activation+3.47e-02

Pos logit contributions
bes+0.027
we+0.023
irling+0.021
kins+0.021
ini+0.020
Neg logit contributions
ah-0.034
 mate-0.033
 enough-0.032
 leave-0.030
 One-0.030
Token cake
Token ID12187
Feature activation+1.53e-02

Pos logit contributions
bes+0.067
we+0.066
irling+0.059
keeping+0.055
ini+0.055
Neg logit contributions
ah-0.049
 enough-0.047
 One-0.047
 mate-0.045
ff-0.045
Token
Token ID198
Feature activation+1.81e-03

Pos logit contributions
ah+0.010
 mate+0.009
ff+0.009
 One+0.009
bar+0.008
Neg logit contributions
bes-0.014
we-0.014
kins-0.013
irling-0.012
 wiping-0.012
TokenThe
Token ID464
Feature activation-4.93e-03

Pos logit contributions
bes+0.003
we+0.003
kins+0.003
irling+0.003
 wiping+0.003
Neg logit contributions
ah-0.003
 mate-0.003
ff-0.003
 One-0.003
bar-0.003
Token gang
Token ID7706
Feature activation+1.11e-02

Pos logit contributions
 One+0.007
 mate+0.007
ah+0.007
bar+0.007
 enough+0.006
Neg logit contributions
we-0.010
bes-0.010
irling-0.009
kins-0.008
 wiping-0.008
Token spent
Token ID3377
Feature activation+2.30e-02

Pos logit contributions
bes+0.021
we+0.020
kins+0.018
irling+0.018
 wearing+0.018
Neg logit contributions
ah-0.017
 mate-0.016
ff-0.016
 One-0.015
bar-0.015
Token the
Token ID262
Feature activation+2.14e-02

Pos logit contributions
we+0.058
bes+0.056
ini+0.053
 wiping+0.051
iflower+0.051
Neg logit contributions
 enough-0.020
 One-0.018
ah-0.018
 away-0.015
ow-0.015
Token day
Token ID1110
Feature activation+8.08e-02

Pos logit contributions
we+0.046
bes+0.044
kins+0.043
 complexion+0.040
irling+0.040
Neg logit contributions
ah-0.028
 One-0.024
ny-0.024
 go-0.022
bar-0.020
Token having
Token ID1719
Feature activation+3.14e-02

Pos logit contributions
bes+0.108
we+0.095
kins+0.087
keeping+0.083
iflower+0.078
Neg logit contributions
bar-0.153
ah-0.153
 mate-0.152
ow-0.151
 enough-0.151
Token fun
Token ID1257
Feature activation+6.24e-02

Pos logit contributions
bes+0.063
we+0.062
kins+0.057
irling+0.055
 wiping+0.053
Neg logit contributions
ah-0.044
 One-0.041
 mate-0.041
ff-0.039
 enough-0.039
Token playing
Token ID2712
Feature activation+4.61e-02

Pos logit contributions
bes+0.118
kins+0.112
we+0.112
ini+0.112
iflower+0.111
Neg logit contributions
 enough-0.080
 One-0.076
ah-0.075
 go-0.067
 him-0.065
Token with
Token ID351
Feature activation+2.85e-02

Pos logit contributions
we+0.076
bes+0.074
irling+0.067
 complexion+0.065
keeping+0.064
Neg logit contributions
ah-0.072
 One-0.072
 mate-0.068
 enough-0.067
 please-0.064
Token the
Token ID262
Feature activation+2.06e-02

Pos logit contributions
bes+0.064
we+0.063
irling+0.058
kins+0.056
 wiping+0.054
Neg logit contributions
ah-0.033
 mate-0.031
 One-0.030
ff-0.029
 enough-0.029
Token much
Token ID881
Feature activation-5.81e-03

Pos logit contributions
bes+0.022
we+0.021
kins+0.019
irling+0.019
 wiping+0.019
Neg logit contributions
ah-0.013
 mate-0.012
 One-0.012
ff-0.012
bar-0.011
Token.
Token ID13
Feature activation+6.85e-03

Pos logit contributions
ah+0.009
 mate+0.008
ff+0.008
bar+0.008
 One+0.008
Neg logit contributions
bes-0.011
we-0.011
kins-0.010
 wearing-0.010
irling-0.010
Token They
Token ID1119
Feature activation+4.32e-03

Pos logit contributions
bes+0.012
we+0.012
 wearing+0.010
kins+0.010
 wiping+0.010
Neg logit contributions
ah-0.012
 mate-0.011
ff-0.010
bar-0.010
ow-0.010
Token spent
Token ID3377
Feature activation+2.45e-02

Pos logit contributions
bes+0.006
we+0.006
kins+0.006
 wearing+0.006
irling+0.006
Neg logit contributions
ah-0.008
ff-0.007
bar-0.007
ow-0.007
 One-0.007
Token the
Token ID262
Feature activation+2.21e-02

Pos logit contributions
we+0.063
bes+0.060
ini+0.057
kins+0.056
 wiping+0.055
Neg logit contributions
ah-0.021
 enough-0.021
 One-0.019
 mate-0.017
 go-0.015
Token day
Token ID1110
Feature activation+8.06e-02

Pos logit contributions
bes+0.047
we+0.046
 wiping+0.042
kins+0.042
 complexion+0.041
Neg logit contributions
ah-0.029
 mate-0.026
 go-0.023
ny-0.023
bar-0.022
Token playing
Token ID2712
Feature activation+4.43e-02

Pos logit contributions
bes+0.132
we+0.111
keeping+0.103
kins+0.101
web+0.098
Neg logit contributions
 mate-0.150
ah-0.137
bar-0.137
 go-0.132
ow-0.132
Token on
Token ID319
Feature activation+1.93e-02

Pos logit contributions
bes+0.076
we+0.074
irling+0.063
 wiping+0.063
kins+0.062
Neg logit contributions
ah-0.074
 mate-0.073
 One-0.067
 enough-0.066
 go-0.063
Token the
Token ID262
Feature activation+1.27e-02

Pos logit contributions
we+0.049
bes+0.048
kins+0.045
irling+0.043
 wiping+0.043
Neg logit contributions
 One-0.017
ah-0.017
 mate-0.015
 enough-0.015
 go-0.013
Token beach
Token ID10481
Feature activation+3.07e-03

Pos logit contributions
bes+0.024
we+0.023
kins+0.021
 wiping+0.020
keeping+0.020
Neg logit contributions
 mate-0.019
ah-0.019
 go-0.017
bar-0.017
 One-0.017
Token,
Token ID11
Feature activation+1.92e-02

Pos logit contributions
bes+0.005
we+0.005
kins+0.005
irling+0.004
 wiping+0.004
Neg logit contributions
ah-0.005
 mate-0.005
ff-0.005
 One-0.005
bar-0.005
Token bored
Token ID17533
Feature activation+2.96e-03

Pos logit contributions
ah+0.019
 mate+0.017
ff+0.017
bar+0.017
ow+0.017
Neg logit contributions
bes-0.015
we-0.015
kins-0.013
irling-0.013
 wiping-0.013
Token.
Token ID13
Feature activation-6.99e-03

Pos logit contributions
bes+0.006
we+0.006
kins+0.005
irling+0.005
 wiping+0.005
Neg logit contributions
ah-0.004
 mate-0.004
ff-0.004
 One-0.004
bar-0.004
Token She
Token ID1375
Feature activation-2.30e-03

Pos logit contributions
ah+0.010
 mate+0.009
ff+0.009
bar+0.009
ale+0.009
Neg logit contributions
bes-0.014
we-0.013
 wearing-0.012
 wiping-0.012
kins-0.012
Token spent
Token ID3377
Feature activation+1.95e-02

Pos logit contributions
ah+0.004
 mate+0.003
ff+0.003
bar+0.003
 One+0.003
Neg logit contributions
bes-0.004
we-0.004
kins-0.004
irling-0.004
 wiping-0.004
Token the
Token ID262
Feature activation+1.88e-02

Pos logit contributions
we+0.046
bes+0.043
kins+0.043
ini+0.041
irling+0.039
Neg logit contributions
 enough-0.020
ah-0.020
 One-0.019
 mate-0.017
ow-0.016
Token day
Token ID1110
Feature activation+8.04e-02

Pos logit contributions
we+0.041
bes+0.040
kins+0.039
irling+0.036
 wearing+0.036
Neg logit contributions
ah-0.024
ny-0.018
 mate-0.018
 go-0.018
bar-0.018
Token talking
Token ID3375
Feature activation+1.05e-02

Pos logit contributions
bes+0.113
we+0.098
kins+0.090
keeping+0.089
 complexion+0.081
Neg logit contributions
 mate-0.148
bar-0.146
ow-0.142
ah-0.139
 enough-0.139
Token to
Token ID284
Feature activation-2.89e-03

Pos logit contributions
bes+0.018
we+0.018
kins+0.016
irling+0.016
 wiping+0.015
Neg logit contributions
ah-0.018
 mate-0.016
ff-0.016
 One-0.016
bar-0.016
Token the
Token ID262
Feature activation-2.51e-03

Pos logit contributions
ah+0.005
 mate+0.004
ff+0.004
bar+0.004
 One+0.004
Neg logit contributions
bes-0.006
we-0.005
kins-0.005
irling-0.005
 wiping-0.005
Token v
Token ID410
Feature activation-3.44e-03

Pos logit contributions
ah+0.004
 mate+0.004
 One+0.004
ff+0.004
bar+0.004
Neg logit contributions
bes-0.005
we-0.004
irling-0.004
kins-0.004
 wiping-0.004
Tokenase
Token ID589
Feature activation-2.80e-03

Pos logit contributions
ah+0.006
 mate+0.006
ff+0.006
 One+0.006
bar+0.006
Neg logit contributions
bes-0.006
we-0.005
kins-0.005
irling-0.005
 wiping-0.005
Token idea
Token ID2126
Feature activation-2.47e-03

Pos logit contributions
ah+0.002
 mate+0.002
ff+0.002
 One+0.002
bar+0.002
Neg logit contributions
bes-0.004
we-0.004
kins-0.003
irling-0.003
 wiping-0.003
Token.
Token ID13
Feature activation+9.94e-03

Pos logit contributions
ah+0.004
 mate+0.003
ff+0.003
 One+0.003
bar+0.003
Neg logit contributions
bes-0.005
we-0.005
kins-0.004
 wearing-0.004
 wiping-0.004
Token She
Token ID1375
Feature activation+5.24e-03

Pos logit contributions
bes+0.021
we+0.020
kins+0.018
 wearing+0.018
 wiping+0.018
Neg logit contributions
ah-0.014
 mate-0.012
ff-0.012
bar-0.012
ow-0.011
Token spent
Token ID3377
Feature activation+2.95e-02

Pos logit contributions
bes+0.009
we+0.009
kins+0.008
irling+0.008
 wiping+0.008
Neg logit contributions
ah-0.009
 mate-0.008
ff-0.008
 One-0.008
bar-0.008
Token the
Token ID262
Feature activation+2.02e-02

Pos logit contributions
we+0.071
bes+0.067
ini+0.063
kins+0.063
irling+0.061
Neg logit contributions
 enough-0.028
ah-0.028
 One-0.026
 mate-0.023
ow-0.022
Token day
Token ID1110
Feature activation+8.03e-02

Pos logit contributions
bes+0.041
we+0.040
kins+0.038
irling+0.035
 wearing+0.035
Neg logit contributions
ah-0.030
bar-0.025
 mate-0.024
 One-0.023
 go-0.022
Token with
Token ID351
Feature activation+2.81e-02

Pos logit contributions
bes+0.106
we+0.093
kins+0.082
keeping+0.077
irling+0.075
Neg logit contributions
bar-0.159
 mate-0.157
ah-0.153
ow-0.151
 enough-0.149
Token Rob
Token ID3851
Feature activation+2.39e-02

Pos logit contributions
bes+0.060
we+0.059
kins+0.053
irling+0.053
 wiping+0.050
Neg logit contributions
ah-0.036
 One-0.034
 mate-0.034
 enough-0.032
ff-0.032
Tokenby
Token ID1525
Feature activation+4.85e-02

Pos logit contributions
we+0.047
bes+0.044
kins+0.043
 wearing+0.041
irling+0.040
Neg logit contributions
ah-0.033
 mate-0.032
 One-0.032
bar-0.031
pole-0.030
Token and
Token ID290
Feature activation+2.00e-03

Pos logit contributions
bes+0.078
we+0.076
kins+0.067
irling+0.066
 wiping+0.064
Neg logit contributions
ah-0.089
 mate-0.083
ff-0.081
bar-0.080
 One-0.080
Token taught
Token ID7817
Feature activation+4.77e-03

Pos logit contributions
bes+0.004
we+0.004
kins+0.003
irling+0.003
 wiping+0.003
Neg logit contributions
ah-0.003
 mate-0.003
ff-0.003
 One-0.003
bar-0.003
Tokenite
Token ID578
Feature activation-4.61e-03

Pos logit contributions
bes+0.066
we+0.063
kins+0.058
irling+0.057
 wiping+0.056
Neg logit contributions
ah-0.047
 mate-0.043
bar-0.042
ff-0.042
 One-0.041
Token.
Token ID13
Feature activation+7.73e-03

Pos logit contributions
ff+0.008
ah+0.008
 mate+0.007
 One+0.007
upid+0.007
Neg logit contributions
bes-0.007
we-0.007
 wearing-0.007
kins-0.006
irling-0.006
Token He
Token ID679
Feature activation+7.10e-03

Pos logit contributions
bes+0.016
we+0.015
 wearing+0.014
 wiping+0.013
kins+0.013
Neg logit contributions
ah-0.011
 mate-0.010
ff-0.010
ow-0.009
ale-0.009
Token spent
Token ID3377
Feature activation+2.29e-02

Pos logit contributions
bes+0.012
we+0.012
kins+0.010
irling+0.010
 wiping+0.010
Neg logit contributions
ah-0.012
 mate-0.012
ff-0.011
 One-0.011
bar-0.011
Token the
Token ID262
Feature activation+1.34e-02

Pos logit contributions
we+0.053
bes+0.051
kins+0.048
 wiping+0.046
irling+0.045
Neg logit contributions
ah-0.027
 enough-0.024
 One-0.024
 mate-0.023
bar-0.021
Token day
Token ID1110
Feature activation+7.99e-02

Pos logit contributions
bes+0.027
we+0.027
kins+0.025
 wiping+0.023
irling+0.023
Neg logit contributions
ah-0.020
 mate-0.017
bar-0.017
 One-0.016
pole-0.016
Token playing
Token ID2712
Feature activation+3.96e-02

Pos logit contributions
bes+0.122
we+0.109
kins+0.100
keeping+0.096
 complexion+0.089
Neg logit contributions
bar-0.142
 mate-0.141
ah-0.141
ow-0.133
pole-0.132
Token with
Token ID351
Feature activation+2.18e-02

Pos logit contributions
bes+0.060
we+0.059
kins+0.052
irling+0.050
 wiping+0.048
Neg logit contributions
ah-0.075
 mate-0.071
 One-0.069
bar-0.068
ff-0.067
Token it
Token ID340
Feature activation+1.03e-02

Pos logit contributions
bes+0.050
we+0.048
kins+0.045
irling+0.044
 wiping+0.043
Neg logit contributions
ah-0.026
 mate-0.023
ff-0.023
 One-0.023
bar-0.022
Token,
Token ID11
Feature activation+1.67e-02

Pos logit contributions
bes+0.018
we+0.018
kins+0.016
 wearing+0.015
irling+0.015
Neg logit contributions
ah-0.017
ff-0.016
 mate-0.016
 One-0.015
bar-0.015
Token but
Token ID475
Feature activation+1.15e-02

Pos logit contributions
bes+0.025
we+0.024
kins+0.023
irling+0.020
 complexion+0.019
Neg logit contributions
ah-0.032
 mate-0.030
bar-0.030
 One-0.029
 enough-0.029
Token it
Token ID340
Feature activation+2.35e-03

Pos logit contributions
bes+0.040
we+0.039
kins+0.035
irling+0.033
 wiping+0.033
Neg logit contributions
ah-0.048
 mate-0.045
ff-0.045
 One-0.044
bar-0.043
Token.
Token ID13
Feature activation+1.56e-02

Pos logit contributions
bes+0.004
we+0.004
kins+0.003
 wiping+0.003
 wearing+0.003
Neg logit contributions
ah-0.005
ff-0.004
 mate-0.004
bar-0.004
ow-0.004
Token He
Token ID679
Feature activation+1.24e-02

Pos logit contributions
bes+0.027
we+0.024
 wearing+0.024
 wiping+0.023
 wrapping+0.022
Neg logit contributions
ah-0.028
 mate-0.026
ff-0.024
 enough-0.023
bar-0.023
Token spent
Token ID3377
Feature activation+3.51e-02

Pos logit contributions
bes+0.024
we+0.023
kins+0.021
keeping+0.021
irling+0.021
Neg logit contributions
ah-0.020
bar-0.017
 mate-0.017
 One-0.017
ff-0.016
Token all
Token ID477
Feature activation+3.06e-02

Pos logit contributions
we+0.094
bes+0.087
kins+0.086
ini+0.082
keeping+0.081
Neg logit contributions
 One-0.026
ah-0.025
 enough-0.024
 him-0.019
 mate-0.019
Token day
Token ID1110
Feature activation+7.97e-02

Pos logit contributions
bes+0.059
we+0.056
kins+0.051
irling+0.049
 wiping+0.048
Neg logit contributions
ah-0.050
 mate-0.044
ff-0.044
 One-0.044
bar-0.044
Token hard
Token ID1327
Feature activation+4.43e-02

Pos logit contributions
we+0.106
bes+0.096
keeping+0.095
kins+0.086
ini+0.086
Neg logit contributions
bar-0.145
 mate-0.139
 One-0.130
 go-0.126
pole-0.126
Token at
Token ID379
Feature activation+1.62e-02

Pos logit contributions
bes+0.065
kins+0.063
we+0.063
 wrist+0.058
ini+0.055
Neg logit contributions
 enough-0.084
ah-0.077
 One-0.076
bar-0.069
ow-0.069
Token work
Token ID670
Feature activation+2.69e-02

Pos logit contributions
bes+0.039
we+0.039
kins+0.035
irling+0.034
beans+0.033
Neg logit contributions
ah-0.017
 One-0.016
 enough-0.015
bar-0.015
 mate-0.014
Token,
Token ID11
Feature activation+1.92e-02

Pos logit contributions
we+0.053
bes+0.053
kins+0.047
irling+0.046
 wiping+0.044
Neg logit contributions
ah-0.038
 One-0.034
 enough-0.034
bar-0.034
 mate-0.034
Token rubbing
Token ID27714
Feature activation+3.58e-02

Pos logit contributions
bes+0.032
we+0.031
kins+0.029
irling+0.027
 complexion+0.024
Neg logit contributions
ah-0.034
 One-0.032
 mate-0.031
 enough-0.031
bar-0.031
Token
Token ID198
Feature activation+1.10e-02

Pos logit contributions
ah+0.000
 One+0.000
ff+0.000
 mate+0.000
bar+0.000
Neg logit contributions
bes-0.000
we-0.000
kins-0.000
irling-0.000
 wiping-0.000
TokenThe
Token ID464
Feature activation-3.85e-04

Pos logit contributions
bes+0.022
we+0.021
kins+0.019
 wiping+0.018
 wrapping+0.018
Neg logit contributions
ah-0.016
 One-0.015
ff-0.015
bar-0.015
 mate-0.015
Token sisters
Token ID15153
Feature activation+3.31e-02

Pos logit contributions
ah+0.001
 mate+0.001
 One+0.001
bar+0.001
ff+0.001
Neg logit contributions
bes-0.001
we-0.001
irling-0.001
kins-0.001
 wiping-0.001
Token spent
Token ID3377
Feature activation+2.40e-02

Pos logit contributions
bes+0.060
we+0.058
kins+0.053
irling+0.051
 wiping+0.051
Neg logit contributions
ah-0.054
 mate-0.050
ff-0.049
 One-0.048
bar-0.048
Token the
Token ID262
Feature activation+2.48e-02

Pos logit contributions
we+0.060
bes+0.057
kins+0.054
ini+0.052
irling+0.052
Neg logit contributions
 enough-0.022
 One-0.021
ah-0.020
 mate-0.017
ow-0.016
Token day
Token ID1110
Feature activation+7.90e-02

Pos logit contributions
bes+0.049
we+0.048
kins+0.048
irling+0.045
 wiping+0.043
Neg logit contributions
ah-0.034
 One-0.030
 mate-0.029
bar-0.028
 go-0.027
Token playing
Token ID2712
Feature activation+4.53e-02

Pos logit contributions
bes+0.105
we+0.091
kins+0.084
iflower+0.075
keeping+0.075
Neg logit contributions
 mate-0.156
bar-0.151
 enough-0.149
ow-0.149
ah-0.146
Token in
Token ID287
Feature activation+1.71e-02

Pos logit contributions
bes+0.067
we+0.067
irling+0.059
kins+0.057
 wiping+0.054
Neg logit contributions
ah-0.081
 mate-0.080
 One-0.078
 enough-0.075
ff-0.074
Token the
Token ID262
Feature activation+1.30e-02

Pos logit contributions
we+0.044
bes+0.043
kins+0.040
irling+0.040
iflower+0.039
Neg logit contributions
 One-0.015
 mate-0.013
 enough-0.012
ah-0.011
 leave-0.011
Token closet
Token ID21615
Feature activation-1.08e-02

Pos logit contributions
bes+0.020
we+0.019
kins+0.017
irling+0.016
 wiping+0.016
Neg logit contributions
ah-0.025
 mate-0.024
 One-0.023
ff-0.023
bar-0.023
Token.
Token ID13
Feature activation+9.76e-03

Pos logit contributions
ah+0.019
ff+0.018
 mate+0.017
 One+0.017
bar+0.017
Neg logit contributions
bes-0.018
we-0.018
kins-0.016
 wearing-0.016
 wiping-0.015
Token
Token ID198
Feature activation+6.25e-03

Pos logit contributions
bes+0.008
we+0.007
 wearing+0.007
 wiping+0.006
 wrapping+0.006
Neg logit contributions
ah-0.007
 mate-0.007
ff-0.006
bar-0.006
ow-0.006
Token
Token ID198
Feature activation+1.73e-02

Pos logit contributions
bes+0.013
we+0.012
kins+0.011
irling+0.011
 wiping+0.011
Neg logit contributions
ah-0.009
 mate-0.008
 One-0.008
ff-0.008
bar-0.008
TokenThey
Token ID2990
Feature activation-3.89e-03

Pos logit contributions
bes+0.029
we+0.028
kins+0.024
irling+0.024
 wiping+0.023
Neg logit contributions
ah-0.031
 mate-0.029
ff-0.029
 One-0.028
bar-0.028
Token spent
Token ID3377
Feature activation+2.15e-02

Pos logit contributions
ah+0.007
ff+0.007
bar+0.007
 mate+0.006
upid+0.006
Neg logit contributions
bes-0.006
kins-0.005
we-0.005
 wearing-0.005
irling-0.005
Token the
Token ID262
Feature activation+1.66e-02

Pos logit contributions
we+0.052
bes+0.049
kins+0.047
 wiping+0.045
ini+0.044
Neg logit contributions
ah-0.023
 enough-0.022
 One-0.021
 mate-0.019
ow-0.017
Token day
Token ID1110
Feature activation+7.88e-02

Pos logit contributions
bes+0.036
kins+0.036
we+0.035
irling+0.032
 wiping+0.032
Neg logit contributions
ah-0.022
 One-0.017
 go-0.016
 mate-0.016
ny-0.016
Token talking
Token ID3375
Feature activation+8.81e-03

Pos logit contributions
bes+0.122
we+0.106
kins+0.097
keeping+0.095
iflower+0.092
Neg logit contributions
 mate-0.136
ow-0.131
bar-0.131
 enough-0.128
ah-0.127
Token and
Token ID290
Feature activation+1.02e-02

Pos logit contributions
bes+0.014
we+0.014
kins+0.012
 wearing+0.012
irling+0.012
Neg logit contributions
ah-0.016
 mate-0.015
ff-0.015
 One-0.014
bar-0.014
Token playing
Token ID2712
Feature activation+3.07e-02

Pos logit contributions
bes+0.013
we+0.012
kins+0.010
irling+0.010
web+0.008
Neg logit contributions
ah-0.022
 mate-0.021
 One-0.021
 enough-0.021
ow-0.020
Token together
Token ID1978
Feature activation+1.68e-02

Pos logit contributions
bes+0.051
we+0.050
irling+0.046
 wiping+0.044
kins+0.044
Neg logit contributions
ah-0.050
 mate-0.047
 One-0.047
 enough-0.044
ff-0.044
Token.
Token ID13
Feature activation+4.62e-03

Pos logit contributions
bes+0.031
we+0.031
kins+0.028
irling+0.027
 wearing+0.027
Neg logit contributions
ah-0.027
 mate-0.025
ff-0.025
bar-0.024
 One-0.024
Token baking
Token ID16871
Feature activation+1.18e-02

Pos logit contributions
bes+0.050
we+0.049
irling+0.044
keeping+0.042
 wiping+0.041
Neg logit contributions
 enough-0.035
ah-0.035
 One-0.035
 mate-0.033
 go-0.033
Token.
Token ID13
Feature activation+4.29e-03

Pos logit contributions
bes+0.019
we+0.019
kins+0.017
irling+0.016
 wiping+0.016
Neg logit contributions
ah-0.022
 mate-0.020
ff-0.020
 One-0.020
bar-0.019
Token She
Token ID1375
Feature activation+1.08e-02

Pos logit contributions
bes+0.009
we+0.008
 wearing+0.008
 wiping+0.007
kins+0.007
Neg logit contributions
ah-0.006
 mate-0.006
bar-0.006
ff-0.005
pole-0.005
Token spent
Token ID3377
Feature activation+2.48e-02

Pos logit contributions
bes+0.021
we+0.021
irling+0.018
kins+0.018
 wiping+0.018
Neg logit contributions
ah-0.016
 mate-0.015
ff-0.014
 One-0.014
bar-0.014
Token the
Token ID262
Feature activation+2.26e-02

Pos logit contributions
we+0.061
bes+0.060
ini+0.055
keeping+0.054
irling+0.054
Neg logit contributions
 enough-0.021
ah-0.020
 One-0.020
 nine-0.015
 mate-0.015
Token day
Token ID1110
Feature activation+7.82e-02

Pos logit contributions
bes+0.043
we+0.040
irling+0.038
kins+0.037
keeping+0.036
Neg logit contributions
ah-0.034
 One-0.028
 go-0.027
 mate-0.027
ow-0.027
Token writing
Token ID3597
Feature activation+3.03e-02

Pos logit contributions
bes+0.105
we+0.087
keeping+0.082
kins+0.076
irling+0.074
Neg logit contributions
ah-0.144
 mate-0.143
 enough-0.142
ow-0.141
bar-0.139
Token a
Token ID257
Feature activation+1.93e-02

Pos logit contributions
bes+0.051
we+0.050
kins+0.044
irling+0.043
 wiping+0.042
Neg logit contributions
ah-0.053
 mate-0.049
ff-0.049
 One-0.048
bar-0.047
Token reliable
Token ID9314
Feature activation+1.88e-02

Pos logit contributions
bes+0.035
we+0.034
kins+0.029
irling+0.029
 wiping+0.028
Neg logit contributions
ah-0.032
 mate-0.030
 One-0.029
ff-0.029
bar-0.027
Token poem
Token ID21247
Feature activation+5.80e-03

Pos logit contributions
bes+0.033
we+0.033
irling+0.029
kins+0.029
 wiping+0.027
Neg logit contributions
ah-0.032
 mate-0.029
 One-0.028
ff-0.028
 enough-0.027
Token about
Token ID546
Feature activation+1.32e-02

Pos logit contributions
bes+0.011
we+0.010
kins+0.009
irling+0.009
 wiping+0.009
Neg logit contributions
ah-0.009
 mate-0.009
ff-0.009
 One-0.009
bar-0.008
Token.
Token ID13
Feature activation+1.25e-02

Pos logit contributions
bes+0.016
we+0.015
irling+0.014
kins+0.014
 wiping+0.013
Neg logit contributions
ah-0.009
 One-0.008
 mate-0.008
ff-0.008
 enough-0.008
Token She
Token ID1375
Feature activation+1.11e-02

Pos logit contributions
bes+0.026
we+0.025
 wiping+0.023
 wearing+0.023
kins+0.022
Neg logit contributions
ah-0.018
 mate-0.017
bar-0.015
ff-0.015
 enough-0.014
Token would
Token ID561
Feature activation+2.20e-02

Pos logit contributions
bes+0.022
we+0.021
kins+0.019
irling+0.019
 wiping+0.018
Neg logit contributions
ah-0.017
 mate-0.015
ff-0.015
 One-0.015
bar-0.015
Token spend
Token ID4341
Feature activation+2.18e-02

Pos logit contributions
bes+0.041
we+0.039
kins+0.036
 wiping+0.036
irling+0.035
Neg logit contributions
ah-0.035
ff-0.032
 mate-0.031
bar-0.031
 One-0.030
Token the
Token ID262
Feature activation+3.01e-02

Pos logit contributions
we+0.048
bes+0.046
kins+0.043
irling+0.042
 wiping+0.041
Neg logit contributions
ah-0.027
 One-0.026
 enough-0.025
 mate-0.023
ff-0.022
Token day
Token ID1110
Feature activation+7.81e-02

Pos logit contributions
bes+0.058
we+0.057
kins+0.051
irling+0.051
 wiping+0.049
Neg logit contributions
ah-0.047
 One-0.042
 mate-0.041
ff-0.040
bar-0.039
Token eating
Token ID6600
Feature activation+3.28e-02

Pos logit contributions
bes+0.121
we+0.112
kins+0.100
irling+0.096
keeping+0.090
Neg logit contributions
 mate-0.143
ah-0.142
bar-0.137
ow-0.137
ff-0.133
Token and
Token ID290
Feature activation+9.71e-03

Pos logit contributions
bes+0.066
we+0.064
kins+0.058
irling+0.057
 wiping+0.056
Neg logit contributions
ah-0.048
 mate-0.044
ff-0.043
 One-0.042
bar-0.042
Token resting
Token ID19186
Feature activation+3.51e-02

Pos logit contributions
bes+0.013
we+0.011
kins+0.011
irling+0.010
ini+0.010
Neg logit contributions
ah-0.019
bar-0.019
 One-0.019
 enough-0.019
 mate-0.019
Token -
Token ID532
Feature activation+1.80e-02

Pos logit contributions
bes+0.065
we+0.063
irling+0.057
kins+0.055
 wiping+0.054
Neg logit contributions
ah-0.056
 One-0.051
 mate-0.051
ff-0.050
bar-0.049
Token that
Token ID326
Feature activation+1.98e-03

Pos logit contributions
bes+0.035
we+0.033
kins+0.031
irling+0.030
 wiping+0.028
Neg logit contributions
ah-0.028
 mate-0.026
 One-0.026
bar-0.025
ff-0.025
Token
Token ID198
Feature activation+7.05e-03

Pos logit contributions
ah+0.007
 mate+0.006
ff+0.006
bar+0.006
ow+0.006
Neg logit contributions
bes-0.009
we-0.009
 wearing-0.008
kins-0.008
 wiping-0.007
Token
Token ID198
Feature activation+1.59e-02

Pos logit contributions
bes+0.014
we+0.014
kins+0.013
irling+0.012
 wiping+0.012
Neg logit contributions
ah-0.010
 mate-0.009
ff-0.009
 One-0.009
bar-0.009
TokenThey
Token ID2990
Feature activation-1.57e-03

Pos logit contributions
bes+0.026
we+0.026
kins+0.023
irling+0.022
 wiping+0.021
Neg logit contributions
ah-0.029
 mate-0.027
ff-0.026
 One-0.026
bar-0.026
Token spent
Token ID3377
Feature activation+2.34e-02

Pos logit contributions
ah+0.003
ff+0.003
 One+0.003
bar+0.003
ow+0.002
Neg logit contributions
bes-0.002
we-0.002
kins-0.002
 wearing-0.002
irling-0.002
Token the
Token ID262
Feature activation+1.74e-02

Pos logit contributions
we+0.054
bes+0.053
kins+0.049
 wiping+0.048
ini+0.047
Neg logit contributions
ah-0.026
 enough-0.024
 One-0.023
 mate-0.023
ow-0.020
Token day
Token ID1110
Feature activation+7.81e-02

Pos logit contributions
bes+0.035
we+0.033
kins+0.032
 wiping+0.030
 complexion+0.030
Neg logit contributions
ah-0.026
 mate-0.023
bar-0.022
 One-0.021
ff-0.021
Token exploring
Token ID13504
Feature activation+2.89e-02

Pos logit contributions
bes+0.118
we+0.100
kins+0.091
keeping+0.089
web+0.085
Neg logit contributions
 mate-0.146
bar-0.141
ah-0.140
ow-0.137
 enough-0.136
Token the
Token ID262
Feature activation+1.40e-02

Pos logit contributions
bes+0.070
we+0.066
irling+0.060
kins+0.059
 wiping+0.059
Neg logit contributions
ah-0.030
 One-0.028
 mate-0.028
 enough-0.028
ff-0.025
Token lake
Token ID13546
Feature activation+3.58e-03

Pos logit contributions
bes+0.031
we+0.029
kins+0.027
 wiping+0.026
irling+0.026
Neg logit contributions
ah-0.019
 mate-0.017
 One-0.017
ff-0.016
bar-0.016
Token together
Token ID1978
Feature activation+1.04e-02

Pos logit contributions
bes+0.006
we+0.006
kins+0.006
irling+0.005
 wiping+0.005
Neg logit contributions
ah-0.006
 mate-0.006
ff-0.006
 One-0.005
bar-0.005
Token.
Token ID13
Feature activation+6.87e-03

Pos logit contributions
bes+0.019
we+0.019
kins+0.017
irling+0.016
 wearing+0.016
Neg logit contributions
ah-0.017
 mate-0.015
ff-0.015
bar-0.015
 One-0.015
Token.
Token ID13
Feature activation-2.41e-04

Pos logit contributions
ah+0.001
 mate+0.001
ff+0.001
 One+0.001
bar+0.001
Neg logit contributions
bes-0.002
we-0.002
kins-0.002
irling-0.002
 wiping-0.002
Token She
Token ID1375
Feature activation+9.93e-03

Pos logit contributions
ah+0.000
 One+0.000
ff+0.000
 mate+0.000
ow+0.000
Neg logit contributions
bes-0.001
we-0.001
kins-0.001
irling-0.000
 wiping-0.000
Token often
Token ID1690
Feature activation+1.26e-02

Pos logit contributions
bes+0.021
keeping+0.021
kins+0.020
we+0.020
irling+0.019
Neg logit contributions
ah-0.013
 One-0.012
 go-0.011
ny-0.010
 come-0.010
Token spent
Token ID3377
Feature activation+2.94e-02

Pos logit contributions
bes+0.025
we+0.025
kins+0.022
irling+0.022
 wiping+0.021
Neg logit contributions
ah-0.020
 mate-0.017
 One-0.017
ff-0.017
ow-0.016
Token her
Token ID607
Feature activation+5.28e-02

Pos logit contributions
we+0.072
ini+0.066
kins+0.066
bes+0.065
fer+0.064
Neg logit contributions
 One-0.026
 enough-0.023
ah-0.021
ossible-0.020
 go-0.019
Token day
Token ID1110
Feature activation+7.80e-02

Pos logit contributions
we+0.109
irling+0.093
bes+0.092
kins+0.092
bing+0.091
Neg logit contributions
ah-0.075
 One-0.068
 mate-0.068
 go-0.060
ny-0.057
Token playing
Token ID2712
Feature activation+4.09e-02

Pos logit contributions
we+0.105
bes+0.104
keeping+0.090
kins+0.083
ini+0.077
Neg logit contributions
 mate-0.147
bar-0.139
ow-0.136
ah-0.134
 go-0.134
Token in
Token ID287
Feature activation+2.17e-02

Pos logit contributions
we+0.070
bes+0.068
keeping+0.065
kins+0.063
irling+0.063
Neg logit contributions
 One-0.061
 mate-0.059
ah-0.058
 go-0.054
 him-0.053
Token the
Token ID262
Feature activation+1.15e-02

Pos logit contributions
we+0.057
bes+0.056
kins+0.055
uddled+0.052
irling+0.052
Neg logit contributions
 One-0.019
 mate-0.014
 go-0.013
 enough-0.012
ah-0.012
Token park
Token ID3952
Feature activation+7.58e-03

Pos logit contributions
bes+0.020
we+0.019
kins+0.017
irling+0.016
 wiping+0.016
Neg logit contributions
ah-0.020
 mate-0.019
ff-0.018
 One-0.018
bar-0.018
Token.
Token ID13
Feature activation+1.61e-02

Pos logit contributions
bes+0.012
we+0.012
kins+0.010
irling+0.010
 wiping+0.010
Neg logit contributions
ah-0.014
 One-0.014
 mate-0.013
ow-0.013
ff-0.012
Token is
Token ID318
Feature activation-1.21e-02

Pos logit contributions
ah+0.021
 mate+0.019
ff+0.019
bar+0.019
 One+0.018
Neg logit contributions
bes-0.031
we-0.031
kins-0.028
irling-0.028
 wearing-0.027
Token how
Token ID703
Feature activation+3.42e-03

Pos logit contributions
ah+0.022
ff+0.021
ale+0.019
 mate+0.019
bar+0.019
Neg logit contributions
we-0.019
 wearing-0.019
bes-0.019
 wiping-0.018
kins-0.017
Token Tim
Token ID5045
Feature activation+2.02e-02

Pos logit contributions
bes+0.006
we+0.006
kins+0.006
irling+0.006
 wiping+0.006
Neg logit contributions
ah-0.005
 mate-0.005
ff-0.005
 One-0.005
bar-0.005
Token spent
Token ID3377
Feature activation+9.10e-03

Pos logit contributions
bes+0.045
we+0.043
kins+0.040
irling+0.039
 wiping+0.038
Neg logit contributions
ah-0.026
ff-0.023
 mate-0.023
 One-0.022
bar-0.022
Token his
Token ID465
Feature activation+5.10e-02

Pos logit contributions
we+0.022
bes+0.021
irling+0.020
kins+0.020
 wiping+0.019
Neg logit contributions
ah-0.009
 One-0.008
 enough-0.008
 mate-0.007
 away-0.007
Token day
Token ID1110
Feature activation+7.77e-02

Pos logit contributions
we+0.101
bes+0.092
irling+0.085
kins+0.083
keeping+0.082
Neg logit contributions
 One-0.070
ah-0.069
 go-0.064
 end-0.062
ow-0.061
Token,
Token ID11
Feature activation+1.48e-02

Pos logit contributions
bes+0.117
we+0.116
kins+0.096
keeping+0.092
 complexion+0.089
Neg logit contributions
bar-0.128
 mate-0.127
ow-0.127
 enough-0.124
 go-0.123
Token sitting
Token ID5586
Feature activation+1.87e-02

Pos logit contributions
bes+0.025
we+0.025
kins+0.023
irling+0.021
lf+0.020
Neg logit contributions
ah-0.026
 mate-0.024
bar-0.024
 enough-0.024
 One-0.024
Token under
Token ID739
Feature activation-5.37e-04

Pos logit contributions
bes+0.032
we+0.031
irling+0.027
 wiping+0.026
kins+0.025
Neg logit contributions
ah-0.032
 mate-0.029
ff-0.029
 enough-0.029
 One-0.028
Token the
Token ID262
Feature activation-2.11e-02

Pos logit contributions
ah+0.001
 One+0.001
 mate+0.001
ff+0.001
bar+0.001
Neg logit contributions
bes-0.001
we-0.001
kins-0.001
irling-0.001
 wiping-0.001
Token tree
Token ID5509
Feature activation-1.53e-02

Pos logit contributions
ah+0.027
 One+0.027
 go+0.024
bar+0.024
 leave+0.024
Neg logit contributions
we-0.045
bes-0.044
kins-0.040
irling-0.039
keeping-0.036
Token back
Token ID736
Feature activation+1.02e-02

Pos logit contributions
we+0.020
bes+0.019
kins+0.018
 wearing+0.018
 wiping+0.018
Neg logit contributions
ah-0.020
bar-0.018
 mate-0.017
ff-0.017
 One-0.017
Token.
Token ID13
Feature activation+3.63e-03

Pos logit contributions
bes+0.022
we+0.022
kins+0.020
irling+0.019
 wiping+0.019
Neg logit contributions
ah-0.013
 mate-0.012
ff-0.012
 One-0.011
bar-0.011
Token They
Token ID1119
Feature activation-6.50e-05

Pos logit contributions
bes+0.007
we+0.007
 wearing+0.006
 wiping+0.006
 wrapping+0.006
Neg logit contributions
ah-0.006
 mate-0.005
ow-0.005
ff-0.005
bar-0.005
Token spent
Token ID3377
Feature activation+2.31e-02

Pos logit contributions
ah+0.000
ff+0.000
bar+0.000
 mate+0.000
 One+0.000
Neg logit contributions
bes-0.000
we-0.000
 wearing-0.000
kins-0.000
irling-0.000
Token the
Token ID262
Feature activation+1.84e-02

Pos logit contributions
we+0.058
bes+0.056
kins+0.051
 wiping+0.050
ini+0.049
Neg logit contributions
ah-0.023
 enough-0.022
 One-0.021
 mate-0.018
ow-0.016
Token day
Token ID1110
Feature activation+7.76e-02

Pos logit contributions
bes+0.039
we+0.039
kins+0.037
irling+0.035
 wiping+0.035
Neg logit contributions
ah-0.024
 One-0.020
 mate-0.020
ny-0.019
 go-0.019
Token exploring
Token ID13504
Feature activation+2.81e-02

Pos logit contributions
bes+0.119
we+0.104
kins+0.093
web+0.089
lf+0.089
Neg logit contributions
 mate-0.143
ow-0.134
bar-0.132
ah-0.131
 enough-0.130
Token together
Token ID1978
Feature activation+2.00e-02

Pos logit contributions
bes+0.066
we+0.063
irling+0.059
kins+0.057
 wiping+0.056
Neg logit contributions
 One-0.030
ah-0.030
 enough-0.028
 mate-0.028
ff-0.026
Token.
Token ID13
Feature activation+6.88e-03

Pos logit contributions
bes+0.038
we+0.038
kins+0.034
irling+0.033
 wearing+0.033
Neg logit contributions
ah-0.031
 mate-0.028
ff-0.028
bar-0.027
 One-0.027
Token They
Token ID1119
Feature activation+3.34e-03

Pos logit contributions
bes+0.013
we+0.013
 wearing+0.012
kins+0.011
 wiping+0.011
Neg logit contributions
ah-0.011
 mate-0.010
ff-0.009
bar-0.009
ow-0.009
Token had
Token ID550
Feature activation+9.01e-03

Pos logit contributions
bes+0.005
we+0.005
kins+0.005
 wearing+0.005
irling+0.005
Neg logit contributions
ah-0.006
ff-0.006
bar-0.006
 mate-0.005
 One-0.005
Token
Token ID198
Feature activation+2.76e-03

Pos logit contributions
ah+0.017
 One+0.017
ff+0.016
 mate+0.016
ow+0.015
Neg logit contributions
bes-0.032
we-0.031
kins-0.029
irling-0.028
 wiping-0.027
Token"
Token ID1
Feature activation+5.27e-03

Pos logit contributions
bes+0.006
we+0.006
kins+0.005
 wiping+0.005
irling+0.005
Neg logit contributions
ah-0.003
 One-0.003
 mate-0.003
ff-0.003
bar-0.003
TokenWHAT
Token ID32971
Feature activation-3.91e-02

Pos logit contributions
bes+0.010
we+0.009
irling+0.009
kins+0.009
 wearing+0.008
Neg logit contributions
ah-0.008
 mate-0.008
ff-0.008
 One-0.007
bar-0.007
Token ARE
Token ID15986
Feature activation-5.21e-02

Pos logit contributions
bar+0.062
bit+0.057
pole+0.057
ah+0.055
 mate+0.053
Neg logit contributions
bes-0.063
we-0.063
lf-0.062
 wearing-0.057
 reading-0.055
Token YOU
Token ID7013
Feature activation-2.64e-02

Pos logit contributions
ah+0.096
ff+0.087
bar+0.087
 mate+0.086
pole+0.084
Neg logit contributions
bes-0.081
we-0.072
 wearing-0.071
 wrapping-0.071
lf-0.070
Token DO
Token ID8410
Feature activation-8.75e-02

Pos logit contributions
bar+0.042
 mate+0.041
ah+0.038
ff+0.038
pole+0.038
Neg logit contributions
bes-0.046
lf-0.040
kins-0.040
we-0.040
 wearing-0.039
TokenING
Token ID2751
Feature activation-2.77e-02

Pos logit contributions
bar+0.154
ah+0.154
 mate+0.143
ff+0.136
pole+0.136
Neg logit contributions
bes-0.146
irling-0.117
lf-0.112
 wrapping-0.111
we-0.110
Token?"
Token ID1701
Feature activation-1.18e-02

Pos logit contributions
ah+0.044
 mate+0.042
bar+0.042
 One+0.042
ff+0.042
Neg logit contributions
bes-0.049
we-0.048
kins-0.042
 wiping-0.041
 wearing-0.041
Token Mom
Token ID11254
Feature activation-5.32e-03

Pos logit contributions
ah+0.015
 mate+0.013
ff+0.013
 One+0.013
bar+0.012
Neg logit contributions
bes-0.026
we-0.026
kins-0.024
irling-0.023
 wiping-0.023
Token yells
Token ID42675
Feature activation-1.51e-02

Pos logit contributions
ah+0.006
 mate+0.006
bar+0.006
ff+0.005
 One+0.005
Neg logit contributions
we-0.012
bes-0.012
 wearing-0.011
 wiping-0.011
kins-0.011
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
ah+0.024
 mate+0.022
bar+0.022
ff+0.021
 One+0.020
Neg logit contributions
bes-0.028
we-0.028
kins-0.025
 wiping-0.025
 wearing-0.024
Token to
Token ID284
Feature activation+2.31e-03

Pos logit contributions
ah+0.000
 One+0.000
 mate+0.000
 enough+0.000
ff+0.000
Neg logit contributions
bes-0.000
we-0.000
kins-0.000
irling-0.000
lf-0.000
Token drink
Token ID4144
Feature activation+5.35e-03

Pos logit contributions
bes+0.005
we+0.005
kins+0.004
irling+0.004
 wiping+0.004
Neg logit contributions
ah-0.003
ff-0.003
 mate-0.003
bar-0.003
 One-0.003
Token it
Token ID340
Feature activation+7.30e-03

Pos logit contributions
bes+0.011
we+0.010
irling+0.009
kins+0.009
lf+0.009
Neg logit contributions
ah-0.008
 One-0.007
ff-0.007
 mate-0.007
 enough-0.007
Token to
Token ID284
Feature activation-1.18e-02

Pos logit contributions
bes+0.014
we+0.013
kins+0.012
 wiping+0.011
irling+0.011
Neg logit contributions
ah-0.012
 mate-0.011
ff-0.011
bar-0.010
 One-0.010
Token stay
Token ID2652
Feature activation+3.67e-03

Pos logit contributions
ah+0.016
 mate+0.015
ff+0.015
 One+0.014
bar+0.014
Neg logit contributions
bes-0.024
we-0.024
kins-0.022
irling-0.021
 wiping-0.021
Token hyd
Token ID7409
Feature activation-8.41e-02

Pos logit contributions
we+0.007
bes+0.007
lf+0.006
irling+0.006
kins+0.006
Neg logit contributions
 enough-0.005
ah-0.005
ff-0.005
 One-0.005
 mate-0.005
Tokenrated
Token ID4111
Feature activation-1.42e-02

Pos logit contributions
 mate+0.145
ah+0.134
bar+0.124
 One+0.122
 enough+0.119
Neg logit contributions
we-0.144
bes-0.138
kins-0.125
bing-0.123
keeping-0.119
Token!"
Token ID2474
Feature activation-3.06e-03

Pos logit contributions
ah+0.020
 mate+0.018
ff+0.018
bar+0.017
 One+0.017
Neg logit contributions
bes-0.029
we-0.028
kins-0.026
irling-0.025
 wiping-0.025
Token Billy
Token ID15890
Feature activation+3.55e-03

Pos logit contributions
ah+0.004
 mate+0.004
ff+0.004
bar+0.003
 One+0.003
Neg logit contributions
bes-0.007
we-0.007
kins-0.006
 wiping-0.006
 wearing-0.006
Token drank
Token ID24070
Feature activation+2.11e-03

Pos logit contributions
bes+0.006
we+0.006
kins+0.005
irling+0.005
 wiping+0.005
Neg logit contributions
ah-0.006
 mate-0.006
ff-0.006
 One-0.006
bar-0.006
Token the
Token ID262
Feature activation+2.58e-02

Pos logit contributions
bes+0.005
we+0.004
kins+0.004
irling+0.004
 wearing+0.004
Neg logit contributions
ah-0.003
 One-0.003
ff-0.002
 mate-0.002
bar-0.002
Token me
Token ID502
Feature activation+2.82e-02

Pos logit contributions
ah+0.003
 mate+0.002
ff+0.002
 One+0.002
bar+0.002
Neg logit contributions
bes-0.006
we-0.006
kins-0.006
irling-0.005
 wiping-0.005
Token to
Token ID284
Feature activation-4.03e-03

Pos logit contributions
bes+0.055
we+0.053
kins+0.048
irling+0.047
 wiping+0.046
Neg logit contributions
ah-0.043
ff-0.039
 mate-0.039
 One-0.038
bar-0.037
Token the
Token ID262
Feature activation-1.86e-02

Pos logit contributions
ah+0.007
ff+0.006
bar+0.006
 mate+0.006
pole+0.005
Neg logit contributions
bes-0.007
we-0.007
 wearing-0.007
 wiping-0.007
 wrapping-0.006
Token nearest
Token ID16936
Feature activation-2.60e-02

Pos logit contributions
ah+0.030
 mate+0.028
 One+0.027
ff+0.027
bar+0.026
Neg logit contributions
bes-0.035
we-0.034
kins-0.031
irling-0.030
 wiping-0.029
Token fire
Token ID2046
Feature activation-3.50e-02

Pos logit contributions
ah+0.050
ff+0.048
ow+0.045
bar+0.044
 mate+0.044
Neg logit contributions
bes-0.036
we-0.034
 wearing-0.033
lf-0.032
 wiping-0.031
Token hyd
Token ID7409
Feature activation-8.34e-02

Pos logit contributions
ah+0.062
ff+0.057
 mate+0.056
bar+0.054
 One+0.054
Neg logit contributions
we-0.055
bes-0.055
 wearing-0.051
kins-0.050
 wiping-0.047
Tokenrant
Token ID5250
Feature activation-2.01e-02

Pos logit contributions
 mate+0.130
ah+0.122
bar+0.116
 enough+0.113
ff+0.111
Neg logit contributions
we-0.149
bes-0.148
kins-0.130
bing-0.126
irling-0.125
Token?"
Token ID1701
Feature activation+4.82e-03

Pos logit contributions
ah+0.034
 mate+0.031
ff+0.030
bar+0.029
 One+0.028
Neg logit contributions
bes-0.035
we-0.034
kins-0.031
 wearing-0.031
 wiping-0.030
Token
Token ID220
Feature activation-2.91e-04

Pos logit contributions
bes+0.010
we+0.009
kins+0.009
irling+0.008
 wiping+0.008
Neg logit contributions
ah-0.007
 mate-0.006
ff-0.006
 One-0.006
bar-0.006
Token
Token ID198
Feature activation-4.01e-03

Pos logit contributions
ah+0.001
 mate+0.001
ff+0.001
bar+0.001
ale+0.001
Neg logit contributions
bes-0.000
we-0.000
kins-0.000
 wearing-0.000
 wiping-0.000
Token
Token ID198
Feature activation+8.02e-03

Pos logit contributions
ah+0.007
 mate+0.006
ff+0.006
bar+0.006
ow+0.006
Neg logit contributions
bes-0.007
we-0.007
kins-0.006
 wearing-0.006
irling-0.006
Token The
Token ID383
Feature activation-1.98e-03

Pos logit contributions
ah+0.011
 mate+0.010
ff+0.009
ale+0.009
bar+0.009
Neg logit contributions
bes-0.014
we-0.013
 wearing-0.012
 wiping-0.012
kins-0.012
Token car
Token ID1097
Feature activation+1.34e-02

Pos logit contributions
ah+0.003
ff+0.003
ow+0.003
 One+0.003
 mate+0.003
Neg logit contributions
bes-0.004
we-0.004
kins-0.003
 wiping-0.003
 wearing-0.003
Token hits
Token ID7127
Feature activation+7.49e-03

Pos logit contributions
bes+0.023
we+0.022
 wearing+0.019
kins+0.019
 wrapping+0.019
Neg logit contributions
ah-0.023
ff-0.022
 mate-0.022
ait-0.021
ale-0.020
Token a
Token ID257
Feature activation-1.67e-02

Pos logit contributions
we+0.012
bes+0.011
 wearing+0.010
 wiping+0.010
 wrapping+0.010
Neg logit contributions
 mate-0.014
ah-0.013
ff-0.013
bar-0.012
pole-0.012
Token fire
Token ID2046
Feature activation-3.85e-02

Pos logit contributions
ah+0.031
ff+0.030
ow+0.029
 mate+0.029
 One+0.028
Neg logit contributions
bes-0.025
we-0.024
 wiping-0.022
kins-0.022
lf-0.022
Token hyd
Token ID7409
Feature activation-8.09e-02

Pos logit contributions
ah+0.069
ff+0.068
 One+0.067
aus+0.067
ait+0.065
Neg logit contributions
we-0.058
bes-0.053
 wiping-0.048
 wrapping-0.048
web-0.048
Tokenrant
Token ID5250
Feature activation-1.25e-02

Pos logit contributions
 mate+0.131
ah+0.117
 One+0.115
ait+0.111
bar+0.106
Neg logit contributions
we-0.145
bes-0.135
kins-0.128
bing-0.118
aged-0.112
Token.
Token ID13
Feature activation-7.11e-03

Pos logit contributions
ah+0.022
 mate+0.021
ff+0.020
 One+0.020
bar+0.019
Neg logit contributions
bes-0.021
we-0.021
kins-0.018
 wiping-0.018
 wearing-0.018
Token The
Token ID383
Feature activation-4.09e-03

Pos logit contributions
ah+0.010
 mate+0.009
ff+0.009
bar+0.009
ale+0.008
Neg logit contributions
bes-0.015
we-0.014
 wiping-0.013
 wearing-0.013
kins-0.013
Token fire
Token ID2046
Feature activation-1.41e-03

Pos logit contributions
ah+0.006
ff+0.005
 mate+0.005
 One+0.005
bar+0.005
Neg logit contributions
bes-0.008
we-0.008
kins-0.007
 wiping-0.007
irling-0.007
Token hyd
Token ID7409
Feature activation-6.54e-02

Pos logit contributions
ah+0.002
 One+0.002
 mate+0.002
ff+0.002
bar+0.002
Neg logit contributions
we-0.003
bes-0.003
kins-0.002
 wiping-0.002
 wearing-0.002
Token spray
Token ID11662
Feature activation-2.75e-03

Pos logit contributions
bes+0.022
we+0.022
kins+0.020
irling+0.019
 wiping+0.019
Neg logit contributions
ah-0.020
 mate-0.019
ff-0.018
 One-0.018
bar-0.018
Token from
Token ID422
Feature activation-2.91e-03

Pos logit contributions
ah+0.005
 mate+0.004
ff+0.004
 One+0.004
bar+0.004
Neg logit contributions
bes-0.005
we-0.005
kins-0.004
 wiping-0.004
irling-0.004
Token a
Token ID257
Feature activation-6.76e-03

Pos logit contributions
ah+0.004
 One+0.003
 mate+0.003
ff+0.003
bar+0.003
Neg logit contributions
bes-0.006
we-0.006
kins-0.006
irling-0.006
 wiping-0.005
Token nearby
Token ID6716
Feature activation-1.74e-02

Pos logit contributions
ah+0.012
 mate+0.011
ff+0.011
 One+0.011
bar+0.010
Neg logit contributions
bes-0.012
we-0.011
kins-0.010
irling-0.010
 wiping-0.010
Token fire
Token ID2046
Feature activation-3.30e-02

Pos logit contributions
ah+0.031
ff+0.029
 mate+0.029
ow+0.028
bar+0.028
Neg logit contributions
bes-0.027
we-0.026
 wearing-0.024
 wiping-0.024
lf-0.023
Token hyd
Token ID7409
Feature activation-8.00e-02

Pos logit contributions
ah+0.054
ff+0.051
 One+0.050
 mate+0.049
bar+0.047
Neg logit contributions
we-0.059
bes-0.057
kins-0.050
 wearing-0.050
 wiping-0.048
Tokenrant
Token ID5250
Feature activation-6.60e-03

Pos logit contributions
 mate+0.130
ah+0.126
bar+0.117
 One+0.113
 please+0.112
Neg logit contributions
we-0.143
bes-0.141
kins-0.122
irling-0.116
keeping-0.116
Token to
Token ID284
Feature activation-1.03e-02

Pos logit contributions
ah+0.011
 mate+0.010
ff+0.010
 One+0.009
bar+0.009
Neg logit contributions
bes-0.012
we-0.012
kins-0.011
irling-0.010
 wiping-0.010
Token keep
Token ID1394
Feature activation+2.24e-02

Pos logit contributions
ah+0.016
ff+0.014
 mate+0.014
bar+0.014
 One+0.014
Neg logit contributions
bes-0.020
we-0.019
 wiping-0.017
kins-0.017
 wearing-0.017
Token her
Token ID607
Feature activation+2.25e-02

Pos logit contributions
bes+0.045
we+0.042
irling+0.040
kins+0.039
 complexion+0.039
Neg logit contributions
ah-0.028
 enough-0.028
ow-0.027
bar-0.027
 One-0.027
Token dry
Token ID5894
Feature activation-3.09e-03

Pos logit contributions
bes+0.046
we+0.045
irling+0.040
kins+0.040
lf+0.038
Neg logit contributions
ah-0.030
 mate-0.029
ff-0.028
 One-0.028
 enough-0.028
Token It
Token ID632
Feature activation-3.74e-03

Pos logit contributions
ah+0.032
 mate+0.029
ff+0.028
bar+0.028
 One+0.026
Neg logit contributions
bes-0.043
we-0.042
kins-0.038
 wiping-0.037
irling-0.037
Token's
Token ID338
Feature activation-2.31e-02

Pos logit contributions
ah+0.005
 mate+0.005
ff+0.005
bar+0.004
 One+0.004
Neg logit contributions
bes-0.008
we-0.008
kins-0.007
 wiping-0.007
 wearing-0.007
Token important
Token ID1593
Feature activation+1.23e-02

Pos logit contributions
ah+0.036
 mate+0.033
ff+0.032
 One+0.032
bar+0.032
Neg logit contributions
bes-0.044
we-0.043
kins-0.039
irling-0.038
 wiping-0.037
Token to
Token ID284
Feature activation-1.15e-02

Pos logit contributions
bes+0.029
we+0.028
irling+0.027
kins+0.026
lf+0.025
Neg logit contributions
ah-0.013
 mate-0.012
 One-0.012
 enough-0.012
ow-0.012
Token stay
Token ID2652
Feature activation+7.59e-03

Pos logit contributions
ah+0.014
 mate+0.013
ff+0.013
 One+0.012
bar+0.012
Neg logit contributions
bes-0.025
we-0.025
kins-0.023
irling-0.022
 wiping-0.022
Token hyd
Token ID7409
Feature activation-7.64e-02

Pos logit contributions
we+0.015
bes+0.014
lf+0.013
irling+0.012
kins+0.012
Neg logit contributions
 enough-0.011
 mate-0.010
ow-0.010
 away-0.010
 leave-0.010
Tokenrated
Token ID4111
Feature activation-1.08e-02

Pos logit contributions
 mate+0.118
ah+0.112
bar+0.106
 One+0.101
ff+0.096
Neg logit contributions
we-0.143
bes-0.140
kins-0.127
irling-0.117
bing-0.117
Token and
Token ID290
Feature activation-8.15e-03

Pos logit contributions
ah+0.015
 mate+0.014
ff+0.014
 One+0.013
bar+0.013
Neg logit contributions
bes-0.022
we-0.022
kins-0.020
irling-0.020
 wiping-0.019
Token healthy
Token ID5448
Feature activation-3.02e-03

Pos logit contributions
ah+0.010
 One+0.010
 mate+0.010
bar+0.009
ff+0.009
Neg logit contributions
bes-0.018
we-0.017
kins-0.016
irling-0.015
lf-0.015
Token,"
Token ID553
Feature activation-1.79e-02

Pos logit contributions
ah+0.004
 mate+0.003
 enough+0.003
ff+0.003
 One+0.003
Neg logit contributions
bes-0.006
we-0.006
irling-0.006
kins-0.006
 wiping-0.006
Token mom
Token ID1995
Feature activation-1.25e-02

Pos logit contributions
ah+0.030
 mate+0.028
ff+0.028
bar+0.027
 One+0.026
Neg logit contributions
bes-0.032
we-0.032
kins-0.028
 wiping-0.028
irling-0.027
Token relax
Token ID8960
Feature activation+1.25e-02

Pos logit contributions
ah+0.021
 mate+0.020
 One+0.019
ff+0.019
bar+0.018
Neg logit contributions
bes-0.033
we-0.032
kins-0.030
irling-0.029
 wiping-0.028
Token together
Token ID1978
Feature activation+1.11e-03

Pos logit contributions
bes+0.024
we+0.023
irling+0.021
kins+0.021
 wiping+0.020
Neg logit contributions
ah-0.019
 mate-0.017
 One-0.017
ff-0.017
 enough-0.016
Token.
Token ID13
Feature activation-2.89e-02

Pos logit contributions
we+0.002
bes+0.002
kins+0.002
 wearing+0.002
irling+0.002
Neg logit contributions
ah-0.002
ff-0.002
bar-0.002
 mate-0.002
ow-0.001
Token How
Token ID1374
Feature activation-4.47e-02

Pos logit contributions
ah+0.049
ff+0.044
bar+0.043
 mate+0.042
ale+0.042
Neg logit contributions
bes-0.051
we-0.050
 wearing-0.043
 wiping-0.043
kins-0.042
Token does
Token ID857
Feature activation-4.38e-02

Pos logit contributions
bar+0.071
ah+0.068
ff+0.065
 mate+0.063
pole+0.062
Neg logit contributions
we-0.071
bes-0.069
kins-0.068
 wearing-0.064
 wiping-0.062
Token that
Token ID326
Feature activation-7.58e-02

Pos logit contributions
ah+0.086
ff+0.082
bar+0.079
 mate+0.077
pole+0.075
Neg logit contributions
bes-0.067
we-0.061
kins-0.056
 wearing-0.056
irling-0.056
Token sound
Token ID2128
Feature activation-2.73e-02

Pos logit contributions
ff+0.176
rol+0.170
ait+0.169
 mate+0.168
bar+0.167
Neg logit contributions
we-0.074
lf-0.064
bes-0.058
 wearing-0.058
kins-0.054
Token?"
Token ID1701
Feature activation-2.35e-02

Pos logit contributions
ah+0.043
 mate+0.040
ff+0.039
bar+0.039
 One+0.038
Neg logit contributions
bes-0.051
we-0.050
kins-0.046
 wearing-0.044
irling-0.044
Token\n
Token ID198
Feature activation-2.71e-02

Pos logit contributions
ah+0.041
 mate+0.039
ff+0.038
bar+0.038
pole+0.036
Neg logit contributions
bes-0.041
we-0.039
 wearing-0.035
kins-0.034
 wiping-0.034
Token\n
Token ID198
Feature activation-1.44e-02

Pos logit contributions
ah+0.052
 mate+0.050
ff+0.049
bar+0.049
pole+0.047
Neg logit contributions
bes-0.042
we-0.041
kins-0.035
irling-0.034
 wearing-0.034
TokenRex
Token ID47389
Feature activation-3.61e-03

Pos logit contributions
ff+0.029
 mate+0.028
ah+0.028
pole+0.028
ow+0.028
Neg logit contributions
bes-0.021
we-0.020
irling-0.016
kins-0.016
lf-0.015
Token and
Token ID290
Feature activation-5.51e-03

Pos logit contributions
ah+0.007
 mate+0.006
pole+0.006
ff+0.006
ale+0.006
Neg logit contributions
bes-0.005
we-0.005
 wiping-0.005
 wrapping-0.005
 wearing-0.005
Token the
Token ID262
Feature activation+4.16e-05

Pos logit contributions
ah+0.010
ff+0.010
 mate+0.009
 One+0.009
pole+0.009
Neg logit contributions
we-0.009
bes-0.008
 wiping-0.008
 wearing-0.008
kins-0.007
Token lightning
Token ID14357
Feature activation+6.73e-03

Pos logit contributions
bes+0.000
we+0.000
kins+0.000
irling+0.000
 wiping+0.000
Neg logit contributions
ah-0.000
 mate-0.000
ff-0.000
 One-0.000
bar-0.000
Token struck
Token ID7425
Feature activation-1.26e-02

Pos logit contributions
we+0.012
bes+0.011
kins+0.010
 wiping+0.010
irling+0.010
Neg logit contributions
ah-0.012
 mate-0.011
ff-0.011
 One-0.011
ow-0.010
Token the
Token ID262
Feature activation-2.12e-02

Pos logit contributions
ah+0.022
 mate+0.022
ff+0.020
bar+0.020
pole+0.020
Neg logit contributions
we-0.021
bes-0.020
 wiping-0.018
 wearing-0.018
 wrapping-0.017
Token hyd
Token ID7409
Feature activation-7.54e-02

Pos logit contributions
ah+0.030
 mate+0.028
ff+0.027
 One+0.027
bar+0.026
Neg logit contributions
bes-0.044
we-0.042
irling-0.038
kins-0.038
 wearing-0.036
Tokenrant
Token ID5250
Feature activation-9.91e-03

Pos logit contributions
 mate+0.121
ah+0.117
 enough+0.108
bar+0.107
 One+0.106
Neg logit contributions
we-0.136
bes-0.130
kins-0.115
bing-0.109
irling-0.108
Token.
Token ID13
Feature activation+6.51e-03

Pos logit contributions
ah+0.018
 mate+0.016
ff+0.016
bar+0.015
 One+0.015
Neg logit contributions
we-0.017
bes-0.016
kins-0.014
 wiping-0.014
 wearing-0.014
Token Suddenly
Token ID24975
Feature activation+9.34e-03

Pos logit contributions
bes+0.011
we+0.011
 wiping+0.010
 wearing+0.010
 wrapping+0.009
Neg logit contributions
ah-0.011
 mate-0.011
ff-0.010
ale-0.010
 enough-0.010
Token water
Token ID1660
Feature activation-3.39e-03

Pos logit contributions
bes+0.015
we+0.015
kins+0.014
 wiping+0.014
 wearing+0.014
Neg logit contributions
ah-0.016
ff-0.015
 mate-0.014
bar-0.014
pole-0.014
Token was
Token ID373
Feature activation-3.03e-02

Pos logit contributions
ah+0.005
 mate+0.005
ff+0.005
 One+0.005
bar+0.004
Neg logit contributions
bes-0.007
we-0.007
kins-0.006
 wiping-0.006
irling-0.006
Token ice
Token ID4771
Feature activation+1.25e-02

Pos logit contributions
ah+0.018
 mate+0.016
ff+0.016
 One+0.016
bar+0.016
Neg logit contributions
bes-0.018
we-0.018
kins-0.016
irling-0.015
 wiping-0.015
Token cream
Token ID8566
Feature activation-1.56e-02

Pos logit contributions
bes+0.015
we+0.014
irling+0.012
kins+0.011
 wiping+0.011
Neg logit contributions
ah-0.029
 mate-0.027
ff-0.026
bar-0.026
 One-0.026
Token,
Token ID11
Feature activation-2.86e-02

Pos logit contributions
ah+0.027
ff+0.026
 mate+0.025
 One+0.025
upid+0.024
Neg logit contributions
we-0.026
bes-0.025
kins-0.024
 wearing-0.022
 wiping-0.021
Token if
Token ID611
Feature activation-2.26e-02

Pos logit contributions
ah+0.043
ff+0.039
 mate+0.037
 One+0.036
bar+0.036
Neg logit contributions
bes-0.057
we-0.055
kins-0.051
 wiping-0.049
 wearing-0.049
Token there
Token ID612
Feature activation-2.40e-02

Pos logit contributions
ah+0.039
 mate+0.037
ff+0.036
 One+0.036
bar+0.035
Neg logit contributions
bes-0.039
we-0.037
kins-0.033
irling-0.033
 wiping-0.032
Token's
Token ID338
Feature activation-7.53e-02

Pos logit contributions
ah+0.036
 mate+0.035
ff+0.033
bar+0.032
 One+0.030
Neg logit contributions
we-0.047
bes-0.045
kins-0.043
 wearing-0.040
 wiping-0.038
Token any
Token ID597
Feature activation-3.03e-02

Pos logit contributions
ah+0.132
ff+0.124
 mate+0.116
bar+0.112
ale+0.111
Neg logit contributions
bes-0.124
we-0.119
 wiping-0.113
kins-0.112
 wearing-0.108
Token left
Token ID1364
Feature activation-1.91e-02

Pos logit contributions
ah+0.049
ff+0.045
 mate+0.045
 One+0.044
bar+0.044
Neg logit contributions
bes-0.055
we-0.054
kins-0.049
 wiping-0.047
irling-0.047
Token."
Token ID526
Feature activation-1.98e-02

Pos logit contributions
ah+0.032
 mate+0.029
ff+0.028
bar+0.028
 One+0.028
Neg logit contributions
we-0.034
bes-0.033
kins-0.031
 wearing-0.029
 wiping-0.029
Token Lily
Token ID20037
Feature activation-6.18e-03

Pos logit contributions
ah+0.027
 One+0.025
 mate+0.025
ff+0.025
bar+0.024
Neg logit contributions
bes-0.041
we-0.040
kins-0.037
irling-0.036
 wiping-0.035
Token nodded
Token ID14464
Feature activation-1.59e-02

Pos logit contributions
ah+0.009
 mate+0.008
 One+0.008
ff+0.008
bar+0.008
Neg logit contributions
bes-0.012
we-0.012
irling-0.011
kins-0.011
 wiping-0.010
Token.
Token ID13
Feature activation-1.91e-02

Pos logit contributions
ah+0.036
 mate+0.033
ff+0.033
 One+0.032
bar+0.032
Neg logit contributions
bes-0.041
we-0.040
kins-0.036
irling-0.035
 wiping-0.035
Token It
Token ID632
Feature activation-1.88e-04

Pos logit contributions
ah+0.026
 mate+0.024
ff+0.024
 One+0.023
bar+0.023
Neg logit contributions
bes-0.040
we-0.039
kins-0.036
irling-0.035
 wiping-0.034
Token will
Token ID481
Feature activation-5.27e-04

Pos logit contributions
ah+0.000
 mate+0.000
ff+0.000
bar+0.000
 One+0.000
Neg logit contributions
bes-0.000
we-0.000
kins-0.000
 wiping-0.000
irling-0.000
Token keep
Token ID1394
Feature activation+8.97e-03

Pos logit contributions
ah+0.001
ff+0.001
 mate+0.001
bar+0.001
ow+0.001
Neg logit contributions
bes-0.001
we-0.001
kins-0.001
 wiping-0.001
 wearing-0.001
Token you
Token ID345
Feature activation+2.86e-02

Pos logit contributions
bes+0.019
we+0.018
irling+0.017
kins+0.017
lf+0.016
Neg logit contributions
ah-0.011
 mate-0.010
 One-0.010
ff-0.010
ow-0.010
Token hyd
Token ID7409
Feature activation-7.42e-02

Pos logit contributions
bes+0.052
we+0.052
irling+0.046
kins+0.044
 wiping+0.043
Neg logit contributions
ah-0.045
 mate-0.043
ff-0.042
 One-0.041
ow-0.040
Tokenrated
Token ID4111
Feature activation-1.36e-02

Pos logit contributions
 mate+0.114
ah+0.109
bar+0.102
 One+0.098
 enough+0.092
Neg logit contributions
we-0.138
bes-0.136
kins-0.126
irling-0.115
bing-0.114
Token.
Token ID13
Feature activation-2.16e-02

Pos logit contributions
ah+0.020
 mate+0.019
ff+0.018
bar+0.018
 One+0.017
Neg logit contributions
bes-0.027
we-0.026
kins-0.024
irling-0.023
 wiping-0.023
Token You
Token ID921
Feature activation-4.22e-03

Pos logit contributions
ah+0.032
 mate+0.030
ff+0.029
bar+0.028
 One+0.028
Neg logit contributions
bes-0.042
we-0.041
kins-0.037
irling-0.036
 wiping-0.036
Token have
Token ID423
Feature activation-2.66e-03

Pos logit contributions
ah+0.006
bar+0.006
ff+0.005
 mate+0.005
ale+0.005
Neg logit contributions
bes-0.008
kins-0.007
we-0.007
 wearing-0.007
 wiping-0.007
Token to
Token ID284
Feature activation-5.43e-03

Pos logit contributions
ah+0.004
 mate+0.004
ff+0.004
bar+0.004
 One+0.004
Neg logit contributions
bes-0.005
we-0.005
kins-0.004
 wiping-0.004
 wearing-0.004
Token rain
Token ID6290
Feature activation-1.19e-02

Pos logit contributions
ah+0.029
 mate+0.027
 One+0.026
ff+0.026
bar+0.026
Neg logit contributions
bes-0.028
we-0.027
kins-0.024
irling-0.024
 wiping-0.023
Token that
Token ID326
Feature activation-3.62e-02

Pos logit contributions
ah+0.017
 mate+0.016
 One+0.016
ow+0.015
ff+0.015
Neg logit contributions
bes-0.024
we-0.023
kins-0.021
irling-0.021
 wiping-0.020
Token way
Token ID835
Feature activation+1.83e-03

Pos logit contributions
ah+0.060
 mate+0.055
ff+0.055
bar+0.054
 One+0.052
Neg logit contributions
we-0.064
bes-0.064
kins-0.059
 wearing-0.057
 wiping-0.055
Token,
Token ID11
Feature activation-3.27e-02

Pos logit contributions
bes+0.004
we+0.004
kins+0.003
irling+0.003
 wearing+0.003
Neg logit contributions
ah-0.003
 mate-0.002
ff-0.002
bar-0.002
 One-0.002
Token Lily
Token ID20037
Feature activation-2.16e-02

Pos logit contributions
ah+0.046
ff+0.043
bar+0.041
 mate+0.039
 One+0.039
Neg logit contributions
bes-0.065
we-0.064
 wearing-0.059
kins-0.059
 wiping-0.058
Token and
Token ID290
Feature activation-7.37e-02

Pos logit contributions
ah+0.039
 mate+0.036
ff+0.036
 One+0.035
bar+0.035
Neg logit contributions
bes-0.036
we-0.035
kins-0.031
irling-0.030
 wearing-0.029
Token Sam
Token ID3409
Feature activation-3.51e-02

Pos logit contributions
rol+0.154
ah+0.145
bar+0.137
ff+0.136
pole+0.133
Neg logit contributions
 reading-0.092
 wearing-0.085
 drawing-0.081
bes-0.079
lf-0.075
Token?"
Token ID1701
Feature activation-1.33e-02

Pos logit contributions
ah+0.067
bar+0.064
ff+0.063
 mate+0.061
isc+0.057
Neg logit contributions
 wearing-0.049
bes-0.048
we-0.046
kins-0.045
 reading-0.044
Token\n
Token ID198
Feature activation-1.35e-02

Pos logit contributions
 One+0.016
ah+0.015
 mate+0.014
ff+0.013
ow+0.012
Neg logit contributions
bes-0.031
we-0.030
irling-0.028
kins-0.027
 wiping-0.026
Token\n
Token ID198
Feature activation-1.37e-03

Pos logit contributions
ah+0.020
 mate+0.018
ff+0.018
 One+0.018
bar+0.018
Neg logit contributions
bes-0.027
we-0.026
kins-0.024
irling-0.023
 wiping-0.023
TokenL
Token ID43
Feature activation-3.21e-02

Pos logit contributions
ah+0.002
 mate+0.002
 One+0.002
ff+0.002
bar+0.002
Neg logit contributions
bes-0.003
we-0.002
kins-0.002
irling-0.002
 wiping-0.002
Token will
Token ID481
Feature activation+1.03e-03

Pos logit contributions
bes+0.023
we+0.023
kins+0.020
irling+0.020
 wiping+0.020
Neg logit contributions
ah-0.019
 mate-0.018
ff-0.018
bar-0.017
 One-0.017
Token never
Token ID1239
Feature activation-2.31e-02

Pos logit contributions
bes+0.002
we+0.002
 wiping+0.002
 wearing+0.002
irling+0.002
Neg logit contributions
ah-0.002
ff-0.002
bar-0.002
 mate-0.002
ale-0.001
Token go
Token ID467
Feature activation-2.47e-02

Pos logit contributions
ah+0.033
ff+0.030
bar+0.030
 mate+0.029
 One+0.028
Neg logit contributions
bes-0.044
we-0.043
 wearing-0.040
 wiping-0.040
kins-0.039
Token near
Token ID1474
Feature activation-1.27e-02

Pos logit contributions
ah+0.037
 mate+0.035
 One+0.034
ff+0.034
bar+0.033
Neg logit contributions
bes-0.048
we-0.047
kins-0.043
irling-0.042
 wiping-0.040
Token a
Token ID257
Feature activation-3.23e-02

Pos logit contributions
ah+0.019
 mate+0.018
ff+0.017
 One+0.017
bar+0.017
Neg logit contributions
bes-0.025
we-0.024
kins-0.022
irling-0.022
 wiping-0.021
Token crocod
Token ID37565
Feature activation-7.33e-02

Pos logit contributions
ah+0.057
ff+0.054
ow+0.054
 mate+0.051
bar+0.050
Neg logit contributions
we-0.050
bes-0.048
 wearing-0.044
kins-0.044
 sudden-0.043
Tokenile
Token ID576
Feature activation-4.76e-02

Pos logit contributions
ff+0.097
ah+0.097
 One+0.096
 enough+0.093
 mate+0.093
Neg logit contributions
we-0.141
bes-0.130
kins-0.122
web-0.117
irling-0.115
Token again
Token ID757
Feature activation+9.49e-04

Pos logit contributions
ah+0.085
ff+0.081
 mate+0.076
upid+0.074
bar+0.074
Neg logit contributions
we-0.078
bes-0.075
 wearing-0.066
 wiping-0.064
kins-0.062
Token."
Token ID526
Feature activation-2.06e-02

Pos logit contributions
bes+0.002
we+0.002
kins+0.002
irling+0.002
 wiping+0.002
Neg logit contributions
ah-0.001
 mate-0.001
ff-0.001
bar-0.001
 One-0.001
Token Lily
Token ID20037
Feature activation-6.84e-03

Pos logit contributions
ah+0.028
 One+0.026
 mate+0.026
ff+0.025
bar+0.024
Neg logit contributions
bes-0.044
we-0.043
kins-0.039
irling-0.038
 wiping-0.037
Token said
Token ID531
Feature activation-1.06e-02

Pos logit contributions
ah+0.010
 mate+0.009
ff+0.009
bar+0.009
 One+0.009
Neg logit contributions
bes-0.014
we-0.013
kins-0.012
 wiping-0.012
 wearing-0.012
Token to
Token ID284
Feature activation-7.63e-03

Pos logit contributions
ah+0.016
ff+0.015
bar+0.014
 One+0.014
pole+0.014
Neg logit contributions
bes-0.014
we-0.014
 wearing-0.013
 wiping-0.012
 wrapping-0.012
Token go
Token ID467
Feature activation-1.64e-02

Pos logit contributions
ah+0.011
ff+0.010
bar+0.009
 mate+0.009
pole+0.008
Neg logit contributions
bes-0.016
we-0.014
 wearing-0.014
kins-0.014
 wiping-0.013
Token.
Token ID13
Feature activation-2.39e-02

Pos logit contributions
ah+0.025
ff+0.023
 mate+0.023
bar+0.021
 One+0.021
Neg logit contributions
bes-0.030
we-0.030
 wearing-0.028
kins-0.027
 wrapping-0.026
Token Nothing
Token ID10528
Feature activation-1.16e-02

Pos logit contributions
ah+0.042
 mate+0.035
ff+0.035
bar+0.035
ale+0.034
Neg logit contributions
bes-0.041
we-0.041
 wiping-0.036
 wearing-0.036
kins-0.034
Token can
Token ID460
Feature activation-3.80e-03

Pos logit contributions
ah+0.015
 mate+0.014
ff+0.013
 One+0.013
bar+0.013
Neg logit contributions
bes-0.025
we-0.025
kins-0.023
irling-0.022
 wiping-0.022
Token prevent
Token ID2948
Feature activation-7.19e-02

Pos logit contributions
ah+0.006
ff+0.006
 mate+0.005
bar+0.005
 enough+0.005
Neg logit contributions
bes-0.007
we-0.007
 wearing-0.006
kins-0.006
 wiping-0.006
Token it
Token ID340
Feature activation-1.62e-02

Pos logit contributions
ah+0.117
ff+0.103
 mate+0.102
bar+0.099
 One+0.097
Neg logit contributions
we-0.137
bes-0.129
 wearing-0.121
 wiping-0.119
 wrapping-0.117
Token.
Token ID13
Feature activation-2.70e-02

Pos logit contributions
ah+0.029
ff+0.026
 mate+0.026
bar+0.024
 One+0.024
Neg logit contributions
bes-0.027
we-0.027
 wearing-0.025
 wiping-0.024
kins-0.023
Token Now
Token ID2735
Feature activation+1.80e-03

Pos logit contributions
ah+0.045
 mate+0.039
ff+0.039
bar+0.038
ale+0.037
Neg logit contributions
bes-0.049
we-0.048
 wiping-0.042
 wearing-0.041
kins-0.041
Token,
Token ID11
Feature activation-9.43e-03

Pos logit contributions
bes+0.002
we+0.002
 wearing+0.002
kins+0.002
 wiping+0.002
Neg logit contributions
ah-0.004
ff-0.004
bar-0.003
ow-0.003
 mate-0.003
Token move
Token ID1445
Feature activation-1.13e-02

Pos logit contributions
ah+0.016
ff+0.015
 One+0.014
bar+0.014
 mate+0.013
Neg logit contributions
we-0.016
bes-0.016
 wearing-0.015
 wiping-0.014
kins-0.014
Token\n
Token ID198
Feature activation-4.17e-03

Pos logit contributions
ah+0.035
 mate+0.032
ff+0.031
bar+0.031
 enough+0.029
Neg logit contributions
bes-0.037
we-0.036
kins-0.032
irling-0.032
 wiping-0.031
Token"
Token ID1
Feature activation+2.16e-03

Pos logit contributions
ah+0.007
 mate+0.006
ff+0.006
bar+0.006
ow+0.006
Neg logit contributions
bes-0.008
we-0.008
kins-0.007
irling-0.007
 wiping-0.006
TokenHey
Token ID10814
Feature activation+5.14e-03

Pos logit contributions
bes+0.005
 burying+0.004
kins+0.004
 wearing+0.004
irling+0.004
Neg logit contributions
ah-0.003
 One-0.002
ff-0.002
 mate-0.002
 please-0.002
Token,
Token ID11
Feature activation-2.11e-02

Pos logit contributions
bes+0.010
we+0.010
kins+0.009
irling+0.009
 wiping+0.009
Neg logit contributions
ah-0.007
ff-0.007
bar-0.007
 mate-0.007
 One-0.006
Token what
Token ID644
Feature activation-2.92e-02

Pos logit contributions
ah+0.034
ff+0.032
 One+0.030
bar+0.030
 mate+0.028
Neg logit contributions
we-0.038
bes-0.038
 wiping-0.035
 wearing-0.034
kins-0.033
Token's
Token ID338
Feature activation-7.18e-02

Pos logit contributions
ah+0.021
 mate+0.018
ff+0.017
 One+0.017
ow+0.015
Neg logit contributions
bes-0.081
we-0.079
irling-0.073
kins-0.073
 wiping-0.071
Token wrong
Token ID2642
Feature activation-1.64e-02

Pos logit contributions
ah+0.093
 mate+0.084
ff+0.082
 One+0.081
bar+0.080
Neg logit contributions
bes-0.156
we-0.153
kins-0.140
irling-0.137
 wiping-0.134
Token?"
Token ID1701
Feature activation-1.86e-02

Pos logit contributions
ah+0.014
 please+0.013
 One+0.013
 enough+0.013
 mate+0.013
Neg logit contributions
bes-0.043
we-0.041
irling-0.039
 sudden-0.037
kins-0.037
Token Lily
Token ID20037
Feature activation-3.70e-03

Pos logit contributions
ah+0.027
 mate+0.025
ff+0.024
 One+0.024
bar+0.024
Neg logit contributions
bes-0.037
we-0.036
kins-0.033
irling-0.032
 wiping-0.032
Token's
Token ID338
Feature activation-3.02e-02

Pos logit contributions
ah+0.006
 mate+0.005
ff+0.005
 One+0.005
bar+0.005
Neg logit contributions
bes-0.007
we-0.007
kins-0.006
irling-0.006
 wiping-0.006
Token mom
Token ID1995
Feature activation-1.21e-02

Pos logit contributions
ah+0.042
 mate+0.040
ff+0.038
ow+0.037
 One+0.037
Neg logit contributions
bes-0.064
we-0.061
kins-0.056
irling-0.055
 wiping-0.052
Token\n
Token ID198
Feature activation-9.62e-03

Pos logit contributions
ah+0.027
 One+0.025
 mate+0.025
ff+0.025
bar+0.024
Neg logit contributions
bes-0.042
we-0.041
kins-0.038
irling-0.037
 wiping-0.036
Token"
Token ID1
Feature activation+4.67e-03

Pos logit contributions
ah+0.015
 mate+0.014
ff+0.013
 One+0.013
bar+0.013
Neg logit contributions
bes-0.019
we-0.018
kins-0.016
irling-0.016
 wiping-0.016
TokenHey
Token ID10814
Feature activation-2.15e-03

Pos logit contributions
bes+0.009
we+0.008
kins+0.008
irling+0.008
 complexion+0.008
Neg logit contributions
ah-0.007
 mate-0.006
 One-0.006
bar-0.006
ff-0.006
Token,
Token ID11
Feature activation-3.09e-02

Pos logit contributions
ah+0.003
bar+0.003
ff+0.003
 One+0.003
 mate+0.003
Neg logit contributions
bes-0.004
we-0.004
kins-0.004
 wearing-0.003
 wiping-0.003
Token what
Token ID644
Feature activation-3.01e-02

Pos logit contributions
ah+0.047
ff+0.045
 One+0.043
bar+0.041
pole+0.038
Neg logit contributions
we-0.059
bes-0.057
 wiping-0.054
 wearing-0.053
kins-0.052
Token's
Token ID338
Feature activation-7.15e-02

Pos logit contributions
ah+0.027
 mate+0.023
ff+0.023
 One+0.022
bar+0.021
Neg logit contributions
bes-0.077
we-0.076
kins-0.071
irling-0.069
 wiping-0.068
Token going
Token ID1016
Feature activation-3.34e-02

Pos logit contributions
ah+0.113
 mate+0.104
ff+0.103
 One+0.101
bar+0.100
Neg logit contributions
bes-0.135
we-0.131
kins-0.119
irling-0.116
 wiping-0.114
Token on
Token ID319
Feature activation-1.94e-02

Pos logit contributions
ah+0.066
 mate+0.062
ff+0.062
 One+0.061
bar+0.060
Neg logit contributions
bes-0.049
we-0.047
kins-0.041
irling-0.040
 wiping-0.039
Token here
Token ID994
Feature activation-8.20e-03

Pos logit contributions
ah+0.025
 enough+0.024
 mate+0.023
 One+0.023
ow+0.023
Neg logit contributions
bes-0.042
we-0.040
kins-0.036
irling-0.035
 wiping-0.034
Token?"
Token ID1701
Feature activation-2.48e-02

Pos logit contributions
ah+0.011
 mate+0.010
ff+0.010
 One+0.010
bar+0.010
Neg logit contributions
bes-0.017
we-0.017
kins-0.015
irling-0.015
 wiping-0.015
Token a
Token ID257
Feature activation-3.53e-02

Pos logit contributions
ah+0.034
 mate+0.030
ff+0.030
bar+0.030
 One+0.028
Neg logit contributions
bes-0.052
we-0.051
kins-0.047
irling-0.046
 wearing-0.046
TokenTom
Token ID13787
Feature activation-1.15e-02

Pos logit contributions
bes+0.002
we+0.002
kins+0.002
irling+0.002
 wearing+0.002
Neg logit contributions
ah-0.002
 mate-0.002
 One-0.002
ff-0.002
bar-0.002
Token,
Token ID11
Feature activation-1.69e-02

Pos logit contributions
 mate+0.022
bar+0.021
ff+0.020
 One+0.020
ah+0.020
Neg logit contributions
bes-0.016
we-0.015
 wearing-0.015
 wiping-0.014
kins-0.014
Token Mia
Token ID32189
Feature activation-3.69e-02

Pos logit contributions
ah+0.026
ff+0.025
bar+0.023
 One+0.023
ale+0.022
Neg logit contributions
we-0.032
bes-0.031
 wiping-0.029
 wearing-0.028
kins-0.028
Token,
Token ID11
Feature activation-1.72e-02

Pos logit contributions
bar+0.068
 mate+0.067
ah+0.065
ff+0.065
 One+0.064
Neg logit contributions
bes-0.053
we-0.051
kins-0.048
 wiping-0.046
 wearing-0.045
Token what
Token ID644
Feature activation-2.90e-02

Pos logit contributions
ah+0.024
ff+0.022
bar+0.021
 One+0.021
pole+0.020
Neg logit contributions
we-0.035
bes-0.033
 wiping-0.032
 wearing-0.032
 wrapping-0.030
Token's
Token ID338
Feature activation-7.11e-02

Pos logit contributions
ah+0.036
 mate+0.033
ff+0.032
 One+0.031
bar+0.030
Neg logit contributions
bes-0.065
we-0.063
kins-0.058
irling-0.057
 wiping-0.056
Token wrong
Token ID2642
Feature activation-2.30e-02

Pos logit contributions
ah+0.110
ff+0.103
 mate+0.099
bar+0.098
 One+0.097
Neg logit contributions
bes-0.136
we-0.132
kins-0.121
irling-0.119
 wiping-0.118
Token?"
Token ID1701
Feature activation-2.34e-02

Pos logit contributions
ah+0.028
 mate+0.025
ff+0.025
ow+0.024
 One+0.024
Neg logit contributions
bes-0.053
we-0.050
irling-0.047
kins-0.046
 wiping-0.045
Token Mom
Token ID11254
Feature activation-1.44e-02

Pos logit contributions
ah+0.023
 One+0.022
 mate+0.021
ff+0.021
ow+0.019
Neg logit contributions
bes-0.058
we-0.056
kins-0.052
irling-0.051
 wiping-0.050
Token asks
Token ID7893
Feature activation-1.52e-02

Pos logit contributions
ah+0.017
 mate+0.016
bar+0.015
ff+0.015
 One+0.014
Neg logit contributions
we-0.032
bes-0.032
 wearing-0.029
 wiping-0.029
kins-0.028
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
ah+0.019
 mate+0.017
ff+0.017
 One+0.017
ow+0.016
Neg logit contributions
bes-0.034
we-0.032
irling-0.030
kins-0.030
 wiping-0.028
Token the
Token ID262
Feature activation-8.76e-04

Pos logit contributions
bes+0.038
we+0.036
web+0.033
lf+0.033
irling+0.033
Neg logit contributions
 enough-0.015
 leave-0.013
ow-0.013
 him-0.012
ah-0.012
Token ring
Token ID5858
Feature activation-1.36e-02

Pos logit contributions
ah+0.001
 mate+0.001
ff+0.001
 One+0.001
 enough+0.001
Neg logit contributions
bes-0.002
we-0.002
kins-0.002
irling-0.002
 wiping-0.002
Token as
Token ID355
Feature activation-7.54e-03

Pos logit contributions
ah+0.023
ff+0.023
 mate+0.022
 One+0.022
upid+0.021
Neg logit contributions
bes-0.020
kins-0.020
we-0.020
 wearing-0.018
 wrapping-0.017
Token a
Token ID257
Feature activation-3.55e-02

Pos logit contributions
 mate+0.010
ah+0.010
 enough+0.009
 One+0.009
ff+0.009
Neg logit contributions
bes-0.016
we-0.015
irling-0.014
kins-0.013
 wiping-0.013
Token souven
Token ID45043
Feature activation-1.64e-02

Pos logit contributions
ah+0.060
ff+0.057
 One+0.055
bar+0.055
ow+0.055
Neg logit contributions
bes-0.059
we-0.058
kins-0.052
 wiping-0.051
 wearing-0.050
Tokenir
Token ID343
Feature activation-7.10e-02

Pos logit contributions
ah+0.026
 mate+0.025
 One+0.024
bar+0.023
ff+0.023
Neg logit contributions
we-0.029
bes-0.029
irling-0.027
kins-0.027
 wearing-0.025
Token.
Token ID13
Feature activation-1.25e-02

Pos logit contributions
ah+0.115
ff+0.111
 mate+0.109
bar+0.108
ait+0.106
Neg logit contributions
we-0.115
bes-0.113
kins-0.097
 wearing-0.096
irling-0.090
Token They
Token ID1119
Feature activation-9.49e-03

Pos logit contributions
ah+0.022
bar+0.019
 mate+0.019
pole+0.018
ff+0.018
Neg logit contributions
bes-0.022
we-0.021
 wearing-0.019
 wiping-0.019
kins-0.019
Token thanked
Token ID26280
Feature activation-6.48e-04

Pos logit contributions
ah+0.016
ff+0.015
 mate+0.015
bar+0.015
 One+0.015
Neg logit contributions
bes-0.016
we-0.015
kins-0.014
 wearing-0.014
irling-0.013
Token the
Token ID262
Feature activation-2.89e-02

Pos logit contributions
ah+0.001
 mate+0.001
ff+0.001
bar+0.001
 One+0.001
Neg logit contributions
bes-0.001
we-0.001
kins-0.001
irling-0.001
 wiping-0.001
Token dove
Token ID38862
Feature activation-2.99e-02

Pos logit contributions
ah+0.046
 mate+0.042
 One+0.040
ff+0.040
bar+0.039
Neg logit contributions
bes-0.056
we-0.053
kins-0.049
irling-0.047
 wiping-0.046
Token and
Token ID290
Feature activation-4.93e-03

Pos logit contributions
ah+0.043
 mate+0.039
ff+0.039
 One+0.038
bar+0.038
Neg logit contributions
bes-0.056
we-0.054
kins-0.049
irling-0.048
 wiping-0.047
Token clean
Token ID3424
Feature activation-2.89e-02

Pos logit contributions
 mate+0.007
ah+0.007
 One+0.006
 enough+0.006
ff+0.006
Neg logit contributions
bes-0.010
we-0.009
irling-0.009
kins-0.008
 wiping-0.008
Token.
Token ID13
Feature activation-2.12e-02

Pos logit contributions
ah+0.042
 mate+0.038
ff+0.037
 One+0.037
bar+0.036
Neg logit contributions
bes-0.059
we-0.057
kins-0.052
irling-0.051
 wiping-0.050
Token How
Token ID1374
Feature activation-4.64e-02

Pos logit contributions
ah+0.035
bar+0.031
 mate+0.031
ff+0.031
ow+0.029
Neg logit contributions
bes-0.039
we-0.038
kins-0.033
 wearing-0.033
 wiping-0.033
Token does
Token ID857
Feature activation-4.35e-02

Pos logit contributions
bar+0.077
ah+0.074
 mate+0.071
ff+0.070
pole+0.065
Neg logit contributions
we-0.073
bes-0.072
kins-0.070
 wearing-0.067
irling-0.064
Token that
Token ID326
Feature activation-7.09e-02

Pos logit contributions
ah+0.082
ff+0.076
bar+0.074
 mate+0.074
 One+0.070
Neg logit contributions
bes-0.070
we-0.066
kins-0.061
 wearing-0.059
irling-0.058
Token sound
Token ID2128
Feature activation-3.06e-02

Pos logit contributions
 mate+0.149
ff+0.148
ah+0.146
bar+0.145
ow+0.141
Neg logit contributions
we-0.086
bes-0.073
kins-0.071
 wiping-0.070
lf-0.070
Token?"
Token ID1701
Feature activation-1.53e-02

Pos logit contributions
ah+0.045
 mate+0.041
ff+0.040
 One+0.039
bar+0.039
Neg logit contributions
bes-0.061
we-0.060
kins-0.055
irling-0.053
 wiping-0.052
Token\n
Token ID198
Feature activation-2.14e-02

Pos logit contributions
ah+0.023
 mate+0.021
ff+0.021
bar+0.021
 One+0.020
Neg logit contributions
bes-0.030
we-0.029
kins-0.026
irling-0.026
 wiping-0.025
Token\n
Token ID198
Feature activation-1.00e-02

Pos logit contributions
ah+0.033
 mate+0.031
ff+0.030
bar+0.030
 One+0.029
Neg logit contributions
bes-0.041
we-0.040
kins-0.036
irling-0.035
 wiping-0.035
TokenS
Token ID50
Feature activation-5.13e-02

Pos logit contributions
ah+0.019
 mate+0.018
ff+0.017
bar+0.017
ow+0.017
Neg logit contributions
bes-0.016
we-0.016
kins-0.014
irling-0.014
 wearing-0.013
Token keep
Token ID1394
Feature activation+1.67e-02

Pos logit contributions
ah+0.015
ff+0.013
bar+0.013
 mate+0.013
 One+0.012
Neg logit contributions
bes-0.019
we-0.018
 wiping-0.016
kins-0.016
irling-0.016
Token them
Token ID606
Feature activation+8.36e-03

Pos logit contributions
bes+0.039
we+0.037
kins+0.035
irling+0.035
lf+0.033
Neg logit contributions
ah-0.018
 mate-0.017
 enough-0.016
ow-0.016
 One-0.015
Token as
Token ID355
Feature activation-7.73e-03

Pos logit contributions
bes+0.014
we+0.013
kins+0.013
 wearing+0.012
 wiping+0.012
Neg logit contributions
ah-0.015
ff-0.014
 One-0.013
 mate-0.013
bar-0.013
Token a
Token ID257
Feature activation-3.25e-02

Pos logit contributions
 mate+0.011
ah+0.011
 enough+0.010
ff+0.009
 One+0.009
Neg logit contributions
bes-0.016
we-0.015
irling-0.014
kins-0.014
 wiping-0.013
Token souven
Token ID45043
Feature activation-1.62e-02

Pos logit contributions
ah+0.056
ff+0.054
ow+0.052
bar+0.051
 One+0.050
Neg logit contributions
bes-0.053
we-0.051
 wiping-0.046
 wrapping-0.046
 wearing-0.046
Tokenir
Token ID343
Feature activation-7.09e-02

Pos logit contributions
ah+0.027
 mate+0.025
 One+0.025
bar+0.024
pole+0.024
Neg logit contributions
we-0.026
bes-0.026
irling-0.025
kins-0.025
 wearing-0.023
Token of
Token ID286
Feature activation-2.50e-03

Pos logit contributions
ah+0.114
 destination+0.114
ff+0.110
ait+0.109
bar+0.108
Neg logit contributions
we-0.110
bes-0.108
kins-0.092
 wearing-0.090
keeping-0.084
Token Frost
Token ID15122
Feature activation-6.71e-03

Pos logit contributions
ah+0.004
ff+0.003
bar+0.003
 mate+0.003
 One+0.003
Neg logit contributions
bes-0.005
we-0.005
kins-0.005
 wearing-0.004
 wiping-0.004
Tokeny
Token ID88
Feature activation-2.02e-02

Pos logit contributions
ah+0.010
 One+0.010
bar+0.010
 mate+0.010
ff+0.009
Neg logit contributions
we-0.013
bes-0.012
kins-0.011
 wearing-0.010
 wiping-0.010
Token.
Token ID13
Feature activation-1.47e-02

Pos logit contributions
ah+0.032
ff+0.031
bar+0.030
 mate+0.030
 One+0.029
Neg logit contributions
bes-0.036
we-0.035
kins-0.031
 wearing-0.031
 wiping-0.030
Token They
Token ID1119
Feature activation-1.12e-02

Pos logit contributions
ah+0.028
bar+0.025
ff+0.023
pole+0.022
 mate+0.022
Neg logit contributions
bes-0.026
we-0.024
 wearing-0.022
 wiping-0.021
kins-0.021
Token it
Token ID340
Feature activation+1.68e-02

Pos logit contributions
bes+0.015
we+0.013
kins+0.013
 complexion+0.012
irling+0.012
Neg logit contributions
 One-0.009
 enough-0.009
 mate-0.009
ah-0.008
 go-0.008
Token's
Token ID338
Feature activation-2.28e-02

Pos logit contributions
bes+0.047
we+0.045
irling+0.043
lf+0.042
kins+0.042
Neg logit contributions
ah-0.010
 One-0.010
 mate-0.009
 enough-0.009
ff-0.008
Token important
Token ID1593
Feature activation+1.02e-02

Pos logit contributions
ah+0.037
 One+0.036
 mate+0.035
 enough+0.035
ff+0.033
Neg logit contributions
bes-0.042
we-0.042
kins-0.036
irling-0.035
lf-0.035
Token to
Token ID284
Feature activation-3.30e-03

Pos logit contributions
bes+0.029
irling+0.027
we+0.027
kins+0.027
lf+0.026
Neg logit contributions
 enough-0.008
ah-0.007
 One-0.006
 mate-0.006
ow-0.006
Token stay
Token ID2652
Feature activation+1.24e-02

Pos logit contributions
ah+0.004
 mate+0.004
 One+0.003
ff+0.003
ow+0.003
Neg logit contributions
bes-0.008
we-0.008
kins-0.007
irling-0.007
 wiping-0.006
Token hyd
Token ID7409
Feature activation-7.05e-02

Pos logit contributions
we+0.026
bes+0.026
lf+0.024
 complexion+0.021
keeping+0.021
Neg logit contributions
 enough-0.017
 away-0.015
 One-0.015
 leave-0.015
ff-0.015
Tokenrated
Token ID4111
Feature activation-5.80e-03

Pos logit contributions
 mate+0.100
ah+0.099
bar+0.090
 One+0.087
ff+0.086
Neg logit contributions
we-0.137
bes-0.134
kins-0.121
irling-0.118
bing-0.114
Token and
Token ID290
Feature activation-1.81e-03

Pos logit contributions
ah+0.009
 mate+0.008
ff+0.008
 One+0.008
bar+0.008
Neg logit contributions
bes-0.011
we-0.010
kins-0.010
irling-0.009
 wiping-0.009
Token drink
Token ID4144
Feature activation-3.93e-03

Pos logit contributions
ah+0.002
 One+0.002
 mate+0.002
ff+0.002
bar+0.002
Neg logit contributions
bes-0.004
we-0.004
kins-0.003
irling-0.003
lf-0.003
Token water
Token ID1660
Feature activation-2.03e-02

Pos logit contributions
ah+0.005
 One+0.005
ff+0.005
 mate+0.005
 enough+0.005
Neg logit contributions
bes-0.008
we-0.008
irling-0.007
kins-0.007
lf-0.007
Token when
Token ID618
Feature activation-7.01e-03

Pos logit contributions
ah+0.033
 mate+0.030
ff+0.030
 One+0.029
bar+0.029
Neg logit contributions
bes-0.038
we-0.037
kins-0.034
 wiping-0.032
 wearing-0.032
Token her
Token ID607
Feature activation+2.13e-02

Pos logit contributions
bes+0.027
we+0.026
kins+0.024
 wearing+0.023
 wiping+0.023
Neg logit contributions
ah-0.027
ff-0.024
 mate-0.023
bar-0.023
 One-0.023
Token crab
Token ID32202
Feature activation-1.52e-03

Pos logit contributions
bes+0.042
we+0.040
irling+0.035
 wiping+0.033
keeping+0.033
Neg logit contributions
 mate-0.034
ah-0.033
 One-0.030
ff-0.029
bar-0.029
Token family
Token ID1641
Feature activation+8.87e-03

Pos logit contributions
ah+0.002
 mate+0.002
 One+0.002
ff+0.002
bar+0.002
Neg logit contributions
bes-0.003
we-0.003
kins-0.003
irling-0.003
 wiping-0.003
Token.
Token ID13
Feature activation+7.93e-03

Pos logit contributions
bes+0.020
we+0.020
irling+0.018
kins+0.018
 wiping+0.017
Neg logit contributions
ah-0.011
 mate-0.010
 One-0.010
ff-0.009
bar-0.009
Token One
Token ID1881
Feature activation-7.45e-03

Pos logit contributions
bes+0.019
we+0.019
kins+0.017
irling+0.017
 wiping+0.016
Neg logit contributions
ah-0.009
 One-0.008
 mate-0.008
ff-0.008
bar-0.007
Token day
Token ID1110
Feature activation+4.14e-02

Pos logit contributions
ah+0.012
 mate+0.011
 One+0.011
ff+0.011
bar+0.011
Neg logit contributions
bes-0.014
we-0.014
kins-0.012
irling-0.012
 wiping-0.012
Token,
Token ID11
Feature activation+1.75e-02

Pos logit contributions
bes+0.118
we+0.113
keeping+0.101
irling+0.101
web+0.100
Neg logit contributions
 One-0.034
ah-0.032
 enough-0.029
 mate-0.028
ff-0.025
Token Sally
Token ID25737
Feature activation+1.16e-02

Pos logit contributions
bes+0.042
we+0.039
irling+0.035
web+0.034
kins+0.034
Neg logit contributions
 One-0.023
ah-0.021
 please-0.020
 enough-0.020
 mate-0.020
Token wanted
Token ID2227
Feature activation+2.42e-02

Pos logit contributions
bes+0.024
we+0.023
kins+0.021
irling+0.021
 wiping+0.020
Neg logit contributions
ah-0.017
 One-0.015
ff-0.015
 mate-0.015
ow-0.014
Token to
Token ID284
Feature activation+2.53e-02

Pos logit contributions
bes+0.066
we+0.061
irling+0.060
kins+0.059
lf+0.054
Neg logit contributions
 mate-0.021
 One-0.021
 enough-0.020
ah-0.020
ow-0.019
Token go
Token ID467
Feature activation+1.38e-02

Pos logit contributions
bes+0.052
we+0.049
kins+0.046
 wearing+0.045
 wiping+0.045
Neg logit contributions
ah-0.036
ff-0.032
 mate-0.031
bar-0.031
 One-0.030
Token a
Token ID257
Feature activation+1.17e-02

Pos logit contributions
ah+0.027
 mate+0.026
ff+0.025
 One+0.023
bit+0.022
Neg logit contributions
we-0.029
 wiping-0.028
bes-0.027
 wrapping-0.027
 wearing-0.026
Token way
Token ID835
Feature activation+1.53e-02

Pos logit contributions
we+0.022
bes+0.022
irling+0.018
kins+0.018
 wiping+0.017
Neg logit contributions
ah-0.019
 mate-0.018
 One-0.017
ff-0.017
bar-0.017
Token to
Token ID284
Feature activation+4.64e-03

Pos logit contributions
bes+0.033
we+0.031
kins+0.029
irling+0.029
 wiping+0.027
Neg logit contributions
ah-0.021
 mate-0.019
bar-0.019
ff-0.018
 One-0.018
Token make
Token ID787
Feature activation-8.49e-03

Pos logit contributions
bes+0.010
we+0.010
kins+0.009
irling+0.009
 wiping+0.008
Neg logit contributions
ah-0.006
 mate-0.006
 One-0.005
ff-0.005
bar-0.005
Token vegetables
Token ID13701
Feature activation-8.14e-03

Pos logit contributions
ah+0.013
 mate+0.012
ff+0.011
 One+0.011
bar+0.010
Neg logit contributions
bes-0.017
we-0.017
kins-0.015
 wiping-0.015
 wearing-0.015
Token fun
Token ID1257
Feature activation+4.79e-02

Pos logit contributions
ah+0.013
ff+0.012
 mate+0.012
 One+0.011
bar+0.011
Neg logit contributions
bes-0.015
we-0.015
kins-0.014
 wiping-0.013
 wearing-0.013
Token by
Token ID416
Feature activation+1.48e-02

Pos logit contributions
bes+0.089
irling+0.087
we+0.081
 complexion+0.079
 sudden+0.077
Neg logit contributions
 enough-0.065
 One-0.060
ah-0.059
 go-0.059
 mate-0.058
Token cutting
Token ID7720
Feature activation+9.82e-03

Pos logit contributions
bes+0.022
we+0.022
irling+0.018
kins+0.018
 complexion+0.018
Neg logit contributions
 One-0.026
ah-0.026
bar-0.026
 mate-0.025
 enough-0.025
Token them
Token ID606
Feature activation+7.22e-03

Pos logit contributions
bes+0.018
we+0.017
kins+0.016
irling+0.015
 wiping+0.015
Neg logit contributions
ah-0.016
 mate-0.015
ff-0.015
 One-0.015
bar-0.014
Token into
Token ID656
Feature activation+2.98e-03

Pos logit contributions
bes+0.013
we+0.013
kins+0.012
 wiping+0.011
 wearing+0.011
Neg logit contributions
ah-0.012
 mate-0.011
ff-0.011
bar-0.010
 One-0.010
Token fun
Token ID1257
Feature activation+3.57e-02

Pos logit contributions
bes+0.006
we+0.006
irling+0.005
kins+0.005
 wiping+0.005
Neg logit contributions
ah-0.004
 One-0.004
 mate-0.004
ff-0.004
 enough-0.004
Token colors
Token ID7577
Feature activation+1.60e-02

Pos logit contributions
bes+0.033
we+0.031
irling+0.027
kins+0.027
 wiping+0.026
Neg logit contributions
ah-0.052
ff-0.049
 mate-0.049
 One-0.049
bar-0.047
Token and
Token ID290
Feature activation+1.03e-02

Pos logit contributions
bes+0.030
we+0.029
irling+0.026
kins+0.026
 wiping+0.025
Neg logit contributions
ah-0.025
 mate-0.023
 One-0.023
ff-0.023
bar-0.022
Token shapes
Token ID15268
Feature activation-1.04e-02

Pos logit contributions
bes+0.017
we+0.016
irling+0.016
kins+0.014
web+0.014
Neg logit contributions
 One-0.018
ah-0.017
 mate-0.017
 enough-0.016
 go-0.016
Token.
Token ID13
Feature activation+3.52e-03

Pos logit contributions
ah+0.016
ff+0.014
 mate+0.014
bar+0.014
 One+0.014
Neg logit contributions
bes-0.020
we-0.020
kins-0.018
irling-0.017
 wearing-0.017
Token She
Token ID1375
Feature activation+9.11e-03

Pos logit contributions
bes+0.008
we+0.007
kins+0.007
 wiping+0.007
irling+0.007
Neg logit contributions
ah-0.005
 mate-0.004
bar-0.004
ff-0.004
ow-0.004
Token can
Token ID460
Feature activation+4.32e-02

Pos logit contributions
bes+0.017
we+0.016
kins+0.015
irling+0.014
 wiping+0.014
Neg logit contributions
ah-0.015
 mate-0.014
ff-0.013
 One-0.013
bar-0.013
Token make
Token ID787
Feature activation-5.13e-03

Pos logit contributions
bes+0.079
we+0.077
kins+0.069
irling+0.068
 wearing+0.068
Neg logit contributions
ah-0.069
ff-0.065
 mate-0.062
 enough-0.059
ale-0.058
Token towers
Token ID18028
Feature activation-1.64e-02

Pos logit contributions
ah+0.010
ff+0.010
 mate+0.009
ow+0.009
bar+0.009
Neg logit contributions
we-0.008
bes-0.007
kins-0.006
 wearing-0.006
 wiping-0.006
Token,
Token ID11
Feature activation+3.29e-03

Pos logit contributions
ah+0.034
ff+0.032
 mate+0.031
 One+0.030
ow+0.029
Neg logit contributions
bes-0.022
we-0.022
kins-0.020
 wearing-0.019
 wiping-0.018
Token bridges
Token ID19432
Feature activation-8.73e-03

Pos logit contributions
bes+0.005
we+0.005
kins+0.005
irling+0.004
 wiping+0.004
Neg logit contributions
ah-0.006
ff-0.006
 mate-0.006
bar-0.005
 One-0.005
Token,
Token ID11
Feature activation+7.52e-03

Pos logit contributions
ah+0.016
ff+0.015
 mate+0.014
 One+0.014
bar+0.014
Neg logit contributions
bes-0.015
we-0.014
kins-0.013
 wearing-0.012
 wiping-0.012
Token they
Token ID484
Feature activation-6.90e-03

Pos logit contributions
ah+0.006
 mate+0.005
ff+0.005
 One+0.005
bar+0.005
Neg logit contributions
bes-0.007
we-0.007
kins-0.006
irling-0.006
 wiping-0.006
Token spl
Token ID4328
Feature activation-9.64e-03

Pos logit contributions
ff+0.013
ah+0.012
bar+0.012
 One+0.012
upid+0.012
Neg logit contributions
bes-0.009
kins-0.009
we-0.009
 wearing-0.009
irling-0.008
Tokenashed
Token ID5263
Feature activation+1.82e-02

Pos logit contributions
ah+0.018
 mate+0.017
 One+0.017
ff+0.016
bar+0.016
Neg logit contributions
bes-0.015
we-0.015
kins-0.013
irling-0.013
 wearing-0.012
Token and
Token ID290
Feature activation+1.18e-02

Pos logit contributions
bes+0.037
we+0.035
irling+0.033
kins+0.032
lf+0.032
Neg logit contributions
ah-0.026
 One-0.023
 enough-0.023
ff-0.023
 mate-0.023
Token had
Token ID550
Feature activation+3.19e-03

Pos logit contributions
bes+0.024
we+0.023
kins+0.022
irling+0.021
ini+0.021
Neg logit contributions
ah-0.017
 mate-0.015
 One-0.014
 enough-0.014
bar-0.014
Token fun
Token ID1257
Feature activation+4.20e-02

Pos logit contributions
bes+0.006
we+0.005
kins+0.005
irling+0.005
 wiping+0.005
Neg logit contributions
ah-0.005
ff-0.005
 mate-0.005
bar-0.005
 One-0.005
Token.
Token ID13
Feature activation+8.01e-03

Pos logit contributions
bes+0.108
kins+0.098
we+0.098
iflower+0.095
ini+0.094
Neg logit contributions
ah-0.032
 One-0.031
 enough-0.030
 mate-0.029
 go-0.022
Token\n
Token ID198
Feature activation-1.29e-03

Pos logit contributions
bes+0.017
we+0.016
kins+0.015
irling+0.014
 wiping+0.014
Neg logit contributions
ah-0.011
 mate-0.010
ff-0.010
bar-0.010
 One-0.010
Token\n
Token ID198
Feature activation+9.39e-03

Pos logit contributions
 One+0.001
ah+0.001
ff+0.001
 mate+0.001
ow+0.001
Neg logit contributions
bes-0.003
we-0.003
kins-0.003
irling-0.003
 wiping-0.003
TokenThe
Token ID464
Feature activation-9.39e-03

Pos logit contributions
bes+0.017
we+0.016
kins+0.015
 wiping+0.014
irling+0.014
Neg logit contributions
ah-0.016
 One-0.015
 mate-0.014
ff-0.014
bar-0.014
Token little
Token ID1310
Feature activation+1.49e-02

Pos logit contributions
 mate+0.013
ah+0.013
 One+0.012
 driver+0.011
 end+0.011
Neg logit contributions
we-0.020
bes-0.020
irling-0.018
kins-0.017
keeping-0.017
Token pink
Token ID11398
Feature activation+3.54e-02

Pos logit contributions
we+0.114
bes+0.102
irling+0.100
keeping+0.098
kins+0.093
Neg logit contributions
 One-0.062
ah-0.058
 mate-0.058
ossible-0.051
 go-0.051
Token dress
Token ID6576
Feature activation+4.83e-03

Pos logit contributions
we+0.062
bes+0.060
kins+0.055
irling+0.054
keeping+0.051
Neg logit contributions
ah-0.060
 One-0.058
 mate-0.057
bar-0.054
ff-0.051
Token and
Token ID290
Feature activation+9.24e-03

Pos logit contributions
bes+0.009
we+0.008
kins+0.007
irling+0.007
 wearing+0.007
Neg logit contributions
ah-0.008
 mate-0.008
ff-0.008
bar-0.007
 One-0.007
Token her
Token ID607
Feature activation+2.54e-02

Pos logit contributions
we+0.020
bes+0.019
irling+0.017
kins+0.017
keeping+0.016
Neg logit contributions
 One-0.013
 go-0.013
ah-0.012
 mate-0.012
 leave-0.012
Token spark
Token ID9009
Feature activation+2.74e-03

Pos logit contributions
we+0.054
bes+0.051
irling+0.048
keeping+0.047
kins+0.046
Neg logit contributions
ah-0.033
 One-0.032
 mate-0.031
 go-0.027
 leave-0.027
Tokenly
Token ID306
Feature activation+4.10e-02

Pos logit contributions
bes+0.004
we+0.004
kins+0.003
irling+0.003
 wiping+0.003
Neg logit contributions
ah-0.006
 mate-0.005
ff-0.005
 One-0.005
bar-0.005
Token shoes
Token ID10012
Feature activation+7.81e-03

Pos logit contributions
we+0.072
bes+0.071
irling+0.063
kins+0.062
 wiping+0.060
Neg logit contributions
ah-0.070
 mate-0.066
 One-0.062
bar-0.062
ff-0.061
Token every
Token ID790
Feature activation+8.24e-03

Pos logit contributions
bes+0.016
we+0.016
kins+0.015
irling+0.014
 wiping+0.014
Neg logit contributions
ah-0.011
 mate-0.009
 One-0.009
ff-0.009
bar-0.009
Token day
Token ID1110
Feature activation+3.04e-02

Pos logit contributions
bes+0.014
we+0.014
kins+0.013
irling+0.012
 wiping+0.012
Neg logit contributions
ah-0.014
 mate-0.013
ff-0.013
bar-0.013
 One-0.012
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
bes+0.064
we+0.062
kins+0.057
irling+0.056
 wiping+0.055
Neg logit contributions
ah-0.041
 mate-0.037
ff-0.037
 One-0.037
bar-0.036
Token One
Token ID1881
Feature activation-1.54e-02

Pos logit contributions
bes+0.025
we+0.024
 wearing+0.023
 wiping+0.023
 wrapping+0.022
Neg logit contributions
ah-0.017
 mate-0.016
ff-0.016
bar-0.015
ow-0.015
Token!"
Token ID2474
Feature activation-5.05e-03

Pos logit contributions
bes+0.006
we+0.006
kins+0.006
 wearing+0.006
 wiping+0.005
Neg logit contributions
ah-0.006
ff-0.005
 mate-0.005
 One-0.005
bar-0.004
Token\n
Token ID198
Feature activation-7.65e-03

Pos logit contributions
ah+0.008
 mate+0.008
ff+0.007
bar+0.007
 enough+0.007
Neg logit contributions
bes-0.009
we-0.009
 wearing-0.008
 wiping-0.008
kins-0.008
Token\n
Token ID198
Feature activation+5.46e-03

Pos logit contributions
ah+0.012
 mate+0.012
ff+0.011
bar+0.011
 One+0.011
Neg logit contributions
bes-0.014
we-0.014
kins-0.012
irling-0.012
 wiping-0.012
TokenSp
Token ID4561
Feature activation+2.37e-02

Pos logit contributions
bes+0.008
we+0.008
kins+0.007
irling+0.007
 wiping+0.007
Neg logit contributions
ah-0.011
 mate-0.010
ff-0.010
ow-0.010
 One-0.010
Tokenike
Token ID522
Feature activation+2.03e-02

Pos logit contributions
bes+0.043
we+0.040
 wrapping+0.037
 wiping+0.037
kins+0.037
Neg logit contributions
ah-0.041
ff-0.037
 mate-0.035
ow-0.034
bar-0.034
Token thought
Token ID1807
Feature activation+1.67e-03

Pos logit contributions
bes+0.035
we+0.034
kins+0.030
irling+0.029
 wiping+0.028
Neg logit contributions
ah-0.036
 mate-0.033
ff-0.033
 One-0.032
bar-0.032
Token for
Token ID329
Feature activation-4.99e-03

Pos logit contributions
bes+0.004
we+0.004
kins+0.003
 wiping+0.003
 wearing+0.003
Neg logit contributions
ah-0.002
ff-0.002
 mate-0.002
bar-0.002
 One-0.002
Token a
Token ID257
Feature activation-1.23e-02

Pos logit contributions
ah+0.007
 One+0.006
 mate+0.006
ff+0.006
 enough+0.006
Neg logit contributions
bes-0.011
we-0.010
irling-0.010
kins-0.010
 wiping-0.009
Token moment
Token ID2589
Feature activation+2.37e-02

Pos logit contributions
ah+0.027
 mate+0.026
ff+0.026
bar+0.025
 One+0.025
Neg logit contributions
bes-0.015
we-0.015
kins-0.012
irling-0.012
 wiping-0.012
Token and
Token ID290
Feature activation-1.30e-02

Pos logit contributions
bes+0.043
we+0.041
kins+0.037
irling+0.036
 wiping+0.035
Neg logit contributions
ah-0.039
 mate-0.036
ff-0.036
 One-0.035
bar-0.035
Token then
Token ID788
Feature activation+3.36e-04

Pos logit contributions
ah+0.021
ff+0.020
 mate+0.020
bar+0.019
 One+0.019
Neg logit contributions
bes-0.023
we-0.022
kins-0.020
 wearing-0.019
 wiping-0.019
Token,
Token ID11
Feature activation-6.08e-03

Pos logit contributions
ah+0.042
ff+0.038
 mate+0.037
 One+0.036
bar+0.036
Neg logit contributions
bes-0.042
we-0.040
kins-0.037
 wearing-0.035
 wiping-0.034
Token bigger
Token ID5749
Feature activation-9.96e-03

Pos logit contributions
ah+0.010
 mate+0.010
 One+0.009
ff+0.009
bar+0.009
Neg logit contributions
bes-0.011
we-0.010
kins-0.009
irling-0.009
 wiping-0.009
Token and
Token ID290
Feature activation-2.26e-02

Pos logit contributions
ah+0.017
 mate+0.016
ff+0.016
 One+0.016
bar+0.015
Neg logit contributions
bes-0.017
we-0.017
kins-0.015
irling-0.015
 wiping-0.014
Token brighter
Token ID26083
Feature activation-2.03e-03

Pos logit contributions
ah+0.033
ff+0.030
 mate+0.030
 One+0.029
bar+0.028
Neg logit contributions
bes-0.045
we-0.044
kins-0.040
irling-0.039
 wiping-0.039
Token than
Token ID621
Feature activation-1.37e-02

Pos logit contributions
ah+0.003
 mate+0.003
ff+0.003
 One+0.003
bar+0.003
Neg logit contributions
bes-0.004
we-0.003
irling-0.003
kins-0.003
 wiping-0.003
Token Lily
Token ID20037
Feature activation+1.35e-02

Pos logit contributions
ah+0.020
ff+0.019
 mate+0.018
 One+0.018
bar+0.018
Neg logit contributions
bes-0.026
we-0.026
kins-0.024
 wiping-0.023
 wearing-0.023
Token and
Token ID290
Feature activation-3.98e-02

Pos logit contributions
bes+0.026
we+0.025
kins+0.022
 wearing+0.022
 wiping+0.021
Neg logit contributions
ah-0.021
 mate-0.020
ff-0.020
 One-0.019
bar-0.019
Token Ben
Token ID3932
Feature activation+1.25e-03

Pos logit contributions
ah+0.059
ff+0.054
 mate+0.053
 One+0.052
bar+0.051
Neg logit contributions
bes-0.078
we-0.077
kins-0.069
 wearing-0.067
 wiping-0.067
Token's
Token ID338
Feature activation-8.45e-03

Pos logit contributions
bes+0.002
we+0.002
kins+0.002
 wearing+0.002
 wiping+0.002
Neg logit contributions
ah-0.002
bar-0.002
 mate-0.002
ff-0.002
 One-0.002
Token.
Token ID13
Feature activation-1.19e-02

Pos logit contributions
ah+0.014
 mate+0.013
ff+0.013
 One+0.013
bar+0.013
Neg logit contributions
bes-0.015
we-0.015
kins-0.013
irling-0.013
 wiping-0.012
Token He
Token ID679
Feature activation-6.79e-03

Pos logit contributions
ah+0.018
ff+0.016
 mate+0.016
bar+0.015
ow+0.015
Neg logit contributions
bes-0.024
we-0.023
 wearing-0.021
kins-0.020
 wiping-0.020
Token mean
Token ID1612
Feature activation-1.62e-02

Pos logit contributions
ah+0.017
 mate+0.016
 One+0.015
 please+0.014
 driver+0.014
Neg logit contributions
bes-0.024
we-0.023
keeping-0.021
kins-0.020
irling-0.020
Token monkey
Token ID21657
Feature activation-1.09e-02

Pos logit contributions
ah+0.025
 mate+0.023
ff+0.023
 One+0.022
bar+0.022
Neg logit contributions
bes-0.031
we-0.030
kins-0.027
irling-0.027
 wiping-0.026
Token named
Token ID3706
Feature activation-1.41e-02

Pos logit contributions
ah+0.019
 mate+0.017
ff+0.017
 One+0.017
bar+0.017
Neg logit contributions
bes-0.019
we-0.018
kins-0.016
irling-0.016
 wiping-0.016
Token Bob
Token ID5811
Feature activation-2.23e-03

Pos logit contributions
ff+0.022
ah+0.021
 mate+0.020
bar+0.019
ow+0.018
Neg logit contributions
bes-0.025
 wearing-0.024
we-0.024
lf-0.023
 reading-0.023
Token.
Token ID13
Feature activation-6.79e-04

Pos logit contributions
bar+0.004
 mate+0.004
 destination+0.004
ah+0.004
ff+0.004
Neg logit contributions
we-0.003
bes-0.003
 wearing-0.003
kins-0.003
 wiping-0.003
Token Bob
Token ID5811
Feature activation+3.59e-02

Pos logit contributions
ah+0.001
bar+0.001
 mate+0.001
ale+0.001
ff+0.001
Neg logit contributions
bes-0.001
we-0.001
 wearing-0.001
 wiping-0.001
 reading-0.001
Token was
Token ID373
Feature activation+2.78e-02

Pos logit contributions
bes+0.061
we+0.060
kins+0.052
irling+0.052
 wiping+0.051
Neg logit contributions
ah-0.065
 One-0.059
ff-0.058
 mate-0.057
ow-0.056
Token always
Token ID1464
Feature activation+2.84e-02

Pos logit contributions
bes+0.051
we+0.049
kins+0.045
irling+0.043
 wiping+0.042
Neg logit contributions
ah-0.046
 mate-0.042
ff-0.041
 One-0.041
bar-0.041
Token very
Token ID845
Feature activation+1.71e-02

Pos logit contributions
bes+0.047
we+0.045
kins+0.041
irling+0.039
lf+0.036
Neg logit contributions
ah-0.052
 mate-0.048
ff-0.047
 One-0.047
ow-0.047
Token naughty
Token ID45128
Feature activation+1.28e-02

Pos logit contributions
bes+0.030
we+0.029
kins+0.025
irling+0.025
 wiping+0.024
Neg logit contributions
ah-0.030
 mate-0.028
ff-0.028
 One-0.027
bar-0.027
Token and
Token ID290
Feature activation+1.34e-02

Pos logit contributions
bes+0.022
we+0.022
kins+0.019
irling+0.019
 wiping+0.019
Neg logit contributions
ah-0.022
 mate-0.020
ff-0.020
 One-0.020
bar-0.020
Token cozy
Token ID37438
Feature activation+9.35e-03

Pos logit contributions
we+0.011
bes+0.010
irling+0.009
kins+0.009
lf+0.009
Neg logit contributions
ah-0.008
 mate-0.008
 One-0.008
 leave-0.008
ff-0.008
Token.
Token ID13
Feature activation-7.31e-03

Pos logit contributions
bes+0.020
we+0.020
irling+0.018
kins+0.018
 wiping+0.017
Neg logit contributions
ah-0.012
 mate-0.011
 One-0.011
ff-0.011
 enough-0.010
Token<|endoftext|>
Token ID50256
Feature activation+1.60e-02

Pos logit contributions
ah+0.011
 mate+0.010
bar+0.010
ff+0.010
ow+0.009
Neg logit contributions
bes-0.015
we-0.014
kins-0.013
irling-0.013
 wiping-0.013
TokenJen
Token ID44875
Feature activation-1.95e-02

Pos logit contributions
bes+0.036
we+0.034
irling+0.032
kins+0.032
 wiping+0.031
Neg logit contributions
ah-0.019
 One-0.018
ff-0.017
 mate-0.017
ow-0.016
Token loved
Token ID6151
Feature activation+2.50e-02

Pos logit contributions
 mate+0.036
ah+0.036
bar+0.035
ff+0.034
 One+0.033
Neg logit contributions
bes-0.030
we-0.029
kins-0.026
irling-0.025
 wearing-0.025
Token her
Token ID607
Feature activation+2.67e-02

Pos logit contributions
bes+0.054
we+0.051
kins+0.049
irling+0.047
web+0.044
Neg logit contributions
ah-0.032
 One-0.030
 mate-0.030
bar-0.028
ff-0.028
Token horse
Token ID8223
Feature activation+1.45e-03

Pos logit contributions
we+0.058
bes+0.058
irling+0.053
kins+0.051
keeping+0.048
Neg logit contributions
ah-0.034
 One-0.033
 mate-0.032
 enough-0.028
ff-0.028
Token,
Token ID11
Feature activation+1.24e-02

Pos logit contributions
bes+0.002
we+0.002
 wearing+0.002
kins+0.002
 wiping+0.002
Neg logit contributions
ah-0.002
ff-0.002
 mate-0.002
bar-0.002
upid-0.002
Token Max
Token ID5436
Feature activation+4.61e-03

Pos logit contributions
bes+0.023
we+0.023
 wearing+0.020
kins+0.020
 wiping+0.020
Neg logit contributions
ah-0.019
ff-0.019
 mate-0.018
bar-0.017
ow-0.017
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
bes+0.007
we+0.007
 wearing+0.007
 wiping+0.006
kins+0.006
Neg logit contributions
ah-0.008
bar-0.008
ff-0.008
 mate-0.008
upid-0.008
Token He
Token ID679
Feature activation+6.12e-03

Pos logit contributions
ah+0.016
 mate+0.015
bar+0.015
ow+0.015
ff+0.015
Neg logit contributions
bes-0.019
we-0.017
 wearing-0.017
 wiping-0.017
 basketball-0.016
Token to
Token ID284
Feature activation-9.76e-04

Pos logit contributions
bes+0.030
we+0.027
kins+0.026
lf+0.026
irling+0.026
Neg logit contributions
ah-0.014
 enough-0.014
 mate-0.013
 One-0.012
ow-0.012
Token use
Token ID779
Feature activation+1.17e-02

Pos logit contributions
ah+0.001
 mate+0.001
ff+0.001
 One+0.001
bar+0.001
Neg logit contributions
bes-0.002
we-0.002
kins-0.002
irling-0.002
 wiping-0.002
Token its
Token ID663
Feature activation+4.38e-03

Pos logit contributions
bes+0.023
we+0.023
kins+0.021
irling+0.020
 wiping+0.020
Neg logit contributions
ah-0.017
 mate-0.016
ff-0.015
 One-0.015
bar-0.015
Token be
Token ID307
Feature activation+2.53e-02

Pos logit contributions
bes+0.008
we+0.007
kins+0.007
irling+0.006
 wiping+0.006
Neg logit contributions
ah-0.008
ff-0.007
 mate-0.007
bar-0.007
 One-0.007
Tokenak
Token ID461
Feature activation-1.72e-02

Pos logit contributions
bes+0.046
we+0.045
kins+0.040
irling+0.039
 wiping+0.038
Neg logit contributions
ah-0.041
 mate-0.038
ff-0.038
bar-0.037
 One-0.037
Token to
Token ID284
Feature activation-1.94e-04

Pos logit contributions
ah+0.028
ff+0.026
 mate+0.025
 One+0.025
bar+0.023
Neg logit contributions
bes-0.030
we-0.030
kins-0.027
 wearing-0.026
 wiping-0.026
Token pick
Token ID2298
Feature activation-1.64e-03

Pos logit contributions
ah+0.000
 mate+0.000
ff+0.000
 One+0.000
bar+0.000
Neg logit contributions
bes-0.000
we-0.000
kins-0.000
irling-0.000
 wiping-0.000
Token up
Token ID510
Feature activation+1.26e-02

Pos logit contributions
ah+0.003
 mate+0.003
ff+0.003
 One+0.003
bar+0.003
Neg logit contributions
we-0.003
bes-0.003
kins-0.002
 wiping-0.002
irling-0.002
Token food
Token ID2057
Feature activation-5.17e-03

Pos logit contributions
bes+0.024
we+0.024
kins+0.022
irling+0.021
 wiping+0.021
Neg logit contributions
ah-0.019
 mate-0.018
ff-0.017
 One-0.017
bar-0.017
Token instead
Token ID2427
Feature activation-7.12e-03

Pos logit contributions
ah+0.009
ff+0.009
 mate+0.009
 One+0.008
bar+0.008
Neg logit contributions
bes-0.009
we-0.009
kins-0.008
 wearing-0.007
 wiping-0.007
Token of
Token ID286
Feature activation-1.31e-03

Pos logit contributions
ah+0.011
 mate+0.011
ff+0.010
 One+0.010
bar+0.010
Neg logit contributions
we-0.013
bes-0.013
kins-0.012
 wearing-0.011
 wiping-0.011
Token and
Token ID290
Feature activation-3.17e-03

Pos logit contributions
bes+0.013
we+0.013
 wiping+0.012
kins+0.011
 wearing+0.011
Neg logit contributions
ah-0.016
 mate-0.015
ff-0.015
bar-0.015
pole-0.014
Token cried
Token ID16896
Feature activation+6.10e-03

Pos logit contributions
ah+0.005
ff+0.005
 mate+0.005
bar+0.005
 One+0.005
Neg logit contributions
bes-0.006
we-0.005
kins-0.005
 wiping-0.005
 wearing-0.005
Token because
Token ID780
Feature activation+3.30e-03

Pos logit contributions
bes+0.010
we+0.010
 wiping+0.010
kins+0.009
 wearing+0.009
Neg logit contributions
ah-0.011
 mate-0.010
ff-0.010
bar-0.009
pole-0.009
Token she
Token ID673
Feature activation-1.66e-03

Pos logit contributions
bes+0.007
we+0.007
irling+0.006
kins+0.006
keeping+0.006
Neg logit contributions
 One-0.004
 enough-0.004
ah-0.004
 mate-0.004
 please-0.004
Token didn
Token ID1422
Feature activation-1.79e-02

Pos logit contributions
ah+0.002
 One+0.002
ff+0.002
ow+0.002
 mate+0.002
Neg logit contributions
bes-0.003
we-0.003
kins-0.003
irling-0.003
keeping-0.003
Token't
Token ID470
Feature activation-8.87e-03

Pos logit contributions
ah+0.026
 One+0.025
bar+0.023
ow+0.023
 mate+0.023
Neg logit contributions
bes-0.036
we-0.035
kins-0.033
irling-0.031
 wiping-0.030
Token have
Token ID423
Feature activation-1.67e-02

Pos logit contributions
ah+0.010
ff+0.009
 mate+0.009
bar+0.009
 One+0.008
Neg logit contributions
bes-0.020
we-0.020
kins-0.018
 wearing-0.018
 wiping-0.018
Token her
Token ID607
Feature activation+1.88e-02

Pos logit contributions
ah+0.024
ff+0.022
pole+0.022
bar+0.021
 mate+0.021
Neg logit contributions
bes-0.032
 wearing-0.030
we-0.030
 wiping-0.030
 wrapping-0.029
Token toy
Token ID13373
Feature activation+2.37e-02

Pos logit contributions
we+0.034
bes+0.033
irling+0.029
kins+0.029
lf+0.027
Neg logit contributions
ah-0.031
 mate-0.030
 One-0.030
 enough-0.029
ff-0.028
Token bunny
Token ID44915
Feature activation+9.28e-03

Pos logit contributions
bes+0.043
we+0.042
kins+0.038
 wearing+0.037
 wiping+0.036
Neg logit contributions
ah-0.039
ff-0.036
 mate-0.035
bar-0.034
 One-0.034
Token anymore
Token ID7471
Feature activation+4.33e-02

Pos logit contributions
bes+0.019
we+0.018
kins+0.016
 wearing+0.016
 wiping+0.016
Neg logit contributions
ah-0.014
ff-0.012
 mate-0.012
 One-0.012
bar-0.012
Token looked
Token ID3114
Feature activation+5.51e-03

Pos logit contributions
bes+0.014
we+0.014
kins+0.012
irling+0.012
 wiping+0.012
Neg logit contributions
ah-0.014
 mate-0.013
ff-0.013
 One-0.013
bar-0.012
Token everywhere
Token ID8347
Feature activation+3.96e-02

Pos logit contributions
bes+0.011
we+0.010
kins+0.009
irling+0.009
 wiping+0.009
Neg logit contributions
ah-0.008
 mate-0.008
ff-0.008
 One-0.008
bar-0.007
Token,
Token ID11
Feature activation+1.54e-02

Pos logit contributions
bes+0.092
we+0.088
kins+0.080
irling+0.080
keeping+0.074
Neg logit contributions
ah-0.047
 One-0.043
 mate-0.043
bar-0.040
ff-0.040
Token but
Token ID475
Feature activation+1.26e-02

Pos logit contributions
bes+0.033
we+0.032
kins+0.031
irling+0.030
iflower+0.027
Neg logit contributions
ah-0.018
 One-0.018
 enough-0.017
 mate-0.017
bar-0.017
Token it
Token ID340
Feature activation-4.36e-03

Pos logit contributions
bes+0.023
we+0.022
kins+0.020
irling+0.020
 wiping+0.018
Neg logit contributions
ah-0.020
 mate-0.019
 One-0.019
bar-0.019
 enough-0.019
Token was
Token ID373
Feature activation-1.83e-02

Pos logit contributions
ah+0.008
 mate+0.008
ff+0.008
 One+0.008
bar+0.008
Neg logit contributions
bes-0.007
we-0.006
kins-0.006
irling-0.006
 wiping-0.005
Token hidden
Token ID7104
Feature activation-1.43e-02

Pos logit contributions
ah+0.037
 mate+0.035
ff+0.034
 One+0.034
bar+0.034
Neg logit contributions
bes-0.026
we-0.026
kins-0.022
irling-0.022
 wiping-0.021
Token so
Token ID523
Feature activation+5.39e-03

Pos logit contributions
ah+0.025
 mate+0.023
ff+0.023
 enough+0.023
 One+0.023
Neg logit contributions
bes-0.024
we-0.023
irling-0.021
kins-0.021
 wiping-0.020
Token well
Token ID880
Feature activation+1.21e-02

Pos logit contributions
bes+0.011
we+0.010
kins+0.009
irling+0.009
 wiping+0.008
Neg logit contributions
 One-0.008
ah-0.008
 mate-0.007
 enough-0.007
ff-0.007
Token that
Token ID326
Feature activation-2.43e-03

Pos logit contributions
bes+0.025
we+0.025
kins+0.023
irling+0.022
 wiping+0.022
Neg logit contributions
ah-0.016
 mate-0.014
 One-0.014
 enough-0.014
ff-0.014
Token she
Token ID673
Feature activation-4.56e-03

Pos logit contributions
ah+0.004
 One+0.004
 mate+0.004
bar+0.004
 enough+0.004
Neg logit contributions
bes-0.004
we-0.004
irling-0.004
kins-0.004
 wiping-0.004
Token\n
Token ID198
Feature activation-1.14e-02

Pos logit contributions
 One+0.008
ah+0.007
 mate+0.007
ff+0.007
 enough+0.006
Neg logit contributions
we-0.018
bes-0.018
kins-0.017
irling-0.017
 wiping-0.016
Token\n
Token ID198
Feature activation+1.96e-04

Pos logit contributions
ah+0.015
 One+0.014
 mate+0.014
ff+0.013
bar+0.013
Neg logit contributions
bes-0.024
we-0.024
kins-0.022
irling-0.021
 wiping-0.021
TokenBut
Token ID1537
Feature activation+7.69e-03

Pos logit contributions
bes+0.000
we+0.000
kins+0.000
 wiping+0.000
 complexion+0.000
Neg logit contributions
ah-0.000
 One-0.000
 mate-0.000
bar-0.000
ff-0.000
Token then
Token ID788
Feature activation-1.42e-02

Pos logit contributions
bes+0.015
we+0.015
kins+0.013
irling+0.013
 wiping+0.013
Neg logit contributions
ah-0.012
 mate-0.011
 One-0.011
 enough-0.010
bar-0.010
Token,
Token ID11
Feature activation-3.00e-03

Pos logit contributions
ah+0.021
 mate+0.019
ff+0.019
bar+0.018
 One+0.018
Neg logit contributions
bes-0.028
we-0.028
kins-0.025
irling-0.025
 wiping-0.024
Token the
Token ID262
Feature activation-1.33e-02

Pos logit contributions
ah+0.004
 mate+0.004
 One+0.004
 enough+0.004
ff+0.004
Neg logit contributions
bes-0.006
we-0.006
irling-0.005
kins-0.005
 wiping-0.005
Token wind
Token ID2344
Feature activation+9.71e-03

Pos logit contributions
ah+0.016
 mate+0.016
 One+0.015
 driver+0.014
bar+0.014
Neg logit contributions
we-0.029
bes-0.029
irling-0.027
kins-0.026
 wiping-0.024
Token blows
Token ID20385
Feature activation-1.17e-02

Pos logit contributions
bes+0.018
we+0.017
kins+0.016
irling+0.015
 wiping+0.015
Neg logit contributions
ah-0.016
 mate-0.014
ff-0.014
 One-0.014
bar-0.014
Token and
Token ID290
Feature activation-2.63e-02

Pos logit contributions
 mate+0.022
ah+0.021
bar+0.020
ff+0.019
ow+0.019
Neg logit contributions
we-0.018
bes-0.017
kins-0.016
 wiping-0.015
 wrapping-0.015
Token the
Token ID262
Feature activation+2.72e-04

Pos logit contributions
ah+0.046
ff+0.044
 mate+0.044
pole+0.041
bar+0.041
Neg logit contributions
bes-0.044
we-0.042
 wiping-0.038
 wrapping-0.037
 wearing-0.037
Token door
Token ID3420
Feature activation-1.27e-02

Pos logit contributions
we+0.001
bes+0.001
irling+0.000
kins+0.000
 wiping+0.000
Neg logit contributions
ah-0.000
 mate-0.000
 One-0.000
bar-0.000
 leave-0.000
Token welcome
Token ID7062
Feature activation+5.44e-03

Pos logit contributions
ah+0.015
ff+0.013
bar+0.013
 mate+0.013
ale+0.012
Neg logit contributions
bes-0.019
we-0.019
 wearing-0.018
 wiping-0.018
kins-0.017
Token.
Token ID13
Feature activation-2.64e-03

Pos logit contributions
we+0.009
bes+0.009
 wearing+0.008
kins+0.008
 wiping+0.008
Neg logit contributions
ah-0.009
bar-0.009
ff-0.008
 mate-0.008
 One-0.008
Token I
Token ID314
Feature activation+2.80e-02

Pos logit contributions
ah+0.004
 mate+0.004
 One+0.004
ff+0.004
bar+0.003
Neg logit contributions
bes-0.005
we-0.005
kins-0.005
irling-0.005
 wiping-0.004
Token'm
Token ID1101
Feature activation+1.42e-02

Pos logit contributions
bes+0.052
we+0.050
kins+0.046
irling+0.045
 wearing+0.044
Neg logit contributions
ah-0.042
bar-0.040
 mate-0.040
ff-0.039
 One-0.038
Token so
Token ID523
Feature activation+1.82e-02

Pos logit contributions
bes+0.018
we+0.017
 wearing+0.017
 wiping+0.016
irling+0.016
Neg logit contributions
ah-0.030
ff-0.028
 mate-0.028
bar-0.027
 One-0.027
Token glad
Token ID9675
Feature activation-4.95e-03

Pos logit contributions
bes+0.029
we+0.028
kins+0.025
irling+0.023
 wiping+0.023
Neg logit contributions
ah-0.034
 mate-0.032
ff-0.032
 One-0.032
bar-0.031
Token I
Token ID314
Feature activation+2.48e-02

Pos logit contributions
ah+0.007
 mate+0.006
 One+0.006
ff+0.006
 enough+0.006
Neg logit contributions
bes-0.010
we-0.010
kins-0.009
irling-0.009
 wiping-0.008
Token could
Token ID714
Feature activation+9.95e-03

Pos logit contributions
bes+0.044
we+0.043
kins+0.039
irling+0.037
 wearing+0.037
Neg logit contributions
ah-0.041
 mate-0.038
ff-0.037
bar-0.037
 One-0.037
Token help
Token ID1037
Feature activation-1.36e-04

Pos logit contributions
bes+0.021
we+0.020
kins+0.019
irling+0.018
 wiping+0.018
Neg logit contributions
ah-0.014
 mate-0.012
ff-0.012
 One-0.012
bar-0.012
Token you
Token ID345
Feature activation+1.30e-02

Pos logit contributions
ah+0.000
 mate+0.000
ff+0.000
 One+0.000
bar+0.000
Neg logit contributions
bes-0.000
we-0.000
kins-0.000
irling-0.000
 wiping-0.000
Token."
Token ID526
Feature activation+4.35e-03

Pos logit contributions
bes+0.026
we+0.025
kins+0.023
irling+0.022
 wearing+0.022
Neg logit contributions
ah-0.020
ff-0.018
 mate-0.017
bar-0.017
 One-0.017
Token too
Token ID1165
Feature activation-1.07e-02

Pos logit contributions
ah+0.001
 mate+0.001
ff+0.001
 One+0.001
bar+0.001
Neg logit contributions
bes-0.001
we-0.001
kins-0.001
irling-0.001
 wiping-0.001
Token.
Token ID13
Feature activation-1.93e-02

Pos logit contributions
ah+0.016
 mate+0.015
ff+0.015
 One+0.014
bar+0.014
Neg logit contributions
bes-0.021
we-0.020
kins-0.018
irling-0.018
 wiping-0.018
Token It
Token ID632
Feature activation-1.37e-03

Pos logit contributions
ah+0.031
 mate+0.026
bar+0.026
ff+0.026
ale+0.024
Neg logit contributions
bes-0.038
we-0.035
 wearing-0.033
 wiping-0.032
kins-0.032
Token could
Token ID714
Feature activation-2.94e-03

Pos logit contributions
ff+0.003
ah+0.003
upid+0.002
 mate+0.002
ale+0.002
Neg logit contributions
bes-0.002
we-0.002
 wearing-0.002
kins-0.001
lf-0.001
Token not
Token ID407
Feature activation-3.68e-02

Pos logit contributions
ah+0.006
ff+0.006
 mate+0.005
ale+0.005
bar+0.005
Neg logit contributions
bes-0.004
 wearing-0.004
we-0.004
 wiping-0.003
irling-0.003
Token stop
Token ID2245
Feature activation-1.26e-02

Pos logit contributions
ah+0.060
ff+0.059
bar+0.056
rol+0.054
 enough+0.052
Neg logit contributions
bes-0.058
we-0.058
 wearing-0.058
lf-0.054
 wiping-0.052
Token in
Token ID287
Feature activation-1.53e-02

Pos logit contributions
ah+0.020
 mate+0.018
ff+0.018
 One+0.018
bar+0.018
Neg logit contributions
we-0.023
bes-0.023
kins-0.021
 wearing-0.021
irling-0.020
Token time
Token ID640
Feature activation+5.06e-03

Pos logit contributions
ah+0.024
 mate+0.022
ff+0.022
bar+0.021
 One+0.021
Neg logit contributions
bes-0.029
we-0.028
kins-0.025
 wearing-0.025
 wiping-0.025
Token.
Token ID13
Feature activation-2.21e-02

Pos logit contributions
bes+0.008
we+0.008
 wearing+0.008
kins+0.007
 wiping+0.007
Neg logit contributions
ah-0.009
 mate-0.008
ff-0.008
bar-0.008
 One-0.008
Token It
Token ID632
Feature activation-4.72e-03

Pos logit contributions
ah+0.033
 mate+0.029
bar+0.028
ff+0.028
pole+0.026
Neg logit contributions
bes-0.045
we-0.043
kins-0.040
 wearing-0.039
 wiping-0.039
Token hit
Token ID2277
Feature activation+6.23e-03

Pos logit contributions
ff+0.009
ah+0.009
 mate+0.009
bar+0.009
upid+0.009
Neg logit contributions
bes-0.006
we-0.006
 wearing-0.005
kins-0.005
lf-0.005
Token But
Token ID887
Feature activation-2.73e-03

Pos logit contributions
ah+0.049
ff+0.044
bar+0.043
 mate+0.043
ow+0.041
Neg logit contributions
bes-0.051
we-0.049
 wiping-0.043
kins-0.043
 wearing-0.042
Token this
Token ID428
Feature activation-3.00e-02

Pos logit contributions
ah+0.004
 mate+0.004
 One+0.004
ff+0.004
bar+0.004
Neg logit contributions
bes-0.005
we-0.005
kins-0.005
irling-0.005
 wiping-0.005
Token time
Token ID640
Feature activation+9.21e-03

Pos logit contributions
ah+0.040
 mate+0.035
 One+0.034
ff+0.034
bar+0.032
Neg logit contributions
bes-0.065
we-0.065
kins-0.058
irling-0.058
 wiping-0.057
Token,
Token ID11
Feature activation-1.05e-02

Pos logit contributions
we+0.016
bes+0.015
 wearing+0.015
kins+0.014
 wiping+0.014
Neg logit contributions
ah-0.015
ff-0.014
bar-0.014
 mate-0.013
pole-0.013
Token let
Token ID1309
Feature activation-3.74e-02

Pos logit contributions
ah+0.015
ff+0.014
 mate+0.013
bar+0.013
 One+0.013
Neg logit contributions
bes-0.020
we-0.020
kins-0.018
 wiping-0.018
 wearing-0.018
Token's
Token ID338
Feature activation-4.95e-02

Pos logit contributions
ff+0.072
 mate+0.071
ah+0.071
bar+0.062
 passenger+0.060
Neg logit contributions
we-0.048
bes-0.045
lf-0.044
 wiping-0.042
 cleaning-0.042
Token not
Token ID407
Feature activation-3.94e-02

Pos logit contributions
ah+0.068
ff+0.062
 mate+0.059
bar+0.058
 One+0.057
Neg logit contributions
bes-0.103
we-0.098
kins-0.091
irling-0.091
 wearing-0.090
Token vanish
Token ID39572
Feature activation-2.60e-02

Pos logit contributions
ah+0.059
ff+0.053
 mate+0.051
bar+0.051
 One+0.049
Neg logit contributions
bes-0.076
we-0.075
 wearing-0.068
 wiping-0.068
kins-0.068
Token.
Token ID13
Feature activation-2.88e-02

Pos logit contributions
ah+0.041
bar+0.037
ff+0.037
 mate+0.036
 One+0.036
Neg logit contributions
we-0.049
bes-0.049
kins-0.045
 wearing-0.042
irling-0.042
Token Let
Token ID3914
Feature activation-1.80e-02

Pos logit contributions
ah+0.052
ff+0.045
ale+0.045
bar+0.044
 mate+0.043
Neg logit contributions
bes-0.049
we-0.045
 wiping-0.040
kins-0.040
 wearing-0.039
Token's
Token ID338
Feature activation-4.54e-02

Pos logit contributions
ff+0.036
 mate+0.035
ah+0.034
bar+0.030
ait+0.030
Neg logit contributions
we-0.026
bes-0.024
 wiping-0.022
lf-0.021
 wrapping-0.020
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
ah+0.001
 mate+0.001
ff+0.001
 One+0.001
bar+0.001
Neg logit contributions
bes-0.001
we-0.001
kins-0.001
irling-0.001
 wiping-0.001
Token\n
Token ID198
Feature activation-1.49e-02

Pos logit contributions
ah+0.018
 mate+0.017
ff+0.016
bar+0.016
ow+0.016
Neg logit contributions
bes-0.024
we-0.023
kins-0.021
 wiping-0.020
irling-0.020
Token\n
Token ID198
Feature activation-1.76e-03

Pos logit contributions
ah+0.022
 mate+0.020
ff+0.020
 One+0.019
bar+0.019
Neg logit contributions
bes-0.030
we-0.029
kins-0.026
irling-0.026
 wiping-0.025
TokenL
Token ID43
Feature activation-1.17e-02

Pos logit contributions
ah+0.003
 mate+0.002
ff+0.002
 One+0.002
bar+0.002
Neg logit contributions
bes-0.003
we-0.003
kins-0.003
irling-0.003
 wiping-0.003
Tokenily
Token ID813
Feature activation+8.23e-03

Pos logit contributions
ah+0.016
 mate+0.015
ff+0.014
 enough+0.014
 One+0.014
Neg logit contributions
bes-0.025
we-0.024
 sudden-0.022
irling-0.022
kins-0.021
Token and
Token ID290
Feature activation-5.27e-02

Pos logit contributions
bes+0.015
we+0.015
kins+0.014
irling+0.013
 wiping+0.013
Neg logit contributions
ah-0.013
 mate-0.012
ff-0.012
 One-0.012
bar-0.011
Token Ben
Token ID3932
Feature activation-3.88e-03

Pos logit contributions
ah+0.068
ff+0.064
 mate+0.062
bar+0.060
pole+0.059
Neg logit contributions
bes-0.111
we-0.109
 wiping-0.100
 wearing-0.100
kins-0.100
Token take
Token ID1011
Feature activation-1.60e-02

Pos logit contributions
ah+0.006
bar+0.006
 mate+0.006
ff+0.005
 One+0.005
Neg logit contributions
bes-0.007
we-0.006
kins-0.006
 wiping-0.006
irling-0.006
Token turns
Token ID4962
Feature activation+1.57e-02

Pos logit contributions
ah+0.026
 mate+0.024
bar+0.023
ff+0.023
pole+0.023
Neg logit contributions
bes-0.030
we-0.029
kins-0.026
 wiping-0.025
 wearing-0.025
Token flying
Token ID7348
Feature activation-8.85e-03

Pos logit contributions
bes+0.023
we+0.022
kins+0.020
irling+0.019
 wiping+0.018
Neg logit contributions
ah-0.031
 mate-0.029
ff-0.029
 One-0.028
bar-0.028
Token their
Token ID511
Feature activation+1.51e-02

Pos logit contributions
ah+0.016
 mate+0.016
bar+0.015
ff+0.015
 One+0.015
Neg logit contributions
we-0.014
bes-0.014
kins-0.012
 wearing-0.011
 wiping-0.011
Token something
Token ID1223
Feature activation-2.12e-02

Pos logit contributions
ah+0.002
ff+0.001
 mate+0.001
 One+0.001
bar+0.001
Neg logit contributions
bes-0.002
we-0.002
kins-0.002
 wearing-0.002
 wiping-0.002
Token else
Token ID2073
Feature activation-7.64e-05

Pos logit contributions
ah+0.043
ff+0.040
bar+0.038
 mate+0.038
 One+0.038
Neg logit contributions
bes-0.031
we-0.031
kins-0.027
 wearing-0.026
irling-0.026
Token."
Token ID526
Feature activation-1.45e-02

Pos logit contributions
ah+0.000
ff+0.000
bar+0.000
 mate+0.000
 One+0.000
Neg logit contributions
bes-0.000
we-0.000
kins-0.000
 wearing-0.000
 wiping-0.000
Token\n
Token ID198
Feature activation-2.47e-02

Pos logit contributions
ah+0.019
 One+0.018
 mate+0.018
ff+0.018
bar+0.017
Neg logit contributions
bes-0.031
we-0.030
kins-0.027
irling-0.027
 wiping-0.026
Token\n
Token ID198
Feature activation-1.05e-02

Pos logit contributions
ah+0.034
 mate+0.031
 One+0.031
ff+0.031
bar+0.030
Neg logit contributions
bes-0.051
we-0.050
kins-0.045
irling-0.044
 wiping-0.043
TokenS
Token ID50
Feature activation-5.06e-02

Pos logit contributions
ah+0.019
 mate+0.018
ff+0.018
ow+0.017
bar+0.017
Neg logit contributions
bes-0.018
we-0.017
kins-0.015
irling-0.015
 wiping-0.014
Tokenara
Token ID3301
Feature activation+3.13e-03

Pos logit contributions
 mate+0.098
bar+0.092
ah+0.090
pole+0.090
 One+0.088
Neg logit contributions
we-0.074
bes-0.074
kins-0.067
irling-0.062
 wiping-0.059
Token and
Token ID290
Feature activation-5.75e-02

Pos logit contributions
bes+0.006
we+0.006
irling+0.005
kins+0.005
 wiping+0.005
Neg logit contributions
ah-0.005
ff-0.005
 mate-0.004
 One-0.004
ow-0.004
Token Tom
Token ID4186
Feature activation-1.58e-02

Pos logit contributions
ff+0.094
ah+0.089
pole+0.087
rol+0.085
 mate+0.083
Neg logit contributions
bes-0.094
 reading-0.094
 wearing-0.093
 wiping-0.089
 basketball-0.089
Token were
Token ID547
Feature activation-5.71e-03

Pos logit contributions
ah+0.032
 mate+0.030
bar+0.029
 One+0.029
ff+0.029
Neg logit contributions
bes-0.021
we-0.020
 wiping-0.019
kins-0.019
 wearing-0.018
Token very
Token ID845
Feature activation-3.03e-03

Pos logit contributions
ah+0.009
 mate+0.009
 One+0.008
ff+0.008
ow+0.008
Neg logit contributions
bes-0.010
we-0.010
kins-0.009
irling-0.009
 wiping-0.008
Token home
Token ID1363
Feature activation+1.47e-02

Pos logit contributions
bes+0.012
we+0.011
kins+0.011
lf+0.011
irling+0.011
Neg logit contributions
ow-0.004
ah-0.004
 One-0.003
ff-0.003
 enough-0.003
Token.
Token ID13
Feature activation-1.10e-03

Pos logit contributions
bes+0.032
we+0.032
irling+0.029
kins+0.029
lf+0.027
Neg logit contributions
 mate-0.016
ah-0.016
 One-0.015
ow-0.015
 enough-0.014
Token<|endoftext|>
Token ID50256
Feature activation+2.01e-02

Pos logit contributions
ah+0.001
 One+0.001
 mate+0.001
ff+0.001
bar+0.001
Neg logit contributions
bes-0.002
we-0.002
kins-0.002
irling-0.002
 wiping-0.002
TokenOnce
Token ID7454
Feature activation-8.87e-03

Pos logit contributions
bes+0.048
we+0.045
irling+0.044
kins+0.043
 wiping+0.041
Neg logit contributions
ah-0.022
 One-0.021
ff-0.019
 mate-0.018
ow-0.017
Token upon
Token ID2402
Feature activation-4.22e-02

Pos logit contributions
ah+0.014
 mate+0.013
bar+0.013
ff+0.013
ow+0.012
Neg logit contributions
we-0.015
bes-0.015
kins-0.015
 wearing-0.014
irling-0.014
Token a
Token ID257
Feature activation-4.67e-02

Pos logit contributions
 One+0.046
ah+0.043
 mate+0.040
ff+0.038
 go+0.037
Neg logit contributions
bes-0.100
we-0.097
irling-0.094
kins-0.093
bing-0.087
Token time
Token ID640
Feature activation+2.35e-03

Pos logit contributions
ah+0.062
 mate+0.057
bar+0.056
ff+0.056
 One+0.054
Neg logit contributions
bes-0.098
we-0.095
kins-0.087
irling-0.086
 wiping-0.084
Token,
Token ID11
Feature activation-3.70e-03

Pos logit contributions
bes+0.005
we+0.005
kins+0.005
irling+0.005
 wiping+0.005
Neg logit contributions
ah-0.003
 mate-0.003
 One-0.002
ff-0.002
ow-0.002
Token there
Token ID612
Feature activation-4.55e-03

Pos logit contributions
 One+0.006
ah+0.005
 mate+0.005
ow+0.005
ff+0.005
Neg logit contributions
bes-0.008
we-0.007
keeping-0.007
irling-0.006
kins-0.006
Token was
Token ID373
Feature activation-1.05e-02

Pos logit contributions
ah+0.008
 One+0.008
ff+0.008
 mate+0.008
ow+0.007
Neg logit contributions
bes-0.007
we-0.007
kins-0.006
irling-0.006
 wiping-0.006
Token a
Token ID257
Feature activation-2.35e-02

Pos logit contributions
 One+0.011
ah+0.009
 enough+0.009
 mate+0.008
ff+0.008
Neg logit contributions
we-0.028
bes-0.028
kins-0.026
irling-0.025
keeping-0.023
Token!"
Token ID2474
Feature activation-1.74e-02

Pos logit contributions
bes+0.047
we+0.046
kins+0.042
 wearing+0.042
irling+0.041
Neg logit contributions
ah-0.042
 One-0.041
 mate-0.040
ff-0.040
bar-0.039
Token\n
Token ID198
Feature activation-1.09e-02

Pos logit contributions
ah+0.019
 mate+0.017
ff+0.016
bar+0.016
 One+0.015
Neg logit contributions
bes-0.042
we-0.041
kins-0.038
irling-0.037
 wiping-0.036
Token\n
Token ID198
Feature activation+6.59e-04

Pos logit contributions
ah+0.014
 One+0.013
ff+0.012
 mate+0.012
bar+0.012
Neg logit contributions
bes-0.024
we-0.023
kins-0.021
irling-0.021
 wiping-0.020
TokenL
Token ID43
Feature activation-1.65e-02

Pos logit contributions
bes+0.001
we+0.001
kins+0.001
irling+0.001
 wiping+0.001
Neg logit contributions
ah-0.001
 mate-0.001
 One-0.001
ff-0.001
bar-0.001
Tokenily
Token ID813
Feature activation+4.94e-03

Pos logit contributions
ah+0.027
 mate+0.024
ff+0.024
 One+0.024
ow+0.023
Neg logit contributions
bes-0.031
we-0.030
kins-0.027
irling-0.027
 wiping-0.026
Token and
Token ID290
Feature activation-4.88e-02

Pos logit contributions
bes+0.009
we+0.009
kins+0.008
irling+0.008
 wiping+0.008
Neg logit contributions
ah-0.008
 mate-0.008
ff-0.007
bar-0.007
 One-0.007
Token Tom
Token ID4186
Feature activation-8.05e-03

Pos logit contributions
ah+0.062
 mate+0.059
ff+0.059
bar+0.054
pole+0.051
Neg logit contributions
bes-0.106
we-0.103
 wearing-0.094
 wiping-0.094
kins-0.093
Token hug
Token ID16225
Feature activation-2.47e-03

Pos logit contributions
ah+0.014
bar+0.013
 mate+0.013
ff+0.012
pole+0.012
Neg logit contributions
bes-0.013
we-0.012
 wiping-0.011
kins-0.011
 wearing-0.011
Token and
Token ID290
Feature activation-1.68e-02

Pos logit contributions
ah+0.004
 mate+0.004
bar+0.004
ff+0.004
 One+0.003
Neg logit contributions
bes-0.005
we-0.004
kins-0.004
 wiping-0.004
 wearing-0.004
Token laugh
Token ID6487
Feature activation-8.12e-03

Pos logit contributions
ah+0.020
 mate+0.018
ff+0.018
 One+0.018
bar+0.017
Neg logit contributions
bes-0.038
we-0.037
kins-0.034
irling-0.034
 wiping-0.033
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
ah+0.016
bar+0.015
ff+0.014
 mate+0.014
ow+0.014
Neg logit contributions
bes-0.012
we-0.012
 wearing-0.011
kins-0.011
 wrapping-0.011