TOP ACTIVATIONS
MAX = 9.244

and listening . <|endoftext|> M omm a and Billy were walking
we can get more ch ocol ates later . For now
cakes and cookies and ch ocol ates and ice cream .
the leaf there were little spec ks of gold ! She
, he noticed a small spr out coming out of the
soon nothing but a small spe ck in the sky .
park . When the ch ocol ates were all gone ,
John chose some carrots and cuc umbers to take home .
inside ! There were ch ocol ates , cookies , and
It was sweet , ch ocol at y and warm ,
their faces . <|endoftext|> M omm a and Ron ny were
high they were just tiny spec ks in the sky .
king ate too many ch ocol ates and his tum my
?" L ily spr ays Ben with the water
reward . It was ch ocol ates and sweets !
bled up all the ch ocol ates , and she was
it . M omm a and Tom went to
When he does , she spr ays him with water .
. M omm a and Billy stayed by
mustache !" M omm a replied , " Well

MIN ACTIVATIONS
MIN = -4.984

with her hands on her hips . She said ,
stuck beneath the j ee pt he same je ep that
with her hands on her hips . Big Bear
truck because of your bad behavior . Now you have to
rubbed his nose with her sleeve . She said , "
. You are hurting my feelings . You are making your
again because he missed his friend . But then , his
He wanted to play with Big Red . He reached for
of her hands on her hips . She shouted , "
away his tears with his sleeve . " Ok , Mom
put his hands on his hips . Then
his sp ade with his friend . They both played in
had something special up her sleeve , but he wasn 't
. " He hurt my feelings ." Tim my
would love to play with Big gie . He is so
about not sharing with his friend . The bad ending was
had a trick up his sleeve . <|endoftext|> Once upon a
had hurt their dad 's feelings . They said sorry to
of Tim my 's bad behavior . <|endoftext|> Once upon a
off her neck with his sleeve . Lily was confused and

INTERVAL 5.687 - 9.244
CONTAINS 0.011%

. Sara runs away and spr ays him again . They
soon nothing but a small spe ck in the sky .
reward . It was ch ocol ates and sweets !
It was sweet , ch ocol at y and warm ,
we can get more ch ocol ates later . For now

INTERVAL 2.130 - 5.687
CONTAINS 2.382%

new shelves . He filled up the shelves with markers ,
tight . M addy never worried about losing her
it , stretching it and even making shapes out of it
street . She has a pink dress and a red hat
" But I like the loud car ," Lily

INTERVAL -1.427 - 2.130
CONTAINS 90.868%

couldn 't wait to see all the cool things that were
. They talked and talked and laughed and laughed
bird ! The bird fl apped its wings and
the whole park . It was green and bright and beautiful
she didn 't have anything to play with now . But

INTERVAL -4.984 - -1.427
CONTAINS 6.739%

When Amy put on the dress , she accidentally
the frog was very troubled . His fountain had broken and
tree . Lily loved to play outside in the sun .
was very excited to see all the animals .
had lost the fight . The boys never
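
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. This is inferred from the printed numbers; the dump itself does not say how the bins were chosen. The sketch below reproduces the edges under that assumption. The function name `interval_edges` and the default of four bins are illustrative, not taken from the dashboard's code.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals.

    Assumption: the dashboard splits [MIN, MAX] into equal-width intervals.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as printed in the dump.
edges = interval_edges(-4.984, 9.244)
print(edges)  # [-4.984, -1.427, 2.13, 5.687, 9.244]

# The CONTAINS percentages printed for the four intervals, highest interval first.
shares = [0.011, 2.382, 90.868, 6.739]
print(round(sum(shares), 3))  # 100.0
```

Under this reading each interval is 3.557 wide and the four CONTAINS figures sum to 100%. About 91% of token activations fall in the interval around zero, and only 0.011% reach the top interval.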
Token and
Token ID290
Feature activation+2.74e+00

Pos logit contributions
aw+0.079
-+0.076
 at+0.076
 by+0.071
up+0.068
Neg logit contributions
stack-0.095
-0.092
 hips-0.088
ses-0.087
gency-0.086
Token listening
Token ID8680
Feature activation+9.05e-01

Pos logit contributions
aw+0.070
-+0.068
 at+0.066
 by+0.061
up+0.060
Neg logit contributions
stack-0.095
-0.090
 hips-0.088
ses-0.087
gency-0.086
Token.
Token ID13
Feature activation+5.93e-01

Pos logit contributions
aw+0.016
 at+0.015
-+0.014
 by+0.013
up+0.012
Neg logit contributions
stack-0.038
-0.037
 hips-0.037
gency-0.036
alk-0.036
Token<|endoftext|>
Token ID50256
Feature activation+9.79e-01

Pos logit contributions
 at+0.018
aw+0.018
 by+0.017
-+0.016
 with+0.014
Neg logit contributions
gency-0.017
ivery-0.017
drivers-0.017
ses-0.017
stack-0.017
TokenM
Token ID44
Feature activation+1.31e+00

Pos logit contributions
 at+0.028
aw+0.028
-+0.027
 by+0.027
up+0.022
Neg logit contributions
stack-0.032
 spine-0.031
 hips-0.030
-0.030
ses-0.030
Tokenomm
Token ID2002
Feature activation+6.42e+00

Pos logit contributions
 at+0.041
aw+0.040
-+0.040
 by+0.038
up+0.035
Neg logit contributions
stack-0.039
-0.038
drivers-0.037
gency-0.037
ses-0.037
Tokena
Token ID64
Feature activation+2.93e+00

Pos logit contributions
 at+0.150
aw+0.148
-+0.141
 by+0.135
up+0.120
Neg logit contributions
stack-0.231
-0.226
drivers-0.217
gency-0.216
 hips-0.215
Token and
Token ID290
Feature activation+7.15e-01

Pos logit contributions
aw+0.059
 at+0.057
-+0.056
 by+0.055
up+0.053
Neg logit contributions
-0.108
stack-0.108
 hips-0.102
gency-0.100
alk-0.098
Token Billy
Token ID15890
Feature activation-8.67e-01

Pos logit contributions
aw+0.022
 at+0.021
-+0.021
 by+0.020
up+0.019
Neg logit contributions
stack-0.021
-0.021
gency-0.020
 hips-0.020
ses-0.020
Token were
Token ID547
Feature activation-8.08e-01

Pos logit contributions
stack+0.032
+0.032
gency+0.031
 hips+0.030
ses+0.030
Neg logit contributions
aw-0.019
 at-0.019
--0.019
 by-0.018
up-0.016
Token walking
Token ID6155
Feature activation+6.79e-01

Pos logit contributions
+0.026
stack+0.026
 hips+0.024
urance+0.024
gency+0.024
Neg logit contributions
aw-0.022
--0.021
 at-0.021
 by-0.020
up-0.020
Token we
Token ID356
Feature activation+2.61e-01

Pos logit contributions
stack+0.018
 hips+0.017
't+0.016
urance+0.015
 Johnson+0.015
Neg logit contributions
aw-0.024
--0.021
up-0.020
 at-0.020
 by-0.018
Token can
Token ID460
Feature activation-2.79e-01

Pos logit contributions
aw+0.007
-+0.007
 at+0.006
 by+0.006
up+0.006
Neg logit contributions
-0.009
stack-0.009
gency-0.008
urance-0.008
 hips-0.008
Token get
Token ID651
Feature activation+1.75e+00

Pos logit contributions
+0.010
stack+0.010
 hips+0.009
gency+0.009
urance+0.009
Neg logit contributions
aw-0.007
 at-0.007
--0.007
 by-0.007
up-0.006
Token more
Token ID517
Feature activation+1.52e+00

Pos logit contributions
aw+0.049
-+0.044
 at+0.042
up+0.039
 by+0.039
Neg logit contributions
-0.051
stack-0.050
gency-0.050
 hips-0.049
drivers-0.049
Token ch
Token ID442
Feature activation+1.20e+00

Pos logit contributions
aw+0.044
 at+0.035
 by+0.034
up+0.034
-+0.033
Neg logit contributions
 hips-0.036
drivers-0.036
 hind-0.035
 slope-0.034
 banks-0.034
Tokenocol
Token ID4668
Feature activation+6.40e+00

Pos logit contributions
aw+0.038
 at+0.037
 by+0.037
-+0.034
up+0.033
Neg logit contributions
-0.032
alk-0.031
drivers-0.030
 hips-0.029
 spine-0.028
Tokenates
Token ID689
Feature activation+1.28e+00

Pos logit contributions
 at+0.240
aw+0.236
-+0.231
 by+0.225
up+0.213
Neg logit contributions
stack-0.145
-0.144
drivers-0.132
 hips-0.130
ses-0.129
Token later
Token ID1568
Feature activation-7.34e-01

Pos logit contributions
aw+0.032
-+0.029
 at+0.029
 by+0.027
up+0.026
Neg logit contributions
stack-0.045
-0.043
 hips-0.042
alk-0.040
drivers-0.040
Token.
Token ID13
Feature activation+5.75e-01

Pos logit contributions
stack+0.021
+0.020
ses+0.019
 hips+0.019
gency+0.019
Neg logit contributions
 at-0.024
aw-0.023
--0.023
 by-0.022
up-0.020
Token For
Token ID1114
Feature activation+7.76e-01

Pos logit contributions
aw+0.020
-+0.020
 at+0.020
 by+0.018
up+0.018
Neg logit contributions
-0.016
stack-0.015
gency-0.014
 hips-0.014
ses-0.014
Token now
Token ID783
Feature activation+8.97e-01

Pos logit contributions
 at+0.018
-+0.017
aw+0.017
 by+0.016
up+0.013
Neg logit contributions
stack-0.030
-0.030
 spine-0.029
ses-0.028
drivers-0.028
Token cakes
Token ID30849
Feature activation+3.39e+00

Pos logit contributions
stack+0.028
+0.028
 hips+0.026
gency+0.026
ses+0.026
Neg logit contributions
 at-0.026
aw-0.026
--0.025
 by-0.024
up-0.022
Token and
Token ID290
Feature activation+4.08e-01

Pos logit contributions
aw+0.084
up+0.076
-+0.076
 at+0.071
 by+0.069
Neg logit contributions
-0.117
stack-0.115
drivers-0.106
 hips-0.103
gy-0.100
Token cookies
Token ID14746
Feature activation+1.85e+00

Pos logit contributions
 at+0.010
aw+0.010
-+0.010
 by+0.009
up+0.008
Neg logit contributions
stack-0.015
-0.015
 hips-0.014
gency-0.014
ses-0.014
Token and
Token ID290
Feature activation+3.55e-01

Pos logit contributions
aw+0.039
-+0.035
up+0.034
 at+0.032
 by+0.030
Neg logit contributions
stack-0.073
-0.072
drivers-0.067
 hips-0.066
ses-0.064
Token ch
Token ID442
Feature activation+2.66e-01

Pos logit contributions
 at+0.009
aw+0.009
-+0.009
 by+0.008
up+0.008
Neg logit contributions
stack-0.012
-0.012
drivers-0.011
 hips-0.011
ses-0.011
Tokenocol
Token ID4668
Feature activation+6.31e+00

Pos logit contributions
 at+0.010
 by+0.010
aw+0.009
-+0.009
up+0.009
Neg logit contributions
alk-0.006
-0.006
ses-0.005
drivers-0.005
urance-0.005
Tokenates
Token ID689
Feature activation+4.09e-01

Pos logit contributions
 at+0.231
aw+0.226
-+0.223
 by+0.219
up+0.205
Neg logit contributions
-0.147
stack-0.146
drivers-0.136
ses-0.135
 hips-0.133
Token and
Token ID290
Feature activation-2.29e-03

Pos logit contributions
aw+0.009
-+0.008
 at+0.008
 by+0.008
up+0.007
Neg logit contributions
stack-0.016
-0.015
 hips-0.015
drivers-0.014
alk-0.014
Token ice
Token ID4771
Feature activation+1.83e+00

Pos logit contributions
stack+0.000
+0.000
ses+0.000
 hips+0.000
drivers+0.000
Neg logit contributions
aw-0.000
 at-0.000
--0.000
 by-0.000
up-0.000
Token cream
Token ID8566
Feature activation+1.19e+00

Pos logit contributions
aw+0.041
 at+0.040
 bra+0.036
 by+0.036
up+0.033
Neg logit contributions
drivers-0.063
-0.062
gy-0.060
stack-0.059
 hips-0.057
Token.
Token ID13
Feature activation+8.67e-01

Pos logit contributions
aw+0.031
up+0.028
-+0.028
 at+0.027
 by+0.026
Neg logit contributions
stack-0.039
-0.039
drivers-0.036
 hips-0.034
sed-0.034
Token the
Token ID262
Feature activation+3.83e-01

Pos logit contributions
stack+0.020
+0.019
 hips+0.018
ses+0.018
gency+0.018
Neg logit contributions
 at-0.010
aw-0.010
--0.010
 by-0.009
up-0.008
Token leaf
Token ID12835
Feature activation+5.40e-01

Pos logit contributions
 at+0.012
-+0.011
 by+0.011
aw+0.011
up+0.009
Neg logit contributions
stack-0.012
-0.012
gency-0.011
urance-0.011
ses-0.011
Token there
Token ID612
Feature activation-5.35e-01

Pos logit contributions
aw+0.012
 at+0.012
-+0.012
 by+0.011
up+0.010
Neg logit contributions
stack-0.020
-0.020
 hips-0.019
alk-0.018
drivers-0.018
Token were
Token ID547
Feature activation-9.96e-01

Pos logit contributions
stack+0.013
+0.013
 hips+0.011
alk+0.011
urance+0.011
Neg logit contributions
aw-0.019
--0.018
 at-0.018
 by-0.017
up-0.017
Token little
Token ID1310
Feature activation-7.47e-01

Pos logit contributions
+0.031
stack+0.030
gency+0.029
 hips+0.028
ses+0.028
Neg logit contributions
 at-0.030
aw-0.030
--0.029
 by-0.028
up-0.026
Token spec
Token ID1020
Feature activation+6.19e+00

Pos logit contributions
+0.021
stack+0.021
ses+0.020
 slope+0.020
drivers+0.020
Neg logit contributions
aw-0.023
 at-0.022
--0.022
up-0.020
 by-0.020
Tokenks
Token ID591
Feature activation-9.11e-01

Pos logit contributions
 at+0.155
aw+0.153
-+0.149
 by+0.141
up+0.126
Neg logit contributions
-0.221
stack-0.221
gency-0.208
 hips-0.208
ses-0.206
Token of
Token ID286
Feature activation+5.06e-01

Pos logit contributions
urance+0.032
+0.032
 hips+0.032
stack+0.032
drivers+0.031
Neg logit contributions
aw-0.015
--0.014
up-0.013
 by-0.013
 at-0.012
Token gold
Token ID3869
Feature activation+1.02e+00

Pos logit contributions
-+0.013
 at+0.013
aw+0.013
 by+0.012
up+0.011
Neg logit contributions
-0.018
stack-0.018
ses-0.017
 hips-0.016
sed-0.016
Token!
Token ID0
Feature activation+1.35e-01

Pos logit contributions
aw+0.023
up+0.021
 at+0.021
 by+0.020
-+0.019
Neg logit contributions
drivers-0.032
stack-0.032
-0.032
gy-0.031
 slope-0.030
Token She
Token ID1375
Feature activation+6.15e-01

Pos logit contributions
 at+0.004
 by+0.004
aw+0.004
-+0.004
 with+0.003
Neg logit contributions
gency-0.004
stack-0.004
arse-0.004
alk-0.004
 spine-0.004
Token,
Token ID11
Feature activation+3.08e-01

Pos logit contributions
aw+0.076
up+0.066
-+0.064
 bra+0.059
 at+0.059
Neg logit contributions
-0.078
stack-0.073
alk-0.069
drivers-0.069
't-0.066
Token he
Token ID339
Feature activation+2.68e-01

Pos logit contributions
aw+0.010
-+0.009
up+0.009
 at+0.008
 by+0.008
Neg logit contributions
stack-0.009
-0.009
 slope-0.008
 hips-0.008
gency-0.008
Token noticed
Token ID6810
Feature activation+5.55e-01

Pos logit contributions
aw+0.008
-+0.007
 by+0.007
 at+0.007
up+0.007
Neg logit contributions
-0.007
stack-0.007
 hips-0.007
sed-0.007
alk-0.007
Token a
Token ID257
Feature activation+8.26e-01

Pos logit contributions
 at+0.011
aw+0.011
-+0.011
 by+0.010
up+0.009
Neg logit contributions
stack-0.022
-0.021
gency-0.020
 hips-0.020
drivers-0.020
Token small
Token ID1402
Feature activation+5.75e-01

Pos logit contributions
aw+0.025
 at+0.025
-+0.023
 by+0.023
up+0.021
Neg logit contributions
-0.025
stack-0.025
 hips-0.023
gency-0.023
ses-0.023
Token spr
Token ID7500
Feature activation+6.19e+00

Pos logit contributions
aw+0.020
 at+0.018
up+0.018
-+0.017
 by+0.017
Neg logit contributions
-0.015
stack-0.015
 slope-0.014
alk-0.014
ses-0.014
Tokenout
Token ID448
Feature activation+1.31e+00

Pos logit contributions
 at+0.140
-+0.137
 by+0.131
aw+0.128
up+0.112
Neg logit contributions
stack-0.206
-0.206
alk-0.205
ses-0.198
 spine-0.197
Token coming
Token ID2406
Feature activation+2.09e+00

Pos logit contributions
aw+0.031
 at+0.030
-+0.030
up+0.028
 by+0.027
Neg logit contributions
stack-0.046
-0.046
alk-0.042
 slope-0.042
 hips-0.042
Token out
Token ID503
Feature activation+2.46e+00

Pos logit contributions
aw+0.045
up+0.036
 at+0.036
-+0.036
 by+0.029
Neg logit contributions
stack-0.080
gy-0.076
 slope-0.074
alk-0.074
ifully-0.074
Token of
Token ID286
Feature activation+1.89e+00

Pos logit contributions
aw+0.040
-+0.033
up+0.031
 at+0.030
 by+0.024
Neg logit contributions
stack-0.100
alk-0.100
-0.098
 hips-0.097
ifully-0.096
Token the
Token ID262
Feature activation+2.73e-01

Pos logit contributions
 at+0.024
 by+0.022
-+0.019
aw+0.016
 with+0.012
Neg logit contributions
drivers-0.092
-0.091
stack-0.091
urance-0.091
ivery-0.091
Token soon
Token ID2582
Feature activation-6.32e-02

Pos logit contributions
stack+0.006
+0.006
 hips+0.006
ses+0.006
sed+0.005
Neg logit contributions
aw-0.005
--0.005
 at-0.005
up-0.004
 by-0.004
Token nothing
Token ID2147
Feature activation+1.16e+00

Pos logit contributions
 spine+0.002
stack+0.002
 hips+0.002
+0.002
sed+0.002
Neg logit contributions
aw-0.002
up-0.002
--0.002
 at-0.001
 by-0.001
Token but
Token ID475
Feature activation+7.61e-01

Pos logit contributions
aw+0.033
-+0.027
 at+0.027
up+0.027
 by+0.025
Neg logit contributions
stack-0.034
sed-0.033
ses-0.033
-0.032
 spine-0.032
Token a
Token ID257
Feature activation-7.14e-02

Pos logit contributions
aw+0.023
-+0.020
up+0.020
 at+0.018
ins+0.017
Neg logit contributions
stack-0.022
-0.022
ses-0.020
 hips-0.020
sed-0.020
Token small
Token ID1402
Feature activation-1.96e-01

Pos logit contributions
+0.002
stack+0.002
ses+0.002
 hips+0.002
gency+0.002
Neg logit contributions
aw-0.002
 at-0.002
--0.002
 by-0.002
up-0.002
Token spe
Token ID693
Feature activation+6.14e+00

Pos logit contributions
stack+0.006
+0.006
ses+0.006
 spine+0.006
alk+0.006
Neg logit contributions
aw-0.005
 at-0.005
--0.005
up-0.005
 by-0.005
Tokenck
Token ID694
Feature activation+2.55e+00

Pos logit contributions
-+0.150
aw+0.148
 at+0.147
 by+0.136
up+0.121
Neg logit contributions
-0.230
stack-0.225
 hips-0.214
drivers-0.214
gency-0.213
Token in
Token ID287
Feature activation-2.06e+00

Pos logit contributions
aw+0.054
 at+0.052
-+0.051
 by+0.049
up+0.048
Neg logit contributions
stack-0.093
-0.090
alk-0.087
ses-0.083
sed-0.083
Token the
Token ID262
Feature activation-3.73e-01

Pos logit contributions
stack+0.077
+0.077
 hips+0.072
ses+0.072
gency+0.072
Neg logit contributions
 at-0.047
aw-0.047
--0.046
 by-0.043
up-0.039
Token sky
Token ID6766
Feature activation+1.14e-01

Pos logit contributions
stack+0.010
+0.010
 spine+0.009
ses+0.009
 hips+0.009
Neg logit contributions
 at-0.013
aw-0.012
--0.012
 by-0.012
up-0.011
Token.
Token ID13
Feature activation-1.01e-01

Pos logit contributions
aw+0.003
-+0.003
up+0.003
 at+0.003
 by+0.002
Neg logit contributions
stack-0.004
-0.004
 hips-0.003
alk-0.003
gy-0.003
Token park
Token ID3952
Feature activation-8.71e-01

Pos logit contributions
 at+0.001
-+0.001
aw+0.001
 by+0.001
up+0.000
Neg logit contributions
-0.000
stack-0.000
 slope-0.000
gency-0.000
ses-0.000
Token.
Token ID13
Feature activation-5.45e-02

Pos logit contributions
stack+0.032
+0.031
 hips+0.030
drivers+0.029
urance+0.029
Neg logit contributions
aw-0.020
 at-0.019
--0.019
up-0.018
 by-0.018
Token When
Token ID1649
Feature activation+1.69e+00

Pos logit contributions
stack+0.002
+0.002
gency+0.001
 hips+0.001
ses+0.001
Neg logit contributions
 at-0.002
aw-0.002
--0.002
 by-0.002
up-0.002
Token the
Token ID262
Feature activation-3.33e-02

Pos logit contributions
aw+0.055
-+0.053
up+0.053
 at+0.049
 by+0.047
Neg logit contributions
-0.044
stack-0.044
 slope-0.040
't-0.039
gency-0.038
Token ch
Token ID442
Feature activation+8.64e-01

Pos logit contributions
+0.001
stack+0.001
 hips+0.001
ses+0.001
gency+0.001
Neg logit contributions
 at-0.001
aw-0.001
--0.001
 by-0.001
up-0.001
Tokenocol
Token ID4668
Feature activation+6.11e+00

Pos logit contributions
 at+0.027
aw+0.027
 by+0.026
-+0.025
up+0.024
Neg logit contributions
-0.025
stack-0.023
alk-0.023
ses-0.023
sed-0.022
Tokenates
Token ID689
Feature activation+8.55e-01

Pos logit contributions
 at+0.213
aw+0.212
-+0.209
 by+0.200
up+0.186
Neg logit contributions
stack-0.156
-0.155
 hips-0.143
gency-0.143
ses-0.142
Token were
Token ID547
Feature activation-6.13e-01

Pos logit contributions
aw+0.023
 at+0.023
-+0.022
 by+0.021
up+0.020
Neg logit contributions
-0.028
stack-0.028
 hips-0.026
alk-0.025
ses-0.025
Token all
Token ID477
Feature activation-1.19e+00

Pos logit contributions
+0.020
stack+0.020
 hips+0.019
urance+0.018
sed+0.018
Neg logit contributions
aw-0.016
--0.016
 at-0.015
 by-0.015
up-0.014
Token gone
Token ID3750
Feature activation-4.09e-01

Pos logit contributions
stack+0.038
+0.038
 hips+0.036
ses+0.034
alk+0.034
Neg logit contributions
aw-0.034
--0.032
 at-0.031
up-0.028
 by-0.028
Token,
Token ID11
Feature activation-6.84e-01

Pos logit contributions
+0.016
stack+0.016
 hips+0.015
gency+0.015
ses+0.015
Neg logit contributions
aw-0.009
 at-0.008
--0.008
 by-0.008
up-0.007
Token John
Token ID1757
Feature activation+2.07e+00

Pos logit contributions
 at+0.020
aw+0.020
-+0.019
 by+0.018
up+0.017
Neg logit contributions
stack-0.021
-0.020
 hips-0.019
gency-0.019
ses-0.019
Token chose
Token ID7690
Feature activation+1.99e+00

Pos logit contributions
aw+0.050
 at+0.049
-+0.048
 by+0.045
up+0.042
Neg logit contributions
stack-0.073
-0.072
 hips-0.069
gency-0.068
urance-0.067
Token some
Token ID617
Feature activation+2.51e+00

Pos logit contributions
aw+0.045
 at+0.043
-+0.043
 by+0.038
up+0.038
Neg logit contributions
stack-0.072
-0.070
 hips-0.068
drivers-0.067
gency-0.066
Token carrots
Token ID34397
Feature activation+2.50e+00

Pos logit contributions
aw+0.065
 at+0.064
-+0.062
 by+0.058
up+0.055
Neg logit contributions
-0.082
stack-0.081
 hips-0.077
alk-0.076
drivers-0.076
Token and
Token ID290
Feature activation+1.00e+00

Pos logit contributions
 at+0.041
aw+0.040
-+0.039
 by+0.036
up+0.030
Neg logit contributions
stack-0.110
-0.110
 hips-0.105
gency-0.104
ses-0.104
Token cuc
Token ID38421
Feature activation+6.11e+00

Pos logit contributions
aw+0.027
-+0.026
 at+0.026
 by+0.025
up+0.024
Neg logit contributions
stack-0.033
-0.032
 hips-0.031
ses-0.030
drivers-0.030
Tokenumbers
Token ID17024
Feature activation+2.18e+00

Pos logit contributions
 at+0.191
-+0.191
aw+0.188
 by+0.186
up+0.165
Neg logit contributions
-0.171
stack-0.155
drivers-0.154
alk-0.146
gency-0.143
Token to
Token ID284
Feature activation+6.41e-02

Pos logit contributions
aw+0.045
 at+0.044
-+0.043
 by+0.040
up+0.036
Neg logit contributions
stack-0.085
-0.084
 hips-0.080
drivers-0.080
alk-0.079
Token take
Token ID1011
Feature activation+6.17e-01

Pos logit contributions
 at+0.002
aw+0.001
-+0.001
 by+0.001
up+0.001
Neg logit contributions
-0.002
stack-0.002
ses-0.002
gency-0.002
 hips-0.002
Token home
Token ID1363
Feature activation+4.69e-01

Pos logit contributions
aw+0.015
 at+0.015
-+0.014
 by+0.014
up+0.013
Neg logit contributions
-0.020
stack-0.020
drivers-0.019
 hips-0.019
alk-0.019
Token.
Token ID13
Feature activation+3.14e-01

Pos logit contributions
aw+0.010
 at+0.009
-+0.009
 by+0.008
up+0.008
Neg logit contributions
-0.018
stack-0.017
 hips-0.017
drivers-0.017
gy-0.016
Token inside
Token ID2641
Feature activation-6.12e-01

Pos logit contributions
aw+0.019
-+0.016
up+0.016
 at+0.015
 bra+0.014
Neg logit contributions
stack-0.023
-0.022
 hips-0.021
drivers-0.021
gency-0.020
Token!
Token ID0
Feature activation+1.51e-01

Pos logit contributions
stack+0.021
+0.020
 hips+0.019
gency+0.018
alk+0.018
Neg logit contributions
aw-0.016
--0.015
 at-0.014
up-0.014
 by-0.012
Token There
Token ID1318
Feature activation-3.17e-01

Pos logit contributions
 at+0.005
aw+0.005
-+0.005
 by+0.005
up+0.004
Neg logit contributions
stack-0.004
-0.004
gency-0.004
ses-0.004
drivers-0.004
Token were
Token ID547
Feature activation-8.00e-01

Pos logit contributions
stack+0.008
+0.008
 hips+0.007
drivers+0.007
alk+0.007
Neg logit contributions
aw-0.011
--0.011
 at-0.011
 by-0.010
up-0.010
Token ch
Token ID442
Feature activation+5.82e-01

Pos logit contributions
stack+0.023
+0.022
drivers+0.022
 hips+0.021
ses+0.021
Neg logit contributions
--0.024
 at-0.024
aw-0.024
 by-0.022
up-0.020
Tokenocol
Token ID4668
Feature activation+6.06e+00

Pos logit contributions
 at+0.018
aw+0.018
 by+0.018
-+0.017
up+0.016
Neg logit contributions
-0.016
alk-0.015
drivers-0.015
 hips-0.014
stack-0.014
Tokenates
Token ID689
Feature activation+7.49e-01

Pos logit contributions
 at+0.207
aw+0.206
-+0.204
 by+0.195
up+0.180
Neg logit contributions
stack-0.159
-0.157
ses-0.145
gency-0.145
 hips-0.145
Token,
Token ID11
Feature activation+7.33e-02

Pos logit contributions
aw+0.017
 at+0.015
-+0.015
 by+0.014
up+0.014
Neg logit contributions
stack-0.028
-0.028
 hips-0.027
drivers-0.026
gy-0.026
Token cookies
Token ID14746
Feature activation+1.85e+00

Pos logit contributions
aw+0.002
-+0.002
 at+0.002
 by+0.002
up+0.002
Neg logit contributions
stack-0.002
-0.002
 hips-0.002
drivers-0.002
ses-0.002
Token,
Token ID11
Feature activation+9.64e-02

Pos logit contributions
aw+0.045
-+0.039
 at+0.039
up+0.037
 by+0.036
Neg logit contributions
stack-0.070
-0.069
drivers-0.064
 hips-0.063
urance-0.061
Token and
Token ID290
Feature activation+4.69e-01

Pos logit contributions
aw+0.002
 at+0.002
-+0.002
up+0.002
 by+0.002
Neg logit contributions
stack-0.004
-0.004
 hips-0.003
drivers-0.003
sed-0.003
Token It
Token ID632
Feature activation-2.97e-01

Pos logit contributions
 at+0.000
aw+0.000
-+0.000
 by+0.000
up+0.000
Neg logit contributions
stack-0.000
gency-0.000
drivers-0.000
 hips-0.000
 spine-0.000
Token was
Token ID373
Feature activation-1.52e-01

Pos logit contributions
+0.009
stack+0.009
 hips+0.009
sed+0.008
ses+0.008
Neg logit contributions
aw-0.009
 at-0.008
--0.008
 by-0.008
up-0.007
Token sweet
Token ID6029
Feature activation+1.76e+00

Pos logit contributions
+0.005
stack+0.005
 hips+0.005
ses+0.005
urance+0.005
Neg logit contributions
aw-0.004
--0.004
 at-0.004
up-0.004
 by-0.003
Token,
Token ID11
Feature activation-2.30e-02

Pos logit contributions
aw+0.049
up+0.047
ups+0.040
 bra+0.039
 at+0.038
Neg logit contributions
-0.049
 Johnson-0.043
 hips-0.042
urance-0.042
ishly-0.042
Token ch
Token ID442
Feature activation+4.96e-01

Pos logit contributions
+0.001
 hips+0.001
stack+0.001
ses+0.001
 swollen+0.001
Neg logit contributions
aw-0.001
up-0.001
--0.001
 at-0.001
 by-0.001
Tokenocol
Token ID4668
Feature activation+6.06e+00

Pos logit contributions
 at+0.021
aw+0.021
 by+0.021
-+0.020
up+0.020
Neg logit contributions
-0.009
stack-0.007
alk-0.007
ses-0.007
 hips-0.007
Tokenat
Token ID265
Feature activation+1.69e+00

Pos logit contributions
 at+0.210
aw+0.209
-+0.207
 by+0.198
up+0.183
Neg logit contributions
stack-0.156
-0.154
 hips-0.142
gency-0.142
ses-0.142
Tokeny
Token ID88
Feature activation-1.03e+00

Pos logit contributions
aw+0.043
 at+0.041
 by+0.037
-+0.037
up+0.035
Neg logit contributions
-0.058
stack-0.058
 hips-0.055
gy-0.053
ishly-0.053
Token and
Token ID290
Feature activation-3.02e-02

Pos logit contributions
stack+0.039
+0.039
 hips+0.038
sed+0.036
ses+0.036
Neg logit contributions
aw-0.022
 at-0.020
--0.019
up-0.019
 by-0.018
Token warm
Token ID5814
Feature activation+1.03e+00

Pos logit contributions
stack+0.001
+0.001
 hips+0.001
 swollen+0.001
sed+0.001
Neg logit contributions
aw-0.001
--0.001
 at-0.001
up-0.001
 by-0.001
Token,
Token ID11
Feature activation-4.05e-01

Pos logit contributions
aw+0.026
up+0.023
 at+0.022
-+0.021
 by+0.019
Neg logit contributions
-0.037
stack-0.036
 hips-0.034
ishly-0.033
urance-0.033
Token their
Token ID511
Feature activation-9.71e-01

Pos logit contributions
stack+0.090
+0.089
ses+0.086
gency+0.085
drivers+0.085
Neg logit contributions
 at-0.048
aw-0.046
--0.045
 by-0.042
up-0.036
Token faces
Token ID6698
Feature activation-4.22e-01

Pos logit contributions
stack+0.030
+0.030
 hips+0.030
ses+0.029
 spine+0.029
Neg logit contributions
aw-0.028
 at-0.027
--0.027
 by-0.025
up-0.024
Token.
Token ID13
Feature activation+3.85e-01

Pos logit contributions
+0.015
stack+0.015
 hips+0.015
 spine+0.014
urance+0.014
Neg logit contributions
aw-0.010
--0.009
 at-0.009
 by-0.008
up-0.008
Token<|endoftext|>
Token ID50256
Feature activation+3.78e-01

Pos logit contributions
 at+0.011
aw+0.010
 by+0.010
-+0.010
 with+0.008
Neg logit contributions
stack-0.013
alk-0.012
 spine-0.012
ses-0.012
sed-0.012
TokenM
Token ID44
Feature activation+9.52e-01

Pos logit contributions
 at+0.011
aw+0.011
-+0.010
 by+0.010
up+0.009
Neg logit contributions
stack-0.012
ses-0.012
-0.012
 hips-0.012
 spine-0.012
Tokenomm
Token ID2002
Feature activation+5.98e+00

Pos logit contributions
 at+0.033
aw+0.033
-+0.033
 by+0.031
up+0.029
Neg logit contributions
stack-0.026
ses-0.024
drivers-0.024
gency-0.024
 spine-0.024
Tokena
Token ID64
Feature activation+2.55e+00

Pos logit contributions
aw+0.142
 at+0.140
-+0.131
 by+0.127
up+0.112
Neg logit contributions
stack-0.213
-0.209
drivers-0.200
 hips-0.199
gency-0.199
Token and
Token ID290
Feature activation+1.17e+00

Pos logit contributions
aw+0.051
 at+0.048
-+0.047
 by+0.047
up+0.044
Neg logit contributions
stack-0.097
-0.097
 hips-0.091
sed-0.089
gency-0.089
Token Ron
Token ID6575
Feature activation-5.05e-01

Pos logit contributions
aw+0.035
 at+0.034
-+0.034
 by+0.032
up+0.030
Neg logit contributions
stack-0.035
-0.035
gency-0.032
 hips-0.032
ses-0.032
Tokenny
Token ID3281
Feature activation+4.65e-01

Pos logit contributions
+0.017
stack+0.017
ses+0.016
 hips+0.016
sed+0.016
Neg logit contributions
 at-0.014
--0.014
aw-0.014
 by-0.013
up-0.012
Token were
Token ID547
Feature activation-2.51e-01

Pos logit contributions
 at+0.012
aw+0.012
-+0.012
 by+0.011
up+0.010
Neg logit contributions
stack-0.016
-0.016
 hips-0.015
gency-0.015
ses-0.015
Token high
Token ID1029
Feature activation-5.38e-02

Pos logit contributions
+0.017
stack+0.016
 hips+0.015
ses+0.015
 slope+0.015
Neg logit contributions
aw-0.016
 at-0.015
--0.015
 by-0.014
up-0.014
Token they
Token ID484
Feature activation-1.35e+00

Pos logit contributions
stack+0.002
+0.002
 slope+0.002
gy+0.002
ses+0.002
Neg logit contributions
aw-0.001
up-0.001
--0.001
 at-0.001
 by-0.001
Token were
Token ID547
Feature activation-2.91e-01

Pos logit contributions
stack+0.041
+0.040
 hips+0.038
gency+0.037
ses+0.037
Neg logit contributions
aw-0.041
--0.040
 at-0.039
 by-0.037
up-0.036
Token just
Token ID655
Feature activation+9.48e-01

Pos logit contributions
stack+0.008
 hips+0.008
+0.008
 slope+0.007
 spine+0.007
Neg logit contributions
aw-0.009
--0.009
up-0.008
 at-0.008
 by-0.008
Token tiny
Token ID7009
Feature activation-1.00e+00

Pos logit contributions
aw+0.027
up+0.027
-+0.026
 at+0.021
 by+0.021
Neg logit contributions
ses-0.029
 spine-0.029
sing-0.029
stack-0.029
sed-0.028
Token spec
Token ID1020
Feature activation+5.96e+00

Pos logit contributions
+0.032
drivers+0.031
 hips+0.031
ses+0.031
stack+0.031
Neg logit contributions
aw-0.026
up-0.023
--0.023
 at-0.022
 by-0.021
Tokenks
Token ID591
Feature activation-1.39e+00

Pos logit contributions
-+0.153
 at+0.152
aw+0.151
 by+0.147
up+0.140
Neg logit contributions
stack-0.194
-0.191
ses-0.190
 hips-0.185
sed-0.185
Token in
Token ID287
Feature activation-3.28e+00

Pos logit contributions
 hips+0.046
stack+0.046
+0.046
urance+0.044
alk+0.044
Neg logit contributions
aw-0.030
--0.028
up-0.027
 at-0.027
 by-0.026
Token the
Token ID262
Feature activation-1.12e+00

Pos logit contributions
stack+0.139
+0.138
ses+0.132
gency+0.132
 hips+0.132
Neg logit contributions
 at-0.062
aw-0.059
--0.058
 by-0.055
up-0.044
Token sky
Token ID6766
Feature activation-6.79e-01

Pos logit contributions
stack+0.031
+0.031
 spine+0.030
 hips+0.029
ses+0.029
Neg logit contributions
 at-0.035
--0.035
aw-0.034
 by-0.032
up-0.030
Token.
Token ID13
Feature activation-5.47e-01

Pos logit contributions
stack+0.022
+0.021
gy+0.020
 hips+0.020
 spine+0.020
Neg logit contributions
aw-0.018
up-0.016
--0.016
 at-0.016
 by-0.015
Token king
Token ID5822
Feature activation+1.62e+00

Pos logit contributions
stack+0.030
 spine+0.027
sed+0.027
+0.026
gy+0.026
Neg logit contributions
aw-0.044
 at-0.040
--0.038
 by-0.037
le-0.035
Token ate
Token ID15063
Feature activation+6.71e-01

Pos logit contributions
aw+0.041
 at+0.039
-+0.037
 by+0.036
up+0.034
Neg logit contributions
stack-0.054
-0.052
gy-0.050
 hips-0.050
sed-0.049
Token too
Token ID1165
Feature activation-4.15e-01

Pos logit contributions
aw+0.015
 at+0.014
-+0.014
up+0.013
 by+0.013
Neg logit contributions
stack-0.026
-0.025
 hips-0.024
ses-0.024
gency-0.024
Token many
Token ID867
Feature activation-1.05e+00

Pos logit contributions
stack+0.016
gency+0.015
ses+0.015
 spine+0.014
+0.014
Neg logit contributions
 at-0.012
--0.011
aw-0.011
 by-0.010
 with-0.009
Token ch
Token ID442
Feature activation+7.07e-01

Pos logit contributions
drivers+0.030
 hips+0.029
opes+0.028
 hind+0.027
 slope+0.027
Neg logit contributions
aw-0.026
 at-0.025
up-0.023
 by-0.023
--0.021
Tokenocol
Token ID4668
Feature activation+5.91e+00

Pos logit contributions
 at+0.023
aw+0.022
-+0.022
 by+0.022
up+0.020
Neg logit contributions
-0.019
stack-0.019
alk-0.018
ses-0.018
drivers-0.018
Tokenates
Token ID689
Feature activation+8.61e-01

Pos logit contributions
aw+0.197
-+0.196
 at+0.196
 by+0.184
up+0.170
Neg logit contributions
stack-0.160
-0.160
gency-0.149
 hips-0.148
 slope-0.146
Token and
Token ID290
Feature activation-1.71e-01

Pos logit contributions
aw+0.017
 at+0.015
-+0.015
up+0.015
 by+0.014
Neg logit contributions
stack-0.034
-0.034
 hips-0.033
drivers-0.032
alk-0.031
Token his
Token ID465
Feature activation-6.68e-01

Pos logit contributions
stack+0.006
+0.005
drivers+0.005
 hips+0.005
sed+0.005
Neg logit contributions
aw-0.004
 at-0.004
--0.004
 by-0.004
up-0.004
Token tum
Token ID11814
Feature activation-2.01e-01

Pos logit contributions
stack+0.024
+0.024
ses+0.023
sed+0.023
 hips+0.023
Neg logit contributions
aw-0.016
 at-0.016
--0.015
 by-0.014
up-0.013
Tokenmy
Token ID1820
Feature activation-7.29e-01

Pos logit contributions
stack+0.007
+0.007
ses+0.007
 hips+0.007
gency+0.007
Neg logit contributions
aw-0.005
 at-0.005
--0.005
 by-0.005
up-0.004
Token?"
Token ID1701
Feature activation+5.08e-01

Pos logit contributions
aw+0.059
-+0.052
up+0.052
 at+0.050
 by+0.049
Neg logit contributions
stack-0.048
-0.045
 hips-0.044
gency-0.042
alk-0.042
Token
Token ID198
Feature activation+5.56e-01

Pos logit contributions
 at+0.015
-+0.014
 by+0.013
aw+0.013
up+0.012
Neg logit contributions
stack-0.018
gency-0.017
ses-0.017
-0.016
 spine-0.016
Token
Token ID198
Feature activation+1.80e-01

Pos logit contributions
 at+0.015
-+0.015
aw+0.014
 by+0.014
up+0.012
Neg logit contributions
stack-0.021
-0.019
ses-0.019
gency-0.019
 spine-0.019
TokenL
Token ID43
Feature activation+1.57e+00

Pos logit contributions
 at+0.005
aw+0.005
-+0.005
 by+0.005
up+0.004
Neg logit contributions
-0.005
stack-0.005
 hips-0.005
alk-0.005
drivers-0.005
Tokenily
Token ID813
Feature activation+1.89e+00

Pos logit contributions
 at+0.050
aw+0.049
-+0.049
 by+0.047
up+0.043
Neg logit contributions
-0.043
stack-0.043
alk-0.041
ses-0.040
drivers-0.040
Token spr
Token ID7500
Feature activation+5.87e+00

Pos logit contributions
aw+0.047
 at+0.046
-+0.046
 by+0.043
up+0.040
Neg logit contributions
-0.066
stack-0.065
 hips-0.063
ses-0.063
gency-0.062
Tokenays
Token ID592
Feature activation+4.17e-01

Pos logit contributions
-+0.202
 at+0.192
 by+0.187
aw+0.186
up+0.180
Neg logit contributions
ses-0.125
stack-0.124
alk-0.120
-0.120
sed-0.115
Token Ben
Token ID3932
Feature activation-1.07e-01

Pos logit contributions
 at+0.009
aw+0.008
-+0.008
 by+0.008
up+0.007
Neg logit contributions
-0.017
stack-0.017
gency-0.016
drivers-0.016
 hips-0.016
Token with
Token ID351
Feature activation+1.97e-01

Pos logit contributions
 hips+0.004
stack+0.004
ses+0.004
alk+0.004
gy+0.004
Neg logit contributions
aw-0.002
--0.002
up-0.002
 at-0.002
 by-0.002
Token the
Token ID262
Feature activation-3.16e-01

Pos logit contributions
 at+0.004
aw+0.004
-+0.004
 by+0.003
up+0.003
Neg logit contributions
stack-0.008
-0.008
drivers-0.008
gency-0.008
 spine-0.008
Token water
Token ID1660
Feature activation+1.05e+00

Pos logit contributions
stack+0.009
+0.009
ses+0.009
 hips+0.009
alk+0.009
Neg logit contributions
aw-0.010
 at-0.010
--0.009
 by-0.009
up-0.008
Token reward
Token ID6721
Feature activation+3.36e+00

Pos logit contributions
 at+0.082
aw+0.080
-+0.080
 by+0.076
up+0.068
Neg logit contributions
-0.092
stack-0.091
 hips-0.086
gency-0.086
ses-0.085
Token.
Token ID13
Feature activation+9.41e-01

Pos logit contributions
aw+0.067
 at+0.064
-+0.062
 by+0.057
up+0.053
Neg logit contributions
stack-0.135
-0.132
 hips-0.127
drivers-0.125
gency-0.125
Token It
Token ID632
Feature activation+4.58e-01

Pos logit contributions
 at+0.030
aw+0.029
 by+0.028
-+0.027
 with+0.024
Neg logit contributions
ses-0.027
gency-0.026
stack-0.026
-0.026
drivers-0.026
Token was
Token ID373
Feature activation+3.63e-01

Pos logit contributions
aw+0.012
-+0.011
 at+0.011
 by+0.011
up+0.010
Neg logit contributions
stack-0.016
-0.015
ses-0.014
 hips-0.014
sed-0.014
Token ch
Token ID442
Feature activation+1.21e+00

Pos logit contributions
aw+0.009
 at+0.009
-+0.009
 by+0.008
up+0.008
Neg logit contributions
stack-0.013
-0.013
 hips-0.012
ses-0.012
gency-0.012
Tokenocol
Token ID4668
Feature activation+5.85e+00

Pos logit contributions
 at+0.053
aw+0.052
-+0.052
 by+0.050
up+0.048
Neg logit contributions
-0.020
stack-0.019
alk-0.017
ses-0.017
 hips-0.017
Tokenates
Token ID689
Feature activation+2.16e-01

Pos logit contributions
 at+0.216
aw+0.213
-+0.210
 by+0.204
up+0.192
Neg logit contributions
stack-0.135
-0.135
 hips-0.122
drivers-0.122
gency-0.122
Token and
Token ID290
Feature activation-8.66e-01

Pos logit contributions
aw+0.004
 at+0.004
-+0.004
 by+0.004
up+0.004
Neg logit contributions
stack-0.008
-0.008
 hips-0.008
gency-0.008
ses-0.008
Token sweets
Token ID42402
Feature activation+2.14e-01

Pos logit contributions
stack+0.029
+0.029
 hips+0.027
ses+0.027
gency+0.027
Neg logit contributions
aw-0.023
 at-0.023
--0.022
 by-0.021
up-0.020
Token!
Token ID0
Feature activation+1.86e-01

Pos logit contributions
aw+0.005
 at+0.004
-+0.004
 by+0.004
up+0.004
Neg logit contributions
stack-0.008
-0.008
 hips-0.008
drivers-0.008
gency-0.008
Token
Token ID198
Feature activation-2.80e-01

Pos logit contributions
 at+0.006
aw+0.006
 by+0.005
-+0.005
up+0.005
Neg logit contributions
stack-0.005
ses-0.005
-0.005
gency-0.005
 spine-0.005
Tokenbled
Token ID9342
Feature activation+8.94e-02

Pos logit contributions
aw+0.138
 at+0.132
-+0.130
 by+0.128
up+0.126
Neg logit contributions
stack-0.024
-0.024
 hips-0.022
aned-0.017
drivers-0.016
Token up
Token ID510
Feature activation+1.02e+00

Pos logit contributions
aw+0.002
up+0.002
-+0.002
 writer+0.002
 at+0.002
Neg logit contributions
stack-0.002
 top-0.002
 hips-0.002
gy-0.002
 bottom-0.002
Token all
Token ID477
Feature activation-9.52e-01

Pos logit contributions
aw+0.024
-+0.022
 at+0.022
up+0.021
 by+0.020
Neg logit contributions
-0.036
stack-0.035
 hips-0.035
gency-0.032
drivers-0.032
Token the
Token ID262
Feature activation-4.99e-01

Pos logit contributions
stack+0.028
alk+0.026
+0.025
 Everywhere+0.024
zie+0.024
Neg logit contributions
aw-0.026
--0.023
 at-0.023
 bra-0.021
up-0.021
Token ch
Token ID442
Feature activation+5.20e-01

Pos logit contributions
drivers+0.015
 hips+0.015
gency+0.014
stack+0.014
+0.014
Neg logit contributions
aw-0.015
 at-0.013
--0.012
 by-0.011
up-0.011
Tokenocol
Token ID4668
Feature activation+5.82e+00

Pos logit contributions
 at+0.016
 by+0.016
aw+0.016
-+0.015
up+0.014
Neg logit contributions
alk-0.014
-0.014
drivers-0.013
ses-0.013
 spine-0.013
Tokenates
Token ID689
Feature activation+6.78e-01

Pos logit contributions
 at+0.219
aw+0.213
-+0.209
 by+0.205
up+0.192
Neg logit contributions
stack-0.132
-0.131
drivers-0.120
alk-0.120
 hips-0.118
Token,
Token ID11
Feature activation+5.57e-03

Pos logit contributions
aw+0.013
 at+0.011
-+0.011
up+0.011
 by+0.010
Neg logit contributions
stack-0.028
-0.027
 hips-0.027
alk-0.025
 spine-0.025
Token and
Token ID290
Feature activation+2.52e-01

Pos logit contributions
aw+0.000
up+0.000
-+0.000
 at+0.000
 by+0.000
Neg logit contributions
stack-0.000
 hips-0.000
 spine-0.000
-0.000
sed-0.000
Token she
Token ID673
Feature activation+7.71e-01

Pos logit contributions
aw+0.008
-+0.007
up+0.007
 at+0.006
 by+0.006
Neg logit contributions
stack-0.008
sed-0.007
 hips-0.007
ses-0.006
-0.006
Token was
Token ID373
Feature activation-1.71e-01

Pos logit contributions
aw+0.022
 at+0.021
 by+0.021
-+0.020
up+0.019
Neg logit contributions
stack-0.024
-0.023
 hips-0.022
sed-0.021
gy-0.021
Token it
Token ID340
Feature activation-1.02e+00

Pos logit contributions
+0.089
stack+0.088
ses+0.087
 hips+0.086
drivers+0.086
Neg logit contributions
 at-0.034
--0.032
aw-0.032
 by-0.029
up-0.022
Token.
Token ID13
Feature activation+4.79e-02

Pos logit contributions
stack+0.039
+0.039
 hips+0.037
gency+0.037
ses+0.036
Neg logit contributions
aw-0.022
 at-0.022
--0.021
 by-0.020
up-0.018
Token
Token ID198
Feature activation+1.18e-01

Pos logit contributions
 at+0.002
aw+0.002
-+0.002
 by+0.001
up+0.001
Neg logit contributions
-0.001
stack-0.001
 hips-0.001
gency-0.001
ses-0.001
Token
Token ID198
Feature activation-3.46e-01

Pos logit contributions
 at+0.003
aw+0.003
-+0.003
 by+0.003
up+0.003
Neg logit contributions
stack-0.004
ses-0.004
 spine-0.004
sed-0.004
gency-0.004
TokenM
Token ID44
Feature activation+8.14e-01

Pos logit contributions
+0.009
stack+0.009
 slope+0.008
drivers+0.008
 hips+0.008
Neg logit contributions
 at-0.012
--0.011
aw-0.011
 by-0.011
up-0.010
Tokenomm
Token ID2002
Feature activation+5.75e+00

Pos logit contributions
aw+0.037
 at+0.037
-+0.035
 by+0.035
up+0.033
Neg logit contributions
-0.013
stack-0.011
ses-0.010
gency-0.010
drivers-0.010
Tokena
Token ID64
Feature activation+2.19e+00

Pos logit contributions
 at+0.133
aw+0.133
 by+0.118
-+0.115
 bra+0.106
Neg logit contributions
stack-0.186
drivers-0.182
-0.178
 hips-0.177
gy-0.177
Token and
Token ID290
Feature activation+1.81e+00

Pos logit contributions
aw+0.055
 at+0.052
 by+0.052
-+0.051
up+0.047
Neg logit contributions
stack-0.069
-0.068
 hips-0.067
sed-0.065
drivers-0.065
Token Tom
Token ID4186
Feature activation-5.63e-02

Pos logit contributions
aw+0.058
-+0.057
 at+0.056
 by+0.053
up+0.052
Neg logit contributions
stack-0.050
-0.050
 hips-0.047
gency-0.047
drivers-0.047
Token went
Token ID1816
Feature activation+1.35e+00

Pos logit contributions
stack+0.002
+0.002
 hips+0.002
gency+0.002
drivers+0.002
Neg logit contributions
aw-0.002
--0.002
 at-0.001
 by-0.001
up-0.001
Token to
Token ID284
Feature activation+1.27e+00

Pos logit contributions
aw+0.040
up+0.024
-+0.024
 bra+0.024
 at+0.024
Neg logit contributions
 hips-0.047
stack-0.046
drivers-0.044
gency-0.043
 spine-0.043
Token When
Token ID1649
Feature activation+1.62e+00

Pos logit contributions
stack+0.009
+0.009
 hips+0.008
gency+0.008
ses+0.008
Neg logit contributions
 at-0.008
aw-0.008
--0.008
 by-0.008
up-0.007
Token he
Token ID339
Feature activation-1.32e-01

Pos logit contributions
aw+0.054
-+0.052
 at+0.049
up+0.048
 by+0.047
Neg logit contributions
-0.045
stack-0.044
 hips-0.039
 slope-0.039
't-0.039
Token does
Token ID857
Feature activation+5.40e-01

Pos logit contributions
+0.004
't+0.004
gency+0.003
 hips+0.003
sed+0.003
Neg logit contributions
aw-0.004
 by-0.003
--0.003
up-0.003
 at-0.003
Token,
Token ID11
Feature activation+4.26e-01

Pos logit contributions
 at+0.011
aw+0.011
-+0.011
 by+0.010
up+0.009
Neg logit contributions
stack-0.022
-0.021
ses-0.020
gency-0.020
 hips-0.020
Token she
Token ID673
Feature activation+9.07e-01

Pos logit contributions
aw+0.014
-+0.014
up+0.013
 at+0.012
 by+0.012
Neg logit contributions
stack-0.011
-0.011
't-0.010
gency-0.010
 slope-0.010
Token spr
Token ID7500
Feature activation+5.73e+00

Pos logit contributions
-+0.027
aw+0.026
 by+0.026
 at+0.024
up+0.023
Neg logit contributions
ses-0.026
-0.026
 hips-0.025
sed-0.025
stack-0.024
Tokenays
Token ID592
Feature activation+1.36e-01

Pos logit contributions
-+0.207
aw+0.191
 bra+0.189
up+0.188
 at+0.187
Neg logit contributions
stack-0.107
alk-0.100
aned-0.100
-0.100
gy-0.100
Token him
Token ID683
Feature activation+3.32e-01

Pos logit contributions
aw+0.003
-+0.003
 at+0.003
 by+0.003
up+0.002
Neg logit contributions
stack-0.005
-0.005
 hips-0.005
ses-0.005
alk-0.005
Token with
Token ID351
Feature activation-9.78e-02

Pos logit contributions
aw+0.005
-+0.005
 at+0.005
 by+0.004
up+0.004
Neg logit contributions
stack-0.014
-0.014
 hips-0.014
ses-0.014
alk-0.013
Token water
Token ID1660
Feature activation+1.52e+00

Pos logit contributions
stack+0.004
+0.004
gency+0.004
 hips+0.004
ses+0.004
Neg logit contributions
 at-0.002
aw-0.002
--0.002
 by-0.002
up-0.002
Token.
Token ID13
Feature activation+1.00e-01

Pos logit contributions
aw+0.031
-+0.028
 at+0.027
up+0.026
 by+0.025
Neg logit contributions
stack-0.059
-0.058
 hips-0.056
ses-0.055
alk-0.055
Token.
Token ID13
Feature activation+6.10e-01

Pos logit contributions
+0.071
stack+0.069
 hips+0.067
urance+0.065
 spine+0.064
Neg logit contributions
aw-0.046
--0.042
 at-0.042
up-0.039
 by-0.039
Token
Token ID220
Feature activation-8.42e-01

Pos logit contributions
 at+0.023
aw+0.022
 by+0.022
-+0.022
up+0.019
Neg logit contributions
stack-0.014
gency-0.013
drivers-0.013
ses-0.013
alk-0.013
Token
Token ID198
Feature activation+3.47e-01

Pos logit contributions
stack+0.032
 spine+0.031
gency+0.031
ses+0.031
 slope+0.031
Neg logit contributions
 at-0.021
 by-0.020
aw-0.019
--0.018
 with-0.016
Token
Token ID198
Feature activation-6.19e-02

Pos logit contributions
 at+0.007
 by+0.007
-+0.006
aw+0.006
 with+0.006
Neg logit contributions
ses-0.015
sed-0.015
stack-0.015
 spine-0.015
gency-0.014
TokenM
Token ID44
Feature activation+1.07e+00

Pos logit contributions
+0.002
stack+0.002
 hips+0.002
gency+0.002
drivers+0.002
Neg logit contributions
 at-0.002
aw-0.002
--0.002
 by-0.002
up-0.002
Tokenomm
Token ID2002
Feature activation+5.73e+00

Pos logit contributions
 at+0.046
aw+0.046
-+0.045
 by+0.044
up+0.041
Neg logit contributions
-0.019
stack-0.018
 hips-0.016
gency-0.016
ses-0.016
Tokena
Token ID64
Feature activation+2.11e+00

Pos logit contributions
 at+0.131
aw+0.128
 by+0.115
-+0.111
up+0.098
Neg logit contributions
stack-0.194
-0.186
drivers-0.183
 hips-0.182
aned-0.180
Token and
Token ID290
Feature activation+1.71e+00

Pos logit contributions
aw+0.050
 at+0.050
-+0.047
 by+0.047
up+0.043
Neg logit contributions
stack-0.072
-0.072
 hips-0.069
sed-0.067
ishly-0.066
Token Billy
Token ID15890
Feature activation-8.08e-01

Pos logit contributions
aw+0.063
 at+0.063
-+0.062
 by+0.059
up+0.057
Neg logit contributions
stack-0.041
-0.040
 hips-0.037
gency-0.037
ses-0.036
Token stayed
Token ID9658
Feature activation+6.64e-01

Pos logit contributions
stack+0.027
+0.027
ses+0.025
 hips+0.025
gency+0.025
Neg logit contributions
 at-0.022
aw-0.022
--0.021
 by-0.020
up-0.018
Token by
Token ID416
Feature activation+4.69e-01

Pos logit contributions
aw+0.017
up+0.015
-+0.012
go+0.011
lier+0.011
Neg logit contributions
 hips-0.021
gy-0.021
 afloat-0.020
stack-0.020
-0.019
Token mustache
Token ID49303
Feature activation+1.45e+00

Pos logit contributions
aw+0.035
 at+0.034
-+0.033
 by+0.032
up+0.030
Neg logit contributions
-0.038
stack-0.037
gency-0.036
 hips-0.036
ses-0.035
Token!"
Token ID2474
Feature activation-4.57e-03

Pos logit contributions
aw+0.035
 at+0.031
-+0.031
 by+0.030
up+0.028
Neg logit contributions
stack-0.052
-0.050
gency-0.049
 hips-0.048
 spine-0.047
Token
Token ID198
Feature activation+1.61e-01

Pos logit contributions
stack+0.000
+0.000
gency+0.000
ses+0.000
drivers+0.000
Neg logit contributions
 at-0.000
aw-0.000
--0.000
 by-0.000
up-0.000
Token
Token ID198
Feature activation-2.33e-01

Pos logit contributions
 at+0.003
 by+0.003
-+0.003
 with+0.003
aw+0.003
Neg logit contributions
ses-0.007
stack-0.007
 spine-0.007
 spoiler-0.007
sed-0.007
TokenM
Token ID44
Feature activation+9.71e-01

Pos logit contributions
+0.005
stack+0.005
ses+0.005
gency+0.004
drivers+0.004
Neg logit contributions
 at-0.009
aw-0.009
--0.009
 by-0.009
up-0.008
Tokenomm
Token ID2002
Feature activation+5.72e+00

Pos logit contributions
 at+0.033
-+0.033
aw+0.032
 by+0.031
up+0.029
Neg logit contributions
stack-0.027
-0.026
gency-0.025
 hips-0.024
ses-0.024
Tokena
Token ID64
Feature activation+2.05e+00

Pos logit contributions
 at+0.125
aw+0.124
-+0.110
 by+0.110
up+0.093
Neg logit contributions
stack-0.204
drivers-0.195
-0.194
 hips-0.192
sed-0.191
Token replied
Token ID8712
Feature activation+9.18e-01

Pos logit contributions
aw+0.055
 at+0.054
-+0.053
 by+0.051
up+0.047
Neg logit contributions
-0.065
stack-0.065
 hips-0.062
sed-0.061
ses-0.061
Token,
Token ID11
Feature activation+2.76e-01

Pos logit contributions
aw+0.023
 at+0.021
up+0.020
-+0.020
 by+0.020
Neg logit contributions
-0.033
stack-0.032
 hips-0.031
sed-0.030
drivers-0.030
Token "
Token ID366
Feature activation-5.85e-01

Pos logit contributions
 at+0.008
-+0.007
 by+0.007
aw+0.007
 with+0.006
Neg logit contributions
alk-0.009
stack-0.009
 spine-0.009
ses-0.009
-0.009
TokenWell
Token ID5779
Feature activation+6.40e-01

Pos logit contributions
stack+0.017
+0.016
gency+0.015
 hips+0.015
drivers+0.015
Neg logit contributions
 at-0.019
aw-0.019
--0.019
 by-0.017
up-0.016
Token with
Token ID351
Feature activation-3.13e-01

Pos logit contributions
aw+0.003
 at+0.003
-+0.003
up+0.003
 by+0.003
Neg logit contributions
-0.004
stack-0.004
 hips-0.003
gy-0.003
drivers-0.003
Token her
Token ID607
Feature activation+8.06e-02

Pos logit contributions
stack+0.012
+0.012
 hips+0.011
ses+0.011
gency+0.011
Neg logit contributions
 at-0.007
aw-0.007
--0.007
 by-0.006
up-0.005
Token hands
Token ID2832
Feature activation-2.57e+00

Pos logit contributions
 at+0.002
aw+0.002
-+0.002
 by+0.002
up+0.002
Neg logit contributions
stack-0.003
-0.003
 hips-0.003
gency-0.003
ses-0.003
Token on
Token ID319
Feature activation-2.77e+00

Pos logit contributions
stack+0.111
+0.111
ses+0.107
gency+0.106
alk+0.105
Neg logit contributions
 at-0.048
--0.043
aw-0.042
 by-0.041
up-0.031
Token her
Token ID607
Feature activation-9.50e-01

Pos logit contributions
stack+0.115
+0.115
ses+0.112
gency+0.111
drivers+0.110
Neg logit contributions
 at-0.053
--0.050
aw-0.049
 by-0.048
up-0.037
Token hips
Token ID21497
Feature activation-4.98e+00

Pos logit contributions
+0.028
stack+0.028
gency+0.026
drivers+0.026
 slope+0.026
Neg logit contributions
 at-0.031
--0.031
 by-0.029
aw-0.027
 with-0.025
Token.
Token ID13
Feature activation-2.86e-01

Pos logit contributions
stack+0.230
ses+0.228
+0.226
gency+0.222
sed+0.221
Neg logit contributions
 at-0.086
--0.076
 by-0.074
aw-0.067
 with-0.052
Token She
Token ID1375
Feature activation+4.29e-01

Pos logit contributions
stack+0.008
ses+0.008
+0.008
gency+0.008
alk+0.008
Neg logit contributions
 at-0.009
aw-0.009
--0.009
 by-0.009
up-0.007
Token said
Token ID531
Feature activation+1.30e+00

Pos logit contributions
aw+0.012
 at+0.012
-+0.011
 by+0.011
up+0.010
Neg logit contributions
stack-0.014
-0.013
 hips-0.013
sed-0.013
ses-0.012
Token,
Token ID11
Feature activation-6.98e-02

Pos logit contributions
aw+0.032
 at+0.029
-+0.029
 by+0.028
up+0.027
Neg logit contributions
stack-0.045
-0.044
 hips-0.043
drivers-0.043
gency-0.042
Token
Token ID6184
Feature activation-1.47e+00

Pos logit contributions
+0.002
stack+0.002
alk+0.002
ses+0.002
gency+0.002
Neg logit contributions
 at-0.002
--0.002
 by-0.002
aw-0.002
 with-0.002
Token stuck
Token ID7819
Feature activation-2.73e-01

Pos logit contributions
stack+0.033
+0.032
 hips+0.031
gency+0.031
drivers+0.031
Neg logit contributions
aw-0.030
--0.029
 at-0.028
up-0.027
 by-0.026
Token beneath
Token ID11061
Feature activation-1.40e+00

Pos logit contributions
stack+0.013
+0.013
ses+0.013
 hips+0.013
gency+0.013
Neg logit contributions
 at-0.003
aw-0.003
--0.003
 by-0.003
up-0.002
Token the
Token ID262
Feature activation-8.80e-01

Pos logit contributions
stack+0.059
+0.059
ses+0.057
 spine+0.056
sed+0.056
Neg logit contributions
 at-0.027
--0.026
aw-0.026
 by-0.025
up-0.018
Token j
Token ID474
Feature activation-1.72e+00

Pos logit contributions
+0.030
stack+0.029
ses+0.027
alk+0.026
 hips+0.026
Neg logit contributions
--0.027
 at-0.026
 by-0.025
aw-0.025
up-0.022
Tokenee
Token ID1453
Feature activation-1.28e-01

Pos logit contributions
+0.061
stack+0.061
 hips+0.057
alk+0.057
drivers+0.057
Neg logit contributions
 at-0.043
aw-0.042
--0.041
 by-0.039
up-0.035
Tokenpt
Token ID457
Feature activation-4.92e+00

Pos logit contributions
+0.005
stack+0.005
ses+0.005
gency+0.005
 hips+0.005
Neg logit contributions
 at-0.003
--0.003
aw-0.003
 by-0.002
up-0.002
Tokenhe
Token ID258
Feature activation-2.14e+00

Pos logit contributions
+0.192
stack+0.191
ses+0.182
 hips+0.182
gency+0.181
Neg logit contributions
 at-0.109
--0.107
aw-0.105
 by-0.099
up-0.084
Token same
Token ID976
Feature activation-1.67e+00

Pos logit contributions
+0.075
stack+0.074
ses+0.071
gency+0.070
 hips+0.069
Neg logit contributions
 at-0.057
--0.056
aw-0.054
 by-0.052
up-0.044
Token je
Token ID11223
Feature activation-1.22e+00

Pos logit contributions
stack+0.056
 hips+0.055
+0.055
 spine+0.055
 slope+0.054
Neg logit contributions
aw-0.043
 at-0.040
--0.038
up-0.037
 by-0.036
Tokenep
Token ID538
Feature activation-1.46e+00

Pos logit contributions
+0.036
stack+0.036
drivers+0.035
gency+0.034
 hips+0.034
Neg logit contributions
 at-0.036
aw-0.035
 by-0.033
--0.033
up-0.029
Token that
Token ID326
Feature activation-1.00e+00

Pos logit contributions
stack+0.058
+0.057
 hips+0.055
drivers+0.054
gency+0.054
Neg logit contributions
aw-0.030
 at-0.030
--0.028
 by-0.026
up-0.024
Token with
Token ID351
Feature activation+5.18e-02

Pos logit contributions
aw+0.003
-+0.003
 at+0.003
up+0.003
 by+0.003
Neg logit contributions
stack-0.003
-0.003
 hips-0.003
ses-0.003
 spine-0.003
Token her
Token ID607
Feature activation+5.31e-02

Pos logit contributions
 at+0.001
aw+0.001
-+0.001
 by+0.001
up+0.001
Neg logit contributions
-0.002
stack-0.002
 spine-0.002
gency-0.002
alk-0.002
Token hands
Token ID2832
Feature activation-2.65e+00

Pos logit contributions
 at+0.002
-+0.001
aw+0.001
 by+0.001
up+0.001
Neg logit contributions
stack-0.002
-0.002
drivers-0.002
gency-0.002
 slope-0.002
Token on
Token ID319
Feature activation-2.76e+00

Pos logit contributions
stack+0.115
+0.114
ses+0.110
gency+0.110
alk+0.109
Neg logit contributions
 at-0.049
--0.045
aw-0.044
 by-0.043
up-0.033
Token her
Token ID607
Feature activation-8.68e-01

Pos logit contributions
+0.108
stack+0.108
gency+0.104
ses+0.103
drivers+0.102
Neg logit contributions
 at-0.060
--0.057
aw-0.057
 by-0.054
up-0.044
Token hips
Token ID21497
Feature activation-4.92e+00

Pos logit contributions
+0.026
stack+0.025
drivers+0.024
ivery+0.024
gency+0.024
Neg logit contributions
 at-0.029
--0.029
 by-0.027
aw-0.026
 with-0.023
Token.
Token ID13
Feature activation-2.73e-01

Pos logit contributions
stack+0.234
ses+0.230
+0.229
gency+0.227
alk+0.225
Neg logit contributions
 at-0.084
--0.072
 by-0.070
aw-0.063
 with-0.050
Token
Token ID198
Feature activation-3.44e-01

Pos logit contributions
stack+0.008
+0.008
gency+0.008
ses+0.008
alk+0.008
Neg logit contributions
 at-0.008
aw-0.008
 by-0.008
--0.007
up-0.006
Token
Token ID198
Feature activation-6.68e-01

Pos logit contributions
ses+0.016
 spine+0.015
sed+0.015
sing+0.015
 spoiler+0.015
Neg logit contributions
--0.007
 by-0.006
 at-0.006
aw-0.006
 with-0.005
TokenBig
Token ID12804
Feature activation-2.41e+00

Pos logit contributions
+0.024
ses+0.024
stack+0.023
sed+0.023
 spine+0.023
Neg logit contributions
aw-0.017
 at-0.017
--0.017
 by-0.016
up-0.014
Token Bear
Token ID14732
Feature activation-1.06e+00

Pos logit contributions
+0.046
stack+0.046
 hips+0.041
ses+0.041
gency+0.041
Neg logit contributions
 at-0.099
aw-0.099
--0.097
 by-0.094
up-0.089
Token truck
Token ID7779
Feature activation-1.52e+00

Pos logit contributions
+0.003
stack+0.003
 hips+0.002
gency+0.002
drivers+0.002
Neg logit contributions
 at-0.002
aw-0.002
--0.002
 by-0.002
up-0.002
Token because
Token ID780
Feature activation-8.08e-01

Pos logit contributions
stack+0.061
+0.060
 hips+0.057
ses+0.057
gency+0.057
Neg logit contributions
aw-0.031
 at-0.030
--0.029
 by-0.027
up-0.025
Token of
Token ID286
Feature activation+1.12e+00

Pos logit contributions
stack+0.031
+0.031
ses+0.029
gency+0.029
 hips+0.029
Neg logit contributions
aw-0.018
 at-0.017
--0.017
 by-0.016
up-0.014
Token your
Token ID534
Feature activation+5.67e-02

Pos logit contributions
 at+0.020
aw+0.019
 by+0.019
-+0.019
up+0.013
Neg logit contributions
-0.049
stack-0.049
 spine-0.047
 hips-0.047
gency-0.047
Token bad
Token ID2089
Feature activation-1.79e+00

Pos logit contributions
 at+0.002
aw+0.002
-+0.002
 by+0.002
up+0.001
Neg logit contributions
stack-0.002
-0.002
gency-0.002
 hips-0.002
ses-0.002
Token behavior
Token ID4069
Feature activation-4.90e+00

Pos logit contributions
stack+0.059
+0.059
 hips+0.055
ses+0.055
gency+0.055
Neg logit contributions
 at-0.049
aw-0.049
--0.047
 by-0.045
up-0.041
Token.
Token ID13
Feature activation+7.10e-01

Pos logit contributions
+0.206
stack+0.206
gency+0.195
ses+0.195
 spine+0.192
Neg logit contributions
 at-0.104
--0.095
 by-0.093
aw-0.093
up-0.073
Token Now
Token ID2735
Feature activation-3.51e-01

Pos logit contributions
 at+0.019
aw+0.019
-+0.018
 by+0.018
up+0.015
Neg logit contributions
stack-0.024
-0.024
 slope-0.023
 hips-0.023
 spine-0.023
Token you
Token ID345
Feature activation+1.78e+00

Pos logit contributions
 hips+0.010
stack+0.010
+0.010
urance+0.010
't+0.010
Neg logit contributions
aw-0.011
--0.010
 at-0.009
up-0.008
 by-0.008
Token have
Token ID423
Feature activation+1.11e+00

Pos logit contributions
aw+0.052
-+0.050
 at+0.050
 by+0.046
up+0.042
Neg logit contributions
-0.055
stack-0.053
gency-0.051
ses-0.051
 hips-0.050
Token to
Token ID284
Feature activation+4.84e-01

Pos logit contributions
aw+0.021
 at+0.020
-+0.020
 by+0.018
up+0.016
Neg logit contributions
-0.045
stack-0.045
gency-0.044
 hips-0.044
ses-0.044
Token rubbed
Token ID31862
Feature activation+3.37e-01

Pos logit contributions
aw+0.018
 at+0.017
-+0.016
 by+0.016
up+0.015
Neg logit contributions
stack-0.020
-0.019
 hips-0.019
drivers-0.019
gency-0.018
Token his
Token ID465
Feature activation-6.20e-01

Pos logit contributions
 at+0.008
aw+0.007
-+0.007
 by+0.007
up+0.006
Neg logit contributions
stack-0.013
-0.013
ses-0.012
gency-0.012
 hips-0.012
Token nose
Token ID9686
Feature activation-1.12e+00

Pos logit contributions
stack+0.016
+0.016
gency+0.015
drivers+0.015
alk+0.015
Neg logit contributions
 at-0.021
--0.021
aw-0.021
 by-0.020
up-0.018
Token with
Token ID351
Feature activation-6.97e-01

Pos logit contributions
stack+0.049
+0.049
gency+0.046
ses+0.046
 hips+0.046
Neg logit contributions
 at-0.019
aw-0.019
--0.018
 by-0.017
up-0.014
Token her
Token ID607
Feature activation-6.60e-01

Pos logit contributions
stack+0.026
+0.026
 hips+0.024
gency+0.024
ses+0.024
Neg logit contributions
 at-0.017
aw-0.016
--0.016
 by-0.015
up-0.013
Token sleeve
Token ID23179
Feature activation-4.83e+00

Pos logit contributions
stack+0.020
+0.020
gency+0.019
alk+0.018
urance+0.018
Neg logit contributions
 at-0.021
--0.021
 by-0.020
aw-0.020
up-0.017
Token.
Token ID13
Feature activation+3.38e-01

Pos logit contributions
stack+0.220
+0.216
ses+0.214
gency+0.214
alk+0.209
Neg logit contributions
 at-0.089
--0.081
 by-0.078
aw-0.074
 with-0.054
Token She
Token ID1375
Feature activation+6.60e-01

Pos logit contributions
 at+0.010
aw+0.009
 by+0.009
-+0.008
 with+0.007
Neg logit contributions
ivery-0.011
stack-0.010
gency-0.010
ses-0.010
 spine-0.010
Token said
Token ID531
Feature activation+1.28e+00

Pos logit contributions
aw+0.019
 at+0.018
-+0.018
 by+0.018
up+0.016
Neg logit contributions
stack-0.019
-0.019
 hips-0.019
ses-0.019
sed-0.018
Token,
Token ID11
Feature activation+7.00e-02

Pos logit contributions
aw+0.031
-+0.029
 at+0.029
 by+0.028
up+0.027
Neg logit contributions
stack-0.044
-0.043
 hips-0.042
drivers-0.042
sed-0.041
Token "
Token ID366
Feature activation-1.13e+00

Pos logit contributions
 at+0.002
-+0.002
 by+0.002
aw+0.002
 with+0.002
Neg logit contributions
stack-0.002
-0.002
alk-0.002
gency-0.002
ses-0.002
Token.
Token ID13
Feature activation+2.88e-01

Pos logit contributions
+0.082
stack+0.082
 hips+0.079
ses+0.079
gency+0.078
Neg logit contributions
aw-0.042
 at-0.042
--0.041
 by-0.038
up-0.034
Token You
Token ID921
Feature activation+7.03e-01

Pos logit contributions
 at+0.009
aw+0.008
-+0.008
 by+0.008
up+0.007
Neg logit contributions
stack-0.009
-0.009
 hips-0.008
gency-0.008
drivers-0.008
Token are
Token ID389
Feature activation-1.00e+00

Pos logit contributions
aw+0.022
-+0.021
 at+0.021
 by+0.020
up+0.019
Neg logit contributions
-0.021
stack-0.020
ses-0.019
gency-0.019
 hips-0.019
Token hurting
Token ID21530
Feature activation-2.34e+00

Pos logit contributions
't+0.020
ses+0.020
 hips+0.020
stack+0.020
+0.020
Neg logit contributions
aw-0.039
--0.036
up-0.034
 at-0.032
le-0.031
Token my
Token ID616
Feature activation-7.45e-01

Pos logit contributions
+0.099
stack+0.098
 spine+0.093
ses+0.092
urance+0.091
Neg logit contributions
 at-0.049
--0.048
aw-0.048
 by-0.046
up-0.037
Token feelings
Token ID7666
Feature activation-4.81e+00

Pos logit contributions
stack+0.024
+0.024
gency+0.022
 hips+0.022
drivers+0.022
Neg logit contributions
 at-0.022
--0.021
aw-0.021
 by-0.020
up-0.018
Token.
Token ID13
Feature activation+3.81e-01

Pos logit contributions
stack+0.226
+0.225
gency+0.219
ses+0.217
sed+0.216
Neg logit contributions
 at-0.088
 by-0.079
--0.076
aw-0.067
 with-0.058
Token You
Token ID921
Feature activation+9.44e-01

Pos logit contributions
 at+0.011
aw+0.011
-+0.011
 by+0.011
up+0.010
Neg logit contributions
stack-0.012
-0.012
 hips-0.011
gency-0.011
drivers-0.011
Token are
Token ID389
Feature activation-8.32e-01

Pos logit contributions
aw+0.029
 at+0.028
-+0.028
 by+0.026
up+0.025
Neg logit contributions
-0.028
stack-0.028
ses-0.026
 hips-0.026
gency-0.026
Token making
Token ID1642
Feature activation+4.50e-01

Pos logit contributions
ses+0.019
 hips+0.018
sed+0.018
't+0.018
 swollen+0.017
Neg logit contributions
aw-0.031
--0.029
up-0.027
 at-0.026
le-0.024
Token your
Token ID534
Feature activation-1.98e-01

Pos logit contributions
aw+0.011
 at+0.011
-+0.011
 by+0.010
up+0.009
Neg logit contributions
-0.015
stack-0.015
 hips-0.015
ses-0.014
urance-0.014
Token again
Token ID757
Feature activation-1.01e+00

Pos logit contributions
+0.092
stack+0.092
 hips+0.087
ses+0.087
gency+0.087
Neg logit contributions
aw-0.044
 at-0.043
--0.042
 by-0.039
up-0.035
Token because
Token ID780
Feature activation-7.97e-01

Pos logit contributions
stack+0.044
+0.043
ses+0.041
gency+0.041
drivers+0.041
Neg logit contributions
 at-0.019
--0.018
aw-0.017
 by-0.016
up-0.013
Token he
Token ID339
Feature activation-1.43e+00

Pos logit contributions
stack+0.030
+0.029
ses+0.028
gency+0.028
 hips+0.028
Neg logit contributions
 at-0.020
--0.019
 by-0.018
aw-0.018
up-0.015
Token missed
Token ID6825
Feature activation-2.83e+00

Pos logit contributions
+0.040
stack+0.039
 hips+0.037
gency+0.037
ses+0.037
Neg logit contributions
aw-0.046
 at-0.046
--0.045
 by-0.043
up-0.040
Token his
Token ID465
Feature activation-2.29e+00

Pos logit contributions
stack+0.108
+0.107
ses+0.101
gency+0.100
alk+0.100
Neg logit contributions
 at-0.068
--0.065
aw-0.063
 by-0.062
up-0.050
Token friend
Token ID1545
Feature activation-4.78e+00

Pos logit contributions
stack+0.099
+0.099
gency+0.096
alk+0.093
urance+0.093
Neg logit contributions
--0.044
 at-0.044
 by-0.040
aw-0.033
up-0.029
Token.
Token ID13
Feature activation-5.96e-01

Pos logit contributions
stack+0.180
+0.179
 hips+0.174
gency+0.170
urance+0.169
Neg logit contributions
aw-0.102
 at-0.097
--0.096
 by-0.087
up-0.086
Token But
Token ID887
Feature activation-1.15e-02

Pos logit contributions
stack+0.020
alk+0.018
+0.018
ses+0.018
 slope+0.018
Neg logit contributions
 at-0.018
 by-0.017
--0.016
aw-0.016
 with-0.014
Token then
Token ID788
Feature activation-1.01e+00

Pos logit contributions
stack+0.000
+0.000
 hips+0.000
gency+0.000
 spine+0.000
Neg logit contributions
aw-0.000
 at-0.000
--0.000
 by-0.000
up-0.000
Token,
Token ID11
Feature activation-2.37e-02

Pos logit contributions
+0.030
gency+0.030
stack+0.029
 hips+0.029
drivers+0.029
Neg logit contributions
aw-0.030
--0.026
 at-0.026
 by-0.025
up-0.025
Token his
Token ID465
Feature activation-4.73e-02

Pos logit contributions
+0.001
stack+0.001
gency+0.001
 hips+0.001
gy+0.001
Neg logit contributions
aw-0.001
--0.001
 at-0.001
up-0.001
 by-0.001
Token He
Token ID679
Feature activation-6.95e-01

Pos logit contributions
 at+0.020
aw+0.019
 by+0.018
-+0.017
 with+0.016
Neg logit contributions
stack-0.024
arse-0.023
ivery-0.023
-0.023
ses-0.022
Token wanted
Token ID2227
Feature activation-1.57e+00

Pos logit contributions
stack+0.021
+0.021
 hips+0.020
ses+0.020
sed+0.020
Neg logit contributions
 at-0.019
aw-0.019
--0.019
 by-0.019
up-0.017
Token to
Token ID284
Feature activation-9.86e-02

Pos logit contributions
stack+0.063
gency+0.062
 hips+0.062
+0.062
drivers+0.062
Neg logit contributions
aw-0.029
 at-0.026
--0.026
 by-0.022
up-0.021
Token play
Token ID711
Feature activation-1.65e+00

Pos logit contributions
stack+0.004
+0.004
ses+0.003
drivers+0.003
gency+0.003
Neg logit contributions
 at-0.002
--0.002
aw-0.002
 by-0.002
up-0.002
Token with
Token ID351
Feature activation-1.14e+00

Pos logit contributions
+0.067
stack+0.066
 hips+0.065
gency+0.063
 slope+0.062
Neg logit contributions
aw-0.031
up-0.025
 at-0.022
--0.021
 by-0.018
Token Big
Token ID4403
Feature activation-4.72e+00

Pos logit contributions
stack+0.043
+0.043
 hips+0.042
gency+0.041
drivers+0.041
Neg logit contributions
aw-0.025
 at-0.025
--0.024
 by-0.022
up-0.020
Token Red
Token ID2297
Feature activation-3.08e+00

Pos logit contributions
+0.117
stack+0.114
 hips+0.111
ses+0.109
drivers+0.109
Neg logit contributions
aw-0.166
 at-0.164
--0.156
 by-0.152
up-0.145
Token.
Token ID13
Feature activation+5.59e-01

Pos logit contributions
stack+0.120
+0.118
 hips+0.114
drivers+0.113
ses+0.113
Neg logit contributions
aw-0.066
 at-0.062
--0.060
 by-0.055
up-0.053
Token He
Token ID679
Feature activation-7.35e-01

Pos logit contributions
 at+0.017
 by+0.016
-+0.016
aw+0.015
 with+0.013
Neg logit contributions
stack-0.018
alk-0.018
ses-0.017
-0.017
sed-0.017
Token reached
Token ID4251
Feature activation+8.97e-01

Pos logit contributions
stack+0.023
+0.023
 hips+0.022
ses+0.022
gency+0.022
Neg logit contributions
 at-0.020
aw-0.020
--0.020
 by-0.019
up-0.017
Token for
Token ID329
Feature activation+1.97e+00

Pos logit contributions
aw+0.014
-+0.012
 at+0.012
up+0.010
 by+0.010
Neg logit contributions
stack-0.041
-0.040
 hips-0.039
ses-0.039
gency-0.039
Token of
Token ID286
Feature activation+6.06e-01

Pos logit contributions
stack+0.054
 hips+0.050
+0.050
ses+0.050
 slope+0.050
Neg logit contributions
aw-0.052
--0.049
 at-0.047
up-0.046
 by-0.045
Token her
Token ID607
Feature activation+2.17e-01

Pos logit contributions
 at+0.011
aw+0.011
-+0.011
 by+0.010
up+0.009
Neg logit contributions
stack-0.025
-0.025
 hips-0.024
ses-0.024
gency-0.024
Token hands
Token ID2832
Feature activation-1.60e+00

Pos logit contributions
 at+0.007
aw+0.007
-+0.007
 by+0.006
up+0.006
Neg logit contributions
stack-0.006
-0.006
 hips-0.006
ses-0.006
gency-0.006
Token on
Token ID319
Feature activation-2.42e+00

Pos logit contributions
stack+0.074
+0.074
 hips+0.071
gency+0.071
ses+0.071
Neg logit contributions
aw-0.021
 at-0.021
--0.020
 by-0.018
up-0.015
Token her
Token ID607
Feature activation-9.99e-01

Pos logit contributions
stack+0.097
+0.097
ses+0.095
drivers+0.093
sed+0.093
Neg logit contributions
 at-0.053
--0.049
 by-0.048
aw-0.047
 with-0.036
Token hips
Token ID21497
Feature activation-4.70e+00

Pos logit contributions
stack+0.032
+0.031
gency+0.031
alk+0.029
ivery+0.029
Neg logit contributions
--0.031
 at-0.031
 by-0.028
aw-0.027
 with-0.023
Token.
Token ID13
Feature activation-1.12e+00

Pos logit contributions
stack+0.213
+0.210
ses+0.208
alk+0.204
gency+0.204
Neg logit contributions
 at-0.088
--0.076
 by-0.075
aw-0.069
 with-0.054
Token She
Token ID1375
Feature activation-1.21e-01

Pos logit contributions
stack+0.030
+0.029
gency+0.027
 hips+0.027
ses+0.027
Neg logit contributions
 at-0.038
aw-0.037
--0.037
 by-0.036
up-0.033
Token shouted
Token ID17293
Feature activation-9.80e-01

Pos logit contributions
stack+0.003
+0.003
 hips+0.003
gency+0.003
sed+0.003
Neg logit contributions
aw-0.003
 at-0.003
--0.003
 by-0.003
up-0.003
Token,
Token ID11
Feature activation-6.60e-01

Pos logit contributions
 hips+0.033
sed+0.030
ishly+0.030
urance+0.030
stack+0.030
Neg logit contributions
aw-0.026
up-0.023
--0.019
 by-0.018
 bra-0.018
Token "
Token ID366
Feature activation-1.98e+00

Pos logit contributions
+0.021
stack+0.021
alk+0.021
gency+0.020
drivers+0.020
Neg logit contributions
 at-0.020
--0.018
 by-0.018
aw-0.017
 with-0.015
Token away
Token ID1497
Feature activation-2.00e+00

Pos logit contributions
 hips+0.033
gy+0.032
stack+0.031
 fists+0.031
urance+0.031
Neg logit contributions
aw-0.032
up-0.028
--0.028
 at-0.027
 by-0.025
Token his
Token ID465
Feature activation-1.38e+00

Pos logit contributions
stack+0.065
 hips+0.063
+0.062
gy+0.060
urance+0.060
Neg logit contributions
aw-0.057
 at-0.051
--0.049
up-0.046
 by-0.045
Token tears
Token ID10953
Feature activation-2.24e-01

Pos logit contributions
stack+0.040
+0.039
 hips+0.037
gency+0.036
ses+0.036
Neg logit contributions
 at-0.044
aw-0.044
--0.043
 by-0.041
up-0.038
Token with
Token ID351
Feature activation-1.16e+00

Pos logit contributions
+0.008
stack+0.008
 hips+0.008
 spine+0.008
urance+0.008
Neg logit contributions
aw-0.005
--0.004
 at-0.004
up-0.004
 by-0.004
Token his
Token ID465
Feature activation-2.09e+00

Pos logit contributions
stack+0.043
+0.043
gency+0.041
ses+0.041
 hips+0.041
Neg logit contributions
 at-0.028
aw-0.027
--0.026
 by-0.025
up-0.022
Token sleeve
Token ID23179
Feature activation-4.66e+00

Pos logit contributions
stack+0.069
+0.065
gency+0.063
drivers+0.062
alk+0.062
Neg logit contributions
 at-0.062
--0.062
aw-0.058
 by-0.057
up-0.050
Token.
Token ID13
Feature activation-3.10e-01

Pos logit contributions
stack+0.194
+0.192
ses+0.186
gency+0.185
alk+0.182
Neg logit contributions
 at-0.092
--0.088
aw-0.086
 by-0.083
up-0.068
Token "
Token ID366
Feature activation-8.85e-01

Pos logit contributions
stack+0.009
+0.008
gency+0.008
ivery+0.008
drivers+0.008
Neg logit contributions
 at-0.010
 by-0.009
--0.009
aw-0.009
 with-0.008
TokenOk
Token ID18690
Feature activation-2.32e+00

Pos logit contributions
+0.024
stack+0.023
gency+0.021
 hips+0.021
urance+0.020
Neg logit contributions
 at-0.030
--0.030
aw-0.029
 by-0.027
up-0.026
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
stack+0.077
+0.076
 hips+0.075
gency+0.073
urance+0.072
Neg logit contributions
aw-0.064
 at-0.061
--0.059
 by-0.055
up-0.052
Token Mom
Token ID11254
Feature activation+6.12e-01

Pos logit contributions
stack+0.024
 hips+0.022
 spine+0.019
+0.019
gency+0.019
Neg logit contributions
aw-0.044
--0.039
 at-0.037
up-0.036
 by-0.034
Token put
Token ID1234
Feature activation+4.93e-01

Pos logit contributions
stack+0.029
ishly+0.027
+0.026
 hips+0.026
sed+0.025
Neg logit contributions
aw-0.029
--0.029
 at-0.027
up-0.027
 by-0.026
Token his
Token ID465
Feature activation-7.43e-01

Pos logit contributions
aw+0.011
 at+0.010
-+0.010
 by+0.009
up+0.009
Neg logit contributions
stack-0.019
-0.018
 hips-0.018
gency-0.017
drivers-0.017
Token hands
Token ID2832
Feature activation-1.41e+00

Pos logit contributions
+0.022
stack+0.022
alk+0.020
drivers+0.020
urance+0.020
Neg logit contributions
 at-0.024
--0.024
 by-0.023
aw-0.022
up-0.020
Token on
Token ID319
Feature activation-2.47e+00

Pos logit contributions
stack+0.065
+0.065
ses+0.062
 hips+0.062
gency+0.062
Neg logit contributions
 at-0.020
aw-0.019
--0.019
 by-0.017
up-0.014
Token his
Token ID465
Feature activation-1.96e+00

Pos logit contributions
stack+0.101
+0.101
ses+0.097
gency+0.096
drivers+0.096
Neg logit contributions
 at-0.049
--0.047
aw-0.046
 by-0.044
up-0.035
Token hips
Token ID21497
Feature activation-4.66e+00

Pos logit contributions
stack+0.052
+0.052
ivery+0.048
drivers+0.048
alk+0.048
Neg logit contributions
--0.071
 at-0.070
 by-0.065
aw-0.061
 in-0.057
Token.
Token ID13
Feature activation-5.95e-01

Pos logit contributions
stack+0.216
+0.213
ses+0.213
alk+0.209
gency+0.208
Neg logit contributions
 at-0.084
 by-0.072
--0.072
aw-0.065
 with-0.051
Token
Token ID220
Feature activation-2.10e+00

Pos logit contributions
stack+0.019
ivery+0.018
ses+0.018
gency+0.018
+0.018
Neg logit contributions
 at-0.018
aw-0.017
 by-0.017
--0.016
 with-0.014
Token
Token ID198
Feature activation-5.53e-01

Pos logit contributions
stack+0.082
 spine+0.080
gency+0.079
ses+0.079
 slope+0.078
Neg logit contributions
 at-0.052
 by-0.049
aw-0.046
--0.045
 with-0.038
Token
Token ID198
Feature activation-9.29e-01

Pos logit contributions
ses+0.026
sed+0.025
 spine+0.025
stack+0.024
sing+0.024
Neg logit contributions
 by-0.010
 at-0.010
--0.009
aw-0.009
 with-0.009
TokenThen
Token ID6423
Feature activation-1.39e+00

Pos logit contributions
ses+0.031
stack+0.031
sed+0.030
+0.029
 spine+0.029
Neg logit contributions
aw-0.027
 at-0.026
--0.026
 by-0.025
up-0.023
Token his
Token ID465
Feature activation-2.28e+00

Pos logit contributions
stack+0.026
+0.026
 hips+0.025
drivers+0.025
alk+0.024
Neg logit contributions
aw-0.017
 at-0.016
--0.016
 by-0.014
up-0.014
Token sp
Token ID599
Feature activation-1.55e+00

Pos logit contributions
+0.074
stack+0.073
 hips+0.070
ses+0.069
gency+0.068
Neg logit contributions
aw-0.064
 at-0.063
--0.060
 by-0.058
up-0.053
Tokenade
Token ID671
Feature activation-6.92e-01

Pos logit contributions
stack+0.061
+0.060
gency+0.058
ses+0.058
 hips+0.057
Neg logit contributions
 at-0.034
--0.033
aw-0.033
 by-0.030
up-0.028
Token with
Token ID351
Feature activation-1.77e+00

Pos logit contributions
+0.028
stack+0.028
 hips+0.027
drivers+0.027
ses+0.026
Neg logit contributions
aw-0.013
 at-0.013
--0.012
 by-0.011
up-0.010
Token his
Token ID465
Feature activation-2.61e+00

Pos logit contributions
stack+0.069
+0.066
ses+0.064
 spine+0.064
 slope+0.063
Neg logit contributions
 at-0.045
--0.044
aw-0.042
 by-0.042
up-0.035
Token friend
Token ID1545
Feature activation-4.65e+00

Pos logit contributions
stack+0.113
+0.108
gency+0.106
urance+0.103
 slope+0.103
Neg logit contributions
 at-0.052
--0.050
 by-0.046
aw-0.044
up-0.035
Token.
Token ID13
Feature activation-3.48e-01

Pos logit contributions
stack+0.183
+0.183
 hips+0.173
gency+0.173
ses+0.173
Neg logit contributions
 at-0.097
aw-0.096
--0.093
 by-0.087
up-0.077
Token They
Token ID1119
Feature activation-1.44e+00

Pos logit contributions
stack+0.012
ses+0.011
gency+0.011
 spine+0.011
+0.011
Neg logit contributions
 at-0.011
 by-0.010
aw-0.010
--0.010
 with-0.008
Token both
Token ID1111
Feature activation-1.58e+00

Pos logit contributions
stack+0.047
+0.046
 hips+0.043
gency+0.043
ses+0.043
Neg logit contributions
aw-0.040
--0.040
 at-0.039
 by-0.037
up-0.034
Token played
Token ID2826
Feature activation-1.14e+00

Pos logit contributions
stack+0.053
+0.052
 hips+0.049
ses+0.049
gency+0.049
Neg logit contributions
aw-0.042
 at-0.041
--0.041
 by-0.039
up-0.036
Token in
Token ID287
Feature activation-1.80e+00

Pos logit contributions
+0.043
stack+0.041
 hips+0.040
alk+0.039
drivers+0.038
Neg logit contributions
up-0.024
aw-0.024
--0.022
 at-0.020
 bra-0.018
Token had
Token ID550
Feature activation-7.59e-01

Pos logit contributions
stack+0.018
+0.017
 hips+0.017
ishly+0.016
gy+0.016
Neg logit contributions
aw-0.016
 at-0.016
--0.015
 by-0.014
up-0.014
Token something
Token ID1223
Feature activation+1.66e+00

Pos logit contributions
+0.025
 hips+0.024
gency+0.024
stack+0.024
urance+0.024
Neg logit contributions
aw-0.020
--0.019
 at-0.019
 by-0.017
up-0.016
Token special
Token ID2041
Feature activation+1.28e+00

Pos logit contributions
aw+0.046
-+0.037
 at+0.036
up+0.035
 by+0.033
Neg logit contributions
 hips-0.051
-0.051
stack-0.051
drivers-0.049
sed-0.047
Token up
Token ID510
Feature activation+7.96e-01

Pos logit contributions
aw+0.029
-+0.024
 at+0.023
up+0.023
 by+0.020
Neg logit contributions
stack-0.049
-0.047
 hips-0.046
drivers-0.045
ses-0.044
Token her
Token ID607
Feature activation-7.86e-01

Pos logit contributions
aw+0.015
 at+0.013
-+0.013
up+0.012
 by+0.011
Neg logit contributions
stack-0.033
-0.033
 hips-0.032
gency-0.031
alk-0.031
Token sleeve
Token ID23179
Feature activation-4.64e+00

Pos logit contributions
+0.028
stack+0.028
 hips+0.027
ses+0.026
 spine+0.026
Neg logit contributions
aw-0.019
 at-0.019
--0.018
 by-0.017
up-0.016
Token,
Token ID11
Feature activation-3.96e-01

Pos logit contributions
stack+0.190
+0.189
ses+0.180
gency+0.180
 hips+0.179
Neg logit contributions
 at-0.091
aw-0.088
--0.087
 by-0.081
up-0.070
Token but
Token ID475
Feature activation-8.69e-02

Pos logit contributions
stack+0.013
 hips+0.012
+0.012
 spine+0.012
ses+0.011
Neg logit contributions
aw-0.011
--0.010
 at-0.010
up-0.010
 by-0.009
Token he
Token ID339
Feature activation-9.72e-01

Pos logit contributions
stack+0.003
 hips+0.002
+0.002
ishly+0.002
 spine+0.002
Neg logit contributions
aw-0.003
--0.003
up-0.003
 at-0.002
 by-0.002
Token wasn
Token ID2492
Feature activation+4.74e-01

Pos logit contributions
stack+0.028
+0.028
 hips+0.027
sed+0.026
ses+0.025
Neg logit contributions
aw-0.029
 at-0.028
--0.027
 by-0.027
up-0.025
Token't
Token ID470
Feature activation-1.39e+00

Pos logit contributions
 at+0.021
 by+0.019
-+0.019
 of+0.019
aw+0.019
Neg logit contributions
ses-0.010
 spine-0.010
sed-0.010
stack-0.009
gency-0.008
Token.
Token ID13
Feature activation-8.78e-01

Pos logit contributions
 at+0.015
-+0.014
aw+0.014
 by+0.013
up+0.011
Neg logit contributions
stack-0.029
-0.029
gency-0.027
ses-0.027
 hips-0.027
Token "
Token ID366
Feature activation-8.96e-01

Pos logit contributions
ivery+0.035
gency+0.034
arse+0.034
 spine+0.033
alk+0.033
Neg logit contributions
 at-0.020
 by-0.019
--0.018
 with-0.017
aw-0.016
TokenHe
Token ID1544
Feature activation-1.20e+00

Pos logit contributions
stack+0.031
+0.031
 hips+0.030
urance+0.030
gency+0.030
Neg logit contributions
 at-0.024
aw-0.023
--0.023
 by-0.022
up-0.020
Token hurt
Token ID5938
Feature activation-1.69e+00

Pos logit contributions
gy+0.036
sed+0.036
ses+0.036
stack+0.035
+0.035
Neg logit contributions
aw-0.033
 at-0.031
--0.030
 by-0.030
up-0.025
Token my
Token ID616
Feature activation-2.22e-01

Pos logit contributions
+0.085
stack+0.080
ses+0.078
 spine+0.077
urance+0.077
Neg logit contributions
--0.028
 at-0.028
 by-0.028
aw-0.026
 with-0.019
Token feelings
Token ID7666
Feature activation-4.64e+00

Pos logit contributions
+0.007
stack+0.007
alk+0.006
gency+0.006
drivers+0.006
Neg logit contributions
--0.007
 at-0.007
 by-0.007
aw-0.006
up-0.006
Token."
Token ID526
Feature activation-7.99e-01

Pos logit contributions
+0.226
stack+0.221
gency+0.217
ses+0.217
sed+0.214
Neg logit contributions
 at-0.087
 by-0.080
--0.077
 with-0.062
 of-0.060
Token
Token ID198
Feature activation-7.10e-01

Pos logit contributions
gency+0.033
stack+0.033
ivery+0.032
ses+0.032
 slope+0.032
Neg logit contributions
 at-0.018
 by-0.017
--0.016
 with-0.014
aw-0.014
Token
Token ID198
Feature activation-1.00e+00

Pos logit contributions
ses+0.031
 spine+0.031
stack+0.031
sed+0.030
gency+0.030
Neg logit contributions
--0.016
 by-0.015
 at-0.014
aw-0.013
 with-0.012
TokenTim
Token ID14967
Feature activation-6.50e-01

Pos logit contributions
ses+0.035
+0.034
stack+0.034
 spine+0.033
sed+0.033
Neg logit contributions
--0.030
 at-0.028
aw-0.028
 by-0.028
up-0.025
Tokenmy
Token ID1820
Feature activation+3.99e-01

Pos logit contributions
+0.024
stack+0.024
ses+0.022
gency+0.022
alk+0.022
Neg logit contributions
 at-0.017
--0.016
aw-0.016
 by-0.016
up-0.014
Token would
Token ID561
Feature activation-9.88e-01

Pos logit contributions
aw+0.004
-+0.004
 at+0.004
 by+0.004
up+0.003
Neg logit contributions
-0.004
stack-0.004
gency-0.004
ses-0.004
alk-0.004
Token love
Token ID1842
Feature activation+4.38e-01

Pos logit contributions
stack+0.043
+0.042
drivers+0.040
gency+0.040
ses+0.040
Neg logit contributions
 at-0.018
--0.017
aw-0.017
 by-0.016
up-0.013
Token to
Token ID284
Feature activation-1.73e-02

Pos logit contributions
aw+0.008
 at+0.008
-+0.007
up+0.007
 by+0.006
Neg logit contributions
stack-0.018
drivers-0.017
gency-0.017
-0.017
 hips-0.017
Token play
Token ID711
Feature activation-1.28e+00

Pos logit contributions
stack+0.001
+0.001
ses+0.001
 spine+0.001
drivers+0.001
Neg logit contributions
 at-0.000
--0.000
aw-0.000
 by-0.000
up-0.000
Token with
Token ID351
Feature activation-7.61e-01

Pos logit contributions
+0.058
stack+0.057
 hips+0.055
gency+0.054
drivers+0.054
Neg logit contributions
aw-0.018
 at-0.016
--0.015
up-0.014
 by-0.013
Token Big
Token ID4403
Feature activation-4.62e+00

Pos logit contributions
stack+0.032
+0.032
ses+0.030
urance+0.030
 hips+0.030
Neg logit contributions
 at-0.015
aw-0.015
--0.015
 by-0.014
up-0.011
Tokengie
Token ID22699
Feature activation-1.98e+00

Pos logit contributions
stack+0.078
+0.072
gency+0.063
 hips+0.062
alk+0.062
Neg logit contributions
--0.209
 at-0.207
aw-0.206
 by-0.201
up-0.184
Token.
Token ID13
Feature activation-2.54e-01

Pos logit contributions
stack+0.074
+0.073
 hips+0.070
ses+0.069
gency+0.069
Neg logit contributions
aw-0.045
 at-0.045
--0.043
 by-0.040
up-0.037
Token He
Token ID679
Feature activation-9.18e-01

Pos logit contributions
stack+0.007
+0.007
 hips+0.006
ses+0.006
gency+0.006
Neg logit contributions
 at-0.008
aw-0.008
--0.008
 by-0.008
up-0.007
Token is
Token ID318
Feature activation-1.90e-01

Pos logit contributions
ses+0.029
sed+0.028
gy+0.028
stack+0.028
 hips+0.027
Neg logit contributions
aw-0.024
 at-0.022
--0.022
 by-0.021
up-0.019
Token so
Token ID523
Feature activation+4.87e-02

Pos logit contributions
stack+0.006
 hips+0.006
ses+0.006
+0.005
gy+0.005
Neg logit contributions
aw-0.005
--0.005
up-0.005
 at-0.005
 by-0.005
Token about
Token ID546
Feature activation+1.36e-01

Pos logit contributions
ses+0.062
stack+0.062
sed+0.061
+0.060
 spine+0.060
Neg logit contributions
aw-0.030
up-0.028
--0.025
 at-0.024
 by-0.023
Token not
Token ID407
Feature activation-1.26e+00

Pos logit contributions
 at+0.003
 by+0.003
-+0.003
aw+0.003
 with+0.002
Neg logit contributions
stack-0.005
-0.005
ses-0.005
sed-0.005
ivery-0.005
Token sharing
Token ID7373
Feature activation-4.41e-01

Pos logit contributions
stack+0.034
+0.033
urance+0.032
 spine+0.032
gency+0.032
Neg logit contributions
aw-0.042
--0.039
 at-0.037
up-0.037
 by-0.034
Token with
Token ID351
Feature activation-1.58e+00

Pos logit contributions
stack+0.016
+0.016
 hips+0.016
drivers+0.015
gency+0.015
Neg logit contributions
aw-0.010
--0.009
 at-0.009
up-0.008
 by-0.008
Token his
Token ID465
Feature activation-2.29e+00

Pos logit contributions
stack+0.060
+0.059
 spine+0.057
ses+0.057
 slope+0.056
Neg logit contributions
 at-0.044
 by-0.042
--0.041
aw-0.037
 with-0.032
Token friend
Token ID1545
Feature activation-4.62e+00

Pos logit contributions
stack+0.083
+0.081
gency+0.079
 hips+0.076
alk+0.076
Neg logit contributions
 at-0.062
--0.060
 by-0.056
aw-0.053
up-0.044
Token.
Token ID13
Feature activation-4.62e-01

Pos logit contributions
+0.169
stack+0.168
 hips+0.161
urance+0.158
gency+0.157
Neg logit contributions
aw-0.103
--0.099
 at-0.098
up-0.090
 by-0.089
Token The
Token ID383
Feature activation-1.19e+00

Pos logit contributions
 slope+0.017
stack+0.017
 hips+0.017
ses+0.017
drivers+0.017
Neg logit contributions
 at-0.012
 by-0.012
--0.011
aw-0.011
 with-0.009
Token bad
Token ID2089
Feature activation-2.11e-01

Pos logit contributions
stack+0.042
+0.041
gency+0.040
 hips+0.039
alk+0.039
Neg logit contributions
 at-0.034
 by-0.032
--0.031
aw-0.029
 more-0.027
Token ending
Token ID7464
Feature activation+8.90e-03

Pos logit contributions
stack+0.005
+0.005
 hips+0.005
gency+0.005
ses+0.005
Neg logit contributions
 at-0.007
--0.007
aw-0.007
 by-0.007
up-0.007
Token was
Token ID373
Feature activation-7.85e-01

Pos logit contributions
aw+0.000
 at+0.000
-+0.000
up+0.000
 bra+0.000
Neg logit contributions
stack-0.000
-0.000
drivers-0.000
alk-0.000
ses-0.000
Token had
Token ID550
Feature activation-1.78e-01

Pos logit contributions
stack+0.004
+0.004
 hips+0.004
gency+0.004
ses+0.004
Neg logit contributions
aw-0.003
 at-0.003
--0.003
 by-0.003
up-0.003
Token a
Token ID257
Feature activation+4.26e-01

Pos logit contributions
+0.005
 hips+0.005
drivers+0.005
urance+0.005
gency+0.005
Neg logit contributions
--0.005
 at-0.005
aw-0.005
 by-0.005
up-0.004
Token trick
Token ID6908
Feature activation-1.66e-01

Pos logit contributions
 at+0.010
aw+0.010
-+0.010
 by+0.009
up+0.008
Neg logit contributions
-0.016
stack-0.015
 hips-0.015
gency-0.015
ses-0.015
Token up
Token ID510
Feature activation+7.95e-01

Pos logit contributions
stack+0.007
+0.006
 hips+0.006
drivers+0.006
gency+0.006
Neg logit contributions
aw-0.003
--0.003
 at-0.002
up-0.002
 by-0.002
Token his
Token ID465
Feature activation-1.60e+00

Pos logit contributions
aw+0.016
-+0.014
 at+0.013
up+0.013
 by+0.011
Neg logit contributions
-0.032
stack-0.032
alk-0.030
 hips-0.030
gency-0.030
Token sleeve
Token ID23179
Feature activation-4.60e+00

Pos logit contributions
stack+0.053
+0.050
gency+0.048
drivers+0.047
alk+0.047
Neg logit contributions
 at-0.048
--0.047
 by-0.044
aw-0.044
up-0.038
Token.
Token ID13
Feature activation-3.13e-02

Pos logit contributions
stack+0.186
+0.185
ses+0.177
gency+0.176
 hips+0.176
Neg logit contributions
 at-0.092
aw-0.090
--0.088
 by-0.083
up-0.071
Token<|endoftext|>
Token ID50256
Feature activation-2.36e-01

Pos logit contributions
gency+0.001
ivery+0.001
 spine+0.001
sed+0.001
stack+0.001
Neg logit contributions
 at-0.001
 by-0.001
aw-0.001
--0.001
 with-0.001
TokenOnce
Token ID7454
Feature activation-9.96e-01

Pos logit contributions
stack+0.007
+0.007
ses+0.007
 hips+0.007
gency+0.007
Neg logit contributions
 at-0.007
aw-0.007
--0.007
 by-0.007
up-0.006
Token upon
Token ID2402
Feature activation+9.83e-01

Pos logit contributions
stack+0.038
+0.037
gency+0.037
ses+0.036
drivers+0.036
Neg logit contributions
 at-0.028
 by-0.025
--0.024
aw-0.022
 of-0.020
Token a
Token ID257
Feature activation+6.93e-01

Pos logit contributions
 at+0.022
aw+0.022
-+0.022
 by+0.020
up+0.018
Neg logit contributions
stack-0.037
-0.037
 hips-0.035
gency-0.035
ses-0.035
Token had
Token ID550
Feature activation+1.45e-01

Pos logit contributions
stack+0.033
+0.033
 hips+0.031
ses+0.031
gency+0.031
Neg logit contributions
 at-0.032
aw-0.032
--0.031
 by-0.030
up-0.027
Token hurt
Token ID5938
Feature activation-1.47e+00

Pos logit contributions
 at+0.002
aw+0.002
-+0.002
 by+0.002
up+0.002
Neg logit contributions
stack-0.007
-0.007
 hips-0.006
gency-0.006
 spine-0.006
Token their
Token ID511
Feature activation-2.26e+00

Pos logit contributions
+0.060
stack+0.059
gency+0.056
ses+0.056
alk+0.055
Neg logit contributions
 at-0.034
 by-0.033
--0.032
aw-0.032
 with-0.024
Token dad
Token ID9955
Feature activation-1.59e+00

Pos logit contributions
stack+0.066
+0.066
gency+0.061
drivers+0.060
 hips+0.060
Neg logit contributions
 at-0.072
--0.071
aw-0.070
 by-0.067
up-0.061
Token's
Token ID338
Feature activation-1.30e+00

Pos logit contributions
 hips+0.062
+0.061
stack+0.061
urance+0.059
gy+0.059
Neg logit contributions
aw-0.031
 at-0.028
--0.027
up-0.024
 by-0.024
Token feelings
Token ID7666
Feature activation-4.60e+00

Pos logit contributions
stack+0.038
+0.037
gency+0.035
drivers+0.034
 slope+0.033
Neg logit contributions
 at-0.044
--0.043
 by-0.041
aw-0.040
up-0.036
Token.
Token ID13
Feature activation+9.20e-01

Pos logit contributions
stack+0.200
+0.200
gency+0.192
ses+0.192
sed+0.188
Neg logit contributions
 at-0.086
--0.079
aw-0.077
 by-0.076
up-0.058
Token They
Token ID1119
Feature activation-7.85e-01

Pos logit contributions
 at+0.026
aw+0.025
 by+0.023
-+0.022
 with+0.020
Neg logit contributions
-0.032
stack-0.032
gency-0.032
arse-0.031
ivery-0.030
Token said
Token ID531
Feature activation+1.63e+00

Pos logit contributions
+0.029
stack+0.029
 spine+0.027
alk+0.027
 hips+0.027
Neg logit contributions
 at-0.020
aw-0.019
--0.019
 by-0.018
up-0.016
Token sorry
Token ID7926
Feature activation-7.56e-01

Pos logit contributions
aw+0.048
-+0.046
 at+0.046
 by+0.043
up+0.042
Neg logit contributions
stack-0.049
-0.049
drivers-0.046
 hips-0.046
urance-0.045
Token to
Token ID284
Feature activation-2.87e-01

Pos logit contributions
drivers+0.024
 hips+0.023
 spine+0.023
gy+0.023
't+0.023
Neg logit contributions
aw-0.019
 at-0.014
 bra-0.013
--0.013
up-0.013
Token of
Token ID286
Feature activation+3.50e-01

Pos logit contributions
stack+0.038
+0.038
gency+0.037
ses+0.037
 hips+0.037
Neg logit contributions
 at-0.018
--0.017
aw-0.017
 by-0.017
up-0.013
Token Tim
Token ID5045
Feature activation-2.56e+00

Pos logit contributions
 at+0.006
 by+0.006
-+0.005
aw+0.005
 more+0.005
Neg logit contributions
stack-0.016
 spine-0.016
gency-0.015
ses-0.015
 slope-0.015
Tokenmy
Token ID1820
Feature activation-1.74e+00

Pos logit contributions
stack+0.088
+0.087
ses+0.082
gency+0.081
 hips+0.081
Neg logit contributions
 at-0.069
--0.068
aw-0.067
 by-0.065
up-0.058
Token's
Token ID338
Feature activation-7.23e-01

Pos logit contributions
+0.074
stack+0.073
 hips+0.070
ses+0.070
gency+0.070
Neg logit contributions
aw-0.030
 at-0.030
--0.029
 by-0.026
up-0.023
Token bad
Token ID2089
Feature activation-1.54e+00

Pos logit contributions
stack+0.025
+0.025
gency+0.024
drivers+0.023
alk+0.023
Neg logit contributions
 at-0.020
--0.019
 by-0.018
aw-0.018
up-0.015
Token behavior
Token ID4069
Feature activation-4.59e+00

Pos logit contributions
+0.044
urance+0.042
ses+0.042
stack+0.042
alk+0.042
Neg logit contributions
aw-0.048
 at-0.043
--0.043
 by-0.042
up-0.040
Token.
Token ID13
Feature activation-1.29e-01

Pos logit contributions
stack+0.192
+0.190
gency+0.181
ses+0.181
 hips+0.181
Neg logit contributions
 at-0.088
aw-0.085
--0.084
 by-0.079
up-0.066
Token<|endoftext|>
Token ID50256
Feature activation+2.25e-01

Pos logit contributions
gency+0.005
stack+0.005
 slope+0.005
arse+0.005
 hips+0.005
Neg logit contributions
 at-0.003
 by-0.003
aw-0.003
--0.003
 with-0.002
TokenOnce
Token ID7454
Feature activation-9.17e-01

Pos logit contributions
 at+0.006
 by+0.005
-+0.005
aw+0.005
 with+0.005
Neg logit contributions
stack-0.009
ses-0.009
 spine-0.008
sed-0.008
gency-0.008
Token upon
Token ID2402
Feature activation+1.30e+00

Pos logit contributions
stack+0.039
gency+0.038
+0.037
ses+0.036
drivers+0.036
Neg logit contributions
 at-0.025
 by-0.021
--0.021
 of-0.018
aw-0.016
Token a
Token ID257
Feature activation+6.78e-01

Pos logit contributions
 at+0.026
-+0.025
aw+0.024
 by+0.024
up+0.019
Neg logit contributions
stack-0.056
-0.054
gency-0.052
drivers-0.052
ses-0.052
Token off
Token ID572
Feature activation-1.89e-01

Pos logit contributions
aw+0.035
-+0.029
 at+0.028
up+0.027
 by+0.025
Neg logit contributions
-0.055
stack-0.053
 hips-0.053
ifully-0.050
 slope-0.049
Token her
Token ID607
Feature activation-3.99e-01

Pos logit contributions
stack+0.007
+0.007
 hips+0.007
gency+0.007
alk+0.007
Neg logit contributions
aw-0.004
 at-0.004
--0.004
 by-0.004
up-0.003
Token neck
Token ID7393
Feature activation-1.35e+00

Pos logit contributions
stack+0.012
+0.011
 hips+0.011
ses+0.011
 spine+0.011
Neg logit contributions
aw-0.012
 at-0.012
--0.012
 by-0.012
up-0.011
Token with
Token ID351
Feature activation-7.10e-01

Pos logit contributions
stack+0.059
+0.059
ses+0.058
gency+0.057
sed+0.056
Neg logit contributions
 at-0.025
--0.024
aw-0.023
 by-0.023
up-0.019
Token his
Token ID465
Feature activation-1.99e+00

Pos logit contributions
stack+0.027
+0.026
 hips+0.025
gency+0.025
ses+0.025
Neg logit contributions
 at-0.017
aw-0.016
--0.016
 by-0.015
up-0.013
Token sleeve
Token ID23179
Feature activation-4.59e+00

Pos logit contributions
stack+0.060
+0.058
gency+0.054
 slope+0.054
alk+0.054
Neg logit contributions
 at-0.063
--0.063
aw-0.060
 by-0.059
up-0.053
Token.
Token ID13
Feature activation-3.15e-01

Pos logit contributions
stack+0.209
+0.204
ses+0.201
gency+0.197
sed+0.196
Neg logit contributions
 at-0.085
--0.080
 by-0.075
aw-0.072
up-0.057
Token Lily
Token ID20037
Feature activation+6.48e-01

Pos logit contributions
stack+0.010
+0.010
gency+0.009
ses+0.009
alk+0.009
Neg logit contributions
 at-0.009
--0.009
aw-0.009
 by-0.009
up-0.008
Token was
Token ID373
Feature activation-3.85e-01

Pos logit contributions
 at+0.014
-+0.014
aw+0.013
 by+0.013
up+0.011
Neg logit contributions
stack-0.027
-0.026
 spine-0.025
alk-0.025
ses-0.025
Token confused
Token ID10416
Feature activation-1.76e+00

Pos logit contributions
stack+0.012
+0.012
gency+0.011
 hips+0.011
ses+0.011
Neg logit contributions
 at-0.012
aw-0.012
--0.012
 by-0.011
up-0.010
Token and
Token ID290
Feature activation-1.46e-01

Pos logit contributions
 hips+0.056
 spine+0.056
+0.054
urance+0.054
alk+0.052
Neg logit contributions
aw-0.043
up-0.038
--0.035
 at-0.033
 bra-0.031
Token.
Token ID13
Feature activation+4.32e-01

Pos logit contributions
+0.015
stack+0.015
 hips+0.014
alk+0.014
drivers+0.014
Neg logit contributions
aw-0.009
 at-0.008
--0.008
 by-0.008
up-0.007
Token Sara
Token ID24799
Feature activation+2.26e+00

Pos logit contributions
aw+0.014
-+0.014
 at+0.014
 by+0.014
up+0.013
Neg logit contributions
-0.011
stack-0.011
ses-0.011
 hips-0.010
sed-0.010
Token runs
Token ID4539
Feature activation+3.69e-01

Pos logit contributions
aw+0.067
-+0.064
 at+0.061
 by+0.059
up+0.059
Neg logit contributions
ses-0.064
 hips-0.063
-0.063
sed-0.062
ishly-0.059
Token away
Token ID1497
Feature activation-7.11e-01

Pos logit contributions
aw+0.008
up+0.006
-+0.006
ins+0.006
 at+0.006
Neg logit contributions
't-0.013
 slope-0.013
-0.013
stack-0.012
 hips-0.012
Token and
Token ID290
Feature activation+9.05e-01

Pos logit contributions
+0.029
stack+0.028
 hips+0.028
 spine+0.027
 slope+0.027
Neg logit contributions
aw-0.015
 at-0.012
--0.012
up-0.011
 by-0.010
Token spr
Token ID7500
Feature activation+5.72e+00

Pos logit contributions
aw+0.031
-+0.030
up+0.026
 by+0.026
 at+0.025
Neg logit contributions
ses-0.025
 hips-0.024
sed-0.023
stack-0.023
ishly-0.023
Tokenays
Token ID592
Feature activation+1.20e-01

Pos logit contributions
-+0.173
aw+0.159
up+0.158
 bra+0.155
 at+0.154
Neg logit contributions
stack-0.141
gency-0.134
ifully-0.133
gy-0.132
-0.132
Token him
Token ID683
Feature activation+2.44e-01

Pos logit contributions
aw+0.003
 at+0.003
-+0.003
 by+0.002
up+0.002
Neg logit contributions
-0.004
stack-0.004
ses-0.004
 hips-0.004
drivers-0.004
Token again
Token ID757
Feature activation+7.86e-01

Pos logit contributions
aw+0.004
 at+0.004
-+0.004
 by+0.003
up+0.003
Neg logit contributions
stack-0.011
-0.010
 hips-0.010
ses-0.010
gency-0.010
Token.
Token ID13
Feature activation+8.15e-01

Pos logit contributions
aw+0.018
 at+0.017
-+0.017
 by+0.015
up+0.015
Neg logit contributions
stack-0.029
-0.029
 hips-0.028
gency-0.027
 spine-0.027
Token They
Token ID1119
Feature activation-5.22e-01

Pos logit contributions
 at+0.025
aw+0.024
-+0.023
 by+0.022
up+0.020
Neg logit contributions
stack-0.025
-0.025
gency-0.024
 hips-0.024
alk-0.024
Token soon
Token ID2582
Feature activation-6.32e-02

Pos logit contributions
stack+0.006
+0.006
 hips+0.006
ses+0.006
sed+0.005
Neg logit contributions
aw-0.005
--0.005
 at-0.005
up-0.004
 by-0.004
Token nothing
Token ID2147
Feature activation+1.16e+00

Pos logit contributions
 spine+0.002
stack+0.002
 hips+0.002
+0.002
sed+0.002
Neg logit contributions
aw-0.002
up-0.002
--0.002
 at-0.001
 by-0.001
Token but
Token ID475
Feature activation+7.61e-01

Pos logit contributions
aw+0.033
-+0.027
 at+0.027
up+0.027
 by+0.025
Neg logit contributions
stack-0.034
sed-0.033
ses-0.033
-0.032
 spine-0.032
Token a
Token ID257
Feature activation-7.14e-02

Pos logit contributions
aw+0.023
-+0.020
up+0.020
 at+0.018
ins+0.017
Neg logit contributions
stack-0.022
-0.022
ses-0.020
 hips-0.020
sed-0.020
Token small
Token ID1402
Feature activation-1.96e-01

Pos logit contributions
+0.002
stack+0.002
ses+0.002
 hips+0.002
gency+0.002
Neg logit contributions
aw-0.002
 at-0.002
--0.002
 by-0.002
up-0.002
Token spe
Token ID693
Feature activation+6.14e+00

Pos logit contributions
stack+0.006
+0.006
ses+0.006
 spine+0.006
alk+0.006
Neg logit contributions
aw-0.005
 at-0.005
--0.005
up-0.005
 by-0.005
Tokenck
Token ID694
Feature activation+2.55e+00

Pos logit contributions
-+0.150
aw+0.148
 at+0.147
 by+0.136
up+0.121
Neg logit contributions
-0.230
stack-0.225
 hips-0.214
drivers-0.214
gency-0.213
Token in
Token ID287
Feature activation-2.06e+00

Pos logit contributions
aw+0.054
 at+0.052
-+0.051
 by+0.049
up+0.048
Neg logit contributions
stack-0.093
-0.090
alk-0.087
ses-0.083
sed-0.083
Token the
Token ID262
Feature activation-3.73e-01

Pos logit contributions
stack+0.077
+0.077
 hips+0.072
ses+0.072
gency+0.072
Neg logit contributions
 at-0.047
aw-0.047
--0.046
 by-0.043
up-0.039
Token sky
Token ID6766
Feature activation+1.14e-01

Pos logit contributions
stack+0.010
+0.010
 spine+0.009
ses+0.009
 hips+0.009
Neg logit contributions
 at-0.013
aw-0.012
--0.012
 by-0.012
up-0.011
Token.
Token ID13
Feature activation-1.01e-01

Pos logit contributions
aw+0.003
-+0.003
up+0.003
 at+0.003
 by+0.002
Neg logit contributions
stack-0.004
-0.004
 hips-0.003
alk-0.003
gy-0.003
Token reward
Token ID6721
Feature activation+3.36e+00

Pos logit contributions
 at+0.082
aw+0.080
-+0.080
 by+0.076
up+0.068
Neg logit contributions
-0.092
stack-0.091
 hips-0.086
gency-0.086
ses-0.085
Token.
Token ID13
Feature activation+9.41e-01

Pos logit contributions
aw+0.067
 at+0.064
-+0.062
 by+0.057
up+0.053
Neg logit contributions
stack-0.135
-0.132
 hips-0.127
drivers-0.125
gency-0.125
Token It
Token ID632
Feature activation+4.58e-01

Pos logit contributions
 at+0.030
aw+0.029
 by+0.028
-+0.027
 with+0.024
Neg logit contributions
ses-0.027
gency-0.026
stack-0.026
-0.026
drivers-0.026
Token was
Token ID373
Feature activation+3.63e-01

Pos logit contributions
aw+0.012
-+0.011
 at+0.011
 by+0.011
up+0.010
Neg logit contributions
stack-0.016
-0.015
ses-0.014
 hips-0.014
sed-0.014
Token ch
Token ID442
Feature activation+1.21e+00

Pos logit contributions
aw+0.009
 at+0.009
-+0.009
 by+0.008
up+0.008
Neg logit contributions
stack-0.013
-0.013
 hips-0.012
ses-0.012
gency-0.012
Tokenocol
Token ID4668
Feature activation+5.85e+00

Pos logit contributions
 at+0.053
aw+0.052
-+0.052
 by+0.050
up+0.048
Neg logit contributions
-0.020
stack-0.019
alk-0.017
ses-0.017
 hips-0.017
Tokenates
Token ID689
Feature activation+2.16e-01

Pos logit contributions
 at+0.216
aw+0.213
-+0.210
 by+0.204
up+0.192
Neg logit contributions
stack-0.135
-0.135
 hips-0.122
drivers-0.122
gency-0.122
Token and
Token ID290
Feature activation-8.66e-01

Pos logit contributions
aw+0.004
 at+0.004
-+0.004
 by+0.004
up+0.004
Neg logit contributions
stack-0.008
-0.008
 hips-0.008
gency-0.008
ses-0.008
Token sweets
Token ID42402
Feature activation+2.14e-01

Pos logit contributions
stack+0.029
+0.029
 hips+0.027
ses+0.027
gency+0.027
Neg logit contributions
aw-0.023
 at-0.023
--0.022
 by-0.021
up-0.020
Token!
Token ID0
Feature activation+1.86e-01

Pos logit contributions
aw+0.005
 at+0.004
-+0.004
 by+0.004
up+0.004
Neg logit contributions
stack-0.008
-0.008
 hips-0.008
drivers-0.008
gency-0.008
Token
Token ID198
Feature activation-2.80e-01

Pos logit contributions
 at+0.006
aw+0.006
 by+0.005
-+0.005
up+0.005
Neg logit contributions
stack-0.005
ses-0.005
-0.005
gency-0.005
 spine-0.005
Token It
Token ID632
Feature activation-2.97e-01

Pos logit contributions
 at+0.000
aw+0.000
-+0.000
 by+0.000
up+0.000
Neg logit contributions
stack-0.000
gency-0.000
drivers-0.000
 hips-0.000
 spine-0.000
Token was
Token ID373
Feature activation-1.52e-01

Pos logit contributions
+0.009
stack+0.009
 hips+0.009
sed+0.008
ses+0.008
Neg logit contributions
aw-0.009
 at-0.008
--0.008
 by-0.008
up-0.007
Token sweet
Token ID6029
Feature activation+1.76e+00

Pos logit contributions
+0.005
stack+0.005
 hips+0.005
ses+0.005
urance+0.005
Neg logit contributions
aw-0.004
--0.004
 at-0.004
up-0.004
 by-0.003
Token,
Token ID11
Feature activation-2.30e-02

Pos logit contributions
aw+0.049
up+0.047
ups+0.040
 bra+0.039
 at+0.038
Neg logit contributions
-0.049
 Johnson-0.043
 hips-0.042
urance-0.042
ishly-0.042
Token ch
Token ID442
Feature activation+4.96e-01

Pos logit contributions
+0.001
 hips+0.001
stack+0.001
ses+0.001
 swollen+0.001
Neg logit contributions
aw-0.001
up-0.001
--0.001
 at-0.001
 by-0.001
Tokenocol
Token ID4668
Feature activation+6.06e+00

Pos logit contributions
 at+0.021
aw+0.021
 by+0.021
-+0.020
up+0.020
Neg logit contributions
-0.009
stack-0.007
alk-0.007
ses-0.007
 hips-0.007
Tokenat
Token ID265
Feature activation+1.69e+00

Pos logit contributions
 at+0.210
aw+0.209
-+0.207
 by+0.198
up+0.183
Neg logit contributions
stack-0.156
-0.154
 hips-0.142
gency-0.142
ses-0.142
Tokeny
Token ID88
Feature activation-1.03e+00

Pos logit contributions
aw+0.043
 at+0.041
 by+0.037
-+0.037
up+0.035
Neg logit contributions
-0.058
stack-0.058
 hips-0.055
gy-0.053
ishly-0.053
Token and
Token ID290
Feature activation-3.02e-02

Pos logit contributions
stack+0.039
+0.039
 hips+0.038
sed+0.036
ses+0.036
Neg logit contributions
aw-0.022
 at-0.020
--0.019
up-0.019
 by-0.018
Token warm
Token ID5814
Feature activation+1.03e+00

Pos logit contributions
stack+0.001
+0.001
 hips+0.001
 swollen+0.001
sed+0.001
Neg logit contributions
aw-0.001
--0.001
 at-0.001
up-0.001
 by-0.001
Token,
Token ID11
Feature activation-4.05e-01

Pos logit contributions
aw+0.026
up+0.023
 at+0.022
-+0.021
 by+0.019
Neg logit contributions
-0.037
stack-0.036
 hips-0.034
ishly-0.033
urance-0.033
Token we
Token ID356
Feature activation+2.61e-01

Pos logit contributions
stack+0.018
 hips+0.017
't+0.016
urance+0.015
 Johnson+0.015
Neg logit contributions
aw-0.024
--0.021
up-0.020
 at-0.020
 by-0.018
Token can
Token ID460
Feature activation-2.79e-01

Pos logit contributions
aw+0.007
-+0.007
 at+0.006
 by+0.006
up+0.006
Neg logit contributions
-0.009
stack-0.009
gency-0.008
urance-0.008
 hips-0.008
Token get
Token ID651
Feature activation+1.75e+00

Pos logit contributions
+0.010
stack+0.010
 hips+0.009
gency+0.009
urance+0.009
Neg logit contributions
aw-0.007
 at-0.007
--0.007
 by-0.007
up-0.006
Token more
Token ID517
Feature activation+1.52e+00

Pos logit contributions
aw+0.049
-+0.044
 at+0.042
up+0.039
 by+0.039
Neg logit contributions
-0.051
stack-0.050
gency-0.050
 hips-0.049
drivers-0.049
Token ch
Token ID442
Feature activation+1.20e+00

Pos logit contributions
aw+0.044
 at+0.035
 by+0.034
up+0.034
-+0.033
Neg logit contributions
 hips-0.036
drivers-0.036
 hind-0.035
 slope-0.034
 banks-0.034
Tokenocol
Token ID4668
Feature activation+6.40e+00

Pos logit contributions
aw+0.038
 at+0.037
 by+0.037
-+0.034
up+0.033
Neg logit contributions
-0.032
alk-0.031
drivers-0.030
 hips-0.029
 spine-0.028
Tokenates
Token ID689
Feature activation+1.28e+00

Pos logit contributions
 at+0.240
aw+0.236
-+0.231
 by+0.225
up+0.213
Neg logit contributions
stack-0.145
-0.144
drivers-0.132
 hips-0.130
ses-0.129
Token later
Token ID1568
Feature activation-7.34e-01

Pos logit contributions
aw+0.032
-+0.029
 at+0.029
 by+0.027
up+0.026
Neg logit contributions
stack-0.045
-0.043
 hips-0.042
alk-0.040
drivers-0.040
Token.
Token ID13
Feature activation+5.75e-01

Pos logit contributions
stack+0.021
+0.020
ses+0.019
 hips+0.019
gency+0.019
Neg logit contributions
 at-0.024
aw-0.023
--0.023
 by-0.022
up-0.020
Token For
Token ID1114
Feature activation+7.76e-01

Pos logit contributions
aw+0.020
-+0.020
 at+0.020
 by+0.018
up+0.018
Neg logit contributions
-0.016
stack-0.015
gency-0.014
 hips-0.014
ses-0.014
Token now
Token ID783
Feature activation+8.97e-01

Pos logit contributions
 at+0.018
-+0.017
aw+0.017
 by+0.016
up+0.013
Neg logit contributions
stack-0.030
-0.030
 spine-0.029
ses-0.028
drivers-0.028
Token new
Token ID649
Feature activation+1.18e-01

Pos logit contributions
+0.013
stack+0.013
 spine+0.012
gency+0.012
 hips+0.012
Neg logit contributions
aw-0.020
 at-0.019
--0.018
 by-0.017
up-0.017
Token shelves
Token ID20726
Feature activation-2.19e-03

Pos logit contributions
aw+0.004
 at+0.003
-+0.003
 by+0.003
up+0.003
Neg logit contributions
 hips-0.003
stack-0.003
 spine-0.003
 slope-0.003
-0.003
Token.
Token ID13
Feature activation-1.95e-01

Pos logit contributions
stack+0.000
+0.000
 hips+0.000
drivers+0.000
 spine+0.000
Neg logit contributions
aw-0.000
 at-0.000
--0.000
 by-0.000
up-0.000
Token He
Token ID679
Feature activation-6.72e-01

Pos logit contributions
stack+0.006
ivery+0.005
gency+0.005
ses+0.005
 hips+0.005
Neg logit contributions
 at-0.006
aw-0.006
 by-0.006
--0.006
up-0.005
Token filled
Token ID5901
Feature activation+2.17e+00

Pos logit contributions
stack+0.022
+0.021
 hips+0.021
drivers+0.020
 slope+0.020
Neg logit contributions
aw-0.019
 at-0.018
--0.017
 by-0.017
up-0.015
Token up
Token ID510
Feature activation+2.48e+00

Pos logit contributions
aw+0.059
 at+0.051
up+0.048
-+0.048
 bra+0.047
Neg logit contributions
stack-0.066
 hips-0.063
 slope-0.061
drivers-0.061
gy-0.061
Token the
Token ID262
Feature activation+2.81e-01

Pos logit contributions
aw+0.066
 at+0.058
-+0.057
up+0.055
 by+0.051
Neg logit contributions
stack-0.080
-0.078
 hips-0.077
 slope-0.076
drivers-0.076
Token shelves
Token ID20726
Feature activation+1.15e+00

Pos logit contributions
aw+0.011
 at+0.010
-+0.010
 by+0.009
 bra+0.009
Neg logit contributions
 spine-0.006
 hips-0.006
 hallway-0.006
 sleeves-0.005
-0.005
Token with
Token ID351
Feature activation-1.45e-01

Pos logit contributions
aw+0.022
-+0.018
up+0.017
 at+0.017
 by+0.015
Neg logit contributions
-0.048
stack-0.048
 hips-0.046
 spine-0.045
 slope-0.044
Token markers
Token ID19736
Feature activation+1.34e+00

Pos logit contributions
stack+0.005
+0.005
 hips+0.004
drivers+0.004
ses+0.004
Neg logit contributions
aw-0.004
 at-0.004
--0.004
 by-0.004
up-0.003
Token,
Token ID11
Feature activation+5.51e-01

Pos logit contributions
aw+0.026
-+0.022
 at+0.022
 by+0.020
up+0.020
Neg logit contributions
stack-0.054
-0.053
drivers-0.053
 hips-0.053
 spine-0.050
Token tight
Token ID5381
Feature activation+7.35e-01

Pos logit contributions
 at+0.032
-+0.030
aw+0.029
 by+0.028
up+0.022
Neg logit contributions
stack-0.086
-0.086
gency-0.082
ses-0.081
drivers-0.081
Token.
Token ID13
Feature activation+1.47e+00

Pos logit contributions
 at+0.012
-+0.011
aw+0.011
 by+0.010
up+0.008
Neg logit contributions
-0.033
stack-0.033
gency-0.032
ses-0.031
drivers-0.031
Token
Token ID198
Feature activation+8.28e-01

Pos logit contributions
 at+0.043
aw+0.040
 by+0.039
-+0.037
 with+0.034
Neg logit contributions
gency-0.046
stack-0.046
ivery-0.046
 spine-0.046
arse-0.046
Token
Token ID198
Feature activation+5.06e-01

Pos logit contributions
aw+0.019
 at+0.019
-+0.018
 by+0.018
 of+0.014
Neg logit contributions
ses-0.033
 spine-0.033
sed-0.033
stack-0.033
gency-0.032
TokenM
Token ID44
Feature activation+1.67e+00

Pos logit contributions
aw+0.014
 at+0.013
-+0.013
 by+0.013
up+0.012
Neg logit contributions
ses-0.018
sed-0.017
stack-0.017
 spine-0.017
-0.017
Tokenaddy
Token ID13218
Feature activation+3.08e+00

Pos logit contributions
 at+0.056
aw+0.055
-+0.054
 by+0.052
up+0.048
Neg logit contributions
-0.045
stack-0.045
 hips-0.041
ses-0.041
gency-0.041
Token never
Token ID1239
Feature activation+8.35e-01

Pos logit contributions
 at+0.077
aw+0.076
-+0.074
 by+0.071
up+0.064
Neg logit contributions
-0.106
stack-0.106
 hips-0.101
gency-0.098
ses-0.098
Token worried
Token ID7960
Feature activation-2.93e-01

Pos logit contributions
aw+0.025
-+0.024
 at+0.024
 by+0.023
up+0.021
Neg logit contributions
 spine-0.022
gy-0.022
ses-0.022
urance-0.022
gency-0.022
Token about
Token ID546
Feature activation+6.69e-01

Pos logit contributions
stack+0.012
+0.012
 hips+0.012
urance+0.011
 spine+0.011
Neg logit contributions
aw-0.006
--0.005
 at-0.005
up-0.004
 by-0.004
Token losing
Token ID6078
Feature activation-1.93e+00

Pos logit contributions
 at+0.013
aw+0.012
-+0.012
 by+0.011
up+0.010
Neg logit contributions
stack-0.028
-0.028
ses-0.026
 hips-0.026
gency-0.026
Token her
Token ID607
Feature activation-2.75e-01

Pos logit contributions
stack+0.072
+0.072
 hips+0.070
gency+0.068
drivers+0.067
Neg logit contributions
aw-0.042
 at-0.040
--0.038
up-0.035
 by-0.035
Token it
Token ID340
Feature activation+5.50e-02

Pos logit contributions
aw+0.010
 at+0.010
-+0.010
up+0.009
 by+0.008
Neg logit contributions
stack-0.022
-0.022
 hips-0.021
 slope-0.021
drivers-0.021
Token,
Token ID11
Feature activation+3.21e-01

Pos logit contributions
 at+0.001
aw+0.001
-+0.001
 by+0.001
up+0.001
Neg logit contributions
stack-0.002
-0.002
gency-0.002
ses-0.002
 hips-0.002
Token stretching
Token ID20880
Feature activation+1.25e+00

Pos logit contributions
aw+0.009
 at+0.008
-+0.008
 by+0.008
up+0.008
Neg logit contributions
stack-0.011
-0.010
 hips-0.010
ses-0.010
gency-0.010
Token it
Token ID340
Feature activation-3.81e-01

Pos logit contributions
aw+0.025
 at+0.024
-+0.024
 by+0.022
up+0.020
Neg logit contributions
stack-0.050
-0.049
 hips-0.048
gency-0.047
ses-0.047
Token and
Token ID290
Feature activation-8.13e-01

Pos logit contributions
stack+0.016
+0.016
gency+0.015
ses+0.015
 hips+0.015
Neg logit contributions
 at-0.007
aw-0.007
--0.007
 by-0.006
up-0.005
Token even
Token ID772
Feature activation+2.20e+00

Pos logit contributions
stack+0.023
+0.022
 hips+0.022
gency+0.021
drivers+0.021
Neg logit contributions
aw-0.026
--0.025
 at-0.025
 by-0.023
up-0.023
Token making
Token ID1642
Feature activation+4.53e-01

Pos logit contributions
aw+0.060
-+0.056
 at+0.054
up+0.051
 by+0.048
Neg logit contributions
drivers-0.067
stack-0.066
 hips-0.066
gency-0.066
 slope-0.065
Token shapes
Token ID15268
Feature activation+2.16e+00

Pos logit contributions
 at+0.012
aw+0.012
-+0.012
 by+0.011
up+0.010
Neg logit contributions
drivers-0.013
 hips-0.012
-0.012
 slope-0.012
opes-0.012
Token out
Token ID503
Feature activation+9.45e-01

Pos logit contributions
aw+0.063
up+0.054
-+0.054
 at+0.048
 by+0.047
Neg logit contributions
stack-0.057
 slope-0.057
 hips-0.056
't-0.056
-0.056
Token of
Token ID286
Feature activation+9.51e-01

Pos logit contributions
aw+0.020
-+0.019
up+0.018
 at+0.018
 by+0.017
Neg logit contributions
stack-0.032
-0.031
alk-0.030
drivers-0.030
 slope-0.030
Token it
Token ID340
Feature activation-1.53e+00

Pos logit contributions
 at+0.022
aw+0.021
-+0.020
 by+0.020
up+0.017
Neg logit contributions
stack-0.036
-0.036
 hips-0.034
gency-0.034
ses-0.034
Token street
Token ID4675
Feature activation-1.40e+00

Pos logit contributions
 at+0.022
 by+0.022
-+0.022
aw+0.021
up+0.019
Neg logit contributions
-0.016
alk-0.014
gency-0.014
 spoiler-0.014
drivers-0.013
Token.
Token ID13
Feature activation+3.46e-01

Pos logit contributions
stack+0.053
+0.052
 hips+0.049
 spine+0.049
ses+0.048
Neg logit contributions
aw-0.031
 at-0.030
--0.030
 by-0.027
up-0.026
Token She
Token ID1375
Feature activation+8.85e-01

Pos logit contributions
 at+0.012
aw+0.012
-+0.011
 by+0.011
up+0.010
Neg logit contributions
stack-0.009
-0.009
gency-0.008
 hips-0.008
ses-0.008
Token has
Token ID468
Feature activation+8.37e-01

Pos logit contributions
aw+0.020
 at+0.020
-+0.020
 by+0.019
up+0.017
Neg logit contributions
stack-0.032
-0.031
ses-0.030
 hips-0.030
sed-0.030
Token a
Token ID257
Feature activation+1.18e+00

Pos logit contributions
 at+0.022
aw+0.022
-+0.022
 by+0.020
up+0.019
Neg logit contributions
stack-0.028
-0.028
 hips-0.027
ses-0.027
gency-0.027
Token pink
Token ID11398
Feature activation+2.38e+00

Pos logit contributions
 at+0.033
-+0.032
aw+0.032
 by+0.030
up+0.027
Neg logit contributions
stack-0.039
-0.038
 hips-0.036
gency-0.036
alk-0.036
Token dress
Token ID6576
Feature activation+1.84e+00

Pos logit contributions
 at+0.065
-+0.064
aw+0.063
 by+0.059
up+0.052
Neg logit contributions
stack-0.080
-0.079
gency-0.075
drivers-0.075
 hips-0.074
Token and
Token ID290
Feature activation+1.09e+00

Pos logit contributions
aw+0.030
 at+0.028
-+0.027
 by+0.025
up+0.024
Neg logit contributions
stack-0.079
-0.077
 hips-0.074
 spine-0.074
ses-0.073
Token a
Token ID257
Feature activation+4.59e-01

Pos logit contributions
 at+0.026
aw+0.026
-+0.025
 by+0.024
up+0.021
Neg logit contributions
stack-0.039
-0.039
 hips-0.037
ses-0.037
gency-0.037
Token red
Token ID2266
Feature activation+1.05e+00

Pos logit contributions
aw+0.013
 at+0.013
-+0.013
 by+0.012
up+0.011
Neg logit contributions
stack-0.014
-0.014
 hips-0.013
ses-0.013
gency-0.013
Token hat
Token ID6877
Feature activation+2.05e+00

Pos logit contributions
 at+0.029
-+0.028
aw+0.028
 by+0.027
up+0.024
Neg logit contributions
stack-0.034
-0.034
gency-0.032
 hips-0.032
drivers-0.032
Token
Token ID198
Feature activation+5.20e-02

Pos logit contributions
stack+0.011
gency+0.010
ses+0.010
arse+0.010
ishly+0.010
Neg logit contributions
 at-0.008
 by-0.007
--0.007
aw-0.007
 with-0.006
Token
Token ID198
Feature activation-4.00e-01

Pos logit contributions
 at+0.001
-+0.001
 by+0.001
aw+0.001
up+0.001
Neg logit contributions
stack-0.002
ses-0.002
 spine-0.002
gency-0.002
sed-0.002
Token"
Token ID1
Feature activation+7.53e-01

Pos logit contributions
stack+0.014
+0.014
ses+0.014
gency+0.013
 hips+0.013
Neg logit contributions
 at-0.010
aw-0.010
--0.010
 by-0.009
up-0.008
TokenBut
Token ID1537
Feature activation+2.06e-01

Pos logit contributions
 at+0.022
aw+0.022
-+0.022
 by+0.021
up+0.019
Neg logit contributions
stack-0.023
-0.023
 hips-0.022
gency-0.022
ses-0.022
Token I
Token ID314
Feature activation+1.44e-01

Pos logit contributions
aw+0.007
-+0.006
up+0.006
 at+0.006
 by+0.005
Neg logit contributions
stack-0.006
-0.006
 hips-0.006
ses-0.006
gency-0.006
Token like
Token ID588
Feature activation+2.22e+00

Pos logit contributions
aw+0.004
 at+0.004
-+0.004
 by+0.004
up+0.004
Neg logit contributions
-0.004
ses-0.004
gency-0.004
stack-0.004
 hips-0.004
Token the
Token ID262
Feature activation-3.41e-01

Pos logit contributions
aw+0.050
 at+0.047
-+0.046
up+0.043
 by+0.042
Neg logit contributions
stack-0.079
 hips-0.077
gency-0.077
ses-0.077
-0.077
Token loud
Token ID7812
Feature activation-2.51e-01

Pos logit contributions
+0.012
stack+0.012
alk+0.011
gency+0.011
 hips+0.011
Neg logit contributions
 at-0.009
--0.009
aw-0.009
 by-0.009
up-0.008
Token car
Token ID1097
Feature activation-1.63e+00

Pos logit contributions
+0.008
stack+0.008
 hips+0.007
gency+0.007
ses+0.007
Neg logit contributions
 at-0.007
--0.007
aw-0.007
 by-0.007
up-0.006
Token,"
Token ID553
Feature activation-2.99e-01

Pos logit contributions
stack+0.053
+0.053
 hips+0.050
ses+0.050
gency+0.050
Neg logit contributions
 at-0.045
aw-0.044
--0.044
 by-0.041
up-0.038
Token Lily
Token ID20037
Feature activation+4.58e-01

Pos logit contributions
stack+0.010
+0.010
 hips+0.009
gency+0.009
ses+0.009
Neg logit contributions
 at-0.008
aw-0.008
--0.008
 by-0.007
up-0.007
Token couldn
Token ID3521
Feature activation-2.10e-01

Pos logit contributions
+0.015
stack+0.015
 hips+0.015
gency+0.014
sed+0.014
Neg logit contributions
aw-0.012
 at-0.012
--0.012
 by-0.011
up-0.011
Token't
Token ID470
Feature activation-4.68e-01

Pos logit contributions
ses+0.003
 spine+0.003
sed+0.003
stack+0.003
gency+0.003
Neg logit contributions
 at-0.010
--0.010
 by-0.009
aw-0.009
 of-0.009
Token wait
Token ID4043
Feature activation+1.40e+00

Pos logit contributions
stack+0.019
ses+0.019
drivers+0.019
+0.019
sed+0.019
Neg logit contributions
 at-0.010
--0.009
 by-0.009
aw-0.008
up-0.007
Token to
Token ID284
Feature activation+3.03e-01

Pos logit contributions
aw+0.026
-+0.021
 at+0.020
up+0.020
 by+0.019
Neg logit contributions
 hips-0.059
-0.059
gency-0.057
stack-0.057
 spine-0.056
Token see
Token ID766
Feature activation+3.97e-01

Pos logit contributions
aw+0.009
-+0.008
 at+0.008
 by+0.007
up+0.007
Neg logit contributions
-0.010
stack-0.010
 hips-0.009
urance-0.009
gency-0.009
Token all
Token ID477
Feature activation-1.41e+00

Pos logit contributions
aw+0.010
-+0.009
 at+0.009
up+0.008
 by+0.008
Neg logit contributions
stack-0.013
gency-0.013
 hips-0.013
-0.013
 spoiler-0.012
Token the
Token ID262
Feature activation-8.93e-01

Pos logit contributions
stack+0.046
 slope+0.045
alk+0.044
 hips+0.042
+0.042
Neg logit contributions
aw-0.034
--0.030
 at-0.028
up-0.028
 bra-0.024
Token cool
Token ID3608
Feature activation+7.37e-02

Pos logit contributions
 slope+0.024
drivers+0.023
 banks+0.022
 hips+0.022
 treadmill+0.022
Neg logit contributions
aw-0.025
--0.024
 at-0.024
up-0.021
 by-0.020
Token things
Token ID1243
Feature activation+1.16e+00

Pos logit contributions
aw+0.002
 at+0.002
-+0.002
 by+0.001
 bra+0.001
Neg logit contributions
 slope-0.002
alk-0.002
-0.002
 banks-0.002
drivers-0.002
Token that
Token ID326
Feature activation-7.70e-01

Pos logit contributions
aw+0.033
-+0.030
up+0.028
 at+0.027
 by+0.026
Neg logit contributions
stack-0.036
-0.036
 hips-0.035
 slope-0.033
drivers-0.033
Token were
Token ID547
Feature activation-5.81e-01

Pos logit contributions
stack+0.024
+0.023
 hips+0.022
gency+0.022
ses+0.022
Neg logit contributions
aw-0.023
 at-0.022
--0.022
 by-0.021
up-0.020
Token.
Token ID13
Feature activation+1.83e+00

Pos logit contributions
aw+0.025
 at+0.024
-+0.024
 by+0.023
up+0.021
Neg logit contributions
stack-0.026
-0.025
 hips-0.024
gency-0.024
sed-0.024
Token
Token ID198
Feature activation+1.15e+00

Pos logit contributions
 at+0.058
aw+0.056
 by+0.054
-+0.052
 with+0.046
Neg logit contributions
gency-0.053
-0.052
stack-0.052
ses-0.052
drivers-0.052
Token
Token ID198
Feature activation+5.62e-01

Pos logit contributions
aw+0.028
 at+0.027
 by+0.026
-+0.025
 of+0.021
Neg logit contributions
ses-0.046
sed-0.046
 spine-0.046
stack-0.045
-0.045
TokenThey
Token ID2990
Feature activation+6.55e-01

Pos logit contributions
aw+0.016
 at+0.015
-+0.015
 by+0.014
up+0.013
Neg logit contributions
ses-0.020
sed-0.019
stack-0.019
-0.019
 spine-0.019
Token talked
Token ID6619
Feature activation-4.02e-01

Pos logit contributions
 at+0.016
aw+0.016
-+0.016
 by+0.015
up+0.014
Neg logit contributions
stack-0.023
-0.023
 hips-0.022
gency-0.022
ses-0.022
Token and
Token ID290
Feature activation-1.91e-01

Pos logit contributions
stack+0.017
 hips+0.017
+0.016
ses+0.016
urance+0.016
Neg logit contributions
aw-0.007
--0.006
 at-0.006
up-0.006
 by-0.005
Token talked
Token ID6619
Feature activation-1.27e+00

Pos logit contributions
stack+0.006
+0.006
 hips+0.006
ses+0.005
gency+0.005
Neg logit contributions
aw-0.006
--0.006
 at-0.005
 by-0.005
up-0.005
Token and
Token ID290
Feature activation-8.93e-01

Pos logit contributions
stack+0.050
 hips+0.049
+0.049
urance+0.048
drivers+0.048
Neg logit contributions
aw-0.024
--0.022
 at-0.022
up-0.021
 by-0.020
Token laughed
Token ID13818
Feature activation-1.09e+00

Pos logit contributions
stack+0.027
+0.026
 hips+0.026
ses+0.025
sed+0.025
Neg logit contributions
aw-0.027
--0.026
 at-0.026
up-0.025
 by-0.024
Token and
Token ID290
Feature activation-1.27e+00

Pos logit contributions
 hips+0.038
+0.037
stack+0.037
urance+0.036
ishly+0.035
Neg logit contributions
aw-0.025
up-0.024
--0.023
 at-0.021
 by-0.020
Token laughed
Token ID13818
Feature activation-1.22e+00

Pos logit contributions
stack+0.040
+0.039
 hips+0.038
ses+0.038
sed+0.037
Neg logit contributions
aw-0.036
--0.035
 at-0.034
up-0.033
 by-0.032
Token bird
Token ID6512
Feature activation+1.17e+00

Pos logit contributions
aw+0.049
up+0.047
 at+0.046
 by+0.045
-+0.044
Neg logit contributions
stack-0.042
-0.041
sed-0.037
 slope-0.037
ses-0.037
Token!
Token ID0
Feature activation+4.03e-01

Pos logit contributions
aw+0.030
 at+0.028
up+0.027
-+0.027
 by+0.027
Neg logit contributions
stack-0.039
-0.037
gy-0.036
sed-0.034
gency-0.034
Token
Token ID198
Feature activation+2.08e-03

Pos logit contributions
 at+0.012
 by+0.011
aw+0.011
-+0.011
 with+0.009
Neg logit contributions
stack-0.014
ses-0.013
gency-0.013
sed-0.013
 spine-0.013
Token
Token ID198
Feature activation-3.70e-01

Pos logit contributions
 at+0.000
-+0.000
 by+0.000
aw+0.000
 with+0.000
Neg logit contributions
stack-0.000
ses-0.000
 spine-0.000
sed-0.000
sing-0.000
TokenThe
Token ID464
Feature activation-7.52e-01

Pos logit contributions
stack+0.011
+0.011
 hips+0.010
ses+0.010
gency+0.010
Neg logit contributions
 at-0.012
aw-0.012
--0.011
 by-0.011
up-0.010
Token bird
Token ID6512
Feature activation+7.74e-01

Pos logit contributions
+0.030
stack+0.030
gency+0.028
alk+0.028
drivers+0.028
Neg logit contributions
 at-0.017
--0.016
aw-0.016
 by-0.015
up-0.013
Token fl
Token ID781
Feature activation+2.05e+00

Pos logit contributions
aw+0.021
 at+0.020
-+0.019
 by+0.018
up+0.018
Neg logit contributions
stack-0.025
-0.023
gy-0.023
 hips-0.023
sed-0.022
Tokenapped
Token ID6320
Feature activation+1.57e+00

Pos logit contributions
 at+0.063
aw+0.063
-+0.061
 by+0.058
up+0.054
Neg logit contributions
stack-0.062
-0.061
gency-0.057
 hips-0.057
drivers-0.057
Token its
Token ID663
Feature activation+6.08e-01

Pos logit contributions
aw+0.049
up+0.044
-+0.041
 at+0.038
 by+0.036
Neg logit contributions
 hips-0.044
ishly-0.042
stack-0.042
 swollen-0.040
gy-0.040
Token wings
Token ID12098
Feature activation+2.58e-01

Pos logit contributions
aw+0.021
 at+0.020
-+0.019
 by+0.019
up+0.018
Neg logit contributions
stack-0.016
ses-0.015
-0.015
 hips-0.015
sed-0.015
Token and
Token ID290
Feature activation+2.05e-01

Pos logit contributions
aw+0.005
 at+0.004
-+0.004
up+0.004
 by+0.004
Neg logit contributions
stack-0.011
-0.011
 hips-0.010
 spine-0.010
ishly-0.010
Token the
Token ID262
Feature activation-7.49e-01

Pos logit contributions
 hips+0.018
alk+0.018
stack+0.018
drivers+0.018
 banks+0.018
Neg logit contributions
aw-0.016
--0.014
 at-0.013
 by-0.012
up-0.011
Token whole
Token ID2187
Feature activation-1.50e+00

Pos logit contributions
stack+0.024
+0.023
ses+0.022
gency+0.022
 hips+0.022
Neg logit contributions
aw-0.022
 at-0.021
--0.021
 by-0.020
up-0.019
Token park
Token ID3952
Feature activation-1.13e+00

Pos logit contributions
+0.040
 slope+0.039
stack+0.038
alk+0.036
 hill+0.034
Neg logit contributions
aw-0.048
 at-0.046
--0.044
up-0.044
 by-0.042
Token.
Token ID13
Feature activation+9.55e-01

Pos logit contributions
stack+0.035
drivers+0.032
+0.031
gy+0.031
 hips+0.031
Neg logit contributions
aw-0.031
 at-0.029
--0.027
up-0.027
 by-0.027
Token It
Token ID632
Feature activation-9.06e-01

Pos logit contributions
 at+0.027
aw+0.027
 by+0.025
-+0.024
 of+0.021
Neg logit contributions
stack-0.032
gency-0.031
arse-0.031
ivery-0.031
-0.031
Token was
Token ID373
Feature activation-5.76e-01

Pos logit contributions
stack+0.033
+0.032
ses+0.031
drivers+0.030
 hips+0.030
Neg logit contributions
aw-0.022
 at-0.022
--0.021
 by-0.020
up-0.018
Token green
Token ID4077
Feature activation+2.90e-01

Pos logit contributions
stack+0.020
+0.019
ses+0.018
 hips+0.018
sed+0.018
Neg logit contributions
aw-0.015
--0.015
 at-0.014
up-0.014
 by-0.013
Token and
Token ID290
Feature activation-7.54e-01

Pos logit contributions
aw+0.005
 at+0.005
up+0.005
-+0.005
 by+0.005
Neg logit contributions
stack-0.011
-0.011
 slope-0.011
ses-0.011
gy-0.010
Token bright
Token ID6016
Feature activation+4.45e-02

Pos logit contributions
+0.024
stack+0.024
gency+0.022
 hips+0.022
ses+0.022
Neg logit contributions
 at-0.022
aw-0.021
--0.021
 by-0.020
up-0.018
Token and
Token ID290
Feature activation-1.37e+00

Pos logit contributions
up+0.001
aw+0.001
-+0.001
 at+0.001
 bra+0.001
Neg logit contributions
-0.001
gy-0.001
 slope-0.001
stack-0.001
 spoiler-0.001
Token beautiful
Token ID4950
Feature activation-4.32e-01

Pos logit contributions
stack+0.038
+0.036
 hips+0.035
ses+0.034
 slope+0.034
Neg logit contributions
aw-0.045
 at-0.044
--0.043
up-0.041
 by-0.040
Token she
Token ID673
Feature activation-1.80e-01

Pos logit contributions
stack+0.027
+0.027
gency+0.026
 hips+0.026
ses+0.026
Neg logit contributions
 at-0.016
aw-0.016
--0.015
 by-0.015
up-0.013
Token didn
Token ID1422
Feature activation-3.88e-01

Pos logit contributions
+0.005
stack+0.005
 hips+0.005
gency+0.005
ses+0.005
Neg logit contributions
aw-0.006
 at-0.006
--0.005
 by-0.005
up-0.005
Token't
Token ID470
Feature activation-1.20e+00

Pos logit contributions
 spine+0.009
sed+0.009
ses+0.009
stack+0.009
gency+0.008
Neg logit contributions
 at-0.016
--0.016
 by-0.016
 of-0.015
 with-0.015
Token have
Token ID423
Feature activation+6.22e-01

Pos logit contributions
stack+0.036
+0.036
gency+0.035
urance+0.035
alk+0.035
Neg logit contributions
aw-0.035
 at-0.033
--0.033
 by-0.031
up-0.029
Token anything
Token ID1997
Feature activation+1.32e+00

Pos logit contributions
aw+0.017
-+0.016
 at+0.016
 by+0.015
up+0.015
Neg logit contributions
gency-0.018
ses-0.018
-0.018
 hips-0.018
alk-0.018
Token to
Token ID284
Feature activation+3.64e-01

Pos logit contributions
aw+0.031
-+0.024
up+0.022
 at+0.022
 by+0.020
Neg logit contributions
 hips-0.046
 spine-0.045
drivers-0.045
sed-0.045
ses-0.045
Token play
Token ID711
Feature activation-4.87e-01

Pos logit contributions
 at+0.008
-+0.007
 by+0.007
aw+0.006
 with+0.006
Neg logit contributions
stack-0.016
-0.015
ses-0.015
drivers-0.015
sed-0.014
Token with
Token ID351
Feature activation-8.40e-01

Pos logit contributions
+0.023
stack+0.022
 hips+0.021
gy+0.021
drivers+0.020
Neg logit contributions
aw-0.007
up-0.006
--0.005
 at-0.005
 by-0.004
Token now
Token ID783
Feature activation-4.52e-01

Pos logit contributions
stack+0.035
+0.035
ses+0.033
gency+0.033
 hips+0.033
Neg logit contributions
 at-0.017
aw-0.016
--0.016
 by-0.015
up-0.012
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
stack+0.019
+0.019
ses+0.018
 hips+0.018
gency+0.018
Neg logit contributions
 at-0.009
aw-0.008
--0.008
 by-0.007
up-0.006
Token But
Token ID887
Feature activation+6.06e-01

Pos logit contributions
stack+0.001
+0.001
gency+0.001
drivers+0.001
 hips+0.001
Neg logit contributions
 at-0.001
aw-0.001
--0.001
 by-0.001
up-0.001
Token
Token ID198
Feature activation-7.02e-01

Pos logit contributions
stack+0.017
gency+0.016
+0.016
drivers+0.016
ivery+0.016
Neg logit contributions
 at-0.015
 by-0.014
aw-0.014
--0.014
up-0.012
Token
Token ID198
Feature activation-1.08e+00

Pos logit contributions
stack+0.028
ses+0.028
sed+0.028
 spine+0.028
ishly+0.028
Neg logit contributions
 at-0.017
--0.016
 by-0.016
aw-0.015
 of-0.014
TokenWhen
Token ID2215
Feature activation+1.01e+00

Pos logit contributions
stack+0.018
+0.018
ses+0.016
 hips+0.016
gency+0.016
Neg logit contributions
 at-0.048
aw-0.048
--0.047
 by-0.046
up-0.043
Token Amy
Token ID14235
Feature activation+1.49e-01

Pos logit contributions
aw+0.036
up+0.032
-+0.032
 at+0.029
 by+0.027
Neg logit contributions
-0.028
stack-0.028
't-0.025
 hips-0.024
ses-0.023
Token put
Token ID1234
Feature activation+8.27e-01

Pos logit contributions
aw+0.004
 at+0.004
 by+0.004
-+0.004
up+0.004
Neg logit contributions
-0.005
gy-0.004
stack-0.004
't-0.004
 hips-0.004
Token on
Token ID319
Feature activation-1.79e+00

Pos logit contributions
aw+0.019
up+0.015
 at+0.015
-+0.015
 by+0.013
Neg logit contributions
-0.032
stack-0.031
 hips-0.030
gency-0.029
drivers-0.029
Token the
Token ID262
Feature activation-1.63e-01

Pos logit contributions
+0.064
 hips+0.063
stack+0.062
gency+0.060
drivers+0.059
Neg logit contributions
aw-0.042
 at-0.040
--0.038
 by-0.035
up-0.034
Token dress
Token ID6576
Feature activation+4.48e-01

Pos logit contributions
+0.005
 hips+0.005
stack+0.005
ses+0.005
gency+0.005
Neg logit contributions
aw-0.005
 at-0.004
--0.004
 by-0.004
up-0.004
Token,
Token ID11
Feature activation-4.95e-01

Pos logit contributions
aw+0.010
 at+0.010
-+0.009
 by+0.009
up+0.008
Neg logit contributions
-0.017
stack-0.017
 hips-0.016
drivers-0.016
 spine-0.015
Token she
Token ID673
Feature activation-4.29e-02

Pos logit contributions
stack+0.016
+0.016
 hips+0.015
ses+0.015
gency+0.014
Neg logit contributions
aw-0.014
--0.013
 at-0.013
up-0.012
 by-0.012
Token accidentally
Token ID14716
Feature activation-1.51e-01

Pos logit contributions
+0.001
 hips+0.001
stack+0.001
sed+0.001
gency+0.001
Neg logit contributions
aw-0.001
 by-0.001
--0.001
 at-0.001
up-0.001
Token the
Token ID262
Feature activation-2.59e-01

Pos logit contributions
stack+0.020
+0.019
alk+0.019
gy+0.019
 hips+0.019
Neg logit contributions
aw-0.012
 at-0.011
--0.011
 by-0.010
up-0.009
Token frog
Token ID21264
Feature activation+1.13e-01

Pos logit contributions
stack+0.008
+0.008
 hips+0.008
ses+0.008
gency+0.008
Neg logit contributions
 at-0.007
aw-0.007
--0.007
 by-0.007
up-0.006
Token was
Token ID373
Feature activation-6.85e-01

Pos logit contributions
aw+0.003
-+0.003
 at+0.003
 by+0.003
up+0.003
Neg logit contributions
stack-0.004
-0.004
 hips-0.003
alk-0.003
sed-0.003
Token very
Token ID845
Feature activation-5.97e-01

Pos logit contributions
stack+0.022
+0.022
 hips+0.020
ses+0.020
urance+0.020
Neg logit contributions
aw-0.019
--0.019
 at-0.018
 by-0.017
up-0.017
Token troubled
Token ID17840
Feature activation-1.92e+00

Pos logit contributions
+0.019
stack+0.019
 hips+0.018
ses+0.018
sed+0.018
Neg logit contributions
aw-0.016
--0.016
 at-0.015
 by-0.015
up-0.014
Token.
Token ID13
Feature activation-1.64e+00

Pos logit contributions
stack+0.075
+0.075
 hips+0.072
gency+0.071
ses+0.070
Neg logit contributions
aw-0.039
 at-0.038
--0.037
 by-0.034
up-0.033
Token His
Token ID2399
Feature activation-1.32e+00

Pos logit contributions
gency+0.055
ses+0.054
drivers+0.053
sed+0.053
 hips+0.052
Neg logit contributions
 at-0.045
 by-0.045
aw-0.041
--0.040
 with-0.036
Token fountain
Token ID29420
Feature activation-6.03e-01

Pos logit contributions
stack+0.043
+0.043
gency+0.040
 hips+0.040
drivers+0.040
Neg logit contributions
 at-0.037
aw-0.036
--0.036
 by-0.034
up-0.031
Token had
Token ID550
Feature activation-2.04e+00

Pos logit contributions
stack+0.021
+0.021
 hips+0.020
ses+0.020
gency+0.020
Neg logit contributions
 at-0.015
aw-0.015
--0.015
 by-0.014
up-0.013
Token broken
Token ID5445
Feature activation-3.32e+00

Pos logit contributions
stack+0.063
+0.063
 hips+0.059
gency+0.058
ses+0.058
Neg logit contributions
 at-0.062
aw-0.060
--0.060
 by-0.057
up-0.053
Token and
Token ID290
Feature activation-1.40e+00

Pos logit contributions
+0.144
stack+0.143
ses+0.136
gency+0.134
urance+0.133
Neg logit contributions
 at-0.065
--0.062
 by-0.059
aw-0.058
up-0.044
Token tree
Token ID5509
Feature activation-5.68e-01

Pos logit contributions
aw+0.044
 at+0.044
-+0.042
 by+0.041
up+0.040
Neg logit contributions
stack-0.036
-0.035
gency-0.032
 slope-0.032
ses-0.032
Token.
Token ID13
Feature activation-5.36e-01

Pos logit contributions
stack+0.024
+0.023
 hips+0.022
ses+0.022
alk+0.022
Neg logit contributions
aw-0.011
 at-0.010
--0.010
 by-0.009
up-0.008
Token Lily
Token ID20037
Feature activation+6.33e-01

Pos logit contributions
stack+0.018
drivers+0.018
gency+0.018
arse+0.017
aned+0.017
Neg logit contributions
 at-0.016
 by-0.015
aw-0.014
--0.013
 with-0.013
Token loved
Token ID6151
Feature activation-4.94e-01

Pos logit contributions
aw+0.017
 at+0.017
-+0.016
 by+0.015
up+0.014
Neg logit contributions
stack-0.021
-0.021
 hips-0.020
ses-0.020
gency-0.019
Token to
Token ID284
Feature activation+4.83e-02

Pos logit contributions
stack+0.017
ses+0.017
+0.017
 slope+0.016
sed+0.016
Neg logit contributions
aw-0.012
up-0.010
--0.010
le-0.010
 at-0.009
Token play
Token ID711
Feature activation-1.57e+00

Pos logit contributions
aw+0.001
 at+0.001
-+0.001
 by+0.001
up+0.001
Neg logit contributions
stack-0.002
-0.002
 hips-0.002
alk-0.002
ses-0.001
Token outside
Token ID2354
Feature activation-3.65e-02

Pos logit contributions
+0.062
stack+0.059
't+0.057
 hips+0.056
 slope+0.055
Neg logit contributions
aw-0.031
up-0.027
--0.022
 bra-0.019
ups-0.019
Token in
Token ID287
Feature activation-1.98e+00

Pos logit contributions
stack+0.002
+0.002
 hips+0.001
 slope+0.001
 spine+0.001
Neg logit contributions
aw-0.001
up-0.001
--0.001
 at-0.001
 by-0.000
Token the
Token ID262
Feature activation-2.65e-01

Pos logit contributions
stack+0.092
+0.092
ses+0.089
drivers+0.089
gency+0.089
Neg logit contributions
 at-0.030
--0.028
aw-0.027
 by-0.027
up-0.019
Token sun
Token ID4252
Feature activation+1.80e+00

Pos logit contributions
stack+0.007
+0.007
 slope+0.006
alk+0.006
ivery+0.006
Neg logit contributions
--0.008
 at-0.008
aw-0.008
up-0.007
 by-0.007
Token.
Token ID13
Feature activation-4.68e-01

Pos logit contributions
aw+0.043
-+0.039
up+0.038
 at+0.037
 by+0.033
Neg logit contributions
stack-0.066
-0.066
 hips-0.062
 spine-0.061
gy-0.061
Token was
Token ID373
Feature activation-2.88e-02

Pos logit contributions
+0.009
stack+0.009
gency+0.009
urance+0.009
alk+0.009
Neg logit contributions
 at-0.005
aw-0.005
--0.005
 by-0.005
up-0.004
Token very
Token ID845
Feature activation-5.47e-01

Pos logit contributions
stack+0.001
ses+0.001
+0.001
gy+0.001
sed+0.001
Neg logit contributions
aw-0.001
--0.001
 at-0.001
up-0.001
 by-0.001
Token excited
Token ID6568
Feature activation-2.61e-02

Pos logit contributions
stack+0.018
+0.018
ses+0.017
 hips+0.017
sed+0.017
Neg logit contributions
aw-0.014
 at-0.013
--0.013
 by-0.012
up-0.012
Token to
Token ID284
Feature activation+5.08e-02

Pos logit contributions
+0.001
stack+0.001
 spine+0.001
 hips+0.001
gy+0.001
Neg logit contributions
aw-0.001
up-0.000
--0.000
 at-0.000
 bra-0.000
Token see
Token ID766
Feature activation-2.02e-02

Pos logit contributions
aw+0.001
 at+0.001
-+0.001
 by+0.001
up+0.001
Neg logit contributions
stack-0.002
-0.002
 hips-0.002
alk-0.002
gency-0.002
Token all
Token ID477
Feature activation-1.49e+00

Pos logit contributions
stack+0.001
+0.001
gency+0.001
 hips+0.001
drivers+0.001
Neg logit contributions
 at-0.000
aw-0.000
--0.000
 by-0.000
up-0.000
Token the
Token ID262
Feature activation-9.96e-01

Pos logit contributions
stack+0.051
alk+0.049
gy+0.046
 slope+0.045
bank+0.045
Neg logit contributions
aw-0.036
--0.034
 at-0.033
up-0.033
 by-0.027
Token animals
Token ID4695
Feature activation+8.97e-01

Pos logit contributions
stack+0.036
+0.035
drivers+0.035
gency+0.034
ses+0.034
Neg logit contributions
aw-0.022
 at-0.022
--0.021
 by-0.019
up-0.018
Token.
Token ID13
Feature activation+1.76e-01

Pos logit contributions
aw+0.023
-+0.021
up+0.021
 at+0.018
 by+0.017
Neg logit contributions
stack-0.031
-0.029
 hips-0.028
alk-0.028
gy-0.027
Token
Token ID198
Feature activation+4.26e-01

Pos logit contributions
 at+0.005
 by+0.005
-+0.004
aw+0.004
 with+0.004
Neg logit contributions
stack-0.006
gency-0.006
-0.006
ses-0.006
ivery-0.006
Token
Token ID198
Feature activation+1.43e-02

Pos logit contributions
 at+0.011
-+0.011
 by+0.010
aw+0.010
 of+0.009
Neg logit contributions
stack-0.017
ses-0.016
sed-0.016
-0.016
ishly-0.016
Token had
Token ID550
Feature activation-4.70e-01

Pos logit contributions
stack+0.032
+0.032
 hips+0.030
ses+0.030
gency+0.030
Neg logit contributions
 at-0.032
aw-0.031
--0.031
 by-0.030
up-0.027
Token lost
Token ID2626
Feature activation-1.50e+00

Pos logit contributions
+0.019
stack+0.018
 hips+0.018
gency+0.018
ses+0.018
Neg logit contributions
aw-0.009
 at-0.009
--0.009
 by-0.008
up-0.007
Token the
Token ID262
Feature activation-7.29e-01

Pos logit contributions
stack+0.056
+0.055
 hips+0.053
ses+0.052
gency+0.052
Neg logit contributions
aw-0.034
 at-0.034
--0.033
 by-0.031
up-0.029
Token fight
Token ID1907
Feature activation-3.19e+00

Pos logit contributions
stack+0.022
+0.021
ses+0.020
 hips+0.020
 spine+0.020
Neg logit contributions
aw-0.022
 at-0.022
--0.022
 by-0.021
up-0.019
Token.
Token ID13
Feature activation-3.19e-01

Pos logit contributions
stack+0.126
+0.125
 hips+0.119
drivers+0.118
gency+0.118
Neg logit contributions
aw-0.065
 at-0.064
--0.063
 by-0.058
up-0.052
Token
Token ID220
Feature activation-1.70e+00

Pos logit contributions
ivery+0.009
ses+0.009
gency+0.009
stack+0.009
 spine+0.009
Neg logit contributions
 at-0.010
aw-0.010
 by-0.010
--0.009
 with-0.009
Token
Token ID198
Feature activation-4.02e-01

Pos logit contributions
 spine+0.068
stack+0.068
gency+0.067
ses+0.066
 slope+0.066
Neg logit contributions
 at-0.039
 by-0.037
aw-0.036
--0.033
 with-0.029
Token
Token ID198
Feature activation-7.74e-01

Pos logit contributions
ses+0.020
 spine+0.020
sed+0.019
stack+0.018
gency+0.018
Neg logit contributions
 by-0.007
 at-0.006
aw-0.006
--0.006
 with-0.006
TokenThe
Token ID464
Feature activation-1.23e+00

Pos logit contributions
ses+0.026
 spine+0.026
stack+0.026
sed+0.025
+0.025
Neg logit contributions
aw-0.022
 at-0.021
--0.021
 by-0.021
up-0.019
Token boys
Token ID6510
Feature activation-1.14e+00

Pos logit contributions
stack+0.042
+0.041
gency+0.039
 hips+0.039
drivers+0.038
Neg logit contributions
 at-0.034
aw-0.032
--0.032
 by-0.031
up-0.027
Token never
Token ID1239
Feature activation+1.38e-01

Pos logit contributions
+0.038
stack+0.037
ses+0.036
gency+0.035
 hips+0.035
Neg logit contributions
 at-0.032
aw-0.031
--0.030
 by-0.029
up-0.026