TOP ACTIVATIONS
MAX = 3.277

, and their friendship became even stronger . <|endoftext|> Once upon
, Jack 's cakes became even more famous and people came
on , the well became even more popular because everyone loved
said the presents are useful even when they hang on the
They put them in the juice maker and press the button
and it kept her warm even when it was cold outside
and wraps it around her knee . He hugs her and
take the cups . They sip the coffee slowly . It
and Jane shared the cup cake and celebrated their tasty adventure
on , the branch became even more popular with all the
night , the harbor became even more magical . The lights
Their mom and dad cleaned Lily 's face and put a
ink their cups . They sip the water . They eat
to each other . They drank their water and ate their
floor . They pretend to sip and nib ble .
her mom . She smiled even when she was sad ,
food , and it became even f atter . Whenever the
They ate their lunch and drank their juice and had spicy
teach her doll how to drink wine . She saw her
. The powder made it feel better and stopped it from

MIN ACTIVATIONS
MIN = -4.750

Fl oppy saw a mar vell ous view . He also
the kids had their t amb our ines and drums ,
tall and strong and had delic iously ripe pe aches all
, she saw a beautiful mer maid swimming around her .
. He even saw a mer maid !
big book with pictures of volcan oes . Vol can oes
eye patches and said " Ar rr !" a lot .
he would find his perfect mate . One day ,
hats . They say " Ar rr " and " Ah
big book with pictures of volcan oes . Vol can oes
. She had a big magn ifying glass that made things
. You worked with such delic acy . You have a
The new light bul b was much bigger and brighter
flag . They say " Ar rr !" and " Ah
In it she had a magn ifying glass , a tiny
next to the light bul b , then climbed up the
" I 'm a secret mer maid . Would you like
gave Lucas a small pe b ble . Lucas had no
mother holding a freshly - b aked muff in .
the world 's best - b ouncing frog . And every

INTERVAL 1.270 - 3.277
CONTAINS 3.056%

and put their hands in the spray . It was cold
, he went outside to play and it started to rain
said , " Sure , but be careful ." Tim my
you are too small . Go away ." Tim
pulls the car back from Anna 's hand .

INTERVAL -0.737 - 1.270
CONTAINS 83.679%

She had a favorite toy , a silver ball . The
and sea foam . But nothing seemed quite right . Then
" But be careful , don 't spray the flowers or
so happy and everyone laughed . From now on Jack was
house . It was white and empty . Tommy had an

INTERVAL -2.744 - -0.737
CONTAINS 13.237%

she jumped each time . Suddenly , she noticed
<|endoftext|> M um and Dad had a big surprise for Jane
worried , but he was determined to find something he was
" Ben , are you okay ? Your finger is
day ?" The elderly woman laughed and said ,

INTERVAL -4.750 - -2.744
CONTAINS 0.028%

he was gone , O ll ie met a friendly crab
! She wanted to act out her favorite story with her
long tail and a sharp spine . Can we stay here
the box and found her fl ute was broken . She
big castle with point y towers . " That would be
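
NOTE ON THE INTERVALS
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX, and the CONTAINS percentages sum to 100%. The sketch below checks that reading against the printed numbers. It is an assumption drawn from the displayed values, not a description of how the dashboard tool actually bins activations. The names interval_edges, shown_edges, and shown_pct are hypothetical and exist only for this illustration.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Values copied from the dashboard text above (already rounded to 3 decimals).
shown_min, shown_max = -4.750, 3.277
shown_edges = [-4.750, -2.744, -0.737, 1.270, 3.277]
shown_pct = [0.028, 13.237, 83.679, 3.056]  # lowest interval first

edges = interval_edges(shown_min, shown_max)

# Because MIN and MAX are themselves rounded, the recomputed edges can differ
# from the displayed ones in the last digit; compare with a small tolerance.
max_gap = max(abs(a - b) for a, b in zip(edges, shown_edges))
total_pct = sum(shown_pct)

print("recomputed edges:", [round(e, 3) for e in edges])
print("largest gap to displayed edges:", round(max_gap, 4))
print("sum of CONTAINS percentages:", round(total_pct, 3))
```

On this reading, about 84% of token positions fall in the interval straddling zero, and the top interval that holds the strongest activations covers only about 3% of positions.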
Token,
Token ID11
Feature activation-3.94e-01

Pos logit contributions
 Lily+0.011
 L+0.010
 v+0.008
,+0.008
 p+0.008
Neg logit contributions
opes-0.019
lor-0.019
wen-0.019
uding-0.019
 Charles-0.018
Token and
Token ID290
Feature activation+6.14e-01

Pos logit contributions
lor+0.018
opes+0.017
wen+0.017
uding+0.017
eway+0.017
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.004
 very-0.004
,-0.004
Token their
Token ID511
Feature activation-4.28e-01

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.007
 p+0.006
L+0.006
Neg logit contributions
opes-0.024
wen-0.024
uding-0.024
lor-0.024
 Charles-0.024
Token friendship
Token ID14738
Feature activation+3.98e-01

Pos logit contributions
opes+0.015
uding+0.014
wen+0.014
lor+0.014
 Charles+0.014
Neg logit contributions
 Lily-0.009
 L-0.008
 v-0.007
L-0.007
 Sp-0.007
Token became
Token ID2627
Feature activation+6.61e-01

Pos logit contributions
 Lily+0.006
 L+0.004
 v+0.004
,+0.003
 p+0.003
Neg logit contributions
opes-0.018
lor-0.017
wen-0.017
uding-0.017
 Charles-0.017
Token even
Token ID772
Feature activation+3.28e+00

Pos logit contributions
 Lily+0.011
 L+0.010
 v+0.009
 Sp+0.008
 p+0.008
Neg logit contributions
opes-0.025
uding-0.024
wen-0.024
lor-0.024
 Charles-0.024
Token stronger
Token ID7387
Feature activation-1.01e-01

Pos logit contributions
 L+0.054
 Lily+0.053
 Sp+0.049
 v+0.044
 W+0.041
Neg logit contributions
lled-0.101
imm-0.098
 awaiting-0.097
uding-0.097
 sites-0.097
Token.
Token ID13
Feature activation-7.24e-01

Pos logit contributions
opes+0.005
wen+0.005
lor+0.005
uding+0.004
 Charles+0.004
Neg logit contributions
 Lily-0.001
 L-0.001
,-0.001
 v-0.001
 very-0.001
Token<|endoftext|>
Token ID50256
Feature activation-1.78e-01

Pos logit contributions
opes+0.029
lor+0.029
wen+0.028
uding+0.028
 Charles+0.028
Neg logit contributions
 Lily-0.010
 L-0.009
 v-0.009
 p-0.008
 dirty-0.008
TokenOnce
Token ID7454
Feature activation-2.20e-01

Pos logit contributions
lor+0.007
wen+0.007
opes+0.007
 Charles+0.007
uding+0.006
Neg logit contributions
 Lily-0.003
 L-0.003
 v-0.002
 sad-0.002
 dirty-0.002
Token upon
Token ID2402
Feature activation-6.83e-01

Pos logit contributions
wen+0.006
opes+0.005
 Charles+0.005
lor+0.005
uding+0.005
Neg logit contributions
 Lily-0.006
 L-0.006
 v-0.006
 p-0.005
 Sp-0.005
Token,
Token ID11
Feature activation-5.81e-01

Pos logit contributions
opes+0.022
lor+0.022
wen+0.022
uding+0.022
 reins+0.021
Neg logit contributions
 Lily-0.002
 L-0.000
,-0.000
 p+0.001
 v+0.001
Token Jack
Token ID3619
Feature activation-6.14e-01

Pos logit contributions
lor+0.021
wen+0.021
opes+0.021
uding+0.020
eway+0.019
Neg logit contributions
 Lily-0.014
 L-0.012
,-0.010
 very-0.010
 sad-0.010
Token's
Token ID338
Feature activation-3.55e-01

Pos logit contributions
opes+0.024
lor+0.024
uding+0.024
wen+0.024
eway+0.023
Neg logit contributions
 Lily-0.011
 L-0.009
,-0.008
 v-0.008
 very-0.008
Token cakes
Token ID30849
Feature activation+5.92e-01

Pos logit contributions
lor+0.013
wen+0.013
opes+0.012
uding+0.012
 Charles+0.012
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.007
 very-0.006
 sad-0.006
Token became
Token ID2627
Feature activation+7.08e-01

Pos logit contributions
 Lily+0.008
 L+0.006
,+0.006
 very+0.006
 v+0.006
Neg logit contributions
opes-0.026
lor-0.025
uding-0.025
wen-0.025
 Charles-0.025
Token even
Token ID772
Feature activation+3.26e+00

Pos logit contributions
 Lily+0.012
 L+0.010
 v+0.009
 p+0.008
 Sp+0.008
Neg logit contributions
opes-0.028
lor-0.028
wen-0.028
uding-0.028
 Charles-0.027
Token more
Token ID517
Feature activation+1.09e+00

Pos logit contributions
 Lily+0.058
 L+0.055
 Sp+0.047
 v+0.044
L+0.042
Neg logit contributions
uding-0.108
opes-0.106
 Charles-0.105
lor-0.105
wen-0.105
Token famous
Token ID5863
Feature activation-5.77e-01

Pos logit contributions
 Lily+0.023
 L+0.020
 v+0.018
 very+0.018
,+0.017
Neg logit contributions
opes-0.041
lor-0.040
uding-0.039
wen-0.039
 Charles-0.038
Token and
Token ID290
Feature activation-8.10e-02

Pos logit contributions
opes+0.023
wen+0.023
lor+0.023
uding+0.022
 Charles+0.022
Neg logit contributions
 Lily-0.010
 L-0.009
 v-0.008
,-0.008
 very-0.007
Token people
Token ID661
Feature activation+1.78e-01

Pos logit contributions
opes+0.003
lor+0.003
uding+0.003
wen+0.003
 Charles+0.003
Neg logit contributions
 Lily-0.002
 L-0.001
 v-0.001
 Sp-0.001
 very-0.001
Token came
Token ID1625
Feature activation-1.43e+00

Pos logit contributions
 Lily+0.003
 L+0.003
,+0.002
 very+0.002
 v+0.002
Neg logit contributions
opes-0.007
wen-0.007
lor-0.007
uding-0.007
 Charles-0.007
Token on
Token ID319
Feature activation-2.36e-01

Pos logit contributions
opes+0.047
lor+0.046
wen+0.046
uding+0.046
 Charles+0.045
Neg logit contributions
 Lily-0.029
 L-0.026
 v-0.023
 Sp-0.021
 p-0.021
Token,
Token ID11
Feature activation-7.26e-02

Pos logit contributions
opes+0.012
lor+0.011
wen+0.011
uding+0.011
 Charles+0.011
Neg logit contributions
 Lily-0.002
 L-0.001
 v-0.001
L-0.000
 Sp-0.000
Token the
Token ID262
Feature activation+8.01e-02

Pos logit contributions
 Charles+0.003
opes+0.003
uding+0.003
wen+0.003
lor+0.003
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.001
 p-0.001
 dirty-0.001
Token well
Token ID880
Feature activation+5.16e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 Sp+0.001
 v+0.001
L+0.001
Neg logit contributions
uding-0.002
opes-0.002
 Charles-0.002
lor-0.002
wen-0.002
Token became
Token ID2627
Feature activation+8.95e-01

Pos logit contributions
 Lily+0.008
 L+0.007
 v+0.006
 Sp+0.006
L+0.005
Neg logit contributions
opes-0.021
lor-0.020
uding-0.020
wen-0.020
 Charles-0.020
Token even
Token ID772
Feature activation+3.22e+00

Pos logit contributions
 Lily+0.014
 L+0.012
 v+0.011
 Sp+0.010
 p+0.010
Neg logit contributions
uding-0.033
opes-0.033
 Charles-0.032
wen-0.032
lor-0.031
Token more
Token ID517
Feature activation+9.93e-01

Pos logit contributions
 L+0.057
 Lily+0.056
 Sp+0.054
 v+0.049
 Doll+0.044
Neg logit contributions
lled-0.093
 awaiting-0.090
aded-0.088
 witnessed-0.088
uding-0.088
Token popular
Token ID2968
Feature activation+2.52e-01

Pos logit contributions
 Lily+0.023
 L+0.020
 v+0.018
 p+0.017
 Sp+0.017
Neg logit contributions
opes-0.034
lor-0.033
uding-0.033
wen-0.033
 Charles-0.032
Token because
Token ID780
Feature activation+5.28e-01

Pos logit contributions
 Lily+0.005
 L+0.004
 v+0.004
 Sp+0.003
L+0.003
Neg logit contributions
opes-0.009
uding-0.009
lor-0.009
wen-0.009
 Charles-0.009
Token everyone
Token ID2506
Feature activation-2.76e-01

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.007
L+0.007
 Sp+0.006
Neg logit contributions
opes-0.019
wen-0.019
uding-0.019
 Charles-0.019
lor-0.019
Token loved
Token ID6151
Feature activation+9.30e-01

Pos logit contributions
opes+0.011
lor+0.011
wen+0.011
uding+0.011
 Charles+0.011
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.003
 very-0.003
 p-0.003
Token said
Token ID531
Feature activation+9.15e-01

Pos logit contributions
opes+0.007
lor+0.007
uding+0.007
wen+0.007
 Charles+0.007
Neg logit contributions
 Lily-0.004
 L-0.003
 v-0.003
 p-0.003
 Sp-0.003
Token the
Token ID262
Feature activation+7.63e-01

Pos logit contributions
 Lily+0.012
 L+0.012
L+0.009
 v+0.009
 Sp+0.008
Neg logit contributions
opes-0.036
 Charles-0.034
lor-0.034
uding-0.034
elcome-0.034
Token presents
Token ID10969
Feature activation-5.37e-01

Pos logit contributions
 Lily+0.018
 L+0.016
 v+0.014
 very+0.014
 sad+0.014
Neg logit contributions
opes-0.026
wen-0.026
lor-0.026
uding-0.025
 Charles-0.024
Token are
Token ID389
Feature activation+8.52e-01

Pos logit contributions
opes+0.024
lor+0.023
uding+0.023
wen+0.023
 Charles+0.023
Neg logit contributions
 Lily-0.007
 L-0.006
 v-0.005
,-0.005
 very-0.005
Token useful
Token ID4465
Feature activation+4.77e-01

Pos logit contributions
 Lily+0.015
 L+0.013
 v+0.011
 Sp+0.010
 p+0.010
Neg logit contributions
opes-0.033
lor-0.033
uding-0.033
wen-0.033
 Charles-0.032
Token even
Token ID772
Feature activation+3.15e+00

Pos logit contributions
 Lily+0.006
 L+0.005
 v+0.004
,+0.004
 very+0.003
Neg logit contributions
opes-0.021
lor-0.021
wen-0.021
uding-0.021
 Charles-0.020
Token when
Token ID618
Feature activation-5.74e-01

Pos logit contributions
 Lily+0.052
 L+0.048
 v+0.040
 Sp+0.040
 p+0.037
Neg logit contributions
opes-0.116
uding-0.115
lor-0.113
wen-0.113
 Charles-0.112
Token they
Token ID484
Feature activation+6.75e-01

Pos logit contributions
lor+0.025
opes+0.025
wen+0.025
uding+0.025
eway+0.024
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.005
,-0.005
 very-0.005
Token hang
Token ID8181
Feature activation+1.24e-01

Pos logit contributions
 Lily+0.013
 L+0.011
 v+0.009
 p+0.009
 sad+0.009
Neg logit contributions
opes-0.026
lor-0.025
wen-0.025
uding-0.025
 Charles-0.025
Token on
Token ID319
Feature activation+3.11e-01

Pos logit contributions
 Lily+0.003
 L+0.002
 v+0.002
,+0.002
 very+0.002
Neg logit contributions
opes-0.004
lor-0.004
wen-0.004
uding-0.004
 Charles-0.004
Token the
Token ID262
Feature activation+1.99e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
 Sp+0.003
 p+0.003
Neg logit contributions
opes-0.013
lor-0.013
wen-0.013
uding-0.013
 Charles-0.013
Token They
Token ID1119
Feature activation+6.08e-01

Pos logit contributions
 v+0.005
 Lily+0.005
 sad+0.004
 dirty+0.004
 L+0.004
Neg logit contributions
opes-0.010
uding-0.010
 Charles-0.009
wen-0.009
eway-0.009
Token put
Token ID1234
Feature activation+1.02e+00

Pos logit contributions
 Lily+0.013
 L+0.012
L+0.010
 v+0.010
 Sp+0.010
Neg logit contributions
opes-0.019
uding-0.019
 Charles-0.019
wen-0.018
lor-0.018
Token them
Token ID606
Feature activation-5.55e-01

Pos logit contributions
 Lily+0.026
 L+0.025
 Sp+0.023
L+0.023
 v+0.022
Neg logit contributions
opes-0.027
uding-0.026
 Charles-0.025
lor-0.025
wen-0.024
Token in
Token ID287
Feature activation+3.31e-01

Pos logit contributions
opes+0.019
uding+0.019
lor+0.018
wen+0.018
 Charles+0.018
Neg logit contributions
 Lily-0.012
 L-0.011
 v-0.009
 Sp-0.009
L-0.009
Token the
Token ID262
Feature activation+1.20e+00

Pos logit contributions
 Lily+0.006
 L+0.005
 Sp+0.005
 v+0.005
L+0.005
Neg logit contributions
opes-0.011
uding-0.010
 Charles-0.010
Frank-0.010
wen-0.010
Token juice
Token ID13135
Feature activation+3.15e+00

Pos logit contributions
 Lily+0.028
 L+0.027
 Sp+0.024
L+0.023
 sad+0.021
Neg logit contributions
opes-0.035
lor-0.035
uding-0.033
 reins-0.032
 armour-0.032
Token maker
Token ID16009
Feature activation+1.30e+00

Pos logit contributions
 Lily+0.049
 L+0.046
 Sp+0.040
 sad+0.037
L+0.035
Neg logit contributions
lor-0.118
opes-0.116
uding-0.113
wen-0.108
 Charles-0.107
Token and
Token ID290
Feature activation+4.12e-01

Pos logit contributions
 Lily+0.016
 L+0.013
 v+0.010
 Sp+0.009
 p+0.009
Neg logit contributions
opes-0.057
lor-0.056
uding-0.056
wen-0.056
 Charles-0.055
Token press
Token ID1803
Feature activation+9.64e-01

Pos logit contributions
 Lily+0.008
 L+0.007
L+0.006
 v+0.006
 Sp+0.006
Neg logit contributions
opes-0.014
uding-0.014
 Charles-0.014
wen-0.014
lor-0.013
Token the
Token ID262
Feature activation+4.01e-02

Pos logit contributions
 Lily+0.017
 L+0.016
L+0.014
 Sp+0.014
 v+0.013
Neg logit contributions
opes-0.034
uding-0.032
 Charles-0.032
lor-0.032
wen-0.032
Token button
Token ID4936
Feature activation+1.12e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
 Sp+0.001
L+0.001
Neg logit contributions
opes-0.001
lor-0.001
uding-0.001
wen-0.001
 Charles-0.001
Token and
Token ID290
Feature activation+5.43e-01

Pos logit contributions
opes+0.006
wen+0.005
lor+0.005
uding+0.005
 Charles+0.005
Neg logit contributions
 Lily-0.002
 L-0.001
 v-0.001
,-0.001
 very-0.001
Token it
Token ID340
Feature activation+8.18e-01

Pos logit contributions
 Lily+0.011
 L+0.010
 v+0.009
 Sp+0.009
L+0.008
Neg logit contributions
opes-0.018
wen-0.018
lor-0.017
uding-0.017
 Charles-0.017
Token kept
Token ID4030
Feature activation+1.00e-01

Pos logit contributions
 Lily+0.015
 L+0.013
 v+0.011
 Sp+0.011
L+0.010
Neg logit contributions
opes-0.031
lor-0.030
uding-0.030
wen-0.030
 Charles-0.029
Token her
Token ID607
Feature activation+8.08e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.002
L+0.002
 Sp+0.002
Neg logit contributions
opes-0.003
uding-0.003
lor-0.003
wen-0.003
 Charles-0.003
Token warm
Token ID5814
Feature activation+3.99e-01

Pos logit contributions
 Lily+0.019
 L+0.017
 v+0.016
 p+0.015
 sad+0.015
Neg logit contributions
opes-0.027
lor-0.026
wen-0.026
uding-0.026
 Charles-0.025
Token even
Token ID772
Feature activation+3.09e+00

Pos logit contributions
 Lily+0.008
 L+0.007
 v+0.006
 p+0.005
,+0.005
Neg logit contributions
opes-0.015
wen-0.015
lor-0.015
uding-0.015
 Charles-0.015
Token when
Token ID618
Feature activation-8.08e-01

Pos logit contributions
 Lily+0.051
 L+0.047
 Sp+0.039
 v+0.036
L+0.034
Neg logit contributions
uding-0.112
opes-0.111
lor-0.111
wen-0.110
 Charles-0.109
Token it
Token ID340
Feature activation+1.02e+00

Pos logit contributions
opes+0.035
lor+0.035
uding+0.035
wen+0.034
 Charles+0.034
Neg logit contributions
 Lily-0.011
 L-0.009
 v-0.007
 Sp-0.006
 p-0.006
Token was
Token ID373
Feature activation+2.92e-02

Pos logit contributions
 Lily+0.019
 L+0.017
 v+0.015
 Sp+0.014
L+0.014
Neg logit contributions
opes-0.038
lor-0.037
uding-0.037
wen-0.037
 Charles-0.036
Token cold
Token ID4692
Feature activation-1.93e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 very+0.000
 v+0.000
 sad+0.000
Neg logit contributions
lor-0.001
opes-0.001
wen-0.001
uding-0.001
 Charles-0.001
Token outside
Token ID2354
Feature activation-3.14e-02

Pos logit contributions
opes+0.007
uding+0.006
lor+0.006
wen+0.006
 Charles+0.006
Neg logit contributions
 Lily-0.004
 L-0.004
 v-0.003
L-0.003
 Sp-0.003
Token and
Token ID290
Feature activation-3.72e-01

Pos logit contributions
 Lily+0.007
 L+0.006
 Sp+0.005
 v+0.005
L+0.005
Neg logit contributions
opes-0.016
lor-0.016
uding-0.016
wen-0.016
 Charles-0.015
Token wraps
Token ID27521
Feature activation+9.70e-01

Pos logit contributions
lor+0.013
opes+0.013
uding+0.013
wen+0.013
eway+0.012
Neg logit contributions
 Lily-0.009
 L-0.008
 v-0.007
 p-0.006
,-0.006
Token it
Token ID340
Feature activation+8.09e-02

Pos logit contributions
 Lily+0.019
 L+0.017
 v+0.016
 p+0.015
 sad+0.015
Neg logit contributions
opes-0.033
lor-0.032
wen-0.032
 Charles-0.032
uding-0.032
Token around
Token ID1088
Feature activation+2.22e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.002
 Sp+0.002
L+0.002
Neg logit contributions
opes-0.003
uding-0.002
lor-0.002
wen-0.002
 Charles-0.002
Token her
Token ID607
Feature activation+1.89e+00

Pos logit contributions
 Lily+0.003
 v+0.003
 L+0.003
 sad+0.003
 dirty+0.003
Neg logit contributions
opes-0.008
 Charles-0.008
uding-0.008
wen-0.008
eway-0.008
Token knee
Token ID10329
Feature activation+3.08e+00

Pos logit contributions
 Lily+0.044
 L+0.041
 Sp+0.036
L+0.035
 v+0.034
Neg logit contributions
opes-0.060
lor-0.059
uding-0.059
 armour-0.055
 Charles-0.055
Token.
Token ID13
Feature activation+5.36e-01

Pos logit contributions
 Lily+0.049
 L+0.046
 Sp+0.042
L+0.036
 P+0.035
Neg logit contributions
lor-0.110
opes-0.108
uding-0.106
aced-0.103
wen-0.101
Token He
Token ID679
Feature activation-2.65e-01

Pos logit contributions
 Lily+0.008
 v+0.008
 L+0.008
 sad+0.007
 dirty+0.007
Neg logit contributions
 Charles-0.019
lor-0.019
opes-0.019
wen-0.019
uding-0.018
Token hugs
Token ID40657
Feature activation+1.05e+00

Pos logit contributions
opes+0.008
lor+0.008
wen+0.008
uding+0.008
 Charles+0.008
Neg logit contributions
 Lily-0.007
 L-0.006
 v-0.006
L-0.005
 Sp-0.005
Token her
Token ID607
Feature activation+3.95e-01

Pos logit contributions
 Lily+0.016
 v+0.015
 L+0.015
 dirty+0.015
 sad+0.015
Neg logit contributions
 Charles-0.036
opes-0.035
wen-0.035
lor-0.035
uding-0.034
Token and
Token ID290
Feature activation-5.68e-01

Pos logit contributions
 Lily+0.007
 L+0.007
 Sp+0.006
L+0.006
 v+0.006
Neg logit contributions
opes-0.014
lor-0.013
 Charles-0.013
uding-0.013
uted-0.013
Token take
Token ID1011
Feature activation+1.02e+00

Pos logit contributions
opes+0.009
uding+0.009
wen+0.009
 Charles+0.009
lor+0.009
Neg logit contributions
 Lily-0.005
 L-0.005
 v-0.004
L-0.004
 Sp-0.004
Token the
Token ID262
Feature activation+3.49e-01

Pos logit contributions
 Lily+0.017
 L+0.016
 sad+0.015
 v+0.015
 Sp+0.014
Neg logit contributions
opes-0.036
wen-0.035
 Charles-0.035
uding-0.035
lor-0.035
Token cups
Token ID14180
Feature activation+2.99e-01

Pos logit contributions
 Lily+0.008
 L+0.007
 Sp+0.006
 v+0.006
L+0.006
Neg logit contributions
opes-0.012
wen-0.011
lor-0.011
uding-0.011
 Charles-0.011
Token.
Token ID13
Feature activation+3.88e-01

Pos logit contributions
 Lily+0.005
 L+0.005
 Sp+0.004
 v+0.004
 sad+0.004
Neg logit contributions
lor-0.011
wen-0.010
uding-0.010
opes-0.010
eway-0.010
Token They
Token ID1119
Feature activation+2.16e-01

Pos logit contributions
 Lily+0.006
 L+0.006
 v+0.006
 sad+0.005
 Sp+0.005
Neg logit contributions
opes-0.015
 Charles-0.014
uding-0.014
wen-0.014
lor-0.014
Token sip
Token ID31145
Feature activation+3.08e+00

Pos logit contributions
 Lily+0.005
 L+0.004
 v+0.004
 Sp+0.003
L+0.003
Neg logit contributions
opes-0.007
uding-0.007
wen-0.007
lor-0.007
 Charles-0.007
Token the
Token ID262
Feature activation+1.26e+00

Pos logit contributions
 L+0.058
 v+0.057
 Lily+0.057
L+0.056
 H+0.055
Neg logit contributions
rily-0.079
opes-0.079
uted-0.078
lled-0.077
aced-0.074
Token coffee
Token ID6891
Feature activation+1.60e+00

Pos logit contributions
 Lily+0.024
 L+0.022
 Sp+0.018
 v+0.018
L+0.018
Neg logit contributions
opes-0.045
wen-0.044
lor-0.043
uding-0.043
 Charles-0.042
Token slowly
Token ID6364
Feature activation+1.86e+00

Pos logit contributions
 Lily+0.027
 L+0.025
 Sp+0.021
L+0.021
 sad+0.020
Neg logit contributions
opes-0.057
lor-0.055
uding-0.055
wen-0.053
uted-0.053
Token.
Token ID13
Feature activation+3.77e-01

Pos logit contributions
 Lily+0.026
 L+0.023
 Sp+0.019
L+0.018
 v+0.017
Neg logit contributions
opes-0.075
uding-0.073
lor-0.072
wen-0.071
 Charles-0.071
Token It
Token ID632
Feature activation+5.92e-01

Pos logit contributions
 Lily+0.006
 L+0.005
 v+0.005
 sad+0.004
L+0.004
Neg logit contributions
opes-0.015
uding-0.014
 Charles-0.014
lor-0.014
wen-0.014
Token and
Token ID290
Feature activation-7.00e-01

Pos logit contributions
opes+0.011
uding+0.011
wen+0.010
lor+0.010
 Charles+0.010
Neg logit contributions
 Lily-0.006
 L-0.006
 v-0.005
 Sp-0.005
L-0.005
Token Jane
Token ID12091
Feature activation+5.22e-01

Pos logit contributions
opes+0.023
lor+0.023
wen+0.022
uding+0.022
 Charles+0.020
Neg logit contributions
 Lily-0.019
 L-0.017
 v-0.014
 p-0.014
,-0.014
Token shared
Token ID4888
Feature activation+1.64e+00

Pos logit contributions
 Lily+0.012
 L+0.011
 v+0.010
 Sp+0.009
L+0.009
Neg logit contributions
opes-0.016
wen-0.016
uding-0.016
 Charles-0.016
lor-0.016
Token the
Token ID262
Feature activation+1.14e+00

Pos logit contributions
 Lily+0.021
 L+0.019
 v+0.015
 Sp+0.014
L+0.013
Neg logit contributions
opes-0.068
lor-0.067
uding-0.066
wen-0.066
 Charles-0.066
Token cup
Token ID6508
Feature activation+1.29e+00

Pos logit contributions
 Lily+0.023
 L+0.020
 v+0.018
 p+0.017
 very+0.016
Neg logit contributions
opes-0.042
uding-0.041
lor-0.041
wen-0.041
 Charles-0.040
Tokencake
Token ID30560
Feature activation+3.06e+00

Pos logit contributions
 Lily+0.024
 L+0.022
 v+0.019
 Sp+0.018
 sad+0.018
Neg logit contributions
lor-0.045
opes-0.045
wen-0.044
uding-0.044
 Charles-0.043
Token and
Token ID290
Feature activation-8.31e-01

Pos logit contributions
 Lily+0.053
 L+0.049
 v+0.040
 Sp+0.039
L+0.038
Neg logit contributions
opes-0.112
uding-0.110
lor-0.109
wen-0.109
 Charles-0.108
Token celebrated
Token ID13943
Feature activation+6.01e-01

Pos logit contributions
opes+0.030
lor+0.030
uding+0.030
wen+0.030
 Charles+0.029
Neg logit contributions
 Lily-0.016
 L-0.014
 v-0.013
 Sp-0.012
L-0.012
Token their
Token ID511
Feature activation+2.65e-01

Pos logit contributions
 Lily+0.010
 L+0.008
,+0.007
 very+0.007
 v+0.006
Neg logit contributions
lor-0.026
opes-0.025
wen-0.025
uding-0.025
 Charles-0.024
Token tasty
Token ID25103
Feature activation+5.45e-01

Pos logit contributions
 Lily+0.007
 L+0.006
 v+0.005
L+0.005
 Sp+0.005
Neg logit contributions
opes-0.008
uding-0.008
lor-0.008
wen-0.008
 Charles-0.008
Token adventure
Token ID8855
Feature activation-3.34e-01

Pos logit contributions
 Lily+0.014
 L+0.012
 v+0.011
 p+0.011
,+0.011
Neg logit contributions
opes-0.017
lor-0.017
wen-0.017
uding-0.017
 Charles-0.016
Token on
Token ID319
Feature activation-5.38e-01

Pos logit contributions
opes+0.055
uding+0.054
lor+0.054
wen+0.053
 Charles+0.052
Neg logit contributions
 Lily-0.028
 L-0.025
 v-0.021
,-0.020
 very-0.020
Token,
Token ID11
Feature activation-4.36e-01

Pos logit contributions
opes+0.029
uding+0.029
lor+0.029
wen+0.028
uted+0.028
Neg logit contributions
 Lily-0.003
 L-0.001
,-0.000
 v+0.001
 p+0.001
Token the
Token ID262
Feature activation-6.99e-01

Pos logit contributions
opes+0.018
uding+0.018
 Charles+0.018
lor+0.018
wen+0.018
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.004
 p-0.003
L-0.003
Token branch
Token ID8478
Feature activation+5.60e-02

Pos logit contributions
opes+0.022
uding+0.021
wen+0.020
lor+0.020
 Charles+0.020
Neg logit contributions
 Lily-0.018
 L-0.016
 v-0.014
 Sp-0.014
L-0.014
Token became
Token ID2627
Feature activation+5.33e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
,+0.001
 sad+0.001
Neg logit contributions
opes-0.002
uding-0.002
lor-0.002
wen-0.002
 Charles-0.002
Token even
Token ID772
Feature activation+3.05e+00

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.006
 Sp+0.006
L+0.006
Neg logit contributions
opes-0.021
uding-0.021
lor-0.021
wen-0.021
 Charles-0.020
Token more
Token ID517
Feature activation+8.25e-01

Pos logit contributions
 Lily+0.052
 L+0.051
 Sp+0.045
 v+0.041
 Sara+0.038
Neg logit contributions
 armour-0.095
uding-0.095
lled-0.095
 witnessed-0.094
 Charles-0.094
Token popular
Token ID2968
Feature activation-2.08e-02

Pos logit contributions
 Lily+0.017
 L+0.015
 very+0.014
 sad+0.014
,+0.013
Neg logit contributions
opes-0.030
lor-0.030
uding-0.029
wen-0.029
 Charles-0.028
Token with
Token ID351
Feature activation+2.21e-01

Pos logit contributions
opes+0.001
lor+0.001
wen+0.001
uding+0.001
 Charles+0.001
Neg logit contributions
 Lily-0.000
 L-0.000
 v-0.000
 Sp-0.000
 p-0.000
Token all
Token ID477
Feature activation-1.99e-01

Pos logit contributions
 Lily+0.004
 L+0.003
 v+0.003
 very+0.002
 Sp+0.002
Neg logit contributions
lor-0.009
wen-0.009
opes-0.009
uding-0.009
 Charles-0.009
Token the
Token ID262
Feature activation-1.19e+00

Pos logit contributions
opes+0.009
lor+0.009
uding+0.009
wen+0.009
 Charles+0.009
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.001
 very-0.001
,-0.001
Token night
Token ID1755
Feature activation-3.66e-01

Pos logit contributions
opes+0.054
wen+0.054
lor+0.054
uding+0.053
 Charles+0.053
Neg logit contributions
 Lily-0.016
 L-0.013
 v-0.011
L-0.010
 Sp-0.009
Token,
Token ID11
Feature activation-9.54e-01

Pos logit contributions
opes+0.019
uding+0.018
wen+0.018
lor+0.018
 Charles+0.018
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.001
,-0.001
 sad-0.000
Token the
Token ID262
Feature activation+1.66e-01

Pos logit contributions
opes+0.047
lor+0.046
wen+0.046
uding+0.045
 Charles+0.043
Neg logit contributions
 Lily-0.010
 L-0.008
 v-0.005
 very-0.004
,-0.004
Token harbor
Token ID25451
Feature activation-6.68e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 Sp+0.003
 v+0.003
L+0.003
Neg logit contributions
opes-0.005
lor-0.005
uding-0.005
wen-0.005
 Charles-0.005
Token became
Token ID2627
Feature activation+5.33e-01

Pos logit contributions
opes+0.031
uding+0.030
wen+0.030
lor+0.029
 Charles+0.029
Neg logit contributions
 Lily-0.008
,-0.006
 L-0.006
 v-0.005
 very-0.005
Token even
Token ID772
Feature activation+3.04e+00

Pos logit contributions
 Lily+0.009
 L+0.007
 v+0.006
 p+0.006
 Sp+0.006
Neg logit contributions
opes-0.022
lor-0.021
uding-0.021
wen-0.021
 Charles-0.021
Token more
Token ID517
Feature activation+1.01e+00

Pos logit contributions
 Lily+0.056
 L+0.052
 Sp+0.047
 v+0.043
L+0.042
Neg logit contributions
uding-0.097
 Charles-0.097
opes-0.095
 armour-0.095
lor-0.095
Token magical
Token ID10883
Feature activation-6.42e-01

Pos logit contributions
 Lily+0.019
 L+0.016
 very+0.015
,+0.014
 sad+0.014
Neg logit contributions
opes-0.040
lor-0.040
uding-0.039
wen-0.038
eway-0.037
Token.
Token ID13
Feature activation-1.16e+00

Pos logit contributions
wen+0.026
opes+0.026
lor+0.026
uding+0.025
 Charles+0.024
Neg logit contributions
 Lily-0.011
 L-0.009
,-0.009
 v-0.008
 very-0.008
Token The
Token ID383
Feature activation-7.22e-02

Pos logit contributions
opes+0.040
wen+0.039
uding+0.039
lor+0.039
 Charles+0.037
Neg logit contributions
 Lily-0.027
 L-0.024
 v-0.020
 Sp-0.020
L-0.019
Token lights
Token ID7588
Feature activation-8.54e-01

Pos logit contributions
opes+0.002
uding+0.002
lor+0.002
wen+0.002
 Charles+0.002
Neg logit contributions
 Lily-0.002
 L-0.001
 Sp-0.001
 v-0.001
L-0.001
Token Their
Token ID5334
Feature activation-3.01e-01

Pos logit contributions
 v+0.013
 Lily+0.012
 L+0.011
 p+0.011
 dirty+0.011
Neg logit contributions
opes-0.025
 Charles-0.024
lor-0.024
uding-0.024
wen-0.023
Token mom
Token ID1995
Feature activation+6.11e-01

Pos logit contributions
opes+0.011
wen+0.010
uding+0.010
lor+0.010
 Charles+0.010
Neg logit contributions
 Lily-0.006
 L-0.005
 Sp-0.005
L-0.005
 v-0.005
Token and
Token ID290
Feature activation+1.24e-01

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.006
 Sp+0.006
 p+0.006
Neg logit contributions
opes-0.025
lor-0.025
uding-0.025
wen-0.025
 Charles-0.024
Token dad
Token ID9955
Feature activation-2.49e-01

Pos logit contributions
 Lily+0.003
 L+0.003
 v+0.003
 Sp+0.003
 p+0.003
Neg logit contributions
opes-0.004
uding-0.004
wen-0.004
lor-0.003
 Charles-0.003
Token cleaned
Token ID20750
Feature activation+1.20e+00

Pos logit contributions
opes+0.009
lor+0.009
uding+0.009
wen+0.009
 Charles+0.009
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.004
 Sp-0.003
 p-0.003
Token Lily
Token ID20037
Feature activation+3.04e+00

Pos logit contributions
 Lily+0.025
 L+0.024
L+0.024
 v+0.023
 sad+0.023
Neg logit contributions
opes-0.035
uding-0.034
 Charles-0.034
uted-0.033
wen-0.032
Token's
Token ID338
Feature activation+9.60e-01

Pos logit contributions
 Lily+0.067
 L+0.063
 Sp+0.054
 v+0.051
L+0.051
Neg logit contributions
opes-0.101
lor-0.100
uding-0.100
wen-0.095
 Charles-0.093
Token face
Token ID1986
Feature activation+2.39e+00

Pos logit contributions
 Lily+0.026
 L+0.024
 v+0.022
 p+0.021
 Sp+0.021
Neg logit contributions
opes-0.028
lor-0.028
wen-0.028
uding-0.028
 Charles-0.027
Token and
Token ID290
Feature activation+3.21e-01

Pos logit contributions
 Lily+0.038
 L+0.034
 Sp+0.027
 v+0.027
L+0.026
Neg logit contributions
opes-0.093
lor-0.089
uding-0.089
wen-0.088
 Charles-0.087
Token put
Token ID1234
Feature activation+1.02e+00

Pos logit contributions
 Lily+0.007
 L+0.006
 v+0.005
 p+0.005
 sad+0.005
Neg logit contributions
opes-0.012
lor-0.012
uding-0.012
wen-0.012
 Charles-0.011
Token a
Token ID257
Feature activation+8.37e-01

Pos logit contributions
 Lily+0.019
 L+0.017
 v+0.014
 Sp+0.013
 p+0.013
Neg logit contributions
opes-0.039
lor-0.038
uding-0.038
wen-0.038
 Charles-0.037
Tokenink
Token ID676
Feature activation-1.43e-02

Pos logit contributions
opes+0.001
lor+0.001
uding+0.001
wen+0.001
 Charles+0.001
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.001
 Sp-0.001
 p-0.001
Token their
Token ID511
Feature activation-1.01e+00

Pos logit contributions
opes+0.000
uding+0.000
lor+0.000
wen+0.000
 Charles+0.000
Neg logit contributions
 Lily-0.000
 L-0.000
 v-0.000
 p-0.000
L-0.000
Token cups
Token ID14180
Feature activation-5.83e-01

Pos logit contributions
opes+0.034
uding+0.033
lor+0.032
wen+0.032
 Charles+0.031
Neg logit contributions
 Lily-0.024
 L-0.021
 v-0.019
 Sp-0.018
L-0.018
Token.
Token ID13
Feature activation+4.18e-01

Pos logit contributions
opes+0.025
lor+0.024
uding+0.024
wen+0.024
 Charles+0.024
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.005
 Sp-0.005
 p-0.005
Token They
Token ID1119
Feature activation+9.24e-02

Pos logit contributions
 v+0.008
 sad+0.007
 dirty+0.007
 Lily+0.007
 very+0.007
Neg logit contributions
 Charles-0.013
opes-0.013
uding-0.013
lor-0.012
uted-0.012
Token sip
Token ID31145
Feature activation+3.01e+00

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.001
 Sp+0.001
L+0.001
Neg logit contributions
opes-0.003
uding-0.003
wen-0.003
 Charles-0.003
lor-0.003
Token the
Token ID262
Feature activation+1.58e+00

Pos logit contributions
 Lily+0.052
L+0.051
ott+0.051
 L+0.051
 v+0.049
Neg logit contributions
opes-0.084
rily-0.083
lled-0.080
uted-0.080
aced-0.076
Token water
Token ID1660
Feature activation+9.01e-01

Pos logit contributions
 Lily+0.027
 L+0.025
 Sp+0.022
L+0.020
 sad+0.019
Neg logit contributions
opes-0.057
lor-0.054
wen-0.054
uding-0.053
eway-0.052
Token.
Token ID13
Feature activation+4.42e-01

Pos logit contributions
 Lily+0.014
 L+0.012
 Sp+0.010
L+0.010
 v+0.010
Neg logit contributions
opes-0.035
uding-0.034
lor-0.033
wen-0.033
 Charles-0.033
Token They
Token ID1119
Feature activation+3.60e-01

Pos logit contributions
 Lily+0.007
 v+0.007
 L+0.007
 sad+0.006
 dirty+0.006
Neg logit contributions
opes-0.015
 Charles-0.015
uding-0.015
lor-0.015
wen-0.014
Token eat
Token ID4483
Feature activation+1.82e+00

Pos logit contributions
 Lily+0.008
 L+0.007
 v+0.006
L+0.006
 Sp+0.006
Neg logit contributions
opes-0.012
uding-0.011
 Charles-0.011
wen-0.011
lor-0.011
Token to
Token ID284
Feature activation+5.37e-01

Pos logit contributions
 Lily+0.017
 L+0.015
 v+0.014
L+0.013
 Sp+0.013
Neg logit contributions
opes-0.031
uding-0.031
wen-0.031
 Charles-0.031
lor-0.030
Token each
Token ID1123
Feature activation+7.64e-01

Pos logit contributions
 Lily+0.010
 L+0.009
 v+0.008
 Sp+0.007
L+0.007
Neg logit contributions
opes-0.020
uding-0.020
wen-0.020
lor-0.020
 Charles-0.019
Token other
Token ID584
Feature activation+1.30e+00

Pos logit contributions
 Lily+0.013
 L+0.012
 Sp+0.011
 v+0.011
 p+0.010
Neg logit contributions
opes-0.027
wen-0.026
uding-0.026
lor-0.026
 Charles-0.026
Token.
Token ID13
Feature activation+5.16e-01

Pos logit contributions
 L+0.021
 Lily+0.021
 Sp+0.018
L+0.017
 v+0.016
Neg logit contributions
uding-0.046
opes-0.046
uted-0.044
aced-0.044
lor-0.044
Token They
Token ID1119
Feature activation+5.73e-01

Pos logit contributions
 v+0.009
 Lily+0.009
 L+0.008
 Sp+0.008
 sad+0.008
Neg logit contributions
opes-0.017
 Charles-0.017
uding-0.017
lor-0.016
wen-0.016
Token drank
Token ID24070
Feature activation+3.00e+00

Pos logit contributions
 Lily+0.013
 L+0.012
 v+0.011
 Sp+0.011
 square+0.010
Neg logit contributions
opes-0.017
wen-0.017
uding-0.017
 witnessed-0.017
lor-0.016
Token their
Token ID511
Feature activation+1.75e+00

Pos logit contributions
 L+0.059
 Lily+0.058
 Sp+0.057
L+0.054
 v+0.052
Neg logit contributions
opes-0.082
lled-0.081
uted-0.080
aced-0.078
uding-0.077
Token water
Token ID1660
Feature activation+1.40e+00

Pos logit contributions
 Lily+0.037
 L+0.035
 Sp+0.029
L+0.029
 sad+0.028
Neg logit contributions
opes-0.058
uding-0.053
aded-0.052
lor-0.052
wen-0.052
Token and
Token ID290
Feature activation+4.70e-01

Pos logit contributions
 Lily+0.025
 L+0.024
 Sp+0.021
L+0.020
 v+0.019
Neg logit contributions
opes-0.049
uding-0.047
lor-0.046
uted-0.045
aced-0.045
Token ate
Token ID15063
Feature activation+2.07e+00

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.007
 Sp+0.007
L+0.007
Neg logit contributions
opes-0.016
uding-0.016
wen-0.016
lor-0.016
 Charles-0.016
Token their
Token ID511
Feature activation+1.19e+00

Pos logit contributions
 Lily+0.038
 L+0.037
L+0.035
 Sp+0.033
 v+0.030
Neg logit contributions
opes-0.068
uding-0.064
 Charles-0.063
uted-0.063
lled-0.062
Token floor
Token ID4314
Feature activation+2.04e-01

Pos logit contributions
 Lily+0.021
 L+0.020
 Sp+0.018
L+0.017
 sad+0.017
Neg logit contributions
opes-0.021
wen-0.019
lor-0.019
 doorstep-0.019
 reins-0.019
Token.
Token ID13
Feature activation+7.38e-02

Pos logit contributions
 Lily+0.003
 L+0.003
 v+0.002
 Sp+0.002
L+0.002
Neg logit contributions
opes-0.008
uding-0.008
lor-0.008
wen-0.008
 Charles-0.008
Token They
Token ID1119
Feature activation+7.78e-01

Pos logit contributions
 Lily+0.001
 v+0.001
 sad+0.001
 dirty+0.001
 p+0.001
Neg logit contributions
opes-0.003
 Charles-0.002
uding-0.002
lor-0.002
wen-0.002
Token pretend
Token ID16614
Feature activation-7.83e-01

Pos logit contributions
 Lily+0.017
 L+0.015
L+0.014
 v+0.013
 Sp+0.013
Neg logit contributions
opes-0.025
uding-0.024
 Charles-0.024
wen-0.023
lor-0.023
Token to
Token ID284
Feature activation+1.16e-01

Pos logit contributions
opes+0.032
lor+0.031
uding+0.031
wen+0.031
 Charles+0.030
Neg logit contributions
 Lily-0.012
 L-0.011
 v-0.009
L-0.008
 Sp-0.008
Token sip
Token ID31145
Feature activation+2.98e+00

Pos logit contributions
 Lily+0.002
 L+0.002
 Sp+0.002
 sad+0.002
 v+0.002
Neg logit contributions
opes-0.004
uding-0.004
wen-0.003
lor-0.003
 Charles-0.003
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 L+0.050
 Lily+0.050
L+0.048
 sad+0.047
 H+0.045
Neg logit contributions
opes-0.093
rily-0.082
uted-0.082
uding-0.081
lled-0.080
Token nib
Token ID33272
Feature activation-4.58e-01

Pos logit contributions
 Lily+0.006
 L+0.005
 v+0.005
L+0.005
 Sp+0.005
Neg logit contributions
opes-0.011
uding-0.010
lor-0.010
wen-0.010
 Charles-0.010
Tokenble
Token ID903
Feature activation+1.42e+00

Pos logit contributions
opes+0.011
wen+0.010
lor+0.010
uding+0.009
uted+0.009
Neg logit contributions
 Lily-0.015
 L-0.014
 v-0.013
 sad-0.013
L-0.013
Token.
Token ID13
Feature activation+4.26e-01

Pos logit contributions
 Lily+0.022
 L+0.020
L+0.019
 Sp+0.018
 sad+0.017
Neg logit contributions
opes-0.051
lor-0.047
rily-0.047
uding-0.047
 Charles-0.047
Token
Token ID198
Feature activation+6.98e-01

Pos logit contributions
 v+0.007
 Lily+0.007
 sad+0.007
 dirty+0.007
 p+0.007
Neg logit contributions
opes-0.014
uding-0.014
 Charles-0.014
lor-0.014
wen-0.013
Token her
Token ID607
Feature activation+2.35e-01

Pos logit contributions
 Lily+0.038
 L+0.036
 v+0.033
 Sp+0.032
 sad+0.031
Neg logit contributions
opes-0.069
 Charles-0.068
uding-0.067
lor-0.067
wen-0.066
Token mom
Token ID1995
Feature activation+1.50e+00

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
L+0.003
 Sp+0.003
Neg logit contributions
opes-0.009
uding-0.009
lor-0.009
wen-0.009
 Charles-0.009
Token.
Token ID13
Feature activation+1.17e+00

Pos logit contributions
 Lily+0.019
 L+0.017
 v+0.014
L+0.013
 Sp+0.013
Neg logit contributions
opes-0.061
lor-0.060
 Charles-0.060
uding-0.059
wen-0.059
Token She
Token ID1375
Feature activation+1.08e+00

Pos logit contributions
 Lily+0.018
 v+0.016
 L+0.016
 Sp+0.015
 sad+0.015
Neg logit contributions
opes-0.042
 Charles-0.042
uding-0.042
lor-0.041
wen-0.040
Token smiled
Token ID13541
Feature activation-3.05e-01

Pos logit contributions
 Lily+0.020
 L+0.018
 v+0.016
 Sp+0.014
L+0.014
Neg logit contributions
opes-0.040
wen-0.039
lor-0.038
uding-0.038
 Charles-0.038
Token even
Token ID772
Feature activation+2.98e+00

Pos logit contributions
opes+0.013
lor+0.013
wen+0.013
uding+0.013
eway+0.012
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.003
,-0.003
 very-0.003
Token when
Token ID618
Feature activation-7.89e-02

Pos logit contributions
 Lily+0.048
 L+0.047
 Sp+0.044
 v+0.039
 P+0.037
Neg logit contributions
 witnessed-0.096
opes-0.095
lled-0.095
uted-0.094
 Charles-0.094
Token she
Token ID673
Feature activation+7.86e-01

Pos logit contributions
opes+0.003
lor+0.003
wen+0.003
uding+0.003
 Charles+0.003
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.001
 Sp-0.001
L-0.001
Token was
Token ID373
Feature activation+2.41e-01

Pos logit contributions
 Lily+0.016
 L+0.014
 v+0.013
 sad+0.012
 Sp+0.012
Neg logit contributions
opes-0.027
wen-0.027
lor-0.027
uding-0.026
 Charles-0.026
Token sad
Token ID6507
Feature activation+9.37e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
 Sp+0.003
L+0.003
Neg logit contributions
opes-0.009
uding-0.009
wen-0.009
lor-0.009
 Charles-0.009
Token,
Token ID11
Feature activation+2.37e-01

Pos logit contributions
 Lily+0.010
 L+0.008
 v+0.006
 p+0.005
 Sp+0.005
Neg logit contributions
opes-0.043
lor-0.043
wen-0.042
uding-0.042
 Charles-0.041
Token food
Token ID2057
Feature activation+2.03e-01

Pos logit contributions
 Lily+0.026
 L+0.024
 v+0.022
L+0.021
 Sp+0.021
Neg logit contributions
opes-0.029
lor-0.028
uding-0.028
wen-0.028
 Charles-0.027
Token,
Token ID11
Feature activation+5.55e-02

Pos logit contributions
 Lily+0.003
 L+0.002
 v+0.002
 p+0.001
,+0.001
Neg logit contributions
opes-0.009
lor-0.009
wen-0.009
uding-0.009
 Charles-0.009
Token and
Token ID290
Feature activation+5.82e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
 p+0.001
 very+0.001
Neg logit contributions
lor-0.002
wen-0.002
opes-0.002
uding-0.002
eway-0.002
Token it
Token ID340
Feature activation+6.62e-01

Pos logit contributions
 Lily+0.009
 L+0.007
 v+0.006
 p+0.005
,+0.005
Neg logit contributions
opes-0.026
lor-0.025
uding-0.025
wen-0.025
eway-0.024
Token became
Token ID2627
Feature activation+9.04e-01

Pos logit contributions
 Lily+0.011
 L+0.010
 v+0.008
 p+0.007
 Sp+0.007
Neg logit contributions
opes-0.026
lor-0.026
uding-0.026
wen-0.026
 Charles-0.025
Token even
Token ID772
Feature activation+2.97e+00

Pos logit contributions
 Lily+0.014
 L+0.013
 v+0.011
 Sp+0.010
L+0.010
Neg logit contributions
opes-0.035
uding-0.034
 Charles-0.033
wen-0.033
lor-0.033
Token f
Token ID277
Feature activation-7.83e-01

Pos logit contributions
 Lily+0.048
 L+0.048
 Sp+0.042
 v+0.037
 W+0.035
Neg logit contributions
lled-0.096
uding-0.096
 armour-0.095
 Charles-0.094
opes-0.094
Tokenatter
Token ID1436
Feature activation-6.71e-01

Pos logit contributions
lor+0.017
wen+0.017
opes+0.016
uding+0.016
imm+0.015
Neg logit contributions
 Lily-0.029
 L-0.026
 v-0.024
,-0.024
 very-0.024
Token.
Token ID13
Feature activation-4.43e-01

Pos logit contributions
opes+0.029
uding+0.028
lor+0.028
wen+0.028
 Charles+0.027
Neg logit contributions
 Lily-0.009
 L-0.007
 v-0.006
 Sp-0.005
L-0.005
Token Whenever
Token ID21326
Feature activation-4.79e-01

Pos logit contributions
opes+0.018
lor+0.018
uding+0.018
wen+0.018
 Charles+0.018
Neg logit contributions
 Lily-0.007
 L-0.006
 v-0.005
 p-0.004
L-0.004
Token the
Token ID262
Feature activation+2.20e-01

Pos logit contributions
opes+0.022
lor+0.021
uding+0.021
wen+0.021
 Charles+0.021
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.003
L-0.003
 Sp-0.003
Token They
Token ID1119
Feature activation+9.27e-01

Pos logit contributions
 v+0.019
 Sp+0.017
 le+0.016
 dirty+0.016
 mean+0.016
Neg logit contributions
opes-0.027
 Charles-0.025
andra-0.025
lor-0.025
 Birthday-0.024
Token ate
Token ID15063
Feature activation+2.15e+00

Pos logit contributions
 Lily+0.022
 L+0.020
 v+0.019
 Sp+0.019
 sad+0.018
Neg logit contributions
opes-0.027
wen-0.026
uding-0.026
 Charles-0.025
lor-0.025
Token their
Token ID511
Feature activation+1.17e+00

Pos logit contributions
 L+0.041
L+0.038
 Lily+0.038
 Sp+0.036
 v+0.034
Neg logit contributions
opes-0.065
rily-0.061
uted-0.061
 Charles-0.059
lled-0.059
Token lunch
Token ID9965
Feature activation+9.90e-01

Pos logit contributions
 Lily+0.033
 L+0.030
 v+0.027
 Sp+0.026
L+0.026
Neg logit contributions
opes-0.033
uding-0.032
lor-0.032
wen-0.032
 Charles-0.031
Token and
Token ID290
Feature activation+3.80e-01

Pos logit contributions
 Lily+0.015
 L+0.013
 v+0.011
 Sp+0.010
L+0.010
Neg logit contributions
opes-0.040
lor-0.039
uding-0.039
wen-0.039
 Charles-0.038
Token drank
Token ID24070
Feature activation+2.97e+00

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.008
 Sp+0.007
L+0.007
Neg logit contributions
opes-0.012
uding-0.011
wen-0.011
 Charles-0.011
lor-0.011
Token their
Token ID511
Feature activation+1.55e+00

Pos logit contributions
 L+0.057
 Lily+0.054
L+0.053
 Sp+0.053
 H+0.049
Neg logit contributions
opes-0.084
rily-0.082
lled-0.080
uted-0.079
aced-0.078
Token juice
Token ID13135
Feature activation+2.90e+00

Pos logit contributions
 Lily+0.036
 L+0.032
 v+0.029
 Sp+0.027
L+0.027
Neg logit contributions
opes-0.052
lor-0.050
uding-0.050
wen-0.050
 Charles-0.048
Token and
Token ID290
Feature activation+6.90e-01

Pos logit contributions
 Lily+0.043
 L+0.041
 Sp+0.035
 v+0.031
 sad+0.030
Neg logit contributions
opes-0.110
lor-0.106
uding-0.106
aced-0.103
uted-0.101
Token had
Token ID550
Feature activation+8.24e-01

Pos logit contributions
 Lily+0.015
 L+0.014
 v+0.013
 Sp+0.012
L+0.012
Neg logit contributions
opes-0.023
uding-0.022
wen-0.021
 Charles-0.021
lor-0.021
Token spicy
Token ID26880
Feature activation+4.16e-01

Pos logit contributions
 Lily+0.016
 L+0.015
 v+0.013
L+0.012
 Sp+0.012
Neg logit contributions
opes-0.029
lor-0.028
uding-0.028
wen-0.028
 Charles-0.028
Token teach
Token ID4545
Feature activation+1.67e+00

Pos logit contributions
opes+0.035
lor+0.034
uding+0.034
wen+0.034
 Charles+0.033
Neg logit contributions
 Lily-0.018
 L-0.015
 v-0.013
 Sp-0.012
L-0.012
Token her
Token ID607
Feature activation+1.92e-02

Pos logit contributions
 Lily+0.030
 L+0.028
 v+0.025
L+0.024
 sad+0.023
Neg logit contributions
opes-0.060
uding-0.058
wen-0.056
 Charles-0.056
lor-0.056
Token doll
Token ID3654
Feature activation+4.10e-02

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
L+0.000
 Sp+0.000
Neg logit contributions
opes-0.001
wen-0.001
lor-0.001
uding-0.001
 Charles-0.001
Token how
Token ID703
Feature activation+5.66e-01

Pos logit contributions
 Lily+0.001
 L+0.001
L+0.001
 v+0.001
 p+0.001
Neg logit contributions
opes-0.001
uding-0.001
lor-0.001
 Charles-0.001
eway-0.001
Token to
Token ID284
Feature activation+3.55e-01

Pos logit contributions
 Lily+0.010
 L+0.010
 v+0.008
L+0.008
 Sp+0.008
Neg logit contributions
opes-0.020
uding-0.019
wen-0.019
 Charles-0.019
lor-0.019
Token drink
Token ID4144
Feature activation+2.96e+00

Pos logit contributions
 Lily+0.007
 L+0.006
 v+0.005
 Sp+0.005
 sad+0.005
Neg logit contributions
opes-0.013
uding-0.012
wen-0.012
lor-0.012
 Charles-0.012
Token wine
Token ID8237
Feature activation+1.41e+00

Pos logit contributions
 Lily+0.042
 L+0.042
 Sp+0.036
L+0.036
 sad+0.035
Neg logit contributions
opes-0.101
uding-0.098
lor-0.096
eway-0.095
 Charles-0.094
Token.
Token ID13
Feature activation+6.62e-01

Pos logit contributions
 Lily+0.016
 L+0.014
 v+0.011
L+0.010
 Sp+0.010
Neg logit contributions
opes-0.061
uding-0.060
lor-0.060
wen-0.058
 Charles-0.058
Token She
Token ID1375
Feature activation+8.59e-02

Pos logit contributions
 v+0.011
 le+0.010
 c+0.010
 L+0.010
 Sp+0.009
Neg logit contributions
opes-0.021
uding-0.021
 Charles-0.019
elcome-0.019
lor-0.019
Token saw
Token ID2497
Feature activation-1.56e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.001
 Sp+0.001
 sad+0.001
Neg logit contributions
opes-0.003
wen-0.003
uding-0.003
lor-0.003
 Charles-0.003
Token her
Token ID607
Feature activation+2.53e-02

Pos logit contributions
opes+0.006
wen+0.005
lor+0.005
uding+0.005
 Charles+0.005
Neg logit contributions
 Lily-0.003
 L-0.003
 v-0.002
L-0.002
 Sp-0.002
Token.
Token ID13
Feature activation+1.30e+00

Pos logit contributions
 Lily+0.009
 L+0.007
 v+0.006
 Sp+0.005
 p+0.005
Neg logit contributions
opes-0.024
lor-0.024
uding-0.024
wen-0.024
 Charles-0.023
Token The
Token ID383
Feature activation+6.52e-01

Pos logit contributions
 end+0.014
 L+0.014
L+0.014
 better+0.013
m+0.012
Neg logit contributions
opes-0.047
uding-0.044
 Charles-0.043
elcome-0.042
 Jessie-0.041
Token powder
Token ID11913
Feature activation-4.78e-01

Pos logit contributions
 Lily+0.017
 L+0.015
 v+0.014
 p+0.013
L+0.013
Neg logit contributions
opes-0.020
uding-0.020
lor-0.020
wen-0.020
 Charles-0.019
Token made
Token ID925
Feature activation+2.21e-01

Pos logit contributions
opes+0.021
uding+0.021
lor+0.021
wen+0.020
 Charles+0.020
Neg logit contributions
 Lily-0.007
 L-0.005
 v-0.005
 p-0.004
,-0.004
Token it
Token ID340
Feature activation-3.87e-01

Pos logit contributions
 Lily+0.003
 L+0.002
 v+0.002
 Sp+0.002
 p+0.002
Neg logit contributions
opes-0.009
wen-0.008
lor-0.008
uding-0.008
uted-0.008
Token feel
Token ID1254
Feature activation+2.96e+00

Pos logit contributions
opes+0.015
lor+0.015
uding+0.015
wen+0.015
 Charles+0.015
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.004
 Sp-0.004
L-0.004
Token better
Token ID1365
Feature activation+1.14e+00

Pos logit contributions
 Lily+0.042
 L+0.040
 Sp+0.034
 v+0.031
L+0.030
Neg logit contributions
opes-0.107
aded-0.106
lled-0.105
 witnessed-0.105
uding-0.105
Token and
Token ID290
Feature activation+3.39e-01

Pos logit contributions
 Lily+0.017
 L+0.014
 v+0.012
 Sp+0.010
L+0.010
Neg logit contributions
opes-0.047
lor-0.046
uding-0.046
wen-0.046
 Charles-0.045
Token stopped
Token ID5025
Feature activation+1.27e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
 Sp+0.003
L+0.003
Neg logit contributions
opes-0.013
uding-0.013
 Charles-0.013
wen-0.013
lor-0.013
Token it
Token ID340
Feature activation+7.90e-01

Pos logit contributions
 Lily+0.003
 L+0.002
 v+0.002
L+0.002
 Sp+0.002
Neg logit contributions
opes-0.004
uding-0.004
wen-0.004
lor-0.004
 Charles-0.004
Token from
Token ID422
Feature activation-3.44e-01

Pos logit contributions
 Lily+0.009
 L+0.007
 v+0.005
 Sp+0.004
L+0.004
Neg logit contributions
opes-0.036
lor-0.035
uding-0.035
wen-0.035
 Charles-0.034
Token Fl
Token ID1610
Feature activation-1.24e+00

Pos logit contributions
opes+0.041
lor+0.040
uding+0.040
wen+0.040
eway+0.038
Neg logit contributions
 Lily-0.017
 L-0.014
 v-0.011
 Sp-0.010
,-0.010
Tokenoppy
Token ID26696
Feature activation+4.11e-01

Pos logit contributions
opes+0.035
lor+0.034
wen+0.034
uding+0.034
 Charles+0.033
Neg logit contributions
 Lily-0.035
 L-0.032
 v-0.029
 Sp-0.028
 p-0.028
Token saw
Token ID2497
Feature activation+7.73e-02

Pos logit contributions
 Lily+0.007
 L+0.006
 v+0.005
,+0.005
 p+0.005
Neg logit contributions
opes-0.017
uding-0.016
lor-0.016
wen-0.016
eway-0.016
Token a
Token ID257
Feature activation-4.75e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
 p+0.001
 very+0.001
Neg logit contributions
opes-0.003
uding-0.003
lor-0.003
wen-0.003
 Charles-0.003
Token mar
Token ID1667
Feature activation-2.21e+00

Pos logit contributions
 Charles+0.015
wen+0.015
opes+0.015
lor+0.015
uding+0.015
Neg logit contributions
 Lily-0.010
 L-0.009
 Sp-0.008
L-0.008
 v-0.007
Tokenvell
Token ID29333
Feature activation-4.74e+00

Pos logit contributions
opes+0.061
uding+0.060
wen+0.059
 Charles+0.059
lor+0.058
Neg logit contributions
 Lily-0.063
 L-0.059
 v-0.055
 p-0.052
L-0.052
Tokenous
Token ID516
Feature activation-1.09e+00

Pos logit contributions
opes+0.102
uding+0.096
lor+0.096
wen+0.096
Andrew+0.092
Neg logit contributions
 Lily-0.146
 L-0.144
,-0.140
 v-0.136
 very-0.132
Token view
Token ID1570
Feature activation-8.05e-01

Pos logit contributions
opes+0.035
lor+0.034
uding+0.034
wen+0.033
 Charles+0.033
Neg logit contributions
 Lily-0.027
 L-0.025
 v-0.022
 Sp-0.021
L-0.021
Token.
Token ID13
Feature activation-3.52e-01

Pos logit contributions
opes+0.036
lor+0.035
uding+0.035
wen+0.035
 Charles+0.033
Neg logit contributions
 Lily-0.011
 L-0.009
,-0.008
 v-0.007
 very-0.007
Token He
Token ID679
Feature activation-5.12e-01

Pos logit contributions
lor+0.013
opes+0.013
uding+0.013
 Charles+0.013
wen+0.013
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.005
 sad-0.004
 dirty-0.004
Token also
Token ID635
Feature activation+8.17e-01

Pos logit contributions
lor+0.020
opes+0.020
wen+0.020
uding+0.020
 Charles+0.019
Neg logit contributions
 Lily-0.009
 L-0.008
 v-0.007
 p-0.007
 very-0.006
Token the
Token ID262
Feature activation+4.16e-01

Pos logit contributions
 Lily+0.006
 L+0.004
 v+0.003
 p+0.003
 Sp+0.003
Neg logit contributions
opes-0.021
lor-0.021
uding-0.020
wen-0.020
 Charles-0.020
Token kids
Token ID3988
Feature activation+4.14e-01

Pos logit contributions
 Lily+0.009
 L+0.007
 v+0.007
 p+0.007
 very+0.006
Neg logit contributions
wen-0.015
lor-0.015
opes-0.015
uding-0.015
 Charles-0.015
Token had
Token ID550
Feature activation-1.36e-01

Pos logit contributions
 Lily+0.007
 L+0.006
,+0.006
 very+0.006
 v+0.006
Neg logit contributions
opes-0.017
lor-0.017
wen-0.017
uding-0.016
eway-0.016
Token their
Token ID511
Feature activation-2.19e-01

Pos logit contributions
opes+0.006
uding+0.006
wen+0.006
lor+0.005
eway+0.005
Neg logit contributions
 Lily-0.003
 L-0.002
 very-0.002
 v-0.002
,-0.002
Token t
Token ID256
Feature activation-1.36e+00

Pos logit contributions
lor+0.007
wen+0.007
opes+0.007
uding+0.007
eway+0.007
Neg logit contributions
 Lily-0.005
 L-0.005
 v-0.004
 very-0.004
 p-0.004
Tokenamb
Token ID4131
Feature activation-4.28e+00

Pos logit contributions
 Charles+0.043
uding+0.042
opes+0.042
wen+0.040
eway+0.039
Neg logit contributions
 Lily-0.031
 L-0.030
 v-0.029
L-0.028
 p-0.026
Tokenour
Token ID454
Feature activation-6.47e-01

Pos logit contributions
uding+0.098
Frank+0.097
lor+0.096
eway+0.095
Charlie+0.094
Neg logit contributions
 L-0.118
 Lily-0.117
,-0.117
 can-0.109
 sad-0.108
Tokenines
Token ID1127
Feature activation+1.43e-01

Pos logit contributions
uding+0.025
opes+0.025
 Charles+0.025
wen+0.024
Frank+0.024
Neg logit contributions
 Lily-0.011
 L-0.010
,-0.010
 v-0.009
 very-0.009
Token and
Token ID290
Feature activation-3.57e-01

Pos logit contributions
 Lily+0.002
 L+0.001
,+0.001
 v+0.001
 very+0.001
Neg logit contributions
opes-0.007
wen-0.007
uding-0.007
lor-0.007
 Charles-0.007
Token drums
Token ID24730
Feature activation-5.64e-01

Pos logit contributions
lor+0.014
opes+0.014
uding+0.014
wen+0.013
eway+0.013
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.006
 very-0.006
,-0.005
Token,
Token ID11
Feature activation-8.44e-01

Pos logit contributions
wen+0.026
opes+0.026
lor+0.026
uding+0.026
 Charles+0.025
Neg logit contributions
 Lily-0.006
 L-0.005
,-0.005
 v-0.004
 very-0.004
Token tall
Token ID7331
Feature activation-1.55e+00

Pos logit contributions
 Lily+0.003
 L+0.002
 very+0.002
 v+0.002
 sad+0.002
Neg logit contributions
lor-0.008
wen-0.007
opes-0.007
uding-0.007
 Charles-0.007
Token and
Token ID290
Feature activation+4.91e-01

Pos logit contributions
opes+0.063
uding+0.062
lor+0.061
wen+0.061
 Charles+0.060
Neg logit contributions
 Lily-0.024
 L-0.021
 v-0.017
L-0.016
 Sp-0.016
Token strong
Token ID1913
Feature activation-5.26e-01

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.006
L+0.006
 Sp+0.006
Neg logit contributions
opes-0.018
wen-0.018
uding-0.018
 Charles-0.018
lor-0.018
Token and
Token ID290
Feature activation+4.27e-01

Pos logit contributions
opes+0.024
lor+0.024
wen+0.024
uding+0.024
 Charles+0.023
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.003
,-0.003
 p-0.003
Token had
Token ID550
Feature activation-1.15e+00

Pos logit contributions
 Lily+0.008
 L+0.007
 v+0.006
 Sp+0.005
 p+0.005
Neg logit contributions
opes-0.017
lor-0.016
uding-0.016
wen-0.016
 Charles-0.016
Token delic
Token ID8675
Feature activation-4.24e+00

Pos logit contributions
lor+0.047
uding+0.046
wen+0.046
opes+0.045
uted+0.044
Neg logit contributions
 Lily-0.021
 L-0.018
 v-0.017
 p-0.016
 very-0.016
Tokeniously
Token ID6819
Feature activation-1.32e+00

Pos logit contributions
lor+0.096
Charlie+0.088
Frank+0.082
 Charles+0.081
eway+0.080
Neg logit contributions
 L-0.140
,-0.136
 Lily-0.133
.-0.125
 Sp-0.116
Token ripe
Token ID29036
Feature activation-9.25e-01

Pos logit contributions
lor+0.060
wen+0.057
opes+0.055
uding+0.055
uted+0.054
Neg logit contributions
 Lily-0.021
,-0.019
 L-0.018
 v-0.016
 very-0.016
Token pe
Token ID613
Feature activation-2.75e+00

Pos logit contributions
lor+0.039
wen+0.039
uding+0.039
opes+0.039
 Charles+0.037
Neg logit contributions
 Lily-0.013
 L-0.012
,-0.010
 p-0.009
 v-0.009
Tokenaches
Token ID3694
Feature activation-1.55e+00

Pos logit contributions
opes+0.068
lor+0.067
uding+0.066
 Charles+0.064
Charlie+0.063
Neg logit contributions
 L-0.078
 Lily-0.078
 v-0.073
 very-0.073
,-0.072
Token all
Token ID477
Feature activation-4.77e-01

Pos logit contributions
opes+0.075
lor+0.074
uding+0.074
wen+0.073
 Charles+0.072
Neg logit contributions
 Lily-0.013
 L-0.010
,-0.008
 very-0.006
 v-0.006
Token,
Token ID11
Feature activation-1.49e+00

Pos logit contributions
opes+0.061
lor+0.060
wen+0.059
uding+0.059
uted+0.057
Neg logit contributions
 Lily-0.008
 L-0.005
,-0.003
 v-0.001
 p-0.001
Token she
Token ID673
Feature activation-2.11e-01

Pos logit contributions
opes+0.077
lor+0.076
uding+0.075
wen+0.073
uted+0.072
Neg logit contributions
 Lily-0.013
 L-0.009
,-0.006
 very-0.005
 sad-0.004
Token saw
Token ID2497
Feature activation-7.41e-01

Pos logit contributions
opes+0.009
uding+0.008
lor+0.008
eway+0.008
wen+0.008
Neg logit contributions
 Lily-0.004
 L-0.003
,-0.003
 very-0.003
 p-0.003
Token a
Token ID257
Feature activation-6.45e-01

Pos logit contributions
opes+0.033
uding+0.033
lor+0.032
wen+0.031
 Charles+0.030
Neg logit contributions
 Lily-0.011
 L-0.009
 v-0.007
,-0.007
 very-0.007
Token beautiful
Token ID4950
Feature activation+7.18e-02

Pos logit contributions
opes+0.022
lor+0.021
uding+0.021
wen+0.021
 Charles+0.021
Neg logit contributions
 Lily-0.014
 L-0.013
 v-0.011
 Sp-0.011
L-0.010
Token mer
Token ID4017
Feature activation-4.20e+00

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.001
 p+0.001
 Sp+0.001
Neg logit contributions
opes-0.002
lor-0.002
uding-0.002
wen-0.002
 Charles-0.002
Tokenmaid
Token ID23151
Feature activation-1.39e+00

Pos logit contributions
wen+0.107
opes+0.104
lor+0.100
Charlie+0.098
 Charles+0.097
Neg logit contributions
 Lily-0.126
 L-0.122
 p-0.115
 v-0.115
,-0.112
Token swimming
Token ID14899
Feature activation-1.32e+00

Pos logit contributions
opes+0.049
uding+0.048
wen+0.048
lor+0.047
 Charles+0.045
Neg logit contributions
 Lily-0.031
 L-0.029
,-0.026
 v-0.025
 p-0.024
Token around
Token ID1088
Feature activation-9.38e-01

Pos logit contributions
opes+0.053
wen+0.053
lor+0.052
uding+0.051
 Charles+0.050
Neg logit contributions
 Lily-0.024
 L-0.021
,-0.018
 v-0.017
 very-0.017
Token her
Token ID607
Feature activation+3.49e-01

Pos logit contributions
opes+0.041
lor+0.041
wen+0.041
uding+0.040
 Charles+0.039
Neg logit contributions
 Lily-0.013
 L-0.010
 v-0.008
,-0.007
 p-0.007
Token.
Token ID13
Feature activation-1.25e+00

Pos logit contributions
 Lily+0.004
 L+0.003
,+0.003
 very+0.002
 p+0.002
Neg logit contributions
wen-0.016
opes-0.016
lor-0.016
uding-0.016
eway-0.015
Token.
Token ID13
Feature activation-2.77e-01

Pos logit contributions
wen+0.048
opes+0.047
lor+0.047
uding+0.046
 Charles+0.045
Neg logit contributions
 Lily-0.008
,-0.007
 L-0.006
 very-0.005
 v-0.005
Token He
Token ID679
Feature activation-5.78e-01

Pos logit contributions
opes+0.011
lor+0.011
uding+0.011
wen+0.011
 Charles+0.011
Neg logit contributions
 Lily-0.004
 L-0.004
 v-0.003
 Sp-0.003
 p-0.003
Token even
Token ID772
Feature activation+6.36e-01

Pos logit contributions
lor+0.025
wen+0.024
opes+0.024
uding+0.024
eway+0.023
Neg logit contributions
 Lily-0.009
 L-0.008
 very-0.007
,-0.007
 p-0.007
Token saw
Token ID2497
Feature activation-7.60e-01

Pos logit contributions
 Lily+0.016
 L+0.014
 v+0.013
 p+0.012
 Sp+0.012
Neg logit contributions
opes-0.021
lor-0.020
wen-0.020
uding-0.020
 Charles-0.019
Token a
Token ID257
Feature activation-1.15e+00

Pos logit contributions
uding+0.031
wen+0.031
lor+0.031
opes+0.031
eway+0.029
Neg logit contributions
 Lily-0.014
 L-0.012
 v-0.010
 very-0.010
 p-0.009
Token mer
Token ID4017
Feature activation-4.18e+00

Pos logit contributions
opes+0.043
lor+0.042
uding+0.042
wen+0.042
 Charles+0.040
Neg logit contributions
 Lily-0.023
 L-0.020
 v-0.018
 p-0.017
 very-0.017
Tokenmaid
Token ID23151
Feature activation-1.10e+00

Pos logit contributions
wen+0.113
opes+0.106
lor+0.102
Charlie+0.102
Frank+0.099
Neg logit contributions
 Lily-0.121
 L-0.116
 v-0.113
 sad-0.111
,-0.111
Token!
Token ID0
Feature activation-4.12e-02

Pos logit contributions
wen+0.042
opes+0.042
uding+0.041
lor+0.041
 Charles+0.039
Neg logit contributions
 Lily-0.022
 L-0.019
,-0.018
 v-0.017
 p-0.016
Token
Token ID220
Feature activation+2.86e-01

Pos logit contributions
opes+0.002
lor+0.002
uding+0.002
wen+0.002
 Charles+0.002
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.000
 p-0.000
 Sp-0.000
Token
Token ID198
Feature activation-2.27e-01

Pos logit contributions
 Lily+0.003
 L+0.002
 Sp+0.001
,+0.001
L+0.001
Neg logit contributions
opes-0.014
uding-0.014
lor-0.014
wen-0.014
 Charles-0.013
Token
Token ID198
Feature activation-4.55e-02

Pos logit contributions
opes+0.010
uding+0.010
lor+0.010
wen+0.010
 Charles+0.010
Neg logit contributions
 Lily-0.003
 L-0.002
 v-0.002
L-0.002
 Sp-0.001
Token big
Token ID1263
Feature activation-6.31e-01

Pos logit contributions
lor+0.023
 Charles+0.023
opes+0.022
wen+0.022
uding+0.022
Neg logit contributions
 Lily-0.014
 L-0.013
 Sp-0.011
L-0.011
 Sara-0.010
Token book
Token ID1492
Feature activation-1.38e+00

Pos logit contributions
opes+0.018
uding+0.018
 Charles+0.018
lor+0.017
wen+0.017
Neg logit contributions
 Lily-0.015
 L-0.015
L-0.013
 Sp-0.013
 Sara-0.012
Token with
Token ID351
Feature activation-1.13e+00

Pos logit contributions
opes+0.054
uding+0.053
lor+0.053
wen+0.053
 Charles+0.052
Neg logit contributions
 Lily-0.026
 L-0.022
 v-0.019
,-0.018
 very-0.018
Token pictures
Token ID5986
Feature activation-7.88e-01

Pos logit contributions
opes+0.041
lor+0.040
uding+0.040
wen+0.040
 Charles+0.039
Neg logit contributions
 Lily-0.022
 L-0.020
 v-0.017
 Sp-0.016
L-0.016
Token of
Token ID286
Feature activation-7.57e-01

Pos logit contributions
opes+0.032
uding+0.032
lor+0.032
wen+0.032
 Charles+0.031
Neg logit contributions
 Lily-0.013
 L-0.011
 v-0.009
,-0.008
 p-0.008
Token volcan
Token ID17516
Feature activation-4.15e+00

Pos logit contributions
lor+0.028
opes+0.027
wen+0.027
uding+0.027
 Charles+0.026
Neg logit contributions
 Lily-0.016
 L-0.014
 v-0.012
 Sp-0.011
 p-0.011
Tokenoes
Token ID3028
Feature activation-1.92e-01

Pos logit contributions
lor+0.134
opes+0.133
uding+0.131
wen+0.131
 Charles+0.128
Neg logit contributions
 Lily-0.098
 L-0.090
 v-0.080
 very-0.076
,-0.076
Token.
Token ID13
Feature activation+8.06e-01

Pos logit contributions
opes+0.009
lor+0.009
wen+0.009
uding+0.009
 Charles+0.009
Neg logit contributions
 Lily-0.002
 L-0.001
 v-0.001
,-0.001
 p-0.001
Token Vol
Token ID4709
Feature activation-2.77e+00

Pos logit contributions
 Lily+0.013
 v+0.012
 L+0.012
 c+0.012
 Sp+0.012
Neg logit contributions
opes-0.027
uding-0.027
 Charles-0.027
lor-0.027
eway-0.026
Tokencan
Token ID5171
Feature activation-6.89e-01

Pos logit contributions
opes+0.068
lor+0.067
uding+0.066
wen+0.066
 Charles+0.063
Neg logit contributions
 Lily-0.088
 L-0.082
 v-0.076
 p-0.073
 Sp-0.073
Tokenoes
Token ID3028
Feature activation-9.88e-03

Pos logit contributions
opes+0.025
lor+0.025
uding+0.024
wen+0.024
 Charles+0.024
Neg logit contributions
 Lily-0.014
 L-0.013
 v-0.011
,-0.010
 Sp-0.010
Token eye
Token ID4151
Feature activation-4.20e-01

Pos logit contributions
lor+0.012
uding+0.012
wen+0.012
opes+0.011
 Charles+0.011
Neg logit contributions
 Lily-0.011
 L-0.010
 v-0.009
 p-0.009
,-0.008
Token patches
Token ID16082
Feature activation-1.13e+00

Pos logit contributions
uding+0.015
wen+0.014
lor+0.014
opes+0.014
 Charles+0.013
Neg logit contributions
 Lily-0.010
 L-0.009
 v-0.009
,-0.009
 p-0.008
Token and
Token ID290
Feature activation-5.96e-01

Pos logit contributions
opes+0.049
uding+0.049
lor+0.049
wen+0.048
 Charles+0.047
Neg logit contributions
 Lily-0.015
 L-0.013
 v-0.010
,-0.010
 very-0.009
Token said
Token ID531
Feature activation+9.28e-02

Pos logit contributions
lor+0.020
wen+0.020
uding+0.020
opes+0.020
uted+0.019
Neg logit contributions
 Lily-0.015
 L-0.013
 v-0.012
 p-0.011
,-0.011
Token "
Token ID366
Feature activation+6.59e-01

Pos logit contributions
 Lily+0.001
 L+0.001
L+0.001
 v+0.001
 Sp+0.001
Neg logit contributions
opes-0.004
lor-0.004
 Charles-0.004
uding-0.004
wen-0.004
TokenAr
Token ID3163
Feature activation-4.13e+00

Pos logit contributions
 Lily+0.017
 L+0.016
L+0.016
 v+0.014
 Sp+0.013
Neg logit contributions
opes-0.021
wen-0.020
uding-0.020
lor-0.019
 Charles-0.019
Tokenrr
Token ID21062
Feature activation-1.29e+00

Pos logit contributions
opes+0.108
wen+0.103
uding+0.103
lor+0.102
 Charles+0.100
Neg logit contributions
 Lily-0.115
 L-0.110
,-0.106
L-0.100
 v-0.100
Token!"
Token ID2474
Feature activation+1.37e+00

Pos logit contributions
wen+0.043
opes+0.041
uding+0.041
lor+0.041
 witnessed+0.040
Neg logit contributions
 Lily-0.030
 L-0.027
,-0.027
 v-0.024
 Sp-0.023
Token a
Token ID257
Feature activation+1.56e-02

Pos logit contributions
 Lily+0.019
 L+0.016
 v+0.015
 Sp+0.013
 dirty+0.013
Neg logit contributions
opes-0.055
uding-0.054
lor-0.054
wen-0.053
 Charles-0.052
Token lot
Token ID1256
Feature activation+4.93e-01

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
 Sp+0.000
L+0.000
Neg logit contributions
opes-0.001
lor-0.000
wen-0.000
uding-0.000
 Charles-0.000
Token.
Token ID13
Feature activation+5.50e-01

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.007
 Sp+0.007
L+0.007
Neg logit contributions
opes-0.018
lor-0.018
wen-0.018
uding-0.018
 Charles-0.017
Token he
Token ID339
Feature activation-4.25e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
L+0.003
 Sp+0.003
Neg logit contributions
opes-0.011
 Charles-0.011
wen-0.011
uding-0.011
lor-0.011
Token would
Token ID561
Feature activation-6.60e-01

Pos logit contributions
opes+0.017
lor+0.017
wen+0.017
uding+0.017
 Charles+0.016
Neg logit contributions
 Lily-0.007
 L-0.006
 v-0.005
 p-0.005
 very-0.005
Token find
Token ID1064
Feature activation-1.36e+00

Pos logit contributions
wen+0.027
opes+0.027
lor+0.027
uding+0.025
 Charles+0.025
Neg logit contributions
 Lily-0.011
 L-0.010
 p-0.009
,-0.009
 v-0.009
Token his
Token ID465
Feature activation-7.38e-01

Pos logit contributions
opes+0.057
wen+0.057
uding+0.056
lor+0.056
 Charles+0.053
Neg logit contributions
 Lily-0.025
 L-0.021
 v-0.017
,-0.017
 very-0.017
Token perfect
Token ID2818
Feature activation+7.39e-02

Pos logit contributions
wen+0.024
lor+0.024
opes+0.024
uding+0.023
 Charles+0.023
Neg logit contributions
 Lily-0.018
 L-0.017
 v-0.015
 very-0.015
 p-0.015
Token mate
Token ID16133
Feature activation-4.12e+00

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.001
,+0.001
 p+0.001
Neg logit contributions
wen-0.003
opes-0.003
lor-0.003
 Charles-0.002
uding-0.002
Token.
Token ID13
Feature activation+2.71e-01

Pos logit contributions
opes+0.208
wen+0.206
lor+0.204
heit+0.199
uding+0.198
Neg logit contributions
 Lily-0.032
 L-0.025
,-0.024
 very-0.017
.-0.016
Token\n
Token ID198
Feature activation-7.09e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
 dirty+0.003
 sad+0.003
Neg logit contributions
opes-0.011
 Charles-0.010
lor-0.010
uding-0.010
wen-0.010
TokenOne
Token ID3198
Feature activation+2.35e-01

Pos logit contributions
opes+0.031
lor+0.031
uding+0.031
wen+0.031
 Charles+0.030
Neg logit contributions
 Lily-0.009
 L-0.007
 v-0.005
 Sp-0.005
L-0.005
Token day
Token ID1110
Feature activation+3.76e-02

Pos logit contributions
 Lily+0.005
 L+0.005
 v+0.004
 Sp+0.004
L+0.004
Neg logit contributions
 Charles-0.007
eway-0.007
aded-0.007
wen-0.007
uding-0.007
Token,
Token ID11
Feature activation-4.76e-01

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
L+0.000
 dirty+0.000
Neg logit contributions
opes-0.002
 Charles-0.002
uding-0.002
wen-0.002
lor-0.002
Token hats
Token ID23910
Feature activation-3.10e-02

Pos logit contributions
lor+0.028
wen+0.028
opes+0.027
uding+0.027
 Charles+0.025
Neg logit contributions
 Lily-0.023
 L-0.021
 v-0.019
 p-0.018
 very-0.018
Token.
Token ID13
Feature activation-1.29e-01

Pos logit contributions
opes+0.001
lor+0.001
wen+0.001
uding+0.001
 Charles+0.001
Neg logit contributions
 Lily-0.000
 L-0.000
 v-0.000
 p-0.000
 Sp-0.000
Token They
Token ID1119
Feature activation-2.59e-01

Pos logit contributions
opes+0.005
uding+0.005
wen+0.005
 Charles+0.005
eway+0.005
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.002
 dirty-0.002
 p-0.002
Token say
Token ID910
Feature activation+4.41e-02

Pos logit contributions
lor+0.010
opes+0.010
wen+0.010
uding+0.010
 Charles+0.009
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.004
 p-0.004
,-0.003
Token "
Token ID366
Feature activation+7.86e-01

Pos logit contributions
 Lily+0.001
 L+0.001
L+0.000
 v+0.000
 p+0.000
Neg logit contributions
opes-0.002
lor-0.002
 Charles-0.002
uding-0.002
wen-0.002
TokenAr
Token ID3163
Feature activation-4.11e+00

Pos logit contributions
 Lily+0.023
 L+0.021
L+0.020
 v+0.020
 Sp+0.019
Neg logit contributions
opes-0.022
wen-0.021
lor-0.021
uding-0.021
 Charles-0.020
Tokenrr
Token ID21062
Feature activation-1.25e+00

Pos logit contributions
opes+0.103
lor+0.099
uding+0.098
wen+0.098
 Charles+0.096
Neg logit contributions
 Lily-0.119
 L-0.114
,-0.109
 v-0.105
L-0.104
Token"
Token ID1
Feature activation+1.52e-01

Pos logit contributions
wen+0.041
opes+0.041
uding+0.040
lor+0.040
 witnessed+0.039
Neg logit contributions
 Lily-0.028
,-0.026
 L-0.026
 v-0.023
 Sp-0.022
Token and
Token ID290
Feature activation-3.93e-01

Pos logit contributions
 Lily+0.002
 L+0.002
,+0.001
 v+0.001
L+0.001
Neg logit contributions
lor-0.007
opes-0.007
wen-0.006
uding-0.006
 Charles-0.006
Token "
Token ID366
Feature activation+7.93e-01

Pos logit contributions
opes+0.015
lor+0.014
wen+0.014
uding+0.014
 Charles+0.014
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.006
 p-0.006
 Sp-0.006
TokenAh
Token ID10910
Feature activation-6.01e-01

Pos logit contributions
 Lily+0.023
 L+0.021
 v+0.020
L+0.019
 Sp+0.019
Neg logit contributions
opes-0.022
wen-0.022
lor-0.021
uding-0.021
 Charles-0.021
Token big
Token ID1263
Feature activation-5.75e-01

Pos logit contributions
lor+0.021
 Charles+0.020
opes+0.020
wen+0.020
uding+0.020
Neg logit contributions
 Lily-0.013
 L-0.012
 Sp-0.010
L-0.010
 Sara-0.009
Token book
Token ID1492
Feature activation-1.34e+00

Pos logit contributions
opes+0.017
uding+0.016
 Charles+0.016
lor+0.016
wen+0.016
Neg logit contributions
 Lily-0.014
 L-0.013
L-0.012
 Sp-0.011
 Sara-0.011
Token with
Token ID351
Feature activation-9.98e-01

Pos logit contributions
opes+0.052
uding+0.051
lor+0.050
wen+0.050
 Charles+0.049
Neg logit contributions
 Lily-0.026
 L-0.022
 v-0.019
,-0.018
 very-0.018
Token pictures
Token ID5986
Feature activation-7.77e-01

Pos logit contributions
opes+0.037
lor+0.036
uding+0.036
wen+0.036
 Charles+0.035
Neg logit contributions
 Lily-0.019
 L-0.017
 v-0.015
 Sp-0.014
 p-0.013
Token of
Token ID286
Feature activation-6.91e-01

Pos logit contributions
opes+0.032
uding+0.032
lor+0.032
wen+0.032
 Charles+0.031
Neg logit contributions
 Lily-0.012
 L-0.010
 v-0.009
,-0.008
 p-0.008
Token volcan
Token ID17516
Feature activation-4.11e+00

Pos logit contributions
lor+0.025
wen+0.025
opes+0.025
uding+0.025
 Charles+0.024
Neg logit contributions
 Lily-0.015
 L-0.013
 v-0.011
 p-0.010
 very-0.010
Tokenoes
Token ID3028
Feature activation-1.37e-01

Pos logit contributions
lor+0.134
opes+0.132
uding+0.132
wen+0.131
 Charles+0.128
Neg logit contributions
 Lily-0.095
 L-0.087
 v-0.077
,-0.074
 very-0.073
Token.
Token ID13
Feature activation+7.81e-01

Pos logit contributions
opes+0.007
lor+0.007
wen+0.007
uding+0.006
 Charles+0.006
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.001
,-0.001
 very-0.000
Token Vol
Token ID4709
Feature activation-2.81e+00

Pos logit contributions
 Lily+0.012
 L+0.012
 v+0.012
 Sp+0.011
 c+0.011
Neg logit contributions
opes-0.027
uding-0.027
 Charles-0.027
lor-0.027
wen-0.026
Tokencan
Token ID5171
Feature activation-6.57e-01

Pos logit contributions
opes+0.070
lor+0.067
uding+0.067
wen+0.067
 Charles+0.064
Neg logit contributions
 Lily-0.089
 L-0.083
 v-0.076
 Sp-0.073
L-0.073
Tokenoes
Token ID3028
Feature activation+1.78e-01

Pos logit contributions
opes+0.025
lor+0.025
uding+0.024
wen+0.024
 Charles+0.024
Neg logit contributions
 Lily-0.013
 L-0.011
 v-0.010
,-0.009
 Sp-0.009
Token.
Token ID13
Feature activation+3.04e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.001
 Sp+0.001
L+0.001
Neg logit contributions
opes-0.006
uding-0.006
lor-0.006
wen-0.006
 Charles-0.006
Token She
Token ID1375
Feature activation+7.01e-01

Pos logit contributions
 Lily+0.005
 v+0.005
 L+0.004
 dirty+0.004
 Sp+0.004
Neg logit contributions
 Charles-0.011
opes-0.011
uding-0.011
lor-0.011
wen-0.010
Token had
Token ID550
Feature activation-8.86e-01

Pos logit contributions
 Lily+0.014
 L+0.012
 v+0.011
 Sp+0.010
L+0.010
Neg logit contributions
opes-0.025
wen-0.025
uding-0.025
lor-0.024
 Charles-0.024
Token a
Token ID257
Feature activation-4.25e-01

Pos logit contributions
opes+0.037
lor+0.037
uding+0.037
wen+0.037
 Charles+0.035
Neg logit contributions
 Lily-0.014
 L-0.012
 v-0.010
 very-0.010
 p-0.009
Token big
Token ID1263
Feature activation-1.36e+00

Pos logit contributions
opes+0.015
uding+0.015
lor+0.015
wen+0.015
 Charles+0.015
Neg logit contributions
 Lily-0.008
 L-0.007
 Sp-0.006
 v-0.006
L-0.006
Token magn
Token ID7842
Feature activation-4.10e+00

Pos logit contributions
opes+0.047
uding+0.045
wen+0.044
lor+0.044
 Charles+0.044
Neg logit contributions
 Lily-0.029
 L-0.026
L-0.023
 Sp-0.022
 v-0.022
Tokenifying
Token ID4035
Feature activation-4.49e-01

Pos logit contributions
opes+0.092
wen+0.092
uding+0.091
 Charles+0.087
lor+0.086
Neg logit contributions
 Lily-0.129
 L-0.125
,-0.117
 v-0.115
 very-0.112
Token glass
Token ID5405
Feature activation+9.68e-01

Pos logit contributions
lor+0.017
 Charles+0.016
uding+0.016
opes+0.015
eway+0.015
Neg logit contributions
 Lily-0.007
 L-0.006
L-0.005
 Sp-0.004
 sad-0.004
Token that
Token ID326
Feature activation-1.26e-01

Pos logit contributions
 Lily+0.015
 L+0.013
 Sp+0.011
 v+0.011
L+0.011
Neg logit contributions
opes-0.038
lor-0.038
uding-0.037
wen-0.037
 Charles-0.036
Token made
Token ID925
Feature activation-2.68e-01

Pos logit contributions
opes+0.005
lor+0.005
uding+0.005
wen+0.005
 Charles+0.005
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.001
 Sp-0.001
 p-0.001
Token things
Token ID1243
Feature activation-1.12e+00

Pos logit contributions
lor+0.011
opes+0.011
uding+0.011
wen+0.011
 Charles+0.011
Neg logit contributions
 Lily-0.004
 L-0.004
 v-0.003
,-0.003
 very-0.003
Token.
Token ID13
Feature activation+8.77e-03

Pos logit contributions
 Lily+0.023
 L+0.019
,+0.018
 very+0.015
 v+0.015
Neg logit contributions
opes-0.055
wen-0.053
uding-0.052
lor-0.052
eway-0.050
Token You
Token ID921
Feature activation+7.50e-02

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
 p+0.000
 Sp+0.000
Neg logit contributions
opes-0.000
lor-0.000
uding-0.000
wen-0.000
 Charles-0.000
Token worked
Token ID3111
Feature activation+1.32e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 p+0.001
,+0.001
 very+0.001
Neg logit contributions
opes-0.003
uding-0.003
lor-0.003
wen-0.003
eway-0.003
Token with
Token ID351
Feature activation-4.26e-02

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.001
 very+0.001
 p+0.001
Neg logit contributions
opes-0.006
lor-0.006
uding-0.006
wen-0.006
 Charles-0.005
Token such
Token ID884
Feature activation-5.98e-01

Pos logit contributions
opes+0.002
lor+0.002
uding+0.002
wen+0.002
eway+0.002
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.001
 p-0.001
 very-0.001
Token delic
Token ID8675
Feature activation-4.08e+00

Pos logit contributions
opes+0.024
lor+0.024
wen+0.024
uding+0.024
 Charles+0.023
Neg logit contributions
 Lily-0.010
 L-0.009
 v-0.007
 very-0.007
 p-0.007
Tokenacy
Token ID1590
Feature activation-5.24e-01

Pos logit contributions
lor+0.072
opes+0.070
 Charles+0.068
uding+0.067
eway+0.066
Neg logit contributions
 L-0.145
 Lily-0.145
,-0.134
 v-0.130
 Sp-0.130
Token.
Token ID13
Feature activation-1.14e-02

Pos logit contributions
opes+0.020
uding+0.020
wen+0.020
lor+0.019
 Charles+0.019
Neg logit contributions
 Lily-0.009
,-0.009
 very-0.009
 v-0.008
 L-0.008
Token You
Token ID921
Feature activation-1.40e-02

Pos logit contributions
opes+0.000
lor+0.000
uding+0.000
wen+0.000
 Charles+0.000
Neg logit contributions
 Lily-0.000
 L-0.000
 v-0.000
 p-0.000
 sad-0.000
Token have
Token ID423
Feature activation-5.99e-01

Pos logit contributions
opes+0.001
uding+0.001
lor+0.001
wen+0.001
eway+0.001
Neg logit contributions
 Lily-0.000
 L-0.000
 p-0.000
,-0.000
 very-0.000
Token a
Token ID257
Feature activation-5.25e-01

Pos logit contributions
opes+0.022
wen+0.022
lor+0.022
uding+0.022
eway+0.021
Neg logit contributions
 Lily-0.013
 L-0.012
 very-0.010
 v-0.010
,-0.010
Token\n
Token ID198
Feature activation+3.06e-01

Pos logit contributions
lor+0.001
wen+0.001
opes+0.001
eway+0.001
 Charles+0.001
Neg logit contributions
 Lily-0.000
 L-0.000
 v-0.000
 sad-0.000
 dirty-0.000
TokenThe
Token ID464
Feature activation+4.39e-01

Pos logit contributions
 Lily+0.007
 L+0.006
 sad+0.006
 v+0.006
 Sp+0.006
Neg logit contributions
lor-0.010
wen-0.009
opes-0.009
eway-0.009
uding-0.008
Token new
Token ID649
Feature activation-7.54e-01

Pos logit contributions
 Lily+0.009
 L+0.009
 Sp+0.007
 Sara+0.007
L+0.007
Neg logit contributions
opes-0.015
uding-0.014
wen-0.014
lor-0.014
 Charles-0.014
Token light
Token ID1657
Feature activation-9.71e-01

Pos logit contributions
opes+0.026
uding+0.025
lor+0.025
wen+0.025
 Charles+0.024
Neg logit contributions
 Lily-0.017
 L-0.015
 v-0.013
 Sp-0.013
L-0.013
Tokenbul
Token ID15065
Feature activation-2.82e-01

Pos logit contributions
opes+0.035
uding+0.034
lor+0.034
wen+0.034
 Charles+0.033
Neg logit contributions
 Lily-0.021
 L-0.019
 v-0.018
,-0.016
 p-0.016
Tokenb
Token ID65
Feature activation-4.08e+00

Pos logit contributions
opes+0.009
uding+0.009
lor+0.009
wen+0.008
 Charles+0.008
Neg logit contributions
 Lily-0.007
 L-0.007
 v-0.006
 Sp-0.006
L-0.006
Token was
Token ID373
Feature activation+9.10e-01

Pos logit contributions
uding+0.103
lor+0.099
 Charles+0.099
opes+0.099
Frank+0.095
Neg logit contributions
 Lily-0.123
 L-0.117
,-0.112
 v-0.109
 very-0.108
Token much
Token ID881
Feature activation+1.64e+00

Pos logit contributions
 Lily+0.017
 L+0.014
 v+0.013
 very+0.013
,+0.012
Neg logit contributions
opes-0.037
lor-0.037
uding-0.036
wen-0.035
 Charles-0.035
Token bigger
Token ID5749
Feature activation-8.53e-01

Pos logit contributions
 Lily+0.031
 L+0.029
 Sp+0.024
L+0.024
 v+0.023
Neg logit contributions
uding-0.055
opes-0.055
wen-0.054
lor-0.054
 Charles-0.054
Token and
Token ID290
Feature activation+4.04e-01

Pos logit contributions
opes+0.030
lor+0.030
wen+0.030
uding+0.029
 Charles+0.029
Neg logit contributions
 Lily-0.019
 L-0.017
,-0.015
 v-0.015
 p-0.014
Token brighter
Token ID26083
Feature activation+1.45e-02

Pos logit contributions
 Lily+0.008
 L+0.007
L+0.006
 v+0.006
 sad+0.006
Neg logit contributions
wen-0.013
opes-0.013
 Charles-0.013
uding-0.013
lor-0.013
Token flag
Token ID6056
Feature activation-8.62e-01

Pos logit contributions
 Lily+0.009
 L+0.009
 v+0.008
 Sp+0.007
L+0.007
Neg logit contributions
opes-0.011
lor-0.010
uding-0.010
wen-0.010
 Charles-0.010
Token.
Token ID13
Feature activation+7.36e-01

Pos logit contributions
wen+0.037
opes+0.037
lor+0.036
uding+0.036
 Charles+0.035
Neg logit contributions
 Lily-0.013
 L-0.011
 v-0.010
,-0.009
 very-0.009
Token They
Token ID1119
Feature activation-7.74e-02

Pos logit contributions
 Lily+0.010
 L+0.009
 v+0.009
 sad+0.007
 dirty+0.007
Neg logit contributions
opes-0.029
 Charles-0.029
lor-0.029
uding-0.029
wen-0.028
Token say
Token ID910
Feature activation+1.14e-01

Pos logit contributions
opes+0.003
lor+0.003
wen+0.003
uding+0.003
 Charles+0.003
Neg logit contributions
 Lily-0.002
 L-0.001
 v-0.001
 p-0.001
 Sp-0.001
Token "
Token ID366
Feature activation+8.40e-01

Pos logit contributions
 Lily+0.001
 L+0.001
L+0.001
 v+0.001
 dirty+0.001
Neg logit contributions
opes-0.005
lor-0.005
 Charles-0.005
wen-0.005
uding-0.005
TokenAr
Token ID3163
Feature activation-4.03e+00

Pos logit contributions
 Lily+0.024
 L+0.022
L+0.021
 v+0.021
 Sp+0.020
Neg logit contributions
opes-0.023
wen-0.023
lor-0.022
uding-0.022
 Charles-0.022
Tokenrr
Token ID21062
Feature activation-1.14e+00

Pos logit contributions
opes+0.098
wen+0.093
lor+0.093
uding+0.093
 Charles+0.090
Neg logit contributions
 Lily-0.127
 L-0.119
 v-0.111
,-0.110
L-0.109
Token!"
Token ID2474
Feature activation+1.57e+00

Pos logit contributions
wen+0.044
opes+0.043
uding+0.043
lor+0.042
 Charles+0.041
Neg logit contributions
 Lily-0.021
 L-0.018
,-0.017
 v-0.016
 Sp-0.015
Token and
Token ID290
Feature activation-2.49e-01

Pos logit contributions
 Lily+0.021
 L+0.018
 v+0.016
 Sp+0.014
 dirty+0.014
Neg logit contributions
opes-0.065
uding-0.063
lor-0.063
 Charles-0.063
wen-0.062
Token "
Token ID366
Feature activation+3.82e-01

Pos logit contributions
lor+0.010
opes+0.010
wen+0.010
uding+0.010
 Charles+0.009
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.003
 p-0.003
 Sp-0.003
TokenAh
Token ID10910
Feature activation-4.69e-01

Pos logit contributions
 Lily+0.011
 L+0.010
 v+0.009
L+0.009
 Sp+0.009
Neg logit contributions
opes-0.011
lor-0.011
wen-0.011
uding-0.010
 Charles-0.010
Token In
Token ID554
Feature activation-1.40e-01

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
 dirty+0.000
 sad+0.000
Neg logit contributions
opes-0.000
uding-0.000
lor-0.000
 Charles-0.000
wen-0.000
Token it
Token ID340
Feature activation+1.39e+00

Pos logit contributions
opes+0.006
lor+0.006
wen+0.006
uding+0.006
 Charles+0.006
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.001
L-0.001
 Sp-0.001
Token she
Token ID673
Feature activation-3.71e-01

Pos logit contributions
 Lily+0.012
 L+0.008
 v+0.005
 Sp+0.004
 p+0.004
Neg logit contributions
opes-0.068
lor-0.067
uding-0.066
wen-0.066
 Charles-0.065
Token had
Token ID550
Feature activation-1.33e+00

Pos logit contributions
opes+0.015
uding+0.014
wen+0.014
lor+0.014
 Charles+0.014
Neg logit contributions
 Lily-0.007
 L-0.006
 v-0.005
 p-0.005
,-0.005
Token a
Token ID257
Feature activation-6.56e-01

Pos logit contributions
uding+0.057
opes+0.057
wen+0.054
eway+0.054
lor+0.053
Neg logit contributions
 Lily-0.024
 L-0.020
 v-0.018
 very-0.017
 p-0.017
Token magn
Token ID7842
Feature activation-4.02e+00

Pos logit contributions
opes+0.024
lor+0.024
uding+0.023
wen+0.023
 Charles+0.023
Neg logit contributions
 Lily-0.013
 L-0.012
 v-0.010
 Sp-0.009
L-0.009
Tokenifying
Token ID4035
Feature activation-8.14e-01

Pos logit contributions
opes+0.102
uding+0.099
wen+0.097
oving+0.094
 Charles+0.092
Neg logit contributions
 Lily-0.115
 L-0.110
,-0.109
 very-0.105
 v-0.105
Token glass
Token ID5405
Feature activation+1.10e+00

Pos logit contributions
lor+0.033
 Charles+0.031
uding+0.031
opes+0.031
wen+0.030
Neg logit contributions
 Lily-0.011
 L-0.010
L-0.008
 v-0.007
 Sp-0.007
Token,
Token ID11
Feature activation-8.88e-01

Pos logit contributions
 Lily+0.018
 L+0.015
 v+0.013
 Sp+0.011
 p+0.011
Neg logit contributions
opes-0.044
lor-0.044
uding-0.043
wen-0.043
 Charles-0.042
Token a
Token ID257
Feature activation-6.79e-02

Pos logit contributions
lor+0.040
uding+0.039
wen+0.039
opes+0.039
 Charles+0.037
Neg logit contributions
 Lily-0.014
 L-0.011
 very-0.009
 v-0.009
,-0.009
Token tiny
Token ID7009
Feature activation-6.94e-02

Pos logit contributions
opes+0.002
lor+0.002
wen+0.002
 Charles+0.002
uding+0.002
Neg logit contributions
 Lily-0.001
 L-0.001
 Sp-0.001
L-0.001
 v-0.001
Token next
Token ID1306
Feature activation-4.17e-01

Pos logit contributions
opes+0.075
lor+0.073
uding+0.072
wen+0.072
 Charles+0.070
Neg logit contributions
 Lily-0.028
 L-0.022
 v-0.022
,-0.021
 very-0.021
Token to
Token ID284
Feature activation+9.75e-02

Pos logit contributions
opes+0.019
wen+0.018
lor+0.018
uding+0.018
 Charles+0.018
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.004
,-0.004
 very-0.004
Token the
Token ID262
Feature activation+3.88e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
 sad+0.001
 p+0.001
Neg logit contributions
opes-0.004
uding-0.004
wen-0.004
lor-0.004
 Charles-0.004
Token light
Token ID1657
Feature activation-2.34e-01

Pos logit contributions
 Lily+0.009
 L+0.009
 Sp+0.007
L+0.007
 v+0.007
Neg logit contributions
opes-0.012
uding-0.012
wen-0.012
lor-0.012
 Charles-0.011
Tokenbul
Token ID15065
Feature activation-3.79e-01

Pos logit contributions
opes+0.009
lor+0.009
uding+0.009
wen+0.009
 Charles+0.008
Neg logit contributions
 Lily-0.004
 L-0.004
 v-0.003
 Sp-0.003
 p-0.003
Tokenb
Token ID65
Feature activation-4.01e+00

Pos logit contributions
opes+0.012
uding+0.012
lor+0.011
wen+0.011
 Charles+0.011
Neg logit contributions
 Lily-0.010
 L-0.009
 v-0.008
 Sp-0.008
L-0.008
Token,
Token ID11
Feature activation-2.25e-01

Pos logit contributions
uding+0.110
opes+0.107
 Charles+0.106
lor+0.105
Frank+0.103
Neg logit contributions
 Lily-0.112
 L-0.107
,-0.102
 v-0.101
 very-0.099
Token then
Token ID788
Feature activation+2.39e-01

Pos logit contributions
lor+0.010
opes+0.009
wen+0.009
uding+0.009
eway+0.009
Neg logit contributions
 Lily-0.004
 L-0.003
 v-0.003
 p-0.003
 very-0.003
Token climbed
Token ID19952
Feature activation-9.73e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
 p+0.003
,+0.003
Neg logit contributions
opes-0.010
lor-0.010
uding-0.009
wen-0.009
 Charles-0.009
Token up
Token ID510
Feature activation-2.21e-01

Pos logit contributions
opes+0.030
wen+0.029
lor+0.029
uding+0.029
 Charles+0.029
Neg logit contributions
 Lily-0.024
 L-0.022
 v-0.019
L-0.019
 sad-0.019
Token the
Token ID262
Feature activation-2.59e-02

Pos logit contributions
opes+0.009
wen+0.009
lor+0.009
 Charles+0.009
uding+0.008
Neg logit contributions
 Lily-0.003
 L-0.003
L-0.002
 sad-0.002
 v-0.002
Token "
Token ID366
Feature activation-3.56e-02

Pos logit contributions
lor+0.022
opes+0.022
uding+0.022
 Charles+0.021
wen+0.021
Neg logit contributions
 Lily-0.009
 L-0.008
 v-0.007
 p-0.007
 dirty-0.007
TokenI
Token ID40
Feature activation+2.88e-01

Pos logit contributions
opes+0.001
wen+0.001
uding+0.001
lor+0.001
 Charles+0.001
Neg logit contributions
 Lily-0.001
 L-0.001
L-0.001
 v-0.001
 Sp-0.001
Token'm
Token ID1101
Feature activation-7.14e-02

Pos logit contributions
 Lily+0.005
 L+0.004
 v+0.004
 p+0.004
,+0.003
Neg logit contributions
opes-0.012
lor-0.011
wen-0.011
uding-0.011
 Charles-0.011
Token a
Token ID257
Feature activation+9.08e-03

Pos logit contributions
opes+0.002
lor+0.002
wen+0.002
uding+0.002
eway+0.002
Neg logit contributions
 Lily-0.002
 L-0.002
 very-0.001
 sad-0.001
 v-0.001
Token secret
Token ID3200
Feature activation-1.38e+00

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
 Sp+0.000
L+0.000
Neg logit contributions
opes-0.000
lor-0.000
uding-0.000
wen-0.000
 Charles-0.000
Token mer
Token ID4017
Feature activation-3.99e+00

Pos logit contributions
opes+0.054
wen+0.052
uding+0.049
lor+0.049
uted+0.048
Neg logit contributions
 Lily-0.030
,-0.027
 L-0.026
 v-0.025
 very-0.024
Tokenmaid
Token ID23151
Feature activation-2.75e-01

Pos logit contributions
wen+0.109
opes+0.106
Charlie+0.102
Frank+0.100
lor+0.098
Neg logit contributions
 Lily-0.113
 L-0.111
 v-0.101
,-0.100
 sad-0.099
Token.
Token ID13
Feature activation-4.38e-01

Pos logit contributions
opes+0.011
uding+0.011
wen+0.011
lor+0.010
eway+0.010
Neg logit contributions
 Lily-0.005
 L-0.005
,-0.004
 v-0.004
 very-0.004
Token Would
Token ID10928
Feature activation-9.03e-01

Pos logit contributions
opes+0.016
wen+0.015
uding+0.015
lor+0.015
lled+0.014
Neg logit contributions
 Lily-0.010
 L-0.009
 Sp-0.007
 v-0.007
L-0.007
Token you
Token ID345
Feature activation-6.14e-01

Pos logit contributions
uding+0.037
opes+0.037
wen+0.036
lor+0.035
aced+0.034
Neg logit contributions
 Lily-0.017
 L-0.014
 v-0.012
 Sp-0.011
,-0.011
Token like
Token ID588
Feature activation+2.78e-02

Pos logit contributions
opes+0.028
wen+0.027
lor+0.027
uding+0.027
 Charles+0.026
Neg logit contributions
 Lily-0.008
 L-0.007
,-0.005
 very-0.005
 v-0.005
Token gave
Token ID2921
Feature activation-1.21e+00

Pos logit contributions
opes+0.012
uding+0.012
wen+0.012
lor+0.012
eway+0.011
Neg logit contributions
 Lily-0.005
 L-0.004
,-0.004
 p-0.004
 v-0.004
Token Lucas
Token ID15257
Feature activation-1.03e+00

Pos logit contributions
opes+0.031
uding+0.029
lor+0.029
wen+0.028
oving+0.026
Neg logit contributions
 Lily-0.039
 L-0.037
,-0.034
 very-0.034
 Sp-0.033
Token a
Token ID257
Feature activation-9.14e-01

Pos logit contributions
opes+0.046
uding+0.045
wen+0.044
lor+0.043
eway+0.042
Neg logit contributions
 Lily-0.015
 L-0.012
 very-0.011
,-0.011
 v-0.010
Token small
Token ID1402
Feature activation-2.51e-01

Pos logit contributions
opes+0.041
uding+0.039
lor+0.038
wen+0.037
uted+0.037
Neg logit contributions
 Lily-0.014
 very-0.012
 L-0.012
 v-0.011
 sad-0.011
Token pe
Token ID613
Feature activation-2.20e+00

Pos logit contributions
opes+0.009
wen+0.009
lor+0.009
uding+0.008
 Charles+0.008
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.005
,-0.004
 p-0.004
Tokenb
Token ID65
Feature activation-3.99e+00

Pos logit contributions
opes+0.077
uding+0.075
Charlie+0.071
 Charles+0.071
uted+0.070
Neg logit contributions
 very-0.041
 v-0.040
 sad-0.040
,-0.039
 L-0.038
Tokenble
Token ID903
Feature activation-3.64e-01

Pos logit contributions
opes+0.127
uding+0.120
Charlie+0.114
 sites+0.114
aman+0.113
Neg logit contributions
,-0.099
 very-0.092
 L-0.086
.-0.085
 Lily-0.083
Token.
Token ID13
Feature activation-1.44e+00

Pos logit contributions
opes+0.017
wen+0.017
uding+0.016
lor+0.016
 Charles+0.016
Neg logit contributions
 Lily-0.004
 L-0.003
,-0.003
 v-0.003
 very-0.003
Token Lucas
Token ID15257
Feature activation-6.82e-01

Pos logit contributions
opes+0.050
wen+0.049
lor+0.048
uding+0.048
aded+0.046
Neg logit contributions
 Lily-0.032
 L-0.030
 Sp-0.026
,-0.025
 very-0.024
Token had
Token ID550
Feature activation-1.28e+00

Pos logit contributions
opes+0.030
uding+0.030
wen+0.030
lor+0.030
eway+0.029
Neg logit contributions
 Lily-0.008
 L-0.007
,-0.007
 very-0.006
 p-0.006
Token no
Token ID645
Feature activation-7.69e-01

Pos logit contributions
opes+0.053
uding+0.050
lor+0.049
wen+0.049
eway+0.048
Neg logit contributions
 Lily-0.024
 very-0.022
 L-0.021
 the-0.021
 too-0.019
Token mother
Token ID2802
Feature activation-2.47e-02

Pos logit contributions
lor+0.043
opes+0.041
wen+0.041
uding+0.040
eway+0.040
Neg logit contributions
 Lily-0.025
 L-0.023
 very-0.022
,-0.022
 sad-0.021
Token holding
Token ID4769
Feature activation+2.27e-01

Pos logit contributions
opes+0.001
lor+0.001
wen+0.001
uding+0.001
 Charles+0.001
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.000
 p-0.000
,-0.000
Token a
Token ID257
Feature activation+1.88e-02

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
 Sp+0.003
L+0.003
Neg logit contributions
opes-0.009
lor-0.009
uding-0.008
wen-0.008
 Charles-0.008
Token freshly
Token ID29026
Feature activation+1.01e+00

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
 p+0.000
 sad+0.000
Neg logit contributions
opes-0.001
lor-0.001
uding-0.001
wen-0.001
 Charles-0.001
Token-
Token ID12
Feature activation+5.87e-01

Pos logit contributions
 Lily+0.021
 L+0.018
 v+0.017
 very+0.017
 p+0.017
Neg logit contributions
opes-0.037
uding-0.036
wen-0.035
 Charles-0.035
lor-0.034
Tokenb
Token ID65
Feature activation-3.99e+00

Pos logit contributions
 Lily+0.018
 L+0.017
 v+0.016
 sad+0.015
 Sp+0.015
Neg logit contributions
lor-0.014
opes-0.014
wen-0.014
uding-0.014
 Charles-0.013
Tokenaked
Token ID4335
Feature activation-1.32e+00

Pos logit contributions
opes+0.122
uding+0.120
 sites+0.119
 witnessed+0.118
 Charles+0.114
Neg logit contributions
,-0.091
 Lily-0.086
 L-0.086
 very-0.086
.-0.083
Token muff
Token ID27563
Feature activation+5.06e-02

Pos logit contributions
opes+0.048
lor+0.048
wen+0.047
uding+0.046
 Charles+0.045
Neg logit contributions
 Lily-0.028
 L-0.025
 v-0.024
,-0.022
 p-0.022
Tokenin
Token ID259
Feature activation+5.40e-02

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
 sad+0.001
 Sp+0.001
Neg logit contributions
lor-0.001
wen-0.001
opes-0.001
uding-0.001
ates-0.001
Token.
Token ID13
Feature activation-4.43e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
 Sp+0.001
L+0.001
Neg logit contributions
opes-0.002
lor-0.002
uding-0.002
wen-0.002
 Charles-0.002
Token
Token ID220
Feature activation+4.27e-01

Pos logit contributions
opes+0.016
lor+0.016
uding+0.016
wen+0.016
 Charles+0.016
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.006
 dirty-0.006
L-0.006
Token the
Token ID262
Feature activation-1.12e+00

Pos logit contributions
 Lily+0.007
 L+0.006
 v+0.006
 very+0.006
 sad+0.005
Neg logit contributions
opes-0.014
lor-0.014
wen-0.014
uding-0.014
 Charles-0.013
Token world
Token ID995
Feature activation-4.14e-01

Pos logit contributions
opes+0.040
wen+0.039
lor+0.038
uding+0.037
uted+0.036
Neg logit contributions
 Lily-0.026
 L-0.023
 very-0.022
 v-0.021
 p-0.021
Token's
Token ID338
Feature activation-1.46e-01

Pos logit contributions
opes+0.018
uding+0.017
wen+0.017
lor+0.017
 Charles+0.016
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.005
,-0.005
 very-0.005
Token best
Token ID1266
Feature activation-1.36e+00

Pos logit contributions
lor+0.005
opes+0.005
wen+0.005
uding+0.005
 Charles+0.005
Neg logit contributions
 Lily-0.003
 L-0.003
 very-0.003
,-0.003
 v-0.003
Token-
Token ID12
Feature activation-5.15e-01

Pos logit contributions
opes+0.068
wen+0.066
lor+0.065
uted+0.065
Andrew+0.064
Neg logit contributions
 Lily-0.013
,-0.012
.-0.011
 L-0.011
 very-0.009
Tokenb
Token ID65
Feature activation-3.99e+00

Pos logit contributions
opes+0.016
uding+0.015
wen+0.014
lor+0.014
 Charles+0.014
Neg logit contributions
 Lily-0.014
 L-0.013
 v-0.013
L-0.012
 very-0.012
Tokenouncing
Token ID18155
Feature activation-1.56e+00

Pos logit contributions
aman+0.124
opes+0.120
uding+0.119
heit+0.117
 avail+0.115
Neg logit contributions
 Lily-0.090
,-0.089
 very-0.087
 L-0.086
.-0.084
Token frog
Token ID21264
Feature activation-1.07e+00

Pos logit contributions
opes+0.068
lor+0.068
wen+0.068
uding+0.067
Andrew+0.064
Neg logit contributions
 Lily-0.023
,-0.019
 very-0.017
 L-0.017
 v-0.016
Token.
Token ID13
Feature activation-1.34e-01

Pos logit contributions
opes+0.049
wen+0.049
lor+0.047
uding+0.047
aman+0.046
Neg logit contributions
 Lily-0.013
,-0.011
 very-0.010
 L-0.010
.-0.009
Token And
Token ID843
Feature activation-1.28e-01

Pos logit contributions
opes+0.005
wen+0.005
lor+0.005
uding+0.005
 Charles+0.005
Neg logit contributions
 Lily-0.002
 L-0.002
 Sp-0.002
 v-0.002
L-0.001
Token every
Token ID790
Feature activation+1.87e-02

Pos logit contributions
opes+0.005
wen+0.005
lor+0.005
uding+0.005
 Charles+0.005
Neg logit contributions
 Lily-0.002
 L-0.002
,-0.001
 very-0.001
 v-0.001
Token and
Token ID290
Feature activation+2.02e-01

Pos logit contributions
 Lily+0.010
 L+0.009
 Sp+0.009
L+0.008
 v+0.007
Neg logit contributions
uding-0.016
lor-0.016
opes-0.016
 Charles-0.016
eway-0.016
Token put
Token ID1234
Feature activation+5.11e-01

Pos logit contributions
 Lily+0.005
 L+0.004
 v+0.004
L+0.004
 Sp+0.004
Neg logit contributions
opes-0.006
uding-0.006
wen-0.006
 Charles-0.006
lor-0.006
Token their
Token ID511
Feature activation-9.39e-02

Pos logit contributions
 Lily+0.009
 L+0.008
L+0.007
 Sp+0.007
 v+0.007
Neg logit contributions
opes-0.018
wen-0.017
uding-0.017
 Charles-0.017
lor-0.017
Token hands
Token ID2832
Feature activation+7.32e-01

Pos logit contributions
lor+0.003
opes+0.003
wen+0.003
uding+0.003
 Charles+0.003
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.002
 p-0.002
 dirty-0.002
Token in
Token ID287
Feature activation+6.92e-01

Pos logit contributions
 Lily+0.018
 L+0.017
 Sp+0.015
L+0.014
 v+0.014
Neg logit contributions
opes-0.022
uding-0.021
uted-0.020
 Charles-0.020
wen-0.020
Token the
Token ID262
Feature activation+1.63e+00

Pos logit contributions
 Lily+0.011
 Sp+0.009
 L+0.009
L+0.009
 v+0.009
Neg logit contributions
uding-0.023
 Charles-0.023
opes-0.023
eway-0.023
wen-0.023
Token spray
Token ID11662
Feature activation+1.62e+00

Pos logit contributions
 Lily+0.028
 L+0.025
 Sp+0.023
L+0.020
 v+0.018
Neg logit contributions
opes-0.060
lor-0.060
uding-0.058
aded-0.057
wen-0.057
Token.
Token ID13
Feature activation+2.88e-01

Pos logit contributions
 Lily+0.023
 L+0.019
 Sp+0.017
L+0.017
 sad+0.017
Neg logit contributions
opes-0.062
uding-0.061
lor-0.061
eway-0.059
wen-0.058
Token It
Token ID632
Feature activation+9.09e-01

Pos logit contributions
 v+0.005
 Lily+0.005
 gr+0.004
 sad+0.004
 dirty+0.004
Neg logit contributions
opes-0.009
uding-0.009
 Charles-0.009
lor-0.009
wen-0.009
Token was
Token ID373
Feature activation+5.67e-01

Pos logit contributions
 Lily+0.018
 L+0.016
 Sp+0.014
 v+0.014
,+0.013
Neg logit contributions
opes-0.027
wen-0.027
aded-0.027
uding-0.027
 witnessed-0.027
Token cold
Token ID4692
Feature activation+1.41e-01

Pos logit contributions
 Lily+0.009
 L+0.008
L+0.007
 v+0.006
 Sp+0.006
Neg logit contributions
opes-0.022
uding-0.021
wen-0.021
 Charles-0.021
eway-0.020
Token,
Token ID11
Feature activation+1.83e-01

Pos logit contributions
 Lily+0.007
 L+0.006
 v+0.005
 Sp+0.005
L+0.004
Neg logit contributions
opes-0.023
elcome-0.022
 Charles-0.022
uding-0.021
aded-0.021
Token he
Token ID339
Feature activation-3.38e-01

Pos logit contributions
 Lily+0.003
 L+0.003
 v+0.003
 sad+0.002
L+0.002
Neg logit contributions
opes-0.006
 Charles-0.006
uding-0.006
wen-0.006
eway-0.006
Token went
Token ID1816
Feature activation-2.14e-01

Pos logit contributions
opes+0.012
lor+0.012
uding+0.012
wen+0.011
 Charles+0.011
Neg logit contributions
 Lily-0.007
 L-0.007
 v-0.006
 Sp-0.005
 p-0.005
Token outside
Token ID2354
Feature activation+1.94e-01

Pos logit contributions
opes+0.007
uding+0.007
wen+0.007
lor+0.007
 Charles+0.007
Neg logit contributions
 Lily-0.004
 L-0.004
 v-0.003
L-0.003
 Sp-0.003
Token to
Token ID284
Feature activation+4.98e-01

Pos logit contributions
 Lily+0.003
 L+0.003
 v+0.002
 p+0.002
 Sp+0.002
Neg logit contributions
opes-0.008
lor-0.008
wen-0.008
uding-0.008
 Charles-0.008
Token play
Token ID711
Feature activation+1.51e+00

Pos logit contributions
 Lily+0.008
 L+0.007
 v+0.006
L+0.006
 sad+0.005
Neg logit contributions
opes-0.019
uding-0.019
wen-0.019
 Charles-0.019
lor-0.019
Token and
Token ID290
Feature activation-2.23e-01

Pos logit contributions
 L+0.027
 Lily+0.027
 v+0.022
 sad+0.021
 Sp+0.021
Neg logit contributions
opes-0.049
uding-0.049
eway-0.047
 armour-0.046
aded-0.046
Token it
Token ID340
Feature activation+1.36e+00

Pos logit contributions
opes+0.007
uding+0.007
 Charles+0.007
wen+0.007
eway+0.007
Neg logit contributions
 Lily-0.004
 L-0.004
 v-0.004
L-0.004
 Sp-0.003
Token started
Token ID2067
Feature activation+2.55e-01

Pos logit contributions
 Lily+0.025
 L+0.024
 Sp+0.020
 Sara+0.019
 v+0.019
Neg logit contributions
opes-0.045
uding-0.044
wen-0.044
lled-0.044
lor-0.043
Token to
Token ID284
Feature activation+4.78e-01

Pos logit contributions
 Lily+0.006
 L+0.006
L+0.005
 can+0.005
 Doll+0.005
Neg logit contributions
uding-0.007
opes-0.007
eway-0.006
aded-0.006
aced-0.006
Token rain
Token ID6290
Feature activation+1.94e-01

Pos logit contributions
 Lily+0.010
 L+0.009
 Sp+0.008
L+0.007
 Sara+0.007
Neg logit contributions
opes-0.015
uding-0.015
eway-0.014
wen-0.014
 Charles-0.014
Token said
Token ID531
Feature activation+3.67e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.003
 p+0.003
,+0.003
Neg logit contributions
opes-0.008
lor-0.008
wen-0.008
uding-0.008
 Charles-0.008
Token,
Token ID11
Feature activation-6.56e-01

Pos logit contributions
 Lily+0.004
 L+0.003
 v+0.002
L+0.002
 p+0.002
Neg logit contributions
lor-0.016
 Charles-0.016
opes-0.016
uding-0.016
elcome-0.016
Token "
Token ID366
Feature activation+1.04e+00

Pos logit contributions
opes+0.026
lor+0.026
uding+0.026
wen+0.026
 Charles+0.025
Neg logit contributions
 Lily-0.011
 L-0.009
 v-0.008
 Sp-0.007
L-0.007
TokenSure
Token ID19457
Feature activation-9.69e-01

Pos logit contributions
 Lily+0.021
 L+0.018
L+0.017
 v+0.016
 Sp+0.015
Neg logit contributions
opes-0.040
uding-0.038
wen-0.038
lor-0.037
 Charles-0.037
Token,
Token ID11
Feature activation-1.05e+00

Pos logit contributions
opes+0.045
lor+0.045
uding+0.045
wen+0.045
 Charles+0.044
Neg logit contributions
 Lily-0.010
 L-0.008
 v-0.005
 p-0.004
 Sp-0.004
Token but
Token ID475
Feature activation+1.39e+00

Pos logit contributions
opes+0.044
lor+0.043
wen+0.043
uding+0.043
ady+0.041
Neg logit contributions
 Lily-0.021
 L-0.017
 v-0.013
,-0.013
 p-0.013
Token be
Token ID307
Feature activation+1.40e+00

Pos logit contributions
 Lily+0.025
 L+0.022
 v+0.019
 Sp+0.017
L+0.017
Neg logit contributions
opes-0.053
uding-0.052
lor-0.052
wen-0.052
 Charles-0.051
Token careful
Token ID8161
Feature activation-1.32e+00

Pos logit contributions
 Lily+0.034
 L+0.031
 v+0.028
 Sp+0.027
 p+0.026
Neg logit contributions
opes-0.045
lor-0.044
uding-0.044
wen-0.043
 Charles-0.042
Token."
Token ID526
Feature activation+5.73e-01

Pos logit contributions
opes+0.056
lor+0.055
wen+0.055
uding+0.055
 Charles+0.053
Neg logit contributions
 Lily-0.019
 L-0.016
 v-0.013
,-0.012
 very-0.012
Token Tim
Token ID5045
Feature activation-6.47e-01

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.007
 sad+0.006
 dirty+0.006
Neg logit contributions
opes-0.022
lor-0.022
uding-0.022
wen-0.022
 Charles-0.021
Tokenmy
Token ID1820
Feature activation+4.88e-01

Pos logit contributions
opes+0.024
wen+0.023
lor+0.023
uding+0.023
 Charles+0.023
Neg logit contributions
 Lily-0.012
 L-0.011
 v-0.010
 Sp-0.009
 sad-0.008
Token you
Token ID345
Feature activation+1.28e-01

Pos logit contributions
opes+0.043
lor+0.042
wen+0.042
uding+0.042
ady+0.040
Neg logit contributions
 Lily-0.022
 L-0.018
 v-0.016
 p-0.015
,-0.015
Token are
Token ID389
Feature activation+2.92e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.001
L+0.001
 Sp+0.001
Neg logit contributions
opes-0.005
uding-0.005
lor-0.005
wen-0.005
 Charles-0.005
Token too
Token ID1165
Feature activation+4.09e-01

Pos logit contributions
 Lily+0.004
 L+0.004
L+0.003
 v+0.003
 Sp+0.003
Neg logit contributions
opes-0.011
uding-0.011
 Charles-0.011
wen-0.011
uted-0.010
Token small
Token ID1402
Feature activation-5.07e-01

Pos logit contributions
 Lily+0.008
 L+0.007
 Sp+0.007
 v+0.006
L+0.006
Neg logit contributions
opes-0.013
 Charles-0.013
uding-0.013
Frank-0.013
ady-0.013
Token.
Token ID13
Feature activation+3.91e-01

Pos logit contributions
opes+0.023
lor+0.023
wen+0.023
uding+0.022
 Charles+0.022
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.004
,-0.003
 p-0.003
Token Go
Token ID1514
Feature activation+1.38e+00

Pos logit contributions
 Lily+0.009
 L+0.009
 v+0.009
 dirty+0.009
 very+0.008
Neg logit contributions
opes-0.011
uding-0.011
 Charles-0.011
lor-0.011
wen-0.010
Token away
Token ID1497
Feature activation-4.72e-01

Pos logit contributions
 Lily+0.034
 L+0.031
 v+0.028
 Sp+0.027
 sad+0.027
Neg logit contributions
uding-0.042
opes-0.041
lor-0.041
wen-0.040
 Charles-0.040
Token."
Token ID526
Feature activation+4.91e-01

Pos logit contributions
opes+0.017
lor+0.016
uding+0.016
wen+0.016
 Charles+0.016
Neg logit contributions
 Lily-0.010
 L-0.009
 v-0.008
 Sp-0.007
 p-0.007
Token
Token ID198
Feature activation+6.12e-02

Pos logit contributions
 Lily+0.010
 L+0.009
 v+0.007
 Sp+0.007
 p+0.007
Neg logit contributions
opes-0.018
lor-0.018
wen-0.018
uding-0.018
 Charles-0.017
Token
Token ID198
Feature activation-7.98e-02

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
 p+0.000
 Sp+0.000
Neg logit contributions
opes-0.003
lor-0.003
wen-0.003
uding-0.003
 Charles-0.003
TokenTim
Token ID14967
Feature activation+4.97e-02

Pos logit contributions
opes+0.002
lor+0.002
wen+0.002
uding+0.002
 Charles+0.002
Neg logit contributions
 Lily-0.002
 L-0.002
L-0.002
 v-0.002
 p-0.002
Token pulls
Token ID16194
Feature activation+7.72e-01

Pos logit contributions
opes+0.009
lor+0.009
uding+0.008
wen+0.008
 Charles+0.008
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.006
 p-0.006
 Sp-0.006
Token the
Token ID262
Feature activation+7.34e-01

Pos logit contributions
 v+0.011
 Lily+0.011
 Sp+0.010
 L+0.010
 le+0.010
Neg logit contributions
 Charles-0.026
aced-0.025
uding-0.024
lled-0.024
uted-0.024
Token car
Token ID1097
Feature activation+5.17e-02

Pos logit contributions
 Lily+0.014
 L+0.013
 Sp+0.011
L+0.010
,+0.009
Neg logit contributions
opes-0.026
lor-0.025
uding-0.025
 reins-0.025
 Charles-0.025
Token back
Token ID736
Feature activation+1.68e+00

Pos logit contributions
 Lily+0.001
 L+0.001
 Sp+0.001
L+0.001
 P+0.001
Neg logit contributions
uding-0.001
aced-0.001
opes-0.001
 Charles-0.001
eway-0.001
Token from
Token ID422
Feature activation-4.09e-01

Pos logit contributions
 Lily+0.026
 L+0.022
 v+0.019
L+0.018
 Sp+0.018
Neg logit contributions
opes-0.065
uding-0.065
wen-0.063
 Charles-0.063
eway-0.063
Token Anna
Token ID11735
Feature activation+1.93e+00

Pos logit contributions
opes+0.017
uding+0.017
 Charles+0.017
wen+0.017
eway+0.016
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.004
 Sp-0.003
L-0.003
Token's
Token ID338
Feature activation+7.70e-01

Pos logit contributions
 Lily+0.028
 L+0.023
 v+0.019
 Sp+0.017
 p+0.016
Neg logit contributions
opes-0.081
lor-0.080
uding-0.080
wen-0.080
 Charles-0.078
Token hand
Token ID1021
Feature activation+1.49e+00

Pos logit contributions
 Lily+0.022
 L+0.020
L+0.018
 v+0.018
 Sp+0.018
Neg logit contributions
opes-0.021
uding-0.020
lor-0.020
wen-0.020
 Charles-0.019
Token.
Token ID13
Feature activation+8.31e-01

Pos logit contributions
 Lily+0.021
 L+0.018
 Sp+0.014
 v+0.014
L+0.013
Neg logit contributions
opes-0.061
uding-0.060
lor-0.060
wen-0.059
 Charles-0.058
Token
Token ID198
Feature activation+5.49e-01

Pos logit contributions
 Lily+0.012
 L+0.011
 v+0.011
 dirty+0.009
 p+0.009
Neg logit contributions
 Charles-0.031
opes-0.031
uding-0.031
lor-0.031
wen-0.030
Token
Token ID198
Feature activation+7.23e-01

Pos logit contributions
 Lily+0.009
 v+0.008
 L+0.008
 p+0.007
 dirty+0.007
Neg logit contributions
wen-0.020
lor-0.020
opes-0.019
 Charles-0.019
uding-0.019
Token She
Token ID1375
Feature activation+8.49e-01

Pos logit contributions
 L+0.001
 v+0.001
 Lily+0.001
 sad+0.001
 dirty+0.001
Neg logit contributions
opes-0.002
 Charles-0.002
uding-0.002
wen-0.002
eway-0.002
Token had
Token ID550
Feature activation-1.21e+00

Pos logit contributions
 L+0.021
 Lily+0.021
 v+0.018
 Sp+0.018
 Sara+0.018
Neg logit contributions
opes-0.022
wen-0.022
 witnessed-0.021
zed-0.020
 Charles-0.020
Token a
Token ID257
Feature activation-3.93e-01

Pos logit contributions
opes+0.046
lor+0.045
wen+0.045
uding+0.045
 Charles+0.044
Neg logit contributions
 Lily-0.021
 L-0.019
 v-0.016
 Sp-0.015
 p-0.014
Token favorite
Token ID4004
Feature activation+1.47e-01

Pos logit contributions
opes+0.013
wen+0.012
 Charles+0.012
uding+0.012
lor+0.012
Neg logit contributions
 Lily-0.008
 L-0.008
 Sp-0.007
L-0.006
 v-0.006
Token toy
Token ID13373
Feature activation-5.25e-01

Pos logit contributions
 Lily+0.003
 L+0.003
 v+0.002
 Sp+0.002
L+0.002
Neg logit contributions
opes-0.005
lor-0.005
uding-0.005
wen-0.005
 Charles-0.005
Token,
Token ID11
Feature activation-4.17e-01

Pos logit contributions
opes+0.019
lor+0.018
uding+0.018
wen+0.018
 Charles+0.018
Neg logit contributions
 Lily-0.009
 L-0.008
L-0.007
 sad-0.006
 Sp-0.006
Token a
Token ID257
Feature activation+6.78e-01

Pos logit contributions
opes+0.015
wen+0.015
 Charles+0.015
lor+0.015
uding+0.015
Neg logit contributions
 Lily-0.007
 L-0.007
 v-0.006
 Sp-0.005
 p-0.005
Token silver
Token ID8465
Feature activation+6.92e-02

Pos logit contributions
 L+0.012
 Lily+0.012
 Sp+0.010
 sometimes+0.010
L+0.010
Neg logit contributions
opes-0.021
elcome-0.021
 Charles-0.020
 armour-0.020
 reins-0.019
Token ball
Token ID2613
Feature activation+8.99e-01

Pos logit contributions
 Lily+0.002
 L+0.001
 Sp+0.001
L+0.001
 sad+0.001
Neg logit contributions
opes-0.002
lor-0.002
uding-0.002
wen-0.002
 Charles-0.002
Token.
Token ID13
Feature activation-8.07e-02

Pos logit contributions
 Lily+0.014
 L+0.014
 Sp+0.012
L+0.012
 gr+0.011
Neg logit contributions
opes-0.030
lor-0.029
uding-0.028
ates-0.027
 Charles-0.026
Token The
Token ID383
Feature activation+1.37e+00

Pos logit contributions
 Charles+0.003
opes+0.003
uding+0.003
lor+0.003
wen+0.002
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.001
 sad-0.001
 dirty-0.001
Token and
Token ID290
Feature activation-4.88e-01

Pos logit contributions
lor+0.057
wen+0.057
opes+0.056
uding+0.054
uted+0.054
Neg logit contributions
 Lily-0.013
,-0.013
 very-0.011
 L-0.011
 v-0.010
Token sea
Token ID5417
Feature activation-4.76e-01

Pos logit contributions
lor+0.018
wen+0.018
opes+0.018
uding+0.017
 Charles+0.017
Neg logit contributions
 Lily-0.010
 L-0.009
 v-0.008
 p-0.008
 very-0.008
Token foam
Token ID19828
Feature activation-3.25e-01

Pos logit contributions
wen+0.021
lor+0.021
opes+0.021
uding+0.021
uted+0.020
Neg logit contributions
 Lily-0.006
 L-0.005
,-0.005
 v-0.004
 very-0.004
Token.
Token ID13
Feature activation-3.63e-01

Pos logit contributions
wen+0.015
lor+0.015
uding+0.014
opes+0.014
 Charles+0.014
Neg logit contributions
 Lily-0.004
 L-0.003
,-0.003
 v-0.003
 p-0.003
Token But
Token ID887
Feature activation+4.53e-01

Pos logit contributions
opes+0.016
wen+0.015
lor+0.015
uding+0.015
 Charles+0.015
Neg logit contributions
 Lily-0.006
 L-0.005
 Sp-0.003
 v-0.003
L-0.003
Token nothing
Token ID2147
Feature activation-1.51e-01

Pos logit contributions
 Lily+0.008
 L+0.007
 v+0.006
 Sp+0.005
,+0.005
Neg logit contributions
opes-0.018
lor-0.018
wen-0.018
uding-0.018
 Charles-0.017
Token seemed
Token ID3947
Feature activation-5.03e-01

Pos logit contributions
opes+0.006
lor+0.006
uding+0.006
wen+0.005
 Charles+0.005
Neg logit contributions
 Lily-0.003
 L-0.003
 v-0.002
,-0.002
 very-0.002
Token quite
Token ID2407
Feature activation-2.48e-01

Pos logit contributions
opes+0.021
lor+0.021
wen+0.021
uding+0.021
 Charles+0.020
Neg logit contributions
 Lily-0.008
 L-0.007
 very-0.005
 v-0.005
 sad-0.005
Token right
Token ID826
Feature activation-5.11e-01

Pos logit contributions
opes+0.009
lor+0.009
wen+0.009
uding+0.009
 Charles+0.009
Neg logit contributions
 Lily-0.005
 L-0.004
 v-0.004
 sad-0.003
 very-0.003
Token.
Token ID13
Feature activation-3.40e-01

Pos logit contributions
opes+0.024
lor+0.024
wen+0.024
uding+0.023
 Charles+0.023
Neg logit contributions
 Lily-0.005
 L-0.004
,-0.003
 very-0.003
 v-0.002
Token Then
Token ID3244
Feature activation-1.76e-01

Pos logit contributions
opes+0.014
lor+0.014
wen+0.014
uding+0.014
 Charles+0.013
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.004
 Sp-0.004
L-0.004
Token "
Token ID366
Feature activation+9.55e-01

Pos logit contributions
 Lily+0.003
 L+0.003
 v+0.003
 p+0.002
 dirty+0.002
Neg logit contributions
opes-0.008
uding-0.008
lor-0.008
wen-0.007
 Charles-0.007
TokenBut
Token ID1537
Feature activation+6.17e-01

Pos logit contributions
 Lily+0.026
 L+0.024
 v+0.022
 Sp+0.021
 p+0.021
Neg logit contributions
opes-0.027
lor-0.027
uding-0.027
wen-0.026
 Charles-0.026
Token be
Token ID307
Feature activation+1.51e+00

Pos logit contributions
 Lily+0.011
 L+0.010
 v+0.009
L+0.008
 Sp+0.008
Neg logit contributions
opes-0.023
uding-0.023
wen-0.023
lor-0.023
 Charles-0.022
Token careful
Token ID8161
Feature activation-7.45e-01

Pos logit contributions
 Lily+0.039
 L+0.036
 Sp+0.032
 v+0.032
L+0.030
Neg logit contributions
opes-0.044
uding-0.043
wen-0.042
lor-0.041
ady-0.040
Token,
Token ID11
Feature activation-5.18e-01

Pos logit contributions
opes+0.031
uding+0.031
lor+0.031
wen+0.031
 Charles+0.030
Neg logit contributions
 Lily-0.011
 L-0.009
 v-0.007
 Sp-0.007
L-0.007
Token don
Token ID836
Feature activation-2.41e-01

Pos logit contributions
opes+0.020
lor+0.019
uding+0.019
wen+0.019
 Charles+0.019
Neg logit contributions
 Lily-0.010
 L-0.008
 v-0.007
 Sp-0.007
 p-0.007
Token't
Token ID470
Feature activation+9.42e-01

Pos logit contributions
opes+0.008
lor+0.007
wen+0.007
uding+0.007
 Charles+0.007
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.005
 Sp-0.005
L-0.005
Token spray
Token ID11662
Feature activation+1.09e+00

Pos logit contributions
 Lily+0.018
 L+0.016
 v+0.014
 Sp+0.013
L+0.013
Neg logit contributions
opes-0.034
uding-0.033
wen-0.033
lor-0.033
 Charles-0.032
Token the
Token ID262
Feature activation+8.98e-02

Pos logit contributions
 Lily+0.020
 L+0.018
L+0.016
 v+0.015
 sad+0.015
Neg logit contributions
opes-0.039
uding-0.038
lor-0.037
wen-0.036
eway-0.036
Token flowers
Token ID12734
Feature activation+3.62e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.002
 p+0.001
 dirty+0.001
Neg logit contributions
lor-0.003
wen-0.003
opes-0.003
uding-0.003
 Charles-0.003
Token or
Token ID393
Feature activation-7.67e-01

Pos logit contributions
 Lily+0.007
 L+0.006
 v+0.005
 p+0.005
 Sp+0.005
Neg logit contributions
opes-0.014
lor-0.013
wen-0.013
uding-0.013
 Charles-0.013
Token so
Token ID523
Feature activation+2.26e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 very+0.002
 sad+0.002
 v+0.001
Neg logit contributions
opes-0.006
lor-0.006
wen-0.006
uding-0.005
eway-0.005
Token happy
Token ID3772
Feature activation-6.14e-01

Pos logit contributions
 Lily+0.004
 L+0.003
 v+0.003
 Sp+0.003
 p+0.003
Neg logit contributions
opes-0.009
lor-0.009
uding-0.009
wen-0.009
 Charles-0.008
Token and
Token ID290
Feature activation-2.92e-01

Pos logit contributions
opes+0.031
wen+0.031
lor+0.031
 reins+0.030
heit+0.030
Neg logit contributions
 Lily-0.006
,-0.006
 L-0.004
 too-0.004
.-0.004
Token everyone
Token ID2506
Feature activation-6.71e-01

Pos logit contributions
opes+0.012
lor+0.012
uding+0.011
wen+0.011
eway+0.011
Neg logit contributions
 Lily-0.006
 L-0.005
 very-0.004
 sad-0.004
,-0.004
Token laughed
Token ID13818
Feature activation-3.27e-01

Pos logit contributions
opes+0.026
lor+0.025
uding+0.025
wen+0.025
 Charles+0.024
Neg logit contributions
 Lily-0.013
 L-0.011
 v-0.010
 p-0.009
 very-0.009
Token.
Token ID13
Feature activation-6.15e-01

Pos logit contributions
wen+0.015
lor+0.015
opes+0.015
uding+0.014
 Charles+0.014
Neg logit contributions
 Lily-0.005
 L-0.004
,-0.003
 very-0.003
 too-0.003
Token From
Token ID3574
Feature activation+1.27e+00

Pos logit contributions
wen+0.029
lor+0.029
opes+0.029
uding+0.028
 reins+0.027
Neg logit contributions
 Lily-0.008
 L-0.005
 Sp-0.004
,-0.003
 very-0.003
Token now
Token ID783
Feature activation-2.01e+00

Pos logit contributions
 Lily+0.019
 L+0.015
,+0.013
 very+0.012
 p+0.011
Neg logit contributions
lor-0.056
opes-0.056
uding-0.055
wen-0.054
eway-0.053
Token on
Token ID319
Feature activation-1.92e-01

Pos logit contributions
opes+0.083
uding+0.083
lor+0.081
wen+0.079
uted+0.079
Neg logit contributions
 Lily-0.031
 L-0.025
,-0.024
 very-0.023
 v-0.022
Token Jack
Token ID3619
Feature activation-3.41e-01

Pos logit contributions
lor+0.011
wen+0.011
opes+0.011
uding+0.011
 reins+0.010
Neg logit contributions
 Lily-0.001
,-0.000
 L+0.000
 very+0.001
 the+0.001
Token was
Token ID373
Feature activation+2.06e-01

Pos logit contributions
opes+0.014
uding+0.014
lor+0.014
wen+0.013
eway+0.013
Neg logit contributions
 Lily-0.006
 L-0.005
,-0.004
 very-0.004
 v-0.004
Token house
Token ID2156
Feature activation+6.31e-03

Pos logit contributions
opes+0.013
uding+0.013
wen+0.013
lor+0.012
 Charles+0.012
Neg logit contributions
 Lily-0.011
 L-0.010
 v-0.009
L-0.008
 Sp-0.008
Token.
Token ID13
Feature activation+2.16e-01

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
 Sp+0.000
L+0.000
Neg logit contributions
opes-0.000
uding-0.000
lor-0.000
wen-0.000
 Charles-0.000
Token It
Token ID632
Feature activation-1.78e-01

Pos logit contributions
 Lily+0.004
 L+0.004
 v+0.004
L+0.003
 Sp+0.003
Neg logit contributions
opes-0.007
uding-0.007
 Charles-0.007
lor-0.007
wen-0.007
Token was
Token ID373
Feature activation+1.30e-01

Pos logit contributions
opes+0.006
wen+0.006
 Charles+0.006
uding+0.006
eway+0.006
Neg logit contributions
 Lily-0.003
 L-0.003
 v-0.002
L-0.002
 sad-0.002
Token white
Token ID2330
Feature activation-2.55e-01

Pos logit contributions
 Lily+0.002
 L+0.002
L+0.002
 v+0.002
 Sp+0.002
Neg logit contributions
opes-0.005
uding-0.005
wen-0.005
 Charles-0.005
eway-0.005
Token and
Token ID290
Feature activation+9.92e-01

Pos logit contributions
opes+0.010
uding+0.009
lor+0.009
wen+0.009
 Charles+0.009
Neg logit contributions
 Lily-0.005
 L-0.004
L-0.003
 Sp-0.003
 sad-0.003
Token empty
Token ID6565
Feature activation-2.33e-01

Pos logit contributions
 Lily+0.017
 L+0.016
L+0.014
 v+0.013
 Sp+0.013
Neg logit contributions
opes-0.035
 Charles-0.034
wen-0.033
eway-0.032
 armour-0.031
Token.
Token ID13
Feature activation-1.33e-01

Pos logit contributions
opes+0.011
lor+0.010
uding+0.010
wen+0.010
 Charles+0.010
Neg logit contributions
 Lily-0.003
 L-0.002
 v-0.001
 Sp-0.001
L-0.001
Token Tommy
Token ID19919
Feature activation-5.05e-01

Pos logit contributions
opes+0.005
uding+0.005
lor+0.005
wen+0.005
 Charles+0.005
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.002
 Sp-0.002
L-0.002
Token had
Token ID550
Feature activation-1.40e+00

Pos logit contributions
opes+0.019
uding+0.018
lor+0.018
wen+0.018
 Charles+0.018
Neg logit contributions
 Lily-0.010
 L-0.009
 v-0.008
 Sp-0.007
L-0.007
Token an
Token ID281
Feature activation-2.31e+00

Pos logit contributions
lor+0.058
opes+0.057
uding+0.057
wen+0.056
imm+0.055
Neg logit contributions
 Lily-0.025
 L-0.022
 very-0.020
 v-0.019
 dirty-0.018
Token she
Token ID673
Feature activation+6.98e-01

Pos logit contributions
opes+0.020
uding+0.019
lor+0.019
wen+0.018
eway+0.018
Neg logit contributions
 Lily-0.005
 L-0.004
 very-0.003
,-0.003
 sad-0.003
Token jumped
Token ID11687
Feature activation+5.60e-01

Pos logit contributions
 Lily+0.015
 L+0.014
 v+0.012
 Sp+0.011
 p+0.011
Neg logit contributions
opes-0.024
lor-0.024
uding-0.024
wen-0.023
 Charles-0.023
Token each
Token ID1123
Feature activation+7.02e-01

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.006
L+0.006
 Sp+0.006
Neg logit contributions
opes-0.022
uding-0.021
lor-0.021
 Charles-0.021
wen-0.021
Token time
Token ID640
Feature activation+9.27e-01

Pos logit contributions
 Lily+0.013
 L+0.011
 v+0.010
,+0.010
 very+0.010
Neg logit contributions
lor-0.028
opes-0.027
wen-0.027
uding-0.027
 Charles-0.026
Token.
Token ID13
Feature activation-7.75e-01

Pos logit contributions
 Lily+0.009
 L+0.007
 v+0.005
 Sp+0.004
L+0.004
Neg logit contributions
opes-0.042
uding-0.042
lor-0.042
wen-0.042
 Charles-0.041
Token
Token ID198
Feature activation-7.45e-01

Pos logit contributions
opes+0.035
lor+0.034
wen+0.034
uding+0.034
icians+0.033
Neg logit contributions
 Lily-0.011
 L-0.009
,-0.006
 Sara-0.006
 very-0.006
Token
Token ID198
Feature activation-1.02e+00

Pos logit contributions
opes+0.038
uding+0.037
lor+0.036
wen+0.035
 armour+0.035
Neg logit contributions
 Lily-0.006
 L-0.004
L-0.002
,-0.002
 very-0.001
TokenSuddenly
Token ID38582
Feature activation-8.47e-01

Pos logit contributions
opes+0.034
imm+0.032
uding+0.032
anced+0.031
 armour+0.031
Neg logit contributions
L-0.023
 Lily-0.022
 very-0.020
 L-0.019
,-0.019
Token,
Token ID11
Feature activation-1.92e+00

Pos logit contributions
opes+0.048
uding+0.048
lor+0.047
wen+0.046
uted+0.045
Neg logit contributions
 Lily-0.003
,-0.000
 L-0.000
 v+0.003
 p+0.003
Token she
Token ID673
Feature activation-6.36e-01

Pos logit contributions
opes+0.082
uding+0.079
lor+0.078
oving+0.074
imm+0.073
Neg logit contributions
 Lily-0.039
 L-0.033
,-0.030
 the-0.029
 very-0.028
Token noticed
Token ID6810
Feature activation-1.41e-01

Pos logit contributions
lor+0.024
opes+0.024
uding+0.024
wen+0.024
eway+0.023
Neg logit contributions
 Lily-0.013
 L-0.011
 p-0.010
,-0.010
 very-0.010
Token<|endoftext|>
Token ID50256
Feature activation-1.81e-01

Pos logit contributions
opes+0.015
wen+0.015
lor+0.015
uding+0.015
 Charles+0.014
Neg logit contributions
 Lily-0.005
 L-0.004
 Sp-0.003
L-0.002
 v-0.002
TokenM
Token ID44
Feature activation-9.48e-02

Pos logit contributions
lor+0.006
opes+0.006
wen+0.006
uding+0.006
 Charles+0.006
Neg logit contributions
 Lily-0.004
 L-0.004
 v-0.003
 sad-0.003
 dirty-0.003
Tokenum
Token ID388
Feature activation-2.90e-01

Pos logit contributions
opes+0.002
wen+0.002
lor+0.002
uding+0.002
 Charles+0.002
Neg logit contributions
 Lily-0.003
 L-0.003
 v-0.002
 Sp-0.002
L-0.002
Token and
Token ID290
Feature activation-3.19e-01

Pos logit contributions
lor+0.013
uding+0.013
opes+0.013
eway+0.013
wen+0.013
Neg logit contributions
 Lily-0.004
 L-0.003
,-0.002
 p-0.002
 v-0.002
Token Dad
Token ID17415
Feature activation-7.53e-01

Pos logit contributions
lor+0.015
opes+0.014
wen+0.014
uding+0.014
eway+0.013
Neg logit contributions
 Lily-0.006
 L-0.005
 Sara-0.004
 the-0.004
,-0.004
Token had
Token ID550
Feature activation-1.36e+00

Pos logit contributions
lor+0.033
opes+0.032
uding+0.032
wen+0.032
eway+0.032
Neg logit contributions
 Lily-0.012
 L-0.010
,-0.009
 p-0.008
 v-0.008
Token a
Token ID257
Feature activation-5.35e-01

Pos logit contributions
uding+0.069
opes+0.069
lor+0.068
wen+0.068
eway+0.067
Neg logit contributions
 Lily-0.016
 L-0.011
 very-0.011
 the-0.009
 v-0.008
Token big
Token ID1263
Feature activation-1.14e+00

Pos logit contributions
opes+0.018
wen+0.017
lor+0.017
uding+0.017
 Charles+0.017
Neg logit contributions
 Lily-0.012
 L-0.011
 v-0.009
 Sp-0.009
L-0.009
Token surprise
Token ID5975
Feature activation-7.68e-01

Pos logit contributions
opes+0.039
uding+0.038
 Charles+0.038
wen+0.038
lor+0.037
Neg logit contributions
 Lily-0.023
 L-0.021
L-0.018
 v-0.018
 Sp-0.018
Token for
Token ID329
Feature activation-6.39e-01

Pos logit contributions
opes+0.036
lor+0.036
uding+0.035
wen+0.035
 Charles+0.034
Neg logit contributions
 Lily-0.010
 L-0.008
,-0.008
 very-0.007
 v-0.007
Token Jane
Token ID12091
Feature activation+8.48e-01

Pos logit contributions
lor+0.024
opes+0.023
wen+0.023
uding+0.023
 Charles+0.022
Neg logit contributions
 Lily-0.015
 L-0.013
,-0.010
 v-0.010
 Sp-0.010
Token worried
Token ID7960
Feature activation-7.01e-01

Pos logit contributions
 Lily+0.030
 L+0.028
L+0.027
 v+0.023
 Sp+0.022
Neg logit contributions
aded-0.056
 Charles-0.056
uding-0.055
 witnessed-0.055
 armour-0.054
Token,
Token ID11
Feature activation-7.86e-01

Pos logit contributions
lor+0.031
opes+0.031
wen+0.031
uding+0.031
 Charles+0.030
Neg logit contributions
 Lily-0.009
 L-0.007
 v-0.006
,-0.006
 very-0.005
Token but
Token ID475
Feature activation+6.70e-01

Pos logit contributions
lor+0.032
opes+0.032
wen+0.031
uding+0.031
 Charles+0.030
Neg logit contributions
 Lily-0.013
 L-0.012
 v-0.010
 very-0.009
 sad-0.009
Token he
Token ID339
Feature activation-3.00e-01

Pos logit contributions
 Lily+0.012
 L+0.010
 v+0.010
L+0.009
 p+0.008
Neg logit contributions
opes-0.025
 Charles-0.025
wen-0.025
uding-0.024
lor-0.024
Token was
Token ID373
Feature activation+8.81e-02

Pos logit contributions
wen+0.010
opes+0.010
uding+0.010
 Charles+0.010
lor+0.010
Neg logit contributions
 Lily-0.006
 L-0.006
 v-0.005
 Sp-0.005
 square-0.005
Token determined
Token ID5295
Feature activation-1.27e+00

Pos logit contributions
 Lily+0.002
 L+0.001
 v+0.001
 Sp+0.001
L+0.001
Neg logit contributions
opes-0.003
uding-0.003
wen-0.003
lor-0.003
 Charles-0.003
Token to
Token ID284
Feature activation-8.23e-01

Pos logit contributions
opes+0.057
lor+0.055
uding+0.055
wen+0.055
 Charles+0.053
Neg logit contributions
 Lily-0.017
 L-0.013
,-0.011
 very-0.011
 v-0.010
Token find
Token ID1064
Feature activation-1.21e+00

Pos logit contributions
opes+0.033
lor+0.031
wen+0.031
uding+0.030
uted+0.030
Neg logit contributions
 Lily-0.015
 L-0.013
,-0.012
 p-0.012
 very-0.011
Token something
Token ID1223
Feature activation+6.81e-02

Pos logit contributions
opes+0.051
uding+0.050
lor+0.049
wen+0.048
uted+0.047
Neg logit contributions
 Lily-0.022
 L-0.019
,-0.016
 v-0.016
 very-0.016
Token he
Token ID339
Feature activation-8.49e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 v+0.001
 Sp+0.001
 p+0.001
Neg logit contributions
opes-0.003
lor-0.003
uding-0.003
wen-0.003
 Charles-0.003
Token was
Token ID373
Feature activation-3.19e-02

Pos logit contributions
opes+0.034
lor+0.034
uding+0.033
wen+0.033
 Charles+0.033
Neg logit contributions
 Lily-0.014
 L-0.012
 v-0.010
 Sp-0.009
 p-0.009
Token
Token ID198
Feature activation+5.97e-01

Pos logit contributions
 Lily+0.004
 v+0.004
 L+0.004
 p+0.003
 dirty+0.003
Neg logit contributions
wen-0.009
opes-0.009
 Charles-0.009
lor-0.009
uding-0.008
Token"
Token ID1
Feature activation+7.96e-01

Pos logit contributions
 Lily+0.015
 L+0.014
 v+0.014
 sad+0.012
 Sp+0.012
Neg logit contributions
wen-0.016
lor-0.016
opes-0.016
Frank-0.016
ady-0.016
TokenBen
Token ID11696
Feature activation-2.20e-01

Pos logit contributions
 Lily+0.017
 L+0.015
 v+0.014
 very+0.013
 sad+0.013
Neg logit contributions
lor-0.027
opes-0.027
wen-0.027
uding-0.027
 Charles-0.026
Token,
Token ID11
Feature activation-1.14e+00

Pos logit contributions
opes+0.011
lor+0.011
uding+0.011
wen+0.011
 Charles+0.011
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.000
 p-0.000
,-0.000
Token are
Token ID389
Feature activation+1.02e+00

Pos logit contributions
opes+0.043
lor+0.042
uding+0.042
wen+0.042
 Charles+0.041
Neg logit contributions
 Lily-0.022
 L-0.019
 v-0.016
 Sp-0.015
 p-0.015
Token you
Token ID345
Feature activation-8.24e-01

Pos logit contributions
 Lily+0.024
 L+0.022
 v+0.022
 p+0.020
L+0.020
Neg logit contributions
opes-0.029
 Charles-0.028
wen-0.028
lor-0.028
uding-0.027
Token okay
Token ID8788
Feature activation+1.02e-01

Pos logit contributions
opes+0.026
lor+0.025
uding+0.025
wen+0.025
 Charles+0.024
Neg logit contributions
 Lily-0.022
 L-0.020
 v-0.018
 sad-0.018
 p-0.017
Token?
Token ID30
Feature activation-3.74e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.002
 Sp+0.002
L+0.002
Neg logit contributions
uding-0.003
opes-0.003
wen-0.003
 Charles-0.003
lor-0.003
Token Your
Token ID3406
Feature activation-3.60e-01

Pos logit contributions
uding+0.010
opes+0.010
lor+0.010
 Charles+0.010
wen+0.010
Neg logit contributions
 Lily-0.009
 L-0.008
 v-0.008
 very-0.008
L-0.008
Token finger
Token ID7660
Feature activation+1.39e+00

Pos logit contributions
lor+0.013
eway+0.012
uding+0.012
opes+0.012
wen+0.012
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.007
 sad-0.007
 dirty-0.007
Token is
Token ID318
Feature activation+1.31e-01

Pos logit contributions
 Lily+0.024
 L+0.021
 v+0.018
 p+0.017
 sad+0.017
Neg logit contributions
opes-0.054
uding-0.054
lor-0.054
wen-0.053
 Charles-0.052
Token day
Token ID1110
Feature activation+5.77e-01

Pos logit contributions
opes+0.003
lor+0.003
wen+0.003
uding+0.003
 Charles+0.003
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.001
 p-0.001
 very-0.001
Token?"
Token ID1701
Feature activation+6.46e-01

Pos logit contributions
 Lily+0.011
 L+0.009
,+0.008
 v+0.008
 very+0.008
Neg logit contributions
opes-0.023
wen-0.023
lor-0.023
uding-0.022
 Charles-0.022
Token
Token ID198
Feature activation-1.18e-01

Pos logit contributions
 Lily+0.011
 L+0.010
 v+0.010
 dirty+0.010
 sad+0.009
Neg logit contributions
opes-0.023
 Charles-0.022
wen-0.022
uding-0.022
lor-0.022
Token
Token ID198
Feature activation-8.36e-02

Pos logit contributions
wen+0.005
lor+0.005
opes+0.005
 Charles+0.005
uding+0.005
Neg logit contributions
 Lily-0.002
 L-0.001
 v-0.001
 dirty-0.001
 p-0.001
TokenThe
Token ID464
Feature activation+6.75e-01

Pos logit contributions
lor+0.002
wen+0.002
opes+0.002
uding+0.002
 Charles+0.002
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.002
 sad-0.002
 Sp-0.002
Token elderly
Token ID13830
Feature activation-7.92e-01

Pos logit contributions
 Lily+0.019
 L+0.018
 Sp+0.016
 v+0.016
 p+0.015
Neg logit contributions
opes-0.018
wen-0.018
lor-0.018
uding-0.018
 Charles-0.017
Token woman
Token ID2415
Feature activation+3.05e-01

Pos logit contributions
opes+0.022
lor+0.022
wen+0.022
uding+0.022
 Charles+0.021
Neg logit contributions
 Lily-0.022
 L-0.020
 v-0.018
 Sp-0.018
 p-0.018
Token laughed
Token ID13818
Feature activation-2.10e-01

Pos logit contributions
 Lily+0.007
 L+0.006
 v+0.006
 p+0.005
 Sp+0.005
Neg logit contributions
opes-0.010
lor-0.010
uding-0.010
wen-0.010
 Charles-0.010
Token and
Token ID290
Feature activation+4.99e-01

Pos logit contributions
opes+0.009
uding+0.008
lor+0.008
wen+0.008
 Charles+0.008
Neg logit contributions
 Lily-0.003
 L-0.003
 v-0.002
 Sp-0.002
L-0.002
Token said
Token ID531
Feature activation+7.86e-01

Pos logit contributions
 Lily+0.009
 L+0.008
 v+0.007
 p+0.006
 Sp+0.006
Neg logit contributions
opes-0.019
lor-0.019
uding-0.019
wen-0.019
 Charles-0.018
Token,
Token ID11
Feature activation-4.41e-01

Pos logit contributions
 Lily+0.009
 L+0.008
L+0.007
 dirty+0.006
 p+0.005
Neg logit contributions
elcome-0.032
lor-0.031
opes-0.031
 Charles-0.031
uding-0.030
Token he
Token ID339
Feature activation-9.04e-02

Pos logit contributions
 Lily+0.003
 L+0.003
 v+0.002
L+0.002
 sad+0.002
Neg logit contributions
opes-0.006
lor-0.006
wen-0.006
uding-0.006
 Charles-0.006
Token was
Token ID373
Feature activation+1.57e-01

Pos logit contributions
opes+0.004
lor+0.004
uding+0.004
wen+0.004
 Charles+0.004
Neg logit contributions
 Lily-0.001
 L-0.001
 v-0.001
 Sp-0.001
 sad-0.001
Token gone
Token ID3750
Feature activation-5.89e-01

Pos logit contributions
 Lily+0.004
,+0.004
 very+0.004
 L+0.004
 sad+0.003
Neg logit contributions
opes-0.005
wen-0.005
lor-0.005
uding-0.005
eway-0.005
Token,
Token ID11
Feature activation-1.10e+00

Pos logit contributions
opes+0.033
lor+0.032
wen+0.032
uding+0.032
eway+0.031
Neg logit contributions
 Lily-0.002
,-0.000
 L-0.000
 sad+0.001
 very+0.001
Token O
Token ID440
Feature activation-2.12e+00

Pos logit contributions
 Charles+0.038
opes+0.037
wen+0.037
lor+0.037
uding+0.036
Neg logit contributions
 Lily-0.018
 L-0.015
L-0.015
 v-0.015
 sad-0.014
Tokenll
Token ID297
Feature activation-3.38e+00

Pos logit contributions
opes+0.056
uding+0.055
lor+0.054
 Charles+0.053
wen+0.053
Neg logit contributions
 Lily-0.062
 L-0.057
 v-0.053
L-0.051
 p-0.051
Tokenie
Token ID494
Feature activation-3.72e-01

Pos logit contributions
uding+0.129
opes+0.126
lor+0.124
wen+0.122
 Charles+0.119
Neg logit contributions
 Lily-0.068
 L-0.057
,-0.053
 very-0.050
 sad-0.049
Token met
Token ID1138
Feature activation-9.51e-01

Pos logit contributions
opes+0.014
uding+0.014
wen+0.014
eway+0.014
lor+0.014
Neg logit contributions
 Lily-0.007
 L-0.006
,-0.006
 v-0.005
 p-0.005
Token a
Token ID257
Feature activation-6.82e-01

Pos logit contributions
opes+0.040
lor+0.039
wen+0.039
uding+0.039
 Charles+0.037
Neg logit contributions
 Lily-0.016
 L-0.013
 v-0.011
,-0.010
 very-0.010
Token friendly
Token ID8030
Feature activation-9.03e-01

Pos logit contributions
 Charles+0.021
lor+0.021
wen+0.021
Frank+0.021
uding+0.020
Neg logit contributions
 Lily-0.015
 L-0.014
 Sp-0.012
 v-0.011
L-0.011
Token crab
Token ID32202
Feature activation-2.55e-01

Pos logit contributions
opes+0.028
 Charles+0.027
lor+0.027
uding+0.027
wen+0.027
Neg logit contributions
 Lily-0.021
 L-0.020
 v-0.018
L-0.017
 Sp-0.017
Token!
Token ID0
Feature activation-1.12e-01

Pos logit contributions
opes+0.081
lor+0.080
uding+0.079
wen+0.079
 Charles+0.078
Neg logit contributions
 Lily-0.016
 L-0.012
 v-0.008
,-0.007
 p-0.006
Token She
Token ID1375
Feature activation-2.51e-01

Pos logit contributions
opes+0.004
uding+0.004
 Charles+0.004
lor+0.004
wen+0.004
Neg logit contributions
 Lily-0.002
 L-0.001
 v-0.001
 p-0.001
 dirty-0.001
Token wanted
Token ID2227
Feature activation+1.78e-01

Pos logit contributions
lor+0.011
uding+0.010
opes+0.010
wen+0.010
 Charles+0.010
Neg logit contributions
 Lily-0.004
 L-0.003
 p-0.003
 v-0.003
 very-0.003
Token to
Token ID284
Feature activation-1.24e+00

Pos logit contributions
 Lily+0.003
 L+0.003
 v+0.002
L+0.002
 Sp+0.002
Neg logit contributions
opes-0.007
uding-0.007
wen-0.007
lor-0.007
 Charles-0.007
Token act
Token ID719
Feature activation-1.26e+00

Pos logit contributions
lor+0.051
opes+0.050
wen+0.050
uding+0.048
 Charles+0.047
Neg logit contributions
 Lily-0.022
 L-0.019
 p-0.016
 v-0.016
,-0.015
Token out
Token ID503
Feature activation-2.87e+00

Pos logit contributions
opes+0.053
lor+0.052
wen+0.052
uding+0.052
 Charles+0.050
Neg logit contributions
 Lily-0.019
 L-0.016
 v-0.013
 very-0.012
 Sp-0.012
Token her
Token ID607
Feature activation-2.66e-01

Pos logit contributions
opes+0.111
lor+0.110
uding+0.109
wen+0.109
 Charles+0.106
Neg logit contributions
 Lily-0.053
 L-0.046
 v-0.040
 very-0.036
 Sp-0.036
Token favorite
Token ID4004
Feature activation-2.96e-01

Pos logit contributions
opes+0.009
lor+0.009
wen+0.009
uding+0.009
 Charles+0.009
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.005
 very-0.005
 sad-0.005
Token story
Token ID1621
Feature activation+2.81e-03

Pos logit contributions
opes+0.009
lor+0.009
uding+0.009
wen+0.009
 Charles+0.009
Neg logit contributions
 Lily-0.007
 L-0.007
 v-0.006
L-0.006
 Sp-0.006
Token with
Token ID351
Feature activation-4.42e-01

Pos logit contributions
 Lily+0.000
 L+0.000
 v+0.000
,+0.000
 very+0.000
Neg logit contributions
opes-0.000
lor-0.000
uding-0.000
wen-0.000
 Charles-0.000
Token her
Token ID607
Feature activation-8.77e-01

Pos logit contributions
opes+0.018
lor+0.017
uding+0.017
wen+0.017
 Charles+0.017
Neg logit contributions
 Lily-0.008
 L-0.007
 v-0.006
 Sp-0.005
 p-0.005
Token long
Token ID890
Feature activation-1.76e+00

Pos logit contributions
opes+0.038
wen+0.038
lor+0.038
uding+0.037
 Charles+0.036
Neg logit contributions
 Lily-0.025
 L-0.023
 v-0.021
 sad-0.020
 p-0.020
Token tail
Token ID7894
Feature activation-5.56e-01

Pos logit contributions
wen+0.056
lor+0.055
opes+0.054
uding+0.053
 Charles+0.052
Neg logit contributions
 Lily-0.045
 L-0.040
,-0.039
 v-0.038
 p-0.038
Token and
Token ID290
Feature activation-3.86e-01

Pos logit contributions
opes+0.025
wen+0.025
lor+0.024
uding+0.024
 Charles+0.024
Neg logit contributions
 Lily-0.007
 L-0.006
 v-0.005
 p-0.004
,-0.004
Token a
Token ID257
Feature activation-5.02e-01

Pos logit contributions
wen+0.017
lor+0.016
uding+0.016
opes+0.016
eway+0.015
Neg logit contributions
 Lily-0.007
 L-0.006
 v-0.005
 p-0.004
 dirty-0.004
Token sharp
Token ID7786
Feature activation-2.36e+00

Pos logit contributions
opes+0.017
wen+0.017
lor+0.017
uding+0.017
 Charles+0.016
Neg logit contributions
 Lily-0.011
 L-0.010
 v-0.009
 p-0.009
 sad-0.009
Token spine
Token ID19656
Feature activation-3.32e+00

Pos logit contributions
wen+0.071
lor+0.070
opes+0.068
uding+0.066
 Charles+0.066
Neg logit contributions
 Lily-0.063
 L-0.058
,-0.055
 v-0.054
 p-0.053
Token.
Token ID13
Feature activation+2.67e-01

Pos logit contributions
lor+0.170
wen+0.168
Frank+0.165
 Charles+0.164
opes+0.163
Neg logit contributions
 Lily-0.020
,-0.015
 L-0.014
.-0.010
 p-0.003
Token Can
Token ID1680
Feature activation+1.28e+00

Pos logit contributions
 Lily+0.006
 L+0.006
 v+0.005
 p+0.005
 sad+0.005
Neg logit contributions
opes-0.009
lor-0.009
uding-0.009
wen-0.008
 Charles-0.008
Token we
Token ID356
Feature activation+1.25e+00

Pos logit contributions
 L+0.031
 Lily+0.031
 v+0.029
 p+0.028
L+0.028
Neg logit contributions
 Charles-0.032
opes-0.032
lor-0.031
uding-0.030
Frank-0.029
Token stay
Token ID2652
Feature activation-1.26e+00

Pos logit contributions
 Lily+0.026
 L+0.023
 v+0.020
 Sp+0.019
L+0.019
Neg logit contributions
opes-0.044
uding-0.043
lor-0.043
wen-0.043
 Charles-0.042
Token here
Token ID994
Feature activation-1.32e+00

Pos logit contributions
opes+0.050
lor+0.050
wen+0.050
uding+0.047
eway+0.047
Neg logit contributions
 Lily-0.024
 L-0.021
,-0.020
 very-0.019
 dirty-0.018
Token the
Token ID262
Feature activation+2.51e-02

Pos logit contributions
opes+0.010
lor+0.010
uding+0.010
wen+0.010
 Charles+0.009
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.001
,-0.001
 p-0.001
Token box
Token ID3091
Feature activation-8.77e-01

Pos logit contributions
 Lily+0.001
 L+0.001
 Sp+0.001
L+0.001
 v+0.001
Neg logit contributions
opes-0.001
lor-0.001
uding-0.001
wen-0.001
 Charles-0.001
Token and
Token ID290
Feature activation-2.76e-01

Pos logit contributions
opes+0.040
uding+0.040
wen+0.039
lor+0.039
 Charles+0.038
Neg logit contributions
 Lily-0.010
 L-0.008
,-0.006
 v-0.006
 very-0.005
Token found
Token ID1043
Feature activation-7.24e-01

Pos logit contributions
opes+0.011
lor+0.011
uding+0.011
wen+0.010
eway+0.010
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.004
 p-0.004
L-0.004
Token her
Token ID607
Feature activation-6.60e-01

Pos logit contributions
opes+0.030
uding+0.029
lor+0.029
wen+0.029
 Charles+0.028
Neg logit contributions
 Lily-0.012
 L-0.011
 v-0.009
 Sp-0.008
 p-0.008
Token fl
Token ID781
Feature activation-2.99e+00

Pos logit contributions
lor+0.024
uding+0.024
opes+0.024
wen+0.024
Frank+0.023
Neg logit contributions
 Lily-0.014
 L-0.013
 v-0.012
 very-0.011
 p-0.011
Tokenute
Token ID1133
Feature activation-1.33e+00

Pos logit contributions
opes+0.086
 Charles+0.086
uding+0.084
Charlie+0.084
lled+0.082
Neg logit contributions
 L-0.073
 Lily-0.071
 v-0.067
,-0.065
L-0.063
Token was
Token ID373
Feature activation+3.94e-01

Pos logit contributions
opes+0.066
uding+0.065
wen+0.065
lor+0.065
 Charles+0.064
Neg logit contributions
 Lily-0.011
 L-0.008
 v-0.006
,-0.006
 p-0.004
Token broken
Token ID5445
Feature activation+1.97e-01

Pos logit contributions
 Lily+0.008
 L+0.007
 v+0.006
 very+0.006
 dirty+0.005
Neg logit contributions
opes-0.015
lor-0.015
uding-0.015
wen-0.015
 Charles-0.015
Token.
Token ID13
Feature activation+1.62e-01

Pos logit contributions
 Lily+0.002
 L+0.002
 v+0.001
 Sp+0.001
L+0.001
Neg logit contributions
opes-0.009
uding-0.009
lor-0.009
wen-0.009
 Charles-0.008
Token She
Token ID1375
Feature activation+5.11e-01

Pos logit contributions
 v+0.002
 dirty+0.002
 p+0.002
 c+0.002
 sad+0.002
Neg logit contributions
lor-0.006
opes-0.006
 Charles-0.005
uding-0.005
wen-0.005
Token big
Token ID1263
Feature activation-9.52e-01

Pos logit contributions
 Charles+0.022
lor+0.021
opes+0.021
wen+0.021
uding+0.021
Neg logit contributions
 Lily-0.015
 L-0.014
 Sp-0.012
L-0.011
 Sara-0.011
Token castle
Token ID16669
Feature activation-1.93e+00

Pos logit contributions
 Charles+0.031
opes+0.030
uding+0.030
 armour+0.029
lor+0.028
Neg logit contributions
 Lily-0.020
 L-0.019
L-0.017
 Sp-0.016
 Sara-0.014
Token with
Token ID351
Feature activation-1.43e+00

Pos logit contributions
opes+0.093
wen+0.092
lor+0.091
uding+0.091
 Charles+0.090
Neg logit contributions
 Lily-0.017
 L-0.013
 v-0.010
,-0.008
 p-0.007
Token point
Token ID966
Feature activation-1.12e+00

Pos logit contributions
lor+0.057
uding+0.057
wen+0.057
opes+0.056
 Charles+0.053
Neg logit contributions
 Lily-0.026
 L-0.023
 v-0.020
 p-0.018
 very-0.018
Tokeny
Token ID88
Feature activation-1.18e+00

Pos logit contributions
opes+0.043
lor+0.043
uding+0.042
wen+0.042
 Charles+0.041
Neg logit contributions
 Lily-0.021
 L-0.019
 v-0.016
,-0.016
 p-0.015
Token towers
Token ID18028
Feature activation-2.92e+00

Pos logit contributions
wen+0.038
lor+0.037
uding+0.037
opes+0.036
 Charles+0.035
Neg logit contributions
 Lily-0.030
 L-0.027
 v-0.026
 p-0.025
,-0.024
Token.
Token ID13
Feature activation-1.69e-01

Pos logit contributions
lor+0.144
wen+0.144
opes+0.144
uding+0.143
 Charles+0.140
Neg logit contributions
 Lily-0.019
 L-0.013
,-0.011
 v-0.009
 very-0.007
Token "
Token ID366
Feature activation-6.85e-02

Pos logit contributions
opes+0.006
uding+0.006
 Charles+0.006
lor+0.006
eway+0.006
Neg logit contributions
 Lily-0.003
 L-0.003
L-0.002
 v-0.002
 dirty-0.002
TokenThat
Token ID2504
Feature activation+4.36e-01

Pos logit contributions
opes+0.002
lor+0.002
uding+0.002
wen+0.002
 Charles+0.002
Neg logit contributions
 Lily-0.002
 L-0.002
 v-0.001
 Sp-0.001
L-0.001
Token would
Token ID561
Feature activation-3.79e-01

Pos logit contributions
 Lily+0.008
 L+0.007
 v+0.007
,+0.007
 p+0.006
Neg logit contributions
uding-0.017
wen-0.017
opes-0.017
lor-0.016
uted-0.016
Token be
Token ID307
Feature activation+8.23e-01

Pos logit contributions
opes+0.016
lor+0.015
wen+0.015
uding+0.015
 Charles+0.015
Neg logit contributions
 Lily-0.006
 L-0.005
 v-0.004
 p-0.004
 Sp-0.004