TOP ACTIVATIONS
MAX = 5.698

Are you OK , L ila and Ben ? I was
a time two friends , Jas mine and John , were
. When Ma isy was finished , she
. " How long do we have to wait
you have there , L ila and Ben ?" she asks
" Are you okay , Jill ?" Tom asked . He
The moral of the story is : With
. The moral of the story is : Eating
t eddy bear , L ila ?" Tom asks .
asked , " What does spell mean ?" Her
's a stream !" said Jas mine .
The moral of this story is that although
you want to go , Blank et ?" The
The moral of the story is this :
you , doctor . How long will she be sick ?"
want to give it a try ?" The baby
keep it as a souven ir of your adventure . But
. The moral of the story is that it
" Are you OK , Lily ?" Tom asked .
." The moral of the story is that it

MIN ACTIVATIONS
MIN = -5.536

the sea and they watched how they moved through the water
the noise they made and how wild it all was .
They watched the rain and how it made p udd les
on , Lily and the puppy were best friends and they
One day , Lily and Fl uffy went to the market
on , Lily and the puppy became best friends and they
the way they tasted and how they made him feel warm
day , Tom and his toy dog went on an adventure
heard the tornado roar and how l outside . They felt
Rose loved Fl appy and Fl appy seemed to love her
honey . Buzz y and Fl utter became best friends .
on , Lily and the puppy were best friends forever .
out the window and saw how dirty he was . She
Fl uffy . Lily loved Fl uffy very much and took
together , Amy and the puppy had become best friends .
named Tim . Tim and Fl uffy liked to play and
tower they had built and how strong it was . They
day , Tim and his toy dog went for a walk
mom came in and saw how upset he was . She
and what they ate and how they lived . They also
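
The INTERVAL blocks below appear to split the range between the MIN and MAX values above into four equal-width bins. The listed boundaries (-2.727, 0.081, 2.889) match that reading to within rounding, and the four CONTAINS percentages sum to 100. The following is a minimal sketch of that arithmetic, not the dashboard's actual code. The names `bin_edges` and `bin_index` are hypothetical, and equal-width binning is an assumption inferred from the numbers.

```python
# Values copied from the MAX and MIN lines of the dump.
ACT_MAX = 5.698
ACT_MIN = -5.536
N_BINS = 4  # assumption: one bin per INTERVAL block in the dump


def bin_edges(lo: float, hi: float, n_bins: int) -> list[float]:
    """Return n_bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def bin_index(x: float, edges: list[float]) -> int:
    """Index of the bin holding x, counted from the lowest bin.

    Values at or beyond the outer edges are clamped into the first
    or last bin, so rounded display values such as 5.70 still land
    in the top bin.
    """
    width = (edges[-1] - edges[0]) / (len(edges) - 1)
    i = int((x - edges[0]) // width)
    return max(0, min(i, len(edges) - 2))


edges = bin_edges(ACT_MIN, ACT_MAX, N_BINS)
print([round(e, 4) for e in edges])
```

Under this assumption an activation of 0.811 falls in the third bin from the bottom (0.081 to 2.889), and an activation of 5.70 falls in the top bin.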

INTERVAL 2.889 to 5.698
CONTAINS 0.503%

granted . From then on , the village
they heard a voice shout , " Sh oot the bear
End . <|endoftext|> Once upon a time there was a girl
knew that she would always remember the special knob on her
end . <|endoftext|> Once upon a time , there was a

INTERVAL 0.081 to 2.889
CONTAINS 48.184%

mom and sister are angry . They run to the big
the centre of her imagination . The little
polite . Jane wanted to learn how to add numbers .
" Tim my , you need to eat your cel ery
is Billy . Billy has a big bag of toys .

INTERVAL -2.727 to 0.081
CONTAINS 50.480%

Her mom said , " No , we are going
, hugged Milo and together they played until the sun went
punishments , he started to cry . His mom said he
in his notebook . Every day he would fill the pages
That night , Benny stayed up late playing . He

INTERVAL -5.536 to -2.727
CONTAINS 0.833%

mad . They say , " You are a bad man
L ily said , " Wow ! Look at the pretty
mom came in and saw how clean her room was .
dog and said , " Hello , do ggy ! You
in the room . " What are you doing , Sweet
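
The per-token records that follow share a fixed layout: a token line, a token ID line, a feature activation line, and two lists of logit contributions. In each of these, the label or token text is fused directly to the value. The sketch below shows one way such a record could be split into fields. `TokenRecord` and `parse_record` are hypothetical names and not part of any tool that produced this dump. The regular expression assumes every contribution line ends in a signed decimal such as `+0.067`.

```python
import re
from dataclasses import dataclass, field

# Trailing signed decimal (e.g. "+0.067", "-0.079") fused to the token text.
LOGIT_RE = re.compile(r"^(.*)([+-]\d+\.\d+)$")


@dataclass
class TokenRecord:
    token: str
    token_id: int
    activation: float
    pos: list = field(default_factory=list)  # (token text, contribution)
    neg: list = field(default_factory=list)  # (token text, contribution)


def parse_record(lines):
    """Parse the raw lines of one record (hypothetical helper)."""
    rec = TokenRecord(token="", token_id=-1, activation=float("nan"))
    bucket = None  # list currently receiving contribution lines
    for line in lines:
        if line.startswith("Token ID"):
            rec.token_id = int(line[len("Token ID"):])
        elif line.startswith("Token"):
            # Keep any leading space: it is part of the token text.
            rec.token = line[len("Token"):]
        elif line.startswith("Feature activation"):
            rec.activation = float(line[len("Feature activation"):])
        elif line.startswith("Pos logit contributions"):
            bucket = rec.pos
        elif line.startswith("Neg logit contributions"):
            bucket = rec.neg
        elif line.strip() and bucket is not None:
            m = LOGIT_RE.match(line)
            if m:
                bucket.append((m.group(1), float(m.group(2))))
    return rec


sample = [
    "TokenAre",
    "Token ID8491",
    "Feature activation-3.08e+00",
    "",
    "Pos logit contributions",
    " ourselves+0.067",
    "ls+0.063",
    "Neg logit contributions",
    "ah-0.079",
    " 2-0.056",
]
record = parse_record(sample)
print(record)
```

Matching the value from the end of the line keeps token texts that contain digits or a leading space intact, so ` 2-0.056` is read as the token ` 2` with contribution -0.056.
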
TokenAre
Token ID8491
Feature activation-3.08e+00

Pos logit contributions
 ourselves+0.067
ls+0.063
uddled+0.062
apy+0.062
istered+0.059
Neg logit contributions
ah-0.079
et-0.078
 seven-0.073
oscope-0.067
aster-0.066
Token you
Token ID345
Feature activation-1.08e-01

Pos logit contributions
uddled+0.082
apy+0.079
istered+0.078
mons+0.076
ins+0.075
Neg logit contributions
et-0.067
ah-0.066
 seven-0.063
ary-0.056
 2-0.056
Token OK
Token ID7477
Feature activation-4.25e-02

Pos logit contributions
 ourselves+0.003
apy+0.003
uddled+0.003
mons+0.002
ls+0.002
Neg logit contributions
et-0.003
ah-0.003
 seven-0.002
oscope-0.002
pole-0.002
Token,
Token ID11
Feature activation+8.11e-01

Pos logit contributions
 ourselves+0.001
apy+0.001
uddled+0.001
mons+0.001
ls+0.001
Neg logit contributions
et-0.001
ah-0.001
 seven-0.001
oscope-0.001
pole-0.001
Token L
Token ID406
Feature activation+2.88e+00

Pos logit contributions
et+0.021
ah+0.021
 seven+0.018
oscope+0.018
pole+0.017
Neg logit contributions
 ourselves-0.020
apy-0.019
uddled-0.019
mons-0.018
ls-0.018
Tokenila
Token ID10102
Feature activation+5.70e+00

Pos logit contributions
et+0.081
ah+0.078
oscope+0.074
 seven+0.072
wn+0.069
Neg logit contributions
 ourselves-0.063
uddled-0.058
ls-0.057
apy-0.057
mons-0.056
Token and
Token ID290
Feature activation+2.80e+00

Pos logit contributions
et+0.134
ah+0.132
wn+0.124
 seven+0.121
alling+0.118
Neg logit contributions
 ourselves-0.137
ls-0.129
uddled-0.127
apy-0.125
iring-0.123
Token Ben
Token ID3932
Feature activation+2.10e+00

Pos logit contributions
et+0.064
ah+0.063
 seven+0.053
oscope+0.053
pole+0.051
Neg logit contributions
 ourselves-0.079
apy-0.074
uddled-0.072
mons-0.071
ls-0.070
Token?
Token ID30
Feature activation-7.52e-01

Pos logit contributions
et+0.049
ah+0.048
 seven+0.042
oscope+0.041
wn+0.039
Neg logit contributions
 ourselves-0.056
apy-0.051
uddled-0.050
ls-0.050
mons-0.049
Token I
Token ID314
Feature activation-8.27e-01

Pos logit contributions
 ourselves+0.021
uddled+0.019
apy+0.019
ls+0.019
mons+0.018
Neg logit contributions
et-0.017
ah-0.017
 seven-0.015
oscope-0.014
pole-0.013
Token was
Token ID373
Feature activation+5.26e-01

Pos logit contributions
 ourselves+0.019
uddled+0.017
apy+0.016
mons+0.016
ls+0.016
Neg logit contributions
ah-0.024
et-0.024
oscope-0.021
 seven-0.020
ary-0.020
Token a
Token ID257
Feature activation+3.09e+00

Pos logit contributions
ah+0.007
et+0.007
 seven+0.006
oscope+0.006
aster+0.005
Neg logit contributions
 ourselves-0.013
uddled-0.013
mons-0.013
apy-0.013
ls-0.012
Token time
Token ID640
Feature activation+9.14e-01

Pos logit contributions
et+0.036
ah+0.033
oscope+0.026
pole+0.025
alling+0.024
Neg logit contributions
 ourselves-0.112
apy-0.110
uddled-0.109
istered-0.106
ls-0.106
Token two
Token ID734
Feature activation-1.25e+00

Pos logit contributions
ah+0.016
et+0.016
 seven+0.013
ary+0.011
oscope+0.011
Neg logit contributions
 ourselves-0.030
uddled-0.029
apy-0.028
mons-0.028
Sorry-0.027
Token friends
Token ID2460
Feature activation-7.73e-02

Pos logit contributions
 ourselves+0.025
apy+0.022
uddled+0.022
mons+0.021
Sorry+0.020
Neg logit contributions
et-0.039
ah-0.039
 seven-0.035
oscope-0.033
 2-0.033
Token,
Token ID11
Feature activation-1.12e+00

Pos logit contributions
 ourselves+0.002
apy+0.002
uddled+0.002
mons+0.002
ls+0.002
Neg logit contributions
et-0.002
ah-0.001
 seven-0.001
oscope-0.001
pole-0.001
Token Jas
Token ID21961
Feature activation+5.43e+00

Pos logit contributions
 ourselves+0.032
apy+0.030
uddled+0.029
mons+0.028
ls+0.028
Neg logit contributions
et-0.025
ah-0.025
 seven-0.022
oscope-0.021
pole-0.020
Tokenmine
Token ID3810
Feature activation+4.18e-01

Pos logit contributions
ah+0.109
et+0.102
 seven+0.101
oscope+0.098
wn+0.098
Neg logit contributions
 ourselves-0.139
uddled-0.138
ls-0.127
ve-0.127
ins-0.126
Token and
Token ID290
Feature activation-4.10e-01

Pos logit contributions
et+0.007
ah+0.006
 seven+0.005
oscope+0.005
ary+0.005
Neg logit contributions
 ourselves-0.014
apy-0.014
uddled-0.014
mons-0.013
istered-0.013
Token John
Token ID1757
Feature activation-2.31e+00

Pos logit contributions
uddled+0.013
 ourselves+0.013
iring+0.012
istered+0.012
apy+0.012
Neg logit contributions
ah-0.008
et-0.008
 seven-0.007
pole-0.006
oscope-0.006
Token,
Token ID11
Feature activation-1.40e+00

Pos logit contributions
 ourselves+0.057
apy+0.054
uddled+0.054
mons+0.053
ls+0.052
Neg logit contributions
ah-0.061
et-0.061
 seven-0.052
oscope-0.051
pole-0.048
Token were
Token ID547
Feature activation-9.47e-01

Pos logit contributions
 ourselves+0.033
apy+0.031
uddled+0.031
mons+0.029
ls+0.029
Neg logit contributions
et-0.038
ah-0.037
 seven-0.033
oscope-0.032
pole-0.031
Token.
Token ID13
Feature activation+7.86e-01

Pos logit contributions
 ourselves+0.056
uddled+0.053
apy+0.053
mons+0.051
ls+0.050
Neg logit contributions
et-0.041
ah-0.040
 seven-0.034
oscope-0.032
pole-0.031
Token
Token ID220
Feature activation-2.54e+00

Pos logit contributions
et+0.016
ah+0.016
oscope+0.014
pole+0.013
ock+0.013
Neg logit contributions
 ourselves-0.023
apy-0.022
mons-0.021
uddled-0.021
ls-0.020
Token
Token ID198
Feature activation+1.16e+00

Pos logit contributions
 ourselves+0.073
apy+0.069
uddled+0.069
ls+0.067
mons+0.067
Neg logit contributions
et-0.052
ah-0.051
 seven-0.044
oscope-0.041
ary-0.041
Token
Token ID198
Feature activation+7.11e-01

Pos logit contributions
et+0.031
ah+0.030
oscope+0.026
 seven+0.026
pole+0.025
Neg logit contributions
 ourselves-0.029
apy-0.027
uddled-0.026
ls-0.025
mons-0.025
TokenWhen
Token ID2215
Feature activation-1.92e+00

Pos logit contributions
et+0.021
ah+0.020
oscope+0.018
 seven+0.018
pole+0.017
Neg logit contributions
 ourselves-0.015
apy-0.014
uddled-0.014
mons-0.013
ls-0.013
Token Ma
Token ID6669
Feature activation+5.43e+00

Pos logit contributions
 ourselves+0.053
apy+0.050
uddled+0.049
mons+0.047
ls+0.047
Neg logit contributions
et-0.044
ah-0.044
 seven-0.038
oscope-0.036
pole-0.035
Tokenisy
Token ID13560
Feature activation+2.45e-01

Pos logit contributions
et+0.114
ah+0.112
 seven+0.108
aling+0.099
oscope+0.099
Neg logit contributions
uddled-0.149
 ourselves-0.138
ls-0.137
mons-0.132
iring-0.128
Token was
Token ID373
Feature activation-1.88e-01

Pos logit contributions
et+0.007
ah+0.007
 seven+0.006
oscope+0.006
pole+0.006
Neg logit contributions
 ourselves-0.005
apy-0.005
uddled-0.005
ls-0.005
mons-0.005
Token finished
Token ID5201
Feature activation+7.31e-02

Pos logit contributions
apy+0.004
 ourselves+0.004
uddled+0.004
ls+0.004
istered+0.004
Neg logit contributions
ah-0.005
et-0.005
oscope-0.004
pole-0.004
 seven-0.004
Token,
Token ID11
Feature activation-5.18e-01

Pos logit contributions
et+0.001
ah+0.001
 seven+0.001
aster+0.001
oscope+0.001
Neg logit contributions
 ourselves-0.002
uddled-0.002
mons-0.002
istered-0.002
apy-0.002
Token she
Token ID673
Feature activation-7.31e-02

Pos logit contributions
 ourselves+0.011
apy+0.010
uddled+0.010
mons+0.009
ls+0.009
Neg logit contributions
et-0.016
ah-0.015
 seven-0.014
oscope-0.013
pole-0.013
Token.
Token ID13
Feature activation+4.67e-01

Pos logit contributions
 ourselves+0.014
apy+0.013
uddled+0.013
mons+0.012
ls+0.012
Neg logit contributions
et-0.011
ah-0.011
 seven-0.010
oscope-0.009
pole-0.009
Token
Token ID198
Feature activation-4.11e-01

Pos logit contributions
et+0.010
ah+0.010
 seven+0.009
oscope+0.009
pole+0.008
Neg logit contributions
 ourselves-0.013
apy-0.012
uddled-0.012
mons-0.012
ls-0.012
Token
Token ID198
Feature activation-1.45e+00

Pos logit contributions
 ourselves+0.013
mons+0.013
uddled+0.013
istered+0.013
sters+0.013
Neg logit contributions
ah-0.006
et-0.006
 seven-0.005
ary-0.005
 time-0.005
Token"
Token ID1
Feature activation-2.98e+00

Pos logit contributions
uddled+0.041
 ourselves+0.039
mons+0.039
ls+0.039
idents+0.039
Neg logit contributions
ah-0.028
et-0.027
 seven-0.027
 2-0.024
ary-0.021
TokenHow
Token ID2437
Feature activation-2.75e+00

Pos logit contributions
 ourselves+0.061
ls+0.057
apy+0.057
uddled+0.056
istered+0.053
Neg logit contributions
ah-0.087
et-0.085
 seven-0.081
oscope-0.074
 2-0.073
Token long
Token ID890
Feature activation+5.43e+00

Pos logit contributions
 ourselves+0.096
apy+0.092
uddled+0.091
mons+0.088
ls+0.088
Neg logit contributions
et-0.044
ah-0.042
 seven-0.034
oscope-0.033
pole-0.030
Token do
Token ID466
Feature activation+7.97e-01

Pos logit contributions
ah+0.126
et+0.125
alling+0.113
aling+0.112
oscope+0.111
Neg logit contributions
 ourselves-0.144
apy-0.130
ls-0.129
mons-0.127
uddled-0.125
Token we
Token ID356
Feature activation+1.59e+00

Pos logit contributions
et+0.028
ah+0.027
 seven+0.025
oscope+0.024
pole+0.024
Neg logit contributions
 ourselves-0.012
uddled-0.011
apy-0.011
mons-0.010
ls-0.010
Token have
Token ID423
Feature activation+2.81e+00

Pos logit contributions
et+0.039
ah+0.039
 seven+0.034
oscope+0.032
pole+0.032
Neg logit contributions
 ourselves-0.041
apy-0.036
mons-0.034
uddled-0.034
Sorry-0.033
Token to
Token ID284
Feature activation+3.94e-01

Pos logit contributions
ah+0.070
et+0.069
oscope+0.058
pole+0.058
 seven+0.057
Neg logit contributions
 ourselves-0.076
apy-0.069
uddled-0.066
ls-0.066
mons-0.065
Token wait
Token ID4043
Feature activation+4.72e-01

Pos logit contributions
et+0.009
ah+0.009
 seven+0.007
oscope+0.007
pole+0.007
Neg logit contributions
 ourselves-0.012
apy-0.010
uddled-0.010
mons-0.009
Sorry-0.009
Token you
Token ID345
Feature activation+1.51e+00

Pos logit contributions
ah+0.005
et+0.005
 seven+0.005
oscope+0.005
ary+0.005
Neg logit contributions
uddled-0.003
 ourselves-0.003
apy-0.003
istered-0.003
mons-0.003
Token have
Token ID423
Feature activation+2.58e+00

Pos logit contributions
ah+0.036
et+0.036
oscope+0.032
pole+0.031
 seven+0.031
Neg logit contributions
 ourselves-0.038
mons-0.033
apy-0.033
ls-0.032
uddled-0.032
Token there
Token ID612
Feature activation+1.40e+00

Pos logit contributions
et+0.065
ah+0.065
oscope+0.055
 seven+0.055
pole+0.053
Neg logit contributions
 ourselves-0.067
apy-0.061
uddled-0.061
mons-0.060
ls-0.060
Token,
Token ID11
Feature activation+7.86e-01

Pos logit contributions
et+0.035
ah+0.035
oscope+0.032
wn+0.030
aling+0.030
Neg logit contributions
 ourselves-0.037
ls-0.031
uddled-0.031
apy-0.031
iring-0.030
Token L
Token ID406
Feature activation+2.44e+00

Pos logit contributions
et+0.021
ah+0.020
 seven+0.018
oscope+0.017
pole+0.017
Neg logit contributions
 ourselves-0.019
apy-0.018
uddled-0.018
mons-0.017
ls-0.017
Tokenila
Token ID10102
Feature activation+5.32e+00

Pos logit contributions
et+0.063
ah+0.062
 seven+0.056
oscope+0.055
aling+0.052
Neg logit contributions
 ourselves-0.059
uddled-0.056
apy-0.054
mons-0.054
ls-0.054
Token and
Token ID290
Feature activation+3.31e+00

Pos logit contributions
et+0.120
ah+0.120
wn+0.107
 seven+0.106
oscope+0.102
Neg logit contributions
 ourselves-0.139
ls-0.129
uddled-0.129
mons-0.124
iring-0.123
Token Ben
Token ID3932
Feature activation+1.61e+00

Pos logit contributions
et+0.070
ah+0.069
 seven+0.059
oscope+0.057
pole+0.054
Neg logit contributions
 ourselves-0.098
apy-0.093
uddled-0.092
mons-0.089
ls-0.088
Token?"
Token ID1701
Feature activation-5.72e-01

Pos logit contributions
et+0.039
ah+0.039
 seven+0.034
oscope+0.033
wn+0.032
Neg logit contributions
 ourselves-0.042
uddled-0.036
ls-0.036
mons-0.036
apy-0.036
Token she
Token ID673
Feature activation+8.28e-01

Pos logit contributions
uddled+0.018
mons+0.017
 ourselves+0.017
istered+0.016
apy+0.016
Neg logit contributions
ah-0.010
 seven-0.009
et-0.009
 2-0.009
 One-0.009
Token asks
Token ID7893
Feature activation-2.53e+00

Pos logit contributions
et+0.022
ah+0.021
oscope+0.018
 seven+0.018
wn+0.017
Neg logit contributions
 ourselves-0.020
apy-0.019
ls-0.019
uddled-0.018
mons-0.017
Token"
Token ID1
Feature activation-2.89e+00

Pos logit contributions
 ourselves+0.003
uddled+0.003
mons+0.003
apy+0.003
idents+0.003
Neg logit contributions
et-0.002
ah-0.002
 seven-0.002
aster-0.002
aling-0.002
TokenAre
Token ID8491
Feature activation-2.76e+00

Pos logit contributions
 ourselves+0.075
uddled+0.069
ls+0.067
apy+0.067
iring+0.065
Neg logit contributions
ah-0.072
et-0.071
 seven-0.064
aster-0.061
oscope-0.059
Token you
Token ID345
Feature activation-6.68e-02

Pos logit contributions
uddled+0.070
apy+0.067
istered+0.066
mons+0.064
 ourselves+0.064
Neg logit contributions
et-0.066
ah-0.063
 seven-0.059
ary-0.055
aling-0.053
Token okay
Token ID8788
Feature activation+7.61e-01

Pos logit contributions
 ourselves+0.002
apy+0.002
mons+0.002
uddled+0.002
ls+0.002
Neg logit contributions
et-0.001
ah-0.001
 seven-0.001
oscope-0.001
wn-0.001
Token,
Token ID11
Feature activation+1.90e+00

Pos logit contributions
et+0.015
ah+0.015
 seven+0.013
oscope+0.012
pole+0.011
Neg logit contributions
 ourselves-0.023
apy-0.022
uddled-0.022
mons-0.021
ls-0.021
Token Jill
Token ID22772
Feature activation+5.25e+00

Pos logit contributions
et+0.040
ah+0.038
 seven+0.034
 2+0.031
pole+0.030
Neg logit contributions
 ourselves-0.055
uddled-0.054
apy-0.053
mons-0.051
ls-0.051
Token?"
Token ID1701
Feature activation+1.05e-01

Pos logit contributions
et+0.139
ah+0.135
wn+0.132
alling+0.130
aled+0.127
Neg logit contributions
ls-0.108
 ourselves-0.106
ies-0.098
ins-0.095
uddled-0.093
Token Tom
Token ID4186
Feature activation+1.62e+00

Pos logit contributions
et+0.002
ah+0.002
 seven+0.002
aster+0.002
 2+0.002
Neg logit contributions
 ourselves-0.003
uddled-0.003
apy-0.003
mons-0.003
ls-0.003
Token asked
Token ID1965
Feature activation-2.36e+00

Pos logit contributions
ah+0.050
wn+0.049
et+0.049
alling+0.045
aster+0.045
Neg logit contributions
 ourselves-0.027
ls-0.026
sters-0.024
ies-0.024
Sorry-0.023
Token.
Token ID13
Feature activation+1.00e+00

Pos logit contributions
 ourselves+0.065
apy+0.061
uddled+0.060
mons+0.058
ls+0.058
Neg logit contributions
et-0.055
ah-0.054
 seven-0.046
oscope-0.045
pole-0.043
Token He
Token ID679
Feature activation+1.32e+00

Pos logit contributions
et+0.027
ah+0.026
oscope+0.023
wn+0.022
alling+0.022
Neg logit contributions
apy-0.024
 ourselves-0.024
Sorry-0.022
mons-0.022
uddled-0.022
Token
Token ID220
Feature activation-1.97e+00

Pos logit contributions
et+0.038
ah+0.035
oscope+0.032
wn+0.031
aster+0.030
Neg logit contributions
 ourselves-0.037
apy-0.032
ls-0.031
uddled-0.031
We-0.030
Token
Token ID198
Feature activation+1.91e+00

Pos logit contributions
 ourselves+0.055
apy+0.052
uddled+0.051
mons+0.050
ls+0.049
Neg logit contributions
et-0.043
ah-0.043
 seven-0.037
oscope-0.036
pole-0.034
Token
Token ID198
Feature activation+1.68e+00

Pos logit contributions
et+0.053
ah+0.050
oscope+0.044
 seven+0.043
wn+0.043
Neg logit contributions
 ourselves-0.045
apy-0.041
uddled-0.040
ls-0.039
Sorry-0.038
TokenThe
Token ID464
Feature activation+1.09e+00

Pos logit contributions
et+0.048
ah+0.045
oscope+0.041
wn+0.039
 seven+0.038
Neg logit contributions
 ourselves-0.038
uddled-0.034
apy-0.033
Sorry-0.033
ls-0.033
Token moral
Token ID6573
Feature activation+1.80e+00

Pos logit contributions
et+0.028
ah+0.027
oscope+0.023
 seven+0.023
wn+0.022
Neg logit contributions
 ourselves-0.028
apy-0.025
uddled-0.025
mons-0.025
ls-0.024
Token of
Token ID286
Feature activation+5.20e+00

Pos logit contributions
et+0.041
ah+0.037
oscope+0.032
pole+0.031
 seven+0.030
Neg logit contributions
 ourselves-0.053
apy-0.048
We-0.047
iring-0.047
ls-0.047
Token the
Token ID262
Feature activation+4.26e+00

Pos logit contributions
et+0.130
ah+0.129
wn+0.120
pole+0.117
ooter+0.116
Neg logit contributions
 ourselves-0.138
 themselves-0.113
apy-0.104
we-0.103
We-0.103
Token story
Token ID1621
Feature activation+1.22e+00

Pos logit contributions
et+0.099
ah+0.091
pole+0.077
wn+0.077
alling+0.074
Neg logit contributions
 ourselves-0.121
ls-0.106
We-0.105
mons-0.104
uddled-0.103
Token is
Token ID318
Feature activation+2.49e+00

Pos logit contributions
et+0.036
ah+0.036
 seven+0.032
oscope+0.031
pole+0.030
Neg logit contributions
 ourselves-0.026
apy-0.023
uddled-0.023
ls-0.022
mons-0.022
Token:
Token ID25
Feature activation+1.05e+00

Pos logit contributions
et+0.060
ah+0.059
 seven+0.051
oscope+0.050
pole+0.047
Neg logit contributions
 ourselves-0.068
apy-0.063
uddled-0.062
mons-0.060
ls-0.059
Token With
Token ID2080
Feature activation+2.06e+00

Pos logit contributions
et+0.026
ah+0.026
 seven+0.022
oscope+0.021
pole+0.021
Neg logit contributions
 ourselves-0.028
apy-0.025
uddled-0.024
mons-0.023
Sorry-0.023
Token.
Token ID13
Feature activation+1.36e+00

Pos logit contributions
ah+0.009
et+0.009
 seven+0.008
oscope+0.007
pole+0.007
Neg logit contributions
 ourselves-0.009
apy-0.008
uddled-0.008
mons-0.008
ls-0.008
Token
Token ID198
Feature activation+2.52e+00

Pos logit contributions
et+0.032
ah+0.031
oscope+0.026
 seven+0.026
wn+0.024
Neg logit contributions
 ourselves-0.038
apy-0.035
uddled-0.034
mons-0.034
ls-0.033
Token
Token ID198
Feature activation+1.91e+00

Pos logit contributions
et+0.066
ah+0.064
oscope+0.055
 seven+0.054
wn+0.053
Neg logit contributions
 ourselves-0.063
apy-0.059
uddled-0.058
ls-0.057
mons-0.054
TokenThe
Token ID464
Feature activation+1.08e+00

Pos logit contributions
et+0.054
ah+0.052
oscope+0.046
 seven+0.046
wn+0.044
Neg logit contributions
 ourselves-0.043
apy-0.039
uddled-0.039
ls-0.037
mons-0.037
Token moral
Token ID6573
Feature activation+2.59e+00

Pos logit contributions
et+0.028
ah+0.028
pole+0.023
oscope+0.023
wn+0.023
Neg logit contributions
 ourselves-0.026
uddled-0.023
mons-0.023
Sorry-0.023
ls-0.023
Token of
Token ID286
Feature activation+5.19e+00

Pos logit contributions
et+0.063
ah+0.056
oscope+0.051
pole+0.050
alling+0.047
Neg logit contributions
 ourselves-0.071
iring-0.064
We-0.064
apy-0.063
mons-0.063
Token the
Token ID262
Feature activation+3.89e+00

Pos logit contributions
et+0.134
ah+0.133
wn+0.129
ooter+0.121
pole+0.119
Neg logit contributions
 ourselves-0.132
 themselves-0.109
iring-0.099
apy-0.099
 yourselves-0.098
Token story
Token ID1621
Feature activation+1.72e+00

Pos logit contributions
et+0.090
ah+0.081
wn+0.072
oscope+0.070
pole+0.070
Neg logit contributions
 ourselves-0.112
ls-0.100
iring-0.094
mons-0.093
sters-0.092
Token is
Token ID318
Feature activation+2.74e+00

Pos logit contributions
et+0.052
ah+0.051
 seven+0.046
oscope+0.046
wn+0.044
Neg logit contributions
 ourselves-0.035
apy-0.031
uddled-0.031
ls-0.030
mons-0.029
Token:
Token ID25
Feature activation+1.52e+00

Pos logit contributions
et+0.061
ah+0.059
 seven+0.051
oscope+0.050
pole+0.047
Neg logit contributions
 ourselves-0.079
apy-0.074
uddled-0.073
mons-0.071
ls-0.070
Token Eating
Token ID37537
Feature activation-2.03e-01

Pos logit contributions
et+0.037
ah+0.036
 seven+0.032
oscope+0.031
pole+0.030
Neg logit contributions
 ourselves-0.041
apy-0.037
uddled-0.036
mons-0.036
Sorry-0.034
Token t
Token ID256
Feature activation-2.15e+00

Pos logit contributions
 ourselves+0.013
apy+0.011
ls+0.011
uddled+0.011
mons+0.011
Neg logit contributions
et-0.014
ah-0.013
aling-0.012
alling-0.012
aster-0.012
Tokeneddy
Token ID21874
Feature activation-1.38e+00

Pos logit contributions
 ourselves+0.047
apy+0.043
mons+0.043
uddled+0.041
Sorry+0.039
Neg logit contributions
ah-0.063
et-0.059
oscope-0.052
 seven-0.051
ock-0.051
Token bear
Token ID6842
Feature activation+1.38e+00

Pos logit contributions
 ourselves+0.043
apy+0.040
uddled+0.040
mons+0.039
ls+0.038
Neg logit contributions
et-0.027
ah-0.026
 seven-0.022
oscope-0.021
pole-0.020
Token,
Token ID11
Feature activation+4.71e-01

Pos logit contributions
et+0.035
ah+0.033
alling+0.031
oscope+0.031
wn+0.030
Neg logit contributions
 ourselves-0.032
ls-0.029
apy-0.029
uddled-0.028
mons-0.028
Token L
Token ID406
Feature activation+2.33e+00

Pos logit contributions
et+0.015
ah+0.015
wn+0.014
aling+0.014
alling+0.013
Neg logit contributions
 ourselves-0.010
apy-0.008
 themselves-0.007
 yourselves-0.007
iring-0.007
Tokenila
Token ID10102
Feature activation+5.16e+00

Pos logit contributions
et+0.066
ah+0.063
 seven+0.060
oscope+0.060
wn+0.058
Neg logit contributions
 ourselves-0.049
uddled-0.047
ls-0.046
ins-0.044
mons-0.043
Token?"
Token ID1701
Feature activation-6.18e-01

Pos logit contributions
wn+0.148
et+0.146
ah+0.145
aled+0.142
alling+0.142
Neg logit contributions
ies-0.087
 ourselves-0.087
ls-0.084
ins-0.082
uddled-0.077
Token Tom
Token ID4186
Feature activation+1.57e+00

Pos logit contributions
 ourselves+0.019
uddled+0.019
apy+0.018
mons+0.018
istered+0.018
Neg logit contributions
et-0.011
ah-0.011
 seven-0.009
 2-0.008
ary-0.008
Token asks
Token ID7893
Feature activation-2.72e+00

Pos logit contributions
ah+0.048
et+0.048
wn+0.045
aster+0.044
ary+0.043
Neg logit contributions
 ourselves-0.026
ls-0.025
ins-0.023
sters-0.021
ies-0.021
Token.
Token ID13
Feature activation+9.13e-01

Pos logit contributions
 ourselves+0.069
apy+0.065
uddled+0.063
ls+0.063
mons+0.061
Neg logit contributions
et-0.069
ah-0.067
oscope-0.058
 seven-0.058
aling-0.056
Token
Token ID198
Feature activation+9.28e-02

Pos logit contributions
et+0.030
ah+0.028
oscope+0.025
wn+0.025
alling+0.025
Neg logit contributions
apy-0.018
 ourselves-0.016
Sorry-0.016
We-0.016
ls-0.016
Token asked
Token ID1965
Feature activation-1.55e+00

Pos logit contributions
et+0.016
ah+0.016
 seven+0.014
oscope+0.014
aling+0.013
Neg logit contributions
 ourselves-0.017
apy-0.017
uddled-0.017
ls-0.016
mons-0.016
Token,
Token ID11
Feature activation+1.11e+00

Pos logit contributions
 ourselves+0.041
apy+0.039
uddled+0.038
mons+0.037
ls+0.036
Neg logit contributions
ah-0.037
et-0.037
 seven-0.032
oscope-0.031
aling-0.030
Token "
Token ID366
Feature activation-7.83e-01

Pos logit contributions
ah+0.031
et+0.030
ont+0.026
wn+0.026
oscope+0.025
Neg logit contributions
 ourselves-0.025
uddled-0.025
apy-0.024
Sorry-0.023
mons-0.023
TokenWhat
Token ID2061
Feature activation-1.78e+00

Pos logit contributions
 ourselves+0.015
apy+0.013
uddled+0.013
mons+0.013
Sorry+0.013
Neg logit contributions
et-0.025
ah-0.024
 seven-0.022
oscope-0.021
wn-0.021
Token does
Token ID857
Feature activation+2.61e+00

Pos logit contributions
 ourselves+0.049
uddled+0.047
apy+0.047
mons+0.045
ls+0.044
Neg logit contributions
ah-0.039
et-0.039
 seven-0.033
oscope-0.032
ont-0.031
Token spell
Token ID4822
Feature activation+5.15e+00

Pos logit contributions
et+0.068
ah+0.066
oscope+0.056
aling+0.054
alling+0.053
Neg logit contributions
 ourselves-0.069
apy-0.065
ls-0.058
mons-0.056
iring-0.055
Token mean
Token ID1612
Feature activation+3.06e+00

Pos logit contributions
ah+0.129
et+0.127
 seven+0.111
oscope+0.107
alling+0.104
Neg logit contributions
 ourselves-0.132
apy-0.116
ls-0.113
uddled-0.109
mons-0.107
Token?"
Token ID1701
Feature activation-3.47e-01

Pos logit contributions
et+0.080
ah+0.079
alling+0.071
 seven+0.070
aling+0.070
Neg logit contributions
 ourselves-0.073
mons-0.062
uddled-0.062
ls-0.061
idents-0.060
Token
Token ID198
Feature activation+4.79e-01

Pos logit contributions
 ourselves+0.008
apy+0.008
uddled+0.008
mons+0.008
Sorry+0.008
Neg logit contributions
et-0.009
ah-0.009
 seven-0.008
oscope-0.008
aling-0.007
Token
Token ID198
Feature activation-6.27e-01

Pos logit contributions
et+0.011
ah+0.011
 seven+0.010
oscope+0.009
pole+0.009
Neg logit contributions
 ourselves-0.013
apy-0.012
uddled-0.012
mons-0.012
ls-0.011
TokenHer
Token ID9360
Feature activation+5.37e-01

Pos logit contributions
 ourselves+0.015
ls+0.014
mons+0.014
apy+0.014
uddled+0.013
Neg logit contributions
et-0.016
ah-0.016
 seven-0.015
ary-0.014
oscope-0.014
Token's
Token ID338
Feature activation+2.97e-01

Pos logit contributions
 ourselves+0.014
apy+0.013
uddled+0.013
mons+0.013
ls+0.012
Neg logit contributions
et-0.010
ah-0.010
 seven-0.008
oscope-0.008
pole-0.008
Token a
Token ID257
Feature activation-3.13e-01

Pos logit contributions
ah+0.008
et+0.007
oscope+0.006
 seven+0.006
pole+0.006
Neg logit contributions
 ourselves-0.008
apy-0.007
mons-0.007
uddled-0.006
ls-0.006
Token stream
Token ID4269
Feature activation+1.82e+00

Pos logit contributions
 ourselves+0.008
apy+0.007
uddled+0.007
mons+0.007
ls+0.007
Neg logit contributions
et-0.008
ah-0.008
 seven-0.007
oscope-0.007
pole-0.006
Token!"
Token ID2474
Feature activation+1.11e+00

Pos logit contributions
et+0.044
ah+0.044
 seven+0.038
oscope+0.037
alling+0.035
Neg logit contributions
 ourselves-0.047
apy-0.044
uddled-0.043
ls-0.042
mons-0.042
Token said
Token ID531
Feature activation-2.71e-01

Pos logit contributions
et+0.026
ah+0.025
 seven+0.022
oscope+0.021
pole+0.020
Neg logit contributions
 ourselves-0.030
apy-0.029
uddled-0.028
mons-0.027
ls-0.027
Token Jas
Token ID21961
Feature activation+5.14e+00

Pos logit contributions
 ourselves+0.007
apy+0.007
uddled+0.007
mons+0.007
ls+0.006
Neg logit contributions
et-0.006
ah-0.006
oscope-0.006
aling-0.005
 seven-0.005
Tokenmine
Token ID3810
Feature activation+1.80e+00

Pos logit contributions
ah+0.109
 seven+0.100
et+0.099
wn+0.098
oscope+0.097
Neg logit contributions
uddled-0.126
 ourselves-0.122
ve-0.116
we-0.114
 themselves-0.109
Token.
Token ID13
Feature activation+1.28e+00

Pos logit contributions
ah+0.050
et+0.049
wn+0.045
 seven+0.044
oscope+0.041
Neg logit contributions
 ourselves-0.038
apy-0.037
uddled-0.036
ls-0.036
oles-0.035
Token
Token ID220
Feature activation-2.58e+00

Pos logit contributions
ah+0.024
et+0.024
 seven+0.021
ary+0.020
 2+0.019
Neg logit contributions
 ourselves-0.039
uddled-0.037
apy-0.036
istered-0.036
ls-0.036
Token
Token ID198
Feature activation+9.25e-01

Pos logit contributions
 ourselves+0.075
apy+0.071
uddled+0.071
mons+0.069
Sorry+0.068
Neg logit contributions
et-0.052
ah-0.051
 seven-0.044
pole-0.041
oscope-0.041
Token
Token ID198
Feature activation+4.35e-01

Pos logit contributions
ah+0.019
et+0.019
 seven+0.016
ary+0.015
 2+0.015
Neg logit contributions
 ourselves-0.027
apy-0.026
uddled-0.026
mons-0.025
istered-0.025
Token
Token ID220
Feature activation-1.73e+00

Pos logit contributions
et+0.058
ah+0.052
 seven+0.048
oscope+0.048
wn+0.048
Neg logit contributions
 ourselves-0.035
apy-0.032
We-0.030
iring-0.028
mons-0.028
Token
Token ID198
Feature activation+1.59e+00

Pos logit contributions
 ourselves+0.048
apy+0.046
uddled+0.045
mons+0.044
ls+0.044
Neg logit contributions
et-0.038
ah-0.038
 seven-0.032
oscope-0.031
pole-0.030
Token
Token ID198
Feature activation+1.36e+00

Pos logit contributions
et+0.044
ah+0.042
 seven+0.037
oscope+0.036
pole+0.036
Neg logit contributions
 ourselves-0.038
apy-0.035
uddled-0.034
ls-0.033
mons-0.032
TokenThe
Token ID464
Feature activation+8.59e-01

Pos logit contributions
et+0.041
ah+0.038
oscope+0.035
 seven+0.035
pole+0.034
Neg logit contributions
 ourselves-0.028
apy-0.025
uddled-0.025
Sorry-0.025
We-0.024
Token moral
Token ID6573
Feature activation+1.69e+00

Pos logit contributions
et+0.021
ah+0.020
 seven+0.017
oscope+0.017
pole+0.016
Neg logit contributions
 ourselves-0.023
apy-0.021
uddled-0.021
mons-0.020
ls-0.020
Token of
Token ID286
Feature activation+5.13e+00

Pos logit contributions
et+0.040
ah+0.035
pole+0.031
 seven+0.030
oscope+0.029
Neg logit contributions
 ourselves-0.049
apy-0.045
iring-0.044
We-0.043
ls-0.043
Token this
Token ID428
Feature activation+3.98e+00

Pos logit contributions
et+0.124
ah+0.122
wn+0.114
ooter+0.113
pole+0.111
Neg logit contributions
 ourselves-0.141
 themselves-0.112
apy-0.111
we-0.108
iring-0.106
Token story
Token ID1621
Feature activation+1.73e+00

Pos logit contributions
et+0.077
ah+0.068
 seven+0.062
wn+0.061
oscope+0.060
Neg logit contributions
 ourselves-0.125
apy-0.117
ls-0.116
uddled-0.116
iring-0.111
Token is
Token ID318
Feature activation+2.22e+00

Pos logit contributions
et+0.051
ah+0.050
 seven+0.045
oscope+0.044
pole+0.042
Neg logit contributions
 ourselves-0.037
apy-0.034
uddled-0.033
mons-0.031
ls-0.031
Token that
Token ID326
Feature activation-4.00e-02

Pos logit contributions
et+0.055
ah+0.054
 seven+0.047
oscope+0.046
pole+0.044
Neg logit contributions
 ourselves-0.059
apy-0.055
uddled-0.054
mons-0.052
ls-0.052
Token although
Token ID3584
Feature activation+7.18e-01

Pos logit contributions
uddled+0.001
 ourselves+0.001
mons+0.001
apy+0.001
istered+0.001
Neg logit contributions
et-0.001
ah-0.001
 seven-0.001
oscope-0.001
pole-0.001
Token you
Token ID345
Feature activation+2.16e+00

Pos logit contributions
et+0.016
ah+0.016
 seven+0.015
oscope+0.014
pole+0.014
Neg logit contributions
 ourselves-0.010
apy-0.009
uddled-0.009
mons-0.008
ls-0.008
Token want
Token ID765
Feature activation+2.29e+00

Pos logit contributions
et+0.059
ah+0.058
 seven+0.052
pole+0.050
 2+0.048
Neg logit contributions
 ourselves-0.046
mons-0.038
ve-0.038
apy-0.038
ls-0.037
Token to
Token ID284
Feature activation+5.65e-01

Pos logit contributions
et+0.065
ah+0.062
wn+0.053
 seven+0.051
pole+0.051
Neg logit contributions
 ourselves-0.057
uddled-0.048
apy-0.046
ls-0.045
mons-0.045
Token go
Token ID467
Feature activation+9.36e-01

Pos logit contributions
et+0.012
ah+0.011
 seven+0.010
oscope+0.009
wn+0.009
Neg logit contributions
 ourselves-0.018
apy-0.016
uddled-0.015
ls-0.015
iring-0.015
Token,
Token ID11
Feature activation+6.71e-02

Pos logit contributions
et+0.027
ah+0.026
 seven+0.023
wn+0.023
oscope+0.022
Neg logit contributions
 ourselves-0.021
ls-0.017
uddled-0.017
apy-0.017
iring-0.017
Token Blank
Token ID31990
Feature activation+5.12e+00

Pos logit contributions
et+0.002
ah+0.002
wn+0.002
aling+0.002
oscope+0.002
Neg logit contributions
 ourselves-0.001
apy-0.001
iring-0.001
ls-0.001
We-0.001
Tokenet
Token ID316
Feature activation+2.33e+00

Pos logit contributions
alling+0.130
 seven+0.129
 2+0.125
ooter+0.124
aling+0.124
Neg logit contributions
 ourselves-0.102
ies-0.096
ins-0.086
we-0.079
uddled-0.076
Token?"
Token ID1701
Feature activation-3.61e-01

Pos logit contributions
et+0.054
 seven+0.049
ah+0.049
wn+0.048
alling+0.046
Neg logit contributions
 ourselves-0.058
uddled-0.054
apy-0.053
mons-0.051
ls-0.051
Token
Token ID220
Feature activation-1.96e+00

Pos logit contributions
 ourselves+0.009
apy+0.008
mons+0.008
uddled+0.008
We+0.008
Neg logit contributions
et-0.010
ah-0.009
wn-0.008
ont-0.008
 seven-0.008
Token
Token ID198
Feature activation+4.26e-01

Pos logit contributions
 ourselves+0.055
apy+0.051
uddled+0.051
mons+0.050
ls+0.049
Neg logit contributions
et-0.043
ah-0.042
 seven-0.036
oscope-0.035
pole-0.034
TokenThe
Token ID464
Feature activation+2.33e-01

Pos logit contributions
et+0.013
ah+0.012
 seven+0.011
wn+0.011
oscope+0.011
Neg logit contributions
 ourselves-0.009
apy-0.008
uddled-0.008
Sorry-0.008
We-0.007
Token
Token ID220
Feature activation-1.90e+00

Pos logit contributions
et+0.027
ah+0.026
oscope+0.024
 seven+0.022
aster+0.022
Neg logit contributions
 ourselves-0.030
apy-0.027
uddled-0.026
mons-0.026
ls-0.026
Token
Token ID198
Feature activation+1.36e+00

Pos logit contributions
 ourselves+0.054
apy+0.051
uddled+0.051
mons+0.049
ls+0.049
Neg logit contributions
et-0.040
ah-0.040
 seven-0.034
oscope-0.032
pole-0.031
Token
Token ID198
Feature activation+1.25e+00

Pos logit contributions
et+0.035
ah+0.034
 seven+0.029
oscope+0.029
pole+0.028
Neg logit contributions
 ourselves-0.034
apy-0.032
uddled-0.031
ls-0.030
mons-0.030
TokenThe
Token ID464
Feature activation+5.66e-01

Pos logit contributions
et+0.035
ah+0.035
 seven+0.031
oscope+0.030
pole+0.029
Neg logit contributions
 ourselves-0.028
apy-0.026
uddled-0.026
mons-0.025
ls-0.024
Token moral
Token ID6573
Feature activation+1.47e+00

Pos logit contributions
et+0.014
ah+0.014
 seven+0.012
oscope+0.012
pole+0.011
Neg logit contributions
 ourselves-0.015
apy-0.014
uddled-0.014
mons-0.013
ls-0.013
Token of
Token ID286
Feature activation+5.11e+00

Pos logit contributions
et+0.031
ah+0.028
pole+0.024
 seven+0.023
oscope+0.023
Neg logit contributions
 ourselves-0.045
apy-0.043
uddled-0.041
ls-0.041
mons-0.040
Token the
Token ID262
Feature activation+4.08e+00

Pos logit contributions
et+0.124
ah+0.123
ooter+0.111
wn+0.110
pole+0.107
Neg logit contributions
 ourselves-0.141
 themselves-0.115
apy-0.110
iring-0.106
we-0.106
Token story
Token ID1621
Feature activation+1.17e+00

Pos logit contributions
et+0.087
ah+0.081
pole+0.068
wn+0.068
 seven+0.067
Neg logit contributions
 ourselves-0.124
ls-0.111
uddled-0.106
iring-0.106
apy-0.104
Token is
Token ID318
Feature activation+2.57e+00

Pos logit contributions
et+0.032
ah+0.032
 seven+0.028
oscope+0.027
pole+0.026
Neg logit contributions
 ourselves-0.027
apy-0.026
uddled-0.025
mons-0.024
istered-0.023
Token this
Token ID428
Feature activation+1.89e+00

Pos logit contributions
et+0.052
ah+0.051
 seven+0.043
ary+0.041
oscope+0.041
Neg logit contributions
 ourselves-0.075
apy-0.073
uddled-0.072
mons-0.070
istered-0.069
Token:
Token ID25
Feature activation+1.10e+00

Pos logit contributions
et+0.047
ah+0.045
 seven+0.040
oscope+0.039
pole+0.038
Neg logit contributions
 ourselves-0.050
apy-0.046
uddled-0.046
ls-0.044
mons-0.044
Token you
Token ID345
Feature activation+9.32e-01

Pos logit contributions
uddled+0.052
apy+0.051
 ourselves+0.050
mons+0.049
istered+0.047
Neg logit contributions
et-0.069
ah-0.068
 seven-0.063
oscope-0.059
aster-0.058
Token,
Token ID11
Feature activation+1.89e+00

Pos logit contributions
et+0.023
ah+0.022
oscope+0.019
aling+0.019
alling+0.019
Neg logit contributions
 ourselves-0.025
apy-0.022
uddled-0.022
ls-0.022
iring-0.021
Token doctor
Token ID6253
Feature activation+1.03e+00

Pos logit contributions
et+0.043
ah+0.042
oscope+0.035
 seven+0.034
wn+0.033
Neg logit contributions
 ourselves-0.055
apy-0.051
uddled-0.049
ls-0.048
Sorry-0.047
Token.
Token ID13
Feature activation-7.09e-01

Pos logit contributions
et+0.025
ah+0.024
 seven+0.021
oscope+0.020
alling+0.020
Neg logit contributions
 ourselves-0.027
uddled-0.025
apy-0.024
mons-0.024
ls-0.024
Token How
Token ID1374
Feature activation-2.22e+00

Pos logit contributions
 ourselves+0.019
apy+0.018
uddled+0.018
mons+0.017
ls+0.017
Neg logit contributions
et-0.017
ah-0.016
 seven-0.014
oscope-0.014
pole-0.013
Token long
Token ID890
Feature activation+5.08e+00

Pos logit contributions
 ourselves+0.070
apy+0.067
uddled+0.067
mons+0.065
ls+0.064
Neg logit contributions
et-0.042
ah-0.040
 seven-0.034
oscope-0.033
pole-0.031
Token will
Token ID481
Feature activation+2.49e-01

Pos logit contributions
et+0.119
ah+0.119
 seven+0.103
pole+0.100
alling+0.100
Neg logit contributions
 ourselves-0.137
apy-0.125
uddled-0.123
ls-0.122
mons-0.120
Token she
Token ID673
Feature activation+2.73e+00

Pos logit contributions
et+0.008
ah+0.007
 seven+0.007
oscope+0.007
aster+0.006
Neg logit contributions
 ourselves-0.005
uddled-0.005
apy-0.005
mons-0.004
ls-0.004
Token be
Token ID307
Feature activation+1.16e+00

Pos logit contributions
et+0.060
ah+0.060
 seven+0.051
oscope+0.048
pole+0.047
Neg logit contributions
 ourselves-0.077
apy-0.072
uddled-0.072
mons-0.069
ls-0.068
Token sick
Token ID6639
Feature activation+5.26e-01

Pos logit contributions
ah+0.030
et+0.030
 seven+0.025
oscope+0.025
pole+0.024
Neg logit contributions
 ourselves-0.029
apy-0.027
uddled-0.026
mons-0.026
ls-0.025
Token?"
Token ID1701
Feature activation+1.81e-01

Pos logit contributions
ah+0.013
et+0.013
 seven+0.011
pole+0.011
wn+0.011
Neg logit contributions
 ourselves-0.014
uddled-0.012
apy-0.012
mons-0.012
Sorry-0.012
Token want
Token ID765
Feature activation+2.78e+00

Pos logit contributions
et+0.087
ah+0.081
 seven+0.074
pole+0.073
oscope+0.072
Neg logit contributions
 ourselves-0.061
apy-0.050
We-0.047
istered-0.045
ve-0.045
Token to
Token ID284
Feature activation+1.27e+00

Pos logit contributions
et+0.067
ah+0.064
 seven+0.053
pole+0.053
aling+0.053
Neg logit contributions
 ourselves-0.079
apy-0.071
uddled-0.068
ls-0.067
Sorry-0.066
Token give
Token ID1577
Feature activation+4.85e-01

Pos logit contributions
et+0.029
ah+0.028
 seven+0.024
oscope+0.023
wn+0.023
Neg logit contributions
 ourselves-0.037
apy-0.033
uddled-0.031
Sorry-0.031
ls-0.031
Token it
Token ID340
Feature activation+1.75e+00

Pos logit contributions
et+0.011
ah+0.011
 seven+0.010
oscope+0.009
pole+0.009
Neg logit contributions
 ourselves-0.013
apy-0.013
uddled-0.013
mons-0.012
ls-0.012
Token a
Token ID257
Feature activation+1.85e+00

Pos logit contributions
et+0.044
ah+0.043
oscope+0.037
pole+0.036
aling+0.036
Neg logit contributions
 ourselves-0.046
apy-0.040
uddled-0.039
Sorry-0.038
ls-0.037
Token try
Token ID1949
Feature activation+5.08e+00

Pos logit contributions
et+0.057
aling+0.044
ah+0.044
aled+0.043
alling+0.043
Neg logit contributions
apy-0.040
 ourselves-0.037
Sorry-0.034
idents-0.033
ls-0.032
Token?"
Token ID1701
Feature activation+2.99e-01

Pos logit contributions
et+0.150
ah+0.140
aled+0.135
aling+0.135
oscope+0.134
Neg logit contributions
 ourselves-0.094
uddled-0.077
ls-0.074
apy-0.071
ies-0.071
Token
Token ID198
Feature activation+4.16e-01

Pos logit contributions
et+0.008
ah+0.008
 seven+0.007
oscope+0.006
wn+0.006
Neg logit contributions
 ourselves-0.008
apy-0.007
uddled-0.007
mons-0.007
ls-0.007
Token
Token ID198
Feature activation-6.03e-01

Pos logit contributions
et+0.009
ah+0.009
 seven+0.008
oscope+0.008
pole+0.007
Neg logit contributions
 ourselves-0.012
apy-0.011
uddled-0.011
mons-0.011
ls-0.010
TokenThe
Token ID464
Feature activation+5.43e-01

Pos logit contributions
 ourselves+0.015
uddled+0.015
mons+0.014
ls+0.014
apy+0.014
Neg logit contributions
ah-0.015
et-0.014
 seven-0.013
oscope-0.012
pole-0.012
Token baby
Token ID5156
Feature activation-1.49e+00

Pos logit contributions
et+0.015
ah+0.015
 seven+0.014
oscope+0.013
 2+0.013
Neg logit contributions
 ourselves-0.012
apy-0.011
uddled-0.011
mons-0.011
ls-0.011
Token keep
Token ID1394
Feature activation+1.10e+00

Pos logit contributions
 ourselves+0.048
apy+0.045
uddled+0.045
mons+0.043
ls+0.043
Neg logit contributions
et-0.039
ah-0.038
 seven-0.033
oscope-0.031
pole-0.030
Token it
Token ID340
Feature activation-1.12e+00

Pos logit contributions
et+0.028
ah+0.027
 seven+0.024
oscope+0.023
pole+0.022
Neg logit contributions
 ourselves-0.028
apy-0.026
uddled-0.026
mons-0.025
ls-0.025
Token as
Token ID355
Feature activation-7.10e-01

Pos logit contributions
 ourselves+0.032
apy+0.029
uddled+0.028
ls+0.028
mons+0.027
Neg logit contributions
et-0.026
ah-0.026
 seven-0.022
oscope-0.021
aling-0.020
Token a
Token ID257
Feature activation+7.11e-01

Pos logit contributions
 ourselves+0.019
uddled+0.019
apy+0.018
Sorry+0.017
mons+0.017
Neg logit contributions
et-0.016
ah-0.016
 seven-0.014
oscope-0.013
ary-0.013
Token souven
Token ID45043
Feature activation-3.56e-01

Pos logit contributions
et+0.018
ah+0.017
 seven+0.014
oscope+0.014
aling+0.014
Neg logit contributions
 ourselves-0.018
apy-0.017
uddled-0.017
ls-0.016
mons-0.016
Tokenir
Token ID343
Feature activation+5.04e+00

Pos logit contributions
 ourselves+0.009
apy+0.009
uddled+0.008
ls+0.008
mons+0.008
Neg logit contributions
et-0.009
ah-0.009
 seven-0.008
oscope-0.008
pole-0.007
Token of
Token ID286
Feature activation+1.90e+00

Pos logit contributions
ah+0.109
et+0.107
 seven+0.101
 2+0.096
 7+0.092
Neg logit contributions
 ourselves-0.141
ls-0.122
ies-0.119
oles-0.116
iring-0.114
Token your
Token ID534
Feature activation+2.75e-01

Pos logit contributions
ah+0.055
et+0.052
oscope+0.045
pole+0.045
 seven+0.044
Neg logit contributions
 ourselves-0.047
apy-0.039
 themselves-0.036
mons-0.036
 yourselves-0.035
Token adventure
Token ID8855
Feature activation+8.99e-01

Pos logit contributions
et+0.007
ah+0.007
 seven+0.006
oscope+0.006
pole+0.005
Neg logit contributions
 ourselves-0.007
apy-0.007
uddled-0.007
mons-0.006
ls-0.006
Token.
Token ID13
Feature activation+9.50e-01

Pos logit contributions
ah+0.025
et+0.024
 seven+0.021
oscope+0.021
alling+0.021
Neg logit contributions
 ourselves-0.022
ls-0.019
apy-0.018
mons-0.018
uddled-0.018
Token But
Token ID887
Feature activation+5.03e-01

Pos logit contributions
et+0.025
ah+0.025
oscope+0.020
 seven+0.020
pole+0.020
Neg logit contributions
 ourselves-0.025
apy-0.022
mons-0.021
Sorry-0.020
We-0.020
Token.
Token ID13
Feature activation+1.70e+00

Pos logit contributions
 ourselves+0.007
apy+0.006
uddled+0.006
ls+0.006
mons+0.006
Neg logit contributions
et-0.006
ah-0.006
 seven-0.005
oscope-0.005
pole-0.005
Token
Token ID198
Feature activation+2.08e+00

Pos logit contributions
et+0.048
ah+0.043
oscope+0.038
 seven+0.036
pole+0.036
Neg logit contributions
 ourselves-0.042
apy-0.039
uddled-0.037
iring-0.036
We-0.036
Token
Token ID198
Feature activation+1.64e+00

Pos logit contributions
et+0.056
ah+0.053
oscope+0.046
 seven+0.045
pole+0.044
Neg logit contributions
 ourselves-0.051
apy-0.049
uddled-0.048
ls-0.045
Sorry-0.044
TokenThe
Token ID464
Feature activation+1.08e+00

Pos logit contributions
et+0.050
ah+0.048
oscope+0.043
 seven+0.042
pole+0.041
Neg logit contributions
 ourselves-0.034
uddled-0.031
apy-0.031
Sorry-0.031
We-0.029
Token moral
Token ID6573
Feature activation+1.93e+00

Pos logit contributions
et+0.026
ah+0.026
 seven+0.023
oscope+0.022
pole+0.021
Neg logit contributions
 ourselves-0.028
apy-0.027
uddled-0.026
mons-0.026
ls-0.025
Token of
Token ID286
Feature activation+5.03e+00

Pos logit contributions
et+0.046
ah+0.041
pole+0.037
oscope+0.035
 seven+0.033
Neg logit contributions
 ourselves-0.056
apy-0.051
iring-0.050
uddled-0.050
We-0.049
Token the
Token ID262
Feature activation+3.69e+00

Pos logit contributions
et+0.138
ah+0.135
pole+0.128
wn+0.127
ooter+0.120
Neg logit contributions
 ourselves-0.123
 themselves-0.099
apy-0.091
we-0.089
 yourselves-0.088
Token story
Token ID1621
Feature activation+1.64e+00

Pos logit contributions
et+0.078
ah+0.073
pole+0.061
wn+0.060
 seven+0.058
Neg logit contributions
 ourselves-0.112
ls-0.101
uddled-0.100
apy-0.099
Sorry-0.097
Token is
Token ID318
Feature activation+2.93e+00

Pos logit contributions
et+0.050
ah+0.049
 seven+0.044
oscope+0.044
pole+0.042
Neg logit contributions
 ourselves-0.033
apy-0.030
uddled-0.029
ls-0.028
mons-0.028
Token that
Token ID326
Feature activation+3.06e-01

Pos logit contributions
et+0.062
ah+0.061
 seven+0.052
oscope+0.050
pole+0.048
Neg logit contributions
 ourselves-0.085
apy-0.081
uddled-0.080
mons-0.078
ls-0.077
Token it
Token ID340
Feature activation+7.35e-01

Pos logit contributions
et+0.007
ah+0.007
 seven+0.006
oscope+0.006
aling+0.006
Neg logit contributions
 ourselves-0.008
uddled-0.008
apy-0.008
mons-0.008
ls-0.007
Token"
Token ID1
Feature activation-3.00e+00

Pos logit contributions
uddled+0.028
idents+0.028
mons+0.027
ls+0.027
 ourselves+0.027
Neg logit contributions
 seven-0.027
et-0.026
ah-0.025
aster-0.024
 2-0.024
TokenAre
Token ID8491
Feature activation-3.44e+00

Pos logit contributions
 ourselves+0.071
apy+0.068
uddled+0.066
ls+0.065
iring+0.063
Neg logit contributions
ah-0.080
et-0.080
 seven-0.073
oscope-0.070
pole-0.065
Token you
Token ID345
Feature activation+1.85e-01

Pos logit contributions
uddled+0.093
apy+0.090
istered+0.089
mons+0.086
ins+0.085
Neg logit contributions
et-0.072
ah-0.072
 seven-0.071
 2-0.063
ary-0.063
Token OK
Token ID7477
Feature activation-1.80e-01

Pos logit contributions
et+0.005
ah+0.005
 seven+0.004
oscope+0.004
pole+0.004
Neg logit contributions
 ourselves-0.005
apy-0.005
uddled-0.004
mons-0.004
ls-0.004
Token,
Token ID11
Feature activation+9.40e-01

Pos logit contributions
 ourselves+0.005
apy+0.005
uddled+0.005
ls+0.005
mons+0.005
Neg logit contributions
et-0.004
ah-0.004
 seven-0.003
oscope-0.003
wn-0.003
Token Lily
Token ID20037
Feature activation+5.03e+00

Pos logit contributions
et+0.026
ah+0.025
oscope+0.022
wn+0.021
 seven+0.021
Neg logit contributions
 ourselves-0.022
apy-0.020
ls-0.020
uddled-0.020
mons-0.019
Token?"
Token ID1701
Feature activation-2.47e-02

Pos logit contributions
et+0.124
ah+0.120
wn+0.113
alling+0.111
 seven+0.108
Neg logit contributions
 ourselves-0.119
ls-0.112
uddled-0.111
mons-0.107
ins-0.106
Token Tom
Token ID4186
Feature activation+1.89e+00

Pos logit contributions
 ourselves+0.001
uddled+0.001
apy+0.001
mons+0.001
ls+0.001
Neg logit contributions
et-0.000
ah-0.000
 seven-0.000
oscope-0.000
aster-0.000
Token asked
Token ID1965
Feature activation-2.43e+00

Pos logit contributions
wn+0.061
et+0.061
ah+0.059
aster+0.056
alling+0.056
Neg logit contributions
ls-0.025
ins-0.023
ies-0.023
 ourselves-0.023
s-0.023
Token.
Token ID13
Feature activation+7.43e-01

Pos logit contributions
 ourselves+0.070
apy+0.066
uddled+0.065
mons+0.063
ls+0.062
Neg logit contributions
et-0.053
ah-0.052
 seven-0.045
oscope-0.043
pole-0.040
Token
Token ID198
Feature activation+2.03e-02

Pos logit contributions
et+0.024
ah+0.022
wn+0.020
aling+0.020
alling+0.020
Neg logit contributions
apy-0.015
 ourselves-0.014
Sorry-0.013
uddled-0.013
ls-0.013
Token."
Token ID526
Feature activation+5.76e-01

Pos logit contributions
ah+0.001
et+0.001
 seven+0.001
oscope+0.001
alling+0.001
Neg logit contributions
 ourselves-0.001
apy-0.001
uddled-0.001
ls-0.001
mons-0.001
Token
Token ID198
Feature activation+2.06e+00

Pos logit contributions
et+0.014
ah+0.013
oscope+0.012
 seven+0.011
wn+0.011
Neg logit contributions
 ourselves-0.016
apy-0.015
uddled-0.015
mons-0.014
ls-0.014
Token
Token ID198
Feature activation+1.31e+00

Pos logit contributions
et+0.052
ah+0.050
 seven+0.044
oscope+0.043
pole+0.042
Neg logit contributions
 ourselves-0.053
apy-0.050
uddled-0.049
ls-0.047
mons-0.047
TokenThe
Token ID464
Feature activation+6.29e-01

Pos logit contributions
et+0.034
ah+0.033
 seven+0.029
oscope+0.028
pole+0.027
Neg logit contributions
 ourselves-0.033
apy-0.031
uddled-0.030
mons-0.029
ls-0.029
Token moral
Token ID6573
Feature activation+1.75e+00

Pos logit contributions
et+0.014
ah+0.013
 seven+0.012
oscope+0.011
pole+0.010
Neg logit contributions
 ourselves-0.018
apy-0.017
uddled-0.017
mons-0.017
ls-0.016
Token of
Token ID286
Feature activation+5.02e+00

Pos logit contributions
et+0.039
ah+0.035
pole+0.031
 seven+0.030
oscope+0.029
Neg logit contributions
 ourselves-0.050
apy-0.048
iring-0.046
ls-0.046
uddled-0.046
Token the
Token ID262
Feature activation+4.26e+00

Pos logit contributions
et+0.118
ah+0.116
wn+0.108
ooter+0.104
oscope+0.101
Neg logit contributions
 ourselves-0.142
 themselves-0.118
apy-0.111
iring-0.110
uddled-0.106
Token story
Token ID1621
Feature activation+1.05e+00

Pos logit contributions
et+0.090
ah+0.083
wn+0.071
oscope+0.071
pole+0.069
Neg logit contributions
 ourselves-0.129
ls-0.118
iring-0.113
uddled-0.112
apy-0.109
Token is
Token ID318
Feature activation+2.39e+00

Pos logit contributions
et+0.030
ah+0.029
 seven+0.026
oscope+0.025
pole+0.024
Neg logit contributions
 ourselves-0.023
apy-0.021
uddled-0.021
mons-0.020
ls-0.020
Token that
Token ID326
Feature activation+7.36e-02

Pos logit contributions
et+0.057
ah+0.056
 seven+0.048
oscope+0.047
pole+0.045
Neg logit contributions
 ourselves-0.065
apy-0.061
uddled-0.060
mons-0.058
ls-0.058
Token it
Token ID340
Feature activation+4.40e-01

Pos logit contributions
et+0.002
ah+0.002
oscope+0.001
 seven+0.001
pole+0.001
Neg logit contributions
uddled-0.002
 ourselves-0.002
mons-0.002
apy-0.002
istered-0.002
Token the
Token ID262
Feature activation-6.51e-01

Pos logit contributions
 ourselves+0.014
apy+0.014
uddled+0.013
mons+0.013
ls+0.013
Neg logit contributions
et-0.010
ah-0.010
 seven-0.008
oscope-0.008
pole-0.008
Token sea
Token ID5417
Feature activation-6.07e-02

Pos logit contributions
 ourselves+0.017
apy+0.016
uddled+0.016
mons+0.015
ls+0.015
Neg logit contributions
et-0.016
ah-0.016
 seven-0.014
oscope-0.013
pole-0.013
Token and
Token ID290
Feature activation+3.02e-01

Pos logit contributions
 ourselves+0.002
apy+0.002
uddled+0.002
mons+0.001
ls+0.001
Neg logit contributions
et-0.001
ah-0.001
 seven-0.001
oscope-0.001
aling-0.001
Token they
Token ID484
Feature activation-2.25e+00

Pos logit contributions
et+0.008
ah+0.008
 seven+0.007
 2+0.006
pole+0.006
Neg logit contributions
 ourselves-0.007
uddled-0.007
apy-0.007
mons-0.007
istered-0.007
Token watched
Token ID7342
Feature activation-2.43e+00

Pos logit contributions
 ourselves+0.056
apy+0.052
uddled+0.052
mons+0.051
ls+0.049
Neg logit contributions
et-0.058
ah-0.057
 seven-0.050
oscope-0.048
pole-0.047
Token how
Token ID703
Feature activation-5.54e+00

Pos logit contributions
 ourselves+0.063
apy+0.059
uddled+0.056
ls+0.055
mons+0.054
Neg logit contributions
et-0.061
ah-0.060
oscope-0.052
 seven-0.051
aling-0.049
Token they
Token ID484
Feature activation-1.33e+00

Pos logit contributions
 ourselves+0.153
mons+0.147
uddled+0.147
istered+0.144
apy+0.137
Neg logit contributions
et-0.118
ah-0.116
 seven-0.111
 2-0.097
ary-0.092
Token moved
Token ID3888
Feature activation+2.54e-01

Pos logit contributions
 ourselves+0.031
apy+0.029
uddled+0.028
mons+0.028
ls+0.027
Neg logit contributions
et-0.037
ah-0.036
 seven-0.032
oscope-0.031
pole-0.030
Token through
Token ID832
Feature activation-5.64e-01

Pos logit contributions
et+0.006
ah+0.006
oscope+0.005
 seven+0.005
aster+0.005
Neg logit contributions
 ourselves-0.007
apy-0.006
ls-0.006
uddled-0.006
mons-0.006
Token the
Token ID262
Feature activation-8.45e-01

Pos logit contributions
 ourselves+0.017
apy+0.016
uddled+0.015
mons+0.015
ls+0.015
Neg logit contributions
et-0.012
ah-0.012
 seven-0.010
oscope-0.010
pole-0.009
Token water
Token ID1660
Feature activation-6.03e-02

Pos logit contributions
 ourselves+0.021
apy+0.020
uddled+0.020
mons+0.019
ls+0.019
Neg logit contributions
et-0.021
ah-0.021
 seven-0.018
oscope-0.018
pole-0.017
Token the
Token ID262
Feature activation+2.54e-01

Pos logit contributions
ah+0.037
et+0.036
 seven+0.030
oscope+0.029
wn+0.028
Neg logit contributions
 ourselves-0.042
apy-0.039
uddled-0.038
ls-0.037
mons-0.036
Token noise
Token ID7838
Feature activation-3.77e-01

Pos logit contributions
et+0.007
ah+0.006
 seven+0.005
oscope+0.005
pole+0.005
Neg logit contributions
 ourselves-0.006
apy-0.006
ls-0.006
uddled-0.006
mons-0.005
Token they
Token ID484
Feature activation-5.91e-01

Pos logit contributions
 ourselves+0.009
apy+0.009
uddled+0.009
ls+0.009
iring+0.009
Neg logit contributions
ah-0.009
et-0.009
 seven-0.008
oscope-0.008
wn-0.007
Token made
Token ID925
Feature activation+2.53e-01

Pos logit contributions
 ourselves+0.014
apy+0.013
uddled+0.013
ls+0.012
mons+0.012
Neg logit contributions
et-0.016
ah-0.015
 seven-0.014
oscope-0.013
aling-0.013
Token and
Token ID290
Feature activation+2.61e-01

Pos logit contributions
ah+0.006
et+0.006
 seven+0.005
oscope+0.005
aling+0.005
Neg logit contributions
 ourselves-0.007
apy-0.006
uddled-0.006
ls-0.006
mons-0.006
Token how
Token ID703
Feature activation-5.34e+00

Pos logit contributions
et+0.006
ah+0.006
 seven+0.005
oscope+0.005
pole+0.005
Neg logit contributions
 ourselves-0.007
uddled-0.006
apy-0.006
mons-0.006
ls-0.006
Token wild
Token ID4295
Feature activation+4.73e-02

Pos logit contributions
 ourselves+0.154
mons+0.146
uddled+0.145
istered+0.135
apy+0.133
Neg logit contributions
et-0.107
ah-0.105
 seven-0.097
 2-0.085
ary-0.085
Token it
Token ID340
Feature activation-4.36e-01

Pos logit contributions
et+0.001
ah+0.001
 seven+0.001
oscope+0.001
pole+0.001
Neg logit contributions
 ourselves-0.001
apy-0.001
uddled-0.001
mons-0.001
ls-0.001
Token all
Token ID477
Feature activation+8.86e-01

Pos logit contributions
 ourselves+0.011
apy+0.011
uddled+0.010
mons+0.010
ls+0.010
Neg logit contributions
et-0.011
ah-0.011
 seven-0.009
oscope-0.009
pole-0.009
Token was
Token ID373
Feature activation+2.61e-01

Pos logit contributions
ah+0.020
et+0.019
 seven+0.018
oscope+0.017
alling+0.016
Neg logit contributions
 ourselves-0.025
ls-0.023
apy-0.022
uddled-0.021
iring-0.021
Token.
Token ID13
Feature activation+2.45e-01

Pos logit contributions
ah+0.007
et+0.007
 seven+0.006
aster+0.006
oscope+0.006
Neg logit contributions
 ourselves-0.006
ls-0.005
uddled-0.005
apy-0.005
iring-0.005
TokenThey
Token ID2990
Feature activation-1.39e+00

Pos logit contributions
 ourselves+0.024
apy+0.022
uddled+0.022
mons+0.021
ls+0.021
Neg logit contributions
et-0.026
ah-0.025
 seven-0.022
oscope-0.022
pole-0.021
Token watched
Token ID7342
Feature activation-2.58e+00

Pos logit contributions
 ourselves+0.034
apy+0.033
uddled+0.031
ls+0.030
mons+0.030
Neg logit contributions
ah-0.033
et-0.033
oscope-0.029
 seven-0.029
pole-0.028
Token the
Token ID262
Feature activation-8.80e-01

Pos logit contributions
 ourselves+0.073
apy+0.069
uddled+0.066
ls+0.063
mons+0.063
Neg logit contributions
ah-0.060
et-0.059
 seven-0.050
oscope-0.049
aling-0.047
Token rain
Token ID6290
Feature activation-6.02e-01

Pos logit contributions
 ourselves+0.024
apy+0.023
uddled+0.023
mons+0.022
ls+0.022
Neg logit contributions
et-0.020
ah-0.019
 seven-0.017
oscope-0.016
pole-0.015
Token and
Token ID290
Feature activation-8.43e-01

Pos logit contributions
 ourselves+0.016
apy+0.015
uddled+0.015
mons+0.015
ls+0.014
Neg logit contributions
ah-0.014
et-0.014
 seven-0.012
oscope-0.012
aling-0.011
Token how
Token ID703
Feature activation-5.32e+00

Pos logit contributions
 ourselves+0.023
apy+0.022
uddled+0.022
mons+0.021
ls+0.021
Neg logit contributions
et-0.019
ah-0.019
 seven-0.016
oscope-0.016
pole-0.015
Token it
Token ID340
Feature activation-1.31e+00

Pos logit contributions
 ourselves+0.152
mons+0.143
sters+0.143
uddled+0.139
istered+0.139
Neg logit contributions
et-0.112
ah-0.104
 seven-0.096
 time-0.084
ary-0.084
Token made
Token ID925
Feature activation-9.15e-01

Pos logit contributions
 ourselves+0.037
apy+0.034
uddled+0.034
mons+0.033
ls+0.032
Neg logit contributions
et-0.030
ah-0.029
 seven-0.025
oscope-0.024
pole-0.023
Token p
Token ID279
Feature activation-2.45e+00

Pos logit contributions
 ourselves+0.024
apy+0.023
uddled+0.023
mons+0.022
ls+0.022
Neg logit contributions
et-0.022
ah-0.022
 seven-0.019
oscope-0.018
pole-0.017
Tokenudd
Token ID4185
Feature activation+2.17e-01

Pos logit contributions
 ourselves+0.051
Sorry+0.045
apy+0.044
mons+0.043
We+0.042
Neg logit contributions
et-0.072
ah-0.068
ary-0.058
ag-0.058
aling-0.057
Tokenles
Token ID829
Feature activation-1.06e+00

Pos logit contributions
et+0.005
ah+0.004
ock+0.004
 seven+0.004
ag+0.004
Neg logit contributions
 ourselves-0.006
apy-0.006
istered-0.006
Sorry-0.006
mons-0.006
Token on
Token ID319
Feature activation+2.44e+00

Pos logit contributions
apy+0.013
 ourselves+0.012
ls+0.012
Sorry+0.011
uddled+0.011
Neg logit contributions
et-0.016
ah-0.015
oscope-0.015
pole-0.014
ont-0.013
Token,
Token ID11
Feature activation-6.27e-01

Pos logit contributions
et+0.049
ah+0.048
 seven+0.040
oscope+0.039
pole+0.036
Neg logit contributions
 ourselves-0.075
apy-0.071
uddled-0.070
mons-0.068
ls-0.067
Token Lily
Token ID20037
Feature activation+2.01e-01

Pos logit contributions
 ourselves+0.018
apy+0.017
uddled+0.017
mons+0.016
ls+0.016
Neg logit contributions
et-0.014
ah-0.014
 seven-0.012
oscope-0.011
pole-0.011
Token and
Token ID290
Feature activation+1.09e+00

Pos logit contributions
et+0.004
ah+0.004
 seven+0.004
oscope+0.004
pole+0.004
Neg logit contributions
 ourselves-0.005
uddled-0.005
apy-0.005
mons-0.005
ls-0.005
Token the
Token ID262
Feature activation+3.76e-01

Pos logit contributions
et+0.021
ah+0.020
 seven+0.017
oscope+0.016
aster+0.015
Neg logit contributions
 ourselves-0.034
apy-0.033
uddled-0.033
mons-0.032
istered-0.031
Token puppy
Token ID26188
Feature activation-5.26e+00

Pos logit contributions
ah+0.008
et+0.007
 seven+0.007
 time+0.006
oscope+0.006
Neg logit contributions
apy-0.011
 ourselves-0.011
mons-0.010
uddled-0.010
ins-0.010
Token were
Token ID547
Feature activation-1.25e+00

Pos logit contributions
 ourselves+0.126
apy+0.117
uddled+0.114
mons+0.110
ls+0.109
Neg logit contributions
et-0.142
ah-0.139
 seven-0.122
oscope-0.119
pole-0.113
Token best
Token ID1266
Feature activation-1.63e+00

Pos logit contributions
 ourselves+0.030
apy+0.028
uddled+0.028
mons+0.027
ls+0.026
Neg logit contributions
et-0.034
ah-0.033
 seven-0.029
oscope-0.028
pole-0.027
Token friends
Token ID2460
Feature activation-6.34e-01

Pos logit contributions
 ourselves+0.037
idents+0.034
istered+0.031
Sorry+0.031
mons+0.031
Neg logit contributions
ah-0.044
et-0.041
 seven-0.038
ag-0.037
 about-0.037
Token and
Token ID290
Feature activation-2.33e+00

Pos logit contributions
 ourselves+0.018
apy+0.017
uddled+0.016
mons+0.016
ls+0.016
Neg logit contributions
et-0.014
ah-0.014
 seven-0.012
oscope-0.012
pole-0.011
Token they
Token ID484
Feature activation-2.84e+00

Pos logit contributions
 ourselves+0.065
uddled+0.061
apy+0.059
iring+0.059
ls+0.059
Neg logit contributions
ah-0.054
et-0.052
 seven-0.045
oscope-0.042
ary-0.041
Token One
Token ID1881
Feature activation+6.24e-01

Pos logit contributions
 ourselves+0.007
apy+0.007
uddled+0.007
ls+0.007
mons+0.007
Neg logit contributions
et-0.005
ah-0.005
 seven-0.004
oscope-0.004
aling-0.004
Token day
Token ID1110
Feature activation-2.63e+00

Pos logit contributions
et+0.012
ah+0.011
 seven+0.011
alling+0.010
aling+0.010
Neg logit contributions
apy-0.018
 ourselves-0.018
uddled-0.017
mons-0.017
ls-0.017
Token,
Token ID11
Feature activation-1.32e+00

Pos logit contributions
 ourselves+0.108
iring+0.100
uddled+0.098
istered+0.098
mons+0.097
Neg logit contributions
ah-0.025
et-0.018
 time-0.018
 seven-0.017
 One-0.013
Token Lily
Token ID20037
Feature activation-1.40e-01

Pos logit contributions
 ourselves+0.044
uddled+0.041
ins+0.040
istered+0.040
sters+0.040
Neg logit contributions
ah-0.022
et-0.019
 seven-0.019
 One-0.017
 time-0.017
Token and
Token ID290
Feature activation+2.39e-01

Pos logit contributions
 ourselves+0.005
apy+0.005
mons+0.004
iring+0.004
istered+0.004
Neg logit contributions
et-0.002
ah-0.002
oscope-0.002
 seven-0.002
ont-0.002
Token Fl
Token ID1610
Feature activation-5.20e+00

Pos logit contributions
ah+0.006
et+0.006
oscope+0.005
 seven+0.005
aster+0.004
Neg logit contributions
 ourselves-0.006
uddled-0.006
apy-0.006
mons-0.006
istered-0.006
Tokenuffy
Token ID15352
Feature activation-3.48e+00

Pos logit contributions
 ourselves+0.102
Sorry+0.093
 yourselves+0.085
 disl+0.085
apy+0.081
Neg logit contributions
ah-0.149
ary-0.135
et-0.134
ag-0.128
ow-0.126
Token went
Token ID1816
Feature activation-4.58e-01

Pos logit contributions
 ourselves+0.112
apy+0.104
iring+0.101
uddled+0.100
mons+0.098
Neg logit contributions
et-0.068
ah-0.065
 seven-0.056
oscope-0.052
ary-0.050
Token to
Token ID284
Feature activation-1.29e+00

Pos logit contributions
 ourselves+0.014
uddled+0.013
ls+0.013
apy+0.013
mons+0.012
Neg logit contributions
et-0.010
ah-0.009
 seven-0.008
oscope-0.008
pole-0.007
Token the
Token ID262
Feature activation+4.45e-02

Pos logit contributions
 ourselves+0.040
apy+0.039
uddled+0.039
mons+0.037
ls+0.037
Neg logit contributions
et-0.024
ah-0.024
 seven-0.020
oscope-0.019
aster-0.018
Token market
Token ID1910
Feature activation-1.46e+00

Pos logit contributions
et+0.001
ah+0.001
 seven+0.001
oscope+0.001
pole+0.001
Neg logit contributions
 ourselves-0.001
apy-0.001
uddled-0.001
mons-0.001
ls-0.001
Token on
Token ID319
Feature activation+2.18e+00

Pos logit contributions
apy+0.018
 ourselves+0.017
ls+0.016
uddled+0.015
Sorry+0.015
Neg logit contributions
et-0.020
ah-0.019
oscope-0.018
pole-0.017
ont-0.017
Token,
Token ID11
Feature activation-6.03e-01

Pos logit contributions
et+0.042
ah+0.041
 seven+0.034
oscope+0.033
pole+0.030
Neg logit contributions
 ourselves-0.069
apy-0.065
uddled-0.065
mons-0.063
ls-0.061
Token Lily
Token ID20037
Feature activation+6.45e-02

Pos logit contributions
 ourselves+0.017
apy+0.016
uddled+0.016
mons+0.015
ls+0.015
Neg logit contributions
et-0.014
ah-0.013
 seven-0.012
oscope-0.011
pole-0.011
Token and
Token ID290
Feature activation+7.74e-01

Pos logit contributions
et+0.001
ah+0.001
 seven+0.001
oscope+0.001
pole+0.001
Neg logit contributions
 ourselves-0.002
uddled-0.002
apy-0.002
mons-0.002
ls-0.002
Token the
Token ID262
Feature activation+1.98e-01

Pos logit contributions
et+0.015
ah+0.015
 seven+0.012
oscope+0.012
aster+0.011
Neg logit contributions
 ourselves-0.024
apy-0.023
uddled-0.023
mons-0.022
ls-0.022
Token puppy
Token ID26188
Feature activation-5.19e+00

Pos logit contributions
ah+0.004
et+0.004
 seven+0.004
 time+0.003
oscope+0.003
Neg logit contributions
apy-0.006
 ourselves-0.006
ins-0.005
mons-0.005
ls-0.005
Token became
Token ID2627
Feature activation-2.64e+00

Pos logit contributions
 ourselves+0.121
apy+0.112
uddled+0.110
mons+0.106
ls+0.105
Neg logit contributions
et-0.143
ah-0.140
 seven-0.123
oscope-0.121
pole-0.116
Token best
Token ID1266
Feature activation-1.56e+00

Pos logit contributions
 ourselves+0.061
uddled+0.060
apy+0.058
mons+0.055
ls+0.054
Neg logit contributions
et-0.071
ah-0.070
 seven-0.063
 2-0.058
oscope-0.057
Token friends
Token ID2460
Feature activation-7.51e-01

Pos logit contributions
 ourselves+0.037
idents+0.033
Sorry+0.031
istered+0.031
mons+0.031
Neg logit contributions
ah-0.039
et-0.039
 seven-0.036
 time-0.035
 hour-0.034
Token and
Token ID290
Feature activation-2.23e+00

Pos logit contributions
 ourselves+0.022
apy+0.021
uddled+0.021
mons+0.020
ls+0.020
Neg logit contributions
et-0.016
ah-0.015
 seven-0.013
oscope-0.013
pole-0.012
Token they
Token ID484
Feature activation-2.93e+00

Pos logit contributions
 ourselves+0.062
uddled+0.058
apy+0.057
iring+0.056
ls+0.056
Neg logit contributions
ah-0.051
et-0.051
 seven-0.043
oscope-0.042
ary-0.039
Token the
Token ID262
Feature activation-2.40e-01

Pos logit contributions
 ourselves+0.008
apy+0.008
uddled+0.008
mons+0.007
ls+0.007
Neg logit contributions
et-0.007
ah-0.007
 seven-0.006
oscope-0.006
pole-0.005
Token way
Token ID835
Feature activation-5.30e-01

Pos logit contributions
 ourselves+0.006
apy+0.005
uddled+0.005
ls+0.005
iring+0.005
Neg logit contributions
et-0.006
ah-0.006
 seven-0.005
oscope-0.005
aling-0.005
Token they
Token ID484
Feature activation-1.08e+00

Pos logit contributions
 ourselves+0.016
apy+0.015
uddled+0.015
mons+0.014
ls+0.014
Neg logit contributions
et-0.011
ah-0.011
 seven-0.009
oscope-0.009
pole-0.009
Token tasted
Token ID29187
Feature activation-2.85e-01

Pos logit contributions
 ourselves+0.027
apy+0.025
uddled+0.025
mons+0.024
ls+0.024
Neg logit contributions
et-0.028
ah-0.027
 seven-0.024
oscope-0.023
pole-0.022
Token and
Token ID290
Feature activation-5.22e-01

Pos logit contributions
 ourselves+0.009
apy+0.008
uddled+0.008
mons+0.008
Sorry+0.008
Neg logit contributions
et-0.006
ah-0.006
 seven-0.005
oscope-0.004
pole-0.004
Token how
Token ID703
Feature activation-5.18e+00

Pos logit contributions
 ourselves+0.015
uddled+0.015
apy+0.014
mons+0.014
ls+0.014
Neg logit contributions
et-0.011
ah-0.011
 seven-0.009
oscope-0.009
pole-0.008
Token they
Token ID484
Feature activation-7.63e-01

Pos logit contributions
uddled+0.147
 ourselves+0.147
mons+0.141
istered+0.136
Sorry+0.136
Neg logit contributions
et-0.090
ah-0.089
 seven-0.081
 about-0.080
 time-0.078
Token made
Token ID925
Feature activation+1.26e-03

Pos logit contributions
 ourselves+0.021
uddled+0.020
apy+0.020
mons+0.019
iring+0.018
Neg logit contributions
et-0.018
ah-0.017
 seven-0.015
oscope-0.014
pole-0.014
Token him
Token ID683
Feature activation-3.33e+00

Pos logit contributions
et+0.000
ah+0.000
 time+0.000
 seven+0.000
aster+0.000
Neg logit contributions
uddled-0.000
Sorry-0.000
 ourselves-0.000
apy-0.000
mons-0.000
Token feel
Token ID1254
Feature activation+5.64e-01

Pos logit contributions
 ourselves+0.092
uddled+0.091
apy+0.089
Sorry+0.087
mons+0.087
Neg logit contributions
et-0.073
ah-0.071
 seven-0.060
oscope-0.058
pole-0.056
Token warm
Token ID5814
Feature activation-1.27e+00

Pos logit contributions
et+0.013
ah+0.013
 seven+0.011
oscope+0.010
pole+0.010
Neg logit contributions
 ourselves-0.015
uddled-0.015
apy-0.014
mons-0.014
ls-0.014
Token day
Token ID1110
Feature activation-1.81e+00

Pos logit contributions
et+0.019
ah+0.017
alling+0.017
 seven+0.016
 2+0.016
Neg logit contributions
apy-0.024
 ourselves-0.023
uddled-0.022
mons-0.022
Sorry-0.021
Token,
Token ID11
Feature activation-2.54e-01

Pos logit contributions
 ourselves+0.063
uddled+0.058
apy+0.057
istered+0.056
mons+0.056
Neg logit contributions
ah-0.030
et-0.029
 seven-0.024
aster-0.020
oscope-0.020
Token Tom
Token ID4186
Feature activation-2.78e-01

Pos logit contributions
 ourselves+0.008
apy+0.007
uddled+0.007
mons+0.007
ls+0.007
Neg logit contributions
et-0.005
ah-0.005
 seven-0.004
oscope-0.004
pole-0.004
Token and
Token ID290
Feature activation+6.27e-01

Pos logit contributions
 ourselves+0.009
apy+0.008
uddled+0.008
mons+0.008
iring+0.007
Neg logit contributions
et-0.006
ah-0.006
 seven-0.005
oscope-0.005
ont-0.004
Token his
Token ID465
Feature activation-1.02e+00

Pos logit contributions
et+0.016
ah+0.016
alling+0.015
pole+0.015
alis+0.015
Neg logit contributions
apy-0.014
 ourselves-0.013
Sorry-0.012
mons-0.012
ls-0.012
Token toy
Token ID13373
Feature activation-5.18e+00

Pos logit contributions
 ourselves+0.028
apy+0.027
uddled+0.025
Sorry+0.025
ls+0.024
Neg logit contributions
ah-0.023
et-0.023
 seven-0.020
oscope-0.019
ont-0.018
Token dog
Token ID3290
Feature activation-3.80e+00

Pos logit contributions
 ourselves+0.157
ins+0.146
istered+0.142
apy+0.141
 yourselves+0.139
Neg logit contributions
ah-0.104
et-0.101
 seven-0.083
ary-0.080
oscope-0.077
Token went
Token ID1816
Feature activation-3.52e-01

Pos logit contributions
 ourselves+0.109
apy+0.101
uddled+0.098
mons+0.096
iring+0.095
Neg logit contributions
et-0.084
ah-0.083
 seven-0.071
oscope-0.068
ary-0.066
Token on
Token ID319
Feature activation+2.58e-01

Pos logit contributions
 ourselves+0.010
ls+0.009
uddled+0.009
apy+0.009
mons+0.009
Neg logit contributions
et-0.008
ah-0.008
oscope-0.007
 seven-0.007
pole-0.006
Token an
Token ID281
Feature activation+2.31e+00

Pos logit contributions
et+0.006
ah+0.006
 seven+0.005
oscope+0.005
pole+0.005
Neg logit contributions
 ourselves-0.007
apy-0.006
uddled-0.006
ls-0.006
mons-0.006
Token adventure
Token ID8855
Feature activation-3.42e-01

Pos logit contributions
ah+0.057
et+0.046
 seven+0.042
 2+0.042
oscope+0.041
Neg logit contributions
 ourselves-0.057
 themselves-0.048
uddled-0.046
iring-0.046
 yourselves-0.044
Token heard
Token ID2982
Feature activation-1.62e-01

Pos logit contributions
 ourselves+0.016
apy+0.015
uddled+0.015
mons+0.014
ls+0.014
Neg logit contributions
et-0.015
ah-0.015
 seven-0.013
oscope-0.013
pole-0.013
Token the
Token ID262
Feature activation-3.54e-01

Pos logit contributions
 ourselves+0.004
apy+0.004
uddled+0.004
mons+0.004
ls+0.004
Neg logit contributions
et-0.004
ah-0.004
 seven-0.003
oscope-0.003
pole-0.003
Token tornado
Token ID33718
Feature activation-5.02e-01

Pos logit contributions
 ourselves+0.010
uddled+0.010
mons+0.010
apy+0.010
ls+0.009
Neg logit contributions
et-0.007
ah-0.006
 seven-0.006
oscope-0.006
 2-0.005
Token roar
Token ID34532
Feature activation+5.99e-02

Pos logit contributions
 ourselves+0.015
apy+0.014
uddled+0.014
ls+0.014
mons+0.013
Neg logit contributions
ah-0.010
et-0.010
 seven-0.009
oscope-0.008
ont-0.008
Token and
Token ID290
Feature activation-1.37e-01

Pos logit contributions
ah+0.001
et+0.001
 seven+0.001
oscope+0.001
wn+0.001
Neg logit contributions
 ourselves-0.002
apy-0.002
uddled-0.002
ls-0.002
iring-0.001
Token how
Token ID703
Feature activation-5.18e+00

Pos logit contributions
 ourselves+0.004
apy+0.004
uddled+0.004
mons+0.004
ls+0.004
Neg logit contributions
et-0.003
ah-0.003
 seven-0.002
oscope-0.002
pole-0.002
Tokenl
Token ID75
Feature activation+1.15e+00

Pos logit contributions
 ourselves+0.151
uddled+0.139
mons+0.139
sters+0.134
istered+0.133
Neg logit contributions
et-0.111
ah-0.107
 seven-0.095
ary-0.084
oscope-0.083
Token outside
Token ID2354
Feature activation-1.38e+00

Pos logit contributions
et+0.030
ah+0.029
oscope+0.025
 seven+0.024
ary+0.024
Neg logit contributions
 ourselves-0.028
apy-0.026
istered-0.025
Sorry-0.024
mons-0.024
Token.
Token ID13
Feature activation-4.59e-01

Pos logit contributions
 ourselves+0.032
uddled+0.030
apy+0.029
mons+0.029
istered+0.028
Neg logit contributions
et-0.038
ah-0.037
 seven-0.032
oscope-0.032
aster-0.030
Token They
Token ID1119
Feature activation-7.77e-01

Pos logit contributions
 ourselves+0.009
apy+0.009
uddled+0.008
mons+0.008
Sorry+0.007
Neg logit contributions
et-0.015
ah-0.014
oscope-0.013
 seven-0.013
pole-0.012
Token felt
Token ID2936
Feature activation+6.86e-01

Pos logit contributions
 ourselves+0.020
apy+0.019
uddled+0.018
mons+0.018
ls+0.018
Neg logit contributions
et-0.019
ah-0.019
 seven-0.017
oscope-0.016
pole-0.015
Token Rose
Token ID8049
Feature activation-6.44e-01

Pos logit contributions
 ourselves+0.001
apy+0.001
We+0.001
mons+0.001
uddled+0.001
Neg logit contributions
et-0.002
ah-0.001
 seven-0.001
wn-0.001
oscope-0.001
Token loved
Token ID6151
Feature activation-2.88e-01

Pos logit contributions
 ourselves+0.015
uddled+0.014
apy+0.014
ls+0.014
mons+0.013
Neg logit contributions
et-0.017
ah-0.017
 seven-0.014
oscope-0.014
wn-0.014
Token Fl
Token ID1610
Feature activation-4.29e+00

Pos logit contributions
 ourselves+0.007
apy+0.006
uddled+0.006
mons+0.006
ls+0.006
Neg logit contributions
et-0.007
ah-0.007
 seven-0.006
oscope-0.006
pole-0.006
Tokenappy
Token ID7774
Feature activation-7.16e-01

Pos logit contributions
 ourselves+0.076
apy+0.070
Sorry+0.069
 themselves+0.060
mons+0.057
Neg logit contributions
ah-0.136
et-0.124
ary-0.117
ag-0.107
oscope-0.106
Token and
Token ID290
Feature activation+3.91e-01

Pos logit contributions
 ourselves+0.019
uddled+0.018
apy+0.018
ls+0.017
mons+0.017
Neg logit contributions
ah-0.015
et-0.015
 seven-0.013
aster-0.013
alling-0.013
Token Fl
Token ID1610
Feature activation-5.17e+00

Pos logit contributions
et+0.009
ah+0.009
 seven+0.008
oscope+0.007
pole+0.007
Neg logit contributions
 ourselves-0.011
uddled-0.010
apy-0.010
mons-0.010
ls-0.010
Tokenappy
Token ID7774
Feature activation-1.39e+00

Pos logit contributions
Sorry+0.081
 ourselves+0.081
apy+0.076
uddled+0.069
ls+0.066
Neg logit contributions
ah-0.173
et-0.168
ary-0.150
ag-0.144
 seven-0.141
Token seemed
Token ID3947
Feature activation-2.08e-01

Pos logit contributions
 ourselves+0.032
apy+0.030
uddled+0.029
mons+0.028
ls+0.027
Neg logit contributions
et-0.037
ah-0.036
 seven-0.032
oscope-0.031
aster-0.031
Token to
Token ID284
Feature activation-7.72e-01

Pos logit contributions
 ourselves+0.006
apy+0.006
uddled+0.006
ls+0.006
mons+0.006
Neg logit contributions
et-0.004
ah-0.004
oscope-0.004
 seven-0.003
aster-0.003
Token love
Token ID1842
Feature activation-2.37e-01

Pos logit contributions
 ourselves+0.020
apy+0.018
Sorry+0.016
 themselves+0.016
uddled+0.016
Neg logit contributions
et-0.019
ah-0.018
oscope-0.016
 seven-0.016
wn-0.015
Token her
Token ID607
Feature activation-1.34e+00

Pos logit contributions
 ourselves+0.007
apy+0.006
uddled+0.006
mons+0.006
ls+0.006
Neg logit contributions
et-0.005
ah-0.005
 seven-0.004
oscope-0.004
pole-0.004
Token honey
Token ID12498
Feature activation-5.66e-01

Pos logit contributions
 ourselves+0.034
apy+0.032
uddled+0.032
mons+0.030
Sorry+0.030
Neg logit contributions
et-0.047
ah-0.046
 seven-0.042
oscope-0.041
pole-0.038
Token.
Token ID13
Feature activation-1.03e+00

Pos logit contributions
 ourselves+0.015
apy+0.014
uddled+0.014
mons+0.013
ls+0.013
Neg logit contributions
et-0.014
ah-0.014
 seven-0.012
oscope-0.011
aling-0.011
Token Buzz
Token ID15753
Feature activation-2.62e+00

Pos logit contributions
 ourselves+0.021
apy+0.019
uddled+0.018
mons+0.018
ls+0.017
Neg logit contributions
et-0.032
ah-0.031
oscope-0.028
 seven-0.027
wn-0.027
Tokeny
Token ID88
Feature activation-1.60e+00

Pos logit contributions
 ourselves+0.070
apy+0.066
uddled+0.063
mons+0.063
ls+0.061
Neg logit contributions
et-0.064
ah-0.063
 seven-0.054
oscope-0.052
pole-0.050
Token and
Token ID290
Feature activation+1.27e-01

Pos logit contributions
 ourselves+0.038
apy+0.036
uddled+0.035
mons+0.034
ls+0.034
Neg logit contributions
et-0.042
ah-0.041
 seven-0.036
oscope-0.035
wn-0.034
Token Fl
Token ID1610
Feature activation-5.16e+00

Pos logit contributions
et+0.003
ah+0.003
 seven+0.003
oscope+0.003
alling+0.002
Neg logit contributions
 ourselves-0.003
apy-0.003
ls-0.003
uddled-0.003
mons-0.003
Tokenutter
Token ID10381
Feature activation-1.45e+00

Pos logit contributions
Sorry+0.095
 ourselves+0.090
apy+0.084
uddled+0.082
mons+0.078
Neg logit contributions
ah-0.159
et-0.152
ary-0.130
ag-0.129
 seven-0.127
Token became
Token ID2627
Feature activation-2.40e+00

Pos logit contributions
 ourselves+0.033
apy+0.031
uddled+0.030
ls+0.029
mons+0.029
Neg logit contributions
et-0.039
ah-0.038
 seven-0.034
oscope-0.034
alling-0.033
Token best
Token ID1266
Feature activation-1.32e+00

Pos logit contributions
uddled+0.056
 ourselves+0.053
mons+0.049
apy+0.049
Sorry+0.048
Neg logit contributions
et-0.065
ah-0.064
 seven-0.061
 2-0.054
 time-0.053
Token friends
Token ID2460
Feature activation-4.36e-01

Pos logit contributions
 ourselves+0.027
idents+0.024
mons+0.024
uddled+0.024
Sorry+0.024
Neg logit contributions
ah-0.037
et-0.037
 seven-0.034
 time-0.033
ag-0.031
Token.
Token ID13
Feature activation-6.77e-01

Pos logit contributions
 ourselves+0.012
apy+0.012
uddled+0.012
mons+0.011
ls+0.011
Neg logit contributions
et-0.010
ah-0.009
 seven-0.008
oscope-0.008
pole-0.007
Token on
Token ID319
Feature activation+2.00e+00

Pos logit contributions
apy+0.015
 ourselves+0.015
ls+0.014
uddled+0.014
Sorry+0.013
Neg logit contributions
et-0.016
ah-0.015
oscope-0.014
pole-0.014
 seven-0.013
Token,
Token ID11
Feature activation-5.47e-01

Pos logit contributions
et+0.037
ah+0.036
 seven+0.030
oscope+0.028
pole+0.026
Neg logit contributions
 ourselves-0.064
uddled-0.061
apy-0.060
mons-0.060
Sorry-0.058
Token Lily
Token ID20037
Feature activation+3.66e-01

Pos logit contributions
 ourselves+0.015
apy+0.014
uddled+0.014
mons+0.013
ls+0.013
Neg logit contributions
et-0.013
ah-0.013
 seven-0.011
oscope-0.010
pole-0.010
Token and
Token ID290
Feature activation+1.07e+00

Pos logit contributions
et+0.008
ah+0.008
 seven+0.007
pole+0.007
wn+0.007
Neg logit contributions
 ourselves-0.009
uddled-0.009
apy-0.009
ls-0.008
mons-0.008
Token the
Token ID262
Feature activation+5.39e-01

Pos logit contributions
et+0.024
ah+0.023
 seven+0.020
oscope+0.019
pole+0.019
Neg logit contributions
 ourselves-0.031
apy-0.029
uddled-0.028
mons-0.028
ls-0.027
Token puppy
Token ID26188
Feature activation-5.15e+00

Pos logit contributions
et+0.012
ah+0.012
 seven+0.011
oscope+0.011
 time+0.010
Neg logit contributions
 ourselves-0.015
apy-0.014
mons-0.014
uddled-0.014
ls-0.013
Token were
Token ID547
Feature activation-9.99e-01

Pos logit contributions
 ourselves+0.127
apy+0.118
uddled+0.116
mons+0.114
ls+0.111
Neg logit contributions
et-0.134
ah-0.133
 seven-0.115
oscope-0.114
pole-0.107
Token best
Token ID1266
Feature activation-1.54e+00

Pos logit contributions
 ourselves+0.028
apy+0.026
uddled+0.026
mons+0.025
ls+0.025
Neg logit contributions
et-0.023
ah-0.022
 seven-0.019
oscope-0.018
pole-0.018
Token friends
Token ID2460
Feature activation-3.31e-01

Pos logit contributions
 ourselves+0.035
mons+0.032
idents+0.032
Sorry+0.031
istered+0.029
Neg logit contributions
ah-0.041
et-0.038
 about-0.035
 time-0.035
 seven-0.034
Token forever
Token ID8097
Feature activation-1.03e+00

Pos logit contributions
 ourselves+0.010
apy+0.009
uddled+0.009
mons+0.009
ls+0.009
Neg logit contributions
et-0.007
ah-0.007
 seven-0.006
oscope-0.006
pole-0.006
Token.
Token ID13
Feature activation-6.62e-01

Pos logit contributions
 ourselves+0.033
apy+0.033
mons+0.033
uddled+0.032
Sorry+0.032
Neg logit contributions
et-0.017
ah-0.016
 seven-0.013
 time-0.013
oscope-0.012
Token out
Token ID503
Feature activation-1.43e+00

Pos logit contributions
 ourselves+0.030
apy+0.029
ls+0.029
uddled+0.029
iring+0.027
Neg logit contributions
ah-0.031
et-0.029
oscope-0.026
 seven-0.025
pole-0.025
Token the
Token ID262
Feature activation-3.70e-01

Pos logit contributions
apy+0.038
 ourselves+0.038
ls+0.037
uddled+0.035
iring+0.034
Neg logit contributions
ah-0.035
et-0.034
oscope-0.030
pole-0.029
 seven-0.028
Token window
Token ID4324
Feature activation+1.16e+00

Pos logit contributions
 ourselves+0.010
uddled+0.010
apy+0.010
mons+0.009
ls+0.009
Neg logit contributions
et-0.008
ah-0.008
 seven-0.007
oscope-0.007
pole-0.007
Token and
Token ID290
Feature activation+4.44e-01

Pos logit contributions
ah+0.031
et+0.028
wn+0.026
 seven+0.026
oscope+0.026
Neg logit contributions
 ourselves-0.027
uddled-0.025
apy-0.024
iring-0.024
we-0.023
Token saw
Token ID2497
Feature activation-1.23e-01

Pos logit contributions
et+0.010
ah+0.010
 seven+0.009
oscope+0.008
pole+0.008
Neg logit contributions
 ourselves-0.012
apy-0.011
uddled-0.011
mons-0.011
ls-0.011
Token how
Token ID703
Feature activation-5.15e+00

Pos logit contributions
 ourselves+0.004
uddled+0.004
istered+0.004
apy+0.004
Sorry+0.004
Neg logit contributions
et-0.002
ah-0.002
aster-0.001
 seven-0.001
ock-0.001
Token dirty
Token ID11841
Feature activation-9.37e-01

Pos logit contributions
 ourselves+0.150
sters+0.135
uddled+0.133
mons+0.129
 beings+0.127
Neg logit contributions
et-0.095
ah-0.087
 seven-0.083
ag-0.079
 time-0.079
Token he
Token ID339
Feature activation+2.16e-01

Pos logit contributions
 ourselves+0.026
uddled+0.026
apy+0.025
istered+0.025
Sorry+0.025
Neg logit contributions
et-0.021
ah-0.019
 seven-0.016
oscope-0.015
pole-0.015
Token was
Token ID373
Feature activation+1.38e-01

Pos logit contributions
et+0.006
ah+0.006
oscope+0.005
 seven+0.005
ont+0.005
Neg logit contributions
 ourselves-0.005
apy-0.005
uddled-0.005
mons-0.004
ls-0.004
Token.
Token ID13
Feature activation+9.64e-01

Pos logit contributions
et+0.003
ah+0.003
 seven+0.003
oscope+0.003
pole+0.002
Neg logit contributions
 ourselves-0.004
apy-0.004
uddled-0.004
mons-0.004
ls-0.003
Token She
Token ID1375
Feature activation+9.22e-01

Pos logit contributions
et+0.022
ah+0.021
 seven+0.019
oscope+0.018
pole+0.017
Neg logit contributions
 ourselves-0.027
uddled-0.024
apy-0.024
ls-0.024
mons-0.023
Token Fl
Token ID1610
Feature activation-4.58e+00

Pos logit contributions
et+0.029
ah+0.028
 seven+0.024
oscope+0.024
aling+0.023
Neg logit contributions
 ourselves-0.026
apy-0.024
uddled-0.023
mons-0.023
ls-0.022
Tokenuffy
Token ID15352
Feature activation-9.37e-01

Pos logit contributions
 ourselves+0.087
apy+0.087
Sorry+0.081
 disl+0.080
mons+0.073
Neg logit contributions
ah-0.132
et-0.115
ary-0.111
 time-0.106
oscope-0.104
Token.
Token ID13
Feature activation-3.41e-01

Pos logit contributions
 ourselves+0.022
uddled+0.020
apy+0.020
mons+0.019
ls+0.019
Neg logit contributions
et-0.025
ah-0.024
 seven-0.022
aling-0.022
alling-0.022
Token Lily
Token ID20037
Feature activation-1.07e+00

Pos logit contributions
 ourselves+0.009
apy+0.008
uddled+0.008
mons+0.008
ls+0.008
Neg logit contributions
et-0.009
ah-0.008
 seven-0.007
oscope-0.007
aling-0.007
Token loved
Token ID6151
Feature activation-1.45e+00

Pos logit contributions
 ourselves+0.027
apy+0.025
mons+0.024
uddled+0.024
iring+0.023
Neg logit contributions
et-0.028
ah-0.028
 seven-0.024
oscope-0.024
pole-0.023
Token Fl
Token ID1610
Feature activation-5.15e+00

Pos logit contributions
 ourselves+0.046
apy+0.044
uddled+0.044
mons+0.042
ls+0.042
Neg logit contributions
et-0.026
ah-0.026
 seven-0.022
oscope-0.021
pole-0.019
Tokenuffy
Token ID15352
Feature activation-3.11e+00

Pos logit contributions
 ourselves+0.105
 disl+0.105
Sorry+0.101
mons+0.094
apy+0.090
Neg logit contributions
ah-0.130
ary-0.113
 time-0.111
inn-0.107
ow-0.107
Token very
Token ID845
Feature activation-2.25e+00

Pos logit contributions
 ourselves+0.086
uddled+0.079
apy+0.079
mons+0.076
ls+0.076
Neg logit contributions
et-0.070
ah-0.070
 seven-0.060
alling-0.058
aling-0.058
Token much
Token ID881
Feature activation-7.93e-01

Pos logit contributions
 ourselves+0.064
uddled+0.060
apy+0.060
mons+0.058
ls+0.056
Neg logit contributions
et-0.050
ah-0.050
 seven-0.042
oscope-0.040
pole-0.038
Token and
Token ID290
Feature activation-6.08e-02

Pos logit contributions
 ourselves+0.022
apy+0.020
uddled+0.020
ls+0.020
mons+0.020
Neg logit contributions
et-0.018
ah-0.018
 seven-0.015
oscope-0.015
pole-0.014
Token took
Token ID1718
Feature activation-2.44e-01

Pos logit contributions
 ourselves+0.002
uddled+0.002
apy+0.002
mons+0.002
iring+0.002
Neg logit contributions
ah-0.001
et-0.001
 seven-0.001
oscope-0.001
pole-0.001
Token together
Token ID1978
Feature activation-9.55e-01

Pos logit contributions
 ourselves+0.046
apy+0.044
uddled+0.044
mons+0.044
Sorry+0.043
Neg logit contributions
et-0.025
ah-0.023
 seven-0.020
oscope-0.018
pole-0.017
Token,
Token ID11
Feature activation-9.50e-01

Pos logit contributions
 ourselves+0.030
apy+0.028
uddled+0.028
mons+0.027
ls+0.027
Neg logit contributions
et-0.018
ah-0.018
 seven-0.015
oscope-0.014
pole-0.014
Token Amy
Token ID14235
Feature activation-1.01e+00

Pos logit contributions
 ourselves+0.033
uddled+0.032
istered+0.031
mons+0.031
ins+0.031
Neg logit contributions
et-0.014
ah-0.013
 seven-0.012
 time-0.011
aster-0.010
Token and
Token ID290
Feature activation+4.72e-01

Pos logit contributions
 ourselves+0.026
uddled+0.024
apy+0.024
ls+0.023
mons+0.023
Neg logit contributions
et-0.025
ah-0.024
 seven-0.021
oscope-0.021
pole-0.020
Token the
Token ID262
Feature activation-7.05e-02

Pos logit contributions
et+0.009
ah+0.008
 seven+0.007
oscope+0.007
aster+0.006
Neg logit contributions
 ourselves-0.015
apy-0.015
uddled-0.015
mons-0.014
istered-0.014
Token puppy
Token ID26188
Feature activation-5.15e+00

Pos logit contributions
apy+0.002
 ourselves+0.002
mons+0.002
ins+0.002
ls+0.002
Neg logit contributions
ah-0.001
et-0.001
 time-0.001
 seven-0.001
oscope-0.001
Token had
Token ID550
Feature activation+2.86e-01

Pos logit contributions
 ourselves+0.133
apy+0.123
mons+0.117
uddled+0.115
iring+0.113
Neg logit contributions
et-0.130
ah-0.128
 seven-0.110
oscope-0.106
pole-0.105
Token become
Token ID1716
Feature activation-1.51e+00

Pos logit contributions
et+0.006
ah+0.006
 seven+0.005
oscope+0.005
pole+0.005
Neg logit contributions
 ourselves-0.008
apy-0.008
uddled-0.008
mons-0.007
ls-0.007
Token best
Token ID1266
Feature activation-5.54e-01

Pos logit contributions
 ourselves+0.036
apy+0.035
uddled+0.034
mons+0.033
ls+0.032
Neg logit contributions
et-0.040
ah-0.039
 seven-0.035
oscope-0.033
wn-0.032
Token friends
Token ID2460
Feature activation+7.10e-01

Pos logit contributions
 ourselves+0.012
apy+0.011
Sorry+0.010
mons+0.010
idents+0.010
Neg logit contributions
et-0.015
ah-0.015
 time-0.014
 about-0.014
wn-0.013
Token.
Token ID13
Feature activation+4.16e-01

Pos logit contributions
et+0.018
ah+0.017
 seven+0.015
oscope+0.015
pole+0.014
Neg logit contributions
 ourselves-0.018
uddled-0.017
apy-0.017
ls-0.016
mons-0.016
Token named
Token ID3706
Feature activation+2.78e-01

Pos logit contributions
 ourselves+0.026
apy+0.025
uddled+0.024
mons+0.024
ls+0.023
Neg logit contributions
et-0.012
ah-0.012
 seven-0.010
oscope-0.009
pole-0.008
Token Tim
Token ID5045
Feature activation+6.47e-02

Pos logit contributions
ah+0.006
et+0.006
 seven+0.005
oscope+0.005
pole+0.005
Neg logit contributions
 ourselves-0.008
uddled-0.008
apy-0.008
mons-0.007
ls-0.007
Token.
Token ID13
Feature activation+7.74e-02

Pos logit contributions
et+0.002
ah+0.002
 seven+0.001
alling+0.001
oscope+0.001
Neg logit contributions
 ourselves-0.002
uddled-0.001
ls-0.001
apy-0.001
mons-0.001
Token Tim
Token ID5045
Feature activation-6.75e-01

Pos logit contributions
et+0.002
ah+0.002
 seven+0.002
oscope+0.002
aling+0.002
Neg logit contributions
 ourselves-0.002
apy-0.002
uddled-0.002
mons-0.002
ls-0.002
Token and
Token ID290
Feature activation+3.32e-01

Pos logit contributions
 ourselves+0.016
apy+0.015
uddled+0.015
mons+0.014
ls+0.014
Neg logit contributions
et-0.018
ah-0.018
 seven-0.016
oscope-0.015
pole-0.015
Token Fl
Token ID1610
Feature activation-5.14e+00

Pos logit contributions
et+0.009
ah+0.009
 seven+0.008
oscope+0.008
pole+0.007
Neg logit contributions
 ourselves-0.008
apy-0.007
uddled-0.007
mons-0.007
ls-0.007
Tokenuffy
Token ID15352
Feature activation-4.59e+00

Pos logit contributions
 ourselves+0.087
Sorry+0.080
apy+0.073
 yourselves+0.070
we+0.068
Neg logit contributions
ah-0.166
et-0.151
ag-0.147
ary-0.143
ow-0.134
Token liked
Token ID8288
Feature activation-1.42e+00

Pos logit contributions
 ourselves+0.110
apy+0.103
uddled+0.101
mons+0.097
iring+0.097
Neg logit contributions
et-0.124
ah-0.121
 seven-0.108
oscope-0.104
ary-0.098
Token to
Token ID284
Feature activation-2.05e+00

Pos logit contributions
 ourselves+0.047
apy+0.045
uddled+0.045
mons+0.044
ls+0.043
Neg logit contributions
et-0.024
ah-0.024
 seven-0.019
oscope-0.018
pole-0.017
Token play
Token ID711
Feature activation-1.79e+00

Pos logit contributions
 ourselves+0.054
apy+0.051
uddled+0.049
mons+0.048
ls+0.047
Neg logit contributions
et-0.049
ah-0.048
 seven-0.042
oscope-0.040
pole-0.039
Token and
Token ID290
Feature activation-2.33e+00

Pos logit contributions
 ourselves+0.044
apy+0.042
uddled+0.040
mons+0.039
ls+0.039
Neg logit contributions
et-0.046
ah-0.043
 seven-0.039
oscope-0.037
ag-0.036
Token tower
Token ID10580
Feature activation+5.10e-01

Pos logit contributions
et+0.002
ah+0.002
 seven+0.002
aling+0.002
oscope+0.002
Neg logit contributions
 ourselves-0.002
apy-0.002
mons-0.002
uddled-0.002
ls-0.002
Token they
Token ID484
Feature activation-4.73e-01

Pos logit contributions
ah+0.012
et+0.012
 seven+0.011
oscope+0.010
alling+0.010
Neg logit contributions
 ourselves-0.013
apy-0.013
uddled-0.012
ls-0.012
mons-0.012
Token had
Token ID550
Feature activation+9.95e-01

Pos logit contributions
 ourselves+0.013
apy+0.012
uddled+0.012
mons+0.011
ls+0.011
Neg logit contributions
et-0.011
ah-0.011
 seven-0.009
oscope-0.009
pole-0.009
Token built
Token ID3170
Feature activation+3.10e-01

Pos logit contributions
et+0.021
ah+0.020
 seven+0.017
oscope+0.017
pole+0.016
Neg logit contributions
 ourselves-0.030
apy-0.028
uddled-0.028
mons-0.027
ls-0.027
Token and
Token ID290
Feature activation+1.58e-02

Pos logit contributions
ah+0.007
et+0.007
 seven+0.006
oscope+0.006
pole+0.006
Neg logit contributions
 ourselves-0.008
apy-0.008
uddled-0.008
mons-0.008
ls-0.007
Token how
Token ID703
Feature activation-5.13e+00

Pos logit contributions
et+0.000
ah+0.000
 seven+0.000
oscope+0.000
pole+0.000
Neg logit contributions
 ourselves-0.000
apy-0.000
uddled-0.000
mons-0.000
ls-0.000
Token strong
Token ID1913
Feature activation+1.19e+00

Pos logit contributions
 ourselves+0.141
uddled+0.128
mons+0.125
istered+0.124
sters+0.123
Neg logit contributions
et-0.113
ah-0.105
 seven-0.098
ary-0.091
wn-0.088
Token it
Token ID340
Feature activation-1.25e-01

Pos logit contributions
et+0.036
ah+0.035
 seven+0.031
oscope+0.030
pole+0.029
Neg logit contributions
 ourselves-0.024
uddled-0.023
apy-0.023
mons-0.021
istered-0.021
Token was
Token ID373
Feature activation+7.27e-01

Pos logit contributions
 ourselves+0.003
apy+0.003
uddled+0.003
mons+0.003
ls+0.003
Neg logit contributions
et-0.003
ah-0.003
 seven-0.003
oscope-0.003
pole-0.003
Token.
Token ID13
Feature activation+2.62e-01

Pos logit contributions
ah+0.019
et+0.018
 seven+0.017
oscope+0.016
aster+0.015
Neg logit contributions
 ourselves-0.018
uddled-0.016
apy-0.016
ls-0.016
mons-0.015
Token They
Token ID1119
Feature activation-1.41e+00

Pos logit contributions
ah+0.007
et+0.007
oscope+0.006
 seven+0.006
ont+0.006
Neg logit contributions
 ourselves-0.006
apy-0.006
uddled-0.005
mons-0.005
ls-0.005
Token day
Token ID1110
Feature activation-2.14e+00

Pos logit contributions
et+0.013
ah+0.012
alling+0.011
 seven+0.011
aling+0.011
Neg logit contributions
apy-0.016
 ourselves-0.016
uddled-0.015
mons-0.015
ls-0.015
Token,
Token ID11
Feature activation-8.89e-01

Pos logit contributions
 ourselves+0.080
uddled+0.075
istered+0.074
we+0.072
iring+0.072
Neg logit contributions
ah-0.029
et-0.027
 seven-0.021
 time-0.019
ary-0.017
Token Tim
Token ID5045
Feature activation-1.06e-01

Pos logit contributions
 ourselves+0.026
uddled+0.024
istered+0.023
ls+0.023
apy+0.023
Neg logit contributions
ah-0.019
et-0.019
 seven-0.016
aster-0.015
pole-0.015
Token and
Token ID290
Feature activation+3.04e-01

Pos logit contributions
 ourselves+0.003
apy+0.003
uddled+0.003
mons+0.003
ls+0.003
Neg logit contributions
et-0.002
ah-0.002
 seven-0.002
oscope-0.002
pole-0.002
Token his
Token ID465
Feature activation-7.69e-01

Pos logit contributions
et+0.007
ah+0.007
alling+0.007
 seven+0.006
aling+0.006
Neg logit contributions
 ourselves-0.008
apy-0.008
mons-0.007
ls-0.007
Sorry-0.007
Token toy
Token ID13373
Feature activation-5.13e+00

Pos logit contributions
 ourselves+0.021
apy+0.020
uddled+0.019
Sorry+0.018
we+0.018
Neg logit contributions
et-0.018
ah-0.018
 seven-0.016
oscope-0.014
 time-0.014
Token dog
Token ID3290
Feature activation-4.15e+00

Pos logit contributions
 ourselves+0.153
apy+0.142
istered+0.141
ins+0.140
 yourselves+0.136
Neg logit contributions
ah-0.108
et-0.103
 seven-0.087
ary-0.080
pole-0.079
Token went
Token ID1816
Feature activation-6.25e-01

Pos logit contributions
 ourselves+0.114
apy+0.107
uddled+0.104
mons+0.101
ls+0.099
Neg logit contributions
et-0.097
ah-0.095
 seven-0.083
oscope-0.079
pole-0.075
Token for
Token ID329
Feature activation+4.37e-01

Pos logit contributions
 ourselves+0.018
ls+0.016
uddled+0.016
apy+0.016
mons+0.016
Neg logit contributions
et-0.014
ah-0.014
oscope-0.012
 seven-0.011
pole-0.011
Token a
Token ID257
Feature activation+5.40e-01

Pos logit contributions
et+0.010
ah+0.010
 seven+0.008
oscope+0.008
pole+0.008
Neg logit contributions
 ourselves-0.012
apy-0.011
uddled-0.011
mons-0.011
ls-0.011
Token walk
Token ID2513
Feature activation-9.85e-01

Pos logit contributions
et+0.014
ah+0.013
 seven+0.011
oscope+0.011
aling+0.011
Neg logit contributions
 ourselves-0.013
apy-0.013
uddled-0.013
ls-0.013
mons-0.012
Token mom
Token ID1995
Feature activation+1.77e-01

Pos logit contributions
 ourselves+0.008
apy+0.007
uddled+0.007
mons+0.007
ls+0.006
Neg logit contributions
et-0.008
ah-0.008
 seven-0.007
oscope-0.007
ont-0.007
Token came
Token ID1625
Feature activation+2.99e-01

Pos logit contributions
et+0.004
ah+0.004
 seven+0.003
oscope+0.003
pole+0.003
Neg logit contributions
 ourselves-0.005
apy-0.005
uddled-0.005
ls-0.005
mons-0.004
Token in
Token ID287
Feature activation+7.90e-01

Pos logit contributions
ah+0.007
et+0.007
 seven+0.006
pole+0.006
oscope+0.006
Neg logit contributions
 ourselves-0.008
uddled-0.008
apy-0.007
iring-0.007
ls-0.007
Token and
Token ID290
Feature activation+8.84e-01

Pos logit contributions
et+0.019
ah+0.019
pole+0.016
 seven+0.015
oscope+0.015
Neg logit contributions
 ourselves-0.022
apy-0.020
uddled-0.020
Sorry-0.019
iring-0.019
Token saw
Token ID2497
Feature activation+3.21e-01

Pos logit contributions
et+0.018
ah+0.017
 seven+0.015
oscope+0.014
 2+0.014
Neg logit contributions
 ourselves-0.027
apy-0.025
mons-0.025
uddled-0.024
iring-0.024
Token how
Token ID703
Feature activation-5.12e+00

Pos logit contributions
et+0.006
ah+0.005
aster+0.004
 seven+0.004
ock+0.004
Neg logit contributions
 ourselves-0.010
uddled-0.010
istered-0.010
apy-0.010
mons-0.010
Token upset
Token ID9247
Feature activation-1.04e+00

Pos logit contributions
 ourselves+0.158
sters+0.143
mons+0.140
oles+0.132
owers+0.132
Neg logit contributions
et-0.086
ah-0.084
ary-0.076
ag-0.075
 seven-0.075
Token he
Token ID339
Feature activation-2.14e-01

Pos logit contributions
 ourselves+0.031
uddled+0.030
istered+0.030
apy+0.029
mons+0.028
Neg logit contributions
et-0.021
ah-0.020
 seven-0.016
pole-0.016
oscope-0.015
Token was
Token ID373
Feature activation+4.05e-01

Pos logit contributions
 ourselves+0.006
apy+0.005
mons+0.005
uddled+0.005
ls+0.005
Neg logit contributions
et-0.006
ah-0.006
oscope-0.005
 seven-0.004
ont-0.004
Token.
Token ID13
Feature activation+1.16e+00

Pos logit contributions
et+0.009
ah+0.008
 seven+0.007
oscope+0.007
pole+0.007
Neg logit contributions
 ourselves-0.012
apy-0.011
uddled-0.011
mons-0.011
ls-0.011
Token She
Token ID1375
Feature activation+8.19e-01

Pos logit contributions
et+0.022
ah+0.020
 seven+0.019
ary+0.017
 2+0.017
Neg logit contributions
 ourselves-0.037
apy-0.033
uddled-0.032
ls-0.032
istered-0.032
Token and
Token ID290
Feature activation-6.83e-01

Pos logit contributions
 ourselves+0.002
uddled+0.002
apy+0.002
ls+0.001
mons+0.001
Neg logit contributions
ah-0.001
et-0.001
 seven-0.001
oscope-0.001
wn-0.001
Token what
Token ID644
Feature activation-2.53e+00

Pos logit contributions
 ourselves+0.017
uddled+0.016
apy+0.016
mons+0.015
ls+0.015
Neg logit contributions
et-0.017
ah-0.017
 seven-0.015
oscope-0.014
pole-0.014
Token they
Token ID484
Feature activation+6.69e-01

Pos logit contributions
 ourselves+0.050
uddled+0.049
istered+0.046
apy+0.046
mons+0.045
Neg logit contributions
ah-0.072
et-0.071
 seven-0.065
 time-0.063
 about-0.063
Token ate
Token ID15063
Feature activation-1.94e+00

Pos logit contributions
et+0.015
ah+0.014
 seven+0.012
oscope+0.012
pole+0.011
Neg logit contributions
apy-0.019
 ourselves-0.019
uddled-0.019
mons-0.018
ls-0.017
Token and
Token ID290
Feature activation-8.90e-01

Pos logit contributions
 ourselves+0.055
apy+0.052
uddled+0.052
mons+0.050
ls+0.049
Neg logit contributions
et-0.043
ah-0.042
 seven-0.036
oscope-0.035
pole-0.033
Token how
Token ID703
Feature activation-5.10e+00

Pos logit contributions
 ourselves+0.024
uddled+0.023
apy+0.023
iring+0.022
mons+0.022
Neg logit contributions
et-0.020
ah-0.019
oscope-0.017
 seven-0.016
aster-0.015
Token they
Token ID484
Feature activation-8.15e-01

Pos logit contributions
uddled+0.137
mons+0.133
istered+0.132
 ourselves+0.131
Sorry+0.131
Neg logit contributions
 about-0.090
et-0.080
 seven-0.078
ah-0.077
ag-0.075
Token lived
Token ID5615
Feature activation-1.80e+00

Pos logit contributions
 ourselves+0.023
apy+0.022
uddled+0.022
mons+0.021
iring+0.021
Neg logit contributions
et-0.018
ah-0.017
 seven-0.015
oscope-0.015
aling-0.014
Token.
Token ID13
Feature activation-8.29e-01

Pos logit contributions
 ourselves+0.051
apy+0.049
uddled+0.048
mons+0.046
Sorry+0.045
Neg logit contributions
et-0.040
ah-0.038
 seven-0.034
oscope-0.032
pole-0.030
Token They
Token ID1119
Feature activation-1.47e+00

Pos logit contributions
 ourselves+0.019
apy+0.018
uddled+0.017
mons+0.017
ls+0.016
Neg logit contributions
et-0.023
ah-0.023
 seven-0.020
oscope-0.020
pole-0.019
Token also
Token ID635
Feature activation-6.84e-01

Pos logit contributions
 ourselves+0.039
apy+0.037
uddled+0.037
mons+0.036
ls+0.035
Neg logit contributions
et-0.035
ah-0.035
 seven-0.030
oscope-0.029
pole-0.028
Token granted
Token ID7520
Feature activation+2.33e+00

Pos logit contributions
et+0.032
ah+0.031
 seven+0.028
oscope+0.027
pole+0.025
Neg logit contributions
 ourselves-0.032
apy-0.031
uddled-0.031
mons-0.030
ls-0.029
Token.
Token ID13
Feature activation+1.07e+00

Pos logit contributions
ah+0.052
et+0.052
 seven+0.044
oscope+0.043
pole+0.042
Neg logit contributions
 ourselves-0.066
apy-0.060
uddled-0.059
ls-0.057
mons-0.057
Token
Token ID220
Feature activation-1.90e+00

Pos logit contributions
et+0.030
ah+0.029
 seven+0.025
oscope+0.024
wn+0.024
Neg logit contributions
 ourselves-0.026
apy-0.024
uddled-0.022
We-0.022
mons-0.021
Token
Token ID198
Feature activation+1.23e+00

Pos logit contributions
 ourselves+0.054
apy+0.051
uddled+0.051
mons+0.049
ls+0.049
Neg logit contributions
et-0.040
ah-0.039
 seven-0.034
oscope-0.033
pole-0.031
Token
Token ID198
Feature activation+6.87e-01

Pos logit contributions
et+0.031
ah+0.030
 seven+0.026
oscope+0.026
pole+0.024
Neg logit contributions
 ourselves-0.032
apy-0.030
uddled-0.029
mons-0.028
ls-0.028
TokenFrom
Token ID4863
Feature activation+3.76e+00

Pos logit contributions
et+0.017
ah+0.017
 seven+0.015
oscope+0.015
pole+0.014
Neg logit contributions
 ourselves-0.017
apy-0.016
uddled-0.016
mons-0.016
ls-0.015
Token then
Token ID788
Feature activation+9.07e-01

Pos logit contributions
et+0.089
ah+0.089
 seven+0.075
oscope+0.075
pole+0.071
Neg logit contributions
 ourselves-0.102
apy-0.094
uddled-0.093
ls-0.091
mons-0.090
Token on
Token ID319
Feature activation+2.14e+00

Pos logit contributions
et+0.026
ah+0.025
pole+0.025
 seven+0.023
ont+0.022
Neg logit contributions
apy-0.019
 ourselves-0.018
ls-0.016
uddled-0.016
mons-0.015
Token,
Token ID11
Feature activation-5.62e-02

Pos logit contributions
et+0.043
ah+0.042
 seven+0.035
oscope+0.033
pole+0.032
Neg logit contributions
 ourselves-0.066
apy-0.062
uddled-0.062
mons-0.060
ls-0.059
Token the
Token ID262
Feature activation+7.22e-01

Pos logit contributions
 ourselves+0.002
apy+0.002
uddled+0.002
mons+0.001
ls+0.001
Neg logit contributions
et-0.001
ah-0.001
 seven-0.001
oscope-0.001
pole-0.001
Token village
Token ID7404
Feature activation-1.12e+00

Pos logit contributions
et+0.018
ah+0.017
 seven+0.016
oscope+0.016
 2+0.015
Neg logit contributions
 ourselves-0.018
apy-0.017
uddled-0.017
mons-0.017
ls-0.016
Token they
Token ID484
Feature activation-5.22e-01

Pos logit contributions
 ourselves+0.011
uddled+0.011
mons+0.011
istered+0.011
ins+0.010
Neg logit contributions
ah-0.011
et-0.011
 seven-0.010
 2-0.009
ary-0.009
Token heard
Token ID2982
Feature activation+7.31e-01

Pos logit contributions
 ourselves+0.014
apy+0.014
uddled+0.013
ls+0.013
mons+0.013
Neg logit contributions
et-0.012
ah-0.012
 seven-0.010
oscope-0.010
pole-0.009
Token a
Token ID257
Feature activation-1.49e-01

Pos logit contributions
et+0.018
ah+0.017
oscope+0.015
 seven+0.015
aling+0.014
Neg logit contributions
 ourselves-0.019
apy-0.018
uddled-0.018
ls-0.017
mons-0.016
Token voice
Token ID3809
Feature activation+2.73e+00

Pos logit contributions
 ourselves+0.004
apy+0.004
uddled+0.003
ls+0.003
Sorry+0.003
Neg logit contributions
et-0.004
ah-0.004
 seven-0.003
oscope-0.003
aling-0.003
Token shout
Token ID9191
Feature activation+1.35e-01

Pos logit contributions
ah+0.072
et+0.070
 seven+0.065
wn+0.063
oscope+0.060
Neg logit contributions
 ourselves-0.062
ls-0.060
uddled-0.056
apy-0.055
we-0.055
Token,
Token ID11
Feature activation+2.99e+00

Pos logit contributions
ah+0.003
et+0.003
 seven+0.003
oscope+0.003
aling+0.003
Neg logit contributions
 ourselves-0.004
apy-0.003
uddled-0.003
ls-0.003
mons-0.003
Token "
Token ID366
Feature activation-4.25e-01

Pos logit contributions
et+0.064
ah+0.063
 seven+0.053
oscope+0.052
pole+0.048
Neg logit contributions
 ourselves-0.087
apy-0.082
uddled-0.081
mons-0.078
ls-0.078
TokenSh
Token ID2484
Feature activation+1.18e+00

Pos logit contributions
 ourselves+0.009
apy+0.008
uddled+0.008
mons+0.008
ls+0.008
Neg logit contributions
et-0.013
ah-0.012
 seven-0.011
oscope-0.011
pole-0.011
Tokenoot
Token ID1025
Feature activation+6.19e-01

Pos logit contributions
et+0.023
ah+0.022
 seven+0.020
oscope+0.018
wn+0.018
Neg logit contributions
 ourselves-0.036
uddled-0.033
apy-0.033
iring-0.032
mons-0.032
Token the
Token ID262
Feature activation-5.19e-01

Pos logit contributions
et+0.015
ah+0.014
 seven+0.012
oscope+0.012
pole+0.011
Neg logit contributions
 ourselves-0.017
apy-0.016
uddled-0.016
mons-0.015
ls-0.015
Token bear
Token ID6842
Feature activation-1.09e-01

Pos logit contributions
 ourselves+0.015
apy+0.014
uddled+0.014
mons+0.014
ls+0.013
Neg logit contributions
et-0.011
ah-0.011
 seven-0.009
oscope-0.009
aling-0.009
Token End
Token ID5268
Feature activation-2.45e-01

Pos logit contributions
et+0.006
ah+0.006
 seven+0.005
oscope+0.005
 2+0.005
Neg logit contributions
 ourselves-0.010
apy-0.009
uddled-0.009
ls-0.009
mons-0.009
Token.
Token ID13
Feature activation+1.18e+00

Pos logit contributions
 ourselves+0.006
apy+0.006
uddled+0.006
mons+0.006
ls+0.006
Neg logit contributions
et-0.006
ah-0.006
 seven-0.005
oscope-0.005
pole-0.005
Token<|endoftext|>
Token ID50256
Feature activation-3.14e-01

Pos logit contributions
et+0.036
ah+0.034
 seven+0.030
oscope+0.030
wn+0.029
Neg logit contributions
 ourselves-0.026
apy-0.024
We-0.022
uddled-0.022
Sorry-0.022
TokenOnce
Token ID7454
Feature activation-1.08e+00

Pos logit contributions
 ourselves+0.006
apy+0.005
uddled+0.005
mons+0.005
ls+0.005
Neg logit contributions
et-0.010
ah-0.010
 seven-0.009
oscope-0.009
pole-0.008
Token upon
Token ID2402
Feature activation+6.49e-01

Pos logit contributions
 ourselves+0.024
apy+0.022
uddled+0.022
mons+0.021
Sorry+0.020
Neg logit contributions
et-0.031
ah-0.031
 seven-0.027
oscope-0.026
pole-0.026
Token a
Token ID257
Feature activation+3.35e+00

Pos logit contributions
ah+0.010
et+0.010
 seven+0.008
oscope+0.008
aster+0.008
Neg logit contributions
 ourselves-0.021
uddled-0.021
mons-0.021
ls-0.021
apy-0.021
Token time
Token ID640
Feature activation+7.58e-01

Pos logit contributions
et+0.044
ah+0.039
oscope+0.033
pole+0.032
alling+0.032
Neg logit contributions
 ourselves-0.116
apy-0.113
uddled-0.113
istered-0.110
ls-0.109
Token there
Token ID612
Feature activation+4.69e-01

Pos logit contributions
ah+0.012
et+0.012
 seven+0.010
ary+0.009
pole+0.008
Neg logit contributions
 ourselves-0.026
uddled-0.025
apy-0.024
mons-0.024
Sorry-0.024
Token was
Token ID373
Feature activation+9.63e-02

Pos logit contributions
ah+0.012
et+0.012
 seven+0.010
 time+0.010
oscope+0.010
Neg logit contributions
 ourselves-0.012
mons-0.011
uddled-0.011
istered-0.011
apy-0.011
Token a
Token ID257
Feature activation-2.92e-01

Pos logit contributions
ah+0.002
et+0.002
 seven+0.001
oscope+0.001
aster+0.001
Neg logit contributions
 ourselves-0.003
uddled-0.003
apy-0.003
ls-0.003
mons-0.003
Token girl
Token ID2576
Feature activation+4.64e-01

Pos logit contributions
 ourselves+0.008
apy+0.007
uddled+0.007
mons+0.007
ls+0.007
Neg logit contributions
et-0.007
ah-0.007
 seven-0.006
oscope-0.006
pole-0.006
Token knew
Token ID2993
Feature activation+1.24e+00

Pos logit contributions
 ourselves+0.001
uddled+0.001
apy+0.001
ls+0.001
Sorry+0.001
Neg logit contributions
et-0.001
ah-0.001
oscope-0.001
wn-0.001
 seven-0.001
Token that
Token ID326
Feature activation+6.49e-01

Pos logit contributions
et+0.034
ah+0.034
 seven+0.030
oscope+0.029
pole+0.028
Neg logit contributions
 ourselves-0.029
apy-0.026
uddled-0.026
mons-0.025
ls-0.025
Token she
Token ID673
Feature activation+1.41e+00

Pos logit contributions
et+0.018
ah+0.018
 seven+0.016
oscope+0.015
pole+0.015
Neg logit contributions
 ourselves-0.015
apy-0.014
uddled-0.013
mons-0.013
ls-0.013
Token would
Token ID561
Feature activation+4.12e-01

Pos logit contributions
et+0.030
ah+0.030
 seven+0.025
oscope+0.024
pole+0.023
Neg logit contributions
 ourselves-0.042
apy-0.040
uddled-0.039
mons-0.038
ls-0.037
Token always
Token ID1464
Feature activation+1.50e+00

Pos logit contributions
et+0.009
ah+0.009
 seven+0.008
oscope+0.008
wn+0.007
Neg logit contributions
 ourselves-0.012
apy-0.010
uddled-0.010
Sorry-0.010
iring-0.010
Token remember
Token ID3505
Feature activation+3.10e+00

Pos logit contributions
et+0.035
ah+0.034
 seven+0.029
oscope+0.028
pole+0.027
Neg logit contributions
 ourselves-0.042
apy-0.037
uddled-0.035
Sorry-0.034
ls-0.034
Token the
Token ID262
Feature activation+8.51e-01

Pos logit contributions
ah+0.079
et+0.078
 seven+0.067
oscope+0.066
pole+0.064
Neg logit contributions
 ourselves-0.080
apy-0.072
uddled-0.071
ls-0.069
mons-0.067
Token special
Token ID2041
Feature activation+3.99e-01

Pos logit contributions
et+0.022
ah+0.022
aling+0.019
pole+0.019
 seven+0.019
Neg logit contributions
 ourselves-0.020
apy-0.018
uddled-0.018
Sorry-0.017
mons-0.017
Token knob
Token ID36842
Feature activation+1.73e+00

Pos logit contributions
et+0.009
ah+0.009
 seven+0.008
aling+0.007
alling+0.007
Neg logit contributions
 ourselves-0.011
apy-0.010
uddled-0.010
mons-0.010
ls-0.010
Token on
Token ID319
Feature activation+6.20e-01

Pos logit contributions
ah+0.044
et+0.043
 seven+0.039
wn+0.038
aling+0.037
Neg logit contributions
 ourselves-0.042
apy-0.037
uddled-0.037
ls-0.036
we-0.035
Token her
Token ID607
Feature activation+8.35e-01

Pos logit contributions
et+0.016
ah+0.016
 seven+0.014
oscope+0.014
pole+0.013
Neg logit contributions
 ourselves-0.016
apy-0.014
uddled-0.014
ls-0.013
mons-0.013
Token end
Token ID886
Feature activation+4.64e-01

Pos logit contributions
et+0.018
ah+0.018
 seven+0.015
oscope+0.014
pole+0.014
Neg logit contributions
 ourselves-0.028
apy-0.026
uddled-0.026
mons-0.026
Sorry-0.025
Token.
Token ID13
Feature activation+9.95e-01

Pos logit contributions
et+0.014
ah+0.013
 seven+0.012
pole+0.012
oscope+0.012
Neg logit contributions
 ourselves-0.010
apy-0.009
uddled-0.009
ls-0.008
we-0.008
Token<|endoftext|>
Token ID50256
Feature activation-1.81e-01

Pos logit contributions
et+0.031
ah+0.029
 seven+0.026
oscope+0.026
wn+0.025
Neg logit contributions
 ourselves-0.021
apy-0.019
uddled-0.018
We-0.018
Sorry-0.017
TokenOnce
Token ID7454
Feature activation-1.16e+00

Pos logit contributions
 ourselves+0.004
apy+0.004
uddled+0.004
ls+0.004
mons+0.003
Neg logit contributions
et-0.005
ah-0.005
 seven-0.004
oscope-0.004
pole-0.004
Token upon
Token ID2402
Feature activation+2.97e-01

Pos logit contributions
 ourselves+0.025
apy+0.023
uddled+0.023
mons+0.022
ls+0.021
Neg logit contributions
et-0.034
ah-0.033
 seven-0.030
oscope-0.029
pole-0.028
Token a
Token ID257
Feature activation+3.07e+00

Pos logit contributions
ah+0.005
et+0.005
 seven+0.004
oscope+0.004
aster+0.004
Neg logit contributions
 ourselves-0.010
uddled-0.010
mons-0.009
apy-0.009
ls-0.009
Token time
Token ID640
Feature activation+6.43e-01

Pos logit contributions
et+0.040
ah+0.035
oscope+0.031
alling+0.030
pole+0.029
Neg logit contributions
 ourselves-0.106
uddled-0.104
apy-0.103
istered-0.100
ls-0.100
Token,
Token ID11
Feature activation-2.01e-01

Pos logit contributions
ah+0.010
et+0.010
 seven+0.008
ary+0.007
pole+0.007
Neg logit contributions
 ourselves-0.022
uddled-0.021
apy-0.021
mons-0.020
Sorry-0.020
Token there
Token ID612
Feature activation+4.85e-02

Pos logit contributions
 ourselves+0.005
uddled+0.005
mons+0.005
apy+0.005
Sorry+0.005
Neg logit contributions
et-0.005
ah-0.005
 seven-0.004
pole-0.004
oscope-0.004
Token was
Token ID373
Feature activation-3.58e-02

Pos logit contributions
ah+0.001
et+0.001
 time+0.001
 seven+0.001
oscope+0.001
Neg logit contributions
 ourselves-0.001
mons-0.001
istered-0.001
uddled-0.001
Sorry-0.001
Token a
Token ID257
Feature activation-4.10e-01

Pos logit contributions
 ourselves+0.001
uddled+0.001
apy+0.001
ls+0.001
mons+0.001
Neg logit contributions
et-0.001
ah-0.001
 seven-0.001
oscope-0.001
aster-0.000
Token mom
Token ID1995
Feature activation-1.03e-01

Pos logit contributions
 ourselves+0.014
apy+0.013
uddled+0.013
mons+0.012
ls+0.012
Neg logit contributions
ah-0.022
et-0.022
aling-0.019
 seven-0.019
oscope-0.019
Token and
Token ID290
Feature activation+7.98e-01

Pos logit contributions
 ourselves+0.002
uddled+0.002
apy+0.002
ls+0.002
mons+0.002
Neg logit contributions
et-0.003
ah-0.003
 seven-0.002
oscope-0.002
pole-0.002
Token sister
Token ID6621
Feature activation-6.20e-01

Pos logit contributions
et+0.021
ah+0.021
 seven+0.018
oscope+0.017
pole+0.017
Neg logit contributions
 ourselves-0.020
apy-0.018
uddled-0.018
mons-0.017
ls-0.017
Token are
Token ID389
Feature activation+1.20e-01

Pos logit contributions
 ourselves+0.018
apy+0.017
uddled+0.017
mons+0.017
ls+0.016
Neg logit contributions
et-0.013
ah-0.013
 seven-0.011
oscope-0.011
pole-0.010
Token angry
Token ID7954
Feature activation-1.50e+00

Pos logit contributions
et+0.003
ah+0.003
 seven+0.003
oscope+0.003
pole+0.002
Neg logit contributions
 ourselves-0.003
apy-0.003
uddled-0.003
mons-0.003
ls-0.003
Token.
Token ID13
Feature activation+5.79e-01

Pos logit contributions
 ourselves+0.047
apy+0.047
uddled+0.045
mons+0.044
istered+0.043
Neg logit contributions
et-0.028
ah-0.027
 seven-0.022
oscope-0.021
aling-0.020
Token They
Token ID1119
Feature activation-4.06e-01

Pos logit contributions
et+0.014
ah+0.014
 seven+0.012
oscope+0.012
pole+0.011
Neg logit contributions
 ourselves-0.015
apy-0.014
uddled-0.014
mons-0.013
ls-0.013
Token run
Token ID1057
Feature activation-8.70e-01

Pos logit contributions
 ourselves+0.013
apy+0.012
uddled+0.012
mons+0.012
ls+0.012
Neg logit contributions
et-0.007
ah-0.007
 seven-0.006
oscope-0.006
pole-0.005
Token to
Token ID284
Feature activation-6.74e-01

Pos logit contributions
 ourselves+0.030
apy+0.029
uddled+0.028
mons+0.027
ls+0.027
Neg logit contributions
et-0.014
ah-0.014
 seven-0.011
oscope-0.011
aling-0.010
Token the
Token ID262
Feature activation+5.28e-01

Pos logit contributions
 ourselves+0.020
apy+0.018
uddled+0.018
mons+0.017
ls+0.017
Neg logit contributions
et-0.015
ah-0.015
 seven-0.012
oscope-0.012
pole-0.011
Token big
Token ID1263
Feature activation+1.02e+00

Pos logit contributions
et+0.009
 seven+0.009
ah+0.008
 2+0.008
oscope+0.008
Neg logit contributions
 ourselves-0.016
apy-0.016
uddled-0.016
mons-0.015
ls-0.015
Token the
Token ID262
Feature activation-4.09e-01

Pos logit contributions
ah+0.007
et+0.007
oscope+0.007
 seven+0.006
aling+0.006
Neg logit contributions
 ourselves-0.008
apy-0.008
uddled-0.007
iring-0.007
ls-0.007
Token centre
Token ID7372
Feature activation+8.30e-01

Pos logit contributions
 ourselves+0.009
apy+0.008
uddled+0.008
iring+0.007
mons+0.007
Neg logit contributions
et-0.011
ah-0.011
aling-0.010
oscope-0.010
aster-0.010
Token of
Token ID286
Feature activation+7.88e-03

Pos logit contributions
ah+0.019
et+0.018
 seven+0.016
oscope+0.015
aled+0.015
Neg logit contributions
 ourselves-0.022
apy-0.021
uddled-0.020
ls-0.020
we-0.020
Token her
Token ID607
Feature activation-6.60e-01

Pos logit contributions
ah+0.000
et+0.000
oscope+0.000
aling+0.000
 seven+0.000
Neg logit contributions
 ourselves-0.000
apy-0.000
uddled-0.000
mons-0.000
ls-0.000
Token imagination
Token ID13843
Feature activation-3.03e-01

Pos logit contributions
 ourselves+0.016
apy+0.015
uddled+0.015
mons+0.014
ls+0.014
Neg logit contributions
et-0.017
ah-0.017
 seven-0.015
oscope-0.015
pole-0.014
Token.
Token ID13
Feature activation+1.08e-01

Pos logit contributions
 ourselves+0.008
uddled+0.007
apy+0.007
mons+0.007
ls+0.007
Neg logit contributions
ah-0.007
et-0.007
 seven-0.007
oscope-0.006
aling-0.006
Token
Token ID220
Feature activation-3.04e+00

Pos logit contributions
et+0.003
ah+0.003
 seven+0.002
oscope+0.002
wn+0.002
Neg logit contributions
 ourselves-0.003
apy-0.003
uddled-0.003
mons-0.003
ls-0.002
Token
Token ID198
Feature activation-1.05e-01

Pos logit contributions
 ourselves+0.089
uddled+0.084
apy+0.084
mons+0.082
Sorry+0.082
Neg logit contributions
et-0.060
ah-0.059
 seven-0.050
oscope-0.048
pole-0.048
Token
Token ID198
Feature activation-4.36e-01

Pos logit contributions
 ourselves+0.003
apy+0.003
uddled+0.003
mons+0.003
ls+0.003
Neg logit contributions
et-0.002
ah-0.002
 seven-0.002
oscope-0.002
pole-0.002
TokenThe
Token ID464
Feature activation+7.72e-01

Pos logit contributions
 ourselves+0.011
apy+0.011
uddled+0.011
mons+0.010
ls+0.010
Neg logit contributions
ah-0.011
et-0.011
 seven-0.009
oscope-0.009
pole-0.009
Token little
Token ID1310
Feature activation-9.75e-01

Pos logit contributions
et+0.019
ah+0.019
 seven+0.017
oscope+0.016
pole+0.015
Neg logit contributions
 ourselves-0.020
apy-0.019
uddled-0.018
mons-0.018
ls-0.018
Token polite
Token ID23507
Feature activation-1.67e+00

Pos logit contributions
 ourselves+0.081
uddled+0.075
mons+0.070
we+0.069
Sorry+0.068
Neg logit contributions
et-0.061
ah-0.061
 seven-0.058
ary-0.052
 2-0.051
Token.
Token ID13
Feature activation+7.01e-01

Pos logit contributions
 ourselves+0.048
apy+0.045
uddled+0.044
mons+0.043
ls+0.043
Neg logit contributions
et-0.037
ah-0.036
 seven-0.031
oscope-0.030
pole-0.028
Token Jane
Token ID12091
Feature activation-1.08e+00

Pos logit contributions
et+0.016
ah+0.015
 seven+0.013
oscope+0.013
pole+0.012
Neg logit contributions
 ourselves-0.020
apy-0.019
uddled-0.019
mons-0.018
ls-0.018
Token wanted
Token ID2227
Feature activation+1.72e-01

Pos logit contributions
 ourselves+0.024
apy+0.023
uddled+0.022
mons+0.021
ls+0.021
Neg logit contributions
et-0.031
ah-0.030
 seven-0.027
oscope-0.026
pole-0.025
Token to
Token ID284
Feature activation+1.76e-01

Pos logit contributions
et+0.004
ah+0.004
pole+0.004
 seven+0.003
aling+0.003
Neg logit contributions
 ourselves-0.005
apy-0.004
mons-0.004
uddled-0.004
ls-0.004
Token learn
Token ID2193
Feature activation+1.94e+00

Pos logit contributions
et+0.004
ah+0.004
 seven+0.003
oscope+0.003
pole+0.003
Neg logit contributions
 ourselves-0.005
apy-0.005
mons-0.004
uddled-0.004
Sorry-0.004
Token how
Token ID703
Feature activation-2.38e+00

Pos logit contributions
ah+0.056
et+0.055
oscope+0.048
 seven+0.048
aling+0.046
Neg logit contributions
 ourselves-0.047
uddled-0.040
ls-0.040
apy-0.040
iring-0.039
Token to
Token ID284
Feature activation-3.70e-01

Pos logit contributions
 ourselves+0.085
uddled+0.084
apy+0.081
mons+0.081
istered+0.081
Neg logit contributions
et-0.029
ah-0.029
 seven-0.025
ary-0.020
 2-0.020
Token add
Token ID751
Feature activation+9.26e-01

Pos logit contributions
 ourselves+0.010
apy+0.009
mons+0.008
uddled+0.008
Sorry+0.008
Neg logit contributions
et-0.009
ah-0.009
 seven-0.008
oscope-0.007
wn-0.007
Token numbers
Token ID3146
Feature activation+5.32e-01

Pos logit contributions
ah+0.028
et+0.027
oscope+0.024
pole+0.023
aling+0.023
Neg logit contributions
 ourselves-0.021
apy-0.018
ls-0.018
mons-0.017
uddled-0.016
Token.
Token ID13
Feature activation+5.63e-01

Pos logit contributions
ah+0.014
et+0.014
oscope+0.013
ont+0.012
aling+0.011
Neg logit contributions
 ourselves-0.014
ls-0.012
apy-0.012
mons-0.011
iring-0.011
Token"
Token ID1
Feature activation-1.93e+00

Pos logit contributions
et+0.005
ah+0.005
aling+0.004
 seven+0.004
pole+0.004
Neg logit contributions
 ourselves-0.007
ls-0.007
mons-0.007
uddled-0.007
istered-0.007
TokenTim
Token ID14967
Feature activation+2.15e+00

Pos logit contributions
 ourselves+0.044
apy+0.042
ls+0.041
uddled+0.040
istered+0.039
Neg logit contributions
et-0.053
ah-0.052
 seven-0.045
oscope-0.044
pole-0.044
Tokenmy
Token ID1820
Feature activation+2.70e-01

Pos logit contributions
et+0.038
ah+0.037
 seven+0.035
alling+0.034
 contain+0.033
Neg logit contributions
 ourselves-0.063
uddled-0.059
apy-0.057
ls-0.057
ies-0.057
Token,
Token ID11
Feature activation+7.10e-02

Pos logit contributions
et+0.005
ah+0.005
 seven+0.004
oscope+0.003
pole+0.003
Neg logit contributions
 ourselves-0.009
apy-0.009
uddled-0.009
mons-0.008
iring-0.008
Token you
Token ID345
Feature activation-6.13e-01

Pos logit contributions
et+0.002
ah+0.002
 seven+0.002
oscope+0.002
alling+0.002
Neg logit contributions
 ourselves-0.002
apy-0.001
uddled-0.001
mons-0.001
ls-0.001
Token need
Token ID761
Feature activation+1.71e+00

Pos logit contributions
 ourselves+0.018
uddled+0.017
apy+0.017
mons+0.017
we+0.017
Neg logit contributions
ah-0.012
et-0.012
 seven-0.011
oscope-0.010
 2-0.010
Token to
Token ID284
Feature activation+2.36e-01

Pos logit contributions
et+0.033
ah+0.032
 seven+0.026
oscope+0.025
pole+0.024
Neg logit contributions
 ourselves-0.054
apy-0.051
uddled-0.051
mons-0.049
ls-0.049
Token eat
Token ID4483
Feature activation-2.73e-01

Pos logit contributions
et+0.005
ah+0.005
 seven+0.004
oscope+0.004
pole+0.004
Neg logit contributions
 ourselves-0.007
apy-0.006
uddled-0.006
mons-0.006
ls-0.006
Token your
Token ID534
Feature activation-4.46e-01

Pos logit contributions
 ourselves+0.007
apy+0.007
uddled+0.007
mons+0.007
ls+0.007
Neg logit contributions
et-0.006
ah-0.006
 seven-0.005
oscope-0.005
pole-0.005
Token cel
Token ID18725
Feature activation-1.27e-01

Pos logit contributions
 ourselves+0.012
apy+0.012
uddled+0.012
mons+0.011
ls+0.011
Neg logit contributions
et-0.010
ah-0.010
 seven-0.009
oscope-0.008
pole-0.008
Tokenery
Token ID1924
Feature activation-2.18e-01

Pos logit contributions
 ourselves+0.003
apy+0.003
uddled+0.003
mons+0.003
Sorry+0.003
Neg logit contributions
et-0.003
ah-0.003
 seven-0.003
oscope-0.003
pole-0.002
Token is
Token ID318
Feature activation+4.94e-01

Pos logit contributions
 ourselves+0.011
apy+0.010
uddled+0.010
ls+0.010
mons+0.009
Neg logit contributions
et-0.019
ah-0.019
 seven-0.017
oscope-0.017
wn-0.017
Token Billy
Token ID15890
Feature activation-9.79e-01

Pos logit contributions
et+0.013
ah+0.012
oscope+0.012
wn+0.011
 seven+0.011
Neg logit contributions
 ourselves-0.012
apy-0.011
uddled-0.010
mons-0.010
ls-0.010
Token.
Token ID13
Feature activation+8.19e-01

Pos logit contributions
 ourselves+0.021
uddled+0.017
ls+0.017
mons+0.016
s+0.016
Neg logit contributions
et-0.027
ah-0.026
 seven-0.025
wn-0.025
alling-0.024
Token Billy
Token ID15890
Feature activation-1.02e+00

Pos logit contributions
et+0.022
ah+0.022
oscope+0.019
 seven+0.018
ont+0.018
Neg logit contributions
 ourselves-0.019
apy-0.019
Sorry-0.018
uddled-0.018
mons-0.017
Token has
Token ID468
Feature activation+1.33e+00

Pos logit contributions
 ourselves+0.024
apy+0.023
uddled+0.022
mons+0.022
ls+0.021
Neg logit contributions
et-0.027
ah-0.027
 seven-0.024
oscope-0.023
pole-0.022
Token a
Token ID257
Feature activation+4.43e-01

Pos logit contributions
et+0.037
ah+0.036
pole+0.032
oscope+0.032
aling+0.030
Neg logit contributions
 ourselves-0.030
apy-0.027
mons-0.025
We-0.025
istered-0.025
Token big
Token ID1263
Feature activation+1.35e+00

Pos logit contributions
et+0.012
ah+0.011
oscope+0.010
 seven+0.010
aling+0.009
Neg logit contributions
 ourselves-0.011
apy-0.010
uddled-0.010
ls-0.010
mons-0.010
Token bag
Token ID6131
Feature activation+1.56e+00

Pos logit contributions
et+0.032
ah+0.030
alling+0.030
aling+0.030
pole+0.029
Neg logit contributions
apy-0.031
 ourselves-0.030
ls-0.029
mons-0.028
uddled-0.028
Token of
Token ID286
Feature activation-5.84e-01

Pos logit contributions
ah+0.043
et+0.041
alis+0.036
oscope+0.036
 2+0.036
Neg logit contributions
uddled-0.031
 ourselves-0.031
iring-0.030
ls-0.030
apy-0.029
Token toys
Token ID14958
Feature activation-2.23e-01

Pos logit contributions
 ourselves+0.012
apy+0.011
uddled+0.011
mons+0.011
ls+0.010
Neg logit contributions
et-0.017
ah-0.017
 seven-0.015
oscope-0.015
pole-0.015
Token.
Token ID13
Feature activation+1.27e+00

Pos logit contributions
 ourselves+0.005
uddled+0.005
ls+0.005
mons+0.005
apy+0.005
Neg logit contributions
ah-0.006
et-0.006
 seven-0.005
oscope-0.005
aster-0.005
Token
Token ID198
Feature activation-1.03e+00

Pos logit contributions
 ourselves+0.002
apy+0.002
uddled+0.002
mons+0.002
istered+0.002
Neg logit contributions
ah-0.001
et-0.001
 seven-0.001
oscope-0.001
ary-0.001
TokenHer
Token ID9360
Feature activation+3.95e-01

Pos logit contributions
 ourselves+0.026
ls+0.025
mons+0.025
apy+0.024
istered+0.024
Neg logit contributions
ah-0.024
et-0.024
 seven-0.021
ary-0.021
oscope-0.020
Token mom
Token ID1995
Feature activation+2.06e-01

Pos logit contributions
et+0.010
ah+0.010
 seven+0.008
ont+0.008
oscope+0.008
Neg logit contributions
 ourselves-0.011
apy-0.010
uddled-0.009
mons-0.009
Sorry-0.009
Token said
Token ID531
Feature activation+6.61e-01

Pos logit contributions
et+0.005
ah+0.005
 seven+0.005
wn+0.005
aster+0.005
Neg logit contributions
uddled-0.005
Sorry-0.004
 ourselves-0.004
apy-0.004
ls-0.004
Token,
Token ID11
Feature activation+1.88e+00

Pos logit contributions
et+0.011
ah+0.011
 seven+0.009
oscope+0.008
ary+0.008
Neg logit contributions
 ourselves-0.022
apy-0.021
uddled-0.021
mons-0.020
iring-0.020
Token "
Token ID366
Feature activation-1.56e+00

Pos logit contributions
et+0.051
ah+0.051
aling+0.044
alling+0.043
wn+0.043
Neg logit contributions
uddled-0.042
 ourselves-0.042
apy-0.042
mons-0.040
Sorry-0.038
TokenNo
Token ID2949
Feature activation-2.47e-01

Pos logit contributions
 ourselves+0.031
apy+0.029
uddled+0.028
ls+0.027
mons+0.026
Neg logit contributions
et-0.048
ah-0.048
 seven-0.043
oscope-0.041
pole-0.040
Token,
Token ID11
Feature activation+2.18e-01

Pos logit contributions
 ourselves+0.007
apy+0.006
uddled+0.006
ls+0.006
mons+0.006
Neg logit contributions
et-0.006
ah-0.005
 seven-0.005
oscope-0.005
pole-0.005
Token we
Token ID356
Feature activation-2.47e-03

Pos logit contributions
et+0.006
ah+0.006
 seven+0.005
oscope+0.005
pole+0.005
Neg logit contributions
 ourselves-0.005
apy-0.005
uddled-0.005
mons-0.005
ls-0.005
Token are
Token ID389
Feature activation-6.89e-01

Pos logit contributions
 ourselves+0.000
apy+0.000
uddled+0.000
mons+0.000
ls+0.000
Neg logit contributions
et-0.000
ah-0.000
 seven-0.000
oscope-0.000
pole-0.000
Token going
Token ID1016
Feature activation-5.01e-01

Pos logit contributions
 ourselves+0.018
apy+0.016
uddled+0.016
ls+0.016
mons+0.016
Neg logit contributions
ah-0.017
et-0.017
 seven-0.014
oscope-0.014
pole-0.014
Token,
Token ID11
Feature activation+2.06e+00

Pos logit contributions
 ourselves+0.011
uddled+0.011
apy+0.011
ls+0.010
iring+0.010
Neg logit contributions
et-0.009
ah-0.009
oscope-0.008
 seven-0.008
wn-0.008
Token hugged
Token ID41130
Feature activation-9.51e-01

Pos logit contributions
ah+0.053
et+0.052
ont+0.047
wn+0.046
 seven+0.046
Neg logit contributions
uddled-0.048
apy-0.047
 ourselves-0.047
Sorry-0.043
we-0.042
Token Milo
Token ID32662
Feature activation-9.45e-01

Pos logit contributions
 ourselves+0.023
apy+0.021
uddled+0.020
ls+0.020
mons+0.019
Neg logit contributions
et-0.025
ah-0.024
 seven-0.022
pole-0.022
wn-0.022
Token and
Token ID290
Feature activation+6.54e-01

Pos logit contributions
 ourselves+0.024
ls+0.021
apy+0.021
uddled+0.021
ies+0.020
Neg logit contributions
et-0.023
ah-0.023
aster-0.020
 seven-0.020
wn-0.019
Token together
Token ID1978
Feature activation-2.49e+00

Pos logit contributions
et+0.014
ah+0.014
 seven+0.012
oscope+0.011
pole+0.011
Neg logit contributions
 ourselves-0.019
apy-0.017
uddled-0.017
mons-0.017
ls-0.017
Token they
Token ID484
Feature activation-2.54e+00

Pos logit contributions
apy+0.034
 ourselves+0.033
uddled+0.032
ls+0.030
mons+0.027
Neg logit contributions
et-0.090
ah-0.089
oscope-0.083
 seven-0.082
aling-0.079
Token played
Token ID2826
Feature activation-1.90e+00

Pos logit contributions
 ourselves+0.059
apy+0.057
uddled+0.056
ls+0.053
mons+0.052
Neg logit contributions
et-0.066
ah-0.065
oscope-0.058
 seven-0.058
pole-0.055
Token until
Token ID1566
Feature activation-3.41e+00

Pos logit contributions
 ourselves+0.041
apy+0.039
uddled+0.038
mons+0.037
ls+0.036
Neg logit contributions
et-0.055
ah-0.053
 seven-0.047
oscope-0.046
pole-0.044
Token the
Token ID262
Feature activation+5.05e-01

Pos logit contributions
 ourselves+0.088
apy+0.082
uddled+0.081
mons+0.079
ls+0.077
Neg logit contributions
et-0.085
ah-0.084
 seven-0.073
oscope-0.070
pole-0.067
Token sun
Token ID4252
Feature activation-9.69e-01

Pos logit contributions
et+0.009
ah+0.009
 seven+0.007
oscope+0.007
pole+0.007
Neg logit contributions
 ourselves-0.017
apy-0.016
uddled-0.016
mons-0.015
ls-0.015
Token went
Token ID1816
Feature activation-1.13e+00

Pos logit contributions
 ourselves+0.027
apy+0.025
uddled+0.025
mons+0.024
ls+0.024
Neg logit contributions
et-0.022
ah-0.022
 seven-0.019
oscope-0.018
pole-0.017
Token punishments
Token ID35518
Feature activation-4.80e-01

Pos logit contributions
 ourselves+0.004
apy+0.003
uddled+0.003
mons+0.003
ls+0.003
Neg logit contributions
et-0.003
ah-0.003
 seven-0.003
oscope-0.003
pole-0.003
Token,
Token ID11
Feature activation+1.27e+00

Pos logit contributions
 ourselves+0.012
apy+0.011
uddled+0.011
ls+0.010
mons+0.010
Neg logit contributions
ah-0.012
et-0.012
 seven-0.011
aling-0.010
alling-0.010
Token he
Token ID339
Feature activation-8.60e-01

Pos logit contributions
ah+0.028
et+0.027
 seven+0.023
ont+0.023
wn+0.022
Neg logit contributions
 ourselves-0.036
apy-0.034
uddled-0.032
ls-0.031
mons-0.031
Token started
Token ID2067
Feature activation-3.77e-01

Pos logit contributions
 ourselves+0.019
apy+0.018
uddled+0.018
ls+0.018
Sorry+0.017
Neg logit contributions
et-0.023
ah-0.022
 seven-0.020
oscope-0.019
wn-0.019
Token to
Token ID284
Feature activation-1.67e+00

Pos logit contributions
 ourselves+0.011
apy+0.011
uddled+0.010
mons+0.010
ls+0.010
Neg logit contributions
et-0.008
ah-0.008
 seven-0.007
oscope-0.006
pole-0.006
Token cry
Token ID3960
Feature activation-1.90e+00

Pos logit contributions
 ourselves+0.043
apy+0.040
uddled+0.039
mons+0.038
ls+0.037
Neg logit contributions
et-0.042
ah-0.041
 seven-0.036
oscope-0.035
pole-0.033
Token.
Token ID13
Feature activation+3.22e-01

Pos logit contributions
 ourselves+0.053
apy+0.050
uddled+0.049
mons+0.047
ls+0.047
Neg logit contributions
et-0.043
ah-0.043
 seven-0.037
oscope-0.035
pole-0.034
Token His
Token ID2399
Feature activation-7.15e-01

Pos logit contributions
et+0.008
ah+0.008
wn+0.007
oscope+0.007
ock+0.007
Neg logit contributions
apy-0.008
 ourselves-0.008
uddled-0.008
Sorry-0.007
mons-0.007
Token mom
Token ID1995
Feature activation+6.94e-01

Pos logit contributions
 ourselves+0.016
apy+0.015
uddled+0.014
mons+0.014
ls+0.014
Neg logit contributions
et-0.020
ah-0.020
 seven-0.018
oscope-0.018
pole-0.017
Token said
Token ID531
Feature activation+1.35e+00

Pos logit contributions
et+0.017
ah+0.017
 seven+0.016
wn+0.015
aster+0.014
Neg logit contributions
uddled-0.016
 ourselves-0.015
apy-0.015
ls-0.015
Sorry-0.015
Token he
Token ID339
Feature activation+1.26e+00

Pos logit contributions
et+0.031
ah+0.030
 seven+0.026
oscope+0.025
pole+0.024
Neg logit contributions
 ourselves-0.037
apy-0.036
uddled-0.035
mons-0.034
ls-0.034
Token in
Token ID287
Feature activation+9.84e-01

Pos logit contributions
ah+0.020
et+0.020
 seven+0.017
aling+0.017
oscope+0.017
Neg logit contributions
 ourselves-0.021
uddled-0.020
ls-0.020
apy-0.020
iring-0.019
Token his
Token ID465
Feature activation-9.97e-01

Pos logit contributions
ah+0.026
et+0.025
pole+0.022
oscope+0.021
ary+0.021
Neg logit contributions
 ourselves-0.027
apy-0.022
 themselves-0.021
 yourselves-0.021
iring-0.020
Token notebook
Token ID20922
Feature activation+5.60e-01

Pos logit contributions
 ourselves+0.023
apy+0.022
uddled+0.021
mons+0.021
ls+0.020
Neg logit contributions
ah-0.027
et-0.026
pole-0.023
ock-0.023
aling-0.022
Token.
Token ID13
Feature activation+5.33e-01

Pos logit contributions
ah+0.017
et+0.016
alling+0.016
aling+0.016
 seven+0.015
Neg logit contributions
uddled-0.010
 ourselves-0.010
iring-0.009
ls-0.009
mons-0.009
Token Every
Token ID3887
Feature activation+1.06e+00

Pos logit contributions
et+0.014
ah+0.012
 seven+0.012
 contain+0.012
alling+0.011
Neg logit contributions
apy-0.012
 ourselves-0.012
s-0.011
idents-0.010
 themselves-0.010
Token day
Token ID1110
Feature activation-1.12e+00

Pos logit contributions
et+0.032
pole+0.031
alling+0.028
 2+0.028
aled+0.027
Neg logit contributions
apy-0.016
 ourselves-0.015
We-0.014
idents-0.014
uddled-0.014
Token he
Token ID339
Feature activation+4.90e-01

Pos logit contributions
apy+0.029
 ourselves+0.028
ls+0.027
uddled+0.027
mons+0.026
Neg logit contributions
et-0.028
ah-0.026
 seven-0.023
oscope-0.023
pole-0.023
Token would
Token ID561
Feature activation-2.92e-01

Pos logit contributions
et+0.011
ah+0.011
 seven+0.009
alling+0.009
pole+0.009
Neg logit contributions
 ourselves-0.012
uddled-0.012
apy-0.012
ls-0.012
mons-0.011
Token fill
Token ID6070
Feature activation-1.42e-01

Pos logit contributions
 ourselves+0.007
apy+0.007
Sorry+0.006
We+0.006
idents+0.006
Neg logit contributions
et-0.007
ah-0.007
 seven-0.006
wn-0.005
ock-0.005
Token the
Token ID262
Feature activation-7.39e-01

Pos logit contributions
 ourselves+0.005
uddled+0.004
apy+0.004
mons+0.004
Sorry+0.004
Neg logit contributions
et-0.003
ah-0.003
 seven-0.002
oscope-0.002
pole-0.002
Token pages
Token ID5468
Feature activation-9.28e-02

Pos logit contributions
 ourselves+0.021
apy+0.019
uddled+0.019
mons+0.019
ls+0.018
Neg logit contributions
ah-0.017
et-0.017
oscope-0.014
 seven-0.014
aling-0.013
Token\n
Token ID198
Feature activation+1.22e-01

Pos logit contributions
et+0.029
ah+0.028
 seven+0.025
oscope+0.024
pole+0.023
Neg logit contributions
 ourselves-0.029
apy-0.027
uddled-0.027
mons-0.026
ls-0.026
TokenThat
Token ID2504
Feature activation-6.61e-01

Pos logit contributions
ah+0.003
et+0.003
 seven+0.003
oscope+0.003
aling+0.003
Neg logit contributions
 ourselves-0.003
apy-0.003
uddled-0.003
mons-0.003
ls-0.003
Token night
Token ID1755
Feature activation-6.49e-02

Pos logit contributions
 ourselves+0.022
apy+0.021
uddled+0.021
mons+0.020
ls+0.020
Neg logit contributions
et-0.011
ah-0.011
 seven-0.009
oscope-0.009
pole-0.008
Token,
Token ID11
Feature activation+4.50e-01

Pos logit contributions
 ourselves+0.002
apy+0.002
uddled+0.002
mons+0.002
ls+0.002
Neg logit contributions
et-0.001
ah-0.001
 seven-0.001
oscope-0.001
aling-0.001
Token Benny
Token ID44275
Feature activation-3.88e-01

Pos logit contributions
et+0.010
ah+0.010
 seven+0.009
oscope+0.008
pole+0.008
Neg logit contributions
 ourselves-0.012
uddled-0.012
apy-0.012
mons-0.011
ls-0.011
Token stayed
Token ID9658
Feature activation-1.56e-01

Pos logit contributions
 ourselves+0.009
uddled+0.009
apy+0.008
ls+0.008
Sorry+0.008
Neg logit contributions
et-0.010
ah-0.010
 seven-0.009
pole-0.008
wn-0.008
Token up
Token ID510
Feature activation-6.13e-01

Pos logit contributions
 ourselves+0.004
apy+0.004
uddled+0.004
ls+0.003
iring+0.003
Neg logit contributions
ah-0.004
et-0.004
 seven-0.003
oscope-0.003
pole-0.003
Token late
Token ID2739
Feature activation+8.07e-01

Pos logit contributions
 ourselves+0.012
apy+0.011
ls+0.011
iring+0.011
uddled+0.010
Neg logit contributions
et-0.018
ah-0.018
oscope-0.016
shirt-0.016
pole-0.016
Token playing
Token ID2712
Feature activation-1.03e+00

Pos logit contributions
ah+0.019
et+0.018
 seven+0.016
pole+0.016
wn+0.016
Neg logit contributions
 ourselves-0.023
uddled-0.019
apy-0.019
ls-0.019
mons-0.019
Token.
Token ID13
Feature activation+6.87e-01

Pos logit contributions
 ourselves+0.025
apy+0.024
uddled+0.023
ls+0.022
mons+0.022
Neg logit contributions
ah-0.027
et-0.026
 seven-0.023
pole-0.022
oscope-0.022
Token He
Token ID679
Feature activation+2.77e-01

Pos logit contributions
et+0.014
ah+0.014
 seven+0.012
oscope+0.011
pole+0.011
Neg logit contributions
 ourselves-0.021
apy-0.020
uddled-0.019
mons-0.019
Sorry-0.018
Token mad
Token ID8805
Feature activation-2.43e+00

Pos logit contributions
 ourselves+0.021
uddled+0.021
mons+0.021
apy+0.020
we+0.019
Neg logit contributions
et-0.017
ah-0.016
 seven-0.015
oscope-0.015
aling-0.014
Token.
Token ID13
Feature activation-8.07e-01

Pos logit contributions
 ourselves+0.080
apy+0.079
uddled+0.077
mons+0.077
Sorry+0.072
Neg logit contributions
et-0.039
ah-0.036
 seven-0.032
 about-0.028
 2-0.027
Token They
Token ID1119
Feature activation-7.96e-01

Pos logit contributions
 ourselves+0.018
uddled+0.017
apy+0.017
mons+0.016
ls+0.016
Neg logit contributions
et-0.022
ah-0.022
 seven-0.020
oscope-0.019
 2-0.019
Token say
Token ID910
Feature activation-8.54e-01

Pos logit contributions
 ourselves+0.025
apy+0.024
uddled+0.024
mons+0.023
ls+0.023
Neg logit contributions
et-0.015
ah-0.015
 seven-0.013
oscope-0.012
aling-0.011
Token,
Token ID11
Feature activation+1.29e+00

Pos logit contributions
 ourselves+0.026
apy+0.025
uddled+0.025
mons+0.024
ls+0.023
Neg logit contributions
et-0.017
ah-0.017
 seven-0.014
oscope-0.013
pole-0.012
Token "
Token ID366
Feature activation-2.80e+00

Pos logit contributions
et+0.035
ah+0.035
wn+0.030
ont+0.029
aling+0.029
Neg logit contributions
 ourselves-0.030
apy-0.030
uddled-0.029
Sorry-0.027
mons-0.027
TokenYou
Token ID1639
Feature activation-5.34e-01

Pos logit contributions
uddled+0.060
ls+0.054
apy+0.053
 ourselves+0.052
istered+0.052
Neg logit contributions
ah-0.077
et-0.073
 seven-0.072
oscope-0.067
aster-0.065
Token are
Token ID389
Feature activation-1.20e+00

Pos logit contributions
uddled+0.017
we+0.016
 ourselves+0.016
apy+0.016
mons+0.016
Neg logit contributions
ah-0.009
et-0.009
 seven-0.008
ont-0.008
 time-0.008
Token a
Token ID257
Feature activation-9.75e-01

Pos logit contributions
apy+0.034
uddled+0.033
 ourselves+0.033
mons+0.031
ls+0.031
Neg logit contributions
et-0.026
ah-0.024
 seven-0.023
oscope-0.021
pole-0.021
Token bad
Token ID2089
Feature activation-1.14e+00

Pos logit contributions
 ourselves+0.029
uddled+0.029
apy+0.028
mons+0.028
we+0.025
Neg logit contributions
ah-0.021
et-0.019
 seven-0.018
 knight-0.017
 time-0.016
Token man
Token ID582
Feature activation+1.05e+00

Pos logit contributions
 ourselves+0.031
apy+0.029
uddled+0.028
istered+0.027
mons+0.027
Neg logit contributions
ah-0.030
et-0.027
 seven-0.024
oscope-0.022
ont-0.022
TokenL
Token ID43
Feature activation+1.41e+00

Pos logit contributions
 ourselves+0.004
apy+0.004
uddled+0.004
mons+0.003
ls+0.003
Neg logit contributions
et-0.005
ah-0.005
 seven-0.004
oscope-0.004
pole-0.004
Tokenily
Token ID813
Feature activation-3.52e-02

Pos logit contributions
et+0.031
ah+0.030
 seven+0.026
oscope+0.025
pole+0.024
Neg logit contributions
 ourselves-0.040
uddled-0.038
apy-0.038
mons-0.037
ls-0.036
Token said
Token ID531
Feature activation-2.80e-01

Pos logit contributions
 ourselves+0.001
uddled+0.001
apy+0.001
ls+0.001
mons+0.001
Neg logit contributions
et-0.001
ah-0.001
 seven-0.001
oscope-0.001
wn-0.001
Token,
Token ID11
Feature activation+1.15e+00

Pos logit contributions
 ourselves+0.009
apy+0.008
uddled+0.008
mons+0.008
ls+0.008
Neg logit contributions
et-0.006
ah-0.005
 seven-0.005
oscope-0.004
pole-0.004
Token "
Token ID366
Feature activation-1.70e+00

Pos logit contributions
et+0.033
ah+0.031
oscope+0.028
wn+0.028
ont+0.027
Neg logit contributions
apy-0.026
 ourselves-0.025
uddled-0.024
mons-0.023
Sorry-0.023
TokenWow
Token ID22017
Feature activation-3.25e+00

Pos logit contributions
 ourselves+0.032
apy+0.029
uddled+0.029
mons+0.028
ls+0.027
Neg logit contributions
et-0.054
ah-0.053
 seven-0.048
oscope-0.047
pole-0.045
Token!
Token ID0
Feature activation-7.77e-01

Pos logit contributions
 ourselves+0.107
uddled+0.104
apy+0.104
mons+0.103
Sorry+0.101
Neg logit contributions
et-0.055
ah-0.054
 seven-0.045
 2-0.038
pole-0.038
Token Look
Token ID6803
Feature activation-1.63e+00

Pos logit contributions
 ourselves+0.020
apy+0.019
mons+0.018
uddled+0.018
Sorry+0.017
Neg logit contributions
et-0.020
ah-0.019
oscope-0.017
 seven-0.016
wn-0.016
Token at
Token ID379
Feature activation+3.85e-01

Pos logit contributions
 ourselves+0.041
apy+0.038
ls+0.038
uddled+0.037
mons+0.036
Neg logit contributions
et-0.040
ah-0.040
oscope-0.038
aling-0.034
alling-0.033
Token the
Token ID262
Feature activation+3.95e-01

Pos logit contributions
ah+0.009
et+0.009
oscope+0.008
wn+0.008
 seven+0.007
Neg logit contributions
 ourselves-0.011
ls-0.009
apy-0.009
uddled-0.009
iring-0.009
Token pretty
Token ID2495
Feature activation+1.57e-01

Pos logit contributions
et+0.011
ah+0.011
 seven+0.010
oscope+0.010
aling+0.009
Neg logit contributions
 ourselves-0.009
apy-0.008
uddled-0.008
ls-0.008
mons-0.008
Token mom
Token ID1995
Feature activation+4.81e-01

Pos logit contributions
 ourselves+0.005
ins+0.004
ls+0.004
mons+0.004
apy+0.004
Neg logit contributions
et-0.004
ah-0.003
 time-0.003
 seven-0.003
oscope-0.003
Token came
Token ID1625
Feature activation+1.87e-01

Pos logit contributions
et+0.012
ah+0.012
 seven+0.010
oscope+0.010
wn+0.010
Neg logit contributions
 ourselves-0.012
uddled-0.012
apy-0.012
ls-0.011
mons-0.011
Token in
Token ID287
Feature activation+8.14e-01

Pos logit contributions
ah+0.004
et+0.004
 seven+0.003
oscope+0.003
pole+0.003
Neg logit contributions
 ourselves-0.005
uddled-0.005
apy-0.005
iring-0.005
ls-0.005
Token and
Token ID290
Feature activation+6.61e-01

Pos logit contributions
et+0.022
ah+0.021
pole+0.019
oscope+0.017
aling+0.017
Neg logit contributions
 ourselves-0.020
apy-0.018
uddled-0.018
Sorry-0.018
iring-0.017
Token saw
Token ID2497
Feature activation-1.46e-01

Pos logit contributions
et+0.014
ah+0.014
 seven+0.012
oscope+0.011
pole+0.011
Neg logit contributions
 ourselves-0.020
apy-0.018
uddled-0.018
mons-0.018
ls-0.017
Token how
Token ID703
Feature activation-4.28e+00

Pos logit contributions
 ourselves+0.005
apy+0.005
uddled+0.004
istered+0.004
Sorry+0.004
Neg logit contributions
et-0.003
ah-0.002
 about-0.002
ock-0.002
 seven-0.002
Token clean
Token ID3424
Feature activation+3.56e-01

Pos logit contributions
 ourselves+0.133
mons+0.121
uddled+0.120
sters+0.119
oles+0.114
Neg logit contributions
et-0.069
ah-0.066
 seven-0.064
 2-0.062
ary-0.060
Token her
Token ID607
Feature activation-3.91e-01

Pos logit contributions
et+0.008
ah+0.007
 seven+0.006
oscope+0.006
pole+0.006
Neg logit contributions
 ourselves-0.010
apy-0.010
uddled-0.010
mons-0.010
Sorry-0.009
Token room
Token ID2119
Feature activation-1.35e-01

Pos logit contributions
 ourselves+0.010
apy+0.009
Sorry+0.009
uddled+0.009
mons+0.009
Neg logit contributions
et-0.010
ah-0.009
oscope-0.009
 seven-0.008
 time-0.008
Token was
Token ID373
Feature activation+6.09e-01

Pos logit contributions
 ourselves+0.003
apy+0.003
uddled+0.003
mons+0.003
ls+0.003
Neg logit contributions
et-0.004
ah-0.004
 seven-0.003
oscope-0.003
pole-0.003
Token.
Token ID13
Feature activation+3.51e-01

Pos logit contributions
ah+0.015
et+0.015
 seven+0.013
oscope+0.013
aling+0.012
Neg logit contributions
 ourselves-0.016
uddled-0.015
apy-0.015
mons-0.014
ls-0.014
Token dog
Token ID3290
Feature activation+3.02e-01

Pos logit contributions
et+0.003
ah+0.003
 seven+0.003
oscope+0.002
 2+0.002
Neg logit contributions
 ourselves-0.004
apy-0.004
ins-0.004
uddled-0.004
mons-0.004
Token and
Token ID290
Feature activation+6.73e-01

Pos logit contributions
ah+0.008
et+0.007
 seven+0.007
wn+0.007
ont+0.006
Neg logit contributions
 ourselves-0.007
ls-0.007
uddled-0.006
mons-0.006
s-0.006
Token said
Token ID531
Feature activation-8.43e-01

Pos logit contributions
et+0.016
ah+0.015
 seven+0.013
oscope+0.013
pole+0.012
Neg logit contributions
 ourselves-0.018
apy-0.017
uddled-0.017
ls-0.017
mons-0.017
Token,
Token ID11
Feature activation+1.50e+00

Pos logit contributions
 ourselves+0.025
apy+0.024
uddled+0.023
mons+0.023
ls+0.023
Neg logit contributions
et-0.018
ah-0.017
 seven-0.015
oscope-0.014
pole-0.013
Token "
Token ID366
Feature activation-1.96e+00

Pos logit contributions
et+0.042
ah+0.041
wn+0.036
ont+0.036
oscope+0.036
Neg logit contributions
 ourselves-0.033
apy-0.033
uddled-0.031
mons-0.031
balls-0.030
TokenHello
Token ID15496
Feature activation-2.91e+00

Pos logit contributions
 ourselves+0.047
apy+0.044
uddled+0.043
mons+0.042
ls+0.041
Neg logit contributions
et-0.053
ah-0.052
 seven-0.046
oscope-0.045
pole-0.043
Token,
Token ID11
Feature activation-4.59e-01

Pos logit contributions
 ourselves+0.092
apy+0.089
uddled+0.088
mons+0.086
Sorry+0.083
Neg logit contributions
ah-0.054
et-0.054
 seven-0.044
oscope-0.040
pole-0.039
Token do
Token ID466
Feature activation-4.87e-01

Pos logit contributions
 ourselves+0.010
apy+0.010
uddled+0.010
ls+0.009
mons+0.009
Neg logit contributions
et-0.013
ah-0.013
 seven-0.011
oscope-0.011
pole-0.010
Tokenggy
Token ID19970
Feature activation+5.76e-01

Pos logit contributions
 ourselves+0.011
uddled+0.011
apy+0.011
mons+0.010
istered+0.010
Neg logit contributions
ah-0.012
et-0.012
 seven-0.011
ary-0.010
oscope-0.010
Token!
Token ID0
Feature activation-3.17e-01

Pos logit contributions
et+0.014
ah+0.014
 seven+0.012
oscope+0.012
pole+0.011
Neg logit contributions
 ourselves-0.015
apy-0.014
uddled-0.014
mons-0.013
ls-0.013
Token You
Token ID921
Feature activation-3.80e-01

Pos logit contributions
 ourselves+0.007
apy+0.007
Sorry+0.006
We+0.006
mons+0.006
Neg logit contributions
et-0.009
ah-0.008
wn-0.007
ont-0.007
 seven-0.007
Token in
Token ID287
Feature activation+1.08e+00

Pos logit contributions
ah+0.043
et+0.041
oscope+0.036
ont+0.035
aster+0.034
Neg logit contributions
 ourselves-0.043
uddled-0.040
ls-0.040
iring-0.039
Sorry-0.038
Token the
Token ID262
Feature activation+4.85e-01

Pos logit contributions
ah+0.030
et+0.030
pole+0.027
ont+0.026
aling+0.025
Neg logit contributions
 ourselves-0.024
apy-0.022
 themselves-0.021
We-0.020
ls-0.019
Token room
Token ID2119
Feature activation+7.55e-01

Pos logit contributions
ah+0.012
et+0.012
oscope+0.010
pole+0.010
 seven+0.009
Neg logit contributions
 ourselves-0.013
apy-0.012
ls-0.011
uddled-0.011
Sorry-0.011
Token.
Token ID13
Feature activation+1.01e+00

Pos logit contributions
ah+0.018
et+0.017
 seven+0.015
oscope+0.014
wn+0.014
Neg logit contributions
 ourselves-0.020
uddled-0.019
apy-0.019
mons-0.018
ls-0.018
Token "
Token ID366
Feature activation-1.68e+00

Pos logit contributions
et+0.022
ah+0.022
 seven+0.019
oscope+0.018
 2+0.017
Neg logit contributions
 ourselves-0.029
uddled-0.027
apy-0.026
ls-0.026
mons-0.026
TokenWhat
Token ID2061
Feature activation-2.90e+00

Pos logit contributions
 ourselves+0.035
apy+0.033
uddled+0.032
ls+0.031
mons+0.030
Neg logit contributions
et-0.050
ah-0.050
 seven-0.044
oscope-0.043
pole-0.041
Token are
Token ID389
Feature activation-1.08e+00

Pos logit contributions
 ourselves+0.086
uddled+0.084
apy+0.080
mons+0.079
Sorry+0.079
Neg logit contributions
ah-0.059
et-0.056
 seven-0.047
ont-0.046
oscope-0.045
Token you
Token ID345
Feature activation+1.18e+00

Pos logit contributions
uddled+0.023
istered+0.022
apy+0.022
 ourselves+0.022
Sorry+0.021
Neg logit contributions
et-0.029
ah-0.028
 seven-0.027
 2-0.025
pole-0.024
Token doing
Token ID1804
Feature activation+2.49e+00

Pos logit contributions
et+0.033
ah+0.033
 seven+0.029
oscope+0.028
pole+0.027
Neg logit contributions
 ourselves-0.027
apy-0.023
uddled-0.023
mons-0.023
iring-0.022
Token,
Token ID11
Feature activation+1.06e+00

Pos logit contributions
et+0.063
ah+0.063
 seven+0.054
oscope+0.053
pole+0.050
Neg logit contributions
 ourselves-0.063
apy-0.058
uddled-0.057
mons-0.056
ls-0.055
Token Sweet
Token ID15335
Feature activation+4.97e-02

Pos logit contributions
et+0.022
ah+0.021
 seven+0.018
pole+0.017
 2+0.017
Neg logit contributions
 ourselves-0.031
uddled-0.031
apy-0.031
mons-0.030
istered-0.029