TOP ACTIVATIONS
MAX = 3.945

is it , Tom ?" she asked . He
treat and a hug ?" she asked . "
making such a mess ?" she asks and looks cross .
is wrong , Lily ?" she asks . "
back to the leaves ?" she asked . "
c actus , Tom ?" she asks . "
play in the garden ?" she asked . Tom
is going on here ?" she asked . L
we go and see ?" she asks . Tom
Can I play too ?" she asks . "
back in the earth ?" she asked . "
's wrong , boys ?" she asked . "
" What happened here ?" she asks . She is not
a lime , Ben ?" she asks . "
" What happened here ?" she asks . They
I be your friend ?" she says . The
the tree , Ben ?" she asked . "
the bugs look big ?" she asked . Sam
" What happened ?" she asked . "
I open it now ?" she asks . Tom

MIN ACTIVATIONS
MIN = -3.611

They both put on their favorite clothes and headed outside .
her mom and wore her favorite sweater all day long .
. He was wearing his favorite spider man shirt .
started to sort out a snowball fight with some of his
discovered a long - for gotten t eddy bear inside .
made snow men and had snowball fights . They used their
man , and had a snowball fight . It was so
was colorful with lots of red s and blues . Max
. Tim my hadn 't ever been to a doctor before
and pretended to have a snowball fight with her imaginary friend
and stickers . They use red , blue , yellow and
circles , and even won ky circles . Her mom my
snow man and had a snowball fight with her friends .
, Lily proudly wore her favorite dress to go visiting with
river ! Little Bear was so excited .
He pulled and tug ged at the lid , but
so she put on her favorite sweater and cl apped her
thought that nothing bad could ever happen .
day . They worked delic ately and carefully and nothing broke
. He put on his favorite blue cap and hopped in

INTERVAL 2.056 - 3.945
CONTAINS 0.131%

" Where are our structures ?" Lily asked .
. <|endoftext|> L ily was very excited . She was going
them could think what the trade was .
That 's a curtain ," he said . He
you with your nest ?" they asked . The

INTERVAL 0.167 - 2.056
CONTAINS 37.381%

Do you like it , Lily ?" Mom says .
you keep following me ?". The bee pe ep
share the ball and the car . They make a new
. You can have your balloon now . Here you go
and ran to his mom . His mom hugged him and

INTERVAL -1.722 - 0.167
CONTAINS 62.208%

higher . His wish had come true , and he could
Once there was a boy named Sam . He was very
smiled and said , " I love you too , Ben
spoil your collection ." But Sammy didn 't listen and kept
the new wheels to his car . When she was done

INTERVAL -3.611 - -1.722
CONTAINS 0.280%

blue the sky was , about how much she missed her
loved picking shells at the beach . She had a special
not know if they would ever see their friends and family
time . They yelled and blamed and fired people .
the garden . He dug deep into the dirt and pulled
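
The four INTERVAL headings above appear to be equal-width bins between MIN (-3.611) and MAX (3.945), and the CONTAINS percentages appear to be each bin's share of all activations. The sketch below checks only that arithmetic. The function name `interval_edges` and the choice of four bins are assumptions made for illustration, not something the dashboard itself states.

```python
def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as listed in the dump; four bins is an assumption read off the headings.
edges = interval_edges(-3.611, 3.945, 4)

# Pair consecutive edges to get (low, high) ranges, lowest interval first.
intervals = list(zip(edges[:-1], edges[1:]))

# CONTAINS percentages copied from the dump, ordered from lowest to highest interval.
shares = [0.280, 62.208, 37.381, 0.131]
total_share = sum(shares)

for (low, high), share in zip(intervals, shares):
    print(f"{low:+.3f} to {high:+.3f}: {share:.3f}%")
```

If the assumption holds, the printed ranges should line up with the INTERVAL headings, and `total_share` should come to about 100, which suggests the four bins cover every sampled activation.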
Token is
Token ID318
Feature activation+5.53e-01

Pos logit contributions
 Finally+0.014
 Sure+0.014
+0.013
icol+0.013
uggle+0.012
Neg logit contributions
 angry-0.014
 losers-0.013
aky-0.012
 alone-0.012
 lazy-0.012
Token it
Token ID340
Feature activation+4.65e-01

Pos logit contributions
 angry+0.011
 mean+0.011
 my+0.010
 rude+0.010
 selfish+0.009
Neg logit contributions
-0.021
 Finally-0.021
icol-0.021
 Sure-0.020
 Eventually-0.019
Token,
Token ID11
Feature activation+2.38e-01

Pos logit contributions
 losers+0.014
aky+0.012
 angry+0.012
 musicians+0.012
 furious+0.012
Neg logit contributions
 Finally-0.011
 Sure-0.010
-0.010
ings-0.010
 purchase-0.010
Token Tom
Token ID4186
Feature activation+1.55e+00

Pos logit contributions
aky+0.006
umpy+0.006
 alone+0.006
 angry+0.005
 lazy+0.005
Neg logit contributions
 Finally-0.006
 Sure-0.006
icol-0.006
 purchase-0.006
 Enjoy-0.005
Token?"
Token ID1701
Feature activation-5.59e-01

Pos logit contributions
 losers+0.045
 angry+0.040
chie+0.038
 furious+0.037
 selfish+0.037
Neg logit contributions
 Finally-0.039
 Sure-0.039
icol-0.039
-0.035
ings-0.034
Token she
Token ID673
Feature activation+3.95e+00

Pos logit contributions
 Finally+0.016
 Sure+0.016
icol+0.013
+0.013
 Eventually+0.013
Neg logit contributions
 losers-0.016
aky-0.015
umpy-0.014
 reckless-0.013
 mean-0.013
Token asked
Token ID1965
Feature activation+1.83e-01

Pos logit contributions
 losers+0.129
igator+0.118
 alone+0.111
umpy+0.109
chie+0.108
Neg logit contributions
 Finally-0.085
icol-0.072
 Sure-0.072
-0.065
 Eventually-0.062
Token.
Token ID13
Feature activation+4.56e-01

Pos logit contributions
 losers+0.006
 mean+0.006
 lazy+0.005
 angry+0.005
 selfish+0.005
Neg logit contributions
 Finally-0.004
 Sure-0.004
-0.004
icol-0.004
ings-0.004
Token
Token ID198
Feature activation+5.62e-01

Pos logit contributions
 angry+0.013
 mean+0.012
 losers+0.012
 rude+0.011
 lazy+0.011
Neg logit contributions
 Finally-0.013
icol-0.013
 Sure-0.013
-0.012
 Eventually-0.011
Token
Token ID198
Feature activation+3.89e-01

Pos logit contributions
 angry+0.015
 mean+0.015
 losers+0.014
 rude+0.014
 lazy+0.014
Neg logit contributions
 Finally-0.016
icol-0.016
 Sure-0.016
-0.015
 Eventually-0.014
TokenHe
Token ID1544
Feature activation+1.05e+00

Pos logit contributions
 losers+0.011
 angry+0.010
 mean+0.010
 lonely+0.010
 lazy+0.010
Neg logit contributions
 Sure-0.012
 Finally-0.012
icol-0.010
-0.010
Finally-0.010
Token treat
Token ID2190
Feature activation+5.05e-01

Pos logit contributions
 Finally+0.013
icol+0.013
 Sure+0.012
+0.012
 Eventually+0.011
Neg logit contributions
 angry-0.014
 mean-0.014
 losers-0.013
 rude-0.013
 lazy-0.013
Token and
Token ID290
Feature activation-1.66e-01

Pos logit contributions
 angry+0.014
 lazy+0.014
 losers+0.014
 selfish+0.014
 furious+0.014
Neg logit contributions
 Sure-0.013
 Finally-0.012
-0.012
uggle-0.011
ings-0.011
Token a
Token ID257
Feature activation-2.69e-01

Pos logit contributions
 Finally+0.005
 Sure+0.004
icol+0.004
+0.004
 purchase+0.004
Neg logit contributions
 angry-0.005
umpy-0.005
aky-0.005
 losers-0.004
 mean-0.004
Token hug
Token ID16225
Feature activation+9.02e-01

Pos logit contributions
 Finally+0.006
icol+0.006
 Sure+0.006
+0.006
 Eventually+0.005
Neg logit contributions
 angry-0.009
 mean-0.009
 losers-0.008
 rude-0.008
 lazy-0.008
Token?"
Token ID1701
Feature activation-4.36e-01

Pos logit contributions
 furious+0.027
 lazy+0.027
 impressed+0.027
 angry+0.026
 losers+0.026
Neg logit contributions
 Finally-0.019
 Sure-0.019
ings-0.018
uggle-0.018
-0.017
Token she
Token ID673
Feature activation+3.90e+00

Pos logit contributions
 Finally+0.014
 Sure+0.013
+0.012
icol+0.012
 Eventually+0.011
Neg logit contributions
 losers-0.011
aky-0.011
 angry-0.010
 mean-0.010
umpy-0.010
Token asked
Token ID1965
Feature activation+4.29e-01

Pos logit contributions
 losers+0.120
igator+0.113
umpy+0.106
chie+0.105
igators+0.103
Neg logit contributions
 Finally-0.087
 Sure-0.080
icol-0.078
-0.069
 Eventually-0.065
Token.
Token ID13
Feature activation+8.88e-01

Pos logit contributions
 angry+0.011
 mean+0.010
 losers+0.010
 rude+0.009
 lazy+0.009
Neg logit contributions
 Finally-0.014
icol-0.014
 Sure-0.013
-0.013
 Eventually-0.012
Token
Token ID198
Feature activation+7.17e-01

Pos logit contributions
 angry+0.023
 mean+0.022
 losers+0.021
 rude+0.020
 lazy+0.020
Neg logit contributions
icol-0.027
 Finally-0.027
 Sure-0.026
-0.026
 Eventually-0.023
Token
Token ID198
Feature activation+7.51e-01

Pos logit contributions
 angry+0.018
 mean+0.017
 losers+0.016
 lazy+0.015
 rude+0.015
Neg logit contributions
icol-0.023
 Finally-0.022
-0.022
 Sure-0.022
 Eventually-0.019
Token"
Token ID1
Feature activation+5.44e-01

Pos logit contributions
 angry+0.018
 losers+0.017
 mean+0.017
 selfish+0.016
 lazy+0.016
Neg logit contributions
 Finally-0.025
 Sure-0.025
-0.024
icol-0.023
uggle-0.021
Token making
Token ID1642
Feature activation+7.94e-01

Pos logit contributions
 losers+0.023
igator+0.023
igators+0.021
aky+0.021
umpy+0.020
Neg logit contributions
-0.025
 Finally-0.022
icol-0.020
 purchase-0.020
 Sure-0.020
Token such
Token ID884
Feature activation-2.89e-01

Pos logit contributions
aky+0.021
umpy+0.021
 losers+0.020
 mean+0.019
 reckless+0.019
Neg logit contributions
icol-0.021
 Finally-0.020
-0.019
ings-0.018
 Sure-0.018
Token a
Token ID257
Feature activation+5.65e-02

Pos logit contributions
 Finally+0.008
icol+0.008
 Sure+0.007
+0.007
 Eventually+0.007
Neg logit contributions
 losers-0.008
 angry-0.007
 mean-0.007
aky-0.007
igator-0.007
Token mess
Token ID2085
Feature activation+1.86e+00

Pos logit contributions
 angry+0.001
 mean+0.001
 losers+0.001
 rude+0.001
 lazy+0.001
Neg logit contributions
 Finally-0.002
 Sure-0.002
icol-0.002
-0.002
 Eventually-0.002
Token?"
Token ID1701
Feature activation-1.04e-01

Pos logit contributions
 losers+0.058
 furious+0.054
chie+0.053
 selfish+0.053
 lazy+0.053
Neg logit contributions
ings-0.034
-0.032
uggle-0.030
icol-0.029
ails-0.029
Token she
Token ID673
Feature activation+3.81e+00

Pos logit contributions
 Finally+0.003
 Sure+0.003
icol+0.003
+0.003
 Eventually+0.003
Neg logit contributions
 angry-0.002
 losers-0.002
 mean-0.002
 lazy-0.002
 rude-0.002
Token asks
Token ID7893
Feature activation+7.25e-02

Pos logit contributions
 losers+0.110
aky+0.096
igator+0.096
chie+0.093
 alone+0.091
Neg logit contributions
 Finally-0.092
 Sure-0.088
icol-0.086
-0.076
 curiously-0.072
Token and
Token ID290
Feature activation+3.57e-01

Pos logit contributions
 angry+0.002
 mean+0.002
 losers+0.002
 rude+0.002
 lazy+0.002
Neg logit contributions
 Finally-0.002
icol-0.002
 Sure-0.002
-0.002
 Eventually-0.002
Token looks
Token ID3073
Feature activation-7.99e-02

Pos logit contributions
 angry+0.007
 mean+0.006
 losers+0.006
 rude+0.005
 lazy+0.005
Neg logit contributions
 Finally-0.014
icol-0.013
 Sure-0.013
-0.013
 Eventually-0.012
Token cross
Token ID3272
Feature activation+1.34e+00

Pos logit contributions
 Finally+0.003
icol+0.003
 Sure+0.003
+0.003
 Eventually+0.003
Neg logit contributions
 losers-0.001
 angry-0.001
 mean-0.001
 rude-0.001
 lazy-0.001
Token.
Token ID13
Feature activation+4.85e-01

Pos logit contributions
 losers+0.044
aky+0.040
 mean+0.040
 lazy+0.040
 selfish+0.039
Neg logit contributions
 Finally-0.028
 Sure-0.027
ings-0.027
icol-0.024
-0.024
Token is
Token ID318
Feature activation+3.30e-01

Pos logit contributions
 Finally+0.020
 Sure+0.020
+0.020
icol+0.019
uggle+0.018
Neg logit contributions
 angry-0.022
 losers-0.022
 alone-0.020
aky-0.020
 musicians-0.020
Token wrong
Token ID2642
Feature activation+1.47e+00

Pos logit contributions
 angry+0.008
 mean+0.008
 losers+0.007
 rude+0.007
 selfish+0.007
Neg logit contributions
 Finally-0.011
icol-0.011
 Sure-0.011
-0.011
 Eventually-0.010
Token,
Token ID11
Feature activation+9.29e-01

Pos logit contributions
 losers+0.041
 angry+0.040
 mean+0.038
aky+0.036
 selfish+0.036
Neg logit contributions
icol-0.040
 Finally-0.040
 Sure-0.039
-0.039
ings-0.035
Token Lily
Token ID20037
Feature activation+2.75e+00

Pos logit contributions
v+0.024
aky+0.024
chie+0.024
 alone+0.024
umpy+0.024
Neg logit contributions
 Finally-0.019
 Sure-0.019
icol-0.018
-0.018
 purchase-0.017
Token?"
Token ID1701
Feature activation-4.39e-01

Pos logit contributions
 losers+0.075
 angry+0.067
chie+0.063
 selfish+0.062
 mean+0.061
Neg logit contributions
icol-0.076
 Finally-0.074
 Sure-0.072
-0.070
ings-0.065
Token she
Token ID673
Feature activation+3.77e+00

Pos logit contributions
 Finally+0.013
 Sure+0.013
icol+0.012
+0.012
 Eventually+0.011
Neg logit contributions
 losers-0.012
aky-0.010
 mean-0.010
 angry-0.010
 selfish-0.010
Token asks
Token ID7893
Feature activation+6.71e-02

Pos logit contributions
 losers+0.098
 angry+0.086
aky+0.085
igator+0.084
 mean+0.083
Neg logit contributions
 Finally-0.107
icol-0.105
 Sure-0.101
-0.093
 Eventually-0.089
Token.
Token ID13
Feature activation+8.33e-01

Pos logit contributions
 angry+0.002
 mean+0.002
 losers+0.002
 rude+0.002
 lazy+0.002
Neg logit contributions
 Finally-0.002
icol-0.002
 Sure-0.002
-0.002
 Eventually-0.002
Token
Token ID198
Feature activation+6.17e-01

Pos logit contributions
 angry+0.022
 mean+0.021
 losers+0.020
 rude+0.019
 lazy+0.019
Neg logit contributions
 Finally-0.025
icol-0.025
 Sure-0.025
-0.024
 Eventually-0.022
Token
Token ID198
Feature activation+3.85e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 losers+0.013
 rude+0.013
 lazy+0.013
Neg logit contributions
icol-0.020
 Finally-0.020
-0.019
 Sure-0.019
 Eventually-0.017
Token"
Token ID1
Feature activation+8.19e-01

Pos logit contributions
 losers+0.008
 angry+0.008
 mean+0.008
 selfish+0.007
 lazy+0.007
Neg logit contributions
 Finally-0.013
 Sure-0.013
-0.012
icol-0.012
 Eventually-0.011
Token back
Token ID736
Feature activation-5.12e-02

Pos logit contributions
 angry+0.006
 mean+0.006
 losers+0.005
 lazy+0.005
 rude+0.005
Neg logit contributions
 Finally-0.009
icol-0.009
 Sure-0.008
-0.008
 Eventually-0.008
Token to
Token ID284
Feature activation+8.48e-01

Pos logit contributions
 Sure+0.001
 Finally+0.001
icol+0.001
+0.001
uggle+0.001
Neg logit contributions
 angry-0.001
 losers-0.001
 mean-0.001
 selfish-0.001
 rude-0.001
Token the
Token ID262
Feature activation-3.20e-01

Pos logit contributions
aky+0.015
 losers+0.015
 mean+0.014
 angry+0.014
umpy+0.014
Neg logit contributions
 Sure-0.031
icol-0.031
 Finally-0.030
-0.030
 purchase-0.028
Token leaves
Token ID5667
Feature activation+1.65e+00

Pos logit contributions
 Finally+0.008
 Sure+0.008
icol+0.008
+0.007
uggle+0.006
Neg logit contributions
 losers-0.010
 angry-0.010
 mean-0.010
aky-0.009
 selfish-0.009
Token?"
Token ID1701
Feature activation-6.05e-01

Pos logit contributions
 musicians+0.050
 losers+0.049
 angry+0.047
 furious+0.047
 selfish+0.046
Neg logit contributions
-0.040
uggle-0.037
 Finally-0.035
 Sure-0.033
ings-0.030
Token she
Token ID673
Feature activation+3.77e+00

Pos logit contributions
 Finally+0.020
 Sure+0.019
+0.018
icol+0.018
 Eventually+0.017
Neg logit contributions
 losers-0.015
 angry-0.014
 mean-0.013
aky-0.013
 lazy-0.012
Token asked
Token ID1965
Feature activation+7.25e-02

Pos logit contributions
 losers+0.123
igator+0.113
umpy+0.111
chie+0.108
 winners+0.103
Neg logit contributions
 Finally-0.077
 Sure-0.070
icol-0.066
-0.059
 trek-0.059
Token.
Token ID13
Feature activation+3.86e-01

Pos logit contributions
 angry+0.002
 mean+0.002
 losers+0.002
 rude+0.002
 lazy+0.002
Neg logit contributions
 Finally-0.002
icol-0.002
 Sure-0.002
-0.002
 Eventually-0.002
Token
Token ID198
Feature activation+3.30e-01

Pos logit contributions
 angry+0.009
 mean+0.009
 losers+0.008
 rude+0.008
 lazy+0.008
Neg logit contributions
icol-0.013
 Finally-0.013
 Sure-0.012
-0.012
uggle-0.011
Token
Token ID198
Feature activation+2.58e-01

Pos logit contributions
 angry+0.008
 mean+0.007
 losers+0.007
 kids+0.007
 lazy+0.007
Neg logit contributions
icol-0.011
 Finally-0.011
-0.010
 Sure-0.010
 Eventually-0.009
Token"
Token ID1
Feature activation+4.98e-01

Pos logit contributions
 angry+0.005
 losers+0.005
 mean+0.005
 lazy+0.005
 selfish+0.005
Neg logit contributions
 Finally-0.009
 Sure-0.009
-0.009
icol-0.009
uggle-0.008
Token c
Token ID269
Feature activation+1.67e+00

Pos logit contributions
 angry+0.000
 mean+0.000
 lazy+0.000
 my+0.000
 losers+0.000
Neg logit contributions
 Finally-0.000
 Sure-0.000
icol-0.000
-0.000
 Eventually-0.000
Tokenactus
Token ID34144
Feature activation+9.79e-01

Pos logit contributions
 my+0.038
 angry+0.037
 bald+0.037
 mean+0.037
 lazy+0.037
Neg logit contributions
 Finally-0.053
 Sure-0.051
icol-0.047
 Eventually-0.046
 generation-0.046
Token,
Token ID11
Feature activation+9.01e-01

Pos logit contributions
 losers+0.029
 musicians+0.028
 angry+0.026
 boys+0.026
aky+0.025
Neg logit contributions
-0.024
 Finally-0.023
uggle-0.023
 Sure-0.022
ings-0.022
Token Tom
Token ID4186
Feature activation+2.16e+00

Pos logit contributions
umpy+0.024
aky+0.024
chie+0.023
 angry+0.023
 alone+0.022
Neg logit contributions
 Finally-0.024
 Sure-0.023
icol-0.020
 purchase-0.020
-0.020
Token?"
Token ID1701
Feature activation-5.30e-01

Pos logit contributions
 losers+0.062
 angry+0.057
 furious+0.054
chie+0.053
 musicians+0.053
Neg logit contributions
 Finally-0.051
 Sure-0.050
icol-0.048
ool-0.044
ings-0.044
Token she
Token ID673
Feature activation+3.71e+00

Pos logit contributions
 Finally+0.018
 Sure+0.017
icol+0.015
+0.015
 Eventually+0.015
Neg logit contributions
 losers-0.012
aky-0.012
 mean-0.011
 angry-0.011
chie-0.010
Token asks
Token ID7893
Feature activation+9.72e-02

Pos logit contributions
 losers+0.114
aky+0.102
chie+0.102
igator+0.100
umpy+0.100
Neg logit contributions
 Finally-0.082
 Sure-0.074
icol-0.070
 Eventually-0.061
 curiously-0.060
Token.
Token ID13
Feature activation+1.12e+00

Pos logit contributions
 losers+0.002
 mean+0.002
 angry+0.002
 lazy+0.002
 selfish+0.002
Neg logit contributions
 Finally-0.003
 Sure-0.003
-0.003
icol-0.003
 Eventually-0.003
Token
Token ID198
Feature activation+1.17e+00

Pos logit contributions
 angry+0.030
 mean+0.029
 losers+0.029
 rude+0.027
 lazy+0.027
Neg logit contributions
 Finally-0.033
icol-0.032
 Sure-0.032
-0.031
 Eventually-0.028
Token
Token ID198
Feature activation+7.91e-01

Pos logit contributions
 angry+0.029
 mean+0.028
 losers+0.027
 rude+0.026
 lazy+0.025
Neg logit contributions
icol-0.037
 Finally-0.036
-0.035
 Sure-0.035
 Eventually-0.031
Token"
Token ID1
Feature activation+6.16e-01

Pos logit contributions
 angry+0.019
 lonely+0.018
 mean+0.018
 losers+0.018
 lazy+0.018
Neg logit contributions
 Finally-0.024
 Sure-0.024
-0.022
icol-0.021
Finally-0.020
Token play
Token ID711
Feature activation+1.87e+00

Pos logit contributions
aky+0.029
 losers+0.029
umpy+0.028
 angry+0.028
 selfish+0.027
Neg logit contributions
 Sure-0.016
 Finally-0.016
icol-0.015
uggle-0.013
-0.013
Token in
Token ID287
Feature activation+1.56e+00

Pos logit contributions
 angry+0.052
 losers+0.051
umpy+0.051
 selfish+0.050
 lazy+0.049
Neg logit contributions
-0.046
 Finally-0.046
 Sure-0.045
uggle-0.043
icol-0.042
Token the
Token ID262
Feature activation+5.91e-01

Pos logit contributions
 losers+0.035
 angry+0.033
umpy+0.033
aky+0.033
 selfish+0.033
Neg logit contributions
-0.047
 Sure-0.047
uggle-0.045
 Finally-0.045
icol-0.043
Token garden
Token ID11376
Feature activation+6.77e-01

Pos logit contributions
 losers+0.020
 angry+0.020
aky+0.019
 selfish+0.018
 impressed+0.018
Neg logit contributions
 Finally-0.013
icol-0.013
 Sure-0.012
uggle-0.011
-0.010
Token?"
Token ID1701
Feature activation-5.06e-01

Pos logit contributions
 losers+0.022
 musicians+0.021
chie+0.021
 furious+0.021
aky+0.020
Neg logit contributions
 Finally-0.014
-0.013
 Sure-0.013
uggle-0.012
ings-0.011
Token she
Token ID673
Feature activation+3.71e+00

Pos logit contributions
 Finally+0.015
 Sure+0.014
+0.014
icol+0.013
 Eventually+0.012
Neg logit contributions
 losers-0.014
aky-0.013
 angry-0.013
 mean-0.013
 lazy-0.012
Token asked
Token ID1965
Feature activation+1.26e-01

Pos logit contributions
 losers+0.123
igator+0.115
umpy+0.110
chie+0.109
igators+0.107
Neg logit contributions
 Finally-0.068
 Sure-0.062
icol-0.057
-0.051
 curiously-0.049
Token.
Token ID13
Feature activation+4.14e-01

Pos logit contributions
 angry+0.003
 losers+0.003
 mean+0.003
 lazy+0.003
 rude+0.003
Neg logit contributions
 Finally-0.004
 Sure-0.004
icol-0.003
-0.003
 Eventually-0.003
Token
Token ID198
Feature activation+6.36e-01

Pos logit contributions
 angry+0.010
 mean+0.010
 losers+0.010
 rude+0.009
 selfish+0.009
Neg logit contributions
icol-0.013
 Finally-0.013
 Sure-0.012
-0.012
 Eventually-0.011
Token
Token ID198
Feature activation+4.90e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 losers+0.013
 kids+0.013
 rude+0.013
Neg logit contributions
icol-0.021
 Finally-0.020
-0.020
 Sure-0.020
 generation-0.018
TokenTom
Token ID13787
Feature activation+6.24e-01

Pos logit contributions
 angry+0.011
 losers+0.010
 mean+0.010
 lazy+0.010
 selfish+0.009
Neg logit contributions
 Finally-0.017
 Sure-0.017
-0.016
icol-0.016
 Eventually-0.015
Token is
Token ID318
Feature activation+3.21e-01

Pos logit contributions
 Finally+0.018
 Sure+0.018
+0.018
icol+0.017
uggle+0.016
Neg logit contributions
 angry-0.019
 losers-0.019
 mean-0.017
 alone-0.017
aky-0.017
Token going
Token ID1016
Feature activation+1.35e+00

Pos logit contributions
 angry+0.007
 mean+0.007
 rude+0.006
 selfish+0.006
 losers+0.006
Neg logit contributions
 Finally-0.011
icol-0.011
 Sure-0.011
-0.011
 Eventually-0.010
Token on
Token ID319
Feature activation+1.05e+00

Pos logit contributions
 losers+0.032
aky+0.032
 angry+0.031
 musicians+0.029
 lazy+0.029
Neg logit contributions
-0.039
icol-0.039
 Finally-0.039
 Sure-0.039
uggle-0.035
Token here
Token ID994
Feature activation+6.98e-01

Pos logit contributions
 angry+0.022
aky+0.022
 losers+0.022
 mean+0.022
 selfish+0.020
Neg logit contributions
-0.036
icol-0.033
 Finally-0.033
 Sure-0.032
 purchase-0.031
Token?"
Token ID1701
Feature activation-7.89e-01

Pos logit contributions
 losers+0.019
 angry+0.019
 musicians+0.018
aky+0.018
 selfish+0.018
Neg logit contributions
-0.018
 Finally-0.018
 Sure-0.017
icol-0.017
ings-0.016
Token she
Token ID673
Feature activation+3.70e+00

Pos logit contributions
 Finally+0.025
 Sure+0.024
icol+0.024
+0.023
 Eventually+0.021
Neg logit contributions
 losers-0.019
 angry-0.019
 mean-0.018
aky-0.017
 selfish-0.017
Token asked
Token ID1965
Feature activation+2.22e-01

Pos logit contributions
 losers+0.117
igator+0.106
 winners+0.099
umpy+0.099
 alone+0.096
Neg logit contributions
 Finally-0.079
 Sure-0.073
icol-0.072
 curiously-0.064
-0.061
Token.
Token ID13
Feature activation+5.69e-01

Pos logit contributions
 angry+0.006
 mean+0.005
 losers+0.005
 rude+0.005
 lazy+0.005
Neg logit contributions
 Finally-0.007
icol-0.007
 Sure-0.007
-0.006
 Eventually-0.006
Token
Token ID198
Feature activation+4.36e-01

Pos logit contributions
 angry+0.014
 mean+0.014
 rude+0.013
 losers+0.013
 lazy+0.012
Neg logit contributions
icol-0.018
 Finally-0.018
 Sure-0.017
-0.017
uggle-0.015
Token
Token ID198
Feature activation+2.83e-01

Pos logit contributions
 angry+0.011
 mean+0.010
 losers+0.009
 lazy+0.009
 rude+0.009
Neg logit contributions
icol-0.014
 Finally-0.014
-0.014
 Sure-0.013
 Eventually-0.012
TokenL
Token ID43
Feature activation-1.85e-01

Pos logit contributions
 losers+0.006
 angry+0.006
 mean+0.005
 selfish+0.005
 rude+0.005
Neg logit contributions
 Finally-0.010
 Sure-0.010
icol-0.010
-0.010
 Eventually-0.009
Token we
Token ID356
Feature activation-1.50e-01

Pos logit contributions
 Sure+0.012
+0.012
 Finally+0.012
uggle+0.009
icol+0.009
Neg logit contributions
aky-0.018
umpy-0.016
 angry-0.015
 impressed-0.015
 losers-0.014
Token go
Token ID467
Feature activation+6.15e-01

Pos logit contributions
+0.003
 Finally+0.003
 Sure+0.003
icol+0.003
uggle+0.003
Neg logit contributions
 angry-0.005
 losers-0.005
aky-0.005
 selfish-0.005
 mean-0.005
Token and
Token ID290
Feature activation+4.63e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 losers+0.014
 selfish+0.014
 rude+0.013
Neg logit contributions
 Finally-0.019
 Sure-0.019
icol-0.019
-0.018
uggle-0.017
Token see
Token ID766
Feature activation+1.11e+00

Pos logit contributions
aky+0.017
umpy+0.016
 losers+0.016
 angry+0.016
 selfish+0.015
Neg logit contributions
 Sure-0.009
 Finally-0.008
icol-0.008
-0.007
uggle-0.007
Token?"
Token ID1701
Feature activation-8.10e-01

Pos logit contributions
umpy+0.028
 losers+0.027
aky+0.026
chie+0.026
 selfish+0.025
Neg logit contributions
 Sure-0.030
 Finally-0.030
icol-0.029
uggle-0.027
 generation-0.026
Token she
Token ID673
Feature activation+3.69e+00

Pos logit contributions
 Finally+0.027
 Sure+0.026
icol+0.025
+0.025
 Eventually+0.023
Neg logit contributions
 losers-0.019
 angry-0.018
aky-0.018
 mean-0.018
 lazy-0.017
Token asks
Token ID7893
Feature activation+7.77e-02

Pos logit contributions
 losers+0.106
aky+0.096
igator+0.096
chie+0.094
umpy+0.093
Neg logit contributions
 Finally-0.086
 Sure-0.083
icol-0.082
 trek-0.070
-0.068
Token.
Token ID13
Feature activation+6.93e-01

Pos logit contributions
 angry+0.002
 mean+0.002
 losers+0.002
 rude+0.002
 lazy+0.002
Neg logit contributions
 Finally-0.003
icol-0.002
 Sure-0.002
-0.002
 Eventually-0.002
Token
Token ID198
Feature activation+7.09e-01

Pos logit contributions
 angry+0.017
 mean+0.017
 losers+0.015
 rude+0.015
 lazy+0.015
Neg logit contributions
icol-0.022
 Finally-0.021
-0.021
 Sure-0.021
uggle-0.019
Token
Token ID198
Feature activation+4.67e-01

Pos logit contributions
 angry+0.016
 mean+0.015
 kids+0.013
 losers+0.013
 rude+0.013
Neg logit contributions
icol-0.025
 Finally-0.024
-0.023
 Sure-0.023
 purchase-0.021
TokenTom
Token ID13787
Feature activation+6.37e-01

Pos logit contributions
 losers+0.010
 angry+0.010
 mean+0.010
 lazy+0.009
 selfish+0.009
Neg logit contributions
 Sure-0.016
 Finally-0.016
-0.015
icol-0.015
uggle-0.013
TokenCan
Token ID6090
Feature activation-1.20e+00

Pos logit contributions
 Sure+0.011
 Finally+0.011
+0.011
icol+0.010
growth+0.009
Neg logit contributions
 angry-0.015
 alone-0.014
 losers-0.014
 unhappy-0.014
 mean-0.014
Token I
Token ID314
Feature activation-1.23e+00

Pos logit contributions
 Finally+0.032
 Sure+0.032
+0.030
 Eventually+0.026
icol+0.026
Neg logit contributions
aky-0.038
 angry-0.036
umpy-0.033
 losers-0.033
 mean-0.032
Token play
Token ID711
Feature activation+1.22e+00

Pos logit contributions
+0.028
 Finally+0.028
 Sure+0.028
icol+0.026
uggle+0.025
Neg logit contributions
 angry-0.040
 losers-0.037
 mean-0.037
 selfish-0.037
 lazy-0.036
Token too
Token ID1165
Feature activation+2.10e+00

Pos logit contributions
umpy+0.040
 reckless+0.039
aky+0.039
 angry+0.038
 losers+0.038
Neg logit contributions
 Sure-0.024
 Finally-0.023
uggle-0.023
-0.023
icol-0.020
Token?"
Token ID1701
Feature activation-2.81e-01

Pos logit contributions
 losers+0.053
 angry+0.053
 mean+0.049
 selfish+0.048
 musicians+0.047
Neg logit contributions
 Finally-0.063
 Sure-0.062
-0.060
icol-0.059
uggle-0.054
Token she
Token ID673
Feature activation+3.67e+00

Pos logit contributions
 Finally+0.009
 Sure+0.009
icol+0.009
+0.008
 Eventually+0.008
Neg logit contributions
 angry-0.007
 losers-0.007
 mean-0.007
 selfish-0.006
 lazy-0.006
Token asks
Token ID7893
Feature activation+4.04e-01

Pos logit contributions
 losers+0.114
aky+0.106
chie+0.105
igator+0.103
umpy+0.098
Neg logit contributions
 Finally-0.079
 Sure-0.075
icol-0.069
 On-0.060
 curiously-0.060
Token.
Token ID13
Feature activation+8.93e-01

Pos logit contributions
 angry+0.010
 mean+0.010
 losers+0.010
 rude+0.009
 lazy+0.009
Neg logit contributions
 Finally-0.013
icol-0.012
 Sure-0.012
-0.012
 Eventually-0.011
Token
Token ID198
Feature activation+6.94e-01

Pos logit contributions
 angry+0.023
 mean+0.022
 losers+0.020
 rude+0.020
 lazy+0.020
Neg logit contributions
icol-0.028
 Finally-0.027
 Sure-0.026
-0.026
uggle-0.024
Token
Token ID198
Feature activation+5.15e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 kids+0.013
 my+0.012
 rude+0.012
Neg logit contributions
icol-0.025
 Finally-0.024
-0.023
 Sure-0.023
 generation-0.021
Token"
Token ID1
Feature activation+6.99e-01

Pos logit contributions
 angry+0.010
 losers+0.010
 mean+0.009
 selfish+0.008
 lazy+0.008
Neg logit contributions
 Finally-0.019
 Sure-0.019
icol-0.018
-0.018
 Eventually-0.017
Token back
Token ID736
Feature activation+1.17e-01

Pos logit contributions
 musicians+0.015
 reckless+0.015
 impressed+0.015
 losers+0.014
 angry+0.014
Neg logit contributions
 Finally-0.017
uggle-0.016
-0.016
 Sure-0.016
ings-0.014
Token in
Token ID287
Feature activation+7.87e-01

Pos logit contributions
 angry+0.003
 losers+0.003
 mean+0.003
 selfish+0.003
 musicians+0.003
Neg logit contributions
 Finally-0.003
 Sure-0.003
-0.003
icol-0.003
uggle-0.003
Token the
Token ID262
Feature activation-2.68e-02

Pos logit contributions
 losers+0.015
 angry+0.015
umpy+0.015
 musicians+0.014
 mean+0.014
Neg logit contributions
 Finally-0.026
-0.026
 Sure-0.025
icol-0.024
 purchase-0.024
Token earth
Token ID4534
Feature activation+7.99e-01

Pos logit contributions
 Finally+0.001
icol+0.001
 Sure+0.001
+0.001
 Eventually+0.001
Neg logit contributions
 angry-0.001
 losers-0.001
 mean-0.001
 musicians-0.001
aky-0.001
Token?"
Token ID1701
Feature activation-7.66e-01

Pos logit contributions
 losers+0.025
 musicians+0.024
 angry+0.023
 furious+0.022
 mean+0.022
Neg logit contributions
-0.018
 Finally-0.017
 Sure-0.017
ings-0.016
icol-0.015
Token she
Token ID673
Feature activation+3.67e+00

Pos logit contributions
 Finally+0.023
 Sure+0.022
+0.021
icol+0.021
 Eventually+0.019
Neg logit contributions
 losers-0.020
 angry-0.019
 mean-0.019
aky-0.019
 lazy-0.018
Token asked
Token ID1965
Feature activation+2.00e-01

Pos logit contributions
 losers+0.116
igator+0.104
umpy+0.102
 winners+0.099
chie+0.097
Neg logit contributions
 Finally-0.077
 Sure-0.070
icol-0.068
 curiously-0.060
 trek-0.059
Token.
Token ID13
Feature activation+6.63e-01

Pos logit contributions
 angry+0.005
 mean+0.005
 losers+0.005
 rude+0.004
 lazy+0.004
Neg logit contributions
 Finally-0.006
icol-0.006
 Sure-0.006
-0.006
 Eventually-0.005
Token
Token ID198
Feature activation+6.58e-01

Pos logit contributions
 angry+0.016
 mean+0.016
 losers+0.014
 rude+0.014
 selfish+0.014
Neg logit contributions
icol-0.021
 Finally-0.021
 Sure-0.020
-0.020
uggle-0.018
Token
Token ID198
Feature activation+4.43e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 rude+0.013
 losers+0.013
 kids+0.013
Neg logit contributions
icol-0.023
 Finally-0.022
-0.021
 Sure-0.021
 purchase-0.019
Token"
Token ID1
Feature activation+6.09e-01

Pos logit contributions
 losers+0.010
 angry+0.010
 mean+0.009
 lazy+0.009
 selfish+0.009
Neg logit contributions
 Sure-0.015
 Finally-0.015
-0.014
icol-0.014
uggle-0.013
Token's
Token ID338
Feature activation-1.34e-01

Pos logit contributions
 Finally+0.020
 Sure+0.019
+0.018
icol+0.017
 Enjoy+0.017
Neg logit contributions
 angry-0.021
 losers-0.020
 alone-0.019
aky-0.019
chie-0.019
Token wrong
Token ID2642
Feature activation+1.06e+00

Pos logit contributions
 Finally+0.005
icol+0.004
 Sure+0.004
+0.004
 Eventually+0.004
Neg logit contributions
 angry-0.003
 mean-0.003
 losers-0.003
 rude-0.003
 lazy-0.003
Token,
Token ID11
Feature activation+1.23e+00

Pos logit contributions
 losers+0.029
 angry+0.029
aky+0.027
 mean+0.027
 lazy+0.026
Neg logit contributions
icol-0.030
 Finally-0.029
 Sure-0.028
-0.027
ings-0.026
Token boys
Token ID6510
Feature activation+1.75e+00

Pos logit contributions
 alone+0.031
aky+0.030
umpy+0.030
v+0.029
 losers+0.028
Neg logit contributions
icol-0.031
 Sure-0.030
 Finally-0.029
 purchase-0.027
-0.026
Token?"
Token ID1701
Feature activation-5.22e-01

Pos logit contributions
 losers+0.043
 angry+0.042
 lazy+0.039
 selfish+0.038
 mean+0.038
Neg logit contributions
icol-0.049
 Finally-0.049
 Sure-0.048
-0.047
uggle-0.043
Token she
Token ID673
Feature activation+3.67e+00

Pos logit contributions
 Finally+0.014
 Sure+0.014
+0.013
icol+0.013
 Eventually+0.012
Neg logit contributions
 losers-0.015
aky-0.014
 mean-0.013
 angry-0.013
umpy-0.013
Token asked
Token ID1965
Feature activation+3.17e-01

Pos logit contributions
 losers+0.114
igator+0.105
umpy+0.099
 winners+0.096
igators+0.094
Neg logit contributions
 Finally-0.081
icol-0.075
 Sure-0.074
 trek-0.068
 curiously-0.066
Token.
Token ID13
Feature activation+6.98e-01

Pos logit contributions
 angry+0.008
 losers+0.008
 mean+0.008
 lazy+0.007
 rude+0.007
Neg logit contributions
 Finally-0.009
 Sure-0.009
icol-0.009
-0.009
 Eventually-0.008
Token
Token ID198
Feature activation+6.43e-01

Pos logit contributions
 angry+0.015
 mean+0.015
 losers+0.014
 rude+0.013
 lazy+0.013
Neg logit contributions
icol-0.024
 Finally-0.024
 Sure-0.023
-0.023
 Eventually-0.021
Token
Token ID198
Feature activation+5.68e-01

Pos logit contributions
 angry+0.015
 mean+0.015
 losers+0.013
 rude+0.013
 lazy+0.013
Neg logit contributions
icol-0.021
 Finally-0.021
-0.020
 Sure-0.020
 Eventually-0.018
Token"
Token ID1
Feature activation+7.26e-01

Pos logit contributions
 angry+0.011
 losers+0.011
 mean+0.010
 lazy+0.009
 selfish+0.009
Neg logit contributions
 Finally-0.021
 Sure-0.021
icol-0.020
-0.020
 Eventually-0.018
Token"
Token ID1
Feature activation+1.55e-01

Pos logit contributions
 Finally+0.005
 Sure+0.005
icol+0.005
+0.005
uggle+0.004
Neg logit contributions
 angry-0.002
 losers-0.002
 mean-0.002
 rude-0.002
 selfish-0.002
TokenWhat
Token ID2061
Feature activation-8.42e-01

Pos logit contributions
 angry+0.004
 losers+0.004
 mean+0.004
 lazy+0.004
 alone+0.004
Neg logit contributions
 Sure-0.004
 Finally-0.004
-0.004
icol-0.004
uggle-0.003
Token happened
Token ID3022
Feature activation+5.21e-01

Pos logit contributions
 Finally+0.019
 Sure+0.019
+0.018
icol+0.017
uggle+0.016
Neg logit contributions
 angry-0.025
 losers-0.025
aky-0.024
 alone-0.023
 musicians-0.023
Token here
Token ID994
Feature activation+1.49e+00

Pos logit contributions
 losers+0.012
 angry+0.011
 mean+0.011
aky+0.011
 lazy+0.010
Neg logit contributions
icol-0.017
 Finally-0.017
 Sure-0.017
-0.016
uggle-0.014
Token?"
Token ID1701
Feature activation-6.09e-01

Pos logit contributions
 losers+0.045
aky+0.044
chie+0.042
 musicians+0.041
 pilots+0.040
Neg logit contributions
-0.033
icol-0.031
 Finally-0.030
 Sure-0.030
 purchase-0.028
Token she
Token ID673
Feature activation+3.66e+00

Pos logit contributions
 Finally+0.015
 Sure+0.015
+0.013
icol+0.013
 Eventually+0.012
Neg logit contributions
 losers-0.019
aky-0.018
 mean-0.017
 angry-0.017
umpy-0.016
Token asks
Token ID7893
Feature activation+3.22e-02

Pos logit contributions
 losers+0.111
aky+0.102
igator+0.101
chie+0.099
umpy+0.094
Neg logit contributions
 Finally-0.076
 Sure-0.073
icol-0.072
 curiously-0.060
 trek-0.058
Token.
Token ID13
Feature activation+5.87e-01

Pos logit contributions
 losers+0.001
 mean+0.001
 angry+0.001
 lazy+0.001
 rude+0.001
Neg logit contributions
 Finally-0.001
 Sure-0.001
-0.001
icol-0.001
ings-0.001
Token She
Token ID1375
Feature activation+2.09e+00

Pos logit contributions
 angry+0.013
 mean+0.013
 losers+0.013
 rude+0.012
 lazy+0.012
Neg logit contributions
 Finally-0.020
icol-0.019
 Sure-0.019
-0.019
 Eventually-0.017
Token is
Token ID318
Feature activation+1.16e+00

Pos logit contributions
 angry+0.051
 mean+0.048
 rude+0.044
 losers+0.043
 lazy+0.043
Neg logit contributions
 Finally-0.073
icol-0.070
-0.069
 Sure-0.069
 Eventually-0.065
Token not
Token ID407
Feature activation+7.31e-01

Pos logit contributions
 losers+0.025
igator+0.023
umpy+0.022
chie+0.022
 musicians+0.022
Neg logit contributions
 Finally-0.032
-0.028
 Sure-0.028
 Eventually-0.026
icol-0.026
Token a
Token ID257
Feature activation-2.50e-01

Pos logit contributions
 Finally+0.003
icol+0.003
 Sure+0.003
+0.003
 Eventually+0.002
Neg logit contributions
 angry-0.002
 mean-0.002
 losers-0.002
 rude-0.002
 lazy-0.002
Token lime
Token ID28738
Feature activation+3.41e-01

Pos logit contributions
icol+0.008
 Finally+0.008
 Sure+0.008
+0.007
 Eventually+0.007
Neg logit contributions
 angry-0.006
 mean-0.006
 lazy-0.006
 rude-0.006
 selfish-0.006
Token,
Token ID11
Feature activation+2.85e-01

Pos logit contributions
 angry+0.009
 musicians+0.009
 losers+0.009
 mean+0.008
 lazy+0.008
Neg logit contributions
-0.010
 Finally-0.009
 Sure-0.009
uggle-0.008
icol-0.008
Token Ben
Token ID3932
Feature activation+1.96e+00

Pos logit contributions
aky+0.009
umpy+0.009
 angry+0.008
v+0.008
 alone+0.008
Neg logit contributions
-0.007
 Finally-0.007
 Sure-0.007
icol-0.006
 Enjoy-0.005
Token?"
Token ID1701
Feature activation-1.18e+00

Pos logit contributions
 losers+0.048
 angry+0.046
 mean+0.041
 musicians+0.041
 lazy+0.040
Neg logit contributions
 Finally-0.057
 Sure-0.056
icol-0.055
-0.051
uggle-0.049
Token she
Token ID673
Feature activation+3.66e+00

Pos logit contributions
 Finally+0.039
 Sure+0.038
+0.036
icol+0.036
 Eventually+0.033
Neg logit contributions
 losers-0.027
 angry-0.025
 mean-0.024
aky-0.024
 reckless-0.023
Token asks
Token ID7893
Feature activation-1.48e-01

Pos logit contributions
 losers+0.108
aky+0.095
igator+0.095
chie+0.093
umpy+0.092
Neg logit contributions
 Finally-0.087
 Sure-0.079
icol-0.077
 Eventually-0.065
 curiously-0.063
Token.
Token ID13
Feature activation+9.53e-01

Pos logit contributions
 Finally+0.005
 Sure+0.005
icol+0.004
+0.004
 Eventually+0.004
Neg logit contributions
 losers-0.004
 angry-0.004
 mean-0.003
 lazy-0.003
 rude-0.003
Token\n
Token ID198
Feature activation+8.25e-01

Pos logit contributions
 angry+0.021
 mean+0.020
 lazy+0.018
 selfish+0.018
 rude+0.018
Neg logit contributions
icol-0.033
 Finally-0.031
 Sure-0.029
-0.029
 generation-0.029
Token\n
Token ID198
Feature activation+4.47e-01

Pos logit contributions
 angry+0.019
 mean+0.019
 losers+0.017
 rude+0.017
 lazy+0.017
Neg logit contributions
icol-0.027
 Finally-0.026
-0.026
 Sure-0.025
 generation-0.023
Token"
Token ID1
Feature activation+5.54e-01

Pos logit contributions
 angry+0.011
 losers+0.011
 mean+0.010
 alone+0.009
 lonely+0.009
Neg logit contributions
 Finally-0.014
 Sure-0.014
-0.014
icol-0.013
uggle-0.012
Token"
Token ID1
Feature activation+2.64e-01

Pos logit contributions
 losers+0.005
 angry+0.005
 mean+0.005
 lazy+0.005
 selfish+0.005
Neg logit contributions
 Finally-0.011
 Sure-0.010
-0.010
icol-0.010
 Eventually-0.009
TokenWhat
Token ID2061
Feature activation-7.13e-01

Pos logit contributions
 losers+0.008
 alone+0.008
 angry+0.008
aky+0.007
 mean+0.007
Neg logit contributions
-0.006
 Sure-0.006
 Finally-0.006
icol-0.005
Finally-0.005
Token happened
Token ID3022
Feature activation+3.41e-01

Pos logit contributions
 Finally+0.018
 Sure+0.017
+0.017
icol+0.016
uggle+0.015
Neg logit contributions
 losers-0.020
 angry-0.020
aky-0.019
 alone-0.019
 musicians-0.018
Token here
Token ID994
Feature activation+1.31e+00

Pos logit contributions
 losers+0.007
 angry+0.007
 mean+0.007
aky+0.007
 lazy+0.006
Neg logit contributions
icol-0.011
 Finally-0.011
 Sure-0.011
-0.011
 Eventually-0.010
Token?"
Token ID1701
Feature activation-1.67e-01

Pos logit contributions
 losers+0.039
aky+0.038
 musicians+0.035
chie+0.035
 angry+0.035
Neg logit contributions
-0.031
icol-0.029
 Finally-0.029
 Sure-0.028
 purchase-0.027
Token she
Token ID673
Feature activation+3.66e+00

Pos logit contributions
 Finally+0.005
 Sure+0.005
icol+0.005
+0.005
 Eventually+0.004
Neg logit contributions
 losers-0.004
aky-0.004
 mean-0.004
 angry-0.004
igator-0.004
Token asks
Token ID7893
Feature activation+1.41e-01

Pos logit contributions
 losers+0.106
aky+0.096
igator+0.095
chie+0.089
umpy+0.089
Neg logit contributions
 Finally-0.085
 Sure-0.082
icol-0.082
-0.067
 Eventually-0.067
Token.
Token ID13
Feature activation+7.89e-01

Pos logit contributions
 angry+0.004
 mean+0.003
 losers+0.003
 rude+0.003
 lazy+0.003
Neg logit contributions
 Finally-0.004
icol-0.004
 Sure-0.004
-0.004
 Eventually-0.004
Token\n
Token ID198
Feature activation+5.96e-01

Pos logit contributions
 angry+0.020
 mean+0.019
 rude+0.017
 losers+0.017
 selfish+0.017
Neg logit contributions
icol-0.025
 Finally-0.025
-0.024
 Sure-0.024
uggle-0.022
Token\n
Token ID198
Feature activation+5.80e-01

Pos logit contributions
 angry+0.014
 mean+0.013
 kids+0.012
 losers+0.012
 rude+0.012
Neg logit contributions
icol-0.020
 Finally-0.019
-0.019
 Sure-0.019
 Eventually-0.017
TokenThey
Token ID2990
Feature activation+9.43e-01

Pos logit contributions
 losers+0.011
 angry+0.011
 mean+0.011
 lazy+0.010
 selfish+0.009
Neg logit contributions
 Finally-0.021
 Sure-0.021
icol-0.020
-0.020
 Eventually-0.019
Token I
Token ID314
Feature activation-5.69e-01

Pos logit contributions
 angry+0.026
aky+0.025
 mean+0.024
 losers+0.024
umpy+0.023
Neg logit contributions
 Sure-0.035
 Finally-0.035
-0.035
icol-0.032
uggle-0.030
Token be
Token ID307
Feature activation+1.06e+00

Pos logit contributions
 Finally+0.016
icol+0.015
 Sure+0.015
+0.015
 Eventually+0.013
Neg logit contributions
 angry-0.016
 mean-0.015
 losers-0.015
 rude-0.015
 selfish-0.014
Token your
Token ID534
Feature activation+1.28e+00

Pos logit contributions
umpy+0.021
spe+0.020
 losers+0.019
folk+0.019
 reckless+0.019
Neg logit contributions
 Finally-0.029
icol-0.029
 Enjoy-0.027
 Sure-0.027
ings-0.026
Token friend
Token ID1545
Feature activation+1.86e+00

Pos logit contributions
 angry+0.032
 mean+0.030
 losers+0.029
 rude+0.028
 selfish+0.027
Neg logit contributions
 Finally-0.039
 Sure-0.038
icol-0.038
-0.037
uggle-0.034
Token?"
Token ID1701
Feature activation-8.92e-01

Pos logit contributions
chie+0.061
umpy+0.059
 losers+0.058
 furious+0.057
 reckless+0.056
Neg logit contributions
 Finally-0.027
 Sure-0.026
-0.026
uggle-0.025
icol-0.023
Token she
Token ID673
Feature activation+3.66e+00

Pos logit contributions
 Finally+0.027
 Sure+0.027
+0.024
icol+0.023
 Eventually+0.022
Neg logit contributions
aky-0.023
 losers-0.023
umpy-0.021
 mean-0.020
 angry-0.020
Token says
Token ID1139
Feature activation-3.09e-01

Pos logit contributions
 losers+0.108
aky+0.101
chie+0.098
umpy+0.096
igator+0.094
Neg logit contributions
 Finally-0.081
icol-0.079
 Sure-0.078
 trek-0.066
growth-0.064
Token.
Token ID13
Feature activation+3.98e-01

Pos logit contributions
 Finally+0.010
icol+0.010
 Sure+0.010
+0.010
 Eventually+0.009
Neg logit contributions
 angry-0.007
 mean-0.007
 losers-0.007
 rude-0.006
 lazy-0.006
Token\n
Token ID198
Feature activation+1.69e-01

Pos logit contributions
 angry+0.011
 mean+0.011
 losers+0.011
 rude+0.010
 lazy+0.010
Neg logit contributions
 Finally-0.011
icol-0.011
 Sure-0.011
-0.011
 Eventually-0.010
Token\n
Token ID198
Feature activation+1.33e-02

Pos logit contributions
 angry+0.004
 mean+0.004
 losers+0.004
 lazy+0.004
 rude+0.004
Neg logit contributions
icol-0.005
 Finally-0.005
-0.005
 Sure-0.005
 Eventually-0.005
TokenThe
Token ID464
Feature activation+1.51e-01

Pos logit contributions
 losers+0.000
 angry+0.000
 mean+0.000
 selfish+0.000
 unhappy+0.000
Neg logit contributions
 Sure-0.000
 Finally-0.000
-0.000
Finally-0.000
growth-0.000
Token the
Token ID262
Feature activation+1.48e-01

Pos logit contributions
aky+0.015
 angry+0.014
 mean+0.014
 losers+0.014
 selfish+0.013
Neg logit contributions
 Sure-0.032
 Finally-0.031
icol-0.031
-0.030
 purchase-0.029
Token tree
Token ID5509
Feature activation+8.00e-01

Pos logit contributions
 angry+0.004
 losers+0.004
aky+0.004
 mean+0.004
 musicians+0.004
Neg logit contributions
 Finally-0.004
 Sure-0.004
-0.004
icol-0.004
uggle-0.003
Token,
Token ID11
Feature activation+2.70e-01

Pos logit contributions
 losers+0.024
 musicians+0.024
 angry+0.023
aky+0.022
 selfish+0.022
Neg logit contributions
 Finally-0.020
 Sure-0.019
-0.019
uggle-0.018
ings-0.016
Token Ben
Token ID3932
Feature activation+1.83e+00

Pos logit contributions
aky+0.009
umpy+0.009
vious+0.008
v+0.008
 angry+0.008
Neg logit contributions
 Finally-0.006
 Sure-0.006
-0.005
 purchase-0.005
icol-0.005
Token?"
Token ID1701
Feature activation-8.92e-01

Pos logit contributions
 losers+0.044
 angry+0.042
 mean+0.039
 musicians+0.038
 selfish+0.038
Neg logit contributions
 Finally-0.055
icol-0.054
 Sure-0.054
-0.052
uggle-0.048
Token she
Token ID673
Feature activation+3.66e+00

Pos logit contributions
 Finally+0.031
 Sure+0.031
icol+0.029
+0.028
 Eventually+0.027
Neg logit contributions
 losers-0.019
 angry-0.018
 mean-0.018
aky-0.017
 selfish-0.016
Token asked
Token ID1965
Feature activation+1.24e-01

Pos logit contributions
 losers+0.105
umpy+0.093
 angry+0.092
igator+0.091
 mean+0.090
Neg logit contributions
 Finally-0.090
 Sure-0.084
icol-0.083
 trek-0.074
-0.072
Token.
Token ID13
Feature activation+7.57e-01

Pos logit contributions
 angry+0.003
 mean+0.003
 losers+0.003
 rude+0.003
 lazy+0.003
Neg logit contributions
 Finally-0.004
icol-0.004
 Sure-0.004
-0.004
 Eventually-0.003
Token\n
Token ID198
Feature activation+8.25e-01

Pos logit contributions
 angry+0.019
 mean+0.018
 losers+0.017
 rude+0.016
 lazy+0.016
Neg logit contributions
icol-0.025
 Finally-0.024
 Sure-0.023
-0.023
uggle-0.021
Token\n
Token ID198
Feature activation+5.96e-01

Pos logit contributions
 angry+0.018
 mean+0.017
 kids+0.015
 my+0.015
 rude+0.015
Neg logit contributions
icol-0.030
-0.028
 Finally-0.028
 Sure-0.027
 purchase-0.025
Token"
Token ID1
Feature activation+5.87e-01

Pos logit contributions
 angry+0.013
 losers+0.013
 mean+0.012
 selfish+0.011
 lazy+0.011
Neg logit contributions
 Finally-0.020
 Sure-0.020
-0.019
icol-0.019
uggle-0.018
Token the
Token ID262
Feature activation-3.96e-02

Pos logit contributions
 losers+0.040
aky+0.040
umpy+0.040
 angry+0.039
 selfish+0.037
Neg logit contributions
 Finally-0.044
 Sure-0.041
icol-0.041
ings-0.041
uggle-0.039
Token bugs
Token ID11316
Feature activation+2.29e+00

Pos logit contributions
 Finally+0.001
+0.001
icol+0.001
 Sure+0.001
 Eventually+0.001
Neg logit contributions
 angry-0.001
 losers-0.001
 mean-0.001
aky-0.001
 musicians-0.001
Token look
Token ID804
Feature activation+9.23e-02

Pos logit contributions
 losers+0.068
 musicians+0.066
 boys+0.063
 kids+0.062
 selfish+0.060
Neg logit contributions
uggle-0.045
-0.044
 Finally-0.042
 potion-0.042
ings-0.041
Token big
Token ID1263
Feature activation+8.80e-01

Pos logit contributions
 losers+0.002
chie+0.002
 musicians+0.002
 kids+0.002
 boys+0.002
Neg logit contributions
 Finally-0.002
icol-0.002
 potion-0.002
ings-0.002
 Sure-0.002
Token?"
Token ID1701
Feature activation-8.50e-01

Pos logit contributions
 losers+0.026
chie+0.024
 furious+0.023
 selfish+0.023
 angry+0.023
Neg logit contributions
icol-0.020
 Finally-0.019
 Sure-0.018
-0.018
ings-0.017
Token she
Token ID673
Feature activation+3.65e+00

Pos logit contributions
 Finally+0.025
icol+0.025
 Sure+0.024
+0.024
 Eventually+0.021
Neg logit contributions
 angry-0.023
 mean-0.022
 losers-0.022
 rude-0.020
 lazy-0.020
Token asked
Token ID1965
Feature activation-2.70e-02

Pos logit contributions
 losers+0.110
igator+0.097
umpy+0.093
 winners+0.091
 my+0.090
Neg logit contributions
 Finally-0.086
icol-0.081
 Sure-0.079
-0.071
 trek-0.070
Token.
Token ID13
Feature activation+7.07e-01

Pos logit contributions
 Finally+0.001
 Sure+0.001
icol+0.001
+0.001
 Eventually+0.001
Neg logit contributions
 angry-0.001
 losers-0.001
 mean-0.001
 lazy-0.001
 rude-0.001
Token\n
Token ID198
Feature activation+6.86e-01

Pos logit contributions
 angry+0.017
 mean+0.016
 rude+0.014
 lazy+0.014
 selfish+0.014
Neg logit contributions
icol-0.024
 Finally-0.023
 Sure-0.022
-0.022
 purchase-0.021
Token\n
Token ID198
Feature activation+5.56e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 kids+0.012
 my+0.012
 alone+0.012
Neg logit contributions
icol-0.025
 Finally-0.023
-0.023
 Sure-0.022
 purchase-0.021
TokenSam
Token ID16305
Feature activation+4.11e-01

Pos logit contributions
 angry+0.013
 losers+0.013
 mean+0.012
 lazy+0.011
 selfish+0.011
Neg logit contributions
 Finally-0.018
 Sure-0.018
-0.017
icol-0.017
 Eventually-0.016
Token\n
Token ID198
Feature activation-7.79e-01

Pos logit contributions
icol+0.009
 Finally+0.008
 Sure+0.008
+0.008
 generation+0.008
Neg logit contributions
 angry-0.006
 mean-0.005
 kids-0.005
 lazy-0.005
 alone-0.005
Token"
Token ID1
Feature activation-3.57e-01

Pos logit contributions
 Finally+0.030
 Sure+0.030
icol+0.029
+0.029
 Eventually+0.027
Neg logit contributions
 angry-0.013
 losers-0.013
 mean-0.013
 rude-0.011
 selfish-0.011
TokenWhat
Token ID2061
Feature activation-6.74e-01

Pos logit contributions
 Sure+0.008
+0.008
 Finally+0.007
icol+0.007
growth+0.006
Neg logit contributions
 angry-0.011
 losers-0.011
 alone-0.010
 mean-0.010
 unhappy-0.010
Token happened
Token ID3022
Feature activation+4.46e-01

Pos logit contributions
 Finally+0.017
 Sure+0.016
+0.016
icol+0.015
uggle+0.014
Neg logit contributions
 losers-0.020
 angry-0.019
 alone-0.018
 mean-0.018
aky-0.018
Token?"
Token ID1701
Feature activation-6.81e-01

Pos logit contributions
 angry+0.009
 losers+0.009
 mean+0.009
 lazy+0.008
aky+0.008
Neg logit contributions
icol-0.015
 Finally-0.015
 Sure-0.015
-0.015
 Eventually-0.013
Token she
Token ID673
Feature activation+3.65e+00

Pos logit contributions
 Finally+0.019
 Sure+0.018
icol+0.018
+0.017
 Eventually+0.016
Neg logit contributions
 losers-0.019
 angry-0.019
 mean-0.018
aky-0.018
 selfish-0.017
Token asked
Token ID1965
Feature activation+2.54e-01

Pos logit contributions
 losers+0.120
igator+0.108
 winners+0.105
chie+0.101
umpy+0.101
Neg logit contributions
 Finally-0.067
 Sure-0.063
icol-0.058
 curiously-0.056
-0.053
Token.
Token ID13
Feature activation+4.11e-01

Pos logit contributions
 angry+0.007
 losers+0.006
 mean+0.006
 rude+0.006
 lazy+0.006
Neg logit contributions
 Finally-0.008
 Sure-0.007
icol-0.007
-0.007
 Eventually-0.006
Token\n
Token ID198
Feature activation+1.18e-01

Pos logit contributions
 angry+0.009
 mean+0.009
 losers+0.008
 rude+0.008
 lazy+0.008
Neg logit contributions
icol-0.014
 Finally-0.014
 Sure-0.013
-0.013
 Eventually-0.012
Token\n
Token ID198
Feature activation-1.18e-01

Pos logit contributions
 angry+0.003
 mean+0.003
 losers+0.002
 kids+0.002
 lazy+0.002
Neg logit contributions
icol-0.004
 Finally-0.004
 Sure-0.004
-0.004
 Eventually-0.003
Token"
Token ID1
Feature activation+7.70e-01

Pos logit contributions
 Finally+0.004
 Sure+0.004
+0.004
icol+0.004
 Eventually+0.004
Neg logit contributions
 losers-0.002
 angry-0.002
 mean-0.002
 rude-0.002
 selfish-0.002
Token I
Token ID314
Feature activation-9.32e-01

Pos logit contributions
 Sure+0.016
 Finally+0.016
+0.016
uggle+0.013
icol+0.013
Neg logit contributions
aky-0.017
 angry-0.015
umpy-0.015
 losers-0.014
 musicians-0.014
Token open
Token ID1280
Feature activation+5.36e-01

Pos logit contributions
 Finally+0.027
icol+0.027
 Sure+0.026
+0.025
 Eventually+0.023
Neg logit contributions
 angry-0.026
 mean-0.025
 losers-0.024
 rude-0.023
 lazy-0.023
Token it
Token ID340
Feature activation+3.69e-01

Pos logit contributions
 mean+0.012
 losers+0.012
umpy+0.012
aky+0.012
 angry+0.012
Neg logit contributions
 Sure-0.016
 Finally-0.016
icol-0.016
-0.015
ings-0.015
Token now
Token ID783
Feature activation+5.00e-01

Pos logit contributions
 losers+0.009
 angry+0.009
 mean+0.009
 musicians+0.009
 selfish+0.009
Neg logit contributions
 Finally-0.011
 Sure-0.011
-0.011
icol-0.010
uggle-0.010
Token?"
Token ID1701
Feature activation-7.67e-01

Pos logit contributions
 losers+0.013
 angry+0.012
aky+0.012
 selfish+0.011
 mean+0.011
Neg logit contributions
 Finally-0.015
-0.014
 Sure-0.014
icol-0.014
uggle-0.013
Token she
Token ID673
Feature activation+3.65e+00

Pos logit contributions
 Finally+0.026
 Sure+0.026
icol+0.024
+0.024
 Eventually+0.023
Neg logit contributions
 losers-0.017
aky-0.016
 angry-0.016
 mean-0.016
 lazy-0.015
Token asks
Token ID7893
Feature activation+5.69e-02

Pos logit contributions
 losers+0.103
aky+0.093
 angry+0.091
 alone+0.091
igator+0.090
Neg logit contributions
 Finally-0.094
 Sure-0.089
icol-0.088
 Eventually-0.075
-0.074
Token.
Token ID13
Feature activation+6.60e-01

Pos logit contributions
 angry+0.001
 mean+0.001
 losers+0.001
 rude+0.001
 lazy+0.001
Neg logit contributions
 Finally-0.002
icol-0.002
 Sure-0.002
-0.002
 Eventually-0.002
Token\n
Token ID198
Feature activation+6.35e-01

Pos logit contributions
 angry+0.016
 mean+0.016
 losers+0.015
 rude+0.014
 selfish+0.014
Neg logit contributions
icol-0.021
 Finally-0.020
-0.019
 Sure-0.019
uggle-0.018
Token\n
Token ID198
Feature activation+5.42e-01

Pos logit contributions
 angry+0.014
 mean+0.013
 my+0.012
 kids+0.011
 losers+0.011
Neg logit contributions
icol-0.023
-0.021
 Finally-0.021
 Sure-0.021
 purchase-0.019
TokenTom
Token ID13787
Feature activation+5.44e-01

Pos logit contributions
 angry+0.011
 losers+0.011
 mean+0.011
 lazy+0.010
 selfish+0.010
Neg logit contributions
 Finally-0.019
 Sure-0.019
icol-0.018
-0.018
 Eventually-0.016
Token They
Token ID1119
Feature activation-6.10e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 losers+0.013
 kids+0.013
 lazy+0.013
Neg logit contributions
icol-0.016
 Finally-0.015
-0.015
 Sure-0.015
uggle-0.013
Token both
Token ID1111
Feature activation+2.52e-01

Pos logit contributions
 Finally+0.016
 Sure+0.015
icol+0.015
+0.014
 Eventually+0.013
Neg logit contributions
 angry-0.018
 losers-0.017
 mean-0.017
 rude-0.017
 selfish-0.016
Token put
Token ID1234
Feature activation-5.40e-01

Pos logit contributions
 losers+0.007
 mean+0.007
 angry+0.007
 rude+0.007
 selfish+0.007
Neg logit contributions
 Finally-0.007
 Sure-0.006
icol-0.006
 Eventually-0.006
 mun-0.006
Token on
Token ID319
Feature activation-5.86e-01

Pos logit contributions
 Finally+0.017
icol+0.017
 Sure+0.016
+0.016
 Eventually+0.015
Neg logit contributions
 angry-0.013
 mean-0.013
 losers-0.013
 selfish-0.012
 rude-0.012
Token their
Token ID511
Feature activation-1.42e+00

Pos logit contributions
 Finally+0.022
icol+0.021
 Sure+0.021
+0.021
 Eventually+0.019
Neg logit contributions
 angry-0.011
 mean-0.011
 losers-0.010
 rude-0.009
 lazy-0.009
Token favorite
Token ID4004
Feature activation-3.61e+00

Pos logit contributions
 Finally+0.036
icol+0.035
+0.034
 Sure+0.034
 purchase+0.030
Neg logit contributions
 losers-0.042
 angry-0.041
 mean-0.041
aky-0.041
 reckless-0.040
Token clothes
Token ID8242
Feature activation-7.33e-01

Pos logit contributions
 Finally+0.106
icol+0.105
 Sure+0.103
+0.102
 Eventually+0.090
Neg logit contributions
 angry-0.095
 mean-0.091
 losers-0.091
 rude-0.084
 selfish-0.084
Token and
Token ID290
Feature activation-1.03e+00

Pos logit contributions
 Finally+0.022
+0.022
icol+0.022
 Sure+0.021
 Eventually+0.019
Neg logit contributions
 angry-0.019
 losers-0.019
 mean-0.019
 rude-0.017
 selfish-0.017
Token headed
Token ID9153
Feature activation+1.19e-01

Pos logit contributions
 Finally+0.030
icol+0.029
 Sure+0.028
+0.027
 Eventually+0.025
Neg logit contributions
 angry-0.028
 losers-0.027
 mean-0.027
 selfish-0.025
 alone-0.025
Token outside
Token ID2354
Feature activation-1.57e+00

Pos logit contributions
 angry+0.003
 mean+0.003
 losers+0.003
 rude+0.003
 selfish+0.003
Neg logit contributions
 Finally-0.004
icol-0.004
 Sure-0.003
-0.003
 Eventually-0.003
Token.
Token ID13
Feature activation+5.43e-01

Pos logit contributions
 Finally+0.042
 Sure+0.042
icol+0.039
+0.039
 Eventually+0.036
Neg logit contributions
 mean-0.047
 losers-0.046
 angry-0.046
 rude-0.044
 selfish-0.043
Token her
Token ID607
Feature activation+9.35e-01

Pos logit contributions
 Finally+0.007
icol+0.006
 Sure+0.006
+0.006
 Eventually+0.006
Neg logit contributions
 angry-0.004
 mean-0.004
 losers-0.003
 rude-0.003
 lazy-0.003
Token mom
Token ID1995
Feature activation+2.40e-01

Pos logit contributions
 losers+0.027
aky+0.025
 mean+0.025
 angry+0.025
umpy+0.023
Neg logit contributions
 Finally-0.025
 Sure-0.023
icol-0.023
 purchase-0.023
-0.022
Token and
Token ID290
Feature activation+1.57e-01

Pos logit contributions
 angry+0.006
 mean+0.006
 losers+0.006
 rude+0.006
 selfish+0.006
Neg logit contributions
 Finally-0.007
 Sure-0.007
icol-0.007
-0.007
 Eventually-0.006
Token wore
Token ID12408
Feature activation-1.16e-01

Pos logit contributions
 mean+0.004
 angry+0.004
 losers+0.004
aky+0.004
umpy+0.004
Neg logit contributions
 Finally-0.005
icol-0.004
 Sure-0.004
-0.004
 purchase-0.004
Token her
Token ID607
Feature activation-8.36e-01

Pos logit contributions
 Finally+0.004
icol+0.004
 Sure+0.004
+0.004
 Eventually+0.004
Neg logit contributions
 angry-0.002
 mean-0.002
 losers-0.002
 lazy-0.002
 rude-0.002
Token favorite
Token ID4004
Feature activation-3.38e+00

Pos logit contributions
 Finally+0.026
+0.026
icol+0.025
 Sure+0.025
 Eventually+0.022
Neg logit contributions
 angry-0.021
 losers-0.020
 mean-0.019
aky-0.018
 rude-0.018
Token sweater
Token ID35685
Feature activation-1.18e+00

Pos logit contributions
 Finally+0.107
icol+0.106
 Sure+0.104
+0.101
 Eventually+0.094
Neg logit contributions
 angry-0.085
 mean-0.082
 losers-0.078
 rude-0.075
 lazy-0.075
Token all
Token ID477
Feature activation-3.72e-01

Pos logit contributions
 Finally+0.036
icol+0.035
 Sure+0.035
+0.035
 Eventually+0.031
Neg logit contributions
 angry-0.031
 mean-0.030
 losers-0.029
 rude-0.027
 lazy-0.027
Token day
Token ID1110
Feature activation-1.04e+00

Pos logit contributions
 Finally+0.010
 Sure+0.009
icol+0.009
+0.009
 Eventually+0.008
Neg logit contributions
 angry-0.011
 losers-0.011
 mean-0.011
 rude-0.010
 selfish-0.010
Token long
Token ID890
Feature activation-1.27e-01

Pos logit contributions
 Finally+0.033
icol+0.032
 Sure+0.032
+0.031
 Eventually+0.028
Neg logit contributions
 angry-0.026
 mean-0.025
 losers-0.024
 lazy-0.023
 rude-0.023
Token.
Token ID13
Feature activation+3.91e-01

Pos logit contributions
 Finally+0.004
icol+0.004
+0.003
 Sure+0.003
 Eventually+0.003
Neg logit contributions
 angry-0.004
 mean-0.003
 losers-0.003
 rude-0.003
 lazy-0.003
Token.
Token ID13
Feature activation+1.86e-01

Pos logit contributions
 angry+0.015
 losers+0.014
 mean+0.014
 rude+0.013
aky+0.013
Neg logit contributions
 Finally-0.015
icol-0.014
 Sure-0.014
-0.014
 Eventually-0.012
Token He
Token ID679
Feature activation-8.66e-01

Pos logit contributions
 angry+0.005
 mean+0.005
 losers+0.005
 rude+0.004
 lazy+0.004
Neg logit contributions
 Finally-0.006
icol-0.005
 Sure-0.005
-0.005
 Eventually-0.005
Token was
Token ID373
Feature activation-1.24e-01

Pos logit contributions
 Finally+0.021
 Sure+0.020
icol+0.019
 trek+0.018
 Eventually+0.017
Neg logit contributions
 losers-0.027
 angry-0.026
 mean-0.025
 boys-0.024
 selfish-0.024
Token wearing
Token ID5762
Feature activation-4.31e-01

Pos logit contributions
 Finally+0.003
 Eventually+0.003
 Sure+0.003
icol+0.003
zed+0.002
Neg logit contributions
 losers-0.003
igator-0.003
umpy-0.003
aky-0.003
 boys-0.003
Token his
Token ID465
Feature activation-9.55e-01

Pos logit contributions
 Finally+0.013
icol+0.013
 Sure+0.012
+0.011
 Eventually+0.011
Neg logit contributions
 angry-0.011
umpy-0.011
 losers-0.010
aky-0.010
 mean-0.010
Token favorite
Token ID4004
Feature activation-3.29e+00

Pos logit contributions
 Finally+0.026
+0.026
icol+0.025
 Sure+0.025
 Eventually+0.022
Neg logit contributions
 losers-0.026
 angry-0.026
 mean-0.025
aky-0.024
 reckless-0.023
Token spider
Token ID19230
Feature activation-3.39e-01

Pos logit contributions
 Finally+0.100
icol+0.100
 Sure+0.097
+0.096
uggle+0.086
Neg logit contributions
 angry-0.082
 losers-0.079
 mean-0.078
 selfish-0.072
 rude-0.072
Tokenman
Token ID805
Feature activation-1.41e+00

Pos logit contributions
 Finally+0.009
icol+0.009
+0.009
 Sure+0.009
 Eventually+0.008
Neg logit contributions
 angry-0.009
 losers-0.009
 mean-0.009
 rude-0.008
 selfish-0.008
Token shirt
Token ID10147
Feature activation-9.94e-01

Pos logit contributions
 Finally+0.037
+0.036
 Sure+0.035
icol+0.033
 Eventually+0.032
Neg logit contributions
 angry-0.040
 losers-0.040
 mean-0.038
 rude-0.036
 selfish-0.035
Token.
Token ID13
Feature activation+1.55e-01

Pos logit contributions
 Finally+0.027
+0.026
 Sure+0.026
icol+0.026
 Eventually+0.023
Neg logit contributions
 angry-0.029
 losers-0.028
 mean-0.027
 rude-0.026
 selfish-0.026
Token\n
Token ID198
Feature activation+4.30e-01

Pos logit contributions
 angry+0.005
 mean+0.004
 losers+0.004
 rude+0.004
 selfish+0.004
Neg logit contributions
 Finally-0.005
 Sure-0.004
icol-0.004
-0.004
 Eventually-0.004
Token started
Token ID2067
Feature activation+4.22e-01

Pos logit contributions
 Finally+0.011
icol+0.010
 Sure+0.010
 trek+0.009
 mun+0.009
Neg logit contributions
 angry-0.009
umpy-0.009
 mean-0.009
 losers-0.009
aky-0.009
Token to
Token ID284
Feature activation-7.59e-01

Pos logit contributions
 angry+0.011
 mean+0.011
 losers+0.011
 rude+0.010
 selfish+0.010
Neg logit contributions
icol-0.012
 Finally-0.012
 Sure-0.012
-0.012
 Eventually-0.011
Token sort
Token ID3297
Feature activation+2.11e-01

Pos logit contributions
icol+0.022
 Finally+0.021
+0.021
 Sure+0.020
 trek+0.018
Neg logit contributions
 angry-0.020
 losers-0.020
 mean-0.019
 selfish-0.019
 rude-0.018
Token out
Token ID503
Feature activation-1.21e+00

Pos logit contributions
 angry+0.005
 mean+0.005
 losers+0.005
 selfish+0.005
 rude+0.005
Neg logit contributions
 Finally-0.006
icol-0.006
 Sure-0.006
-0.006
 Eventually-0.005
Token a
Token ID257
Feature activation-7.86e-01

Pos logit contributions
 Finally+0.037
icol+0.035
 Sure+0.035
ings+0.034
 trek+0.033
Neg logit contributions
 losers-0.028
 angry-0.027
 mean-0.027
 selfish-0.026
umpy-0.026
Token snowball
Token ID46275
Feature activation-3.26e+00

Pos logit contributions
icol+0.022
 Finally+0.021
 Sure+0.020
+0.019
 purchase+0.018
Neg logit contributions
 losers-0.022
 angry-0.022
 mean-0.021
 selfish-0.020
 rude-0.020
Token fight
Token ID1907
Feature activation+2.01e-01

Pos logit contributions
 Finally+0.100
icol+0.098
 Sure+0.097
+0.095
 Eventually+0.086
Neg logit contributions
 angry-0.084
 mean-0.081
 losers-0.078
 rude-0.074
 lazy-0.074
Token with
Token ID351
Feature activation-7.17e-01

Pos logit contributions
 angry+0.006
aky+0.006
 losers+0.006
 selfish+0.006
 mean+0.006
Neg logit contributions
 Finally-0.005
 Sure-0.004
icol-0.004
ings-0.004
-0.004
Token some
Token ID617
Feature activation+8.55e-01

Pos logit contributions
 Finally+0.024
icol+0.023
 Sure+0.023
+0.023
 Eventually+0.021
Neg logit contributions
 angry-0.016
 mean-0.015
 losers-0.015
aky-0.014
 selfish-0.014
Token of
Token ID286
Feature activation+9.38e-01

Pos logit contributions
 angry+0.021
 losers+0.020
 mean+0.020
 selfish+0.020
aky+0.020
Neg logit contributions
 Finally-0.024
icol-0.023
-0.022
 Sure-0.022
 purchase-0.020
Token his
Token ID465
Feature activation+1.73e-01

Pos logit contributions
 angry+0.020
 mean+0.019
aky+0.018
 selfish+0.018
umpy+0.018
Neg logit contributions
 Finally-0.032
 Sure-0.030
icol-0.030
-0.030
 purchase-0.028
Token discovered
Token ID5071
Feature activation+7.33e-02

Pos logit contributions
+0.004
 Finally+0.004
icol+0.004
 Sure+0.004
 Eventually+0.003
Neg logit contributions
 angry-0.003
 mean-0.002
 rude-0.002
 lazy-0.002
 alone-0.002
Token a
Token ID257
Feature activation-3.30e-01

Pos logit contributions
 angry+0.002
 mean+0.002
 rude+0.001
 kids+0.001
 selfish+0.001
Neg logit contributions
icol-0.003
-0.003
 Finally-0.003
 Sure-0.003
uggle-0.002
Token long
Token ID890
Feature activation+6.88e-01

Pos logit contributions
 Finally+0.011
icol+0.011
 Sure+0.011
+0.011
 Eventually+0.010
Neg logit contributions
 angry-0.007
 losers-0.007
 mean-0.006
 lazy-0.006
 rude-0.006
Token-
Token ID12
Feature activation+1.47e+00

Pos logit contributions
 angry+0.018
 mean+0.018
 losers+0.016
 rude+0.016
 lazy+0.016
Neg logit contributions
 Finally-0.021
icol-0.020
 Sure-0.020
-0.020
 Eventually-0.018
Tokenfor
Token ID1640
Feature activation+9.22e-01

Pos logit contributions
 angry+0.045
 mean+0.044
 rude+0.041
 losers+0.040
 selfish+0.040
Neg logit contributions
-0.040
 Finally-0.039
icol-0.039
 Sure-0.036
 Eventually-0.034
Tokengotten
Token ID21646
Feature activation-3.19e+00

Pos logit contributions
 angry+0.023
 mean+0.022
 losers+0.022
 rude+0.021
 selfish+0.020
Neg logit contributions
 Finally-0.029
icol-0.029
 Sure-0.028
-0.027
 Eventually-0.025
Token t
Token ID256
Feature activation+1.65e-01

Pos logit contributions
 Finally+0.097
icol+0.093
uggle+0.093
+0.092
 Enjoy+0.090
Neg logit contributions
 mean-0.071
 angry-0.067
 alone-0.066
 boss-0.064
 ugly-0.063
Tokeneddy
Token ID21874
Feature activation-9.59e-01

Pos logit contributions
 angry+0.003
aky+0.003
 mean+0.003
umpy+0.003
 kids+0.003
Neg logit contributions
 Finally-0.006
icol-0.006
 Sure-0.006
-0.006
 Eventually-0.005
Token bear
Token ID6842
Feature activation-1.11e+00

Pos logit contributions
 Finally+0.035
icol+0.035
 Sure+0.034
+0.034
 Eventually+0.031
Neg logit contributions
 angry-0.019
 mean-0.018
 losers-0.017
 rude-0.016
 lazy-0.016
Token inside
Token ID2641
Feature activation-7.94e-01

Pos logit contributions
 Finally+0.031
icol+0.030
 Sure+0.030
+0.030
 Eventually+0.026
Neg logit contributions
 angry-0.031
 losers-0.030
 mean-0.030
 lazy-0.028
 rude-0.028
Token.
Token ID13
Feature activation-1.64e-01

Pos logit contributions
 Finally+0.022
 Sure+0.021
+0.020
icol+0.020
 Eventually+0.018
Neg logit contributions
 angry-0.022
 losers-0.022
 mean-0.022
 selfish-0.021
aky-0.021
Token made
Token ID925
Feature activation+1.90e-02

Pos logit contributions
 Finally+0.001
 trek+0.001
 mun+0.001
 Sure+0.001
icol+0.001
Neg logit contributions
 losers-0.001
 angry-0.001
 rude-0.001
umpy-0.001
 mean-0.001
Token snow
Token ID6729
Feature activation-1.46e+00

Pos logit contributions
 angry+0.000
 mean+0.000
 losers+0.000
 lazy+0.000
 rude+0.000
Neg logit contributions
 Finally-0.001
icol-0.001
-0.001
 Sure-0.001
 Eventually-0.001
Tokenmen
Token ID3653
Feature activation-4.08e-01

Pos logit contributions
 Finally+0.045
icol+0.044
 Sure+0.043
+0.043
 Eventually+0.038
Neg logit contributions
 angry-0.038
 mean-0.036
 losers-0.035
 lazy-0.033
 rude-0.033
Token and
Token ID290
Feature activation-6.63e-01

Pos logit contributions
 Finally+0.012
icol+0.012
 Sure+0.012
+0.012
uggle+0.011
Neg logit contributions
 angry-0.010
 losers-0.010
 mean-0.010
 rude-0.009
 selfish-0.009
Token had
Token ID550
Feature activation-5.23e-01

Pos logit contributions
 Finally+0.018
icol+0.018
 Sure+0.017
 trek+0.016
+0.016
Neg logit contributions
 angry-0.018
 losers-0.018
 mean-0.018
 rude-0.017
 selfish-0.017
Token snowball
Token ID46275
Feature activation-3.17e+00

Pos logit contributions
 Finally+0.015
icol+0.014
 Sure+0.013
 Eventually+0.013
+0.013
Neg logit contributions
 angry-0.014
 losers-0.014
 mean-0.013
umpy-0.013
 rude-0.013
Token fights
Token ID11418
Feature activation+3.47e-02

Pos logit contributions
 Finally+0.094
icol+0.092
 Sure+0.092
+0.089
 Eventually+0.080
Neg logit contributions
 angry-0.086
 mean-0.083
 losers-0.081
 rude-0.076
 lazy-0.076
Token.
Token ID13
Feature activation-1.36e-01

Pos logit contributions
 angry+0.001
 losers+0.001
 mean+0.001
aky+0.001
 selfish+0.001
Neg logit contributions
 Finally-0.001
 Sure-0.001
icol-0.001
-0.001
ings-0.001
Token They
Token ID1119
Feature activation+1.56e-01

Pos logit contributions
icol+0.004
+0.004
 Sure+0.004
 Finally+0.004
 purchase+0.004
Neg logit contributions
 angry-0.003
 mean-0.003
 kids-0.003
 lazy-0.003
 losers-0.003
Token used
Token ID973
Feature activation-1.85e-01

Pos logit contributions
 angry+0.005
 losers+0.005
 rude+0.004
umpy+0.004
 my+0.004
Neg logit contributions
 Finally-0.004
 trek-0.003
 mun-0.003
 Sure-0.003
icol-0.003
Token their
Token ID511
Feature activation-4.91e-01

Pos logit contributions
 Finally+0.006
icol+0.006
 Sure+0.006
+0.005
 Eventually+0.005
Neg logit contributions
 angry-0.004
 mean-0.004
 losers-0.004
 selfish-0.004
 rude-0.004
Tokenman
Token ID805
Feature activation-7.79e-01

Pos logit contributions
 Finally+0.037
icol+0.037
 Sure+0.036
+0.035
 Eventually+0.032
Neg logit contributions
 angry-0.036
 mean-0.035
 losers-0.034
 rude-0.032
 lazy-0.032
Token,
Token ID11
Feature activation-5.06e-01

Pos logit contributions
icol+0.022
 Finally+0.022
 Sure+0.021
+0.021
uggle+0.019
Neg logit contributions
 angry-0.020
 losers-0.020
 mean-0.020
 selfish-0.019
 rude-0.019
Token and
Token ID290
Feature activation-1.71e-01

Pos logit contributions
icol+0.014
 Finally+0.014
 Sure+0.014
+0.014
 trek+0.012
Neg logit contributions
 angry-0.013
 mean-0.013
 losers-0.013
 rude-0.012
umpy-0.012
Token had
Token ID550
Feature activation-9.31e-01

Pos logit contributions
 Finally+0.005
 Sure+0.005
icol+0.004
+0.004
 Eventually+0.004
Neg logit contributions
 mean-0.005
 angry-0.005
umpy-0.005
aky-0.005
 losers-0.004
Token a
Token ID257
Feature activation-1.11e+00

Pos logit contributions
 Finally+0.024
icol+0.021
 Eventually+0.020
 Sure+0.020
 trek+0.019
Neg logit contributions
 angry-0.029
umpy-0.027
 mean-0.027
 losers-0.027
 alone-0.026
Token snowball
Token ID46275
Feature activation-3.12e+00

Pos logit contributions
 Finally+0.034
icol+0.034
 Sure+0.033
+0.033
 Eventually+0.029
Neg logit contributions
 angry-0.029
 mean-0.027
 losers-0.027
 rude-0.025
 selfish-0.025
Token fight
Token ID1907
Feature activation+1.38e-01

Pos logit contributions
 Finally+0.088
icol+0.087
 Sure+0.085
+0.084
 Eventually+0.075
Neg logit contributions
 angry-0.086
 mean-0.083
 losers-0.082
 rude-0.078
 selfish-0.078
Token.
Token ID13
Feature activation+1.14e-01

Pos logit contributions
aky+0.004
 losers+0.004
 angry+0.004
 lazy+0.004
 selfish+0.004
Neg logit contributions
 Finally-0.003
 Sure-0.003
ings-0.003
icol-0.003
 Eventually-0.003
Token It
Token ID632
Feature activation-3.73e-02

Pos logit contributions
 angry+0.003
 mean+0.003
 losers+0.003
 lazy+0.003
 selfish+0.003
Neg logit contributions
icol-0.003
 Finally-0.003
 Sure-0.003
-0.003
uggle-0.002
Token was
Token ID373
Feature activation-1.21e+00

Pos logit contributions
 Finally+0.001
 Sure+0.001
icol+0.001
+0.001
 trek+0.000
Neg logit contributions
 losers-0.001
 musicians-0.001
 angry-0.001
 selfish-0.001
 kids-0.001
Token so
Token ID523
Feature activation-4.25e-01

Pos logit contributions
 Finally+0.037
icol+0.033
 Eventually+0.032
+0.030
 Sure+0.030
Neg logit contributions
umpy-0.028
aky-0.027
 losers-0.027
 musicians-0.025
 reckless-0.025
Token was
Token ID373
Feature activation-1.32e+00

Pos logit contributions
 Finally+0.014
 Sure+0.014
+0.013
icol+0.013
 trek+0.011
Neg logit contributions
 losers-0.028
 angry-0.025
 reckless-0.024
 musicians-0.024
aky-0.024
Token colorful
Token ID20239
Feature activation-2.76e-01

Pos logit contributions
 Finally+0.044
icol+0.042
 Sure+0.040
+0.039
 Eventually+0.038
Neg logit contributions
 losers-0.029
 angry-0.027
aky-0.026
 mean-0.025
 reckless-0.025
Token with
Token ID351
Feature activation-5.60e-01

Pos logit contributions
 Finally+0.010
 Sure+0.010
icol+0.010
+0.010
 Eventually+0.009
Neg logit contributions
 angry-0.006
 mean-0.005
 losers-0.005
 lazy-0.005
 rude-0.005
Token lots
Token ID6041
Feature activation+8.87e-02

Pos logit contributions
 Finally+0.017
icol+0.016
 Sure+0.015
+0.015
 Eventually+0.014
Neg logit contributions
 angry-0.014
 losers-0.014
 mean-0.013
aky-0.013
 reckless-0.012
Token of
Token ID286
Feature activation-2.66e-01

Pos logit contributions
 angry+0.002
 losers+0.002
 mean+0.002
 rude+0.002
aky+0.002
Neg logit contributions
 Finally-0.003
icol-0.003
-0.003
 Sure-0.003
 Eventually-0.002
Token red
Token ID2266
Feature activation-3.10e+00

Pos logit contributions
 Finally+0.009
icol+0.009
 Sure+0.009
+0.008
 Eventually+0.008
Neg logit contributions
 angry-0.006
 losers-0.006
 mean-0.006
 rude-0.005
 selfish-0.005
Tokens
Token ID82
Feature activation-4.39e-01

Pos logit contributions
 Finally+0.110
 Sure+0.109
icol+0.107
+0.106
 Eventually+0.098
Neg logit contributions
 angry-0.063
 mean-0.061
 kids-0.058
 losers-0.056
 lazy-0.055
Token and
Token ID290
Feature activation+1.24e-01

Pos logit contributions
 Finally+0.013
icol+0.013
+0.013
 Sure+0.013
 Eventually+0.012
Neg logit contributions
 angry-0.011
 losers-0.011
 mean-0.011
 rude-0.010
 selfish-0.010
Token blues
Token ID25570
Feature activation-5.89e-01

Pos logit contributions
 losers+0.003
 angry+0.003
aky+0.003
 mean+0.003
umpy+0.003
Neg logit contributions
 Finally-0.004
icol-0.004
 Sure-0.004
-0.003
 Eventually-0.003
Token.
Token ID13
Feature activation-9.75e-02

Pos logit contributions
 Finally+0.018
icol+0.018
+0.018
 Sure+0.018
 Eventually+0.016
Neg logit contributions
 angry-0.015
 mean-0.014
 losers-0.014
 rude-0.013
 lazy-0.013
Token Max
Token ID5436
Feature activation-8.01e-01

Pos logit contributions
 Finally+0.003
 Sure+0.003
icol+0.003
+0.003
 Eventually+0.003
Neg logit contributions
 angry-0.003
 mean-0.002
 losers-0.002
 rude-0.002
 selfish-0.002
Token.
Token ID13
Feature activation+2.58e-01

Pos logit contributions
 Finally+0.004
 Sure+0.004
+0.004
icol+0.004
 Eventually+0.004
Neg logit contributions
 angry-0.005
 losers-0.005
 mean-0.005
 lazy-0.005
 selfish-0.005
Token Tim
Token ID5045
Feature activation-1.73e-01

Pos logit contributions
 angry+0.007
 mean+0.007
 losers+0.007
 rude+0.007
aky+0.007
Neg logit contributions
 Finally-0.008
 Sure-0.007
-0.007
icol-0.007
 Eventually-0.007
Tokenmy
Token ID1820
Feature activation-4.70e-01

Pos logit contributions
 Finally+0.005
 Sure+0.005
icol+0.005
+0.004
 Eventually+0.004
Neg logit contributions
 losers-0.005
 angry-0.005
 mean-0.004
 musicians-0.004
 rude-0.004
Token hadn
Token ID8020
Feature activation-6.77e-01

Pos logit contributions
 Finally+0.012
 Sure+0.010
+0.010
icol+0.010
 Eventually+0.010
Neg logit contributions
 losers-0.015
 angry-0.014
 mean-0.013
 musicians-0.013
 my-0.013
Token't
Token ID470
Feature activation-6.15e-01

Pos logit contributions
+0.011
 Finally+0.010
 Sure+0.010
icol+0.009
 Eventually+0.008
Neg logit contributions
 angry-0.028
 mean-0.027
 losers-0.026
 rude-0.026
aky-0.026
Token ever
Token ID1683
Feature activation-3.08e+00

Pos logit contributions
 Finally+0.014
 Sure+0.014
 Eventually+0.012
 trek+0.012
icol+0.012
Neg logit contributions
 losers-0.018
 angry-0.016
 teams-0.016
igator-0.016
 my-0.016
Token been
Token ID587
Feature activation-1.48e+00

Pos logit contributions
 Finally+0.087
icol+0.085
 Sure+0.085
+0.083
 Eventually+0.075
Neg logit contributions
 angry-0.081
 losers-0.079
 mean-0.077
 selfish-0.073
 lazy-0.072
Token to
Token ID284
Feature activation-4.88e-01

Pos logit contributions
 Finally+0.037
 Sure+0.035
+0.033
to+0.032
 searching+0.032
Neg logit contributions
 losers-0.037
igator-0.034
 angry-0.034
umpy-0.033
aky-0.033
Token a
Token ID257
Feature activation+3.89e-02

Pos logit contributions
 Finally+0.010
 purchase+0.010
+0.010
icol+0.009
 generation+0.009
Neg logit contributions
aky-0.015
 losers-0.014
umpy-0.014
 teams-0.013
 angry-0.013
Token doctor
Token ID6253
Feature activation+1.65e-01

Pos logit contributions
 angry+0.001
 losers+0.001
 mean+0.001
aky+0.001
 musicians+0.001
Neg logit contributions
 Finally-0.001
 Sure-0.001
icol-0.001
-0.001
 Eventually-0.001
Token before
Token ID878
Feature activation-1.15e+00

Pos logit contributions
 losers+0.005
 angry+0.005
 mean+0.004
aky+0.004
 lazy+0.004
Neg logit contributions
 Finally-0.004
 Sure-0.004
-0.004
icol-0.004
 Eventually-0.003
Token and
Token ID290
Feature activation-3.57e-01

Pos logit contributions
 Finally+0.025
icol+0.025
 Sure+0.025
+0.024
uggle+0.022
Neg logit contributions
 angry-0.026
 mean-0.026
 losers-0.026
 selfish-0.025
 rude-0.024
Token pretended
Token ID36766
Feature activation-5.82e-02

Pos logit contributions
 Finally+0.010
icol+0.010
 Sure+0.009
 trek+0.008
 mun+0.008
Neg logit contributions
 angry-0.010
umpy-0.010
 losers-0.010
 mean-0.010
aky-0.009
Token to
Token ID284
Feature activation-1.50e-01

Pos logit contributions
 Finally+0.001
 Sure+0.001
icol+0.001
+0.001
 Eventually+0.001
Neg logit contributions
aky-0.002
 angry-0.002
 losers-0.002
 mean-0.002
umpy-0.002
Token have
Token ID423
Feature activation-6.84e-01

Pos logit contributions
 Finally+0.004
icol+0.004
 Sure+0.004
+0.004
 generation+0.004
Neg logit contributions
 angry-0.004
 losers-0.004
 selfish-0.004
aky-0.004
 mean-0.004
Token a
Token ID257
Feature activation-6.11e-01

Pos logit contributions
 Finally+0.017
icol+0.017
 Sure+0.015
growth+0.015
 trek+0.015
Neg logit contributions
 angry-0.020
umpy-0.019
 losers-0.018
 unhappy-0.018
 selfish-0.018
Token snowball
Token ID46275
Feature activation-3.08e+00

Pos logit contributions
icol+0.019
 Finally+0.019
 Sure+0.018
+0.018
 Eventually+0.016
Neg logit contributions
 angry-0.016
 losers-0.015
 mean-0.014
 selfish-0.014
 alone-0.014
Token fight
Token ID1907
Feature activation+3.22e-01

Pos logit contributions
 Finally+0.089
icol+0.089
 Sure+0.086
+0.086
 Eventually+0.076
Neg logit contributions
 angry-0.082
 mean-0.079
 losers-0.078
 selfish-0.074
 lazy-0.074
Token with
Token ID351
Feature activation-2.45e-01

Pos logit contributions
aky+0.010
 losers+0.010
 angry+0.010
 selfish+0.009
 lazy+0.009
Neg logit contributions
 Finally-0.007
ings-0.007
 Sure-0.007
icol-0.006
 Eventually-0.006
Token her
Token ID607
Feature activation+6.83e-01

Pos logit contributions
 Finally+0.008
icol+0.007
 Sure+0.007
+0.007
 Eventually+0.007
Neg logit contributions
 angry-0.006
 mean-0.006
 losers-0.006
aky-0.005
 selfish-0.005
Token imaginary
Token ID26726
Feature activation+5.02e-01

Pos logit contributions
 losers+0.020
 angry+0.019
 mean+0.019
 selfish+0.018
 rude+0.018
Neg logit contributions
 Finally-0.016
icol-0.016
-0.016
 Sure-0.016
ings-0.014
Token friend
Token ID1545
Feature activation-5.61e-01

Pos logit contributions
 angry+0.013
 losers+0.012
 mean+0.012
 selfish+0.011
 rude+0.011
Neg logit contributions
icol-0.015
 Finally-0.014
-0.014
 Sure-0.014
 Eventually-0.012
Token and
Token ID290
Feature activation-9.87e-01

Pos logit contributions
 losers+0.002
 angry+0.002
 mean+0.002
 musicians+0.002
 lazy+0.002
Neg logit contributions
 Finally-0.001
-0.001
 Sure-0.001
ings-0.001
icol-0.001
Token stickers
Token ID28568
Feature activation+7.14e-01

Pos logit contributions
 Finally+0.024
icol+0.023
 Sure+0.022
 generation+0.020
 purchase+0.020
Neg logit contributions
 losers-0.030
umpy-0.029
aky-0.029
 angry-0.028
 mean-0.028
Token.
Token ID13
Feature activation-2.21e-01

Pos logit contributions
 losers+0.023
 angry+0.022
 lazy+0.021
 selfish+0.021
 mean+0.021
Neg logit contributions
 Finally-0.017
ings-0.016
-0.015
 Sure-0.015
icol-0.015
Token They
Token ID1119
Feature activation+1.18e-01

Pos logit contributions
 Finally+0.006
icol+0.006
 Sure+0.005
+0.005
 Eventually+0.005
Neg logit contributions
 angry-0.007
 mean-0.007
 losers-0.006
 rude-0.006
 lazy-0.006
Token use
Token ID779
Feature activation-2.73e-01

Pos logit contributions
 angry+0.004
 losers+0.003
 lazy+0.003
 mean+0.003
 selfish+0.003
Neg logit contributions
 Finally-0.003
-0.003
 Sure-0.003
icol-0.003
 purchase-0.002
Token red
Token ID2266
Feature activation-3.07e+00

Pos logit contributions
 Finally+0.009
icol+0.008
 Sure+0.008
+0.008
 Eventually+0.008
Neg logit contributions
 angry-0.006
 losers-0.006
 mean-0.006
 lazy-0.006
aky-0.006
Token,
Token ID11
Feature activation+1.58e-01

Pos logit contributions
icol+0.105
 Finally+0.104
 Sure+0.102
+0.101
 Eventually+0.091
Neg logit contributions
 angry-0.067
 mean-0.065
 losers-0.061
 kids-0.058
 rude-0.057
Token blue
Token ID4171
Feature activation-1.91e+00

Pos logit contributions
 losers+0.004
umpy+0.003
 alone+0.003
 angry+0.003
 reckless+0.003
Neg logit contributions
 Finally-0.005
 Sure-0.005
icol-0.005
-0.004
growth-0.004
Token,
Token ID11
Feature activation-3.78e-01

Pos logit contributions
 Finally+0.058
icol+0.057
 Sure+0.056
+0.056
 Eventually+0.050
Neg logit contributions
 angry-0.050
 mean-0.048
 losers-0.047
 rude-0.044
 lazy-0.044
Token yellow
Token ID7872
Feature activation-1.85e+00

Pos logit contributions
 Finally+0.013
icol+0.012
 Sure+0.012
+0.012
growth+0.011
Neg logit contributions
 losers-0.008
 angry-0.008
 reckless-0.007
umpy-0.007
 alone-0.007
Token and
Token ID290
Feature activation+2.10e-01

Pos logit contributions
 Finally+0.052
icol+0.050
+0.049
 Sure+0.049
 Eventually+0.044
Neg logit contributions
 angry-0.053
 losers-0.050
 mean-0.050
 lazy-0.047
 rude-0.047
Token circles
Token ID13332
Feature activation-3.69e-01

Pos logit contributions
 Finally+0.039
icol+0.039
 Sure+0.038
+0.036
 Eventually+0.034
Neg logit contributions
 angry-0.030
 losers-0.029
 mean-0.028
 rude-0.026
 selfish-0.026
Token,
Token ID11
Feature activation-4.00e-01

Pos logit contributions
 Finally+0.010
icol+0.010
+0.010
 Sure+0.010
 Eventually+0.009
Neg logit contributions
 angry-0.011
 losers-0.010
 mean-0.010
 selfish-0.010
 lazy-0.010
Token and
Token ID290
Feature activation-6.68e-01

Pos logit contributions
 Finally+0.014
icol+0.014
 Sure+0.014
+0.013
 Eventually+0.012
Neg logit contributions
 angry-0.008
 mean-0.008
 losers-0.008
 rude-0.007
 selfish-0.007
Token even
Token ID772
Feature activation-2.75e-01

Pos logit contributions
 Finally+0.021
icol+0.020
 Sure+0.020
+0.018
 Eventually+0.018
Neg logit contributions
 angry-0.016
 losers-0.016
aky-0.015
 mean-0.015
umpy-0.015
Token won
Token ID1839
Feature activation-5.66e-02

Pos logit contributions
 Finally+0.009
 Sure+0.008
icol+0.008
+0.007
 Eventually+0.007
Neg logit contributions
 losers-0.006
 angry-0.006
 mean-0.006
aky-0.006
umpy-0.006
Tokenky
Token ID2584
Feature activation-3.07e+00

Pos logit contributions
 Finally+0.002
+0.002
 Sure+0.002
icol+0.002
ings+0.002
Neg logit contributions
aky-0.001
 losers-0.001
 angry-0.001
igator-0.001
 mean-0.001
Token circles
Token ID13332
Feature activation-5.05e-01

Pos logit contributions
 Finally+0.088
icol+0.087
 Sure+0.084
+0.082
 Eventually+0.075
Neg logit contributions
 angry-0.084
 mean-0.081
 losers-0.080
 lazy-0.075
 rude-0.075
Token.
Token ID13
Feature activation+9.22e-02

Pos logit contributions
 Finally+0.013
icol+0.012
+0.011
ings+0.011
 Sure+0.011
Neg logit contributions
 angry-0.015
 losers-0.014
aky-0.014
 reckless-0.014
 lazy-0.014
Token Her
Token ID2332
Feature activation+1.70e-01

Pos logit contributions
 mean+0.002
 angry+0.002
 kids+0.002
 lazy+0.002
 my+0.002
Neg logit contributions
icol-0.003
-0.003
 Sure-0.003
 Finally-0.003
 purchase-0.003
Token mom
Token ID1995
Feature activation-5.52e-01

Pos logit contributions
 angry+0.004
 mean+0.004
 losers+0.004
 rude+0.003
 lazy+0.003
Neg logit contributions
 Finally-0.006
icol-0.006
 Sure-0.006
-0.005
 Eventually-0.005
Tokenmy
Token ID1820
Feature activation-5.18e-01

Pos logit contributions
 Finally+0.015
 Sure+0.014
icol+0.013
+0.013
 Eventually+0.012
Neg logit contributions
 angry-0.016
 losers-0.016
 mean-0.015
 selfish-0.014
 rude-0.014
Token snow
Token ID6729
Feature activation-1.09e+00

Pos logit contributions
 Finally+0.003
icol+0.003
 Sure+0.003
+0.003
 Eventually+0.002
Neg logit contributions
 angry-0.002
 losers-0.002
 mean-0.002
 selfish-0.001
 rude-0.001
Tokenman
Token ID805
Feature activation-8.68e-01

Pos logit contributions
 Finally+0.036
icol+0.036
 Sure+0.035
+0.034
 Eventually+0.032
Neg logit contributions
 angry-0.025
 mean-0.024
 losers-0.023
 lazy-0.021
 alone-0.021
Token and
Token ID290
Feature activation-3.50e-01

Pos logit contributions
 Finally+0.026
icol+0.025
 Sure+0.025
+0.025
 Eventually+0.022
Neg logit contributions
 angry-0.022
 mean-0.021
 losers-0.021
 rude-0.020
 selfish-0.020
Token had
Token ID550
Feature activation-5.81e-01

Pos logit contributions
 Finally+0.010
icol+0.010
 Sure+0.010
+0.009
 Eventually+0.008
Neg logit contributions
 angry-0.009
 mean-0.009
 losers-0.009
 rude-0.008
 selfish-0.008
Token a
Token ID257
Feature activation-5.45e-01

Pos logit contributions
 Finally+0.016
icol+0.015
 Sure+0.014
 Eventually+0.013
+0.013
Neg logit contributions
 angry-0.016
umpy-0.016
 losers-0.015
 mean-0.015
 reckless-0.015
Token snowball
Token ID46275
Feature activation-3.06e+00

Pos logit contributions
 Finally+0.019
icol+0.019
 Sure+0.019
+0.018
 Eventually+0.017
Neg logit contributions
 angry-0.012
 mean-0.011
 losers-0.011
 lazy-0.010
 rude-0.010
Token fight
Token ID1907
Feature activation+4.94e-01

Pos logit contributions
 Finally+0.102
 Sure+0.101
icol+0.099
+0.097
 Eventually+0.090
Neg logit contributions
 angry-0.073
 mean-0.071
 losers-0.066
 alone-0.062
 rude-0.062
Token with
Token ID351
Feature activation-3.64e-01

Pos logit contributions
 losers+0.014
aky+0.014
 angry+0.014
 mean+0.013
 lazy+0.013
Neg logit contributions
 Finally-0.013
 Sure-0.012
icol-0.011
ings-0.011
-0.010
Token her
Token ID607
Feature activation+6.07e-01

Pos logit contributions
 Finally+0.013
icol+0.012
 Sure+0.012
+0.012
 Eventually+0.011
Neg logit contributions
 angry-0.008
 mean-0.008
 losers-0.007
 rude-0.007
 lazy-0.007
Token friends
Token ID2460
Feature activation-1.90e-01

Pos logit contributions
 losers+0.017
 angry+0.017
 mean+0.016
 selfish+0.016
 rude+0.015
Neg logit contributions
 Finally-0.017
 Sure-0.016
icol-0.016
-0.016
 Eventually-0.014
Token.
Token ID13
Feature activation-3.53e-02

Pos logit contributions
 Finally+0.003
+0.003
 Sure+0.003
ings+0.003
 Eventually+0.003
Neg logit contributions
 losers-0.007
 angry-0.007
 mean-0.007
 lazy-0.006
 rude-0.006
Token,
Token ID11
Feature activation+1.05e-01

Pos logit contributions
 Finally+0.030
icol+0.030
 Sure+0.029
+0.029
 Eventually+0.026
Neg logit contributions
 angry-0.025
 mean-0.024
 losers-0.023
 rude-0.022
 lazy-0.022
Token Lily
Token ID20037
Feature activation+4.50e-01

Pos logit contributions
 angry+0.002
 losers+0.002
 mean+0.002
aky+0.002
umpy+0.002
Neg logit contributions
 Finally-0.004
 Sure-0.003
icol-0.003
-0.003
 Eventually-0.003
Token proudly
Token ID23017
Feature activation+4.71e-01

Pos logit contributions
 losers+0.013
 angry+0.012
 mean+0.012
 rude+0.011
 selfish+0.011
Neg logit contributions
icol-0.013
 Finally-0.012
 Sure-0.012
-0.011
 Eventually-0.010
Token wore
Token ID12408
Feature activation+1.33e-01

Pos logit contributions
 angry+0.013
 mean+0.012
 losers+0.012
 rude+0.011
 lazy+0.011
Neg logit contributions
 Finally-0.014
icol-0.014
 Sure-0.013
-0.013
 Eventually-0.012
Token her
Token ID607
Feature activation-8.34e-01

Pos logit contributions
 angry+0.003
 losers+0.003
 mean+0.003
aky+0.003
 selfish+0.003
Neg logit contributions
 Finally-0.004
icol-0.004
 Sure-0.004
-0.004
 Eventually-0.004
Token favorite
Token ID4004
Feature activation-3.06e+00

Pos logit contributions
icol+0.026
 Finally+0.025
+0.025
 Sure+0.025
 purchase+0.022
Neg logit contributions
 angry-0.020
 losers-0.020
 mean-0.020
aky-0.018
 rude-0.018
Token dress
Token ID6576
Feature activation-7.41e-01

Pos logit contributions
 Finally+0.086
icol+0.085
 Sure+0.083
+0.082
 Eventually+0.073
Neg logit contributions
 angry-0.086
 mean-0.083
 losers-0.081
 rude-0.077
 lazy-0.077
Token to
Token ID284
Feature activation-4.72e-01

Pos logit contributions
 Finally+0.021
+0.021
icol+0.021
 Sure+0.020
 Eventually+0.018
Neg logit contributions
 angry-0.020
 losers-0.020
 mean-0.020
 selfish-0.018
 lazy-0.018
Token go
Token ID467
Feature activation+2.78e-03

Pos logit contributions
 Finally+0.013
icol+0.012
 Sure+0.012
+0.012
 purchase+0.011
Neg logit contributions
 angry-0.014
 losers-0.013
 mean-0.013
 selfish-0.012
 lazy-0.012
Token visiting
Token ID10013
Feature activation-1.01e-01

Pos logit contributions
 losers+0.000
 angry+0.000
 mean+0.000
 selfish+0.000
aky+0.000
Neg logit contributions
icol-0.000
 Finally-0.000
 Sure-0.000
-0.000
uggle-0.000
Token with
Token ID351
Feature activation-5.89e-01

Pos logit contributions
 Finally+0.003
 Sure+0.003
icol+0.003
+0.003
 generation+0.003
Neg logit contributions
 losers-0.003
 mean-0.003
 angry-0.003
aky-0.003
 selfish-0.003
Token river
Token ID7850
Feature activation-8.27e-01

Pos logit contributions
 Finally+0.004
icol+0.004
 Sure+0.004
+0.004
 purchase+0.004
Neg logit contributions
 angry-0.003
 losers-0.003
 mean-0.003
aky-0.002
 selfish-0.002
Token!
Token ID0
Feature activation-4.17e-01

Pos logit contributions
 Finally+0.025
icol+0.024
 Sure+0.024
+0.024
 Eventually+0.021
Neg logit contributions
 angry-0.022
 mean-0.021
 losers-0.021
 lazy-0.019
 rude-0.019
Token
Token ID220
Feature activation-9.75e-01

Pos logit contributions
icol+0.014
+0.013
 Finally+0.013
 Sure+0.013
ings+0.012
Neg logit contributions
 angry-0.009
 mean-0.008
 lazy-0.008
 kids-0.008
 my-0.008
Token
Token ID198
Feature activation-7.73e-02

Pos logit contributions
icol+0.034
+0.031
 Finally+0.031
 Sure+0.031
 generation+0.029
Neg logit contributions
 angry-0.021
 mean-0.020
 losers-0.019
 kids-0.019
 alone-0.018
Token
Token ID198
Feature activation-5.83e-01

Pos logit contributions
icol+0.003
 Finally+0.003
+0.003
 Sure+0.003
 generation+0.002
Neg logit contributions
 angry-0.002
 mean-0.001
 lazy-0.001
 kids-0.001
 losers-0.001
TokenLittle
Token ID22253
Feature activation-3.05e+00

Pos logit contributions
 Finally+0.017
 Sure+0.016
icol+0.016
+0.015
 Eventually+0.014
Neg logit contributions
 angry-0.017
 losers-0.016
 mean-0.016
 selfish-0.015
 rude-0.015
Token Bear
Token ID14732
Feature activation-1.51e+00

Pos logit contributions
 Finally+0.096
icol+0.095
 Sure+0.094
+0.093
 Eventually+0.084
Neg logit contributions
 angry-0.075
 mean-0.073
 losers-0.070
 lazy-0.068
 rude-0.067
Token was
Token ID373
Feature activation-1.46e-02

Pos logit contributions
 Finally+0.044
icol+0.042
 Sure+0.042
+0.041
 Eventually+0.038
Neg logit contributions
 losers-0.040
 angry-0.040
 mean-0.039
 rude-0.036
 selfish-0.036
Token so
Token ID523
Feature activation+5.41e-01

Pos logit contributions
 Finally+0.001
 Sure+0.000
 Eventually+0.000
icol+0.000
+0.000
Neg logit contributions
 losers-0.000
 mean-0.000
aky-0.000
 musicians-0.000
 angry-0.000
Token excited
Token ID6568
Feature activation-2.77e-01

Pos logit contributions
 losers+0.013
 musicians+0.010
 mean+0.010
aky+0.010
 my+0.010
Neg logit contributions
 Finally-0.018
 Sure-0.018
-0.016
icol-0.016
 Eventually-0.015
Token.
Token ID13
Feature activation-2.04e-01

Pos logit contributions
 Finally+0.010
+0.009
icol+0.009
 Sure+0.009
 Eventually+0.009
Neg logit contributions
 angry-0.006
 mean-0.006
 losers-0.005
 rude-0.005
 kids-0.005
Token
Token ID198
Feature activation-2.37e-01

Pos logit contributions
icol+0.000
+0.000
 Finally+0.000
 Sure+0.000
 generation+0.000
Neg logit contributions
 angry-0.000
 mean-0.000
 kids-0.000
 losers-0.000
 alone-0.000
TokenHe
Token ID1544
Feature activation-1.74e-01

Pos logit contributions
 Finally+0.007
 Sure+0.006
+0.006
icol+0.006
 Eventually+0.006
Neg logit contributions
 angry-0.007
 mean-0.007
 losers-0.007
 selfish-0.006
 rude-0.006
Token pulled
Token ID5954
Feature activation-9.76e-01

Pos logit contributions
 Finally+0.005
icol+0.005
 Sure+0.005
+0.005
 Eventually+0.004
Neg logit contributions
 angry-0.005
 mean-0.004
 losers-0.004
 rude-0.004
 lazy-0.004
Token and
Token ID290
Feature activation-7.14e-01

Pos logit contributions
 Finally+0.034
 Sure+0.033
icol+0.032
+0.031
 Eventually+0.030
Neg logit contributions
 losers-0.020
 angry-0.020
 mean-0.020
 rude-0.018
 selfish-0.018
Token tug
Token ID27762
Feature activation-2.09e+00

Pos logit contributions
icol+0.022
 Finally+0.022
 Sure+0.021
+0.021
uggle+0.019
Neg logit contributions
 angry-0.019
 mean-0.018
 losers-0.017
 lazy-0.016
 rude-0.016
Tokenged
Token ID2004
Feature activation-3.04e+00

Pos logit contributions
 Finally+0.046
icol+0.045
 Sure+0.044
+0.043
 Eventually+0.037
Neg logit contributions
 angry-0.072
 mean-0.070
 losers-0.069
 rude-0.066
 lazy-0.066
Token at
Token ID379
Feature activation-9.89e-01

Pos logit contributions
 Finally+0.098
icol+0.098
 Sure+0.096
+0.095
 Eventually+0.086
Neg logit contributions
 angry-0.074
 mean-0.071
 losers-0.068
 rude-0.064
 lazy-0.064
Token the
Token ID262
Feature activation-6.58e-01

Pos logit contributions
icol+0.044
 Finally+0.044
 Sure+0.043
+0.042
 Eventually+0.040
Neg logit contributions
 angry-0.013
 mean-0.011
 losers-0.011
 lazy-0.010
 rude-0.009
Token lid
Token ID19789
Feature activation-5.16e-01

Pos logit contributions
icol+0.023
 Finally+0.022
+0.021
 Sure+0.021
 Eventually+0.020
Neg logit contributions
 angry-0.017
 mean-0.017
 ugly-0.015
 rude-0.015
 kids-0.015
Token,
Token ID11
Feature activation+4.84e-01

Pos logit contributions
 Finally+0.015
 Sure+0.014
+0.014
icol+0.014
 Eventually+0.013
Neg logit contributions
 angry-0.014
 losers-0.014
 mean-0.014
 musicians-0.013
 rude-0.013
Token but
Token ID475
Feature activation-7.61e-01

Pos logit contributions
 angry+0.012
 mean+0.011
 losers+0.011
 rude+0.010
 lazy+0.010
Neg logit contributions
 Finally-0.016
icol-0.015
 Sure-0.015
-0.015
 Eventually-0.014
Token so
Token ID523
Feature activation+8.93e-03

Pos logit contributions
 Finally+0.014
 Sure+0.013
icol+0.013
+0.013
 Eventually+0.012
Neg logit contributions
 angry-0.006
 losers-0.006
 mean-0.006
aky-0.005
 rude-0.005
Token she
Token ID673
Feature activation+5.50e-01

Pos logit contributions
 angry+0.000
 mean+0.000
 losers+0.000
 rude+0.000
 lazy+0.000
Neg logit contributions
 Finally-0.000
icol-0.000
 Sure-0.000
-0.000
 Eventually-0.000
Token put
Token ID1234
Feature activation+3.54e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 losers+0.014
 rude+0.013
 lazy+0.013
Neg logit contributions
 Finally-0.016
icol-0.016
 Sure-0.016
-0.015
 Eventually-0.014
Token on
Token ID319
Feature activation-8.38e-02

Pos logit contributions
 angry+0.008
 mean+0.008
 losers+0.008
 rude+0.008
 selfish+0.008
Neg logit contributions
 Finally-0.011
icol-0.011
 Sure-0.011
-0.011
 Eventually-0.010
Token her
Token ID607
Feature activation-9.21e-01

Pos logit contributions
 Finally+0.003
icol+0.003
 Sure+0.003
+0.003
 Eventually+0.003
Neg logit contributions
 angry-0.002
 mean-0.002
 losers-0.002
 rude-0.002
 lazy-0.002
Token favorite
Token ID4004
Feature activation-3.04e+00

Pos logit contributions
 Finally+0.027
icol+0.027
 Sure+0.027
+0.027
 purchase+0.023
Neg logit contributions
 losers-0.024
 angry-0.022
 mean-0.021
aky-0.021
umpy-0.020
Token sweater
Token ID35685
Feature activation-6.59e-01

Pos logit contributions
icol+0.089
 Finally+0.088
+0.086
 Sure+0.086
 purchase+0.077
Neg logit contributions
 angry-0.078
 losers-0.076
 mean-0.074
 selfish-0.069
 lazy-0.069
Token and
Token ID290
Feature activation-5.52e-01

Pos logit contributions
 Finally+0.021
icol+0.021
 Sure+0.021
+0.020
 Eventually+0.018
Neg logit contributions
 angry-0.016
 mean-0.015
 losers-0.015
 rude-0.014
 lazy-0.014
Token cl
Token ID537
Feature activation-2.13e-03

Pos logit contributions
 Finally+0.017
icol+0.017
 Sure+0.016
+0.016
 Eventually+0.014
Neg logit contributions
 angry-0.014
 mean-0.014
 losers-0.013
 rude-0.012
 selfish-0.012
Tokenapped
Token ID6320
Feature activation-1.06e+00

Pos logit contributions
 Finally+0.000
icol+0.000
 Sure+0.000
+0.000
 Eventually+0.000
Neg logit contributions
 angry-0.000
 mean-0.000
 losers-0.000
 lazy-0.000
 selfish-0.000
Token her
Token ID607
Feature activation-5.65e-02

Pos logit contributions
 Finally+0.027
icol+0.026
 Sure+0.025
ings+0.022
 Eventually+0.022
Neg logit contributions
 losers-0.031
 angry-0.030
 mean-0.029
 reckless-0.029
 selfish-0.029
Token thought
Token ID1807
Feature activation-3.46e-01

Pos logit contributions
icol+0.000
 Finally+0.000
 Sure+0.000
 mun+0.000
+0.000
Neg logit contributions
 angry-0.000
 mean-0.000
 losers-0.000
 rude-0.000
aky-0.000
Token that
Token ID326
Feature activation+1.93e-01

Pos logit contributions
 Finally+0.010
 Sure+0.009
icol+0.009
+0.009
 Eventually+0.008
Neg logit contributions
 angry-0.010
 mean-0.009
 losers-0.009
aky-0.009
 selfish-0.009
Token nothing
Token ID2147
Feature activation+6.50e-01

Pos logit contributions
 losers+0.005
 angry+0.005
aky+0.005
umpy+0.005
 mean+0.005
Neg logit contributions
 Finally-0.006
 Sure-0.005
icol-0.005
 Eventually-0.005
-0.005
Token bad
Token ID2089
Feature activation+4.99e-02

Pos logit contributions
 losers+0.018
 angry+0.018
aky+0.017
 musicians+0.017
 kids+0.016
Neg logit contributions
 Finally-0.017
 Sure-0.017
icol-0.015
-0.015
 Eventually-0.014
Token could
Token ID714
Feature activation-1.14e+00

Pos logit contributions
 angry+0.002
 losers+0.002
aky+0.002
 lazy+0.002
 mean+0.001
Neg logit contributions
 Finally-0.001
 Sure-0.001
icol-0.001
-0.001
 Eventually-0.001
Token ever
Token ID1683
Feature activation-3.01e+00

Pos logit contributions
 Finally+0.028
+0.027
 Sure+0.027
 purchase+0.026
icol+0.026
Neg logit contributions
 angry-0.033
 losers-0.032
 lazy-0.031
aky-0.031
 mean-0.031
Token happen
Token ID1645
Feature activation-5.54e-01

Pos logit contributions
 Finally+0.081
 Sure+0.078
icol+0.077
+0.076
 purchase+0.070
Neg logit contributions
 angry-0.081
 losers-0.081
 mean-0.077
aky-0.074
 lazy-0.074
Token.
Token ID13
Feature activation+2.39e-01

Pos logit contributions
 Finally+0.014
 Sure+0.013
+0.012
icol+0.012
 Eventually+0.012
Neg logit contributions
 losers-0.017
 angry-0.017
 mean-0.016
 rude-0.015
aky-0.015
Token
Token ID220
Feature activation-5.25e-01

Pos logit contributions
 angry+0.008
 mean+0.007
 losers+0.007
 rude+0.007
 selfish+0.007
Neg logit contributions
 Finally-0.006
 Sure-0.006
icol-0.006
-0.006
 Eventually-0.005
Token
Token ID198
Feature activation+9.88e-02

Pos logit contributions
 Finally+0.014
 Sure+0.014
+0.013
icol+0.013
 Eventually+0.012
Neg logit contributions
 angry-0.016
 mean-0.016
 losers-0.015
 rude-0.015
 selfish-0.015
Token
Token ID198
Feature activation+6.88e-02

Pos logit contributions
 angry+0.003
 mean+0.003
 losers+0.003
 rude+0.003
 selfish+0.003
Neg logit contributions
 Finally-0.003
 Sure-0.003
-0.002
icol-0.002
 Eventually-0.002
Token day
Token ID1110
Feature activation-1.15e+00

Pos logit contributions
 losers+0.001
 angry+0.001
 mean+0.001
 selfish+0.000
 rude+0.000
Neg logit contributions
 Finally-0.001
 Sure-0.000
 Eventually-0.000
 purchase-0.000
-0.000
Token.
Token ID13
Feature activation+2.54e-01

Pos logit contributions
 Finally+0.030
+0.029
icol+0.028
 Sure+0.027
 Eventually+0.025
Neg logit contributions
 angry-0.035
 mean-0.034
 losers-0.034
 rude-0.032
aky-0.032
Token They
Token ID1119
Feature activation-4.04e-02

Pos logit contributions
 angry+0.009
 mean+0.008
 losers+0.008
 rude+0.008
 lazy+0.008
Neg logit contributions
 Finally-0.006
icol-0.006
 Sure-0.006
-0.006
 Eventually-0.005
Token worked
Token ID3111
Feature activation-8.41e-01

Pos logit contributions
 Finally+0.001
+0.001
 Sure+0.001
 trek+0.001
 Eventually+0.001
Neg logit contributions
 angry-0.001
 losers-0.001
 mean-0.001
 my-0.001
 rude-0.001
Token delic
Token ID8675
Feature activation-1.27e+00

Pos logit contributions
 Finally+0.026
icol+0.025
 Sure+0.025
+0.024
 Eventually+0.022
Neg logit contributions
 angry-0.022
 mean-0.021
 losers-0.021
 rude-0.019
 lazy-0.019
Tokenately
Token ID1286
Feature activation-3.00e+00

Pos logit contributions
 Finally+0.027
icol+0.026
 Sure+0.026
+0.025
 Eventually+0.021
Neg logit contributions
 angry-0.043
 losers-0.043
 mean-0.042
 kids-0.040
 selfish-0.040
Token and
Token ID290
Feature activation-7.32e-01

Pos logit contributions
 Finally+0.084
icol+0.080
+0.080
 Sure+0.079
 Eventually+0.072
Neg logit contributions
 angry-0.086
 mean-0.083
 losers-0.082
 rude-0.078
 selfish-0.077
Token carefully
Token ID7773
Feature activation-8.74e-01

Pos logit contributions
 Finally+0.020
 Sure+0.019
+0.019
icol+0.019
 Eventually+0.017
Neg logit contributions
 angry-0.021
 mean-0.021
 losers-0.021
 rude-0.020
 lazy-0.019
Token and
Token ID290
Feature activation+8.31e-02

Pos logit contributions
 Finally+0.025
 Sure+0.024
+0.024
icol+0.023
 Eventually+0.021
Neg logit contributions
 losers-0.025
 angry-0.025
 mean-0.024
 rude-0.023
 lazy-0.023
Token nothing
Token ID2147
Feature activation+6.04e-01

Pos logit contributions
 mean+0.002
 angry+0.002
 losers+0.002
umpy+0.002
aky+0.002
Neg logit contributions
 Finally-0.002
 Sure-0.002
icol-0.002
-0.002
 Eventually-0.002
Token broke
Token ID6265
Feature activation+1.15e-01

Pos logit contributions
 angry+0.017
 losers+0.017
 mean+0.016
aky+0.016
 musicians+0.016
Neg logit contributions
 Finally-0.015
 Sure-0.015
icol-0.014
-0.014
 Eventually-0.013
Token.
Token ID13
Feature activation-4.72e-01

Pos logit contributions
 Finally+0.021
 Sure+0.020
+0.018
icol+0.018
 Eventually+0.017
Neg logit contributions
 losers-0.027
 angry-0.026
 mean-0.026
 selfish-0.025
 lazy-0.024
Token He
Token ID679
Feature activation-7.93e-01

Pos logit contributions
icol+0.014
 Finally+0.014
+0.014
 Sure+0.014
uggle+0.013
Neg logit contributions
 angry-0.012
 mean-0.011
 losers-0.010
 lazy-0.010
 kids-0.010
Token put
Token ID1234
Feature activation-2.25e-01

Pos logit contributions
 Finally+0.022
 Sure+0.021
icol+0.021
+0.019
 trek+0.018
Neg logit contributions
 angry-0.022
 mean-0.022
 losers-0.022
 rude-0.020
 selfish-0.020
Token on
Token ID319
Feature activation-7.11e-01

Pos logit contributions
 Finally+0.007
icol+0.007
 Sure+0.007
+0.007
 Eventually+0.006
Neg logit contributions
 angry-0.005
 mean-0.005
 losers-0.005
 rude-0.005
 selfish-0.005
Token his
Token ID465
Feature activation-6.10e-01

Pos logit contributions
 Finally+0.029
icol+0.029
 Sure+0.028
+0.028
 Eventually+0.026
Neg logit contributions
 angry-0.011
 mean-0.010
 losers-0.010
 rude-0.009
 lazy-0.009
Token favorite
Token ID4004
Feature activation-2.99e+00

Pos logit contributions
 Finally+0.017
+0.017
 Sure+0.017
icol+0.016
 purchase+0.015
Neg logit contributions
 losers-0.016
 angry-0.016
 mean-0.016
aky-0.016
 reckless-0.015
Token blue
Token ID4171
Feature activation-1.75e+00

Pos logit contributions
 Finally+0.093
icol+0.091
 Sure+0.090
+0.089
 Eventually+0.080
Neg logit contributions
 angry-0.075
 mean-0.072
 losers-0.070
 rude-0.066
 lazy-0.066
Token cap
Token ID1451
Feature activation-1.12e+00

Pos logit contributions
 Finally+0.049
icol+0.048
 Sure+0.047
+0.047
 Eventually+0.042
Neg logit contributions
 angry-0.049
 mean-0.048
 losers-0.046
 rude-0.044
 lazy-0.044
Token and
Token ID290
Feature activation-5.93e-01

Pos logit contributions
 Finally+0.033
 Sure+0.032
+0.032
icol+0.031
 Eventually+0.028
Neg logit contributions
 angry-0.032
 mean-0.030
 losers-0.030
 rude-0.028
 selfish-0.028
Token hopped
Token ID45230
Feature activation-1.32e-01

Pos logit contributions
 Finally+0.018
icol+0.018
 Sure+0.017
+0.016
 Eventually+0.015
Neg logit contributions
 angry-0.016
 losers-0.015
 mean-0.015
 rude-0.014
 selfish-0.014
Token in
Token ID287
Feature activation-3.63e-01

Pos logit contributions
 Finally+0.004
icol+0.004
 Sure+0.004
+0.004
 Eventually+0.004
Neg logit contributions
 angry-0.003
 mean-0.003
 losers-0.003
 rude-0.003
 selfish-0.003
Token
Token ID198
Feature activation+6.99e-03

Pos logit contributions
 angry+0.010
 mean+0.009
 lazy+0.008
 losers+0.008
 kids+0.008
Neg logit contributions
icol-0.015
-0.014
 Finally-0.014
 Sure-0.013
 generation-0.012
Token"
Token ID1
Feature activation+5.21e-02

Pos logit contributions
 angry+0.000
 losers+0.000
 mean+0.000
 lazy+0.000
 selfish+0.000
Neg logit contributions
 Finally-0.000
 Sure-0.000
icol-0.000
-0.000
 Eventually-0.000
TokenWhere
Token ID8496
Feature activation-8.32e-01

Pos logit contributions
 angry+0.001
 losers+0.001
 alone+0.001
 mean+0.001
aky+0.001
Neg logit contributions
-0.001
 Sure-0.001
 Finally-0.001
icol-0.001
growth-0.001
Token are
Token ID389
Feature activation+1.11e+00

Pos logit contributions
 Finally+0.026
 Sure+0.025
+0.025
icol+0.024
uggle+0.022
Neg logit contributions
 angry-0.019
 losers-0.019
 musicians-0.018
aky-0.018
 mean-0.017
Token our
Token ID674
Feature activation+5.77e-01

Pos logit contributions
aky+0.029
 angry+0.029
umpy+0.028
 losers+0.025
 mean+0.025
Neg logit contributions
 Finally-0.031
-0.030
 Sure-0.029
icol-0.028
 Enjoy-0.027
Token structures
Token ID8573
Feature activation+3.09e+00

Pos logit contributions
 angry+0.015
 losers+0.015
aky+0.014
 mean+0.014
 musicians+0.013
Neg logit contributions
 Finally-0.017
icol-0.016
 Sure-0.016
-0.016
 purchase-0.014
Token?"
Token ID1701
Feature activation-9.67e-01

Pos logit contributions
 losers+0.094
 musicians+0.094
aky+0.092
 angry+0.089
 lazy+0.085
Neg logit contributions
-0.070
 Finally-0.061
ings-0.056
uggle-0.055
 purchase-0.055
Token Lily
Token ID20037
Feature activation+2.47e+00

Pos logit contributions
 Finally+0.031
 Sure+0.030
+0.027
icol+0.027
 Eventually+0.026
Neg logit contributions
aky-0.024
 losers-0.023
 mean-0.020
umpy-0.020
 angry-0.019
Token asked
Token ID1965
Feature activation+2.14e-01

Pos logit contributions
 losers+0.070
igator+0.061
 angry+0.060
 mean+0.060
chie+0.058
Neg logit contributions
 Finally-0.064
 Sure-0.061
icol-0.059
-0.059
 Eventually-0.052
Token.
Token ID13
Feature activation+7.80e-01

Pos logit contributions
 losers+0.006
 angry+0.006
 mean+0.006
 lazy+0.005
 rude+0.005
Neg logit contributions
 Finally-0.006
 Sure-0.006
icol-0.006
-0.006
 Eventually-0.005
Token
Token ID198
Feature activation+6.43e-01

Pos logit contributions
 angry+0.021
 mean+0.020
 losers+0.019
 rude+0.018
 selfish+0.018
Neg logit contributions
icol-0.024
 Finally-0.023
 Sure-0.022
-0.022
 Eventually-0.020
Token.
Token ID13
Feature activation+1.86e-02

Pos logit contributions
 Finally+0.027
+0.026
 Sure+0.026
icol+0.026
 Eventually+0.023
Neg logit contributions
 angry-0.021
 mean-0.020
 losers-0.020
 rude-0.018
 selfish-0.018
Token<|endoftext|>
Token ID50256
Feature activation+2.18e-01

Pos logit contributions
 angry+0.000
 mean+0.000
 lazy+0.000
 losers+0.000
 selfish+0.000
Neg logit contributions
icol-0.001
-0.001
 Finally-0.001
 Sure-0.001
 purchase-0.001
TokenL
Token ID43
Feature activation-9.93e-01

Pos logit contributions
 angry+0.007
 mean+0.007
 losers+0.007
 rude+0.007
 selfish+0.006
Neg logit contributions
 Finally-0.005
icol-0.005
 Sure-0.005
-0.005
 Eventually-0.004
Tokenily
Token ID813
Feature activation+1.15e-01

Pos logit contributions
 Finally+0.035
icol+0.035
 Sure+0.034
+0.033
 Eventually+0.030
Neg logit contributions
 angry-0.022
 mean-0.021
 losers-0.020
 rude-0.019
 lazy-0.019
Token was
Token ID373
Feature activation+5.04e-01

Pos logit contributions
 angry+0.003
 mean+0.003
 losers+0.003
 rude+0.003
 selfish+0.002
Neg logit contributions
 Finally-0.004
icol-0.004
 Sure-0.004
-0.003
 Eventually-0.003
Token very
Token ID845
Feature activation+2.10e+00

Pos logit contributions
 angry+0.011
 mean+0.010
 losers+0.010
 rude+0.009
 lazy+0.009
Neg logit contributions
 Finally-0.018
icol-0.018
 Sure-0.017
-0.017
 Eventually-0.016
Token excited
Token ID6568
Feature activation+1.27e-01

Pos logit contributions
 losers+0.037
 mean+0.033
 angry+0.033
 musicians+0.031
aky+0.030
Neg logit contributions
 Finally-0.081
 Sure-0.078
-0.076
icol-0.074
 Eventually-0.071
Token.
Token ID13
Feature activation+1.13e-02

Pos logit contributions
 angry+0.003
 mean+0.003
 losers+0.002
 rude+0.002
 kids+0.002
Neg logit contributions
 Finally-0.004
icol-0.004
-0.004
 Sure-0.004
 Eventually-0.004
Token She
Token ID1375
Feature activation-3.32e-01

Pos logit contributions
 angry+0.000
 mean+0.000
 selfish+0.000
 lazy+0.000
 losers+0.000
Neg logit contributions
icol-0.000
 Finally-0.000
-0.000
 Sure-0.000
 trek-0.000
Token was
Token ID373
Feature activation+5.97e-01

Pos logit contributions
 Finally+0.011
icol+0.011
 Sure+0.010
+0.010
 Eventually+0.009
Neg logit contributions
 angry-0.008
 mean-0.008
 losers-0.007
 rude-0.007
 lazy-0.007
Token going
Token ID1016
Feature activation+7.84e-01

Pos logit contributions
 angry+0.017
 mean+0.016
 losers+0.016
 rude+0.015
 lazy+0.015
Neg logit contributions
 Finally-0.016
icol-0.016
 Sure-0.016
-0.015
 Eventually-0.014
Token them
Token ID606
Feature activation+4.54e-01

Pos logit contributions
 angry+0.031
 mean+0.030
 losers+0.029
 lazy+0.029
 selfish+0.028
Neg logit contributions
 Finally-0.033
 Sure-0.032
-0.031
icol-0.030
 purchase-0.029
Token could
Token ID714
Feature activation-1.17e-01

Pos logit contributions
 losers+0.013
 angry+0.012
 my+0.012
 musicians+0.012
 mean+0.012
Neg logit contributions
 Finally-0.012
-0.011
 Sure-0.011
icol-0.010
 purchase-0.010
Token think
Token ID892
Feature activation-7.36e-01

Pos logit contributions
 Finally+0.003
+0.003
 Sure+0.003
icol+0.003
 purchase+0.003
Neg logit contributions
 angry-0.003
 losers-0.003
 mean-0.003
 lazy-0.003
aky-0.003
Token what
Token ID644
Feature activation+3.07e-01

Pos logit contributions
 Finally+0.017
 Sure+0.017
+0.015
 purchase+0.015
ings+0.015
Neg logit contributions
 losers-0.021
aky-0.021
 angry-0.021
 lazy-0.020
 mean-0.020
Token the
Token ID262
Feature activation-7.27e-01

Pos logit contributions
 losers+0.010
aky+0.010
 angry+0.010
umpy+0.009
 musicians+0.009
Neg logit contributions
 Finally-0.007
 Sure-0.007
-0.006
 generation-0.005
 trek-0.005
Token trade
Token ID3292
Feature activation+2.16e+00

Pos logit contributions
 Finally+0.024
 Sure+0.023
+0.022
icol+0.022
 generation+0.020
Neg logit contributions
 losers-0.016
aky-0.015
 angry-0.015
 mean-0.014
 my-0.014
Token was
Token ID373
Feature activation-1.33e+00

Pos logit contributions
 angry+0.068
 losers+0.067
 mean+0.064
 lazy+0.064
aky+0.063
Neg logit contributions
 Finally-0.049
 Sure-0.047
-0.046
ings-0.043
icol-0.042
Token.
Token ID13
Feature activation+3.13e-01

Pos logit contributions
 Finally+0.032
 Sure+0.030
icol+0.029
+0.028
 purchase+0.027
Neg logit contributions
 losers-0.041
 angry-0.038
aky-0.037
 lazy-0.036
 musicians-0.036
Token
Token ID220
Feature activation-6.11e-01

Pos logit contributions
 angry+0.011
 mean+0.011
 losers+0.011
 rude+0.010
 lazy+0.010
Neg logit contributions
 Finally-0.008
 Sure-0.007
-0.007
icol-0.006
 Eventually-0.006
Token
Token ID198
Feature activation+1.94e-02

Pos logit contributions
 Finally+0.018
icol+0.017
 Sure+0.017
+0.017
 Eventually+0.015
Neg logit contributions
 angry-0.017
 mean-0.016
 losers-0.016
 rude-0.015
 lazy-0.015
Token
Token ID198
Feature activation-5.50e-02

Pos logit contributions
 angry+0.001
 mean+0.001
 losers+0.001
 rude+0.001
 lazy+0.001
Neg logit contributions
 Finally-0.001
 Sure-0.000
-0.000
icol-0.000
 Eventually-0.000
Token That
Token ID1320
Feature activation-2.09e-01

Pos logit contributions
 Finally+0.014
icol+0.013
 Sure+0.013
+0.013
 Eventually+0.012
Neg logit contributions
 angry-0.013
 mean-0.013
 losers-0.012
 rude-0.012
 lazy-0.012
Token's
Token ID338
Feature activation-2.83e-01

Pos logit contributions
 Finally+0.004
ings+0.004
 Sure+0.004
icol+0.004
uggle+0.004
Neg logit contributions
 losers-0.006
aky-0.006
 angry-0.006
 impressed-0.006
 furious-0.006
Token a
Token ID257
Feature activation-2.77e-02

Pos logit contributions
 Finally+0.009
icol+0.008
 Sure+0.008
 Eventually+0.007
+0.007
Neg logit contributions
 losers-0.006
 angry-0.006
umpy-0.006
 musicians-0.006
aky-0.006
Token curtain
Token ID29461
Feature activation-1.76e-02

Pos logit contributions
icol+0.001
 Finally+0.001
+0.001
 Sure+0.001
 generation+0.001
Neg logit contributions
 angry-0.001
 losers-0.001
aky-0.001
 boys-0.001
 musicians-0.000
Token,"
Token ID553
Feature activation-7.58e-01

Pos logit contributions
 Finally+0.000
 Sure+0.000
+0.000
icol+0.000
uggle+0.000
Neg logit contributions
 angry-0.001
 losers-0.001
 lazy-0.001
aky-0.001
 mean-0.001
Token he
Token ID339
Feature activation+2.54e+00

Pos logit contributions
 Finally+0.023
 Sure+0.023
+0.021
icol+0.021
 Eventually+0.019
Neg logit contributions
 losers-0.020
aky-0.018
 mean-0.018
 angry-0.018
 lazy-0.017
Token said
Token ID531
Feature activation-3.36e-01

Pos logit contributions
 losers+0.067
 alone+0.061
umpy+0.059
igator+0.059
 angry+0.058
Neg logit contributions
 Finally-0.071
 Sure-0.068
icol-0.068
-0.059
 trek-0.056
Token.
Token ID13
Feature activation-5.94e-02

Pos logit contributions
 Finally+0.010
 Sure+0.010
icol+0.010
+0.009
 Eventually+0.009
Neg logit contributions
 angry-0.009
 mean-0.008
 losers-0.008
 rude-0.008
 lazy-0.008
Token
Token ID198
Feature activation-2.38e-01

Pos logit contributions
 Finally+0.002
 Sure+0.002
+0.002
icol+0.001
 Eventually+0.001
Neg logit contributions
 angry-0.002
 losers-0.002
 mean-0.002
 lazy-0.002
 rude-0.002
Token
Token ID198
Feature activation-2.73e-01

Pos logit contributions
icol+0.009
+0.008
 Finally+0.008
 Sure+0.008
 generation+0.007
Neg logit contributions
 angry-0.005
 mean-0.005
 kids-0.004
 alone-0.004
 lazy-0.004
TokenHe
Token ID1544
Feature activation+8.49e-01

Pos logit contributions
 Finally+0.009
 Sure+0.009
icol+0.008
+0.008
 Eventually+0.008
Neg logit contributions
 angry-0.006
 losers-0.006
 mean-0.006
 lazy-0.005
 musicians-0.005
Token you
Token ID345
Feature activation+6.67e-01

Pos logit contributions
 angry+0.032
 mean+0.031
 losers+0.030
 lazy+0.028
 rude+0.028
Neg logit contributions
 Finally-0.058
icol-0.057
 Sure-0.057
-0.056
uggle-0.051
Token with
Token ID351
Feature activation+1.98e-01

Pos logit contributions
 angry+0.018
 losers+0.018
 mean+0.017
 rude+0.016
 selfish+0.016
Neg logit contributions
 Finally-0.019
-0.018
icol-0.018
 Sure-0.018
uggle-0.016
Token your
Token ID534
Feature activation+5.89e-01

Pos logit contributions
 angry+0.004
 losers+0.003
 mean+0.003
umpy+0.003
aky+0.003
Neg logit contributions
-0.007
 Finally-0.007
icol-0.007
 Sure-0.007
 purchase-0.006
Token nest
Token ID16343
Feature activation+7.41e-01

Pos logit contributions
 angry+0.018
 losers+0.018
 mean+0.018
 musicians+0.017
 rude+0.017
Neg logit contributions
 Finally-0.014
-0.014
icol-0.014
 Sure-0.014
 purchase-0.012
Token?"
Token ID1701
Feature activation-4.48e-01

Pos logit contributions
 losers+0.023
 musicians+0.022
 angry+0.021
 mean+0.021
 selfish+0.020
Neg logit contributions
ings-0.018
 Finally-0.017
-0.016
uggle-0.016
 Sure-0.015
Token they
Token ID484
Feature activation+2.09e+00

Pos logit contributions
 Finally+0.015
 Sure+0.015
icol+0.014
+0.014
 Eventually+0.013
Neg logit contributions
 losers-0.010
 angry-0.010
 mean-0.010
aky-0.009
 selfish-0.009
Token asked
Token ID1965
Feature activation-3.02e-01

Pos logit contributions
 losers+0.064
umpy+0.063
aky+0.057
 reckless+0.055
 fastest+0.054
Neg logit contributions
 Finally-0.046
 Sure-0.040
 curiously-0.038
 trek-0.038
 mun-0.038
Token.
Token ID13
Feature activation+8.00e-01

Pos logit contributions
 Finally+0.008
 Sure+0.008
+0.007
icol+0.007
 Eventually+0.006
Neg logit contributions
 losers-0.009
 mean-0.009
 angry-0.008
aky-0.008
 lazy-0.008
Token
Token ID198
Feature activation+3.76e-01

Pos logit contributions
 angry+0.023
 mean+0.022
 losers+0.022
aky+0.021
 lazy+0.021
Neg logit contributions
 Finally-0.023
 Sure-0.023
-0.022
icol-0.021
 Eventually-0.019
Token
Token ID198
Feature activation+1.49e-01

Pos logit contributions
 angry+0.011
 mean+0.011
 losers+0.011
 rude+0.010
 selfish+0.010
Neg logit contributions
 Finally-0.010
 Sure-0.010
icol-0.010
-0.010
 Eventually-0.009
TokenThe
Token ID464
Feature activation+9.11e-01

Pos logit contributions
 angry+0.004
 losers+0.004
 alone+0.004
 mean+0.003
 lonely+0.003
Neg logit contributions
 Finally-0.005
 Sure-0.004
-0.004
Finally-0.004
 Eventually-0.004
Token Do
Token ID2141
Feature activation-3.72e-01

Pos logit contributions
 Finally+0.000
icol+0.000
 Sure+0.000
+0.000
 Eventually+0.000
Neg logit contributions
 angry-0.000
 mean-0.000
 losers-0.000
 rude-0.000
 lazy-0.000
Token you
Token ID345
Feature activation+5.03e-01

Pos logit contributions
 Finally+0.013
 Sure+0.012
icol+0.012
+0.012
 Eventually+0.011
Neg logit contributions
 angry-0.008
 losers-0.007
 mean-0.007
aky-0.007
 lazy-0.007
Token like
Token ID588
Feature activation-4.66e-01

Pos logit contributions
 angry+0.014
 losers+0.014
 selfish+0.013
 lazy+0.013
 rude+0.013
Neg logit contributions
 Finally-0.013
 Sure-0.012
icol-0.011
-0.011
 Eventually-0.011
Token it
Token ID340
Feature activation+1.28e+00

Pos logit contributions
 Finally+0.015
icol+0.015
 Sure+0.015
+0.014
uggle+0.013
Neg logit contributions
 angry-0.011
 mean-0.010
aky-0.010
 losers-0.010
 selfish-0.009
Token,
Token ID11
Feature activation-1.16e-01

Pos logit contributions
chie+0.045
 furious+0.043
 losers+0.043
aky+0.041
 reckless+0.040
Neg logit contributions
ings-0.017
uggle-0.017
 Finally-0.017
-0.015
rap-0.014
Token Lily
Token ID20037
Feature activation+1.89e+00

Pos logit contributions
 Sure+0.003
 Finally+0.003
 purchase+0.002
icol+0.002
 generation+0.002
Neg logit contributions
aky-0.003
umpy-0.003
v-0.003
 angry-0.003
 furious-0.003
Token?"
Token ID1701
Feature activation-6.85e-01

Pos logit contributions
chie+0.065
 losers+0.065
 furious+0.061
 angry+0.060
 annoying+0.056
Neg logit contributions
 Finally-0.029
ings-0.028
uggle-0.027
 Sure-0.026
ails-0.025
Token Mom
Token ID11254
Feature activation+7.84e-01

Pos logit contributions
 Finally+0.020
icol+0.020
 Sure+0.020
+0.019
 Eventually+0.017
Neg logit contributions
 angry-0.019
 mean-0.018
 losers-0.018
 rude-0.017
 lazy-0.017
Token says
Token ID1139
Feature activation-9.96e-01

Pos logit contributions
 losers+0.021
 angry+0.020
 mean+0.019
 lazy+0.019
 selfish+0.019
Neg logit contributions
 Finally-0.022
 Sure-0.022
icol-0.021
-0.020
 Eventually-0.019
Token.
Token ID13
Feature activation+9.79e-02

Pos logit contributions
 Finally+0.030
icol+0.030
 Sure+0.030
+0.029
 Eventually+0.026
Neg logit contributions
 angry-0.026
 mean-0.025
 losers-0.024
 rude-0.023
 lazy-0.023
Token
Token ID198
Feature activation-4.76e-01

Pos logit contributions
 angry+0.003
 losers+0.003
 mean+0.003
aky+0.003
 rude+0.003
Neg logit contributions
 Finally-0.003
 Sure-0.002
-0.002
icol-0.002
 Eventually-0.002
Token you
Token ID345
Feature activation-1.74e-01

Pos logit contributions
aky+0.038
 angry+0.033
umpy+0.033
 alone+0.031
 furious+0.029
Neg logit contributions
-0.028
 Finally-0.027
 Sure-0.027
 Enjoy-0.025
zed-0.025
Token keep
Token ID1394
Feature activation+5.88e-01

Pos logit contributions
 Finally+0.005
+0.004
icol+0.004
 Sure+0.004
uggle+0.004
Neg logit contributions
 angry-0.005
 losers-0.005
 mean-0.004
 selfish-0.004
 musicians-0.004
Token following
Token ID1708
Feature activation+1.49e+00

Pos logit contributions
 losers+0.012
 angry+0.012
aky+0.012
 mean+0.012
umpy+0.012
Neg logit contributions
icol-0.018
 Finally-0.018
-0.018
 Sure-0.017
 purchase-0.017
Token me
Token ID502
Feature activation+1.82e-01

Pos logit contributions
 angry+0.032
 losers+0.029
 mean+0.029
 selfish+0.027
 rude+0.027
Neg logit contributions
 Finally-0.050
 Sure-0.047
icol-0.046
-0.044
ings-0.044
Token?".
Token ID43634
Feature activation-3.01e-01

Pos logit contributions
 angry+0.005
 losers+0.005
 mean+0.005
 rude+0.005
 furious+0.005
Neg logit contributions
 Finally-0.005
 Sure-0.005
-0.004
uggle-0.004
ings-0.004
Token
Token ID198
Feature activation+4.04e-01

Pos logit contributions
 Finally+0.009
icol+0.008
 Sure+0.008
+0.008
 Eventually+0.007
Neg logit contributions
 angry-0.008
 mean-0.008
 losers-0.008
 rude-0.008
 lazy-0.008
Token
Token ID198
Feature activation+3.11e-02

Pos logit contributions
 angry+0.010
 mean+0.010
 losers+0.010
 rude+0.009
 lazy+0.009
Neg logit contributions
icol-0.012
 Finally-0.012
 Sure-0.012
-0.012
 Eventually-0.011
TokenThe
Token ID464
Feature activation+5.61e-01

Pos logit contributions
 angry+0.001
 losers+0.001
 mean+0.001
 boys+0.001
 kids+0.001
Neg logit contributions
 Finally-0.001
 Sure-0.001
-0.001
Finally-0.001
to-0.001
Token bee
Token ID20697
Feature activation+3.58e-01

Pos logit contributions
 losers+0.014
aky+0.012
 musicians+0.012
 angry+0.012
 reckless+0.011
Neg logit contributions
-0.016
 Finally-0.015
 Sure-0.014
 potion-0.014
icol-0.013
Token pe
Token ID613
Feature activation+1.01e+00

Pos logit contributions
 losers+0.012
 angry+0.011
 mean+0.010
 musicians+0.010
 kids+0.010
Neg logit contributions
 Finally-0.008
 Sure-0.008
icol-0.008
-0.007
 Eventually-0.007
Tokenep
Token ID538
Feature activation+6.35e-01

Pos logit contributions
 angry+0.026
 mean+0.025
 losers+0.025
 rude+0.023
 lazy+0.023
Neg logit contributions
 Finally-0.030
icol-0.030
 Sure-0.030
-0.029
 Eventually-0.026
Token share
Token ID2648
Feature activation+7.59e-01

Pos logit contributions
 angry+0.025
 mean+0.024
 losers+0.024
aky+0.023
 rude+0.023
Neg logit contributions
 Finally-0.019
icol-0.019
-0.018
 Sure-0.018
 purchase-0.016
Token the
Token ID262
Feature activation-1.81e-01

Pos logit contributions
 losers+0.015
 mean+0.014
 angry+0.014
aky+0.013
 lazy+0.013
Neg logit contributions
 Finally-0.027
 Sure-0.026
icol-0.026
-0.026
uggle-0.024
Token ball
Token ID2613
Feature activation-4.89e-01

Pos logit contributions
 Sure+0.004
 Finally+0.004
+0.004
 purchase+0.004
icol+0.004
Neg logit contributions
aky-0.005
 losers-0.005
 reckless-0.005
 angry-0.005
 mean-0.005
Token and
Token ID290
Feature activation-4.53e-01

Pos logit contributions
 Finally+0.016
icol+0.016
 Sure+0.016
+0.016
 Eventually+0.014
Neg logit contributions
 angry-0.011
 mean-0.011
 losers-0.010
 rude-0.010
 lazy-0.010
Token the
Token ID262
Feature activation-7.85e-02

Pos logit contributions
 Finally+0.011
 Sure+0.011
icol+0.011
+0.010
 purchase+0.009
Neg logit contributions
aky-0.014
 angry-0.014
 mean-0.013
umpy-0.013
 lazy-0.013
Token car
Token ID1097
Feature activation+5.25e-01

Pos logit contributions
+0.002
 Sure+0.002
icol+0.002
 purchase+0.002
 Finally+0.002
Neg logit contributions
 losers-0.002
aky-0.002
umpy-0.002
 angry-0.002
 mean-0.002
Token.
Token ID13
Feature activation+3.29e-01

Pos logit contributions
 mean+0.015
 losers+0.015
 angry+0.015
 lazy+0.014
aky+0.014
Neg logit contributions
 Finally-0.014
 Sure-0.013
-0.012
icol-0.012
uggle-0.012
Token They
Token ID1119
Feature activation+6.59e-01

Pos logit contributions
 angry+0.010
 mean+0.010
 losers+0.010
 rude+0.009
 lazy+0.009
Neg logit contributions
 Finally-0.008
icol-0.008
 Sure-0.008
-0.008
 Eventually-0.007
Token make
Token ID787
Feature activation+6.30e-01

Pos logit contributions
 angry+0.020
aky+0.019
 lazy+0.019
 mean+0.018
 rude+0.018
Neg logit contributions
 Finally-0.016
-0.015
icol-0.015
 Sure-0.014
 purchase-0.014
Token a
Token ID257
Feature activation+1.41e-01

Pos logit contributions
 angry+0.014
 mean+0.014
 losers+0.014
 rude+0.013
 lazy+0.013
Neg logit contributions
 Finally-0.021
icol-0.020
 Sure-0.020
-0.019
 Eventually-0.018
Token new
Token ID649
Feature activation+3.05e-01

Pos logit contributions
 angry+0.003
 mean+0.003
 losers+0.003
 rude+0.003
 lazy+0.003
Neg logit contributions
 Finally-0.005
icol-0.005
 Sure-0.005
-0.004
 Eventually-0.004
Token.
Token ID13
Feature activation-4.14e-02

Pos logit contributions
 angry+0.026
 losers+0.026
 mean+0.025
 musicians+0.024
 rude+0.024
Neg logit contributions
 Finally-0.022
 Sure-0.021
icol-0.020
-0.020
 Eventually-0.019
Token You
Token ID921
Feature activation-2.27e-01

Pos logit contributions
icol+0.001
+0.001
 Finally+0.001
 trek+0.001
 purchase+0.001
Neg logit contributions
 my-0.001
 losers-0.001
 mean-0.001
 kids-0.001
 angry-0.001
Token can
Token ID460
Feature activation+3.04e-01

Pos logit contributions
 Finally+0.007
+0.007
icol+0.007
 Sure+0.007
 Eventually+0.006
Neg logit contributions
 angry-0.006
 mean-0.005
 losers-0.005
 rude-0.005
 lazy-0.005
Token have
Token ID423
Feature activation-1.12e+00

Pos logit contributions
 mean+0.007
 angry+0.007
 losers+0.007
 my+0.006
 bald+0.006
Neg logit contributions
icol-0.011
 Sure-0.010
 Finally-0.010
 Eventually-0.009
-0.009
Token your
Token ID534
Feature activation+4.15e-01

Pos logit contributions
 Finally+0.032
icol+0.030
 Sure+0.030
+0.029
 Eventually+0.027
Neg logit contributions
umpy-0.030
aky-0.029
 angry-0.029
 mean-0.027
 losers-0.026
Token balloon
Token ID21190
Feature activation+3.78e-01

Pos logit contributions
 angry+0.012
 mean+0.011
 losers+0.011
aky+0.011
 rude+0.010
Neg logit contributions
-0.011
 Finally-0.011
 Sure-0.011
icol-0.011
 purchase-0.010
Token now
Token ID783
Feature activation+5.89e-02

Pos logit contributions
 angry+0.012
 losers+0.012
 mean+0.012
 furious+0.012
 musicians+0.012
Neg logit contributions
 Finally-0.007
-0.007
uggle-0.007
 Sure-0.007
ings-0.007
Token.
Token ID13
Feature activation-2.67e-01

Pos logit contributions
 angry+0.002
 losers+0.002
aky+0.002
 mean+0.002
 musicians+0.002
Neg logit contributions
 Finally-0.001
-0.001
 Sure-0.001
icol-0.001
 Eventually-0.001
Token Here
Token ID3423
Feature activation-1.30e-03

Pos logit contributions
icol+0.007
+0.007
 Finally+0.007
 trek+0.007
ings+0.007
Neg logit contributions
 my-0.007
 losers-0.007
 mean-0.007
 angry-0.006
 selfish-0.006
Token you
Token ID345
Feature activation-1.10e-01

Pos logit contributions
 Finally+0.000
+0.000
icol+0.000
 Sure+0.000
uggle+0.000
Neg logit contributions
 angry-0.000
 losers-0.000
aky-0.000
 mean-0.000
 lazy-0.000
Token go
Token ID467
Feature activation-2.45e-03

Pos logit contributions
 Finally+0.003
 Sure+0.003
icol+0.003
+0.003
 Eventually+0.003
Neg logit contributions
 angry-0.003
 losers-0.003
 mean-0.003
 rude-0.003
 lazy-0.003
Token and
Token ID290
Feature activation+9.57e-02

Pos logit contributions
 Finally+0.009
 Sure+0.008
+0.008
icol+0.008
 Eventually+0.007
Neg logit contributions
 losers-0.007
 angry-0.007
 mean-0.007
aky-0.007
 lazy-0.007
Token ran
Token ID4966
Feature activation-2.41e-02

Pos logit contributions
 angry+0.002
 mean+0.002
 losers+0.002
umpy+0.002
aky+0.002
Neg logit contributions
icol-0.003
 Sure-0.003
 Finally-0.003
-0.003
 trek-0.002
Token to
Token ID284
Feature activation+6.79e-01

Pos logit contributions
 Finally+0.001
icol+0.001
 Sure+0.001
+0.001
 Eventually+0.001
Neg logit contributions
 angry-0.001
 mean-0.001
 losers-0.000
 rude-0.000
 lazy-0.000
Token his
Token ID465
Feature activation+5.88e-01

Pos logit contributions
 angry+0.011
 mean+0.010
 losers+0.009
 rude+0.009
 lazy+0.009
Neg logit contributions
 Finally-0.028
icol-0.027
 Sure-0.027
-0.027
 Eventually-0.025
Token mom
Token ID1995
Feature activation+7.21e-01

Pos logit contributions
 losers+0.017
 mean+0.016
aky+0.016
 angry+0.015
vious+0.015
Neg logit contributions
icol-0.016
 Sure-0.016
 Finally-0.015
-0.014
 purchase-0.014
Token.
Token ID13
Feature activation+6.75e-01

Pos logit contributions
 losers+0.020
 angry+0.020
 mean+0.019
 rude+0.018
 selfish+0.018
Neg logit contributions
 Finally-0.020
 Sure-0.019
icol-0.019
-0.018
 Eventually-0.017
Token His
Token ID2399
Feature activation+7.48e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 rude+0.013
 lazy+0.013
 selfish+0.013
Neg logit contributions
icol-0.023
 Finally-0.023
-0.023
 Sure-0.022
uggle-0.020
Token mom
Token ID1995
Feature activation+2.17e-01

Pos logit contributions
 angry+0.017
 losers+0.017
 mean+0.016
aky+0.015
 rude+0.015
Neg logit contributions
 Finally-0.024
icol-0.024
 Sure-0.024
-0.023
 Eventually-0.021
Token hugged
Token ID41130
Feature activation-1.38e+00

Pos logit contributions
 angry+0.005
 mean+0.004
 rude+0.004
 kids+0.004
 lazy+0.004
Neg logit contributions
icol-0.008
 Finally-0.008
 Sure-0.008
-0.008
 generation-0.007
Token him
Token ID683
Feature activation+3.40e-01

Pos logit contributions
 Finally+0.037
icol+0.037
 Sure+0.036
+0.035
 Eventually+0.031
Neg logit contributions
 angry-0.041
 mean-0.039
 losers-0.038
 rude-0.037
 lazy-0.036
Token and
Token ID290
Feature activation-1.93e-01

Pos logit contributions
 angry+0.008
 mean+0.008
 losers+0.008
 rude+0.007
 selfish+0.007
Neg logit contributions
 Finally-0.011
 Sure-0.011
icol-0.010
-0.010
 Eventually-0.009
Token higher
Token ID2440
Feature activation-2.13e-01

Pos logit contributions
 Finally+0.001
 Sure+0.001
icol+0.001
 Eventually+0.001
+0.001
Neg logit contributions
 losers-0.001
 angry-0.001
aky-0.001
 mean-0.001
 musicians-0.001
Token.
Token ID13
Feature activation-8.17e-02

Pos logit contributions
 Finally+0.005
icol+0.005
 Sure+0.005
ings+0.004
+0.004
Neg logit contributions
 losers-0.007
 mean-0.007
 angry-0.007
 rude-0.006
 selfish-0.006
Token His
Token ID2399
Feature activation-4.23e-01

Pos logit contributions
 Finally+0.002
icol+0.002
 Sure+0.002
+0.002
 Eventually+0.001
Neg logit contributions
 angry-0.003
 mean-0.003
 losers-0.003
 lazy-0.003
 rude-0.003
Token wish
Token ID4601
Feature activation+1.30e+00

Pos logit contributions
 Finally+0.012
icol+0.012
 Sure+0.011
+0.011
 purchase+0.010
Neg logit contributions
 losers-0.011
 angry-0.011
 mean-0.011
aky-0.011
 rude-0.010
Token had
Token ID550
Feature activation-8.80e-01

Pos logit contributions
 angry+0.039
 losers+0.039
 mean+0.036
aky+0.036
 selfish+0.035
Neg logit contributions
 Finally-0.030
icol-0.030
 Sure-0.030
-0.028
ings-0.027
Token come
Token ID1282
Feature activation+1.78e-02

Pos logit contributions
 Finally+0.021
 Sure+0.019
 trek+0.019
 generation+0.018
icol+0.018
Neg logit contributions
 angry-0.026
umpy-0.025
aky-0.025
 losers-0.025
 musicians-0.023
Token true
Token ID2081
Feature activation-4.38e-01

Pos logit contributions
 losers+0.000
 angry+0.000
 kids+0.000
 mean+0.000
aky+0.000
Neg logit contributions
 Finally-0.001
 Sure-0.001
icol-0.000
ings-0.000
 Eventually-0.000
Token,
Token ID11
Feature activation+1.09e-01

Pos logit contributions
 Finally+0.013
icol+0.013
 Sure+0.012
+0.012
 Eventually+0.011
Neg logit contributions
 angry-0.012
 losers-0.011
 mean-0.011
aky-0.010
 lazy-0.010
Token and
Token ID290
Feature activation-2.59e-01

Pos logit contributions
 angry+0.003
 losers+0.003
 mean+0.003
aky+0.003
 rude+0.003
Neg logit contributions
 Finally-0.003
 Sure-0.003
icol-0.003
-0.003
 Eventually-0.003
Token he
Token ID339
Feature activation+1.86e-01

Pos logit contributions
 Finally+0.008
 Sure+0.007
icol+0.007
 Eventually+0.007
 generation+0.006
Neg logit contributions
 angry-0.007
 mean-0.007
 losers-0.007
aky-0.007
umpy-0.006
Token could
Token ID714
Feature activation-1.52e-02

Pos logit contributions
 angry+0.006
 losers+0.006
 mean+0.006
 rude+0.005
 selfish+0.005
Neg logit contributions
 Finally-0.005
 Sure-0.004
icol-0.004
 trek-0.004
 Eventually-0.004
TokenOnce
Token ID7454
Feature activation-2.26e-01

Pos logit contributions
 Finally+0.009
 Sure+0.008
icol+0.008
+0.008
 Eventually+0.007
Neg logit contributions
 angry-0.012
 mean-0.012
 losers-0.011
 rude-0.011
 selfish-0.011
Token there
Token ID612
Feature activation-1.32e-01

Pos logit contributions
 Finally+0.007
 Sure+0.007
icol+0.006
+0.006
 Eventually+0.006
Neg logit contributions
 angry-0.006
 mean-0.006
 losers-0.006
 rude-0.005
 selfish-0.005
Token was
Token ID373
Feature activation-2.04e-01

Pos logit contributions
 Finally+0.004
icol+0.004
 Sure+0.004
+0.004
 Eventually+0.003
Neg logit contributions
 angry-0.004
 mean-0.004
 losers-0.004
 rude-0.003
 lazy-0.003
Token a
Token ID257
Feature activation+8.17e-01

Pos logit contributions
icol+0.008
 Finally+0.008
 Sure+0.008
+0.008
 Eventually+0.007
Neg logit contributions
 angry-0.004
 mean-0.004
 lazy-0.003
 losers-0.003
 kids-0.003
Token boy
Token ID2933
Feature activation-1.17e+00

Pos logit contributions
 angry+0.014
 mean+0.013
 losers+0.013
 rude+0.012
 lazy+0.012
Neg logit contributions
 Finally-0.032
icol-0.031
 Sure-0.031
-0.031
 Eventually-0.028
Token named
Token ID3706
Feature activation-8.85e-01

Pos logit contributions
icol+0.043
 Finally+0.042
+0.041
 Sure+0.040
 Eventually+0.038
Neg logit contributions
 angry-0.028
 mean-0.028
 alone-0.026
 rude-0.024
 lazy-0.024
Token Sam
Token ID3409
Feature activation-6.83e-01

Pos logit contributions
 Finally+0.027
 Sure+0.026
icol+0.025
+0.025
 Eventually+0.023
Neg logit contributions
 angry-0.024
 mean-0.023
 losers-0.023
 rude-0.021
 selfish-0.021
Token.
Token ID13
Feature activation-2.44e-01

Pos logit contributions
 Finally+0.023
icol+0.023
 Sure+0.022
+0.022
 Eventually+0.020
Neg logit contributions
 angry-0.016
 mean-0.015
 losers-0.014
 rude-0.014
 lazy-0.014
Token He
Token ID679
Feature activation-7.22e-01

Pos logit contributions
icol+0.010
+0.009
 Finally+0.009
 Sure+0.009
uggle+0.009
Neg logit contributions
 angry-0.004
 mean-0.004
 losers-0.003
 lazy-0.003
 rude-0.003
Token was
Token ID373
Feature activation-1.49e-01

Pos logit contributions
 Finally+0.023
icol+0.023
 Sure+0.022
+0.022
 Eventually+0.020
Neg logit contributions
 angry-0.018
 mean-0.018
 losers-0.017
 rude-0.016
 lazy-0.016
Token very
Token ID845
Feature activation+1.06e+00

Pos logit contributions
icol+0.006
 Finally+0.006
 Sure+0.006
+0.006
 Eventually+0.005
Neg logit contributions
 angry-0.003
 mean-0.003
 losers-0.002
 lazy-0.002
 rude-0.002
Token smiled
Token ID13541
Feature activation-3.19e-01

Pos logit contributions
 angry+0.015
 mean+0.014
 losers+0.014
 lazy+0.013
 rude+0.013
Neg logit contributions
 Finally-0.023
icol-0.023
 Sure-0.023
-0.022
 Eventually-0.020
Token and
Token ID290
Feature activation+2.96e-01

Pos logit contributions
 Finally+0.009
 Sure+0.008
icol+0.008
+0.008
 Eventually+0.007
Neg logit contributions
 losers-0.009
 angry-0.008
 mean-0.008
igator-0.008
 rude-0.008
Token said
Token ID531
Feature activation-8.09e-01

Pos logit contributions
 angry+0.007
 losers+0.007
 mean+0.007
aky+0.006
 rude+0.006
Neg logit contributions
 Finally-0.010
icol-0.010
 Sure-0.010
-0.009
 Eventually-0.008
Token,
Token ID11
Feature activation-1.56e-01

Pos logit contributions
 Finally+0.025
icol+0.025
 Sure+0.024
+0.024
 Eventually+0.022
Neg logit contributions
 angry-0.020
 mean-0.020
 losers-0.019
 rude-0.018
 lazy-0.018
Token "
Token ID366
Feature activation-2.09e-01

Pos logit contributions
 Finally+0.005
+0.005
 Sure+0.005
icol+0.004
 Eventually+0.004
Neg logit contributions
 angry-0.004
 losers-0.004
 mean-0.004
aky-0.004
 rude-0.004
TokenI
Token ID40
Feature activation-1.24e+00

Pos logit contributions
 Finally+0.007
 Sure+0.007
+0.007
icol+0.007
 Eventually+0.006
Neg logit contributions
 angry-0.005
 mean-0.005
 losers-0.005
 rude-0.004
 alone-0.004
Token love
Token ID1842
Feature activation-6.76e-01

Pos logit contributions
icol+0.034
 Finally+0.034
 Sure+0.033
+0.033
 Eventually+0.029
Neg logit contributions
 angry-0.036
 mean-0.035
 losers-0.033
 rude-0.032
 lazy-0.032
Token you
Token ID345
Feature activation+7.00e-01

Pos logit contributions
 Finally+0.024
icol+0.023
 Sure+0.023
+0.023
 Eventually+0.021
Neg logit contributions
 angry-0.014
 mean-0.013
 losers-0.013
aky-0.012
 rude-0.012
Token too
Token ID1165
Feature activation+1.40e+00

Pos logit contributions
 angry+0.016
igator+0.015
 losers+0.015
 furious+0.015
 musicians+0.014
Neg logit contributions
 Finally-0.022
-0.020
 Eventually-0.018
 Sure-0.018
uggle-0.016
Token,
Token ID11
Feature activation+2.04e-01

Pos logit contributions
 angry+0.039
 losers+0.038
 mean+0.038
 musicians+0.035
aky+0.035
Neg logit contributions
 Finally-0.039
 Sure-0.037
-0.037
icol-0.036
 Eventually-0.033
Token Ben
Token ID3932
Feature activation+9.94e-01

Pos logit contributions
aky+0.005
 alone+0.005
 angry+0.005
igator+0.005
v+0.005
Neg logit contributions
 Sure-0.005
-0.005
 Finally-0.005
icol-0.005
 purchase-0.004
Token spoil
Token ID20851
Feature activation+7.44e-01

Pos logit contributions
 Finally+0.012
+0.012
 Sure+0.011
icol+0.011
 purchase+0.010
Neg logit contributions
 angry-0.010
 musicians-0.010
 losers-0.010
aky-0.010
 kids-0.009
Token your
Token ID534
Feature activation+5.36e-01

Pos logit contributions
 angry+0.017
 losers+0.016
 mean+0.016
 musicians+0.015
aky+0.015
Neg logit contributions
 Finally-0.024
 Sure-0.023
icol-0.022
-0.022
ings-0.021
Token collection
Token ID4947
Feature activation+1.69e+00

Pos logit contributions
 angry+0.012
 mean+0.012
 losers+0.011
 lazy+0.011
 selfish+0.011
Neg logit contributions
 Finally-0.018
icol-0.018
 Sure-0.018
-0.017
 Eventually-0.016
Token."
Token ID526
Feature activation-6.86e-01

Pos logit contributions
 angry+0.054
 furious+0.050
 mean+0.049
 musicians+0.048
 rude+0.048
Neg logit contributions
 Finally-0.039
-0.037
ings-0.036
uggle-0.035
 Sure-0.035
Token But
Token ID887
Feature activation-1.94e-01

Pos logit contributions
icol+0.020
 Finally+0.020
 Sure+0.019
+0.019
 Eventually+0.017
Neg logit contributions
 angry-0.018
 mean-0.018
 losers-0.017
 lazy-0.016
 rude-0.016
Token Sammy
Token ID44067
Feature activation-3.62e-01

Pos logit contributions
 Finally+0.005
 Sure+0.004
+0.004
icol+0.004
 Eventually+0.004
Neg logit contributions
 losers-0.007
aky-0.006
umpy-0.006
 angry-0.006
 mean-0.006
Token didn
Token ID1422
Feature activation+2.45e-01

Pos logit contributions
 Finally+0.008
+0.008
 Sure+0.007
icol+0.007
nexpected+0.007
Neg logit contributions
 losers-0.011
 musicians-0.011
 queens-0.010
 pilots-0.010
 angry-0.010
Token't
Token ID470
Feature activation+4.00e-02

Pos logit contributions
 angry+0.008
 mean+0.008
 losers+0.008
 rude+0.008
 lazy+0.008
Neg logit contributions
 Finally-0.005
-0.005
icol-0.005
 Sure-0.005
 Eventually-0.004
Token listen
Token ID6004
Feature activation+4.82e-01

Pos logit contributions
 losers+0.001
 angry+0.001
 musicians+0.001
 teams+0.001
 my+0.001
Neg logit contributions
 Finally-0.001
 Sure-0.001
-0.001
icol-0.001
 purchase-0.001
Token and
Token ID290
Feature activation-4.55e-01

Pos logit contributions
 losers+0.014
 angry+0.013
 mean+0.013
aky+0.013
 musicians+0.013
Neg logit contributions
 Finally-0.013
icol-0.012
 Sure-0.012
-0.012
 Eventually-0.011
Token kept
Token ID4030
Feature activation-4.50e-02

Pos logit contributions
 Finally+0.014
 Sure+0.013
icol+0.013
+0.013
 Eventually+0.012
Neg logit contributions
 angry-0.012
 losers-0.011
 mean-0.011
aky-0.011
 lazy-0.010
Token the
Token ID262
Feature activation-5.06e-01

Pos logit contributions
 Finally+0.005
 Sure+0.005
icol+0.005
+0.005
 Eventually+0.005
Neg logit contributions
 angry-0.003
 mean-0.003
 losers-0.003
aky-0.003
 lazy-0.003
Token new
Token ID649
Feature activation-1.05e+00

Pos logit contributions
 Finally+0.014
 Sure+0.014
icol+0.013
+0.012
 purchase+0.012
Neg logit contributions
 losers-0.013
 angry-0.013
aky-0.012
 mean-0.012
 lazy-0.012
Token wheels
Token ID13666
Feature activation-2.69e-01

Pos logit contributions
 Sure+0.029
 Finally+0.029
icol+0.029
+0.027
 purchase+0.026
Neg logit contributions
 angry-0.028
 losers-0.028
 mean-0.027
aky-0.026
 selfish-0.026
Token to
Token ID284
Feature activation-7.96e-02

Pos logit contributions
 Finally+0.007
 Sure+0.007
+0.007
icol+0.006
 Eventually+0.006
Neg logit contributions
 losers-0.008
 angry-0.008
 mean-0.008
 lazy-0.007
 rude-0.007
Token his
Token ID465
Feature activation-3.44e-01

Pos logit contributions
 Finally+0.003
 Sure+0.003
icol+0.003
+0.003
 Eventually+0.002
Neg logit contributions
 angry-0.002
 losers-0.002
 mean-0.002
 selfish-0.002
 lazy-0.002
Token car
Token ID1097
Feature activation-3.80e-01

Pos logit contributions
 Finally+0.010
 Sure+0.010
icol+0.010
+0.009
 purchase+0.009
Neg logit contributions
 losers-0.009
 mean-0.009
 angry-0.009
aky-0.009
igator-0.008
Token.
Token ID13
Feature activation-9.37e-03

Pos logit contributions
 Finally+0.010
 Sure+0.010
+0.009
icol+0.009
 Eventually+0.008
Neg logit contributions
 losers-0.012
 angry-0.012
 mean-0.011
 lazy-0.011
aky-0.011
Token When
Token ID1649
Feature activation-6.50e-01

Pos logit contributions
 Finally+0.000
icol+0.000
 Sure+0.000
+0.000
 Eventually+0.000
Neg logit contributions
 angry-0.000
 mean-0.000
 losers-0.000
 rude-0.000
 lazy-0.000
Token she
Token ID673
Feature activation-2.57e-01

Pos logit contributions
icol+0.022
 Finally+0.022
 Sure+0.021
+0.021
 Eventually+0.019
Neg logit contributions
 angry-0.015
 mean-0.014
 losers-0.014
 lazy-0.013
 rude-0.013
Token was
Token ID373
Feature activation+3.09e-01

Pos logit contributions
icol+0.009
 Finally+0.009
 Sure+0.008
+0.008
 Eventually+0.008
Neg logit contributions
 angry-0.006
 mean-0.006
 losers-0.005
 lazy-0.005
 rude-0.005
Token done
Token ID1760
Feature activation+2.11e-01

Pos logit contributions
 angry+0.007
 losers+0.007
 mean+0.007
aky+0.007
 musicians+0.007
Neg logit contributions
 Finally-0.009
-0.009
icol-0.009
 Sure-0.008
 Eventually-0.008
Token blue
Token ID4171
Feature activation-7.48e-02

Pos logit contributions
 Finally+0.021
 Sure+0.019
+0.018
icol+0.017
 Eventually+0.017
Neg logit contributions
 losers-0.021
 angry-0.020
aky-0.019
 mean-0.019
 boys-0.018
Token the
Token ID262
Feature activation-4.80e-01

Pos logit contributions
+0.002
 Finally+0.002
icol+0.002
 Sure+0.002
ings+0.002
Neg logit contributions
 losers-0.002
 angry-0.002
 mean-0.002
aky-0.002
 selfish-0.002
Token sky
Token ID6766
Feature activation-1.44e-01

Pos logit contributions
 Finally+0.014
icol+0.014
 Sure+0.013
+0.013
 Eventually+0.012
Neg logit contributions
 angry-0.012
 losers-0.012
 mean-0.012
aky-0.011
 rude-0.011
Token was
Token ID373
Feature activation-1.33e+00

Pos logit contributions
+0.003
 Finally+0.003
 Sure+0.003
icol+0.003
uggle+0.003
Neg logit contributions
 losers-0.004
 angry-0.004
aky-0.004
 mean-0.004
 selfish-0.004
Token,
Token ID11
Feature activation-3.85e-01

Pos logit contributions
 Finally+0.037
icol+0.035
 Sure+0.034
+0.034
 Eventually+0.031
Neg logit contributions
 losers-0.037
 angry-0.035
 kids-0.034
 mean-0.033
aky-0.033
Token about
Token ID546
Feature activation-1.90e+00

Pos logit contributions
 Finally+0.012
icol+0.012
 Sure+0.012
+0.011
 Eventually+0.010
Neg logit contributions
 angry-0.009
 losers-0.009
 mean-0.009
aky-0.008
 rude-0.008
Token how
Token ID703
Feature activation-9.04e-01

Pos logit contributions
 Finally+0.059
icol+0.058
 Sure+0.058
+0.056
 Eventually+0.052
Neg logit contributions
 angry-0.046
 losers-0.045
 mean-0.044
aky-0.041
 rude-0.040
Token much
Token ID881
Feature activation+8.07e-01

Pos logit contributions
 Finally+0.025
 Sure+0.023
icol+0.022
+0.021
 Eventually+0.021
Neg logit contributions
 losers-0.026
 angry-0.024
aky-0.024
 mean-0.023
 musicians-0.022
Token she
Token ID673
Feature activation-1.35e-01

Pos logit contributions
 angry+0.021
 losers+0.020
 mean+0.020
aky+0.020
 selfish+0.019
Neg logit contributions
 Finally-0.023
icol-0.022
 Sure-0.022
-0.022
 Eventually-0.020
Token missed
Token ID6825
Feature activation-2.60e-01

Pos logit contributions
 Finally+0.004
icol+0.004
 Sure+0.004
+0.004
 Eventually+0.004
Neg logit contributions
 angry-0.003
 mean-0.003
 losers-0.003
 rude-0.003
 lazy-0.003
Token her
Token ID607
Feature activation-2.03e-01

Pos logit contributions
 Finally+0.007
 Sure+0.007
icol+0.007
+0.007
 generation+0.006
Neg logit contributions
 losers-0.007
 angry-0.007
 mean-0.007
aky-0.006
 lazy-0.006
Token loved
Token ID6151
Feature activation+1.09e-01

Pos logit contributions
 Finally+0.034
 Sure+0.032
icol+0.032
+0.031
 Eventually+0.029
Neg logit contributions
 angry-0.035
 losers-0.035
 mean-0.033
 rude-0.031
 selfish-0.031
Token picking
Token ID10868
Feature activation+5.73e-01

Pos logit contributions
aky+0.003
 losers+0.003
 mean+0.003
 angry+0.003
umpy+0.003
Neg logit contributions
icol-0.003
 Finally-0.003
 Sure-0.003
-0.002
 purchase-0.002
Token shells
Token ID19679
Feature activation+9.57e-02

Pos logit contributions
 angry+0.014
 mean+0.014
 losers+0.013
 rude+0.013
 lazy+0.012
Neg logit contributions
 Finally-0.018
icol-0.018
 Sure-0.017
-0.017
 Eventually-0.016
Token at
Token ID379
Feature activation-7.46e-02

Pos logit contributions
 angry+0.003
 losers+0.003
 mean+0.003
 rude+0.002
 lazy+0.002
Neg logit contributions
 Finally-0.003
-0.003
icol-0.003
 Sure-0.003
uggle-0.002
Token the
Token ID262
Feature activation-4.28e-01

Pos logit contributions
 Finally+0.003
icol+0.003
 Sure+0.003
+0.003
 Eventually+0.003
Neg logit contributions
 angry-0.001
 mean-0.001
 losers-0.001
 rude-0.001
 lazy-0.001
Token beach
Token ID10481
Feature activation-2.09e+00

Pos logit contributions
 Finally+0.009
icol+0.009
+0.009
 Sure+0.008
 generation+0.008
Neg logit contributions
 losers-0.015
 angry-0.014
aky-0.014
 mean-0.013
 reckless-0.013
Token.
Token ID13
Feature activation+5.43e-01

Pos logit contributions
 Finally+0.055
+0.053
 Sure+0.052
icol+0.052
ings+0.047
Neg logit contributions
 losers-0.062
 angry-0.062
 mean-0.059
 lazy-0.055
 rude-0.055
Token She
Token ID1375
Feature activation-2.00e-01

Pos logit contributions
 angry+0.016
 mean+0.015
 lazy+0.014
 losers+0.014
 kids+0.014
Neg logit contributions
icol-0.015
 Finally-0.013
-0.013
 Sure-0.013
uggle-0.012
Token had
Token ID550
Feature activation-3.24e-01

Pos logit contributions
 Finally+0.006
icol+0.006
 Sure+0.006
+0.006
 Eventually+0.005
Neg logit contributions
 angry-0.005
 mean-0.005
 losers-0.005
 rude-0.005
 lazy-0.005
Token a
Token ID257
Feature activation-2.50e-01

Pos logit contributions
 Finally+0.010
icol+0.010
 Sure+0.009
+0.009
 Eventually+0.009
Neg logit contributions
 angry-0.008
 losers-0.008
 mean-0.008
 rude-0.007
 lazy-0.007
Token special
Token ID2041
Feature activation-2.60e-02

Pos logit contributions
 Finally+0.009
icol+0.009
 Sure+0.009
+0.009
 Eventually+0.009
Neg logit contributions
 angry-0.005
 mean-0.005
 rude-0.004
 selfish-0.004
 lazy-0.004
Token not
Token ID407
Feature activation+1.24e-01

Pos logit contributions
 angry+0.021
 mean+0.020
 losers+0.019
 rude+0.018
 lazy+0.018
Neg logit contributions
 Finally-0.033
icol-0.033
 Sure-0.032
-0.032
 Eventually-0.029
Token know
Token ID760
Feature activation-3.36e-01

Pos logit contributions
 losers+0.004
 angry+0.004
 selfish+0.003
 lazy+0.003
 musicians+0.003
Neg logit contributions
 Finally-0.003
icol-0.003
 Sure-0.002
 generation-0.002
-0.002
Token if
Token ID611
Feature activation-8.08e-01

Pos logit contributions
 Finally+0.008
 Sure+0.008
icol+0.008
+0.008
 Eventually+0.007
Neg logit contributions
 losers-0.010
 angry-0.010
 mean-0.009
aky-0.009
 lazy-0.009
Token they
Token ID484
Feature activation-3.02e-01

Pos logit contributions
 Finally+0.019
 Sure+0.019
icol+0.016
+0.016
 Eventually+0.015
Neg logit contributions
aky-0.026
 angry-0.025
 losers-0.025
umpy-0.024
 mean-0.024
Token would
Token ID561
Feature activation-3.10e-01

Pos logit contributions
 Finally+0.009
+0.009
 Sure+0.009
icol+0.009
 Eventually+0.008
Neg logit contributions
 angry-0.008
 losers-0.007
 mean-0.007
 selfish-0.007
 lazy-0.007
Token ever
Token ID1683
Feature activation-2.01e+00

Pos logit contributions
 Finally+0.008
+0.008
 Sure+0.008
icol+0.008
 purchase+0.007
Neg logit contributions
 angry-0.008
 losers-0.008
 mean-0.008
aky-0.008
 lazy-0.008
Token see
Token ID766
Feature activation-1.40e-01

Pos logit contributions
 Finally+0.059
icol+0.059
 Sure+0.057
+0.056
 Eventually+0.050
Neg logit contributions
 angry-0.056
 mean-0.054
 losers-0.052
 rude-0.050
 lazy-0.049
Token their
Token ID511
Feature activation+6.36e-01

Pos logit contributions
 Finally+0.004
 Sure+0.004
icol+0.004
 generation+0.004
+0.003
Neg logit contributions
 losers-0.004
aky-0.003
umpy-0.003
 angry-0.003
 mean-0.003
Token friends
Token ID2460
Feature activation+6.81e-01

Pos logit contributions
 losers+0.017
 mean+0.016
 angry+0.016
aky+0.015
 lazy+0.014
Neg logit contributions
icol-0.018
 Finally-0.018
 Sure-0.017
-0.017
 purchase-0.016
Token and
Token ID290
Feature activation+4.90e-02

Pos logit contributions
 losers+0.023
aky+0.022
chie+0.021
 lazy+0.021
 mean+0.021
Neg logit contributions
-0.013
 Finally-0.011
uggle-0.011
 purchase-0.011
 afford-0.010
Token family
Token ID1641
Feature activation+8.41e-01

Pos logit contributions
umpy+0.001
aky+0.001
 losers+0.001
 angry+0.001
 mean+0.001
Neg logit contributions
 Finally-0.001
icol-0.001
 Sure-0.001
 purchase-0.001
 generation-0.001
Token time
Token ID640
Feature activation-1.78e-01

Pos logit contributions
 losers+0.004
umpy+0.004
aky+0.004
 angry+0.004
 mean+0.004
Neg logit contributions
 Finally-0.005
 Sure-0.004
icol-0.004
 generation-0.004
 purchase-0.004
Token.
Token ID13
Feature activation-6.28e-01

Pos logit contributions
 Finally+0.004
 Sure+0.004
 Eventually+0.004
+0.004
icol+0.003
Neg logit contributions
 losers-0.005
aky-0.005
 my-0.005
igator-0.005
 lazy-0.005
Token They
Token ID1119
Feature activation+6.91e-02

Pos logit contributions
 Finally+0.016
 Sure+0.015
+0.014
icol+0.014
 Eventually+0.013
Neg logit contributions
 angry-0.021
 losers-0.020
 mean-0.020
 rude-0.019
 lazy-0.019
Token yelled
Token ID22760
Feature activation-1.32e+00

Pos logit contributions
 angry+0.002
 losers+0.002
 mean+0.002
 lazy+0.002
 my+0.001
Neg logit contributions
 Finally-0.002
icol-0.002
 Sure-0.002
-0.002
 Eventually-0.002
Token and
Token ID290
Feature activation+1.53e-01

Pos logit contributions
 Finally+0.034
 Sure+0.030
 Eventually+0.029
icol+0.028
ings+0.026
Neg logit contributions
igator-0.040
 losers-0.039
aky-0.038
igators-0.036
chie-0.035
Token blamed
Token ID13772
Feature activation-1.89e+00

Pos logit contributions
umpy+0.004
 losers+0.004
aky+0.004
 lazy+0.004
 musicians+0.004
Neg logit contributions
icol-0.004
 Finally-0.004
 Sure-0.004
-0.003
 trek-0.003
Token and
Token ID290
Feature activation-1.60e-01

Pos logit contributions
 Finally+0.059
icol+0.058
 Sure+0.057
+0.053
 Eventually+0.051
Neg logit contributions
 losers-0.045
 mean-0.043
 angry-0.043
 selfish-0.040
 lazy-0.040
Token fired
Token ID6294
Feature activation-5.38e-01

Pos logit contributions
 Finally+0.004
icol+0.004
 Sure+0.004
 Eventually+0.003
 trek+0.003
Neg logit contributions
umpy-0.004
aky-0.004
 losers-0.004
 my-0.004
 lazy-0.003
Token people
Token ID661
Feature activation+8.10e-01

Pos logit contributions
ings+0.012
 Finally+0.012
icol+0.010
 Sure+0.010
 Eventually+0.009
Neg logit contributions
 losers-0.018
aky-0.017
igator-0.016
 lazy-0.015
 mean-0.015
Token.
Token ID13
Feature activation-4.49e-01

Pos logit contributions
 losers+0.024
aky+0.023
 mean+0.023
 lazy+0.023
 angry+0.023
Neg logit contributions
 Finally-0.017
 Sure-0.017
ings-0.016
-0.016
icol-0.015
Token\n
Token ID198
Feature activation+1.75e-04

Pos logit contributions
 Finally+0.011
 Sure+0.011
+0.010
icol+0.009
 Eventually+0.009
Neg logit contributions
 angry-0.015
 losers-0.015
 mean-0.015
aky-0.014
 rude-0.014
Token the
Token ID262
Feature activation-1.42e-01

Pos logit contributions
 Finally+0.018
icol+0.018
 Sure+0.018
+0.017
 Eventually+0.016
Neg logit contributions
 angry-0.011
 losers-0.011
 mean-0.011
 rude-0.010
 selfish-0.010
Token garden
Token ID11376
Feature activation-1.40e+00

Pos logit contributions
 Finally+0.003
icol+0.003
 Sure+0.003
+0.003
 purchase+0.002
Neg logit contributions
 losers-0.005
 angry-0.005
 mean-0.005
aky-0.005
 rude-0.004
Token.
Token ID13
Feature activation+4.52e-01

Pos logit contributions
 Finally+0.041
 Sure+0.039
+0.039
icol+0.038
 Eventually+0.035
Neg logit contributions
 losers-0.038
 angry-0.037
 mean-0.037
 rude-0.034
 musicians-0.034
Token He
Token ID679
Feature activation-9.09e-01

Pos logit contributions
 angry+0.012
 mean+0.011
 losers+0.011
 rude+0.010
 lazy+0.010
Neg logit contributions
 Finally-0.014
icol-0.014
 Sure-0.014
-0.013
 Eventually-0.012
Token dug
Token ID18735
Feature activation-1.23e+00

Pos logit contributions
 Finally+0.023
 Sure+0.022
icol+0.021
+0.021
 Eventually+0.019
Neg logit contributions
 losers-0.028
 angry-0.028
 mean-0.027
 rude-0.026
 selfish-0.025
Token deep
Token ID2769
Feature activation-2.16e+00

Pos logit contributions
+0.044
icol+0.043
 Sure+0.043
 Finally+0.042
uggle+0.039
Neg logit contributions
 angry-0.026
 mean-0.025
 losers-0.023
 lazy-0.022
 rude-0.022
Token into
Token ID656
Feature activation-4.01e-01

Pos logit contributions
icol+0.070
 Finally+0.070
 Sure+0.067
+0.067
 Eventually+0.060
Neg logit contributions
 angry-0.050
 mean-0.049
 losers-0.049
 rude-0.045
 selfish-0.045
Token the
Token ID262
Feature activation-2.22e-01

Pos logit contributions
 Finally+0.015
 Sure+0.014
+0.014
icol+0.014
 purchase+0.013
Neg logit contributions
 losers-0.007
 mean-0.007
 angry-0.007
 selfish-0.006
 rude-0.006
Token dirt
Token ID13647
Feature activation-7.62e-01

Pos logit contributions
 Finally+0.006
icol+0.005
 Sure+0.005
+0.005
 Eventually+0.005
Neg logit contributions
 losers-0.007
 angry-0.007
 mean-0.007
 rude-0.006
 musicians-0.006
Token and
Token ID290
Feature activation-4.16e-01

Pos logit contributions
 Finally+0.023
+0.023
icol+0.023
 Sure+0.022
 Eventually+0.020
Neg logit contributions
 angry-0.019
 losers-0.019
 mean-0.019
 rude-0.017
 selfish-0.017
Token pulled
Token ID5954
Feature activation-1.12e+00

Pos logit contributions
 Finally+0.012
 Sure+0.012
icol+0.012
+0.011
 Eventually+0.010
Neg logit contributions
 losers-0.011
 angry-0.011
 mean-0.011
umpy-0.010
 selfish-0.010