TOP ACTIVATIONS
MAX = 0.086

to cry and threw a tant rum . She didn 't
She gave Lily the er aser . She helped her
listen . She threw a tant rum in the store .
the shrimp . " Wow , Lily ! You have good
We did not share the er aser ." The
" Where ? I searched everywhere in my room . There
She cried and threw a tant rum . Her mom tried
and so she threw a tant rum when her parents said
" I lost my lucky er aser . I can 't
kept crying and throwing a tant rum . Finally
happy . She put the er aser in the trash and
had become . " Wow , Lily ! Your bean plant
want to play with the laser together ?" M
sad . She threw a tant rum and cried loudly .
her goal ! <|endoftext|> J imm ie was scared . He
gone . They see their er aser . It is smiling
want to play with my laser ? We can be a
. Can I share your tube ?" Ben smiles
say , " Stop writing everywhere ! It is messy and
. He kept throwing a tant rum . His mom was

MIN ACTIVATIONS
MIN = -0.090

back to the pond the next day , hoping that the
the rabbit met again the next summer and the crazy farmer
able to catch the rabbit next time . She was now
returned the pitch fork the next day and said , "
back to the pond the next day . They were also
happy that it produced more av oc ados than ever before
tree could see lots of av oc ados that the supplier
the shapes she had been able to make . <|endoftext|> Once
from one flower to the next carrying it carefully in her
to the cherry tree the next day . The
she visited the river the next day . The river was
see the crocod ile right next to him . He had
promised to come back the next day . The
bright feathers in the tree next to her . She was
picnic to the river the next day .
under the palm tree the next day . <|endoftext|> One day
He found the perfect spot next to the pond , under
rocks to their teacher the next day in class . "
lay down on the ground next to John and fell asleep
on the raft again the next day . <|endoftext|> Billy and

INTERVAL 0.042 to 0.086
CONTAINS 0.430%

" I 'm so sorry , kids !" the man said
mom said , " No , but let 's look for
a girl . She was playing in her room when she
<|endoftext|> Tom and Lily were playing in the park with their
, " That 's OK , Ben . Don 't worry

INTERVAL -0.002 to 0.042
CONTAINS 54.967%

a silly face . " Look , Anna .
her friend too . They hugged and went back to the
. He was careful and looked around . He saw a
old child was walking in a park with their parents .
. She wanted to have a fun day . So she

INTERVAL -0.046 to -0.002
CONTAINS 44.474%

They climb the tree and pick some pr unes
sped over the hills and soon the mole arrived at his
The little girl explained that it had shrunk and become swe
. Everything was sharp and unsafe . But Jimmy
lovely day ." Then she waved goodbye to the family and

INTERVAL -0.090 to -0.046
CONTAINS 0.129%

Then , the sun began to set . They had
At the end of their adventure , Jack and the
felt relieved that she was able to help the bunny in
they needed : tomatoes , garlic , onion , oil ,
and testing to see which one tasted the best .
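
The four INTERVAL blocks above look like equal-width bins over the range from MIN (-0.090) to MAX (0.086), each 0.044 wide, with CONTAINS giving the share of tokens whose activation falls in that bin. This is inferred from the printed numbers, not stated by the dashboard. The sketch below reproduces the boundaries and shows how such shares could be computed. The function names and the half-open binning rule are assumptions for illustration, and the sample values are a handful of per-token activations from this dump, not the full dataset behind the CONTAINS figures.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced boundaries from lo to hi (assumed scheme)."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


def percent_per_interval(values, edges):
    """Percentage of values in each bin [edges[i], edges[i+1]).

    The last bin also includes its upper edge so the maximum is counted.
    Values outside the overall range are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = len(values)
    if total == 0:
        return [0.0] * len(counts)
    return [100.0 * c / total for c in counts]


# Boundaries implied by MIN = -0.090 and MAX = 0.086 with four bins.
edges = interval_edges(-0.090, 0.086)

# A few per-token activations from this dump (illustrative sample only).
sample = [8.26e-02, -1.20e-02, 1.68e-02, -3.32e-02]
shares = percent_per_interval(sample, edges)
```

Here `edges` lists the boundaries in ascending order, whereas the dashboard prints its intervals from highest to lowest.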

Token to
Token ID284
Feature activation+1.68e-02

Pos logit contributions
 moral+0.012
lights+0.012
 exc+0.011
 swollen+0.011
 works+0.011
Neg logit contributions
eb-0.007
over-0.006
'-0.006
e-0.005
ucky-0.005
Token cry
Token ID3960
Feature activation+3.15e-02

Pos logit contributions
lights+0.024
 moral+0.023
 swollen+0.022
bul+0.021
 works+0.021
Neg logit contributions
eb-0.029
'-0.026
e-0.026
over-0.026
ucky-0.024
Token and
Token ID290
Feature activation+1.07e-02

Pos logit contributions
 moral+0.050
 exc+0.046
lights+0.046
 lamp+0.045
 pret+0.045
Neg logit contributions
eb-0.052
over-0.045
e-0.043
'-0.041
ax-0.041
Token threw
Token ID9617
Feature activation+2.63e-03

Pos logit contributions
lights+0.018
 moral+0.018
bul+0.017
 works+0.016
 swollen+0.016
Neg logit contributions
eb-0.015
'-0.014
e-0.013
over-0.013
ale-0.012
Token a
Token ID257
Feature activation+1.57e-02

Pos logit contributions
 moral+0.004
 swollen+0.004
 pret+0.004
 exc+0.004
 works+0.004
Neg logit contributions
eb-0.004
over-0.004
'-0.004
e-0.004
ucky-0.004
Token tant
Token ID24246
Feature activation+8.26e-02

Pos logit contributions
lights+0.023
 moral+0.023
 works+0.021
 exc+0.021
 appearances+0.020
Neg logit contributions
eb-0.026
'-0.024
over-0.023
ucky-0.022
e-0.022
Tokenrum
Token ID6582
Feature activation+3.37e-02

Pos logit contributions
 moral+0.091
lights+0.083
 works+0.082
 exc+0.082
 swollen+0.080
Neg logit contributions
eb-0.172
over-0.159
e-0.152
'-0.152
ucky-0.145
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
 moral+0.063
lights+0.060
 exc+0.060
 swollen+0.059
 works+0.058
Neg logit contributions
eb-0.044
over-0.039
'-0.036
e-0.035
ucky-0.033
Token She
Token ID1375
Feature activation+2.48e-02

Pos logit contributions
eb+0.017
over+0.016
'+0.015
e+0.014
ucky+0.014
Neg logit contributions
 moral-0.020
 exc-0.019
lights-0.019
 swollen-0.019
 works-0.019
Token didn
Token ID1422
Feature activation+1.37e-02

Pos logit contributions
 moral+0.037
lights+0.036
 exc+0.035
 poster+0.034
 works+0.034
Neg logit contributions
eb-0.041
'-0.036
over-0.036
e-0.035
ucky-0.032
Token't
Token ID470
Feature activation+1.83e-02

Pos logit contributions
 moral+0.021
lights+0.019
 exc+0.019
 swollen+0.019
 works+0.019
Neg logit contributions
eb-0.022
over-0.021
'-0.019
e-0.019
ucky-0.018
Token
Token ID198
Feature activation+1.06e-02

Pos logit contributions
 moral+0.017
lights+0.017
 pret+0.017
 swollen+0.017
 exc+0.017
Neg logit contributions
eb-0.011
over-0.010
'-0.009
e-0.009
ucky-0.008
TokenShe
Token ID3347
Feature activation+3.44e-02

Pos logit contributions
lights+0.017
 pret+0.017
bul+0.016
 moral+0.016
 works+0.016
Neg logit contributions
eb-0.015
over-0.015
'-0.015
aw-0.013
he-0.013
Token gave
Token ID2921
Feature activation-9.54e-03

Pos logit contributions
lights+0.063
 appearances+0.060
 swollen+0.060
 moral+0.059
 disl+0.059
Neg logit contributions
eb-0.045
'-0.040
aw-0.038
e-0.036
al-0.036
Token Lily
Token ID20037
Feature activation+9.54e-03

Pos logit contributions
over+0.016
eb+0.016
'+0.015
ive+0.014
ake+0.014
Neg logit contributions
 moral-0.013
 works-0.012
 pret-0.012
 lamp-0.012
lights-0.012
Token the
Token ID262
Feature activation+1.64e-02

Pos logit contributions
 moral+0.013
 exc+0.012
 works+0.012
 swollen+0.011
 pret+0.011
Neg logit contributions
eb-0.017
over-0.016
'-0.015
ucky-0.014
e-0.014
Token er
Token ID1931
Feature activation+8.16e-02

Pos logit contributions
lights+0.023
bel+0.019
 Amen+0.018
 exc+0.018
 satellites+0.018
Neg logit contributions
eb-0.029
'-0.028
ucky-0.026
e-0.026
over-0.026
Tokenaser
Token ID6005
Feature activation+3.85e-02

Pos logit contributions
 moral+0.123
lights+0.122
 exc+0.118
 swollen+0.116
 works+0.115
Neg logit contributions
eb-0.133
'-0.121
over-0.120
ucky-0.117
e-0.116
Token.
Token ID13
Feature activation-2.12e-03

Pos logit contributions
 moral+0.070
 exc+0.067
lights+0.066
 swollen+0.066
 works+0.065
Neg logit contributions
eb-0.051
over-0.046
'-0.043
e-0.041
ucky-0.039
Token She
Token ID1375
Feature activation+1.80e-02

Pos logit contributions
eb+0.002
over+0.002
'+0.002
e+0.002
ucky+0.002
Neg logit contributions
 moral-0.004
lights-0.004
 exc-0.004
 swollen-0.004
 works-0.004
Token helped
Token ID4193
Feature activation-2.79e-03

Pos logit contributions
lights+0.032
 satellites+0.031
 appearances+0.031
 moral+0.031
 Amen+0.030
Neg logit contributions
eb-0.024
'-0.020
aw-0.020
e-0.020
al-0.018
Token her
Token ID607
Feature activation+1.50e-02

Pos logit contributions
eb+0.005
over+0.005
ive+0.005
ump+0.005
ard+0.005
Neg logit contributions
 moral-0.004
 lamp-0.003
 exc-0.003
 pret-0.003
 dial-0.003
Token listen
Token ID6004
Feature activation-1.52e-02

Pos logit contributions
eb+0.004
over+0.003
'+0.003
ucky+0.003
ard+0.003
Neg logit contributions
 moral-0.003
 exc-0.003
 works-0.003
 pret-0.003
 swollen-0.003
Token.
Token ID13
Feature activation-8.52e-03

Pos logit contributions
eb+0.018
over+0.017
'+0.015
e+0.015
ucky+0.014
Neg logit contributions
 moral-0.029
 exc-0.028
lights-0.028
 swollen-0.028
 works-0.028
Token She
Token ID1375
Feature activation+2.01e-02

Pos logit contributions
eb+0.013
over+0.011
e+0.010
'+0.010
ale+0.010
Neg logit contributions
 moral-0.015
 exc-0.014
lights-0.014
 swollen-0.014
 lamp-0.014
Token threw
Token ID9617
Feature activation-8.45e-04

Pos logit contributions
lights+0.030
 moral+0.030
 swollen+0.029
 poster+0.028
bul+0.028
Neg logit contributions
eb-0.033
over-0.029
'-0.029
e-0.028
aw-0.026
Token a
Token ID257
Feature activation+1.45e-02

Pos logit contributions
eb+0.001
over+0.001
'+0.001
e+0.001
ive+0.001
Neg logit contributions
 moral-0.001
 pret-0.001
 swollen-0.001
 works-0.001
 exc-0.001
Token tant
Token ID24246
Feature activation+7.97e-02

Pos logit contributions
lights+0.022
 moral+0.021
 works+0.021
 exc+0.021
 Lincoln+0.020
Neg logit contributions
eb-0.023
'-0.021
over-0.020
ucky-0.019
e-0.019
Tokenrum
Token ID6582
Feature activation+3.53e-02

Pos logit contributions
 moral+0.097
 works+0.088
lights+0.088
 exc+0.086
 pret+0.085
Neg logit contributions
eb-0.160
over-0.147
e-0.144
'-0.138
ucky-0.133
Token in
Token ID287
Feature activation+2.93e-02

Pos logit contributions
 moral+0.065
 exc+0.062
lights+0.061
 swollen+0.061
 works+0.061
Neg logit contributions
eb-0.046
over-0.042
'-0.039
e-0.037
ucky-0.036
Token the
Token ID262
Feature activation+1.23e-02

Pos logit contributions
 moral+0.042
 swollen+0.041
 exc+0.038
 lamp+0.038
 pret+0.036
Neg logit contributions
eb-0.053
over-0.049
ive-0.045
ax-0.045
ale-0.044
Token store
Token ID3650
Feature activation-1.27e-02

Pos logit contributions
 moral+0.021
 exc+0.020
lights+0.020
 swollen+0.020
 works+0.019
Neg logit contributions
eb-0.018
over-0.016
'-0.015
e-0.014
ucky-0.014
Token.
Token ID13
Feature activation-8.01e-03

Pos logit contributions
eb+0.021
over+0.018
e+0.017
ucky+0.017
ee+0.017
Neg logit contributions
 moral-0.019
 poster-0.018
 exc-0.018
lights-0.018
 works-0.017
Token the
Token ID262
Feature activation-4.05e-03

Pos logit contributions
 moral+0.029
lights+0.027
 exc+0.026
 swollen+0.026
 works+0.025
Neg logit contributions
eb-0.049
over-0.045
'-0.044
e-0.042
ucky-0.042
Token shrimp
Token ID30060
Feature activation-1.48e-02

Pos logit contributions
eb+0.006
over+0.005
e+0.005
'+0.005
ale+0.005
Neg logit contributions
 moral-0.007
 exc-0.007
 swollen-0.007
lights-0.006
 works-0.006
Token.
Token ID13
Feature activation+7.04e-03

Pos logit contributions
eb+0.024
over+0.023
e+0.019
ale+0.019
ive+0.019
Neg logit contributions
 moral-0.023
 exc-0.022
 works-0.022
 pret-0.021
 poster-0.021
Token "
Token ID366
Feature activation+1.31e-02

Pos logit contributions
 moral+0.013
 swollen+0.012
 exc+0.012
lights+0.012
 works+0.012
Neg logit contributions
eb-0.009
over-0.008
'-0.008
ucky-0.007
e-0.007
TokenWow
Token ID22017
Feature activation+2.95e-02

Pos logit contributions
lights+0.018
 works+0.017
 moral+0.017
 exc+0.017
aj+0.017
Neg logit contributions
eb-0.023
'-0.021
over-0.020
e-0.020
ucky-0.019
Token,
Token ID11
Feature activation+7.91e-02

Pos logit contributions
 moral+0.050
 works+0.050
 exc+0.049
 pret+0.048
 swollen+0.048
Neg logit contributions
eb-0.040
over-0.038
e-0.033
'-0.032
ale-0.031
Token Lily
Token ID20037
Feature activation-2.57e-03

Pos logit contributions
 moral+0.143
lights+0.142
bul+0.140
 exc+0.138
 swollen+0.137
Neg logit contributions
eb-0.099
'-0.087
ucky-0.085
over-0.080
e-0.079
Token!
Token ID0
Feature activation+5.59e-03

Pos logit contributions
eb+0.004
over+0.003
'+0.003
e+0.003
ale+0.003
Neg logit contributions
 works-0.004
 swollen-0.004
 moral-0.004
 lamp-0.004
 exc-0.004
Token You
Token ID921
Feature activation+7.25e-03

Pos logit contributions
 moral+0.008
 exc+0.007
 pret+0.007
 works+0.007
 lamp+0.007
Neg logit contributions
eb-0.009
over-0.009
'-0.008
e-0.008
ale-0.008
Token have
Token ID423
Feature activation-1.07e-02

Pos logit contributions
 moral+0.014
 exc+0.013
 works+0.013
 pret+0.013
 swollen+0.013
Neg logit contributions
eb-0.009
over-0.009
'-0.007
ale-0.007
e-0.007
Token good
Token ID922
Feature activation-1.27e-02

Pos logit contributions
eb+0.018
over+0.017
'+0.016
ale+0.015
he+0.015
Neg logit contributions
 moral-0.015
 swollen-0.014
 exc-0.014
 pret-0.014
 works-0.014
Token We
Token ID775
Feature activation-1.20e-02

Pos logit contributions
 moral+0.003
lights+0.003
 exc+0.003
 swollen+0.003
 works+0.003
Neg logit contributions
eb-0.005
over-0.004
'-0.004
e-0.004
ucky-0.004
Token did
Token ID750
Feature activation+1.54e-02

Pos logit contributions
eb+0.017
'+0.015
e+0.015
over+0.015
ucky+0.013
Neg logit contributions
lights-0.021
 moral-0.020
 swollen-0.019
 poster-0.019
 exc-0.019
Token not
Token ID407
Feature activation+6.28e-03

Pos logit contributions
bul+0.024
lights+0.023
 swollen+0.022
bel+0.021
 gen+0.021
Neg logit contributions
'-0.022
e-0.022
 after-0.021
eb-0.021
 home-0.020
Token share
Token ID2648
Feature activation-7.45e-03

Pos logit contributions
 moral+0.009
lights+0.009
 swollen+0.009
 exc+0.009
 works+0.009
Neg logit contributions
eb-0.010
over-0.009
'-0.009
e-0.009
ucky-0.008
Token the
Token ID262
Feature activation+2.83e-02

Pos logit contributions
eb+0.010
over+0.009
'+0.009
e+0.008
ucky+0.008
Neg logit contributions
 moral-0.013
 exc-0.012
 swollen-0.012
lights-0.012
 works-0.012
Token er
Token ID1931
Feature activation+7.90e-02

Pos logit contributions
lights+0.040
bel+0.034
 swollen+0.034
 exc+0.034
 moral+0.034
Neg logit contributions
eb-0.046
'-0.045
e-0.044
over-0.043
ard-0.042
Tokenaser
Token ID6005
Feature activation+4.47e-02

Pos logit contributions
 moral+0.109
 swollen+0.105
lights+0.104
 exc+0.103
 pret+0.099
Neg logit contributions
eb-0.139
over-0.130
'-0.127
e-0.123
ucky-0.120
Token."
Token ID526
Feature activation-1.70e-02

Pos logit contributions
 moral+0.080
lights+0.076
 exc+0.076
 swollen+0.076
 works+0.074
Neg logit contributions
eb-0.060
over-0.055
'-0.052
e-0.050
ucky-0.047
Token
Token ID198
Feature activation+2.39e-04

Pos logit contributions
eb+0.021
over+0.019
'+0.018
e+0.017
ucky+0.016
Neg logit contributions
 moral-0.032
 swollen-0.031
lights-0.031
 exc-0.031
 works-0.031
Token
Token ID198
Feature activation-1.42e-03

Pos logit contributions
 pret+0.000
lights+0.000
 works+0.000
bul+0.000
 swollen+0.000
Neg logit contributions
eb-0.000
'-0.000
over-0.000
e-0.000
ucky-0.000
TokenThe
Token ID464
Feature activation+2.43e-02

Pos logit contributions
eb+0.002
'+0.002
over+0.002
e+0.002
aw+0.002
Neg logit contributions
lights-0.002
 pret-0.002
 works-0.002
bul-0.002
 lamp-0.002
Token "
Token ID366
Feature activation-4.03e-03

Pos logit contributions
lights+0.035
 moral+0.032
 works+0.032
bul+0.030
bel+0.030
Neg logit contributions
'-0.036
eb-0.033
over-0.032
e-0.030
 forward-0.029
TokenWhere
Token ID8496
Feature activation-1.33e-02

Pos logit contributions
eb+0.007
over+0.007
'+0.007
e+0.006
ucky+0.006
Neg logit contributions
 moral-0.005
 exc-0.005
lights-0.005
 swollen-0.005
 works-0.005
Token?
Token ID30
Feature activation-1.44e-03

Pos logit contributions
eb+0.022
over+0.020
'+0.020
e+0.019
ucky+0.019
Neg logit contributions
 moral-0.019
lights-0.019
 swollen-0.018
 exc-0.018
bul-0.017
Token I
Token ID314
Feature activation-1.29e-02

Pos logit contributions
eb+0.002
'+0.002
e+0.002
over+0.002
ive+0.002
Neg logit contributions
lights-0.002
 moral-0.002
 swollen-0.002
 trump-0.002
 works-0.002
Token searched
Token ID16499
Feature activation+4.86e-02

Pos logit contributions
eb+0.017
over+0.015
'+0.014
e+0.014
ucky+0.012
Neg logit contributions
lights-0.024
 moral-0.024
 swollen-0.023
 poster-0.022
 exc-0.022
Token everywhere
Token ID8347
Feature activation+7.90e-02

Pos logit contributions
 moral+0.086
 exc+0.082
lights+0.082
 swollen+0.081
 works+0.080
Neg logit contributions
eb-0.066
over-0.060
'-0.056
e-0.054
ucky-0.052
Token in
Token ID287
Feature activation+4.27e-02

Pos logit contributions
 moral+0.139
 exc+0.132
lights+0.131
 swollen+0.130
 works+0.129
Neg logit contributions
eb-0.112
over-0.100
'-0.090
e-0.088
ale-0.085
Token my
Token ID616
Feature activation+3.30e-03

Pos logit contributions
 moral+0.067
lights+0.064
 works+0.063
 exc+0.063
 lamp+0.062
Neg logit contributions
eb-0.068
over-0.064
'-0.058
e-0.056
ive-0.056
Token room
Token ID2119
Feature activation+2.61e-02

Pos logit contributions
 moral+0.004
 lamp+0.003
 poster+0.003
 exc+0.003
 robot+0.003
Neg logit contributions
eb-0.007
over-0.006
ax-0.006
ram-0.006
ale-0.006
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
 moral+0.035
lights+0.033
 lamp+0.033
 works+0.033
 exc+0.032
Neg logit contributions
eb-0.047
over-0.042
e-0.039
'-0.038
ale-0.038
Token There
Token ID1318
Feature activation+5.14e-03

Pos logit contributions
 moral+0.015
lights+0.015
 swollen+0.015
 exc+0.014
 works+0.014
Neg logit contributions
eb-0.019
over-0.017
'-0.017
e-0.016
ucky-0.016
Token She
Token ID1375
Feature activation+2.51e-02

Pos logit contributions
eb+0.006
over+0.005
'+0.005
e+0.005
ucky+0.005
Neg logit contributions
 moral-0.008
 exc-0.008
lights-0.008
 swollen-0.007
 works-0.007
Token cried
Token ID16896
Feature activation+2.36e-02

Pos logit contributions
 moral+0.035
 exc+0.033
lights+0.032
 swollen+0.032
 works+0.032
Neg logit contributions
eb-0.044
over-0.041
'-0.039
e-0.038
ucky-0.037
Token and
Token ID290
Feature activation+1.15e-02

Pos logit contributions
 moral+0.034
lights+0.031
 lamp+0.031
 exc+0.031
 pret+0.031
Neg logit contributions
eb-0.043
over-0.036
e-0.035
ax-0.035
ram-0.035
Token threw
Token ID9617
Feature activation-3.22e-03

Pos logit contributions
 moral+0.015
lights+0.014
 exc+0.014
 swollen+0.014
 works+0.014
Neg logit contributions
eb-0.021
over-0.019
'-0.018
e-0.018
ucky-0.017
Token a
Token ID257
Feature activation+1.77e-02

Pos logit contributions
eb+0.006
over+0.005
'+0.005
ucky+0.005
ax+0.005
Neg logit contributions
 moral-0.005
 swollen-0.004
 pret-0.004
 exc-0.004
lights-0.004
Token tant
Token ID24246
Feature activation+7.88e-02

Pos logit contributions
 moral+0.024
lights+0.024
 exc+0.023
 works+0.023
 pret+0.022
Neg logit contributions
eb-0.030
'-0.028
over-0.028
e-0.026
ucky-0.025
Tokenrum
Token ID6582
Feature activation+3.09e-02

Pos logit contributions
 moral+0.088
lights+0.080
 works+0.079
 exc+0.079
 swollen+0.077
Neg logit contributions
eb-0.163
over-0.151
e-0.145
'-0.144
ucky-0.138
Token.
Token ID13
Feature activation-5.00e-03

Pos logit contributions
 moral+0.058
 exc+0.055
lights+0.055
 swollen+0.055
 works+0.054
Neg logit contributions
eb-0.039
over-0.035
'-0.033
e-0.031
ucky-0.030
Token Her
Token ID2332
Feature activation-8.65e-03

Pos logit contributions
eb+0.008
over+0.007
'+0.007
e+0.007
ucky+0.006
Neg logit contributions
 moral-0.008
 exc-0.007
lights-0.007
 swollen-0.007
 works-0.007
Token mom
Token ID1995
Feature activation+3.01e-03

Pos logit contributions
eb+0.013
over+0.012
 home+0.012
'+0.012
e+0.012
Neg logit contributions
 moral-0.012
 works-0.012
 pret-0.012
bul-0.012
lights-0.012
Token tried
Token ID3088
Feature activation+1.36e-04

Pos logit contributions
 moral+0.003
 swollen+0.003
 exc+0.003
 works+0.003
 pret+0.003
Neg logit contributions
eb-0.006
over-0.006
ucky-0.005
'-0.005
ale-0.005
Token and
Token ID290
Feature activation+1.22e-02

Pos logit contributions
 works+0.016
 moral+0.016
 pret+0.016
lights+0.016
 exc+0.016
Neg logit contributions
eb-0.021
over-0.018
e-0.018
'-0.017
ive-0.017
Token so
Token ID523
Feature activation+1.13e-02

Pos logit contributions
 moral+0.014
 exc+0.013
 swollen+0.013
 works+0.012
 pret+0.012
Neg logit contributions
eb-0.025
over-0.023
'-0.022
ucky-0.021
e-0.021
Token she
Token ID673
Feature activation+1.09e-02

Pos logit contributions
 moral+0.015
 exc+0.015
lights+0.014
 swollen+0.014
 works+0.014
Neg logit contributions
eb-0.020
over-0.019
'-0.018
e-0.017
ucky-0.017
Token threw
Token ID9617
Feature activation-5.81e-03

Pos logit contributions
 moral+0.017
 exc+0.016
lights+0.015
 swollen+0.015
 works+0.015
Neg logit contributions
eb-0.018
over-0.016
'-0.015
e-0.015
ucky-0.015
Token a
Token ID257
Feature activation+7.91e-03

Pos logit contributions
eb+0.011
over+0.010
'+0.009
e+0.009
ive+0.009
Neg logit contributions
 moral-0.008
 exc-0.007
 pret-0.007
 swollen-0.007
 works-0.007
Token tant
Token ID24246
Feature activation+7.85e-02

Pos logit contributions
 moral+0.012
lights+0.012
 works+0.011
 exc+0.011
 swollen+0.011
Neg logit contributions
eb-0.013
over-0.012
'-0.012
e-0.011
ucky-0.011
Tokenrum
Token ID6582
Feature activation+3.34e-02

Pos logit contributions
 moral+0.087
 works+0.078
lights+0.078
 exc+0.077
 swollen+0.077
Neg logit contributions
eb-0.164
over-0.152
e-0.146
'-0.144
ucky-0.137
Token when
Token ID618
Feature activation-1.28e-02

Pos logit contributions
 moral+0.064
 exc+0.061
lights+0.060
 swollen+0.060
 works+0.060
Neg logit contributions
eb-0.041
over-0.038
'-0.035
e-0.033
ucky-0.031
Token her
Token ID607
Feature activation-2.15e-02

Pos logit contributions
eb+0.023
over+0.021
'+0.020
e+0.019
ucky+0.019
Neg logit contributions
 moral-0.018
 exc-0.016
 swollen-0.016
lights-0.016
 works-0.016
Token parents
Token ID3397
Feature activation-4.83e-03

Pos logit contributions
eb+0.039
over+0.037
'+0.035
e+0.034
ucky+0.033
Neg logit contributions
 moral-0.028
lights-0.026
 works-0.025
 exc-0.025
 swollen-0.025
Token said
Token ID531
Feature activation-1.70e-02

Pos logit contributions
over+0.010
umm+0.010
eb+0.010
ucky+0.009
ive+0.009
Neg logit contributions
 exc-0.004
 dial-0.004
 works-0.004
 hon-0.004
 pret-0.004
Token "
Token ID366
Feature activation+2.60e-03

Pos logit contributions
lights+0.029
 works+0.027
 swollen+0.026
bel+0.026
 moral+0.026
Neg logit contributions
'-0.028
eb-0.026
over-0.024
 forward-0.022
ucky-0.022
TokenI
Token ID40
Feature activation-1.53e-03

Pos logit contributions
 pret+0.005
lights+0.005
 moral+0.005
 poster+0.005
 works+0.005
Neg logit contributions
eb-0.003
over-0.002
'-0.002
e-0.002
ucky-0.002
Token lost
Token ID2626
Feature activation+3.44e-02

Pos logit contributions
eb+0.002
'+0.002
over+0.002
e+0.002
ucky+0.002
Neg logit contributions
lights-0.003
 moral-0.003
 swollen-0.003
 poster-0.003
bul-0.003
Token my
Token ID616
Feature activation+7.27e-03

Pos logit contributions
 moral+0.038
lights+0.036
 exc+0.035
 swollen+0.034
 works+0.034
Neg logit contributions
eb-0.070
over-0.065
'-0.062
e-0.062
ucky-0.060
Token lucky
Token ID9670
Feature activation+3.60e-02

Pos logit contributions
lights+0.010
bel+0.009
 Amen+0.009
 moral+0.008
 satellites+0.008
Neg logit contributions
eb-0.013
ucky-0.012
'-0.012
over-0.012
e-0.012
Token er
Token ID1931
Feature activation+7.78e-02

Pos logit contributions
lights+0.045
 moral+0.040
bel+0.040
 exc+0.040
 swollen+0.039
Neg logit contributions
eb-0.070
over-0.063
'-0.063
ucky-0.061
e-0.061
Tokenaser
Token ID6005
Feature activation+3.88e-02

Pos logit contributions
 moral+0.114
lights+0.112
 swollen+0.108
 exc+0.107
 pret+0.105
Neg logit contributions
eb-0.133
'-0.119
over-0.119
ucky-0.115
e-0.114
Token.
Token ID13
Feature activation+9.30e-03

Pos logit contributions
 moral+0.076
lights+0.073
 swollen+0.072
 exc+0.071
 pret+0.070
Neg logit contributions
eb-0.045
over-0.041
'-0.041
e-0.038
ucky-0.035
Token I
Token ID314
Feature activation+3.73e-03

Pos logit contributions
 moral+0.016
lights+0.015
 swollen+0.015
 works+0.015
bel+0.015
Neg logit contributions
eb-0.013
over-0.012
e-0.011
'-0.011
ucky-0.011
Token can
Token ID460
Feature activation+6.29e-03

Pos logit contributions
lights+0.007
 moral+0.007
 swollen+0.007
 poster+0.006
 appearances+0.006
Neg logit contributions
eb-0.005
'-0.004
e-0.004
over-0.004
he-0.004
Token't
Token ID470
Feature activation+2.46e-02

Pos logit contributions
lights+0.012
 moral+0.011
 pret+0.011
 works+0.011
 swollen+0.010
Neg logit contributions
eb-0.008
'-0.008
e-0.007
he-0.007
ucky-0.007
Token kept
Token ID4030
Feature activation+1.64e-02

Pos logit contributions
 moral+0.029
 exc+0.027
 swollen+0.027
 pret+0.026
 works+0.026
Neg logit contributions
eb-0.039
over-0.037
'-0.034
e-0.033
ucky-0.032
Token crying
Token ID13774
Feature activation+4.59e-02

Pos logit contributions
 moral+0.018
 swollen+0.017
 exc+0.016
 pret+0.016
 works+0.016
Neg logit contributions
eb-0.034
over-0.032
'-0.030
ale-0.030
ive-0.030
Token and
Token ID290
Feature activation+1.30e-03

Pos logit contributions
 moral+0.072
lights+0.068
 exc+0.066
 swollen+0.066
 lamp+0.064
Neg logit contributions
eb-0.078
over-0.066
e-0.065
'-0.062
ale-0.061
Token throwing
Token ID9644
Feature activation-8.18e-04

Pos logit contributions
 moral+0.002
 swollen+0.001
 exc+0.001
lights+0.001
 pret+0.001
Neg logit contributions
eb-0.003
over-0.002
'-0.002
e-0.002
ucky-0.002
Token a
Token ID257
Feature activation+1.46e-02

Pos logit contributions
eb+0.002
over+0.001
ete+0.001
ax+0.001
'+0.001
Neg logit contributions
 moral-0.001
 swollen-0.001
 pret-0.001
lights-0.001
 lamp-0.001
Token tant
Token ID24246
Feature activation+7.76e-02

Pos logit contributions
 moral+0.020
 exc+0.019
lights+0.019
 works+0.019
 pret+0.019
Neg logit contributions
eb-0.025
over-0.023
'-0.023
ucky-0.022
e-0.021
Tokenrum
Token ID6582
Feature activation+3.00e-02

Pos logit contributions
 moral+0.082
lights+0.075
 exc+0.074
 works+0.073
 swollen+0.073
Neg logit contributions
eb-0.163
over-0.152
'-0.145
e-0.144
ucky-0.139
Token.
Token ID13
Feature activation-2.60e-03

Pos logit contributions
 moral+0.051
lights+0.050
 exc+0.049
 swollen+0.049
 lamp+0.048
Neg logit contributions
eb-0.043
over-0.038
'-0.035
e-0.035
ax-0.034
Token
Token ID198
Feature activation-5.24e-03

Pos logit contributions
eb+0.004
over+0.004
e+0.003
'+0.003
ale+0.003
Neg logit contributions
 moral-0.004
 exc-0.004
lights-0.004
 swollen-0.004
 lamp-0.004
Token
Token ID198
Feature activation-1.91e-03

Pos logit contributions
eb+0.008
over+0.007
'+0.007
e+0.007
ale+0.006
Neg logit contributions
 moral-0.009
 exc-0.008
lights-0.008
 swollen-0.008
 works-0.008
TokenFinally
Token ID11158
Feature activation+3.13e-02

Pos logit contributions
eb+0.004
over+0.003
'+0.003
e+0.003
ucky+0.003
Neg logit contributions
lights-0.002
 pret-0.002
 works-0.002
 moral-0.002
bul-0.002
Token happy
Token ID3772
Feature activation-1.97e-02

Pos logit contributions
eb+0.007
over+0.007
'+0.006
e+0.006
ucky+0.006
Neg logit contributions
 moral-0.005
lights-0.004
 exc-0.004
 swollen-0.004
 works-0.004
Token.
Token ID13
Feature activation-3.62e-03

Pos logit contributions
eb+0.027
over+0.024
'+0.022
e+0.022
ucky+0.021
Neg logit contributions
 moral-0.035
lights-0.033
 exc-0.033
 swollen-0.033
 works-0.033
Token She
Token ID1375
Feature activation+1.71e-02

Pos logit contributions
eb+0.006
over+0.005
e+0.005
'+0.005
ale+0.005
Neg logit contributions
 moral-0.006
 exc-0.005
 lamp-0.005
lights-0.005
 works-0.005
Token put
Token ID1234
Feature activation+2.89e-04

Pos logit contributions
 moral+0.025
lights+0.025
 swollen+0.024
 exc+0.024
 poster+0.024
Neg logit contributions
eb-0.028
over-0.025
'-0.025
e-0.024
ale-0.022
Token the
Token ID262
Feature activation+2.13e-02

Pos logit contributions
 moral+0.000
 swollen+0.000
 works+0.000
 pret+0.000
 lamp+0.000
Neg logit contributions
eb-0.000
over-0.000
'-0.000
ive-0.000
ucky-0.000
Token er
Token ID1931
Feature activation+7.74e-02

Pos logit contributions
lights+0.028
 exc+0.024
bel+0.024
 pret+0.023
 Amen+0.023
Neg logit contributions
eb-0.038
'-0.037
ucky-0.036
e-0.035
over-0.035
Tokenaser
Token ID6005
Feature activation+3.87e-02

Pos logit contributions
 moral+0.114
 exc+0.110
lights+0.109
 pret+0.107
 works+0.106
Neg logit contributions
eb-0.130
'-0.117
over-0.117
e-0.115
ucky-0.114
Token in
Token ID287
Feature activation+1.08e-02

Pos logit contributions
 moral+0.078
lights+0.075
 exc+0.075
 swollen+0.074
 works+0.074
Neg logit contributions
eb-0.043
over-0.038
'-0.035
e-0.033
ucky-0.031
Token the
Token ID262
Feature activation+6.83e-03

Pos logit contributions
 moral+0.015
 swollen+0.014
 exc+0.014
 lamp+0.013
 pret+0.013
Neg logit contributions
eb-0.020
over-0.018
'-0.017
ive-0.017
ale-0.017
Token trash
Token ID13913
Feature activation+6.19e-04

Pos logit contributions
 moral+0.010
lights+0.010
 exc+0.010
 swollen+0.009
 works+0.009
Neg logit contributions
eb-0.011
over-0.010
'-0.010
e-0.010
ucky-0.009
Token and
Token ID290
Feature activation-7.37e-03

Pos logit contributions
 lamp+0.001
 works+0.001
 poster+0.001
lights+0.001
bul+0.001
Neg logit contributions
over-0.001
eb-0.001
e-0.001
ake-0.001
umble-0.001
Token had
Token ID550
Feature activation-3.32e-02

Pos logit contributions
eb+0.052
over+0.049
'+0.044
e+0.044
ax+0.043
Neg logit contributions
 moral-0.032
 works-0.031
 exc-0.031
lights-0.030
 swollen-0.029
Token become
Token ID1716
Feature activation-1.66e-02

Pos logit contributions
eb+0.054
over+0.051
'+0.047
e+0.045
ucky+0.043
Neg logit contributions
 moral-0.051
 exc-0.048
 works-0.047
 swollen-0.047
lights-0.046
Token.
Token ID13
Feature activation+4.62e-03

Pos logit contributions
eb+0.027
over+0.023
e+0.021
'+0.021
he+0.021
Neg logit contributions
 moral-0.027
 exc-0.026
 swollen-0.025
 works-0.025
lights-0.024
Token "
Token ID366
Feature activation+6.87e-03

Pos logit contributions
 moral+0.008
 exc+0.008
lights+0.008
 swollen+0.008
 works+0.008
Neg logit contributions
eb-0.006
over-0.006
'-0.005
ucky-0.005
ax-0.005
TokenWow
Token ID22017
Feature activation+2.81e-02

Pos logit contributions
 moral+0.009
lights+0.009
 works+0.009
 exc+0.009
 swollen+0.009
Neg logit contributions
eb-0.012
over-0.011
'-0.011
e-0.011
ucky-0.010
Token,
Token ID11
Feature activation+7.72e-02

Pos logit contributions
 moral+0.050
 works+0.048
 exc+0.048
 swollen+0.048
 pret+0.047
Neg logit contributions
eb-0.037
over-0.034
e-0.030
'-0.029
ale-0.028
Token Lily
Token ID20037
Feature activation-3.00e-03

Pos logit contributions
lights+0.130
 moral+0.129
 exc+0.123
bul+0.122
 swollen+0.121
Neg logit contributions
eb-0.108
'-0.095
over-0.093
ucky-0.092
e-0.089
Token!
Token ID0
Feature activation+5.95e-04

Pos logit contributions
eb+0.004
over+0.004
'+0.004
e+0.004
ucky+0.003
Neg logit contributions
 moral-0.005
 works-0.005
 swollen-0.005
 exc-0.005
 lamp-0.005
Token Your
Token ID3406
Feature activation-1.57e-02

Pos logit contributions
 moral+0.001
 exc+0.001
 works+0.001
 pret+0.001
 lamp+0.001
Neg logit contributions
eb-0.001
over-0.001
'-0.001
ale-0.001
e-0.001
Token bean
Token ID26394
Feature activation-8.50e-03

Pos logit contributions
eb+0.022
over+0.020
'+0.019
ucky+0.018
e+0.018
Neg logit contributions
 moral-0.027
lights-0.026
 swollen-0.026
 exc-0.026
 works-0.026
Token plant
Token ID4618
Feature activation-1.50e-02

Pos logit contributions
eb+0.016
over+0.015
'+0.014
e+0.014
ucky+0.013
Neg logit contributions
 moral-0.011
lights-0.010
 exc-0.010
 swollen-0.010
 pret-0.010
Token want
Token ID765
Feature activation-5.34e-03

Pos logit contributions
 moral+0.006
 exc+0.005
 works+0.005
lights+0.005
 swollen+0.005
Neg logit contributions
eb-0.005
over-0.004
'-0.004
e-0.004
ucky-0.004
Token to
Token ID284
Feature activation+1.66e-02

Pos logit contributions
eb+0.004
'+0.004
e+0.003
ucky+0.003
 home+0.003
Neg logit contributions
lights-0.012
 exc-0.012
bel-0.012
 pret-0.012
 swollen-0.012
Token play
Token ID711
Feature activation+5.14e-02

Pos logit contributions
 moral+0.031
lights+0.031
 swollen+0.029
 works+0.029
 pret+0.028
Neg logit contributions
eb-0.021
e-0.018
'-0.018
over-0.017
ucky-0.016
Token with
Token ID351
Feature activation+2.19e-02

Pos logit contributions
 swollen+0.096
lights+0.095
 moral+0.093
 exc+0.093
 works+0.090
Neg logit contributions
'-0.061
eb-0.060
e-0.058
over-0.057
he-0.050
Token the
Token ID262
Feature activation+2.48e-02

Pos logit contributions
 moral+0.037
 exc+0.035
lights+0.035
 swollen+0.034
 works+0.034
Neg logit contributions
eb-0.033
over-0.030
'-0.028
e-0.027
ucky-0.026
Token laser
Token ID12855
Feature activation+7.69e-02

Pos logit contributions
bel+0.028
lights+0.025
aj+0.025
 exc+0.024
 pret+0.023
Neg logit contributions
eb-0.045
ucky-0.045
'-0.045
ard-0.042
he-0.042
Token together
Token ID1978
Feature activation+3.04e-02

Pos logit contributions
 moral+0.084
 swollen+0.082
 exc+0.082
lights+0.077
 pret+0.072
Neg logit contributions
eb-0.152
'-0.147
over-0.143
e-0.138
ucky-0.134
Token?"
Token ID1701
Feature activation-2.13e-02

Pos logit contributions
 moral+0.016
 exc+0.012
lights+0.012
 works+0.011
 pret+0.011
Neg logit contributions
eb-0.080
over-0.076
e-0.071
ale-0.070
'-0.070
Token\n
Token ID198
Feature activation+1.60e-03

Pos logit contributions
eb+0.025
over+0.022
'+0.021
e+0.019
ucky+0.019
Neg logit contributions
 moral-0.041
lights-0.041
 swollen-0.041
 works-0.040
bul-0.040
Token\n
Token ID198
Feature activation+8.32e-04

Pos logit contributions
 moral+0.003
lights+0.003
 exc+0.003
 swollen+0.003
 works+0.003
Neg logit contributions
eb-0.002
over-0.002
'-0.002
e-0.002
ucky-0.002
TokenM
Token ID44
Feature activation-2.50e-02

Pos logit contributions
 moral+0.001
lights+0.001
 works+0.001
 pret+0.001
 exc+0.001
Neg logit contributions
eb-0.001
over-0.001
'-0.001
e-0.001
ucky-0.001
Token sad
Token ID6507
Feature activation+1.40e-02

Pos logit contributions
lights+0.029
 moral+0.028
bul+0.027
 lamp+0.027
 works+0.026
Neg logit contributions
eb-0.032
over-0.030
'-0.030
e-0.028
ucky-0.027
Token.
Token ID13
Feature activation-3.38e-03

Pos logit contributions
 moral+0.029
 exc+0.028
 swollen+0.028
lights+0.027
 works+0.027
Neg logit contributions
eb-0.014
over-0.013
'-0.013
ucky-0.011
e-0.011
Token She
Token ID1375
Feature activation+2.73e-02

Pos logit contributions
eb+0.005
over+0.005
e+0.005
'+0.004
ive+0.004
Neg logit contributions
 moral-0.006
 exc-0.005
 swollen-0.005
 works-0.005
lights-0.005
Token threw
Token ID9617
Feature activation+2.18e-06

Pos logit contributions
lights+0.041
 moral+0.040
 poster+0.039
 exc+0.038
 works+0.038
Neg logit contributions
eb-0.044
'-0.039
over-0.039
e-0.037
ucky-0.035
Token a
Token ID257
Feature activation+1.81e-02

Feature not active on previous token; no predictions were affected.
Token tant
Token ID24246
Feature activation+7.67e-02

Pos logit contributions
 moral+0.025
lights+0.025
 exc+0.024
 works+0.023
 pret+0.023
Neg logit contributions
eb-0.031
'-0.028
over-0.028
e-0.026
ucky-0.026
Tokenrum
Token ID6582
Feature activation+3.16e-02

Pos logit contributions
 moral+0.084
lights+0.076
 works+0.075
 exc+0.075
 pret+0.073
Neg logit contributions
eb-0.161
over-0.149
e-0.143
'-0.141
ucky-0.135
Token and
Token ID290
Feature activation-3.75e-03

Pos logit contributions
 moral+0.059
 exc+0.057
lights+0.056
 works+0.056
 swollen+0.056
Neg logit contributions
eb-0.040
over-0.037
'-0.034
e-0.032
ucky-0.031
Token cried
Token ID16896
Feature activation+1.96e-02

Pos logit contributions
eb+0.007
over+0.007
'+0.006
ucky+0.006
ive+0.006
Neg logit contributions
 swollen-0.005
 moral-0.005
 pret-0.004
 exc-0.004
 works-0.004
Token loudly
Token ID23112
Feature activation+3.78e-03

Pos logit contributions
 moral+0.029
 exc+0.027
 swollen+0.027
lights+0.027
 lamp+0.026
Neg logit contributions
eb-0.036
over-0.029
ax-0.029
e-0.028
ram-0.028
Token.
Token ID13
Feature activation-5.88e-03

Pos logit contributions
 moral+0.007
lights+0.007
 exc+0.007
 swollen+0.007
 works+0.006
Neg logit contributions
eb-0.005
over-0.004
'-0.004
e-0.004
ucky-0.004
Token her
Token ID607
Feature activation-9.98e-03

Pos logit contributions
 moral+0.025
 exc+0.023
lights+0.023
 swollen+0.023
 works+0.023
Neg logit contributions
eb-0.032
over-0.029
'-0.028
e-0.027
ucky-0.027
Token goal
Token ID3061
Feature activation-4.61e-03

Pos logit contributions
eb+0.013
'+0.013
over+0.012
e+0.012
day+0.012
Neg logit contributions
lights-0.017
 swollen-0.017
bel-0.016
bul-0.016
 exc-0.015
Token!
Token ID0
Feature activation-3.47e-03

Pos logit contributions
eb+0.007
over+0.006
ucky+0.006
e+0.005
'+0.005
Neg logit contributions
 moral-0.007
 works-0.007
lights-0.007
 lamp-0.007
 exc-0.007
Token<|endoftext|>
Token ID50256
Feature activation-1.64e-03

Pos logit contributions
eb+0.006
over+0.006
ale+0.005
e+0.005
ard+0.005
Neg logit contributions
 moral-0.005
 exc-0.004
lights-0.004
 swollen-0.004
 lamp-0.004
TokenJ
Token ID41
Feature activation+6.97e-04

Pos logit contributions
eb+0.003
ale+0.003
over+0.003
ram+0.003
ax+0.003
Neg logit contributions
 moral-0.001
 works-0.001
 lamp-0.001
 pret-0.001
 poster-0.001
Tokenimm
Token ID8608
Feature activation+7.63e-02

Pos logit contributions
 moral+0.001
aj+0.001
 works+0.001
lights+0.001
 poster+0.001
Neg logit contributions
over-0.001
eb-0.001
e-0.001
ard-0.001
'-0.001
Tokenie
Token ID494
Feature activation+2.16e-02

Pos logit contributions
 moral+0.141
 exc+0.139
 swollen+0.137
 pret+0.133
lights+0.132
Neg logit contributions
eb-0.103
'-0.086
over-0.084
e-0.081
ale-0.080
Token was
Token ID373
Feature activation+2.64e-03

Pos logit contributions
 moral+0.033
 swollen+0.031
lights+0.031
 exc+0.031
 pret+0.030
Neg logit contributions
eb-0.035
over-0.032
'-0.031
e-0.030
ard-0.028
Token scared
Token ID12008
Feature activation+1.76e-02

Pos logit contributions
 moral+0.004
lights+0.004
 works+0.004
 poster+0.004
 pret+0.004
Neg logit contributions
eb-0.004
over-0.004
e-0.004
'-0.004
ucky-0.003
Token.
Token ID13
Feature activation+8.27e-03

Pos logit contributions
 moral+0.028
lights+0.028
 lamp+0.027
 exc+0.026
bul+0.026
Neg logit contributions
eb-0.027
over-0.023
e-0.022
'-0.021
ax-0.020
Token He
Token ID679
Feature activation+1.62e-02

Pos logit contributions
 moral+0.012
 exc+0.011
lights+0.011
 lamp+0.011
 works+0.011
Neg logit contributions
eb-0.014
over-0.012
e-0.012
'-0.012
ale-0.011
Token gone
Token ID3750
Feature activation+2.09e-02

Pos logit contributions
 moral+0.012
lights+0.012
 works+0.012
bul+0.011
 poster+0.011
Neg logit contributions
eb-0.015
over-0.015
e-0.014
'-0.014
ax-0.014
Token.
Token ID13
Feature activation+1.02e-03

Pos logit contributions
 moral+0.034
 exc+0.033
lights+0.033
 works+0.033
 swollen+0.033
Neg logit contributions
eb-0.031
over-0.027
'-0.026
e-0.025
ucky-0.025
Token They
Token ID1119
Feature activation+3.00e-03

Pos logit contributions
 moral+0.002
 exc+0.002
lights+0.002
 pret+0.001
 swollen+0.001
Neg logit contributions
eb-0.002
over-0.001
'-0.001
e-0.001
ale-0.001
Token see
Token ID766
Feature activation+2.76e-02

Pos logit contributions
lights+0.006
 moral+0.006
 swollen+0.006
bul+0.006
web+0.005
Neg logit contributions
eb-0.003
e-0.003
'-0.003
he-0.003
over-0.003
Token their
Token ID511
Feature activation+4.26e-03

Pos logit contributions
 moral+0.047
 exc+0.044
lights+0.044
 swollen+0.044
 works+0.044
Neg logit contributions
eb-0.038
over-0.034
'-0.033
e-0.032
ucky-0.031
Token er
Token ID1931
Feature activation+7.62e-02

Pos logit contributions
 moral+0.004
 exc+0.004
lights+0.004
 swollen+0.004
 works+0.004
Neg logit contributions
eb-0.009
over-0.008
'-0.008
e-0.008
ucky-0.008
Tokenaser
Token ID6005
Feature activation+4.11e-02

Pos logit contributions
 moral+0.113
 exc+0.107
lights+0.102
 swollen+0.100
 pret+0.100
Neg logit contributions
eb-0.130
over-0.119
'-0.116
e-0.115
ucky-0.112
Token.
Token ID13
Feature activation-4.22e-03

Pos logit contributions
 moral+0.074
 exc+0.071
lights+0.070
 swollen+0.070
 works+0.069
Neg logit contributions
eb-0.055
over-0.049
'-0.046
e-0.044
ucky-0.042
Token It
Token ID632
Feature activation+6.42e-03

Pos logit contributions
eb+0.006
over+0.006
'+0.005
e+0.005
ale+0.005
Neg logit contributions
 moral-0.007
 exc-0.007
lights-0.007
 pret-0.007
 works-0.007
Token is
Token ID318
Feature activation+1.75e-03

Pos logit contributions
lights+0.012
bel+0.012
aj+0.011
lf+0.011
bul+0.011
Neg logit contributions
e-0.007
eb-0.007
'-0.006
's-0.006
al-0.006
Token smiling
Token ID16755
Feature activation+1.42e-02

Pos logit contributions
 moral+0.002
 works+0.002
 poster+0.002
lights+0.002
bel+0.002
Neg logit contributions
eb-0.003
over-0.003
e-0.003
'-0.003
ucky-0.003
Token want
Token ID765
Feature activation-3.15e-03

Pos logit contributions
 moral+0.009
 exc+0.009
lights+0.009
 swollen+0.009
 works+0.009
Neg logit contributions
eb-0.008
over-0.007
'-0.007
e-0.006
ucky-0.006
Token to
Token ID284
Feature activation+2.00e-02

Pos logit contributions
eb+0.003
'+0.002
e+0.002
ucky+0.002
over+0.002
Neg logit contributions
 exc-0.007
lights-0.007
 moral-0.007
 swollen-0.007
 works-0.007
Token play
Token ID711
Feature activation+4.93e-02

Pos logit contributions
 moral+0.035
lights+0.034
 swollen+0.033
 exc+0.033
 works+0.033
Neg logit contributions
eb-0.028
over-0.024
'-0.024
e-0.023
ucky-0.021
Token with
Token ID351
Feature activation+1.68e-02

Pos logit contributions
lights+0.093
 moral+0.092
 swollen+0.091
 exc+0.089
 works+0.087
Neg logit contributions
eb-0.060
'-0.059
e-0.059
over-0.058
he-0.049
Token my
Token ID616
Feature activation+7.56e-03

Pos logit contributions
 moral+0.025
 lamp+0.024
 swollen+0.024
lights+0.023
 pret+0.023
Neg logit contributions
eb-0.027
over-0.026
'-0.024
ive-0.022
ump-0.022
Token laser
Token ID12855
Feature activation+7.61e-02

Pos logit contributions
lights+0.010
 moral+0.009
bel+0.009
bul+0.009
 exc+0.009
Neg logit contributions
eb-0.013
over-0.013
'-0.012
e-0.012
 home-0.012
Token?
Token ID30
Feature activation-7.31e-04

Pos logit contributions
 moral+0.089
 exc+0.087
 swollen+0.083
lights+0.083
 amused+0.080
Neg logit contributions
eb-0.144
'-0.140
over-0.137
e-0.131
ucky-0.126
Token We
Token ID775
Feature activation-2.15e-02

Pos logit contributions
eb+0.001
over+0.001
'+0.001
e+0.001
ucky+0.001
Neg logit contributions
 moral-0.001
lights-0.001
 swollen-0.001
 works-0.001
 exc-0.001
Token can
Token ID460
Feature activation+8.38e-03

Pos logit contributions
eb+0.037
over+0.035
'+0.030
ucky+0.030
e+0.030
Neg logit contributions
 moral-0.032
 exc-0.030
 works-0.029
 pret-0.029
 lamp-0.028
Token be
Token ID307
Feature activation+1.52e-02

Pos logit contributions
 moral+0.014
lights+0.013
 exc+0.013
 swollen+0.013
 works+0.013
Neg logit contributions
eb-0.012
over-0.011
'-0.011
e-0.010
ucky-0.010
Token a
Token ID257
Feature activation+1.75e-02

Pos logit contributions
 moral+0.022
lights+0.022
bul+0.021
 swollen+0.020
 works+0.020
Neg logit contributions
eb-0.024
over-0.024
e-0.023
'-0.022
aw-0.021
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
 moral+0.030
 exc+0.028
lights+0.028
 swollen+0.027
 works+0.027
Neg logit contributions
eb-0.028
over-0.025
'-0.023
e-0.022
ale-0.022
Token Can
Token ID1680
Feature activation-6.22e-03

Pos logit contributions
 moral+0.018
lights+0.017
 exc+0.017
 swollen+0.017
 works+0.016
Neg logit contributions
eb-0.022
over-0.020
'-0.019
e-0.019
ucky-0.018
Token I
Token ID314
Feature activation+3.84e-03

Pos logit contributions
eb+0.008
'+0.008
e+0.007
over+0.007
ucky+0.007
Neg logit contributions
lights-0.011
 pret-0.011
 moral-0.011
 works-0.011
 swollen-0.010
Token share
Token ID2648
Feature activation+1.02e-02

Pos logit contributions
 moral+0.006
 exc+0.006
lights+0.006
 swollen+0.006
 works+0.006
Neg logit contributions
eb-0.006
over-0.006
'-0.005
e-0.005
ucky-0.005
Token your
Token ID534
Feature activation+2.29e-02

Pos logit contributions
 moral+0.016
 exc+0.015
lights+0.015
 swollen+0.015
 works+0.015
Neg logit contributions
eb-0.016
'-0.015
e-0.014
over-0.014
ucky-0.013
Token tube
Token ID12403
Feature activation+7.61e-02

Pos logit contributions
 Amen+0.045
 firefighter+0.044
 amen+0.044
bel+0.043
 Lincoln+0.042
Neg logit contributions
 home-0.024
'-0.023
ucky-0.022
ay-0.018
ee-0.017
Token?"
Token ID1701
Feature activation-1.52e-02

Pos logit contributions
 swollen+0.147
 moral+0.143
 pret+0.141
 appearances+0.139
 satellites+0.136
Neg logit contributions
eb-0.088
'-0.085
over-0.082
he-0.079
e-0.077
Token\n
Token ID198
Feature activation+6.43e-03

Pos logit contributions
eb+0.023
over+0.021
e+0.020
'+0.019
ale+0.019
Neg logit contributions
 moral-0.025
 exc-0.023
 swollen-0.023
 pret-0.023
lights-0.023
Token\n
Token ID198
Feature activation+3.68e-03

Pos logit contributions
 moral+0.011
 exc+0.010
 swollen+0.010
lights+0.010
 works+0.010
Neg logit contributions
eb-0.009
over-0.009
'-0.008
e-0.008
ale-0.007
TokenBen
Token ID11696
Feature activation-8.64e-03

Pos logit contributions
 moral+0.006
 swollen+0.006
 exc+0.006
lights+0.006
 pret+0.006
Neg logit contributions
eb-0.005
over-0.005
'-0.004
e-0.004
ale-0.004
Token smiles
Token ID21845
Feature activation+2.11e-02

Pos logit contributions
'+0.014
eb+0.014
aw+0.013
ax+0.012
e+0.012
Neg logit contributions
bel-0.014
lights-0.014
bul-0.013
 swollen-0.013
lf-0.012
Token say
Token ID910
Feature activation+1.20e-03

Pos logit contributions
 moral+0.019
 exc+0.018
 works+0.018
lights+0.017
 pret+0.017
Neg logit contributions
eb-0.020
over-0.018
'-0.017
e-0.016
ucky-0.016
Token,
Token ID11
Feature activation+2.25e-02

Pos logit contributions
 moral+0.002
lights+0.002
 exc+0.002
 swollen+0.002
 works+0.002
Neg logit contributions
eb-0.002
over-0.001
'-0.001
e-0.001
ucky-0.001
Token "
Token ID366
Feature activation-2.06e-03

Pos logit contributions
 moral+0.025
lights+0.024
 swollen+0.024
 works+0.023
 exc+0.023
Neg logit contributions
eb-0.044
over-0.042
'-0.042
e-0.038
ucky-0.038
TokenStop
Token ID19485
Feature activation+3.63e-02

Pos logit contributions
eb+0.004
over+0.004
'+0.003
e+0.003
ucky+0.003
Neg logit contributions
 moral-0.003
 works-0.002
 exc-0.002
 swollen-0.002
bel-0.002
Token writing
Token ID3597
Feature activation+2.28e-02

Pos logit contributions
 moral+0.067
 exc+0.064
lights+0.064
 swollen+0.063
 works+0.063
Neg logit contributions
eb-0.047
over-0.042
'-0.039
e-0.038
ucky-0.036
Token everywhere
Token ID8347
Feature activation+7.60e-02

Pos logit contributions
 moral+0.043
 exc+0.041
lights+0.041
 swollen+0.041
 works+0.040
Neg logit contributions
eb-0.028
over-0.026
'-0.024
e-0.023
ucky-0.022
Token!
Token ID0
Feature activation+2.77e-02

Pos logit contributions
 moral+0.130
 exc+0.124
lights+0.124
 works+0.122
 pret+0.121
Neg logit contributions
eb-0.110
over-0.099
'-0.089
e-0.086
ale-0.084
Token It
Token ID632
Feature activation+1.18e-02

Pos logit contributions
 moral+0.039
 exc+0.037
 pret+0.036
 lamp+0.035
 works+0.035
Neg logit contributions
eb-0.048
over-0.046
'-0.041
ale-0.041
ax-0.039
Token is
Token ID318
Feature activation-4.03e-03

Pos logit contributions
lights+0.019
 moral+0.019
 swollen+0.018
bel+0.018
 exc+0.017
Neg logit contributions
eb-0.019
e-0.017
'-0.017
ee-0.015
over-0.015
Token messy
Token ID24097
Feature activation+2.73e-02

Pos logit contributions
eb+0.006
over+0.006
e+0.006
'+0.006
ucky+0.005
Neg logit contributions
lights-0.006
 moral-0.006
bel-0.006
 exc-0.005
 swollen-0.005
Token and
Token ID290
Feature activation+2.79e-02

Pos logit contributions
lights+0.042
 moral+0.042
 works+0.041
 pret+0.040
 lamp+0.040
Neg logit contributions
eb-0.043
over-0.037
'-0.035
e-0.035
ax-0.034
Token.
Token ID13
Feature activation-5.30e-03

Pos logit contributions
eb+0.021
over+0.019
e+0.017
'+0.017
ucky+0.016
Neg logit contributions
 moral-0.026
lights-0.024
 exc-0.024
 swollen-0.024
 works-0.024
Token He
Token ID679
Feature activation+1.32e-02

Pos logit contributions
eb+0.008
over+0.008
'+0.007
e+0.007
ucky+0.007
Neg logit contributions
 moral-0.008
lights-0.008
 exc-0.008
 swollen-0.008
 works-0.008
Token kept
Token ID4030
Feature activation+2.21e-02

Pos logit contributions
lights+0.020
 moral+0.020
 swollen+0.019
 poster+0.018
 exc+0.018
Neg logit contributions
eb-0.022
over-0.019
'-0.019
e-0.018
aw-0.017
Token throwing
Token ID9644
Feature activation+3.65e-03

Pos logit contributions
 moral+0.026
 exc+0.025
 swollen+0.024
 pret+0.024
 works+0.024
Neg logit contributions
eb-0.044
over-0.041
'-0.039
ucky-0.038
e-0.037
Token a
Token ID257
Feature activation+1.36e-02

Pos logit contributions
 moral+0.005
 pret+0.005
 works+0.004
 swollen+0.004
 exc+0.004
Neg logit contributions
eb-0.007
over-0.006
'-0.006
ake-0.006
ale-0.006
Token tant
Token ID24246
Feature activation+7.59e-02

Pos logit contributions
 moral+0.019
lights+0.019
 exc+0.018
 works+0.018
 swollen+0.018
Neg logit contributions
eb-0.023
over-0.021
'-0.021
ucky-0.019
e-0.019
Tokenrum
Token ID6582
Feature activation+3.60e-02

Pos logit contributions
 moral+0.081
lights+0.074
 exc+0.073
 works+0.073
 swollen+0.072
Neg logit contributions
eb-0.161
over-0.149
e-0.143
'-0.142
ucky-0.135
Token.
Token ID13
Feature activation-7.23e-03

Pos logit contributions
 moral+0.065
lights+0.062
 exc+0.061
 swollen+0.060
 works+0.060
Neg logit contributions
eb-0.049
over-0.043
'-0.040
e-0.039
ucky-0.037
Token His
Token ID2399
Feature activation-8.92e-03

Pos logit contributions
eb+0.011
over+0.010
'+0.010
e+0.009
ucky+0.009
Neg logit contributions
 moral-0.011
lights-0.011
 exc-0.011
 swollen-0.011
 works-0.011
Token mom
Token ID1995
Feature activation+6.89e-03

Pos logit contributions
eb+0.014
over+0.013
'+0.012
 home+0.012
e+0.012
Neg logit contributions
 moral-0.013
 works-0.013
lights-0.012
bul-0.012
 pret-0.012
Token was
Token ID373
Feature activation+1.40e-02

Pos logit contributions
 moral+0.008
 exc+0.007
 works+0.007
 swollen+0.007
 pret+0.007
Neg logit contributions
eb-0.014
over-0.013
ucky-0.012
ale-0.012
'-0.012
Token back
Token ID736
Feature activation-5.07e-03

Pos logit contributions
eb+0.013
over+0.012
'+0.011
ucky+0.010
e+0.010
Neg logit contributions
 moral-0.017
 exc-0.016
 works-0.016
 pret-0.016
lights-0.016
Token to
Token ID284
Feature activation+5.57e-03

Pos logit contributions
eb+0.006
over+0.006
'+0.005
e+0.005
ucky+0.005
Neg logit contributions
 moral-0.010
 exc-0.009
 works-0.009
lights-0.009
 pret-0.009
Token the
Token ID262
Feature activation-5.26e-03

Pos logit contributions
 moral+0.009
lights+0.009
 exc+0.009
 swollen+0.009
 works+0.009
Neg logit contributions
eb-0.008
over-0.008
'-0.007
e-0.007
ucky-0.007
Token pond
Token ID16723
Feature activation-1.38e-02

Pos logit contributions
eb+0.007
over+0.006
'+0.006
ucky+0.006
e+0.006
Neg logit contributions
lights-0.009
 moral-0.009
 exc-0.009
 swollen-0.009
 works-0.009
Token the
Token ID262
Feature activation-2.38e-02

Pos logit contributions
eb+0.021
over+0.019
e+0.017
ucky+0.017
'+0.017
Neg logit contributions
 moral-0.022
 exc-0.021
lights-0.021
 works-0.020
bel-0.020
Token next
Token ID1306
Feature activation-8.37e-02

Pos logit contributions
eb+0.033
over+0.031
'+0.030
e+0.027
ucky+0.027
Neg logit contributions
lights-0.039
 moral-0.038
 works-0.038
 swollen-0.038
 exc-0.037
Token day
Token ID1110
Feature activation-2.06e-02

Pos logit contributions
eb+0.106
over+0.093
'+0.085
e+0.084
ucky+0.080
Neg logit contributions
 moral-0.161
 exc-0.151
lights-0.150
 works-0.147
 poster-0.145
Token,
Token ID11
Feature activation+3.07e-03

Pos logit contributions
eb+0.039
ream+0.033
ench+0.033
he+0.033
atter+0.033
Neg logit contributions
 pretending-0.022
 moral-0.021
 exc-0.020
lights-0.019
 works-0.018
Token hoping
Token ID7725
Feature activation-6.60e-03

Pos logit contributions
 moral+0.005
lights+0.005
 exc+0.005
 swollen+0.005
 works+0.005
Neg logit contributions
eb-0.005
over-0.004
'-0.004
e-0.004
ucky-0.004
Token that
Token ID326
Feature activation+3.84e-03

Pos logit contributions
eb+0.011
over+0.010
'+0.009
e+0.009
ucky+0.009
Neg logit contributions
 moral-0.010
 exc-0.010
lights-0.010
 lamp-0.009
 swollen-0.009
Token the
Token ID262
Feature activation-1.03e-02

Pos logit contributions
 moral+0.006
 exc+0.006
lights+0.006
 swollen+0.006
 works+0.006
Neg logit contributions
eb-0.006
over-0.006
'-0.005
e-0.005
ucky-0.005
Token the
Token ID262
Feature activation-2.12e-02

Pos logit contributions
eb+0.042
over+0.039
'+0.036
ucky+0.036
e+0.035
Neg logit contributions
 moral-0.042
 exc-0.039
 swollen-0.039
 pret-0.037
 works-0.037
Token rabbit
Token ID22746
Feature activation-1.91e-02

Pos logit contributions
eb+0.040
over+0.036
e+0.035
'+0.033
ucky+0.033
Neg logit contributions
 moral-0.030
 exc-0.026
 swollen-0.026
 works-0.025
 lamp-0.025
Token met
Token ID1138
Feature activation-8.33e-03

Pos logit contributions
eb+0.033
over+0.031
ucky+0.029
ax+0.027
ram+0.027
Neg logit contributions
 moral-0.027
 exc-0.025
 works-0.024
 swollen-0.024
lights-0.023
Token again
Token ID757
Feature activation-2.17e-02

Pos logit contributions
eb+0.016
over+0.013
ram+0.013
ucky+0.013
ax+0.013
Neg logit contributions
 moral-0.011
 exc-0.010
 forgiving-0.010
 works-0.010
 pret-0.009
Token the
Token ID262
Feature activation-2.77e-02

Pos logit contributions
eb+0.033
over+0.028
ucky+0.027
ax+0.026
'+0.025
Neg logit contributions
 moral-0.037
 exc-0.034
 works-0.033
 pret-0.033
lights-0.032
Token next
Token ID1306
Feature activation-8.02e-02

Pos logit contributions
eb+0.053
over+0.049
'+0.046
e+0.046
ucky+0.044
Neg logit contributions
 moral-0.035
 exc-0.032
lights-0.032
 swollen-0.031
 works-0.031
Token summer
Token ID3931
Feature activation-3.08e-02

Pos logit contributions
eb+0.113
over+0.098
ucky+0.092
e+0.092
'+0.090
Neg logit contributions
 moral-0.146
lights-0.131
 exc-0.131
 works-0.131
 poster-0.131
Token and
Token ID290
Feature activation-1.32e-02

Pos logit contributions
eb+0.051
e+0.049
ucky+0.047
atter+0.045
ifier+0.045
Neg logit contributions
lights-0.043
men-0.041
 works-0.039
 moral-0.039
bul-0.035
Token the
Token ID262
Feature activation-2.25e-02

Pos logit contributions
eb+0.024
over+0.023
ucky+0.022
ram+0.021
ump+0.021
Neg logit contributions
 moral-0.017
 exc-0.016
bel-0.016
 works-0.016
 forgiving-0.015
Token crazy
Token ID7165
Feature activation-1.01e-02

Pos logit contributions
eb+0.043
over+0.038
e+0.037
'+0.035
ucky+0.035
Neg logit contributions
 moral-0.033
 exc-0.028
 swollen-0.028
 works-0.027
 lamp-0.026
Token farmer
Token ID18739
Feature activation-2.81e-02

Pos logit contributions
eb+0.019
e+0.016
over+0.016
ax+0.016
he+0.015
Neg logit contributions
 moral-0.015
 swollen-0.013
lights-0.013
 exc-0.013
 works-0.012
Token able
Token ID1498
Feature activation-5.02e-02

Pos logit contributions
eb+0.001
over+0.001
'+0.001
ale+0.001
ucky+0.001
Neg logit contributions
 moral-0.000
 swollen-0.000
 exc-0.000
lights-0.000
 works-0.000
Token to
Token ID284
Feature activation+1.22e-03

Pos logit contributions
eb+0.070
over+0.062
'+0.056
e+0.056
ucky+0.053
Neg logit contributions
 moral-0.089
 swollen-0.085
 exc-0.085
lights-0.084
 works-0.083
Token catch
Token ID4929
Feature activation-1.54e-02

Pos logit contributions
 moral+0.002
 exc+0.002
 swollen+0.002
 works+0.002
lights+0.002
Neg logit contributions
eb-0.002
over-0.002
'-0.002
e-0.002
ucky-0.002
Token the
Token ID262
Feature activation-1.18e-02

Pos logit contributions
eb+0.025
over+0.023
'+0.022
e+0.021
ucky+0.020
Neg logit contributions
 moral-0.023
 exc-0.021
lights-0.021
 swollen-0.021
 works-0.021
Token rabbit
Token ID22746
Feature activation-9.43e-03

Pos logit contributions
eb+0.020
over+0.019
'+0.018
e+0.017
ucky+0.017
Neg logit contributions
 moral-0.017
lights-0.016
 exc-0.016
 swollen-0.015
 works-0.015
Token next
Token ID1306
Feature activation-7.78e-02

Pos logit contributions
eb+0.014
over+0.012
'+0.011
e+0.011
ucky+0.011
Neg logit contributions
 moral-0.016
 exc-0.015
 works-0.015
lights-0.015
 pret-0.015
Token time
Token ID640
Feature activation-1.40e-02

Pos logit contributions
eb+0.105
over+0.092
ucky+0.084
'+0.083
e+0.083
Neg logit contributions
 moral-0.142
lights-0.130
 poster-0.129
bul-0.129
 works-0.129
Token.
Token ID13
Feature activation-1.45e-02

Pos logit contributions
eb+0.025
over+0.022
e+0.021
ive+0.021
ucky+0.020
Neg logit contributions
lights-0.017
 moral-0.017
 exc-0.017
 works-0.016
 swollen-0.016
Token She
Token ID1375
Feature activation+1.53e-02

Pos logit contributions
eb+0.023
over+0.021
'+0.020
e+0.020
ale+0.019
Neg logit contributions
 moral-0.023
 exc-0.021
lights-0.020
 pret-0.020
 lamp-0.020
Token was
Token ID373
Feature activation+1.40e-03

Pos logit contributions
lights+0.022
 moral+0.022
 swollen+0.022
 exc+0.021
 works+0.021
Neg logit contributions
eb-0.025
'-0.023
over-0.022
e-0.022
ax-0.020
Token now
Token ID783
Feature activation+1.93e-03

Pos logit contributions
 moral+0.002
 exc+0.002
 swollen+0.002
lights+0.002
 works+0.002
Neg logit contributions
eb-0.002
over-0.002
'-0.002
e-0.002
ucky-0.002
Token returned
Token ID4504
Feature activation-2.69e-02

Pos logit contributions
lights+0.028
 swollen+0.026
 moral+0.026
 appearances+0.025
 lamp+0.025
Neg logit contributions
eb-0.027
'-0.025
over-0.023
e-0.022
ax-0.021
Token the
Token ID262
Feature activation-1.38e-02

Pos logit contributions
eb+0.042
over+0.040
'+0.037
ucky+0.036
e+0.035
Neg logit contributions
 moral-0.042
lights-0.039
 exc-0.039
bel-0.039
 swollen-0.039
Token pitch
Token ID7078
Feature activation-1.87e-02

Pos logit contributions
eb+0.024
over+0.022
'+0.021
e+0.020
ucky+0.020
Neg logit contributions
 moral-0.020
 exc-0.019
lights-0.018
 swollen-0.018
 works-0.018
Tokenfork
Token ID32523
Feature activation-1.22e-02

Pos logit contributions
eb+0.030
over+0.027
ucky+0.025
e+0.024
ax+0.024
Neg logit contributions
 moral-0.028
 exc-0.026
 lamp-0.026
lights-0.026
bel-0.026
Token the
Token ID262
Feature activation-2.74e-02

Pos logit contributions
eb+0.020
over+0.019
ucky+0.018
e+0.017
ax+0.016
Neg logit contributions
 moral-0.017
 works-0.016
bul-0.016
bel-0.016
 exc-0.016
Token next
Token ID1306
Feature activation-7.77e-02

Pos logit contributions
eb+0.049
over+0.041
e+0.040
ax+0.038
ucky+0.038
Neg logit contributions
 moral-0.044
 exc-0.039
 poster-0.038
 swollen-0.037
 works-0.037
Token day
Token ID1110
Feature activation-2.50e-02

Pos logit contributions
eb+0.112
over+0.096
e+0.092
ucky+0.089
ax+0.088
Neg logit contributions
 moral-0.139
 exc-0.128
 poster-0.125
 works-0.124
lights-0.124
Token and
Token ID290
Feature activation-5.26e-03

Pos logit contributions
eb+0.051
ench+0.045
ucky+0.045
ream+0.044
ale+0.044
Neg logit contributions
 pretending-0.025
 moral-0.020
 works-0.019
men-0.019
 lamp-0.019
Token said
Token ID531
Feature activation-1.11e-02

Pos logit contributions
eb+0.009
over+0.009
ucky+0.008
e+0.008
'+0.008
Neg logit contributions
 moral-0.007
 works-0.007
 exc-0.007
 pret-0.007
 swollen-0.006
Token,
Token ID11
Feature activation+6.88e-03

Pos logit contributions
eb+0.013
over+0.011
'+0.010
e+0.010
ucky+0.009
Neg logit contributions
 moral-0.023
lights-0.022
 exc-0.022
 swollen-0.021
 works-0.021
Token "
Token ID366
Feature activation-1.15e-02

Pos logit contributions
lights+0.008
 moral+0.008
 swollen+0.008
 works+0.008
 exc+0.007
Neg logit contributions
eb-0.013
'-0.013
over-0.012
e-0.011
ucky-0.011
Token back
Token ID736
Feature activation-2.10e-02

Pos logit contributions
eb+0.002
over+0.001
'+0.001
e+0.001
ucky+0.001
Neg logit contributions
 moral-0.003
 swollen-0.002
lights-0.002
 exc-0.002
 works-0.002
Token to
Token ID284
Feature activation+4.68e-03

Pos logit contributions
eb+0.027
over+0.025
'+0.022
ucky+0.021
e+0.021
Neg logit contributions
 moral-0.039
 exc-0.037
lights-0.037
 works-0.036
 pret-0.035
Token the
Token ID262
Feature activation+6.26e-05

Pos logit contributions
 moral+0.007
 exc+0.006
 pret+0.006
 works+0.006
 lamp+0.006
Neg logit contributions
eb-0.008
over-0.008
'-0.007
ucky-0.007
ive-0.007
Token pond
Token ID16723
Feature activation-1.01e-02

Pos logit contributions
 moral+0.000
 exc+0.000
lights+0.000
 swollen+0.000
 works+0.000
Neg logit contributions
eb-0.000
over-0.000
'-0.000
e-0.000
ucky-0.000
Token the
Token ID262
Feature activation-1.51e-02

Pos logit contributions
eb+0.015
over+0.013
'+0.012
e+0.012
ucky+0.012
Neg logit contributions
 moral-0.017
 exc-0.016
lights-0.016
 works-0.016
 swollen-0.016
Token next
Token ID1306
Feature activation-7.77e-02

Pos logit contributions
eb+0.022
over+0.020
'+0.019
e+0.018
ucky+0.018
Neg logit contributions
 moral-0.025
lights-0.024
 exc-0.024
 swollen-0.024
 works-0.023
Token day
Token ID1110
Feature activation-1.57e-02

Pos logit contributions
eb+0.104
over+0.093
'+0.085
e+0.083
ucky+0.080
Neg logit contributions
 moral-0.144
 exc-0.134
lights-0.133
 works-0.131
 poster-0.130
Token.
Token ID13
Feature activation-4.73e-03

Pos logit contributions
eb+0.032
ench+0.030
atter+0.030
ream+0.030
ucky+0.028
Neg logit contributions
 pretending-0.012
lights-0.012
 moral-0.011
men-0.011
 rockets-0.010
Token They
Token ID1119
Feature activation+3.72e-03

Pos logit contributions
eb+0.008
over+0.007
e+0.007
'+0.007
ale+0.006
Neg logit contributions
 moral-0.007
 exc-0.007
lights-0.007
 pret-0.007
 works-0.006
Token were
Token ID547
Feature activation-4.15e-03

Pos logit contributions
 moral+0.006
 exc+0.005
 works+0.005
 pret+0.005
lights+0.005
Neg logit contributions
over-0.006
eb-0.006
ucky-0.005
'-0.005
e-0.005
Token also
Token ID635
Feature activation-1.46e-02

Pos logit contributions
eb+0.007
over+0.006
'+0.006
ale+0.006
ucky+0.006
Neg logit contributions
 moral-0.006
 exc-0.006
 swollen-0.006
lights-0.006
 pret-0.006
Token happy
Token ID3772
Feature activation-2.27e-02

Pos logit contributions
eb+0.018
over+0.016
'+0.016
e+0.015
ucky+0.015
Neg logit contributions
 moral-0.009
 exc-0.009
 swollen-0.008
 works-0.008
lights-0.008
Token that
Token ID326
Feature activation-1.32e-03

Pos logit contributions
eb+0.030
over+0.027
'+0.025
e+0.024
ucky+0.023
Neg logit contributions
 moral-0.041
 exc-0.039
lights-0.039
 swollen-0.039
 works-0.039
Token it
Token ID340
Feature activation-1.08e-03

Pos logit contributions
eb+0.002
over+0.002
ucky+0.002
atter+0.002
ale+0.002
Neg logit contributions
 moral-0.002
 exc-0.002
 works-0.002
 forgiving-0.002
 swollen-0.002
Token produced
Token ID4635
Feature activation-3.22e-02

Pos logit contributions
eb+0.002
over+0.001
'+0.001
e+0.001
ard+0.001
Neg logit contributions
 moral-0.002
lights-0.002
 swollen-0.002
 exc-0.002
 pret-0.002
Token more
Token ID517
Feature activation-2.44e-02

Pos logit contributions
eb+0.060
over+0.056
'+0.052
e+0.051
ram+0.049
Neg logit contributions
 moral-0.043
 exc-0.038
 swollen-0.038
 works-0.038
lights-0.037
Token av
Token ID1196
Feature activation-7.72e-02

Pos logit contributions
eb+0.045
over+0.041
e+0.039
'+0.038
ive+0.036
Neg logit contributions
 moral-0.034
 exc-0.031
 works-0.031
lights-0.031
 swollen-0.030
Tokenoc
Token ID420
Feature activation-1.65e-02

Pos logit contributions
eb+0.146
over+0.144
ax+0.144
 Sandy+0.140
 Gus+0.140
Neg logit contributions
 moral-0.063
 works-0.062
bul-0.061
 swollen-0.061
bel-0.057
Tokenados
Token ID22484
Feature activation-2.60e-02

Pos logit contributions
eb+0.027
over+0.027
'+0.025
e+0.025
he+0.024
Neg logit contributions
 moral-0.023
 swollen-0.021
 works-0.021
lights-0.021
bul-0.020
Token than
Token ID621
Feature activation-1.28e-02

Pos logit contributions
eb+0.044
over+0.040
ucky+0.037
e+0.037
ale+0.036
Neg logit contributions
 works-0.036
 moral-0.035
 exc-0.035
bul-0.034
lights-0.034
Token ever
Token ID1683
Feature activation-1.29e-02

Pos logit contributions
eb+0.022
over+0.021
'+0.018
ucky+0.018
ive+0.018
Neg logit contributions
 moral-0.019
 exc-0.018
 works-0.017
 swollen-0.017
bul-0.017
Token before
Token ID878
Feature activation-3.91e-02

Pos logit contributions
eb+0.022
over+0.021
ucky+0.019
'+0.018
atter+0.018
Neg logit contributions
 moral-0.019
 works-0.018
 pret-0.017
 exc-0.017
 pretending-0.016
Token tree
Token ID5509
Feature activation-1.14e-02

Pos logit contributions
eb+0.026
over+0.024
e+0.022
ax+0.022
ucky+0.022
Neg logit contributions
 moral-0.021
 exc-0.020
 works-0.020
 swollen-0.020
lights-0.020
Token could
Token ID714
Feature activation-1.90e-02

Pos logit contributions
eb+0.020
over+0.018
'+0.016
ucky+0.016
e+0.016
Neg logit contributions
 moral-0.016
 exc-0.015
lights-0.015
 swollen-0.015
 works-0.015
Token see
Token ID766
Feature activation+1.61e-04

Pos logit contributions
eb+0.028
over+0.025
'+0.024
e+0.024
ucky+0.022
Neg logit contributions
 moral-0.031
lights-0.030
 pret-0.029
 swollen-0.029
 works-0.029
Token lots
Token ID6041
Feature activation-1.58e-02

Pos logit contributions
 moral+0.000
 exc+0.000
 works+0.000
lights+0.000
bul+0.000
Neg logit contributions
eb-0.000
over-0.000
'-0.000
e-0.000
ucky-0.000
Token of
Token ID286
Feature activation-7.31e-03

Pos logit contributions
eb+0.025
over+0.023
e+0.021
'+0.020
day+0.019
Neg logit contributions
 moral-0.025
lights-0.024
 exc-0.024
 works-0.023
bul-0.023
Token av
Token ID1196
Feature activation-7.71e-02

Pos logit contributions
eb+0.014
over+0.013
'+0.012
e+0.012
ucky+0.012
Neg logit contributions
 moral-0.009
 exc-0.009
lights-0.008
 swollen-0.008
 works-0.008
Tokenoc
Token ID420
Feature activation-1.53e-02

Pos logit contributions
over+0.137
eb+0.137
ete+0.135
ax+0.130
 Gus+0.129
Neg logit contributions
 swollen-0.078
 works-0.075
 moral-0.074
ller-0.069
bul-0.067
Tokenados
Token ID22484
Feature activation-2.58e-02

Pos logit contributions
eb+0.026
over+0.026
'+0.024
e+0.023
he+0.023
Neg logit contributions
 moral-0.020
 swollen-0.019
 works-0.019
lights-0.018
 robots-0.018
Token that
Token ID326
Feature activation-1.51e-02

Pos logit contributions
eb+0.042
over+0.038
e+0.035
ucky+0.034
ale+0.033
Neg logit contributions
 moral-0.038
 works-0.038
 exc-0.038
lights-0.036
 poster-0.035
Token the
Token ID262
Feature activation-2.52e-03

Pos logit contributions
over+0.028
eb+0.028
ucky+0.025
ive+0.024
ale+0.024
Neg logit contributions
 exc-0.019
 works-0.019
 moral-0.018
 gen-0.017
bul-0.017
Token supplier
Token ID22693
Feature activation+7.21e-03

Pos logit contributions
eb+0.004
over+0.004
e+0.004
'+0.004
ucky+0.003
Neg logit contributions
 moral-0.004
 exc-0.004
 swollen-0.003
 works-0.003
 lamp-0.003
Token the
Token ID262
Feature activation-4.18e-03

Pos logit contributions
eb+0.007
over+0.007
'+0.006
e+0.006
ucky+0.006
Neg logit contributions
 moral-0.007
lights-0.006
 exc-0.006
 swollen-0.006
 works-0.006
Token shapes
Token ID15268
Feature activation-1.61e-03

Pos logit contributions
eb+0.006
over+0.006
'+0.006
ucky+0.005
e+0.005
Neg logit contributions
lights-0.006
 pret-0.006
bul-0.006
 swollen-0.006
 moral-0.006
Token she
Token ID673
Feature activation-1.58e-02

Pos logit contributions
eb+0.002
over+0.002
'+0.002
e+0.002
ucky+0.002
Neg logit contributions
 moral-0.003
lights-0.003
 exc-0.003
 swollen-0.003
 works-0.003
Token had
Token ID550
Feature activation-3.95e-02

Pos logit contributions
eb+0.027
over+0.025
'+0.023
ucky+0.023
ale+0.022
Neg logit contributions
 moral-0.023
 exc-0.022
 works-0.022
 pret-0.021
 swollen-0.021
Token been
Token ID587
Feature activation-1.19e-02

Pos logit contributions
eb+0.057
over+0.051
'+0.047
ucky+0.046
e+0.044
Neg logit contributions
 moral-0.066
 exc-0.064
 works-0.063
lights-0.062
 swollen-0.062
Token able
Token ID1498
Feature activation-7.68e-02

Pos logit contributions
eb+0.021
over+0.018
ucky+0.017
ale+0.017
'+0.017
Neg logit contributions
 moral-0.016
 swollen-0.016
 exc-0.016
 works-0.015
lights-0.015
Token to
Token ID284
Feature activation-8.72e-03

Pos logit contributions
eb+0.109
over+0.095
'+0.086
e+0.086
ucky+0.082
Neg logit contributions
 moral-0.135
 exc-0.130
 swollen-0.127
 works-0.125
lights-0.124
Token make
Token ID787
Feature activation-1.06e-02

Pos logit contributions
eb+0.015
over+0.014
aw+0.013
ucky+0.013
ram+0.013
Neg logit contributions
 exc-0.012
 moral-0.012
 works-0.011
 lamp-0.011
 pret-0.011
Token.
Token ID13
Feature activation+1.75e-03

Pos logit contributions
eb+0.018
over+0.016
'+0.015
ram+0.015
ive+0.015
Neg logit contributions
 moral-0.015
lights-0.015
 works-0.014
 exc-0.014
 lamp-0.014
Token<|endoftext|>
Token ID50256
Feature activation-2.74e-03

Pos logit contributions
 moral+0.003
 exc+0.003
lights+0.003
 works+0.003
 swollen+0.003
Neg logit contributions
eb-0.003
over-0.002
'-0.002
e-0.002
ale-0.002
TokenOnce
Token ID7454
Feature activation+1.75e-02

Pos logit contributions
eb+0.006
over+0.005
ale+0.005
ax+0.005
atter+0.005
Neg logit contributions
 moral-0.003
 works-0.003
 lamp-0.002
 pret-0.002
lights-0.002
Token from
Token ID422
Feature activation-1.04e-03

Pos logit contributions
eb+0.001
over+0.001
ax+0.001
ucky+0.001
aw+0.001
Neg logit contributions
bel-0.001
 lamp-0.001
 moral-0.001
lights-0.001
bul-0.001
Token one
Token ID530
Feature activation-3.63e-02

Pos logit contributions
eb+0.001
over+0.001
'+0.001
e+0.001
ucky+0.001
Neg logit contributions
 moral-0.002
 exc-0.002
lights-0.002
 swollen-0.002
 works-0.002
Token flower
Token ID15061
Feature activation-1.21e-02

Pos logit contributions
eb+0.066
ax+0.058
over+0.057
hed+0.055
dd+0.054
Neg logit contributions
 lamp-0.051
 moral-0.048
 poster-0.046
 exc-0.043
bul-0.042
Token to
Token ID284
Feature activation-1.31e-02

Pos logit contributions
eb+0.019
over+0.017
ax+0.017
ucky+0.016
e+0.015
Neg logit contributions
 lamp-0.018
bel-0.017
 moral-0.017
lights-0.017
bul-0.017
Token the
Token ID262
Feature activation-1.15e-02

Pos logit contributions
eb+0.025
over+0.023
ucky+0.021
ump+0.021
ram+0.021
Neg logit contributions
 moral-0.018
 exc-0.017
 lamp-0.016
 pret-0.016
 poster-0.016
Token next
Token ID1306
Feature activation-7.61e-02

Pos logit contributions
eb+0.021
over+0.018
ax+0.017
e+0.017
ram+0.017
Neg logit contributions
 moral-0.018
 poster-0.015
 lamp-0.015
 exc-0.015
 swollen-0.015
Token carrying
Token ID6872
Feature activation-2.77e-02

Pos logit contributions
eb+0.118
over+0.103
ax+0.096
e+0.096
ucky+0.093
Neg logit contributions
 moral-0.129
 poster-0.120
 lamp-0.119
lights-0.117
 exc-0.115
Token it
Token ID340
Feature activation-5.29e-03

Pos logit contributions
eb+0.052
over+0.047
ax+0.043
atter+0.043
'+0.043
Neg logit contributions
 moral-0.036
 exc-0.031
 pret-0.030
 lamp-0.030
 works-0.030
Token carefully
Token ID7773
Feature activation-1.40e-02

Pos logit contributions
eb+0.008
over+0.007
'+0.006
e+0.006
ax+0.006
Neg logit contributions
 moral-0.009
 exc-0.009
lights-0.008
 works-0.008
 swollen-0.008
Token in
Token ID287
Feature activation+5.57e-05

Pos logit contributions
eb+0.022
over+0.019
ucky+0.018
ax+0.018
e+0.017
Neg logit contributions
 moral-0.022
 exc-0.021
 works-0.021
 pret-0.021
bul-0.020
Token her
Token ID607
Feature activation-1.73e-02

Pos logit contributions
 moral+0.000
 exc+0.000
 swollen+0.000
 works+0.000
 lamp+0.000
Neg logit contributions
eb-0.000
over-0.000
ale-0.000
ax-0.000
'-0.000
Token to
Token ID284
Feature activation+5.76e-03

Pos logit contributions
eb+0.017
over+0.016
e+0.014
ucky+0.014
'+0.013
Neg logit contributions
 moral-0.020
 exc-0.019
 works-0.019
 poster-0.018
bel-0.018
Token the
Token ID262
Feature activation+4.42e-03

Pos logit contributions
 moral+0.008
 exc+0.008
 poster+0.008
 works+0.008
 pret+0.007
Neg logit contributions
eb-0.010
over-0.010
ive-0.009
ard-0.009
'-0.009
Token cherry
Token ID23612
Feature activation-1.25e-02

Pos logit contributions
 moral+0.007
 exc+0.006
 poster+0.006
 lamp+0.006
 swollen+0.006
Neg logit contributions
eb-0.008
over-0.007
e-0.006
'-0.006
ale-0.006
Token tree
Token ID5509
Feature activation-2.62e-03

Pos logit contributions
eb+0.021
over+0.020
e+0.019
ete+0.018
ram+0.017
Neg logit contributions
 moral-0.017
 poster-0.017
 lamp-0.016
 exc-0.016
bel-0.016
Token the
Token ID262
Feature activation-1.38e-02

Pos logit contributions
eb+0.005
over+0.004
ucky+0.004
ete+0.004
ax+0.004
Neg logit contributions
 exc-0.003
 moral-0.003
bel-0.003
 lamp-0.003
 poster-0.003
Token next
Token ID1306
Feature activation-7.53e-02

Pos logit contributions
eb+0.023
over+0.021
e+0.019
'+0.019
ale+0.018
Neg logit contributions
 moral-0.022
 exc-0.021
 swollen-0.020
 lamp-0.020
 poster-0.020
Token day
Token ID1110
Feature activation-1.91e-02

Pos logit contributions
eb+0.105
over+0.094
e+0.086
'+0.084
ucky+0.082
Neg logit contributions
 moral-0.137
 exc-0.126
lights-0.125
 poster-0.124
 lamp-0.124
Token.
Token ID13
Feature activation-1.53e-03

Pos logit contributions
eb+0.043
ench+0.038
wy+0.037
e+0.037
omp+0.037
Neg logit contributions
 pretending-0.015
 lamp-0.012
 new-0.011
 stairs-0.011
 exc-0.011
Token
Token ID198
Feature activation-1.10e-02

Pos logit contributions
eb+0.003
over+0.003
e+0.002
'+0.002
ale+0.002
Neg logit contributions
 moral-0.002
 exc-0.002
 lamp-0.002
 pret-0.002
 works-0.002
Token
Token ID198
Feature activation-5.10e-03

Pos logit contributions
eb+0.018
over+0.017
e+0.015
ive+0.015
'+0.015
Neg logit contributions
 moral-0.016
 exc-0.015
lights-0.015
 lamp-0.014
 works-0.014
TokenThe
Token ID464
Feature activation+9.31e-04

Pos logit contributions
eb+0.008
'+0.008
over+0.008
e+0.007
he+0.007
Neg logit contributions
lights-0.008
 works-0.008
 swollen-0.008
 pret-0.007
bul-0.007
Token she
Token ID673
Feature activation-7.35e-03

Pos logit contributions
eb+0.018
over+0.016
'+0.015
e+0.015
ucky+0.015
Neg logit contributions
 moral-0.017
 exc-0.016
lights-0.016
 swollen-0.015
 works-0.015
Token visited
Token ID8672
Feature activation-1.41e-02

Pos logit contributions
eb+0.012
over+0.012
ucky+0.011
'+0.011
ax+0.010
Neg logit contributions
 moral-0.011
 exc-0.010
 works-0.010
lights-0.010
 pret-0.010
Token the
Token ID262
Feature activation-4.32e-04

Pos logit contributions
eb+0.026
over+0.023
ucky+0.023
umbs+0.023
ax+0.023
Neg logit contributions
 moral-0.019
 lamp-0.017
 works-0.017
 exc-0.017
bul-0.017
Token river
Token ID7850
Feature activation-5.74e-03

Pos logit contributions
eb+0.001
over+0.001
'+0.001
e+0.001
ucky+0.001
Neg logit contributions
 moral-0.001
 exc-0.001
 works-0.001
 swollen-0.001
lights-0.001
Token the
Token ID262
Feature activation-1.52e-02

Pos logit contributions
eb+0.008
over+0.007
e+0.007
'+0.007
ucky+0.007
Neg logit contributions
 moral-0.010
lights-0.009
 exc-0.009
 works-0.009
 swollen-0.009
Token next
Token ID1306
Feature activation-7.53e-02

Pos logit contributions
eb+0.024
over+0.022
e+0.020
'+0.020
ucky+0.019
Neg logit contributions
 moral-0.025
 exc-0.023
 swollen-0.022
 works-0.022
lights-0.022
Token day
Token ID1110
Feature activation-1.38e-02

Pos logit contributions
eb+0.094
over+0.084
'+0.077
e+0.075
ucky+0.072
Neg logit contributions
 moral-0.145
lights-0.137
 exc-0.136
 works-0.134
 swollen-0.132
Token.
Token ID13
Feature activation-7.65e-05

Pos logit contributions
eb+0.028
ream+0.024
ax+0.024
atter+0.024
ucky+0.023
Neg logit contributions
lights-0.014
 moral-0.013
 pretending-0.013
 works-0.012
 lamp-0.012
Token The
Token ID383
Feature activation+1.87e-03

Pos logit contributions
eb+0.000
over+0.000
'+0.000
e+0.000
ale+0.000
Neg logit contributions
 moral-0.000
 exc-0.000
lights-0.000
 swollen-0.000
 pret-0.000
Token river
Token ID7850
Feature activation-6.13e-03

Pos logit contributions
lights+0.003
 pret+0.003
 swollen+0.003
aj+0.003
bel+0.003
Neg logit contributions
eb-0.003
'-0.003
over-0.002
e-0.002
ard-0.002
Token was
Token ID373
Feature activation-1.93e-02

Pos logit contributions
eb+0.010
over+0.010
'+0.010
e+0.009
ale+0.009
Neg logit contributions
 moral-0.008
 exc-0.008
 swollen-0.008
 pret-0.008
lights-0.008
Token see
Token ID766
Feature activation+1.08e-02

Pos logit contributions
 moral+0.004
 exc+0.004
 pret+0.004
 works+0.004
 poster+0.004
Neg logit contributions
eb-0.004
over-0.003
ucky-0.003
'-0.003
aw-0.003
Token the
Token ID262
Feature activation-4.89e-03

Pos logit contributions
 moral+0.017
 exc+0.016
lights+0.016
 swollen+0.015
 works+0.015
Neg logit contributions
eb-0.018
over-0.016
'-0.015
e-0.015
ucky-0.014
Token crocod
Token ID37565
Feature activation-1.37e-02

Pos logit contributions
eb+0.009
over+0.008
'+0.008
e+0.008
ucky+0.007
Neg logit contributions
 moral-0.007
 exc-0.006
lights-0.006
 swollen-0.006
 works-0.006
Tokenile
Token ID576
Feature activation+2.84e-03

Pos logit contributions
eb+0.023
over+0.022
'+0.020
e+0.018
ax+0.018
Neg logit contributions
 moral-0.020
lights-0.018
 exc-0.018
 works-0.018
 swollen-0.017
Token right
Token ID826
Feature activation-3.72e-02

Pos logit contributions
 moral+0.005
 exc+0.005
lights+0.005
 swollen+0.005
 works+0.005
Neg logit contributions
eb-0.004
over-0.004
'-0.003
e-0.003
ucky-0.003
Token next
Token ID1306
Feature activation-7.52e-02

Pos logit contributions
eb+0.046
over+0.041
'+0.038
e+0.036
ucky+0.035
Neg logit contributions
 moral-0.072
lights-0.068
 exc-0.068
 swollen-0.068
 works-0.067
Token to
Token ID284
Feature activation-1.37e-02

Pos logit contributions
eb+0.130
over+0.119
e+0.114
'+0.111
ax+0.108
Neg logit contributions
 moral-0.110
lights-0.104
 lamp-0.101
 works-0.100
 poster-0.100
Token him
Token ID683
Feature activation-8.15e-03

Pos logit contributions
eb+0.022
over+0.021
ucky+0.019
ump+0.018
ram+0.018
Neg logit contributions
 moral-0.023
 exc-0.020
 works-0.020
 pret-0.019
 lamp-0.019
Token.
Token ID13
Feature activation-1.16e-02

Pos logit contributions
eb+0.012
over+0.011
'+0.010
e+0.010
ucky+0.010
Neg logit contributions
 moral-0.013
 exc-0.012
 works-0.012
 lamp-0.012
lights-0.012
Token He
Token ID679
Feature activation+1.40e-02

Pos logit contributions
eb+0.017
over+0.016
'+0.015
e+0.014
ucky+0.014
Neg logit contributions
 moral-0.019
 exc-0.018
lights-0.018
 swollen-0.018
 works-0.018
Token had
Token ID550
Feature activation-2.55e-03

Pos logit contributions
lights+0.023
 moral+0.021
 swollen+0.021
 exc+0.021
 poster+0.021
Neg logit contributions
eb-0.022
'-0.020
e-0.019
over-0.018
aw-0.017
Token promised
Token ID8072
Feature activation-3.11e-02

Pos logit contributions
eb+0.020
over+0.019
'+0.018
ucky+0.018
e+0.018
Neg logit contributions
 moral-0.016
 exc-0.015
 swollen-0.014
lights-0.014
 works-0.014
Token to
Token ID284
Feature activation+2.63e-03

Pos logit contributions
eb+0.053
over+0.052
ucky+0.049
umbs+0.049
e+0.049
Neg logit contributions
 moral-0.042
 exc-0.037
 forgiving-0.035
bel-0.035
 hon-0.035
Token come
Token ID1282
Feature activation-1.69e-02

Pos logit contributions
 moral+0.004
 exc+0.004
 lamp+0.003
 forgiving+0.003
 swollen+0.003
Neg logit contributions
eb-0.005
over-0.005
aw-0.004
ucky-0.004
'-0.004
Token back
Token ID736
Feature activation-1.10e-02

Pos logit contributions
eb+0.023
over+0.020
ucky+0.018
ax+0.017
aw+0.017
Neg logit contributions
 moral-0.031
 exc-0.029
lights-0.028
 lamp-0.028
 pret-0.028
Token the
Token ID262
Feature activation-1.77e-02

Pos logit contributions
eb+0.018
over+0.016
ucky+0.015
ump+0.014
ard+0.014
Neg logit contributions
 moral-0.017
 exc-0.016
 works-0.015
 pret-0.015
bel-0.015
Token next
Token ID1306
Feature activation-7.51e-02

Pos logit contributions
eb+0.039
over+0.034
e+0.034
ucky+0.032
ax+0.032
Neg logit contributions
 moral-0.023
 exc-0.018
 poster-0.017
 works-0.017
 swollen-0.016
Token day
Token ID1110
Feature activation-9.87e-03

Pos logit contributions
eb+0.114
over+0.098
e+0.095
ucky+0.095
ump+0.092
Neg logit contributions
 moral-0.132
 poster-0.115
 works-0.115
 exc-0.114
lights-0.113
Token.
Token ID13
Feature activation-3.76e-03

Pos logit contributions
eb+0.018
ucky+0.017
ream+0.017
atter+0.016
ram+0.016
Neg logit contributions
 pretending-0.010
 moral-0.010
 rockets-0.009
bel-0.009
men-0.009
Token
Token ID198
Feature activation-4.36e-03

Pos logit contributions
eb+0.006
over+0.006
e+0.006
'+0.006
ale+0.006
Neg logit contributions
 moral-0.005
 exc-0.005
lights-0.005
 works-0.004
 swollen-0.004
Token
Token ID198
Feature activation-4.07e-03

Pos logit contributions
eb+0.008
over+0.007
'+0.007
e+0.007
ucky+0.006
Neg logit contributions
 moral-0.006
 exc-0.006
lights-0.006
 swollen-0.005
 works-0.005
TokenThe
Token ID464
Feature activation-8.97e-03

Pos logit contributions
eb+0.007
over+0.006
'+0.006
e+0.006
ucky+0.005
Neg logit contributions
 moral-0.006
lights-0.006
 exc-0.006
 works-0.006
 pret-0.006
Token bright
Token ID6016
Feature activation-5.03e-04

Pos logit contributions
eb+0.025
over+0.023
e+0.022
'+0.022
ucky+0.021
Neg logit contributions
 moral-0.020
lights-0.020
 exc-0.019
 pret-0.019
 works-0.019
Token feathers
Token ID27400
Feature activation-2.45e-02

Pos logit contributions
eb+0.001
over+0.001
'+0.001
e+0.001
ucky+0.001
Neg logit contributions
 moral-0.001
 exc-0.001
 works-0.001
lights-0.001
 pret-0.001
Token in
Token ID287
Feature activation-6.80e-03

Pos logit contributions
eb+0.036
over+0.033
'+0.030
e+0.030
ucky+0.029
Neg logit contributions
 moral-0.039
 swollen-0.039
lights-0.039
 exc-0.037
bul-0.037
Token the
Token ID262
Feature activation-1.59e-02

Pos logit contributions
eb+0.012
over+0.011
'+0.010
e+0.010
ucky+0.010
Neg logit contributions
 moral-0.010
 exc-0.009
 swollen-0.009
lights-0.009
 works-0.009
Token tree
Token ID5509
Feature activation-2.28e-02

Pos logit contributions
eb+0.030
over+0.027
ax+0.026
ucky+0.025
ete+0.025
Neg logit contributions
 moral-0.022
 lamp-0.020
 swollen-0.019
 gen-0.019
bul-0.019
Token next
Token ID1306
Feature activation-7.50e-02

Pos logit contributions
eb+0.036
over+0.032
ax+0.030
ucky+0.029
ete+0.029
Neg logit contributions
 swollen-0.034
 moral-0.034
 exc-0.034
lights-0.034
 lamp-0.033
Token to
Token ID284
Feature activation-2.13e-02

Pos logit contributions
eb+0.107
over+0.098
e+0.089
'+0.088
ax+0.083
Neg logit contributions
 moral-0.132
lights-0.126
 lamp-0.122
 exc-0.122
bul-0.122
Token her
Token ID607
Feature activation-1.80e-02

Pos logit contributions
eb+0.034
over+0.032
'+0.028
ucky+0.028
ax+0.027
Neg logit contributions
 moral-0.034
 exc-0.032
 works-0.031
 lamp-0.031
 pret-0.031
Token.
Token ID13
Feature activation-3.98e-03

Pos logit contributions
eb+0.025
over+0.023
'+0.021
e+0.020
ucky+0.020
Neg logit contributions
 moral-0.032
 exc-0.030
lights-0.030
 swollen-0.030
 works-0.029
Token She
Token ID1375
Feature activation+9.30e-03

Pos logit contributions
eb+0.006
over+0.005
e+0.005
'+0.005
ale+0.004
Neg logit contributions
 moral-0.007
 exc-0.006
lights-0.006
 swollen-0.006
 pret-0.006
Token was
Token ID373
Feature activation+1.33e-03

Pos logit contributions
lights+0.017
 appearances+0.015
 moral+0.015
 poster+0.015
 works+0.015
Neg logit contributions
eb-0.013
'-0.012
e-0.012
aw-0.010
ard-0.010
Token picnic
Token ID35715
Feature activation-2.28e-02

Pos logit contributions
eb+0.007
over+0.006
'+0.006
e+0.006
ucky+0.006
Neg logit contributions
 moral-0.007
 exc-0.007
lights-0.007
 swollen-0.007
 works-0.007
Token to
Token ID284
Feature activation-2.92e-03

Pos logit contributions
eb+0.034
over+0.030
e+0.028
ucky+0.027
'+0.027
Neg logit contributions
 moral-0.036
lights-0.035
 exc-0.034
 lamp-0.034
 works-0.034
Token the
Token ID262
Feature activation-2.02e-03

Pos logit contributions
eb+0.005
over+0.005
ram+0.004
ump+0.004
ard+0.004
Neg logit contributions
 moral-0.004
 exc-0.004
 works-0.004
 lamp-0.004
 pret-0.004
Token river
Token ID7850
Feature activation-7.87e-04

Pos logit contributions
eb+0.004
over+0.003
e+0.003
'+0.003
ump+0.003
Neg logit contributions
 moral-0.003
 exc-0.003
 lamp-0.003
 swollen-0.003
 works-0.003
Token the
Token ID262
Feature activation-1.19e-02

Pos logit contributions
eb+0.001
over+0.001
'+0.001
e+0.001
ucky+0.001
Neg logit contributions
 moral-0.001
 exc-0.001
lights-0.001
 works-0.001
 swollen-0.001
Token next
Token ID1306
Feature activation-7.49e-02

Pos logit contributions
eb+0.020
over+0.018
e+0.017
'+0.016
ucky+0.015
Neg logit contributions
 moral-0.019
 exc-0.017
 swollen-0.017
lights-0.017
 lamp-0.017
Token day
Token ID1110
Feature activation-1.54e-02

Pos logit contributions
eb+0.102
over+0.092
e+0.083
'+0.082
ucky+0.079
Neg logit contributions
 moral-0.138
lights-0.129
 exc-0.127
 works-0.125
bul-0.124
Token.
Token ID13
Feature activation-2.50e-03

Pos logit contributions
eb+0.032
ram+0.027
ream+0.026
ucky+0.026
e+0.026
Neg logit contributions
lights-0.013
 moral-0.013
 pretending-0.012
 lamp-0.012
 exc-0.012
Token
Token ID220
Feature activation+1.60e-02

Pos logit contributions
eb+0.004
over+0.004
e+0.004
ale+0.003
'+0.003
Neg logit contributions
 moral-0.004
 exc-0.003
lights-0.003
 lamp-0.003
 swollen-0.003
Token
Token ID198
Feature activation-3.37e-03

Pos logit contributions
 moral+0.024
 exc+0.024
lights+0.022
 works+0.022
 swollen+0.022
Neg logit contributions
eb-0.026
over-0.023
e-0.021
'-0.021
ale-0.021
Token
Token ID198
Feature activation-3.92e-03

Pos logit contributions
eb+0.005
over+0.004
'+0.004
e+0.004
ucky+0.004
Neg logit contributions
 moral-0.006
 exc-0.006
lights-0.005
 swollen-0.005
 works-0.005
Token under
Token ID739
Feature activation+1.49e-02

Pos logit contributions
eb+0.013
over+0.011
ax+0.010
'+0.010
ram+0.010
Neg logit contributions
 moral-0.014
lights-0.014
 swollen-0.013
 exc-0.013
 works-0.013
Token the
Token ID262
Feature activation-1.01e-02

Pos logit contributions
 moral+0.023
lights+0.022
 exc+0.022
 swollen+0.022
 lamp+0.021
Neg logit contributions
eb-0.024
over-0.022
'-0.021
e-0.020
ax-0.019
Token palm
Token ID18057
Feature activation-1.15e-02

Pos logit contributions
eb+0.017
over+0.016
'+0.015
e+0.015
ucky+0.015
Neg logit contributions
 moral-0.014
 exc-0.013
lights-0.013
 swollen-0.013
 works-0.013
Token tree
Token ID5509
Feature activation-1.78e-02

Pos logit contributions
eb+0.018
over+0.017
e+0.016
'+0.015
ete+0.014
Neg logit contributions
lights-0.018
 moral-0.017
 lamp-0.017
bul-0.017
 swollen-0.016
Token the
Token ID262
Feature activation-2.68e-02

Pos logit contributions
eb+0.030
over+0.026
ax+0.025
ucky+0.025
ete+0.024
Neg logit contributions
lights-0.025
 lamp-0.024
 swollen-0.024
bul-0.024
bel-0.024
Token next
Token ID1306
Feature activation-7.48e-02

Pos logit contributions
eb+0.045
over+0.041
'+0.038
e+0.037
ucky+0.036
Neg logit contributions
 moral-0.041
 exc-0.038
 swollen-0.038
lights-0.037
 lamp-0.037
Token day
Token ID1110
Feature activation-1.85e-02

Pos logit contributions
eb+0.104
over+0.092
'+0.085
e+0.085
ucky+0.081
Neg logit contributions
 moral-0.135
lights-0.128
 exc-0.125
 works-0.124
 lamp-0.124
Token.
Token ID13
Feature activation+6.05e-03

Pos logit contributions
eb+0.041
e+0.036
atter+0.035
wy+0.035
ream+0.035
Neg logit contributions
lights-0.016
 lamp-0.012
 lights-0.012
 moral-0.012
 new-0.012
Token<|endoftext|>
Token ID50256
Feature activation-1.53e-03

Pos logit contributions
 moral+0.010
 exc+0.009
lights+0.009
 swollen+0.009
 works+0.009
Neg logit contributions
eb-0.009
over-0.008
'-0.008
e-0.008
ale-0.007
TokenOne
Token ID3198
Feature activation-2.19e-02

Pos logit contributions
eb+0.003
over+0.003
ale+0.003
ax+0.003
ram+0.003
Neg logit contributions
 moral-0.002
 works-0.002
 lamp-0.001
 pret-0.001
 poster-0.001
Token day
Token ID1110
Feature activation+8.21e-03

Pos logit contributions
eb+0.027
over+0.024
'+0.022
e+0.021
ucky+0.020
Neg logit contributions
 moral-0.043
 exc-0.040
lights-0.040
 works-0.040
 swollen-0.040
Token He
Token ID679
Feature activation+1.48e-02

Pos logit contributions
eb+0.008
over+0.007
e+0.006
'+0.006
ale+0.006
Neg logit contributions
 moral-0.006
lights-0.006
 exc-0.006
 works-0.006
 pret-0.006
Token found
Token ID1043
Feature activation+1.20e-02

Pos logit contributions
 moral+0.022
lights+0.021
 exc+0.021
 swollen+0.020
 works+0.020
Neg logit contributions
eb-0.025
over-0.023
'-0.022
e-0.021
ucky-0.020
Token the
Token ID262
Feature activation-1.71e-02

Pos logit contributions
 moral+0.014
lights+0.013
 swollen+0.013
 exc+0.013
 lamp+0.013
Neg logit contributions
eb-0.023
over-0.022
'-0.021
e-0.020
ake-0.019
Token perfect
Token ID2818
Feature activation-2.92e-02

Pos logit contributions
eb+0.027
over+0.025
'+0.023
e+0.022
ucky+0.022
Neg logit contributions
 moral-0.027
lights-0.025
 exc-0.025
 swollen-0.025
 works-0.025
Token spot
Token ID4136
Feature activation-3.20e-02

Pos logit contributions
eb+0.041
over+0.038
'+0.035
e+0.034
ucky+0.033
Neg logit contributions
 moral-0.050
lights-0.048
 exc-0.048
 swollen-0.047
 works-0.047
Token next
Token ID1306
Feature activation-7.45e-02

Pos logit contributions
eb+0.049
over+0.045
'+0.041
ucky+0.040
e+0.040
Neg logit contributions
lights-0.049
 moral-0.048
 swollen-0.046
 works-0.046
 exc-0.046
Token to
Token ID284
Feature activation-1.37e-02

Pos logit contributions
eb+0.114
over+0.100
e+0.096
'+0.094
ax+0.093
Neg logit contributions
 moral-0.118
lights-0.113
 works-0.110
 lamp-0.110
bul-0.108
Token the
Token ID262
Feature activation-1.53e-02

Pos logit contributions
eb+0.024
over+0.023
ump+0.021
ram+0.021
ucky+0.020
Neg logit contributions
 moral-0.019
 exc-0.018
 pret-0.017
 works-0.017
 lamp-0.017
Token pond
Token ID16723
Feature activation-2.95e-02

Pos logit contributions
eb+0.023
over+0.020
e+0.018
'+0.018
ax+0.017
Neg logit contributions
 moral-0.027
 exc-0.025
 swollen-0.024
 poster-0.024
lights-0.024
Token,
Token ID11
Feature activation-2.76e-03

Pos logit contributions
eb+0.047
over+0.042
e+0.038
ucky+0.038
'+0.037
Neg logit contributions
 moral-0.043
lights-0.041
 exc-0.041
 works-0.040
 lamp-0.040
Token under
Token ID739
Feature activation+1.75e-02

Pos logit contributions
eb+0.004
over+0.004
'+0.004
e+0.003
ucky+0.003
Neg logit contributions
 moral-0.005
lights-0.004
 swollen-0.004
 exc-0.004
 works-0.004
Token rocks
Token ID12586
Feature activation-7.21e-03

Pos logit contributions
eb+0.021
'+0.018
over+0.018
e+0.017
ucky+0.017
Neg logit contributions
 moral-0.017
lights-0.017
 swollen-0.017
 exc-0.016
bul-0.016
Token to
Token ID284
Feature activation-2.03e-04

Pos logit contributions
eb+0.010
over+0.009
'+0.008
e+0.008
ucky+0.008
Neg logit contributions
 moral-0.013
lights-0.012
 exc-0.012
 works-0.012
 swollen-0.012
Token their
Token ID511
Feature activation-2.12e-02

Pos logit contributions
eb+0.000
over+0.000
'+0.000
ucky+0.000
ive+0.000
Neg logit contributions
 moral-0.000
 exc-0.000
lights-0.000
 lamp-0.000
 pret-0.000
Token teacher
Token ID4701
Feature activation+2.07e-03

Pos logit contributions
eb+0.034
over+0.032
'+0.030
e+0.029
ucky+0.028
Neg logit contributions
 moral-0.033
 exc-0.031
lights-0.030
 swollen-0.030
 works-0.030
Token the
Token ID262
Feature activation-1.14e-02

Pos logit contributions
 moral+0.004
 exc+0.003
lights+0.003
 pret+0.003
 works+0.003
Neg logit contributions
eb-0.003
over-0.003
'-0.002
e-0.002
ucky-0.002
Token next
Token ID1306
Feature activation-7.43e-02

Pos logit contributions
eb+0.018
over+0.017
'+0.016
e+0.015
ucky+0.015
Neg logit contributions
 moral-0.017
lights-0.017
 exc-0.016
 swollen-0.016
 works-0.016
Token day
Token ID1110
Feature activation-1.24e-02

Pos logit contributions
eb+0.111
over+0.101
'+0.094
e+0.091
ucky+0.088
Neg logit contributions
 moral-0.125
lights-0.118
 exc-0.116
 works-0.115
 poster-0.115
Token in
Token ID287
Feature activation-7.73e-03

Pos logit contributions
eb+0.025
atter+0.022
he+0.022
e+0.021
ench+0.021
Neg logit contributions
lights-0.012
 pretending-0.012
 moral-0.011
 exc-0.011
 lights-0.011
Token class
Token ID1398
Feature activation-3.91e-03

Pos logit contributions
eb+0.013
over+0.011
ax+0.011
ale+0.011
ard+0.011
Neg logit contributions
 moral-0.012
lights-0.011
 swollen-0.011
 exc-0.011
 works-0.011
Token.
Token ID13
Feature activation-5.20e-03

Pos logit contributions
eb+0.006
over+0.005
ucky+0.005
e+0.005
ram+0.005
Neg logit contributions
 moral-0.006
lights-0.006
 exc-0.006
 works-0.006
 poster-0.006
Token "
Token ID366
Feature activation-8.31e-04

Pos logit contributions
eb+0.007
over+0.007
'+0.006
e+0.006
ucky+0.006
Neg logit contributions
 moral-0.009
 swollen-0.008
lights-0.008
 exc-0.008
 works-0.008
Token lay
Token ID3830
Feature activation-2.46e-02

Pos logit contributions
eb+0.030
over+0.028
'+0.026
e+0.026
ucky+0.025
Neg logit contributions
 moral-0.021
 exc-0.020
lights-0.019
 swollen-0.019
 works-0.019
Token down
Token ID866
Feature activation-4.70e-03

Pos logit contributions
eb+0.033
over+0.030
'+0.028
e+0.027
ucky+0.026
Neg logit contributions
 moral-0.044
lights-0.042
 exc-0.042
 swollen-0.042
 works-0.041
Token on
Token ID319
Feature activation-4.15e-03

Pos logit contributions
eb+0.006
over+0.006
'+0.005
e+0.005
ucky+0.005
Neg logit contributions
 moral-0.009
 exc-0.008
lights-0.008
 swollen-0.008
 works-0.008
Token the
Token ID262
Feature activation-1.77e-02

Pos logit contributions
eb+0.007
over+0.007
ive+0.007
atter+0.007
'+0.007
Neg logit contributions
 moral-0.005
 pret-0.005
lights-0.005
 exc-0.005
 lamp-0.005
Token ground
Token ID2323
Feature activation-2.05e-02

Pos logit contributions
eb+0.026
over+0.024
'+0.022
e+0.021
ucky+0.021
Neg logit contributions
 moral-0.030
lights-0.028
 exc-0.028
 swollen-0.028
 works-0.028
Token next
Token ID1306
Feature activation-7.43e-02

Pos logit contributions
eb+0.032
over+0.029
e+0.026
'+0.026
ucky+0.026
Neg logit contributions
 moral-0.031
lights-0.030
 exc-0.030
 swollen-0.030
 lamp-0.029
Token to
Token ID284
Feature activation-1.11e-02

Pos logit contributions
eb+0.110
over+0.101
e+0.092
'+0.092
ucky+0.086
Neg logit contributions
 moral-0.125
lights-0.119
 lamp-0.118
 exc-0.117
 works-0.115
Token John
Token ID1757
Feature activation-7.70e-03

Pos logit contributions
eb+0.019
over+0.018
ump+0.016
ucky+0.016
umbs+0.016
Neg logit contributions
 moral-0.016
 exc-0.015
 pret-0.015
 works-0.014
 lamp-0.014
Token and
Token ID290
Feature activation-9.69e-03

Pos logit contributions
eb+0.012
over+0.012
aw+0.011
ucky+0.011
umm+0.010
Neg logit contributions
 exc-0.010
 pret-0.010
 lamp-0.010
 moral-0.010
 works-0.010
Token fell
Token ID3214
Feature activation+9.98e-03

Pos logit contributions
eb+0.017
over+0.016
'+0.015
e+0.015
ucky+0.014
Neg logit contributions
 moral-0.013
 exc-0.013
lights-0.013
 swollen-0.012
 works-0.012
Token asleep
Token ID16039
Feature activation+7.88e-03

Pos logit contributions
 moral+0.015
 exc+0.013
 poster+0.013
 works+0.013
lights+0.013
Neg logit contributions
eb-0.017
over-0.015
'-0.015
ucky-0.014
 home-0.014
Token on
Token ID319
Feature activation-4.97e-03

Pos logit contributions
eb+0.037
over+0.034
ucky+0.032
e+0.032
umbs+0.031
Neg logit contributions
 moral-0.040
 poster-0.038
lights-0.038
 works-0.038
 exc-0.037
Token the
Token ID262
Feature activation-5.74e-03

Pos logit contributions
eb+0.009
over+0.008
'+0.007
ucky+0.007
ive+0.007
Neg logit contributions
 moral-0.007
lights-0.007
 swollen-0.006
 exc-0.006
 works-0.006
Token raft
Token ID31812
Feature activation-8.71e-03

Pos logit contributions
eb+0.011
over+0.010
e+0.010
'+0.009
ucky+0.009
Neg logit contributions
 moral-0.008
 exc-0.007
lights-0.007
 works-0.007
 swollen-0.007
Token again
Token ID757
Feature activation-2.35e-02

Pos logit contributions
eb+0.014
over+0.013
e+0.013
ucky+0.013
ee+0.012
Neg logit contributions
 poster-0.011
lights-0.010
bul-0.010
bel-0.010
 trumpet-0.010
Token the
Token ID262
Feature activation-1.86e-02

Pos logit contributions
eb+0.042
ucky+0.040
umbs+0.039
atter+0.039
ream+0.038
Neg logit contributions
 moral-0.027
 exc-0.025
 pretending-0.024
 poster-0.024
 fl-0.024
Token next
Token ID1306
Feature activation-7.43e-02

Pos logit contributions
eb+0.036
over+0.034
'+0.032
e+0.031
ucky+0.031
Neg logit contributions
 moral-0.023
 exc-0.021
lights-0.021
 swollen-0.020
 works-0.020
Token day
Token ID1110
Feature activation-1.49e-02

Pos logit contributions
eb+0.101
over+0.091
'+0.083
e+0.083
ucky+0.082
Neg logit contributions
 moral-0.134
 exc-0.126
lights-0.124
 works-0.122
 poster-0.121
Token.
Token ID13
Feature activation-1.30e-02

Pos logit contributions
eb+0.029
ucky+0.028
ream+0.028
ench+0.028
wy+0.027
Neg logit contributions
 pretending-0.013
 waving-0.011
 poster-0.011
lights-0.011
bel-0.010
Token<|endoftext|>
Token ID50256
Feature activation-1.93e-03

Pos logit contributions
eb+0.025
e+0.024
over+0.023
ale+0.022
ard+0.022
Neg logit contributions
 moral-0.015
 exc-0.014
 sings-0.011
 Amen-0.011
 poster-0.011
TokenBilly
Token ID49640
Feature activation-9.50e-03

Pos logit contributions
eb+0.004
over+0.004
ale+0.004
ram+0.004
atter+0.004
Neg logit contributions
 moral-0.002
 works-0.002
 lamp-0.001
 pret-0.001
 poster-0.001
Token and
Token ID290
Feature activation-2.77e-02

Pos logit contributions
eb+0.016
over+0.015
'+0.014
e+0.014
ucky+0.013
Neg logit contributions
 moral-0.013
 exc-0.013
 works-0.013
 pret-0.013
 swollen-0.012
Token"
Token ID1
Feature activation-1.22e-02

Pos logit contributions
lights+0.017
 moral+0.016
 pret+0.016
 works+0.016
bul+0.016
Neg logit contributions
eb-0.015
over-0.014
'-0.014
e-0.013
ucky-0.013
TokenI
Token ID40
Feature activation+1.38e-03

Pos logit contributions
eb+0.018
'+0.016
over+0.016
e+0.015
he+0.015
Neg logit contributions
lights-0.021
 pret-0.019
aj-0.019
 works-0.019
 exc-0.019
Token'm
Token ID1101
Feature activation+1.23e-02

Pos logit contributions
lights+0.002
 moral+0.002
 swollen+0.002
 poster+0.002
bul+0.002
Neg logit contributions
eb-0.002
'-0.002
over-0.002
e-0.002
ucky-0.002
Token so
Token ID523
Feature activation+1.24e-02

Pos logit contributions
 swollen+0.006
 exc+0.006
 moral+0.006
 forgiving+0.005
 pret+0.005
Neg logit contributions
eb-0.033
over-0.030
ale-0.029
'-0.029
ive-0.029
Token sorry
Token ID7926
Feature activation+2.58e-02

Pos logit contributions
 moral+0.011
 exc+0.010
 swollen+0.010
lights+0.010
 works+0.010
Neg logit contributions
eb-0.028
over-0.026
'-0.025
e-0.025
ale-0.024
Token,
Token ID11
Feature activation+4.75e-02

Pos logit contributions
 moral+0.049
lights+0.048
 swollen+0.047
 exc+0.047
 works+0.046
Neg logit contributions
eb-0.032
over-0.028
'-0.027
ucky-0.025
e-0.024
Token kids
Token ID3988
Feature activation+1.44e-02

Pos logit contributions
 moral+0.068
lights+0.067
 exc+0.064
 works+0.064
 swollen+0.064
Neg logit contributions
eb-0.080
over-0.073
'-0.070
ucky-0.066
e-0.066
Token!"
Token ID2474
Feature activation-1.72e-02

Pos logit contributions
 exc+0.017
 works+0.016
 moral+0.016
 swollen+0.016
 lamp+0.016
Neg logit contributions
eb-0.027
over-0.026
e-0.023
ucky-0.023
ale-0.023
Token the
Token ID262
Feature activation+1.46e-02

Pos logit contributions
eb+0.024
over+0.022
'+0.022
e+0.020
ucky+0.020
Neg logit contributions
 moral-0.029
lights-0.028
 works-0.027
bul-0.027
 exc-0.027
Token man
Token ID582
Feature activation+9.74e-03

Pos logit contributions
lights+0.020
bul+0.019
 works+0.018
 pret+0.018
 swollen+0.018
Neg logit contributions
eb-0.025
over-0.025
'-0.024
aw-0.022
ax-0.022
Token said
Token ID531
Feature activation+5.92e-03

Pos logit contributions
 moral+0.011
 exc+0.011
 swollen+0.010
lights+0.010
 works+0.010
Neg logit contributions
eb-0.019
over-0.018
'-0.017
e-0.017
ucky-0.016
Token mom
Token ID1995
Feature activation+7.74e-03

Pos logit contributions
 works+0.012
bel+0.012
lights+0.012
 satellites+0.011
bul+0.011
Neg logit contributions
 home-0.006
eb-0.005
ucky-0.005
'-0.005
over-0.004
Token said
Token ID531
Feature activation-5.99e-03

Pos logit contributions
 moral+0.009
lights+0.009
 exc+0.009
 pret+0.009
bel+0.008
Neg logit contributions
eb-0.015
'-0.014
over-0.013
e-0.013
ax-0.012
Token,
Token ID11
Feature activation+2.09e-02

Pos logit contributions
eb+0.007
over+0.007
e+0.006
'+0.006
ale+0.006
Neg logit contributions
 moral-0.011
 exc-0.011
 swollen-0.011
lights-0.011
 pret-0.011
Token "
Token ID366
Feature activation-1.50e-03

Pos logit contributions
lights+0.028
 moral+0.027
 works+0.027
 swollen+0.025
 pret+0.025
Neg logit contributions
'-0.036
eb-0.035
over-0.033
e-0.031
ucky-0.030
TokenNo
Token ID2949
Feature activation-3.89e-03

Pos logit contributions
eb+0.002
over+0.002
'+0.002
e+0.002
he+0.002
Neg logit contributions
lights-0.003
 pret-0.002
 moral-0.002
 works-0.002
 exc-0.002
Token,
Token ID11
Feature activation+5.78e-02

Pos logit contributions
eb+0.005
over+0.004
'+0.004
e+0.004
ucky+0.003
Neg logit contributions
 moral-0.008
 exc-0.007
 swollen-0.007
lights-0.007
 works-0.007
Token but
Token ID475
Feature activation+1.00e-02

Pos logit contributions
 moral+0.080
 exc+0.074
lights+0.074
 swollen+0.073
 works+0.073
Neg logit contributions
eb-0.103
over-0.095
'-0.090
e-0.088
ucky-0.085
Token let
Token ID1309
Feature activation-2.65e-02

Pos logit contributions
 moral+0.016
lights+0.015
 swollen+0.015
 exc+0.015
 pret+0.015
Neg logit contributions
eb-0.015
over-0.014
'-0.013
e-0.013
ucky-0.012
Token's
Token ID338
Feature activation-5.98e-03

Pos logit contributions
over+0.047
eb+0.043
ive+0.040
imble+0.040
ump+0.039
Neg logit contributions
 moral-0.038
 works-0.033
 lamp-0.033
 exc-0.032
 mistake-0.032
Token look
Token ID804
Feature activation+1.50e-02

Pos logit contributions
eb+0.010
over+0.010
ump+0.009
aw+0.009
'+0.009
Neg logit contributions
 exc-0.009
 lamp-0.008
 moral-0.008
 works-0.008
 trumpet-0.008
Token for
Token ID329
Feature activation+4.57e-03

Pos logit contributions
 moral+0.027
lights+0.026
 swollen+0.026
 exc+0.026
 works+0.026
Neg logit contributions
eb-0.020
over-0.018
'-0.017
e-0.016
ucky-0.015
Token a
Token ID257
Feature activation+9.61e-03

Pos logit contributions
eb+0.009
over+0.008
'+0.008
ale+0.008
ive+0.007
Neg logit contributions
 moral-0.005
 exc-0.005
 swollen-0.004
 pret-0.004
lights-0.004
Token girl
Token ID2576
Feature activation-4.65e-03

Pos logit contributions
 moral+0.012
lights+0.012
 swollen+0.011
 works+0.011
 exc+0.011
Neg logit contributions
eb-0.018
over-0.017
'-0.016
e-0.015
ucky-0.015
Token.
Token ID13
Feature activation-1.53e-03

Pos logit contributions
eb+0.009
over+0.008
ax+0.008
'+0.008
ale+0.008
Neg logit contributions
 moral-0.006
 exc-0.005
 lamp-0.005
 swollen-0.005
bul-0.005
Token She
Token ID1375
Feature activation+1.78e-02

Pos logit contributions
eb+0.003
over+0.002
ax+0.002
ram+0.002
or+0.002
Neg logit contributions
 lamp-0.002
 moral-0.002
 exc-0.002
lights-0.002
 works-0.002
Token was
Token ID373
Feature activation+5.81e-03

Pos logit contributions
lights+0.030
 moral+0.028
 swollen+0.028
bul+0.027
 poster+0.027
Neg logit contributions
eb-0.026
over-0.025
e-0.025
'-0.024
he-0.021
Token playing
Token ID2712
Feature activation+4.61e-02

Pos logit contributions
 moral+0.009
lights+0.009
 swollen+0.008
 works+0.008
 pret+0.008
Neg logit contributions
eb-0.009
over-0.009
'-0.008
e-0.008
ucky-0.008
Token in
Token ID287
Feature activation+2.03e-02

Pos logit contributions
lights+0.088
 moral+0.087
 swollen+0.085
 works+0.082
 exc+0.082
Neg logit contributions
eb-0.056
over-0.055
e-0.053
'-0.052
he-0.046
Token her
Token ID607
Feature activation+7.67e-04

Pos logit contributions
 moral+0.037
lights+0.036
 swollen+0.035
 exc+0.035
 works+0.034
Neg logit contributions
eb-0.026
over-0.024
e-0.023
'-0.022
ucky-0.020
Token room
Token ID2119
Feature activation+1.14e-02

Pos logit contributions
 moral+0.001
lights+0.001
 exc+0.001
 swollen+0.001
 works+0.001
Neg logit contributions
eb-0.001
over-0.001
'-0.001
e-0.001
ucky-0.001
Token when
Token ID618
Feature activation-1.92e-02

Pos logit contributions
 moral+0.018
 exc+0.017
lights+0.016
 works+0.016
 lamp+0.016
Neg logit contributions
eb-0.018
over-0.016
'-0.015
ax-0.015
e-0.015
Token she
Token ID673
Feature activation-6.13e-03

Pos logit contributions
eb+0.032
over+0.030
'+0.028
e+0.028
ucky+0.027
Neg logit contributions
 moral-0.028
lights-0.026
 exc-0.026
 swollen-0.026
 works-0.026
Token<|endoftext|>
Token ID50256
Feature activation+1.12e-02

Pos logit contributions
 moral+0.039
 exc+0.037
lights+0.037
 swollen+0.037
 works+0.036
Neg logit contributions
eb-0.032
over-0.029
'-0.027
e-0.026
ale-0.025
TokenTom
Token ID13787
Feature activation-2.17e-02

Pos logit contributions
 moral+0.012
 works+0.011
 lamp+0.010
 pret+0.010
 poster+0.010
Neg logit contributions
eb-0.022
over-0.021
ale-0.021
ax-0.020
ram-0.020
Token and
Token ID290
Feature activation-2.20e-02

Pos logit contributions
over+0.039
eb+0.039
'+0.034
ive+0.034
ucky+0.033
Neg logit contributions
 works-0.027
 moral-0.026
 pret-0.025
 lamp-0.024
 exc-0.024
Token Lily
Token ID20037
Feature activation-1.88e-02

Pos logit contributions
eb+0.036
over+0.033
'+0.031
e+0.030
ucky+0.030
Neg logit contributions
 moral-0.033
 exc-0.032
 works-0.030
lights-0.030
 swollen-0.030
Token were
Token ID547
Feature activation-8.10e-03

Pos logit contributions
over+0.032
eb+0.031
ucky+0.028
ram+0.027
ive+0.027
Neg logit contributions
 works-0.026
 exc-0.026
 moral-0.025
 pret-0.025
 lamp-0.023
Token playing
Token ID2712
Feature activation+4.71e-02

Pos logit contributions
eb+0.015
over+0.013
'+0.012
e+0.012
ive+0.012
Neg logit contributions
 moral-0.011
 exc-0.011
 swollen-0.010
lights-0.010
 works-0.010
Token in
Token ID287
Feature activation+1.87e-02

Pos logit contributions
 moral+0.090
lights+0.086
 swollen+0.086
 exc+0.085
 works+0.084
Neg logit contributions
eb-0.058
over-0.053
'-0.050
e-0.048
ucky-0.045
Token the
Token ID262
Feature activation+8.31e-03

Pos logit contributions
 moral+0.035
lights+0.033
 exc+0.033
 swollen+0.032
 works+0.032
Neg logit contributions
eb-0.024
over-0.022
e-0.020
'-0.020
ucky-0.018
Token park
Token ID3952
Feature activation-2.73e-03

Pos logit contributions
 moral+0.013
 exc+0.012
lights+0.012
 swollen+0.012
 works+0.012
Neg logit contributions
eb-0.013
over-0.012
'-0.011
e-0.011
ucky-0.011
Token with
Token ID351
Feature activation-8.92e-03

Pos logit contributions
eb+0.005
over+0.004
ucky+0.004
ax+0.004
'+0.004
Neg logit contributions
 moral-0.004
 exc-0.004
lights-0.004
bel-0.004
 lamp-0.004
Token their
Token ID511
Feature activation-1.60e-02

Pos logit contributions
eb+0.016
over+0.015
'+0.014
e+0.013
ive+0.013
Neg logit contributions
 moral-0.012
 exc-0.012
lights-0.012
 swollen-0.012
 works-0.011
Token,
Token ID11
Feature activation+2.81e-02

Pos logit contributions
eb+0.002
over+0.002
'+0.002
e+0.002
ucky+0.002
Neg logit contributions
 moral-0.004
 exc-0.004
lights-0.004
 swollen-0.004
 works-0.004
Token "
Token ID366
Feature activation+8.50e-04

Pos logit contributions
 moral+0.030
lights+0.028
 exc+0.027
 works+0.027
 swollen+0.027
Neg logit contributions
eb-0.058
over-0.054
'-0.053
e-0.050
ucky-0.049
TokenThat
Token ID2504
Feature activation+2.24e-02

Pos logit contributions
lights+0.001
 moral+0.001
 exc+0.001
 works+0.001
 lamp+0.001
Neg logit contributions
eb-0.001
over-0.001
'-0.001
e-0.001
ax-0.001
Token's
Token ID338
Feature activation-1.04e-02

Pos logit contributions
 moral+0.038
lights+0.037
 exc+0.036
 swollen+0.036
bel+0.035
Neg logit contributions
eb-0.032
over-0.029
'-0.029
e-0.027
ucky-0.025
Token OK
Token ID7477
Feature activation-9.02e-03

Pos logit contributions
eb+0.018
over+0.017
'+0.016
e+0.016
ucky+0.015
Neg logit contributions
lights-0.014
 moral-0.014
bel-0.013
 poster-0.013
bul-0.013
Token,
Token ID11
Feature activation+4.94e-02

Pos logit contributions
eb+0.012
over+0.010
'+0.009
e+0.009
ax+0.009
Neg logit contributions
 moral-0.017
 exc-0.016
lights-0.016
 swollen-0.016
 works-0.016
Token Ben
Token ID3932
Feature activation-1.18e-02

Pos logit contributions
 moral+0.080
 exc+0.075
 pret+0.073
 swollen+0.073
 works+0.072
Neg logit contributions
eb-0.077
over-0.072
e-0.066
'-0.066
ale-0.063
Token.
Token ID13
Feature activation+9.20e-03

Pos logit contributions
eb+0.020
over+0.017
ale+0.016
e+0.016
ram+0.016
Neg logit contributions
 moral-0.017
 exc-0.016
 works-0.016
 swollen-0.016
 pret-0.015
Token Don
Token ID2094
Feature activation+6.88e-03

Pos logit contributions
 moral+0.012
 exc+0.012
 swollen+0.011
 works+0.011
lights+0.011
Neg logit contributions
eb-0.017
over-0.015
'-0.015
e-0.014
ale-0.014
Token't
Token ID470
Feature activation-1.47e-03

Pos logit contributions
 moral+0.010
 exc+0.009
 works+0.008
 swollen+0.008
lights+0.008
Neg logit contributions
eb-0.013
over-0.012
e-0.011
'-0.010
ale-0.010
Token worry
Token ID5490
Feature activation+3.78e-02

Pos logit contributions
eb+0.003
over+0.003
'+0.002
e+0.002
ale+0.002
Neg logit contributions
 moral-0.002
 exc-0.002
 swollen-0.002
 works-0.002
 lamp-0.002
Token a
Token ID257
Feature activation+1.77e-02

Pos logit contributions
eb+0.016
over+0.015
'+0.014
e+0.014
ucky+0.014
Neg logit contributions
 moral-0.014
 exc-0.013
lights-0.013
 swollen-0.013
 works-0.012
Token silly
Token ID14397
Feature activation+1.64e-02

Pos logit contributions
 moral+0.023
lights+0.023
 works+0.022
 exc+0.022
 swollen+0.022
Neg logit contributions
eb-0.031
over-0.029
'-0.027
e-0.027
ucky-0.026
Token face
Token ID1986
Feature activation+7.63e-03

Pos logit contributions
 moral+0.022
lights+0.021
 swollen+0.021
 exc+0.021
 works+0.021
Neg logit contributions
eb-0.029
over-0.028
'-0.026
e-0.025
ucky-0.025
Token.
Token ID13
Feature activation-3.48e-03

Pos logit contributions
 moral+0.013
 exc+0.012
lights+0.012
 pret+0.012
 lamp+0.012
Neg logit contributions
eb-0.011
over-0.010
'-0.009
ucky-0.009
ale-0.009
Token\n
Token ID198
Feature activation+2.14e-03

Pos logit contributions
eb+0.005
over+0.004
'+0.004
e+0.004
ucky+0.004
Neg logit contributions
 moral-0.006
lights-0.006
 swollen-0.006
 exc-0.006
 works-0.006
Token\n
Token ID198
Feature activation+1.47e-03

Pos logit contributions
 moral+0.004
 exc+0.003
lights+0.003
 swollen+0.003
 works+0.003
Neg logit contributions
eb-0.003
over-0.003
'-0.003
e-0.003
ale-0.003
Token"
Token ID1
Feature activation-1.38e-02

Pos logit contributions
 moral+0.002
 exc+0.002
lights+0.002
 swollen+0.002
 works+0.002
Neg logit contributions
eb-0.003
over-0.002
'-0.002
e-0.002
ucky-0.002
TokenLook
Token ID8567
Feature activation+2.39e-02

Pos logit contributions
eb+0.022
over+0.020
e+0.019
'+0.019
he+0.017
Neg logit contributions
lights-0.021
 moral-0.020
 exc-0.020
 poster-0.020
 works-0.020
Token,
Token ID11
Feature activation+4.19e-02

Pos logit contributions
 moral+0.040
 exc+0.039
 pret+0.039
 works+0.039
 lamp+0.038
Neg logit contributions
eb-0.031
over-0.030
'-0.025
e-0.025
ump-0.025
Token Anna
Token ID11735
Feature activation-1.59e-02

Pos logit contributions
lights+0.077
bel+0.071
 moral+0.071
bul+0.068
 swollen+0.067
Neg logit contributions
eb-0.056
'-0.050
e-0.049
ucky-0.045
over-0.044
Token.
Token ID13
Feature activation-1.74e-03

Pos logit contributions
eb+0.022
over+0.020
'+0.018
ucky+0.018
ale+0.018
Neg logit contributions
 works-0.027
 moral-0.027
 exc-0.026
 pret-0.026
 swollen-0.025
Token her
Token ID607
Feature activation+1.38e-02

Pos logit contributions
 moral+0.018
lights+0.018
 works+0.017
 poster+0.017
 swollen+0.017
Neg logit contributions
eb-0.021
over-0.020
e-0.019
'-0.018
ucky-0.017
Token friend
Token ID1545
Feature activation+2.19e-02

Pos logit contributions
lights+0.017
 swollen+0.017
 moral+0.016
 works+0.016
 pret+0.016
Neg logit contributions
eb-0.026
over-0.024
'-0.023
e-0.022
ucky-0.022
Token too
Token ID1165
Feature activation-8.01e-03

Pos logit contributions
 moral+0.032
 exc+0.029
 lamp+0.028
 pret+0.028
 works+0.028
Neg logit contributions
eb-0.036
over-0.032
ram-0.032
ale-0.031
ard-0.030
Token.
Token ID13
Feature activation+4.27e-03

Pos logit contributions
eb+0.012
over+0.011
'+0.010
e+0.010
ale+0.010
Neg logit contributions
 moral-0.014
 exc-0.013
 swollen-0.012
 pret-0.012
 works-0.012
Token They
Token ID1119
Feature activation+7.18e-03

Pos logit contributions
 moral+0.008
 works+0.008
 poster+0.008
 exc+0.008
 pret+0.008
Neg logit contributions
eb-0.005
over-0.004
ucky-0.004
'-0.004
 Pip-0.004
Token hugged
Token ID41130
Feature activation+9.70e-03

Pos logit contributions
 moral+0.012
 exc+0.012
 swollen+0.011
lights+0.011
 pret+0.011
Neg logit contributions
eb-0.011
over-0.010
'-0.009
e-0.009
ucky-0.009
Token and
Token ID290
Feature activation+1.19e-03

Pos logit contributions
 moral+0.016
 pret+0.015
lights+0.015
 lamp+0.014
 exc+0.014
Neg logit contributions
eb-0.015
over-0.014
ram-0.013
ax-0.013
ump-0.013
Token went
Token ID1816
Feature activation+9.36e-03

Pos logit contributions
 moral+0.002
 exc+0.001
 swollen+0.001
lights+0.001
 pret+0.001
Neg logit contributions
eb-0.002
over-0.002
'-0.002
ucky-0.002
e-0.002
Token back
Token ID736
Feature activation+6.75e-04

Pos logit contributions
 poster+0.016
 moral+0.016
 pret+0.016
lights+0.016
bel+0.016
Neg logit contributions
eb-0.013
over-0.011
ard-0.010
ax-0.010
umm-0.010
Token to
Token ID284
Feature activation+2.13e-02

Pos logit contributions
 works+0.001
 pret+0.001
 moral+0.001
 exc+0.001
bel+0.001
Neg logit contributions
eb-0.001
over-0.001
ax-0.001
ucky-0.001
e-0.001
Token the
Token ID262
Feature activation+2.06e-02

Pos logit contributions
 moral+0.030
 exc+0.029
 pret+0.028
 swollen+0.027
 lamp+0.027
Neg logit contributions
eb-0.039
over-0.036
'-0.033
ive-0.033
ard-0.032
Token.
Token ID13
Feature activation-1.43e-02

Pos logit contributions
eb+0.034
over+0.031
e+0.028
'+0.027
ucky+0.027
Neg logit contributions
 moral-0.032
lights-0.032
 exc-0.031
 works-0.031
 lamp-0.030
Token He
Token ID679
Feature activation+3.74e-03

Pos logit contributions
eb+0.024
over+0.023
e+0.020
'+0.020
ucky+0.020
Neg logit contributions
 moral-0.021
 exc-0.019
 works-0.019
 pret-0.019
lights-0.018
Token was
Token ID373
Feature activation-1.28e-03

Pos logit contributions
 moral+0.005
lights+0.005
 exc+0.005
 works+0.005
 poster+0.005
Neg logit contributions
eb-0.006
over-0.006
'-0.006
e-0.005
ucky-0.005
Token careful
Token ID8161
Feature activation-1.76e-03

Pos logit contributions
eb+0.003
over+0.002
'+0.002
e+0.002
ale+0.002
Neg logit contributions
 moral-0.002
 exc-0.001
 swollen-0.001
lights-0.001
 pret-0.001
Token and
Token ID290
Feature activation-1.16e-03

Pos logit contributions
eb+0.003
over+0.003
ive+0.003
e+0.003
ucky+0.003
Neg logit contributions
 moral-0.002
 exc-0.002
lights-0.002
 swollen-0.002
bel-0.002
Token looked
Token ID3114
Feature activation+5.68e-03

Pos logit contributions
eb+0.002
over+0.002
'+0.002
ucky+0.002
e+0.002
Neg logit contributions
 moral-0.001
 exc-0.001
 swollen-0.001
 works-0.001
lights-0.001
Token around
Token ID1088
Feature activation+2.02e-02

Pos logit contributions
 moral+0.009
 exc+0.009
 swollen+0.009
lights+0.009
 works+0.009
Neg logit contributions
eb-0.009
over-0.008
e-0.007
'-0.007
ucky-0.007
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
 moral+0.030
 pretending+0.027
lights+0.027
 lamp+0.027
 exc+0.027
Neg logit contributions
eb-0.035
over-0.032
umbs-0.029
ive-0.029
e-0.028
Token He
Token ID679
Feature activation+6.51e-03

Pos logit contributions
eb+0.018
over+0.017
e+0.016
'+0.015
ucky+0.015
Neg logit contributions
 moral-0.014
 exc-0.013
 works-0.013
 pret-0.012
lights-0.012
Token saw
Token ID2497
Feature activation+9.15e-03

Pos logit contributions
 moral+0.008
 exc+0.008
 swollen+0.008
 pret+0.008
 works+0.008
Neg logit contributions
eb-0.012
over-0.012
ucky-0.010
'-0.010
ump-0.010
Token a
Token ID257
Feature activation-7.05e-03

Pos logit contributions
 moral+0.012
 exc+0.012
lights+0.012
 swollen+0.012
 works+0.011
Neg logit contributions
eb-0.016
over-0.015
'-0.014
e-0.014
ucky-0.014
Token old
Token ID1468
Feature activation+8.73e-03

Pos logit contributions
eb+0.031
over+0.029
e+0.028
'+0.028
ive+0.026
Neg logit contributions
 moral-0.009
lights-0.008
 exc-0.008
 works-0.007
 swollen-0.007
Token child
Token ID1200
Feature activation-5.48e-03

Pos logit contributions
 moral+0.009
 exc+0.008
 swollen+0.008
 lamp+0.008
lights+0.008
Neg logit contributions
eb-0.019
over-0.018
'-0.017
e-0.017
ucky-0.016
Token was
Token ID373
Feature activation-2.75e-03

Pos logit contributions
eb+0.010
over+0.010
ax+0.009
ucky+0.009
ale+0.009
Neg logit contributions
 moral-0.006
 exc-0.006
 works-0.006
 lamp-0.006
 pret-0.006
Token walking
Token ID6155
Feature activation+5.60e-03

Pos logit contributions
eb+0.005
over+0.004
'+0.004
ale+0.004
ax+0.004
Neg logit contributions
 moral-0.004
 exc-0.004
 pret-0.003
 lamp-0.003
 works-0.003
Token in
Token ID287
Feature activation+1.12e-02

Pos logit contributions
 moral+0.010
 exc+0.010
lights+0.010
bel+0.010
 pret+0.010
Neg logit contributions
eb-0.007
over-0.006
'-0.006
ucky-0.006
ax-0.006
Token a
Token ID257
Feature activation+4.56e-03

Pos logit contributions
 moral+0.018
 exc+0.017
 works+0.017
 swollen+0.017
 lamp+0.017
Neg logit contributions
eb-0.018
over-0.016
'-0.015
ucky-0.014
ive-0.014
Token park
Token ID3952
Feature activation-2.03e-02

Pos logit contributions
 moral+0.006
 lamp+0.005
 poster+0.005
 exc+0.005
 swollen+0.005
Neg logit contributions
eb-0.009
over-0.008
ax-0.008
e-0.008
umbs-0.008
Token with
Token ID351
Feature activation-9.37e-03

Pos logit contributions
eb+0.032
over+0.029
'+0.027
ucky+0.026
ax+0.026
Neg logit contributions
 moral-0.032
 exc-0.030
lights-0.029
 works-0.029
 pret-0.028
Token their
Token ID511
Feature activation-1.96e-02

Pos logit contributions
eb+0.019
over+0.018
'+0.017
e+0.016
ucky+0.016
Neg logit contributions
 moral-0.011
 exc-0.010
 swollen-0.010
lights-0.010
 works-0.009
Token parents
Token ID3397
Feature activation-8.22e-03

Pos logit contributions
eb+0.040
over+0.037
'+0.035
e+0.034
ucky+0.033
Neg logit contributions
 moral-0.023
 exc-0.021
 swollen-0.021
lights-0.021
 works-0.020
Token.
Token ID13
Feature activation-1.83e-02

Pos logit contributions
eb+0.017
over+0.016
aw+0.015
ive+0.015
ete+0.014
Neg logit contributions
 moral-0.009
 exc-0.008
 pretending-0.008
 lamp-0.008
 robot-0.008
Token.
Token ID13
Feature activation-2.15e-02

Pos logit contributions
eb+0.032
over+0.031
ete+0.027
ive+0.027
ucky+0.026
Neg logit contributions
 exc-0.029
 moral-0.028
 lamp-0.027
bul-0.026
 works-0.026
Token She
Token ID1375
Feature activation+1.61e-02

Pos logit contributions
eb+0.040
e+0.039
over+0.036
or+0.034
ee+0.033
Neg logit contributions
 moral-0.030
 exc-0.027
lights-0.023
 disl-0.022
 works-0.022
Token wanted
Token ID2227
Feature activation-1.49e-03

Pos logit contributions
lights+0.023
 moral+0.023
 swollen+0.022
 poster+0.021
bul+0.021
Neg logit contributions
eb-0.027
over-0.025
e-0.025
'-0.025
ucky-0.022
Token to
Token ID284
Feature activation+1.22e-02

Pos logit contributions
eb+0.002
over+0.002
'+0.002
e+0.002
ucky+0.002
Neg logit contributions
 moral-0.003
 exc-0.002
 lamp-0.002
 works-0.002
 pret-0.002
Token have
Token ID423
Feature activation-3.05e-03

Pos logit contributions
 moral+0.019
 exc+0.019
 lamp+0.018
 works+0.018
 pret+0.017
Neg logit contributions
eb-0.019
over-0.018
'-0.016
ucky-0.016
ale-0.016
Token a
Token ID257
Feature activation+4.30e-03

Pos logit contributions
eb+0.006
over+0.006
'+0.005
e+0.005
ucky+0.005
Neg logit contributions
 moral-0.004
 exc-0.003
 swollen-0.003
 works-0.003
 pret-0.003
Token fun
Token ID1257
Feature activation+2.21e-02

Pos logit contributions
lights+0.007
 moral+0.006
 swollen+0.006
 exc+0.006
 works+0.006
Neg logit contributions
eb-0.007
over-0.006
'-0.006
e-0.006
ucky-0.006
Token day
Token ID1110
Feature activation-8.57e-04

Pos logit contributions
 moral+0.039
 swollen+0.039
lights+0.038
 exc+0.036
 works+0.036
Neg logit contributions
eb-0.030
over-0.028
'-0.027
e-0.025
ake-0.024
Token.
Token ID13
Feature activation-1.18e-02

Pos logit contributions
eb+0.002
ete+0.001
ale+0.001
umbs+0.001
ucky+0.001
Neg logit contributions
 pretending-0.001
 exc-0.001
 trumpet-0.001
 trump-0.001
 lamp-0.001
Token So
Token ID1406
Feature activation+5.13e-03

Pos logit contributions
eb+0.026
e+0.026
over+0.025
dd+0.023
ake+0.022
Neg logit contributions
 moral-0.011
 exc-0.009
 Maybe-0.008
 how-0.008
 sob-0.007
Token she
Token ID673
Feature activation+9.89e-03

Pos logit contributions
 moral+0.008
 exc+0.007
lights+0.006
 swollen+0.006
 forgiving+0.006
Neg logit contributions
eb-0.009
over-0.008
e-0.008
umbs-0.008
atter-0.008
Token\n
Token ID198
Feature activation-6.48e-03

Pos logit contributions
eb+0.006
over+0.005
e+0.005
'+0.005
ale+0.005
Neg logit contributions
 moral-0.005
 exc-0.005
 pret-0.005
lights-0.005
 works-0.005
Token\n
Token ID198
Feature activation-9.26e-04

Pos logit contributions
eb+0.009
over+0.008
'+0.007
e+0.007
ucky+0.007
Neg logit contributions
 moral-0.012
lights-0.011
 exc-0.011
 swollen-0.011
 works-0.011
TokenThey
Token ID2990
Feature activation-5.61e-03

Pos logit contributions
eb+0.002
over+0.001
'+0.001
e+0.001
ucky+0.001
Neg logit contributions
 moral-0.001
lights-0.001
 swollen-0.001
 exc-0.001
 works-0.001
Token climb
Token ID12080
Feature activation-2.08e-03

Pos logit contributions
eb+0.010
over+0.010
ucky+0.008
ake+0.008
ump+0.008
Neg logit contributions
 moral-0.008
 exc-0.008
 pret-0.008
 works-0.008
 poster-0.007
Token the
Token ID262
Feature activation-1.16e-02

Pos logit contributions
eb+0.003
over+0.003
'+0.003
e+0.003
ax+0.003
Neg logit contributions
 moral-0.003
lights-0.003
 pret-0.003
 works-0.003
 exc-0.003
Token tree
Token ID5509
Feature activation-1.27e-02

Pos logit contributions
eb+0.019
over+0.018
'+0.016
e+0.016
ucky+0.015
Neg logit contributions
 moral-0.018
 exc-0.017
lights-0.017
 works-0.017
 swollen-0.017
Token and
Token ID290
Feature activation-7.63e-03

Pos logit contributions
eb+0.020
over+0.019
ax+0.016
ucky+0.016
e+0.016
Neg logit contributions
 moral-0.020
 exc-0.019
lights-0.019
 works-0.019
 lamp-0.019
Token pick
Token ID2298
Feature activation-7.32e-03

Pos logit contributions
eb+0.012
over+0.012
ucky+0.011
ake+0.010
'+0.010
Neg logit contributions
 moral-0.012
 pret-0.012
 exc-0.012
 works-0.012
 poster-0.011
Token some
Token ID617
Feature activation-2.98e-02

Pos logit contributions
over+0.015
'+0.014
eb+0.014
e+0.014
ake+0.013
Neg logit contributions
 moral-0.007
 pret-0.007
 works-0.007
 noises-0.006
 zo-0.006
Token pr
Token ID778
Feature activation-4.87e-03

Pos logit contributions
eb+0.051
over+0.049
'+0.045
e+0.044
ax+0.041
Neg logit contributions
 moral-0.043
 exc-0.040
 works-0.039
lights-0.039
 swollen-0.039
Tokenunes
Token ID4015
Feature activation-1.80e-02

Pos logit contributions
over+0.007
eb+0.007
'+0.007
e+0.006
he+0.006
Neg logit contributions
 works-0.007
 pret-0.007
lights-0.007
 moral-0.007
 exc-0.007
Token sped
Token ID40424
Feature activation-4.30e-03

Pos logit contributions
 moral+0.026
lights+0.025
 swollen+0.025
 exc+0.025
 poster+0.024
Neg logit contributions
eb-0.030
'-0.026
over-0.026
e-0.025
ard-0.024
Token over
Token ID625
Feature activation-2.14e-03

Pos logit contributions
eb+0.006
over+0.005
'+0.005
e+0.005
ucky+0.004
Neg logit contributions
 moral-0.008
lights-0.008
 exc-0.007
 works-0.007
 pret-0.007
Token the
Token ID262
Feature activation-1.15e-02

Pos logit contributions
eb+0.003
over+0.003
ucky+0.003
'+0.003
e+0.003
Neg logit contributions
 moral-0.003
 works-0.003
lights-0.003
 exc-0.003
 pret-0.003
Token hills
Token ID18639
Feature activation-2.21e-02

Pos logit contributions
eb+0.019
over+0.018
e+0.016
'+0.016
ucky+0.015
Neg logit contributions
 moral-0.018
 exc-0.016
 swollen-0.016
lights-0.016
 works-0.016
Token and
Token ID290
Feature activation-8.00e-03

Pos logit contributions
eb+0.036
over+0.034
ucky+0.032
e+0.031
ive+0.030
Neg logit contributions
lights-0.031
 moral-0.031
 exc-0.029
bel-0.029
 works-0.029
Token soon
Token ID2582
Feature activation-5.16e-03

Pos logit contributions
eb+0.013
over+0.012
'+0.011
e+0.011
ucky+0.011
Neg logit contributions
 moral-0.012
lights-0.011
 works-0.011
 exc-0.011
 pret-0.011
Token the
Token ID262
Feature activation-1.41e-02

Pos logit contributions
eb+0.010
over+0.010
ucky+0.009
e+0.009
'+0.009
Neg logit contributions
 works-0.005
 moral-0.005
lights-0.005
 pret-0.005
 exc-0.005
Token mole
Token ID9411
Feature activation-2.18e-02

Pos logit contributions
eb+0.023
over+0.022
'+0.020
e+0.020
ucky+0.019
Neg logit contributions
 moral-0.021
 exc-0.020
lights-0.020
 swollen-0.020
 works-0.019
Token arrived
Token ID5284
Feature activation-1.44e-02

Pos logit contributions
eb+0.038
over+0.036
'+0.033
ucky+0.033
e+0.032
Neg logit contributions
 moral-0.029
 exc-0.028
lights-0.028
 works-0.028
 swollen-0.027
Token at
Token ID379
Feature activation-2.99e-03

Pos logit contributions
eb+0.018
over+0.016
e+0.015
'+0.015
ucky+0.014
Neg logit contributions
 moral-0.027
lights-0.026
 exc-0.026
 swollen-0.025
 works-0.025
Token his
Token ID465
Feature activation-1.00e-02

Pos logit contributions
eb+0.005
over+0.005
'+0.005
ucky+0.005
ive+0.005
Neg logit contributions
 moral-0.004
 works-0.004
 exc-0.004
 pret-0.004
 lamp-0.004
Token The
Token ID383
Feature activation+1.96e-03

Pos logit contributions
eb+0.006
over+0.005
e+0.005
'+0.005
ale+0.005
Neg logit contributions
 moral-0.006
 exc-0.006
lights-0.006
 swollen-0.006
 pret-0.006
Token little
Token ID1310
Feature activation-2.11e-02

Pos logit contributions
lights+0.003
 pret+0.003
 works+0.003
 moral+0.003
 exc+0.003
Neg logit contributions
eb-0.003
over-0.003
'-0.003
ucky-0.003
e-0.003
Token girl
Token ID2576
Feature activation-3.44e-03

Pos logit contributions
eb+0.045
over+0.042
'+0.040
e+0.039
ucky+0.038
Neg logit contributions
 moral-0.022
lights-0.020
 exc-0.020
 swollen-0.019
 works-0.019
Token explained
Token ID4893
Feature activation-2.10e-02

Pos logit contributions
eb+0.006
'+0.005
e+0.005
over+0.005
ard+0.005
Neg logit contributions
lights-0.005
 moral-0.005
 swollen-0.005
 poster-0.005
 appearances-0.005
Token that
Token ID326
Feature activation+6.78e-03

Pos logit contributions
eb+0.031
over+0.029
'+0.026
e+0.026
ucky+0.025
Neg logit contributions
 moral-0.035
 exc-0.033
lights-0.033
 swollen-0.033
 works-0.032
Token it
Token ID340
Feature activation-8.93e-03

Pos logit contributions
 moral+0.011
 exc+0.011
lights+0.011
 swollen+0.010
 works+0.010
Neg logit contributions
eb-0.010
over-0.009
'-0.009
e-0.008
ucky-0.008
Token had
Token ID550
Feature activation-2.29e-02

Pos logit contributions
eb+0.015
over+0.014
'+0.014
e+0.013
ucky+0.013
Neg logit contributions
 moral-0.013
lights-0.012
 exc-0.012
 swollen-0.012
 pret-0.011
Token shrunk
Token ID38900
Feature activation-5.10e-03

Pos logit contributions
eb+0.037
over+0.035
'+0.032
e+0.031
ucky+0.030
Neg logit contributions
 moral-0.035
 exc-0.033
 swollen-0.033
 works-0.032
lights-0.032
Token and
Token ID290
Feature activation-2.08e-03

Pos logit contributions
eb+0.008
over+0.007
'+0.007
e+0.007
ax+0.006
Neg logit contributions
 moral-0.008
lights-0.008
 works-0.008
 swollen-0.008
 exc-0.007
Token become
Token ID1716
Feature activation-1.92e-03

Pos logit contributions
eb+0.003
over+0.003
'+0.003
e+0.003
ucky+0.002
Neg logit contributions
 moral-0.003
lights-0.003
 exc-0.003
 swollen-0.003
 pret-0.003
Token swe
Token ID3490
Feature activation-2.75e-02

Pos logit contributions
eb+0.003
over+0.003
'+0.003
e+0.003
ucky+0.003
Neg logit contributions
 moral-0.003
lights-0.003
 exc-0.003
 works-0.002
 poster-0.002
Token.
Token ID13
Feature activation+6.91e-04

Pos logit contributions
lights+0.034
 works+0.034
 lamp+0.033
bul+0.033
lf+0.032
Neg logit contributions
eb-0.040
over-0.035
ale-0.033
ete-0.033
ive-0.033
Token Everything
Token ID11391
Feature activation-3.35e-03

Pos logit contributions
 moral+0.001
 lamp+0.001
 exc+0.001
 swollen+0.001
lights+0.001
Neg logit contributions
eb-0.001
over-0.001
e-0.001
'-0.001
ale-0.001
Token was
Token ID373
Feature activation-6.56e-03

Pos logit contributions
eb+0.006
over+0.005
'+0.005
e+0.005
ucky+0.005
Neg logit contributions
 moral-0.005
 exc-0.004
 swollen-0.004
lights-0.004
 pret-0.004
Token sharp
Token ID7786
Feature activation+1.27e-02

Pos logit contributions
eb+0.010
over+0.009
'+0.009
e+0.009
ucky+0.008
Neg logit contributions
 moral-0.010
 poster-0.010
 pret-0.009
lights-0.009
 exc-0.009
Token and
Token ID290
Feature activation+8.97e-03

Pos logit contributions
 works+0.019
lights+0.018
 pret+0.018
 lamp+0.017
 moral+0.017
Neg logit contributions
eb-0.021
over-0.019
'-0.018
e-0.018
ax-0.018
Token unsafe
Token ID21596
Feature activation-8.74e-03

Pos logit contributions
 moral+0.013
 exc+0.012
lights+0.012
 swollen+0.012
 pret+0.012
Neg logit contributions
eb-0.015
over-0.014
'-0.013
e-0.013
ucky-0.012
Token.
Token ID13
Feature activation+7.77e-03

Pos logit contributions
eb+0.014
over+0.012
e+0.012
'+0.012
ax+0.011
Neg logit contributions
lights-0.013
 works-0.013
 lamp-0.013
 pret-0.013
 moral-0.013
Token\n
Token ID198
Feature activation+1.07e-03

Pos logit contributions
 moral+0.012
 exc+0.011
 pret+0.010
 lamp+0.010
lights+0.010
Neg logit contributions
eb-0.013
over-0.011
ale-0.011
e-0.011
'-0.011
Token\n
Token ID198
Feature activation+3.52e-03

Pos logit contributions
 moral+0.002
 exc+0.002
 swollen+0.002
lights+0.001
 lamp+0.001
Neg logit contributions
eb-0.002
over-0.002
e-0.001
ale-0.001
'-0.001
TokenBut
Token ID1537
Feature activation+1.73e-02

Pos logit contributions
 moral+0.005
 exc+0.004
 swollen+0.004
lights+0.004
 works+0.004
Neg logit contributions
eb-0.006
over-0.006
e-0.005
'-0.005
ale-0.005
Token Jimmy
Token ID12963
Feature activation-2.33e-02

Pos logit contributions
 moral+0.026
 exc+0.024
 swollen+0.024
lights+0.024
 pret+0.024
Neg logit contributions
eb-0.028
over-0.026
'-0.025
e-0.024
ucky-0.024
Token lovely
Token ID14081
Feature activation-2.35e-02

Pos logit contributions
eb+0.008
over+0.007
'+0.007
e+0.007
ucky+0.006
Neg logit contributions
lights-0.007
 moral-0.007
 swollen-0.007
 exc-0.006
 works-0.006
Token day
Token ID1110
Feature activation-5.71e-03

Pos logit contributions
eb+0.036
over+0.032
e+0.030
'+0.030
ucky+0.028
Neg logit contributions
 moral-0.039
 exc-0.037
 works-0.036
 pret-0.035
 lamp-0.035
Token."
Token ID526
Feature activation-2.30e-02

Pos logit contributions
eb+0.009
over+0.008
ucky+0.008
ax+0.007
ale+0.007
Neg logit contributions
 moral-0.008
 exc-0.008
 works-0.008
lights-0.008
 pret-0.008
Token Then
Token ID3244
Feature activation+1.49e-02

Pos logit contributions
eb+0.036
over+0.032
e+0.030
'+0.029
ale+0.028
Neg logit contributions
 moral-0.036
 exc-0.034
lights-0.034
 pret-0.033
 lamp-0.033
Token she
Token ID673
Feature activation+5.56e-03

Pos logit contributions
 moral+0.020
 exc+0.019
lights+0.019
 works+0.019
 pret+0.019
Neg logit contributions
eb-0.026
over-0.025
'-0.023
e-0.023
ucky-0.022
Token waved
Token ID26834
Feature activation-3.97e-03

Pos logit contributions
 moral+0.008
lights+0.008
 swollen+0.008
 exc+0.008
 appearances+0.008
Neg logit contributions
eb-0.009
'-0.008
over-0.008
e-0.007
ax-0.007
Token goodbye
Token ID24829
Feature activation-2.20e-02

Pos logit contributions
eb+0.007
over+0.006
e+0.006
'+0.006
ucky+0.006
Neg logit contributions
 moral-0.006
lights-0.005
 exc-0.005
 swollen-0.005
bel-0.005
Token to
Token ID284
Feature activation+4.42e-04

Pos logit contributions
eb+0.037
ucky+0.032
over+0.032
e+0.032
'+0.030
Neg logit contributions
bel-0.030
 moral-0.030
lights-0.029
 exc-0.029
 pret-0.028
Token the
Token ID262
Feature activation+2.32e-04

Pos logit contributions
 moral+0.001
 exc+0.001
 pret+0.001
 poster+0.001
 works+0.001
Neg logit contributions
eb-0.001
over-0.001
ucky-0.001
ake-0.001
ram-0.001
Token family
Token ID1641
Feature activation+1.53e-02

Pos logit contributions
lights+0.000
 moral+0.000
 swollen+0.000
 exc+0.000
 works+0.000
Neg logit contributions
eb-0.000
over-0.000
'-0.000
e-0.000
ucky-0.000
Token and
Token ID290
Feature activation-5.55e-03

Pos logit contributions
 moral+0.022
lights+0.021
 exc+0.021
bul+0.020
 lamp+0.020
Neg logit contributions
eb-0.026
over-0.023
ucky-0.022
e-0.021
ax-0.020
Token\n
Token ID198
Feature activation+1.37e-03

Pos logit contributions
eb+0.003
over+0.002
'+0.002
e+0.002
ucky+0.002
Neg logit contributions
 moral-0.002
lights-0.002
 swollen-0.002
 exc-0.002
 works-0.002
TokenThen
Token ID6423
Feature activation+1.97e-03

Pos logit contributions
 moral+0.002
lights+0.002
 works+0.002
 swollen+0.002
 exc+0.002
Neg logit contributions
eb-0.002
over-0.002
'-0.002
e-0.002
ucky-0.002
Token,
Token ID11
Feature activation+1.76e-03

Pos logit contributions
 moral+0.003
 pret+0.003
 exc+0.003
lights+0.003
 works+0.003
Neg logit contributions
eb-0.003
over-0.003
'-0.003
e-0.002
ucky-0.002
Token the
Token ID262
Feature activation-3.50e-03

Pos logit contributions
 moral+0.003
 exc+0.002
lights+0.002
 swollen+0.002
 pret+0.002
Neg logit contributions
eb-0.003
over-0.003
'-0.003
e-0.002
ucky-0.002
Token sun
Token ID4252
Feature activation-1.87e-02

Pos logit contributions
over+0.005
eb+0.005
'+0.005
ucky+0.005
ive+0.005
Neg logit contributions
lights-0.005
 swollen-0.005
 pret-0.005
 works-0.005
bul-0.005
Token began
Token ID2540
Feature activation-4.75e-02

Pos logit contributions
eb+0.032
over+0.030
'+0.028
e+0.027
ucky+0.027
Neg logit contributions
 moral-0.027
lights-0.025
 exc-0.025
 swollen-0.025
 works-0.024
Token to
Token ID284
Feature activation-1.84e-02

Pos logit contributions
eb+0.061
over+0.055
'+0.050
e+0.049
ucky+0.047
Neg logit contributions
 moral-0.089
 exc-0.085
lights-0.085
 swollen-0.084
 works-0.083
Token set
Token ID900
Feature activation-3.02e-02

Pos logit contributions
eb+0.028
over+0.025
'+0.024
e+0.023
ucky+0.022
Neg logit contributions
 moral-0.030
lights-0.029
 swollen-0.028
 exc-0.028
 works-0.028
Token.
Token ID13
Feature activation+4.64e-04

Pos logit contributions
eb+0.058
ax+0.052
e+0.051
ram+0.051
ete+0.050
Neg logit contributions
 moral-0.035
lights-0.034
 noises-0.032
bul-0.032
 lamp-0.031
Token They
Token ID1119
Feature activation-5.97e-03

Pos logit contributions
 moral+0.001
lights+0.001
 exc+0.001
 pret+0.001
 lamp+0.001
Neg logit contributions
eb-0.001
over-0.001
'-0.001
e-0.001
ale-0.001
Token had
Token ID550
Feature activation-1.11e-02

Pos logit contributions
eb+0.010
over+0.010
'+0.009
ucky+0.009
e+0.008
Neg logit contributions
 moral-0.009
 exc-0.008
lights-0.008
 pret-0.008
 swollen-0.008
Token\n
Token ID198
Feature activation+6.92e-04

Pos logit contributions
eb+0.003
over+0.003
e+0.002
'+0.002
ale+0.002
Neg logit contributions
 moral-0.003
 exc-0.002
lights-0.002
 swollen-0.002
 works-0.002
TokenAt
Token ID2953
Feature activation+1.50e-02

Pos logit contributions
 moral+0.001
 exc+0.001
lights+0.001
 swollen+0.001
 works+0.001
Neg logit contributions
eb-0.001
over-0.001
'-0.001
e-0.001
ucky-0.001
Token the
Token ID262
Feature activation-8.65e-03

Pos logit contributions
 moral+0.023
 poster+0.021
lights+0.021
 exc+0.021
 works+0.021
Neg logit contributions
eb-0.025
over-0.024
'-0.022
ard-0.021
e-0.021
Token end
Token ID886
Feature activation-2.02e-02

Pos logit contributions
eb+0.018
over+0.016
ax+0.015
e+0.015
ump+0.015
Neg logit contributions
 moral-0.013
 lamp-0.010
 poster-0.009
 robot-0.009
 swollen-0.009
Token of
Token ID286
Feature activation-3.15e-02

Pos logit contributions
eb+0.027
over+0.024
e+0.022
'+0.022
ax+0.021
Neg logit contributions
 moral-0.037
lights-0.036
 exc-0.035
 swollen-0.035
 lamp-0.034
Token their
Token ID511
Feature activation-4.78e-02

Pos logit contributions
eb+0.050
over+0.047
ard+0.044
'+0.044
ive+0.043
Neg logit contributions
 moral-0.049
 swollen-0.045
lights-0.045
 exc-0.044
 poster-0.043
Token adventure
Token ID8855
Feature activation+1.09e-04

Pos logit contributions
eb+0.096
over+0.085
ax+0.085
he+0.079
ump+0.079
Neg logit contributions
 moral-0.068
 poster-0.057
 swollen-0.056
 exc-0.056
 works-0.056
Token,
Token ID11
Feature activation+9.33e-03

Pos logit contributions
 works+0.000
lights+0.000
 moral+0.000
 poster+0.000
 lamp+0.000
Neg logit contributions
eb-0.000
over-0.000
e-0.000
ucky-0.000
umbs-0.000
Token Jack
Token ID3619
Feature activation-1.11e-02

Pos logit contributions
 moral+0.017
lights+0.016
 exc+0.016
 works+0.016
 swollen+0.016
Neg logit contributions
eb-0.013
over-0.011
e-0.010
ram-0.010
ale-0.010
Token and
Token ID290
Feature activation-1.55e-02

Pos logit contributions
eb+0.018
over+0.017
'+0.016
ucky+0.016
e+0.015
Neg logit contributions
 moral-0.016
 exc-0.015
 works-0.015
 swollen-0.015
 pret-0.015
Token the
Token ID262
Feature activation-1.85e-04

Pos logit contributions
eb+0.025
over+0.023
'+0.022
e+0.021
ucky+0.021
Neg logit contributions
 moral-0.025
 exc-0.023
 works-0.022
lights-0.022
 swollen-0.022
Token felt
Token ID2936
Feature activation+6.07e-03

Pos logit contributions
lights+0.030
 appearances+0.028
 moral+0.028
 poster+0.028
 satellites+0.027
Neg logit contributions
eb-0.029
'-0.027
e-0.025
over-0.024
aw-0.022
Token relieved
Token ID22598
Feature activation-2.01e-02

Pos logit contributions
 moral+0.008
 exc+0.008
 swollen+0.008
lights+0.007
 works+0.007
Neg logit contributions
eb-0.011
over-0.010
'-0.009
e-0.009
ale-0.009
Token that
Token ID326
Feature activation-1.33e-03

Pos logit contributions
eb+0.033
over+0.030
e+0.029
ucky+0.027
'+0.027
Neg logit contributions
 moral-0.030
lights-0.027
 swollen-0.027
 forgiving-0.027
 exc-0.027
Token she
Token ID673
Feature activation+4.81e-03

Pos logit contributions
eb+0.003
over+0.002
e+0.002
ucky+0.002
'+0.002
Neg logit contributions
 moral-0.002
 exc-0.002
 works-0.002
 swollen-0.002
lights-0.002
Token was
Token ID373
Feature activation+3.18e-03

Pos logit contributions
 moral+0.007
 exc+0.007
 swollen+0.007
lights+0.007
 works+0.007
Neg logit contributions
eb-0.008
over-0.007
'-0.007
e-0.006
ucky-0.006
Token able
Token ID1498
Feature activation-5.62e-02

Pos logit contributions
 moral+0.005
 exc+0.005
 swollen+0.005
 works+0.004
lights+0.004
Neg logit contributions
eb-0.005
over-0.005
ucky-0.004
ale-0.004
'-0.004
Token to
Token ID284
Feature activation+4.47e-03

Pos logit contributions
eb+0.080
over+0.071
'+0.063
e+0.063
ucky+0.061
Neg logit contributions
 moral-0.100
 swollen-0.096
 exc-0.095
 works-0.092
lights-0.091
Token help
Token ID1037
Feature activation-8.90e-03

Pos logit contributions
 moral+0.006
 exc+0.006
 works+0.005
 swollen+0.005
 lamp+0.005
Neg logit contributions
eb-0.008
over-0.008
ucky-0.007
aw-0.007
ax-0.007
Token the
Token ID262
Feature activation+1.38e-02

Pos logit contributions
eb+0.018
umbs+0.016
atter+0.016
ax+0.016
aw+0.015
Neg logit contributions
 moral-0.010
 lamp-0.010
 mistake-0.010
 exc-0.009
 forgiving-0.009
Token bunny
Token ID44915
Feature activation+5.34e-03

Pos logit contributions
 moral+0.019
lights+0.019
 exc+0.018
 works+0.018
 pret+0.018
Neg logit contributions
eb-0.023
over-0.022
'-0.021
e-0.020
ucky-0.020
Token in
Token ID287
Feature activation+1.33e-02

Pos logit contributions
 moral+0.009
 exc+0.009
 works+0.008
 lamp+0.008
 swollen+0.008
Neg logit contributions
eb-0.008
over-0.007
ax-0.006
ucky-0.006
e-0.006
Token they
Token ID484
Feature activation-2.49e-02

Pos logit contributions
eb+0.032
over+0.030
e+0.027
ive+0.026
aw+0.026
Neg logit contributions
lights-0.022
 moral-0.021
 works-0.020
 exc-0.020
bul-0.019
Token needed
Token ID2622
Feature activation-1.85e-02

Pos logit contributions
eb+0.048
over+0.045
ucky+0.040
ale+0.039
ump+0.039
Neg logit contributions
 exc-0.032
 works-0.032
 moral-0.032
 pret-0.030
bel-0.030
Token:
Token ID25
Feature activation+2.07e-03

Pos logit contributions
eb+0.030
over+0.027
ucky+0.024
ax+0.024
ive+0.024
Neg logit contributions
 moral-0.027
 works-0.026
 pret-0.025
 exc-0.025
lights-0.025
Token tomatoes
Token ID23972
Feature activation-4.35e-02

Pos logit contributions
 moral+0.003
 exc+0.003
 pret+0.003
lights+0.003
 works+0.003
Neg logit contributions
eb-0.004
over-0.003
'-0.003
e-0.003
ucky-0.003
Token,
Token ID11
Feature activation-4.08e-03

Pos logit contributions
eb+0.066
over+0.061
e+0.054
ucky+0.052
ale+0.052
Neg logit contributions
 moral-0.067
 works-0.067
 exc-0.066
lights-0.066
 pret-0.064
Token garlic
Token ID17644
Feature activation-5.06e-02

Pos logit contributions
eb+0.007
over+0.006
'+0.006
e+0.006
ucky+0.005
Neg logit contributions
 moral-0.006
lights-0.006
 exc-0.006
 swollen-0.006
 pret-0.006
Token,
Token ID11
Feature activation-1.02e-02

Pos logit contributions
eb+0.080
over+0.073
ax+0.065
e+0.065
he+0.064
Neg logit contributions
 moral-0.076
 works-0.073
lights-0.073
 pret-0.072
 exc-0.070
Token onion
Token ID21670
Feature activation-1.46e-02

Pos logit contributions
eb+0.017
over+0.016
'+0.014
e+0.014
ale+0.014
Neg logit contributions
 moral-0.015
lights-0.015
 exc-0.014
 pret-0.014
 swollen-0.014
Token,
Token ID11
Feature activation-1.85e-02

Pos logit contributions
eb+0.022
over+0.020
e+0.018
ax+0.018
he+0.018
Neg logit contributions
 works-0.023
lights-0.022
 moral-0.022
 pret-0.022
 lamp-0.022
Token oil
Token ID3056
Feature activation-3.44e-02

Pos logit contributions
eb+0.029
over+0.027
'+0.024
e+0.023
ucky+0.023
Neg logit contributions
 moral-0.030
 exc-0.028
lights-0.028
 pret-0.028
 swollen-0.028
Token,
Token ID11
Feature activation-1.83e-02

Pos logit contributions
eb+0.055
over+0.050
e+0.045
ax+0.044
other+0.044
Neg logit contributions
 works-0.051
lights-0.050
 pret-0.049
 moral-0.048
 lamp-0.047
Token and
Token ID290
Feature activation-1.59e-02

Pos logit contributions
eb+0.009
over+0.008
'+0.007
e+0.007
ucky+0.007
Neg logit contributions
 moral-0.011
lights-0.011
 exc-0.011
 swollen-0.010
 works-0.010
Token testing
Token ID4856
Feature activation-1.05e-03

Pos logit contributions
eb+0.023
e+0.021
'+0.020
over+0.020
ive+0.018
Neg logit contributions
 moral-0.026
lights-0.026
bul-0.024
 poster-0.024
 exc-0.023
Token to
Token ID284
Feature activation+1.12e-02

Pos logit contributions
eb+0.002
over+0.002
'+0.001
e+0.001
ucky+0.001
Neg logit contributions
 moral-0.002
lights-0.002
 pret-0.002
 works-0.002
 exc-0.002
Token see
Token ID766
Feature activation+4.86e-03

Pos logit contributions
 moral+0.017
 exc+0.016
 swollen+0.016
 works+0.016
lights+0.016
Neg logit contributions
eb-0.018
over-0.017
'-0.016
ucky-0.015
e-0.015
Token which
Token ID543
Feature activation-2.62e-02

Pos logit contributions
 moral+0.006
 exc+0.006
lights+0.006
 works+0.005
 swollen+0.005
Neg logit contributions
eb-0.009
over-0.009
'-0.008
e-0.008
ucky-0.008
Token one
Token ID530
Feature activation-4.72e-02

Pos logit contributions
eb+0.042
over+0.038
'+0.037
e+0.035
ucky+0.034
Neg logit contributions
 moral-0.040
lights-0.038
 exc-0.038
 swollen-0.037
 pret-0.037
Token tasted
Token ID29187
Feature activation-4.41e-02

Pos logit contributions
eb+0.072
over+0.065
'+0.061
e+0.059
ucky+0.057
Neg logit contributions
 moral-0.077
 exc-0.072
lights-0.072
 swollen-0.072
 works-0.071
Token the
Token ID262
Feature activation-2.12e-02

Pos logit contributions
eb+0.072
over+0.066
'+0.060
e+0.060
ale+0.057
Neg logit contributions
 moral-0.069
 exc-0.065
 works-0.064
 swollen-0.064
lights-0.063
Token best
Token ID1266
Feature activation-2.78e-02

Pos logit contributions
eb+0.034
over+0.032
'+0.030
e+0.029
ucky+0.029
Neg logit contributions
lights-0.030
 moral-0.030
 exc-0.029
 swollen-0.029
 works-0.029
Token.
Token ID13
Feature activation-3.19e-03

Pos logit contributions
eb+0.043
over+0.039
'+0.037
e+0.035
ucky+0.035
Neg logit contributions
 moral-0.045
lights-0.042
 swollen-0.042
 exc-0.042
 works-0.041
Token\n
Token ID198
Feature activation-7.63e-03

Pos logit contributions
eb+0.005
over+0.005
'+0.004
e+0.004
ale+0.004
Neg logit contributions
 moral-0.005
 exc-0.005
lights-0.005
 pret-0.005
 works-0.005