TOP ACTIVATIONS
MAX = 5.672

because she didn 't want Bun Bun to be hurt .
she didn 't want Bun Bun to be hurt .
his dad wanted to teach Jon something special . So his
The baby 's name was Ag nes . Ag nes loved
didn 't want to make Billy mad , so he left
for his obedience and cle vern ess . <|endoftext|> Once upon
sad and frustrated , wishing someone would be as kind as
didn 't want to make Billy mad , so he left
but he didn 't want Billy to mess up his creations
very boss y and wanted Billy to move away .
it and it didn 't bounce anymore . She was sad
to her . She wished Max was still alive . <|endoftext|>
was very lonely and wanted someone to be with . One
sad and wanted to make Billy happy , so he let
mustache was long and w igg ly . The dragon was
want to do something new and fun . "
, Tim my started w igg ling and dancing to the
did not want to make Sally mad . So , they
excited that he was w igg ling around like a fish
didn 't want to do anything . Later that

MIN ACTIVATIONS
MIN = -6.301

is eight . It has eight and zero ."
so weird . It has eight legs and a lot of
was a box of ch ocol ates . Lily loved ch
a big box of ch ocol ates . It was so
find a box of ch ocol ates . "
and started eating the ch ocol ates . The shop keeper
a big box of ch ocol ates !
was a box of ch ocol ates . " For me
a big box of ch ocol ates ! <|endoftext|> Once upon
with flowers , hats with ears , and hats with stars
a large selection of ch ocol ates . Ell a picked
, " This is number eight ." She shows Tom a
was a box of ch ocol ates ! Alice and her
. See , it has eight legs and two eyes and
It was a special gu inea pig . Ben
noticed something . The ch ocol ates were bitter . They
hairy spider . It had eight legs and eight eyes .
black spider . It had eight legs and many eyes .
They see strawberry jam , raspberry jam , blue berry jam
is six . It has six and zero ."

INTERVAL 2.678 - 5.672
CONTAINS 0.684%

They said they loved him and the mountain . <|endoftext|> Ben
is important to be careful and not do silly things .
was a little girl called Bella . Bella was three -
the room to make it look more special . She knew
his friend . They played all day and then went back

INTERVAL -0.315 - 2.678
CONTAINS 60.520%

. Jane learned how to sew and make the doll 's
band - aid ." L ily smiled back
She loved Grand ma . She wanted to make Grand ma
gone away ." This made the baby realise
eal ed with joy ! Em ma wanted to

INTERVAL -3.308 - -0.315
CONTAINS 38.670%

. They went home , carrying their file of
Can I play with you ?" The cat opened its mouth
a . She took Louis a to the modern screen ,
with his family . He liked to jump and play all
so happy . They hugged each other and then went home

INTERVAL -6.301 - -3.308
CONTAINS 0.126%

you very much . I know you miss your mom ,
of fun . After playing , Fl uffy and
. They fill some paper cups with the batter . They
spinning through the air . After a few seconds , the
She opened the oven and saw a black ,
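
NOTE ON THE INTERVALS
The dump does not say how the four INTERVAL ranges were chosen. The printed edges are consistent with splitting the range from MIN to MAX into four equal-width bins, so the sketch below recomputes them under that assumption. The function name interval_edges and the bin count of 4 are illustrative and are not taken from the tool that produced this output.

```python
# Assumption: the dashboard splits [MIN, MAX] into equal-width bins.
# MIN, MAX and the CONTAINS percentages are copied from the dump above.

def interval_edges(lo, hi, n_bins=4):
    """Return the n_bins + 1 edges of equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

MIN_ACT = -6.301
MAX_ACT = 5.672

edges = interval_edges(MIN_ACT, MAX_ACT)

# Edges printed in the dump, lowest to highest.
reported_edges = [-6.301, -3.308, -0.315, 2.678, 5.672]

# CONTAINS percentages, lowest interval to highest.
reported_share = [0.126, 38.670, 60.520, 0.684]

for computed, reported in zip(edges, reported_edges):
    print(f"computed {computed:+.4f}   reported {reported:+.3f}")

print(f"total share: {sum(reported_share):.3f}%")
```

The recomputed edges agree with the printed ones to within about 0.001, which is what rounding of MIN and MAX to three decimals would produce. The four CONTAINS values sum to 100%, so the intervals cover every recorded activation, and roughly 99% of them fall in the two middle intervals.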
Token because
Token ID780
Feature activation+9.38e-02

Pos logit contributions
rated+0.037
 alone+0.033
ny+0.033
ling+0.033
 down+0.033
Neg logit contributions
 jam-0.034
 flour-0.033
 slices-0.033
 cr-0.033
 frost-0.031
Token she
Token ID673
Feature activation+5.72e-01

Pos logit contributions
rated+0.002
 down+0.002
ling+0.002
ny+0.002
 alone+0.002
Neg logit contributions
 flour-0.003
 jam-0.003
 slices-0.003
 cr-0.002
 frost-0.002
Token didn
Token ID1422
Feature activation+1.07e+00

Pos logit contributions
rated+0.013
away+0.013
 down+0.013
 freely+0.012
 alone+0.012
Neg logit contributions
 cr-0.014
 jam-0.014
 flour-0.013
 slices-0.013
 frost-0.013
Token't
Token ID470
Feature activation+4.52e-01

Pos logit contributions
rated+0.022
ling+0.019
 down+0.019
 alone+0.018
arrass+0.018
Neg logit contributions
 slices-0.031
 jam-0.031
 flour-0.031
 cr-0.030
 tape-0.029
Token want
Token ID765
Feature activation+3.47e-01

Pos logit contributions
ling+0.010
rated+0.010
 up+0.010
 down+0.010
itating+0.010
Neg logit contributions
 jam-0.011
 cr-0.011
 flour-0.010
 slices-0.010
 frost-0.010
Token Bun
Token ID28515
Feature activation+5.30e+00

Pos logit contributions
rated+0.007
ling+0.006
 alone+0.006
 down+0.006
ny+0.006
Neg logit contributions
 jam-0.010
 flour-0.010
 slices-0.010
 cr-0.010
 frost-0.009
Token Bun
Token ID28515
Feature activation+5.30e+00

Pos logit contributions
rated+0.132
itating+0.116
itates+0.112
rier+0.112
arted+0.109
Neg logit contributions
 jam-0.110
 cr-0.107
 slices-0.102
 powder-0.099
 bisc-0.098
Token to
Token ID284
Feature activation+1.42e-01

Pos logit contributions
rated+0.139
itating+0.130
rier+0.126
itates+0.124
ivated+0.124
Neg logit contributions
 jam-0.101
 cr-0.100
 powder-0.093
 slices-0.091
 bisc-0.089
Token be
Token ID307
Feature activation+1.07e+00

Pos logit contributions
 alone+0.002
rated+0.002
 down+0.002
 up+0.002
ling+0.002
Neg logit contributions
 slices-0.005
 flour-0.004
 jam-0.004
 cr-0.004
 icing-0.004
Token hurt
Token ID5938
Feature activation+2.06e+00

Pos logit contributions
rated+0.018
arrass+0.016
ny+0.015
oun+0.015
away+0.015
Neg logit contributions
 cr-0.034
 flour-0.034
 jam-0.034
 slices-0.034
 frost-0.032
Token.
Token ID13
Feature activation+1.07e+00

Pos logit contributions
rated+0.065
ounded+0.059
arrass+0.059
ling+0.057
ashing+0.056
Neg logit contributions
 cr-0.034
 slices-0.032
 flour-0.032
 jam-0.031
 tape-0.028
Token she
Token ID673
Feature activation+5.72e-01

Pos logit contributions
rated+0.002
 down+0.002
ling+0.002
ny+0.002
 alone+0.002
Neg logit contributions
 flour-0.003
 jam-0.003
 slices-0.003
 cr-0.002
 frost-0.002
Token didn
Token ID1422
Feature activation+1.07e+00

Pos logit contributions
rated+0.013
away+0.013
 down+0.013
 freely+0.012
 alone+0.012
Neg logit contributions
 cr-0.014
 jam-0.014
 flour-0.013
 slices-0.013
 frost-0.013
Token't
Token ID470
Feature activation+4.52e-01

Pos logit contributions
rated+0.022
ling+0.019
 down+0.019
 alone+0.018
arrass+0.018
Neg logit contributions
 slices-0.031
 jam-0.031
 flour-0.031
 cr-0.030
 tape-0.029
Token want
Token ID765
Feature activation+3.47e-01

Pos logit contributions
ling+0.010
rated+0.010
 up+0.010
 down+0.010
itating+0.010
Neg logit contributions
 jam-0.011
 cr-0.011
 flour-0.010
 slices-0.010
 frost-0.010
Token Bun
Token ID28515
Feature activation+5.30e+00

Pos logit contributions
rated+0.007
ling+0.006
 alone+0.006
 down+0.006
ny+0.006
Neg logit contributions
 jam-0.010
 flour-0.010
 slices-0.010
 cr-0.010
 frost-0.009
Token Bun
Token ID28515
Feature activation+5.30e+00

Pos logit contributions
rated+0.132
itating+0.116
itates+0.112
rier+0.112
arted+0.109
Neg logit contributions
 jam-0.110
 cr-0.107
 slices-0.102
 powder-0.099
 bisc-0.098
Token to
Token ID284
Feature activation+1.42e-01

Pos logit contributions
rated+0.139
itating+0.130
rier+0.126
itates+0.124
ivated+0.124
Neg logit contributions
 jam-0.101
 cr-0.100
 powder-0.093
 slices-0.091
 bisc-0.089
Token be
Token ID307
Feature activation+1.07e+00

Pos logit contributions
 alone+0.002
rated+0.002
 down+0.002
 up+0.002
ling+0.002
Neg logit contributions
 slices-0.005
 flour-0.004
 jam-0.004
 cr-0.004
 icing-0.004
Token hurt
Token ID5938
Feature activation+2.06e+00

Pos logit contributions
rated+0.018
arrass+0.016
ny+0.015
oun+0.015
away+0.015
Neg logit contributions
 cr-0.034
 flour-0.034
 jam-0.034
 slices-0.034
 frost-0.032
Token.
Token ID13
Feature activation+1.07e+00

Pos logit contributions
rated+0.065
ounded+0.059
arrass+0.059
ling+0.057
ashing+0.056
Neg logit contributions
 cr-0.034
 slices-0.032
 flour-0.032
 jam-0.031
 tape-0.028
Token
Token ID220
Feature activation+1.36e+00

Pos logit contributions
rated+0.036
away+0.034
ling+0.033
umpy+0.032
rest+0.032
Neg logit contributions
 flour-0.015
 cr-0.014
 jam-0.014
 slices-0.013
 powder-0.013
Token his
Token ID465
Feature activation-2.35e-01

Pos logit contributions
 flour+0.007
 slices+0.007
 jam+0.007
 cr+0.006
 frost+0.006
Neg logit contributions
rated-0.006
 alone-0.006
ling-0.006
umpy-0.006
 freely-0.006
Token dad
Token ID9955
Feature activation-1.52e+00

Pos logit contributions
 slices+0.006
 flour+0.006
 jam+0.006
 icing+0.006
 frost+0.006
Neg logit contributions
 down-0.005
 up-0.005
 alone-0.005
 same-0.005
ny-0.005
Token wanted
Token ID2227
Feature activation-6.95e-01

Pos logit contributions
 slices+0.045
 jam+0.043
 flour+0.043
 unison+0.041
 icing+0.040
Neg logit contributions
 alone-0.029
 down-0.029
rated-0.028
ny-0.028
 up-0.028
Token to
Token ID284
Feature activation+1.28e+00

Pos logit contributions
 unison+0.028
 slices+0.028
 jam+0.027
 slot+0.027
 icing+0.027
Neg logit contributions
 up-0.006
 down-0.006
 alone-0.005
 same-0.004
 back-0.004
Token teach
Token ID4545
Feature activation+9.41e-02

Pos logit contributions
 alone+0.021
 down+0.021
 up+0.021
rated+0.019
 away+0.019
Neg logit contributions
 slices-0.039
 flour-0.038
 jam-0.038
 icing-0.036
 frost-0.036
Token Jon
Token ID5966
Feature activation+5.29e+00

Pos logit contributions
 down+0.001
 up+0.001
 alone+0.001
 away+0.001
away+0.001
Neg logit contributions
 slices-0.004
 jam-0.004
 flour-0.003
 slot-0.003
 cr-0.003
Token something
Token ID1223
Feature activation+2.47e+00

Pos logit contributions
rated+0.123
itates+0.120
 freely+0.120
ivated+0.114
itating+0.114
Neg logit contributions
 flour-0.120
 jam-0.118
 slices-0.117
 cones-0.112
 cr-0.111
Token special
Token ID2041
Feature activation+6.85e-01

Pos logit contributions
rated+0.045
 alone+0.043
 down+0.043
ling+0.040
ny+0.040
Neg logit contributions
 slices-0.074
 flour-0.074
 jam-0.073
 cr-0.071
 frost-0.067
Token.
Token ID13
Feature activation+2.61e-01

Pos logit contributions
 down+0.013
 alone+0.013
 up+0.013
rated+0.012
ny+0.011
Neg logit contributions
 slices-0.020
 flour-0.020
 jam-0.020
 icing-0.019
engu-0.019
Token So
Token ID1406
Feature activation-2.12e-01

Pos logit contributions
rated+0.006
 alone+0.005
ling+0.005
away+0.005
 freely+0.005
Neg logit contributions
 flour-0.007
 slices-0.007
 jam-0.006
 cr-0.006
 strips-0.006
Token his
Token ID465
Feature activation-3.96e-01

Pos logit contributions
 flour+0.004
 slices+0.004
 jam+0.004
 cr+0.004
 frost+0.004
Neg logit contributions
rated-0.006
 alone-0.006
 down-0.006
ling-0.006
 freely-0.005
Token The
Token ID383
Feature activation+1.39e+00

Pos logit contributions
 down+0.007
 alone+0.006
rated+0.006
ny+0.006
 up+0.006
Neg logit contributions
 jam-0.014
 slices-0.014
 flour-0.013
 cr-0.013
 da-0.013
Token baby
Token ID5156
Feature activation-9.57e-01

Pos logit contributions
rated+0.037
 alone+0.036
 down+0.035
ling+0.035
 freely+0.034
Neg logit contributions
 flour-0.030
 slices-0.030
 jam-0.029
 cr-0.028
 frost-0.027
Token's
Token ID338
Feature activation-1.72e-01

Pos logit contributions
 slices+0.031
 jam+0.030
 flour+0.029
engu+0.028
 icing+0.028
Neg logit contributions
 alone-0.016
 down-0.016
ling-0.015
rated-0.015
 up-0.015
Token name
Token ID1438
Feature activation+6.87e-01

Pos logit contributions
 slices+0.005
 jam+0.004
 flour+0.004
 icing+0.004
 strips+0.004
Neg logit contributions
 down-0.004
 up-0.004
 alone-0.004
 same-0.004
rated-0.004
Token was
Token ID373
Feature activation+5.16e-01

Pos logit contributions
 down+0.013
 alone+0.013
rated+0.013
 up+0.013
ling+0.012
Neg logit contributions
 slices-0.020
 jam-0.019
 flour-0.019
 cr-0.018
engu-0.018
Token Ag
Token ID2449
Feature activation+5.28e+00

Pos logit contributions
 down+0.013
 alone+0.012
rated+0.012
 up+0.012
ny+0.012
Neg logit contributions
 slices-0.012
 jam-0.012
 flour-0.012
 cr-0.011
 strips-0.011
Tokennes
Token ID2516
Feature activation+2.85e+00

Pos logit contributions
 down+0.113
rated+0.108
 alone+0.104
ling+0.100
 up+0.100
Neg logit contributions
engu-0.130
emon-0.129
 slices-0.126
ird-0.123
 jam-0.121
Token.
Token ID13
Feature activation+3.35e-01

Pos logit contributions
 alone+0.063
 down+0.061
rated+0.060
 up+0.057
ling+0.057
Neg logit contributions
 slices-0.076
 jam-0.076
 flour-0.075
 icing-0.069
 cr-0.068
Token Ag
Token ID2449
Feature activation+4.43e+00

Pos logit contributions
rated+0.009
ling+0.008
 alone+0.008
 down+0.008
away+0.008
Neg logit contributions
 flour-0.007
 cr-0.007
 slices-0.007
 jam-0.007
 strips-0.007
Tokennes
Token ID2516
Feature activation+1.74e+00

Pos logit contributions
 down+0.103
rated+0.102
 alone+0.097
 miserable+0.095
 lonely+0.094
Neg logit contributions
engu-0.094
ena-0.094
ves-0.088
emon-0.086
ird-0.084
Token loved
Token ID6151
Feature activation-2.17e-01

Pos logit contributions
rated+0.032
 down+0.031
 alone+0.031
ling+0.029
 up+0.029
Neg logit contributions
 slices-0.052
 jam-0.051
 flour-0.051
 cr-0.048
 frost-0.047
Token didn
Token ID1422
Feature activation-3.64e-02

Pos logit contributions
 alone+0.013
rated+0.013
 down+0.012
ny+0.012
ling+0.012
Neg logit contributions
 flour-0.025
 slices-0.024
 jam-0.024
 icing-0.023
 frost-0.023
Token't
Token ID470
Feature activation+5.61e-01

Pos logit contributions
 slices+0.001
 flour+0.001
 jam+0.001
 cr+0.001
 tape+0.001
Neg logit contributions
rated-0.001
 alone-0.001
 down-0.001
ling-0.001
away-0.001
Token want
Token ID765
Feature activation+3.09e-01

Pos logit contributions
rated+0.012
ling+0.011
ounded+0.011
oun+0.011
 up+0.011
Neg logit contributions
 jam-0.014
 cr-0.013
 slices-0.012
 flour-0.011
 toast-0.011
Token to
Token ID284
Feature activation+8.48e-01

Pos logit contributions
rated+0.005
 down+0.005
 alone+0.005
ling+0.005
 up+0.005
Neg logit contributions
 slices-0.010
 jam-0.010
 flour-0.010
 cr-0.009
 frost-0.009
Token make
Token ID787
Feature activation+1.72e+00

Pos logit contributions
rated+0.018
 down+0.017
 alone+0.017
ling+0.016
 up+0.016
Neg logit contributions
 jam-0.023
 flour-0.022
 slices-0.022
 cr-0.022
 frost-0.021
Token Billy
Token ID15890
Feature activation+5.26e+00

Pos logit contributions
rated+0.045
ling+0.040
 alone+0.040
 down+0.039
ny+0.039
Neg logit contributions
 slices-0.041
 flour-0.040
 jam-0.040
 cr-0.038
 frost-0.036
Token mad
Token ID8805
Feature activation+3.23e+00

Pos logit contributions
rated+0.094
 freely+0.087
itates+0.086
away+0.086
ounded+0.084
Neg logit contributions
 slices-0.158
 jam-0.156
 toast-0.147
 cr-0.146
 flour-0.143
Token,
Token ID11
Feature activation+2.27e+00

Pos logit contributions
rated+0.105
ounded+0.092
arrass+0.092
away+0.091
 freely+0.090
Neg logit contributions
 slices-0.052
 jam-0.052
 cr-0.050
 flour-0.048
 toast-0.044
Token so
Token ID523
Feature activation+1.03e+00

Pos logit contributions
rated+0.047
 down+0.044
 alone+0.043
ling+0.042
away+0.042
Neg logit contributions
 slices-0.063
 jam-0.062
 flour-0.062
 cr-0.059
 frost-0.057
Token he
Token ID339
Feature activation+9.39e-01

Pos logit contributions
rated+0.016
 down+0.016
 alone+0.016
ling+0.015
 up+0.014
Neg logit contributions
 slices-0.034
 flour-0.033
 jam-0.033
 cr-0.032
 frost-0.031
Token left
Token ID1364
Feature activation+2.19e-01

Pos logit contributions
rated+0.018
 alone+0.018
 down+0.018
ny+0.017
ling+0.017
Neg logit contributions
 slices-0.027
 flour-0.027
 jam-0.026
 icing-0.025
 cr-0.025
Token for
Token ID329
Feature activation-6.21e-02

Pos logit contributions
rated+0.028
 alone+0.025
 down+0.025
ling+0.023
ny+0.023
Neg logit contributions
 slices-0.049
 jam-0.049
 flour-0.048
 cr-0.047
 strips-0.044
Token his
Token ID465
Feature activation+1.10e-01

Pos logit contributions
 slices+0.002
 flour+0.002
 jam+0.002
 cr+0.002
 frost+0.002
Neg logit contributions
rated-0.001
 down-0.001
 alone-0.001
ling-0.001
 up-0.001
Token obedience
Token ID28777
Feature activation+5.71e-01

Pos logit contributions
rated+0.002
 alone+0.002
 down+0.002
ling+0.002
 up+0.002
Neg logit contributions
 slices-0.003
 flour-0.003
 jam-0.003
 icing-0.003
 frost-0.003
Token and
Token ID290
Feature activation+2.02e+00

Pos logit contributions
 down+0.013
rated+0.013
 alone+0.013
 up+0.012
ling+0.012
Neg logit contributions
 slices-0.014
 flour-0.014
 jam-0.014
 cr-0.013
 frost-0.013
Token cle
Token ID1190
Feature activation+5.39e-01

Pos logit contributions
rated+0.043
 down+0.041
 alone+0.041
ling+0.039
ny+0.038
Neg logit contributions
 slices-0.054
 jam-0.054
 flour-0.054
 cr-0.051
 frost-0.049
Tokenvern
Token ID933
Feature activation+5.25e+00

Pos logit contributions
rated+0.014
 down+0.014
 alone+0.013
 up+0.013
ny+0.013
Neg logit contributions
 slices-0.012
 flour-0.011
 jam-0.011
 cr-0.011
engu-0.010
Tokeness
Token ID408
Feature activation+1.26e+00

Pos logit contributions
 down+0.151
 alone+0.150
ivated+0.144
 lonely+0.143
 unnoticed+0.142
Neg logit contributions
engu-0.095
izza-0.080
 slices-0.077
 cr-0.073
ama-0.072
Token.
Token ID13
Feature activation+3.41e-01

Pos logit contributions
rated+0.032
 down+0.030
 alone+0.030
ling+0.028
away+0.028
Neg logit contributions
 slices-0.029
 jam-0.029
 flour-0.028
 cr-0.028
 frost-0.026
Token<|endoftext|>
Token ID50256
Feature activation+1.36e+00

Pos logit contributions
rated+0.008
ling+0.008
 alone+0.008
 down+0.008
away+0.008
Neg logit contributions
 slices-0.008
 jam-0.008
 flour-0.008
 cr-0.008
engu-0.007
TokenOnce
Token ID7454
Feature activation+5.44e-02

Pos logit contributions
 down+0.027
 alone+0.027
rated+0.026
 up+0.026
ny+0.025
Neg logit contributions
 slices-0.038
 jam-0.038
 flour-0.038
 cr-0.036
 icing-0.035
Token upon
Token ID2402
Feature activation+9.70e-01

Pos logit contributions
rated+0.001
arrass+0.001
away+0.001
 down+0.001
ny+0.001
Neg logit contributions
 flour-0.001
 slices-0.001
 cr-0.001
 jam-0.001
engu-0.001
Token sad
Token ID6507
Feature activation+2.87e+00

Pos logit contributions
rated+0.042
ny+0.037
 down+0.037
away+0.037
arrass+0.036
Neg logit contributions
 jam-0.066
 slices-0.064
 flour-0.064
 cr-0.061
 frost-0.059
Token and
Token ID290
Feature activation+3.64e+00

Pos logit contributions
rated+0.084
arrass+0.077
ny+0.077
 down+0.076
 invincible+0.075
Neg logit contributions
 jam-0.055
 slices-0.054
 flour-0.052
 cr-0.051
 frost-0.047
Token frustrated
Token ID14718
Feature activation+1.75e+00

Pos logit contributions
rated+0.073
away+0.071
umpy+0.068
oun+0.067
ny+0.067
Neg logit contributions
 jam-0.101
 cr-0.099
 flour-0.095
 slices-0.094
 frost-0.089
Token,
Token ID11
Feature activation+2.27e+00

Pos logit contributions
rated+0.040
 down+0.039
 alone+0.039
ling+0.037
 up+0.037
Neg logit contributions
 slices-0.045
 flour-0.044
 jam-0.044
 cr-0.041
 frost-0.040
Token wishing
Token ID24433
Feature activation+4.26e-01

Pos logit contributions
 down+0.044
 alone+0.044
rated+0.044
 up+0.041
ling+0.040
Neg logit contributions
 slices-0.065
 flour-0.064
 jam-0.064
 cr-0.060
 icing-0.059
Token someone
Token ID2130
Feature activation+5.21e+00

Pos logit contributions
 alone+0.008
 down+0.007
 up+0.007
rated+0.006
ling+0.006
Neg logit contributions
 slices-0.013
 flour-0.013
engu-0.013
 jam-0.012
 slot-0.012
Token would
Token ID561
Feature activation+8.50e-01

Pos logit contributions
rated+0.132
umpy+0.124
oun+0.119
itates+0.117
ny+0.116
Neg logit contributions
 jam-0.115
 cr-0.112
 flour-0.111
 slices-0.106
 tape-0.100
Token be
Token ID307
Feature activation+2.80e+00

Pos logit contributions
rated+0.018
 down+0.017
 alone+0.017
ling+0.016
 up+0.016
Neg logit contributions
 slices-0.023
 jam-0.023
 flour-0.023
 cr-0.022
 frost-0.021
Token as
Token ID355
Feature activation+2.01e+00

Pos logit contributions
rated+0.062
oun+0.056
ny+0.056
umpy+0.053
arrass+0.053
Neg logit contributions
 flour-0.074
 jam-0.074
 slices-0.073
 cr-0.071
 frost-0.067
Token kind
Token ID1611
Feature activation+6.82e-01

Pos logit contributions
rated+0.038
 down+0.034
ny+0.034
ling+0.033
away+0.033
Neg logit contributions
 flour-0.060
 jam-0.059
 slices-0.059
 cr-0.057
 tape-0.053
Token as
Token ID355
Feature activation+1.29e+00

Pos logit contributions
rated+0.016
 down+0.015
 alone+0.015
ling+0.014
 up+0.014
Neg logit contributions
 slices-0.017
 jam-0.017
 flour-0.017
 cr-0.016
 frost-0.015
Token didn
Token ID1422
Feature activation+3.77e-01

Pos logit contributions
 alone+0.017
rated+0.016
 down+0.016
ny+0.015
ling+0.015
Neg logit contributions
 flour-0.025
 slices-0.025
 jam-0.024
 icing-0.024
 frost-0.023
Token't
Token ID470
Feature activation+5.80e-01

Pos logit contributions
rated+0.007
ling+0.006
 alone+0.006
 down+0.006
away+0.006
Neg logit contributions
 slices-0.011
 jam-0.011
 flour-0.011
 cr-0.010
 tape-0.010
Token want
Token ID765
Feature activation+4.74e-01

Pos logit contributions
rated+0.012
itating+0.012
ling+0.012
itates+0.012
ounded+0.012
Neg logit contributions
 jam-0.014
 slices-0.013
 cr-0.012
 flour-0.012
 toast-0.012
Token to
Token ID284
Feature activation+1.36e+00

Pos logit contributions
 down+0.007
 alone+0.007
rated+0.007
 up+0.007
ny+0.006
Neg logit contributions
 slices-0.016
 flour-0.015
 jam-0.015
 cr-0.015
engu-0.014
Token make
Token ID787
Feature activation+1.64e+00

Pos logit contributions
rated+0.032
ounded+0.030
ling+0.029
 down+0.029
itates+0.029
Neg logit contributions
 jam-0.033
 slices-0.032
 flour-0.031
 cr-0.030
 toast-0.029
Token Billy
Token ID15890
Feature activation+5.16e+00

Pos logit contributions
rated+0.034
 down+0.033
 alone+0.033
ling+0.031
 up+0.031
Neg logit contributions
 slices-0.045
 jam-0.045
 flour-0.045
 cr-0.042
 frost-0.041
Token mad
Token ID8805
Feature activation+2.90e+00

Pos logit contributions
rated+0.097
 freely+0.091
itates+0.090
away+0.088
ounded+0.088
Neg logit contributions
 slices-0.149
 jam-0.145
 toast-0.136
 cr-0.135
 flour-0.133
Token,
Token ID11
Feature activation+2.08e+00

Pos logit contributions
rated+0.086
 freely+0.077
away+0.077
 alone+0.076
arrass+0.075
Neg logit contributions
 slices-0.054
 jam-0.052
 flour-0.050
 cr-0.049
 toast-0.046
Token so
Token ID523
Feature activation+1.28e+00

Pos logit contributions
rated+0.040
 down+0.039
 alone+0.039
ling+0.036
 up+0.036
Neg logit contributions
 slices-0.060
 flour-0.059
 jam-0.059
 cr-0.057
 frost-0.055
Token he
Token ID339
Feature activation+1.22e+00

Pos logit contributions
 alone+0.018
 down+0.018
rated+0.018
ling+0.017
 up+0.017
Neg logit contributions
 slices-0.043
 flour-0.043
 jam-0.042
 cr-0.041
 icing-0.041
Token left
Token ID1364
Feature activation-3.79e-02

Pos logit contributions
rated+0.025
 down+0.024
 alone+0.024
ling+0.023
 up+0.023
Neg logit contributions
 slices-0.034
 flour-0.033
 jam-0.033
 cr-0.032
 icing-0.031
Token but
Token ID475
Feature activation+1.03e+00

Pos logit contributions
 slices+0.010
 flour+0.010
 jam+0.010
engu+0.010
 cr+0.010
Neg logit contributions
 down-0.008
 alone-0.008
rated-0.007
 up-0.007
ling-0.007
Token he
Token ID339
Feature activation+8.18e-01

Pos logit contributions
rated+0.025
umpy+0.023
ling+0.023
away+0.023
 freely+0.023
Neg logit contributions
 jam-0.023
 slices-0.023
 cr-0.023
 flour-0.023
 tape-0.021
Token didn
Token ID1422
Feature activation-1.25e-02

Pos logit contributions
rated+0.015
 alone+0.015
 down+0.015
ling+0.014
 up+0.014
Neg logit contributions
 slices-0.024
 flour-0.024
 jam-0.024
 cr-0.023
 icing-0.022
Token't
Token ID470
Feature activation+3.31e-02

Pos logit contributions
 slices+0.000
 flour+0.000
 jam+0.000
 cr+0.000
 tape+0.000
Neg logit contributions
rated-0.000
ling-0.000
 alone-0.000
 down-0.000
away-0.000
Token want
Token ID765
Feature activation+2.70e-01

Pos logit contributions
rated+0.001
ling+0.001
handed+0.001
 up+0.001
 freely+0.001
Neg logit contributions
 cr-0.001
 jam-0.001
 tape-0.001
 slices-0.001
 flour-0.001
Token Billy
Token ID15890
Feature activation+5.11e+00

Pos logit contributions
rated+0.005
 down+0.005
 alone+0.005
ling+0.004
 up+0.004
Neg logit contributions
 slices-0.008
 jam-0.008
 flour-0.008
 cr-0.008
 frost-0.008
Token to
Token ID284
Feature activation+6.45e-01

Pos logit contributions
rated+0.146
 freely+0.132
away+0.127
itates+0.127
ling+0.127
Neg logit contributions
 jam-0.104
 cr-0.101
 slices-0.099
 toast-0.094
 flour-0.090
Token mess
Token ID2085
Feature activation+1.38e+00

Pos logit contributions
rated+0.015
ling+0.014
 down+0.013
 alone+0.013
 freely+0.013
Neg logit contributions
 jam-0.016
 cr-0.016
 flour-0.016
 slices-0.016
 frost-0.015
Token up
Token ID510
Feature activation+9.61e-01

Pos logit contributions
rated+0.030
ny+0.025
arrass+0.025
ashing+0.025
ounded+0.024
Neg logit contributions
 slices-0.039
 jam-0.038
 flour-0.038
 cr-0.038
 frost-0.035
Token his
Token ID465
Feature activation-7.95e-01

Pos logit contributions
rated+0.021
 alone+0.021
 down+0.021
ling+0.020
 up+0.019
Neg logit contributions
 slices-0.025
 flour-0.025
 jam-0.025
 cr-0.023
 frost-0.023
Token creations
Token ID28443
Feature activation+1.16e+00

Pos logit contributions
 slices+0.019
 flour+0.018
 jam+0.018
 cr+0.017
 frost+0.017
Neg logit contributions
rated-0.020
 down-0.019
 alone-0.019
ling-0.018
 up-0.018
Token very
Token ID845
Feature activation+1.78e-01

Pos logit contributions
 slices+0.043
 jam+0.043
 flour+0.042
engu+0.040
 cr+0.040
Neg logit contributions
 alone-0.020
 down-0.019
 up-0.018
rated-0.018
ling-0.017
Token boss
Token ID6478
Feature activation+1.72e+00

Pos logit contributions
rated+0.003
 down+0.003
 alone+0.003
ling+0.003
away+0.003
Neg logit contributions
 slices-0.005
 flour-0.005
 jam-0.005
 cr-0.005
 frost-0.005
Tokeny
Token ID88
Feature activation+8.18e-01

Pos logit contributions
rated+0.028
 freely+0.025
 alone+0.024
 down+0.024
 up+0.024
Neg logit contributions
 cr-0.053
 slices-0.052
 flour-0.051
 jam-0.050
 frost-0.049
Token and
Token ID290
Feature activation+2.00e+00

Pos logit contributions
 alone+0.018
 down+0.018
 up+0.017
ling+0.016
rated+0.016
Neg logit contributions
 flour-0.022
 jam-0.022
 slices-0.022
 icing-0.021
 frost-0.021
Token wanted
Token ID2227
Feature activation+2.94e-01

Pos logit contributions
rated+0.038
 down+0.035
 alone+0.035
ling+0.034
away+0.033
Neg logit contributions
 slices-0.058
 jam-0.057
 flour-0.057
 cr-0.056
 frost-0.053
Token Billy
Token ID15890
Feature activation+5.03e+00

Pos logit contributions
 down+0.004
 alone+0.004
 up+0.004
 away+0.003
away+0.003
Neg logit contributions
 slices-0.011
 jam-0.010
 flour-0.010
engu-0.010
 icing-0.010
Token to
Token ID284
Feature activation+1.07e+00

Pos logit contributions
rated+0.122
ling+0.107
 freely+0.106
away+0.104
itates+0.104
Neg logit contributions
 jam-0.125
 slices-0.123
 cr-0.120
 flour-0.120
 toast-0.112
Token move
Token ID1445
Feature activation+3.10e+00

Pos logit contributions
rated+0.021
 alone+0.021
 down+0.021
ling+0.019
 up+0.019
Neg logit contributions
 slices-0.030
 flour-0.030
 jam-0.030
 cr-0.028
 frost-0.027
Token away
Token ID1497
Feature activation+1.97e+00

Pos logit contributions
rated+0.060
 alone+0.056
 down+0.056
ling+0.054
ny+0.052
Neg logit contributions
 slices-0.089
 flour-0.088
 jam-0.088
 cr-0.085
 frost-0.081
Token.
Token ID13
Feature activation+8.61e-01

Pos logit contributions
 down+0.043
 alone+0.042
rated+0.041
 up+0.040
ling+0.039
Neg logit contributions
 jam-0.054
 slices-0.054
 flour-0.053
 icing-0.050
 cr-0.050
Token
Token ID198
Feature activation-4.02e-01

Pos logit contributions
rated+0.023
ling+0.021
 alone+0.021
away+0.021
 freely+0.020
Neg logit contributions
 slices-0.019
 flour-0.018
 jam-0.017
 cr-0.017
 tape-0.017
Token it
Token ID340
Feature activation+4.21e-01

Pos logit contributions
rated+0.009
 down+0.008
 alone+0.008
 up+0.008
ling+0.008
Neg logit contributions
 slices-0.012
 flour-0.012
 jam-0.012
 cr-0.011
 frost-0.011
Token and
Token ID290
Feature activation+1.03e+00

Pos logit contributions
rated+0.010
 down+0.010
 alone+0.010
ling+0.009
 up+0.009
Neg logit contributions
 slices-0.010
 jam-0.010
 flour-0.010
 cr-0.010
 frost-0.009
Token it
Token ID340
Feature activation+1.58e+00

Pos logit contributions
rated+0.022
 alone+0.020
 down+0.020
ling+0.020
ny+0.019
Neg logit contributions
 jam-0.028
 slices-0.028
 flour-0.028
 cr-0.027
 frost-0.026
Token didn
Token ID1422
Feature activation+5.54e-01

Pos logit contributions
rated+0.039
away+0.037
umpy+0.036
 down+0.035
ashing+0.035
Neg logit contributions
 jam-0.038
 cr-0.037
 slices-0.037
 flour-0.037
 powder-0.033
Token't
Token ID470
Feature activation-6.01e-02

Pos logit contributions
rated+0.009
 alone+0.009
 down+0.008
ling+0.008
away+0.008
Neg logit contributions
 slices-0.018
 jam-0.017
 flour-0.017
 cr-0.017
 tape-0.016
Token bounce
Token ID20110
Feature activation+4.98e+00

Pos logit contributions
 cr+0.002
 jam+0.001
 slices+0.001
 tape+0.001
 flour+0.001
Neg logit contributions
rated-0.001
ling-0.001
itating-0.001
itates-0.001
 up-0.001
Token anymore
Token ID7471
Feature activation+7.00e-01

Pos logit contributions
rated+0.137
arrass+0.130
ounded+0.123
handed+0.120
ashing+0.117
Neg logit contributions
 cr-0.100
 jam-0.098
 slices-0.092
 flour-0.087
 tape-0.086
Token.
Token ID13
Feature activation+1.05e+00

Pos logit contributions
 alone+0.014
 down+0.014
 up+0.013
rated+0.012
ling+0.012
Neg logit contributions
 flour-0.021
 slices-0.020
 jam-0.020
engu-0.019
 icing-0.019
Token She
Token ID1375
Feature activation+6.60e-01

Pos logit contributions
rated+0.027
ling+0.025
away+0.024
 alone+0.024
 down+0.023
Neg logit contributions
 jam-0.025
 flour-0.025
 slices-0.025
 cr-0.024
 frost-0.022
Token was
Token ID373
Feature activation+1.73e-01

Pos logit contributions
rated+0.012
 down+0.012
 alone+0.012
ling+0.011
 up+0.011
Neg logit contributions
 slices-0.020
 flour-0.020
 jam-0.020
 cr-0.019
 icing-0.018
Token sad
Token ID6507
Feature activation+1.97e+00

Pos logit contributions
rated+0.003
away+0.003
ny+0.003
ling+0.002
 freely+0.002
Neg logit contributions
 jam-0.005
 flour-0.005
 slices-0.005
 cr-0.005
 frost-0.005
Token to
Token ID284
Feature activation+1.87e+00

Pos logit contributions
rated+0.036
 alone+0.034
 down+0.033
ny+0.032
umpy+0.032
Neg logit contributions
 slices-0.041
 jam-0.041
 flour-0.040
 cr-0.039
 toast-0.037
Token her
Token ID607
Feature activation-4.79e-02

Pos logit contributions
rated+0.042
 down+0.039
 alone+0.039
ling+0.038
ny+0.037
Neg logit contributions
 jam-0.048
 slices-0.048
 flour-0.048
 cr-0.046
 frost-0.043
Token.
Token ID13
Feature activation+1.72e-01

Pos logit contributions
 flour+0.001
 slices+0.001
 jam+0.001
 icing+0.001
 frost+0.001
Neg logit contributions
 down-0.001
 alone-0.001
 up-0.001
rated-0.001
 away-0.001
Token She
Token ID1375
Feature activation-7.54e-01

Pos logit contributions
rated+0.003
 down+0.003
 alone+0.003
ling+0.003
 up+0.003
Neg logit contributions
 slices-0.005
 jam-0.005
 flour-0.005
 cr-0.005
 frost-0.005
Token wished
Token ID16555
Feature activation-7.49e-01

Pos logit contributions
 flour+0.023
 slices+0.023
 jam+0.022
 icing+0.021
 strips+0.021
Neg logit contributions
 alone-0.013
 down-0.013
 up-0.013
ny-0.012
rated-0.012
Token Max
Token ID5436
Feature activation+4.97e+00

Pos logit contributions
 slices+0.024
 flour+0.024
 jam+0.023
engu+0.023
 strips+0.022
Neg logit contributions
 down-0.013
 alone-0.012
 up-0.012
 freely-0.010
 around-0.010
Token was
Token ID373
Feature activation+1.30e+00

Pos logit contributions
rated+0.130
ling+0.120
away+0.119
ashing+0.116
umpy+0.115
Neg logit contributions
 jam-0.106
 slices-0.104
 cr-0.100
 flour-0.098
 toast-0.092
Token still
Token ID991
Feature activation+1.16e+00

Pos logit contributions
rated+0.028
ny+0.026
away+0.026
umpy+0.026
ling+0.026
Neg logit contributions
 jam-0.034
 flour-0.033
 slices-0.033
 cr-0.032
 toast-0.030
Token alive
Token ID6776
Feature activation+1.54e+00

Pos logit contributions
rated+0.024
 down+0.023
 alone+0.022
ling+0.021
 up+0.021
Neg logit contributions
 slices-0.032
 jam-0.032
 flour-0.032
 cr-0.031
 frost-0.029
Token.
Token ID13
Feature activation+4.20e-01

Pos logit contributions
rated+0.036
 down+0.035
 alone+0.035
ling+0.033
 up+0.033
Neg logit contributions
 slices-0.038
 jam-0.038
 flour-0.038
 cr-0.036
 frost-0.034
Token<|endoftext|>
Token ID50256
Feature activation+7.80e-01

Pos logit contributions
rated+0.007
 down+0.007
 alone+0.007
ling+0.007
 up+0.006
Neg logit contributions
 slices-0.013
 jam-0.013
 flour-0.013
 cr-0.012
 frost-0.012
Token was
Token ID373
Feature activation-4.56e-01

Pos logit contributions
rated+0.028
 down+0.026
away+0.025
ounded+0.024
ling+0.024
Neg logit contributions
 jam-0.031
 cr-0.030
 flour-0.029
 slices-0.029
 frost-0.028
Token very
Token ID845
Feature activation+3.91e-01

Pos logit contributions
 flour+0.012
 jam+0.012
 slices+0.012
 cr+0.011
 frost+0.011
Neg logit contributions
rated-0.010
 down-0.009
 alone-0.009
ling-0.009
away-0.009
Token lonely
Token ID21757
Feature activation+1.91e+00

Pos logit contributions
rated+0.010
away+0.008
 down+0.008
 aback+0.007
 freely+0.007
Neg logit contributions
 flour-0.010
 jam-0.010
 cr-0.010
 slices-0.009
 sprink-0.009
Token and
Token ID290
Feature activation+3.14e+00

Pos logit contributions
rated+0.044
 down+0.042
 alone+0.042
ling+0.041
 up+0.040
Neg logit contributions
 slices-0.047
 flour-0.047
 jam-0.047
 cr-0.045
 frost-0.043
Token wanted
Token ID2227
Feature activation+7.22e-01

Pos logit contributions
rated+0.062
away+0.062
ounded+0.058
itating+0.058
 aback+0.057
Neg logit contributions
 cr-0.080
 jam-0.079
 flour-0.072
 toast-0.069
 jars-0.069
Token someone
Token ID2130
Feature activation+4.94e+00

Pos logit contributions
 alone+0.010
 down+0.010
 up+0.009
 away+0.008
rated+0.008
Neg logit contributions
 slices-0.025
 flour-0.025
 jam-0.024
 icing-0.024
engu-0.024
Token to
Token ID284
Feature activation+1.11e+00

Pos logit contributions
rated+0.157
umpy+0.143
arted+0.137
rier+0.136
itates+0.133
Neg logit contributions
 cr-0.091
 jam-0.086
 flour-0.082
 tape-0.076
 toast-0.076
Token be
Token ID307
Feature activation+2.93e+00

Pos logit contributions
rated+0.021
 down+0.020
 alone+0.019
ling+0.019
 up+0.018
Neg logit contributions
 slices-0.033
 jam-0.033
 flour-0.033
 cr-0.032
 frost-0.030
Token with
Token ID351
Feature activation+1.81e+00

Pos logit contributions
rated+0.079
ounded+0.066
arrass+0.065
oun+0.064
itates+0.063
Neg logit contributions
 cr-0.069
 jam-0.066
 flour-0.065
 slices-0.065
 envelop-0.061
Token.
Token ID13
Feature activation+9.97e-01

Pos logit contributions
rated+0.033
 down+0.029
ling+0.029
 alone+0.028
ny+0.027
Neg logit contributions
 jam-0.056
 slices-0.055
 flour-0.055
 cr-0.053
 frost-0.051
Token One
Token ID1881
Feature activation-3.92e-01

Pos logit contributions
rated+0.019
 down+0.019
 alone+0.019
ling+0.018
 up+0.018
Neg logit contributions
 slices-0.028
 jam-0.028
 flour-0.028
 cr-0.027
 frost-0.026
Token sad
Token ID6507
Feature activation+1.66e+00

Pos logit contributions
rated+0.010
away+0.008
ny+0.007
oun+0.007
ashing+0.007
Neg logit contributions
 jam-0.010
 flour-0.010
 tape-0.009
 slices-0.009
 cr-0.009
Token and
Token ID290
Feature activation+2.69e+00

Pos logit contributions
rated+0.050
ny+0.044
arrass+0.044
 alone+0.044
ling+0.044
Neg logit contributions
 jam-0.030
 cr-0.029
 slices-0.029
 flour-0.029
 frost-0.026
Token wanted
Token ID2227
Feature activation+8.34e-02

Pos logit contributions
umpy+0.068
away+0.068
 freely+0.067
oun+0.065
itating+0.064
Neg logit contributions
 jam-0.057
 cr-0.054
 flour-0.050
 toast-0.049
 slices-0.047
Token to
Token ID284
Feature activation+1.72e+00

Pos logit contributions
 down+0.001
 alone+0.001
 up+0.001
rated+0.001
 away+0.001
Neg logit contributions
 slices-0.003
 jam-0.003
 flour-0.003
engu-0.003
 icing-0.003
Token make
Token ID787
Feature activation+1.72e+00

Pos logit contributions
rated+0.046
ling+0.045
ashing+0.044
umpy+0.042
itates+0.041
Neg logit contributions
 tape-0.038
 cr-0.037
 jam-0.036
 flour-0.034
 toast-0.033
Token Billy
Token ID15890
Feature activation+4.92e+00

Pos logit contributions
rated+0.042
 alone+0.037
ling+0.037
umpy+0.036
ny+0.036
Neg logit contributions
 flour-0.045
 jam-0.044
 slices-0.044
 cr-0.042
 tape-0.041
Token happy
Token ID3772
Feature activation+1.45e+00

Pos logit contributions
rated+0.097
 freely+0.089
away+0.089
itates+0.088
Nice+0.084
Neg logit contributions
 slices-0.137
 cr-0.134
 jam-0.133
 toast-0.126
 flour-0.123
Token,
Token ID11
Feature activation+2.72e+00

Pos logit contributions
rated+0.036
ling+0.033
 down+0.032
ny+0.032
 alone+0.032
Neg logit contributions
 slices-0.034
 jam-0.034
 cr-0.033
 flour-0.033
 frost-0.031
Token so
Token ID523
Feature activation+1.31e+00

Pos logit contributions
rated+0.061
 down+0.057
 alone+0.056
ling+0.056
away+0.055
Neg logit contributions
 slices-0.071
 jam-0.070
 flour-0.070
 cr-0.067
 frost-0.063
Token he
Token ID339
Feature activation+1.71e+00

Pos logit contributions
rated+0.024
 alone+0.022
 down+0.022
ling+0.021
ny+0.021
Neg logit contributions
 slices-0.039
 jam-0.039
 flour-0.039
 cr-0.038
 frost-0.036
Token let
Token ID1309
Feature activation+1.28e+00

Pos logit contributions
rated+0.039
 alone+0.037
 down+0.037
 freely+0.035
away+0.035
Neg logit contributions
 jam-0.043
 cr-0.042
 flour-0.042
 slices-0.042
 frost-0.038
Token mustache
Token ID49303
Feature activation+5.80e-01

Pos logit contributions
rated+0.011
 alone+0.011
 down+0.010
ny+0.010
 freely+0.010
Neg logit contributions
 slices-0.009
 jam-0.009
 flour-0.009
 cr-0.009
 frost-0.008
Token was
Token ID373
Feature activation-8.62e-01

Pos logit contributions
rated+0.013
 alone+0.013
 down+0.012
away+0.012
ling+0.012
Neg logit contributions
 slices-0.014
 jam-0.014
 cr-0.014
 flour-0.014
 frost-0.013
Token long
Token ID890
Feature activation-1.45e+00

Pos logit contributions
 slices+0.026
 flour+0.026
 jam+0.025
engu+0.025
 cr+0.024
Neg logit contributions
 alone-0.016
 down-0.016
 up-0.015
rated-0.015
ling-0.014
Token and
Token ID290
Feature activation+1.31e+00

Pos logit contributions
 slices+0.040
 flour+0.039
 jam+0.039
engu+0.037
 cr+0.036
Neg logit contributions
 alone-0.031
 down-0.030
 up-0.029
rated-0.029
ling-0.028
Token w
Token ID266
Feature activation+4.95e-01

Pos logit contributions
rated+0.024
 down+0.023
 alone+0.023
ling+0.022
 up+0.022
Neg logit contributions
 slices-0.039
 flour-0.039
 jam-0.038
 cr-0.037
 frost-0.036
Tokenigg
Token ID6950
Feature activation+4.90e+00

Pos logit contributions
 down+0.011
rated+0.011
 up+0.010
 alone+0.010
ny+0.010
Neg logit contributions
 slices-0.012
 jam-0.012
 flour-0.012
 frost-0.012
 icing-0.012
Tokenly
Token ID306
Feature activation+9.87e-01

Pos logit contributions
 unnoticed+0.095
ivated+0.087
itates+0.086
Hi+0.085
itating+0.079
Neg logit contributions
 cr-0.111
aw-0.110
 flour-0.107
engu-0.106
 jam-0.106
Token.
Token ID13
Feature activation+6.15e-01

Pos logit contributions
 down+0.022
 alone+0.021
rated+0.021
ling+0.020
 up+0.020
Neg logit contributions
 slices-0.026
 flour-0.026
 jam-0.025
 cr-0.024
 icing-0.023
Token The
Token ID383
Feature activation+1.23e+00

Pos logit contributions
 down+0.011
 alone+0.011
rated+0.011
 up+0.010
ny+0.010
Neg logit contributions
 slices-0.018
 jam-0.018
 flour-0.018
 cr-0.017
 frost-0.017
Token dragon
Token ID10441
Feature activation-1.02e+00

Pos logit contributions
 alone+0.028
rated+0.027
 down+0.027
ling+0.026
 up+0.025
Neg logit contributions
 slices-0.031
 flour-0.031
 jam-0.030
 cr-0.029
engu-0.028
Token was
Token ID373
Feature activation-1.09e+00

Pos logit contributions
 slices+0.030
 flour+0.030
 jam+0.029
 icing+0.027
 cr+0.027
Neg logit contributions
 alone-0.020
rated-0.020
 down-0.020
ling-0.019
 up-0.019
Token want
Token ID765
Feature activation-4.69e-01

Pos logit contributions
rated+0.013
 down+0.012
 alone+0.012
ling+0.012
away+0.011
Neg logit contributions
 jam-0.030
 slices-0.030
 flour-0.030
 cr-0.030
 frost-0.029
Token to
Token ID284
Feature activation+1.68e+00

Pos logit contributions
 slices+0.017
 slot+0.017
 jam+0.017
mm+0.017
 icing+0.017
Neg logit contributions
 up-0.005
 down-0.005
 alone-0.004
 away-0.004
 same-0.004
Token do
Token ID466
Feature activation+1.74e+00

Pos logit contributions
rated+0.031
 down+0.029
 alone+0.029
ling+0.028
 up+0.027
Neg logit contributions
 slices-0.051
 jam-0.050
 flour-0.050
 cr-0.048
 frost-0.046
Token something
Token ID1223
Feature activation+3.11e+00

Pos logit contributions
rated+0.048
 down+0.046
 alone+0.046
ny+0.044
ling+0.044
Neg logit contributions
 jam-0.037
 slices-0.036
 flour-0.036
 cr-0.034
 frost-0.032
Token new
Token ID649
Feature activation+1.76e+00

Pos logit contributions
rated+0.073
 alone+0.064
 down+0.064
ny+0.063
ling+0.063
Neg logit contributions
 flour-0.081
 jam-0.081
 cr-0.079
 slices-0.078
 frost-0.072
Token and
Token ID290
Feature activation+4.90e+00

Pos logit contributions
 down+0.038
 alone+0.038
rated+0.038
 up+0.036
ling+0.035
Neg logit contributions
 slices-0.046
 flour-0.046
 jam-0.046
 cr-0.043
 icing-0.042
Token fun
Token ID1257
Feature activation+2.07e+00

Pos logit contributions
rated+0.108
ling+0.104
 down+0.101
 alone+0.100
umpy+0.100
Neg logit contributions
 jam-0.127
 flour-0.126
 slices-0.124
 cr-0.123
 tape-0.114
Token.
Token ID13
Feature activation+6.30e-01

Pos logit contributions
rated+0.047
 down+0.046
 alone+0.046
ling+0.044
 up+0.044
Neg logit contributions
 slices-0.053
 flour-0.052
 jam-0.052
 cr-0.049
 frost-0.047
Token
Token ID198
Feature activation-1.90e-01

Pos logit contributions
 down+0.012
 alone+0.012
 up+0.012
rated+0.012
ny+0.011
Neg logit contributions
 slices-0.018
 jam-0.018
 flour-0.018
 cr-0.017
 icing-0.017
Token
Token ID198
Feature activation+1.45e-01

Pos logit contributions
 slices+0.005
 jam+0.005
 flour+0.005
 cr+0.005
 icing+0.005
Neg logit contributions
 down-0.004
 alone-0.004
rated-0.004
 up-0.004
ny-0.004
Token"
Token ID1
Feature activation+2.67e-01

Pos logit contributions
rated+0.003
ny+0.003
 down+0.003
 alone+0.003
ling+0.003
Neg logit contributions
 slices-0.004
 jam-0.004
 flour-0.004
 cr-0.004
 icing-0.003
Token,
Token ID11
Feature activation-8.69e-01

Pos logit contributions
 slices+0.018
 flour+0.017
 jam+0.017
 cr+0.016
 icing+0.016
Neg logit contributions
 down-0.023
rated-0.022
 alone-0.022
 up-0.021
ling-0.021
Token Tim
Token ID5045
Feature activation+1.78e+00

Pos logit contributions
 slices+0.025
 flour+0.024
 jam+0.024
engu+0.023
 strips+0.023
Neg logit contributions
 down-0.018
 alone-0.017
 up-0.017
rated-0.016
 away-0.015
Tokenmy
Token ID1820
Feature activation-4.76e-01

Pos logit contributions
rated+0.030
 down+0.029
 alone+0.029
ling+0.027
 up+0.027
Neg logit contributions
 slices-0.055
 jam-0.055
 flour-0.055
 cr-0.052
 frost-0.051
Token started
Token ID2067
Feature activation-6.37e-01

Pos logit contributions
 slices+0.013
 flour+0.013
 jam+0.013
 cr+0.012
 frost+0.012
Neg logit contributions
rated-0.010
 down-0.009
 alone-0.009
ling-0.009
 up-0.009
Token w
Token ID266
Feature activation+7.50e-01

Pos logit contributions
 jam+0.016
 cr+0.016
 flour+0.015
 slices+0.015
 strips+0.015
Neg logit contributions
rated-0.016
ny-0.013
umpy-0.013
 alone-0.013
ling-0.013
Tokenigg
Token ID6950
Feature activation+4.89e+00

Pos logit contributions
rated+0.016
 down+0.016
 alone+0.015
ny+0.015
 up+0.014
Neg logit contributions
 jam-0.021
 slices-0.020
 flour-0.020
 cr-0.019
 icing-0.019
Tokenling
Token ID1359
Feature activation+1.13e+00

Pos logit contributions
 unnoticed+0.083
itates+0.078
 Welcome+0.076
 doomed+0.076
 invincible+0.075
Neg logit contributions
 jam-0.127
aw-0.124
 flour-0.119
 da-0.118
 slices-0.117
Token and
Token ID290
Feature activation+1.27e-01

Pos logit contributions
rated+0.025
 alone+0.023
 down+0.023
ling+0.022
ny+0.022
Neg logit contributions
 slices-0.030
 jam-0.030
 flour-0.029
 cr-0.028
 frost-0.027
Token dancing
Token ID15360
Feature activation+2.29e+00

Pos logit contributions
rated+0.003
 alone+0.003
 down+0.003
ling+0.003
away+0.003
Neg logit contributions
 jam-0.003
 slices-0.003
 flour-0.003
 cr-0.003
 frost-0.003
Token to
Token ID284
Feature activation+1.24e+00

Pos logit contributions
rated+0.055
umpy+0.048
arrass+0.046
 alone+0.046
ling+0.045
Neg logit contributions
 cr-0.057
 slices-0.056
 jam-0.055
 flour-0.052
 strips-0.051
Token the
Token ID262
Feature activation+8.28e-01

Pos logit contributions
rated+0.029
ling+0.026
ny+0.026
 alone+0.025
umpy+0.025
Neg logit contributions
 jam-0.032
 cr-0.031
 slices-0.031
 flour-0.031
 frost-0.029
Token did
Token ID750
Feature activation+4.61e-01

Pos logit contributions
 alone+0.020
rated+0.020
 down+0.019
ny+0.019
ling+0.019
Neg logit contributions
 slices-0.030
 flour-0.030
 jam-0.029
 icing-0.027
 cr-0.027
Token not
Token ID407
Feature activation+2.27e-02

Pos logit contributions
rated+0.014
ny+0.013
umpy+0.013
ling+0.013
ashing+0.013
Neg logit contributions
 jam-0.009
 slices-0.008
 cr-0.008
 flour-0.008
 tape-0.008
Token want
Token ID765
Feature activation+3.11e-01

Pos logit contributions
rated+0.000
ling+0.000
 down+0.000
 up+0.000
lves+0.000
Neg logit contributions
 jam-0.001
 cr-0.001
 slices-0.001
 flour-0.001
 frost-0.001
Token to
Token ID284
Feature activation+1.07e+00

Pos logit contributions
 down+0.005
 alone+0.004
 up+0.004
rated+0.004
 away+0.004
Neg logit contributions
 slices-0.010
 flour-0.010
 jam-0.010
engu-0.010
 cr-0.010
Token make
Token ID787
Feature activation+1.83e+00

Pos logit contributions
rated+0.023
 down+0.021
ling+0.021
 alone+0.021
 freely+0.020
Neg logit contributions
 jam-0.028
 flour-0.027
 slices-0.027
 cr-0.027
 frost-0.026
Token Sally
Token ID25737
Feature activation+4.87e+00

Pos logit contributions
rated+0.042
 alone+0.039
 down+0.038
ling+0.038
ny+0.037
Neg logit contributions
 jam-0.048
 slices-0.048
 flour-0.048
 cr-0.045
 frost-0.043
Token mad
Token ID8805
Feature activation+3.11e+00

Pos logit contributions
rated+0.089
 freely+0.082
away+0.079
itates+0.075
ounded+0.074
Neg logit contributions
 slices-0.147
 jam-0.143
 cr-0.139
 flour-0.138
 toast-0.135
Token.
Token ID13
Feature activation+1.43e+00

Pos logit contributions
rated+0.098
 freely+0.088
away+0.086
ounded+0.086
 alone+0.086
Neg logit contributions
 slices-0.054
 jam-0.052
 cr-0.050
 flour-0.049
 toast-0.045
Token So
Token ID1406
Feature activation-2.80e-01

Pos logit contributions
rated+0.028
 alone+0.026
ling+0.026
 down+0.026
away+0.025
Neg logit contributions
 slices-0.042
 flour-0.041
 jam-0.041
 cr-0.039
 frost-0.037
Token,
Token ID11
Feature activation-4.45e-01

Pos logit contributions
 slices+0.009
engu+0.008
 flour+0.008
 strips+0.008
 icing+0.008
Neg logit contributions
 down-0.006
 up-0.005
 alone-0.005
 away-0.005
rated-0.004
Token they
Token ID484
Feature activation+8.90e-02

Pos logit contributions
engu+0.015
 slices+0.015
 flour+0.015
 jam+0.014
 icing+0.014
Neg logit contributions
 down-0.007
 alone-0.006
 up-0.006
 away-0.005
rated-0.005
Token excited
Token ID6568
Feature activation+1.61e+00

Pos logit contributions
 flour+0.016
 slices+0.016
 jam+0.016
 sprink+0.015
 cr+0.015
Neg logit contributions
rated-0.013
ny-0.012
away-0.012
arrass-0.012
 down-0.011
Token that
Token ID326
Feature activation+1.61e+00

Pos logit contributions
rated+0.040
 alone+0.038
 down+0.036
umpy+0.036
itates+0.036
Neg logit contributions
 jam-0.037
 slices-0.037
 flour-0.037
 cr-0.035
 strips-0.033
Token he
Token ID339
Feature activation+1.80e+00

Pos logit contributions
rated+0.028
 alone+0.026
 down+0.026
ling+0.025
ny+0.025
Neg logit contributions
 slices-0.049
 jam-0.049
 flour-0.049
 cr-0.047
 frost-0.045
Token was
Token ID373
Feature activation+5.57e-01

Pos logit contributions
rated+0.041
 alone+0.040
 down+0.040
away+0.038
ling+0.037
Neg logit contributions
 jam-0.042
 slices-0.042
 flour-0.042
 cr-0.041
 strips-0.038
Token w
Token ID266
Feature activation+9.91e-01

Pos logit contributions
rated+0.014
away+0.012
ny+0.012
ling+0.012
arrass+0.012
Neg logit contributions
 jam-0.014
 flour-0.013
 slices-0.013
 cr-0.013
 frost-0.012
Tokenigg
Token ID6950
Feature activation+4.84e+00

Pos logit contributions
 down+0.024
rated+0.024
 alone+0.022
ny+0.021
 up+0.021
Neg logit contributions
 slices-0.023
 jam-0.023
 flour-0.023
 icing-0.021
 frost-0.021
Tokenling
Token ID1359
Feature activation+2.41e+00

Pos logit contributions
 unnoticed+0.072
itates+0.069
Hi+0.066
thank+0.062
Nice+0.062
Neg logit contributions
 slices-0.123
 flour-0.121
 jam-0.121
 cones-0.119
engu-0.119
Token around
Token ID1088
Feature activation+1.11e+00

Pos logit contributions
rated+0.053
 down+0.052
 alone+0.052
ling+0.049
 up+0.048
Neg logit contributions
 slices-0.063
 jam-0.062
 flour-0.062
 cr-0.059
 frost-0.056
Token like
Token ID588
Feature activation-1.75e+00

Pos logit contributions
 down+0.023
 alone+0.023
rated+0.023
 up+0.022
ling+0.022
Neg logit contributions
 slices-0.030
 flour-0.030
 jam-0.029
 cr-0.028
 icing-0.027
Token a
Token ID257
Feature activation+3.88e-01

Pos logit contributions
 slices+0.052
 flour+0.051
 jam+0.050
 slot+0.049
engu+0.049
Neg logit contributions
 down-0.031
 alone-0.030
 up-0.029
 freely-0.026
rated-0.026
Token fish
Token ID5916
Feature activation+4.28e-01

Pos logit contributions
 down+0.008
 alone+0.008
rated+0.008
ling+0.008
 up+0.008
Neg logit contributions
 slices-0.010
 flour-0.010
 jam-0.010
 icing-0.009
 frost-0.009
Token didn
Token ID1422
Feature activation+7.97e-01

Pos logit contributions
away+0.060
 freely+0.057
rated+0.056
umpy+0.056
oun+0.055
Neg logit contributions
 jam-0.061
 cr-0.058
 flour-0.057
 slices-0.054
 toast-0.053
Token't
Token ID470
Feature activation+7.86e-01

Pos logit contributions
rated+0.015
 alone+0.013
ling+0.013
 down+0.013
away+0.013
Neg logit contributions
 flour-0.024
 slices-0.024
 jam-0.024
 cr-0.023
 tape-0.022
Token want
Token ID765
Feature activation+9.11e-01

Pos logit contributions
ling+0.017
 freely+0.016
lves+0.016
handed+0.016
itating+0.016
Neg logit contributions
 cr-0.019
 jam-0.019
 flour-0.018
 slices-0.018
 frost-0.017
Token to
Token ID284
Feature activation+8.63e-01

Pos logit contributions
 down+0.013
 alone+0.013
rated+0.012
 up+0.012
 away+0.011
Neg logit contributions
 slices-0.031
 flour-0.030
 jam-0.029
 cr-0.028
engu-0.028
Token do
Token ID466
Feature activation+2.25e+00

Pos logit contributions
rated+0.018
 down+0.017
 alone+0.017
ling+0.017
 freely+0.016
Neg logit contributions
 jam-0.023
 flour-0.023
 slices-0.023
 cr-0.023
 frost-0.022
Token anything
Token ID1997
Feature activation+4.83e+00

Pos logit contributions
rated+0.062
ny+0.058
umpy+0.057
ling+0.057
 alone+0.055
Neg logit contributions
 jam-0.051
 flour-0.049
 slices-0.048
 cr-0.046
 frost-0.045
Token.
Token ID13
Feature activation+1.46e+00

Pos logit contributions
rated+0.148
arted+0.141
umper+0.133
aciously+0.132
ny+0.132
Neg logit contributions
 jam-0.080
 cr-0.074
 flour-0.070
 powder-0.065
 toast-0.064
Token
Token ID198
Feature activation+3.99e-01

Pos logit contributions
rated+0.043
ling+0.041
away+0.041
 alone+0.039
 freely+0.038
Neg logit contributions
 jam-0.028
 flour-0.028
 slices-0.028
 cr-0.027
 frost-0.025
Token
Token ID198
Feature activation+1.17e+00

Pos logit contributions
rated+0.010
 down+0.010
 alone+0.010
ling+0.010
away+0.009
Neg logit contributions
 slices-0.009
 flour-0.009
 jam-0.009
 cr-0.008
 frost-0.008
TokenLater
Token ID18602
Feature activation-7.18e-01

Pos logit contributions
rated+0.031
 down+0.030
 alone+0.029
ling+0.028
 freely+0.028
Neg logit contributions
 jam-0.026
 flour-0.025
 slices-0.025
 cr-0.024
 frost-0.022
Token that
Token ID326
Feature activation+4.80e-01

Pos logit contributions
 slices+0.019
 flour+0.019
 jam+0.018
engu+0.018
 cr+0.018
Neg logit contributions
 down-0.016
 alone-0.016
 up-0.015
rated-0.015
ling-0.014
Token is
Token ID318
Feature activation-2.17e+00

Pos logit contributions
 slices+0.023
 jam+0.022
 flour+0.021
 unison+0.020
engu+0.020
Neg logit contributions
 down-0.018
 alone-0.017
 up-0.017
rated-0.016
 same-0.016
Token eight
Token ID3624
Feature activation-5.17e+00

Pos logit contributions
 slices+0.065
itt+0.063
light+0.058
 slot+0.058
 bisc+0.058
Neg logit contributions
 up-0.042
 down-0.039
 same-0.036
 alone-0.033
 back-0.032
Token.
Token ID13
Feature activation-1.67e+00

Pos logit contributions
 unison+0.110
engu+0.110
izza+0.105
 bisc+0.104
emon+0.102
Neg logit contributions
 alone-0.123
 down-0.118
 same-0.111
ling-0.111
 up-0.109
Token It
Token ID632
Feature activation+1.80e-01

Pos logit contributions
 slices+0.057
 jam+0.057
 flour+0.054
 toast+0.053
 da+0.052
Neg logit contributions
 down-0.023
ny-0.019
free-0.018
 up-0.018
 alone-0.018
Token has
Token ID468
Feature activation-2.49e+00

Pos logit contributions
 down+0.004
 up+0.004
 alone+0.003
ny+0.003
rated+0.003
Neg logit contributions
 slices-0.005
 jam-0.005
 flour-0.005
 icing-0.004
 toast-0.004
Token eight
Token ID3624
Feature activation-6.30e+00

Pos logit contributions
 slices+0.056
 jam+0.054
 flour+0.051
ama+0.051
 bisc+0.050
Neg logit contributions
 same-0.059
 down-0.056
 alone-0.055
 up-0.054
ny-0.053
Token and
Token ID290
Feature activation-2.14e+00

Pos logit contributions
 unison+0.123
 bisc+0.118
engu+0.116
 slot+0.116
 jam+0.114
Neg logit contributions
 down-0.156
 alone-0.155
 same-0.148
 up-0.146
ling-0.143
Token zero
Token ID6632
Feature activation-2.75e+00

Pos logit contributions
 slices+0.045
engu+0.044
 bisc+0.043
 unison+0.043
 flour+0.043
Neg logit contributions
 down-0.054
 up-0.051
 alone-0.050
 same-0.049
 away-0.045
Token."
Token ID526
Feature activation+1.02e-01

Pos logit contributions
 slices+0.068
 flour+0.067
 jam+0.066
 unison+0.064
engu+0.062
Neg logit contributions
 down-0.061
 alone-0.059
 up-0.057
ling-0.056
rated-0.054
Token
Token ID198
Feature activation-8.78e-01

Pos logit contributions
 down+0.002
 alone+0.002
 up+0.002
 freely+0.001
rated+0.001
Neg logit contributions
 slices-0.003
 jam-0.003
 flour-0.003
 cr-0.003
engu-0.003
Token
Token ID198
Feature activation-3.60e-01

Pos logit contributions
 slices+0.027
 jam+0.027
 flour+0.027
 cr+0.025
 icing+0.025
Neg logit contributions
 down-0.015
 alone-0.015
 up-0.014
 freely-0.013
rated-0.013
Token so
Token ID523
Feature activation-7.94e-01

Pos logit contributions
 slices+0.072
engu+0.072
izza+0.069
 flour+0.068
 icing+0.067
Neg logit contributions
 alone-0.040
 down-0.040
 up-0.039
 away-0.033
 same-0.032
Token weird
Token ID7650
Feature activation-5.88e-01

Pos logit contributions
 jam+0.022
 slices+0.022
 flour+0.022
 cr+0.021
 frost+0.020
Neg logit contributions
rated-0.016
 down-0.015
 alone-0.015
ling-0.015
ny-0.014
Token.
Token ID13
Feature activation-1.49e+00

Pos logit contributions
 slices+0.015
 flour+0.015
 jam+0.015
 cr+0.014
 icing+0.014
Neg logit contributions
 down-0.013
rated-0.013
 alone-0.013
 up-0.012
ling-0.012
Token It
Token ID632
Feature activation+6.20e-01

Pos logit contributions
 slices+0.057
 jam+0.055
 flour+0.054
 toast+0.052
 icing+0.052
Neg logit contributions
 down-0.018
 alone-0.016
ny-0.013
umpy-0.013
free-0.013
Token has
Token ID468
Feature activation-2.43e+00

Pos logit contributions
rated+0.016
 down+0.015
 alone+0.015
away+0.014
ling+0.014
Neg logit contributions
 jam-0.014
 flour-0.014
 slices-0.014
 cr-0.014
 frost-0.013
Token eight
Token ID3624
Feature activation-6.13e+00

Pos logit contributions
 slices+0.062
 flour+0.058
engu+0.057
 slot+0.056
izza+0.056
Neg logit contributions
 alone-0.053
 same-0.052
 down-0.052
 up-0.049
 away-0.047
Token legs
Token ID7405
Feature activation-3.66e+00

Pos logit contributions
engu+0.151
izza+0.150
 unison+0.149
 slot+0.149
 flour+0.147
Neg logit contributions
 alone-0.132
 down-0.127
ling-0.119
 up-0.117
 same-0.115
Token and
Token ID290
Feature activation-2.29e+00

Pos logit contributions
 slices+0.091
 jam+0.090
 flour+0.089
 cr+0.083
engu+0.082
Neg logit contributions
 down-0.087
 alone-0.085
rated-0.083
 up-0.081
ling-0.080
Token a
Token ID257
Feature activation-2.74e+00

Pos logit contributions
 slices+0.056
engu+0.055
 flour+0.054
 slot+0.052
 unison+0.052
Neg logit contributions
 down-0.053
 alone-0.050
 up-0.047
 same-0.047
 lonely-0.043
Token lot
Token ID1256
Feature activation-2.28e+00

Pos logit contributions
 slices+0.075
 flour+0.073
engu+0.070
 icing+0.069
 jam+0.069
Neg logit contributions
 down-0.058
 alone-0.056
 miserable-0.052
 lonely-0.051
 same-0.050
Token of
Token ID286
Feature activation-2.40e+00

Pos logit contributions
 slices+0.054
 flour+0.051
 unison+0.051
 jam+0.050
engu+0.050
Neg logit contributions
 alone-0.057
 down-0.055
 up-0.052
umpy-0.050
 away-0.050
Token was
Token ID373
Feature activation-2.03e+00

Pos logit contributions
 down+0.016
 up+0.016
 alone+0.015
ny+0.015
rated+0.014
Neg logit contributions
 slices-0.030
 jam-0.029
 flour-0.029
 icing-0.028
 toast-0.028
Token a
Token ID257
Feature activation-1.26e+00

Pos logit contributions
itt+0.066
engu+0.064
keys+0.063
 slices+0.062
 unison+0.061
Neg logit contributions
 down-0.035
 up-0.035
 alone-0.034
 back-0.029
 away-0.029
Token box
Token ID3091
Feature activation-7.92e-01

Pos logit contributions
 slices+0.034
els+0.033
ions+0.032
 flour+0.032
engu+0.032
Neg logit contributions
 alone-0.025
 down-0.025
 same-0.024
 lonely-0.024
 away-0.024
Token of
Token ID286
Feature activation-2.55e+00

Pos logit contributions
 jam+0.023
 flour+0.023
 slices+0.023
engu+0.021
 icing+0.021
Neg logit contributions
 alone-0.015
 down-0.015
 up-0.015
ny-0.014
ling-0.013
Token ch
Token ID442
Feature activation-1.80e+00

Pos logit contributions
ama+0.061
engu+0.061
mmm+0.060
keys+0.059
comb+0.059
Neg logit contributions
 alone-0.059
 same-0.056
 up-0.056
 down-0.056
 lonely-0.051
Tokenocol
Token ID4668
Feature activation-6.08e+00

Pos logit contributions
 jam+0.032
 flour+0.032
 slices+0.031
 toast+0.031
 bisc+0.029
Neg logit contributions
away-0.050
 down-0.050
 alone-0.050
ling-0.050
ny-0.048
Tokenates
Token ID689
Feature activation-2.33e+00

Pos logit contributions
 flour+0.112
 jam+0.111
 unison+0.110
 sheet+0.104
engu+0.103
Neg logit contributions
 down-0.156
ling-0.151
ny-0.144
 away-0.143
away-0.142
Token.
Token ID13
Feature activation+3.74e-01

Pos logit contributions
 flour+0.070
 jam+0.068
 slices+0.067
engu+0.067
 unison+0.065
Neg logit contributions
 down-0.044
 up-0.043
 alone-0.039
ling-0.037
 away-0.036
Token Lily
Token ID20037
Feature activation-8.36e-01

Pos logit contributions
 down+0.005
 alone+0.005
 up+0.004
ny+0.004
rated+0.004
Neg logit contributions
 jam-0.013
 flour-0.013
 slices-0.013
 cr-0.013
 frost-0.012
Token loved
Token ID6151
Feature activation-6.95e-01

Pos logit contributions
 slices+0.026
 flour+0.026
 jam+0.025
 icing+0.024
 unison+0.023
Neg logit contributions
 down-0.014
 alone-0.014
 up-0.014
ny-0.013
rated-0.013
Token ch
Token ID442
Feature activation-1.16e+00

Pos logit contributions
 slices+0.017
 slot+0.017
 unison+0.016
 flour+0.016
 jam+0.016
Neg logit contributions
 down-0.016
 alone-0.015
 up-0.015
 same-0.013
 away-0.013
Token a
Token ID257
Feature activation-6.81e-02

Pos logit contributions
 slices+0.026
 jam+0.025
engu+0.025
 flour+0.025
 icing+0.024
Neg logit contributions
 alone-0.021
 up-0.021
 down-0.020
 away-0.018
 same-0.018
Token big
Token ID1263
Feature activation-1.36e+00

Pos logit contributions
 slices+0.002
ions+0.002
 jam+0.002
els+0.002
 icing+0.002
Neg logit contributions
 down-0.001
 alone-0.001
 miserable-0.001
 lonely-0.001
 up-0.001
Token box
Token ID3091
Feature activation+1.35e-01

Pos logit contributions
engu+0.032
 slices+0.031
izza+0.030
keys+0.030
emon+0.030
Neg logit contributions
 down-0.034
 alone-0.034
 up-0.031
 away-0.030
 lonely-0.030
Token of
Token ID286
Feature activation-1.62e+00

Pos logit contributions
 alone+0.003
 down+0.003
 up+0.003
ling+0.003
rated+0.003
Neg logit contributions
 jam-0.004
 flour-0.004
 slices-0.004
engu-0.004
 icing-0.003
Token ch
Token ID442
Feature activation-1.33e+00

Pos logit contributions
engu+0.041
mmm+0.039
 slices+0.039
emon+0.039
 unison+0.038
Neg logit contributions
 alone-0.037
 up-0.035
 down-0.035
 same-0.033
 lonely-0.031
Tokenocol
Token ID4668
Feature activation-5.88e+00

Pos logit contributions
 jam+0.024
 flour+0.024
 slices+0.023
 cr+0.022
 toast+0.021
Neg logit contributions
 alone-0.039
ling-0.038
 down-0.038
away-0.038
umpy-0.037
Tokenates
Token ID689
Feature activation-1.59e+00

Pos logit contributions
 flour+0.107
 jam+0.106
 unison+0.102
 slot+0.098
 sheet+0.097
Neg logit contributions
 down-0.153
ling-0.150
 alone-0.141
away-0.141
 up-0.141
Token.
Token ID13
Feature activation-3.28e-01

Pos logit contributions
 flour+0.046
 jam+0.046
engu+0.045
 slices+0.045
 cr+0.043
Neg logit contributions
 down-0.032
 up-0.030
 alone-0.029
ling-0.028
 away-0.026
Token It
Token ID632
Feature activation+8.32e-01

Pos logit contributions
 jam+0.012
 slices+0.012
 flour+0.012
 cr+0.012
 frost+0.012
Neg logit contributions
 down-0.004
 alone-0.003
 up-0.003
ny-0.003
arrass-0.003
Token was
Token ID373
Feature activation-2.28e+00

Pos logit contributions
 down+0.014
 up+0.014
 alone+0.014
ny+0.014
rated+0.014
Neg logit contributions
 slices-0.026
 flour-0.026
 jam-0.025
 icing-0.024
 toast-0.024
Token so
Token ID523
Feature activation-8.73e-01

Pos logit contributions
itt+0.072
engu+0.071
 slices+0.069
izza+0.068
 unison+0.067
Neg logit contributions
 up-0.040
 alone-0.038
 down-0.038
 lonely-0.032
 same-0.032
Token find
Token ID1064
Feature activation-8.27e-01

Pos logit contributions
 slices+0.047
 flour+0.047
 jam+0.046
 strips+0.043
 icing+0.042
Neg logit contributions
 down-0.031
 alone-0.031
 up-0.029
 away-0.028
ny-0.027
Token a
Token ID257
Feature activation-1.53e+00

Pos logit contributions
mmm+0.023
 jam+0.022
mm+0.022
 slices+0.022
mmmm+0.021
Neg logit contributions
 up-0.018
 down-0.017
 around-0.016
 alone-0.016
 same-0.015
Token box
Token ID3091
Feature activation-1.02e+00

Pos logit contributions
 slices+0.040
els+0.039
mm+0.039
ions+0.038
itt+0.038
Neg logit contributions
 down-0.033
 same-0.031
 away-0.030
ny-0.029
 up-0.029
Token of
Token ID286
Feature activation-2.75e+00

Pos logit contributions
 jam+0.029
 flour+0.028
 slices+0.028
engu+0.027
 icing+0.026
Neg logit contributions
 down-0.021
 alone-0.021
 up-0.021
 around-0.019
 away-0.019
Token ch
Token ID442
Feature activation-2.03e+00

Pos logit contributions
ama+0.071
keys+0.069
mmm+0.068
mm+0.068
comb+0.066
Neg logit contributions
 alone-0.062
 up-0.060
 same-0.059
 down-0.059
 around-0.056
Tokenocol
Token ID4668
Feature activation-5.81e+00

Pos logit contributions
 flour+0.037
 jam+0.036
 slices+0.035
 bisc+0.034
 frost+0.034
Neg logit contributions
away-0.057
 down-0.056
 alone-0.056
ling-0.056
 freely-0.054
Tokenates
Token ID689
Feature activation-2.78e+00

Pos logit contributions
 flour+0.108
 jam+0.107
 unison+0.104
 slot+0.099
 sprink+0.097
Neg logit contributions
 down-0.147
ny-0.142
ling-0.141
away-0.136
 away-0.135
Token.
Token ID13
Feature activation+2.84e-01

Pos logit contributions
 flour+0.084
 jam+0.083
engu+0.080
 unison+0.080
 slices+0.080
Neg logit contributions
 down-0.054
 up-0.052
 alone-0.046
 away-0.045
 around-0.045
Token
Token ID198
Feature activation-1.22e+00

Pos logit contributions
 down+0.004
 up+0.003
 freely+0.003
 alone+0.003
ny+0.003
Neg logit contributions
 jam-0.010
 slices-0.010
 cr-0.010
 icing-0.010
 frost-0.010
Token
Token ID198
Feature activation-5.19e-01

Pos logit contributions
 jam+0.037
 slices+0.037
 flour+0.036
 cr+0.035
 icing+0.035
Neg logit contributions
 down-0.021
 alone-0.021
 up-0.021
rated-0.020
 freely-0.020
Token"
Token ID1
Feature activation-8.92e-01

Pos logit contributions
 jam+0.015
 slices+0.014
 flour+0.014
 icing+0.014
 cr+0.014
Neg logit contributions
 alone-0.010
 down-0.010
ny-0.010
rated-0.010
away-0.009
Token and
Token ID290
Feature activation-3.69e-01

Pos logit contributions
 jam+0.052
 flour+0.051
engu+0.050
 slices+0.047
emon+0.047
Neg logit contributions
 down-0.039
 up-0.038
 alone-0.035
 away-0.033
 around-0.033
Token started
Token ID2067
Feature activation-1.27e+00

Pos logit contributions
 slices+0.010
 jam+0.010
 flour+0.010
 strips+0.010
 cr+0.010
Neg logit contributions
 down-0.008
 alone-0.007
 up-0.007
rated-0.007
 away-0.006
Token eating
Token ID6600
Feature activation-1.70e+00

Pos logit contributions
 jam+0.040
 slices+0.040
 flour+0.040
 unison+0.038
engu+0.038
Neg logit contributions
 down-0.021
 up-0.019
 alone-0.019
 away-0.016
ling-0.016
Token the
Token ID262
Feature activation-3.05e-01

Pos logit contributions
engu+0.048
emon+0.046
 unison+0.045
 jam+0.045
 slices+0.045
Neg logit contributions
 down-0.035
 up-0.034
 alone-0.030
 away-0.029
ling-0.028
Token ch
Token ID442
Feature activation-1.98e+00

Pos logit contributions
 slices+0.007
 jam+0.007
 flour+0.007
engu+0.006
 unison+0.006
Neg logit contributions
 up-0.008
 down-0.008
 alone-0.007
rated-0.007
ling-0.007
Tokenocol
Token ID4668
Feature activation-5.75e+00

Pos logit contributions
 slices+0.027
 flour+0.026
 jam+0.026
 cr+0.024
 frost+0.022
Neg logit contributions
rated-0.066
 alone-0.066
 down-0.065
ling-0.065
ny-0.063
Tokenates
Token ID689
Feature activation-1.54e+00

Pos logit contributions
 flour+0.103
 jam+0.100
 unison+0.098
 cr+0.093
 sheet+0.092
Neg logit contributions
 down-0.152
ling-0.151
ny-0.146
rated-0.145
away-0.143
Token.
Token ID13
Feature activation-8.18e-01

Pos logit contributions
 flour+0.043
 jam+0.042
 slices+0.042
engu+0.041
 cr+0.040
Neg logit contributions
 down-0.032
 up-0.030
 alone-0.029
ling-0.028
 away-0.027
Token The
Token ID383
Feature activation+2.36e-01

Pos logit contributions
 slices+0.027
 flour+0.026
 jam+0.026
 cr+0.025
 frost+0.024
Neg logit contributions
rated-0.013
 alone-0.012
 down-0.012
ling-0.012
away-0.011
Token shop
Token ID6128
Feature activation-1.93e+00

Pos logit contributions
 same+0.005
 down+0.005
 alone+0.005
 up+0.005
ling+0.005
Neg logit contributions
 slices-0.005
engu-0.005
 flour-0.005
ions-0.005
 unison-0.005
Tokenkeeper
Token ID13884
Feature activation-2.25e+00

Pos logit contributions
 jam+0.059
 flour+0.057
 slices+0.056
 strips+0.053
 icing+0.053
Neg logit contributions
 down-0.033
 up-0.033
 alone-0.031
ling-0.031
ny-0.030
Token a
Token ID257
Feature activation-1.44e+00

Pos logit contributions
 slices+0.061
 jam+0.059
 flour+0.058
engu+0.058
 icing+0.055
Neg logit contributions
 down-0.050
 alone-0.048
 up-0.048
rated-0.042
 away-0.041
Token big
Token ID1263
Feature activation-2.27e+00

Pos logit contributions
ions+0.041
 slices+0.041
els+0.040
engu+0.040
 flour+0.038
Neg logit contributions
 alone-0.027
 down-0.026
 lonely-0.025
 miserable-0.024
 same-0.024
Token box
Token ID3091
Feature activation-3.07e-01

Pos logit contributions
engu+0.054
 slices+0.051
keys+0.050
 flour+0.049
izza+0.048
Neg logit contributions
 down-0.059
 alone-0.058
 up-0.053
 away-0.051
 lonely-0.050
Token of
Token ID286
Feature activation-2.20e+00

Pos logit contributions
 flour+0.008
 slices+0.008
 jam+0.008
engu+0.008
 strips+0.008
Neg logit contributions
 down-0.007
 alone-0.007
 up-0.006
rated-0.006
ling-0.006
Token ch
Token ID442
Feature activation-1.56e+00

Pos logit contributions
engu+0.055
 unison+0.053
 Theo+0.053
emon+0.052
 Ma+0.051
Neg logit contributions
 alone-0.053
 down-0.050
 up-0.047
 same-0.046
 away-0.045
Tokenocol
Token ID4668
Feature activation-5.72e+00

Pos logit contributions
 flour+0.025
 slices+0.025
 jam+0.024
 cr+0.023
engu+0.022
Neg logit contributions
 alone-0.048
 down-0.047
away-0.047
ling-0.046
umpy-0.045
Tokenates
Token ID689
Feature activation-1.99e+00

Pos logit contributions
 flour+0.105
 jam+0.102
 unison+0.102
 Theo+0.097
engu+0.097
Neg logit contributions
 down-0.145
ling-0.140
ny-0.135
 away-0.134
away-0.133
Token!
Token ID0
Feature activation-3.51e-01

Pos logit contributions
 flour+0.055
 slices+0.054
 jam+0.053
engu+0.053
 cr+0.050
Neg logit contributions
 down-0.043
 up-0.040
 alone-0.040
ling-0.037
rated-0.036
Token
Token ID220
Feature activation+6.87e-01

Pos logit contributions
 slices+0.012
 jam+0.012
 flour+0.012
 cr+0.012
 frost+0.012
Neg logit contributions
 down-0.005
 alone-0.004
rated-0.004
 up-0.004
ny-0.004
Token
Token ID198
Feature activation-8.60e-01

Pos logit contributions
rated+0.017
 alone+0.016
 down+0.016
ling+0.016
away+0.015
Neg logit contributions
 slices-0.016
 flour-0.016
 jam-0.016
 cr-0.015
 frost-0.014
Token
Token ID198
Feature activation+6.93e-02

Pos logit contributions
 slices+0.022
 jam+0.022
 flour+0.022
 cr+0.021
 frost+0.020
Neg logit contributions
rated-0.019
 down-0.019
 alone-0.019
ling-0.018
 up-0.018
Token was
Token ID373
Feature activation-2.08e+00

Pos logit contributions
 jam+0.006
 slices+0.006
 flour+0.006
engu+0.006
 frost+0.006
Neg logit contributions
 down-0.005
 up-0.005
 alone-0.005
ling-0.004
 away-0.004
Token a
Token ID257
Feature activation-8.33e-01

Pos logit contributions
 slices+0.059
 jam+0.056
engu+0.055
 flour+0.054
 slot+0.054
Neg logit contributions
 up-0.042
 down-0.040
 alone-0.039
 same-0.036
 around-0.034
Token box
Token ID3091
Feature activation-3.45e-01

Pos logit contributions
 slices+0.026
ions+0.025
els+0.025
emon+0.025
engu+0.025
Neg logit contributions
 down-0.014
 alone-0.013
 miserable-0.013
 up-0.012
 lonely-0.012
Token of
Token ID286
Feature activation-1.90e+00

Pos logit contributions
 jam+0.009
 flour+0.009
 slices+0.009
engu+0.009
 icing+0.008
Neg logit contributions
 down-0.008
 alone-0.008
 up-0.007
ling-0.007
 around-0.007
Token ch
Token ID442
Feature activation-1.46e+00

Pos logit contributions
engu+0.046
mm+0.045
mmm+0.044
izza+0.044
keys+0.044
Neg logit contributions
 up-0.043
 alone-0.042
 same-0.041
 down-0.041
 miserable-0.037
Tokenocol
Token ID4668
Feature activation-5.68e+00

Pos logit contributions
 slices+0.027
 jam+0.027
 flour+0.027
 cr+0.025
 toast+0.025
Neg logit contributions
ling-0.042
 alone-0.041
 down-0.041
away-0.040
umpy-0.040
Tokenates
Token ID689
Feature activation-1.62e+00

Pos logit contributions
 flour+0.108
 jam+0.106
 unison+0.104
 slot+0.099
engu+0.097
Neg logit contributions
 down-0.143
ling-0.139
ny-0.131
 away-0.130
ping-0.130
Token.
Token ID13
Feature activation-1.05e+00

Pos logit contributions
 flour+0.046
 jam+0.045
 slices+0.045
engu+0.044
 cr+0.041
Neg logit contributions
 down-0.034
 up-0.032
 alone-0.031
ling-0.031
 away-0.028
Token "
Token ID366
Feature activation-2.53e-01

Pos logit contributions
 jam+0.036
 flour+0.035
 slices+0.035
 cr+0.035
 frost+0.034
Neg logit contributions
 down-0.015
 alone-0.014
 up-0.013
ny-0.013
 freely-0.012
TokenFor
Token ID1890
Feature activation-1.82e+00

Pos logit contributions
 flour+0.008
 slices+0.008
 jam+0.008
 cr+0.007
 jars+0.007
Neg logit contributions
ny-0.004
rated-0.004
 down-0.004
 alone-0.004
free-0.004
Token me
Token ID502
Feature activation-1.19e+00

Pos logit contributions
 slices+0.056
 flour+0.053
 jam+0.052
 slot+0.051
 bisc+0.050
Neg logit contributions
 alone-0.034
 down-0.032
 up-0.031
 same-0.029
 away-0.028
Token a
Token ID257
Feature activation-2.89e-01

Pos logit contributions
engu+0.018
 flour+0.018
 slices+0.018
 jam+0.018
izza+0.017
Neg logit contributions
 alone-0.014
 down-0.013
 up-0.013
 away-0.012
 freely-0.011
Token big
Token ID1263
Feature activation-1.50e+00

Pos logit contributions
 slices+0.008
 flour+0.008
 jam+0.008
 icing+0.008
 frost+0.008
Neg logit contributions
 alone-0.005
 down-0.005
 away-0.005
rated-0.005
ny-0.005
Token box
Token ID3091
Feature activation+3.97e-01

Pos logit contributions
 slices+0.045
 flour+0.045
engu+0.044
 jam+0.044
ions+0.042
Neg logit contributions
 alone-0.027
 down-0.026
 away-0.024
ny-0.023
 lonely-0.023
Token of
Token ID286
Feature activation-2.21e+00

Pos logit contributions
 alone+0.009
 down+0.009
rated+0.009
 up+0.009
ling+0.009
Neg logit contributions
 flour-0.010
 jam-0.010
 slices-0.010
engu-0.009
 cr-0.009
Token ch
Token ID442
Feature activation-1.09e+00

Pos logit contributions
engu+0.052
emon+0.051
 slices+0.050
 Theo+0.049
vals+0.049
Neg logit contributions
 alone-0.055
 down-0.050
 up-0.050
 same-0.047
 away-0.047
Tokenocol
Token ID4668
Feature activation-5.66e+00

Pos logit contributions
 flour+0.019
 slices+0.019
 jam+0.019
 cr+0.017
 frost+0.016
Neg logit contributions
 alone-0.033
ling-0.033
 down-0.033
away-0.032
rated-0.032
Tokenates
Token ID689
Feature activation-1.23e+00

Pos logit contributions
 flour+0.110
 jam+0.110
 slot+0.103
 Theo+0.102
engu+0.101
Neg logit contributions
ling-0.138
 down-0.137
ny-0.129
 away-0.129
away-0.129
Token!
Token ID0
Feature activation-5.79e-01

Pos logit contributions
 flour+0.035
 jam+0.034
 slices+0.034
engu+0.032
 cr+0.032
Neg logit contributions
 down-0.026
 alone-0.025
 up-0.024
rated-0.023
ling-0.023
Token<|endoftext|>
Token ID50256
Feature activation+1.82e+00

Pos logit contributions
 slices+0.015
 flour+0.015
 jam+0.015
 cr+0.014
 frost+0.014
Neg logit contributions
rated-0.013
 down-0.012
 alone-0.012
ling-0.012
 up-0.012
TokenOnce
Token ID7454
Feature activation+6.06e-01

Pos logit contributions
 down+0.039
 alone+0.038
rated+0.037
 up+0.036
ny+0.036
Neg logit contributions
 slices-0.049
 jam-0.049
 flour-0.048
 cr-0.046
 icing-0.045
Token upon
Token ID2402
Feature activation+7.48e-01

Pos logit contributions
rated+0.015
arrass+0.014
 invincible+0.013
ny+0.013
away+0.013
Neg logit contributions
 flour-0.014
 slices-0.014
 cr-0.014
 jam-0.014
engu-0.012
Token with
Token ID351
Feature activation-2.73e+00

Pos logit contributions
 jam+0.056
 slices+0.056
 flour+0.055
engu+0.050
 icing+0.049
Neg logit contributions
 down-0.051
 up-0.048
 alone-0.048
ny-0.045
 away-0.044
Token flowers
Token ID12734
Feature activation-4.36e+00

Pos logit contributions
ama+0.069
 slot+0.069
 slices+0.068
mmm+0.067
izza+0.067
Neg logit contributions
 down-0.058
 up-0.057
 away-0.055
 alone-0.053
 same-0.052
Token,
Token ID11
Feature activation-3.27e+00

Pos logit contributions
engu+0.134
 flour+0.133
 slices+0.132
 unison+0.132
izza+0.130
Neg logit contributions
 down-0.080
 up-0.073
 around-0.071
 away-0.067
 alone-0.067
Token hats
Token ID23910
Feature activation-2.68e+00

Pos logit contributions
engu+0.079
 slices+0.077
izza+0.075
 slot+0.073
ama+0.072
Neg logit contributions
 down-0.076
 up-0.072
ling-0.070
ny-0.068
 alone-0.067
Token with
Token ID351
Feature activation-2.62e+00

Pos logit contributions
 jam+0.073
 slices+0.073
 flour+0.073
engu+0.067
 unison+0.065
Neg logit contributions
 down-0.060
 up-0.057
 alone-0.055
ny-0.052
 away-0.052
Token ears
Token ID11368
Feature activation-5.65e+00

Pos logit contributions
 slot+0.066
 slices+0.066
ama+0.066
izza+0.064
mmm+0.064
Neg logit contributions
 down-0.056
 up-0.055
 away-0.053
 alone-0.053
 around-0.050
Token,
Token ID11
Feature activation-3.02e+00

Pos logit contributions
 slices+0.170
 jam+0.168
 flour+0.163
 unison+0.162
 slot+0.161
Neg logit contributions
 down-0.097
 up-0.091
 around-0.088
ling-0.087
ny-0.086
Token and
Token ID290
Feature activation-1.84e+00

Pos logit contributions
engu+0.077
 slices+0.076
 flour+0.072
izza+0.072
 slot+0.072
Neg logit contributions
 down-0.065
 up-0.061
ling-0.060
ny-0.059
ly-0.057
Token hats
Token ID23910
Feature activation-2.95e-01

Pos logit contributions
 slices+0.042
engu+0.040
ama+0.040
itt+0.039
 flour+0.038
Neg logit contributions
 down-0.043
 up-0.040
 same-0.039
 alone-0.039
ny-0.038
Token with
Token ID351
Feature activation-1.84e+00

Pos logit contributions
 jam+0.009
 flour+0.009
 slices+0.009
engu+0.008
 cr+0.008
Neg logit contributions
 down-0.006
 alone-0.006
 up-0.005
 away-0.005
ling-0.005
Token stars
Token ID5788
Feature activation-1.77e+00

Pos logit contributions
 slices+0.044
ama+0.043
 slot+0.043
engu+0.043
 flour+0.043
Neg logit contributions
 down-0.041
 up-0.039
 alone-0.039
 away-0.038
 around-0.037
Token a
Token ID257
Feature activation-1.14e+00

Pos logit contributions
 slices+0.013
 jam+0.012
 flour+0.012
engu+0.012
 cr+0.011
Neg logit contributions
 down-0.012
 alone-0.012
 up-0.011
rated-0.011
 same-0.011
Token large
Token ID1588
Feature activation-2.18e+00

Pos logit contributions
 slices+0.034
 jam+0.032
ions+0.031
 flour+0.031
engu+0.031
Neg logit contributions
 down-0.021
 alone-0.020
 up-0.020
 same-0.019
 miserable-0.019
Token selection
Token ID6356
Feature activation-1.40e+00

Pos logit contributions
 slices+0.050
engu+0.048
 jam+0.047
keys+0.046
 flour+0.045
Neg logit contributions
 down-0.056
 alone-0.054
 up-0.052
 same-0.048
 away-0.048
Token of
Token ID286
Feature activation-2.08e+00

Pos logit contributions
 jam+0.034
engu+0.032
 flour+0.032
 slices+0.031
 unison+0.031
Neg logit contributions
 alone-0.034
 down-0.033
 up-0.032
 around-0.031
ling-0.030
Token ch
Token ID442
Feature activation-2.36e+00

Pos logit contributions
 slices+0.052
 jam+0.050
 unison+0.048
light+0.048
 slot+0.047
Neg logit contributions
 up-0.045
 down-0.044
 alone-0.043
 same-0.040
 away-0.040
Tokenocol
Token ID4668
Feature activation-5.55e+00

Pos logit contributions
 jam+0.036
 slices+0.035
 flour+0.035
 cr+0.032
 frost+0.030
Neg logit contributions
 down-0.074
 alone-0.074
ling-0.073
umpy-0.072
away-0.072
Tokenates
Token ID689
Feature activation-1.89e+00

Pos logit contributions
 flour+0.101
 unison+0.099
 jam+0.099
 sheet+0.096
 slot+0.095
Neg logit contributions
 down-0.144
ling-0.140
away-0.136
ny-0.135
 up-0.133
Token.
Token ID13
Feature activation-4.68e-01

Pos logit contributions
 flour+0.055
 jam+0.055
 slices+0.054
engu+0.052
 unison+0.051
Neg logit contributions
 down-0.036
 up-0.034
 alone-0.033
ling-0.032
 away-0.030
Token Ell
Token ID7122
Feature activation+1.78e+00

Pos logit contributions
 slices+0.014
 jam+0.014
 flour+0.014
 cr+0.014
 icing+0.013
Neg logit contributions
 down-0.008
 alone-0.008
rated-0.008
 up-0.008
ny-0.007
Tokena
Token ID64
Feature activation-2.19e+00

Pos logit contributions
rated+0.050
 down+0.049
 alone+0.049
 up+0.046
ling+0.046
Neg logit contributions
 slices-0.036
 flour-0.035
 jam-0.035
 cr-0.033
 frost-0.031
Token picked
Token ID6497
Feature activation+8.99e-01

Pos logit contributions
 flour+0.067
 jam+0.067
 slices+0.066
 icing+0.062
 slot+0.062
Neg logit contributions
 alone-0.036
 down-0.035
 up-0.035
ny-0.034
ling-0.034
Token,
Token ID11
Feature activation-1.65e+00

Pos logit contributions
 slices+0.057
 jam+0.056
 flour+0.056
 strips+0.052
 frost+0.051
Neg logit contributions
 down-0.064
 up-0.060
 alone-0.060
rated-0.056
ling-0.055
Token "
Token ID366
Feature activation-6.06e-01

Pos logit contributions
 slices+0.043
 jam+0.042
 flour+0.042
 cr+0.040
 frost+0.039
Neg logit contributions
rated-0.037
 alone-0.035
 down-0.035
ling-0.034
away-0.033
TokenThis
Token ID1212
Feature activation-1.14e+00

Pos logit contributions
 flour+0.018
 slices+0.018
 jam+0.018
 cr+0.017
 jars+0.016
Neg logit contributions
 down-0.010
rated-0.010
ny-0.010
 alone-0.010
away-0.010
Token is
Token ID318
Feature activation-2.36e+00

Pos logit contributions
 slices+0.032
 flour+0.031
 jam+0.030
 umb+0.030
engu+0.030
Neg logit contributions
 up-0.021
 same-0.021
 down-0.020
ny-0.019
 away-0.019
Token number
Token ID1271
Feature activation-2.90e+00

Pos logit contributions
 slices+0.077
itt+0.075
 flour+0.071
light+0.070
 slot+0.069
Neg logit contributions
 up-0.039
 down-0.038
 same-0.033
 back-0.031
 alone-0.031
Token eight
Token ID3624
Feature activation-5.51e+00

Pos logit contributions
 slices+0.083
 jam+0.082
 flour+0.082
 bisc+0.079
 unison+0.077
Neg logit contributions
 down-0.054
 up-0.051
 alone-0.047
ny-0.046
 around-0.045
Token."
Token ID526
Feature activation-1.04e+00

Pos logit contributions
 powder+0.108
 flour+0.103
ions+0.102
light+0.102
Matt+0.101
Neg logit contributions
 down-0.123
 alone-0.122
 same-0.116
ly-0.115
ny-0.114
Token She
Token ID1375
Feature activation-2.15e+00

Pos logit contributions
 slices+0.033
 flour+0.032
 jam+0.032
 cr+0.031
engu+0.031
Neg logit contributions
 down-0.017
 alone-0.016
 up-0.016
 freely-0.014
ny-0.013
Token shows
Token ID2523
Feature activation-9.17e-01

Pos logit contributions
 slices+0.056
 flour+0.055
 jam+0.055
 cr+0.051
engu+0.050
Neg logit contributions
 down-0.046
 up-0.046
 alone-0.043
rated-0.043
ny-0.041
Token Tom
Token ID4186
Feature activation+4.77e-01

Pos logit contributions
 slices+0.023
 flour+0.022
 jam+0.022
engu+0.021
 cr+0.020
Neg logit contributions
 down-0.022
 up-0.022
 alone-0.021
 away-0.019
 same-0.019
Token a
Token ID257
Feature activation-1.77e+00

Pos logit contributions
 down+0.010
 alone+0.010
rated+0.010
 up+0.009
ling+0.009
Neg logit contributions
 slices-0.013
 flour-0.013
 jam-0.013
 cr-0.012
 frost-0.012
Token was
Token ID373
Feature activation-1.73e+00

Pos logit contributions
 down+0.004
 alone+0.004
 up+0.003
 away+0.003
ling+0.003
Neg logit contributions
 jam-0.006
 slices-0.006
 flour-0.006
engu-0.006
 rods-0.006
Token a
Token ID257
Feature activation-1.11e+00

Pos logit contributions
 slices+0.050
engu+0.049
 rods+0.049
 strips+0.048
 jam+0.048
Neg logit contributions
 up-0.035
 down-0.033
 alone-0.031
 around-0.029
 away-0.028
Token box
Token ID3091
Feature activation-8.99e-04

Pos logit contributions
els+0.035
ions+0.035
 slices+0.034
mm+0.033
emon+0.033
Neg logit contributions
 down-0.017
 alone-0.016
 miserable-0.016
 lonely-0.016
 troubled-0.015
Token of
Token ID286
Feature activation-2.13e+00

Pos logit contributions
 jam+0.000
 flour+0.000
 slices+0.000
engu+0.000
 rods+0.000
Neg logit contributions
 down-0.000
 alone-0.000
 up-0.000
ling-0.000
 around-0.000
Token ch
Token ID442
Feature activation-1.01e+00

Pos logit contributions
engu+0.059
 unison+0.057
emon+0.056
keys+0.055
 Theo+0.054
Neg logit contributions
 alone-0.044
 up-0.043
 down-0.042
 same-0.040
 miserable-0.038
Tokenocol
Token ID4668
Feature activation-5.50e+00

Pos logit contributions
 slices+0.017
 flour+0.016
 jam+0.016
 cr+0.015
 frost+0.015
Neg logit contributions
 alone-0.031
 down-0.030
ling-0.030
away-0.030
umpy-0.029
Tokenates
Token ID689
Feature activation-1.46e+00

Pos logit contributions
 flour+0.098
 unison+0.096
 jam+0.094
engu+0.091
 slot+0.089
Neg logit contributions
 down-0.145
ling-0.138
 alone-0.133
away-0.132
ny-0.132
Token!
Token ID0
Feature activation-1.17e+00

Pos logit contributions
 flour+0.044
 jam+0.043
engu+0.043
 slices+0.042
 unison+0.041
Neg logit contributions
 down-0.028
 up-0.027
 alone-0.025
ling-0.024
 away-0.023
Token Alice
Token ID14862
Feature activation-8.91e-01

Pos logit contributions
 jam+0.033
 slices+0.032
 flour+0.032
 cr+0.031
 frost+0.029
Neg logit contributions
 down-0.023
rated-0.023
 alone-0.023
 up-0.022
ny-0.021
Token and
Token ID290
Feature activation+6.79e-01

Pos logit contributions
 flour+0.028
 jam+0.027
 slices+0.027
 icing+0.025
 slot+0.024
Neg logit contributions
 down-0.016
 up-0.015
 alone-0.015
ny-0.014
ling-0.014
Token her
Token ID607
Feature activation-2.34e-01

Pos logit contributions
rated+0.010
 alone+0.010
 down+0.010
ling+0.009
ny+0.009
Neg logit contributions
 slices-0.022
 jam-0.022
 flour-0.022
 cr-0.021
 frost-0.021
Token.
Token ID13
Feature activation-3.84e-01

Pos logit contributions
rated+0.019
 down+0.019
 alone+0.019
ling+0.018
 up+0.018
Neg logit contributions
 slices-0.021
 jam-0.021
 flour-0.021
 cr-0.020
 frost-0.019
Token See
Token ID4091
Feature activation-5.64e-01

Pos logit contributions
 slices+0.014
 jam+0.013
 flour+0.013
 cr+0.012
 strips+0.012
Neg logit contributions
 alone-0.005
 down-0.005
ny-0.004
rated-0.004
 up-0.004
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
 slices+0.014
 flour+0.014
 jam+0.014
 cr+0.013
 icing+0.013
Neg logit contributions
 down-0.013
rated-0.013
 alone-0.013
 up-0.012
ling-0.012
Token it
Token ID340
Feature activation+3.62e-01

Pos logit contributions
 slices+0.042
engu+0.040
 flour+0.039
 jam+0.038
 icing+0.038
Neg logit contributions
 down-0.026
 alone-0.025
 up-0.025
 away-0.022
 same-0.021
Token has
Token ID468
Feature activation-1.15e+00

Pos logit contributions
rated+0.009
 down+0.009
away+0.009
arrass+0.009
 alone+0.008
Neg logit contributions
 jam-0.008
 flour-0.008
 slices-0.008
 cr-0.008
 frost-0.007
Token eight
Token ID3624
Feature activation-5.48e+00

Pos logit contributions
 slices+0.028
 flour+0.026
 jam+0.026
engu+0.026
 slot+0.025
Neg logit contributions
 alone-0.027
 down-0.026
 up-0.025
rated-0.025
 same-0.024
Token legs
Token ID7405
Feature activation-3.38e+00

Pos logit contributions
engu+0.128
 unison+0.124
izza+0.123
 slot+0.119
 slices+0.117
Neg logit contributions
 alone-0.138
 down-0.128
 up-0.121
ling-0.121
 lonely-0.117
Token and
Token ID290
Feature activation-1.31e+00

Pos logit contributions
 slices+0.091
 jam+0.089
 flour+0.087
engu+0.085
 icing+0.082
Neg logit contributions
 down-0.076
 alone-0.076
 up-0.070
rated-0.068
ling-0.068
Token two
Token ID734
Feature activation-2.92e+00

Pos logit contributions
 slices+0.030
engu+0.030
 flour+0.029
 slot+0.028
izza+0.028
Neg logit contributions
 alone-0.031
 down-0.031
 up-0.029
 same-0.028
rated-0.027
Token eyes
Token ID2951
Feature activation-1.14e+00

Pos logit contributions
 slices+0.064
engu+0.063
 flour+0.062
 jam+0.061
 slot+0.061
Neg logit contributions
 alone-0.075
 down-0.072
ling-0.068
 up-0.067
rated-0.066
Token and
Token ID290
Feature activation-3.23e-01

Pos logit contributions
 slices+0.033
 jam+0.033
 flour+0.032
engu+0.031
 cr+0.030
Neg logit contributions
 down-0.023
 alone-0.022
 up-0.021
rated-0.020
ling-0.020
Token It
Token ID632
Feature activation+9.94e-01

Pos logit contributions
 down+0.008
 alone+0.008
 up+0.007
 freely+0.007
ny+0.007
Neg logit contributions
 jam-0.020
 flour-0.020
 slices-0.020
 icing-0.019
 cr-0.019
Token was
Token ID373
Feature activation-1.70e+00

Pos logit contributions
 alone+0.018
 down+0.018
ny+0.017
 up+0.017
ling+0.017
Neg logit contributions
 flour-0.029
 slices-0.029
 jam-0.028
 strips-0.027
 icing-0.027
Token a
Token ID257
Feature activation-7.68e-01

Pos logit contributions
engu+0.051
 slices+0.050
 flour+0.049
itt+0.049
izza+0.048
Neg logit contributions
 up-0.031
 alone-0.031
 down-0.030
 lonely-0.026
 away-0.025
Token special
Token ID2041
Feature activation-3.63e-01

Pos logit contributions
 slices+0.021
 flour+0.020
ions+0.020
els+0.019
 icing+0.019
Neg logit contributions
 alone-0.015
 down-0.015
 lonely-0.015
ny-0.015
 same-0.014
Token gu
Token ID915
Feature activation-2.15e+00

Pos logit contributions
 slices+0.009
 flour+0.009
 jam+0.009
engu+0.008
 icing+0.008
Neg logit contributions
 alone-0.008
 down-0.008
 up-0.008
ny-0.008
rated-0.008
Tokeninea
Token ID18343
Feature activation-5.43e+00

Pos logit contributions
 flour+0.045
 jam+0.044
 slices+0.043
 toast+0.039
 sprink+0.039
Neg logit contributions
 down-0.058
ny-0.058
 alone-0.057
ling-0.056
rated-0.055
Token pig
Token ID12967
Feature activation-2.37e+00

Pos logit contributions
 flour+0.133
emon+0.130
engu+0.129
 slices+0.126
 unison+0.124
Neg logit contributions
 down-0.125
ling-0.115
 up-0.108
ly-0.107
 away-0.105
Token.
Token ID13
Feature activation+7.82e-01

Pos logit contributions
 flour+0.077
 jam+0.074
 slices+0.073
 icing+0.071
 frost+0.068
Neg logit contributions
 down-0.040
 alone-0.040
ny-0.040
 up-0.039
ling-0.038
Token
Token ID198
Feature activation-6.71e-01

Pos logit contributions
 down+0.011
 alone+0.011
 freely+0.009
ny+0.009
 up+0.009
Neg logit contributions
 flour-0.026
 jam-0.025
 slices-0.025
 cr-0.025
 icing-0.024
Token
Token ID198
Feature activation-4.12e-01

Pos logit contributions
 jam+0.022
 flour+0.022
 slices+0.022
 icing+0.022
 cr+0.021
Neg logit contributions
 alone-0.009
 up-0.009
 down-0.009
 freely-0.008
ny-0.008
TokenBen
Token ID11696
Feature activation-1.16e+00

Pos logit contributions
 icing+0.013
 umb+0.013
 frost+0.013
 jam+0.012
 flour+0.012
Neg logit contributions
ny-0.005
itates-0.005
 alone-0.005
ling-0.005
umpy-0.005
Token noticed
Token ID6810
Feature activation-1.04e+00

Pos logit contributions
 flour+0.002
 slices+0.002
engu+0.002
 icing+0.002
 jam+0.002
Neg logit contributions
 alone-0.001
 down-0.001
ling-0.001
ny-0.001
 up-0.001
Token something
Token ID1223
Feature activation+1.62e+00

Pos logit contributions
engu+0.033
 slices+0.033
izza+0.032
 slot+0.032
 seventeen+0.031
Neg logit contributions
 up-0.018
 alone-0.016
 down-0.016
ling-0.014
ly-0.013
Token.
Token ID13
Feature activation-6.16e-02

Pos logit contributions
 down+0.028
 alone+0.028
 up+0.027
rated+0.027
ling+0.027
Neg logit contributions
 slices-0.050
 flour-0.049
 jam-0.048
engu-0.046
 cr-0.046
Token The
Token ID383
Feature activation+5.67e-01

Pos logit contributions
 slices+0.002
 flour+0.002
 poles+0.002
 jam+0.002
 cr+0.002
Neg logit contributions
 down-0.001
 alone-0.001
umpy-0.001
ny-0.000
 miserable-0.000
Token ch
Token ID442
Feature activation-1.24e+00

Pos logit contributions
 alone+0.012
 down+0.012
rated+0.011
 up+0.011
ling+0.011
Neg logit contributions
 slices-0.016
 jam-0.015
 flour-0.015
engu-0.014
 cr-0.014
Tokenocol
Token ID4668
Feature activation-5.40e+00

Pos logit contributions
 slices+0.031
 flour+0.031
 jam+0.031
 cr+0.029
 frost+0.028
Neg logit contributions
rated-0.029
 down-0.028
 alone-0.028
 up-0.026
ny-0.026
Tokenates
Token ID689
Feature activation-1.23e+00

Pos logit contributions
 flour+0.096
 jam+0.094
 slices+0.091
 unison+0.082
 slot+0.082
Neg logit contributions
ling-0.153
 down-0.152
ny-0.146
rated-0.145
 up-0.140
Token were
Token ID547
Feature activation-1.84e+00

Pos logit contributions
 flour+0.032
 slices+0.032
 jam+0.032
engu+0.031
 slot+0.029
Neg logit contributions
ling-0.027
 down-0.027
 alone-0.026
 up-0.025
rated-0.025
Token bitter
Token ID12922
Feature activation-3.16e-01

Pos logit contributions
engu+0.053
 slices+0.052
izza+0.051
 jam+0.050
 flour+0.049
Neg logit contributions
 alone-0.035
 down-0.035
 up-0.035
 miserable-0.030
 lonely-0.030
Token.
Token ID13
Feature activation+1.66e-01

Pos logit contributions
 slices+0.008
 flour+0.008
 jam+0.008
engu+0.008
 cr+0.008
Neg logit contributions
 down-0.007
rated-0.007
 alone-0.007
 up-0.007
ling-0.007
Token They
Token ID1119
Feature activation+9.69e-02

Pos logit contributions
 alone+0.002
 down+0.002
umpy+0.002
 miserable+0.002
 up+0.002
Neg logit contributions
 slices-0.006
 poles-0.006
 flour-0.006
 jam-0.006
engu-0.006
Token hairy
Token ID43873
Feature activation-1.89e+00

Pos logit contributions
engu+0.052
 slices+0.049
izza+0.048
 flour+0.046
 unison+0.046
Neg logit contributions
 down-0.028
 alone-0.027
 up-0.024
 miserable-0.024
 lonely-0.024
Token spider
Token ID19230
Feature activation-1.67e+00

Pos logit contributions
 flour+0.045
 slices+0.045
engu+0.045
izza+0.043
 icing+0.041
Neg logit contributions
 down-0.046
 alone-0.046
ny-0.042
 up-0.042
ling-0.041
Token.
Token ID13
Feature activation-3.81e-01

Pos logit contributions
 flour+0.048
 slices+0.048
engu+0.046
izza+0.045
 icing+0.044
Neg logit contributions
 alone-0.035
 down-0.033
 up-0.031
ling-0.031
 around-0.030
Token It
Token ID632
Feature activation+6.66e-01

Pos logit contributions
 jam+0.014
 flour+0.013
 slices+0.013
 cr+0.013
 da+0.013
Neg logit contributions
 down-0.005
 alone-0.004
 He-0.004
arrass-0.003
 up-0.003
Token had
Token ID550
Feature activation-1.52e+00

Pos logit contributions
rated+0.015
 down+0.015
 alone+0.015
ling+0.014
 up+0.014
Neg logit contributions
 slices-0.017
 flour-0.017
 jam-0.017
 cr-0.016
 icing-0.016
Token eight
Token ID3624
Feature activation-5.35e+00

Pos logit contributions
 slices+0.039
 flour+0.038
 jam+0.036
engu+0.035
 slot+0.035
Neg logit contributions
 alone-0.034
 down-0.034
 up-0.032
rated-0.032
 same-0.031
Token legs
Token ID7405
Feature activation-2.58e+00

Pos logit contributions
 flour+0.131
engu+0.130
izza+0.126
 slot+0.125
 frost+0.124
Neg logit contributions
 alone-0.119
 down-0.118
ling-0.109
 up-0.108
 miserable-0.104
Token and
Token ID290
Feature activation-2.00e+00

Pos logit contributions
 slices+0.063
 jam+0.063
 flour+0.062
 cr+0.057
engu+0.057
Neg logit contributions
 down-0.063
 alone-0.061
 up-0.059
rated-0.059
ling-0.058
Token eight
Token ID3624
Feature activation-4.48e+00

Pos logit contributions
engu+0.049
 slices+0.048
 flour+0.048
izza+0.047
 slot+0.047
Neg logit contributions
 down-0.046
 alone-0.043
 up-0.042
 same-0.039
rated-0.037
Token eyes
Token ID2951
Feature activation-1.27e+00

Pos logit contributions
 flour+0.106
engu+0.100
 slices+0.098
 jam+0.098
 frost+0.098
Neg logit contributions
 alone-0.106
 down-0.106
ling-0.099
 up-0.096
ny-0.094
Token.
Token ID13
Feature activation-4.02e-01

Pos logit contributions
 flour+0.036
 slices+0.036
 jam+0.036
engu+0.035
 strips+0.033
Neg logit contributions
 down-0.026
 alone-0.024
ling-0.024
 up-0.024
rated-0.022
Token black
Token ID2042
Feature activation-2.35e+00

Pos logit contributions
engu+0.021
 slices+0.020
itt+0.019
izza+0.019
ions+0.019
Neg logit contributions
 alone-0.011
 down-0.011
 up-0.010
 lonely-0.009
 miserable-0.009
Token spider
Token ID19230
Feature activation-1.83e+00

Pos logit contributions
 slices+0.055
engu+0.055
 flour+0.053
izza+0.051
 jam+0.051
Neg logit contributions
 down-0.059
 alone-0.055
 up-0.052
ny-0.052
ly-0.051
Token.
Token ID13
Feature activation-3.42e-01

Pos logit contributions
 flour+0.053
 slices+0.052
engu+0.052
izza+0.051
 icing+0.048
Neg logit contributions
 alone-0.040
 down-0.037
 up-0.035
ling-0.034
 around-0.034
Token It
Token ID632
Feature activation+9.09e-01

Pos logit contributions
 slices+0.012
 flour+0.012
 jam+0.012
 rods+0.012
engu+0.012
Neg logit contributions
 down-0.004
 alone-0.004
free-0.003
 up-0.003
 freely-0.003
Token had
Token ID550
Feature activation-1.59e+00

Pos logit contributions
rated+0.019
 down+0.019
 alone+0.019
 up+0.018
ling+0.018
Neg logit contributions
 slices-0.024
 flour-0.024
 jam-0.024
 cr-0.022
 icing-0.022
Token eight
Token ID3624
Feature activation-5.32e+00

Pos logit contributions
 slices+0.039
 flour+0.037
 jam+0.036
engu+0.035
izza+0.035
Neg logit contributions
 down-0.036
 alone-0.036
 up-0.034
 same-0.034
rated-0.033
Token legs
Token ID7405
Feature activation-2.78e+00

Pos logit contributions
izza+0.132
engu+0.131
 unison+0.130
 slot+0.128
 flour+0.127
Neg logit contributions
 alone-0.113
 down-0.113
 up-0.106
ling-0.103
 miserable-0.101
Token and
Token ID290
Feature activation-1.86e+00

Pos logit contributions
 slices+0.071
 jam+0.070
 flour+0.070
engu+0.065
 cr+0.064
Neg logit contributions
 down-0.065
 alone-0.063
 up-0.062
rated-0.061
ling-0.060
Token many
Token ID867
Feature activation-2.27e+00

Pos logit contributions
engu+0.046
izza+0.045
 slices+0.045
 flour+0.044
 slot+0.044
Neg logit contributions
 down-0.041
 alone-0.039
 up-0.038
 same-0.036
rated-0.033
Token eyes
Token ID2951
Feature activation-1.13e+00

Pos logit contributions
engu+0.051
 slot+0.050
izza+0.049
 slices+0.048
 flour+0.048
Neg logit contributions
 alone-0.057
 down-0.054
ling-0.054
 up-0.051
 miserable-0.051
Token.
Token ID13
Feature activation-2.94e-01

Pos logit contributions
 flour+0.035
engu+0.034
 slices+0.034
 jam+0.034
izza+0.033
Neg logit contributions
 down-0.021
 up-0.019
 alone-0.019
ling-0.018
 away-0.017
Token They
Token ID1119
Feature activation-5.80e-01

Pos logit contributions
 down+0.005
 alone+0.005
 freely+0.005
 He+0.005
ny+0.004
Neg logit contributions
 jam-0.025
 slices-0.025
 cr-0.025
 da-0.024
 flour-0.024
Token see
Token ID766
Feature activation-1.69e+00

Pos logit contributions
 slices+0.018
 flour+0.018
 jam+0.018
 cr+0.017
 strips+0.017
Neg logit contributions
 alone-0.010
 down-0.009
rated-0.009
 up-0.009
ny-0.008
Token strawberry
Token ID41236
Feature activation-2.80e+00

Pos logit contributions
mmm+0.045
ird+0.044
 slot+0.043
ama+0.043
engu+0.043
Neg logit contributions
 up-0.033
 down-0.033
 same-0.032
 alone-0.031
 back-0.031
Token jam
Token ID16853
Feature activation-2.77e+00

Pos logit contributions
 unison+0.045
 slot+0.041
engu+0.038
izza+0.036
ird+0.032
Neg logit contributions
 up-0.096
 down-0.096
ling-0.091
 away-0.086
 alone-0.085
Token,
Token ID11
Feature activation-2.11e+00

Pos logit contributions
 unison+0.074
engu+0.074
 flour+0.074
 slices+0.073
 cr+0.071
Neg logit contributions
 down-0.056
 up-0.055
 alone-0.053
ling-0.051
ny-0.049
Token raspberry
Token ID38973
Feature activation-5.30e+00

Pos logit contributions
engu+0.027
 unison+0.026
 slot+0.025
izza+0.024
ama+0.021
Neg logit contributions
 down-0.069
 up-0.066
 alone-0.066
 same-0.065
ny-0.064
Token jam
Token ID16853
Feature activation-3.95e+00

Pos logit contributions
 unison+0.094
 slot+0.092
engu+0.081
izza+0.077
 envelop+0.077
Neg logit contributions
 down-0.158
 up-0.154
ling-0.148
 alone-0.144
 around-0.143
Token,
Token ID11
Feature activation-3.14e+00

Pos logit contributions
 unison+0.106
engu+0.103
 flour+0.102
 slices+0.101
 slot+0.099
Neg logit contributions
 down-0.079
 alone-0.076
 up-0.075
ny-0.070
ling-0.070
Token blue
Token ID4171
Feature activation-4.65e+00

Pos logit contributions
engu+0.054
 slot+0.051
 unison+0.051
izza+0.049
 Theo+0.046
Neg logit contributions
 down-0.089
 up-0.086
ny-0.083
 same-0.083
ling-0.082
Tokenberry
Token ID8396
Feature activation-5.26e+00

Pos logit contributions
 slot+0.096
 unison+0.091
keys+0.089
engu+0.087
ird+0.080
Neg logit contributions
 down-0.127
 up-0.114
 same-0.111
ny-0.111
ly-0.111
Token jam
Token ID16853
Feature activation-4.07e+00

Pos logit contributions
 unison+0.128
 slot+0.116
 envelop+0.102
 umb+0.102
 passports+0.102
Neg logit contributions
 down-0.132
ny-0.123
 up-0.121
ling-0.116
 alone-0.116
Token is
Token ID318
Feature activation-2.26e+00

Pos logit contributions
 slices+0.020
 jam+0.020
 flour+0.019
 cr+0.018
 unison+0.018
Neg logit contributions
 down-0.017
 alone-0.016
 up-0.016
rated-0.016
ny-0.015
Token six
Token ID2237
Feature activation-4.19e+00

Pos logit contributions
 slices+0.065
itt+0.062
light+0.058
 flour+0.057
 slot+0.057
Neg logit contributions
 up-0.046
 down-0.043
 same-0.040
 alone-0.036
 back-0.035
Token.
Token ID13
Feature activation-1.62e+00

Pos logit contributions
engu+0.089
 bisc+0.087
izza+0.085
 unison+0.084
 slot+0.081
Neg logit contributions
 alone-0.100
 down-0.096
ling-0.093
 up-0.091
 same-0.088
Token It
Token ID632
Feature activation+1.14e-01

Pos logit contributions
 jam+0.054
 slices+0.054
 flour+0.052
 toast+0.050
 da+0.049
Neg logit contributions
 down-0.023
ny-0.021
umpy-0.019
 alone-0.019
 up-0.019
Token has
Token ID468
Feature activation-2.40e+00

Pos logit contributions
 down+0.002
 up+0.002
ny+0.002
 alone+0.002
rated+0.002
Neg logit contributions
 slices-0.003
 jam-0.003
 flour-0.003
 icing-0.003
 toast-0.003
Token six
Token ID2237
Feature activation-5.28e+00

Pos logit contributions
 slices+0.053
 jam+0.052
 flour+0.050
ama+0.049
 slot+0.048
Neg logit contributions
 same-0.057
 down-0.055
 alone-0.054
 up-0.053
ny-0.052
Token and
Token ID290
Feature activation-1.47e+00

Pos logit contributions
 bisc+0.100
engu+0.097
izza+0.096
 jam+0.094
 unison+0.094
Neg logit contributions
 alone-0.134
 down-0.133
ling-0.126
 up-0.124
 same-0.124
Token zero
Token ID6632
Feature activation-2.78e+00

Pos logit contributions
 flour+0.031
 slices+0.031
 jam+0.030
engu+0.029
 unison+0.028
Neg logit contributions
 down-0.037
 alone-0.035
 up-0.034
 same-0.034
ny-0.033
Token."
Token ID526
Feature activation+1.56e-01

Pos logit contributions
 flour+0.069
 slices+0.068
 jam+0.068
 unison+0.066
engu+0.064
Neg logit contributions
 down-0.061
 alone-0.058
 up-0.057
ling-0.056
ny-0.053
Token
Token ID198
Feature activation-8.90e-01

Pos logit contributions
 down+0.003
 alone+0.003
 up+0.003
 freely+0.002
rated+0.002
Neg logit contributions
 slices-0.005
 flour-0.005
 jam-0.005
 cr-0.005
engu-0.005
Token
Token ID198
Feature activation-3.98e-01

Pos logit contributions
 slices+0.028
 flour+0.028
 jam+0.028
 icing+0.027
 cr+0.026
Neg logit contributions
 down-0.014
 alone-0.014
 up-0.014
 freely-0.012
ling-0.012
Token They
Token ID1119
Feature activation-5.97e-01

Pos logit contributions
rated+0.016
 down+0.016
 alone+0.016
 up+0.015
ling+0.015
Neg logit contributions
 slices-0.024
 jam-0.023
 flour-0.023
 cr-0.022
 icing-0.021
Token said
Token ID531
Feature activation-1.50e+00

Pos logit contributions
 flour+0.017
 slices+0.017
 jam+0.017
 strips+0.016
 icing+0.015
Neg logit contributions
 alone-0.011
 down-0.011
ny-0.010
 up-0.010
rated-0.010
Token they
Token ID484
Feature activation-1.57e-01

Pos logit contributions
 slices+0.040
 jam+0.039
 flour+0.039
 strips+0.037
 cr+0.037
Neg logit contributions
 down-0.032
 alone-0.032
 up-0.031
ling-0.029
rated-0.029
Token loved
Token ID6151
Feature activation-7.90e-01

Pos logit contributions
 slices+0.004
 jam+0.004
 flour+0.004
 cr+0.003
 frost+0.003
Neg logit contributions
rated-0.004
 down-0.004
 alone-0.004
away-0.004
ling-0.004
Token him
Token ID683
Feature activation+1.26e+00

Pos logit contributions
 slot+0.026
 slices+0.026
engu+0.025
 flour+0.025
ird+0.025
Neg logit contributions
 up-0.011
 down-0.011
 alone-0.011
away-0.009
 freely-0.009
Token and
Token ID290
Feature activation+3.26e+00

Pos logit contributions
 alone+0.024
 down+0.023
 up+0.022
 away+0.020
 back+0.020
Neg logit contributions
 flour-0.039
 jam-0.037
 slices-0.037
engu-0.036
 icing-0.035
Token the
Token ID262
Feature activation+1.26e+00

Pos logit contributions
rated+0.073
 down+0.071
 alone+0.071
ling+0.068
 up+0.067
Neg logit contributions
 slices-0.084
 flour-0.083
 jam-0.083
 cr-0.078
 frost-0.076
Token mountain
Token ID8598
Feature activation+6.47e-01

Pos logit contributions
 alone+0.027
 down+0.027
 same+0.026
 up+0.026
rated+0.026
Neg logit contributions
 slices-0.033
 jam-0.032
 flour-0.032
engu-0.030
 icing-0.030
Token.
Token ID13
Feature activation+8.67e-01

Pos logit contributions
 alone+0.014
 down+0.013
rated+0.013
 up+0.013
ling+0.013
Neg logit contributions
 flour-0.017
 slices-0.017
 jam-0.017
 cr-0.016
 icing-0.016
Token<|endoftext|>
Token ID50256
Feature activation+1.13e+00

Pos logit contributions
 down+0.014
 alone+0.013
 up+0.012
rated+0.012
ny+0.012
Neg logit contributions
 slices-0.028
 jam-0.028
 flour-0.027
 cr-0.027
 icing-0.026
TokenBen
Token ID11696
Feature activation+9.29e-02

Pos logit contributions
 down+0.024
 alone+0.023
 up+0.022
ny+0.022
rated+0.021
Neg logit contributions
 slices-0.031
 jam-0.031
 flour-0.030
 cr-0.029
 icing-0.028
Token is
Token ID318
Feature activation+4.32e-01

Pos logit contributions
away+0.072
rated+0.071
arrass+0.069
ashing+0.067
 down+0.066
Neg logit contributions
 cr-0.036
 flour-0.034
 jam-0.034
 tape-0.032
 powder-0.031
Token important
Token ID1593
Feature activation-3.69e-02

Pos logit contributions
rated+0.008
 down+0.007
 alone+0.007
away+0.007
ling+0.007
Neg logit contributions
 flour-0.013
 jam-0.013
 slices-0.012
 cr-0.012
 tape-0.012
Token to
Token ID284
Feature activation+5.29e-01

Pos logit contributions
 slices+0.001
 jam+0.001
 flour+0.001
 cr+0.001
 strips+0.001
Neg logit contributions
rated-0.001
 alone-0.001
 down-0.001
ling-0.001
ny-0.001
Token be
Token ID307
Feature activation+1.02e+00

Pos logit contributions
rated+0.012
 down+0.011
 alone+0.011
ling+0.011
ny+0.010
Neg logit contributions
 jam-0.014
 flour-0.013
 slices-0.013
 cr-0.013
 frost-0.012
Token careful
Token ID8161
Feature activation+8.54e-01

Pos logit contributions
rated+0.023
 down+0.021
ling+0.020
 alone+0.020
ny+0.020
Neg logit contributions
 flour-0.027
 jam-0.027
 slices-0.026
 cr-0.026
 tape-0.024
Token and
Token ID290
Feature activation+3.62e+00

Pos logit contributions
rated+0.021
away+0.019
 down+0.019
arrass+0.019
 alone+0.019
Neg logit contributions
 flour-0.019
 slices-0.019
 jam-0.018
 cr-0.018
 tape-0.017
Token not
Token ID407
Feature activation+9.94e-01

Pos logit contributions
rated+0.089
ny+0.082
umpy+0.080
ling+0.079
oun+0.077
Neg logit contributions
 jam-0.085
 cr-0.085
 flour-0.082
 tape-0.077
 slices-0.076
Token do
Token ID466
Feature activation+1.51e+00

Pos logit contributions
rated+0.022
ling+0.019
 down+0.019
 alone+0.019
away+0.019
Neg logit contributions
 jam-0.026
 cr-0.026
 flour-0.025
 slices-0.025
 tape-0.023
Token silly
Token ID14397
Feature activation+2.02e+00

Pos logit contributions
rated+0.037
ny+0.033
 alone+0.032
 down+0.032
ling+0.031
Neg logit contributions
 jam-0.038
 flour-0.038
 slices-0.037
 cr-0.036
 frost-0.034
Token things
Token ID1243
Feature activation+2.45e+00

Pos logit contributions
rated+0.053
ny+0.048
itates+0.048
ounded+0.048
 alone+0.047
Neg logit contributions
 jam-0.045
 flour-0.043
 slices-0.043
 cr-0.042
 strips-0.039
Token.
Token ID13
Feature activation+7.43e-01

Pos logit contributions
rated+0.070
ny+0.059
arrass+0.058
arted+0.058
umpy+0.057
Neg logit contributions
 jam-0.046
 cr-0.046
 flour-0.045
 tape-0.044
 slices-0.043
Token was
Token ID373
Feature activation-1.44e+00

Pos logit contributions
rated+0.033
 down+0.031
 alone+0.031
ling+0.030
away+0.030
Neg logit contributions
 slices-0.029
 flour-0.029
 jam-0.029
 cr-0.028
 frost-0.027
Token a
Token ID257
Feature activation-2.12e-01

Pos logit contributions
 slices+0.036
 flour+0.034
 jam+0.034
engu+0.034
 cr+0.033
Neg logit contributions
 alone-0.033
 down-0.032
 up-0.032
rated-0.030
ny-0.029
Token little
Token ID1310
Feature activation-3.83e-01

Pos logit contributions
 slices+0.007
 flour+0.007
 jam+0.007
 icing+0.006
 cr+0.006
Neg logit contributions
 alone-0.003
 down-0.003
rated-0.003
ny-0.003
ling-0.003
Token girl
Token ID2576
Feature activation-2.78e-01

Pos logit contributions
 slices+0.014
 flour+0.013
 jam+0.013
engu+0.013
 icing+0.013
Neg logit contributions
 alone-0.005
 down-0.005
ny-0.005
ling-0.005
 up-0.004
Token called
Token ID1444
Feature activation+4.96e-02

Pos logit contributions
 slices+0.007
 flour+0.007
izza+0.007
engu+0.007
 poles+0.006
Neg logit contributions
 alone-0.007
 down-0.007
 up-0.006
ling-0.006
 around-0.006
Token Bella
Token ID39304
Feature activation+3.05e+00

Pos logit contributions
 down+0.001
 alone+0.001
 up+0.001
ny+0.001
 freely+0.001
Neg logit contributions
 slices-0.001
 flour-0.001
 jam-0.001
 strips-0.001
 slot-0.001
Token.
Token ID13
Feature activation+8.44e-01

Pos logit contributions
 alone+0.061
 down+0.060
rated+0.059
 up+0.057
ling+0.056
Neg logit contributions
 slices-0.089
 flour-0.087
 jam-0.086
 icing-0.080
 cr-0.080
Token Bella
Token ID39304
Feature activation+1.62e+00

Pos logit contributions
rated+0.018
 alone+0.017
 down+0.017
ling+0.016
away+0.016
Neg logit contributions
 jam-0.023
 slices-0.022
 flour-0.022
 cr-0.022
 strips-0.020
Token was
Token ID373
Feature activation-4.70e-01

Pos logit contributions
rated+0.027
 alone+0.026
 down+0.026
ling+0.024
 up+0.024
Neg logit contributions
 slices-0.052
 flour-0.050
 jam-0.050
 cr-0.048
 icing-0.047
Token three
Token ID1115
Feature activation-1.80e+00

Pos logit contributions
 slices+0.013
 jam+0.012
 flour+0.012
 cr+0.012
 icing+0.011
Neg logit contributions
rated-0.010
 alone-0.010
 down-0.010
 up-0.009
ling-0.009
Token-
Token ID12
Feature activation-3.77e-02

Pos logit contributions
 slices+0.039
 jam+0.038
 flour+0.037
engu+0.037
 slot+0.036
Neg logit contributions
 alone-0.050
ling-0.048
 down-0.045
umpy-0.045
 up-0.045
Token the
Token ID262
Feature activation+3.80e-01

Pos logit contributions
 up+0.028
 down+0.028
 alone+0.025
 away+0.022
 around+0.021
Neg logit contributions
engu-0.053
 slices-0.053
 slot-0.051
 unison-0.051
izza-0.050
Token room
Token ID2119
Feature activation+7.54e-01

Pos logit contributions
 up+0.008
 same+0.008
 alone+0.008
 down+0.008
 away+0.007
Neg logit contributions
 slices-0.010
engu-0.010
 jam-0.010
 Theo-0.009
izza-0.009
Token to
Token ID284
Feature activation+1.28e+00

Pos logit contributions
 down+0.016
 alone+0.015
rated+0.015
 up+0.015
ling+0.014
Neg logit contributions
 slices-0.020
 jam-0.020
 flour-0.020
 cr-0.019
engu-0.019
Token make
Token ID787
Feature activation+7.33e-01

Pos logit contributions
 alone+0.025
 down+0.025
rated+0.025
 up+0.023
away+0.022
Neg logit contributions
 slices-0.037
 jam-0.036
 flour-0.036
 cr-0.033
engu-0.033
Token it
Token ID340
Feature activation+3.11e+00

Pos logit contributions
 up+0.014
 down+0.014
 alone+0.013
ling+0.012
rated+0.012
Neg logit contributions
 slices-0.021
 jam-0.020
engu-0.020
 flour-0.020
 cr-0.019
Token look
Token ID804
Feature activation+3.81e+00

Pos logit contributions
rated+0.060
 alone+0.056
ling+0.056
ny+0.055
 down+0.054
Neg logit contributions
 flour-0.090
 slices-0.090
 jam-0.089
 cr-0.086
 frost-0.083
Token more
Token ID517
Feature activation+5.49e-01

Pos logit contributions
rated+0.076
arrass+0.073
 alone+0.073
ny+0.072
umpy+0.072
Neg logit contributions
 flour-0.106
 jam-0.103
 slices-0.103
 cr-0.101
 frost-0.096
Token special
Token ID2041
Feature activation+6.67e-01

Pos logit contributions
rated+0.011
 alone+0.010
 down+0.010
ny+0.010
ling+0.010
Neg logit contributions
 slices-0.016
 flour-0.016
 jam-0.015
 cr-0.015
 frost-0.014
Token.
Token ID13
Feature activation+4.22e-01

Pos logit contributions
 down+0.014
rated+0.013
 alone+0.013
 up+0.013
ling+0.012
Neg logit contributions
 slices-0.018
 jam-0.018
 flour-0.018
 cr-0.017
engu-0.017
Token She
Token ID1375
Feature activation-1.86e-01

Pos logit contributions
rated+0.009
 alone+0.009
 down+0.008
ling+0.008
away+0.008
Neg logit contributions
 slices-0.012
 flour-0.012
 jam-0.011
 cr-0.011
 frost-0.011
Token knew
Token ID2993
Feature activation-9.55e-01

Pos logit contributions
 slices+0.006
 jam+0.006
 flour+0.005
engu+0.005
 unison+0.005
Neg logit contributions
 down-0.003
rated-0.003
 alone-0.003
 up-0.003
ny-0.003
Token his
Token ID465
Feature activation-1.27e+00

Pos logit contributions
 flour+0.028
 slices+0.028
 slot+0.028
engu+0.028
 jam+0.028
Neg logit contributions
 alone-0.016
 down-0.015
 up-0.015
 away-0.014
 freely-0.013
Token friend
Token ID1545
Feature activation-1.41e+00

Pos logit contributions
 flour+0.034
 jam+0.034
 slices+0.033
 icing+0.031
 frost+0.031
Neg logit contributions
 alone-0.028
 down-0.027
 up-0.026
ling-0.025
rated-0.025
Token.
Token ID13
Feature activation-3.35e-01

Pos logit contributions
 flour+0.040
 slices+0.039
 jam+0.039
 icing+0.038
engu+0.037
Neg logit contributions
 down-0.033
 alone-0.031
 up-0.031
ny-0.028
 away-0.028
Token They
Token ID1119
Feature activation-1.38e-01

Pos logit contributions
 slices+0.009
 flour+0.009
 jam+0.009
 cr+0.009
 frost+0.008
Neg logit contributions
rated-0.007
 alone-0.007
 down-0.007
ling-0.007
away-0.006
Token played
Token ID2826
Feature activation+4.50e-01

Pos logit contributions
 flour+0.005
 slices+0.005
 jam+0.005
 icing+0.004
 slot+0.004
Neg logit contributions
 down-0.002
 alone-0.002
ny-0.002
rated-0.002
 up-0.002
Token all
Token ID477
Feature activation+2.70e+00

Pos logit contributions
rated+0.010
 down+0.009
 alone+0.009
ling+0.009
 up+0.009
Neg logit contributions
 slices-0.012
 flour-0.012
 jam-0.012
 cr-0.011
 frost-0.011
Token day
Token ID1110
Feature activation-9.56e-01

Pos logit contributions
rated+0.063
 unnoticed+0.059
arrass+0.059
handed+0.057
 invincible+0.056
Neg logit contributions
 jam-0.063
 cr-0.061
 flour-0.060
 slices-0.059
 da-0.058
Token and
Token ID290
Feature activation+7.46e-01

Pos logit contributions
 flour+0.024
 slices+0.023
 jam+0.023
 cr+0.022
 icing+0.021
Neg logit contributions
 alone-0.022
 down-0.022
rated-0.021
ling-0.021
 up-0.021
Token then
Token ID788
Feature activation+2.95e-01

Pos logit contributions
 alone+0.015
 down+0.015
rated+0.015
 up+0.014
ling+0.014
Neg logit contributions
 slices-0.021
 flour-0.021
 jam-0.020
 icing-0.019
 cr-0.019
Token went
Token ID1816
Feature activation+6.78e-01

Pos logit contributions
rated+0.007
 down+0.007
 alone+0.006
 up+0.006
ling+0.006
Neg logit contributions
 slices-0.008
 flour-0.008
 jam-0.008
 cr-0.007
 frost-0.007
Token back
Token ID736
Feature activation-4.99e-01

Pos logit contributions
rated+0.015
arrass+0.012
umpy+0.011
ling+0.011
 alone+0.011
Neg logit contributions
 jam-0.020
 slices-0.019
 flour-0.019
 cr-0.019
 da-0.018
Token.
Token ID13
Feature activation-1.06e+00

Pos logit contributions
 flour+0.033
 jam+0.033
 slices+0.032
 icing+0.031
 frost+0.030
Neg logit contributions
 down-0.039
 alone-0.037
 up-0.037
 away-0.036
ny-0.035
Token Jane
Token ID12091
Feature activation-3.47e-01

Pos logit contributions
 slices+0.031
 jam+0.031
 flour+0.031
 cr+0.030
 strips+0.029
Neg logit contributions
 down-0.019
rated-0.019
 alone-0.018
 up-0.018
 freely-0.017
Token learned
Token ID4499
Feature activation+3.06e-01

Pos logit contributions
 slices+0.010
 flour+0.010
 jam+0.010
 icing+0.009
 frost+0.009
Neg logit contributions
 down-0.007
 alone-0.006
ling-0.006
 up-0.006
ny-0.006
Token how
Token ID703
Feature activation+8.25e-01

Pos logit contributions
 down+0.006
 alone+0.006
 up+0.006
rated+0.006
ling+0.006
Neg logit contributions
 slices-0.008
 flour-0.008
 jam-0.008
 cr-0.008
 icing-0.007
Token to
Token ID284
Feature activation-7.56e-01

Pos logit contributions
rated+0.020
ny+0.018
ling+0.018
 alone+0.017
 down+0.017
Neg logit contributions
 flour-0.021
 jam-0.021
 slices-0.021
 cr-0.020
 powder-0.019
Token sew
Token ID34249
Feature activation+4.63e-01

Pos logit contributions
 slices+0.022
 jam+0.021
 flour+0.021
 icing+0.020
 frost+0.020
Neg logit contributions
rated-0.014
 alone-0.014
 down-0.014
 up-0.014
ny-0.013
Token and
Token ID290
Feature activation+1.01e+00

Pos logit contributions
 down+0.010
 alone+0.010
 up+0.010
rated+0.010
ling+0.009
Neg logit contributions
 slices-0.012
 jam-0.012
 flour-0.012
 cr-0.011
engu-0.011
Token make
Token ID787
Feature activation+4.46e-01

Pos logit contributions
 down+0.024
 alone+0.023
 up+0.023
rated+0.023
ling+0.021
Neg logit contributions
 slices-0.025
 flour-0.024
 jam-0.023
engu-0.023
 frost-0.022
Token the
Token ID262
Feature activation+9.05e-01

Pos logit contributions
 down+0.008
 alone+0.008
 up+0.008
rated+0.008
ling+0.008
Neg logit contributions
 slices-0.013
 jam-0.012
 flour-0.012
 cr-0.012
engu-0.012
Token doll
Token ID3654
Feature activation+2.16e+00

Pos logit contributions
 alone+0.020
 same+0.020
 up+0.019
 down+0.019
 away+0.018
Neg logit contributions
 slices-0.024
engu-0.023
 jam-0.022
 flour-0.022
izza-0.022
Token's
Token ID338
Feature activation+5.88e-01

Pos logit contributions
rated+0.039
 alone+0.035
 down+0.034
ling+0.034
ny+0.034
Neg logit contributions
 slices-0.066
 jam-0.065
 flour-0.065
 cr-0.064
 frost-0.060
Token band
Token ID4097
Feature activation+1.59e+00

Pos logit contributions
 down+0.004
rated+0.004
 alone+0.004
 up+0.004
ling+0.004
Neg logit contributions
 slices-0.005
 jam-0.005
 flour-0.005
 cr-0.005
 icing-0.005
Token-
Token ID12
Feature activation+1.13e+00

Pos logit contributions
rated+0.038
 freely+0.035
 alone+0.035
ling+0.034
 down+0.034
Neg logit contributions
 slices-0.037
 cr-0.036
 jam-0.036
 flour-0.036
 frost-0.034
Tokenaid
Token ID1698
Feature activation-1.46e-01

Pos logit contributions
rated+0.035
 alone+0.034
 down+0.033
ling+0.032
 freely+0.032
Neg logit contributions
 jam-0.020
 flour-0.020
 slices-0.020
 cr-0.018
 frost-0.017
Token."
Token ID526
Feature activation+1.10e-01

Pos logit contributions
 slices+0.004
 flour+0.004
 jam+0.004
 cr+0.004
 frost+0.004
Neg logit contributions
rated-0.003
 alone-0.003
 down-0.003
ling-0.003
 freely-0.003
Token
Token ID220
Feature activation+3.41e-01

Pos logit contributions
rated+0.002
 down+0.002
 alone+0.002
ling+0.002
 up+0.002
Neg logit contributions
 slices-0.003
 flour-0.003
 jam-0.003
 cr-0.003
 frost-0.003
Token
Token ID198
Feature activation-2.53e-01

Pos logit contributions
rated+0.009
 alone+0.008
ling+0.008
 down+0.008
away+0.008
Neg logit contributions
 slices-0.008
 flour-0.008
 jam-0.007
 cr-0.007
 frost-0.007
Token
Token ID198
Feature activation+7.61e-01

Pos logit contributions
 slices+0.006
 jam+0.006
 flour+0.006
 cr+0.006
 frost+0.006
Neg logit contributions
rated-0.006
 down-0.006
 alone-0.006
ling-0.005
 up-0.005
TokenL
Token ID43
Feature activation+1.77e+00

Pos logit contributions
rated+0.019
 down+0.018
 alone+0.018
ling+0.017
 up+0.017
Neg logit contributions
 slices-0.018
 flour-0.018
 jam-0.018
 cr-0.017
 frost-0.016
Tokenily
Token ID813
Feature activation-6.94e-01

Pos logit contributions
 down+0.054
rated+0.053
 alone+0.051
 freely+0.050
 up+0.050
Neg logit contributions
 jam-0.033
 slices-0.031
 flour-0.031
 cr-0.029
 frost-0.027
Token smiled
Token ID13541
Feature activation+1.87e+00

Pos logit contributions
 slices+0.019
 flour+0.019
 jam+0.019
 cr+0.017
 icing+0.017
Neg logit contributions
rated-0.014
 down-0.014
 alone-0.014
 up-0.014
ling-0.014
Token back
Token ID736
Feature activation-2.10e-01

Pos logit contributions
rated+0.067
arrass+0.061
umpy+0.060
beh+0.059
handed+0.058
Neg logit contributions
 slices-0.025
 cr-0.024
 jam-0.023
 flour-0.021
 frost-0.021
Token She
Token ID1375
Feature activation+1.43e-01

Pos logit contributions
 down+0.011
 alone+0.010
rated+0.010
 up+0.010
ny+0.009
Neg logit contributions
 slices-0.029
 jam-0.029
 flour-0.028
 cr-0.027
 icing-0.027
Token loved
Token ID6151
Feature activation-1.12e+00

Pos logit contributions
rated+0.003
 alone+0.003
 down+0.003
 up+0.002
ny+0.002
Neg logit contributions
 slices-0.004
 flour-0.004
 jam-0.004
 icing-0.004
 strips-0.004
Token Grand
Token ID5675
Feature activation-2.69e-01

Pos logit contributions
 slices+0.035
 slot+0.035
 unison+0.034
 flour+0.033
izza+0.033
Neg logit contributions
 down-0.017
 up-0.017
 alone-0.017
 same-0.015
 miserable-0.014
Tokenma
Token ID2611
Feature activation+1.20e+00

Pos logit contributions
 slices+0.005
 jam+0.005
 flour+0.005
 rods+0.005
 poles+0.005
Neg logit contributions
 up-0.007
 alone-0.007
 down-0.007
rated-0.007
ling-0.007
Token.
Token ID13
Feature activation+7.10e-01

Pos logit contributions
 alone+0.021
 down+0.021
 up+0.020
ny+0.018
ling+0.018
Neg logit contributions
 flour-0.037
 slices-0.037
 jam-0.036
 rods-0.035
 strips-0.034
Token She
Token ID1375
Feature activation-3.17e-02

Pos logit contributions
 down+0.012
 alone+0.011
rated+0.011
 up+0.010
ny+0.010
Neg logit contributions
 slices-0.023
 jam-0.022
 flour-0.022
 cr-0.021
 icing-0.021
Token wanted
Token ID2227
Feature activation-7.50e-01

Pos logit contributions
 slices+0.001
 flour+0.001
 jam+0.001
 icing+0.001
 rods+0.001
Neg logit contributions
 alone-0.001
rated-0.001
 down-0.001
 up-0.001
ny-0.001
Token to
Token ID284
Feature activation+2.26e+00

Pos logit contributions
 unison+0.029
 slices+0.028
 slot+0.028
 funnel+0.028
 strips+0.027
Neg logit contributions
 up-0.007
 down-0.007
 alone-0.006
 away-0.006
 same-0.006
Token make
Token ID787
Feature activation+5.06e-01

Pos logit contributions
 alone+0.040
 down+0.038
rated+0.037
 up+0.037
 away+0.035
Neg logit contributions
 slices-0.070
 flour-0.069
 jam-0.067
 cr-0.063
engu-0.063
Token Grand
Token ID5675
Feature activation-1.22e-01

Pos logit contributions
 up+0.009
 down+0.009
 alone+0.008
 same+0.008
ling+0.007
Neg logit contributions
 slices-0.015
 flour-0.015
 jam-0.015
 cr-0.014
 slot-0.014
Tokenma
Token ID2611
Feature activation+2.15e+00

Pos logit contributions
 slices+0.002
 jam+0.002
 rods+0.002
 flour+0.002
engu+0.002
Neg logit contributions
 up-0.003
ling-0.003
 alone-0.003
rated-0.003
 down-0.003
Token gone
Token ID3750
Feature activation+1.71e+00

Pos logit contributions
 slices+0.035
 jam+0.034
 flour+0.033
engu+0.031
 cr+0.031
Neg logit contributions
 alone-0.028
 down-0.028
rated-0.027
 up-0.026
 same-0.025
Token away
Token ID1497
Feature activation+8.20e-01

Pos logit contributions
rated+0.035
ny+0.032
arrass+0.032
ling+0.032
 alone+0.032
Neg logit contributions
 flour-0.045
 jam-0.044
 slices-0.044
 cr-0.043
 frost-0.041
Token."
Token ID526
Feature activation-8.27e-01

Pos logit contributions
rated+0.018
 down+0.018
 alone+0.018
ling+0.017
ny+0.017
Neg logit contributions
 slices-0.021
 flour-0.021
 jam-0.021
 cr-0.020
 frost-0.019
Token
Token ID220
Feature activation+3.39e-02

Pos logit contributions
 slices+0.027
 flour+0.027
 jam+0.026
 cr+0.026
 strips+0.025
Neg logit contributions
 down-0.013
 alone-0.012
 up-0.011
ny-0.010
 freely-0.010
Token
Token ID198
Feature activation-8.88e-01

Pos logit contributions
rated+0.001
 down+0.001
 alone+0.001
 up+0.001
ling+0.001
Neg logit contributions
 slices-0.001
 jam-0.001
 flour-0.001
 cr-0.001
 frost-0.001
Token
Token ID198
Feature activation-2.52e-01

Pos logit contributions
 jam+0.026
 flour+0.026
 slices+0.026
 cr+0.025
 icing+0.025
Neg logit contributions
 alone-0.016
 down-0.015
 up-0.014
rated-0.014
ling-0.014
TokenThis
Token ID1212
Feature activation+1.26e-01

Pos logit contributions
 flour+0.007
 slices+0.007
 jam+0.007
 icing+0.006
 cr+0.006
Neg logit contributions
 alone-0.005
away-0.005
ling-0.005
 down-0.005
rated-0.005
Token made
Token ID925
Feature activation-3.98e-02

Pos logit contributions
rated+0.002
 up+0.002
ling+0.002
ny+0.002
 same+0.002
Neg logit contributions
 slices-0.003
engu-0.003
 flour-0.003
 rods-0.003
 jam-0.003
Token the
Token ID262
Feature activation+1.27e+00

Pos logit contributions
 slices+0.001
 jam+0.001
 cr+0.001
 flour+0.001
engu+0.001
Neg logit contributions
 down-0.001
 up-0.001
 alone-0.001
ling-0.001
rated-0.001
Token baby
Token ID5156
Feature activation+6.30e-01

Pos logit contributions
rated+0.025
 down+0.025
 alone+0.024
 up+0.024
ling+0.023
Neg logit contributions
 slices-0.037
 jam-0.036
 flour-0.036
engu-0.034
 cr-0.034
Token realise
Token ID18996
Feature activation+9.19e-03

Pos logit contributions
rated+0.010
 down+0.009
 alone+0.009
ling+0.009
 up+0.008
Neg logit contributions
 slices-0.021
 flour-0.021
 jam-0.021
 cr-0.020
 frost-0.019
Tokeneal
Token ID2287
Feature activation+3.22e+00

Pos logit contributions
rated+0.032
 alone+0.030
 freely+0.028
 down+0.028
 invincible+0.027
Neg logit contributions
 jam-0.053
 slices-0.053
 flour-0.052
 frost-0.051
 cr-0.051
Tokened
Token ID276
Feature activation+1.58e+00

Pos logit contributions
rated+0.077
itionally+0.066
itating+0.066
lves+0.064
 unnoticed+0.064
Neg logit contributions
 cr-0.081
 slices-0.077
 jam-0.070
 frost-0.069
 icing-0.067
Token with
Token ID351
Feature activation+8.18e-01

Pos logit contributions
rated+0.049
umpy+0.041
ny+0.040
itates+0.040
ounded+0.040
Neg logit contributions
 slices-0.032
 cr-0.031
 jam-0.030
 frost-0.027
 sprink-0.027
Token joy
Token ID8716
Feature activation+1.14e+00

Pos logit contributions
rated+0.024
ashing+0.021
ling+0.021
ny+0.021
ounded+0.020
Neg logit contributions
 slices-0.018
 jam-0.018
 cr-0.017
 frost-0.017
 flour-0.017
Token!
Token ID0
Feature activation+3.75e-01

Pos logit contributions
rated+0.033
 alone+0.030
ny+0.029
umpy+0.029
 freely+0.029
Neg logit contributions
 slices-0.024
 cr-0.023
 jam-0.023
 flour-0.022
 frost-0.021
Token
Token ID198
Feature activation-2.59e-01

Pos logit contributions
rated+0.010
ling+0.009
away+0.009
 freely+0.008
 alone+0.008
Neg logit contributions
 slices-0.009
 flour-0.009
 jam-0.008
 cr-0.008
 tape-0.008
Token
Token ID198
Feature activation+6.55e-01

Pos logit contributions
 slices+0.006
 jam+0.006
 flour+0.006
 cr+0.005
 frost+0.005
Neg logit contributions
rated-0.007
 down-0.006
 alone-0.006
ling-0.006
ny-0.006
TokenEm
Token ID10161
Feature activation+5.08e-01

Pos logit contributions
rated+0.021
 down+0.020
 alone+0.019
 up+0.019
 lose+0.019
Neg logit contributions
 slices-0.010
 jam-0.010
 flour-0.009
 cr-0.009
Grand-0.009
Tokenma
Token ID2611
Feature activation-1.46e+00

Pos logit contributions
rated+0.016
 down+0.016
 alone+0.016
ling+0.016
 up+0.015
Neg logit contributions
 slices-0.008
 flour-0.008
 jam-0.008
 cr-0.007
 frost-0.007
Token wanted
Token ID2227
Feature activation-2.31e-01

Pos logit contributions
 slices+0.043
 flour+0.043
 jam+0.042
 icing+0.039
 cr+0.038
Neg logit contributions
 down-0.028
 alone-0.028
 up-0.027
rated-0.027
ling-0.027
Token to
Token ID284
Feature activation+1.48e+00

Pos logit contributions
 slices+0.008
 flour+0.007
 jam+0.007
 cr+0.007
 icing+0.007
Neg logit contributions
 alone-0.003
 down-0.003
rated-0.003
 up-0.003
ling-0.003
Token.
Token ID13
Feature activation-4.07e-01

Pos logit contributions
 down+0.009
 alone+0.009
rated+0.009
 up+0.008
ling+0.008
Neg logit contributions
 slices-0.011
 flour-0.011
 jam-0.011
 cr-0.010
 icing-0.010
Token
Token ID198
Feature activation-6.15e-01

Pos logit contributions
 slices+0.011
 flour+0.011
 jam+0.011
 cr+0.010
 frost+0.010
Neg logit contributions
rated-0.008
 down-0.008
 alone-0.008
ling-0.008
 up-0.008
Token
Token ID198
Feature activation-3.70e-01

Pos logit contributions
 slices+0.015
 flour+0.015
 jam+0.015
 cr+0.014
 frost+0.014
Neg logit contributions
 alone-0.014
rated-0.014
 down-0.014
ling-0.013
 up-0.013
TokenThey
Token ID2990
Feature activation-4.78e-01

Pos logit contributions
 slices+0.009
 flour+0.009
 jam+0.009
 cr+0.008
 frost+0.008
Neg logit contributions
rated-0.009
 alone-0.009
 down-0.009
ling-0.008
ny-0.008
Token went
Token ID1816
Feature activation-8.08e-02

Pos logit contributions
 slices+0.016
 flour+0.016
 jam+0.015
 strips+0.014
 powder+0.014
Neg logit contributions
 down-0.008
 alone-0.008
 away-0.007
ny-0.007
 up-0.007
Token home
Token ID1363
Feature activation-1.57e+00

Pos logit contributions
 slices+0.003
 jam+0.002
 flour+0.002
 cr+0.002
 frost+0.002
Neg logit contributions
rated-0.001
 alone-0.001
 down-0.001
ling-0.001
ny-0.001
Token,
Token ID11
Feature activation-1.32e+00

Pos logit contributions
 slices+0.039
 flour+0.039
 jam+0.039
 cr+0.037
 frost+0.035
Neg logit contributions
rated-0.036
 down-0.035
 alone-0.035
ling-0.033
 up-0.033
Token carrying
Token ID6872
Feature activation-1.19e+00

Pos logit contributions
 slices+0.038
 flour+0.038
 jam+0.038
 cr+0.035
 icing+0.035
Neg logit contributions
 down-0.026
 alone-0.026
rated-0.025
 up-0.024
ling-0.023
Token their
Token ID511
Feature activation-8.69e-01

Pos logit contributions
 slices+0.033
 jam+0.033
 flour+0.033
engu+0.032
emon+0.031
Neg logit contributions
 down-0.024
 up-0.023
 away-0.022
 alone-0.021
ling-0.020
Token file
Token ID2393
Feature activation-1.48e+00

Pos logit contributions
 slices+0.020
 flour+0.020
 jam+0.019
emon+0.018
engu+0.018
Neg logit contributions
 alone-0.021
 down-0.021
 up-0.020
 away-0.020
 same-0.019
Token of
Token ID286
Feature activation-2.18e+00

Pos logit contributions
 jam+0.042
 flour+0.042
 slices+0.042
 frost+0.039
engu+0.039
Neg logit contributions
 alone-0.030
 down-0.030
 up-0.028
rated-0.027
 away-0.026
Token Can
Token ID1680
Feature activation-1.56e+00

Pos logit contributions
 slices+0.063
 jam+0.061
 flour+0.061
 strips+0.057
 cr+0.057
Neg logit contributions
 down-0.026
 alone-0.024
rated-0.022
ny-0.022
 up-0.021
Token I
Token ID314
Feature activation-3.04e-01

Pos logit contributions
 slices+0.047
 flour+0.046
 jam+0.046
 cr+0.044
 frost+0.043
Neg logit contributions
 down-0.028
 alone-0.028
rated-0.027
 up-0.026
ling-0.025
Token play
Token ID711
Feature activation+7.38e-01

Pos logit contributions
 slices+0.008
 flour+0.008
 jam+0.008
 cr+0.008
 frost+0.007
Neg logit contributions
rated-0.006
 down-0.006
 alone-0.006
ling-0.006
 up-0.006
Token with
Token ID351
Feature activation-1.52e+00

Pos logit contributions
 down+0.014
 alone+0.014
 up+0.013
rated+0.013
ling+0.013
Neg logit contributions
 slices-0.022
 flour-0.021
 jam-0.021
engu-0.020
 cr-0.020
Token you
Token ID345
Feature activation-1.07e+00

Pos logit contributions
 slices+0.050
 flour+0.049
engu+0.048
 jam+0.048
 unison+0.047
Neg logit contributions
 down-0.025
 alone-0.024
 up-0.023
 away-0.020
away-0.020
Token?"
Token ID1701
Feature activation-3.83e-01

Pos logit contributions
 slices+0.031
 flour+0.031
 jam+0.030
engu+0.029
 icing+0.028
Neg logit contributions
 down-0.022
 alone-0.022
 up-0.021
rated-0.020
ling-0.019
Token The
Token ID383
Feature activation+8.05e-01

Pos logit contributions
 flour+0.011
 slices+0.011
 jam+0.011
 cr+0.010
 strips+0.010
Neg logit contributions
 down-0.008
 alone-0.007
rated-0.007
 up-0.007
ny-0.007
Token cat
Token ID3797
Feature activation-1.55e+00

Pos logit contributions
 alone+0.021
 down+0.020
 up+0.019
 same+0.019
rated+0.018
Neg logit contributions
 slices-0.019
engu-0.018
 flour-0.018
 jam-0.017
 icing-0.017
Token opened
Token ID4721
Feature activation-8.52e-01

Pos logit contributions
 slices+0.040
 flour+0.039
 jam+0.038
 icing+0.035
 strips+0.035
Neg logit contributions
 down-0.036
 alone-0.035
rated-0.035
 up-0.034
ling-0.034
Token its
Token ID663
Feature activation-2.61e+00

Pos logit contributions
 slices+0.024
 jam+0.024
 flour+0.024
 cr+0.023
 frost+0.022
Neg logit contributions
rated-0.018
 alone-0.016
 down-0.016
ling-0.016
ny-0.015
Token mouth
Token ID5422
Feature activation-3.04e+00

Pos logit contributions
 slices+0.046
 flour+0.045
 jam+0.044
engu+0.040
 icing+0.039
Neg logit contributions
 down-0.083
 alone-0.079
 up-0.079
 away-0.075
rated-0.074
Tokena
Token ID64
Feature activation+8.04e-01

Pos logit contributions
rated+0.041
 down+0.040
 alone+0.040
ling+0.038
 up+0.038
Neg logit contributions
 slices-0.044
 flour-0.043
 jam-0.043
 cr-0.041
 frost-0.039
Token.
Token ID13
Feature activation-4.88e-01

Pos logit contributions
 down+0.017
rated+0.017
 alone+0.017
 up+0.017
ling+0.016
Neg logit contributions
 slices-0.021
 flour-0.021
 jam-0.021
 cr-0.019
 frost-0.019
Token She
Token ID1375
Feature activation-1.03e+00

Pos logit contributions
 slices+0.013
 flour+0.013
 jam+0.013
 cr+0.013
 frost+0.012
Neg logit contributions
rated-0.010
 down-0.010
 alone-0.010
 up-0.009
ling-0.009
Token took
Token ID1718
Feature activation-7.37e-01

Pos logit contributions
 flour+0.029
 slices+0.029
 jam+0.029
 icing+0.027
 frost+0.026
Neg logit contributions
 up-0.020
 down-0.020
 alone-0.019
rated-0.019
ling-0.018
Token Louis
Token ID5593
Feature activation+2.09e-01

Pos logit contributions
 slices+0.021
 flour+0.021
 jam+0.021
 cr+0.020
 frost+0.019
Neg logit contributions
 down-0.014
 alone-0.014
 up-0.014
rated-0.014
ling-0.013
Tokena
Token ID64
Feature activation-1.15e+00

Pos logit contributions
 down+0.005
 alone+0.005
 up+0.005
rated+0.005
ling+0.005
Neg logit contributions
 flour-0.005
 slices-0.005
 jam-0.005
 icing-0.005
 frost-0.005
Token to
Token ID284
Feature activation-4.42e-01

Pos logit contributions
 flour+0.039
 slices+0.038
 jam+0.038
 icing+0.036
 frost+0.036
Neg logit contributions
 down-0.018
 up-0.017
 alone-0.016
 away-0.016
ling-0.015
Token the
Token ID262
Feature activation+1.23e-01

Pos logit contributions
 slices+0.014
 flour+0.014
engu+0.013
 jam+0.013
 icing+0.013
Neg logit contributions
 alone-0.008
 down-0.008
 up-0.007
 away-0.007
away-0.007
Token modern
Token ID3660
Feature activation-1.12e+00

Pos logit contributions
 alone+0.003
 down+0.003
 up+0.003
 same+0.003
 away+0.003
Neg logit contributions
 slices-0.003
 flour-0.003
 jam-0.003
 strips-0.003
 icing-0.003
Token screen
Token ID3159
Feature activation-1.67e+00

Pos logit contributions
 slices+0.028
 flour+0.027
 jam+0.026
engu+0.026
izza+0.025
Neg logit contributions
 alone-0.028
 down-0.026
 up-0.026
away-0.025
 away-0.024
Token,
Token ID11
Feature activation-1.22e+00

Pos logit contributions
 jam+0.046
 flour+0.045
 slices+0.044
engu+0.042
 frost+0.041
Neg logit contributions
 down-0.036
 alone-0.036
 up-0.034
 away-0.032
ny-0.031
Token with
Token ID351
Feature activation-2.30e-01

Pos logit contributions
rated+0.004
 down+0.004
 alone+0.003
ling+0.003
 up+0.003
Neg logit contributions
 slices-0.005
 jam-0.005
 flour-0.005
 cr-0.004
 frost-0.004
Token his
Token ID465
Feature activation-8.35e-01

Pos logit contributions
 slices+0.007
 jam+0.007
 flour+0.007
 cr+0.007
 frost+0.006
Neg logit contributions
rated-0.004
 down-0.004
 alone-0.004
ling-0.004
 up-0.004
Token family
Token ID1641
Feature activation-3.60e-01

Pos logit contributions
 slices+0.021
 flour+0.021
 jam+0.020
 icing+0.019
 frost+0.019
Neg logit contributions
 alone-0.020
 down-0.020
 up-0.019
rated-0.018
ling-0.018
Token.
Token ID13
Feature activation+2.27e-01

Pos logit contributions
 slices+0.012
 jam+0.012
 flour+0.011
engu+0.011
 icing+0.011
Neg logit contributions
 alone-0.007
 down-0.006
 up-0.006
ling-0.005
 around-0.005
Token He
Token ID679
Feature activation+7.36e-02

Pos logit contributions
rated+0.006
ling+0.005
away+0.005
 alone+0.005
 down+0.005
Neg logit contributions
 slices-0.006
 jam-0.005
 flour-0.005
 cr-0.005
 rods-0.005
Token liked
Token ID8288
Feature activation-7.02e-01

Pos logit contributions
rated+0.001
 down+0.001
 alone+0.001
ling+0.001
 up+0.001
Neg logit contributions
 slices-0.002
 jam-0.002
 flour-0.002
 cr-0.002
 frost-0.002
Token to
Token ID284
Feature activation+1.28e+00

Pos logit contributions
 jam+0.020
 flour+0.020
 slices+0.020
 cr+0.019
 frost+0.018
Neg logit contributions
rated-0.015
ling-0.013
 alone-0.013
 down-0.013
ny-0.013
Token jump
Token ID4391
Feature activation+2.95e-01

Pos logit contributions
rated+0.025
 down+0.024
 alone+0.024
ling+0.023
 up+0.022
Neg logit contributions
 slices-0.037
 flour-0.036
 jam-0.036
 cr-0.035
 frost-0.034
Token and
Token ID290
Feature activation+7.65e-01

Pos logit contributions
rated+0.007
 alone+0.007
 down+0.007
arrass+0.007
ling+0.007
Neg logit contributions
 slices-0.007
 jam-0.007
 flour-0.007
 cr-0.007
 strips-0.006
Token play
Token ID711
Feature activation+1.15e+00

Pos logit contributions
rated+0.017
 alone+0.016
 down+0.016
ling+0.016
away+0.016
Neg logit contributions
 jam-0.019
 slices-0.019
 flour-0.019
 cr-0.018
 frost-0.017
Token all
Token ID477
Feature activation+3.24e+00

Pos logit contributions
rated+0.030
arrass+0.026
umpy+0.026
 down+0.026
ny+0.026
Neg logit contributions
 jam-0.027
 slices-0.026
 cr-0.026
 flour-0.025
 powder-0.023
Token so
Token ID523
Feature activation-3.98e-01

Pos logit contributions
 slices+0.039
 jam+0.039
 flour+0.039
 cr+0.037
 icing+0.036
Neg logit contributions
 alone-0.023
 down-0.023
rated-0.023
 up-0.021
ling-0.021
Token happy
Token ID3772
Feature activation+7.06e-01

Pos logit contributions
 slices+0.012
 flour+0.012
 cr+0.011
 jam+0.011
 frost+0.011
Neg logit contributions
rated-0.008
ny-0.007
ling-0.006
away-0.006
 down-0.006
Token.
Token ID13
Feature activation+5.12e-01

Pos logit contributions
rated+0.017
 alone+0.015
 down+0.015
ling+0.015
ny+0.015
Neg logit contributions
 slices-0.018
 jam-0.018
 flour-0.017
 cr-0.017
 frost-0.016
Token They
Token ID1119
Feature activation-1.82e-01

Pos logit contributions
rated+0.013
ling+0.012
away+0.012
 alone+0.011
 invincible+0.011
Neg logit contributions
 slices-0.012
 flour-0.012
 jam-0.012
 cr-0.011
 frost-0.011
Token hugged
Token ID41130
Feature activation+8.26e-01

Pos logit contributions
 slices+0.006
 jam+0.005
 flour+0.005
 strips+0.005
 icing+0.005
Neg logit contributions
 down-0.003
 alone-0.003
rated-0.003
 up-0.003
ling-0.003
Token each
Token ID1123
Feature activation-6.82e-01

Pos logit contributions
rated+0.015
 alone+0.015
 down+0.014
ling+0.014
ny+0.013
Neg logit contributions
 slices-0.025
 jam-0.025
 flour-0.024
 cr-0.023
 frost-0.023
Token other
Token ID584
Feature activation+1.48e+00

Pos logit contributions
 slices+0.016
 jam+0.016
 flour+0.016
 cr+0.015
 frost+0.014
Neg logit contributions
rated-0.017
 alone-0.017
 down-0.016
ling-0.016
 freely-0.016
Token and
Token ID290
Feature activation+6.94e-01

Pos logit contributions
 down+0.033
rated+0.033
 alone+0.033
 up+0.032
ling+0.031
Neg logit contributions
 slices-0.038
 flour-0.038
 jam-0.038
 cr-0.035
 frost-0.034
Token then
Token ID788
Feature activation+3.06e-01

Pos logit contributions
 down+0.014
 alone+0.013
 up+0.013
rated+0.012
 away+0.012
Neg logit contributions
 slices-0.020
 flour-0.020
engu-0.020
 jam-0.019
 strips-0.019
Token went
Token ID1816
Feature activation+4.06e-01

Pos logit contributions
 down+0.006
 alone+0.006
rated+0.006
 up+0.006
ling+0.005
Neg logit contributions
 slices-0.009
 flour-0.009
 jam-0.009
 cr-0.008
engu-0.008
Token home
Token ID1363
Feature activation-5.62e-01

Pos logit contributions
rated+0.008
ling+0.006
umpy+0.006
arrass+0.006
ny+0.006
Neg logit contributions
 jam-0.012
 flour-0.012
 slices-0.012
 cr-0.012
 frost-0.011
Token you
Token ID345
Feature activation-2.49e-01

Pos logit contributions
 slices+0.036
 flour+0.035
 jam+0.034
 cr+0.033
engu+0.033
Neg logit contributions
 down-0.021
 alone-0.021
rated-0.020
 up-0.020
ling-0.019
Token very
Token ID845
Feature activation+5.80e-01

Pos logit contributions
 slices+0.007
 flour+0.007
 jam+0.007
 icing+0.006
 cr+0.006
Neg logit contributions
 alone-0.005
 down-0.005
rated-0.005
 up-0.005
ling-0.005
Token much
Token ID881
Feature activation-1.37e+00

Pos logit contributions
rated+0.013
 down+0.013
 alone+0.012
ling+0.012
ny+0.012
Neg logit contributions
 slices-0.014
 jam-0.014
 flour-0.014
 cr-0.013
 frost-0.013
Token.
Token ID13
Feature activation-8.33e-01

Pos logit contributions
 slices+0.038
 flour+0.038
 jam+0.038
 cr+0.036
engu+0.036
Neg logit contributions
 alone-0.028
 down-0.027
rated-0.027
 up-0.026
ling-0.025
Token I
Token ID314
Feature activation-8.94e-01

Pos logit contributions
 slices+0.029
 flour+0.028
 jam+0.028
 cr+0.027
 strips+0.026
Neg logit contributions
 down-0.012
 alone-0.011
rated-0.011
ny-0.011
 up-0.010
Token know
Token ID760
Feature activation-3.31e+00

Pos logit contributions
 slices+0.025
 jam+0.025
 flour+0.025
 cr+0.024
 frost+0.023
Neg logit contributions
rated-0.018
 down-0.017
 alone-0.017
ling-0.016
 up-0.016
Token you
Token ID345
Feature activation-3.01e-01

Pos logit contributions
 slices+0.114
 strips+0.108
 funnel+0.106
 slot+0.106
 jam+0.105
Neg logit contributions
 up-0.048
 down-0.048
 alone-0.048
 around-0.035
ly-0.035
Token miss
Token ID2051
Feature activation-8.81e-01

Pos logit contributions
 slices+0.007
 flour+0.007
 jam+0.007
 cr+0.007
 icing+0.007
Neg logit contributions
rated-0.007
 alone-0.007
 down-0.007
 up-0.007
ling-0.007
Token your
Token ID534
Feature activation+5.68e-02

Pos logit contributions
 slices+0.027
 flour+0.027
 jam+0.026
engu+0.025
 strips+0.025
Neg logit contributions
 down-0.016
 alone-0.016
 up-0.015
 away-0.014
rated-0.014
Token mom
Token ID1995
Feature activation-5.16e-01

Pos logit contributions
 alone+0.001
 down+0.001
 up+0.001
rated+0.001
ling+0.001
Neg logit contributions
 slices-0.001
 flour-0.001
 jam-0.001
engu-0.001
 cr-0.001
Token,
Token ID11
Feature activation+2.77e-01

Pos logit contributions
 slices+0.014
 flour+0.014
 jam+0.014
 cr+0.013
engu+0.013
Neg logit contributions
 alone-0.011
 down-0.011
rated-0.010
 up-0.010
ling-0.010
Token of
Token ID286
Feature activation-1.67e+00

Pos logit contributions
 flour+0.029
 slices+0.029
 jam+0.028
 cr+0.027
engu+0.027
Neg logit contributions
 alone-0.035
 down-0.033
ling-0.033
umpy-0.032
 up-0.032
Token fun
Token ID1257
Feature activation-4.29e-01

Pos logit contributions
 slices+0.047
 flour+0.046
 jam+0.046
 cr+0.044
 icing+0.042
Neg logit contributions
rated-0.033
 alone-0.033
 down-0.033
ling-0.031
 up-0.031
Token.
Token ID13
Feature activation-7.32e-01

Pos logit contributions
 slices+0.011
 flour+0.011
 jam+0.011
 cr+0.010
 icing+0.010
Neg logit contributions
rated-0.009
 alone-0.009
 down-0.009
ling-0.009
 up-0.009
Token\n
Token ID198
Feature activation-1.60e+00

Pos logit contributions
 slices+0.020
 jam+0.019
 flour+0.019
 cr+0.018
 icing+0.018
Neg logit contributions
rated-0.016
 down-0.015
 alone-0.015
 up-0.014
ling-0.014
Token\n
Token ID198
Feature activation-7.40e-01

Pos logit contributions
 slices+0.041
 jam+0.040
 flour+0.040
 cr+0.038
 frost+0.037
Neg logit contributions
rated-0.036
 down-0.035
 alone-0.035
ling-0.033
 up-0.033
TokenAfter
Token ID3260
Feature activation-3.47e+00

Pos logit contributions
 slices+0.019
 flour+0.019
 jam+0.019
 cr+0.018
 icing+0.017
Neg logit contributions
rated-0.017
 alone-0.016
 down-0.016
ling-0.016
ny-0.015
Token playing
Token ID2712
Feature activation+6.04e-01

Pos logit contributions
engu+0.111
 slices+0.107
 flour+0.107
 slot+0.106
 strips+0.103
Neg logit contributions
 down-0.054
 alone-0.050
 up-0.046
 miserable-0.043
free-0.041
Token,
Token ID11
Feature activation-1.27e+00

Pos logit contributions
rated+0.017
 down+0.016
 alone+0.016
ling+0.015
ny+0.015
Neg logit contributions
 slices-0.013
 jam-0.013
 flour-0.012
 cr-0.012
 frost-0.011
Token Fl
Token ID1610
Feature activation-2.29e-01

Pos logit contributions
 slices+0.031
 jam+0.030
 flour+0.030
 cr+0.028
 icing+0.028
Neg logit contributions
 down-0.031
 alone-0.030
rated-0.030
 up-0.029
ling-0.028
Tokenuffy
Token ID15352
Feature activation-2.10e-01

Pos logit contributions
 slices+0.005
 flour+0.005
 jam+0.005
 cr+0.005
 icing+0.005
Neg logit contributions
rated-0.006
 alone-0.006
 down-0.006
ling-0.006
 up-0.005
Token and
Token ID290
Feature activation+5.04e-01

Pos logit contributions
 flour+0.006
 slices+0.006
 jam+0.006
 icing+0.005
engu+0.005
Neg logit contributions
 down-0.004
 alone-0.004
 up-0.004
ny-0.004
rated-0.004
Token.
Token ID13
Feature activation-1.47e-02

Pos logit contributions
 jam+0.031
 flour+0.031
 slices+0.031
 frost+0.029
 slot+0.029
Neg logit contributions
 down-0.024
 up-0.023
ling-0.023
 alone-0.023
rated-0.021
Token They
Token ID1119
Feature activation-9.83e-01

Pos logit contributions
 jam+0.001
 slices+0.001
 cr+0.001
 flour+0.001
 da+0.000
Neg logit contributions
 down-0.000
 freely-0.000
 up-0.000
 alone-0.000
ny-0.000
Token fill
Token ID6070
Feature activation+1.55e-01

Pos logit contributions
 slices+0.026
 flour+0.025
 jam+0.025
 strips+0.024
 slot+0.023
Neg logit contributions
 down-0.020
 alone-0.020
 up-0.019
rated-0.019
ling-0.018
Token some
Token ID617
Feature activation-3.39e+00

Pos logit contributions
 up+0.003
 down+0.003
ling+0.002
 alone+0.002
 away+0.002
Neg logit contributions
 slices-0.005
 jam-0.005
 flour-0.004
 cr-0.004
engu-0.004
Token paper
Token ID3348
Feature activation-2.93e+00

Pos logit contributions
 unison+0.069
 Theo+0.068
ena+0.067
engu+0.066
izza+0.065
Neg logit contributions
 down-0.094
 up-0.093
 alone-0.088
ling-0.088
 away-0.087
Token cups
Token ID14180
Feature activation-4.36e+00

Pos logit contributions
 unison+0.087
ird+0.085
 jam+0.084
 flour+0.082
 slot+0.082
Neg logit contributions
 up-0.057
 down-0.054
ling-0.048
 around-0.047
 away-0.046
Token with
Token ID351
Feature activation-2.42e+00

Pos logit contributions
 jam+0.122
 unison+0.121
engu+0.118
 flour+0.117
 ze+0.115
Neg logit contributions
 down-0.088
 up-0.085
 alone-0.078
 around-0.076
 away-0.075
Token the
Token ID262
Feature activation-1.34e+00

Pos logit contributions
ird+0.069
 slot+0.067
ena+0.065
keys+0.065
glass+0.064
Neg logit contributions
 up-0.052
 down-0.050
 away-0.045
 same-0.042
 around-0.042
Token batter
Token ID4984
Feature activation-2.52e+00

Pos logit contributions
engu+0.022
 unison+0.022
 slices+0.022
emon+0.021
 jam+0.021
Neg logit contributions
 down-0.039
 alone-0.038
 same-0.038
 up-0.037
 away-0.036
Token.
Token ID13
Feature activation-7.41e-02

Pos logit contributions
 jam+0.065
 slices+0.064
engu+0.063
 flour+0.062
 unison+0.062
Neg logit contributions
 down-0.055
 up-0.054
ling-0.048
 away-0.047
 alone-0.047
Token They
Token ID1119
Feature activation-1.44e+00

Pos logit contributions
 jam+0.003
 cr+0.003
 slices+0.003
 flour+0.003
 da+0.003
Neg logit contributions
 down-0.001
 up-0.001
 freely-0.001
 alone-0.001
ny-0.001
Token spinning
Token ID19493
Feature activation+5.61e-01

Pos logit contributions
 slices+0.021
 jam+0.021
 flour+0.020
 cr+0.020
 frost+0.019
Neg logit contributions
rated-0.020
away-0.018
umpy-0.018
ny-0.018
 alone-0.018
Token through
Token ID832
Feature activation-1.13e+00

Pos logit contributions
rated+0.011
 alone+0.010
 down+0.009
umpy+0.009
ling+0.009
Neg logit contributions
 slices-0.016
 jam-0.016
 flour-0.016
 cr-0.016
 frost-0.015
Token the
Token ID262
Feature activation-6.98e-01

Pos logit contributions
 flour+0.032
 jam+0.032
 slices+0.031
engu+0.030
izza+0.030
Neg logit contributions
 alone-0.022
 down-0.022
 up-0.020
 freely-0.020
 away-0.019
Token air
Token ID1633
Feature activation+2.67e-01

Pos logit contributions
 slices+0.019
 flour+0.018
emon+0.018
engu+0.018
 jam+0.018
Neg logit contributions
 alone-0.015
 down-0.015
ny-0.014
 same-0.014
 away-0.014
Token.
Token ID13
Feature activation-9.99e-01

Pos logit contributions
 down+0.006
 alone+0.006
rated+0.006
ling+0.006
 up+0.006
Neg logit contributions
 flour-0.007
 slices-0.007
 jam-0.007
 cr-0.006
engu-0.006
Token After
Token ID2293
Feature activation-3.46e+00

Pos logit contributions
 jam+0.031
 slices+0.031
 flour+0.030
 cr+0.029
 icing+0.029
Neg logit contributions
 down-0.017
 alone-0.017
ny-0.016
 up-0.016
rated-0.016
Token a
Token ID257
Feature activation-1.21e+00

Pos logit contributions
 flour+0.104
 slices+0.101
 slot+0.101
engu+0.100
izza+0.097
Neg logit contributions
 alone-0.061
 down-0.059
 up-0.054
 miserable-0.053
free-0.053
Token few
Token ID1178
Feature activation-2.48e+00

Pos logit contributions
 slices+0.035
 flour+0.034
 jam+0.034
 cr+0.032
 icing+0.032
Neg logit contributions
rated-0.023
 down-0.023
 alone-0.023
ny-0.022
ling-0.022
Token seconds
Token ID4201
Feature activation-2.99e+00

Pos logit contributions
engu+0.048
 flour+0.047
izza+0.046
 slices+0.046
 frost+0.043
Neg logit contributions
 alone-0.072
 down-0.069
ling-0.069
 away-0.067
rated-0.066
Token,
Token ID11
Feature activation-1.87e+00

Pos logit contributions
 flour+0.082
 jam+0.081
 slices+0.079
 toast+0.078
 unison+0.076
Neg logit contributions
 down-0.067
 alone-0.064
 away-0.060
 up-0.060
 freely-0.059
Token the
Token ID262
Feature activation-7.12e-02

Pos logit contributions
 slices+0.054
 jam+0.053
 flour+0.052
 icing+0.051
 cr+0.050
Neg logit contributions
 down-0.038
 up-0.035
 alone-0.034
 away-0.032
ling-0.030
Token\n
Token ID198
Feature activation-9.16e-01

Pos logit contributions
 down+0.001
 alone+0.001
 up+0.001
ny+0.001
rated+0.001
Neg logit contributions
 jam-0.002
 slices-0.002
 flour-0.002
 icing-0.002
engu-0.002
Token\n
Token ID198
Feature activation-5.79e-01

Pos logit contributions
 slices+0.030
 jam+0.029
 flour+0.029
 cr+0.028
 icing+0.028
Neg logit contributions
 alone-0.014
 down-0.014
 up-0.013
 freely-0.013
umpy-0.012
TokenShe
Token ID3347
Feature activation-1.46e+00

Pos logit contributions
 slices+0.015
 jam+0.015
 flour+0.015
 icing+0.015
 cr+0.015
Neg logit contributions
 alone-0.012
 down-0.012
away-0.011
ny-0.011
umpy-0.011
Token opened
Token ID4721
Feature activation-8.00e-01

Pos logit contributions
 slices+0.045
 flour+0.044
 jam+0.043
 icing+0.042
engu+0.041
Neg logit contributions
 down-0.025
 alone-0.024
 up-0.023
ling-0.023
rated-0.023
Token the
Token ID262
Feature activation-7.40e-01

Pos logit contributions
 slices+0.022
 jam+0.022
 flour+0.022
 cr+0.021
engu+0.020
Neg logit contributions
 down-0.016
 alone-0.016
 up-0.016
rated-0.015
ling-0.015
Token oven
Token ID14361
Feature activation-3.93e+00

Pos logit contributions
 slices+0.017
mm+0.016
 unison+0.016
mmm+0.016
engu+0.016
Neg logit contributions
 down-0.019
 up-0.019
 alone-0.018
 away-0.018
 same-0.018
Token and
Token ID290
Feature activation-1.47e+00

Pos logit contributions
engu+0.109
 jam+0.108
 flour+0.107
emon+0.104
 unison+0.104
Neg logit contributions
 down-0.088
 up-0.082
 away-0.075
 alone-0.072
 back-0.069
Token saw
Token ID2497
Feature activation-2.28e+00

Pos logit contributions
 slices+0.043
 strips+0.043
 jam+0.043
 unison+0.042
 flour+0.042
Neg logit contributions
 down-0.029
 up-0.026
 away-0.024
 alone-0.024
rated-0.021
Token a
Token ID257
Feature activation-1.16e+00

Pos logit contributions
 unison+0.058
 slot+0.058
 strips+0.058
 slices+0.057
light+0.057
Neg logit contributions
 up-0.047
 down-0.046
 alone-0.043
 same-0.039
 away-0.038
Token black
Token ID2042
Feature activation-1.98e+00

Pos logit contributions
ions+0.036
els+0.035
 slices+0.035
emon+0.034
mm+0.034
Neg logit contributions
 down-0.017
 miserable-0.016
 alone-0.016
 lonely-0.016
 troubled-0.016
Token,
Token ID11
Feature activation-1.01e+00

Pos logit contributions
engu+0.042
 slices+0.040
 flour+0.038
izza+0.038
 umb+0.037
Neg logit contributions
 down-0.051
 alone-0.047
ly-0.047
 up-0.047
ling-0.046